JP2005149037A - Method, apparatus and program for estimating gene expression action - Google Patents

Method, apparatus and program for estimating gene expression action Download PDF

Info

Publication number
JP2005149037A
JP2005149037A JP2003384457A JP2003384457A JP2005149037A JP 2005149037 A JP2005149037 A JP 2005149037A JP 2003384457 A JP2003384457 A JP 2003384457A JP 2003384457 A JP2003384457 A JP 2003384457A JP 2005149037 A JP2005149037 A JP 2005149037A
Authority
JP
Japan
Prior art keywords
gene
estimation
gene expression
strain
coefficient
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP2003384457A
Other languages
Japanese (ja)
Other versions
JP4459597B2 (en
Inventor
Hiroya Nobori
博也 昇
Yoshiki Miura
佳樹 三浦
Shigeyuki Mitsui
重之 三井
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mitsubishi Space Software Co Ltd
Original Assignee
Mitsubishi Space Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mitsubishi Space Software Co Ltd filed Critical Mitsubishi Space Software Co Ltd
Priority to JP2003384457A priority Critical patent/JP4459597B2/en
Publication of JP2005149037A publication Critical patent/JP2005149037A/en
Application granted granted Critical
Publication of JP4459597B2 publication Critical patent/JP4459597B2/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Links

Images

Landscapes

  • Complex Calculations (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

<P>PROBLEM TO BE SOLVED: To obtain a method for estimating interaction of gene expression in which an initial value of a suitable coefficient can be easily obtained and the volume of calculation is reduced. <P>SOLUTION: The method for preparing a gene model and estimating gene expression action on the basis of time-sequence data of a wild strain composed of a gene group which is not operated and destroyed strain obtained by destroying specific genes is provided with individual gene expression coefficient estimation steps (S17-S20) in which a loop for repeating a procedure for estimating a generation coefficient for simultaneously satisfying the time-sequence data of the wild strain and the specific destroyed strain in the gene model by the number of individual genes to be estimated which are included in the wild strain is included, and gene expression coefficient matrix estimation steps (S16-S20) in which a repeating loop for successively performing the individual gene expression coefficient estimation steps by extracting succeeding destroyed strain after finishing the estimation of the specific destroyed strain by the prescribed number of destroyed strains is included. An initial value for estimating a gene expression coefficient matrix is obtained through the two steps. <P>COPYRIGHT: (C)2005,JPO&NCIPI

Description

本発明は、複数の遺伝子間において、個々の遺伝子が他の遺伝子から時間的に影響を受けて変化する、その変化作用量を推測する方法に関するものである。   The present invention relates to a method for estimating the amount of change action, in which each gene changes by being influenced by time from other genes among a plurality of genes.

従来、こうした複数の遺伝子群において、個々の遺伝子が他の遺伝子から影響を受ける様子、つまり遺伝子相互作用を推定する方法として、閾値検定、統計的手法を用いたBayesianネットワーク、二項関係を用いたブーリアンネットワーク、更には微分方程式を用いる方法等、各種の技術がある。
なかでも同値類のフィードバック制御が同定可能な微分方程式モデルを使用した推定方法がよく知られている。
Conventionally, in such a plurality of gene groups, individual genes are influenced by other genes, that is, as a method for estimating gene interaction, a threshold test, a Bayesian network using a statistical method, and a binomial relationship are used. There are various technologies such as a Boolean network and a method using a differential equation.
In particular, an estimation method using a differential equation model that can identify feedback control of an equivalence class is well known.

なお、出願人は、出願時点で先行技術文献を知得していない。   The applicant has not obtained prior art documents at the time of filing.

従来の方法は、推定のために各遺伝子の作用係数に初期値を与えて推定を開始するが、その初期値の与え方に関して適切な方法を示していないので、最適方法を推定するまでに非常に時間がかかるという課題がある。
また計算量が膨大になり、または繰返し推定の中で数値の発散を防ぐために制限を加える必要がある、そのため遺伝子数を制限する必要がある、などの課題もある。
In the conventional method, the initial value is given to the coefficient of action of each gene for estimation, but the estimation is started. There is a problem that it takes time.
In addition, there is a problem that the amount of calculation becomes enormous, or that it is necessary to add a restriction in order to prevent the divergence of numerical values in iterative estimation, and therefore it is necessary to limit the number of genes.

この発明は上記の課題を解決するためになされたもので、適切な係数の初期値が容易に得られ、かつ計算量を抑えた遺伝子発現相互作用推定方法を得る。   The present invention has been made in order to solve the above-described problems, and provides a gene expression interaction estimation method in which an initial value of an appropriate coefficient can be easily obtained and a calculation amount is suppressed.

この発明に係る遺伝子発現作用推定方法は、遺伝子モデルを作成し、操作されていない遺伝子群からなる野生株と特定遺伝子を破壊した破壊株の時間列データを基に遺伝子発現作用を推定する方法において、
遺伝子モデルに対し、野生株と特定の破壊株との時間列データを同時に満足する発現係数を推定する手順を野生株に含まれる推定対象個別遺伝子の数だけ繰返すループとして持つ個別遺伝子への発現係数推定ステップと、
特定の破壊株との推定を終えると、次の破壊株を取上げて個別遺伝子への発現係数推定ステップを繰返すことを、所定数の破壊株について順次行う、繰り返しループとして持つ遺伝子発現係数行列推定ステップとを備えて、
上記2つのステップを経て、遺伝子発現係数行列の推定初期値を得るようにした。
The gene expression action estimation method according to the present invention is a method of creating a gene model and estimating a gene expression action based on time series data of a wild strain consisting of an unmanipulated gene group and a disrupted strain in which a specific gene is destroyed. ,
Expression coefficient for individual genes with a loop that repeats the number of individual genes to be estimated included in the wild strain for the gene model to estimate the expression coefficient that simultaneously satisfies the time series data of the wild strain and the specific disrupted strain An estimation step;
When the estimation with a specific disrupted strain is completed, the next step is to take the next disrupted strain and repeat the expression coefficient estimation step for individual genes. And with
Through the above two steps, an estimated initial value of the gene expression coefficient matrix was obtained.

野生株と破壊株とのタイムコース値の組合わせから遺伝子間相互作用行列を初期値推定するようにしたので、推定収束までの時間が短縮され、かつ安定で妥当な推定ができる効果がある。   Since the initial value of the gene interaction matrix is estimated from the combination of the time course values of the wild strain and the disrupted strain, the time until the convergence of the estimation is shortened and there is an effect that stable and reasonable estimation can be performed.

実施の形態1.
遺伝子の相互作用、または他の遺伝子から受ける影響度を推定するために、最小二乗法による準最適化手法を適用して初期値を得て、またその初期値に対して線形微分方程式モデルによる準最適化を適用して、制約を少なくして推定所要時間を短縮する方法を説明する。
本実施の形態において、線形微分方程式モデルとして次の(1)式を設定する。
Embodiment 1 FIG.
In order to estimate gene interaction or the degree of influence from other genes, a semi-optimization method using the least square method is applied to obtain an initial value, and a quasi-optimization using a linear differential equation model is applied to the initial value. A method for reducing the constraints and reducing the estimated time by applying optimization will be described.
In the present embodiment, the following equation (1) is set as a linear differential equation model.

Figure 2005149037
Figure 2005149037

ここでxは遺伝子発現量ベクトル、Aは各遺伝子間の相互作用を示す係数行列、wは不確定要素によるバイアス係数ベクトルである。不確定要素に関しては、時間の多項式関数とする場合もあるが、ここではバイアスモデルとして記述する。また上記(1)式から、最小二乗法では遺伝子発現量ベクトルを(1)式の両辺を積分した次の(2)式とする。   Here, x is a gene expression level vector, A is a coefficient matrix indicating an interaction between genes, and w is a bias coefficient vector based on uncertain elements. The uncertain element may be a polynomial function of time, but is described here as a bias model. Further, from the above equation (1), in the least square method, the gene expression level vector is represented by the following equation (2) obtained by integrating both sides of the equation (1).

Figure 2005149037
Figure 2005149037

0 は遺伝子発現量初期値(初期値ベクトル)、tは時間を示す。
また最小二乗法は、遺伝子発現量ベクトル(遺伝子ベクトル)xの一つの遺伝子に対し、時間で変化するタイムコースデータを基に次の(3)式で各遺伝子Yが変化すると設定する。そうしてその時間変化において、他の遺伝子からの相互作用を示す行列Bと、Wの推定を行う。
Y=Y0 +B・X+W・t (3)
ここでYは対象遺伝子群中における一つの遺伝子のタイムコースデータであり、Xは個別の遺伝子Yの集合体としての遺伝子群で、遺伝子発現量xの時間積分値である。またY0 はその初期値である。
こうして設定したモデルに基づく推定方法をフローにしたものを図1と図2に示す。またこの動作を展開する計算回路構成の例を図3に示す。図3において、推定を行うための回路構成として、実際のデータを読込む入力装置51、本装置の解析結果である、遺伝子間の影響度をパスウェイとして得て、このパスウェイを表示する表示装置52と、この入出力間で本処理を行うハードウェア構成の遺伝子発現作用推定装置からなる。この遺伝子発現作用推定装置は、中央処理装置を有する計算機で実現できるものである。遺伝子発現作用推定装置の詳細構成として、以下の構成要素を持っている。入力側から順に、タイムコース・データ入力処理部41、これで得たデータを補間する補間処理部42、補間結果から正規化する正規化処理部43、モデルの係数初期値を推定する初期値推定部44、得られた初期値に基づいて相互作用係数を推定する相互作用係数推定処理部45、更に最適化処理をする係数最適化処理部46、入力のタイムコース・データを記憶するタイムコース・データ記憶部48、モデルの数式を表現する遺伝子発現モデルを記憶する解析手法記憶部47、得られた遺伝子間の影響を図式化するパスウェイ描画処理部49で、それぞれ対応プログラムを持ったハードウェアである。あるいは、上記各部は、プログラムのみにより実現されてもよいし、ハードウェアのみにより実現されてもよいし、プログラムとハードウェアとの組み合わせにより実現されてもよい。
x 0 is the gene expression amount initial value (initial value vector), t denotes time.
The least square method is set so that each gene Y changes in the following equation (3) based on the time course data that changes with time for one gene of the gene expression level vector (gene vector) x. Then, in the time change, the matrix B indicating the interaction from other genes and W are estimated.
Y = Y 0 + B · X + W · t (3)
Here, Y is time course data of one gene in the target gene group, and X is a gene group as an aggregate of individual genes Y, and is a time integral value of the gene expression level x. Y 0 is the initial value.
FIGS. 1 and 2 show a flow of the estimation method based on the model set in this way. FIG. 3 shows an example of a calculation circuit configuration for developing this operation. In FIG. 3, as a circuit configuration for estimation, an input device 51 that reads actual data, a display device 52 that obtains the degree of influence between genes, which is an analysis result of this device, as a pathway, and displays this pathway. And a gene expression action estimation device having a hardware configuration for performing this processing between the input and output. This gene expression action estimation apparatus can be realized by a computer having a central processing unit. The detailed configuration of the gene expression action estimation apparatus has the following components. In order from the input side, a time course / data input processing unit 41, an interpolation processing unit 42 for interpolating the data obtained thereby, a normalization processing unit 43 for normalizing from the interpolation result, an initial value estimation for estimating the coefficient initial value of the model Unit 44, an interaction coefficient estimation processing unit 45 that estimates an interaction coefficient based on the obtained initial value, a coefficient optimization processing unit 46 that performs further optimization processing, and a time course that stores input time course data. A data storage unit 48, an analysis method storage unit 47 that stores a gene expression model that expresses a mathematical expression of the model, and a pathway drawing processing unit 49 that graphically illustrates the effect between the obtained genes, each with hardware having a corresponding program is there. Or each said part may be implement | achieved only by a program, may be implement | achieved only by hardware, and may be implement | achieved by the combination of a program and hardware.

この計算機による推定方法を図1と図2に基づいて説明する。
先ずタイムコースデータである実験値により着目する個別の遺伝子Yの時間変化データを読み込む。図4は実験等で得られる観測値であり、図において、左端の列は説明のためのインデックスであり、図では1から27まで記されている。またこの図では説明を簡単にするために遺伝子を4個(G01,G02,G03,G04)の例を示している。インデックス5ないし7は操作されていない自然の個別遺伝子からなる遺伝子群を野生株として、その野生株の時間変化データを示している。そして左端の列が時刻、この例では3つの時刻(1.1,1.2,1.3)を表し、次の列(1.11,1.21,1.31)は、遺伝子G01の各左側時刻におけるタイムコース値で、次の1.12ないし1.32は、遺伝子G02の同じくサンプル時刻のタイムコース値、同様に1.13ないし1.33はG03のタイムコース値、1.14ないし1.34はG04のタイムコース値である。更にインデックス10ないし12は、インデックス8に示すように、G01遺伝子を破壊した破壊株の3つの時刻(1.1,1.2,1.3)におけるタイムコース値で、2.11,2.12,2.13,2.14の各下への列が、それぞれ遺伝子G01,G02,G03,G04の時刻変化値である。以下、インデックス15ないし17は遺伝子G02を破壊した破壊株、20ないし22はG03を破壊した破壊株、25ないし27はG04を破壊した破壊株のタイムコース値である。これをタイムコース・データ入力処理部41がS11で行う。
An estimation method using this computer will be described with reference to FIGS.
First, the time change data of the individual gene Y of interest is read from the experimental value which is time course data. FIG. 4 shows observed values obtained by experiments and the like. In the figure, the leftmost column is an index for explanation, and 1 to 27 are shown in the figure. Further, in this figure, an example of four genes (G01, G02, G03, G04) is shown for easy explanation. Indexes 5 to 7 show time-change data of wild strains, where a gene group composed of natural individual genes that have not been manipulated is a wild strain. The leftmost column represents the time, in this example, three times (1.1, 1.2, 1.3), and the next column (1.11, 1.21, 1.31) represents the gene G01. The time course values at the left time are the following 1.12 to 1.32, the time course value of the sample time of the gene G02, 1.13 to 1.33 are the time course values of G03, 1.14 Thru 1.34 is the time course value of G04. Furthermore, as shown in index 8, indexes 10 to 12 are time course values at three times (1.1, 1.2, 1.3) of the disrupted strain in which the G01 gene is disrupted. The lower columns of 12, 2.13, and 2.14 are time change values of the genes G01, G02, G03, and G04, respectively. In the following, indexes 15 to 17 are the time course values of the disrupted strain in which the gene G02 is disrupted, 20 to 22 are the disrupted strain in which the G03 is disrupted, and 25 to 27 are the disrupted strains in which the G04 is disrupted. The time course / data input processing unit 41 performs this in S11.

S12において、計算機はタイムコースデータを計算機処理のために絶対値が大きいデータで正規化し、補間処理部42が行うS13で、対数値を基に図4の経過時間の中間値をスプライン(平滑化)補間処理する。正規化処理部43が行うS14で、補間後の対数値データを元の物理量に戻す。
S15では、サンプル時刻毎に各遺伝子のそれまでのデータを積分して行列Xと、またベクトルYを生成する。S15ではこの他に、生成された行列Xにおいて、推定しない相互作用の要素はゼロに固定し、またX行列、Yベクトルの有効桁を計算の精度と安定化のために、調整する。
In S12, the computer normalizes the time course data with data having a large absolute value for computer processing, and in S13 performed by the interpolation processing unit 42, the intermediate value of the elapsed time in FIG. 4 is splined (smoothed) based on the logarithmic value. ) Interpolate. In S14 performed by the normalization processing unit 43, the interpolated logarithmic value data is returned to the original physical quantity.
In S15, the matrix X and the vector Y are generated by integrating the data so far for each gene at each sampling time. In S15, in addition to this, in the generated matrix X, the elements of the interaction that are not estimated are fixed to zero, and the significant digits of the X matrix and the Y vector are adjusted for calculation accuracy and stabilization.

本実施の形態で特徴的なループを説明する。S16では、S21で囲まれる大きなループを形成している。この大きなループは、野生株の時間変化と、1つの破壊株(図4では、例えばインデックス8に注記したG01を破壊した破壊株で、そのタイムコース値はインデックス10ないし12に示される)をループの1つとし、これを野生株と次の破壊株(G02を破壊した破壊株)の組み、というように全組合わせによるBとWを推定して行くことを表している。
S17とS20で囲まれる小さなループは、例えばG01を破壊した破壊株と野生株のタイムコース値からBとWを得ることを表している。つまりS17で、先ず最初の野生株の遺伝子G01を取り上げて、図4のインデックス5ないし7のタイムコース値と10ないし12のタイムコース値とから、G01に注目して、S18により最小二乗法により暫定的に(3)式の行列BとベクトルW内の遺伝子G01に関する相互作用の係数を推定する。こうして図4の例では、インデックス5ないし7のタイムコース値と10ないし12のタイムコース値による推定では、遺伝子数が4つあるので4つのBとWが推定される。つまりこの小さなループでは、図4のインデックス10ないし12を用いた推定で、一回目のループではG01に着目した推定をし、次の小さなループの2回目ではG02に着目した推定を行い、三回目の小さなループにおいてはG03に着目して推定し、最後に四回目の小さなループではg04に着目した推定を行う。その結果、S19で、得られたBとW内の計数を用いて、(1)式の行列Aとwの各値を暫定的に推定する。
A loop characteristic of this embodiment will be described. In S16, a large loop surrounded by S21 is formed. This large loop loops over time of the wild strain and one disrupted strain (in FIG. 4, for example, a disrupted strain that has disrupted G01 noted in index 8 and its time course value is shown in indexes 10 to 12). This represents the estimation of B and W for all combinations, such as a combination of a wild strain and the next disrupted strain (destroyed strain that destroyed G02).
The small loop surrounded by S17 and S20 represents that B and W are obtained from the time course values of the disrupted strain and the wild strain that destroyed G01, for example. That is, in S17, the first wild-type gene G01 is picked up, and from the time course values of indexes 5 to 7 and 10 to 12 in FIG. 4, paying attention to G01, the least square method is obtained in S18. Provisionally, the coefficient of interaction regarding the gene G01 in the matrix B and the vector W in the equation (3) is estimated. Thus, in the example of FIG. 4, in the estimation using the time course values of the indexes 5 to 7 and the time course values of 10 to 12, there are four genes, so four B and W are estimated. That is, in this small loop, the estimation using the indexes 10 to 12 in FIG. 4 is performed. In the first loop, the estimation focusing on G01 is performed, and in the second small loop, the estimation focusing on G02 is performed. In the small loop, estimation is performed by paying attention to G03. Finally, in the fourth small loop, estimation is performed focusing on g04. As a result, in S19, using the obtained counts in B and W, the values of the matrices A and w in the equation (1) are temporarily estimated.

次いでS16に戻って、タイムコースデータ中のその他の破壊株、図4の例ではインデックス13,18,23に記された他の破壊株について、S17ないしS19のループを繰り返す。即ち大きなループでは、順次他の破壊株と野生株とを用いた推定に移ることを表している。この大きなループを含む二重ループの処理を行うのが初期値推定部44である。
こうして順次破壊株を取り出して、それぞれ破壊株と野生株と同時に満足するBとWの推定を行う。その結果、全破壊株が全遺伝子に与える相互作用が推定できる。このとき遺伝子Yは、ループ中に含まれる遺伝子の発現量の時系列データを格納したベクトルとなる。またXは遺伝子Yベクトルのタグ時刻における発現量の積分値を格納している。
S21で全破壊株が及ぼす遺伝子のBとWの係数の推定が終わると、S23で相互作用係数推定処理部45が、破壊株の影響がない係数について、残りのA,wの係数を野生株のタイムコースデータを入力して最小二乗法により推定する。こうして係数行列Aとベクトルwが暫定的に準最適化用の初期値として確定する。
Next, the process returns to S16, and the loop of S17 to S19 is repeated for the other disrupted strains in the time course data (in the example of FIG. 4, the other disrupted strains indicated by indexes 13, 18, and 23). In other words, a large loop represents a shift to estimation using other disrupted strains and wild strains. The initial value estimation unit 44 performs processing of a double loop including this large loop.
In this way, the disrupted strains are sequentially taken out, and B and W that are satisfied simultaneously with the disrupted strain and the wild strain are estimated. As a result, the interaction that all the disrupted strains give to all genes can be estimated. At this time, the gene Y is a vector that stores time-series data of the expression level of the gene included in the loop. X stores the integrated value of the expression level of the gene Y vector at the tag time.
When the estimation of the B and W coefficients of the genes affected by all the disrupted strains is completed in S21, the interaction coefficient estimation processing unit 45 uses the remaining A and w coefficients as wild strains for the coefficients that are not affected by the disrupted strains in S23. The time course data is input and estimated by the method of least squares. In this way, the coefficient matrix A and the vector w are provisionally determined as initial values for semi-optimization.

S24以降は、係数最適化処理部46が最終的な係数を確定するための処理を行う。S24では、S23で暫定的に確定したxのモデルについて、Aとwの各要素の係数を微小変動させた、2個以上の幾つかのモデル、例えば5個のモデルを線形微分方程式モデルとして生成する。
S25では、確定までの演算時間を制限するために、最大試行回数を設定し、S26では、S24で生成した5個のモデルのうち、最初のモデルを取り出して、S27とS28の処理をする。S27では、モデルの微分方程式を数値積分して、各個体の各遺伝子のタイムコース値を求める。
S28で、上記積分で求めたタイムコース値と、対応するS11での入力実験値とを比較して、推定相対誤差を求める。
S29からS26に戻って、S24で生成した5個全てのモデルについて、S27とS28の処理を繰り返し、各モデルの推定相対誤差を求める。
After S24, the coefficient optimization processing unit 46 performs a process for determining the final coefficient. In S24, two or more models, for example, five models are generated as linear differential equation models by slightly varying the coefficients of the elements of A and w for the model of x tentatively determined in S23. To do.
In S25, the maximum number of trials is set in order to limit the calculation time until confirmation, and in S26, the first model is extracted from the five models generated in S24, and the processes of S27 and S28 are performed. In S27, the differential equation of the model is numerically integrated to obtain the time course value of each gene of each individual.
In S28, the time course value obtained by the integration is compared with the corresponding input experimental value in S11 to obtain an estimated relative error.
Returning from S29 to S26, the process of S27 and S28 is repeated for all five models generated in S24, and the estimated relative error of each model is obtained.

S30で所定の誤差以下のモデルがあれば、そのモデルのA推定行列とwベクトルであるとして推定を終える。
S30でどのモデルも所定の誤差以下でなければ、S31で5個のモデルのうち、最も誤差の少ないモデルを個体として選択し、再び5個のモデルを生成する。そしてS32でその5個のモデルについてそれぞれ行列Aとベクトルwの異なる係数を微小変動させる。S33では、最大試行回数内で所定の回数S30で誤差が閾値以下にならなかった場合に、S32の微小変動を小さくする。
S25とS34とで囲まれる試行ループにより、S30で所定の誤差以下のモデルが得られる可能性が高く、誤差が少ないモデルが定まれば、そのモデルが表す行列Aとベクトルwが最終推定結果となる。
If there is a model having a predetermined error or less in S30, the estimation is finished assuming that the model is an A estimation matrix and a w vector.
If none of the models is equal to or smaller than the predetermined error in S30, the model having the smallest error among the five models is selected as an individual in S31, and five models are generated again. In S32, the different coefficients of the matrix A and the vector w are slightly changed for the five models. In S33, if the error does not become the threshold value or less in the predetermined number of times S30 within the maximum number of trials, the minute fluctuation in S32 is reduced.
If a model having a predetermined error or less is likely to be obtained in S30 by the trial loop surrounded by S25 and S34, and a model with a small error is determined, the matrix A and the vector w represented by the model are determined as the final estimation result. Become.

図5は、このように推定してS30で誤差が少ないとして得た最終のモデルを推定値で表し、図4のタイムコース実験値とを比較した図である。
また図6は、推定の結果得られる、遺伝子間の相互作用、つまり他の遺伝子への影響関係を図式化したものである。この図はパスウェイ描画処理部49で得られる。図1と図2による推定の結果、Bが定まり、遺伝子C3は、遺伝子B6からプラスの相互作用を受け、B7からマイナスの作用を受け、C2と組になってC1にプラス作用を与えることが推定される。
こうして遺伝子を微分方程式でモデル化し、野生株と破壊株とのタイムコース値の組合わせから微分方程式モデルの遺伝子間相互作用行列を初期値推定するようにしたので、推定収束までのシミュレーション回数が短縮され、かつ安定で妥当な推定ができる。
なお、図2と図3の各ステップを手順として持つプログラムを用意して、記録媒体またはオンラインロードにより他の計算機にインストールし、そのプログラムを計算機の中央処理装置により実行させて上記解析を行わせるようにしてもよい。
FIG. 5 is a diagram in which the final model obtained in this way and obtained with a small error in S30 is represented by an estimated value and compared with the time course experimental value of FIG.
FIG. 6 is a schematic diagram showing the interaction between genes, that is, the influence relationship to other genes, obtained as a result of estimation. This figure is obtained by the pathway drawing processing unit 49. As a result of estimation according to FIGS. 1 and 2, B is determined, and gene C3 receives a positive interaction from gene B6, receives a negative action from B7, and forms a positive action on C1 in combination with C2. Presumed.
In this way, the gene is modeled with a differential equation, and the initial value of the gene interaction matrix of the differential equation model is estimated from the combination of the time course values of the wild strain and the disrupted strain, so the number of simulations until the estimated convergence is shortened And a stable and reasonable estimate.
It should be noted that a program having the steps of FIGS. 2 and 3 as a procedure is prepared, installed in another computer by a recording medium or online loading, and the program is executed by the central processing unit of the computer to perform the above analysis. You may do it.

この発明の実施の形態1における行列とベクトル推定処理フローを示す図である。It is a figure which shows the matrix and vector estimation processing flow in Embodiment 1 of this invention. 実施の形態1における行列とベクトル推定処理フローの続きを示す図である。FIG. 10 is a diagram showing a continuation of the matrix and vector estimation processing flow in the first embodiment. 実施の形態1における行列とベクトル推定処理を行う回路構成を示す図である。3 is a diagram showing a circuit configuration for performing matrix and vector estimation processing in Embodiment 1. FIG. 実施の形態1における実験で得られるタイムコース値の例を示す図である。6 is a diagram illustrating an example of a time course value obtained by an experiment in Embodiment 1. FIG. 実験値のタイムコース値と実施の形態1で得られた推定値に基づいたタイムコース値との比較を示す図である。It is a figure which shows the comparison with the time course value based on the estimated value obtained in Embodiment 1 and the time course value of an experimental value. 実施の形態1での推定結果による遺伝子間の相互作用を示すパスウェイ図である。3 is a pathway diagram showing an interaction between genes based on an estimation result in Embodiment 1. FIG.

符号の説明Explanation of symbols

S16 特定破壊株を取上げるステップ、S17 推定対象の個別遺伝子を選択するステップ、S18 最小二乗法による行列B、ベクトルWの係数推定ステップ、S19 行列A、ベクトルw内の要素への登録ステップ、S20 特定破壊株の中で未推定の遺伝子があるかを調べるステップ、S21 未推定の破壊株があるかを調べるステップ、S17,S20 特定破壊株の中で個別遺伝子の係数を順次遺伝子毎に推定する小さなループ、S16,S20 破壊株毎に順次推定する大きなループ、S22 全破壊株について推定が終わったかを確認するステップ、S23 行列Aとwの未推定係数を推定するステップ、S24 推定対象の複数の線形微分方程式モデル作成ステップ、S26 S24で作成した一つの線形微分方程式モデルを選択するステップ、S27 モデルの積分により各遺伝子の時間列値を得るステップ、S28 時間積分値と入力遺伝子時間列値との誤差算出ステップ、S29 S24で作成した線形微分方程式モデルを順次取上げて行き、全モデルが終わったかを確認するステップ、S30 S24で作成したモデルの中で所定の誤差以下のモデルがあるかを調べるステップ、S31 誤差最小のモデルに対してS24で作成した数のモデル数を用意するステップ、S32 行列A、ベクトルwの要素の値をモデル毎に違えて変化させるステップ、S33 変動率を小さくするステップ、S34 設定した最大試行回数を超えていないかを確認するステップ、41 タイムコース・データ入力処理部、42 補間処理部、43 正規化処理部、44 初期値推定部、45 相互作用係数推定処理部、46 係数最適化処理部、47 解析手法記憶部、48 タイムコース・データ記憶部、49 パスウェイ描画処理部。   S16 Step of picking up specific disrupted strains, S17 Step of selecting individual genes to be estimated, S18 Step of estimating matrix B by coefficient of least squares, vector W, S19 Step of registering elements in matrix A and vector w, S20 specification A step of checking whether there is an unestimated gene in the disrupted strain, a step of checking whether there is an S21 unestimated disrupted strain, S17, S20 A small one that sequentially estimates the coefficient of each individual gene among the specific disrupted strains Loop, S16, S20 large loop for estimating each disrupted strain sequentially, S22 step for confirming whether estimation has been completed for all disrupted strains, S23 step for estimating unestimated coefficients of matrices A and w, S24 a plurality of linears to be estimated Differential equation model creation step, S26 Select one linear differential equation model created in S24 Step S27, obtaining the time series value of each gene by integrating the model, S28 calculating the error between the time integral value and the input gene time series value, and taking the linear differential equation model created in S29 S24 in order, A step of confirming whether the model has been completed, a step of checking whether there is a model having a predetermined error or less in the models created in S30 and S24, and a step S31 for the number of models created in S24 with respect to the model having the smallest error. Step, step S32, changing the values of the elements of the matrix A and the vector w differently for each model, step S33, reducing the variation rate, step S34, checking whether the set maximum number of trials has been exceeded, step 41 Data input processing unit, 42 Interpolation processing unit, 43 Normalization processing unit, 44 Initial value estimation unit, 4 Interaction coefficient estimation processing unit, 46 coefficient optimization processing unit, 47 analysis method storage unit, 48 Time-course data storage unit, 49 pathway drawing processing unit.

Claims (9)

遺伝子モデルを作成し、操作されていない遺伝子群からなる野生株と特定遺伝子を破壊した破壊株の時間列データを基に遺伝子発現作用を推定する方法において、
上記遺伝子モデルに対し、上記野生株と特定の破壊株との時間列データを同時に満足する発現係数を推定する手順を上記野生株に含まれる推定対象個別遺伝子の数だけ繰返すループとして持つ個別遺伝子への発現係数推定ステップと、
上記特定の破壊株との推定を終えると、次の破壊株を取上げて上記個別遺伝子への発現係数推定ステップを繰返すことを、所定数の破壊株について順次行う、繰り返しループとして持つ遺伝子発現係数行列推定ステップとを備えて、
上記2つのステップを経て、遺伝子発現係数行列の推定初期値を得るようにしたことを特徴とする遺伝子発現作用推定方法。
In a method for estimating a gene expression action based on time series data of a wild strain consisting of a gene group that has not been manipulated and a disrupted strain in which a specific gene has been destroyed,
To individual genes having a loop that repeats the number of the estimation target individual genes included in the wild strain for the gene model to estimate the expression coefficient that simultaneously satisfies the time series data of the wild strain and the specific disrupted strain An expression coefficient estimation step of
When the estimation with the specific disrupted strain is completed, the gene expression coefficient matrix having a repeated loop is sequentially performed for a predetermined number of disrupted strains by taking the next disrupted strain and repeating the expression coefficient estimation step for the individual gene. An estimation step,
A gene expression action estimation method characterized in that an estimated initial value of a gene expression coefficient matrix is obtained through the above two steps.
遺伝子モデルを、遺伝子発現係数行列を持つ遺伝子発現量の時間積分で近似して推定することを特徴とする請求項1記載の遺伝子発現作用推定方法。   The gene expression action estimation method according to claim 1, wherein the gene model is estimated by approximating the gene expression with a time integration of a gene expression level having a gene expression coefficient matrix. 個別遺伝子への発現係数推定ステップにおいて、発現係数の推定に最小二乗法を適用することを特徴とする請求項1記載の遺伝子発現作用推定方法。   2. The gene expression action estimation method according to claim 1, wherein, in the expression coefficient estimation step for an individual gene, a least square method is applied to estimate the expression coefficient. 遺伝子モデルを作成し、操作されていない遺伝子群からなる野生株と特定遺伝子を破壊した破壊株の時間列データを基に遺伝子発現作用を推定する装置において、
上記遺伝子モデルに対し、上記野生株と特定の破壊株との時間列データを同時に満足する発現係数を推定する手順を上記野生株に含まれる推定対象個別遺伝子の数だけ繰返すループとして計算し、上記特定の破壊株との推定を終えると、次の破壊株を取上げて上記個別遺伝子への発現係数推定計算を繰返すことを、所定数の破壊株について順次行う係数初期値推定部を備えたことを特徴とする遺伝子発現作用推定装置。
In a device that creates a gene model and estimates the gene expression effect based on time series data of a wild strain consisting of a group of genes that has not been manipulated and a disrupted strain that has destroyed a specific gene
With respect to the gene model, a procedure for estimating an expression coefficient that simultaneously satisfies the time series data of the wild strain and a specific disrupted strain is calculated as a loop that repeats the number of individual genes to be estimated included in the wild strain, When the estimation with a specific disrupted strain is completed, an initial coefficient estimator is provided that sequentially takes out a predetermined number of disrupted strains by taking the next disrupted strain and repeating the expression coefficient estimation calculation for the individual genes. Characteristic gene expression action estimation device.
係数初期値推定部は、遺伝子モデルを、遺伝子発現係数行列を持つ遺伝子発現量の時間積分で近似して推定することを特徴とする請求項4記載の遺伝子発現作用推定装置。   5. The gene expression action estimation apparatus according to claim 4, wherein the coefficient initial value estimation unit estimates the gene model by approximating the gene expression by time integration of the gene expression level having the gene expression coefficient matrix. 係数初期値推定部は、発現係数の推定に最小二乗法を適用することを特徴とする請求項4記載の遺伝子発現作用推定装置。   The gene expression action estimation device according to claim 4, wherein the coefficient initial value estimation unit applies a least square method to the estimation of the expression coefficient. コンピュータに、遺伝子モデルを作成して、操作されていない遺伝子群からなる野生株と特定遺伝子を破壊した破壊株の時間列データを基に遺伝子発現作用を推定させるプログラムにおいて、
上記遺伝子モデルに対し、上記野生株と特定の破壊株との時間列データを同時に満足する発現係数を推定する手順を上記野生株に含まれる推定対象個別遺伝子の数だけ繰返すループとして持つ個別遺伝子への発現係数推定手順と、
上記特定の破壊株との推定を終えると、次の破壊株を取上げて上記個別遺伝子への発現係数推定手順を繰返すことを、所定数の破壊株について順次行う、繰り返しループとして持つ遺伝子発現係数行列推定手順とを用いて、遺伝子発現係数行列の推定初期値を得る手順を備えたことを特徴とする遺伝子発現作用推定プログラム。
In a program that makes a computer create a gene model and estimate the gene expression effect based on time series data of a wild strain consisting of a group of genes that have not been manipulated and a disrupted strain that has destroyed a specific gene,
To individual genes having a loop that repeats the number of the estimation target individual genes included in the wild strain for the gene model to estimate the expression coefficient that simultaneously satisfies the time series data of the wild strain and the specific disrupted strain An expression coefficient estimation procedure for
When the estimation with the specific disrupted strain is finished, the next disrupted strain is taken and the expression coefficient estimation procedure for the individual gene is repeated, sequentially for a predetermined number of disrupted strains, and a gene expression coefficient matrix having an iterative loop A gene expression action estimation program comprising a procedure for obtaining an estimated initial value of a gene expression coefficient matrix using an estimation procedure.
発現係数推定手順は、遺伝子モデルを、遺伝子発現係数行列を持つ遺伝子発現量の時間積分で近似して推定することを特徴とする請求項7記載の遺伝子発現作用推定プログラム。   8. The gene expression action estimation program according to claim 7, wherein the expression coefficient estimation procedure estimates a gene model by approximating the gene expression level having a gene expression coefficient matrix by time integration. 発現係数推定手順は、発現係数の推定に最小二乗法を適用することを特徴とする請求項7記載の遺伝子発現作用推定プログラム。   8. The gene expression action estimation program according to claim 7, wherein the expression coefficient estimation procedure applies a least square method to the expression coefficient estimation.
JP2003384457A 2003-11-14 2003-11-14 Gene expression action estimation method, gene expression action estimation apparatus, and gene expression action estimation program Expired - Fee Related JP4459597B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2003384457A JP4459597B2 (en) 2003-11-14 2003-11-14 Gene expression action estimation method, gene expression action estimation apparatus, and gene expression action estimation program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2003384457A JP4459597B2 (en) 2003-11-14 2003-11-14 Gene expression action estimation method, gene expression action estimation apparatus, and gene expression action estimation program

Publications (2)

Publication Number Publication Date
JP2005149037A true JP2005149037A (en) 2005-06-09
JP4459597B2 JP4459597B2 (en) 2010-04-28

Family

ID=34692842

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2003384457A Expired - Fee Related JP4459597B2 (en) 2003-11-14 2003-11-14 Gene expression action estimation method, gene expression action estimation apparatus, and gene expression action estimation program

Country Status (1)

Country Link
JP (1) JP4459597B2 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007115121A (en) * 2005-10-21 2007-05-10 Mitsubishi Space Software Kk Intergenic interaction estimation device, intergenic interaction estimation method, and intergenic interaction estimation program
JP2009048562A (en) * 2007-08-22 2009-03-05 Mitsubishi Space Software Kk Gene profile processing apparatus, gene profile processing program, and gene profile processing method
CN110349625A (en) * 2019-07-23 2019-10-18 中国科学院心理研究所 A kind of method for building up of human brain gene expression space-time norm
CN112885404A (en) * 2021-03-29 2021-06-01 哈尔滨理工大学 Model identification method and system of multilayer Boolean network

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102188118B1 (en) * 2019-04-15 2020-12-07 인천대학교 산학협력단 Electronic device for generating a gene feature vector for gene distributed representation based on a correlation between genes according to cancer and operating method thereof

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007115121A (en) * 2005-10-21 2007-05-10 Mitsubishi Space Software Kk Intergenic interaction estimation device, intergenic interaction estimation method, and intergenic interaction estimation program
JP4699861B2 (en) * 2005-10-21 2011-06-15 三菱スペース・ソフトウエア株式会社 Gene interaction estimation device, gene interaction estimation method, and gene interaction estimation program
JP2009048562A (en) * 2007-08-22 2009-03-05 Mitsubishi Space Software Kk Gene profile processing apparatus, gene profile processing program, and gene profile processing method
CN110349625A (en) * 2019-07-23 2019-10-18 中国科学院心理研究所 A kind of method for building up of human brain gene expression space-time norm
CN110349625B (en) * 2019-07-23 2022-02-08 中国科学院心理研究所 Method for establishing human brain gene expression space-time norm
CN112885404A (en) * 2021-03-29 2021-06-01 哈尔滨理工大学 Model identification method and system of multilayer Boolean network
CN112885404B (en) * 2021-03-29 2023-11-21 哈尔滨理工大学 Model identification method and system for multi-layer Boolean network

Also Published As

Publication number Publication date
JP4459597B2 (en) 2010-04-28

Similar Documents

Publication Publication Date Title
Green et al. Bayesian and Markov chain Monte Carlo methods for identifying nonlinear systems in the presence of uncertainty
JP3404532B2 (en) Optimal fitting parameter determining method and apparatus, and optimal fitting parameter determining program
CN109858158B (en) Parameter configuration method and system for computational fluid dynamics simulations
US20210133378A1 (en) Methods and systems for the estimation of the computational cost of simulation
CN107621269A (en) Fiber Optic Gyroscope Temperature Drift error compensating method
Sjöberg et al. Identification of Wiener–Hammerstein models: Two algorithms based on the best split of a linear model applied to the SYSID'09 benchmark problem
TWI774919B (en) Information processing device, program, process execution device and information processing system
Petersen et al. Data‐adaptive additive modeling
CN113360983B (en) Slope reliability analysis and risk assessment method
US20120310619A1 (en) Fast function extraction
Machado et al. Takagi–Sugeno fuzzy models in the framework of orthonormal basis functions
CN116383912A (en) Micro motor structure optimization method and system for improving control precision
Bigoni Uncertainty quantification with applications to engineering problems
Razmjooy et al. Uncertain method for optimal control problems with uncertainties using Chebyshev inclusion functions
US6711598B1 (en) Method and system for design and implementation of fixed-point filters for control and signal processing
CN114065659B (en) Two-phase flow interface evolution simulation method and system
JP4459597B2 (en) Gene expression action estimation method, gene expression action estimation apparatus, and gene expression action estimation program
Allakhverdiyeva Application of neural network for digital recursive filter design
WO2000060423A1 (en) Model error bounds for identification of stochastic models for control design
Chowdhury et al. Robust controller synthesis with consideration of performance criteria
JP6065543B2 (en) Neural network design method, fitting method, and program
JP6907767B2 (en) Magnetic material simulation device, magnetic material simulation program, and magnetic material simulation method
Castellini et al. From time series to biological network regulations: an evolutionary approach
Campbell et al. Parameter estimation in differential equation models with constrained states
Clarich et al. Formulations for Robust Design and Inverse Robust Design

Legal Events

Date Code Title Description
A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20061113

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20091201

A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20100115

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20100209

A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20100210

R150 Certificate of patent or registration of utility model

Ref document number: 4459597

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150

Free format text: JAPANESE INTERMEDIATE CODE: R150

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20130219

Year of fee payment: 3

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20130219

Year of fee payment: 3

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20140219

Year of fee payment: 4

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

LAPS Cancellation because of no payment of annual fees