JP4840597B2

JP4840597B2 - Drug discovery multi-target screening device

Info

Publication number: JP4840597B2
Application number: JP2007058741A
Authority: JP
Inventors: 勉襲田
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2007-03-08
Filing date: 2007-03-08
Publication date: 2011-12-21
Anticipated expiration: 2027-03-08
Also published as: JP2008224235A

Description

The present invention provides an efficient screening apparatus for screening compounds against two or more target proteins in the field of drug discovery.

Methods for screening compounds against a single target protein are well established in drug discovery. Major pharmaceutical companies use systems that run the experiments automatically on high-throughput screening (HTS) equipment.

Recently there has been a growing effort to find single compounds that are active against two or more targets. When the system described above is used to examine activity against multiple target proteins, the activity of every compound in the compound library is measured against each target protein, and the results are then combined and analyzed.

With this method, every compound in the library must be tested for activity against each individual target protein, which requires a great deal of time and expense.

To improve screening efficiency, a learning method is therefore used to search by computer, in advance, for compounds that may act on the target protein and so narrow down the candidates. For example, a compound selection method for drug discovery screening based on active learning has been developed (Non-Patent Document 1).

Active learning is a learning method in which the learner (a computer) actively selects its training data in order to improve prediction accuracy. It is a mining technique in which the learner repeatedly runs experiments on selected data, learns from the results, predicts the untested data, and chooses from the untested data which items to test next, so that sufficient predictive performance can be achieved with few experiments. It is described, for example, in the survey "Active Learning Overview" by Abe and Nakamura in the Japanese journal "Information Processing", Vol. 38, No. 7, pp. 558-561, published in 1997.

Non-Patent Document 1 has attracted attention as a drug discovery screening method that uses active learning to improve learning efficiency and to efficiently find, among an enormous variety of compounds, those active against a specific protein.

However, even in screening that uses so-called in silico techniques (Non-Patent Document 1), in which activity is measured for compounds narrowed down by computer in order to find the target compound, the activity against each individual target protein is examined and the results are then combined. In that respect the approach is essentially no different from the method above.
Non-Patent Document 1: Proceedings of the 32nd Symposium on Structure-Activity Relationships, "Drug Discovery Screening Based on an Active Learning Method Using Descriptor Sampling"

As described above, when screening for compounds active against multiple target proteins, the practice has been, whether or not a learning method is used, to first search for compounds active against each individual target protein and then select the compounds active against all of the target proteins.

Consequently, when searching for compounds that are simultaneously active against multiple target proteins, conventional screening methods have very low search efficiency. In other words, conventional apparatus has not been adequate as a system for examining activity against multiple targets at the same time.

The present invention therefore provides an efficient screening method for searching for compounds that are active against multiple target proteins.

The present invention also provides an apparatus that enables efficient screening when searching for compounds that are active against multiple target proteins.

The system according to the present invention is
a system for identifying a compound having activity against two or more target proteins, comprising:
a compound library in which a plurality of compounds are arranged at predetermined positions;
a protein library containing a plurality of identical plates, on each of which two or more target proteins are arranged independently at predetermined positions;
a reaction device for reacting a test compound with a target protein in each reaction region of the plate;
a detection device for detecting the reaction between the test compound and the target protein in the reaction region; and
control means for assigning a test selection order to the compounds in the compound library.

The screening method according to the present invention is
a method of screening for compounds having activity against two or more target proteins,
comprising a step of measuring, for compounds selected in sequence from a group of compounds, the activity against at least one of the two or more target proteins,
wherein the compound to be measured next is selected by a learning method or a similarity calculation method.

The screening method according to the present invention is also
a method of screening for compounds having activity against two or more target proteins,
comprising a step of measuring, for compounds selected in sequence from a group of compounds, the activity against at least one of the two or more target proteins,
wherein the similarity between a compound having activity and each compound in the compound library is calculated,
and the compound to be tested next is selected based on the calculated similarity.

According to the present invention, screening against multiple target proteins can be made more efficient.

FIG. 1 shows the configuration of the present invention.

This system comprises a compound library 101 in which compounds are stored; a protein library 102 in which target proteins are stored; a compound-protein mixing device 103 for mixing a compound with the target proteins; a reaction device 104 in which the reaction is carried out after mixing; an activity measuring device 105, serving as the detection device, which receives the plate after the reaction and measures activity; a control device 106, serving as the control means, which controls the entire apparatus; and a compound pickup device 107, serving as distribution means, which carries out the compound to be tested next based on information from the control device. The compound-protein mixing device 103, the reaction device 104, the activity measuring device 105, and the compound pickup device 107 are described as separate devices only to simplify the explanation. In general they need not be provided separately and may be combined into a single experimental apparatus that has the functions of mixing a compound with target proteins, carrying out the reaction, measuring activity, and carrying out compounds.

The compound library 101 and the control device 106 are connected by wiring, and information on the compounds in the compound library is sent from the compound library's management device to the control device 106. The compound library 101 and the compound pickup device 107 are connected so that compounds from the compound library can be loaded into the compound pickup device 107. The protein library 102 and the compound-protein mixing device 103 are connected so that plates holding the target proteins can move between them. The compound-protein mixing device 103 and the reaction device 104 are connected so that the plate can move to the reaction device 104 after mixing. The reaction device 104 and the activity measuring device 105 are connected so that the plate can move to the activity measuring device 105 after the reaction. The activity measuring device 105 and the control device 106 are connected so that activity measurement information for the measured compound can be provided. The control device 106 and the compound pickup device 107 are connected so that the compound selected by the control device 106 for the next experiment is communicated to the compound pickup device 107. The compound-protein mixing device 103 and the compound pickup device 107 are connected so that the selected compound carried out of the compound library 101 by the compound pickup device 107 can be placed in the wells of the plate, supplied from the protein library 102, that holds the target proteins.

The information on the compounds in the compound library may instead be stored in the control device 106 in advance.

The compound library and the protein library can be built from devices of conventional configuration. For example, a solution of each compound may be held in an ampoule or vial at the position specified by its position information, and the required amount dispensed from there by a dispensing device such as a pipettor into reaction regions such as the wells of a plate.

Next, the operation of each device is described with reference to FIG. 2.

First, assume that the user has arranged multiple target proteins in the protein library 102 and has arranged the compounds whose activity is to be measured in the compound library 101 (step 201).

There is no particular constraint on the order in which compounds are arranged in the compound library 101. Information about each compound is stored at the same time, for example in the compound library's management device. What is stored is a representation of the compound in a form that can be learned from, such as structural descriptors or physicochemical constants. The compound information is stored together with the compound's position in the compound library.

Position information may be written, for example, as a sequential number counted from the edge of the plate, or as a pair consisting of a plate number and a number within the plate.
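The two position formats mentioned above are interchangeable. The following is a minimal sketch of the conversion, assuming zero-based numbering and a fixed number of wells per plate; the value 96 and both function names are illustrative assumptions, not taken from the patent.

```python
WELLS_PER_PLATE = 96  # assumption for illustration; the text does not fix a plate size


def to_plate_well(flat_index, wells_per_plate=WELLS_PER_PLATE):
    """Convert a sequential position into a (plate number, number within plate) pair."""
    return divmod(flat_index, wells_per_plate)


def to_flat_index(plate, well, wells_per_plate=WELLS_PER_PLATE):
    """Convert a (plate number, number within plate) pair back to a sequential position."""
    return plate * wells_per_plate + well


position = to_plate_well(100)          # position 100 counted from zero
restored = to_flat_index(*position)    # round trip back to the flat index
```

Either form can serve as the key under which compound information and measurement results are stored.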

In the protein library 102 as well, there is no constraint on the arrangement within a single plate, but the arrangement must be the same on every plate. That is, it is a precondition that the first well of every plate contains the same protein.

Next, the compounds in the compound library 101 are sent to the compound pickup device 107, and the compound information is sent to the control device 106 together with the position information (step 202).

Next, the control device 106 learns from the stored compound information, selects the compound to be tested next, and passes that information to the compound pickup device 107 (step 203).

Possible ways of selecting the compound at this point include learning (mining) from the data and picking compounds with a high positive-example score, or computing the similarity to compounds found to be active with a similarity measure such as the Tanimoto coefficient and picking compounds in descending order of similarity. The compound to be measured next can also be selected with an active learning method such as the one described in Non-Patent Document 1.
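As an illustration of the similarity-based option, the sketch below ranks untested compounds by their Tanimoto coefficient to known actives. It assumes each compound is represented as a set of "on" descriptor bits; the fingerprints, compound ids, and function names are hypothetical, and the patent does not specify a fingerprint format.

```python
def tanimoto(a, b):
    """Tanimoto coefficient |A & B| / |A | B| of two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank_by_similarity(actives, candidates):
    """Order candidate ids by their best similarity to any known active.

    Requires at least one known active compound.
    """
    def best(cid):
        return max(tanimoto(candidates[cid], fp) for fp in actives)
    return sorted(candidates, key=best, reverse=True)


# Hypothetical fingerprints: each compound is a set of set descriptor bits.
known_actives = [frozenset({1, 2, 3, 4})]
untested = {
    "c1": frozenset({1, 2, 3, 5}),
    "c2": frozenset({7, 8, 9}),
    "c3": frozenset({1, 2, 3, 4, 6}),
}
ranking = rank_by_similarity(known_actives, untested)
```

Taking the maximum over the known actives is one design choice among several; averaging the similarities would be an equally plausible reading of the text.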

The learning method may be any algorithm that learns input/output data using some form of representation, for example decision trees, decision lists, neural networks, naive Bayes, Bayesian networks, genetic algorithms, regression analysis, or support vector machines. The score is a numerical measure of how likely each unknown data item is to be a positive example; for instance, the larger the value, the more likely the item is positive.

Next, in the compound-protein mixing device 103, the selected compound sent from the compound pickup device 107 is added to and mixed in each of the wells holding the target proteins sent from the protein library 102. The plate containing the mixed compound and target proteins is sent to the reaction device 104 (step 204).

Next, in the reaction device 104, the reaction is carried out under conditions suited to each target protein in order to examine the activity of the selected compound against it. The reaction conditions for the individual target proteins on the plate may be the same or different. After the reaction, the plate is sent to the activity measuring device 105 (step 205).

The activity measuring device 105 measures the activity of the selected compound against each target protein. The resulting activity measurement information is sent to the control device 106 together with the position information of the selected compound. Activity measurement information includes, for example, whether activity is present and the measured activity value.

Next, the control device 106 accumulates and stores the activity measurement information received from the activity measuring device 105. The activity of a compound against each target protein may be stored, for example, as the measured value itself, or by setting a threshold and recording 1 when the activity is at or above the threshold and 0 otherwise.

The results for a selected compound across multiple target proteins may, for example, be stored separately for each target protein, or stored as a single value: 1 if activity was observed against at least one protein and 0 if no activity was observed against any protein. The activity measurement information is stored in the control device 106 in association with the position information of the selected compound.
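The storage options described above can be sketched as follows. The target names, readings, and threshold value are hypothetical, and the code assumes that a larger measured value means stronger activity.

```python
def binarize(measurements, threshold):
    """Map each target's measured activity to 1 (at or above threshold) or 0."""
    return {target: int(value >= threshold) for target, value in measurements.items()}


def any_active(flags):
    """Single label: 1 if active against at least one target, otherwise 0."""
    return int(any(flags.values()))


# Hypothetical readings for one compound against three targets.
readings = {"target_a": 0.82, "target_b": 0.10, "target_c": 0.05}
flags = binarize(readings, threshold=0.5)   # per-target storage
label = any_active(flags)                   # combined single-value storage
```

Because `any_active` works for any number of targets, the same encoding carries over unchanged when a third or fourth target is added.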

Next, learning or similarity calculation is performed on the compound information accumulated in the control device 106 together with the newly acquired activity measurement information for the selected compound, and the compounds to be tested next are selected in descending order of score from among those not yet tested. The position information of the newly selected compounds is then sent to the compound pickup device 107 (step 207). After the position information has been sent, steps 203 to 207 are repeated until the user ends the process.
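The select, measure, and re-score cycle can be sketched as a simple loop. Everything below is an illustrative assumption: the scoring function is a toy nearest-active rule on one-dimensional descriptors rather than the learner used in the patent, the `measure` callback stands in for the physical experiment, and `max_rounds` replaces the user's decision to stop.

```python
def screen(library, measure, score, seed_ids, batch_size, max_rounds):
    """Iteratively pick, measure, and re-score compounds.

    library: dict id -> descriptor; measure: id -> 0/1 label (the experiment);
    score: (descriptor, results, library) -> float, higher means test sooner.
    """
    results = {cid: measure(cid) for cid in seed_ids}
    for _ in range(max_rounds):
        untested = [cid for cid in library if cid not in results]
        if not untested:
            break
        # Highest-scoring untested compounds are tested next.
        untested.sort(key=lambda cid: score(library[cid], results, library), reverse=True)
        for cid in untested[:batch_size]:
            results[cid] = measure(cid)
    return results


def nearest_active_score(descriptor, results, library):
    """Toy score: negative distance to the closest compound already found active."""
    actives = [library[c] for c, y in results.items() if y == 1]
    if not actives:
        return 0.0
    return -min(abs(descriptor - a) for a in actives)


library = {i: float(i) for i in range(20)}            # hypothetical 1-D descriptors
truth = {i: int(8 <= i <= 11) for i in library}       # hypothetical ground truth
results = screen(library, truth.__getitem__, nearest_active_score,
                 seed_ids=[9], batch_size=2, max_rounds=2)
```

In this toy run the loop tests only compounds near the seed active, which is the behaviour the text relies on to avoid measuring the whole library.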

The configuration of the present invention has been described using an example in which the target proteins are held in the wells of a plate and the test compound is added to them to carry out the reaction, but the invention is not limited to this example. For instance, test tubes or vials may be used as containers for the target proteins and also as reaction vessels, with the compound solution and the protein solution mixed in them.

In addition, although active learning is used in this description as the method of selecting the compound whose activity is measured next, the learning method used in the present invention is not limited to active learning; an ordinary learning method or a similarity calculation method can also be used.

The present invention is described more concretely below with reference to examples, but the invention is in no way limited to these examples.

Example 1
In Example 1, 4962 compounds from a commercially available library were used as the compound library, and two receptors, the GABA receptor and the benzodiazepine receptor, were used as the target proteins. The example examined how efficiently compounds active against both target proteins could be found using a screening apparatus configured according to the present invention.

Among the compounds in the library used, 39 are active only at the GABA receptor, 6 are active only at the benzodiazepine receptor, and 5 are active at both the GABA and benzodiazepine receptors. The library therefore contains 44 compounds active at the GABA receptor and 11 compounds active at the benzodiazepine receptor.

Adapting the proposed apparatus to this purpose gives the following setup.

The compound library 101 holds 4962 compounds. Each plate in the protein library 102 has at least two wells, and each well holds one of the target proteins. These wells serve as the reaction regions: the compound whose activity is to be measured is injected, mixed, and reacted there. Because one plate is used per compound, multiple identical plates are prepared, each with the individual target proteins held in wells at predetermined positions.

The compound pickup device 107 was configured to pick 96 compounds at a time. That is, the compound pickup device 107 selects 96 compounds, and the compound-protein mixing device 103 injects each of them into the wells of one of 96 plates holding the individual target proteins and mixes them.

Within the control device 106, the compounds to be tested next were determined by an active learning method using decision trees as the lower-level learner (see "Discovery Science and Data Mining", ISBN 4-320-12018-3). After active learning, the test selection order was assigned to the compounds not yet tested in descending order of score, and compounds were picked accordingly.

The training data were divided into 10 parts. Learning started with one part as the data of known activity, the other nine parts were treated as data of unknown activity, and the aim was to identify the compounds acting on both receptors among the data of unknown activity. The data were divided as follows: (i) compounds inactive at both the GABA and benzodiazepine receptors were split into 10 parts; (ii) compounds active at both receptors were split into 10 parts; (iii) compounds active only at the GABA receptor were split into 10 parts; and (iv) compounds active only at the benzodiazepine receptor were split into 10 parts. One part from each of (i) to (iv) was then combined to form each of the 10 data sets. Where the counts did not divide evenly, the remainder was distributed as appropriate.
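The division described above is a stratified split. A minimal sketch follows, assuming a round-robin deal as the way of distributing remainders; the patent only says the remainder was distributed "as appropriate", so that rule, the group names, and the placeholder compound ids are assumptions.

```python
def stratified_folds(groups, k):
    """Deal each group's members round-robin into k folds.

    groups: dict name -> list of compound ids. A running offset carries the
    deal from one group into the next, so fold sizes differ by at most one.
    """
    folds = [[] for _ in range(k)]
    offset = 0
    for members in groups.values():
        for i, cid in enumerate(members):
            folds[(offset + i) % k].append(cid)
        offset += len(members)
    return folds


# Group sizes from Example 1; the ids themselves are placeholders.
sizes = {"neither": 4962 - 39 - 6 - 5, "both": 5, "gaba_only": 39, "benzo_only": 6}
groups = {name: [(name, i) for i in range(n)] for name, n in sizes.items()}
folds = stratified_folds(groups, k=10)
both_per_fold = [sum(1 for name, _ in fold if name == "both") for fold in folds]
```

With only five compounds active at both receptors, half of the ten folds necessarily contain none of them, which is the situation the later switch to a 5-part division is meant to avoid.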

The evaluation measured how early all five compounds active against both target proteins could be identified. Whether each selected compound was active was determined from information registered in a database, without actually performing activity measurements.

The activity stored in the control device 106 was represented in the following four ways.
(A) Only compounds active at both the GABA and benzodiazepine receptors are labeled 1; all other compounds are labeled 0.
(B) Only compounds active at the GABA receptor are labeled 1; all other compounds are labeled 0.
(C) Only compounds active at the benzodiazepine receptor are labeled 1; all other compounds are labeled 0.
(D) Only compounds active at either of the GABA and benzodiazepine receptors are labeled 1; all other compounds are labeled 0.
Representation (A) corresponds to identifying compounds that act on both receptors directly; (B) to identifying them using compounds that act on the GABA receptor as a clue; (C) to identifying them using compounds that act on the benzodiazepine receptor as a clue; and (D) to identifying them using compounds that act on either receptor as a clue.
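The four representations can be written as one labeling function. The sketch below also counts how many compounds each representation labels 1, using the library composition stated in Example 1. It assumes that "either" in (D) means "at least one", which matches the wording used later for three or more targets; the function and variable names are illustrative.

```python
def label(gaba_active, benzo_active, scheme):
    """Return the 0/1 training label under representation A, B, C, or D."""
    rules = {
        "A": gaba_active and benzo_active,
        "B": gaba_active,
        "C": benzo_active,
        "D": gaba_active or benzo_active,  # assumption: "either" means at least one
    }
    return int(rules[scheme])


# Library composition from Example 1: (GABA active, benzodiazepine active) -> count.
composition = {
    (True, False): 39,
    (False, True): 6,
    (True, True): 5,
    (False, False): 4962 - 39 - 6 - 5,
}
positives = {
    scheme: sum(n * label(g, b, scheme) for (g, b), n in composition.items())
    for scheme in "ABCD"
}
```

Under these assumptions (A) gives the learner the fewest positive examples and (D) the most, which may help explain why the looser representations find the doubly active compounds sooner.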

In cases (A) and (C), dividing the data into 10 parts can produce a data set that contains no data with the value 1 at the initial stage of learning, which could make it impossible to measure performance accurately. Learning was therefore started from one data set of a theoretically reasonable 5-part division, and data were selected from the remaining four data sets.

Under this problem setting, the number of compounds that had to be tested to identify all compounds active at both receptors was:
(A) 1473 compounds
(B) 976 compounds
(C) 880 compounds
(D) 784 compounds

When assays are run directly on HTS as in the conventional method, every compound must be tested, so all 4962 compounds have to be assayed. By contrast, all the target compounds were found after 1473 compounds when searching from the start only for compounds active against both targets (case (A)); after 976 compounds when searching for compounds that act on the GABA receptor and then, among them, for those that also act on the benzodiazepine receptor; after 880 compounds when searching for compounds that act on the benzodiazepine receptor and then for those that also act on the GABA receptor; and after 784 compounds when searching for compounds that act on either the GABA receptor or the benzodiazepine receptor. This demonstrates the effectiveness of the apparatus of the present invention.
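To put these counts in proportion, the short calculation below expresses each one as a fraction of the 4962-compound library. It uses only the figures reported above; the variable names are illustrative.

```python
LIBRARY_SIZE = 4962
tested = {"A": 1473, "B": 976, "C": 880, "D": 784}

# Fraction of the library that had to be assayed under each representation.
fraction = {scheme: n / LIBRARY_SIZE for scheme, n in tested.items()}
best_scheme = min(fraction, key=fraction.get)

for scheme, f in fraction.items():
    print(f"({scheme}) {tested[scheme]} of {LIBRARY_SIZE} compounds = {f:.1%}")
```

Every representation needs less than a third of the assays required by exhaustive HTS, and (D) needs the fewest.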

Example 2
The control device 106 can use not only the active learning method described above but also an ordinary learning method.

In that case, the control device 106 ranks the compounds, after learning, in order of a score indicating how strongly each shows the characteristics of compounds active against the target proteins. An experiment with an ordinary learning method was carried out using the same system configuration as above.

Ensemble learning with decision trees as the lower-level learner was used. Typical ensemble learning methods include bagging and boosting.

For evaluation, the number of compounds active at both receptors among the first 96 compounds selected after learning was compared.

The results were as follows.

(A) 0 compounds
(B) 1 compound
(C) 0 compounds
(D) 2 compounds
These results show that (D) finds the most active compounds at an early stage, and that an ordinary learning method, not only active learning, is effective as the learning method in the control device 106.

Example 3
The control device 106 can use not only learning methods but also similarity calculation methods such as the Tanimoto coefficient.

In that case, the control device 106 computes the similarity between highly active compounds and the compounds in the compound library 101, and ranks the compounds in descending order of similarity. As the performance of the related-compound selection method in Non-Patent Document 1 shows, such an approach has a similarly good effect, though not as great as that of active learning.

(Example 4)
The proposed method is not limited to searching against two targets; it extends readily to searching for compounds that act on three or more targets simultaneously. To extend it, the number of targets in the protein library 102 is set to three or more, and the control device 106 assigns 1 to a compound found active against at least one target and 0 to a compound found active against none of the targets. With this representation the problem setting is the same as for the apparatus described above, so learning or similarity search can be expected to have the same effect.
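The binary representation described above can be sketched as follows. The function name `binary_label` and the target names are hypothetical; the rule itself (1 if active against at least one target, 0 otherwise) is the one stated in the text.

```python
def binary_label(activities):
    """Return 1 if the compound is active against at least one target, else 0.

    activities: dict mapping target name -> bool (activity observed or not)
    """
    return 1 if any(activities.values()) else 0

# Hypothetical measurements for three targets.
print(binary_label({"T1": False, "T2": True, "T3": False}))
print(binary_label({"T1": False, "T2": False, "T3": False}))
```

Because the label does not depend on the number of targets, the same learning or similarity-search procedure can be reused unchanged when a third or further target is added.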

A block diagram of an example of the system according to the present invention.
A diagram showing the flow of an example of the screening method according to the present invention.

Explanation of symbols

101: Compound library
102: Protein library
103: Compound-protein mixing device
104: Experimental device
105: Activity measuring device
106: Control device
107: Compound pick-up device

Claims

A system for identifying a compound having activity against two or more target proteins, comprising:
a compound library in which a plurality of compounds are arranged at predetermined positions;
a protein library including a plurality of identical plates, each having two or more target proteins independently arranged at predetermined positions;
a reaction apparatus for reacting a test compound and a target protein in each reaction region of the plate;
a detection device for detecting a reaction between the test compound and the target protein in the reaction region;
control means for assigning a test selection order to the compounds in the compound library; and
dispensing means for removing a test compound from the compound library and dispensing it into the reaction region of the plate,
wherein the control means instructs the dispensing means to dispense the test compound selected based on the test selection order to a new analysis plate,
the control means uses the position information of each compound in the compound library to select the compound chosen based on the test selection order, and commands that the compound at the selected position be dispensed to a new plate,
the test selection order is determined by learning or similarity calculation using detection data from the detection device and information on each compound in the compound library, and
when the learning or similarity calculation is performed, the activity against the target proteins is expressed as a binary value, being either a value indicating that the compound has activity against at least one target protein or a value indicating that the compound has no activity against any target protein.

A screening method for compounds having activity against two or more target proteins, comprising:
measuring the activity against at least one of the two or more target proteins for compounds sequentially selected from the compound group included in a compound library,
wherein the compound to be measured next is selected by a learning method or a similarity calculation method, and
when the learning or similarity calculation is performed, the activity against the target proteins is expressed as a binary value, being either a value indicating that the compound has activity against at least one target protein or a value indicating that the compound has no activity against any target protein.

The screening method according to claim 2, wherein the learning method selects the compound to be measured next by active learning.

The screening method according to claim 2, wherein the learning machine of the learning method is a decision tree algorithm.

The screening method according to any one of claims 2 to 4, wherein the similarity between a compound having activity against at least one of the target proteins and each compound included in the compound library is calculated, and
the compound to be tested next is selected based on the calculated similarity.