JP2020143063A

JP2020143063A - Water-soluble trans-membrane proteins and methods for preparing and using the same

Info

Publication number: JP2020143063A
Application number: JP2020073303A
Authority: JP
Inventors: Shuguang Zhang; Fei Tao
Original assignee: Massachusetts Institute of Technology
Current assignee: Massachusetts Institute of Technology
Priority date: 2015-02-18
Filing date: 2020-04-16
Publication date: 2020-09-10
Also published as: JP2017516492A; CN113929766A; JP2022037160A; JP7061461B2; JP2023130393A; CN106459174A; CN106459174B

Abstract

To provide computer-implemented methods for executing a procedure to select a water-soluble variant of a G protein-coupled receptor (GPCR). SOLUTION: Provided is a method comprising: entering a sequence of the GPCR for analysis; obtaining a variant of the GPCR in which a plurality of hydrophobic amino acids in the transmembrane (TM) domain α-helical segments ("TM regions") of the GPCR are substituted; subsequently obtaining an α-helical secondary structure result for the variant to verify maintenance of α-helical secondary structures in the variant; and obtaining a transmembrane region result for the variant to verify water solubility of the variant, thereby selecting the water-soluble variant of the GPCR. SELECTED DRAWING: None

Description

(Related Applications)
This application claims priority to US Patent Application No. 14/669,753 and International Application No. PCT/US2015/022780, both filed on March 26, 2015, both of which claim, under 35 U.S.C. 119(e), the benefit of the filing dates of US Provisional Application No. 62/117,550, filed on February 18, 2015; US Provisional Application No. 61/993,783, filed on May 15, 2014; and US Provisional Application No. 61/971,388, filed on March 27, 2014.

This application also claims, under 35 U.S.C. 119(e), the benefit of the filing dates of US Provisional Application No. 62/117,550, filed on February 18, 2015; US Provisional Application No. 61/993,783, filed on May 15, 2014; and US Provisional Application No. 61/971,388, filed on March 27, 2014.

The entire contents of each of the above-referenced applications, including all drawings and sequence listings, are incorporated herein by reference.

Membrane proteins play important roles in all living systems. Approximately 30% of all genes in almost all sequenced genomes encode membrane proteins. However, our detailed understanding of their structures and functions lags far behind our understanding of soluble proteins. As of March 2015, more than 100,000 structures were deposited in the Protein Data Bank. Of these, however, only 945 are membrane protein structures, representing 530 unique structures, which include 28 G protein-coupled receptors and no tetraspanin membrane proteins.

Although membrane receptors are very important, several obstacles stand in the way of elucidating their structures and functions, as well as their recognition and ligand-binding properties. The most critical and challenging task is that it is extremely difficult to produce milligram quantities of soluble and stable receptors. Inexpensive large-scale production methods are urgently needed and have therefore been the focus of extensive research. Detailed structural studies can be carried out only once these preliminary obstacles have been overcome.

Zhang et al. (US Pat. No. 8,637,452), incorporated herein by reference, describe an improved method for producing water-soluble GPCRs in which certain hydrophobic amino acids located within the transmembrane regions are substituted by polar amino acids. However, that method is labor-intensive. Furthermore, although the modified transmembrane regions met the water-solubility criteria, improvements in water solubility and ligand binding are desired. Accordingly, there is a need in the art for improved methods of studying G protein-coupled receptors.

The present invention relates to methods of designing, selecting and/or producing water-soluble membrane proteins and peptides, to peptides (and transmembrane domains) designed, selected or produced thereby, to compositions comprising said peptides, and to methods of use thereof. In particular, the method relates to designing libraries of water-soluble membrane peptides, such as GPCR variants and tetraspanin membrane proteins, using the "QTY principle", which changes the water-insoluble amino acids leucine, isoleucine, valine and phenylalanine (single-letter codes L, I, V, F) into the non-ionic, water-soluble amino acids glutamine, threonine and tyrosine (single-letter codes Q, T, Y). In addition, two further non-ionic amino acids, asparagine (N) and serine (S), may be used to substitute L, I and V, but not F. In the embodiments discussed below, it should be understood that asparagine (N) and serine (S) are contemplated as alternatives to Q and T (where a variant is described) or as substituents for L, I or V (where a native protein is described). For the sake of brevity, however, these other embodiments are not described in further detail in this application, because they will be apparent to those skilled in the art from the teachings herein.

The present invention encompasses one or more modified, synthetic and/or non-naturally occurring α-helical domains, and water-soluble polypeptides (e.g., "sGPCR") comprising one or more such modified α-helical domains, wherein the modified α-helical domain(s) comprise an amino acid sequence in which a plurality of hydrophobic amino acid residues (L, I, V, F) within an α-helical domain of a native membrane protein are replaced with non-ionic, hydrophilic amino acid residues (Q, T, T, Y, respectively, or "Q, T, Y") and/or with N and S. The invention also encompasses a method of preparing a water-soluble polypeptide comprising replacing a plurality of hydrophobic amino acid residues (L, I, V, F) within one or more α-helical domains of a native membrane protein with non-ionic, hydrophilic amino acid residues (Q/N/S, T/N/S, Y). The invention further encompasses a polypeptide prepared by replacing a plurality of hydrophobic amino acid residues (L, I, V, F) within the α-helical domains of a native membrane protein with non-ionic, hydrophilic amino acid residues (Q/N/S, T/N/S, T/N/S, Y, respectively). A variant can be identified by the name of the parent or native protein (e.g., CXCR4) followed by the abbreviation "QTY" (e.g., CXCR4-QTY).

Accordingly, one aspect of the invention provides a method of operating a computer program to execute a scripted procedure for designing a water-soluble variant of a membrane protein (e.g., a G protein-coupled receptor (GPCR)), the method comprising: (1) entering a sequence of the membrane protein (e.g., GPCR) for analysis; (2) obtaining a variant of the membrane protein (e.g., GPCR) in which a plurality of hydrophobic amino acids in the transmembrane (TM) domain α-helical segments ("TM regions") of the membrane protein are substituted, wherein (a) said hydrophobic amino acids are selected from the group consisting of leucine (L), isoleucine (I), valine (V) and phenylalanine (F), (b) each said leucine (L) is independently substituted by glutamine (Q), asparagine (N) or serine (S), (c) each said isoleucine (I) and each said valine (V) is independently substituted by threonine (T), asparagine (N) or serine (S), and (d) each said phenylalanine is substituted by tyrosine (Y); and subsequently, (3) obtaining an α-helical secondary structure result for the variant to verify that the α-helical secondary structures are maintained in the variant; and (4) obtaining a transmembrane region result for the variant to verify the water solubility of the variant, thereby designing the water-soluble variant of the membrane protein (e.g., GPCR).
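Step (2) above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the scripted procedure of the application: the function name `qty_substitute`, the 0-based end-exclusive region format and the toy sequence are all hypothetical, and only the default L/I/V/F to Q/T/T/Y mapping is taken from the text. Steps (3) and (4), which rely on external prediction tools, are not covered here.

```python
# Default QTY mapping taken from the text: L -> Q, I -> T, V -> T, F -> Y.
# The N/S alternatives for L, I and V can be supplied through `mapping`.
QTY_MAP = {"L": "Q", "I": "T", "V": "T", "F": "Y"}


def qty_substitute(sequence, tm_regions, mapping=QTY_MAP):
    """Return `sequence` with L/I/V/F replaced inside the given TM regions.

    `tm_regions` is a list of (start, end) pairs, 0-based and end-exclusive
    (an assumed convention). Residues outside those regions are unchanged.
    """
    residues = list(sequence.upper())
    for start, end in tm_regions:
        for i in range(start, end):
            residues[i] = mapping.get(residues[i], residues[i])
    return "".join(residues)


if __name__ == "__main__":
    toy = "MKDLIVFAGLLEEK"  # made-up toy sequence, not a real GPCR
    print(qty_substitute(toy, [(3, 11)]))  # only positions 3..10 are edited
```

Passing the TM regions explicitly keeps extracellular and intracellular segments untouched, which matches the embodiments in which those domains remain identical to the native protein.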

In certain embodiments, step (3) is performed before, concurrently with, or after step (4). Additional steps described herein can be incorporated into the above procedure. The procedure preferably uses computational steps performed by a data processing system. The system utilizes an automated computational system and a method for selecting protein variants.

In certain embodiments, in step (2), one subset of said plurality of hydrophobic amino acids within one and the same TM region of the GPCR is substituted to generate one member of a library of candidate variants, and one or more different subsets of said plurality of hydrophobic amino acids are substituted to generate further members of the library. In certain embodiments, the method may further comprise ranking all members of said library based on a combined score, wherein the combined score is a weighted combination of the α-helical secondary structure prediction result and the transmembrane region prediction result. In certain embodiments, the method further comprises ranking the variants using a ranking function. In certain embodiments, the ranking function may include a secondary structure component and a water solubility component. For example, the ranking function may include weighting values for the secondary structure component and/or the water solubility component. In certain embodiments, the method further comprises performing the method with a data processor, which may further have a memory connected thereto.
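The library generation and weighted ranking described above might look like the following sketch. Everything here is an assumption for illustration: the two scoring callables are toy stand-ins (a real run would plug in a secondary structure predictor and a transmembrane region predictor), and the equal default weights are arbitrary. Note that enumerating every subset grows as 2^n in the number of L/I/V/F positions, so a practical procedure would need to restrict the subsets considered.

```python
from itertools import combinations

QTY_MAP = {"L": "Q", "I": "T", "V": "T", "F": "Y"}


def enumerate_variants(tm_segment, min_subs=1):
    """Yield variants of one TM segment, one per subset of L/I/V/F positions."""
    positions = [i for i, aa in enumerate(tm_segment) if aa in QTY_MAP]
    for k in range(min_subs, len(positions) + 1):
        for subset in combinations(positions, k):
            residues = list(tm_segment)
            for i in subset:
                residues[i] = QTY_MAP[residues[i]]
            yield "".join(residues)


def rank_variants(variants, helix_score, solubility_score, w_helix=0.5, w_sol=0.5):
    """Return (combined_score, sequence) pairs sorted best first."""
    scored = [
        (w_helix * helix_score(v) + w_sol * solubility_score(v), v)
        for v in variants
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy stand-ins (hypothetical): NOT real predictors.
def toy_helix_score(seq):
    return 1.0  # pretend every variant keeps its helix


def toy_solubility_score(seq):
    return sum(aa not in QTY_MAP for aa in seq) / len(seq)


if __name__ == "__main__":
    ranked = rank_variants(enumerate_variants("ALIF"),
                           toy_helix_score, toy_solubility_score)
    for score, seq in ranked[:3]:
        print(f"{score:.3f} {seq}")
```

Keeping the scorers as parameters means the weighting logic can be tested independently of whichever prediction tools are eventually used.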

In certain embodiments, the method may further comprise selecting the N members having the highest combined scores to form a first library of candidate variants for said TM region, wherein N is a predetermined integer (e.g., 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more). In certain embodiments, the method may further comprise generating one library of candidate variants for each of 1, 2, 3, 4, 5 or all 6 of the other TM regions of the GPCR. In certain embodiments, the method may further comprise replacing two or more TM regions of the GPCR with the corresponding TM regions from the candidate variant libraries to generate a library of combinatory variants. In certain embodiments, the method further comprises producing/expressing said combinatory variants. In certain embodiments, the method further comprises testing said combinatory variants for ligand binding (e.g., by the yeast two-hybrid method), wherein those having substantially the same ligand binding as the GPCR are selected. In certain embodiments, the method further comprises testing said combinatory variants for a biological function of the GPCR, wherein those having substantially the same biological function as the GPCR are selected.
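The top-N selection and the assembly of combinatory variants can be sketched as below. The helper names (`top_n`, `library_size`, `assemble`) and the toy sequence are hypothetical, and the sketch assumes the TM regions are sorted, non-overlapping and given as 0-based end-exclusive pairs. Laboratory steps such as expression and ligand-binding tests are outside its scope.

```python
from itertools import product
from math import prod


def top_n(scored_variants, n):
    """Keep the n highest-scoring (score, sequence) pairs."""
    return sorted(scored_variants, key=lambda pair: pair[0], reverse=True)[:n]


def library_size(per_region_candidates):
    """Number of combinatory variants: the product of the per-region counts."""
    return prod(len(candidates) for candidates in per_region_candidates)


def assemble(native, tm_regions, per_region_candidates):
    """Yield full-length sequences with each TM region swapped for a candidate."""
    for choice in product(*per_region_candidates):
        pieces, cursor = [], 0
        for (start, end), segment in zip(tm_regions, choice):
            pieces.append(native[cursor:start])  # unchanged loop/terminus
            pieces.append(segment)               # substituted TM segment
            cursor = end
        pieces.append(native[cursor:])
        yield "".join(pieces)


if __name__ == "__main__":
    native = "AALLAAFFAA"  # made-up toy sequence with two "TM regions"
    regions = [(2, 4), (6, 8)]
    candidates = [["QQ", "QL"], ["YY"]]
    print(library_size(candidates), list(assemble(native, regions, candidates)))
```

As a worked size check, eight candidates for each of seven TM regions give 8^7 = 2,097,152 combinations, which shows how quickly the combinatory library grows with N.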

Certain water-soluble polypeptides of the invention are capable of binding the ligand that normally binds the wild-type or native membrane protein (e.g., GPCR). In certain embodiments, the amino acids within the potential ligand-binding sites of the native membrane protein (e.g., GPCR) are not replaced, and/or the sequences of the extracellular and/or intracellular domains of the native membrane protein (e.g., GPCR) are identical.

The (non-ionic) hydrophilic residues, which replace one or more hydrophobic residues in the α-helical domains of the native membrane protein, are selected from the group consisting of glutamine (Q), threonine (T), tyrosine (Y), asparagine (N) and serine (S), and any combination thereof. In a further aspect, the hydrophobic residues that are replaced are selected from leucine (L), isoleucine (I), valine (V) and phenylalanine (F). In certain embodiments, the phenylalanine residues of the α-helical domains of the protein are replaced with tyrosine; the isoleucine and/or valine residues of the α-helical domains of the protein are each independently replaced with threonine (or S or N); and/or the leucine residues of the α-helical domains of the protein are each independently replaced with glutamine (or S or N).

In certain embodiments, substantially all (e.g., 96%, 97%, 98%, 99% or 100%), or 30%, 40%, 50%, 60%, 70%, 75%, 80%, 85%, 90% or 95%, of said leucines are substituted by glutamine. In certain embodiments, substantially all (e.g., 96%, 97%, 98%, 99% or 100%), or 30%, 40%, 50%, 60%, 70%, 75%, 80%, 85%, 90% or 95%, of said isoleucines are substituted by threonine. In certain embodiments, substantially all (e.g., 96%, 97%, 98%, 99% or 100%), or 30%, 40%, 50%, 60%, 70%, 75%, 80%, 85%, 90% or 95%, of said valines are substituted by threonine. In certain embodiments, substantially all (e.g., 96%, 97%, 98%, 99% or 100%), or 30%, 40%, 50%, 60%, 70%, 75%, 80%, 85%, 90% or 95%, of said phenylalanines are substituted by tyrosine. In certain embodiments, one or more (e.g., 1, 2 or 3) of said leucines are not substituted. In certain embodiments, one or more (e.g., 1, 2 or 3) of said isoleucines are not substituted. In certain embodiments, one or more (e.g., 1, 2 or 3) of said valines are not substituted. In certain embodiments, one or more (e.g., 1, 2 or 3) of said phenylalanines are not substituted.
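The percentages above can be checked for any native/variant pair with a small helper. This is only an illustrative sketch: the function name `substitution_fraction` and the toy segments are hypothetical, and the two sequences are assumed to be aligned position by position (same length, no insertions or deletions).

```python
def substitution_fraction(native_tm, variant_tm, residue, replacement):
    """Fraction of `residue` positions in the native that became `replacement`.

    Returns None when the native segment contains no such residue.
    """
    if len(native_tm) != len(variant_tm):
        raise ValueError("native and variant must have the same length")
    total = native_tm.count(residue)
    if total == 0:
        return None
    replaced = sum(
        1
        for n, v in zip(native_tm, variant_tm)
        if n == residue and v == replacement
    )
    return replaced / total


if __name__ == "__main__":
    native, variant = "LLLLIVF", "QQQLTTY"  # made-up toy segments
    for res, rep in [("L", "Q"), ("I", "T"), ("V", "T"), ("F", "Y")]:
        print(res, "->", rep, substitution_fraction(native, variant, res, rep))
```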

In certain embodiments, the library of combinatory variants comprises fewer than about 2 million members. In certain embodiments, the sequence of the GPCR includes information about the TM regions of the GPCR. In certain embodiments, the sequence of the GPCR is obtained from a protein structure database (e.g., PDB, UniProt). In certain embodiments, the TM regions of the GPCR are predicted from the sequence of the GPCR. For example, the TM regions of the GPCR can be predicted using the TMHMM 2.0 (transmembrane prediction using hidden Markov models) software module/package. In certain embodiments, the TMHMM 2.0 software module/package uses a dynamic baseline for peak searching.

In certain embodiments, the method further comprises providing a polynucleotide sequence for each variant of the GPCR. The polynucleotide sequence may be codon-optimized for expression in a host (e.g., a bacterium such as E. coli, a yeast such as budding yeast (Saccharomyces cerevisiae) or fission yeast, an insect cell such as an Sf9 cell, a non-human mammalian cell, or a human cell).
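Providing a polynucleotide sequence for a variant amounts to back-translating its amino acid sequence with a host-specific codon table. The sketch below shows only the mechanics, under loud assumptions: `PLACEHOLDER_CODONS` lists one valid standard-genetic-code codon per amino acid and is not a measured codon-usage table for any host; a real workflow would substitute the chosen host's preferred codons and would typically also screen for unwanted motifs.

```python
# One valid codon per amino acid (standard genetic code).
# PLACEHOLDER ONLY: replace with the preferred codons of the chosen host.
PLACEHOLDER_CODONS = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(protein, codon_table=PLACEHOLDER_CODONS, stop_codon="TAA"):
    """Return a DNA coding sequence for `protein`, ending with a stop codon."""
    try:
        codons = [codon_table[aa] for aa in protein.upper()]
    except KeyError as err:
        raise ValueError(f"no codon for residue {err.args[0]!r}") from None
    return "".join(codons) + stop_codon


if __name__ == "__main__":
    print(back_translate("MQTY"))  # made-up toy peptide
```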

In certain embodiments, the scripted procedure can comprise VBA scripts. In certain embodiments, the scripted procedure is operable on a Linux (registered trademark) system (e.g., Ubuntu 12.04 LTS), a Unix (registered trademark) system, a Microsoft Windows operating system, an Android operating system, or an Apple iOS operating system. Different programming languages, including C++, JavaScript (registered trademark), MATLAB and the like, can be used in implementing the invention. The coded instructions can be stored in a memory device, such as a non-transitory computer-readable medium, that can be used with computer systems known to those skilled in the art.

In certain embodiments, the α-helical domain is one of the seven transmembrane α-helical domains in a native membrane protein that is a G protein-coupled receptor (GPCR). In some embodiments, the GPCR is selected from the group consisting of purinergic receptors (P2Y1, P2Y2, P2Y4, P2Y6), M1 and M3 muscarinic acetylcholine receptors, thrombin receptors (protease-activated receptor (PAR)-1, PAR-2), thromboxane (TXA2), sphingosine 1-phosphate (S1P2, S1P3, S1P4 and S1P5), lysophosphatidic acid (LPA1, LPA2, LPA3), angiotensin II (AT1), serotonin (5-HT2c and 5-HT4), somatostatin (sst5), endothelin (ETA and ETB), cholecystokinin (CCK1), V1a vasopressin receptor, D5 dopamine receptor, fMLP formyl peptide receptor, GAL2 galanin receptor, EP3 prostanoid receptor, A1 adenosine receptor, α1 adrenergic receptor, BB2 bombesin receptor, B2 bradykinin receptor, calcium-sensing receptor, chemokine receptors, KSHV-ORF74 chemokine receptor, NK1 tachykinin receptor, thyroid-stimulating hormone (TSH) receptor, protease-activated receptors, neuropeptide receptors, adenosine A2B receptor, P2Y purinergic receptors, metabotropic glutamate receptors, GRK5, GPCR-30 and CXCR4.

In other embodiments, the native membrane protein or membrane protein is an essential membrane protein. In a further aspect, the native membrane protein is a mammalian protein. The proteins of the invention are preferably human proteins. In certain embodiments, a reference to a specific GPCR protein (e.g., CXCR4) refers to a mammalian GPCR, such as a non-human mammalian GPCR, or to a human GPCR.

In some embodiments, the α-helical domain is one of the seven transmembrane α-helical domains in a G protein-coupled receptor (GPCR) variant that has been modified, for example in an extracellular or intracellular loop, to improve or alter ligand binding, as described elsewhere in the literature. For the purposes of this invention, the words "native" and "wild-type" refer to the protein (or α-helical domain) before it is rendered water-soluble according to the methods described herein.

In certain embodiments, the membrane protein may be a tetraspanin membrane protein, which is characterized by four transmembrane α-helices. Approximately 54 human tetraspanin membrane proteins have been reviewed and annotated. Many are known to mediate cell-to-cell signaling events that play important roles in regulating cell development, activation, proliferation and motility. For example, the CD81 receptor plays an important role as a receptor for hepatitis C virus entry and for malaria parasite (Plasmodium) infection. The CD81 gene is located in a tumor suppressor gene region and may be a candidate for mediating cancer malignancy. CD151 is involved in increased cell motility, invasion and metastasis of cancer cells. Expression of CD63 correlates with the invasiveness of ovarian cancer. A hallmark of tetraspanin membrane proteins is the cysteine-cysteine-glycine motif in the second, or large, extracellular loop.

Another aspect of the invention provides a water-soluble variant of a G protein-coupled receptor (GPCR), wherein: (1) a plurality of hydrophobic amino acids in the transmembrane (TM) domain α-helical segments ("TM regions") of the GPCR are substituted, wherein (a) said hydrophobic amino acids are selected from the group consisting of leucine (L), isoleucine (I), valine (V) and phenylalanine (F), (b) each said leucine (L) is independently substituted by glutamine (Q), asparagine (N) or serine (S), (c) each said isoleucine (I) and each said valine (V) is independently substituted by threonine (T), asparagine (N) or serine (S), and (d) each said phenylalanine is substituted by tyrosine (Y); (2) all seven TM regions of the variant maintain α-helical secondary structure; and (3) there is no predicted transmembrane region.
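Condition (1) lends itself to a simple consistency check: every difference between a native sequence and a candidate variant should be one of the permitted substitutions. The sketch below is an assumption-laden illustration (the name `disallowed_changes` and the toy sequences are hypothetical); it does not test whether changes fall inside TM regions, and conditions (2) and (3) would still require separate secondary structure and transmembrane predictions.

```python
# Permitted substitutions as stated in the text.
ALLOWED = {
    "L": {"Q", "N", "S"},
    "I": {"T", "N", "S"},
    "V": {"T", "N", "S"},
    "F": {"Y"},
}


def disallowed_changes(native, variant):
    """Return (position, native_residue, variant_residue) for every change
    that is not one of the permitted substitutions. Positions are 0-based."""
    if len(native) != len(variant):
        raise ValueError("native and variant must have the same length")
    problems = []
    for i, (n, v) in enumerate(zip(native, variant)):
        if n != v and v not in ALLOWED.get(n, set()):
            problems.append((i, n, v))
    return problems


if __name__ == "__main__":
    print(disallowed_changes("LIVFA", "QTSYA"))  # all changes permitted
    print(disallowed_changes("LIVFA", "QTSTG"))  # F->T and A->G are flagged
```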

In certain embodiments, the water-soluble variant comprises one or more amino acid sequences selected from the group consisting of SEQ ID NOs: 4-11, 13-20, 22-29, 31-38, 40-47, 49-56 and 58-64. It may further comprise one or more amino acid sequences selected from the group consisting of SEQ ID NOs: 3, 12, 21, 30, 39, 48 and 57. In certain embodiments, the water-soluble variant binds a CXCR4 ligand.

In certain embodiments, the water-soluble variant comprises one or more amino acid sequences selected from the group consisting of SEQ ID NOs: 69-76, 78-85, 87, 89-96, 98-105, 107-114 and 116-123. It may further comprise one or more amino acid sequences selected from the group consisting of SEQ ID NOs: 68, 77, 86, 88, 97, 106, 115 and 124. In certain embodiments, the water-soluble variant binds a CX3CR1 ligand.

In certain embodiments, the water-soluble variant comprises one or more amino acid sequences selected from the group consisting of SEQ ID NOs: 128-135, 137-144, 146-153, 155-162, 164-171, 173 and 175-182. It may further comprise one or more amino acid sequences selected from the group consisting of SEQ ID NOs: 127, 136, 145, 154, 163, 172, 174 and 183. In certain embodiments, the water-soluble variant binds a CCR3 ligand.

In certain embodiments, the water-soluble variant comprises one or more amino acid sequences selected from the group consisting of SEQ ID NOs: 187-194, 196-203, 205-206, 208, 210-217, 219-225 and 227-234. It may further comprise one or more amino acid sequences selected from the group consisting of SEQ ID NOs: 186, 195, 204, 207, 209, 218, 226 and 235. In certain embodiments, the water-soluble variant binds a CCR5 ligand.

In certain embodiments, the water-soluble variant comprises one or more amino acid sequences selected from the group consisting of SEQ ID NOs: 236-243, 245-252, 254-261, 263-270, 272, 274-281 and 283-290. It may further comprise one or more amino acid sequences selected from the group consisting of SEQ ID NOs: 235, 244, 253, 262, 271, 273, 282 and 291. In certain embodiments, the water-soluble variant binds a CXCR3 ligand.

In certain embodiments, the water-soluble variant comprises one or more of the transmembrane domains set forth in any one of SEQ ID NOs: 2, 67, 126, 185, 327, 293, 295, 297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323 or 325. In certain embodiments, the variant is water-soluble and binds the ligand of the homologous native transmembrane protein.

Another aspect of the invention provides a method of producing a protein in a bacterium (e.g., E. coli), comprising: (a) culturing the bacterium in a growth medium under conditions suitable for protein production; (b) fractionating a lysate of the bacterium to produce a soluble fraction and an insoluble pellet fraction; and (c) isolating the protein from the soluble fraction, wherein (1) the protein is a variant of a G protein-coupled receptor (GPCR) according to any one of claims 29 to 46, and (2) the yield of the protein is at least 20 mg/L (e.g., 30 mg/L, 40 mg/L, 50 mg/L or more) of growth medium.

In certain embodiments, the bacterium is E. coli BL21 and the growth medium is LB medium. In certain embodiments, the protein is encoded by a plasmid in the bacterium. In certain embodiments, expression of the protein is under the control of an inducible promoter, such as a promoter inducible by IPTG. In certain embodiments, the lysate is produced by sonication. In certain embodiments, the lysate is centrifuged at 14,500 × g or higher to produce the soluble fraction.

Another aspect of the invention provides a method of treating a disorder or disease mediated by the activity of a membrane protein in a subject in need thereof, comprising administering to said subject an effective amount of a water-soluble polypeptide described herein.

In certain embodiments, the water-soluble polypeptide retains the ligand-binding activity of the membrane protein. Examples of disorders and diseases that can be treated by administering a water-soluble peptide of the invention include, but are not limited to, cancer (e.g., small cell lung cancer, melanoma, triple-negative breast cancer), Parkinson's disease, cardiovascular disease, hypertension and bronchial asthma.

Another aspect of the invention provides a pharmaceutical composition comprising a therapeutically effective amount of a water-soluble polypeptide of the invention and a pharmaceutically acceptable carrier or diluent.

In yet another aspect, the invention provides a cell transfected with a subject water-soluble peptide comprising a modified α-helical domain. In certain embodiments, the cell is an animal cell (e.g., a human, non-human mammalian, insect, avian, fish, reptile, amphibian or other cell), a yeast cell or a bacterial cell.

The invention also includes computer-implemented methods, executed on a computer system, that comprise one or more of the methods (or steps thereof) described herein. The computer system includes a non-transitory computer-readable medium storing computer-executable instructions that, when executed by the computer system, cause the computer system to perform the methods contemplated herein. Also contemplated is a computer system comprising at least one memory for storing the sequence data and quantitative results described herein, and at least one processor, coupled to the memory, configured to perform the methods described herein. A user interface, such as a graphical user interface (GUI), can be used together with an electronic display device to select processing parameters that control the selection method, including the computational methods described herein.

Another aspect of the invention provides a non-transitory computer-readable medium storing a sequence of instructions for performing any of the methods of the invention.

A further aspect of the invention provides a data processing system comprising a data processor operative to perform amino acid substitutions as in any of the methods of the invention, to rank protein variants with a ranking function, and to select a water-soluble variant of a G protein-coupled receptor.

It should be understood that all embodiments of the invention, including those described under only one aspect of the invention (e.g., a screening method), are to be construed as applicable to all aspects of the invention (e.g., the water-soluble proteins or methods of use), and as combinable with any one or more additional embodiments of the invention, unless explicitly disclaimed or otherwise improper, as should be readily understood by one of ordinary skill in the art.

The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of exemplary embodiments of the invention, as illustrated in the accompanying drawings, in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed on illustrating the principles of the invention.

FIGS. 1A-1D are a general illustration of the QTY code, which systematically replaces the hydrophobic amino acids L, I, V and F with Q, T, T and Y, respectively. (FIG. 1A) The molecular shapes of the amino acids leucine and glutamine are similar; likewise, the molecular shapes of isoleucine and valine are similar to that of threonine, and the molecular shapes of phenylalanine and tyrosine are similar. Leucine, isoleucine, valine and phenylalanine are hydrophobic and cannot bind water molecules. In contrast, glutamine can bind four water molecules (two as hydrogen donors and two as hydrogen acceptors), and the -OH groups on threonine and tyrosine can bind three water molecules (one as a hydrogen donor and two as acceptors). (FIG. 1B) Side view of an α-helix. After the QTY code of systematic amino acid changes is applied, the α-helix becomes water soluble. (FIG. 1C) Top view of an α-helix before and after QTY code substitution. The helix on the left is a native membrane helix with predominantly hydrophobic amino acids, and the helix on the right is the same helix after the QTY code substitution has been applied. The helix now has mostly hydrophilic amino acids. (FIG. 1D) Before the QTY code is applied, the GPCR membrane proteins are surrounded by hydrophobic lipid molecules so that they are embedded in the lipid membrane (left part of FIG. 1D). After the QTY code is applied, the GPCR membrane protein becomes water soluble and no longer requires a surrounding detergent for stabilization (right part of FIG. 1D).

TMHMM prediction of the transmembrane domain regions of CXCR4. The prediction shows seven distinct hydrophobic transmembrane segments. In contrast, in the TMHMM prediction for a CXCR4 variant subjected to the QTY substitution method of the invention (CXCR4-QTY), the seven distinct hydrophobic transmembrane segments are no longer visible.

Predicted α-helical wheel structure of the fully QTY-modified TM1 domain of CXCR4.

Examples of candidate variants in each of the seven TM regions of the GPCR CXCR4.

Sequence alignment of the wild-type proteins and the respective QTY variants of CXCR4, CXCR3, CCR3 and CCR5. The QTY code is applied only to the seven hydrophobic transmembrane segments, not to the extracellular and intracellular segments.

Sequence alignment of the wild-type proteins and the respective QTY variants of CXCR4, CXCR3, CCR3 and CCR5. The QTY code is applied only to the seven hydrophobic transmembrane segments, not to the extracellular and intracellular segments.

Sequence alignment of the wild-type proteins and the respective QTY variants of CXCR4, CXCR3, CCR3 and CCR5. The QTY code is applied only to the seven hydrophobic transmembrane segments, not to the extracellular and intracellular segments.

Sequence alignment of the wild-type proteins and the respective QTY variants of CXCR4, CXCR3, CCR3 and CCR5. The QTY code is applied only to the seven hydrophobic transmembrane segments, not to the extracellular and intracellular segments.

Flowchart of an exemplary embodiment of the method.

Another flowchart of an exemplary embodiment of the method.

Illustration of a computer system of the invention.

Schematic flowchart describing the processing steps of certain preferred embodiments of the invention.

Schematic flowchart describing the processing steps of certain preferred embodiments of the invention.

A description of preferred embodiments of the invention follows. The words "a" and "an" are meant to encompass one or more, unless otherwise specified.

In some aspects, the invention relates to the use of the QTY (glutamine, threonine and tyrosine) substitution method (or "principle"), also called the "QTY code", to change the hydrophobic residues leucine (L), isoleucine (I), valine (V) and phenylalanine (F) in the seven transmembrane α-helices of a native protein into the hydrophilic residues glutamine (Q), threonine (T) and tyrosine (Y). In certain embodiments, as described above, asparagine (N) and serine (S) can also be used as substitute residues for L, I and/or V, but not for F. The invention can convert a water-insoluble native membrane protein into a more water-soluble counterpart that still retains some or substantially all of the functions of the native protein.

The invention includes methods of designing water-soluble peptides. The method is described first for GPCR proteins, using human CCR3, CCR5, CXCR4 and CX3CR1 as specific examples. However, the general principles of the invention also apply to other proteins having transmembrane (α-helical) regions.

A GPCR typically has seven transmembrane α-helices (7 TMs) and eight loops (8 NTMs) connected by the seven TM regions. The transmembrane segments may be referred to as TM1, TM2, TM3, TM4, TM5, TM6 and TM7. The eight non-transmembrane loops are divided into four extracellular loops (EL1, EL2, EL3 and EL4) and four intracellular loops (IL1, IL2, IL3 and IL4), for a total of eight loops (including the N-terminal and C-terminal loops, each of which is connected to only one TM region and has a free end). Thus, a 7TM GPCR protein can be divided into 15 fragments based on its transmembrane and non-transmembrane features.
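The division into 15 fragments can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `split_fragments`, the 0-based, end-exclusive `(start, end)` coordinate format and the toy sequence are all hypothetical and do not come from the specification; real TM boundaries would come from a structure or an annotation database.

```python
def split_fragments(sequence, tm_regions):
    """Split a sequence into alternating non-TM (NTM) and TM fragments.

    tm_regions is a list of (start, end) pairs, 0-based and end-exclusive,
    sorted from the N-terminus to the C-terminus (assumed input format).
    """
    fragments = []
    cursor = 0
    for number, (start, end) in enumerate(tm_regions, start=1):
        fragments.append(("NTM", sequence[cursor:start]))      # loop before this TM region
        fragments.append((f"TM{number}", sequence[start:end]))  # the TM region itself
        cursor = end
    fragments.append(("NTM", sequence[cursor:]))  # C-terminal loop
    return fragments


# Toy sequence (not a real GPCR): seven 3-residue "TM regions" separated by 2-residue loops.
toy_sequence = "GSLIV" * 7 + "GS"
toy_regions = [(2 + 5 * k, 5 + 5 * k) for k in range(7)]
fragments = split_fragments(toy_sequence, toy_regions)
```

With seven TM regions the function returns 7 TM fragments plus 8 NTM fragments, matching the count of 15 given above.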

One aspect of the invention provides a method of causing a computer program to execute a scripted procedure to select or prepare a water-soluble variant of a membrane protein (e.g., a G protein-coupled receptor (GPCR)), the method comprising:
(1) entering a sequence of the membrane protein (e.g., GPCR) for analysis;
(2) obtaining a variant of the membrane protein (e.g., GPCR) in which a plurality of hydrophobic amino acids in the transmembrane (TM) domain α-helical segments ("TM regions") of the membrane protein (e.g., GPCR) are substituted, wherein:
(a) the hydrophobic amino acids are selected from the group consisting of leucine (L), isoleucine (I), valine (V) and phenylalanine (F);
(b) each leucine (L) is independently substituted with glutamine (Q), asparagine (N) or serine (S);
(c) each isoleucine (I) and each valine (V) is independently substituted with threonine (T), asparagine (N) or serine (S); and
(d) each phenylalanine (F) is substituted with tyrosine (Y);
and subsequently
(3) obtaining an α-helical secondary structure result for the variant to verify that the α-helical secondary structures are maintained in the variant; and
(4) obtaining a transmembrane region result for the variant to verify the water solubility of the variant,
thereby selecting the water-soluble variant of the membrane protein (e.g., GPCR).
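Step (2) of the method above can be illustrated with a minimal Python sketch. It assumes the default QTY mapping (L to Q, I and V to T, F to Y) and ignores the alternative N and S substitutions; the function name `apply_qty`, the coordinate format and the toy sequence are hypothetical. Steps (3) and (4) depend on external secondary-structure and TM-region prediction tools and are not shown.

```python
# Default QTY code; the method also allows N or S in place of Q/T for L, I and V.
QTY_CODE = {"L": "Q", "I": "T", "V": "T", "F": "Y"}


def apply_qty(sequence, tm_regions, code=QTY_CODE):
    """Substitute L, I, V and F inside the given TM regions only.

    tm_regions is a list of (start, end) pairs, 0-based and end-exclusive
    (an assumed input format). Residues outside TM regions are untouched.
    """
    residues = list(sequence)
    for start, end in tm_regions:
        for index in range(start, end):
            residues[index] = code.get(residues[index], residues[index])
    return "".join(residues)


# Toy sequence (not a real GPCR) with one "TM region" covering residues 2..8.
toy_sequence = "MKLAVLIFGAWLLKE"
variant = apply_qty(toy_sequence, [(2, 9)])
```

The two leucines at positions 11 and 12 lie outside the assumed TM region and are therefore left unchanged, which mirrors the statement that the QTY code is applied only to transmembrane segments.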

As used herein, "water-soluble variant of a (trans)membrane protein" and "water-soluble (trans)membrane variant" are used interchangeably.

The exact order in which the steps of the invention are performed may vary. For example, in certain embodiments, step (3) is performed before step (4). In certain embodiments, step (3) is performed simultaneously with step (4). In certain embodiments, step (3) is performed after step (4).

In certain embodiments, the plurality of hydrophobic amino acids is randomly selected from all candidate hydrophobic amino acids L, I, V and F located in all TM regions of the protein. In certain embodiments, the plurality of hydrophobic amino acids is about 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% of all candidate hydrophobic amino acids L, I, V and F located in all TM regions of the protein. In certain embodiments, the plurality of hydrophobic amino acids is at least about 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45% or 50% of all such candidates. In certain embodiments, the plurality of hydrophobic amino acids is no more than about 95%, 90%, 85%, 80%, 75%, 70%, 65%, 60% or 50% of all such candidates. In certain embodiments, the randomly selected hydrophobic amino acids L, I, V and F may be distributed roughly evenly across all TM regions, or may be distributed preferentially or exclusively in 1, 2, 3, 4, 5 or 6 TM regions.
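Random selection of a given percentage of candidate positions could look like the following Python sketch. The function name `random_candidate_subset`, the rounding rule and the fixed seed are assumptions made for illustration; the specification does not say how the random draw is performed.

```python
import random


def random_candidate_subset(sequence, tm_regions, fraction, seed=0):
    """Pick a random fraction of the L/I/V/F positions found inside TM regions.

    tm_regions uses 0-based, end-exclusive (start, end) pairs (assumed format).
    Returns the chosen positions in ascending order.
    """
    candidates = [
        index
        for start, end in tm_regions
        for index in range(start, end)
        if sequence[index] in "LIVF"
    ]
    count = round(fraction * len(candidates))  # assumed rounding rule
    rng = random.Random(seed)                  # seeded for reproducibility
    return sorted(rng.sample(candidates, count))


toy_sequence = "MKLAVLIFGAWLLKE"  # toy sequence, not a real GPCR
picked = random_candidate_subset(toy_sequence, [(2, 9)], fraction=0.4)
everything = random_candidate_subset(toy_sequence, [(2, 9)], fraction=1.0)
```

Restricting `tm_regions` to a subset of the TM regions would correspond to distributing the substitutions preferentially or exclusively in those regions.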

In certain embodiments, all candidate hydrophobic amino acids L, I, V and F in all TM regions of the protein are substituted. For example, every L is independently substituted with Q (or S or N), and/or every I and V is independently substituted with T (or S or N), and/or every F is substituted with Y. In certain embodiments, every L is substituted with Q, every I and V is substituted with T, and every F is substituted with Y.

In certain embodiments, instead of randomly substituting selected hydrophobic amino acids L, I, V and F in all TM regions, all substitutions can initially be limited to any one of the TM regions (e.g., the most N-terminal or most C-terminal TM region), and only the desired substitution variants are selected as members of a library of candidate variants. All members of the library differ in their substitutions within the selected TM region: in the positions substituted (e.g., the 3rd residue versus the 10th residue of the TM region), in the identity of the substitute residue (e.g., S versus T for an I or V substitution), or both. The desired substitution variants are selected based on predetermined criteria, such as a scoring system that takes into account the α-helical secondary structure prediction result and/or the transmembrane region prediction result.

The method can be repeated for 1, 2, 3, 4, 5 or 6 additional TM regions of the protein, or for all remaining TM regions of the protein, with each repetition creating a library of candidate variants that can be stored in electronic memory or in a database. Within the same library, all variants differ in their substitutions within the selected TM region (see above) but are otherwise identical in the remaining TM regions and in the non-TM regions.

Domain swapping or domain shuffling using sequences from two or more such libraries produces combination variants having substitutions of the hydrophobic amino acids L, I, V and F in two or more TM regions. Depending on the number of members in each library, the total number of possible combination variants can approach millions with only a few members in each library. For example, for a GPCR having seven TM regions, if each of the seven libraries has eight members, the total number of combination variants based on the libraries is 8^7, or about 2.1 million. In certain embodiments, the combination variant library comprises fewer than about 5 million, 4 million, 3 million, 2 million, 1 million or 500,000 members.
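The arithmetic in the example above, and a way to walk through the combinations without holding them all in memory, can be sketched as follows. The placeholder labels such as `TM1-v1` are hypothetical stand-ins for real TM-region sequences.

```python
from itertools import islice, product

members_per_library = 8
tm_region_count = 7

# One independent choice per TM region, so the counts multiply: 8 ** 7.
total_combinations = members_per_library ** tm_region_count

# Placeholder libraries: one list of candidate labels per TM region.
libraries = [
    [f"TM{region}-v{member}" for member in range(1, members_per_library + 1)]
    for region in range(1, tm_region_count + 1)
]

# product() yields combinations lazily; take only the first three here.
first_three = list(islice(product(*libraries), 3))
```

Because `product` is lazy, a ranking or filtering step can consume the roughly 2.1 million combinations one at a time.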

Thus, in certain embodiments, in step (2), one subset of the plurality of hydrophobic amino acids within one and the same TM region of the protein (e.g., GPCR) is substituted to create one member of a library of candidate variants, and one or more different subsets of the plurality of hydrophobic amino acids are substituted to create additional members of the library.
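A single-region candidate library built from different subsets of substitutable positions might be enumerated as in the sketch below. It is an assumption-laden illustration: it uses only the default QTY mapping (no N or S alternatives), enumerates every non-empty subset, and the name `region_library` is hypothetical.

```python
from itertools import combinations

# Default QTY code; N/S alternatives permitted by the method are omitted here.
QTY_CODE = {"L": "Q", "I": "T", "V": "T", "F": "Y"}


def region_library(tm_segment, code=QTY_CODE):
    """Return one variant of the TM segment per non-empty subset of L/I/V/F positions."""
    positions = [i for i, residue in enumerate(tm_segment) if residue in code]
    library = []
    for size in range(1, len(positions) + 1):
        for subset in combinations(positions, size):
            residues = list(tm_segment)
            for index in subset:
                residues[index] = code[residues[index]]
            library.append("".join(residues))
    return library


toy_library = region_library("LAV")  # toy 3-residue segment with two candidates
```

A segment with n substitutable positions yields 2^n - 1 members under these assumptions, so real TM regions would normally need the scoring and top-N selection described in the text to keep the library small.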

In certain embodiments, the method further comprises ranking all members of the library based on a combined score, wherein the combined score is a weighted combination of the α-helical secondary structure prediction result and the transmembrane region prediction result.

As those skilled in the art will appreciate, domains having different sequences are likely to have different predicted water solubilities and different propensities for α-helix formation. A "score" is assigned to a particular predicted water solubility or range of water solubility, and to a particular propensity or range of propensities to form an α-helical structure. The score may be qualitative (0, 1), where 0 can represent, for example, a domain with an unacceptable predicted water solubility and 1 can represent, for example, a domain with an acceptable predicted water solubility. The score can be based, for example, on a threshold. Alternatively, the score can be rated on a scale of 1 to 10 characterized, for example, by increasing degrees of water solubility. Alternatively, the score may be quantitative, such as the predicted solubility expressed in mg/mL. Once a score has been assigned to each domain, the domain variants can readily be compared (or ranked) by one or, preferably, both of these scores to select domain variants that are both water soluble and form α-helices. Accordingly, preferred embodiments can utilize a ranking function that can be used to compute ranking data. Note also that water-soluble proteins prepared with the presently described system can be analyzed and characterized, and the substitution combinations found not to be effective in achieving a given biological function can be used to constrain the computational model, thereby providing input to the system that enables more efficient information processing.

For example, using the methods of the invention, one or more variants can be designed and produced in vitro and/or in vivo, and one or more biological functions of the variants can be determined by any of many art-recognized methods. For a GPCR, for example, ligand binding and/or downstream signaling by the variant can be compared with that of the wild-type GPCR, and the QTY substitution pattern used to produce a particular variant can be associated with an increase, maintenance or decrease in biological activity. Such structure/function information obtained from one or more variants can be used for machine learning, or to impose further constraints on the computational model of the invention, so that variants produced by the methods of the invention are ranked more efficiently. Thus, a new candidate variant whose substitution pattern more closely matches that of a known successful variant can be ranked higher than another candidate whose substitution pattern matches the known successful pattern less closely, or more closely matches that of a known unsuccessful variant.
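One simple way to express "more closely matches" is a position-by-position (Hamming) comparison of equal-length TM segments. The specification does not name a similarity measure or a learning algorithm, so the sketch below is a swapped-in illustration only; `hamming` and `pattern_preference` are hypothetical names and the sequences are toy data.

```python
def hamming(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def pattern_preference(candidate, successful, unsuccessful):
    """Higher values mean closer to a known success than to a known failure."""
    nearest_success = min(hamming(candidate, s) for s in successful)
    nearest_failure = min(hamming(candidate, u) for u in unsuccessful)
    return nearest_failure - nearest_success


known_successes = ["QATL"]  # toy substitution patterns, not experimental data
known_failures = ["LAVL"]
score_a = pattern_preference("QATQ", known_successes, known_failures)
score_b = pattern_preference("LAVQ", known_successes, known_failures)
```

Here the first candidate would be ranked above the second, since it differs from the known successful pattern at one position and from the unsuccessful one at three.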

When run as a stand-alone software module/package (e.g., for Linux® systems), the TMHMM program generates a score from 0 to 1 that can be used to predict the propensity to form transmembrane regions/proteins. This score can be used as a quantitative prediction of water solubility in the methods of the invention.

Thus, in certain embodiments, the α-helical secondary structure component of the ranking function may be a quantitative score, such as 0.5 or 1 when the variant does not have the predicted α-helical secondary structure and 0 when the predicted α-helical secondary structure is maintained. In certain embodiments, the transmembrane region result can be obtained with a TM region prediction program, such as TMHMM 2.0, that provides a value from 0 to 1, where 0 indicates no predicted TM region and 1 indicates the strongest propensity to form one or more TM regions. The two scores can then be combined, directly or with weights, so that the combined score represents an overall assessment of both the maintained secondary structure and the predicted water solubility (as measured by the propensity to form TM regions). For example, a combined score of 0 indicates that the variant has no predicted TM region but maintains the predicted α-helical secondary structure, and is therefore a desired variant. Conversely, a variant with a strong propensity to form TM regions (e.g., due to the presence of many hydrophobic residues) tends to have a larger combined score and is therefore undesirable under this scoring scheme.
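The scoring scheme just described can be written as a small weighted sum. This is a sketch under stated assumptions: the linear form, the name `combined_score` and the default weight of 0.5 are illustrative choices, and the two input scores are assumed to have been produced beforehand by external prediction tools.

```python
def combined_score(helix_score, tm_score, helix_weight=0.5):
    """Weighted combination of the two prediction results; lower is better.

    helix_score: 0 if the predicted alpha-helical structure is maintained,
                 0.5 or 1 if it is not.
    tm_score:    0 (no predicted TM region) to 1 (strongest TM propensity).
    helix_weight: share of the weight given to helix_score; the remainder
                  goes to tm_score.
    """
    if not 0.0 <= helix_weight <= 1.0:
        raise ValueError("helix_weight must be between 0 and 1")
    return helix_weight * helix_score + (1.0 - helix_weight) * tm_score


desired = combined_score(helix_score=0.0, tm_score=0.0)
undesired = combined_score(helix_score=1.0, tm_score=0.8, helix_weight=0.25)
```

A variant that keeps its helices and shows no TM propensity scores 0, the desired outcome under this scheme, while the second example scores about 0.85.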

In certain embodiments, the method comprises eliminating variants whose α-helical secondary structure prediction result tends to indicate that the α-helical secondary structure is destroyed or disrupted. In certain embodiments, the method comprises eliminating variants whose transmembrane region prediction result tends to indicate a strong propensity to form TM regions. Thus, the system can include a BEAMing module that can exclude variants from further selection processing.
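The elimination step can be pictured as a plain filter over candidate records. The sketch below is not the BEAMing module itself, whose internals the text does not describe; the dictionary keys, the function name `filter_variants` and the 0.5 cutoff are all assumptions chosen for illustration.

```python
def filter_variants(variants, max_tm_score=0.5):
    """Keep variants that maintain helices and stay at or below the TM cutoff."""
    kept = []
    for variant in variants:
        if not variant["helix_maintained"]:
            continue  # predicted helix destroyed or disrupted
        if variant["tm_score"] > max_tm_score:
            continue  # strong predicted propensity to form TM regions
        kept.append(variant)
    return kept


candidates = [
    {"name": "v1", "helix_maintained": True, "tm_score": 0.05},
    {"name": "v2", "helix_maintained": False, "tm_score": 0.02},
    {"name": "v3", "helix_maintained": True, "tm_score": 0.90},
]
survivors = filter_variants(candidates)
```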

In certain embodiments, the ranking function can be selected to include a weighting scheme that assigns a weight of 5%, 10%, 20%, 25%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 95% to the α-helical secondary structure prediction result and the remainder to the transmembrane region prediction result. Depending on the desired properties, such as biological function, the weighting can be selected manually by the user or automatically by the software.

In certain embodiments, the method further comprises selecting the N members having the highest combined scores to form a first library of candidate variants for said TM region, wherein N is a predetermined integer (e.g., 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more).
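Selecting the best N members is a sort followed by a slice. Whether "best" means the numerically highest or lowest value depends on the scoring convention in use; under the scheme in which 0 denotes the desired variant, the best-ranked members have the lowest values, so the hypothetical `select_top_n` below takes the direction as a parameter.

```python
def select_top_n(scored_variants, n, lower_is_better=True):
    """Return the n best (variant, score) pairs under the chosen convention."""
    ranked = sorted(
        scored_variants,
        key=lambda pair: pair[1],
        reverse=not lower_is_better,
    )
    return ranked[:n]


scored = [("a", 0.3), ("b", 0.0), ("c", 0.9)]  # toy scores
best_low = select_top_n(scored, 2)
best_high = select_top_n(scored, 2, lower_is_better=False)
```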

In certain embodiments, the method further comprises creating one library of candidate variants for each of 1, 2, 3, 4, 5 or 6 of the remaining TM regions, or for all remaining TM regions, of the protein (e.g., GPCR). Each entry in the library can include fields used to define attributes of the entry, including ranking data generated by one or more ranking functions.
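A library entry with attribute fields might be modeled as a small record type. The field names below are hypothetical; the text only states that entries carry fields, including ranking data, without listing them.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LibraryEntry:
    """One candidate variant of a single TM region (illustrative fields only)."""

    tm_region: int          # 1-based index of the TM region
    segment: str            # substituted TM-region sequence
    helix_maintained: bool  # result of the secondary-structure check
    tm_score: float         # 0..1 propensity to form a TM region
    combined_score: float   # ranking data from the ranking function


entry = LibraryEntry(
    tm_region=1,
    segment="QATQTYG",  # toy segment, not a real TM region
    helix_maintained=True,
    tm_score=0.1,
    combined_score=0.05,
)
record = asdict(entry)  # plain dict, convenient for storing in a database
```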

In certain embodiments, the method further comprises replacing two or more (e.g., all) TM regions of the protein (e.g., GPCR) with the corresponding TM regions from the libraries of candidate variants to create a library of combination variants. As used herein, a "corresponding TM region" refers to a TM region in a library of candidate variants that is the same as, or homologous to, the TM region of the protein (e.g., GPCR) being replaced. For example, if the second and third TM regions from the N-terminus of a GPCR are to be replaced, a TM region sequence from the library having substitutions only in the second TM region and a TM region sequence from the library having substitutions only in the third TM region are imported/pasted/transferred into the second and third TM regions of the GPCR to produce a combination variant.
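The import/paste/transfer step can be sketched as slice replacement on the wild-type sequence. The function name `assemble_variant`, the coordinate format and the toy three-region sequence are assumptions; the sketch also assumes replacement segments have the same length as the regions they replace, as is the case for pure substitutions.

```python
def assemble_variant(wild_type, tm_regions, replacements):
    """Paste substituted TM segments into the wild-type sequence.

    tm_regions:   (start, end) pairs, 0-based and end-exclusive (assumed format).
    replacements: maps a 1-based TM region number to its replacement segment.
    """
    residues = list(wild_type)
    for tm_number, segment in replacements.items():
        start, end = tm_regions[tm_number - 1]
        if len(segment) != end - start:
            raise ValueError(f"TM{tm_number} replacement has the wrong length")
        residues[start:end] = segment
    return "".join(residues)


# Toy sequence (not a real GPCR) with three "TM regions" of three residues each.
wild_type = "GSLIVGSLIVGSLIVGS"
regions = [(2, 5), (7, 10), (12, 15)]
combo = assemble_variant(wild_type, regions, {2: "QTT", 3: "LTV"})
```

Only the second and third regions change; the first TM region and all loops are carried over from the wild-type sequence.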

In certain embodiments, substantially all (e.g., 96%, 97%, 98%, 99% or 100%), or 30%, 40%, 50%, 60%, 70%, 75%, 80%, 85%, 90% or 95%, of said leucines are substituted with glutamine. In certain embodiments, substantially all (e.g., 96%, 97%, 98%, 99% or 100%), or 30%, 40%, 50%, 60%, 70%, 75%, 80%, 85%, 90% or 95%, of said isoleucines are substituted with threonine. In certain embodiments, substantially all (e.g., 96%, 97%, 98%, 99% or 100%), or 30%, 40%, 50%, 60%, 70%, 75%, 80%, 85%, 90% or 95%, of said valines are substituted with threonine. In certain embodiments, substantially all (e.g., 96%, 97%, 98%, 99% or 100%), or 30%, 40%, 50%, 60%, 70%, 75%, 80%, 85%, 90% or 95%, of said phenylalanines are substituted with tyrosine. In certain embodiments, one or more (e.g., 1, 2 or 3) of said leucines are not substituted. In certain embodiments, one or more (e.g., 1, 2 or 3) of said isoleucines are not substituted. In certain embodiments, one or more (e.g., 1, 2 or 3) of said valines are not substituted. In certain embodiments, one or more (e.g., 1, 2 or 3) of said phenylalanines are not substituted.

In certain embodiments, the method further comprises producing/expressing the combination variants. In certain embodiments, the method further comprises testing the combination variants for ligand binding (e.g., in vitro or in a biological system such as a yeast two-hybrid system), wherein those having substantially the same ligand binding as the GPCR are selected. In certain embodiments, the method further comprises testing the combination variants for a biological function of the GPCR, wherein those having substantially the same biological function as the GPCR are selected.

In certain embodiments, the sequence of the TM protein (e.g., a GPCR) includes information about the TM regions of the protein, such as the location of one or more transmembrane regions of the TM protein, e.g., the locations of all TM regions. Such a sequence may belong to a protein having a solved crystal structure with defined TM regions. Such a sequence may also belong to a protein with TM region information annotated on the basis of prior research, and such information is readily available from public or proprietary databases such as PDB, UniProt, GenBank, EMBL and DBJ.

The Protein Data Bank (PDB) is a repository, updated weekly, for the three-dimensional structural data of large biological molecules such as proteins and nucleic acids. The data, typically obtained by X-ray crystallography or NMR spectroscopy and submitted by biologists and biochemists from around the world, are freely accessible on the Internet via the websites of its member organizations (PDBe, PDBj and RCSB). The PDB is overseen by the Worldwide Protein Data Bank (wwPDB). The PDB is a key resource in areas of structural biology such as structural genomics, and most major scientific journals and some funding agencies now require scientists to submit their structural data to the PDB.

If the contents of the PDB are regarded as primary data, there are hundreds of derived (i.e., secondary) databases that categorize the data differently. For example, both SCOP and CATH categorize structures according to type of structure and assumed evolutionary relationships, GO categorizes structures based on genes, and crystallographic databases store information about the 3D structures of proteins. All such publicly available databases may be used to obtain input sequence information, including information about the presence and location of transmembrane regions.

Another publicly available database that can provide the sequence information used in the methods of the invention is UniProt. UniProt is a comprehensive, high-quality and freely accessible database of protein sequence and functional information, with many entries derived from genome sequencing projects. It contains a large amount of information about the biological function of proteins derived from the research literature. UniProt provides four core databases: UniProtKB (with its Swiss-Prot and TrEMBL sections), UniParc, UniRef and UniMES. Of these, UniProtKB/Swiss-Prot is a manually annotated, non-redundant protein sequence database that combines information extracted from the scientific literature with computational analysis evaluated by biocurators. The aim of UniProtKB/Swiss-Prot is to provide all known relevant information about a particular protein. Annotation is regularly reviewed to keep up with current scientific findings. The manual annotation of an entry involves detailed analysis of the protein sequence and of the scientific literature. Sequences from the same gene and the same species are merged into the same database entry. Differences between sequences are identified and their causes documented (e.g., alternative splicing, natural variation). Computer predictions are manually evaluated, and relevant results are selected for inclusion in the entry. These predictions include post-translational modifications, transmembrane domains and topology, signal peptides, domain identification and protein family classification, all of which may be used to obtain useful sequence information relating to the TM regions used in the methods of the invention.

In certain embodiments, the sequence of the TM protein (e.g., a GPCR) does not include information about the location of one or more (e.g., any) of its transmembrane regions. The one or more TM regions can nevertheless be predicted based on sequence homology with a related protein having known TM regions. For example, the related protein may be a homologous protein in a different species.

In certain embodiments, the sequence of the TM protein (e.g., a GPCR) does not include information about the location of one or more (e.g., any) of its transmembrane regions, and such information is not readily available from known information. In such embodiments, the TM regions are calculated using art-recognized methods, such as the TMHMM 2.0 (TransMembrane prediction using Hidden Markov Models) program developed by the Center for Biological Sequence Analysis. See further details below.

In certain embodiments, the method further comprises providing a polynucleotide sequence for each variant of the protein (e.g., a GPCR). Such polynucleotide sequences can be readily generated from the protein sequence of the variant and the known genetic code. In certain embodiments, the polynucleotide sequence is codon-optimized for expression in a host. The host may be a bacterium such as E. coli, a yeast such as budding yeast or fission yeast, an insect cell such as an Sf9 cell, a non-human mammalian cell, or a human cell.
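
A minimal sketch of generating a polynucleotide sequence from a variant protein sequence is shown below. It assumes the simplest possible codon optimization, one fixed codon per amino acid; the particular codons are an illustrative choice for a bacterial host and are not a codon usage table from the specification. Real codon optimization tools weigh many more factors.

```python
# One codon per amino acid (standard genetic code). The specific choices are
# an illustrative assumption, not an authoritative host codon usage table.
PREFERRED_CODONS = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}

def back_translate(protein, codon_table=PREFERRED_CODONS, add_stop=True):
    """Return a DNA coding sequence for `protein` using one codon per residue."""
    try:
        dna = "".join(codon_table[aa] for aa in protein.upper())
    except KeyError as err:
        raise ValueError(f"no codon for residue {err.args[0]!r}") from None
    return dna + ("TAA" if add_stop else "")

cds = back_translate("MQTY")  # toy peptide, not a real variant
```

Swapping in a different `codon_table` is how the same variant would be prepared for another host.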

In certain embodiments, the protein is a GPCR, for example a GPCR selected from the group consisting of: purinergic receptors (P2Y1, P2Y2, P2Y4, P2Y6), M1 and M3 muscarinic acetylcholine receptors, receptors for thrombin (protease-activated receptor (PAR)-1, PAR-2), thromboxane (TXA2), sphingosine 1-phosphate (S1P2, S1P3, S1P4 and S1P5), lysophosphatidic acid (LPA1, LPA2, LPA3), angiotensin II (AT1), serotonin (5-HT2c and 5-HT4), somatostatin (sst5), endothelin (ETA and ETB), cholecystokinin (CCK1), V1a vasopressin receptors, D5 dopamine receptors, fMLP formyl peptide receptors, GAL2 galanin receptors, EP3 prostanoid receptors, A1 adenosine receptors, α1 adrenergic receptors, BB2 bombesin receptors, B2 bradykinin receptors, calcium-sensing receptors, chemokine receptors, KSHV-ORF74 chemokine receptors, NK1 tachykinin receptors, thyroid-stimulating hormone (TSH) receptors, protease-activated receptors, neuropeptide receptors, adenosine A2B receptors, P2Y purinoceptors, metabotropic glutamate receptors, GRK5, GPCR-30 and CXCR4.

In certain embodiments, the scripted procedure of the method comprises VBA scripts.

In certain embodiments, the scripted procedure is operable on a Linux® system (e.g., Ubuntu 12.04 LTS), a Microsoft Windows operating system, or an Apple iOS operating system.

In certain embodiments, the method comprises all or substantially all of the following steps:
(1) identifying a first transmembrane region of a (trans)membrane protein, if necessary by predicting the α-helical structure of the protein (e.g., a GPCR);
(2) modifying a plurality of hydrophobic amino acids according to the QTY Code as defined herein to obtain a first modified transmembrane sequence;
(3) scoring the α-helical propensity of the first modified transmembrane sequence of (2) (e.g., in the context of the modified (trans)membrane protein having the first modified transmembrane sequence) to obtain a structure score;
(4) scoring the predicted water solubility of the first modified transmembrane sequence of (2) (e.g., in the context of the modified (trans)membrane protein having the first modified transmembrane sequence) to obtain a solubility score;
(5) repeating (2) to (4) to obtain a first library of putatively water-soluble first modified transmembrane variants;
(6) comparing the structure score and the solubility score of each putatively water-soluble first modified transmembrane variant in the first library, preferably ranking the variants using the structure scores and solubility scores;
(7) selecting a plurality of the putatively water-soluble first modified transmembrane variants (wherein the plurality is an integer H, preferably less than 10, 9, 8, 7, 6, 5 or 4) to obtain a second library of putatively water-soluble first modified transmembrane variants;
(8) repeating steps (1) to (7) for the second, third, fourth, fifth, sixth, seventh or, preferably, all transmembrane regions of the protein (the total number of transmembrane regions modified by the method being an integer n);
(9) identifying the amino acid sequences of the protein that are not contained in any of the transmembrane regions modified in steps (1) to (8), including any extracellular or intracellular domains of the protein;
(10) producing combination variants of the putatively water-soluble modified transmembrane protein (see above); and
(11) optionally, identifying a nucleic acid sequence for each of the putatively water-soluble modified transmembrane variants.
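
Steps (2) to (7) for a single TM region can be sketched as follows. This is only an outline under stated assumptions: the scoring functions are passed in as parameters because the specification obtains them from prediction programs, and the two `toy_*` scorers below are hypothetical stand-ins, not the scores used by the method. The ranking rule (sum of the two scores) and the `max_unsubstituted` limit are likewise illustrative choices.

```python
from itertools import combinations

QTY_CODE = {"L": "Q", "I": "T", "V": "T", "F": "Y"}

def enumerate_variants(tm_region, max_unsubstituted=1):
    """Steps (2) and (5): QTY variants that leave at most
    `max_unsubstituted` eligible residues unchanged."""
    eligible = [i for i, aa in enumerate(tm_region) if aa in QTY_CODE]
    for k in range(max_unsubstituted + 1):
        for keep in combinations(eligible, k):
            yield "".join(
                QTY_CODE[aa] if (aa in QTY_CODE and i not in keep) else aa
                for i, aa in enumerate(tm_region)
            )

def select_top(tm_region, structure_score, solubility_score, h=8,
               max_unsubstituted=1):
    """Steps (3), (4), (6) and (7): score, rank and keep the best `h` variants."""
    scored = [
        (structure_score(v), solubility_score(v), v)
        for v in set(enumerate_variants(tm_region, max_unsubstituted))
    ]
    # Rank by the summed score; ties are broken by sequence for determinism.
    scored.sort(key=lambda t: (-(t[0] + t[1]), t[2]))
    return scored[:h]

# Hypothetical stand-in scorers, for demonstration only.
def toy_structure_score(seq):
    return sum(aa not in "GP" for aa in seq) / len(seq)

def toy_solubility_score(seq):
    return sum(aa in "QTY" for aa in seq) / len(seq)

top = select_top("LIVF", toy_structure_score, toy_solubility_score, h=3)
best_sequences = [variant for _, _, variant in top]
```

Repeating `select_top` for each of the n TM regions, as in step (8), yields one short list per region for the combination step.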

The nucleic acid sequences identified in the above method can be used to generate nucleic acid sequences for each of the putatively water-soluble modified transmembrane variants and each of the non-transmembrane domains (including extracellular and intracellular domains), which can then be expressed combinatorially to create a library of up to H^n putatively water-soluble transmembrane protein variants. For example, where H is 8 and n is 7, a library of approximately 2 million water-soluble protein variants can be designed.
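
The library size quoted above follows directly from the arithmetic. The short sketch below (function name hypothetical) also covers the case where a different number of variants is retained for each TM region, which the text does not discuss explicitly.

```python
from math import prod

def max_library_size(variants_per_region):
    """Upper bound on combination variants: the product of the number of
    variants retained for each modified TM region."""
    return prod(variants_per_region)

# H = 8 variants for each of n = 7 TM regions, i.e. 8 ** 7.
size = max_library_size([8] * 7)
```

With H = 8 and n = 7 the bound is 8^7 = 2,097,152, consistent with the "approximately 2 million" figure.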

Another aspect of the invention relates to expression of the water-soluble variant proteins (e.g., GPCRs) designed according to the methods of the invention. This aspect of the invention is based in part on the surprising discovery that water-soluble variant proteins (e.g., GPCRs) designed according to the methods of the invention can achieve high levels of expression both in in vitro cell-free expression systems and in commonly used cell-based expression systems such as E. coli. Moreover, the expressed proteins are highly soluble and can be readily purified from the soluble fraction of the expression system, such as the soluble fraction of an E. coli culture lysate, rather than from the insoluble aggregates (i.e., the pellet) in which most membrane proteins are typically found.

Accordingly, one aspect of the invention provides a method of producing a protein in a bacterium (e.g., E. coli), comprising:
(a) culturing the bacterium in a growth medium under conditions suitable for protein production;
(b) fractionating a lysate of the bacterium to produce a soluble fraction and an insoluble pellet fraction; and
(c) isolating the protein from the soluble fraction,
wherein:
(1) the protein is a variant protein of the subject invention (e.g., a G protein-coupled receptor (GPCR)); and
(2) the yield of the protein is at least 20 mg/L (e.g., 30 mg/L, 40 mg/L, 50 mg/L or more) of growth medium.

In certain embodiments, the bacterium is E. coli BL21 and the growth medium is LB medium. In certain embodiments, the protein is encoded by a plasmid in the bacterium. In certain embodiments, expression of the protein is under the control of an inducible promoter. For example, the inducible promoter may be inducible by IPTG. In certain embodiments, the lysate is produced by sonication. In certain embodiments, the lysate is centrifuged at 14,500×g or more to produce the soluble fraction.

With the general aspects of the invention described above, specific features and particular embodiments of the invention are further described below.

Transmembrane Region Prediction
Certain methods of the invention comprise a step of predicting the transmembrane regions of a protein, such as a GPCR. Many programs and software packages for TM region prediction are known in the art, any of which may be used, individually or in combination, in the methods of the invention that require a TM region prediction step. These programs usually have a very simple user interface that typically requires the user to provide an input sequence in a specified format (such as FASTA or plain text) and returns the prediction results as text, graphics or both. Some programs also offer more advanced features, such as allowing the user to specify certain parameters to fine-tune the prediction. All such programs can be used in the methods of the invention.

One exemplary TM region prediction program is TMHMM (provided by the Center for Biological Sequence Analysis, Technical University of Denmark), a method that correctly predicts 97-98% of TM helices. It uses a hidden Markov model to predict transmembrane helices in proteins. The input protein sequence may be in FASTA format, and the output can be presented as an html page with an image of the predicted locations of the TM regions. In a study by Moller et al. entitled "Evaluation of Methods for the Prediction of Membrane Spanning Regions" (Bioinformatics 17(7):646-653, 2001), TMHMM was judged to be the best-performing transmembrane prediction program at the time of the evaluation.

The programs compared in that study include TMHMM 1.0, 2.0 and a retrained version of 2.0 (Sonnhammer et al., Int. Conf. Intell. Syst. Mol. Biol., AAAI Press, Montreal, Canada, pp. 176-182, 1998; Krogh et al., J. Mol. Biol. 305(3):567-80, 2001), MEMSAT 1.5 (Jones et al., Biochemistry 33:3038-3049, 1994), Eisenberg (Eisenberg et al., Nature 299:371-374, 1982), Kyte/Doolittle (Kyte and Doolittle, J. Mol. Biol. 157:105-132, 1982), TMAP (Persson and Argos, J. Protein Chem. 16:453-457, 1997), DAS (Cserzo et al., Protein Eng. 10:673-676, 1997), HMMTOP (Tusnady and Simon, J. Mol. Biol. 283:489-506, 1998), SOSUI (Hirokawa et al., Bioinformatics 14:378-379, 1998), PHD (Rost et al., Int. Conf. Intell. Syst. Mol. Biol., AAAI Press, St. Louis, USA, pp. 192-200, 1996), TMpred (Hofmann and Stoffel, Biol. Chem. Hoppe-Seyler 374:166, 1993), KKD (Klein et al., Biochim. Biophys. Acta 815:468-476, 1985), ALOM2 (Nakai and Kanehisa, Genomics 14:489-911, 1992) and Toppred 2 (Claros and Heijne, Comput. Appl. Biosci. 10:685-686, 1994), all of which can be used to predict TM regions in the methods of the invention. All references cited are incorporated herein by reference.
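
To give a feel for the simplest of the listed approaches, the sketch below computes a Kyte/Doolittle-style sliding-window hydropathy profile. It is an illustrative reimplementation, not any of the cited programs. The 19-residue window and the 1.6 threshold are the values commonly quoted for membrane-spanning segments and are treated here as adjustable assumptions; the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy index (Kyte and Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window; entry i covers seq[i:i + window]."""
    seq = seq.upper()
    if len(seq) < window:
        return []
    scores = [KD[aa] for aa in seq]
    total = sum(scores[:window])
    profile = [total / window]
    for i in range(window, len(seq)):
        total += scores[i] - scores[i - window]  # slide the window by one residue
        profile.append(total / window)
    return profile

def candidate_tm_windows(seq, window=19, threshold=1.6):
    """Start positions (0-based) of windows whose mean hydropathy meets the threshold."""
    return [
        i for i, mean in enumerate(hydropathy_profile(seq, window))
        if mean >= threshold
    ]

hits = candidate_tm_windows("L" * 19 + "K" * 19)  # toy sequence
```

A segment in which the hydrophobic residues have been replaced by Q, T and Y scores well below the threshold in such a profile, which is the kind of change the transmembrane region check of the method is meant to detect.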

The principles of TMHMM are described in Krogh et al., "Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes," Journal of Molecular Biology 305(3):567-580, January 2001 (incorporated by reference), and in Sonnhammer et al., "A hidden Markov model for predicting transmembrane helices in protein sequences," in J. Glasgow, T. Littlejohn, F. Major, R. Lathrop, D. Sankoff, and C. Sensen, editors, Proceedings of the Sixth International Conference on Intelligent Systems for Molecular Biology, pages 175-182, Menlo Park, CA, 1998, AAAI Press (incorporated by reference).

DAS (Dense Alignment Surface; Cserzo et al., "Prediction of transmembrane alpha-helices in prokaryotic membrane proteins: the Dense Alignment Surface method," Prot. Eng. 10(6):673-676, 1997; Stockholm University, Sweden) predicts transmembrane regions using the Dense Alignment Surface method. DAS is based on low-stringency dot plots of the query sequence against a set of library sequences (non-homologous membrane proteins) using a previously derived, special scoring matrix. The method provides a high-precision hydrophobicity profile for the query, from which the locations of potential transmembrane segments can be obtained. The novelty of the DAS-TMfilter algorithm is a second prediction cycle that predicts TM segments in the sequences of the TM library. To use the DAS server, a user enters a protein sequence at www.sbc.su.se/~miklos/DAS/, and the DAS server predicts the TM regions of the input sequence.

HMMTOP (Hungarian Academy of Sciences, Budapest) is an automatic server for predicting transmembrane helices and the topology of proteins using a hidden Markov model, developed by G.E. Tusnady at the Institute of Enzymology. The method used by this prediction server is described in G.E. Tusnady and I. Simon (1998) "Principles Governing Amino Acid Composition of Integral Membrane Proteins: Applications to Topology Prediction," J. Mol. Biol. 283:489-506 (incorporated by reference). The new features of HMMTOP version 2.0 are described in G.E. Tusnady and I. Simon (2001) "The HMMTOP transmembrane topology prediction server," Bioinformatics 17:849-850 (incorporated by reference).

The MEMSAT2 transmembrane prediction page (www.sacs.ucsf.edu/cgi-bin/memsat.py) predicts transmembrane segments in a protein using FASTA or plain-text format as input. The related MEMSAT (1.5) software is copyrighted by Dr. David Jones (Jones et al., Biochemistry 33:3038-3049, 1994). MEMSAT V3 is the latest version of MEMSAT, a widely used all-helical membrane protein prediction method. The method was benchmarked on a test set of transmembrane proteins of known topology. From sequence data, MEMSAT was estimated to be more than 78% accurate at predicting the structure of all-helical transmembrane proteins and the location of their constituent helical elements within the membrane. MEMSAT-SVM is a highly accurate predictor of transmembrane helix topology. It can discriminate signal peptides and identify cytoplasmic and extracellular loops. Both MEMSAT3 and MEMSAT-SVM are part of the PSIPRED Protein Sequence Analysis Workbench, which aggregates several structure prediction methods in one location at University College London.

The Phobius server (phobius.sbc.su.se) predicts transmembrane topology and signal peptides from the amino acid sequence of a protein in FASTA format. Phobius is described in Lukas et al., "A Combined Transmembrane Topology and Signal Peptide Prediction Method," Journal of Molecular Biology 338(5):1027-1036, 2004. PolyPhobius is described in Lukas et al., "An HMM posterior decoder for sequence feature prediction that includes homology information," Bioinformatics 21 (Suppl 1):i251-i257, 2005. The Phobius web server is described in Lukas et al., "Advantages of combined transmembrane topology and signal peptide prediction--the Phobius web server," Nucleic Acids Res. 35:W429-32, 2007 (all cited technical content is incorporated by reference).

SOSUI discriminates between membrane proteins and soluble proteins and predicts transmembrane helices. SOSUI predicts transmembrane regions using Hydrophobicity Analysis for Topology and the Probe Helix Method for Tertiary Structure. The accuracy of the protein classification is reported to be as high as 99%, and the corresponding value for transmembrane helix prediction is reported to be about 97%. The SOSUI system is available over the Internet at www.tuat.ac.jp/mitaku/sosui.

TMpred (European Molecular Biology Network, Swiss node) predicts transmembrane regions and protein orientation in a query sequence. Specifically, the TMpred algorithm is based on a statistical analysis of TMbase, a database of naturally occurring transmembrane proteins. The prediction is made using a combination of several weight matrices for scoring. See Hofmann & Stoffel (1993) "TMbase - A database of membrane spanning proteins segments," Biol. Chem. Hoppe-Seyler 374:166.

The SPLIT 4.0 server (split.pmfst.hr/split/4) is a membrane protein secondary structure prediction server that predicts the transmembrane (TM) secondary structures of membrane proteins from sequences in SWISS-PROT format, using the method of preference functions. See Juretic et al., "Basic charge clusters and predictions of membrane protein topology," J. Chem. Inf. Comput. Sci. 42:620-632, 2002 (incorporated by reference).

PRED-TMR predicts transmembrane domains in proteins using solely the protein sequence itself. The algorithm refines a standard hydrophobicity analysis by detecting potential termini ("edges", i.e., starts and ends) of transmembrane regions. This makes it possible both to discard highly hydrophobic regions that are not delimited by clear start and end configurations and to confirm putative transmembrane segments that are not distinguishable by their hydrophobic composition. The accuracy obtained on a test set of 101 non-homologous transmembrane proteins with reliable topologies compares well with that of other popular existing methods. Only a slight decrease in prediction accuracy was observed when the algorithm was applied to all transmembrane proteins of the SwissProt database (release 35). See Pasquier et al., "A novel method for predicting transmembrane segments in proteins based on a statistical analysis of the SwissProt database: the PRED-TMR algorithm," Protein Eng. 12(5):381-385, 1999 (incorporated by reference).

In the related PRED-TMR2, the application is extended with a pre-processing stage: an artificial neural network that can discriminate transmembrane proteins from soluble or fibrous proteins with high accuracy. Applied to several test sets of transmembrane proteins, the system gives a perfect prediction rating of 100% by classifying all of the sequences in the transmembrane class. Applied to 995 non-transmembrane proteins extracted from the PDBselect database, the neural network incorrectly predicts 23 of them to be transmembrane (97.7% correct assignment). See Pasquier and Hamodrakas, "An hierarchical artificial neural network system for the classification of transmembrane proteins," Protein Eng., 12(8):631-634, 1999 (incorporated by reference).

Protein α-Helix Secondary Structure Prediction
Certain methods of the invention include a step of predicting the α-helical secondary structure of a protein, such as a GPCR. Many such programs and software packages are known in the art, and any of them may be used, individually or in combination, in the methods of the invention that require an α-helix secondary structure prediction step. All such programs can be used in the methods of the invention.

Early methods of secondary structure prediction were restricted to predicting three predominant states: helix, sheet, or random coil. These methods were based on the helix- or sheet-forming propensities of individual amino acids, sometimes coupled with rules for estimating the free energy of forming secondary structure elements. Such methods were typically about 60% accurate in predicting which of the three states (helix/sheet/coil) a residue adopts. The first widely used technique for predicting protein secondary structure from the amino acid sequence was the Chou-Fasman method.

A significant increase in accuracy (to nearly 80%) was achieved by exploiting the information provided by multiple sequence alignments. Knowing the full distribution of amino acids that occur at a position (and in its vicinity, typically about seven residues on either side) throughout evolution gives a much better picture of the structural tendencies near that position. For example, a given protein might have a glycine at a given position, which by itself might suggest a random coil. A multiple sequence alignment, however, might reveal that helix-favoring amino acids occur at that position (and nearby positions) in 95% of homologous proteins throughout evolution. Moreover, by examining the average hydrophobicity at that and nearby positions, the same alignment might also suggest a pattern of residue solvent accessibility consistent with an α-helix. Taken together, these factors would suggest that the glycine of the original protein adopts an α-helical structure rather than a random coil. Accordingly, in the methods of the invention, α-helix secondary structure prediction programs, including those based on neural networks, hidden Markov models, and support vector machines, may combine all of the available data to form a three-state prediction. Such prediction methods also provide a confidence score for their prediction at every position.

Secondary structure prediction methods are continuously benchmarked, for example by EVA. EVA is a continuously running benchmark project for assessing the quality of protein structure prediction and secondary structure prediction methods. Methods for predicting both secondary and tertiary structure, including homology modeling, protein threading, and contact order prediction, are compared against results from newly solved protein structures deposited each week in the Protein Data Bank (PDB). The project aims to determine the prediction accuracy that non-expert users can expect from common, publicly available prediction web servers.

According to these tests, the most accurate methods at present are Psipred; SAM (Karplus, "SAM-T08, HMM-based protein structure prediction," Nucleic Acids Res. (2009) 37 (Web Server issue): W492-497. doi:10.1093/nar/gkp403); PORTER (Pollastri & McLysaght, "Porter: a new, accurate server for protein secondary structure prediction," Bioinformatics 21 (8):1719-1720, 2005); PROF (Yachdav et al. (2014) "PredictProtein--an open resource for online prediction of protein structural and functional features," Nucleic Acids Res. 42 (Web Server issue): W337-343. doi:10.1093/nar/gku366); and SABLE (Adamczak et al. (2005) "Combining prediction of secondary structure and solvent accessibility in proteins," Proteins 59 (3): 467-475. doi:10.1002/prot.20441). The standard method for assigning secondary structure classes (helix/strand/coil) to PDB structures, against which the predictions are benchmarked, is DSSP (Kabsch W and Sander (1983) "Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features," Biopolymers 22 (12): 2577-2637. doi:10.1002/bip.360221211). All are incorporated by reference, and all can be used in the methods of the invention.

The DSSP algorithm is the standard method for assigning secondary structure to the amino acids of a protein, given the atomic-resolution coordinates of the protein. DSSP begins by identifying the intra-backbone hydrogen bonds of the protein using a purely electrostatic definition, assuming partial charges of -0.42e and +0.20e on the carbonyl oxygen and amide hydrogen, respectively, with their opposites assigned to the carbonyl carbon and amide nitrogen. A hydrogen bond is identified if E in the following equation is less than -0.5 kcal/mol:

$$E = 0.084\left\{\frac{1}{r_{ON}} + \frac{1}{r_{CH}} - \frac{1}{r_{OH}} - \frac{1}{r_{CN}}\right\}\cdot 332\ \mathrm{kcal/mol}$$

where each $r_{AB}$ is the distance in ångströms between atoms A and B, and 0.084 is the product of the two partial charges (0.42 × 0.20).
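The hydrogen-bond test described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard Kabsch-Sander electrostatic form with partial charges 0.42e and 0.20e, a dimensional factor of 332, and distances in ångströms; the function names `hbond_energy` and `is_hbond` are hypothetical and are not part of any DSSP distribution.

```python
import math

# Constants of the Kabsch-Sander electrostatic model (assumed here):
Q1, Q2 = 0.42, 0.20      # partial charges, in units of e
F = 332.0                # dimensional factor giving kcal/mol for distances in Å
HBOND_CUTOFF = -0.5      # kcal/mol; an H-bond is assigned when E is below this


def hbond_energy(c, o, n, h):
    """Electrostatic energy (kcal/mol) between a C=O group and an N-H group.

    Each argument is an (x, y, z) coordinate in ångströms.
    """
    r_on = math.dist(o, n)
    r_ch = math.dist(c, h)
    r_oh = math.dist(o, h)
    r_cn = math.dist(c, n)
    return Q1 * Q2 * F * (1 / r_on + 1 / r_ch - 1 / r_oh - 1 / r_cn)


def is_hbond(c, o, n, h):
    """True if the C=O ... H-N pair satisfies the energy criterion."""
    return hbond_energy(c, o, n, h) < HBOND_CUTOFF


# Idealized linear geometry: C=O bond 1.23 Å, O...H 1.9 Å, H-N 1.0 Å.
c, o = (0.0, 0.0, 0.0), (1.23, 0.0, 0.0)
h, n = (3.13, 0.0, 0.0), (4.13, 0.0, 0.0)
print(round(hbond_energy(c, o, n, h), 2), is_hbond(c, o, n, h))
```

In this idealized geometry the energy comes out near -2.9 kcal/mol, well below the cutoff, whereas the same groups placed 10 Å further apart give an energy close to zero and are not counted as bonded.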

Based on this, eight types of secondary structure are assigned. The 3₁₀ helix, α-helix, and π-helix have the symbols G, H, and I and are recognized by repetitive sequences of hydrogen bonds in which the residues are three, four, or five residues apart, respectively. Two types of β-sheet structure exist: a β-bridge has the symbol B, while longer sets of hydrogen bonds and β-bulges have the symbol E. T is used for turns featuring hydrogen bonds typical of helices; S is used for regions of high curvature (where the angle between $\overrightarrow{C^{\alpha}_{i-2}C^{\alpha}_{i}}$ and $\overrightarrow{C^{\alpha}_{i}C^{\alpha}_{i+2}}$ is at least 70°); and a blank (i.e., a space) is used if no other rule applies, referring to loops. These eight types are usually grouped into three larger classes: helix (G, H, and I), strand (E and B), and loop (all others).
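The grouping of the eight DSSP symbols into three classes can be sketched as a simple lookup. This is an illustrative snippet only; the output letters H, E, and C for helix, strand, and loop are a common convention assumed here, and `reduce_dssp` is a hypothetical helper name.

```python
# Map each DSSP symbol to its larger class: helix (G, H, I) -> "H",
# strand (E, B) -> "E"; everything else (T, S, blank) is treated as loop "C".
DSSP_TO_THREE_STATE = {"G": "H", "H": "H", "I": "H", "E": "E", "B": "E"}


def reduce_dssp(states: str) -> str:
    """Collapse an eight-state DSSP string into a three-state string."""
    return "".join(DSSP_TO_THREE_STATE.get(symbol, "C") for symbol in states)


print(reduce_dssp("GHIEBTS "))  # HHHEECCC
```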

PSIPRED (PSI-BLAST-based secondary structure prediction) is a technique used to investigate protein structure. Its algorithm uses artificial neural networks, a machine learning method. It is a server-side program, with a website serving as the front-end interface, that predicts the secondary structure of a protein (β-sheets, α-helices, and coils) from its primary sequence. See bioinf.cs.ucl.ac.uk/psipred. The idea of the method is to use information from evolutionarily related proteins to predict the secondary structure of a new amino acid sequence. Specifically, PSI-BLAST is used to find related sequences and to build a position-specific scoring matrix, which is then processed by a neural network constructed and trained to predict the secondary structure of the input sequence. The prediction algorithm is divided into three stages: generation of a sequence profile, prediction of an initial secondary structure, and filtering of the predicted structure. PSIPRED first normalizes the sequence profile generated by PSI-BLAST. A neural network then predicts the initial secondary structure. For each amino acid in the sequence, the network is fed a window of 15 amino acids, with added information indicating whether the window spans the N- or C-terminus of the chain. This gives a final input layer of 315 input units, divided into 15 groups of 21 units. The network has a single hidden layer of 75 units and three output nodes (one for each secondary structure element: helix, sheet, and coil). A second neural network is used to filter the predicted structure from the first network. This network is also fed a window of 15 positions, and the indicator of the possible position of the window at a chain terminus is likewise forwarded. This gives 60 input units, divided into 15 groups of four. The network has a single hidden layer of 60 units and three output nodes (one for each secondary structure element: helix, sheet, and coil). The three final output nodes each deliver a score for one secondary structure element at the central position of the window. PSIPRED generates the protein prediction using the secondary structure with the highest score. The Q3 value is the fraction of residues predicted correctly across the secondary structure states helix, strand, and coil.
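The layer sizes and the Q3 measure described above reduce to simple arithmetic, sketched below. The split of the 21 units per window position into 20 profile columns plus one terminus flag, and of the 4 units into 3 state scores plus one flag, is an assumed reading of the text; `q3` is a hypothetical helper, not part of PSIPRED.

```python
WINDOW = 15  # window positions fed to each network

# First network: per position, 20 amino acid profile values + 1 terminus flag
# (assumed breakdown of the 21 units per group).
FIRST_NET_INPUTS = WINDOW * (20 + 1)

# Second network: per position, 3 state scores + 1 terminus flag
# (assumed breakdown of the 4 units per group).
SECOND_NET_INPUTS = WINDOW * (3 + 1)


def q3(predicted: str, observed: str) -> float:
    """Fraction of residues whose three-state class (H/E/C) is predicted correctly."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)


print(FIRST_NET_INPUTS, SECOND_NET_INPUTS)  # 315 60
print(q3("HHHEC", "HHHCC"))                 # 0.8
```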

Step-by-Step Description of Exemplary Embodiments
With the invention outlined above, certain non-limiting, exemplary embodiments are described below with reference to the representative flowcharts in the figures.

FIG. 9A shows one non-limiting embodiment of the invention. The figure shows, as a whole, method 200 of the invention, in which selected hydrophobic amino acids L, I, V, and F in the TM regions of the protein (e.g., a GPCR) are substituted according to the "QTY code" of the invention, without restricting the substitutions in any particular TM region/domain.

In this specific embodiment, the method starts (202) by obtaining or reading (204) an input protein sequence, which may or may not be that of a transmembrane protein. The protein sequence can then be subjected to TM region prediction (206) (if such information is not already available from the input protein sequence) and to α-helix secondary structure prediction, each by any art-recognized method. For example, TM region prediction can be performed (240) using a program such as TMHMM. If this prediction yields no TM region at 242, one or more different TM region prediction programs, such as SOSUI, may be used (250) to predict the presence or absence of TM regions. If no TM region is predicted by such programs at 252, the protein likely has no TM region (254), and the method ends (260).
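The fallback logic of this branch (try one TM predictor, then another, and stop if none finds a TM region) can be sketched as follows. This is a hedged illustration: the predictors are passed in as plain callables, and the stand-ins below are hypothetical placeholders that do not call the real TMHMM or SOSUI programs.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Region = Tuple[int, int]                   # (start, end), 0-based, end exclusive
Predictor = Callable[[str], List[Region]]  # sequence -> predicted TM regions


def predict_tm_with_fallback(
    sequence: str, predictors: Sequence[Tuple[str, Predictor]]
) -> Tuple[Optional[str], List[Region]]:
    """Run predictors in order; return the first non-empty result and its source."""
    for name, predict in predictors:
        regions = predict(sequence)
        if regions:
            return name, regions
    # No predictor found a TM region: the protein likely has none.
    return None, []


# Hypothetical stand-ins for real prediction programs.
def primary(seq: str) -> List[Region]:
    return []             # pretend the first program finds nothing


def secondary(seq: str) -> List[Region]:
    return [(2, 22)]      # pretend the second program finds one region


print(predict_tm_with_fallback("M" * 30, [("primary", primary), ("secondary", secondary)]))
```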

On the other hand, if one or more TM regions are predicted at 242 by any of the suitable programs, the TM region protein sequence is obtained (244), and the QTY code of the invention can be applied to the hydrophobic amino acids L, I, V, and F within those TM regions. More specifically, according to the QTY code, each leucine in a TM region can independently be substituted (212) with glutamine (Q), serine (S), or asparagine (N), or left unsubstituted; each isoleucine and valine in a TM region can independently be substituted with threonine (T), serine (S), or asparagine (N), or left unsubstituted; and each phenylalanine in a TM region can be substituted with tyrosine (Y) or left unsubstituted. Such QTY substitutions produce one or more putatively water-soluble variants of the original transmembrane protein. The number of substitutions made for each amino acid in a region can be selected as a parameter.
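The substitution rule can be sketched on sequence data as below. This is a minimal illustration that uses the primary mapping L→Q, I→T, V→T, F→Y by default; the `code` argument allows the S or N alternatives, and `keep` marks positions to leave unsubstituted. The function name, the 0-based half-open region convention, and the toy sequence are all assumptions made for the example.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

# Primary QTY mapping; S or N may be supplied instead for L, I, or V.
QTY_CODE: Dict[str, str] = {"L": "Q", "I": "T", "V": "T", "F": "Y"}


def apply_qty(
    sequence: str,
    tm_regions: Iterable[Tuple[int, int]],
    keep: FrozenSet[int] = frozenset(),
    code: Dict[str, str] = QTY_CODE,
) -> str:
    """Substitute L, I, V, F inside the given TM regions only.

    tm_regions: (start, end) pairs, 0-based, end exclusive.
    keep: residue indices that stay hydrophobic.
    """
    residues = list(sequence)
    for start, end in tm_regions:
        for i in range(start, end):
            if i not in keep and residues[i] in code:
                residues[i] = code[residues[i]]
    return "".join(residues)


print(apply_qty("LAALIVFAA", [(3, 7)]))                      # LAAQTTYAA
print(apply_qty("LAALIVFAA", [(3, 7)], keep=frozenset({3})))  # LAALTTYAA
```

Residues outside the TM regions (the leading L in the toy sequence) are left untouched, which reflects the restriction of the substitutions to the TM regions.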

Next, the α-helix secondary structure of each putatively water-soluble variant can be predicted (210) using any art-recognized program, such as PORTER. The result can be compared (208) with that predicted for the original protein, preferably using the same program (e.g., PORTER). The α-helix secondary structure of the original protein can be predicted, using any art-recognized program, before, concurrently with, or after the TM region prediction step for the original protein.

If the α-helix secondary structure prediction indicates at 214 that a candidate water-soluble variant maintains all or most of the α-helical secondary structure of the original protein, this suggests that the particular pattern of QTY substitutions in that variant has no effect, or no significant effect, on the α-helical secondary structure of the original protein. TM region prediction can then be performed (220) and verified (222), and the variant sequence generated (224). Optionally, if the result indicates at 214 that one or more of the α-helical secondary structures of the original protein have been disrupted, the variant can be discarded as undesirable at this step, ending the method.
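One way to make the comparison at 214 concrete is to measure how many residues predicted as helix in the original protein are still predicted as helix in the variant. The sketch below assumes three-state prediction strings (H/E/C) of equal length; the 0.9 threshold for "all or most" is an illustrative assumption, not a value given in the text.

```python
def helix_retention(original_ss: str, variant_ss: str) -> float:
    """Fraction of the original helix positions still predicted as helix ("H")."""
    if len(original_ss) != len(variant_ss):
        raise ValueError("predictions must cover the same residues")
    helix_positions = [i for i, s in enumerate(original_ss) if s == "H"]
    if not helix_positions:
        return 1.0  # nothing to lose
    kept = sum(variant_ss[i] == "H" for i in helix_positions)
    return kept / len(helix_positions)


def maintains_helices(original_ss: str, variant_ss: str, threshold: float = 0.9) -> bool:
    """True if the variant keeps at least `threshold` of the original helix residues."""
    return helix_retention(original_ss, variant_ss) >= threshold


print(helix_retention("CHHHHC", "CHHHCC"))    # 0.75
print(maintains_helices("CHHHHC", "CHHHHC"))  # True
```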

The methods of the invention also require showing that a predicted QTY variant has a lower propensity, or no propensity, to form TM regions compared with the original protein. Accordingly, the putatively water-soluble variant can be subjected to TM region prediction, for example using the same TM region prediction program that was used (if needed) for the initial TM region prediction on the original protein. If the result shows that significant TM regions are still present, the variant may be discarded. If, on the other hand, the result shows that TM regions are absent or that the propensity to form TM regions is low, the variant can be selected as a desired variant that is more water-soluble than the original protein yet maintains the α-helical secondary structure, and therefore likely retains the function of the original protein.

If desired, additional steps can be performed to further characterize the water-soluble variants obtained. Such further characterization may include calculating the pI of the variant (226) and comparing it with the pI of the original protein. The pI should be unchanged or changed only slightly (i.e., by less than 30%, preferably less than 20%, more preferably less than 10%). Further characterization may also include creating a helical wheel model (246) (e.g., as shown in FIG. 3) to show the positions, and any clustering, of QTY substitutions in any particular TM region.
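The pI criterion amounts to a relative-change check, sketched below. The pI values themselves would come from whatever pI calculator is used; the function name and the example values are hypothetical.

```python
def pi_change_ok(pi_original: float, pi_variant: float, max_rel_change: float = 0.10) -> bool:
    """True if the variant's pI differs from the original by less than max_rel_change.

    max_rel_change: 0.30, 0.20, or 0.10 correspond to the three tiers in the text.
    """
    return abs(pi_variant - pi_original) / pi_original < max_rel_change


print(pi_change_ok(9.0, 8.5))                       # True  (about 5.6% change)
print(pi_change_ok(9.0, 6.0, max_rel_change=0.30))  # False (about 33% change)
```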

Another exemplary embodiment of the invention, for designing the transmembrane regions of a protein (e.g., a GPCR) according to the QTY code of the invention, can be carried out on a computer system using the representative method 10 depicted in FIG. 9B; some of its detailed steps are described further below. Many of the steps are optional or can be combined in accordance with the methods of the invention.

1: In step 1, a protein is selected for analysis, and data describing the protein (e.g., its sequence) are uploaded or entered through a computer interface of the computer system, which receives the protein sequence (12). The input data may be a protein name, a database reference, or a protein sequence. For example, the protein sequence can be uploaded through the computer interface.

2: In step 2, additional data about the protein, including its name or sequence, can be identified, determined, obtained, and/or entered through the computer interface. One source for obtaining protein data (20) is the database named UniProt (www.uniprot.org). Alternatively, the methods of the invention can store data about the protein, or sequences related to the protein, for later retrieval by the user at this step. In embodiments, the program can prompt the user to select a database or file from which to retrieve additional data (e.g., sequence data) about the protein selected for analysis.

3: In step 3, the user can enter, upload, or obtain data identifying the transmembrane regions. For example, the user can be prompted to retrieve the data from a public source such as UniProt. The information can be verified (30) and collected from the database for use in step 5.

4: Alternatively or additionally, even when TM region information is not readily available from the input protein sequence, the transmembrane regions can be established by any art-recognized method (40). Transmembrane regions are generally characterized by an α-helical conformation. For example, transmembrane helix prediction can be performed using the software module/package named TMHMM 2.0 (transmembrane prediction using hidden Markov models), developed by the Center for Biological Sequence Analysis (www.cbs.dtu.dk/services/TMHMM). This version of the software can have problems with peak finding and may fail to find all seven TM regions of a GPCR. A modified version of the program, in which a dynamic baseline is introduced into the peak-finding method executed by the computer system, may therefore be used as needed. Here, for a GPCR for example, if not all seven TM regions are found with the initial baseline value, the baseline can be changed to a lower value. For example, the default baseline may be set to 0.2, and the baseline can be lowered to 0.1 to identify a missing seventh transmembrane region. If eight or more TM regions are found, the baseline can be changed to a higher value, such as 0.15, to exclude spurious TM predictions. For example, when the amino acid sequence of CCR-2 was submitted to the TMHMM 2.0 software, only six transmembrane regions were initially identified; setting the TMHMM 2.0 baseline value to 0.07, however, identified the correct total of seven transmembrane regions. The results of the TM region prediction are then passed to step 5.
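The dynamic-baseline idea can be sketched on a per-residue TM probability profile. This is a simplified illustration and not the modified TMHMM code: the profile is assumed to be a list of values between 0 and 1, a TM segment is taken to be a run of at least `min_len` residues above the baseline, and the candidate baselines (0.2, 0.15, 0.1, 0.07, the values mentioned in the text) are simply tried in order until the expected seven segments are found.

```python
from typing import List, Optional, Sequence, Tuple

Region = Tuple[int, int]  # (start, end), 0-based, end exclusive


def find_segments(profile: Sequence[float], baseline: float, min_len: int = 15) -> List[Region]:
    """Return runs of at least min_len residues whose value exceeds the baseline."""
    segments, start = [], None
    for i, p in enumerate(profile):
        if p > baseline and start is None:
            start = i
        elif p <= baseline and start is not None:
            if i - start >= min_len:
                segments.append((start, i))
            start = None
    if start is not None and len(profile) - start >= min_len:
        segments.append((start, len(profile)))
    return segments


def find_tm_dynamic_baseline(
    profile: Sequence[float],
    expected: int = 7,
    baselines: Sequence[float] = (0.2, 0.15, 0.1, 0.07),
) -> Tuple[Optional[float], List[Region]]:
    """Try each baseline in turn; return the first that yields `expected` segments."""
    for baseline in baselines:
        segments = find_segments(profile, baseline)
        if len(segments) == expected:
            return baseline, segments
    return None, []


# Toy profile: six strong peaks and one weak peak (0.12) missed at baseline 0.2.
profile: List[float] = []
for _ in range(6):
    profile += [0.0] * 10 + [0.9] * 20
profile += [0.0] * 10 + [0.12] * 20 + [0.0] * 10

baseline, segments = find_tm_dynamic_baseline(profile)
print(baseline, len(segments))  # 0.1 7
```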

5: In step 5, once the TM data have been identified, either by a new prediction or from the initial sequence input, the sequence of the GPCR is divided (50) according to the TM region information into a total of 15 fragments: seven transmembrane segments (7 TM) (52) and eight non-transmembrane segments (8 NTM) (54). That is, each typical GPCR should have seven TM and eight NTM fragments.
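The division into fragments can be sketched as below, assuming the TM regions are given as 0-based, end-exclusive index pairs. The function name and the toy sequence (runs of "N" for non-TM and "T" for TM residues) are hypothetical.

```python
from typing import List, Sequence, Tuple


def split_fragments(sequence: str, tm_regions: Sequence[Tuple[int, int]]) -> Tuple[List[str], List[str]]:
    """Split a sequence into TM fragments and the non-TM fragments around them.

    k TM regions always yield k + 1 non-TM fragments (possibly empty at the ends).
    """
    tm, ntm, previous_end = [], [], 0
    for start, end in sorted(tm_regions):
        ntm.append(sequence[previous_end:start])
        tm.append(sequence[start:end])
        previous_end = end
    ntm.append(sequence[previous_end:])
    return tm, ntm


# Toy GPCR-like layout: 8 non-TM stretches of 5 residues around 7 TM stretches of 20.
toy = "N" * 5 + ("T" * 20 + "N" * 5) * 7
regions = [(5 + 25 * k, 25 + 25 * k) for k in range(7)]
tm, ntm = split_fragments(toy, regions)
print(len(tm), len(ntm))  # 7 8
```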

It will be understood that the system can perform one or more, e.g., all, of the above steps using the computer interface for user input. It will also be understood that the system can omit one or more of the above steps or combine two or more of them.

6: In step 6, QTY substitutions are made on a selected subset of the hydrophobic amino acids L, I, V, and F within a given TM region of the protein (60). Specifically, a first transmembrane region (typically, though not necessarily, the transmembrane region most proximal to the N-terminus of the protein) is selected first for mutation. Some or all of the hydrophobic amino acids (L, I, V, and F) in the first transmembrane region are then substituted with the corresponding non-ionic hydrophilic amino acids (Q/S/N, T/S/N, T/S/N, or Y). It will be understood that the amino acid is not actually substituted in the protein here; rather, the amino acid designation is substituted in the sequence used for modeling. The term "sequence" is accordingly intended to include "sequence data." Typically, most or all of the hydrophobic amino acids are selected for substitution. When fewer than all are selected, it may be desirable to select the internal hydrophobic amino acids, leaving one or more of the N- and/or C-terminal amino acids of the transmembrane region hydrophobic. Additionally or alternatively, it may be desirable to select all of the leucines (L) in the transmembrane region for substitution. Additionally or alternatively, it may be desirable to select all of the isoleucines (I) in the transmembrane region for substitution. Additionally or alternatively, it may be desirable to select all of the valines (V) in the transmembrane region for substitution. Additionally or alternatively, it may be desirable to select all of the phenylalanines (F) in the transmembrane region for substitution. Additionally or alternatively, it may be advantageous to retain one or more phenylalanines in the transmembrane region. Additionally or alternatively, it may be advantageous to retain one or more valines in the transmembrane region. Additionally or alternatively, it may be advantageous to retain one or more leucines in the transmembrane region. Additionally or alternatively, it may be advantageous to retain one or more isoleucines in the transmembrane region. Additionally or alternatively, it may be advantageous to retain one or more hydrophobic amino acids in a transmembrane region whose wild-type sequence features three or more contiguous hydrophobic amino acids.

7: In step 7, the transmembrane region so designed is placed back into the sequence context of the original protein. That is, because each set of substitutions produces one specific putative variant of that TM region, the mutated or redesigned TM region (62) bearing the QTY substitutions is exchanged for the corresponding TM region of the original protein to produce a transmembrane variant, or "putative variant" (70). Together, these related putative variants form a first library of putative variants.
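Placing a redesigned TM region back into the full sequence, and collecting the resulting variants into a library, can be sketched as follows. The function names and the toy sequences are hypothetical; regions are 0-based and end-exclusive.

```python
from typing import Iterable, List, Tuple


def swap_tm_region(sequence: str, region: Tuple[int, int], redesigned: str) -> str:
    """Exchange one TM region of the original sequence for its redesigned version."""
    start, end = region
    if len(redesigned) != end - start:
        raise ValueError("a redesigned TM region must keep the original length")
    return sequence[:start] + redesigned + sequence[end:]


def build_library(sequence: str, region: Tuple[int, int], designs: Iterable[str]) -> List[str]:
    """One putative variant per set of substitutions in the chosen TM region."""
    return [swap_tm_region(sequence, region, design) for design in designs]


original = "AAALIVFAAA"
library = build_library(original, (3, 7), ["QTTY", "QTVY"])
print(library)  # ['AAAQTTYAAA', 'AAAQTVYAAA']
```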

8: Next, in steps 82 and 84, each putative variant is subjected to the transmembrane region prediction methods described herein (84), e.g., to check for predicted loss of the TM regions. The variant is also scored for the propensity of its sequence to form α-helices (82). The variant is also subjected to the water solubility prediction methods described herein; for example, the variant is scored for the propensity of its sequence to be water-soluble. Such a score may be based on the predicted propensity to form TM regions, with a strong propensity to form TM regions associated with low water solubility, and a low or absent propensity associated with high water solubility. It will be understood that complete water solubility at all concentrations is not necessary for most commercial purposes. Water solubility is preferably determined to be that required for functionality under the expected conditions of use (e.g., a ligand binding assay).

9: In step 9, putative variants predicted to lose α-helical structure and/or to be "water insoluble" (as predicted under the expected conditions of use) are discarded. For example, a combined score or rank (90), which is a weighted combination of the α-helical secondary structure prediction result and the TM region/water-solubility prediction result based on a ranking function, can be used to select putative variants predicted to be α-helical and water soluble. For example, where a prediction that the α-helical structure may be compromised can be tolerated, a transmembrane variant can be selected that is highly water soluble or is characterized by 0, 1, 2 or 3 hydrophobic amino acids (e.g., a higher weight on the water-solubility prediction result). Alternatively or additionally, a variant with a high degree of α-helical structure, characterized by 3, 4, 5 or 6 hydrophobic amino acids, can be selected (e.g., a higher weight on the α-helical secondary structure prediction result).

10: In step 10, the putative variants within the same library can be sorted or ranked (100) according to the scoring scheme outlined above (94). A predetermined number of putative variants can then be selected as the final members of the first library of putative variants. For example, in the combined score above, a score of 0 means no propensity to form a TM region and complete retention of the original α-helical secondary structure, and therefore the most desirable putative variant. A slightly higher score may indicate a slight propensity to form a TM region (or a lower propensity to be water soluble). Such a putative variant is less desirable, but it can still be selected if its combined score is favorable relative to the other putative variants in the library.

特定の実施形態では、１０、９、８、７、６、５、４、３、２または１個などの所定数の所望の推定上の変異体を選択することができる。 In certain embodiments, a predetermined number of desired putative variants, such as 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1, can be selected.
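Steps 9 and 10 can be sketched as a weighted score followed by a sort and a cut-off. The text fixes only that a combined score of 0 is the most desirable and that the combination is weighted; the linear form, the 0-to-1 scales, the default weights and all names below (`Scored`, `combined_score`, `select_top`) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scored:
    name: str
    helix_loss: float     # hypothetical scale: 0.0 = helix predicted fully retained
    tm_propensity: float  # hypothetical scale: 0.0 = no TM region predicted


def combined_score(v: Scored, w_helix: float = 0.5, w_tm: float = 0.5) -> float:
    """Weighted combination of the two predictions; 0 is the most desirable."""
    return w_helix * v.helix_loss + w_tm * v.tm_propensity


def select_top(variants: list[Scored], n: int, **weights: float) -> list[Scored]:
    """Rank variants by combined score (ascending) and keep the best n."""
    return sorted(variants, key=lambda v: combined_score(v, **weights))[:n]
```

Raising `w_tm` relative to `w_helix` corresponds to placing a higher weight on the water-solubility prediction result, and the reverse corresponds to favoring retention of α-helical structure.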

These steps (e.g., steps 6 to 10) can be repeated for the second, third, fourth, fifth, sixth and/or seventh (or further) transmembrane regions or domains to create one library of putative variants for each such TM region or domain.

11: In step 11, combinations of TM regions or domains carrying putative variants with the unsubstituted non-TM regions can be selected (110). For example, one, two, three or four domains containing putative variants with high α-helical structure scores can be combined with one, two, three, four, five or six domains containing putative variants with high water-solubility scores. In another example, when selecting multiple variants, a domain/TM region in which all hydrophobic amino acids have been replaced with hydrophilic amino acids, thereby maximizing its water-solubility score, can be combined with a second domain/TM region that retains three, four or five hydrophobic amino acids. As is known in the art, such selected putative variants can be "shuffled" with the extracellular and intracellular domains to create an initial combinatorial library of putatively water-soluble protein variants.
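The combination in step 11 amounts to taking one variant from each per-region library and interleaving it with the unchanged non-TM segments. A minimal sketch, assuming the protein has been split into alternating non-TM and TM pieces; the function name `assemble` and the toy fragments are hypothetical.

```python
from itertools import product
from typing import Iterator


def assemble(loops: list[str], tm_libraries: list[list[str]]) -> Iterator[str]:
    """Yield every full-length sequence built from one variant per TM region.

    loops holds the n + 1 unchanged non-TM segments (termini and loops);
    tm_libraries holds n lists, one list of variant segments per TM region.
    """
    if len(loops) != len(tm_libraries) + 1:
        raise ValueError("expected exactly one more non-TM segment than TM regions")
    for combo in product(*tm_libraries):
        parts = [loops[0]]
        for tm, loop in zip(combo, loops[1:]):
            parts.append(tm)
            parts.append(loop)
        yield "".join(parts)
```

Because `assemble` is a generator, a large library can be enumerated or sampled without holding every sequence in memory at once.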

In certain embodiments, all or a fraction of the putatively water-soluble protein variants of the initial combinatorial library designed as described herein can be prepared (produced or expressed in vitro or in host cells) and screened for water solubility and/or ligand binding, preferably by high-throughput screening. For example, the library can be amplified so that fewer than 100% of the putatively water-soluble combination variants are expressed. As is well known in the art, a reporter system can be used to screen for ligand binding. The method of the invention can be used to rapidly identify a library of putatively water-soluble modified transmembrane combination variants containing functionally combined extracellular and intracellular domains, and to produce water-soluble protein variants that have the proper three-dimensional structure of the wild-type protein and retain ligand-binding function (including binding affinity) or other functions. The software can include a learning module that uses the confirmed functionality of protein variants to exclude particular variants or to rank them differently.

In certain embodiments, to be experimentally practical, the initial combinatorial library has approximately 2 million potentially water-soluble GPCR or CXCR4 variants. Of course, larger or smaller variant libraries can also be designed. In certain embodiments, a smaller library may be preferred because it can be optimized based on analysis of the results of the studies described herein. Analysis of those results will likely establish trends that optimize the number of domain variants to be shuffled and presumptions for selecting domain variants.

In certain embodiments, particular hydrophobic amino acids within the TM regions of the transmembrane protein are selected for modification on the basis of helix-forming propensity, also known as a "helix prediction score" (see www.proteopedia.org/wiki/index.php/Main_Page). The various fragments are combined randomly to form approximately 2 million (8^7) variants of the full-length GPCR gene. The expected number of variants can generally be characterized by the formula H^n, where n is the number of transmembrane regions modified and/or altered by the method (n = 7 in the GPCR example) and H is the number of putative variants of each transmembrane region available for producing combination variants.
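The library size given by H^n can be checked directly. The sketch below also covers the case where the regions do not all have the same number of variants, which is an extension assumed here rather than stated in the text; the function names are hypothetical.

```python
from math import prod


def library_size(h: int, n: int) -> int:
    """Number of combination variants when each of n TM regions has h variants."""
    return h ** n


def library_size_uneven(variants_per_region: list[int]) -> int:
    """Same count when the regions have different numbers of variants."""
    return prod(variants_per_region)
```

With H = 8 and n = 7, `library_size(8, 7)` gives 2,097,152, which is the "approximately 2 million" figure quoted for the GPCR example.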

Once the initial combinatorial library, i.e., the set of domain variants to be shuffled, has been selected, nucleic acid molecules (DNA or cDNA molecules) encoding the proteins in the initial combinatorial library can be designed. These nucleic acid molecules are preferably designed with codon optimization and intron deletion for the expression system selected, to create a library of coding sequences. For example, if the expression system is E. coli, codons optimized for E. coli expression can be selected. See www.dna20.com/resources/genedesigner. In addition, a promoter region, such as a promoter suitable for expression in the expression system (e.g., E. coli), is selected and operably linked to the coding sequences in the library of coding sequences.
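A very simple form of codon selection is to back-translate each protein variant with one fixed codon per amino acid. The table below is an illustrative choice of codons commonly described as frequent in E. coli; it is an assumption for this sketch and is not taken from the text or from the tool cited above, which may balance codon usage quite differently. The names `ECOLI_CODON` and `back_translate` are hypothetical.

```python
# One codon per amino acid; an illustrative table, not the one used in the source.
ECOLI_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(protein: str, stop: str = "TAA") -> str:
    """Return a coding sequence for protein using one fixed codon per residue."""
    try:
        codons = [ECOLI_CODON[aa] for aa in protein]
    except KeyError as exc:
        raise ValueError(f"no codon for residue {exc.args[0]!r}") from None
    return "".join(codons) + stop
```

Because the coding sequence is built from the protein sequence rather than from genomic DNA, it contains no introns by construction. A promoter would still have to be chosen and linked separately.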

次いで、コード配列の最初のライブラリーまたはその一部を発現させて推定上水溶性であるＧＰＣＲのライブラリーを作製する。次いで、このライブラリーをリガンド結合アッセイに供する。結合アッセイでは、推定上水溶性であるＧＰＣＲを好ましくは水性媒体中でリガンドに接触させ、リガンド結合を検出する。 The first library of coding sequences or a portion thereof is then expressed to create a library of presumably water-soluble GPCRs. The library is then subjected to a ligand binding assay. In the binding assay, a putatively water-soluble GPCR is preferably contacted with the ligand in an aqueous medium to detect ligand binding.

本発明は、本明細書に記載されている方法から得られるか得ることができる膜貫通ドメイン変異体およびそれをコードする核酸分子を含む。 The present invention includes transmembrane domain variants obtained or obtained from the methods described herein and nucleic acid molecules encoding them.

The invention also contemplates water-soluble GPCR variants ("sGPCRs") characterized by a plurality of transmembrane domains, each independently characterized in that at least 50%, preferably at least about 60%, more preferably at least about 70% or 80%, for example at least about 90%, of the hydrophobic amino acid residues (L, I, V and F) of the native transmembrane protein (e.g., a GPCR) are substituted by Q, T, T or Y, respectively. The sGPCRs of the invention are characterized by water solubility and ligand binding. In particular, an sGPCR binds the same natural ligand as the corresponding native GPCR.
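Whether a given transmembrane domain meets one of the percentage thresholds above can be checked by comparing it position by position with the native domain. A minimal sketch, assuming the two segments are aligned and of equal length; the function name `substituted_fraction` is hypothetical.

```python
# Substitution code named in the text: L->Q, I->T, V->T, F->Y.
QTY_CODE = {"L": "Q", "I": "T", "V": "T", "F": "Y"}


def substituted_fraction(native_tm: str, variant_tm: str) -> float:
    """Fraction of native L, I, V and F residues replaced by their QTY counterpart."""
    if len(native_tm) != len(variant_tm):
        raise ValueError("segments must be aligned and of equal length")
    sites = [(a, b) for a, b in zip(native_tm, variant_tm) if a in QTY_CODE]
    if not sites:
        return 0.0
    replaced = sum(1 for a, b in sites if b == QTY_CODE[a])
    return replaced / len(sites)
```

For the toy pair `LIVFALLG` and `QTTYALLG`, four of the six native L, I, V and F residues are replaced, so the domain would meet the 50% and 60% thresholds but not the 70% threshold.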

The invention further encompasses a method of treating a disorder or disease mediated by the activity of a membrane protein, comprising the use of a water-soluble polypeptide to treat the disorder or disease, wherein the water-soluble polypeptide comprises a modified α-helical domain and retains the ligand-binding activity of the native membrane protein. Examples of such disorders and diseases include, but are not limited to, cancer, small cell lung cancer, melanoma, breast cancer, Parkinson's disease, cardiovascular disease, hypertension and asthma.

As described herein, the water-soluble peptides described herein can be used to treat conditions or diseases mediated by the activity of a membrane protein. In certain aspects, the water-soluble peptide acts as a "decoy" for the membrane receptor and binds the ligand that would otherwise activate the membrane receptor. The water-soluble peptides described herein can therefore be used to reduce the activity of a membrane protein. These water-soluble peptides remain in the circulation and competitively bind the specific ligand, thereby reducing the activity of the membrane-bound receptor. For example, the GPCR CXCR4 is overexpressed in small cell lung cancer and promotes metastasis of tumor cells. Binding of its ligand by a water-soluble peptide as described herein can significantly reduce metastasis.

The chemokine receptor CXCR4 is known in viral research as a major coreceptor for the entry of T-cell-line-tropic HIV (Feng et al. (1996) Science 272: 872-877; Davis et al. (1997) J Exp Med 186: 1793-1798; Zaitseva et al. (1997) Nat Med 3: 1369-1375; Sanchez et al. (1997) J Biol Chem 272: 27529-27531). Stromal cell-derived factor 1 (SDF-1) is a chemokine that interacts specifically with CXCR4. When SDF-1 binds CXCR4, CXCR4 activates Gαi protein-mediated signaling (pertussis toxin-sensitive) (Chen et al. (1998) Mol Pharmacol 53: 177-181), including downstream kinase pathways such as Ras/MAP kinase and phosphatidylinositol 3-kinase (PI3K)/Akt in lymphocytes, megakaryocytes and hematopoietic stem cells (Bleul et al. (1996) Nature 382: 829-833; Deng et al. (1997) Nature 388: 296-300; Kijowski et al. (2001) Stem Cells 19: 453-466; Majka et al. (2001) Folia. Histochem. Cytobiol. 39: 235-244; Sotsios et al. (1999) J. Immunol. 163: 5954-5963; Vlahakis et al. (2002) J. Immunol. 169: 5546-5554). In mice transplanted with human lymph nodes, SDF-1 induces migration of CXCR4-positive cells into the transplanted lymph node (Blades et al. (2002) J. Immunol. 168: 4308-4317).

Recent studies have shown that CXCR4 interactions can regulate the migration of metastatic cells. Hypoxia, a reduction in partial oxygen pressure, is a microenvironmental change that occurs in most solid tumors and is a major inducer of tumor angiogenesis and therapeutic resistance. Hypoxia increases CXCR4 levels (Staller et al. (2003) Nature 425: 307-311). Microarray analysis of a subpopulation of cells from a bone metastasis model with elevated metastatic activity showed that one of the genes increased in the metastatic phenotype was CXCR4. Furthermore, overexpression of CXCR4 in isolated cells significantly increased metastatic activity (Kang et al. (2003) Cancer Cell 3: 537-549). In samples collected from various breast cancer patients, Muller et al. ((2001) Nature 410: 50-56) found that CXCR4 expression levels are higher in primary tumors than in normal mammary gland or epithelial cells. Moreover, CXCR4 antibody treatment was shown to inhibit metastasis to regional lymph nodes, compared with control isotypes, all of which metastasized to lymph nodes and lungs (Muller et al. (2001)). A decoy therapy model is therefore suitable for treating CXCR4-mediated diseases and disorders.

In another embodiment, the invention relates to the treatment of a disease or disorder associated with CXCR4-dependent chemotaxis involving aberrant leukocyte recruitment or activation. The disease is selected from the group consisting of arthritis, psoriasis, multiple sclerosis, ulcerative colitis, Crohn's disease, allergy, asthma, AIDS-related encephalitis, AIDS-related maculopapular skin eruption, AIDS-related interstitial pneumonia, AIDS-related enteropathy, AIDS-related periportal pneumonia and AIDS-related glomerulonephritis.

別の態様では、本発明は、関節炎、リンパ腫、非小細胞肺癌、肺癌、乳癌、前立腺癌、多発性硬化症、中枢神経系発達障害、認知症、パーキンソン病、アルツハイマー病、腫瘍、線維腫、星状細胞腫、骨髄腫、神経膠芽腫、炎症性疾患、臓器移植拒絶反応、ＡＩＤＳ、ＨＩＶ感染または血管新生から選択される疾患または障害の治療に関する。 In another aspect, the invention comprises arthritis, lymphoma, non-small cell lung cancer, lung cancer, breast cancer, prostate cancer, multiple sclerosis, central nervous system developmental disorders, dementia, Parkinson's disease, Alzheimer's disease, tumors, fibromas, It relates to the treatment of diseases or disorders selected from astrocytoma, myeloma, glioblastoma, inflammatory disease, organ transplant rejection, AIDS, HIV infection or angiogenesis.

本発明は、前記水溶性ポリペプチドおよび薬学的に許容される担体または希釈液を含む医薬組成物も包含する。 The present invention also includes pharmaceutical compositions containing the water-soluble polypeptide and a pharmaceutically acceptable carrier or diluent.

Depending on the desired formulation, the composition can also include pharmaceutically acceptable, non-toxic carriers or diluents, which are defined as vehicles commonly used to formulate pharmaceutical compositions for animal or human administration. The diluent is selected so as not to affect the biological activity of the pharmaceutical or pharmacological composition. Examples of such diluents are distilled water, physiological phosphate-buffered saline, Ringer's solution, dextrose solution and Hank's solution. The pharmaceutical composition or formulation may also include other carriers, adjuvants, or non-toxic, non-therapeutic, non-immunogenic stabilizers and the like. Pharmaceutical compositions can also include large, slowly metabolized macromolecules such as proteins, polysaccharides such as chitosan, polylactic acids, polyglycolic acids and copolymers (e.g., latex-functionalized SEPHAROSE™, agarose, cellulose and the like), polymeric amino acids, amino acid copolymers and lipid aggregates (e.g., oil droplets or liposomes).

The composition can be administered parenterally, for example by intravenous, intramuscular, intrathecal or subcutaneous injection. Parenteral administration can be accomplished by incorporating the composition into a solution or suspension. Such solutions or suspensions may also include sterile diluents such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerin, propylene glycol or other synthetic solvents. Parenteral formulations may also include antibacterial agents such as benzyl alcohol or methylparaben, antioxidants such as ascorbic acid or sodium bisulfite, and chelating agents such as EDTA. Buffers such as acetates, citrates or phosphates and agents for adjusting tonicity, such as sodium chloride or dextrose, may also be added. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple-dose vials made of glass or plastic.

Additionally, auxiliary substances such as wetting agents, emulsifiers, surfactants and pH-buffering substances may be present in the composition. Other components of pharmaceutical compositions are oils of petroleum, animal, vegetable or synthetic origin, for example peanut oil, soybean oil and mineral oil. In general, glycols such as propylene glycol or polyethylene glycol are preferred liquid carriers, particularly for injectable solutions.

Injectable formulations can be prepared either as liquid solutions or as suspensions; solid forms suitable for solution or suspension in a liquid vehicle prior to injection can also be prepared. The preparation can also be emulsified or encapsulated in liposomes or in microparticles such as polylactic acid, polyglycolide or copolymers to enhance the adjuvant effect, as described above (Langer, Science 249: 1527, 1990 and Hanes, Advanced Drug Delivery Reviews 28: 97-119, 1997). The compositions and agents described herein can be administered in the form of a depot injection or implant preparation, which can be formulated to permit sustained or pulsatile release of the active ingredient.

Transdermal administration involves percutaneous absorption of the composition through the skin. Transdermal formulations include patches, ointments, creams, gels, salves and the like. Transdermal delivery can be achieved using a skin patch or transferosomes. See Paul et al., Eur. J. Immunol. 25: 3521-24, 1995 and Cevc et al., Biochem. Biophys. Acta 1368: 201-15, 1998.

"Treating" or "treatment" includes preventing or delaying the onset of the symptoms, complications or biochemical indicia of a disease, alleviating or ameliorating the symptoms, or arresting or inhibiting further development of the disease, condition or disorder. A "patient" is a human subject in need of treatment.

「有効量」とは、疾患の１つ以上の症状を改善し、かつ／または疾患の進行を予防し、疾患の回復を引き起こし、かつ／または所望の効果を達成するのに十分な治療薬の量を指す。 An "effective amount" is a therapeutic agent sufficient to ameliorate one or more symptoms of a disease and / or prevent the progression of the disease, cause recovery of the disease and / or achieve the desired effect. Refers to the quantity.

Computer Systems
The various aspects and functions described herein may be implemented as specialized hardware or software components executing in one or more computer systems. There are many examples of computer systems currently in use. These examples include, among others, network appliances, personal computers, workstations, mainframes, networked clients, servers, media servers, application servers, database servers and web servers. Other examples of computer systems include mobile computing devices, such as cellular phones and personal digital assistants, and network equipment, such as load balancers, routers and switches. Further, aspects may be located on a single computer system or may be distributed among a plurality of computer systems connected to one or more communication networks.

For example, various aspects, functions and methods may be distributed among one or more computer systems configured to provide a service to one or more client computers, or to perform an overall task as part of a distributed system. Additionally, aspects may be performed on a client-server or multi-tier system that includes components distributed among one or more server systems that perform various functions. Consequently, embodiments are not limited to executing on any particular system or group of systems. Further, aspects, functions and methods may be implemented in software, hardware or firmware, or any combination thereof. Thus, aspects, functions and methods may be implemented within methods, acts, systems, system elements and components using a variety of hardware and software configurations, and the examples are not limited to any particular distributed architecture, network or communication protocol.

Referring to FIG. 10, there is illustrated a block diagram of a distributed computer system 300 in which various aspects and functions are practiced. As shown, the distributed computer system 300 includes one or more computer systems that exchange information. More specifically, the distributed computer system 300 includes computer systems 302, 304 and 306. As shown, the computer systems 302, 304 and 306 are interconnected by, and may exchange data through, a communication network 308. The network 308 may include any communication network through which computer systems may exchange data. To exchange data using the network 308, the computer systems 302, 304 and 306 and the network 308 may use various methods, protocols and standards. Examples of these protocols and standards include NAS, Web, storage and other data movement protocols suitable for use in a big data environment. To ensure that data transfer is secure, the computer systems 302, 304 and 306 may transmit data via the network 308 using a variety of security measures including, for example, SSL or VPN technologies. While the distributed computer system 300 illustrates three networked computer systems, it is not so limited and may include any number of computer systems and computing devices, networked using any medium and communication protocol.

As illustrated in FIG. 10, the computer system 302 includes a processor 310, a memory 312, an interconnection element 314, an interface 316 and a data storage element 318. To implement at least some of the aspects, functions and methods disclosed herein, the processor 310 performs a series of instructions that result in manipulated data. The processor 310 may be any type of processor, multiprocessor or controller. Examples of processors include commercially available processors such as an Intel Xeon, Itanium, Core, Celeron or Pentium processor, an AMD Opteron processor, an Apple A4 or A5 processor, a Sun UltraSPARC processor, an IBM Power5+ processor, an IBM mainframe chip or a quantum computer. The processor 310 is connected to other system components, including one or more memory devices 312, by the interconnection element 314.

The memory 312 stores programs (e.g., sequences of instructions coded to be executable by the processor 310) and data during operation of the computer system 302. Thus, the memory 312 may be a relatively high-performance, volatile random access memory such as dynamic random access memory ("DRAM") or static memory ("SRAM"). However, the memory 312 may include any device for storing data, such as a disk drive or other non-volatile storage device. Various examples may organize the memory 312 into particularized and, in some cases, unique structures to perform the functions disclosed herein. These data structures may be sized and organized to store values for particular data and types of data.

コンピュータシステム３０２の構成要素は、相互接続要素３１４などの相互接続要素によって接続されている。相互接続要素３１４は、ＩＤＥ、ＳＣＳＩ、ＰＣＩおよびＩｎｆｉｎｉＢａｎｄなどの専用または標準的なコンピューティングバス技術に従う１つ以上の物理的バスなどのシステム構成要素間の任意の通信結合を含んでいてもよい。相互接続要素３１４により、命令およびデータを含む通信をコンピュータシステム３０２のシステム構成要素間で交換することが可能となる。 The components of the computer system 302 are connected by interconnect elements such as the interconnect element 314. The interconnect element 314 may include any communication coupling between system components such as one or more physical buses according to dedicated or standard computing bus technology such as IDE, SCSI, PCI and InfiniBand. The interconnect element 314 allows communication, including instructions and data, to be exchanged between system components of computer system 302.

コンピュータシステム３０２は、入力装置、出力装置および入力／出力装置の組み合わせなどの１つ以上のインタフェース装置３１６も備える。インタフェース装置は、入力を受け取るか出力を与えてもよい。より詳細には、出力装置は、外部提示のために情報を与えてもよい。入力装置は外部ソースから情報を受け取ってもよい。インタフェース装置の例としては、キーボード、マウス装置、トラックボール、マイクロホン、タッチスクリーン、印刷装置、表示画面、スピーカー、ネットワークインタフェースカードなどが挙げられる。インタフェース装置によりコンピュータシステム３０２は情報を交換し、かつユーザおよび他のシステムなどの外部実体と通信することができる。 The computer system 302 also includes one or more interface devices 316, such as an input device, an output device, and a combination of input / output devices. The interface device may receive an input or give an output. More specifically, the output device may provide information for external presentation. The input device may receive information from an external source. Examples of interface devices include keyboards, mouse devices, trackballs, microphones, touch screens, printing devices, display screens, speakers, network interface cards, and the like. The interface device allows the computer system 302 to exchange information and communicate with external entities such as users and other systems.

The data storage element 318 includes a computer-readable and -writeable, nonvolatile or non-transitory data storage medium in which instructions are stored that define a program or other object executed by the processor 310. The data storage element 318 may also include information that is recorded on or in the medium and that is processed by the processor 310 during execution of the program. More specifically, the information may be stored in one or more data structures specifically configured to conserve storage space or increase data exchange performance. The instructions may be persistently stored as encoded signals, and the instructions may cause the processor 310 to perform any of the functions described herein. The medium may be, for example, an optical disk, a magnetic disk or flash memory, among others. In operation, the processor 310 or some other controller causes data to be read from the nonvolatile recording medium into another memory, such as the memory 312, that allows the processor 310 faster access to the information than does the storage medium included in the data storage element 318. This memory may be located in the data storage element 318 or in the memory 312; in either case, the processor 310 manipulates the data within the memory and then copies the data to the storage medium associated with the data storage element 318 after processing is completed. A variety of components may manage data movement between the storage medium and other memory elements, and the examples are not limited to particular data management components. Further, the examples are not limited to a particular memory system or data storage system.

Although the computer system 302 is shown by way of example as one type of computer system upon which various aspects and functions may be practiced, aspects and functions are not limited to being implemented on the computer system 302 as shown in FIG. 10. Various aspects and functions may be practiced on one or more computers having architectures or components different from those shown in FIG. 10. For instance, the computer system 302 may include specially programmed, special-purpose hardware, such as an application-specific integrated circuit ("ASIC") tailored to perform a particular operation disclosed herein, while another example may perform the same function using a grid of several general-purpose computing devices running MAC OS System X with Motorola PowerPC processors and several specialized computing devices running proprietary hardware and operating systems.

The computer system 302 may be a computer system including an operating system that manages at least a portion of the hardware elements included in the computer system 302. In some examples, a processor or controller, such as the processor 310, executes an operating system. Examples of particular operating systems that may be executed include a Windows-based operating system, such as the Windows NT, Windows 2000 (Windows ME), Windows XP, Windows Vista or Windows 7 operating systems, available from Microsoft Corporation; the MAC OS System X operating system or the iOS operating system, available from Apple Computer; one of many Linux-based operating system distributions, for example the Enterprise Linux operating system available from Red Hat Inc.; the Solaris operating system available from Oracle Corporation; or a UNIX operating system available from various sources. Many other operating systems may be used, and the examples are not limited to any particular operating system.

The processor 310 and the operating system together define a computer platform for which application programs in high-level programming languages are written. These component applications may be executable, intermediate, bytecode or interpreted code that communicates over a communication network, for example the Internet, using a communication protocol, for example TCP/IP. Similarly, aspects may be implemented using an object-oriented programming language such as .Net, SmallTalk, Java, C++, Ada, C# (C-Sharp), Python or JavaScript. Other object-oriented programming languages may also be used. Alternatively, functional, scripting or logical programming languages may be used.

Additionally, various aspects and functions may be implemented in a non-programmed environment. For example, documents created in HTML, XML or another format can, when viewed in a window of a browser program, render aspects of a graphical user interface or perform other functions. Further, various examples may be implemented as programmed or non-programmed elements, or any combination thereof. For example, a web page may be implemented using HTML while a data object required within the web page may be written in C++. Thus, the examples are not limited to a specific programming language, and any suitable programming language could be used. Accordingly, the functional components disclosed herein may include a wide variety of elements (e.g., specialized hardware, executable code, data structures or objects) that are configured to perform the functions described herein.

In some examples, the components disclosed herein may read parameters that affect the functions performed by the components. These parameters may be physically stored in any form of suitable memory, including volatile memory (such as RAM) or nonvolatile memory (such as a magnetic hard drive). In addition, the parameters may be logically stored in a proprietary data structure (such as a database or file defined by a user-space application) or in a commonly shared data structure (such as an application registry defined by an operating system). Further, some examples provide both system and user interfaces that allow external entities to modify the parameters and thereby configure the behavior of the components.
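
As a minimal sketch of the parameter handling described above, the following Python code reads parameters from a file defined by a user-space application and falls back to built-in defaults when the file is absent. The JSON file format, the parameter names and the default values are assumptions made for illustration only; none of them is specified in this document.

```python
import json
from pathlib import Path

# Hypothetical defaults; the names and values are illustrative assumptions.
DEFAULT_PARAMETERS = {
    "substitutions": {"L": "Q", "I": "T", "V": "T", "F": "Y"},
    "variants_per_domain": 8,
}


def load_parameters(path, defaults=DEFAULT_PARAMETERS):
    """Return the defaults, overridden by any values found in a JSON file."""
    parameters = dict(defaults)  # shallow copy so the defaults stay untouched
    file_path = Path(path)
    if file_path.exists():
        # Values stored by an external entity take precedence over defaults.
        parameters.update(json.loads(file_path.read_text(encoding="utf-8")))
    return parameters
```

An external entity can then reconfigure the component by editing the file, without any change to the code that consumes the parameters.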

Software for performing the computational method is shown generally in FIG. 11A, in which the user selects operating parameters for running the procedure on a computer as described previously herein (402), enters one or more sequences (404) and performs the substitutions (408). The system is operable to verify secondary structure (408) and to verify the water solubility of one or more variants. As shown in FIG. 11B, the program can include further processing options in addition to those described above: one or more ranking functions can be stored (442), and either the user or the system can select the ranking function to be used (444). The system then generates rankings as described herein (446); the user can then produce the selected variants (448), measure their function (448), and subsequently enter the functional data and modify the processing procedure based on it (450).
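
The storing, selecting and applying of ranking functions (442 to 446) can be sketched as a small registry of scoring functions. The two scoring functions below are hypothetical stand-ins chosen only to make the example runnable; this section does not define the ranking functions actually used, and the names `RANKING_FUNCTIONS` and `rank_variants` are likewise assumptions.

```python
from typing import Callable, Dict, List

HYDROPHOBIC = set("LIVF")


def fewest_hydrophobic(sequence: str) -> float:
    """Hypothetical score: fewer remaining L/I/V/F residues ranks higher."""
    return -float(sum(1 for aa in sequence if aa in HYDROPHOBIC))


def most_native_like(sequence: str) -> float:
    """Hypothetical score: more remaining L/I/V/F residues ranks higher."""
    return float(sum(1 for aa in sequence if aa in HYDROPHOBIC))


# Stored ranking functions (step 442).
RANKING_FUNCTIONS: Dict[str, Callable[[str], float]] = {
    "fewest_hydrophobic": fewest_hydrophobic,
    "most_native_like": most_native_like,
}


def rank_variants(variants: List[str], function_name: str = "fewest_hydrophobic") -> List[str]:
    """Select a stored ranking function (444) and return ranked variants (446)."""
    score = RANKING_FUNCTIONS[function_name]
    return sorted(variants, key=score, reverse=True)
```

Keeping the functions in a dictionary lets either the user or the system pick one by name, which mirrors the choice described for step 444.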

The invention will be better understood in connection with the following examples, which are intended as illustrations only and do not limit the scope of the invention. Various changes and modifications to the disclosed embodiments will be apparent to those skilled in the art, and such changes may be made without departing from the spirit of the invention and the scope of the appended claims.

Example 1: CXC chemokine receptor type 4 isoform a (CXCR4)
CXCR4 is a chemokine receptor 356 amino acids long. It has a pI of about 8.61 and a molecular weight of 40221.19 Da. The sequence of CXCR4 published in the literature is set forth in SEQ ID NO: 1 (sequence not reproduced here).
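
A molecular weight such as the one quoted above can be estimated from a sequence by summing average residue masses and adding the mass of one water molecule for the free termini. The sketch below assumes standard average residue masses; the tool that produced the figures in this example is not stated, so small differences from the quoted values are possible. Because the CXCR4 sequence is not reproduced in this text, the usage line uses the dipeptide Gly-Gly instead.

```python
# Average residue masses in daltons (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528


def molecular_weight(sequence: str) -> float:
    """Estimate the average molecular weight of an unmodified peptide chain."""
    residues = sequence.upper()
    # Sum the residue masses, then add one water for the N- and C-termini.
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in residues) + WATER_MASS


print(round(molecular_weight("GG"), 2))  # prints 132.12
```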

This sequence is submitted to TMHMM to identify the transmembrane domains, which are shown in FIG. 3.
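
Once the transmembrane prediction is available, the helix boundaries have to be extracted for the substitution step. The sketch below assumes a topology string in the style of TMHMM's short output, in which `i` and `o` mark inside and outside segments and each helix is written as `start-end`. The function name and the example string are illustrative assumptions; only the range 47-70 comes from this example, and the second range is invented.

```python
import re
from typing import List, Tuple


def parse_topology(topology: str) -> List[Tuple[int, int]]:
    """Return 1-based, inclusive (start, end) pairs for each predicted helix."""
    # Every "<digits>-<digits>" token in the string is one transmembrane helix.
    return [(int(start), int(end)) for start, end in re.findall(r"(\d+)-(\d+)", topology)]


# Illustrative topology string; the second helix range is made up.
helices = parse_topology("o47-70i82-103o")
print(helices)  # prints [(47, 70), (82, 103)]
```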

All or substantially all of the hydrophobic amino acids L, I, V and F are substituted with Q, T and Y (respectively) to obtain the sequence set forth in SEQ ID NO: 2 (sequence not reproduced here).
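
The substitution step can be sketched as a per-residue replacement restricted to the predicted transmembrane regions. The pairing used here (L to Q, I and V to T, F to Y) is an assumption about how four residues map onto three replacements; it is consistent with the variant lists in this example, in which fully substituted variants contain no L, I, V or F. The function name and the toy sequence are illustrative only.

```python
from typing import Dict, Iterable, Tuple

# Assumed pairing of hydrophobic residues with their replacements.
QTY_MAPPING: Dict[str, str] = {"L": "Q", "I": "T", "V": "T", "F": "Y"}


def substitute_tm_regions(
    sequence: str,
    tm_regions: Iterable[Tuple[int, int]],
    mapping: Dict[str, str] = QTY_MAPPING,
) -> str:
    """Replace mapped residues inside each 1-based, inclusive TM region."""
    residues = list(sequence)
    for start, end in tm_regions:
        for index in range(start - 1, end):
            # Residues without a mapping entry are left unchanged.
            residues[index] = mapping.get(residues[index], residues[index])
    return "".join(residues)


# Toy sequence: only positions 3 to 6 are treated as transmembrane.
print(substitute_tm_regions("MKLIVFGAKLIVF", [(3, 6)]))  # prints MKQTTYGAKLIVF
```

Residues outside the listed regions are untouched, which matches the description of loop sequences being kept as in the native protein.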

The predicted pI of this protein is 8.54 and its molecular weight is 40551.64 Da. Each of the predicted transmembrane regions is underlined, illustrating the fully modified domains of the invention. Thus, for example, the invention includes a transmembrane domain (TM1) comprising amino acids 47-70 of SEQ ID NO: 2, and proteins comprising it. By way of example, FIG. 3 shows the α-helix prediction for the TM1 sequence. Preferably, a protein comprising TM1 herein includes one or more (e.g., all) of the extracellular and intracellular loop sequences of SEQ ID NO: 2 (the sequences that are not underlined). Additionally or alternatively, a protein comprising TM1 herein includes one or more further transmembrane regions (the underlined sequences) of SEQ ID NO: 2, or of a homologous sequence that retains one, two, three, or optionally four or more, of the native L, I, V and F amino acids set forth in SEQ ID NO: 1.
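
Whether a given transmembrane variant retains native L, I, V or F residues, and at which positions, can be checked directly on the sequence. The sketch below applies such a check to two TM1 variants from this example's variant list (SEQ ID NOs: 4 and 11). The function name is an illustrative assumption, and the check treats any L, I, V or F in a variant as a retained native residue.

```python
from typing import List, Tuple

HYDROPHOBIC = "LIVF"


def retained_hydrophobic(variant: str) -> List[Tuple[int, str]]:
    """List (1-based position, residue) for every L, I, V or F in a variant."""
    return [(pos, aa) for pos, aa in enumerate(variant, start=1) if aa in HYDROPHOBIC]


seq_id_4 = "IFLPTTYSTTFQTGTTGNGQVTQVM"
seq_id_11 = "TYQPTTYSTTYQTGTTGNGQTTQTM"

print(len(retained_hydrophobic(seq_id_4)))   # prints 6
print(len(retained_hydrophobic(seq_id_11)))  # prints 0
```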

The native protein sequence of CXCR4 (differing at the N-terminal amino acids) is submitted to the method again. The program output divided the native sequence into extracellular and intracellular regions and selected eight transmembrane domain variants for each of the transmembrane domains. The results are shown in FIG. 4 and in the table below.

MEGISIYTSDNYTEEMGSGDYDSMKEPCFREENANFNK (SEQ ID NO: 3; EC1)

TM1 variants:
IFLPTTYSTTFQTGTTGNGQVTQVM (SEQ ID NO: 4)
IFQPTTYSTTFQTGTTGNGQVTQVM (SEQ ID NO: 5)
IFQPTTYSTTFQTGTTGNGQVTQTM (SEQ ID NO: 6)
IFQPTTYSTTYQTGTTGNGQVTQTM (SEQ ID NO: 7)
IFQPTTYSTTYQTGTTGNGQTTQVM (SEQ ID NO: 8)
IFQPTTYSTTYQTGTTGNGQTIQTM (SEQ ID NO: 9)
IFQPTTYSTTYQTGTTGNGQTTQTM (SEQ ID NO: 10)
TYQPTTYSTTYQTGTTGNGQTTQTM (SEQ ID NO: 11)

GYQKKLRSMTDKYR (SEQ ID NO: 12; IC1)

TM2 variants:
LHLSTADQQFTTTQPFWAVDAV (SEQ ID NO: 13)
LHLSVADQQYTTTQPFWATDAV (SEQ ID NO: 14)
LHQSVADQQYVTTQPFWATDAT (SEQ ID NO: 15)
QHQSVADQQFTTTQPFWATDAT (SEQ ID NO: 16)
LHQSVADQQYTITQPYWATDAT (SEQ ID NO: 17)
QHLSVADQQYTITQPYWATDAT (SEQ ID NO: 18)
QHLSTADQQYVTTQPYWATDAT (SEQ ID NO: 19)
QHQSTADQQYTTTQPYWATDAT (SEQ ID NO: 20)

ANWYFGNFLCK (SEQ ID NO: 21; EC2)

TM3 variants:
AVHVTYTVNQYSSVQIQAFT (SEQ ID NO: 22)
AVHTTYTVNQYSSVQIQAFT (SEQ ID NO: 23)
AVHTTYTVNQYSSVQTQAFT (SEQ ID NO: 24)
ATHTTYTVNQYSSVQTQAFT (SEQ ID NO: 25)
ATHTIYTTNQYSSVQTQAFT (SEQ ID NO: 26)
AVHTTYTTNQYSSVQTQAFT (SEQ ID NO: 27)
ATHTTYTTNQYSSVQTQAFT (SEQ ID NO: 28)
ATHTTYTTNQYSSTQTQAYT (SEQ ID NO: 29)

SLDRYLAIVHATNSQRPRKLLAEK (SEQ ID NO: 30; IC2)

TM4 variants:
VTYTGVWTPAQQQTIPDFIF (SEQ ID NO: 31)
TTYTGTWIPAQQQTIPDFIF (SEQ ID NO: 32)
TTYTGTWTPAQQQTIPDFIF (SEQ ID NO: 33)
TTYTGTWTPAQQQTIPDFIY (SEQ ID NO: 34)
TTYVGTWTPAQQQTTPDYIF (SEQ ID NO: 35)
TTYVGTWTPAQQQTTPDFIY (SEQ ID NO: 36)
TTYTGVWTPAQQQTTPDYTF (SEQ ID NO: 37)
TTYTGTWTPAQQQTTPDYTY (SEQ ID NO: 38)

ANVSEADDRYICDRFYPNDLW (SEQ ID NO: 39; EC3)

TM5 variants:
VVVFQFQHTMVGQTQPGTTTQ (SEQ ID NO: 40)
VVVFQFQHTMTGQTQPGTTTQ (SEQ ID NO: 41)
VVVFQYQHTMTGQTQPGTTTQ (SEQ ID NO: 42)
VVVYQYQHTMTGQTQPGTTTQ (SEQ ID NO: 43)
TVVFQYQHTMTGQTQPGTTTQ (SEQ ID NO: 44)
VVTFQYQHTMTGQTQPGTTTQ (SEQ ID NO: 45)
TVVYQYQHTMTGQTQPGTTTQ (SEQ ID NO: 46)
TTTYQYQHTMTGQTQPGTTTQ (SEQ ID NO: 47)

SCYCIIISKLSHSKGHQKRKALKTT (SEQ ID NO: 48; IC3)

TM6 variants:
VTQIQAFFACWQPYYTGTST (SEQ ID NO: 49)
VIQIQAYFACWQPYYTGTST (SEQ ID NO: 50)
VIQIQAYYACWQPYYTGTST (SEQ ID NO: 51)
VIQTQAFYACWQPYYTGTST (SEQ ID NO: 52)
VIQTQAYFACWQPYYTGTST (SEQ ID NO: 53)
VTQIQAFYACWQPYYTGTST (SEQ ID NO: 54)
VIQTQAYYACWQPYYTGTST (SEQ ID NO: 55)
TTQTQAYYACWQPYYTGTST (SEQ ID NO: 56)

DSFILLEIIKQGCEFENTVHK (SEQ ID NO: 57; EC4)

TM7 variants:
WISITEAQAFFHCCLNPIQY (SEQ ID NO: 58)
WISITEAQAFYHCCLNPIQY (SEQ ID NO: 59)
WISITEAQAYFHCCQNPTLY (SEQ ID NO: 60)
WISTTEALAFYHCCQNPTQY (SEQ ID NO: 61)
WISTTEALAYFHCCQNPTQY (SEQ ID NO: 62)
WISITEALAYYHCCQNPTQY (SEQ ID NO: 63)
WISTTEALAYYHCCQNPTQY (SEQ ID NO: 64)
WTSTTEAQAYYHCCQNPTQY

AFLGAKFKTSAQHALTSVSRGSSLKILSKGKRGGHSSVSTESESSSFHSS (SEQ ID NO: 65; IC4)

It should be apparent from the above that the sequences preceding, between and following each list of transmembrane domain variants (SEQ ID NOs: 3, 12, 21, 30, 39, 48, 57 and 65) are the N-terminal, intermediate and C-terminal extracellular and intracellular regions, respectively.

As is known in the art, the above sequences were then used to generate coding sequences suitable for expression in an expression system, in this case yeast. The coding sequences were then shuffled and expressed to create a library comprising a plurality of proteins having SEQ ID NOs: 3, 12, 21, 30, 39, 48, 57 and 65, each protein containing one transmembrane domain variant from each variant list between the respective intracellular and extracellular domains.
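
At the protein level, each library member described above is the eight loop sequences interleaved with one variant chosen from each of the seven transmembrane lists. The sketch below enumerates such members with `itertools.product`. It models only the resulting amino acid sequences, not the DNA shuffling itself, and to keep the example short it uses two TM1 variants and one variant for every other domain; the function names are illustrative assumptions.

```python
from itertools import product
from typing import Iterator, List, Sequence

# Loop sequences of Example 1 in order (SEQ ID NOs: 3, 12, 21, 30, 39, 48, 57, 65).
LOOPS = [
    "MEGISIYTSDNYTEEMGSGDYDSMKEPCFREENANFNK",
    "GYQKKLRSMTDKYR",
    "ANWYFGNFLCK",
    "SLDRYLAIVHATNSQRPRKLLAEK",
    "ANVSEADDRYICDRFYPNDLW",
    "SCYCIIISKLSHSKGHQKRKALKTT",
    "DSFILLEIIKQGCEFENTVHK",
    "AFLGAKFKTSAQHALTSVSRGSSLKILSKGKRGGHSSVSTESESSSFHSS",
]

# Truncated variant lists: SEQ ID NOs: 4 and 5 for TM1, the first entry otherwise.
VARIANTS = [
    ["IFLPTTYSTTFQTGTTGNGQVTQVM", "IFQPTTYSTTFQTGTTGNGQVTQVM"],
    ["LHLSTADQQFTTTQPFWAVDAV"],
    ["AVHVTYTVNQYSSVQIQAFT"],
    ["VTYTGVWTPAQQQTIPDFIF"],
    ["VVVFQFQHTMVGQTQPGTTTQ"],
    ["VTQIQAFFACWQPYYTGTST"],
    ["WISITEAQAFFHCCLNPIQY"],
]


def assemble(loops: Sequence[str], tm_choice: Sequence[str]) -> str:
    """Interleave loops and chosen TM variants: loop, TM, loop, ..., loop."""
    if len(loops) != len(tm_choice) + 1:
        raise ValueError("expected exactly one more loop than TM domains")
    parts: List[str] = []
    for loop, tm in zip(loops, tm_choice):
        parts.extend([loop, tm])
    parts.append(loops[-1])
    return "".join(parts)


def build_library(loops: Sequence[str], variant_lists: Sequence[Sequence[str]]) -> Iterator[str]:
    """Yield one full-length protein per combination of TM variants."""
    for choice in product(*variant_lists):
        yield assemble(loops, choice)


library = list(build_library(LOOPS, VARIANTS))
print(len(library))  # prints 2
```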

The library so produced was then assayed for binding to the CXCR4 cognate ligand, SDF1a (CXCL12), on plasmids expressed in yeast, with binding taking place inside living yeast cells. Ligand binding was detected through gene activation in a yeast two-hybrid system, and the samples were then sequenced. Nineteen CXCR4 variants were sequenced. The results are shown in FIG. 5.

Example 2: CX3C chemokine receptor 1 isoform b (CX3CR1)
CX3CR1 is a chemokine receptor 355 amino acids long. It has a pI of about 6.74 and a molecular weight of 40396.4 Da. This sequence is submitted to TMHMM to identify its transmembrane domains. All or substantially all of the hydrophobic amino acids L, I, V and F within the transmembrane domains are substituted with Q, T and Y (respectively) to obtain the variant sequence (bottom line), aligned with the wild type (top line), as set forth in SEQ ID NOs: 66 and 67 (alignment not reproduced here).

The predicted pI of this protein variant is 6.74 and its molecular weight is 41027.17 Da. Each of the predicted transmembrane regions is underlined, illustrating the fully modified domains of the invention. Thus, for example, the invention includes a transmembrane domain comprising the underlined amino acids of SEQ ID NO: 67. Preferably, a protein comprising TM1 herein includes one or more (e.g., all) of the extracellular and intracellular loop sequences of SEQ ID NO: 66 (the sequences that are not underlined). Additionally or alternatively, a protein comprising TM1 herein includes one or more further transmembrane regions (the underlined sequences) of SEQ ID NO: 67, or of a homologous sequence that retains one, two, three, or optionally four or more, of the native V, L, I and F amino acids set forth in SEQ ID NO: 66.

The native protein sequence of CX3CR1 is submitted to the method again. The program output divided the native sequence into extracellular and intracellular regions and selected eight transmembrane domain variants for each of the transmembrane domains. The results are shown in the table below.

MDQFPESVTENFEYDDLAEACYIGDIVVFGT (SEQ ID NO: 68)

TM1 variants:
TYQSTYYSTTFATGQVGNQQVVFALTNS (SEQ ID NO: 69)
TYQSTYYSTTYATGQVGNQQVVFALTNS (SEQ ID NO: 70)
TYQSTYYSTTYATGQVGNQQVVFAQTNS (SEQ ID NO: 71)
TYQSTYYSTTYATGQTGNLQVTFAQTNS (SEQ ID NO: 72)
TYQSTYYSTTYATGQTGNQLVTFAQTNS (SEQ ID NO: 73)
TYQSTYYSTTYATGQTGNQQVVFAQTNS (SEQ ID NO: 74)
TYQSTYYSTTYATGQTGNLQVTYAQTNS (SEQ ID NO: 75)
TYQSTYYSTTYATGQTGNQQTTYAQTNS (SEQ ID NO: 76)

KKPKSVTDIY (SEQ ID NO: 77)

TM2 variants:
LLNQAQSDQLFVATQPFWTHY (SEQ ID NO: 78)
LLNQAQSDQQFVATQPFWTHY (SEQ ID NO: 79)
QQNLAQSDQQFVATQPFWTHY (SEQ ID NO: 80)
LQNLAQSDQQYTATQPFWTHY (SEQ ID NO: 81)
QLNLAQSDQQYTATQPFWTHY (SEQ ID NO: 82)
LLNQAQSDQQFTATQPYWTHY (SEQ ID NO: 83)
QQNLAQSDQQFTATQPYWTHY (SEQ ID NO: 84)
QQNQAQSDQQYTATQPYWTHY (SEQ ID NO: 85)

LINEKGLHNAMCK (SEQ ID NO: 86)

TM3 variant:
YTTAYYYTGYYGSTYYTTTTST (SEQ ID NO: 87)

DRYLAIVLAANSMNNRT (SEQ ID NO: 88)

TM4 variants:
VQHGTTTSQGTWAAATQVAAPQFMF (SEQ ID NO: 89)
VQHGVTTSQGTWAAATQTAAPQFMF (SEQ ID NO: 90)
VQHGTTTSQGVWAAATQTAAPQFMY (SEQ ID NO: 91)
VQHGTTTSQGTWAAAIQTAAPQFMY (SEQ ID NO: 92)
VQHGTTTSQGTWAAATQTAAPQFMF (SEQ ID NO: 93)
VQHGTTISQGTWAAATQTAAPQYMF (SEQ ID NO: 94)
VQHGTTTSQGTWAAATQTAAPQFMY (SEQ ID NO: 95)
TQHGTTTSQGTWAAATQTAAPQYMY (SEQ ID NO: 96)

TKQKENECLGDYPEVLQEIWPVLRNVET (SEQ ID NO: 97)

TM5 variants:
NFLGFQQPQQIMSYCYFRIT (SEQ ID NO: 98)
NFQGFLQPQQTMSYCYFRIT (SEQ ID NO: 99)
NFQGFLQPQQTMSYCYFRTT (SEQ ID NO: 100)
NFQGFQQPQQTMSYCYYRIT (SEQ ID NO: 101)
NFQGFLQPQQTMSYCYYRTT (SEQ ID NO: 102)
NFQGYLQPQQTMSYCYFRTT (SEQ ID NO: 103)
NYQGFQQPQQTMSYCYFRTT (SEQ ID NO: 104)
NYQGYQQPQQTMSYCYYRTT (SEQ ID NO: 105)

QTLFSCKNHKKAKAIK (SEQ ID NO: 106)

TM6 variants:
LIQQTTTTFYQFWTPYNTMTFQETL (SEQ ID NO: 107)
LIQQTTTTFYQYWTPYNVMTFQETQ (SEQ ID NO: 108)
LIQQTTTTYYQFWTPYNTMTFQETQ (SEQ ID NO: 109)
QIQQTTTTFYQYWTPYNTMTFQETQ (SEQ ID NO: 110)
LTQQTTTTYYQFWTPYNTMTFQETQ (SEQ ID NO: 111)
QIQQTTTTFFQYWTPYNTMTYQETQ (SEQ ID NO: 112)
QIQQTTTTFYQYWTPYNTMTYQETQ (SEQ ID NO: 113)
QTQQTTTTYYQYWTPYNTMTYQETQ (SEQ ID NO: 114)

KLYDFFPSCDMRKDLRL (SEQ ID NO: 115)

TM7 variants:
ALSVTETVAFSHCCQNPQIYAFAG (SEQ ID NO: 116)
AQSVTETTAFSHCCQNPLIYAFAG (SEQ ID NO: 117)
ALSVTETVAFSHCCQNPQTYAYAG (SEQ ID NO: 118)
AQSVTETTAFSHCCQNPQIYAYAG (SEQ ID NO: 119)
ALSVTETTAFSHCCQNPQTYAYAG (SEQ ID NO: 120)
ALSTTETTAYSHCCQNPQIYAFAG (SEQ ID NO: 121)
ALSVTETTAYSHCCQNPQTYAYAG (SEQ ID NO: 122)
AQSTTETTAYSHCCQNPQTYAYAG (SEQ ID NO: 123)

EKFRRYLYHLYGKCLAVLCGRSVHVDFSSSESQRSRHGSVLSSNFTYHTSDGDALLLL (SEQ ID NO: 124)

As in Example 1 above, the sequences preceding, between and following each list of transmembrane domain variants are the N-terminal, intermediate and C-terminal intracellular or extracellular regions, respectively.

As is known in the art, the above sequences were then used to generate coding sequences suitable for expression in an expression system, in this case yeast. The coding sequences were then shuffled and expressed to create a library comprising a plurality of proteins having SEQ ID NOs: 68, 77, 86, 88, 97, 106 and 115, each protein containing one transmembrane domain variant from each variant list between the respective intracellular and extracellular domains.
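
The theoretical number of distinct proteins in such a library is the product of the number of variants listed for each transmembrane domain. The short sketch below computes it, assuming every combination is equally allowed; the per-domain counts are read from the variant lists in Examples 1 and 2 (in Example 2 the TM3 list has a single entry), and the function name is an illustrative assumption.

```python
from math import prod
from typing import Sequence


def library_size(variants_per_domain: Sequence[int]) -> int:
    """Number of combinations when one variant is picked per TM domain."""
    return prod(variants_per_domain)


print(library_size([8, 8, 8, 8, 8, 8, 8]))  # Example 1: prints 2097152
print(library_size([8, 8, 1, 8, 8, 8, 8]))  # Example 2: prints 262144
```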

The library so produced was then assayed in an aqueous medium for binding to the CX3CR1 cognate ligand (CX3CL1), as described in Example 1. Ligand binding was detected, and the samples were then sequenced. Seven variants were sequenced. The results are shown in FIG. 6.

Example 3: CCR3 variants
The method of Example 1 was repeated for chemokine receptor type 3 isoform 3 (CCR3).

All or substantially all of the hydrophobic amino acids L, I, V and F within the transmembrane domains are substituted with Q, T and Y (respectively) to obtain the variant sequence (bottom line), aligned with the wild type (top line), as set forth in SEQ ID NOs: 125 and 126 (alignment not reproduced here).

Each of the predicted transmembrane regions is underlined, illustrating the fully modified domains of the invention. Thus, for example, the invention includes a transmembrane domain comprising the underlined amino acids of SEQ ID NO: 126. Preferably, a protein comprising TM1 herein includes one or more (e.g., all) of the extracellular and intracellular loop sequences of SEQ ID NO: 126 (the sequences that are not underlined). Additionally or alternatively, a protein comprising TM1 herein includes one or more further transmembrane regions (the underlined sequences) of SEQ ID NO: 126, or of a homologous sequence that retains one, two, three, or optionally four or more, of the native V, L, I and F amino acids set forth in SEQ ID NO: 125.

The native protein sequence of CCR3 is submitted to the method again (note the difference in the N-terminal sequence). The program output divided the native sequence into extracellular and intracellular regions and selected eight transmembrane domain variants for each of the transmembrane domains. The results are shown in the table below.

MTTSLDTVETFGTTSYYDDVGLLCEKADTRALMA (SEQ ID NO: 127)

TM1 variants:
QFVPPQYSQTFTTGQQGNVTVTMTQIKY (SEQ ID NO: 128)
QFVPPQYSQTFTTGQQGNTTVTMTQIKY (SEQ ID NO: 129)
QFVPPQYSQTYTTGQQGNTTVTMTQIKY (SEQ ID NO: 130)
QFTPPQYSQTYTTGQQGNVTTTMTQIKY (SEQ ID NO: 131)
QFTPPQYSQTYTTGQQGNTVTTMTQIKY (SEQ ID NO: 132)
QFTPPQYSQTYTTGQQGNTTVTMTQIKY (SEQ ID NO: 133)
QFTPPQYSQTYTTGQQGNTTTTMTQIKY (SEQ ID NO: 134)
QYTPPQYSQTYTTGQQGNTTTTMTQTKY (SEQ ID NO: 135)

RRLRIMTNIY (SEQ ID NO: 136)

TM2 variants:
LLNQATSDQQFQVTQPFWIHY (SEQ ID NO: 137)
LQNQAISDQLFQTTQPFWTHY (SEQ ID NO: 138)
QQNLAISDQQFQTTQPFWTHY (SEQ ID NO: 139)
QLNQAISDQQFQTTQPYWTHY (SEQ ID NO: 140)
QQNLAISDQQYQVTQPYWTHY (SEQ ID NO: 141)
LQNQATSDQLFQTTQPYWTHY (SEQ ID NO: 142)
QQNQAISDQQYQVTQPYWTHY (SEQ ID NO: 143)
QQNQATSDQQYQTTQPYWTHY (SEQ ID NO: 144)

VRGHNWVFGHGMCK (SEQ ID NO: 145)

TM3 variants:
LQSGFYHTGQYSETFFTTQQTT (SEQ ID NO: 146)
QLSGFYHTGQYSETFFTTQQTT (SEQ ID NO: 147)
QLSGFYHTGQYSETFYTTQQTT (SEQ ID NO: 148)
QLSGFYHTGQYSETYFTTQQTT (SEQ ID NO: 149)
QLSGYYHTGQYSETFFTTQQTT (SEQ ID NO: 150)
QQSGFYHTGQYSETFFTTQQTT (SEQ ID NO: 151)
QQSGFYHTGQYSETFYTTQQTT (SEQ ID NO: 152)
QQSGYYHTGQYSETYYTTQQTT (SEQ ID NO: 153)

DRYLAIVHAVFALRART (SEQ ID NO: 154)

TM4 mutant:
TTFGTTTSTVTWGQAVQAAQPEFIF (SEQ ID NO: 155)
TTFGTTTSTTTWGQAVQAAQPEFIF (SEQ ID NO: 156)
TTYGTTTSTTTWGQAVQAAQPEFIF (SEQ ID NO: 157)
TTYGTTTSTTTWGQAVQAAQPEFTF (SEQ ID NO: 158)
TTYGTTTSTTTWGQATQAAQPEFIF (SEQ ID NO: 159)
TTFGTTTSTTTWGQATQAAQPEFIY (SEQ ID NO: 160)
TTYGTTTSTTTWGQATQAAQPEFIY (SEQ ID NO: 161)
TTYGTTTSTTTWGQATQAAQPEYTY (SEQ ID NO: 162)

YETEELFEETLCSALYPEDTVYSWRHFHTLRM (SEQ ID NO: 163)

TM5 mutant:
TIFCQVQPQQTMATCYTGTT (SEQ ID NO: 164)
TIFCQTQPQQVMATCYTGTT (SEQ ID NO: 165)
TIFCQTQPQQTMATCYTGIT (SEQ ID NO: 166)
TIFCQTQPQQTMATCYTGTI (SEQ ID NO: 167)
TTFCQVQPQQVMATCYTGTT (SEQ ID NO: 168)
TIYCQVQPQQVMATCYTGTT (SEQ ID NO: 169)
TIFCQTQPQQTMATCYTGTT (SEQ ID NO: 170)
TTYCQTQPQQTMATCYTGTT (SEQ ID NO: 171)

KTLLRCPSKKKYKAIR (SEQ ID NO: 172)

TM6 mutant:
QTYTTMATYYTYWTPYNTATQQSSY (SEQ ID NO: 173)

QSILFGNDCERSKHLDL (SEQ ID NO: 174)

TM7 mutant:
VMQVTEVTAYSHCCMNPVTYAFTG (SEQ ID NO: 175)
VMQVTEVTAYSHCCMNPTTYAYVG (SEQ ID NO: 176)
VMLTTEVTAYSHCCMNPTTYAFTG (SEQ ID NO: 177)
VMQVTETTAYSHCCMNPVTYAYTG (SEQ ID NO: 178)
TMQVTETIAYSHCCMNPTTYAFTG (SEQ ID NO: 179)
TMQVTETTAYSHCCMNPTTYAFVG (SEQ ID NO: 180)
VMQTTETIAYSHCCMNPTTYAYTG (SEQ ID NO: 181)
TMQTTETTAYSHCCMNPTTYAYTG (SEQ ID NO: 182)

ERFRKYLRHFFHRHLLMHLGRYIPFLPSEKLERTSSVSPSTAEPELSIVF (SEQ ID NO: 183)

The above sequences were then used, as is known in the art, to generate coding sequences suitable for expression in an expression system, in this case yeast. The coding sequences were then recombined and expressed to produce a library comprising a plurality of proteins having SEQ ID NOs: 127, 136, 145, 154, 163, 172, 174 and 183, each protein containing one transmembrane domain variant from each variant list between the respective intracellular and extracellular domains.

The library thus prepared was then assayed, as described in Example 1, for binding to the CCR3 cognate ligand, CCL3, in an aqueous medium. Ligand binding was detected, and the samples were then sequenced. Eleven mutants were sequenced. The results are shown in FIG. 7.

Example 4: CCR5 mutant
The method of Example 1 was repeated for chemokine receptor type 5 isoform 3.

Each of the predicted transmembrane regions is underlined, exemplifying the fully modified domains of the invention. Thus, for example, the invention includes a transmembrane domain comprising the underlined amino acids of SEQ ID NO: 185. Preferably, a protein herein comprising TM1 comprises one or more (for example, all) of the extracellular and intracellular loop sequences (the non-underlined sequences) of SEQ ID NO: 185. Additionally or alternatively, a protein herein comprising TM1 comprises one or more further transmembrane regions (the underlined sequences) of SEQ ID NO: 185, or of a homologous sequence that retains one, two, three or optionally four or more of the native V, L, I and F amino acids set forth in SEQ ID NO: 184.
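As a rough illustration of what "retaining" native V, L, I and F residues means for a given variant, the following sketch counts the L, I, V and F residues still present in a transmembrane segment. The function name `retained_hydrophobics` is hypothetical, and the count equals the number of retained native residues only under the assumption that the substitution itself never introduces a new L, I, V or F. The two sequences are CCR5 TM5 variants listed in this example (SEQ ID NOs: 210 and 217).

```python
# Residues targeted by the substitution described in the examples.
HYDROPHOBIC = frozenset("LIVF")


def retained_hydrophobics(tm_segment: str) -> int:
    """Count L, I, V and F residues remaining in a transmembrane segment."""
    return sum(1 for residue in tm_segment.upper() if residue in HYDROPHOBIC)


# CCR5 TM5 variants taken from the listing in this example.
tm5_seq_210 = "VIQGQVQPQQVMVTCYSGIQ"  # SEQ ID NO: 210
tm5_seq_217 = "TTQGQTQPQQTMTTCYSGTQ"  # SEQ ID NO: 217

print(retained_hydrophobics(tm5_seq_210))  # 6
print(retained_hydrophobics(tm5_seq_217))  # 0
```

Under that assumption, SEQ ID NO: 210 still carries six of the targeted residues, whereas SEQ ID NO: 217 carries none.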

The native protein sequence of CCR5 was again subjected to the present method (note the difference in the N-terminal sequence). The program output divided the native sequence into extracellular and intracellular regions and selected eight transmembrane domain variants for each of the transmembrane domains. The results are shown in the table below.

MDYQVSSPIYDINYYTSEPCQKINVKQIAA (SEQ ID NO: 186)

TM1 mutant:
RLQPPQYSQTFTFGFTGNMQVTQTQINC (SEQ ID NO: 187)
RLQPPQYSQTFTFGYTGNMQVTQTQINC (SEQ ID NO: 188)
RQQPPQYSQTFTFGFTGNMQTTQTQINC (SEQ ID NO: 189)
RQQPPQYSQTFTYGFTGNMQTTQTQINC (SEQ ID NO: 190)
RQQPPQYSQTYTFGFTGNMQTTQTQINC (SEQ ID NO: 191)
RQQPPQYSQTFTFGYTGNMQTTQTQINC (SEQ ID NO: 192)
RQQPPQYSQTYTFGYTGNMQTTQTQINC (SEQ ID NO: 193)
RQQPPQYSQTYTYGYTGNMQTTQTQTNC (SEQ ID NO: 194)

KRLKSMTDIY (SEQ ID NO: 195)

TM2 mutant:
LQNQAISDQFFQQTVPFWAHY (SEQ ID NO: 196)
LQNQAISDQFFQQTTPFWAHY (SEQ ID NO: 197)
LQNQAISDQFFQQTTPYWAHY (SEQ ID NO: 198)
LQNQAISDQFYQQTTPYWAHY (SEQ ID NO: 199)
LQNQAISDQYFQQTTPYWAHY (SEQ ID NO: 200)
LQNQATSDQFFQQTTPYWAHY (SEQ ID NO: 201)
LQNQAISDQYYQQTTPYWAHY (SEQ ID NO: 202)
QQNQATSDQYYQQTTPYWAHY (SEQ ID NO: 203)

AAAQWDFGNTMCQ (SEQ ID NO: 204)

TM3 mutant:
QQTGQYFTGYYSGTYYTTQQTT (SEQ ID NO: 205)
QQTGQYYTGYYSGTYYTTQQTT (SEQ ID NO: 206)

DRYLAVVHAVFALKART (SEQ ID NO: 207)

TM4 mutant:
TTYGTTTSTTTWTTATYASQPGTTY (SEQ ID NO: 208)

TRSQKEGLHYTCSSHFPYSQYQFWKNFQTLKI (SEQ ID NO: 209)

TM5 mutant:
VIQGQVQPQQVMVTCYSGIQ (SEQ ID NO: 210)
VIQGQVQPQQVMTTCYSGIQ (SEQ ID NO: 211)
VIQGQVQPQQTMTTCYSGIQ (SEQ ID NO: 212)
VTQGQVQPQQTMVTCYSGTQ (SEQ ID NO: 213)
TIQGQVQPQQVMTTCYSGTQ (SEQ ID NO: 214)
TIQGQVQPQQTMVTCYSGTQ (SEQ ID NO: 215)
TTQGQVQPQQVMTTCYSGTQ (SEQ ID NO: 216)
TTQGQTQPQQTMTTCYSGTQ (SEQ ID NO: 217)

KTLLRCRNEKKRHRAVR (SEQ ID NO: 218)

TM6 mutant:
QTFTTMTTYYQFWAPYNIVQQLNTF (SEQ ID NO: 219)
QTFTTMTTYYQFWAPYNTVQQLNTF (SEQ ID NO: 220)
QTFTTMTTYYQYWAPYNTVQQLNTF (SEQ ID NO: 221)
QTFTTMTTYYQYWAPYNTVQQQNTF (SEQ ID NO: 222)
QTYTTMTTYYQYWAPYNTVQQLNTF (SEQ ID NO: 223)
QTFTTMTTYYQYWAPYNTTQQLNTF (SEQ ID NO: 224)
QTYTTMTTYYQYWAPYNTVQQQNTF (SEQ ID NO: 225)
QTYTTMTTYYQYWAPYNTTQQQNTY (SEQ ID NO: 225)

QEFFGLNNCSSSNRLDQ (SEQ ID NO: 226)

TM7 mutant:
AMQVTETQGMTHCCINPIIYAFVG (SEQ ID NO: 227)
AMQVTETLGMTHCCTNPIIYAFTG (SEQ ID NO: 228)
AMQVTETQGMTHCCINPTIYAYVG (SEQ ID NO: 229)
AMQTTETQGMTHCCINPITYAFTG (SEQ ID NO: 230)
AMQTTETQGMTHCCINPTIYAFTG (SEQ ID NO: 231)
AMQVTETQGMTHCCTNPTIYAYVG (SEQ ID NO: 232)
AMQTTETQGMTHCCINPTTYAYVG (SEQ ID NO: 233)
AMQTTETQGMTHCCTNPTTYAYTG (SEQ ID NO: 234)

EKFRNYLLVFFQKHIAKRFCKCCSIFQQEAPERASSVYTRSTGEQEISVGL (SEQ ID NO: 235)

As in Example 1 above, the sequences before, between and after the lists of transmembrane domain variants are the N', intermediate and C' intracellular or extracellular regions, respectively.

The above sequences were then used, as is known in the art, to generate coding sequences suitable for expression in an expression system, in this case yeast. The coding sequences were then recombined and expressed to produce a library comprising a plurality of proteins having SEQ ID NOs: 186, 195, 204, 207, 209, 218, 226 and 235, each protein containing one transmembrane domain variant from each variant list between the respective intracellular and extracellular domains.
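The composition of such a library can be sketched in silico as a Cartesian product: the eight loop sequences stay fixed, and one variant is drawn from each of the seven transmembrane lists. This is only an illustrative model of the library contents, not the recombination procedure itself, which this passage does not describe at the sequence level. The helper names (`assemble`, `build_library`) are hypothetical, and only two variants per list are used (one for TM4) to keep the sketch short.

```python
from itertools import product
from math import prod

# CCR5 loop segments in N-to-C order (SEQ ID NOs: 186, 195, 204, 207, 209, 218, 226, 235).
LOOPS = [
    "MDYQVSSPIYDINYYTSEPCQKINVKQIAA",
    "KRLKSMTDIY",
    "AAAQWDFGNTMCQ",
    "DRYLAVVHAVFALKART",
    "TRSQKEGLHYTCSSHFPYSQYQFWKNFQTLKI",
    "KTLLRCRNEKKRHRAVR",
    "QEFFGLNNCSSSNRLDQ",
    "EKFRNYLLVFFQKHIAKRFCKCCSIFQQEAPERASSVYTRSTGEQEISVGL",
]

# Reduced variant lists: first and last listed variant of each TM list.
TM_VARIANTS = [
    ["RLQPPQYSQTFTFGFTGNMQVTQTQINC", "RQQPPQYSQTYTYGYTGNMQTTQTQTNC"],  # TM1
    ["LQNQAISDQFFQQTVPFWAHY", "QQNQATSDQYYQQTTPYWAHY"],                # TM2
    ["QQTGQYFTGYYSGTYYTTQQTT", "QQTGQYYTGYYSGTYYTTQQTT"],              # TM3
    ["TTYGTTTSTTTWTTATYASQPGTTY"],                                     # TM4
    ["VIQGQVQPQQVMVTCYSGIQ", "TTQGQTQPQQTMTTCYSGTQ"],                  # TM5
    ["QTFTTMTTYYQFWAPYNIVQQLNTF", "QTYTTMTTYYQYWAPYNTTQQQNTY"],        # TM6
    ["AMQVTETQGMTHCCINPIIYAFVG", "AMQTTETQGMTHCCTNPTTYAYTG"],          # TM7
]

# Number of variants actually listed per TM domain in this example.
FULL_LIST_SIZES = [8, 8, 2, 1, 8, 8, 8]


def assemble(loops, tm_choice):
    """Interleave loops and TM segments: loop0, TM1, loop1, ..., TM7, loop7."""
    parts = [loops[0]]
    for tm, loop in zip(tm_choice, loops[1:]):
        parts.append(tm)
        parts.append(loop)
    return "".join(parts)


def build_library(loops, tm_variants):
    """Return every protein formed by picking one variant per TM list."""
    if len(loops) != len(tm_variants) + 1:
        raise ValueError("expected exactly one more loop than TM lists")
    return [assemble(loops, choice) for choice in product(*tm_variants)]


library = build_library(LOOPS, TM_VARIANTS)
print(len(library))           # 64 proteins from the reduced lists
print(prod(FULL_LIST_SIZES))  # 65536 combinations from the full lists
```

With the complete lists given in this example (8, 8, 2, 1, 8, 8 and 8 variants), the same product yields 65,536 possible full-length sequences.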

The library thus prepared was then assayed, as described in Example 1, for binding to the CCR5 cognate ligand, CCL5, in an aqueous medium. Ligand binding was detected, and the samples were then sequenced. One mutant was sequenced. The results are shown in FIG. 8.

Example 5: CXCR3 mutant
The method of Example 1 was repeated for CXC chemokine receptor type 3 isoform 2. All or substantially all of the hydrophobic amino acids L, I, V and F within its transmembrane domains were substituted with Q, T and Y, respectively, to give the following sequence (SEQ ID NO: 325, lower line), aligned with the wild type (SEQ ID NO: 324, upper line).

Each of the predicted transmembrane regions is underlined, exemplifying the fully modified domains of the invention. Preferably, a protein herein comprising TM1 comprises one or more (for example, all) of the extracellular and intracellular loop sequences (the non-underlined sequences). Additionally or alternatively, a protein herein comprising TM1 comprises one or more further transmembrane regions (the underlined sequences) of SEQ ID NO: 325, or of a homologous sequence that retains one, two, three or optionally four or more of the native V, L, I and F amino acids set forth in SEQ ID NO: 324.
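The full substitution described in this example can be sketched as follows. The mapping reads "Q, T and Y, respectively" as L to Q, I and V to T, and F to Y, which is consistent with the variant lists in these examples. The function name `qty_substitute`, the 0-based end-exclusive span convention, and the toy sequence are all assumptions for illustration; in particular, the toy sequence is not a real receptor, and the transmembrane spans would in practice come from a prediction step such as that of Example 1.

```python
# Assumed reading of the substitution: L -> Q, I -> T, V -> T, F -> Y.
QTY_CODE = {"L": "Q", "I": "T", "V": "T", "F": "Y"}


def qty_substitute(sequence: str, tm_spans) -> str:
    """Substitute L, I, V and F only inside the given transmembrane spans.

    tm_spans is an iterable of (start, end) pairs, 0-based and end-exclusive.
    Residues outside the spans (the loops) are left unchanged.
    """
    residues = list(sequence)
    for start, end in tm_spans:
        for i in range(start, end):
            residues[i] = QTY_CODE.get(residues[i], residues[i])
    return "".join(residues)


# Hypothetical toy sequence with one "transmembrane" span at positions 3..7.
toy = "MKRLLIVFAGDEKLIVF"
print(qty_substitute(toy, [(3, 8)]))  # MKRQQTTYAGDEKLIVF
```

Only the residues inside the span change; the trailing LIVF, which lies outside it, is preserved, mirroring how the loop sequences are kept native.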

As described above, the natural protein sequence of CXCR3 was subjected to this method. The program output divided the native sequences into extracellular and intracellular regions and selected eight transmembrane domain variants for each of the transmembrane domains. The results are shown in the table below.

MVLEVSDHQVLNDAEVAALLENFSSSYDYGENESDSCCTSPPCPQDFSLNFDR (SEQ ID NO: 235)

TM1 mutant:
AFLPALYSQQFQQGQQGNGAVAATQLS (SEQ ID NO: 236)
AFQPALYSQQFQQGQQGNGAVAAVQQS (SEQ ID NO: 237)
AFQPAQYSQQFLQGQQGNGAVAATQQS (SEQ ID NO: 238)
AYQPALYSLQYQQGQQGNGATAAVQQS (SEQ ID NO: 239)
AYQPALYSQLFQQGQQGNGATAATQQS (SEQ ID NO: 240)
AFQPALYSLQYQQGQQGNGATAATQQS (SEQ ID NO: 241)
AYQPAQYSLQYQQGQQGNGATAAVQQS (SEQ ID NO: 242)
AYQPAQYSQQYQQGQQGNGATAATQQS (SEQ ID NO: 243)

RRTALSSTD (SEQ ID NO: 244)

TM2 mutant:
TFLQHLAVADTQQVQTLPQWA (SEQ ID NO: 245)
TFLQHQAVADTQLVQTQPQWA (SEQ ID NO: 246)
TFQQHLAVADTQQVQTQPQWA (SEQ ID NO: 247)
TYLQHQAVADTQQVQTQPQWA (SEQ ID NO: 248)
TYQLHQAVADTQQVQTQPQWA (SEQ ID NO: 249)
TYQQHLAVADTQQVQTQPQWA (SEQ ID NO: 250)
TYQQHQAVADTQQVQTQPQWA (SEQ ID NO: 251)
TYQQHQATADTQQTQTQPQWA (SEQ ID NO: 252)

VDAAVQWVFGSGLCK (SEQ ID NO: 253)

TM3 mutant:
TAGAQYNTNFYAGAQQQACISF (SEQ ID NO: 254)
TAGAQYNTNFYAGAQLQACTSF (SEQ ID NO: 255)
TAGAQYNTNFYAGAQQLACTSF (SEQ ID NO: 256)
TAGAQFNTNYYAGAQQQACISF (SEQ ID NO: 257)
TAGAQYNTNYYAGAQQQACISF (SEQ ID NO: 258)
TAGAQYNTNYYAGAQLQACTSF (SEQ ID NO: 259)
TAGAQYNTNYYAGAQQLACTSF (SEQ ID NO: 260)
TAGAQYNTNYYAGAQQQACTSY (SEQ ID NO: 261)

DRYLNIVHATQLYRRGPPARVT (SEQ ID NO: 262)

TM4 mutant:
LTCQAVWGQCQQFAQPDFIF (SEQ ID NO: 263)
QTCQAVWGQCQQFAQPDFIF (SEQ ID NO: 264)
QTCQATWGQCQQFAQPDFIF (SEQ ID NO: 265)
QTCQATWGQCQQYAQPDFIF (SEQ ID NO: 266)
QTCQATWGQCQQFAQPDFTF (SEQ ID NO: 267)
QTCQATWGQCQQFAQPDYIF (SEQ ID NO: 268)
QTCQATWGQCQQYAQPDYIF (SEQ ID NO: 269)
QTCQATWGQCQQYAQPDYTY (SEQ ID NO: 270)

LSAHHDERLNATHCQYNFPQVGR (SEQ ID NO: 271)

TM5 mutant:
TAQRTQQQTAGYQQPQQTMAY (SEQ ID NO: 272)

CYAHILAVLLVSRGQRRLRAMR (SEQ ID NO: 273)

TM6 mutant:
QVTTTTVAFAQCWTPYHQVVQV (SEQ ID NO: 274)
QVTTTTVAFAQCWTPYHQTVQV (SEQ ID NO: 275)
QVTTTTTAFAQCWTPYHQTVQV (SEQ ID NO: 276)
QVTTTTTAYAQCWTPYHQTVQV (SEQ ID NO: 277)
QVTTTTTAFAQCWTPYHQTTQV (SEQ ID NO: 278)
QTTTTTVAFAQCWTPYHQTTQV (SEQ ID NO: 279)
QVTTTTTAYAQCWTPYHQTTQV (SEQ ID NO: 280)
QTTTTTTAYAQCWTPYHQTTQT (SEQ ID NO: 281)

DILMDLGALARNCGRESRVDV (SEQ ID NO: 282)

TM7 mutant:
AKSVTSGQGYMHCCLNPLQYAFV (SEQ ID NO: 283)
AKSVTSGQGYMHCCLNPQLYAFT (SEQ ID NO: 284)
AKSVTSGQGYMHCCLNPLQYAFT (SEQ ID NO: 285)
AKSTTSGQGYMHCCLNPQQYAFV (SEQ ID NO: 286)
AKSTTSGQGYMHCCQNPLQYAFV (SEQ ID NO: 287)
AKSTTSGQGYMHCCQNPQLYAFV (SEQ ID NO: 288)
AKSTTSGQGYMHCCQNPLQYAFT (SEQ ID NO: 289)
AKSTTSGQGYMHCCQNPQQYAYT (SEQ ID NO: 290)

GVKFRERMWMLLLRLGCPNQRGLQRQPSSSRRDSSWSETSEASYSGL (SEQ ID NO: 291)

The above sequences can be used, as is known in the art, to generate coding sequences suitable for expression in an expression system, in this case yeast. The coding sequences were then recombined and expressed to produce a library comprising a plurality of proteins having the intracellular and extracellular loops, each protein containing one transmembrane domain variant from each variant list between the respective intracellular and extracellular domains.

The library thus prepared can then be assayed for binding to the cognate ligand in an aqueous medium, as described in Example 1.

Example 6: (CCR-1) CC chemokine receptor type 1
Example 1 was repeated for the title protein.

All or substantially all of the hydrophobic amino acids L, I, V and F within its transmembrane domains were substituted with Q, T and Y, respectively, to give the following sequence (lower line, SEQ ID NO: 293), aligned with the wild type (upper line, SEQ ID NO: 292).

Each of the predicted transmembrane regions is underlined, exemplifying the fully modified domains of the invention. Thus, for example, the invention includes transmembrane domains comprising each of the underlined domains. Preferably, a protein herein comprising TM1 comprises one or more (for example, all) of the extracellular and intracellular loop sequences (the non-underlined sequences). Additionally or alternatively, a protein herein comprising TM1 comprises one or more further transmembrane regions (the underlined sequences) of the depicted protein, or of a homologous sequence that retains one, two, three or optionally four or more of the native V, L, I and F amino acids set forth in the wild-type sequence.

The wild-type sequence can be subjected to the method described above to select further transmembrane domain variants, as described in Example 1. Coding sequences can be designed and recombined to express the proteins. The expressed proteins can be assayed for ligand binding as described herein.

Example 7: (CCR-2) CC chemokine receptor type 2 isoform A
Example 1 was repeated for the title protein. Each of the hydrophobic amino acids L, I, V and F within its transmembrane domains was substituted with Q, T and Y, respectively, to give the following sequence (lower line, SEQ ID NO: 295), aligned with the wild type (upper line, SEQ ID NO: 294).

Example 8: (CCR-4) CC chemokine receptor type 4
Example 1 was repeated for the title protein. All or substantially all of the hydrophobic amino acids L, I, V and F within its transmembrane domains were substituted with Q, T and Y, respectively, to give the following sequence (lower line, SEQ ID NO: 297), aligned with the wild type (upper line, SEQ ID NO: 296).

Example 9: (CCR-6) CC chemokine receptor type 6
Example 1 was repeated for the title protein. All or substantially all of the hydrophobic amino acids L, I, V and F within its transmembrane domains were substituted with Q, T and Y, respectively, to give the following sequence (lower line, SEQ ID NO: 299), aligned with the wild type (upper line, SEQ ID NO: 298).

Each of the predicted transmembrane regions is underlined, exemplifying the fully modified domains of the invention. Thus, for example, the invention includes transmembrane domains comprising each of the underlined domains. Preferably, a protein herein comprising TM1 comprises one or more (for example, all) of the extracellular and intracellular loop sequences (the non-underlined sequences). Additionally or alternatively, a protein herein comprising TM1 comprises one or more further transmembrane regions (the underlined sequences) of the depicted protein, or of a homologous sequence that retains one, two, three or optionally four or more of the native L, I, V and F amino acids set forth in the wild-type sequence.

Example 10: (CCR-7) CC chemokine receptor type 7 precursor
Example 1 was repeated for the title protein. All or substantially all of the hydrophobic amino acids L, I, V and F within its transmembrane domains were substituted with Q, T and Y, respectively, to give the following sequence (lower line, SEQ ID NO: 301), aligned with the wild type (upper line, SEQ ID NO: 300).

Each of the predicted transmembrane regions is underlined, exemplifying the fully modified domains of the invention. Thus, for example, the invention includes transmembrane domains comprising each of the underlined domains. Preferably, a protein herein comprising TM1 comprises one or more (for example, all) of the extracellular and intracellular loop sequences (the non-underlined sequences). Additionally or alternatively, a protein herein comprising TM1 comprises one or more further transmembrane regions (the underlined sequences) of the depicted protein, or of a homologous sequence that retains one, two, three or optionally four or more of the native L, I, V and F amino acids set forth in the wild-type sequence.

Example 11: (CCR-8) CC chemokine receptor type 8
Example 1 was repeated for the title protein. All or substantially all of the hydrophobic amino acids L, I, V and F within its transmembrane domains were substituted with Q, T and Y, respectively, to give the following sequence (lower line, SEQ ID NO: 303), aligned with the wild type (upper line, SEQ ID NO: 302).

Example 12: (CCR-9) CC chemokine receptor type 9 isoform B
Example 1 was repeated for the title protein. All or substantially all of the hydrophobic amino acids L, I, V and F within its transmembrane domains were substituted with Q, T and Y, respectively, to give the following sequence (lower line, SEQ ID NO: 305), aligned with the wild type (upper line, SEQ ID NO: 304).

Example 13: (CCR-10) CC chemokine receptor type 10
Example 1 was repeated for the title protein. Each of the hydrophobic amino acids L, I, V and F within its transmembrane domains was substituted with Q, T and Y, respectively, to give the following sequence (lower line, SEQ ID NO: 307), aligned with the wild type (upper line, SEQ ID NO: 306).

Each of the predicted transmembrane regions is underlined, exemplifying the fully modified domains of the invention. Thus, for example, the invention includes transmembrane domains comprising each of the underlined domains. Preferably, a protein herein comprising TM1 comprises one or more (for example, all) of the extracellular and intracellular loop sequences (the non-underlined sequences). Additionally or alternatively, a protein herein comprising TM1 comprises one or more further transmembrane regions (the underlined sequences) of the depicted protein, or of a homologous sequence that retains one, two, three or optionally four or more of the native L, I, V and F amino acids set forth in the wild-type sequence. The wild-type sequence can be subjected to the method described above to select further transmembrane domain variants, as described in Example 1. Coding sequences can be designed and recombined to express the proteins. The expressed proteins can be assayed for ligand binding as described herein.

Example 14: (CXCR1) chemokine receptor type 1
Example 1 was repeated for the title protein. All or substantially all of the hydrophobic amino acids L, I, V and F within its transmembrane domains were substituted with Q, T and Y, respectively, to give the following sequence (lower line, SEQ ID NO: 309), aligned with the wild type (upper line, SEQ ID NO: 308).

Example 15: (CXR1) CXR chemokine receptor type 1
Example 1 was repeated for the title protein. Each of the hydrophobic amino acids L, I, V and F within its transmembrane domains was substituted with Q, T and Y, respectively, to give the following sequence (lower line, SEQ ID NO: 311), aligned with the wild type (upper line, SEQ ID NO: 310).

Example 16: (CXCR2) chemokine receptor type 2
Example 1 was repeated for the title protein. All or substantially all of the hydrophobic amino acids L, I, V and F within its transmembrane domains were substituted with Q, T and Y, respectively, to give the following sequence (lower line, SEQ ID NO: 313), aligned with the wild type (upper line, SEQ ID NO: 312).

Example 17: (CCR-10) CC chemokine receptor type 10
Example 1 was repeated for the title protein. Each of the hydrophobic amino acids L, I, V and F within its transmembrane domains was substituted with Q, T and Y, respectively, to give the following sequence (lower line, SEQ ID NO: 315), aligned with the wild type (upper line, SEQ ID NO: 314).

Example 18: (CXCR6) chemokine receptor type 6
Example 1 was repeated for the title protein. Each of the hydrophobic amino acids L, I, V and F within its transmembrane domains was substituted with Q, T and Y, respectively, to give the following sequence (lower line, SEQ ID NO: 317), aligned with the wild type (upper line, SEQ ID NO: 316).

Example 19: (CXCR7) chemokine receptor type 7
Example 1 was repeated for the title protein. All or substantially all of the hydrophobic amino acids L, I, V and F within its transmembrane domains were substituted with Q, T and Y, respectively, to give the following sequence (lower line, SEQ ID NO: 319), aligned with the wild type (upper line, SEQ ID NO: 318).

Example 20: (CLR-1a) chemokine-like receptor 1 isoform a
Example 1 was repeated for the title protein. All or substantially all of the hydrophobic amino acids L, I, V and F within its transmembrane domains were substituted with Q, T and Y, respectively, to give the following sequence (lower line, SEQ ID NO: 321), aligned with the wild type (upper line, SEQ ID NO: 320).

Example 21: DARIA Duffy antigen/chemokine receptor isoform a
Example 1 was repeated for the title protein. Each of the hydrophobic amino acids L, I, V and F within its transmembrane domains was substituted with Q, T and Y, respectively, to give the following sequence (lower line, SEQ ID NO: 323), aligned with the wild type (upper line, SEQ ID NO: 322).

Each of the predicted transmembrane regions is underlined, exemplifying the fully modified domains of the invention. Thus, for example, the invention includes transmembrane domains comprising each of the underlined domains. Preferably, a protein herein comprising TM1 comprises one or more (for example, all) of the extracellular and intracellular loop sequences (the non-underlined sequences). Additionally or alternatively, a protein herein comprising TM1 comprises one or more further transmembrane regions (the underlined sequences) of the depicted protein, or of a homologous sequence that retains one, two, three or optionally four or more of the native L, I, V and F amino acids set forth in the wild-type sequence. The wild-type sequence can be subjected to the method described above to select further transmembrane domain variants, as described in Example 1. Coding sequences can be designed and recombined to express the proteins. The expressed proteins can be assayed for ligand binding as described herein.

実施例２２：ＣＤ８１抗原
ＣＤ８１はリンパ腫細胞増殖の制御において重要な役割を担う場合があり、１６ｋＤａのＬｅｕ−１３タンパク質と相互作用して場合によりシグナル伝達に関与する複合体を形成する。ＣＤ８１はＨＣＶのウイルス受容体として機能する場合がある。 Example 22: CD81 antigen CD81 may play an important role in the regulation of lymphoma cell proliferation and interacts with the 16 kDa Leu-13 protein to optionally form a signal transduction complex. CD81 may function as a viral receptor for HCV.

実施例１を表題のタンパク質のために繰り返した。その膜貫通ドメイン内の疎水性アミノ酸Ｌ、Ｉ、ＶおよびＦのそれぞれをＱ、ＴおよびＹで（それぞれ）置換して、野生型（上側の線、配列番号３２４）に整列させた以下の配列（下側の線、配列番号３２５）を得る。

Example 1 was repeated for the title protein. The following sequences are aligned to the wild type (upper line, SEQ ID NO: 324) by substituting each of the hydrophobic amino acids L, I, V and F in the transmembrane domain with Q, T and Y (respectively). (Lower line, SEQ ID NO: 325) is obtained.

The predicted transmembrane regions exemplify the modified domains of the invention and include the following (SEQ ID NOs: 326, 327, 328, 329, 330, 331, 332 and 333, respectively):

Thus, for example, the invention includes transmembrane domains comprising each of the modified, or "mt", domains. Preferably, a protein comprising TM1 herein includes one or more (e.g., all) of the extracellular and intracellular loop sequences (the sequences that are not underlined). Additionally or alternatively, a protein comprising TM1 herein includes one or more further transmembrane regions (the underlined sequences) from the depicted protein or from a homologous sequence that retains one, two, three, or optionally four or more of the native V, L, I and F amino acids present in the wild-type sequence.
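As an illustration only, the substitution rule applied in this example (L to Q, I and V to T, F to Y, within predicted transmembrane regions only) can be sketched in Python. The function name `qty_substitute`, the region coordinates and the toy sequence are hypothetical and are not taken from the specification; in particular, the toy sequence is not CD81.

```python
# Illustrative sketch only: names, coordinates and the toy sequence are hypothetical.
QTY_MAP = {"L": "Q", "I": "T", "V": "T", "F": "Y"}


def qty_substitute(sequence, tm_regions):
    """Substitute L->Q, I->T, V->T, F->Y inside the given TM regions only.

    tm_regions is a list of (start, end) index pairs, 0-based, end exclusive.
    Residues outside these regions (loops, termini) are left unchanged.
    """
    residues = list(sequence)
    for start, end in tm_regions:
        for i in range(start, end):
            residues[i] = QTY_MAP.get(residues[i], residues[i])
    return "".join(residues)


toy_sequence = "MKLIVFAGSLLIVFK"   # made-up sequence for demonstration
toy_regions = [(2, 6)]             # pretend residues 2..5 form one TM segment
print(qty_substitute(toy_sequence, toy_regions))
```

Restricting the substitution to the listed regions mirrors the text, in which loop sequences are kept as in the wild type. Retaining some native residues, as the paragraph above allows, would amount to skipping selected positions inside a region.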

Example 23: E. coli expression of QTY variants and the CXCR4-QTY variant

1. Large-scale production of CXCR4-QTY in E. coli BL21 (DE3)
The water-soluble GPCR CXCR4 was produced in E. coli with an estimated yield of approximately 20 mg of purified protein per liter of routinely used LB medium. The estimated production cost is approximately $0.25 per milligram. Advantageously, gram quantities of water-soluble GPCRs can readily be obtained with this approach, which in turn can facilitate their structural determination.

2. Determining where water-soluble CXCR4-QTY is produced in E. coli cells
Water-soluble CXCR4-QTY was cloned into a pET vector. We first carried out a small-scale E. coli culture study (150 ml of culture) to assess the location of the CXCR4-QTY protein produced. After culturing cells induced with IPTG at 24 °C for 4 hours, we harvested and sonicated the cells and separated the lysate into two fractions by centrifugation at 14,637 × g (12,000 rpm). We then detected the location of the CXCR4-QTY protein by Western blot analysis with a specific anti-rho-tag monoclonal antibody. We observed that the CXCR4-QTY protein was in the supernatant fraction and not in the pellet fraction, suggesting that the protein is fully water soluble.
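The centrifugation step quotes both a relative centrifugal force and a rotor speed. The two are related by the standard conversion RCF = 1.118 × 10^-5 × r × rpm^2, with r the rotor radius in cm. The sketch below back-calculates the radius implied by the quoted values. The rotor radius is not stated in the text, so the result is an inference, and the function names are hypothetical.

```python
# Standard RCF/rpm conversion; function names are illustrative assumptions.
RCF_CONSTANT = 1.118e-5  # converts (radius in cm) * rpm^2 to multiples of g


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def radius_from_rcf(rcf, rpm):
    """Rotor radius (cm) implied by a quoted RCF and rotor speed."""
    return rcf / (RCF_CONSTANT * rpm ** 2)


implied_radius_cm = radius_from_rcf(14637, 12000)
print(f"Implied rotor radius: {implied_radius_cm:.2f} cm")
```

Reporting the force in × g rather than rpm is what allows the fractionation to be reproduced on a rotor of a different radius.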

3. Estimated yield of CXCR4-QTY produced in the soluble fraction of E. coli cells
We then grew another 150 ml culture and obtained approximately 6 mg of CXCR4-QTY purified with the 1D4 monoclonal antibody. Because we underestimated the yield (we had not expected such a surprisingly high yield), we did not use enough rho-1D4-tag monoclonal antibody affinity beads to capture the CXCR4-QTY produced. As a result, a significant amount of CXCR4-QTY protein did not bind to the beads: it appeared in the flow-through lane and was further washed away. Despite this significant loss, we still obtained approximately 6 mg from 150 ml of culture, as seen in lanes 8-10 (elution fractions).
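The figures quoted in sections 1 and 3 can be put on a common per-liter basis with simple arithmetic. The sketch below uses only numbers given in the text (6 mg from 150 ml, an estimated 20 mg per liter, about $0.25 per milligram); the helper names are hypothetical, and the scale-up assumes that yield and cost scale linearly with culture volume, which the text does not state.

```python
# Arithmetic sketch only; helper names are illustrative assumptions.
def yield_mg_per_litre(mass_mg, culture_ml):
    """Convert a recovered mass from a small culture to mg per liter."""
    return mass_mg * 1000.0 / culture_ml


def litres_for_target(target_mg, yield_mg_per_l):
    """Culture volume needed for a target mass, assuming linear scaling."""
    return target_mg / yield_mg_per_l


recovered = yield_mg_per_litre(6.0, 150.0)        # section 3: 6 mg from 150 ml
volume_for_gram = litres_for_target(1000.0, 20.0)  # section 1: 20 mg per liter
cost_per_gram = 0.25 * 1000.0                      # section 1: $0.25 per mg

print(f"Recovered yield: {recovered:.0f} mg/L")
print(f"Culture for 1 g at 20 mg/L: {volume_for_gram:.0f} L")
print(f"Estimated cost per gram: ${cost_per_gram:.0f}")
```

On these numbers, the 150 ml culture corresponds to a recovered yield above the 20 mg per liter estimate, even with the bead-capacity losses described above.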

4. Measuring the thermal stability of purified water-soluble CXCR4-QTY protein
In most cases, structure determines function in a protein. It is therefore important to know whether the purified CXCR4-QTY protein produced in E. coli is still correctly folded into a typical α-helical structure with approximately 50% α-helix. We carried out secondary structure measurements using circular dichroism (CD), recording CD spectra of the purified CXCR4-QTY protein at various temperatures to measure its thermal stability. We observed that the purified CXCR4-QTY protein is relatively stable up to 55 °C, where it is only partially and gradually denatured, with a decrease in CD signal of about 15%. Between 55 °C and 65 °C, denaturation increased toward 65 °C; the denaturation transition occurred between 65 °C and 75 °C, and at 75 °C the protein was almost completely denatured.

We plotted ellipticity at 222 nm against temperature to obtain the melting temperature (Tm) of the purified water-soluble CXCR4-QTY protein. From this plot, we estimated the Tm of the purified CXCR4-QTY protein to be approximately 67 °C. This Tm suggests that the purified water-soluble CXCR4-QTY protein is very stable compared with many other soluble proteins. This thermal stability should make it easier to obtain diffracting crystals, because better thermal stability is known to give better crystal lattice packing and therefore a greater chance of obtaining a structure.
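One common way to read a Tm off such a plot is to take the temperature at which the 222 nm signal is halfway between its folded and unfolded values. The text does not say how the Tm was derived, so the following is a minimal sketch under that assumption. The melt curve is synthetic (generated from a sigmoid with a chosen midpoint) and is not measured data, and all names are hypothetical.

```python
import math


def estimate_tm(temps_c, ellipticity_222):
    """Midpoint temperature of a melt curve by linear interpolation.

    The first and last points are taken as the folded and unfolded
    baselines (an assumption); Tm is where the normalized signal
    first crosses 0.5.
    """
    folded, unfolded = ellipticity_222[0], ellipticity_222[-1]
    frac = [(s - folded) / (unfolded - folded) for s in ellipticity_222]
    for i in range(1, len(frac)):
        lo, hi = frac[i - 1], frac[i]
        if lo < 0.5 <= hi:
            t = (0.5 - lo) / (hi - lo)
            return temps_c[i - 1] + t * (temps_c[i] - temps_c[i - 1])
    raise ValueError("signal never crosses the half-unfolded level")


# Synthetic two-state curve, NOT experimental data: midpoint chosen as 67 C.
temps = list(range(25, 96, 5))
synthetic = [-20.0 + 18.0 / (1.0 + math.exp(-(t - 67.0) / 3.0)) for t in temps]

print(f"Estimated Tm: {estimate_tm(temps, synthetic):.1f} C")
```

With 5 °C sampling, the interpolation recovers the chosen midpoint only to within a fraction of a degree; a fit to a two-state unfolding model would normally be preferred for real data.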

5. Additional G protein-coupled receptors
We selected ten G protein-coupled receptors (GPCRs) and designed their water-soluble forms using the QTY method described in US Patent Publication No. 2012/0252719A to Zhang et al., entitled "Water Soluble Membrane Proteins and Methods for the Preparation and Use Thereof" ("Zhang"). Alternatively, the proteins described herein can be selected.

6. Molecular cloning of the genes
We successfully confirmed the native and QTY GPCR genes in the cell-free protein expression plasmid vector pIVex2.3d and in the E. coli plasmid vectors pET28a and pET-duet-1.

7. Production of water-soluble GPCRs
We produced several native and QTY proteins. When native GPCRs are produced in a cell-free system, the detergent Brij35 is required; without detergent, the protein precipitates as soon as it is produced. By contrast, we tested the QTY variants in the presence and absence of detergent, and without detergent the cell-free system produced soluble protein.

We cloned the QTY variants into the E. coli in vivo expression plasmid vectors pET28a and pET-duet-1 for protein production in E. coli BL21 (DE3) cells. We purified several water-soluble GPCR proteins, including CXCR4 and CCR5, and used them for secondary structure analysis. For CXCR4, we carried out a ligand binding study with its natural ligand CCL12 (SDF1a). We produced the water-soluble GPCR CCR5e variant in E. coli and purified it. The CCR5e variant had 58 amino acid changes (about an 18% change). The water-soluble GPCR CCR5e variant was purified to homogeneity using the specific monoclonal antibody against the rhodopsin tag. Blue staining showed a single band on the SDS gel, indicating its purity. Judging from the protein size markers, it appears to be a pure homodimer (the crystal structure of native membrane-bound CXCR4 was a dimer). Western blotting confirmed the monomer and homodimer of the CCR5e variant, which are common among GPCRs.

8. Secondary structure study of QTY CCR5e
We obtained a water-soluble QTY variant of the GPCR CCR5e. We then carried out secondary structure analysis on an Aviv Model 410 circular dichroism instrument and confirmed that the GPCR QTY CCR5e variant has a typical α-helical structure. We also ran experiments at various temperatures to determine the Tm, that is, the thermal stability, of the water-soluble CCR5e variant. From these experiments, we determined the Tm of the CCR5e variant to be approximately 46 °C. This Tm is suitable for crystal screening experiments.

9. Ligand binding study of CXCR4 with CCL12 (SDF1a)
To verify that the designed water-soluble QTY GPCRs still retain their biological function, that is, that they recognize and bind their natural ligands, we first used ELISA measurements to study water-soluble CXCR4 with its natural ligand CCL12 (also called SDF1a). Assay concentrations ranged from 50 nM to 10 μM. The measured Kd is approximately 80 nM, whereas the Kd of native membrane-bound CXCR4 for SDF1a is approximately 100 nM. The Kd of water-soluble CXCR4 is therefore within the acceptable range. Further experiments using the more sensitive SPR, or other measurements, may be carried out to obtain a more accurate Kd.
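To give a sense of what a Kd of about 80 nM means across the assayed concentration range, the sketch below evaluates fractional receptor occupancy under a simple one-site binding model, occupancy = [L] / (Kd + [L]). The text does not state which binding model was fitted to the ELISA data, so the one-site model is an assumption, and the function name is hypothetical.

```python
# One-site binding sketch; the model choice is an assumption, not from the text.
def fraction_bound(ligand_nm, kd_nm):
    """Fractional occupancy for a single binding site at equilibrium."""
    return ligand_nm / (kd_nm + ligand_nm)


KD_SOLUBLE_NM = 80.0   # measured for water-soluble CXCR4 (from the text)
KD_NATIVE_NM = 100.0   # native membrane-bound CXCR4 (from the text)

for ligand_nm in (50.0, 80.0, 1000.0, 10000.0):  # spans the 50 nM to 10 uM range
    soluble = fraction_bound(ligand_nm, KD_SOLUBLE_NM)
    native = fraction_bound(ligand_nm, KD_NATIVE_NM)
    print(f"{ligand_nm:>8.0f} nM  soluble {soluble:.3f}  native {native:.3f}")
```

Under this model the two Kd values give similar occupancy at every assayed concentration, consistent with the statement that the measured Kd is within the acceptable range.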

While this invention has been particularly shown and described with reference to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.

Claims

A computer-implemented method for performing a procedure for selecting a water-soluble variant of a G protein-coupled receptor (GPCR), the method comprising:
(1) inputting a sequence of the GPCR for analysis;
(2) obtaining a variant of the GPCR in which a plurality of hydrophobic amino acids in the transmembrane (TM) domain α-helical segments ("TM regions") of the GPCR are substituted, wherein:
(a) the hydrophobic amino acids are selected from the group consisting of leucine (L), isoleucine (I), valine (V) and phenylalanine (F);
(b) each leucine (L) is independently substituted with glutamine (Q), asparagine (N) or serine (S);
(c) each isoleucine (I) and each valine (V) is independently substituted with threonine (T), asparagine (N) or serine (S); and
(d) each phenylalanine is substituted with tyrosine (Y);
(3) subsequently obtaining an α-helical secondary structure result for the variant to verify maintenance of the α-helical secondary structures in the variant; and
(4) obtaining a transmembrane region result for the variant to verify water solubility of the variant, thereby selecting the water-soluble variant of the GPCR.

The method according to claim 1, wherein step (3) is performed before, simultaneously with, or after step (4).

The method of claim 1 or 2, wherein, in step (2), one subset of the plurality of hydrophobic amino acids in one and the same TM region of the GPCR is substituted to make one member of a library of variant candidates, and one or more different subsets of the plurality of hydrophobic amino acids are substituted to make additional members of the library.

The method of claim 3, further comprising a step of ranking all members of the library based on a combination score, wherein the combination score is a weighted combination of the α-helical secondary structure prediction result and the transmembrane region prediction result.

The method of claim 1, further comprising a step of ranking the variant using a ranking function.

The method of claim 1, further comprising the step of performing the method using a data processor.

The method of claim 6, further comprising a memory connected to the data processor.

The method of claim 5, wherein the ranking function comprises a secondary structure component and a water soluble component.

The method of claim 8, wherein the ranking function comprises a weighted value of the secondary structure component and / or the water soluble component.

The method of claim 4, further comprising the step of selecting the N members having the highest combination scores to form a first library of variant candidates for the TM region, wherein N is a predetermined integer (e.g., 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more).

The method of claim 10, further comprising making one such library of variant candidates for each of 1, 2, 3, 4, 5 or 6 other TM regions of the GPCR.

The method according to claim 11, further comprising the step of substituting two or more TM regions of the GPCR with the corresponding TM regions in the mutant candidate library to prepare a combined mutant library.

The method according to any one of claims 1 to 12, wherein substantially all (for example, all) of the leucine is replaced with glutamine.

The method according to any one of claims 1 to 13, wherein substantially all (for example, all) of the isoleucine is replaced with threonine.

The method of any one of claims 1-14, wherein substantially all (eg, all) of the valine is replaced with threonine.

The method according to any one of claims 1 to 15, wherein substantially all (for example, all) of the phenylalanine is replaced with tyrosine.

The method according to any one of claims 1 to 16, wherein one or more (eg, 1, 2 or 3) of the leucines are not substituted.

The method according to any one of claims 1 to 17, wherein one or more (eg, 1, 2 or 3) of the isoleucines are not substituted.

The method according to any one of claims 1 to 18, wherein one or more (eg, 1, 2 or 3) of the valine is not substituted.

The method according to any one of claims 1 to 19, wherein one or more (eg, 1, 2 or 3) of the phenylalanine is not substituted.

The method according to any one of claims 1 to 20, further comprising a step of producing / expressing the combination mutant.

The method according to any one of claims 1 to 21, further comprising testing the combination variants for ligand binding (e.g., by the yeast two-hybrid method), wherein those having substantially the same ligand binding as the GPCR are selected.

The method according to any one of claims 1 to 22, further comprising testing the combination variants for the biological function of the GPCR, wherein those having substantially the same biological function as the GPCR are selected.

The method of any one of claims 1-23, wherein the combination mutant library comprises less than about 2 million members.

The method of any one of claims 1-24, wherein the sequence of the GPCR comprises information about the TM region of the GPCR.

The method according to any one of claims 1 to 25, wherein the sequence of the GPCR is obtained from a protein structure database (eg, PDB, UniProt).

The method according to any one of claims 1 to 26, wherein the TM region of the GPCR is predicted based on the sequence of the GPCR.

The method of claim 27, wherein the TM region of the GPCR is predicted using a TMHMM 2.0 (transmembrane prediction using a hidden Markov model) software module.

The method of claim 28, wherein the TMHMM 2.0 software module utilizes a dynamic baseline for peak search.

The method according to any one of claims 1 to 29, further comprising the step of providing the polynucleotide sequence of each variant of the GPCR.

The method according to claim 30, wherein the polynucleotide sequence is codon-optimized for expression in a host (e.g., a bacterium such as E. coli, a yeast such as Saccharomyces cerevisiae or fission yeast, an insect cell such as an Sf9 cell, a non-human mammalian cell or a human cell).

The method according to any one of claims 1 to 31, wherein the scripting procedure includes a VBA script.

The method according to any one of claims 1 to 32, wherein the scripting procedure is operable on a Linux® system (e.g., Ubuntu 12.04 LTS), a Unix® system, a Microsoft Windows operating system, an Android operating system or an Apple iOS operating system.

A water-soluble variant of a G protein-coupled receptor (GPCR) in which a plurality of hydrophobic amino acids within the transmembrane (TM) domain α-helical segments ("TM regions") of the GPCR are substituted, wherein:
(a) the hydrophobic amino acids are selected from the group consisting of leucine (L), isoleucine (I), valine (V) and phenylalanine (F);
(b) each leucine (L) is independently substituted with glutamine (Q), asparagine (N) or serine (S);
(c) each isoleucine (I) and each valine (V) is independently substituted with threonine (T), asparagine (N) or serine (S); and
(d) each phenylalanine is substituted with tyrosine (Y),
and wherein the α-helical secondary structure is maintained in all seven TM regions of the variant and no transmembrane region is predicted.

The water-soluble variant of claim 34, comprising one or more amino acid sequences selected from the group consisting of SEQ ID NOs: 4-11, 13-20, 22-29, 31-38, 40-47, 49-56 and 58-64.

The water-soluble variant of claim 35, further comprising one or more amino acid sequences selected from the group consisting of SEQ ID NOs: 3, 12, 21, 30, 39, 48 and 57.

The water-soluble variant according to claim 35 or 36, which binds to a CXCR4 ligand.

The water-soluble variant of claim 34, comprising one or more amino acid sequences selected from the group consisting of SEQ ID NOs: 69-76, 78-85, 87, 89-96, 98-105, 107-114 and 116-123.

The water-soluble variant of claim 38, further comprising one or more amino acid sequences selected from the group consisting of SEQ ID NOs: 68, 77, 86, 88, 97, 106, 115 and 124.

The water-soluble variant of claim 38 or 40 that binds to the CX3CR1 ligand.

The water-soluble variant of claim 34, comprising one or more amino acid sequences selected from the group consisting of SEQ ID NOs: 128-135, 137-144, 146-153, 155-162, 164-171, 173 and 175-182.

The water-soluble variant of claim 41, further comprising one or more amino acid sequences selected from the group consisting of SEQ ID NOs: 127, 136, 145, 154, 163, 172, 174 and 183.

The water-soluble variant of claim 41 or 42 that binds to a CCR3 ligand.

The water-soluble variant of claim 34, comprising one or more amino acid sequences selected from the group consisting of SEQ ID NOs: 187-194, 196-203, 205-206, 208, 210-217, 219-225 and 227-234.

The water-soluble variant of claim 44, further comprising one or more amino acid sequences selected from the group consisting of SEQ ID NOs: 186, 195, 204, 207, 209, 218, 226 and 235.

The water-soluble variant of claim 44 or 45 that binds to a CCR5 ligand.

The water-soluble variant of claim 34, comprising one or more amino acid sequences selected from the group consisting of SEQ ID NOs: 236-243, 245-252, 254-261, 263-270, 272, 274-281 and 283-290.

The water-soluble variant of claim 47, further comprising one or more amino acid sequences selected from the group consisting of SEQ ID NOs: 235, 244, 253, 262, 271, 273, 282 and 291.

The water-soluble variant of claim 47 or 48 that binds to a CXCR3 ligand.

The water-soluble variant of claim 34, comprising one or more of the transmembrane domains set forth in any one of SEQ ID NOs: 2, 67, 126, 185, 327, 293, 295, 297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323 or 325.

The water-soluble mutant according to claim 50, wherein the water-soluble mutant is water-soluble and binds to a ligand of a homologous natural transmembrane protein.

A method of producing a protein in a bacterium (e.g., E. coli), the method comprising:
(a) culturing the bacterium in a growth medium under conditions suitable for protein production;
(b) fractionating a lysate of the bacterium to produce a soluble fraction and an insoluble pellet fraction; and
(c) isolating the protein from the soluble fraction,
wherein:
(1) the protein is a variant of a G protein-coupled receptor (GPCR) according to any one of claims 29 to 46; and
(2) the yield of the protein is at least 20 mg/L (e.g., 30 mg/L, 40 mg/L, 50 mg/L or more) of the growth medium.

The method of claim 47, wherein the bacterium is Escherichia coli BL21 and the growth medium is LB medium.

The method of claim 47 or 48, wherein the protein is encoded by a plasmid within the bacterium.

The method of any one of claims 47-49, wherein expression of the protein is under the control of an inducible promoter.

The method of claim 50, wherein the inducible promoter is inducible by IPTG.

The method according to any one of claims 47 to 51, wherein the lysate is produced by sonication.

The method according to any one of claims 47 to 52, wherein the lysate is centrifuged at 14,500 × g or more to produce the soluble fraction.

A non-transitory computer-readable medium in which a series of instructions for performing the method according to any one of claims 1 to 33 is stored.

A data processing system operable to select water-soluble variants of membrane proteins, the system comprising a data processor operable to perform amino acid substitutions and to rank protein variants with a ranking function that includes a secondary structure component and a water solubility component.

The system of claim 60, further comprising a library of membrane proteins for processing by the system.

The system of claim 60, further comprising a memory, connected to the data processor, that stores coded instructions for executing the substitution process.

The system of claim 60, which operates on steps (a), (b), (c) and (d) of the method of claim 1.

The system of claim 60, further comprising a ranking function that is a weighted combination based on the secondary structure component.

The system of claim 60, which communicates with an external program over a network.

The system of claim 60, further comprising a database for storing water-soluble variants.

The system of claim 60, further comprising instructions for performing dynamic baseline processing.

The system of claim 60, further comprising an interface for selecting method parameters.

The system of claim 60, further comprising the step of inputting the sequence of claim 35-50.

A computer-implemented method for performing a procedure for selecting a water-soluble variant, the method comprising:
processing data to identify a membrane protein sequence for analysis; and
obtaining a variant of the membrane protein in which a plurality of hydrophobic amino acids in the transmembrane (TM) domain α-helical segments ("TM regions") of the membrane protein are substituted,
wherein a data processor:
determines an α-helical secondary structure result for the variant to verify maintenance of the α-helical secondary structures in the variant; and
determines a transmembrane region result for the variant to verify water solubility of the variant, thereby selecting the water-soluble variant of the membrane protein.

The method of claim 70, wherein the substitution comprises:
(a) selecting the hydrophobic amino acids from the group consisting of leucine (L), isoleucine (I), valine (V) and phenylalanine (F);
(b) independently substituting each leucine (L) with glutamine (Q), asparagine (N) or serine (S);
(c) independently substituting each isoleucine (I) and each valine (V) with threonine (T), asparagine (N) or serine (S); and
(d) substituting each phenylalanine with tyrosine (Y).

The method of claim 71, wherein one subset of the plurality of hydrophobic amino acids in one and the same TM region of the GPCR is substituted to make one member of a library of variant candidates, and one or more different subsets of the plurality of hydrophobic amino acids are substituted to make further members of the library.

The method of claim 70, further comprising a step of ranking all members of the library based on a combination score, wherein the combination score is a weighted combination of the α-helical secondary structure prediction result and the transmembrane region prediction result.

The method of claim 70, further comprising a step of ranking the variant using a ranking function.

The method of claim 70, further comprising a memory connected to the data processor.

The method of claim 74, wherein the ranking function comprises a secondary structure component and a water soluble component.

The method of claim 76, wherein the ranking function comprises a weighted value of the secondary structure component and / or the water soluble component.

The method of claim 73, further comprising selecting the N members having the highest combination scores to form a first library of variant candidates for the TM region, wherein N is a predetermined integer (e.g., 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more).

The method of claim 78, further comprising making one such library of variant candidates for each of 1, 2, 3, 4, 5 or 6 other TM regions of the GPCR.

The method of claim 79, further comprising the step of substituting the two or more TM regions of the GPCR with the corresponding TM regions in the mutant candidate library to prepare a combined mutant library.