WO2003007187A1 - Method of presuming ligand and method of using the same - Google Patents

Method of presuming ligand and method of using the same

Publication number
WO2003007187A1
Authority
WO
WIPO (PCT)
Prior art keywords
binding molecule
protein
amino acid
residue
ligand
Prior art date
Application number
PCT/JP2002/007057
Other languages
French (fr)
Japanese (ja)
Inventor
Hiroshi Inooka
Yoshio Yamamoto
Original Assignee
Takeda Chemical Industries, Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Takeda Chemical Industries, Ltd.

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/705 Receptors; Cell surface antigens; Cell surface determinants

Definitions

  • The present invention relates to a method for predicting the binding molecule of a protein whose binding molecule is unknown, a method for producing a medicine using the prediction method, and a computer for predicting the binding molecule of such a protein.
  • More specifically, it relates to a method for predicting the binding molecule, or the type of binding molecule, of a protein whose binding molecule is unknown from information on the amino acid sequences (or sequence alignment) of proteins and on their binding molecules or types of binding molecules, to a method for producing a medicine using that method, and to a computer used for the method.
  • The present invention also relates to a method for predicting the function of a protein whose binding molecule is unknown through prediction of the binding molecule or the type of the binding molecule.
  • Examples of the protein include a protein that binds a low-molecular substance, an enzyme that has catalytic activity for a low-molecular or high-molecular substance, and a protein that binds a high-molecular substance.
  • Among these, a very large number of drugs target proteins that bind low-molecular substances.
  • Proteins of particular interest as target molecules include G protein-coupled receptor (GPCR) proteins, which are membrane proteins involved in signal transduction.
  • GPCRs have an extracellular N-terminus and an intracellular C-terminus connected via a seven-transmembrane domain, and the ligand binds to the N-terminal region or the transmembrane region depending on the type. It is also known that the G protein binds to the intracellular C-terminal part and intracellular loop 2. To date, only the crystal structure of rhodopsin, a member of the GPCR family, has been analyzed (Palczewski et al., Science, Vol. 289, pp.
  • GPCR activation occurs as follows: (i) the ligand binds to the receptor and the structure of the GPCR changes; (ii) the structural change of the GPCR is transmitted to the coupled G protein, and a part of the G protein is released; (iii) information (a signal) is transmitted into the cell. Molecules that specifically bind to the receptor are called ligands.
  • Known ligand molecules include biogenic amines such as dopamine, serotonin, melatonin, and histamine; lipid derivatives such as prostaglandins and leukotrienes; nucleic acids; amino acids such as glutamine; and physiologically active peptides such as angiotensin, secretin, and somatostatin.
  • The G protein functions as a transducer in the signal transduction system and is composed of three subunits, α, β, and γ.
  • The α subunit has the activity of binding and hydrolyzing GTP, and is a subunit unique to each G protein. That is, the function of a GPCR is determined by the type of Gα protein to which it couples.
  • Gα proteins such as Gs, Go, and Gt have been isolated and purified, and their properties have been investigated.
  • G protein-coupled receptor proteins are present on the surface of functional cells in living cells and organs, and play physiologically important roles as targets for molecules that regulate the functions of those cells and organs, such as hormones, neurotransmitters, and biologically active substances.
  • The receptor transmits a signal into the cell upon binding a physiologically active substance, and this signal causes various reactions such as activation or suppression of the cell.
  • Clarifying the relationship between the substances that regulate complex functions in various cells and organs and their specific receptor proteins, especially G protein-coupled receptor proteins, will elucidate those complex functions and provide a very important tool for the development of drugs closely related to them.
  • Not all G protein-coupled receptors have been found. Even now there are many unknown G protein-coupled receptors and so-called orphan receptors, for which no corresponding ligand has been identified, and the search for new G protein-coupled receptors and the elucidation of their functions are eagerly awaited.
  • Methods for predicting binding molecules and functions from sequence information include, for example, analysis of amino acid mutations (and their positions) that are considered to be involved in determining the type of coupled G protein, carried out for a limited number of known GPCRs.
  • In addition, a method for predicting the site involved in binding of a ligand to its GPCR by analyzing mutations in the amino acids of both the GPCR and the ligand, namely the correlated mutation analysis (CMA) method, has been developed (Singer et al., Receptors and channels, 3: 89-95 (1995)). However, there is no report of GPCR sequences alone being used to predict the binding molecules (and/or types of binding molecules) of unknown GPCRs.
  • That is, no method is known for predicting the molecules, or types of molecules, that bind (or couple) to a GPCR directly from the sequence information of the GPCR, or for predicting its function.
  • Therefore, when a new GPCR is discovered, manufacturing a drug using that GPCR requires first understanding its function and identifying, through experiments or the like, the molecule or type of molecule that binds (or couples) to it, which requires enormous cost and labor.
  • Disclosure of the invention
  • The present invention provides a method for simply and accurately predicting, from sequence information, a molecule or the like that can bind to a protein of unknown function, a method for producing a medicine using that method, and a computer used for the method.
  • A method for predicting a binding molecule that binds to an unknown binding molecule protein, comprising: obtaining binding molecule known protein classification information in which the sequence alignment of known binding molecule proteins, whose amino acid sequences and binding molecules are known, is associated with the binding molecule or the type of the binding molecule; specifying, using the binding molecule known protein classification information, one or more binding molecule determining residue positions that are assumed to be involved in determining the binding molecule among the positions of the sequence alignment; obtaining binding molecule determining residue-binding molecule classification information, which indicates the correlation between the binding molecule determining residue and the binding molecule or the type of the binding molecule, by associating the amino acid residues at the binding molecule determining residue positions (binding molecule determining residues) with the binding molecule or the type of the binding molecule; and, for an unknown binding molecule protein of the same type as the known binding molecule proteins, aligning its sequence against the sequence alignment of the known binding molecule proteins, obtaining information on its binding molecule determining residues, and predicting its binding molecule or the type of its binding molecule from the classification information.
  • The binding molecule is any one of a ligand, a regulator, an effector, and a coenzyme.
  • The method for predicting a binding molecule of an unknown binding molecule protein according to any one of (1) to (3) above, wherein the unknown binding molecule protein is any one of a G protein-coupled receptor, a kinase, a lipase, a transporter, a protease, and an ion channel.
  • The method for predicting a binding molecule of an unknown binding molecule protein according to any one of (1) to (4) above, wherein the one or more binding molecule determining residue positions are identified from the amino acid residues constituting the sequence alignment and the types of binding molecules.
  • The method for predicting a binding molecule of an unknown binding molecule protein according to any one of (1) to (4) above, wherein the binding molecule determining residue positions are determined using at least one of Formulas 3, 4, and 5.
  • The step of obtaining the ligand-determining residue-ligand classification information comprises: extracting, from the amino acid residues of the known ligand proteins, those at the ligand-determining residue position with the smallest value of the function f3(n); determining the number (A) of known ligand proteins listed in the ligand-determining residue-ligand classification information whose ligand-determining residues match the extracted ligand-determining residues; obtaining the number (B) of those matching known ligand proteins whose ligand or type of ligand also matches; extracting, from the amino acid residues of the known ligand proteins, those at the ligand-determining residue position with the second smallest or the Xth smallest value of the function f3(n) (where X represents an integer greater than 2 and less than 100); determining the number (C) of known ligand proteins listed in the ligand-determining residue-ligand classification information whose ligand-determining residues match the extracted ligand-determining residues; and, among the known ligand proteins listed in the ligand-determining residue-ligand classification information, those matching the extracted ligand-determining residues
  • A step of obtaining binding molecule known protein classification information in which the sequence alignment of the known binding molecule proteins is associated with the binding molecule or the type of the binding molecule, and a step of specifying, using that classification information, the binding molecule determining residue positions, which are positions in the sequence alignment of the known binding molecule proteins assumed to be involved in determining the binding molecule.
  • Binding molecule determining residue-binding molecule classification information, which indicates the correlation between the binding molecule determining residue and the binding molecule, is then obtained.
  • This classification information is applied to an unknown binding molecule protein of the same type as the known binding molecule proteins.
  • With this method, the binding molecule or the type of the binding molecule can be predicted merely by obtaining the amino acid sequence of the protein whose binding molecule is unknown and/or the sequence alignment obtained using that amino acid sequence.
  • This makes it possible to predict the binding molecule (a ligand or the like) much faster and at lower cost than conventional molecular modeling methods, which go as far as predicting three-dimensional structures.
  • The binding molecule or the type of the binding molecule can be predicted for various types of proteins whose binding molecules are unknown.
  • The binding molecule or the type of the binding molecule can be predicted more easily and quickly than by testing experimentally whether each candidate binding molecule actually binds to the protein whose binding molecule is unknown.
  • Once the binding molecule determining residue-binding molecule classification information has been obtained, the binding molecule or type of binding molecule that binds to an unknown protein can be easily predicted merely by obtaining the sequence alignment of the unknown protein and applying it to the table.
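The table lookup described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the table contents, the aligned sequence, and the names `RESIDUE_TO_TYPE` and `predict_type` are all hypothetical.

```python
# Hypothetical classification table: residues at the determining positions
# mapped to a ligand type. The residue/type pairs are invented for illustration.
RESIDUE_TO_TYPE = {
    ("D", "S"): "monoamine",
    ("R", "Y"): "lipid",
    ("Q", "H"): "peptide",
}

def predict_type(aligned_seq, positions, table):
    """Look up the ligand type for the residues found at the given
    1-based alignment positions; return None if the combination is absent."""
    key = tuple(aligned_seq[p - 1] for p in positions)
    return table.get(key)

# Hypothetical aligned sequence of an orphan receptor ('-' marks a gap).
orphan = "MDLV-TASN"
prediction = predict_type(orphan, (2, 8), RESIDUE_TO_TYPE)
print(prediction)
```

A combination of residues that never occurs among the known proteins yields `None`, which corresponds to a case where the table alone cannot support a prediction.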
  • a method for producing a medicament comprising the step of:
  • The method uses the binding molecule determining residue, which is the amino acid residue at a position in the sequence alignment of the known binding molecule proteins that is assumed to be involved in determining the binding molecule (binding molecule determining residue position), together with binding molecule determining residue-binding molecule classification information indicating the correlation between that residue and the binding molecule or the type of the binding molecule.
  • A computer for predicting the binding molecule of an unknown binding molecule protein, comprising:
    • sequence alignment input means for inputting information on the sequence alignment of proteins;
    • sequence alignment-binding molecule storage means for storing the amino acid sequence or sequence alignment of the known binding molecule proteins input by the sequence alignment input means, together with information on the binding molecule or the type of the binding molecule;
    • binding molecule determining residue position determining means for determining the binding molecule determining residue positions using the stored amino acid sequence or sequence alignment of the known binding molecule proteins and the information on the binding molecule or the type of the binding molecule; and
    • binding molecule determining residue-binding molecule classification information obtaining means for obtaining binding molecule determining residue-binding molecule classification information, which indicates the correlation between the binding molecule determining residue and the binding molecule or the type of the binding molecule, by associating the amino acid residue at each binding molecule determining residue position (binding molecule determining residue) with the binding molecule or the type of the binding molecule.
  • For an unknown binding molecule protein of the same type as the known binding molecule proteins, information on its sequence alignment, obtained by aligning its sequence against the sequence alignment of the known binding molecule proteins, is input through the sequence alignment input means and applied to the binding molecule determining residue-binding molecule classification information to predict the binding molecule or the type of the binding molecule of the unknown binding molecule protein.
  • The computer for predicting the binding molecule of an unknown binding molecule protein according to (14), wherein the binding molecule determining residue position determining means uses either or both of the functions of Formula 9 and Formula 10.
  • The computer for predicting the binding molecule of an unknown binding molecule protein according to (15) or (16) above, wherein the binding molecule determining residue position determining means uses the function represented by Formula 9.
  • A computer for predicting the binding molecule of an unknown binding molecule protein, comprising:
    • storage means for storing, for known binding molecule proteins of the same type as the unknown binding molecule protein, the binding molecule determining residue positions in their sequence alignment that are assumed to be involved in determining the binding molecule, the binding molecule determining residues (the amino acid residues of the known binding molecule proteins at those positions), and the binding molecule or type of binding molecule corresponding to those residues;
    • sequence alignment input means for inputting information on the sequence alignment of the unknown binding molecule protein, obtained by aligning its sequence against the sequence alignment of the known binding molecule proteins;
    • storage means for the input sequence alignment information;
    • binding molecule determining means for determining the binding molecule or the type of the binding molecule of the unknown binding molecule protein from the stored information; and
    • display means for displaying the determined binding molecule or type of binding molecule.
  • Using the sequence alignment information of the unknown binding molecule protein input through the sequence alignment input means and the binding molecule determining residues and corresponding binding molecules held in the storage means, the binding molecule determining means predicts the binding molecule or the type of the binding molecule of the unknown binding molecule protein, and the display means displays the prediction.
  • With such a computer, binding molecule determining residue-binding molecule classification information can be obtained from the sequence alignment of the known binding molecule proteins, so that the binding molecule or the type of the binding molecule can be easily predicted merely by obtaining the sequence alignment of the unknown binding molecule protein.
  • A program for causing a computer to function as: sequence alignment input means for inputting information on the sequence alignment of the known binding molecule proteins; sequence alignment-binding molecule storage means for storing the amino acid sequence or sequence alignment of the known binding molecule proteins input by the sequence alignment input means, together with information on the binding molecule or the type of the binding molecule; binding molecule determining residue position determining means for determining the binding molecule determining residue positions using the stored amino acid sequence or sequence alignment and the information on the binding molecule or the type of the binding molecule; means for obtaining binding molecule determining residue-binding molecule classification information, which indicates the correlation between the binding molecule determining residue and the binding molecule or the type of the binding molecule, by associating the amino acid residue at each binding molecule determining residue position (binding molecule determining residue) with the binding molecule or the type of the binding molecule; and sequence alignment input means for inputting information on the sequence alignment of an unknown binding molecule protein of the same type as the known binding molecule proteins, obtained by aligning its sequence against the sequence alignment of the known binding molecule proteins.
  • A program for causing a computer to function as: storage means for storing the binding molecule determining residue positions that are assumed to be involved in determining the binding molecule, the binding molecule determining residues (the amino acid residues of the known binding molecule proteins at those positions), and information on the binding molecule or the type of the binding molecule of the known binding molecule proteins corresponding to those residues; sequence alignment input means for inputting information on the sequence alignment of an unknown binding molecule protein of the same type as the known binding molecule proteins, obtained by aligning its sequence against the sequence alignment of the known binding molecule proteins; storage means for the input sequence alignment information; and means for determining the binding molecule of the unknown binding molecule protein from the stored information.
  • FIG. 1 is a process chart starting from the creation of the ligand-determining residue-ligand classification information of the present invention.
  • FIG. 2 is a process chart showing one embodiment of the ligand-determining residue position specifying step of the present invention.
  • FIG. 3 is a process chart showing another embodiment of the ligand-determining residue position specifying step of the present invention.
  • FIG. 4 shows the results of sequence alignment between rhodopsin and TGR23-1.
  • FIG. 5 shows the activity of various concentrations of human TGR23-2 ligand (1-20) in increasing the intracellular Ca ion concentration of TGR23-1-expressing CHO cells, measured using FLIPR.
  • FIG. 6 shows the activity of increasing the intracellular Ca ion concentration of TGR23-2 expressing CHO cells by various concentrations of human TGR23-2 ligand (112) measured using FLIPR.
  • The present invention relates to a method for predicting the binding molecule of an unknown binding molecule protein, comprising: obtaining binding molecule known protein classification information in which the sequence alignment of at least two known binding molecule proteins, whose amino acid sequences and binding molecules are known, is associated with the binding molecule or the type of the binding molecule; specifying one or more binding molecule determining residue positions; obtaining binding molecule determining residue-binding molecule classification information, which indicates the correlation between the binding molecule determining residue and the binding molecule or the type of the binding molecule, by associating the amino acid residue at each binding molecule determining residue position (binding molecule determining residue) with the binding molecule or the type of the binding molecule; aligning the sequence of an unknown binding molecule protein of the same type against the sequence alignment of the known binding molecule proteins to obtain a sequence alignment of the unknown binding molecule protein and information on its binding molecule determining residues; and predicting the binding molecule or the type of the binding molecule.
  • A known binding molecule protein means a protein for which a biomolecule that binds to it is known. Specifically, it refers to a protein to which a biological molecule specifically binds, such as a receptor whose ligand is known.
  • An unknown binding molecule protein refers to a protein of the same type as a known binding molecule protein, wherein the binding molecule is unknown.
  • The sequence alignment of the unknown binding molecule protein means the alignment obtained by aligning the sequence of an unknown binding molecule protein, of the same type as the known binding molecule proteins, against the sequence alignment of the known binding molecule proteins.
  • Sequence alignment means juxtaposing the entire sequences while taking substitutions, insertions, and deletions into account, with gaps inserted at the positions corresponding to insertions and deletions.
  • Examples of unknown binding molecule proteins and known binding molecule proteins include G protein-coupled receptors (GPCRs), kinases, lipases, transporters, proteases, and ion channels. Of these, the present invention can be preferably applied to G protein-coupled receptors (GPCRs) and kinases.
  • When the unknown binding molecule protein and the known binding molecule protein are G protein-coupled receptors (GPCRs), the unknown binding molecule protein is also called an orphan receptor.
  • the unknown binding molecule protein and the known binding molecule protein are of the same type. For example, if the protein whose binding molecule is unknown is a GPCR, the protein whose binding molecule is known is also a GPCR.
  • The binding molecule known protein classification information means a table in which the sequence alignment of at least two known binding molecule proteins is associated with the binding molecule (and/or the type of the binding molecule).
  • This table is not particularly limited as long as it is in a form that can be visually recognized as a table, whether on paper or stored electronically.
  • the binding molecule determining residue position means a position of a sequence alignment of a protein having a known binding molecule, which is assumed to be involved in determining a binding molecule.
  • The number of ligand-determining residue positions is not particularly limited as long as it is 1 or more, preferably 1 or more and 10 or less, more preferably 2 or more and 6 or less, and particularly preferably 2 or more.
  • More than 100 kinds of ligands are known, whereas a single position can hold only one of about 20 kinds of amino acid residues, so with only one ligand-determining residue position not all ligands can be associated with ligand-determining residues.
  • The prediction accuracy increases as the number of ligand-determining residue positions increases.
  • Here, p is the number of classifications.
  • When the number of ligand classifications (the value of p) is small, it is easier to predict which classification the ligand of the unknown protein belongs to.
  • In general, the prediction accuracy is higher when the type of ligand is predicted by combining the amino acid residues at two or more ligand-determining residue positions.
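The effect of combining positions can be illustrated with a toy calculation. The sketch below counts how many residue combinations map to more than one ligand type when one position is used versus two; the sequences, ligand types, and function name are invented for illustration and are not taken from the patent.

```python
from collections import defaultdict

# Hypothetical aligned sequences with known ligand types (invented data).
KNOWN = [
    ("ADKS", "monoamine"),
    ("ADRT", "peptide"),
    ("AEKT", "lipid"),
    ("AERS", "peptide"),
]

def ambiguous_keys(known, positions):
    """Count residue combinations (at 1-based alignment positions)
    that are associated with more than one ligand type."""
    types_by_key = defaultdict(set)
    for seq, ligand_type in known:
        key = tuple(seq[p - 1] for p in positions)
        types_by_key[key].add(ligand_type)
    return sum(1 for types in types_by_key.values() if len(types) > 1)

one_position = ambiguous_keys(KNOWN, (2,))     # position 2 alone
two_positions = ambiguous_keys(KNOWN, (2, 3))  # positions 2 and 3 combined
print(one_position, two_positions)
```

In this toy data set, position 2 alone leaves both of its residues ambiguous, while the pair of positions 2 and 3 separates every ligand type.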
  • the binding molecule determining residue means an amino acid residue at the binding molecule determining residue position.
  • For the known binding molecule proteins, several sets of specific residue positions (each set consisting of one, or two or more, positions) and the amino acid residues at those positions may be combined and used as binding molecule determining residues.
  • For example, the second and eighth amino acid residue positions in the sequence alignment may be taken as one set of binding molecule determining residue positions, and the ninth and eleventh amino acid residue positions as another set. Combining different binding molecule determining residue positions in this manner makes it possible to improve the accuracy of predicting the binding molecule or the type of the binding molecule.
  • The information on the binding molecule determining residue means information on the amino acid residue at the binding molecule determining residue position. For example, if the second and eighth positions in the sequence alignment of the protein are binding molecule determining residue positions, "the second and eighth positions in the sequence alignment" is the information on the positions of the binding molecule determining residues. The information on the types of the second and eighth amino acid residues in the sequence alignment, combined with this position information, constitutes the information on the binding molecule determining residue.
  • the binding molecule determining residue-binding molecule classification information is a table showing the correlation between the binding molecule determining residue and the binding molecule or the type of the binding molecule. This table is not particularly limited as long as it is in a form that can be stored electronically and visually recognized as a table as well as on paper.
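One way such a table might be held electronically is as a mapping from the residues at the determining positions to the ligand types observed for them. The following is a minimal sketch under that assumption; the function name, the sequences, and the choice of positions 2 and 8 (borrowed from the example in the text) are illustrative only.

```python
def build_classification_table(known, positions):
    """Map the residues at the given 1-based alignment positions
    to the set of ligand types observed for that residue combination."""
    table = {}
    for aligned_seq, ligand_type in known:
        key = tuple(aligned_seq[p - 1] for p in positions)
        table.setdefault(key, set()).add(ligand_type)
    return table

# Hypothetical aligned sequences of proteins whose ligand type is known.
known_proteins = [
    ("MDLVFTAS", "monoamine"),
    ("MDLIFTAS", "monoamine"),
    ("MRLVFTAY", "lipid"),
]

table = build_classification_table(known_proteins, (2, 8))
for residues, types in sorted(table.items()):
    print(residues, sorted(types))
```

Storing a set of types per key keeps conflicting entries visible instead of silently overwriting them, which is useful when judging whether the chosen positions are informative enough.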
  • the binding molecule is not particularly limited as long as the correspondence between the binding molecule determining residue and the type of the binding molecule or the binding molecule can be recognized.
  • the binding molecule is not particularly limited as long as it can bind to known binding molecule proteins and unknown binding molecule proteins that are biopolymers.
  • Examples include ligands that bind to receptor proteins and Gα proteins that bind to GPCRs.
[Type of binding molecule]
  • The type of binding molecule is a category into which binding molecules are classified according to their functions and properties when a plurality of binding molecules exist for the same type of known binding molecule protein. For example, GPCR ligands may be classified into monoamines, lipids, and peptides.
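A classification of this kind can be prepared in advance as a simple lookup, so that the type follows automatically once the ligand is known. The sketch below uses ligand names mentioned in this document and the monoamine / lipid / peptide grouping from the example above; the dictionary, the function name, and the "unclassified" fallback are assumptions made for illustration.

```python
# Ligand -> type map. Ligand names come from the text; the function name
# and the fallback label are hypothetical.
LIGAND_TYPE = {
    "dopamine": "monoamine",
    "serotonin": "monoamine",
    "histamine": "monoamine",
    "prostaglandin": "lipid",
    "leukotriene": "lipid",
    "angiotensin": "peptide",
    "secretin": "peptide",
    "somatostatin": "peptide",
}

def ligand_type(ligand):
    """Return the type for a ligand name, ignoring case and surrounding spaces."""
    return LIGAND_TYPE.get(ligand.strip().lower(), "unclassified")

print(ligand_type("Dopamine"))
```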
  • The computer of the present invention is not particularly limited as long as it is an electronic device capable of performing the required calculations.
  • A known computer such as a personal computer, a supercomputer, or a mobile device may be used.
  • A computer equipped with a browser is particularly preferable because it can be connected to the Internet and can access well-known Web sites.
  • FIG. 1 shows an example of the process starting from the generation of ligand-determining residue-ligand classification information, which comprises the following steps:
    • S101: obtaining information on the sequence alignment and ligand (and/or ligand type) for at least two known binding molecule proteins whose amino acid sequence and ligand (and/or ligand type) are known;
    • S102 (sequence alignment ligand classification information obtaining step): associating the sequence alignment with the ligand (and/or type of ligand) to obtain binding molecule known protein classification information;
    • S103 (ligand-determining residue position specifying step): specifying, using the binding molecule known protein classification information obtained in S102, the ligand-determining residue positions, which are assumed to be involved in determining the ligand (and/or type of ligand) in the sequence alignment of the known binding molecule proteins;
    • S104: ligand-determining residue-ligand classification step.
  • First, information on the sequence alignment and ligand (and/or type of ligand) is acquired for at least two known binding molecule proteins whose amino acid sequence and ligand are known (S101).
  • information on the amino acid sequence and ligand (and / or type of ligand) is obtained for a plurality of known binding molecule proteins, but sequence alignment may be obtained from the amino acid sequence, or a plurality of known binding molecule proteins may be obtained. If sequence alignment has already been requested for, information on the sequence alignment may be obtained directly.
  • the method for obtaining sequence alignment and information on ligands (and Z or the type of ligand) for proteins with known binding molecules is not particularly limited.
  • The database is preferably one that records the sequence alignments and ligands (and/or ligand types) of at least 100 types of binding molecule known proteins, more preferably 500 or more, and particularly preferably 1000 or more. This is because the greater the number of binding molecule known proteins, the higher the accuracy of the ligand-determined residue-ligand classification information.
  • The known database is not particularly limited as long as it describes the sequence alignment and ligand (and/or ligand type) of the binding molecule known proteins; one example is GPCRDB (http://www.GPCR.org/7tm/).
  • The sequence alignment can also be obtained by a known calculation method.
  • Known calculation methods for sequence alignment include, for example, ClustalW and BLAST; the alignment may also be prepared manually.
  • A classification of ligands may be created in advance so that, when information on a ligand is input, the ligand type is obtained automatically. In other words, obtaining the information on the ligand may also make the information on the ligand type available.
  • [Step 102]
  • The sequence alignment and the information on the ligand (and/or ligand type) obtained in step 101 for the two or more binding molecule known proteins are associated with each other to obtain classification information of the binding molecule known proteins (sequence alignment-ligand classification information obtaining step: S102).
  • When the sequence alignment and ligand (and/or ligand type) information for the binding molecule known proteins is obtained from a database or the like in which the sequence alignment is already associated with the ligand (and/or ligand type), the information may be obtained in that associated form.
  • Using the binding molecule known protein classification information obtained in the sequence alignment-ligand classification information obtaining step of step 102, one or more ligand-determining residue positions, which are assumed to be involved in determining the ligand (and/or ligand type) in the sequence alignment of the binding molecule known proteins, are specified (ligand-determining residue position specifying step: S103).
  • one or more ligand-determining residue positions are specified by combining preferred functions according to the type and properties of the target protein. A preferred embodiment of this step will be described later.
  • The ligand (and/or ligand type) is associated with the amino acid residue (ligand-determining residue) at the ligand-determining residue position specified in the ligand-determining residue position specifying step, to obtain ligand-determined residue-ligand classification information indicating the correlation between the ligand-determining residue and the ligand (ligand-determined residue-ligand classification step: S104).
  • For each GPCR, information on the ligand-determining residue positions, ligand-determining residues, and ligands (and/or ligand types) is extracted from the sequence alignment-ligand classification information to obtain the ligand-determined residue-ligand classification information.
  • ligand-determining residue position identification step S103
  • This step is for the case where the number of ligand-determining residue positions is one.
  • Ligands are classified into p types, denoted X1 to Xp.
  • All the information on the sequence alignments and ligands (and/or ligand types) of the binding molecule known proteins present in the binding molecule known protein classification information obtained in step 102 is input to Equation 15 below, and positions for which the value of the function f1(n) is small are selected as candidates for ligand-determining residue positions (candidate ligand-determining residue position selection step: S201).
  • the ligand-determined residue positions are determined. It is preferable not to adopt as a ligand-determining residue position
  • Here, n indicates that f1(n) is an evaluation function for the n-th amino acid residue in the sequence alignment of the binding molecule known proteins.
  • Res represents the amino acid residue.
  • Xq and Xr represent ligands, q represents an integer from 1 to p-1, r represents an integer greater than q and less than or equal to p, and p represents the number of ligands. N(Res, Xq) represents the number of proteins, among the binding molecule known proteins present in the binding molecule known protein classification information, in which the n-th amino acid residue in the sequence alignment is Res and the ligand is Xq. N(Res, Xr) likewise represents the number of proteins in which the n-th amino acid residue is Res and the ligand is Xr.
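  • The formula for f1(n) itself does not survive in this text. The sketch below is therefore an assumption: from the index definitions (q from 1 to p-1, r greater than q), it takes f1(n) to be the sum, over all residues and all ligand pairs (Xq, Xr), of N(Res, Xq) × N(Res, Xr). The function name `f1`, the `records` layout (aligned sequence, ligand type), and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def f1(records, n, ligand_types):
    """Assumed form of f1(n): sum over residues and ligand pairs (Xq, Xr), q < r,
    of N(Res, Xq) * N(Res, Xr) at alignment column n (1-based)."""
    # N(Res, X): number of known proteins with residue Res at column n and ligand X.
    counts = Counter((seq[n - 1], ligand) for seq, ligand in records)
    residues = {seq[n - 1] for seq, _ in records}
    total = 0
    for res in residues:
        for xq, xr in combinations(ligand_types, 2):
            total += counts[(res, xq)] * counts[(res, xr)]
    return total


# Hypothetical toy data: (aligned sequence, ligand type).
records = [("TLMRK", "P"), ("TLMRK", "A"), ("TMMRK", "P"), ("TCMRK", "N")]
scores = {n: f1(records, n, ["P", "A", "N"]) for n in range(1, 6)}
```

  • Under this assumed form, a column where each residue is seen with only one ligand type scores 0, so small values mark positions that separate ligand types well.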
  • The number of candidate ligand-determining residue positions selected in the candidate ligand-determining residue position selection step is not particularly limited, but is preferably 1 or more and 100 or less, more preferably 1 or more and 10 or less, and particularly preferably 1 or more and 5 or less. This is because, if there are too many candidate ligand-determining residue positions, the subsequent step of confirming their reliability becomes difficult.
  • The candidate ligand-determining residue position selected in the candidate selection step of step 201 may be used as it is as the ligand-determining residue position, but it is more preferable to examine its reliability first (step of examining reliability of candidate ligand-determining residue positions: S202).
  • The reliability of the candidate ligand-determining residue positions is examined using Equation 16 below. The smaller the obtained value of f2(n), the more suitable the position is as a ligand-determining residue position.
  • $f_2(n) = \sum_{\mathrm{Res}} \bigl( N(\mathrm{Res}, X_1) \times N(\mathrm{Res}, X_2) \times \cdots \times N(\mathrm{Res}, X_p) \bigr)$ (Equation 16)
  • Here, n indicates that f2(n) is an evaluation function for the n-th amino acid residue in the sequence alignment of the binding molecule known proteins, and Res represents the amino acid residue.
  • X1 to Xp represent the ligands or ligand types.
  • p represents the number of ligands or ligand types.
  • N(Res, X) represents the number of proteins, among the binding molecule known proteins in the classification information obtained in step 102, in which the n-th amino acid residue in the sequence alignment is Res and the ligand is X.
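  • As an illustration of Equation 16, the following minimal sketch computes f2(n) as the sum over residues of the product of N(Res, X) across all ligand types. The function name `f2`, the `records` layout (aligned sequence, ligand type), and the toy data are hypothetical and are not taken from Table 1.

```python
from collections import Counter
from math import prod


def f2(records, n, ligand_types):
    """Sum over residues Res of the product of N(Res, X) over all ligand types X,
    evaluated at alignment column n (1-based)."""
    # N(Res, X): number of known proteins with residue Res at column n and ligand X.
    counts = Counter((seq[n - 1], ligand) for seq, ligand in records)
    residues = {seq[n - 1] for seq, _ in records}
    return sum(prod(counts[(res, x)] for x in ligand_types) for res in residues)


# Hypothetical toy data: (aligned sequence, ligand type).
records = [("TLMRK", "P"), ("TLMRK", "A"), ("TMMRK", "P"), ("TCMRK", "N")]
ligand_types = ["P", "A", "N"]
f2_values = {n: f2(records, n, ligand_types) for n in range(1, 6)}
```

  • The product is nonzero only for residues observed with every ligand type, so f2(n) penalizes positions where a single residue is shared by all ligand classes.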
  • the position of a ligand-determining residue is identified (ligand-determining residue position identification step: S 203).
  • The position with the smallest obtained value of f2(n) may be determined as the ligand-determining residue position, or the position with the smallest product of the values of f1(n) and f2(n) may be determined as the ligand-determining residue position.
  • Alternatively, the position with the smallest sum of the values of f1(n) and f2(n) may be used as the ligand-determining residue position.
  • Alternatively, the positions may be ranked in ascending order of f1(n) and likewise in ascending order of f2(n), the two ranks multiplied, and the position with the smallest rank product used as the ligand-determining residue position.
  • If, at the obtained ligand-determining residue position, residues that cannot be aligned (indicated by "-") are present in 3% or more of all GPCRs, it is preferable not to adopt that position as a ligand-determining residue position.
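  • The rank-product selection and the 3% gap criterion described above can be sketched as follows. This is only an illustration: the function name `rank_positions`, the dictionary inputs, the tie-breaking by position number, and all numeric values are hypothetical.

```python
def rank_positions(f1_values, f2_values, gap_fraction, max_gap=0.03):
    """Order positions by (rank in f1) * (rank in f2), smallest first.

    Positions where the fraction of unalignable residues ("-") is 3% or more
    are dropped before ranking. Ties are broken by position number (assumption).
    """
    eligible = [n for n in f1_values if gap_fraction.get(n, 0.0) < max_gap]

    def ranks(values):
        ordered = sorted(eligible, key=lambda n: (values[n], n))
        return {n: i + 1 for i, n in enumerate(ordered)}

    r1, r2 = ranks(f1_values), ranks(f2_values)
    return sorted(eligible, key=lambda n: (r1[n] * r2[n], n))


# Hypothetical evaluation values for five alignment positions.
f1_values = {1: 10, 2: 1, 3: 4, 4: 2, 5: 8}
f2_values = {1: 5, 2: 0, 3: 1, 4: 0, 5: 6}
gap_fraction = {3: 0.05}  # position 3 is a gap in 5% of the proteins
order = rank_positions(f1_values, f2_values, gap_fraction)
```

  • The sum or the plain product of f1(n) and f2(n) could be swapped in as the sort key to obtain the other selection variants mentioned in the text.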
  • Another preferred embodiment of the ligand-determining residue position specifying step (S103) will be described.
  • This step is for the case of two ligand-determining residue positions.
  • All the sequence alignments of the binding molecule known proteins present in the binding molecule known protein classification information obtained in step 102, together with the information on the ligand or ligand type, are input to Equation 1 to select candidates for ligand-determining residue positions (step of selecting candidate ligand-determining residue positions: S301).
  • The number of candidate ligand-determining residue positions selected in the candidate selection step is not particularly limited, but is preferably 1 or more and 100 or less, more preferably 1 or more and 20 or less, even more preferably 2 or more and 10 or less, and particularly preferably 2 or more and 6 or less. This is because, if the number of candidates is large, the step of confirming the reliability of the candidate ligand-determining residue positions becomes difficult, whereas a single candidate cannot cope with a variety of ligand types.
  • The reliability of the candidate ligand-determining residue positions is then examined (step of examining reliability of candidate ligand-determining residue positions: S302). Alternatively, the process may proceed to step 303 directly after step 301.
  • In the step of examining reliability, at least the amino acid residue positions that include the candidate ligand-determining residue positions are input to Equation 2 together with the sequence alignments and ligands of the binding molecule known proteins. A residue position with a small value of the function f2(n) is suitable as a ligand-determining residue position.
  • The ligand-determining residue position may be selected from the positions with the smallest values of the function f2(n), from those with the smallest product of the values of f1(n) and f2(n), or from those with the smallest sum of the values of f1(n) and f2(n).
  • Alternatively, the amino acid residue positions of the binding molecule known proteins present in the binding molecule known protein classification information obtained in step 102 may be ranked in ascending order of f1(n) and likewise in ascending order of f2(n), the two ranks multiplied, and the ligand-determining residue position selected from those with the smallest rank product. If, at the obtained ligand-determining residue position, residues that cannot be aligned (indicated by "-") are present in 3% or more of the GPCRs described in the binding molecule known protein classification information, it is preferable not to adopt that position as a ligand-determining residue position.
  • Next, candidate pairs of ligand-determining residue positions are given (step of selecting pairs of ligand-determining residue positions: S303).
  • As the candidate pairs, all combinations of the candidate ligand-determining residue positions may be used, or the candidate pairs may be given by combining a ligand-determining residue position that is preferred in terms of f1(n) with other residue positions.
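  • The two ways of forming candidate pairs in step S303 can be sketched as below. The function names `all_candidate_pairs` and `pairs_with_preferred` are hypothetical; the candidate positions 2, 4, and 3 are those used in the worked example of this description.

```python
from itertools import combinations


def all_candidate_pairs(candidates):
    """Every unordered pair (m, n), m < n, of candidate positions."""
    return list(combinations(sorted(candidates), 2))


def pairs_with_preferred(preferred, other_positions):
    """Pairs formed by one preferred position and each other residue position."""
    return [tuple(sorted((preferred, other)))
            for other in other_positions if other != preferred]


pairs = all_candidate_pairs([2, 4, 3])
anchored = pairs_with_preferred(2, [1, 2, 3])
```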
  • a pair of ligand-determined residue positions is identified (pair identification step of ligand-determined residue positions: S304).
  • $f_3(m, n) = \dfrac{(\text{number of amino acid residue pair types})}{w_X} + w_1 \times (\text{number of 2-crossing residue pair types}) + w_2 \times (\text{number of 3-crossing residue pair types}) + \cdots + w_{p-1} \times (\text{number of } p\text{-crossing residue pair types}) + w_A \times (\text{number of amino acid residue pairs in which one residue cannot be aligned}) + w_B \times (\text{number of amino acid residue pairs in which both residues cannot be aligned})$
  • Here, (m, n) indicates that f3(m, n) is an evaluation function for the m-th and n-th amino acid residues in the sequence alignment of the binding molecule known proteins.
  • The number of amino acid residue pair types is the number of types of combinations of the m-th and n-th amino acid residues in the sequence alignment of the binding molecule known proteins.
  • The numbers of 2-crossing and 3-crossing residue pair types are the numbers of types of combinations of the m-th and n-th amino acid residues in the sequence alignment of the binding molecule known proteins for which there are 2 and 3 ligands, respectively.
  • The number of p-crossing residue pair types is the number of types of combinations of the m-th and n-th amino acid residues in the sequence alignment of the binding molecule known proteins for which there are p ligands.
  • The term weighted by wA counts the amino acid residue pairs in which one of the m-th and n-th amino acid residues cannot be sequence-aligned because a gap was inserted to obtain favorable homology; the term weighted by wB counts the pairs in which both the m-th and n-th amino acid residues cannot be sequence-aligned.
  • wX may be a positive constant or a distribution function, such as a Gaussian function, with the number of amino acid pair types as its variable.
  • As the positive constant, although not particularly limited, 1 is preferable.
  • When wX is a distribution function, it is a distribution function with the number of amino acid pair types as its variable that gives its maximum value when the number of amino acid pair types is 400 or less.
  • Examples include a Gaussian function and a Lorentz function using the number of amino acid pair types as the variable.
  • The maximum value is not particularly limited as long as it is a positive number.
  • The value of the variable that gives the maximum value is preferably 400 or less, more preferably 20 to 200, and even more preferably 40 to 140. This range is chosen because the total number of types of ligand-determining amino acid pairs is at most 20 × 20 = 400 and because, empirically, prediction with 40 to 140 amino acid pair types gives the best results.
  • The full width at half maximum of the distribution function is preferably 10 or more and 100 or less, more preferably 20 or more and 70 or less, and even more preferably 30 or more and 50 or less.
  • The maximum value of the distribution function is preferably 1, although it depends on the relationship with the other weights.
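  • A Gaussian choice of wX with a given full width at half maximum (FWHM) can be sketched as follows, using the standard relation sigma = FWHM / (2 * sqrt(2 * ln 2)). The function name `gaussian_weight` and the defaults (center at 90 pair types, FWHM of 40, peak of 1) are hypothetical picks from within the preferred ranges stated above, not values given in the description.

```python
import math


def gaussian_weight(num_pair_types, center=90.0, fwhm=40.0, peak=1.0):
    """Gaussian w_X as a function of the number of amino acid pair types.

    center, fwhm and peak are illustrative assumptions chosen inside the
    preferred ranges (maximum at 40 to 140, FWHM of 30 to 50, maximum value 1).
    """
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return peak * math.exp(-((num_pair_types - center) ** 2) / (2.0 * sigma ** 2))


w_at_peak = gaussian_weight(90)
w_at_half = gaussian_weight(110)  # center + FWHM / 2
```

  • Because f3 divides the number of pair types by wX, pair counts far from the center yield a small wX and hence a large, unfavorable first term.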
  • The value of each weight is not particularly limited as long as it is a positive number. However, because the presence of 3-crossing residue pairs, for example, is more unfavorable for predicting the ligand than the presence of 2-crossing residue pairs, it is preferable that the value of w2 be larger than the value of w1. For example, when the number of ligands is three, no residue pair can have more than three crossings; one combination of weights in this case is w1 = 2, w2 = 5, and wA = wB = 1.
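  • The evaluation of a position pair with f3(m, n) can be sketched as below, using the example weights w1 = 2, w2 = 5, and wA = wB = 1 with wX fixed at 1. Several details are assumptions made for illustration: gap-containing combinations are counted per distinct pair type and are left out of the pair-type and crossing counts, and the function name `f3`, the `records` layout, and the toy sequences are hypothetical (they reproduce the five residue combinations quoted for positions 2 and 3, but are not Table 1).

```python
from collections import defaultdict


def f3(records, m, n, w_x=1.0, crossing_weights=None, w_a=1.0, w_b=1.0, gap="-"):
    """Score the position pair (m, n), 1-based; smaller is better."""
    # crossing_weights maps "number of ligands seen for a pair type" to its weight.
    crossing_weights = crossing_weights or {2: 2.0, 3: 5.0}
    ligands_by_pair = defaultdict(set)
    for seq, ligand in records:
        ligands_by_pair[(seq[m - 1], seq[n - 1])].add(ligand)
    # Pair types with exactly one gap, and with gaps at both positions.
    one_gap = sum(1 for a, b in ligands_by_pair if (a == gap) != (b == gap))
    both_gap = sum(1 for a, b in ligands_by_pair if a == gap and b == gap)
    aligned = {pair: ligs for pair, ligs in ligands_by_pair.items() if gap not in pair}
    score = len(aligned) / w_x
    for ligs in aligned.values():
        score += crossing_weights.get(len(ligs), 0.0)  # 1 ligand: no penalty
    return score + w_a * one_gap + w_b * both_gap


# Hypothetical toy data: (aligned sequence, ligand type).
records = [("TLMRK", "P"), ("TMMRK", "A"), ("TCMRK", "N"),
           ("TLLRK", "A"), ("TMLRK", "N")]
score_23 = f3(records, 2, 3)
score_15 = f3(records, 1, 5)
```

  • In this toy data, positions 2 and 3 give five pair types that each map to a single ligand type, whereas positions 1 and 5 give one pair type shared by three ligand types and therefore a higher score.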
  • Ligand-determined residue-ligand classification information in which pairs of ligand-determining residue positions are arbitrarily combined may also be used.
  • The combination is not particularly limited. Examples include combining the pair for which the function f3(m, n) has the smallest value with the pair having the second smallest value and/or the x-th smallest value (where x represents an integer greater than 2 and less than 100).
  • Another example is combining the pair for which the value of f3(m, n) is the x-th smallest with the pair for which it is the y-th smallest (where y is different from x and represents an integer greater than 2 and less than 100).
  • For example, ligand-determining residue pairs up to the one with the 29th smallest value of f3(m, n) can be arbitrarily combined.
  • The values of x and y may be input to the computer in advance, or the computer may receive input from the user and create the combination of ligand-determining residues accordingly.
  • Combining only the pairs with the smallest and second smallest values of f3(m, n) limits the number of ligand-determining residues with which the ligand and/or ligand type can be predicted.
  • x and y above are preferably larger than 2 and smaller than 100, more preferably smaller than 50, even more preferably smaller than 30, and particularly preferably smaller than 20.
  • First, the amino acid residues at the pair of ligand-determining residue positions for which the value of the function f3(m, n) is the smallest are extracted.
  • Next, the number of ligand-known proteins whose residues match the extracted ligand-determining residues is determined. Then, among these, the number of proteins having the same ligand or ligand type is determined.
  • The values obtained in this manner are expressed as N = (number of matching proteins having the same ligand or ligand type) / (number of ligand-known proteins listed in the ligand-determined residue-ligand classification information whose ligand-determining residues match the extracted residues). N is obtained in the same way for the pair of ligand-determining residue positions for which the value of f3(m, n) is the second smallest. The numerators of N for the pairs with the smallest and second smallest values of f3(m, n) are then added together, as are the denominators. In this manner, ligand-determined residue-ligand classification information in which pairs of ligand-determining residue positions are arbitrarily combined is obtained.
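  • The way N is combined across position pairs can be sketched as follows. The numerators and denominators are summed separately (this is not ordinary fraction addition), following the figures reported for GPR2 in Example 1 of this description: 32/32 at (86, 90) and 4/26 at (209, 211), giving 36/58. The function name `combined_support` and the tuple layout are hypothetical.

```python
def combined_support(supports):
    """supports: list of (same_type_count, residue_match_count), one per position pair.

    Returns the summed numerator and summed denominator as a tuple.
    """
    numerator = sum(same for same, _ in supports)
    denominator = sum(matched for _, matched in supports)
    return numerator, denominator


# Values reported for GPR2 in Example 1: 32/32 at (86, 90) and 4/26 at (209, 211).
total = combined_support([(32, 32), (4, 26)])
```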
  • Each of the steps described above may be performed manually, but is particularly preferably performed by a computer having a predetermined medium or program installed.
  • this computer is connected to the Internet and can access an external device.
  • Such a computer includes at least the following means:
  • sequence alignment input means for inputting information on the sequence alignment of a binding molecule unknown protein;
  • sequence alignment-binding molecule storage means for storing the amino acid sequence or sequence alignment of the binding molecule known proteins input by the sequence alignment input means, together with information on the binding molecule or the type of binding molecule;
  • binding molecule-determining residue position specifying means for specifying the binding molecule-determining residue position using the amino acid sequence or sequence alignment of the binding molecule known proteins and the information on the binding molecule or its type stored by the storage means; and
  • binding molecule-determining residue-binding molecule classification information obtaining means for associating the amino acid residue at the binding molecule-determining residue position (the binding molecule-determining residue) with the binding molecule or the type of binding molecule, thereby obtaining binding molecule-determining residue-binding molecule classification information indicating the correlation between the binding molecule-determining residue and the binding molecule or its type.
  • In the binding molecule-determining residue position specifying means, the sequence alignments of the binding molecule known proteins and the information on the binding molecule or its type are evaluated with any one, or any combination, of the f1(n), f2(n), and f3(m, n) functions described above to determine the binding molecule-determining residue position.
  • The ligand-determined residue-ligand classification information may constitute its own database, or may be configured as a relational database based on the binding molecule known protein classification information obtained in step 102. The relational configuration is preferable because, once the sequence alignment and ligand of a previously unknown protein have been confirmed, that protein can be incorporated into the binding molecule known protein classification information and the ligand-determining residue positions can easily be re-evaluated.
  • the binding molecule or the type of the binding molecule can be easily predicted by inputting the sequence alignment of the binding molecule unknown protein into the binding molecule determination residue-binding molecule classification information.
  • The computer of the present invention may also be a computer in which the binding molecule-determining residue-binding molecule classification information is installed in advance and which predicts the binding molecule and/or the type of binding molecule when the sequence alignment of an unknown protein is input through the sequence alignment input means. Even with such a computer, the binding molecule or its type can easily be predicted by checking the sequence alignment of the binding molecule unknown protein against the binding molecule-determining residue-binding molecule classification information.
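  • The lookup performed by such a computer can be sketched as below: a classification table keyed by the residues at the determining positions is built from the known proteins, and an unknown protein is predicted from its residues at the same positions. The function names `build_classification` and `predict`, the choice of returning the most frequent ligand type, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict


def build_classification(records, positions):
    """Map residues at the determining positions (1-based) to ligand-type counts."""
    table = defaultdict(Counter)
    for seq, ligand in records:
        table[tuple(seq[p - 1] for p in positions)][ligand] += 1
    return table


def predict(table, positions, query_seq):
    """Return the most frequent ligand type for the query's residues, or None."""
    counts = table.get(tuple(query_seq[p - 1] for p in positions))
    if not counts:
        return None  # no known protein shares these residues
    return counts.most_common(1)[0][0]


# Hypothetical toy data: (aligned sequence, ligand type).
records = [("TLMRK", "P"), ("TLMRK", "P"), ("TMMKK", "A"), ("TCMRK", "N")]
table = build_classification(records, (2, 4))
predicted = predict(table, (2, 4), "ALMRK")
```

  • Keeping the per-type counts in the table, rather than only the winning type, also allows the match ratio N described earlier to be read off directly.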
  • Table 1 is an example of hypothetical binding molecule known protein classification information.
  • Ligands are divided into three categories: P, A, and N.
  • When determining the position of a ligand-determining residue based on the type of ligand, P, A, and N correspond to X1, X2, and X3 in Formulas 1 and 2, respectively.
  • ⁇ X, XX, and ⁇ ⁇ correspond to XI, 11, and X3 in Formulas 1 and 2, respectively.
  • sequence alignment of the first known binding molecule protein is TLMRK
  • binding molecule (ligand) is ⁇
  • type of ligand of ligand I is ⁇ .
  • $f_2(1) = \sum_{\mathrm{Res}} \bigl( N(\mathrm{Res}, X_1) \times N(\mathrm{Res}, X_2) \times N(\mathrm{Res}, X_3) \bigr)$
  • n is in the order of 2, 4, 3, 5, 1.
  • Amino acid residue positions that give a small value of f1(n) are candidates for ligand-determining residue positions.
  • Here, the second, fourth, and third amino acid residue positions are candidates for ligand-determining residue positions. The number of amino acid residue positions to be taken as candidates may be determined in advance. Since the value of f2(n) at these amino acid residue positions is smaller than the value of f2(n) at the first and fifth amino acid residue positions, the second, fourth, and third amino acid residue positions are desirable candidates for ligand-determining residue positions.
  • A residue pair is a 2-crossing residue pair when there are two ligands for that pair of residues at the ligand-determining residue positions, and a 3-crossing residue pair when there are three ligands.
  • In this example, three candidate pairs of ligand-determining residue positions are obtained: (2, 3), (2, 4), and (3, 4).
  • First, the candidate pair (2, 3) of ligand-determining residue positions is examined.
  • The combinations of the second and third amino acid residues in the sequence alignment are (L, M), (M, M), (C, M), (L, L), and (M, L). Therefore, the "number of types of amino acid residue pairs" is 5.
  • The ligand corresponding to each of these five combinations of amino acid residues is uniquely determined, so there are no 2-crossing or 3-crossing residue pair types. The value of f3(2, 3) is therefore 5. Similarly, the value of f3(2, 4) is 4, and the value of f3(3, 4) is 5. The pair of residue positions (2, 4) is thus the most preferable and is taken as the pair of ligand-determining residue positions.
  • f3(1, 5) is also determined, using the combination of amino acid residue positions 1 and 5, to show that an unfavorable combination of amino acid residue positions increases the value of f3.
  • Once the ligand-determined residue-ligand classification information has been obtained as described above, the amino acid residues at the ligand-determining residue positions of a binding molecule unknown protein (a protein whose ligand is unknown) are determined, and classification is performed against the ligand-determined residue-ligand classification information.
  • the ligand of the unknown protein of the binding molecule For example, a sequence alignment of a certain unknown binding molecule protein is determined by a known method. If the second and fourth amino acid residues in the sequence alignment are M
  • the prediction method of the present invention is also useful for developing a novel medicine.
  • a ligand was described.
  • the present invention can predict not only a ligand but also a molecule that binds to a known binding molecule protein.
  • the binding molecule known protein is GPCR
  • a G protein that binds to the GPCR can also be predicted.
  • The use of the method of the present invention for predicting the binding molecule of a binding molecule unknown protein makes it possible to predict, for example, the ligand and/or ligand type of a GPCR. If the ligand and/or ligand type of the GPCR can be predicted, the function of the GPCR in vivo can be predicted. Using information on a GPCR whose function and whose ligand and/or ligand type have been predicted, it is then possible, for example, to easily produce a prophylactic or therapeutic drug for a disease or the like involving that GPCR.
  • This method is particularly suitable for the manufacture of prophylactic agents and/or therapeutic agents for central diseases, inflammatory diseases, cardiovascular diseases, cancer, metabolic diseases, immune system diseases, or digestive system diseases, for which the invention can be used effectively.
  • sequence numbers in the sequence listing in the present specification indicate the following sequences.
  • FIG. 1 shows the amino acid sequence of rat TGR23-2 ligand (1-15).
  • Fig. 3 shows the amino acid sequence of mouse TGR 23-2 ligand (1-18).
  • Figure 3 shows the amino acid sequence of mouse TGR 23-2 ligand (1-15).
  • Figure 3 shows the amino acid sequence of mouse TGR 23-2 ligand (1-14).
  • Figure 3 shows the amino acid sequence of mouse TGR 23-2 ligand (1-20).
  • FIG. 2 shows the nucleotide sequence of cDNA encoding a part of rat TGR23-2 ligand precursor.
  • [SEQ ID NO: 25] 2 shows the amino acid sequence of rat TGR 23-2 ligand (1-20).
  • Rat TGR 23 Shows the nucleotide sequence of cDNA encoding the precursor of ligand 2
  • FIG. 2 shows the amino acid sequence of rat TGR 23-2 ligand precursor.
  • TGR 23-1 human TGR 23-1).
  • TGR 23-2 human TGR 23-2).
  • Example 1 shows the nucleotide sequence of cDNA encoding human-derived G protein-coupled receptor protein TGR23-2.
  • GPCR is mentioned as a protein with a known binding molecule and a protein with an unknown binding molecule, but the present invention is not limited to these without departing from the gist thereof.
  • P: Peptides (peptide), Chemokines (chemokine), Glycoproteins (glycoprotein); A: Monoamines (adrenaline, acetylcholine, dopamine, serotonin, histamine)
  • Candidates having a small product of the value of the f1(n) function and the value of the f2(n) function were selected as ligand residue position candidates. Among them, positions at which residues that could not be sequence-aligned (-) accounted for 3% or more were excluded from the candidate ligand-determining residue positions. In this manner, 20 ligand residue position candidates were selected.
  • Table 3 shows the evaluation values and ranks of the function f1(n) and the function f2(n) for the six more preferable candidate ligand residue positions among them.
  • Table 3: Evaluation values and ranks of the function f1(n) and the function f2(n) for six preferred candidate ligand residue positions
  • the 86th and 90th amino acid residue types and the number of ligand types were extracted from the sequence alignment of GPCR from the binding molecule known protein classification information to obtain ligand-determined residue-ligand classification information.
  • Table 4 shows excerpts. From Table 4 it can be seen, for example, that of the 1152 types of GPCRs there are 86 types in which the amino acids at amino acid residue positions 86 and 90 are A and G, respectively, and that all of their ligands are classified as N (lipid). Thus, for most GPCRs, the ligand can be predicted from the amino acids at the 86th and 90th amino acid positions.
  • Table 4 Ligand-determined residue-ligand classification table (Binding molecule-determined residue-binding molecule classification table)
  • the (86, 90) th amino acid residue (position of the residue determining the binding molecule) has a sequence of I and M, respectively.
  • Among the 1152 types of binding molecule known GPCRs in the binding molecule-determining residue-binding molecule classification information, there are 32 types in which the (86, 90)th amino acid residues are I and M, respectively. The ligands of all 32 types are peptides.
  • Although this GPR2 is not included in the 1152 binding molecule known GPCRs, its actual ligand type is a peptide, which is consistent with the predicted ligand type.
  • the amino acid residue at the position (209, 211) determined by the binding molecule is V and F, respectively.
  • Table 5 shows that the binding molecule-determining residue positions (86, 90) are the most preferable binding molecule-determining residue positions among the three types. Furthermore, according to Table 5, it can be seen that the type of the ligand can be predicted with high accuracy.
  • For the binding molecule-determining residue positions (86, 90) and (209, 211) of GPR2, N (among the 1152 types of GPCRs, the number whose ligand type is the same over the number whose binding molecule-determining residues are identical to those of GPR2) is 32/32 and 4/26, respectively. Adding these gives 36/58.
  • Example 2 (Method of predicting the type of binding Gα protein that binds to a GPCR)
  • The binding Gα protein, which is a binding molecule of the GPCR, was classified into three types: Gi, Gq, and Gs. This classification follows the 2000 Receptor & Ion channel Nomenclature Supplement of Trends in pharmacological sciences (TIPS). For simplicity, Gi/o in the 2000 Receptor & Ion channel Nomenclature Supplement of TIPS is written as Gi, and Gq/11 as Gq.
  • Among the proteins listed in the 2000 Receptor & Ion channel Nomenclature Supplement of TIPS, those binding to two or more G proteins, and Gi/al, 3, and Gi/a2, 3, were excluded as exceptions. In this way, information was obtained on about 600 types of GPCRs and their sequence alignments, the binding Gα proteins that bind to them, and the types of those Gα proteins.
  • The sequence alignments of the GPCRs and the information on the type of binding Gα protein that binds to each GPCR were input to a computer.
  • From the input sequence information of the GPCRs and the information on the type of binding Gα protein, candidates were selected by the binding molecule-determining residue position candidate selecting means using the f1(n) function and the f2(n) function.
  • two pairs of binding residue positions (177, 178) and (82, 230) were determined.
  • In the binding molecule-determining residue position candidate selecting means, candidates having a small product of the value of the f1(n) function and the value of the f2(n) function were selected as candidate binding molecule residue positions. Among them, positions at which residues that could not be sequence-aligned (-) accounted for 3% or more were excluded from the candidate ligand-determining residue positions. In this way, candidate ligand residue positions were selected.
  • The Gα proteins that bind to a number of GPCRs were obtained from the literature. The methods by which the binding Gα protein had been determined, namely Ca influx, arachidonic acid release, PTX (pertussis toxin) sensitivity, and cyclic adenosine monophosphate, are denoted Ca, AA, PTX, and cAMP, respectively. GPCRs for which Gq had been determined only by Ca were excluded because the binding Gα protein could be something else. In this way, GPCRs were selected.
  • APJ G L 2/2 ST 0/0 Gi (cAMP) Gi For example, GPR5 in Table 6 will be described.
  • The type of binding Gα protein of GPR5 is Gi, which was determined by PTX.
  • The residues at positions 177 and 178 of the sequence alignment of GPR5 are R and S.
  • Nine GPCRs have these residues at these positions, and it can be seen that the type of their binding Gα protein is Gi.
  • The binding Gα proteins of GPR5, GPR13, GPR14, GPR16, GPR24, and APJ were predicted. From the above, it can be seen that the present invention makes it possible to predict the binding Gα protein with high accuracy.
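
The GPR5 example amounts to a lookup: find the GPCRs with known binding Gα protein that carry the same residues at a determining residue pair, then read off their Gα type. The sketch below illustrates that idea under stated assumptions; `predict_type` and the toy records are hypothetical, the toy sequences are only four residues long, and positions (2, 3) stand in for the pair (177, 178) of the real alignment.

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple

# (name, aligned sequence, known binding G-alpha type); hypothetical toy records
KnownGPCR = Tuple[str, str, str]


def predict_type(
    known: List[KnownGPCR],
    query_residues: Sequence[str],
    positions: Sequence[int],
) -> Tuple[Optional[str], int, int]:
    """Return (most frequent type, its count, total matches) among known GPCRs
    whose residues at the given 1-based alignment positions equal the query's."""
    votes = Counter(
        g_type
        for _, seq, g_type in known
        if tuple(seq[p - 1] for p in positions) == tuple(query_residues)
    )
    if not votes:
        return None, 0, 0
    best, count = votes.most_common(1)[0]
    return best, count, sum(votes.values())


toy_known = [
    ("r1", "ARSG", "Gi"),
    ("r2", "MRSK", "Gi"),
    ("r3", "AKTG", "Gs"),
]

# Query receptor with R and S at the determining pair, as in the GPR5 example.
print(predict_type(toy_known, ("R", "S"), (2, 3)))
```

Returning the match count alongside the predicted type mirrors the way the text reports agreement as a ratio (for example, nine of nine GPCRs sharing R and S being Gi).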
  • TGR23 which is the G protein-coupled receptor protein represented by SEQ ID NO: 37 and SEQ ID NO: 39
  • FIG. 4 shows the results of sequence alignment of TGR23-1 and bovine rhodopsin.
  • The amino acids at residue pairs (86, 90), (209, 211) and (86, 236) in TGR23-1 are (Q, L), (D, F) and (Q, N), respectively.
  • N the number of 1152 types of GPCRs in which the binding molecule-determining residue coincided with TGR 23 and the type of ligand were the same
  • 52/52 and the combined evaluation of (86, 90) and (86, 236) SDM pairs with a wide range of estimated GPCR numbers estimated that the ligand type was peptide.
  • The following reference examples show that the ligand of TGR23 (TGR23-1 and TGR23-2) is actually a peptide. [Reference Example 1]
  • A substance exhibiting ligand activity specific to TGR23-2 was purified from rat whole brain, using cAMP production promoting activity on CHO cells expressing TGR23-2 as an index.
  • High performance liquid chromatography (HPLC) fractions of rat whole brain extract were prepared by the method described below. Whole brains (400 g, from 200 animals) of 8-week-old Wistar rats purchased from Charles River Japan Co., Ltd. were removed sequentially and immediately placed, 25 animals at a time, into boiling distilled water (300 ml) and boiled for 10 minutes. Immediately after boiling, the material was cooled on ice, the batches from all 200 animals were combined (2.4 L), 180 ml of acetic acid was added to a final concentration of 1.0 M, and the mixture was homogenized at low temperature using a Polytron (10,000 rpm, 2 minutes). The homogenate was centrifuged (8,000 rpm, 30 minutes), and the supernatant was collected.
  • the column was washed with 400 ml of 1.0 M acetic acid and eluted with 500 ml of 60% acetonitrile containing 0.1% trifluoroacetic acid.
  • the eluate was concentrated under reduced pressure to remove the solvent, and the concentrate was freeze-dried.
  • The obtained white powder (1.2 g) was dissolved in 30 ml of 10% acetonitrile containing 0.1% trifluoroacetic acid, and 12.5 ml portions were subjected to preparative HPLC on an ODS column (Tosoh, TSKgel ODS-80TS (2.5 ⁇ × 300 mm)) using gradient elution from 10% to 60% acetonitrile containing 0.1% trifluoroacetic acid.
  • The HPLC was performed twice; in each run the eluate was divided into 60 fractions, one every 2 minutes, and the fractions from the two runs were combined. Each fraction was concentrated and dried under reduced pressure, 0.4 ml of dimethyl sulfoxide (DMSO) was added to the residue, and the residue was completely dissolved using a vortex mixer and an ultrasonic cleaner.
  • The DMSO solution of each HPLC fraction obtained above was administered to CHO cells expressing TGR23-2 according to the method described in Reference Example 3, and the amount of intracellular cAMP production was measured. As a result, fraction Nos. 18, 20, and 22 to 23 showed remarkable cAMP production promoting activity.
  • a similar sample was examined for arachidonic acid metabolite releasing activity according to a known method. As a result, remarkable activity was confirmed.
  • the three active fractions obtained were further purified by the following methods (a) to (c), respectively.
  • Receptor-specific intracellular calcium release activity was also confirmed using FLIPR (Molecular Devices). Therefore, in confirming the activity in the subsequent purification steps, intracellular calcium release activity measured by FLIPR was used as the indicator, and it was confirmed as appropriate that the fractions showing this activity also exhibited cAMP production promoting activity.
  • Fraction No. 18 was dissolved in 10 ml of 10 mM ammonium formate containing 10% acetonitrile, applied to a cation exchange column (Tosoh, TSKgel SP-5PW (20 mm φ × 150 mm)), and then eluted with a concentration gradient of 10 mM to 1.0 M ammonium formate containing 10% acetonitrile. The activity was recovered at around 0.4 M ammonium formate.
  • The obtained active fraction was lyophilized, dissolved in 0.1 ml of DMSO, mixed with 0.7 ml of 10% acetonitrile containing 0.1% heptafluorobutyric acid, applied to an ODS column (Wako Pure Chemical Industries, Wakosil-II 3C18HG (2. ⁇ ⁇ × 150 mm)), and eluted with a concentration gradient of 10% to 37.5% acetonitrile containing 0.1% heptafluorobutyric acid. The eluate was manually collected for each peak. The activity was observed at around 26% acetonitrile.
  • To the active fraction, 0.7 ml of 10% acetonitrile containing 0.1% trifluoroacetic acid was further added, and the mixture was applied to an ODS column (Wako Pure Chemical Industries, Wakosil-II 3C18HG) and eluted with a gradient of 10% to 20% acetonitrile containing 0.1% trifluoroacetic acid; the eluate was manually collected for each peak. The activity was obtained as a single peak at around 11% acetonitrile. The structure of the active substance contained in this fraction was determined as shown in Reference Example 5 below.
  • The resulting active fraction was lyophilized, dissolved in 0.1 ml of DMSO, mixed with 0.7 ml of 10% acetonitrile containing 0.1% trifluoroacetic acid, applied to an ODS column (Wako Pure Chemical Industries, Wakosil-II 3C18HG (2. ⁇ × 150 mm)), and eluted with a concentration gradient of 10% to 20% acetonitrile containing 0.1% trifluoroacetic acid; the eluate was manually collected for each peak. The activity was obtained as a single peak at around 15% acetonitrile.
  • Fraction Nos. 22-23 were dissolved in 10 ml of 10 mM ammonium formate containing 10% acetonitrile, applied to a cation exchange column (Tosoh, TSKgel SP-5PW (2 ⁇ × 150 mm)), and eluted with a concentration gradient of 10 mM to 1.0 M ammonium formate containing 10% acetonitrile. The activity was recovered at around 0.4 M ammonium formate. The active fraction was freeze-dried, dissolved in 0.8 ml of 10% acetonitrile containing 0.1% trifluoroacetic acid, and applied to a CN column (Nomura Chemical, Develosil CN-UG-5 (4. ⁇ × 250 mm)).
  • The eluate was manually collected for each peak. Activity was observed at around 16% acetonitrile.
  • To the active fraction, 0.7 ml of 10% acetonitrile containing 0.1% heptafluorobutyric acid was further added, and the mixture was applied to an ODS column (Wako Pure Chemical Industries, Wakosil-II 3C18HG). Elution was performed with a gradient of 10% to 37.5% acetonitrile containing heptafluorobutyric acid, and the eluate was manually collected for each peak. The activity was obtained as a single peak at around 28% acetonitrile.
  • the structure of the active substance contained in this fraction was determined as shown in Reference Example 4 below.
  • Pronase (protease Type XIV (P5147)) was used to determine whether the active substance was proteinaceous.
  • The HPLC active fractions of the rat whole brain extract (fraction Nos. 18, 20, and 22 to 23), 4 µl each, were added to 100 µl of 0.2 M ammonium acetate, and 3 µg of pronase was added. After incubation at 37 °C for 2 hours, the added pronase was inactivated by heating in boiling water for 10 minutes. Distilled water (1 ml) containing 0.05 mg of BSA and 0.05 mg of CHAPS was added, and the mixture was freeze-dried. The freeze-dried sample was added to CHO cells expressing TGR23-2 according to a known method, and the activity of promoting intracellular cAMP production was measured.
  • any of the active substances exhibiting an intracellular cAMP production promoting activity on CHO cells expressing TGR23-2 in a rat whole brain extract is a protein or a peptide.
  • Thermo Finnigan LCQ ion trap mass spectrometer (ThermoQuest) equipped with a nanospray ion source (Protana).
  • A mass value corresponding to that calculated from the amino acid sequence of SEQ ID NO: 1 was obtained (measured value: 1954.9; calculated value: 1954.2).
  • The active substance that specifically exhibits cAMP production promoting activity on TGR23-2-expressing CHO cells, obtained from fraction number 20 of rat whole brain extract, was determined to have the amino acid sequence shown in SEQ ID NO: 1.
  • The active substance that specifically exhibits cAMP production promoting activity on TGR23-2-expressing CHO cells, obtained from fraction numbers 22 to 23 of rat whole brain extract, was determined to have the amino acid sequence shown in SEQ ID NO: 2.
  • The same eluate was used to perform mass spectrometry using a Thermo Finnigan LCQ ion trap mass spectrometer (ThermoQuest) equipped with a nanospray ion source (Protana).
  • The mass was compared with the value calculated from the amino acid sequence of SEQ ID NO: 3. The following mass values were obtained (measured value: 144.1; calculated value: 1423.6).
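
The comparison of a measured mass with a value calculated from an amino acid sequence can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: it sums standard average residue masses and adds one water molecule for the free termini, whereas the specification does not state which mass convention (average or monoisotopic, neutral or protonated) was used. The function name and the example peptide are hypothetical; the actual sequences of SEQ ID NO: 1 to 3 are not reproduced here.

```python
# Standard average residue masses (Da) of the 20 common amino acids.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # H2O added once for the free N- and C-termini


def peptide_average_mass(sequence: str) -> float:
    """Average mass of a linear, unmodified peptide given in one-letter code."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


# Hypothetical peptide, used only to show the call.
print(round(peptide_average_mass("GASK"), 2))
```

A measured value that differs from the calculated one by less than about one mass unit, as with 1954.9 against 1954.2, is the kind of agreement the text relies on when assigning a sequence to the purified substance.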
  • the active substance that specifically exhibits cAMP production promoting activity on TGR23-2-expressing CHO cells, obtained from fraction number 18 of rat whole brain extract
  • To clone the cDNA encoding the precursor of the human homolog of the rat TGR23-2 ligand (sometimes referred to as human TGR23-2 ligand in this specification), PCR was performed using cDNA as a template. Using the following synthetic DNA primers, cDNA derived from the human hypothalamus was used as a template and amplified by the PCR method.
  • The reaction solution consisted of 0.8 µl of human hypothalamus Marathon-Ready cDNA (CLONTECH), 1.0 µM each of the synthetic DNA primers of SEQ ID NO: 4 and SEQ ID NO: 5, 0.2 mM dNTPs, 0.1 µl of ExTaq (Takara Shuzo), and the ExTaq buffer attached to the enzyme, in a total reaction volume of 20 µl.
  • The amplification was performed using a thermal cycler (PE Biosystems): 94 °C for 300 seconds, followed by 35 cycles of 94 °C for 10 seconds, 55 °C for 30 seconds, and 72 °C for 30 seconds, and finally 72 °C for 5 minutes.
  • Next, 2 µl of the PCR reaction solution diluted 50-fold with DNase- and RNase-free distilled water, 1.0 µM each of the synthetic DNA primers of SEQ ID NO: 4 and SEQ ID NO: 6, 0.2 mM dNTPs, 0.1 µl of ExTaq polymerase (Takara Shuzo), and the ExTaq buffer attached to the enzyme were combined to a total reaction volume of 20 µl. Using a thermal cycler (PE Biosystems), the mixture was heated at 94 °C for 300 seconds, then subjected to 35 cycles of 94 °C for 10 seconds, 55 °C for 30 seconds, and 72 °C for 30 seconds, and finally kept at 72 °C for 5 minutes.
  • Because the nucleotide sequence of the DNA represented by SEQ ID NO: 7 contained a frame encoding an amino acid sequence very similar to the amino acid sequences of rat TGR23-2 ligand obtained from rat whole brain, represented by SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3, it was presumed to be a cDNA encoding the precursor of human TGR23-2 ligand or a part thereof.
  • An ATG which is predicted to be the initiation codon for protein translation, is located 5 'upstream of the amino acid sequence translated from SEQ ID NO: 7 in a frame encoding an amino acid sequence considered to be a human TGR 23-2 ligand.
  • The amino acid sequence of the human TGR23-2 ligand precursor deduced as described above is shown in SEQ ID NO: 8.
  • The amino acid sequence of human TGR23-2 ligand was presumed to be the amino acid sequences represented by SEQ ID NO: 9 [human TGR23-2 ligand (1-18)], SEQ ID NO: 10 [human TGR23-2 ligand (1-15)] and SEQ ID NO: 11 [human TGR23-2 ligand (1-14)], which correspond respectively to the amino acid sequences of rat TGR23-2 ligand obtained from rat whole brain extract, namely SEQ ID NO: 1 [rat TGR23-2 ligand (1-18)], SEQ ID NO: 2 [rat TGR23-2 ligand (1-15)] and SEQ ID NO: 3 [rat TGR23-2 ligand (1-14)], and further the amino acid sequence represented by SEQ ID NO: 12 [human TGR23-2 ligand (1-20)], in which SEQ ID NO: 9 is extended by two residues on the C-terminal side.
  • Unlike the sequences of the TGR23-2 ligand and the rat TGR23-2 ligand, the sequence of human TGR23-2 ligand has a Gln-Arg sequence instead of an Arg-Arg sequence, so the 16-residue amino acid sequence shown in SEQ ID NO: 26 [human TGR23-2 ligand (1-16)] was also deduced to be a ligand sequence.
  • To clone the cDNA encoding the precursor of the mouse homologue of the rat TGR23-2 ligand obtained from rat whole brain extract (referred to as mouse TGR23-2 ligand in this specification), PCR was performed using cDNA from mouse whole brain as a template.
  • Amplification by PCR was performed using the following synthetic DNA primers and cDNA from mouse whole brain as a template.
  • The reaction solution consisted of 0.8 µl of mouse whole brain Marathon-Ready cDNA (CLONTECH), 1.0 µM each of the synthetic DNA primers of SEQ ID NO: 13 and SEQ ID NO: 14, 0.2 mM dNTPs, 0.1 µl of ExTaq (Takara Shuzo), and the ExTaq buffer attached to the enzyme, in a total reaction volume of 20 µl.
  • The amplification was performed using a thermal cycler (PE Biosystems): 94 °C for 5 minutes, followed by cycles of 94 °C for 10 seconds, 65 °C for 30 seconds, and 72 °C for 30 seconds.
  • The cycle was repeated 5 times, and the mixture was finally kept at 72 °C for 5 minutes.
  • Next, 2 µl of the PCR reaction solution diluted 100-fold with DNase- and RNase-free distilled water, 1.0 µM each of the synthetic DNA primers of SEQ ID NO: 13 and SEQ ID NO: 15, 0.2 mM dNTPs, 0.1 µl of ExTaq polymerase (Takara Shuzo), and the ExTaq buffer attached to the enzyme were combined to a total reaction volume of 20 µl. The mixture was heated at 94 °C for 5 minutes, then subjected to 30 cycles of 94 °C for 10 seconds, 60 °C for 30 seconds, and 72 °C for 30 seconds, and finally kept at 72 °C for 5 minutes.
  • Because the nucleotide sequence of the DNA represented by SEQ ID NO: 16 contained a frame encoding an amino acid sequence very similar to the amino acid sequences of rat TGR23-2 ligand obtained from rat whole brain, represented by SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3, it was presumed to be a cDNA encoding the precursor of mouse TGR23-2 ligand or a part thereof.
  • SEQ ID NO: 17 shows the amino acid sequence of the mouse TGR23-2 ligand precursor deduced as described above.
  • On the N-terminal side of the amino acid sequence considered to correspond to mouse TGR23-2 ligand, there is a Lys-Arg sequence, at which bioactive peptides are usually considered to be cleaved from their precursor proteins (Seidah, N.G. et al., Ann. NY Acad. Sci., 839, 9-24, 1998).
  • A termination codon was present on the C-terminal side, but two more residues were present between it and the sequence corresponding to the rat TGR23-2 ligand of SEQ ID NO: 1.
  • The amino acid sequence of mouse TGR23-2 ligand was presumed to be the amino acid sequences represented by SEQ ID NO: 18 [mouse TGR23-2 ligand (1-18)], SEQ ID NO: 19 [mouse TGR23-2 ligand (1-15)] and SEQ ID NO: 20 [mouse TGR23-2 ligand (1-14)], which correspond respectively to the amino acid sequences of rat TGR23-2 ligand obtained from rat whole brain extract, namely SEQ ID NO: 1 [rat TGR23-2 ligand (1-18)], SEQ ID NO: 2 [rat TGR23-2 ligand (1-15)] and SEQ ID NO: 3 [rat TGR23-2 ligand (1-14)], and further the amino acid sequence represented by SEQ ID NO: 21 [mouse TGR23-2 ligand (1-20)], in which SEQ ID NO: 18 is extended by 2 residues on the C-terminal side.
  • PCR was performed using cDNA from rat whole brain as a template.
  • cDNA derived from rat whole brain was used as a template and amplified by the PCR method.
  • The reaction solution consisted of 0.8 µl of rat whole brain Marathon-Ready cDNA (CLONTECH), 1.0 µM each of the synthetic DNA primers of SEQ ID NO: 22 and SEQ ID NO: 14, 0.2 mM dNTPs, 0.1 µl of ExTaq (Takara Shuzo), and the ExTaq buffer attached to the enzyme, in a total reaction volume of 20 µl.
  • The amplification was performed using a thermal cycler (PE Biosystems): 94 °C for 5 minutes, followed by 35 cycles of 94 °C for 10 seconds, 65 °C for 30 seconds, and 72 °C for 30 seconds, and finally 72 °C for 5 minutes.
  • Next, 2 µl of the PCR reaction solution diluted 100-fold with DNase- and RNase-free distilled water, 1.0 µM of the primer of SEQ ID NO: 22, 2 µM of the synthetic DNA primer of SEQ ID NO: 15, 0.2 mM dNTPs, 0.1 µl of ExTaq polymerase (Takara Shuzo), and the ExTaq buffer supplied with the enzyme were combined to a total reaction volume of 20 µl, and a thermal cycler (PE Biosystems) was used.
  • Clones having a cDNA insert were selected on LB agar medium containing ampicillin and X-gal; only the white clones were picked with a sterilized toothpick to obtain transformants. Each clone was cultured overnight in LB medium containing ampicillin, and plasmid DNA was prepared using QIAwell 8 Plasmid Kit (Qiagen).
  • The reaction for determining the nucleotide sequence was performed using the BigDye Terminator Cycle Sequencing Ready Reaction Kit (PE Biosystems), followed by decoding using a fluorescent automatic sequencer to obtain the DNA sequence represented by SEQ ID NO: 23.
  • The nucleotide sequence of the DNA represented by SEQ ID NO: 23 contained a frame encoding the amino acid sequences of rat TGR23-2 ligand obtained from rat whole brain, represented by SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3. When the DNA sequence was translated using this frame as the reading frame, the amino acid sequence represented by SEQ ID NO: 24 was obtained. Comparison of this sequence with the amino acid sequence of the mouse TGR23-2 ligand precursor obtained in Reference Example 7 (SEQ ID NO: 16) suggested that it corresponds to the 54 amino acids on the C-terminal side, which form a part of the rat TGR23-2 ligand precursor.
  • a termination codon was present downstream of the sequence encoding rat TGR 23-2 ligand.
  • the N-terminal side of the amino acid sequence of rat TGR 23-2 ligand has a Lys-Arg sequence (Seidah, N.G. et al., Ann. NY Acad. Sci., 839, 9-24, 1998).
  • A termination codon was present on the C-terminal side, but two more residues were present between it and the sequence of the rat TGR23-2 ligand of SEQ ID NO: 1.
  • The amino acid sequence of rat TGR23-2 ligand was presumed to be the amino acid sequences obtained from rat whole brain extract, represented by SEQ ID NO: 1 [rat TGR23-2 ligand (1-18)], SEQ ID NO: 2 [rat TGR23-2 ligand (1-15)] and SEQ ID NO: 3 [rat TGR23-2 ligand (1-14)], and further the amino acid sequence represented by SEQ ID NO: 25 [rat TGR23-2 ligand (1-20)], in which SEQ ID NO: 1 is extended by 2 residues on the C-terminal side.
  • Amplification by PCR was performed using cDNA derived from rat whole brain as a template.
  • The reaction solution consisted of 0.8 µl of rat whole brain Marathon-Ready cDNA (CLONTECH), 1.0 µM each of the synthetic DNA primers of SEQ ID NO: 27 and SEQ ID NO: 28, 0.2 mM dNTPs, 0.1 µl of ExTaq (Takara Shuzo), and the ExTaq buffer attached to the enzyme, in a total reaction volume of 20 µl.
  • The amplification was performed using a thermal cycler (PE Biosystems): 94 °C for 5 minutes, followed by 35 cycles of 94 °C for 10 seconds, 65 °C for 30 seconds, and 72 °C for 30 seconds, and finally 72 °C for 5 minutes.
  • Next, 2 µl of the PCR reaction solution diluted 50-fold with DNase- and RNase-free distilled water, 1.0 µM of the primer of SEQ ID NO: 29, 0.2 µM of the synthetic DNA primer of SEQ ID NO: 28, 0.2 mM dNTPs, 0.1 µl of ExTaq polymerase (Takara Shuzo), and the ExTaq buffer attached to the enzyme were combined to a total reaction volume of 20 µl.
  • Each clone was cultured overnight in LB medium containing ampicillin, and plasmid DNA was prepared using QIAwell 8 Plasmid Kit (Qiagen).
  • the reaction for determining the nucleotide sequence was carried out using a BigDye Terminator Cycle Sequencing Ready Reaction Kit (PE Biosystems), and was decoded using a fluorescent automatic sequencer to obtain a DNA sequence represented by SEQ ID NO: 30.
  • the nucleotide sequence of cDNA represented by SEQ ID NO: 30 is a further 5 'of the DNA sequence (SEQ ID NO: 23) encoding a part of the rat TGR23-2 ligand precursor obtained in Reference Example 8. The sequence was extended to the side. This sequence is constructed using a frame encoding an amino acid sequence corresponding to the amino acid sequence of rat TGR 23-2 ligand obtained from whole rat brain represented by SEQ ID NO: 1, SEQ ID NO: 2 or SEQ ID NO: 3.
  • Human TGR23-1 is sometimes simply referred to as TGR23-1.
  • Using plasmid pTB2173, which contains a DNA fragment having the nucleotide sequence represented by SEQ ID NO: 38 encoding TGR23-1, as a template, a PCR reaction was performed with primer 1 (SEQ ID NO: 32), to which a SalI recognition sequence had been added, and primer 2 (SEQ ID NO: 33), to which a SpeI recognition sequence had been added.
  • The composition of the reaction solution was as follows: 10 ng of the above plasmid as a template, Pfu Turbo DNA Polymerase (Stratagene) 2.5 U, primer 1 (SEQ ID NO: 32) and primer 1 (SEQ ID NO: 2).
  • coli TOP 10 (Invitrogen), and a clone containing the cDNA of TGR23-1 contained in pTB2173 was selected in an LB agar medium containing kanamycin.
  • the insert DNA was excised from an agarose gel after electrophoresis, and then recovered using a Gel Extraction Kit (Qiagen).
  • This insert DNA and the animal cell expression vector plasmid pAKKO-111H (the same vector plasmid as pAKKO1.11H described in Hinuma, S. et al., Biochim. Biophys. Acta, Vol. 1219, pp. 251-259 (1994)), which had been cut with SalI and SpeI, were ligated using DNA Ligation Kit ver. 2 (Takara Shuzo) to construct the protein expression plasmid pAKKO-TGR23-1. After culturing E. coli TOP10 transformed with this pAKKO-TGR23-1, plasmid DNA of pAKKO-TGR23-1 was prepared using Plasmid Miniprep Kit (Bio-Rad).
  • Hamster CHO/dhfr- cells were seeded (on the order of 10^5 cells) in a Falcon dish (3.5 cm in diameter) in α-MEM medium (with ribonucleosides and deoxyribonucleosides, GIBCO, Cat. No. 12571) containing 10% fetal serum, and cultured overnight at 37 °C in a 5% CO2 incubator.
  • The expression plasmid pAKKO-TGR23-1 (2 µg) was transfected using the transfection reagent FuGENE 6 (Roche) according to the method described in the attached instruction manual. After culturing for 18 hours, the medium was replaced with fresh growth medium.
  • The transfected cells were collected by trypsin-EDTA treatment and seeded into 10 flat-bottom 96-well plates using selection medium (α-MEM medium (without ribonucleosides and deoxyribonucleosides, GIBCO, Cat. No. 12561) containing 10% dialyzed fetal calf serum). Culture was continued while changing the selection medium every 3 to 4 days, and after 2 to 3 weeks, 81 clones of DHFR+ cells that had grown as colonies were obtained.
  • RNAs were prepared using an RNeasy 96 Kit (Qiagen). A reverse transcription reaction was performed on 50 to 200 ng of the obtained total RNA using a TaqMan Gold RT-PCR Kit (PE Biosystems).
  • The standard cDNA was prepared by measuring the absorbance at 260 nm of plasmid pTB2174, which contains the DNA fragment having the nucleotide sequence represented by SEQ ID NO: 40, calculating the concentration and the exact copy number, and then diluting with a 10 mM Tris-HCl (pH 8.0) solution containing 1 mM EDTA to prepare standard cDNA solutions ranging from 2 copies to 2 × 10^6 copies.
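
The conversion from an absorbance reading to a copy number, and the dilution series for the standards, can be sketched as follows. This is an illustration under common rule-of-thumb assumptions that the specification does not itself state: an A260 of 1.0 is taken as about 50 ng/µl of double-stranded DNA, and the average molar mass is taken as about 650 g/mol per base pair. The function names, the plasmid length, and the 10-fold step are hypothetical.

```python
AVOGADRO = 6.02214076e23     # molecules per mole
NG_PER_UL_PER_A260 = 50.0    # rule of thumb for double-stranded DNA (assumption)
G_PER_MOL_PER_BP = 650.0     # approximate average mass of one base pair (assumption)


def copies_per_ul(a260: float, plasmid_bp: int, dilution_factor: float = 1.0) -> float:
    """Estimate plasmid copies per microlitre from an A260 reading."""
    ng_per_ul = a260 * NG_PER_UL_PER_A260 * dilution_factor
    grams_per_ul = ng_per_ul * 1e-9
    return grams_per_ul * AVOGADRO / (plasmid_bp * G_PER_MOL_PER_BP)


def tenfold_standards(highest: int = 2_000_000, lowest: int = 2) -> list:
    """Copy numbers of a 10-fold dilution series from highest down to lowest."""
    levels = []
    copies = highest
    while copies >= lowest:
        levels.append(copies)
        copies //= 10
    return levels


# Hypothetical reading: A260 = 0.02 for a 1000 bp construct.
print(f"{copies_per_ul(0.02, 1000):.3e} copies/µl")
print(tenfold_standards())
```

With a known stock copy number, each standard is then obtained by diluting the stock until the desired number of copies is present in the volume added to the reaction.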
  • The probe and primers for TaqMan PCR were designed using Primer Express (Version 1.0) (PE Biosystems). The expression level was calculated using ABI PRISM 7700 SDS software.
  • the number of cycles at the moment when the fluorescence intensity of the reporter reached the set value was plotted on the vertical axis, and the logarithmic value of the initial concentration of the standard cDNA was plotted on the horizontal axis, to create a standard curve.
  • the initial concentration of each reverse transcript product was calculated from the standard curve, and the amount of TGR23-1 gene expression per total RNA of each clone was determined.
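
The standard-curve step can be sketched as a straight-line fit of threshold cycle against the logarithm of the initial copy number, followed by inversion of that line for an unknown sample. This is a minimal sketch assuming an ordinary least-squares fit; the specification states only that the ABI PRISM 7700 SDS software was used, not how it fits the curve. The function names and the synthetic standards (slope -3.3, intercept 38.0) are hypothetical.

```python
import math
from typing import List, Tuple


def fit_standard_curve(copies: List[float], ct: List[float]) -> Tuple[float, float]:
    """Least-squares line ct = slope * log10(copies) + intercept."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(ct) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, ct))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct_value: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to estimate the initial copy number."""
    return 10 ** ((ct_value - intercept) / slope)


# Synthetic standards from 2 to 2 x 10^6 copies (hypothetical Ct values).
standard_copies = [2 * 10 ** k for k in range(7)]
standard_ct = [38.0 - 3.3 * math.log10(c) for c in standard_copies]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
sample_ct = 28.1  # hypothetical Ct of a reverse transcription product
print(round(copies_from_ct(sample_ct, slope, intercept)))
```

Dividing the estimated copy number by the amount of total RNA used in the reverse transcription gives the expression level per total RNA that is compared between clones.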
  • CHO cell lines having high expression of TGR23-1 were selected and cultured in a 24-well plate. For these cells, the expression level of TGR23-1 was re-examined.
  • Total RNA was prepared using RNeasy Mini Kits (Qiagen), and then treated with RNase-free DNase Set (Qiagen).
  • A reverse transcription reaction was performed on the obtained total RNA in the same manner as described above, and the expression level of the TGR23-1 gene per total RNA of each clone was determined by TaqMan PCR. As a result, CHO cell line clones 49 and 52 expressing TGR23-1 were found to show high expression levels.
  • Boc-Ser(Bzl)-OCH2-PAM resin was placed in the reaction vessel of the peptide synthesizer ACT90 and swollen in DCM, after which the Boc group was removed with TFA and the resin was neutralized with DIEA. This resin was suspended in NMP, and Boc-Lys(Cl-Z) was condensed with HOBt-DIPCI. After the reaction, the presence of free amino groups was examined by a ninhydrin test. When the ninhydrin test was positive, the same amino acid was condensed again. When the ninhydrin test was still positive after the recondensation, acetylation was performed with acetic anhydride.
  • The human TGR23-2 ligand (1-20) obtained in Reference Example 12 was administered at various concentrations to TGR23-1-expressing CHO cells and TGR23-2-expressing CHO cells according to a known method.
  • Human TGR23-2 ligand (1-20) was found to increase the intracellular Ca ion concentration of TGR23-1-expressing CHO cells and TGR23-2-expressing CHO cells in a concentration-dependent manner. The results are shown in FIGS.
  • It is clear that the polypeptide having the amino acid sequence represented by SEQ ID NO: 12 has activity to increase the intracellular Ca ion concentration of TGR23-1- and TGR23-2-expressing CHO cells.
  • According to the present invention, it becomes possible to predict a binding molecule or a type of binding molecule merely by obtaining information on the amino acid sequence of a protein whose binding molecule is unknown (and/or a sequence alignment obtained using that amino acid sequence). This makes it possible to predict binding molecules (ligands, etc.) much faster than conventional molecular modeling methods, which predict even three-dimensional structures.
  • According to the present invention, it is possible to predict the binding molecule or the type of binding molecule for various proteins whose binding molecules are unknown. Further, the binding molecule or the type of binding molecule can be predicted more easily and quickly than by testing experimentally whether a given binding molecule actually binds to the protein whose binding molecule is unknown.
  • binding molecule or the type of the binding molecule of a binding molecule unknown protein such as an orphan G protein-coupled receptor. It is possible to easily produce a prophylactic or therapeutic drug for a disease or the like involving a binding molecule unknown protein.

Abstract

A method for estimating a ligand, or a ligand type, that directly binds (or couples) to a GPCR on the basis of GPCR sequence data and, in turn, estimating its function. More specifically, a method of presuming a ligand of a protein with an unknown ligand, which comprises: obtaining classification data of proteins with known ligands, in which the alignments of the proteins with known ligands correspond respectively to the ligands or ligand types; obtaining ligand-determining residue-ligand classification data, which show the correlations between ligand-determining residues and ligands or ligand types; obtaining the alignments of proteins with unknown ligands; applying the data concerning at least the ligand-determining residues in the alignments of the proteins with unknown ligands to the ligand-determining residue-ligand classification data; and thus estimating the ligand or ligand type of the protein with the unknown ligand, etc.

Description

Specification

Method for Predicting Binding Molecules and Method of Using the Same

Technical Field
The present invention relates to a method for predicting a binding molecule of a protein whose binding molecule is unknown, a method for producing a medicine using the prediction method, and a computer for predicting a binding molecule of a protein whose binding molecule is unknown. More specifically, it relates to a method for predicting the binding molecule, or the type of binding molecule, of a protein whose binding molecule is unknown from information on the amino acid sequences (or sequence alignments) of proteins whose binding molecules are known and on their binding molecules or types of binding molecule; a medicine obtained using the method; and a computer used for the method. The present invention further relates to a method for predicting the function of a protein whose binding molecule is unknown through prediction of its binding molecule or type of binding molecule.

Background Art
The sequence information of the human genome has been published (Nature, Vol. 409, No. 6822 (2001); Science, Vol. 291, No. 5507 (2001)), and the existence of 30,000 to 40,000 genes, including known genes, has been reported. However, for many of these genes the relationship between the gene and its intracellular function is unknown. In particular, information on genes that cause diseases is not only important for clarifying their pathology but also indispensable for developing medicines used for their diagnosis, prevention, and treatment.
Molecules that can be targets for drug development are molecules that bind directly or indirectly to other molecules when a medicine containing the molecule enters the body, and in most cases they are considered to be proteins. Such proteins include proteins that bind low-molecular-weight substances, enzymes that have catalytic activity on low-molecular-weight or high-molecular-weight substrates, and proteins that bind high-molecular-weight substances. Among these, a very large number of pharmaceuticals target proteins that bind low-molecular-weight substances. Proteins of particular interest as target molecules include G protein-coupled receptor proteins, which are membrane proteins involved in signal transduction.
In the case of membrane proteins, signal transduction from outside the cell to inside the cell proceeds when a membrane protein having a transmembrane domain undergoes a structural change in that transmembrane domain. The above-mentioned G protein-coupled receptor protein (hereinafter sometimes abbreviated as GPCR) is a membrane protein involved in signal transduction that has a seven-transmembrane domain in its molecule. A GPCR has a topology in which the N-terminus lies outside the cell and the C-terminus lies inside the cell after seven transmembrane domains, and a ligand binds to the N-terminal region or to the transmembrane region depending on its type. It is also known that the G protein binds to the intracellular C-terminal portion and to intracellular loop 2. To date, rhodopsin is the only member of the GPCR family whose crystal structure has been analyzed (Palczewski et al., Science, Vol. 289, pp. 739-745 (2000)); there are no reports on crystal structure analysis of other GPCRs. Various biological experiments have also shown that GPCR activation occurs as follows: (i) a ligand binds to the receptor and the structure of the GPCR changes; (ii) the structural change of the GPCR is transmitted to the coupled G protein, and part of the G protein is released; (iii) information (a signal) is transmitted into the cell.
A molecule that specifically binds to a receptor is called a ligand. In the case of GPCRs, known ligand molecules include biogenic amines such as dopamine, serotonin, melatonin, and histamine; lipid derivatives such as prostaglandins and leukotrienes; nucleic acids; amino acids such as glutamine; and physiologically active peptides such as angiotensin, secretin, and somatostatin.
The G protein functions as a transducer in the signal transduction system and is composed of three subunits: α, β, and γ. Of these, the α subunit binds GTP and has GTP-hydrolyzing activity, and it is the subunit specific to each G protein. That is, the function of a GPCR is determined by the type of Gα protein with which it couples. To date, Gs, Gi, Go, and Gt have been isolated and purified as Gα proteins, and their properties have been investigated.
G protein-coupled receptor proteins are present on the surfaces of the functional cells of living cells and organs, and play physiologically important roles as targets of molecules that regulate the functions of those cells and organs, such as hormones, neurotransmitters, and physiologically active substances. A receptor transmits a signal into the cell through binding to a physiologically active substance, and this signal triggers various responses such as activation or suppression of the cell.
Clarifying the relationship between substances that regulate complex functions in various cells and organs and their specific receptor proteins, in particular G protein-coupled receptor proteins, will elucidate those complex functions and provide a very important means for developing medicines closely related to them.
In recent years, research that randomly analyzes cDNA sequences has been actively conducted as a means of analyzing genes expressed in vivo, and the resulting cDNA fragment sequences have been registered in databases and published as Expressed Sequence Tags (ESTs). However, many ESTs provide sequence information only, and it is difficult to infer from them the function that the sequence has.
Not all G protein-coupled receptors have been found. Even now there remain many unknown G protein-coupled receptors and so-called orphan receptors, for which the corresponding ligand has not been identified, and the search for new G protein-coupled receptors and the elucidation of their functions are eagerly awaited.
With the publication of genome sequence information, it has become possible to search for gene regions that can be presumed to have some function by comparing them with known sequence information. Binding molecules and functions of novel proteins are being predicted on the basis of similarity between the nucleotide sequences of genes or between the amino acid sequences of the proteins they encode. Software such as Clustal W and BLAST is used to calculate sequence similarity and to display an appropriate correspondence between sequences (a sequence alignment). Using sequence alignments obtained from such software, function is predicted by similarity analysis, on the premise that a novel protein has a function similar to that of the proteins it resembles. Prediction based on such similarity contributes to predicting basic function, but it is of little use for predicting binding molecules within a group of proteins that share the same basic function. For example, it helps classify whether a novel protein is a GPCR, but unless the similarity is extremely high it is unsuited to predicting the ligand that binds to that GPCR or the type of G protein that couples with it. To search for and determine an unknown ligand molecule that actually binds to a GPCR, the activity of many candidate compounds must be measured using cell extracts and the like, which raises the problem that candidate binding molecules with different physical properties cannot be evaluated at the same time. If the type of ligand molecule could be predicted without such complicated activity measurements, the ligand molecule could be determined efficiently.
It is therefore necessary to treat protein sequence information under a concept different from similarity. If the molecule that binds to each protein could be predicted for each member of a protein group classified by similarity (for example, GPCRs), this would be useful for identifying ligands and for analyzing the functions of the protein and its ligand. To that end, studies have also been made in which the three-dimensional structure is predicted from the one-dimensional sequence information of the protein and the binding molecule is predicted by binding experiments performed on a computer. At present, however, neither the prediction of protein three-dimensional structure nor the computational binding experiments are accurate enough to be practical. Being able to examine binding molecules and functions directly from sequence information, avoiding this low accuracy and complexity, is useful in the technical field of the present invention.
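As a rough illustration of the similarity measure discussed above, the following minimal Python sketch computes percent identity between two rows of a sequence alignment. The function name `percent_identity`, the use of `-` as the gap character, and the choice to skip gapped columns are assumptions made for this illustration only; tools such as Clustal W and BLAST apply their own, more elaborate scoring.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither row has a gap.

    Hypothetical helper for illustration; '-' is assumed to mark a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0
```

A single whole-sequence score of this kind can place a protein in a family such as the GPCRs, but, as the text notes, it says little about which ligand a given family member binds.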
As a method of predicting binding molecules or functions from sequence information, there have been reports, for a limited set of known GPCRs, of amino acid mutations (and their positions) thought to be involved in determining the type of Gα protein that couples with the receptor (Bulseco and Schimerlik, Molecular Pharmacology, 49: 132-141 (1996); Burstein, Spalding and Brann, Biochemistry, 37: 4052-4058 (1998); Kazmi et al., Biochemistry, 39: 3734-3744 (2000)). However, no report is known that predicts the type of Gα protein that couples with a novel GPCR.
Also with respect to GPCRs, a method has been developed for predicting the sites involved in binding between a ligand and its GPCR by analyzing amino acid mutations in both the GPCR and the ligand, namely the correlated mutation analysis (CMA) method (Singer et al., Receptors and Channels, 3: 89-95 (1995)). However, there is no reported case in which the binding molecule (and/or the type of binding molecule) of an unknown GPCR was selected using the GPCR sequence alone.
As described above, in the case of GPCRs for example, no method is known that predicts, directly from the sequence information of a GPCR, the molecule or type of molecule that binds to (or couples with) the GPCR and thereby leads to prediction of its function.
Likewise for GPCRs, when a new GPCR is discovered, manufacturing a medicine that uses that GPCR requires first understanding the function of the GPCR and then determining, by experiment or the like, the molecule or type of molecule that binds to (or couples with) it, which entails enormous cost and labor.
Disclosure of the invention
The present invention relates to a method for simply and accurately predicting, for sequence information of unknown function, molecules and the like that can bind to it, to a method for producing a medicine using that method, and to a computer used for these methods.
At least one of the above problems is solved by the following inventions.
(1) A method for predicting a binding molecule of a binding-molecule-unknown protein, that is, a method of predicting the binding molecule that binds to a protein whose binding molecule is unknown, comprising: a step of obtaining, for binding-molecule-known proteins whose amino acid sequences and binding molecules are known, binding-molecule-known protein classification information in which a sequence alignment of at least two binding-molecule-known proteins is associated with the binding molecules or the types of binding molecules; a step of using the binding-molecule-known protein classification information to specify one or more binding-molecule-determining residue positions, which are positions in the sequence alignment of the binding-molecule-known proteins assumed to be involved in determining the binding molecule; a step of obtaining binding-molecule-determining residue/binding molecule classification information, which represents the correlation between the binding-molecule-determining residues and the binding molecules or types of binding molecules, by associating the amino acid residues at the binding-molecule-determining residue positions (the binding-molecule-determining residues) with the binding molecules or types of binding molecules; a step of aligning the sequence of a binding-molecule-unknown protein of the same kind as the binding-molecule-known proteins against the sequence alignment of the binding-molecule-known proteins to obtain a sequence alignment of the binding-molecule-unknown protein; and a step of applying at least the information on the binding-molecule-determining residues in the sequence alignment of the binding-molecule-unknown protein to the binding-molecule-determining residue/binding molecule classification information to predict the binding molecule or the type of binding molecule of the binding-molecule-unknown protein.
(2) The method for predicting a binding molecule of a binding-molecule-unknown protein according to (1) above, wherein the binding molecule is a ligand, a regulatory factor, an effector, or a coenzyme.
(3) The method for predicting a binding molecule of a binding-molecule-unknown protein according to (1) or (2) above, wherein the binding molecules are classified into two or more types and the type of the classified binding molecule is predicted.
(4) The method for predicting a binding molecule of a binding-molecule-unknown protein according to any one of (1) to (3) above, wherein the binding-molecule-unknown protein is a G protein-coupled receptor, a kinase, a lipase, a transporter, a protease, or an ion channel.
(5) The method for predicting a binding molecule of a binding-molecule-unknown protein according to any one of (1) to (4) above, wherein, in the step of specifying one or more binding-molecule-determining residue positions, the one or more positions are specified from the amino acid residues constituting the sequence alignment and the types of binding molecules.
(6) The method for predicting a binding molecule of a binding-molecule-unknown protein according to any one of (1) to (4) above, wherein the binding-molecule-determining residue positions are determined using Formula 1, Formula 2, or both.
(7) The method for predicting a binding molecule of a binding-molecule-unknown protein according to any one of (1) to (4) above, wherein the binding-molecule-determining residue positions are determined using one or more of Formula 3, Formula 4, and Formula 5.
(8) The method for predicting a binding molecule of a binding-molecule-unknown protein according to (7) above, wherein the step of obtaining ligand-determining residue/ligand classification information comprises: a step of extracting, from the amino acid residues of a ligand-known protein, the residue at the ligand-determining residue position having the smallest value of the function f3(n); a step of obtaining the number (A) of ligand-known proteins listed in the ligand-determining residue/ligand classification information that match the extracted ligand-determining residue; a step of obtaining the number (B) of those matching ligand-known proteins whose ligand or type of ligand matches that of the ligand-known protein in question; a step of extracting, from the amino acid residues of the ligand-known protein, the residue at the ligand-determining residue position having the second smallest or X-th smallest value of the function f3(n) (where X represents an integer greater than 2 and less than 100); a step of obtaining the number (C) of ligand-known proteins listed in the ligand-determining residue/ligand classification information that match the extracted ligand-determining residue; a step of obtaining the number (D) of those matching ligand-known proteins whose ligand or type of ligand matches that of the ligand-known protein in question; a step of obtaining the sum (E) of (A) and (C); and a step of obtaining the sum (F) of (B) and (D), whereby ligand-determining residue/ligand classification information that additionally displays (E) and (F) is obtained.
(9) A method for predicting a binding molecule of a binding-molecule-unknown protein, comprising: a step of obtaining, for at least two binding-molecule-known proteins whose amino acid sequences and binding molecules are known, binding-molecule-known protein classification information in which the sequence alignment of the binding-molecule-known proteins is associated with the binding molecules or the types of binding molecules; a step of using that classification information to specify one or more binding-molecule-determining residue positions, which are positions in the sequence alignment of the binding-molecule-known proteins assumed to be involved in determining the binding molecule; and a step of obtaining binding-molecule-determining residue/binding molecule classification information, which represents the correlation between the binding-molecule-determining residues and the binding molecules, by associating the amino acid residues at those positions (the binding-molecule-determining residues) with the binding molecules or the types of binding molecules.
(10) A method for predicting a binding molecule of a binding-molecule-unknown protein, wherein information on the binding-molecule-determining residues in a sequence alignment of a binding-molecule-unknown protein, obtained by aligning the sequence of a binding-molecule-unknown protein of the same kind as the binding-molecule-known proteins against the sequence alignment of the binding-molecule-known proteins, is input into binding-molecule-determining residue/binding molecule classification information representing the correlation between binding-molecule-determining residues and binding molecules, and the binding molecule, or the type of binding molecule, that binds to the binding-molecule-unknown protein is predicted.
According to this method, the binding molecule or the type of binding molecule can be predicted simply by obtaining the amino acid sequence of a protein whose binding molecule is unknown and/or information on the sequence alignment obtained using that amino acid sequence. Binding molecules (ligands and the like) can thus be predicted far more quickly and at lower cost than with conventional molecular modeling methods that go as far as predicting three-dimensional structure. Furthermore, according to the present invention, the binding molecule or the type of binding molecule can be predicted for many kinds of proteins whose binding molecules are unknown. The binding molecule or its type can also be predicted more easily and quickly than by testing experimentally whether every candidate binding molecule actually binds to the protein. Once the binding-molecule-determining residue/binding molecule classification information has been obtained, it suffices to obtain the sequence alignment of the binding-molecule-unknown protein and apply that information to the table in order to readily predict the binding molecule or the type of binding molecule that binds to that protein.
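The steps of methods (1), (9), and (10) can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: Formulas 1 to 10 referred to in the claims are not reproduced in this section, so the position score used here (the conditional entropy of the binding-molecule type given the residue in an alignment column) is a stand-in chosen for illustration, not the scoring of the invention. All function names, the toy sequences, and the labels are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def class_entropy(labels):
    """Shannon entropy (bits) of a list of binding-molecule types."""
    total = len(labels)
    return -sum(c / total * log2(c / total) for c in Counter(labels).values())


def position_score(column, labels):
    """Stand-in score (assumption): entropy of the type given the residue.

    Lower means the residue at this column separates the types better.
    """
    groups = defaultdict(list)
    for residue, label in zip(column, labels):
        groups[residue].append(label)
    n = len(labels)
    return sum(len(g) / n * class_entropy(g) for g in groups.values())


def pick_positions(alignment, labels, k):
    """Specify the k alignment columns with the lowest score."""
    length = len(alignment[0])
    scored = sorted(
        (position_score([row[i] for row in alignment], labels), i)
        for i in range(length)
    )
    return sorted(i for _, i in scored[:k])


def build_table(alignment, labels, positions):
    """Associate residues at the chosen positions with binding-molecule types."""
    table = defaultdict(Counter)
    for row, label in zip(alignment, labels):
        table[tuple(row[i] for i in positions)][label] += 1
    return table


def predict(table, aligned_query, positions):
    """Look up the query's residues; return the most frequent type or None."""
    key = tuple(aligned_query[i] for i in positions)
    if key not in table:
        return None
    return table[key].most_common(1)[0][0]


# Toy data (hypothetical): four aligned known proteins and their ligand types.
known = ["MKDLW", "MRDLW", "MKELF", "MRELF"]
types = ["amine", "amine", "peptide", "peptide"]
positions = pick_positions(known, types, k=2)
table = build_table(known, types, positions)
print(positions, predict(table, "MQDIW", positions))
```

The stand-in score favors any column in which every residue is different, so with real data it would need many known proteins or an added penalty for highly variable and gapped columns.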
At least one of the above problems is also solved by the following inventions, namely:
(11) A method for producing a medicine, comprising a step of predicting the binding molecule, or the type of binding molecule, that binds to a binding-molecule-unknown protein using the method for predicting a binding molecule of a binding-molecule-unknown protein according to any one of (1) to (10) above;
(12) The method for producing a medicine according to (11) above, wherein the medicine is a prophylactic agent, a therapeutic agent, or both, for a central nervous system disease, an inflammatory disease, a circulatory disease, cancer, a metabolic disease, an immune system disease, or a digestive system disease.
At least one of the above problems is also solved by the following inventions, namely:
(13) A method for determining binding-molecule-determining residue positions using Formula 6, Formula 7 below, or both.
(14) A method for determining binding-molecule-determining residue positions using Formula 8;
(15) A computer for predicting a binding molecule of a binding-molecule-unknown protein, which uses the amino acid sequences or sequence alignment of binding-molecule-known proteins and information on their binding molecules or types of binding molecules to obtain binding-molecule-determining residue/binding molecule classification information representing the correlation between binding-molecule-determining residues, which are the amino acid residues at positions in the sequence alignment of the binding-molecule-known proteins assumed to be involved in determining the binding molecule (binding-molecule-determining residue positions), and the binding molecules or types of binding molecules, the computer comprising: sequence alignment input means for inputting information on the sequence alignment of the binding-molecule-known proteins; sequence alignment/binding molecule storage means for storing the amino acid sequences or sequence alignment of the binding-molecule-known proteins input by the sequence alignment input means together with the information on the binding molecules or types of binding molecules; binding-molecule-determining residue position determining means for determining the binding-molecule-determining residue positions using the amino acid sequences or sequence alignment of the binding-molecule-known proteins and the information on the binding molecules or types of binding molecules stored by the sequence alignment/binding molecule storage means; binding-molecule-determining residue/binding molecule classification information acquiring means for obtaining the binding-molecule-determining residue/binding molecule classification information, which represents the correlation between the binding-molecule-determining residues and the binding molecules or types of binding molecules, by associating the amino acid residues at the binding-molecule-determining residue positions (the binding-molecule-determining residues) with the binding molecules or types of binding molecules; and sequence alignment input means for inputting information on a sequence alignment of a binding-molecule-unknown protein obtained by aligning the sequence of a binding-molecule-unknown protein of the same kind as the binding-molecule-known proteins against the sequence alignment of the binding-molecule-known proteins; wherein the binding molecule, or the type of binding molecule, of the binding-molecule-unknown protein is predicted by applying the information on the sequence alignment of the binding-molecule-unknown protein input by the sequence alignment input means to the binding-molecule-determining residue/binding molecule classification information;
(16) The computer for predicting a binding molecule of a binding-molecule-unknown protein according to (14) above, wherein the binding-molecule-determining residue position determining means uses at least the function of Formula 9, the function of Formula 10, or both;
(17) The computer for predicting a binding molecule of a binding-molecule-unknown protein according to (15) or (16) above, wherein the binding-molecule-determining residue position determining means uses the function represented by Formula 9;
(18) A computer for predicting a binding molecule of a binding-molecule-unknown protein, comprising: storage means storing information on binding-molecule-determining residue positions, which are positions in the sequence alignment of binding-molecule-known proteins (proteins of the same kind as the binding-molecule-unknown protein whose binding molecules are known) assumed to be involved in determining the molecule that binds to the binding-molecule-known proteins, on the binding-molecule-determining residues, which are the amino acid residues of the binding-molecule-known proteins at those positions, and on the binding molecules or types of binding molecules of the binding-molecule-known proteins corresponding to those binding-molecule-determining residues; sequence alignment input means for inputting information on a sequence alignment of the binding-molecule-unknown protein obtained by aligning the sequence of the binding-molecule-unknown protein of the same kind as the binding-molecule-known proteins against the sequence alignment of the binding-molecule-known proteins; binding molecule determining means for determining the binding molecule or the type of binding molecule of the binding-molecule-unknown protein from the input sequence alignment information and the information stored in the storage means; and display means for displaying the determined binding molecule or type of binding molecule that binds to the binding-molecule-unknown protein; wherein the binding molecule determining means predicts the binding molecule or the type of binding molecule of the binding-molecule-unknown protein on the basis of the information on the sequence alignment of the binding-molecule-unknown protein input by the sequence alignment input means and the information, stored in the storage means, on the binding-molecule-determining residues and on the binding molecules or types of binding molecules of the binding-molecule-known proteins corresponding to them, and the display means displays the binding molecule or type of binding molecule so predicted.
With such a computer, binding-molecule-determining residue/binding molecule classification information can be obtained on the basis of the sequence alignment of binding-molecule-known proteins, so that the binding molecule or the type of binding molecule can be readily predicted merely by obtaining the sequence alignment of a binding-molecule-unknown protein.
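The arrangement of means in computer (18) can be pictured with a small Python sketch. It is offered only as an illustration: the class name `BindingMoleculePredictor`, its method names, and the toy residues are hypothetical, and the binding-molecule-determining residue positions are assumed to have been determined beforehand and simply stored, as the claim describes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class BindingMoleculePredictor:
    """Hypothetical sketch of the storage, determining, and display means."""

    # Storage means: determining positions plus residue-to-molecule table.
    positions: List[int]
    table: Dict[Tuple[str, ...], str] = field(default_factory=dict)

    def _key(self, aligned: str) -> Tuple[str, ...]:
        # Residues found at the stored determining positions.
        return tuple(aligned[i] for i in self.positions)

    def store(self, aligned_known: str, binding_molecule: str) -> None:
        """Record a known protein's determining residues and its molecule."""
        self.table[self._key(aligned_known)] = binding_molecule

    def determine(self, aligned_query: str) -> Optional[str]:
        """Determining means: look up the query's determining residues."""
        return self.table.get(self._key(aligned_query))

    def display(self, aligned_query: str) -> str:
        """Display means: format the prediction for the user."""
        result = self.determine(aligned_query)
        if result is None:
            return "no prediction"
        return f"predicted binding molecule: {result}"
```

In this sketch the input means is just the aligned query string passed to `determine` or `display`; keeping the table separate from the lookup mirrors the claim's split between what is stored and what is determined from a new alignment.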
Furthermore, the present invention provides:
(19) a program that causes a computer to function as: sequence alignment input means for inputting information on the sequence alignment of binding molecule known proteins; sequence alignment-binding molecule storage means for storing the amino acid sequences or sequence alignment of the binding molecule known proteins input through the sequence alignment input means, together with information on their binding molecules or types of binding molecule; binding molecule determining residue position determining means for determining the binding molecule determining residue positions using the stored amino acid sequences or sequence alignment and the stored information on the binding molecules or types of binding molecule; binding molecule determining residue-binding molecule classification information acquiring means for obtaining binding molecule determining residue-binding molecule classification information, which represents the correlation between the binding molecule determining residues and the binding molecules or types of binding molecule, by associating the amino acid residue at each binding molecule determining residue position (the binding molecule determining residue) with the binding molecule or type of binding molecule; and sequence alignment input means for inputting information on the sequence alignment of a binding molecule unknown protein of the same type as the binding molecule known proteins, obtained by aligning the sequence of the binding molecule unknown protein against the sequence alignment of the binding molecule known proteins;
(20) the program according to (19) above, wherein the binding molecule determining residue position determining means uses at least one, or both, of the functions of Equation 12 and Equation 13;
(21) the program according to (19) or (20) above, wherein the binding molecule determining residue position determining means uses the function represented by Equation 14;
(22) a program that causes a computer to function as: storage means storing information on binding molecule determining residue positions, which are positions in the sequence alignment of binding molecule known proteins (proteins of the same type as the binding molecule unknown protein whose binding molecules are known) that are assumed to be involved in determining the molecule that binds to the binding molecule known protein, on binding molecule determining residues, which are the amino acid residues of the binding molecule known proteins at those positions, and on the binding molecule, or type of binding molecule, of the binding molecule known protein corresponding to each binding molecule determining residue; sequence alignment input means for inputting information on the sequence alignment of a binding molecule unknown protein of the same type as the binding molecule known proteins, obtained by aligning the sequence of the binding molecule unknown protein against the sequence alignment of the binding molecule known proteins; binding molecule determining means for determining the binding molecule, or type of binding molecule, of the binding molecule unknown protein from the input sequence alignment information and the information stored in the storage means; and display means for displaying the determined binding molecule, or type of binding molecule, that binds to the binding molecule unknown protein; and
(23) a recording medium storing the program according to any one of (19) to (22) above; and the like.

BRIEF DESCRIPTION OF THE FIGURES
FIG. 1 is a process chart of the steps up to creation of the ligand-determining residue-ligand classification information of the present invention.
FIG. 2 is a process chart showing one embodiment of the ligand-determining residue position specifying step of the present invention.
FIG. 3 is a process chart showing another embodiment of the ligand-determining residue position specifying step of the present invention.
FIG. 4 shows the result of the sequence alignment between bovine rhodopsin and TGR23-1.
FIG. 5 shows the intracellular Ca ion concentration-increasing activity of various concentrations of human TGR23-2 ligand (1-20) on TGR23-1-expressing CHO cells, measured using FLIPR.
FIG. 6 shows the intracellular Ca ion concentration-increasing activity of various concentrations of human TGR23-2 ligand (1-20) on TGR23-2-expressing CHO cells, measured using FLIPR.

BEST MODE FOR CARRYING OUT THE INVENTION
The present invention relates to a method for predicting the binding molecule of a binding molecule unknown protein, comprising the steps of: obtaining, for binding molecule known proteins whose amino acid sequences and binding molecules are known, binding molecule known protein classification information in which the sequence alignment of at least two binding molecule known proteins is associated with their binding molecules or types of binding molecule; specifying, using the binding molecule known protein classification information, one or more binding molecule determining residue positions, which are positions in the sequence alignment of the binding molecule known proteins that are assumed to be involved in determining the binding molecule; obtaining binding molecule determining residue-binding molecule classification information, which represents the correlation between the binding molecule determining residues and the binding molecules or types of binding molecule, by associating the amino acid residue at each binding molecule determining residue position (the binding molecule determining residue) with the binding molecule or type of binding molecule; aligning the sequence of a binding molecule unknown protein of the same type as the binding molecule known proteins against the sequence alignment of the binding molecule known proteins to obtain the sequence alignment of the binding molecule unknown protein; and applying the information on the binding molecule determining residues in the sequence alignment of the binding molecule unknown protein to the binding molecule determining residue-binding molecule classification information to predict the binding molecule, or type of binding molecule, of the binding molecule unknown protein.
A binding molecule known protein is a protein for which a biomolecule that binds to it is known. An example is a protein to which a biomolecule binds specifically, such as a receptor whose ligand is known.
A binding molecule unknown protein is a protein of the same type as the binding molecule known proteins whose binding molecule is unknown. The sequence alignment of a binding molecule unknown protein, obtained by aligning its sequence against the sequence alignment of binding molecule known proteins of the same type, is the result of juxtaposing the entire sequences in order to examine the similarity (homology) between the amino acid sequence of the binding molecule unknown protein and the amino acid sequences of proteins of the same type, taking substitutions, insertions, and deletions into account and placing gaps at the positions corresponding to insertions and deletions.
Examples of binding molecule unknown proteins and binding molecule known proteins include G protein-coupled receptors (GPCRs), kinases, lipases, transporters, proteases, and ion channels. Of these, the present invention is preferably applied to G protein-coupled receptors (GPCRs) and kinases. When the binding molecule unknown protein and the binding molecule known proteins are G protein-coupled receptors (GPCRs), the binding molecule unknown protein is also called an orphan receptor. As stated above, the binding molecule unknown protein and the binding molecule known proteins are of the same type. For example, if the binding molecule unknown protein is a GPCR, the binding molecule known proteins are also GPCRs.
[Binding molecule known protein classification information]
Binding molecule known protein classification information is a table that associates the sequence alignment of at least two binding molecule known proteins with their binding molecules (and/or types of binding molecule). The table need not be on paper; any form is acceptable as long as it is stored electronically and can be recognized visually as a table. Nor is its form otherwise limited, as long as the correspondence between the sequence alignment of the binding molecule known proteins and the binding molecules or types of binding molecule can be recognized.
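One possible electronic form of such a table is sketched below in Python: each binding molecule known protein is one record that pairs a row of the sequence alignment with its binding molecule and binding molecule type. The record layout, protein names, sequences, and ligand labels are invented placeholders for illustration and are not taken from this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownProtein:
    """One row of the table: an aligned sequence paired with its binding molecule."""
    name: str
    aligned_sequence: str   # row of the sequence alignment; '-' marks a gap
    binding_molecule: str
    molecule_type: str


# Invented placeholder rows, for illustration only.
CLASSIFICATION_TABLE = [
    KnownProtein("receptor_A", "MKT-LVAG", "ligand_1", "monoamine"),
    KnownProtein("receptor_B", "MRTALVAD", "ligand_2", "peptide"),
    KnownProtein("receptor_C", "MKSALVAG", "ligand_3", "monoamine"),
]


def alignment_length(table):
    """Check the table holds at least two proteins of equal aligned length."""
    if len(table) < 2:
        raise ValueError("at least two binding molecule known proteins are required")
    lengths = {len(row.aligned_sequence) for row in table}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return lengths.pop()
```

Keeping the aligned row and the binding molecule in the same record means the correspondence between the two is never lost, which is the only property the text requires of the table.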
[Binding molecule determining residue position]
A binding molecule determining residue position is a position in the sequence alignment of the binding molecule known proteins that is assumed to be involved in determining the binding molecule.
The number of ligand-determining residue positions is not particularly limited as long as it is 1 or more; it is preferably 1 to 10, more preferably 2 to 6, and particularly preferably 2. Only 20 kinds of amino acid residue make up living organisms. Therefore, a single ligand-determining residue can be associated with at most 20 kinds of ligand. For GPCRs, by contrast, more than 100 kinds of ligand are known, so with a single ligand-determining residue position not every ligand can be associated with a ligand-determining residue. Accordingly, in a system with 20 or more kinds of ligand, such as GPCRs, it is desirable to have two or more ligand-determining residue positions in order to predict the ligand of a binding molecule unknown protein. The prediction accuracy also increases with the number of ligand-determining residue positions. When the ligands are classified into several types (X1 to Xp, where p is the number of classes) and the task is to decide which class the ligand of a binding molecule unknown protein belongs to, the type of ligand can be predicted from the amino acid residue at a single ligand-determining residue position if the number of classes (the value of p) is small. Even in that case, however, prediction accuracy is generally higher when the type of ligand is predicted from a combination of the amino acid residues at two or more ligand-determining residue positions.
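The counting argument above (20 residue kinds per position, so k positions can distinguish at most 20^k ligands) can be sketched as a small Python helper. The function name is hypothetical, and the result is only a lower bound on the number of positions; it says nothing about whether real residues at those positions actually separate the ligands.

```python
AMINO_ACID_TYPES = 20  # kinds of amino acid residue, as stated in the text


def min_positions(num_ligands):
    """Smallest k such that 20**k residue combinations cover num_ligands."""
    if num_ligands < 1:
        raise ValueError("num_ligands must be at least 1")
    positions, capacity = 1, AMINO_ACID_TYPES
    # Integer arithmetic avoids floating-point rounding in a logarithm.
    while capacity < num_ligands:
        positions += 1
        capacity *= AMINO_ACID_TYPES
    return positions
```

For example, `min_positions(20)` gives 1 and `min_positions(100)` gives 2, matching the statement that a system with more than 100 ligands needs at least two positions.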
[Binding molecule determining residue]
A binding molecule determining residue is the amino acid residue at a binding molecule determining residue position. For binding molecule known proteins, plural sets of specific residue positions (each consisting of one, or two or more, positions) and the plural kinds of amino acid residue found there may also be combined and used as binding molecule determining residues. For example, the 2nd and 8th amino acid residue positions of the sequence alignment may be taken as one example of binding molecule determining residue positions, and the 9th and 11th as another. Combining different binding molecule determining residue positions in this way makes it possible to improve the accuracy of predicting the binding molecule or type of binding molecule. Information on binding molecule determining residues means information on the amino acid residues at the binding molecule determining residue positions. For example, if the 2nd and 8th positions of the sequence alignment of a protein are binding molecule determining residue positions, "the 2nd and 8th positions of the sequence alignment" is the information on the positions of the binding molecule determining residues. The information on the kinds of amino acid residue at the 2nd and 8th positions of the sequence alignment, taken together with that positional information, constitutes the information on the binding molecule determining residues.
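As an illustration of how position information and residue kinds combine into "information on binding molecule determining residues", the Python sketch below reads the residues at given 1-based alignment positions. The function name and the example sequence are hypothetical.

```python
def determining_residue_info(aligned_sequence, positions):
    """Map each 1-based alignment position to the residue found there."""
    info = {}
    for position in positions:
        if not 1 <= position <= len(aligned_sequence):
            raise IndexError(f"position {position} is outside the alignment")
        # Alignment positions are counted from 1; Python strings from 0.
        info[position] = aligned_sequence[position - 1]
    return info
```

With the invented row `"MKTLLVAGSTP"`, positions 2 and 8 yield `{2: "K", 8: "G"}`, and the alternative set 9 and 11 yields `{9: "S", 11: "P"}`.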
[Binding molecule determining residue-binding molecule classification information]
Binding molecule determining residue-binding molecule classification information is a table representing the correlation between the binding molecule determining residues and the binding molecules or types of binding molecule. The table need not be on paper; any form is acceptable as long as it is stored electronically and can be recognized visually as a table. Nor is its form otherwise limited, as long as the correspondence between the binding molecule determining residues and the binding molecules or types of binding molecule can be recognized.
[Binding molecule]
The binding molecule is not particularly limited as long as it can bind to the binding molecule known proteins and binding molecule unknown proteins, which are biopolymers. Examples include a ligand that binds to a receptor protein and a Gα protein that binds to a GPCR.
[Type of binding molecule]
The type of binding molecule is a classification, by function, properties, or the like, of the binding molecules when plural binding molecules exist for binding molecule known proteins of the same type. An example is the classification of GPCR ligands into monoamines, lipids, and peptides.
[Computer]
The computer of the present invention is not particularly limited as long as it is an electronic device capable of performing the required calculations; it may be any known computer, such as a personal computer, a supercomputer, or a mobile device. A computer equipped with a browser is particularly preferable because it can connect to the Internet and access known Web sites.
The steps up to creation of the ligand-determining residue-ligand classification information are described below, taking the case where the binding molecule is a ligand as an example.
[Steps up to creation of the ligand-determining residue-ligand classification information]
FIG. 1 shows an example of the steps up to creation of the ligand-determining residue-ligand classification information, which consist of the following: a step of obtaining information on the sequence alignment and the ligands (and/or types of ligand) of at least two binding molecule known proteins whose amino acid sequences and ligands (and/or types of ligand) are known (S101); a sequence alignment-ligand classification information obtaining step of associating the sequence alignment with the ligands (and/or types of ligand) to obtain binding molecule known protein classification information (S102); a ligand-determining residue position specifying step of using the binding molecule known protein classification information obtained in that step to specify one or more ligand-determining residue positions, which are positions in the sequence alignment of the binding molecule known proteins assumed to be involved in determining the ligand (and/or type of ligand) (S103); and a ligand-determining residue-ligand classification step of associating the ligand (and/or type of ligand) with the amino acid residue at each specified ligand-determining residue position (the ligand-determining residue) to obtain ligand-determining residue-ligand classification information, which represents the correlation between the ligand-determining residues and the ligands (and/or types of ligand) (S104). Each step is described below. Hereinafter, the term "ligand" alone may also include the type of ligand.
[Step 101]
First, in order to create the ligand-determining residue-ligand classification information, information is obtained on the sequence alignment and the ligands (and/or types of ligand) of at least two binding molecule known proteins whose amino acid sequences and ligands are known (S101). In this step, information on the amino acid sequences and ligands (and/or types of ligand) is obtained for a plurality of binding molecule known proteins. The sequence alignment may be computed from the amino acid sequences, or, if a sequence alignment has already been computed for the binding molecule known proteins, the information on that sequence alignment may be obtained directly. The method of obtaining the information on the sequence alignment and the ligands (and/or types of ligand) of the binding molecule known proteins is not particularly limited; the information may be taken from a database or obtained by calculation. The database preferably contains information on the sequence alignment and the ligands (and/or types of ligand) of at least 100 binding molecule known proteins, more preferably 500 or more, and particularly preferably 1000 or more. This is because the larger the number of binding molecule known proteins, the higher the accuracy of the ligand-determining residue-ligand classification information.
The known database is not particularly limited as long as it describes the sequence alignment and the ligands (and/or types of ligand) of binding molecule known proteins; an example is GPCRDB (http://www.GPCR.org/7tm/). The sequence alignment can also be obtained by a known calculation method. Known calculation methods for sequence alignment include, for example, Clustal W and BLAST, but the alignment may also be worked out manually. In addition, a classification of ligands may be prepared in advance so that the type of ligand is determined automatically once information on the ligand is entered; in that way, information on the type of ligand is obtained as soon as information on the ligand is obtained.
[Step 102]
Next, for the information on the sequence alignment and the ligands (and/or types of ligand) of the two or more binding molecule known proteins obtained in step 101, the sequence alignment is associated with the information on the ligands (and/or types of ligand) to obtain binding molecule known protein classification information (sequence alignment-ligand classification information obtaining step: S102). If, when the information on the sequence alignment and the ligands (and/or types of ligand) of one or more binding molecule known proteins is obtained from a database or the like in step 101, the sequence alignment is already associated with the ligands (and/or types of ligand), the information may be obtained with that association intact.
[Step 103]
Next, using the binding molecule known protein classification information obtained in the sequence alignment-ligand classification information obtaining step of step 102, one or more ligand-determining residue positions are specified; these are positions in the sequence alignment of the binding molecule known proteins that are assumed to be involved in determining the ligand (and/or type of ligand) (ligand-determining residue position specifying step: S103).
In this step, one or more ligand-determining residue positions are specified by combining functions chosen to suit the type and properties of the target protein. Preferred embodiments of this step are described later.
[Step 104]
Next, the ligand (and/or type of ligand) is associated with the amino acid residue at each ligand-determining residue position specified in the ligand-determining residue position specifying step (the ligand-determining residue), giving ligand-determining residue-ligand classification information that represents the correlation between the ligand-determining residues and the ligands (ligand-determining residue-ligand classification step: S104).
In this step, information on the ligand-determining residue positions, the ligand-determining residues, and the ligand (and/or type of ligand) is taken from the sequence alignment-ligand classification information for each GPCR to obtain the ligand-determining residue-ligand classification information.
With such ligand-determining residue-ligand classification information, the ligand (and/or type of ligand) of a GPCR whose ligand is unknown can be predicted merely by determining its sequence alignment.
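Step 104 and the subsequent prediction can be read as building a lookup table and then querying it. The Python sketch below is one minimal interpretation under stated assumptions: the table is keyed by the tuple of residues at the ligand-determining residue positions, a key may map to more than one ligand type, and an unseen key returns an empty set. The function names, sequences, and ligand labels are invented for illustration.

```python
def residue_key(aligned_sequence, positions):
    """Residues at the 1-based ligand-determining residue positions."""
    return tuple(aligned_sequence[p - 1] for p in positions)


def build_classification(known_proteins, positions):
    """Map each residue combination to the set of ligand types seen with it.

    known_proteins: iterable of (aligned_sequence, ligand_type) pairs.
    """
    table = {}
    for aligned_sequence, ligand_type in known_proteins:
        key = residue_key(aligned_sequence, positions)
        table.setdefault(key, set()).add(ligand_type)
    return table


def predict(table, aligned_sequence, positions):
    """Ligand types associated with the query's residues (empty if unseen)."""
    return table.get(residue_key(aligned_sequence, positions), set())


# Invented placeholder data, for illustration only.
KNOWN = [
    ("MKTLLVAG", "monoamine"),
    ("MRTLLVAD", "peptide"),
    ("MKSLLVAG", "monoamine"),
]
POSITIONS = (2, 8)
TABLE = build_classification(KNOWN, POSITIONS)
```

Here `predict(TABLE, "MKALLIAG", POSITIONS)` returns `{"monoamine"}`, because the query carries K and G at positions 2 and 8, the same combination as the two monoamine rows.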
[One embodiment of step 103]
A preferred embodiment of the ligand-determining residue position specifying step (S103) is described with reference to FIG. 2. This embodiment is for the case where there is one ligand-determining residue position. The ligands are divided into p types, classified as X1 to Xp. Information on the sequence alignment and the ligand (and/or type of ligand) of every binding molecule known protein in the binding molecule known protein classification information obtained in step 102 is entered into Equation 15 below, and positions are taken as candidate ligand-determining residue positions in ascending order of the value of the function f1(n) (ligand-determining residue position candidate selection step: S201). If, at a candidate position obtained in this way, the sequence cannot be aligned (shown as "-") in 3% or more of all GPCRs, it is preferable not to adopt that position as a ligand-determining residue position.
$$f_1(n) = \sum_{\mathrm{Res}} \bigl( N(\mathrm{Res}, X_q) \times N(\mathrm{Res}, X_r) \bigr) \qquad \text{(Equation 15)}$$
[In Equation 15, n indicates that f1(n) is the evaluation function for the n-th amino acid residue of the sequence alignment of the binding molecule known proteins; Res represents the kind of amino acid residue; Xq and Xr represent ligands; q represents an integer from 1 to p-1; r represents an integer greater than q and not more than p; p represents the number of ligands; N(Res, Xq) represents the number of binding molecule known proteins in the binding molecule known protein classification information whose n-th amino acid residue in the sequence alignment is Res and whose ligand is Xq; and N(Res, Xr) represents the number of binding molecule known proteins in the binding molecule known protein classification information whose n-th amino acid residue in the sequence alignment is Res and whose ligand is Xr.]
The number of candidate ligand-determining residue positions put forward in the ligand-determining residue position candidate selection step is not particularly limited, but is preferably 1 to 100, more preferably 1 to 10, and particularly preferably 1 to 5. If there are too many candidates, the subsequent step of confirming the reliability of the candidate ligand-determining residue positions becomes difficult.
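The candidate selection of step 201 can be sketched in Python as follows. This is an interpretation and not the patent's implementation: it assumes that Equation 15 is summed over every ligand pair (Xq, Xr) with q < r as well as over the residue kinds, which is how the index ranges in the definition read; it also assumes that gap symbols are left out of the residue sum. The function names and the toy data are invented.

```python
from collections import Counter
from itertools import combinations

GAP = "-"


def f1(column, ligands):
    """Equation 15 for one alignment column (assumed summed over pairs q < r).

    column:  residue of each binding molecule known protein at position n
    ligands: ligand class of each protein, in the same order
    """
    counts = Counter(zip(column, ligands))      # N(Res, X)
    classes = sorted(set(ligands))
    residues = set(column) - {GAP}              # assumption: gaps are skipped
    return sum(
        counts[(res, xq)] * counts[(res, xr)]   # Counter gives 0 if unseen
        for res in residues
        for xq, xr in combinations(classes, 2)
    )


def candidate_positions(alignment, ligands, max_candidates=5, gap_limit=0.03):
    """1-based positions ranked by ascending f1, dropping gappy columns."""
    scored = []
    for n in range(1, len(alignment[0]) + 1):
        column = [row[n - 1] for row in alignment]
        # The text rejects positions unalignable in 3% or more of sequences.
        if column.count(GAP) / len(column) >= gap_limit:
            continue
        scored.append((f1(column, ligands), n))
    scored.sort()
    return [n for _, n in scored[:max_candidates]]
```

A small f1(n) means that few proteins with different ligands share the same residue at position n, so the residue there separates the ligand classes well. With the toy alignment `["AKC", "AKD", "ARC", "ARD"]` and ligands `["X1", "X1", "X2", "X2"]`, position 2 scores 0 because K occurs only with X1 and R only with X2.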
The ligand-determining residue position candidates selected in the candidate selection step of step 201 may be used directly as ligand-determining residue positions, but it is more preferable to examine the reliability of the candidates using Equation 2 (reliability examination step for ligand-determining residue position candidates: S202). In this case, the reliability of the candidates is examined using Equation 16 below. The smaller the resulting value of f2(n), the more suitable the position is as a ligand-determining residue position.

$$f_2(n) = \sum_{\mathrm{Res}} N(\mathrm{Res}, X_1) \times N(\mathrm{Res}, X_2) \times \cdots \times N(\mathrm{Res}, X_p) \qquad \text{(Equation 16)}$$
[In Equation 16, n indicates that f2(n) is the evaluation function for the n-th amino acid residue in the sequence alignment of the proteins with known binding molecules; Res denotes the type of amino acid residue; X1 to Xp denote ligands or ligand types; p is the number of ligands or ligand types; and N(Res, X) is the number of proteins with known binding molecules, among those in the classification information obtained in step 102, whose n-th amino acid residue in the sequence alignment is Res and whose ligand is X.]

The ligand-determining residue position is identified using the value of f1(n) obtained in step 201 and/or the value of f2(n) obtained in step 202 (ligand-determining residue position identification step: S203). The position with the smallest value of f2(n) may be taken as the ligand-determining residue position, as may the position with the smallest product of f1(n) and f2(n), or the position with the smallest sum of f1(n) and f2(n). Alternatively, the amino acid residue positions of the proteins with known binding molecules in the classification information obtained in step 102 may be ranked in ascending order of f1(n), ranked in the same way by f2(n), the two ranks multiplied, and the position with the lowest product taken as the ligand-determining residue position. Furthermore, if entries that cannot be sequence-aligned (denoted "-") at the obtained position account for 3% or more of all GPCRs, it is preferable not to adopt that position as a ligand-determining residue position.
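The rank-product rule and the 3% gap guideline described above can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function names, the tie-handling rules, and the `gap_fraction` input are assumptions, and the f1(n) and f2(n) values are those of the hypothetical five-position example given later in this description.

```python
def rank_ascending(scores):
    """Return 1-based ranks, smallest score first; tied scores share a rank (assumption)."""
    values = list(scores.values())
    return {pos: 1 + sum(v < s for v in values) for pos, s in scores.items()}


def select_position(f1, f2, gap_fraction=None, max_gap=0.03):
    """Pick the position with the smallest product of its f1 rank and f2 rank.

    Positions whose fraction of unalignable entries is max_gap or more are
    skipped, following the 3% guideline in the text.
    """
    gap_fraction = gap_fraction or {}
    rank1 = rank_ascending(f1)
    rank2 = rank_ascending(f2)
    eligible = [p for p in f1 if gap_fraction.get(p, 0.0) < max_gap]
    # Ties on the rank product are broken by the lower position index (assumption).
    return min(eligible, key=lambda p: (rank1[p] * rank2[p], p))


# f1(n) and f2(n) for positions 1 to 5 of the hypothetical example in this description.
f1_values = {1: 11, 2: 0, 3: 5, 4: 0, 5: 8}
f2_values = {1: 6, 2: 0, 3: 1, 4: 0, 5: 4}

print(select_position(f1_values, f2_values))
# Hypothetical gap data: 10% of the entries at position 2 are unalignable.
print(select_position(f1_values, f2_values, gap_fraction={2: 0.10}))
```

With no gap data the sketch selects position 2; once position 2 is excluded by the gap filter it falls back to position 4, which ties with position 2 on both f1(n) and f2(n).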
[Another Embodiment of Step 103]
Referring to FIG. 3, another preferred embodiment of the ligand-determining residue position identification step (S103) is described. This embodiment covers the case of two ligand-determining residue positions. First, all sequence alignments of the proteins with known binding molecules in the classification information obtained in step 102, together with the information on their ligands or ligand types, are input to Equation 1 to list candidates for the ligand-determining residue positions (ligand-determining residue position candidate selection step: S301).
The number of ligand-determining residue position candidates listed in the candidate selection step is not particularly limited, but is preferably 1 to 100, more preferably 1 to 20, still more preferably 2 to 10, and particularly preferably 2 to 6. With many candidates, the subsequent step of confirming the reliability of the candidates becomes difficult; with only one candidate, the method cannot handle diverse ligand types.
It is a more preferred embodiment to examine, using Equation 2, the reliability of the ligand-determining residue position candidates listed in the candidate selection step of step 301 (reliability examination step for ligand-determining residue position candidates: S302). If the reliability examination step of step 302 is skipped, the process may proceed from step 301 directly to step 303. In the reliability examination step, the sequence alignments of the proteins with known binding molecules and the information on their ligands are input to Equation 2 for at least the amino acid residue positions that include the ligand-determining residue position candidates. Positions giving a small value of f2(n) are suitable as ligand-determining residue positions.
Ligand-determining residue positions may be selected in ascending order of f2(n); the position with the smallest product of f1(n) and f2(n) may be selected; or the position with the smallest sum of f1(n) and f2(n) may be selected. Alternatively, the amino acid residue positions of the proteins with known binding molecules in the classification information obtained in step 102 may be ranked in ascending order of f1(n), ranked in the same way by f2(n), and positions selected in ascending order of the product of the two ranks. Furthermore, if entries that cannot be sequence-aligned (denoted "-") at an obtained ligand-determining residue position account for 3% or more of the GPCRs listed in the classification information of proteins with known binding molecules, it is preferable not to adopt that position as a ligand-determining residue position.
Next, the ligand-determining residue position candidates listed in step 301 or step 302 are combined two at a time to list candidate pairs of ligand-determining residue positions (ligand-determining residue position pair selection step: S303). The candidate pairs may be all combinations of the candidates, or may be formed by combining a position favored by f1(n) with other residue positions.
A pair of ligand-determining residue positions is identified by inputting information on the candidate pairs into Equation 17 (ligand-determining residue position pair identification step: S304).

$$f_3(m,n) = \frac{\text{(number of amino acid residue pair types)}}{w_X} + w_1 \times \text{(number of 2-crossing residue pair types)} + w_2 \times \text{(number of 3-crossing residue pair types)} + \cdots + w_{p-1} \times \text{(number of } p\text{-crossing residue pair types)} + w_A \times \text{(number of unalignable amino acid residues)} + w_B \times \text{(number of unalignable amino acid residue pairs)} \qquad \text{(Equation 17)}$$

[In Equation 17, (m, n) indicates that f3(m, n) is the evaluation function for the m-th and n-th amino acid residues in the sequence alignment of the proteins with known binding molecules. The number of amino acid residue pair types is the number of distinct combinations of the m-th and n-th amino acid residues in the sequence alignment. The numbers of 2-crossing and 3-crossing residue pair types are the numbers of such combinations that are associated with two and three kinds of ligand, respectively, and the number of p-crossing residue pair types is the number of such combinations that are associated with p kinds of ligand. The number of unalignable amino acid residues is the number of cases in which one of the m-th and n-th amino acid residues was treated as unalignable in order to obtain favorable homology, and the number of unalignable amino acid residue pairs is the number of cases in which both were treated as unalignable. wX is a positive constant, or a distribution function of the number of amino acid pair types that takes its maximum when that number is a positive number not greater than 400. w1 ... w(p-1), wA, and wB are weights and are positive numbers.]

wX may be a positive constant or a distribution function such as a Gaussian function. As a positive constant it is not particularly limited, but 1 is desirable. As a distribution function, wX is a function of the number of amino acid pair types that takes its maximum when that number is 400 or less, and is not otherwise limited; for example, a Gaussian function or a Lorentzian function of the number of amino acid pair types may be used. Its maximum value is not particularly limited as long as it is positive. The value of the variable at which the distribution function takes its maximum is preferably 400 or less, more preferably 20 to 200, and still more preferably 40 to 140. These values follow from the fact that the total number of possible amino acid pairs is 20 × 20 = 400 and that, empirically, prediction with 40 to 140 kinds of amino acid pairs gives the best results. The full width at half maximum of the distribution function is preferably 10 to 100, more preferably 20 to 70, and still more preferably 30 to 50. The maximum value of the distribution function is preferably 1, although this depends on its relationship to the other weights.

The values of the weights (w1 ... w(p-1), wA, wB) are not particularly limited as long as they are positive. However, because a 3-crossing residue pair is less favorable for predicting ligands than a 2-crossing residue pair, w2 is preferably larger than w1. When the number of ligands is three, no crossing higher than a 3-crossing residue pair can exist. One possible combination of weights in that case is w1 = 2, w2 = 5, and wA = wB = 1.
Among the candidate pairs of ligand-determining residue positions, those giving a small value of f3(m, n) are the preferred pairs of ligand-determining residue positions.
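As a rough illustration of Equation 17, the following Python sketch evaluates f3 from pre-counted quantities and shows one way wX could be realized as a Gaussian distribution function. The function names, the Gaussian center of 90 pair types, and the full width at half maximum of 40 are assumptions chosen from within the preferred ranges stated in the text; the default weights w1 = 2, w2 = 5, wA = wB = 1 are the example combination given for three ligands.

```python
import math


def gaussian_wx(pair_type_count, center=90.0, fwhm=40.0, peak=1.0):
    """Gaussian weight wX as a function of the number of amino acid pair types.

    center and fwhm are illustrative assumptions (text: maximum preferably at
    40 to 140 pair types, full width at half maximum preferably 30 to 50).
    """
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return peak * math.exp(-((pair_type_count - center) ** 2) / (2.0 * sigma ** 2))


def f3_from_counts(pair_types, crossing_counts, unalignable_residues=0,
                   unalignable_pairs=0, wx=1.0, crossing_weights=None,
                   wa=1.0, wb=1.0):
    """Evaluate Equation 17 from counts.

    crossing_counts maps k (number of ligand kinds) to the number of
    k-crossing residue pair types; crossing_weights maps k to w(k-1).
    """
    if crossing_weights is None:
        crossing_weights = {2: 2.0, 3: 5.0}  # w1 = 2, w2 = 5
    score = pair_types / wx
    for k, count in crossing_counts.items():
        score += crossing_weights[k] * count
    score += wa * unalignable_residues + wb * unalignable_pairs
    return score


# Five pair types, no crossings, constant wX = 1.
print(f3_from_counts(5, {}))
# Two pair types, one of which is a 3-crossing pair.
print(f3_from_counts(2, {3: 1}))
# Same counts, but with wX taken from the Gaussian at 90 pair types (its peak).
print(f3_from_counts(90, {}, wx=gaussian_wx(90)))
```

Because the pair-type count is divided by wX, a Gaussian wX lowers f3 most strongly for pairs whose number of pair types lies near the chosen center and penalizes pairs far from it.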
Although not illustrated separately, obtaining ligand-determining residue-ligand classification information in which pairs of ligand-determining residue positions are combined arbitrarily is another preferred embodiment of the present invention. The way of combining is not particularly limited. For example, the pair with the smallest value of f3(m, n) may be combined with the pair having the second and/or x-th smallest value (where x is an integer greater than 2 and less than 100), or the pair with the x-th smallest value of f3(m, n) may be combined with the pair having the y-th smallest value (where y differs from x and is an integer greater than 2 and less than 100). For example, ligand-determining residues up to the 29th smallest value of f3(m, n) can be combined arbitrarily. The values of x and y may be entered into the computer in advance, or the computer may receive them as user input and create the combination of ligand-determining residues.

If only the pairs with the smallest and second-smallest values of f3(m, n) are combined, the ligand-determining residues from which a ligand and/or ligand type can be predicted are limited. Combining multiple ligand-determining residues in this way makes it possible to predict the ligand and/or ligand type for many kinds of proteins with unknown ligands, and can also improve the accuracy of the prediction. In a preferred embodiment of the present invention, such combinations can be formed automatically by a ligand-determining residue combining means. Such a means can complement the prediction derived from the ligand-determining residue positions. x and y are each preferably greater than 2 and less than 100, more preferably less than 50, still more preferably less than 30, and particularly preferably less than 20.
In this embodiment, for example, the amino acid residues of a given protein with a known ligand are extracted at the ligand-determining residue positions with the smallest value of f3(m, n). Next, the number of proteins with known ligands listed in the ligand-determining residue-ligand classification information whose residues match the extracted ligand-determining residues is determined. Then, among these, the number whose ligand or ligand type matches that of the given protein is determined. In this specification, the value obtained in this way is written N: ((number of proteins with known ligands listed in the ligand-determining residue-ligand classification information that match the extracted ligand-determining residues) / (number whose ligand or ligand type matches that of the given protein with a known ligand)). N is obtained in the same way for the ligand-determining residue positions with the second-smallest value of f3(m, n). The denominators of N for the positions with the smallest and second-smallest values of f3(m, n) are then added together, as are the numerators. In this way, ligand-determining residue-ligand classification information in which pairs of ligand-determining residue positions are combined arbitrarily is obtained.
Each of the steps described above may be performed manually, but is particularly preferably performed by a computer on which a predetermined medium or program is installed. It is also particularly preferable that this computer be connected to the Internet and able to access external databases.
Such a computer comprises at least: sequence alignment input means for inputting information on the sequence alignment of a protein with an unknown binding molecule; sequence alignment and binding molecule storage means for storing the amino acid sequences or sequence alignments of proteins with known binding molecules input through the sequence alignment input means, together with information on their binding molecules or binding molecule types; binding-molecule-determining residue position determination means for predicting the binding-molecule-determining residue positions using the stored amino acid sequences or sequence alignments and the stored information on binding molecules or binding molecule types; and classification information acquisition means for obtaining binding-molecule-determining residue-binding molecule classification information, which represents the correlation between binding-molecule-determining residues and binding molecules or binding molecule types, by associating the amino acid residues at the binding-molecule-determining residue positions (the binding-molecule-determining residues) with binding molecules or binding molecule types.
The binding-molecule-determining residue position determination means determines the binding-molecule-determining residue positions by applying any one of the functions f1(n), f2(n), and f3(m, n) described above, or any combination of them, to the sequence alignments of the proteins with known binding molecules and the information on their binding molecules or binding molecule types.
The ligand-determining residue-ligand classification information may constitute its own database, or may be configured as a relational database based on the classification information of proteins with known binding molecules obtained in step 102. The relational configuration is preferable because, once the sequence alignment and ligand of a protein with a previously unknown binding molecule are confirmed, they can be incorporated into the classification information of proteins with known binding molecules and the ligand-determining residue positions can easily be re-evaluated.
With such a computer, the binding molecule or binding molecule type can be predicted easily by inputting the sequence alignment of a protein with an unknown binding molecule into the binding-molecule-determining residue-binding molecule classification information.
The computer of the present invention may also be one on which the binding-molecule-determining residue-binding molecule classification information is installed in advance, and which predicts the binding molecule and/or binding molecule type when the sequence alignment of a protein with an unknown binding molecule is input through sequence alignment input means for inputting protein sequence alignments. Such a computer likewise makes it easy to predict the binding molecule or binding molecule type by inputting the sequence alignment of a protein with an unknown binding molecule into the binding-molecule-determining residue-binding molecule classification information.
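The lookup performed by such a computer can be sketched as follows. This is only an illustrative Python sketch: the function names, the dictionary layout of the classification information, the majority-vote rule, and the query sequences are assumptions. The three known proteins are taken from the first rows of the hypothetical Table 1 of this description, and positions 2 and 4 are the pair chosen in that example.

```python
from collections import Counter, defaultdict


def build_classification(known_proteins, positions):
    """Map the residues at the determining positions to counts of ligand types.

    known_proteins: iterable of (alignment, ligand_type); positions are 1-based.
    """
    table = defaultdict(Counter)
    for alignment, ligand_type in known_proteins:
        key = tuple(alignment[p - 1] for p in positions)
        table[key][ligand_type] += 1
    return table


def predict_ligand_type(table, alignment, positions):
    """Return the most frequent ligand type for the query's determining residues.

    Returns None when the residue combination is absent from the table.
    Majority voting is an assumption; the text does not fix a decision rule.
    """
    key = tuple(alignment[p - 1] for p in positions)
    if key not in table:
        return None
    return table[key].most_common(1)[0][0]


known = [("TLMRK", "P"), ("TMMQK", "A"), ("TCMTK", "N")]
positions = (2, 4)
classification = build_classification(known, positions)

# Hypothetical query proteins with unknown binding molecules.
print(predict_ligand_type(classification, "GLMRA", positions))
print(predict_ligand_type(classification, "TWMRK", positions))
```

The first query carries L and R at positions 2 and 4 and is therefore assigned ligand type P; the second carries a residue combination not present in the classification information, so no prediction is returned.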
[Example Using Hypothetical Proteins with Known Binding Molecules]
The steps up to the creation of the ligand-determining residue-ligand classification information of the present invention, and the step of predicting the binding molecule type of a protein with an unknown binding molecule, are described below using hypothetical proteins with known binding molecules.
Table 1 is an example of hypothetical classification information for proteins with known binding molecules. In this system, six proteins with known binding molecules have been collected, for which both the ligand and the amino acid sequence (sequence alignment) are known. The ligands are divided into three classes: P, A, and N. When the ligand-determining residue positions are determined on the basis of ligand type, P, A, and N correspond to X1, X2, and X3 in Equations 1 and 2, respectively. When they are determined on the basis of the ligands themselves, ○×, ××, and △△ correspond to X1, X2, and X3 in Equations 1 and 2, respectively.

Table 1. Example of a hypothetical classification table of proteins with known binding molecules

No.   Alignment   Ligand type
1     TLMRK       P
2     TMMQK       A
3     TCMTK       N
4     TLLRK       P
5     TMLQK       A
6     TLLRA       P

In Table 1, for example, the sequence alignment of the first protein with a known binding molecule is TLMRK, and its binding molecule (ligand) is ○×. The type of ligand ○× is P.
The evaluation function f1(1) is explained first. The first amino acid residue in the sequence alignments above is always T, so in f1(n) the only value of Res is T.

Three proteins have amino acid residue T and ligand type P, so N(Res, X1) = N(T, X1) = 3. Similarly, N(Res, X2) = N(T, X2) = 2 and N(Res, X3) = N(T, X3) = 1.

Hence f1(1) is as follows.

$$f_1(1) = \sum_{\mathrm{Res}} \bigl( N(\mathrm{Res}, X_1) N(\mathrm{Res}, X_2) + N(\mathrm{Res}, X_1) N(\mathrm{Res}, X_3) + N(\mathrm{Res}, X_2) N(\mathrm{Res}, X_3) \bigr) = N(T, X_1) N(T, X_2) + N(T, X_1) N(T, X_3) + N(T, X_2) N(T, X_3) = 3 \times 2 + 3 \times 1 + 2 \times 1 = 11$$

Next, f2(1) is as follows.

$$f_2(1) = \sum_{\mathrm{Res}} N(\mathrm{Res}, X_1) N(\mathrm{Res}, X_2) N(\mathrm{Res}, X_3) = N(T, X_1) N(T, X_2) N(T, X_3) = 3 \times 2 \times 1 = 6$$
Next, f1(2) and f2(2) are explained. The second amino acid residue in the sequence alignments is one of three types: L, M, and C. So in f1(n), Res takes the values L, M, and C. Three proteins have L as the second residue and ligand type P, so N(L, X1) = N(L, P) = 3. No protein with L as the second residue has ligand type A or N, so N(L, X2) = N(L, A) = N(L, X3) = N(L, N) = 0. Two proteins have M as the second residue and ligand type A, so N(M, X2) = N(M, A) = 2. No protein with M as the second residue has ligand type P or N, so N(M, X1) = N(M, P) = N(M, X3) = N(M, N) = 0. One protein has C as the second residue and ligand type N, so N(C, X3) = N(C, N) = 1. No protein with C as the second residue has ligand type P or A, so N(C, X1) = N(C, P) = N(C, X2) = N(C, A) = 0.

Hence f1(2) is as follows.

$$f_1(2) = \sum_{\mathrm{Res}} \bigl( N(\mathrm{Res}, X_1) N(\mathrm{Res}, X_2) + N(\mathrm{Res}, X_1) N(\mathrm{Res}, X_3) + N(\mathrm{Res}, X_2) N(\mathrm{Res}, X_3) \bigr)$$
$$= \bigl( N(L, P) N(L, A) + N(L, P) N(L, N) + N(L, A) N(L, N) \bigr) + \bigl( N(M, P) N(M, A) + N(M, P) N(M, N) + N(M, A) N(M, N) \bigr) + \bigl( N(C, P) N(C, A) + N(C, P) N(C, N) + N(C, A) N(C, N) \bigr) = 0$$

f2(2) is as follows.

$$f_2(2) = \sum_{\mathrm{Res}} N(\mathrm{Res}, X_1) N(\mathrm{Res}, X_2) N(\mathrm{Res}, X_3) = N(L, P) N(L, A) N(L, N) + N(M, P) N(M, A) N(M, N) + N(C, P) N(C, A) N(C, N) = 0$$

In the same way, f1(3) = 5, f1(4) = 0, f1(5) = 8, f2(3) = 1, f2(4) = 0, and f2(5) = 4.
Arranging the positions in ascending order of f1(n) gives n = 2, 4, 3, 5, 1. Amino acid residue positions that give a small value of f1(n) are candidates for ligand-determining residue positions. Here, the second, fourth, and third amino acid residue positions are taken as the candidates. The number of positions to be taken as candidates may be decided in advance. Because the values of f2(n) at these positions are smaller than those at the first and fifth positions, the second, fourth, and third positions are confirmed to be desirable candidates for ligand-determining residue positions.
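The values quoted in this worked example can be reproduced with a short Python sketch. It assumes the six alignments and ligand types of Table 1 and the pairwise-sum reading of f1 (sum over ligand pairs Xq, Xr with q < r) and product reading of f2; the function and variable names are illustrative.

```python
from collections import Counter
from itertools import combinations
from math import prod

# Alignments and ligand types of the six hypothetical proteins in Table 1.
TABLE_1 = [("TLMRK", "P"), ("TMMQK", "A"), ("TCMTK", "N"),
           ("TLLRK", "P"), ("TMLQK", "A"), ("TLLRA", "P")]
LIGAND_TYPES = ("P", "A", "N")  # X1, X2, X3


def counts_at(position):
    """N(Res, X) at a 1-based alignment position, keyed by (Res, X)."""
    return Counter((alignment[position - 1], ligand) for alignment, ligand in TABLE_1)


def f1(position):
    """Sum over residues and ligand pairs (q < r) of N(Res, Xq) * N(Res, Xr)."""
    n = counts_at(position)
    residues = {res for res, _ in n}
    return sum(n[(res, xq)] * n[(res, xr)]
               for res in residues
               for xq, xr in combinations(LIGAND_TYPES, 2))


def f2(position):
    """Sum over residues of the product of N(Res, X) over all ligand types."""
    n = counts_at(position)
    residues = {res for res, _ in n}
    return sum(prod(n[(res, x)] for x in LIGAND_TYPES) for res in residues)


f1_values = [f1(n) for n in range(1, 6)]
f2_values = [f2(n) for n in range(1, 6)]
order = sorted(range(1, 6), key=lambda n: (f1(n), n))

print(f1_values)  # [11, 0, 5, 0, 8]
print(f2_values)  # [6, 0, 1, 0, 4]
print(order)      # [2, 4, 3, 5, 1]
```

Positions 2 and 4 score zero on both functions because every residue found there occurs with a single ligand type, which is exactly the property the evaluation functions are designed to reward.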
[Process for determining the pair of ligand-determining residue positions]
A residue pair at a pair of ligand-determining residue positions that corresponds to two kinds of ligand is called a 2-crossing residue pair kind, and one that corresponds to three kinds of ligand is called a 3-crossing residue pair kind. The 2nd, 4th, and 3rd amino acid residue positions are combined with one another to list candidate pairs of ligand-determining residue positions. In this example there are three candidate pairs: (2, 3), (2, 4), and (3, 4).
First, the candidate pair (2, 3) is examined. The combinations of amino acid residues at the 2nd and 3rd positions of the sequence alignment are of five kinds: (L, M), (M, M), (C, M), (L, L), and (M, L). The "number of residue pair kinds" is therefore 5. Because the ligand corresponding to each of these five combinations is uniquely determined, there are no 2-crossing or 3-crossing residue pair kinds. Hence f3(2, 3) = 5. Similarly, f3(2, 4) = 4 and f3(3, 4) = 5. The pair of residue positions (2, 4) is therefore the most preferable and is taken as the pair of ligand-determining residue positions.
To show that f3 becomes large for an unfavorable combination of amino acid residue positions, f3(1, 5) is calculated for the combination of positions 1 and 5. The combinations of amino acid residues at the 1st and 5th positions of the sequence alignment are of two kinds: (T, K) and (T, A). The "number of residue pair kinds" is therefore 2. Because three ligand types, P, A, and N, correspond to the combination (T, K), (T, K) is a 3-crossing residue pair kind. Hence f3(1, 5) = 7. This value is larger than f3(2, 4), which shows that the function f3 is suitable for determining the pair of ligand-determining residue positions.
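The quantities that enter f3 can be counted as in the following sketch. The toy alignment is hypothetical (it is not the alignment table of this specification), and `pair_summary` is a name introduced here. Because the formula that combines these counts into f3 is not reproduced in this passage, the sketch stops at the three counts and does not assume any weighting.

```python
from collections import defaultdict

# Hypothetical toy data: (aligned sequence, ligand type). Not the patent's table.
PROTEINS = [
    ("TLMRK", "P"),
    ("TLLRK", "P"),
    ("TMMQK", "A"),
    ("TMLQA", "A"),
    ("TCMTK", "N"),
]


def pair_summary(i, j):
    """Count residue pair kinds at 1-based positions (i, j).

    Returns (number of pair kinds, number of 2-crossing kinds,
    number of 3-crossing kinds).
    """
    ligands = defaultdict(set)
    for seq, ligand in PROTEINS:
        ligands[(seq[i - 1], seq[j - 1])].add(ligand)
    kinds = len(ligands)
    two_cross = sum(1 for types in ligands.values() if len(types) == 2)
    three_cross = sum(1 for types in ligands.values() if len(types) == 3)
    return kinds, two_cross, three_cross
```

In this toy data, `pair_summary(2, 3)` finds five pair kinds with no crossing, and `pair_summary(1, 5)` finds two pair kinds of which (T, K) is 3-crossing, mirroring the situations described above.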
[Step of creating ligand-determining residue-ligand classification information]
From the above, the pair of ligand-determining residue positions was found to be positions (2, 4) of the sequence alignment. The amino acid residues at these positions and their ligands are extracted to create the ligand-determining residue-ligand classification information.

Table 2: Ligand-determining residue-ligand classification table

(2, 4)    Ligand    Ligand type
L, R      ○×        P
M, Q      ××        A
C, T      △△        N

From Table 2, for example, if the 2nd and 4th residues of the sequence alignment are L and R, the ligand is predicted to be ○× and the ligand type is predicted to be P.
[Step of predicting the binding molecule type of a protein with an unknown binding molecule]
Once ligand-determining residue-ligand classification information of this kind is available, the ligand of a protein with an unknown binding molecule (a protein whose ligand is unknown) can be predicted by finding the amino acid residues at its ligand-determining residue positions and looking them up in the classification information. For example, if the sequence alignment of a protein with an unknown binding molecule is obtained by a known method, and the 2nd and 4th amino acid residues of that alignment are M and Q, respectively, the ligand type of the protein is predicted to be A and the ligand to be ×× (Table 2). Predicting the ligand (or ligand type) of such a protein in this way fixes the protein-ligand pair, and because the physical, chemical, and biological properties of the ligand can readily be estimated, this leads to an estimate of the function of the protein. The prediction method of the present invention is therefore also useful in developing new medicines.
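The lookup described above amounts to a dictionary keyed by the residues at the ligand-determining positions. A minimal sketch, using the three rows of Table 2 (the function name `predict` and the example sequences are hypothetical):

```python
# Table 2: residues at positions (2, 4) -> (ligand, ligand type).
TABLE_2 = {
    ("L", "R"): ("○×", "P"),
    ("M", "Q"): ("××", "A"),
    ("C", "T"): ("△△", "N"),
}


def predict(aligned_seq, positions=(2, 4)):
    """Return (ligand, ligand type) for an aligned sequence, or None if the
    residue pair does not appear in the classification table."""
    key = tuple(aligned_seq[p - 1] for p in positions)
    return TABLE_2.get(key)
```

A sequence with M and Q at positions 2 and 4 returns `("××", "A")`. Returning `None` for an unseen residue pair is a design choice of this sketch; the text does not say how such cases are handled.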
The description above has dealt with ligands, but the present invention can predict not only ligands but any molecule that binds to a protein with a known binding molecule. For example, when the protein is a GPCR, the G protein that binds to that GPCR can also be predicted.
Clarifying the relationship between substances that regulate complex functions in various cells and organs and their specific receptor proteins, in particular G protein-coupled receptor proteins, helps to elucidate those functions and provides a very important means for developing medicines closely related to them. With the binding molecule prediction method of the present invention it becomes possible, for example, to predict the ligand and/or ligand type of a GPCR. If the ligand and/or ligand type of a GPCR can be predicted, the in vivo function of that GPCR can in turn be predicted. Using information on a GPCR whose function and whose ligand and/or ligand type have been predicted, prophylactic or therapeutic agents for diseases in which that GPCR is involved can then be produced more readily.
Considering the functions of GPCRs, the present invention is particularly useful in methods for producing prophylactic agents, therapeutic agents, or both, for central nervous system diseases, inflammatory diseases, cardiovascular diseases, cancer, metabolic diseases, immune system diseases, or digestive system diseases.

The SEQ ID NOs in the sequence listing of this specification indicate the following sequences.
[SEQ ID NO: 1]
Shows the amino acid sequence of rat TGR23-2 ligand (1-18).
[SEQ ID NO: 2]
Shows the amino acid sequence of rat TGR23-2 ligand (1-15).
[SEQ ID NO: 3]
Shows the amino acid sequence of rat TGR23-2 ligand (1-14).
[SEQ ID NO: 4]
Shows the nucleotide sequence of a primer used in the PCR reaction in Reference Example 6 below.
[SEQ ID NO: 5]
Shows the nucleotide sequence of a primer used in the PCR reaction in Reference Example 6 below.
[SEQ ID NO: 6]
Shows the nucleotide sequence of a primer used in the PCR reaction in Reference Example 6 below.
[SEQ ID NO: 7]
Shows the nucleotide sequence of the cDNA encoding the human TGR23-2 ligand precursor.
[SEQ ID NO: 8]
Shows the amino acid sequence of the human TGR23-2 ligand precursor.
[SEQ ID NO: 9]
Shows the amino acid sequence of human TGR23-2 ligand (1-18).
[SEQ ID NO: 10]
Shows the amino acid sequence of human TGR23-2 ligand (1-15).
[SEQ ID NO: 11]
Shows the amino acid sequence of human TGR23-2 ligand (1-14).
[SEQ ID NO: 12]
Shows the amino acid sequence of human TGR23-2 ligand (1-20).
[SEQ ID NO: 13]
Shows the nucleotide sequence of a primer used in the PCR reaction in Reference Example 7 below.
[SEQ ID NO: 14]
Shows the nucleotide sequence of a primer used in the PCR reaction in Reference Example 7 below.
[SEQ ID NO: 15]
Shows the nucleotide sequence of a primer used in the PCR reaction in Reference Example 7 below.
[SEQ ID NO: 16]
Shows the nucleotide sequence of the cDNA encoding the mouse TGR23-2 ligand precursor.
[SEQ ID NO: 17]
Shows the amino acid sequence of the mouse TGR23-2 ligand precursor.
[SEQ ID NO: 18]
Shows the amino acid sequence of mouse TGR23-2 ligand (1-18).
[SEQ ID NO: 19]
Shows the amino acid sequence of mouse TGR23-2 ligand (1-15).
[SEQ ID NO: 20]
Shows the amino acid sequence of mouse TGR23-2 ligand (1-14).
[SEQ ID NO: 21]
Shows the amino acid sequence of mouse TGR23-2 ligand (1-20).
[SEQ ID NO: 22]
Shows the nucleotide sequence of a primer used in the PCR reaction in Reference Example 8 below.
[SEQ ID NO: 23]
Shows the nucleotide sequence of the cDNA encoding part of the rat TGR23-2 ligand precursor.
[SEQ ID NO: 24]
Shows the amino acid sequence of part of the rat TGR23-2 ligand precursor.
[SEQ ID NO: 25]
Shows the amino acid sequence of rat TGR23-2 ligand (1-20).
[SEQ ID NO: 26]
Shows the amino acid sequence of human TGR23-2 ligand (1-16).
[SEQ ID NO: 27]
Shows the nucleotide sequence of a primer used in the PCR reaction in Reference Example 9 below.
[SEQ ID NO: 28]
Shows the nucleotide sequence of a primer used in the PCR reaction in Reference Example 9 below.
[SEQ ID NO: 29]
Shows the nucleotide sequence of a primer used in the PCR reaction in Reference Example 9 below.
[SEQ ID NO: 30]
Shows the nucleotide sequence of the cDNA encoding the rat TGR23-2 ligand precursor.
[SEQ ID NO: 31]
Shows the amino acid sequence of the rat TGR23-2 ligand precursor.
[SEQ ID NO: 32]
Shows the nucleotide sequence of primer 1 used in the PCR reaction in Reference Example 10 below.
[SEQ ID NO: 33]
Shows the nucleotide sequence of primer 2 used in the PCR reaction in Reference Example 10 below.
[SEQ ID NO: 34]
Shows the nucleotide sequence of a primer used to measure the TGR23-1 gene expression level of TGR23-1-expressing CHO cells in Reference Example 11 below.
[SEQ ID NO: 35]
Shows the nucleotide sequence of a primer used to measure the TGR23-1 gene expression level of TGR23-1-expressing CHO cells in Reference Example 11 below.
[SEQ ID NO: 36]
Shows the nucleotide sequence of the probe used to measure the TGR23-1 gene expression level of TGR23-1-expressing CHO cells in Reference Example 11 below. The 5' end is labeled with 6-carboxyfluorescein (Fam) and the 3' end with 6-carboxytetramethylrhodamine (Tamra).
[SEQ ID NO: 37]
Shows the amino acid sequence of the human-derived G protein-coupled receptor protein TGR23-1 (human TGR23-1).
[SEQ ID NO: 38]
Shows the nucleotide sequence of the cDNA encoding the human-derived G protein-coupled receptor protein TGR23-1.
[SEQ ID NO: 39]
Shows the amino acid sequence of the human-derived G protein-coupled receptor protein TGR23-2 (human TGR23-2).
[SEQ ID NO: 40]
Shows the nucleotide sequence of the cDNA encoding the human-derived G protein-coupled receptor protein TGR23-2.

Examples
In the following examples, GPCRs are used as the proteins with known binding molecules and the proteins with unknown binding molecules, but the present invention is not limited to these as long as its gist is not exceeded.
[Example 1]
(Prediction of the ligand type of GPCRs)
From GPCRDB, a database in which GPCR sequence alignments and ligands are registered, information on the sequence alignments and ligands of 1152 GPCRs was obtained, giving the classification information for proteins with known binding molecules (Table 3). The ligands obtained were then divided into three types, as follows.
P: Peptides, chemokines, glycoproteins
A: Monoamines (adrenaline, acetylcholine, dopamine, serotonin, histamine)
N: Lipids
The information on the sequence alignments and ligands of the 1152 GPCRs was input to a computer. Table 3 shows six ligand-determining residue position candidates selected from this information by the ligand-determining residue position candidate selection means using the functions f1(n) and f2(n).
In the ligand-determining residue position candidate selection means, positions for which the product of the f1(n) value and the f2(n) value is small are chosen as ligand-determining residue position candidates. Any of these positions at which unalignable residues (-) made up 3% or more of the whole was excluded from the candidates. In this way, 20 ligand-determining residue position candidates were selected. Table 3 gives the evaluation values and ranks of f1(n) and f2(n) for the six more preferable candidates among them.

Table 3: Evaluation values and ranks of f1(n) and f2(n) for the six preferred ligand-determining residue position candidates

Residue position    f1(n) rank    f1(n) value    f2(n) rank    f2(n) value
82                  1             9222           10            51554
86                  5             13317          3             22593
90                  2             10112          1             21168
91                  13            15016          12            62274
209                 18            16389          21            81468
230                 10            14618          25            93622

According to Table 3, for example, residue position 86 of the sequence alignment ranks 5th for f1(n) and 3rd for f2(n). The above six kinds of ligand-determining residue position candidates were combined to list 360 candidate pairs of ligand-determining residue positions. The information on these 360 candidate pairs was substituted into Formula 3 to determine the pair of ligand-determining residue positions. As a result, the pair with the smallest f3 value, and thus the most preferable pair of ligand-determining residue positions, was (86, 90), that is, the 86th and 90th amino acid residue positions of the GPCR sequence alignment. In addition, using as the criterion whether a pair can complement the prediction derived from (86, 90), the pairs (209, 211) and (86, 236) were selected as the next most preferable ligand-determining residue positions.
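The selection rule (small product of f1 and f2, with gappy alignment columns excluded) can be illustrated with the evaluation values of Table 3. This is a sketch only: the function names are introduced here, and the alignment columns passed to the gap filter in the usage example are hypothetical, since the actual columns are not given in this passage.

```python
# Evaluation values from Table 3: residue position -> (f1 value, f2 value).
TABLE_3 = {
    82: (9222, 51554),
    86: (13317, 22593),
    90: (10112, 21168),
    91: (15016, 62274),
    209: (16389, 81468),
    230: (14618, 93622),
}


def too_gappy(column, limit_percent=3):
    """True if unalignable residues ('-') make up limit_percent or more
    of the alignment column."""
    return column.count("-") * 100 >= limit_percent * len(column)


def rank_by_product(scores, columns=None):
    """Sort positions by f1 * f2, smallest first.

    If `columns` maps a position to its alignment column, positions whose
    column is too gappy are dropped first.
    """
    kept = [
        p for p in scores
        if not (columns and p in columns and too_gappy(columns[p]))
    ]
    return sorted(kept, key=lambda p: scores[p][0] * scores[p][1])


ranking = rank_by_product(TABLE_3)
```

On the Table 3 values the two smallest products belong to positions 90 and 86, which is consistent with (86, 90) being reported as the most preferable pair, although the pair itself is chosen by f3 rather than by this product.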
From the classification information for proteins with known binding molecules, the kinds of amino acid residues at the 86th and 90th positions of the GPCR sequence alignment and the number of GPCRs of each ligand type were extracted to obtain the ligand-determining residue-ligand classification information. An excerpt is shown in Table 4. Table 4 shows, for example, that among the 1152 GPCRs there are 86 in which the amino acids at residue positions 86 and 90 are A and G, respectively, and that the ligands of all of them are classified as N (lipid). For most GPCRs, therefore, the ligand can be predicted from the amino acids at residue positions 86 and 90.

Table 4: Ligand-determining residue-ligand classification table (binding molecule-determining residue-binding molecule classification table)

Residue 86    Residue 90    A      N     P     Total
A             G             0      86    0     86
D             C             114    0     0     114
D             M             26     0     0     26
D             Q             13     0     0     13
D             S             68     0     0     68
D             V             26     0     0     26
F             L             0      20    6     26
F             M             0      2     13    15
G             G             0      58    0     58
I             L             0      4     23    27
I             M             0      0     32    32
I             V             0      1     7     8
K             F             0      0     11    11
K             M             0      0     10    10
L             M             0      0     9     9
M             G             0      26    0     26
M             V             11     0     13    24
P             M             0      0     7     7
P             V             0      0     8     8
Q             I             0      0     15    15
Q             M             0      0     25    25
Q             V             0      0     48    48
T             S             0      0     26    26
V             F             0      9     0     9
V             G             0      30    0     30
V             L             0      0     19    19
V             T             17     0     0     17
Y             F             0      0     42    42
Y             L             0      0     28    28

Next, GPCRs whose ligands had recently been discovered were selected at random, and the accuracy of the above ligand-determining residue-ligand classification information (that is, the accuracy of the present ligand prediction method) was analyzed. The results are shown in Table 5.

Table 5
GPCR      Residues (86, 90)    N        Residues (209, 211)    N        Residues (86, 236)    N
GPR2      I, M                 32/32    V, F                   4/26     I, N                  61/66
GPR5      A, L                 1/2      Q, L                   6/7      A, H                  0/0
GPR8      D, I                 1/1      A, T                   1/3      D, N                  40/234
GPR10     Q, V                 48/48    R, L                   3/20     Q, S                  34/34
GPR13     F, F                 4/4      E, L                   10/10    F, H                  5/9
GPR14     D, M                 26/26    A, Y                   0/0      D, N                  40/234
GPR16     C, M                 1/3      E, S                   0/0      C, S                  0/4
GPR24     D, G                 13/13    Q, S                   1/1      D, N                  40/234
GPR28     Y, F                 42/42    Q, I                   1/1      Y, H                  71/71
GPR29     Y, L                 28/28    S, F                   12/19    Y, H                  71/71
GPR54     Q, V                 48/48    Q, L                   5/6      Q, N                  52/52
GPR74     Q, V                 48/48    S, Y                   2/2      Q, N                  52/52
APJ       I, M                 32/32    Y, L                   2/7      I, N                  61/66
(illegible)  Q, V              48/48    I, Y                   0/0      Q, H                  3/3

Table 5 is read as follows. For example, in the sequence alignment of GPR2, the amino acid residues at positions (86, 90) (the binding molecule-determining residue positions) are I and M, respectively. Among the 1152 GPCRs with known binding molecules in the binding molecule-determining residue-binding molecule classification information, 32 have I and M at positions (86, 90), and all 32 of them have a peptide as the ligand type. GPR2 itself is not among the 1152 GPCRs with known binding molecules, but its actual ligand type is peptide, which agrees with the predicted ligand type.
Likewise, in the sequence alignment of GPR2, the amino acid residues at the binding molecule-determining residue positions (209, 211) are V and F, respectively. Among the 1152 GPCRs, 26 have V and F at positions 209 and 211, and 4 of these have a peptide as the ligand type.
Table 5 shows that (86, 90) is the most preferable of the three pairs of binding molecule-determining residue positions. Table 5 further shows that the ligand type can be predicted with high accuracy.
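The way an N value follows from the classification information can be sketched as a lookup in Table 4 followed by a majority choice. Only a few rows of Table 4 are copied below, and the function name `predict_type` is introduced here; choosing the most frequent ligand type as the prediction is an assumption of this sketch rather than a rule stated in the text.

```python
# A few rows of Table 4: residues at (86, 90) -> counts per ligand type.
TABLE_4 = {
    ("A", "G"): {"A": 0, "N": 86, "P": 0},
    ("F", "L"): {"A": 0, "N": 20, "P": 6},
    ("I", "L"): {"A": 0, "N": 4, "P": 23},
    ("I", "M"): {"A": 0, "N": 0, "P": 32},
    ("Q", "V"): {"A": 0, "N": 0, "P": 48},
}


def predict_type(residues):
    """Return (most frequent ligand type, its count, total count) for a
    residue pair, or None if the pair is not in the table."""
    counts = TABLE_4.get(residues)
    if counts is None:
        return None
    best = max(counts, key=counts.get)
    return best, counts[best], sum(counts.values())
```

For GPR2, whose residues at (86, 90) are I and M, this gives peptide with 32 of 32, matching the N value of 32/32 in Table 5.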
Next, the accuracy of the ligand-determining residue-ligand classification information obtained by combining the binding molecule-determining residue positions (86, 90) and (209, 211) was analyzed.
For example, from Table 5, the N values of GPR2 at the binding molecule-determining residue positions (86, 90) and (209, 211) are 32/32 and 4/26, respectively, where the denominator of N is the number of the 1152 GPCRs whose binding molecule-determining residues match those of GPR2 and the numerator is the number of those whose ligand type is also the same. Adding these gives 36/58.
In this way, the N values at the binding molecule-determining residue positions (86, 90) and (209, 211) were added together for every GPCR listed in Table 5. As a result, the number of GPCRs on which each evaluation is based increased, and it was confirmed that the ligand type can be predicted more accurately than when it is predicted from (86, 90) or (209, 211) alone.
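The addition of N values is done numerator by numerator and denominator by denominator, not as a sum of fractions. A short sketch, with N values copied from Table 5 (the helper name `combine` is hypothetical):

```python
def combine(*n_values):
    """Add N values given as (hits, total) tuples component-wise."""
    hits = sum(h for h, _ in n_values)
    total = sum(t for _, t in n_values)
    return hits, total


# N values at (86, 90) and (209, 211) from Table 5.
gpr2 = combine((32, 32), (4, 26))
gpr5 = combine((1, 2), (6, 7))
gpr10 = combine((48, 48), (3, 20))
```

A pair with N = 0/0 contributes nothing, so a GPCR whose residues are unseen at one pair of positions is still evaluated through the other pair.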
[Example 2]
(Method of predicting the type of Gα protein that binds to a GPCR)
First, the Gα proteins that bind to GPCRs as their binding molecules were classified into three types, Gi, Gq, and Gs, following the 2000 Receptor & Ion Channel Nomenclature Supplement of TiPS (Trends in Pharmacological Sciences). For simplicity, Gi/o in the Supplement is written as Gi and Gq/11 as Gq. Among the bound Gα proteins listed in the Supplement, cases of binding to two or more kinds of Gα protein, as well as Gi/α1,3 and Gi/α2,3, were excluded as exceptions. In this way, information was obtained on about 600 GPCRs, their sequence alignments, the Gα proteins that bind to them, and the types of those Gα proteins.
The GPCR sequence alignments and the information on the type of Gα protein that binds to each GPCR were input to a computer. The input information was processed by the binding molecule-determining residue position candidate selection means using the functions f1(n) and f2(n). As a result, two pairs of binding molecule-determining residue positions, (177, 178) and (82, 230), were obtained.
In the binding molecule-determining residue position candidate selection means, positions for which the product of the f1(n) value and the f2(n) value is small are chosen as binding molecule-determining residue position candidates. Any of these positions at which unalignable residues (-) made up 3% or more of the whole was excluded from the candidates. The candidates were selected in this way.
Next, the accuracy with which the two pairs of binding molecule-determining residue positions, (177, 178) and (82, 230), predict the type of binding molecule was checked.
The types of Gα protein that bind to a number of GPCRs were obtained from the literature. The method by which each Gα protein assignment had been obtained was classified as Ca influx, arachidonic acid release, PTX (pertussis toxin) sensitivity, or cyclic adenosine monophosphate, denoted Ca, AA, PTX, and cAMP, respectively. Cases in which Gq had been assigned from Ca alone were excluded, because the bound Gα protein might be a different one. The GPCRs were selected in this way.
Table 6 shows the selected GPCRs, the type of bound Gα protein obtained from the literature, the method by which that type was obtained, the residues of the sequence alignment at the pairs of binding molecule-determining residue positions (177, 178) and (82, 230), and the type of Gα protein predicted from the pairs.

Table 6

GPCR     Residues (177, 178)    N      Residues (82, 230)    N        Experiment    Prediction
GPR5     R, S                   9/9    N, R                  0/0      Gi (PTX)      Gi
GPR8     L, G                   4/4    L, T                  0/0      Gi (cAMP)     Gq
GPR13    K, N                   3/3    T, E                  32/32    Gi (PTX)      Gi
GPR14    A, R                   1/2    F, T                  1/1      Gq (AA)       Gq
GPR16    F, ?                   3/3    H, I                  3/3      Gq (Ca)       Gq
GPR24    I, R                   1/1    I, I                  9/9      Gi (cAMP)     Gi
GPR39    S, R                   0/0    T, E                  32/32    Gq (AA)       Gi
APJ      G, L                   2/2    S, T                  0/0      Gi (cAMP)     Gi

Take GPR5 in Table 6 as an example. The type of Gα protein bound by GPR5 is Gi, and this was determined by PTX. The residues at the 177th and 178th positions of the GPR5 sequence alignment are R and S. Nine GPCRs have these residues, and the Gα proteins coupled to all of them are Gi. Of the eight GPCRs above, the prediction of the bound Gα protein is correct for six: GPR5, GPR13, GPR14, GPR16, GPR24, and APJ. This shows that the present invention makes it possible to predict the bound Gα protein with high accuracy.
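The hit count quoted above can be checked by comparing the experimental and predicted columns of Table 6. In this sketch the experimental entries for GPR14 and GPR16, which are garbled in the source, are read as Gq; that reading is an assumption based on the statement that both predictions are correct.

```python
# Table 6 as (GPCR, experimental G-alpha type, predicted G-alpha type).
# GPR14 and GPR16 experimental values are assumed to be Gq (garbled in source).
TABLE_6 = [
    ("GPR5", "Gi", "Gi"),
    ("GPR8", "Gi", "Gq"),
    ("GPR13", "Gi", "Gi"),
    ("GPR14", "Gq", "Gq"),
    ("GPR16", "Gq", "Gq"),
    ("GPR24", "Gi", "Gi"),
    ("GPR39", "Gq", "Gi"),
    ("APJ", "Gi", "Gi"),
]

# GPCRs whose predicted type matches the experimental type.
correct = [name for name, observed, predicted in TABLE_6 if observed == predicted]
accuracy = len(correct) / len(TABLE_6)
```

Six of the eight predictions agree with experiment, so `accuracy` is 0.75; the two misses are GPR8 and GPR39.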
[Example 3]
(Prediction of the TGR23 ligand)
Evaluating two SDM pairs in combination was found to predict the type of ligand with high accuracy. The method of the present invention was therefore applied to TGR23, the G protein-coupled receptor protein represented by SEQ ID NO: 37 and SEQ ID NO: 39: its amino acid sequence information was subjected to sequence alignment, and the TGR23 ligand was predicted from the result. FIG. 4 shows the result of the sequence alignment of TGR23-1 with bovine rhodopsin.
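The tally behind an N value, such as those in Table 6, can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the invention: N is read here as (number of matching GPCRs that share the most common type) / (number of GPCRs whose residues at the pair positions match the query), and the database entries, the function name `evaluate_pair` and the dictionary layout are all hypothetical.

```python
from collections import Counter

# Hypothetical toy database. Each entry maps alignment positions to residues
# and records a known type (a G-alpha type here; a ligand type works the same way).
DATABASE = [
    {"residues": {177: "R", 178: "S"}, "type": "Gi"},
    {"residues": {177: "R", 178: "S"}, "type": "Gi"},
    {"residues": {177: "R", 178: "S"}, "type": "Gi"},
    {"residues": {177: "L", 178: "G"}, "type": "Gq"},
]


def evaluate_pair(database, positions, query_residues):
    """Tally types among entries whose residues match the query at every position.

    Returns (predicted_type, hits, total); predicted_type is None when total is 0.
    """
    matched = [
        entry["type"]
        for entry in database
        if all(entry["residues"].get(pos) == res
               for pos, res in zip(positions, query_residues))
    ]
    if not matched:
        return None, 0, 0  # corresponds to N = 0/0
    predicted, hits = Counter(matched).most_common(1)[0]
    return predicted, hits, len(matched)


print(evaluate_pair(DATABASE, (177, 178), ("R", "S")))  # ('Gi', 3, 3)
print(evaluate_pair(DATABASE, (177, 178), ("K", "N")))  # (None, 0, 0)
```

A pair that returns 0/0 has no matching GPCRs in the database and so contributes no tally of its own.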
As shown in FIG. 4, the amino acids of TGR23-1 at the binding molecule-determining residue position pairs (86, 90), (209, 211) and (86, 236) were (Q, L), (D, F) and (Q, N), respectively. The corresponding N values (among the 1152 GPCRs, the number whose binding molecule-determining residues matched those of TGR23 and the number of these whose ligand type was also the same) were 0/0, 13/27 and 52/52, respectively. In the evaluation combining the SDM pairs (86, 90) and (86, 236), which cover a broad number of evaluated GPCRs, the ligand type was predicted to be a peptide.
The following Reference Examples show that the ligand of TGR23 (TGR23-1 and TGR23-2) is in fact a peptide.
[Reference Example 1]
(Purification, from rat whole brain extract, of active substances that specifically exhibit cAMP production-promoting activity on TGR23-2-expressing CHO cells)
Substances showing ligand activity specific to TGR23-2 were purified from rat whole brain, using cAMP production-promoting activity on TGR23-2-expressing CHO cells as the index.
High performance liquid chromatography (HPLC) fractions of rat whole brain extract were prepared by the method described below. Whole brains (400 g, from 200 animals) of 8-week-old male Wistar rats purchased from Charles River Japan, Inc. were removed in sequence and, immediately after removal, placed 25 brains at a time into boiling distilled water (300 ml) and boiled for 10 minutes. After boiling, the material was immediately cooled on ice, the brains from all 200 animals were pooled (2.4 L), 180 ml of acetic acid was added to a final concentration of 1.0 M, and the mixture was homogenized at low temperature with a Polytron (10,000 rpm, 2 minutes). The homogenate was centrifuged (8,000 rpm, 30 minutes) and the supernatant was collected. To the precipitate, 2.4 L of 1.0 M acetic acid was added; the mixture was homogenized again with the Polytron, stirred overnight, and centrifuged (8,000 rpm, 30 minutes) to obtain a supernatant. To the supernatant from each centrifugation, 2 volumes (4.8 L) of cold acetone were slowly added dropwise at 4°C; the supernatant from the first centrifugation was then stirred overnight, and the supernatant from the second centrifugation was stirred for 4 hours. The acetone-treated extract was centrifuged (8,000 rpm, 30 minutes) to remove the precipitate, and acetone was distilled off from the resulting supernatant on an evaporator under reduced pressure. An equal volume of diethyl ether was added to the acetone-free extract, the lipid-containing ether layer was separated using a separatory funnel, and the aqueous layer was recovered. The ether-defatted extract was concentrated on an evaporator under reduced pressure to remove the ether completely. The concentrate was filtered through glass fiber filter paper (Advantec, DP70 (90 mmφ)), and the filtrate was applied to an ODS column (Daiso, Daisogel IR-120-ODS-A 63/210 µm) packed in a glass column (30 × 240 mm). The column was washed with 400 ml of 1.0 M acetic acid and then eluted with 500 ml of 60% acetonitrile containing 0.1% trifluoroacetic acid. The eluate was concentrated under reduced pressure to distil off the solvent, and the concentrate was lyophilized. The resulting white powder (1.2 g) was dissolved in 30 ml of 10% acetonitrile containing 0.1% trifluoroacetic acid, and 12.5 ml portions were subjected to preparative HPLC on an ODS column (Tosoh, TSKgel ODS-80TS (21.5φ × 300 mm)) by gradient elution from 10% to 60% acetonitrile containing 0.1% trifluoroacetic acid. The HPLC was performed in two runs; the eluate was collected every 2 minutes into 60 fractions, and the eluates of the two runs were combined. Each fraction was concentrated to dryness under reduced pressure, 0.4 ml of dimethyl sulfoxide (DMSO) was added to the residue, and the residue was completely dissolved using a vortex mixer and an ultrasonic cleaner.
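Because the eluate was collected every 2 minutes into 60 fractions, a fraction number maps to an elution time window by simple arithmetic. The following is a minimal sketch, assuming that collection starts at time zero and that fractions are numbered from 1 (neither is stated in the text); `fraction_window` is a hypothetical helper.

```python
FRACTION_MINUTES = 2  # collection interval stated in the text
N_FRACTIONS = 60      # number of fractions stated in the text


def fraction_window(number, minutes_per_fraction=FRACTION_MINUTES):
    """Return (start, end) in minutes for a 1-based fraction number."""
    if not 1 <= number <= N_FRACTIONS:
        raise ValueError(f"fraction number must be in 1..{N_FRACTIONS}")
    start = (number - 1) * minutes_per_fraction
    return start, start + minutes_per_fraction


for n in (18, 20, 22, 23):
    print(n, fraction_window(n))
# 18 (34, 36)
# 20 (38, 40)
# 22 (42, 44)
# 23 (44, 46)

print(N_FRACTIONS * FRACTION_MINUTES)  # 120 minutes of collection per run
```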
The DMSO solutions of the HPLC fractions obtained above were administered to TGR23-2-expressing CHO cells according to the method described in Reference Example 3, and intracellular cAMP production was measured. Marked cAMP production-promoting activity was found in fraction Nos. 18, 20 and 22 to 23. When the same samples were examined for arachidonic acid metabolite-releasing activity by a known method, marked activity was likewise confirmed.
Because these activities were not observed in cells expressing other receptors, rat whole brain extract was shown to contain substances with ligand activity specific to TGR23-2. The three active fractions obtained were each further purified by methods (a) to (c) below. For every active fraction, the fractions that showed cAMP production-promoting activity in the first purification step described below, which used a cation exchange column, also showed receptor-specific intracellular calcium-releasing activity when measured with FLIPR (Molecular Devices). In the subsequent purification steps, activity was therefore monitored using intracellular calcium-releasing activity measured with FLIPR as the index, and it was confirmed as appropriate that the active fractions also showed cAMP production-promoting activity.
(a) Fraction No. 18
Fraction No. 18 was dissolved in 10 ml of 10 mM ammonium formate containing 10% acetonitrile, applied to a cation exchange column (Tosoh, TSKgel SP-5PW (20 mmφ × 150 mm)), and eluted with a concentration gradient of 10 mM to 1.0 M ammonium formate containing 10% acetonitrile. Activity was recovered at around 0.4 M ammonium formate. The active fraction was lyophilized, dissolved in 0.8 ml of 10% acetonitrile containing 0.1% trifluoroacetic acid, applied to an ODS column (Tosoh, TSKgel ODS-80TS (4.6φ × 250 mm)), and eluted with a concentration gradient of 10% to 25% acetonitrile containing 0.1% trifluoroacetic acid; activity was found at around 13% acetonitrile. The active fraction obtained was lyophilized and dissolved in 0.1 ml of DMSO, a further 0.7 ml of 10% acetonitrile containing 0.1% heptafluorobutyric acid was added, and the solution was applied to an ODS column (Wako Pure Chemical Industries, Wakosil-II 3C18HG (2.0 mmφ × 150 mm)) and eluted with a concentration gradient of 10% to 37.5% acetonitrile containing 0.1% heptafluorobutyric acid, each peak being collected manually. Activity was found at around 26% acetonitrile. To the active fraction, a further 0.7 ml of 10% acetonitrile containing 0.1% trifluoroacetic acid was added, and the solution was applied to an ODS column (Wako Pure Chemical Industries, Wakosil-II 3C18HG) and eluted with a concentration gradient of 10% to 20% acetonitrile containing 0.1% trifluoroacetic acid, the eluate being collected manually peak by peak. Activity was obtained as a single peak at around 11% acetonitrile. The structure of the active substance contained in this fraction was determined as described in Reference Example 5 below.
(b) Fraction No. 20
Fraction No. 20 was dissolved in 10 ml of 10 mM ammonium formate containing 10% acetonitrile, applied to a cation exchange column (Tosoh, TSKgel SP-5PW (20 mmφ × 150 mm)), and eluted with a concentration gradient of 10 mM to 1.0 M ammonium formate containing 10% acetonitrile. Activity was recovered at around 0.6 M ammonium formate. The active fraction was lyophilized, dissolved in 0.8 ml of 10% acetonitrile containing 0.1% trifluoroacetic acid, applied to a CN column (Nomura Chemical, Develosil CN-UG-5 (4.6 mmφ × 250 mm)), and eluted with a concentration gradient of 10% to 25% acetonitrile containing 0.1% trifluoroacetic acid; activity was found at around 12% acetonitrile. The active fraction obtained was lyophilized and dissolved in 0.1 ml of DMSO, a further 0.7 ml of 10% acetonitrile containing 0.1% trifluoroacetic acid was added, and the solution was applied to an ODS column (Wako Pure Chemical Industries, Wakosil-II 3C18HG (2.0 mmφ × 150 mm)) and eluted with a concentration gradient of 10% to 20% acetonitrile containing 0.1% trifluoroacetic acid, the eluate being collected manually peak by peak. Activity was obtained as a single peak at around 15% acetonitrile. The structure of the active substance contained in this fraction was determined as described in Reference Example 3 below.
(c) Fraction Nos. 22 to 23
Fraction Nos. 22 to 23 were dissolved in 10 ml of 10 mM ammonium formate containing 10% acetonitrile, applied to a cation exchange column (Tosoh, TSKgel SP-5PW (20 mmφ × 150 mm)), and eluted with a concentration gradient of 10 mM to 1.0 M ammonium formate containing 10% acetonitrile. Activity was recovered at around 0.4 M ammonium formate. The active fraction was lyophilized, dissolved in 0.8 ml of 10% acetonitrile containing 0.1% trifluoroacetic acid, applied to a CN column (Nomura Chemical, Develosil CN-UG-5 (4.6 mmφ × 250 mm)), and eluted with a concentration gradient of 10% to 25% acetonitrile containing 0.1% trifluoroacetic acid; activity was found at around 13% acetonitrile. The active fraction obtained was lyophilized and dissolved in 0.1 ml of DMSO, a further 0.7 ml of 10% acetonitrile containing 0.1% trifluoroacetic acid was added, and the solution was applied to an ODS column (Wako Pure Chemical Industries, Wakosil-II 3C18HG (2.0 mmφ × 150 mm)) and eluted with a concentration gradient of 10% to 20% acetonitrile containing 0.1% trifluoroacetic acid, each peak being collected manually. Activity was found at around 16% acetonitrile. To the active fraction, a further 0.7 ml of 10% acetonitrile containing 0.1% heptafluorobutyric acid was added, and the solution was applied to an ODS column (Wako Pure Chemical Industries, Wakosil-II 3C18HG) and eluted with a concentration gradient of 10% to 37.5% acetonitrile containing 0.1% heptafluorobutyric acid, the eluate being collected manually peak by peak. Activity was obtained as a single peak at around 28% acetonitrile. The structure of the active substance contained in this fraction was determined as described in Reference Example 4 below.
[Reference Example 2]
(Inactivation by pronase of the active substances in rat whole brain extract that specifically exhibit intracellular cAMP production-promoting activity on TGR23-2-expressing CHO cells)
HPLC fractions 18, 20 and 22 to 23, which showed intracellular cAMP production-promoting activity on TGR23-2-expressing CHO cells in Reference Example 1, were treated with the proteolytic enzyme pronase (Sigma, protease Type XIV (P5147)) to examine whether the active substances are proteinaceous.
A 4 µl portion of each of the above active HPLC fractions of rat whole brain extract (fraction Nos. 18, 20 and 22 to 23) was added to 100 µl of 0.2 M ammonium acetate, 3 µg of pronase was added, and the mixture was incubated at 37°C for 2 hours and then heated in boiling water for 10 minutes to inactivate the added pronase. To this was added 1 ml of distilled water containing 0.05 mg of BSA and 0.05 mg of CHAPS, and the mixture was lyophilized. The lyophilized sample was added to TGR23-2-expressing CHO cells according to a known method, and intracellular cAMP production-promoting activity was measured.
As a result, the activity of every fraction was completely abolished by the pronase treatment. It was therefore concluded that the active substances in rat whole brain extract that exhibit intracellular cAMP production-promoting activity on TGR23-2-expressing CHO cells are all proteins or peptides.
[Reference Example 3]
(Determination of the amino acid sequence of the active substance, obtained from fraction No. 20 of rat whole brain extract, that specifically exhibits cAMP production-promoting activity on TGR23-2-expressing CHO cells)
As shown in Reference Example 2, the active substances contained in the three fractions of rat whole brain extract that specifically exhibit cAMP production-promoting activity on TGR23-2-expressing CHO cells were all expected to be proteinaceous, so amino acid sequence analysis was performed on each as follows.
Amino acid sequence analysis and mass spectrometry were performed on the active substance obtained from fraction No. 20 of rat whole brain extract as described in Reference Example 1. Amino-terminal amino acid sequence analysis of the eluate containing the active peak, using a Procise 491cLC protein sequencer (Applied Biosystems), gave the amino acid sequence SFRNGVGSGVKKTSFRRA (SEQ ID NO: 1) from the N-terminus to residue 18. Mass spectrometry of the same eluate, using a Thermo Finnigan LCQ ion trap mass spectrometer (ThermoQuest) equipped with a nanospray ion source (Protana), gave a mass value corresponding to that calculated from the amino acid sequence of SEQ ID NO: 1 (measured: 1954.9; calculated: 1954.2).
From these results, the active substance obtained from fraction No. 20 of rat whole brain extract that specifically exhibits cAMP production-promoting activity on TGR23-2-expressing CHO cells was determined to have the amino acid sequence shown in SEQ ID NO: 1.
[Reference Example 4]
(Determination of the amino acid sequence of the active substance, obtained from fraction Nos. 22 to 23 of rat whole brain extract, that specifically exhibits cAMP production-promoting activity on TGR23-2-expressing CHO cells)
Amino acid sequence analysis and mass spectrometry were performed on the active substance obtained from fraction Nos. 22 to 23 of rat whole brain extract as described in Reference Example 1. Amino-terminal amino acid sequence analysis of the eluate containing the active peak, using a Procise 491cLC protein sequencer (Applied Biosystems), gave the amino acid sequence SFRNGVGSGVKKTSF (SEQ ID NO: 2) from the N-terminus to residue 15. Mass spectrometry of the same eluate, using a Thermo Finnigan LCQ ion trap mass spectrometer (ThermoQuest) equipped with a nanospray ion source (Protana), gave a mass value corresponding to that calculated from the amino acid sequence of SEQ ID NO: 2 (measured: 1570.8; calculated: 1570.8).
From these results, the active substance obtained from fraction Nos. 22 to 23 of rat whole brain extract that specifically exhibits cAMP production-promoting activity on TGR23-2-expressing CHO cells was determined to have the amino acid sequence shown in SEQ ID NO: 2.
[Reference Example 5]
(Determination of the amino acid sequence of the active substance, obtained from fraction No. 18 of rat whole brain extract, that specifically exhibits cAMP production-promoting activity on TGR23-2-expressing CHO cells)
Amino acid sequence analysis and mass spectrometry were performed on the active substance obtained from fraction No. 18 of rat whole brain extract as described in Reference Example 1. Amino-terminal amino acid sequence analysis of the eluate containing the active peak, using a Procise 491cLC protein sequencer (Applied Biosystems), gave the amino acid sequence SFRNGVGSGVKKTS (SEQ ID NO: 3) from the N-terminus to residue 14. Mass spectrometry of the same eluate, using a Thermo Finnigan LCQ ion trap mass spectrometer (ThermoQuest) equipped with a nanospray ion source (Protana), gave a mass value corresponding to that calculated from the amino acid sequence of SEQ ID NO: 3 (measured: 1424.1; calculated: 1423.6).
From these results, the active substance obtained from fraction No. 18 of rat whole brain extract that specifically exhibits cAMP production-promoting activity on TGR23-2-expressing CHO cells was determined to have the amino acid sequence shown in SEQ ID NO: 3.
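The calculated mass values quoted in Reference Examples 3 to 5 can be reproduced from the three sequences. The following is a minimal sketch, assuming that the quoted values are average (not monoisotopic) masses of the free, unmodified peptides; the text does not state this, and the residue masses below are standard average values rather than figures taken from the document. The names `RESIDUE_MASS`, `average_mass` and `PEPTIDES` are hypothetical.

```python
# Standard average residue masses (Da) for the amino acids in the three peptides.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "V": 99.1326, "T": 101.1051,
    "N": 114.1038, "K": 128.1741, "F": 147.1766, "R": 156.1875,
}
WATER = 18.0153  # added once per peptide for the free N- and C-termini


def average_mass(sequence):
    """Average molecular mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


# Sequence and measured mass as reported in Reference Examples 3, 4 and 5.
PEPTIDES = {
    "SEQ ID NO: 1": ("SFRNGVGSGVKKTSFRRA", 1954.9),
    "SEQ ID NO: 2": ("SFRNGVGSGVKKTSF", 1570.8),
    "SEQ ID NO: 3": ("SFRNGVGSGVKKTS", 1423.6 + 0.5),  # measured 1424.1
}

for name, (sequence, measured) in PEPTIDES.items():
    calculated = average_mass(sequence)
    print(f"{name}: calculated {calculated:.1f}, measured {measured:.1f}")
# The calculated values print as 1954.2, 1570.8 and 1423.6.
```

Under these assumptions the three calculated values agree with those in the text, which is consistent with the three substances being C-terminally truncated forms of one peptide.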
[Reference Example 6]
(Cloning of cDNA encoding the human TGR23-2 ligand precursor)
To clone cDNA encoding the precursor of the human homolog (sometimes referred to herein as human TGR23-2 ligand) of the active peptide obtained from rat whole brain extract that specifically exhibits cAMP production-promoting activity on TGR23-2-expressing CHO cells (sometimes referred to herein as rat TGR23-2 ligand), PCR was performed using human hypothalamus-derived cDNA as the template.
Amplification by PCR was carried out using the following synthetic DNA primers and human hypothalamus-derived cDNA as the template. The reaction mixture consisted of 0.8 µl of human hypothalamus Marathon-Ready cDNA (CLONTECH), 1.0 µM each of the synthetic DNA primers of SEQ ID NO: 4 and SEQ ID NO: 5, 0.2 mM dNTPs, 0.1 µl of ExTaq (Takara Shuzo) and the ExTaq buffer supplied with the enzyme, in a total reaction volume of 20 µl. Amplification was performed in a thermal cycler (PE Biosystems): heating at 94°C for 300 seconds, then 35 cycles of 94°C for 10 seconds, 55°C for 30 seconds and 72°C for 30 seconds, and finally incubation at 72°C for 5 minutes. Next, a reaction with a total volume of 20 µl was prepared from 2 µl of the PCR reaction mixture diluted 50-fold with DNase- and RNase-free distilled water, 1.0 µM each of the synthetic DNA primers of SEQ ID NO: 4 and SEQ ID NO: 6, 0.2 mM dNTPs, 0.1 µl of ExTaq polymerase (Takara Shuzo) and the ExTaq buffer supplied with the enzyme. Using a thermal cycler (PE Biosystems), this was heated at 94°C for 300 seconds, subjected to 35 cycles of 94°C for 10 seconds, 55°C for 30 seconds and 72°C for 30 seconds, and finally incubated at 72°C for 5 minutes. The amplified DNA was separated by 2.0% agarose gel electrophoresis, the band was excised with a razor, and the DNA was recovered using the QIAquick Gel Extraction Kit (Qiagen). This DNA was cloned into the pGEM-T Easy vector according to the protocol of the pGEM-T Easy Vector System (Promega). The construct was introduced into Escherichia coli JM109 competent cells (Takara Shuzo) for transformation; clones carrying the cDNA insert were selected on LB agar medium containing ampicillin and X-gal, and only white clones were picked with a sterilized toothpick to obtain transformants. Individual clones were cultured overnight in LB medium containing ampicillin, and plasmid DNA was prepared using the QIAwell 8 Plasmid Kit (Qiagen). Sequencing reactions were performed using the BigDye Terminator Cycle Sequencing Ready Reaction Kit (PE Biosystems) and read on an automated fluorescent sequencer, giving the DNA sequence shown in SEQ ID NO: 7.
The nucleotide sequence of the DNA represented by SEQ ID NO: 7 contained a frame encoding an amino acid sequence closely similar to the amino acid sequences of rat TGR23-2 ligand obtained from rat whole brain, represented by SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3. It was therefore presumed to be cDNA encoding the precursor of human TGR23-2 ligand or a part thereof.
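The two-round amplification used for the human precursor cDNA can be written down as data, which makes the programmed hold time and the carry-over dilution easy to check. The following is a minimal sketch: the temperatures, times, cycle counts and primer pairs follow the text, the volumes are read as 2 µl of 50-fold diluted product in a 20 µl reaction (an assumption, since the units are damaged in the source), ramp times are ignored, and the class and variable names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int
    seconds: int


@dataclass(frozen=True)
class Program:
    primers: tuple   # SEQ ID NOs of the primer pair
    initial: Step
    cycle: tuple     # steps repeated n_cycles times
    n_cycles: int
    final: Step

    def hold_seconds(self):
        """Total programmed hold time, ignoring temperature ramps."""
        per_cycle = sum(step.seconds for step in self.cycle)
        return self.initial.seconds + self.n_cycles * per_cycle + self.final.seconds


CYCLE = (Step(94, 10), Step(55, 30), Step(72, 30))
ROUND1 = Program((4, 5), Step(94, 300), CYCLE, 35, Step(72, 300))
ROUND2 = Program((4, 6), Step(94, 300), CYCLE, 35, Step(72, 300))

print(ROUND1.hold_seconds())  # 3050
print(ROUND2.hold_seconds())  # 3050

# Overall dilution of round-1 product in the round-2 reaction:
# 50-fold pre-dilution, then 2 µl into a 20 µl reaction (assumed units).
overall_dilution = 50 * (20 / 2)
print(overall_dilution)  # 500.0
```

The two rounds share one primer (SEQ ID NO: 4) and differ in the other (SEQ ID NO: 5, then SEQ ID NO: 6).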
In the frame encoding the amino acid sequence thought to be human TGR23-2 ligand, two ATG codons that could serve as the translation initiation codon are present on the 5' upstream side of the amino acid sequence translated from SEQ ID NO: 7. A hydropathy plot showed that a highly hydrophobic region presumed to be a signal sequence appears only when translation starts from the ATG further 5' upstream, so this ATG was presumed to be the initiation codon. On the 3' side, a stop codon was present downstream of the sequence thought to encode human TGR23-2 ligand. The amino acid sequence of the human TGR23-2 ligand precursor deduced in this way is shown in SEQ ID NO: 8. In this sequence, on the N-terminal side of the amino acid sequence thought to correspond to human TGR23-2 ligand, there was a Lys-Arg sequence (Seidah, N. G. et al., Ann. N. Y. Acad. Sci., vol. 839, pp. 9-24, 1998), at which physiologically active peptides are generally considered to be cleaved from their precursor proteins. On the C-terminal side a stop codon was present, but two further residues lay between it and the sequence corresponding to rat TGR23-2 ligand having the amino acid sequence represented by SEQ ID NO: 1.
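The hydropathy comparison used to choose between the two candidate ATG codons can be sketched as follows. This is a minimal illustration under stated assumptions: the Kyte-Doolittle scale and a 9-residue window are assumed (the text names neither), and both sequences are invented toy examples, not the precursor of SEQ ID NO: 8. The names `KD`, `hydropathy_profile`, `long_form` and `short_form` are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the text does not name one).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Mean hydropathy of every full-length window along the sequence."""
    return [
        sum(KD[aa] for aa in sequence[i:i + window]) / window
        for i in range(len(sequence) - window + 1)
    ]


# Invented toy sequences, NOT taken from SEQ ID NO: 8.
# long_form starts at an upstream Met; short_form starts at a downstream Met.
short_form = "MEDKRSENGTEKRD"
long_form = "MKLLVLALLLVAFSAGS" + short_form

for label, seq in (("upstream ATG", long_form), ("downstream ATG", short_form)):
    peak = max(hydropathy_profile(seq))
    print(f"{label}: most hydrophobic window = {peak:.2f}")
```

In this toy case only the translation from the upstream start contains a strongly hydrophobic N-terminal window, which is the pattern the text describes for a presumed signal sequence.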
From the above, the amino acid sequences of human TGR23-2 ligand were presumed to be those represented by SEQ ID NO: 9 [human TGR23-2 ligand (1-18)], SEQ ID NO: 10 [human TGR23-2 ligand (1-15)] and SEQ ID NO: 11 [human TGR23-2 ligand (1-14)], which correspond respectively to the amino acid sequences of rat TGR23-2 ligand obtained from rat whole brain extract, namely SEQ ID NO: 1 [rat TGR23-2 ligand (1-18)], SEQ ID NO: 2 [rat TGR23-2 ligand (1-15)] and SEQ ID NO: 3 [rat TGR23-2 ligand (1-14)], together with the amino acid sequence represented by SEQ ID NO: 12 [human TGR23-2 ligand (1-20)], in which SEQ ID NO: 9 is extended by two residues on the C-terminal side. In addition, unlike the sequences of mouse TGR23-2 ligand and rat TGR23-2 ligand, the sequence of human TGR23-2 ligand contains a Gln-Arg sequence rather than an Arg-Arg sequence; the 16-residue amino acid sequence shown in SEQ ID NO: 26 [human TGR23-2 ligand (1-16)] was therefore also presumed to be a ligand sequence.
[Reference Example 7]
(Cloning of cDNA encoding the mouse TGR23-2 ligand precursor)
To clone the cDNA encoding the precursor of the mouse homolog of the rat TGR23-2 ligand obtained from rat whole brain extract (sometimes referred to herein as the mouse TGR23-2 ligand), PCR was performed using cDNA derived from mouse whole brain as a template.
Amplification by PCR was performed with the following synthetic DNA primers, using cDNA derived from mouse whole brain as a template. The reaction mixture consisted of 0.8 μl of mouse whole brain Marathon-Ready cDNA (CLONTECH), 1.0 μM each of the synthetic DNA primers of SEQ ID NO: 13 and SEQ ID NO: 14, 0.2 mM dNTPs, 0.1 μl of ExTaq (Takara Shuzo) and the ExTaq buffer supplied with the enzyme, in a total reaction volume of 20 μl. Using a thermal cycler (PE Biosystems), the mixture was heated at 94°C for 5 minutes, subjected to 35 cycles of 94°C for 10 seconds, 65°C for 30 seconds and 72°C for 30 seconds, and finally held at 72°C for 5 minutes. Next, a second reaction was prepared in a total volume of 20 μl from 2 μl of the PCR reaction solution diluted 100-fold with DNase- and RNase-free distilled water, 1.0 μM each of the synthetic DNA primers of SEQ ID NO: 13 and SEQ ID NO: 15, 0.2 mM dNTPs, 0.1 μl of ExTaq polymerase (Takara Shuzo) and the ExTaq buffer supplied with the enzyme. Using a thermal cycler (PE Biosystems), this mixture was heated at 94°C for 5 minutes, subjected to 30 cycles of 94°C for 10 seconds, 60°C for 30 seconds and 72°C for 30 seconds, and finally held at 72°C for 5 minutes. The amplified DNA was separated by 2.0% agarose gel electrophoresis, a DNA band of about 440 bases was cut out with a razor, and the DNA was recovered using the QIAquick Gel Extraction Kit (Qiagen). This DNA was cloned into the pGEM-T Easy vector according to the protocol of the pGEM-T Easy Vector System (Promega). The construct was introduced into Escherichia coli JM109 competent cells (Takara Shuzo) for transformation, clones carrying the cDNA insert were selected on LB agar medium containing ampicillin and X-gal, and only white clones were picked with sterilized toothpicks to obtain transformants. Individual clones were cultured overnight in LB medium containing ampicillin, and plasmid DNA was prepared using the QIAwell 8 Plasmid Kit (Qiagen). Sequencing reactions were performed with the BigDye Terminator Cycle Sequencing Ready Reaction Kit (PE Biosystems) and read on a fluorescent automatic sequencer, giving the DNA sequence represented by SEQ ID NO: 16.
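The two-round amplification above can be written down as data, which makes the cycling parameters easy to check and reuse. This is an illustrative sketch only: the class and function names are hypothetical, the times are programmed hold times (ramp time is ignored), and the 72°C and 60°C steps are read as such where the source text drops the unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float
    seconds: int


@dataclass(frozen=True)
class Program:
    initial: Step
    cycle: tuple      # tuple of Step objects repeated n_cycles times
    n_cycles: int
    final: Step

    def hold_seconds(self):
        """Total programmed hold time, excluding temperature ramps."""
        per_cycle = sum(step.seconds for step in self.cycle)
        return self.initial.seconds + self.n_cycles * per_cycle + self.final.seconds


def overall_dilution(pre_dilution, template_ul, reaction_ul):
    """Fold dilution of first-round product in the second-round reaction."""
    return pre_dilution * reaction_ul / template_ul


first_round = Program(
    initial=Step(94, 300),
    cycle=(Step(94, 10), Step(65, 30), Step(72, 30)),
    n_cycles=35,
    final=Step(72, 300),
)
second_round = Program(
    initial=Step(94, 300),
    cycle=(Step(94, 10), Step(60, 30), Step(72, 30)),
    n_cycles=30,
    final=Step(72, 300),
)

print(first_round.hold_seconds(), second_round.hold_seconds())
print(overall_dilution(100, 2, 20))
```

With these values, 2 μl of the 100-fold diluted first-round product in a 20 μl reaction corresponds to a 1000-fold overall dilution.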
The nucleotide sequence of the DNA represented by SEQ ID NO: 16 contained a frame encoding an amino acid sequence very similar to the amino acid sequences of the rat TGR23-2 ligand obtained from rat whole brain, represented by SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3. The DNA was therefore presumed to be a cDNA encoding the mouse TGR23-2 ligand precursor or a part thereof.
In the frame of SEQ ID NO: 16 that encodes the amino acid sequence considered to be the mouse TGR23-2 ligand, two ATG codons that could serve as the initiation codon for protein translation are present on the 5' upstream side. A hydrophobicity plot showed that a highly hydrophobic region, presumed to be a signal sequence, appeared only when translation started from the more 5' upstream ATG, so this ATG was presumed to be the initiation codon. Further 5' upstream of this ATG codon, a stop codon appeared in the same frame. On the 3' side, a stop codon was present downstream of the sequence considered to encode the mouse TGR23-2 ligand. The amino acid sequence of the mouse TGR23-2 ligand precursor deduced in this way is shown in SEQ ID NO: 17. In this sequence, a Lys-Arg sequence, at which physiologically active peptides are generally considered to be cleaved from their precursor proteins (Seidah, N. G. et al., Ann. N. Y. Acad. Sci., Vol. 839, pp. 9-24, 1998), was present on the N-terminal side of the amino acid sequence considered to correspond to the mouse TGR23-2 ligand. On the C-terminal side a stop codon was present, but two additional residues lay between it and the sequence corresponding to the rat TGR23-2 ligand of SEQ ID NO: 1.
From these results, the amino acid sequences of the mouse TGR23-2 ligand were presumed to be those represented by SEQ ID NO: 18 [mouse TGR23-2 ligand (1-18)], SEQ ID NO: 19 [mouse TGR23-2 ligand (1-15)] and SEQ ID NO: 20 [mouse TGR23-2 ligand (1-14)], which correspond respectively to the amino acid sequences of the rat TGR23-2 ligand obtained from rat whole brain extract, namely SEQ ID NO: 1 [rat TGR23-2 ligand (1-18)], SEQ ID NO: 2 [rat TGR23-2 ligand (1-15)] and SEQ ID NO: 3 [rat TGR23-2 ligand (1-14)], and further the amino acid sequence represented by SEQ ID NO: 21 [mouse TGR23-2 ligand (1-20)], in which SEQ ID NO: 18 is extended by two residues at the C-terminus.
[Reference Example 8]
(Cloning of cDNA encoding a part of the rat TGR23-2 ligand precursor)
To clone the cDNA encoding the precursor of the rat TGR23-2 ligand, PCR was performed using cDNA derived from rat whole brain as a template.
Amplification by PCR was performed with the following synthetic DNA primers, using cDNA derived from rat whole brain as a template. The reaction mixture consisted of 0.8 μl of rat whole brain Marathon-Ready cDNA (CLONTECH), 1.0 μM each of the synthetic DNA primers of SEQ ID NO: 22 and SEQ ID NO: 14, 0.2 mM dNTPs, 0.1 μl of ExTaq (Takara Shuzo) and the ExTaq buffer supplied with the enzyme, in a total reaction volume of 20 μl. Using a thermal cycler (PE Biosystems), the mixture was heated at 94°C for 5 minutes, subjected to 35 cycles of 94°C for 10 seconds, 65°C for 30 seconds and 72°C for 30 seconds, and finally held at 72°C for 5 minutes. Next, a second reaction was prepared in a total volume of 20 μl from 2 μl of the PCR reaction solution diluted 100-fold with DNase- and RNase-free distilled water, 1.0 μM of the primer of SEQ ID NO: 22, 0.2 μM of the synthetic DNA primer of SEQ ID NO: 15, 0.2 mM dNTPs, 0.1 μl of ExTaq polymerase (Takara Shuzo) and the ExTaq buffer supplied with the enzyme. Using a thermal cycler (PE Biosystems), this mixture was heated at 94°C for 5 minutes, subjected to 30 cycles of 94°C for 10 seconds, 60°C for 30 seconds and 72°C for 30 seconds, and finally held at 72°C for 5 minutes. The amplified DNA was separated by 2.0% agarose gel electrophoresis, a DNA band of about 200 bases was cut out with a razor, and the DNA was recovered using the QIAquick Gel Extraction Kit (Qiagen). This DNA was cloned into the pGEM-T Easy vector according to the protocol of the pGEM-T Easy Vector System (Promega). The construct was introduced into Escherichia coli JM109 competent cells (Takara Shuzo) for transformation, clones carrying the cDNA insert were selected on LB agar medium containing ampicillin and X-gal, and only white clones were picked with sterilized toothpicks to obtain transformants. Individual clones were cultured overnight in LB medium containing ampicillin, and plasmid DNA was prepared using the QIAwell 8 Plasmid Kit (Qiagen). Sequencing reactions were performed with the BigDye Terminator Cycle Sequencing Ready Reaction Kit (PE Biosystems) and read on a fluorescent automatic sequencer, giving the DNA sequence represented by SEQ ID NO: 23.
The nucleotide sequence of the DNA represented by SEQ ID NO: 23 contained a frame encoding the amino acid sequences of the rat TGR23-2 ligand obtained from rat whole brain, represented by SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3. Translating the DNA sequence with this frame as the reading frame gave the amino acid sequence represented by SEQ ID NO: 24. Comparison of this sequence with the amino acid sequence of the mouse TGR23-2 ligand precursor obtained in Reference Example 7 (SEQ ID NO: 16) indicated that it corresponds to the C-terminal 54 amino acids that form a part of the rat TGR23-2 ligand precursor. On the 3' side, a stop codon was present downstream of the sequence encoding the rat TGR23-2 ligand. In this sequence, a Lys-Arg sequence, at which physiologically active peptides are generally considered to be cleaved from their precursor proteins (Seidah, N. G. et al., Ann. N. Y. Acad. Sci., Vol. 839, pp. 9-24, 1998), was present on the N-terminal side of the amino acid sequence of the rat TGR23-2 ligand. On the C-terminal side a stop codon was present, but two additional residues lay between it and the sequence of the rat TGR23-2 ligand of SEQ ID NO: 1.
From these results, the amino acid sequences of the rat TGR23-2 ligand were presumed to be those obtained from rat whole brain extract and represented by SEQ ID NO: 1 [rat TGR23-2 ligand (1-18)], SEQ ID NO: 2 [rat TGR23-2 ligand (1-15)] and SEQ ID NO: 3 [rat TGR23-2 ligand (1-14)], and further the amino acid sequence represented by SEQ ID NO: 25 [rat TGR23-2 ligand (1-20)], in which SEQ ID NO: 1 is extended by two residues at the C-terminus.
[Reference Example 9]
(Cloning of cDNA encoding the rat TGR23-2 ligand precursor)
To clone the cDNA encoding the precursor of the rat TGR23-2 ligand, PCR was performed using cDNA derived from rat whole brain as a template.
Amplification by PCR was performed with the following synthetic DNA primers, using cDNA derived from rat whole brain as a template. The reaction mixture consisted of 0.8 μl of rat whole brain Marathon-Ready cDNA (CLONTECH), 1.0 μM each of the synthetic DNA primers of SEQ ID NO: 27 and SEQ ID NO: 28, 0.2 mM dNTPs, 0.1 μl of ExTaq (Takara Shuzo) and the ExTaq buffer supplied with the enzyme, in a total reaction volume of 20 μl. Using a thermal cycler (PE Biosystems), the mixture was heated at 94°C for 5 minutes, subjected to 35 cycles of 94°C for 10 seconds, 65°C for 30 seconds and 72°C for 30 seconds, and finally held at 72°C for 5 minutes. Next, a second reaction was prepared in a total volume of 20 μl from 2 μl of the PCR reaction solution diluted 50-fold with DNase- and RNase-free distilled water, 1.0 μM of the primer of SEQ ID NO: 29, 0.2 μM of the synthetic DNA primer of SEQ ID NO: 28, 0.2 mM dNTPs, 0.1 μl of ExTaq polymerase (Takara Shuzo) and the ExTaq buffer supplied with the enzyme. Using a thermal cycler (PE Biosystems), this mixture was heated at 94°C for 5 minutes, subjected to 30 cycles of 94°C for 10 seconds, 65°C for 30 seconds and 72°C for 30 seconds, and finally held at 72°C for 5 minutes. The amplified DNA was separated by 2.0% agarose gel electrophoresis, a DNA band of about 350 bases was cut out with a razor, and the DNA was recovered using the QIAquick Gel Extraction Kit (Qiagen). This DNA was cloned into the pGEM-T Easy vector according to the protocol of the pGEM-T Easy Vector System (Promega). The construct was introduced into Escherichia coli JM109 competent cells (Takara Shuzo) for transformation, clones carrying the cDNA insert were selected on LB agar medium containing ampicillin and X-gal, and only white clones were picked with sterilized toothpicks to obtain transformants. Individual clones were cultured overnight in LB medium containing ampicillin, and plasmid DNA was prepared using the QIAwell 8 Plasmid Kit (Qiagen). Sequencing reactions were performed with the BigDye Terminator Cycle Sequencing Ready Reaction Kit (PE Biosystems) and read on a fluorescent automatic sequencer, giving the DNA sequence represented by SEQ ID NO: 30.
The nucleotide sequence of the cDNA represented by SEQ ID NO: 30 was the DNA sequence encoding a part of the rat TGR23-2 ligand precursor obtained in Reference Example 8 (SEQ ID NO: 23), extended further on the 5' side. This sequence was translated using as the reading frame the frame that encodes an amino acid sequence matching the amino acid sequences of the rat TGR23-2 ligand obtained from rat whole brain (SEQ ID NO: 1, SEQ ID NO: 2 or SEQ ID NO: 3). On the 5' upstream side, one ATG was present at the position corresponding to the ATG presumed to be the initiation codon for protein translation in the cDNAs presumed to encode the human TGR23-2 ligand precursor and the mouse TGR23-2 ligand precursor (SEQ ID NO: 7 and SEQ ID NO: 16). Further 5' upstream of this ATG codon, a stop codon appeared in the same frame. On the 3' side, a stop codon was present downstream of the sequence considered to encode the mouse TGR23-2 ligand. From these results, the sequence represented by SEQ ID NO: 30 was presumed to be the cDNA sequence encoding the rat TGR23-2 ligand precursor. The amino acid sequence translated from the nucleotide sequence of the cDNA represented by SEQ ID NO: 30 is shown in SEQ ID NO: 31.
[Reference Example 10]
(Preparation of CHO cells expressing TGR23-1 (hereinafter, human TGR23-1 is sometimes simply referred to as TGR23-1))
Using as a template plasmid pTB2173, which contains a DNA fragment encoding TGR23-1 and having the nucleotide sequence represented by SEQ ID NO: 38, PCR was performed with primer 1 (SEQ ID NO: 32), to which a SalI recognition sequence had been added, and primer 2 (SEQ ID NO: 33), to which a SpeI recognition sequence had been added. The reaction mixture contained 10 ng of the above plasmid as the template, 2.5 U of Pfu Turbo DNA Polymerase (Stratagene), 1.0 μM each of primer 1 (SEQ ID NO: 32) and primer 2 (SEQ ID NO: 33), 200 μM dNTPs and 25 μl of 2× GC Buffer I (Takara Shuzo), in a final volume of 50 μl. The PCR consisted of 95°C for 60 seconds, followed by 25 cycles of 95°C for 60 seconds, 55°C for 60 seconds and 72°C for 70 seconds, and a final extension at 72°C for 10 minutes. The PCR product was subcloned into the plasmid vector pCR-BluntII-TOPO (Invitrogen) according to the instructions of the Zero Blunt TOPO PCR Cloning Kit (Invitrogen). The construct was introduced into E. coli TOP10 (Invitrogen), and clones carrying the TGR23-1 cDNA contained in pTB2173 were selected on LB agar medium containing kanamycin. From an E. coli clone transformed with the resulting plasmid, which carries TGR23-1 flanked on the 5' and 3' sides by the sequences recognized by SalI and SpeI respectively, plasmid was prepared using the Plasmid Miniprep Kit (Bio-Rad) and digested with the restriction enzymes SalI and SpeI to excise the insert. After electrophoresis, the insert DNA was cut out of the agarose gel and recovered using the Gel Extraction Kit (Qiagen). The insert DNA was added to the animal cell expression vector plasmid pAKKO-111H (the same vector plasmid as pAKKO1.11H described in Hinuma, S. et al., Biochim. Biophys. Acta, Vol. 1219, pp. 251-259 (1994)), which had been digested with SalI and SpeI, and ligated using DNA Ligation Kit Ver. 2 (Takara Shuzo) to construct the protein expression plasmid pAKKO-TGR23-1. E. coli TOP10 transformed with pAKKO-TGR23-1 was cultured, and plasmid DNA of pAKKO-TGR23-1 was prepared using the Plasmid Miniprep Kit (Bio-Rad).
Hamster CHO/dhfr- cells were seeded at 1 × 10^5 cells in a Falcon dish (3.5 cm diameter) in α-MEM medium (with ribonucleosides and deoxyribonucleosides, GIBCO, Cat. No. 12571) containing 10% fetal bovine serum and cultured overnight at 37°C in a 5% CO2 incubator. The cells were transfected with 2 μg of the above expression plasmid pAKKO-TGR23-1 DNA using Transfection Reagent FuGENE 6 (Roche) according to the method described in the attached instructions, cultured for 18 hours, and then transferred to fresh growth medium. After a further 10 hours of culture, the transfected cells were collected by trypsin-EDTA treatment and seeded in ten flat-bottom 96-well plates in selection medium (α-MEM medium containing 10% dialyzed fetal bovine serum (without ribonucleosides and deoxyribonucleosides, GIBCO, Cat. No. 12561)). Culture was continued with replacement of the selection medium every 3 to 4 days, and 81 DHFR+ cell clones that had grown as colonies were obtained after 2 to 3 weeks.
[Reference Example 11]
(Quantification of TGR23-1 expression in TGR23-1-expressing CHO cell lines by the TaqMan PCR method)
The 81 clones of TGR23-1-expressing CHO cell lines obtained in Reference Example 10 were cultured in 96-well plates, and total RNA was prepared using the RNeasy 96 Kit (Qiagen). Reverse transcription was performed on 50 to 200 ng of the total RNA using the TaqMan Gold RT-PCR Kit (PE Biosystems). PCR was performed with the ABI PRISM 7700 Sequence Detector (PE Biosystems) on 25 μl of a reaction mixture containing reverse transcription product equivalent to 5 to 20 ng of total RNA, or standard cDNA prepared as described below, together with 1× Universal PCR Master Mix (PE Biosystems), 500 nM each of the primers represented by SEQ ID NO: 34 and SEQ ID NO: 35, and 100 nM of the TaqMan probe represented by SEQ ID NO: 36. The PCR consisted of 50°C for 2 minutes and 95°C for 10 minutes, followed by 40 cycles of 95°C for 15 seconds and 60°C for 60 seconds.
The standard cDNA was prepared as follows. The concentration of plasmid pTB2174, which contains a DNA fragment having the nucleotide sequence represented by SEQ ID NO: 40, was calculated from its absorbance at 260 nm, and the exact copy number was calculated. The plasmid was then diluted with a 10 mM Tris-HCl (pH 8.0) solution containing 1 mM EDTA to prepare standard cDNA solutions ranging from 2 copies to 2 × 10^6 copies. The probe and primers for TaqMan PCR were designed with Primer Express (Version 1.0) (PE Biosystems). Expression levels were calculated with the ABI PRISM 7700 SDS software. A standard curve was constructed by plotting the cycle number at which the reporter fluorescence intensity reached the set value on the vertical axis against the logarithm of the initial concentration of the standard cDNA on the horizontal axis. The initial concentration of each reverse transcription product was calculated from the standard curve, and the TGR23-1 gene expression level per total RNA was determined for each clone. On this basis, the 11 CHO cell lines with the highest TGR23-1 expression were selected and cultured in 24-well plates, and TGR23-1 expression in these cells was re-examined. Total RNA was prepared using RNeasy Mini Kits (Qiagen) and then treated with DNase using the RNase-free DNase Set (Qiagen). The total RNA was reverse-transcribed as described above, and the TGR23-1 gene expression level per total RNA of each clone was determined by the TaqMan PCR method. As a result, clones 49 and 52 of the TGR23-1-expressing CHO cell lines were found to show high expression levels.
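The standard-curve quantification described above can be sketched as follows. This is a hedged illustration rather than the ABI PRISM 7700 SDS software's own calculation. The conversion factors (50 ng/μl per A260 unit for double-stranded DNA, 660 g/mol per base pair), the 5000 bp plasmid length and the synthetic threshold-cycle values are all assumptions, and the function names are hypothetical.

```python
import math

AVOGADRO = 6.022e23


def copies_per_ul(a260, plasmid_bp, ng_per_ul_per_a260=50.0, g_per_mol_per_bp=660.0):
    """Estimate plasmid copies per microlitre from absorbance at 260 nm."""
    grams_per_ul = a260 * ng_per_ul_per_a260 * 1e-9
    return grams_per_ul * AVOGADRO / (plasmid_bp * g_per_mol_per_bp)


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(ct_values) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to get the initial copy number."""
    return 10 ** ((ct - intercept) / slope)


# Standards from 2 to 2 x 10^6 copies, with synthetic (hypothetical) Ct values.
std_copies = [2 * 10 ** k for k in range(7)]
std_ct = [38.0 - 3.3 * math.log10(c) for c in std_copies]

slope, intercept = fit_standard_curve(std_copies, std_ct)
sample_ct = 38.0 - 3.3 * math.log10(500)
print(round(slope, 3), round(intercept, 3))
print(round(copies_from_ct(sample_ct, slope, intercept)))
print(f"{copies_per_ul(1.0, 5000):.3e}")
```

Dividing the copy number recovered for each clone by the amount of total RNA in the reaction gives the per-RNA expression level used to rank the clones.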
The following reference examples used the expressing cells of these two clones.
[Reference Example 12]
(Production of human TGR23-2 ligand (1-20): Ser-Phe-Arg-Asn-Gly-Val-Gly-Thr-Gly-Met-Lys-Lys-Thr-Ser-Phe-Gln-Arg-Ala-Lys-Ser (SEQ ID NO: 12))
Commercially available Boc-Ser(Bzl)-OCH2-PAM resin was placed in the reaction vessel of an ACT 90 peptide synthesizer and swollen with DCM; the Boc group was then removed with TFA and the resin was neutralized with DIEA. The resin was suspended in NMP, and Boc-Lys(Cl-Z) was condensed using HOBt-DIPCI. After the reaction, the presence of free amino groups was checked by the ninhydrin test; when the test was positive, the same amino acid was condensed again. When the ninhydrin test was still positive after recondensation, the resin was acetylated with acetic anhydride. This cycle was repeated to condense Boc-Ala, Boc-Arg(Tos), Boc-Gln, Boc-Phe, Boc-Ser(Bzl), Boc-Thr(Bzl), Boc-Lys(Cl-Z), Boc-Lys(Cl-Z), Boc-Met, Boc-Gly, Boc-Thr(Bzl), Boc-Gly, Boc-Val, Boc-Gly, Boc-Asn, Boc-Arg(Tos), Boc-Phe and Boc-Ser(Bzl) in sequence order, giving 0.24 g of the desired protected peptide resin. This resin was stirred with 1.5 ml of p-cresol in about 15 ml of hydrogen fluoride at 0°C for 60 minutes, after which the hydrogen fluoride was distilled off under reduced pressure, diethyl ether was added to the residue, and the mixture was filtered. Water and acetic acid were added to the filtered solid to extract the peptide, which was separated from the resin. The extract was concentrated, applied to a Sephadex (trademark) G-25 column (2.0 × 80 cm) packed in 50% acetic acid and developed with the same solvent; the main fractions were collected and lyophilized. A portion (45 mg) was applied to a reversed-phase chromatography column (2.6 × 60 cm) packed with LiChroprep (trademark) RP-18, washed with 200 ml of 0.1% aqueous TFA, and eluted with a linear gradient formed from 300 ml of 0.1% aqueous TFA and 300 ml of 25% aqueous acetonitrile containing 0.1% TFA. The main fractions were collected and lyophilized to give 12.7 mg of the desired peptide.
ESI-MS: molecular weight (MW) 2188.0 (theoretical value 2187.5). HPLC elution time: 10.6 minutes.
Column conditions:
Column: Wakosil 5C18T, 4.6 × 100 mm
Eluent: solution A, 0.1% TFA in water; solution B, acetonitrile containing 0.1% TFA; linear concentration gradient elution from A/B = 95/5 to 45/55 (25 minutes)
Flow rate: 1.0 ml/min
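As a rough aid to reading these conditions, the following sketch computes the nominal proportion of solution B that a linear gradient from 5% to 55% B over 25 minutes delivers at a given time. It is only an illustration: the function name `percent_b` is hypothetical, and the calculation ignores column dead volume and system dwell volume, so it does not give the solvent composition at which the peptide actually eluted.

```python
def percent_b(t_min, start_b=5.0, end_b=55.0, duration_min=25.0):
    """Nominal %B of a linear gradient at time t_min (minutes).

    Times outside the gradient window are clamped to its ends.
    Dead volume and dwell volume are ignored (assumption).
    """
    t = min(max(t_min, 0.0), duration_min)
    return start_b + (end_b - start_b) * t / duration_min


# Nominal composition at the reported elution time of 10.6 minutes.
print(round(percent_b(10.6), 1))
```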
[Reference Example 13]
(Measurement, using FLIPR, of the intracellular Ca ion concentration-increasing activity of human TGR23-2 ligand (1-20) on TGR23-1-expressing CHO cells and TGR23-2-expressing CHO cells)
The human TGR23-2 ligand (1-20) obtained in Reference Example 12 was administered at various concentrations to TGR23-1-expressing CHO cells and TGR23-2-expressing CHO cells according to a known method, and the intracellular Ca ion concentration-increasing activity was measured using FLIPR. Human TGR23-2 ligand (1-20) promoted an increase in intracellular Ca ion concentration in both TGR23-1-expressing CHO cells and TGR23-2-expressing CHO cells in a concentration-dependent manner. The results are shown in FIG. 5 and FIG. 6.
These results show that the polypeptide having the amino acid sequence represented by SEQ ID NO: 12 [human TGR23-2 ligand (1-20)] has intracellular Ca ion concentration-increasing activity on TGR23-1 and TGR23-2.
Industrial applicability
According to the present invention, the binding molecule, or the type of binding molecule, of a protein whose binding molecule is unknown can be predicted merely by obtaining information on the amino acid sequence of that protein (and/or a sequence alignment obtained using the amino acid sequence). Binding molecules (ligands and the like) can therefore be predicted far more quickly than with conventional molecular modeling methods, which go as far as predicting the three-dimensional structure.
Furthermore, according to the present invention, the binding molecule or type of binding molecule can be predicted for many kinds of proteins whose binding molecules are unknown. The prediction is also easier and quicker than testing experimentally whether every candidate binding molecule actually binds to such a protein.
Using known techniques, a function common to a group of homologous proteins can be inferred by computing a sequence alignment or by building a three-dimensional structure model. However, a ligand (or its type) cannot be identified with current technology from similarity evaluation by sequence alignment or from molecular docking calculations. To infer the individual functions of each GPCR that is useful for drug development, the ligand that binds to it and the G protein coupled to it must be predicted. Only once such a GPCR-ligand set has been inferred and determined can detailed functional studies progress, such as investigating intracellular responses mediated by the coupled G protein, measuring tissue distribution and changes in expression level, and creating transgenic or gene-deficient animals. The prediction method using ligand-determining residues, and the computer for carrying it out, are directly useful for the molecular identification and functional elucidation that are essential to such drug development.
According to the present invention, the binding molecule, or the type of binding molecule, of a protein whose binding molecule is unknown (such as an orphan G protein-coupled receptor) can be predicted very easily. On the basis of such findings, prophylactic or therapeutic drugs for diseases and the like in which that protein is involved can be readily produced, even without going through experiments.
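The prediction described above (classify proteins with known binding molecules by their aligned residues, then look up the residues of the unknown protein) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy alignment `KNOWN`, the ligand type names, the chosen determining positions, and the majority-vote rule in `predict` are hypothetical and are not taken from the specification.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (aligned sequence, ligand type) for ligand-known proteins.
KNOWN = [
    ("ADKF", "amine"),
    ("ADRF", "amine"),
    ("GEKW", "peptide"),
    ("GERW", "peptide"),
]


def build_table(known, positions):
    """Map the residues at the determining positions to counts of ligand types."""
    table = defaultdict(Counter)
    for seq, ligand in known:
        key = tuple(seq[i] for i in positions)
        table[key][ligand] += 1
    return table


def predict(table, aligned_seq, positions):
    """Return the most frequent ligand type for the query's determining residues.

    Returns None when no known protein carries the same residues.
    """
    key = tuple(aligned_seq[i] for i in positions)
    counts = table.get(key)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


positions = [0, 1]  # assumed determining positions (0-based alignment columns)
table = build_table(KNOWN, positions)
print(predict(table, "ADQF", positions))
```

The query sequence is assumed to have been aligned against the known proteins beforehand, so that column indices refer to the same alignment positions.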

Claims

1. A method for predicting a binding molecule of a binding-molecule-unknown protein, which predicts a binding molecule that binds to the binding-molecule-unknown protein, the method comprising:
a step of obtaining, for binding-molecule-known proteins whose amino acid sequences and binding molecules are known, binding-molecule-known protein classification information in which a sequence alignment of at least two binding-molecule-known proteins is associated with the binding molecules or the types of binding molecule;
a step of using the binding-molecule-known protein classification information to identify one or more binding-molecule-determining residue positions, which are positions in the sequence alignment of the binding-molecule-known proteins that are presumed to be involved in determining the binding molecule;
a step of associating the amino acid residues at the binding-molecule-determining residue positions (binding-molecule-determining residues) with the binding molecules or the types of binding molecule, thereby obtaining binding-molecule-determining-residue/binding-molecule classification information representing the correlation between the binding-molecule-determining residues and the binding molecules or the types of binding molecule;
a step of aligning, for a binding-molecule-unknown protein of the same kind as the binding-molecule-known proteins, the sequence of the binding-molecule-unknown protein against the sequence alignment of the binding-molecule-known proteins, thereby obtaining a sequence alignment of the binding-molecule-unknown protein; and
a step of applying information on at least one binding-molecule-determining residue in the sequence alignment of the binding-molecule-unknown protein to the binding-molecule-determining-residue/binding-molecule classification information, thereby predicting the binding molecule, or the type of binding molecule, of the binding-molecule-unknown protein.
2. The method for predicting a binding molecule of a binding-molecule-unknown protein according to claim 1, wherein the binding molecule is a ligand, a regulatory factor, an effector, or a coenzyme.
3. The method for predicting a binding molecule of a binding-molecule-unknown protein according to claim 1 or 2, wherein the binding molecules are classified into two or more types and the type of the classified binding molecule is predicted.
4. The method for predicting a binding molecule of a binding-molecule-unknown protein according to any one of claims 1 to 3, wherein the binding-molecule-unknown protein is a G protein-coupled receptor, a kinase, a lipase, a transporter, a protease, or an ion channel.
5. The method for predicting a binding molecule of a binding-molecule-unknown protein according to any one of claims 1 to 4, wherein, in the step of identifying one or more binding-molecule-determining residue positions, the one or more binding-molecule-determining residue positions are identified from the amino acid residues constituting the sequence alignment and the types of binding molecule.
6. The method for predicting a binding molecule of a binding-molecule-unknown protein according to any one of claims 1 to 4, wherein the binding-molecule-determining residue positions are determined using either or both of Formula 1 and Formula 2 below.

$$ f_1(n) = \sum_{\mathrm{Res}} \bigl( N(\mathrm{Res}, X_q) \times N(\mathrm{Res}, X_r) \bigr) \qquad \text{(Formula 1)} $$

[In Formula 1, n indicates that $f_1(n)$ is the evaluation function for the n-th amino acid residue of the sequence alignment of the binding-molecule-known proteins; Res represents a type of amino acid residue; $X_q$ and $X_r$ represent binding molecules or types of binding molecule; q represents an integer from 1 to p-1; r represents an integer greater than q and not greater than p; p represents the number of binding molecules or types of binding molecule; $N(\mathrm{Res}, X_q)$ represents the number of binding-molecule-known proteins present in the binding-molecule-known protein classification information whose n-th amino acid residue in the sequence alignment is Res and whose binding molecule is $X_q$; and $N(\mathrm{Res}, X_r)$ represents the number of binding-molecule-known proteins present in the binding-molecule-known protein classification information whose n-th amino acid residue in the sequence alignment is Res and whose binding molecule is $X_r$.]

$$ f_2(n) = \sum_{\mathrm{Res}} \bigl( N(\mathrm{Res}, X_1) \times N(\mathrm{Res}, X_2) \times \cdots \times N(\mathrm{Res}, X_p) \bigr) \qquad \text{(Formula 2)} $$

[In Formula 2, n indicates that $f_2(n)$ is the evaluation function for the n-th amino acid residue of the sequence alignment of the binding-molecule-known proteins; Res represents a type of amino acid residue; $X_1$ to $X_p$ represent binding molecules or types of binding molecule; p represents the number of binding molecules or types of binding molecule; and $N(\mathrm{Res}, X)$ represents the number of binding-molecule-known proteins present in the binding-molecule-known protein classification information whose n-th amino acid residue in the sequence alignment is Res and whose binding molecule is X.]
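The two evaluation functions can be illustrated with a minimal sketch over a toy alignment. It rests on assumptions that the claim does not state outright: Formula 1 is read as summing over every pair of ligand types (q < r) as well as over residue types, alignment columns are 0-based, and the data in `KNOWN` are hypothetical. Under this reading, a column scores 0 when no residue type is shared between proteins with different ligand types.

```python
from collections import Counter
from itertools import combinations
from math import prod

# Hypothetical toy data: (aligned sequence, ligand type) for ligand-known proteins.
KNOWN = [
    ("ADKF", "amine"),
    ("ADRF", "amine"),
    ("GEKW", "peptide"),
    ("GERW", "peptide"),
]


def counts_at(known, n):
    """N(Res, X): number of proteins with residue Res at column n and ligand type X."""
    return Counter((seq[n], ligand) for seq, ligand in known)


def f1(known, n):
    """Sum over residue types and ligand-type pairs of N(Res, Xq) * N(Res, Xr)."""
    ligands = sorted({ligand for _, ligand in known})
    residues = {seq[n] for seq, _ in known}
    N = counts_at(known, n)  # a Counter returns 0 for unseen (Res, X) pairs
    return sum(
        N[(res, xq)] * N[(res, xr)]
        for res in residues
        for xq, xr in combinations(ligands, 2)
    )


def f2(known, n):
    """Sum over residue types of the product of N(Res, X) over all ligand types."""
    ligands = sorted({ligand for _, ligand in known})
    residues = {seq[n] for seq, _ in known}
    N = counts_at(known, n)
    return sum(prod(N[(res, x)] for x in ligands) for res in residues)


for column in range(4):
    print(column, f1(KNOWN, column), f2(KNOWN, column))
```

With only two ligand types the two functions coincide; they differ once three or more types are present, because the product in Formula 2 is nonzero only for residues seen with every type.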
7. The method for predicting a binding molecule of a binding-molecule-unknown protein according to any one of claims 1 to 4, wherein the binding-molecule-determining residue positions are determined using one or more of Formula 3, Formula 4 and Formula 5 below.

$$ f_1(n) = \sum_{\mathrm{Res}} \bigl( N(\mathrm{Res}, X_q) \times N(\mathrm{Res}, X_r) \bigr) \qquad \text{(Formula 3)} $$

[In Formula 3, n indicates that $f_1(n)$ is the evaluation function for the n-th amino acid residue of the sequence alignment of the binding-molecule-known proteins; Res represents a type of amino acid residue; $X_q$ and $X_r$ represent binding molecules or types of binding molecule; q represents an integer from 1 to p-1; r represents an integer greater than q and not greater than p; p represents the number of binding molecules or types of binding molecule; $N(\mathrm{Res}, X_q)$ represents the number of binding-molecule-known proteins present in the binding-molecule-known protein classification information whose n-th amino acid residue in the sequence alignment is Res and whose binding molecule is $X_q$; and $N(\mathrm{Res}, X_r)$ represents the number of binding-molecule-known proteins present in the binding-molecule-known protein classification information whose n-th amino acid residue in the sequence alignment is Res and whose binding molecule is $X_r$.]

$$ f_2(n) = \sum_{\mathrm{Res}} \bigl( N(\mathrm{Res}, X_1) \times N(\mathrm{Res}, X_2) \times \cdots \times N(\mathrm{Res}, X_p) \bigr) \qquad \text{(Formula 4)} $$

[In Formula 4, n indicates that $f_2(n)$ is the evaluation function for the n-th amino acid residue of the sequence alignment of the binding-molecule-known proteins; Res represents a type of amino acid residue; $X_1$ to $X_p$ represent binding molecules or types of binding molecule; p represents the number of binding molecules or types of binding molecule; and $N(\mathrm{Res}, X)$ represents the number of binding-molecule-known proteins present in the binding-molecule-known protein classification information whose n-th amino acid residue in the sequence alignment is Res and whose binding molecule is X.]

$$ f_3(m, n) = \Bigl\{ \frac{(\text{number of amino acid residue pair types})}{w_X} + w_1 \times (\text{number of 2-crossing residue pair types}) + w_2 \times (\text{number of 3-crossing residue pair types}) + \cdots + w_{p-1} \times (\text{number of } p\text{-crossing residue pair types}) + w_A \times (\text{number of unalignable amino acid residues}) + w_B \times (\text{number of unalignable amino acid residue pairs}) \Bigr\} \qquad \text{(Formula 5)} $$

[In Formula 5, (m, n) indicates that $f_3(m, n)$ is the evaluation function for the m-th and n-th amino acid residues of the sequence alignment of the binding-molecule-known proteins; the number of amino acid residue pair types represents the number of kinds of combination of the m-th and n-th amino acid residues in the sequence alignment of the binding-molecule-known proteins; the number of 2-crossing residue pair types and the number of 3-crossing residue pair types mean, respectively, the number of combinations of the m-th and n-th amino acid residues in the sequence alignment for which there are two kinds and three kinds of ligand; the number of p-crossing residue pair types means the number of combinations of the m-th and n-th amino acid residues in the sequence alignment for which there are p kinds of ligand; the number of unalignable amino acid residues means the number of cases in which one of the m-th and n-th amino acid residues in the sequence alignment was treated as unalignable in order to obtain preferable homology; the number of unalignable amino acid residue pairs means the number of cases in which both the m-th and n-th amino acid residues in the sequence alignment were treated as unalignable in order to obtain preferable homology; $w_X$ is a positive constant, or a distribution function whose variable is the number of amino acid pair types and which takes its maximum value when the number of amino acid pair types is a positive number of 400 or less; and $w_1$ to $w_{p-1}$, $w_A$ and $w_B$ are weights and are positive numbers.]
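One way to read the pair evaluation function f3(m, n) is sketched below. Several points are assumptions rather than statements of the claim: a pair type counts as "k-crossing" when it is observed with exactly k ligand types, unalignable residues are represented by the gap character `-`, pair types are counted only among proteins in which both columns are aligned, wX is taken as a positive constant, and all default weights are 1. The data in `KNOWN` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: (aligned sequence, ligand type); "-" marks an unaligned residue.
KNOWN = [
    ("ADKF", "amine"),
    ("ADRF", "amine"),
    ("GEKW", "peptide"),
    ("GERW", "peptide"),
    ("ADKW", "peptide"),
    ("A-KF", "amine"),
]


def f3(known, m, n, w_x=1.0, w_cross=None, w_a=1.0, w_b=1.0, gap="-"):
    """Evaluate the column pair (m, n) of the alignment (0-based columns).

    w_cross maps k (number of ligand types sharing a pair type, k >= 2)
    to the weight written w_(k-1) in the formula.
    """
    ligands_by_pair = defaultdict(set)  # (res_m, res_n) -> ligand types seen with it
    one_gap = 0   # proteins with exactly one of the two residues unaligned
    both_gap = 0  # proteins with both residues unaligned
    for seq, ligand in known:
        a, b = seq[m], seq[n]
        gaps = (a == gap) + (b == gap)
        if gaps == 1:
            one_gap += 1
        elif gaps == 2:
            both_gap += 1
        else:
            ligands_by_pair[(a, b)].add(ligand)

    p = len({ligand for _, ligand in known})
    if w_cross is None:
        w_cross = {k: 1.0 for k in range(2, p + 1)}

    score = len(ligands_by_pair) / w_x
    for ligands in ligands_by_pair.values():
        k = len(ligands)
        if k >= 2:
            score += w_cross[k]
    return score + w_a * one_gap + w_b * both_gap


print(f3(KNOWN, 0, 1), f3(KNOWN, 0, 3))
```

In this toy data the column pair (0, 1) is penalized both for the pair (A, D) being shared by two ligand types and for one protein having a gap, while the pair (0, 3) separates the ligand types without either penalty.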
8. The method for predicting a binding molecule of a binding-molecule-unknown protein according to claim 7, wherein the step of obtaining ligand-determining-residue/ligand classification information comprises:
a step of extracting, from the amino acid residues of a ligand-known protein, the residue at the ligand-determining residue position having the smallest value of the function f3(n);
a step of determining the number (A) of ligand-known proteins listed in the ligand-determining-residue/ligand classification information that match the extracted ligand-determining residue;
a step of determining the number (B) of those ligand-known proteins listed in the ligand-determining-residue/ligand classification information that match the extracted ligand-determining residue and whose ligand or type of ligand matches that of said ligand-known protein;
a step of extracting, from the amino acid residues of the ligand-known protein, the residue at the ligand-determining residue position having the second-smallest, or X-th smallest (where X represents an integer greater than 2 and less than 100), value of the function f3(n);
a step of determining the number (C) of ligand-known proteins listed in the ligand-determining-residue/ligand classification information that match the extracted ligand-determining residue;
a step of determining the number (D) of those ligand-known proteins listed in the ligand-determining-residue/ligand classification information that match the extracted ligand-determining residue and whose ligand or type of ligand matches that of said ligand-known protein;
a step of obtaining the sum (E) of (A) and (C); and
a step of obtaining the sum (F) of (B) and (D),
wherein ligand-determining-residue/ligand classification information that further displays (E) and (F) is obtained.
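The counts (A) to (F) can be illustrated with a short sketch. It simplifies the claim in ways that should be read as assumptions: each determining position is treated as a single alignment column rather than a column pair, the ranking of positions by f3 is supplied by hand as `ranked_positions`, the protein being examined is itself included in the table, and the data in `KNOWN` are hypothetical.

```python
# Hypothetical toy data: (aligned sequence, ligand type) for ligand-known proteins.
KNOWN = [
    ("ADKF", "amine"),
    ("ADRF", "amine"),
    ("GEKW", "peptide"),
    ("GERW", "peptide"),
]


def match_counts(known, query_seq, query_ligand, position):
    """Count proteins sharing the query's residue at one determining position.

    Returns (number of matching proteins,
             number of matching proteins that also share the query's ligand type).
    """
    residue = query_seq[position]
    matched_ligands = [ligand for seq, ligand in known if seq[position] == residue]
    same_ligand = sum(1 for ligand in matched_ligands if ligand == query_ligand)
    return len(matched_ligands), same_ligand


# Assumed ordering: the first entry has the smallest f3 value, the next the second smallest.
ranked_positions = [0, 2]

A, B = match_counts(KNOWN, "ADKF", "amine", ranked_positions[0])
C, D = match_counts(KNOWN, "ADKF", "amine", ranked_positions[1])
E, F = A + C, B + D
print(A, B, C, D, E, F)
```

Under these assumptions, comparing F with E gives a rough indication of how consistently the chosen positions recover the ligand type of a protein whose ligand is already known.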
9. A method for predicting a binding molecule of a binding-molecule-unknown protein, comprising:
a step of obtaining, for at least two binding-molecule-known proteins whose amino acid sequences and binding molecules are known, binding-molecule-known protein classification information in which a sequence alignment of the binding-molecule-known proteins is associated with the binding molecules or the types of binding molecule;
a step of using the binding-molecule-known protein classification information to identify one or more binding-molecule-determining residue positions, which are positions in the sequence alignment of the binding-molecule-known proteins that are presumed to be involved in determining the binding molecule; and
a step of associating the amino acid residues at the binding-molecule-determining residue positions (binding-molecule-determining residues) with the binding molecules or the types of binding molecule, thereby obtaining binding-molecule-determining-residue/binding-molecule classification information representing the correlation between the binding-molecule-determining residues and the binding molecules.
10. A method for predicting a binding molecule of a binding-molecule-unknown protein, in which information on a sequence alignment of the binding-molecule-unknown protein, obtained by aligning the sequence of a binding-molecule-unknown protein of the same kind as the binding-molecule-known proteins against the sequence alignment of the binding-molecule-known proteins, is input into binding-molecule-determining-residue/binding-molecule classification information representing the correlation between binding-molecule-determining residues and binding molecules, and the binding molecule, or the type of binding molecule, that binds to the binding-molecule-unknown protein is predicted.
11. A method for producing a medicament, comprising a step of predicting a binding molecule, or a type of binding molecule, that binds to a binding-molecule-unknown protein using the method for predicting a binding molecule of a binding-molecule-unknown protein according to any one of claims 1 to 10.
12. The method for producing a medicament according to claim 11, wherein the medicament is a prophylactic agent, a therapeutic agent, or both, for a central nervous system disease, an inflammatory disease, a cardiovascular disease, cancer, a metabolic disease, an immune system disease or a digestive system disease.
13. A method for determining binding-molecule-determining residue positions using either or both of Formula 6 and Formula 7 below.

$$ f_1(n) = \sum_{\mathrm{Res}} \bigl( N(\mathrm{Res}, X_q) \times N(\mathrm{Res}, X_r) \bigr) \qquad \text{(Formula 6)} $$

[In Formula 6, n indicates that $f_1(n)$ is the evaluation function for the n-th amino acid residue of the sequence alignment of the binding-molecule-known proteins; Res represents a type of amino acid residue; $X_q$ and $X_r$ represent binding molecules or types of binding molecule; q represents an integer from 1 to p-1; r represents an integer greater than q and not greater than p; p represents the number of binding molecules or types of binding molecule; $N(\mathrm{Res}, X_q)$ represents the number of binding-molecule-known proteins present in the binding-molecule-known protein classification information whose n-th amino acid residue in the sequence alignment is Res and whose binding molecule is $X_q$; and $N(\mathrm{Res}, X_r)$ represents the number of binding-molecule-known proteins present in the binding-molecule-known protein classification information whose n-th amino acid residue in the sequence alignment is Res and whose binding molecule is $X_r$.]

$$ f_2(n) = \sum_{\mathrm{Res}} \bigl( N(\mathrm{Res}, X_1) \times N(\mathrm{Res}, X_2) \times \cdots \times N(\mathrm{Res}, X_p) \bigr) \qquad \text{(Formula 7)} $$

[In Formula 7, n indicates that $f_2(n)$ is the evaluation function for the n-th amino acid residue of the sequence alignment of the binding-molecule-known proteins; Res represents a type of amino acid residue; $X_1$ to $X_p$ represent binding molecules or types of binding molecule; p represents the number of binding molecules or types of binding molecule; and $N(\mathrm{Res}, X)$ represents the number of binding-molecule-known proteins present in the binding-molecule-known protein classification information whose n-th amino acid residue in the sequence alignment is Res and whose binding molecule is X.]
14. A method for determining binding-molecule-determining residue positions using Formula 8 below.

$$ f_3(m, n) = \Bigl\{ \frac{(\text{number of amino acid residue pair types})}{w_X} + w_1 \times (\text{number of 2-crossing residue pair types}) + w_2 \times (\text{number of 3-crossing residue pair types}) + \cdots + w_{p-1} \times (\text{number of } p\text{-crossing residue pair types}) + w_A \times (\text{number of unalignable amino acid residues}) + w_B \times (\text{number of unalignable amino acid residue pairs}) \Bigr\} \qquad \text{(Formula 8)} $$

[In Formula 8, (m, n) indicates that $f_3(m, n)$ is the evaluation function for the m-th and n-th amino acid residues of the sequence alignment of the binding-molecule-known proteins; the number of amino acid residue pair types represents the number of kinds of combination of the m-th and n-th amino acid residues in the sequence alignment of the binding-molecule-known proteins; the number of 2-crossing residue pair types and the number of 3-crossing residue pair types mean, respectively, the number of combinations of the m-th and n-th amino acid residues in the sequence alignment for which there are two kinds and three kinds of ligand; the number of p-crossing residue pair types means the number of combinations of the m-th and n-th amino acid residues in the sequence alignment for which there are p kinds of ligand; the number of unalignable amino acid residues means the number of cases in which one of the m-th and n-th amino acid residues in the sequence alignment was treated as unalignable in order to obtain preferable homology; the number of unalignable amino acid residue pairs means the number of cases in which both the m-th and n-th amino acid residues in the sequence alignment were treated as unalignable in order to obtain preferable homology; $w_X$ is a positive constant, or a distribution function whose variable is the number of amino acid pair types and which takes its maximum value when the number of amino acid pair types is a positive number of 400 or less; and $w_1$ to $w_{p-1}$, $w_A$ and $w_B$ are weights and are positive numbers.]
15. A computer for predicting a binding molecule of a binding-molecule-unknown protein, which uses the amino acid sequences or a sequence alignment of binding-molecule-known proteins, together with information on the binding molecules or the types of binding molecule, to obtain binding-molecule-determining-residue/binding-molecule classification information representing the correlation between binding-molecule-determining residues, which are the amino acid residues at positions in the sequence alignment of the binding-molecule-known proteins that are presumed to be involved in determining the binding molecule (binding-molecule-determining residue positions), and the binding molecules or the types of binding molecule,
the computer comprising:
sequence alignment input means for inputting information on the sequence alignment of the binding-molecule-known proteins;
sequence alignment/binding molecule storage means for storing the amino acid sequences or sequence alignment of the binding-molecule-known proteins input by the sequence alignment input means, together with information on the binding molecules or the types of binding molecule;
binding-molecule-determining residue position determination means for determining the binding-molecule-determining residue positions using the amino acid sequences or sequence alignment of the binding-molecule-known proteins stored by the sequence alignment/binding molecule storage means and the information on the binding molecules or the types of binding molecule;
binding-molecule-determining-residue/binding-molecule classification information acquisition means for obtaining, by associating the amino acid residues at the binding-molecule-determining residue positions (binding-molecule-determining residues) with the binding molecules or the types of binding molecule, the binding-molecule-determining-residue/binding-molecule classification information representing the correlation between the binding-molecule-determining residues and the binding molecules or the types of binding molecule; and
sequence alignment input means for inputting information on a sequence alignment of a binding-molecule-unknown protein, obtained by aligning the sequence of a binding-molecule-unknown protein of the same kind as the binding-molecule-known proteins against the sequence alignment of the binding-molecule-known proteins,
wherein the binding molecule, or the type of binding molecule, of the binding-molecule-unknown protein is predicted by applying the information on the sequence alignment of the binding-molecule-unknown protein input by the sequence alignment input means to the binding-molecule-determining-residue/binding-molecule classification information.
16. The computer for predicting a binding molecule of a protein with an unknown binding molecule according to claim 15, wherein the binding molecule determining residue position determining means uses at least one or both of the functions of Formula 9 and Formula 10 below.
$$f_1(n) = \sum_{Res} \bigl( N(Res, X_q) \times N(Res, X_r) \bigr) \qquad \text{(Formula 9)}$$
[In Formula 9, n indicates that f_1(n) is the evaluation function for the n-th amino acid residue of the sequence alignment of the proteins with known binding molecules; Res represents the type of amino acid residue; X_q and X_r each represent a binding molecule or a type of binding molecule; q represents an integer from 1 to p-1; r represents an integer greater than q and not greater than p; p represents the number of binding molecules or types of binding molecules; N(Res, X_q) represents the number of proteins, among the proteins with known binding molecules present in the known-binding-molecule protein classification information, whose n-th amino acid residue in the sequence alignment is Res and whose binding molecule is X_q; and N(Res, X_r) represents the number of such proteins whose n-th amino acid residue in the sequence alignment is Res and whose binding molecule is X_r.]
$$f_2(n) = \sum_{Res} \bigl( N(Res, X_1) \times N(Res, X_2) \times \cdots \times N(Res, X_p) \bigr) \qquad \text{(Formula 10)}$$
[In Formula 10, n indicates that f_2(n) is the evaluation function for the n-th amino acid residue of the sequence alignment of the proteins with known binding molecules; Res represents the type of amino acid residue; X_1 to X_p each represent a binding molecule or a type of binding molecule; p represents the number of binding molecules or types of binding molecules; and N(Res, X) represents the number of proteins, among the proteins with known binding molecules present in the known-binding-molecule protein classification information, whose n-th amino acid residue in the sequence alignment is Res and whose binding molecule is X.]
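The two column scores of claim 16 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the function names, the toy alignment, and the ligand labels are hypothetical, and the sketch assumes that Formula 9 is summed over every pair q < r as well as over Res, which the definitions of q and r suggest but the formula as printed does not state explicitly.

```python
from collections import Counter
from itertools import combinations
from math import prod


def column_counts(alignment, ligands, n):
    """Count N(Res, X) for alignment column n (0-based index)."""
    counts = Counter()
    for seq, lig in zip(alignment, ligands):
        counts[(seq[n], lig)] += 1
    return counts


def f1(alignment, ligands, n):
    """Formula 9, assuming the sum runs over Res and over all pairs q < r."""
    counts = column_counts(alignment, ligands, n)
    residues = {res for res, _ in counts}
    ligand_types = sorted(set(ligands))
    return sum(
        counts[(res, xq)] * counts[(res, xr)]
        for res in residues
        for xq, xr in combinations(ligand_types, 2)
    )


def f2(alignment, ligands, n):
    """Formula 10: sum over Res of the product of N(Res, X) over all p types."""
    counts = column_counts(alignment, ligands, n)
    residues = {res for res, _ in counts}
    ligand_types = sorted(set(ligands))
    return sum(prod(counts[(res, x)] for x in ligand_types) for res in residues)


# Hypothetical toy data: four aligned sequences and their ligand types.
toy_alignment = ["ADK", "ADR", "AEK", "AER"]
toy_ligands = ["amine", "amine", "peptide", "peptide"]
for col in range(3):
    print(col, f1(toy_alignment, toy_ligands, col), f2(toy_alignment, toy_ligands, col))
```

In this toy data the middle column scores 0 under both functions because no residue type there is shared between the two ligand types, whereas the fully conserved first column scores highest. With only two ligand types the two functions coincide.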
17. The computer for predicting a binding molecule of a protein with an unknown binding molecule according to claim 15 or 16, wherein the binding molecule determining residue position determining means uses the function represented by Formula 11 below.
$$f_3(m, n) = \Bigl\{ \frac{(\text{number of amino acid residue pair types})}{w_X} + w_1 \times (\text{number of 2-cross residue pair types}) + w_2 \times (\text{number of 3-cross residue pair types}) + \cdots + w_{p-1} \times (\text{number of } p\text{-cross residue pair types}) + w_A \times (\text{number of unalignable amino acid residues}) + w_B \times (\text{number of unalignable amino acid residue pairs}) \Bigr\} \qquad \text{(Formula 11)}$$
[In Formula 11, (m, n) indicates that f_3(m, n) is the evaluation function for the m-th and n-th amino acid residues of the sequence alignment of the proteins with known binding molecules; the number of amino acid residue pair types represents the number of distinct combinations of the m-th and n-th amino acid residues in that sequence alignment; the number of 2-cross residue pair types and the number of 3-cross residue pair types mean the number of those combinations of the m-th and n-th amino acid residues for which there are two and three types of ligand, respectively; the number of p-cross residue pair types means the number of those combinations for which there are p types of ligand; the number of unalignable amino acid residues means the number of cases in which one of the m-th and n-th amino acid residues was treated as unalignable in order to obtain preferable homology; the number of unalignable amino acid residue pairs means the number of cases in which both the m-th and n-th amino acid residues were treated as unalignable in order to obtain preferable homology; w_X is a positive constant, or a distribution function of the number of amino acid pair types that takes its maximum value when the number of amino acid pair types is a positive number of 400 or less; and w_1, ..., w_{p-1}, w_A, and w_B are weights and are positive numbers.]
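The pair score of claim 17 (Formula 11) can be sketched along the same lines. The following is only one possible reading, with several assumptions that the claim does not fix: unalignable positions are marked with "-", w_X is taken as a positive constant rather than a distribution function, pairs containing an unalignable position are excluded from the pair-type counts, and all names and default weights are hypothetical.

```python
from collections import defaultdict

GAP = "-"  # assumption: unalignable positions are marked with '-'


def f3(alignment, ligands, m, n, w_x=1.0, cross_weights=None, w_a=1.0, w_b=1.0):
    """Score the column pair (m, n), both 0-based, following Formula 11."""
    p = len(set(ligands))
    if cross_weights is None:
        # Hypothetical default: weight 1.0 for every k-cross term, k = 2..p.
        cross_weights = {k: 1.0 for k in range(2, p + 1)}

    ligands_by_pair = defaultdict(set)  # (res_m, res_n) -> ligand types seen
    one_gap = 0    # exactly one of the two positions is unalignable
    both_gap = 0   # both positions are unalignable
    for seq, lig in zip(alignment, ligands):
        a, b = seq[m], seq[n]
        gaps = (a == GAP) + (b == GAP)
        if gaps == 1:
            one_gap += 1
        elif gaps == 2:
            both_gap += 1
        else:
            ligands_by_pair[(a, b)].add(lig)

    pair_types = len(ligands_by_pair)
    # A residue pair seen with k >= 2 ligand types contributes the k-cross weight.
    cross = sum(
        cross_weights[len(ligs)]
        for ligs in ligands_by_pair.values()
        if len(ligs) >= 2
    )
    return pair_types / w_x + cross + w_a * one_gap + w_b * both_gap


# Hypothetical toy data with one unalignable position in the last sequence.
toy_alignment = ["ADK", "ADR", "AEK", "A-R"]
toy_ligands = ["amine", "amine", "peptide", "peptide"]
print(f3(toy_alignment, toy_ligands, 1, 2, w_x=2.0))
print(f3(toy_alignment, toy_ligands, 0, 2, w_x=2.0))
```

With `w_x=2.0`, the column pair (1, 2) in the toy data has three residue pair types, none shared between ligand types, and one sequence with a single unalignable position. The column pair (0, 2) has two residue pair types, each of which occurs with both ligand types.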
18. A computer for predicting a binding molecule of a protein with an unknown binding molecule, comprising:
storage means storing information on: binding molecule determining residue positions, which are positions in the sequence alignment of proteins with known binding molecules (proteins that are of the same kind as the protein with an unknown binding molecule and whose binding molecules are known) that are presumed to be involved in determining the molecule that binds to the protein with a known binding molecule; binding molecule determining residues, which are the amino acid residues of the proteins with known binding molecules at the binding molecule determining residue positions; and the binding molecules, or types of binding molecules, of the proteins with known binding molecules that correspond to the binding molecule determining residues;
sequence alignment input means for inputting information on the sequence alignment of a protein with an unknown binding molecule that is of the same kind as the proteins with known binding molecules, the sequence alignment being obtained by aligning the sequence of the protein with an unknown binding molecule against the sequence alignment among the proteins with known binding molecules; binding molecule determining means for determining the binding molecule, or the type of binding molecule, of the protein with an unknown binding molecule from the input information on the sequence alignment and the information stored in the storage means; and display means for displaying the determined binding molecule, or type of binding molecule, that binds to the protein with an unknown binding molecule,
wherein the binding molecule determining means predicts the binding molecule, or the type of binding molecule, of the protein with an unknown binding molecule based on the information on its sequence alignment input by the sequence alignment input means and on the information, stored in the storage means, on the binding molecule determining residues and the corresponding binding molecules or types of binding molecules of the proteins with known binding molecules, and the display means displays the binding molecule, or type of binding molecule, predicted by the binding molecule determining means,
the computer being a computer for predicting a binding molecule of a protein with an unknown binding molecule.
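The storage, determination, and display steps of claim 18 can be pictured as a simple table lookup. This is a minimal sketch under assumptions: the determining positions are taken as already chosen, an exact match on the determining residues is used as the decision rule (the claim does not specify one), and all names and data are hypothetical.

```python
def build_table(alignment, ligands, positions):
    """Storage step: map determining residues to the ligand types seen with them."""
    table = {}
    for seq, lig in zip(alignment, ligands):
        key = tuple(seq[i] for i in positions)
        table.setdefault(key, set()).add(lig)
    return table


def predict(table, positions, aligned_query):
    """Determination step: look up the query's residues at the stored positions."""
    key = tuple(aligned_query[i] for i in positions)
    return sorted(table.get(key, ()))


# Hypothetical toy data: column 1 is assumed to be the determining position.
known_alignment = ["ADK", "ADR", "AEK", "AER"]
known_ligands = ["amine", "amine", "peptide", "peptide"]
positions = [1]

table = build_table(known_alignment, known_ligands, positions)
# Display step: print the predicted ligand type(s) for an aligned query.
print(predict(table, positions, "ADK"))
```

A query whose determining residues never occur among the known proteins returns an empty list in this sketch. A real system would need some fallback for that case, which is outside what the claim describes.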
19. A program for causing a computer to function as:
sequence alignment input means for inputting information on the sequence alignment of proteins with known binding molecules; sequence alignment binding molecule storage means for storing the amino acid sequences or sequence alignment of the proteins with known binding molecules input by the sequence alignment input means, together with information on the binding molecules or types of binding molecules;
binding molecule determining residue position determining means for determining the binding molecule determining residue position, using the amino acid sequences or sequence alignment of the proteins with known binding molecules stored by the sequence alignment binding molecule storage means together with the information on the binding molecules or types of binding molecules;
binding molecule determining residue-binding molecule classification information obtaining means for obtaining binding molecule determining residue-binding molecule classification information, which represents the correlation between binding molecule determining residues and binding molecules or types of binding molecules, by associating the amino acid residue at the binding molecule determining residue position (the binding molecule determining residue) with a binding molecule or a type of binding molecule; and
sequence alignment input means for inputting information on the sequence alignment of a protein with an unknown binding molecule that is of the same kind as the proteins with known binding molecules, the sequence alignment being obtained by aligning the sequence of the protein with an unknown binding molecule against the sequence alignment among the proteins with known binding molecules.
20. The program according to claim 19, wherein the binding molecule determining residue position determining means uses at least one or both of the functions of Formula 12 and Formula 13 below.
$$f_1(n) = \sum_{Res} \bigl( N(Res, X_q) \times N(Res, X_r) \bigr) \qquad \text{(Formula 12)}$$
[In Formula 12, n indicates that f_1(n) is the evaluation function for the n-th amino acid residue of the sequence alignment of the proteins with known binding molecules; Res represents the type of amino acid residue; X_q and X_r each represent a binding molecule or a type of binding molecule; q represents an integer from 1 to p-1; r represents an integer greater than q and not greater than p; p represents the number of binding molecules or types of binding molecules; N(Res, X_q) represents the number of proteins, among the proteins with known binding molecules present in the known-binding-molecule protein classification information, whose n-th amino acid residue in the sequence alignment is Res and whose binding molecule is X_q; and N(Res, X_r) represents the number of such proteins whose n-th amino acid residue in the sequence alignment is Res and whose binding molecule is X_r.]
$$f_2(n) = \sum_{Res} \bigl( N(Res, X_1) \times N(Res, X_2) \times \cdots \times N(Res, X_p) \bigr) \qquad \text{(Formula 13)}$$
[In Formula 13, n indicates that f_2(n) is the evaluation function for the n-th amino acid residue of the sequence alignment of the proteins with known binding molecules; Res represents the type of amino acid residue; X_1 to X_p each represent a binding molecule or a type of binding molecule; p represents the number of binding molecules or types of binding molecules; and N(Res, X) represents the number of proteins, among the proteins with known binding molecules present in the known-binding-molecule protein classification information, whose n-th amino acid residue in the sequence alignment is Res and whose binding molecule is X.]
21. The program according to claim 19 or 20, wherein the binding molecule determining residue position determining means uses the function represented by Formula 14 below.
$$f_3(m, n) = \Bigl\{ \frac{(\text{number of amino acid residue pair types})}{w_X} + w_1 \times (\text{number of 2-cross residue pair types}) + w_2 \times (\text{number of 3-cross residue pair types}) + \cdots + w_{p-1} \times (\text{number of } p\text{-cross residue pair types}) + w_A \times (\text{number of unalignable amino acid residues}) + w_B \times (\text{number of unalignable amino acid residue pairs}) \Bigr\} \qquad \text{(Formula 14)}$$
[In Formula 14, (m, n) indicates that f_3(m, n) is the evaluation function for the m-th and n-th amino acid residues of the sequence alignment of the proteins with known binding molecules; the number of amino acid residue pair types represents the number of distinct combinations of the m-th and n-th amino acid residues in that sequence alignment; the number of 2-cross residue pair types and the number of 3-cross residue pair types mean the number of those combinations of the m-th and n-th amino acid residues for which there are two and three types of ligand, respectively; the number of p-cross residue pair types means the number of those combinations for which there are p types of ligand; the number of unalignable amino acid residues means the number of cases in which one of the m-th and n-th amino acid residues was treated as unalignable in order to obtain preferable homology; the number of unalignable amino acid residue pairs means the number of cases in which both the m-th and n-th amino acid residues were treated as unalignable in order to obtain preferable homology; w_X is a positive constant, or a distribution function of the number of amino acid pair types that takes its maximum value when the number of amino acid pair types is a positive number of 400 or less; and w_1, ..., w_{p-1}, w_A, and w_B are weights and are positive numbers.]
22. A program for causing a computer to function as:
storage means storing information on: binding molecule determining residue positions, which are positions in the sequence alignment of proteins with known binding molecules (proteins that are of the same kind as a protein with an unknown binding molecule and whose binding molecules are known) that are presumed to be involved in determining the molecule that binds to the protein with a known binding molecule; binding molecule determining residues, which are the amino acid residues of the proteins with known binding molecules at the binding molecule determining residue positions; and the binding molecules, or types of binding molecules, of the proteins with known binding molecules that correspond to the binding molecule determining residues;
sequence alignment input means for inputting information on the sequence alignment of a protein with an unknown binding molecule that is of the same kind as the proteins with known binding molecules, the sequence alignment being obtained by aligning the sequence of the protein with an unknown binding molecule against the sequence alignment among the proteins with known binding molecules; binding molecule determining means for determining the binding molecule, or the type of binding molecule, of the protein with an unknown binding molecule from the input information on the sequence alignment and the information stored in the storage means; and
display means for displaying the determined binding molecule, or type of binding molecule, that binds to the protein with an unknown binding molecule.
23. A recording medium storing the program according to any one of claims 19 to 22.
PCT/JP2002/007057 2001-07-12 2002-07-11 Method of presuming ligand and method of using the same WO2003007187A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2001212749 2001-07-12
JP2001-212749 2001-07-12

Publications (1)

Publication Number Publication Date
WO2003007187A1 true WO2003007187A1 (en) 2003-01-23

Family

ID=19047856

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2002/007057 WO2003007187A1 (en) 2001-07-12 2002-07-11 Method of presuming ligand and method of using the same

Country Status (1)

Country Link
WO (1) WO2003007187A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1433849A1 (en) * 2001-09-14 2004-06-30 Takeda Chemical Industries, Ltd. NOVEL POLYPEPTIDE, DNA THEREOF AND USE OF THE SAME
WO2004056866A1 (en) * 2002-12-20 2004-07-08 Geneos Oy Asthma susceptibility locus
WO2004076487A1 (en) * 2003-02-28 2004-09-10 Takeda Pharmaceutical Company Limited Antibody and use thereof

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MARCHESE A., GEORGE S.R., O'DOWD B.F.: "Novel GPCRs and their endogenous ligands: expanding the boundaries of physiology and pharmacology", TRENDS IN PHARMACOLOGICAL SCIENCES, vol. 20, no. 9, 1999, pages 370 - 375, XP002187420 *
STADEL J.M., BERGSMA D.J.: "Orphan G protein-coupled receptors: a neglected opportunity for pioneer", TRENDS IN PHARMACOLOGICAL SCIENCES, vol. 18, no. 11, November 1997 (1997-11-01), pages 430 - 437, XP004096215 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1433849A1 (en) * 2001-09-14 2004-06-30 Takeda Chemical Industries, Ltd. NOVEL POLYPEPTIDE, DNA THEREOF AND USE OF THE SAME
EP1433849A4 (en) * 2001-09-14 2005-08-17 Takeda Pharmaceutical Novel polypeptide, dna thereof and use of the same
US7323541B2 (en) 2001-09-14 2008-01-29 Takeda Pharmaceutical Company Limited Polypeptide DNA thereof and use of the same
WO2004056866A1 (en) * 2002-12-20 2004-07-08 Geneos Oy Asthma susceptibility locus
WO2004076487A1 (en) * 2003-02-28 2004-09-10 Takeda Pharmaceutical Company Limited Antibody and use thereof

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LU MC NL PT SE SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP