JPH10197529A

JPH10197529A - Antigen deciding group anticipating method and system for protein

Info

Publication number: JPH10197529A
Application number: JP35135696A
Authority: JP
Inventors: Masato Kitajima; 正人北島; Michihiro Ooya; 倫宏大屋; Kota Sakai; 広太酒井; Hirofumi Doi; 洋文土居
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1996-12-27
Filing date: 1996-12-27
Publication date: 1998-07-31
Anticipated expiration: 2016-12-27
Also published as: JP3618497B2

Abstract

PROBLEM TO BE SOLVED: To predict peptides that can serve as specific antigenic determinants by finding, from the primary sequence information of the amino acids of a protein, peptides with a high suitability evaluation value and then identifying, among them, the peptides with higher specificity within a given protein group. SOLUTION: An evaluation value indicating suitability as an antigenic determinant is obtained for each peptide of a first predetermined number of residues taken from the primary sequence information of the amino acid residues representing the protein under analysis. For each peptide with a high evaluation value, data corresponding to the frequency with which a second predetermined number of amino acid residues contained in that peptide appear in a predetermined protein group are obtained as specificity data. The second predetermined number is no greater than the first predetermined number, and the two may be equal. When the frequencies obtained for the second predetermined number of amino acid residues in a peptide are low, the specificity data indicate high specificity. Peptides that can serve as antigenic determinants with higher specificity can be estimated on the basis of these specificity data.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a method and system for predicting antigenic determinants of a protein, and more particularly to a method and system that predict and display peptides that can serve as specific antigenic determinants characterizing a protein, based on the primary sequence information of the amino acid residues representing that protein. The method and system can be used throughout the field of biochemistry (genes, proteins) concerned with identifying biological functions.

[0002]

[Description of the Related Art] The present applicant has previously proposed a system that predicts antigenic determinants from the primary sequence information of the amino acid residues representing a protein (Japanese Patent Application No. 7-330873). In this conventional system, an evaluation value indicating how likely a peptide is to be an antigenic determinant is calculated, according to a predetermined algorithm, for each peptide of a predetermined number of residues taken from the primary sequence of amino acid residues representing the protein. The evaluation values are then displayed as a graph in association with the primary sequence of amino acid residues, or the peptides are displayed in descending order of evaluation value.

[0003] The evaluation value is obtained by quantifying assessments such as flexibility, hydrophilicity/hydrophobicity, antigenic determinant analysis, and secondary structure prediction, and is expressed using one of these numerical values alone or several of them combined. Each of these evaluation methods (algorithms) is known from various published papers.

[0004] Researchers regard peptides displayed with high evaluation values as likely antigenic determinants, biochemically synthesize such peptides, and use the synthesized peptides in various experiments to identify the properties of the protein.

[0005]

[Problems to Be Solved by the Invention] To identify the function peculiar to a protein under analysis, an antigenic determinant peculiar to that protein, that is, a specific antigenic determinant, must be found. However, the conventional antigenic determinant prediction system provides no information on whether a predicted antigenic determinant is specific to the protein under analysis.

[0006] An object of the present invention is therefore to provide a method and system for predicting antigenic determinants of a protein that can predict peptides capable of serving as specific antigenic determinants characterizing the protein.

[0007]

[Means for Solving the Problems] To solve the above problem, the method for predicting antigenic determinants of a protein according to the present invention is configured, as set forth in claim 1, to: obtain, according to a predetermined algorithm, an evaluation value indicating the degree of suitability as an antigenic determinant for each peptide composed of a first predetermined number of amino acid residues taken from the primary sequence information of the amino acid residues representing the protein under analysis; select a predetermined number of peptides with high evaluation values; obtain, for each selected peptide, data corresponding to the frequency with which a second predetermined number of amino acid residues contained in that peptide appear in a predetermined protein group, as specificity data representing the specificity of the peptide; and estimate, on the basis of the specificity data, peptides that can serve as antigenic determinants of higher specificity for the protein under analysis.

[0008] In this method, an evaluation value indicating suitability as an antigenic determinant is assigned to each peptide obtained from the primary sequence information of the amino acid residues representing the protein under analysis. Then, for each peptide with a higher evaluation value, data corresponding to the frequency with which the second predetermined number of amino acid residues contained in that peptide appear in the predetermined protein group are obtained as specificity data representing the specificity of the peptide. When the frequencies obtained for each stretch of the second predetermined number of amino acid residues contained in the peptide are low overall, the specificity data indicate high specificity.

[0009] On the basis of the specificity data thus obtained, peptides that can serve as antigenic determinants of higher specificity for the protein under analysis are estimated. The number of amino acid residues constituting each peptide obtained from the primary amino acid sequence of the protein under analysis (the first predetermined number) and the number of amino acid residues, contained in each peptide, for which the frequency is obtained (the second predetermined number) are related such that the second predetermined number is less than or equal to the first predetermined number. The two numbers may therefore be equal. In that case, the specificity data correspond to the frequency with which the peptide itself, as obtained from the primary amino acid sequence of the protein under analysis, appears in the protein group.
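The relationship between the two predetermined numbers can be written compactly. The following is only a sketch: the symbols k_1, k_2, f, and S are introduced here for illustration and do not appear in the specification, and summing over every contiguous stretch is an assumption that follows the frequency-addition variant of claim 3. Let k_1 be the first predetermined number, k_2 (with k_2 ≤ k_1) the second, p = a_1 a_2 ... a_{k_1} a candidate peptide, and f(q) the frequency with which a k_2-residue sequence q appears in the protein group.

```latex
S(p) \;=\; \sum_{j=1}^{k_1 - k_2 + 1} f\!\left(a_j a_{j+1} \cdots a_{j+k_2-1}\right),
\qquad k_2 \le k_1 .
```

A smaller S(p) indicates higher specificity. When k_2 = k_1 the sum has a single term, S(p) = f(p), which is the frequency of the peptide itself. With k_1 = 7 and k_2 = 5 the sum has three terms.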

[0010] As the predetermined protein group, a database of known proteins (for example, SWISS-PROT) can be used. To solve the above problem, the system for predicting antigenic determinants of a protein according to the present invention is configured, as set forth in claim 2, to comprise: evaluation value calculating means that evaluates, according to a predetermined algorithm, suitability as an antigenic determinant for each peptide composed of a first predetermined number of amino acid residues taken from the primary sequence information of the amino acid residues representing the protein under analysis, quantifies the result, and outputs it as an evaluation value; primary prediction means that selects a predetermined number of peptides with high evaluation values obtained by the evaluation value calculating means; specificity data calculating means that, for each peptide selected by the primary prediction means, quantifies data corresponding to the frequency with which a second predetermined number of amino acid residues contained in that peptide appear in a predetermined protein group, as specificity data representing the specificity of the peptide; secondary prediction means that estimates, on the basis of the specificity data obtained by the specificity data calculating means, peptides that can serve as antigenic determinants of higher specificity for the protein under analysis; and output means that outputs the peptides predicted by the secondary prediction means together with their specificity data.

[0011] In this system, as in the prediction method described above, specificity data are assigned to peptides with high suitability as antigenic determinants (high evaluation values). The specificity data correspond to the frequency with which the second predetermined number of amino acid residues contained in each peptide appear in the predetermined protein group. Peptides that can serve as specific antigenic determinants for the protein under analysis are then estimated on the basis of the specificity data, and the estimated peptides are output by the output means together with the specificity data.

[0012] By viewing the output, a user of this antigenic determinant prediction system can learn which peptides can serve as specific antigenic determinants for the protein under analysis. The output means may be a display unit that displays the estimated peptides together with the specificity data, a printer that prints them out, or a transmission unit that transmits them to another system.

[0013] Furthermore, from the viewpoint of easily obtaining the data corresponding to the frequency, the present invention is configured, as set forth in claim 3, such that the specificity data calculating means comprises: frequency data receiving means that receives from an external unit, for each stretch of the second predetermined number of amino acid residues obtained from the primary amino acid sequence of the protein under analysis, the frequency with which those amino acid residues appear in the predetermined protein group; and frequency adding means that extracts, from the set of frequencies received by the frequency data receiving means, the frequencies of each stretch of the second predetermined number of amino acid residues contained in each peptide and adds them together; the sum obtained by the frequency adding means being used as the specificity data of each peptide.

[0014] The frequency with which the second predetermined number of amino acid residues appear in the predetermined protein group can be supplied from a recording medium such as a CD-ROM, from a host system connected to the system by a local network such as an intranet, or from another system via a public network such as the Internet.

[0015]

[Embodiments of the Invention] An embodiment of the present invention is described below with reference to the drawings. The system according to the present invention is built, for example, in each terminal computer system connected to a host system by an intranet, as shown in FIG. 1. In FIG. 1, a host system (database server) 100 and terminal computer systems 200(1) to 200(10) are interconnected by an intranet N so that information can be exchanged among them. A disk unit 120 is connected to the host system 100 and stores a publicly available protein database (for example, SWISS-PROT). This protein database is supplied to the host system 100 on a recording medium such as a CD-ROM or via a public network.

[0016] The hardware configuration of each of the terminal computer systems 200(1) to 200(10) is, for example, as shown in FIG. 2. In FIG. 2, the system has a control unit 10 (CPU), a display unit 20, an I/O interface 30, a memory unit 40, an input unit 50, a CD-ROM drive unit 60, and a communication unit 80. These units and the I/O interface are interconnected by a bus. In addition, a disk unit 70 is connected to the bus via the I/O interface 30.

[0017] The control unit 10 controls the entire system and, as described later, executes the processing for predicting antigenic determinants of a protein. The display unit 20 consists of a CRT, an LCD (liquid crystal display panel), or the like and, based on processing in the control unit 10, displays on its screen various windows showing the primary amino acid sequence of the protein, the various data obtained during antigenic determinant prediction (evaluation values, frequency distributions, and so on), the prediction results, and the like. The memory unit 40 includes RAM and ROM and stores the program to be executed by the control unit 10, the data obtained in the course of processing according to that program, the data to be displayed on the screen of the display unit 20, and the like.

[0018] The input unit 50 consists of a keyboard, a mouse, and the like and is used by the user (researcher) to enter information into the system. The communication unit 80 exchanges information with the host system 100 via the intranet N. A CD-ROM 100 stores a program describing the processing for predicting antigenic determinants of a protein. The program is installed on the disk unit 70 from the CD-ROM 100 loaded in the CD-ROM drive unit 60. When the prediction processing is to be executed, the program is read from the disk unit 70 under the control of the control unit 10 and stored in the memory unit 40 (RAM). In this state, the control unit 10 (CPU) performs the processing for predicting antigenic determinants of the protein according to the program stored in the memory unit 40.

[0019] On each of the terminal computer systems 200(1) to 200(10), the researcher proceeds, for example, according to the procedure shown in FIG. 3. In FIG. 3, when the primary sequence information of the protein under analysis is first entered into the system, an antigenic determinant search process P1 is executed for that protein. The conventional technique disclosed in, for example, Japanese Patent Application No. 7-330873 is applied to this search process P1.

[0020] That is, a peptide composed of seven amino acid residues is selected from the entered primary amino acid sequence information, the selected peptide is evaluated according to a predetermined algorithm, and an evaluation value for suitability as an antigenic determinant is assigned to the peptide. As shown in FIG. 4, for example, the seven-residue peptide under evaluation is updated by shifting one residue at a time from the first residue of the primary sequence (...).

[0021] The evaluation value assigned to each of the successively updated peptides is associated with the central residue (representative residue) of the amino acid residue sequence constituting the peptide (the fourth residue from the start, as indicated by the arrow in FIG. 4). That is, the evaluation value assigned to each peptide is stored in the memory unit 40, and, in an antigenic determinant prediction buffer that has a storage area for each residue as shown in FIG. 6, a pointer value indicating the storage location of the evaluation value of the peptide whose representative residue is the corresponding residue is stored in each storage area.

[0022] As described above, the evaluation value for each peptide is obtained by quantifying assessments such as flexibility, hydrophilicity/hydrophobicity, antigenic determinant analysis, and secondary structure prediction, and is expressed using one of these numerical values alone or several of them combined. The flexibility evaluation is performed, for example, according to the algorithm published in "P.A. Karplus and G.E. Schulz, Prediction of Chain Flexibility, Naturwissenschaften 72, 212-213 (1985)". The hydrophilicity/hydrophobicity evaluation is performed, for example, according to the algorithm published in "J. Kyte and R.F. Doolittle, A Simple Method for Displaying the Hydropathic Character of a Protein, J. Mol. Biol. 157, 105-132 (1982)". The antigenic determinant analysis evaluation is performed, for example, according to the algorithm published in "E.A. Emini, Induction of Hepatitis A Virus-Neutralizing Antibody by a Virus-Specific Synthetic Peptide, J. Virol. 55, 836-839 (1985)". The secondary structure prediction evaluation is performed, for example, according to the algorithm published in "J. Garnier, Analysis of the Accuracy and Implications of Simple Methods for Predicting the Secondary Structure of Globular Proteins, J. Mol. Biol. 120, 97-120 (1978)".
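As an illustration of one such evaluation, the sketch below averages the Kyte-Doolittle hydropathy index over a peptide. The index values are those of the published scale; how the specification turns them into an evaluation value is not stated, so the simple window mean and the sign inversion (so that more hydrophilic peptides score higher) are assumptions, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide: str) -> float:
    """Mean hydropathy of a peptide; raises KeyError for non-standard codes."""
    return sum(KD[residue] for residue in peptide) / len(peptide)


def hydrophilicity_score(peptide: str) -> float:
    """Assumed evaluation value: negated mean hydropathy, so that hydrophilic
    (typically surface-exposed) peptides receive the higher score."""
    return -mean_hydropathy(peptide)
```

A sequence containing an unknown-residue code such as X would need explicit handling (for example, skipping windows that contain it), since the scale defines no value for it.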

[0023] Once an evaluation value has been assigned to each peptide obtained from the primary sequence of the protein as described above, ranks are assigned in descending order of evaluation value (that is, in descending order of suitability as an antigenic determinant), and a pointer value to each rank is set in the storage area of the antigenic determinant prediction buffer (see FIG. 6) that corresponds to the representative residue of the peptide concerned.

[0024] When the antigenic determinant search process P1 is completed as described above, the evaluation values are displayed as a graph against the residue numbers of the representative residues on the screen of the display unit 20, for example as shown in FIG. 5. Further, when the user (researcher) uses the input unit 50 to specify the number n of antigenic determinants to be selected (for example, n = 3), the control unit 10 scans the antigenic determinant prediction buffer (see FIG. 6) and selects the peptides ranked in the top n (n = 3) (P2 in FIG. 3). The rank of each selected peptide (Rank of Epitope), the amino acid residue sequence representing the peptide (Residue Name), and the residue numbers of that sequence (Residue No.) are then additionally displayed on the screen of the display unit 20 (see FIG. 5).
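The ranking and top-n selection (P2) can be sketched as below. The function names, the example sequence, and the scores are hypothetical; breaking ties in favor of the lower residue number is an assumption, since the specification does not say how equal evaluation values are ordered.

```python
from typing import Dict, List, Tuple


def rank_peptides(scores: Dict[int, float]) -> Dict[int, int]:
    """Map each representative residue number to its rank (1 = highest score).
    Ties are broken by the lower residue number (an assumption)."""
    ordered = sorted(scores, key=lambda center: (-scores[center], center))
    return {center: rank for rank, center in enumerate(ordered, start=1)}


def top_n(sequence: str, scores: Dict[int, float],
          n: int = 3, width: int = 7) -> List[Tuple[int, str, int, int]]:
    """Return (rank, residue sequence, first residue no., last residue no.)
    for the n best-scoring peptides, best first."""
    half = width // 2
    rows = []
    for center, rank in sorted(rank_peptides(scores).items(), key=lambda kv: kv[1]):
        if rank > n:
            break
        first = center - half  # 1-based number of the peptide's first residue
        rows.append((rank, sequence[first - 1:first - 1 + width],
                     first, first + width - 1))
    return rows


sequence = "MKTAYIAKQRQ"  # made-up 11-residue sequence
scores = {4: 0.20, 5: 0.90, 6: 0.50, 7: 0.90, 8: 0.10}  # hypothetical values
rows = top_n(sequence, scores, n=3)
```

Each returned row carries the same three items that the display of FIG. 5 is described as showing: the rank, the residue sequence, and the residue numbers.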

[0025] From the graph, the researcher can see roughly where in the primary sequence the peptides expected to be highly suitable as antigenic determinants are located, and from the rank and residue sequence display can identify the specific peptides predicted to be highly suitable as antigenic determinants.

[0026] When the user then uses the input unit 50 to enter a command to start the frequency distribution creation process (P3 in FIG. 3), the control unit 10 starts the frequency distribution creation process P3. In this process, an input window such as that shown in FIG. 8 is first displayed on the screen of the display unit 20. Using the input unit 50 (mouse, keyboard), the user specifies in the input window the size of the oligopeptides whose frequency is to be obtained (four-residue oligopeptides: Tetrapeptides; five-residue oligopeptides: Pentapeptides; six-residue oligopeptides: Hexapeptides) and enters the primary sequence information of the protein under analysis (AAAYXLVVVG ASGV...). In this example, five-residue oligopeptides (Pentapeptides) are specified. The user then uses the input unit 50 to specify the range of proteins over which the frequency is to be obtained (all species: All species; human: Human; mouse: Mouse; and so on).

[0027] Because the primary sequence information of the protein under analysis to be entered in the input window is the same as that used in the antigenic determinant search process P1 described above, it may also be set automatically when the input window is opened.

[0028] When, after the various settings have been made as described above, the "OK" control button in the input window is selected (by a mouse click), the control unit 10 selects, from the entered primary amino acid sequence information, oligopeptides of the specified size (five residues) as the targets whose frequency is to be obtained. As shown in FIG. 9, for example, the five-residue oligopeptide is updated by shifting one residue at a time from the first residue of the primary sequence (...).

[0029] The oligopeptides thus selected in succession are sent to the host system 100 via the communication unit 80, together with the information, entered in the input window, on the range of proteins over which the frequency is to be obtained. The host system 100 scans the database stored in the disk unit 120 and obtains the frequency with which each specified oligopeptide appears within the specified range of proteins (protein group). A known technique is used to obtain this frequency (for example, Japanese Patent Application No. 8-203575).
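The technique of the cited application is not reproduced in this document, so the sketch below substitutes a naive count: it tallies every five-residue window in a small in-memory protein group and looks up the query oligopeptides. The function names and the toy protein group are hypothetical, and counting every occurrence (rather than, say, the number of proteins containing the oligopeptide) is an assumption.

```python
from collections import Counter
from typing import Dict, Iterable, List


def kmers(sequence: str, k: int) -> List[str]:
    """All contiguous k-residue oligopeptides, shifting one residue at a time."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def kmer_frequencies(queries: Iterable[str],
                     protein_group: Iterable[str],
                     k: int = 5) -> Dict[str, int]:
    """Number of occurrences of each query oligopeptide in the protein group."""
    counts: Counter = Counter()
    for protein in protein_group:
        counts.update(kmers(protein, k))
    return {query: counts[query] for query in queries}  # absent queries count 0


target = "MKTAYIAKQRQ"  # made-up protein under analysis
group = ["MKTAYIAKQRQ", "GGKTAYIGG", "TAYIATAYIA"]  # toy stand-in for a database
freq = kmer_frequencies(kmers(target, 5), group, k=5)
```

A real protein database would be far too large to tally on each request in this way; a precomputed index of oligopeptide counts would be the more likely arrangement on the host side, though the document does not describe it.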

[0030] The host system 100 returns the obtained frequency data to the terminal computer system 200. In the terminal computer system 200, which receives the frequency data for each oligopeptide via the communication unit 80, the control unit 10 stores the received frequency data in the memory unit 40. The control unit 10 further sets, in each storage area of a frequency buffer that has a storage area for each residue as shown in FIG. 7, a pointer value indicating the storage location of the frequency data for the oligopeptide whose representative residue (the residue at the center of the peptide; see the arrow in FIG. 7) is the corresponding residue.

[0031] When the frequency data for all the oligopeptides (five-residue sequences) have been received and pointer values have been set in the storage areas for all residues of the frequency buffer, the control unit 10 refers to the frequency buffer and displays the frequency of each oligopeptide as a graph on the screen of the display unit 20, for example as shown in FIG. 10. In detail, as shown in FIG. 11, the graph plots on the vertical axis the residue numbers of the amino acid residues in the primary sequence of the protein under analysis and, on the horizontal axis, as a bar graph, the frequency of the oligopeptide whose representative residue is the residue identified by that residue number.

[0032] After displaying the frequency of each oligopeptide on the display unit 20 as described above, the control unit 10 performs a process of ranking the peptides in descending order of specificity on the basis of those frequencies (P4 in FIG. 3) and a process of displaying the ranked peptides (P5 in FIG. 3). The ranking process P4 is performed, for example, according to the procedure shown in FIG. 12.

[0033] In FIG. 12, the control unit 10 sets an evaluation rank Ro in a predetermined register (S1). The evaluation rank Ro corresponds to the number of peptides to be extracted as antigenic determinants specific to the protein; the peptides with the top Ro evaluation values (for example, three) are the ones whose specificity is assessed. After setting a counter value i to 1 (i = 1) (S2), the control unit 10 refers to the antigenic determinant prediction buffer (see FIG. 6) and obtains the evaluation rank R (residue data) of the i-th (first) peptide (seven-residue sequence) (S3). The control unit 10 then determines whether the obtained evaluation rank R is less than or equal to the set evaluation rank Ro (R ≤ Ro) (S4).

[0034] If the acquired evaluation rank R is equal to or less than the set evaluation rank Ro (that is, within the top Ro ranks), the control unit 10 refers to the frequency buffer (see FIG. 7), reads from the memory unit 40 the frequencies of the three consecutive oligopeptides (five-residue strings) at the (i-1)-th, i-th, and (i+1)-th positions, and adds them (S5).
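The addition in step S5 can be illustrated with a minimal Python sketch. The function name `summed_frequency`, the dictionary layout of the frequency buffer, and the numeric values are assumptions made for illustration only; the patent specifies neither a data structure nor a programming language.

```python
def summed_frequency(freq, i):
    """Add the frequencies of the five-residue strings whose representative
    residues are i-1, i and i+1 (step S5)."""
    return freq[i - 1] + freq[i] + freq[i + 1]


# Hypothetical frequency buffer: representative residue number -> number of
# occurrences of that five-residue string in the protein group.
frequency_buffer = {3: 12, 4: 0, 5: 7, 6: 1, 7: 30}

total = summed_frequency(frequency_buffer, 5)  # adds the entries for 4, 5 and 6
```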

[0035] Adding the frequencies of these three consecutive oligopeptides has the following meaning. As shown in FIG. 13, the peptide currently being evaluated is a seven-residue string (i-3, i-2, i-1, i, i+1, i+2, i+3), whereas each oligopeptide whose frequency was obtained consists of five residues. The peptide to be evaluated (seven-residue string) therefore contains three of the oligopeptides whose frequencies were obtained. That is, the representative residues of those oligopeptides coincide with the (i-1)-th, i-th, and (i+1)-th residues of the peptide to be evaluated.
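The relationship between the seven-residue peptide and its three five-residue strings can be checked with a short Python sketch. The helper names and the example sequence are hypothetical, and the sketch assumes that the representative residue of a five-residue string is its central residue and that residue numbers are 1-based.

```python
def heptapeptide(sequence, i):
    """Return residues i-3 .. i+3 (1-based residue numbers) as a string."""
    return sequence[i - 4 : i + 3]


def contained_pentapeptides(sequence, i):
    """Map each representative (central) residue number i-1, i, i+1
    to the five-residue string centered on it."""
    return {c: sequence[c - 3 : c + 2] for c in (i - 1, i, i + 1)}


seq = "MKTAYIAKQRQ"  # made-up sequence, for illustration only
pep = heptapeptide(seq, 5)
inner = contained_pentapeptides(seq, 5)
```

A window of 7 residues holds 7 - 5 + 1 = 3 windows of 5 residues, which is why exactly three frequencies are added.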

[0036] Consequently, the value obtained by adding the frequencies of the three consecutive oligopeptides at the (i-1)-th, i-th, and (i+1)-th positions is the total frequency of all the oligopeptides contained in the peptide to be evaluated (seven-residue string). This frequency addition value serves as a measure of the specificity of that peptide, that is, of how rarely it appears within the protein group. The smaller the frequency addition value (the rarer the peptide within the protein group), the higher its specificity.

[0037] The frequency addition value obtained in this way is stored in the memory unit 40, and a pointer value to the frequency addition value is stored in the i-th storage area of the antigenic determinant prediction buffer (see FIG. 6) (S6). The control unit 10 then determines whether the counter value i has reached the final value corresponding to the last residue number of the primary amino acid sequence (S7). Until the final value is reached, it increments the counter value i by 1 (S8) and repeats the same frequency addition processing (S3 to S6) for the peptides whose evaluation rank is within the top Ro. When the evaluation rank R of the i-th peptide is greater than the set evaluation rank Ro (that is, ranked lower than Ro), the frequency addition processing described above is not performed (S4 → S7).
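The loop of steps S2 to S8 might be sketched in Python as follows. This is only an illustration under stated assumptions: the buffers are modeled as plain dictionaries keyed by residue number, and the function name, variable names, and data are hypothetical rather than taken from the patent.

```python
def add_frequencies_for_top_peptides(eval_rank, freq, ro):
    """Visit every peptide in residue order and compute the frequency
    addition value only when its evaluation rank R satisfies R <= Ro."""
    summed = {}
    for i in sorted(eval_rank):            # counter i, incremented by 1 (S8)
        if eval_rank[i] <= ro:             # S4: within the top Ro ranks?
            # S5/S6: add the three five-residue-string frequencies and store them
            summed[i] = freq[i - 1] + freq[i] + freq[i + 1]
        # otherwise nothing is added for this peptide (S4 -> S7)
    return summed


# Hypothetical data: residue number -> evaluation rank / frequency.
eval_rank = {4: 2, 5: 5, 6: 1, 7: 4, 8: 3}
freq = {3: 9, 4: 2, 5: 0, 6: 1, 7: 4, 8: 6, 9: 3}

result = add_frequencies_for_top_peptides(eval_rank, freq, ro=3)
```

Peptides ranked below Ro simply have no entry in the result, mirroring the S4 → S7 branch.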

[0038] When the processing has been completed for all peptides (i = END), the control unit 10 scans the antigenic determinant prediction buffer (see FIG. 6), reads out the frequency addition values, and assigns ranks (frequency ranks) in ascending order of those values (S9). The control unit 10 then sets a pointer value indicating the frequency rank of each peptide in the corresponding storage area of the antigenic determinant prediction buffer (S10).
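As a rough illustration of step S9, the following Python sketch assigns frequency ranks in ascending order of the frequency addition value. The tie-breaking rule (lower residue number first) is an assumption, since the description does not say how equal values are ordered; the function name and data are likewise hypothetical.

```python
def frequency_ranks(summed):
    """Give rank 1 to the smallest frequency addition value (most specific).
    Ties are broken by residue number, which is an assumption."""
    ordered = sorted(summed, key=lambda i: (summed[i], i))
    return {i: rank for rank, i in enumerate(ordered, start=1)}


# Hypothetical frequency addition values keyed by residue number.
ranks = frequency_ranks({4: 11, 6: 5, 8: 13})
```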

[0039] After the frequency rank pointer values have been set in the antigenic determinant prediction buffer as described above, the control unit 10 performs the display process for the ranked peptides (P5 in FIG. 3). This display process P5 is performed, for example, according to the procedure shown in FIG. 14.

[0040] In FIG. 14, the control unit 10 first waits for an operation input on a predetermined control button (a mouse click) (S11). When the user selects the predetermined control button (Specific Epitope Find) in this state, the control unit 10 detects the operation, initializes the counter value i (i = 1) (S12), and then refers to the antigenic determinant prediction buffer (see FIG. 6) to acquire the frequency rank R of the i-th peptide (S13). The control unit 10 then determines whether the acquired frequency rank R is equal to or less than a predetermined frequency rank Rs (R ≤ Rs) (S14). The rank Rs is set to a predetermined value equal to or less than the evaluation rank Ro used in the processing shown in FIG. 12.

[0041] If the acquired frequency rank R is equal to or less than the predetermined rank Rs (within the top Rs), the control unit 10 refers to the antigenic determinant prediction buffer (see FIG. 6) and sets the data to be displayed in the display buffer (memory unit 40) (S15). This data includes the evaluation value of the i-th peptide (its suitability as an antigenic determinant), its frequency addition value (specificity), and the (i-3)-th to (i+3)-th amino acid residues of the primary sequence (the name of the i-th peptide). The control unit 10 then determines whether the counter value i has reached the final value corresponding to the last residue number of the primary amino acid sequence (S16). Until the final value is reached, it increments the counter value i by 1 (S17) and repeats the same data-setting processing for the display buffer (S13 to S15) for the peptides whose frequency rank is within the top Rs.

[0042] When the frequency rank R of the i-th peptide is greater than the predetermined rank Rs, the data-setting processing for the display buffer described above is not performed (S14 → S16). When the processing has been completed for all peptides (i = END), the control unit 10 sorts the data in the display buffer according to frequency rank (S18). After the sort, the control unit 10 performs display processing P10 on the data in the display buffer (evaluation value, peptide name, peptide residue number, and frequency addition value). In this display processing P10, the data in the buffer is supplied to the display unit 20 and is displayed, for example as shown in FIG. 10, in the lower part of the frequency display window on the screen of the display unit 20.
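The filtering and sorting of steps S13 to S18 can be sketched in Python as below. The record layout (a list of dictionaries with the keys `freq_rank`, `name`, `residue_no`, `score`, and `number`) and the use of `None` for peptides that never received a frequency rank are assumptions made for this illustration, not details given in the description.

```python
def build_display_buffer(peptides, rs):
    """Keep peptides whose frequency rank R satisfies R <= Rs (S14, S15),
    then sort the kept records by frequency rank (S18)."""
    selected = [
        p for p in peptides
        if p["freq_rank"] is not None and p["freq_rank"] <= rs
    ]
    return sorted(selected, key=lambda p: p["freq_rank"])


# Hypothetical records, listed in residue order as the counter i would visit them.
peptides = [
    {"residue_no": 4, "name": "MKTAYIA", "score": 1.52, "number": 11, "freq_rank": 2},
    {"residue_no": 5, "name": "KTAYIAK", "score": 0.40, "number": None, "freq_rank": None},
    {"residue_no": 6, "name": "TAYIAKQ", "score": 1.84, "number": 5, "freq_rank": 1},
    {"residue_no": 8, "name": "YIAKQRQ", "score": 1.10, "number": 13, "freq_rank": 3},
]

display_buffer = build_display_buffer(peptides, rs=2)
```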

[0043] That is, for the top Rs frequency ranks (for example, Rs = 3), the peptide name (Residue Name), residue number (Residue No.), evaluation value (Score), and frequency addition value (Number) are displayed in order of frequency rank (Rank of Functional Site). By viewing this display, the user (researcher) can identify the peptides predicted to be specific (highly specific) antigenic determinants contained in the protein to be analyzed.
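A plain-text rendering of this rank display might look like the following Python sketch. Only the column labels come from the description; the column widths, the two-decimal score format, the use of the central residue as the residue number, and the sample rows are assumptions for illustration.

```python
def format_rank_table(rows):
    """Render rows (already sorted by frequency rank) as a fixed-width table."""
    header = (f"{'Rank of Functional Site':>23}  {'Residue Name':<12}  "
              f"{'Residue No.':>11}  {'Score':>6}  {'Number':>6}")
    lines = [header]
    for rank, row in enumerate(rows, start=1):
        lines.append(
            f"{rank:>23}  {row['name']:<12}  {row['residue_no']:>11}  "
            f"{row['score']:>6.2f}  {row['number']:>6}"
        )
    return "\n".join(lines)


# Hypothetical rows; 'number' is the frequency addition value.
rows = [
    {"name": "TAYIAKQ", "residue_no": 6, "score": 1.84, "number": 5},
    {"name": "MKTAYIA", "residue_no": 4, "score": 1.52, "number": 11},
]

table = format_rank_table(rows)
print(table)
```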

[0044] In the example described above, the antigenic determinant search process P1 shown in FIG. 3 corresponds to the evaluation value calculating means, the process P2 shown in FIG. 3 corresponds to the primary prediction means, and the frequency distribution creation process P3 shown in FIG. 3 corresponds to the specificity data calculating means. The process of ranking peptides in descending order of specificity based on frequency (P4 in FIG. 3), specifically the process shown in FIG. 12, corresponds to the secondary prediction means. The process of displaying the ranked peptides (P5 in FIG. 3), specifically the process shown in FIG. 14, corresponds to the output means.

【００４５】[0045]

[Effects of the Invention] As described above, according to the invention set forth in each claim, peptides that are rarer within a predetermined protein group are further selected from among the peptides, obtained from the primary amino acid sequence of the protein to be analyzed, that are highly suitable as antigenic determinants. This makes it possible to predict peptides that can serve as specific antigenic determinants characterizing the protein to be analyzed.

[Brief description of the drawings]

FIG. 1 is a block diagram showing a computer network system to which a prediction system according to the present invention is applied.

FIG. 2 is a block diagram showing the hardware configuration of a terminal computer system in the computer network system shown in FIG. 1.

FIG. 3 is a flowchart showing the procedure for predicting the antigenic determinants of a protein.

FIG. 4 is a diagram showing a technique for extracting, from the primary amino acid sequence of a protein, the peptides whose suitability as antigenic determinants is to be judged.

FIG. 5 is a diagram showing a display example of the evaluation values of the extracted peptides.

FIG. 6 is a diagram showing the configuration of the antigenic determinant prediction buffer.

FIG. 7 is a diagram showing the configuration of the frequency buffer.

FIG. 8 is a diagram showing an input window for obtaining frequency data.

FIG. 9 is a diagram showing a technique for extracting, from the primary amino acid sequence of a protein, the oligopeptides whose frequencies are to be obtained.

FIG. 10 is a diagram showing examples of a graph display of the frequencies of the oligopeptides extracted by the technique shown in FIG. 9 and of a rank display of antigenic determinant specificity.

FIG. 11 is a diagram showing details of the frequency distribution over the primary amino acid sequence of a protein.

FIG. 12 is a flowchart showing the detailed procedure of process P4 in FIG. 3.

FIG. 13 is a diagram showing the relationship between a peptide to be evaluated and the peptides used for frequency calculation.

FIG. 14 is a flowchart showing the detailed procedure of process P5 in FIG. 3.

[Explanation of symbols]

10 Control unit
20 Display unit
30 I/O interface
40 Memory unit
50 Input unit
60 CD-ROM driver unit
70 Disk unit
80 Communication unit
100 CD-ROM
110 Host computer system
120 Disk unit
200(1)-200(10) Terminal computer systems

Continued from the front page
(72) Inventor: Kota Sakai, c/o Fujitsu Kyushu System Engineering Ltd., 2-2-1 Momochihama, Sawara-ku, Fukuoka-shi, Fukuoka
(72) Inventor: Hirofumi Doi, c/o Fujitsu Limited, 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa

Claims

[Claims]

1. A method for predicting antigenic determinants of a protein, wherein: an evaluation value representing the degree of suitability as an antigenic determinant is determined, according to a predetermined algorithm, for each peptide composed of a first predetermined number of amino acid residues obtained from primary sequence information of amino acid residues representing a protein to be analyzed; a predetermined number of peptides having high evaluation values are selected; for each of the selected peptides, data corresponding to the frequency at which a second predetermined number of amino acid residues contained in the peptide appear in a predetermined protein group is obtained as specificity data indicating the specificity of the peptide; and a peptide that can be an antigenic determinant of higher specificity in the protein to be analyzed is predicted based on the specificity data.

2. A system for predicting antigenic determinants of a protein, comprising: evaluation value calculating means for evaluating, according to a predetermined algorithm, the suitability as an antigenic determinant of each peptide composed of a first predetermined number of amino acid residues obtained from primary sequence information of amino acid residues representing a protein to be analyzed, quantifying the evaluation result, and outputting it as an evaluation value; primary prediction means for selecting a predetermined number of peptides having high evaluation values obtained by the evaluation value calculating means; specificity data calculating means for quantifying, for each of the peptides selected by the primary prediction means, data corresponding to the frequency at which a second predetermined number of amino acid residues contained in the peptide appear in a predetermined protein group, as specificity data representing the specificity of the peptide; secondary prediction means for predicting, based on the specificity data obtained by the specificity data calculating means, a peptide that can be an antigenic determinant of higher specificity in the protein to be analyzed; and output means for outputting the peptide predicted by the secondary prediction means together with its specificity data.

3. The system for predicting antigenic determinants of a protein according to claim 2, wherein the specificity data calculating means comprises: frequency data receiving means for receiving, from an external unit, the frequency at which each string of the second predetermined number of amino acid residues obtained from the primary amino acid sequence of the protein to be analyzed appears in the predetermined protein group; and frequency addition means for extracting, from the group of frequencies received by the frequency data receiving means, the frequencies of the strings of the second predetermined number of amino acid residues contained in each of the peptides and adding them; and wherein the frequency addition value obtained by the frequency addition means is used as the specificity data of each peptide.