JP2006170710A

JP2006170710A - Method of extracting peak and program for executing the same method

Info

Publication number: JP2006170710A
Application number: JP2004361561A
Authority: JP
Inventors: Takeshi Aoshima; 健青島
Original assignee: Eisai Co Ltd; Mitsui Knowledge Industry Co Ltd
Current assignee: Eisai Co Ltd; Mitsui Knowledge Industry Co Ltd
Priority date: 2004-12-14
Filing date: 2004-12-14
Publication date: 2006-06-29
Anticipated expiration: 2024-12-14
Also published as: JP4621491B2

Abstract

PROBLEM TO BE SOLVED: To provide a method of extracting peaks in an MS/MS spectrum so that proteome analysis using the MS/MS spectrum can be performed efficiently and precisely, and to provide a program for executing the method. SOLUTION: The method of extracting peaks in a spectrum comprising a plurality of peaks includes: (1) a step of acquiring the spectrum; (2) a step of detecting at least two peaks as one peak group when at least two contiguous peaks among the plurality of peaks each have a height equal to or greater than a prescribed proportion of the maximum peak height; and (3) a step of calculating one representative peak from each peak group by weighted averaging. COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to the analysis of mass spectra used in proteome analysis, and more particularly to a method for extracting peaks from an MS/MS spectrum used in proteome analysis and a program for executing the method.

Human genome analysis has now been completed, and we have entered the post-genomic era. Here, the "genome" is the total DNA contained in one cell, including, for example, all human genes and intergenic regions. In this post-genomic era, it has become increasingly important to separate, identify, and quantify proteins, which are important biomolecules in vivo. In particular, research and development of techniques for diagnosing and treating disease requires elucidating the functions of many proteins.

Conventionally, two-dimensional electrophoresis has been used for proteome analysis of proteins expressed by cells. Here, "proteome analysis" refers to analysis that clarifies the relationship between genetic information and the diverse proteins that interact in complex ways within cells (see, for example, Non-Patent Document 1). In other words, proteome analysis is a technique for comprehensively analyzing all the proteins that constitute a cell.

In two-dimensional electrophoresis, the expressed proteins are developed on a gel, and the spots corresponding to the proteins of interest are cut out so that their types can be comprehensively identified. Two-dimensional electrophoresis is therefore a useful means of qualitative analysis in proteome analysis. However, it has been pointed out that two-dimensional electrophoresis is unsuited to quantitative analysis of proteins, because the amount of protein developed is very small and the recovery rate during analysis is prone to error.

Mass spectrometry, on the other hand, is used as another important protein analysis technique. It determines the accurate mass of proteins and peptides using a mass spectrometer. A mass spectrometer typically includes a device that ionizes proteins and peptides, a mass separation unit that separates the ionized proteins and peptides according to their mass, and a mass analyzer that analyzes the mass. Mass spectrometers, protein databases, and the search systems that link them have made protein identification dramatically easier. It is therefore possible to comprehensively identify the components of a specific protein mixture (for example, the proteins that form a certain complex).

At present, protein mixtures are usually identified by the MS/MS method. The MS/MS method uses a plurality of mass separation units. One of the ion species generated from the proteins in the sample in the ionization chamber of the first mass separation unit (MS1) is selected as a precursor ion, whose mass spectrum is hereinafter sometimes referred to as the "MS spectrum". In the second mass separation unit (MS2), the precursor ion is collided with an inert gas such as argon and fragmented, and the spectrum of the resulting product ions is detected to analyze the protein. In the MS/MS method, this product ion spectrum (hereinafter sometimes referred to as the "MS/MS spectrum") is measured, and the protein is identified by a comprehensive judgment based on the result of collation with a database such as NCBInr (http://www.ncbi.nlm.nih.gov/). The method has the advantage that, even when a plurality of proteins are mixed in a sample, a protein can be identified from each individual precursor ion.

FIG. 1 is a schematic view of an MS spectrum and MS/MS spectra obtained from a sample of a protein mixture. The MS and MS/MS spectra shown in FIG. 1 are the results of measuring, with a mass spectrometer, a sample obtained by in-gel digestion of a band cut out from a two-dimensional electrophoresis gel. MS1 yields the MS spectrum of the ions, and then, for each MS spectrum, MS2 yields the MS/MS spectrum of the product ions. The protein can then be identified by comparing the obtained MS/MS spectra with a database. In the example shown in FIG. 1, the sample can be identified as a mixture of protein A and protein B.
Karn, P. Science 270, pp 369-370, 1995

However, an actually measured MS/MS spectrum is very complex, particularly when the sample to be measured contains a mixture of proteins. It has therefore been pointed out that proteome analysis using MS/MS spectra is itself inefficient.
In view of the above circumstances, an object of the present invention is to provide a method for extracting peaks from an MS/MS spectrum, and a program for executing the method, so that proteome analysis using MS/MS spectra can be performed efficiently and accurately.

As a result of intensive studies to solve the above problem, the present inventors found that efficient proteome analysis can be achieved by extracting peaks from an MS/MS spectrum having a plurality of peaks on the basis of a specific relational expression, and thereby completed the present invention.

That is, in a first aspect, the present invention provides a method for extracting peaks from a spectrum having a plurality of peaks, the method including:
(1) a step of acquiring the spectrum;
(2) a step of detecting, when at least two consecutive peaks among the plurality of peaks each have a height equal to or greater than a predetermined ratio of the maximum peak height, the at least two peaks as one peak group; and
(3) a centroid step of calculating one representative peak from each peak group by weighted averaging.
This method excludes unnecessary peaks such as noise from the analysis and narrows down the peaks to be analyzed, enabling efficient analysis.

According to a preferred aspect of the extraction method of the present invention, the spectrum includes an MS spectrum and/or an MS/MS spectrum of a biomolecule, expressed as intensity against m/z, and the weighted average in step (3) is performed according to the following formula.

(Here, m is the mass of the biomolecule, z is the charge of the biomolecule, I_ij and m/z_ij are, respectively, the intensity and the m/z value of the one representative peak, k is an index identifying the m/z values included in the peak group and takes values from i to j, y_k is the intensity at m/z_k, h₀ is the maximum peak intensity, and a is the predetermined ratio.)
Using this formula allows MS/MS spectra to be analyzed accurately and quickly.

According to a preferred aspect of the extraction method of the present invention, the method further includes (4) a step of calculating, from the m/z values of the peaks included in the peak group and the m/z values of the peaks adjacent to the peak group on either side, the ratio of the maximum to the minimum interval between the m/z values of adjacent peaks, and removing the peak group when the ratio exceeds a predetermined value. Checking whether each peak group was detected correctly yields a more accurate peak extraction method.
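As a minimal sketch of the interval-ratio check in step (4), the following assumes that the m/z values are sorted in ascending order and that a peak group is given by inclusive start and end indices. The function names and the default `max_ratio` of 2.0 are hypothetical; the source only says that the threshold is "a predetermined value".

```python
from typing import Sequence


def spacing_ratio(mz: Sequence[float], start: int, end: int) -> float:
    """Max/min ratio of the m/z intervals across a peak group and its neighbours.

    `start` and `end` are inclusive indices of the group within `mz`,
    which is assumed to be sorted in ascending order.
    """
    lo = max(start - 1, 0)            # include the peak just before the group
    hi = min(end + 1, len(mz) - 1)    # include the peak just after the group
    gaps = [b - a for a, b in zip(mz[lo:hi], mz[lo + 1:hi + 1])]
    if not gaps or min(gaps) <= 0:
        raise ValueError("need at least two strictly increasing m/z values")
    return max(gaps) / min(gaps)


def keep_group(mz: Sequence[float], start: int, end: int,
               max_ratio: float = 2.0) -> bool:
    """Return False when the group should be removed (ratio above threshold).

    `max_ratio` is an illustrative value, not one taken from the source.
    """
    return spacing_ratio(mz, start, end) <= max_ratio


# Hypothetical data: evenly spaced points versus a group that straddles a gap.
evenly_spaced = [100.00, 100.01, 100.02, 100.03, 100.04]
with_gap = [100.00, 100.01, 100.02, 100.50, 100.51]
print(keep_group(evenly_spaced, 1, 3))
print(keep_group(with_gap, 1, 3))
```

A group whose points are evenly spaced gives a ratio close to 1 and is kept, while a group that spans an irregular gap gives a large ratio and is dropped.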

According to a preferred aspect of the extraction method of the present invention, the method further includes (5) a step of clustering a plurality of representative peaks, calculated for the respective peak groups, whose m/z values lie within a specific value of one another into a single peak. The specific value is determined based on the molecular weight of the biomolecule. The intensity of the single peak after clustering is calculated as the sum of the intensities of the representative peaks being clustered, and its m/z value is calculated as the average of the m/z values of those representative peaks weighted by their intensities. Organizing the peaks to be analyzed in this way enables more efficient analysis.
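The clustering in step (5) can be sketched as follows. The summed intensity and the intensity-weighted m/z follow the description above. Two things are assumptions: the grouping rule (a peak joins the current cluster when it lies within `tol` of the previous member) and the tolerance being passed as a plain parameter, whereas the source derives it from the molecular weight. All intensities are assumed to be positive.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def _merge(members: List[Peak]) -> Peak:
    """Collapse cluster members into one peak: summed intensity, weighted m/z."""
    total = sum(inten for _, inten in members)
    mz = sum(m * inten for m, inten in members) / total
    return (mz, total)


def cluster_peaks(peaks: List[Peak], tol: float) -> List[Peak]:
    """Cluster representative peaks whose m/z values lie within `tol`.

    Assumption: a peak is chained to the current cluster when its m/z is
    within `tol` of the previous member; the source does not fix this rule.
    """
    clustered: List[Peak] = []
    current: List[Peak] = []
    for mz, inten in sorted(peaks):
        if current and mz - current[-1][0] > tol:
            clustered.append(_merge(current))
            current = []
        current.append((mz, inten))
    if current:
        clustered.append(_merge(current))
    return clustered


# Hypothetical representative peaks; the first two fall within tol = 0.5.
result = cluster_peaks([(500.0, 10.0), (600.0, 5.0), (500.2, 30.0)], tol=0.5)
print(result)
```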

According to a preferred aspect of the extraction method of the present invention, when each MS/MS spectrum has one data set relating to the spectrum and a plurality of MS/MS spectra are acquired in step (1), the method further includes, prior to step (2), (a) a step of merging a plurality of data sets for which the difference in m/z of the precursor-ion MS spectra of the MS/MS spectra is within a certain range and the difference in acquisition timing of the MS/MS spectrum data sets is within a certain range. Adjusting the MS/MS spectrum data sets in this way before detecting peak groups enables efficient peak group detection.
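The merge condition of step (a) might be sketched as below. The `Spectrum` class, its field names, and both tolerance parameters are hypothetical. The source does not say how data sets are combined, so this sketch simply concatenates their data points and compares each new spectrum against the first member of every merged set.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Spectrum:
    precursor_mz: float                 # m/z of the precursor ion
    acquired_at: float                  # acquisition timing (arbitrary units)
    points: List[Tuple[float, float]]   # (m/z, intensity) data points


def merge_spectra(spectra: List[Spectrum], mz_tol: float,
                  time_tol: float) -> List[Spectrum]:
    """Merge data sets whose precursor m/z and acquisition timing are close.

    Assumption: "combining" means pooling the data points of the data sets.
    """
    merged: List[Spectrum] = []
    for s in sorted(spectra, key=lambda sp: sp.acquired_at):
        for m in merged:
            if (abs(s.precursor_mz - m.precursor_mz) <= mz_tol
                    and abs(s.acquired_at - m.acquired_at) <= time_tol):
                m.points = sorted(m.points + s.points)
                break
        else:
            # No compatible set found: start a new one (copy, do not alias).
            merged.append(Spectrum(s.precursor_mz, s.acquired_at,
                                   sorted(s.points)))
    return merged


# Hypothetical data: A and B share a precursor and are close in time.
a = Spectrum(500.00, 10.0, [(100.0, 1.0)])
b = Spectrum(500.01, 10.5, [(101.0, 2.0)])
c = Spectrum(600.00, 10.2, [(102.0, 3.0)])
out = merge_spectra([a, b, c], mz_tol=0.05, time_tol=1.0)
print(len(out), out[0].points)
```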

According to a preferred aspect of the extraction method of the present invention, steps (2) to (4) are performed after step (a). Performing steps (2) to (4) after the merge process makes peak extraction more efficient.

In a second aspect, the present invention provides, among others, the following programs:
[8] a program for causing a computer to execute a method for extracting peaks from a spectrum having a plurality of peaks, the method including (1) a step of acquiring the spectrum, (2) a step of detecting, when at least two consecutive peaks among the plurality of peaks each have a height equal to or greater than a predetermined ratio of the maximum peak height, the at least two peaks as one peak group, and (3) a centroid step of calculating one representative peak from each peak group by weighted averaging;
[9] the program according to [8], wherein the spectrum includes an MS spectrum and/or an MS/MS spectrum of a biomolecule, expressed as intensity against m/z, and the weighted average in step (3) is performed according to the following formula

(here, m is the mass of the biomolecule, z is the charge of the biomolecule, I_ij and m/z_ij are, respectively, the intensity and the m/z value of the one representative peak, k is an index identifying the m/z values included in the peak group and takes values from i to j, y_k is the intensity at m/z_k, h₀ is the maximum peak intensity, and a is the predetermined ratio);
[10] the program according to [8] or [9], further causing the computer to execute (4) a step of calculating, from the m/z values of the peaks included in the peak group and the m/z values of the peaks adjacent to the peak group on either side, the ratio of the maximum to the minimum interval between the m/z values of adjacent peaks, and removing the peak group when the ratio exceeds a predetermined value;
[11] the program according to any one of [8] to [10], further causing the computer to execute (5) a step of clustering a plurality of representative peaks, calculated for the respective peak groups, whose m/z values lie within a specific value of one another into a single peak;
[12] the program according to [11], wherein the specific value is determined based on the molecular weight of the biomolecule, the intensity of the single peak after clustering is calculated as the sum of the intensities of the representative peaks being clustered, and the m/z value of the single peak after clustering is calculated as the average of the m/z values of those representative peaks weighted by their intensities;
[13] the program according to any one of [8] to [12], wherein each MS/MS spectrum has one data set relating to the spectrum and, when a plurality of MS/MS spectra are acquired in step (1), the program further causes the computer to execute, prior to step (2), (a) a step of merging a plurality of data sets for which the difference in m/z of the precursor-ion MS spectra of the MS/MS spectra is within a certain range and the difference in acquisition timing of the MS/MS spectrum data sets is within a certain range; and
[14] the program according to any one of [8] to [13], which causes the computer to execute steps (2) to (4) after step (a).

The program according to the present invention causes a computer to execute each step of the extraction method according to the present invention. The program can be installed on a computer from various recording media such as a CD-ROM, a magnetic disk, or a semiconductor memory, or downloaded to the computer via a communication network.

The peak extraction method according to the present invention, and the program that executes it, allow mass spectra to be analyzed more efficiently. The method is particularly well suited to peak extraction on a Qq-TOF MS/MS apparatus.

Embodiments of the present invention will be described with reference to the drawings. The following embodiments are examples for explaining the present invention and are not intended to limit the present invention to these embodiments. The present invention can be implemented in various forms without departing from its gist.

FIG. 2 shows an example of a mass spectrum of a biomolecule, for example a protein, as a spectrum having a plurality of peaks to which the present invention is directed. The spectrum shown in FIG. 2 is an MS/MS spectrum; because it has many peaks, including noise, subsequent analysis is expected to be difficult.

FIG. 3 is a hardware configuration diagram of an apparatus 10 that executes the peak extraction method according to the present invention. As shown in FIG. 3, the apparatus 10 includes a CPU 12, an input device 14 such as a mouse and keyboard, a display device 16 such as a CRT, a RAM (Random Access Memory) 18, a ROM (Read Only Memory) 20, a portable recording medium drive 22 that accesses a portable recording medium 23 such as a CD-ROM or DVD-ROM, a hard disk device 24, and a communication control interface (I/F) 26 that controls data exchange with the outside. The apparatus 10 is connected to a mass spectrometer 30 via an input/output control I/F, so that data obtained by the mass spectrometer 30 and data analyzed by the apparatus 10 can be exchanged between them. A personal computer or the like can be used as the apparatus 10 according to the present embodiment.

The peak extraction method according to the present invention is described below using an MS/MS spectrum obtained by the mass spectrometer 30 as an example, but the present invention is not limited to MS/MS spectra.

Examples of the mass spectrometer 30 used in the present invention include an apparatus that performs tandem MS/MS measurement, in which at least two mass separation units are coupled. In the tandem MS/MS method, a collision chamber is provided between the first mass spectrometer and the second mass spectrometer, and precursor ions generated by decomposition in the first mass spectrometer are collided with an inert gas to generate product ions.

Specifically, taking two mass separation units as an example, a preferred aspect of the present invention uses a Qq-TOF MS/MS apparatus in which the mass separation unit of the first mass spectrometer is a Q filter type and that of the second mass spectrometer is a TOF type. A Q filter type mass spectrometer separates ions by mass using a quadrupole electric field generated by applying DC and AC voltages across four cylindrical electrodes arranged in parallel. Scanning the voltage applied to the quadrupole yields a mass spectrum, while fixing the voltage allows only ions with a specific m/z value to pass, which is known as mass filtering. A TOF type mass spectrometer, on the other hand, separates ions by mass by exploiting the fact that the time an ion needs to traverse a vacuum flight tube of fixed length depends on its m/z.
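The dependence of flight time on m/z can be made explicit with the textbook relation for an idealized linear TOF analyzer. The symbols below are assumptions introduced for illustration and do not appear in the source: U is the accelerating potential, L the flight-tube length, e the elementary charge, v the ion velocity, t the flight time, and z is read as the charge number.

```latex
z e U = \tfrac{1}{2} m v^{2}
\quad\Longrightarrow\quad
t = \frac{L}{v} = L \sqrt{\frac{m}{2\, z\, e\, U}}
```

Under these assumptions the flight time grows with the square root of m/z, so ions with a larger m/z arrive later.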

The MS/MS spectrum data obtained by the mass spectrometer 30 in this way are sent to the apparatus 10, where the desired operations and analyses are performed.

The portable recording medium 23 stores, among other things, a program that executes the peak extraction method of the present invention. The portable recording medium drive 22 reads the program from the portable recording medium 23 and stores it on the hard disk device 24; starting the program then allows a personal computer to operate as the apparatus 10. Alternatively, the program may be downloaded via an external network such as the Internet.

FIG. 4 is a functional block diagram of the main part of the apparatus 10. The apparatus 10 includes an input/output unit 51 that acquires the MS spectra and/or MS/MS spectra measured by the mass spectrometer 30, and a data storage unit 52 that stores them. To execute the peak extraction method according to the present invention, the apparatus 10 further includes a peak group detection unit 54, a centroid processing unit 55, a peak group removal unit 56, and a cluster processing unit 57. In a preferred aspect of the present invention, the apparatus 10 further includes a merge processing unit 53.

First, the apparatus 10 acquires, via the input/output unit 51, the MS spectrum and/or MS/MS spectrum measured by the mass spectrometer 30. The acquired spectrum data are stored in the data storage unit 52 as necessary and can be used in the peak extraction method described below.

The peak extraction method according to the present invention is described below using MS/MS spectrum data. As illustrated in FIG. 2, an MS/MS spectrum typically has a plurality of peaks. Applying the extraction method of the present invention extracts peaks from the MS/MS spectrum efficiently, which contributes to accurate and rapid proteome analysis.

FIG. 5 is a flowchart outlining the peak extraction method according to the present invention. In step S10, the MS spectrum and/or MS/MS spectrum measured by the mass spectrometer 30 is acquired. In step S11, the peak group detection unit 54 detects peak groups in the acquired spectrum according to a fixed rule. In step S12, the centroid processing unit 55 applies the centroid processing described below to each detected peak group and calculates a representative peak for each group. In step S13, to evaluate the validity of the peak group detection, the peak group removal unit 56 removes unnecessary peak groups by comparing each detected peak group with the peaks before and after it. In step S14, to organize the remaining peak groups further, the cluster processing unit 57 clusters the peak groups based on the m/z value calculated for each group, so that the peaks to be analyzed are extracted efficiently. Although not shown in FIG. 5, in a preferred aspect of the present invention the merge process described below is performed prior to step S11. Steps S10 to S14 and the merge process are described in detail below.

(Peak group detection step in the peak group detection unit 54)
The peak group detection unit 54 detects peak groups by grouping together, according to a fixed relational expression, the peaks to be analyzed in MS/MS spectrum data having a plurality of peaks. FIG. 6 illustrates peak group detection in the peak extraction method according to the present invention and is an enlarged schematic view of region A in FIG. 2. The MS/MS spectrum shown in FIG. 6 is one acquired by the input/output unit 51, plotted with m/z on the x-axis and intensity on the y-axis.

First, the maximum peak height (h₀) in one MS/MS spectrum, that is, the peak with the maximum intensity in the example of FIG. 6, is determined. Second, attention is focused on peaks whose height (intensity) is equal to or greater than a predetermined ratio of that maximum. The maximum peak is used as the reference because it is presumed to be the peak least affected by noise and therefore the most reliable. The predetermined ratio can be chosen as appropriate in view of the accuracy required of the subsequent analysis. Third, when at least two consecutive peaks have an intensity equal to or greater than the predetermined ratio of the maximum, those peaks are detected as one peak group. In the example of FIG. 6, the predetermined ratio is set to 0.5, and the peaks enclosed by P₀ are detected as one peak group.
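The three steps above can be sketched as a single pass over the intensity values. This is a minimal illustration, not the patented implementation: the function name and the representation of a group as an inclusive index pair are assumptions, and the sample intensities are hypothetical.

```python
from typing import List, Tuple


def detect_peak_groups(intensities: List[float],
                       ratio: float = 0.5) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs of peak groups.

    A group is a run of at least two consecutive points whose intensity
    is >= ratio * h0, where h0 is the maximum intensity in the spectrum.
    """
    if not intensities:
        return []
    threshold = ratio * max(intensities)   # a * h0
    groups: List[Tuple[int, int]] = []
    start = None                           # start index of the current run
    for idx, y in enumerate(intensities):
        if y >= threshold:
            if start is None:
                start = idx
        else:
            if start is not None and idx - start >= 2:
                groups.append((start, idx - 1))
            start = None
    # Close a run that reaches the end of the spectrum.
    if start is not None and len(intensities) - start >= 2:
        groups.append((start, len(intensities) - 1))
    return groups


# Hypothetical intensities; h0 = 22, so the threshold is 11 with ratio 0.5.
spectrum = [1, 2, 12, 15, 22, 14, 3, 11, 2, 13, 12]
print(detect_peak_groups(spectrum))  # [(2, 5), (9, 10)]
```

The isolated point at index 7 meets the threshold but has no qualifying neighbour, so it is not reported as a group.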

(Centroid processing in the centroid processing unit 55)
The centroid processing of step S12 calculates one representative peak from each detected peak group. In step S12, the representative peak of a peak group is calculated from the x-axis and y-axis values of the peaks in the group by weighted averaging, and the result is used in the subsequent spectrum analysis. The "centroid processing" used in the present invention calculates the representative peak of a peak group according to the following formulas (1) and (2).

[Formulas (1) and (2): image not reproduced in this text]

(Here, m is the mass of the biomolecule and z is its charge; I_ij and m/z_ij are the intensity and the m/z value of the one representative peak, respectively; k is an index identifying an m/z value in the peak group and runs from i to j; y_k is the intensity at m/z_k; and a is the predetermined ratio, which is shown as 0.5 in FIGS. 6 and 7.)
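The formula images are missing from this text. As a hedged reconstruction only, one form that is consistent with the symbol definitions above, with the claims (which additionally define h₀ as the maximum peak intensity), and with the baseline shift described for FIG. 7 would be:

```latex
\begin{align}
I_{ij} &= \sum_{k=i}^{j} \left( y_k - a\,h_0 \right) \tag{1} \\
(m/z)_{ij} &= \frac{\sum_{k=i}^{j} (m/z)_k \left( y_k - a\,h_0 \right)}
                   {\sum_{k=i}^{j} \left( y_k - a\,h_0 \right)} \tag{2}
\end{align}
```

Whether the published formulas subtract a·h₀ in exactly this way cannot be confirmed from the text, so this form should be read as an assumption.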

Formula (1) gives the intensity on the y-axis, and formula (2) gives m/z on the x-axis. In this way, a representative peak can be calculated for each peak group. The y-axis intensity is the sum of the peak intensities in the peak group, and the x-axis m/z is the weighted average of the m/z values in the peak group. The centroid processing used in the present invention need only calculate a representative peak from each detected peak group; the calculation described above is an example, and the processing is not limited to formulas (1) and (2).

FIG. 7 shows an example of calculating a representative peak from a peak group by the centroid processing of the peak extraction method according to the present invention. The example makes it easy to follow the centroid processing applied to one peak group detected in step S11. Specifically, the maximum intensity of peak group P₀ is 22, so with a = 0.5 the peaks in the range m/z₁ to m/z₇ are consecutive peaks with intensities of 11 or more, that is, at least 50% of the maximum intensity. The peak intensities in the range m/z₁ to m/z₇ are plotted as coordinates S₁ to S₇. A new v-axis is then set up on the vertical axis with intensity 11 (50% of the maximum intensity) as its reference, and the peak intensities in the range m/z₁ to m/z₇ are recalculated on it as v₁ = 1, ..., v₇ = 6 (see FIG. 7). The sum I(P₀) of the intensities on the v-axis is then calculated, and m/z(P₀) is obtained as the weighted average along the m/z axis. In this way, the centroid processing used in the present invention yields the representative peak of a detected peak group.
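The v-axis calculation described for FIG. 7 can be sketched as below. This is an illustrative sketch under stated assumptions: the function name `centroid` and the sample m/z and intensity values are hypothetical (the intermediate values of FIG. 7 are not given in the text), and the weights are assumed to be the intensities measured from the a·h₀ baseline.

```python
from typing import Sequence, Tuple


def centroid(mz: Sequence[float], intensity: Sequence[float],
             h0: float, a: float = 0.5) -> Tuple[float, float]:
    """Return (representative m/z, representative intensity) of one peak group.

    Assumption: each peak is re-based onto a v-axis whose origin is a * h0,
    the intensities on that axis are summed, and the same values are used
    as weights for the m/z average.
    """
    if len(mz) != len(intensity) or not mz:
        raise ValueError("mz and intensity must be non-empty and equally long")
    baseline = a * h0
    v = [y - baseline for y in intensity]  # intensities on the v-axis
    total = sum(v)                         # I(P0) in the text
    if total <= 0:
        raise ValueError("no intensity above the baseline; centroid undefined")
    mz_rep = sum(x * w for x, w in zip(mz, v)) / total  # m/z(P0)
    return mz_rep, total


# Hypothetical peak group; maximum intensity 22, so the baseline is 11.
group_mz = [500.0, 500.1, 500.2, 500.3, 500.4]
group_y = [12.0, 17.0, 22.0, 17.0, 12.0]
rep_mz, rep_intensity = centroid(group_mz, group_y, h0=22.0, a=0.5)
```

Because the sample group is symmetric about 500.2, the representative m/z falls on the central peak. An asymmetric group would shift it toward the heavier side.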

(Peak removal processing in the peak group removal unit 56)
As described above, step S13 is performed after the representative peaks of the detected peak groups have been calculated; it evaluates the validity of each detected peak group and removes unnecessary peak groups. In the present invention, the peak removal processing uses the m/z value of each peak in a detected peak group together with the m/z values of the peaks adjacent to the group on either side.

FIG. 8 illustrates the peak removal step of the peak extraction method according to the present invention. In FIG. 8, the peaks in the peak group denoted P have m/z values m/z₁ to m/z₄, and the adjacent peaks immediately before and after the group, which do not belong to it, have m/z values m/z₀ and m/z₅. As one example of peak removal, the present invention denotes by d the interval between adjacent m/z values in the range m/z₀ to m/z₅ and decides whether to remove the peak group according to the following formula (3).

[Formula (3): image not reproduced in this text]

(Here, l ranges over the m/z values in each peak group and the m/z values adjacent to the group on either side.)
According to formula (3), when the ratio ε of the maximum value of d to its minimum value exceeds a predetermined value, the peak group P is removed. A large ε means that the spacing of the peaks in the group is irregular, so the peaks themselves are expected to be unreliable; such a peak group is therefore removed. The threshold for ε can be chosen as appropriate in view of the accuracy required in the subsequent analysis. Removing peak groups in this way enables more accurate peak extraction. The peak group removal processing used in the present invention is any processing that evaluates the validity of a detected peak group and removes unnecessary peak groups; the calculation described above is an example, and the processing is not limited to formula (3).
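The spacing test can be sketched as follows, reading ε as max(d) / min(d) over the group's m/z values plus one neighbor on each side, as the prose describes. The function names and sample values are hypothetical; the default threshold of 2.9 is the value reported for the working example later in this document, not a general recommendation.

```python
from typing import Sequence


def spacing_ratio(mz_with_neighbors: Sequence[float]) -> float:
    """Return epsilon = max(d) / min(d) for adjacent m/z intervals d.

    mz_with_neighbors is assumed to be sorted in ascending order and to hold
    the neighbor before the group, the group's peaks, and the neighbor after.
    """
    d = [b - a for a, b in zip(mz_with_neighbors, mz_with_neighbors[1:])]
    if not d or min(d) <= 0:
        raise ValueError("need at least two strictly increasing m/z values")
    return max(d) / min(d)


def should_remove(mz_with_neighbors: Sequence[float], epsilon_max: float = 2.9) -> bool:
    """True when the peak spacing is irregular enough to discard the group."""
    return spacing_ratio(mz_with_neighbors) > epsilon_max


# Hypothetical groups: m/z0, m/z1..m/z4, m/z5.
regular = [100.0, 100.5, 101.0, 101.5, 102.0, 102.5]     # evenly spaced
irregular = [100.0, 100.25, 101.0, 101.5, 102.0, 104.0]  # uneven spacing
```

Here `should_remove(regular)` is false (ratio 1.0) and `should_remove(irregular)` is true (ratio 8.0).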

(Cluster processing in the cluster processing unit 57)
For the peak groups that remain after the peak group removal processing, cluster processing is performed to gather together peak groups that stand in a certain relationship. Here, the peak groups are, for example, those detected in regions A and B of FIG. 2 that remain after the subsequent peak group removal processing. The certain relationship is judged from the m/z values of the representative peaks of the peak groups. In the present invention, for example, when the m/z values of representative peaks differ by no more than 2 Da (Da: dalton, the unit in which the mass of a biomolecule is expressed), those representative peaks are clustered into one peak. Adjacent representative peaks are considered to indicate that the precursor ions of the MS/MS spectra are identical or very similar, so clustering them enables highly accurate peak extraction.

When cluster processing is performed in the present invention, the intensity of the clustered peak is calculated as the sum of the intensities of the representative peaks that were clustered, and its m/z value is calculated as the weighted average of their m/z values, with those intensities as the weights. The peaks extracted in this way can be presented as a peak list using these summed intensities and weighted-average m/z values. The cluster processing used in the present invention need only gather representative peaks together; any parameter that indicates that representative peaks are adjacent may be used, and the processing is not limited to the properties of the precursor ions.
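A minimal sketch of this cluster processing is given below. The summed intensity and intensity-weighted m/z follow the text. The grouping rule is an assumption: a representative peak joins the current cluster when it lies within `max_gap` of the previous peak, so clusters can chain. The names `cluster_peaks` and `_merge` and the sample peaks are hypothetical.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity) of a representative peak


def _merge(group: List[Peak]) -> Peak:
    """Collapse a cluster into one peak: summed intensity, weighted m/z."""
    total = sum(intensity for _, intensity in group)
    mz = sum(m * intensity for m, intensity in group) / total
    return (mz, total)


def cluster_peaks(peaks: List[Peak], max_gap: float = 2.0) -> List[Peak]:
    """Cluster representative peaks whose m/z values are close together.

    Assumes all intensities are positive. A peak starts a new cluster when it
    is more than max_gap (in Da) above the previous peak's m/z.
    """
    clustered: List[Peak] = []
    current: List[Peak] = []
    for peak in sorted(peaks):
        if current and peak[0] - current[-1][0] > max_gap:
            clustered.append(_merge(current))
            current = []
        current.append(peak)
    if current:
        clustered.append(_merge(current))
    return clustered


# Hypothetical representative peaks; the first two are 1 Da apart.
peak_list = cluster_peaks([(500.0, 10.0), (501.0, 30.0), (510.0, 5.0)])
```

The resulting `peak_list` is the kind of peak list the text describes: one (m/z, intensity) pair per cluster.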

(Merge processing by the merge processing unit 53)
In a preferred embodiment of the peak extraction method according to the present invention, the merge processing described below is performed before the peak group detection of step S11. Each MS/MS spectrum has one data set; when a plurality of MS/MS spectra have been acquired, the merge processing combines their data sets under certain conditions before the peak group detection. The "data set" here is the MS/MS spectrum of FIG. 2 expressed with m/z and intensity as coordinates; one data set for one MS/MS spectrum is the series of data obtained by expressing that spectrum in these coordinates. Different MS/MS spectra therefore naturally have different data sets.

The certain conditions used in the present invention concern the precursor ion of each MS/MS spectrum: data sets are combined when the difference between the precursor-ion m/z values is within a certain range and the difference between the acquisition times of the data sets of the MS/MS spectra is also within a certain range. The difference between precursor-ion m/z values being within a certain range means, for example, that the difference expressed in daltons is within a certain range. The difference between acquisition times being within a certain range means, for example, that MS/MS spectra sharing a parent peak (the precursor ion) were acquired within a certain time of one another. Combining spectra with a common precursor ion through this merge processing enables more efficient peak extraction. The merge processing used in the present invention need only combine MS/MS spectra on the basis of the precursor ions from which they were generated; it is not limited to conditions based on the difference in precursor-ion m/z values or the difference in acquisition times of the data sets.
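The merge condition can be sketched as follows. Everything specific here is an assumption for illustration: the `Spectrum` class, the tolerances `mz_tol` and `rt_tol` (the text gives no numeric ranges), the comparison against the first spectrum of each merged set, and the choice to combine data sets by pooling their (m/z, intensity) points.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Spectrum:
    precursor_mz: float    # m/z of the precursor (parent) ion
    retention_time: float  # acquisition time, in minutes
    points: List[Tuple[float, float]] = field(default_factory=list)  # (m/z, intensity)


def merge_spectra(spectra: List[Spectrum], mz_tol: float = 0.5,
                  rt_tol: float = 1.0) -> List[Spectrum]:
    """Pool the data sets of spectra with close precursor m/z and acquisition time."""
    merged: List[Spectrum] = []
    for spec in sorted(spectra, key=lambda s: s.retention_time):
        for target in merged:
            same_precursor = abs(spec.precursor_mz - target.precursor_mz) <= mz_tol
            close_in_time = abs(spec.retention_time - target.retention_time) <= rt_tol
            if same_precursor and close_in_time:
                # Combine the data sets; keep the points ordered by m/z.
                target.points = sorted(target.points + spec.points)
                break
        else:
            # No compatible set found: start a new one (copy to avoid aliasing).
            merged.append(Spectrum(spec.precursor_mz, spec.retention_time, list(spec.points)))
    return merged


# Hypothetical spectra: a and b share a precursor, c does not.
a = Spectrum(688.9, 82.5, [(100.0, 5.0)])
b = Spectrum(689.0, 82.9, [(100.1, 7.0)])
c = Spectrum(720.0, 83.0, [(200.0, 1.0)])
merged = merge_spectra([a, b, c])
```

The merged data sets would then be passed to peak group detection in place of the individual spectra.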

After the merge processing, performing steps S11 to S14 described above achieves more accurate peak extraction.

Examples

The present invention is described specifically below by way of examples, which do not limit the scope of the invention in any way.

1. Preparation of cell lysate
RPMI-1640 medium (Gibco BRL, Grand Island, NY) containing ¹³C₆-leucine (Cambridge Isotope Laboratories, Andover, MA) was prepared according to the SILAC protocol of Ong et al. (S.E. Ong, B. Blagoev, I. Kratchmarova, D.B. Kristensen, H. Steen, A. Pandey, M. Mann, Mol Cell Proteomics 1 (2002) 376.). Mouse neuroblastoma neuro2a cells were cultured in this medium for ¹³C₆-leucine labeling. Phosphate buffer containing a protease inhibitor cocktail (Roche Diagnostics, Basel, Switzerland) was added to two dishes (15 cm in diameter, 80% confluent), the cells were scraped off, and the proteins were extracted by sonication.

2. Preparation of peptide mixture for LCMSMS
The proteins from the cells were dried and resuspended in 50 mM Tris-HCl buffer (pH 9.0) containing 8 M urea. The mixture was then reduced, alkylated, digested with Lys-C (Wako, Osaka, Japan), diluted 4-fold with 50 mM aqueous ammonium bicarbonate, and digested with trypsin (Promega, Madison, Wisconsin, USA). The digest was acidified with TFA, desalted, and concentrated using C18-StageTips (J. Rappsilber, Y. Ishihama, M. Mann, Anal Chem 75 (2003) 663.).

3. Nano LC-MS/MS analysis
All samples were analyzed by nano LC-MS/MS on a QSTAR Pulsar I (ABI/MDS-Sciex, Toronto, Canada) equipped with a Shimadzu LC10A gradient pump, an HTC-PAL autosampler (CTC Analytics AG, Zwingen, Switzerland), and a Valco C2 valve with 150 μm ports. To prepare an analytical column needle with a "stone-arch" frit (Y. Ishihama, J. Rappsilber, J.S. Andersen, M. Mann, J Chromatogr A 979 (2002) 233.), ReproSil C18 packing material (3 μm, Dr. Maisch, Ammerbuch, Germany) was packed into a self-pulled needle (100 μm ID, 6 μm opening, 150 mm length) using a nitrogen-pressurized column loader cell (Nikkyo Technos, Tokyo, Japan). The column needle, attached to a Valco metal connector with a magnet, was mounted in a Teflon (registered trademark)-coated column holder (Nikkyo Technos, Tokyo, Japan), which was installed on a Proxeon x-y-z nanospray interface (Odense, Denmark) that allows the spray position to be adjusted. The injection volume was 3 μL, and the flow rate after the three-way splitter was 250 nL/min. The mobile phases were (A) 0.5% acetic acid and (B) 0.5% acetic acid with 80% acetonitrile. A linear gradient was used, starting at 5% B: 5% to 10% B over the first 5 minutes, 10% to 30% B over the next 60 minutes, 30% to 100% B over the following 5 minutes, and 100% B for the final 10 minutes. A spray voltage of 2400 V was applied through the metal connector. In each MS scan (1 second), up to three peaks were selected in order of intensity, and MS/MS scans were then performed for each of these parent ions, one every 0.6 seconds. The Information Dependent Acquisition (IDA) function was set to exclude previously scanned parent ions for 3 minutes. The scan range was m/z 350 to 1400.

FIG. 9 shows one MS spectrum obtained in the example (FIG. 9(A)) and one MS/MS spectrum obtained from that MS spectrum (FIG. 9(B)). FIG. 10 shows the result of analyzing the data of one MS/MS spectrum obtained in the example by the peak extraction method according to the present invention. FIG. 10(A) shows intensity versus m/z for the peaks in the raw data obtained in the example. FIG. 10(B) shows the representative peaks calculated from the data of FIG. 10(A) by peak group detection and formulas (1) and (2). In the centroid processing according to formulas (1) and (2), a was set to 0.5 (50%). FIG. 10(C) shows the result of removing peak groups from the result of FIG. 10(B) according to formula (3). The value of ε in formula (3) was set to 2.9.

Finally, peaks were extracted from the result of FIG. 10(C) by the cluster processing of the present invention. FIG. 10(D) shows the result of cluster processing applied to the peak groups remaining in FIG. 10(C) after peak group removal: peak groups whose representative peaks differ in m/z by 2 Da or less were combined into one peak group. The intensity and m/z value of each clustered peak shown in FIG. 10(D) were calculated, respectively, as the sum of the intensities of the clustered representative peaks and as the weighted average of their m/z values with those intensities as weights.

These results show that the peak extraction method according to the present invention, and the program that causes a computer to execute it, extract the useful peaks in a mass spectrometry spectrum and thereby enable more efficient analysis of mass spectrometry spectra.

FIG. 1 is a schematic diagram of the MS spectra obtained from a sample of a protein mixture.
FIG. 2 shows an example of a mass spectrometry spectrum of a biomolecule, for example a protein, as a spectrum having a plurality of peaks to which the present invention applies.
FIG. 3 shows the hardware configuration of the apparatus 10 that executes the peak extraction method according to the present invention.
FIG. 4 is a functional block diagram of the main part of the apparatus 10 according to the present invention.
FIG. 5 is a flowchart outlining the peak extraction method according to the present invention.
FIG. 6 illustrates peak group detection in the peak extraction method according to the present invention and is an enlarged schematic view of region A of FIG. 2.
FIG. 7 shows an example of calculating a representative peak from a peak group by the centroid processing of the peak extraction method according to the present invention.
FIG. 8 illustrates the peak removal step of the peak extraction method according to the present invention.
FIG. 9 shows (A) one MS spectrum obtained in the example and (B) one MS/MS spectrum obtained from that MS spectrum. The MS/MS spectrum of FIG. 9(B) was obtained by MS/MS of the precursor-ion peak that has an m/z value of 688.9 in the MS spectrum and a retention time of 82.562 minutes.
FIG. 10 shows the result of analyzing the MS/MS spectrum data obtained in the example by the peak extraction method according to the present invention. FIG. 10(A) shows the original MS/MS spectrum data. FIG. 10(B) shows the representative peak values obtained from the original MS/MS spectrum data by the peak group detection of the peak extraction method according to the present invention. FIG. 10(C) shows the result of removing peak groups from the result of FIG. 10(B) according to formula (3). FIG. 10(D) shows the result of cluster processing applied to the peak groups remaining in FIG. 10(C) after peak group removal, in which peak groups whose representative peaks differ in m/z by 2 Da or less were combined into one peak group.

Explanation of symbols

10: apparatus for executing the peak extraction method according to the present invention, 12: CPU, 14: input device, 16: display device, 18: RAM, 20: ROM, 22: portable recording medium driver, 23: portable recording medium, 24: disk device, 26: communication control interface, 30: mass spectrometer

Claims

A method for extracting peaks in a spectrum having a plurality of peaks, comprising:
(1) a step of acquiring the spectrum;
(2) a step of detecting, when at least two consecutive peaks among the plurality of peaks each have a height equal to or greater than a predetermined ratio of the maximum peak height, the at least two peaks as one peak group; and
(3) a centroid processing step of calculating one representative peak from each peak group by weighted averaging.

The extraction method according to claim 1, wherein the spectrum includes an MS spectrum and/or an MS/MS spectrum expressed as intensity versus m/z for a biomolecule, and
the weighted averaging in step (3) is performed according to the following formula:

[Formula: image not reproduced in this text]

(where m is the mass of the biomolecule and z is its charge; I_ij and m/z_ij are the intensity and the m/z value of the one representative peak, respectively; k is an index identifying an m/z value in the peak group and runs from i to j; y_k is the intensity at m/z_k; h₀ is the maximum peak intensity; and a is the predetermined ratio).

The extraction method according to claim 1, further comprising (4) a step of calculating, from the m/z value of each peak in the peak group and the m/z values of the peaks adjacent to the peak group on either side, the ratio of the maximum to the minimum interval between the m/z values of adjacent peaks, and removing the peak group when the ratio exceeds a predetermined value.

The extraction method according to any one of claims 1 to 3, further comprising (5) a step of performing cluster processing so that a plurality of representative peaks, calculated for the respective peak groups, whose m/z values lie within a specific value of one another are treated as one peak.

The extraction method according to claim 4, wherein the specific value is determined on the basis of the molecular weight of the biomolecule, the intensity of the one peak after cluster processing is calculated as the sum of the intensities of the representative peaks subjected to the cluster processing, and the m/z value of the one peak is calculated as the weighted average of the m/z values of those representative peaks with their intensities as weights.

The extraction method according to claim 1, wherein each MS/MS spectrum has one data set, the method further comprising, when a plurality of MS/MS spectra are acquired in step (1) and prior to step (2):
(a) a merge processing step of combining a plurality of data sets when the difference between the m/z values, in the MS spectrum, of the precursor ions of the MS/MS spectra is within a certain range and the difference between the acquisition times of the data sets of the MS/MS spectra is within a certain range.

The extraction method according to any one of claims 1 to 6, wherein the steps (2) to (4) are performed after the step (a).

A program for executing a method for extracting peaks in a spectrum having a plurality of peaks, the program causing a computer to execute an extraction method comprising:
(1) a step of acquiring the spectrum;
(2) a step of detecting, when at least two consecutive peaks among the plurality of peaks each have a height equal to or greater than a predetermined ratio of the maximum peak height, the at least two peaks as one peak group; and
(3) a centroid processing step of calculating one representative peak from each peak group by weighted averaging.

(where m is the mass of the biomolecule and z is its charge; I_ij and m/z_ij are the intensity and the m/z value of the one representative peak, respectively; k is an index identifying an m/z value in the peak group and runs from i to j; y_k is the intensity at m/z_k; h₀ is the maximum peak intensity; and a is the predetermined ratio)
The program according to claim 8.

The program according to claim 8 or 9, further causing the computer to execute (4) a step of calculating, from the m/z value of each peak in the peak group and the m/z values of the peaks adjacent to the peak group on either side, the ratio of the maximum to the minimum interval between the m/z values of adjacent peaks, and removing the peak group when the ratio exceeds a predetermined value.

The program according to any one of claims 8 to 10, further causing the computer to execute (5) a cluster processing step in which a plurality of representative peaks, calculated for the respective peak groups, whose m/z values lie within a specific value of one another are treated as one peak.

The program according to claim 11, wherein the specific value is determined on the basis of the molecular weight of the biomolecule, the intensity of the one peak after cluster processing is calculated as the sum of the intensities of the representative peaks subjected to the cluster processing, and the m/z value of the one peak is calculated as the weighted average of the m/z values of those representative peaks with their intensities as weights.

The program according to any one of claims 8 to 12, wherein each MS/MS spectrum has one data set, the program further causing the computer to execute, when a plurality of MS/MS spectra are acquired in step (1) and prior to step (2):
(a) a merge processing step of combining a plurality of data sets when the difference between the m/z values, in the MS spectrum, of the precursor ions of the MS/MS spectra is within a certain range and the difference between the acquisition times of the data sets of the MS/MS spectra is within a certain range.

The program according to any one of claims 8 to 13, which causes a computer to execute the steps (2) to (4) after the step (a).