CN102072932A - Method and device for identifying glycopeptide segment - Google Patents

Publication number
CN102072932A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN200910199086.4A
Other languages
Chinese (zh)
Other versions
CN102072932B (en)
Inventor
贺福初
杨芃原
张扬
沈诚频
刘铭琪
陈瑶函
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fudan University
Original Assignee
Fudan University
Application filed by Fudan University
Priority to CN200910199086.4A
Publication of CN102072932A
Application granted
Publication of CN102072932B
Legal status: Expired - Fee Related

Landscapes

  • Other Investigation Or Analysis Of Materials By Electrical Means (AREA)

Abstract

The invention discloses a method for identifying a glycopeptide segment, which comprises the following steps: filtering a mass spectrogram; acquiring a single-charge mass number; constructing a mass number network; and retrieving a theory library. The invention also discloses a device for identifying the glycopeptide segment, which comprises a mass spectrogram filtering unit, a single-charge mass number acquisition unit, a mass number network construction unit and a theory library retrieving unit. Compared with the conventional method for identifying the glycopeptide segment, the identification method and the identification device can efficiently acquire more accurate identification results in which false positive results are obviously reduced.

Description

A method and device for identifying glycopeptides
Technical field
The present invention relates to a method and a device for identifying glycopeptides.
Background art
With the continuing development of high-throughput, high-resolution biological mass spectrometry (including matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS) and electrospray ionization mass spectrometry (ESI-MS)), proteomics research based on biological mass spectrometry has also advanced markedly. At present, proteomics research concentrates on three main subfields of protein identification: qualitative identification, quantitative identification, and identification of protein modifications. Of these, the identification of protein modifications is a key factor that distinguishes proteomics from transcriptomics, and it has therefore become one of the most active frontiers of protein science.
The identification of protein modifications has by now produced relatively mature techniques for modifications such as phosphorylation, methylation, acetylation and ubiquitination. For protein glycosylation, however, the characteristic nonlinear "antenna structure" of the sugar chains makes it very difficult to interpret directly the mass spectra obtained by existing biological mass spectrometry methods. At present, the great majority of glycosylation identifications in this field are made by experienced mass spectrometrists who annotate the spectra manually, which is extremely time-consuming and laborious. Researchers in protein science have therefore turned their attention to glycopeptide identification methods and have already developed several, such as GlycoMod, GlycoMiner, GlycoPep DB, STAT and Stroligo. These methods, however, return too many predicted glycopeptides, leaving the researcher no basis for choosing among them, so an accurate identification result is hard to obtain.
Therefore, a glycopeptide identification method of higher accuracy is urgently needed.
Summary of the invention
The technical problem to be solved by the present invention is to overcome the defects of existing glycopeptide identification methods, namely too many candidate results and low accuracy, by providing a glycopeptide identification method and device of higher accuracy.
The identification method of the present invention comprises the following steps (as shown in Figure 1):
(1) Mass spectrum filtering: from the tandem mass spectrum of the glycopeptide sample, select the peaks whose signal-to-noise ratio (S/N) is high enough for them to be meaningful for interpretation;
(2) Acquiring single-charge mass numbers: if the mass spectrum used for identification is a MALDI spectrum, the mass number of each peak selected in step (1) is taken as its single-charge mass number; if it is an ESI spectrum, the mass numbers of the peaks selected in step (1) are deconvoluted to obtain their single-charge mass numbers;
(3) Constructing the mass number network: the single-charge mass numbers obtained in step (2) are paired two by two and subtracted to obtain mass differences; then, from each pair of single-charge mass numbers and its mass difference, the Biograph module of the Matlab software is used to construct a mass number network comprising mutually independent network modules;
(4) Theoretical library retrieval: take each single-charge mass number on any downstream mass number path that starts from any node in the relational tree of a network module, as displayed by the Biograph module of the Matlab software, and search the theoretical library for glycopeptides that agree with the single-charge mass number of each node on the path within the acceptable mass number tolerance and that satisfy criteria a and b below; the glycopeptide retrieved for the most downstream node of the path is then a predicted glycopeptide. Criterion a: the retrieved glycopeptides contain the same peptide. Criterion b: on the retrieved mass number path, for any node compared with its upstream node, the number of every monosaccharide unit in the retrieved glycopeptide is equal or greater, and the number of at least one monosaccharide unit is greater. Here, the theoretical library is the set of mass numbers corresponding to the glycopeptides formed by every pairwise combination of a peptide in the peptide theoretical library with a sugar chain in the sugar chain theoretical library.
The identification method of the present invention is described in further detail below.
(1) Mass spectrum filtering: from the tandem mass spectrum of the glycopeptide sample, select the peaks whose signal-to-noise ratio (S/N) is high enough for them to be meaningful for interpretation.
Here, the tandem mass spectrum of the glycopeptide sample is the mass spectrum obtained by subjecting the glycopeptide sample to tandem mass spectrometry. The glycopeptide sample may be any glycopeptide-containing sample obtained by the methods available in the art. For example, a glycoprotein sample is digested enzymatically by a conventional method and then enriched by an existing method to give the glycopeptide sample.
Preferably, the tandem mass spectra of the glycopeptide sample are those that remain after the spectra obtained by tandem mass spectrometry have been pre-screened and the spectra of non-glycosylated peptides discarded. Pre-screening can be carried out by existing methods in the art. On the basis of extensive experiments, the present invention especially prefers spectra that meet the following criterion: the spectrum contains one or more of the peaks at the three mass numbers 366, 528 and 690, and that peak ranks within the top 20, preferably the top 15, by signal-to-noise ratio. The tolerance is generally within 2 daltons, preferably 1 dalton. Pre-screening the spectra reduces unnecessary work and improves identification efficiency.
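The pre-screening criterion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a spectrum is represented as a list of (mass number, S/N) pairs, and the function name `is_glycopeptide_spectrum` and its parameters are hypothetical, not part of the patent.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (mass number, signal-to-noise ratio); assumed representation

# Marker mass numbers named in the text.
MARKER_MASSES = (366.0, 528.0, 690.0)


def is_glycopeptide_spectrum(peaks: List[Peak], top_n: int = 20, tol: float = 1.0) -> bool:
    """Return True if any marker mass number appears among the top_n peaks by S/N."""
    top = sorted(peaks, key=lambda p: p[1], reverse=True)[:top_n]
    return any(abs(mass - marker) <= tol for mass, _ in top for marker in MARKER_MASSES)
```

With `top_n=15` and `tol=1.0` the function applies the preferred settings given in the text.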
The tandem mass spectrometry may be carried out on any existing model of tandem mass spectrometer with a MALDI or ESI ion source, such as ESI-LTQ-ORBITRAP or MALDI-QIT-TOF; the specific operating conditions follow conventional or existing methods in the art.
The peaks whose signal-to-noise ratio is high enough to be meaningful for interpretation can be selected according to common knowledge of mass spectrometric analysis in the art; generally the 30 to 70 peaks with the highest signal-to-noise ratio are selected. Through extensive experiments, the inventors found that most glycopeptide-related mass numbers lie among the 50 peaks with the highest signal-to-noise ratio in the spectrum; the present invention therefore especially prefers to select the 50 peaks with the highest signal-to-noise ratio. Filtering the spectrum removes much of the noise caused by the instrument, isotopic peaks and similar factors, which favours accurate interpretation of the spectral information in the subsequent steps.
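A minimal sketch of this filtering step, assuming each peak is a (mass number, S/N) pair; the name `select_top_peaks` is hypothetical.

```python
from typing import List, Tuple


def select_top_peaks(peaks: List[Tuple[float, float]], n: int = 50) -> List[Tuple[float, float]]:
    """Keep the n peaks with the highest signal-to-noise ratio (50 is the preferred value in the text)."""
    return sorted(peaks, key=lambda p: p[1], reverse=True)[:n]
```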
(2) Acquiring single-charge mass numbers
If the mass spectrum used for identification is a MALDI spectrum, the mass number of each peak selected in step (1) is taken as its single-charge mass number; if it is an ESI spectrum, the mass numbers of the peaks selected in step (1) are deconvoluted to obtain their single-charge mass numbers.
The deconvolution is carried out by a conventional method, that is, multiply charged mass numbers are converted into single-charge mass numbers, with the constraint that no single-charge mass number after deconvolution may exceed the mass number of the parent ion. The formula is $M_y = n \times M_x - n + 1$; when the $M_y$ calculated by this formula is greater than $M_p$, take $M_y = M_p$. Here $n \in \{1, 2, \ldots, C_p\}$, $M_x$ is the original mass number of the peak, $M_y$ is the single-charge mass number after deconvolution, $C_p$ is the charge number of the parent ion, and $M_p$ is the mass number of the parent ion.
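The deconvolution formula can be written out as a short Python function. This is a sketch that applies the formula literally for every charge n from 1 to the parent ion charge and caps each result at the parent ion mass number; the function name and the choice to return all candidates in a list are assumptions made for illustration.

```python
from typing import List


def single_charge_candidates(m_x: float, parent_charge: int, parent_mass: float) -> List[float]:
    """Apply M_y = n * M_x - n + 1 for n = 1..C_p, capping each M_y at M_p."""
    candidates = []
    for n in range(1, parent_charge + 1):
        m_y = n * m_x - n + 1
        candidates.append(min(m_y, parent_mass))  # M_y may not exceed the parent ion mass number
    return candidates
```

For a peak at 500.5 from a triply charged parent ion of mass number 1200, the three candidates are 500.5, 1000.0 and 1200.0 (the last one capped from 1499.5).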
(3) Constructing the mass number network
The single-charge mass numbers obtained in step (2) are paired two by two and subtracted to obtain mass differences; then, from each pair of single-charge mass numbers and its mass difference, the Biograph module of the Matlab software is used to construct a mass number network comprising mutually independent network modules.
Here, pairing the single-charge mass numbers obtained in step (2) two by two and subtracting them means subtracting every possible pair drawn from all the single-charge mass numbers obtained in step (2). For example, if the number of single-charge mass numbers is X, then $X(X-1)/2$ mass differences are obtained.
If the glycopeptide sample to be identified is N-linked and the peaks come from a MALDI-QIT (quadrupole ion trap) mass spectrometer, the following are preferably selected, within the acceptable mass difference tolerance, for the subsequent construction of the mass number network: first, the group of single-charge mass numbers with the highest signal-to-noise ratio whose mass numbers differ successively by 83 and 120 daltons (that is, two pairs of single-charge mass numbers that share one single-charge mass number, three mass numbers in all, differing successively by 83 and 120 daltons); and second, all paired single-charge mass numbers whose mass difference is one or more of 132, 146, 162, 203, 291, 324, 365, 589 and 689. The group with the highest signal-to-noise ratio differing successively by 83 and 120 daltons is singled out because, in N-linked glycopeptides, the first monosaccharide unit by which the sugar chain is attached to the peptide, N-acetylgalactosamine (GalNAc), readily undergoes cross-ring cleavage in the MALDI-QIT mass spectrometer, giving three consecutive fragment ions of relatively high intensity, [Pep+H]+, [Pep+84]+ and [Pep+204]+, whose mass numbers differ from one another by 83 and 120 daltons. Among the single-charge mass numbers obtained from the tandem mass spectrometry data of any N-linked glycopeptide, many combinations of three consecutive mass numbers differ by 83 and 120 daltons, but the group with the highest relative intensity is almost always the group of fragment peaks described above, which are usually called the characteristic peaks. The first fragment peak of the characteristic peaks generally reflects the molecular weight of the peptide in the glycopeptide. For data from a QIT mass spectrometer, 83 and 120 are therefore regarded as the most important mass differences, serving as markers. The paired single-charge mass numbers whose mass difference is one or more of 132, 146, 162, 203, 291, 324, 365, 589 and 689 are singled out because these are the common mass differences between fragment ion peaks formed by cleavage between monosaccharide units in the sugar chain (that is, the mass numbers of monosaccharide units).
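The search for the characteristic peaks can be illustrated with the following Python sketch. It assumes peaks are (single-charge mass number, S/N) pairs and, as a simplification not stated in the text, ranks candidate triplets by the sum of their S/N values; the function name is hypothetical.

```python
from typing import List, Optional, Tuple


def find_characteristic_triplet(
    peaks: List[Tuple[float, float]], tol: float = 1.0
) -> Optional[Tuple[float, float, float]]:
    """Find three mass numbers differing successively by 83 and 120 Da.

    Among all such triplets, return the one with the largest summed S/N
    (an assumed stand-in for "highest signal-to-noise ratio").
    """
    best_score = None
    best_triplet = None
    for m0, s0 in peaks:
        for m1, s1 in peaks:
            if abs((m1 - m0) - 83.0) > tol:
                continue
            for m2, s2 in peaks:
                if abs((m2 - m1) - 120.0) > tol:
                    continue
                score = s0 + s1 + s2
                if best_score is None or score > best_score:
                    best_score, best_triplet = score, (m0, m1, m2)
    return best_triplet
```

The first element of the returned triplet corresponds to the [Pep+H]+ peak, which according to the text reflects the molecular weight of the peptide.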
If the glycopeptide sample to be identified is N-linked and the peaks come from an ESI-LTQ-ORBITRAP mass spectrometer, then, within the acceptable mass difference tolerance, all paired single-charge mass numbers whose mass difference is one or more of 132, 146, 162, 203, 291, 324, 365, 589 and 689 are preferably selected for the subsequent construction of the mass number network.
The acceptable mass difference tolerance is generally 0.5 to 2 daltons, preferably 1.0 to 1.5 daltons.
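The pairing and selection by mass difference can be sketched as follows. The list of mass differences and the tolerance come from the text; the function name and the tuple layout of the result are assumptions for illustration.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

# Mass differences listed in the text for cleavage between monosaccharide units.
SUGAR_DIFFS = (132, 146, 162, 203, 291, 324, 365, 589, 689)


def matching_pairs(
    masses: Sequence[float], allowed: Sequence[float] = SUGAR_DIFFS, tol: float = 1.0
) -> List[Tuple[float, float, float]]:
    """Return (lower, higher, matched difference) for every pair whose
    mass difference agrees with an allowed value within tol daltons."""
    pairs = []
    for lower, higher in combinations(sorted(masses), 2):
        diff = higher - lower
        for d in allowed:
            if abs(diff - d) <= tol:
                pairs.append((lower, higher, d))
                break
    return pairs
```

Each returned tuple is one directed edge, from the lower to the higher mass number, for the network built in the next step.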
From the paired single-charge mass numbers and their mass differences, the Biograph module of the existing Matlab software can be used to construct a mass number network comprising mutually independent network modules. In the network modules produced by the Biograph module, the single-charge mass numbers are all arranged in ascending order, so the whole network is a directed graph. The network modules can be understood as different interpretations of the same set of mass spectrometric data. The more single-charge mass numbers a network module contains (especially if it contains the single-charge mass numbers of the characteristic peaks, for data from a MALDI-QIT mass spectrometer), the higher the probability that it is correct. Preferably, the network module containing the most single-charge mass numbers is processed first. If no predicted glycopeptide meeting the retrieval criteria of step (4) is obtained, the network module containing the next largest number of single-charge mass numbers is processed, and so on until a predicted glycopeptide meeting the retrieval criteria of step (4) is obtained. After this, other network modules may also be analysed to obtain a more complete set of predicted glycopeptides.
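The text builds the network with the Biograph module of Matlab. As an illustration only, the grouping of paired mass numbers into independent network modules can be reproduced in plain Python by finding connected components; this swaps in a generic graph traversal and is not the Biograph implementation. Edge tuples are assumed to be (lower mass number, higher mass number, mass difference).

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def network_modules(edges: Iterable[Tuple[float, float, float]]) -> List[List[float]]:
    """Group mass numbers into independent modules (connected components),
    returned largest first so the most promising module is tried first."""
    neighbours: Dict[float, Set[float]] = defaultdict(set)
    for lower, higher, _ in edges:
        neighbours[lower].add(higher)
        neighbours[higher].add(lower)

    seen: Set[float] = set()
    modules: List[List[float]] = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        component: Set[float] = set()
        stack = [start]
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        modules.append(sorted(component))  # ascending order, as in the text

    modules.sort(key=len, reverse=True)
    return modules
```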
(4) Theoretical library retrieval
The Biograph module of the Matlab software presents the mass number paths of each network module in the network obtained in step (3) in the form of a "relational tree". In the relational tree, the single-charge mass numbers increase from root to crown, and the single-charge mass number of each node may lead to the downstream crown along several mass number paths. Each mass number path can be understood as a different fragmentation route of one and the same glycopeptide.
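Enumerating the downstream mass number paths from a chosen node can be sketched as below. This is a plain Python stand-in for reading paths off the relational tree; edges are assumed to be (lower mass number, higher mass number, mass difference) tuples, and the function name is hypothetical. Because every edge points from a lower to a higher mass number, the graph has no cycles and the recursion terminates.

```python
from typing import Dict, Iterable, List, Tuple


def downstream_paths(edges: Iterable[Tuple[float, float, float]], start: float) -> List[List[float]]:
    """List every path from start down to a node with no further downstream node."""
    children: Dict[float, List[float]] = {}
    for lower, higher, _ in edges:
        children.setdefault(lower, []).append(higher)

    paths: List[List[float]] = []

    def walk(node: float, path: List[float]) -> None:
        next_nodes = sorted(children.get(node, []))
        if not next_nodes:  # reached the crown of the tree
            paths.append(path)
            return
        for child in next_nodes:
            walk(child, path + [child])

    walk(start, [start])
    return paths
```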
Take each single-charge mass number on any downstream mass number path that starts from any node in the relational tree of the network module, as displayed by the Biograph module of the Matlab software, and search the theoretical library for glycopeptides that agree with the single-charge mass number of each node on the path within the acceptable mass number tolerance and that satisfy criteria a and b below; the glycopeptide retrieved for the most downstream node of the path is then a predicted glycopeptide. Criterion a: the retrieved glycopeptides contain the same peptide. Criterion b: on the retrieved mass number path, for any node compared with its upstream node, the number of every monosaccharide unit in the retrieved glycopeptide is equal or greater, and the number of at least one monosaccharide unit is greater. Because this retrieval matches the mass number path as a whole, it eliminates a large number of false candidates and markedly improves the accuracy of retrieval.
The acceptable mass number tolerance is generally within 2.0 daltons; the smaller the tolerance, the higher the confidence. A tolerance of 1.5 daltons is commonly used.
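The whole-path matching with criteria a and b can be sketched as a small backtracking search. Everything named here (`Entry`, `grows`, `match_path`, the tuple of monosaccharide counts) is a hypothetical representation chosen for illustration; the patent does not prescribe a data structure.

```python
from typing import List, NamedTuple, Optional, Sequence, Tuple


class Entry(NamedTuple):
    peptide: str              # peptide sequence or identifier
    sugars: Tuple[int, ...]   # count of each monosaccharide type
    mass: float               # mass number of the glycopeptide


def grows(upstream: Tuple[int, ...], downstream: Tuple[int, ...]) -> bool:
    """Criterion b: no count decreases and at least one count increases."""
    pairs = list(zip(upstream, downstream))
    return all(d >= u for u, d in pairs) and any(d > u for u, d in pairs)


def match_path(path: Sequence[float], library: Sequence[Entry], tol: float = 1.5) -> Optional[List[Entry]]:
    """Assign one library entry to every node of the path so that criteria a and b hold.

    Returns the assignment (its last element is the predicted glycopeptide)
    or None if no consistent assignment exists.
    """
    candidates = [[e for e in library if abs(e.mass - m) <= tol] for m in path]

    def extend(i: int, chosen: List[Entry]) -> Optional[List[Entry]]:
        if i == len(path):
            return chosen
        for entry in candidates[i]:
            if chosen:
                prev = chosen[-1]
                # Criterion a (same peptide) and criterion b (sugar chain grows).
                if entry.peptide != prev.peptide or not grows(prev.sugars, entry.sugars):
                    continue
            found = extend(i + 1, chosen + [entry])
            if found is not None:
                return found
        return None

    return extend(0, [])
```

A candidate that fits a single node by mass alone is rejected unless the rest of the path can also be explained with the same peptide and a growing sugar chain, which is how whole-path matching removes false candidates.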
Preferably, for mass spectrometers that give a high-precision parent ion mass number (error within 10 to 20 ppm, such as the LTQ-Orbitrap), in addition to selecting the mass number path described above, the parent ion mass number is added as the most downstream node of the path before the theoretical library retrieval is carried out. The library is searched for glycopeptides that agree with the parent ion mass number within the acceptable parent ion mass number tolerance. If the glycopeptides retrieved for every node of the path that includes the parent ion mass number satisfy criteria a and b above, the glycopeptide retrieved for the most downstream node (that is, the parent ion mass number) is a predicted glycopeptide. The parent ion mass number tolerance is generally 10 to 20 ppm. Including the parent ion mass number in the theoretical library retrieval further safeguards the accuracy of the prediction.
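A tolerance expressed in ppm is relative to the mass number being compared, unlike the fixed dalton tolerances used for fragment peaks. A minimal sketch of such a check follows; the function name is hypothetical.

```python
def within_ppm(observed: float, theoretical: float, ppm: float = 20.0) -> bool:
    """True if the relative deviation from the theoretical mass number is at most ppm parts per million."""
    return abs(observed - theoretical) / theoretical * 1e6 <= ppm
```

At a mass number of 2000, a tolerance of 20 ppm corresponds to 0.04 daltons, far tighter than the 1.5 dalton tolerance commonly used for the fragment peaks.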
The theoretical library retrieval of step (4) can be carried out on each single-charge mass number of any downstream mass number path that starts from any single-charge mass number in the relational tree. If no predicted glycopeptide meeting the retrieval criteria is obtained, another mass number path starting from the same single-charge mass number, or a mass number path starting from another single-charge mass number in the relational tree, can be selected and retrieved, until a predicted glycopeptide meeting the retrieval criteria is obtained. After a predicted glycopeptide meeting the retrieval criteria has been obtained from one mass number path of one single-charge mass number, retrieval may also continue on other mass number paths starting from the same single-charge mass number, or on mass number paths starting from other single-charge mass numbers in the relational tree, to obtain a more complete set of predicted glycopeptides. Preferably, the mass number paths starting from the most upstream single-charge mass number in the relational tree are retrieved first; if this is unsuccessful, the mass number paths starting from the single-charge mass numbers one layer downstream are selected, and so on until a predicted glycopeptide meeting the retrieval criteria is obtained. Alternatively, after this, the mass number paths of the single-charge mass numbers not yet retrieved are also retrieved until the whole relational tree has been covered, to obtain a comprehensive set of predicted glycopeptides.
In the present invention, the theoretical library is the set of mass numbers corresponding to the glycopeptides formed by every pairwise combination of a peptide in the peptide theoretical library with a sugar chain in the sugar chain theoretical library.
In the present invention, the peptide theoretical library is the set of mass numbers corresponding to the proteolytic peptides predicted theoretically from the following parameters: the protease used for proteolysis, the possible number of missed cleavages, the possible range of peptide lengths, the possible secondary fragmentation of the peptides, the possible fixed modifications, the possible variable modifications, and whether or not sequences known to undergo glycosylation are required. These parameters can be chosen according to existing knowledge in the art. Usually, the protease used for proteolysis is trypsin; the possible number of missed cleavages is generally 0 to 2, preferably 1; the possible peptide length is generally 7 to 60 amino acids, preferably 7 to 25; the possible secondary fragmentation site of the peptides is generally aspartic acid (when the mass spectra come from MALDI-QIT); the possible fixed and variable modifications are selected from none or carbamidomethylation of cysteine; and the sequences known to undergo glycosylation that may be required generally include NXS/T (N-linked sugar) and/or S/T (O-linked sugar). Depending on how much is known about the glycoprotein sample from earlier experiments, a narrower range can be chosen within the above ranges according to the actual situation, to improve identification efficiency.
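The peptide side of the library can be illustrated with a small in-silico digestion. The sketch below makes several assumptions that the text does not state: trypsin is modelled by the commonly used rule "cleave after K or R unless the next residue is P", the N-linked sequon NXS/T is taken with X being any residue except proline, and peptide mass calculation is left out. The function names are hypothetical.

```python
import re
from typing import List

# Assumed sequon convention: N, then any residue except P, then S or T.
N_SEQUON = re.compile(r"N[^P][ST]")


def tryptic_peptides(sequence: str, missed_cleavages: int = 1,
                     min_len: int = 7, max_len: int = 25) -> List[str]:
    """Digest a protein sequence with the assumed trypsin rule, allowing missed cleavages."""
    cuts = [0] + [m.end() for m in re.finditer(r"[KR](?!P)", sequence)]
    if cuts[-1] != len(sequence):
        cuts.append(len(sequence))  # C-terminal peptide without K/R at its end

    peptides = []
    for i in range(len(cuts) - 1):
        last = min(i + 2 + missed_cleavages, len(cuts))
        for j in range(i + 1, last):
            peptide = sequence[cuts[i]:cuts[j]]
            if min_len <= len(peptide) <= max_len:
                peptides.append(peptide)
    return peptides


def has_n_sequon(peptide: str) -> bool:
    """True if the peptide contains an N-linked glycosylation sequon."""
    return N_SEQUON.search(peptide) is not None
```

Filtering the digest with `has_n_sequon` keeps only peptides that could carry an N-linked sugar chain, which is one way to apply the "sequences known to undergo glycosylation" parameter.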
When the glycopeptide sample comes from a glycoprotein of known protein sequence, the peptide theoretical library of the present invention is the set of mass numbers corresponding to those peptides, among the peptides predicted theoretically from the above parameters for the digestion of that protein sequence, that can undergo glycosylation.
When the glycopeptide sample comes from a glycoprotein of unknown protein sequence, the peptide theoretical library of the present invention is the set of mass numbers corresponding to those peptides, among the peptides predicted theoretically from the above parameters for the digestion of all proteins in the protein database of the species to which the glycoprotein sample belongs, that can undergo glycosylation.
To narrow the scope of the peptide theoretical library and obtain identification results efficiently and with high accuracy, the present invention especially prefers a peptide theoretical library obtained by the following method (see Figure 2): the glycopeptide sample to be identified is treated enzymatically to cut off the sugar chains, giving the peptides of the glycopeptides; tandem mass spectrometry is then carried out, and the resulting spectra are searched against a protein sequence database, such as IPI Human v3.50, to obtain the sequence information of the peptides in the glycopeptides and the corresponding set of mass numbers, which serves as the peptide theoretical library of the present invention.
The step of enzymatically cutting the sugar chains from the glycopeptide sample can be carried out by existing enzymatic methods in the art. For example, N-linked glycopeptides can be treated with PNGase F to cut off the sugar chains.
The tandem mass spectrometry may be MALDI or ESI tandem mass spectrometry on any existing model of instrument, such as LTQ-ORBITRAP; the specific operating conditions follow conventional or existing methods in the art.
The protein sequence database search is carried out with conventional or existing search methods and parameters in the art. For example, the software SEQUEST is used to search version 3.50 of the human IPI database (IPI Human v3.50), with the search parameters generally set as follows: tolerance 1 to 3 daltons; for doubly charged ions, Xcorr greater than 2.5 and DeltaCN greater than 0.08; for triply charged ions, Xcorr greater than 3.0 and DeltaCN greater than 0.08.
In the present invention, the sugar chain theoretical library is the set of mass numbers corresponding to the sugar chains of all possible glycoproteins, obtained by exhaustively combining the monosaccharide units found in organisms. Organisms contain nine common monosaccharides: Glc (mass number 162.05), Man (mass number 162.05), Gal (mass number 162.05), GlcNAc (mass number 203.08), GalNAc (mass number 203.08), NeuAc (mass number 291.83), NeuGc (mass number 291.83), Xyl (mass number 132.04) and Fuc (mass number 146.06). Glc, Man and Gal are collectively called Hexose; GlcNAc and GalNAc are collectively called HexNAc. These nine monosaccharides correspond to 5 different mass numbers (162.05, 203.08, 291.83, 132.04 and 146.06). The mass number of a sugar chain built from these nine monosaccharides is therefore $162.05 \times n_1 + 203.08 \times n_2 + 291.83 \times n_3 + 132.04 \times n_4 + 146.06 \times n_5$, where $n_1$, $n_2$, $n_3$, $n_4$ and $n_5$ are independently selected from the set $\{1, 2, \ldots, n\}$; $n$ can be up to 10 and is generally 5 to 6. Since the number of monosaccharide units in the sugar chain of a glycoprotein in an organism does not exceed 15, this set can be regarded as covering the mass numbers of all possible sugar chain forms in glycoproteins in organisms. Depending on how much is known about the glycoprotein sample from earlier experiments, the ranges from which $n_1$, $n_2$, $n_3$, $n_4$ and $n_5$ are selected can be narrowed according to the actual situation, to improve identification efficiency.
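The enumeration of sugar chain mass numbers can be sketched as follows, using the five mass numbers exactly as given in the text. Two points are assumptions of this sketch rather than statements of the patent: the lower bound of each count is exposed as a parameter `min_per_type` (default 0, so that a sugar chain may lack a given monosaccharide type; set it to 1 to follow the set as written), and the cap of 15 units in total is applied as a filter.

```python
from itertools import product
from typing import List, Tuple

# Mass numbers of the five monosaccharide classes, as listed in the text.
MONO_MASSES = {
    "Hex": 162.05,
    "HexNAc": 203.08,
    "NeuAc/NeuGc": 291.83,
    "Xyl": 132.04,
    "Fuc": 146.06,
}


def sugar_chain_library(max_per_type: int = 5, max_total: int = 15,
                        min_per_type: int = 0) -> List[Tuple[Tuple[int, ...], float]]:
    """Enumerate (counts, mass number) for every allowed monosaccharide composition."""
    names = list(MONO_MASSES)
    library = []
    for counts in product(range(min_per_type, max_per_type + 1), repeat=len(names)):
        total = sum(counts)
        if total == 0 or total > max_total:
            continue  # skip the empty chain and chains longer than the stated limit
        mass = sum(c * MONO_MASSES[name] for c, name in zip(counts, names))
        library.append((counts, round(mass, 2)))
    return library
```

The counts in each tuple follow the key order of `MONO_MASSES`, so `(1, 1, 0, 0, 0)` is one Hex plus one HexNAc with mass number 365.13.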
The theoretical library of the present invention is the set of mass numbers corresponding to the glycopeptides formed by every pairwise combination of a peptide in the above peptide theoretical library with a sugar chain in the sugar chain theoretical library. When the peptide theoretical library contains the mass numbers of p peptides and the sugar chain theoretical library contains the mass numbers of g sugar chains, the theoretical library contains the mass numbers of p × g glycopeptides.
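The p × g combination can be shown directly. In this sketch a glycopeptide mass number is taken as the simple sum of the peptide mass number and the sugar chain mass number, which is an assumption consistent with the residue-style mass numbers used in the text; the function name and the example values are made up for illustration.

```python
from typing import List, Sequence, Tuple

Peptide = Tuple[str, float]                  # (sequence, mass number)
SugarChain = Tuple[Tuple[int, ...], float]   # (monosaccharide counts, mass number)


def glycopeptide_library(peptides: Sequence[Peptide],
                         sugar_chains: Sequence[SugarChain]) -> List[Tuple[str, Tuple[int, ...], float]]:
    """Combine every peptide with every sugar chain: p peptides and g chains give p * g entries."""
    return [
        (sequence, counts, round(pep_mass + chain_mass, 2))
        for sequence, pep_mass in peptides
        for counts, chain_mass in sugar_chains
    ]
```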
The invention further relates to a device for identifying glycopeptides (see Figure 3), which comprises:
A mass spectrum filtering unit: this unit selects, from the tandem mass spectrum of the glycopeptide sample, the peaks whose signal-to-noise ratio (S/N) is high enough for them to be meaningful for interpretation;
A single-charge mass number acquisition unit: this unit calls the peaks selected by the mass spectrum filtering unit; if the peaks come from a MALDI spectrum, the mass number of each peak is taken as its single-charge mass number; if the mass spectrum used for identification is an ESI spectrum, the mass numbers of the peaks are deconvoluted to obtain their single-charge mass numbers;
A mass number network construction unit: this unit calls the single-charge mass numbers obtained by the single-charge mass number acquisition unit, pairs them two by two and subtracts them to obtain mass differences, and then, from each pair of single-charge mass numbers and its mass difference, uses the Biograph module of the Matlab software to construct a mass number network comprising mutually independent network modules;
A theoretical library retrieval unit: this unit calls the network modules obtained by the mass number network construction unit, takes each single-charge mass number on any downstream mass number path that starts from any node in the relational tree of a network module, as displayed by the Biograph module of the Matlab software, and searches the theoretical library for glycopeptides that agree with the single-charge mass number of each node on the path within the acceptable mass number tolerance and that satisfy criteria a and b below; the glycopeptide retrieved for the most downstream node of the path is then a predicted glycopeptide. Criterion a: the retrieved glycopeptides contain the same peptide. Criterion b: on the retrieved mass number path, for any node compared with its upstream node, the number of every monosaccharide unit in the retrieved glycopeptide is equal or greater, and the number of at least one monosaccharide unit is greater. Here, the theoretical library is the set of mass numbers corresponding to the glycopeptides formed by every pairwise combination of a peptide in the peptide theoretical library with a sugar chain in the sugar chain theoretical library.
In the device of the present invention, the specific and preferred forms of each technical feature are as described above.
The instruments, software, reagents and raw materials used in the present invention are all commercially available.
The positive and progressive effect of the present invention is that, compared with existing glycopeptide identification methods, the identification method of the present invention efficiently yields identification results of higher accuracy with markedly fewer false positives.
Description of the drawings
Fig. 1 is a flow chart of the method of the present invention for identifying glycosylated peptides.
Fig. 2 is a flow chart of a preferred form of the method of the present invention for identifying glycosylated peptides.
Fig. 3 is a schematic diagram of the device of the present invention for identifying glycosylated peptides.
Fig. 4 is a schematic diagram of the relationship tree of network module 1 in Embodiment 2. In each node of the relationship tree, the value in parentheses is the mass number of the original peak, the value outside the parentheses is the singly charged mass number after deconvolution, and the values between nodes are mass differences.
Fig. 5 is a schematic diagram of the relationship tree of network module 1 in Embodiment 3. In each node of the relationship tree, the value in parentheses is the mass number of the original peak, the value outside the parentheses is the singly charged mass number after deconvolution, and the values between nodes are mass differences.
Embodiments
The present invention is further illustrated by the following embodiments, which do not limit the invention.
Embodiment 1: Device for identifying glycosylated peptides
With reference to Fig. 3, the device comprises:
(1) a spectrum filtering unit, which selects, from the tandem mass spectrum of a glycosylated peptide sample, the peaks whose signal-to-noise ratio (S/N) is high enough to be meaningful for analysis;
(2) a singly charged mass number acquisition unit, which takes the peaks selected by the spectrum filtering unit; if the peaks come from a MALDI mass spectrum, the mass number of each peak is taken as its singly charged mass number; if the mass spectrum used for identification is an ESI mass spectrum, the peak mass numbers are deconvoluted to obtain their singly charged mass numbers;
(3) a mass number network construction unit, which takes the singly charged mass numbers obtained by the singly charged mass number acquisition unit, pairs them and subtracts each pair to obtain mass differences, and then, from each pair of singly charged mass numbers and its mass difference, uses the Biograph module of Matlab to construct a mass number network comprising mutually independent network modules;
(4) a theoretical library search unit, which takes a network module obtained by the mass number network construction unit and, for any downstream mass number path starting from any node in the relationship tree of that network module as displayed by the Biograph module of Matlab, searches the theoretical library with each singly charged mass number in the path, looking for glycosylated peptides that match the singly charged mass number of each node in the path within an acceptable mass tolerance and that satisfy criteria a and b below; the glycosylated peptide retrieved for the downstream node of the path is then the predicted glycosylated peptide. Criterion a: the retrieved glycosylated peptides contain the same peptide. Criterion b: along the searched mass number path, for any node compared with its upstream node, the count of every monosaccharide unit in the retrieved glycosylated peptide is equal or greater, and the count of at least one monosaccharide unit is greater. Here, the theoretical library is the set of mass numbers corresponding to the glycosylated peptides formed by every pairwise combination of a protein peptide in the peptide theoretical library with a sugar chain in the sugar chain theoretical library.
For mass spectrometric data from an LTQ-Orbitrap instrument, the parent ion mass number is added as the downstream node of the selected mass number path before the search is performed.
Embodiment 2: Identification of the glycosylated peptides of asialofetuin
The glycosylated peptides of the standard glycoprotein asialofetuin were identified as an unknown sample to test the feasibility and reliability of the method of the present invention.
Experimental method:
1. Sample preparation: the unknown glycoprotein sample was dissolved in 25 mM NH4HCO3 and heat-denatured. After the solution had cooled, trypsin was added (at a trypsin : glycoprotein mass ratio of 1:50) and digestion proceeded overnight at 37 °C. The next day, the digest was dried by vacuum centrifugation for use in subsequent experiments.
2. Mass spectrometry: tandem mass spectrometry was performed on an AXIMA-QIT™ MALDI TOF MS. 2,5-Dihydroxybenzoic acid (DHB) was dissolved in 30 vol% aqueous acetonitrile containing 0.1 vol% trifluoroacetic acid to a final concentration of 12.5 mg/mL and used as the matrix. The MALDI scan range (m/z) was 1500-6000 for the first-stage spectrum and below the mass-to-charge ratio of the parent ion peak for the second-stage spectrum.
3. The mass spectra were processed with the device of Embodiment 1 according to the following steps:
(1) Spectrum filtering:
First, the spectra were prescreened. The acquired tandem mass spectra were screened by the following criterion: the spectrum contains one or more of the peaks at mass numbers 366, 528 and 690, ranked within the top 20 by S/N; the tolerance is 1 dalton. The spectra that pass are tandem mass spectra of glycosylated peptides. The tandem mass spectrum with parent ion mass number 5004 (charge 1+) is taken as the example for the subsequent analysis.
Next, the spectrum was filtered by selecting the 50 peaks with the highest S/N, as shown in Table 1:
Table 1. The 50 peaks with the highest S/N in the tandem mass spectrum with parent ion mass number 5004 (1+)
[Table 1 appears as images B2009101990864D0000121 and B2009101990864D0000131 in the original publication.]
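The prescreen and peak selection of step (1) can be sketched as follows. This is a minimal Python illustration, not the implementation used in the device; the representation of a peak as a (mass, S/N) tuple and the function names are assumptions made for the example, while the marker masses, the top-20 and top-50 cutoffs and the 1 dalton tolerance come from the text.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (mass number, signal-to-noise ratio); hypothetical layout

MARKER_MASSES = (366.0, 528.0, 690.0)  # marker mass numbers named in the text


def is_glycopeptide_spectrum(peaks: List[Peak], top_n: int = 20, tol: float = 1.0) -> bool:
    """Return True if at least one marker mass lies among the top_n peaks by S/N."""
    strongest = sorted(peaks, key=lambda p: p[1], reverse=True)[:top_n]
    return any(abs(mass - marker) <= tol
               for mass, _ in strongest
               for marker in MARKER_MASSES)


def select_top_peaks(peaks: List[Peak], keep: int = 50) -> List[Peak]:
    """Keep the `keep` peaks with the highest S/N, returned in ascending mass order."""
    strongest = sorted(peaks, key=lambda p: p[1], reverse=True)[:keep]
    return sorted(strongest)
```

A spectrum would first be passed to `is_glycopeptide_spectrum`, and only spectra that pass would go on to `select_top_peaks`.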
(2) Obtaining singly charged mass numbers: because the mass spectrometer uses a MALDI ion source, the mass numbers of the peaks selected in step (1) are the singly charged mass numbers.
(3) Constructing the mass number network:
The singly charged mass numbers from step (2) were subtracted pairwise, giving 1225 mass differences. Within the mass difference tolerance (1.5 daltons), the group of three singly charged mass numbers with the highest S/N whose mass numbers differ successively by 83 and 120 daltons was selected, together with all paired singly charged mass numbers whose mass difference is 162, 203, 324, 365, 589 or 689. The Biograph module of Matlab was then used to construct a mass number network comprising mutually independent network modules; see Table 2.
Table 2. Mass number network
[Table 2 appears as images B2009101990864D0000132 and B2009101990864D0000141 in the original publication.]
Network module 1 contains the three singly charged mass numbers that differ successively by 83 and 120 daltons, that is, it contains the characteristic peak mass numbers, and it also contains the most singly charged mass numbers. Network module 1 was therefore selected for the subsequent theoretical library search.
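The pairing and grouping of step (3) can be sketched in Python as below. This is only an illustration under stated assumptions: the text builds the network with the Biograph module of Matlab, whereas this sketch substitutes a plain union-find grouping, and the function names are hypothetical. The selection of the characteristic 83/120 dalton triple is not implemented here.

```python
from itertools import combinations


def build_mass_network(masses, allowed_diffs, tol):
    """Return edges (lower mass, higher mass, matched difference) for every pair
    whose mass difference matches an allowed difference within `tol`."""
    edges = []
    for low, high in combinations(sorted(masses), 2):
        diff = high - low
        for target in allowed_diffs:
            if abs(diff - target) <= tol:
                edges.append((low, high, target))
                break
    return edges


def network_modules(edges):
    """Group connected masses into independent modules, largest module first."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for low, high, _ in edges:
        parent[find(low)] = find(high)

    groups = {}
    for node in parent:
        groups.setdefault(find(node), []).append(node)
    return sorted((sorted(g) for g in groups.values()), key=len, reverse=True)
```

Masses that pair with nothing do not appear in any module. Sorting the modules by size puts the module with the most singly charged mass numbers first, which is the one the text searches first.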
(4) Theoretical library search
Setting up the theoretical library:
Sugar chain theoretical library: $162.05 \times n_1 + 203.08 \times n_2 + 291.83 \times n_3 + 132.04 \times n_4 + 146.06 \times n_5$, where $n_1 = 6$, $n_2 = 10$, $n_3 = 5$, $n_4 = 5$ and $n_5 = 5$.
Peptide theoretical library: from the amino acid sequence of asialofetuin, the set of mass numbers of the peptides that can carry glycosylation was derived theoretically using the following parameters. Protease used for proteolysis: trypsin; allowed missed cleavages: 1; allowed peptide length: 7~60; allowed secondary cleavage site of peptide hydrolysis: aspartic acid; fixed modifications: none; variable modifications: carbamidomethylation; required glycosylation sequons: NXS/T (N-glycans) or S/T (O-glycans).
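The theoretical derivation of candidate peptides can be sketched in Python as follows. This is a simplified illustration, not the procedure used in the embodiment: it cleaves after every K or R, ignores the secondary cleavage at aspartic acid and the variable modification, and checks only the N-glycan sequon. Excluding proline at the X position of NXS/T is a conventional assumption that the text does not state, and the function names are hypothetical.

```python
import re

# N-glycosylation sequon N-X-S/T; excluding proline at X is an assumption.
SEQUON = re.compile(r"N[^P][ST]")


def tryptic_peptides(protein, missed_cleavages=1, min_len=7, max_len=60):
    """Cleave after every K or R and join up to `missed_cleavages` adjacent pieces."""
    pieces = re.findall(r".*?[KR]|.+$", protein)
    peptides = []
    for i in range(len(pieces)):
        for j in range(i, min(i + missed_cleavages + 1, len(pieces))):
            peptide = "".join(pieces[i:j + 1])
            if min_len <= len(peptide) <= max_len:
                peptides.append(peptide)
    return peptides


def n_glyco_candidates(protein, **kwargs):
    """Keep only the peptides that contain an N-glycosylation sequon."""
    return [p for p in tryptic_peptides(protein, **kwargs) if SEQUON.search(p)]
```

A sequon that is split across a cleavage site is missed by this per-peptide check; a fuller version would locate sequons on the intact protein first.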
The theoretical library is the set of mass numbers of the glycosylated peptides obtained by every pairwise combination of a sugar chain from the sugar chain theoretical library above with a peptide from the peptide theoretical library.
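The pairwise combination of the two libraries can be sketched in Python as below. The residue masses are the five values given in the text; the residue names attached to them, the reading of $n_1$ to $n_5$ as upper bounds, the lower bound of zero and the function names are all assumptions made for this example.

```python
from itertools import product

# Masses from the text; the names are assumed labels. The text does not name
# the 291.83 residue; "Sia" (sialic acid) is a guess.
RESIDUE_MASSES = {"Hex": 162.05, "HexNAc": 203.08, "Sia": 291.83,
                  "Xyl": 132.04, "Fuc": 146.06}

# Assumed reading of n_1 ... n_5 in the text as upper bounds on each count.
DEFAULT_MAX = {"Hex": 6, "HexNAc": 10, "Sia": 5, "Xyl": 5, "Fuc": 5}


def glycan_library(max_counts=DEFAULT_MAX):
    """Yield (composition, mass) for every non-empty monosaccharide composition."""
    names = list(RESIDUE_MASSES)
    ranges = (range(max_counts.get(name, 0) + 1) for name in names)
    for counts in product(*ranges):
        if any(counts):
            composition = dict(zip(names, counts))
            mass = sum(RESIDUE_MASSES[n] * c for n, c in composition.items())
            yield composition, mass


def glycopeptide_library(peptide_masses, max_counts=DEFAULT_MAX):
    """Combine every peptide with every glycan; return entries sorted by mass."""
    entries = [(pep_mass + gly_mass, seq, comp)
               for seq, pep_mass in peptide_masses.items()
               for comp, gly_mass in glycan_library(max_counts)]
    entries.sort(key=lambda entry: entry[0])
    return entries
```

Sorting by mass makes a tolerance lookup for each node of a mass number path straightforward, for example with the standard `bisect` module.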
From the mass number network obtained in step (3), network module 1, which contains the most singly charged mass numbers and includes the characteristic peak mass numbers, was selected, and the Biograph module of Matlab was used to display the relationship tree of this network module (see Fig. 4).
Search criteria: a, the retrieved glycosylated peptides contain the same peptide; b, along the searched mass number path, for any singly charged mass number compared with the singly charged mass number upstream of it, the count of every monosaccharide unit in the retrieved glycosylated peptide is equal or greater, and the count of at least one monosaccharide unit is greater.
Mass tolerance: 1 dalton.
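Criteria a and b can be expressed as a small check over the library hits chosen for the nodes of one path. The Python sketch below is illustrative only; the data layout (one (peptide sequence, composition) pair per node, ordered from upstream to downstream) and the function name are assumptions.

```python
def satisfies_criteria(assignments):
    """Check criteria a and b for one mass number path.

    `assignments` lists one (peptide sequence, monosaccharide counts) pair per
    node, ordered from the upstream node to the downstream node.
    """
    # Criterion a: every node must carry the same peptide.
    if len({sequence for sequence, _ in assignments}) != 1:
        return False
    # Criterion b: moving downstream, no monosaccharide count may decrease
    # and at least one count must increase.
    for (_, upstream), (_, downstream) in zip(assignments, assignments[1:]):
        sugars = set(upstream) | set(downstream)
        if any(downstream.get(s, 0) < upstream.get(s, 0) for s in sugars):
            return False
        if not any(downstream.get(s, 0) > upstream.get(s, 0) for s in sugars):
            return False
    return True
```

In a full search, each node may have several library hits within the tolerance, so this check would be applied to each combination of hits along the path.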
The mass number path starting from the singly charged mass number 3017.255, the most upstream in the relationship tree, was selected and searched against the theoretical library; no search result satisfied criteria a and b. The mass number paths starting from the singly charged mass numbers 3100.3391 and 3340.3225 further downstream were then selected; again no search result satisfied criteria a and b. Next, the mass number path starting from the singly charged mass number 3219.389 (3219.389, 3423.4979, 4112.9817, 4274.0576, 4478.1685, 4640.2252, 4843.3055) was selected and searched against the theoretical library, and search results satisfying criteria a and b were obtained. The glycosylated peptide retrieved for the singly charged mass number 4843.3055 at the downstream end of this path is therefore the predicted glycosylated peptide: its peptide sequence is VVHAVEVALATFNAESNGSYLQLVEISR and its sugar chain is 6 Hexose and 5 HexNAc. This prediction agrees with a known glycosylated peptide of asialofetuin, in which the peptide spans positions 160-187 of the protein sequence and has a molecular weight of 3017.
The other mass number path starting from the singly charged mass number 3219.389 (3219.389, 3423.4979, 4112.9817, 4701.2208) was also searched against the theoretical library; no search result satisfied criteria a and b.
The predictions given by the method of the present invention are therefore highly accurate, and the method avoids the many false positives that are a drawback of existing methods.
Embodiment 3: Identification of the glycosylated peptides of horseradish peroxidase (HRP)
The glycosylated peptides of the standard glycoprotein horseradish peroxidase (HRP) were identified as an unknown sample to test the feasibility and reliability of the method of the present invention.
Mass spectrometer: LTQ-Orbitrap
Experimental method:
1. Sample preparation: the unknown glycoprotein sample was dissolved in 25 mM NH4HCO3 and heat-denatured. After the solution had cooled, trypsin was added (at a trypsin : glycoprotein mass ratio of 1:50) and digestion proceeded overnight at 37 °C. The next day, the digest was dried by vacuum centrifugation for use in subsequent experiments.
2. Mass spectrometry: tandem mass spectrometry was performed on an ESI-LTQ-Orbitrap XL system. The vacuum-dried proteolytic peptides were dissolved in 0.1% (v/v) aqueous formic acid for loading, separated on a reversed-phase analytical column (200 mm × 75 μm, packed with 1.7 μm Bridged Ethyl Hybrid C18; Waters Corporation, Milford, USA), and then analyzed by online electrospray tandem mass spectrometry. Liquid chromatography mobile phases: phase A was water containing 0.1% (v/v) formic acid, and phase B was acetonitrile containing 0.1% (v/v) formic acid. The gradient was: 0~4 min, 95% A + 5% B; 4~50 min, A from 95% to 50% and B from 5% to 50%. Peptide tandem analysis used data-dependent MS/MS acquisition: the six most intense parent ions were selected, with dynamic exclusion for 30 seconds.
3. The mass spectra were processed with the device of Embodiment 1 according to the following steps:
(1) Spectrum filtering:
First, the spectra were prescreened. The acquired tandem mass spectra were screened by the following criterion: the spectrum contains one or more of the peaks at mass numbers 366, 528 and 690, ranked within the top 20 by S/N; the tolerance is 1 dalton. The spectra that pass are tandem mass spectra of glycosylated peptides. The tandem mass spectrum with parent ion mass number 3353.4178 (charge 2+) is taken as the example for the subsequent analysis.
Next, the spectrum was filtered by selecting the 50 peaks with the highest S/N, as shown in Table 3:
Table 3. The 50 peaks with the highest S/N in the tandem mass spectrum with parent ion mass number 3353.4178 (2+)
[Table 3 appears as image B2009101990864D0000171 in the original publication.]
(2) Obtaining singly charged mass numbers: the mass numbers of the peaks selected in step (1) were deconvoluted with the following formula, giving 92 singly charged mass numbers in total.
$M_y = n \times M_x - n + 1$, where $n \in \{1, 2\}$, $M_x$ is the original mass number of the peak and $M_y$ is the singly charged mass number after deconvolution. When the $M_y$ calculated by this formula exceeds 3353.4178, $M_y$ is taken as 3353.4178.
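The deconvolution formula can be sketched in Python as below. The formula, the charge set {1, 2} and the cap at the parent ion mass number come from the text; the function name, the rounding to four decimals and the merging of duplicate values are assumptions made for this example.

```python
def deconvolute(peak_masses, parent_mass, charges=(1, 2)):
    """Apply M_y = n * M_x - n + 1 for each assumed charge n.

    Values above the parent ion mass number are replaced by the parent ion
    mass number. Duplicates are merged (an assumption of this sketch).
    """
    singly_charged = set()
    for m_x in peak_masses:
        for n in charges:
            m_y = n * m_x - n + 1
            singly_charged.add(round(min(m_y, parent_mass), 4))
    return sorted(singly_charged)
```

Because each peak is tried at both charge states, 50 peaks give at most 100 values; the cap at the parent ion mass number collapses every value above it into a single entry.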
(3) Constructing the mass number network:
The singly charged mass numbers from step (2) were subtracted pairwise, giving 4186 mass differences. Within the mass difference tolerance (1 dalton), all paired singly charged mass numbers whose mass difference is 132, 146, 162, 203 or 291 were selected, and the Biograph module of Matlab was then used to construct a mass number network comprising mutually independent network modules, as shown in Table 4.
Table 4. Mass number network
[Table 4 appears as image B2009101990864D0000181 in the original publication.]
(4) Theoretical library search
Setting up the theoretical library:
Sugar chain theoretical library: $162.05 \times n_1 + 203.08 \times n_2 + 291.83 \times n_3 + 132.04 \times n_4 + 146.06 \times n_5$, where $n_1 = 6$, $n_2 = 10$, $n_3 = 5$, $n_4 = 5$ and $n_5 = 5$.
Peptide theoretical library: the glycosylated peptide sample was digested with PNGase F (the peptides were dissolved in 50 mM ammonium bicarbonate solution, PNGase F (NEB, 500 units/μL) was added at 1 μL of enzyme per 1 mg of protein, and the reaction proceeded overnight at 37 °C). This cleaves the N-linked sugar chains from the glycopeptides and at the same time converts the asparagine that carried the sugar chain into aspartic acid. The deglycosylated peptides were then identified by coupled chromatography and tandem mass spectrometry: after separation on a C18 column, an ESI-LTQ-Orbitrap instrument was used to acquire the CID tandem fragmentation spectra of the peptides in the sample. The spectra were searched with Sequest (parameters: Swiss-Prot human protein database; semi-tryptic cleavage; variable modifications: conversion of asparagine to aspartic acid, and methionine oxidation). After the database search, PeptideProphet was used to filter the results, giving a list of high-confidence peptide identifications. All peptides in this list that show the asparagine conversion are peptides that were processed by PNGase F; the original sequences of these peptides and the corresponding set of mass numbers form the peptide theoretical library.
The theoretical library is the set of mass numbers of the glycosylated peptides obtained by every pairwise combination of a sugar chain from the sugar chain theoretical library above with a peptide from the peptide theoretical library.
From the mass number network obtained in step (3), network module 1, which contains the most singly charged mass numbers and includes the characteristic peak mass numbers, was selected, and the Biograph module of Matlab was used to display the relationship tree of this network module (see Fig. 5).
Search criteria: a, the retrieved glycosylated peptides contain the same peptide; b, along the searched mass number path, for any singly charged mass number compared with the singly charged mass number upstream of it, the count of every monosaccharide unit in the retrieved glycosylated peptide is equal or greater, and the count of at least one monosaccharide unit is greater.
Mass tolerance: 1.5 daltons.
The mass number path starting from the singly charged mass number 1893.3234, the most upstream in the relationship tree, was selected and searched against the theoretical library; no search result satisfied criteria a and b.
Next, the singly charged mass number 2182.7463 was selected as the starting point and the parent ion mass number was added, giving the mass number path (2182.7463, 2386.1921, 2589.9185, 2751.2749, 2914.2417, 3045.3943, 3207.3884, 3353.4178), which was searched against the theoretical library; search results satisfying criteria a and b were obtained. The glycosylated peptide retrieved for the singly charged mass number 3353.4178 (the parent ion mass number) at the downstream end of this path is therefore the predicted glycosylated peptide: its peptide sequence is SFANSTQTFFNAFVEAMDR and its sugar chain is 3 Hexose, 2 HexNAc, 1 Xylose and 1 Fucose. This prediction agrees with a known glycosylated peptide of horseradish peroxidase, in which the peptide spans positions 266-284 of the protein sequence and has a molecular weight of 2182.98.
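As a consistency check on this path, each step between neighbouring nodes can be matched to the nearest monosaccharide mass. The Python sketch below is illustrative only: the residue masses are those of the sugar chain theoretical library, the pairing of each mass with a residue name is an assumption, the 1 dalton tolerance is the mass difference tolerance of step (3), and the function name is hypothetical.

```python
from collections import Counter

# Masses from the sugar chain theoretical library; the name attached to each
# mass is an assumed label.
RESIDUE_MASSES = {"Hex": 162.05, "HexNAc": 203.08, "Xyl": 132.04, "Fuc": 146.06}

# Mass number path reported in the text, ending at the parent ion mass number.
EXAMPLE_PATH = [2182.7463, 2386.1921, 2589.9185, 2751.2749,
                2914.2417, 3045.3943, 3207.3884, 3353.4178]


def annotate_path(path, tol=1.0):
    """Label each upstream-to-downstream step with the nearest residue,
    or None if no residue mass lies within `tol` of the step."""
    steps = []
    for upstream, downstream in zip(path, path[1:]):
        diff = downstream - upstream
        name, mass = min(RESIDUE_MASSES.items(), key=lambda kv: abs(diff - kv[1]))
        steps.append(name if abs(diff - mass) <= tol else None)
    return steps


composition = Counter(annotate_path(EXAMPLE_PATH))
# composition holds 3 Hex, 2 HexNAc, 1 Xyl and 1 Fuc
```

The seven steps add up to 3 Hexose, 2 HexNAc, 1 Xylose and 1 Fucose, the same sugar chain as the reported prediction, and every step only adds residues, as criterion b requires.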
The predictions given by the method of the present invention are therefore highly accurate, and the method avoids the many false positives that are a drawback of existing methods.

Claims (28)

1. A method for identifying glycosylated peptides, characterized in that it comprises the following steps:
(1) spectrum filtering: selecting, from the tandem mass spectrum of a glycosylated peptide sample, the peaks whose signal-to-noise ratio (S/N) is high enough to be meaningful for analysis;
(2) obtaining singly charged mass numbers: if the mass spectrum used for identification is a MALDI mass spectrum, taking the mass number of each peak selected in step (1) as its singly charged mass number; if the mass spectrum used for identification is an ESI mass spectrum, deconvoluting the mass numbers of the peaks selected in step (1) to obtain their singly charged mass numbers;
(3) constructing the mass number network: pairing the singly charged mass numbers obtained in step (2) and subtracting each pair to obtain mass differences, and then, from each pair of singly charged mass numbers and its mass difference, using the Biograph module of Matlab to construct a mass number network comprising mutually independent network modules;
(4) theoretical library search: for any downstream mass number path starting from any node in the relationship tree of a network module as displayed by the Biograph module of Matlab, searching the theoretical library with each singly charged mass number in the path, looking for glycosylated peptides that match the singly charged mass number of each node in the path within an acceptable mass tolerance and that satisfy criteria a and b below; the glycosylated peptide retrieved for the downstream node of the path is then the predicted glycosylated peptide;
criterion a: the retrieved glycosylated peptides contain the same peptide;
criterion b: along the searched mass number path, for any node compared with its upstream node, the count of every monosaccharide unit in the retrieved glycosylated peptide is equal or greater, and the count of at least one monosaccharide unit is greater;
wherein the theoretical library is the set of mass numbers corresponding to the glycosylated peptides formed by every pairwise combination of a protein peptide in the peptide theoretical library with a sugar chain in the sugar chain theoretical library.
2. The method of claim 1, characterized in that: in step (1), the tandem mass spectrum of the glycosylated peptide sample is a spectrum meeting the following criterion: it contains one or more of the peaks at mass numbers 366, 528 and 690, ranked within the top 20 by S/N; the tolerance is within 2.0 daltons.
3. The method of claim 1, characterized in that: in step (1), the mass spectrometer from which the tandem mass spectrum originates is an ESI-LTQ-ORBITRAP or a MALDI-QIT-TOF.
4. The method of claim 1, characterized in that: in step (1), the peaks whose S/N is high enough to be meaningful for analysis are the 30~70 peaks with the highest S/N.
5. The method of claim 4, characterized in that: in step (1), the peaks whose S/N is high enough to be meaningful for analysis are the 50 peaks with the highest S/N.
6. The method of claim 1, characterized in that: in step (3), if the glycosylated peptide sample to be identified is N-glycosidically linked, then:
when the peaks come from a MALDI-QIT mass spectrometer, among the paired singly charged mass numbers and within the acceptable mass difference tolerance, the group of singly charged mass numbers with the highest S/N whose mass numbers differ successively by 83 and 120 daltons is selected, together with all paired singly charged mass numbers whose mass difference is one or more of 132, 146, 162, 203, 291, 324, 365, 589 and 689, for the subsequent construction of the mass number network;
when the peaks come from an ESI-LTQ-ORBITRAP mass spectrometer, among the paired singly charged mass numbers and within the acceptable mass difference tolerance, all paired singly charged mass numbers whose mass difference is one or more of 132, 146, 162, 203, 291, 324, 365, 589 and 689 are selected for the subsequent construction of the mass number network;
wherein the acceptable mass difference tolerance is 0.5~2 daltons.
7. The method of claim 6, characterized in that: the acceptable mass difference tolerance is 1.0~1.5 daltons.
8. The method of claim 1, characterized in that: in step (4), the network module selected first is the one containing the most singly charged mass numbers; if no predicted glycosylated peptide meeting the search criteria of step (4) is obtained, the network module containing the next largest number of singly charged mass numbers is selected, until a predicted glycosylated peptide meeting the search criteria of step (4) is obtained; alternatively, after this, the other network modules are also analyzed to obtain a more complete identification of predicted glycosylated peptides.
9. The method of claim 1, characterized in that: in step (4), the acceptable mass tolerance is within 2.0 daltons.
10. The method of claim 9, characterized in that: in step (4), the acceptable mass tolerance is 1.0 dalton.
11. The method of claim 1, characterized in that: if, in step (1), the tandem mass spectrum comes from an LTQ-Orbitrap mass spectrometer, then in step (4), in addition to selecting the mass number path, the parent ion mass number is added as the mass number of the downstream node of the path and the theoretical library is searched for glycosylated peptides that match the parent ion mass number within an acceptable parent ion mass tolerance; if the glycosylated peptides retrieved for each node of the mass number path containing the parent ion mass number satisfy criteria a and b above, the glycosylated peptide retrieved for the downstream node of the path, that is, for the parent ion mass number, is the predicted glycosylated peptide; wherein the parent ion mass tolerance is 10~20 ppm.
12. The method of claim 1, characterized in that: in step (4), the mass number paths to be searched are selected starting from the singly charged mass numbers at the upstream end of the relationship tree; if the search is unsuccessful, the mass number path of a singly charged mass number at the next downstream level is selected, until a predicted glycosylated peptide meeting the search criteria is obtained; alternatively, after this, the mass number paths of the other singly charged mass numbers not yet searched are also searched, until the whole relationship tree is covered, to obtain a complete identification of predicted glycosylated peptides.
13. The method of claim 1, characterized in that:
the protein peptides in the peptide theoretical library are protein peptides calculated theoretically with the following parameters: the protease used for proteolysis is trypsin; the number of allowed missed cleavages is 0~2; the allowed peptide length is generally 7~60 amino acids; the allowed secondary cleavage site of peptide hydrolysis is aspartic acid; the allowed fixed and variable modifications are selected from carbamidomethylation of cysteine; and the glycosylation sequons NXS/T or S/T are either not required or required;
the sugar chain theoretical library is the set of mass numbers given by $162.05 \times n_1 + 203.08 \times n_2 + 291.83 \times n_3 + 132.04 \times n_4 + 146.06 \times n_5$, where $n_1$, $n_2$, $n_3$, $n_4$ and $n_5$ are independently selected from the set $\{1, 2, \ldots, n\}$, with $n$ within 10.
14. The method of claim 1 or 13, characterized in that the peptide theoretical library is obtained as follows: the glycosylated peptide sample to be identified is enzymatically digested to remove the sugar chains, giving the peptides of the glycosylated peptides; tandem mass spectrometry is then performed, and the resulting spectra are searched against a protein sequence database to obtain the sequence information of the peptides of the glycosylated peptides and the corresponding set of mass numbers, which serves as the peptide theoretical library.
15. a device of identifying the glycosylated peptide section is characterized in that it comprises:
The mass spectrogram filter element, the enough height of signal to noise ratio (S/N ratio) have the mass spectra peak of resolving meaning in the tandem mass spectrometry mass spectrogram of this unit selection glycosylated peptide section sample;
Obtain single electric charge mass number unit: call the mass spectra peak that the mass spectrogram filter element is selected, if the mass spectra peak source is the MALDI mass spectrum, the mass number of then getting mass spectra peak is single electric charge mass number, if identify that used mass spectrum is the ESI mass spectrum, then, obtain its single electric charge mass number with deconvoluting of the mass number processing of mass spectra peak;
Make up the mass number network element: call single electric charge mass number of obtaining single electric charge mass number unit gained, match in twos subtract each other heteromerism of poor quality, afterwards according to each assembly to single electric charge mass number and heteromerism of poor quality thereof, adopt the Biograph module in the Matlab software, construct the mass number network that comprises separate mixed-media network modules mixed-media;
Theoretical library retrieval unit: call and make up the mixed-media network modules mixed-media that the mass number network element obtains, be that arbitrary downstream quality of starting point is counted each the single electric charge mass number in the path with arbitrary node in the relational tree of this mixed-media network modules mixed-media that the Biograph module of Matlab software is demonstrated, in theoretical library, retrieve, single electric charge mass number of each node in searching and this mass number path, in acceptable mass number range of tolerable variance, conform to, and satisfy the glycosylated peptide section of following standard a and b, then the glycosylated peptide section that retrieves of the downstream node in this mass number path is prediction glycosylated peptide section; Standard a, comprise identical peptide section, in the mass number path of standard b, retrieval, arbitrary node is compared with its upstream node, and the number of any monosaccharide unit equates or more in the glycosylated peptide section that retrieves, and the number of at least a monosaccharide unit is more; Wherein, described theoretical library is meant, by the mass number set corresponding with the glycosylated peptide section of the optional combination in twos of each sugar chain in the sugar chain theoretical library of each the protein peptides section in the peptide section theoretical library.
16. The device as claimed in claim 15, characterized in that: in the mass spectrum filtering unit, the tandem mass spectrum of the glycosylated peptide sample is a mass spectrum meeting the following criterion: the spectrum contains one or more of the mass spectral peaks at the three mass numbers 366, 528 and 690, and these peaks rank among the top 20 by signal-to-noise ratio; the tolerance is within 2.0 daltons.
17. The device as claimed in claim 15, characterized in that: in the mass spectrum filtering unit, the mass spectrometer from which the tandem mass spectrum originates is an ESI-LTQ-ORBITRAP or a MALDI-QIT-TOF.
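The filtering criterion of claim 16 can be sketched as follows. This is only an illustration under stated assumptions: peaks are represented as `(mass, signal_to_noise)` tuples, and the function name `is_glycopeptide_spectrum` is hypothetical. The three diagnostic masses, the top-20 cutoff and the 2.0 Da tolerance are taken from the claim.

```python
from typing import List, Tuple

DIAGNOSTIC_MASSES = (366.0, 528.0, 690.0)  # mass numbers named in the claim
TOLERANCE_DA = 2.0                         # tolerance named in the claim
TOP_N = 20                                 # signal-to-noise rank cutoff


def is_glycopeptide_spectrum(peaks: List[Tuple[float, float]]) -> bool:
    """Return True if a diagnostic peak is among the TOP_N peaks by S/N.

    peaks: (mass, signal_to_noise) pairs; this representation is an assumption.
    """
    # Keep only the TOP_N peaks with the highest signal-to-noise ratio.
    top = sorted(peaks, key=lambda p: p[1], reverse=True)[:TOP_N]
    # Accept the spectrum if any retained peak lies within tolerance
    # of any of the diagnostic masses.
    return any(
        abs(mass - diagnostic) <= TOLERANCE_DA
        for mass, _ in top
        for diagnostic in DIAGNOSTIC_MASSES
    )
```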
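For the two instrument types named above, the acquisition of singly charged mass numbers can be sketched as below. The sketch assumes that the charge state of each ESI peak is already known; a real deconvolution step would first infer it (for example from isotope spacing), which is not shown. The function name and the `source` argument are hypothetical.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons


def singly_charged_mass(mz: float, charge: int, source: str) -> float:
    """Convert an observed m/z value to the singly charged mass number.

    source: "MALDI" or "ESI" (labels chosen for this sketch).
    """
    if source == "MALDI":
        # MALDI peaks are taken as singly charged, so m/z is used as is.
        return mz
    if source == "ESI":
        if charge < 1:
            raise ValueError("charge must be a positive integer")
        # Remove the extra (charge - 1) protons to get the [M+H]+ value.
        return mz * charge - (charge - 1) * PROTON_MASS
    raise ValueError("unknown source: " + source)
```

For example, a doubly charged ESI peak at m/z 500.0 corresponds to a singly charged mass number just under 999.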
18. The device as claimed in claim 1, characterized in that: in the mass spectrum filtering unit, the mass spectral peaks whose signal-to-noise ratio is high enough to be meaningful for interpretation are the top 30 to 70 peaks by signal-to-noise ratio.
19. The device as claimed in claim 18, characterized in that: in the mass spectrum filtering unit, the mass spectral peaks whose signal-to-noise ratio is high enough to be meaningful for interpretation are the top 50 peaks by signal-to-noise ratio.
20. The device as claimed in claim 1, characterized in that: in the mass number network construction unit, if the glycosylated peptide sample to be identified is N-glycosidically linked, then:
When the mass spectral peaks come from a MALDI-QIT mass spectrometer, the following are selected from the paired singly charged mass numbers, within the acceptable mass difference tolerance, for the subsequent construction of the mass number network: the group of singly charged mass numbers that has the highest signal-to-noise ratio and whose mass numbers differ successively by 83 and 120 daltons, and all paired singly charged mass numbers whose mass difference is one or more of 132, 146, 162, 203, 291, 324, 365, 589 and 689;
When the mass spectral peaks come from an ESI-LTQ-ORBITRAP mass spectrometer, all paired singly charged mass numbers whose mass difference is one or more of 132, 146, 162, 203, 291, 324, 365, 589 and 689 are selected from the paired singly charged mass numbers, within the acceptable mass difference tolerance, for the subsequent construction of the mass number network;
Wherein the acceptable mass difference tolerance is 0.5 to 2 daltons.
21. The device as claimed in claim 20, characterized in that: the acceptable mass difference tolerance is 1.0 to 1.5 daltons.
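The pairing of singly charged mass numbers and the grouping into independent network modules can be sketched in plain Python. This is a stand-in for illustration only: the claims use the Biograph module of Matlab, whereas the sketch uses a simple union-find to find connected groups. The function names and the default tolerance of 1.0 Da (inside the claimed 0.5 to 2 Da range) are assumptions, and the additional 83 and 120 Da rule for MALDI-QIT data is not covered.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# Accepted mass differences listed in claim 20.
SUGAR_DIFFS = (132, 146, 162, 203, 291, 324, 365, 589, 689)


def build_edges(masses: List[float], tol: float = 1.0) -> List[Tuple[float, float, int]]:
    """Return (lighter, heavier, matched_difference) for each accepted pair."""
    edges = []
    for a, b in combinations(sorted(masses), 2):
        for diff in SUGAR_DIFFS:
            if abs((b - a) - diff) <= tol:
                edges.append((a, b, diff))
                break  # one matched difference per pair is enough
    return edges


def network_modules(edges: List[Tuple[float, float, int]]) -> List[List[float]]:
    """Group connected mass numbers into modules, largest module first."""
    parent: Dict[float, float] = {}

    def find(x: float) -> float:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, _ in edges:
        parent[find(a)] = find(b)  # merge the two groups

    groups: Dict[float, List[float]] = {}
    for node in list(parent):
        groups.setdefault(find(node), []).append(node)
    return sorted((sorted(g) for g in groups.values()), key=len, reverse=True)
```

Sorting the modules by size matches the order in which claim 22 examines them, starting with the module that contains the most singly charged mass numbers.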
22. The device as claimed in claim 15, characterized in that: in the theoretical library retrieval unit, the network module containing the largest number of singly charged mass numbers is selected first; if no predicted glycosylated peptide meeting the retrieval criteria of step (4) can be obtained, the network module containing the next largest number of singly charged mass numbers is selected, until a predicted glycosylated peptide meeting the retrieval criteria of step (4) is obtained; alternatively, after this, other network modules are also selected for analysis, to obtain a more comprehensive identification result for the predicted glycosylated peptides.
23. The device as claimed in claim 15, characterized in that: in the theoretical library retrieval unit, the acceptable mass number tolerance is within 2.0 daltons.
24. The device as claimed in claim 23, characterized in that: in the theoretical library retrieval unit, the acceptable mass number tolerance is 1.0 dalton.
25. The device as claimed in claim 15, characterized in that: if, in the mass spectrum filtering unit, the tandem mass spectrum originates from an ESI-LTQ-Orbitrap mass spectrometer, then in the theoretical library retrieval unit, in addition to selecting the mass number path, the parent ion mass number is added as the mass number of the downstream end node of the path and the theoretical library is searched for glycosylated peptides that match the parent ion mass number within the acceptable parent ion mass number tolerance; if the glycosylated peptides retrieved for every node of the path containing the parent ion mass number satisfy criteria a and b above, then the glycosylated peptide retrieved for the downstream end node of the path, that is, for the parent ion mass number, is the predicted glycosylated peptide; wherein the parent ion mass number tolerance is 10 to 20 ppm.
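Because the parent ion tolerance is relative (ppm) rather than absolute (daltons), a short sketch may help. The function names are hypothetical, and the default of 10 ppm is simply the lower end of the claimed 10 to 20 ppm range.

```python
def ppm_to_da(theoretical_mass: float, tol_ppm: float) -> float:
    """Convert a ppm tolerance to an absolute tolerance in daltons."""
    return theoretical_mass * tol_ppm / 1e6


def within_ppm(observed: float, theoretical: float, tol_ppm: float = 10.0) -> bool:
    """Return True if the observed mass lies within tol_ppm of the theoretical mass."""
    return abs(observed - theoretical) / theoretical * 1e6 <= tol_ppm
```

At a parent ion mass number of 2000, a 20 ppm tolerance corresponds to about 0.04 daltons, far tighter than the 1.0 to 2.0 dalton tolerance applied to the fragment nodes.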
26. The device as claimed in claim 1, characterized in that: in the mass spectrum filtering unit, the mass number path starting from the most upstream singly charged mass number in the relational tree is retrieved first; if that is unsuccessful, the mass number path starting from a singly charged mass number one layer further downstream is selected, until a predicted glycosylated peptide meeting the retrieval criteria is obtained; alternatively, after this, the mass number paths of other singly charged mass numbers not yet retrieved are also selected for retrieval, until the whole relational tree is covered, to obtain a comprehensive identification result for the predicted glycosylated peptides.
27. The device as claimed in claim 1, characterized in that:
The protein peptides in the peptide theoretical library are theoretical protein peptides inferred according to the following parameters: the protease used for proteolysis is trypsin; the number of missed cleavages allowed is 0 to 2; the allowed peptide length is generally 7 to 60 amino acids; the secondary cleavage site of peptide hydrolysis that may occur when MALDI-QIT is used is aspartic acid; the fixed modifications and variable modifications that may be present are selected from none or carbamidomethylation of cysteine; the peptides either do or do not contain the sequon NXS/T and/or S/T at which glycosylation can occur;
The sugar chain theoretical library is the set of mass numbers given by the formula $162.05 \times n_1 + 203.08 \times n_2 + 291.83 \times n_3 + 132.04 \times n_4 + 146.06 \times n_5$, where $n_1$, $n_2$, $n_3$, $n_4$ and $n_5$ are independently selected from the set $\{1, 2, \ldots, n\}$ and $n$ is 10.
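The sugar chain theoretical library defined by the formula above can be enumerated directly. In this minimal sketch the unit masses and the count range 1 to n are copied verbatim from the claim; the function name `glycan_library` and the dictionary representation are assumptions made for illustration.

```python
from itertools import product
from typing import Dict, Tuple

# Unit masses exactly as written in the claim, in the order n1..n5.
UNIT_MASSES = (162.05, 203.08, 291.83, 132.04, 146.06)


def glycan_library(n: int = 10) -> Dict[Tuple[int, ...], float]:
    """Map each composition (n1..n5) to its mass number.

    Each count ranges over 1..n, following the set given in the claim.
    """
    counts = range(1, n + 1)
    return {
        combo: sum(c * m for c, m in zip(combo, UNIT_MASSES))
        for combo in product(counts, repeat=len(UNIT_MASSES))
    }
```

With the claimed value n = 10 the library holds 10^5 compositions, which is small enough to enumerate in memory before combining it with the peptide theoretical library.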
28. The device as claimed in claim 15 or 27, characterized in that: the peptide theoretical library is obtained by the following method: the glycosylated peptide sample to be identified is treated enzymatically to cleave off the sugar chains, giving the peptides of the glycosylated peptides; tandem mass spectrometric identification is then performed, and the resulting mass spectra are searched against a protein sequence database to obtain the sequence information of the peptides in the glycosylated peptides and the corresponding set of mass numbers, which serves as the peptide theoretical library.
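Where the peptide theoretical library is instead inferred in silico with the parameters of claim 27 (trypsin, 0 to 2 missed cleavages, length 7 to 60), the digestion can be sketched as below. The function names are hypothetical. The sketch cuts after every K or R and deliberately ignores the commonly used "no cleavage before proline" exception, the aspartic acid secondary cleavage and the modifications; the sequon test follows the NXS/T pattern as written in the claim, with any residue allowed at X.

```python
import re
from typing import List


def tryptic_peptides(protein: str, max_missed: int = 2,
                     min_len: int = 7, max_len: int = 60) -> List[str]:
    """Return tryptic peptides with up to max_missed missed cleavages."""
    # Split the sequence after every K or R (simplified trypsin rule).
    pieces, start = [], 0
    for i, residue in enumerate(protein):
        if residue in "KR":
            pieces.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        pieces.append(protein[start:])  # C-terminal piece without K/R

    peptides = []
    for i in range(len(pieces)):
        for missed in range(max_missed + 1):
            if i + missed >= len(pieces):
                break
            # Joining adjacent pieces models missed cleavage sites.
            peptide = "".join(pieces[i:i + missed + 1])
            if min_len <= len(peptide) <= max_len:
                peptides.append(peptide)
    return peptides


def has_n_sequon(peptide: str) -> bool:
    """Return True if the peptide contains the NXS/T pattern from the claim."""
    return re.search(r"N.[ST]", peptide) is not None
```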
CN200910199086.4A 2009-11-19 2009-11-19 Method and device for identifying glycopeptide segment Expired - Fee Related CN102072932B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200910199086.4A CN102072932B (en) 2009-11-19 2009-11-19 Method and device for identifying glycopeptide segment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN200910199086.4A CN102072932B (en) 2009-11-19 2009-11-19 Method and device for identifying glycopeptide segment

Publications (2)

Publication Number Publication Date
CN102072932A true CN102072932A (en) 2011-05-25
CN102072932B CN102072932B (en) 2012-12-12

Family

ID=44031554

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200910199086.4A Expired - Fee Related CN102072932B (en) 2009-11-19 2009-11-19 Method and device for identifying glycopeptide segment

Country Status (1)

Country Link
CN (1) CN102072932B (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103983684A (en) * 2014-04-14 2014-08-13 南昌大学 Efficient screening method of glycosylation inhibitor on MALDI (matrix-assisted laser desorption ionization) plate
CN106018535A (en) * 2016-05-11 2016-10-12 中国科学院计算技术研究所 Complete glycopeptide identifying method and system
CN106770614A (en) * 2016-12-30 2017-05-31 复旦大学 The method that glycopeptide segment is identified in hydrophilic nanometer composite material combination mass spectral analysis
CN107037173A (en) * 2017-03-31 2017-08-11 李森康 A kind of method of protein content in quantitatively detection cattle and sheep breast
CN107505384A (en) * 2017-07-28 2017-12-22 复旦大学 A kind of glycopeptide segment Mass Spectrometric Identification method of mercaptophenyl boronic acid magnetic Nano material
CN111220749A (en) * 2018-11-25 2020-06-02 中国科学院大连化学物理研究所 Analysis method of O-linked glycopeptide
CN111758029A (en) * 2018-02-27 2020-10-09 新加坡科技研究局 Methods, apparatus and computer readable media for glycopeptide identification
CN112326769A (en) * 2020-11-04 2021-02-05 西北大学 Method for identifying N-sugar chain branch structure on complete glycopeptide
CN113834871A (en) * 2021-09-18 2021-12-24 北京中医药大学 Method for rapidly analyzing low-molecular-weight sugar based on paper spray mass spectrum and application thereof

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1314968C (en) * 2004-03-11 2007-05-09 复旦大学 Trace polypeptide or protein-enriched and its direct analyzing method

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103983684B (en) * 2014-04-14 2016-08-17 南昌大学 A kind of glycosylation inhibitor screening technique on efficient MALDI plate
CN103983684A (en) * 2014-04-14 2014-08-13 南昌大学 Efficient screening method of glycosylation inhibitor on MALDI (matrix-assisted laser desorption ionization) plate
CN106018535B (en) * 2016-05-11 2018-11-09 中国科学院计算技术研究所 A kind of method and system of intact glycopeptide identification
CN106018535A (en) * 2016-05-11 2016-10-12 中国科学院计算技术研究所 Complete glycopeptide identifying method and system
CN106770614A (en) * 2016-12-30 2017-05-31 复旦大学 The method that glycopeptide segment is identified in hydrophilic nanometer composite material combination mass spectral analysis
CN106770614B (en) * 2016-12-30 2019-11-12 复旦大学 The method of hydrophilic nanometer composite material combination mass spectral analysis identification glycopeptide segment
CN107037173A (en) * 2017-03-31 2017-08-11 李森康 A kind of method of protein content in quantitatively detection cattle and sheep breast
CN107037173B (en) * 2017-03-31 2019-01-18 杭州谱胜检测科技有限责任公司 A kind of method of protein content during quantitative detection cattle and sheep are newborn
CN107505384A (en) * 2017-07-28 2017-12-22 复旦大学 A kind of glycopeptide segment Mass Spectrometric Identification method of mercaptophenyl boronic acid magnetic Nano material
CN111758029A (en) * 2018-02-27 2020-10-09 新加坡科技研究局 Methods, apparatus and computer readable media for glycopeptide identification
CN111220749A (en) * 2018-11-25 2020-06-02 中国科学院大连化学物理研究所 Analysis method of O-linked glycopeptide
CN112326769A (en) * 2020-11-04 2021-02-05 西北大学 Method for identifying N-sugar chain branch structure on complete glycopeptide
CN113834871A (en) * 2021-09-18 2021-12-24 北京中医药大学 Method for rapidly analyzing low-molecular-weight sugar based on paper spray mass spectrum and application thereof
CN113834871B (en) * 2021-09-18 2024-05-28 北京中医药大学 Method for rapidly analyzing low-molecular sugar based on paper spray mass spectrum and application thereof

Also Published As

Publication number Publication date
CN102072932B (en) 2012-12-12

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20121212

Termination date: 20151119

EXPY Termination of patent right or utility model