CN110310706B - Label-free absolute quantitative method for protein - Google Patents
- Publication number
- CN110310706B (application CN201810226692.XA)
- Authority
- CN
- China
- Prior art keywords
- protein
- peptide
- quantitative
- peptide fragment
- identified
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Medical Informatics (AREA)
- Biophysics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Biology (AREA)
- Biotechnology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioethics (AREA)
- Public Health (AREA)
- Evolutionary Computation (AREA)
- Epidemiology (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Chemical & Material Sciences (AREA)
- Analytical Chemistry (AREA)
- Genetics & Genomics (AREA)
- Molecular Biology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Investigating Or Analysing Biological Materials (AREA)
- Other Investigation Or Analysis Of Materials By Electrical Means (AREA)
Abstract
The invention discloses a label-free absolute quantification method for proteins, comprising the following steps: 1) selecting a plurality of peptide fragments from the screened high-confidence proteins as high-confidence peptide fragments, and calculating the physicochemical properties and the peptide fragment quantitative efficiency of each high-confidence peptide fragment; 2) constructing a training set from the physicochemical properties and the peptide fragment quantitative efficiencies of the high-confidence peptide fragments, and training on it to obtain a peptide fragment quantitative efficiency prediction model; 3) predicting the peptide fragment quantitative efficiency of each identified peptide fragment of each identified protein using the prediction model; 4) calculating the quantitative value of each identified protein from the raw mass spectrum signal intensity of each of its identified peptide fragments and the corresponding peptide fragment quantitative efficiency; 5) predicting the concentrations of the other proteins to be measured from the known concentrations and quantitative values of the standard proteins. The invention corrects the raw mass spectrum signal intensity at both the protein level and the peptide level, thereby providing a more accurate method for absolute protein quantification.
Description
Technical Field
The invention relates to absolute protein quantification in proteomics, and in particular to a large-scale, label-free method for absolute protein quantification.
Background
Quantitative proteomics has become an important research area in the life sciences. It mainly covers relative and absolute protein quantification based on high-precision mass spectrometry data. Although large-scale absolute quantification of proteins is widely regarded as one of the ultimate goals of proteomics, it still lacks efficient and accurate algorithms, in contrast to the large number of algorithms available for relative quantification.
The ideal method for absolute quantification is to add an internal standard for each protein, but this is too costly for complex samples. The most common approach today is computational. In these methods, a linear relationship is first fitted between the known concentrations of the added standard proteins and their mass spectrum signal intensities, and this relationship is then used to predict the concentrations of the other proteins in the sample. The core of such methods is calculating the mass spectrum intensity of a protein from the mass spectrum intensities or spectral counts of its peptide fragments. However, the performance of existing algorithms is far from satisfactory, because they use the raw spectrum signal intensities of the peptide fragments directly, and these intensities do not accurately reflect the actual abundance of the peptide fragments.
The observed signal intensity of the peptide fragments not only depends on the actual concentration of the peptide fragments in a sample, but also has a great relationship with the physicochemical properties and the mass spectrum detection efficiency of the peptide fragments. The same concentration of peptide fragments may have completely different mass spectral signal intensities. For example, even the mass spectral signal intensities of peptide fragments from the same protein may differ by several orders of magnitude. If a protein has enough peptides identified, the mass spectral signal intensity of the peptides can be corrected to obtain accurate absolute protein quantification. Otherwise, mass spectral signal intensity deviations at the peptide level are transferred to the protein level. This problem is particularly acute for low abundance proteins or small proteins because the number of peptide fragments identified is usually small.
Unfortunately, this important problem is often ignored by the absolute protein quantification algorithms developed so far. For example, the iBAQ (Intensity-Based Absolute Quantification) algorithm (ref: Schwanhäusser, B. et al. Global quantification of mammalian gene expression control. Nature 473, 337-342 (2011)) simply sums the mass spectrum signal intensities of all identified peptide fragments of a protein and divides by the number of theoretical cleaved peptides of the protein to obtain the quantitative result for the protein. The algorithm has been implemented in the MaxQuant software (ref: Cox, J. & Mann, M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat Biotechnol 26, 1367-1372 (2008)). The Top3 algorithm (ref: Silva, J.C., Gorenstein, M.V., Li, G.Z., Vissers, J.P. & Geromanos, S.J. Absolute quantification of proteins by LCMSE: a virtue of parallel MS acquisition. Mol Cell Proteomics 5, 144-156 (2006)) directly uses the three peptide fragments of a protein with the highest signal intensities to calculate the quantitative value of the protein. Both algorithms completely neglect the differences in mass spectrum response among peptides and are therefore effective only for proteins with a large number of identified peptides. APEX (Absolute Protein EXpression) (ref: Lu, P., Vogel, C., Wang, R., Yao, X. & Marcotte, E.M. Absolute protein expression profiling estimates the relative contributions of transcriptional and translational regulation. Nat Biotechnol 25, 117-124 (2007)) sums the spectral counts of all identified peptide fragments of a protein, corrects the sum using the predicted detectability of the peptide fragments, and thereby obtains the quantitative value of the protein.
However, APEX corrects only at the protein level, not at the peptide level, and peptide detectability is not suitable for correcting the mass spectrum signal intensity of a peptide. Furthermore, a previous study (Cox, J. et al. Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ. Mol Cell Proteomics 13, 2513-2526 (2014)) has shown that quantification based on spectral counting is less accurate than quantification based on mass spectrum signal intensity, especially for high-precision mass spectrometry data. Therefore, proteomics still lacks large-scale absolute quantification algorithms, especially ones suited to low-abundance proteins.
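As a rough illustration of the two intensity-based baselines described above, the following Python sketch computes an iBAQ-style and a Top3-style value for a single protein. The function names and the input numbers are hypothetical, and averaging the three most intense peptides in `top3` is an assumption about how the three intensities are combined.

```python
def ibaq(peptide_intensities, n_theoretical_peptides):
    """iBAQ-style value: summed intensity of the identified peptides divided
    by the number of theoretical cleaved peptides of the protein."""
    if n_theoretical_peptides <= 0:
        raise ValueError("n_theoretical_peptides must be positive")
    return sum(peptide_intensities) / n_theoretical_peptides


def top3(peptide_intensities):
    """Top3-style value: mean of the (up to) three most intense peptides.
    Taking the mean is an assumption made for this sketch."""
    top = sorted(peptide_intensities, reverse=True)[:3]
    if not top:
        raise ValueError("at least one peptide intensity is required")
    return sum(top) / len(top)


# Hypothetical raw intensities for one protein with 10 theoretical peptides.
intensities = [4.0e6, 1.0e6, 2.5e5, 5.0e4]
print(ibaq(intensities, 10))
print(top3(intensities))
```

Neither function applies any per-peptide correction, which is the limitation the paragraph above points out.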
Disclosure of Invention
The invention aims to correct the mass spectrum response signal intensity of peptide fragments by predicting their quantitative efficiency with a machine learning method, thereby providing a large-scale, accurate, label-free method for absolute protein quantification.
In order to achieve the above object, the present invention provides a method for label-free absolute quantification of a protein, comprising:
step 1), screening high-reliability peptide fragments of high-reliability protein, and calculating the physicochemical properties and the quantitative efficiency of the peptide fragments to construct a training set of a peptide fragment quantitative efficiency prediction model;
step 2), training a Bayesian Additive Regression Trees (BART) model (Chipman HA, George EI, McCulloch RE. BART: Bayesian additive regression trees. Ann Appl Stat 4, 266-298 (2010));
step 3), predicting the peptide fragment quantitative efficiency of all identified peptide fragments of the identified protein by using the trained BART model;
step 4), calculating a quantitative value of the protein according to the original mass spectrum signal intensity of the identified peptide fragment of the identified protein and the peptide fragment quantitative efficiency;
step 5), predicting concentration values of other proteins according to the known concentration of the standard protein and the obtained quantitative value of the standard protein;
step 6), evaluating the effect of absolute quantification of the protein.
In the above technical solution, in step 1), an online learning strategy is used instead of the commonly used fixed training set. High-confidence peptide fragment quantitative efficiency samples are screened from each batch of data to construct the training set, thereby eliminating errors introduced by the experimental environment, operation, instruments and the like.
In the above technical solution, in the step 1), the constructing a training set of a peptide fragment quantitative efficiency prediction model according to physicochemical properties of a peptide fragment and peptide fragment quantitative efficiency includes:
Step 1-1) screening, from the identified proteins, the proteins that contain at least N identified unique peptide fragments (N is 5 in the invention). A unique peptide fragment is a peptide fragment that is present in only one protein group among all identified proteins. The term "protein group" as used herein refers to a collection of homogeneous proteins obtained by protein assembly. This screening yields the high-confidence proteins.
Step 1-2) for each identified peptide fragment of each high-confidence protein, calculating 587 physicochemical properties of the peptide fragment from its amino acid sequence and the adjacent amino acid sequences in the protein sequence. Each peptide fragment can then be represented as $x = (x_1, x_2, x_3, \ldots, x_{587})$.
Of the 587 physicochemical properties, the first 23 are features derived from the peptide sequence, such as the length of the peptide fragment, the number of missed cleavage sites in the peptide fragment, the mass of the peptide fragment, and the frequency of each amino acid in the peptide fragment. The middle 544 are obtained by averaging, over the peptide fragment, the amino acid physicochemical properties from AAindex (ref: Kawashima, S., Pokarowski, P., Pokarowska, M., Kolinski, A., Katayama, T. & Kanehisa, M. AAindex: amino acid index database, progress report 2008. Nucleic Acids Res. 36, D202-D205 (2008)). The last 20 physicochemical properties are taken from previous studies (refs: Braisted, J.C. et al. BMC Bioinformatics 9, 529 (2008); Webb-Robertson, B.J. et al. Bioinformatics 26, 1677-1683 (2010); Eyers, C.E. et al. Mol Cell Proteomics 10, M110.003384 (2011); Tang, H. et al. Bioinformatics 22, e481-488 (2006)).
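The peptide-level averaging of per-residue properties can be sketched in Python as follows. Only one property is shown, the Kyte-Doolittle hydropathy scale, chosen here purely as an example of an amino acid index; the dictionary and function names are hypothetical, and in the setting described above the same averaging would be applied to each of the 544 AAindex properties.

```python
# Kyte-Doolittle hydropathy value per amino acid (one example index).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_property(peptide, index):
    """Average one per-residue property over all residues of a peptide."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    return sum(index[aa] for aa in peptide) / len(peptide)


def aaindex_features(peptide, indices):
    """Return one averaged feature per index.

    `indices` is a list of dicts mapping amino acid letter to value
    (544 such dicts in the setting described in the text).
    """
    return [mean_property(peptide, idx) for idx in indices]


features = aaindex_features("ARNDCEQK", [KYTE_DOOLITTLE])
print(features)
```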
In the above technical solution, in step 2), the method for training the BART model on the training set constructed in step 1) is described in detail in patent application No. 2018102163139, entitled "method for predicting peptide fragment quantitative efficiency of peptide fragments in proteomics".
In the above technical solution, in step 3), the physicochemical properties of all peptide fragments of all proteins are calculated as the quantitative features of these peptide fragments. The physicochemical properties are the same as those in step 1). The features are then fed into the trained BART model to obtain the peptide fragment quantitative efficiency of each peptide fragment.
In the above technical solution, in step 4), the calculation of the absolute quantitative value of a protein using the peptide fragment quantitative efficiency can be carried out at two levels: the peptide level and the protein level.
The peptide fragment level correction refers to:
the protein layer correction refers to:
where $\mathrm{LFAQ}_{pep}$ and $\mathrm{LFAQ}_{pro}$ denote the absolute protein quantitative values obtained with peptide-level and protein-level correction, respectively; $\mathrm{PepInt}_i$ is the spectrum signal intensity of the i-th peptide fragment; $Q_i$ is the quantitative efficiency of the i-th peptide fragment; and N is the number of unique peptide fragments identified for the protein.
In the above technical solution, in the step 5), the standard protein refers to a protein with a known concentration added to the biological sample. By using these proteins and the absolute quantitative values of the calculated proteins, the concentrations of other proteins in the sample can be predicted.
Step 5-1) a linear model is trained on the actual concentrations of the standard proteins and their calculated quantitative values. To obtain an accurate and reliable linear model, the technical solution adopts a bootstrapping strategy: the original samples are randomly resampled 10,000 times, and 68% confidence intervals for the slope and intercept of the linear model are obtained.
Step 5-2) the actual concentrations of the other proteins are predicted from their protein quantitative values and the linear model obtained above.
In the above technical solution, in the step 6), the quantitative accuracy is a major concern. Relative error can be used to measure the quantitative effect based on the actual and predicted concentrations of the standard protein.
Where cAmount refers to the calculated protein concentration. tAmount refers to the actual protein concentration.
The present invention also provides a large-scale quantification apparatus for the absolute amount of protein, comprising: the device comprises a protein identification and peptide fragment signal intensity calculation module, a peptide fragment quantitative efficiency prediction module and a protein absolute quantitative value prediction module; wherein
The protein identification and peptide fragment signal intensity calculation module utilizes protein identification software to complete basic analysis work of a spectrogram, and utilizes a peptide fragment mass spectrum signal extraction tool to calculate mass spectrum signal intensity of a peptide fragment.
The peptide fragment quantitative efficiency prediction module comprises the following parts:
1) constructing a training set of a peptide quantitative efficiency prediction model;
2) training a peptide quantitative efficiency prediction model;
3) predicting the peptide fragment quantitative efficiency of the tested peptide fragments.
The protein absolute quantitative value prediction module predicts the quantitative value of the protein by using the peptide quantitative value and the peptide quantitative efficiency. This module can be implemented at two levels (protein level and peptide segment level).
The invention has the following advantages:
1. The peptide fragment quantitative efficiency is used for absolute protein quantification for the first time. The peptide fragment quantitative efficiency is a quantitative characterization of the peptide fragment spectrum signal. It is certain that absolute protein quantification will be increasingly used in the future.
2. Online training of the model. The method constructs the training set from high-confidence peptide fragments and trains the peptide fragment quantitative efficiency model online, which reduces the influence of factors such as experimental operation and instruments on absolute protein quantification.
3. The correction of the original signal intensity of the mass spectrum is considered from two levels (protein level and peptide segment level), thereby exploring a more accurate protein absolute quantification method.
Drawings
FIG. 1 is a flow chart of the method for absolute protein quantification according to the present invention based on the efficiency of peptide fragment quantification;
FIG. 2 is a schematic diagram of the model fitted to the actual and predicted concentrations of the standard proteins; the R-squared between the actual and predicted concentrations is 0.9308.
Detailed Description
The invention is further described with reference to the following figures and detailed description.
Assume there is a protein sample to which standard proteins of known concentration have been added (the standard proteins can be omitted if step 5) and step 6) of the present solution are not needed). The sample is analyzed according to the classical bottom-up protocol. First, the protein mixture is digested with trypsin to form a peptide mixture solution, which is then separated by high performance liquid chromatography, and experimental tandem mass spectrometry data are generated by mass spectrometry. The tandem mass spectrometry data contain three dimensions of information: chromatographic retention time, ion mass-to-charge ratio, and mass spectrum response signal intensity. Next, the mass spectrometry data are analyzed with MaxQuant (ref: Cox, J. & Mann, M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat Biotechnol 26, 1367-1372 (2008)). The software extracts the mass spectrum signal intensities of the peptide fragments from the spectrum data and determines which peptide fragments are present in the spectra, from which the proteins present are inferred. Experimental results show that the mass spectrum signal intensity of a peptide fragment has no direct linear relationship with its actual concentration, probably because the intensity depends strongly on the physicochemical properties of the peptide fragment. Therefore, to perform absolute protein quantification accurately, the mass spectrum response intensity of the peptide fragments must be corrected.
The following describes a specific implementation of the method of the present invention, with reference to fig. 1, in conjunction with the aforementioned examples.
First, the peptide fragments identified for each protein are examined. Only peptide fragments of proteins with at least 5 unique peptide fragments are considered when constructing the training set. A unique peptide fragment is a peptide fragment that is present in only one protein group among all identified proteins. The term "protein group" as used herein refers to a collection of homogeneous proteins obtained by protein assembly. The counterpart of a unique peptide fragment is a shared peptide fragment. For example,
suppose proteins A, B and C are identified, where the identified peptide fragments of protein A are a and b, those of protein B are b and c, and those of protein C are c and d. Then b is a shared peptide fragment, since it is present in both protein A and protein B. Similarly, c is also a shared peptide fragment. The peptide fragments a and d are unique peptide fragments.
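The distinction between unique and shared peptide fragments, and the screening of high-confidence proteins, can be sketched in Python as follows. This is a minimal illustration, assuming each protein forms its own protein group; the function names are hypothetical.

```python
from collections import defaultdict


def classify_peptides(protein_to_peptides):
    """Split peptides into unique (seen in exactly one protein group)
    and shared (seen in more than one)."""
    owners = defaultdict(set)
    for protein, peptides in protein_to_peptides.items():
        for pep in peptides:
            owners[pep].add(protein)
    unique = {pep for pep, prots in owners.items() if len(prots) == 1}
    shared = set(owners) - unique
    return unique, shared


def high_confidence_proteins(protein_to_peptides, min_unique=5):
    """Proteins with at least `min_unique` unique peptides (N = 5 in the text)."""
    unique, _ = classify_peptides(protein_to_peptides)
    return sorted(
        prot for prot, peps in protein_to_peptides.items()
        if len(unique & set(peps)) >= min_unique
    )


# The A/B/C example from the text.
identified = {"A": ["a", "b"], "B": ["b", "c"], "C": ["c", "d"]}
unique, shared = classify_peptides(identified)
print(sorted(unique), sorted(shared))
# A threshold of 1 is used only so that the toy example returns something.
print(high_confidence_proteins(identified, min_unique=1))
```

With the toy data, `a` and `d` come out as unique and `b` and `c` as shared, matching the example in the text.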
Next, the peptide fragments in the training set are characterized. A peptide is essentially an ordered sequence of amino acids. In one common representation, each amino acid is denoted by an upper-case letter; for example, alanine is represented by the letter A and cysteine by the letter C. A peptide fragment can thus be represented as a string of letters. The peptide fragment ARNDCEQK is used below to illustrate the characterization. The length of the peptide fragment is 8. Trypsin cleaves a protein sequence into peptide fragments at the C-terminal side of lysine or arginine, so a lysine (K) or arginine (R) occurring in the interior (not at the C-terminus) of a peptide fragment is generally regarded as a missed cleavage. The cleavage state of a peptide fragment can strongly affect its mass spectrum signal, so the number of missed cleavage sites in the peptide fragment is also an important feature. For example, the peptide fragment ARNDCEQK contains one missed cleavage site, R. Adding the masses of the amino acids in the peptide fragment gives a peptide mass of 963.43 Da. Twenty amino acids are commonly found in biology, and the invention represents the amino acid composition of a peptide fragment by a 20-dimensional amino acid frequency vector. For example, with a fixed ordering of the amino acids, the number of occurrences of each amino acid in ARNDCEQK is counted (each of the eight occurs exactly once) and divided by the peptide length 8, so the feature value at the position of each of these amino acids is 1/8, and the feature values at the remaining amino acid positions are 0.
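The sequence-derived features discussed above (length, missed cleavages, mass, amino acid frequencies) can be sketched in Python as follows. The function names are hypothetical. One assumption is worth flagging: the 963.43 Da figure is reproduced here by adding one water molecule and one proton to the sum of monoisotopic residue masses, which is an interpretation rather than something the text states.

```python
# Monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # added once per peptide (terminal H and OH)
PROTON = 1.007276   # assumption: the quoted mass is for the singly protonated peptide
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # fixed ordering for the frequency vector


def missed_cleavages(peptide):
    """Count K or R residues that are not at the C-terminus."""
    return sum(1 for aa in peptide[:-1] if aa in "KR")


def protonated_mass(peptide):
    """Residue masses plus water plus one proton."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def aa_frequencies(peptide):
    """20-dimensional amino acid frequency vector in AMINO_ACIDS order."""
    return [peptide.count(aa) / len(peptide) for aa in AMINO_ACIDS]


pep = "ARNDCEQK"
print(len(pep), missed_cleavages(pep), round(protonated_mass(pep), 2))
print(aa_frequencies(pep))
```

The missed-cleavage count here ignores any special handling of proline after K or R, since the text does not mention it.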
According to the knowledge in the AAindex database, there are 544 kinds of physical and chemical properties for each amino acid, and the peptide segment is characterized by averaging the quantitative characteristics of the amino acids in the peptide segment. For example: it is assumed that 544 physicochemical properties of each amino acid in the peptide stretch ARNDCEQK are:
Finally, the last 20 physicochemical properties of the peptide fragments are calculated according to the literature (Braisted, J.C. et al. BMC Bioinformatics 9, 529 (2008); Webb-Robertson, B.J. et al. Bioinformatics 26, 1677-1683 (2010); Eyers, C.E. et al. Mol Cell Proteomics 10, M110.003384 (2011); Tang, H. et al. Bioinformatics 22, e481-488 (2006)). Note that these features are calculated not only from the amino acid sequence of the peptide fragment itself but also from the adjacent amino acid sequences flanking the peptide fragment.
The BART model is then trained according to the method for calculating peptide fragment quantitative efficiency proposed in patent application No. 2018102163139, entitled "method for predicting the quantitative efficiency of peptide fragments in proteomics".
Similar to the method for calculating the quantitative characteristics of the peptide fragments, the quantitative characteristics of all the identified peptide fragments of all the identified proteins are calculated and then are brought into the BART model obtained by training, so that the quantitative efficiency of all the identified peptide fragments can be obtained.
The method of the invention corrects the raw mass spectrum signal intensity with the peptide fragment quantitative efficiency to calculate the absolute quantitative value of a protein, and this can be done at two levels: the peptide level and the protein level.
The peptide fragment level correction refers to:
the protein layer correction refers to:
where $\mathrm{LFAQ}_{pep}$ and $\mathrm{LFAQ}_{pro}$ denote the absolute protein quantitative values obtained with peptide-level and protein-level correction, respectively; $\mathrm{PepInt}_i$ is the spectrum signal intensity of the i-th peptide fragment; $Q_i$ is the quantitative efficiency of the i-th peptide fragment; and N is the number of unique peptide fragments identified for the protein.
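To make the two correction levels concrete, the following Python sketch shows one way they could be computed from the quantities defined above. The two formulas in the code are assumptions made for illustration only and are not taken from this text: the peptide-level version averages the individually corrected intensities, and the protein-level version divides the summed intensity by the summed quantitative efficiencies. The function names are hypothetical.

```python
def _check(intensities, q_values):
    """Basic input validation shared by both sketches."""
    if not intensities or len(intensities) != len(q_values):
        raise ValueError("need equally long, non-empty intensity and Q lists")
    if any(q <= 0 for q in q_values):
        raise ValueError("quantitative efficiencies must be positive")


def lfaq_peptide_level(intensities, q_values):
    """ASSUMED form: (1/N) * sum_i PepInt_i / Q_i.
    Each peptide intensity is corrected by its own Q before averaging."""
    _check(intensities, q_values)
    corrected = [inten / q for inten, q in zip(intensities, q_values)]
    return sum(corrected) / len(corrected)


def lfaq_protein_level(intensities, q_values):
    """ASSUMED form: sum_i PepInt_i / sum_i Q_i.
    The correction is applied once, to the protein-level totals."""
    _check(intensities, q_values)
    return sum(intensities) / sum(q_values)


# Hypothetical intensities and predicted quantitative efficiencies.
pep_int = [100.0, 300.0]
q = [0.5, 1.0]
print(lfaq_peptide_level(pep_int, q))
print(lfaq_protein_level(pep_int, q))
```

Under these assumed forms the two levels generally give different values, since the peptide-level version weights every peptide equally after correction while the protein-level version is dominated by the peptides with the largest intensities and efficiencies.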
After the quantitative values of all proteins are obtained, the loading concentrations of the other proteins can be predicted if standard proteins of known concentration were added to the biological sample beforehand. The method is illustrated here with the UPS2 standard (Proteomics Dynamic Range Standard, Sigma-Aldrich). UPS2 contains proteins at 6 different amounts (50000 fmol, 5000 fmol, 500 fmol, 50 fmol, 5 fmol, 0.5 fmol), with 8 different proteins at each amount. First, a linear model is trained on the actual concentrations and calculated quantitative values of the standard proteins. Because a single fitted model inevitably introduces bias, the method of the invention uses a bootstrapping sampling strategy. Each draw randomly takes an equal number of samples from all samples to train the model, and 10,000 draws are made in total. FIG. 2 is a schematic diagram of the model fitted to the actual and calculated concentrations of the UPS2 proteins. Once the linear model is obtained, the concentrations of the other proteins can be predicted.
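The bootstrapped linear calibration described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: samples are drawn with replacement (the usual bootstrap convention, which the text does not spell out), the interval is taken from percentiles of the resampled fits, and the function names and toy data are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def percentile(sorted_vals, q):
    """Simple rank-based percentile on a pre-sorted list, q in [0, 100]."""
    return sorted_vals[round(q / 100 * (len(sorted_vals) - 1))]


def bootstrap_line(xs, ys, n_draws=10000, level=68, seed=0):
    """Return ((slope_lo, slope_hi), (intercept_lo, intercept_hi))."""
    if len(set(xs)) < 2:
        raise ValueError("need at least two distinct x values")
    rng = random.Random(seed)
    n = len(xs)
    slopes, intercepts = [], []
    while len(slopes) < n_draws:
        idx = [rng.randrange(n) for _ in range(n)]  # draw n samples with replacement
        bx = [xs[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # degenerate resample: no line can be fitted
        slope, intercept = fit_line(bx, [ys[i] for i in idx])
        slopes.append(slope)
        intercepts.append(intercept)
    slopes.sort()
    intercepts.sort()
    lo, hi = (100 - level) / 2, 100 - (100 - level) / 2
    return ((percentile(slopes, lo), percentile(slopes, hi)),
            (percentile(intercepts, lo), percentile(intercepts, hi)))


# Toy, perfectly linear data: x = quantitative value, y = known concentration.
quant = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
conc = [2.0 * x + 1.0 for x in quant]
slope, intercept = fit_line(quant, conc)
slope_ci, intercept_ci = bootstrap_line(quant, conc, n_draws=1000)
print(slope_ci, intercept_ci)
print(slope * 7.0 + intercept)  # predicted concentration for a new quantitative value
```

With real data spanning several orders of magnitude, the values would typically be transformed (for example to a log scale) before fitting; whether and how that is done is left to the caller here.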
Finally, the invention also provides a method for measuring the quantitative accuracy. The relative error is calculated from the actual and predicted concentration values of the protein.
Where cAmount refers to the calculated protein concentration. tAmount refers to the actual protein concentration.
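For illustration, the relative error between a calculated and an actual concentration can be computed as in the Python sketch below. The formula used is the conventional definition, |cAmount - tAmount| / tAmount, which is assumed here; the function name and the numbers are hypothetical.

```python
def relative_error(c_amount, t_amount):
    """ASSUMED conventional form: |cAmount - tAmount| / |tAmount|."""
    if t_amount == 0:
        raise ValueError("actual concentration must be non-zero")
    return abs(c_amount - t_amount) / abs(t_amount)


# Hypothetical example: predicted 55 fmol for a standard loaded at 50 fmol.
print(relative_error(55.0, 50.0))
```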
Thus far, the above-described operation of the present invention has completed calculation and evaluation of absolute quantification of proteins.
The present invention also provides a large-scale quantification apparatus for the absolute amount of protein, comprising: the device comprises a protein identification and peptide fragment signal intensity calculation module, a peptide fragment quantitative efficiency prediction module and a protein absolute quantitative value prediction module; the protein identification and peptide fragment signal intensity calculation module utilizes protein identification software to complete basic analysis work of a spectrogram, and utilizes a peptide fragment mass spectrum signal extraction tool to calculate mass spectrum signal intensity of a peptide fragment.
The peptide fragment quantitative efficiency prediction module comprises the following parts:
1) constructing a training set of a peptide quantitative efficiency prediction model;
2) training a peptide quantitative efficiency prediction model;
3) predicting the peptide fragment quantitative efficiency of the tested peptide fragments.
The protein absolute quantitative value prediction module predicts the quantitative value of the protein by using the peptide quantitative value and the peptide quantitative efficiency. This module can be implemented at two levels (protein level and peptide segment level).
Finally, it should be noted that the above embodiments only illustrate the technical solutions of the present invention and do not limit them. Although the present invention has been described in detail with reference to the embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the spirit and scope of the invention as defined in the appended claims.
Claims (4)
1. A method for label-free absolute quantification of proteins comprises the steps of:
1) selecting a plurality of peptide fragments from the screened high-reliability protein as high-reliability peptide fragments, and calculating the physicochemical property and the peptide fragment quantitative efficiency of each high-reliability peptide fragment;
2) constructing a training set of a model for predicting the quantitative efficiency of the peptide fragments based on the physicochemical properties of the high-reliability peptide fragments and the quantitative efficiency of the peptide fragments; then, training by using the training set to obtain a peptide fragment quantitative efficiency prediction model;
3) predicting the peptide fragment quantitative efficiency of each identified peptide fragment of the identified protein by using the peptide fragment quantitative efficiency prediction model;
4) calculating the quantitative value of the identified protein according to the original mass spectrum signal intensity of each identified peptide fragment of the identified protein and the quantitative efficiency of the corresponding peptide fragment; wherein one formula gives the absolute protein quantitative value corrected at the peptide level, $\mathrm{LFAQ}_{pep}$, and another formula gives the absolute protein quantitative value corrected at the protein level, $\mathrm{LFAQ}_{pro}$; $\mathrm{PepInt}_i$ is the spectrum signal intensity of the i-th peptide fragment of the high-reliability protein, $Q_i$ is the quantitative efficiency of the i-th peptide fragment of the high-reliability protein, and N is the number of unique peptide fragments identified for the high-reliability protein;
5) training by using the concentration of the standard protein and the calculated quantitative value to obtain a linear model; then, predicting the protein concentration value of the protein to be detected according to the protein quantitative value of the protein to be detected and the linear model; and randomly drawing an equal amount of samples in all samples to train the linear model each time by using a bootstrapping sampling strategy to obtain the slope, intercept and confidence interval of the linear model.
2. The method of claim 1, wherein the high-reliability protein is a protein comprising at least N unique peptide fragments; a unique peptide fragment is a peptide fragment that appears in only one protein group among the identified proteins; a protein group is a set of homogeneous proteins obtained after protein assembly.
3. The method of claim 1, wherein a formula is used to evaluate the quantitative accuracy of the obtained protein concentration value of the protein to be detected, yielding the evaluation value Relative error; wherein cAmount is the calculated protein concentration value of the protein to be detected, and tAmount is the actual protein concentration value of the protein to be detected.
4. The method of claim 1, wherein the peptide fragment quantitative efficiency prediction model is a Bayesian additive regression tree model.
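Steps 4) and 5) of claim 1 and the accuracy check of claim 3 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim formulas are not reproduced in this text, so the sketch assumes that the peptide-level value averages PepInt_i / Q_i over the N peptide fragments, that the protein-level value divides the summed intensity by the summed efficiency, and that the relative error is |cAmount - tAmount| / tAmount. The function names, the percentile confidence interval, and the choice of fitting untransformed values are also assumptions made for illustration.

```python
import numpy as np

def lfaq_values(pep_int, q):
    """Return (peptide-level, protein-level) corrected values for one protein.

    ASSUMPTION: the claim formulas are missing from the text; these two
    aggregation forms are illustrative guesses, not the patent's formulas.
    """
    pep_int = np.asarray(pep_int, dtype=float)  # PepInt_i: peptide intensities
    q = np.asarray(q, dtype=float)              # Q_i: quantitative efficiencies
    lfaq_pep = float(np.mean(pep_int / q))      # correct each peptide, then average
    lfaq_pro = float(pep_int.sum() / q.sum())   # correct once at the protein level
    return lfaq_pep, lfaq_pro

def bootstrap_linear_fit(quant, conc, n_boot=1000, alpha=0.05, seed=0):
    """Fit conc = slope * quant + intercept on bootstrap resamples.

    Each round draws len(quant) samples with replacement, refits the line,
    and the spread of the fits gives percentile confidence intervals.
    """
    quant = np.asarray(quant, dtype=float)
    conc = np.asarray(conc, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(quant)
    fits = []
    while len(fits) < n_boot:
        idx = rng.integers(0, n, size=n)        # resample with replacement
        if np.unique(quant[idx]).size < 2:
            continue                            # a line needs two distinct x values
        slope, intercept = np.polyfit(quant[idx], conc[idx], 1)
        fits.append((slope, intercept))
    fits = np.array(fits)
    lo, hi = 100 * alpha / 2, 100 * (1 - alpha / 2)
    return {
        "slope": float(fits[:, 0].mean()),
        "intercept": float(fits[:, 1].mean()),
        "slope_ci": tuple(np.percentile(fits[:, 0], [lo, hi])),
        "intercept_ci": tuple(np.percentile(fits[:, 1], [lo, hi])),
    }

def predict_concentration(model, quant_value):
    """Predict a concentration from a quantitative value and the fitted line."""
    return model["slope"] * quant_value + model["intercept"]

def relative_error(c_amount, t_amount):
    """ASSUMED form of the claim 3 metric: |cAmount - tAmount| / tAmount."""
    return abs(c_amount - t_amount) / t_amount
```

In use, `bootstrap_linear_fit` would be trained on the standard proteins only, and `predict_concentration` would then be applied to the quantitative value of each protein to be detected. Whether the values are log-transformed before fitting is not stated in the claims and is left to the caller here.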
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810226692.XA CN110310706B (en) | 2018-03-19 | 2018-03-19 | Label-free absolute quantitative method for protein |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110310706A (en) | 2019-10-08 |
CN110310706B (en) | 2020-08-18 |
Family
ID=68073289
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810226692.XA Active CN110310706B (en) | 2018-03-19 | 2018-03-19 | Label-free absolute quantitative method for protein |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110310706B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114093415B (en) * | 2021-11-19 | 2022-06-03 | 中国科学院数学与系统科学研究院 | Peptide fragment detectability prediction method and system |
CN114400049B (en) * | 2022-01-17 | 2024-06-07 | 腾讯科技(深圳)有限公司 | Training method and device for peptide fragment quantitative model, computer equipment and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2010078455A (en) * | 2008-09-26 | 2010-04-08 | Japan Science & Technology Agency | Method for separating/identifying peptide in proteomics analysis |
CN107328842A (en) * | 2017-06-05 | 2017-11-07 | 华东师范大学 | Based on mass spectrogram without mark protein quantitation methods |
Non-Patent Citations (4)
Title |
---|
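"Research and Application of Quantitative Proteomics Algorithms"; Chang Cheng; China Doctoral Dissertations Full-text Database, Basic Sciences; 2015-11-15; chapters 1-6 * |
"Research on Methods for Predicting Protein Interactions"; Shi Mingguang; China Doctoral Dissertations Full-text Database, Basic Sciences; 2009-10-15; chapters 1-5 * |
Belinda Hernández et al. "Bayesian Additive Regression Trees using Bayesian Model Averaging". arXiv. 2015. *
Chang Cheng. "Research and Application of Quantitative Proteomics Algorithms". China Doctoral Dissertations Full-text Database, Basic Sciences. 2015. *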
Also Published As
Publication number | Publication date |
---|---|
CN110310706A (en) | 2019-10-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Blein-Nicolas et al. | Thousand and one ways to quantify and compare protein abundances in label-free bottom-up proteomics | |
Duncan et al. | The pros and cons of peptide-centric proteomics | |
Nesvizhskii et al. | Analysis, statistical validation and dissemination of large-scale proteomics datasets generated by tandem MS | |
Vandenbogaert et al. | Alignment of LC‐MS images, with applications to biomarker discovery and protein identification | |
Sandin et al. | Data processing methods and quality control strategies for label-free LC–MS protein quantification | |
Matthiesen | Methods, algorithms and tools in computational proteomics: a practical point of view | |
Blueggel et al. | Bioinformatics in proteomics | |
US10354421B2 (en) | Apparatuses and methods for annotated peptide mapping | |
CN113785362A (en) | Automatic detection of boundaries in mass spectrometry data | |
US20180088094A1 (en) | Multiple attribute monitoring methodologies for complex samples | |
Fenyö et al. | Mass spectrometric protein identification using the global proteome machine | |
Eidhammer et al. | Computational and statistical methods for protein quantification by mass spectrometry | |
JP3707010B2 (en) | General-purpose multicomponent simultaneous identification and quantification method in chromatograph / mass spectrometer | |
MacCoss | Computational analysis of shotgun proteomics data | |
Kislinger et al. | Multidimensional protein identification technology: current status and future prospects | |
Anjo et al. | Use of recombinant proteins as a simple and robust normalization method for untargeted proteomics screening: exhaustive performance assessment | |
CN110310706B (en) | Label-free absolute quantitative method for protein | |
JP4953175B2 (en) | Method for improving quantitative accuracy in chromatograph / mass spectrometer | |
CN108491690B (en) | Method for predicting quantitative efficiency of peptide fragment in proteomics | |
WO2002021139A2 (en) | Automated identification of peptides | |
CN112805560A (en) | Sugar chain structure analysis device and sugar chain structure analysis program | |
Feng et al. | Selected reaction monitoring to measure proteins of interest in complex samples: a practical guide | |
CN109243527B (en) | Enzyme digestion probability-assisted peptide fragment detectability prediction method | |
CN103439441A (en) | Peptide identification method based on subset error rate estimation | |
Wan et al. | ComplexQuant: high-throughput computational pipeline for the global quantitative analysis of endogenous soluble protein complexes using high resolution protein HPLC and precision label-free LC/MS/MS |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||