CN116386712A - Epitope prediction method and device based on antigen protein dynamic space structure - Google Patents

Publication number
CN116386712A
Authority
CN
China
Prior art keywords
epitope
antigen protein
amino acid
mean square
root mean
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310135045.9A
Other languages
Chinese (zh)
Other versions
CN116386712B (en
Inventor
李静
何建锋
梁国龍
樊欣迎
刘月峰
闻亚磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Bokangjian Gene Technology Co ltd
Beijing Institute of Technology BIT
Original Assignee
Beijing Bokangjian Gene Technology Co ltd
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Bokangjian Gene Technology Co ltd, Beijing Institute of Technology BIT filed Critical Beijing Bokangjian Gene Technology Co ltd
Priority to CN202310135045.9A priority Critical patent/CN116386712B/en
Publication of CN116386712A publication Critical patent/CN116386712A/en
Application granted granted Critical
Publication of CN116386712B publication Critical patent/CN116386712B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B15/00ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16CCOMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C10/00Computational theoretical chemistry, i.e. ICT specially adapted for theoretical aspects of quantum chemistry, molecular mechanics, molecular dynamics or the like
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention discloses an epitope prediction method and device based on the dynamic spatial structure of an antigen protein. The method uses conventional or stretching (steered) molecular dynamics simulation to obtain a trajectory of the antigen protein in its natural state or in a specific intermediate state, extracts dynamic spatial structures from the trajectory, calculates the solvent accessible area, charge distribution and root mean square fluctuation from these structures, constructs a profile of a composite epitope metric parameter, and identifies the epitopes of the antigen protein from that profile. Addressing the current lack of epitope prediction techniques based on dynamic structures, the invention provides a rapid, efficient and comprehensive epitope prediction method of practical value.

Description

Epitope prediction method and device based on antigen protein dynamic space structure
Technical Field
The invention relates to an epitope prediction method, in particular to an epitope prediction method based on a dynamic spatial structure of an antigen protein, and belongs to the fields of immunoinformatics and epitope prediction.
Background
Epitopes are the parts of protein antigens that are specifically recognized and bound by antibodies and/or sensitized lymphocytes to produce an immune response. An epitope can consist of a contiguous amino acid fragment of the antigen protein or of discontinuous amino acids. Epitope research has broad application prospects in vaccine design, immunodiagnosis, new drug development and related fields. Experimental methods such as X-ray crystallography, phage-displayed random peptide libraries and overlapping peptide synthesis have been used to screen and identify epitopes, but they are time-consuming and labor-intensive, which limits the research and application of epitopes. Computer-aided epitope prediction is fast, efficient and inexpensive, and is an important means of epitope screening and identification. Because of the diversity and complexity of epitopes, however, existing epitope prediction techniques remain limited, and predicting epitopes effectively is still an open practical problem.
An epitope is a specific sequence and spatial structure formed by several amino acids of the antigen protein, and is closely related to the physicochemical properties and structure of the antigen protein. However, most immunoinformatics prediction methods determine epitopes from the amino acid sequence of the antigen protein, using parameters derived mainly from database statistics, such as hydrophilicity, solvent accessibility, antigenicity and affinity. The three-dimensional structure of an antigen protein is more conserved than its amino acid sequence, so techniques based on the spatial structure can predict epitopes more accurately; at present, however, few such techniques exist.
The invention aims to solve the above problems in epitope prediction and application by providing an epitope prediction method based on the dynamic spatial structure of an antigen protein. The method has practical value for the research and application of epitopes.
Disclosure of Invention
In view of the current lack of epitope prediction techniques based on spatial structure, the invention aims to provide an epitope prediction method based on the dynamic spatial structure of an antigen protein.
The method of the invention uses conventional or stretching (steered) molecular dynamics simulation to obtain a trajectory of the antigen protein in its natural state or in a specific intermediate state, extracts dynamic spatial structures from the trajectory, calculates the solvent accessible area, charge distribution and root mean square fluctuation from these structures, constructs a profile of a composite epitope metric parameter, and identifies the epitopes of the antigen protein from that profile.
In order to achieve the above purpose, the invention adopts the following technical scheme:
The first aspect of the invention provides an epitope prediction method based on the dynamic spatial structure of an antigen protein, which comprises the following steps:
(1) obtaining a crystal structure of the antigen protein as the initial structure, and carrying out, in sequence, solvation, energy minimization, NVT equilibration and NPT equilibration;
(2) carrying out molecular dynamics simulation of the antigen protein to obtain the dynamic structure of the natural state or of a specific intermediate state;
(3) calculating a plurality of epitope metric parameters of the antigen protein from the dynamic structure extracted in step (2), wherein the parameters include, but are not limited to, structural fragment flexibility, antigenicity, accessibility, hydrophilicity, plasticity, charge distribution and secondary structure; preferably, 3 epitope metric parameters are calculated: solvent accessible area, charge distribution and root mean square fluctuation;
(4) constructing a profile of the composite epitope metric parameter from the epitope metric parameter values obtained in step (3), and determining the epitopes of the antigen protein.
In a specific embodiment of the present invention, step (3) comprises the steps of:
(a) Calculating the average solvent accessible area (SASA) of each amino acid side chain in the antigen protein by adopting the dynamic structure extracted in the step (2), and drawing a distribution diagram of the solvent accessible area relative to the amino acids;
(b) Analyzing the charge of each amino acid in the antigen protein by adopting the dynamic structure extracted in the step (2), and drawing a distribution diagram of the charge relative to the amino acid;
(c) Calculating the root mean square fluctuation of each amino acid in the antigen protein by adopting the dynamic structure extracted in the step (2), and drawing a distribution diagram of the root mean square fluctuation relative to the amino acid;
preferably, the solvent accessible area is calculated by the rolling-ball method, with the radius of the water probe sphere as given in Figure BDA0004085158050000021;
preferably, the root mean square fluctuation (RMSF) of an amino acid is described by the root mean square fluctuation of its backbone Cα atom;
wherein RMSF is expressed by the following formula:
$$\mathrm{RMSF} = \sqrt{\frac{1}{K}\sum_{i=1}^{K}\left(r_i - \bar{r}\right)^2}$$
wherein K is the number of structures, $r_i$ is the coordinate of the backbone Cα atom of a given amino acid in the i-th structure, and $\bar{r}$ is the average of the coordinates of this backbone Cα atom.
In a specific embodiment of the present invention, step (4) comprises the steps of:
(i) drawing profiles of the 3 epitope metric parameters from the solvent accessible area, charge and root mean square fluctuation obtained in step (3);
preferably, the average values of the solvent accessible area, charge and root mean square fluctuation over amino acids 2 to (N-1) of the antigen protein are calculated, and each average is taken as the reference value 1.0; the solvent accessible area, charge and root mean square fluctuation of each amino acid are divided by the corresponding average to give the rescaled solvent accessible area, charge and root mean square fluctuation; then, fragments of L consecutive amino acids are taken in sequence from the N-terminus to the C-terminus of the antigen protein, and the averages of the rescaled solvent accessible area, charge and root mean square fluctuation within each fragment are assigned to the ((L+1)/2)-th amino acid of that fragment, i.e. its central amino acid, as the averaged epitope metric parameters; the averages of these 3 epitope metric parameters are plotted as a function of amino acid position;
wherein N is the number of amino acids of the antigen protein;
wherein L is preferably 7, i.e. fragments of 7 consecutive amino acids are taken;
(ii) constructing a profile of the composite epitope metric parameter from the profiles of the 3 epitope metric parameters obtained in step (i), and determining the epitopes of the antigen protein;
preferably, the profile of the composite epitope metric parameter is obtained by weighted addition of the 3 averaged profiles plotted in step (i); the epitopes of the antigen protein are identified by analyzing the local maxima of this composite profile.
In a specific embodiment of the present invention, step (1) comprises the steps of:
downloading an experimental structure of the antigen protein from the Protein Data Bank (PDB) as the initial structure, selecting a force field and a water model, and generating a topology file of the antigen protein; defining a solvent box, adding water molecules and counter ions, and setting the salt concentration; performing energy minimization on the solvated antigen protein system, and then performing an NVT equilibration simulation followed by an NPT equilibration simulation;
preferably, the force field and water model are an all-atom force field and the TIP3P water model.
In a specific embodiment of the invention, step (2) obtains the trajectory data of the molecular dynamics simulation and extracts the dynamic structure of the natural state or of a specific intermediate state;
preferably, the molecular dynamics simulation is carried out with the GROMACS and PLUMED software;
preferably, the dynamic structure of the natural state is obtained by performing a conventional molecular dynamics simulation of the antigen protein for a time T_0 (T_0 ≥ 40 ns) and extracting the trajectory data from 1 ns to T_0;
preferably, the dynamic structure of a specific intermediate state is obtained by performing a stretching (steered) molecular dynamics simulation of the antigen protein with a stretching time T_1 (T_1 ≥ 10 ns) followed by an equilibration time T_2 (T_2 ≥ 40 ns), and extracting structures from the T_2 equilibration trajectory;
preferably, the stretching molecular dynamics is achieved by applying an external harmonic potential to the antigen protein, expressed as:
$$V(\vec{r}, \lambda) = \frac{\kappa}{2}\left[\xi(\vec{r}) - \lambda\right]^2$$
where λ is a control parameter defining the path, $\vec{r}$ is the coordinate vector of the system, ξ is the reaction coordinate, and κ is the force constant of the harmonic potential;
preferably, when the PLUMED software is used for the stretching molecular dynamics simulation of the antigen protein, the force constant of the harmonic potential takes a large integer value M, and the reaction coordinate consists of one or more collective variables;
preferably, the collective variables are selected according to the structural characteristics of the antigen protein; further preferably, the collective variables are selected from the number of native contacts (C_N), radius of gyration (Rg), root mean square deviation (RMSD), helical content (H_C) and sheet content (S_C).
Preferably, the antigenic protein is selected from a protein or polypeptide drug molecule.
Preferably, the epitope is a linear epitope or a nonlinear epitope.
In a second aspect, the present invention provides an apparatus for epitope prediction based on the dynamic spatial structure of an antigen protein, which comprises a storage medium storing a program and/or a model for implementing the prediction method according to the first aspect of the present invention.
In a third aspect, the invention provides the use of a predictive method according to the first aspect or a device according to the second aspect of the invention for analysing an epitope of a protein or polypeptide drug molecule.
In a fourth aspect, the invention provides the use of a predictive method according to the first aspect or a device according to the second aspect of the invention for assaying a protein or polypeptide drug molecule for immunogenicity or resistance.
Advantageous effects
In view of the current research and application situations of antigen epitopes, the epitope prediction method based on the dynamic space structure of antigen protein has the following beneficial effects:
1. compared with traditional experimental methods, the method of the invention predicts epitopes by computer simulation and is fast, efficient and inexpensive;
2. compared with sequence-based prediction methods, the method of the invention predicts epitopes from the more conserved spatial structure, which helps to improve prediction accuracy;
3. compared with prediction methods based on a static crystal structure, the method of the invention predicts epitopes from dynamic structures; it considers both the natural state and intermediate states as well as the dynamic changes of the structure in a solvent environment, which helps to determine epitopes comprehensively and accurately;
4. the method of the invention can predict both linear and nonlinear epitopes;
5. the method is suitable for identifying the epitopes of both large antigen proteins and small polypeptide drug molecules, which facilitates analysis of the immunogenicity of biological drugs and is of practical value for understanding the resistance of organisms to such drugs.
drawings
FIG. 1 is a flow chart of an epitope prediction method based on the dynamic spatial structure of an antigen protein;
FIG. 2 is a flow chart showing epitope prediction based on the natural dynamic spatial structure of exenatide when the epitope prediction method based on the dynamic spatial structure of antigen protein of the present invention is implemented in particular;
FIG. 3 is a graph showing the average solvent accessible area of the amino acid side chains, the net charge carried, and the root mean square fluctuation of the backbone C.alpha.atoms of exenatide relative to amino acids;
FIG. 4 is a profile of mean solvent accessible area, net charge carried, and root mean square fluctuation of backbone C.alpha.atoms of the 7 peptide fragment amino acid side chains of exenatide in its native state;
FIG. 5 is a 3-parameter complex profile of a 7-peptide fragment of exenatide in its native state;
FIG. 6 is a predicted epitope of exenatide;
FIG. 7 is a 3-parameter complex profile of a 7-peptide fragment of exenatide in an intermediate state;
Detailed Description
For a better description of the objects and advantages of the present method, reference should be made to the accompanying drawings and detailed description of the embodiments of the invention.
The invention relates to an epitope prediction method based on an antigen protein dynamic space structure, which specifically comprises the following steps:
step one: carrying out equilibration simulations, up to NPT equilibration, on the downloaded experimental structure of the antigen protein, specifically:
downloading an experimental structure of the antigen protein from the Protein Data Bank (PDB) as the initial structure, selecting a force field and a water model, and generating a topology file of the antigen protein; defining a solvent box, adding water molecules and counter ions, and setting the salt concentration; performing energy minimization on the solvated antigen protein system, and then performing an NVT equilibration simulation followed by an NPT equilibration simulation;
wherein the force field and water model are preferably an all-atom force field and the TIP3P water model;
step two: molecular dynamics simulation is carried out on antigen proteins to obtain dynamic structures in a natural state or a specific intermediate state, and the method specifically comprises the following steps:
starting from the NPT equilibration simulation of step one, carrying out molecular dynamics simulation of the antigen protein system to obtain data files such as the simulation trajectory, and extracting the dynamic structure of the natural state or of a specific intermediate state;
wherein the molecular dynamics simulation is preferably carried out with the GROMACS and PLUMED software;
wherein the dynamic structure of the natural state is obtained by performing a conventional molecular dynamics simulation of the antigen protein for a time T_0 (T_0 ≥ 40 ns) and extracting the trajectory data from 1 ns to T_0;
wherein the dynamic structure of a specific intermediate state is obtained by performing a stretching (steered) molecular dynamics simulation of the antigen protein with a stretching time T_1 (T_1 ≥ 10 ns) followed by an equilibration time T_2 (T_2 ≥ 40 ns), and extracting structures from the T_2 equilibration trajectory;
the stretching molecular dynamics drags the antigen protein to a specific folding intermediate state by applying an external harmonic potential to the antigen protein, and the harmonic potential is expressed by formula (1):
$$V(\vec{r}, \lambda) = \frac{\kappa}{2}\left[\xi(\vec{r}) - \lambda\right]^2 \qquad (1)$$
where λ is a control parameter defining the path, $\vec{r}$ is the coordinate vector of the system, ξ is the reaction coordinate, and κ is the force constant of the harmonic potential;
when the PLUMED software is used for the stretching molecular dynamics simulation of the antigen protein, the force constant of the harmonic potential takes a large integer value M, and the reaction coordinate consists of one or more collective variables;
wherein the collective variables are selected according to the structural characteristics of the antigen protein, preferably from the number of native contacts (C_N), radius of gyration (Rg), root mean square deviation (RMSD), helical content (H_C), sheet content (S_C) and the like.
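The harmonic restraint of step two can be illustrated with a minimal sketch. It assumes the usual steered-MD form V = (κ/2)(ξ − λ)² for a single scalar reaction coordinate, and a restraint centre λ that moves linearly during the stretching time T_1 and is then held fixed; the linear schedule and the function names `harmonic_restraint` and `lambda_schedule` are illustrative assumptions, not taken from the patent or from PLUMED.

```python
from typing import Tuple


def harmonic_restraint(xi: float, lam: float, kappa: float) -> Tuple[float, float]:
    """Energy and force along xi for the assumed form V = kappa/2 * (xi - lam)**2."""
    delta = xi - lam
    energy = 0.5 * kappa * delta * delta
    force = -kappa * delta  # -dV/dxi, pulls xi towards the restraint centre lam
    return energy, force


def lambda_schedule(lam_start: float, lam_end: float, t: float, t_pull: float) -> float:
    """Assumed schedule: move lam linearly over t_pull, then hold it at lam_end."""
    frac = min(max(t / t_pull, 0.0), 1.0)
    return lam_start + frac * (lam_end - lam_start)


if __name__ == "__main__":
    # Hypothetical numbers: pull a collective variable from 0.0 to 2.0 over 10 ns.
    for t_ns in (0.0, 5.0, 10.0, 20.0):
        lam = lambda_schedule(0.0, 2.0, t_ns, 10.0)
        print(t_ns, lam, harmonic_restraint(1.5, lam, 1000.0))
```

A large force constant κ keeps ξ close to λ, which is consistent with the preference above for a large value M.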
Step three: calculating epitope measurement parameters of the antigen protein by adopting the dynamic structure extracted in the step two, and specifically comprising the following steps:
step three, 1: calculating the average solvent accessible area (SASA) of each amino acid side chain in the antigen protein by adopting the dynamic structure extracted in the step two, and drawing a distribution diagram of the solvent accessible area relative to the amino acid;
wherein the solvent accessible area is calculated by the rolling-ball method, with the radius of the water probe sphere as given in Figure BDA0004085158050000075;
step three, 2: analyzing the charge of each amino acid in the antigen protein using the dynamic structure extracted in step two, and drawing the distribution of the charge with respect to amino acid;
wherein the unit of charge is 1 electron charge (e), positive charge is recorded as + and negative charge as -;
step three, 3: calculating the root mean square fluctuation (RMSF) of each amino acid in the antigen protein using the dynamic structure extracted in step two, and drawing the distribution of the root mean square fluctuation with respect to amino acid;
wherein the root mean square fluctuation of an amino acid is described by the root mean square fluctuation of its backbone Cα atom;
wherein RMSF is expressed by formula (2):
$$\mathrm{RMSF} = \sqrt{\frac{1}{K}\sum_{i=1}^{K}\left(r_i - \bar{r}\right)^2} \qquad (2)$$
wherein K is the number of structures, $r_i$ is the coordinate of the backbone Cα atom of a given amino acid in the i-th structure, and $\bar{r}$ is the average of the coordinates of this backbone Cα atom;
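The RMSF of one backbone Cα atom can be sketched as follows, assuming the standard definition (square root of the mean squared deviation from the average position over K structures) and assuming the structures have already been superimposed on a common reference so that overall translation and rotation do not contribute. The function name `rmsf` and the toy coordinates are illustrative only.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsf(positions: Sequence[Coord]) -> float:
    """RMSF of one atom from its coordinates in K pre-aligned structures."""
    k = len(positions)
    if k == 0:
        raise ValueError("at least one structure is required")
    # Average position of the atom over the K structures.
    mean = [sum(p[d] for p in positions) / k for d in range(3)]
    # Mean squared deviation from the average position.
    msd = sum(
        sum((p[d] - mean[d]) ** 2 for d in range(3)) for p in positions
    ) / k
    return math.sqrt(msd)


if __name__ == "__main__":
    # Toy example: one atom observed in two structures, 2 units apart along x.
    print(rmsf([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]))
```

Applying the function to each amino acid's Cα coordinates in turn gives the per-residue distribution described in step three, 3.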
step four: constructing a profile of the composite epitope metric parameter from the 3 epitope metric parameters obtained in step three, and determining the epitopes of the antigen protein, specifically:
step four, 1: drawing profiles of the 3 epitope metric parameters from the solvent accessible area, charge and root mean square fluctuation obtained in step three, specifically:
calculating the average values of the solvent accessible area, charge and root mean square fluctuation over amino acids 2 to (N-1) of the antigen protein, and taking each average as the reference value 1.0; dividing the solvent accessible area, charge and root mean square fluctuation of each amino acid by the corresponding average to give the rescaled solvent accessible area, charge and root mean square fluctuation; then taking fragments of L consecutive amino acids in sequence from the N-terminus to the C-terminus of the antigen protein, and assigning the averages of the rescaled solvent accessible area, charge and root mean square fluctuation within each fragment to the ((L+1)/2)-th amino acid of that fragment, i.e. its central amino acid, as the averaged epitope metric parameters; plotting the averages of these 3 epitope metric parameters as a function of amino acid position;
wherein N is the number of amino acids of the antigen protein;
wherein L is preferably 7, i.e. fragments of 7 consecutive amino acids are taken;
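The rescaling and fragment averaging of step four, 1 can be sketched for a single parameter as below. The sketch assumes an odd fragment length L so that each fragment has a central amino acid, and it leaves positions without a full fragment undefined (`None`); the function names `rescale` and `window_profile` are hypothetical.

```python
from typing import List, Optional, Sequence


def rescale(values: Sequence[float]) -> List[float]:
    """Divide each value by the mean over residues 2..N-1 (both termini excluded)."""
    inner = values[1:-1]
    if not inner:
        raise ValueError("need at least three residues")
    reference = sum(inner) / len(inner)
    if reference == 0:
        raise ValueError("reference mean is zero; rescaling is undefined")
    return [v / reference for v in values]


def window_profile(values: Sequence[float], length: int = 7) -> List[Optional[float]]:
    """Average each fragment of `length` residues and assign it to the central residue."""
    if length % 2 == 0 or length > len(values):
        raise ValueError("length must be odd and no longer than the sequence")
    half = length // 2
    profile: List[Optional[float]] = [None] * len(values)
    for start in range(len(values) - length + 1):
        profile[start + half] = sum(values[start:start + length]) / length
    return profile


if __name__ == "__main__":
    toy = [10.0, 1.0, 2.0, 3.0, 4.0, 5.0, 20.0]  # toy per-residue values
    print(window_profile(rescale(toy), length=3))
```

With L = 7 and N residues, the profile is defined for residues 4 to N-3.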
step four, 2: constructing a profile of the composite epitope metric parameter from the profiles of the 3 epitope metric parameters obtained in step four, 1, and determining the epitopes of the antigen protein, specifically:
the profiles of the averages of the 3 epitope metric parameters as a function of amino acid, plotted in step four, 1, are added with weights to obtain the profile of the composite epitope metric parameter; the epitopes of the antigen protein are identified by analyzing the local maxima of this composite profile;
wherein the weight factor W of each epitope metric parameter is optionally a number in the interval [0, 1]; in the example, the weights W1, W2 and W3 of the 3 epitope metric parameters are all 1.
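The weighted addition and the search for local maxima in step four, 2 might look like the sketch below. It assumes the three profiles have equal length and use `None` where no fragment average is defined, and it reports a local maximum wherever a value strictly exceeds both neighbours. How a maximum is widened into an epitope region is not specified in the text, so the sketch stops at the peak positions; the function names are hypothetical.

```python
from typing import List, Optional, Sequence

Profile = Sequence[Optional[float]]


def composite_profile(profiles: Sequence[Profile], weights: Sequence[float]) -> List[Optional[float]]:
    """Weighted sum of per-residue profiles; undefined where any input is undefined."""
    if len(profiles) != len(weights):
        raise ValueError("one weight per profile is required")
    combined: List[Optional[float]] = []
    for column in zip(*profiles):
        if any(v is None for v in column):
            combined.append(None)
        else:
            combined.append(sum(w * v for w, v in zip(weights, column)))
    return combined


def local_maxima(profile: Profile) -> List[int]:
    """1-based residue numbers whose value strictly exceeds both defined neighbours."""
    peaks = []
    for i in range(1, len(profile) - 1):
        left, mid, right = profile[i - 1], profile[i], profile[i + 1]
        if left is None or mid is None or right is None:
            continue
        if mid > left and mid > right:
            peaks.append(i + 1)
    return peaks


if __name__ == "__main__":
    a = [None, 1.0, 3.0, 1.0, None]  # toy profiles, not data from the patent
    b = [None, 0.5, 0.5, 2.0, None]
    total = composite_profile([a, b], [1.0, 1.0])
    print(total, local_maxima(total))
```

Setting all weights to 1, as in the example, reduces the composite profile to a plain sum of the three rescaled profiles.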
So far, from the first step to the fourth step, the epitope prediction method based on the dynamic space structure of the antigen protein is completed.
Example 1
This example details the epitope prediction method based on the dynamic spatial structure of an antigen protein as applied to epitope prediction based on the natural-state dynamic spatial structure of exenatide.
The specific implementation process of epitope prediction based on the natural-state dynamic spatial structure of exenatide is as follows:
step 1: carrying out equilibration simulations, up to NPT equilibration, on the downloaded experimental structure of exenatide, specifically:
the NMR structure of exenatide, PDB entry 1JRJ, was downloaded from the Protein Data Bank as the initial structure of the natural state; the CHARMM36 force field and the TIP3P water model were selected to generate the topology file; a cubic solvent box was defined, water molecules and Na+ counter ions were added, and the salt concentration was set to 100 mM NaCl; the solvated exenatide system was subjected to 10 ps of energy minimization, followed by NVT and NPT equilibration simulations of 100 ps each;
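As a rough illustration of step 1, the sketch below only assembles the GROMACS command lines for topology generation, box definition, solvation and ion addition; it does not run them. The file names, the 1.0 nm box padding and the force-field directory name `charmm36` are assumptions (CHARMM36 is normally installed as a separate force-field port whose directory name varies), and `ions.mdp` is a hypothetical parameter file.

```python
from typing import List


def build_prep_commands(
    pdb: str = "1JRJ.pdb",
    force_field: str = "charmm36",  # assumed name of an installed force-field port
    water: str = "tip3p",
    conc_molar: float = 0.1,        # 100 mM NaCl
) -> List[List[str]]:
    """Return gmx command lines for system preparation (not executed here)."""
    return [
        ["gmx", "pdb2gmx", "-f", pdb, "-o", "protein.gro", "-p", "topol.top",
         "-ff", force_field, "-water", water],
        # Cubic box, protein centred, 1.0 nm padding (padding is an assumption).
        ["gmx", "editconf", "-f", "protein.gro", "-o", "boxed.gro",
         "-c", "-d", "1.0", "-bt", "cubic"],
        ["gmx", "solvate", "-cp", "boxed.gro", "-cs", "spc216.gro",
         "-o", "solvated.gro", "-p", "topol.top"],
        ["gmx", "grompp", "-f", "ions.mdp", "-c", "solvated.gro",
         "-p", "topol.top", "-o", "ions.tpr"],
        # Neutralise the system and add NaCl up to the requested concentration.
        ["gmx", "genion", "-s", "ions.tpr", "-o", "ionized.gro", "-p", "topol.top",
         "-pname", "NA", "-nname", "CL", "-neutral", "-conc", str(conc_molar)],
    ]


if __name__ == "__main__":
    for command in build_prep_commands():
        print(" ".join(command))
```

Energy minimization and the NVT and NPT equilibration runs would follow as further `gmx grompp` and `gmx mdrun` calls with their own parameter files, which are not specified in the text.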
step 2: carrying out molecular dynamics simulation of exenatide to obtain the natural-state dynamic structure, specifically:
starting from the NPT equilibration simulation of the solvated exenatide system in step 1, a 100 ns conventional molecular dynamics simulation of the exenatide system was carried out with the GROMACS software to obtain data files such as the simulation trajectory, and the structures in the 1 ns to 100 ns portion of the trajectory were extracted as the natural-state dynamic structure;
step 3: adopting the natural state dynamic structure extracted in the step 2 to calculate epitope measurement parameters of exenatide, and specifically comprising the following steps:
step 3.1: calculating the average solvent accessible area of the side chains of the 39 amino acids of exenatide and drawing the distribution of the solvent accessible area, specifically:
calculating the average solvent accessible area of each amino acid side chain with the sasa program of the GROMACS software from the dynamic structure extracted in step 2, and drawing the distribution of the solvent accessible area with respect to amino acid;
wherein FIG. 3 (a) shows the distribution of the solvent accessible area of exenatide with respect to amino acid; as can be seen, the average solvent accessible area of the side chains of amino acids 1, 3, 5-6, 12-17, 20-22, 27-28, 32-33, 36 and 39 is greater than 1 nm², while that of the other amino acid side chains is smaller than 1 nm²;
step 3.2: analyzing the charges of the 39 amino acids of exenatide and drawing the charge distribution;
wherein FIG. 3 (b) shows the distribution of the charge of exenatide with respect to amino acid; as can be seen, amino acids 1, 12, 20 and 27 each carry 1 unit of positive charge, amino acids 3, 9, 15-17, 24 and 39 each carry 1 unit of negative charge, and the other amino acids are uncharged; the fragment of amino acids 12-20 is the region where charges are most densely distributed;
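The charge assignment of step 3.2 can be written down directly from the residue numbers quoted above. The short sketch below builds the per-residue charge list for the 39 amino acids and counts the charged residues in the 12-20 fragment; the helper name `charge_vector` is hypothetical, and whether the later profile uses signed or absolute charges is not stated in the text.

```python
from typing import Iterable, List


def charge_vector(n_residues: int, positive: Iterable[int], negative: Iterable[int]) -> List[int]:
    """Per-residue charge in units of e, from 1-based residue numbers."""
    charges = [0] * n_residues
    for residue in positive:
        charges[residue - 1] = 1
    for residue in negative:
        charges[residue - 1] = -1
    return charges


# Residue numbers as quoted for FIG. 3 (b).
POSITIVE = [1, 12, 20, 27]
NEGATIVE = [3, 9, 15, 16, 17, 24, 39]

charges = charge_vector(39, POSITIVE, NEGATIVE)
charged_in_12_20 = sum(1 for r in range(12, 21) if charges[r - 1] != 0)

if __name__ == "__main__":
    print("net charge:", sum(charges))
    print("charged residues in 12-20:", charged_in_12_20)
```

Five of the nine residues in the 12-20 fragment are charged, which matches the description of this fragment as the most densely charged region.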
step 3.3: calculating the root mean square fluctuation of the 39 amino acids of exenatide and drawing the distribution of the root mean square fluctuation, specifically:
from the dynamic structure extracted in step 2, the coordinates of the backbone Cα atoms of the 39 amino acids were extracted, the root mean square fluctuation of each backbone Cα atom was calculated with the RMSF formula (2) of step three, 3, the backbone Cα atoms were numbered consistently with the amino acids, and the root mean square fluctuation of the Cα atoms was plotted;
wherein FIG. 3 (c) shows the root mean square fluctuation of the backbone Cα atoms of exenatide; as can be seen, the root mean square fluctuation of the backbone Cα atoms of amino acids 1-7, 32-34 and 39 is greater than 0.2 nm, and that of the other backbone Cα atoms is smaller than 0.2 nm; this indicates that amino acids 1-7, 32-34 and 39 are more flexible;
step 4: drawing a contour map of the composite epitope measurement parameters according to the 3 epitope measurement parameters obtained in the step 3, and determining the epitope of the exenatide in the natural state, wherein the specific steps are as follows:
step 4.1: drawing a profile diagram of 3 epitope measurement parameters according to the solvent accessibility area, charge and root mean square fluctuation obtained in the step 3, wherein the profile diagram is specifically as follows:
removing the head and tail amino acids of exenatide, and calculating the average value of solvent accessible area, charge and root mean square fluctuation of amino acids 2-38 respectively, wherein the average value is taken as a reference 1.0; dividing the solvent accessible area, charge and root mean square fluctuation of each amino acid by its average value as a remeasured solvent accessible area, charge and root mean square fluctuation; sequentially calculating the average value of the re-measured solvent accessible area, charge and root mean square fluctuation of the 7 peptide fragments from the nitrogen end to the carbon end of the exenatide, and taking the average value as the average value of epitope measurement parameters of the central amino acid of the fragments; plotting the average of these 3 epitope metric parameters as a function of amino acid;
wherein FIG. 4(a) is a profile of the average solvent accessible area of the amino acid side chains of the 7-peptide fragments of exenatide in its natural state; the figure shows two maximum regions, located at 4-6 and 12-15, respectively;
wherein FIG. 4(b) is a profile of the net charge carried by the amino acids of the 7-peptide fragments of exenatide in its natural state; as can be seen, one main maximum region is formed at 12-19;
wherein FIG. 4(c) is a profile of the root mean square fluctuation of the main-chain Cα atoms of the 7-peptide fragments of exenatide in its natural state; region 4-10 shows a pronounced maximum, the root mean square fluctuation in region 10-31 is small, and region 32-36 is a minor extremum region;
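The rescaling and 7-peptide averaging of step 4.1 can be sketched as follows. This is a minimal illustration, not the implementation used in this embodiment: the function names are hypothetical, NumPy is assumed, and the sketch assumes that the interior mean of each parameter is non-zero and that residues too close to either terminus to be the centre of a fragment are simply left undefined.

```python
import numpy as np

def rescale_by_interior_mean(values):
    """Divide a per-residue parameter by its mean over residues 2..N-1,
    so that the interior mean becomes the reference value 1.0."""
    values = np.asarray(values, dtype=float)
    interior_mean = values[1:-1].mean()  # first and last residues excluded
    return values / interior_mean

def window_profile(values, window=7):
    """Average each run of `window` consecutive residues and assign the
    result to the central residue of the run. Residues that cannot be a
    window centre are returned as NaN (an assumption of this sketch)."""
    if window % 2 == 0:
        raise ValueError("window must be odd so that a central residue exists")
    values = np.asarray(values, dtype=float)
    half = window // 2
    profile = np.full(values.shape, np.nan)
    means = np.convolve(values, np.ones(window) / window, mode="valid")
    profile[half:len(values) - half] = means
    return profile

if __name__ == "__main__":
    # Toy per-residue values for a 39-residue peptide (not real data).
    rng = np.random.default_rng(0)
    toy_sasa = rng.uniform(0.2, 1.5, size=39)
    print(window_profile(rescale_by_interior_mean(toy_sasa)))
```

With a window of 7, the first defined position is residue 4 and the last is residue N-3, which matches the residue range over which the profiles in FIG. 4 report maxima.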
step 4.2: drawing the composite epitope measurement parameter profile from the profiles of the 3 epitope measurement parameters obtained in step 4.1, and determining the epitopes of exenatide in the natural state, specifically as follows:
the profiles of the averages of the 3 epitope measurement parameters as a function of amino acid position, drawn in step 4.1, are added with weights to obtain the composite epitope measurement parameter profile; the epitope of exenatide in the natural state is determined by analyzing the local maxima of this composite profile;
wherein FIG. 5 shows the composite parameter profile of exenatide in its natural state, composed of the average solvent accessible area of the amino acid side chains, the net charge carried and the root mean square fluctuation of the main-chain Cα atoms of the 7-peptide fragments; as can be seen, regions 4-6 and 12-19 of the composite parameter profile contain local maxima;
wherein FIG. 6 shows the epitopes of exenatide in its natural state; as shown in the figure (natural state complex in the figure), amino acids 1-6 and 12-19 are the epitopes of exenatide in the natural state; amino acids 1-6 have a large solvent accessible area and good flexibility, while amino acids 12-19 have a large solvent accessible area and a dense charge distribution;
wherein, in order to compare the predictions of single and composite parameters, FIG. 6 also shows the epitopes predicted using solvent accessible area, charge and root mean square fluctuation individually as the epitope measurement parameter; as can be seen from the figure, the epitopes predicted from solvent accessible area alone are amino acids 1-6 and 12-15 (natural state SASA in the figure), the epitopes predicted from charge distribution alone are amino acids 12-19 (natural state Charge in the figure), and the epitopes predicted from root mean square fluctuation alone are amino acids 1-10 (natural state RMSF in the figure); the comparison shows that the prediction from the 3-parameter composite profile covers the results obtained with solvent accessible area, charge distribution and root mean square fluctuation used individually; this shows that the composite parameter prediction method of the invention identifies epitopes more comprehensively and with finer resolution.
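The weighted addition of step 4.2 and a possible readout of the maximum regions can be sketched as follows. The weights and the threshold are assumptions: this embodiment states only that the three profiles are added with weights and that local maxima of the composite profile are analyzed, so equal weights and a simple "above threshold" rule are used here purely for illustration, and the function names are hypothetical.

```python
import numpy as np

def composite_profile(profiles, weights=None):
    """Weighted sum of per-parameter profiles.
    `profiles` has shape (n_parameters, n_residues)."""
    profiles = np.asarray(profiles, dtype=float)
    if weights is None:
        weights = np.ones(profiles.shape[0])  # equal weights: an assumption
    weights = np.asarray(weights, dtype=float)
    return weights @ profiles

def regions_above(profile, threshold):
    """Return 1-based inclusive (start, end) residue ranges in which the
    profile exceeds `threshold`; NaN positions never count as above."""
    regions, start = [], None
    for idx, value in enumerate(profile):
        above = (not np.isnan(value)) and value > threshold
        if above and start is None:
            start = idx
        elif not above and start is not None:
            regions.append((start + 1, idx))
            start = None
    if start is not None:
        regions.append((start + 1, len(profile)))
    return regions
```

A threshold readout is only one way to turn maxima into residue ranges; a peak-finding routine with a minimum prominence would be another reasonable choice.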
Example 2
This embodiment details the epitope prediction method based on the dynamic spatial structure of an antigen protein according to the invention, as applied to epitope prediction based on the intermediate-state dynamic spatial structure of exenatide.
The specific implementation process of epitope prediction based on the intermediate dynamic spatial structure of exenatide is as follows:
step 1: downloading an experimental structure of exenatide, and performing an NPT equilibration simulation;
the antigen protein of this example is the same as that of example 1, and thus the specific content of this step is the same as that of example 1, step 1.
Step 2: carrying out a steered (stretching) molecular dynamics simulation on exenatide to obtain the dynamic structure of an intermediate state, specifically as follows:
starting from the NPT equilibration simulation of the solvated exenatide system in step 1, the PLUMED software is used to perform the steered molecular dynamics simulation described in step II; exenatide is first pulled from the natural state to the intermediate state over 20 ns, an 80 ns equilibrium simulation is then carried out, and the dynamic structure of the intermediate state is extracted from the equilibrium simulation trajectory;
the harmonic force constant of the steered molecular dynamics simulation is 30000 kJ/mol;
wherein the collective variables are the native contact number (natural association number) and the helix content; the native contact number is the ratio of the number of heavy-atom pairs separated by less than 0.85 nm in the intermediate-state dynamic structure of exenatide to the corresponding number in the experimental structure; the helix content is the number of amino acids contained in helical structure in the intermediate-state dynamic structure of exenatide;
wherein the native contact number of the intermediate state of exenatide is 0.54, and the helix content is 13.3.
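The contact-based collective variable used in step 2 can be illustrated with a short sketch. It rests on one interpretation of the definition above: heavy-atom pairs closer than 0.85 nm in the experimental structure are treated as native contacts, and the value reported is the fraction of those pairs still within the cutoff in a given frame. The function names are hypothetical, coordinates are assumed to be in nm, and no exclusion of sequence neighbours is applied; PLUMED's own collective variables may differ in detail.

```python
import numpy as np

def pair_distances(coords):
    """All-against-all distance matrix for an (n_atoms, 3) coordinate array."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def native_contact_fraction(reference, frame, cutoff_nm=0.85):
    """Fraction of reference contacts (pairs closer than `cutoff_nm` in the
    reference structure) that are still closer than the cutoff in `frame`."""
    ref_d = pair_distances(reference)
    frm_d = pair_distances(frame)
    upper = np.triu_indices(len(ref_d), k=1)  # count each atom pair once
    native = ref_d[upper] < cutoff_nm
    if not native.any():
        raise ValueError("no native contacts within the cutoff")
    kept = frm_d[upper][native] < cutoff_nm
    return kept.sum() / native.sum()
```

Under this interpretation the experimental structure itself has a value of 1.0, and the reported intermediate-state value of 0.54 would mean that roughly half of the native contacts remain.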
Step 3: calculating epitope measurement parameters of intermediate states of exenatide by adopting the dynamic structure extracted in the step 2;
the specific content of this step is the same as that of step 3 in example 1.
Step 4: determining an epitope of an intermediate state of exenatide according to the epitope measurement parameter value obtained in the step 3;
the specific content of this step is the same as that of step 4 in example 1.
wherein FIG. 7 shows the composite parameter profile of exenatide in the intermediate state, composed of the average solvent accessible area of the amino acid side chains, the net charge carried and the root mean square fluctuation of the main-chain Cα atoms of the 7-peptide fragments; in the figure, 4, 6, 12-18 and 25-27 are maximum regions.
wherein FIG. 6 shows the epitopes of exenatide in the intermediate state; as shown in the figure (intermediate complex in the figure), amino acids 1-4, 6, 12-18 and 25-27 are the epitopes of exenatide in the intermediate state.
Example 3
The present embodiment compares the predicted outcome with the outcome of other computational prediction tools. Table 1 shows the epitopes of exenatide predicted by the different prediction tools.
The first 6 methods are sequence-based and require only the FASTA file of the antigen as input, so they cannot distinguish between the epitopes of the natural state and the intermediate state of exenatide. Comparing their predictions with the natural-state epitopes of exenatide from Example 1 shows that the predicted epitopes fall in three regions of exenatide: the N-terminus, the middle helix and the C-terminal extension (Gly29-Ser39). Our predicted N-terminal epitope (His1-Phe6) is covered by the predictions of 4 of these tools, and the epitopes predicted by all 6 tools overlap in part with our predicted epitope in the middle helical region (Asp9-Val19). Notably, the epitopes predicted by the method of the present application agree closely with the N-terminal and middle-helix epitopes predicted by Parker's method, which uses HPLC-derived hydrophilicity parameters.
Discotope and EPSVR are structure-based prediction methods, and their input is the exenatide structure file (1JRJ.pdb) downloaded from the PDB. For the natural state of exenatide, the N-terminal epitope predicted by the method of the present application is completely covered by the results of Discotope and EPSVR, and the middle-helix epitopes predicted by both tools overlap with our prediction. Furthermore, the epitope predicted by EPSVR does not include the C-terminal extension of exenatide, consistent with the present application. In the natural structure of exenatide, the hydrophobic side chains of the C-terminal extension (Gly29-Ser39) are buried inwards to form a hydrophobic cluster and are therefore less likely to interact with antibody molecules, so the C-terminal extension (Gly29-Ser39) of exenatide is less likely to contain an epitope. This comparative analysis indicates that the predictions of the present invention are reasonable.
Table 1: Epitopes of exenatide predicted by several different tools.
Prediction tool               Prediction result (residue numbers)
Bepipred                      1-9,11-16,28-39
Chou & Fasman                 1-11,27-39
Emini                         9-15,17,35-39
Karplus & Schulz              1-12,27-39
Kolaskar & Tongaonkar         8-10,18-25,33-39
Parker                        1-6,8-18,29-39
Discotope                     1-16,27-37,39
EPSVR                         1-10,13,16-20
The method of the invention   1-6,9-19
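The overlap statements made about Table 1 can be checked with a small script. The sketch below parses the residue ranges exactly as listed in the table and reports, for each tool, the fraction of the residues predicted by the method of the invention that the tool also predicts. This coverage measure and the names `parse_ranges` and `coverage` are illustrative assumptions; the embodiment itself compares the ranges only qualitatively.

```python
def parse_ranges(text):
    """Turn a string such as '1-6,9-19' into a set of residue numbers."""
    residues = set()
    for part in text.split(","):
        part = part.strip()
        if "-" in part:
            start, end = part.split("-")
            residues.update(range(int(start), int(end) + 1))
        else:
            residues.add(int(part))
    return residues

# Residue ranges copied from Table 1.
TABLE_1 = {
    "Bepipred": "1-9,11-16,28-39",
    "Chou & Fasman": "1-11,27-39",
    "Emini": "9-15,17,35-39",
    "Karplus & Schulz": "1-12,27-39",
    "Kolaskar & Tongaonkar": "8-10,18-25,33-39",
    "Parker": "1-6,8-18,29-39",
    "Discotope": "1-16,27-37,39",
    "EPSVR": "1-10,13,16-20",
}
METHOD = parse_ranges("1-6,9-19")  # the method of the invention

def coverage(tool_ranges, reference=METHOD):
    """Fraction of the reference residues that the tool also predicts."""
    tool = parse_ranges(tool_ranges)
    return len(tool & reference) / len(reference)

if __name__ == "__main__":
    for name, ranges in TABLE_1.items():
        print(f"{name:24s} {coverage(ranges):.2f}")
```

A set-based comparison of this kind ignores how the residues are grouped into contiguous epitopes, so it complements rather than replaces the region-by-region discussion above.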
The foregoing is a preferred embodiment of the present invention, and the present invention should not be limited to the embodiment and the disclosure of the drawings. All equivalents and modifications that come within the spirit of the disclosure are desired to be protected.

Claims (10)

1. An epitope prediction method based on an antigen protein dynamic space structure comprises the following steps:
(1) Obtaining a crystal structure of the antigen protein as an initial structure, and sequentially carrying out solvation, energy minimization, NVT equilibration and NPT equilibration;
(2) Carrying out molecular dynamics simulation on the antigen protein to obtain a dynamic structure of a natural state or a specific intermediate state;
(3) Calculating a plurality of epitope measurement parameters of the antigen protein by adopting the dynamic structure extracted in the step (2), wherein the parameters include, but are not limited to, structural fragment flexibility, antigenicity, accessibility, hydrophilicity, plasticity, charge distribution and secondary structure; preferably, 3 epitope measurement parameters are calculated: solvent accessible area, charge distribution and root mean square fluctuation;
(4) Constructing a profile of the composite epitope measurement parameter based on the epitope measurement parameter values obtained in the step (3), and determining the epitope of the antigen protein.
2. The method of claim 1, wherein step (3) comprises the steps of:
(a) Calculating the average solvent accessible area (SASA) of each amino acid side chain in the antigen protein by adopting the dynamic structure extracted in the step (2), and drawing a distribution diagram of the solvent accessible area relative to the amino acids;
(b) Analyzing the charge of each amino acid in the antigen protein by adopting the dynamic structure extracted in the step (2), and drawing a distribution diagram of the charge relative to the amino acid;
(c) Calculating the root mean square fluctuation of each amino acid in the antigen protein by adopting the dynamic structure extracted in the step (2), and drawing a distribution diagram of the root mean square fluctuation relative to the amino acid;
preferably, the solvent accessible area is calculated by the rolling-ball method, the radius of the water probe sphere being
Figure FDA0004085158030000012
Preferably, the Root Mean Square Fluctuation (RMSF) of an amino acid is described by the root mean square fluctuation of its main-chain Cα atom;
wherein RMSF is expressed as the following formula:
$$\mathrm{RMSF} = \sqrt{\frac{1}{K}\sum_{i=1}^{K}\left(r_i - \bar{r}\right)^2}$$
wherein K is the number of structures, $r_i$ is the coordinate of the Cα atom of a given amino acid main chain in the i-th structure, and $\bar{r}$ is the average of the coordinates of this main-chain Cα atom.
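As an illustration of the RMSF definition in this claim, the following sketch computes the per-residue RMSF from K structures as the root of the mean squared deviation of each Cα position from its average position. It assumes NumPy, coordinates in nm, and frames that have already been superimposed on a common reference; the function name and array layout are hypothetical.

```python
import numpy as np

def rmsf(ca_coords):
    """Per-residue RMSF from an array of shape (K, n_residues, 3) holding
    the C-alpha coordinates of K already-aligned structures."""
    ca_coords = np.asarray(ca_coords, dtype=float)
    mean_pos = ca_coords.mean(axis=0)                    # average position
    sq_dev = ((ca_coords - mean_pos) ** 2).sum(axis=-1)  # shape (K, n_residues)
    return np.sqrt(sq_dev.mean(axis=0))

def flexible_residues(rmsf_values, threshold_nm=0.2):
    """1-based residue numbers whose RMSF exceeds the threshold."""
    return [i + 1 for i, v in enumerate(rmsf_values) if v > threshold_nm]
```

The 0.2 nm default in `flexible_residues` mirrors the value used in Example 1 to separate the more flexible residues from the rest.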
3. The method of claim 1, wherein step (4) comprises the steps of:
(i) Drawing profiles of the 3 epitope measurement parameters from the solvent accessible area, charge and root mean square fluctuation obtained in the step (3);
preferably, the average values of the solvent accessible area, charge and root mean square fluctuation of amino acids 2 to (N-1) of the antigen protein are calculated, each average being taken as the reference value 1.0; the solvent accessible area, charge and root mean square fluctuation of each amino acid are divided by the corresponding average to give the rescaled solvent accessible area, charge and root mean square fluctuation; then, fragments of L consecutive amino acids are taken in sequence from the nitrogen terminus to the carbon terminus of the antigen protein, and the averages of the rescaled solvent accessible area, charge and root mean square fluctuation within each fragment are calculated and assigned to the ((L+1)/2)-th amino acid of the fragment as its epitope measurement parameters; the averages of these 3 epitope measurement parameters are plotted as a function of amino acid position;
wherein N is the number of amino acids of the antigen protein;
wherein L is preferably 7, that is, fragments of 7 consecutive amino acids are used;
(ii) Constructing a profile of the composite epitope measurement parameter from the profiles of the 3 epitope measurement parameters obtained in the step (i), and determining epitopes of the antigen protein;
preferably, the profile of the composite epitope measurement parameter is obtained by weighted addition of the profiles, drawn in step (i), of the averages of the 3 epitope measurement parameters as a function of amino acid position; epitopes of the antigen protein are determined by analyzing the local maxima of the profile of this composite epitope measurement parameter.
4. The method of claim 1, wherein step (1) comprises the steps of:
downloading an experimental structure of the antigen protein from the Protein Data Bank (PDB) as an initial structure, selecting a force field and a water model, and generating a topology file of the antigen protein; defining a solvent box, adding water molecules and counter ions, and setting the salt concentration; performing energy minimization on the solvated antigen protein system, and then performing an NVT equilibration simulation and an NPT equilibration simulation on the system in turn;
preferably, the force field and water model are an all-atom force field and the TIP3P water model.
5. The method of claim 1, wherein in step (2) trajectory data of the molecular dynamics simulation are obtained, and the dynamic structure of the natural state or of a specific intermediate state is extracted;
preferably, the molecular dynamics simulation uses GROMACS and PLUMED software for simulation calculations;
preferably, the dynamic structure of the natural state is obtained by performing a conventional molecular dynamics simulation of the antigen protein for a time T_0 (T_0 ≥ 40 nanoseconds) and extracting the trajectory data from 1 nanosecond to T_0;
preferably, the dynamic structure of the specific intermediate state is obtained by performing a steered (stretching) molecular dynamics simulation of the antigen protein for a time T_1 (T_1 ≥ 10 nanoseconds), followed by equilibration for a time T_2 (T_2 ≥ 40 nanoseconds), and extracting the structure from the T_2 equilibrium trajectory;
preferably, the steered (stretching) molecular dynamics is achieved by applying an external harmonic potential to the antigen protein, expressed as:
$$V(\vec{r}, \lambda) = \frac{\kappa}{2}\left[\xi(\vec{r}) - \lambda\right]^2$$
where λ is a control parameter defining the path, $\vec{r}$ is the coordinate vector of the system, ξ is the reaction coordinate, and κ is the force constant of the harmonic potential;
preferably, when the PLUMED software is used for the steered (stretching) molecular dynamics simulation of the antigen protein, the force constant of the harmonic potential takes a large integer value M, and the reaction coordinate is composed of one or more collective variables;
preferably, the collective variables are selected according to the structural characteristics of the antigen protein; further preferably, the collective variables are selected from the group consisting of the native contact number (C_N), radius of gyration (Rg), root mean square deviation (RMSD), helix content (H_C) and sheet content (S_C).
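The role of the control parameter λ and the force constant κ in this claim can be illustrated numerically. The sketch below assumes the standard harmonic form 0.5·κ·(ξ − λ)² for the external potential and a linear schedule that moves λ from a starting value to a target value over the pulling phase; both the linear schedule and the function names are assumptions made for illustration and are not taken from the claim.

```python
def restraint_center(step, total_steps, start, target):
    """Value of the control parameter lambda at a given step, moved
    linearly from `start` to `target` (linear schedule is an assumption)."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + frac * (target - start)

def harmonic_bias(xi, lam, kappa):
    """Harmonic bias energy 0.5 * kappa * (xi - lambda)^2."""
    return 0.5 * kappa * (xi - lam) ** 2

if __name__ == "__main__":
    kappa = 30000.0           # force constant quoted in Example 2
    start, target = 1.0, 0.54 # illustrative start; target is the C_N of Example 2
    for step in (0, 5, 10):
        lam = restraint_center(step, 10, start, target)
        # Bias felt by a system whose collective variable is still at `start`.
        print(step, round(lam, 3), round(harmonic_bias(start, lam, kappa), 1))
```

With a large force constant, even a small lag between the collective variable and λ produces a large bias energy, which is why the system follows the prescribed path closely during the pulling phase.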
6. The method of claim 1, wherein the antigenic protein is selected from a protein or polypeptide drug molecule.
7. The method of claim 1, wherein the epitope is a linear epitope or a nonlinear epitope.
8. An apparatus for epitope prediction based on a dynamic spatial structure of an antigen protein, comprising a storage medium storing a program and/or model for implementing the prediction method according to any one of claims 1 to 7.
9. Use of the predictive method according to any one of claims 1 to 7 or the device according to claim 8 for the analysis of epitopes of protein or polypeptide drug molecules.
10. Use of the predictive method according to any one of claims 1 to 7 or the device according to claim 8 for the analysis of the immunogenicity or resistance of a protein or polypeptide drug molecule.
CN202310135045.9A 2023-02-20 2023-02-20 Epitope prediction method and device based on antigen protein dynamic space structure Active CN116386712B (en)

Publications (2)

Publication Number Publication Date
CN116386712A true CN116386712A (en) 2023-07-04
CN116386712B CN116386712B (en) 2024-02-09




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant