CN117577219A - Protein-drug binding free energy prediction method and prediction system based on MM/PB (GB) SA - Google Patents
- Publication number
- CN117577219A (application number CN202210935272.5A)
- Authority
- CN
- China
- Prior art keywords
- free energy
- protein
- ligand
- binding
- drug
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/50—Molecular design, e.g. of drugs
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/30—Prediction of properties of chemical compounds, compositions or mixtures
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/70—Machine learning, data mining or chemometrics
Abstract
The invention discloses a protein-drug binding free energy prediction method and prediction system based on MM/PB (GB) SA. The prediction method comprises the following steps: predicting the binding free energy of the protein and its ligand using parameters comprising at least one of the force field of the protein, the force field of the ligand and the charge number of the ligand; fitting each of the multiple groups of predicted binding free energies to the experimentally measured binding free energy; and analyzing the binding free energy of the protein and the drug with the best-fitting set of parameters and methods; wherein the method for predicting the binding free energy is MM/PBSA and/or MM/GBSA. The method takes into account the various situations that free energy prediction may encounter, is highly robust and accurate, and is applicable to both water-soluble proteins and membrane proteins. Compared with the traditional free energy perturbation technique and thermodynamic integration, the precision is improved and the speed is 40-50 times faster.
Description
Technical Field
The invention relates to the technical field of drug screening, in particular to a protein-drug binding free energy prediction method and prediction system based on MM/PB (GB) SA.
Background
Predicting drug activity is one of the most challenging topics in new drug development today. In biological systems, the binding free energy determines the direction of many biological processes, such as protein folding, enzyme catalysis and drug-target binding. Binding free energy prediction therefore plays an indispensable role in these fields. In drug discovery, the binding of a drug to its biological target is the basis of drug efficacy: the affinity of the drug directly determines its biological activity, and the binding free energy is a quantitative measure of the affinity between drug and target. Therefore, in the lead optimization stage, using a computer to predict the affinity between a candidate drug and its biological target can provide theoretical guidance for optimizing the lead compound and accelerate drug discovery.
Currently, three methods are commonly used in drug discovery to predict receptor-ligand interactions: free energy perturbation (FEP), thermodynamic integration (TI), and molecular mechanics Poisson-Boltzmann (generalized Born) surface area (MM/PB (GB) SA). MM/PB (GB) SA is widely used for ligand-receptor free energy prediction because it is faster, more robust and less demanding of computational resources than the other two methods.
Disclosure of Invention
In view of the background art, the problem solved by the invention is to improve the accuracy of binding free energy prediction for protein drug targets and drug ligands while keeping computing resource consumption low, thereby saving time and cost in drug research and development. Based on MM/PB (GB) SA theory, the accuracy of the binding free energy calculated by MM/PB (GB) SA is further improved by searching for suitable calculation parameters in advance. In the test phase, the method consumes fewer computing resources while achieving accuracy comparable to that of FEP.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
In one aspect, the invention provides a method for predicting protein-drug binding free energy based on MM/PB (GB) SA, comprising the following steps:
(1) Predicting the binding free energy of the protein and its ligand using parameters comprising at least one of the force field of the protein, the force field of the ligand and the charge number of the ligand;
(2) Fitting each of the multiple groups of binding free energies predicted in step (1) to the experimentally measured binding free energy;
(3) Analyzing the binding free energy of the protein and the drug with the best-fitting set of parameters and methods from step (2);
wherein in step (1), the method for predicting the binding free energy is MM/PBSA and/or MM/GBSA.
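Steps (1) to (3) can be sketched as a simple selection loop. The sketch below is illustrative only: it assumes the "best fit" is judged by the Pearson correlation coefficient between predicted and experimental values, and the function names, parameter-set labels and numbers are hypothetical rather than taken from the patent.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def select_best(predictions, experimental):
    """Score every parameter set against experiment and return the best label.

    predictions: dict mapping a parameter-set label to its predicted
    binding free energies (one value per ligand, same order as experimental).
    """
    scores = {label: pearson_r(pred, experimental)
              for label, pred in predictions.items()}
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Hypothetical numbers in kcal/mol, for illustration only.
experimental = [-9.0, -8.0, -7.0, -6.0]
predictions = {
    "FF99SB + AM1-BCC": [-30.0, -27.0, -24.0, -21.0],
    "CHARMM + CGenFF": [-25.0, -26.0, -20.0, -24.0],
}
best_label, scores = select_best(predictions, experimental)
```

The selected parameter set would then be reused in step (3) for the drug whose binding free energy is unknown.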
In the technical scheme of the invention, MM/PB (GB) SA calculates the binding free energy as follows:

$$
\begin{aligned}
\Delta G_{\mathrm{bind,solve}} &= \Delta G_{\mathrm{bind,vacuum}} + \Delta G_{\mathrm{solve}} \\
&= \Delta E_{\mathrm{MM}} + \Delta G_{\mathrm{solve}} - T\Delta S \\
&= \Delta E_{\mathrm{MM}} + \Delta G_{\mathrm{PB(GB)}} + \Delta G_{\mathrm{SA}} - T\Delta S
\end{aligned}
$$

wherein $\Delta G_{\mathrm{bind,solve}}$ is the binding free energy in the solution environment, $\Delta G_{\mathrm{bind,vacuum}}$ is the binding free energy under vacuum conditions, and $\Delta G_{\mathrm{solve}}$ is the solvation energy; $\Delta E_{\mathrm{MM}}$ is the molecular interaction energy, $T$ is the temperature, and $\Delta S$ is the entropy change of the system; $\Delta G_{\mathrm{PB(GB)}}$ and $\Delta G_{\mathrm{SA}}$ are respectively the polar and nonpolar contribution terms of the solvation energy $\Delta G_{\mathrm{solve}}$. In the technical scheme of the invention, $\Delta G_{\mathrm{SA}}$ is proportional to the solvent-accessible surface area.
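The last line of the decomposition above is a plain sum of four terms. A minimal sketch of that sum follows; the class name, field names and numeric values are hypothetical, and all terms are assumed to share one energy unit (for example kcal/mol).

```python
from dataclasses import dataclass


@dataclass
class BindingTerms:
    """The four terms of the final line of the decomposition."""
    delta_e_mm: float     # ΔE_MM: molecular interaction energy
    delta_g_polar: float  # ΔG_PB(GB): polar solvation contribution
    delta_g_sa: float     # ΔG_SA: nonpolar solvation contribution
    t_delta_s: float      # TΔS: temperature times entropy change

    def binding_free_energy(self):
        """ΔG_bind,solve = ΔE_MM + ΔG_PB(GB) + ΔG_SA - TΔS."""
        return (self.delta_e_mm + self.delta_g_polar
                + self.delta_g_sa - self.t_delta_s)


# Hypothetical values for illustration only.
terms = BindingTerms(delta_e_mm=-60.0, delta_g_polar=35.0,
                     delta_g_sa=-5.0, t_delta_s=-12.0)
delta_g_bind = terms.binding_free_energy()
```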
In a preferred embodiment, in step (1), the ligand and the drug share the same compound skeleton and have similar crystal structures; that is, the ligand and the drug are optimized from the same lead compound, or the drug is optimized from the ligand, or the ligand is optimized from the drug. The lead compound is a compound molecule to be improved and optimized, consisting of a compound skeleton and at least one functional group. The compound skeleton is the core of the lead compound molecule, and a functional group is an atom or group of atoms that determines the chemical properties of the molecule; common functional groups include hydroxyl, carboxyl, ether, aldehyde and carbonyl groups.
In certain embodiments, in the MM/GBSA method, the commonly used GB models include GBHCT, GBOBC, GBOBC2, GBneck and GBneck2.
Preferably, the method for calculating the charge number of the ligand is selected from any one of an empirical method, a semi-empirical method and a quantum chemical calculation method. An example of an empirical ligand charge calculation method is the charge calculation based on the CHARMM General Force Field (CGenFF); an example of a semi-empirical method, which combines empirical and quantum-chemical elements, is the AM1-BCC atomic charge calculation; an example of a quantum-chemistry-based method is an electrostatic potential calculation based on density functional theory (DFT) combined with RESP (Restrained Electrostatic Potential) fitting of the atomic charges.
In the technical scheme of the invention, the force field of the ligand is determined by the method used to calculate the ligand charge number; that is, once the charge calculation method is chosen, the ligand force field is determined. In the present invention, optional ligand force fields include the General AMBER Force Field 2 (GAFF2) and the CHARMM General Force Field (CGenFF).
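The dependence of the ligand force field on the charge method can be expressed as a lookup table. The pairing below is an assumption based on common practice (CGenFF charges with CGenFF, AM1-BCC and RESP charges with GAFF2); the patent states only that the charge method fixes the force field, not which pairs were used. The method labels and the function name are hypothetical.

```python
# Assumed pairing of ligand charge method -> ligand force field.
LIGAND_FORCE_FIELD = {
    "CGenFF": "CGenFF",   # empirical charges from the CHARMM General Force Field
    "AM1-BCC": "GAFF2",   # semi-empirical charges
    "RESP-HF": "GAFF2",   # quantum-chemical charges (Hartree-Fock ESP)
    "RESP-DFT": "GAFF2",  # quantum-chemical charges (DFT ESP)
}


def ligand_force_field(charge_method):
    """Return the ligand force field implied by the chosen charge method."""
    try:
        return LIGAND_FORCE_FIELD[charge_method]
    except KeyError:
        raise ValueError(f"unknown charge method: {charge_method!r}") from None
```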
In certain specific embodiments, the protein is a membrane protein, and the parameters further comprise the membrane dielectric constant.
In a preferred embodiment, in step (2), the experiment is a wet-lab experiment.
In the technical scheme of the invention, a wet-lab experiment means that the binding free energy is measured in a laboratory by molecular, cellular, physiological or similar test methods, rather than obtained by computer simulation or bioinformatics methods; the measured values are fitted against the data predicted in step (1).
In yet another aspect, the present invention provides a protein-drug binding free energy prediction system based on MM/PB (GB) SA, comprising:
an exploration data module: for collecting and storing structural information data of the protein and its ligand, structural information data of the drug, and experimentally measured binding free energy data of the protein and its ligand;
a training data module: for predicting the binding free energy of the protein and its ligand, and fitting the predicted binding free energy to the experimental results to screen out the best-fitting set of parameters;
a prediction data module: for analyzing the binding free energy of the protein and the drug with the selected best-fitting set of parameters;
wherein, the method for predicting the binding free energy is MM/PBSA and/or MM/GBSA.
Preferably, the training data module comprises a preprocessing unit for preprocessing the protein crystal structure.
Preferably, the preprocessing includes adding hydrogen atoms, assigning protonation states and energy minimization.
In yet another aspect, the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, carries out the above-described MM/PB (GB) SA-based protein-drug binding free energy prediction method.
The technical scheme has the following advantages or beneficial effects:
The invention discloses a protein-ligand binding free energy calculation and prediction method based on MM/PB (GB) SA, which has the following advantages over FEP and TI: (1) the prediction method consumes fewer computing resources and saves computing time; (2) the prediction method is highly robust; (3) the accuracy of the prediction method is comparable to that of free energy perturbation (FEP) and thermodynamic integration (TI), while being 40-50 times faster; (4) FEP/TI methods are mainly used for water-soluble proteins and their performance on membrane proteins is unknown, whereas the method provided by the invention can predict the binding free energy of drug molecules in membrane protein systems with a certain degree of accuracy.
Drawings
FIG. 1 is a flow chart of the MM/PB (GB) SA-based protein-ligand binding free energy prediction method of the present invention.
FIG. 2 is a correlation plot of the binding free energy of water-soluble proteins and their ligands predicted by FEP against the experimentally measured binding free energy.
FIG. 3 is a correlation plot of the binding free energy of water-soluble proteins and their ligands predicted by MM/PB (GB) SA against the experimentally measured binding free energy.
FIG. 4 is a correlation plot of the binding free energy of membrane proteins and their ligands predicted by MM/PB (GB) SA against the experimentally measured binding free energy.
FIG. 5 is a ligand structure diagram of the CDK2 protein test system of the invention.
FIGS. 6-1 and 6-2 are ligand structure diagrams of the P38 protein test system of the invention.
FIG. 7 is a ligand structure diagram of the Thrombin protein test system of the invention.
FIG. 8 is a ligand structure diagram of the Tyk2 protein test system of the invention.
FIG. 9 is a ligand structure diagram of the mPGES protein test system of the invention.
FIG. 10 is a ligand structure diagram of the GPBAR protein test system of the invention.
FIG. 11 is a ligand structure diagram of the OX1 protein test system of the invention.
Detailed Description
The following examples are only some, but not all, of the examples of the invention. Accordingly, the detailed description of the embodiments of the invention provided below is not intended to limit the scope of the invention as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by a person skilled in the art without making any inventive effort, are intended to fall within the scope of the present invention.
In the present invention, all the equipment, raw materials and the like are commercially available or commonly used in the industry unless otherwise specified. The methods in the following examples are conventional in the art unless otherwise specified.
In one embodiment, a flow chart of a method of predicting free energy of protein-drug binding based on MM/PB (GB) SA is shown in FIG. 1.
Hereinafter, it is described in connection with specific operations.
Test methods and test objects in the following examples: in the water-soluble system, four proteins tested with the free energy perturbation technique (FEP) were used: CDK2, P38, Thrombin and Tyk2. The membrane protein system comprised three proteins: mPGES, GPBAR and OX1.
Construction of the test system: in the aqueous solution system, the known three-dimensional protein crystal structure is first preprocessed by adding hydrogen atoms, predicting the protonation states of the amino acids, energy minimization and similar treatments. The ligand molecules and the drug molecules to be predicted are then docked using a restrained molecular docking technique to obtain the binding mode of the drug molecules to the protein. Next, a 12 × 12 Å³ simulation box is built with the molecular dynamics tool GROMACS, and sodium chloride solution at a concentration of 0.15 M is added to neutralize excess charge, so that the whole simulation system is electrically neutral and mimics its physiological state in the cellular environment. The ligand structure diagrams of the four water-soluble protein test systems are shown in FIG. 5, FIGS. 6-1 and 6-2, FIG. 7 and FIG. 8, respectively; those of the three membrane protein test systems are shown in FIG. 9, FIG. 10 and FIG. 11, respectively.
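The box building and neutralization step might look like the following when driven from Python. This is a sketch under assumptions: it only assembles GROMACS command lines (`gmx editconf`, `gmx solvate`, `gmx grompp`, `gmx genion`) without running them, the file names are hypothetical, and the box is defined by a solute-to-wall padding whose value is a placeholder because the source does not give a legible box specification. Only the 0.15 M salt concentration comes from the text.

```python
def solvation_commands(complex_gro, topology, padding_nm, salt_molar=0.15):
    """Build the GROMACS command lines for boxing, solvating and neutralizing.

    padding_nm is the solute-to-box-wall distance in nm (an assumed way of
    defining the box). File names such as boxed.gro and ions.mdp are
    hypothetical placeholders.
    """
    return [
        # Center the complex in a cubic box with the given padding.
        ["gmx", "editconf", "-f", complex_gro, "-o", "boxed.gro",
         "-c", "-d", str(padding_nm), "-bt", "cubic"],
        # Fill the box with water and update the topology.
        ["gmx", "solvate", "-cp", "boxed.gro", "-cs", "spc216.gro",
         "-o", "solvated.gro", "-p", topology],
        # Preprocess to a run input file, which genion requires.
        ["gmx", "grompp", "-f", "ions.mdp", "-c", "solvated.gro",
         "-p", topology, "-o", "ions.tpr"],
        # Add NaCl at the target concentration and neutralize net charge.
        ["gmx", "genion", "-s", "ions.tpr", "-o", "neutral.gro",
         "-p", topology, "-pname", "NA", "-nname", "CL",
         "-neutral", "-conc", str(salt_molar)],
    ]


# Placeholder padding value, not a value from the patent.
commands = solvation_commands("complex.gro", "topol.top", padding_nm=1.2)
```

Each list can be passed to `subprocess.run`; note that `gmx genion` asks interactively which group of molecules to replace with ions.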
Predicting the binding free energy of a known protein drug target and its ligand: after the test system is established, molecular dynamics (MD) simulations on a 5 ns time scale are run to sample the dynamics of the protein drug target and the ligand molecule. The binding free energy of the protein drug target and the ligand molecule is then predicted with the thermodynamic cycle equation of MM/PB (GB) SA:

$$
\begin{aligned}
\Delta G_{\mathrm{bind,solve}} &= \Delta G_{\mathrm{bind,vacuum}} + \Delta G_{\mathrm{solve}} \\
&= \Delta E_{\mathrm{MM}} + \Delta G_{\mathrm{solve}} - T\Delta S \\
&= \Delta E_{\mathrm{MM}} + \Delta G_{\mathrm{PB(GB)}} + \Delta G_{\mathrm{SA}} - T\Delta S
\end{aligned}
$$

In the above, $\Delta G_{\mathrm{bind,solve}}$ is the binding free energy in the solution environment, $\Delta G_{\mathrm{bind,vacuum}}$ is the binding free energy under vacuum conditions, and $\Delta G_{\mathrm{solve}}$ is the solvation energy; $\Delta E_{\mathrm{MM}}$ is the molecular interaction energy, $T$ is the simulated temperature, and $\Delta S$ is the entropy change of the system; $\Delta G_{\mathrm{PB(GB)}}$ and $\Delta G_{\mathrm{SA}}$ are respectively the polar and nonpolar contribution terms of the solvation energy $\Delta G_{\mathrm{solve}}$. In the technical scheme of the invention, $\Delta G_{\mathrm{SA}}$ is proportional to the solvent-accessible surface area.
In predicting the binding free energy as described above, the different parameters of the calculation and the methods for obtaining them are tested systematically, including: the ligand charge numbers calculated by different methods, the molecular force field of the protein drug target, the molecular force field of the drug ligand, whether an additional single-point halogen model is added to treat halogens, and the use of different implicit GB models for membrane protein systems.
In one embodiment, the influence of the charge numbers obtained by different ligand charge calculation methods and of different protein molecular force fields on the binding free energy of the water-soluble proteins and the membrane proteins with their ligand molecules is tested. The ligand charge calculation methods are RESP-DFT, RESP-HF, AM1-BCC and CGenFF; the molecular force fields are CHARMM36m (abbreviated as CHARMM) and Amber FF99SB (abbreviated as FF99SB). The correlation coefficient (obs. R) and the mean absolute error (MAE) obtained by fitting, one to one, the binding free energies of the proteins and their ligands predicted with the above parameters and calculation methods to the experimentally measured binding free energies are shown in Table 1 (in the table, a higher obs. R is better and a lower MAE is better; the same applies to the tables below).
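The combinations tested in this embodiment form a small grid of charge method times protein force field. The sketch below enumerates that grid and defines the MAE metric reported alongside obs. R; the function names and the example numbers are hypothetical.

```python
from itertools import product

CHARGE_METHODS = ["RESP-DFT", "RESP-HF", "AM1-BCC", "CGenFF"]
PROTEIN_FORCE_FIELDS = ["CHARMM36m", "FF99SB"]


def parameter_grid():
    """Every (charge method, protein force field) combination to be tested."""
    return [{"charge_method": charge, "protein_ff": ff}
            for charge, ff in product(CHARGE_METHODS, PROTEIN_FORCE_FIELDS)]


def mean_absolute_error(predicted, experimental):
    """Average absolute deviation between predicted and measured values."""
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(predicted)


grid = parameter_grid()
# Hypothetical values in kcal/mol, for illustration only.
example_mae = mean_absolute_error([-8.0, -7.0], [-9.0, -6.5])
```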
In one embodiment, the influence of adding an additional single-point halogen model to treat halogens when calculating the ligand charge number, together with different protein molecular force fields, on the binding free energy of the water-soluble proteins and the membrane proteins with their ligand molecules is tested. The ligand charge calculation methods that include the single-point halogen model are RESP_DFT_EP and RESP_HF_EP, while RESP_DFT and RESP_HF do not include it (EP denotes the additional single point); the molecular force fields are CHARMM36m (abbreviated as CHARMM) and Amber FF99SB (abbreviated as FF99SB). The correlation coefficient (obs. R) and the mean absolute error (MAE) obtained by fitting, one to one, the binding free energies of the proteins and their ligands predicted with the above parameters and calculation methods to the experimentally measured binding free energies are shown in Table 2.
In one embodiment, the influence of different implicit GB models and different protein molecular force fields on the binding free energy of the membrane proteins and their ligand molecules is tested. The implicit GB models are GBHCT (igb=1), GBOBC (igb=2), GBOBC2 (igb=5), GBneck (igb=7) and GBneck2 (igb=8); the molecular force fields are CHARMM36m (abbreviated as CHARMM) and Amber FF99SB (abbreviated as FF99SB); the ligand charge number calculation methods are RESP_DFT, RESP_HF, AM1_BCC and CGenFF. The correlation coefficient (obs. R) and the mean absolute error (MAE) obtained by fitting, one to one, the predicted binding free energies of the proteins and their ligands to the experimentally measured binding free energies are shown in Table 3.
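The `igb` values listed above match the GB model selector used by Amber. As an illustration, the sketch below maps each model name to its `igb` value and writes a minimal input in the namelist style of Amber's `MMPBSA.py`. Using `MMPBSA.py` is an assumption, since the patent does not name the program; the salt concentration of 0.15 M is borrowed from the system setup, and the function name is hypothetical.

```python
# GB model name -> igb value, as listed in the embodiment.
GB_MODELS = {
    "GBHCT": 1,
    "GBOBC": 2,
    "GBOBC2": 5,
    "GBneck": 7,
    "GBneck2": 8,
}


def gb_input(model, saltcon=0.15):
    """Return a minimal MM/GBSA input (MMPBSA.py namelist style, assumed)."""
    igb = GB_MODELS[model]
    return (
        "&general\n"
        "  verbose=1,\n"
        "/\n"
        "&gb\n"
        f"  igb={igb}, saltcon={saltcon:.3f},\n"
        "/\n"
    )


# One input text per GB model, ready to be written to separate files.
gb_inputs = {model: gb_input(model) for model in GB_MODELS}
```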
In one embodiment, the influence of different protein molecular force fields on the binding free energy of the membrane proteins and their ligand molecules, taking the membrane dielectric constant into account, is tested. The membrane dielectric constant is set to emem=1 to 9 in turn; the molecular force fields are CHARMM36m (abbreviated as CHARMM) and Amber FF99SB (abbreviated as FF99SB); the ligand charge number calculation methods are RESP_DFT, RESP_HF, AM1_BCC and CGenFF. The correlation coefficient (obs. R) and the mean absolute error (MAE) obtained by fitting, one to one, the binding free energies of the proteins and their ligands predicted with the above parameters and calculation methods to the experimentally measured binding free energies are shown in Table 4 (wherein inp is the option selecting the calculation method for the nonpolar solvation free energy).
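The scan over the membrane dielectric constant can be sketched as generating one PB input per `emem` value. The namelist variables `memopt`, `emem` and `inp` follow Amber's `MMPBSA.py` `&pb` section, which is an assumed tool; the integer step of the scan, the choice `memopt=1` and the default `inp=2` are assumptions for illustration, and the function name is hypothetical.

```python
def pb_membrane_input(emem, inp=2):
    """Return a minimal &pb namelist with implicit membrane enabled (assumed)."""
    return (
        "&pb\n"
        f"  memopt=1, emem={emem:.1f}, inp={inp},\n"
        "/\n"
    )


# emem = 1 to 9, assuming integer steps.
pb_inputs = {emem: pb_membrane_input(emem) for emem in range(1, 10)}
```

Each generated input would be run with every force field and charge method combination, and the resulting obs. R and MAE compared as in the other embodiments.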
In one example, the correlation between the binding free energy of the proteins and their ligands predicted by the free energy perturbation technique (FEP) and the binding free energy measured by wet-lab experiments is shown in FIG. 2.
In one example, for the four water-soluble protein systems, the correlation between the binding free energy of the proteins and their ligands predicted by MM/PB (GB) SA and the experimentally measured binding free energy is shown in FIG. 3.
In one example, for the three membrane protein systems, the correlation between the binding free energy of the proteins and their ligands predicted by MM/PB (GB) SA and the experimentally measured binding free energy is shown in FIG. 4.
In one example, the correlation coefficients between the binding free energies measured by wet-lab experiments and the binding free energies of the proteins and their ligands predicted by molecular docking, by the MM/PB (GB) SA methods optimized in Tables 1-4, and by the free energy perturbation technique (FEP) are shown in Table 5 (the preferred data for MM/GBSA and MM/PBSA are in bold). In Table 5, the larger the absolute value, the higher the prediction accuracy.
TABLE 5
Therefore, with the prediction method provided by the invention, the MM/PB (GB) SA results after systematic parameter optimization are generally better than those of FEP, the existing gold standard.
In addition, current FEP/TI technology mostly uses the OPLS molecular force field, which cannot reproduce the physical properties of a real cell membrane well; the cell membrane may even rupture during long-time-scale molecular dynamics simulation. FEP/TI based on this force field is therefore not suitable for predicting the binding free energy of drug molecules in membrane protein systems. The CHARMM36m force field and the Amber FF99SB force field used in the invention are both very reliable membrane protein force fields, so the method is well suited to predicting the binding free energy of drug molecules in membrane protein systems. The predicted results for all three membrane protein test systems are highly consistent with the experimental values.
In terms of speed, according to published reports, FEP/TI calculations on a GTX2080TI graphics card (GPU) can predict only 0.4-0.5 molecules per day, whereas the optimized MM/PB (GB) SA can predict 20 molecules per day, which is 40-50 times faster than the FEP/TI method.
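The 40-50 times figure follows directly from the two throughput numbers quoted above, taking the ratio of molecules per day at each end of the FEP/TI range:

```latex
\frac{20\ \text{molecules/day}}{0.5\ \text{molecules/day}} = 40,
\qquad
\frac{20\ \text{molecules/day}}{0.4\ \text{molecules/day}} = 50
```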
In one embodiment, the invention provides a MM/PB (GB) SA-based protein-drug binding free energy prediction system comprising:
an exploration data module: for collecting and storing structural information data of the protein and its ligand, structural information data of the drug, and experimentally measured binding free energy data of the protein and its ligand;
a training data module: for predicting the binding free energy of the protein and its ligand, and fitting the predicted binding free energy to the experimental results to screen out the best-fitting set of parameters;
a prediction data module: for analyzing the binding free energy of the protein and the drug with the selected best-fitting set of parameters;
wherein, the method for predicting the binding free energy is MM/PBSA and/or MM/GBSA.
Preferably, the training data module comprises a preprocessing unit for preprocessing the protein crystal structure.
Preferably, the preprocessing includes adding hydrogen atoms, assigning protonation states and energy minimization.
The foregoing is only a preferred embodiment of the invention. It should be noted that those skilled in the art can make various modifications and adaptations without departing from the principles of the present invention, and such modifications and adaptations are intended to fall within the scope of the invention.
Claims (10)
1. A method for predicting free energy of protein-drug binding based on MM/PB (GB) SA, comprising the steps of:
(1) Predicting the binding free energy of the protein and its ligand using parameters comprising at least one of the force field of the protein, the force field of the ligand and the charge number of the ligand;
(2) Fitting each of the plurality of sets of binding free energies predicted in step (1) to the experimentally measured binding free energies;
(3) Analyzing the binding free energy of said protein and the drug using the best-fitting set of parameters and method from step (2);
wherein in the step (1), the method for predicting the binding free energy is MM/PBSA and/or MM/GBSA.
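2. The method of claim 1, wherein the MM/PB (GB) SA computes the binding free energy as follows:
$$
\begin{aligned}
\Delta G_{\mathrm{bind,solve}} &= \Delta G_{\mathrm{bind,vacuum}} + \Delta G_{\mathrm{solve}} \\
&= \Delta E_{\mathrm{MM}} + \Delta G_{\mathrm{solve}} - T\Delta S \\
&= \Delta E_{\mathrm{MM}} + \Delta G_{\mathrm{PB(GB)}} + \Delta G_{\mathrm{SA}} - T\Delta S
\end{aligned}
$$
wherein $\Delta G_{\mathrm{bind,solve}}$ is the binding free energy in the solution environment, $\Delta G_{\mathrm{bind,vacuum}}$ is the binding free energy under vacuum conditions, $\Delta G_{\mathrm{solve}}$ is the solvation energy, $\Delta E_{\mathrm{MM}}$ is the molecular mechanics interaction energy, $T$ is the temperature, $\Delta S$ is the entropy change of the system, and $\Delta G_{\mathrm{PB(GB)}}$ and $\Delta G_{\mathrm{SA}}$ are respectively the polar and nonpolar contributions to the solvation energy $\Delta G_{\mathrm{solve}}$.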
3. The method according to claim 1, wherein in step (1), the ligand and the drug are optimized from the same lead compound, or the drug is optimized from the ligand, or the ligand is optimized from the drug.
4. The method according to claim 1, wherein the method for calculating the charge number of the ligand is selected from any one of an empirical method, a semi-empirical method and a quantum chemical calculation method.
5. The method of claim 1, wherein the protein is a membrane protein and the parameter further comprises a dielectric constant of the membrane protein.
6. The method of claim 1, wherein in step (2), the experiment is a wet-lab experiment.
7. A MM/PB (GB) SA-based protein-drug binding free energy prediction system, comprising:
an exploration data module, for collecting and storing structural information data of the protein and its ligand, structural information data of the drug, and experimentally measured binding free energy data of the protein-ligand;
a training data module, for predicting the binding free energy of the protein and its ligand, and fitting the predicted binding free energy to the experimental results in order to screen out the best-fitting set of parameters;
a prediction data module, for analyzing the binding free energy of said protein and said drug using the screened best-fitting set of prediction parameters;
wherein, the method for predicting the binding free energy is MM/PBSA and/or MM/GBSA.
8. The protein-drug binding free energy prediction system of claim 7, wherein the training data module comprises a preprocessing unit for preprocessing the protein crystal structure.
9. The protein-drug binding free energy prediction system of claim 8, wherein the preprocessing comprises adding hydrogen atoms, protonation, and energy minimization.
10. A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, the program being executed by a processor to perform the MM/PB (GB) SA-based protein-drug binding free energy prediction method of any of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210935272.5A CN117577219A (en) | 2022-08-03 | 2022-08-03 | Protein-drug combination free energy prediction method and prediction system based on MM/PB (GB) SA |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117577219A (en) | 2024-02-20 |
Family
ID=89890526
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210935272.5A Pending CN117577219A (en) | 2022-08-03 | 2022-08-03 | Protein-drug combination free energy prediction method and prediction system based on MM/PB (GB) SA |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117577219A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||