WO2024026725A1 - MM/PB(GB)SA-based protein-drug binding free energy prediction method and prediction system - Google Patents

MM/PB(GB)SA-based protein-drug binding free energy prediction method and prediction system

Info

Publication number
WO2024026725A1
Authority
WO
WIPO (PCT)
Prior art keywords
free energy
protein
binding free
drug
ligand
Prior art date
Application number
PCT/CN2022/109948
Other languages
French (fr)
Chinese (zh)
Inventor
袁曙光
王世玉
孙晓琳
Original Assignee
深圳阿尔法分子科技有限责任公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳阿尔法分子科技有限责任公司
Priority to PCT/CN2022/109948
Publication of WO2024026725A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof

Definitions

  • the invention relates to the technical field of drug screening, and in particular to a protein-drug binding free energy prediction method and prediction system based on MM/PB(GB)SA.
  • binding free energy determines the direction of many biological processes, such as protein folding, enzyme catalysis, and drug target binding. Therefore, binding free energy prediction plays an indispensable role in related fields.
  • the binding of a drug to its biological target is the basis of the drug's efficacy.
  • the strength of this affinity directly determines the biological activity of the drug, and the binding free energy is a quantitative indicator of the affinity between the drug and the target. Therefore, in the lead compound optimization stage, using computers to predict the affinity of candidate drugs for biological targets can provide theoretical guidance for the optimization of lead compounds and accelerate the process of drug discovery.
  • compared with free energy perturbation (FEP) and thermodynamic integration (TI), MM/PB(GB)SA is widely used for ligand-receptor free energy prediction because it is fast, robust, and consumes fewer computing resources.
  • the problem solved by the present invention is to use free energy prediction technology to improve the accuracy of binding free energy prediction of protein drug targets and drug ligands under the premise of low computing resource consumption, thereby saving time and economic costs in drug research and development.
  • this study further improves the accuracy of MM/PB(GB)SA calculation of binding free energy by finding appropriate calculation parameters in advance. In the testing phase, this method used lower computing resource consumption and achieved accuracy comparable to FEP.
  • the present invention provides a protein-drug binding free energy prediction method based on MM/PB(GB)SA, which includes the following steps:
  • step (3): analyze the binding free energy of the protein and the drug using the best-fitting set of parameters and methods from step (2);
  • the binding free energy prediction method is MM/PBSA and/or MM/GBSA.
  • ΔG_{bind,solve} is the binding free energy in solution
  • ΔG_{bind,vacuum} is the binding free energy in vacuum
  • ΔG_{solve} is the solvation energy
  • ΔE_{MM} is the molecular interaction energy
  • T is the temperature
  • ΔS is the entropy change of the system
  • ΔG_{PB(GB)} and ΔG_{SA} are, respectively, the polar and non-polar contributions to the solvation energy ΔG_{solve}.
  • ΔG_{SA} is proportional to the solvent-accessible surface area.
  • the ligand and the drug have the same compound skeleton and a similar crystal structure, that is, the ligand and the drug are optimized from the same lead compound, or the drug is obtained by optimizing the ligand, or the ligand is obtained by optimizing the drug;
  • the lead compound is the compound molecule that needs to be improved and optimized, consisting of a compound skeleton and at least one functional group; the compound skeleton is the core part of the lead compound molecule
  • the functional group is the atom or atomic group that determines the chemical properties of the lead compound molecule.
  • Common functional groups include hydroxyl, carboxyl, ether bond, aldehyde group, carbonyl group, etc.
  • commonly used GB models include GBHCT, GBOBC, GBOBC2, GBneck, and GBneck2;
  • the method for calculating the charge number of the ligand is selected from any one of an empirical method, a semi-empirical method, and a quantum chemical method; an example of an empirical method is the charge assignment of the CHARMM General Force Field; an example of a semi-empirical method, which combines empirical and quantum chemical elements, is the AM1-BCC atomic charge method; an example of a quantum chemical method is an electrostatic potential calculation based on density functional theory (DFT) combined with RESP (Restrained ElectroStatic Potential) fitting of the atomic charges;
  • DFT: density functional theory
  • the force field of the ligand is determined by the method used to calculate the charge number of the ligand. That is, once the charge calculation method is chosen, the force field of the ligand follows.
  • the ligand force field may be the General AMBER Force Field 2 (GAFF2), the CHARMM General Force Field (CGenFF), etc.
  • the protein is a membrane protein
  • the parameters further include a membrane dielectric constant of the membrane protein.
  • the experiment is a wet experiment
  • the present invention provides a protein-drug binding free energy prediction system based on MM/PB(GB)SA, including:
  • Exploration data module: used to collect and store structural information data of the protein and its ligands, structural information data of the drug, and experimentally measured binding free energy data of the protein-ligand complexes;
  • Training data module: used to predict the binding free energy of the protein and its ligands and to fit the predicted binding free energies to the experimental results in order to select the best-fitting set of parameters;
  • Prediction data module: used to analyze the binding free energy of the protein and the drug with the selected best-fitting set of prediction parameters;
  • the binding free energy prediction method is MM/PBSA and/or MM/GBSA.
  • the training data module includes a preprocessing unit for preprocessing the protein crystal structure.
  • the preprocessing includes adding hydrogens, assigning protonation states, and energy minimization.
  • the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above-mentioned MM/PB(GB)SA-based protein-drug binding free energy prediction method.
  • the invention discloses a protein-ligand binding free energy calculation and prediction method based on MM/PB(GB)SA. Compared with FEP and TI, it has the following advantages: (1) the prediction method consumes fewer computing resources and saves calculation time; (2) the prediction method is highly robust; (3) the accuracy of the prediction method is comparable to that of free energy perturbation (FEP) and thermodynamic integration (TI), while being 40 to 50 times faster; (4) FEP/TI methods are mostly used for water-soluble proteins and their performance on membrane proteins is unknown, whereas the method provided by the present invention can be used to predict the binding free energy of drug molecules in membrane protein systems, with a certain degree of accuracy.
  • FEP: free energy perturbation
  • TI: thermodynamic integration
  • Figure 1 is a flow chart of the protein-ligand binding free energy prediction method based on MM/PB(GB)SA of the present invention.
  • Figure 2 is a correlation diagram of the binding free energy of water-soluble proteins and their ligands predicted by FEP in the present invention and the experimentally measured binding free energy.
  • Figure 3 is a correlation diagram of the binding free energy of water-soluble proteins and their ligands predicted by MM/PB(GB)SA in the present invention and the experimentally measured binding free energy.
  • Figure 4 is a correlation diagram of the binding free energy of membrane proteins and their ligands predicted by MM/PB(GB)SA in the present invention and the experimentally measured binding free energy.
  • Figure 5 is a ligand structure diagram of the CDK2 protein testing system of the present invention.
  • Figure 6-1 and Figure 6-2 are ligand structure diagrams of the P38 protein testing system of the present invention.
  • Figure 7 is a ligand structure diagram of the Thrombin protein testing system of the present invention.
  • Figure 8 is a ligand structure diagram of the Tyk2 protein testing system of the present invention.
  • Figure 9 is a ligand structure diagram of the mPGES protein testing system of the present invention.
  • Figure 10 is a ligand structure diagram of the GPBAR protein testing system of the present invention.
  • Figure 11 is a ligand structure diagram of the OX1 protein testing system of the present invention.
  • the flow chart of the protein-drug binding free energy prediction method based on MM/PB(GB)SA is shown in Figure 1.
  • test methods and test objects in the following examples: for the water-soluble systems, four proteins were tested with the free energy perturbation (FEP) technique: CDK2, P38, Thrombin, and Tyk2; the membrane protein systems comprised three proteins: mPGES, GPBAR, and OX1.
  • construction of the test systems: for the aqueous systems, the known three-dimensional crystal structure of the protein is first prepared by adding hydrogens, predicting the protonation states of the amino acids, and energy minimization; the binding modes of its ligand molecules and of the drug molecules to be predicted are then obtained by constrained molecular docking; a 12 × 12 × 12 Å³ water box system is then built with the molecular dynamics tool Gromacs, and sodium chloride at a concentration of 0.15 M is added to neutralize the excess charge, so that the whole simulation system is electrically neutral and mimics the physiological state in a cellular environment.
  • the ligand structure diagrams of the above four water-soluble protein test systems are shown in Figure 5, Figures 6-1 and 6-2, Figure 7, and Figure 8, respectively; the ligand structure diagrams of the above three membrane protein test systems are shown in Figure 9, Figure 10, and Figure 11, respectively.
  • ΔG_{bind,solve} is the binding free energy in solution
  • ΔG_{bind,vacuum} is the binding free energy in vacuum
  • ΔG_{solve} is the solvation energy
  • ΔE_{MM} is the molecular interaction energy
  • T is the simulation temperature
  • ΔS is the entropy change of the system
  • ΔG_{PB(GB)} and ΔG_{SA} are, respectively, the polar and non-polar contributions to the solvation energy ΔG_{solve}.
  • ΔG_{SA} is proportional to the solvent-accessible surface area.
  • the parameters used in the calculation, and the methods used to obtain them, are tested systematically, including: the charge number of the ligand calculated by different methods, the molecular force field of the protein drug target, the molecular force field of the drug ligand, whether an additional single-point halogen model is added to treat halogens, and the use of different implicit GB models for membrane protein systems, etc.
  • the effects of the charge numbers obtained by different ligand charge calculation methods, and of different protein molecular force fields, on the binding free energy of the above water-soluble proteins and membrane proteins with their ligand molecules were tested; the ligand charge calculation methods include RESP-DFT, RESP-HF, AM1-BCC, and CGenFF, and the molecular force fields include CHARMM36m (abbreviated as CHARMM) and Amber FF99SB (abbreviated as FF99SB).
  • the ligand charge calculation methods that add a single-point halogen treatment model include RESP_DFT_EP and RESP_HF_EP, and those that do not add a single-point halogen treatment model are RESP_DFT and RESP_HF (where EP means additional single point);
  • the molecular force field includes: CHARMM36m (abbreviated as CHARMM), Amber FF99SB (abbreviated as FF99SB).
  • the ligand charge number calculation methods include: RESP_DFT, RESP_HF, AM1_BCC, CGenFF.
  • the molecular force fields include: CHARMM36m (abbreviated as CHARMM), Amber FF99SB (abbreviated as FF99SB), and the ligand charge number calculation methods used include: RESP_DFT, RESP_HF, AM1_BCC, CGenFF.
  • the correlation between the binding free energy of the protein and its ligand predicted by the free energy perturbation (FEP) method and the binding free energy measured in wet experiments is shown in Figure 2.
  • FEP: free energy perturbation
  • the correlation coefficients between the binding free energies measured in wet experiments and those predicted by molecular docking, by the optimized MM/PB(GB)SA method of Tables 1 to 4, and by the free energy perturbation (FEP) method are shown in Table 5 (bold marks the better of the MM/GBSA and MM/PBSA values); in Table 5, the larger the absolute value, the more accurate the prediction.
  • with the prediction method provided by the present invention, the MM/PB(GB)SA results obtained after systematic parameter optimization are generally better than those of the existing "gold standard", free energy perturbation (FEP).
  • FEP/TI can only predict 0.4-0.5 molecules per day.
  • the optimized MM/PB(GB)SA can predict 20 molecules a day. The speed is 40-50 times faster than the FEP/TI method.
  • the present invention provides a protein-drug binding free energy prediction system based on MM/PB(GB)SA, including:
  • Exploration data module: used to collect and store structural information data of proteins and their ligands, structural information data of drugs, and experimentally measured protein-ligand binding free energy data;
  • Training data module: used to predict the binding free energy of proteins and their ligands and to fit the predicted binding free energies to the experimental results in order to select the best-fitting set of parameters;
  • Prediction data module: used to analyze the binding free energy of proteins and drugs with the selected best-fitting set of prediction parameters;
  • the binding free energy prediction methods are MM/PBSA and/or MM/GBSA.
  • the training data module includes a preprocessing unit for preprocessing protein crystal structures.
  • the preprocessing includes adding hydrogens, assigning protonation states, and energy minimization.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Medicinal Chemistry (AREA)
  • Molecular Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

Disclosed in the present invention is an MM/PB(GB)SA-based protein-drug binding free energy prediction method and prediction system. The prediction method comprises the following steps: performing binding free energy prediction on a protein and a ligand thereof by using parameters, the parameters comprising at least one of the force field of the protein, the force field of the ligand, and the charge number of the ligand; performing data fitting on the predicted multiple groups of binding free energy and experimentally measured binding free energy, respectively; and analyzing the binding free energy of the protein and a drug by using a group of best-fit parameters and methods, wherein the binding free energy prediction method is MM/PBSA and/or MM/GBSA. In the present invention, situations possibly encountered in free energy prediction are fully considered, and the method has extremely high robustness and accuracy and is suitable for both water-soluble proteins and membrane proteins. Compared with traditional free energy perturbation technology and thermodynamic integration, the precision is improved, and the speed is 40-50 times faster.

Description

A protein-drug binding free energy prediction method and prediction system based on MM/PB(GB)SA
Technical field
The invention relates to the technical field of drug screening, and in particular to a protein-drug binding free energy prediction method and prediction system based on MM/PB(GB)SA.
Background art
Predicting drug activity is a very challenging topic in today's new drug development. In biological systems, binding free energy determines the direction of many biological processes, such as protein folding, enzyme catalysis, and drug-target binding. Binding free energy prediction therefore plays an indispensable role in related fields. In drug discovery, the binding of a drug to its biological target is the basis of the drug's efficacy; the strength of this affinity directly determines the biological activity of the drug, and the binding free energy is a quantitative indicator of the affinity between the drug and the target. Therefore, in the lead compound optimization stage, using computers to predict the affinity of candidate drugs for biological targets can provide theoretical guidance for the optimization of lead compounds and accelerate the process of drug discovery.
At present, three methods are commonly used in drug discovery to predict receptor-ligand interactions: free energy perturbation (FEP), thermodynamic integration (TI), and molecular mechanics Poisson-Boltzmann (generalized Born) surface area (MM/PB(GB)SA). Compared with the other two methods, MM/PB(GB)SA is widely used for ligand-receptor free energy prediction because it is fast, robust, and consumes fewer computing resources.
Summary of the invention
In view of the above background, the problem solved by the present invention is to use free energy prediction technology to improve the accuracy of binding free energy prediction between protein drug targets and drug ligands while keeping computing resource consumption low, thereby saving time and cost in drug research and development. Building on MM/PB(GB)SA theory, this work further improves the accuracy of MM/PB(GB)SA binding free energy calculations by identifying appropriate calculation parameters in advance. In the testing phase, the method achieved accuracy comparable to FEP at lower computing resource consumption.
To achieve the above object, the present invention adopts the following technical solutions:
In one aspect, the present invention provides a protein-drug binding free energy prediction method based on MM/PB(GB)SA, which includes the following steps:
(1) predicting the binding free energy of the protein and its ligand using parameters, the parameters including at least one of the force field of the protein, the force field of the ligand, and the charge number of the ligand;
(2) fitting each of the multiple sets of binding free energies predicted in step (1) to the experimentally measured binding free energies;
(3) analyzing the binding free energy of the protein and the drug using the best-fitting set of parameters and methods from step (2);
wherein, in step (1), the binding free energy prediction method is MM/PBSA and/or MM/GBSA.
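Steps (1) to (3) amount to scoring each candidate parameter set against experiment and keeping the best one. The following is a minimal sketch of that selection, not the patent's implementation: the function names, the parameter-set labels, and the numbers are hypothetical, and using the signed Pearson correlation coefficient as the goodness-of-fit measure is an assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


def select_best_parameters(predictions, experimental):
    """Score every parameter set (step 2) and return the best one (for step 3).

    predictions: dict mapping a parameter-set label to the predicted binding
                 free energies, one per ligand, in the same ligand order as
                 `experimental`.
    """
    scores = {label: pearson_r(pred, experimental)
              for label, pred in predictions.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical values for illustration only (kcal/mol); not data from the patent.
experimental = [-9.1, -8.4, -10.2, -7.5]
predictions = {
    ("FF99SB", "AM1-BCC", "MM/GBSA"): [-30.1, -27.9, -33.0, -25.2],
    ("CHARMM36m", "CGenFF", "MM/PBSA"): [-12.0, -14.5, -11.1, -13.9],
}
best, scores = select_best_parameters(predictions, experimental)
```

The selected label would then fix the protein force field, ligand charge method, and solvation model used to score the new drug molecules in step (3).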
In the technical solution of the present invention, MM/PB(GB)SA calculates the binding free energy as follows:
$$
\begin{aligned}
\Delta G_{bind,solve} &= \Delta G_{bind,vacuum} + \Delta G_{solve} \\
&= \Delta E_{MM} + \Delta G_{solve} - T\Delta S \\
&= \Delta E_{MM} + \Delta G_{PB(GB)} + \Delta G_{SA} - T\Delta S
\end{aligned}
$$
where ΔG_{bind,solve} is the binding free energy in solution, ΔG_{bind,vacuum} is the binding free energy in vacuum, and ΔG_{solve} is the solvation energy; ΔE_{MM} is the molecular interaction energy, T is the temperature, and ΔS is the entropy change of the system; ΔG_{PB(GB)} and ΔG_{SA} are, respectively, the polar and non-polar contributions to the solvation energy ΔG_{solve}. In the technical solution of the present invention, ΔG_{SA} is proportional to the solvent-accessible surface area.
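To make the bookkeeping of the last line of the equation concrete, here is a minimal sketch that combines the four terms. The class and function names are hypothetical, and the proportionality constant `gamma` relating ΔG_SA to the change in solvent-accessible surface area is an illustrative assumption; the text only states that the two are proportional.

```python
from dataclasses import dataclass


@dataclass
class EnergyTerms:
    """Energy differences (complex minus receptor minus ligand), in kcal/mol."""
    delta_e_mm: float       # ΔE_MM
    delta_g_polar: float    # ΔG_PB(GB), polar solvation term
    delta_sasa: float       # change in solvent-accessible surface area, Å^2
    t_delta_s: float = 0.0  # TΔS, entropic term


def binding_free_energy(terms, gamma=0.0072):
    """Return ΔE_MM + ΔG_PB(GB) + ΔG_SA - TΔS, with ΔG_SA = gamma * ΔSASA.

    gamma (kcal/mol/Å^2) is an assumed, illustrative coefficient.
    """
    delta_g_sa = gamma * terms.delta_sasa
    return terms.delta_e_mm + terms.delta_g_polar + delta_g_sa - terms.t_delta_s


# Hypothetical inputs, chosen only to show how the terms combine.
example = EnergyTerms(delta_e_mm=-50.0, delta_g_polar=30.0,
                      delta_sasa=-500.0, t_delta_s=-12.0)
dg_bind = binding_free_energy(example)  # about -11.6 kcal/mol
```

In practice each term would be an average over snapshots from a simulation; the sketch takes the averaged values as given.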
As a preferred embodiment, in step (1), the ligand and the drug have the same compound skeleton and a similar crystal structure; that is, the ligand and the drug are optimized from the same lead compound, or the drug is obtained by optimizing the ligand, or the ligand is obtained by optimizing the drug. Here, the lead compound is the compound molecule that needs to be improved and optimized, and consists of a compound skeleton and at least one functional group; the compound skeleton is the core part of the lead compound molecule, and a functional group is an atom or atomic group that determines the chemical properties of the lead compound molecule. Common functional groups include hydroxyl, carboxyl, ether bond, aldehyde, carbonyl, etc.
In some specific embodiments, for the MM/GBSA method, commonly used GB models include GBHCT, GBOBC, GBOBC2, GBneck, and GBneck2;
Preferably, the method for calculating the charge number of the ligand is selected from any one of an empirical method, a semi-empirical method, and a quantum chemical method. An example of an empirical method is the charge assignment of the CHARMM General Force Field; an example of a semi-empirical method, which combines empirical and quantum chemical elements, is the AM1-BCC atomic charge method; an example of a quantum chemical method is an electrostatic potential calculation based on density functional theory (DFT) combined with RESP (Restrained ElectroStatic Potential) fitting of the atomic charges;
In the technical solution of the present invention, the force field of the ligand is determined by the method used to calculate the charge number of the ligand; that is, once the charge calculation method is chosen, the ligand force field follows. In the present invention, the ligand force field may be, for example, the General AMBER Force Field 2 (GAFF2) or the CHARMM General Force Field (CGenFF).
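The dependency described above (the charge method fixes the ligand force field) can be captured in a small lookup. This is a sketch under an assumption: the text names GAFF2 and CGenFF as the available ligand force fields but does not spell out which charge method maps to which, so the pairing below follows common practice and is not taken from the patent. The function and constant names are hypothetical.

```python
# Assumed pairing of ligand charge method -> ligand force field.
# CGenFF charges go with CGenFF; AM1-BCC and RESP charges go with GAFF2.
LIGAND_FORCE_FIELD_BY_CHARGE_METHOD = {
    "CGenFF": "CGenFF",
    "AM1-BCC": "GAFF2",
    "RESP-HF": "GAFF2",
    "RESP-DFT": "GAFF2",
}


def ligand_force_field(charge_method):
    """Return the ligand force field implied by the chosen charge method."""
    try:
        return LIGAND_FORCE_FIELD_BY_CHARGE_METHOD[charge_method]
    except KeyError:
        known = ", ".join(sorted(LIGAND_FORCE_FIELD_BY_CHARGE_METHOD))
        raise ValueError(
            f"unknown charge method {charge_method!r}; expected one of: {known}"
        ) from None
```

Keeping the pairing in one table means a parameter scan only has to enumerate charge methods; the ligand force field is derived rather than chosen independently.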
In some specific embodiments, the protein is a membrane protein, and the parameters further include the dielectric constant of the membrane protein (membrane dielectric constant).
As a preferred embodiment, in step (2), the experiment is a wet experiment;
In the technical solution of the present invention, "wet experiment" means that the data fitted against the predictions of step (1) are binding free energies measured in the laboratory by molecular, cellular, physiological, or similar experimental methods, rather than binding free energies obtained by computer simulation or bioinformatics methods.
In another aspect, the present invention provides a protein-drug binding free energy prediction system based on MM/PB(GB)SA, including:
Exploration data module: used to collect and store structural information data of the protein and its ligands, structural information data of the drug, and experimentally measured binding free energy data of the protein-ligand complexes;
Training data module: used to predict the binding free energy of the protein and its ligands and to fit the predicted binding free energies to the experimental results in order to select the best-fitting set of parameters;
Prediction data module: used to analyze the binding free energy of the protein and the drug with the selected best-fitting set of prediction parameters;
wherein the binding free energy prediction method is MM/PBSA and/or MM/GBSA.
Preferably, the training data module includes a preprocessing unit for preprocessing the protein crystal structure.
Preferably, the preprocessing includes adding hydrogens, assigning protonation states, and energy minimization.
In yet another aspect, the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above MM/PB(GB)SA-based protein-drug binding free energy prediction method.
The above technical solution has the following advantages or beneficial effects:
The invention discloses a protein-ligand binding free energy calculation and prediction method based on MM/PB(GB)SA. Compared with FEP and TI, it has the following advantages: (1) the prediction method consumes fewer computing resources and saves calculation time; (2) the prediction method is highly robust; (3) the accuracy of the prediction method is comparable to that of free energy perturbation (FEP) and thermodynamic integration (TI), while being 40 to 50 times faster; (4) FEP/TI methods are mostly used for water-soluble proteins and their performance on membrane proteins is unknown, whereas the method provided by the present invention can be used to predict the binding free energy of drug molecules in membrane protein systems, with a certain degree of accuracy.
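The 40 to 50 times figure is consistent with the throughput numbers this document reports elsewhere for the test systems: FEP/TI predicts roughly 0.4 to 0.5 molecules per day, while the optimized MM/PB(GB)SA predicts roughly 20 molecules per day. As a quick check of the arithmetic:

$$
\frac{20\ \text{molecules/day}}{0.5\ \text{molecules/day}} = 40,
\qquad
\frac{20\ \text{molecules/day}}{0.4\ \text{molecules/day}} = 50
$$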
Description of the drawings
Figure 1 is a flow chart of the protein-ligand binding free energy prediction method based on MM/PB(GB)SA of the present invention.
Figure 2 is a correlation diagram of the binding free energy of water-soluble proteins and their ligands predicted by FEP in the present invention and the experimentally measured binding free energy.
Figure 3 is a correlation diagram of the binding free energy of water-soluble proteins and their ligands predicted by MM/PB(GB)SA in the present invention and the experimentally measured binding free energy.
Figure 4 is a correlation diagram of the binding free energy of membrane proteins and their ligands predicted by MM/PB(GB)SA in the present invention and the experimentally measured binding free energy.
Figure 5 is a ligand structure diagram of the CDK2 protein test system of the present invention.
Figure 6-1 and Figure 6-2 are ligand structure diagrams of the P38 protein test system of the present invention.
Figure 7 is a ligand structure diagram of the Thrombin protein test system of the present invention.
Figure 8 is a ligand structure diagram of the Tyk2 protein test system of the present invention.
Figure 9 is a ligand structure diagram of the mPGES protein test system of the present invention.
Figure 10 is a ligand structure diagram of the GPBAR protein test system of the present invention.
Figure 11 is a ligand structure diagram of the OX1 protein test system of the present invention.
Detailed Description
The following embodiments are only some of the embodiments of the present invention, not all of them. The detailed description of the embodiments provided below is therefore not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the invention. All other embodiments obtained by those skilled in the art on the basis of the embodiments of the present invention, without creative work, fall within the protection scope of the present invention.
In the present invention, unless otherwise specified, all equipment and raw materials are commercially available or commonly used in the industry. Unless otherwise stated, the methods in the following examples are conventional methods in the art.
In one embodiment, the flow chart of the MM/PB(GB)SA-based protein-drug binding free energy prediction method is shown in Figure 1.
The method is described below with reference to specific operations.
Test methods and test subjects in the following examples: for the water-soluble systems, four proteins were tested using free energy perturbation (FEP): CDK2, P38, Thrombin, and Tyk2. The membrane protein subjects comprised three proteins: mPGES, GPBAR, and OX1.
Construction of the test systems: in the aqueous system, the known three-dimensional crystal structure of the protein is first prepared by adding hydrogens, predicting the protonation states of the amino acids, and minimizing the energy. The binding modes of its ligand molecules, and of the drug molecules to be predicted, with the protein are then obtained by restrained molecular docking. Next, a 12 × 12 × 12 Å³ water box is built with the molecular dynamics tool Gromacs, and sodium chloride at a concentration of 0.15 M is added to neutralize the excess charge, so that the whole simulation system is electrically neutral and mimics the physiological state in a cellular environment. The ligand structures of the four water-soluble protein test systems are shown in Figure 5, Figures 6-1 and 6-2, Figure 7, and Figure 8, respectively; those of the three membrane protein test systems are shown in Figure 9, Figure 10, and Figure 11, respectively.
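As a rough illustration of the neutralization step, the sketch below estimates how many NaCl ion pairs correspond to 0.15 M in a given solvent volume, and how many extra counter-ions cancel a net solute charge. The function names and the example volume are hypothetical and are not taken from the patent; in practice a tool such as Gromacs performs this step itself.

```python
AVOGADRO = 6.02214076e23  # particles per mole (exact SI value)


def salt_ion_pairs(volume_nm3: float, conc_molar: float = 0.15) -> int:
    """Approximate number of NaCl ion pairs for a solvent volume at a target molarity."""
    litres = volume_nm3 * 1e-24  # 1 nm^3 = 1e-24 L
    return round(conc_molar * AVOGADRO * litres)


def neutralizing_ions(net_charge: int) -> tuple[int, int]:
    """Extra (Na+, Cl-) counts needed to cancel the solute's net charge."""
    return (max(-net_charge, 0), max(net_charge, 0))


if __name__ == "__main__":
    # Hypothetical 1000 nm^3 solvent volume and a solute carrying a -3 net charge.
    print(salt_ion_pairs(1000.0), neutralizing_ions(-3))
```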
Predicting the binding free energy of known protein drug targets and their ligands: once the test systems above are built, molecular dynamics (MD) simulations are used to simulate the dynamic behavior of each protein drug target and its ligand molecules on a 5 ns time scale. The binding free energy of the protein drug target and the ligand molecule is then predicted from the thermodynamic cycle equations of MM/PB(GB)SA:
$$
\begin{aligned}
\Delta G_{\mathrm{bind,solv}} &= \Delta G_{\mathrm{bind,vacuum}} + \Delta G_{\mathrm{solv}} \\
&= \Delta E_{\mathrm{MM}} + \Delta G_{\mathrm{solv}} - T\Delta S \\
&= \Delta E_{\mathrm{MM}} + \Delta G_{\mathrm{PB(GB)}} + \Delta G_{\mathrm{SA}} - T\Delta S
\end{aligned}
$$
In the equations above, $\Delta G_{\mathrm{bind,solv}}$ is the binding free energy in solution, $\Delta G_{\mathrm{bind,vacuum}}$ is the binding free energy in vacuum, and $\Delta G_{\mathrm{solv}}$ is the solvation free energy; $\Delta E_{\mathrm{MM}}$ is the molecular interaction energy (the molecular mechanics term), $T$ is the simulation temperature, and $\Delta S$ is the entropy change of the system; $\Delta G_{\mathrm{PB(GB)}}$ and $\Delta G_{\mathrm{SA}}$ are the polar and nonpolar contributions to the solvation free energy $\Delta G_{\mathrm{solv}}$, respectively. In the technical solution of the present invention, $\Delta G_{\mathrm{SA}}$ is proportional to the solvent-accessible surface area.
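The last line of the thermodynamic cycle can be sketched as a simple sum of terms. This is a minimal illustration, not the patent's implementation: the `EnergyTerms` container, the function names, and the surface tension coefficient `gamma` are assumptions introduced here, and the numbers in the usage lines are made up.

```python
from dataclasses import dataclass


@dataclass
class EnergyTerms:
    """Per-complex energy differences, all in the same unit (e.g. kcal/mol)."""
    e_mm: float     # ΔE_MM, molecular mechanics interaction energy
    g_polar: float  # ΔG_PB(GB), polar solvation contribution
    g_sa: float     # ΔG_SA, nonpolar solvation contribution
    t_ds: float     # TΔS, entropy term already multiplied by temperature


def nonpolar_term(delta_sasa: float, gamma: float) -> float:
    """ΔG_SA taken as proportional to the change in solvent-accessible surface area.

    gamma is a hypothetical proportionality constant supplied by the caller.
    """
    return gamma * delta_sasa


def binding_free_energy(terms: EnergyTerms) -> float:
    """ΔG_bind = ΔE_MM + ΔG_PB(GB) + ΔG_SA - TΔS."""
    return terms.e_mm + terms.g_polar + terms.g_sa - terms.t_ds


if __name__ == "__main__":
    # Illustrative values only.
    terms = EnergyTerms(e_mm=-50.0, g_polar=30.0, g_sa=-5.0, t_ds=-10.0)
    print(binding_free_energy(terms))
```

Keeping the entropy term as a separate field makes it easy to compare results with and without the −TΔS contribution.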
When predicting the binding free energy with the equations above, the different parameters involved in the calculation, and the methods used to obtain them, are tested systematically. These include: the ligand charges computed by different methods, the molecular force field of the protein drug target, the molecular force field of the drug ligand, whether an additional single-point halogen model is added to treat halogens, and the use of different implicit GB models for membrane protein systems.
In one embodiment, the effect of the ligand charges obtained by different charge calculation methods, together with different protein force fields, on the binding free energy between the above water-soluble and membrane proteins and their ligand molecules was tested. The ligand charge calculation methods include RESP-DFT, RESP-HF, AM1-BCC, and CGenFF; the molecular force fields include CHARMM36m (abbreviated CHARMM) and Amber FF99SB (abbreviated FF99SB). Binding free energies of the proteins and their ligands were predicted with these parameters and methods and fitted one by one against the experimentally measured binding free energies; the resulting correlation coefficients (obs.R) and mean absolute errors (MAE) are given in Table 1. (In the table, "higher is better" in parentheses after obs.R means that larger values of the metric are better, and "lower is better" after MAE means that smaller values are better; the same applies to the tables below.)
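The two metrics reported in the tables can be computed as follows. This sketch assumes that obs.R is the Pearson correlation coefficient between predicted and experimental binding free energies, which the text does not state explicitly; the function names are illustrative.

```python
import math
from typing import Sequence


def pearson_r(pred: Sequence[float], expt: Sequence[float]) -> float:
    """Pearson correlation between predicted and experimental values."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_e = sum(expt) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(pred, expt))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_e = sum((e - mean_e) ** 2 for e in expt)
    return cov / math.sqrt(var_p * var_e)


def mean_absolute_error(pred: Sequence[float], expt: Sequence[float]) -> float:
    """Average absolute deviation between predicted and experimental values."""
    return sum(abs(p - e) for p, e in zip(pred, expt)) / len(pred)
```

A parameter set can score well on one metric and poorly on the other: predictions that are perfectly correlated with experiment but uniformly shifted give a high obs.R and a large MAE, which is why both are reported.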
In one embodiment, the effect of adding an additional single-point halogen model to treat halogens when computing the ligand charges, together with different protein force fields, on the binding free energy between the above water-soluble and membrane proteins and their ligand molecules was tested. The ligand charge calculation methods with the single-point halogen model are RESP_DFT_EP and RESP_HF_EP, and those without it are RESP_DFT and RESP_HF (EP denotes the additional single point); the molecular force fields include CHARMM36m (abbreviated CHARMM) and Amber FF99SB (abbreviated FF99SB). Binding free energies of the proteins and their ligands were predicted with these parameters and methods and fitted one by one against the experimentally measured binding free energies; the resulting correlation coefficients (obs.R) and mean absolute errors (MAE) are given in Table 2.
In one embodiment, the effect of different implicit GB models and different protein force fields on the binding free energy between the above membrane proteins and their ligand molecules was tested. The implicit GB models include GBHCT (igb=1), GBOBC (igb=2), GBOBC2 (igb=5), GBneck (igb=7), and GBneck2 (igb=8); the molecular force fields include CHARMM36m (abbreviated CHARMM) and Amber FF99SB (abbreviated FF99SB); the ligand charge calculation methods used include RESP_DFT, RESP_HF, AM1_BCC, and CGenFF. Binding free energies of the proteins and their ligands were predicted with these parameters and methods and fitted one by one against the experimentally measured binding free energies; the resulting correlation coefficients (obs.R) and mean absolute errors (MAE) are given in Table 3.
In one embodiment, the effect of the membrane dielectric constant, together with different protein force fields, on the binding free energy between the above membrane proteins and their ligand molecules was tested. The membrane dielectric constant was set to emem = 1 to 9; the molecular force fields include CHARMM36m (abbreviated CHARMM) and Amber FF99SB (abbreviated FF99SB); the ligand charge calculation methods used include RESP_DFT, RESP_HF, AM1_BCC, and CGenFF. Binding free energies of the proteins and their ligands were predicted with these parameters and methods and fitted one by one against the experimentally measured binding free energies; the resulting correlation coefficients (obs.R) and mean absolute errors (MAE) are given in Table 4 (where inp denotes the calculation method used for the nonpolar solvation free energy).
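The parameter combinations described in these embodiments can be enumerated as a grid. The sketch below assumes a full factorial design and integer emem values from 1 to 9; neither is confirmed by the text, and the dictionary keys and function names are illustrative.

```python
from itertools import product

CHARGE_METHODS = ["RESP_DFT", "RESP_HF", "AM1_BCC", "CGenFF"]
FORCE_FIELDS = ["CHARMM36m", "FF99SB"]
GB_MODELS = [1, 2, 5, 7, 8]                # igb values listed in the text
MEMBRANE_DIELECTRICS = list(range(1, 10))  # emem = 1..9, assumed integer steps


def gb_model_grid() -> list[dict]:
    """Every charge method x force field x implicit GB model combination."""
    return [
        {"charge": c, "force_field": f, "igb": g}
        for c, f, g in product(CHARGE_METHODS, FORCE_FIELDS, GB_MODELS)
    ]


def membrane_dielectric_grid() -> list[dict]:
    """Every charge method x force field x membrane dielectric combination."""
    return [
        {"charge": c, "force_field": f, "emem": e}
        for c, f, e in product(CHARGE_METHODS, FORCE_FIELDS, MEMBRANE_DIELECTRICS)
    ]


if __name__ == "__main__":
    print(len(gb_model_grid()), len(membrane_dielectric_grid()))
```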
In one embodiment, for the above four water-soluble protein systems, the correlation between the protein-ligand binding free energies obtained by free energy perturbation (FEP) and those measured in wet-lab experiments is shown in Figure 2.
In one embodiment, for the above four water-soluble protein systems, the correlation between the protein-ligand binding free energies obtained by MM/PB(GB)SA and those measured experimentally is shown in Figure 3.
In one embodiment, for the above three membrane protein systems, the correlation between the protein-ligand binding free energies obtained by MM/PB(GB)SA and those measured experimentally is shown in Figure 4.
In one embodiment, the correlation coefficients between the protein-ligand binding free energies obtained by molecular docking, by the MM/PB(GB)SA method optimized according to Tables 1 to 4, and by free energy perturbation (FEP), and the binding free energies measured in wet-lab experiments, are given in Table 5 (the better values among MM/GBSA and MM/PBSA are shown in bold). In Table 5, the larger the absolute value, the higher the prediction accuracy.
Table 5
Figure PCTCN2022109948-appb-000001
Therefore, with the prediction method provided by the present invention, the MM/PB(GB)SA results obtained after systematic parameter optimization are generally better than those of free energy perturbation (FEP), the current "gold standard".
In addition, current FEP/TI techniques mostly use the OPLS molecular force field, which does not reproduce the physical properties of real cell membranes well; the membrane may even rupture during long-time-scale molecular dynamics simulations. FEP/TI based on this force field is therefore not suitable for predicting drug binding free energies in membrane protein systems. The CHARMM36m and AMBER99SB force fields used in the present invention both have very reliable membrane protein force fields, so they are well suited to predicting drug binding free energies in membrane protein systems. The prediction results show that, for all three membrane protein test systems, the predictions are highly consistent with the experimental values.
In terms of speed, public reports indicate that on one GTX2080ti graphics card (GPU), FEP/TI can predict only 0.4-0.5 molecules per day, whereas the optimized MM/PB(GB)SA can predict 20 molecules per day, which is 40-50 times faster than FEP/TI.
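The quoted speed-up follows directly from the two throughput figures. As a quick check using only the numbers stated above:

$$
\frac{20\ \text{molecules/day}}{0.5\ \text{molecules/day}} = 40,
\qquad
\frac{20\ \text{molecules/day}}{0.4\ \text{molecules/day}} = 50
$$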
In one embodiment, the present invention provides an MM/PB(GB)SA-based protein-drug binding free energy prediction system, comprising:
an exploration data module, used to collect and store structural data of the protein and its ligands, structural data of the drug, and experimentally measured protein-ligand binding free energy data;
a training data module, used to predict the binding free energy of the protein and its ligands and to fit the predicted binding free energies to the experimental results in order to select the best-fitting set of parameters;
a prediction data module, which analyzes the binding free energy of the protein and the drug using the selected best-fitting set of prediction parameters;
wherein the binding free energy prediction method is MM/PBSA and/or MM/GBSA.
Preferably, the training data module includes a preprocessing unit for preprocessing the protein crystal structure.
Preferably, the preprocessing includes hydrogen addition, protonation, and energy minimization.
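The division of labor between the training and prediction modules can be sketched as below. This is a minimal illustration under stated assumptions: the selection criterion (lowest MAE), the `predict` callable, and all function names are hypothetical, and the actual MM/PB(GB)SA calculation is left to whatever `predict` wraps.

```python
from typing import Callable, Sequence

# Hypothetical signature: (parameter set, molecule identifier) -> predicted ΔG_bind.
Predictor = Callable[[dict, str], float]


def mean_absolute_error(pred: Sequence[float], expt: Sequence[float]) -> float:
    """Average absolute deviation between predicted and experimental values."""
    return sum(abs(p - e) for p, e in zip(pred, expt)) / len(pred)


def select_best_parameters(
    candidates: Sequence[dict],
    predict: Predictor,
    ligands: Sequence[str],
    experimental: Sequence[float],
) -> dict:
    """Training step: return the candidate whose predictions best fit experiment."""
    return min(
        candidates,
        key=lambda params: mean_absolute_error(
            [predict(params, lig) for lig in ligands], experimental
        ),
    )


def predict_drug(best_params: dict, predict: Predictor, drug: str) -> float:
    """Prediction step: apply the selected parameter set to a new drug molecule."""
    return predict(best_params, drug)
```

Ranking by the correlation coefficient instead of MAE, or by a combination of the two, would only require changing the `key` function.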
The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art can make further improvements and refinements without departing from the principles of the present invention, and such improvements and refinements should also be regarded as falling within the protection scope of the present invention.
Figure PCTCN2022109948-appb-000002
Figure PCTCN2022109948-appb-000003
Figure PCTCN2022109948-appb-000004
Figure PCTCN2022109948-appb-000005

Claims (10)

  1. An MM/PB(GB)SA-based protein-drug binding free energy prediction method, characterized by comprising the following steps:
    (1) predicting the binding free energy of the protein and its ligands using parameters, the parameters including at least one of the force field of the protein, the force field of the ligand, and the charges of the ligand;
    (2) fitting each of the multiple sets of binding free energies predicted in step (1) to the experimentally measured binding free energies;
    (3) analyzing the binding free energy of the protein and the drug using the best-fitting set of parameters and methods from step (2);
    wherein, in step (1), the binding free energy prediction method is MM/PBSA and/or MM/GBSA.
  2. The protein-drug binding free energy prediction method according to claim 1, characterized in that the MM/PB(GB)SA calculation of the binding free energy is as follows:
    $$
    \begin{aligned}
    \Delta G_{\mathrm{bind,solv}} &= \Delta G_{\mathrm{bind,vacuum}} + \Delta G_{\mathrm{solv}} \\
    &= \Delta E_{\mathrm{MM}} + \Delta G_{\mathrm{solv}} - T\Delta S \\
    &= \Delta E_{\mathrm{MM}} + \Delta G_{\mathrm{PB(GB)}} + \Delta G_{\mathrm{SA}} - T\Delta S
    \end{aligned}
    $$
    wherein $\Delta G_{\mathrm{bind,solv}}$ is the binding free energy in solution, $\Delta G_{\mathrm{bind,vacuum}}$ is the binding free energy in vacuum, and $\Delta G_{\mathrm{solv}}$ is the solvation free energy; $\Delta E_{\mathrm{MM}}$ is the molecular interaction energy, $T$ is the temperature, $\Delta S$ is the entropy change of the system, and $\Delta G_{\mathrm{PB(GB)}}$ and $\Delta G_{\mathrm{SA}}$ are the polar and nonpolar contributions to the solvation free energy $\Delta G_{\mathrm{solv}}$, respectively.
  3. The protein-drug binding free energy prediction method according to claim 1, characterized in that, in step (1), the ligand and the drug are obtained by optimizing the same lead compound, or the drug is obtained by optimizing the ligand, or the ligand is obtained by optimizing the drug.
  4. The protein-drug binding free energy prediction method according to claim 1, characterized in that the method for calculating the charges of the ligand is selected from any one of empirical methods, semi-empirical methods, and quantum chemical calculation methods.
  5. The protein-drug binding free energy prediction method according to claim 1, characterized in that the protein is a membrane protein, and the parameters further include the dielectric constant of the membrane protein.
  6. The protein-drug binding free energy prediction method according to claim 1, characterized in that, in step (2), the experiment is a wet-lab experiment.
  7. An MM/PB(GB)SA-based protein-drug binding free energy prediction system, characterized by comprising:
    an exploration data module, used to collect and store structural data of the protein and its ligands, structural data of the drug, and experimentally measured binding free energy data of the protein and ligands;
    a training data module, used to predict the binding free energy of the protein and its ligands and to fit the predicted binding free energies to the experimental results in order to select the best-fitting set of parameters;
    a prediction data module, which analyzes the binding free energy of the protein and the drug using the selected best-fitting set of prediction parameters;
    wherein the binding free energy prediction method is MM/PBSA and/or MM/GBSA.
  8. The protein-drug binding free energy prediction system according to claim 7, characterized in that the training data module includes a preprocessing unit for preprocessing the protein crystal structure.
  9. The protein-drug binding free energy prediction system according to claim 8, characterized in that the preprocessing includes hydrogen addition, protonation, and energy minimization.
  10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the program, when executed by a processor, implements the MM/PB(GB)SA-based protein-drug binding free energy prediction method according to any one of claims 1-6.
PCT/CN2022/109948 2022-08-03 2022-08-03 Mm/pb(gb)sa-based protein-drug binding free energy prediction method and prediction system WO2024026725A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/109948 WO2024026725A1 (en) 2022-08-03 2022-08-03 Mm/pb(gb)sa-based protein-drug binding free energy prediction method and prediction system

Publications (1)

Publication Number Publication Date
WO2024026725A1 true WO2024026725A1 (en) 2024-02-08

Family

ID=89848402

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/109948 WO2024026725A1 (en) 2022-08-03 2022-08-03 Mm/pb(gb)sa-based protein-drug binding free energy prediction method and prediction system

Country Status (1)

Country Link
WO (1) WO2024026725A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040010376A1 (en) * 2001-04-17 2004-01-15 Peizhi Luo Generation and selection of protein library in silico
CN102930181A (en) * 2012-11-07 2013-02-13 四川大学 Protein-ligand affinity predicting method based on molecule descriptors
CN110400598A (en) * 2019-07-03 2019-11-01 江苏理工学院 Protein-ligand Conjugated free energy calculation method based on MM/PBSA model
CN113808683A (en) * 2021-09-02 2021-12-17 深圳市绿航星际太空科技研究院 Method and system for virtual screening of drugs based on receptors and ligands

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WEIS AARON, KATEBZADEH KAMBIZ, SÖDERHJELM PÄR, NILSSON INGEMAR, RYDE ULF: "Ligand Affinities Predicted with the MM/PBSA Method: Dependence on the Simulation Method and the Force Field", JOURNAL OF MEDICINAL CHEMISTRY, AMERICAN CHEMICAL SOCIETY, US, vol. 49, no. 22, 1 November 2006 (2006-11-01), US , pages 6596 - 6606, XP093134758, ISSN: 0022-2623, DOI: 10.1021/jm0608210 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22953536

Country of ref document: EP

Kind code of ref document: A1