WO2023123396A1 - Enhanced sampling method, and method for calculating binding free energy of complex - Google Patents

Enhanced sampling method, and method for calculating binding free energy of complex

Info

Publication number
WO2023123396A1
Authority
WO
WIPO (PCT)
Prior art keywords
target molecule
protein complex
free energy
parameters
region
Prior art date
Application number
PCT/CN2021/143802
Other languages
French (fr)
Chinese (zh)
Inventor
李治鹏
杨明俊
邹俊杰
林泓叡
彭春望
方栋
林志雄
万晓
Original Assignee
深圳晶泰科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳晶泰科技有限公司
Priority to PCT/CN2021/143802
Publication of WO2023123396A1

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C: COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00: Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/50: Molecular design, e.g. of drugs

Definitions

  • This application relates to the technical field of drug research and development, in particular to an enhanced sampling method, a method for calculating the binding free energy of a target molecule-protein complex, an enhanced sampling device, a device for calculating the binding free energy of a target molecule-protein complex, electronic devices and computer-readable storage media.
  • FEP: free energy perturbation.
  • Replica Exchange with Solute Tempering (REST2) is an enhanced sampling method in computational chemistry that was originally designed for simulating protein folding systems.
  • the REST2 method heats only a local (hot) region, which reduces the number of replicas required while maintaining the exchange rate, and so achieves efficient and reliable enhanced sampling.
  • FEP/REST2: enhanced sampling for FEP using the REST2 method.
  • the main purpose of the present application is to provide an enhanced sampling method, a method for calculating the binding free energy of the target molecule-protein complex, an enhanced sampling device, a device for calculating the binding free energy of the target molecule-protein complex, an electronic device and a computer-readable storage medium, to solve the problem in the prior art that the REST2 method is difficult to implement and use in FEP.
  • FEP/REST2 simulations are performed using the parameter file as the input of the potential function to generate a trajectory file, which is used to calculate the binding free energy of the target molecule-protein complex.
  • the initial input file is input to the enhanced sampling method as described in the above embodiment to obtain a trajectory file
  • the binding free energy of the target molecule-protein complex is obtained based on the trajectory file.
  • the heating region determination module is used to determine the heating region of the target molecule-protein complex
  • the force field parameter modification module is used to modify the force field parameters of the target molecule-protein complex system in the initial input file corresponding to the heating region, obtain the modified force field parameters of the target molecule-protein complex system, and then generate the parameter files required for FEP/REST2 simulation;
  • the trajectory file generation module is used to perform FEP/REST2 simulation on the parameter file as an input of the potential function to generate a trajectory file, and the trajectory file is used to calculate the binding free energy of the target molecule-protein complex.
  • the enhanced sampling device as described in the above-mentioned embodiment is used to obtain a trajectory file based on the inputted initial input file
  • a calculation module is used to obtain the binding free energy of the target molecule-protein complex based on the trajectory file.
  • one or more processors
  • a memory coupled to the processor, for storing one or more programs
  • the one or more programs are executed by the one or more processors, so that the one or more processors implement the enhanced sampling method as described in the above embodiment or the method for calculating the binding free energy of the target molecule-protein complex as described in the above embodiment.
  • a computer-readable storage medium provided by an embodiment of the present application stores a computer program.
  • when the computer program is executed, the enhanced sampling method as described in any embodiment or the method for calculating the binding free energy of the target molecule-protein complex as described in the above-mentioned embodiments is implemented.
  • Compared with the prior art, the enhanced sampling method, the method for calculating the binding free energy of the target molecule-protein complex, the electronic device and the computer-readable storage medium of the present application have the following beneficial effects:
  • in molecular dynamics simulation the energy of the target molecule-protein complex system is calculated from the potential function. This application therefore determines the heating region of the target molecule-protein complex and then modifies the force field parameters of the target molecule-protein complex system in the initial input file corresponding to the heating region, that is, modifies the parameters in the potential function, to generate the parameter files required for FEP/REST2 simulation. The REST2 workflow is thereby implemented quickly and conveniently with as little code modification as possible and without substantially modifying the underlying code of the molecular dynamics simulation software. The operation is simple, and the calculation efficiency and prediction accuracy of subsequent FEP are improved.
  • FIG. 1 is a schematic flowchart of an enhanced sampling method in an embodiment of the present application.
  • FIG. 2 is a schematic flow chart of determining the heating region in step S10 in the embodiment of the present application.
  • FIG. 3 is a schematic flow chart of modifying the force field parameters of the target molecule-protein complex system corresponding to the heating region in step S20 in the embodiment of the present application.
  • FIG. 4 is a schematic flowchart of a method for calculating the binding free energy of a target molecule-protein complex in an embodiment of the present application.
  • FIG. 5 is a schematic diagram of the basic principle of using relative binding free energy to calculate the difference in free energy between different molecules in the embodiment of the present application.
  • FIG. 6(a) is a schematic diagram of the FEP calculation results when REST2 enhanced sampling is not used.
  • FIG. 6(b) is a schematic diagram of the FEP calculation results when REST2 enhanced sampling is used in this application.
  • FIG. 7 is a schematic structural diagram of an enhanced sampling device in an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a device for calculating the binding free energy of a target molecule-protein complex in an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of an electronic device in an embodiment of the present application.
  • an embodiment of the present invention provides an enhanced sampling method, which is applied in commonly used molecular simulation software, and quickly and conveniently implements the flow of the REST2 method with as little code modification as possible.
  • commonly used molecular simulation software includes, but is not limited to, Amber, GROMACS and NAMD.
  • the enhanced sampling method of the present application includes the following steps:
  • the target molecule may be a drug candidate compound.
  • Free energy perturbation calculation is a method to assess the binding strength of drug small molecules and targets.
  • FEP enhances the reliability of sampling by constructing a series of non-physical intermediate states between bound and unbound states.
  • perturbation is performed by modifying lambda in the input file.
  • the heating region of the target molecule-protein complex is determined so that the complex can be heated locally. This reduces the number of replicas required while maintaining the exchange rate, enabling efficient and reliable enhanced sampling.
  • step S10 includes the following steps:
  • the perturbation region of the FEP calculation is identified as the initial heating region of the target molecule-protein complex.
  • that is, the perturbation region of a general FEP calculation is taken as the initial heating region.
  • the structure of the target molecule is analyzed to determine whether the atoms directly connected to the perturbation region are located on a ring of the target molecule. If such an atom is on a ring, the perturbation region is determined to be directly connected to that ring, and the ring is added to the heating region. If not, the perturbation region is determined not to be connected to a ring, and no processing is performed. Adding the extra ring to the heating region can help some systems converge faster.
  • the steps of analyzing the structure of the target molecule to adjust the heating area are as follows:
  • the initial heating region is defined as the first updated heating region
  • the initial heating region is updated according to the structure of the target molecule to obtain the first updated heating region. It is then determined whether an additional specified heating region exists. If so, it is added to the first updated heating region to give the heating region; if not, the first updated heating region is itself defined as the heating region. The heating region is thereby selected automatically.
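The selection rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the application's implementation: the function name `select_heating_region`, the adjacency-map representation of bonds, and the precomputed list of rings are all hypothetical choices made for the example.

```python
from typing import AbstractSet, Iterable, Mapping, Set


def select_heating_region(
    perturbed: AbstractSet[int],
    bonds: Mapping[int, AbstractSet[int]],
    rings: Iterable[AbstractSet[int]],
    extra: AbstractSet[int] = frozenset(),
) -> Set[int]:
    """Return the atom indices of the heating region (hypothetical helper).

    perturbed: atoms in the FEP perturbation region (initial heating region).
    bonds:     adjacency map, atom index -> indices of bonded atoms.
    rings:     rings of the target molecule, each given as a set of atom indices.
    extra:     optional additionally specified heating region.
    """
    region = set(perturbed)
    # Atoms directly connected to the perturbation region but outside it.
    attached = {
        nbr for atom in perturbed for nbr in bonds.get(atom, ()) if nbr not in perturbed
    }
    # If an attached atom lies on a ring, the whole ring joins the region.
    for ring in rings:
        if attached & ring:
            region |= ring
    # Finally, add any additionally specified heating region.
    region |= set(extra)
    return region


# Toy example: a six-membered ring (atoms 0-5) with a perturbed substituent (atom 6).
toy_bonds = {0: {1, 5, 6}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 0}, 6: {0}}
toy_ring = {0, 1, 2, 3, 4, 5}
heating_region = select_heating_region({6}, toy_bonds, [toy_ring])
```

In the toy example the substituent is bonded to a ring atom, so the whole ring is added to the heating region; without ring information the region stays equal to the perturbation region.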
  • FEP/REST2 refers to enhanced sampling for FEP using the REST2 method. In molecular dynamics simulation, the energy of the target molecule-protein complex system is calculated from the potential function, so the REST2 calculation on the heating region can be realized by modifying the parameters in the potential function, without modifying the underlying code of the molecular dynamics simulation software.
  • step S20 includes the following steps:
  • On the basis of the heating region, this application divides the target molecule-protein complex system into two parts: the heating region and the environment region outside it. The total energy of the target molecule-protein complex system under REST2 can then be divided into three parts (Equation 5): the energy U_c,c inside the heating region, the energy U_c,e between the heating region and the environment region, and the energy U_e,e inside the environment region.
  • Equation 5 The basic formula of REST2 is:
  • the above-mentioned energy U can be calculated from the potential function (Formula 7).
  • from the potential function, the energy U_c,c inside the heating region, the energy U_c,e between the heating region and the environment region, and the energy U_e,e inside the environment region can be calculated, and from these the total energy of the target molecule-protein complex system.
  • ε_i, ε_j are the van der Waals potential well depth parameters of the i-th atom and the j-th atom, respectively, and σ_ij is the van der Waals parameter;
  • q_i, q_j are the electrostatic charge parameters of the i-th atom and the j-th atom;
  • r_i, θ_i and φ_i are the bond length parameter, angle parameter and dihedral angle parameter of the i-th atom, respectively;
  • r_ij is the distance between the i-th atom and the j-th atom;
  • r_0, θ_0, γ_i and n_i are, respectively, the bond length parameter, the angle parameter, the dihedral angle parameter of the i-th atom, and the sample size included in the sampling of intermediate state i.
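Equation 5 and Formula 7 themselves are not reproduced in this text. As a hedged sketch only, the REST2 energy decomposition as it is commonly written in the literature, together with a generic Amber-style potential function, is given below. The scaling factors β_m/β_0 (with β = 1/(k_B T), β_0 for the reference replica and β_m for replica m), the force constants k_r, k_θ, k_φ and the Lennard-Jones form of the van der Waals term are assumptions taken from general usage, not recovered from this application.

```latex
% Equation 5 (assumed form): REST2 energy of replica m
U_m^{\mathrm{REST2}}(X) = \frac{\beta_m}{\beta_0}\, U_{c,c}(X)
  + \sqrt{\frac{\beta_m}{\beta_0}}\, U_{c,e}(X)
  + U_{e,e}(X)

% Formula 7 (assumed form): generic Amber-style potential function
U = \sum_{\mathrm{bonds}} k_r \,(r - r_0)^2
  + \sum_{\mathrm{angles}} k_\theta \,(\theta - \theta_0)^2
  + \sum_{\mathrm{dihedrals}} k_\phi \,\bigl[1 + \cos(n\phi - \gamma)\bigr]
  + \sum_{i<j} \left[ 4\varepsilon_{ij}
      \left( \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12}
           - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6} \right)
      + \frac{q_i q_j}{4\pi\epsilon_0\, r_{ij}} \right],
\qquad \varepsilon_{ij} = \sqrt{\varepsilon_i \varepsilon_j}
```

Written this way, every term of U_c,c, U_c,e and U_e,e is linear in a force constant, a well depth or a product of charges, which is why the REST2 scaling can be expressed purely as a change of force field parameters.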
  • the parameter k is divided into different types, including parameters inside the heating region, parameters between the heating region and the environment region, and parameters inside the environment region.
  • the force field parameters that need to be modified are determined, including bond length, bond angle, dihedral angle, van der Waals and electrostatic parameters.
  • in step S22, at least partially modifying the parameters inside the heating region, the parameters between the heating region and the environment region, and the parameters inside the environment region includes the following steps:
  • the first type of parameters include bonding coefficients, van der Waals coefficients and electrostatic coefficients
  • the second type of parameters include bonding coefficients
  • the parameters located inside the heating region are modified by multiplying them by the corresponding first-type parameters, which include bonding coefficients, van der Waals coefficients and electrostatic coefficients.
  • the parameters located between the heating region and the environment region (bonding parameters) are modified by multiplying them by the corresponding second-type parameter, which is the bonding coefficient between the heating region and the environment region.
  • the bonding parameters in this embodiment are parameters related to the chemical bonds that connect atoms, such as bond length coefficients, bond angle coefficients and dihedral angle coefficients.
  • the parameters inside the environment region are not modified.
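The parameter modification above can be illustrated with a small Python sketch. It assumes the usual REST2 convention rather than anything stated in this text: a single scaling factor s = β_m/β_0, applied as s to terms wholly inside the heating region and as √s to terms that straddle the heating and environment regions. The `Atom` and `Dihedral` classes and the function `rest2_scale` are hypothetical names introduced for the example, and only charges, well depths and dihedral force constants are scaled here.

```python
import math
from dataclasses import dataclass, replace
from typing import AbstractSet, List, Sequence, Tuple


@dataclass(frozen=True)
class Atom:
    charge: float   # partial charge q_i
    epsilon: float  # van der Waals well depth of atom i


@dataclass(frozen=True)
class Dihedral:
    atoms: Tuple[int, int, int, int]  # indices of the four atoms
    k: float                          # dihedral force constant


def rest2_scale(
    atoms: Sequence[Atom],
    dihedrals: Sequence[Dihedral],
    hot: AbstractSet[int],
    scale: float,
) -> Tuple[List[Atom], List[Dihedral]]:
    """Return scaled copies of the parameters (assumed convention, see lead-in).

    scale is beta_m / beta_0, i.e. T_0 / T_m, and is <= 1 for a heated replica.
    """
    root = math.sqrt(scale)
    # Per-atom scaling: with a geometric combining rule for epsilon, hot-hot pairs
    # end up scaled by `scale` and hot-environment pairs by sqrt(scale).
    new_atoms = [
        replace(a, charge=a.charge * root, epsilon=a.epsilon * scale) if i in hot else a
        for i, a in enumerate(atoms)
    ]
    new_dihedrals = []
    for d in dihedrals:
        n_hot = sum(1 for i in d.atoms if i in hot)
        if n_hot == len(d.atoms):
            factor = scale   # term entirely inside the heating region
        elif n_hot > 0:
            factor = root    # term between heating and environment regions
        else:
            factor = 1.0     # environment terms are not modified
        new_dihedrals.append(replace(d, k=d.k * factor))
    return new_atoms, new_dihedrals
```

Scaling per-atom charges and well depths, rather than each pair interaction, is one way to obtain the pairwise factors without touching the simulation engine; whether it matches a given force field depends on that force field's combining rule.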
  • the modified parameter file is used as the input of the potential function for FEP/REST2 simulation.
  • the trajectory files can be used to calculate the binding free energy of the target molecule-protein complex.
  • the enhanced sampling method of this application has the following beneficial effects:
  • in molecular dynamics simulation the energy of the target molecule-protein complex system is calculated from the potential function. This application therefore determines the heating region of the target molecule-protein complex and then modifies the force field parameters of the target molecule-protein complex system in the initial input file corresponding to the heating region, that is, modifies the parameters in the potential function, to generate the parameter files required for FEP/REST2 simulation. The REST2 workflow is thereby implemented quickly and conveniently with as little code modification as possible and without substantially modifying the underlying code of the molecular dynamics simulation software. The operation is simple, and the calculation efficiency and prediction accuracy of subsequent FEP are improved.
  • the embodiment of the present invention also provides a complete calculation method for binding free energy.
  • a method for calculating the binding free energy of the target molecule-protein complex of the present application includes the following steps:
  • the initial input file includes a file of the three-dimensional conformation of the target molecule-protein complex, which may specifically be a file in PDB format.
  • the trajectory file is analyzed, that is, the molecular dynamics trajectories of the target molecule-protein complex system obtained in different states are analyzed.
  • step S101 of constructing the initial input file of the target molecule-protein complex system the following steps are further included:
  • step S20, modifying the force field parameters of the target molecule-protein complex system in the initial input file corresponding to the heating region to obtain the modified force field parameters of the target molecule-protein complex system, and then generating the parameter file required for FEP/REST2 simulation, includes the following steps:
  • the parameter files required for FEP/REST2 simulation are generated.
  • step S103, obtaining the binding free energy of the target molecule-protein complex based on the trajectory file, is as follows: based on the trajectory file, the Bennett acceptance ratio method is used to calculate the binding free energy of the target molecule-protein complex.
  • this embodiment uses the Bennett Acceptance Ratio (BAR) method to calculate the binding free energy of the target molecule-protein complex; the specific calculation is shown in the following formula. Reweighting can be used to calculate the energy difference ΔU of a simulated trajectory under the parameter files of other states, and the free energy difference ΔG between the two states is finally obtained through the formula.
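For readers who want to see the BAR estimate in executable form, the following Python sketch solves the standard self-consistent BAR equation by bisection. It is a minimal illustration, not the application's code: the function names are hypothetical, and the inputs are assumed to be reduced energy differences, `w_f` = (U_j − U_i)/(k_B T) evaluated on frames sampled in state i and `w_r` = (U_i − U_j)/(k_B T) evaluated on frames sampled in state j.

```python
import math
from typing import Sequence


def _fermi(x: float) -> float:
    """Numerically stable evaluation of 1 / (1 + exp(x))."""
    if x >= 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def bar_free_energy(w_f: Sequence[float], w_r: Sequence[float], kT: float = 1.0) -> float:
    """Estimate the free energy difference between states i and j with BAR.

    w_f: reduced forward energy differences from the state-i trajectory.
    w_r: reduced reverse energy differences from the state-j trajectory.
    kT:  k_B * T in the desired energy unit; the result is returned in that unit.
    """
    m = math.log(len(w_f) / len(w_r))

    def g(df: float) -> float:
        # BAR condition: forward and reverse Fermi sums are equal at the solution.
        fwd = sum(_fermi(m + w - df) for w in w_f)
        rev = sum(_fermi(-m + w + df) for w in w_r)
        return fwd - rev

    # g increases monotonically with df, so bracket the root and bisect.
    lo, hi = -1.0, 1.0
    while g(lo) > 0:
        lo *= 2.0
    while g(hi) < 0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return kT * 0.5 * (lo + hi)
```

The total free energy change along a perturbation path would then be the sum of the BAR estimates between adjacent intermediate states.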
  • step S103 further includes the following steps:
  • the relative binding free energy ΔG_binding of the reference compound converted into the target molecule is calculated according to Formula 3;
  • the binding free energy ΔG_2 of the target molecule-protein complex is calculated according to Formula 4;
  • ΔG_2 = ΔG_binding + ΔG_1 (Formula 4);
  • ΔG is the free energy difference
  • ΔU_ij is the first energy
  • ΔU_ji is the second energy
  • ⟨·⟩_i is the system average under intermediate state i
  • ⟨·⟩_j is the system average under intermediate state j
  • N_i and N_j are the numbers of frames of the simulation trajectories under intermediate states i and j
  • k_B is the Boltzmann constant
  • T is the simulation temperature, generally 298 K
  • ΔG_binding is the relative binding free energy
  • ΔG_A is the solvent free energy difference of the reference compound converted into the target molecule
  • ΔG_B is the binding free energy difference of the reference compound converted into the target molecule
  • ΔG_1 is the known binding free energy of the reference compound-protein complex
  • ΔG_2 is the binding free energy of the target molecule-protein complex.
  • relative binding free energy (RBFE) is used to calculate the free energy difference ΔG_binding between different molecules, and from it the binding free energy of the target molecule-protein complex.
  • FIG. 5 is a schematic diagram of the basic principle of calculating the free energy difference ΔG between different molecules using relative binding free energy (RBFE).
  • the upper left panel shows reference compound A separated from the protein receptor
  • the lower left panel shows complex structure A formed by reference compound A and the protein receptor
  • ΔG_1 is the binding free energy of complex structure A formed by reference compound A and the protein receptor
  • the upper right panel shows target molecule B separated from the protein receptor
  • the lower right panel shows complex structure B formed by target molecule B and the protein receptor
  • ΔG_2 is the binding free energy of complex structure B formed by target molecule B and the protein receptor
  • ΔG_A is the solvent free energy difference of reference compound A converted into target molecule B
  • ΔG_B is the binding free energy difference of reference compound A converted into target molecule B.
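The thermodynamic cycle of FIG. 5 reduces to two lines of arithmetic, sketched below in Python. Formula 3 is not reproduced in this text, so the form ΔG_binding = ΔG_B − ΔG_A is an assumption inferred from closing the cycle (ΔG_1 + ΔG_B = ΔG_A + ΔG_2); it is consistent with Formula 4. The function names and the numerical values are hypothetical and serve only as an illustration.

```python
def relative_binding_free_energy(dg_b: float, dg_a: float) -> float:
    """Assumed Formula 3: dG_binding = dG_B - dG_A.

    dg_b: free energy of converting the reference compound into the target
          molecule in the protein-bound state.
    dg_a: free energy of the same conversion in solvent.
    """
    return dg_b - dg_a


def target_binding_free_energy(dg_1: float, dg_b: float, dg_a: float) -> float:
    """Formula 4: dG_2 = dG_binding + dG_1, with dG_1 known for the reference."""
    return relative_binding_free_energy(dg_b, dg_a) + dg_1


# Hypothetical values in kcal/mol, for illustration only.
dg_2 = target_binding_free_energy(dg_1=-9.0, dg_b=3.5, dg_a=4.7)
```

A negative ΔG_binding means the target molecule is predicted to bind more strongly than the reference compound.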
  • in this example, the Amber program is taken as the commonly used molecular simulation software, a protein-small molecule complex is selected as the test system, the dihedral angle, van der Waals and electrostatic parameters are selected for modification, and the binding free energy of the protein-small molecule complex is calculated.
  • the relative binding free energy ΔG_binding of the reference compound converted into the target molecule is calculated according to Formula 3;
  • ΔG_2 = ΔG_binding + ΔG_1 (Formula 4);
  • ΔG is the free energy difference
  • ΔU_ij is the first energy
  • ΔU_ji is the second energy
  • ⟨·⟩_i is the system average under intermediate state i
  • ⟨·⟩_j is the system average under intermediate state j
  • N_i and N_j are the numbers of frames of the simulation trajectories under intermediate states i and j
  • k_B is the Boltzmann constant
  • T is the simulation temperature
  • ΔG_binding is the relative binding free energy
  • ΔG_A is the solvent free energy difference of the reference compound converted into the target molecule
  • ΔG_B is the binding free energy difference of the reference compound converted into the target molecule
  • ΔG_1 is the known binding free energy of the reference compound-protein complex
  • ΔG_2 is the binding free energy of the target molecule-protein complex.
  • the method for calculating the binding free energy of the target molecule-protein complex of the present application has the following beneficial effects:
  • the REST2 workflow can be implemented quickly and conveniently with as little code modification as possible and without substantially modifying the underlying code of the molecular dynamics simulation software, and the overall workflow of the FEP/REST2 method is automated. This improves the calculation efficiency and prediction accuracy of FEP, making it easier and faster to calculate the binding free energy between target molecules and proteins.
  • the enhanced sampling device 100 includes:
  • the heating region determination module 110 is used to determine the heating region of the target molecule-protein complex
  • the force field parameter modification module 120 is used to modify the force field parameters of the target molecule-protein complex system in the initial input file corresponding to the heating region, obtain the modified force field parameters of the target molecule-protein complex system, and then further generate Parameter files required for FEP/REST2 simulations; and
  • the trajectory file generating module 130 is configured to perform FEP/REST2 simulation on the parameter file as an input of the potential function to generate a trajectory file, and the trajectory file is used to calculate the binding free energy of the target molecule-protein complex.
  • the force field parameter modification module 120 includes:
  • a parameter division module configured to divide the force field parameter items of the target molecule-protein complex system into parameters inside the heating region, parameters between the heating region and the environment region, and parameters inside the environment region;
  • the first modification module is used to at least partially modify the parameters inside the heating region, the parameters between the heating region and the environment region, and the parameters inside the environment region, to obtain the modified force field parameters of the target molecule-protein complex system and then generate the parameter files required for the FEP/REST2 simulation.
  • the first modification module is specifically used for:
  • the first type of parameters include bonding coefficients, van der Waals coefficients and electrostatic coefficients
  • the second type of parameters include bonding coefficients
  • the heating region determination module 110 is specifically used for:
  • the initial heating region is defined as the first updated heating region
  • Each module in the above-mentioned enhanced sampling device 100 may be fully or partially implemented by software, hardware or a combination thereof.
  • the above-mentioned modules can be embedded in or independent of the processor in the computer device in the form of hardware, and can also be stored in the memory of the computer device in the form of software, so that the processor can invoke and execute the corresponding operations of the above-mentioned modules.
  • the embodiment of the present application also provides a device 200 for calculating the binding free energy of the target molecule-protein complex, the device 200 includes:
  • Construction module 210 used to construct the initial input file of the target molecule-protein complex system
  • the enhanced sampling device 100 as described in any of the above-mentioned embodiments is configured to obtain a trajectory file based on the input initial input file;
  • a calculation module 220 configured to obtain the binding free energy of the target molecule-protein complex based on the trajectory file.
  • a device 200 for calculating the binding free energy of the target molecule-protein complex further includes:
  • Reference compound selection module used to select small molecules with known small molecule-protein complex conformations as reference compounds
  • the heating region determination module 110 is also used to determine the heating region of the reference compound-protein complex
  • the force field parameter modification module 120 is specifically used for:
  • the parameter files required for FEP/REST2 simulation are generated.
  • the calculation module 220 includes a first sub-calculation module, and the first sub-calculation module is used for:
  • the Bennett acceptance ratio method is used for calculation to obtain the binding free energy of the target molecule-protein complex
  • the first sub-computing module is specifically used for:
  • the relative binding free energy ΔG_binding of the reference compound converted into the target molecule is calculated according to Formula 3;
  • the binding free energy ΔG_2 of the target molecule-protein complex is calculated according to Formula 4;
  • ΔG_2 = ΔG_binding + ΔG_1 (Formula 4);
  • ΔG is the free energy difference
  • ΔU_ij is the first energy
  • ΔU_ji is the second energy
  • ⟨·⟩_i is the system average under intermediate state i
  • ⟨·⟩_j is the system average under intermediate state j
  • N_i and N_j are the numbers of frames of the simulation trajectories under intermediate states i and j
  • k_B is the Boltzmann constant
  • T is the simulation temperature
  • ΔG_binding is the relative binding free energy
  • ΔG_A is the solvent free energy difference of the reference compound converted into the target molecule
  • ΔG_B is the binding free energy difference of the reference compound converted into the target molecule
  • ΔG_1 is the known binding free energy of the reference compound-protein complex
  • ΔG_2 is the binding free energy of the target molecule-protein complex.
  • each module in the above-mentioned device 200 for calculating the binding free energy of the target molecule-protein complex can be fully or partially implemented by software, hardware or a combination thereof.
  • the above-mentioned modules can be embedded in or independent of the processor in the computer device in the form of hardware, and can also be stored in the memory of the computer device in the form of software, so that the processor can invoke and execute the corresponding operations of the above-mentioned modules.
  • the embodiment of the present application also provides an electronic device, including:
  • one or more processors
  • a memory coupled to the processor, for storing one or more programs
  • when the one or more programs are executed by the one or more processors, the one or more processors implement the enhanced sampling method described in any of the above embodiments or the method for calculating the binding free energy of the target molecule-protein complex described in any of the above embodiments.
  • the processor is used to control the overall operation of the terminal device to complete all or part of the steps of the above enhanced sampling method or the method for calculating the binding free energy of the target molecule-protein complex described in any of the above embodiments.
  • the memory is used to store various types of data to support the operation of the terminal device, such data may include instructions for any application program or method operated on the terminal device, and application-related data.
  • the memory can be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk.
  • the terminal device may be implemented by one or more application-specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field programmable gate arrays (FPGA), controllers, microcontrollers, microprocessors or other electronic components, for performing the enhanced sampling method as described in any of the above embodiments or the method for calculating the binding free energy of the target molecule-protein complex as described in any of the above embodiments, and achieves the same technical effect as the above methods.
  • a computer-readable storage medium including a computer program is also provided; when the computer program is executed by a processor, the enhanced sampling method as described in any of the above embodiments or the method for calculating the binding free energy of the target molecule-protein complex as described in any of the above embodiments is implemented.
  • the computer-readable storage medium may be the above-mentioned memory including a computer program, and the computer program can be executed by the processor of the terminal device to complete the enhanced sampling method as described in any of the above embodiments or the method for calculating the binding free energy of the target molecule-protein complex as described in any of the above embodiments, and achieve the same technical effect as the above methods.

Landscapes

  • Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The present invention relates to the technical field of research and development of drugs, and in particular to an enhanced sampling method, and a method for calculating binding free energy of a complex. The enhanced sampling method comprises: determining a hot region of a target molecule-protein complex; modifying a force field parameter of a target molecule-protein complex system in an initial input file on the basis of the hot region to obtain a force field parameter of a modified target molecule-protein complex system, then further generating a parameter file required for FEP/REST2 simulation; and performing FEP/REST2 simulation by using the parameter file as an input of a potential function to generate a trajectory file. According to the present application, the force field parameter of the target molecule-protein complex system in the initial input file is modified on the basis of the determined hot region, the process of an REST2 method is quickly and conveniently implemented with as few code modifications as possible, and the operation is simple, such that the calculation efficiency and prediction precision of subsequent FEP are improved.

Description

Enhanced sampling method and method for calculating the binding free energy of a complex
【Technical Field】
This application relates to the technical field of drug research and development, and in particular to an enhanced sampling method, a method for calculating the binding free energy of a target molecule-protein complex, an enhanced sampling device, a device for calculating the binding free energy of a target molecule-protein complex, an electronic device, and a computer-readable storage medium.
【Background】
The binding free energy between a drug molecule and a protein is an index of the drug molecule's activity and can be predicted by a variety of computational methods. Among them, free energy perturbation (FEP) is a high-precision computational chemistry method that has been widely used in drug design. However, because FEP is based on molecular dynamics simulation, it often suffers from sampling problems. For an insufficiently sampled system, the computed binding free energy does not converge well, which greatly reduces the accuracy of the prediction.
Various enhanced sampling methods exist in the prior art to address this problem. One of them, replica exchange with solute tempering (REST2), is an enhanced sampling method from computational chemistry that was originally designed for simulating protein folding. By heating only a local part of the system (the hot region), REST2 reduces the number of replicas required while maintaining the exchange rate, giving efficient and reliable enhanced sampling. Many studies have shown that applying REST2 enhanced sampling to FEP (hereinafter FEP/REST2) significantly improves the accuracy of FEP calculations. However, in long-term research the inventors of the present application found that FEP/REST2 faces two problems: (1) using REST2 within FEP requires modifying the underlying code of the molecular dynamics software, which is difficult to implement; and (2) the overall FEP/REST2 workflow is complex and difficult to use.
【Summary of the Invention】
The main purpose of the present application is to provide an enhanced sampling method, a method for calculating the binding free energy of a target molecule-protein complex, an enhanced sampling device, a device for calculating the binding free energy of a target molecule-protein complex, an electronic device, and a computer-readable storage medium, so as to solve the prior-art problem that using the REST2 method in FEP is difficult both to implement and to use.
An enhanced sampling method provided by an embodiment of the present application includes:
determining a hot region of a target molecule-protein complex;
modifying, according to the hot region, the force field parameters of the target molecule-protein complex system in an initial input file to obtain modified force field parameters of the target molecule-protein complex system, and then generating the parameter file required for FEP/REST2 simulation; and
performing FEP/REST2 simulation with the parameter file as the input of the potential function to generate a trajectory file, the trajectory file being used to calculate the binding free energy of the target molecule-protein complex.
A method for calculating the binding free energy of a target molecule-protein complex provided by an embodiment of the present application includes:
constructing an initial input file of the target molecule-protein complex system;
inputting the initial input file into the enhanced sampling method described in the above embodiment to obtain a trajectory file; and
obtaining the binding free energy of the target molecule-protein complex based on the trajectory file.
An enhanced sampling device provided by an embodiment of the present application includes:
a hot region determination module, configured to determine a hot region of a target molecule-protein complex;
a force field parameter modification module, configured to modify, according to the hot region, the force field parameters of the target molecule-protein complex system in an initial input file to obtain modified force field parameters of the target molecule-protein complex system, and then generate the parameter file required for FEP/REST2 simulation; and
a trajectory file generation module, configured to perform FEP/REST2 simulation with the parameter file as the input of the potential function to generate a trajectory file, the trajectory file being used to calculate the binding free energy of the target molecule-protein complex.
A device for calculating the binding free energy of a target molecule-protein complex provided by an embodiment of the present application includes:
a construction module, configured to construct an initial input file of the target molecule-protein complex system;
the enhanced sampling device described in the above embodiment, configured to obtain a trajectory file based on the input initial input file; and
a calculation module, configured to obtain the binding free energy of the target molecule-protein complex based on the trajectory file.
An electronic device provided by an embodiment of the present application includes:
one or more processors; and
a memory, coupled to the processors and configured to store one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the enhanced sampling method described in the above embodiment or the method for calculating the binding free energy of a target molecule-protein complex described in the above embodiment.
A computer-readable storage medium provided by an embodiment of the present application stores a computer program which, when executed by a processor, implements the enhanced sampling method described in any embodiment or the method for calculating the binding free energy of a target molecule-protein complex described in the above embodiment.
Compared with the prior art, the enhanced sampling method, the method for calculating the binding free energy of a target molecule-protein complex, the electronic device, and the computer-readable storage medium of the present application have the following beneficial effects:
In molecular dynamics simulation, the energy of the target molecule-protein complex system is calculated from a potential function. The present application therefore determines the hot region of the target molecule-protein complex and then modifies, according to that region, the force field parameters of the system in the initial input file (that is, the parameters of the potential function) to generate the parameter file required for FEP/REST2 simulation. The REST2 workflow is thus implemented quickly and conveniently with as little code modification as possible and without substantial changes to the underlying code of the molecular dynamics simulation software. The operation is simple, and the computational efficiency and prediction accuracy of the subsequent FEP calculation are improved.
【Description of the Drawings】
The embodiments of the present application are described with reference to the accompanying drawings. The drawings serve only to illustrate the embodiments. Without departing from the principles of the present application, those skilled in the art can readily derive other embodiments from the steps described below.
FIG. 1 is a schematic flowchart of an enhanced sampling method in an embodiment of the present application.
FIG. 2 is a schematic flowchart of determining the hot region in step S10 in an embodiment of the present application.
FIG. 3 is a schematic flowchart of modifying, according to the hot region, the force field parameters of the target molecule-protein complex system in step S20 in an embodiment of the present application.
FIG. 4 is a schematic flowchart of a method for calculating the binding free energy of a target molecule-protein complex in an embodiment of the present application.
FIG. 5 is a schematic diagram of the basic principle of using relative free energy to calculate the free energy difference between different molecules in an embodiment of the present application.
FIG. 6(a) is a schematic diagram of FEP calculation results without REST2 enhanced sampling.
FIG. 6(b) is a schematic diagram of FEP calculation results with the REST2 enhanced sampling of the present application.
FIG. 7 is a schematic structural diagram of an enhanced sampling device in an embodiment of the present application.
FIG. 8 is a schematic structural diagram of a device for calculating the binding free energy of a target molecule-protein complex in an embodiment of the present application.
FIG. 9 is a schematic structural diagram of an electronic device in an embodiment of the present application.
【Detailed Description】
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. The specific embodiments described here serve only to explain the present application and do not limit it. For ease of description, the drawings show only the parts relevant to the present application rather than the entire structure. All other embodiments obtained by persons of ordinary skill in the art from the embodiments in this application without creative effort fall within the scope of protection of this application.
The terms "first", "second", and so on in this application distinguish different objects and do not describe a particular order. The terms "include" and "have", and any variations of them, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device comprising a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or other steps or units inherent to the process, method, product, or device.
Reference herein to an "embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the present application. Occurrences of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive of other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
Referring to FIG. 1, an embodiment of the present invention provides an enhanced sampling method that is applied in commonly used molecular simulation software and implements the REST2 workflow quickly and conveniently with as little code modification as possible. Commonly used molecular simulation software includes, but is not limited to, Amber, GROMACS, and NAMD.
Specifically, the enhanced sampling method of the present application includes the following steps:
S10. Determine the hot region of the target molecule-protein complex.
In this embodiment, the target molecule may be a drug candidate compound.
Free energy perturbation (FEP) is a method for evaluating the binding strength between a small-molecule drug and its target. FEP improves the reliability of sampling by constructing a series of non-physical intermediate states between the bound and unbound states. In an FEP calculation, the perturbation is applied by changing the coupling parameter lambda in the input file. In this embodiment, the hot region of the target molecule-protein complex is determined from the perturbed region of an ordinary FEP calculation, so that the complex under study is heated only locally. This reduces the number of replicas required while maintaining the exchange rate, giving efficient and reliable enhanced sampling.
In a specific embodiment, as shown in FIG. 2, step S10 includes the following steps:
The perturbed region of the FEP calculation is taken as the initial hot region of the target molecule-protein complex.
In this embodiment, the perturbed region of an ordinary FEP calculation serves as the initial hot region.
After the initial hot region is determined, the structure of the target molecule is analyzed to determine whether the atoms directly bonded to the perturbed region lie on a ring within the target molecule. If they do, the perturbed region is directly connected to the ring, and the ring is added to the hot region. If they do not, the perturbed region is not connected to a ring, and nothing further is done. Adding the extra ring to the hot region can help some systems converge faster.
Specifically, the structure of the target molecule is analyzed to adjust the hot region as follows:
determine whether the target molecule contains a ring;
if the target molecule contains a ring, further determine whether at least some atoms of the initial hot region lie on the ring, or whether at least some atoms of the initial hot region are directly or indirectly connected to atoms on the ring;
if at least some atoms of the initial hot region lie on the ring or are directly connected to atoms on the ring, add the region containing the ring to the initial hot region to obtain a first updated hot region;
if at least some atoms of the initial hot region are indirectly connected to atoms on the ring, add the region containing the ring and the part indirectly connected to the ring to the initial hot region to obtain a first updated hot region;
if none of the atoms of the initial hot region lies on the ring or is directly or indirectly connected to atoms on the ring, take the initial hot region as the first updated hot region;
if the target molecule contains no ring, take the initial hot region as the first updated hot region;
determine whether there is an additionally specified region that needs to be heated;
if there is, add the additionally specified region to the first updated hot region to obtain the hot region;
if there is not, take the first updated hot region as the hot region.
In this way, the initial hot region is updated according to the structure of the target molecule to obtain the first updated hot region. It is then determined whether an additionally specified region needs to be heated: if so, that region is added to the first updated hot region to obtain the hot region; if not, the first updated hot region is taken as the hot region. The hot region is thereby selected automatically.
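The selection procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the applicant's implementation: the function name `select_hot_region`, the adjacency-dictionary representation of bonds, and the pre-computed list of rings are all assumptions made for the example. Because the text does not define how far an "indirect" connection may extend, the sketch covers only the "on the ring" and "directly connected" cases, plus the additionally specified region.

```python
from typing import Dict, Iterable, List, Set


def select_hot_region(
    perturbed: Iterable[int],
    bonds: Dict[int, Set[int]],
    rings: List[Set[int]],
    extra: Iterable[int] = (),
) -> Set[int]:
    """Return the atom indices of the hot region (hypothetical helper)."""
    initial = set(perturbed)
    # Atoms of the initial region plus every atom bonded directly to it.
    touched = set(initial)
    for atom in initial:
        touched |= bonds.get(atom, set())
    hot = set(initial)
    for ring in rings:
        # A ring qualifies if an initial-region atom lies on it or is bonded to it.
        if touched & ring:
            hot |= ring
    # Additionally specified atoms are added last.
    hot |= set(extra)
    return hot


# Toy molecule: atom 0 is the perturbed substituent bonded to ring atom 1;
# a second ring (atoms 7..12) hangs off atom 4 and is not adjacent to atom 0.
bonds = {
    0: {1}, 1: {0, 2, 6}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5, 7},
    5: {4, 6}, 6: {5, 1}, 7: {4, 8, 12}, 8: {7, 9}, 9: {8, 10},
    10: {9, 11}, 11: {10, 12}, 12: {11, 7},
}
rings = [{1, 2, 3, 4, 5, 6}, {7, 8, 9, 10, 11, 12}]
hot_region = select_hot_region([0], bonds, rings, extra=[9])
print(sorted(hot_region))
```

In practice the bond graph and ring list would come from a cheminformatics toolkit; they are passed in explicitly here so that the example stays self-contained.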
S20. Modify, according to the hot region, the force field parameters of the target molecule-protein complex system in the initial input file to obtain modified force field parameters of the target molecule-protein complex system, and then generate the parameter file required for FEP/REST2 simulation.
FEP/REST2 means applying REST2 enhanced sampling to FEP. In molecular dynamics simulation, the energy of the target molecule-protein complex system is calculated from a potential function. REST2 can therefore be applied to the hot region by modifying the parameters of the potential function, without modifying the underlying code of the molecular dynamics simulation software.
After the hot region is determined, the force field parameters of the target molecule-protein complex system that need to be modified are identified so that they can be changed accordingly. Specifically, step S20 includes the following steps:
dividing the force field parameter terms of the target molecule-protein complex system into parameters inside the hot region, parameters between the hot region and the environment region, and parameters inside the environment region; and
at least partially modifying the parameters inside the hot region, the parameters between the hot region and the environment region, and the parameters inside the environment region to obtain modified force field parameters of the target molecule-protein complex system, and then generating the parameter file required for the FEP/REST2 simulation.
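The three-way division of parameter terms can be illustrated with a small helper. The following is a sketch under stated assumptions: the function name `classify_term` and the labels `"cc"`, `"ce"`, and `"ee"` are hypothetical, chosen to mirror the subscripts of the three energy contributions, and a term is treated as "between regions" whenever its atoms are split across the hot region and the environment.

```python
from typing import Iterable, Set


def classify_term(atoms: Iterable[int], hot: Set[int]) -> str:
    """Classify a force field term by where its atoms sit (hypothetical helper).

    Returns "cc" if every atom is in the hot region, "ee" if none is,
    and "ce" if the term straddles the hot region and the environment.
    """
    atoms = list(atoms)
    n_hot = sum(1 for atom in atoms if atom in hot)
    if n_hot == len(atoms):
        return "cc"
    if n_hot == 0:
        return "ee"
    return "ce"


hot = {0, 1, 2}
print(classify_term((0, 1), hot))        # bond fully inside the hot region
print(classify_term((1, 2, 3), hot))     # angle straddling the boundary
print(classify_term((3, 4, 5, 6), hot))  # dihedral fully in the environment
```

The same function applies to bonds, angles, dihedrals, and nonbonded pairs, since each is identified only by the tuple of atoms it involves.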
Once the hot region has been obtained, the target molecule-protein complex system can be divided into two parts: the hot region and the environment region outside it. The total energy of the system under REST2, $U_m^{\mathrm{REST2}}$, can accordingly be divided into three parts (Equation 5): the energy inside the hot region $U_{c,c}$, the energy between the hot region and the environment region $U_{c,e}$, and the energy inside the environment region $U_{e,e}$. The basic REST2 formulas are:
$$U_m^{\mathrm{REST2}} = \frac{\beta_m}{\beta_0}\,U_{c,c} + \sqrt{\frac{\beta_m}{\beta_0}}\,U_{c,e} + U_{e,e} \tag{5}$$
$$\beta_m = \frac{1}{k_B T_m}, \qquad \beta_0 = \frac{1}{k_B T_0} \tag{6}$$
where $U_m^{\mathrm{REST2}}$ is the total energy of the target molecule-protein complex system when REST2 is used, $U_{c,c}$ is the energy inside the hot region, $U_{c,e}$ is the energy between the hot region and the environment region, $U_{e,e}$ is the energy inside the environment region, $k_B$ is the Boltzmann constant, $T_0$ is the ambient temperature (typically 298 K), and $T_m$ is the temperature of the hot region, which the user can choose.
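As a worked illustration of how the three energy contributions combine, the sketch below assumes the standard REST2 weighting: the hot-region energy is scaled by β_m/β_0 = T_0/T_m, the hot-region/environment energy by the square root of that ratio, and the environment energy is left unchanged. The function names and the example energies are hypothetical.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def rest2_scale_factors(t0: float, tm: float) -> tuple:
    """Return (beta_m/beta_0, sqrt(beta_m/beta_0)) for ambient t0 and hot-region tm."""
    beta_0 = 1.0 / (K_B * t0)
    beta_m = 1.0 / (K_B * tm)
    ratio = beta_m / beta_0  # algebraically equal to t0 / tm
    return ratio, math.sqrt(ratio)


def rest2_energy(u_cc: float, u_ce: float, u_ee: float,
                 t0: float = 298.0, tm: float = 600.0) -> float:
    """Combine the three energy contributions with the assumed REST2 weights."""
    s_cc, s_ce = rest2_scale_factors(t0, tm)
    return s_cc * u_cc + s_ce * u_ce + u_ee


# Hypothetical energies in kcal/mol; the hot region is at four times the ambient temperature.
print(rest2_energy(-10.0, -20.0, -300.0, t0=298.0, tm=1192.0))
```

When T_m equals T_0 both factors are 1 and the unmodified energy is recovered, which is a convenient sanity check for any implementation.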
In molecular dynamics simulation, each of the above energies $U$ can be calculated from the potential function (Equation 7); all parameters and variables involved are supplied by the molecular dynamics simulation software. In this way, the energy inside the hot region $U_{c,c}$, the energy between the hot region and the environment region $U_{c,e}$, and the energy inside the environment region $U_{e,e}$ can each be calculated, and from them the total energy of the target molecule-protein complex system $U_m^{\mathrm{REST2}}$. The basic potential function is:
$$U = \sum_{\text{bonds}} k_i^{r}\,(r_i - r_0)^2 + \sum_{\text{angles}} k_i^{\theta}\,(\theta_i - \theta_0)^2 + \sum_{\text{dihedrals}} k_i^{\phi}\,\bigl[1 + \cos(n_i\phi_i - \delta_i)\bigr] + \sum_{i<j}\left\{4\sqrt{\epsilon_i\epsilon_j}\left[\left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12} - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6}\right] + \frac{q_i q_j}{r_{ij}}\right\} \tag{7}$$
where $k_i$ denotes the bonded parameters, comprising $k_i^{r}$, $k_i^{\theta}$, and $k_i^{\phi}$, the bond length, angle, and dihedral coefficients respectively; $\epsilon_i$ and $\epsilon_j$ are the van der Waals well-depth parameters of the i-th and j-th atoms, and $\sigma_{ij}$ is the van der Waals distance parameter; $q_i$ and $q_j$ are the charges of the i-th and j-th atoms; $r_i$, $\theta_i$, and $\phi_i$ are the bond length, angle, and dihedral angle of the i-th term; $r_{ij}$ is the distance between the i-th and j-th atoms; $r_0$, $\theta_0$, and $\delta_i$ are the reference bond length, reference angle, and dihedral phase; and $n_i$ is the multiplicity of the i-th dihedral term.
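To make the roles of the listed parameters concrete, the sketch below evaluates individual terms of a common Amber-style functional form: harmonic bonds, cosine dihedrals, and a Lennard-Jones plus Coulomb pair term with a geometric-mean well depth. This form is an assumption for illustration, as are the function names; the Coulomb prefactor defaults to 1.0 on the assumption that charges are stored in units that absorb it.

```python
import math


def bond_energy(k_r: float, r: float, r0: float) -> float:
    """Harmonic bond term k_r * (r - r0)^2; an angle term has the same shape."""
    return k_r * (r - r0) ** 2


def dihedral_energy(k_phi: float, n: int, phi: float, delta: float) -> float:
    """Cosine dihedral term k_phi * [1 + cos(n*phi - delta)]."""
    return k_phi * (1.0 + math.cos(n * phi - delta))


def pair_energy(eps_i: float, eps_j: float, sigma_ij: float,
                q_i: float, q_j: float, r_ij: float,
                coulomb_const: float = 1.0) -> float:
    """Lennard-Jones plus Coulomb energy of one nonbonded atom pair."""
    eps_ij = math.sqrt(eps_i * eps_j)  # geometric-mean well depth
    sr6 = (sigma_ij / r_ij) ** 6
    lennard_jones = 4.0 * eps_ij * (sr6 * sr6 - sr6)
    coulomb = coulomb_const * q_i * q_j / r_ij
    return lennard_jones + coulomb


# An uncharged pair at the Lennard-Jones minimum r = 2^(1/6) * sigma has energy -eps_ij.
print(pair_energy(1.0, 1.0, 1.0, 0.0, 0.0, 2.0 ** (1.0 / 6.0)))
```

Every term is proportional to a coefficient (a force constant, a well depth, or a product of charges), which is what allows the energy to be rescaled through the parameters alone.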
As shown in FIG. 3, the parameters k are divided by type according to the determined hot region: parameters inside the hot region, parameters between the hot region and the environment region, and parameters inside the environment region. The force field parameters that need to be modified are determined at the same time, including bond length, bond angle, dihedral angle, van der Waals, and electrostatic parameters.
It follows from the potential function above that REST2 can be applied to the hot region by modifying the parameters of the potential function, without modifying the underlying code of the simulation software. Each energy term is proportional to its coefficients (a force constant, a well depth, or a product of charges), so scaling those coefficients scales the corresponding energy by the same factor.
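A short worked derivation shows why per-atom scaling of the nonbonded parameters is enough to reproduce the three REST2 weights. It is a sketch under assumptions not stated in the text: the pair well depth is taken as the geometric mean of the per-atom well depths, the electrostatic energy is proportional to the product of charges, and the symbol s is introduced here for the ratio β_m/β_0 = T_0/T_m.

```latex
% Assumed notation: s = \beta_m / \beta_0 = T_0 / T_m; primes mark scaled parameters.
% Hot-region atoms: \epsilon_i' = s\,\epsilon_i and q_i' = \sqrt{s}\,q_i.
% Environment atoms keep their original parameters.
\begin{align*}
\text{hot/hot:} \quad
  & \sqrt{\epsilon_i'\,\epsilon_j'} = s\,\sqrt{\epsilon_i\epsilon_j},
  \qquad q_i'\,q_j' = s\,q_i q_j \\
\text{hot/environment:} \quad
  & \sqrt{\epsilon_i'\,\epsilon_j} = \sqrt{s}\,\sqrt{\epsilon_i\epsilon_j},
  \qquad q_i'\,q_j = \sqrt{s}\,q_i q_j \\
\text{environment/environment:} \quad
  & \sqrt{\epsilon_i\epsilon_j},
  \qquad q_i q_j \quad \text{(unchanged)}
\end{align*}
```

Under these assumptions the nonbonded interactions between the hot region and the environment pick up the square-root factor automatically, so only bonded terms that straddle the boundary need an explicit factor of their own. This is consistent with the second class of coefficients in the text containing only a bonded coefficient.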
Specifically, step S22 above, at least partially modifying the parameters inside the hot region, the parameters between the hot region and the environment region, and the parameters inside the environment region, includes the following steps:
multiplying the parameters inside the hot region by first-type parameters;
multiplying the parameters between the hot region and the environment region by second-type parameters; and
leaving the parameters inside the environment region unmodified.
Optionally, the first-type parameters include a bonded coefficient, a van der Waals coefficient, and an electrostatic coefficient, and the second-type parameters include a bonded coefficient.
In this embodiment, the parameters inside the hot region (bonded, van der Waals, and electrostatic parameters) are each multiplied by the corresponding first-type parameter (the bonded, van der Waals, or electrostatic coefficient).
Specifically, as shown in FIG. 3:
Figure PCTCN2021143802-appb-000009
The parameters between the hot region and the environment region (bonded parameters) are modified by multiplying them by the corresponding second-type parameter, which is the bonded coefficient between the hot region and the environment region.
Note that the bonded parameters in this embodiment are parameters associated with the chemical bonds connecting atoms, such as the bond length, bond angle, and dihedral coefficients.
Specifically, as shown in FIG. 3:
Figure PCTCN2021143802-appb-000010
The parameters inside the environment region are not modified.
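The parameter modification can be sketched as follows. The exact factors used in the application are given in equation images that are not reproduced in this text, so the factors below are assumptions derived from the standard REST2 weighting: with s = T_0/T_m, hot-region well depths and fully hot bonded coefficients are multiplied by s, hot-region charges and boundary-straddling bonded coefficients by the square root of s, and environment parameters are left alone. The `Atom` and `BondedTerm` classes and the function `scale_for_rest2` are hypothetical.

```python
import math
from dataclasses import dataclass, replace
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Atom:
    charge: float
    epsilon: float  # van der Waals well depth


@dataclass(frozen=True)
class BondedTerm:
    atoms: Tuple[int, ...]
    k: float  # bond, angle, or dihedral coefficient


def scale_for_rest2(atoms: Dict[int, Atom], terms: List[BondedTerm],
                    hot: Set[int], t0: float, tm: float):
    """Return scaled copies of the parameters (assumed REST2 factors)."""
    s = t0 / tm  # equals beta_m / beta_0
    root_s = math.sqrt(s)
    new_atoms = {
        idx: replace(atom, charge=atom.charge * root_s, epsilon=atom.epsilon * s)
        if idx in hot else atom
        for idx, atom in atoms.items()
    }
    new_terms = []
    for term in terms:
        n_hot = sum(1 for a in term.atoms if a in hot)
        if n_hot == len(term.atoms):
            factor = s        # term entirely inside the hot region
        elif n_hot > 0:
            factor = root_s   # term between hot region and environment
        else:
            factor = 1.0      # environment term, not modified
        new_terms.append(replace(term, k=term.k * factor))
    return new_atoms, new_terms


atoms = {0: Atom(0.4, 0.1), 1: Atom(-0.4, 0.2), 2: Atom(0.0, 0.3)}
terms = [BondedTerm((0, 1), 300.0), BondedTerm((1, 2), 300.0)]
scaled_atoms, scaled_terms = scale_for_rest2(atoms, terms, hot={0, 1},
                                             t0=300.0, tm=1200.0)
print(scaled_atoms[0], scaled_terms[0].k, scaled_terms[1].k)
```

The scaled values would then be written back into the parameter file read by the simulation engine, so the engine itself runs unchanged.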
S30. Perform FEP/REST2 simulation with the parameter file as the input of the potential function to generate a trajectory file, the trajectory file being used to calculate the binding free energy of the target molecule-protein complex.
Finally, the modified parameter file is used as the input of the potential function for FEP/REST2 simulation. Once the FEP/REST2 simulation has produced a trajectory file, that file can be used to calculate the binding free energy of the target molecule-protein complex.
In summary, the enhanced sampling method of the present application has the following beneficial effects:
In molecular dynamics simulation, the energy of the target molecule-protein complex system is calculated from a potential function. The present application therefore determines the hot region of the target molecule-protein complex and then modifies, according to that region, the force field parameters of the system in the initial input file (that is, the parameters of the potential function) to generate the parameter file required for FEP/REST2 simulation. The REST2 workflow is thus implemented quickly and conveniently with as little code modification as possible and without substantial changes to the underlying code of the molecular dynamics simulation software. The operation is simple, and the computational efficiency and prediction accuracy of the subsequent FEP calculation are improved.
Referring to FIG. 4, an embodiment of the present invention further provides a complete method for calculating binding free energy. Specifically, a method of this application for calculating the binding free energy of a target molecule-protein complex includes the following steps:
S101. Construct an initial input file of the target molecule-protein complex system.
The initial input file includes a file containing the three-dimensional conformation of the target molecule-protein complex, which may specifically be a file in pdb format.
S102. Input the initial input file into the enhanced sampling method described in any one of the above embodiments to obtain a trajectory file.
S103. Obtain the binding free energy of the target molecule-protein complex based on the trajectory file.
In this embodiment, after the trajectory file is obtained, it is parsed to obtain the molecular dynamics trajectories of the target molecule-protein complex system in its different states.
In one embodiment, before step S101 of constructing the initial input file of the target molecule-protein complex system, the method further includes the following steps:
selecting, as a reference compound, a small molecule whose small molecule-protein complex conformation is known;
determining the heated region of the reference compound-protein complex;
Correspondingly, step S20, in which the force field parameters of the target molecule-protein complex system in the initial input file are modified for the heated region to obtain modified force field parameters, from which the parameter file required for the FEP/REST2 simulation is then generated, includes the following steps:
modifying, for the heated region, the force field parameters of the target molecule-protein complex system in the initial input file to obtain modified force field parameters of the target molecule-protein complex system;
modifying, for the heated region, the force field parameters of the reference compound-protein complex system in the initial input file to obtain modified force field parameters of the reference compound-protein complex system;
generating the parameter file required for the FEP/REST2 simulation based on the modified force field parameters of the reference compound-protein complex system and the modified force field parameters of the target molecule-protein complex system.
Further, step S103 of obtaining the binding free energy of the target molecule-protein complex based on the trajectory file is: calculating, based on the trajectory file, the binding free energy of the target molecule-protein complex using the Bennett acceptance ratio method.
Because the parameters of each alchemical state in the FEP calculation are modified when the FEP/REST2 calculation is performed, the usual integration methods cannot be used for the analysis. This embodiment therefore uses the Bennett acceptance ratio (BAR) method to calculate the binding free energy of the target molecule-protein complex, with the specific calculation given by the equations below. The energy difference ΔU of a computed trajectory file under the parameter file of another state is calculated by reweighting, and the free energy difference ΔG between the two states is finally obtained from the equations.
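The reweighting step described above can be sketched as follows. This is a minimal illustration, not the application's implementation: it assumes the per-frame potential energies have already been re-evaluated and stored in a dictionary keyed by `(trajectory_state, parameter_state)`, and it assumes the sign convention that ΔU is the energy under the neighbouring state's parameters minus the energy under the state's own parameters. The function name and data layout are hypothetical.

```python
def adjacent_energy_differences(energies):
    """Return {(i, j): (du_ij, du_ji)} for each pair of adjacent states.

    energies[(k, l)] is a list of per-frame potential energies of the
    trajectory sampled in state k, re-evaluated with the parameters of
    state l (hypothetical layout, assumed for this sketch).
    """
    n_states = 1 + max(k for k, _ in energies)
    result = {}
    for i in range(n_states - 1):
        j = i + 1
        # Frames from state i: energy under state j minus energy under state i.
        du_ij = [u_j - u_i for u_i, u_j in zip(energies[(i, i)], energies[(i, j)])]
        # Frames from state j: energy under state i minus energy under state j.
        du_ji = [u_i - u_j for u_j, u_i in zip(energies[(j, j)], energies[(j, i)])]
        result[(i, j)] = (du_ij, du_ji)
    return result
```

Only the energies of each trajectory under its own parameters and under its two neighbours' parameters are needed, which is why the layout is keyed by pairs rather than stored as a full matrix.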
Specifically, step S103 further includes the following steps:
for two adjacent intermediate states i and j in the trajectory file, calculating the first energy ΔU_ij of the trajectory of intermediate state i under the parameters corresponding to intermediate state j, and the second energy ΔU_ji of the trajectory of intermediate state j under the parameters corresponding to intermediate state i;
substituting the first energy ΔU_ij and the second energy ΔU_ji into Equation 1 and Equation 2 to calculate, respectively, the solvent free energy difference ΔG_A of transforming the reference compound into the target molecule and the binding free energy difference ΔG_B of transforming the reference compound into the target molecule;
Figure PCTCN2021143802-appb-000011
Figure PCTCN2021143802-appb-000012
based on the solvent free energy difference ΔG_A and the binding free energy difference ΔG_B of transforming the reference compound into the target molecule, calculating the relative binding free energy ΔΔG_binding of transforming the reference compound into the target molecule according to Equation 3;
ΔΔG_binding = ΔG_B - ΔG_A    (Equation 3);
based on the relative binding free energy ΔΔG_binding of transforming the reference compound into the target molecule and the known binding free energy ΔG_1 of the reference compound-protein complex, calculating the binding free energy ΔG_2 of the target molecule-protein complex according to Equation 4;
ΔG_2 = ΔΔG_binding + ΔG_1    (Equation 4);
where ΔG is the free energy difference, ΔU_ij is the first energy, ΔU_ji is the second energy, <*>_i is the ensemble average in intermediate state i, <*>_j is the ensemble average in intermediate state j, N_i and N_j are the numbers of frames of the simulation trajectories in intermediate states i and j, k_B is the Boltzmann constant, T is the simulation temperature (generally 298 K), ΔΔG_binding is the relative binding free energy, ΔG_A is the solvent free energy difference of transforming the reference compound into the target molecule, ΔG_B is the binding free energy difference of transforming the reference compound into the target molecule, ΔG_1 is the known binding free energy of the reference compound-protein complex, and ΔG_2 is the binding free energy of the target molecule-protein complex.
In this embodiment, relative binding free energy (RBFE) is used to calculate the free energy difference ΔΔG_binding between different molecules, from which the binding free energy of the target molecule-protein complex is calculated.
FIG. 5 is a schematic diagram of the basic principle of using relative binding free energy (RBFE) to calculate the free energy difference ΔΔG between different molecules. The upper left panel shows reference compound A separated from the protein receptor, and the lower left panel shows reference compound A and the protein receptor forming complex structure A; ΔG_1 is the binding free energy of reference compound A and the protein receptor forming complex structure A. The upper right panel shows target molecule B separated from the protein receptor, and the lower right panel shows target molecule B and the protein receptor forming complex structure B; ΔG_2 is the binding free energy of target molecule B and the protein receptor forming complex structure B. ΔG_A is the solvent free energy difference of transforming reference compound A into target molecule B while separated from the protein receptor, and ΔG_B is the binding free energy difference of transforming reference compound A into target molecule B within the complex.
Here, ΔΔG_binding = ΔG_2 - ΔG_1 = ΔG_B - ΔG_A.
Computing ΔΔG_binding directly as ΔG_2 - ΔG_1 is relatively difficult. The two horizontally adjacent systems, by contrast, differ only slightly, so equilibrium is reached relatively easily and the calculation is practical; the simpler ΔG_B - ΔG_A is therefore computed instead. Specifically, the above method for calculating the binding free energy of the target molecule-protein complex is used to calculate the binding free energy difference ΔG_B and the solvent free energy difference ΔG_A of transforming the reference compound into the target molecule to be predicted. Because a small molecule whose small molecule-protein complex conformation is known was selected as the reference compound, the binding free energy ΔG_1 of the reference compound-protein complex is known. Thus, once ΔΔG_binding has been calculated as ΔG_B - ΔG_A, the binding free energy ΔG_2 of the target molecule-protein complex is calculated from ΔG_2 = ΔΔG_binding + ΔG_1 (Equation 4).
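The thermodynamic cycle above reduces to two lines of arithmetic, sketched here for concreteness. The function names are hypothetical and the numbers in the usage note are made-up placeholders, not results from this application.

```python
def relative_binding_free_energy(dg_b, dg_a):
    """Equation 3: ddG_binding = dG_B - dG_A."""
    return dg_b - dg_a


def target_binding_free_energy(ddg_binding, dg_1):
    """Equation 4: dG_2 = ddG_binding + dG_1."""
    return ddg_binding + dg_1
```

For example, with placeholder values ΔG_B = -3.0, ΔG_A = -1.5 and ΔG_1 = -9.0 (all in the same energy unit), `relative_binding_free_energy(-3.0, -1.5)` gives -1.5 and `target_binding_free_energy(-1.5, -9.0)` gives -10.5.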
As an example, taking the Amber program as the commonly used molecular simulation software, a protein-small molecule complex is selected as the test system, the dihedral, van der Waals and electrostatic parameters are selected for modification, and the binding free energy of the protein-small molecule complex is calculated.
1. Select a small molecule whose small molecule-protein complex conformation is known as the reference compound, and construct the molecule pairs for the relative free energy calculation.
2. Based on the small molecule pairs, construct the initial input files of the complex systems suitable for RBFE calculation, specifically:
a. Prepare the structure of the complex between the reference compound and the protein, and use molecular docking to obtain the above protein-small molecule complex.
b. Use the Antechamber tool in Amber18 to obtain the gaff2 force field parameters for each small molecule, and use the AM1-BCC method to calculate the charges carried by each small molecule.
c. On this basis, use the tleap tool in Amber18 to construct the initial input files of the protein and small molecule system.
The initial input file of the target molecule-protein complex is constructed in the same way.
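Steps 2b and 2c can be sketched as the helper below, which only assembles the command lines and the tleap input text and does not run anything. Several details are assumptions not stated in this application: the file names, the use of `parmchk2` to produce an frcmod file, the `leaprc.protein.ff14SB` protein force field, and the omission of solvation and ion placement. The function names are hypothetical.

```python
def ligand_parameter_commands(ligand_mol2, net_charge=0, stem="ligand"):
    """Build antechamber and parmchk2 argument lists (not executed here)."""
    gaff_mol2 = f"{stem}_gaff2.mol2"
    frcmod = f"{stem}.frcmod"
    antechamber = [
        "antechamber", "-i", ligand_mol2, "-fi", "mol2",
        "-o", gaff_mol2, "-fo", "mol2",
        "-at", "gaff2",           # gaff2 atom types, as named in step 2b
        "-c", "bcc",              # AM1-BCC charges, as named in step 2b
        "-nc", str(net_charge),
    ]
    # parmchk2 is an assumed extra step that supplies any missing parameters.
    parmchk2 = ["parmchk2", "-i", gaff_mol2, "-f", "mol2", "-o", frcmod, "-s", "gaff2"]
    return antechamber, parmchk2


def tleap_input(protein_pdb, ligand_mol2, ligand_frcmod, prefix="complex"):
    """Build a minimal tleap script that combines protein and ligand."""
    lines = [
        "source leaprc.protein.ff14SB",  # assumed protein force field
        "source leaprc.gaff2",
        f"loadamberparams {ligand_frcmod}",
        f"lig = loadmol2 {ligand_mol2}",
        f"rec = loadpdb {protein_pdb}",
        "complex = combine {rec lig}",
        f"saveamberparm complex {prefix}.prmtop {prefix}.inpcrd",
        "quit",
    ]
    return "\n".join(lines) + "\n"
```

Returning argument lists rather than shell strings keeps the sketch free of quoting issues if the lists are later passed to `subprocess.run`.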
3. For the initial input files of the target molecule-protein complex and the reference compound-protein complex, modify the dihedral, van der Waals and electrostatic parameters according to the method of FIGS. 1 to 3 above to obtain the force field input files required for calculating each intermediate state, specifically:
a. Determine the heated regions of the target molecule-protein complex and of the reference compound-protein complex.
b. Modify, for the heated region, the force field parameters of the target molecule-protein complex system in the initial input file to obtain modified force field parameters of the target molecule-protein complex system.
In the same way, modify, for the heated region, the force field parameters of the reference compound-protein complex system in the initial input file to obtain modified force field parameters of the reference compound-protein complex system.
c. Generate the parameter files required for the FEP/REST2 simulation based on the modified force field parameters of the reference compound-protein complex system and the modified force field parameters of the target molecule-protein complex system.
4. Run molecular dynamics simulations for the input of each intermediate state, and save the trajectory files produced during the simulations.
5. Calculate the relative free energy from the obtained trajectory files using the Bennett acceptance ratio (BAR) method described above. The specific process is as follows:
a. First, for two adjacent intermediate states i and j in the trajectory file, calculate the first energy ΔU_ij of the trajectory of intermediate state i under the parameters corresponding to intermediate state j, and calculate the second energy ΔU_ji in the same way;
b. Substitute the first energy ΔU_ij and the second energy ΔU_ji into the BAR method (Equation 1 and Equation 2 below) to calculate, respectively, the solvent free energy difference ΔG_A of transforming the reference compound into the target molecule and the binding free energy difference ΔG_B of transforming the reference compound into the target molecule;
Figure PCTCN2021143802-appb-000013
Figure PCTCN2021143802-appb-000014
c. Based on the solvent free energy difference ΔG_A and the binding free energy difference ΔG_B of transforming the reference compound into the target molecule, calculate the relative binding free energy ΔΔG_binding of transforming the reference compound into the target molecule according to Equation 3 below;
ΔΔG_binding = ΔG_B - ΔG_A    (Equation 3);
d. Based on the relative binding free energy ΔΔG_binding of transforming the reference compound into the target molecule and the known binding free energy ΔG_1 of the reference compound-protein complex, calculate the binding free energy ΔG_2 of the target molecule-protein complex according to Equation 4 below;
ΔG_2 = ΔΔG_binding + ΔG_1    (Equation 4);
where ΔG is the free energy difference, ΔU_ij is the first energy, ΔU_ji is the second energy, <*>_i is the ensemble average in intermediate state i, <*>_j is the ensemble average in intermediate state j, N_i and N_j are the numbers of frames of the simulation trajectories in intermediate states i and j, k_B is the Boltzmann constant, T is the simulation temperature, ΔΔG_binding is the relative binding free energy, ΔG_A is the solvent free energy difference of transforming the reference compound into the target molecule, ΔG_B is the binding free energy difference of transforming the reference compound into the target molecule, ΔG_1 is the known binding free energy of the reference compound-protein complex, and ΔG_2 is the binding free energy of the target molecule-protein complex.
Thus, after ΔΔG_binding has been calculated as ΔG_B - ΔG_A, the binding free energy ΔG_2 of the target molecule-protein complex is finally calculated from ΔG_2 = ΔΔG_binding + ΔG_1 (Equation 4).
The results calculated by the above method are compared with experimental values in FIG. 6(b). FIG. 6(a) shows the FEP results obtained without REST2 enhanced sampling, and FIG. 6(b) shows the results of this application obtained with REST2 enhanced sampling.
Comparing the FEP/REST2 results in FIG. 6(b) with the results without REST2 in FIG. 6(a), the mean unsigned error (MUE) is lower with FEP/REST2 and the correlation coefficient R² is higher. The free energies calculated by this application are therefore closer to the experimental results, and the enhanced sampling method of this application improves the accuracy of FEP prediction.
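The two metrics used in this comparison can be computed as in the sketch below. It assumes that R² denotes the squared Pearson correlation coefficient between predicted and experimental values, which the text does not state explicitly; the function names are hypothetical.

```python
def mean_unsigned_error(predicted, experimental):
    """Mean of |predicted - experimental| over all compounds."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(predicted)


def r_squared(predicted, experimental):
    """Squared Pearson correlation between predictions and experiment."""
    if len(predicted) != len(experimental) or len(predicted) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    return cov * cov / (var_p * var_e)
```

A lower MUE indicates predictions that are closer to experiment on average, while a higher R² indicates that the predictions rank the compounds more consistently with experiment; the two can move independently, which is why both are reported.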
It should be noted that, in addition to the Bennett acceptance ratio (BAR) method, this application may use other methods to calculate the binding free energy of the target molecule-protein complex; no specific limitation is imposed here.
In summary, the method of this application for calculating the binding free energy of a target molecule-protein complex has the following beneficial effects:
By executing the enhanced sampling method of the embodiments of this application, the REST2 workflow is implemented quickly and conveniently with as little code modification as possible, without substantially modifying the underlying code of the molecular dynamics simulation software. The overall FEP/REST2 workflow is automated, and the calculation efficiency and prediction accuracy of FEP are improved, so that the binding free energy between the target molecule and the protein can be calculated more quickly.
Referring to FIG. 7, an embodiment of this application further provides an enhanced sampling device 100. The enhanced sampling device 100 includes:
a heated region determination module 110, configured to determine the heated region of the target molecule-protein complex;
a force field parameter modification module 120, configured to modify, for the heated region, the force field parameters of the target molecule-protein complex system in the initial input file to obtain modified force field parameters of the target molecule-protein complex system, and then to generate the parameter file required for the FEP/REST2 simulation; and
a trajectory file generation module 130, configured to perform an FEP/REST2 simulation using the parameter file as the input of the potential function to generate a trajectory file, where the trajectory file is used to calculate the binding free energy of the target molecule-protein complex.
In one embodiment, the force field parameter modification module 120 includes:
a parameter division module, configured to divide the force field parameter terms of the target molecule-protein complex system into parameters inside the heated region, parameters between the heated region and the environment region, and parameters inside the environment region;
a first modification module, configured to modify, at least in part, the parameters inside the heated region, the parameters between the heated region and the environment region, and the parameters inside the environment region to obtain modified force field parameters of the target molecule-protein complex system, and then to generate the parameter file required for the FEP/REST2 simulation.
In one embodiment, the first modification module is specifically configured to:
multiply the parameters inside the heated region by the first-type parameters;
multiply the parameters between the heated region and the environment region by the second-type parameters;
leave the parameters inside the environment region unmodified.
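The division and scaling performed by these modules can be sketched as follows. The representation of a force field term as an `(atom_indices, coefficient)` pair, the rule that a term touching both regions counts as a cross term, and the names `first_factor` and `second_factor` are assumptions made for illustration; the actual scaling factors are defined elsewhere in this application and are passed in here as plain numbers.

```python
def classify_term(atoms, heated):
    """Classify a term by where its atoms lie: 'heated', 'cross' or 'environment'."""
    inside = sum(1 for atom in atoms if atom in heated)
    if inside == len(atoms):
        return "heated"
    if inside == 0:
        return "environment"
    return "cross"


def scale_terms(terms, heated, first_factor, second_factor):
    """Scale each (atom_indices, coefficient) term according to its region.

    Terms inside the heated region are multiplied by first_factor, terms
    spanning the heated and environment regions by second_factor, and
    terms inside the environment region are returned unchanged.
    """
    factors = {"heated": first_factor, "cross": second_factor, "environment": 1.0}
    return [
        (atoms, coefficient * factors[classify_term(atoms, heated)])
        for atoms, coefficient in terms
    ]
```

Because the scaling is applied to the parameters themselves, the simulation engine sees an ordinary parameter file and needs no REST2-specific code path.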
In one embodiment, the first-type parameters include a bonded coefficient, a van der Waals coefficient and an electrostatic coefficient, and the second-type parameters include a bonded coefficient.
In one embodiment, the heated region determination module 110 is specifically configured to:
determine the perturbed region of the FEP calculation as the initial heated region of the target molecule-protein complex;
determine whether the target molecule contains a ring;
if the target molecule contains a ring, further determine whether at least some of the atoms in the initial heated region are located on the ring, or whether at least some of the atoms in the initial heated region are directly or indirectly connected to atoms on the ring;
if at least some of the atoms in the initial heated region are located on the ring or are directly connected to atoms on the ring, add the region occupied by the ring to the initial heated region to obtain a first updated heated region;
if at least some of the atoms in the initial heated region are indirectly connected to atoms on the ring, add the ring and the region occupied by the part indirectly connecting to the ring to the initial heated region to obtain a first updated heated region;
if none of the atoms in the initial heated region is located on the ring or is directly or indirectly connected to atoms on the ring, take the initial heated region as the first updated heated region;
if the target molecule contains no ring, take the initial heated region as the first updated heated region;
determine whether there is an additionally specified region that needs to be heated;
if there is, add the additionally specified region to the first updated heated region to obtain the heated region;
if there is not, take the first updated heated region as the heated region.
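The decision procedure above can be sketched on a simple bond graph. The precise meaning of "indirectly connected" is defined elsewhere in this application; the sketch assumes it means reachable from the initial heated region through a chain of at most `max_linker` intervening atoms, and that parameter, the adjacency-dictionary representation and the function names are all hypothetical.

```python
from collections import deque


def _linker_to_ring(bonds, region, ring, max_linker):
    """Return the shortest chain of atoms between region and ring.

    Returns [] when a region atom is bonded directly to a ring atom and
    None when the ring is not reachable within max_linker linker atoms.
    """
    queue = deque((atom, []) for atom in region)
    seen = set(region)
    while queue:
        atom, path = queue.popleft()
        for neighbour in bonds.get(atom, ()):
            if neighbour in ring:
                return path
            if neighbour not in seen and len(path) < max_linker:
                seen.add(neighbour)
                queue.append((neighbour, path + [neighbour]))
    return None


def determine_heated_region(bonds, perturbed, rings, extra=(), max_linker=2):
    """Expand the FEP perturbed atoms into the final heated region."""
    initial = set(perturbed)          # initial heated region
    updated = set(initial)            # first updated heated region
    for ring in rings:
        ring = set(ring)
        if initial & ring:
            updated |= ring           # region atoms lie on the ring
            continue
        linker = _linker_to_ring(bonds, initial, ring, max_linker)
        if linker is not None:
            updated |= ring           # directly or indirectly connected
            updated.update(linker)    # empty for a direct connection
    return updated | set(extra)       # add any user-specified atoms
```

A molecule without rings is handled by passing an empty `rings` list, in which case the initial region is returned unchanged apart from any additionally specified atoms.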
For the specific limitations of the enhanced sampling device 100, refer to the limitations of the enhanced sampling method above, which are not repeated here. Each module in the enhanced sampling device 100 may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of a computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can invoke them and perform the operations corresponding to each module.
Referring to FIG. 8, an embodiment of this application further provides a device 200 for calculating the binding free energy of a target molecule-protein complex. The device 200 includes:
a construction module 210, configured to construct an initial input file of the target molecule-protein complex system;
the enhanced sampling device 100 described in any one of the above embodiments, configured to obtain a trajectory file based on the input initial input file; and
a calculation module 220, configured to obtain the binding free energy of the target molecule-protein complex based on the trajectory file.
In one embodiment, the device 200 for calculating the binding free energy of a target molecule-protein complex further includes:
a reference compound selection module, configured to select, as a reference compound, a small molecule whose small molecule-protein complex conformation is known;
the heated region determination module 110 is further configured to determine the heated region of the reference compound-protein complex;
the force field parameter modification module 120 is specifically configured to:
modify, for the heated region, the force field parameters of the target molecule-protein complex system in the initial input file to obtain modified force field parameters of the target molecule-protein complex system;
modify, for the heated region, the force field parameters of the reference compound-protein complex system in the initial input file to obtain modified force field parameters of the reference compound-protein complex system;
generate the parameter file required for the FEP/REST2 simulation based on the modified force field parameters of the reference compound-protein complex system and the modified force field parameters of the target molecule-protein complex system.
In one embodiment, the calculation module 220 includes a first sub-calculation module, and the first sub-calculation module is configured to:
calculate, based on the trajectory file, the binding free energy of the target molecule-protein complex using the Bennett acceptance ratio method;
the first sub-calculation module is specifically configured to:
for two adjacent intermediate states i and j in the trajectory file, calculate the first energy ΔU_ij of the trajectory of intermediate state i under the parameters corresponding to intermediate state j, and the second energy ΔU_ji of the trajectory of intermediate state j under the parameters corresponding to intermediate state i;
substitute the first energy ΔU_ij and the second energy ΔU_ji into Equation 1 and Equation 2 to calculate, respectively, the solvent free energy difference ΔG_A of transforming the reference compound into the target molecule and the binding free energy difference ΔG_B of transforming the reference compound into the target molecule;
Figure PCTCN2021143802-appb-000015
Figure PCTCN2021143802-appb-000016
based on the solvent free energy difference ΔG_A and the binding free energy difference ΔG_B of transforming the reference compound into the target molecule, calculate the relative binding free energy ΔΔG_binding of transforming the reference compound into the target molecule according to Equation 3;
ΔΔG_binding = ΔG_B - ΔG_A    (Equation 3);
based on the relative binding free energy ΔΔG_binding of transforming the reference compound into the target molecule and the known binding free energy ΔG_1 of the reference compound-protein complex, calculate the binding free energy ΔG_2 of the target molecule-protein complex according to Equation 4;
ΔG_2 = ΔΔG_binding + ΔG_1    (Equation 4);
where ΔG is the free energy difference, ΔU_ij is the first energy, ΔU_ji is the second energy, <*>_i is the ensemble average in intermediate state i, <*>_j is the ensemble average in intermediate state j, N_i and N_j are the numbers of frames of the simulation trajectories in intermediate states i and j, k_B is the Boltzmann constant, T is the simulation temperature, ΔΔG_binding is the relative binding free energy, ΔG_A is the solvent free energy difference of transforming the reference compound into the target molecule, ΔG_B is the binding free energy difference of transforming the reference compound into the target molecule, ΔG_1 is the known binding free energy of the reference compound-protein complex, and ΔG_2 is the binding free energy of the target molecule-protein complex.
For the specific limitations of the device 200 for calculating the binding free energy of a target molecule-protein complex, refer to the limitations of the method for calculating the binding free energy of a target molecule-protein complex above, which are not repeated here. Each module in the device 200 may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of a computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can invoke them and perform the operations corresponding to each module.
Referring to FIG. 9, an embodiment of the present application further provides an electronic device, including:
one or more processors; and
a memory, coupled to the processors, for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the enhanced sampling method of any of the above embodiments or the method for calculating the binding free energy of a target molecule-protein complex of any of the above embodiments.
The processor controls the overall operation of the terminal device so as to complete all or part of the steps of the above enhanced sampling method or of the method for calculating the binding free energy of a target molecule-protein complex of any of the above embodiments. The memory stores various types of data to support operation of the terminal device; such data may include, for example, instructions for any application or method operated on the terminal device, as well as application-related data. The memory may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
In an exemplary embodiment, the terminal device may be implemented by one or more application-specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field-programmable gate arrays (FPGA), controllers, microcontrollers, microprocessors, or other electronic components, for performing the enhanced sampling method of any of the above embodiments or the method for calculating the binding free energy of a target molecule-protein complex of any of the above embodiments, and achieves the same technical effects as those methods.
In another exemplary embodiment, a computer-readable storage medium including a computer program is further provided. When executed by a processor, the computer program implements the steps of the enhanced sampling method of any of the above embodiments or of the method for calculating the binding free energy of a target molecule-protein complex of any of the above embodiments. For example, the computer-readable storage medium may be the above memory including the computer program, and the computer program may be executed by the processor of the terminal device to complete either method and achieve the same technical effects as those methods.
The above are only preferred embodiments of the present application and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of the present application.

Claims (18)

  1. An enhanced sampling method, characterized by comprising:
    determining a heating region of a target molecule-protein complex;
    modifying, according to the heating region, force field parameters of the target molecule-protein complex system in an initial input file to obtain modified force field parameters of the target molecule-protein complex system, and then generating a parameter file required for FEP/REST2 simulation; and
    performing the FEP/REST2 simulation with the parameter file as input to the potential function to generate a trajectory file, the trajectory file being used to calculate the binding free energy of the target molecule-protein complex.
  2. The method according to claim 1, characterized in that modifying, according to the heating region, the force field parameters of the target molecule-protein complex system in the initial input file to obtain the modified force field parameters of the target molecule-protein complex system, and then generating the parameter file required for FEP/REST2 simulation, comprises:
    dividing the force field parameter terms of the target molecule-protein complex system into parameters internal to the heating region, parameters between the heating region and an environment region, and parameters internal to the environment region; and
    at least partially modifying the parameters internal to the heating region, the parameters between the heating region and the environment region, and the parameters internal to the environment region to obtain the modified force field parameters of the target molecule-protein complex system, and then generating the parameter file required for the FEP/REST2 simulation.
  3. The method according to claim 2, characterized in that at least partially modifying the parameters internal to the heating region, the parameters between the heating region and the environment region, and the parameters internal to the environment region comprises:
    multiplying the parameters internal to the heating region by first-type parameters;
    multiplying the parameters between the heating region and the environment region by second-type parameters; and
    leaving the parameters internal to the environment region unmodified.
  4. The method according to claim 3, characterized in that the first-type parameters include a bonded coefficient, a van der Waals coefficient, and an electrostatic coefficient, and the second-type parameters include a bonded coefficient.
  5. The method according to claim 1, characterized in that determining the heating region of the target molecule-protein complex comprises:
    determining the perturbation region of the FEP calculation as an initial heating region of the target molecule-protein complex;
    determining whether the target molecule contains a ring;
    if the target molecule contains a ring, further determining whether at least some atoms of the initial heating region are located on the ring, or whether at least some atoms of the initial heating region are directly or indirectly connected to atoms on the ring;
    if at least some atoms of the initial heating region are located on the ring or are directly connected to atoms on the ring, adding the region where the ring is located to the initial heating region to obtain a first updated heating region;
    if at least some atoms of the initial heating region are indirectly connected to atoms on the ring, adding the ring and the region of the part indirectly connected to the ring to the initial heating region to obtain the first updated heating region;
    if none of the atoms of the initial heating region is located on the ring or is directly or indirectly connected to atoms on the ring, taking the initial heating region as the first updated heating region;
    if the target molecule contains no ring, taking the initial heating region as the first updated heating region;
    determining whether there is an additionally specified region that needs to be heated;
    if so, adding the additionally specified region to the first updated heating region to obtain the heating region; and
    if not, taking the first updated heating region as the heating region.
  6. A method for calculating the binding free energy of a target molecule-protein complex, characterized by comprising:
    constructing an initial input file of a target molecule-protein complex system;
    inputting the initial input file into the enhanced sampling method according to any one of claims 1 to 5 to obtain a trajectory file; and
    obtaining the binding free energy of the target molecule-protein complex based on the trajectory file.
  7. The method according to claim 6, characterized in that, before constructing the initial input file of the target molecule-protein complex system, the method further comprises:
    selecting, as a reference compound, a small molecule for which the conformation of the small molecule-protein complex is known; and
    determining a heating region of the reference compound-protein complex;
    wherein modifying, according to the heating region, the force field parameters of the target molecule-protein complex system in the initial input file to obtain the modified force field parameters of the target molecule-protein complex system, and then generating the parameter file required for FEP/REST2 simulation, comprises:
    modifying, according to the heating region, the force field parameters of the target molecule-protein complex system in the initial input file to obtain modified force field parameters of the target molecule-protein complex system;
    modifying, according to the heating region, the force field parameters of the reference compound-protein complex system in the initial input file to obtain modified force field parameters of the reference compound-protein complex system; and
    generating the parameter file required for FEP/REST2 simulation based on the modified force field parameters of the reference compound-protein complex system and the modified force field parameters of the target molecule-protein complex system.
  8. The method according to claim 7, characterized in that obtaining the binding free energy of the target molecule-protein complex based on the trajectory file is: calculating, based on the trajectory file and using the Bennett acceptance ratio (BAR) method, the binding free energy of the target molecule-protein complex; and that this calculation comprises:
    calculating, for two adjacent intermediate states i and j in the trajectory file, a first energy ΔU_ij of the trajectory of intermediate state i under the parameters corresponding to intermediate state j, and a second energy ΔU_ji of the trajectory of intermediate state j under the parameters corresponding to intermediate state i;
    substituting the first energy ΔU_ij and the second energy ΔU_ji into Formula 1 and Formula 2 to calculate, respectively, the solvent free energy difference ΔG_A for transforming the reference compound into the target molecule and the binding free energy difference ΔG_B for transforming the reference compound into the target molecule;
    [Formula 1: image PCTCN2021143802-appb-100001]
    [Formula 2: image PCTCN2021143802-appb-100002]
    calculating, according to Formula 3, the relative binding free energy ΔΔG_binding for transforming the reference compound into the target molecule, based on the solvent free energy difference ΔG_A and the binding free energy difference ΔG_B;
    ΔΔG_binding = ΔG_B - ΔG_A   (Formula 3);
    calculating, according to Formula 4, the binding free energy ΔG_2 of the target molecule-protein complex, based on the relative binding free energy ΔΔG_binding and the known binding free energy ΔG_1 of the reference compound-protein complex;
    ΔG_2 = ΔΔG_binding + ΔG_1   (Formula 4);
    wherein ΔG is the free energy difference; ΔU_ij is the first energy; ΔU_ji is the second energy; ⟨·⟩_i is the ensemble average under intermediate state i; ⟨·⟩_j is the ensemble average under intermediate state j; N_i and N_j are the numbers of frames of the simulation trajectories under intermediate states i and j, respectively; k_B is the Boltzmann constant; T is the simulation temperature; ΔΔG_binding is the relative binding free energy; ΔG_A is the solvent free energy difference for transforming the reference compound into the target molecule; ΔG_B is the binding free energy difference for transforming the reference compound into the target molecule; ΔG_1 is the known binding free energy of the reference compound-protein complex; and ΔG_2 is the binding free energy of the target molecule-protein complex.
  9. An enhanced sampling device, characterized by comprising:
    a heating region determination module, configured to determine a heating region of a target molecule-protein complex;
    a force field parameter modification module, configured to modify, according to the heating region, force field parameters of the target molecule-protein complex system in an initial input file to obtain modified force field parameters of the target molecule-protein complex system, and then to generate a parameter file required for FEP/REST2 simulation; and
    a trajectory file generation module, configured to perform the FEP/REST2 simulation with the parameter file as input to the potential function to generate a trajectory file, the trajectory file being used to calculate the binding free energy of the target molecule-protein complex.
  10. The device according to claim 9, characterized in that the force field parameter modification module comprises:
    a parameter division module, configured to divide the force field parameter terms of the target molecule-protein complex system into parameters internal to the heating region, parameters between the heating region and an environment region, and parameters internal to the environment region; and
    a first modification module, configured to at least partially modify the parameters internal to the heating region, the parameters between the heating region and the environment region, and the parameters internal to the environment region to obtain the modified force field parameters of the target molecule-protein complex system, and then to generate the parameter file required for the FEP/REST2 simulation.
  11. The device according to claim 10, characterized in that the first modification module is specifically configured to:
    multiply the parameters internal to the heating region by first-type parameters;
    multiply the parameters between the heating region and the environment region by second-type parameters; and
    leave the parameters internal to the environment region unmodified.
  12. The device according to claim 11, characterized in that the first-type parameters include a bonded coefficient, a van der Waals coefficient, and an electrostatic coefficient, and the second-type parameters include a bonded coefficient.
  13. The device according to claim 9, characterized in that the heating region determination module is specifically configured to:
    determine the perturbation region of the FEP calculation as an initial heating region of the target molecule-protein complex;
    determine whether the target molecule contains a ring;
    if the target molecule contains a ring, further determine whether at least some atoms of the initial heating region are located on the ring, or whether at least some atoms of the initial heating region are directly or indirectly connected to atoms on the ring;
    if at least some atoms of the initial heating region are located on the ring or are directly connected to atoms on the ring, add the region where the ring is located to the initial heating region to obtain a first updated heating region;
    if at least some atoms of the initial heating region are indirectly connected to atoms on the ring, add the ring and the region of the part indirectly connected to the ring to the initial heating region to obtain the first updated heating region;
    if none of the atoms of the initial heating region is located on the ring or is directly or indirectly connected to atoms on the ring, take the initial heating region as the first updated heating region;
    if the target molecule contains no ring, take the initial heating region as the first updated heating region;
    determine whether there is an additionally specified region that needs to be heated;
    if so, add the additionally specified region to the first updated heating region to obtain the heating region; and
    if not, take the first updated heating region as the heating region.
  14. A device for calculating the binding free energy of a target molecule-protein complex, characterized by comprising:
    a construction module, configured to construct an initial input file of a target molecule-protein complex system;
    the enhanced sampling device according to any one of claims 9 to 13, configured to obtain a trajectory file based on the input initial input file; and
    a calculation module, configured to obtain the binding free energy of the target molecule-protein complex based on the trajectory file.
  15. The device according to claim 14, characterized by further comprising:
    a reference compound selection module, configured to select, as a reference compound, a small molecule for which the conformation of the small molecule-protein complex is known;
    wherein the heating region determination module is further configured to determine a heating region of the reference compound-protein complex; and
    the force field parameter modification module is specifically configured to:
    modify, according to the heating region, the force field parameters of the target molecule-protein complex system in the initial input file to obtain modified force field parameters of the target molecule-protein complex system;
    modify, according to the heating region, the force field parameters of the reference compound-protein complex system in the initial input file to obtain modified force field parameters of the reference compound-protein complex system; and
    generate the parameter file required for FEP/REST2 simulation based on the modified force field parameters of the reference compound-protein complex system and the modified force field parameters of the target molecule-protein complex system.
  16. The device according to claim 15, characterized in that the calculation module comprises a first sub-calculation module, the first sub-calculation module being configured to:
    calculate, based on the trajectory file and using the Bennett acceptance ratio (BAR) method, the binding free energy of the target molecule-protein complex;
    the first sub-calculation module being specifically configured to:
    calculate, for two adjacent intermediate states i and j in the trajectory file, a first energy ΔU_ij of the trajectory of intermediate state i under the parameters corresponding to intermediate state j, and a second energy ΔU_ji of the trajectory of intermediate state j under the parameters corresponding to intermediate state i;
    substitute the first energy ΔU_ij and the second energy ΔU_ji into Formula 1 and Formula 2 to calculate, respectively, the solvent free energy difference ΔG_A for transforming the reference compound into the target molecule and the binding free energy difference ΔG_B for transforming the reference compound into the target molecule;
    [Formula 1: image PCTCN2021143802-appb-100003]
    [Formula 2: image PCTCN2021143802-appb-100004]
    calculate, according to Formula 3, the relative binding free energy ΔΔG_binding for transforming the reference compound into the target molecule, based on the solvent free energy difference ΔG_A and the binding free energy difference ΔG_B;
    ΔΔG_binding = ΔG_B - ΔG_A   (Formula 3);
    calculate, according to Formula 4, the binding free energy ΔG_2 of the target molecule-protein complex, based on the relative binding free energy ΔΔG_binding and the known binding free energy ΔG_1 of the reference compound-protein complex;
    ΔG_2 = ΔΔG_binding + ΔG_1   (Formula 4);
    wherein ΔG is the free energy difference; ΔU_ij is the first energy; ΔU_ji is the second energy; ⟨·⟩_i is the ensemble average under intermediate state i; ⟨·⟩_j is the ensemble average under intermediate state j; N_i and N_j are the numbers of frames of the simulation trajectories under intermediate states i and j, respectively; k_B is the Boltzmann constant; T is the simulation temperature; ΔΔG_binding is the relative binding free energy; ΔG_A is the solvent free energy difference for transforming the reference compound into the target molecule; ΔG_B is the binding free energy difference for transforming the reference compound into the target molecule; ΔG_1 is the known binding free energy of the reference compound-protein complex; and ΔG_2 is the binding free energy of the target molecule-protein complex.
  17. An electronic device, characterized by comprising:
    one or more processors; and
    a memory, coupled to the processors, for storing one or more programs;
    wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the enhanced sampling method according to any one of claims 1 to 5 or the method for calculating the binding free energy of a target molecule-protein complex according to any one of claims 6 to 8.
  18. A computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the enhanced sampling method according to any one of claims 1 to 5 or the method for calculating the binding free energy of a target molecule-protein complex according to any one of claims 6 to 8.
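The first two steps of claim 1, as elaborated in claims 2 to 5, can be pictured with a small sketch. Everything named here is hypothetical: the data layout (atom indices, bond pairs, rings given as precomputed sets), the function names, and the scaling values are illustrative only. In particular, the claims do not define "indirectly connected"; this sketch assumes it means "reachable within `max_linker_bonds` bonds", with the linking atoms added to the heating region along with the ring. A cross-region term whose kind has no second-type coefficient is assumed to stay unchanged.

```python
from collections import deque


def _linker_to_ring(sources, ring, neighbors, max_bonds):
    """Breadth-first search from the source atoms to the nearest ring atom.

    Returns the atoms strictly between the sources and the ring (an empty list
    when a source is on the ring or bonded to it), or None if the ring is
    farther than max_bonds bonds away.
    """
    parent = {atom: None for atom in sources}
    queue = deque((atom, 0) for atom in sources)
    while queue:
        atom, dist = queue.popleft()
        if atom in ring:
            linker = []
            step = parent[atom]
            while step is not None and step not in sources:
                linker.append(step)
                step = parent[step]
            return linker
        if dist < max_bonds:
            for nxt in neighbors.get(atom, ()):
                if nxt not in parent:
                    parent[nxt] = atom
                    queue.append((nxt, dist + 1))
    return None


def determine_heating_region(perturbed, bonds, rings, extra=(), max_linker_bonds=2):
    """Grow the FEP perturbation region into a heating region (claim 5 rules)."""
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    initial = set(perturbed)          # initial heating region
    region = set(initial)
    for ring in rings:
        linker = _linker_to_ring(initial, ring, neighbors, max_linker_bonds)
        if linker is not None:
            region |= set(ring)       # the ring itself
            region |= set(linker)     # atoms linking the region to the ring
    return region | set(extra)        # additionally specified atoms, if any


def scale_term(kind, atoms, value, region, first_type, second_type):
    """Scale one force field term by where its atoms sit (claims 2 to 4)."""
    inside = sum(1 for atom in atoms if atom in region)
    if inside == len(atoms):
        return value * first_type.get(kind, 1.0)    # internal to the heating region
    if inside > 0:
        return value * second_type.get(kind, 1.0)   # heating region to environment
    return value                                    # environment only: unchanged
```

For example, with a six-membered ring on atoms 0 to 5 and a substituent chain 6-7-8 attached at atom 0, perturbing atom 7 would pull in atom 6 and the whole ring under the assumed cutoff, while a perturbed atom five bonds away from the ring would leave the region unchanged.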
PCT/CN2021/143802 2021-12-31 2021-12-31 Enhanced sampling method, and method for calculating binding free energy of complex WO2023123396A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/143802 WO2023123396A1 (en) 2021-12-31 2021-12-31 Enhanced sampling method, and method for calculating binding free energy of complex


Publications (1)

Publication Number Publication Date
WO2023123396A1 true WO2023123396A1 (en) 2023-07-06

Family

ID=86997259


Country Status (1)

Country Link
WO (1) WO2023123396A1 (en)


Similar Documents

Publication Publication Date Title
WO2022017405A1 (en) Medicine screening method and apparatus and electronic device
Zhang et al. Unified efficient thermostat scheme for the canonical ensemble with holonomic or isokinetic constraints via molecular dynamics
Gumbart et al. Standard binding free energies from computer simulations: What is the best strategy?
Fu et al. Accurate determination of protein: ligand standard binding free energies from molecular dynamics simulations
Tian et al. Assessing an ensemble docking-based virtual screening strategy for kinase targets by considering protein flexibility
Siebenmorgen et al. Evaluation of predicted protein–protein complexes by binding free energy simulations
Clough et al. Protein quantification in label-free LC-MS experiments
Giese et al. A GPU-accelerated parameter interpolation thermodynamic integration free energy method
Sun et al. Extensive Assessment of Various Computational Methods for Aspartate’s pKa Shift
Sahakyan Improving virtual screening results with MM/GBSA and MM/PBSA rescoring
Fu et al. BFEE2: automated, streamlined, and accurate absolute binding free-energy calculations
Alekseenko et al. Protein–protein and protein–peptide docking with ClusPro server
Procacci et al. Statistical mechanics of ligand–receptor noncovalent association, revisited: Binding site and standard state volumes in modern alchemical theories
Hayes et al. BLaDE: A basic lambda dynamics engine for GPU-accelerated molecular dynamics free energy calculations
EP3398102B1 (en) Methods for proteome docking to identify protein-ligand interactions
Hu et al. The importance of protonation and tautomerization in relative binding affinity prediction: a comparison of AMBER TI and Schrödinger FEP
Nishikawa et al. Comparison of the umbrella sampling and the double decoupling method in binding free energy predictions for SAMPL6 octa-acid host–guest challenges
Bolia et al. Adaptive BP-dock: an induced fit docking approach for full receptor flexibility
Masters et al. Efficient and accurate hydration site profiling for enclosed binding sites
Brylinski eMatchSite: Sequence order-independent structure alignments of ligand binding pockets in protein models
Minh Implicit ligand theory: Rigorous binding free energies and thermodynamic expectations from molecular docking
Tsai et al. Validation of Free Energy Methods in AMBER
Lemke et al. EncoderMap (II): Visualizing important molecular motions with improved generation of protein conformations
Ganguly et al. AMBER drug discovery boost tools: Automated workflow for production free-energy simulation setup and analysis (ProFESSA)
Ruiz-Blanco et al. CL-FEP: an end-state free energy perturbation approach

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
Ref document number: 21969743; Country of ref document: EP; Kind code of ref document: A1