WO2023123288A1 - 相对结合自由能贡献的确定方法、装置及存储介质 - Google Patents
相对结合自由能贡献的确定方法、装置及存储介质 Download PDFInfo
- Publication number
- WO2023123288A1 WO2023123288A1 PCT/CN2021/143389 CN2021143389W WO2023123288A1 WO 2023123288 A1 WO2023123288 A1 WO 2023123288A1 CN 2021143389 W CN2021143389 W CN 2021143389W WO 2023123288 A1 WO2023123288 A1 WO 2023123288A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- energy
- state
- window
- derivative
- frame
- Prior art date
Links
Images
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/50—Molecular design, e.g. of drugs
Definitions
- the present application relates to the field of computational simulation technology, and in particular to a method, device and storage medium for determining relative binding free energy contribution.
- the present application provides a method, device and storage medium for determining relative binding free energy contribution.
- Each frame is processed by a parallel processing unit, and based on the derivative of the energy term between the measured unit and the target molecule, the contribution of a single measured unit to the relative binding free energy can be quickly obtained.
- the second aspect of the present application provides a drug design method, including: determining the contribution value of the measured unit to the relative binding free energy according to the above method, and performing drug design based on the contribution value of the measured unit to the relative binding free energy .
- the third aspect of the present application provides a device for determining relative binding free energy contribution, including: a window segmentation module, a state processing module, a window processing module and a contribution determination module.
- the window segmentation module is used to divide the simulated molecular change process into multiple state windows based on multiple coupling variables ⁇ , where the coupling variable ⁇ [0,1], each state window includes M frames, and M is positive Integer;
- the state processing module is used to process the frame through the parallel processing unit to determine the derivative of the state energy of each frame;
- the window processing module is used to determine the mean value of the derivative of each state energy in the state window, as the window energy of the state window Derivative, the derivative of the window energy includes the derivative of the energy term between the measured unit and the target molecule;
- the contribution determination module is used to integrate the derivative of the energy term between the measured unit and the target molecule in each window energy, and determine the The contribution value of the measuring unit to the relative binding free energy, wherein the relative binding free energy is the relative
- the fourth aspect of the present application provides a drug design device, including a contribution determination module and a drug design module.
- the contribution determination module is used to determine the contribution value of the measured unit to the relative free energy of binding according to the above method
- the drug design module is used to design the drug based on the contribution value of the measured unit to the relative free energy of binding.
- a fifth aspect of the present application provides an electronic device, including: a processor and a memory, where executable code is stored on the memory, and when the executable code is executed by the processor, the processor is made to execute the above method.
- the sixth aspect of the present application also provides a computer-readable storage medium, on which executable codes are stored, and when the executable codes are executed by a processor of an electronic device, the processor is made to execute the above method.
- the seventh aspect of the present application further provides a computer program product, including executable codes, and when the executable codes are executed by a processor, the foregoing method is realized.
- FIG. 1 schematically shows a schematic diagram of an application scenario of a method according to an embodiment of the present application
- FIG. 1A and Figure 2B schematically show molecular structure diagrams of two small molecules
- Fig. 3 schematically shows a flow chart of a method for determining relative binding free energy contribution according to an embodiment of the present application
- Fig. 4 schematically shows a flow chart of determining a state energy derivative according to an embodiment of the present application
- Fig. 5 schematically shows a flow chart of determining a state energy derivative according to another embodiment of the present application
- Fig. 6 schematically shows a flowchart of a drug design method according to an embodiment of the present application
- Fig. 7 schematically shows a structural block diagram of a device for determining relative binding free energy contribution according to an embodiment of the present application
- Fig. 8 schematically shows a structural block diagram of a drug design device according to an embodiment of the present application.
- Fig. 9 schematically shows a structural block diagram of an electronic device implementing an embodiment of the present application.
- first, second, third and so on may be used in this application to describe various information, such information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another.
- first information may also be called second information, and similarly, second information may also be called first information.
- second information may also be called first information.
- a feature defined as “first” and “second” may explicitly or implicitly include one or more of these features.
- “plurality” means two or more, unless otherwise specifically defined.
- Molecular Simulation refers to the use of theoretical methods and computer technology to simulate the structure and physical and chemical properties of molecules or molecular systems.
- Binding free energy refers to the interaction between ligand and receptor.
- Covalent interactions and non-bonded interactions are two types of interactions that exist between ligands and receptors.
- FIG. 1 schematically shows an exemplary system architecture to which a determination method of relative binding free energy contribution and a drug design method can be applied according to an embodiment of the present application.
- Figure 1 is only an example of the system architecture to which the embodiment of the present application can be applied, to help those skilled in the art understand the technical content of the present application, but it does not mean that the embodiment of the present application cannot be used in other device, system, environment or scenario.
- a system architecture 100 may include terminal devices 101 , 102 , 103 , a network 104 and a server 105 .
- the network 104 is used as a medium for providing communication links between the terminal devices 101 , 102 , 103 and the server 105 .
- Network 104 may include various connection types, such as wires, wireless communication links, or fiber optic cables, among others.
- Terminal devices 101 , 102 , 103 Users can use terminal devices 101 , 102 , 103 to interact with other terminal devices and server 105 via network 104 to receive or send information and the like.
- Terminal devices 101, 102, and 103 can be installed with various communication client applications, for example, drug development applications, material design applications, web browser applications, database applications, search applications, instant messaging tools, email clients, social platforms software and other applications.
- Terminal devices 101, 102, and 103 include, but are not limited to, smart desktop computers, tablet computers, laptop computers, and other electronic devices that can support functions such as surfing the Internet, modeling, analysis and calculation, and design.
- the server 105 can receive the determination request of the relative binding free energy contribution, the drug design request, etc., determine the relative binding free energy and/or drug design based on the request, and can also send the result to the terminal devices 101 , 102 , 103 .
- the server 105 may be a background management server, a server cluster, and the like.
- terminal devices are only illustrative. According to the implementation needs, there can be any number of terminal devices, networks and clouds.
- the applicant found that if the contribution of the amino acids around the small molecule binding site to the total relative binding free energy is known, it can greatly help the design of the drug molecular structure, but there is no published method to achieve this.
- This application decomposes the total relative binding free energy based on Thermodynamic Integration (TI for short), and combines parallel processing architecture functions such as Compute Unified Device Architecture (CUDA for short), to quickly obtain the value of a single unit under test. Contribution, the measured unit can be, for example, a single amino acid.
- TI Thermodynamic Integration
- CUDA Compute Unified Device Architecture
- target molecule 1 in the following examples is represented by A
- target molecule 2 is represented by B
- receptor molecule is represented by P
- a and B may be drug molecules in two designs
- P is a certain protein molecule.
- the binding free energy of the process of combining P and A to obtain PA is ⁇ G 1
- the binding free energy of the process of combining P and B to obtain PB is ⁇ G 2
- a thermodynamic cycle can be constructed to simplify the calculation.
- the free energy change of the process from A to B is ⁇ G 3
- the free energy change of PA to PB is ⁇ G 4
- the latter two processes cannot be realized in reality and are called alchemy processes, they can be used to simplify calculations due to small changes and simple calculations.
- the embodiments of this application also need to calculate the latter two processes.
- the energy changes of several amino acids close to the binding site can be calculated to obtain the contribution of these amino acids to the relative binding free energy.
- Each amino acid can be calculated separately, or the derivative of the energy term of a single amino acid can be separated after the overall calculation.
- Fig. 3 schematically shows a flowchart of a method for determining relative binding free energy contribution according to an embodiment of the present application.
- the method includes operation S310 to operation S340.
- the simulated molecular change process is divided into a plurality of state windows based on a plurality of coupling variables ⁇ .
- the coupling variable ⁇ [0,1] each state window includes M frames, and M is a positive integer.
- each frame may represent a discrete state.
- a frame may include states of multiple atoms, for example, the states of multiple atoms in a frame may be position information of multiple atoms.
- Position information can be represented by coordinate values in a specified coordinate system in a frame. For example, an atom has three coordinate values of x, y, and z.
- the specified coordinate system can be a Cartesian coordinate system, a polar coordinate system, etc. Coordinates in different coordinate systems can be converted. For example, the same coordinate system, especially a spatial coordinate system, can be used in determining the relative binding free energy contribution. For example, some or all of the atomic coordinates are coordinates in the same coordinate system. It should be understood that some or all of the atomic coordinates may also be coordinates in different coordinate systems, but the coordinates in each coordinate system can be converted to each other.
- the molecular change process can be divided into 10 windows, each window contains 500 frames, and each frame includes coordinate values corresponding to the spatial positions of multiple atoms in a specified coordinate system.
- the frames are processed by the parallel processing unit to determine the derivative of the state energy of each frame.
- the frames may be processed by multiple parallel processing units of a graphics processing unit (GPU).
- the calculation formula of the derivative of the state energy can be shown in the formula (1).
- Thermodynamic integration is a method of calculating free energy.
- multiple windows are established through the coupling variable ⁇ , and the molecule is slowly changed from one molecule (molecule 1) to another molecule (molecule 2). Finally, the energy of multiple windows is integrated to obtain the free energy.
- ⁇ is between 0 and 1
- r is related to the atomic coordinates, ⁇ changes from 0 to 1, and the overall system changes from molecule 1 to molecule 2.
- the M frames each include the state of a plurality of atoms. For example, if each frame includes a system of 100 atoms, one frame includes the atomic coordinates of 100 atoms.
- an average value of derivatives of state energies in the state window is determined as a derivative of window energy of the state window, the derivative of window energy including a derivative of an energy term between the unit under test and the target molecule.
- the measured unit may be a component of the target molecule and consist of multiple atoms.
- the measured unit can be an amino acid
- the target molecule can be a protein molecule.
- the mean value of each frame may be taken as the energy representation of the state window. It should be noted that the embodiment of the present application does not exclude the implementation of determining the window energy by other statistical methods. For example, if there are 500 frames in a certain state window, after determining the derivatives of the respective energy items of the 500 frames, the average value of the derivatives of the respective energy items of the 500 frames is used as the window energy derivative of the state window.
- the derivative of the energy term between the measured unit and the target molecule in each window energy is integrated to determine the contribution value of the measured unit to the relative binding free energy, wherein the relative binding free energy is the two target molecules
- the relative free energy of binding to the receptor molecule, the measured unit is a part of the receptor molecule.
- the contribution value of the measured unit to the relative binding free energy can be calculated as follows.
- the free energy change ⁇ G can be obtained by integrating ⁇ , as shown in formula (2).
- the simulated molecular change process is determined by changing the coupling variable ⁇ to determine multiple windows, each ⁇ corresponds to a window, and each window includes multiple frames.
- ⁇ may take ⁇ 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1 ⁇ , for example.
- the value range of the frame quantity M of each state window may be 400 ⁇ M ⁇ 600.
- the value of M can be adjusted adaptively, if the value of M is too small, the result will be inaccurate. Too many values of M will slow down the speed of dynamics and post-processing, and the result will not be greatly improved.
- M may be 500, which is a relatively reasonable value.
- each frame may be processed by a computing unit of a graphics processor as a parallel processing unit.
- each computing unit may run multiple threads, each thread for processing a frame.
- each execution function kernel in CUDA can run multiple threads (threads), and each thread can be used to process a frame, so as to realize parallel computing and improve computing efficiency.
- the window energy includes the derivative of the energy term between the unit under test and the molecule of interest.
- the energy term between the measured unit and the target molecule refers to the mutual energy of the amino acid and the drug molecule, not their own energy.
- the energy term is a key term for calculating the contribution of the unit under test to the relative binding free energy, and thus, in operation S340, the contribution value of the unit under test to the relative binding free energy can be determined based on the energy term.
- all state windows can be summed up, and ⁇ can be integrated to obtain the contribution of the final measured unit to the total relative binding free energy.
- the contribution of the measured unit to the total relative binding free energy can be calculated by formula (3).
- ⁇ dH/d ⁇ > The integral of ⁇ dH/d ⁇ > from 0 to 1, the interval is divided into N-1, and there are a total of ⁇ dH/d ⁇ > 1 to ⁇ dH/d ⁇ > N points.
- ⁇ k is the ⁇ value of the k-th state window
- ⁇ dH/d ⁇ > is the average value of the derivative of the energy term between the measured unit and the target molecule in the state energy of the k-th state to ⁇ .
- the derivative of the amino acid and the energy term of the small molecule in the window energy of each state window is separated, and then integrated with ⁇ , the contribution of the amino acid to the total relative free energy can be obtained.
- the method for determining the relative binding free energy contribution processes each frame through a parallel processing unit, and based on the derivative of the energy term between the measured unit and the target molecule, the contribution of a single measured unit to the relative binding free energy can be quickly obtained contribute.
- Fig. 4 schematically shows a flow chart of state energy derivatives according to an embodiment of the present application.
- operation S420 processes the frames through the parallel processing unit to determine the derivative of the state energy of each frame, and may include operations S410 and S420 .
- atomic coordinates change on a state (eg, time) scale throughout the molecular transition, resulting in a series of frames.
- the state energy of each frame is determined by the atomic coordinates, and the derivative of the state energy of each frame can be determined through the atomic coordinate information.
- the state energies include van der Waals and Coulomb interactions
- processing the frames through the parallel processing unit to determine the state energies for each frame includes:
- each frame may be calculated using van der Waals and Coulomb interaction formulas, respectively.
- ⁇ is the potential energy parameter of van der Waals interaction
- r min is the distance parameter of van der Waals interaction
- r is the distance between molecules.
- the Coulomb action can use the SPME (Smooth Particle Mesh Ewald) algorithm, and E is the total energy, which can be calculated by formula (5).
- SPME Smooth Particle Mesh Ewald
- E dir may be shown in formula (6).
- E corr can be expressed as formula (7).
- E rec can be shown in formula (8).
- q is the charge of the atom
- ⁇ is the Ewald parameter
- r ij is the distance between atoms i and j
- erfc is the complementary error function.
- i, j ⁇ T is that i, j belong to the atom pair that is not calculated, for example, two atoms are directly connected
- erf is the error function.
- ⁇ rec F(B ⁇ C), where F is Fourier transform.
- B and C are matrixes related to grid points respectively.
- Q is the distribution of atomic charges on the lattice points, and the number of lattice points in each direction is K1, K2, K3.
- non-covalent binding systems there are mainly non-bonding interactions between small molecules and amino acids, which can include van der Waals interactions and Coulomb interactions.
- the free energy can be determined more accurately by calculating the van der Waals and Coulomb interactions.
- Fig. 5 schematically shows a flow chart of determining a state energy derivative according to another embodiment of the present application.
- the window energy is determined.
- the total energy of the window that is, the window energy, can be determined by the formula (1) described above.
- dynamics simulation is performed on each window based on the window energy to obtain the state energy of each frame.
- a dynamic simulation is performed on each window, and the state energy of each frame can be obtained. State energy for each frame.
- molecular dynamics is a set of molecular simulation methods, which mainly rely on Newtonian mechanics to simulate the motion of the molecular system, so as to draw samples in the system composed of different states of the molecular system, so as to calculate the configuration integral of the system, and Based on the result of configuration integration, the state energy of the system is further calculated.
- the phase trajectory of the system can be obtained by numerically solving the classical mechanical motion equation of the molecular system, and the structural characteristics and properties of the system can be counted.
- each frame is processed in parallel by the operation unit of the graphics processor to obtain the derivative of each state energy, so that the operation efficiency can be significantly improved.
- the target molecule 1 is ejm-42 as shown in Figure 2A
- the target molecule 2 is ejm-44 as shown in Figure 2B
- the receptor molecule is tyrosine kinase 2 (TYK2)
- the binding site on tyrosine kinase 2 is calculated
- the numbers near the points are the contributions of amino acids 901R, 913L, 980T, 981V, 982P, 983L, and 984G to the relative binding free energy of the two molecules.
- the computing hardware uses NVIDA RTX 2080GPU.
- Each state window saves trajectory files, and each trajectory file contains coordinate information of 500 single structures, that is, 500 frames.
- the universal organic small molecule force field (Generation Amber Force Field, GAFF for short) is used in the simulation. Read the molecular force field non-bond parameters (van der Waals parameters and charge parameters) of the system from the force field parameter file, and save them as CUDA global variables, which can be used by all frames in the trajectory file, and these frames share a set of parameters.
- each state window read the coordinate information of each frame from the trajectory file.
- Each thread under the CUDA architecture processes the coordinate information of a frame to obtain the derivative of the state energy.
- the derivatives of the state energies obtained for these 500 structures are averaged. All windows are then summed and integrated over ⁇ to obtain the amino acid contribution to the total relative binding free energy, including van der Waals and Coulomb interactions.
- the calculation result of the total relative binding free energy of the system is -4.16 kcal/mol (kcal/mol) which is consistent with the experimental value of -2.36 kcal/mol (both negative values), and the values are relatively close, the absolute value The error is within 2kcal/mol. It shows that the total calculation results are in good agreement with the experimental values.
- these amino acids can further help the rational design of drug molecules, for example, these amino acids can be mutated to obtain the desired change in relative free energy.
- the average calculation time for each window of an amino acid is 0.5s, which realizes fast calculation.
- the method for determining the relative binding free energy contribution uses a parallel computing architecture to obtain the contribution of a single unit under test to the relative binding free energy based on post-processing the trajectory files stored in the simulation process. This method can quickly obtain a single The contribution of amino acids to the relative binding free energies of two molecules, thereby aiding in the rational design of drug molecules.
- Another aspect of the present application also provides a drug design method.
- Fig. 6 schematically shows a flowchart of a drug design method according to an embodiment of the present application.
- the above drug design method includes operations S610 and S620.
- the contribution value of the unit under test to the relative binding free energy is determined.
- the process of determining the contribution value of the measured unit to the relative binding free energy can refer to the method described above, and will not be described in detail here.
- drug design is performed based on the contribution value of the measured unit to the relative binding free energy.
- the effect of the measured unit on the target molecule can be obtained through the contribution value of the measured unit to the relative binding free energy, which is helpful for drug design.
- the local interaction can be changed by changing the partial structure of an amino acid or drug molecule, thereby changing the total binding energy.
- Another aspect of the present application also provides an apparatus for determining relative binding free energy contribution.
- Fig. 7 schematically shows a structural block diagram of an apparatus for determining relative binding free energy contribution according to an embodiment of the present application.
- the determination device 700 may include: a window division module 710 , a state processing module 720 , a window processing module 730 and a contribution determination module 740 .
- the window segmentation module 710 is used to divide the simulated molecular change process into multiple state windows based on multiple coupling variables ⁇ , wherein, the coupling variable ⁇ [0,1], each state window includes M frames, and M is a positive integer .
- the state processing module 720 is used to process the frames through the parallel processing module to determine the derivative of the state energy for each frame.
- the window processing module 730 is used to determine the average value of the derivatives of each state energy in the state window as the derivative of the window energy of the state window, and the derivative of the window energy includes the derivative of the energy term between the measured unit and the target molecule.
- the contribution determination module 740 is used to integrate the derivative of the energy term between the measured unit and the target molecule in each window energy, and determine the contribution value of the measured unit to the relative binding free energy, wherein the relative binding free energy is two The relative binding free energy of the binding of the target molecule to the receptor molecule, the measured unit is a part of the receptor molecule.
- the state processing module 720 includes: an atomic coordinate determination unit and a derivative determination unit.
- the atomic coordinate determination unit is used for determining the atomic coordinates of a plurality of atoms in the frame for each frame.
- the derivative determination unit is for determining a derivative of the state energy of the frame based on the atomic coordinates.
- the derivative determination unit includes a window energy determination subunit, a frame state energy acquisition subunit, and a state energy derivative determination subunit.
- the window energy determining subunit is used for determining the window energy based on atomic coordinates.
- the frame state energy obtaining subunit is used to perform dynamic simulation on each window based on the window energy to obtain the state energy of each frame.
- the state energy derivative determining subunit is used to process each frame in parallel to determine the state energy derivative.
- state energies include van der Waals and Coulomb interactions.
- the state processing module 720 is specifically configured to determine the van der Waals interaction and the Coulomb interaction through the unified molecular force field non-bond parameters and frame coordinate parameters.
- the value range of the number M of frames is 400 ⁇ M ⁇ 600.
- the parallel processing unit is a computing unit of a graphics processor.
- the Compute Unit runs multiple threads, one for each frame.
- the value range of the number of state windows is [10,15].
- Another aspect of the present application also provides a drug design device.
- Fig. 8 schematically shows a structural block diagram of a drug design device according to an embodiment of the present application.
- the drug design device 800 may include a contribution determination module 810 and a drug design module 820 .
- the contribution determination module 810 is used to determine the contribution value of the unit under test to the relative binding free energy by using the method for determining the relative binding free energy contribution as above.
- the design module 820 is used for drug design based on the contribution of the measured unit to the relative binding free energy.
- Another aspect of the present application also provides an electronic device.
- Fig. 9 schematically shows a block diagram of an electronic device implementing an embodiment of the present application.
- an electronic device 900 includes a memory 910 and a processor 920 .
- the processor 920 can be a central processing unit (Central Processing Unit, CPU), and can also be other general-purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), on-site Field-Programmable Gate Array (FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.
- a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
- the memory 910 may include various types of storage units such as system memory, read only memory (ROM), and persistent storage.
- the ROM may store static data or instructions required by the processor 920 or other modules of the computer.
- the persistent storage device may be a readable and writable storage device.
- Persistent storage may be a non-volatile storage device that does not lose stored instructions and data even if the computer is powered off.
- the permanent storage device adopts a mass storage device (such as a magnetic or optical disk, flash memory) as the permanent storage device.
- the permanent storage device may be a removable storage device (such as a floppy disk, an optical drive).
- the system memory can be a readable and writable storage device or a volatile readable and writable storage device, such as dynamic random access memory.
- System memory can store some or all of the instructions and data that the processor needs at runtime.
- memory 910 may include any combination of computer-readable storage media, including various types of semiconductor memory chips (eg, DRAM, SRAM, SDRAM, flash memory, programmable read-only memory), and magnetic disks and/or optical disks may also be used.
- memory 910 may include a readable and/or writable removable storage device, such as a compact disc (CD), a read-only digital versatile disc (e.g., DVD-ROM, dual-layer DVD-ROM), Read-only Blu-ray Disc, Super Density Disc, Flash memory card (such as SD card, min SD card, Micro-SD card, etc.), magnetic floppy disk, etc.
- a readable and/or writable removable storage device such as a compact disc (CD), a read-only digital versatile disc (e.g., DVD-ROM, dual-layer DVD-ROM), Read-only Blu-ray Disc, Super Density Disc, Flash memory card (such as SD card, min SD card, Micro-SD card, etc.), magnetic floppy disk, etc.
- Computer-readable storage media do not contain carrier waves and transient electronic signals transmitted by wireless or wire.
- Executable codes are stored in the memory 910 , and when the executable codes are processed by the processor 920 , the processor 920 can be made to execute part or all of the methods mentioned above.
- the method according to the present application can also be implemented as a computer program or computer program product, the computer program or computer program product including computer program code instructions for executing some or all of the steps in the above method of the present application.
- the present application may also be implemented as a computer-readable storage medium (or non-transitory machine-readable storage medium or machine-readable storage medium), on which executable code (or computer program or computer instruction code) is stored,
- executable code or computer program or computer instruction code
- the processor of the electronic device or server, etc.
- the processor is made to perform part or all of the steps of the above method according to the present application.
Landscapes
- Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Chemical & Material Sciences (AREA)
- Medicinal Chemistry (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- Pharmacology & Pharmacy (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Crystallography & Structural Chemistry (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computing Systems (AREA)
- Theoretical Computer Science (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
一种相对结合自由能贡献的确定方法、装置及存储介质。该相对结合自由能贡献的确定方法包括:基于多个耦合变量λ将模拟的分子变化过程分割为多个状态窗口;通过并行处理单元处理帧,以确定每个帧的状态能量的导数;确定状态窗口中各个状态能量的导数的平均值,作为状态窗口的窗口能量的导数;对每个窗口能量中被测单元与目标分子之间的能量项的导数进行积分,确定被测单元对相对结合自由能的贡献值。通过并行处理单元处理各个帧,并基于被测单元与目标分子之间的能量项的导数,可以快速得到单个被测单元对相对结合自由能的贡献。
Description
本申请涉及计算模拟技术领域,尤其涉及一种相对结合自由能贡献的确定方法、装置及存储介质。
随着计算机技术和基础学科理论的快速发展,分子模拟的计算效率和精度都获得极大提高,使得分子模拟在多学科领域得到广泛应用。其中,相对结合自由能计算可用于预测两个小分子和蛋白质之间的相对结合能力,因此广泛应用于药物分子设计领域。相关技术中的分子模拟软件只是给出最后的总的相对结合自由能,这对药物设计的参考价值比较有限。
发明内容
为解决或部分解决相关技术中存在的问题,本申请提供一种相对结合自由能贡献的确定方法、装置及存储介质。通过并行处理单元处理各个帧,并基于被测单元与目标分子之间的能量项的导数,可以快速得到单个被测单元对相对结合自由能的贡献。
本申请的第一个方面提供了一种相对结合自由能贡献的确定方法,该相对结合自由能贡献的确定方法包括:基于多个耦合变量λ将模拟的分子变化过程分割为多个状态窗口,其中,耦合变量λ∈[0,1],每个状态窗口包括M个帧,M为正整数;通过并行处理单元处理帧,以确定每个帧的状态能量的导数;确定状态窗口中各个状态能量的导数的平均值,作为状态窗口的窗口能量的导数,窗口能量的导数包括被测单元与目标分子之间的能量项的导数;以及对每个窗口能量中被测单元与目标分子之间的能量项的导数进行积分,确定被测单元对相对结合自由能的贡献值,其中,相对结合自由能是两个目标分子与受体分子结合的相对结合自由能,被测单元为受体分子的一部分。
本申请的第二个方面提供了一种药物设计方法,包括:根据如上述的方法确定被测单元对相对结合自由能的贡献值,基于被测单元对相对结合自由能的贡献值进行药物设计。
本申请的第三方面提供了一种相对结合自由能贡献的确定装置,包括:窗口分割模块、状态处理模块、窗口处理模块和贡献确定模块。其中,窗口分割模块用于基于多个耦合变量λ将模拟的分子变化过程分割为多个状态窗口,其中,耦合变量λ∈[0,1],每个状态 窗口包括M个帧,M为正整数;状态处理模块用于通过并行处理单元处理帧,以确定每个帧的状态能量的导数;窗口处理模块用于确定状态窗口中各个状态能量的导数的平均值,作为状态窗口的窗口能量的导数,窗口能量的导数包括被测单元与目标分子之间的能量项的导数;贡献确定模块用于对每个窗口能量中被测单元与目标分子之间的能量项的导数进行积分,确定被测单元对相对结合自由能的贡献值,其中,相对结合自由能是两个目标分子与受体分子结合的相对结合自由能,被测单元为受体分子的一部分。
本申请的第四方面提供了药物设计装置,包括贡献确定模块和药物设计模块。其中,贡献确定模块用于根据如上述的方法确定被测单元对相对结合自由能的贡献值;药物设计模块用于基于被测单元对相对结合自由能的贡献值进行药物设计。
本申请的第五方面提供了一种电子设备,包括:处理器和存储器,存储器上存储有可执行代码,当上述可执行代码被处理器执行时,使得处理器执行上述方法。
本申请的第六方面还提供了一种计算机可读存储介质,其上存储有可执行代码,当可执行代码被电子设备的处理器执行时,使处理器执行上述方法。
本申请的第七方面还提供了一种计算机程序产品,包括可执行代码,可执行代码被处理器执行时实现上述方法。
本申请提供的相对结合自由能贡献的确定方法,通过并行处理单元处理各个帧,并基于被测单元与目标分子之间的能量项的导数,可以快速得到单个被测单元对相对结合自由能的贡献。
应当理解的是,以上的一般描述和后文的细节描述仅是示例性和解释性的,并不能限制本申请。
通过结合附图对本申请示例性实施方式进行更详细地描述,本申请的上述以及其它目的、特征和优势将变得更加明显,其中,在本申请示例性实施方式中,相同的参考标号通常代表相同部件。
图1示意性示出了根据本申请实施例的方法的应用场景的示意图;
图2A和图2B示意性示出了两个小分子的分子结构图;
图3示意性示出了根据本申请实施例的相对结合自由能贡献的确定方法的流程图;
图4示意性示出了根据本申请实施例的确定状态能量导数的流程图;
图5示意性示出了根据本申请另一实施例的确定状态能量导数的流程图;
图6示意性示出了根据本申请实施例的药物设计方法的流程图;
图7示意性示出了根据本申请实施例的一种相对结合自由能贡献的确定装置的结构框图;
图8示意性示出了根据本申请实施例的一种药物设计装置的结构框图;
图9示意性示出了实现本申请实施例的电子设备的结构框图。
下面将参照附图更详细地描述本申请的实施方式。虽然附图中显示了本申请的实施方式,然而应该理解,可以以各种形式实现本申请而不应被这里阐述的实施方式所限制。相反,提供这些实施方式是为了使本申请更加透彻和完整,并且能够将本申请的范围完整地传达给本领域的技术人员。
在本申请使用的术语是仅仅出于描述特定实施例的目的,而非旨在限制本申请。在此使用的术语“包括”、“包含”等表明了特征、步骤、操作和/或部件的存在,但是并不排除存在或添加一个或多个其他特征、步骤、操作或部件。
在此使用的所有术语(包括技术和科学术语)具有本领域技术人员通常所理解的含义,除非另外定义。应注意,这里使用的术语应解释为具有与本说明书的上下文相一致的含义,而不应以理想化或过于刻板的方式来解释。
应当理解,尽管在本申请可能采用术语“第一”、“第二”、“第三”等来描述各种信息,但这些信息不应限于这些术语。这些术语仅用来将同一类型的信息彼此区分开。例如,在不脱离本申请范围的情况下,第一信息也可以被称为第二信息,类似地,第二信息也可以被称为第一信息。由此,限定有“第一”、“第二”的特征可以明示或者隐含地包括一个或者更多个该特征。在本申请的描述中,“多个”的含义是两个或两个以上,除非另有明确具体的限定。
分子模拟(Molecular Simulation,简称MS),是指利用理论方法和计算机技术,模拟分子或分子体系的结构和物理化学性质。
结合自由能(binding free energy)是指存在于配体与受体之间的相互作用。
共价相互作用和非键相互作用,是两类存在于配体和受体之间的相互作用。
非键相互作用存在于大部分体系中,主要由范德瓦耳斯作用、库伦作用等组成。
以下将通过各个附图对本申请实施例的一种相对结合自由能贡献的确定方法、药物设计方法和装置进行详细描述。
图1示意性示出了根据本申请实施例的可以应用相对结合自由能贡献的确定方法、药物设计方法的一种示例性系统架构。需要注意的是,图1所示仅为可以应用本申请实施例的系统架构的示例,以帮助本领域技术人员理解本申请的技术内容,但并不意味着本申请实施例不可以用于其他设备、系统、环境或场景。
参见图1,根据该实施例的系统架构100可以包括终端设备101、102、103,网络104和服务器105。网络104用以在终端设备101、102、103和服务器105之间提供通信链路的介质。网络104可以包括各种连接类型,例如有线、无线通信链路或者光纤电缆等等。
用户可以使用终端设备101、102、103通过网络104与其他终端设备和服务器105进行交互,以接收或发送信息等。终端设备101、102、103可以安装有各种通讯客户端应用,例如,药物开发应用、材料设计应用、网页浏览器应用、数据库类应用、搜索类应用、即时通信工具、邮箱客户端、社交平台软件等应用等。
终端设备101、102、103包括但不限于智能台式电脑、平板电脑、膝上型便携计算机等等可以支持上网、建模、分析计算、设计等功能的电子设备。
服务器105可以接收相对结合自由能贡献的确定请求、药物设计请求等,基于请求确定相对结合自由能和/或药物设计,还可以发送结果给终端设备101、102、103。例如,服务器105可以为后台管理服务器、服务器集群等。
需要说明的是,终端设备、网络和服务器的数目仅仅是示意性的。根据实现需要,可以具有任意数目的终端设备、网络和云端。
申请人在进行药物设计的过程中发现,如果了解小分子结合位点周围的氨基酸对总的相对结合自由能的贡献,可对设计药物分子结构产生较大的帮助作用,但目前尚无公开的方法可以实现该目的。本申请基于热力学积分(Thermodynamic Integration,简称TI)对总的相对结合自由能进行分解,结合诸如统一计算架构(Compute Unified Device Architecture,简称CUDA)等并行处理架构功能,可以快速得到单个被测单元的贡献,被测单元例如可以是单个氨基酸。
下面结合本申请实施例的目标分子与受体分子结合的场景,示例性说明分子变化的过程。为了简明,以下实施例中的目标分子1用A表示,目标分子2用B表示,受体分子用P表示,其中A和B的相似程度较高。例如,A和B可能为两种设计中的药物分子,P为某蛋白质分子。
作为示例,图2A和图2B示意性示出了两个小分子的分子结构图。其中,图2A 示出一种名为ejm-42的分子,图2B示出一种名为ejm-44的分子,受体分子例如可以选择酪氨酸激酶2(TYK2),因结构复杂而未示出。
假设,P与A结合得到PA的过程的结合自由能为ΔG
1,P与B结合得到PB的过程的结合自由能为ΔG
2,则相对结合自由能为ΔG
2-ΔG
1=ΔΔG。由于上述两个过程计算复杂,可以构建热力学循环简化计算。由A转化为B的过程的自由能变化为ΔG
3,由PA转化为PB的自由能变化为ΔG
4,从而,ΔΔG=ΔG
2-ΔG
1=ΔG
4-ΔG
3。虽然后两个过程无法在现实中实现,被称为炼金术过程,但是由于变化较小,计算简单,可以用于简化计算,本申请实施例需要计算的也是后两个过程。
特别的是,本申请实施例在计算由PA转化为PB的过程中,可以仅计算受体分子中与结合位点临近的数个被测单元。例如,受体分子为蛋白质,可以仅计算结合位点临近的数个氨基酸的能量变化情况,从而得到该些氨基酸对于相对结合自由能的贡献。每个氨基酸可以单独计算,也可以整体计算后,分离出单个氨基酸的能量项的导数。
图3示意性示出了根据本申请实施例的相对结合自由能贡献的确定方法的流程图。
如图3所示,该方法包括操作S310~操作S340。
在操作S310,基于多个耦合变量λ将模拟的分子变化过程分割为多个状态窗口。其中,耦合变量λ∈[0,1],每个状态窗口包括M个帧,M为正整数。
在本实施例中,每个帧可以表示一种离散状态。例如,一帧中可以包括多个原子的状态,如一帧的多个原子状态可以是多个原子的位置信息等。位置信息可以通过一帧中的指定坐标系下的坐标值进行表征。例如,1个原子有x,y,z三个坐标值。
指定坐标系可以是笛卡尔坐标系、极坐标系等。不同坐标系下的坐标之间可以进行转换。例如,在确定相对结合自由能贡献的过程中可以采用同一坐标系,特别是空间坐标系。例如,所有原子坐标的部分或者全部是在同一坐标系下的坐标。应当能够理解的是,所有原子坐标中的部分或者全部也可以是在不同坐标系下的坐标,但是各坐标系下的坐标之间可以相互转换。
在一个具体实施例中,可以将分子变化过程分割为10个窗口,每个窗口包含500帧,每帧中包括与多个原子在指定坐标系下的空间位置对应的坐标值。
在操作S320,通过并行处理单元处理帧,以确定每个帧的状态能量的导数。
其中,可以通过图形处理器(GPU)的多个并行处理单元处理帧。状态能量的导数的计算公式可以参考式(1)所示。
热力学积分是计算自由能的一种方法。本实施例通过耦合变量λ建立多个窗口, 将分子从一个分子(分子1)缓慢变成另一个分子(分子2),最后对多个窗口的能量进行积分,得到自由能。其中,λ=0时,体系的总能量是分子1的能量H0。同理,λ=1时,体系的总能量是分子2的能量H1。λ介于0和1之间时,体系的总能量是两端能量的线性组合H=(1-λ)×H0(r)+λ×H1(r)。r与原子坐标有关,λ从0变为1,总的体系从分子1变为分子2。
对于每个λ(窗口),能量H对于λ的导数的计算公式如式(1)所示:
在某些实施例中,M个帧各自包括多个原子的状态。如每帧中包括100个原子的体系,一个帧包括100个原子的原子坐标。
例如,通过并行处理单元处理帧,以确定每个帧的状态能量的导数,可以包括如下操作。首先,对于每个帧,确定帧的多个原子的原子坐标。例如,1个原子有x,y,z三个坐标值,则1帧中包括与100个原子对应的300个坐标值。然后,基于原子坐标,确定帧的状态能量。例如,可以采用体系的总能量是两端能量的线性组合H=(1-λ)×H0(r)+λ×H1(r),基于体系的总能量来确定各帧的状态能量的导数。其中,r是距离参数,如两两原子之间的距离,其值可以是基于原子坐标来确定的。
在操作S330,确定状态窗口中各个状态能量的导数的平均值,作为状态窗口的窗口能量的导数,窗口能量的导数包括被测单元与目标分子之间的能量项的导数。
其中,被测单元可以是目标分子的组成部分,并且由多个原子组成。例如,被测单元可以是氨基酸,目标分子可以是蛋白质分子。
在某些实施例中,在同一个状态窗口内,可以取各个帧的均值作为该状态窗口的能量表征。应当注意的是,本申请实施例并不排除以其他统计方法确定窗口能量的实施方式。例如,某个状态窗口里面含有如500帧,则在确定了500帧各自的能量项的导数之后,以这500帧各自的能量项的导数的平均值,作为该状态窗口的窗口能量的导数。
在操作S340,对每个窗口能量中被测单元与目标分子之间的能量项的导数进行积分,确定被测单元对相对结合自由能的贡献值,其中,相对结合自由能是两个目标分子与受体分子结合的相对结合自由能,被测单元为受体分子的一部分。
可以通过如下方式计算得到被测单元对相对结合自由能的贡献值。
将分子从一个分子(分子1)缓慢变成另一个分子(分子2)的过程中,自由能的变化ΔG可以通过对λ进行积分得到,如式(2)所示。
在某些实施例中,模拟的分子变化过程通过耦合变量λ变化,确定多个窗口,每个λ对应一个窗口,每个窗口包括多个帧。其中,λ例如可以取{0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1}。每个状态窗口的帧数量M的取值范围可以为400≤M≤600。M的数值可以适应性调整,M的数值太少,则结果不精确。M的数值太多,则会拖慢动力学的速度和后处理的速度,而且结果不会有大的提升。本实施例中,M可以取500,是一个相对合理的值。
在某些实施例中,在操作S320,通过处理上述多个帧,可以确定每个状态窗口的窗口能量。由于每个帧的结构相对独立,彼此不需要信息交流,非常适合并行运算,而图形处理器(GPU)运算技术非常适合此类并行运算。在某些实施例中,可以通过图形处理器的计算单元作为并行处理单元,处理各个帧。
在某些实施例中,每个计算单元可以运行多个线程,每个线程用于处理一个帧。例如,CUDA中的每个执行函数kernel可以运行多个线程(thread),可以使用每个线程处理一个帧,从而实现并行运算,提高运算效率。
在某些实施例中,窗口能量包括被测单元与目标分子之间的能量项的导数。在上文描述的由PA转化为PB的过程中,被测单元与目标分子之间存在相互作用,所计算的窗口能量中,存在反映这种作用的能量项的导数,即被测单元与目标分子之间的能量项的导数。以蛋白质为例,被测单元与目标分子之间的能量项是指氨基酸和药物分子的相互能量,不是它们自身的能量。该能量项是用于计算被测单元对相对结合自由能贡献的关键项,从而,在操作S340,可以基于该能量项来确定被测单元对相对结合自由能的贡献值。
在某些实施例中,可以将所有状态窗口汇总,对λ进行积分,得到最终的被测单元对总的相对结合自由能的贡献。例如可以通过式(3)计算被测单元对总的相对结合自由能的贡献。
从0到1的<dH/dλ>的积分,间隔分为N-1,共有<dH/dλ>
1到<dH/dλ>
N个点。其中,λ
k为第k个状态窗口的λ值,<dH/dλ>为第k个状态的状态能量中被测单元与目标分子之间的能量项对λ的导数的平均值。
以氨基酸为例,将每个状态窗口的窗口能量中的该氨基酸和小分子的能量项的导数分离出来,然后对λ积分,便可以得到该氨基酸对总的相对自由能的贡献。
本申请提供的相对结合自由能贡献的确定方法,通过并行处理单元处理各个帧,并基于被测单元与目标分子之间的能量项的导数,可以快速得到单个被测单元对相对结合自由能的贡献。
图4示意性示出了根据本申请实施例的状态能量导数的流程图。
如图4所示,操作S420通过并行处理单元处理帧,以确定每个帧的状态能量的导数,可以包括操作S410和S420。
在操作S410,对于每个帧,确定该帧的多个原子的原子坐标。
在操作S420,基于原子坐标,确定该帧的状态能量的导数。
在某些实施例中,在整个分子变化过程中,原子坐标在状态(如时间)尺度上发生变化,从而得到一系列的帧。每个帧的状态能量由原子坐标决定,通过原子坐标信息,可以确定每个帧的状态能量的导数。
在某些实施例中,状态能量包括范德华作用和库仑作用,通过并行处理单元处理帧,以确定每个帧的状态能量包括:
通过统一的分子力场非键参数和帧的坐标参数,确定范德华作用和库仑作用。
在某些实施例中,每个帧可以分别使用范德华作用和库仑作用的公式进行计算。
范德华的作用E(VDW)可以采用Lennard-Jones势能公式,如式(4)所示。
其中,ε是范德华作用的势能参数,r
min是范德华作用的距离参数,r为分子之间的距离。
库仑作用可以采用SPME(Smooth Particle Mesh Ewald)算法,E为总的能量,可以由式(5)计算得到。
E=E
dir+E
corr+E
rec 式(5)
其中,E
dir可以如式(6)所示。E
corr可以如式(7)所示。E
rec可以如式(8)所示。
式(6)至式(8)中,q为原子的电荷,β是Ewald参数,r
ij为原子i和j之间 的距离,erfc是互补误差函数。i,j∈T是i,j属于不被计算的原子对,比如两个原子直接相连,erf是误差函数。θ
rec=F(B·C),F是傅里叶变换。B,C分别为格点相关的矩阵。Q是原子电荷在格点的分布,各个方向的格点数为K1,K2,K3。
对于非共价结合的体系,小分子和氨基酸之间主要是非键作用,可以包括范德华作用和库仑作用。通过计算范德华作用和库仑作用,可以较为精确地确定自由能。
图5示意性示出了根据本申请另一实施例的确定状态能量导数的流程图。
如图5所示,可以通过操作S510~S530,实现基于原子坐标,确定帧的状态能量的导数。
在操作S510,基于原子坐标,确定窗口能量。在本实施例中,可以通过上文描述的式(1)确定窗口的总能量,即窗口能量。
在操作S520,基于窗口能量,对每个窗口进行动力学模拟,得到各帧的状态能量。在本实施例中,基于定义的与λ有关的窗口总能量,对每个窗口进行动力学模拟,可得到各帧的状态能量,例如参考上文描述的范德华作用和库仑作用的确定方法,确定各帧的状态能量。
其中,分子动力学是一套分子模拟方法,该方法主要是依靠牛顿力学来模拟分子体系的运动,以在由分子体系的不同状态构成的系统中抽取样本,从而计算体系的构型积分,并以构型积分的结果为基础进一步计算体系的状态能量。具体地,可以通过数值求解分子体系经典力学运动方程的方法得到体系的相轨迹,并统计体系的结构特征与性质。
在操作S530,并行处理各帧,确定状态能量的导数。在本实施例中,可以通过图形处理器的运算单元并行处理各个帧,以获得各个状态能量的导数,从而可以显著提高运算效率。
实施例:
(1)实验对象:
目标分子1为如图2A所示的ejm-42,目标分子2为如图2B所示的ejm-44,受体分子为酪氨酸激酶2(TYK2),计算酪氨酸激酶2上结合位点附近的编号分别为901R,913L,980T,981V,982P,983L,984G的氨基酸对两个分子的相对结合自由能的贡献。
(2)使用的工具及参数设置:
运算硬件选用NVIDA RTX 2080GPU。
分子动力学软件如Amber、Gromacs等,本实施例中具体选用Amber20。
共12个状态窗口,耦合变量λ的取值分别为:0,0.0479,0.1151,0.2063,0.3161,0.4374,0.5626,0.6839,0.7937,0.8849,0.9521,1。每个状态窗口保存轨迹文件,每个轨迹文件含有500个单个结构的坐标信息,即500个帧。
模拟采用的普适型有机小分子力场(Generation Amber Force Field,简称GAFF)。从力场参数文件中读取体系的分子力场非键参数(范德华参数和电荷参数),保存为CUDA的全局变量,可供轨迹文件中所有的帧使用,这些帧共用一套参数。
(3)方法步骤
对于每个状态窗口,从轨迹文件中读取各帧的坐标信息。CUDA架构下的每一个线程处理一个帧的坐标信息,得到状态能量的导数。待所有的计算完成,对这500个结构得到的状态能量的导数求平均值。然后将所有的窗口汇总对λ进行积分,得到氨基酸对总的相对结合自由能的贡献,包括范德华作用和库仑作用。
此处未详尽描述的部分请参照上文图3-图5描述的方法。
(4)实验结果
体系总的相对结合自由能的计算的结果为-4.16千卡/摩尔(kcal/mol)与实验值的-2.36千卡/摩尔符号相符(均为负值),而且数值较为接近,绝对值的误差在2kcal/mol以内。说明总的计算结果和实验值比较符合。
运用上述流程对这些结构进行后处理得到结合位点周围的氨基酸对总的相对结合自由能的贡献,结果见表1。
表1
通过结果可以看出氨基酸980T,984G,981V,982P这四个氨基酸对最终负的相对自由能贡献比较大。从贡献的绝对值看出,它们和目标分子的作用主要是范德华作用。负的值说明这些氨基酸对目标分子从ejm-44到ejm-42的转变具有促进作用。ejm-44到ejm-42的转变主要是将两个甲基分别变为氢原子。根据化学理论,甲基主要靠范德华作用和其他分子作用,与此一致的是,表里的数值表明这些氨基酸和这两个变化的甲基存在范德华排斥作用。得到这些氨基酸的贡献,可以进一步帮助药物分子的理性设计,比如可以突变这些氨基酸,以得到期望的相对自由能的变化。此外,在本实施例中,一个氨基酸每个窗口的平均计算时间为0.5s,实现了快速计算。
本申请实施例提供的相对结合自由能贡献的确定方法,利用并行运算架构,基于对模拟过程存储的轨迹文件进行后处理得到单个被测单元对相对结合自由能的贡献,该方法可以快速得到单个氨基酸对两个分子的相对结合自由能的贡献,从而帮助药物分子理性设计。
本申请的另一方面还提供了一种药物设计方法。
图6示意性示出了根据本申请实施例的一种药物设计方法的流程图。
如图6所示,上述药物设计方法包括操作S610和S620。
在操作S610,确定被测单元对相对结合自由能的贡献值。确定被测单元对相对结合自由能的贡献值的过程可以参考如上文所描述的方法,在此不再详述。
在操作S620,基于被测单元对相对结合自由能的贡献值进行药物设计。通过被测单元对相对结合自由能的贡献值可以得到该被测单元对目标分子的作用,有助于药物设计。例如,可以改变某个氨基酸或者药物分子的部分结构来改变这种局部的相互作用,从而改变总的结合能量。
本申请的另一方面还提供了一种相对结合自由能贡献的确定装置。
图7示意性示出了根据本申请实施例的一种相对结合自由能贡献的确定装置的结构框图。
如图7所示,该确定装置700可以包括:窗口分割模块710、状态处理模块720、窗口处理模块730以及贡献确定模块740。
窗口分割模块710用于基于多个耦合变量λ将模拟的分子变化过程分割为多个状态窗口,其中,耦合变量λ∈[0,1],每个状态窗口包括M个帧,M为正整数。
状态处理模块720用于通过并行处理模块处理帧,以确定每个帧的状态能量的导数。
窗口处理模块730用于确定状态窗口中各个状态能量的导数的平均值,作为状态窗口的窗口能量的导数,窗口能量的导数包括被测单元与目标分子之间的能量项的导数。
贡献确定模块740用于对每个窗口能量中被测单元与目标分子之间的能量项的导数进行积分,确定被测单元对相对结合自由能的贡献值,其中,相对结合自由能是两个目标分子与受体分子结合的相对结合自由能,被测单元为受体分子的一部分。
在某些实施例中,状态处理模块720包括:原子坐标确定单元和导数确定单元。
其中,原子坐标确定单元用于对于每个帧,确定帧的多个原子的原子坐标。
导数确定单元用于基于原子坐标确定帧的状态能量的导数。
在某些实施例中,导数确定单元包括窗口能量确定子单元、帧状态能量获得子单元和状态能量导数确定子单元。
其中,窗口能量确定子单元用于基于原子坐标确定窗口能量。
帧状态能量获得子单元用于基于窗口能量对每个窗口进行动力学模拟,得到各帧的状态能量。
状态能量导数确定子单元用于并行处理各帧,确定状态能量的导数。
在某些实施例中,状态能量包括范德华作用和库仑作用。
相应地,状态处理模块720具体用于通过统一的分子力场非键参数和帧的坐标参数,确定范德华作用和库仑作用。
在某些实施例中,帧的数量M的取值范围为400≤M≤600。
在某些实施例中,并行处理单元为图形处理器的计算单元。计算单元运行多个线程,每个线程用于处理一个帧。
在某些实施例中,状态窗口的数量的取值范围为[10,15]。
本申请的另一方面还提供了一种药物设计装置。
图8示意性示出了根据本申请实施例的一种药物设计装置的结构框图。
如图8所示,该药物设计装置800可以包括贡献确定模块810和药物设计模块820。
其中,贡献确定模块810用于利用如上的相对结合自由能贡献的确定方法确定被测单元对相对结合自由能的贡献值。
设计模块820用于基于被测单元对相对结合自由能的贡献值进行药物设计。
关于上述实施例中的相对结合自由能贡献的确定装置700、药物设计装置800,其中各个模块、单元执行操作的具体方式已经在有关该方法的实施例中进行了详细描 述,此处将不再做详细阐述说明。
本申请的另一方面还提供了一种电子设备。
图9示意性示出了实现本申请实施例的电子设备的方框图。
参见图9,电子设备900包括存储器910和处理器920。
处理器920可以是中央处理单元(Central Processing Unit,CPU),还可以是其他通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field-Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。
存储器910可以包括各种类型的存储单元,例如系统内存、只读存储器(ROM)和永久存储装置。其中,ROM可以存储处理器920或者计算机的其他模块需要的静态数据或者指令。永久存储装置可以是可读写的存储装置。永久存储装置可以是即使计算机断电后也不会失去存储的指令和数据的非易失性存储设备。在一些实施方式中,永久性存储装置采用大容量存储装置(例如磁或光盘、闪存)作为永久存储装置。另外一些实施方式中,永久性存储装置可以是可移除的存储设备(例如软盘、光驱)。系统内存可以是可读写存储设备或者易失性可读写存储设备,例如动态随机访问内存。系统内存可以存储一些或者所有处理器在运行时需要的指令和数据。此外,存储器910可以包括任意计算机可读存储媒介的组合,包括各种类型的半导体存储芯片(例如DRAM,SRAM,SDRAM,闪存,可编程只读存储器),磁盘和/或光盘也可以采用。在一些实施方式中,存储器910可以包括可读和/或写的可移除的存储设备,例如激光唱片(CD)、只读数字多功能光盘(例如DVD-ROM,双层DVD-ROM)、只读蓝光光盘、超密度光盘、闪存卡(例如SD卡、min SD卡、Micro-SD卡等)、磁性软盘等。计算机可读存储媒介不包含载波和通过无线或有线传输的瞬间电子信号。
存储器910上存储有可执行代码,当可执行代码被处理器920处理时,可以使处理器920执行上文述及的方法中的部分或全部。
此外,根据本申请的方法还可以实现为一种计算机程序或计算机程序产品,该计算机程序或计算机程序产品包括用于执行本申请的上述方法中部分或全部步骤的计算机程序代码指令。
或者,本申请还可以实施为一种计算机可读存储介质(或非暂时性机器可读存储 介质或机器可读存储介质),其上存储有可执行代码(或计算机程序或计算机指令代码),当可执行代码(或计算机程序或计算机指令代码)被电子设备(或服务器等)的处理器执行时,使处理器执行根据本申请的上述方法的各个步骤的部分或全部。
以上已经描述了本申请的各实施例,上述说明是示例性的,并非穷尽性的,并且也不限于所披露的各实施例。在不偏离所说明的各实施例的范围和精神的情况下,对于本技术领域的普通技术人员来说许多修改和变更都是显而易见的。本文中所用术语的选择,旨在最好地解释各实施例的原理、实际应用或对市场中的技术的改进,或者使本技术领域的其他普通技术人员能理解本文披露的各实施例。
Claims (21)
- 一种相对结合自由能贡献的确定方法,其特征在于,所述方法包括:基于多个耦合变量λ将模拟的分子变化过程分割为多个状态窗口,其中,所述耦合变量λ∈[0,1],每个所述状态窗口包括M个帧,M为正整数;通过并行处理单元处理所述帧,以确定每个所述帧的状态能量的导数;确定所述状态窗口中各个状态能量的导数的平均值,作为所述状态窗口的窗口能量的导数,所述窗口能量的导数包括被测单元与目标分子之间的能量项的导数;以及对每个所述窗口能量中所述被测单元与目标分子之间的能量项的导数进行积分,确定所述被测单元对相对结合自由能的贡献值,其中,所述相对结合自由能是两个所述目标分子与受体分子结合的相对结合自由能,所述被测单元为所述受体分子的一部分。
- 根据权利要求1所述的方法,其特征在于,所述通过并行处理单元处理所述帧,以确定每个所述帧的状态能量的导数,包括:对于每个所述帧,确定所述帧的多个原子的原子坐标;基于所述原子坐标,确定所述帧的状态能量的导数。
- 根据权利要求2所述的方法,其特征在于,所述基于所述原子坐标,确定所述帧的状态能量的导数,包括:基于所述原子坐标,确定窗口能量;基于所述窗口能量,对每个窗口进行动力学模拟,得到各所述帧的状态能量;并行处理各所述帧,确定所述状态能量的导数。
- 根据权利要求1所述的方法,其特征在于,所述状态能量包括范德华作用和库仑作用,所述通过并行处理单元处理所述帧,以确定每个所述帧的状态能量包括:通过统一的分子力场非键参数和所述帧的坐标参数,确定范德华作用和库仑作用。
- 根据权利要求1-4任一项所述的方法,其特征在于,所述帧的数量M的取值范围为400≤M≤600。
- 根据权利要求1所述的方法,其特征在于,所述状态窗口的数量的取值范围为[10,15]。
- 根据权利要求1-4或6中任一项所述的方法,其特征在于,所述并行处理单元为图形处理器的计算单元。
- 根据权利要求7所述的方法,其特征在于,所述计算单元运行多个线程,每个线程用于处理一个帧。
- 一种药物设计方法,其特征在于,所述方法包括:根据权利要求1-8任一项所述的方法,确定被测单元对相对结合自由能的贡献值;基于所述被测单元对相对结合自由能的贡献值进行药物设计。
- 一种相对结合自由能贡献的确定装置,其特征在于,包括:窗口分割模块,用于基于多个耦合变量λ将模拟的分子变化过程分割为多个状态窗口,其中,所述耦合变量λ∈[0,1],每个所述状态窗口包括M个帧,M为正整数;状态处理模块,用于通过并行处理单元处理所述帧,以确定每个所述帧的状态能量的导数;窗口处理模块,用于确定所述状态窗口中各个状态能量的导数的平均值,作为所述状态窗口的窗口能量的导数,所述窗口能量的导数包括被测单元与目标分子之间的能量项的导数;以及贡献确定模块,用于对每个所述窗口能量中所述被测单元与目标分子之间的能量项的导数进行积分,确定所述被测单元对相对结合自由能的贡献值,其中,所述相对结合自由能是两个所述目标分子与受体分子结合的相对结合自由能,所述被测单元为所述受体分子的一部分。
- 根据权利要求10所述的装置,其特征在于,所述状态处理模块包括:原子坐标确定单元,用于对于每个所述帧,确定所述帧的多个原子的原子坐标;导数确定单元,用于基于所述原子坐标,确定所述帧的状态能量的导数。
- 根据权利要求11所述的装置,其特征在于,所述导数确定单元包括:窗口能量确定子单元,用于基于所述原子坐标,确定窗口能量;帧状态能量获得子单元,用于基于所述窗口能量,对每个窗口进行动力学模拟,得到各所述帧的状态能量;状态能量导数确定子单元,用于并行处理各所述帧,确定所述状态能量的导数。
- 根据权利要求10所述的装置,其特征在于,所述状态能量包括范德华作用和库仑作用;所述状态处理模块具体用于通过统一的分子力场非键参数和所述帧的坐标参数,确定范德华作用和库仑作用。
- 根据权利要求10-13任一项所述的装置,其特征在于,所述帧的数量M的取值范围为400≤M≤600。
- 根据权利要求10所述的装置,其特征在于,所述状态窗口的数量的取值范围为[10,15]。
- 根据权利要求10-13或15中任一项所述的装置,其特征在于,所述并行处理单元为图形处理器的计算单元。
- 根据权利要求16所述的装置,其特征在于,所述计算单元运行多个线程,每个线程用于处理一个帧。
- 一种药物设计装置,其特征在于,包括:贡献确定模块,用于根据权利要求10-17任一项所述的装置,确定被测单元对相对结合自由能的贡献值;药物设计模块,用于基于所述被测单元对相对结合自由能的贡献值进行药物设计。
- 一种电子设备,其特征在于,包括:处理器;以及存储器,其上存储有可执行代码,当所述可执行代码被所述处理器执行时,使所述处理器执行如权利要求1-9任一项所述的方法。
- 一种计算机可读存储介质,其特征在于,其上存储有可执行代码,当所述可执行代码被电子设备的处理器执行时,使所述处理器执行如权利要求1-9中任一项所述的方法。
- 一种计算机程序产品,其特征在于,包括可执行代码,所述可执行代码被处理器执行时实现根据权利要求1-9任一项所述的方法。
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2021/143389 WO2023123288A1 (zh) | 2021-12-30 | 2021-12-30 | 相对结合自由能贡献的确定方法、装置及存储介质 |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2021/143389 WO2023123288A1 (zh) | 2021-12-30 | 2021-12-30 | 相对结合自由能贡献的确定方法、装置及存储介质 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023123288A1 true WO2023123288A1 (zh) | 2023-07-06 |
Family
ID=86997097
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/143389 WO2023123288A1 (zh) | 2021-12-30 | 2021-12-30 | 相对结合自由能贡献的确定方法、装置及存储介质 |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2023123288A1 (zh) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103150463A (zh) * | 2011-12-06 | 2013-06-12 | 曙光信息产业股份有限公司 | 分子动力学计算的实现方法和装置 |
US20150178442A1 (en) * | 2013-12-23 | 2015-06-25 | Schrodinger, Inc. | Methods and systems for calculating free energy differences using a modified bond stretch potential |
CN109903818A (zh) * | 2019-02-21 | 2019-06-18 | 深圳晶泰科技有限公司 | 基于恒定pH分子动力学模拟的蛋白质质子化状态确定方法 |
CN110047559A (zh) * | 2019-03-06 | 2019-07-23 | 山东师范大学 | 蛋白质与药物结合自由能的计算方法、系统、设备及介质 |
CN110400598A (zh) * | 2019-07-03 | 2019-11-01 | 江苏理工学院 | 基于mm/pbsa模型的蛋白质-配体结合自由能计算方法 |
-
2021
- 2021-12-30 WO PCT/CN2021/143389 patent/WO2023123288A1/zh active Application Filing
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103150463A (zh) * | 2011-12-06 | 2013-06-12 | 曙光信息产业股份有限公司 | 分子动力学计算的实现方法和装置 |
US20150178442A1 (en) * | 2013-12-23 | 2015-06-25 | Schrodinger, Inc. | Methods and systems for calculating free energy differences using a modified bond stretch potential |
CN109903818A (zh) * | 2019-02-21 | 2019-06-18 | 深圳晶泰科技有限公司 | 基于恒定pH分子动力学模拟的蛋白质质子化状态确定方法 |
CN110047559A (zh) * | 2019-03-06 | 2019-07-23 | 山东师范大学 | 蛋白质与药物结合自由能的计算方法、系统、设备及介质 |
CN110400598A (zh) * | 2019-07-03 | 2019-11-01 | 江苏理工学院 | 基于mm/pbsa模型的蛋白质-配体结合自由能计算方法 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Roe et al. | Parallelization of CPPTRAJ enables large scale analysis of molecular dynamics trajectory data | |
Lenselink et al. | Beyond the hype: deep neural networks outperform established methods using a ChEMBL bioactivity benchmark set | |
Suzumura et al. | Towards large-scale graph stream processing platform | |
EP3743821A1 (en) | Wide key hash table for a graphics processing unit | |
TW201740289A (zh) | 地圖興趣點簡稱獲取方法和裝置 | |
Shao et al. | Large-scale ab initio simulations for periodic system | |
Yu | Understanding regional development mechanisms in Greater Beijing Area, China, 1995–2001, from a spatial–temporal perspective | |
WO2018137346A1 (zh) | 一种图数据处理的方法及装置 | |
Liu et al. | Parameterization for molecular Gaussian surface and a comparison study of surface mesh generation | |
Sriramoju | A review on processing big data | |
CN104063501A (zh) | 基于hdfs的副本平衡方法 | |
CN114360663B (zh) | 相对结合自由能贡献的确定方法、装置及存储介质 | |
CN111488479A (zh) | 超图构建方法、装置以及计算机系统和介质 | |
WO2023123288A1 (zh) | 相对结合自由能贡献的确定方法、装置及存储介质 | |
Assiroj et al. | High Performance Computing (HPC) Implementation: A Survey | |
Ashton et al. | Quantifying the effects of neglecting many-body interactions in coarse-grained models of complex fluids | |
Sun et al. | Fast computation for the forest matrix of an evolving graph | |
CN117393075A (zh) | 一种模型训练方法和基于分子能量信息的任务执行方法 | |
CN116883793A (zh) | 面向遥感分类结果精度评价的多粒度空间抽样方法 | |
CN114139711B (zh) | 分布式推断方法、数据处理方法、装置、终端及计算设备 | |
Wu et al. | A novel fractional implicit polynomial approach for stable representation of complex shapes | |
CN112578458B (zh) | 叠前弹性阻抗随机反演方法、装置、存储介质及处理器 | |
CN114944204A (zh) | 用于管理分子预测的方法、装置、设备和介质 | |
Xie et al. | Parallel computing for the radix-2 fast fourier transform | |
CN108073596A (zh) | 一种olap数据库的数据删除方法及装置 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21969645 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 21969645 Country of ref document: EP Kind code of ref document: A1 |