WO2021103402A1 - Molecular force field fitting method - Google Patents

Molecular force field fitting method

Info

Publication number
WO2021103402A1
Authority
WO
WIPO (PCT)
Prior art keywords
force field
small molecule
parameters
bond
molecular
Prior art date
Application number
PCT/CN2020/085815
Other languages
English (en)
French (fr)
Inventor
周云飞
马健
温书豪
赖力鹏
Original Assignee
深圳晶泰科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳晶泰科技有限公司
Priority to PCT/CN2020/085815
Publication of WO2021103402A1

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C: COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C10/00: Computational theoretical chemistry, i.e. ICT specially adapted for theoretical aspects of quantum chemistry, molecular mechanics, molecular dynamics or the like

Definitions

  • The invention relates to the field of molecular force field analysis, and in particular to a molecular force field fitting method.
  • In terms of coverage, molecular force fields fall into two classes: general force fields and proprietary force fields.
  • The parameters of a general force field are fitted to quantum-chemical data for a large number of small molecule fragments or atoms, together with some experimental data.
  • Different general force fields are defined differently (in functional form and atom types) and are trained somewhat differently, but they all share transferability, extensibility, and a certain level of accuracy. A proprietary force field, by contrast, targets one specific molecule: starting from general force field parameters, each parameter is further fitted and corrected against a training set under some strategy, which improves the accuracy with which that molecule, or that class of molecules, is described.
  • In practice, general force fields show strong transferability, but for some organic molecules with many degrees of freedom their accuracy is low (taking quantum-chemical data as the reference, the correlation with that data is poor). The reason is that, to balance computational cost against generality, general force field parameters are usually fitted on the smallest molecular fragments, which cover at most 1 to 2 flexible angles. This ignores the coupling between adjacent chemical groups (which can be understood as different dihedral angles) inside the actual molecule, so the potential energy surface of the molecule is described inaccurately. Such a force field is likely to cause large structural and energy deviations in simulation.
  • Existing proprietary force fields are fitted to quantum-chemical data for the complete molecule, which makes up for the shortcomings of the general force field and improves the accuracy of force field calculations.
  • However, a proprietary force field requires a large amount of quantum-chemical data to be prepared in advance as the training set.
  • When the molecule has many degrees of freedom, the conformational search space grows and the cost of the quantum-chemical calculations grows with it, which increases the difficulty and expense of force field fitting. A choice between accuracy and cost is therefore required.
  • A molecular force field fitting method, including:
  • Segmentation: input the 3D conformation of the macromolecule, specify the segmentation positions, segment the macromolecule into small molecule fragments, and save the atomic correspondence between the small molecule fragments and the input macromolecule;
  • Proprietary force field fitting: perform proprietary force field fitting on the small molecule fragments and save the fitted force field parameters;
  • Splicing: according to the atomic correspondence between the small molecule fragments and the input macromolecule, splice the fitted small molecule force fields into a macromolecular force field.
  • The 3D conformation of the macromolecule includes: a unique serial number and three-dimensional coordinates for each atom.
  • The segmentation sites are separated by at least three atoms; the group in between is a shared group, each small molecule fragment cut out includes the shared group, and each severed chemical bond is capped with a hydrogen atom at the break position.
  • The coupling between adjacent groups within the macromolecule is retained during segmentation.
  • The coupling is judged by whether, during a flexible dihedral scan, the relative positions of adjacent groups of the macromolecule significantly influence the scan results.
  • In a flexible dihedral scan, the dihedral angle is rotated through 360 degrees about its central axis, and the energy of the molecule is calculated at each position.
  • After segmentation, the 3D structures of the small molecule fragments and the atomic correspondence between the fragments and the atoms of the input macromolecule are returned.
  • Proprietary force field fitting: fit the charge parameters of the small molecule fragments and save the fitted charge parameter file; at the same time, fit the bond, angle, dihedral, and non-bonded interaction parameters of each small molecule fragment and save them to a parameter file; the energy of the molecule is calculated from the bond parameters, angle parameters, dihedral parameters, and non-bonded interaction terms.
  • Fitting the charge parameters means calculating the charge carried by each atom.
  • The non-bonded interaction terms include the electrostatic interaction and the van der Waals interaction. In proprietary force field fitting, the energy of the molecule is calculated from the bond parameters, angle parameters, dihedral parameters, electrostatic interaction term, and van der Waals interaction term; the calculation formula is as follows:
  • $k_b$ is the force constant in the bond-stretching term
  • $r$ is the bond length
  • $r_0$ is the bond length at the equilibrium position
  • In the bond-angle bending term, $k_\theta$ is the force constant, $\theta$ is the bond angle, and $\theta_0$ is the bond angle at the equilibrium position;
  • In the dihedral term, $V_n$ is the maximum of the potential energy during rotation about the dihedral, $n$ sets the periodicity, $\varphi$ is the value of the variable dihedral angle, and $\gamma$ is the phase angle of the dihedral;
  • $A_{ij}$, $B_{ij}$, $R_{ij}$ are the van der Waals parameters
  • $R_{ij}$ is the distance between the two atoms
  • $\varepsilon_{ij}$ is the depth of the potential well between the two atoms
  • $\sigma_{ij}$ is the distance between the two atoms at which the potential energy is zero.
  • In the electrostatic interaction expression, $\varepsilon$ is the effective dielectric constant, $q_i$ and $q_j$ are the charges of atoms $i$ and $j$, respectively, and $R_{ij}$ is the distance between the two atoms;
  • The parameter file includes: atom type definitions and molecular topology.
  • Proprietary force field fitting also includes: scanning the small molecule fragments to obtain fragment structures as a training set, and calculating the energy of the fragments with the energy function; the parameter values are solved iteratively, and the force field parameters are accepted when the calculated energies correlate well with the reference energies.
  • In the splicing step, the initial parameters of the input macromolecule are obtained first, and the force field parameters of the input macromolecule are then reassembled according to the atomic correspondence between the small molecule fragments and the input macromolecule.
  • The initial parameters include the molecular topology.
  • The above molecular force field fitting method cuts the macromolecule into smaller molecular fragments.
  • The segmentation principle is to reduce the degrees of freedom (molecular complexity) of the molecule as far as possible while retaining the coupling between adjacent groups within the molecule; the force field parameters of the small molecules are then fitted separately, which makes up for the shortcomings of the general force field itself.
  • Building the training set from small molecule fragments increases the number of molecules, but each has few degrees of freedom, so the search space of a systematic conformational search shrinks greatly, which saves time.
  • The reduction in molecular weight also reduces the quantum-chemical computation and the difficulty of force field parameter fitting. The method both improves the accuracy of the force field parameters and reduces the difficulty and cost of refitting them.
  • FIG. 1 is a flowchart of a molecular force field fitting method according to an embodiment of the present invention;
  • FIG. 2 is a macromolecule to be fitted according to an embodiment of the present invention;
  • FIG. 3 is a 3D diagram of the macromolecule to be fitted in FIG. 2;
  • FIG. 4 is a schematic diagram of the molecular fragments of the macromolecule to be fitted in FIG. 2 after segmentation.
  • The molecular force field fitting method includes:
  • Step S101, segmentation: input the 3D conformation of the macromolecule, specify the segmentation positions, segment the macromolecule into small molecule fragments, and save the atomic correspondence between the small molecule fragments and the input macromolecule;
  • Step S103, proprietary force field fitting: perform proprietary force field fitting on the small molecule fragments and save the fitted force field parameters;
  • Step S105, splicing: according to the atomic correspondence between the small molecule fragments and the input macromolecule, splice the fitted small molecule force fields into a macromolecular force field.
  • The form of the 3D conformation in this embodiment is not restricted.
  • The 3D conformation of the macromolecule in this embodiment includes: a unique serial number and three-dimensional coordinates for each atom.
  • The segmentation sites in this embodiment are separated by at least three atoms; the group in between is a shared group, each small molecule fragment cut out includes the shared group, and each severed chemical bond is capped with a hydrogen atom at the break position.
  • In this embodiment, the coupling between adjacent groups within the macromolecule is retained during segmentation.
  • The coupling in this embodiment is judged by whether, during a flexible dihedral scan, the relative positions of adjacent groups of the macromolecule significantly influence the scan results.
  • The flexible dihedral scan in this embodiment is: the dihedral angle is rotated through 360 degrees about its central axis, and the energy of the molecule is calculated at each position.
  • If, during the dihedral scan, two adjacent groups show steric hindrance or form an intramolecular hydrogen bond, they are judged to influence each other and to be coupled.
  • In this embodiment, a segmentation site cannot be placed between two such groups; the two groups are kept in the same fragment.
  • After segmentation, this embodiment returns the 3D structures of the small molecule fragments and the atomic correspondence between the fragments and the atoms of the input macromolecule.
  • The proprietary force field fitting step of this embodiment: fit the charge parameters of the small molecule fragments and save the fitted charge parameter file; at the same time, fit the bond, angle, dihedral, and non-bonded interaction parameters of each small molecule fragment and save them to a parameter file; the energy of the molecule is calculated from the bond parameters, angle parameters, dihedral parameters, and non-bonded interaction terms.
  • Fitting the charge parameters means calculating the charge carried by each atom.
  • The non-bonded interaction terms include the electrostatic interaction and the van der Waals interaction.
  • The bond parameters include the bond-stretching term and the bond-angle bending term. In proprietary force field fitting, the energy of the molecule is calculated from the bond-stretching term, bond-angle bending term, angle parameters, dihedral parameters, electrostatic interaction term, and van der Waals interaction term; the calculation formula is as follows:
  • $k_b$ is the force constant in the bond-stretching term
  • $r$ is the bond length
  • $r_0$ is the bond length at the equilibrium position
  • In the bond-angle bending term, $k_\theta$ is the force constant, $\theta$ is the bond angle, and $\theta_0$ is the bond angle at the equilibrium position;
  • In the dihedral term, $V_n$ is the maximum of the potential energy during rotation about the dihedral, $n$ sets the periodicity, $\varphi$ is the value of the variable dihedral angle, and $\gamma$ is the phase angle of the dihedral;
  • $A_{ij}$, $B_{ij}$, $R_{ij}$ are the van der Waals parameters
  • $R_{ij}$ is the distance between the two atoms
  • $\varepsilon_{ij}$ is the depth of the potential well between the two atoms
  • $\sigma_{ij}$ is the distance between the two atoms at which the potential energy is zero.
  • In the electrostatic interaction expression, $\varepsilon$ is the effective dielectric constant, $q_i$ and $q_j$ are the charges of atoms $i$ and $j$, respectively, and $R_{ij}$ is the distance between the two atoms.
  • The parameter file of this embodiment includes: atom type definitions and molecular topology.
  • Proprietary force field fitting also includes: scanning the small molecule fragments to obtain fragment structures as a training set, and calculating the energy of the fragments with the energy function; the parameter values are solved iteratively, and the force field parameters are accepted when the calculated energies correlate well with the reference energies.
  • Fitting the charge parameters in this embodiment means calculating the charge carried by each atom.
  • The non-bonded interaction terms include: electrostatic interaction and van der Waals interaction.
  • Proprietary force field fitting: calculate the energy of the molecule from the bond parameters, angle parameters, dihedral parameters, electrostatic interaction term, and van der Waals interaction term.
  • The parameter file includes: atom type definitions, molecular topology, and parameters.
  • Proprietary force field fitting in this embodiment: scan the small molecule fragments to obtain fragment structures as a training set and calculate the energy of the fragments according to formula (1); the parameter values are solved iteratively, and the force field parameters are accepted when the calculated energies correlate well with the reference energies.
  • In the 3D structure of the macromolecule to be fitted, each atom corresponds to a unique serial number and three-dimensional coordinates.
  • Segmentation: select the sites to be cut, i.e. the chemical bonds to be broken, as required.
  • The sites must be separated by at least three atoms. The group in between is called a shared group, and every fragment cut out contains this shared group (to preserve the completeness of the dihedral parameters of the original molecular force field); each severed chemical bond is automatically capped with a hydrogen atom at the break position to keep the small molecule chemically complete.
  • The segmentation sites are the bonds 19-20, 22-28, 28-29, and 32-39 (as shown in Figure 3).
  • Define the segmentation sites as slicing_groups = [((22,28),(19,20)), ((19,20),(32,39)), ((28,29),(32,39))] and execute; the program returns the 3D structures of the three small molecules after segmentation (mol1, mol2, mol3) (as shown in Figure 4), together with the atomic correspondence between the small molecule fragments and the original molecule (the input macromolecule to be fitted).
  • Proprietary force field fitting: fit the charge parameters of the input macromolecule to be fitted and save the fitted charge parameter file; at the same time, fit the bond, angle, dihedral, and van der Waals parameters of the three small molecule fragments separately and save their parameter files (e.g. mol1_ff, mol2_ff, mol3_ff).
  • The initial parameters can be the freely available Gaff2 force field parameters.
  • Proprietary force field fitting requires: 1. an objective function, i.e. the functional form to be solved, such as formula (1); 2. a training set of molecular structures with QM energies: the cut small molecule fragments are first scanned to obtain a batch of structures as the training set (the reference for the force field; good parameters are obtained only when the energies computed with the fitted function correlate well with the reference QM energies); 3. an algorithm to solve the function: the quasi-Newton BFGS algorithm is used, and the parameter values are solved through repeated iteration. The molecular force field parameters of every fragment are fitted this way.
  • The initial parameters all have the same form and can be taken from Gaff2; charge fitting means calculating the charge carried by each atom.
  • Automatic segmentation: the algorithm segments the macromolecule at the manually specified sites and automatically adds hydrogen atoms at the broken bonds to keep the molecule complete, while saving the atom mapping between the fragment molecules and the macromolecule.
  • Splicing of fragment force field parameters: the automatic splicing tool takes the refitted force field parameters of the small molecule fragments and rebuilds the force field parameter file of the parent molecule according to the atom correspondence saved during segmentation.
  • The present invention cuts the macromolecule into smaller molecular fragments.
  • The segmentation principle is to reduce the degrees of freedom (molecular complexity) of the molecule as far as possible while retaining the coupling between adjacent groups within the molecule; the force field parameters of the small molecules are then fitted separately, which makes up for the shortcomings of the general force field itself.
  • Building the training set from small molecule fragments increases the number of molecules, but each has few degrees of freedom, so the search space of a systematic conformational search shrinks greatly, which saves time.
  • The reduction in molecular weight also reduces the quantum-chemical computation and the difficulty of force field parameter fitting. The method both improves the accuracy of the force field parameters and reduces the difficulty and cost of refitting them.
  • The molecular force field fitting method of the present invention has been tested on multiple systems. By comparison, the fitted force field is more accurate than the initial general force field parameters (gaff) and comparable in accuracy to a conventionally fitted proprietary force field, while requiring significantly less computation than fitting the whole macromolecule in the same way.
  • The molecular force field fitting method of the present invention differs from the parameterization of a general force field: it accounts for the interactions between groups within the molecule, so it describes the potential energy surface of the molecule more accurately.
  • Force field parameter splicing: the splicing tool uses the correspondence saved during segmentation to automatically splice the fitted fragment parameters into a complete macromolecular force field parameter file, which reduces manual work.
  • Those skilled in the art will understand that embodiments of this application may be provided as a method, a system, or a computer program product. Therefore, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device that implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on it to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

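A molecular force field fitting method, comprising: segmentation: inputting the 3D conformation of a macromolecule, specifying the segmentation positions, segmenting the macromolecule into small molecule fragments, and saving the atomic correspondence between the small molecule fragments and the input macromolecule (S101); proprietary force field fitting: performing proprietary force field fitting on the small molecule fragments and saving the fitted force field parameters (S103); splicing: splicing the fitted small molecule force fields into a macromolecular force field according to the atomic correspondence between the small molecule fragments and the input macromolecule (S105). The method cuts the macromolecule into smaller molecular fragments; the segmentation principle is to reduce the molecular degrees of freedom (molecular complexity) as far as possible while retaining the coupling between adjacent groups within the molecule. The force field parameters of the small molecules are then fitted separately, which makes up for the shortcomings of the general force field itself, and the reduced molecular weight correspondingly reduces the quantum-chemical computation and the difficulty of force field parameter fitting.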

Description

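Molecular force field fitting method
Technical field
The invention relates to the field of molecular force field analysis, and in particular to a molecular force field fitting method.
Background
In terms of coverage, molecular force fields fall into two classes: general force fields and proprietary force fields. The parameters of a general force field are fitted to quantum-chemical data for a large number of small molecule fragments or atoms, together with some experimental data. Different general force fields are defined differently (in functional form and atom types) and are trained somewhat differently, but they all share transferability, extensibility, and a certain level of accuracy. A proprietary force field, by contrast, targets one specific molecule: starting from general force field parameters, each parameter of the force field is further fitted and corrected against a training set under some strategy, which improves the accuracy with which that molecule, or that class of molecules, is described.
In practical applications, general force fields show strong transferability. Their weakness is that for some organic molecules with many degrees of freedom their accuracy is low (taking quantum-chemical data as the reference, the correlation with that data is poor). The reason is that, to balance computational cost against generality, general force field parameters are usually fitted on the smallest molecular fragments, covering at most 1 to 2 flexible angles, without considering the coupling between adjacent chemical groups (which can be understood as different dihedral angles) inside the actual molecule. The potential energy surface of the molecule is therefore described inaccurately, and such a force field is likely to cause large structural and energy deviations in simulation.
Existing proprietary force fields are fitted to quantum-chemical data for the complete molecule, which makes up for the shortcomings of the general force field and improves the accuracy of force field calculations, but a large amount of quantum-chemical data must be prepared in advance as the training set. When the molecule has many degrees of freedom, the conformational search space grows and the cost of the quantum-chemical calculations grows with it, which increases the difficulty and expense of force field fitting. A choice between accuracy and cost is therefore required.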
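Summary of the invention
In view of this, it is necessary to provide a molecular force field fitting method that balances accuracy and cost.
A molecular force field fitting method, comprising:
Segmentation: input the 3D conformation of the macromolecule, specify the segmentation positions, segment the macromolecule into small molecule fragments, and save the atomic correspondence between the small molecule fragments and the input macromolecule;
Proprietary force field fitting: perform proprietary force field fitting on the small molecule fragments and save the fitted force field parameters;
Splicing: according to the atomic correspondence between the small molecule fragments and the input macromolecule, splice the fitted small molecule force fields into a macromolecular force field.
In a preferred embodiment, the 3D conformation of the macromolecule includes: a unique serial number and three-dimensional coordinates for each atom.
In a preferred embodiment, the segmentation sites are separated by at least three atoms; the group in between is a shared group, each small molecule fragment cut out includes the shared group, and each severed chemical bond is capped with a hydrogen atom at the break position.
In a preferred embodiment, the coupling between adjacent groups within the macromolecule is retained during segmentation.
In a preferred embodiment, the coupling is judged by whether, during a flexible dihedral scan, the relative positions of adjacent groups of the macromolecule significantly influence the scan results.
In a preferred embodiment, in a flexible dihedral scan the dihedral angle is rotated through 360 degrees about its central axis and the energy of the molecule is calculated at each position; if, during the scan, two adjacent groups show steric hindrance or form an intramolecular hydrogen bond, they are judged to influence each other significantly and to be coupled; a segmentation site cannot be placed between two such groups, and the two groups are kept in the same fragment.
In a preferred embodiment, after segmentation the 3D structures of the small molecule fragments and the atomic correspondence between the fragments and the atoms of the input macromolecule are returned.
In a preferred embodiment, the proprietary force field fitting: fit the charge parameters of the small molecule fragments and save the fitted charge parameter file; at the same time, fit the bond, angle, dihedral, and non-bonded interaction parameters of each small molecule fragment and save them to a parameter file; the energy of the molecule is calculated from the bond parameters, angle parameters, dihedral parameters, and non-bonded interaction terms.
In a preferred embodiment, fitting the charge parameters means calculating the charge carried by each atom; the non-bonded interaction terms include the electrostatic interaction and the van der Waals interaction; in the proprietary force field fitting, the energy of the molecule is calculated from the bond parameters, angle parameters, dihedral parameters, electrostatic interaction term, and van der Waals interaction term; the calculation formula is as follows: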
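Figure PCTCN2020085815-appb-000001
where, in the bond-stretching term, $k_b$ is the force constant, $r$ is the bond length, and $r_0$ is the bond length at the equilibrium position;
in the bond-angle bending term, $k_\theta$ is the force constant, $\theta$ is the bond angle, and $\theta_0$ is the bond angle at the equilibrium position;
in the dihedral term, $V_n$ is the maximum of the potential energy during rotation about the dihedral, $n$ sets the periodicity, $\varphi$ is the value of the variable dihedral angle, and $\gamma$ is the phase, i.e. the phase angle of the dihedral;
$A_{ij}$, $B_{ij}$, $R_{ij}$ are the van der Waals parameters.
The van der Waals interaction can be expressed by the standard Lennard-Jones potential:
Figure PCTCN2020085815-appb-000002
In the force field expression: $A_{ij} = 4\varepsilon_{ij}\sigma_{ij}^{12}$, $B_{ij} = 4\varepsilon_{ij}\sigma_{ij}^{6}$
$R_{ij}$ is the distance between the two atoms, $\varepsilon_{ij}$ is the depth of the potential well between the two atoms, and $\sigma_{ij}$ is the distance between the two atoms at which the potential energy is zero.
Electrostatic interaction expression:
Figure PCTCN2020085815-appb-000003
where $\varepsilon$ is the effective dielectric constant, $q_i$ and $q_j$ are the charges of atoms $i$ and $j$, respectively, and $R_{ij}$ is the distance between the two atoms;
The parameter file includes: atom type definitions and molecular topology.
Proprietary force field fitting also includes: scanning the small molecule fragments to obtain fragment structures as a training set, and calculating the energy of the fragments with the energy function; the parameter values are solved iteratively, and the force field parameters are accepted when the calculated energies correlate well with the reference energies.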
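In a preferred embodiment, in the splicing step the initial parameters of the input macromolecule are obtained first, and the force field parameters of the input macromolecule are then reassembled according to the atomic correspondence between the small molecule fragments and the input macromolecule; the initial parameters include the molecular topology.
The above molecular force field fitting method cuts the macromolecule into smaller molecular fragments. The segmentation principle is to reduce the degrees of freedom (molecular complexity) of the molecule as far as possible while retaining the coupling between adjacent groups within the molecule; the force field parameters of the small molecules are then fitted separately, which makes up for the shortcomings of the general force field itself. Building the training set for the proprietary force field from small molecule fragments increases the number of molecules, but each has few degrees of freedom, so the search space of a systematic conformational search shrinks greatly, which saves time. The reduction in molecular weight also reduces the quantum-chemical computation and the difficulty of force field parameter fitting. The method both improves the accuracy of the force field parameters and reduces the difficulty and cost of refitting them.
Brief description of the drawings
Figure 1 is a flowchart of a molecular force field fitting method according to an embodiment of the invention;
Figure 2 is a macromolecule to be fitted according to an embodiment of the invention;
Figure 3 is a 3D diagram of the macromolecule to be fitted in Figure 2;
Figure 4 is a schematic diagram of the molecular fragments of the macromolecule to be fitted in Figure 2 after segmentation.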
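Detailed description
As shown in Figure 1, a molecular force field fitting method according to an embodiment of the invention includes:
Step S101, segmentation: input the 3D conformation of the macromolecule, specify the segmentation positions, segment the macromolecule into small molecule fragments, and save the atomic correspondence between the small molecule fragments and the input macromolecule;
Step S103, proprietary force field fitting: perform proprietary force field fitting on the small molecule fragments and save the fitted force field parameters;
Step S105, splicing: according to the atomic correspondence between the small molecule fragments and the input macromolecule, splice the fitted small molecule force fields into a macromolecular force field.
The form of the 3D conformation in this embodiment is not restricted.
Further, the 3D conformation of the macromolecule in this embodiment includes: a unique serial number and three-dimensional coordinates for each atom.
Further, the segmentation sites in this embodiment are separated by at least three atoms; the group in between is a shared group, each small molecule fragment cut out includes the shared group, and each severed chemical bond is capped with a hydrogen atom at the break position.
Further, in this embodiment the coupling between adjacent groups within the macromolecule is retained during segmentation.
Further, the coupling in this embodiment is judged by whether, during a flexible dihedral scan, the relative positions of adjacent groups of the macromolecule significantly influence the scan results.
Further, the flexible dihedral scan in this embodiment is: the dihedral angle is rotated through 360 degrees about its central axis, and the energy of the molecule is calculated at each position.
Further, if, during the dihedral scan, two adjacent groups show steric hindrance or form an intramolecular hydrogen bond, they are judged to influence each other and to be coupled.
Further, in this embodiment a segmentation site cannot be placed between two such groups; the two groups are kept in the same fragment.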
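The scan-and-compare test for coupling described above can be illustrated with a toy model. This is a minimal sketch, not the patent's program: the cosine torsion model, the function names, and the 0.5 energy-unit threshold are all assumptions made for illustration, and in the actual method each scan point would be a quantum-chemical energy.

```python
import math

def torsion_energy(phi_deg, v_n, n, gamma_deg):
    """Toy cosine torsion term: (V_n / 2) * (1 + cos(n*phi - gamma))."""
    phi = math.radians(phi_deg)
    gamma = math.radians(gamma_deg)
    return 0.5 * v_n * (1.0 + math.cos(n * phi - gamma))

def scan(energy_fn, step_deg=15):
    """Rotate the dihedral through a full 360 degrees, recording the energy at each step."""
    return [energy_fn(phi) for phi in range(0, 360, step_deg)]

def is_coupled(scan_a, scan_b, threshold=0.5):
    """Compare two scans of the same dihedral taken with the neighbouring group
    in two different positions; a large change in the relative profile is
    treated as coupling. The threshold is a hypothetical choice."""
    rel_a = [e - min(scan_a) for e in scan_a]
    rel_b = [e - min(scan_b) for e in scan_b]
    return max(abs(a - b) for a, b in zip(rel_a, rel_b)) > threshold

# Neighbour in position 1: plain threefold profile.
base = scan(lambda p: torsion_energy(p, 2.0, 3, 0.0))
# Neighbour in position 2: an extra onefold term mimics a steric clash near 0 degrees.
bumped = scan(lambda p: torsion_energy(p, 2.0, 3, 0.0) + torsion_energy(p, 3.0, 1, 0.0))
```

Under these assumptions, `is_coupled(base, bumped)` flags the pair of groups as coupled, so a cut site would not be placed between them.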
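Further, after segmentation this embodiment returns the 3D structures of the small molecule fragments and the atomic correspondence between the fragments and the atoms of the input macromolecule.
Further, the proprietary force field fitting step of this embodiment: fit the charge parameters of the small molecule fragments and save the fitted charge parameter file; at the same time, fit the bond, angle, dihedral, and non-bonded interaction parameters of each small molecule fragment and save them to a parameter file; the energy of the molecule is calculated from the bond parameters, angle parameters, dihedral parameters, and non-bonded interaction terms.
Fitting the charge parameters means calculating the charge carried by each atom; the non-bonded interaction terms include the electrostatic interaction and the van der Waals interaction; the bond parameters include the bond-stretching term and the bond-angle bending term; in the proprietary force field fitting, the energy of the molecule is calculated from the bond-stretching term, bond-angle bending term, angle parameters, dihedral parameters, electrostatic interaction term, and van der Waals interaction term; the calculation formula is as follows:
Figure PCTCN2020085815-appb-000004
where, in the bond-stretching term, $k_b$ is the force constant, $r$ is the bond length, and $r_0$ is the bond length at the equilibrium position;
in the bond-angle bending term, $k_\theta$ is the force constant, $\theta$ is the bond angle, and $\theta_0$ is the bond angle at the equilibrium position;
in the dihedral term, $V_n$ is the maximum of the potential energy during rotation about the dihedral, $n$ sets the periodicity, $\varphi$ is the value of the variable dihedral angle, and $\gamma$ is the phase, i.e. the phase angle of the dihedral;
$A_{ij}$, $B_{ij}$, $R_{ij}$ are the van der Waals parameters.
The van der Waals interaction can be expressed by the standard Lennard-Jones potential:
Figure PCTCN2020085815-appb-000005
In the force field expression: $A_{ij} = 4\varepsilon_{ij}\sigma_{ij}^{12}$, $B_{ij} = 4\varepsilon_{ij}\sigma_{ij}^{6}$
$R_{ij}$ is the distance between the two atoms, $\varepsilon_{ij}$ is the depth of the potential well between the two atoms, and $\sigma_{ij}$ is the distance between the two atoms at which the potential energy is zero.
Electrostatic interaction expression:
Figure PCTCN2020085815-appb-000006
where $\varepsilon$ is the effective dielectric constant, $q_i$ and $q_j$ are the charges of atoms $i$ and $j$, respectively, and $R_{ij}$ is the distance between the two atoms.
Here: $k_b$, $r_0$ are the bond parameters; $k_\theta$, $\theta_0$ are the angle parameters; $V_n$, $\gamma$ are the dihedral parameters; $A_{ij}$, $B_{ij}$, $R_{ij}$ are the van der Waals parameters; $q_i$, $q_j$ are the electrostatic parameters.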
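The three formula images referenced above did not survive extraction. As a reading aid only, the symbol definitions are consistent with the standard AMBER-type functional form used by GAFF; assuming that form (prefactor conventions such as a factor of 1/2 on the harmonic terms, and any unit-conversion constant in the electrostatic term, are not recoverable from the text), the expressions would read:

```latex
E = \sum_{\text{bonds}} k_b \,(r - r_0)^2
  + \sum_{\text{angles}} k_\theta \,(\theta - \theta_0)^2
  + \sum_{\text{dihedrals}} \frac{V_n}{2}\,\bigl[1 + \cos(n\varphi - \gamma)\bigr]
  + \sum_{i<j} \left[\frac{A_{ij}}{R_{ij}^{12}} - \frac{B_{ij}}{R_{ij}^{6}}\right]
  + \sum_{i<j} \frac{q_i q_j}{\varepsilon R_{ij}}

E_{\text{LJ}}(R_{ij}) = 4\varepsilon_{ij}\left[\left(\frac{\sigma_{ij}}{R_{ij}}\right)^{12} - \left(\frac{\sigma_{ij}}{R_{ij}}\right)^{6}\right]
  = \frac{A_{ij}}{R_{ij}^{12}} - \frac{B_{ij}}{R_{ij}^{6}},
  \qquad A_{ij} = 4\varepsilon_{ij}\sigma_{ij}^{12},\quad B_{ij} = 4\varepsilon_{ij}\sigma_{ij}^{6}

E_{\text{elec}}(R_{ij}) = \frac{q_i q_j}{\varepsilon R_{ij}}
```

The second line shows why the stated relations for $A_{ij}$ and $B_{ij}$ hold: expanding the Lennard-Jones brackets gives exactly the 12-6 form that appears in the total energy.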
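Further, the parameter file of this embodiment includes: atom type definitions and molecular topology.
Proprietary force field fitting also includes: scanning the small molecule fragments to obtain fragment structures as a training set, and calculating the energy of the fragments with the energy function; the parameter values are solved iteratively, and the force field parameters are accepted when the calculated energies correlate well with the reference energies.
Further, fitting the charge parameters in this embodiment means calculating the charge carried by each atom.
The non-bonded interaction terms include: electrostatic interaction and van der Waals interaction.
Proprietary force field fitting: calculate the energy of the molecule from the bond parameters, angle parameters, dihedral parameters, electrostatic interaction term, and van der Waals interaction term.
The parameter file includes: atom type definitions, molecular topology, and parameters.
Further, proprietary force field fitting in this embodiment: scan the small molecule fragments to obtain fragment structures as a training set and calculate the energy of the fragments according to formula (1); the parameter values are solved iteratively, and the force field parameters are accepted when the calculated energies correlate well with the reference energies.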
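The energy evaluation in formula (1) can be sketched term by term. This is a minimal illustration assuming the standard AMBER-type functional form (the formula image itself is not available in this text); the function names are hypothetical, angles are in radians, and unit-conversion constants such as the Coulomb prefactor are deliberately omitted.

```python
import math

def bond_term(k_b, r, r0):
    """Harmonic bond stretching: k_b * (r - r0)^2."""
    return k_b * (r - r0) ** 2

def angle_term(k_theta, theta, theta0):
    """Harmonic angle bending: k_theta * (theta - theta0)^2."""
    return k_theta * (theta - theta0) ** 2

def dihedral_term(v_n, n, phi, gamma):
    """Cosine torsion: (V_n / 2) * (1 + cos(n*phi - gamma))."""
    return 0.5 * v_n * (1.0 + math.cos(n * phi - gamma))

def vdw_term(epsilon, sigma, r):
    """12-6 Lennard-Jones written with A = 4*eps*sigma^12 and B = 4*eps*sigma^6."""
    a = 4.0 * epsilon * sigma ** 12
    b = 4.0 * epsilon * sigma ** 6
    return a / r ** 12 - b / r ** 6

def coulomb_term(q_i, q_j, r, dielectric=1.0):
    """Electrostatics: q_i * q_j / (dielectric * r), without a unit prefactor."""
    return q_i * q_j / (dielectric * r)

def total_energy(bonds, angles, dihedrals, vdw_pairs, charge_pairs):
    """Each argument is a list of argument tuples for the matching term function."""
    return (sum(bond_term(*b) for b in bonds)
            + sum(angle_term(*a) for a in angles)
            + sum(dihedral_term(*d) for d in dihedrals)
            + sum(vdw_term(*v) for v in vdw_pairs)
            + sum(coulomb_term(*c) for c in charge_pairs))
```

Two sanity checks follow from the definitions in the text: the van der Waals term vanishes at a separation equal to sigma, and it reaches its minimum of minus epsilon at a separation of 2^(1/6) times sigma.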
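As shown in Figures 2 to 4, in a specific embodiment of the invention, in the 3D structure (Figure 3) of the macromolecule to be fitted (Figure 2), each atom corresponds to a unique serial number and three-dimensional coordinates.
Segmentation: select the sites to be cut, i.e. the chemical bonds to be broken, as required. The sites must be separated by at least three atoms; the group in between is called a shared group, and every fragment cut out contains this shared group (to preserve the completeness of the dihedral parameters of the original molecular force field). Each severed chemical bond is automatically capped with a hydrogen atom at the break position to keep the small molecule chemically complete. For the molecule above, if it is cut into three small molecules Mol1, Mol2, and Mol3 (as shown in Figure 4), the segmentation sites are the bonds 19-20, 22-28, 28-29, and 32-39 (as shown in Figure 3). Define the segmentation sites as slicing_groups=[((22,28),(19,20)),((19,20),(32,39)),((28,29),(32,39))] and execute; the program returns the 3D structures of the three small molecules after segmentation (mol1, mol2, mol3) (as shown in Figure 4), together with the atomic correspondence between the small molecule fragments and the original molecule (the input macromolecule to be fitted).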
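The bookkeeping in this step (cut, cap, remember the atom map) can be sketched on a plain bond graph. This is an illustrative sketch only: `cut_fragment` and its seed-atom argument are hypothetical, since the text does not say how the program decides which side of each cut bond belongs to a fragment, and the placement of the capping hydrogen's coordinates is left out.

```python
from collections import deque

def cut_fragment(bonds, cut_bonds, seed):
    """Return (atoms, caps, atom_map) for the piece that contains `seed`.

    bonds, cut_bonds: lists of (i, j) pairs of parent-atom serial numbers.
    caps: (kept_atom, removed_atom) pairs; a hydrogen replaces removed_atom.
    atom_map: fragment serial number -> parent serial number (None for cap H).
    """
    cut = {frozenset(b) for b in cut_bonds}
    neighbours = {}
    for i, j in bonds:
        neighbours.setdefault(i, set()).add(j)
        neighbours.setdefault(j, set()).add(i)
    # Breadth-first walk from the seed that never crosses a cut bond.
    seen, queue = {seed}, deque([seed])
    while queue:
        a = queue.popleft()
        for b in neighbours.get(a, ()):
            if frozenset((a, b)) not in cut and b not in seen:
                seen.add(b)
                queue.append(b)
    # A cut bond with exactly one end inside the fragment gets a capping hydrogen.
    caps = []
    for i, j in cut_bonds:
        if (i in seen) != (j in seen):
            caps.append((i, j) if i in seen else (j, i))
    atoms = sorted(seen)
    atom_map = {new: old for new, old in enumerate(atoms, start=1)}
    for k in range(len(caps)):
        atom_map[len(atoms) + 1 + k] = None
    return atoms, caps, atom_map

# Toy parent: a chain of eight atoms 1-2-...-8 with cut sites at bonds 2-3 and 5-6.
chain = [(i, i + 1) for i in range(1, 8)]
frag_a = cut_fragment(chain, [(5, 6)], seed=1)
frag_b = cut_fragment(chain, [(2, 3)], seed=8)
shared = set(frag_a[0]) & set(frag_b[0])
```

In the toy chain the two cut sites are separated by three atoms, and those three atoms (3, 4, 5) appear in both fragments, which mirrors the shared-group rule stated above.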
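Proprietary force field fitting: fit the charge parameters of the input macromolecule to be fitted and save the fitted charge parameter file; at the same time, fit the bond, angle, dihedral, and van der Waals parameters of the three small molecule fragments separately and save their parameter files (e.g. mol1_ff, mol2_ff, mol3_ff). The freely available Gaff2 force field parameters can be used as the initial parameters.
Splicing: first use Gaff2 to obtain an initial parameter file for the Model_molecule molecule (the macromolecule to be fitted), e.g. model_ff, which contains its molecular topology; then use the atomic correspondence between the small molecule fragments and the original molecule (the macromolecule to be fitted) to reassemble the force field parameters of the original molecule. The bond, angle, dihedral, and van der Waals parameters are assembled from the force field parameters of the individual small molecule fragments according to their correspondence, while the charge parameters are those fitted separately for the original molecule, to keep the charge distribution reasonable.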
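The splicing step amounts to re-keying fragment parameters onto parent atom numbers. A minimal sketch, assuming parameters are stored in dictionaries keyed by tuples of atom serial numbers; the data layout and the names `canonical` and `splice` are hypothetical and are not the format of the parameter files mentioned in the text. Charges are not touched, in line with the statement that they come from the separate fit on the whole molecule.

```python
def canonical(key):
    """A bond, angle, or dihedral key reads the same in either direction; pick one ordering."""
    return min(key, key[::-1])

def splice(parent_params, fragments):
    """Overlay fitted fragment parameters on the parent's initial parameters.

    parent_params: {term_key: value} on parent atom numbers (initial values).
    fragments: list of (fragment_params, atom_map), where atom_map maps a
    fragment atom number to a parent atom number, or to None for a capping hydrogen.
    """
    merged = {canonical(k): v for k, v in parent_params.items()}
    for frag_params, atom_map in fragments:
        for key, value in frag_params.items():
            mapped = tuple(atom_map[i] for i in key)
            if None in mapped:
                # The term involves a capping hydrogen, which has no parent counterpart.
                continue
            merged[canonical(mapped)] = value
    return merged

# Toy parent with three bond terms and two overlapping fragments.
initial = {(1, 2): 1.0, (2, 3): 1.0, (3, 4): 1.0}
frag1 = ({(1, 2): 2.0, (2, 3): 2.5, (3, 4): 9.9}, {1: 1, 2: 2, 3: 3, 4: None})
frag2 = ({(2, 1): 3.0, (1, 3): 9.9}, {1: 3, 2: 4, 3: None})
spliced = splice(initial, [frag1, frag2])
```

When two fragments both supply a term inside a shared group, this sketch lets the later fragment win; the text does not state how such overlaps are resolved, so that choice is an assumption.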
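Proprietary force field fitting requires: 1. an objective function, i.e. the functional form to be solved, such as formula (1); 2. a training set of molecular structures with QM energies: the cut small molecule fragments are first scanned to obtain a batch of structures as the training set (the reference for the force field; good force field parameters are obtained only when the energies computed with the fitted function correlate well with the reference QM energies); 3. an algorithm to solve the function: the quasi-Newton BFGS algorithm is used, and the parameter values are solved through repeated iteration. The molecular force field parameters of every fragment are fitted this way. The initial parameters all have the same form and can be taken from Gaff2; charge fitting means calculating the charge carried by each atom.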
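The three ingredients listed above (objective function, training set, BFGS solver) can be put together in a small self-contained example. This is a sketch under stated assumptions, not the patent's implementation: the model is reduced to a three-term torsion series with zero phase, the "QM" reference energies are synthetic values generated from known parameters, gradients are taken by finite differences, and the BFGS routine is a textbook version with a backtracking line search.

```python
import math

ANGLES = [math.radians(a) for a in range(0, 360, 15)]  # scan grid for one dihedral

def model_energy(v, phi):
    """Torsion series with n = 1, 2, 3 and zero phase (a simplified, assumed model)."""
    return sum(0.5 * v_n * (1.0 + math.cos(n * phi)) for n, v_n in enumerate(v, start=1))

TRUE_V = [1.0, 0.5, 2.0]  # synthetic stand-in for the parameters behind the QM data
REFERENCE = [model_energy(TRUE_V, phi) for phi in ANGLES]

def objective(v):
    """Sum of squared deviations between force field and reference energies."""
    return sum((model_energy(v, phi) - e) ** 2 for phi, e in zip(ANGLES, REFERENCE))

def gradient(f, x, h=1e-6):
    """Central finite-difference gradient."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2.0 * h))
    return g

def bfgs(f, x0, max_iter=200, tol=1e-8):
    n = len(x0)
    x = list(x0)
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # inverse Hessian estimate
    g = gradient(f, x)
    for _ in range(max_iter):
        if math.sqrt(sum(gi * gi for gi in g)) < tol:
            break
        p = [-sum(H[i][j] * g[j] for j in range(n)) for i in range(n)]  # search direction
        slope = sum(gi * pi for gi, pi in zip(g, p))
        t, fx = 1.0, f(x)
        # Backtracking line search with the Armijo sufficient-decrease condition.
        while f([xi + t * pi for xi, pi in zip(x, p)]) > fx + 1e-4 * t * slope and t > 1e-12:
            t *= 0.5
        s = [t * pi for pi in p]
        x_new = [xi + si for xi, si in zip(x, s)]
        g_new = gradient(f, x_new)
        y = [a - b for a, b in zip(g_new, g)]
        sy = sum(si * yi for si, yi in zip(s, y))
        if sy > 1e-12:  # curvature condition; skip the update otherwise
            rho = 1.0 / sy
            Hy = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
            yHy = sum(yi * hi for yi, hi in zip(y, Hy))
            for i in range(n):
                for j in range(n):
                    H[i][j] += (1.0 + rho * yHy) * rho * s[i] * s[j] - rho * (Hy[i] * s[j] + s[i] * Hy[j])
        x, g = x_new, g_new
    return x

fitted = bfgs(objective, [0.0, 0.0, 0.0])  # in practice the start would be Gaff2 values
```

Because the reference energies here are generated from `TRUE_V`, a successful fit recovers those values, which gives a simple check on the optimizer before it is pointed at real QM scan data.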
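In the molecular force field fitting method of the invention, segmentation is automatic: the algorithm segments the macromolecule at the manually specified sites and automatically adds hydrogen atoms at the broken bonds to keep the molecule complete, while saving the atom mapping between the fragment molecules and the macromolecule.
Splicing of fragment force field parameters: the automatic splicing tool takes the refitted force field parameters of the small molecule fragments and rebuilds the force field parameter file of the parent molecule according to the atom correspondence saved during segmentation.
The invention cuts the macromolecule into smaller molecular fragments. The segmentation principle is to reduce the degrees of freedom (molecular complexity) of the molecule as far as possible while retaining the coupling between adjacent groups within the molecule; the force field parameters of the small molecules are then fitted separately, which makes up for the shortcomings of the general force field itself. Building the training set for the proprietary force field from small molecule fragments increases the number of molecules, but each has few degrees of freedom, so the search space of a systematic conformational search shrinks greatly, which saves time. The reduction in molecular weight also reduces the quantum-chemical computation and the difficulty of force field parameter fitting. The method both improves the accuracy of the force field parameters and reduces the difficulty and cost of refitting them.
The molecular force field fitting method of the invention has been tested on multiple systems. By comparison, the fitted force field is more accurate than the initial general force field parameters (gaff) and comparable in accuracy to a conventionally fitted proprietary force field, while requiring significantly less computation than fitting the whole macromolecule in the same way.
The molecular force field fitting method of the invention differs from the parameterization of a general force field: it accounts for the interactions between groups within the molecule, so it describes the potential energy surface of the molecule more accurately. As for parameter splicing, the splicing tool uses the correspondence saved during segmentation to automatically splice the fitted fragment parameters into a complete macromolecular force field parameter file, which reduces manual work.
Taking the above ideal embodiments of this application as inspiration, and through the above description, those working in the field can make various changes and modifications without departing from the technical idea of this application. The technical scope of this application is not limited to the content of the specification and must be determined according to the scope of the claims.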
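Those skilled in the art will understand that embodiments of this application may be provided as a method, a system, or a computer program product. Therefore, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
This application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of this application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device that implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on it to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.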

Claims (10)

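  1. A molecular force field fitting method, characterized by comprising:
    segmentation: inputting the 3D conformation of a macromolecule, specifying the segmentation positions, segmenting the macromolecule into small molecule fragments, and saving the atomic correspondence between the small molecule fragments and the input macromolecule;
    proprietary force field fitting: performing proprietary force field fitting on the small molecule fragments and saving the fitted force field parameters;
    splicing: splicing the fitted small molecule force fields into a macromolecular force field according to the atomic correspondence between the small molecule fragments and the input macromolecule.
  2. The molecular force field fitting method according to claim 1, characterized in that the 3D conformation of the macromolecule includes: a unique serial number and three-dimensional coordinates for each atom.
  3. The molecular force field fitting method according to claim 1, characterized in that the segmentation sites are separated by at least three atoms, the group in between is a shared group, each small molecule fragment cut out includes the shared group, and each severed chemical bond is capped with a hydrogen atom at the break position.
  4. The molecular force field fitting method according to claim 1, characterized in that the coupling between adjacent groups within the macromolecule is retained during segmentation.
  5. The molecular force field fitting method according to claim 4, characterized in that the coupling is judged by whether, during a flexible dihedral scan, the relative positions of adjacent groups of the macromolecule significantly influence the scan results.
  6. The molecular force field fitting method according to claim 5, characterized in that in the flexible dihedral scan the dihedral angle is rotated through 360 degrees about its central axis and the energy of the molecule is calculated at each position; if, during the scan, two adjacent groups show steric hindrance or form an intramolecular hydrogen bond, they are judged to influence each other significantly and to be coupled; a segmentation site cannot be placed between two such groups, and the two groups are kept in the same fragment.
  7. The molecular force field fitting method according to any one of claims 1 to 6, characterized in that after segmentation the 3D structures of the small molecule fragments and the atomic correspondence between the fragments and the atoms of the input macromolecule are returned.
  8. The molecular force field fitting method according to any one of claims 1 to 6, characterized in that the proprietary force field fitting comprises: fitting the charge parameters of the small molecule fragments and saving the fitted charge parameter file; at the same time, fitting the bond, angle, dihedral, and non-bonded interaction parameters of each small molecule fragment and saving them to a parameter file; and calculating the energy of the molecule from the bond parameters, angle parameters, dihedral parameters, and non-bonded interaction terms.
  9. The molecular force field fitting method according to claim 8, characterized in that fitting the charge parameters means calculating the charge carried by each atom; the non-bonded interaction terms include the electrostatic interaction and the van der Waals interaction; the bond parameters include the bond-stretching term and the bond-angle bending term; in the proprietary force field fitting, the energy of the molecule is calculated from the bond-stretching term, bond-angle bending term, angle parameters, dihedral parameters, electrostatic interaction term, and van der Waals interaction term; the calculation formula is as follows: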
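    Figure PCTCN2020085815-appb-100001
    where, in the bond-stretching term, $k_b$ is the force constant, $r$ is the bond length, and $r_0$ is the bond length at the equilibrium position;
    in the bond-angle bending term, $k_\theta$ is the force constant, $\theta$ is the bond angle, and $\theta_0$ is the bond angle at the equilibrium position;
    in the dihedral term, $V_n$ is the maximum of the potential energy during rotation about the dihedral, $n$ sets the periodicity, $\varphi$ is the value of the variable dihedral angle, and $\gamma$ is the phase, i.e. the phase angle of the dihedral;
    $A_{ij}$, $B_{ij}$, $R_{ij}$ are the van der Waals parameters.
    The van der Waals interaction can be expressed by the standard Lennard-Jones potential:
    Figure PCTCN2020085815-appb-100002
    In the force field expression: $A_{ij} = 4\varepsilon_{ij}\sigma_{ij}^{12}$, $B_{ij} = 4\varepsilon_{ij}\sigma_{ij}^{6}$
    $R_{ij}$ is the distance between the two atoms, $\varepsilon_{ij}$ is the depth of the potential well between the two atoms, and $\sigma_{ij}$ is the distance between the two atoms at which the potential energy is zero.
    Electrostatic interaction expression:
    Figure PCTCN2020085815-appb-100003
    where $\varepsilon$ is the effective dielectric constant, $q_i$ and $q_j$ are the charges of atoms $i$ and $j$, respectively, and $R_{ij}$ is the distance between the two atoms;
    the parameter file includes: atom type definitions and molecular topology.
    Proprietary force field fitting also includes: scanning the small molecule fragments to obtain fragment structures as a training set, and calculating the energy of the fragments with the energy function; the parameter values are solved iteratively, and the force field parameters are accepted when the calculated energies correlate well with the reference energies.
  10. The molecular force field fitting method according to any one of claims 1 to 6, characterized in that in the splicing step the initial parameters of the input macromolecule are obtained first, and the force field parameters of the input macromolecule are then reassembled according to the atomic correspondence between the small molecule fragments and the input macromolecule; the initial parameters include the molecular topology.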
PCT/CN2020/085815 2020-04-21 2020-04-21 Molecular force field fitting method WO2021103402A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/085815 WO2021103402A1 (zh) 2020-04-21 2020-04-21 Molecular force field fitting method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/085815 WO2021103402A1 (zh) 2020-04-21 2020-04-21 Molecular force field fitting method

Publications (1)

Publication Number Publication Date
WO2021103402A1 (zh) 2021-06-03

Family

ID=76128741

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/085815 WO2021103402A1 (zh) 2020-04-21 2020-04-21 Molecular force field fitting method

Country Status (1)

Country Link
WO (1) WO2021103402A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117809757A (zh) * 2023-12-28 2024-04-02 苏州腾迈医药科技有限公司 Molecular force field fitting method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101131707A (zh) * 2006-08-25 2008-02-27 屹昂计算化学软件(上海)有限公司 Automated generation method for molecular mechanics force field parameters
CN102779239A (zh) * 2011-05-09 2012-11-14 中国科学院研究生院 Method for establishing a molecular simulation force field for protein systems
US20160098543A1 (en) * 2006-08-04 2016-04-07 IFP Energies Nouvelles Method of quantifying hydrocarbon formation and retention in a mother rock
CN105787292A (zh) * 2014-12-18 2016-07-20 中国科学院大连化学物理研究所 Parallel prediction method for protein folding
CN108763852A (zh) * 2018-05-09 2018-11-06 深圳晶泰科技有限公司 Automated conformational analysis method for drug-like organic molecules

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LI WENZHAO: "Molecular Dynamics Simulation of Protein Structure and Dynamics", THESIS, no. 04, 15 April 2014 (2014-04-15), pages 1 - 97, XP009528355, ISSN: 1674-022X *

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20891631; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
32PN Ep: public notification in the EP bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC)
122 Ep: PCT application non-entry in European phase (Ref document number: 20891631; Country of ref document: EP; Kind code of ref document: A1)