CN111863141A - Molecular force field multi-target fitting algorithm library system and workflow method - Google Patents

Info

Publication number
CN111863141A
Authority
CN
China
Prior art keywords
force field
target
parameters
training
calculation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010651916.9A
Other languages
Chinese (zh)
Other versions
CN111863141B (en
Inventor
林泓叡
杨明俊
彭春望
吴楚楠
马健
温书豪
赖力鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Jingtai Technology Co Ltd
Original Assignee
Shenzhen Jingtai Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Jingtai Technology Co Ltd
Priority to CN202010651916.9A
Publication of CN111863141A
Application granted
Publication of CN111863141B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C10/00 Computational theoretical chemistry, i.e. ICT specially adapted for theoretical aspects of quantum chemistry, molecular mechanics, molecular dynamics or the like
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 Machine learning, data mining or chemometrics

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Chemical & Material Sciences (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Physics & Mathematics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a molecular force field multi-target fitting algorithm library system and a workflow method. The system comprises: an FFOptIterator module, which handles the input and output of force field parameters and their training iterations; an EnergyCalculator module, which calculates MM energies and energy derivatives, that is, the values required at each iteration of the optimization algorithm; and a PropertyEstimator module, which calculates thermodynamic properties from MD simulations. When initializing the FFOptIterator and EnergyCalculator objects, the user specifies the force field parameters to be trained, the adjustable parameter ranges, the algorithm workflow parameters and the MD simulation parameters by passing them as arguments. The invention is suited to applications in molecular force field training and validation, and provides a framework covering different training targets, the prediction workflows for different properties, compatibility and conversion between force field parameters in different formats, multi-objective optimization, result analysis and visualization.

Description

Molecular force field multi-target fitting algorithm library system and workflow method
Technical Field
The invention belongs to the field of molecular force fields, and particularly relates to a molecular force field multi-target fitting algorithm library system and a workflow method.
Background
Molecular dynamics (MD) simulations are widely applied in computational biology, chemistry and materials science, and the training and validation of molecular force fields, together with the quality assessment of their parameters, has become a research focus in property calculations for small-molecule drugs and similar systems. The molecular force field is the most basic prerequisite for molecular dynamics simulation: any molecular simulation study involving small molecules requires obtaining force field parameters, and every calculation and iteration step is driven by the energy computed from the force field.
Current force field parameter development tools have the following main shortcomings:
1. They are implemented as command-line tools, and the implementation usually involves multiple languages, multiple software packages and multiple file formats, which leads to many dependencies, difficult installation and compilation, and poor usability.
2. The fitting targets they offer are limited, mostly to energies or structures from intramolecular QM calculations. They rarely cover the properties dominated by intermolecular interactions, and they lack user-customizable multi-target fitting.
3. The implementation code is often tightly coupled to the workflow and lacks modularity and workflow parameter interfaces. Users often have to edit input files by hand, or even modify the source code directly, to obtain the scientifically intended behavior, which makes the tools hard to use and hard to extend.
Disclosure of Invention
To solve the above technical problems, the present invention provides a molecular force field multi-target fitting algorithm library system. Its main function is the development of force field parameters targeting properties such as intramolecular energy, intermolecular dimer interaction energy, crystal lattice parameters, liquid density, enthalpy of vaporization (HOV), enthalpy of sublimation (HOS) and enthalpy of fusion (HOF), with support for user-defined multi-target fitting.
The specific technical scheme is as follows. A molecular force field multi-target fitting algorithm library system comprises:
an FFOptIterator module, which handles the input and output of force field parameters and their training iterations;
an EnergyCalculator module, which calculates MM energies, energy derivatives and the like, that is, the values required at each iteration of the optimization algorithm;
a PropertyEstimator module, which calculates thermodynamic properties from MD simulations.
When initializing the FFOptIterator and EnergyCalculator objects, the user specifies the force field parameters to be trained, the adjustable parameter ranges, the algorithm workflow parameters and the MD simulation parameters by passing them as arguments.
The algorithm library system is compatible with the mainstream CHARMM, AMBER and GROMACS force field input file formats, and performs system manipulation and MM simulation calculations using the ParmEd and OpenMM libraries.
The FFOptIterator module is the main module of XFOpt. CHARMM (Chemistry at HARvard Macromolecular Mechanics) is a commercial software package for molecular dynamics simulation; it uses the CHARMM force field, which has its own software-specific file format.
AMBER is currently one of the more mainstream molecular dynamics packages and is widely used for biomacromolecular simulation thanks to its good GPU support. It uses the AMBER force field, which has its own software-specific file format.
OpenMM is a high-performance toolkit for molecular simulation that can be used as a library or as an application and provides Python and C++ APIs. It is part of the SimTK toolkit for biomolecular simulation and natively supports GPUs as a computing platform, which is a major advantage.
Preferably, the MM calculations can run on GPUs, and simultaneous training of multiple molecules, multiple polymorphs and multiple force field parameter types is supported.
With this technical scheme, the invention has the following advantage: existing molecular force field fitting libraries mostly do not use OpenMM, so most of them cannot run on GPUs. This library depends on OpenMM, whose MM calculations support GPU execution, so the XFOpt library can run its calculations on GPUs. Support for multiple molecules, multiple polymorphs and the like is a subject of the invention and is detailed in the schemes below.
Preferably, apart from MD calculation and numerical optimization, which call the OpenMM and SciPy interfaces, the entire code of the library is written in Python. An SDK is provided, so users can build their own research code and extensions on the library's APIs. The library depends only on open-source third-party libraries and follows a minimal-dependency principle: beyond ParmEd and OpenMM, the mainstream libraries it needs for computational chemistry, it does not depend on any niche third-party package.
Correspondingly, the invention also provides a working method for the molecular force field multi-target fitting algorithm library system, comprising the following steps:
Step A: selecting the targets for parameter fitting, and generating the corresponding QM data set or collecting the corresponding experimental measurement data;
Step B: performing the initial MM simulation and calculation for the selected systems and data sets;
Step C: comparing the initial target values calculated by MM with the QM or experimental target values, and calculating the corresponding objective function and its derivatives with respect to the training parameters according to the target type;
Step D: optimizing the objective function using the calculated objective function value and derivatives (the current algorithm is the L-BFGS-B method), which yields the initial force field parameter values for the next iteration;
Step E: judging from the convergence condition whether the objective function has converged; if so, stopping the optimization iteration; if not, performing the next round of MM calculation with the force field parameter values from the current iteration, calculating the objective function and derivatives from that MM calculation, and continuing to iterate until convergence.
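Steps B to E form a loop, which can be sketched as follows. This is a minimal illustration only: the function names, the toy linear "MM model" and the step size are assumptions, and a simple projected gradient descent stands in for the L-BFGS-B method named in Step D, so that the sketch needs nothing beyond the Python standard library.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def fit_parameters(
    objective_and_gradient: Callable[[Sequence[float]], Tuple[float, Vector]],
    initial: Sequence[float],
    bounds: Sequence[Tuple[float, float]],
    step_size: float = 0.1,
    tolerance: float = 1e-10,
    max_iterations: int = 500,
) -> Tuple[Vector, float, int]:
    """Steps B-E as a loop; projected gradient descent stands in for L-BFGS-B."""
    params = list(initial)
    value, gradient = objective_and_gradient(params)  # Steps B and C
    for iteration in range(1, max_iterations + 1):
        # Step D: take a downhill step, then clip back into the allowed range.
        trial = [
            min(max(p - step_size * g, low), high)
            for p, g, (low, high) in zip(params, gradient, bounds)
        ]
        # Next round of "MM calculation" with the updated parameters.
        new_value, new_gradient = objective_and_gradient(trial)
        # Step E: stop once the objective no longer changes appreciably.
        converged = abs(value - new_value) < tolerance
        params, value, gradient = trial, new_value, new_gradient
        if converged:
            return params, value, iteration
    return params, value, max_iterations


def toy_objective(params: Sequence[float]) -> Tuple[float, Vector]:
    """Hypothetical stand-in for an MM model: prediction is a*x + b."""
    a, b = params
    data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]  # (x, reference value)
    residuals = [a * x + b - ref for x, ref in data]
    value = sum(r * r for r in residuals)
    gradient = [
        sum(2.0 * r * x for r, (x, _) in zip(residuals, data)),
        sum(2.0 * r for r in residuals),
    ]
    return value, gradient
```

Calling `fit_parameters(toy_objective, [1.0, 0.0], [(0.0, 5.0), (-5.0, 5.0)])` drives the two toy parameters towards a = 2 and b = 1; tightening the first bound to (0.0, 1.5) leaves a pinned at the bound, which mirrors the role of the adjustable parameter range.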
Preferably, in Step A, an MD input file is generated for each system, containing information such as the molecular topology and the molecular force field parameters used; and the force field parameters that take part in the training are specified, for example bond, angle, dihedral, improper dihedral, charge and van der Waals parameters.
Preferably, in Step B, the initial MM simulation and calculation include MM energy calculation and the prediction of various thermodynamic properties, such as liquid density, crystal unit cell parameters, HOV, HOS and HOF.
Preferably, in Step C, the objective function is generally the squared difference between the calculated MM value and the target value; the details differ with the type of training target. The derivative of the objective function is calculated analytically following the ForceBalance method, and the equation differs between thermodynamic properties.
Preferably, in Step D, the general form of the objective function is:
[Formula image BDA0002575291350000031]
where T is the objective function value, λ denotes the force field parameters selected for training, i is the index of a data point in the training set, w is the weight assigned to the training target, and P is a target property value, from the MM calculation or from QM/experiment: specifically the intramolecular conformational energy, intermolecular interaction energy, liquid density, crystal unit cell parameters, HOV, HOS, HOF and so on mentioned above.
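The formula itself survives in this text only as an image reference. A weighted least-squares form that is consistent with the symbol definitions above and with the "squared difference" description of Step C would be the following; the superscripts MM and ref, the parameter index k and the absence of any normalization factor are assumptions, not a reproduction of the original formula.

```latex
T(\lambda) = \sum_{i} w_i \left[ P_i^{\mathrm{MM}}(\lambda) - P_i^{\mathrm{ref}} \right]^2 ,
\qquad
\frac{\partial T}{\partial \lambda_k}
  = 2 \sum_{i} w_i \left[ P_i^{\mathrm{MM}}(\lambda) - P_i^{\mathrm{ref}} \right]
    \frac{\partial P_i^{\mathrm{MM}}}{\partial \lambda_k}
```

Under this form, the gradient needed by the optimizer follows from the chain rule once the derivative of each MM property with respect to each trained parameter is available.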
Correspondingly, the invention also provides a workflow implementation method for the algorithm library system, comprising the following steps:
Step (1): reading the user-prepared initial MD input files, target types, training set systems, conformations, QM or experimental target values, training force field parameters and training ranges, and performing the parameter calculations and system manipulation with the FFOptIterator module as the container;
Step (2): calling the iterator of the FFOptIterator module to calculate the MM target values for the systems under every target attribute; all MM energy-related calculations are done by the EnergyCalculator, which calls the OpenMM Python API. For conformational energy, energy calculations for the various conformational scans are built on OpenMM's custom intramolecular forces.
Step (3): after the systems under each target type have finished their MM calculations, calculating the objective function in its target-specific form and, following ForceBalance, its derivatives with respect to the force field parameters. Finally, the systems under all targets are combined with the assigned weights into a total objective function and the corresponding gradient, which serve as the input to the L-BFGS-B optimizer.
Step (4): calling the Fortran L-BFGS-B subroutine inside SciPy to optimize the force field parameters with the inputs above and perform one iteration, then checking whether the convergence condition of the objective function is met; if not, editing all systems with the updated force field parameters through the ParmEd API (ParmEd Python application programming interface) and constructing the MM input systems for the next round.
Step (5): performing the next round of MM calculation and repeating the steps above to optimize the force field parameters until the objective function converges. Typically, systems targeting QM energies take roughly 50-100 iterations, and systems targeting thermodynamic and physicochemical properties converge in 8-15 iterations.
Step (6): after the force field parameters have converged, the force field parameter files can be written out and the optimization process visualized, as the user requests. Parameter files are currently written mainly in the AMBER mol2 and frcmod formats, so AMBER-format system building can follow directly, which is convenient for MD simulation. The visualization part builds, in the returned object, a before/after comparison of the computed target values and the iteration history of the objective function, according to the target types specified by the user.
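The combination described in Step (3), where per-target results are merged with their weights into one objective value and one gradient, can be sketched as below. The class and function names are hypothetical, and the weighted sum of squared differences is an assumed form; the Jacobian entries would in practice come from the analytical derivatives mentioned in the text.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class TargetResult:
    """MM results for one target type (all names here are hypothetical)."""
    weight: float
    computed: Sequence[float]            # MM values, one per data point
    reference: Sequence[float]           # QM or experimental values
    jacobian: Sequence[Sequence[float]]  # jacobian[i][k] = d computed[i] / d param[k]


def combine_targets(results: Sequence[TargetResult], n_params: int) -> Tuple[float, List[float]]:
    """Return the weighted sum of squared differences and its gradient."""
    total = 0.0
    gradient = [0.0] * n_params
    for res in results:
        for mm, ref, row in zip(res.computed, res.reference, res.jacobian):
            diff = mm - ref
            total += res.weight * diff * diff
            # Chain rule: d(w * diff^2)/d param[k] = 2 * w * diff * d mm / d param[k]
            for k in range(n_params):
                gradient[k] += 2.0 * res.weight * diff * row[k]
    return total, gradient
```

Because every target type returns its values in the same shape, the optimizer only ever sees one scalar and one gradient vector, which is what lets a single-objective method such as L-BFGS-B drive a multi-target fit.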
Preferably, in Step (1), system manipulation and parameter modification are carried out with the system editing functions of the ParmEd objects. The FFOptIterator module is designed to accept multiple molecules, multiple polymorphs and multiple sets of training force field parameters as input, and to hold the structures and operations in attributes organized by target type. In effect, the initialized FFOptIterator module completes all system preparation before the optimization iterations begin.
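Organizing the input by target type can be sketched as follows. The sketch assumes, hypothetically, that the user input arrives as a dictionary keyed by target type; the key names ("weight", "systems", "reference") and the container class are illustrative and are not taken from the library.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class TargetContainer:
    """Holds everything belonging to one target type (hypothetical layout)."""
    target_type: str
    weight: float
    systems: List[str] = field(default_factory=list)      # e.g. input file names
    reference: List[float] = field(default_factory=list)  # QM or experimental values


def build_containers(user_input: Dict[str, Dict[str, Any]]) -> Dict[str, TargetContainer]:
    """Convert a user dictionary into one container per target type."""
    containers = {}
    for target_type, spec in user_input.items():
        containers[target_type] = TargetContainer(
            target_type=target_type,
            weight=float(spec.get("weight", 1.0)),
            systems=list(spec.get("systems", [])),
            reference=[float(v) for v in spec.get("reference", [])],
        )
    return containers
```

Keeping one container per target type lets each type carry its own calculation procedure while the iterator loops over all containers in the same way.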
Preferably, in Step (2), for dimer interaction energy, the calculation of the intermolecular potential energy is built on CustomNonbondedForce, with the van der Waals term using the Lorentz-Berthelot combining rule. For the remaining thermodynamic properties, conventional NPT dynamics simulations are run in OpenMM with the simulation parameters given by the user, and the simulation length should be set long enough for the property calculations to converge. After the simulation is completed, the PropertyEstimator is called to calculate thermodynamic properties such as volume, unit cell parameters, HOV, HOS and HOF.
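The Lorentz-Berthelot combining rule named above takes the arithmetic mean of the two Lennard-Jones radii and the geometric mean of the two well depths. A minimal standalone sketch follows; the function names are illustrative, the 12-6 Lennard-Jones form is assumed for the van der Waals term, and no OpenMM call is made.

```python
import math


def lorentz_berthelot(sigma_i: float, epsilon_i: float,
                      sigma_j: float, epsilon_j: float) -> tuple:
    """Mixed Lennard-Jones parameters for an unlike atom pair."""
    sigma_ij = 0.5 * (sigma_i + sigma_j)            # Lorentz: arithmetic mean
    epsilon_ij = math.sqrt(epsilon_i * epsilon_j)   # Berthelot: geometric mean
    return sigma_ij, epsilon_ij


def lennard_jones(r: float, sigma: float, epsilon: float) -> float:
    """12-6 Lennard-Jones pair energy at separation r (same units as sigma)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)
```

In OpenMM this rule is commonly written into the energy expression of a CustomNonbondedForce, for example as `4*epsilon*((sigma/r)^12-(sigma/r)^6); sigma=0.5*(sigma1+sigma2); epsilon=sqrt(epsilon1*epsilon2)` with per-particle parameters sigma and epsilon; whether the library uses exactly this expression is not stated in the text.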
The algorithm library system is mainly applied to parameter training of customized force fields, in particular against properties dominated by intermolecular interactions. It is used to develop and validate customized force fields, such as dedicated force fields parameterized for protein-drug binding and for crystal habit (morphology) prediction.
The invention brings the following effects:
1. The force field development tool is systematized and given a designed architecture, making it easy to use and highly extensible.
2. Iteration and analysis of the parameter training process are automated, and simultaneous optimization over multiple targets, multiple systems and multiple crystal forms is supported.
3. No external MD software needs to be called; only third-party libraries with Python APIs are used, and the system is compatible with the mainstream MD file formats.
4. Prediction of a range of targets and properties dominated by intermolecular interactions is implemented and used as the basis for training force field parameters.
5. Users can specify different combinations of training parameters and customize the training ranges of the force field parameters, the input workflow parameters, the simulation parameters and so on, so that the customized force field fitting strategies they need can be realized.
6. The algorithm library system supports GPU computation and provides an SDK: the Python APIs for each function can be called directly, which makes it convenient to analyze and run the workflow interactively and to add analysis code or extensions to it. Background execution is also supported, and tasks can be submitted from the command line.
7. The invention is suited to applications in molecular force field training and validation, and provides a framework covering different training targets, the prediction workflows for different properties, compatibility and conversion between force field parameters in different formats, multi-objective optimization, result analysis and visualization.
Drawings
FIG. 1 shows the force field parameter development principle and flowchart according to an embodiment of the present invention.
FIG. 2 shows the specific operation flow and implementation of an embodiment of the present invention.
Detailed Description
The invention is described in more detail below with reference to the accompanying drawings.
The force field parameter development principle and workflow used and implemented by the XFOpt algorithm library are shown in FIG. 1:
A. The user selects the targets for parameter fitting and generates the corresponding QM data set or collects the corresponding experimental measurement data. An MD input file containing the force field information is generated for each system, and the parameters that take part in the training are specified.
B. Initial MM simulations and calculations, including the prediction of each required property, are performed for the selected systems and data sets.
C. The initial target values calculated by MM are compared with the QM or experimental target values, and the corresponding objective function and its derivatives with respect to the training parameters are calculated according to the target type. The objective function is generally the squared difference between the calculated MM value and the target value; the details differ with the type of training target. The derivative of the objective function is calculated analytically following the ForceBalance method, and the equation differs between thermodynamic properties.
D. The objective function is optimized using the calculated objective function value and derivatives (the current algorithm is the L-BFGS-B method), which yields the initial force field parameter values for the next iteration.
E. Whether the objective function has converged is judged from the convergence condition; if so, the optimization iteration stops. If not, the next round of MM calculation is performed with the force field parameter values from the current iteration, the objective function and derivatives are calculated from that MM calculation, and the iteration continues until convergence.
To ensure good extensibility, maintainability and usability of the code, the XFOpt algorithm library is architected into FFOptIterator, EnergyCalculator and PropertyEstimator modules. Each module has an independent function and plays its part in every optimization iteration. The specific operation flow and implementation of the library are briefly introduced below (as shown in FIG. 2):
Step (1): The user-prepared initial MD input files (prmtop, inpcrd and so on), target types, training set systems, conformations, QM or experimental target values, training force field parameters, training ranges and so on are read, and the parameter calculations and system manipulation are performed with the FFOptIterator module as the container; system manipulation and parameter modification use the system editing functions of the ParmEd objects. The FFOptIterator is designed to accept multiple molecules, multiple polymorphs and multiple sets of training force field parameters as input, and to hold the structures and operations in attributes organized by target type. Concretely, when the FFOptIterator object is initialized, the Python dictionary supplied by the user is converted into attributes of the FFOptIterator object, separated by target type. Each target type gets its own MM calculation procedure and its own form of objective function, but every attribute container returns its MM values as the same kind of object in the same format. When the FFOptIterator iterator runs, the values from all target types can therefore be combined into a single total objective function value and a single gradient, which turns the multi-target fit into a single-target fit. In effect, the initialized FFOptIterator completes all system preparation before the optimization iterations begin. After each round of parameter optimization, a sub-method of the FFOptIterator assigns the updated force field parameters, according to target type, back to the ParmEd Structure object (the structure object of the ParmEd molecule editor) in each target type container, which completes the parameter update so that the next round of MM calculation can start directly.
Step (2): The iterator of the FFOptIterator is called to calculate the MM target values for the systems under every target attribute; all MM energy-related calculations are done by the EnergyCalculator, which calls the OpenMM Python API. For conformational energy, the energy calculations for the various conformational scans are built on OpenMM's CustomCompoundBondForce. For dimer interaction energy, the calculation of the intermolecular potential energy is built on CustomNonbondedForce, with the van der Waals term using the Lorentz-Berthelot combining rule. For the remaining thermodynamic properties, conventional NPT dynamics simulations are run in OpenMM with the simulation parameters given by the user, and it is recommended that the simulation length be set long enough for the property calculations to converge. After the simulation is completed, the PropertyEstimator is called to calculate thermodynamic properties such as volume, unit cell parameters, HOV, HOS and HOF.
Step (3): After the systems under each target type have finished their MM calculations, the objective function is calculated in its target-specific form together with, following ForceBalance, its derivatives with respect to the force field parameters. Finally, the systems under all targets are combined with the assigned weights into a total objective function and the corresponding gradient, which serve as the input to the L-BFGS-B optimizer.
Step (4): The Fortran L-BFGS-B subroutine inside SciPy is called to optimize the force field parameters with the inputs above and perform one iteration. The convergence condition of the objective function is then checked; if it is not met, all systems are edited with the updated force field parameters through the ParmEd API, and the MM input systems for the next round are constructed.
Step (5): The next round of MM calculation is performed and the steps above are repeated to optimize the force field parameters until the objective function converges. Generally, systems targeting QM energies take about 60 iterations, and systems targeting thermodynamic and physicochemical properties converge in about 10 iterations.
Step (6): After the force field parameters have converged, the force field parameter files can be written out and the optimization process visualized, as the user requests. Parameter files are currently written mainly in the AMBER mol2 and frcmod formats, so AMBER-format system building can follow directly, which is convenient for MD simulation. The visualization part builds, in the returned object, a before/after comparison of the computed target values, the iteration history of the objective function and so on, according to the target types specified by the user.
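The property calculations that Step (2) assigns to the PropertyEstimator can be illustrated with two standard textbook estimators: liquid density from the mean NPT box volume, and enthalpy of vaporization from mean potential energies under an ideal-gas approximation. The text does not give the library's own formulas, so the function names, units and estimator forms below are assumptions.

```python
from statistics import fmean
from typing import Sequence

GAS_CONSTANT = 8.314462618e-3  # kJ/(mol*K)
AVOGADRO = 6.02214076e23       # 1/mol


def liquid_density(molar_mass_g_mol: float, n_molecules: int,
                   box_volumes_nm3: Sequence[float]) -> float:
    """Density in g/cm^3 from the mean of a series of NPT box volumes (nm^3)."""
    mean_volume_cm3 = fmean(box_volumes_nm3) * 1e-21  # 1 nm^3 = 1e-21 cm^3
    mass_g = molar_mass_g_mol * n_molecules / AVOGADRO
    return mass_g / mean_volume_cm3


def enthalpy_of_vaporization(gas_energies: Sequence[float],
                             liquid_energies: Sequence[float],
                             n_molecules: int,
                             temperature: float) -> float:
    """dH_vap = <E_gas> - <E_liq>/N + RT, in kJ/mol.

    gas_energies: potential energies of one isolated molecule (kJ/mol);
    liquid_energies: potential energies of the N-molecule liquid box (kJ/mol).
    """
    return (fmean(gas_energies)
            - fmean(liquid_energies) / n_molecules
            + GAS_CONSTANT * temperature)
```

For example, 1000 molecules of molar mass 18.015 g/mol in a box averaging 29.9 nm^3 give a density close to 1.0 g/cm^3. Both estimators average over the trajectory, which is why the simulation must be long enough for the averages to converge.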
Example 1
Ibuprofen is taken as the system, in its form 2 crystal form. The targets are a data set of 69 QM ibuprofen-ethanol and ibuprofen-water interaction energies generated by hydrogen-bond conformation scanning, and the van der Waals parameters are fitted starting from a crystal force field as the initial force field. The allowed parameter adjustment range is set to 10% and the weight of each dimer pair conformation is set to 1; the set of conformations is passed to the XFOpt algorithm library, which runs the whole workflow. The objective function value is 745.948 kJ²/mol² before optimization and 324.104 kJ²/mol² after optimization, and the RMSD of the MM interaction energies relative to the QM target values is markedly reduced.
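An allowed adjustment range such as the 10% above can be turned into per-parameter bounds for a bounded optimizer. The sketch below assumes the range means plus or minus that fraction of each initial value; the text does not spell out the convention, and the function name is illustrative.

```python
from typing import List, Sequence, Tuple


def relative_bounds(initial_values: Sequence[float], fraction: float) -> List[Tuple[float, float]]:
    """Return (low, high) pairs allowing each value to move by +/- fraction of itself."""
    bounds = []
    for value in initial_values:
        delta = abs(value) * fraction  # abs() keeps low <= high for negative values
        bounds.append((value - delta, value + delta))
    return bounds
```

A list of (low, high) pairs is the shape that `scipy.optimize.minimize` accepts for its `bounds` argument with `method="L-BFGS-B"`. A parameter whose initial value is zero gets a zero-width range under this convention and is effectively frozen.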
Example 2
Paracetamol is taken as the system, and the form 1 and form 2 crystal forms are used together for a polymorph force field parameter fit. The targets are a data set of 2000 QM paracetamol-ethanol and paracetamol-water interaction energies generated by conformation clustering of molecular simulations, together with the crystal unit cell parameters, HOV, HOS and HOF obtained from experimental measurement; the van der Waals parameters are fitted starting from the GAFF2 force field as the initial force field. The allowed parameter adjustment range is set to 20%, the weight of each dimer pair conformation to 1, the weight of the unit cell parameters (such as the α, β and γ angles) to 10, and the weights of HOV, HOS and HOF to 10; everything is passed to the XFOpt algorithm library, which runs the whole workflow. The objective function value is 2.748 kJ²/mol² before optimization and 2.583 kJ²/mol² after optimization, a degree of improvement within the normal range for crystal-related properties; the RMSD of the MM dimer interaction energies relative to the QM target values is markedly reduced, and the calculated HOV, HOS and HOF move closer to the experimental values.
Example 3
An in-house case molecule (the API) is taken as the system, in its form 1 crystal form. The targets are a data set of 3000 QM API-API, API-ethanol and API-water interaction energies generated by conformation clustering of molecular simulations, together with the HOF obtained from experimental measurement; the charges and van der Waals parameters are fitted starting from a crystal force field as the initial force field. The allowed parameter adjustment range is set to 25%, the weight of each dimer pair conformation to 1 and the weight of the HOF to 10; everything is passed to the XFOpt algorithm library, which runs the whole workflow. The objective function value is 2.797 kJ²/mol² before optimization and 0.548 kJ²/mol² after optimization, a degree of improvement within the normal range for crystal-related properties; the RMSD of the MM dimer interaction energies relative to the QM target values is markedly reduced, and the calculated HOF moves closer to the experimental value.
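The examples judge the fit by the RMSD between MM and QM interaction energies. For reference, that metric can be computed as in the following sketch; the function name is illustrative and the inputs are assumed to be paired lists of energies in the same units.

```python
import math
from typing import Sequence


def rmsd(computed: Sequence[float], reference: Sequence[float]) -> float:
    """Root-mean-square deviation between paired computed and reference values."""
    if len(computed) != len(reference) or not computed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((c - r) ** 2 for c, r in zip(computed, reference))
    return math.sqrt(squared / len(computed))
```

Evaluating it once with the initial parameters and once with the optimized parameters gives the before/after comparison that the examples describe qualitatively.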
The foregoing is a more detailed description of the invention in connection with specific preferred embodiments and it is not intended that the invention be limited to these specific details. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the spirit of the invention, and all shall be considered as belonging to the protection scope of the invention.

Claims (10)

1. A molecular force field multi-target fitting algorithm library system is characterized by comprising:
the FFOptIterator module is used for inputting and outputting training iteration of force field parameters;
the EnergyCalculator module is used for calculating the difference between the MM energy and the energy and calculating a required value of each step of iteration in the optimization algorithm;
a PropertyEstimator module for calculating thermodynamic properties based on MD simulations;
wherein, when the FFOptIterator and EnergyCalculator objects are initialized, the user specifies the training force field parameters, the adjustable parameter range, the algorithm flow parameters and the MD simulation parameters by passing them as arguments.
2. A working method based on a molecular force field multi-target fitting algorithm library system, characterized by comprising the following steps:
step A: selecting the targets for parameter fitting, and generating the corresponding QM data sets or collecting the corresponding experimental measurement data; generating the MD input files of the corresponding systems;
step B: performing the initial MM simulation and calculation for the selected systems and data sets;
step C: comparing the initial target values calculated by MM with the QM or experimental target values, and calculating the corresponding objective function and its derivatives with respect to the training parameters according to the target type;
step D: optimizing the objective function from the calculated objective function and its derivatives, the algorithm currently used being the L-BFGS-B method, to obtain the initial force field parameter values for the next iteration;
step E: judging from the convergence condition whether the objective function has reached the convergence criterion; if so, stopping the optimization iteration; if not, performing the next round of MM calculation with the force field parameter values from the current optimization iteration, calculating the objective function and its derivatives from that MM calculation, and continuing to iterate on the optimized objective function value until convergence.
3. The method of claim 2, wherein step A further comprises specifying the force field information for the parameters that are to participate in the training.
4. The method of claim 2, wherein in step B, the initial MM simulation and calculation items include the various required property predictions.
5. The method of claim 2, wherein the generic objective function is of the form:
Figure FDA0002575291340000011
6. A flow implementation method based on a molecular force field multi-target fitting algorithm library system, characterized by comprising the following steps:
step (1): reading the initial MD input files, target types, training set systems, conformations, QM or experimental target values, training force field parameters and training ranges prepared by the user, and performing the various parameter calculations and system operations with the FFOptIterator as a container;
step (2): calling the iterator of the FFOptIterator and calculating the MM target values for the systems under all target attributes, all MM energy-related calculations being carried out by the EnergyCalculator through the OpenMM Python API; for the relational energy, the energy calculation for the various conformation scans is developed on the basis of OpenMM's CustomCompoundBondForce;
step (3): after the corresponding MM values have been calculated for the systems of each target type, calculating the ForceBalance objective function and its derivatives with respect to the force field parameters according to the specific form of each objective function; finally, combining the systems under all targets with the set weights to calculate the total objective function and the corresponding gradient, which serve as the input of the L-BFGS-B optimizer;
step (4): calling the Fortran L-BFGS-B subroutine in SciPy and passing in the force field parameters to be optimized to carry out one round of iteration; checking whether the convergence condition of the objective function has been met, and if not, editing all systems with the updated force field parameters through the ParmEd API to construct the MM calculation input systems for the next round;
step (5): performing the next round of MM calculation and repeating the above steps to optimize the force field parameters until the objective function converges;
step (6): after the force field parameter optimization has converged, outputting the force field parameter files and visualizing the optimization process according to the user's operation.
7. The flow implementation method of claim 6, wherein in step (1), system operations and parameter modification are carried out using the system editing functions of a ParmEd object; the FFOptIterator module accepts multiple molecules, multiple polymorphs and multiple sets of training force field parameters as input, and organizes the structures and operations into attributes of the different target types.
8. The flow implementation method of claim 6, wherein in step (2), for the dimer interaction energy, the calculation of the intermolecular potential energy is developed on the basis of CustomNonbondedForce, with the van der Waals term using the Lorentz-Berthelot combination rule; for the other thermodynamic properties, a conventional NPT dynamics simulation is run with OpenMM, using the simulation parameters given by the user and a simulation length long enough for the calculated value of each property to converge; after the simulation is completed, the PropertyEstimator is called to calculate the thermodynamic properties.
9. The flow implementation method of claim 6, wherein in step (5), systems whose target is QM energy take 50-100 iterations, and systems whose target is a thermodynamic physicochemical property take 8-15 iterations to converge.
10. The flow implementation method of claim 6, wherein in step (6), the parameter files are output in AMBER mol2 and frcmod formats, and the visualization part can, according to the target type specified by the user, plot from the returned object a comparison of the calculated target values before and after force field parameter optimization and the iteration history of the objective function.
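For reference, the Lorentz-Berthelot combination rule named in claim 8 can be written out as a minimal sketch. This is plain Python rather than the OpenMM-based implementation the claims describe; the function names, the 12-6 Lennard-Jones form and the numeric values are assumptions chosen for illustration.

```python
import math


def lorentz_berthelot(sigma_i, epsilon_i, sigma_j, epsilon_j):
    """Combine per-atom parameters: arithmetic mean of sigma, geometric mean of epsilon."""
    sigma_ij = 0.5 * (sigma_i + sigma_j)
    epsilon_ij = math.sqrt(epsilon_i * epsilon_j)
    return sigma_ij, epsilon_ij


def lennard_jones(r, sigma, epsilon):
    """12-6 Lennard-Jones pair energy at separation r (same length unit as sigma)."""
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 * ratio6 - ratio6)


# Made-up parameters for two atom types.
sigma_ij, epsilon_ij = lorentz_berthelot(3.0, 0.1, 4.0, 0.4)

# The 12-6 potential has its minimum at r = 2**(1/6) * sigma, with depth -epsilon.
energy_at_minimum = lennard_jones(2.0 ** (1.0 / 6.0) * sigma_ij, sigma_ij, epsilon_ij)
print(sigma_ij, epsilon_ij, energy_at_minimum)
```

Because the pair parameters are derived from per-atom values, adjusting one atom type's van der Waals parameters during fitting changes every pair interaction involving that type.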
CN202010651916.9A 2020-07-08 2020-07-08 Molecular force field multi-target fitting algorithm library system and workflow method Active CN111863141B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010651916.9A CN111863141B (en) 2020-07-08 2020-07-08 Molecular force field multi-target fitting algorithm library system and workflow method

Publications (2)

Publication Number Publication Date
CN111863141A true CN111863141A (en) 2020-10-30
CN111863141B CN111863141B (en) 2022-06-10

Family

ID=73153125

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010651916.9A Active CN111863141B (en) 2020-07-08 2020-07-08 Molecular force field multi-target fitting algorithm library system and workflow method

Country Status (1)

Country Link
CN (1) CN111863141B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997046949A1 (en) * 1996-06-07 1997-12-11 Hitachi, Ltd. Molecular modeling system and molecular modeling method
US20030195734A1 (en) * 2002-04-16 2003-10-16 Huai Sun Method and expert system of molecular mechanics force fields for computer simulation of molecular systems
WO2008071540A1 (en) * 2006-12-11 2008-06-19 Avant-Garde Materials Simulation Sarl Tailor-made force fields for crystal structure prediction
CN102446235A (en) * 2010-10-11 2012-05-09 中国石油化工股份有限公司 Method for simulating and calculating interaction parameters among chemical components by using computer
EP3026588A1 (en) * 2014-11-25 2016-06-01 Inria Institut National de Recherche en Informatique et en Automatique interaction parameters for the input set of molecular structures
CN106355025A (en) * 2016-09-06 2017-01-25 北京理工大学 Allele competing reaction QM/MM method in living system
CN109637592A (en) * 2018-12-21 2019-04-16 深圳晶泰科技有限公司 The calculating task management and analysis and its operation method that molecular force field parameter generates
CN109994158A (en) * 2019-03-21 2019-07-09 东北大学 A kind of system and method based on the intensified learning building molecule reaction field of force
CN110097927A (en) * 2019-05-10 2019-08-06 青岛理工大学 Method for testing ion diffusion coefficient under electric field action based on molecular dynamics
US20200134246A1 (en) * 2018-05-10 2020-04-30 Shenzhen Jingtai Technology Co., Ltd. Gromacs cloud computing process control method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
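Qi Chang et al.: "Optimizing a temperature-dependent force field to predict the thermodynamic properties of n-alkanes", CIESC Journal (《化工学报》) *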

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112233733A (en) * 2020-11-05 2021-01-15 深圳晶泰科技有限公司 Molecular force field quality control system and control method thereof
CN112233733B (en) * 2020-11-05 2023-04-07 深圳晶泰科技有限公司 Molecular force field quality control system and control method thereof
CN112447267A (en) * 2020-11-18 2021-03-05 深圳晶泰科技有限公司 Molecular dynamics force field parameter fitting workflow control system and control method thereof
WO2023070767A1 (en) * 2021-10-26 2023-05-04 深圳晶泰科技有限公司 Construction method for molecular training set, and training method and related apparatuses
WO2023123290A1 (en) * 2021-12-30 2023-07-06 深圳晶泰科技有限公司 Method and apparatus for optimizing force field and non-bonding parameter thereof, and design method and apparatus

Also Published As

Publication number Publication date
CN111863141B (en) 2022-06-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant