CN102779239A - Method for establishing molecular simulation force filed of protein system - Google Patents

Method for establishing molecular simulation force filed of protein system Download PDF

Info

Publication number
CN102779239A
CN102779239A CN 201110117428 CN201110117428A CN102779239A CN 102779239 A CN102779239 A CN 102779239A CN 201110117428 CN201110117428 CN 201110117428 CN 201110117428 A CN201110117428 A CN 201110117428A CN 102779239 A CN102779239 A CN 102779239A
Authority
CN
China
Prior art keywords
model
force
field
force field
configuration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 201110117428
Other languages
Chinese (zh)
Inventor
汪志祥
姜金良
吴春
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Chinese Academy of Sciences
Original Assignee
University of Chinese Academy of Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Chinese Academy of Sciences filed Critical University of Chinese Academy of Sciences
Priority to CN 201110117428 priority Critical patent/CN102779239A/en
Publication of CN102779239A publication Critical patent/CN102779239A/en
Pending legal-status Critical Current

Links

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a method for establishing a molecular simulation force field of a protein system. For establishing the force field, a tetrapeptide structure of amino acid is taken as a model, five representative structures are selected, and the configuration of the model molecules is as follows: firstly optimizing a gaseous 6-31G**base group, then conducting single-point correcting on gaseous 6-31G**, 6-311++G** and a cc-pVTZ base group, and conducting single-point energy calculation by using the configuration optimized by M052x/6-31G** through an MP2/cc-pVTZ method. A water solvent model adopted by liquid-state calculation is IEFPCM, and when a hole is constructed, a UAKS united-atom topology model is selected. The force field is obtained by the method, the accuracy is improved greatly, the calculated RMS (Root Mean Square) value is obviously smaller than that of other force fields, and approaches to the M052x result of a QM method, the calculated RMS-C value indicates that the new force field method overcomes the defect of conformation deviation in the original force field.

Description

A kind of method that is used to set up the protein system molecular simulation field of force
Technical field
The present invention relates to a kind of molecular simulation field of force that is used for protein system.
Background technology
Since molecular mechanics was born, different computer simulation methods grew up.As study the MD method of biological molecular dynamics, study the docking method that mutually combines between big molecule and the micromolecule, from the beginning the Rosetta method of predicted protein matter structure.Calculate analogy method for all these, the interaction in the correct descriptive study system between the atom is basis and core, and its accuracy is directly connected to the reliability of analog result.We can be through using Quantum chemical calculation (like molecular orbit in principle; Density functional) finding the solution schrodinger equation obtains interacting between the atom; But it is unpractical finding the solution schrodinger equation for a biomolecule system (comprising big molecule itself and surrounding environment), becomes effective means based on the experience field of force of molecular mechanics.There are AMBER, CHARMM in the field of force that is widely used in the biomacromolecule simulation now, the OPLS and the GROMOS field of force.In all these field of forces, interatomic interactional physical model basically identical is described, mainly comprise key flexible, bond angle is crooked, dihedral angle reverses, Van der Waals and electrostatic interaction energy (formula 1).The difference in the different field of forces is the resulting different parameters group of they application diverse ways.
E = Σ K r ( r - r eq ) 2 + Σ K θ ( θ - θ eq ) 2 + Σ V n 2 [ 1 + cos ( nΦ - γ ) ]
(formula 1)
+ &Sigma; i < j ( A ij R ij 2 - B ij R ij 6 ) + &Sigma; i < j q i q j R ij
The integrality of sampling and the precision in the field of force are two bottleneck problems in the molecular mechanics simulation.In the starting stage of development, because extremely limited computer speed, developing fast and effectively, sampling method becomes research emphasis at that time.Some in addition think that as long as give enough fast computing machine and sufficiently long time, we just can solve complicated protein folding problem.One of main task of the Bluegene engineering of IBM Corporation was exactly to be used for studying protein folding mechanism originally.Yet along with the raising greatly that improves (like parallel computation, Replica exchange MD) and computer speed of sampling integrality, the precision in the increasing computer simulation instance announcement field of force not enough.The field of force of widespread usage all has (Bias) shortcoming of conformation deflection now.Often tend to the right-hand screw conformation like the AMBER field of force, and OPLS and GROMOS preference beta sheet conformation.Current these field of force research groups attempt with the non-additive field of force or match main chain parameter again, though the system to being detected, these methods have appeared to have some to be improved, and to real protein folding research, effect is also unclear.Go back the real from the beginning examples of many successful of folding α/beta protein of neither one up to now.Develop the field of force with enough accuracy and become the common recognition and the task of top priority in this field, it directly influences popularity and the predictability ability that this method is used.
Summary of the invention
The foundation in the one cover experience field of force comprises parameter fitting and accuracy test.Usually, the conformational energy (ACE-XXX-NME, XXX=ALA, GLY, or PRO) and the electrostatic potential of the amino acid dipeptide model that obtained by quantum chemistry calculation through match of parameter obtain; The check of field of force accuracy adopts the conformational energy of other amino acid dipeptide model as evaluation criterion; There is weak point in this appraisal procedure; Mainly comprise: (I) ACE in the dipeptides model and NME end group do not occur in protein peptide chain; Protein calculates the amino acid whose parameter in center that only uses a model; Because the difference that end group brings, the parameter that can better describe the dipeptides model might not be fit to describe amino acid unit in the protein.(II) hydrogen bond is done most importantly each other in the protein, but there is not hydrogen bond in the dipeptides model, and the force field parameter of this model institute match can not be described the conformational energy difference that is caused by hydrogen bond action.(III) accuracy test is confined to several dipeptides or polypeptide model usually, and whether the field of force is fit to other amino acid and unclear.
At first we have carried out parametrization to AMBER polarization model, though the more traditional field of force, the polarization field of force is accurate, it than traditional field of force about slow three times, this has limited the application of this field of force in molecular simulation greatly widely.Consider bioprocess (mainly being protein folding) take place required time (usually in the microsecond rank) and present stage computing machine ability; Fast generalized Born (Generalized Born) the solvent model that gets up based on development in recent years; We carry out parametrization to the Amber protein field of force; Concrete work as follows: fit point charge again, what the quantum chemistry calculation of electrostatic field was selected for use is the M052x/6-31G** method; In original AMBER field of force; Use Fourier series to represent to reverse the energy of dihedral angle; For its coefficient of main chain through the numerical fitting quantum chemistry calculation obtain (the two-dimentional potential energy surface of φ/ψ), here we directly use quantum chemistry M052x/6-311++G** // (potential energy surface of φ/ψ) that the M052x/6-31G** method is calculated.Utilize the method for numerical interpolation to obtain energy and power, the error of having been brought when this method has overcome the Fourier series match.
The field of force that we will set up has numerous continuous medium solvent models available to continuous medium solvent model in the Gaussian program.Force field parameter carries out match according to the quantum chemistry calculation result of having considered solvent effect.Guarantee the rational while of amount of calculation, choosing optimum solvent model.In order to address this problem, we choose the molecule that is similar to amino acid side chain and solvation free energy experimental data is arranged as model, through with experimental data relatively, confirm optimum Quantum chemical calculation and solvent model.
In Gaussian 03, the Quantum chemical calculation of investigation comprises MP2, M05-2X, and B3LYP, the base group comprises 6-31G, 6-31G**, 6-311++G**, cc-pVTZ, aug-cc-pVTZ.Under gas phase condition, use the structure of its corresponding theory method and 6-31G** base group Optimization Model molecule, in the continuous medium solvent of investigating, do single-point then and calculate.The solvent model of investigating comprises IEFPCM, CPCM, DPCM, IPCM, SCIPCM and COSMO.At IEFPCM, in the calculating of CPCM and DPCM, further investigated the selection (comprising UAO, UAHF, UAKS, UFF, PAULING and BONDI radius) of atomic radius.In Gaussian 09, same quantization method and base group have been investigated, new SMD model among IEFPCM model that the solvent model is only considered to combine with above-mentioned six kinds of atomic radiuses and the Gaussian 09.Different with Gaussian 03 calculating is that molecular configuration employing 6-31G** base group is optimized in liquid phase and obtains among the Gaussian 09.
We will be to a certain specific GB solvent modelling field of force, and the GB model that is directed against undoubtedly should be optimum.The precision of GB model depends on three factors: model itself, GB atomic radius and atomic charge.Before not obtaining the new field of force, use the Amber field of force and use that joint mentions similar in appearance to the molecule of amino acid side chain as model, investigate of the influence of these factors to precision.GB method of investigating and relevant GB radius comprise: IGB1 (Bondi, Mbondi, Mbondi2), IGB2 (Bondi, Mbondi, Mbondi2), IGB5 (Bondi, Mbondi, Mbondi2), IGB7 (Bondi, Mbondi2) and IGB8 (Bondi, Mbondi2).When investigating the influencing of atomic charge, calculate the electrostatic potential method and comprise HF, M05-2X, B3LYP and MP2, the base group comprises 6-31G, 6-31G**, 6-311++G**, cc-pVTZ, aug-cc-pVTZ.Medium comprises gas phase and liquid phase (totally 80 covers), with the atomic charge of RESP method model of fit, calculates the solvation free energy of these molecules.
According to above research, in the method for MAP, we are with amino acid whose dipeptide chain structure (ACE-X-NME; One of X=20 amino acid) be model; Use 30 * 30 mesh-density, the energy on the lattice point is through calculating, and the energy of other arbitrfary point calculates through the method for interpolation.
When calculating two-dimentional potential energy surface, for ALA and GLY select quantum chemical methods be M052x/6-311++G** //M052x/6-31G**, the quantum chemical methods of ASP and GLU selection be M052x/6-311++G** //M052x/6-31+G*.The model molecule at first carries out configuration optimization under low level (M052x/6-31G** or M052x/6-31+G*) gas phase, use the method (M052x/6-311++G**) of higher level under liquid phase, to calculate single-point energy then.Quantizing to calculate the hydrosolvent model of selecting for use is IEFPCM; What atomic radius was used is UAKS united stom topological model; The hydrosolvent model that usefulness is calculated in the field of force is A.Onufriev; The GB model of D.Bashford and D.A.Case development, the keyword igb=5 among the corresponding A MBER9, the mbondi2 model that atomic radius adopts.In new field of force method, what ALA, GLY, ASP and GLU used is potential energy surface separately, and except that PRO, what other amino acid was used is the potential energy surface of ALA.
The molecular model that electric charge fits usefulness is amino acid whose dipeptides structure (ACE-XXX-NME, one of X=20 amino acid), has chosen two kinds of structures to compact with the expansion conformation: right-hand screw (α R, φ=-57.0, ψ=-47.0) and beta sheet (β; φ=-119.0; ψ=113.0), the dihedral angle of side chain obtains from the database that people such as Dunbrack come out, after the φ/ψ value of main chain is confirmed; The side chain value is selected the dihedral angle value of the maximum configuration of layout number when φ/ψ for use, and the calculating of molecular configuration optimization and electrostatic potential is carried out under the rank of M052x/6-31G**, gas phase.When fitting electric charge with the RESP method, each amino acid whose total amount of electric charge be restricted to 0 or ± 1, ACE and NME total amount of electric charge are restricted to 1, the atom that has identical chemical environment in the model is set has identical electric charge.
Embodiment
We are model with amino acid whose tetrapeptide chain structure (ACE-ALA-XXX-ALA-NME, one of XXX=20 amino acid), have chosen five kinds of representative structure (right-hand screw (α R, φ=-57.0, ψ=-47.0), left-hand screw (α L, φ=57.0, ψ=47.0), beta sheet (β, φ=-119.0, ψ=113.0), antiparallel β-lamination (β a, φ=-140.0, ψ=135.0), PPII (PPII, φ=-79.0, ψ=150.0), choosing of side chain dihedral angle is identical with choosing of dipeptides model side chain dihedral angle), be the accuracy that standard is checked the field of force with the MP2/cc-pVTZ result calculated.In order to contrast the relative energy that has calculated 5 kinds of configurations of this 20 amino acid simultaneously with the AMBER field of force method of different quantum chemistrys and other version.In the calculating of DFT method M052x and B3LYP, the model molecular configuration at first is optimized with 6-31G** base group at gaseous state, and what wherein ASP and GLU model optimization were used is 6-31+G* base group, during optimization fixes the dihedral angle of main chain and side chain.Use 6-31G** in liquid state then, 6-311++G** and cc-pVTZ base group are done single point correction, and the configuration of optimizing with M052x/6-31G** under the MP2/cc-pVTZ method has been done single-point energy calculating.The liquid hydrosolvent model that calculates employing is IEFPCM, and what select during the structure hole is UAKS united stom topological model.All quantum chemistry calculation are accomplished with Gaussian 03 software.What usefulness was calculated in the field of force is the AMBER9 software package, and there is AMBER94 in the field of force of selection, AMBER96, and AMBER99, AMBER99SB and AMBER03, that the solvent model is used is GB model (igb=5), what atomic radius adopted is the mbondi2 model.During optimization fix the dihedral angle of main chain and side chain.
When data processing, calculated two kinds of root-mean-square deviation: RMS, the relative energy of each amino acid whose five kinds of configuration calculates average in 20 amino acid then respectively; RMS-C amino acid whosely calculates with a kind of configuration 20, and is average to 5 kinds of configurations then.The computing formula of RMS is following:
RMS = &Sigma; i = 1 n ( error ) 2 n - - - ( 5 )
error=E ai-E bi+E c (6)
E c = &Sigma; i = 1 n ( E bi - E ai ) n - - - ( 7 )
Use the quantity (n=5) of configuration during wherein n representes to calculate; E BiAnd E AiRepresent respectively reference method with need that verification method calculates with respect to α REnergy (the α of configuration RThe energy of configuration is made as zero point); E cValue be for each amino acid whose RMS of make calculating be minimum value; The reason of introducing such numerical value is when calculating RMS; If choosing a kind of energy of configuration arbitrarily is zero point, the result who calculates so depends on different energy zero point, can find out from formula 6 and 7; After introducing such numerical value, energy zero point in fact usefulness be the average energy of all configurations.
The computing formula of RMS-C is following:
RMS - C = &Sigma; j = 1 m ( E aj - E bj + E c ) 2 m - - - ( 8 )
SIGN = sign ( &Sigma; j = 1 m E aj - E bj + E c m ) - - - ( 9 )
Wherein m is an amino acid whose number (m=20) in calculating; E BjAnd E AjThe relative energy of expression MP2 method and other method calculating; E cThe numerical value that calculates of value formula 6 when calculating RMS.The configuration energy that SIGN representes to calculate someway is with respect to the relativeness with reference to energy, and the relative energy of configuration represented to over-evaluate in positive sign, the relative energy of the configuration that symbolic representation is underestimated.Can find out that from the definition of RMS and RMS-C these two numerical value have reflected that respectively the whole bag of tricks is in the statistics between the amino acid and between the configuration.
Thisly improved the accuracy in the field of force greatly with the method for MAP, the RMS value of calculating is significantly less than other the field of force, and near the result of the M052x of QM method, the RMS-C value of calculating shows that new field of force method has overcome the shortcoming of conformation deflection in original field of force.The new field of force not only is greatly improved on accuracy, and improves easily, as long as design conditions allow to select to calculate more accurately the method for two-dimentional potential energy surface, can further improve the field of force very easily.

Claims (1)

1. method that is used to set up the protein system molecular simulation field of force; The foundation in the field of force is model with amino acid whose tetrapeptide chain structure in this method; Choose five kinds of representative structures, it is characterized in that at first the model molecular configuration being optimized with 6-31G** base group at gaseous state, used 6-31G** in liquid state then; 6-311++G material and cc-pVTZ base group are done single point correction, and the configuration of optimizing with M052x/6-31G** under the MP2/cc-pVTZ method has been done single-point energy calculating.The liquid hydrosolvent model that calculates employing is IEFPCM, and what select during the structure hole is UAKS united stom topological model.
CN 201110117428 2011-05-09 2011-05-09 Method for establishing molecular simulation force filed of protein system Pending CN102779239A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201110117428 CN102779239A (en) 2011-05-09 2011-05-09 Method for establishing molecular simulation force filed of protein system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201110117428 CN102779239A (en) 2011-05-09 2011-05-09 Method for establishing molecular simulation force filed of protein system

Publications (1)

Publication Number Publication Date
CN102779239A true CN102779239A (en) 2012-11-14

Family

ID=47124149

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201110117428 Pending CN102779239A (en) 2011-05-09 2011-05-09 Method for establishing molecular simulation force filed of protein system

Country Status (1)

Country Link
CN (1) CN102779239A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103729577A (en) * 2013-12-12 2014-04-16 深圳先进技术研究院 Protein thermodynamic analysis high-efficiency stochastic simulation method based on hybrid parallel mode
CN106372400A (en) * 2016-08-29 2017-02-01 深圳晶泰科技有限公司 Method and application for constructing polarized force fields and method and system for predicting drug crystal forms
CN108763852A (en) * 2018-05-09 2018-11-06 深圳晶泰科技有限公司 The automation conformational analysis method of class medicine organic molecule
CN110875085A (en) * 2018-09-03 2020-03-10 中国石油化工股份有限公司 Method for efficiently optimizing molecular structures in batches
WO2021103402A1 (en) * 2020-04-21 2021-06-03 深圳晶泰科技有限公司 Molecule force field fitting method
WO2021103491A1 (en) * 2020-06-15 2021-06-03 深圳晶泰科技有限公司 Method for testing and fitting force field dihedral angle parameters

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103729577A (en) * 2013-12-12 2014-04-16 深圳先进技术研究院 Protein thermodynamic analysis high-efficiency stochastic simulation method based on hybrid parallel mode
CN103729577B (en) * 2013-12-12 2017-08-22 深圳先进技术研究院 Protein thermodynamics based on hybrid parallel mode analyzes efficient Method of Stochastic
CN106372400A (en) * 2016-08-29 2017-02-01 深圳晶泰科技有限公司 Method and application for constructing polarized force fields and method and system for predicting drug crystal forms
CN106372400B (en) * 2016-08-29 2019-06-04 深圳晶泰科技有限公司 Construct the method and application, the method and system for predicting drug crystal forms of Polarized force field
CN108763852A (en) * 2018-05-09 2018-11-06 深圳晶泰科技有限公司 The automation conformational analysis method of class medicine organic molecule
CN108763852B (en) * 2018-05-09 2021-06-15 深圳晶泰科技有限公司 Automated conformational analysis method of drug-like organic molecules
CN110875085A (en) * 2018-09-03 2020-03-10 中国石油化工股份有限公司 Method for efficiently optimizing molecular structures in batches
CN110875085B (en) * 2018-09-03 2022-07-29 中国石油化工股份有限公司 Method for efficiently optimizing molecular structure in batches
WO2021103402A1 (en) * 2020-04-21 2021-06-03 深圳晶泰科技有限公司 Molecule force field fitting method
WO2021103491A1 (en) * 2020-06-15 2021-06-03 深圳晶泰科技有限公司 Method for testing and fitting force field dihedral angle parameters

Similar Documents

Publication Publication Date Title
CN102779239A (en) Method for establishing molecular simulation force filed of protein system
Li et al. Machine learning force field parameters from ab initio data
Li Manni et al. The OpenMolcas Web: A community-driven approach to advancing computational chemistry
Smith et al. Dynamics of myoglobin: comparison of simulation results with neutron scattering spectra.
Lopes et al. Polarizable force field for peptides and proteins based on the classical drude oscillator
DeVane et al. Transferable coarse grain nonbonded interaction model for amino acids
Maekawa et al. Comparative study of electrostatic models for the amide-I and-II modes: Linear and two-dimensional infrared spectra
Chaban et al. Anharmonic vibrational spectroscopy of glycine: Testing of ab initio and empirical potentials
Yang et al. Study of peptide conformation in terms of the ABEEM/MM method
Lemkul et al. Balancing the interactions of Mg2+ in aqueous solution and with nucleic acid moieties for a polarizable force field based on the classical Drude oscillator model
Chys et al. Random coordinate descent with spinor-matrices and geometric filters for efficient loop closure
CN106372400B (en) Construct the method and application, the method and system for predicting drug crystal forms of Polarized force field
Jakobsen et al. Systematic improvement of potential-derived atomic multipoles and redundancy of the electrostatic parameter space
Mohr et al. Complexity reduction in large quantum systems: fragment identification and population analysis via a local optimized minimal basis
Cai et al. On-the-fly numerical surface integration for finite-difference Poisson–Boltzmann methods
Cazals et al. Characterizing molecular flexibility by combining least root mean square deviation measures
Chen et al. Green's function for three‐dimensional elastic wave equation with a moving point source on the free surface with applications
Bouř et al. Simulation of the Raman Optical Activity of l-Alanyl− l-Alanine
Hantal et al. Surface affinity of alkali and halide ions in their aqueous solution: insight from intrinsic density analysis
Liao et al. Computation of coefficients of crack-tip asymptotic fields using the weak form quadrature element method
Bertram et al. Atomic refinement using orientational restraints from solid-state NMR
Mathur et al. First-principles-based machine learning models for phase behavior and transport properties of CO2
Marcellini et al. Accurate prediction of protein NMR spin relaxation by means of polarizable force fields. Application to strongly anisotropic rotational diffusion
Johnston et al. Development of classical molecule–surface interaction potentials based on density functional theory calculations: investigation of force field representability
Stoica Using molecular dynamics to simulate electronic spin resonance spectra of T4 lysozyme

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20121114