CN110197700B - Protein ATP docking method based on differential evolution - Google Patents


Info

Publication number
CN110197700B
Authority
CN
China
Prior art keywords
atp
protein
population
score
cross
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910302641.5A
Other languages
Chinese (zh)
Other versions
CN110197700A (en)
Inventor
饶亮
张贵军
刘俊
彭春祥
胡俊
周晓根
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT
Priority to CN201910302641.5A
Publication of CN110197700A
Application granted
Publication of CN110197700B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B15/00 ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
    • G16B15/30 Drug targeting using structural data; Docking or binding prediction
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Medical Informatics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Chemical & Material Sciences (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Databases & Information Systems (AREA)
  • Public Health (AREA)
  • Epidemiology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioethics (AREA)
  • Medicinal Chemistry (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A protein-ATP docking method based on differential evolution. First, the ATPbind server is used to predict the protein-ATP binding residues, which improves the prediction accuracy of the spatial structure of the complex. Next, through the design of the population individuals, the original protein-ATP structure prediction problem is converted into an optimization problem of searching for the best individual, which reduces the computational cost. Finally, a differential evolution algorithm searches for the best individual, which improves the prediction accuracy of the protein-ATP complex structure. The invention provides a differential-evolution-based protein-ATP docking method with low computational cost and high search efficiency.

Description

Protein ATP docking method based on differential evolution
Technical Field
The invention relates to the fields of bioinformatics, intelligent optimization and computer application, in particular to a protein ATP docking method based on differential evolution.
Background
As research on proteins has deepened, it has become clear that the binding of proteins to small molecules or ligands is ubiquitous. In particular, the binding of proteins to energy-carrying molecules occurs throughout biological processes, so it is necessary to study the characteristics and rules of protein-ligand binding. ATP (adenosine triphosphate) is an unstable, high-energy compound. Its hydrolysis releases a large amount of energy, which makes it the most direct energy source in organisms. In the cell it interconverts with ADP to store and release energy, ensuring the energy supply for the cell's vital activities. Many important physiological processes, such as cell cycle regulation, anabolism, signal transduction, and the transmission of genetic information, depend on the interaction and recognition between proteins and ligand molecules. Molecular docking methods are therefore important for studying the molecular mechanisms of life activities, predicting the structures of biomolecular complexes, screening targeted drugs, and similar tasks.
Classical thermodynamics holds that the complex structure formed by the interaction of protein and ligand molecules should be the conformation with the lowest binding free energy, and that rapid and accurate search for the conformation with the lowest energy is critical for protein-ligand molecule docking.
Therefore, molecular docking calculations require that the binding free energy be calculated as accurately as possible using mathematical models or functions, and efficient search algorithms are required to quickly find conformations with very low free energy. Conformation search in molecular docking is an extremely complex problem, and protein-ligand molecular docking requires searching for a conformation with low energy on one hand and searching for various possible situations in a short time on the other hand, so that a rapid and effective search algorithm is an important research field in molecular docking. The protein-ligand molecule docking conformation search method mainly comprises two categories of rapid exhaustive search and heuristic search. The region of ligand interaction may occur anywhere on the surface of the molecule and therefore often requires a global search, either by traversing various locations using a fast exhaustive search or by performing an approximate global search using heuristic algorithms.
Although a fast exhaustive algorithm can search the whole conformation space quickly, it also introduces many incorrect conformations, which makes the correct conformations harder to distinguish. A heuristic search algorithm applies random translation and rotation operations to the ligand molecule in the docking system, optimizes the resulting ligand conformations and accepts or rejects them according to the energy score, and finally finds the ligand conformation with the lowest energy. The heuristic Monte Carlo algorithm is a general search method that samples the ligand conformation space randomly and is not affected by the structure and distribution of that space. However, this method may need a long computation time to give a good solution. The RosettaDock program (Wang C, Schueler-Furman O, Baker D. Improved side-chain modeling for protein-protein docking [J]. Protein Science, 2005, 14(5): 1328-.
Therefore, the existing protein ATP molecular docking method has defects in calculation cost and search efficiency, and needs to be improved.
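The heuristic Monte Carlo search described in the Background can be illustrated with a minimal sketch. It is not taken from the patent or from RosettaDock: the 6-element pose layout, the function names, the step size, the temperature and the toy energy function are all assumptions chosen only to show the perturb, score, accept-or-reject cycle.

```python
import math
import random

def metropolis_search(energy, pose, steps=2000, step_size=0.5, temperature=1.0, seed=0):
    """Random-perturbation search over a 6-element pose (3 translations, 3 rotations)."""
    rng = random.Random(seed)
    current, e_current = list(pose), energy(pose)
    best, e_best = list(current), e_current
    for _ in range(steps):
        # Perturb every translation and rotation parameter by a small random amount.
        trial = [x + rng.uniform(-step_size, step_size) for x in current]
        e_trial = energy(trial)
        # Metropolis criterion: always accept improvements, sometimes accept worse poses.
        if e_trial <= e_current or rng.random() < math.exp(-(e_trial - e_current) / temperature):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = list(current), e_current
    return best, e_best

def toy_energy(pose):
    # Hypothetical stand-in for a docking energy: squared distance to an arbitrary target pose.
    target = (1.0, -2.0, 0.5, 0.3, 1.2, 2.0)
    return sum((p - t) ** 2 for p, t in zip(pose, target))

best_pose, best_energy = metropolis_search(toy_energy, [0.0] * 6)
```

Each step costs one energy evaluation, so the long computation time mentioned above comes directly from the number of steps needed before a good pose is sampled.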
Disclosure of Invention
In order to overcome the defects of the existing protein and ATP docking method in the aspects of calculation cost and prediction accuracy, the invention provides a protein ATP docking method based on a differential evolution algorithm, which is low in calculation cost and high in prediction accuracy.
The technical scheme adopted by the invention for solving the technical problems is as follows:
A protein-ATP docking method based on differential evolution, the method comprising the following steps:
1) input the structural information of the protein and of ATP, denoted R and A respectively;
2) for the input structure R, use the ATPbind server (https://zhanglab.ccmb.med.umich.edu/ATPbind/) to predict the protein-ATP binding residue sites, obtaining the n residues of the protein that bind ATP, denoted r_1, r_2, ..., r_n;
3) cluster the coordinates of the central carbon atoms Cα of r_1, r_2, ..., r_n to obtain a central point C_R, cluster the coordinates of the atoms in A to obtain a central point C_A, and move ATP so that the coordinates of C_A and C_R coincide;
4) cluster the coordinates of the atoms in A into three central points, called pseudo-atoms and denoted respectively as
Figure GDA0002893369400000021
and
Figure GDA0002893369400000022
5) for each ATP molecule A^(j), j = 1, 2, ..., N, in the PDB database, cluster the coordinates of all its atoms into three central points that correspond one-to-one to
Figure GDA0002893369400000031
and
Figure GDA0002893369400000032
namely
Figure GDA0002893369400000033
and
Figure GDA0002893369400000034
where N is the number of ATP molecules in the PDB database;
6) for each central point of each ATP in the PDB database
Figure GDA0002893369400000035
calculate its distance to the Cα atom of each residue of type T that it binds
Figure GDA0002893369400000036
where T is one of the amino acid residue types present in the PDB;
7) for any residue type T and the k-th central point C_k, k = 1, 2, 3, of all ATP molecules in the PDB database, calculate the average interaction distance, denoted D(C_k, T):
Figure GDA0002893369400000037
where
Figure GDA0002893369400000038
8) according to step 7), calculate the average interaction distance D(C_k, T) between every residue type T and the ATP central points C_k that it binds in the PDB database;
9) parameter setting: set the population size NP, the scaling factor F, the crossover probability CR and the maximum number of iterations G_max, and initialize the iteration counter G = 0;
10) population initialization: randomly generate an initial population P = {S_1, S_2, ..., S_i, ..., S_NP}, where S_i = (s_{i,1}, s_{i,2}, s_{i,3}, s_{i,4}, s_{i,5}, s_{i,6}) is the i-th individual of P and s_{i,1}, s_{i,2}, s_{i,3}, s_{i,4}, s_{i,5} and s_{i,6} are its 6 elements; the value range of s_{i,1}, s_{i,2} and s_{i,3} is
Figure GDA0002893369400000039
and the value range of s_{i,4}, s_{i,5} and s_{i,6} is 0 to 2π;
11) for each individual S_i in the population, dock the protein with ATP as follows and calculate the individual's score, score(S_i):
11.1) from the last three elements s_{i,4}, s_{i,5} and s_{i,6} of S_i, calculate a three-dimensional rotation matrix R:
Figure GDA00028933694000000310
11.2) rotate the coordinates of
Figure GDA0002893369400000041
according to the rotation matrix R to obtain the three-dimensional coordinates
Figure GDA0002893369400000042
11.3) using the first three elements s_{i,1}, s_{i,2} and s_{i,3} of S_i, translate the rotated coordinates
Figure GDA0002893369400000043
as follows to calculate the new three-dimensional coordinates C'_1, C'_2, C'_3:
Figure GDA0002893369400000044
where C'_k is the three-dimensional coordinate obtained after translation, for C'_k and
Figure GDA0002893369400000045
k = 1, 2, 3;
11.4) according to step 8), calculate score(S_i):
score(S_i) = ∑ |D_kT − D(C_k, T)|
where D_kT is the distance between C'_k and the Cα atom of a residue of type T, k = 1, 2, 3;
12) according to the differential evolution algorithm, process each individual S_i, i ∈ {1, 2, …, NP}, in the population P as follows:
12.1) randomly select three different individuals S_a, S_b and S_c from the current population P, where a ≠ b ≠ c ≠ i, and generate a mutant individual S_mutant according to the following equation:
S_mutant = S_a + F·(S_b − S_c)
12.2) copy the elements of S_i into the crossover individual S_cross; then randomly select one of the 6 elements of S_cross, s_{cross,j}, and replace it with the corresponding element s_{mutant,j} of S_mutant; finally, for the remaining elements of S_cross, use a random number R drawn between 0 and 1 to decide whether to replace the element with the corresponding element of S_mutant: if R is less than CR, replace it, otherwise do not;
12.3) according to step 11), calculate the scores score(S_cross) and score(S_i) of S_cross and S_i respectively;
12.4) if score(S_cross) < score(S_i), replace S_i in the population P with S_cross; otherwise S_i remains in the population P;
13) G = G + 1; if G > G_max, take the individual S_low with the lowest score in the current population P, rotate and translate all atomic coordinates in A according to the elements of S_low, and output the resulting coordinates as the final ligand position information; otherwise return to step 12).
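Steps 10) and 11) above can be sketched as follows. The rotation matrix and the translation formula are given only as formula images in the original, so this sketch makes several assumptions that should be read as such: the rotation is composed as R = Rz·Ry·Rx from the last three elements, the rotation is taken about a supplied centre point, the translation simply adds the first three elements, and the score sums over every pseudo-atom k and every predicted binding residue. All function and variable names are hypothetical.

```python
import math

def rotation_matrix(a, b, c):
    """Assumed convention: R = Rz(c) * Ry(b) * Rx(a), angles in radians."""
    ca, sa = math.cos(a), math.sin(a)
    cb, sb = math.cos(b), math.sin(b)
    cc, sc = math.cos(c), math.sin(c)
    return [
        [cc * cb, cc * sb * sa - sc * ca, cc * sb * ca + sc * sa],
        [sc * cb, sc * sb * sa + cc * ca, sc * sb * ca - cc * sa],
        [-sb, cb * sa, cb * ca],
    ]

def transform(points, individual, center):
    """Rotate points about `center`, then translate by the first three elements."""
    tx, ty, tz, a, b, c = individual
    rot = rotation_matrix(a, b, c)
    moved = []
    for p in points:
        d = [p[i] - center[i] for i in range(3)]
        r = [sum(rot[i][j] * d[j] for j in range(3)) for i in range(3)]
        moved.append((r[0] + center[0] + tx, r[1] + center[1] + ty, r[2] + center[2] + tz))
    return moved

def score(individual, pseudo_atoms, center, residues, mean_dist):
    """Sum of |D_kT - D(C_k, T)| over pseudo-atoms k and predicted binding residues."""
    moved = transform(pseudo_atoms, individual, center)
    total = 0.0
    for k, c_k in enumerate(moved):
        for res_type, ca_xyz in residues:
            total += abs(math.dist(c_k, ca_xyz) - mean_dist[(k, res_type)])
    return total

# Toy data (made up): three pseudo-atoms and two binding residues with their C-alpha coordinates.
pseudo_atoms = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
center = (0.0, 0.0, 0.0)
residues = [("LYS", (5.0, 0.0, 0.0)), ("GLY", (0.0, 5.0, 0.0))]
# Toy distance table built so that the unmoved pose scores exactly zero.
mean_dist = {(k, t): math.dist(p, xyz) for k, p in enumerate(pseudo_atoms) for t, xyz in residues}

identity = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
shifted = (1.0, 0.0, 0.0, 0.0, 0.0, 0.0)
print(score(identity, pseudo_atoms, center, residues, mean_dist))
print(score(shifted, pseudo_atoms, center, residues, mean_dist))
```

Because only three pseudo-atoms are transformed per evaluation rather than every ATP atom, each score call stays cheap, which is consistent with the stated aim of reducing computational cost.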
The technical conception of the invention is as follows. First, the ATPbind server is used to predict the protein-ATP binding residues, which improves the prediction accuracy of the spatial structure of the complex. Next, through the design of the population individuals, the original protein-ATP structure prediction problem is converted into an optimization problem of searching for the best individual, which reduces the computational cost. Finally, a differential evolution algorithm searches for the best individual, which improves the prediction accuracy of the protein-ATP complex structure. The invention provides a differential-evolution-based protein-ATP docking method with low computational cost and high search efficiency.
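The differential evolution search over 6-element individuals (mutation, crossover, greedy selection and the G > G_max stopping rule) can be sketched as below. The patent does not state values for NP, F, CR or G_max, and the translation range is given only as a formula image, so the defaults, the ±5 translation bounds and the toy score function here are assumptions for illustration only. The sketch also does no bound handling after mutation, since the source does not describe any.

```python
import math
import random

def differential_evolution(score, bounds, np_size=30, f=0.5, cr=0.5, g_max=100, seed=0):
    """Minimise `score` over vectors whose initial values are drawn within `bounds`."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(np_size)]
    scores = [score(s) for s in pop]
    g = 0
    while True:
        for i in range(np_size):
            # Mutation: three distinct individuals, all different from i.
            a, b, c = rng.sample([j for j in range(np_size) if j != i], 3)
            mutant = [pop[a][d] + f * (pop[b][d] - pop[c][d]) for d in range(dim)]
            # Crossover: one randomly chosen element always comes from the mutant,
            # every other element comes from it with probability CR.
            cross = list(pop[i])
            forced = rng.randrange(dim)
            for d in range(dim):
                if d == forced or rng.random() < cr:
                    cross[d] = mutant[d]
            # Selection: keep the trial only if it scores strictly lower.
            s_cross = score(cross)
            if s_cross < scores[i]:
                pop[i], scores[i] = cross, s_cross
        g += 1
        if g > g_max:
            break
    best = min(range(np_size), key=scores.__getitem__)
    return pop[best], scores[best]

def toy_score(s):
    # Hypothetical stand-in for score(S_i): squared distance to an arbitrary target pose.
    target = (1.0, -2.0, 0.5, 0.3, 1.2, 2.0)
    return sum((x - t) ** 2 for x, t in zip(s, target))

# Three translations (assumed range) followed by three angles in [0, 2*pi].
bounds = [(-5.0, 5.0)] * 3 + [(0.0, 2.0 * math.pi)] * 3
best, best_score = differential_evolution(toy_score, bounds)
```

In a real run, `toy_score` would be replaced by the distance-based score of step 11), and the returned individual would be applied to every atom of A to give the final ligand position.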
The beneficial effects of the invention are as follows. On one hand, the ATPbind server is used to predict the protein-ATP binding residues, which improves the prediction accuracy of the spatial structure of the protein-ATP complex. On the other hand, the protein-ATP docking prediction problem is converted into an optimization problem of selecting the best individual, and the best individual is searched for with a differential evolution algorithm, which improves the efficiency and accuracy of protein-ATP docking prediction.
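The preprocessing that feeds the search, namely placing ATP on the predicted binding site (step 3) and building the table of average distances D(C_k, T) (steps 5 to 8), can be sketched as below. The exact formulas appear only as images in the original, so two assumptions are made here: the single "cluster centre" of a set of atoms is taken to be its arithmetic mean, and D(C_k, T) is taken to be a plain arithmetic mean of the observed distances. Function names and the toy numbers are hypothetical.

```python
from collections import defaultdict

def centroid(points):
    """Arithmetic mean of a list of 3D points (assumed meaning of a single cluster centre)."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def align_centroid(points, target):
    """Translate points so that their centroid coincides with `target`."""
    c = centroid(points)
    shift = tuple(target[i] - c[i] for i in range(3))
    return [tuple(p[i] + shift[i] for i in range(3)) for p in points]

def mean_distance_table(observations):
    """Average observed distances per (pseudo-atom index k, residue type T)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for k, res_type, distance in observations:
        sums[(k, res_type)] += distance
        counts[(k, res_type)] += 1
    return {key: sums[key] / counts[key] for key in sums}

# Toy data (made up): C-alpha coordinates of predicted binding residues and two ATP atoms.
binding_ca = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 3.0, 0.0)]
atp_atoms = [(10.0, 10.0, 10.0), (12.0, 10.0, 10.0)]
aligned = align_centroid(atp_atoms, centroid(binding_ca))

# Toy distance observations as (k, residue type, distance) triples.
table = mean_distance_table([(0, "LYS", 4.0), (0, "LYS", 6.0), (1, "GLY", 3.0)])
```

The table depends only on known protein-ATP complexes, so it can be computed once and reused for every docking run.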
Drawings
FIG. 1 is a schematic diagram of a protein ATP docking method based on differential evolution.
FIG. 2 shows the three-dimensional structure of the complex predicted for protein 1a0i and ATP by the differential-evolution-based protein-ATP docking method.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Referring to FIG. 1 and FIG. 2, the differential-evolution-based protein-ATP docking method comprises steps 1) to 13) exactly as set out in the Disclosure of Invention above.
In this embodiment, the method is applied to predict the three-dimensional structure of the complex formed by docking protein 1a0i with ATP, following steps 1) to 13) above.
The three-dimensional structure of the complex of protein 1a0i and ATP obtained by the above method is shown in FIG. 2.
The above describes the prediction result for protein 1a0i and ATP as an example of the present invention. It is not intended to limit the scope of the invention, and various modifications and improvements can be made without departing from that scope.

Claims (1)

1. A protein ATP docking method based on differential evolution is characterized in that: the butt joint method comprises the following steps:
1) inputting structural information of protein and ATP, and respectively marking as R and A;
2) for the input structure information R, predicting the residue site information bound by the protein-ATP by using an ATPbind server to obtain n residues bound by the protein and the ATP, and respectively marking the n residues as R1,r2,...,rn
3) According to r1,r2,...,rnCentral carbon atom C ofαClustering coordinate information to obtain a central point CRClustering a central point C according to the coordinate information of each atom in AAMoving ATP to make CAAnd CRThe coordinates of the two points coincide;
4) clustering into three central points according to the coordinate information of each atom in A, wherein the three central points are called pseudo-atoms and are respectively expressed as
Figure FDA0002893401020000011
And
Figure FDA0002893401020000012
5) for each ATP molecule A in the PDB database(j)J 1, 2.. times.n, which is clustered according to coordinate information of all atoms thereof
Figure FDA0002893401020000013
And
Figure FDA0002893401020000014
three central points in one-to-one correspondence
Figure FDA0002893401020000015
And
Figure FDA0002893401020000016
wherein N is the number of ATP in the PDB database;
6) for each central point of each ATP in the PDB database
Figure FDA0002893401020000017
Calculating C of the type T residue to which it bindsαDistance between atoms
Figure FDA0002893401020000018
Wherein T is one of the types of amino acid residues present in PDB;
7) calculating the kth central atom C of all ATP molecules in the database of arbitrary residue types T and PDBkAnd k is 1,2,3, the average distance of interaction, denoted as D (C)k,T):
Figure FDA0002893401020000019
Wherein
Figure FDA00028934010200000110
8) According to step 7), respectively calculating the ATP central points C bound by all T-type residues in the PDB databasekAverage distance of interaction D (C)k,T);
9) Setting parameters: is provided withPopulation size NP, scaling factor F, crossover probability CR, maximum number of iterations GmaxInitializing the iteration times G to be 0;
10) population initialization: randomly generating an initialization population P ═ S1,S2,...,Si,...,SNP},Si=(si,1,si,2,si,3,si,4,si,5,si,6) Is the i-th individual of the population P, si,1、si,2、si,3、si,4、si,5And si,6Is SiOf 6 elements of (a), wherein si,1、si,2And si,3Is in the value range of
Figure FDA0002893401020000021
si,4、si,5And si,6The value range of (a) is 0 to 2 pi;
11) for each individual in the population SiThe protein was docked with ATP according to the following manner and the score (S) was calculated for that individuali):
11.1) according to SiThe last three elements s ini,4、si,5And si,6And calculating a three-dimensional space rotation matrix R:
Figure FDA0002893401020000022
11.2) will
Figure FDA0002893401020000023
The coordinates are rotated according to the rotation matrix R to respectively obtain three-dimensional coordinates
Figure FDA0002893401020000024
11.3) according to SiThe first three elements s ini,1、si,2、si,3Will rotate the obtained coordinates
Figure FDA0002893401020000025
The following translation process is carried out, and new three-dimensional coordinates C 'are calculated'1,C'2,C'3
Figure FDA0002893401020000026
Wherein C'kIs a three-dimensional coordinate obtained after translation, for C'kAnd
Figure FDA0002893401020000027
11.4) according to step 8), calculate the score (S)i):
score(Si)=∑|DkT-D(Ck,T)|
Wherein DkTIs C'kWith a residue C of residue type TαDistance of atoms, k ═ 1,2, 3;
12) according to a differential evolution algorithm, for each individual S in the population PiI ∈ {1,2, …, NP } is processed as follows:
12.1) randomly selecting three different individuals S from the Current population Pa、SbAnd ScWherein a ≠ b ≠ c ≠ i,
a mutant S is generated according to the following equationmutant
Smutant=Sa+F·(Sb-Sc)
12.2) copy the element information of S_i into the crossover individual S_cross; then randomly select one element s_{cross,j} from the 6 elements of S_cross and replace it with the corresponding element s_{mutant,j} of S_mutant; finally, for the elements of S_cross, use a randomly generated number R between 0 and 1 to control whether each is replaced by the corresponding element of S_mutant: if R < CR, replace it; otherwise, do not replace it;
12.3) according to step 11), calculate the scores score(S_cross) and score(S_i) corresponding to S_cross and S_i, respectively;
12.4) if score(S_cross) < score(S_i), replace S_i in the population P with S_cross; otherwise, S_i remains in the population P;
13) g = g + 1; if g > G_max, take the individual S_low with the lowest score in the current population P, rotate and translate all atomic coordinates in A according to the element information in S_low, and output the resulting coordinates as the final ligand position information; otherwise, return to step 12).
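The loop described in steps 11) to 13) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patented implementation. The rotation matrix R and the translation formula appear only as figures in the claims, so the sketch assumes a rotation about the x, y, and z axes in that order and a translation by simple vector addition. The function names (rotation_matrix, transform, de_generation, dock), the default values of F, CR, NP, and G_max, the translation sampling range, and the caller-supplied score_fn are all hypothetical; score_fn stands in for the score of step 11.4), whose reference distances come from step 8).

```python
import math
import random


def rotation_matrix(ax, ay, az):
    """3x3 rotation: about x, then y, then z (an ASSUMED convention;
    the claimed matrix R is given only as a figure)."""
    cx, sx = math.cos(ax), math.sin(ax)
    cy, sy = math.cos(ay), math.sin(ay)
    cz, sz = math.cos(az), math.sin(az)
    # Product Rz * Ry * Rx written out explicitly.
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]


def transform(coords, s):
    """Steps 11.1-11.3: rotate each point by R(s[3:6]), then add s[0:3].
    Translation by addition is an assumption."""
    r = rotation_matrix(s[3], s[4], s[5])
    moved = []
    for p in coords:
        rotated = [sum(r[i][j] * p[j] for j in range(3)) for i in range(3)]
        moved.append([rotated[i] + s[i] for i in range(3)])
    return moved


def de_generation(population, score_fn, f=0.5, cr=0.5, rng=random):
    """Step 12 for every individual; the population is updated in place.
    Requires at least 4 individuals so that a, b, c, i can all differ."""
    n = len(population)
    for i in range(n):
        # 12.1) three distinct individuals, none equal to i.
        a, b, c = rng.sample([k for k in range(n) if k != i], 3)
        sa, sb, sc = population[a], population[b], population[c]
        mutant = [sa[d] + f * (sb[d] - sc[d]) for d in range(6)]
        # 12.2) copy S_i, force one element from the mutant, then
        # replace further elements with probability CR.
        cross = list(population[i])
        forced = rng.randrange(6)
        for d in range(6):
            if d == forced or rng.random() < cr:
                cross[d] = mutant[d]
        # 12.3-12.4) keep whichever of S_cross and S_i scores lower.
        if score_fn(cross) < score_fn(population[i]):
            population[i] = cross


def dock(coords, score_fn, np_size=20, g_max=50, seed=0):
    """Steps 11-13: evolve for g_max generations, return the best pose."""
    rng = random.Random(seed)
    population = [
        [rng.uniform(-5.0, 5.0) for _ in range(3)]  # assumed range
        + [rng.uniform(0.0, 2.0 * math.pi) for _ in range(3)]
        for _ in range(np_size)
    ]
    for _ in range(g_max):
        de_generation(population, score_fn, rng=rng)
    best = min(population, key=score_fn)
    return best, transform(coords, best)


if __name__ == "__main__":
    # Toy problem: recover a known pose of three points.
    ligand = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    target = transform(ligand, [1.0, -2.0, 0.5, 0.3, 1.1, 2.0])

    def toy_score(s):
        placed = transform(ligand, s)
        return sum(math.dist(p, q) for p, q in zip(placed, target))

    best, placed = dock(ligand, toy_score)
    print("best toy score:", toy_score(best))
```

The sketch applies no bounds handling after mutation; the angles may leave the 0 to 2π range, which is harmless here because sine and cosine are periodic. Selection is greedy, so the score of each individual never increases from one generation to the next.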
CN201910302641.5A 2019-04-16 2019-04-16 Protein ATP docking method based on differential evolution Active CN110197700B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910302641.5A CN110197700B (en) 2019-04-16 2019-04-16 Protein ATP docking method based on differential evolution

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910302641.5A CN110197700B (en) 2019-04-16 2019-04-16 Protein ATP docking method based on differential evolution

Publications (2)

Publication Number Publication Date
CN110197700A CN110197700A (en) 2019-09-03
CN110197700B CN110197700B (en) 2021-04-06

Family

ID=67751921

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910302641.5A Active CN110197700B (en) 2019-04-16 2019-04-16 Protein ATP docking method based on differential evolution

Country Status (1)

Country Link
CN (1) CN110197700B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111081312B (en) * 2019-12-04 2021-10-29 浙江工业大学 Ligand binding residue prediction method based on multi-sequence association information

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006094230A2 (en) * 2005-03-03 2006-09-08 The Burnham Institute For Medical Research Screening methods for protein kinase b inhibitors employing virtual docking approaches and compounds and compositions discovered thereby
CN104992079A (en) * 2015-06-29 2015-10-21 南京理工大学 Sampling learning based protein-ligand binding site prediction method
CN105354440A (en) * 2015-08-12 2016-02-24 中国科学技术大学 Method for extracting protein-micromolecule interaction module
CN106096328A (en) * 2016-04-26 2016-11-09 浙江工业大学 A kind of double-deck differential evolution Advances in protein structure prediction based on locally Lipschitz function supporting surface
CN107273714A (en) * 2017-06-07 2017-10-20 南京理工大学 The ATP binding site estimation methods of conjugated protein sequence and structural information
CN109255427A (en) * 2018-08-30 2019-01-22 无锡城市职业技术学院 A kind of molecular docking calculation method based on quantum particle swarm optimization
CN109524058A (en) * 2018-11-07 2019-03-26 浙江工业大学 A kind of protein dimer Structure Prediction Methods based on differential evolution

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
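Clustering-based under-sampling and its application in protein-nucleotide binding site prediction; 石大宏; 《计算机与数字工程》; 2015-12-31; Vol. 43, No. 6; pp. 972-975 *
A survey of bio-computational methods for identifying protein-ligand binding residues; 於东军 et al.; 《数据采集与处理》; 2018-12-31; Vol. 33, No. 2; pp. 195-206 *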

Also Published As

Publication number Publication date
CN110197700A (en) 2019-09-03

Similar Documents

Publication Publication Date Title
Cruz et al. RNA-Puzzles: a CASP-like evaluation of RNA three-dimensional structure prediction
Le et al. Prediction of FMN binding sites in electron transport chains based on 2-D CNN and PSSM profiles
Zhong et al. Improved K-means clustering algorithm for exploring local protein sequence motifs representing common structural property
CN107609342B (en) Protein conformation search method based on secondary structure space distance constraint
Chen et al. Predicting the types of metabolic pathway of compounds using molecular fragments and sequential minimal optimization
Li et al. Integrating end-to-end learning with deep geometrical potentials for ab initio RNA structure prediction
CN106096328B (en) A kind of double-deck differential evolution Advances in protein structure prediction based on locally Lipschitz function supporting surface
CN106503484A (en) A kind of multistage differential evolution Advances in protein structure prediction that is estimated based on abstract convex
CN106650305B (en) A kind of more tactful group Advances in protein structure prediction based on local abstract convex supporting surface
CN109360596B (en) Protein conformation space optimization method based on differential evolution local disturbance
Pearce et al. Fast and accurate Ab Initio Protein structure prediction using deep learning potentials
CN105975806A (en) Protein structure prediction method based on distance constraint copy exchange
CN110197700B (en) Protein ATP docking method based on differential evolution
CN109872770B (en) Variable strategy protein structure prediction method combined with displacement degree evaluation
CN109360597B (en) Group protein structure prediction method based on global and local strategy cooperation
Zhang et al. Two-stage distance feature-based optimization algorithm for de novo protein structure prediction
Mirceva et al. HMM based approach for classifying protein structures
CN110600076B (en) Protein ATP docking method based on distance and angle information
Tan et al. RDesign: Hierarchical Data-efficient Representation Learning for Tertiary Structure-based RNA Design
Ispano et al. An overview of protein function prediction methods: a deep learning perspective
CN109448786B (en) Method for predicting protein structure by lower bound estimation dynamic strategy
CN109448785B (en) Protein structure prediction method for enhancing Loop region structure by using Laplace graph
CN109147867B (en) Group protein structure prediction method based on dynamic segment length
CN109411013B (en) Group protein structure prediction method based on individual specific variation strategy
CN110689929B (en) Protein ATP docking method based on contact probability assistance

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant