CN110197700B - Protein ATP docking method based on differential evolution - Google Patents
- Publication number
- CN110197700B
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B15/00—ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
- G16B15/30—Drug targeting using structural data; Docking or binding prediction
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
Abstract
A protein ATP docking method based on differential evolution. First, the ATPbind server is used to predict protein-ATP binding residue information, which improves the accuracy of the predicted spatial structure of the complex. Then, through the design of the population individuals, the original protein-ATP structure prediction problem is converted into an optimization problem of searching for the optimal individual, which reduces the computational cost. Finally, a differential evolution algorithm is used to search for the optimal individual, which improves the prediction accuracy of the protein-ATP complex structure. The invention provides a protein ATP docking method based on differential evolution with low computational cost and high search efficiency.
Description
Technical Field
The invention relates to the fields of bioinformatics, intelligent optimization and computer application, in particular to a protein ATP docking method based on differential evolution.
Background
As protein research has deepened, it has become clear that the binding of proteins to small molecules or ligands is ubiquitous. In particular, the binding of proteins to energy molecules occurs widely across life processes, so it is necessary to study the characteristics and rules of protein-ligand binding. ATP (adenosine triphosphate) is an unstable, high-energy compound. Its hydrolysis releases a large amount of energy, which makes it the most direct energy source in organisms. In the cell, it interconverts with ADP to store and release energy, thereby ensuring the energy supply for the cell's vital activities. Many important physiological processes, such as cell cycle regulation, anabolism, signal transduction, and the transmission of genetic information, depend on the interaction and recognition between proteins and ligand molecules. Molecular docking methods are therefore of great significance for studying the molecular mechanisms of life activities, predicting the structures of biomolecular complexes, and screening targeted drugs.
Classical thermodynamics holds that the complex structure formed by the interaction of protein and ligand molecules should be the conformation with the lowest binding free energy, and that rapid and accurate search for the conformation with the lowest energy is critical for protein-ligand molecule docking.
Therefore, molecular docking calculations require that the binding free energy be calculated as accurately as possible using mathematical models or functions, and efficient search algorithms are required to quickly find conformations with very low free energy. Conformation search in molecular docking is an extremely complex problem, and protein-ligand molecular docking requires searching for a conformation with low energy on one hand and searching for various possible situations in a short time on the other hand, so that a rapid and effective search algorithm is an important research field in molecular docking. The protein-ligand molecule docking conformation search method mainly comprises two categories of rapid exhaustive search and heuristic search. The region of ligand interaction may occur anywhere on the surface of the molecule and therefore often requires a global search, either by traversing various locations using a fast exhaustive search or by performing an approximate global search using heuristic algorithms.
Although a fast exhaustive algorithm can search the whole conformational space quickly, it also introduces many wrong conformations, which makes the correct conformations harder to distinguish. A heuristic search algorithm applies random translation and rotation operations to the ligand molecule in the docking system, optimizes the resulting ligand conformations and accepts or rejects them according to the energy score, and finally finds the ligand conformation with the lowest energy. The heuristic Monte Carlo algorithm is a general search method that can sample randomly in the ligand conformational space and is not affected by the structure and distribution of that space. However, this method may require a long calculation time to give a good solution. The RosettaDock program (Wang C, Schueler-Furman O, Baker D. Improved side-chain modeling for protein-protein docking [J]. Protein Science, 2005, 14(5): 1328-.
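The heuristic search described above, in which random moves of the ligand pose are accepted or rejected according to an energy score, can be sketched as a Metropolis-style Monte Carlo loop. This is only a minimal illustration and not the implementation of any particular docking program: the six-element pose vector, the function `toy_energy`, the step size, and the temperature are all assumptions made for the example.

```python
import math
import random


def monte_carlo_search(energy, start, steps=2000, step_size=0.5, temperature=1.0, seed=0):
    """Metropolis-style random search over a pose vector (illustrative only)."""
    rng = random.Random(seed)
    current = list(start)
    current_e = energy(current)
    best, best_e = list(current), current_e
    for _ in range(steps):
        # Randomly perturb every translation/rotation parameter of the pose.
        trial = [x + rng.uniform(-step_size, step_size) for x in current]
        trial_e = energy(trial)
        delta = trial_e - current_e
        # Always accept improvements; accept worse poses with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = trial, trial_e
            if current_e < best_e:
                best, best_e = list(current), current_e
    return best, best_e


def toy_energy(pose):
    # Hypothetical stand-in for a docking energy: squared distance to a fixed optimum.
    target = [1.0, -2.0, 0.5, 0.0, 0.0, 0.0]
    return sum((p - t) ** 2 for p, t in zip(pose, target))


start_pose = [0.0] * 6
best_pose, best_energy = monte_carlo_search(toy_energy, start_pose)
```

Because every trial needs a fresh energy evaluation and many trials are rejected, the run time grows with the number of steps, which is the cost the paragraph above refers to.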
Therefore, the existing protein ATP molecular docking method has defects in calculation cost and search efficiency, and needs to be improved.
Disclosure of Invention
In order to overcome the defects of the existing protein and ATP docking method in the aspects of calculation cost and prediction accuracy, the invention provides a protein ATP docking method based on a differential evolution algorithm, which is low in calculation cost and high in prediction accuracy.
The technical scheme adopted by the invention for solving the technical problems is as follows:
A protein ATP docking method based on differential evolution, the method comprising the following steps:
1) input the structural information of the protein and of ATP, denoted R and A respectively;
2) for the input structure R, use the ATPbind server (https://zhanglab.ccmb.med.umich.edu/ATPbind/) to predict the protein-ATP binding residue sites, obtaining the n residues of the protein that bind ATP, denoted $r_1, r_2, \ldots, r_n$;
3) cluster the coordinates of the central carbon atoms $C_\alpha$ of $r_1, r_2, \ldots, r_n$ to obtain a central point $C_R$; cluster the coordinates of the atoms in A to obtain a central point $C_A$; move ATP so that the coordinates of $C_A$ and $C_R$ coincide;
4) cluster the coordinates of the atoms in A into three central points, called pseudo-atoms and denoted $C_1$, $C_2$, and $C_3$;
5) for each ATP molecule $A^{(j)}$, $j = 1, 2, \ldots, N$, in the PDB database, cluster the coordinates of all its atoms into three central points $C_1^{(j)}$, $C_2^{(j)}$, and $C_3^{(j)}$ in one-to-one correspondence with $C_1$, $C_2$, and $C_3$, where N is the number of ATP molecules in the PDB database;
6) for each central point $C_k^{(j)}$ of each ATP molecule in the PDB database, calculate its distance to the $C_\alpha$ atom of each type-T residue to which it binds, where T is one of the amino acid residue types present in the PDB;
7) for any residue type T, calculate the average interaction distance between T and the k-th central point $C_k$, $k = 1, 2, 3$, over all ATP molecules in the PDB database, denoted $D(C_k, T)$ [defining equation and its "wherein" clause not preserved in the source text];
8) according to step 7), calculate the average interaction distance $D(C_k, T)$ between every residue type T in the PDB database and the ATP central points $C_k$ it binds;
9) parameter setting: set the population size NP, the scaling factor F, the crossover probability CR, and the maximum number of iterations $G_{max}$, and initialize the iteration counter G = 0;
10) population initialization: randomly generate an initial population $P = \{S_1, S_2, \ldots, S_i, \ldots, S_{NP}\}$, where $S_i = (s_{i,1}, s_{i,2}, s_{i,3}, s_{i,4}, s_{i,5}, s_{i,6})$ is the i-th individual of P and $s_{i,1}, \ldots, s_{i,6}$ are its 6 elements; $s_{i,1}$, $s_{i,2}$, and $s_{i,3}$ take values in [range not preserved in the source text], and $s_{i,4}$, $s_{i,5}$, and $s_{i,6}$ take values from 0 to $2\pi$;
11) for each individual $S_i$ in the population, dock the protein with ATP as follows and calculate the individual's score, $\mathrm{score}(S_i)$:
11.1) from the last three elements $s_{i,4}$, $s_{i,5}$, and $s_{i,6}$ of $S_i$, calculate a three-dimensional rotation matrix R [matrix not preserved in the source text];
11.2) rotate the coordinates of $C_1$, $C_2$, and $C_3$ according to the rotation matrix R to obtain the rotated three-dimensional coordinates;
11.3) using the first three elements $s_{i,1}$, $s_{i,2}$, and $s_{i,3}$ of $S_i$, translate the rotated coordinates to obtain the new three-dimensional coordinates $C'_1$, $C'_2$, $C'_3$ [translation formula not preserved in the source text];
11.4) according to step 8), calculate $\mathrm{score}(S_i)$:
$\mathrm{score}(S_i) = \sum |D_{kT} - D(C_k, T)|$
where $D_{kT}$ is the distance between $C'_k$ and the $C_\alpha$ atom of a residue of type T, $k = 1, 2, 3$;
12) according to the differential evolution algorithm, process each individual $S_i$, $i \in \{1, 2, \ldots, NP\}$, in the population P as follows:
12.1) randomly select three different individuals $S_a$, $S_b$, and $S_c$ from the current population P, where $a \neq b \neq c \neq i$, and generate a mutant individual $S_{mutant}$ according to:
$S_{mutant} = S_a + F \cdot (S_b - S_c)$
12.2) copy the elements of $S_i$ into the crossover individual $S_{cross}$; then randomly select one of the 6 elements of $S_{cross}$, $s_{cross,j}$, and replace it with the corresponding element $s_{mutant,j}$ of $S_{mutant}$; finally, for each element of $S_{cross}$, generate a random number R between 0 and 1 to decide whether to replace it with the corresponding element of $S_{mutant}$: if R < CR, replace it; otherwise, do not;
12.3) according to step 11), calculate the scores $\mathrm{score}(S_{cross})$ and $\mathrm{score}(S_i)$ of $S_{cross}$ and $S_i$;
12.4) if $\mathrm{score}(S_{cross}) < \mathrm{score}(S_i)$, replace $S_i$ in the population P with $S_{cross}$; otherwise, $S_i$ remains in P;
13) set G = G + 1; if $G > G_{max}$, take the individual $S_{low}$ with the lowest score in the current population P, rotate and translate all atomic coordinates in A according to the elements of $S_{low}$, and output the resulting coordinates as the final ligand position information; otherwise, return to step 12).
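Steps 9) to 13) can be sketched as the following loop. It is a minimal illustration under stated assumptions rather than the patented implementation: the scoring function is passed in as a generic callable, `sphere` is a hypothetical stand-in for score(S_i), the parameter values and the translation bounds of -5 to 5 are assumed because the source does not preserve them, and mutants are not clipped back into the bounds because the steps above do not mention it.

```python
import math
import random


def differential_evolution(score, bounds, NP=30, F=0.5, CR=0.5, G_max=100, seed=0):
    """Sketch of steps 9) to 13); returns the lowest-scoring individual and its score."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Step 10: random initial population within the given bounds.
    P = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(NP)]
    scores = [score(S) for S in P]
    G = 0  # Step 9: iteration counter.
    while True:
        for i in range(NP):
            # Step 12.1: three distinct individuals, all different from i.
            a, b, c = rng.sample([j for j in range(NP) if j != i], 3)
            S_mutant = [P[a][d] + F * (P[b][d] - P[c][d]) for d in range(dim)]
            # Step 12.2: copy S_i, force one mutant element, then cross over with probability CR.
            S_cross = list(P[i])
            j_rand = rng.randrange(dim)
            S_cross[j_rand] = S_mutant[j_rand]
            for d in range(dim):
                if rng.random() < CR:
                    S_cross[d] = S_mutant[d]
            # Steps 12.3 and 12.4: keep whichever of S_cross and S_i scores lower.
            s_cross = score(S_cross)
            if s_cross < scores[i]:
                P[i], scores[i] = S_cross, s_cross
        # Step 13: advance the counter and stop once G exceeds G_max.
        G += 1
        if G > G_max:
            break
    low = min(range(NP), key=scores.__getitem__)
    return P[low], scores[low]


def sphere(S):
    # Hypothetical stand-in for score(S_i); the real score is defined in step 11).
    return sum(x * x for x in S)


# Three translations (assumed range) followed by three rotation angles in [0, 2*pi].
bounds = [(-5.0, 5.0)] * 3 + [(0.0, 2 * math.pi)] * 3
best, best_score = differential_evolution(sphere, bounds, seed=1)
```

The population is updated in place, so an improved individual can already serve as $S_a$, $S_b$, or $S_c$ for later individuals in the same generation, which matches the wording of step 12.4).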
The technical conception of the invention is as follows: first, the ATPbind server is used to predict protein-ATP binding residue information, which improves the accuracy of the predicted spatial structure of the complex; then, through the design of the population individuals, the original protein-ATP structure prediction problem is converted into an optimization problem of searching for the optimal individual, which reduces the computational cost; finally, a differential evolution algorithm is used to search for the optimal individual, which improves the prediction accuracy of the protein-ATP complex structure. The invention provides a protein-ATP docking method based on differential evolution with low computational cost and high search efficiency.
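The individual design mentioned above encodes a rigid-body pose as six numbers, three translations and three rotation angles, and step 11) turns such an individual into a score. The sketch below shows one way this could look. The source text does not preserve the rotation matrix or the translation formula, so the Euler-angle convention (rotation about x, then y, then z, about the origin), the translation by simple vector addition, the summation over all predicted binding residues, and the sample coordinates are all assumptions.

```python
import math


def rotation_matrix(a, b, c):
    """Rz(c) * Ry(b) * Rx(a): rotate about x by a, then y by b, then z by c (assumed convention)."""
    ca, sa = math.cos(a), math.sin(a)
    cb, sb = math.cos(b), math.sin(b)
    cc, sc = math.cos(c), math.sin(c)
    return [
        [cc * cb, cc * sb * sa - sc * ca, cc * sb * ca + sc * sa],
        [sc * cb, sc * sb * sa + cc * ca, sc * sb * ca - cc * sa],
        [-sb, cb * sa, cb * ca],
    ]


def apply_pose(S, points):
    """Steps 11.1 to 11.3: rotate each point by the last three elements, then translate by the first three."""
    R = rotation_matrix(S[3], S[4], S[5])
    moved = []
    for p in points:
        rotated = [sum(R[r][k] * p[k] for k in range(3)) for r in range(3)]
        moved.append([rotated[r] + S[r] for r in range(3)])
    return moved


def score(S, pseudo_atoms, residues, D):
    """Step 11.4: sum of |D_kT - D(C_k, T)| over pseudo-atoms k and binding residues.

    residues is a list of (residue_type, ca_xyz); D maps (k, residue_type) to a reference distance.
    """
    total = 0.0
    for k, c in enumerate(apply_pose(S, pseudo_atoms), start=1):
        for res_type, ca in residues:
            total += abs(math.dist(c, ca) - D[(k, res_type)])
    return total


# Hypothetical toy data: three pseudo-atoms and two predicted binding residues.
pseudo_atoms = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [3.0, 0.0, 0.0]]
residues = [("LYS", [0.0, 4.0, 0.0]), ("GLY", [3.0, 0.0, 5.0])]
# Reference distances chosen so that the unmoved pose is a perfect match.
D = {(k, t): math.dist(c, ca)
     for k, c in enumerate(pseudo_atoms, start=1) for t, ca in residues}

identity_score = score([0.0] * 6, pseudo_atoms, residues, D)
shifted_score = score([1.0, 0.0, 0.0, 0.0, 0.0, 0.0], pseudo_atoms, residues, D)
```

Scoring only three pseudo-atoms instead of every ATP atom keeps each evaluation cheap, which is consistent with the reduced computational cost claimed for this encoding.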
The beneficial effects of the invention are as follows: on the one hand, the ATPbind server is used to predict protein-ATP binding residue information, which improves the accuracy of the predicted spatial structure of the protein-ATP complex; on the other hand, the protein-ATP docking prediction problem is converted into an optimization problem of selecting the optimal individual, and the optimal individual is searched for with a differential evolution algorithm, which improves the efficiency and accuracy of protein-ATP docking prediction.
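The reference distances $D(C_k, T)$ that the score relies on (steps 6) to 8)) could be tabulated as sketched below. The defining equation is not preserved in the source text, so treating $D(C_k, T)$ as a plain arithmetic mean over all observed pairs of a pseudo-atom and a bound residue is an assumption, as are the input layout and the toy coordinates; the pseudo-atoms of each ATP molecule are taken as already clustered and already ordered consistently.

```python
import math
from collections import defaultdict


def average_interaction_distances(complexes):
    """Return {(k, residue_type): mean distance} over all ATP complexes.

    complexes is an iterable of (pseudo_atoms, bound_residues) pairs, where pseudo_atoms
    holds the three central points of one ATP molecule and bound_residues is a list of
    (residue_type, ca_xyz) tuples for the residues that bind it.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pseudo_atoms, bound_residues in complexes:
        for k, c in enumerate(pseudo_atoms, start=1):
            for res_type, ca in bound_residues:
                # Step 6: distance between pseudo-atom k and the residue's C-alpha atom.
                sums[(k, res_type)] += math.dist(c, ca)
                counts[(k, res_type)] += 1
    # Steps 7 and 8: average per pseudo-atom index and residue type.
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical toy data standing in for two ATP complexes from the PDB.
toy_complexes = [
    ([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]], [("LYS", [0.0, 3.0, 0.0])]),
    ([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]], [("LYS", [0.0, 5.0, 0.0])]),
]
reference = average_interaction_distances(toy_complexes)
```

The table only needs to be computed once from the database and can then be reused for every docking run.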
Drawings
FIG. 1 is a schematic diagram of a protein ATP docking method based on differential evolution.
FIG. 2 shows the three-dimensional structure of the complex predicted for protein 1a0i and ATP using the differential evolution-based protein ATP docking method.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Referring to FIG. 1 and FIG. 2, a protein ATP docking method based on differential evolution comprises steps 1) to 13) as set out in the Disclosure of Invention above.
In this embodiment, the same steps 1) to 13) are applied to predict the three-dimensional structure of the complex formed by docking protein 1a0i with ATP.
The three-dimensional structure of the complex of protein 1a0i and ATP obtained by the above method is shown in FIG. 2.
The above description presents the prediction result for protein 1a0i and ATP as an example of the present invention. It is not intended to limit the scope of the present invention, and various modifications and improvements can be made without departing from that scope.
Claims (1)
1. A protein ATP docking method based on differential evolution, characterized in that the docking method comprises the following steps:
1) input the structural information of the protein and of ATP, denoted R and A respectively;
2) for the input structure R, use the ATPbind server to predict the protein-ATP binding residue sites, obtaining the n residues of the protein that bind ATP, denoted $r_1, r_2, \ldots, r_n$;
3) cluster the coordinates of the central carbon atoms $C_\alpha$ of $r_1, r_2, \ldots, r_n$ to obtain a central point $C_R$; cluster the coordinates of the atoms in A to obtain a central point $C_A$; move ATP so that the coordinates of $C_A$ and $C_R$ coincide;
4) cluster the coordinates of the atoms in A into three central points, called pseudo-atoms and denoted $C_1$, $C_2$, and $C_3$;
5) for each ATP molecule $A^{(j)}$, $j = 1, 2, \ldots, N$, in the PDB database, cluster the coordinates of all its atoms into three central points $C_1^{(j)}$, $C_2^{(j)}$, and $C_3^{(j)}$ in one-to-one correspondence with $C_1$, $C_2$, and $C_3$, where N is the number of ATP molecules in the PDB database;
6) for each central point $C_k^{(j)}$ of each ATP molecule in the PDB database, calculate its distance to the $C_\alpha$ atom of each type-T residue to which it binds, where T is one of the amino acid residue types present in the PDB;
7) for any residue type T, calculate the average interaction distance between T and the k-th central point $C_k$, $k = 1, 2, 3$, over all ATP molecules in the PDB database, denoted $D(C_k, T)$ [defining equation and its "wherein" clause not preserved in the source text];
8) according to step 7), calculate the average interaction distance $D(C_k, T)$ between every residue type T in the PDB database and the ATP central points $C_k$ it binds;
9) parameter setting: set the population size NP, the scaling factor F, the crossover probability CR, and the maximum number of iterations $G_{max}$, and initialize the iteration counter G = 0;
10) population initialization: randomly generate an initial population $P = \{S_1, S_2, \ldots, S_i, \ldots, S_{NP}\}$, where $S_i = (s_{i,1}, s_{i,2}, s_{i,3}, s_{i,4}, s_{i,5}, s_{i,6})$ is the i-th individual of P and $s_{i,1}, \ldots, s_{i,6}$ are its 6 elements; $s_{i,1}$, $s_{i,2}$, and $s_{i,3}$ take values in [range not preserved in the source text], and $s_{i,4}$, $s_{i,5}$, and $s_{i,6}$ take values from 0 to $2\pi$;
11) for each individual $S_i$ in the population, dock the protein with ATP as follows and calculate the individual's score, $\mathrm{score}(S_i)$:
11.1) from the last three elements $s_{i,4}$, $s_{i,5}$, and $s_{i,6}$ of $S_i$, calculate a three-dimensional rotation matrix R [matrix not preserved in the source text];
11.2) rotate the coordinates of $C_1$, $C_2$, and $C_3$ according to the rotation matrix R to obtain the rotated three-dimensional coordinates;
11.3) using the first three elements $s_{i,1}$, $s_{i,2}$, and $s_{i,3}$ of $S_i$, translate the rotated coordinates to obtain the new three-dimensional coordinates $C'_1$, $C'_2$, $C'_3$ [translation formula not preserved in the source text];
11.4) according to step 8), calculate $\mathrm{score}(S_i)$:
$\mathrm{score}(S_i) = \sum |D_{kT} - D(C_k, T)|$
where $D_{kT}$ is the distance between $C'_k$ and the $C_\alpha$ atom of a residue of type T, $k = 1, 2, 3$;
12) according to the differential evolution algorithm, process each individual $S_i$, $i \in \{1, 2, \ldots, NP\}$, in the population P as follows:
12.1) randomly select three different individuals $S_a$, $S_b$, and $S_c$ from the current population P, where $a \neq b \neq c \neq i$, and generate a mutant individual $S_{mutant}$ according to:
$S_{mutant} = S_a + F \cdot (S_b - S_c)$
12.2) copy the elements of $S_i$ into the crossover individual $S_{cross}$; then randomly select one of the 6 elements of $S_{cross}$, $s_{cross,j}$, and replace it with the corresponding element $s_{mutant,j}$ of $S_{mutant}$; finally, for each element of $S_{cross}$, generate a random number R between 0 and 1 to decide whether to replace it with the corresponding element of $S_{mutant}$: if R < CR, replace it; otherwise, do not;
12.3) according to step 11), calculate the scores $\mathrm{score}(S_{cross})$ and $\mathrm{score}(S_i)$ of $S_{cross}$ and $S_i$;
12.4) if $\mathrm{score}(S_{cross}) < \mathrm{score}(S_i)$, replace $S_i$ in the population P with $S_{cross}$; otherwise, $S_i$ remains in P;
13) set G = G + 1; if $G > G_{max}$, take the individual $S_{low}$ with the lowest score in the current population P, rotate and translate all atomic coordinates in A according to the elements of $S_{low}$, and output the resulting coordinates as the final ligand position information; otherwise, return to step 12).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910302641.5A CN110197700B (en) | 2019-04-16 | 2019-04-16 | Protein ATP docking method based on differential evolution |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110197700A (en) | 2019-09-03 |
CN110197700B (en) | 2021-04-06 |
Family
ID=67751921
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910302641.5A Active CN110197700B (en) | 2019-04-16 | 2019-04-16 | Protein ATP docking method based on differential evolution |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110197700B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111081312B (en) * | 2019-12-04 | 2021-10-29 | 浙江工业大学 | Ligand binding residue prediction method based on multi-sequence association information |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2006094230A2 (en) * | 2005-03-03 | 2006-09-08 | The Burnham Institute For Medical Research | Screening methods for protein kinase b inhibitors employing virtual docking approaches and compounds and compositions discovered thereby |
CN104992079A (en) * | 2015-06-29 | 2015-10-21 | 南京理工大学 | Sampling learning based protein-ligand binding site prediction method |
CN105354440A (en) * | 2015-08-12 | 2016-02-24 | 中国科学技术大学 | Method for extracting protein-micromolecule interaction module |
CN106096328A (en) * | 2016-04-26 | 2016-11-09 | 浙江工业大学 | A kind of double-deck differential evolution Advances in protein structure prediction based on locally Lipschitz function supporting surface |
CN107273714A (en) * | 2017-06-07 | 2017-10-20 | 南京理工大学 | The ATP binding site estimation methods of conjugated protein sequence and structural information |
CN109255427A (en) * | 2018-08-30 | 2019-01-22 | 无锡城市职业技术学院 | A kind of molecular docking calculation method based on quantum particle swarm optimization |
CN109524058A (en) * | 2018-11-07 | 2019-03-26 | 浙江工业大学 | A kind of protein dimer Structure Prediction Methods based on differential evolution |
Non-Patent Citations (2)
Title |
---|
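Clustering-based undersampling and its application in protein-nucleotide binding site prediction; Shi Dahong; Computer & Digital Engineering; 2015-12-31; Vol. 43, No. 6; pp. 972-975 * |
A survey of bio-computational methods for identifying protein-ligand binding residues; Yu Dongjun et al.; Journal of Data Acquisition and Processing; 2018-12-31; Vol. 33, No. 2; pp. 195-206 * |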
Also Published As
Publication number | Publication date |
---|---|
CN110197700A (en) | 2019-09-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Cruz et al. | RNA-Puzzles: a CASP-like evaluation of RNA three-dimensional structure prediction | |
Le et al. | Prediction of FMN binding sites in electron transport chains based on 2-D CNN and PSSM profiles | |
Zhong et al. | Improved K-means clustering algorithm for exploring local protein sequence motifs representing common structural property | |
CN107609342B (en) | Protein conformation search method based on secondary structure space distance constraint | |
Chen et al. | Predicting the types of metabolic pathway of compounds using molecular fragments and sequential minimal optimization | |
Li et al. | Integrating end-to-end learning with deep geometrical potentials for ab initio RNA structure prediction | |
CN106096328B (en) | A kind of double-deck differential evolution Advances in protein structure prediction based on locally Lipschitz function supporting surface | |
CN106503484A (en) | A kind of multistage differential evolution Advances in protein structure prediction that is estimated based on abstract convex | |
CN106650305B (en) | A kind of more tactful group Advances in protein structure prediction based on local abstract convex supporting surface | |
CN109360596B (en) | Protein conformation space optimization method based on differential evolution local disturbance | |
Pearce et al. | Fast and accurate Ab Initio Protein structure prediction using deep learning potentials | |
CN105975806A (en) | Protein structure prediction method based on distance constraint copy exchange | |
CN110197700B (en) | Protein ATP docking method based on differential evolution | |
CN109872770B (en) | Variable strategy protein structure prediction method combined with displacement degree evaluation | |
CN109360597B (en) | Group protein structure prediction method based on global and local strategy cooperation | |
Zhang et al. | Two-stage distance feature-based optimization algorithm for de novo protein structure prediction | |
Mirceva et al. | HMM based approach for classifying protein structures | |
CN110600076B (en) | Protein ATP docking method based on distance and angle information | |
Tan et al. | RDesign: Hierarchical Data-efficient Representation Learning for Tertiary Structure-based RNA Design | |
Ispano et al. | An overview of protein function prediction methods: a deep learning perspective | |
CN109448786B (en) | Method for predicting protein structure by lower bound estimation dynamic strategy | |
CN109448785B (en) | Protein structure prediction method for enhancing Loop region structure by using Laplace graph | |
CN109147867B (en) | Group protein structure prediction method based on dynamic segment length | |
CN109411013B (en) | Group protein structure prediction method based on individual specific variation strategy | |
CN110689929B (en) | Protein ATP docking method based on contact probability assistance |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||