CN110689929A - Protein ATP docking method based on contact probability assistance - Google Patents

Publication number
CN110689929A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910805001.6A
Other languages
Chinese (zh)
Other versions
CN110689929B (en)
Inventor
张贵军
饶亮
刘俊
赵凯龙
胡俊
周晓根
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT
Priority to CN201910805001.6A
Publication of CN110689929A
Application granted
Publication of CN110689929B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/20 Sequence assembly
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/12 Computing arrangements based on biological models using genetic models
    • G06N3/126 Evolutionary algorithms, e.g. genetic algorithms or genetic programming

Abstract

A protein ATP docking method based on contact probability assistance. First, the protein-ATP binding residues are predicted with five protein binding-residue prediction servers, including ATPbind, and the residues predicted most often are selected as binding residues by voting, which improves the accuracy of the binding residues. Second, a contact probability matrix between each type of binding residue and each atom of ATP is extracted from the PDB database and used as an energy function to score the generated conformations, which improves docking accuracy. Finally, an improved differential evolution algorithm searches for the optimal individual, which improves computational efficiency. The invention provides a contact probability-assisted protein ATP docking method with low computational cost and high prediction accuracy.

Description

Protein ATP docking method based on contact probability assistance
Technical Field
The invention relates to the fields of bioinformatics, intelligent optimization and computer application, in particular to a protein ATP docking method based on contact probability assistance.
Background
As proteomics research advances, it is increasingly found that proteins act in organisms by binding small-molecule ligands. Throughout life, protein-ligand recognition processes, including substrate-enzyme, antigen-antibody and hormone-receptor recognition, underlie the molecular mechanisms and regulation of many biological functions. Recognition and interaction between proteins and ligands are an important way for proteins to exert their biological functions, and life activities such as gene regulation, signal transduction and immune response all depend on them. ATP is one such small-molecule ligand: it is an energy molecule widely distributed in the human body. ATP hydrolase hydrolyzes ATP to ADP with release of energy, and ATP synthetase regenerates ATP from ADP; both processes require binding to an enzyme protein. Studying the molecular recognition mechanism between proteins and their ligands, establishing recognition models, and studying the relationship between molecular recognition and molecular selectivity are therefore important both for revealing the nature of biological processes and for guiding the design and synthesis of compounds with specific recognition functions and biological activity.
At present, the wet-experiment methods mainly used to determine the structure of a protein-ligand complex include X-ray crystal diffraction and nuclear magnetic resonance, but these methods are difficult, costly and time-consuming. In recent years, with the growth of computing power and the rapid development and wide application of molecular simulation theory, molecular simulation methods such as homology modeling, molecular docking, molecular dynamics simulation, binding free energy calculation and quantum mechanics calculation have become important means of studying the interaction mechanism and dynamic process of proteins and ligands. Molecular simulation makes it possible to study life phenomena and their underlying rules at the molecular or even atomic level, and can provide theoretical guidance for experiments. As its theory and technology mature, molecular simulation is increasingly used in research on protein structure and function, protein-ligand recognition, and drug design.
Computational molecular simulation relies mainly on searching, with an intelligent algorithm and an energy function, for the complex structure of lowest energy. However, no current energy function can judge the energy of a complex perfectly; in addition, inaccurately predicted protein binding residues introduce errors into the energy function, making the predicted complex structure inaccurate, and some intelligent algorithms suffer from long search times or inaccurate search results.
Therefore, the existing protein and ligand molecule docking methods have defects in prediction accuracy and computational cost, and need to be improved.
Disclosure of Invention
In order to overcome the defects of the conventional protein and ligand ATP docking method in the aspects of prediction accuracy and calculation cost, the invention provides a contact probability-assisted protein ATP docking method which is low in calculation cost and high in prediction accuracy.
The technical scheme adopted by the invention for solving the technical problems is as follows:
a contact probability-assisted protein ATP docking method, the method comprising the steps of:
1) inputting the structures of the target protein and ATP, which are respectively marked as R and A;
2) predicting all ATP binding residues of the target protein R using the ATPbind server (http://zhanglab.ccmb.med.umich.edu/ATPbind/), the TargetS server (http://www.csbio.sjtu.edu.cn:8080/TargetS/), the TargetSOS server (http://www.csbio.sjtu.edu.cn:8080/TargetSOS/), the TargetNUCs server (http://202.119.84.36:3079/TargetNUCs/) and the TargetTPsite server (http://www.csbio.sjtu.edu.cn:8080/TargetTPsite/), respectively;
3) for each candidate residue, if three or more servers predict it to be a binding residue, taking it as a binding residue, finally obtaining h protein binding residues denoted r_1, r_2, ..., r_h;
4) calculating the mean of the coordinates of the central carbon atoms C_α of all binding residues r_1, r_2, ..., r_h to obtain the center coordinate C_R of the binding residues; calculating the mean of all atomic coordinates in A to obtain the center coordinate C_A of A; and moving A so that C_A coincides with C_R;
5) the probability of each type of binding residue coming into contact with each ATP atom is extracted from the PDB database as follows:
5.1) for each complex in the PDB library, calculating the average distance d_{g,j} between the C_α atoms of the binding residues of each residue type g and the jth atom of ATP; if
Figure BDA0002183382860000021
then setting
Figure BDA0002183382860000022
otherwise setting
Figure BDA0002183382860000023
where g ∈ {1, 2, …, 21} indexes the 21 residue types, j ∈ {1, 2, …, 31} indexes the 31 ATP atoms, and
Figure BDA0002183382860000031
indicates whether a binding residue of residue type g in the kth complex is in contact with the jth atom of ATP;
5.2) calculating the average of
Figure BDA0002183382860000032
over all complexes, denoted c_{g,j}, to obtain a 21 × 31 contact probability matrix:
Figure BDA0002183382860000033
6) setting parameters: setting the population size NP, the scaling factor F_0, the crossover probability CR and the maximum number of iterations G_max, and initializing the iteration count G = 0;
7) population initialization: randomly generating an initial population P = {S_1, S_2, ..., S_i, ..., S_NP}, where S_i = (s_{i,1}, s_{i,2}, s_{i,3}, s_{i,4}, s_{i,5}, s_{i,6}) is the ith individual of P and s_{i,1}, ..., s_{i,6} are its 6 elements; the value range of s_{i,1}, s_{i,2} and s_{i,3} is
Figure BDA0002183382860000034
and the value range of s_{i,4}, s_{i,5} and s_{i,6} is 0 to 2π;
8) for each individual S_i in the population, docking the protein with ATP as follows and calculating the score E_i of that individual:
8.1) calculating a spatial rotation matrix R from the last three elements s_{i,4}, s_{i,5} and s_{i,6} of S_i:
Figure BDA0002183382860000035
8.2) rotating all atomic coordinates in A according to the rotation matrix R to obtain a new ATP structure A_R;
8.3) translating all coordinates in A_R according to the first three elements s_{i,1}, s_{i,2} and s_{i,3} of S_i as follows, to obtain a new ATP structure A_T:
Figure BDA0002183382860000036
where
Figure BDA0002183382860000037
is the coordinate of the jth atom of A_T, and
Figure BDA0002183382860000038
are respectively the X, Y and Z coordinates of the jth atom of A_R, j = 1, 2, ..., 31;
8.4) calculating the distances between the C_α atoms of the h binding residues and all atoms of ATP, and calculating the score E_i as follows:
Figure BDA0002183382860000041
where g is the type of the current binding residue; c_{g,j} is the probability of contact between a binding residue of type g and the jth atom of ATP, i.e. the value in the gth row and jth column of the contact probability matrix C; d_{h,j} is the distance between the C_α atom of the current binding residue and the jth atom of ATP; d_min = 0.75 × (r_h + r_j), where r_h and r_j are the van der Waals radii of the C_α atom of the current binding residue and of the jth atom of ATP, respectively;
Figure BDA0002183382860000043
9) according to the differential evolution algorithm, performing the following for each individual S_i, i ∈ {1, 2, …, NP}, in the population P:
9.1) randomly selecting three different individuals S_a, S_b and S_c from the current population P, where a, b, c ∈ {1, 2, …, NP} and a ≠ b ≠ c ≠ i, and generating a mutant individual S_mutant according to:
Figure BDA0002183382860000044
S_mutant = S_a + F·(S_b - S_c)
9.2) generating crossover individuals S_cross1 and S_cross2 as follows:
Figure BDA0002183382860000045
where s_{cross1,t}, s_{mutant,t}, s_{cross2,t} and s_{i,t} are the tth elements of S_cross1, S_mutant, S_cross2 and S_i respectively, t = 1, 2, ..., 6; t_rand is a random integer between 1 and 6; and rand(0,1) is a random decimal between 0 and 1;
9.3) according to step 8), calculating the scores E_cross1, E_cross2 and E_i corresponding to S_cross1, S_cross2 and S_i respectively;
9.4) selecting the lowest-scoring individual among S_cross1, S_cross2 and S_i to replace S_i in the population P;
10) G = G + 1; if G ≥ G_max, recording the lowest score E_min in the current population P and the corresponding ATP structure information
Figure BDA0002183382860000046
and outputting it as the final ATP position information; otherwise returning to step 9).
The technical conception of the invention is as follows: first, the protein-ATP binding residues are predicted with five protein binding-residue prediction servers, including ATPbind, and the residues predicted most often are selected as binding residues by voting, which improves the accuracy of the binding residues; second, a contact probability matrix between each type of binding residue and each atom of ATP is extracted from the PDB database and used as an energy function to score the generated conformations, which improves docking accuracy; finally, an improved differential evolution algorithm searches for the optimal individual, which improves computational efficiency. The invention provides a contact probability-assisted protein ATP docking method with low computational cost and high prediction accuracy.
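Steps 2) to 4) can be illustrated with a minimal Python sketch. It assumes the five server outputs have already been collected as lists of residue numbers; the function names `consensus_binding_residues` and `center_ligand_on_site` and the input layout are hypothetical and are not taken from the patent.

```python
from collections import Counter

import numpy as np


def consensus_binding_residues(predictions, min_votes=3):
    """Keep residues predicted as ATP-binding by at least `min_votes` servers.

    `predictions` holds one list of residue numbers per server (assumed layout).
    """
    votes = Counter(res for server in predictions for res in set(server))
    return sorted(res for res, n in votes.items() if n >= min_votes)


def center_ligand_on_site(ca_coords, ligand_coords):
    """Translate the ligand so that its centroid C_A coincides with C_R.

    C_R is the mean C-alpha coordinate of the binding residues (step 4).
    """
    c_r = np.asarray(ca_coords, dtype=float).mean(axis=0)
    ligand = np.asarray(ligand_coords, dtype=float)
    c_a = ligand.mean(axis=0)
    return ligand + (c_r - c_a)


# Toy example: five servers, residues 10, 12 and 15 each get three or more votes.
servers = [[10, 12, 15], [10, 15, 40], [10, 12], [15, 41], [10, 12, 15]]
binding = consensus_binding_residues(servers)
```

With the toy input above, `binding` contains residues 10, 12 and 15; residues 40 and 41 are dropped because only one server predicts each of them.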
The beneficial effects of the invention are as follows: first, several protein binding-residue prediction servers are used to predict the protein-ATP binding residues, which improves the reliability of the binding residues; second, the extracted contact probability matrix between binding residues and ATP atoms assists docking, which improves protein-ATP docking precision; third, an improved differential evolution algorithm searches for the spatial position of ATP, which improves search efficiency.
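The contact probability matrix of step 5) can be sketched as follows. This is only an illustration under stated assumptions: the contact threshold appears in the patent as a formula image that is not recoverable from this text, so `cutoff` is a hypothetical parameter; the input layout (one tuple of residue-type indices, C_α coordinates and ATP coordinates per complex) is also assumed, residue types are 0-indexed here, and a residue type absent from a complex is counted as no contact.

```python
import numpy as np

N_RESIDUE_TYPES = 21  # residue types g, as in step 5.1
N_ATP_ATOMS = 31      # ATP atoms j, as in step 5.1


def contact_probability_matrix(complexes, cutoff):
    """Return the 21 x 31 matrix of contact frequencies c[g, j].

    complexes: iterable of (residue_types, ca_coords, atp_coords), where
        residue_types is a length-n list of type indices in 0..20,
        ca_coords is n x 3 and atp_coords is 31 x 3 (assumed layout).
    cutoff: hypothetical distance threshold; the patent's value is not
        recoverable from this text.
    """
    complexes = list(complexes)
    counts = np.zeros((N_RESIDUE_TYPES, N_ATP_ATOMS))
    for residue_types, ca_coords, atp_coords in complexes:
        residue_types = np.asarray(residue_types)
        ca = np.asarray(ca_coords, dtype=float)
        atp = np.asarray(atp_coords, dtype=float)
        # dist[r, j]: distance from the C-alpha of binding residue r to ATP atom j
        dist = np.linalg.norm(ca[:, None, :] - atp[None, :, :], axis=2)
        for g in np.unique(residue_types):
            # average distance d_{g,j} over all binding residues of type g
            d_gj = dist[residue_types == g].mean(axis=0)
            counts[g] += d_gj < cutoff  # contact indicator for this complex
    return counts / len(complexes)
```

Each entry of the result is the fraction of complexes in which residue type g is in contact with ATP atom j, which is how step 5.2 defines c_{g,j}.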
Drawings
FIG. 1 is a schematic diagram of a protein ATP docking method based on contact probability assistance.
FIG. 2 is a diagram of the structure of the complex obtained by docking protein 1e2q with ATP using a protein ATP docking method based on contact probability assistance.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Referring to fig. 1 and 2, a protein ATP docking method based on contact probability assistance includes the following steps:
1) inputting the structures of the target protein and ATP, which are respectively marked as R and A;
2) predicting all ATP binding residues of the target protein R using the ATPbind server (http://zhanglab.ccmb.med.umich.edu/ATPbind/), the TargetS server (http://www.csbio.sjtu.edu.cn:8080/TargetS/), the TargetSOS server (http://www.csbio.sjtu.edu.cn:8080/TargetSOS/), the TargetNUCs server (http://202.119.84.36:3079/TargetNUCs/) and the TargetTPsite server (http://www.csbio.sjtu.edu.cn:8080/TargetTPsite/), respectively;
3) for each candidate residue, if three or more servers predict it to be a binding residue, taking it as a binding residue, finally obtaining h protein binding residues denoted r_1, r_2, ..., r_h;
4) calculating the mean of the coordinates of the central carbon atoms C_α of all binding residues r_1, r_2, ..., r_h to obtain the center coordinate C_R of the binding residues; calculating the mean of all atomic coordinates in A to obtain the center coordinate C_A of A; and moving A so that C_A coincides with C_R;
5) the probability of each type of binding residue coming into contact with each ATP atom is extracted from the PDB database as follows:
5.1) for each complex in the PDB library, calculating the average distance d_{g,j} between the C_α atoms of the binding residues of each residue type g and the jth atom of ATP; if the contact condition on d_{g,j} holds (formula image not reproduced here), then setting
Figure BDA0002183382860000062
otherwise setting the no-contact value (formula image not reproduced here), where g ∈ {1, 2, …, 21} indexes the 21 residue types, j ∈ {1, 2, …, 31} indexes the 31 ATP atoms, and
Figure BDA0002183382860000064
indicates whether a binding residue of residue type g in the kth complex is in contact with the jth atom of ATP;
5.2) calculating the average of
Figure BDA0002183382860000065
over all complexes, denoted c_{g,j}, to obtain a 21 × 31 contact probability matrix:
Figure BDA0002183382860000066
6) setting parameters: setting the population size NP, the scaling factor F_0, the crossover probability CR and the maximum number of iterations G_max, and initializing the iteration count G = 0;
7) population initialization: randomly generating an initial population P = {S_1, S_2, ..., S_i, ..., S_NP}, where S_i = (s_{i,1}, s_{i,2}, s_{i,3}, s_{i,4}, s_{i,5}, s_{i,6}) is the ith individual of P and s_{i,1}, ..., s_{i,6} are its 6 elements; s_{i,1}, s_{i,2} and s_{i,3} take values in a range given by a formula image not reproduced here, and the value range of s_{i,4}, s_{i,5} and s_{i,6} is 0 to 2π;
8) for each individual S_i in the population, docking the protein with ATP as follows and calculating the score E_i of that individual:
8.1) calculating a spatial rotation matrix R from the last three elements s_{i,4}, s_{i,5} and s_{i,6} of S_i:
Figure BDA0002183382860000068
8.2) rotating all atomic coordinates in A according to the rotation matrix R to obtain a new ATP structure A_R;
8.3) translating all coordinates in A_R according to the first three elements s_{i,1}, s_{i,2} and s_{i,3} of S_i as follows, to obtain a new ATP structure A_T:
Figure BDA0002183382860000071
where
Figure BDA0002183382860000072
is the coordinate of the jth atom of A_T, expressed in terms of the X, Y and Z coordinates of the jth atom of A_R, j = 1, 2, ..., 31;
8.4) calculating the distances between the C_α atoms of the h binding residues and all atoms of ATP, and calculating the score E_i as follows:
Figure BDA0002183382860000074
Figure BDA0002183382860000075
where g is the type of the current binding residue; c_{g,j} is the probability of contact between a binding residue of type g and the jth atom of ATP, i.e. the value in the gth row and jth column of the contact probability matrix C; d_{h,j} is the distance between the C_α atom of the current binding residue and the jth atom of ATP; d_min = 0.75 × (r_h + r_j), where r_h and r_j are the van der Waals radii of the C_α atom of the current binding residue and of the jth atom of ATP, respectively;
Figure BDA0002183382860000076
9) according to the differential evolution algorithm, performing the following for each individual S_i, i ∈ {1, 2, …, NP}, in the population P:
9.1) randomly selecting three different individuals S_a, S_b and S_c from the current population P, where a, b, c ∈ {1, 2, …, NP} and a ≠ b ≠ c ≠ i, and generating a mutant individual S_mutant according to:
Figure BDA0002183382860000077
S_mutant = S_a + F·(S_b - S_c)
9.2) generating crossover individuals S_cross1 and S_cross2 as follows:
Figure BDA0002183382860000078
where s_{cross1,t}, s_{mutant,t}, s_{cross2,t} and s_{i,t} are the tth elements of S_cross1, S_mutant, S_cross2 and S_i respectively, t = 1, 2, ..., 6; t_rand is a random integer between 1 and 6; and rand(0,1) is a random decimal between 0 and 1;
9.3) according to step 8), calculating the scores E_cross1, E_cross2 and E_i corresponding to S_cross1, S_cross2 and S_i respectively;
9.4) selecting the lowest-scoring individual among S_cross1, S_cross2 and S_i to replace S_i in the population P;
10) G = G + 1; if G ≥ G_max, recording the lowest score E_min in the current population P and the corresponding ATP structure information
Figure BDA0002183382860000086
and outputting
Figure BDA0002183382860000087
as the final ATP position information; otherwise returning to step 9).
In this embodiment, taking the prediction of the three-dimensional structure of the complex formed by docking protein 1e2q with ATP as an example, a protein ATP docking method based on contact probability assistance comprises the following steps:
1) inputting the structures of the target protein and ATP, which are respectively marked as R and A;
2) predicting all ATP binding residues of the target protein R using the ATPbind server (http://zhanglab.ccmb.med.umich.edu/ATPbind/), the TargetS server (http://www.csbio.sjtu.edu.cn:8080/TargetS/), the TargetSOS server (http://www.csbio.sjtu.edu.cn:8080/TargetSOS/), the TargetNUCs server (http://202.119.84.36:3079/TargetNUCs/) and the TargetTPsite server (http://www.csbio.sjtu.edu.cn:8080/TargetTPsite/), respectively;
3) for each candidate residue, if three or more servers predict it to be a binding residue, taking it as a binding residue, finally obtaining h protein binding residues denoted r_1, r_2, ..., r_h;
4) calculating the mean of the coordinates of the central carbon atoms C_α of all binding residues r_1, r_2, ..., r_h to obtain the center coordinate C_R of the binding residues; calculating the mean of all atomic coordinates in A to obtain the center coordinate C_A of A; and moving A so that C_A coincides with C_R;
5) the probability of each type of binding residue coming into contact with each ATP atom is extracted from the PDB database as follows:
5.1) for each complex in the PDB library, calculating the average distance d_{g,j} between the C_α atoms of the binding residues of each residue type g and the jth atom of ATP; if
Figure BDA0002183382860000081
then setting
Figure BDA0002183382860000082
otherwise setting
Figure BDA0002183382860000083
where g ∈ {1, 2, …, 21} indexes the 21 residue types, j ∈ {1, 2, …, 31} indexes the 31 ATP atoms, and
Figure BDA0002183382860000084
indicates whether a binding residue of residue type g in the kth complex is in contact with the jth atom of ATP;
5.2) calculating the average of
Figure BDA0002183382860000085
over all complexes, denoted c_{g,j}, to obtain a 21 × 31 contact probability matrix:
Figure BDA0002183382860000091
6) setting parameters: setting the population size NP = 50, the scaling factor F_0 = 0.5, the crossover probability CR = 0.5 and the maximum number of iterations G_max = 500, and initializing the iteration count G = 1;
7) population initialization: randomly generating an initial population P = {S_1, S_2, ..., S_i, ..., S_NP}, where S_i = (s_{i,1}, s_{i,2}, s_{i,3}, s_{i,4}, s_{i,5}, s_{i,6}) is the ith individual of P and s_{i,1}, ..., s_{i,6} are its 6 elements; the value range of s_{i,1}, s_{i,2} and s_{i,3} is
Figure BDA0002183382860000092
and the value range of s_{i,4}, s_{i,5} and s_{i,6} is 0 to 2π;
8) for each individual S_i in the population, docking the protein with ATP as follows and calculating the score E_i of that individual:
8.1) calculating a spatial rotation matrix R from the last three elements s_{i,4}, s_{i,5} and s_{i,6} of S_i:
Figure BDA0002183382860000093
8.2) rotating all atomic coordinates in A according to the rotation matrix R to obtain a new ATP structure A_R;
8.3) translating all coordinates in A_R according to the first three elements s_{i,1}, s_{i,2} and s_{i,3} of S_i as follows, to obtain a new ATP structure A_T:
Figure BDA0002183382860000094
where
Figure BDA0002183382860000095
is the coordinate of the jth atom of A_T, and
Figure BDA0002183382860000096
are respectively the X, Y and Z coordinates of the jth atom of A_R, j = 1, 2, ..., 31;
8.4) calculating the distances between the C_α atoms of the h binding residues and all atoms of ATP, and calculating the score E_i as follows:
Figure BDA0002183382860000097
Figure BDA0002183382860000101
where g is the type of the current binding residue; c_{g,j} is the probability of contact between a binding residue of type g and the jth atom of ATP, i.e. the value in the gth row and jth column of the contact probability matrix C; d_{h,j} is the distance between the C_α atom of the current binding residue and the jth atom of ATP; d_min = 0.75 × (r_h + r_j), where r_h and r_j are the van der Waals radii of the C_α atom of the current binding residue and of the jth atom of ATP, respectively;
Figure BDA0002183382860000102
9) according to the differential evolution algorithm, performing the following for each individual S_i, i ∈ {1, 2, …, NP}, in the population P:
9.1) randomly selecting three different individuals S_a, S_b and S_c from the current population P, where a, b, c ∈ {1, 2, …, NP} and a ≠ b ≠ c ≠ i, and generating a mutant individual S_mutant according to:
Figure BDA0002183382860000103
S_mutant = S_a + F·(S_b - S_c)
9.2) generating crossover individuals S_cross1 and S_cross2 as follows:
Figure BDA0002183382860000104
where s_{cross1,t}, s_{mutant,t}, s_{cross2,t} and s_{i,t} are the tth elements of S_cross1, S_mutant, S_cross2 and S_i respectively, t = 1, 2, ..., 6; t_rand is a random integer between 1 and 6; and rand(0,1) is a random decimal between 0 and 1;
9.3) according to step 8), calculating the scores E_cross1, E_cross2 and E_i corresponding to S_cross1, S_cross2 and S_i respectively;
9.4) selecting the lowest-scoring individual among S_cross1, S_cross2 and S_i to replace S_i in the population P;
10) G = G + 1; if G ≥ G_max, recording the lowest score E_min in the current population P and the corresponding ATP structure information
Figure BDA0002183382860000105
and outputting
Figure BDA0002183382860000106
as the final ATP position information; otherwise returning to step 9).
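The pose update of step 8) and the search loop of steps 6), 7), 9) and 10) can be sketched in Python as below. This is a hedged illustration, not the patented implementation. The rotation formula, the scaling-factor formula for F, the crossover formula and the translation range are formula images that are not recoverable from this text, so the sketch assumes a Z·Y·X rotation about the ATP centroid, a constant F equal to F_0, a complementary binomial crossover for the two trial individuals, caller-supplied bounds, and clipping of mutants to those bounds. The names `apply_pose`, `de_search` and `score` are hypothetical; `score` stands in for the contact-probability energy E_i.

```python
import numpy as np


def apply_pose(coords, pose):
    """Rotate coords about their centroid, then translate (steps 8.1 to 8.3).

    pose = (tx, ty, tz, a, b, c); the Rz(c) @ Ry(b) @ Rx(a) convention is an assumption.
    """
    tx, ty, tz, a, b, c = pose
    rx = np.array([[1, 0, 0], [0, np.cos(a), -np.sin(a)], [0, np.sin(a), np.cos(a)]])
    ry = np.array([[np.cos(b), 0, np.sin(b)], [0, 1, 0], [-np.sin(b), 0, np.cos(b)]])
    rz = np.array([[np.cos(c), -np.sin(c), 0], [np.sin(c), np.cos(c), 0], [0, 0, 1]])
    rot = rz @ ry @ rx
    coords = np.asarray(coords, dtype=float)
    center = coords.mean(axis=0)
    return (coords - center) @ rot.T + center + np.array([tx, ty, tz])


def de_search(score, lower, upper, np_size=50, f0=0.5, cr=0.5, g_max=500, seed=0):
    """Differential evolution with two trial individuals per target (steps 9 and 10)."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    pop = rng.uniform(lower, upper, size=(np_size, dim))  # step 7
    scores = np.array([score(s) for s in pop])
    for _ in range(g_max):
        for i in range(np_size):
            others = [k for k in range(np_size) if k != i]
            a, b, c = rng.choice(others, size=3, replace=False)
            # step 9.1 with constant F = F_0 (assumption); clipping is also an assumption
            mutant = np.clip(pop[a] + f0 * (pop[b] - pop[c]), lower, upper)
            take = rng.random(dim) <= cr
            take[rng.integers(dim)] = True          # position t_rand always crosses
            cross1 = np.where(take, mutant, pop[i])  # assumed form of S_cross1
            cross2 = np.where(take, pop[i], mutant)  # assumed complementary S_cross2
            for trial in (cross1, cross2):           # steps 9.3 and 9.4
                e = score(trial)
                if e < scores[i]:
                    pop[i], scores[i] = trial, e
    best = int(np.argmin(scores))
    return pop[best], scores[best]
```

The defaults mirror the embodiment's parameters (NP = 50, F_0 = 0.5, CR = 0.5, G_max = 500). A caller would pass a `score` function that applies `apply_pose` to the centered ATP coordinates and evaluates the energy of the resulting pose.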
Taking protein 1e2q and ATP as an example, the root mean square deviation between the three-dimensional structure of the protein 1e2q-ATP complex obtained by the above method and the complex structure determined by wet experiment is
Figure BDA0002183382860000107
The predicted protein ATP complex structure is shown in figure 2.
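For reference, a root mean square deviation of this kind can be computed as in the sketch below. It assumes the predicted and experimental ATP coordinates are given in the same receptor frame with identical atom ordering, so no superposition is applied; the patent does not state how its value was computed, and the function name `ligand_rmsd` is hypothetical.

```python
import numpy as np


def ligand_rmsd(predicted, reference):
    """Root mean square deviation between two equally ordered coordinate sets."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if predicted.shape != reference.shape:
        raise ValueError("coordinate arrays must have the same shape")
    squared = ((predicted - reference) ** 2).sum(axis=1)  # per-atom squared distance
    return float(np.sqrt(squared.mean()))


# Toy check: shifting every one of 31 atoms by (3, 4, 0) gives an RMSD of 5.
reference = np.zeros((31, 3))
predicted = reference + np.array([3.0, 4.0, 0.0])
rmsd = ligand_rmsd(predicted, reference)
```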
The above describes the prediction result obtained by the invention with protein 1e2q and ATP as an example. It is not intended to limit the scope of the invention, and various modifications and improvements can be made without departing from that scope.

Claims (1)

1. A protein ATP docking method based on contact probability assistance, characterized in that the docking method comprises the following steps:
1) inputting the structures of the target protein and ATP, which are respectively marked as R and A;
2) predicting all ATP binding residues of the target protein R by using an ATPbind server, a TargetS server, a TargetSOS server, a TargetNUCs server and a TargetTPsite server respectively;
3) for each candidate residue, if three or more servers predict it to be a binding residue, taking it as a binding residue, finally obtaining h protein binding residues denoted r_1, r_2, ..., r_h;
4) calculating the mean of the coordinates of the central carbon atoms C_α of all binding residues r_1, r_2, ..., r_h to obtain the center coordinate C_R of the binding residues; calculating the mean of all atomic coordinates in A to obtain the center coordinate C_A of A; and moving A so that C_A coincides with C_R;
5) the probability of each type of binding residue coming into contact with each ATP atom is extracted from the PDB database as follows:
5.1) for each complex in the PDB library, calculating the average distance d_{g,j} between the C_α atoms of the binding residues of each residue type g and the jth atom of ATP; if
Figure FDA0002183382850000011
then setting
Figure FDA0002183382850000012
otherwise setting
Figure FDA0002183382850000013
where g ∈ {1, 2, …, 21} indexes the 21 residue types, j ∈ {1, 2, …, 31} indexes the 31 ATP atoms, and
Figure FDA0002183382850000014
indicates whether a binding residue of residue type g in the kth complex is in contact with the jth atom of ATP;
5.2) calculating the average of
Figure FDA0002183382850000015
over all complexes, denoted c_{g,j}, to obtain a 21 × 31 contact probability matrix:
Figure FDA0002183382850000016
6) setting parameters: setting the population size NP, the scaling factor F_0, the crossover probability CR and the maximum number of iterations G_max, and initializing the iteration count G = 0;
7) population initialization: randomly generating an initial population P = {S_1, S_2, ..., S_i, ..., S_NP}, where S_i = (s_{i,1}, s_{i,2}, s_{i,3}, s_{i,4}, s_{i,5}, s_{i,6}) is the ith individual of P and s_{i,1}, ..., s_{i,6} are its 6 elements; the value range of s_{i,1}, s_{i,2} and s_{i,3} is
Figure FDA0002183382850000021
and the value range of s_{i,4}, s_{i,5} and s_{i,6} is 0 to 2π;
8) for each individual S_i in the population, docking the protein with ATP as follows and calculating the score E_i of that individual:
8.1) calculating a spatial rotation matrix R from the last three elements s_{i,4}, s_{i,5} and s_{i,6} of S_i:
Figure FDA0002183382850000022
8.2) rotating all atomic coordinates in A according to the rotation matrix R to obtain a new ATP structure A_R;
8.3) translating all coordinates in A_R according to the first three elements s_{i,1}, s_{i,2} and s_{i,3} of S_i as follows, to obtain a new ATP structure A_T:
Figure FDA0002183382850000023
where
Figure FDA0002183382850000024
is the coordinate of the jth atom of A_T, and
Figure FDA0002183382850000025
are respectively the X, Y and Z coordinates of the jth atom of A_R, j = 1, 2, ..., 31;
8.4) calculating the distances between the C_α atoms of the h binding residues and all atoms of ATP, and calculating the score E_i as follows:
Figure FDA0002183382850000026
Figure FDA0002183382850000027
where g is the type of the current binding residue; c_{g,j} is the probability of contact between a binding residue of type g and the jth atom of ATP, i.e. the value in the gth row and jth column of the contact probability matrix C; d_{h,j} is the distance between the C_α atom of the current binding residue and the jth atom of ATP; d_min = 0.75 × (r_h + r_j), where r_h and r_j are the van der Waals radii of the C_α atom of the current binding residue and of the jth atom of ATP, respectively;
Figure FDA0002183382850000028
9) according to a differential evolution algorithm, for each individual S in the population PiI ∈ {1,2, …, NP } performs the following:
9.1) randomly selecting three different individuals S from the Current population Pa、SbAnd ScWherein a, b and c are respectively belonged to {1,2, …, NP }, and a ≠ b ≠ c ≠ i, and the mutant individuals S are generated according to the following formulamutant
Figure FDA0002183382850000031
Smutant=Sa+F·(Sb-Sc)
9.2) generating crossed individuals S according to the following procedurecross1And Scross2
Figure FDA0002183382850000032
Wherein s iscross1,t、smutant,t、scross2,tAnd si,tAre each Scross1、Smutant、Scross2And SiWherein t is 1,2, 6, trandIs a random integer between 1 and 6, and rand (0,1) is a random decimal between 0 and 1;
9.3) according to step 8), respectively calculate Scross,Scross1And SiCorresponding score Ecross1,Ecross2And Ei
9.4) selection of Scross1,Scross2And SiReplacement of S in population P by the lowest scoring individuali
10) G = G + 1; if G ≥ G_max, record the lowest score E_min in the current population P and the corresponding ATP structure information
Figure FDA0002183382850000033
and output
Figure FDA0002183382850000034
as the final ATP position information; otherwise, return to step 9).
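The termination rule in step 10) amounts to a generation counter followed by a scan for the lowest score. A minimal Python sketch is given below, in which `evolve_generation` stands in for step 9) and `score` for step 8); both are hypothetical callables supplied by the caller, and the sketch returns the best individual rather than the ATP structure rebuilt from it.

```python
def run_docking(population, score, evolve_generation, g_max):
    """Repeat step 9) until the generation counter reaches g_max, then
    return (E_min, best individual) from the final population."""
    g = 0
    while True:
        population = evolve_generation(population)  # step 9) for every S_i
        g += 1                                      # step 10): G = G + 1
        if g >= g_max:
            break
    best = min(population, key=score)
    return score(best), best

# Toy usage: a stand-in "evolution" that halves every element each generation.
def halve(pop):
    return [[x / 2 for x in s] for s in pop]

def sq(s):
    return sum(x * x for x in s)

e_min, best_individual = run_docking([[4.0, 0.0], [2.0, 2.0]], sq, halve, g_max=2)
```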
CN201910805001.6A 2019-08-29 2019-08-29 Protein ATP docking method based on contact probability assistance Active CN110689929B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910805001.6A CN110689929B (en) 2019-08-29 2019-08-29 Protein ATP docking method based on contact probability assistance

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910805001.6A CN110689929B (en) 2019-08-29 2019-08-29 Protein ATP docking method based on contact probability assistance

Publications (2)

Publication Number Publication Date
CN110689929A (en) 2020-01-14
CN110689929B (en) 2021-12-17

Family

ID=69108516

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910805001.6A Active CN110689929B (en) 2019-08-29 2019-08-29 Protein ATP docking method based on contact probability assistance

Country Status (1)

Country Link
CN (1) CN110689929B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109360596A (en) * 2018-08-30 2019-02-19 Zhejiang University of Technology Protein conformation space optimization method based on differential evolution local dip
CN109461470A (en) * 2018-08-29 2019-03-12 Zhejiang University of Technology Protein structure prediction energy function weight optimization method
CN109524058A (en) * 2018-11-07 2019-03-26 Zhejiang University of Technology Protein dimer structure prediction method based on differential evolution
WO2019080829A1 (en) * 2017-10-23 2019-05-02 Shanghaitech University Compositions and methods for detecting molecule-molecule interactions
US20190145982A1 (en) * 2016-05-02 2019-05-16 Encodia, Inc. Macromolecule analysis employing nucleic acid encoding

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190145982A1 (en) * 2016-05-02 2019-05-16 Encodia, Inc. Macromolecule analysis employing nucleic acid encoding
WO2019080829A1 (en) * 2017-10-23 2019-05-02 Shanghaitech University Compositions and methods for detecting molecule-molecule interactions
CN109461470A (en) * 2018-08-29 2019-03-12 Zhejiang University of Technology Protein structure prediction energy function weight optimization method
CN109360596A (en) * 2018-08-30 2019-02-19 Zhejiang University of Technology Protein conformation space optimization method based on differential evolution local dip
CN109524058A (en) * 2018-11-07 2019-03-26 Zhejiang University of Technology Protein dimer structure prediction method based on differential evolution

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HU X: "Protein ligand-specific binding residue predictions by an ensemble classifier", BMC Bioinformatics *
YU Dongjun: "A survey of computational methods for identifying protein-ligand binding residues", Journal of Data Acquisition and Processing *

Also Published As

Publication number Publication date
CN110689929B (en) 2021-12-17

Similar Documents

Publication Publication Date Title
Kimber et al. Deep learning in virtual screening: recent applications and developments
Aggarwal et al. DeepPocket: ligand binding site detection and segmentation using 3D convolutional neural networks
Zhu et al. Large-scale binding ligand prediction by improved patch-based method Patch-Surfer2. 0
Lin et al. Efficient classification of hot spots and hub protein interfaces by recursive feature elimination and gradient boosting
Durairaj et al. Geometricus represents protein structures as shape-mers derived from moment invariants
CN109524058B (en) Protein dimer structure prediction method based on differential evolution
CN109785901B (en) Protein function prediction method and device
Tetko et al. Does ‘Big Data’exist in medicinal chemistry, and if so, how can it be harnessed?
CN108846256B (en) Group protein structure prediction method based on residue contact information
Emami et al. Computational predictive approaches for interaction and structure of aptamers
Mao et al. Transformer-based molecular generative model for antiviral drug design
CN110600075B (en) Protein ATP docking method based on ligand growth strategy
Scharnowski et al. Comparative visualization of molecular surfaces using deformable models
CN109872770B (en) Variable strategy protein structure prediction method combined with displacement degree evaluation
CN110689929B (en) Protein ATP docking method based on contact probability assistance
CN110600076B (en) Protein ATP docking method based on distance and angle information
CN108920894B (en) Protein conformation space optimization method based on brief abstract convex estimation
CN109360597B (en) Group protein structure prediction method based on global and local strategy cooperation
CN110197700B (en) Protein ATP docking method based on differential evolution
Yue et al. A systematic review on the state-of-the-art strategies for protein representation
Susanty et al. A review of protein structure prediction using deep learning
Jiang et al. Structure-based prediction of nucleic acid binding residues by merging deep learning-and template-based approaches
Wang et al. SAPocket: Finding pockets on protein surfaces with a focus towards position and voxel channels
Liu et al. GraphCPLMQA: Assessing protein model quality based on deep graph coupled networks using protein language model
CN109448786B (en) Method for predicting protein structure by lower bound estimation dynamic strategy

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant