CN104992079A - Sampling learning based protein-ligand binding site prediction method - Google Patents

Sampling learning based protein-ligand binding site prediction method

Info

Publication number
CN104992079A
CN104992079A CN201510368016.2A CN201510368016A CN104992079A CN 104992079 A CN104992079 A CN 104992079A CN 201510368016 A CN201510368016 A CN 201510368016A CN 104992079 A CN104992079 A CN 104992079A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510368016.2A
Other languages
Chinese (zh)
Other versions
CN104992079B (en)
Inventor
胡俊
何雪
李阳
於东军
沈红斌
杨静宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology

Landscapes

  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention provides a protein-ligand binding site prediction method based on sampling learning. The method comprises the following steps. First, the PSI-BLAST and PSIPRED programs are used to obtain the evolutionary information and secondary structure information of a protein, and a sliding-window technique is used to extract features for each amino acid residue (sample). Second, random down-sampling is applied to the non-binding site samples, and the resulting non-binding site sample subset, together with the binding site sample set, is used to train an SVM that predicts all samples to be predicted. Third, according to the feature information of each sample to be predicted, a KNN dynamic sampling learning technique samples the binding site samples and the non-binding site samples separately, and the sampled binding site and non-binding site subsets are combined to train an SVM specific to that sample. Finally, a threshold-based ensemble technique integrates the two trained SVMs. The method has the following advantages: first, random down-sampling and KNN dynamic sampling learning reduce the size of the training sets and speed up model training; second, KNN dynamic sampling learning trains a different SVM model for each sample to be predicted and thereby incorporates the differences among those samples; third, the SVM ensemble reduces the information loss caused by sampling learning and improves prediction accuracy.

Description

Protein-ligand binding site prediction method based on sampling learning
Technical field
The present invention relates to the prediction of protein-ligand binding sites in bioinformatics, and in particular to a high-accuracy protein-ligand binding site prediction method based on sampling learning that combines random down-sampling, KNN dynamic sampling learning, and a support vector machine (SVM) ensemble strategy.
Background
Ligands of all sizes, such as adenosine triphosphate (ATP) and vitamins, play an indispensable role in life processes. ATP in particular is an important biological molecule that is essential to membrane transport, muscle contraction, signal transduction, cell motility, DNA replication and transcription, and other life processes. Most of these ligands interact with proteins through protein-ligand binding sites, and the proteins then carry out various biochemical functions on them, such as transport and decomposition. In addition, the binding sites between proteins and certain ligands are important targets for antibacterial and anticancer drugs. Quickly and accurately locating the protein-ligand binding sites in a protein sequence is therefore of great significance.
However, determining the binding sites between proteins and ligands by biological experiments consumes a great deal of time and money and is inefficient. Moreover, with the rapid development of sequencing technology and the continued progress of structural genomics, proteomics has accumulated a large number of protein sequences whose protein-ligand binding sites have not been annotated. There is therefore an urgent need to apply bioinformatics to develop intelligent methods that predict protein-ligand binding sites quickly and accurately, directly from the protein sequence. Such methods are also of great significance for discovering and understanding protein structure and physiological function.
At present, sequence-based prediction models for protein-ligand binding sites remain scarce. A survey of the literature shows that the computational models designed specifically for sequence-based protein-ligand binding site prediction include ATPint, ATPsite, GTPbinder, NsitePred, TargetATP, TargetATPsite, TargetS and TargetSOS. ATPint (J. S. Chauhan, N. K. Mishra, and G. P. Raghava, "Identification of ATP binding residues of a protein from its primary sequence," BMC Bioinformatics, vol. 10, pp. 434, 2009) and ATPsite (K. Chen, M. J. Mizianty, and L. Kurgan, "ATPsite: sequence-based prediction of ATP-binding residues," Proteome Sci, vol. 9 Suppl 1, pp. S4, 2011) are two early sequence-based prediction models for protein-ATP binding sites. GTPbinder (Chauhan, J. S., et al. (2010) Prediction of GTP interacting residues, dipeptides and tripeptides in a protein from its evolutionary information. BMC Bioinformatics, 11, 301) is a computational model designed specifically for predicting protein-GTP binding sites. TargetATP (Dong-Jun Yu, Jun Hu, Zhen-Min Tang, Hong-Bin Shen, Jian Yang, and Jing-Yu Yang. Improving Protein-ATP Binding Residues Prediction by Boosting SVMs with Random Under-Sampling. Neurocomputing. 2013, 104: 180-190) and TargetATPsite (Dong-Jun Yu, Jun Hu, Yan Huang, Hong-Bin Shen, Yong Qi, Zhen-Min Tang and Jing-Yu Yang. TargetATPsite: A Template-free Method for ATP Binding Sites Prediction with Residue Evolution Image Sparse Representation and Classifier Ensemble. Journal of Computational Chemistry. 2013, 34: 974-985) are likewise designed specifically for predicting protein-ATP binding sites. NsitePred (Chen K, Mizianty M J, Kurgan L. Prediction and analysis of nucleotide-binding residues using sequence and sequence-derived structural descriptors. Bioinformatics, 2012, 28(3): 331-341) and TargetSOS (Jun Hu, Xue He, Dong-Jun Yu*, Xi-Bei Yang, Jing-Yu Yang, and Hong-Bin Shen. A New Supervised Over-Sampling Algorithm with Application to Protein-Nucleotide Binding Residues Prediction. PLOS ONE. 2014, 9(9): e107676) are prediction models designed for the binding sites between proteins and nucleotides (ATP, ADP, AMP, GTP and GDP). TargetS (Dong-Jun Yu, Jun Hu, Jing Yang, Hong-Bin Shen, Jinhui Tang, and Jing-Yu Yang. Designing template-free predictor for targeting protein-ligand binding sites with classifier ensemble and spatial clustering. IEEE/ACM Transactions on Computational Biology and Bioinformatics. 2013, 10(4): 994-1008) is a computational model that can predict the binding sites between proteins and nucleotides (ATP, ADP, AMP, GTP and GDP) as well as metal ions (Ca²⁺, Mg²⁺, Mn²⁺, Fe³⁺ and Zn²⁺).
However, there are many kinds of ligands, and none of the models above covers them comprehensively. Moreover, protein-ligand binding site prediction is a classic imbalanced learning problem. Although some models use random down-sampling to mitigate part of the impact of imbalanced data, they do not treat different samples to be predicted differently and do not exploit the differences among them. As a result, the poor interpretability of protein-ligand binding site prediction models remains to be overcome, and prediction accuracy still falls well short of what practical applications require and urgently needs further improvement.
Summary of the invention
Existing protein-ligand binding site prediction methods suffer from limited generality, caused by incomplete coverage of ligand types, and from insufficient consideration of the differences among samples to be predicted, which leaves prediction accuracy far from practical needs and makes the models hard to interpret. To address these shortcomings, the object of the present invention is to propose a protein-ligand binding site prediction method based on sampling learning that combines random down-sampling, KNN dynamic sampling learning and an ensemble technique, and that offers high prediction accuracy and strong model interpretability.
To achieve the above object, the present invention adopts the following technical solution:
A protein-ligand binding site prediction method based on sampling learning, comprising the following steps:
Step 1: Feature extraction. Each amino acid residue in the protein sequence to be predicted is converted into a numerical representation. For a protein consisting of n amino acids, the PSI-BLAST program yields the protein's position-specific scoring matrix (PSSM), of size n × 20 (n rows, 20 columns). The PSSM is first normalized row by row with the sigmoid function $s(x) = 1/(1 + e^{-x})$; a sliding window of length winsize is then used to obtain the evolutionary information matrix of each amino acid residue, and this matrix is flattened into a feature vector of length 20 × winsize for the i-th residue of the protein sequence. The protein sequence is also input to the PSIPRED program to obtain its predicted secondary structure (PSS) probability matrix, of size n × 3 (n rows, 3 columns); a sliding window of the same size yields the secondary structure information matrix of each amino acid residue, which is flattened into a feature vector of length 3 × winsize. Finally, the feature vectors of the two kinds of information are concatenated to form the final feature vector used for prediction.
Step 2: Random down-sampling is applied to the non-binding site samples. The resulting non-binding site sample subset and the binding site sample set form a training set, on which an SVM is trained. A training set built this way keeps the positive and negative samples balanced, but it also makes the model insensitive to the differences among samples to be predicted. The KNN dynamic sampling learning technique in the next step compensates for this.
Step 3: For each sample to be predicted, features are first extracted as in step 1. The KNN dynamic sampling learning technique then samples the binding site samples and the non-binding site samples separately. Finally, the sampled binding site subset and non-binding site subset are merged and used to train an SVM dedicated to predicting that sample. This preserves the differences among samples to be predicted as far as possible and allows the model to handle more ligand classes.
Step 4: SVM ensemble. A threshold-based ensemble technique integrates the two SVMs trained in steps 2 and 3. Threshold segmentation is then applied to the ensemble output to determine whether each residue belongs to a binding site.
As the above technical solution shows, the beneficial effects of the present invention are:
1. Improved prediction accuracy: combining random down-sampling with KNN dynamic sampling learning lets the model capture both the commonality and the differences among samples to be predicted, so that more effective sample distribution information can be mined and the accuracy of protein-ligand binding site prediction is improved;
2. Improved interpretability: KNN dynamic sampling learning lets the model train a dedicated prediction model for each sample to be predicted; incorporating the differences among these samples also makes the predictions fairer and more reasonable, which improves the interpretability of the model.
Brief description of the drawings
Fig. 1 is a schematic diagram of the protein-ligand binding site prediction method that combines random down-sampling, KNN dynamic sampling learning and a threshold-based ensemble technique.
Detailed description of the embodiments
To make the technical content of the present invention easier to understand, the invention is further described below with reference to the accompanying drawing.
Fig. 1 shows the system architecture of the prediction method of the present invention. With reference to Fig. 1, according to an embodiment of the present invention, a protein-ligand binding site prediction method based on sampling learning comprises the following steps:
First, the PSI-BLAST and PSIPRED programs are used to obtain, respectively, the evolutionary information matrix (position-specific scoring matrix, PSSM) and the predicted secondary structure (PSS) probability matrix of each training protein. Second, a sliding-window technique builds a feature vector for each amino acid residue from the PSSM and from the PSS probability matrix, and the two feature vectors are concatenated to form the final feature vector used for prediction. Third, random down-sampling is applied to the non-binding site residues; the resulting non-binding site sample subset and the binding site samples form a training set, on which an SVM is trained. Fourth, the KNN dynamic sampling learning technique down-samples the binding site residues and the non-binding site residues separately; the resulting binding site subset and non-binding site subset form a training set, on which another SVM is trained. Finally, a threshold-based ensemble strategy integrates the two SVMs obtained above.
The process is described in more detail below with reference to the drawing.
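Step 1: Feature extraction
For a protein consisting of n amino acid residues, the PSI-BLAST program yields its position-specific scoring matrix (PSSM), of size n × 20 (n rows, 20 columns), which converts the protein sequence information into matrix form as follows:
$$\mathrm{PSSM} = \begin{pmatrix} pssm_{1,1} & pssm_{1,2} & \cdots & pssm_{1,20} \\ pssm_{2,1} & pssm_{2,2} & \cdots & pssm_{2,20} \\ \vdots & \vdots & \ddots & \vdots \\ pssm_{n,1} & pssm_{n,2} & \cdots & pssm_{n,20} \end{pmatrix} \tag{1}$$
Each value in the PSSM is normalized with the sigmoid function, so that $pssm^{normalized}_{i,j} = s(pssm_{i,j})$:
$$s(x) = \frac{1}{1 + e^{-x}} \tag{2}$$
A sliding window of size winsize is used to extract the PSSM feature matrix of the i-th amino acid residue:
$$\mathrm{PSSM}_i = \begin{pmatrix} pssm^{normalized}_{i-\frac{winsize-1}{2},1} & \cdots & pssm^{normalized}_{i-\frac{winsize-1}{2},20} \\ \vdots & \ddots & \vdots \\ pssm^{normalized}_{i+\frac{winsize-1}{2},1} & \cdots & pssm^{normalized}_{i+\frac{winsize-1}{2},20} \end{pmatrix} \tag{3}$$
This feature matrix is then flattened into a feature vector of dimension 20 × winsize:
$$x^{pssm}_i = \left( pssm^{normalized}_{i-\frac{winsize-1}{2},1},\; pssm^{normalized}_{i-\frac{winsize-1}{2},2},\; \ldots,\; pssm^{normalized}_{i+\frac{winsize-1}{2},20} \right)^T \tag{4}$$
For a protein sequence consisting of n amino acid residues, the PSIPRED program yields its predicted secondary structure (PSS) probability matrix, of size n × 3 (n rows, 3 columns):
$$\mathrm{PSS} = \begin{pmatrix} pss_{1,1} & pss_{1,2} & pss_{1,3} \\ pss_{2,1} & pss_{2,2} & pss_{2,3} \\ \vdots & \vdots & \vdots \\ pss_{n,1} & pss_{n,2} & pss_{n,3} \end{pmatrix} \tag{5}$$
A sliding window of the same size yields the PSS feature matrix of the i-th amino acid residue:
$$\mathrm{PSS}_i = \begin{pmatrix} pss_{i-\frac{winsize-1}{2},1} & pss_{i-\frac{winsize-1}{2},2} & pss_{i-\frac{winsize-1}{2},3} \\ \vdots & \vdots & \vdots \\ pss_{i+\frac{winsize-1}{2},1} & pss_{i+\frac{winsize-1}{2},2} & pss_{i+\frac{winsize-1}{2},3} \end{pmatrix} \tag{6}$$
The PSS feature matrix is then flattened into a feature vector of dimension 3 × winsize:
$$x^{pss}_i = \left( pss_{i-\frac{winsize-1}{2},1},\; pss_{i-\frac{winsize-1}{2},2},\; \ldots,\; pss_{i+\frac{winsize-1}{2},3} \right)^T \tag{7}$$
Finally, the vectors of formulas (4) and (7) are concatenated to obtain the feature vector of the sample to be predicted.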
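The feature extraction of step 1 can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the patent's implementation: the function names are hypothetical, the PSSM and PSS matrices are assumed to be already parsed into lists of rows, and window positions that fall outside the sequence are filled with zeros, which is an assumption because the text does not say how sequence ends are handled.

```python
import math

def sigmoid(x):
    """Formula (2): squash a raw PSSM score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def window_features(rows, winsize):
    """For each residue i, flatten rows i-(winsize-1)/2 .. i+(winsize-1)/2.

    rows: list of n equal-length score lists. Positions outside the
    sequence are zero-filled (assumption; the text gives no padding rule).
    """
    if winsize % 2 == 0:
        raise ValueError("winsize must be odd")
    half = (winsize - 1) // 2
    zero_row = [0.0] * len(rows[0])
    features = []
    for i in range(len(rows)):
        vec = []
        for k in range(i - half, i + half + 1):
            vec.extend(rows[k] if 0 <= k < len(rows) else zero_row)
        features.append(vec)
    return features

def residue_features(pssm, pss, winsize):
    """Concatenate the windowed, normalized PSSM with the windowed PSS."""
    normalized = [[sigmoid(v) for v in row] for row in pssm]
    x_pssm = window_features(normalized, winsize)  # 20 * winsize values each
    x_pss = window_features(pss, winsize)          # 3 * winsize values each
    return [a + b for a, b in zip(x_pssm, x_pss)]
```

Each residue thus receives a vector of length 23 × winsize, the concatenation of the vectors in formulas (4) and (7).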
Step 2: Random down-sampling is applied to the non-binding site samples; the sampled non-binding site subset and the binding site samples form a training set, on which an SVM is trained.
A training set built this way keeps positive and negative samples balanced, but it also makes the model insensitive to the differences among samples to be predicted. The KNN dynamic sampling learning technique in the next step compensates for this.
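As a rough illustration of step 2, the sketch below keeps every binding site sample and draws a random subset of the non-binding site samples. The function name, the `ratio` parameter (non-binding samples kept per binding sample) and the `seed` parameter are assumptions introduced for the example; the text does not state the sampling ratio.

```python
import random

def random_downsample(samples, labels, ratio=1.0, seed=None):
    """Keep all binding samples (+1) and a random subset of non-binding ones (-1).

    ratio is the number of non-binding samples kept per binding sample
    (hypothetical parameter; 1.0 gives a balanced training set).
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == -1]
    keep = min(len(neg), int(round(ratio * len(pos))))
    chosen = pos + rng.sample(neg, keep)
    rng.shuffle(chosen)  # avoid presenting the classes in blocks
    return [samples[i] for i in chosen], [labels[i] for i in chosen]
```

The returned lists form the training set for the first SVM, which is trained once and reused for every sample to be predicted.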
Step 3: The KNN dynamic sampling learning technique down-samples the binding site samples and the non-binding site samples separately; the sampled binding site subset and non-binding site subset form a training set, on which an SVM is trained.
Let $S^{tr} = \{(x_i^{tr}, y_i^{tr})\}$ be the original training set of amino acid residues, where $x_i^{tr}$ is the feature vector of the i-th sample and $y_i^{tr}$ indicates whether the i-th sample is a binding site (-1 for a non-binding site, 1 for a binding site); let $x_j^{tst}$ be the amino acid residue to be predicted with index j.
So that the KNN dynamic sampling learning technique can sample the binding site samples and the non-binding site samples separately, formula (8) first splits $S^{tr}$ into binding site samples and non-binding site samples according to their labels:
$$\left( S^{tr}_{binding},\; S^{tr}_{non\text{-}binding} \right) = \mathrm{DivideDataset}\left( S^{tr} \right) \tag{8}$$
where $S^{tr}_{binding}$ is the binding site sample set and $S^{tr}_{non\text{-}binding}$ is the non-binding site sample set.
Then, within $S^{tr}_{binding}$ and $S^{tr}_{non\text{-}binding}$ respectively, the KNN algorithm uses the feature information $x_j^{tst}$ of the sample to be predicted to search for its neighbors in the binding site sample set and in the non-binding site sample set:
$$neighbor^{binding}_j = \mathrm{KNNSelection}\left( x_j^{tst},\; S^{tr}_{binding} \right) \tag{9}$$
$$neighbor^{non\text{-}binding}_j = \mathrm{KNNSelection}\left( x_j^{tst},\; S^{tr}_{non\text{-}binding} \right) \tag{10}$$
The two neighbor sets $neighbor^{binding}_j$ and $neighbor^{non\text{-}binding}_j$ are then merged to form a training set dedicated to predicting $x_j^{tst}$:
$$neighborSet = \mathrm{Union}\left( neighbor^{binding}_j,\; neighbor^{non\text{-}binding}_j \right) \tag{11}$$
An SVM dedicated to predicting this sample is then trained on $neighborSet$.
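The dataset split, the two neighbor searches and the union in step 3 can be sketched as follows. This is an illustrative sketch only: the function names mirror the operators in the formulas but are hypothetical, Euclidean distance is assumed because the text does not name a metric, and the neighbor counts `k_binding` and `k_non_binding` are assumed parameters.

```python
import math

def divide_dataset(train):
    """Split (feature_vector, label) pairs by label, as in formula (8)."""
    binding = [(x, y) for x, y in train if y == 1]
    non_binding = [(x, y) for x, y in train if y == -1]
    return binding, non_binding

def knn_selection(query, subset, k):
    """Return the k samples of `subset` nearest to `query`.

    Euclidean distance is an assumption; the text does not name a metric.
    """
    return sorted(subset, key=lambda s: math.dist(query, s[0]))[:k]

def dynamic_training_set(query, train, k_binding, k_non_binding):
    """Union of the query's nearest binding and non-binding neighbors."""
    binding, non_binding = divide_dataset(train)
    return (knn_selection(query, binding, k_binding)
            + knn_selection(query, non_binding, k_non_binding))
```

The pairs returned for a given query would then be passed to an SVM trainer of the reader's choice, so one model is fitted per sample to be predicted. A brute-force sort is used here for clarity; a spatial index would be preferable for large training sets.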
Step 4: A threshold-based ensemble technique integrates the SVMs from step 2 and step 3.
Let pro_rand and pro_dynamic be the probabilities predicted for the same sample by the SVMs from step 2 and step 3, respectively. The threshold-based ensemble is defined as follows:
$$pro_{ensemble} = \underset{p \,\in\, \{pro\_rand,\; pro\_dynamic\}}{\arg\max} \left| p - cthres \right| \tag{12}$$
where cthres is an adjustable threshold parameter in the range 0 to 1.
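Read literally, the ensemble rule keeps whichever of the two probabilities lies farther from cthres, that is, the more decisive of the two predictions. A minimal sketch under that reading is shown below; the function name, the default of 0.5 for `cthres` and the tie-breaking rule are assumptions, since the text specifies none of them.

```python
def threshold_ensemble(pro_rand, pro_dynamic, cthres=0.5):
    """Return the probability that lies farther from cthres.

    Ties go to pro_rand (assumption; the text does not define tie-breaking).
    The default cthres of 0.5 is illustrative only.
    """
    if not 0.0 <= cthres <= 1.0:
        raise ValueError("cthres must lie in [0, 1]")
    return max((pro_rand, pro_dynamic), key=lambda p: abs(p - cthres))
```

For example, with cthres = 0.5 the pair (0.6, 0.1) yields 0.1, because 0.1 is 0.4 away from the threshold while 0.6 is only 0.1 away.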
Finally, threshold segmentation determines whether each residue belongs to a binding site:
$$f\left( x_j^{tst} \right) = \begin{cases} -1, & \text{if } pro_{ensemble} \ge T \\ 1, & \text{otherwise} \end{cases} \tag{13}$$
where T is a preset threshold in the range 0 to 1, chosen to maximize the Matthews correlation coefficient (MCC) of the prediction results.
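Choosing T to maximize the Matthews correlation coefficient can be done with a simple grid search on labelled validation data, as sketched below. The function names and the grid resolution `steps` are hypothetical. The sketch also assumes that the scores are probabilities of the binding class, so a residue scoring at least T is labelled +1; if the scores instead denote the non-binding class, as the sign convention in the threshold formula would imply, the comparison must be flipped.

```python
import math

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for labels in {+1, -1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == -1 and p == -1)
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def best_threshold(y_true, binding_probs, steps=100):
    """Grid-search T in [0, 1] for the highest MCC.

    Assumes binding_probs are probabilities of the binding class (+1).
    Returns the first T on the grid that attains the best score.
    """
    best_t, best_score = 0.0, -1.0
    for s in range(steps + 1):
        t = s / steps
        y_pred = [1 if p >= t else -1 for p in binding_probs]
        score = mcc(y_true, y_pred)
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score
```

In practice the threshold would be selected on training or cross-validation data and then held fixed when predicting new proteins.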
In summary, compared with existing prediction methods, the notable advantages of the present invention are as follows: the method can handle the imbalanced data learning problem in protein-ligand binding site prediction and can deeply mine the differences among samples to be predicted. This not only distinguishes different ligands as far as possible, but also strengthens the interpretability of the prediction model and improves its prediction accuracy.
Although the present invention has been disclosed above by way of preferred embodiments, they are not intended to limit it. A person of ordinary skill in the art to which the invention belongs may make various modifications and variations without departing from the spirit and scope of the invention. The scope of protection of the invention is therefore defined by the claims.

Claims (6)

1. A protein-ligand binding site prediction method based on sampling learning, characterized by comprising the following steps:
Step 1: feature extraction: using the PSI-BLAST and PSIPRED programs to extract the evolutionary information and secondary structure information of the protein to be predicted; on this basis, using a sliding-window technique to convert each amino acid residue in the protein sequence into a feature vector; and then concatenating the feature vectors of the two kinds of information to form the final feature vector used for prediction;
Step 2: applying random down-sampling to the non-binding site samples; forming a training set from the resulting non-binding site sample subset and the binding site sample set; and training an SVM on this training set;
Step 3: for each sample to be predicted, first extracting features in the manner of step 1; then using a KNN dynamic sampling learning technique to sample the binding site samples and the non-binding site samples separately; and finally merging the sampled binding site subset and non-binding site subset to train an SVM dedicated to predicting that sample; and
Step 4: using a threshold-based ensemble technique to integrate the two SVMs obtained in step 2 and step 3.
2. The protein-ligand binding site prediction method based on sampling learning according to claim 1, characterized in that, in step 1, for a protein sequence consisting of n amino acids, the PSI-BLAST program is used to obtain the position-specific scoring matrix (PSSM) of the protein, the size of which is n × 20; the PSSM is then normalized row by row; a sliding window of length winsize is then used to obtain the evolutionary information matrix of each amino acid residue; and the evolutionary information matrix is flattened into a feature vector of length 20 × winsize.
3. The protein-ligand binding site prediction method based on sampling learning according to claim 2, characterized in that, in step 1, a protein sequence consisting of n amino acids is input to the PSIPRED program to obtain the predicted secondary structure (PSS) probability matrix of the protein sequence, the size of which is n × 3; a sliding window of the same size as above is then used to obtain the secondary structure information matrix of each amino acid residue; and finally the secondary structure information matrix is flattened into a feature vector of length 3 × winsize.
4. The protein-ligand binding site prediction method based on sampling learning according to claim 1, characterized in that, in step 3, the KNN dynamic sampling learning technique samples the binding site sample set and the non-binding site sample set separately.
5. The protein-ligand binding site prediction method based on sampling learning according to claim 1, characterized in that, in step 4, threshold segmentation is applied to the output of the integrated SVMs to determine whether each amino acid residue belongs to a binding site.
6. The protein-ligand binding site prediction method based on sampling learning according to claim 5, characterized in that, when threshold segmentation is used to determine whether each amino acid residue belongs to a binding site, the selected threshold lies in the range 0 to 1 and is chosen to maximize the Matthews correlation coefficient of the prediction results.
CN201510368016.2A 2015-06-29 2015-06-29 Protein-ligand binding site prediction method based on sampling learning Active CN104992079B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510368016.2A CN104992079B (en) 2015-06-29 2015-06-29 Protein-ligand binding site prediction method based on sampling learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510368016.2A CN104992079B (en) 2015-06-29 2015-06-29 Protein-ligand binding site prediction method based on sampling learning

Publications (2)

Publication Number Publication Date
CN104992079A true CN104992079A (en) 2015-10-21
CN104992079B CN104992079B (en) 2018-07-06

Family

ID=54303892

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510368016.2A Active CN104992079B (en) 2015-06-29 2015-06-29 Protein-ligand binding site prediction method based on sampling learning

Country Status (1)

Country Link
CN (1) CN104992079B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102760210A (en) * 2012-06-19 2012-10-31 南京理工大学常熟研究院有限公司 Adenosine triphosphate binding site predicting method for protein
CN103617203A (en) * 2013-11-15 2014-03-05 南京理工大学 Protein-ligand binding site predicting method based on inquiry drive
CN103955628A (en) * 2014-04-22 2014-07-30 南京理工大学 Subspace fusion-based protein-vitamin binding location point predicting method
CN104077499A (en) * 2014-05-25 2014-10-01 南京理工大学 Supervised up-sampling learning based protein-nucleotide binding positioning point prediction method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
余健浩, 孙廷凯: "Protein-ATP binding site prediction based on random down-sampling and SVR", 《现代电子技术》 *
石大宏, 何雪: "Sequence-based protein-GDP binding site prediction", 《计算机工程与应用》 *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105808975A (en) * 2016-03-14 2016-07-27 南京理工大学 Multi-core-learning and Boosting algorithm based protein-DNA binding site prediction method
CN107273714A (en) * 2017-06-07 2017-10-20 南京理工大学 The ATP binding site estimation methods of conjugated protein sequence and structural information
CN107194207A (en) * 2017-06-26 2017-09-22 南京理工大学 Protein ligands binding site estimation method based on granularity support vector machine ensembles
CN107480469B (en) * 2017-07-31 2020-07-07 同济大学 Method for rapidly searching given pattern in gene sequence
CN107480469A (en) * 2017-07-31 2017-12-15 同济大学 It is a kind of to be used for method of the fast search to mould-fixed in gene order
WO2019041333A1 (en) * 2017-08-31 2019-03-07 深圳大学 Method, apparatus, device and storage medium for predicting protein binding sites
CN109326329A (en) * 2018-11-14 2019-02-12 金陵科技学院 Zinc-binding protein matter action site prediction technique based on integrated study under a kind of unbalanced mode
CN109326329B (en) * 2018-11-14 2020-07-07 金陵科技学院 Zinc binding protein action site prediction method
CN110197700A (en) * 2019-04-16 2019-09-03 浙江工业大学 A kind of a-protein TP interconnection method based on differential evolution
CN110197700B (en) * 2019-04-16 2021-04-06 浙江工业大学 Protein ATP docking method based on differential evolution
CN111785321A (en) * 2020-06-12 2020-10-16 浙江工业大学 DNA binding residue prediction method based on deep convolutional neural network
CN111785321B (en) * 2020-06-12 2022-04-05 浙江工业大学 DNA binding residue prediction method based on deep convolutional neural network
CN112599186A (en) * 2020-12-30 2021-04-02 兰州大学 Compound target protein binding prediction method based on multi-depth learning model consensus
CN112599186B (en) * 2020-12-30 2022-09-27 兰州大学 Compound target protein binding prediction method based on multi-deep learning model consensus

Also Published As

Publication number Publication date
CN104992079B (en) 2018-07-06

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information
CB03 Change of inventor or designer information

Inventor after: Yu Dongjun

Inventor after: Hu Jun

Inventor after: Wang Ke

Inventor after: He Xue

Inventor after: Li Yang

Inventor after: Yang Jingyu

Inventor before: Hu Jun

Inventor before: He Xue

Inventor before: Li Yang

Inventor before: Yu Dongjun

Inventor before: Shen Hongbin

Inventor before: Yang Jingyu

GR01 Patent grant
GR01 Patent grant