CN109326329A - Ensemble-learning-based method for predicting zinc-binding protein binding sites on imbalanced data - Google Patents

Publication number
CN109326329A
Authority
CN
China
Prior art keywords
zinc
sample
binding protein
binding site
protein
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811353819.0A
Other languages
Chinese (zh)
Other versions
CN109326329B (en)
Inventor
李慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinling Institute of Technology
Original Assignee
Jinling Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Abstract

The invention discloses an ensemble-learning-based method for predicting zinc-binding sites in proteins under imbalanced data. The protein source data are preprocessed according to the characteristics of zinc-binding sites. The imbalance between bound and unbound sites is then corrected by random down-sampling, which yields several balanced sub-datasets. On each balanced sub-dataset, discriminative biochemical features of the protein are selected and encoded as feature vectors. The feature vectors are fed to a support vector machine (SVM) base classifier and sample weights are computed; a probabilistic neural network (PNN) model based on sample weighting is then built; finally the SVM base classifier and the sample-weighted PNN are integrated to obtain the prediction model. The prediction model is used to identify zinc-binding sites in target samples.

Description

An ensemble-learning-based method for predicting zinc-binding protein binding sites under imbalanced data
Technical field
The present invention relates to an ensemble-learning-based method for predicting zinc-binding sites in proteins under imbalanced data. It uses an ensemble classification model to identify zinc-binding sites in an imbalanced classification setting, and belongs to the intersection of proteomics and computer science.
Background
With the completion of the Human Genome Project, the life sciences have entered the post-genomic era, and the proteins expressed by genes have become one of the important research topics in the life and natural sciences. Proteins are the basic organic constituents of cells and the material basis of life, and they play a decisive role in biological processes. This role, however, is usually not determined by a single protein: in most cases a protein must interact with other proteins or with ligands to carry out a specific biological function.
Within cells, proteins are the agents and carriers of vital activity and perform key functions by interacting with ligands, for example DNA synthesis, signal transduction, transcriptional activation of genes, metabolism, and antiviral defense. In addition, understanding protein interactions greatly advances the treatment of various diseases, especially those caused by viral proteins such as those of Ebola virus: it can reveal the pathogenesis of certain diseases and guides the discovery of drug targets and the development of new drugs.
Metal ions bind to proteins as cofactors and are decisive for the biological function of proteins and even for some life processes. Zinc is the second most abundant metal ion in organisms, after iron, and plays an important regulatory role in growth and development, disease control, DNA synthesis, and so on. Zinc deficiency can lead to diseases such as age-related degenerative diseases, malignant tumors, and Wilson's disease. Zinc also plays an important role in aging, apoptosis, immune function, and oxidative stress. It is by binding to proteins that zinc ions exert their biological functions of catalysis, structural stabilization, and coordination.
Zinc-binding sites are mainly identified by biochemical experiments. Although these experimental methods can determine the interaction sites between proteins and zinc ions, they are costly, time-consuming, and laborious. Moreover, different experiments impose different constraints and rely on different principles, so the results contain a certain number of false negatives and false positives. Relying on experimental techniques alone to uncover the biological significance of these data therefore falls far short of what the development of biology requires.
With the development of information technology and the emergence of massive biological data, automatically identifying zinc-binding sites with computational methods such as data mining and machine learning has become an inevitable trend. Such methods are cheap and fast, can compensate for the shortcomings of experiments, and provide direct support and guidance for the costly experimental measurement of interactions.
Predicting zinc-binding sites is a binary classification problem. Truly bound sites are rare and unbound sites make up the great majority, so it is a typical imbalanced classification problem. Existing prediction methods build classification models with data mining and similar techniques but treat the two classes equally and ignore the imbalance of the data, which results in very low accuracy for zinc-binding site prediction. Studying the imbalance in zinc-binding site prediction and improving classification accuracy on the minority class is therefore of considerable research value.
Summary of the invention
The object of the present invention is to provide an ensemble-learning-based method for predicting zinc-binding sites in proteins under imbalanced data, addressing the imbalanced classification problem in zinc-binding site prediction.
To solve the above technical problem, the invention adopts the following technical solution:
An ensemble-learning-based method for predicting zinc-binding sites in proteins under imbalanced data, comprising the following steps:
Step 1: preprocess the protein source data according to the characteristics of zinc-binding sites;
Step 2: balance the imbalanced zinc-binding site data by random down-sampling to obtain several balanced sub-datasets;
Step 3: on each balanced sub-dataset, select discriminative biochemical features of the protein, encode them, and assemble feature vectors;
Step 4: feed the feature vectors to the base classifier, a support vector machine (SVM), and compute sample weights; then build a probabilistic neural network (PNN) model based on sample weighting; finally integrate the SVM base classifier with the sample-weighted PNN model to obtain the prediction model;
Step 5: use the prediction model obtained in step 4 to identify the zinc-binding sites in the target samples.
In step 1, the preprocessing removes the following noisy data:
(1) peptide chain structures with homology higher than 70% are removed;
(2) duplicated chains, short protein chains, and erroneous or unreliable data are discarded;
(3) chains meeting the criterion of sequence redundancy below 20% are removed.
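Part of rule (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `filter_chains` and the `min_length` cutoff are hypothetical (the text gives no length threshold), and the homology and redundancy rules (1) and (3) would need a sequence clustering tool that is not shown here.

```python
def filter_chains(chains, min_length=50):
    """Drop exact duplicate sequences and short chains.

    chains: dict mapping a chain identifier to its amino-acid sequence.
    min_length: hypothetical cutoff; the source does not state a value.
    """
    seen = set()
    kept = {}
    for chain_id, seq in chains.items():
        seq = seq.strip().upper()
        # Skip chains that are too short or whose sequence was already kept.
        if len(seq) < min_length or seq in seen:
            continue
        seen.add(seq)
        kept[chain_id] = seq
    return kept
```

Dictionary order is preserved, so the first occurrence of a duplicated sequence is the one that survives.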
In step 2, the balancing applies random down-sampling to the majority-class samples, each time drawing as many samples as there are minority-class samples, to form several balanced sub-datasets. The majority-class samples are the unbound protein sites and the minority-class samples are the zinc-bound protein sites.
In step 3, the discriminative biochemical features are the position-specific scoring matrix (PSSM), the conservation score, and RW-GRMTP (relative weight of gapless real matches to pseudocounts). The PSSM is normalized and processed with a histogram and a sliding window to give a 20-dimensional vector; the 20-dimensional conservation score is reduced to a single value; RW-GRMTP is normalized to give a 2-dimensional vector. Together these form a 23-dimensional feature vector (20 + 1 + 2).
In step 4, an SVM base classifier is trained on each balanced sub-dataset, and the prediction error rate e_j and the importance weight α_j of the classification model are computed according to formulas (1) and (2).
Here the full dataset is D = {(x_1, y_1), (x_2, y_2), …, (x_n, y_n)}, where x_i ∈ X, X is the instance space of the classification problem, y_i ∈ {1, -1}, i = 1, 2, …, n, and n is the number of samples. w_mi is the sample weight (m = 1, 2), initialized to 1/n, i.e. w_1 = (w_11, w_12, …, w_1n) with w_1i = 1/n for i = 1, 2, …, n. The SVM base classifier is trained on each of the k balanced sub-datasets, giving k classification results C_svm_j(x), j = 1, …, k.
The current sample weights are then computed and normalized: if a sample is classified correctly its weight is decreased, and if it is misclassified its weight is increased, as given by formula (3):
The probabilistic neural network model based on sample weighting is built by weighting the protein feature data and using the weighted sample data as the input of a probabilistic neural network, which then makes the prediction. This method is denoted SWPNN and its prediction result is SWPNN(x).
Integrating the SVM base classifier with the sample-weighted probabilistic neural network gives the prediction model SSWPNN = {SVM, SWPNN, kernelopt, spread, f}, where kernelopt and spread are the parameters of the SVM and SWPNN classifiers respectively and f is defined by formula (4). A corresponding weight β_j is also computed from the error rate.
Here δ is a threshold, and C_svm_j(x) and SWPNN(x) are the classification outputs of the SVM and SWPNN classifiers: a value greater than 0 predicts the positive class and a value less than 0 predicts the negative class. If the SVM output is positive but below the threshold δ and SWPNN predicts the negative class, the integrated prediction is negative; in all other cases the SVM result is the final decision.
In step 5, the integrated SSWPNN models are each applied to the entire test dataset, giving different classification results, which are then combined by weighted integration to identify the zinc-binding sites in the target samples, as shown in formula (5):
Beneficial effects:
Approaching the problem from the perspective of machine learning, the invention proposes a novel ensemble-learning-based method for identifying zinc-binding sites in proteins under imbalanced data. It effectively handles zinc-binding site prediction in an imbalanced classification setting and achieves a certain level of prediction accuracy. With suitable extension, the invention can also be applied to predicting the binding sites of proteins that bind other types of metal ions.
Brief description of the drawings
The invention is further described below with reference to the drawings and specific embodiments, from which the above and other advantages of the invention will become clearer.
Fig. 1 is the overall framework of the method of the invention.
Fig. 2 is the framework of the zinc-binding site classifier based on the SVM and SWPNN models.
Fig. 3 is the prediction process of the SSWPNN classifier.
Detailed description of embodiments
The invention may be better understood from the following embodiments.
The overall procedure of the invention is shown in Fig. 1.
The invention addresses zinc-binding site prediction on imbalanced datasets. Down-sampling is used to balance the data. Ensemble techniques are then used to build a classifier model based on a support vector machine and a sample-weighted probabilistic neural network, and the model is used to classify and identify zinc-binding sites. The specific steps are as follows:
1. Balancing
The zinc-bound protein sites are called minority-class samples (negative class samples); the unbound protein sites are called majority-class samples (positive class samples). The majority-class samples are randomly down-sampled without replacement. To avoid the loss of useful majority-class information that random down-sampling may cause, the sampling without replacement is repeated several times over the full dataset. Each draw takes as many majority-class samples as there are minority-class samples, so the majority class is divided into k subsets, and each subset is merged with the minority-class samples to form the balanced datasets D_1, D_2, …, D_k. The procedure is described by Algorithm 1:
Algorithm 1: Data balancing
Input: protein sequence sample data D
Output: balanced sub-datasets D_1, D_2, …, D_k
BEGIN
  (MinoritySample, MajoritySample) = Divide(D)
  N = CountUp(MinoritySample)
  FOR i = 1 TO k
    ExtractedSample_i = RandomExtract(MajoritySample, N)
    D_i = Merge(MinoritySample, ExtractedSample_i)
    MajoritySample = MajoritySample - ExtractedSample_i
  END FOR
END
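Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function name `make_balanced_subsets` and the `seed` parameter are assumptions, and samples are represented as plain list items. Shuffling the majority class once and slicing it into consecutive blocks is equivalent to repeatedly drawing N samples without replacement and removing them from the pool.

```python
import random

def make_balanced_subsets(minority, majority, k, seed=0):
    """Build k balanced subsets: all minority samples plus N majority samples each.

    minority, majority: lists of samples (any objects).
    k: number of balanced subsets to build.
    seed: hypothetical parameter, used only to make the draw reproducible.
    """
    n = len(minority)
    if k * n > len(majority):
        raise ValueError("not enough majority samples for k disjoint draws")
    pool = list(majority)
    random.Random(seed).shuffle(pool)
    subsets = []
    for i in range(k):
        # Block i of the shuffled pool is one draw without replacement.
        drawn = pool[i * n:(i + 1) * n]
        subsets.append(list(minority) + drawn)
    return subsets
```

Because the blocks are disjoint, no majority-class sample appears in more than one balanced subset, which matches the line that subtracts ExtractedSample_i from MajoritySample.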
2. Feature representation
Discriminative biochemical features are selected: the position-specific scoring matrix (PSSM), the conservation score, and RW-GRMTP (relative weight of gapless real matches to pseudocounts). They are encoded to form the set of feature vectors. The PSSM is normalized and processed with a histogram and a sliding window to give a 20-dimensional vector; the 20-dimensional conservation score is reduced to a single value; RW-GRMTP is normalized to give a 2-dimensional vector. Together these form a 23-dimensional feature vector (20 + 1 + 2).
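The assembly of the 23-dimensional vector can be illustrated with the sketch below. The text does not specify the normalization, the histogram step, or how the conservation score is reduced, so every one of those choices here is an assumption made only to show the shape of the computation: logistic scaling of PSSM entries, averaging over a sliding window of hypothetical width `window`, the mean of the 20 conservation values, and min-max scaling of RW-GRMTP along the chain. The histogram processing is not modeled.

```python
import math

def residue_features(pssm, conservation, rw_grmtp, pos, window=7):
    """Return a 23-value feature list for the residue at index pos.

    pssm: list of rows, 20 scores per residue.
    conservation: list of rows, 20 values per residue.
    rw_grmtp: list of rows, 2 values per residue.
    """
    half = window // 2
    rows = pssm[max(0, pos - half): pos + half + 1]
    # 20 values: logistic-scaled PSSM columns averaged over the window (assumed).
    pssm_part = [
        sum(1.0 / (1.0 + math.exp(-row[c])) for row in rows) / len(rows)
        for c in range(20)
    ]
    # 1 value: mean of the 20 conservation values (assumed reduction).
    cons_part = [sum(conservation[pos]) / len(conservation[pos])]
    # 2 values: min-max scaling of each RW-GRMTP column over the chain (assumed).
    rw_part = []
    for c in range(2):
        column = [row[c] for row in rw_grmtp]
        lo, hi = min(column), max(column)
        rw_part.append((rw_grmtp[pos][c] - lo) / (hi - lo) if hi > lo else 0.0)
    return pssm_part + cons_part + rw_part
```

Whatever the exact encodings are, the three parts concatenate to 20 + 1 + 2 = 23 values per residue.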
3. Ensemble of the support vector machine and the sample-weighted probabilistic neural network
The SVM base classifier is trained first. Based on its classification results the samples are weighted, giving more emphasis to the "hard" samples that lie near the class boundary and are easily misclassified, and a weighted probabilistic neural network model is then trained.
Let the full dataset be D = {(x_1, y_1), (x_2, y_2), …, (x_n, y_n)}, where x_i ∈ X, X is the instance space of the classification problem, y_i ∈ {1, -1}, i = 1, 2, …, n, and n is the number of samples. The procedure is:
Step 1: train an SVM classifier on each balanced sub-dataset;
The SVM base classifier is trained on each of the k balanced sub-datasets with 5-fold cross-validation, giving k classification results C_svm_j(x), j = 1, …, k. The prediction error rate is denoted e_j and the importance weight of the classification model α_j; they are computed by formulas (1) and (2). In formula (1), w_mi is the sample weight, initialized to 1/n, i.e. w_1 = (w_11, w_12, …, w_1n) with w_1i = 1/n, i = 1, 2, …, n, m = 1, 2.
Step 2: compute and normalize the current sample weights;
After the first round of prediction by the SVM base classifier, a sample that was classified correctly has its weight decreased for the next round, whereas a misclassified sample has its weight increased. The sample weight function is given by formula (3):
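Formulas (1) to (3) do not appear in this text. The surrounding description (uniform initial weights 1/n, an error rate, a classifier importance weight, and weights that fall for correct samples and rise for misclassified ones before normalization) matches the standard AdaBoost update, so the sketch below assumes that form. It is an illustration of that assumption, not a reproduction of the patent's formulas, and the function name `boosting_update` is hypothetical.

```python
import math

def boosting_update(y_true, y_pred, weights):
    """AdaBoost-style update (assumed form of formulas (1)-(3)).

    y_true, y_pred: lists of labels in {1, -1}.
    weights: current sample weights, e.g. [1/n] * n in the first round.
    Returns (error_rate, alpha, new_weights).
    """
    total = sum(weights)
    # Weighted share of misclassified samples (assumed formula (1)).
    error = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p) / total
    error = min(max(error, 1e-10), 1.0 - 1e-10)  # keep the logarithm finite
    # Importance weight of the classifier (assumed formula (2)).
    alpha = 0.5 * math.log((1.0 - error) / error)
    # Correct samples (t * p = 1) shrink, misclassified ones (t * p = -1) grow.
    raw = [w * math.exp(-alpha * t * p) for w, t, p in zip(weights, y_true, y_pred)]
    norm = sum(raw)
    return error, alpha, [w / norm for w in raw]
```

With four equally weighted samples of which one is misclassified, the error rate is 0.25 and, after normalization, the misclassified sample carries half of the total weight.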
Step 3: train the sample-weighted PNN predictor, SWPNN;
The feature sample data are weighted with the weights computed in Step 2, and a weighted probabilistic neural network model is trained. The proposed method is denoted SWPNN and its prediction result is SWPNN(x). The framework of the zinc-binding site classifier based on the SVM and SWPNN models is shown in Fig. 2.
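One way to read the sample weighting is sketched below. A probabilistic neural network estimates a Gaussian (Parzen) density per class, controlled by a smoothing parameter `spread`; here each training pattern's kernel contribution is multiplied by its sample weight. That placement of the weights, the signed score in [-1, 1], and the function name `swpnn_score` are assumptions, since the text only states that the feature data are weighted before being passed to the network.

```python
import math

def swpnn_score(train_x, train_y, sample_weights, x, spread=0.5):
    """Signed score of a sample-weighted PNN (assumed form).

    train_x: list of feature vectors; train_y: labels in {1, -1}.
    sample_weights: one weight per training sample.
    Returns a value in [-1, 1]; > 0 means the positive class.
    """
    pos = neg = 0.0
    for xi, yi, wi in zip(train_x, train_y, sample_weights):
        d2 = sum((a - b) ** 2 for a, b in zip(xi, x))
        # Gaussian pattern-layer unit, scaled by the sample weight.
        contrib = wi * math.exp(-d2 / (2.0 * spread ** 2))
        if yi == 1:
            pos += contrib
        else:
            neg += contrib
    total = pos + neg
    return (pos - neg) / total if total > 0 else 0.0
```

Under this reading, hard samples with large weights pull the decision boundary toward themselves, which is the stated purpose of the weighting.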
Step 4: integrate the SVM base classifier with the sample-weighted SWPNN classifier;
Integrating the SVM base classifier with the sample-weighted probabilistic neural network gives a new prediction method, SSWPNN = {SVM, SWPNN, kernelopt, spread, f}, where kernelopt and spread are the parameters of the SVM and SWPNN classifiers respectively and f is defined by formula (4). A corresponding weight β_j (the weight of this base classifier in the final classifier) is also computed from the error rate.
Here δ is a threshold, and C_svm_j(x) and SWPNN(x) are the classification outputs of the SVM and SWPNN classifiers: a value greater than 0 predicts the positive class and a value less than 0 predicts the negative class. If the SVM output is positive but small, i.e. below the threshold δ, and SWPNN predicts the negative class, the integrated prediction is negative; in all other cases the SVM result is the final decision.
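The decision rule described for f can be written directly from the prose. The function name `combine` is hypothetical, and treating an SVM output of exactly 0 as negative is an assumption, since the text only covers values greater or less than 0.

```python
def combine(svm_score, swpnn_score, delta):
    """Integrated decision f for one sample, returning 1 or -1.

    svm_score: signed output of the SVM base classifier.
    swpnn_score: signed output of the sample-weighted PNN.
    delta: threshold below which a positive SVM output counts as weak.
    """
    # A weak positive SVM vote is overridden when SWPNN votes negative.
    if 0 < svm_score < delta and swpnn_score < 0:
        return -1
    # In every other case the SVM decision stands.
    return 1 if svm_score > 0 else -1
```

For example, with delta = 0.5 an SVM output of 0.1 is overturned by a negative SWPNN output, while an SVM output of 0.8 is not.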
Step 5: the integrated SSWPNN models from Step 4 are each applied to the entire dataset, giving different classification results, which are then combined by weighted integration using formula (5) to identify the zinc-binding sites. The framework model is shown in Fig. 3.
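Formula (5) does not appear in this text. A common form of weighted integration is the sign of the β-weighted sum of the k individual decisions, and the sketch below assumes that form; the function name `ensemble_predict` is hypothetical and ties are assigned to the negative class by assumption.

```python
def ensemble_predict(decisions, betas):
    """Weighted vote over k SSWPNN models (assumed form of formula (5)).

    decisions: k lists of per-sample decisions in {1, -1}.
    betas: k classifier weights, one per model.
    Returns one final label in {1, -1} per sample.
    """
    n_samples = len(decisions[0])
    labels = []
    for i in range(n_samples):
        # Beta-weighted sum of the k votes for sample i.
        score = sum(b * d[i] for b, d in zip(betas, decisions))
        labels.append(1 if score > 0 else -1)
    return labels
```

A model with a low error rate, and therefore a large β_j, can outvote several weaker models on a given residue.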
The method was tested on a dataset of 392 protein chains and compared with four existing methods (meta-ZincPrediction, ZincExplorer, zincFinder, zincPred). Both in overall prediction performance over the four residue types (C, H, E, D) and in prediction performance for each individual residue type, the method of the invention outperforms the other methods.
The invention provides the idea and approach of an ensemble-learning-based method for predicting zinc-binding sites in proteins under imbalanced data. There are many ways to implement this technical solution, and the above is only a preferred embodiment. Those of ordinary skill in the art may make improvements and refinements without departing from the principle of the invention, and these should also be regarded as falling within its scope of protection. Any component not specified in this embodiment can be implemented with existing technology.

Claims (9)

1. An ensemble-learning-based method for predicting zinc-binding sites in proteins under imbalanced data, characterized by comprising the following steps:
Step 1: preprocess the protein source data according to the characteristics of zinc-binding sites;
Step 2: balance the imbalanced zinc-binding site data by random down-sampling to obtain several balanced sub-datasets;
Step 3: on each balanced sub-dataset, select discriminative biochemical features of the protein, encode them, and assemble feature vectors;
Step 4: feed the feature vectors to the base classifier, a support vector machine (SVM), and compute sample weights; then build a probabilistic neural network (PNN) model based on sample weighting; finally integrate the SVM base classifier with the sample-weighted PNN model to obtain the prediction model;
Step 5: use the prediction model obtained in step 4 to identify the zinc-binding sites in the target samples.
2. The ensemble-learning-based method for predicting zinc-binding sites in proteins under imbalanced data according to claim 1, characterized in that in step 1 the preprocessing removes the following noisy data:
(1) peptide chain structures with homology higher than 70% are removed;
(2) duplicated chains, short protein chains, and erroneous or unreliable data are discarded;
(3) chains meeting the criterion of sequence redundancy below 20% are removed.
3. The ensemble-learning-based method for predicting zinc-binding sites in proteins under imbalanced data according to claim 1, characterized in that in step 2 the balancing applies random down-sampling to the majority-class samples, each time drawing as many samples as there are minority-class samples, to form several balanced sub-datasets; the majority-class samples are the unbound protein sites and the minority-class samples are the zinc-bound protein sites.
4. The ensemble-learning-based method for predicting zinc-binding sites in proteins under imbalanced data according to claim 1, characterized in that in step 3 the discriminative biochemical features are the position-specific scoring matrix (PSSM), the conservation score, and RW-GRMTP; the PSSM is normalized and processed with a histogram and a sliding window to give a 20-dimensional vector; the 20-dimensional conservation score is reduced to a single value; RW-GRMTP is normalized to give a 2-dimensional vector; together these form a 23-dimensional feature vector.
5. The ensemble-learning-based method for predicting zinc-binding sites in proteins under imbalanced data according to claim 1, characterized in that in step 4 an SVM base classifier is trained on each balanced sub-dataset, and the prediction error rate e_j and the importance weight α_j of the classification model are computed according to formulas (1) and (2).
Here the full dataset is D = {(x_1, y_1), (x_2, y_2), …, (x_n, y_n)}, where x_i ∈ X, X is the instance space of the classification problem, y_i ∈ {1, -1}, i = 1, 2, …, n, and n is the number of samples; w_mi is the sample weight (m = 1, 2), initialized to 1/n, i.e. w_1 = (w_11, w_12, …, w_1n) with w_1i = 1/n for i = 1, 2, …, n; the SVM base classifier is trained on each of the k balanced sub-datasets, giving k classification results C_svm_j(x), j = 1, …, k.
6. The ensemble-learning-based method for predicting zinc-binding sites in proteins under imbalanced data according to claim 5, characterized in that in step 4 the current sample weights are computed and normalized: if a sample is classified correctly its weight is decreased, and if it is misclassified its weight is increased, as given by formula (3):
7. The ensemble-learning-based method for predicting zinc-binding sites in proteins under imbalanced data according to claim 6, characterized in that in step 4 the probabilistic neural network model based on sample weighting is built by weighting the protein feature data and using the weighted sample data as the input of a probabilistic neural network, which then makes the prediction; this method is denoted SWPNN and its prediction result is SWPNN(x).
8. The ensemble-learning-based method for predicting zinc-binding sites in proteins under imbalanced data according to claim 6, characterized in that in step 4 the SVM base classifier and the sample-weighted probabilistic neural network model are integrated to obtain the prediction model SSWPNN = {SVM, SWPNN, kernelopt, spread, f}, where kernelopt and spread are the parameters of the SVM and SWPNN classifiers respectively and f is defined by formula (4); a corresponding weight β_j is also computed from the error rate.
Here δ is a threshold, and C_svm_j(x) and SWPNN(x) are the classification outputs of the SVM and SWPNN classifiers: a value greater than 0 predicts the positive class and a value less than 0 predicts the negative class. If the SVM output is positive but below the threshold δ and SWPNN predicts the negative class, the integrated prediction is negative; in all other cases the SVM result is the final decision.
9. The ensemble-learning-based method for predicting zinc-binding sites in proteins under imbalanced data according to claim 8, characterized in that in step 5 the integrated SSWPNN models are each applied to the entire dataset, giving different classification results, which are then combined by weighted integration to identify the zinc-binding sites in the target samples, as shown in formula (5):
CN201811353819.0A 2018-11-14 2018-11-14 Zinc binding protein action site prediction method Active CN109326329B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811353819.0A CN109326329B (en) 2018-11-14 2018-11-14 Zinc binding protein action site prediction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811353819.0A CN109326329B (en) 2018-11-14 2018-11-14 Zinc binding protein action site prediction method

Publications (2)

Publication Number Publication Date
CN109326329A true CN109326329A (en) 2019-02-12
CN109326329B CN109326329B (en) 2020-07-07

Family

ID=65257207

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811353819.0A Active CN109326329B (en) 2018-11-14 2018-11-14 Zinc binding protein action site prediction method

Country Status (1)

Country Link
CN (1) CN109326329B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109979525A (en) * 2019-02-28 2019-07-05 天津大学 Improved hormonebinding protein qualitative classification method
CN110689920A (en) * 2019-09-18 2020-01-14 上海交通大学 Protein-ligand binding site prediction algorithm based on deep learning
CN111916148A (en) * 2020-08-13 2020-11-10 中国计量大学 Method for predicting protein interaction

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104992079A (en) * 2015-06-29 2015-10-21 南京理工大学 Sampling learning based protein-ligand binding site prediction method
CN106250718A (en) * 2016-07-29 2016-12-21 於铉 N based on individually balanced Boosting algorithm1methylate adenosine site estimation method
CN107194207A (en) * 2017-06-26 2017-09-22 南京理工大学 Protein ligands binding site estimation method based on granularity support vector machine ensembles
CN107273714A (en) * 2017-06-07 2017-10-20 南京理工大学 The ATP binding site estimation methods of conjugated protein sequence and structural information

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104992079A (en) * 2015-06-29 2015-10-21 南京理工大学 Sampling learning based protein-ligand binding site prediction method
CN106250718A (en) * 2016-07-29 2016-12-21 於铉 N based on individually balanced Boosting algorithm1methylate adenosine site estimation method
CN106250718B (en) * 2016-07-29 2018-03-02 於铉 N based on individually balanced Boosting algorithms1Methylate adenosine site estimation method
CN107273714A (en) * 2017-06-07 2017-10-20 南京理工大学 The ATP binding site estimation methods of conjugated protein sequence and structural information
CN107194207A (en) * 2017-06-26 2017-09-22 南京理工大学 Protein ligands binding site estimation method based on granularity support vector machine ensembles

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
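朱非易 (Zhu Feiyi): "基于不平衡学习的蛋白质_维生素绑定位点预测研究" [Research on protein-vitamin binding site prediction based on imbalanced learning], 《中国优秀硕士学位论文全文数据库 基础科学辑》 [China Master's Theses Full-text Database, Basic Sciences] *
马军伟 (Ma Junwei): "基于机器学习方法的蛋白质亚细胞定位预测研究" [Research on protein subcellular localization prediction based on machine learning methods], 《中国博士学位论文全文数据库 基础科学辑》 [China Doctoral Dissertations Full-text Database, Basic Sciences] *
魏志森 (Wei Zhisen): "蛋白质相互作用位点预测方法研究" [Research on methods for predicting protein interaction sites], 《中国博士学位论文全文数据库 基础科学辑》 [China Doctoral Dissertations Full-text Database, Basic Sciences] *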

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109979525A (en) * 2019-02-28 2019-07-05 天津大学 Improved hormonebinding protein qualitative classification method
CN110689920A (en) * 2019-09-18 2020-01-14 上海交通大学 Protein-ligand binding site prediction algorithm based on deep learning
CN110689920B (en) * 2019-09-18 2022-02-11 上海交通大学 Protein-ligand binding site prediction method based on deep learning
CN111916148A (en) * 2020-08-13 2020-11-10 中国计量大学 Method for predicting protein interaction
CN111916148B (en) * 2020-08-13 2023-01-31 中国计量大学 Method for predicting protein interaction

Also Published As

Publication number Publication date
CN109326329B (en) 2020-07-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant