CN109872781A - Drug target recognition methods based on Xgboost - Google Patents
Drug target recognition method based on Xgboost
- Publication number
- CN109872781A (application number CN201910141417.2A)
- Authority
- CN
- China
- Prior art keywords
- xgboost
- drug
- drug target
- amino acid
- tree
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Investigating Or Analysing Biological Materials (AREA)
Abstract
The present invention provides a drug target recognition method based on Xgboost and belongs to the field of drug target identification. The specific steps of the method are: composition analysis: calculate the average percentage of each of the 20 amino acids in drug targets and in non-drug targets; dissociation constant: divide the 20 amino acids into 6 small groups according to their respective hydrophilicity; PEST region: identify potential PEST protein regions in the amino acid sequence with the Epestfind program; extract the 3 kinds of features of drug targets according to steps 1, 2 and 3; apply the Xgboost algorithm to the features extracted in step 4 to identify drug targets. The method can identify potential drug targets quickly, efficiently and at low cost. Discovering potential drug targets not only advances research on disease mechanisms and drugs, but also provides guidance on the potential side effects of drugs and on drug commercialization.
Description
Technical field
The present invention relates to a drug target recognition method based on Xgboost and belongs to the field of drug target identification.
Background technique
The binding site between a drug and a biological macromolecule is the drug target. Drug targets include receptors, enzymes, ion channels, transport proteins, the immune system, genes and so on. More than 50% of existing drugs act on receptors, which makes receptors the main and most important class of targets. Because drug target research is the source of modern drug research, it can provide important information for the prevention and treatment of major diseases, and new drug development based on new targets has great social and economic benefit. Drug targets have therefore become a hot topic in the medical field.
Most protein drug targets are G protein-coupled receptors (GPCRs) (23%) and enzymes (50%). Some researchers predict that there are more than 2000 protein drug targets, yet only a few hundred have been reported, and the number of clinically validated drug targets is smaller still. Part of the reason is that, as redundant data accumulate, simple analysis methods can no longer meet the needs of large-scale, high-throughput data analysis. Experimental methods, in turn, are difficult to apply widely because of their limits in throughput, precision and cost. As a fast and low-cost way of handling large amounts of data, machine-learning-based drug target prediction has attracted more and more attention.
Huang Chen et al. combined the primary sequence, secondary structure and subcellular localization of proteins and used an SVM to predict potential drug targets among ion channels. Hopkins A L et al. analyzed known drug targets on the basis of sequence homology and domain structure and applied the result to the search for new targets. Kinnings S L et al. used the 3D structure of proteins to study the regions that can bind drug compounds. Campillos M predicted potential drug targets from the similarity of side effects. Zheng et al. found that drug binding sites consistently show certain structural and physicochemical properties. In addition, Kleywegt G used the percentage of hydrophobic amino acids to predict drug targets. Tala M. Bakheet and Andrew J. Doig analyzed 9 properties of drug targets; they not only used these 9 properties to find differences between drug targets and non-drug targets, but also identified drug targets using an SVM.
Although researchers have made great progress in identifying drug targets, recognizing targets among huge and complex amino acid sequences requires an algorithm with both high computational efficiency and high recognition accuracy. Chen T proposed a new method called extreme gradient boosting (Xgboost) in 2014. It improves the boosting algorithm: its multi-threaded parallelism and regularization term not only improve accuracy but also shorten running time. Xgboost is therefore a suitable algorithm for the drug target identification problem.
Summary of the invention
The purpose of the present invention is to solve the above problems of the prior art by providing a drug target recognition method based on Xgboost.
The purpose of the present invention is achieved through the following technical solution:
A drug target recognition method based on Xgboost, the specific steps of which are:
Step 1: composition analysis: calculate the average percentage of each of the 20 amino acids in drug targets and in non-drug targets;
Step 2: dissociation constant: divide the 20 amino acids into 6 small groups according to their respective hydrophilicity;
Step 3: PEST region: identify potential PEST protein regions in the amino acid sequence with the Epestfind program;
Step 4: extract the 3 kinds of features of drug targets according to steps 1, 2 and 3;
Step 5: apply the Xgboost algorithm to the features extracted in step 4 to identify drug targets.
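In the drug target recognition method based on Xgboost of the present invention, the Xgboost algorithm is specifically as follows:
The objective function consists of a loss function and a regularization term:
$$\mathrm{Obj}(\Theta) = L(\Theta) + \Omega(\Theta)$$
where $L(\Theta)$ is the loss function and $\Omega(\Theta)$ is the regularization term;
The model built from T trees is:
$$\hat{y}_i = \sum_{k=1}^{T} f_k(x_i)$$
The base classifier of Xgboost is CART, and the objective function can be written as:
$$\mathrm{Obj} = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{T} \Omega(f_k)$$
The goal is to obtain the parameters $f_k$ of each tree; the t-th tree is trained on the basis of the previous (t-1) trees:
$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i)$$
Therefore, the t-th objective function is:
$$\mathrm{Obj}^{(t)} = \sum_{i=1}^{n} l\big(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\big) + \Omega(f_t) + \mathrm{const}$$
A second-order Taylor expansion of the loss function gives:
$$\mathrm{Obj}^{(t)} \approx \sum_{i=1}^{n} \Big[ l\big(y_i, \hat{y}_i^{(t-1)}\big) + g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^2(x_i) \Big] + \Omega(f_t) + \mathrm{const}$$
where $g_i$ and $h_i$ are the first and second derivatives of $l\big(y_i, \hat{y}_i^{(t-1)}\big)$ with respect to $\hat{y}_i^{(t-1)}$;
The decision tree is defined as:
$$f_t(x) = w_{q(x)}, \quad w \in \mathbb{R}^M, \quad q: \mathbb{R}^d \to \{1, 2, \dots, M\};$$
w records the score of each leaf node, and q is a function that determines which leaf node each input sample finally falls on;
In Xgboost, the regularization term is defined as:
$$\Omega(f_t) = \gamma M + \tfrac{1}{2} \lambda \sum_{j=1}^{M} w_j^2$$
λ and γ are the parameters that control model complexity;
So the objective function of the t-th tree, with constant terms dropped, is:
$$\mathrm{Obj}^{(t)} \approx \sum_{i=1}^{n} \Big[ g_i w_{q(x_i)} + \tfrac{1}{2} h_i w_{q(x_i)}^2 \Big] + \gamma M + \tfrac{1}{2} \lambda \sum_{j=1}^{M} w_j^2$$
Defining $G_j = \sum_{i \in I_j} g_i$ and $H_j = \sum_{i \in I_j} h_i$, where $I_j = \{ i \mid q(x_i) = j \}$ is the set of samples on leaf j, gives:
$$\mathrm{Obj}^{(t)} = \sum_{j=1}^{M} \Big[ G_j w_j + \tfrac{1}{2} (H_j + \lambda) w_j^2 \Big] + \gamma M$$
Here the $w_j$ are independent of one another, so the optimal score of the j-th leaf node and the optimal objective are:
$$w_j^{*} = -\frac{G_j}{H_j + \lambda}, \qquad \mathrm{Obj}^{*} = -\frac{1}{2} \sum_{j=1}^{M} \frac{G_j^2}{H_j + \lambda} + \gamma M$$
Finally, the tree is split according to certain rules;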
The drug target recognition method based on Xgboost of the present invention can identify potential drug targets quickly, efficiently and at low cost. Discovering potential drug targets not only advances research on disease mechanisms and drugs, but also provides guidance on the potential side effects of drugs and on drug commercialization.
Brief description of the drawings
Fig. 1 is the feature extraction block diagram of the present invention.
Fig. 2 shows the amino acid composition of drug targets and non-drug targets.
Fig. 3 shows the accuracy curve.
Specific embodiment
The present invention is described in further detail below with reference to the drawings. The embodiments are implemented on the premise of the technical solution of the present invention and give detailed implementations, but the protection scope of the present invention is not limited to the following embodiments.
Embodiment one: as shown in Figs. 1-2, the drug target recognition method based on Xgboost of this embodiment has the following specific steps:
Step 1: composition analysis: calculate the average percentage of each of the 20 amino acids in drug targets and in non-drug targets;
Step 2: dissociation constant: divide the 20 amino acids into 6 small groups according to their respective hydrophilicity;
Step 3: PEST region: identify potential PEST protein regions in the amino acid sequence with the Epestfind program;
Step 4: extract the 3 kinds of features of drug targets according to steps 1, 2 and 3;
Step 5: apply the Xgboost algorithm to the features extracted in step 4 to identify drug targets.
Composition analysis: because the composition of real drug targets is different from that of non-drug targets, the frequencies of the 20 amino acids may differ widely between the two classes. To find the differences between drug targets and non-drug targets, the average amino acid composition is plotted, as shown in Fig. 2. The average percentage of each amino acid is therefore calculated for drug targets and for non-drug targets.
The average amino acid composition was calculated for the 2596 drug targets and for the non-drug targets. As can be seen, 'L' is the most abundant amino acid in drug targets, and 'G', 'A', 'V', 'E' and 'S' also have high compositions.
In short, there are significant differences between the composition of drug targets and that of non-drug targets, so the composition is used as a feature for identifying drug targets.
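The composition feature described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the invention: the names `AMINO_ACIDS`, `composition` and `mean_composition` are hypothetical, and skipping non-standard letters is a design choice the text does not specify.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Return the percentage of each of the 20 amino acids in one sequence."""
    counts = Counter(sequence.upper())
    # Only standard residues contribute to the total; other letters are ignored.
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [100.0 * counts[aa] / total for aa in AMINO_ACIDS]


def mean_composition(sequences):
    """Average the per-sequence percentages over a set of sequences."""
    rows = [composition(seq) for seq in sequences]
    return [sum(column) / len(rows) for column in zip(*rows)]
```

Calling `mean_composition` once on the drug targets and once on the non-drug targets would give the two 20-value profiles that the composition plot compares.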
Dissociation constant: the pattern of hydrophobic and hydrophilic residues is very important in determining protein structure. Because the hydrophilicity of amino acids spans a wide range, the amino acids can be divided into small groups according to their respective hydrophilicity, and the group composition is expected to differ considerably between drug targets and non-drug targets. Table 1 shows the six groups of the 20 amino acids.
Table 1. Amino acids divided into 6 classes
Each drug target sequence can therefore be mapped onto these 6 groups, giving 6 dimensions, each of which is the average composition of one of the six groups.
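A hedged sketch of the 6-group feature follows. Table 1 is not reproduced in this text, so the partition `HYPOTHETICAL_GROUPS` below is a placeholder chosen only to make the example run; it is not the grouping of the invention and should be replaced by the real Table 1. The function name `group_composition` is likewise an assumption.

```python
# PLACEHOLDER partition of the 20 amino acids into 6 groups.
# This is NOT Table 1 of the patent; substitute the real grouping before use.
HYPOTHETICAL_GROUPS = ["AVLIMC", "FWYH", "STNQ", "KR", "DE", "GP"]


def group_composition(sequence, groups=HYPOTHETICAL_GROUPS):
    """Return the percentage of residues falling into each group."""
    # Map every residue letter to the index of its group.
    group_of = {aa: k for k, members in enumerate(groups) for aa in members}
    counts = [0] * len(groups)
    for aa in sequence.upper():
        if aa in group_of:  # letters outside the groups are ignored
            counts[group_of[aa]] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("sequence contains no grouped amino acids")
    return [100.0 * c / total for c in counts]
```

With the placeholder grouping, a sequence such as `"AKDG"` contributes one residue each to four of the six groups, so those four entries are 25.0 and the other two are 0.0.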
PEST region: in 1986, Rechsteiner M and Rogers S W hypothesized that the amino acids 'P', 'E', 'S' and 'T' can act as a proteolysis signal. A growing number of reports now confirm that sequences containing a PEST region lead to rapid degradation of the protein. The Epestfind program can be used to identify all poor and potential PEST sequences in a protein. Only potential PEST regions are used as a feature for identifying drug targets: the number of potential PEST regions in each sequence is counted.
Thus 3 kinds of features are extracted, giving a 27-dimensional vector (20 amino acid percentages, 6 group compositions and 1 PEST count) for distinguishing drug targets from non-drug targets.
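The three feature types add up to 20 + 6 + 1 = 27 dimensions. The sketch below shows one way to assemble such a vector; it assumes the PEST count has already been obtained from an Epestfind run (parsing Epestfind output is deliberately left out), and the function name `feature_vector` and the ordering of the parts are assumptions.

```python
def feature_vector(aa_percent, group_percent, pest_count):
    """Concatenate the three feature types into one 27-dimensional vector.

    aa_percent    : 20 amino acid percentages
    group_percent : 6 group compositions
    pest_count    : number of potential PEST regions (from Epestfind)
    """
    if len(aa_percent) != 20:
        raise ValueError("expected 20 amino acid percentages")
    if len(group_percent) != 6:
        raise ValueError("expected 6 group compositions")
    if pest_count < 0:
        raise ValueError("PEST count cannot be negative")
    return list(aa_percent) + list(group_percent) + [float(pest_count)]
```

Stacking one such vector per sequence gives the feature matrix that is passed to the classifier.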
The number of suitable drug targets is still limited. Compared with the unknown drug targets, the known ones are only the tip of the iceberg. Target selection plays a crucial role in the whole drug development process. In modern drug research, establishing a new target is often the precondition and guarantee of new drug innovation. With the development of modern molecular biology techniques and the completion of the Human Genome Project, a large number of novel molecular targets for therapeutic intervention have emerged, but not every target can become an effective disease-related target, so discovering and validating new targets has become very important work. Traditional biological experiments are both costly and inefficient, whereas the Xgboost-based drug target identification method developed in the present invention can identify potential drug targets quickly, efficiently and at low cost. Discovering potential drug targets not only advances research on disease mechanisms and drugs, but also provides guidance on the potential side effects of drugs and on drug commercialization.
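Embodiment two: as shown in Fig. 1, in the drug target recognition method based on Xgboost of this embodiment, the Xgboost algorithm is specifically as follows:
The objective function consists of a loss function and a regularization term:
$$\mathrm{Obj}(\Theta) = L(\Theta) + \Omega(\Theta)$$
where $L(\Theta)$ is the loss function and $\Omega(\Theta)$ is the regularization term;
The model built from T trees is:
$$\hat{y}_i = \sum_{k=1}^{T} f_k(x_i)$$
The base classifier of Xgboost is CART, and the objective function can be written as:
$$\mathrm{Obj} = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{T} \Omega(f_k)$$
The goal is to obtain the parameters $f_k$ of each tree; the t-th tree is trained on the basis of the previous (t-1) trees:
$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i)$$
Therefore, the t-th objective function is:
$$\mathrm{Obj}^{(t)} = \sum_{i=1}^{n} l\big(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\big) + \Omega(f_t) + \mathrm{const}$$
A second-order Taylor expansion of the loss function gives:
$$\mathrm{Obj}^{(t)} \approx \sum_{i=1}^{n} \Big[ l\big(y_i, \hat{y}_i^{(t-1)}\big) + g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^2(x_i) \Big] + \Omega(f_t) + \mathrm{const}$$
where $g_i$ and $h_i$ are the first and second derivatives of $l\big(y_i, \hat{y}_i^{(t-1)}\big)$ with respect to $\hat{y}_i^{(t-1)}$;
The decision tree is defined as:
$$f_t(x) = w_{q(x)}, \quad w \in \mathbb{R}^M, \quad q: \mathbb{R}^d \to \{1, 2, \dots, M\};$$
w records the score of each leaf node, and q is a function that determines which leaf node each input sample finally falls on;
In Xgboost, the regularization term is defined as:
$$\Omega(f_t) = \gamma M + \tfrac{1}{2} \lambda \sum_{j=1}^{M} w_j^2$$
λ and γ are the parameters that control model complexity;
So the objective function of the t-th tree, with constant terms dropped, is:
$$\mathrm{Obj}^{(t)} \approx \sum_{i=1}^{n} \Big[ g_i w_{q(x_i)} + \tfrac{1}{2} h_i w_{q(x_i)}^2 \Big] + \gamma M + \tfrac{1}{2} \lambda \sum_{j=1}^{M} w_j^2$$
Defining $G_j = \sum_{i \in I_j} g_i$ and $H_j = \sum_{i \in I_j} h_i$, where $I_j = \{ i \mid q(x_i) = j \}$ is the set of samples on leaf j, gives:
$$\mathrm{Obj}^{(t)} = \sum_{j=1}^{M} \Big[ G_j w_j + \tfrac{1}{2} (H_j + \lambda) w_j^2 \Big] + \gamma M$$
Here the $w_j$ are independent of one another, so the optimal score of the j-th leaf node and the optimal objective are:
$$w_j^{*} = -\frac{G_j}{H_j + \lambda}, \qquad \mathrm{Obj}^{*} = -\frac{1}{2} \sum_{j=1}^{M} \frac{G_j^2}{H_j + \lambda} + \gamma M$$
Finally, the tree is split according to certain rules;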
Extreme Gradient Boosting (Xgboost) improves on the traditional gradient boosting decision tree (GBDT). The traditional GBDT algorithm uses only first-derivative information of the loss function during optimization, whereas Xgboost performs a second-order Taylor expansion of the loss function and uses both first- and second-derivative information. In addition, with the help of OpenMP, Xgboost automatically uses the multiple CPU cores for parallel computation, which greatly increases running speed. Furthermore, unlike the GBDT algorithm, Xgboost supports sparse matrix input. Xgboost defines its own data matrix, DMatrix, and preprocesses the training set when training starts, which improves the efficiency of each iteration and reduces model training time.
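To make the first- and second-derivative information concrete, the sketch below computes both for one sample. It assumes a binary logistic loss, which is a common choice for two-class problems but is not stated in this text; the function name `logistic_grad_hess` is hypothetical.

```python
import math


def logistic_grad_hess(y_true, raw_score):
    """First and second derivative of the logistic loss w.r.t. the raw score.

    y_true    : label, 0 (non-drug target) or 1 (drug target)
    raw_score : current model output before the sigmoid
    """
    p = 1.0 / (1.0 + math.exp(-raw_score))  # predicted probability
    g = p - y_true                           # first derivative (gradient)
    h = p * (1.0 - p)                        # second derivative (hessian)
    return g, h
```

A first-order method would use only `g`; the second-order expansion also uses `h`, which tells the algorithm how quickly the gradient changes around the current prediction.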
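The process of GBDT is as follows:
An objective function is commonly used to measure the quality of different models. It always consists of two parts: a loss function and a regularization term.
$$\mathrm{Obj}(\Theta) = L(\Theta) + \Omega(\Theta)$$
$L(\Theta)$ is the loss function. If we use only the loss function to assess model quality, the model easily overfits. A regularization term $\Omega(\Theta)$, which represents the complexity of the model, is therefore also considered. The final model should strike a balance between the loss function and the regularization term.
If T trees have been trained, the model can be constructed as follows:
$$\hat{y}_i = \sum_{k=1}^{T} f_k(x_i)$$
The base classifier of both Xgboost and GBDT is CART, so the objective function can be written as:
$$\mathrm{Obj} = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{T} \Omega(f_k)$$
The goal is to obtain the parameters $f_k$ of each tree. We train the t-th tree on the basis of the previous (t-1) trees:
$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i)$$
Therefore, the t-th objective function is:
$$\mathrm{Obj}^{(t)} = \sum_{i=1}^{n} l\big(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\big) + \Omega(f_t) + \mathrm{const}$$
Then a second-order Taylor expansion of the loss function gives:
$$\mathrm{Obj}^{(t)} \approx \sum_{i=1}^{n} \Big[ l\big(y_i, \hat{y}_i^{(t-1)}\big) + g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^2(x_i) \Big] + \Omega(f_t) + \mathrm{const}$$
where $g_i$ and $h_i$ are the first and second derivatives of $l\big(y_i, \hat{y}_i^{(t-1)}\big)$ with respect to $\hat{y}_i^{(t-1)}$.
Next, we need to calculate the regularization term. First, we define the decision tree as:
$$f_t(x) = w_{q(x)}, \quad w \in \mathbb{R}^M, \quad q: \mathbb{R}^d \to \{1, 2, \dots, M\}$$
w records the score of each leaf node. q is a function that determines which leaf node each input sample finally falls on. In Xgboost, we define the regularization term as follows:
$$\Omega(f_t) = \gamma M + \tfrac{1}{2} \lambda \sum_{j=1}^{M} w_j^2$$
λ and γ are the parameters that control model complexity. So the objective function of the t-th tree, with constant terms dropped, is as follows:
$$\mathrm{Obj}^{(t)} \approx \sum_{i=1}^{n} \Big[ g_i w_{q(x_i)} + \tfrac{1}{2} h_i w_{q(x_i)}^2 \Big] + \gamma M + \tfrac{1}{2} \lambda \sum_{j=1}^{M} w_j^2$$
We can define $G_j = \sum_{i \in I_j} g_i$ and $H_j = \sum_{i \in I_j} h_i$, where $I_j = \{ i \mid q(x_i) = j \}$ is the set of samples on leaf j, and then obtain:
$$\mathrm{Obj}^{(t)} = \sum_{j=1}^{M} \Big[ G_j w_j + \tfrac{1}{2} (H_j + \lambda) w_j^2 \Big] + \gamma M$$
Here the $w_j$ are independent of one another, so we can obtain the optimal score of the j-th leaf node and the optimal objective:
$$w_j^{*} = -\frac{G_j}{H_j + \lambda}, \qquad \mathrm{Obj}^{*} = -\frac{1}{2} \sum_{j=1}^{M} \frac{G_j^2}{H_j + \lambda} + \gamma M$$
Finally, we split the tree according to the following rule, where the subscripts L and R denote the left and right child nodes produced by a candidate split:
$$\mathrm{Gain} = \frac{1}{2} \left[ \frac{G_L^2}{H_L + \lambda} + \frac{G_R^2}{H_R + \lambda} - \frac{(G_L + G_R)^2}{H_L + H_R + \lambda} \right] - \gamma$$
It can be seen that if the loss reduction from a split (the bracketed term) is less than γ, the gain is negative and the branch should not be added.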
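The leaf score and the split rule can be checked numerically with a short Python sketch. It assumes the standard Xgboost closed forms, namely an optimal leaf weight of -G/(H + λ) and a split gain of half the sum of the children's G²/(H + λ) minus the parent's, less γ; the function names and the numbers in the usage note are illustrative assumptions, not values from the experiments.

```python
def leaf_weight(G, H, lam):
    """Optimal score of a leaf with gradient sum G and hessian sum H."""
    return -G / (H + lam)


def leaf_quality(G, H, lam):
    """Structure score G^2 / (H + lambda) of a single leaf."""
    return G * G / (H + lam)


def split_gain(G_left, H_left, G_right, H_right, lam, gamma):
    """Gain of splitting one leaf into a left and a right child.

    A positive value means the loss reduction exceeds the penalty gamma
    charged for the extra leaf, so the split is worth keeping.
    """
    children = leaf_quality(G_left, H_left, lam) + leaf_quality(G_right, H_right, lam)
    parent = leaf_quality(G_left + G_right, H_left + H_right, lam)
    return 0.5 * (children - parent) - gamma
```

For example, with `G_left=-4, H_left=3, G_right=2, H_right=1, lam=1, gamma=0.5`, the children score 4 and 2, the parent scores 0.8, and the gain is about 2.1, so the split would be kept; raising `gamma` above 2.6 would make the gain negative and the split would be rejected.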
Embodiment three: as shown in Fig. 3, in the drug target recognition method based on Xgboost of this embodiment, the experimental verification process is as follows: we obtained 2596 real drug targets and generated 2596 pseudo drug targets. To verify the effectiveness of Xgboost in identifying drug targets, we carried out 10-fold cross-validation.
We randomly divided these 5192 sequences into 10 groups. For each group, we selected 519 sequences as the test set and used the remaining 4673 sequences as the training set, so ten experiments were carried out in total. In addition, each sequence is used in both the training set and the test set. The Xgboost parameters were set as listed in Table 2.
Table 2. Parameter settings of Xgboost
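The split sizes quoted in this embodiment (5192 sequences, 519 per test set, 4673 per training set) can be reproduced with the sketch below. It assumes that the two sequences left over after forming ten test sets of 519 simply stay in every training set, which this text does not state; the function name `ten_fold_indices` and the fixed seed are likewise assumptions.

```python
import random


def ten_fold_indices(n_samples=5192, n_folds=10, fold_size=519, seed=0):
    """Yield (train, test) index lists for each of the n_folds experiments."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # random assignment to groups
    for k in range(n_folds):
        test = indices[k * fold_size:(k + 1) * fold_size]
        held_out = set(test)
        # Everything not in the current test set is used for training.
        train = [i for i in indices if i not in held_out]
        yield train, test
```

Under these assumptions each experiment tests 519 sequences and trains on 4673, and the ten test sets together cover 5190 distinct sequences.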
We used four evaluation measures to assess the performance of Xgboost in identifying drug targets. The results of the ten experiments are given in Table 3. A total of 5190 sequences were tested.
Table 3. Results of the 10 experiments
From these results we obtain Accuracy = 99.13%, Precision = 99.04%, Recall = 99.23% and Specificity = 99.04%. In this study, pseudo drug targets are labeled 0 and drug targets are labeled 1. The accuracy curve of the 10 experiments is shown in Fig. 3.
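The four measures follow from the counts of true and false positives and negatives, with drug targets (label 1) as the positive class. The sketch below uses the usual definitions; the function name `classification_metrics` is an assumption, and the counts in the usage note are hypothetical because Table 3 is not reproduced in this text.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall and specificity from confusion counts.

    tp / fn : drug targets predicted as drug / non-drug targets
    tn / fp : non-drug targets predicted as non-drug / drug targets
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }
```

As an illustration only, hypothetical pooled counts of `tp=2575, fp=25, tn=2570, fn=20` over 5190 tested sequences give percentages that round to 99.13, 99.04, 99.23 and 99.04; these counts were chosen to match the reported figures and are not taken from the experiments.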
The above are only preferred embodiments of the present invention. These specific embodiments are different implementations under the general concept of the present invention, and the protection scope of the present invention is not limited to them. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. The protection scope of the present invention shall therefore be determined by the protection scope of the claims.
Claims (2)
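1. A drug target recognition method based on Xgboost, characterized in that the specific steps of the method are:
Step 1: composition analysis: calculate the average percentage of each of the 20 amino acids in drug targets and in non-drug targets;
Step 2: dissociation constant: divide the 20 amino acids into 6 small groups according to their respective hydrophilicity;
Step 3: PEST region: identify potential PEST protein regions in the amino acid sequence with the Epestfind program;
Step 4: extract the 3 kinds of features of drug targets according to steps 1, 2 and 3;
Step 5: apply the Xgboost algorithm to the features extracted in step 4 to identify drug targets.
2. The drug target recognition method based on Xgboost according to claim 1, characterized in that the specific steps of the Xgboost algorithm are:
The objective function consists of a loss function and a regularization term:
$$\mathrm{Obj}(\Theta) = L(\Theta) + \Omega(\Theta)$$
where $L(\Theta)$ is the loss function and $\Omega(\Theta)$ is the regularization term;
The model built from T trees is:
$$\hat{y}_i = \sum_{k=1}^{T} f_k(x_i)$$
The base classifier of Xgboost is CART, and the objective function can be written as:
$$\mathrm{Obj} = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{T} \Omega(f_k)$$
The goal is to obtain the parameters $f_k$ of each tree; the t-th tree is trained on the basis of the previous (t-1) trees:
$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i)$$
Therefore, the t-th objective function is:
$$\mathrm{Obj}^{(t)} = \sum_{i=1}^{n} l\big(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\big) + \Omega(f_t) + \mathrm{const}$$
A second-order Taylor expansion of the loss function gives:
$$\mathrm{Obj}^{(t)} \approx \sum_{i=1}^{n} \Big[ l\big(y_i, \hat{y}_i^{(t-1)}\big) + g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^2(x_i) \Big] + \Omega(f_t) + \mathrm{const}$$
where $g_i$ and $h_i$ are the first and second derivatives of $l\big(y_i, \hat{y}_i^{(t-1)}\big)$ with respect to $\hat{y}_i^{(t-1)}$;
The decision tree is defined as:
$$f_t(x) = w_{q(x)}, \quad w \in \mathbb{R}^M, \quad q: \mathbb{R}^d \to \{1, 2, \dots, M\};$$
w records the score of each leaf node, and q is a function that determines which leaf node each input sample finally falls on;
In Xgboost, the regularization term is defined as:
$$\Omega(f_t) = \gamma M + \tfrac{1}{2} \lambda \sum_{j=1}^{M} w_j^2$$
λ and γ are the parameters that control model complexity;
So the objective function of the t-th tree, with constant terms dropped, is:
$$\mathrm{Obj}^{(t)} \approx \sum_{i=1}^{n} \Big[ g_i w_{q(x_i)} + \tfrac{1}{2} h_i w_{q(x_i)}^2 \Big] + \gamma M + \tfrac{1}{2} \lambda \sum_{j=1}^{M} w_j^2$$
Defining $G_j = \sum_{i \in I_j} g_i$ and $H_j = \sum_{i \in I_j} h_i$, where $I_j = \{ i \mid q(x_i) = j \}$ is the set of samples on leaf j, gives:
$$\mathrm{Obj}^{(t)} = \sum_{j=1}^{M} \Big[ G_j w_j + \tfrac{1}{2} (H_j + \lambda) w_j^2 \Big] + \gamma M$$
Here the $w_j$ are independent of one another, so the optimal score of the j-th leaf node and the optimal objective are:
$$w_j^{*} = -\frac{G_j}{H_j + \lambda}, \qquad \mathrm{Obj}^{*} = -\frac{1}{2} \sum_{j=1}^{M} \frac{G_j^2}{H_j + \lambda} + \gamma M$$
Finally, the tree is split according to certain rules;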
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910141417.2A CN109872781A (en) | 2019-02-26 | 2019-02-26 | Drug target recognition methods based on Xgboost |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910141417.2A CN109872781A (en) | 2019-02-26 | 2019-02-26 | Drug target recognition methods based on Xgboost |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109872781A true CN109872781A (en) | 2019-06-11 |
Family
ID=66919180
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910141417.2A Pending CN109872781A (en) | 2019-02-26 | 2019-02-26 | Drug target recognition methods based on Xgboost |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109872781A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110265085A (en) * | 2019-07-29 | 2019-09-20 | 安徽工业大学 | A kind of protein-protein interaction sites recognition methods |
CN110791543A (en) * | 2019-09-30 | 2020-02-14 | 中国海洋大学 | Method for identifying action target of natural product medicine |
CN111383708A (en) * | 2020-03-11 | 2020-07-07 | 中南大学 | Small molecule target prediction algorithm based on chemical genomics and application thereof |
CN111383708B (en) * | 2020-03-11 | 2023-05-12 | 中南大学 | Small molecular target prediction algorithm based on chemical genomics and application thereof |
CN112837743A (en) * | 2021-02-04 | 2021-05-25 | 东北大学 | Medicine repositioning method based on machine learning |
CN112837743B (en) * | 2021-02-04 | 2024-03-26 | 东北大学 | Drug repositioning method based on machine learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109872781A (en) | Drug target recognition methods based on Xgboost | |
US20210407622A1 (en) | Neural network architectures for linking biological sequence variants based on molecular phenotype, and systems and methods therefor | |
Merkley et al. | Applications and challenges of forensic proteomics | |
Thireou et al. | Bidirectional long short-term memory networks for predicting the subcellular localization of eukaryotic proteins | |
CN112086129B (en) | Method and system for predicting cfDNA of tumor tissue | |
CN105849279A (en) | Methods and systems for identifying disease-induced mutations | |
Webb-Robertson et al. | A support vector machine model for the prediction of proteotypic peptides for accurate mass and time proteomics | |
Gewehr et al. | SSEP-Domain: protein domain prediction by alignment of secondary structure elements and profiles | |
JP6644672B2 (en) | Characterization of biological materials using unassembled sequence information, stochastic methods, and trait-specific database catalogs | |
Webb-Robertson et al. | A support vector machine model for the prediction of proteotypic peptides for accurate mass and time proteomics | |
Liu et al. | MetaDecoder: a novel method for clustering metagenomic contigs | |
CN112837743B (en) | Drug repositioning method based on machine learning | |
CN110400605A (en) | A kind of the ligand bioactivity prediction technique and its application of GPCR drug targets | |
WO2006129401A1 (en) | Screening method for specific protein in proteome comprehensive analysis | |
Jain et al. | Quantitative proteomic analysis of formalin fixed paraffin embedded oral HPV lesions from HIV patients | |
Palviainen et al. | Kidney-derived proteins in urine as biomarkers of induced acute kidney injury in sheep | |
Murphy et al. | Self-supervised learning of cell type specificity from immunohistochemical images | |
Wani et al. | Raw sequence to target gene prediction: An integrated inference pipeline for ChIP-seq and RNA-seq datasets | |
Washburn | The H-Index of ‘an approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database’ | |
CN114822690A (en) | Multi-class multifunctional intelligent classification method applied to whole genome expression profile data | |
Wibowo et al. | XGB5hmC: Identifier based on XGB model for RNA 5-hydroxymethylcytosine detection | |
US7805257B2 (en) | Comparison of molecules using field points | |
Li et al. | Fast and accurate classification of meta-genomics long reads with deSAMBA | |
Bangert et al. | Pattern Recognition for Mass-Spectrometry-Based Proteomics | |
CN112041933A (en) | System and method for interpreting transcript expression levels of RNA sequencing data using locally unique features |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190611 |