CN112185459A - Prediction method for interaction of plant and pathogenic bacteria protein - Google Patents

Prediction method for interaction of plant and pathogenic bacteria protein

Info

Publication number
CN112185459A
Authority
CN
China
Prior art keywords
protein
interaction
data
spatial structure
plant
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011020892.3A
Other languages
Chinese (zh)
Inventor
张利达
郑存俭
刘源
孙方楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University
Priority to CN202011020892.3A
Publication of CN112185459A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B15/00 ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
    • G16B15/30 Drug targeting using structural data; Docking or binding prediction
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B15/00 ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
    • G16B15/20 Protein or domain folding
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/30 Data warehousing; Computing architectures

Landscapes

  • Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Chemical & Material Sciences (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Medicinal Chemistry (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The invention relates to a method for predicting protein interactions between plants and pathogenic bacteria, comprising the following steps: 1) collecting positive host-pathogen protein interaction data; 2) collecting the spatial structures of protein complex templates and analyzing the interaction interfaces of subunit pairs; 3) performing homology structure modeling on host and pathogen protein sequences to obtain protein homology spatial structure models; 4) comparing the protein homology spatial structures with the protein complex template spatial structures to obtain structural features; 5) extracting non-structural features; and 6) building a machine learning model based on the structural and non-structural features, testing and tuning it, and predicting rice-rice blast fungus protein interactions at the genome scale. Compared with the prior art, the invention makes full use of experimentally determined protein structure data together with information such as homology and domain interactions, and can extract interaction features related to plant-pathogen proteins effectively, quickly and simply.

Description

Prediction method for interaction of plant and pathogenic bacteria protein
Technical Field
The invention relates to the technical field of biological data processing, in particular to a prediction method for the interaction between plants and pathogenic bacteria proteins.
Background
Plant-pathogen interaction is a two-way biological communication process. On the one hand, plants attempt to recognize molecules secreted by pathogens in order to avoid infection; on the other hand, pathogens manipulate the plant as far as possible to make the host environment more favorable to themselves. As a result, many known methods for predicting intra-species protein interactions are unsuitable for plant-pathogen systems, and little research has focused on predicting plant-pathogen protein interactions.
Although experimental methods for detecting protein interactions have been developed, they are time-consuming and laborious and have accumulated little data, and most of the available data concern interactions between humans and pathogens (especially viruses). In contrast, protein interaction data for other hosts, especially plants and their pathogens, are very limited.
Protein interactions are most readily explained from the perspective of protein spatial structure. However, protein spatial structures are complex and the number of proteins with known structures is limited, so how to make full use of experimentally determined protein structure data to extract relevant interaction features has become a key problem to be solved in plant-pathogen interaction research.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a method for predicting plant-pathogen protein interactions that uses experimentally determined protein spatial structure data, together with information such as homology and domain interactions, to extract plant-pathogen protein interaction features effectively, quickly and simply.
The purpose of the invention can be realized by the following technical scheme:
A method for predicting the interaction between plant and pathogenic bacteria proteins, comprising the following steps:
S1, collecting positive host-pathogen protein interaction data and the genome data of rice and the rice blast fungus.
The positive host-pathogen protein interaction data are collected from the HPIDB database, and each interaction must be supported by at least one experimental protein interaction detection method, such as yeast two-hybrid.
The genome data of rice are downloaded from the MSU database and transposon genes are deleted. The genome data of the rice blast fungus are downloaded from the Ensembl Genomes database; transmembrane helix prediction is performed on the TMHMM website, and proteins with more than 0 predicted transmembrane helices are selected; signal peptide prediction is performed on the SignalP website and subcellular localization prediction on the WoLF PSORT website, and proteins that have a signal peptide and are localized extracellularly are regarded as secreted proteins of the rice blast fungus. After duplicate proteins across these steps are removed, the rice blast fungus proteins that potentially interact with rice proteins are obtained.
S2, collecting the spatial structures of protein complex templates, and splitting each protein complex into its subunits to obtain the interaction interfaces of subunit pairs.
Experimentally determined three-dimensional protein structure data are obtained from the PDB protein structure database; the structures must have been determined by at least one of nuclear magnetic resonance, X-ray crystal diffraction or electron microscopy. Each protein complex is then split into its subunits, and PIBASE software is used to read the structural data of the subunit pairs and extract the interaction interface information.
S3, using the spatial structures of the protein complex templates from step S2 as templates, performing homology structure modeling on the host and pathogen protein sequences with MODPIPE to obtain protein homology spatial structure models.
S4, comparing the protein homology spatial structures with the protein complex template spatial structures to obtain structural features.
Further, the comparison is performed with TM-align software. The structural features comprise the similarity and the structural deviation between the protein homology spatial structure and the protein complex, and the number and proportion of conserved residues at the interaction interface between the protein homology spatial structure and the protein complex template spatial structure.
S5, collecting protein interaction data of model organisms to obtain a model organism positive interaction data set, and extracting non-structural features.
The cross-species conservation of plant-pathogen protein interactions is analyzed by homology mapping to obtain the protein homology mapping relationship, and a domain interaction data set is used to obtain the interacting protein pairs supported by interacting domains, namely the domain interaction relationship.
S6, building a machine learning model based on the structural and non-structural features, testing and tuning it, and predicting rice-rice blast fungus protein interactions at the genome scale.
Specifically, sequence clustering and random combination are applied to the positive host-pathogen protein interaction data set obtained in step S1 to generate a certain number of negative samples; the positive and negative data sets are divided into a training set and a test set in a certain proportion; an initial machine learning model is built with the scikit-learn random forest from the structural and non-structural features of the training set; the parameters of the initial model are tested and tuned in batches with a grid search function; the optimized model is used to predict all pairwise rice-rice blast fungus protein pairs that could interact at the genome scale; and a rice-rice blast fungus protein interaction network is drawn with Cytoscape software from the prediction results.
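The negative-set generation by random combination can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the sequence clustering step is omitted, the function name `sample_negatives`, the seed and the toy protein identifiers are all hypothetical, and it simply assumes that a randomly combined host-pathogen pair absent from the positive set may be treated as a negative.

```python
import random


def sample_negatives(positives, n_negatives, seed=0):
    """Randomly combine host and pathogen proteins into pairs that are
    absent from the positive set (assumed here to be non-interacting)."""
    rng = random.Random(seed)
    hosts = sorted({host for host, _ in positives})
    pathogens = sorted({pathogen for _, pathogen in positives})
    positive_set = set(positives)
    # Every host x pathogen combination that is not a known positive.
    candidates = [
        (h, p) for h in hosts for p in pathogens if (h, p) not in positive_set
    ]
    return rng.sample(candidates, min(n_negatives, len(candidates)))


# Toy positive set with hypothetical identifiers.
positives = [("H1", "P1"), ("H2", "P2"), ("H3", "P3")]
negatives = sample_negatives(positives, n_negatives=3, seed=10)

# Labeled data set: 1 for positive pairs, 0 for sampled negatives.
labeled = [(pair, 1) for pair in positives] + [(pair, 0) for pair in negatives]
```

The labeled list can then be divided into training and test sets in whatever proportion is chosen; the source does not state the proportion or the number of negatives.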
Compared with the prior art, the method is based on the existing biological data, and can effectively, quickly and simply extract the plant-pathogenic bacteria protein related interaction characteristic information by means of the determined protein space structure data and the information of homology, structural domain interaction and the like, so as to obtain the plant-pathogenic bacteria protein interaction data and provide reference for the research of plant disease-resistant molecular mechanisms.
Drawings
FIG. 1 is a schematic flow chart of a method for predicting plant-pathogen protein interaction in the examples;
FIG. 2 is a rice-blast protein interaction network at the genomic scale in the examples.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, shall fall within the scope of protection of the present invention.
Examples
Computer prediction of protein interactions requires extraction of valuable features from large amounts of data using methods such as statistics, machine learning, and the like. With the exponential growth of biological data, the machine learning method can be applied to the analysis of biological data through improvement. The invention provides a prediction method of plant and pathogenic bacteria protein interaction based on a protein space structure, and a high-accuracy plant-pathogenic bacteria protein interaction network is constructed on a genome scale according to the prediction method.
Specifically, as shown in fig. 1, the invention relates to a prediction method of the protein interaction between plants and pathogenic bacteria, which comprises the following steps:
Step one, collecting positive host-pathogen protein interaction data and genome data of rice and the rice blast fungus
A positive host-pathogen protein interaction data set was collected from the HPIDB database (Host-Pathogen Interaction Database). Each interaction must have been obtained by at least one experimental protein interaction detection method, such as yeast two-hybrid.
When rice is infected by the rice blast fungus, the fungal membrane proteins and secreted proteins are the most likely to interact with rice proteins inside the rice plant. Building on the HPIDB data, the invention obtains the rice blast fungus proteins that potentially interact with rice proteins. Specifically: the genome data of rice were downloaded from the MSU database, and transposon genes were deleted. The genome data of the rice blast fungus were downloaded from the Ensembl Genomes database; transmembrane helix prediction was performed on the TMHMM website, and proteins with more than 0 predicted transmembrane helices were regarded as membrane proteins, 2317 in total; signal peptide prediction was performed on the SignalP website and subcellular localization prediction on the WoLF PSORT website, and proteins that have a signal peptide and are localized extracellularly were regarded as secreted proteins of the rice blast fungus, 1402 in total; after duplicates were removed, 3491 rice blast fungus proteins with potential interaction with rice proteins were obtained.
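The screening logic of this step (membrane proteins plus secreted proteins, with duplicates removed) can be sketched in Python as follows. This is only an illustration under stated assumptions: the predictions are assumed to have been parsed already into plain dictionaries, the function name `screen_candidates` and the protein identifiers are hypothetical, and the label "extr" is assumed to be the localization code used for extracellular proteins.

```python
def screen_candidates(tm_helix_counts, has_signal_peptide, localization):
    """Return (membrane, secreted, candidates) sets of protein IDs."""
    # Membrane proteins: more than 0 predicted transmembrane helices.
    membrane = {pid for pid, n in tm_helix_counts.items() if n > 0}
    # Secreted proteins: signal peptide present AND extracellular localization.
    secreted = {
        pid
        for pid, flag in has_signal_peptide.items()
        if flag and localization.get(pid) == "extr"
    }
    # The set union drops proteins that occur in both groups.
    return membrane, secreted, membrane | secreted


# Toy predictions for four hypothetical proteins.
tm = {"P1": 2, "P2": 0, "P3": 1, "P4": 0}
sp = {"P1": False, "P2": True, "P3": True, "P4": True}
loc = {"P1": "plas", "P2": "extr", "P3": "extr", "P4": "nucl"}

membrane, secreted, candidates = screen_candidates(tm, sp, loc)
```

In the toy data, P3 is both a membrane protein and a secreted protein, so it is counted once in the candidate set.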
Step two, collecting the spatial structure of the protein complex template and analyzing the subunit interaction interface
Experimentally determined three-dimensional protein structure data were downloaded from the PDB protein structure database; the structures must have been determined by at least one of nuclear magnetic resonance, X-ray crystal diffraction or electron microscopy. Analysis of the complex subunit interaction interfaces means dividing each PDB protein complex into pairwise interacting protein subunit pairs: the protein complex is split into its subunits, and PIBASE software is used to read the structural data of the subunit pairs and extract the interaction interface information.
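The idea of splitting a complex into subunit pairs and extracting their interface can be illustrated with a small, self-contained Python sketch. It does not reproduce PIBASE: it assumes a simple distance criterion in which two residues belong to the interface when any of their atoms lie within a cutoff, and the cutoff of 6.0 Å, the function names and the toy coordinates are hypothetical choices made for illustration.

```python
from itertools import combinations
from math import dist


def interface_residues(chain_a, chain_b, cutoff=6.0):
    """chain_*: dict mapping residue ID -> list of (x, y, z) atom coordinates.
    Returns the residue IDs of each chain that lie within the cutoff."""
    iface_a, iface_b = set(), set()
    for res_a, atoms_a in chain_a.items():
        for res_b, atoms_b in chain_b.items():
            if any(dist(p, q) <= cutoff for p in atoms_a for q in atoms_b):
                iface_a.add(res_a)
                iface_b.add(res_b)
    return iface_a, iface_b


def subunit_pair_interfaces(complex_chains, cutoff=6.0):
    """Split a complex into all chain pairs and keep those sharing an interface."""
    result = {}
    for id_a, id_b in combinations(sorted(complex_chains), 2):
        ia, ib = interface_residues(complex_chains[id_a], complex_chains[id_b], cutoff)
        if ia:
            result[(id_a, id_b)] = (ia, ib)
    return result


# Toy complex with three chains; only A and B are in contact.
toy_complex = {
    "A": {1: [(0.0, 0.0, 0.0)], 2: [(20.0, 0.0, 0.0)]},
    "B": {1: [(3.0, 0.0, 0.0)]},
    "C": {1: [(100.0, 0.0, 0.0)]},
}
interfaces = subunit_pair_interfaces(toy_complex)
```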
Step three, protein homologous structure modeling
Using the experimentally determined three-dimensional protein structure data from step two as templates, homology structure modeling was performed on the host and pathogen protein sequences with MODPIPE software to obtain spatial structure models of the host and pathogen proteins.
Taking the host-pathogen protein interaction data set from step one as an example, the protein sequences were downloaded from the UniProt database and subjected to homology modeling, using sequence-sequence comparison, profile-sequence comparison and profile-profile comparison as alignment methods. The quality of each homology model was scored with MPQS, a composite score comprising sequence similarity, template coverage and three independent evaluation scores: e-value, Z-DOPE and GA341. The e-value measures the significance of the alignment between the modeled protein and the template; Z-DOPE is based on the discrete optimized protein energy (DOPE), an atomic distance-dependent statistical potential derived from a sample of native structures on the basis of probability theory and independent of any adjustable parameters; GA341 is a statistics-based model reliability score. By inspecting the probability distribution of the scores, the threshold was set at MPQS ≥ 0.5, and models meeting it were regarded as reliable homology structure models.
The homology structure models were also scored by sequence length, and models too short to judge whether an interaction interface exists were filtered out. The score is MODSEQ-score = L-MOD / L-SEQ, where L-MOD is the length of the homology-modeled sequence and L-SEQ is the length of the corresponding gene sequence. Considering the probability density distribution of MODSEQ-score and balancing data quantity against data quality, the threshold was set to 30%, giving homology modeling results for 14628 proteins in total.
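The two quality filters described for this step, the MPQS threshold of 0.5 and the length-ratio threshold of 30%, can be written as a short Python sketch. The function names are hypothetical, and treating the 30% threshold as inclusive is an assumption, since the text does not say whether the boundary value is kept.

```python
def modseq_score(model_length, sequence_length):
    """Length ratio L-MOD / L-SEQ of the modeled region to the full sequence."""
    return model_length / sequence_length


def keep_model(mpqs, model_length, sequence_length, mpqs_min=0.5, modseq_min=0.30):
    """Keep a homology model only if it passes both quality thresholds."""
    return (
        mpqs >= mpqs_min
        and modseq_score(model_length, sequence_length) >= modseq_min
    )


# Toy examples: (MPQS, modeled length, full sequence length).
good = keep_model(0.8, 150, 400)        # ratio 0.375, passes both filters
too_short = keep_model(0.8, 100, 400)   # ratio 0.25, fails the length filter
low_quality = keep_model(0.4, 300, 400) # MPQS below 0.5
```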
Step four, superimposing and comparing the protein homology structure models with the complex template structures to obtain the structural features
The homology structure models of the host and pathogen proteins were compared with the complex templates in spatial structure using TM-align software. Taking the host-pathogen protein interaction data set from step one as an example, with the TM-score required to be greater than 0.4, structure comparison results between 10148 positive homologous templates and the complex subunits were finally obtained. The RMSD value, the TM-score value, the number of conserved residues at the interaction interface and the proportion of conserved residues between the protein homology model and the complex template were calculated as the structural features. Calculating these four values from the structure comparison results is prior art and is not described in detail here.
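How the four structural features might be assembled from a structure comparison result is sketched below in Python. The source treats the calculation as prior art and does not define "conserved", so this sketch assumes that a template interface residue is conserved when the model residue aligned to it has the same amino acid; the function names and toy alignment are hypothetical, and the RMSD and TM-score are taken as given outputs of the alignment tool.

```python
def interface_conservation(alignment, template_interface):
    """alignment: dict template residue ID -> (template_aa, model_aa),
    with model_aa set to None where the model has a gap.
    Returns (number, proportion) of conserved interface residues."""
    conserved = sum(
        1
        for res in template_interface
        if res in alignment and alignment[res][0] == alignment[res][1]
    )
    total = len(template_interface)
    return conserved, (conserved / total if total else 0.0)


def structural_features(rmsd, tm_score, alignment, template_interface, tm_min=0.4):
    """Return [RMSD, TM-score, n_conserved, proportion], or None if TM-score <= tm_min."""
    if tm_score <= tm_min:
        return None
    n_conserved, proportion = interface_conservation(alignment, template_interface)
    return [rmsd, tm_score, n_conserved, proportion]


# Toy alignment of four template interface residues to a model.
alignment = {10: ("L", "L"), 11: ("K", "R"), 12: ("D", "D"), 13: ("G", None)}
interface = {10, 11, 12, 13}

features = structural_features(1.8, 0.62, alignment, interface)
rejected = structural_features(1.8, 0.35, alignment, interface)
```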
Step five, analyzing and extracting non-structural features
Protein interaction data for 7 model organisms (Arabidopsis, mouse, nematode, human, Escherichia coli, yeast and Drosophila) were collected from five public databases, BioGRID, IntAct, DIP, BIND and MINT, to obtain a model organism positive interaction data set.
InParanoid software and BLAST software were used to analyze the orthologous relationships between the rice and rice blast fungus proteins obtained in step one and the proteomes of the 7 model organisms, giving the first non-structural feature: the homology mapping relationship. From the InParanoid analysis results, combined with the model organism positive data set, 5720 pairs of rice-rice blast fungus protein interaction relationships supported by homology mapping were obtained. For the BLAST results, three parameters (e-value, sequence identity and sequence coverage) were adjusted; the analysis parameters were set to an e-value of 1e-5, a sequence identity of 45% and a sequence coverage of 50%, giving 5702 pairs of rice-rice blast fungus protein interaction relationships.
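The homology mapping feature can be illustrated with a short Python sketch: a rice protein and a rice blast fungus protein are considered supported when their orthologs in a model organism are known to interact. The sketch assumes the ortholog assignments and the model organism interactions have already been parsed into plain Python containers; the function name `homology_supported_pairs` and all identifiers are hypothetical.

```python
def homology_supported_pairs(rice_orthologs, fungus_orthologs, model_interactions):
    """*_orthologs: dict protein ID -> set of model organism ortholog IDs.
    model_interactions: set of frozenset({a, b}) for known interacting pairs.
    Returns the set of (rice, fungus) pairs supported by homology mapping."""
    supported = set()
    for rice, rice_hits in rice_orthologs.items():
        for fungus, fungus_hits in fungus_orthologs.items():
            if any(
                frozenset((a, b)) in model_interactions
                for a in rice_hits
                for b in fungus_hits
            ):
                supported.add((rice, fungus))
    return supported


# Toy data with hypothetical identifiers.
rice_orthologs = {"Os1": {"M1"}, "Os2": {"M2"}}
fungus_orthologs = {"Mo1": {"M3"}, "Mo2": {"M9"}}
model_interactions = {frozenset(("M1", "M3"))}

pairs = homology_supported_pairs(rice_orthologs, fungus_orthologs, model_interactions)
```

Storing each interaction as a `frozenset` makes the lookup independent of the order in which the two partners are listed.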
Protein domain information was read with PfamScan and combined with the domain interaction data set collected from the 3did database to obtain the interacting protein pairs supported by interacting domains, giving the second non-structural feature: the domain interaction relationship.
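A minimal Python sketch of the domain interaction feature is given below. It assumes the domain annotations and the domain-domain interaction pairs have already been loaded into Python sets; the function name and the domain identifiers are placeholders, not real Pfam accessions.

```python
def has_domain_interaction(domains_a, domains_b, ddi_pairs):
    """Return True if any domain of protein A and any domain of protein B
    form a known domain-domain interaction pair."""
    return any(
        frozenset((da, db)) in ddi_pairs for da in domains_a for db in domains_b
    )


# Toy domain-domain interaction set with placeholder domain names.
ddi_pairs = {frozenset(("DomX", "DomY")), frozenset(("DomZ",))}

supported = has_domain_interaction({"DomX", "DomQ"}, {"DomY"}, ddi_pairs)
unsupported = has_domain_interaction({"DomQ"}, {"DomY"}, ddi_pairs)
# A single-element frozenset represents a domain that interacts with itself.
self_pair = has_domain_interaction({"DomZ"}, {"DomZ"}, ddi_pairs)
```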
Step six, construction and optimization of the machine learning model
Sequence clustering and random combination were applied to the host-pathogen protein interaction data set from step one to generate a certain number of negative samples, and the positive data set from step one and the negative data set from this step were divided into a training set and a test set in a certain proportion. An initial machine learning model was built with the scikit-learn random forest from the 4 structural features and 2 non-structural features of the training set. A grid search function was used to optimize and adjust the parameters of the initial model in batches, and the final parameters were: maximum iteration number 60, maximum decision tree depth 13, minimum number of samples required to split an internal node 120, minimum number of samples per leaf node 20, maximum feature number 7 and random number seed 10, with all other parameters at their defaults. The optimized model was used to predict all pairwise rice-rice blast fungus protein pairs that could interact at the genome scale, with a screening threshold of 0.5. A rice-rice blast fungus protein interaction network was drawn with Cytoscape software from all the prediction results, and the resulting visualization is shown in figure 2.
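The model-building step can be sketched with scikit-learn as follows. This is an illustrative sketch on synthetic data, not the applicant's code: the feature matrix is random, the grid values other than those reported in the text are hypothetical, the "maximum iteration number" is assumed to correspond to `n_estimators`, and `max_features` is left at its default because the synthetic data has only six feature columns. Treating a predicted probability of exactly 0.5 as passing the screening threshold is also an assumption.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for 4 structural + 2 non-structural features.
rng = np.random.default_rng(10)
X = rng.random((600, 6))
y = (X[:, 1] + X[:, 4] > 1.0).astype(int)  # toy labels, not real interactions

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=10, stratify=y
)

# Small grid around the reported values (60 trees, depth 13, split 120, leaf 20).
param_grid = {
    "n_estimators": [30, 60],
    "max_depth": [7, 13],
    "min_samples_split": [120],
    "min_samples_leaf": [20],
}
search = GridSearchCV(
    RandomForestClassifier(random_state=10), param_grid, cv=3, scoring="roc_auc"
)
search.fit(X_train, y_train)

# Score held-out pairs and apply the 0.5 screening threshold.
model = search.best_estimator_
scores = model.predict_proba(X_test)[:, 1]
predicted = scores >= 0.5
```

In a genome-scale run, the same `predict_proba` call would be applied to the feature vectors of all candidate rice-rice blast fungus pairs, and the pairs passing the threshold would form the edge list of the network.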
Based on the existing biological data, the invention can effectively, quickly and simply extract the plant-pathogenic bacteria protein related interaction characteristic information by means of the determined protein space structure data and the information of homology, structural domain interaction and the like, thereby obtaining the plant-pathogenic bacteria protein interaction data and providing reference for the research of plant disease-resistant molecular mechanisms.
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and those skilled in the art can easily conceive of various equivalent modifications or substitutions within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (8)

1. A method for predicting the protein interaction between a plant and a pathogen, comprising the steps of:
1) collecting host-pathogen protein interaction positive data;
2) collecting the spatial structure of the protein complex template, and splitting the protein complex into different subunits to obtain an interaction interface of a subunit pair;
3) taking the spatial structure of the protein complex template in the step 2) as a template, and carrying out homologous structure modeling on a host-pathogenic bacteria protein sequence to obtain a protein homologous spatial structure model;
4) comparing the protein homologous spatial structure with the protein complex template spatial structure to obtain structural characteristics;
5) collecting protein interaction data of the model organism, acquiring a positive interaction data set of the model organism, and extracting non-structural features;
6) building a machine learning model based on the structural features and the non-structural features, testing and adjusting the machine learning model, and predicting rice-rice blast fungus protein interactions at the genome scale.
2. The method according to claim 1, wherein in step 1), host-pathogen protein interaction positive data satisfying at least one experimental method of protein interaction detection means such as yeast two-hybrid is collected using an HPIDB database.
3. The method for predicting plant-pathogen protein interaction according to claim 1, wherein the step 2) comprises:
acquiring experimentally measured protein three-dimensional structure data by using a PDB protein structure database, wherein the protein three-dimensional structure data is measured by at least one experimental method of nuclear magnetic resonance, X-ray crystal diffraction or an electron microscope; after the three-dimensional structure data of the protein is obtained, the protein complex is split into different subunits, the structural data of the subunit pairs is read by PIBASE software, and interaction interface information is extracted.
4. The method for predicting the protein interaction between plants and pathogenic bacteria according to claim 3, wherein in the step 3), the three-dimensional structure data of the protein experimentally measured in the step 2) is used as a template, and MODPIPE is used for carrying out homologous structure modeling on the protein sequence of the host-pathogenic bacteria to obtain a protein homologous spatial structure model.
5. The method according to claim 1, wherein the structural characteristics are obtained by comparing the spatial structure of homology of protein with the spatial structure of the template of protein complex in step 4) using TM-align software.
6. The method of claim 5, wherein the structural features include similarity of protein homology spatial structure to protein complex, structural deviation, and the number of conserved residues and the ratio of conserved residues at the interaction interface between protein homology spatial structure and protein complex template spatial structure.
7. The method of claim 1, wherein in step 5), the cross-species conservation of plant-pathogen protein interactions is analyzed using homology mapping to obtain a protein homology mapping relationship, and a domain interaction data set is combined to obtain the interacting protein pairs supported by interacting domains, i.e., the domain interaction relationship.
8. The method for predicting plant-pathogen protein interaction according to claim 1, wherein the step 6) comprises:
carrying out sequence clustering and random combination on the host-pathogen protein interaction positive data set obtained in step 1) to generate a certain number of negative samples, dividing the positive data set and the negative data set into a training set and a test set in a certain proportion, building a machine learning model with the scikit-learn random forest according to the structural features and the non-structural features of the training set, adjusting and optimizing the parameters of the machine learning model with a grid search function, predicting rice-rice blast fungus protein pair interactions at the genome scale, and drawing a rice-rice blast fungus protein interaction network with Cytoscape software.
CN202011020892.3A 2020-09-25 2020-09-25 Prediction method for interaction of plant and pathogenic bacteria protein Pending CN112185459A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011020892.3A CN112185459A (en) 2020-09-25 2020-09-25 Prediction method for interaction of plant and pathogenic bacteria protein

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011020892.3A CN112185459A (en) 2020-09-25 2020-09-25 Prediction method for interaction of plant and pathogenic bacteria protein

Publications (1)

Publication Number Publication Date
CN112185459A true CN112185459A (en) 2021-01-05

Family

ID=73944510

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011020892.3A Pending CN112185459A (en) 2020-09-25 2020-09-25 Prediction method for interaction of plant and pathogenic bacteria protein

Country Status (1)

Country Link
CN (1) CN112185459A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104911261A (en) * 2015-05-06 2015-09-16 华南农业大学 Method for researching oryza sativa and pathogen interaction mode
CN105354441A (en) * 2015-10-23 2016-02-24 上海交通大学 Vegetable protein interaction network construction method
CN110136773A (en) * 2019-04-02 2019-08-16 上海交通大学 A kind of phytoprotein interaction network construction method based on deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
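蔡浩洋 (Cai Haoyang) et al.: "Prediction and analysis of the rice protein interaction network", Journal of Sichuan University (Natural Science Edition) *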

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210105