CN108614957B - Multi-stage protein structure prediction method based on Shannon entropy - Google Patents

Multi-stage protein structure prediction method based on Shannon entropy

Info

Publication number
CN108614957B
Authority
CN
China
Prior art keywords
state
stage
population
current
shannon entropy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810238703.6A
Other languages
Chinese (zh)
Other versions
CN108614957A (en)
Inventor
张贵军
谢腾宇
周晓根
王柳静
马来发
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B15/00: ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00: ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding

Abstract

A multi-stage protein structure prediction method based on Shannon entropy. First, the Rosetta Abinitio protocol is used to explore the search space, and potential native-state regions are found by clustering background points. Next, the prediction process is carried out in stages within the framework of a population evolution algorithm: the relation between each generation of the population and the potential native-state regions is analyzed, and classification indicates the evolutionary state of the current population. Then the state transition matrix between two consecutive generations of the population is calculated, and the Shannon entropy is used to measure how the population state is changing. Finally, the stage is switched according to the accumulated number of times the Shannon entropy falls below a given threshold, and the last generation of the population is taken as the final prediction result. Because the stages are switched dynamically according to the Shannon entropy, the method improves the prediction accuracy and robustness of the algorithm.

Description

Multi-stage protein structure prediction method based on Shannon entropy
Technical Field
The invention relates to the fields of bioinformatics, intelligent optimization and computer application, in particular to a multi-stage protein structure prediction method based on Shannon entropy.
Background
Proteins are the material basis of life. They are organic macromolecules, the basic organic constituents of cells, and the main carriers of life activities; a protein is a substance with a defined spatial structure, formed when polypeptide chains built from amino acids by dehydration condensation coil and fold. Proteins perform particular functions by coiling or folding into a spatial structure, and multiple proteins often bind together into a stable protein complex to do so. The three-dimensional structure of proteins is of decisive importance in drug design, protein engineering, and biotechnology, so protein structure prediction is an important research problem.
Experimental methods for determining protein structure include X-ray crystallography, nuclear magnetic resonance spectroscopy, and electron microscopy, all of which are widely used. Among them, X-ray crystallography is regarded as one of the more practical and accurate. However, it requires a complex crystallization process and cannot be applied to proteins that do not crystallize readily (e.g., membrane proteins). In addition, these experimental methods are extremely time-consuming, expensive, and error-prone.
According to the Anfinsen principle, predicting the three-dimensional structure of a protein directly from its amino acid sequence, with a computer as the tool and an appropriate algorithm, is currently a main research topic in bioinformatics. De novo prediction is an optimization approach: a physics-based or knowledge-based protein energy model is built on the Anfinsen hypothesis, and a suitable optimization algorithm is then designed to find the minimum-energy conformation. On one hand, the approach helps reveal the protein folding mechanism in a biological sense and may ultimately clarify the second genetic code, the theoretical part of the central dogma of biology; on the other hand, it is universal in a practical sense: de novo prediction is the only option for targets with sequence similarity < 20% or for oligopeptides (small proteins of < 10 residues). Rosetta, QUARK, and similar methods build knowledge-based energy models and have stood out in past CASP events. However, when the target protein has a long sequence, the search space grows exponentially and the prediction accuracy drops sharply, which leads to insufficient sampling capability, improper stage switching, failure to retain good intermediate results, and wasted computing resources.
Therefore, existing multi-stage protein structure prediction methods based on energy functions are deficient in stage switching and prediction accuracy and need to be improved.
Disclosure of Invention
To overcome the deficiencies of existing energy-function-based multi-stage protein structure prediction methods in stage switching and prediction accuracy, the invention provides a Shannon-entropy-guided multi-stage protein structure prediction method with reasonable stage switching and high prediction accuracy.
The technical scheme adopted by the invention for solving the technical problems is as follows:
a multi-stage protein structure prediction method based on shannon entropy, the method comprising the steps of:
1) giving the input sequence information and obtaining the fragment library of the sequence from the Robetta server;
2) constructing a Markov state model by the following process:
2.1) acquiring nstruct background points: running the Rosetta Abinitio protocol nstruct times and recording the conformation produced by each run as a background point;
2.2) calculating the root-mean-square deviation (RMSD) between the nstruct background points to form a distance matrix D;
2.3) clustering the nstruct background points with the k-means method according to the distance matrix D to obtain m cluster centers, which serve as the m Markov states;
3) initialization: setting the current stage $stage = 1$, the Shannon entropy threshold $\alpha$, and the maximum Shannon entropy accumulation count $count\_max$; executing the current stage of Rosetta Abinitio NP times on the input sequence to generate the initial conformation population $P = \{C_1, C_2, \ldots, C_{NP}\}$, where $C_{NP}$ denotes the NP-th individual;
4) calculating the current population state: classifying each individual $C_i$, $i \in \{1, \ldots, NP\}$, in the population: calculating the RMSD between $C_i$ and each of the m cluster centers; if $C_i$ is nearest to the p-th cluster center, the current state of the individual is $state_i = p$, $p \in \{1, 2, \ldots, m\}$; the state of the whole population is denoted $state_{last} = \{state_1, state_2, \ldots, state_{NP}\}$, where $state_{last}$ denotes the population state of the previous generation; then setting $stage = stage + 1$;
5) setting the Shannon entropy accumulation count $count = 0$ and entering the next stage by the following process:
5.1) performing the prediction of the corresponding stage on the population by the following process:
5.1.1) performing fragment assembly on individual $C_i$ to obtain $C'_i$, and evaluating the energies of the conformations before and after fragment assembly, $E_{stage}(C_i)$ and $E'_{stage}(C'_i)$, with the energy function of this stage;
5.1.2) if $E_{stage}(C_i) > E'_{stage}(C'_i)$, accepting the fragment assembly, i.e. $C_i = C'_i$; otherwise applying the Metropolis criterion: calculating $p = \exp(-(E_{stage}(C_i) - E'_{stage}(C'_i)))$, and if $p > rand(0,1)$, accepting the fragment assembly, i.e. $C_i = C'_i$; otherwise rejecting the fragment assembly;
5.1.3) performing steps 5.1.1) to 5.1.2) on all individuals to obtain the next-generation population;
5.2) calculating the current population state: classifying each individual $C_i$, $i \in \{1, 2, \ldots, NP\}$, in the population: calculating the RMSD between $C_i$ and each of the m cluster centers; if $C_i$ is nearest to the q-th cluster center, $q \in \{1, 2, \ldots, m\}$, the current state of the individual is $state'_i = q$; the state of the whole population is denoted $state_{now} = \{state'_1, state'_2, \ldots, state'_{NP}\}$, where $state_{now}$ denotes the current population state;
5.3) obtaining the Markov state transition matrix T from the state statistics of the two consecutive generations: for conformation $C_i$, $i \in \{1, \ldots, NP\}$, the previous state $state_i = p$ and the current state $state'_i = q$ indicate a transition from state p to state q, so $t_{pq} = t_{pq} + 1/m$, where $t_{pq}$, the entry of T in row p and column q, represents the state transition frequency and is initialized to 0;
5.4) calculating the Shannon entropy from the state transition matrix T: $Entropy = -\sum_{p,q} t_{pq} \ln t_{pq}$;
5.5) updating the previous-generation state: $state_{last} = state_{now}$;
5.6) if $Entropy < \alpha$, the population state transition is considered relatively definite, and $count = count + 1$;
5.7) if $count < count\_max$, continuing the current stage and returning to step 5.1); otherwise switching stages, i.e. $stage = stage + 1$; if $stage < 5$, returning to step 5); otherwise ending the fourth-stage prediction process and outputting the prediction result.
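Steps 5.3) and 5.4) can be illustrated with a minimal Python sketch. It keeps the increment 1/m exactly as step 5.3) states it; the optional `norm` argument is an assumption of this sketch, added because dividing by NP instead would make the entries of T sum to 1. The function names `transition_matrix` and `shannon_entropy` are illustrative and do not come from the source. Zero entries are skipped in the sum, following the usual convention that 0 ln 0 = 0.

```python
import math


def transition_matrix(state_last, state_now, m, norm=None):
    """Accumulate transitions p -> q between two generations.

    States are numbered 1..m as in the text. By default each transition
    adds 1/m (step 5.3 as written); pass norm=len(state_last) to divide
    by NP instead (an assumption of this sketch, not stated in the source).
    """
    if norm is None:
        norm = m
    T = [[0.0] * m for _ in range(m)]
    for p, q in zip(state_last, state_now):
        T[p - 1][q - 1] += 1.0 / norm
    return T


def shannon_entropy(T):
    """Return the sum of -t_pq * ln(t_pq) over all non-zero entries of T."""
    return sum(-t * math.log(t) for row in T for t in row if t > 0.0)


# Toy example: four individuals, two states, normalized by NP = 4.
T = transition_matrix([1, 1, 2, 2], [1, 2, 2, 2], m=2, norm=4)
entropy = shannon_entropy(T)
```

When every individual makes the same transition and the matrix is normalized by NP, a single entry equals 1 and the entropy is 0, which matches the idea in step 5.6) that a low entropy indicates a more definite population state transition.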
The technical concept of the invention is as follows: first, the Rosetta Abinitio protocol is used to explore the search space, and potential native-state regions are found by clustering background points; next, the prediction process is carried out in stages within the framework of a population evolution algorithm, the relation between each generation of the population and the potential native-state regions is analyzed, and classification indicates the evolutionary state of the current population; then the state transition matrix between two consecutive generations of the population is calculated, and the Shannon entropy is used to measure how the population state is changing; finally, the stage is switched according to the accumulated number of times the Shannon entropy falls below a given threshold, and the last generation of the population is taken as the final prediction result.
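The stage-switching rule can be sketched as a small control loop. This is a minimal sketch under stated assumptions: `evolve_one_generation` is a hypothetical callback that advances the population by one generation in the given stage and returns the Shannon entropy of the resulting state transition matrix; the loop covers stages 2 to 4 on the reading that stage 1 is consumed by initialization; and the `max_generations` guard is not in the source and only keeps the sketch from looping forever if the entropy never drops below the threshold.

```python
def run_stages(evolve_one_generation, alpha, count_max,
               first_stage=2, last_stage=4, max_generations=10_000):
    """Run each stage until the entropy has been below alpha count_max times.

    Returns a dict mapping each stage to the number of generations it used.
    """
    generations = {}
    stage = first_stage
    while stage <= last_stage:
        count = 0          # accumulated low-entropy generations in this stage
        n = 0              # generations executed in this stage
        while count < count_max and n < max_generations:
            entropy = evolve_one_generation(stage)
            n += 1
            if entropy < alpha:
                count += 1
        generations[stage] = n
        stage += 1         # switch to the next stage
    return generations
```

Note that the count is cumulative within a stage rather than consecutive, as in steps 5.6) and 5.7): a generation with high entropy does not reset it.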
The beneficial effects of the invention are as follows: on one hand, clustering the background points locates the potential native-state regions, which narrows the search space and reduces the computational cost; on the other hand, the Shannon entropy measures the evolution of the population and drives the stage switching, so the number of iterations in each stage adapts dynamically to the size of the search space, which improves prediction accuracy and robustness.
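The clustering of background points can be sketched in a few lines. Standard k-means averages coordinates into centroids, while step 2.2) provides only a pairwise RMSD matrix D, so this sketch swaps in k-medoids, which picks actual background points as centers and works directly on D. That substitution, the deterministic initialization, and the name `k_medoids` are assumptions of the sketch and are not the procedure described in the source.

```python
def k_medoids(D, k, max_iter=100):
    """Cluster n points from an n x n distance matrix D.

    Returns (medoids, labels): medoids are indices of the chosen center
    points, labels[i] is the 0-based cluster index of point i.
    """
    n = len(D)
    medoids = list(range(k))  # deterministic start: first k points (assumption)

    def nearest(i):
        # Index of the cluster whose medoid is closest to point i.
        return min(range(k), key=lambda c: D[i][medoids[c]])

    for _ in range(max_iter):
        labels = [nearest(i) for i in range(n)]
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                new_medoids.append(medoids[c])  # keep the medoid of an empty cluster
                continue
            # New medoid: the member with the smallest total distance to the others.
            new_medoids.append(
                min(members, key=lambda j: sum(D[i][j] for i in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    labels = [nearest(i) for i in range(n)]
    return medoids, labels


# Toy example: six points on a line, distances are absolute differences.
pts = [0, 1, 2, 10, 11, 12]
D = [[abs(a - b) for b in pts] for a in pts]
medoids, labels = k_medoids(D, 2)
```

The medoid indices returned here would play the role of the m cluster centers used as Markov states.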
Drawings
FIG. 1 shows the distribution of conformational energy versus RMSD relative to the native state, obtained when the multi-stage protein structure prediction method based on Shannon entropy is applied to protein 1ACF.
FIG. 2 shows the three-dimensional structure obtained when the multi-stage protein structure prediction method based on Shannon entropy is applied to protein 1ACF.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Referring to fig. 1 and 2, a multi-stage protein structure prediction method based on shannon entropy includes the following steps:
1) giving the input sequence information and obtaining the fragment library of the sequence from the Robetta server;
2) constructing a Markov state model by the following process:
2.1) acquiring nstruct background points: running the Rosetta Abinitio protocol nstruct times and recording the conformation produced by each run as a background point;
2.2) calculating the root-mean-square deviation (RMSD) between the nstruct background points to form a distance matrix D;
2.3) clustering the nstruct background points with the k-means method according to the distance matrix D to obtain m cluster centers, which serve as the m Markov states;
3) initialization: setting the current stage $stage = 1$, the Shannon entropy threshold $\alpha$, and the maximum Shannon entropy accumulation count $count\_max$; executing the current stage of Rosetta Abinitio NP times on the input sequence to generate the initial conformation population $P = \{C_1, C_2, \ldots, C_{NP}\}$, where $C_{NP}$ denotes the NP-th individual;
4) calculating the current population state: classifying each individual $C_i$, $i \in \{1, \ldots, NP\}$, in the population: calculating the RMSD between $C_i$ and each of the m cluster centers; if $C_i$ is nearest to the p-th cluster center, the current state of the individual is $state_i = p$, $p \in \{1, 2, \ldots, m\}$; the state of the whole population is denoted $state_{last} = \{state_1, state_2, \ldots, state_{NP}\}$, where $state_{last}$ denotes the population state of the previous generation; then setting $stage = stage + 1$;
5) setting the Shannon entropy accumulation count $count = 0$ and entering the next stage by the following process:
5.1) performing the prediction of the corresponding stage on the population by the following process:
5.1.1) performing fragment assembly on individual $C_i$ to obtain $C'_i$, and evaluating the energies of the conformations before and after fragment assembly, $E_{stage}(C_i)$ and $E'_{stage}(C'_i)$, with the energy function of this stage;
5.1.2) if $E_{stage}(C_i) > E'_{stage}(C'_i)$, accepting the fragment assembly, i.e. $C_i = C'_i$; otherwise applying the Metropolis criterion: calculating $p = \exp(-(E_{stage}(C_i) - E'_{stage}(C'_i)))$, and if $p > rand(0,1)$, accepting the fragment assembly, i.e. $C_i = C'_i$; otherwise rejecting the fragment assembly;
5.1.3) performing steps 5.1.1) to 5.1.2) on all individuals to obtain the next-generation population;
5.2) calculating the current population state: classifying each individual $C_i$, $i \in \{1, \ldots, NP\}$, in the population: calculating the RMSD between $C_i$ and each of the m cluster centers; if $C_i$ is nearest to the q-th cluster center, $q \in \{1, 2, \ldots, m\}$, the current state of the individual is $state'_i = q$; the state of the whole population is denoted $state_{now} = \{state'_1, state'_2, \ldots, state'_{NP}\}$, where $state_{now}$ denotes the current population state;
5.3) obtaining the Markov state transition matrix T from the state statistics of the two consecutive generations: for conformation $C_i$, $i \in \{1, \ldots, NP\}$, the previous state $state_i = p$ and the current state $state'_i = q$ indicate a transition from state p to state q, so $t_{pq} = t_{pq} + 1/m$, where $t_{pq}$, the entry of T in row p and column q, represents the state transition frequency and is initialized to 0;
5.4) calculating the Shannon entropy from the state transition matrix T: $Entropy = -\sum_{p,q} t_{pq} \ln t_{pq}$;
5.5) updating the previous-generation state: $state_{last} = state_{now}$;
5.6) if $Entropy < \alpha$, the population state transition is considered relatively definite, and $count = count + 1$;
5.7) if $count < count\_max$, continuing the current stage and returning to step 5.1); otherwise switching stages, i.e. $stage = stage + 1$; if $stage < 5$, returning to step 5); otherwise ending the fourth-stage prediction process and outputting the prediction result.
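The acceptance rule of step 5.1.2) can be sketched as a standalone function. The sketch uses the conventional Metropolis form exp(-(E_new - E_old)/kT), in which a higher-energy move is accepted with probability below 1; the temperature factor `kT` and the injectable random source `rng` are assumptions of this sketch and do not appear in the source. The exponent as printed in step 5.1.2) has the opposite sign, which would give p ≥ 1 whenever that branch is reached.

```python
import math
import random


def accept_move(e_old, e_new, kT=1.0, rng=random):
    """Decide whether to accept a fragment assembly move.

    e_old: energy of the conformation before fragment assembly.
    e_new: energy of the conformation after fragment assembly.
    kT:    temperature factor (assumed; not given in the source).
    rng:   any object with a random() method returning a float in [0, 1).
    """
    if e_new < e_old:
        return True  # lower energy: always accept
    p = math.exp(-(e_new - e_old) / kT)  # Metropolis acceptance probability
    return p > rng.random()
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which can help when comparing stage-switching settings.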
This embodiment takes the α/β protein 1ACF, with a sequence length of 125, as an example; the multi-stage protein structure prediction method based on Shannon entropy includes the following steps:
1) giving the input sequence information and obtaining the fragment library of the sequence from the Robetta server;
2) constructing a Markov state model by the following process:
2.1) acquiring nstruct = 1000 background points: running the Rosetta Abinitio protocol nstruct times and recording the conformation produced by each run as a background point;
2.2) calculating the root-mean-square deviation (RMSD) between the nstruct background points to form a distance matrix D;
2.3) clustering the nstruct background points with the k-means method according to the distance matrix D to obtain m = 8 cluster centers, which serve as the m Markov states;
3) initialization: setting the population size NP = 300, the current stage $stage = 1$, the Shannon entropy threshold $\alpha = 0.01$, and the maximum Shannon entropy accumulation count $count\_max = 50$; executing the current stage of Rosetta Abinitio NP times on the input sequence to generate the initial conformation population $P = \{C_1, C_2, \ldots, C_{NP}\}$, where $C_{NP}$ denotes the NP-th individual;
4) calculating the current population state: classifying each individual $C_i$, $i \in \{1, \ldots, NP\}$, in the population: calculating the RMSD between $C_i$ and each of the m cluster centers; if $C_i$ is nearest to the p-th cluster center, the current state of the individual is $state_i = p$, $p \in \{1, \ldots, m\}$; the state of the whole population is denoted $state_{last} = \{state_1, state_2, \ldots, state_{NP}\}$, where $state_{last}$ denotes the population state of the previous generation; then setting $stage = stage + 1$;
5) setting the Shannon entropy accumulation count $count = 0$ and entering the next stage by the following process:
5.1) performing the prediction of the corresponding stage on the population by the following process:
5.1.1) performing fragment assembly on individual $C_i$ to obtain $C'_i$, and evaluating the energies of the conformations before and after fragment assembly, $E_{stage}(C_i)$ and $E'_{stage}(C'_i)$, with the energy function of this stage;
5.1.2) if $E_{stage}(C_i) > E'_{stage}(C'_i)$, accepting the fragment assembly, i.e. $C_i = C'_i$; otherwise applying the Metropolis criterion: calculating $p = \exp(-(E_{stage}(C_i) - E'_{stage}(C'_i)))$, and if $p > rand(0,1)$, accepting the fragment assembly, i.e. $C_i = C'_i$; otherwise rejecting the fragment assembly;
5.1.3) performing steps 5.1.1) to 5.1.2) on all individuals to obtain the next-generation population;
5.2) calculating the current population state: classifying each individual $C_i$, $i \in \{1, \ldots, NP\}$, in the population: calculating the RMSD between $C_i$ and each of the m cluster centers; if $C_i$ is nearest to the q-th cluster center, $q \in \{1, 2, \ldots, m\}$, the current state of the individual is $state'_i = q$; the state of the whole population is denoted $state_{now} = \{state'_1, state'_2, \ldots, state'_{NP}\}$, where $state_{now}$ denotes the current population state;
5.3) obtaining the Markov state transition matrix T from the state statistics of the two consecutive generations: for conformation $C_i$, $i \in \{1, \ldots, NP\}$, the previous state $state_i = p$ and the current state $state'_i = q$ indicate a transition from state p to state q, so $t_{pq} = t_{pq} + 1/m$, where $t_{pq}$, the entry of T in row p and column q, represents the state transition frequency and is initialized to 0;
5.4) calculating the Shannon entropy from the state transition matrix T: $Entropy = -\sum_{p,q} t_{pq} \ln t_{pq}$;
5.5) updating the previous-generation state: $state_{last} = state_{now}$;
5.6) if $Entropy < \alpha$, the population state transition is considered relatively definite, and $count = count + 1$;
5.7) if $count < count\_max$, continuing the current stage and returning to step 5.1); otherwise switching stages, i.e. $stage = stage + 1$; if $stage < 5$, returning to step 5); otherwise ending the fourth-stage prediction process and outputting the prediction result.
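Steps 4) and 5.2) assign each individual to its nearest cluster center. The following is a minimal sketch, assuming a caller-supplied `distance` function that stands in for RMSD (a real RMSD requires optimal superposition of the two conformations, which is omitted here); `assign_states` is a hypothetical helper name. States are numbered from 1 to m as in the text, so with the m = 8 centers of this embodiment each of the NP = 300 individuals would receive a state between 1 and 8.

```python
def assign_states(population, centers, distance):
    """Return the 1-based index of the nearest center for each individual.

    population: sequence of conformations (any objects accepted by distance).
    centers:    sequence of the m cluster-center conformations.
    distance:   function(conformation, center) -> non-negative float.
    """
    states = []
    for conformation in population:
        dists = [distance(conformation, center) for center in centers]
        # On ties, the lowest-numbered center wins.
        states.append(dists.index(min(dists)) + 1)
    return states


# Toy example: scalars stand in for conformations, |a - b| stands in for RMSD.
toy_states = assign_states([0.1, 4.9, 2.2], [0.0, 2.0, 5.0],
                           lambda a, b: abs(a - b))
```

Calling the function once on the previous generation and once on the current generation yields the two state lists from which the transition matrix of step 5.3) is built.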
Taking the α/β protein 1ACF with a sequence length of 125 as an example, the above method yields a near-native conformation of the protein. The minimum root-mean-square deviation is given in the original only as an inline image (Figure BDA0001604678530000081), so its value is not reproduced here. The predicted structure is shown in FIG. 2, and the distribution of conformational energy versus RMSD relative to the native state during prediction is shown in FIG. 1.
The above describes the results obtained by the present invention with protein 1ACF as an example and is not intended to limit the scope of the invention; various modifications and improvements can be made without departing from that scope.

Claims (1)

1. A multi-stage protein structure prediction method based on Shannon entropy, characterized in that the protein structure prediction method comprises the following steps:
1) giving the input sequence information and obtaining the fragment library of the sequence from the Robetta server;
2) constructing a Markov state model by the following process:
2.1) acquiring nstruct background points: running the Rosetta Abinitio protocol nstruct times and recording the conformation produced by each run as a background point;
2.2) calculating the root-mean-square deviation (RMSD) between the nstruct background points to form a distance matrix D;
2.3) clustering the nstruct background points with the k-means method according to the distance matrix D to obtain m cluster centers, which serve as the m Markov states;
3) initialization: setting the current stage $stage = 1$, the Shannon entropy threshold $\alpha$, and the maximum Shannon entropy accumulation count $count\_max$; executing the current stage of Rosetta Abinitio NP times on the input sequence to generate the initial conformation population $P = \{C_1, C_2, \ldots, C_{NP}\}$, where $C_{NP}$ denotes the NP-th individual;
4) calculating the current population state: classifying each individual $C_i$, $i \in \{1, \ldots, NP\}$, in the population: calculating the RMSD between $C_i$ and each of the m cluster centers; if $C_i$ is nearest to the p-th cluster center, the current state of the individual is $state_i = p$, $p \in \{1, 2, \ldots, m\}$; the state of the whole population is denoted $state_{last} = \{state_1, state_2, \ldots, state_{NP}\}$, where $state_{last}$ denotes the population state of the previous generation; then setting $stage = stage + 1$;
5) setting the Shannon entropy accumulation count $count = 0$ and entering the next stage by the following process:
5.1) executing the prediction process of the corresponding stage on the population by the following process:
5.1.1) performing fragment assembly on individual $C_i$ to obtain $C'_i$, and evaluating the energies of the conformations before and after fragment assembly, $E_{stage}(C_i)$ and $E'_{stage}(C'_i)$, with the energy function of this stage;
5.1.2) if $E_{stage}(C_i) > E'_{stage}(C'_i)$, accepting the fragment assembly, i.e. $C_i = C'_i$; otherwise applying the Metropolis criterion: calculating $p = \exp(-(E_{stage}(C_i) - E'_{stage}(C'_i)))$, and if $p > rand(0,1)$, accepting the fragment assembly, i.e. $C_i = C'_i$; otherwise rejecting the fragment assembly;
5.1.3) performing steps 5.1.1) to 5.1.2) on all individuals to obtain the next-generation population;
5.2) calculating the current population state: classifying each individual $C_i$, $i \in \{1, 2, \ldots, NP\}$, in the population: calculating the RMSD between $C_i$ and each of the m cluster centers; if $C_i$ is nearest to the q-th cluster center, $q \in \{1, 2, \ldots, m\}$, the current state of the individual is $state'_i = q$; the state of the whole population is denoted $state_{now} = \{state'_1, state'_2, \ldots, state'_{NP}\}$, where $state_{now}$ denotes the current population state;
5.3) obtaining the Markov state transition matrix T from the previous-generation population state and the current population state: for conformation $C_i$, $i \in \{1, \ldots, NP\}$, the previous state $state_i = p$ and the current state $state'_i = q$ indicate a transition from state p to state q, so $t_{pq} = t_{pq} + 1/m$, where $t_{pq}$, the entry of T in row p and column q, represents the state transition frequency and is initialized to 0;
5.4) calculating the Shannon entropy from the state transition matrix T: $Entropy = -\sum_{p,q} t_{pq} \ln t_{pq}$;
5.5) updating the previous-generation state: $state_{last} = state_{now}$;
5.6) if $Entropy < \alpha$, the population state transition is considered relatively definite, and $count = count + 1$;
5.7) if $count < count\_max$, continuing the current stage and returning to step 5.1); otherwise switching stages, i.e. $stage = stage + 1$; if $stage < 5$, returning to step 5); otherwise ending the fourth-stage prediction process and outputting the prediction result.
CN201810238703.6A 2018-03-22 2018-03-22 Multi-stage protein structure prediction method based on Shannon entropy Active CN108614957B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810238703.6A CN108614957B (en) 2018-03-22 2018-03-22 Multi-stage protein structure prediction method based on Shannon entropy

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810238703.6A CN108614957B (en) 2018-03-22 2018-03-22 Multi-stage protein structure prediction method based on Shannon entropy

Publications (2)

Publication Number Publication Date
CN108614957A CN108614957A (en) 2018-10-02
CN108614957B true CN108614957B (en) 2021-06-18

Family

ID=63659305

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810238703.6A Active CN108614957B (en) 2018-03-22 2018-03-22 Multi-stage protein structure prediction method based on Shannon entropy

Country Status (1)

Country Link
CN (1) CN108614957B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106575320A (en) * 2014-05-05 2017-04-19 Atomwise Inc. (艾腾怀斯股份有限公司) Binding affinity prediction system and method
CN107491664A (en) * 2017-08-29 2017-12-19 Zhejiang University of Technology (浙江工业大学) Protein structure ab initio prediction method based on information entropy

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170002319A1 (en) * 2015-05-13 2017-01-05 Whitehead Institute For Biomedical Research Master Transcription Factors Identification and Use Thereof
US20180068053A1 (en) * 2016-08-05 2018-03-08 The Governors Of The University Of Alberta Systems and methods of selecting compounds with reduced risk of cardiotoxicity using cardiac sodium ion channel models


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
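Title
"Prediction of Protein Structural Features from Sequence Data Based on Shannon Entropy and Kolmogorov Complexity"; Bywater R P; PLOS ONE; 20150409; pp. 1-15 *
"Eight-class protein secondary structure prediction algorithm based on deep learning" (基于深度学习的八类蛋白质二级结构预测算法); Zhang Lei (张蕾); Journal of Computer Applications (计算机应用); 20170510; pp. 1512-1515 *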

Also Published As

Publication number Publication date
CN108614957A (en) 2018-10-02


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20181002

Assignee: ZHEJIANG ORIENT GENE BIOTECH CO.,LTD.

Assignor: ZHEJIANG UNIVERSITY OF TECHNOLOGY

Contract record no.: X2023980053610

Denomination of invention: A multi-stage protein structure prediction method based on Shannon entropy

Granted publication date: 20210618

License type: Common License

Record date: 20231222
