CN106446604A - Protein structure ab initio prediction method based on firefly algorithm - Google Patents
- Publication number
- CN106446604A (application number CN201610908691.4A)
- Authority
- CN
- China
- Prior art keywords
- conformation
- colony
- individual
- protein
- individuality
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Theoretical Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Biomedical Technology (AREA)
- Chemical & Material Sciences (AREA)
- Analytical Chemistry (AREA)
- Evolutionary Computation (AREA)
- Genetics & Genomics (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biotechnology (AREA)
- Evolutionary Biology (AREA)
- Medical Informatics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Investigating, Analyzing Materials By Fluorescence Or Luminescence (AREA)
- Investigating Or Analysing Biological Materials (AREA)
Abstract
The invention discloses an ab initio protein structure prediction method based on the firefly algorithm. Within the basic firefly algorithm framework, a coarse-grained energy model is used to effectively reduce the dimension of the conformational space, and the population-based nature of the firefly algorithm is used to maintain the diversity of protein conformations. The conformation population is initialized by fragment assembly; following the coarse-grained representation of protein conformations, the position of a conformation in space is expressed by a set of dihedral angles; energy ranking determines the brightest individual; conformation positions are updated by calculating the attraction between individuals; and a near-native conformation with the lowest energy is obtained by searching the conformational space. Applied to protein structure prediction, the method can obtain conformations with high prediction accuracy and low complexity.
Description
Technical field
The present invention relates to the fields of bioinformatics and computer applications, and more particularly to an ab initio protein structure prediction method based on the firefly algorithm.
Background art
Bioinformatics is a research hotspot at the intersection of the life sciences and computer science. Results of bioinformatics research are now widely applied in gene discovery and prediction, the storage and management of gene data, data retrieval and mining, gene expression data analysis, protein structure prediction, prediction of gene and protein homology relationships, sequence analysis and alignment, and so on. The genome defines all the proteins that make up an organism, and genes define the amino acid sequences that make up those proteins. Although a protein consists of a linear sequence of amino acids, it acquires its activity and biological function only after folding into a specific spatial structure. Understanding the spatial structure of a protein helps in understanding not only what the protein does but also how it performs that function, so determining protein structure is very important. At present, protein sequence databases are accumulating data very quickly, whereas proteins of known structure are comparatively few. Although protein structure determination techniques have made notable progress, determining a protein structure experimentally is still very complex and costly. As a result, far fewer protein structures have been determined than protein sequences are known. Meanwhile, with the development of DNA sequencing technology, the human genome and the genomes of more model organisms have been or will be completely sequenced, and the number of DNA sequences will increase sharply; thanks to advances in DNA sequence analysis and gene recognition methods, large numbers of protein sequences can be derived from DNA. This means that the gap between the number of proteins of known sequence and the number of proteins of determined structure (such as the entries in the protein structure database PDB) will keep widening. It is desirable for the rate at which protein structures are produced to keep pace with the rate at which protein sequences are produced, or at least for the gap between the two to narrow.
There are currently two main technical bottlenecks. The first is the sampling method: the prior art has a weak ability to sample the conformational space. The second is the conformation update method: the prior art still updates conformations with insufficient accuracy. Existing conformational space search methods therefore have shortcomings and need to be improved.
Summary of the invention
To overcome the low sampling efficiency, high complexity, and low prediction accuracy of existing conformational space optimization methods for protein structure prediction, the present invention proposes an ab initio protein structure prediction method based on the firefly algorithm. Within the basic firefly algorithm framework, a coarse-grained energy model is used to effectively reduce the dimension of the conformational space; the population-based nature of the firefly algorithm is used to maintain the diversity of protein conformations; the conformation population is initialized by fragment assembly; following the coarse-grained representation of protein conformations, the position of a conformation in space is expressed by a set of dihedral angles; energy ranking is used to determine the brightest individual; conformation positions are updated by calculating the attraction between individuals; and finally a near-native conformation with the lowest energy is obtained by searching the conformational space.
The technical solution adopted by the present invention to solve this technical problem is as follows:
An ab initio protein structure prediction method based on the firefly algorithm, the method comprising the following steps:
1) Provide the input sequence information;
2) Parameter initialization: set the population size popSize, the number of iterations generation, the light-intensity attraction factor γ, and the position-update step factor α;
3) Population conformation initialization: according to the given input sequence, randomly generate popSize individuals, perform fragment assembly length times on each individual in the population, and calculate its fluorescent brightness Io, where length is the sequence length, $Io = -E$, and E is the protein conformation energy value calculated by the Rosetta score3 energy function;
4) Sort the fluorescent brightness values calculated in step 3) in descending order, and let the individual with the greatest fluorescent brightness be $p_g$;
5) Start the iteration:
5.1) For each individual in the population, calculate the attraction β of $p_g$ to it;
5.2) Update the position of each individual in space according to $x_i(t+1) = x_i(t) + \beta\,(x_j(t) - x_i(t)) + \alpha\,(\mathrm{rand} - 0.5)$, where $x_i(t+1)$ and $x_i(t)$ denote the updated position and the current position of individual $p_i$, and $x_j(t)$ denotes the current position of individual $p_g$; here $\beta = \beta_0 e^{-\gamma r_{ij}^2}$, where $\beta_0$ is the maximum attraction factor, $r_{ij}$ is the distance between individuals $p_i$ and $p_g$, and rand is a random number between 0 and 1; the position $x_i(t)$ of each individual in the population is expressed by the dihedral angles ψ of the amino acid residues of the input sequence, and L is the fragment length;
5.3) Perform random fragment assembly with fragments of length L on each individual in the population, completing the random perturbation of the population;
5.4) Recalculate the fluorescent brightness of each individual and update $p_g$;
6) Determine whether the maximum number of iterations generation has been reached:
6.1) If the current iteration count is less than generation, return to step 5.1);
6.2) If the current iteration count equals generation, terminate.
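By way of illustration only, steps 2) to 6) can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the implementation of the invention: `toy_energy` is a hypothetical stand-in for the Rosetta score3 energy, the distance between two conformations is assumed to be the Euclidean distance between their dihedral-angle vectors (in radians), the attraction is taken to be the usual firefly form β0·exp(−γ·r²) with an assumed β0 = 1, and the fragment assembly of steps 3) and 5.3) is omitted (individuals are simply initialized at random).

```python
import math
import random


def toy_energy(angles):
    # Hypothetical stand-in for the Rosetta score3 energy; not a real force field.
    return sum(1.0 - math.cos(a + 1.0) for a in angles)


def distance(x, y):
    # Assumption: Euclidean distance between two dihedral-angle vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def firefly_search(n_angles, pop_size, generations, gamma, alpha, beta0=1.0, seed=0):
    rng = random.Random(seed)
    # Step 3 (simplified): random individuals instead of fragment assembly.
    pop = [[rng.uniform(-math.pi, math.pi) for _ in range(n_angles)]
           for _ in range(pop_size)]
    # Brightness is the negated energy, Io = -E.
    brightness = [-toy_energy(x) for x in pop]
    # Step 4: index of the brightest individual p_g.
    g = max(range(pop_size), key=brightness.__getitem__)
    for _ in range(generations):
        xg = list(pop[g])  # snapshot of p_g's current position
        for i in range(pop_size):
            # Step 5.1: attraction of p_g to individual i.
            r = distance(pop[i], xg)
            beta = beta0 * math.exp(-gamma * r * r)
            # Step 5.2: move toward p_g and add a bounded random step.
            pop[i] = [xi + beta * (xj - xi) + alpha * (rng.random() - 0.5)
                      for xi, xj in zip(pop[i], xg)]
        # Step 5.4: recompute brightness and update p_g.
        brightness = [-toy_energy(x) for x in pop]
        g = max(range(pop_size), key=brightness.__getitem__)
    return pop[g], -brightness[g]


best, best_energy = firefly_search(n_angles=6, pop_size=10, generations=20,
                                   gamma=0.5, alpha=0.7)
```

Because the sketch follows the listed steps literally, the brightest individual is also perturbed by the random term in every iteration, so the returned energy is that of the final p_g rather than of the best conformation ever visited.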
The technical concept of the present invention is as follows: within the basic firefly algorithm framework, a coarse-grained energy model is used to effectively reduce the dimension of the conformational space; the population-based nature of the firefly algorithm is used to maintain the diversity of protein conformations; the conformation population is initialized by fragment assembly; following the coarse-grained representation of protein conformations, the position of a conformation in space is expressed by a set of dihedral angles; energy ranking is used to determine the brightest individual; conformation positions are updated by calculating the attraction between individuals; and finally a near-native conformation with the lowest energy is obtained by searching the conformational space.
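Since a conformation is represented by a set of dihedral angles, an additive position update can push an angle outside its usual range. The text does not say how this is handled; the following Python sketch shows one common choice, assumed here purely for illustration, which wraps each angle (in degrees) back into the interval [-180, 180). The function names are hypothetical.

```python
def wrap_angle(deg):
    # Map any angle in degrees to the equivalent angle in [-180, 180).
    return (deg + 180.0) % 360.0 - 180.0


def wrap_conformation(angles):
    # Apply the wrap to every dihedral angle of a conformation vector.
    return [wrap_angle(a) for a in angles]


wrapped = wrap_conformation([190.0, -190.0, 45.0])
```

Wrapping keeps two representations of the same physical angle from drifting apart numerically, which matters if distances between conformations are computed directly on the angle vectors.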
The beneficial effects of the present invention are as follows: applied to protein structure prediction, the invention can obtain conformations with higher prediction accuracy and lower complexity.
Brief description of the drawings
Fig. 1 is a three-dimensional schematic diagram of the predicted conformation of protein 2L0G that is closest to the experimentally determined structure.
Detailed description of the embodiments
The invention is further described below with reference to the accompanying drawing.
With reference to Fig. 1, an ab initio protein structure prediction method based on the firefly algorithm comprises the following steps:
1) Provide the input sequence information;
2) Parameter initialization: set the population size popSize, the number of iterations generation, the light-intensity attraction factor γ, and the position-update step factor α;
3) Population conformation initialization: according to the given input sequence, randomly generate popSize individuals, perform fragment assembly length times on each individual in the population, and calculate its fluorescent brightness Io, where length is the sequence length, $Io = -E$, and E is the protein conformation energy value calculated by the Rosetta score3 energy function;
4) Sort the fluorescent brightness values calculated in step 3) in descending order, and let the individual with the greatest fluorescent brightness be $p_g$;
5) Start the iteration:
5.1) For each individual in the population, calculate the attraction β of $p_g$ to it;
5.2) Update the position of each individual in space according to $x_i(t+1) = x_i(t) + \beta\,(x_j(t) - x_i(t)) + \alpha\,(\mathrm{rand} - 0.5)$, where $x_i(t+1)$ and $x_i(t)$ denote the updated position and the current position of individual $p_i$, and $x_j(t)$ denotes the current position of individual $p_g$; here $\beta = \beta_0 e^{-\gamma r_{ij}^2}$, where $\beta_0$ is the maximum attraction factor, $r_{ij}$ is the distance between individuals $p_i$ and $p_g$, and rand is a random number between 0 and 1; the position $x_i(t)$ of each individual in the population is expressed by the dihedral angles ψ of the amino acid residues of the input sequence, and L is the fragment length;
5.3) Perform random fragment assembly with fragments of length L on each individual in the population, completing the random perturbation of the population;
5.4) Recalculate the fluorescent brightness of each individual and update $p_g$;
6) Determine whether the maximum number of iterations generation has been reached:
6.1) If the current iteration count is less than generation, return to step 5.1);
6.2) If the current iteration count equals generation, terminate.
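The brightness ranking of steps 3) and 4) can be illustrated with a short Python sketch. The energy values below are invented for the example and the function name `rank_by_brightness` is hypothetical; in the method itself the energies would come from the Rosetta score3 energy function.

```python
def rank_by_brightness(energies):
    # Brightness is the negated energy (Io = -E), so lower energy means brighter.
    brightness = [-e for e in energies]
    # Indices of individuals sorted from brightest to dimmest.
    order = sorted(range(len(energies)), key=lambda i: brightness[i], reverse=True)
    return order, brightness


# Made-up energies for three individuals.
order, brightness = rank_by_brightness([12.5, -3.0, 4.2])
p_g = order[0]  # index of the brightest (lowest-energy) individual
```

Negating the energy lets the firefly convention of moving toward brighter individuals coincide with energy minimization.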
This embodiment takes the test protein with PDB ID 2L0G as an example. An ab initio protein structure prediction method based on the firefly algorithm comprises the following steps:
1) Provide the input sequence information;
2) Parameter initialization: set the population size popSize=50, the number of iterations generation=10000, the light-intensity attraction factor γ=0.5, and the position-update step factor α=0.7;
3) Population conformation initialization: according to the given input sequence, randomly generate popSize individuals, perform fragment assembly length times on each individual in the population, and calculate its fluorescent brightness Io, where length is the sequence length, $Io = -E$, and E is the protein conformation energy value calculated by the Rosetta score3 energy function;
4) Sort the fluorescent brightness values calculated in step 3) in descending order, and let the individual with the greatest fluorescent brightness be $p_g$;
5) Start the iteration:
5.1) For each individual in the population, calculate the attraction β of $p_g$ to it;
5.2) Update the position of each individual in space according to $x_i(t+1) = x_i(t) + \beta\,(x_j(t) - x_i(t)) + \alpha\,(\mathrm{rand} - 0.5)$, where $x_i(t+1)$ and $x_i(t)$ denote the updated position and the current position of individual $p_i$, and $x_j(t)$ denotes the current position of individual $p_g$; here $\beta = \beta_0 e^{-\gamma r_{ij}^2}$, where $\beta_0$ is the maximum attraction factor, $r_{ij}$ is the distance between individuals $p_i$ and $p_g$, and rand is a random number between 0 and 1; the position $x_i(t)$ of each individual in the population is expressed by the dihedral angles ψ of the amino acid residues of the input sequence, and L=3 is the fragment length;
5.3) Perform random fragment assembly with fragments of length L on each individual in the population, completing the random perturbation of the population;
5.4) Recalculate the fluorescent brightness of each individual and update $p_g$;
6) Determine whether the maximum number of iterations generation has been reached:
6.1) If the current iteration count is less than generation, return to step 5.1);
6.2) If the current iteration count equals generation, terminate.
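To give a feel for the parameter values of this embodiment, the following Python sketch collects them in a small configuration object and evaluates the two terms of the position update. It is a hedged illustration: the class and function names are hypothetical, the attraction is taken to be the usual firefly form β0·exp(−γ·r²), and β0 = 1.0 is an assumption because the embodiment does not state a value for it.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FireflyParams:
    pop_size: int = 50         # popSize in the embodiment
    generations: int = 10000   # generation in the embodiment
    gamma: float = 0.5         # light-intensity attraction factor
    alpha: float = 0.7         # position-update step factor
    fragment_length: int = 3   # L in the embodiment
    beta0: float = 1.0         # assumption: not specified in the embodiment


def attraction(params, r):
    # Attraction of the brightest individual at distance r.
    return params.beta0 * math.exp(-params.gamma * r * r)


def noise_bounds(params):
    # alpha * (rand - 0.5) with rand in [0, 1) lies in [-alpha/2, alpha/2).
    return (-0.5 * params.alpha, 0.5 * params.alpha)


params = FireflyParams()
betas = [attraction(params, r) for r in (0.0, 1.0, 2.0)]
low, high = noise_bounds(params)
```

With γ=0.5 the attraction falls from β0 at distance 0 to β0·exp(−0.5) at distance 1 and β0·exp(−2) at distance 2, so individuals far from the brightest one are moved mostly by the random term, whose magnitude per angle is bounded by α/2 = 0.35.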
The above describes the good results shown by one embodiment of the present invention. Obviously, the invention is not limited to the above embodiment; it can be implemented with many variations without departing from its essential spirit and without going beyond its substantive content.
Claims (1)
1. An ab initio protein structure prediction method based on the firefly algorithm, characterized in that the method comprises the following steps:
1) Provide the input sequence information;
2) Parameter initialization: set the population size popSize, the number of iterations generation, the light-intensity attraction factor γ, and the position-update step factor α;
3) Population conformation initialization: according to the given input sequence, randomly generate popSize individuals, perform fragment assembly length times on each individual in the population, and calculate its fluorescent brightness Io, where length is the sequence length, $Io = -E$, and E is the protein conformation energy value calculated by the Rosetta score3 energy function;
4) Sort the fluorescent brightness values calculated in step 3) in descending order, and let the individual with the greatest fluorescent brightness be $p_g$;
5) Start the iteration:
5.1) For each individual in the population, calculate the attraction β of $p_g$ to it;
5.2) Update the position of each individual in space according to $x_i(t+1) = x_i(t) + \beta\,(x_j(t) - x_i(t)) + \alpha\,(\mathrm{rand} - 0.5)$, where $x_i(t+1)$ and $x_i(t)$ denote the updated position and the current position of individual $p_i$, and $x_j(t)$ denotes the current position of individual $p_g$; here $\beta = \beta_0 e^{-\gamma r_{ij}^2}$, where $\beta_0$ is the maximum attraction factor, $r_{ij}$ is the distance between individuals $p_i$ and $p_g$, and rand is a random number between 0 and 1; the position $x_i(t)$ of each individual in the population is expressed by the dihedral angles of the amino acid residues of the input sequence, and L is the fragment length;
5.3) Perform random fragment assembly with fragments of length L on each individual in the population, completing the random perturbation of the population;
5.4) Recalculate the fluorescent brightness of each individual and update $p_g$;
6) Determine whether the maximum number of iterations generation has been reached:
6.1) If the current iteration count is less than generation, return to step 5.1);
6.2) If the current iteration count equals generation, terminate.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610908691.4A CN106446604A (en) | 2016-10-19 | 2016-10-19 | Protein structure ab initio prediction method based on firefly algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610908691.4A CN106446604A (en) | 2016-10-19 | 2016-10-19 | Protein structure ab initio prediction method based on firefly algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106446604A true CN106446604A (en) | 2017-02-22 |
Family
ID=58175373
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610908691.4A Pending CN106446604A (en) | 2016-10-19 | 2016-10-19 | Protein structure ab initio prediction method based on firefly algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106446604A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107085674A (en) * | 2017-03-14 | 2017-08-22 | 浙江工业大学 | Multi-modal protein conformation space optimization method based on improved firefly algorithm |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040024536A1 (en) * | 2000-09-28 | 2004-02-05 | Torbjorn Rognes | Determination of optimal local sequence alignment similarity score |
US20060141480A1 (en) * | 1999-11-10 | 2006-06-29 | Kalyanaraman Ramnarayan | Use of computationally derived protein structures of genetic polymorphisms in pharmacogenomics and clinical applications |
CN105160196A (en) * | 2015-09-22 | 2015-12-16 | 浙江工业大学 | Dynamic mutation policy based group global optimization method |
CN105205348A (en) * | 2015-09-22 | 2015-12-30 | 浙江工业大学 | Method for colony conformation space optimization based on distance constraint selection strategy |
Non-Patent Citations (3)
Title |
---|
XIUJUAN LEI ET AL: "Detecting functional modules in dynamic protein-protein interaction networks using Markov Clustering and Firefly Algorithm", 《2014 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM)》 * |
XIUJUAN LEI ET AL: "Protein complex identification through Markov clustering with firefly algorithm on dynamic protein–protein interaction networks", 《INFORMATION SCIENCES》 * |
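WANG CHONG: "Research on clustering algorithms based on swarm intelligence optimization", 《China Master's Theses Full-text Database (Electronic Journal)》 * |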
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6850874B2 (en) | Methods, devices, equipment and storage media for predicting protein binding sites | |
von Reumont et al. | Pancrustacean phylogeny in the light of new phylogenomic data: support for Remipedia as the possible sister group of Hexapoda | |
JP2019535057A5 (en) | ||
Sleator | Phylogenetics | |
CN114898811A (en) | Training method and device of protein training model, electronic equipment and storage medium | |
CN105930688A (en) | Improved PSO algorithm based protein function module detection method | |
CN109816087B (en) | Strong convection weather discrimination method for rough set attribute reduction based on artificial fish swarm and frog swarm hybrid algorithm | |
CN103473482A (en) | Protein three-dimensional structure prediction method based on differential evolution and conformation space annealing | |
CN111429970B (en) | Method and system for acquiring multiple gene risk scores based on feature selection of extreme gradient lifting method | |
CN109360599A (en) | A kind of Advances in protein structure prediction based on contact residues information Crossover Strategy | |
CN104951669B (en) | A kind of distance spectrum construction method for protein structure prediction | |
CN106096326A (en) | A kind of differential evolution Advances in protein structure prediction based on barycenter Mutation Strategy | |
CN106503485B (en) | A kind of multi-modal differential evolution protein structure ab initio prediction method of local enhancement | |
CN106446604A (en) | Protein structure ab initio prediction method based on firefly algorithm | |
CN106484865A (en) | One kind is based on four word chained list dictionary tree searching algorithm of DNA k mer index problem | |
Chen et al. | sORFPred: a method based on comprehensive features and ensemble learning to predict the sORFs in plant LncRNAs | |
CN109360601A (en) | A kind of multi-modal Advances in protein structure prediction based on exclusion strategy | |
CN108932402A (en) | A kind of protein complex recognizing method | |
CN107066834B (en) | A kind of protein structure ab initio prediction method based on particle swarm optimization algorithm | |
CN108595910A (en) | A kind of group's protein conformation space optimization method based on diversity index | |
CN108920894A (en) | A kind of protein conformation space optimization method based on the estimation of brief abstract convex | |
CN107145764B (en) | A kind of protein conformation space search method of dual distribution estimation guidance | |
CN109378033B (en) | Strategy self-adaptive protein conformation space optimization method based on transfer entropy | |
CN109390035A (en) | A kind of protein conformation space optimization method compared based on partial structurtes | |
Sun et al. | Hyperedge representations with hypergraph wavelets: applications to spatial transcriptomics |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20170222 |