CN106446604A - Protein structure ab initio prediction method based on firefly algorithm - Google Patents

Protein structure ab initio prediction method based on firefly algorithm

Info

Publication number
CN106446604A
Authority
CN
China
Prior art keywords
conformation
colony
individual
protein
individuality
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610908691.4A
Other languages
Chinese (zh)
Inventor
张贵军
郝小虎
周晓根
王柳静
李章维
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT
Priority to CN201610908691.4A
Publication of CN106446604A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/004: Artificial life, i.e. computing arrangements simulating life
    • G06N3/006: Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00: ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Evolutionary Computation (AREA)
  • Genetics & Genomics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Investigating, Analyzing Materials By Fluorescence Or Luminescence (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention discloses an ab initio protein structure prediction method based on the firefly algorithm. Within the basic firefly algorithm framework, a coarse-grained energy model is adopted to effectively reduce the dimension of the conformational space; the population-based nature of the firefly algorithm is used to maintain the diversity of protein conformations; the fragment assembly technique is used to initialize the conformation population; following the coarse-grained representation model of protein conformation, dihedral angles are used to express the position of a conformation in space; energy ranking is used to determine the brightest individual; the position of each conformation is updated by calculating the attraction between individuals; and a near-native conformation of lowest energy is obtained by searching the conformational space. Applied to protein structure prediction, the method can obtain conformations with high prediction accuracy and low complexity.

Description

An ab initio protein structure prediction method based on the firefly algorithm
Technical field
The present invention relates to the fields of bioinformatics and computer applications, and more particularly to an ab initio protein structure prediction method based on the firefly algorithm.
Background
Bioinformatics is a research hotspot at the intersection of the life sciences and computer science. Its results are now widely used in gene discovery and prediction, the storage and management of gene data, data retrieval and mining, gene expression data analysis, protein structure prediction, prediction of gene and protein homology relationships, and sequence analysis and alignment. The genome defines all the proteins that make up an organism, and genes define the amino acid sequences of those proteins. Although a protein consists of a linear sequence of amino acids, it acquires its activity and biological function only after folding into a specific spatial structure. Knowing the spatial structure of a protein helps in understanding not only what the protein does but also how it performs that function, so determining protein structure is very important. At present, protein sequence databases are accumulating data very quickly, whereas proteins of known structure remain comparatively few. Although protein structure determination techniques have made considerable progress, determining a protein structure experimentally is still a complex and costly process.
Therefore, far fewer protein structures have been determined than protein sequences are known. On the other hand, with the development of DNA sequencing technology, the human genome and the genomes of more model organisms have been or will be completely sequenced, and the number of DNA sequences will increase sharply; owing to progress in DNA sequence analysis and gene identification methods, large numbers of protein sequences can be derived from DNA. This means that the gap between the number of proteins of known sequence and the number of proteins of determined structure (such as the data in the protein structure database PDB) will keep widening. It is desirable for the rate at which protein structures are produced to keep pace with the rate at which protein sequences are produced, or at least for the gap between the two to narrow.
The main technical bottlenecks at present are twofold. The first is the sampling method: existing techniques sample the conformational space only weakly. The second is the conformation update method: existing techniques still update conformations with insufficient accuracy. Existing conformational space search methods are therefore deficient and need to be improved.
Summary of the invention
To overcome the low sampling efficiency, high complexity, and low prediction accuracy of existing conformational space optimization methods for protein structure prediction, the present invention proposes an ab initio protein structure prediction method based on the firefly algorithm. Within the basic firefly algorithm framework, a coarse-grained energy model is used to effectively reduce the dimension of the conformational space; the population-based nature of the firefly algorithm is used to maintain the diversity of protein conformations; the fragment assembly technique is used to initialize the conformation population; following the coarse-grained representation model of protein conformation, the position of a conformation in space is represented by a set of dihedral angles; energy ranking is used to determine the brightest individual; the positions of conformations are updated by calculating the attraction between individuals; and finally a near-native conformation of lowest energy is obtained by searching the conformational space.
The technical solution adopted by the present invention to solve the technical problem is:
An ab initio protein structure prediction method based on the firefly algorithm, the method comprising the following steps:
1) Provide the input sequence information;
2) Parameter initialization: set the population size popSize, the number of iterations generation, the light-intensity attraction factor γ, and the position-update step factor α;
3) Population conformation initialization: according to the given input sequence, randomly generate popSize individuals, perform length fragment assemblies on each individual in the population, and calculate its fluorescence brightness Io, where length is the sequence length, Io = -E, and E is the protein conformation energy value calculated with the Rosetta score3 energy function;
4) Sort the fluorescence brightness values calculated in step 3) in descending order, and denote the individual with the greatest fluorescence brightness as $p_g$;
5) Start the iteration:
5.1) For each individual in the population, calculate the attraction β of $p_g$ to it;
5.2) Update the position of each individual in space according to $x_i(t+1) = x_i(t) + \beta\,(x_j(t) - x_i(t)) + \alpha\,(\mathrm{rand} - 0.5)$, where $x_i(t+1)$ and $x_i(t)$ denote the updated and current positions of individual $p_i$, $x_j(t)$ denotes the current position of individual $p_g$, the attraction β is computed from $\beta_0$ and $r_{ij}$, $\beta_0$ is the maximum attraction factor, $r_{ij}$ denotes the distance between individuals $p_i$ and $p_g$, rand is a random number between 0 and 1, the position $x_i(t)$ of each individual in the population is expressed by the dihedral angles (such as ψ) of the amino acid residues of the input sequence, and L is the fragment length;
5.3) Perform random assembly of fragments of length L on each individual in the population to complete the random perturbation of the population;
5.4) Recalculate the fluorescence brightness of each individual and update $p_g$;
6) Determine whether the maximum number of iterations generation has been reached:
6.1) if the current iteration count is less than generation, return to step 5.1);
6.2) if the current iteration count equals generation, terminate.
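The position update in step 5.2) can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the patented implementation: the attraction is assumed to take the standard firefly-algorithm form β = β0·exp(-γ·r²), which this text does not spell out; the distance is assumed to be Euclidean; angles are assumed to be in radians and wrapped into [-π, π); and every function name is hypothetical.

```python
import math
import random


def attraction(beta0, gamma, r):
    """Assumed standard firefly form: beta0 * exp(-gamma * r^2)."""
    return beta0 * math.exp(-gamma * r * r)


def distance(x, y):
    """Euclidean distance between two dihedral-angle vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def wrap(angle):
    """Map an angle in radians into the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def move_towards(x_i, x_g, beta0, gamma, alpha, rng=random):
    """Apply x_i(t+1) = x_i(t) + beta*(x_g(t) - x_i(t)) + alpha*(rand - 0.5)."""
    beta = attraction(beta0, gamma, distance(x_i, x_g))
    return [wrap(a + beta * (g - a) + alpha * (rng.random() - 0.5))
            for a, g in zip(x_i, x_g)]


# Example: pull one conformation (three angles) towards the brightest one.
x_i = [0.2, -1.0, 2.5]
x_g = [0.0, -0.8, 2.0]
x_next = move_towards(x_i, x_g, beta0=1.0, gamma=0.5, alpha=0.7)
```

Because the assumed attraction decays with the squared distance, conformations far from $p_g$ in angle space are moved mostly by the random term, so the units chosen for the angles matter in practice.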
The technical concept of the present invention is as follows: within the basic firefly algorithm framework, a coarse-grained energy model is used to effectively reduce the dimension of the conformational space; the population-based nature of the firefly algorithm is used to maintain the diversity of protein conformations; the fragment assembly technique is used to initialize the conformation population; following the coarse-grained representation model of protein conformation, the position of a conformation in space is represented by a set of dihedral angles; energy ranking is used to determine the brightest individual; the positions of conformations are updated by calculating the attraction between individuals; and finally a near-native conformation of lowest energy is obtained by searching the conformational space.
The beneficial effect of the present invention is that, when applied to protein structure prediction, it can obtain conformations with higher prediction accuracy and lower complexity.
Brief description of the drawings
Fig. 1 is a three-dimensional schematic of the predicted conformation of protein 2L0G that is closest to the experimentally determined structure.
Detailed description of the embodiments
The invention is further described below with reference to the accompanying drawing.
With reference to Fig. 1, an ab initio protein structure prediction method based on the firefly algorithm comprises the following steps:
1) Provide the input sequence information;
2) Parameter initialization: set the population size popSize, the number of iterations generation, the light-intensity attraction factor γ, and the position-update step factor α;
3) Population conformation initialization: according to the given input sequence, randomly generate popSize individuals, perform length fragment assemblies on each individual in the population, and calculate its fluorescence brightness Io, where length is the sequence length, Io = -E, and E is the protein conformation energy value calculated with the Rosetta score3 energy function;
4) Sort the fluorescence brightness values calculated in step 3) in descending order, and denote the individual with the greatest fluorescence brightness as $p_g$;
5) Start the iteration:
5.1) For each individual in the population, calculate the attraction β of $p_g$ to it;
5.2) Update the position of each individual in space according to $x_i(t+1) = x_i(t) + \beta\,(x_j(t) - x_i(t)) + \alpha\,(\mathrm{rand} - 0.5)$, where $x_i(t+1)$ and $x_i(t)$ denote the updated and current positions of individual $p_i$, $x_j(t)$ denotes the current position of individual $p_g$, the attraction β is computed from $\beta_0$ and $r_{ij}$, $\beta_0$ is the maximum attraction factor, $r_{ij}$ denotes the distance between individuals $p_i$ and $p_g$, rand is a random number between 0 and 1, the position $x_i(t)$ of each individual in the population is expressed by the dihedral angles (such as ψ) of the amino acid residues of the input sequence, and L is the fragment length;
5.3) Perform random assembly of fragments of length L on each individual in the population to complete the random perturbation of the population;
5.4) Recalculate the fluorescence brightness of each individual and update $p_g$;
6) Determine whether the maximum number of iterations generation has been reached:
6.1) if the current iteration count is less than generation, return to step 5.1);
6.2) if the current iteration count equals generation, terminate.
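Steps 3) and 4), which turn energies into brightness values and pick the brightest individual $p_g$, can be sketched in Python as follows. This is an illustrative sketch only: `toy_energy` is a hypothetical stand-in for the Rosetta score3 energy function (it is not the real scoring function), and the function names are assumptions introduced here.

```python
def brightness(energy):
    """Fluorescence brightness Io = -E: lower energy means a brighter individual."""
    return -energy


def rank_by_brightness(population, energy_fn):
    """Return (individual, brightness) pairs sorted from brightest to dimmest."""
    scored = [(ind, brightness(energy_fn(ind))) for ind in population]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored


def toy_energy(angles):
    """Hypothetical stand-in for the Rosetta score3 energy; NOT the real function."""
    return sum(a * a for a in angles)


# Three toy conformations, each described by two dihedral angles (radians).
population = [[1.0, -1.0], [0.1, 0.2], [-2.0, 0.5]]
ranked = rank_by_brightness(population, toy_energy)
p_g = ranked[0][0]  # brightest individual, i.e. the lowest-energy conformation
```

Sorting by brightness in descending order is the same as sorting by energy in ascending order, so $p_g$ is simply the lowest-energy conformation in the current population.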
This embodiment takes the test protein with PDB ID 2L0G as an example. An ab initio protein structure prediction method based on the firefly algorithm comprises the following steps:
1) Provide the input sequence information;
2) Parameter initialization: set the population size popSize = 50, the number of iterations generation = 10000, the light-intensity attraction factor γ = 0.5, and the position-update step factor α = 0.7;
3) Population conformation initialization: according to the given input sequence, randomly generate popSize individuals, perform length fragment assemblies on each individual in the population, and calculate its fluorescence brightness Io, where length is the sequence length, Io = -E, and E is the protein conformation energy value calculated with the Rosetta score3 energy function;
4) Sort the fluorescence brightness values calculated in step 3) in descending order, and denote the individual with the greatest fluorescence brightness as $p_g$;
5) Start the iteration:
5.1) For each individual in the population, calculate the attraction β of $p_g$ to it;
5.2) Update the position of each individual in space according to $x_i(t+1) = x_i(t) + \beta\,(x_j(t) - x_i(t)) + \alpha\,(\mathrm{rand} - 0.5)$, where $x_i(t+1)$ and $x_i(t)$ denote the updated and current positions of individual $p_i$, $x_j(t)$ denotes the current position of individual $p_g$, the attraction β is computed from $\beta_0$ and $r_{ij}$, $\beta_0$ is the maximum attraction factor, $r_{ij}$ denotes the distance between individuals $p_i$ and $p_g$, rand is a random number between 0 and 1, the position $x_i(t)$ of each individual in the population is expressed by the dihedral angles (such as ψ) of the amino acid residues of the input sequence, and L = 3 is the fragment length;
5.3) Perform random assembly of fragments of length L on each individual in the population to complete the random perturbation of the population;
5.4) Recalculate the fluorescence brightness of each individual and update $p_g$;
6) Determine whether the maximum number of iterations generation has been reached:
6.1) if the current iteration count is less than generation, return to step 5.1);
6.2) if the current iteration count equals generation, terminate.
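The whole loop of this embodiment can be sketched end to end in Python with the parameter values given above (popSize = 50, generation = 10000, γ = 0.5, α = 0.7, L = 3). The sketch rests on several assumptions that are not taken from the patent text: `toy_energy` is a hypothetical stand-in for Rosetta score3; resampling a window of L consecutive angles stands in for fragment assembly from a fragment library; the attraction is assumed to be β0·exp(-γ·r²) with β0 = 1.0; and the best conformation found so far is retained between iterations.

```python
import math
import random


def toy_energy(x):
    """Hypothetical stand-in for the Rosetta score3 energy; NOT the real function."""
    return sum(1.0 - math.cos(a - 1.0) for a in x)


def random_swing(x, L, rng):
    """Stand-in for fragment assembly: resample L consecutive angles."""
    y = list(x)
    start = rng.randrange(0, len(y) - L + 1)
    for k in range(start, start + L):
        y[k] = rng.uniform(-math.pi, math.pi)
    return y


def firefly_search(n_angles, pop_size=50, generations=10000,
                   gamma=0.5, alpha=0.7, beta0=1.0, L=3, seed=0):
    """Return (best angle vector, its energy) after the given number of iterations."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-math.pi, math.pi) for _ in range(n_angles)]
           for _ in range(pop_size)]
    best = min(pop, key=toy_energy)  # brightest individual p_g (Io = -E)
    for _ in range(generations):
        new_pop = []
        for x in pop:
            # Step 5.1: attraction of p_g to this individual (assumed form).
            r2 = sum((a - g) ** 2 for a, g in zip(x, best))
            beta = beta0 * math.exp(-gamma * r2)
            # Step 5.2: move towards p_g plus a small random step.
            moved = [a + beta * (g - a) + alpha * (rng.random() - 0.5)
                     for a, g in zip(x, best)]
            # Step 5.3: random perturbation of the moved individual.
            new_pop.append(random_swing(moved, L, rng))
        pop = new_pop
        # Step 5.4: update p_g, keeping the best-so-far (an added assumption).
        best = min(pop + [best], key=toy_energy)
    return best, toy_energy(best)


# Short demonstration run; the embodiment itself uses generation = 10000.
best, energy = firefly_search(n_angles=12, generations=200)
```

Retaining the best-so-far conformation guarantees that the reported energy never increases from one iteration to the next, which makes short test runs easy to check; whether the original method keeps $p_g$ in this way is not stated in the text.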
The above describes the good results shown by the embodiment given by the present invention. Obviously, the present invention is not limited to the above embodiment; it can be implemented with many variations without departing from its essential spirit or going beyond its substantive content.

Claims (1)

1. An ab initio protein structure prediction method based on the firefly algorithm, characterized in that the method comprises the following steps:
1) Provide the input sequence information;
2) Parameter initialization: set the population size popSize, the number of iterations generation, the light-intensity attraction factor γ, and the position-update step factor α;
3) Population conformation initialization: according to the given input sequence, randomly generate popSize individuals, perform length fragment assemblies on each individual in the population, and calculate its fluorescence brightness Io, where length is the sequence length, Io = -E, and E is the protein conformation energy value calculated with the Rosetta score3 energy function;
4) Sort the fluorescence brightness values calculated in step 3) in descending order, and denote the individual with the greatest fluorescence brightness as $p_g$;
5) Start the iteration:
5.1) For each individual in the population, calculate the attraction β of $p_g$ to it;
5.2) Update the position of each individual in space according to $x_i(t+1) = x_i(t) + \beta\,(x_j(t) - x_i(t)) + \alpha\,(\mathrm{rand} - 0.5)$, where $x_i(t+1)$ and $x_i(t)$ denote the updated and current positions of individual $p_i$, $x_j(t)$ denotes the current position of individual $p_g$, the attraction β is computed from $\beta_0$ and $r_{ij}$, $\beta_0$ is the maximum attraction factor, $r_{ij}$ denotes the distance between individuals $p_i$ and $p_g$, rand is a random number between 0 and 1, the position $x_i(t)$ of each individual in the population is expressed by the dihedral angles (such as ψ) of the amino acid residues of the input sequence, and L is the fragment length;
5.3) Perform random assembly of fragments of length L on each individual in the population to complete the random perturbation of the population;
5.4) Recalculate the fluorescence brightness of each individual and update $p_g$;
6) Determine whether the maximum number of iterations generation has been reached:
6.1) if the current iteration count is less than generation, return to step 5.1);
6.2) if the current iteration count equals generation, terminate.
CN201610908691.4A 2016-10-19 2016-10-19 Protein structure ab initio prediction method based on firefly algorithm Pending CN106446604A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610908691.4A CN106446604A (en) 2016-10-19 2016-10-19 Protein structure ab initio prediction method based on firefly algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610908691.4A CN106446604A (en) 2016-10-19 2016-10-19 Protein structure ab initio prediction method based on firefly algorithm

Publications (1)

Publication Number Publication Date
CN106446604A (en) 2017-02-22

Family

ID=58175373

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610908691.4A Pending CN106446604A (en) 2016-10-19 2016-10-19 Protein structure ab initio prediction method based on firefly algorithm

Country Status (1)

Country Link
CN (1) CN106446604A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107085674A (en) * 2017-03-14 2017-08-22 Zhejiang University of Technology Multi-modal protein conformation space optimization method based on an improved firefly algorithm

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040024536A1 (en) * 2000-09-28 2004-02-05 Torbjorn Rognes Determination of optimal local sequence alignment similarity score
US20060141480A1 (en) * 1999-11-10 2006-06-29 Kalyanaraman Ramnarayan Use of computationally derived protein structures of genetic polymorphisms in pharmacogenomics and clinical applications
CN105160196A (en) * 2015-09-22 2015-12-16 Zhejiang University of Technology Dynamic mutation policy based group global optimization method
CN105205348A (en) * 2015-09-22 2015-12-30 Zhejiang University of Technology Method for colony conformation space optimization based on distance constraint selection strategy

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
XIUJUAN LEI ET AL: "Detecting functional modules in dynamic protein-protein interaction networks using Markov Clustering and Firefly Algorithm", 《2014 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM)》 *
XIUJUAN LEI ET AL: "Protein complex identification through Markov clustering with firefly algorithm on dynamic protein–protein interaction networks", 《INFORMATION SCIENCES》 *
王冲: "基于群智能优化的聚类算法研究", 《中国优秀硕士学位论文全文数据库(电子期刊)》 *

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170222
