CN106951729A - Method for phylogenetic analysis using the synteny of organelle genomes - Google Patents
- Publication number: CN106951729A
- Application number: CN201710163233.7A
- Authority: CN (China)
- Prior art keywords: genome, sequence, synteny, organelle, organelle genome
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
Landscapes
- Bioinformatics & Cheminformatics (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Genetics & Genomics (AREA)
- Biotechnology (AREA)
- Biophysics (AREA)
- Chemical & Material Sciences (AREA)
- Molecular Biology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Bioinformatics & Computational Biology (AREA)
- Analytical Chemistry (AREA)
- Evolutionary Biology (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Theoretical Computer Science (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
The present invention relates to a method for phylogenetic analysis using the synteny of organelle genomes, and belongs to the field of phylogenetic analysis. To overcome the cumbersome processing and long processing time of phylogenetic analysis in the prior art, the invention provides a method for phylogenetic analysis using the synteny of organelle genomes. Its main workflow comprises: (1) extracting the collinear (syntenic) blocks of the organelle genomes of different species using Mauve; (2) trimming and concatenating the collinear block sequences; (3) determining the optimal model for the collinear-block dataset; (4) constructing the phylogenetic tree. The collinear blocks obtained by this method contain complete protein-coding genes as well as non-coding regions. The method saves a substantial amount of time while retaining high reliability and accuracy, and is suitable for wide application in the phylogenetic analysis of species.
Description
Technical field
The present invention relates to a method for phylogenetic analysis using the synteny of organelle genomes, and belongs to the field of phylogenetic analysis.
Background art
Organelles are mainly responsible for photosynthesis and energy metabolism in the organism. Because organelle genomes are multi-copy, maternally inherited, have a high mutation rate, carry many genes, and are easy to sequence, they have gradually come into use in phylogenetic analysis. In studies of the origin and evolution of species, organelle genome information not only helps us understand phylogenetic relationships but can also clarify questions of species origin. When phylogenetic relationships are reconstructed from organelle genes, an accurate multiple sequence alignment is the cornerstone of correct inference. With the rapid development of bioinformatics, many accurate and fast alignment tools consistent with biological theory have been developed, and these tools have allowed biological research to proceed at a deeper and higher level. However, with the development of high-throughput sequencing, sequencing data have grown explosively, which places higher demands on multiple sequence alignment. Aligning large datasets quickly while obtaining better and more accurate results is an important goal of multiple sequence alignment software development.
In organelle-based phylogenetics, the current common practice is to select the genes shared by the species under study and build a tree from them. Each group of orthologous genes must be aligned separately and the alignments then concatenated. As the number of species in a phylogenetic study increases, the number of genes that must be aligned manually grows geometrically, which greatly slows the analysis and makes the whole process time-consuming and laborious.
In view of the above drawbacks, we explored a fast alignment workflow based on organelle genomes that makes organelle-genome phylogenetic analysis quick and simple. Phylogenetic analysis of single-gene data has become routine, but the information contained in a single gene sequence is limited and is insufficient to resolve the phylogenetic relationships among all the taxa a researcher is interested in. For example, many studies have shown that different genes evolve at different rates: the evolutionary rate of the chloroplast gene rbcL is about 1.4 times that of the nuclear gene 18S rDNA. In this situation, combining datasets from different genes increases the amount of phylogenetic signal and thus strengthens the resolution of the taxa. In other words, concatenating many genes can offset differences in the evolutionary rates of individual genes, so a phylogenetic tree built from many genes, or even from the whole genome, is more accurate and reliable than one built from a single gene or a few genes. Therefore, in studies of evolutionary relationships, as species genome information (nuclear and organelle genomes) becomes more complete, our understanding of the phylogenetic relationships of species deepens.
At present, with the rapid development of sequencing technology, organelle genomes covering every major group of species are gradually being completed, and phylogenies built from organelle genomes have resolved many problems in systematics. However, as the data volume (both species and genes) grows, so does the workload. For example, in multi-gene concatenated tree building, each group of orthologous genes from multiple species must be aligned manually and individually before the genes are concatenated (a chloroplast genome often has more than 100 genes), which greatly slows the analysis and makes the whole process time-consuming and laborious.
Summary of the invention
To overcome the cumbersome processing and long processing time of phylogenetic analysis in the prior art, the present invention provides a method for phylogenetic analysis using the synteny of organelle genomes. It mainly comprises the following technical workflow: (1) extracting the collinear blocks of the organelle genomes of different species using Mauve; (2) trimming and concatenating the collinear block sequences; (3) determining the optimal model for the collinear-block dataset; (4) constructing the phylogenetic tree.
The method of the invention for phylogenetic analysis using the synteny of organelle genomes specifically comprises the following steps:
1) Extracting the collinear blocks of the organelle genomes of different species using Mauve: download from GenBank the organelle genomes of the species whose phylogenetic relationships are to be analyzed, and build the downloaded organelle genomes into a local database; import the organelle genomes of all species in the local database into the Mauve aligner, detect structural variation among the organelle genomes of the different species with progressiveMauve, and delineate collinear blocks from the alignment result; count the delineated collinear blocks and extract all of them from the alignment result with a script;
2) Trimming and concatenating the collinear block sequences: trim the collinear block sequences with Gblocks using a conservative trimming strategy, and discard blocks from which no phylogenetically informative sites are extracted; merge the trimmed collinear blocks into a single alignment, and report the position of each block in the final concatenated alignment;
3) Determining the optimal model for the collinear-block dataset: for the alignment built from the collinear blocks, partition the sequence and perform model selection according to the position of each collinear block in the final concatenated alignment, and determine the optimal nucleotide substitution model and partitioning scheme;
4) Constructing the phylogenetic tree: from the concatenated sequence of the quickly aligned collinear blocks, reconstruct the phylogenetic relationships of the species using MrBayes.
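As a rough illustration of the start of step 1), the sketch below collects genome files from a local database directory and builds a progressiveMauve command line without running it. The function names, the directory layout, and the output file name are assumptions made for this example; the patent does not describe how the aligner is invoked.

```python
from pathlib import Path

# File extensions named in the text (fasta, gb, gbk, fas).
GENOME_EXTENSIONS = {".fasta", ".fas", ".gb", ".gbk"}


def collect_genomes(db_dir):
    """Return the genome files found in a local database directory, sorted by name."""
    return sorted(
        p for p in Path(db_dir).iterdir()
        if p.is_file() and p.suffix.lower() in GENOME_EXTENSIONS
    )


def mauve_command(genomes, output="alignment.xmfa"):
    """Build (but do not run) a progressiveMauve command line.

    The output file name is a hypothetical default chosen for this sketch.
    """
    if len(genomes) < 2:
        raise ValueError("need at least two genomes to align")
    return ["progressiveMauve", f"--output={output}", *map(str, genomes)]
```

The returned list could be passed to `subprocess.run` on a machine where progressiveMauve is installed; building the command separately keeps the sketch testable without the aligner.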
The invention provides a fast alignment workflow based on organelle genomes. The aligned regions contain complete protein-coding genes as well as non-coding regions, and the workflow saves a substantial amount of time while retaining high reliability and accuracy. Specifically, the phylogenetic analysis method of the invention effectively addresses:
1. the alignment speed for whole organelle genomes;
2. comprehensive coverage of organelle genome information (both coding and non-coding regions);
3. the fast alignment of conserved regions across different taxa;
4. the fast identification of the conserved genes present in different taxa.
In short, this fast alignment workflow based on organelle genomes makes it possible to align the collinear blocks of different taxa quickly, to cover organelle genome information more completely, and to infer and reconstruct the phylogenetic relationships of species more accurately.
Brief description of the drawings
Fig. 1 is the experimental flow chart of the synteny method of the invention.
Fig. 2 is a visualization of the synteny alignment result.
Fig. 3 shows phylogenetic trees of algal mitochondria: the left panel is the tree built from sequences selected by the fast alignment workflow, and the right panel is the tree built from shared-gene sequences.
Fig. 4 shows phylogenetic trees of plant chloroplasts: the left panel is the tree built from sequences selected by the fast alignment workflow, and the right panel is the tree built from shared-gene sequences.
Fig. 5 shows phylogenetic trees of rodent mitochondria: the left panel is the tree built from sequences selected by the fast alignment workflow, and the right panel is the tree built from shared-gene sequences.
Detailed description of the embodiments
The present invention is further described below by way of a specific embodiment. Those skilled in the art will understand that the embodiment does not limit the scope of protection of the invention in any way.
Embodiment: method of the invention for phylogenetic analysis using the synteny of organelle genomes
The fast construction of collinear blocks in the organelle-genome phylogenetic analysis of the invention mainly comprises the following technical workflow: (1) extracting the collinear blocks of the organelle genomes of different species using Mauve; (2) trimming and concatenating the collinear block sequences; (3) determining the optimal model for the collinear-block dataset; (4) constructing the phylogenetic tree. Fig. 1 is the experimental flow chart for the homologous blocks of the invention. The method of the invention for phylogenetic analysis using the synteny of organelle genomes specifically comprises the following steps:
1. Extracting the collinear blocks of the organelle genomes of different species using Mauve
a) Download from GenBank the organelle genomes of the species whose phylogenetic relationships are to be analyzed. The input supports the mainstream nucleotide sequence formats, such as fasta, gb, gbk, and fas;
b) Build the downloaded organelle genomes into a local database;
c) Import the organelle genomes of the species into the Mauve aligner, use progressiveMauve to detect structural variation among the different organelle genomes, and delineate collinear blocks from the alignment result. This step solves the problem that sequences with structural variation cannot be aligned directly, and it also avoids the repetitive, redundant process of aligning genes one by one;
d) Count the delineated collinear blocks, and extract all of them from the alignment result with a script.
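The extraction script mentioned in step d) is not given in the text. A minimal sketch of one way to do it is shown below, assuming the alignment was saved in Mauve's XMFA format, where each aligned segment starts with a header such as `> 1:1-10 + a.fasta` and blocks are separated by lines beginning with `=`. The function names are hypothetical, and keeping only blocks present in every genome is an assumption about what "collinear blocks" should be retained.

```python
def parse_xmfa(text):
    """Split XMFA text into blocks; each block maps genome index -> aligned sequence."""
    blocks, current, seq_id = [], {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        if line.startswith("="):
            # "=" closes the current block
            if current:
                blocks.append(current)
            current, seq_id = {}, None
        elif line.startswith(">"):
            # Header looks like "> 1:1-10 + a.fasta"; keep the genome index.
            seq_id = int(line[1:].split()[0].split(":")[0])
            current[seq_id] = ""
        elif seq_id is not None:
            current[seq_id] += line
    if current:
        blocks.append(current)
    return blocks


def shared_blocks(blocks, n_genomes):
    """Keep only blocks in which every genome is present (assumed selection rule)."""
    return [b for b in blocks if len(b) == n_genomes]
```

Counting the blocks, as step d) requires, is then `len(shared_blocks(blocks, n_genomes))`.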
2. Trimming and concatenating the collinear block sequences
a) Next, trim the collinear block sequences, that is, extract the phylogenetically informative sites. Trimming is performed with Gblocks using the most conservative trimming strategy. After trimming, blocks from which no phylogenetically informative sites were extracted are discarded; the trimmed blocks are merged, and the position of each block in the final concatenated alignment is reported.
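The merge-and-report part of step 2 can be sketched as follows. This is an illustration only: it assumes each trimmed block is a dictionary from taxon to aligned sequence (the trimming itself being done beforehand by Gblocks), and the block naming scheme `block1`, `block2`, ... is invented for the example.

```python
def concatenate_blocks(blocks, taxa):
    """Concatenate aligned blocks and report each block's 1-based column range."""
    concatenated = {t: [] for t in taxa}
    ranges, start = [], 1
    for i, block in enumerate(blocks, 1):
        lengths = {len(block[t]) for t in taxa}
        if len(lengths) != 1:
            raise ValueError(f"block {i} is not a rectangular alignment")
        length = lengths.pop()
        if length == 0:
            continue  # nothing survived trimming: discard the block
        for t in taxa:
            concatenated[t].append(block[t])
        ranges.append((f"block{i}", start, start + length - 1))
        start += length
    alignment = {t: "".join(parts) for t, parts in concatenated.items()}
    return alignment, ranges
```

The returned `ranges` list plays the role of the report on where each block lies in the final alignment, and can be reused later to define partitions.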
3. Determining the optimal model for the collinear-block dataset
In the alignment built from the collinear blocks, the smallest sequence unit is a single collinear block. For each collinear block, according to its position in the final alignment, we can partition the sequence and perform model selection in order to choose the optimal nucleotide substitution model and partitioning scheme.
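The text does not say which criterion or tool is used for model selection. As one common possibility, the sketch below ranks candidate substitution models for a block by the Bayesian information criterion, BIC = -2 ln L + k ln n, where L is the maximized likelihood, k the number of free model parameters, and n the number of alignment columns. The log-likelihood values would come from an external program; the function names and the use of BIC are assumptions for illustration.

```python
import math


def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion; lower values indicate a better trade-off."""
    return -2.0 * log_likelihood + n_params * math.log(n_sites)


def best_model(candidates, n_sites):
    """Pick the model with the lowest BIC.

    candidates maps model name -> (log-likelihood, number of free parameters).
    """
    return min(candidates, key=lambda m: bic(*candidates[m], n_sites))
```

Running `best_model` once per collinear block, and again on merged blocks, is one way to compare partitioning schemes as well as models.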
4. Constructing the phylogenetic tree
From the concatenated sequence of the quickly aligned collinear blocks obtained above, the phylogenetic relationships of the species are reconstructed using MrBayes.
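To show how the block positions from the concatenation step could feed into MrBayes, the sketch below writes a MrBayes command block with one character set per collinear block. The GTR+I+G settings (`nst=6 rates=invgamma`), the chain length, and the partition name `by_block` are placeholders chosen for this example, not values taken from the patent; in practice they would follow from the model selection in step 3.

```python
def mrbayes_block(ranges, ngen=1000000, samplefreq=1000):
    """Return a MrBayes command block with one charset per collinear block.

    ranges is a list of (name, start, end) tuples with 1-based column ranges.
    """
    names = [name for name, _, _ in ranges]
    lines = ["begin mrbayes;"]
    lines += [f"  charset {name} = {start}-{end};" for name, start, end in ranges]
    lines.append(f"  partition by_block = {len(names)}: {', '.join(names)};")
    lines.append("  set partition = by_block;")
    # Placeholder model: the same GTR+I+G settings applied to every partition.
    lines.append("  lset applyto=(all) nst=6 rates=invgamma;")
    lines.append(f"  mcmc ngen={ngen} samplefreq={samplefreq};")
    lines.append("  sumt;")
    lines.append("end;")
    return "\n".join(lines)
```

The returned text would be appended to a NEXUS file that already contains the concatenated alignment in its data block.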
Verification of the phylogenetic accuracy of the fast alignment workflow of the invention
At present there is no software for building aligned sequences of whole organelle genomes. This workflow shortens the time needed for a complete organelle-genome phylogenetic analysis to between somewhat more than ten minutes and a few hours. To verify the accuracy and general applicability of the workflow, we compared phylogenetic trees built from protein-coding sequences in published articles (the current common practice) with trees built from the sequences obtained by the fast alignment of the invention. The verification for algal organelle-genome phylogeny is shown in Fig. 3, the verification for plant organelle-genome phylogeny is shown in Fig. 4, and the verification for animal organelle-genome phylogeny is shown in Fig. 5.
The verification experiments above show that the sequences extracted by the fast alignment workflow can be used to build phylogenetic trees that accurately infer the phylogenetic relationships of species; the workflow is widely applicable and greatly reduces time cost.
Claims (2)
1. A method for phylogenetic analysis using the synteny of organelle genomes, comprising the following steps:
1) Extracting the collinear blocks of the organelle genomes of different species using Mauve: download from GenBank the organelle genomes of the species whose phylogenetic relationships are to be analyzed, and build the downloaded organelle genomes into a local database; import the organelle genomes of all species in the local database into the Mauve aligner, detect structural variation among the organelle genomes of the different species with progressiveMauve, and delineate collinear blocks from the alignment result; count the delineated collinear blocks and extract all of them from the alignment result with a script;
2) Trimming and concatenating the collinear block sequences: trim the collinear block sequences with Gblocks using a conservative trimming strategy, and discard blocks from which no phylogenetically informative sites are extracted; merge the trimmed collinear blocks into a single alignment, and report the position of each block in the final concatenated alignment;
3) Determining the optimal model for the collinear-block dataset: for the alignment built from the collinear blocks, partition the sequence and perform model selection according to the position of each collinear block in the final concatenated alignment, and determine the optimal nucleotide substitution model and partitioning scheme;
4) Constructing the phylogenetic tree: from the concatenated sequence of the quickly aligned collinear blocks, reconstruct the phylogenetic relationships of the species using MrBayes.
2. The method for phylogenetic analysis using the synteny of organelle genomes according to claim 1, characterized in that the nucleotide sequences of the organelle genomes in the local database of step 1) are stored in one of the fasta, gb, gbk, or fas formats.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710163233.7A CN106951729A (en) | 2017-03-19 | 2017-03-19 | Method for phylogenetic analysis using the synteny of organelle genomes |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106951729A (en) | 2017-07-14 |
Family
ID=59472617
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710163233.7A Pending CN106951729A (en) | Method for phylogenetic analysis using the synteny of organelle genomes | 2017-03-19 | 2017-03-19 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106951729A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101243191A (en) * | 2004-11-29 | 2008-08-13 | University Hospital Regensburg | Means and methods for detecting methylated DNA |
CN101957892A (en) * | 2010-09-17 | 2011-01-26 | BGI Shenzhen Co., Ltd. | Whole-genome replication event detection method and system |
CN103667328A (en) * | 2013-12-03 | 2014-03-26 | Ocean University of China | Construction method of porphyra yezoensis plastid genetic transformation vector |
Non-Patent Citations (3)
Title |
---|
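Zhang Yuting et al.: "Evolutionary analysis of the HSP70 gene family in Gossypium raimondii and expression analysis of its homologous genes in upland cotton", Hereditas (Beijing) * |
Yang Junqing et al.: "Cloning and functional analysis of the aquaporin gene PyAQP1 from Porphyra yezoensis", Periodical of Ocean University of China * |
Bi Guiqi: "Organelle genome sequencing and phylogenetic analysis of the marine Bangia fuscopurpurea OUCPT-01 and Bangia atropurpurea", China Master's Theses Full-text Database, Basic Sciences * |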
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Bog et al. | Duckweed (Lemnaceae): its molecular taxonomy | |
Chacón-Sánchez et al. | Testing domestication scenarios of lima bean (Phaseolus lunatus L.) in Mesoamerica: insights from genome-wide genetic markers | |
CN103233075B (en) | Method for developing Dendranthema SSR primers based on transcriptome sequencing | |
CN106868116A (en) | High-throughput identification and classification method for mulberry pathogens and its application | |
CN110910959B (en) | Population genetic evolution map and construction method thereof | |
Seifertová et al. | Multiple Pleistocene refugia and post‐glacial colonization in the European chub (Squalius cephalus) revealed by combined use of nuclear and mitochondrial markers | |
Moreno et al. | Genetic characterization of sunflower breeding resources from Argentina: assessing diversity in key open-pollinated and composite populations | |
CN108157293A (en) | Pedigree-based simplified breeding method for selecting high-producing dairy cows homozygous for the A2A2 genotype | |
Raduski et al. | Patterns of genetic variation in a prairie wildflower, Silphium integrifolium, suggest a non‐prairie origin and locally adaptive variation | |
Guo et al. | Revisiting the evolutionary history of domestic and wild ducks based on genomic analyses | |
Wang et al. | Multiplexed massively parallel sequencing of plastomes provides insights into the genetic diversity, population structure, and phylogeography of wild and cultivated Coptis chinensis | |
US20030200033A1 (en) | High-throughput alignment methods for extension and discovery | |
CN106951729A (en) | Method for phylogenetic analysis using the synteny of organelle genomes | |
Ané | Reconstructing concordance trees and testing the coalescent model from genome-wide data sets | |
CN105279396B (en) | Method for mining plant drought-resistance gene modules | |
Molinari et al. | Transcriptome analysis using RNA-Seq from experiments with and without biological replicates: a review | |
CN115948521A (en) | Method for detecting aneuploid missing chromosome information | |
Conry | Determining the impact of recombination on phylogenetic inference | |
CN102747147B (en) | High-throughput identification method of non-coding gene | |
Del Giudice et al. | Study of genetic variation and its association with tensile strength among bamboo species through whole genome resequencing | |
Mu et al. | Genomic Sequence Analysis of 4 Culm Shape Variants of Moso Bamboo Based on Re-sequencing | |
Mu et al. | Investigation on tree molecular genome of Arabidopsis thaliana for internet of things | |
Dittberner et al. | Approximate Bayesian computation untangles signatures of contemporary and historical hybridization between two endangered species | |
CN111445954B (en) | Method for identifying multiple gene families and carrying out evolutionary analysis | |
CN104598770A (en) | Wheat aphid quantity forecasting method and system based on human being evolution gene expression programming |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170714 ||