Summary of the invention
The purpose of the present application is to provide a new method for efficiently obtaining chloroplast DNA sequencing data, and applications thereof.
The present application adopts the following technical solutions:
One aspect of the present application discloses a method for efficiently obtaining chloroplast DNA sequencing data, comprising: constructing a non-redundant chloroplast sequence collection from a chloroplast nucleic acid sequence set; designing probes according to the constructed non-redundant chloroplast sequence collection; performing hybrid capture on the whole genome of a sample to be tested using the designed probes, to obtain enriched chloroplast DNA fragments; and sequencing the enriched chloroplast DNA fragments, to obtain the chloroplast DNA sequencing data of the present application. Here, the chloroplast nucleic acid sequence set is the set formed by collecting all currently disclosed chloroplast nucleic acid sequences; and constructing the non-redundant chloroplast sequence collection mainly refers to removing the duplicated, redundant sequences from all of the chloroplast nucleic acid sequences, the sequences finally obtained forming the collection.
It should be noted that the method of the present application inventively constructs a non-redundant chloroplast sequence collection in advance and then designs probes against that collection. On the one hand, the method can capture chloroplast DNA fragments directly from the whole genome of the sample to be tested; compared with methods that first enrich chloroplasts and then extract their DNA, the present method is simpler and more convenient, and its requirements on sample quality are relatively low. On the other hand, the designed probes are widely applicable and can capture plant chloroplast DNA from different evolutionary branches; the captured regions show no bias, which ensures the randomness and broad coverage of the sequencing output; and, compared with existing data separation methods, the present method covers more and is more efficient, capturing more than 90% of the gene sequences and data of chloroplast DNA. Moreover, the method is especially suitable for obtaining chloroplast DNA sequencing data at large scale across broad evolutionary branches, laying a foundation for in-depth research on large-scale plant evolution and genetics.
Preferably, constructing the non-redundant chloroplast sequence collection specifically comprises the following steps:
(1) using public databases, obtaining all disclosed chloroplast nucleic acid sequences, to obtain the chloroplast nucleic acid sequence set;
(2) according to the species information of each chloroplast nucleic acid sequence and the evolutionary relationships among species, constructing a species phylogenetic tree based on the obtained chloroplast nucleic acid sequence set, and screening all chloroplast nucleic acid sequences according to the constructed tree, so that each evolutionary branch of the tree retains the chloroplast nucleic acid sequences of 1-2 species with good assembly results, thereby obtaining an initial chloroplast nucleic acid sequence collection;
(3) from the initial chloroplast nucleic acid sequence collection, selecting one species according to its evolutionary degree in the species phylogenetic tree and labeling it as the reference species, the remaining species being labeled as non-reference species; aligning the sequence of the reference species pairwise with the sequences of the non-reference species, recording, according to the alignment results, the position of each high-similarity homologous region in the non-reference species genome, and annotating the nucleic acid sequence of each such region as N; meanwhile, aligning the sequence of the reference species against itself, retaining the longest sequence among the high-similarity homologous regions, and annotating the nucleic acid sequences of the remaining high-similarity homologous regions as N;
Here, pairwise alignment of the reference species sequence with the non-reference species sequences removes redundant sequences that are highly similar to the reference species sequence, while self-alignment of the reference species sequence removes redundancy within the reference species' own sequence, in view of the case where the reference species contains sequences from multiple sources or multiple repeated sequences. In the present application, redundant sequences are removed by annotating the nucleic acid sequence of each high-similarity homologous region as N; that is, the sequence is not deleted directly, but its nucleotides are annotated as N, and regions annotated as N are not analyzed in subsequent analysis. In this way, the positional information of the non-redundant chloroplast sequences does not change before and after redundancy removal;
(4) based on the collection obtained in step (3) after removal of redundant sequences, replacing the reference species one by one according to the method of step (3) and performing iterative alignment until no new high-similarity homologous region is identified; likewise, according to the method of step (3), annotating the nucleic acid sequences of the high-similarity homologous regions as N or, in the self-alignment results of the reference species, annotating the nucleic acid sequences of the shorter high-similarity homologous regions as N; thereby obtaining the non-redundant chloroplast sequence collection.
Here, replacing the reference species one by one means that each species in the initial chloroplast nucleic acid sequence collection is taken in turn as the reference species and aligned iteratively until all species have been aligned, so that the redundant sequences of the high-similarity homologous regions of all species are removed. Iterative alignment means that each round of de-redundancy alignment is performed on the result of the previous round of de-redundancy. Annotating as N the nucleic acid sequences of the shorter high-similarity homologous regions in the self-alignment results of the reference species means retaining the longest sequence among the high-similarity homologous regions and annotating the nucleic acid sequences of all remaining high-similarity homologous regions as N.
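The iteration described above can be sketched as follows. This is only an illustrative outline under stated assumptions: the names `iterative_deredundancy`, `mask` and `toy_aligner` are hypothetical, and `toy_aligner` is a deliberately simple exact k-mer matcher standing in for a real aligner. The sketch covers only the masking of non-reference sequences; the self-alignment of the reference species is omitted.

```python
from typing import Callable, Dict, List, Tuple

# (species, start, end): region to mask, 0-based half-open coordinates.
Region = Tuple[str, int, int]
# Hypothetical aligner interface: given the reference name and the current
# collection, return the regions of non-reference sequences to be masked.
Aligner = Callable[[str, Dict[str, str]], List[Region]]


def mask(seq: str, start: int, end: int) -> str:
    """Replace seq[start:end] with N so that all coordinates are preserved."""
    return seq[:start] + "N" * (end - start) + seq[end:]


def iterative_deredundancy(collection: Dict[str, str],
                           find_regions: Aligner) -> Dict[str, str]:
    """Rotate the reference species until a full pass masks nothing new."""
    current = dict(collection)
    changed = True
    while changed:
        changed = False
        for reference in list(current):
            # Each round works on the result of the previous round.
            for species, start, end in find_regions(reference, current):
                masked = mask(current[species], start, end)
                if masked != current[species]:
                    current[species] = masked
                    changed = True
    return current


def toy_aligner(reference: str, collection: Dict[str, str],
                k: int = 4) -> List[Region]:
    """Toy stand-in for a real aligner: exact k-mers shared with the reference."""
    ref_seq = collection[reference]
    kmers = {ref_seq[i:i + k] for i in range(len(ref_seq) - k + 1)}
    kmers = {kmer for kmer in kmers if "N" not in kmer}  # ignore masked bases
    regions: List[Region] = []
    for species, seq in collection.items():
        if species == reference:
            continue
        for i in range(len(seq) - k + 1):
            if seq[i:i + k] in kmers:
                regions.append((species, i, i + k))
    return regions


toy = {"A": "ACGTACGTTT", "B": "GGACGTCC", "C": "TTTTGGGG"}
print(iterative_deredundancy(toy, toy_aligner))
```

Because masking replaces bases with N instead of deleting them, every sequence keeps its original length, which is what allows the coordinates to be used unchanged later on.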
Preferably, a high-similarity homologous region is one in which the similarity is greater than 90% and the length of the aligned sequence is greater than 90 bp.
Preferably, designing probes according to the constructed non-redundant chloroplast sequence collection specifically comprises the following steps:
(1) according to the non-redundant chloroplast sequence collection obtained, extending each segment of nucleic acid sequence by 30-45 bp upstream and downstream, to obtain the positional coordinate information of the probe design region; if the upstream or downstream region of a segment of nucleic acid sequence is shorter than 30 bp, the positional information of that segment is used directly as the positional coordinate information of the probe design region;
(2) according to the positional coordinate information obtained in step (1), designing, within the probe design regions marked by that coordinate information, hybrid capture probes specific to each nucleic acid sequence in the non-redundant chloroplast sequence collection.
It should be noted that, in the present application, because the nucleotides of redundant sequences are annotated as N and not deleted, all positional coordinate information is in fact the coordinate information in the original genomic sequence. In the present application, extending each segment of nucleic acid sequence by 30-45 bp upstream and downstream means that, in the non-redundant chloroplast sequence collection, the coordinates of each segment are extended by 30-45 bp on each side, and the extended coordinates are used as the final probe design region. When probes are designed, the upstream and downstream 30-45 bp within the probe design region that had been annotated as N are restored to the original sequence. The 30-45 bp extension reflects two considerations. First, it ensures that there is enough additional sequence at the edges of the probe design region for probe design, so that the optimal probe sequences can be selected. Second, the designed probes are about 90 bp long, and an extension of 30-45 bp ensures that at least 50% of the bases of each designed probe cover the region for which probes are needed. It will be appreciated that the 30-45 bp extension is a reference value in practice and can be adjusted as appropriate at design time; it is not specifically limited here.
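The 50% figure above follows from simple arithmetic, which the short sketch below checks. It assumes the worst case is a probe placed flush with the outer edge of the extended design region, and that the target segment is at least as long as the part of the probe that reaches into it; the function name `min_target_overlap` is hypothetical.

```python
def min_target_overlap(probe_len: int, extension: int) -> float:
    """Worst-case fraction of a probe's bases that lie inside the target segment.

    Worst case: the probe starts at the outer edge of the extended design
    region, so `extension` of its bases fall in the flank and the rest fall
    in the target. Assumes the target is at least probe_len - extension long.
    """
    if not 0 <= extension <= probe_len:
        raise ValueError("extension must be between 0 and probe_len")
    return (probe_len - extension) / probe_len


# Probe length of about 90 bp, extension at both ends of the stated range.
for ext in (30, 45):
    print(f"extension {ext} bp -> overlap >= {min_target_overlap(90, ext):.1%}")
```

With a 90 bp probe, the upper end of the range (45 bp) gives exactly half of the probe on target, and the lower end (30 bp) gives two thirds.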
Another aspect of the present application discloses the use of the method of the present application for efficiently obtaining chloroplast DNA sequencing data in chloroplast DNA enrichment, chloroplast library construction, or large-scale plant evolution or genetic research based on chloroplast information.
It should be noted that the key to the method of the present application for efficiently obtaining chloroplast DNA sequencing data lies in constructing the non-redundant chloroplast sequence collection, designing probes according to that collection, and enriching chloroplast DNA fragments with them, so that chloroplast DNA sequencing data are obtained efficiently. The chloroplast DNA fragment enrichment part of the method can therefore be taken out and used on its own for chloroplast DNA enrichment or chloroplast library construction. In addition, because the method is especially suitable for obtaining chloroplast DNA sequencing data at large scale across broad evolutionary branches, it can also be used for large-scale plant evolution or genetic research based on chloroplast information.
A further aspect of the present application discloses a method for preparing chloroplast DNA fragment hybrid capture probes, comprising the following steps: (1) constructing a non-redundant chloroplast sequence collection from a chloroplast nucleic acid sequence set; (2) designing probes according to the constructed non-redundant chloroplast sequence collection, to obtain the chloroplast DNA fragment hybrid capture probes.
Preferably, in step (1) of the method of the present application for preparing chloroplast DNA fragment hybrid capture probes, constructing the non-redundant chloroplast sequence collection specifically comprises:
(1) obtaining the initial chloroplast nucleic acid sequence collection: according to the species information of each chloroplast nucleic acid sequence and the evolutionary relationships among species, constructing a species phylogenetic tree based on the chloroplast nucleic acid sequence set, and screening all chloroplast nucleic acid sequences according to the constructed tree, so that each evolutionary branch of the tree retains the chloroplast nucleic acid sequences of 1-2 species with good assembly results, thereby obtaining the initial chloroplast nucleic acid sequence collection;
(2) removing redundancy based on a reference species: from the initial chloroplast nucleic acid sequence collection, selecting one species according to its evolutionary degree in the species phylogenetic tree and labeling it as the reference species, the remaining species being labeled as non-reference species; aligning the sequence of the reference species pairwise with the sequences of the non-reference species, recording, according to the alignment results, the position of each high-similarity homologous region in the non-reference species genome, and annotating the nucleic acid sequence of each such region as N; meanwhile, aligning the sequence of the reference species against itself, retaining the longest sequence among the high-similarity homologous regions, and annotating the nucleic acid sequences of the remaining high-similarity homologous regions as N;
(3) obtaining the non-redundant chloroplast sequence collection by iterative alignment: based on the collection obtained in step (2) after removal of redundant sequences, replacing the reference species one by one according to the method of step (2) and performing iterative alignment until no new high-similarity homologous region is identified; likewise, according to the method of step (2), annotating the nucleic acid sequences of the high-similarity homologous regions as N or, in the self-alignment results of the reference species, annotating the nucleic acid sequences of the shorter high-similarity homologous regions as N; thereby obtaining the non-redundant chloroplast sequence collection.
Preferably, in step (2) of the method of the present application for preparing chloroplast DNA fragment hybrid capture probes, designing probes according to the constructed non-redundant chloroplast sequence collection specifically comprises:
(1) determining the positional coordinate information of the probe design regions: according to the non-redundant chloroplast sequence collection obtained, extending each segment of nucleic acid sequence by 30-45 bp upstream and downstream, to obtain the positional coordinate information of the probe design region; if the upstream or downstream region of a segment of nucleic acid sequence is shorter than 30 bp, the positional information of that segment is used directly as the positional coordinate information of the probe design region;
(2) probe design: according to the positional coordinate information obtained in step (1), designing, within the probe design regions marked by that coordinate information, hybrid capture probes specific to each nucleic acid sequence in the non-redundant chloroplast sequence collection, i.e., the chloroplast DNA fragment hybrid capture probes.
Preferably, in step (1), the chloroplast nucleic acid sequence set is the set of all disclosed chloroplast nucleic acid sequences obtained from public databases.
It will be appreciated that, as set forth above, the method of the present application for preparing chloroplast DNA fragment hybrid capture probes is in fact made up of part of the steps of the method of the present application for efficiently obtaining chloroplast DNA sequencing data.
A further aspect of the present application discloses chloroplast DNA fragment hybrid capture probes prepared by the method of the present application for preparing chloroplast DNA fragment hybrid capture probes.
It will be appreciated that the chloroplast DNA fragment hybrid capture probes prepared by the method of the present application can not only capture plant chloroplast DNA fragments from different evolutionary branches, but can also capture fragments of the chloroplast genome essentially at the whole-genome level; in angiosperms, gymnosperms and pteridophytes, more than 90% of the genome region can be captured.
A further aspect of the present application discloses a method for chloroplast DNA enrichment, comprising performing hybrid capture on the whole genome of a sample to be tested using the chloroplast DNA fragment hybrid capture probes of the present application, to achieve chloroplast DNA enrichment.
A further aspect of the present application discloses a method for constructing a chloroplast DNA library, comprising performing hybrid capture on the whole genome of a sample to be tested using the chloroplast DNA fragment hybrid capture probes of the present application to achieve chloroplast DNA enrichment, and then constructing a library from the enriched chloroplast DNA, to obtain the chloroplast DNA library.
The beneficial effects of the present application are as follows:
The method of the present application for efficiently obtaining chloroplast DNA sequencing data can capture chloroplast DNA fragments directly from the whole genome of the sample to be tested and is simple and convenient to operate; moreover, its requirements on sample quality are relatively low, as it is sufficient that the whole genome of the sample can be extracted. In addition, the method is widely applicable and can capture plant chloroplast DNA from different evolutionary branches; the captured regions show no bias, which ensures the randomness and full coverage of the sequencing output; and, compared with existing data separation methods, the present method covers more and is more efficient. The method is especially suitable for obtaining chloroplast DNA sequencing data at large scale across broad evolutionary branches, laying a foundation for in-depth research on large-scale plant evolution and genetics.
Embodiment
The method of this example for obtaining chloroplast DNA sequencing data comprises constructing a non-redundant chloroplast sequence collection from the chloroplast nucleic acid sequences disclosed in public databases. The public database mainly consulted was NCBI, whose nucleic acid data collection is relatively comprehensive; this example collected all plant chloroplast genome data on NCBI to construct the non-redundant chloroplast sequence collection. Probes were then designed according to the constructed non-redundant chloroplast sequence collection; hybrid capture was performed on the whole genome of the sample to be tested using the designed probes, to obtain enriched chloroplast DNA fragments; and the enriched chloroplast DNA fragments were sequenced to obtain chloroplast DNA sequencing data. The details are as follows:
1. Constructing the non-redundant chloroplast sequence collection
(1) Using public databases, all disclosed chloroplast nucleic acid sequences were obtained, to give the chloroplast nucleic acid sequence set. In this example, all disclosed chloroplast genome sequences were downloaded from the NCBI public database, and for each species the sequences with the best assembly quality were selected; the chloroplast nucleic acid sequence set of this example contains 567 sequences from 544 species in total.
(2) According to the species information of each chloroplast nucleic acid sequence and the evolutionary relationships among species, a species phylogenetic tree was constructed based on the chloroplast nucleic acid sequence set, and all chloroplast nucleic acid sequences were screened according to the constructed tree, so that each evolutionary branch of the tree retained the chloroplast nucleic acid sequences of 1-2 species with good assembly results, giving the initial chloroplast nucleic acid sequence collection. For the initial chloroplast nucleic acid sequence collection of this example, the chloroplast genome nucleic acid sequences of 99 plant species were finally selected; the specific nucleic acid sequence information is shown in Table 1.
Table 1. The 99 selected plant species and their chloroplast genome information
(3) The nucleic acid sequences in the initial chloroplast nucleic acid sequence collection were aligned pairwise to obtain the high-similarity homologous blocks between sequences, and redundancy was removed from these blocks. Specifically, Arabidopsis was selected from the genome sequence collection as the reference species according to assembly results and evolutionary degree, and the remaining species were labeled as non-reference species. A sequence index was built for the non-reference species genome sequences, and the Arabidopsis chloroplast genome sequence was used as the query sequence. The Arabidopsis chloroplast genome sequence was aligned against the non-reference species genome sequences with blastn, and the alignment results were filtered at e-value < 1e-5. Among the filtered results, alignments with sequence similarity > 90% and aligned sequence length > 90 bp were defined as high-similarity homologous regions between sequences, and the position of each such region in the non-reference species genome was recorded according to the alignment results. The Arabidopsis chloroplast genome nucleic acid information was retained, and the nucleic acid sequences of the high-similarity homologous regions in the non-reference species genome sequences were annotated as N. Meanwhile, the Arabidopsis chloroplast genome sequence was aligned against itself with blastn, using the same filtering conditions and the same definition of high-similarity homologous regions; according to the sequence lengths of the identified high-similarity homologous regions, the longest homologous sequence was retained and the sequences of the remaining high-similarity homologous regions were annotated as N.
(4) For the sequences de-redundified in step (3), the reference species was replaced and iterative alignment was performed on the result of step (3), to identify all high-similarity homologous regions in the collection and remove their redundancy. The sequence homology alignment of step (3) yields the high-similarity homologous regions between the non-reference species genome sequences and the Arabidopsis chloroplast sequence, and within the Arabidopsis chloroplast sequence itself, and removes the redundancy of those homologous blocks. The purpose of step (4) is to further remove redundancy from the high-similarity homologous regions among the 99 chloroplast genome sequences. The sequence collection obtained after redundancy removal in step (3) was iteratively aligned against itself with blastn; after each iteration, the alignment results were checked for remaining high-similarity homologous regions between sequences (similarity > 90% and aligned sequence length > 90 bp), and, following the method of step (3), the shorter nucleic acid sequence in each high-similarity homologous region was annotated as N. Iterative alignment continued until no new high-similarity homologous region was identified. The resulting sequence collection, i.e., the non-redundant chloroplast sequence collection of this example, has no high-similarity homologous region between any two sequences while retaining the sequence polymorphism information among the chloroplast sequences of different species, and is suitable for the probe design of the next step.
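The filtering and masking used in steps (3) and (4) can be illustrated with a short sketch. It assumes the blastn results are available as tab-separated lines in the standard 12-column tabular layout (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore); the function names and the example hit lines are hypothetical and are not taken from this example's data.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def high_similarity_regions(blast_lines: Iterable[str],
                            min_identity: float = 90.0,
                            min_length: int = 90,
                            max_evalue: float = 1e-5
                            ) -> Dict[str, List[Tuple[int, int]]]:
    """Collect subject regions (1-based, inclusive) that pass all three filters."""
    regions: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        sseqid, pident, length = fields[1], float(fields[2]), int(fields[3])
        sstart, send, evalue = int(fields[8]), int(fields[9]), float(fields[10])
        if evalue < max_evalue and pident > min_identity and length > min_length:
            # Minus-strand hits are reported with sstart > send.
            lo, hi = sorted((sstart, send))
            regions[sseqid].append((lo, hi))
    return dict(regions)


def mask_regions(seq: str, regions: Iterable[Tuple[int, int]]) -> str:
    """Annotate the given 1-based inclusive regions as N, keeping the length."""
    bases = list(seq)
    for lo, hi in regions:
        for i in range(lo - 1, min(hi, len(bases))):
            bases[i] = "N"
    return "".join(bases)


# Hypothetical hits of a reference query against two non-reference sequences.
hits = [
    "ref\tsp1\t95.0\t100\t5\t0\t1\t100\t11\t110\t1e-40\t150",
    "ref\tsp1\t85.0\t120\t18\t0\t1\t120\t1\t120\t1e-20\t90",  # identity too low
    "ref\tsp2\t99.0\t95\t1\t0\t1\t95\t100\t6\t1e-30\t170",    # minus strand
    "ref\tsp2\t99.0\t50\t0\t0\t1\t50\t1\t50\t1e-10\t90",      # too short
]
regions = high_similarity_regions(hits)
print(regions)
print(mask_regions("A" * 120, regions["sp1"]))
```

Only the non-reference (subject) coordinates are collected here, matching the rule that the reference sequence is retained and the non-reference copy is annotated as N.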
2. Designing probes according to the non-redundant chloroplast sequence collection
(1) From the non-redundant chloroplast sequence collection, the coordinate information of the probe design regions was obtained according to the nucleic acid sequences. In the re-annotated sequence collection, the nucleic acid sequences of the high-similarity homologous blocks between sequences have been annotated as N, forming a non-redundant collection; the regions of each sequence in the collection that are not N are the probe design regions. Specifically, each segment of nucleic acid sequence was extended by 40 bp upstream and downstream to obtain the positional coordinate information of the probe design region; if the upstream or downstream region of a segment was shorter than 40 bp, the positional information of that segment was used directly as the positional coordinate information of the probe design region.
(2) According to the positional coordinate information obtained, hybrid capture probes specific to each nucleic acid sequence in the non-redundant chloroplast sequence collection were designed within the probe design regions marked by that coordinate information. The nucleic acid sequence regions in the non-redundant collection correspond directly to the original sequences; therefore, the positional coordinate information of all sequences is the coordinate information in the original genomic sequences.
In this example, the positional coordinate information used for probe design in the Arabidopsis sequence is shown in Table 2.
Table 2. Positional coordinate information for probe design in the Arabidopsis sequence
In this example, probes were designed with NimbleDesign, the sequence capture probe design software from Roche; in total, 180519 hybrid capture probes were designed and synthesized.
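The derivation of probe design coordinates in step (1) of this section can be sketched as below. This is a minimal illustration, not the procedure actually run in this example: the function names are hypothetical, coordinates are 0-based half-open, the "upstream or downstream region" is read as the distance from the segment to the sequence start or end, and a segment with a short flank is left unextended on both sides, which is one possible reading of the rule.

```python
import re
from typing import List, Tuple


def non_n_segments(masked_seq: str) -> List[Tuple[int, int]]:
    """Return 0-based half-open (start, end) runs of bases not annotated as N."""
    return [m.span() for m in re.finditer(r"[^Nn]+", masked_seq)]


def probe_design_regions(masked_seq: str,
                         extension: int = 40) -> List[Tuple[int, int]]:
    """Extend each retained segment by `extension` bp on both sides.

    If either flank is shorter than `extension`, the segment's own
    coordinates are used unchanged (assumed reading of the rule).
    """
    regions = []
    for start, end in non_n_segments(masked_seq):
        upstream = start                      # bases available before the segment
        downstream = len(masked_seq) - end    # bases available after the segment
        if upstream < extension or downstream < extension:
            regions.append((start, end))
        else:
            regions.append((start - extension, end + extension))
    return regions


# Toy masked sequence: 50 N, 100 retained, 50 N, 30 retained, 10 N.
toy = "N" * 50 + "A" * 100 + "N" * 50 + "C" * 30 + "N" * 10
print(probe_design_regions(toy))
```

Because masked bases are kept as N, the coordinates returned here index directly into the original genome sequence, from which the masked flanks can be restored for probe design.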
This example comprehensively collected plant chloroplast genome sequences from different evolutionary branches and screened the collected genome sequences using the species phylogenetic tree in combination with chloroplast genome assembly quality. This ensured that the genome sequences used for probe design all have high assembly quality and that each evolutionary branch is represented by the chloroplast sequences of corresponding species, so that the designed probes are widely applicable to the chloroplasts of different plants. The chloroplast genome sequences were aligned pairwise using a dynamic programming algorithm; homologous blocks between sequences were identified using sequence similarity and alignment length as criteria, and the nucleic acid sequences of the identified homologous blocks were annotated as N to remove redundancy. Setting Arabidopsis as the reference sequence made de-redundancy of the sequence collection easier to carry out. Iterative alignment minimized the large amount of redundant sequence that the highly conserved nature of chloroplast genomes introduces into the collected sequence collection. In the de-redundified collection, the nucleic acid regions, which map to the original genomic sequences by coordinate, retain the sequence polymorphism between different species while effectively reducing the redundancy between sequences. The probes designed in this example can completely cover the chloroplast genome sequences of different species and capture chloroplast genome fragments well; moreover, the captured chloroplast DNA fragments can be used directly for library construction and sequencing.
To test the chloroplast capture performance of the probes designed in this example on plants from different evolutionary branches, the synthesized probes were used to perform hybrid capture on the whole genomes of three plant species, tomato, ginkgo and lotus throne fern, to obtain their respective chloroplast DNA fragments; the captured chloroplast DNA fragments were then sequenced separately to obtain chloroplast DNA sequencing data.
It should be noted that tomato, ginkgo and lotus throne fern were chosen because all three species have complete chloroplast genome sequences in NCBI and belong to different evolutionary branches: tomato is an angiosperm, ginkgo is a gymnosperm, and lotus throne fern is a pteridophyte. Testing with plants from different evolutionary branches makes it possible to assess the ability of the probes designed in this example to capture chloroplast DNA fragments from plants of different evolutionary branches.
The probes designed in this example were used to capture chloroplast genome fragments from the whole genomes of tomato, ginkgo and lotus throne fern. The probe capture method of this example follows Qiao, Xian et al., "Genome-wide Target Enrichment-aided Chip Design: a 66K SNP Chip for Cashmere Goat." Scientific Reports 7, no. 1 (2017): 8621; the details are as follows:
(1) Fragmentation of plant genomic gDNA
gDNA was extracted from the tomato, ginkgo and lotus throne fern samples using the plant genome extraction kit supplied by TIANGEN. The extracted genomic DNA was sheared into 170 bp fragments with a Covaris LE220 sonicator. Fragments were then selected by magnetic bead adsorption: 1 volume of Ampure XP magnetic beads was added to the sheared DNA and mixed for 5 minutes of adsorption; the supernatant was kept and the beads were discarded; XP magnetic beads equal to 0.5 times the volume of the DNA sample and beads were added and mixed for 10-15 minutes of adsorption, after which the supernatant was discarded; the beads were washed twice with 75% ethanol, and the DNA was eluted with 42 μL of TE.
(2) End repair of the fragmented plant gDNA
The XP magnetic beads used for purification were left at room temperature for 30 min before use.
The 42 μL of DNA fragments from each sample was added to 58 μL of End Repair Master Mix for reaction. The 58 μL End Repair Master Mix was prepared as follows: 10× End Repair Buffer 10 μL, dNTPs 1.6 μL, T4 DNA Polymerase 1 μL, Klenow DNA Polymerase 2 μL, T4 Polynucleotide Kinase 2.2 μL, made up to 58 μL with ddH2O.
The reaction solution was mixed and placed in a PCR instrument under the following conditions: 20 °C for 30 min, then hold at 4 °C.
180 μL of XP magnetic beads was added to each reaction tube and left to stand for 5 min;
the supernatant was removed, 200 μL of 80% ethanol was added, and the tube was left on the magnetic rack for 30 s;
the supernatant was removed, 32 μL of ddH2O was added, and, after standing on the magnetic rack for 5 min, 30 μL of liquid was transferred to a new 1.5 mL centrifuge tube, giving 30 μL of purified DNA fragments.
20 μL of Adenylation Master Mix was added to the 30 μL of purified DNA fragments for reaction. The 20 μL Adenylation Master Mix was prepared as follows: 10× Klenow Polymerase Buffer 5 μL, dATP 1 μL, Exo(-) Klenow 3 μL, made up to 20 μL with ddH2O.
The reaction solution was mixed and placed in a PCR instrument under the following conditions: 37 °C for 30 min, then hold at 4 °C.
90 μL of XP magnetic beads was added to each sample and left to stand for 5 min;
the supernatant was removed, 200 μL of 80% ethanol was added, and the tube was left on the magnetic rack for 30 s;
the supernatant was removed, 15 μL of ddH2O was added, and, after standing on the magnetic rack for 5 min, 13 μL of liquid was transferred to a new 1.5 mL centrifuge tube, giving 13 μL of purified, end-repaired DNA fragments.
(3) Ligation of specific sequencing adapters to both ends of the gDNA fragments
The XP magnetic beads used for purification were left at room temperature for 30 min before use.
To each sample of end-repaired, A-tailed DNA fragments from step (2), i.e., the 13 μL of DNA fragments, 37 μL of Ligation Master Mix was added for reaction. The 37 μL Ligation Master Mix was prepared as follows: 5× T4 DNA Ligase Buffer 10 μL, SureSelect Adapter Oligo Mix 10 μL, T4 DNA Ligase 1.5 μL, made up to 37 μL with ddH2O.
The reaction system was placed in a PCR instrument under the following conditions: 20 °C for 15 min, then hold at 4 °C.
90 μL of XP magnetic beads was added to each sample and left to stand for 5 min;
the supernatant was removed, 200 μL of 80% ethanol was added, and the tube was left on the magnetic rack for 30 s;
the supernatant was removed, 32 μL of ddH2O was added, and, after standing on the magnetic rack for 5 min, 30 μL of liquid was transferred to a new 1.5 mL centrifuge tube, giving 30 μL of adapter-ligated DNA fragments.
(4) PCR amplification and purification of the adapter-ligated gDNA fragments
In this example, all library construction and hybridization steps are carried out using the Agilent SureSelect Reagent Kit, and the PCR amplification primers are also provided by the kit.
The XP magnetic beads used for purification are left at room temperature for 30 min before use.
15 μL of the DNA fragments finally obtained in step (3) is taken, and 35 μL of PCR Reaction Mix is added for the reaction. The 35 μL PCR Reaction Mix comprises: SureSelect Primer 1.25 μL, SureSelect ILM Indexing Pre-Capture PCR Reverse Primer 1.25 μL, 5× Herculase II Reaction Buffer 10 μL, 100 mmol/L dNTP Mix 0.5 μL, Herculase II Fusion DNA Polymerase 1 μL, made up to 35 μL with ddH2O.
The PCR reaction solution is mixed and run in a PCR instrument under the following conditions: 95 °C for 2 min; then 10 cycles of 95 °C for 30 s, 65 °C for 30 s and 72 °C for 60 s; after cycling, 72 °C for 10 min; hold at 4 °C.
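The cycling program can be written down as plain data, which makes it easy to check the programmed run time. The following is a minimal Python sketch under that assumption; the constant and function names are illustrative, and ramping time and the final 4 °C hold are deliberately ignored.

```python
# Each step is (temperature in °C, duration in seconds).
INITIAL_DENATURATION = (95, 120)
CYCLE_STEPS = [(95, 30), (65, 30), (72, 60)]
CYCLES = 10
FINAL_EXTENSION = (72, 600)

def programmed_seconds(initial, cycle_steps, cycles, final):
    """Sum the programmed hold times, ignoring ramping and the 4 °C hold."""
    per_cycle = sum(duration for _, duration in cycle_steps)
    return initial[1] + cycles * per_cycle + final[1]

total_s = programmed_seconds(
    INITIAL_DENATURATION, CYCLE_STEPS, CYCLES, FINAL_EXTENSION
)
print(total_s / 60)  # 32.0
```

The programmed holds therefore amount to 32 min; the actual run is longer by however much time the instrument spends ramping between temperatures.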
90 μL of XP magnetic beads is added to the PCR reaction product of each sample, and the sample is left to stand for 5 min;
The supernatant is removed, 200 μL of 80% ethanol is added, and the sample is left on the magnetic rack for 30 s;
The supernatant is removed, 30 μL of ddH2O is added, and after standing on the magnetic rack for 5 min, 28 μL of liquid is transferred to a new 1.5 mL centrifuge tube. This 28 μL of liquid is the purified PCR amplification product, i.e. the purified gDNA library carrying the specific adapters.
(5) The probes designed in this example are hybridized with the purified, adapter-bearing gDNA library in hybridization buffer, specifically as follows:
a) The gDNA library finally obtained in step (4) is diluted to a concentration of 221 ng/μL, and 3.4 μL of the gDNA library, about 750 ng in total, is transferred to a new 1.5 mL centrifuge tube. 5.6 μL of SureSelect Block Mix is added to the gDNA library of each sample, and after mixing, the reaction is carried out in a PCR instrument. Reaction conditions: 95 °C for 5 min; hold at 65 °C. The 5.6 μL SureSelect Block Mix is prepared from: 2.5 μL of SureSelect Indexing Block 1, 2.5 μL of SureSelect Block 2, and 0.6 μL of SureSelect ILM Indexing Block 3.
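The stated input of "about 750 ng" can be confirmed from the concentration and the volume drawn (mass = concentration × volume). The short Python check below does this and also confirms that the three blocking reagents sum to 5.6 μL; the variable names are illustrative assumptions.

```python
from decimal import Decimal

concentration_ng_per_ul = Decimal("221")
library_volume_ul = Decimal("3.4")

# Blocking reagent volumes (μL) as listed for the SureSelect Block Mix.
block_mix_ul = [Decimal("2.5"), Decimal("2.5"), Decimal("0.6")]

library_mass_ng = concentration_ng_per_ul * library_volume_ul
block_total_ul = sum(block_mix_ul)
blocked_volume_ul = library_volume_ul + block_total_ul

print(library_mass_ng)    # 751.4
print(block_total_ul)     # 5.6
print(blocked_volume_ul)  # 9.0
```

The 751.4 ng result agrees with the rounded figure of about 750 ng, and each sample enters hybridization as 9.0 μL of blocked library.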
b) To the gDNA library finally obtained in step a), 20 μL of Capture Library Hybridization Mix is added per sample; the sample is mixed and placed in a PCR instrument for the hybridization reaction; reaction conditions: 65 °C for 24 h. The 20 μL Capture Library Hybridization Mix is prepared from: 6.63 μL of SureSelect Hyb 1, 0.27 μL of SureSelect Hyb 2, 2.65 μL of SureSelect Hyb 3, 3.45 μL of SureSelect Hyb 4, 5 μL of 10% RNase Block, and 2 μL of Capture Library (probe).
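Because the hybridization mix has six components, it is worth confirming that they add up to the stated 20 μL and scaling the recipe when several samples are processed together. The following Python sketch is illustrative only: the helper `scale_mix` and the choice of three samples (one per test plant in this example) are assumptions, and no pipetting overage is included.

```python
from decimal import Decimal

# Component volumes (μL) for one 20 μL Capture Library Hybridization Mix.
hyb_mix_ul = {
    "SureSelect Hyb 1": Decimal("6.63"),
    "SureSelect Hyb 2": Decimal("0.27"),
    "SureSelect Hyb 3": Decimal("2.65"),
    "SureSelect Hyb 4": Decimal("3.45"),
    "RNase Block (10%)": Decimal("5"),
    "Capture Library (probe)": Decimal("2"),
}

def scale_mix(mix, n_samples):
    """Multiply every component volume by the number of samples."""
    return {name: volume * n_samples for name, volume in mix.items()}

hyb_total = sum(hyb_mix_ul.values())
print(hyb_total)  # 20.00

three_samples = scale_mix(hyb_mix_ul, 3)
print(three_samples["Capture Library (probe)"])  # 6
```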
(6) The hybrid-captured target DNA fragments are separated and purified using magnetic beads to obtain the plant chloroplast genome DNA library, specifically as follows:
SureSelect Wash Buffer 2 is preheated to 65 °C in advance. The MyOne Streptavidin T1 magnetic beads are resuspended by vigorous shaking on a vortex mixer until homogeneous; 200 μL of SureSelect Binding Buffer is added to the beads, the tube is left on the magnetic rack at room temperature for 5 min, and the supernatant is aspirated and discarded.
200 μL of SureSelect Binding Buffer is then added to the beads, and the beads are resuspended by vigorous shaking on a vortex mixer for 5 seconds.
The entire hybridization reaction is added to the above 200 μL of magnetic beads and mixed. The mixture is then mixed at room temperature on a Nutator mixer for 30 min.
The sample is placed on the magnetic rack at room temperature for 5 min and the supernatant is removed, after which 200 μL of SureSelect Wash Buffer 2 preheated to 65 °C is added to resuspend the beads. The sample is incubated at 65 °C for 10 min in a PCR instrument.
The sample incubated at 65 °C for 10 min is placed on the magnetic rack and left at room temperature for 5 min;
The supernatant is removed, 200 μL of 80% ethanol is added, and the sample is left on the magnetic rack for 30 s;
The supernatant is removed, 30 μL of ddH2O is added, and after standing on the magnetic rack for 5 min, the liquid is transferred to a new 1.5 mL centrifuge tube; this is the captured chloroplast DNA fragment library.
(7) Sequencing
The obtained chloroplast genome library is sequenced on the Illumina HiSeq 4000 sequencing platform using PE100 (paired-end, 100 bp) sequencing.
(8) Analysis of sequencing results
Adapter sequences are filtered out of the raw data generated by sequencing, and data containing a high proportion of low-quality bases are filtered out at the same time. The filter condition set in this example is a low-quality base proportion ≥ 10%. Here, the adapter sequences are the adapter sequences added during library construction.
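The low-quality filter can be sketched as follows. This Python sketch rests on assumptions that the text does not state: a base counts as "low quality" when its Phred score is below 20, and quality strings use the Phred+33 encoding. Only the 10% cut-off comes from this example, and the function names are illustrative.

```python
def low_quality_fraction(quality_string, min_phred=20, phred_offset=33):
    """Fraction of bases whose Phred score is below min_phred.

    min_phred and phred_offset are assumed values, not taken from the text.
    """
    if not quality_string:
        return 1.0  # treat an empty read as entirely low quality
    low = sum(1 for ch in quality_string if ord(ch) - phred_offset < min_phred)
    return low / len(quality_string)

def keep_read(quality_string, max_low_fraction=0.10):
    """Keep a read only if its low-quality base fraction is below 10%."""
    return low_quality_fraction(quality_string) < max_low_fraction

# Toy quality strings: 'I' encodes Phred 40, '#' encodes Phred 2.
print(keep_read("I" * 100))            # True
print(keep_read("I" * 91 + "#" * 9))   # True  (9% low-quality bases)
print(keep_read("I" * 90 + "#" * 10))  # False (10% meets the filter condition)
```

A read whose low-quality proportion is exactly 10% is discarded, matching the "≥ 10%" condition.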
The filtered sequencing data are aligned to the template sequences using the bwa software, and the coverage of the sequencing data on the template sequence is calculated; the results are shown in Table 3. Here, the template sequences are the complete chloroplast genome sequences of tomato, ginkgo and lotus throne fern published at NCBI. The sequencing data of tomato are aligned to the NCBI complete chloroplast genome sequence of tomato and the coverage is calculated; the sequencing data of ginkgo are aligned to the NCBI complete chloroplast genome sequence of ginkgo and the coverage is calculated; the sequencing data of lotus throne fern are aligned to the NCBI complete chloroplast genome sequence of lotus throne fern and the coverage is calculated.
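Once reads have been aligned, the number of template bases covered by at least one read can be counted by merging the aligned intervals. The Python sketch below illustrates that counting step only; it does not call bwa, the interval representation (0-based, end-exclusive) is an assumption, and the coordinates are toy values rather than data from this example.

```python
def covered_bases(intervals, template_length):
    """Count template positions covered by at least one aligned read.

    intervals: iterable of (start, end) pairs, 0-based and end-exclusive.
    """
    covered = 0
    current_end = 0  # right edge of the region counted so far
    for start, end in sorted(intervals):
        start = max(start, current_end, 0)  # skip bases already counted
        end = min(end, template_length)     # clip to the template
        if end > start:
            covered += end - start
            current_end = end
    return covered

# Toy alignments on a hypothetical 250 bp template.
toy_alignments = [(0, 100), (50, 150), (200, 260)]
print(covered_bases(toy_alignments, 250))  # 200
```

Overlapping reads are counted once, so the result is a count of distinct template positions rather than a sum of read lengths.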
Table 3. Sequencing data and coverage statistics
In Table 3: the NCBI number is the NCBI number of the chloroplast sequence of each of the three test plants used in this example; the template sequence length is the length of the chloroplast genome reference sequence of the test plant, the reference sequence being obtained from NCBI; the sequencing data volume is the amount of off-machine (raw) data generated by sequencing the chloroplast fragments captured by the probes; the average sequencing depth is the off-machine data volume divided by the template sequence length; the covered template sequence length is the number of template sequence bases to which the off-machine data can be aligned, which reflects how many bases of the template sequence region were sequenced; and the sequencing data coverage is the covered template sequence length divided by the template sequence length, which reflects the proportion of template sequence bases covered by the sequencing data.
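The two derived quantities defined above are simple ratios. The Python sketch below applies them to hypothetical numbers chosen for easy arithmetic; these are not the values of Table 3, and the function names are illustrative.

```python
def average_depth(data_volume_bp, template_length_bp):
    """Average sequencing depth = off-machine data volume / template length."""
    return data_volume_bp / template_length_bp

def coverage_rate(covered_length_bp, template_length_bp):
    """Coverage = covered template length / template length."""
    return covered_length_bp / template_length_bp

# Hypothetical inputs, not taken from Table 3.
template_length = 150_000   # bp
data_volume = 15_000_000    # bp of off-machine data
covered_length = 147_000    # bp of template covered by aligned data

print(average_depth(data_volume, template_length))     # 100.0
print(coverage_rate(covered_length, template_length))  # 0.98
```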
The results in Table 3 show that capturing chloroplast DNA fragments with the probes designed in this example, followed by sequencing, obtains chloroplast DNA sequencing data efficiently: the coverage of the complete chloroplast genome sequence is greater than 90%, and for tomato it reaches 98.5%. Moreover, the probes and the chloroplast DNA sequencing data acquisition method of this example can effectively obtain chloroplast DNA sequencing data from plant species of different evolutionary branches, without bias, which ensures broad coverage of the sequencing output. The method is especially suitable for obtaining chloroplast DNA sequencing data from large numbers of species across different evolutionary branches, and lays a foundation for in-depth, large-scale research on plant evolution and heredity.
The foregoing is a further detailed description of the present application in conjunction with specific embodiments, and the specific implementation of the present application should not be regarded as limited to these descriptions. Those of ordinary skill in the art to which this application belongs may make a number of simple deductions or substitutions without departing from the concept of the present application.