CN104711250A

CN104711250A - Building method of long fragment nucleic acid library

Info

Publication number: CN104711250A
Application number: CN201510038647.8A
Authority: CN
Inventors: 郑洪坤; 刘少卿
Original assignee: BEIJING BIOMARKER TECHNOLOGIES Co Ltd
Current assignee: BEIJING BIOMARKER TECHNOLOGIES Co Ltd
Priority date: 2015-01-26
Filing date: 2015-01-26
Publication date: 2015-06-17

Abstract

The invention provides a method for constructing a long-fragment nucleic acid library. The method comprises the following steps: preparing Index tag clusters in batches; labeling long-fragment nucleic acid molecules in batches; constructing, in combination with a whole-genome resequencing library-construction workflow, a long-fragment library suitable for Illumina sequencing platforms, in which every fragment receives an adapter suitable for the Illumina sequencing platform under the action of a transposase; and carrying out PCR amplification and screening the constructed library. The method effectively addresses the effect of polyploidy and repetitive sequences on assembly accuracy in genome de novo sequencing, and can be used for assembling complex genomes and for assisting de novo sequencing assembly, improving assembly accuracy and shortening assembly time.

Description

A method for constructing a long-fragment nucleic acid library

Technical field

The present invention relates to the field of bioinformatics, and in particular to a method for constructing a long-fragment nucleic acid library suitable for genome de novo sequencing and assembly.

Background technology

High-throughput sequencing, also known as next-generation sequencing, is a revolutionary change relative to conventional sequencing technology and can sequence millions of DNA molecules simultaneously. High-throughput sequencing can be used not only for large-scale genome sequencing but also for related studies such as gene expression analysis, identification of non-coding microRNAs, screening of transcription factor target genes, and DNA methylation.

Whole-genome de novo sequencing, also called genome de novo sequencing, means sequencing the genome of a species without relying on any known genome sequence information, then splicing and assembling the sequencing reads by bioinformatics means to finally obtain the genome sequence map of that species. Assembly of the first genome, the human genome, took 13 years. This was an epoch-making event, and it also set scientists a new goal: how to shorten the overall time. With the advent of high-throughput sequencing, this time has been greatly shortened, and sequencing a human genome within one day has become possible. For species without a reference genome, de novo assembly time has also been greatly shortened, and the genomes of a number of species have now been assembled, including maize, Chinese cabbage, kiwifruit, and the giant panda, China's national treasure. However, in the assembly of complex genomes, polyploidy and large numbers of repetitive sequences have become technical barriers. If this problem can be solved, strong technical support will be provided for complex genome assembly, improving assembly accuracy and shortening assembly time.

At present, in addition to bioinformatics methods, experimental approaches are used to address the effect of polyploidy and repetitive sequences on de novo assembly. These are mainly Mate Pair library preparation and long-fragment nucleic acid molecule sequencing, both of which are experimental techniques of Illumina. Mate Pair library preparation aims to generate short DNA fragments that contain the sequences at the two ends of large-span (2-10 kb) fragments in the genome. Genomic DNA is first randomly fragmented to a specific size (selectable within the 2-10 kb range) and then subjected in turn to end repair, biotin labeling, and circularization. The circularized DNA molecules are randomly broken into 400-600 bp fragments, and the biotin-labeled fragments are captured with streptavidin-coated M280 magnetic beads. The captured fragments are end-modified and ligated to specific adapters to build the Mate Pair library. Sequencing then yields the sequences at the two ends of the large-span fragments, which greatly reduces the complexity of genome assembly (scaffold construction and gap filling). The other technique likewise belongs to Illumina. Its principle and basic workflow are as follows: genomic DNA is first fragmented to 6-8 kb; positioning PCR adapters are added (to identify the 5' and 3' ends of each 10 kb fragment); limiting dilution is used to create sample aliquots of several hundred to several thousand DNA molecules, so that each well contains only a small number of long-fragment molecules (reducing the difficulty of assembling the 10 kb fragments after sequencing); long-fragment amplification is used to enrich the small number of 10 kb molecules (providing enough data for assembly); the long fragments are fragmented a second time into short fragments; dual-Index adapters usable on Illumina sequencing systems are added to build a short-fragment library; the library is sequenced on an Illumina HiSeq 2000; and the paired-end short reads of each Index are then assembled separately with the Velvet assembler, which simplifies the assembly problem. This method effectively increases the read length of next-generation sequencers 50-fold while reducing the error rate by several orders of magnitude. Its drawback, however, is that the limiting dilution step is highly susceptible to contamination by environmental microbial genomes, which lowers the amount of valid data.

Summary of the invention

The object of the present invention is to provide a method for constructing a long-fragment nucleic acid library, so as to solve problems such as the difficulty of assembling complex genomes caused by factors such as polyploidy, high heterozygosity, and repetitive sequences.

To achieve this object, the present invention uses an independently developed technique for batch labeling of long-fragment nucleic acid molecules, based on batch preparation of Index clusters, in combination with the conventional whole-genome resequencing library-construction workflow, to construct a long-fragment library suitable for the Illumina sequencing platform, followed by sequencing and downstream data analysis. The technical scheme is as follows: (1) conventional long-fragment library preparation, with low-cycle PCR to enrich large nucleic acid fragments; (2) preparation, by digestion and ligation, of two kinds of Index-containing tag clusters, the first being the Index tag cluster and the second the large-fragment capture tag cluster; (3) DNA long-fragment capture: the long-fragment amplification product of step (1) is treated and then mixed with the large-fragment capture tag clusters of step (2), so that the long fragments are "mounted" onto the large-fragment capture tag clusters; (4) transposase fragmentation: the Index tag clusters are mixed with the capture tag clusters carrying the "mounted" long fragments and compartmentalized in a PTP plate, so that each well contains one Index tag cluster and one long-fragment capture tag cluster; under the action of the transposase, the large fragments "mounted" on the capture tag clusters are broken into a smear with lengths concentrated at about 500-600 bp, and each fragment directly receives an adapter suitable for the Illumina sequencing platform; (5) library acquisition: PCR amplification and screening of the constructed library, followed by sequencing and data analysis.

First, the invention provides a method for preparing Index clusters in batches: adapters A1 are each attached to streptavidin-modified magnetic nanobeads to form A1 adapter-nanobead complexes; the 64 kinds of adapter A containing a Sma I restriction site are then each ligated, randomly and one-to-one, to the A1 adapter-nanobead complexes with T4 DNA ligase; the ligation product is digested with Sma I and adapter A is ligated again, for a total of 3 rounds of adapter A ligation and digestion; finally, adapter A2-14U is ligated;

There are 64 kinds of adapter A1, with sequences as shown in SEQ ID NO. 1-128;

There are 64 kinds of adapter A, with sequences as shown in SEQ ID NO. 129-256;

The sequence of adapter A2-14U is:

F: 5'-ACGCATGACTCAdUCGdUCGGCAGCGdUCAdUCTCGCAGTTG;

R: 5'-CAACdUGCGAGAdUGACGCTGCCGACGATGAGTCATGCGT.

The magnetic nanobeads of the present invention are M280 magnetic beads, M270 magnetic beads, or T1 or Ci streptavidin-modified magnetic nanobeads.
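As a quick consistency check on the A2-14U sequences quoted above, the following minimal Python sketch tests whether the F and R strands are reverse complements of each other. It assumes that "dU" (deoxyuridine) pairs like T and can be read as T for this comparison; the function names are illustrative only and are not part of the patent.

```python
# Sketch: check that the two A2-14U strands are reverse complements.
# Assumption: "dU" (deoxyuridine) is read as T for the comparison.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalize(seq: str) -> str:
    """Replace dU with T, then keep only the base letters A, C, G, T."""
    seq = seq.replace("dU", "T").upper()
    return "".join(base for base in seq if base in "ACGT")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a plain A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


A2_14U_F = "ACGCATGACTCAdUCGdUCGGCAGCGdUCAdUCTCGCAGTTG"
A2_14U_R = "CAACdUGCGAGAdUGACGCTGCCGACGATGAGTCATGCGT"

# True when the two strands can anneal over their full length.
strands_pair = reverse_complement(normalize(A2_14U_F)) == normalize(A2_14U_R)

# Number of dU positions across both strands (later substrates for USER enzyme).
uracil_count = A2_14U_F.count("dU") + A2_14U_R.count("dU")

print(strands_pair, uracil_count)
```

The same two helpers can be reused to check any other adapter pair in the text, as long as modified bases are mapped to their pairing equivalents first.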

Further, the invention provides a method for constructing a long-fragment nucleic acid library, comprising the following steps:

(1) Preparation of the DNA long-fragment library

(2) Preparation of labeled tag clusters

By digestion and ligation, two kinds of Index-containing tag clusters are prepared: the first is the Index tag cluster, and the second is the DNA long-fragment capture Index tag cluster. The Index tag clusters, carrying different tags, are prepared by the aforesaid batch method for preparing Index clusters;

(3) Obtaining the DNA long-fragment capture complex

(4) Secondary fragmentation and library construction

The Index tag clusters of step (2) are mixed with the DNA long-fragment capture complexes and distributed into single wells of a PTP plate, so that each well of the PTP plate contains one Index tag cluster and one long-fragment capture tag cluster carrying a large fragment. Under the action of the transposase, the large fragments attached to the long-fragment capture tag clusters are broken into a smear with lengths concentrated at about 500-600 bp, and each fragment directly receives an adapter suitable for the Illumina sequencing platform;

(5) Obtaining the long-fragment nucleic acid library

The ligation products from step (4), carrying Illumina sequencing platform adapters, are amplified by PCR, and the constructed library is screened; sequencing is performed on a HiSeq 2500, and the sequencing results are analyzed.

Step (1) specifically comprises:

1. Genomic DNA fragmentation

Genomic DNA is fragmented by nebulization to obtain a smear concentrated at 2-10 kbp; target fragments within this range are recovered as required and purified;

2. End repair, A-tailing, and adapter ligation

The purified large fragments are end-repaired with T4 DNA polymerase, T4 polynucleotide kinase, and Klenow enzyme and purified with a PCR product purification kit. After end repair and purification, the sample is A-tailed and purified, and an adapter is ligated with T4 DNA ligase. The adapter is MANNN, with the sequence:

F: 5'-ACTTNNNTCCCNNNTCCCNNNTCCCNNNTCCCGCTCTTCCGATCT

R: 5'-GATCGGAAGAGCACACGTCT

After ligation, the ligation product is purified;

3. Fragment selection by gel excision

A 0.6% agarose gel is prepared; after electrophoresis at 100 V for 120 min, fragments in the 2-10 kb range are excised as required, and the target fragments within this range are recovered with a large-fragment recovery kit;

4. Target fragment enrichment

The recovered target fragments are enriched by PCR amplification, and the amplification product is purified.

In sub-step 4 of step (1) of the long-fragment nucleic acid library construction method of the present invention, the primers used for PCR enrichment are:

F: 5'-ACUTCCAUCCCCCAUCCCCCAUCCCCCAUCCC

R: 5'-AGACGTGTGCTCTTCCGATC.

In the method of the invention, the DNA long-fragment capture Index tag clusters of step (2) are prepared as follows: magnetic nanobeads are attached to the 64 kinds of LF2 adapter to form 64 kinds of LF2 adapter-nanobead complexes; the 64 kinds of AU adapter containing a Sma I restriction site are then ligated, randomly and one-to-one, to the LF2 adapters with T4 DNA ligase; the ligation product is digested with Sma I, and adapter AU is again ligated, randomly and one-to-one, to the AU-LF2-nanobead complexes, for a total of 3 rounds of adapter AU ligation and digestion;

The sequences of the 64 kinds of LF2 adapter are as shown in SEQ ID NO. 257-384;

The sequences of adapter AU are as shown in SEQ ID NO. 385-512.

In the method of the invention, the DNA long-fragment capture complex of step (3) is obtained by the following steps:

1. The labeled DNA long-fragment capture Index tag clusters are treated with USER enzyme;

2. The target fragment enrichment product is treated with USER enzyme and then purified and recovered;

3. Obtaining the DNA long-fragment capture complex

The long-fragment amplification product after USER enzyme treatment is mixed with the DNA long-fragment capture Index tag clusters after USER enzyme treatment, and Taq DNA ligase is used to ligate the DNA long fragments to the adapters of the DNA long-fragment capture Index tag clusters, forming the DNA long-fragment capture complex.

In step (3), the enzyme treatment condition of sub-step 1 is 37 °C for 150 minutes, and that of sub-step 2 is 37 °C for 60 minutes.

In the long-fragment nucleic acid library construction method provided by the invention, the primers used for PCR amplification in step (5) are long amplification primer 5 and long amplification primer 7:

Long amplification primer 5: 5'-TGACGACTACTTCGTTAGCGC

Long amplification primer 7: 5'-TGAAGGTCCTGCGCGTGCATAGATTCGCCTTAGTCTCGTGGGCTCGG.

Further, in step (5), the constructed library is sequenced on a HiSeq 2500. The number of Indexes contained in each sequenced fragment is determined, and the proportion of sequenced fragments carrying each number of Indexes is calculated, to determine whether more than 16.77 million different kinds of Index tag clusters have been produced (adapters A1, A, LF2, and AU each have 64 kinds, so there are 64^4 = 16,777,216 kinds) and how many fragments carry different Indexes. In this way each single molecule can be captured by a different Index tag, for subsequent single-molecule sequencing and assembly.
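The size of the Index space stated above follows directly from four adapter positions with 64 variants each. A minimal Python sketch of the arithmetic (the constant names are illustrative):

```python
# Four adapter types named in the text, each available in 64 variants.
VARIANTS_PER_ADAPTER = 64
ADAPTER_TYPES = ("A1", "A", "LF2", "AU")

# One variant is chosen independently per adapter type, so the counts multiply.
total_combinations = VARIANTS_PER_ADAPTER ** len(ADAPTER_TYPES)

print(f"{total_combinations:,} possible Index combinations")
```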

In the present invention, adapters A1, A, LF2, and AU each have 64 kinds. Among the 64 A1 adapters, the upstream and downstream sequences share common backbone sequences, namely "TGACGACTAC TTCGTTAGCG CTACAGTCGT T" and "AACGACT GTAGCGCTAA CGAAGTAGTC GTCA". The 64 A1 adapters differ in that 3 variable bases follow the upstream backbone as a molecular tag, and 3 variable bases precede the downstream backbone as a molecular tag. The 64 A1 adapters are shown in SEQ ID NO. 1-128. Their reaction conditions, ligation modes, and so on are similar, and they differ only in the molecular tag attached; those skilled in the art will therefore understand that any one of the 64 A1 adapters performs equivalently in the ligation reaction. The same applies to the 64 kinds of adapters A, LF2, and AU.
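The backbone-plus-tag layout described above can be sketched in Python. Three variable bases give 4^3 = 64 tags, which matches the stated number of A1 adapters. The sketch assumes that the 64 tags are exactly the 64 possible 3-base combinations and that the downstream tag is the reverse complement of the upstream tag; both are assumptions consistent with the CCA/TGG example adapter given in Example 1, not statements taken from the sequence listing.

```python
from itertools import product

# Backbone sequences quoted in the description (spaces removed).
UPSTREAM_BACKBONE = "TGACGACTACTTCGTTAGCGCTACAGTCGTT"
DOWNSTREAM_BACKBONE = "AACGACTGTAGCGCTAACGAAGTAGTCGTCA"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def build_a1_adapters() -> dict:
    """Map each 3-base tag to an (upstream, downstream) strand pair.

    Assumption: the downstream tag is the reverse complement of the
    upstream tag, so that the two strands anneal over the tag as well.
    """
    adapters = {}
    for bases in product("ACGT", repeat=3):
        tag = "".join(bases)
        upstream = UPSTREAM_BACKBONE + tag                          # tag follows backbone
        downstream = reverse_complement(tag) + DOWNSTREAM_BACKBONE  # tag precedes backbone
        adapters[tag] = (upstream, downstream)
    return adapters


a1_adapters = build_a1_adapters()
print(len(a1_adapters))
print(a1_adapters["CCA"])
```

With the tag CCA, the sketch reproduces the A1 strand pair used in Example 1.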

The present invention has the following advantages and beneficial effects:

(1) In the batch method for preparing Index clusters provided by the invention, every Index on a given Index cluster is identical to the other Indexes on that cluster, while different Index clusters carry different Indexes; the method is applicable to single-molecule sequencing;

(2) It provides a means of labeling nucleic acid molecules in batches; about 16.77 million kinds of labeled nucleic acid molecules can be formed at once, which can be used in research fields such as tumor diagnosis and cancer cell quantification;

(3) In combination with the conventional resequencing workflow, a long-fragment nucleic acid library is constructed, which greatly extends the read length of the Illumina sequencing platform;

(4) The realization of long-fragment nucleic acid sequencing provides strong technical support for solving the assembly problem of complex genomes. The method of the invention effectively addresses the effect of polyploidy and repetitive sequences on assembly accuracy in genome de novo sequencing, and can be used for assembling complex genomes and for assisting de novo sequencing assembly, improving assembly accuracy and shortening assembly time.

Brief description of the drawings

Fig. 1 shows the fragment distribution after nebulization during preparation of the large-fragment nucleic acid library. M: 1 kb DNA Ladder; 1: DNA long fragments after fragmentation.

Fig. 2 shows electrophoresis before and after gel excision for fragment selection during preparation of the large-fragment nucleic acid library. Panel A: DNA long-fragment library after gel excision; panel B: DNA long-fragment library before gel excision. M1: 1 kb DNA Ladder; M2: DL15,000 DNA Marker; 1: DNA long fragments after fragmentation.

Fig. 3 shows agarose gel electrophoresis of the PCR amplification products during preparation of the large-fragment nucleic acid library. Panel A: 5K amplification; panel B: 6K amplification.

Fig. 4 shows electrophoresis of the tag-cluster-based library PCR amplification in the step of obtaining the large-fragment nucleic acid library. Left, D1-D4: before purification; right, D1-D4: after purification.

Fig. 5 shows electrophoresis of the sequencing-ready library. Left, D1-D4: before purification; right, D1-D4: after purification; CK: blank control.

Fig. 6 shows fragmentation for the 10K fragments: the fragment distribution after nebulization during preparation of the large-fragment nucleic acid library. M: 1 kb DNA Ladder; M2: DL15,000 DNA Marker; C27: DNA long fragments of sample C27 after fragmentation in four portions.

Fig. 7 shows gel excision for the 10K fragments: electrophoresis after gel excision for fragment selection during preparation of the large-fragment nucleic acid library. M: 1 kb DNA Ladder; C27: DNA long fragments after fragmentation.

Fig. 8 shows amplification of the 10K fragments: agarose gel electrophoresis of the PCR amplification products during preparation of the large-fragment nucleic acid library.

Fig. 9 shows electrophoresis of the tag-cluster-based library PCR amplification in the step of obtaining the large-fragment nucleic acid library in Example 2. 1: PCR product; M: 100 bp DNA ladder.

Fig. 10 shows electrophoresis of the sequencing-ready library in Example 2. 1: sequencing-ready library; CK: blank control; M: 100 bp DNA ladder.

Fig. 11 shows the length distribution of the assembled fragments. X-axis: fragment length (kbp); Y-axis: number of fragments of that assembled length.

Fig. 12 shows the chromosomal distribution of the assembled fragments. 1, 2, 3, 4, 5: chromosome numbers.

Detailed description of the embodiments

The following examples illustrate the present invention but are not intended to limit its scope.

Example 1: Long-fragment nucleic acid library construction method

(1) Preparation of the large-fragment library

1. Genomic DNA fragmentation

20 μg (quantified by NanoDrop) of rice Nipponbare genomic DNA was fragmented by nebulization to obtain a smear concentrated at 5K and 6K; the 5K and 6K fragments were recovered separately. The electrophoresis results are shown in Fig. 1;

2. End repair, A-tailing, and adapter ligation

The purified large fragments were end-repaired with T4 DNA polymerase, T4 PNK, and Klenow enzyme and purified with a PCR product purification kit. After end repair and purification, the sample was A-tailed and purified, and an adapter was ligated with T4 DNA ligase. The adapter was MACCA (MANNN with NNN fixed as CCA, used to verify adapter ligation efficiency), synthesized by Invitrogen, with the sequence:

F: 5'-ACTTCCATCCCCCATCCCCCATCCCCCATCCCGCTCTTCCGATCT;

R: 5'-GATCGGAAGAGCACACGTCT

After ligation, the ligation product was purified;

3. Fragment selection by gel excision

A 0.6% agarose gel was prepared; after electrophoresis at 100 V for 120 min, the 5K and 6K fragments were excised, and the 5K and 6K target fragments were recovered with a large-fragment recovery kit. Electrophoresis before and after gel excision is shown in Fig. 2;

4. Target fragment enrichment

The recovered 5K and 6K fragments were amplified in two separate PCR tubes to obtain a sufficient quantity of starting long fragments. The PCR primers were synthesized by Invitrogen, with the sequences:

F: 5'-ACUTCCAUCCCCCAUCCCCCAUCCCCCAUCCC

R: 5'-AGACGTGTGCTCTTCCGATC

Agarose gel electrophoresis of the PCR amplification products is shown in Fig. 3.
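The relationships among the MANNN template, the MACCA adapter used in this example, and the enrichment primers can be checked with a short Python sketch. It verifies three things: that fixing NNN as CCA in MANNN gives the MACCA F strand, that the uracil-containing F primer (read with U as T, an assumption made only for this comparison) matches the 5' portion of the MACCA F strand, and that the R primer is the reverse complement of the adapter R strand. The variable names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


# Sequences as quoted in the text.
MANNN_F = "ACTTNNNTCCCNNNTCCCNNNTCCCNNNTCCCGCTCTTCCGATCT"
MACCA_F = "ACTTCCATCCCCCATCCCCCATCCCCCATCCCGCTCTTCCGATCT"
ADAPTER_R = "GATCGGAAGAGCACACGTCT"
PRIMER_F = "ACUTCCAUCCCCCAUCCCCCAUCCCCCAUCCC"
PRIMER_R = "AGACGTGTGCTCTTCCGATC"

# 1. MACCA is MANNN with every NNN fixed as CCA.
template_matches = MANNN_F.replace("NNN", "CCA") == MACCA_F

# 2. The F primer, with U read as T, is the 5' portion of the MACCA F strand.
primer_f_matches = MACCA_F.startswith(PRIMER_F.replace("U", "T"))

# 3. The R primer is the reverse complement of the adapter R strand.
primer_r_matches = reverse_complement(ADAPTER_R) == PRIMER_R

# Uracil positions in the F primer (sites available to USER enzyme later on).
primer_f_uracils = PRIMER_F.count("U")

print(template_matches, primer_f_matches, primer_r_matches, primer_f_uracils)
```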

(2) Preparation of Index-containing tag clusters

By digestion and ligation, two kinds of Index-containing tag clusters were prepared: the first is the Index tag cluster, and the second is the large-fragment DNA capture tag cluster:

1. Preparation of Index tag clusters

Adapter A1 (sequence F: 5'-TGACGACTAC TTCGTTAGCG CTACAGTCGT TCCA; R: 5'-TGGAACGACT GTAGCGCTAA CGAAGTAGTC GTCA) was attached to M280 magnetic beads. Adapter A containing a Sma I restriction site (synthesized by Invitrogen) was ligated to A1 with T4 DNA ligase; its sequence is:

F: 5'-TGACTGCCAC TAGGCTTCAA TGGCCCGGG

R: 5'-CCCGGGCCAT TGAAGCCTAG TGGCAGTCA

The ligation product was digested with Sma I and adapter A was ligated again, for a total of 3 rounds of adapter A ligation and digestion. Finally, adapter A2-14U (synthesized by Invitrogen) was ligated; its sequence is:

F: 5'-ACGCATGACTCAUCGUCGGCAGCGUCAUCTCGCAGTTG

R: 5'-CAACUGCGAGAUGACGCTGCCGACGATGAGTCATGCGT

2. Preparation of DNA long-fragment capture Index tag clusters

Adapter LF2 (synthesized by Invitrogen) was attached to M280 magnetic beads; its sequence is:

F: 5'-GATGGACCGA CACTCTTTCC CTACACGACT TCCA

R: 5'-TGGAagducgt gtagggaaag agtgtcggtc catc

Adapter AU containing a Sma I restriction site (synthesized by Invitrogen) was ligated to LF2 with T4 DNA ligase; its sequence is:

F: 5'-Gactgccact aggcdutcaaCCACCCGGG

R: 5'-CCCGGGTGGT TGAAGCCTAG TGGCAGTC

The ligation product was digested with Sma I and adapter AU was ligated again, for a total of 4 rounds of adapter AU ligation and digestion.
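The ligation and digestion rounds above rely on the Sma I site carried at the end of adapters A and AU. Sma I recognizes CCCGGG and cuts between CCC and GGG, leaving blunt ends. The following Python sketch locates the site in the adapter A top strand quoted above and splits the strand at the cut position; the function name `sma_i_digest` is hypothetical and the sketch models only the top strand, not the full double-stranded reaction.

```python
SMA_I_SITE = "CCCGGG"  # Sma I recognition sequence; blunt cut between CCC and GGG


def sma_i_digest(top_strand: str):
    """Split a top strand at its first Sma I site.

    Returns (left, right) after the blunt cut, or None if no site is found.
    """
    seq = top_strand.replace(" ", "").upper()
    start = seq.find(SMA_I_SITE)
    if start == -1:
        return None
    cut = start + 3  # cut falls after the third C of the site
    return seq[:cut], seq[cut:]


ADAPTER_A_F = "TGACTGCCAC TAGGCTTCAA TGGCCCGGG"

left, right = sma_i_digest(ADAPTER_A_F)
print(left, right)
```

In this simplified picture, the part left of the cut stays on the bead-bound construct and the short GGG end is released, so the construct ends bluntly again for the next round of adapter ligation.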

(3) Obtaining the DNA long-fragment capture complex

The labeled DNA long-fragment capture Index tag clusters were treated with USER enzyme at 37 °C for 150 minutes. The target fragment enrichment product was treated with USER enzyme at 37 °C for 60 minutes and then purified and recovered. The USER-treated target fragment enrichment product was mixed with the USER-treated DNA long-fragment capture Index tag clusters, and Taq DNA ligase was used to ligate the long-fragment amplification product to the adapters of the DNA long-fragment capture Index tag clusters, giving the DNA long-fragment capture complex.

(4) Transposase fragmentation (short-fragment library construction)

Table 1. Experimental treatments

Note: D1-D4 denote treatments 1-4; CCA denotes the Index tag cluster added in the treatment; LF2A indicates that 5K DNA long fragments were captured; LF2B indicates that 6K DNA long fragments were captured; D1 and D2 are two replicates of LF2A, and D3 and D4 are two replicates of LF2B; R1 and R2 are the transposase adapters carried by the transposase.

Transposase fragmentation was carried out. D1 and D2 were processed as follows:

The Index tag clusters and the Index tag clusters carrying DNA long fragments were mixed in equal volumes, compartmentalized in a PTP plate, and reacted in the transposase system under the following conditions: 55 °C, 10 min; 45 °C, 60 min. The large fragments attached to the DNA long-fragment capture Index tag clusters were broken into a smear with lengths concentrated at about 500-600 bp, and under the action of the transposase each fragment directly received an adapter suitable for the Illumina sequencing platform;

D3 and D4 were processed as follows:

The Index tag clusters and the tag clusters carrying DNA long fragments were mixed in equal volumes and reacted in the transposase system under the following conditions: 55 °C, 10 min; 45 °C, 60 min. The large fragments attached to the long-fragment capture tag clusters were broken into a smear with lengths concentrated at about 500-600 bp, and under the action of the transposase each fragment directly received an adapter suitable for the Illumina sequencing platform;

(5) Library acquisition

The ligation products from step (4), carrying Illumina sequencing platform adapters, were amplified by PCR with the i5/i7-long primers (synthesized by Invitrogen), with the sequences:

Long amplification primer 5: 5'-TGACGACTACTTCGTTAGCGC

Long amplification primer 7: 5'-TGAAGGTCCTGCGCGTGCATAGATTCGCCTTAGTCTCGTGGGCTCGG

2.1 μl of the i5/i7-long mixture was added per 50 μl reaction:

The reaction volume was 50 μl, and the reaction was run with the following program:

98 °C, 30 s; 16 cycles of (98 °C, 10 s; 65 °C, 30 s; 72 °C, 30 s); 72 °C, 5 min. The PCR supernatant was collected, purified, and checked by electrophoresis; the results are shown in Fig. 4.
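The thermocycling program above can be written down as a small data structure, which makes the total programmed hold time easy to compute. This is a minimal Python sketch; the class and function names are illustrative, and ramp times between temperatures are not modeled because the text does not give them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int   # hold temperature in degrees Celsius
    seconds: int  # hold time in seconds


# Program as stated in the text.
INITIAL_DENATURATION = Step(98, 30)
CYCLE = (Step(98, 10), Step(65, 30), Step(72, 30))
CYCLES = 16
FINAL_EXTENSION = Step(72, 5 * 60)


def hold_time_seconds() -> int:
    """Total programmed hold time, excluding temperature ramps."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL_DENATURATION.seconds + CYCLES * per_cycle + FINAL_EXTENSION.seconds


total = hold_time_seconds()
print(f"{total} s ({total // 60} min {total % 60} s)")
```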

(6) Obtaining the sequencing-ready library

After Qubit quantification of the libraries, the D1, D2, D3, and D4 libraries were subjected to steps such as A-tailing and ligation of standard adapters, followed by 16 cycles of library PCR amplification. Library construction was checked by agarose gel electrophoresis; see Fig. 5.

(7) Data analysis after sequencing

Sequencing was performed on a HiSeq 2500, the results were analyzed with bioinformatics software, and the preparation of random Indexes was tallied; the results are shown in Table 2. Through tag preparation, large-fragment capture, and sequencing, the four treatments yielded 1956041, 1556375, 1485623, and 1993805 reads respectively, of which 1579204, 1245167, 1229178, and 1603009 reads contained the A1 adapter, accounting for 80.73%, 80.00%, 82.74%, and 80.40% of all Indexes respectively. The numbers of reads containing the A1+index+A2+R1 adapters were 141796, 198587, 103304, and 137895 respectively. The overall Index preparation efficiency in this preliminary experiment was 7.47%, 7.92%, 8.40%, and 8.60% respectively; that is, 141796, 198587, 103304, and 137895 Index tag clusters were prepared, able to capture the corresponding numbers of single-molecule large fragments. The captured large fragments can be assembled separately and then used to assemble complex genomes, polyploid genomes, and genomes with a high content of repetitive sequences.

Table 2. Sequencing results
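The A1-containing percentages reported above can be reproduced from the read counts. The Python sketch below assumes that the four values in each list correspond to treatments D1 to D4 in order, which the text implies but does not state explicitly. It covers only the A1 fraction; the overall Index preparation efficiency is not recomputed here because the text does not say which denominator was used.

```python
# Read counts as reported; the D1-D4 ordering is an assumption.
TOTAL_READS = {"D1": 1956041, "D2": 1556375, "D3": 1485623, "D4": 1993805}
A1_READS = {"D1": 1579204, "D2": 1245167, "D3": 1229178, "D4": 1603009}


def a1_fraction_percent(sample: str) -> float:
    """Percentage of reads in a treatment that contain the A1 adapter."""
    return 100.0 * A1_READS[sample] / TOTAL_READS[sample]


for sample in TOTAL_READS:
    print(sample, f"{a1_fraction_percent(sample):.2f}%")
```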

Example 2: Long-fragment nucleic acid library construction method and sequence assembly

(1) Preparation of the large-fragment library

1. Genomic DNA fragmentation

20 μg (quantified by NanoDrop) of rice Nipponbare genomic DNA, sample number C27, was fragmented by nebulization to obtain a smear concentrated at 10K; the 10K fragments were recovered. The electrophoresis results are shown in Fig. 6;

2. End repair, A-tailing, and adapter ligation

The purified large fragments were end-repaired with T4 DNA polymerase, T4 PNK, and Klenow enzyme and purified with a PCR product purification kit. After end repair and purification, the sample was A-tailed and purified, and an adapter was ligated with T4 DNA ligase. The adapter was MANNN, synthesized by Invitrogen, with the sequence:

F: 5'-ACTTNNNTCCCNNNTCCCNNNTCCCNNNTCCCGCTCTTCCGATCT;

R: 5'-GATCGGAAGAGCACACGTCT

After ligation, the ligation product was purified;

3. Fragment selection by gel excision

A 0.6% agarose gel was prepared; after electrophoresis at 100 V for 120 min, the 10K fragments were excised, and the 10K target fragments were recovered with a large-fragment recovery kit. Electrophoresis after gel excision is shown in Fig. 7;

4. Target fragment enrichment

The recovered 10K fragments were amplified in two separate PCR tubes to obtain a sufficient quantity of starting long fragments. The PCR primers were synthesized by Invitrogen, with the sequences:

F: 5'-ACUTNNNUCCCNNNUCCCNNNUCCCNNNUCCC

R: 5'-AGACGTGTGCTCTTCCGATC

Agarose gel electrophoresis of the PCR amplification products is shown in Fig. 8.

(2) Preparation of Index-containing tag clusters

1. Preparation of Index tag clusters

Adapter A1 (see sequence listing, 1-128) was attached to M280 magnetic beads. Adapter A containing a Sma I restriction site (synthesized by Invitrogen; see sequence listing, 129-256) was ligated to A1 with T4 DNA ligase; the ligation product was digested with Sma I and adapter A was ligated again, for a total of 3 rounds of adapter A ligation and digestion. Finally, adapter A2-14U (synthesized by Invitrogen) was ligated.

2. Preparation of DNA long-fragment capture Index tag clusters

Adapter LF2 (synthesized by Invitrogen; see sequence listing, 257-386) was attached to M280 magnetic beads. Adapter AU containing a Sma I restriction site (synthesized by Invitrogen; see sequence listing, 387-512) was ligated to LF2 with T4 DNA ligase; the ligation product was digested with Sma I and adapter AU was ligated again, for a total of 4 rounds of adapter AU ligation and digestion.

(3) Obtaining the DNA long-fragment capture complex

(4) Transposase fragmentation (short-fragment library construction)

Transposase fragmentation was carried out as follows:

(5) Library acquisition

Long amplification primer 5: 5'-TGACGACTACTTCGTTAGCGC

Long amplification primer 7: 5'-TGAAGGTCCTGCGCGTGCATAGATTCGCCTTAGTCTCGTGGGCTCGG

2.1 μl of the i5/i7-long mixture was added per 50 μl reaction:

The reaction volume was 50 μl, and the reaction was run with the following program:

98 °C, 30 s; 16 cycles of (98 °C, 10 s; 65 °C, 30 s; 72 °C, 30 s); 72 °C, 5 min. The PCR supernatant was collected, purified, and checked by electrophoresis; the results are shown in Fig. 9.

(6) Obtaining the sequencing-ready library

After Qubit quantification of the libraries, the D1, D2, D3, and D4 libraries were subjected to steps such as A-tailing and ligation of standard adapters, followed by 16 cycles of library PCR amplification. Library construction was checked by agarose gel electrophoresis; see Fig. 10.

(7) Data analysis after sequencing

PE125 sequencing was performed on a HiSeq 2500, and the preparation of random Indexes was tallied. Through tag preparation, large-fragment capture, and sequencing, a total of 195M reads were obtained, of which 184M reads contained the A1+index+A2+R1 adapters, and 16.5M Index tag clusters were prepared. These reads were then assembled with the SOAPdenovo software, giving a total of 15.9M assembled fragments of various lengths; fragment lengths were mainly distributed between 7 and 11K (about 82%); the results are shown in Fig. 11. The distribution of the fragments on Arabidopsis was also analyzed; the fragments were evenly distributed across the different chromosomes, with coverage above 95% (Fig. 12).

Although the present invention and its embodiments have been described in detail above, it should be understood that those skilled in the art may make improvements to the corresponding conditions and the like without departing from the technical principle of the invention, and such improvements shall also be regarded as falling within the scope of protection of the invention.
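A few ratios can be derived from the Example 2 figures to put them in context. The Python sketch below is a back-of-the-envelope calculation only: the derived percentages are not stated in the text, and comparing the 16.5M prepared Index tag clusters with the theoretical 64^4 Index space is an illustrative choice made here, not a claim of the source.

```python
# Figures quoted for Example 2 (M = million reads or fragments).
TOTAL_READS_M = 195.0
FULL_ADAPTER_READS_M = 184.0   # reads containing A1+index+A2+R1
INDEX_CLUSTERS_M = 16.5
ASSEMBLED_FRAGMENTS_M = 15.9
FRACTION_7_TO_11K = 0.82       # "about 82%" of assembled fragments

THEORETICAL_INDEXES = 64 ** 4  # 16,777,216, from the description


def percent(part: float, whole: float) -> float:
    return 100.0 * part / whole


# Share of reads carrying the complete adapter structure.
full_adapter_pct = percent(FULL_ADAPTER_READS_M, TOTAL_READS_M)

# Prepared Index tag clusters relative to the theoretical Index space.
index_space_pct = percent(INDEX_CLUSTERS_M * 1e6, THEORETICAL_INDEXES)

# Approximate number of assembled fragments in the 7-11K range, in millions.
fragments_7_to_11k_m = ASSEMBLED_FRAGMENTS_M * FRACTION_7_TO_11K

print(f"{full_adapter_pct:.1f}% {index_space_pct:.1f}% {fragments_7_to_11k_m:.2f}M")
```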

Claims

1. A method for preparing Index clusters in batches, characterized in that adapters A1 are each attached to streptavidin-modified nanomagnetic beads to form A1 adapter-nanomagnetic bead complexes; 64 kinds of adapter A containing a Sma I restriction site are then each ligated randomly, one-to-one, onto the A1 adapter-nanomagnetic bead complexes with T4 DNA ligase; the ligation product is digested with Sma I and adapter A is ligated again, for a total of 3 rounds of adapter A ligation and digestion; finally, adapter A2-14U is ligated;

there are 64 kinds of said adapter A1, with sequences as shown in SEQ ID NO. 1-128;

there are 64 kinds of said adapter A, with sequences as shown in SEQ ID NO. 129-256;

the sequence of said adapter A2-14U is:

F: 5'-ACGCATGACTCAdUCGdUCGGCAGCGdUCAdUCTCGCAGTTG;

R: 5'-CAACdUGCGAGAdUGACGCTGCCGACGATGAGTCATGCGT.
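As a rough illustration of how many distinct Index clusters such a scheme can produce: assuming each bead independently receives one of the 64 kinds of adapter A1 followed by three successive rounds of one of the 64 kinds of adapter A, the number of possible combinations is 64 to the fourth power. The Python sketch below only counts and samples combinations under that assumption; the function names are hypothetical, and ligation efficiency or bias is not modelled.

```python
import random

A1_KINDS = 64   # kinds of adapter A1 on the beads
A_KINDS = 64    # kinds of adapter A per ligation round
A_ROUNDS = 3    # rounds of adapter A ligation and Sma I digestion


def index_space_size(a1_kinds, a_kinds, rounds):
    """Number of distinct A1 + A...A combinations, assuming independent choices."""
    return a1_kinds * a_kinds ** rounds


def random_index_cluster(rng, a1_kinds=A1_KINDS, a_kinds=A_KINDS, rounds=A_ROUNDS):
    """Sample one combination as adapter numbers (A1, A round 1, ..., A round n)."""
    a1 = rng.randint(1, a1_kinds)
    a_rounds = tuple(rng.randint(1, a_kinds) for _ in range(rounds))
    return (a1,) + a_rounds


size = index_space_size(A1_KINDS, A_KINDS, A_ROUNDS)
example = random_index_cluster(random.Random(0))
print(size, example)
```

The count is 16,777,216, which is of the same order as the 16.5M Index label clusters reported in the sequencing analysis.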

2. The method for preparing Index clusters in batches as claimed in claim 1, characterized in that the nanomagnetic beads are M280, M270, T1 or Ci streptavidin-modified nanomagnetic beads.

3. A method for constructing a long-fragment nucleic acid library, comprising the following steps:

(1) preparation of the long DNA fragment library

(2) preparation of the tagged label clusters

Two kinds of Index-containing label clusters are prepared by enzymatic digestion and ligation: the first is the Index label cluster, and the second is the long DNA fragment capture Index label cluster; wherein the Index label clusters are prepared by the method for preparing Index clusters in batches according to claim 1, giving Index label clusters containing different labels;

(3) obtaining the long DNA fragment capture complexes

(4) construction of the secondary fragmentation library

The Index label clusters of step (2) are mixed with the long DNA fragment capture complexes and processed in single wells using a PTP plate, ensuring that each well of the PTP plate contains one Index label cluster and one long-fragment capture label cluster bearing a large fragment; under the action of transposase, the large fragment attached to the long-fragment capture label cluster is broken into a diffuse band of fragments whose lengths are concentrated at about 500 bp-600 bp, and each fragment is directly joined, under the action of the transposase, to adapters suitable for the illumina sequencing platform;

(5) obtaining the long-fragment nucleic acid library.

4. The method for constructing a long-fragment nucleic acid library as claimed in claim 3, characterized in that step (1) specifically comprises:

1. fragmentation of genomic DNA

2. end repair, A-tailing and adapter ligation

The purified large fragments are end-repaired with T4 DNA polymerase, T4 polynucleotide kinase and Klenow enzyme and purified with a PCR product purification kit; after end repair and purification, the purified sample is A-tailed and purified again, and adapters are ligated with T4 DNA ligase; the adapter is MACCA, with the sequences:

F: 5'-ACTTCCATCCCCCATCCCCCATCCCCCATCCCGCTCTTCCGATCT

R: 5'-GATCGGAAGAGCACACGTCT

after ligation, the ligation product is purified;

3. fragment selection by gel excision

4. enrichment of the target fragments.

5. The method for constructing a long-fragment nucleic acid library as claimed in claim 4, characterized in that, in step 4., the primers used for PCR amplification and enrichment are:

F: 5'-ACUTCCAUCCCCCAUCCCCCAUCCCCCAUCCC

R: 5'-AGACGTGTGCTCTTCCGATC.
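The relationship between these enrichment primers and the MACCA adapter of claim 4 can be checked mechanically. The Python sketch below is an illustration, not part of the claimed method: the helper names are hypothetical, and it treats U in the forward primer as equivalent to T for comparison purposes.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def as_dna(seq):
    """Treat uracil (U) as thymine (T) so sequences can be compared as DNA."""
    return seq.replace("U", "T")


# Sequences as printed in claims 4 and 5.
MACCA_F = "ACTTCCATCCCCCATCCCCCATCCCCCATCCCGCTCTTCCGATCT"
MACCA_R = "GATCGGAAGAGCACACGTCT"
PRIMER_F = "ACUTCCAUCCCCCAUCCCCCAUCCCCCAUCCC"
PRIMER_R = "AGACGTGTGCTCTTCCGATC"

# Forward primer: same bases as the 5' end of the adapter F strand (U for T).
forward_matches = MACCA_F.startswith(as_dna(PRIMER_F))

# Reverse primer: reverse complement of the adapter R strand.
reverse_matches = reverse_complement(MACCA_R) == PRIMER_R

print(forward_matches, reverse_matches)
```

Both checks hold for the sequences as printed: the forward primer corresponds to the first 32 bases of the MACCA F strand, and the reverse primer is the reverse complement of the MACCA R strand.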

6. The method for constructing a long-fragment nucleic acid library as claimed in claim 3, characterized in that the long DNA fragment capture Index label clusters of step (2) are prepared as follows: nanomagnetic beads are attached to 64 kinds of LF2 adapters to form 64 kinds of LF2 adapter-nanomagnetic bead complexes; 64 kinds of AU adapters containing a Sma I restriction site are then each ligated randomly, one-to-one, onto one kind of LF2 with T4 DNA ligase; after the ligation product is digested with Sma I, one kind of adapter AU is again ligated randomly, one-to-one, onto the LF2 adapter-nanomagnetic bead complexes, for a total of 3 rounds of adapter AU ligation and digestion;

the sequences of said adapter AU are as shown in SEQ ID NO. 385-512.

7. The method for constructing a long-fragment nucleic acid library as claimed in claim 3, characterized in that the long DNA fragment capture complexes of step (3) are obtained by the following steps:

3. obtaining the long DNA fragment capture complexes.

8. The method for constructing a long-fragment nucleic acid library as claimed in claim 7, characterized in that the enzyme treatment in step 1. is carried out at 37 °C for 150 minutes, and the enzyme treatment in step 2. is carried out at 37 °C for 60 minutes.

9. The method for constructing a long-fragment nucleic acid library as claimed in claim 3, characterized in that the transposase reaction system of step (4) is reacted under the following conditions: 55 °C, 10 min; 45 °C, 60 min.

10. The method for constructing a long-fragment nucleic acid library as claimed in claim 3, characterized in that the primers for the PCR amplification of step (5) are long amplification primer 5 and long amplification primer 7:

long amplification primer 5: 5'-TGACGACTACTTCGTTAGCGC; long amplification primer 7: 5'-TGAAGGTCCTGCGCGTGCATAGATTCGCCTTAGTCTCGTGGGCTCGG.