WO2012028105A1 - Sequencing library and preparation method thereof, and method and system for determining nucleic acid end sequences - Google Patents

Sequencing library and preparation method thereof, and method and system for determining nucleic acid end sequences

Info

Publication number
WO2012028105A1
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
sequencing
library
vector
preparing
Prior art date
Application number
PCT/CN2011/079213
Other languages
English (en)
French (fr)
Inventor
韩长磊
徐讯
Original Assignee
深圳华大基因科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳华大基因科技有限公司
Publication of WO2012028105A1

Links

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/10: Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034: Isolating an individual clone by screening libraries
    • C12N15/1093: General methods of preparing gene libraries, not provided for in other subgroups

Definitions

  • This invention relates to the field of molecular biology, and more particularly to the field of genomics.
  • The present invention relates to sequencing libraries and methods for their preparation, and to methods and systems for determining nucleic acid end sequences based on such sequencing libraries.
  • Background
  • In genome sequencing, genomic DNA is typically cloned into a vector before sequencing.
  • Commonly used vectors include Fosmid and bacterial artificial chromosome (BAC) vectors, which accept large, stable inserts and are important tools for genomics research. A BAC can usually accept a 100-200 kb insert and a Fosmid an insert of about 40 kb; both play an important role in map-based cloning, gene analysis, structural variation studies and genome assembly.
  • A variety of other vectors are also used in sequencing.
  • The ends of vector clones containing the DNA to be tested are usually sequenced to construct overlapping clones.
  • However, sequencing the ends of clones is difficult. Although automated devices are being developed (Kelley, J.M. et al. 1999. High throughput direct end sequencing of BAC clones. Nucleic Acids Res. 27: 1539-1546), the work remains time-consuming and laborious for hundreds of thousands of clones.
  • NGS: second-generation sequencing (also called next-generation sequencing).
  • N50: all assembled sequences are sorted from longest to shortest and their lengths are summed in that order; N50 is the length of the sequence at which the running sum first reaches fifty percent of the total length of all assembled sequences.
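The N50 statistic can be illustrated with a short Python sketch. It follows the usual definition (sort the lengths from longest to shortest and report the length at which the running sum first reaches half of the total); the function name `n50` and the example lengths are hypothetical and are not taken from the patent.

```python
def n50(lengths):
    """Return the N50 of a collection of assembled sequence lengths."""
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2            # fifty percent of the total length
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical scaffold lengths, for illustration only.
example_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(example_lengths))
```

With these example lengths the total is 300, and the running sum reaches 150 at the second-longest sequence, so the sketch reports 70.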
  • WO2010003316 A1 discloses a method for synthesizing a vector for constructing Fosmid clones in which the cleavage sites of the 4-base-recognizing endonucleases FspBI and Csp6I are mutated. A Fosmid clone carrying an exogenous insert is digested with these two endonucleases, the target fragment is recovered and circularized, and the two end sequences of the Fosmid clone can then be paired-end sequenced on a second-generation sequencer.
  • However, the method of WO2010003316 A1 is greatly limited: because the restriction sites of FspBI and Csp6I are not evenly distributed in the genome, the ends of Fosmid clones covering certain regions cannot be obtained, and end sequencing of BACs, which carry longer inserts, is not possible.
  • The method of WO2010003316 A1 also requires selecting a specific vector for the restriction sites or modifying an existing vector, which adds complexity, and the lack of vector versatility makes the method difficult to use widely.
  • an aspect of the present invention provides a method capable of preparing a sequencing library, and a sequencing library prepared by the method can be effectively used for determining a terminal sequence of a nucleic acid.
  • According to an embodiment, a method of preparing a sequencing library comprises the steps of: randomly fragmenting a construct to obtain a plurality of random fragments, wherein the construct consists of the DNA to be tested and a vector into which the DNA is inserted; separating the plurality of random fragments based on the length of the vector to obtain library fragments, wherein the length of the library fragments is greater than the length of the vector; self-ligating the library fragments to obtain circular molecules; and amplifying the circular molecules to obtain amplification products, the amplification products constituting the sequencing library.
  • The amplified fragments in a sequencing library obtained by this method contain both ends of the DNA to be tested, so the end sequences of the DNA to be tested can be determined efficiently and accurately by conventional sequencing methods, for example high-throughput methods using SOLEXA, SOLID, 454 or single-molecule sequencing devices.
  • Another aspect of the present invention provides a sequencing library which is prepared according to the above method for preparing a sequencing library.
  • Because the amplified fragments in the sequencing library according to an embodiment of the present invention contain both ends of the DNA to be tested, the terminal sequences of the DNA to be tested can be determined efficiently and accurately by conventional sequencing methods, for example high-throughput methods using SOLEXA, SOLID, 454 or single-molecule sequencing devices.
  • A further aspect of the invention provides a method of determining a nucleic acid end sequence, comprising the steps of: inserting the nucleic acid into a vector; preparing a sequencing library of the nucleic acid according to the method for preparing a sequencing library described above; and sequencing the sequencing library to obtain end sequence information of the nucleic acid.
  • the terminal sequence of a nucleic acid can be efficiently and accurately determined.
  • A further aspect of the invention provides a system for determining a nucleic acid end sequence, comprising: 1) a DNA fragmentation device for randomly fragmenting a construct to obtain a plurality of random fragments, wherein the construct consists of the DNA to be tested and a vector into which the DNA to be tested is inserted; 2) a separation device, connected to the DNA fragmentation device, for separating the plurality of random fragments based on the length of the vector to obtain library fragments; 3) a cyclization device, connected to the separation device, for self-ligating the library fragments to obtain circular molecules; 4) an amplification device, connected to the cyclization device, for amplifying the circular molecules to obtain amplification products, the amplification products constituting the sequencing library; and 5) a sequencing device, connected to the amplification device, for sequencing the sequencing library to obtain the end sequence of the nucleic acid.
  • the inventors of the present application have found that a method of preparing a sequencing library, a prepared sequencing library, and a method and system for determining a nucleic acid terminal sequence according to an embodiment of the present invention are particularly suitable for high-throughput sequencing.
  • FIG. 1 shows a partial schematic diagram of a method of preparing a sequencing library according to one embodiment of the present invention
  • FIG. 2 shows a partial schematic diagram of a method of preparing a sequencing library according to one embodiment of the present invention, in which white represents the inserted DNA, black represents the vector sequence, and dotted segments represent primer sequences that pair with the black vector sequence;
  • Figure 3 shows a flow diagram of a method of preparing a sequencing library in accordance with one embodiment of the present invention
  • FIG. 4 is a block diagram showing the structure of a sequencing system according to an embodiment of the present invention.
  • FIG. 5 is a block diagram showing the structure of a sequencing system in accordance with another embodiment of the present invention.
  • Detailed description of the invention
  • a method of preparing a sequencing library is presented.
  • the method of preparing a sequencing library according to an embodiment of the present invention will be described in detail below with reference to Figs. 1-3.
  • The method for preparing a sequencing library comprises a step S100 of randomly fragmenting a construct containing the DNA to be tested, a step S200 of separating, from the random fragments, fragments longer than the vector as library fragments, a step S300 of self-ligating the library fragments to obtain circular molecules, and a step S400 of amplifying the circular molecules to obtain amplification products.
  • the construct composed of the DNA to be tested and the vector is subjected to random interruption processing, whereby a plurality of random fragments can be obtained.
  • the DNA to be tested is inserted into a vector to constitute a construct.
  • The type of vector according to an embodiment of the present invention is not particularly limited.
  • A plasmid may be used as the vector, which facilitates operation; for example, the DNA to be tested can be inserted into the multiple cloning site region of the plasmid.
  • the plasmid is at least one selected from the group consisting of a Fosmid plasmid, a BAC plasmid, and a Cosmid plasmid.
  • a larger DNA fragment can be inserted into the plasmid, thereby improving the efficiency and accuracy of sequencing.
  • the method and apparatus for performing random interruption processing on a construct are not particularly limited.
  • According to an embodiment, random fragmentation of the construct is performed by a physical method, which does not damage the chemical groups of the DNA to be tested and thereby improves the accuracy and efficiency of subsequent sequencing.
  • Examples of physical methods for random fragmentation include, but are not limited to, high-pressure gas nebulization, ultrasonication and hydrodynamic shearing. According to a specific example of the present invention, a HydroShear DNA shearing device is most preferred, as it fragments the construct efficiently.
  • In the HydroShear device, when the solution containing the nucleic acid passes through a narrow channel, the fluid accelerates and the resulting force breaks the nucleic acid; the flow rate and the channel parameters determine the fragment size.
  • The parameters of the HydroShear device can be set such that the random fragments are several tens of bp to several hundred bp longer than the vector.
  • In other words, the instrument parameters can be set such that the random fragments are longer than the vector.
  • Preferably, the random fragments are 50 bp to 800 bp longer than the vector (e.g., for an 8.2 kb vector, the construct is fragmented to 8.25-9.0 kb); more preferably, they are 200 bp to 800 bp longer than the vector.
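The size ranges above are simple arithmetic on the vector length, as the following Python sketch shows. The helper names `size_window` and `in_window` are hypothetical; the 8.2 kb vector and the 50-800 bp and 200-800 bp margins are the values given in the text.

```python
def size_window(vector_bp, min_extra_bp=50, max_extra_bp=800):
    """Return the (shortest, longest) target fragment length in bp."""
    return vector_bp + min_extra_bp, vector_bp + max_extra_bp


def in_window(fragment_bp, vector_bp, min_extra_bp=50, max_extra_bp=800):
    """True if a fragment exceeds the vector length by the allowed margin."""
    low, high = size_window(vector_bp, min_extra_bp, max_extra_bp)
    return low <= fragment_bp <= high


vector_bp = 8200  # the 8.2 kb vector used as the example in the text
print(size_window(vector_bp))                    # general range
print(size_window(vector_bp, min_extra_bp=200))  # more preferred range
print(in_window(8700, vector_bp))
```

For the 8.2 kb vector this reproduces the 8.25-9.0 kb range quoted in the text, and 8.4-9.0 kb for the more preferred 200-800 bp margin.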
  • Before the construct is randomly fragmented, it may be digested with a restriction endonuclease that has no recognition site on the vector.
  • A restriction endonuclease with a 6-base recognition site can be used.
  • a nucleic acid restriction enzyme such as ⁇ / ⁇ or C3 ⁇ 4I can be used.
  • the efficiency of constructing the sequencing library and subsequently determining the nucleic acid end sequence method can be further improved.
  • a plurality of random fragments obtained in step S100 are separated based on the length of the vector to obtain a library fragment, wherein the length of the library fragment is greater than the length of the vector.
  • the length of the library fragment larger than the vector is not particularly limited and may be several tens of bp to several hundred bp.
  • The library fragments are about 50 bp to about 800 bp longer than the vector (e.g., if the vector size is 8.2 kb, the length of the isolated library fragments is about 8.25-9.0 kb).
  • the isolated library fragments are from about 200 bp to about 800 bp longer than the length of the vector.
  • the ratio of the end sequence of the DNA to be detected in the prepared sequencing library can be increased, thereby further improving the efficiency of constructing the sequencing library and subsequently determining the nucleic acid end sequence method.
  • a method and apparatus for separating a random fragment to obtain a library fragment are not particularly limited.
  • Library fragments of a specific length can be selected from the random fragments by at least one of gel electrophoresis and gradient sedimentation (as shown in FIGS. 1C and 1D), so that the random fragments are separated conveniently and quickly and library fragments of a particular length are readily obtained.
  • According to embodiments of the present invention, the size and concentration of the isolated fragments can optionally be determined, for example with an Agilent 2100 Bioanalyzer, to facilitate subsequent processing.
  • Referring to Figures 2A and 2B, the isolated library fragments are self-ligated to obtain circular molecules.
  • the method and apparatus for self-ligating a library fragment are not particularly limited and can be carried out by methods known in the art.
  • ligation can be achieved using T4 ligase to effect cyclization of the library fragments.
  • The concentration of the library fragments is not higher than about 2 ng/µl, which prevents different nucleic acid fragments from ligating to each other and cyclizing into larger circles.
  • The circular molecules are amplified to obtain amplification products, and these amplification products constitute a sequencing library according to an embodiment of the invention. Because the isolated library fragments are longer than the vector, the circular molecules include molecules in which the vector sequence remains intact and the insertion site of the vector carries at least one of the two end sequences of the DNA to be tested. Since the vector sequence is known, amplification products containing the end sequences can be amplified with suitable primers designed from the vector sequence, thereby constituting a sequencing library according to an embodiment of the present invention.
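The geometry behind this argument can be sketched in Python. The sketch assumes a simplified model that is not part of the original text: the construct is a circle with the vector occupying coordinates 0 to `vector_bp` and the insert filling the rest, and a random fragment is described by a start coordinate and a length. The function name `flanking_ends` and the 40 kb insert are illustrative assumptions.

```python
def flanking_ends(vector_bp, insert_bp, start, fragment_bp):
    """Return the insert bases kept on each side of an intact vector.

    The circular construct has the vector at [0, vector_bp) and the insert
    at [vector_bp, vector_bp + insert_bp). The fragment covers fragment_bp
    bases starting at coordinate `start`, wrapping around the circle.
    Returns (bases_before_vector, bases_after_vector), or None when the
    fragment does not contain the whole vector.
    """
    circle_bp = vector_bp + insert_bp
    if not 0 < fragment_bp <= circle_bp:
        raise ValueError("fragment must fit on the circular construct")
    # Distance from the fragment start to the start of the vector.
    offset = (-start) % circle_bp
    if offset + vector_bp > fragment_bp:
        return None  # the vector is cut, so this fragment is not useful
    return offset, fragment_bp - offset - vector_bp


# Hypothetical 8.2 kb vector with a 40 kb insert and an 8.7 kb fragment.
print(flanking_ends(8200, 40000, 47900, 8700))  # both insert ends retained
print(flanking_ends(8200, 40000, 0, 8700))      # only one insert end retained
print(flanking_ends(8200, 40000, 1000, 8700))   # vector not intact
```

In this model a fragment only slightly longer than the vector can contain the whole vector only if it is centred on it, so the retained insert bases are exactly the end sequences adjacent to the cloning site; after self-ligation, primers designed from the vector sequence can amplify across them.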
  • Amplification of the circular molecules to obtain amplification products is carried out by PCR using a DNA polymerase with terminal A-addition activity and vector-specific primers.
  • The A-tailed amplification product can be ligated directly to the adapters required by the chosen sequencing platform, which eliminates a separate A-tailing step and reduces product loss.
  • 18-20 cycles of PCR amplification are performed in accordance with specific examples of the invention. Thereby, the fidelity of the PCR reaction can be improved, thereby improving the accuracy of subsequent sequencing.
  • Before the circular molecules are amplified, the library fragments may be digested with at least one of an ATP-dependent DNase and exonuclease I, which do not degrade the vector, to remove non-circular molecules.
  • The plurality of random fragments may be blunt-ended before they are separated.
  • the efficiency of the cyclization reaction in the subsequent cyclization step can be improved, thereby improving the efficiency and accuracy of preparing the sequencing library and subsequent sequencing.
  • the library fragments can be blunt-ended prior to self-ligation of the library fragments. Also, this can increase the efficiency of the cyclization reaction in the subsequent cyclization step, thereby improving the efficiency and accuracy of preparing the sequencing library and subsequent sequencing.
  • the above-described blunt-end treatment can be carried out by using at least one selected from the group consisting of Klenow enzyme, T4 polymerase and T4 polynucleotide kinase.
  • Separation (step 3): the end-repaired random fragments from step 2) are separated to obtain random fragments 50 bp to 800 bp longer than the vector;
  • Cyclization (step 4): the random fragments separated in step 3) are self-ligated to form circular molecules, and the fragments that are not self-ligated are then removed;
  • Amplification: primers are designed according to the vector sequence, and the fragments of the DNA to be tested in the circular molecules are amplified to obtain amplification products, which constitute the sequencing library.
  • Random fragmentation: the vector carrying the DNA to be tested is randomly fragmented to obtain random fragments;
  • Separation: the random fragments from step A are separated to obtain random fragments 50 bp to 800 bp longer than the vector;
  • End repair: the random fragments obtained in step B are end-repaired to make the ends blunt;
  • Amplification: primers are designed according to the vector sequence, and the fragments of the DNA to be tested in the circular molecules are amplified to obtain amplification products, which constitute the sequencing library.
  • a further aspect of the invention proposes a sequencing library which can be prepared by the above-described method of preparing a sequencing library according to an embodiment of the invention.
  • the terminal sequence of the DNA to be tested can be efficiently and accurately determined.
  • the present invention also provides a method for determining a nucleic acid end sequence, comprising the steps of: inserting a nucleic acid into a vector; preparing a sequencing library of the nucleic acid according to the aforementioned method for preparing a sequencing library; and sequencing the sequencing library to The end sequence information of the nucleic acid is obtained.
  • The preparation of the sequencing library has been described in detail above and will not be described again. It should be noted that the term "nucleic acid" as used herein is not limited to DNA, but may also include RNA.
  • An RNA sequence can be converted into the corresponding DNA sequence by a conventional method, for example reverse transcription, and the method for preparing a sequencing library according to an embodiment of the present invention can then be applied to prepare a sequencing library of the RNA and thereby determine the end sequence of the RNA.
  • The method and apparatus for sequencing a sequencing library are not particularly limited. In view of the maturity of the technique, second-generation sequencing technologies such as SOLEXA, SOLID and 454 may be employed according to embodiments of the present invention.
  • Single-molecule sequencing technologies may also be used, such as Helicos' True Single Molecule DNA sequencing technology, Pacific Biosciences' single molecule, real-time (SMRT™) technology, and nanopore sequencing technology from Oxford Nanopore Technologies (Rusk, Nicole (2009-04-01). Cheap Third-Generation Sequencing. Nature Methods 6 (4): 244-245).
  • Steps for sequencing using second generation sequencing techniques can be performed by those skilled in the art in accordance with the instructions provided by the manufacturer.
  • With a second-generation sequencing platform, it is usually necessary to end-repair the amplification product to blunt the ends and then add sequencing adapters before sequencing.
  • Alternatively, a high-fidelity polymerase with terminal A-addition activity can be used directly in the amplification step, so that the A-tailed amplification product can be ligated directly to the adapters required by the chosen sequencing platform, which eliminates a separate A-tailing step and reduces product loss.
  • Ligation to the sequencing adapters can be carried out by methods known in the art, often using T4 ligase.
  • The insertion site of the vector is often a multiple cloning site containing several restriction endonuclease sites. According to an embodiment of the present invention, these restriction enzymes can therefore be used to digest the amplification product, which reduces the vector sequence contained in the amplification product so that a longer end sequence is obtained within the sequencing read length.
  • The digested product can be end-repaired before the adapters are added, for example by filling in the ends with enzymes such as Klenow, T4 polymerase and T4 polynucleotide kinase together with dNTPs, followed by A-tailing with a Klenow fragment lacking exonuclease activity.
  • After the terminal sequences of the nucleic acid are obtained, they can also be assembled and spliced together with sequences obtained by a conventional method, thereby obtaining assembly fragments longer than those obtained by assembly and splicing with the conventional method alone.
  • Assembly and/or splicing can be performed using methods and apparatus known to those skilled in the art; for example, SOAPdenovo software can be used (this software is available free of charge, for example from http://soap.genomics.org.cn/soapdenovo.html; see Li et al. 2010. De novo assembly of human genomes with massively parallel short read sequencing. Genome Res.).
  • According to an embodiment, a method of sequencing a nucleic acid comprises: dividing the nucleic acid into two portions; determining the end sequences from one portion according to an embodiment of the present invention, and obtaining nucleic acid fragment sequence information from the other portion by a conventional method;
  • and assembling and splicing the obtained end sequences with the nucleic acid fragment sequence information obtained by the conventional method to obtain assembly fragments (scaffolds).
  • the assembly fragment thus obtained is significantly larger than the assembly fragment obtained by directly assembling and splicing the nucleic acid fragments obtained by the conventional method.
  • the assembly fragment obtained by the nucleic acid sequencing method according to the embodiment of the present invention may have an N50 value of 5 kb or more, and may be 10 kb or more, or even 20 kb or more, according to a specific example of the present invention.
  • According to an embodiment, a method of determining a nucleic acid sequence comprises: dividing a nucleic acid sample into a first nucleic acid sample and a second nucleic acid sample, the first nucleic acid sample and the second nucleic acid sample having the same composition; using the first nucleic acid sample, preparing a sequencing library according to the method of the embodiment of the present invention and determining the end sequence information of the nucleic acid by sequencing; using the second nucleic acid sample, obtaining nucleic acid fragment sequence information of the nucleic acid according to a conventional sequencing method, wherein the conventional sequencing method is at least one selected from the group consisting of SOLEXA, SOLID, 454 and single-molecule sequencing technology; and assembling and splicing the end sequence information of the nucleic acid and the nucleic acid fragment sequence information to determine the sequence of the nucleic acid.
  • a system 1000 for determining a nucleic acid end sequence includes a DNA fragmentation device 100, a separation device 200, a cyclization device 300, an amplification device 400, and a sequencing device 500.
  • the DNA fragmentation apparatus 100 is used to randomly break the construct to obtain a plurality of random fragments.
  • the construct is composed of the DNA to be tested and the vector, and the DNA to be tested is inserted into the vector.
  • the separation device 200 is coupled to the DNA fragmentation device 100 for separating a plurality of random fragments based on the length of the vector to obtain a library fragment.
  • The cyclization device 300 is coupled to the separation device 200 for self-ligating the library fragments to obtain circular molecules.
  • the amplification device 400 is coupled to the cyclization device 300 for amplifying the circular molecule to obtain an amplification product, and the amplification product constitutes the sequencing library.
  • the sequencing device 500 is coupled to the amplification device 400 for sequencing the sequencing library to obtain the end sequence of the nucleic acid.
  • the method according to the embodiment of the present invention can be efficiently carried out, and the terminal sequence of the nucleic acid can be efficiently and accurately obtained.
  • the term "connected” as used herein shall be understood broadly, either directly or indirectly through a medium.
  • the DNA fragmentation device can be a HydroShear DNA shear.
  • the sequencing device may be at least one selected from the group consisting of SOLEXA, SOLID, 454, and single molecule sequencing devices. It has been described in detail above and will not be described here.
  • the system for determining a nucleic acid end sequence further includes at least one of a pretreatment device 101, a blunt device 201, and a purification device 301.
  • The pretreatment device 101 is configured to digest the construct with a restriction endonuclease that has no recognition site on the vector before the construct is randomly fragmented.
  • The blunt-ending device 201 is configured to blunt-end the plurality of random fragments before they are separated, or to blunt-end the library fragments before they are self-ligated.
  • The purification device 301 is configured to remove non-circular molecules by digesting the library fragments with at least one of an ATP-dependent DNase and exonuclease I, which do not degrade the vector, before the circular molecules are amplified.
  • After blunt-ending, the DNA was electrophoresed on a 0.6% Megabase agarose gel at 5 V/cm for 5 hours. After staining, DNA in the 8.2-9.0 kb size range was cut out under a Dark Reader and purified using the QIAquick Gel Purification Kit.
  • To the recovered 36.75 µl sample were added, in order, 5 µl 10x Ex Taq buffer, 4 µl 2.5 mM dNTP, 2 µl each of 10 µM forward primer F1 and reverse primer R1, and 0.25 µl Ex Taq (5000 units/ml, Takara), and the sample was subjected to PCR amplification.
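A quick arithmetic check of this reaction mix can be sketched in Python. The volumes below are a reading of the reaction description that takes 2 µl for each primer, and the concentration units (mM for the dNTP stock, µM for the primer stocks) are assumptions that the extracted text does not state unambiguously; the helper name `final_concentration` is hypothetical.

```python
# Component volumes in microlitres, as read from the reaction description.
volumes_ul = {
    "recovered sample": 36.75,
    "10x Ex Taq buffer": 5.0,
    "dNTP mix": 4.0,
    "forward primer F1": 2.0,
    "reverse primer R1": 2.0,
    "Ex Taq polymerase": 0.25,
}
total_ul = sum(volumes_ul.values())


def final_concentration(stock, volume_ul, total_volume_ul):
    """Dilute a stock concentration into the total reaction volume."""
    return stock * volume_ul / total_volume_ul


buffer_x = final_concentration(10, 5.0, total_ul)    # stock is 10x
dntp_mm = final_concentration(2.5, 4.0, total_ul)    # assumed stock unit: mM
primer_um = final_concentration(10, 2.0, total_ul)   # assumed stock unit: µM
enzyme_units = (5000 / 1000) * 0.25                  # 5000 units/ml = 5 units/µl

print(total_ul, buffer_x, dntp_mm, primer_um, enzyme_units)
```

Under these assumptions the components add up to a 50 µl reaction with 1x buffer, which is a useful consistency check on the reading of the volumes.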
  • the specific sequence of the primers used is as follows:
  • R1: GTACAACGACACCTAGAC (SEQ ID NO: 2).
  • the PCR procedure is as follows:
  • the samples were then purified using a Qiagen MinElute PCR Purification Kit.
  • TCTTCCGATCT SEQ ID NO: 4
  • the PCR procedure is as follows:
  • The reaction product was then denatured at 65 °C for 10 minutes and placed on water. The denatured product was electrophoresed on a 2.0% Low Range Ultra agarose gel at 15 V/cm for 2 hours; after staining, DNA in the 400-700 bp size range was cut out under a Dark Reader and purified using the Qiagen MinElute Gel Purification Kit.
  • the purified product was sequenced on Illumina GA (Solexa) for 76 cycles.
  • A total of 15,225,082 sequence pairs were obtained; after duplicates were removed, 2,865,235 clean pairs remained (i.e., unique sequences obtained by removing sequences that were determined repeatedly).
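The source does not say how repeated sequences were identified. As a minimal sketch, assuming a duplicate is a read pair whose two sequences are identical to those of an earlier pair, duplicate removal and the retained fraction can be computed as follows; the function name `unique_pairs` and the toy reads are hypothetical, while the two counts are those reported in the text.

```python
def unique_pairs(pairs):
    """Keep the first occurrence of each (read1, read2) pair, in order."""
    seen = set()
    kept = []
    for pair in pairs:
        if pair not in seen:
            seen.add(pair)
            kept.append(pair)
    return kept


# Hypothetical toy reads, not data from the source.
toy_pairs = [("ACGT", "TTGA"), ("ACGT", "TTGA"), ("GGCA", "TTGA"), ("ACGT", "TTGA")]
print(unique_pairs(toy_pairs))

# Fraction of pairs retained, using the counts reported in the text.
retained = 2_865_235 / 15_225_082
print(f"{retained:.2%} of the sequenced pairs were unique")
```

With the reported counts, roughly 18.8% of the sequenced pairs remain after duplicate removal.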
  • The sequencing results were aligned to the genome map of the originally obtained assembly fragments (scaffolds), which had an N50 of 2.3 Mb (prepared as described in the comparative example), and the pairs with unique alignment positions on the same assembly fragment (scaffold) were identified.
  • Of the pairs located on the same assembly fragment (scaffold), 209,600 pairs were separated by less than 500 bp and 531,028 pairs were separated by more than 10 kb; of the latter, 520,897 pairs (98.09%) were separated by 30-50 kb.
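The classification of read pairs by distance can be sketched in Python. The function name `summarize_distances` and the example distances are hypothetical; the thresholds (less than 500 bp, more than 10 kb, 30-50 kb) and the two counts in the final calculation are those reported in the text.

```python
def summarize_distances(distances_bp):
    """Count read pairs by the distance between their mapped positions."""
    short = sum(1 for d in distances_bp if d < 500)
    long_pairs = [d for d in distances_bp if d > 10_000]
    in_range = sum(1 for d in long_pairs if 30_000 <= d <= 50_000)
    return {"short": short, "long": len(long_pairs), "long_30_50kb": in_range}


# Hypothetical distances between the two reads of each pair, in bp.
example = [180, 420, 35_000, 41_000, 12_000, 48_500, 5_000]
print(summarize_distances(example))

# Share of long-distance pairs in the 30-50 kb range, from the reported counts.
share = 520_897 / 531_028
print(f"{share:.2%}")
```

The final line reproduces the 98.09% figure given in the text for pairs separated by 30-50 kb, the range expected for Fosmid-sized inserts.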
  • the same genomic sample as in Example 1 was used, and the following procedure was used for sequencing.
  • Sequencing followed the standard procedure provided by Illumina. Specifically, sequencing libraries with insert sizes of 165-175 bp, 450-550 bp and 720-880 bp were constructed with Genomic DNA Sample Prep Kits (Illumina, USA) according to the manufacturer's instructions, and sequencing libraries with insert sizes of 2.4-2.7 kb, 5.7-6.3 kb and 10-11 kb were constructed with the Paired-End Sample Prep Kit (Illumina, USA) according to the manufacturer's instructions.
  • The two sets of libraries were then sequenced on Illumina GA (Solexa). The valid data (the raw sequencing data with duplicates and erroneous results removed) reached 60X coverage (sequencing depth) and were assembled with SOAPdenovo.
  • The resulting assembly fragments (scaffolds) had an N50 of 2.3 Mb.
  • the method of the present invention can effectively increase the length of the assembled scaffold.
  • the increase in the length of the scaffold facilitates the subsequent positioning of various molecular markers and the study of related genes or traits.
  • The method of preparing a sequencing library, the prepared sequencing library, and the method and system for determining the end sequence of a nucleic acid according to an embodiment of the present invention (hereinafter collectively referred to as "the technical solution according to the present invention") are particularly suitable for high-throughput sequencing and have at least one of the following advantages:
  • The technical solution according to the present invention eliminates cumbersome steps such as picking clones and preparing plasmids from single clones, greatly saving time and money;
  • The present invention uses random fragmentation to break the plasmid DNA of vector clones such as BAC or Fosmid clones, and the end fragments can be obtained by inverse PCR. This overcomes the bias of the restriction-digestion-based Fosmid clone end sequencing method and requires no modification or selection of the vector with respect to restriction sites, so the technical solution according to the present invention is widely applicable.
  • high-throughput terminal sequencing of a vector into which a larger fragment is inserted such as BAC, can be performed without restriction of the restriction site.
  • the technical solution according to the present invention as a helper method can greatly increase the length of the scaffold in the de novo sequencing of the genome.
  • the increase in the length of the assembled fragments facilitates the subsequent localization of various molecular markers and the study of related genes or traits.

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Biomedical Technology (AREA)
  • Wood Science & Technology (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Plant Pathology (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Description

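Sequencing Library and Preparation Method Thereof, and Method and System for Determining Nucleic Acid End Sequences
Priority Information
This application claims priority to and the benefit of Patent Application No. 201010272706.5, filed with the State Intellectual Property Office of China on September 1, 2010, the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of molecular biology, in particular to the field of genomics. Specifically, the present invention relates to a sequencing library and a method for preparing it, and to a method and system for determining nucleic acid end sequences based on the sequencing library.
Background Art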
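In genome sequencing, genomic DNA is usually cloned into a vector before being sequenced. Commonly used vectors include Fosmids and bacterial artificial chromosomes (BACs); both accept large inserts and are stable, which makes them important tools in genomics research. A BAC can typically carry an insert of 100-200 kb and a Fosmid an insert of about 40 kb, and both play important roles in map-based gene cloning, gene analysis, structural variation studies and genome assembly. Various other vectors are also used in sequencing.
First-generation sequencing usually requires sequencing the ends of vector clones containing the DNA to be sequenced in order to build overlapping clones. However, because such clones are low-copy and the large inserts form secondary structures, end sequencing of clones is difficult. Even though automated equipment has been developed (Kelley, J.M. et al. 1999. High throughput direct end sequencing of BAC clones. Nucleic Acids Res. 27: 1539-1546), the work remains time-consuming and laborious for hundreds of thousands of clones.
Second-generation sequencing (next generation sequencing, NGS) is a high-throughput sequencing technology that improves on first-generation sequencing; platforms such as SOLEXA, SOLID and 454 (Metzker ML. Sequencing technologies - the next generation. Nat Rev Genet. 2010 Jan; 11(1):31-46) have driven the rapid development of high-throughput sequencing. Nevertheless, current sequencing methods still need improvement.
Summary of the Invention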
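The inventors of the present application have found that, for sequencing platforms with short read lengths (such as Illumina/Solexa), assembly after sequencing is difficult: the N50 of the resulting assembled fragments (scaffolds) is low and the assembly results are unsatisfactory. As used herein, the term "N50" is defined as follows: all assembled sequences are sorted from longest to shortest and their lengths are summed in that order; the N50 is the length of the assembled sequence at which the running sum reaches fifty percent of the total length of all assembled sequences. For a detailed description of N50, see Miller et al. 2010. Assembly algorithms for next generation sequencing data. Genomics 95(6): 315-327, which is incorporated herein by reference in its entirety. To overcome the assembly difficulty, the ends of large fragments need to be sequenced. WO2010003316 A1 discloses a method in which vectors for constructing Fosmid clones are synthesized such that the recognition sites present in these vectors for the 4-base-recognizing endonucleases FspBI and Csp6I are mutated. Fosmid clones carrying foreign inserts are digested with these two endonucleases, the target digestion fragments are recovered and circularized, and the two end sequences of each Fosmid clone are thereby obtained, which can then be paired-end sequenced on a second-generation sequencer. However, the inventors have found that the method of WO2010003316 A1 is severely limited: FspBI and Csp6I sites are not evenly distributed across the genome, so the ends of Fosmid clones covering certain regions cannot be obtained, and BACs carrying longer inserts cannot be end-sequenced. In addition, the method of WO2010003316 A1 requires choosing specific vectors, or modifying existing vectors, with respect to the restriction sites, which adds complexity; the lack of vector generality also makes the method difficult to use widely.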
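As a worked illustration of the N50 definition given above, the following minimal Python sketch computes N50 from a list of assembled sequence lengths. The function name `n50` and the example lengths are illustrative assumptions; they are not data from this application.

```python
def n50(lengths):
    """Return the N50 of a collection of assembled sequence lengths.

    Sequences are sorted from longest to shortest and summed in that order;
    the N50 is the length of the sequence at which the running sum first
    reaches half of the total assembled length.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Made-up scaffold lengths (arbitrary units), for illustration only.
scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(scaffolds))  # 70
```

With these made-up lengths the total is 300, and the running sum reaches half of it (150) at the sequence of length 70, so the N50 is 70.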
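The present invention aims to solve at least one of the technical problems existing in the prior art. To this end, one aspect of the present invention provides a method for preparing a sequencing library; a sequencing library prepared by this method can be used effectively to determine the end sequences of a nucleic acid.
A method for preparing a sequencing library according to an embodiment of the present invention comprises the following steps: randomly fragmenting a construct to obtain a plurality of random fragments, wherein the construct consists of a DNA to be sequenced and a vector, and the DNA to be sequenced is inserted in the vector; separating the plurality of random fragments on the basis of the length of the vector to obtain library fragments, wherein the length of the library fragments is greater than the length of the vector; self-ligating the library fragments to obtain circular molecules; and amplifying the circular molecules to obtain amplification products, the amplification products constituting the sequencing library. The amplification product fragments in a sequencing library obtained in this way contain both ends of the DNA to be sequenced, so the end sequences of the DNA to be sequenced can be determined efficiently and accurately by conventional sequencing methods, for example high-throughput sequencing using SOLEXA, SOLID, 454 or a single-molecule sequencing device.
Another aspect of the present invention provides a sequencing library prepared by the above method for preparing a sequencing library. As described above, the amplification product fragments in a sequencing library according to an embodiment of the present invention contain both ends of the DNA to be sequenced, so the end sequences of the DNA to be sequenced can be determined efficiently and accurately by conventional sequencing methods, for example high-throughput sequencing using SOLEXA, SOLID, 454 or a single-molecule sequencing device.
A further aspect of the present invention provides a method for determining nucleic acid end sequences, comprising the following steps: inserting the nucleic acid into a vector; preparing a sequencing library of the nucleic acid by the above method for preparing a sequencing library; and sequencing the sequencing library to obtain end sequence information of the nucleic acid. With the method for determining nucleic acid end sequences according to an embodiment of the present invention, the end sequences of a nucleic acid can be determined efficiently and accurately.
Yet another aspect of the present invention provides a system for determining nucleic acid end sequences, comprising: 1) a DNA fragmentation device for randomly fragmenting a construct to obtain a plurality of random fragments, wherein the construct consists of a DNA to be sequenced and a vector, and the DNA to be sequenced is inserted in the vector; 2) a separation device, connected to the DNA fragmentation device, for separating the plurality of random fragments on the basis of the length of the vector to obtain library fragments; 3) a circularization device, connected to the separation device, for self-ligating the library fragments to obtain circular molecules; 4) an amplification device, connected to the circularization device, for amplifying the circular molecules to obtain amplification products, the amplification products constituting the sequencing library; and 5) a sequencing device, connected to the amplification device, for sequencing the sequencing library to obtain the end sequences of the nucleic acid. With the system for determining nucleic acid end sequences according to an embodiment of the present invention, the end sequences of a nucleic acid can be determined efficiently.
The inventors of the present application have found that the method for preparing a sequencing library, the sequencing library prepared, and the method and system for determining nucleic acid end sequences according to embodiments of the present invention are particularly suitable for high-throughput sequencing.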
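Additional aspects and advantages of the present invention will be set forth in part in the description that follows, and in part will become apparent from that description or be learned through practice of the invention.
Brief Description of the Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of embodiments taken in conjunction with the accompanying drawings, in which:
Figure 1 shows a partial schematic diagram of a method for preparing a sequencing library according to one embodiment of the present invention;
Figure 2 shows a partial schematic diagram of a method for preparing a sequencing library according to one embodiment of the present invention, in which white represents the inserted DNA, black represents the vector sequence, and the dotted pattern represents the primer sequences, which pair with the black vector;
Figure 3 shows a flow chart of a method for preparing a sequencing library according to one embodiment of the present invention;
Figure 4 shows a schematic structural diagram of a sequencing system according to one embodiment of the present invention; and
Figure 5 shows a schematic structural diagram of a sequencing system according to another embodiment of the present invention.
Detailed Description of the Invention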
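Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the drawings, in which the same or similar reference numerals denote the same or similar elements, or elements having the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the present invention and are not to be construed as limiting it.
According to one aspect of the present invention, a method for preparing a sequencing library is provided. The method according to embodiments of the present invention is first described in detail with reference to Figures 1-3.
As shown in Figure 3, the method for preparing a sequencing library according to an embodiment of the present invention comprises a step S100 of randomly fragmenting a construct containing the DNA to be sequenced, a step S200 of separating, from the random fragments, fragments longer than the vector as library fragments, a step S300 of self-ligating the library fragments obtained, and a step S400 of amplifying the circular molecules to obtain amplification products.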
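In the random fragmentation step S100, as shown in Figures 1A and 1B, the construct consisting of the DNA to be sequenced and the vector is randomly fragmented, yielding a plurality of random fragments. The DNA to be sequenced is inserted in the vector to form the construct. The type of vector is not particularly limited in embodiments of the present invention. In one embodiment, a plasmid is used as the vector, which makes handling convenient; for example, the DNA to be sequenced can be inserted into the multiple cloning site region of the plasmid. In a specific example, the plasmid is at least one selected from a Fosmid plasmid, a BAC plasmid and a Cosmid plasmid, so that larger DNA fragments can be inserted in the plasmid, improving the efficiency and accuracy of sequencing. The method and device for randomly fragmenting the construct are not particularly limited. In embodiments of the present invention, the random fragmentation is carried out by a physical method, which does not alter the chemical composition of the DNA to be sequenced and thus improves the accuracy and efficiency of subsequent sequencing. Examples of physical methods include, but are not limited to, high-pressure gas nebulization, sonication and hydrodynamic shearing. In a specific example, a HydroShear DNA shearing device is most preferred, as it allows the construct to be randomly fragmented efficiently. In the HydroShear DNA shearing device, the solution containing the nucleic acid passes through a channel of small cross-section; the fluid accelerates and the resulting force breaks the nucleic acid abruptly. The flow rate and channel parameters of the device determine the size of the fragments obtained, so fragments of the desired length are easily selected by setting the parameters, which improves the efficiency of library preparation. For example, the parameters of the HydroShear DNA shearing device can be set so that the random fragments are longer than the vector by several tens of bp to several hundred bp. Preferably, the random fragments are longer than the vector by 50 bp to 800 bp (for example, with a vector of 8.2 kb, the construct is sheared to 8.25-9.0 kb); more preferably, they are longer than the vector by 200 bp to 800 bp.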
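In addition, according to some embodiments of the present invention, before the construct is randomly fragmented it may be digested with a restriction endonuclease that has no recognition site in the vector. This increases the efficiency of random fragmentation of the construct and raises the proportion of end sequences of the DNA to be sequenced in the resulting sequencing library. According to embodiments of the present invention, a restriction endonuclease that recognizes 6 bases may be used. For example, for Fosmid constructs obtained with the pCC2FOS Vector (Epicentre, USA), restriction endonucleases such as Χ/^Ι or C¾I may be used. This further improves the efficiency of constructing the sequencing library and of the subsequent method for determining nucleic acid end sequences.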
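In the separation step S200, with reference to Figures 1C and 1D, the random fragments obtained in step S100 are separated on the basis of the length of the vector to obtain library fragments, the length of the library fragments being greater than the length of the vector. According to some embodiments of the present invention, the amount by which the library fragments exceed the vector length is not particularly limited and may be several tens of bp to several hundred bp. Preferably, the library fragments are longer than the vector by about 50 bp to about 800 bp (for example, if the vector is 8.2 kb, the separated library fragments are about 8.25-9.0 kb long). More preferably, the separated library fragments are longer than the vector by about 200 bp to about 800 bp. This raises the proportion of end sequences of the DNA to be sequenced in the prepared library and thus further improves the efficiency of constructing the sequencing library and of the subsequent method for determining nucleic acid end sequences. The method and device for separating the random fragments to obtain library fragments are not particularly limited. According to embodiments of the present invention, library fragments of a specific length may be selected from the random fragments by at least one of gel electrophoresis and gradient sedimentation (as shown in Figures 1C and 1D), which makes it quick and convenient to separate the random fragments and obtain library fragments of a specific length. Optionally, the size and concentration of the separated fragments may also be measured, for example with an Agilent 2100 Bioanalyzer, to facilitate subsequent processing.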
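The size-selection rule described above (library fragments longer than the vector by roughly 50-800 bp, or 200-800 bp in the more preferred case) can be sketched as a small Python helper. This is a minimal sketch under stated assumptions: the function names and the inclusive treatment of the window boundaries are illustrative choices, not part of the described method.

```python
def selection_window(vector_bp, min_extra_bp=50, max_extra_bp=800):
    """Return the (low, high) fragment lengths, in bp, kept as library fragments."""
    if vector_bp <= 0 or not (0 < min_extra_bp <= max_extra_bp):
        raise ValueError("lengths must be positive and min_extra_bp <= max_extra_bp")
    return vector_bp + min_extra_bp, vector_bp + max_extra_bp


def is_library_fragment(fragment_bp, vector_bp, min_extra_bp=50, max_extra_bp=800):
    """True if a fragment falls inside the size-selection window (inclusive)."""
    low, high = selection_window(vector_bp, min_extra_bp, max_extra_bp)
    return low <= fragment_bp <= high


# An 8.2 kb vector gives the 8.25-9.0 kb window quoted in the text.
print(selection_window(8200))                    # (8250, 9000)
print(selection_window(8200, min_extra_bp=200))  # (8400, 9000)
print(is_library_fragment(8100, 8200))           # False: shorter than the vector
```

A fragment shorter than the vector cannot contain the complete vector sequence, which is why the lower bound of the window sits above the vector length.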
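In the circularization step S300, with reference to Figures 2A and 2B, the separated library fragments are self-ligated to obtain circular molecules. The method and device for self-ligating the library fragments are not particularly limited, and methods known in the art may be used. According to some examples of the present invention, T4 ligase may be used for the ligation to circularize the library fragments. In a specific example, the concentration of library fragment nucleic acid in the circularization reaction is no higher than about 2 ng/μl, which prevents different nucleic acid fragments from ligating to one another and circularizing into larger rings.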
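The concentration limit for circularization translates directly into a minimum reaction volume for a given mass of library fragments. The following is a minimal Python sketch of that arithmetic; the function name is hypothetical, and the 1600 ng input is used only as an illustrative DNA mass.

```python
def min_circularization_volume_ul(dna_ng, max_ng_per_ul=2.0):
    """Smallest reaction volume (in microlitres) that keeps the DNA at or
    below the given concentration limit (in ng per microlitre)."""
    if dna_ng < 0 or max_ng_per_ul <= 0:
        raise ValueError("dna_ng must be >= 0 and max_ng_per_ul must be > 0")
    return dna_ng / max_ng_per_ul


# 1600 ng of library fragments at no more than 2 ng/ul needs at least 800 ul.
print(min_circularization_volume_ul(1600))  # 800.0
```

Keeping the DNA dilute favors intramolecular ligation (a fragment closing on itself) over ligation between different fragments.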
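In the amplification step S400, as shown in Figures 2B and 2C, the circular molecules are amplified to obtain amplification products, and these amplification products constitute the sequencing library according to an embodiment of the present invention. Because the separated library fragments are longer than the vector, the circular molecules obtained include molecules in which the vector sequence remains intact and the insertion site of the vector contains at least one of the two end sequences of the DNA to be sequenced. Since the vector sequence is known, suitable primers can be designed on the basis of the vector sequence to amplify products containing the end sequences, which then constitute the sequencing library according to an embodiment of the present invention. The circular molecules are amplified by PCR using a DNA polymerase with terminal A-addition activity and vector-specific primers. The A-tailed amplification products can therefore be ligated directly to the adapters of the chosen sequencing platform, which saves a separate A-addition step and reduces product loss. In addition, according to a specific example of the present invention, 18-20 cycles of PCR amplification are performed. This improves the fidelity of the PCR reaction and thus the accuracy of subsequent sequencing.
According to embodiments of the present invention, before the circular molecules are amplified, the library fragments may be digested with at least one of an ATP-dependent DNase that does not degrade the vector (Plasmid-Safe ATP-dependent DNase) and Exonuclease I to remove non-circular molecules. This improves the efficiency of the subsequent amplification reaction and thus the efficiency and accuracy of library preparation and subsequent sequencing.
In addition, according to one embodiment of the present invention, the plurality of random fragments may be blunt-ended before they are separated. This improves the efficiency of the circularization reaction in the subsequent circularization step and thus the efficiency and accuracy of library preparation and subsequent sequencing. According to embodiments of the present invention, the library fragments may instead be blunt-ended before they are self-ligated, which likewise improves the efficiency of the circularization reaction and thus the efficiency and accuracy of library preparation and subsequent sequencing. In a specific example, the blunt-ending may be carried out with at least one selected from Klenow enzyme, T4 polymerase and T4 polynucleotide kinase. This improves the blunt-ending result without affecting the accuracy of subsequent sequencing.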
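Accordingly, in a specific example of the present invention, a method for preparing a sequencing library is provided, comprising the following steps:
1) Random fragmentation: randomly fragmenting the vector carrying the inserted DNA to be sequenced to obtain random fragments;
2) End repair: repairing the ends of the random fragments obtained in step 1) to make them blunt;
3) Separation: separating the end-repaired random fragments from step 2) to obtain random fragments longer than the vector by 50 bp to 800 bp;
4) Circularization: self-ligating the random fragments separated in step 3) to form circular molecules, and then removing fragments that have not self-ligated;
5) Amplification: designing primers on the basis of the vector sequence and amplifying the fragments of the DNA to be sequenced in the circular molecules to obtain amplification products, which are the sequencing library.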
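In a further specific example of the present invention, a method for preparing a sequencing library is also provided, comprising the following steps:
A. Random fragmentation: randomly fragmenting the vector carrying the inserted DNA to be sequenced to obtain random fragments;
B. Separation: separating the random fragments from step A to obtain random fragments longer than the vector by 50 bp to 800 bp;
C. End repair: repairing the ends of the random fragments separated in step B to make them blunt;
D. Circularization: self-ligating the end-repaired random fragments from step C to form circular molecules, and then removing fragments that have not self-ligated;
E. Amplification: designing primers on the basis of the vector sequence and amplifying the fragments of the DNA to be sequenced in the circular molecules to obtain amplification products, which are the sequencing library.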
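To illustrate why fragments slightly longer than the vector carry the insert ends, the following toy Python sketch models the construct as a circular string and reports which insert sequence flanks an intact vector in a given fragment. It is a simplified illustration under stated assumptions: sequences are plain strings, the fragment is taken from a single strand, the vector sequence does not occur inside the insert, and the function name is hypothetical.

```python
def retained_insert_ends(vector, insert, start, length):
    """Return (insert_head, insert_tail) flanking an intact vector in a
    fragment of the circular construct, or None if the vector is broken.

    The construct is modelled as the circle vector + insert, so the end of
    the insert joins the start of the vector.
    """
    construct = vector + insert
    n = len(construct)
    if not (0 < length <= n):
        raise ValueError("fragment length must be between 1 and the construct length")
    start %= n
    # Doubling the string lets a fragment run past the end of the linear
    # representation, i.e. around the circle.
    fragment = (construct * 2)[start:start + length]
    pos = fragment.find(vector)
    if pos == -1:
        return None  # vector not intact: vector-specific primers cannot amplify
    insert_tail = fragment[:pos]                # last bases of the insert
    insert_head = fragment[pos + len(vector):]  # first bases of the insert
    return insert_head, insert_tail


# Toy sequences: upper-case V marks the vector, lower-case letters the insert.
vector = "V" * 10
insert = "abcdefghijklmnopqrstuvwxyz"
print(retained_insert_ends(vector, insert, start=33, length=17))  # ('abcd', 'xyz')
print(retained_insert_ends(vector, insert, start=5, length=17))   # None
```

In the first case the 17-base fragment exceeds the 10-base vector by 7 bases, which are split between the two insert ends; after self-ligation, primers pointing outward from the vector would amplify across both. In the second case the fragment cuts through the vector, so it yields no product.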
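A further aspect of the present invention provides a sequencing library, which can be prepared by the above method for preparing a sequencing library according to embodiments of the present invention. With the sequencing library according to an embodiment of the present invention, the end sequences of the DNA to be sequenced can be determined efficiently and accurately.
The present invention further provides a method for determining nucleic acid end sequences, comprising the following steps: inserting the nucleic acid into a vector; preparing a sequencing library of the nucleic acid by the above method for preparing a sequencing library; and sequencing the sequencing library to obtain end sequence information of the nucleic acid. The preparation of the sequencing library has been described in detail above and is not repeated here. It should be noted that the term "nucleic acid" as used here is not limited to DNA and may also include RNA. Those skilled in the art will understand that an RNA sequence can be converted into the corresponding DNA sequence by conventional methods, for example by reverse transcription; the method for preparing a sequencing library according to embodiments of the present invention can then be applied to prepare a sequencing library of the RNA and thereby determine the end sequences of the RNA. The method and device for sequencing the sequencing library are not particularly limited. In view of technical maturity, second-generation sequencing technologies such as SOLEXA, SOLID and 454 may be used according to embodiments of the present invention. Newer sequencing technologies that are under development or yet to be developed may of course also be used, for example single-molecule sequencing technologies such as the True Single Molecule DNA sequencing technology of Helicos, the single molecule, real-time (SMRT™) technology of Pacific Biosciences, and the nanopore sequencing technology of Oxford Nanopore Technologies (Rusk, Nicole (2009-04-01). Cheap Third-Generation Sequencing. Nature Methods 6 (4): 244-245).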
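The steps for sequencing with second-generation sequencing technology can be carried out by those skilled in the art according to the manufacturer's instructions. For example, to use a second-generation sequencing platform, the amplification products usually need to be end-repaired to make them blunt, after which sequencing adapters are added and sequencing is performed. As described above, according to embodiments of the present invention, a high-fidelity polymerase that also has terminal A-addition activity can be used directly in the amplification step; the A-tailed amplification products can then be ligated directly to the adapters of the chosen sequencing platform, which saves a separate A-addition step and reduces product loss. Ligation to the sequencing adapters may be carried out by methods known in the art, commonly with T4 ligase. In addition, the insertion site of a vector is generally a multiple cloning site containing recognition sites for several restriction endonucleases. According to embodiments of the present invention, these restriction endonucleases can therefore be used to digest the amplification products, reducing the amount of vector sequence they contain so that a longer stretch of end sequence is obtained within each sequencing read. In a specific example, the digestion products may be end-repaired before adapters are added, for example by filling in the ends with polymerases such as Klenow, T4 polymerase and T4 polynucleotide kinase together with dNTPs, followed by A-addition with an exonuclease-deficient Klenow fragment.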
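According to embodiments of the present invention, after the end sequences of the nucleic acid have been obtained, they may be assembled together with sequences obtained by conventional methods, yielding assembled fragments significantly longer than those obtained by assembling the conventional data alone. Assembly may be carried out with methods and devices known to those skilled in the art, for example with the SOAPdenovo software (freely available, for example from http://soap.genomics.org.cn/soapdenovo.html; see Li et al. 2010. De novo assembly of human genomes with massively parallel short read sequencing. Genome Res. 20(2):265-72, incorporated herein by reference). Accordingly, embodiments of the present invention also provide a method for sequencing a nucleic acid, comprising: dividing the nucleic acid into two portions; sequencing one portion according to an embodiment of the present invention to obtain end sequences; and assembling the end sequences obtained with the nucleic acid fragment sequence information obtained by a conventional method to obtain assembled fragments (scaffolds). The assembled fragments obtained in this way are significantly larger than those obtained by directly assembling the nucleic acid fragments obtained by conventional methods. The N50 of the assembled fragments obtained by the nucleic acid sequencing method according to embodiments of the present invention can reach 5 kb or more; in specific examples it can reach 10 kb or more, and even 20 kb or more. Specifically, embodiments of the present invention thus provide a method for determining a nucleic acid sequence, comprising: dividing a nucleic acid sample into a first nucleic acid sample and a second nucleic acid sample, the first and second nucleic acid samples having the same composition; using the first nucleic acid sample, preparing a sequencing library by the method according to an embodiment of the present invention and determining the end sequence information of the nucleic acid by sequencing; using the second nucleic acid sample, obtaining nucleic acid fragment sequence information of the nucleic acid by a conventional sequencing method, the conventional sequencing method being at least one selected from SOLEXA, SOLID, 454 and single-molecule sequencing technologies; and assembling the end sequence information of the nucleic acid with the nucleic acid fragment sequence information of the nucleic acid to determine the sequence of the nucleic acid. With the method for determining a nucleic acid sequence according to an embodiment of the present invention, the sequence of a nucleic acid can be determined efficiently and accurately.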
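According to another aspect of the present invention, a system for determining nucleic acid end sequences is also provided. With reference to Figures 4 and 5, the system 1000 for determining nucleic acid end sequences according to an embodiment of the present invention comprises a DNA fragmentation device 100, a separation device 200, a circularization device 300, an amplification device 400 and a sequencing device 500. The DNA fragmentation device 100 is used to randomly fragment the construct to obtain a plurality of random fragments; as described above, the construct consists of the DNA to be sequenced and the vector, and the DNA to be sequenced is inserted in the vector. The separation device 200 is connected to the DNA fragmentation device 100 and is used to separate the plurality of random fragments on the basis of the length of the vector to obtain library fragments. The circularization device 300 is connected to the separation device 200 and is used to self-ligate the library fragments to obtain circular molecules. The amplification device 400 is connected to the circularization device 300 and is used to amplify the circular molecules to obtain amplification products, the amplification products constituting the sequencing library. The sequencing device 500 is connected to the amplification device 400 and is used to sequence the sequencing library to obtain the end sequences of the nucleic acid. The system for determining nucleic acid end sequences according to an embodiment of the present invention can effectively carry out the method according to embodiments of the present invention and obtain the end sequences of a nucleic acid efficiently and accurately. The term "connected" as used here should be understood broadly: the connection may be direct, or indirect through an intermediary. As described above, according to embodiments of the present invention, the DNA fragmentation device may be a HydroShear DNA shearing device, and the sequencing device may be at least one selected from SOLEXA, SOLID, 454 and single-molecule sequencing devices. These have been described in detail above and are not described again here.
With reference to Figure 5, the system for determining nucleic acid end sequences according to an embodiment of the present invention further comprises at least one of a pretreatment device 101, a blunt-ending device 201 and a purification device 301. The pretreatment device 101 is used to digest the construct, before the vector is randomly fragmented, with a restriction endonuclease that has no recognition site in the vector. The blunt-ending device 201 is used to blunt-end the plurality of random fragments before they are separated, or to blunt-end the library fragments before they are self-ligated. The purification device 301 is used to digest the library fragments, before the library fragments are self-ligated, with at least one of an ATP-dependent DNase that does not degrade the vector and Exonuclease I to remove non-circular molecules. The advantages of the treatments carried out by these devices have been described in detail above and are not detailed again here.
The solutions of the present invention are explained below with reference to examples. Those skilled in the art will understand that the following examples serve only to illustrate the present invention and should not be regarded as limiting its scope. Where specific techniques or conditions are not indicated in the examples, they follow the techniques or conditions described in the literature in the art (for example, J. Sambrook et al., Molecular Cloning: A Laboratory Manual, 3rd edition, translated by Huang Peitang et al., Science Press) or the product instructions. Reagents and instruments whose manufacturers are not indicated are conventional products that are commercially available.
Example 1: Polar bear genomic DNA sequencing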
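1) Random fragmentation
Polar bear genomic DNA was taken (extracted in this laboratory by the salting-out method; for details see Lahiri D, Schnabel B. 1993. DNA isolation by a rapid method from human blood samples: effects of MgCl2, EDTA, storage time, and temperature on DNA yield and quality. Biochem Genet. 31:321-328), ensuring that the extracted DNA was no smaller than 36 kb. A polar bear Fosmid clone library was prepared with the CopyControl™ HTP Fosmid Library Production Kit (Epicentre, USA) according to the manufacturer's detailed instructions, and plasmid DNA was extracted from the pooled Fosmid clones by the alkaline lysis method commonly used in the art.
Using a standard HydroShear DNA shearing device (GeneMachine, San Carlos, CA, USA) fitted with the Custom Shearing Assembly-large (4 kb-40 kb) (GeneMachine, San Carlos, CA, USA), 200 μl of pooled polar bear Fosmid clone plasmid DNA (20 μg) was sheared for 20 cycles at speed 8. The clone library had been constructed with the vector pCC2FOS (Epicentre, Madison, WI, USA), which is 8181 bp in size.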
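2) End repair
The fragmented DNA was purified with the QIAquick PCR Purification Kit (Qiagen, Germany). To 154.8 μl of DNA solution were added 20 μl of 10× T4 polynucleotide kinase buffer, 3.2 μl of 25 mM dNTPs, 10 μl of T4 DNA polymerase (3000 units/ml, Enzymatics, Beverly, MA, USA), 2 μl of Klenow polymerase (5000 units/ml, Enzymatics) and 10 μl of T4 polynucleotide kinase (10000 units/ml, Enzymatics), and the mixture was incubated at 20 °C for 30 minutes to blunt the ends of the fragmented DNA.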
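The component volumes of the end-repair reaction above can be sanity-checked with a few lines of Python. This is a minimal sketch: the dictionary keys and helper names are illustrative assumptions, and only the volumes and stock concentrations are taken from the text.

```python
# Component volumes (in microlitres) of the end-repair reaction in step 2).
END_REPAIR_MIX_UL = {
    "DNA solution": 154.8,
    "10x T4 polynucleotide kinase buffer": 20.0,
    "25 mM dNTPs": 3.2,
    "T4 DNA polymerase": 10.0,
    "Klenow polymerase": 2.0,
    "T4 polynucleotide kinase": 10.0,
}


def total_volume_ul(mix):
    """Total reaction volume in microlitres."""
    return sum(mix.values())


def final_concentration(stock_concentration, volume_ul, total_ul):
    """Concentration after dilution into the reaction (same unit as the stock)."""
    return stock_concentration * volume_ul / total_ul


total = total_volume_ul(END_REPAIR_MIX_UL)
print(round(total, 1))                                 # 200.0 (ul)
print(round(final_concentration(10, 20.0, total), 2))  # 1.0 (buffer at 1x)
print(round(final_concentration(25, 3.2, total), 2))   # 0.4 (mM dNTPs)
```

The volumes add up to 200 μl, so the 20 μl of 10× buffer ends up at 1× and the dNTPs at 0.4 mM.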
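3) Separation
The blunt-ended DNA was electrophoresed on a 0.6% Megabase agarose gel at 5 V/cm for 16 hours. After staining, DNA in the 8.2-9.0 kb size range was excised under a Dark Reader and purified with the QIAquick Gel Purification Kit.
4) Circularization
The recovered 8.2-9.0 kb DNA was circularized. To a solution containing 1600 ng of DNA were added 80 μl of 10× T4 DNA ligase buffer and 40 μl of T4 DNA ligase (400,000 units/ml, NEB), and the mixture was incubated at 16 °C for 16 hours. Then 16 μl of 100 mM ATP, 192 μl of 10× Plasmid-Safe ATP-dependent DNase buffer, 80 μl of Plasmid-Safe ATP-dependent DNase (10,000 units/ml, Epicentre) and 48 μl of Exonuclease I (20,000 units/ml, NEB) were added, and the reaction was kept at 37 °C for 30 minutes to digest DNA that had not circularized. The reaction was then kept at 72 °C for 20 minutes, 64 μl of 0.5 M EDTA was added, and the sample was purified with the QIAquick PCR Purification Kit.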
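5) Amplification
To the 36.75 μl of sample recovered from the circularization step 4) were added, in order, 5 μl of 10× Ex Taq buffer, 4 μl of 2.5 mM dNTPs, 2 μl each of 10 μM forward primer F1 and reverse primer R1, and 0.25 μl of Ex Taq (5000 units/ml, Takara), and the sample was amplified by PCR. The primer sequences were as follows:
F1: CAGGAAACAGCCTAGGAA (SEQ ID NO: 1),
R1: GTACAACGACACCTAGAC (SEQ ID NO: 2).
The PCR program was as follows:
(a) 94 °C, 1 minute; (b) 94 °C, 30 seconds; (c) 58 °C, 30 seconds; (d) 72 °C, 40 seconds, with steps (b) to (d) repeated for 18 cycles; (e) 72 °C, 5 minutes; the reaction was then held at 4 °C.
The sample was then purified with the Qiagen MinElute PCR Purification Kit.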
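6) Sequencing
To 19 μl of the purified product were added 10 μl of 2× Rapid Ligase buffer, 5 μl of T4 DNA ligase (600,000 units/ml, Enzymatics) and 1 μl of 15 μM SOLEXA Adaptor Mix. The mixture was incubated at 20 °C for 15 minutes, and the reaction product was then purified with the Qiagen MinElute PCR Purification Kit.
To the 38.75 μl of recovered sample were added, in order, 5 μl of 10× Ex Taq buffer, 4 μl of 2.5 mM dNTPs, 1 μl each of 10 μM forward primer PE1.0 and reverse primer PE2.0, and 0.25 μl of Ex Taq (5000 units/ml, Takara), and the sample was amplified by PCR. The primer sequences were as follows:
PE1.0:
CCGATCT (SEQ ID NO: 3),
PE2.0:
TCTTCCGATCT (SEQ ID NO: 4).
The PCR program was as follows:
(a) 94 °C, 1 minute; (b) 94 °C, 15 seconds; (c) 65 °C, 30 seconds; (d) 72 °C, 30 seconds, with steps (b) to (d) repeated for 18 cycles; (e) 72 °C, 5 minutes; the reaction was then held at 4 °C.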
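The two PCR programs used in Example 1 can be written down as data, which makes it easy to compare their programmed hold times. The following minimal Python sketch does this; the class and variable names are illustrative assumptions, and the computed times ignore temperature ramping and the final 4 °C hold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float
    seconds: int


@dataclass(frozen=True)
class PcrProgram:
    initial: Step   # step (a)
    cycle: tuple    # steps (b) to (d), repeated
    cycles: int
    final: Step     # step (e)

    def hold_seconds(self):
        """Sum of programmed hold times, excluding ramping and the 4 °C hold."""
        per_cycle = sum(step.seconds for step in self.cycle)
        return self.initial.seconds + self.cycles * per_cycle + self.final.seconds


# Step 5): amplification of circular molecules with primers F1 and R1.
library_pcr = PcrProgram(
    Step(94, 60), (Step(94, 30), Step(58, 30), Step(72, 40)), 18, Step(72, 300)
)
# Step 6): amplification after adapter ligation with primers PE1.0 and PE2.0.
adapter_pcr = PcrProgram(
    Step(94, 60), (Step(94, 15), Step(65, 30), Step(72, 30)), 18, Step(72, 300)
)

print(library_pcr.hold_seconds() / 60)  # 36.0 (minutes)
print(adapter_pcr.hold_seconds() / 60)  # 28.5 (minutes)
```

Both programs use 18 cycles, the lower end of the 18-20 cycle range given in the description.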
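The reaction product was then denatured by holding it at 65 °C for 10 minutes and placed on ice. The denatured product was electrophoresed on a 2.0% Low Range Ultra agarose gel at 15 V/cm for 2 hours. After staining, DNA in the 400-700 bp size range was excised under a Dark Reader and purified with the Qiagen MinElute Gel Purification Kit. The purified product was sequenced on an Illumina GA (Solexa) for 76 cycles.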
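The above method gave the following results:
A total of 15,225,082 sequence pairs were obtained; after removal of duplicates, 2,865,235 clean pairs remained (that is, unique sequences left after removing sequences that had been read repeatedly). The sequences obtained were aligned against the original genome map, whose assembled fragments (scaffolds) had an N50 of 2.3 Mb (for its preparation see the Comparative Example). Among pairs with unique mapping positions, 209,600 pairs mapped to the same scaffold at a distance of less than 500 bp, 531,028 pairs mapped to the same scaffold at a distance greater than 10 kb, of which 520,897 pairs (98.09%) were 30-50 kb apart, and 185,888 pairs mapped to different scaffolds. These 185,888 pairs were used to assist genome assembly, raising the scaffold N50 from 2.3 Mb to 6.5 Mb.
Comparative Example: Polar bear genomic DNA sequencing by a conventional method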
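The same genomic sample as in Example 1 was sequenced as follows. The sequencing followed the standard procedure provided by Illumina. Specifically, sequencing libraries with insert sizes of 165-175 bp, 450-550 bp and 720-880 bp were constructed with Genomic DNA Sample Prep Kits (Illumina, USA) according to the kit manufacturer's instructions, and sequencing libraries with insert sizes of 2.4-2.7 kb, 5.7-6.3 kb and 10-11 kb were constructed with the Paired-End Sample Prep Kit (Illumina, USA) according to the kit manufacturer's instructions. The two kinds of libraries were then sequenced on an Illumina GA (Solexa). Once the valid data (the raw sequencing data after removal of duplicated and erroneous reads) reached 60X coverage (sequencing depth), they were assembled with SOAPdenovo. The calculated N50 of the assembled fragments (scaffolds) was 2.3 Mb.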
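The read-pair counts reported for Example 1 and the N50 values of the two assemblies can be cross-checked with simple arithmetic. The following minimal Python sketch does so; the constant and function names are illustrative, and the numbers are copied from the results reported in the text.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Figures reported for Example 1 and the Comparative Example.
RAW_PAIRS = 15_225_082
UNIQUE_PAIRS = 2_865_235
SAME_SCAFFOLD_OVER_10KB = 531_028
SAME_SCAFFOLD_30_TO_50KB = 520_897
N50_BEFORE_MB = 2.3
N50_AFTER_MB = 6.5

# Share of read pairs that remain after duplicate removal.
print(round(percent(UNIQUE_PAIRS, RAW_PAIRS), 2))                            # 18.82
# Share of the >10 kb same-scaffold pairs that span 30-50 kb.
print(round(percent(SAME_SCAFFOLD_30_TO_50KB, SAME_SCAFFOLD_OVER_10KB), 2))  # 98.09
# Fold increase in scaffold N50 after end-sequence-assisted assembly.
print(round(N50_AFTER_MB / N50_BEFORE_MB, 2))                                # 2.83
```

The 98.09% figure matches the reported value; a 30-50 kb spacing is what would be expected for the two ends of Fosmid inserts of about 40 kb.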
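Comparing Example 1 with the Comparative Example shows that the method of the present invention effectively increases the length of the assembled fragments (scaffolds). Longer scaffolds facilitate the subsequent positioning of various molecular markers and the study of related genes or traits.
Industrial Applicability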
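The method for preparing a sequencing library, the sequencing library prepared, and the method and system for determining nucleic acid end sequences according to embodiments of the present invention (hereinafter collectively "the technical solution according to the present invention") are particularly suitable for high-throughput sequencing and have at least one of the following advantages:
1) Compared with the commonly used sequencing methods based on first-generation sequencing, the technical solution according to the present invention eliminates cumbersome steps such as picking clones and preparing plasmid from individual clones, greatly saving time and money.
2) Compared with restriction-digestion-based Fosmid clone end sequencing, the present invention fragments the plasmid DNA of vectors such as BAC or Fosmid clones by random shearing, and the end fragments can be obtained by inverse PCR. This overcomes the bias of the digestion-based Fosmid clone end sequencing method and requires no modification or selection of the vector with respect to restriction sites, so the technical solution according to the present invention is widely applicable.
3) The technical solution according to the present invention is not limited by restriction sites and allows high-throughput end sequencing of vectors carrying larger inserts, such as BACs.
4) Used as an auxiliary method, the technical solution according to the present invention can greatly increase the length of assembled fragments (scaffolds) in de novo genome sequencing. Longer assembled fragments facilitate the subsequent positioning of various molecular markers and the study of related genes or traits.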
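In this specification, descriptions referring to the terms "one embodiment", "some embodiments", "illustrative embodiment", "example", "specific example" or "some examples" mean that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. Such expressions in this specification do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions and variations can be made to these embodiments without departing from the principles and spirit of the present invention, the scope of which is defined by the claims and their equivalents.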

Claims

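1. A method for preparing a sequencing library, comprising the following steps:
randomly fragmenting a construct to obtain a plurality of random fragments, wherein the construct consists of a DNA to be sequenced and a vector, and the DNA to be sequenced is inserted in the vector;
separating the plurality of random fragments on the basis of the length of the vector to obtain library fragments, wherein the length of the library fragments is greater than the length of the vector;
self-ligating the library fragments to obtain circular molecules; and
amplifying the circular molecules to obtain amplification products, the amplification products constituting the sequencing library.
2. The method for preparing a sequencing library according to claim 1, further comprising: before randomly fragmenting the construct, digesting the construct with a restriction endonuclease that has no recognition site in the vector.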
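3. The method for preparing a sequencing library according to claim 1, wherein the vector is a plasmid.
4. The method for preparing a sequencing library according to claim 3, wherein the plasmid is at least one selected from a Fosmid plasmid, a BAC plasmid and a Cosmid plasmid.
5. The method for preparing a sequencing library according to claim 1, wherein the random fragmentation of the construct is carried out by a physical method.
6. The method for preparing a sequencing library according to claim 1, wherein the random fragmentation of the construct is carried out with a HydroShear DNA shearing device.
7. The method for preparing a sequencing library according to claim 1, wherein the library fragments are longer than the vector by about 50 bp to about 800 bp.
8. The method for preparing a sequencing library according to claim 7, wherein the library fragments are longer than the vector by about 200 bp to about 800 bp.
9. The method for preparing a sequencing library according to claim 1, wherein
the plurality of random fragments are blunt-ended before they are separated.
10. The method for preparing a sequencing library according to claim 1, wherein the library fragments are blunt-ended before they are self-ligated.
11. The method for preparing a sequencing library according to claim 9 or 10, wherein
the blunt-ending is carried out with at least one selected from Klenow enzyme, T4 polymerase and T4 polynucleotide kinase.
12. The method for preparing a sequencing library according to claim 1, wherein the separation is carried out by at least one of gel electrophoresis and gradient sedimentation.
13. The method for preparing a sequencing library according to claim 1, wherein, before the circular molecules are amplified, the library fragments are digested with at least one of an ATP-dependent DNase that does not degrade the vector and Exonuclease I to remove non-circular molecules.
14. The method for preparing a sequencing library according to claim 1, wherein the circular molecules are amplified by PCR using a DNA polymerase with terminal A-addition activity and vector-specific primers to obtain the amplification products.
15. The method for preparing a sequencing library according to claim 14, wherein 18-20 cycles of PCR amplification are performed.
16. A sequencing library prepared by the method according to any one of claims 1-15.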
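17. A method for determining nucleic acid end sequences, comprising the following steps:
inserting the nucleic acid into a vector;
preparing a sequencing library of the nucleic acid by the method according to any one of claims 1-15; and
sequencing the sequencing library to obtain end sequence information of the nucleic acid.
18. The method for determining nucleic acid end sequences according to claim 17, wherein the sequencing library is sequenced with at least one selected from SOLEXA, SOLID, 454 and single-molecule sequencing devices.
19. A method for determining a nucleic acid sequence, comprising:
dividing a nucleic acid sample into a first nucleic acid sample and a second nucleic acid sample;
using the first nucleic acid sample, determining end sequence information of the nucleic acid by the method for determining nucleic acid end sequences according to claim 17 or 18;
using the second nucleic acid sample, obtaining nucleic acid fragment sequence information of the nucleic acid by a conventional sequencing method, wherein the conventional sequencing method is at least one selected from SOLEXA, SOLID, 454 and single-molecule sequencing technologies; and
assembling the end sequence information of the nucleic acid with the nucleic acid fragment sequence information of the nucleic acid to determine the sequence of the nucleic acid.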
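20. A system for determining nucleic acid end sequences, comprising:
1) a DNA fragmentation device for randomly fragmenting a construct to obtain a plurality of random fragments, wherein the construct consists of a DNA to be sequenced and a vector, and the DNA to be sequenced is inserted in the vector;
2) a separation device, connected to the DNA fragmentation device, for separating the plurality of random fragments on the basis of the length of the vector to obtain library fragments;
3) a circularization device, connected to the separation device, for self-ligating the library fragments to obtain circular molecules;
4) an amplification device, connected to the circularization device, for amplifying the circular molecules to obtain amplification products, the amplification products constituting the sequencing library; and
5) a sequencing device, connected to the amplification device, for sequencing the sequencing library to obtain the end sequences of the nucleic acid.
21. The system for determining nucleic acid end sequences according to claim 20, wherein the DNA fragmentation device is a HydroShear DNA shearing device.
22. The system for determining nucleic acid end sequences according to claim 20, wherein the sequencing device is at least one selected from SOLEXA, SOLID, 454 and single-molecule sequencing devices.
23. The system for determining nucleic acid end sequences according to claim 20, further comprising a pretreatment device for digesting the construct, before the vector is randomly fragmented, with a restriction endonuclease that has no recognition site in the vector.
24. The system for determining nucleic acid end sequences according to claim 20, further comprising a blunt-ending device for blunt-ending the plurality of random fragments before they are separated, or for blunt-ending the library fragments before they are self-ligated.
25. The system for determining nucleic acid end sequences according to claim 20, further comprising a purification device for digesting the library fragments, before the library fragments are self-ligated, with at least one of an ATP-dependent DNase that does not degrade the vector and Exonuclease I to remove non-circular molecules.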
PCT/CN2011/079213 2010-09-01 2011-08-31 Sequencing library and preparation method thereof, and method and system for determining nucleic acid end sequences WO2012028105A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201010272706.5 2010-09-01
CN 201010272706 CN101967684B (zh) 2010-09-01 2010-09-01 Sequencing library and preparation method thereof, and end sequencing method and device

Publications (1)

Publication Number Publication Date
WO2012028105A1 true WO2012028105A1 (zh) 2012-03-08

Family

ID=43546901

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/079213 WO2012028105A1 (zh) 2010-09-01 2011-08-31 Sequencing library and preparation method thereof, and method and system for determining nucleic acid end sequences

Country Status (2)

Country Link
CN (1) CN101967684B (zh)
WO (1) WO2012028105A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104862302A (zh) * 2015-05-05 2015-08-26 华南师范大学 DNA fragmentation method and device for implementing the method

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
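CN102181943B (zh) * 2011-03-02 2013-06-05 中山大学 Method for constructing a paired-end library and method for genome sequencing using the library
CN102206704B (zh) * 2011-03-02 2013-11-20 深圳华大基因科技服务有限公司 Method and device for assembling genome sequences
CN102732598B (zh) * 2011-04-11 2017-03-01 陈先锋 Whole-genome DNA sequence assembly and sequencing method
CN102286632A (zh) * 2011-09-14 2011-12-21 中山大学 Method for detecting structural variation in a target region of a genome
WO2014008635A1 (zh) * 2012-07-11 2014-01-16 北京贝瑞和康生物技术有限公司 Fragmented DNA detection method, fragmented DNA detection kit and use thereof
CN102864498B (zh) * 2012-09-24 2014-07-16 中国科学院天津工业生物技术研究所 Method for constructing a long-fragment end library
CN104745679B (zh) * 2013-12-31 2018-06-15 杭州贝瑞和康基因诊断技术有限公司 Method and kit for non-invasive detection of EGFR gene mutations
CN104293941B (zh) * 2014-09-30 2017-01-11 天津华大基因科技有限公司 Method for constructing a sequencing library and use thereof
CN105624272B (zh) * 2014-10-29 2019-08-09 深圳华大基因科技有限公司 Method and device for constructing a nucleic acid sequencing library of a predetermined genomic region
CN105986020B (zh) * 2015-02-11 2019-08-09 深圳华大智造科技有限公司 Method and device for constructing a sequencing library
CN106148323B (zh) * 2015-04-22 2021-03-05 北京贝瑞和康生物技术有限公司 Method and kit for constructing an ALK gene fusion mutation detection library
CN105002570B (zh) * 2015-07-21 2017-09-05 中国农业科学院深圳农业基因组研究所 Method for preparing multiple large-insert DNA paired-end sequencing libraries in a single run
CN105332063B (zh) * 2015-08-13 2017-04-12 厦门飞朔生物技术有限公司 Method for constructing a single-tube high-throughput sequencing library
CN107858408A (zh) * 2016-09-19 2018-03-30 深圳华大基因科技服务有限公司 Method and system for assembling second-generation genome sequences
CN109957615B (zh) * 2017-12-26 2023-07-21 北京安诺优达医学检验实验室有限公司 Method for capturing target regions of a single-cell genome
CN112375807B (zh) * 2017-12-30 2023-02-21 浙江安诺优达生物科技有限公司 Method for randomly fragmenting DNA
CN110310702B (zh) * 2018-03-16 2021-03-23 深圳华大基因科技服务有限公司 Method, device and storage medium for repairing genome sequencing assembly results
CN110241189A (zh) * 2019-06-22 2019-09-17 华中农业大学 Long paired-end sequencing method for long-fragment DNA libraries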

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101484589A (zh) * 2006-07-12 2009-07-15 凯津公司 High-throughput physical mapping using AFLP
WO2010003316A1 (en) * 2008-07-10 2010-01-14 Si Lok Methods for nucleic acid mapping and identification of fine-structural-variations in nucleic acids
CN101955545A (zh) * 2010-09-07 2011-01-26 四川大学 Multi-target recombinant gene and use of its protein in preventing and treating Helicobacter pylori infection

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101182526A (zh) * 2007-11-12 2008-05-21 山东省农业科学院家禽研究所 Extraction of duck enteritis virus genomic DNA and its sequence

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
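CN104862302A (zh) * 2015-05-05 2015-08-26 华南师范大学 DNA fragmentation method and device for implementing the method
WO2016177020A1 (zh) * 2015-05-05 2016-11-10 华南师范大学 DNA fragmentation method and device for implementing the method
CN104862302B (zh) * 2015-05-05 2020-12-15 华南师范大学 DNA fragmentation method and device for implementing the method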

Also Published As

Publication number Publication date
CN101967684A (zh) 2011-02-09
CN101967684B (zh) 2013-02-27

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11821131

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11821131

Country of ref document: EP

Kind code of ref document: A1