WO2024027123A1 - Method for constructing a sequencing library, kit for constructing a sequencing library, and gene sequencing method - Google Patents

Method for constructing a sequencing library, kit for constructing a sequencing library, and gene sequencing method

Info

Publication number
WO2024027123A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
fragment
capture
sequencing
complementary
Prior art date
Application number
PCT/CN2023/075228
Inventor
甘广丽
李改玲
刘二凯
赵陆洋
Original Assignee
深圳赛陆医疗科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳赛陆医疗科技有限公司
Publication of WO2024027123A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/11: DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00: Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06: Biochemical methods, e.g. using enzymes or whole viable microorganisms

Definitions

  • The present invention claims priority to Chinese patent application No. 202210916238.3, filed with the China Patent Office on August 1, 2022 and entitled "A method for constructing a sequencing library, a kit for constructing a sequencing library, and a gene sequencing method", the entire contents of which are incorporated herein by reference.
  • The invention belongs to the field of gene sequencing, and specifically relates to a method for constructing a sequencing library, a kit for constructing a sequencing library, and a gene sequencing method.
  • Second-generation sequencing technology, with its high throughput, high accuracy and low cost, has greatly changed research on and clinical application of gene sequencing in the biological and medical fields.
  • However, because of its relatively short read length (the maximum readable length is generally 300 to 500 bp), second-generation sequencing technology does not perform well in fields such as detecting genome structural variation and genome phasing.
  • A commonly used sequencing scheme is to fragment one long DNA molecule of 100,000 to 1,000,000 bases within a confined region and add a single type of DNA tag together with adapters, while different long DNA molecules are partitioned into different confined regions, fragmented there, given different tags and the same adapters, and then sequenced.
  • From its tag, each sequencing result can be automatically assigned to a particular region.
  • The small amount of DNA within one region can be assembled first, and the overall assembly performed afterwards. This yields a longer assembled sequence and thus solves the problem.
  • BGI single-tube long fragment read (stLFR)
  • TELL-Seq™ (Transposase Enzyme Linked Long-read Sequencing)
  • the present invention aims to solve at least one of the technical problems existing in the above-mentioned prior art. To this end, the present invention proposes a method for constructing a sequencing library, which is simple to operate, greatly reduces costs, and is easy to combine with an automated platform in the later stage.
  • the present invention also provides a kit for constructing a sequencing library.
  • the present invention also proposes a gene sequencing method.
  • a method for constructing a sequencing library including the following steps:
  • S1 Provide a chip and a first fragment, the chip has several microwells, the first fragment includes the P1 sequence, and the first fragment is fixed in the microwells;
  • S2 Provide a second fragment, the second fragment including the P1 complementary sequence, the first random sequence N1 and the P2 sequence; hybridize the second fragment with the first fragment; extend and amplify the first fragment to obtain the first capture fragment immobilized in the microwell;
  • S3 Provide a third fragment, the third fragment including the P2 sequence, the second random sequence N2 and the P3 sequence; hybridize the third fragment with the first capture fragment; extend and amplify the first capture fragment to obtain the second capture fragment immobilized in the microwell;
  • S4 The second capture fragment captures a complex; the complex includes a target DNA and a transposome bound to the target DNA, the transposome includes a transposase and a transposon, and the transposon includes any one of a partial P3 sequence, the complete P3 sequence, and a partial P3 complementary sequence;
  • S5 Remove the transposase to obtain a composite sequence, which includes a partial transposon sequence and fragmented target DNA; extend the composite sequence to obtain an extension product, which is the sequencing library.
  • Two random sequences are introduced into the second capture fragment of the present invention. These random sequences are generated on the chip by amplifying a large random library that is synthesized in a single batch. There is no need, as in the prior art, to separately synthesize numerous types of primers in order to modify capture microspheres or produce hydrogel microspheres, which significantly reduces the core cost.
  • the method of the present invention constructs a sequencing library with simple operations, and can easily be combined with an automated platform at a later stage to sequence long DNA fragments.
  • the chip includes two flow channels, and each of the flow channels includes a plurality of the micropores.
  • the chip includes 8 million to 15 million of the micropores.
  • the chip includes 10 million of the microwells.
  • the diameter of the micropores is 2-8 μm.
  • the diameter of the micropores is 5 μm.
  • the depth of the micropores is 5-15 μm.
  • the depth of the micropores is 10 μm.
  • the center distance between two adjacent micropores is 8-15 μm.
  • the center distance between two adjacent micropores is 12 μm.
  • the diameter of the micropore is 5 μm, the depth is 10 μm, and the center distance between two adjacent micropores is 12 μm.
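The preferred dimensions above can be turned into rough per-well and per-chip figures. The following is a minimal sketch only: it assumes cylindrical wells and a square grid layout, neither of which is stated in the text, and the function names are hypothetical.

```python
import math

# Preferred dimensions quoted in the text (micrometres).
DIAMETER_UM = 5.0
DEPTH_UM = 10.0
PITCH_UM = 12.0          # centre-to-centre distance between adjacent wells
N_WELLS = 10_000_000     # preferred number of wells per chip


def well_volume_fl(diameter_um: float, depth_um: float) -> float:
    """Volume of one well in femtolitres, ASSUMING a cylinder (1 um^3 = 1 fL)."""
    radius = diameter_um / 2.0
    return math.pi * radius ** 2 * depth_um


def array_area_cm2(n_wells: int, pitch_um: float) -> float:
    """Footprint of the array in cm^2, ASSUMING a square grid (1 cm^2 = 1e8 um^2)."""
    return n_wells * pitch_um ** 2 / 1e8


print(f"{well_volume_fl(DIAMETER_UM, DEPTH_UM):.1f} fL per well")
print(f"{array_area_cm2(N_WELLS, PITCH_UM):.1f} cm^2 for {N_WELLS:,} wells")
```

Under these assumptions a single well holds roughly 0.2 pL, and 10 million wells at a 12 μm pitch cover about 14.4 cm².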
  • step S1 also includes fixing sequencing primers in the microwells.
  • the bottom of the micropores is modified with a silane coupling agent.
  • the bottom of the micropores is modified with any one of aminosilane or chlorosilane coupling agent.
  • the bottom of the micropores is modified with any one of γ-aminopropyltriethoxysilane (APTES), aminopropyltrimethoxysilane or trichlorosilane.
  • the bottom of the micropores is modified with γ-aminopropyltriethoxysilane.
  • silane modification on the bottom of the microwell is to immobilize the first fragment or the sequencing primer.
  • the method of fixing the first fragment or the sequencing primer in the microwell of the chip includes: introducing an azide group into the microwell modified with the aminosilane, and adding the first fragment or sequencing primer modified with dibenzocyclooctyne (DBCO) into the microwell, so that the first fragment or sequencing primer is fixed in the microwell through the covalent bond formed between the DBCO and the azide group.
  • 1,4-phenylenediisothiocyanate or glutaraldehyde is introduced into the microwell modified with the aminosilane; the 1,4-phenylenediisothiocyanate or glutaraldehyde forms covalent bonds with the aminosilane and with the amino-modified first fragment or sequencing primer, respectively, thereby fixing the first fragment or the sequencing primer in the microwell.
  • a bifunctional SMBP is introduced into the microwell modified with the aminosilane; the bifunctional SMBP forms covalent bonds with the aminosilane and with the thiol-modified first fragment or sequencing primer, respectively, thereby immobilizing the first fragment or the sequencing primer in the microwell.
  • the edges of the micropores are modified with at least one of trimethylsilane, isobutylenetriethoxysilane, disiloxane or cyclotetrasiloxane.
  • the edges of the micropores are modified with the trimethylsilane.
  • the area between adjacent micropores is modified with at least one of trimethylsilane, isobutylenetriethoxysilane, disiloxane or cyclotetrasiloxane.
  • the area between adjacent micropores is modified with the trimethylsilane.
  • the modification of the edges of the micropores and of the area between adjacent micropores mainly plays a hydrophobic role: it prevents reagents in the micropores, such as enzymes and DNA, from adsorbing non-specifically or spreading to adjacent micropores and causing cross-reactions.
  • the P1 sequence is a fixed sequence.
  • the length of the P1 sequence is 25-100 nt.
  • the P1 sequence is shown in SEQ ID NO: 1.
  • the sequencing primer is a fixed sequence.
  • the sequencing primer is partially or completely complementary to the P1 sequence.
  • the P1 complementary sequence is a fixed sequence.
  • the length of the P1 complementary sequence is 25-100 nt.
  • the P1 complementary sequence is shown in SEQ ID NO: 2.
  • the P2 sequence is a fixed sequence.
  • the length of the P2 sequence is 25-100 nt.
  • the P2 sequence is shown in SEQ ID NO: 3.
  • the P1 sequence and the P2 sequence are not identical and are not complementary.
  • the length of the first random sequence N1 is 10 to 15 nt.
  • the concentration of the second fragment is 2-15 pM.
  • the concentration of the second fragment is 5 pM.
  • Controlling the concentration of the second fragment at 5 pM can ensure that at least one second fragment is hybridized in each microwell.
  • sodium citrate buffer needs to be added to wash away unhybridized fragments; the sodium citrate solution does not affect the hybridized strands.
  • the amplification described in step S2 requires the addition of P2 primer.
  • the P2 primer is a fixed sequence.
  • the sequence of the P2 primer is the same as the P2 sequence.
  • the P2 primer is as shown in SEQ ID NO:4.
  • The amplification in step S2 is performed with two primers: the P2 primer and the first fragment.
  • after adding the P2 primer in step S2, oil needs to be passed in to seal the micropores.
  • the oil includes any one of liquid paraffin oil, silicone oil, or petroleum jelly.
  • the amplification in step S2 adopts any one of rapid cycle PCR, solid-phase PCR or isothermal PCR amplification method.
  • the amplification in step S2 adopts an isothermal amplification method.
  • the amplification in step S2 adopts any one of the helicase-dependent amplification method or the recombinase polymerase amplification method (RPA).
  • the amplification in step S2 adopts the recombinase polymerase amplification method.
  • when adding the P2 primer in step S2, the enzymes and dNTPs required for the amplification also need to be added.
  • the enzymes required for amplification include recombinase and DNA polymerase.
  • the amplification time in step S2 is 50 to 70 minutes.
  • the amplification time in step S2 is 60 minutes.
  • after the amplification in step S2 is completed, a solution that destroys the oil phase system needs to be introduced first.
  • the solution that destroys the oil phase system includes at least one of ethanol, sodium citrate buffer (SSC solution), formamide, sodium hydroxide, detergent or banana oil (lacquer thinner).
  • washing liquid needs to be added to the microwell to wash the amplification product.
  • the washing liquid includes at least one of formamide, sodium hydroxide, urea or helicase.
  • the formamide is selected as the washing liquid.
  • in the amplification product, one strand (i.e., the first capture fragment) is fixed in the microwell through a chemical bond, while the other strand is bound to it only through the hydrogen bonds of complementary base pairing.
  • Washing with the above-mentioned washing liquid removes the other strand without destroying the chemical bond, so that the first capture fragment is retained.
  • the first capture fragment includes a P1 sequence, a first random sequence complementary sequence, and a P2 complementary sequence.
  • the length of the first random sequence complementary sequence is 10-15 nt.
  • the P2 complementary sequence is a fixed sequence.
  • the length of the P2 complementary sequence is 25-100 nt.
  • the P2 complementary sequence is shown in SEQ ID NO: 5.
  • when step S2 is completed, 85-99% of the microwells of the chip include the first capture fragment, and 1-15% of the microwells only include the first fragment.
  • 90 to 99% of the microwells of the chip include the first capture fragment, and 1 to 10% of the microwells only include the first fragment.
  • 95 to 99% of the microwells of the chip include the first capture fragment, and 1 to 5% of the microwells only include the first fragment.
  • each microwell of the chip includes 1 to 5 of the first capture fragments.
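The occupancy percentages and the figure of 1 to 5 first capture fragments per microwell can be related with a simple loading model. The sketch below assumes that second fragments seed the wells independently, so that the number per well follows a Poisson distribution; this statistical model is an assumption for illustration and is not stated in the text.

```python
import math


def mean_fragments_per_well(occupied_fraction: float) -> float:
    """Poisson mean such that P(at least one fragment in a well) equals occupied_fraction."""
    if not 0.0 < occupied_fraction < 1.0:
        raise ValueError("occupied_fraction must be strictly between 0 and 1")
    # P(0 fragments) = exp(-mean)  =>  mean = -ln(1 - occupied_fraction)
    return -math.log(1.0 - occupied_fraction)


for p in (0.85, 0.95, 0.99):
    print(f"{p:.0%} occupied -> mean of {mean_fragments_per_well(p):.2f} fragments per well")
```

Under this assumption, occupancies of 85%, 95% and 99% correspond to means of roughly 1.9, 3.0 and 4.6 seeding events per well, which is of the same order as the 1 to 5 first capture fragments per microwell quoted above.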
  • the length of the second random sequence N2 is 10 to 15 nt.
  • the P3 sequence is a fixed sequence.
  • the length of the P3 sequence is 25-100 nt.
  • the P3 sequence is shown in SEQ ID NO: 6.
  • the P1 sequence, the P2 sequence and the P3 sequence are different from each other and are not complementary to each other.
  • the concentration of the third fragment is 2-15 pM.
  • the concentration of the third fragment is 5 pM.
  • Controlling the concentration of the third fragment at 5 pM can ensure that at least one third fragment is hybridized in each microwell.
  • the amplification described in step S3 requires the addition of P3 primers.
  • the P3 primer is a fixed sequence.
  • the sequence of the P3 primer is the same as the P3 sequence.
  • the P3 primer is as shown in SEQ ID NO:7.
  • The amplification in step S3 is performed with two primers: the P3 primer and the first fragment.
  • when both the first fragment and the sequencing primer are fixed in the microwell in step S1, a partial P3 primer is added in step S3 to perform the amplification.
  • the partial P3 primer is as shown in SEQ ID NO: 16.
  • the amplification method in step S3 is the same as that in step S2.
  • after the amplification in step S3 is completed, a solution that destroys the oil phase system needs to be introduced first.
  • the solution that destroys the oil phase system includes at least one of ethanol, sodium citrate buffer, formamide, sodium hydroxide, detergent or banana oil (lacquer thinner).
  • after destroying the oil phase system in step S3, ddNTP needs to be added to bind to the 3' end of the amplification product.
  • the ddNTP is added to the 3' end of the amplification product by terminal deoxynucleotidyl transferase (TdT).
  • the purpose of binding the ddNTP to the 3' end of the amplification product is to prevent the second capture fragment from continuing to extend when the target DNA is captured at a later stage, so that the chip can be used repeatedly.
  • washing liquid needs to be added to the microwell to wash the amplification product.
  • the washing liquid includes at least one of formamide, sodium hydroxide, urea or helicase.
  • the formamide is selected as the washing liquid.
  • After the washing in step S3, the second capture fragment is retained.
  • the second capture fragment includes a P1 sequence, a first random sequence complementary sequence, a P2 complementary sequence, a second random sequence complementary sequence, and a P3 complementary sequence.
  • the length of the second random sequence complementary sequence is 10-15 nt.
  • the P3 complementary sequence is a fixed sequence.
  • the length of the P3 complementary sequence is 25-100 nt.
  • the P3 complementary sequence is shown in SEQ ID NO: 8.
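The composition of the first and second capture fragments follows from two rounds of hybridization and extension. The sketch below models this on strings. It is an illustration only: the P1, P2 and P3 values are short hypothetical placeholders (the real ones are SEQ ID NO: 1, 3 and 6, which are not reproduced here), "complementary sequence" is taken to mean the reverse complement, and the templates are assumed to hybridize through their 3' ends.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def extend(anchored: str, template: str, overlap: int) -> str:
    """Extend the immobilized strand along a template whose 3' end hybridizes
    to the last `overlap` bases of the immobilized strand (both written 5'->3')."""
    if anchored[-overlap:] != revcomp(template[-overlap:]):
        raise ValueError("3' ends are not complementary")
    return anchored + revcomp(template[:-overlap])


# Hypothetical placeholder sequences, NOT the sequences of the patent.
P1, P2, P3 = "ACCTGATGCA", "GGCATTCAGC", "CTAGGTCAAG"
rng = random.Random(0)
N1 = "".join(rng.choice("ACGT") for _ in range(12))  # first random sequence (10-15 nt)
N2 = "".join(rng.choice("ACGT") for _ in range(12))  # second random sequence (10-15 nt)

# Step S2: the second fragment carries the P1 complementary sequence at its 3' end.
second_fragment = P2 + N1 + revcomp(P1)
first_capture = extend(P1, second_fragment, len(P1))

# Step S3: the third fragment carries the P2 sequence at its 3' end.
third_fragment = P3 + N2 + P2
second_capture = extend(first_capture, third_fragment, len(P2))

print(first_capture)
print(second_capture)
```

With these assumptions the second capture fragment reads, from the immobilized end, P1, the complement of N1, the complement of P2, the complement of N2 and the complement of P3, matching the order listed above.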
  • when both the first fragment and the sequencing primer are fixed in the microwell in step S1, the second capture fragment includes a P1 sequence, a first random sequence complementary sequence, a P2 complementary sequence, a second random sequence complementary sequence, and a partial P3 complementary sequence.
  • in steps S1 to S3, the P1 sequence, the P2 sequence, the P3 sequence, the first random sequence N1 and the second random sequence N2 are introduced, which can basically meet current sequencing requirements.
  • the first random sequence N1 and the second random sequence N2 are introduced as tag sequences to distinguish the different target DNAs that are subsequently captured; combining and comparing the two random sequences makes it possible to determine whether a single fragment or multiple fragments containing the P2 sequence or the P3 sequence entered a given microwell when the second capture fragment was prepared; and the combination of two random sequences can label more target DNAs.
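The claim that combining two random sequences labels more target DNA can be checked with simple counting. The sketch below assumes that every base of N1 and N2 is drawn uniformly from A, C, G and T and compares the number of possible tags with the preferred count of 10 million microwells; the helper name is hypothetical.

```python
N_WELLS = 10_000_000  # preferred number of microwells per chip


def n_tags(length_nt: int) -> int:
    """Number of distinct random sequences of the given length (4 bases per position)."""
    return 4 ** length_nt


for length in (10, 15):
    single = n_tags(length)        # N1 alone
    combined = single * single     # N1 combined with an N2 of the same length
    print(f"{length} nt: {single:,} single tags, {combined:,} combined tags, "
          f"{combined / N_WELLS:,.0f} combined tags per well")
```

A single 10 nt random sequence offers 1,048,576 tags, fewer than the number of wells, whereas the N1 and N2 pair at 10 nt each offers about 1.1 × 10^12 combinations.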
  • the P1 sequence is introduced to fix the first capture fragment and the second capture fragment, and can serve as an amplification primer for the first capture fragment and the second capture fragment.
  • the P2 sequence and the P3 sequence are introduced to serve as primers to detect the base information of the first random sequence N1 and the second random sequence N2 during sequencing. At the same time, the P3 sequence is also used to hybridize with the target DNA and capture it.
  • the P1 sequence and the second fragment cannot be introduced at the same time, because the P1 sequence is fixed in the microwell of the chip under conditions that differ from those used when the second fragment is introduced; moreover, the second fragment contains a sequence complementary to the P1 sequence (i.e., the P1 complementary sequence), so if the two were added at the same time, binding of the P1 complementary sequence to the P1 sequence would interfere with the fixation efficiency of the P1 sequence and the extension efficiency of the second fragment in the microwell.
  • the transposon includes a first double-stranded DNA and a second double-stranded DNA.
  • the first double-stranded DNA includes an a-strand and a b-strand.
  • the a chain includes a P4 sequence, a coding sequence, a fixed sequence, and the complementary sequence of the fixed sequence specifically recognized by a transposase (ME complementary sequence).
  • the b chain includes a fixed sequence (ME sequence) specifically recognized by a transposase.
  • the a chain and the b chain pair with each other through the ME complementary sequence and the ME sequence to form a double strand, which is recognized by the transposase and thereby bound to the transposase.
  • the second double-stranded DNA includes c-strand and d-strand.
  • the c-chain includes ME complementary sequences.
  • the d chain includes part or all of the P3 sequence, as well as the ME sequence.
  • the c chain and the d chain pair with each other through the ME complementary sequence and the ME sequence to form a double strand, which is recognized by the transposase and thereby bound to the transposase.
  • the P4 sequence is a fixed sequence.
  • the length of the P4 sequence is 25-100 nt.
  • the P4 sequence is shown in SEQ ID NO: 9.
  • the P1 sequence, the P2 sequence, the P3 sequence and the P4 sequence are different from each other and are not complementary to each other.
  • the coding sequence is a random sequence.
  • the length of the coding sequence is 6 to 8 nt.
  • the transposon includes part of the P3 sequence, as shown in SEQ ID NO: 14.
  • when both the first fragment and the sequencing primer are fixed in the microwell in step S1, the transposon includes a third double-stranded DNA and a fourth double-stranded DNA.
  • the third double-stranded DNA is the same as the first double-stranded DNA.
  • the fourth double-stranded DNA includes e-strand and f-strand.
  • the e-chain includes a ME sequence.
  • the f chain includes the partial P3 complementary sequence and the ME complementary sequence.
  • the e chain and the f chain pair with each other through the ME sequence and the ME complementary sequence to form a double strand, which is recognized by the transposase and thereby bound to the transposase.
  • the function of the transposase is to cleave the target DNA into fragmented target DNA and to add adapters to both ends of the fragmented target DNA.
  • the enzymatic activity of the transposase and the ratio of the reaction between the transposase and the target DNA will affect the capture and sequencing results.
  • Low enzyme activity will result in poor fragmentation of the target DNA, and an inappropriate ratio of the transposase to the target DNA will result in the length of the sequencing library being too long or too short, thereby reducing the sequencing quality.
  • the enzymatic activity of the transposase is 1 to 20 U/μL.
  • 0.01-20 U of the transposase can bind to 0.1-500 ng of the target DNA.
  • the second capture fragment in step S4 captures the complex through its P3 complementary sequence being complementary to part or all of the P3 sequence of the transposon.
  • the second capture fragment in step S4 captures the complex by complementing part or all of its P3 complementary sequence with part or all of the P3 sequence of the transposon.
  • when both the first fragment and the sequencing primer are fixed in the microwell in step S1, step S4 first needs to provide a fourth fragment that includes the P3 sequence and hybridize the fourth fragment with the second capture fragment; the second capture fragment then captures the complex through the fourth fragment.
  • the fourth fragment hybridizes to the partial P3 complementary sequence in the second capture fragment.
  • the sequence in the fourth fragment that does not hybridize with the second capture fragment is used to capture the complex.
  • the sequence in the fourth fragment that does not hybridize with the second capture fragment is complementary to the partial P3 complementary sequence of the transposon, and thereby captures the complex.
  • the concentration of the fourth fragment is 2-15 pM.
  • the concentration of the fourth fragment is 5 pM.
  • Keeping the concentration of the fourth fragment at 5pM can ensure that at least one of the fourth fragments is hybridized in each microwell.
  • the extension product is separated and collected.
  • the method for separating and collecting the extension product is to add a washing solution to elute the extension product and collect it.
  • the washing liquid includes at least one of sodium hydroxide, formamide or urea.
  • the sodium hydroxide is selected as the washing liquid.
  • the partial transposon sequence described in step S5 includes part or all of the P3 sequence, ME sequence, and a chain (ME complementary sequence, fixed sequence, coding sequence, and P4 sequence) of the transposon.
  • the sequences of the sequencing library described in step S5 include P1 complementary sequence, first random sequence N1, P2 sequence, second random sequence N2, P3 sequence, ME sequence, fragmented target DNA, a chain (ME complementary sequence, fixed sequence, coding sequence and P4 sequence).
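Because the fixed sequences flank the two random sequences, the tags can be recovered from a library strand by locating the fixed sequences. The following is a minimal sketch under stated assumptions: the library strand is read in the listed order, the fixed sequences are matched exactly, and all sequences shown are hypothetical placeholders rather than the SEQ ID NO sequences of the patent.

```python
from typing import Optional, Tuple


def extract_tags(read: str, p1c: str, p2: str, p3: str) -> Optional[Tuple[str, str]]:
    """Return (N1, N2) from a strand laid out as P1c-N1-P2-N2-P3-..., or None if it does not fit."""
    if not read.startswith(p1c):
        return None
    p2_start = read.find(p2, len(p1c))
    if p2_start < 0:
        return None
    p3_start = read.find(p3, p2_start + len(p2))
    if p3_start < 0:
        return None
    n1 = read[len(p1c):p2_start]
    n2 = read[p2_start + len(p2):p3_start]
    # The text gives 10 to 15 nt for each random sequence.
    if not (10 <= len(n1) <= 15 and 10 <= len(n2) <= 15):
        return None
    return n1, n2


# Hypothetical placeholders for the P1 complementary, P2 and P3 sequences.
P1C, P2, P3 = "TTGACCGGAT", "GGCATTCAGC", "CTAGGTCAAG"
n1, n2 = "ACGTACGTACGT", "TTTTGGGGCCCC"
# Everything after P3 stands in for the ME sequence, target DNA and a chain.
read = P1C + n1 + P2 + n2 + P3 + "CCCCCCATATATATAT"

print(extract_tags(read, P1C, P2, P3))
```

A practical pipeline would also tolerate sequencing errors in the fixed sequences; exact matching is used here only to keep the sketch short.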
  • when step S1 fixes both the first fragment and the sequencing primer in the microwell, step S5 includes: removing the transposase to obtain the composite sequence, and ligating the composite sequence to the second capture fragment to form the sequencing library.
  • a washing solution needs to be added to the microwells to wash the sequencing library.
  • the washing liquid includes at least one of formamide, sodium hydroxide, urea or helicase.
  • the partial transposon sequence described in step S5 includes the partial P3 complementary sequence, the ME complementary sequence, and the a chain complementary sequence (P4 complementary sequence, coding sequence complementary sequence, fixed sequence complementary sequence, and ME sequence) of the transposon.
  • the sequences of the sequencing library described in step S5 include the P1 sequence, first random sequence complementary sequence, P2 complementary sequence, second random sequence complementary sequence, P3 complementary sequence, ME complementary sequence, fragmented target DNA, and the a chain complementary sequence (P4 complementary sequence, coding sequence complementary sequence, fixed sequence complementary sequence and ME sequence).
  • a kit for constructing a sequencing library including:
  • a chip, the chip having several microwells in which second capture fragments are fixed;
  • the second capture fragment includes a P1 sequence, a first random sequence complementary sequence, a P2 complementary sequence, a second random sequence complementary sequence, and a P3 complementary sequence;
  • the kit also includes a transposase and a transposon; the transposon includes any one of a partial P3 sequence, a complete P3 sequence, and a partial P3 complementary sequence.
  • a second capture fragment and a sequencing primer are fixed in the microwell;
  • the second capture fragment includes a P1 sequence, a first random sequence complementary sequence, a P2 complementary sequence, a second random sequence complementary sequence, and a partial P3 complementary sequence;
  • the kit also includes a transposase, a transposon and a fourth fragment; the transposon includes the partial P3 complementary sequence, and the fourth fragment includes the P3 sequence.
  • the transposase and the transposon are used to bind target DNA to form a complex.
  • the second capture fragment is used to capture the complex.
  • the second capture fragment captures the complex by complementing part or all of its P3 complementary sequence with part or all of the P3 sequence of the transposon.
  • the transposase cleaves the target DNA to form fragmented target DNA; the transposase is then removed to obtain a composite sequence.
  • when the second capture fragment and the sequencing primer are immobilized in the microwell, the second capture fragment captures the complex through the fourth fragment.
  • the fourth fragment captures the complex because the partial P3 sequence of the fourth fragment that is not hybridized to the second capture fragment is complementary to the partial P3 complementary sequence of the transposon.
  • the composite sequence includes a portion of the transposon sequence and fragmented target DNA.
  • the partial transposon sequence includes part or all of the P3 sequence of the transposon, ME sequence, fragmented target DNA, a chain (ME complementary sequence, fixed sequence, coding sequence and P4 sequence).
  • when the second capture fragment and the sequencing primer are fixed in the microwell, the partial transposon sequence includes the partial P3 complementary sequence of the transposon, the ME complementary sequence, the fragmented target DNA, and the a chain complementary sequence (P4 complementary sequence, coding sequence complementary sequence, fixed sequence complementary sequence and ME sequence).
  • the composite sequence is extended to obtain an extension product, and the extension product is the sequencing library.
  • the complex sequence obtained by removing the transposase is ligated with the second capture fragment to form the sequencing library.
  • the kit includes a chip, and the second capture fragment is immobilized in the microwell of the chip; the kit also includes a transposase, a transposon, dNTPs, a DNA polymerase, and a washing solution.
  • the dNTP and the DNA polymerase are used to extend the composite sequence.
  • the washing liquid includes at least one of sodium hydroxide, formamide or urea.
  • the washing solution is used to elute the extension product.
  • the kit includes a chip, and the second capture fragment and the sequencing primer are immobilized in the microwells of the chip; the kit also includes a transposase, a transposon, a fourth fragment, dNTPs, a DNA polymerase, and a washing solution.
  • a gene sequencing method which includes the following steps: using the sequencing library construction method to construct a sequencing library of the sample to be tested, and then amplifying and sequencing the sequencing library.
  • Figure 1 is a chip prepared in Example 1 of the present invention
  • Figure 2 is a schematic diagram of fixing the first fragment in the microhole of the chip in Embodiment 1 of the present invention
  • Figure 3 is a schematic diagram of the first capture fragment obtained in Example 1 of the present invention; the legend on the left represents the second fragment, which from bottom to top is the P1 complementary sequence, the first random sequence N1 and the P2 sequence;
  • Figure 4 is a schematic diagram of the second capture fragment obtained in Embodiment 1 of the present invention; the legend on the left represents the third fragment, and from bottom to top are the P2 sequence, the second random sequence N2 and the P3 sequence;
  • Figure 5 is a schematic structural diagram of the transposon in Embodiment 1 of the present invention; ME denotes the fixed sequence specifically recognized by the transposase, and ME' denotes the complementary sequence of that fixed sequence;
  • Figure 6 is a schematic diagram of the second capture fragment capturing the complex in Example 1 of the present invention; the complex includes target DNA and a transposome bound to the target DNA, the transposome includes a transposase and a transposon, and the transposon includes part of the P3 sequence;
  • Figure 7 is a schematic diagram of the extension product obtained in step (5) in Example 1 of the present invention; wherein, the a chain from bottom to top is the ME complementary sequence, fixed sequence, coding sequence, and P4 sequence;
  • Figure 8 is a schematic diagram of the sequencing library eluted in Example 1 of the present invention; the specific markers are the same as Figure 7;
  • Figure 9 is a schematic diagram of fixing the first fragment and sequencing primer in the microwell of the chip in Example 2 of the present invention.
  • Figure 10 is a schematic diagram of the second capture fragment obtained in Embodiment 2 of the present invention.
  • the legend on the left represents the third fragment, and from bottom to top are the P2 sequence, the second random sequence N2 and the P3 sequence;
  • Figure 11 is a schematic structural diagram of the transposon in Example 2 of the present invention; wherein, the fixed sequence specifically recognized by ME-transposase and the complementary sequence of the fixed sequence specifically recognized by ME'-transposase;
  • Figure 12 is a schematic diagram of the complex captured by the fourth fragment in Example 2 of the present invention; the complex includes target DNA and a transposome bound to the target DNA.
  • the transposome includes a transposase and a transposon.
  • the transposon Includes part of the P3 complementary sequence;
  • Figure 13 is a schematic diagram of the sequencing library obtained in Example 2 of the present invention; wherein, from bottom to top, the a chain complementary sequence is the P4 complementary sequence, the coding sequence complementary sequence, the fixed sequence complementary sequence, and the ME sequence.
  • Unless otherwise stated, the test methods used in the examples are conventional methods, and the materials and reagents used are commercially available.
  • This example constructs a sequencing library of the Escherichia coli (E. coli) genome.
  • The specific process is as follows:
  • The chip measures 25 mm × 75 mm.
  • The chip includes two flow channels, each containing several microwells, and the whole chip contains 10 million microwells. As shown in Figure 1, each microwell is 5 μm in diameter and 10 μm deep, and the center-to-center distance between adjacent microwells is 12 μm.
  • The bottom of each microwell is modified with γ-aminopropyltriethoxysilane (APTES), which is used to immobilize the first fragment (i.e., the P1 sequence); the microwell edges and the areas between microwells are modified with trimethylsilane, which acts as a hydrophobic barrier and prevents reagents in the microwells, such as enzymes and DNA, from adsorbing non-specifically or spreading into adjacent microwells and causing cross-reactions.
  • An azide group is attached to the amino group of APTES, and the P1 sequence is modified with dibenzocyclooctyne (DBCO). The DBCO-P1 conjugate is added to the microwells, and the first fragment is immobilized at the bottom of the microwells through the covalent bond formed between DBCO and the azide group (Figure 2).
  • The P1 sequence is shown in SEQ ID NO: 1 (GTTAACCCTTAGCGGGGCGTTGTAGTGCGTAAAGA).
  • The P1 complementary sequence is shown in SEQ ID NO: 2 (TCTTTACGCACTACAACGCCCCGCTAAGGGTTAAC).
  • The first random sequence N1 is 15 nt long.
  • The P2 sequence is shown in SEQ ID NO: 3 (GTCCTAACCGAATATGAACCGTCGTCAATCCAGGT).
  • The second fragment is added at a controlled concentration of 5 pM so that at least one second fragment hybridizes in each microwell; sodium citrate buffer is added to wash away unhybridized material, and then the recombinase, DNA polymerase, dNTPs and P2 primer used for isothermal amplification are added (the P2 primer has the same sequence as the P2 sequence, shown in SEQ ID NO: 4 (GTCCTAACCGAATATGAACCGTCGTCAATCCAGGT)).
  • The P2 complementary sequence is shown in SEQ ID NO: 5 (ACCTGGATTGACGACGGTTCATATTCGGTTAGGAC); see Figure 3.
  • This strand is called the second capture fragment (P1 sequence - complement of the first random sequence - P2 complementary sequence - complement of the second random sequence - P3 complementary sequence; the complement of the second random sequence is 15 nt long, and the P3 complementary sequence is shown in SEQ ID NO: 8 (CAATGTAACTTTCGCTGTCTGTCATATTAGCTGTTAGGGAGCGTAATCATTCTCCATTCCATTCCAGCAGTAGATCATGTATTAGGTTCTATAGGTAATC)); see Figure 4.
  • A transposon is provided, comprising a first double-stranded DNA and a second double-stranded DNA.
  • Its structure is shown schematically in Figure 5.
  • The two double-stranded DNAs attached to the transposase in Figure 5 are the first double-stranded DNA and the second double-stranded DNA.
  • The first double-stranded DNA includes the a-strand (comprising the P4 sequence (SEQ ID NO: 9: GCTATGAGTCCATGGCAGATGTCGAGAT), the coding sequence (an 8 nt random sequence), the fixed sequence (SEQ ID NO: 10: GAGTATTTGTCCGCAGA) and the complement of the fixed sequence specifically recognized by the transposase (ME complementary sequence, SEQ ID NO: 11: AGATGTGTATAAGAGACAG)) and the b-strand (ME sequence, SEQ ID NO: 12: CTGTCTCTTATACACATCT); the second double-stranded DNA includes the c-strand (comprising the ME complementary sequence (SEQ ID NO: 13: AGATGTGTATAAGAGACAG)) and the d-strand (comprising part of the P3 sequence (SEQ ID NO: 14: GATTACCTATAGAACCTAATACATGATCTACTGCTGGAATGGAATGGA) and the ME sequence (SEQ ID NO: 15: CTGTCTCTTATACACATCT)).
  • The transposon is combined with the transposase to form a transposome, which is incubated with the target DNA of the prepared E. coli genome (the transposase activity here is 10 U/μL, and 0.01 to 20 U of transposase can bind 0.1 to 500 ng of target DNA) to obtain a complex.
  • The complex is shown in Figure 6;
  • This example constructs a sequencing library of the E. coli genome.
  • The specific process is as follows:
  • The bottom of each microwell is modified with γ-aminopropyltriethoxysilane (APTES), which is used to immobilize the first fragment (i.e., the P1 sequence) and the sequencing primer; the microwell edges and the areas between microwells are modified as in Example 1.
  • The second capture fragment is obtained (P1 sequence - complement of the first random sequence - P2 complementary sequence - complement of the second random sequence - partial P3 complementary sequence; the complement of the second random sequence is 15 nt long, and the partial P3 complementary sequence is shown in SEQ ID NO: 17: CAATGTAACTTTCGCTGTCTGTCATATTAGCTGTTAGGGAGCGTAATCATTC); see Figure 10.
  • The P3 sequence is shown in SEQ ID NO: 6; the fourth fragment is added to the microwells of the chip so that it hybridizes with the partial P3 complementary sequence of the second capture fragment.
  • The third double-stranded DNA is the same as the first double-stranded DNA.
  • The fourth double-stranded DNA includes the e-strand (comprising the ME sequence (SEQ ID NO: 18: CTGTCTCTTATACACATCT)) and the f-strand (comprising the partial P3 complementary sequence (SEQ ID NO: 19: TCCATTCCATTCCAGCAGTAGATCATGTATTAGGTTCTATAGGTAATC) and the ME complementary sequence (SEQ ID NO: 20: AGATGTGTATAAGAGACAG)).
  • The transposon is combined with the transposase to form a transposome, which is incubated with the target DNA of the prepared E. coli genome (the transposase activity here is 10 U/μL, and 0.01 to 20 U of transposase can bind 0.1 to 500 ng of target DNA) to obtain a complex; the complex is added to the microwells and is captured by base pairing between the part of the fourth fragment's P3 sequence that has not hybridized with the second capture fragment and the partial P3 complementary sequence of the transposon; see Figure 12;
  • The partial transposon sequence includes the transposon's partial P3 complementary sequence, the ME complementary sequence and the complement of the a-strand (P4 complementary sequence, complement of the coding sequence, complement of the fixed sequence, ME sequence).
  • The composite sequence is ligated to the second capture fragment to form the sequencing library (P1 sequence - complement of the first random sequence - P2 complementary sequence - complement of the second random sequence - P3 complementary sequence - ME complementary sequence - fragmented target DNA - complement of the a-strand (P4 complementary sequence, complement of the coding sequence, complement of the fixed sequence, ME sequence)); see Figure 13.
  • Sequencing library purification: add 2 volumes of DNA purification magnetic beads to the eluted sequencing library, mix, incubate at room temperature for 5 min, place on a magnetic stand until the solution is clear, and discard the supernatant. Wash away salt ions by gently pipetting the beads in 80% ethanol, twice in total. Centrifuge briefly, remove all remaining ethanol, and air-dry with the lid open until the bead pellet cracks. Add 23 μL of clean nuclease-free water to elute the sequencing library, place on a magnetic stand until clear, and transfer 22 μL of the supernatant to a clean PCR tube for later use.
  • Sequencing library quality control: take 1 μL of the supernatant and quantify it with a Qubit ssDNA quantification kit. If the concentration is greater than or equal to 4 nM, the library is ready for sequencing and can be sequenced directly with the Salus Pro Sequencing Reagent Set; for the on-instrument steps, refer to "Salus Pro Sequencing Reagent Set, Instruction Manual Version V1.1". If the library concentration is below 4 nM, one round of PCR amplification and enrichment is required before sequencing.
  • Sequencing library enrichment (optional): to 20 μL of the eluted sequencing library, add a mixture with a total volume of 30 μL (polymerase, dNTPs, polymerase reaction buffer, sequencing primer, P1 sequence) and run a three-step PCR on the library. The program is: 98°C for 1 min; 4 to 10 cycles of (98°C for 10 s, 60°C for 30 s, 72°C for 30 s); 72°C for 5 min; hold at 4°C.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • Microbiology (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Biomedical Technology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Immunology (AREA)
  • Plant Pathology (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

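The invention belongs to the field of gene sequencing and discloses a method for constructing a sequencing library, a kit for constructing a sequencing library, and a gene sequencing method. The library construction method comprises: immobilizing a first fragment in the microwells of a chip; hybridizing a second fragment with the first fragment, then extending and amplifying to obtain a first capture fragment; hybridizing a third fragment with the first capture fragment, then extending and amplifying to obtain a second capture fragment; capturing a complex with the second capture fragment, the complex comprising target DNA and a transposome; and removing the transposase to obtain a composite sequence, which is extended to give the sequencing library. The kit comprises a chip, the second capture fragment immobilized in the microwells of the chip, a transposase and a transposon. The sequencing method comprises constructing a sequencing library by the above method, then amplifying and sequencing it. The library construction method is simple to operate, substantially reduces cost, and is easy to combine with automation platforms later.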

Description

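Method for constructing a sequencing library, kit for constructing a sequencing library, and gene sequencing method
This application claims priority to the Chinese patent application No. 202210916238.3, filed with the China Patent Office on 1 August 2022 and entitled "Method for constructing a sequencing library, kit for constructing a sequencing library, and gene sequencing method", the entire contents of which are incorporated herein by reference.
Technical Field
The invention belongs to the field of gene sequencing, and specifically relates to a method for constructing a sequencing library, a kit for constructing a sequencing library, and a gene sequencing method.
Background
Next-generation sequencing (NGS), with its high throughput, high accuracy and low cost, has greatly changed research and clinical applications of gene sequencing in biology and medicine. However, because its read length is relatively short, generally 300 to 500 bp at most, NGS performs poorly in areas such as detecting genomic structural variation and genome phasing. A commonly used approach is as follows: a long DNA fragment of 100,000 to 1,000,000 bases is fragmented within a confined region, and a single type of DNA tag and adapters are added; different long fragments are fragmented in different confined regions, each receiving its own distinct tag and the same adapters, and the products are then sequenced. DNA sequences carrying the same tag can be assigned automatically to the same region in the sequencing results, so a small amount of DNA can first be assembled locally before the overall assembly is carried out. This yields longer assembled sequences and thereby addresses the problems NGS faces in genome phasing and structural variation. Sequences carrying the same DNA tag are called linked reads.
Based on this approach, the main library construction methods at present are as follows. 1. Chromium from 10x Genomics uses microfluidics to generate droplets, confining one long DNA fragment, a single type of DNA tag, and the fragmentation and adapter-addition reagents in one water-in-oil droplet; the long DNA is fragmented and given the DNA tag and adapters inside the droplet, and the droplets are finally broken to collect the DNA. The method is convenient to use, but droplet generation is time-consuming and the workflow is complex, and it has gradually been withdrawn from the market. 2. Single tube long fragment reads (stLFR) from BGI and Transposase Enzyme Linked Long-read Sequencing (TELL-Seq™) from Universal Sequencing Technology both use transposase-based fragmentation. Microbeads carrying different tag sequences are first prepared; transposase loaded with transposon DNA is then mixed with long DNA fragments, and the tag sequence on a bead captures the transposon and thereby the long DNA fragment, with one bead generally capturing at most one long fragment. Finally, an extension reaction adds the bead's tag sequence to the DNA fragmented by the transposase. These two methods need no physical water-in-oil compartmentalization, and the homogeneous reaction is relatively fast, but they require more manual steps, which hinders later automation.
In addition, all three of the above methods require separately synthesizing a large number of different primers to modify capture beads or to produce hydrogel beads, which is their core source of cost. It is therefore necessary to provide a library construction method that is simple to operate, low in cost and easy to automate later.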
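Summary of the Invention
The invention aims to solve at least one of the technical problems in the prior art described above. To this end, the invention proposes a method for constructing a sequencing library that is simple to operate, substantially reduces cost, and is easy to combine with automation platforms later.
The invention also proposes a kit for constructing a sequencing library.
The invention also proposes a gene sequencing method.
According to one aspect of the invention, a method for constructing a sequencing library is proposed, comprising the following steps:
S1: providing a chip and a first fragment, the chip having a plurality of microwells and the first fragment comprising a P1 sequence, and immobilizing the first fragment in the microwells;
S2: providing a second fragment comprising a P1 complementary sequence, a first random sequence N1 and a P2 sequence, and hybridizing the second fragment with the first fragment; extending and amplifying the first fragment to obtain a first capture fragment immobilized in the microwells;
S3: providing a third fragment comprising the P2 sequence, a second random sequence N2 and a P3 sequence, and hybridizing the third fragment with the first capture fragment; extending and amplifying the first capture fragment to obtain a second capture fragment immobilized in the microwells;
S4: capturing a complex with the second capture fragment, the complex comprising target DNA and a transposome bound to the target DNA, the transposome comprising a transposase and a transposon, and the transposon comprising any one of a partial P3 sequence, the full P3 sequence, or a partial P3 complementary sequence;
S5: removing the transposase to obtain a composite sequence comprising a partial transposon sequence and fragmented target DNA, and extending the composite sequence to obtain an extension product, the extension product being the sequencing library.
A preferred embodiment of the invention has at least the following beneficial effects:
Two random sequences are introduced into the second capture fragment of the invention. These random sequences are generated by on-chip amplification from a random library synthesized in bulk in a single synthesis, so there is no need, as in the prior art, to separately synthesize a large number of different primers to modify capture beads or to produce hydrogel beads, which significantly reduces the core cost. In addition, library construction according to the invention is simple to operate and can readily be combined with automation platforms later to sequence long DNA fragments.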
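In some embodiments, the chip includes 2 flow channels, each of which includes a plurality of the microwells.
In some embodiments, the chip includes 8 million to 15 million microwells.
In some preferred embodiments, the chip includes 10 million microwells.
In some embodiments, the microwells are 2 to 8 μm in diameter.
In some preferred embodiments, the microwells are 5 μm in diameter.
In some embodiments, the microwells are 5 to 15 μm deep.
In some preferred embodiments, the microwells are 10 μm deep.
In some embodiments, the center-to-center distance between two adjacent microwells is 8 to 15 μm.
In some preferred embodiments, the center-to-center distance between two adjacent microwells is 12 μm.
In some preferred embodiments, the microwells are 5 μm in diameter and 10 μm deep, and the center-to-center distance between two adjacent microwells is 12 μm.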
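As a rough plausibility check on the preferred geometry, the sketch below computes the volume of one well and the footprint of 10 million wells. It assumes cylindrical wells laid out on a square grid and uses the 25 mm × 75 mm chip size from Example 1; neither the well shape nor the grid layout is stated in the text, so both are illustrative assumptions.

```python
import math

# Preferred well geometry from the text; chip size is taken from Example 1.
WELL_DIAMETER_UM = 5.0
WELL_DEPTH_UM = 10.0
WELL_PITCH_UM = 12.0
N_WELLS = 10_000_000
CHIP_AREA_MM2 = 25.0 * 75.0


def well_volume_pl(diameter_um: float, depth_um: float) -> float:
    """Volume of a cylindrical well in picolitres (1 pL = 1000 cubic micrometres)."""
    return math.pi * (diameter_um / 2.0) ** 2 * depth_um / 1000.0


def array_area_mm2(n_wells: int, pitch_um: float) -> float:
    """Footprint of n_wells on a square grid (assumed layout) in square millimetres."""
    return n_wells * pitch_um ** 2 / 1e6


volume_pl = well_volume_pl(WELL_DIAMETER_UM, WELL_DEPTH_UM)
area_mm2 = array_area_mm2(N_WELLS, WELL_PITCH_UM)
fill_fraction = area_mm2 / CHIP_AREA_MM2

print(f"well volume: {volume_pl:.3f} pL")
print(f"array footprint: {area_mm2:.0f} mm^2 ({fill_fraction:.0%} of the chip)")
```

Under these assumptions each well holds about 0.2 pL, and the array occupies roughly 1440 mm² of the 1875 mm² chip, so 10 million wells at a 12 μm pitch fit on the stated chip size.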
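In some embodiments, step S1 further includes immobilizing a sequencing primer in the microwells.
In some embodiments, the bottom of the microwells is modified with a silane coupling agent.
In some embodiments, the bottom of the microwells is modified with either an aminosilane or a chlorine-containing silane coupling agent.
In some embodiments, the bottom of the microwells is modified with any one of γ-aminopropyltriethoxysilane (APTES), aminopropyltrimethoxysilane or trichlorosilylethane.
In some preferred embodiments, the bottom of the microwells is modified with γ-aminopropyltriethoxysilane.
The purpose of the silane modification at the bottom of the microwells is to immobilize the first fragment or the sequencing primer.
Specifically, the first fragment or the sequencing primer is immobilized in the microwells of the chip as follows: azide groups are introduced into the aminosilane-modified microwells, the first fragment or sequencing primer modified with dibenzocyclooctyne (DBCO) is added to the microwells, and the covalent bond formed between DBCO and the azide group immobilizes the first fragment or the sequencing primer in the microwells.
Alternatively, 1,4-phenylene diisothiocyanate or glutaraldehyde is introduced into the aminosilane-modified microwells, and the first fragment or the sequencing primer is immobilized in the microwells through the covalent bonds that the 1,4-phenylene diisothiocyanate or glutaraldehyde forms with the aminosilane on one side and with the amino-modified first fragment or sequencing primer on the other.
Alternatively, the bifunctional reagent SMBP is introduced into the aminosilane-modified microwells, and the first fragment or the sequencing primer is immobilized in the microwells through the covalent bonds that SMBP forms with the aminosilane on one side and with the thiol-modified first fragment or sequencing primer on the other.
Alternatively, the trichlorosilylethane modified in the microwells is first converted to a benzyl chloride silane, sodium iodide (NaI) is added to activate it into a benzyl iodide group, and a thiol-modified first fragment or sequencing primer is then added and immobilized in the microwells by coupling with the benzyl iodide group.
In some embodiments, the edges of the microwells are modified with at least one of trimethylsilane, isobutenyltriethoxysilane, disiloxane or cyclotetrasiloxane.
In some preferred embodiments, the edges of the microwells are modified with trimethylsilane.
In some embodiments, the areas between adjacent microwells are modified with at least one of trimethylsilane, isobutenyltriethoxysilane, disiloxane or cyclotetrasiloxane.
In some preferred embodiments, the areas between adjacent microwells are modified with trimethylsilane.
Modifying the edges of the microwells and the areas between adjacent microwells in this way mainly provides hydrophobicity, preventing reagents in the microwells, such as enzymes and DNA, from adsorbing non-specifically or spreading into adjacent microwells and causing cross-reactions.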
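In some embodiments, the P1 sequence is a fixed sequence.
In some embodiments, the P1 sequence is 25 to 100 nt long.
In some preferred embodiments, the P1 sequence is shown in SEQ ID NO: 1.
In some embodiments, the sequencing primer is a fixed sequence.
In some embodiments, the sequencing primer is partially or fully complementary to the P1 sequence.
In some embodiments, the P1 complementary sequence is a fixed sequence.
In some embodiments, the P1 complementary sequence is 25 to 100 nt long.
In some preferred embodiments, the P1 complementary sequence is shown in SEQ ID NO: 2.
In some embodiments, the P2 sequence is a fixed sequence.
In some embodiments, the P2 sequence is 25 to 100 nt long.
In some preferred embodiments, the P2 sequence is shown in SEQ ID NO: 3.
In some embodiments, the P1 sequence and the P2 sequence are different and not complementary to each other.
In some embodiments, the first random sequence N1 is 10 to 15 nt long.
In some embodiments, the concentration of the second fragment is 2 to 15 pM.
In some preferred embodiments, the concentration of the second fragment is 5 pM.
Controlling the concentration of the second fragment at 5 pM ensures that at least one second fragment hybridizes in each microwell.
In some embodiments, after the hybridization, sodium citrate buffer is added to wash away unhybridized material; the sodium citrate solution does not affect strands that have already hybridized.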
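In some embodiments, the amplification in step S2 requires adding a P2 primer.
In some embodiments, the P2 primer is a fixed sequence.
In some embodiments, the P2 primer has the same sequence as the P2 sequence.
In some preferred embodiments, the P2 primer is shown in SEQ ID NO: 4.
The amplification in step S2 uses two primers: the P2 primer and the first fragment.
In some embodiments, after the P2 primer is added in step S2, oil is introduced to seal the microwells.
In some embodiments, the oil is any one of liquid paraffin oil, silicone oil or petroleum jelly.
In some embodiments, the amplification in step S2 uses any one of rapid-cycle PCR, solid-phase PCR or isothermal PCR amplification.
In some embodiments, the amplification in step S2 uses isothermal amplification.
In some embodiments, the amplification in step S2 uses either helicase-dependent amplification or recombinase polymerase amplification (RPA).
In some preferred embodiments, the amplification in step S2 uses recombinase polymerase amplification.
In some embodiments, when the P2 primer is added in step S2, the enzymes required for the amplification and dNTPs are also added.
In some preferred embodiments, the enzymes required for the amplification include a recombinase and a DNA polymerase.
In some embodiments, the amplification time in step S2 is 50 to 70 min.
In some preferred embodiments, the amplification time in step S2 is 60 min.
In some embodiments, after the amplification in step S2 is complete, a solution that breaks the oil phase is first introduced.
In some embodiments, the solution that breaks the oil phase includes at least one of ethanol, sodium citrate buffer (SSC solution), formamide, sodium hydroxide, detergent or banana oil (lacquer thinner).
In some embodiments, after the amplification in step S2 is complete, a washing solution is added to the microwells to wash the amplification product.
In some embodiments, the washing solution includes at least one of formamide, sodium hydroxide, urea or helicase.
In some preferred embodiments, the washing solution is formamide.
In the nucleic acid obtained by the amplification in step S2, one strand (the first capture fragment) is attached to the microwell by a chemical bond, while the other strand is paired to it only through hydrogen bonds. The other strand must therefore be removed so that the subsequent hybridization step can proceed. Washing with the above washing solution removes the other strand without breaking the chemical bond, so the first capture fragment is retained.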
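In some embodiments, the first capture fragment includes the P1 sequence, the complement of the first random sequence, and the P2 complementary sequence.
In some embodiments, the complement of the first random sequence is 10 to 15 nt long.
In some embodiments, the P2 complementary sequence is a fixed sequence.
In some embodiments, the P2 complementary sequence is 25 to 100 nt long.
In some preferred embodiments, the P2 complementary sequence is shown in SEQ ID NO: 5.
In some embodiments, when step S2 is complete, 85 to 99% of the microwells of the chip contain the first capture fragment and 1 to 15% contain only the first fragment.
In some preferred embodiments, 90 to 99% of the microwells of the chip contain the first capture fragment and 1 to 10% contain only the first fragment.
In some more preferred embodiments, 95 to 99% of the microwells of the chip contain the first capture fragment and 1 to 5% contain only the first fragment.
In some embodiments, each microwell of the chip contains 1 to 5 first capture fragments.
The figure of 1 to 5 first capture fragments per microwell was obtained by the inventors through repeated experiments.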
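The occupancy figures above can be related to a mean loading per well if one assumes that second fragments seed the wells independently, following a Poisson distribution. The text does not state any loading model, so the sketch below is only an illustration of that assumption: it converts an occupied fraction into the implied mean number of seeded templates per well and the probability of finding 1 to 5 of them.

```python
import math


def poisson_mean_from_occupancy(occupied_fraction: float) -> float:
    """Mean templates per well implied by the fraction of non-empty wells (Poisson assumption)."""
    return -math.log(1.0 - occupied_fraction)


def poisson_pmf(mean: float, k: int) -> float:
    """Probability of exactly k templates in a well for the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


# Occupancy bounds quoted in the text: 85%, 95% and 99% of wells occupied.
for occupied in (0.85, 0.95, 0.99):
    mean = poisson_mean_from_occupancy(occupied)
    p_1_to_5 = sum(poisson_pmf(mean, k) for k in range(1, 6))
    print(f"{occupied:.0%} occupied -> mean {mean:.2f} per well, "
          f"P(1 to 5 templates) = {p_1_to_5:.2f}")
```

Under this model, higher occupancy necessarily comes with more wells holding several distinct templates, which is consistent with the reported range of 1 to 5 first capture fragments per well.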
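In some embodiments, the second random sequence N2 is 10 to 15 nt long.
In some embodiments, the P3 sequence is a fixed sequence.
In some embodiments, the P3 sequence is 25 to 100 nt long.
In some preferred embodiments, the P3 sequence is shown in SEQ ID NO: 6.
In some embodiments, the P1 sequence, the P2 sequence and the P3 sequence are all different from one another and not complementary to one another.
In some embodiments, the concentration of the third fragment is 2 to 15 pM.
In some preferred embodiments, the concentration of the third fragment is 5 pM.
Keeping the concentration of the third fragment at 5 pM ensures that at least one third fragment hybridizes in each microwell.
In some embodiments, the amplification in step S3 requires adding a P3 primer.
In some embodiments, the P3 primer is a fixed sequence.
In some embodiments, the P3 primer has the same sequence as the P3 sequence.
In some preferred embodiments, the P3 primer is shown in SEQ ID NO: 7.
The amplification in step S3 uses two primers: the P3 primer and the first fragment.
In some embodiments, when both the first fragment and the sequencing primer are immobilized in the microwells in step S1, a partial P3 primer is added for the amplification in step S3.
In some preferred embodiments, the partial P3 primer is shown in SEQ ID NO: 16.
In some embodiments, the amplification method in step S3 is the same as in step S2.
In some embodiments, after the amplification in step S3 is complete, a solution that breaks the oil phase is first introduced.
In some embodiments, the solution that breaks the oil phase includes at least one of ethanol, sodium citrate buffer, formamide, sodium hydroxide, detergent or banana oil (lacquer thinner).
In some embodiments, after the oil phase is broken in step S3, ddNTP is added so that it is attached to the 3' end of the amplification product.
In some embodiments, the ddNTP is attached to the 3' end of the amplification product by terminal deoxynucleotidyl transferase (TdT).
The ddNTP is attached to the 3' end of the amplification product to prevent the second capture fragment from being extended further when the target DNA is captured later, which allows the chip to be reused.
In some embodiments, when both the first fragment and the sequencing primer are immobilized in the microwells in step S1, no ddNTP needs to be added.
In some embodiments, after the ddNTP has been added to the 3' end of the amplification product in step S3, a washing solution is added to the microwells to wash the amplification product.
In some embodiments, the washing solution includes at least one of formamide, sodium hydroxide, urea or helicase.
In some preferred embodiments, the washing solution is formamide.
Likewise, the second capture fragment is retained after the washing in step S3.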
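In some embodiments, the second capture fragment includes the P1 sequence, the complement of the first random sequence, the P2 complementary sequence, the complement of the second random sequence, and the P3 complementary sequence.
In some embodiments, the complement of the second random sequence is 10 to 15 nt long.
In some embodiments, the P3 complementary sequence is a fixed sequence.
In some embodiments, the P3 complementary sequence is 25 to 100 nt long.
In some preferred embodiments, the P3 complementary sequence is shown in SEQ ID NO: 8.
In some embodiments, when both the first fragment and the sequencing primer are immobilized in the microwells in step S1, the second capture fragment includes the P1 sequence, the complement of the first random sequence, the P2 complementary sequence, the complement of the second random sequence, and a partial P3 complementary sequence.
Steps S1 to S3 introduce the P1, P2 and P3 sequences and the first and second random sequences N1 and N2, which essentially meets current sequencing requirements.
The first random sequence N1 and the second random sequence N2 serve as tag sequences that distinguish the different target DNAs captured later. Comparing the two random sequences in combination shows whether one or several fragments containing the P2 sequence or the P3 sequence entered a given microwell while the second capture fragment was being prepared, and the combination of two random sequences can label more target DNAs.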
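To illustrate why combining two random sequences labels more targets, the sketch below counts the possible N1 + N2 tag combinations for the stated lengths and estimates how many pairs of wells would share a tag by chance. It assumes one uniformly random tag per well and 10 million wells, which simplifies the 1 to 5 fragments per well described in the text; the function names are hypothetical.

```python
def barcode_space(n1_len: int, n2_len: int) -> int:
    """Number of distinct N1+N2 combinations over the four bases."""
    return 4 ** (n1_len + n2_len)


def expected_colliding_pairs(n_wells: int, space: int) -> float:
    """Expected number of well pairs sharing a tag, for uniformly random tags."""
    return n_wells * (n_wells - 1) / (2 * space)


N_WELLS = 10_000_000  # preferred well count from the text

# Shortest and longest random-sequence lengths allowed by the text (10 to 15 nt each).
for n1_len, n2_len in ((10, 10), (15, 15)):
    space = barcode_space(n1_len, n2_len)
    pairs = expected_colliding_pairs(N_WELLS, space)
    print(f"N1={n1_len} nt, N2={n2_len} nt: {space:.3e} tags, "
          f"expected colliding well pairs = {pairs:.3g}")
```

With two 15 nt random sequences the tag space is 4^30, so chance collisions among 10 million wells are negligible under these assumptions, whereas two 10 nt sequences would be expected to give a few dozen colliding pairs.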
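The P1 sequence is introduced to anchor the first and second capture fragments, and it can also serve as an amplification primer for the first and second capture fragments.
The P2 and P3 sequences are introduced to serve as primers during sequencing for reading the bases of the first random sequence N1 and the second random sequence N2; the P3 sequence is also used to hybridize with the target DNA and thereby capture it.
In the above steps the P1 sequence and the second fragment cannot be introduced at the same time, because the P1 sequence has to be immobilized in the microwells of the chip and its immobilization conditions differ from the conditions under which the second fragment is introduced. Moreover, the second fragment contains a sequence complementary to the P1 sequence (the P1 complementary sequence); if both were added together, binding of the P1 complementary sequence to the P1 sequence would interfere with the immobilization efficiency of the P1 sequence and with the extension efficiency of the second fragment in the microwells.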
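In some embodiments, the transposon includes a first double-stranded DNA and a second double-stranded DNA.
In some embodiments, the first double-stranded DNA includes an a-strand and a b-strand.
In some embodiments, the a-strand includes a P4 sequence, a coding sequence, a fixed sequence, and the complement of the fixed sequence specifically recognized by the transposase (ME complementary sequence).
In some embodiments, the b-strand includes the fixed sequence specifically recognized by the transposase (ME sequence).
In some embodiments, the a-strand and the b-strand form a duplex through base pairing between the ME complementary sequence and the ME sequence; the duplex is recognized by the transposase and thereby binds to the transposase.
In some embodiments, the second double-stranded DNA includes a c-strand and a d-strand.
In some embodiments, the c-strand includes the ME complementary sequence.
In some embodiments, the d-strand includes the partial or full P3 sequence and the ME sequence.
In some embodiments, the c-strand and the d-strand form a duplex through base pairing between the ME complementary sequence and the ME sequence; the duplex is recognized by the transposase and thereby binds to the transposase.
In some embodiments, the P4 sequence is a fixed sequence.
In some embodiments, the P4 sequence is 25 to 100 nt long.
In some preferred embodiments, the P4 sequence is shown in SEQ ID NO: 9.
In some embodiments, the P1, P2, P3 and P4 sequences are all different from one another and not complementary to one another.
In some embodiments, the coding sequence is a random sequence.
In some embodiments, the coding sequence is 6 to 8 nt long.
In some preferred embodiments, the transposon includes a partial P3 sequence, shown in SEQ ID NO: 14.
In some embodiments, when both the first fragment and the sequencing primer are immobilized in the microwells in step S1, the transposon includes a third double-stranded DNA and a fourth double-stranded DNA.
In some embodiments, the third double-stranded DNA is the same as the first double-stranded DNA.
In some embodiments, the fourth double-stranded DNA includes an e-strand and an f-strand.
In some embodiments, the e-strand includes the ME sequence.
In some embodiments, the f-strand includes the partial P3 complementary sequence and the ME complementary sequence.
In some embodiments, the e-strand and the f-strand form a duplex through base pairing between the ME sequence and the ME complementary sequence; the duplex is recognized by the transposase and thereby binds to the transposase.
In some embodiments, the transposase fragments the target DNA to form the fragmented target DNA and adds adapters to both ends of the fragmented target DNA.
Both the enzyme activity of the transposase and the ratio of transposase to target DNA in the reaction affect capture and sequencing. Low enzyme activity leads to poor fragmentation of the target DNA, and an unsuitable ratio of transposase to target DNA makes the sequencing library too long or too short, which lowers sequencing quality.
In some embodiments, the enzyme activity of the transposase is 1 to 20 U/μL.
In some embodiments, 0.01 to 20 U of the transposase can bind 0.1 to 500 ng of the target DNA.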
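In some embodiments, in step S4 the second capture fragment captures the complex through base pairing between its P3 complementary sequence and the partial or full P3 sequence of the transposon.
In some embodiments, in step S4 the second capture fragment captures the complex through base pairing between part or all of its P3 complementary sequence and the partial or full P3 sequence of the transposon.
In some embodiments, when both the first fragment and the sequencing primer are immobilized in the microwells in step S1, step S4 first provides a fourth fragment comprising the P3 sequence and hybridizes the fourth fragment with the second capture fragment; the second capture fragment then captures the complex via the fourth fragment.
Specifically, the fourth fragment hybridizes with the partial P3 complementary sequence in the second capture fragment.
Specifically, the part of the fourth fragment that has not hybridized with the second capture fragment is used to capture the complex.
More specifically, the part of the fourth fragment that has not hybridized with the second capture fragment pairs with the partial P3 complementary sequence of the transposon to capture the complex.
In some embodiments, the concentration of the fourth fragment is 2 to 15 pM.
In some preferred embodiments, the concentration of the fourth fragment is 5 pM.
Keeping the concentration of the fourth fragment at 5 pM ensures that at least one fourth fragment hybridizes in each microwell.
In some embodiments, after the extension product is obtained in step S5, it is separated and collected.
In some embodiments, the extension product is separated and collected by adding a washing solution to elute it and then collecting the eluate.
In some embodiments, the washing solution includes at least one of sodium hydroxide, formamide or urea.
In some preferred embodiments, the washing solution is sodium hydroxide.
In some embodiments, the partial transposon sequence in step S5 includes the transposon's partial or full P3 sequence, the ME sequence, and the a-strand (ME complementary sequence, fixed sequence, coding sequence and P4 sequence).
In some embodiments, the sequencing library in step S5 includes the P1 complementary sequence, the first random sequence N1, the P2 sequence, the second random sequence N2, the P3 sequence, the ME sequence, the fragmented target DNA, and the a-strand (ME complementary sequence, fixed sequence, coding sequence and P4 sequence).
In some embodiments, when the first fragment and the sequencing primer are immobilized in the microwells in step S1, step S5 includes: removing the transposase to obtain the composite sequence, and ligating the composite sequence to the second capture fragment to form the sequencing library.
Specifically, after the sequencing library is obtained, a washing solution is added to the microwells to wash the sequencing library.
Specifically, the washing solution includes at least one of formamide, sodium hydroxide, urea or helicase.
Specifically, the partial transposon sequence in step S5 includes the transposon's partial P3 complementary sequence, the ME complementary sequence, and the complement of the a-strand (P4 complementary sequence, complement of the coding sequence, complement of the fixed sequence, and ME sequence).
In this case, the sequencing library in step S5 includes the P1 sequence, the complement of the first random sequence, the P2 complementary sequence, the complement of the second random sequence, the P3 complementary sequence, the ME complementary sequence, the fragmented target DNA, and the complement of the a-strand (P4 complementary sequence, complement of the coding sequence, complement of the fixed sequence, and ME sequence).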
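According to a second aspect of the invention, a kit for constructing a sequencing library is proposed, comprising:
a chip having a plurality of microwells in which a second capture fragment is immobilized; the second capture fragment includes the P1 sequence, the complement of the first random sequence, the P2 complementary sequence, the complement of the second random sequence, and the P3 complementary sequence;
the kit further includes a transposase and a transposon; the transposon includes any one of a partial P3 sequence, the full P3 sequence, or a partial P3 complementary sequence.
In some embodiments, a second capture fragment and a sequencing primer are immobilized in the microwells; the second capture fragment includes the P1 sequence, the complement of the first random sequence, the P2 complementary sequence, the complement of the second random sequence, and a partial P3 complementary sequence;
the kit further includes a transposase, a transposon and a fourth fragment; the transposon includes the partial P3 complementary sequence, and the fourth fragment includes the P3 sequence.
In some embodiments, the transposase and the transposon are used to bind target DNA and form a complex.
In some embodiments, the second capture fragment is used to capture the complex.
Specifically, the second capture fragment captures the complex through base pairing between part or all of its P3 complementary sequence and the partial or full P3 sequence of the transposon.
In some embodiments, after the second capture fragment captures the complex, the transposase fragments the target DNA to form fragmented target DNA; the transposase is then removed to obtain a composite sequence.
In some embodiments, when a second capture fragment and a sequencing primer are immobilized in the microwells, the second capture fragment captures the complex via the fourth fragment.
Specifically, the fourth fragment captures the complex through base pairing between the part of its P3 sequence that has not hybridized with the second capture fragment and the partial P3 complementary sequence of the transposon.
In some embodiments, the composite sequence includes a partial transposon sequence and fragmented target DNA.
In some embodiments, the partial transposon sequence includes the transposon's partial or full P3 sequence, the ME sequence, the fragmented target DNA, and the a-strand (ME complementary sequence, fixed sequence, coding sequence and P4 sequence).
In some embodiments, when a second capture fragment and a sequencing primer are immobilized in the microwells, the partial transposon sequence includes the transposon's partial P3 complementary sequence, the ME complementary sequence, the fragmented target DNA, and the complement of the a-strand (P4 complementary sequence, complement of the coding sequence, complement of the fixed sequence, and ME sequence).
In some embodiments, the composite sequence is extended to obtain an extension product, the extension product being the sequencing library.
In some embodiments, after the fourth fragment captures the complex, the composite sequence obtained by removing the transposase is ligated to the second capture fragment to form the sequencing library.
In some preferred embodiments, the kit includes a chip with the second capture fragment immobilized in its microwells; the kit further includes a transposase, a transposon, dNTPs, DNA polymerase and a washing solution.
In some embodiments, the dNTPs and the DNA polymerase are used to extend the composite sequence.
In some embodiments, the washing solution includes at least one of sodium hydroxide, formamide or urea.
The washing solution is used to elute the extension product.
In some preferred embodiments, the kit includes a chip with the second capture fragment and the sequencing primer immobilized in its microwells; the kit further includes a transposase, a transposon, a fourth fragment, dNTPs, DNA polymerase and a washing solution.
According to a third aspect of the invention, a gene sequencing method is proposed, comprising the following steps: constructing a sequencing library of the sample to be tested using the above library construction method, and then amplifying and sequencing the sequencing library.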
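Brief Description of the Drawings
The invention is further described below with reference to the drawings and examples, in which:
Figure 1 shows the chip prepared in Example 1 of the invention;
Figure 2 is a schematic diagram of immobilizing the first fragment in the microwells of the chip in Example 1 of the invention;
Figure 3 is a schematic diagram of the first capture fragment obtained in Example 1 of the invention; the legend on the left represents the second fragment, which from bottom to top consists of the P1 complementary sequence, the first random sequence N1 and the P2 sequence;
Figure 4 is a schematic diagram of the second capture fragment obtained in Example 1 of the invention; the legend on the left represents the third fragment, which from bottom to top consists of the P2 sequence, the second random sequence N2 and the P3 sequence;
Figure 5 is a schematic structural diagram of the transposon in Example 1 of the invention; ME denotes the fixed sequence specifically recognized by the transposase, and ME' denotes the complement of that fixed sequence;
Figure 6 is a schematic diagram of the second capture fragment capturing the complex in Example 1 of the invention; the complex includes target DNA and a transposome bound to the target DNA, the transposome includes a transposase and a transposon, and the transposon includes a partial P3 sequence;
Figure 7 is a schematic diagram of the extension product obtained in step (5) of Example 1 of the invention; the a-strand from bottom to top consists of the ME complementary sequence, the fixed sequence, the coding sequence and the P4 sequence;
Figure 8 is a schematic diagram of the sequencing library eluted in Example 1 of the invention; the labels are the same as in Figure 7;
Figure 9 is a schematic diagram of immobilizing the first fragment and the sequencing primer in the microwells of the chip in Example 2 of the invention;
Figure 10 is a schematic diagram of the second capture fragment obtained in Example 2 of the invention; the legend on the left represents the third fragment, which from bottom to top consists of the P2 sequence, the second random sequence N2 and the P3 sequence;
Figure 11 is a schematic structural diagram of the transposon in Example 2 of the invention; ME denotes the fixed sequence specifically recognized by the transposase, and ME' denotes the complement of that fixed sequence;
Figure 12 is a schematic diagram of the complex being captured via the fourth fragment in Example 2 of the invention; the complex includes target DNA and a transposome bound to the target DNA, the transposome includes a transposase and a transposon, and the transposon includes a partial P3 complementary sequence;
Figure 13 is a schematic diagram of the sequencing library obtained in Example 2 of the invention; the complement of the a-strand from bottom to top consists of the P4 complementary sequence, the complement of the coding sequence, the complement of the fixed sequence and the ME sequence.
Detailed Description
Examples of the invention are described in detail below and illustrated in the drawings. The examples described with reference to the drawings are illustrative, serve only to explain the invention, and are not to be construed as limiting it.
In the description of the invention, the terms "first" and "second", where used, serve only to distinguish technical features and are not to be understood as indicating or implying relative importance, the number of the technical features indicated, or their order.
In the description of the invention, unless expressly defined otherwise, terms such as amplification and washing are to be understood broadly, and a person skilled in the art can reasonably determine their specific meaning in the invention from the specific content of the technical solution.
In the description of the invention, reference to "one embodiment", "some embodiments" and the like means that a specific feature, material or characteristic described in connection with that embodiment is included in at least one embodiment of the invention. In this specification, such expressions do not necessarily refer to the same embodiment. Moreover, the specific features, materials or characteristics described may be combined in any suitable manner in any one or more embodiments.
Unless otherwise stated, the test methods used in the examples are conventional methods, and the materials and reagents used are commercially available.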
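Example 1
This example constructs a sequencing library of the Escherichia coli (E. coli) genome. The specific process is as follows:
(1) A chip was prepared through outsourced micro/nano fabrication and packaging services. The chip measures 25 mm × 75 mm and includes two flow channels, each containing several microwells, with 10 million microwells on the whole chip. As shown in Figure 1, each microwell is 5 μm in diameter and 10 μm deep, and the center-to-center distance between adjacent microwells is 12 μm.
The bottom of each microwell is modified with γ-aminopropyltriethoxysilane (APTES), which is used to immobilize the first fragment (i.e., the P1 sequence); the microwell edges and the areas between microwells are modified with trimethylsilane, which acts as a hydrophobic barrier and prevents reagents in the microwells, such as enzymes and DNA, from adsorbing non-specifically or spreading into adjacent microwells and causing cross-reactions.
In these microwells, an azide group is attached to the amino group of APTES and the P1 sequence is modified with dibenzocyclooctyne (DBCO). The DBCO-P1 conjugate is added to the microwells, and the first fragment is immobilized at the bottom of the microwells through the covalent bond formed between DBCO and the azide group (Figure 2). The P1 sequence is shown in SEQ ID NO: 1 (GTTAACCCTTAGCGGGGCGTTGTAGTGCGTAAAGA).
(2) A second fragment is provided, consisting of P1 complementary sequence - first random sequence N1 - P2 sequence. The P1 complementary sequence is shown in SEQ ID NO: 2 (TCTTTACGCACTACAACGCCCCGCTAAGGGTTAAC), the first random sequence N1 is 15 nt long, and the P2 sequence is shown in SEQ ID NO: 3 (GTCCTAACCGAATATGAACCGTCGTCAATCCAGGT);
The second fragment is added to the microwells to hybridize with the first fragment. The second fragment is added at a controlled concentration of 5 pM so that at least one second fragment hybridizes in each microwell. Sodium citrate buffer is added to wash away unhybridized material, and then the recombinase, DNA polymerase, dNTPs and P2 primer used for isothermal amplification are added (the P2 primer has the same sequence as the P2 sequence, shown in SEQ ID NO: 4 (GTCCTAACCGAATATGAACCGTCGTCAATCCAGGT)). Paraffin oil is introduced to seal the microwells, and isothermal amplification is run at 37°C for 1 h (with the P2 primer and the first fragment as primers) so that the P2 primers go through repeated rounds of reaction. After amplification, sodium citrate buffer is introduced to break the oil phase, and formamide is added to wash away the strand of the amplified double-stranded DNA that is not covalently bound to the microwell. The covalently bound strand is retained and is called the first capture fragment (P1 sequence - complement of the first random sequence - P2 complementary sequence; the complement of the first random sequence is 15 nt long, and the P2 complementary sequence is shown in SEQ ID NO: 5 (ACCTGGATTGACGACGGTTCATATTCGGTTAGGAC)), as shown in Figure 3. At this point 97% of the microwells of the chip contain the first capture fragment, each with 1 to 5 first capture fragments, and 3% of the microwells contain only the first fragment.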
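The complementary pairs listed in steps (1) and (2) can be checked mechanically. The sketch below is a minimal reverse-complement check using only the sequences quoted in the text; the helper name `reverse_complement` is illustrative and not part of the patent.

```python
# Translation table for Watson-Crick complements of the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences as quoted in Example 1 (SEQ ID NO: 1, 2, 3 and 5).
P1 = "GTTAACCCTTAGCGGGGCGTTGTAGTGCGTAAAGA"
P1_COMP = "TCTTTACGCACTACAACGCCCCGCTAAGGGTTAAC"
P2 = "GTCCTAACCGAATATGAACCGTCGTCAATCCAGGT"
P2_COMP = "ACCTGGATTGACGACGGTTCATATTCGGTTAGGAC"

for name, seq, comp in (("P1", P1, P1_COMP), ("P2", P2, P2_COMP)):
    ok = reverse_complement(seq) == comp
    print(f"{name}: {len(seq)} nt, complement matches: {ok}")
```

Both pairs are 35 nt long and are exact reverse complements of each other, which is what allows the second fragment to hybridize to the immobilized P1 sequence and the P2 primer to prime on the P2 complementary sequence.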
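(3) A third fragment is provided, consisting of P2 sequence - second random sequence N2 - P3 sequence. The second random sequence is 15 nt long, and the P3 sequence is shown in SEQ ID NO: 6 (GATTACCTATAGAACCTAATACATGATCTACTGCTGGAATGGAATGGAGAATGATTACGCTCCCTAACAGCTAATATGACAGACAGCGAAAGTTACATTG);
The third fragment is added to the microwells of the chip to hybridize with the first capture fragment. The third fragment is added at a controlled concentration of 5 pM so that at least one third fragment hybridizes in each microwell. Sodium citrate buffer is added to wash away unhybridized material, and then the recombinase, DNA polymerase, dNTPs and P3 primer used for isothermal amplification are added (the P3 primer has the same sequence as the P3 sequence, shown in SEQ ID NO: 7 (GATTACCTATAGAACCTAATACATGATCTACTGCTGGAATGGAATGGAGAATGATTACGCTCCCTAACAGCTAATATGACAGACAGCGAAAGTTACATTG)). Paraffin oil is introduced to seal the microwells, and isothermal amplification is run at 37°C for 1 h (with the P3 primer and the first fragment as primers) so that the P3 primers go through repeated rounds of reaction. After amplification, sodium citrate buffer is introduced to break the oil phase, and terminal deoxynucleotidyl transferase (TdT) is then added to attach ddATP to the 3' end of the amplification product and prevent it from being extended further. Formamide is then added to wash away the strand of the amplified double-stranded DNA that is not covalently bound to the microwell. The covalently bound strand is retained and is called the second capture fragment (P1 sequence - complement of the first random sequence - P2 complementary sequence - complement of the second random sequence - P3 complementary sequence; the complement of the second random sequence is 15 nt long, and the P3 complementary sequence is shown in SEQ ID NO: 8 (CAATGTAACTTTCGCTGTCTGTCATATTAGCTGTTAGGGAGCGTAATCATTCTCCATTCCATTCCAGCAGTAGATCATGTATTAGGTTCTATAGGTAATC)); see Figure 4.
(4) A transposon is provided, comprising a first double-stranded DNA and a second double-stranded DNA. Its structure is shown schematically in Figure 5, where the two double-stranded DNAs attached to the transposase are the first and second double-stranded DNA. The first double-stranded DNA includes the a-strand (comprising the P4 sequence (SEQ ID NO: 9: GCTATGAGTCCATGGCAGATGTCGAGAT), the coding sequence (an 8 nt random sequence), the fixed sequence (SEQ ID NO: 10: GAGTATTTGTCCGCAGA) and the complement of the fixed sequence specifically recognized by the transposase (ME complementary sequence, SEQ ID NO: 11: AGATGTGTATAAGAGACAG)) and the b-strand (ME sequence, SEQ ID NO: 12: CTGTCTCTTATACACATCT). The second double-stranded DNA includes the c-strand (comprising the ME complementary sequence (SEQ ID NO: 13: AGATGTGTATAAGAGACAG)) and the d-strand (comprising a partial P3 sequence (SEQ ID NO: 14: GATTACCTATAGAACCTAATACATGATCTACTGCTGGAATGGAATGGA) and the ME sequence (SEQ ID NO: 15: CTGTCTCTTATACACATCT)). The transposon is combined with the transposase to form a transposome, which is incubated with the target DNA of the prepared E. coli genome (the transposase activity here is 10 U/μL, and 0.01 to 20 U of transposase can bind 0.1 to 500 ng of target DNA) to obtain a complex. The complex is added to the microwells from step (3) and is captured through base pairing between the partial P3 complementary sequence of the second capture fragment and the partial P3 sequence of the transposon; see Figure 6;
(5) After capture, SDS solution is introduced to remove the transposase, giving the composite sequence (a partial transposon sequence and fragmented target DNA; the partial transposon sequence includes the transposon's partial P3 sequence, the ME sequence and the a-strand (ME complementary sequence, fixed sequence, coding sequence, P4 sequence)). DNA polymerase and DNA ligase are added and the reaction is held at 37°C for 10 min to extend the composite sequence, giving the extension product; see Figure 7. Sodium hydroxide solution is then added to elute the extension product, which is collected as the sequencing library (P1 complementary sequence - first random sequence N1 - P2 sequence - second random sequence N2 - P3 sequence - ME sequence - fragmented target DNA - a-strand (ME complementary sequence, fixed sequence, coding sequence, P4 sequence)); see Figure 8.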
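The library structure in step (5) implies a fixed adapter overhead around each genomic insert. The sketch below adds up the element lengths for Example 1 in the order the text lists them; the lengths are counted from the quoted SEQ ID NOs and the stated random-sequence lengths, and the 300 nt insert in the usage line is a hypothetical value, not one given in the text.

```python
# Element lengths (nt) in the order listed for the Example 1 library.
UPSTREAM = {
    "P1 complementary sequence": 35,   # SEQ ID NO: 2
    "first random sequence N1": 15,
    "P2 sequence": 35,                 # SEQ ID NO: 3
    "second random sequence N2": 15,
    "P3 sequence": 100,                # SEQ ID NO: 6
    "ME sequence": 19,                 # SEQ ID NO: 15
}
DOWNSTREAM = {
    "ME complementary sequence": 19,   # SEQ ID NO: 11
    "fixed sequence": 17,              # SEQ ID NO: 10
    "coding sequence": 8,              # 8 nt random sequence
    "P4 sequence": 28,                 # SEQ ID NO: 9
}


def library_length(insert_nt: int) -> int:
    """Total library strand length for a given fragmented-target insert length."""
    return sum(UPSTREAM.values()) + insert_nt + sum(DOWNSTREAM.values())


overhead = library_length(0)
print(f"adapter overhead: {overhead} nt")
print(f"library length for a hypothetical 300 nt insert: {library_length(300)} nt")
```

On these counts the non-genomic part of each library molecule is 291 nt, most of it contributed by the 100 nt P3 sequence, which is worth keeping in mind when converting a mass concentration into a molar one.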
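Example 2
This example constructs a sequencing library of the E. coli genome. The specific process is as follows:
(1) A chip is prepared as described in Example 1.
The bottom of each microwell is modified with γ-aminopropyltriethoxysilane (APTES), which is used to immobilize the first fragment (i.e., the P1 sequence) and the sequencing primer; the microwell edges and the areas between microwells are modified as in Example 1.
In these microwells, an azide group is attached to the amino group of APTES, and the P1 sequence and the sequencing primer are modified with dibenzocyclooctyne (DBCO). The DBCO-P1 conjugate is added to the microwells, and the first fragment and the sequencing primer are immobilized at the bottom of the microwells through the covalent bond formed between DBCO and the azide group (Figure 9). The P1 sequence is shown in SEQ ID NO: 1.
(2) Same as in Example 1.
(3) Same as in Example 1, except that a partial P3 primer is added for the isothermal amplification (SEQ ID NO: 16: GAATGATTACGCTCCCTAACAGCTAATATGACAGACAGCGAAAGTTACATTG), and no ddNTP needs to be attached to the 3' end of the amplification product after the oil phase is broken.
This finally gives the second capture fragment (P1 sequence - complement of the first random sequence - P2 complementary sequence - complement of the second random sequence - partial P3 complementary sequence; the complement of the second random sequence is 15 nt long, and the partial P3 complementary sequence is shown in SEQ ID NO: 17: CAATGTAACTTTCGCTGTCTGTCATATTAGCTGTTAGGGAGCGTAATCATTC); see Figure 10.
(4) A fourth fragment comprising the P3 sequence is provided; the P3 sequence is shown in SEQ ID NO: 6. The fourth fragment is added to the microwells of the chip so that it hybridizes with the partial P3 complementary sequence of the second capture fragment.
Preparation of the complex: a transposon is provided, comprising a third double-stranded DNA and a fourth double-stranded DNA. Its structure is shown schematically in Figure 11, where the two double-stranded DNAs attached to the transposase are the third and fourth double-stranded DNA. The third double-stranded DNA is the same as the first double-stranded DNA. The fourth double-stranded DNA includes the e-strand (comprising the ME sequence (SEQ ID NO: 18: CTGTCTCTTATACACATCT)) and the f-strand (comprising the partial P3 complementary sequence (SEQ ID NO: 19: TCCATTCCATTCCAGCAGTAGATCATGTATTAGGTTCTATAGGTAATC) and the ME complementary sequence (SEQ ID NO: 20: AGATGTGTATAAGAGACAG)). The transposon is combined with the transposase to form a transposome, which is incubated with the target DNA of the prepared E. coli genome (the transposase activity here is 10 U/μL, and 0.01 to 20 U of transposase can bind 0.1 to 500 ng of target DNA) to obtain a complex. The complex is added to the microwells and is captured through base pairing between the part of the fourth fragment's P3 sequence that has not hybridized with the second capture fragment and the partial P3 complementary sequence of the transposon; see Figure 12;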
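The bridging role of the fourth fragment can be made concrete with the sequences quoted for Example 2. The sketch below checks that the full P3 sequence (SEQ ID NO: 6) splits into a 3' part that pairs with the second capture fragment's partial P3 complementary sequence (SEQ ID NO: 17) and a 5' part that pairs with the transposon's partial P3 complementary sequence (SEQ ID NO: 19). The variable and function names are illustrative, not from the patent.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Fourth fragment: full P3 sequence (SEQ ID NO: 6).
P3 = ("GATTACCTATAGAACCTAATACATGATCTACTGCTGGAATGGAATGGA"
      "GAATGATTACGCTCCCTAACAGCTAATATGACAGACAGCGAAAGTTACATTG")
# Partial P3 complement on the second capture fragment (SEQ ID NO: 17).
CAPTURE_PARTIAL_P3_COMP = "CAATGTAACTTTCGCTGTCTGTCATATTAGCTGTTAGGGAGCGTAATCATTC"
# Partial P3 complement on the transposon f-strand (SEQ ID NO: 19).
TRANSPOSON_PARTIAL_P3_COMP = "TCCATTCCATTCCAGCAGTAGATCATGTATTAGGTTCTATAGGTAATC"

chip_side = reverse_complement(CAPTURE_PARTIAL_P3_COMP)          # region of P3 bound on the chip
transposon_side = reverse_complement(TRANSPOSON_PARTIAL_P3_COMP)  # region left free for capture

print("3' part of P3 pairs with the capture fragment:", P3.endswith(chip_side))
print("5' part of P3 pairs with the transposon:", P3.startswith(transposon_side))
print("the two parts tile P3 exactly:", transposon_side + chip_side == P3)
```

The two partial complements cover the 100 nt P3 sequence without overlap (48 nt on the transposon side, 52 nt on the chip side), so a hybridized fourth fragment leaves exactly the region needed to capture the transposome.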
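(5) After capture, SDS solution is introduced to remove the transposase, giving the composite sequence (a partial transposon sequence and fragmented target DNA). The partial transposon sequence includes the transposon's partial P3 complementary sequence, the ME complementary sequence and the complement of the a-strand (P4 complementary sequence, complement of the coding sequence, complement of the fixed sequence, ME sequence). The composite sequence is ligated to the second capture fragment to form the sequencing library (P1 sequence - complement of the first random sequence - P2 complementary sequence - complement of the second random sequence - P3 complementary sequence - ME complementary sequence - fragmented target DNA - complement of the a-strand (P4 complementary sequence, complement of the coding sequence, complement of the fixed sequence, ME sequence)); see Figure 13.
Example 3
In this example the E. coli genome sequencing library constructed in Example 1 is sequenced. The specific process is as follows:
(1) Sequencing library purification: add 2 volumes of DNA purification magnetic beads to the eluted sequencing library, mix, incubate at room temperature for 5 min, place on a magnetic stand until the solution is clear, and discard the supernatant. Wash away salt ions by gently pipetting the beads in 80% ethanol, twice in total. Centrifuge briefly, remove all remaining ethanol, and air-dry with the lid open until the bead pellet cracks. Add 23 μL of clean nuclease-free water to elute the sequencing library, place on a magnetic stand until clear, and transfer 22 μL of the supernatant to a clean PCR tube for later use.
(2) Sequencing library quality control: take 1 μL of the supernatant and quantify it with a Qubit ssDNA quantification kit. If the concentration is greater than or equal to 4 nM, the library is ready for sequencing and can be sequenced directly with the Salus Pro Sequencing Reagent Set; for the on-instrument steps, refer to "Salus Pro Sequencing Reagent Set, Instruction Manual Version V1.1". If the library concentration is below 4 nM, one round of PCR amplification and enrichment is required before sequencing.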
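The quality-control rule in step (2) compares a molar concentration with a 4 nM threshold, whereas fluorometric ssDNA assays are usually read out as a mass concentration. The sketch below shows one way to convert and apply the rule. It assumes an average mass of about 330 g/mol per nucleotide of ssDNA, a common approximation that the text does not specify, and the 591 nt mean library length in the example is hypothetical.

```python
AVG_SSDNA_NT_MASS_G_PER_MOL = 330.0  # assumed average mass per nucleotide of ssDNA
THRESHOLD_NM = 4.0                   # threshold stated in the text


def ssdna_ng_per_ul_to_nm(conc_ng_per_ul: float, mean_length_nt: float) -> float:
    """Convert an ssDNA mass concentration (ng/uL) into a molar concentration (nM)."""
    return conc_ng_per_ul * 1e6 / (mean_length_nt * AVG_SSDNA_NT_MASS_G_PER_MOL)


def next_step(conc_nm: float) -> str:
    """Apply the rule from the text: sequence at >= 4 nM, otherwise enrich by PCR first."""
    return "sequence directly" if conc_nm >= THRESHOLD_NM else "PCR enrichment first"


# Hypothetical readings for a library with a mean length of 591 nt.
for reading in (0.5, 1.0):
    conc_nm = ssdna_ng_per_ul_to_nm(reading, 591)
    print(f"{reading} ng/uL -> {conc_nm:.2f} nM -> {next_step(conc_nm)}")
```

The conversion depends strongly on the assumed mean library length, so in practice the fragment size distribution should be measured rather than guessed.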
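(3) Sequencing library enrichment (optional): to 20 μL of the eluted sequencing library, add a mixture with a total volume of 30 μL (polymerase, dNTPs, polymerase reaction buffer, sequencing primer, P1 sequence) and run a three-step PCR on the library. The program is: 98°C for 1 min; 4 to 10 cycles of (98°C for 10 s, 60°C for 30 s, 72°C for 30 s); 72°C for 5 min; hold at 4°C. After the reaction, add 1.2x purification magnetic beads, mix, incubate at room temperature for 5 min, place on a magnetic stand until the solution is clear, and discard the supernatant. Wash away salt ions by gently pipetting the beads in 80% ethanol, twice in total. Centrifuge briefly, remove all remaining ethanol, and air-dry with the lid open until the bead pellet cracks. Add 22 μL of clean nuclease-free water to elute the sequencing library, place on a magnetic stand until clear, transfer 20 μL of the supernatant to a clean PCR tube, and quantify 1 μL of the supernatant with a Qubit ssDNA quantification kit. Once the library passes quality control, sequence it with the Salus Pro Sequencing Reagent Set; for the on-instrument steps, refer to "Salus Pro Sequencing Reagent Set, Instruction Manual Version V1.1".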
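For planning purposes, the enrichment program in step (3) can be turned into a total hold time. The sketch below sums the programmed holds for the lowest and highest cycle numbers given in the text; it ignores temperature ramping and the final 4°C hold, so actual run times on an instrument will be longer.

```python
# Programmed holds from the text, in seconds.
INITIAL_DENATURATION_S = 60          # 98 C for 1 min
CYCLE_STEPS_S = (10, 30, 30)         # 98 C 10 s, 60 C 30 s, 72 C 30 s
FINAL_EXTENSION_S = 5 * 60           # 72 C for 5 min


def pcr_hold_time_s(n_cycles: int) -> int:
    """Total programmed hold time in seconds, excluding ramping and the 4 C hold."""
    return INITIAL_DENATURATION_S + n_cycles * sum(CYCLE_STEPS_S) + FINAL_EXTENSION_S


for n_cycles in (4, 10):
    total = pcr_hold_time_s(n_cycles)
    print(f"{n_cycles} cycles: {total} s ({total / 60:.1f} min)")
```

Each extra cycle adds 70 s of hold time, so the enrichment step stays under 20 min of programmed holds even at 10 cycles.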
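The examples of the invention have been described in detail above, but the invention is not limited to these examples; within the knowledge of a person of ordinary skill in the art, various changes can be made without departing from the spirit of the invention. In addition, the examples of the invention and the features within them may be combined with one another where they do not conflict.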

Claims (12)

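  1. A method for constructing a sequencing library, characterized by comprising the following steps:
    S1: providing a chip and a first fragment, the chip having a plurality of microwells and the first fragment comprising a P1 sequence, and immobilizing the first fragment in the microwells;
    S2: providing a second fragment comprising a P1 complementary sequence, a first random sequence N1 and a P2 sequence, and hybridizing the second fragment with the first fragment; extending and amplifying the first fragment to obtain a first capture fragment immobilized in the microwells;
    S3: providing a third fragment comprising the P2 sequence, a second random sequence N2 and a P3 sequence, and hybridizing the third fragment with the first capture fragment; extending and amplifying the first capture fragment to obtain a second capture fragment immobilized in the microwells;
    S4: capturing a complex with the second capture fragment, the complex comprising target DNA and a transposome bound to the target DNA, the transposome comprising a transposase and a transposon, and the transposon comprising any one of a partial P3 sequence, the full P3 sequence, or a partial P3 complementary sequence;
    S5: removing the transposase to obtain a composite sequence comprising a partial transposon sequence and fragmented target DNA, and extending the composite sequence to obtain an extension product, the extension product being the sequencing library.
  2. The method according to claim 1, characterized in that step S1 further comprises immobilizing a sequencing primer in the microwells.
  3. The method according to claim 2, characterized in that step S4 first provides a fourth fragment comprising the P3 sequence and hybridizes the fourth fragment with the second capture fragment; the second capture fragment captures the complex via the fourth fragment.
  4. The method according to claim 3, characterized in that step S5 comprises: removing the transposase to obtain the composite sequence, and ligating the composite sequence to the second capture fragment to form the sequencing library.
  5. The method according to claim 1, characterized in that the P1 sequence is 25 to 100 nt long.
  6. The method according to claim 1, characterized in that, in the second fragment, the P1 complementary sequence is 25 to 100 nt long.
  7. The method according to claim 6, characterized in that the first random sequence N1 is 10 to 15 nt long.
  8. The method according to claim 6 or 7, characterized in that the P2 sequence is 25 to 100 nt long.
  9. The method according to claim 1, characterized in that the concentration of the second fragment is 2 to 15 pM.
  10. The method according to claim 1, characterized in that, in the third fragment, the second random sequence N2 is 10 to 15 nt long; preferably, the P3 sequence is 25 to 100 nt long.
  11. A kit for constructing a sequencing library, characterized by comprising:
    a chip having a plurality of microwells in which a second capture fragment is immobilized; the second capture fragment comprises the P1 sequence, the complement of the first random sequence, the P2 complementary sequence, the complement of the second random sequence, and the P3 complementary sequence;
    the kit further comprises a transposase and a transposon; the transposon comprises any one of a partial P3 sequence, the full P3 sequence, or a partial P3 complementary sequence.
  12. A gene sequencing method, comprising the following steps: constructing a sequencing library of a sample to be tested using the method according to any one of claims 1 to 10, and then amplifying and sequencing the sequencing library.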
PCT/CN2023/075228 2022-08-01 2023-02-09 Method for constructing a sequencing library, kit for constructing a sequencing library, and gene sequencing method WO2024027123A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210916238.3A CN115747301B (zh) 2022-08-01 2022-08-01 Method for constructing a sequencing library, kit for constructing a sequencing library, and gene sequencing method
CN202210916238.3 2022-08-01

Publications (1)

Publication Number Publication Date
WO2024027123A1 true WO2024027123A1 (zh) 2024-02-08

Family

ID=85349136

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/075228 WO2024027123A1 (zh) 2022-08-01 2023-02-09 Method for constructing a sequencing library, kit for constructing a sequencing library, and gene sequencing method

Country Status (2)

Country Link
CN (1) CN115747301B (zh)
WO (1) WO2024027123A1 (zh)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103710323A (zh) * 2012-10-01 2014-04-09 安捷伦科技有限公司 用于dna断裂和标记的固定化的转座酶复合体
CN105525357A (zh) * 2014-09-30 2016-04-27 深圳华大基因股份有限公司 一种测序文库的构建方法及试剂盒和应用
CN110219054A (zh) * 2018-03-04 2019-09-10 清华大学 一种核酸测序文库及其构建方法
CN113337576A (zh) * 2020-04-30 2021-09-03 深圳市真迈生物科技有限公司 文库制备方法、试剂盒及测序方法
WO2021252617A1 (en) * 2020-06-09 2021-12-16 Illumina, Inc. Methods for increasing yield of sequencing libraries
WO2022031955A1 (en) * 2020-08-06 2022-02-10 Illumina, Inc. Preparation of rna and dna sequencing libraries using bead-linked transposomes

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
PT3207134T (pt) * 2014-10-17 2019-09-17 Illumina Cambridge Ltd Transposição que preserva a contiguidade
CN109234356B (zh) * 2018-09-18 2021-10-08 南京迪康金诺生物技术有限公司 一种构建杂交捕获测序文库的方法及应用
CN112654719A (zh) * 2019-05-31 2021-04-13 伊鲁米纳公司 使用流动池进行信息存储和检索的系统和方法
IL299042A (en) * 2020-07-08 2023-02-01 Illumina Inc Beads as transpososome carriers
CN114096678A (zh) * 2020-07-31 2022-02-25 北京寻因生物科技有限公司 多种核酸共标记支持物及其制作方法与应用

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
GERTZ, J. ET AL.: "Transposase mediated construction of RNA-seq libraries", GENOME RES, vol. 22, no. 1, 29 November 2011 (2011-11-29), XP055050772, ISSN: 1088-9051, DOI: 10.1101/gr.127373.111 *
HENNING, B. P. ET AL.: "Large-Scale Low-Cost NGS Library Preparation Using a Robust Tn5 Purification and Tagmentation Protocol", G3(BETHESDA), vol. 8, no. 1, 4 January 2018 (2018-01-04), XP093002331, ISSN: 2160-1836, DOI: 10.1534/g3.117.300257 *
LI LIN, QIAN SI-HUA, LYU TIAN-QI, WANG YU-HUI, ZHENG JIAN-PING: "Recent Progress of Library Construction for Next-generation Sequencing", CHINESE JOURNAL OF APPLIED CHEMISTRY, vol. 38, no. 1, 1 January 2021 (2021-01-01), pages 11 - 23, XP093134691, DOI: 10.19894/j.issn.1000-0518.200158 *

Also Published As

Publication number Publication date
CN115747301A (zh) 2023-03-07
CN115747301B (zh) 2023-12-22

Similar Documents

Publication Publication Date Title
KR102602143B1 (ko) Sample preparation on a solid support
WO2021013244A1 (zh) Method and kit for constructing a capture library
CN105917004B (zh) Polynucleotide modification on a solid support
EP2398915B1 (en) Synthesis of sequence-verified nucleic acids
US20210155985A1 (en) Surface concatemerization of templates
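JPH02219600A (ja) Nucleotide sequence determination method and kit for sequence determination
TW201321518A (zh) Library preparation method for trace nucleic acid samples and application thereof
JP2017532028A (ja) Isolated oligonucleotide and its use in nucleic acid sequencing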
Ma et al. Microfluidics for genome-wide studies involving next generation sequencing
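WO2022021279A1 (zh) Support co-labeled with multiple nucleic acids, and preparation method and application thereof
JP2002330783A (ja) Target enrichment and amplification for array analysis
CN112930405A (zh) Complex surface-bound transposome complexes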
EP2635703B1 (en) Integrated capture and amplification of target nucleic acid for sequencing
WO2013164319A1 (en) Targeted dna enrichment and sequencing
CN114196735A (zh) Method for on-chip isothermal amplification sequencing
US20130130917A1 (en) Method for specific enrichment of nucleic acid sequences
WO2024027123A1 (zh) Method for constructing a sequencing library, kit for constructing a sequencing library, and gene sequencing method
US9023597B2 (en) One step diagnosis by dendron-mediated DNA chip
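WO2014086037A1 (zh) Method for constructing a nucleic acid sequencing library and application thereof
CN114196737A (zh) Sequencing method using isothermal amplification
WO2020118543A1 (zh) Method and reagent for separating and/or enriching host-derived nucleic acids and pathogen nucleic acids, and preparation method thereof
WO2023240611A1 (zh) Library construction and sequencing method for single-stranded circular nucleic acid libraries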
EP3655547A1 (en) Methods and compositions for isolating asymmetric nucleic acid complexes
WO2024092562A1 (zh) Blocking sequence, kit thereof, and method of use
US20220154173A1 (en) Compositions and Methods for Preparing Nucleic Acid Sequencing Libraries Using CRISPR/CAS9 Immobilized on a Solid Support

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 23848838; Country of ref document: EP; Kind code of ref document: A1)