WO2022141061A1 - Method for rapidly constructing a circularized library and loop-forming linker - Google Patents

Method for rapidly constructing a circularized library and loop-forming linker

Info

Publication number
WO2022141061A1
Authority
WO
WIPO (PCT)
Prior art keywords
linker
complementary
double
ring
stranded
Prior art date
Application number
PCT/CN2020/140877
Other languages
English (en)
French (fr)
Inventor
于友钱
傅书锦
耿春雨
梁鑫明
Original Assignee
深圳华大智造科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳华大智造科技股份有限公司
Priority to CN202080107392.5A priority Critical patent/CN116670340A/zh
Priority to PCT/CN2020/140877 priority patent/WO2022141061A1/zh
Priority to EP20967409.2A priority patent/EP4273307A1/en
Priority to US18/259,978 priority patent/US20240076657A1/en
Publication of WO2022141061A1 publication Critical patent/WO2022141061A1/zh

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034 Isolating an individual clone by screening libraries
    • C12N15/1093 General methods of preparing gene libraries, not provided for in other subgroups
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806 Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B80/00 Linkers or spacers specially adapted for combinatorial chemistry or libraries, e.g. traceless linkers or safety-catch linkers
    • C40B50/00 Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06 Biochemical methods, e.g. using enzymes or whole viable microorganisms
    • C40B70/00 Tags or labels specially adapted for combinatorial chemistry or libraries, e.g. fluorescent tags or bar codes

Definitions

  • the invention belongs to the field of biotechnology, and more specifically, the invention provides a rapid construction method of a circularized library and a circular linker.
  • The conventional library construction process mainly includes the following steps: fragmenting the genomic nucleic acid by physical or enzymatic means; end-repairing the fragments with an exonuclease and filling in both ends so that they are blunt, then using a polymerase to add an A to the 3' end of the DNA to generate a single-base sticky end; and adding the same linker to both ends of the A-tailed fragments.
  • The linker is formed by complementary pairing of one long nucleic acid strand and one short strand; after pairing through the A-T overhangs, the nick is sealed by a ligase.
  • Using the linker-ligated fragment as template, single-stranded nucleic acid primers complementary to the linker strands are added for a polymerase chain reaction: the two strands of the template are separated by denaturation and, after the primers bind to the corresponding strands, each strand is extended into a fully complementary double-stranded target product with an A linker at one end and a B linker at the other.
  • The ligation product is purified and recovered with magnetic beads; the recovered double-stranded DNA is denatured to obtain single-stranded DNA, and the target single strand is circularized using a single-stranded circularization helper primer and selection for the 5'-terminal phosphate group of the DNA single strand; unwanted and remaining uncircularized single strands are removed by exonuclease treatment or similar methods; the circularized product is purified and recovered with magnetic beads; the circularized single-stranded circular nucleic acid then enters the subsequent sequencing steps, in which rolling circle replication forms DNA nanoballs (DNBs) from which the nucleic acid sequence is read. This workflow involves multiple tube transfers and purification steps, leaving room for improvement in ease of operation and library construction time.
  • The adapter used in NGS high-throughput sequencing is itself a specially designed DNA sequence.
  • During sequencing, the characteristic sequence on the adapter serves as the sequencing start site and pairs with the sequencing primer; extension of the primer then determines the downstream sequence.
  • Adapters are attached to both ends of the DNA fragments by ligation or similar methods. To make this attachment directional and to avoid adapter-to-adapter ligation, sticky-end adapters are usually used. Conventional library construction must ensure that both ends of the double-stranded DNA carry adapters, and excess adapter products must be removed by purification.
  • the present invention provides a rapid construction method of a circularized library and a circular linker.
  • the present invention provides a method for constructing a circularized library, the method comprising: 1) breaking a DNA sequence into fragments; 2) forming 3'-end overhangs at both ends of the fragments; and 3) circularizing the fragments bearing 3'-end overhangs with a loop-forming linker to form a circular library, wherein the loop-forming linker is an incompletely paired duplex with 5'-end overhangs at both ends, and the 5'-end overhangs of the loop-forming linker are complementary to the 3'-end overhangs of the fragments.
  • the DNA sequence is broken into random fragments by sonication or enzyme cleavage.
  • the overhang at the 3' end of the interrupted fragment and the overhang at the 5' end of the looping linker is 1-5 nt in length, such as 3 nt or 2 nt, preferably 1 nt.
  • the 3' overhang of the interrupted fragment is A and the 5' overhang of the looped linker is T.
  • the fragments are treated with exonuclease, polymerase and T4 polynucleotide kinase to give 5'-phosphorylated sticky ends carrying an extra 3' A deoxynucleotide.
  • the incompletely paired duplex comprises a gap in one strand or non-matching regions between the two strands.
  • the duplex includes two non-matching regions, with a barcode sequence for distinguishing samples located between them.
  • the ring-forming linker comprises the following ring-forming linker (a) or ring-forming linker (b):
  • the loop-forming linker (a) includes one long strand and two short strands paired with the two ends of the long strand; the 5' end of the long strand carries a phosphate modification, and the 5' end of the short strand that pairs with the 3' end of the long strand carries a phosphate modification; the duplex linker formed by pairing has 3'-end T sticky ends and includes a single-stranded non-complementary region of 8-12 nt (e.g. 10 nt), which preferably includes a barcode sequence for distinguishing samples;
  • the loop-forming linker (b) comprises two partially complementary strands whose two ends pair to form a duplex structure with 5'-terminal phosphate modifications and 3'-terminal T sticky ends; preferably the duplex includes a complementary portion of 8-12 nt (e.g. 10 nt) serving as a barcode sequence for distinguishing samples.
  • when the loop-forming linker is loop-forming linker (a), the ligated product is digested with exonuclease, and the digested product undergoes one-step purification to obtain a circular library.
  • when the loop-forming linker is loop-forming linker (b), the ligated product is denatured to obtain a circular library.
  • the circularized single-stranded circular library enters the subsequent sequencing step: after rolling circle replication, DNA nanoballs (DNBs) are formed and the nucleic acid sequence is read.
  • the present invention provides a loop-forming linker for constructing a circularized library; the loop-forming linker is an incompletely paired duplex with 5'-end overhangs at both ends, and these 5'-end overhangs are complementary to the 3'-end overhangs of the fragment to be circularized.
  • the 3'-end overhang of the fragment to be circularized and the 5'-end overhang of the loop-forming linker are 1-5 nt long, e.g. 3 nt or 2 nt, preferably 1 nt.
  • the 3' overhang of the fragment to be circularized is A, and the 5' overhang of the loop-forming linker is T.
  • the incompletely paired duplexes comprise gaps in one strand or regions of non-matching between the duplexes.
  • the duplex includes two non-matching regions, with a barcode sequence for distinguishing samples located between them.
  • the loop-forming linker comprises the following loop-forming linker (a) or loop-forming linker (b):
  • the loop-forming linker (a) includes one long strand and two short strands paired with the two ends of the long strand; the 5' end of the long strand carries a phosphate modification, and the 5' end of the short strand that pairs with the 3' end of the long strand carries a phosphate modification; the duplex linker formed by pairing has 3'-end T sticky ends and includes a single-stranded non-complementary region of 8-12 nt (e.g. 10 nt), which preferably includes a barcode sequence for distinguishing samples;
  • the loop-forming linker (b) comprises two partially complementary strands whose two ends pair to form a duplex structure with 5'-terminal phosphate modifications and 3'-terminal T sticky ends; preferably the duplex includes a complementary portion of 8-12 nt (e.g. 10 nt) serving as a barcode sequence for distinguishing samples.
  • the present invention achieves one-step circularization of circularized libraries through specially designed linkers.
  • the fragmented DNA is end-repaired and A-tailed to form A sticky ends, which pair with the T sticky ends of the specially designed linker to form a circular structure; the nicks are then sealed by a ligase.
  • Fig. 1 is a schematic diagram of loop-forming linker (a): A is the sequence of the final circular library structure, with the insert in the upper part, the linker sequence in the lower part, and the barcode underlined; B, C and D are schematic diagrams of the library construction workflow.
  • Fig. 2 is a schematic diagram of loop-forming linker (b): A is the sequence of the final circular library structure, with the insert in the upper part and the linker sequence in the lower part; the underlined non-italic sequence is the barcode, the underlined italic sequence is the specially designed complement of the barcode, and the underlined sequences together form a palindrome; B, C and D are schematic diagrams of the library construction workflow.
  • Figure 3 shows GC coverage information (A) for experiments performed with the ring-forming linker (a), and GC coverage information (B) for experiments performed with the ring-forming linker (b).
  • The invention addresses the long library construction time, low ease of operation and purification losses of the conventional MGI-based library construction workflow (genomic DNA fragmentation, end repair and A-tailing of the fragmented DNA, linker ligation, PCR amplification, single-strand separation and circularization).
  • The inventors changed the underlying approach: instead of adding linkers, amplifying and then circularizing, they designed a linker with a unique sequence structure that merges linker ligation and product circularization into a single reaction step, and optimized the reaction system and purification steps, greatly shortening the time needed to build a circularized library.
  • As shown in Fig. 1, for loop-forming linker (a), A shows the sequence of the final circular library structure: the upper part shows the insert, the lower part shows the linker sequence, and the underlined sequence is the barcode;
  • B, C and D schematically show the library construction workflow.
  • Loop-forming linker (a) is formed by complementary pairing of one long nucleic acid strand with two short nucleic acid strands; this structure allows the strand carrying the two short strands to be digested after circularization.
  • The loop-forming linker is 84 bp long, and the two short strands are 10 bp apart.
  • The 5' end of the long strand and the 5' end of the short strand at the linker end carry phosphate modifications; the duplex DNA linker formed by pairing has sticky ends with an extra 3' T deoxynucleotide, and preferably 10 bp in the middle of the linker is a single-stranded barcode sequence used for sample discrimination.
  • After the extra-3'-T sticky ends of the duplex DNA linker pair with the extra-3'-A sticky ends of the fragmented double-stranded DNA, they can be joined by a ligase. After circularization, the ligation product can be digested to generate a single-stranded library, or used directly for DNB preparation.
  • As shown in Fig. 2, for loop-forming linker (b), A is the sequence of the final circular library structure, with the insert in the upper part and the linker sequence in the lower part; the underlined non-italic sequence is the barcode, the underlined italic sequence is the specially designed complement of the barcode, and the underlined sequences together form a palindrome;
  • B, C and D schematically show the library construction workflow.
  • Loop-forming linker (b) is formed by complementary pairing of two long nucleic acid strands with 5'-end phosphate modifications; the resulting duplex DNA linker has sticky ends with an extra 3' T deoxynucleotide, and a non-complementary sequence in the middle of the linker gives it an ∞-shaped structure.
  • The loop-forming linker is 93 bp long. The ∞-shaped structure formed by the non-complementary middle sequence allows both the sense and antisense strands of the circularized linker to be read during sequencing, while reducing the formation of linker dimers.
  • The non-complementary sequence is 34 bp long. The extra-3'-T sticky ends of the duplex DNA linker pair with the extra-3'-A sticky ends of the fragmented double-stranded DNA and are then joined by a ligase. After circularization, the ligation product can be denatured to generate a single-stranded library for DNB preparation.
  • The design innovation of the present invention lies in the structure of the circularization linker: library construction is completed on double-stranded DNA, A-T ligation is used, and the sequence design is compatible with MGI sequencers.
  • Conventional library construction usually requires ligating two linkers to the two ends of the fragment, whereas the linker of the present invention attaches to both ends of the fragment simultaneously, so that circularization is achieved in one step by the ligation reaction.
  • Using the circularization linker of the present invention skips the PCR and circularization reactions of conventional library construction and also omits the associated purification step, greatly reducing the operating time of the whole library construction workflow.
  • Each reaction step is carried out consecutively in the same tube, avoiding tube transfers and the associated losses; a circularized library is obtained after a final one-step purification.
  • test methods in the following examples are conventional methods unless otherwise specified.
  • the test materials used in the following examples, unless otherwise specified, were purchased from conventional reagent suppliers. It should be noted that the above summary and the following detailed description serve only to illustrate the present invention and are not intended to limit it in any way. The scope of the present invention is determined by the appended claims, without departing from the spirit and gist of the present invention.
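  • The sequences of the loop-forming linkers used in the examples are as follows. Loop-forming linker (a):
  • Long chain a 5'-AGTCGGAGGCCAAGCGGTCTTAGGAAGACAAxxxxxxxxxxCAACTCCTTGGCTCACAGAACGACATGGCTACGATCCGACTT-3' (SEQ ID NO. 1), where xxxxxxxxxx is the barcode sequence
  • Short chain 1 5'-AGTCGGATCGTAGCCATGTCGTTCTGTGAGCCAAGGAGTTG-3' (SEQ ID NO. 2)
  • Short chain 2 5'-TTGTCTTCCTAAGACCGCTTGGCCTCCGACTT-3' (SEQ ID NO. 3)
  • Loop-forming linker (b):
  • Long chain b 5'-AGTCGGAGGCCAAGCGGTCTTAGGAAGACAAxxxxxxxxxxYYYYYYYYYYCAACTCCTTGGCTCACAGAACGACATGGCTACGATCCGACTT-3' (SEQ ID NO. 4), where xxxxxxxxxx is the barcode sequence and YYYYYYYYYY is the sequence complementary to the barcode
  • Both strands of loop-forming linker (b) are long chain b; that is, the matching and non-matching regions at the two ends of loop-forming linker (b) mirror each other, and the matching region in the middle is a palindromic sequence.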
  • 1. Genomic DNA fragmentation: genomic DNA can be fragmented in various ways; both physical sonication and enzymatic methods have very mature commercial solutions. Physical sonication is used in this example.
  • The fragmentation conditions are set as follows: duty factor 20%; intensity 5; pulse factor 200; fragmentation time 35 × 4 runs.
  • 2. Fragment selection: magnetic bead purification or gel recovery can be used; magnetic bead purification is used in this example.
  • Take the fragmented DNA, add 80 μl of Ampure XP magnetic beads, mix well and leave for 7-15 min; place on a magnetic rack and collect the supernatant; add 40 μl of Ampure XP magnetic beads to the supernatant, mix well and leave for 7-15 min; place on a magnetic rack, discard the supernatant, and wash the beads twice with 75% ethanol; after drying, add 50 μl of TE buffer or enzyme-free water, mix well and leave for 7-15 min to dissolve the recovered product.
  • 3. End repair and A-tailing: take 100 ng of the product recovered in the previous step, make up to 40 μL with TE, and prepare the 10 μL end-repair reaction mixture: nuclease-free water 2.1 μL; 10× PNK buffer 5 μL; 5:1 dATP:dNTP 0.6 μL; Klenow fragment 0.1 μL; rTaq 0.2 μL; T4 DNA polymerase 2 μL. Add the mixture to the 40 μl of fragmented product and incubate at 37 ℃ for 30 min, then 65 ℃ for 15 min, then hold at 4 ℃.
  • 4. Linker ligation: after the reaction, immediately add 5 μL of loop-forming linker (a) or loop-forming linker (b) at a linker concentration of 10 μM. At the same time, prepare the linker ligation mixture shown in the following table:
  • Reagent name / volume: 10× PNK buffer 3 μL; 100 mM ATP 0.8 μL; nuclease-free water 3.6 μL; 50% PEG8000 16 μL; T4 DNA ligase 1.6 μL; total 25 μL
  • After the linker has been added to the end-repair product, add 25 μL of the linker ligation mixture and run the following reaction.
  • Reaction conditions: 23 ℃ for 30 min, then hold at 4 ℃.
  • 5. After the reaction, purify with 2× magnetic beads and redissolve in 20 μL of TE.
  • 6. Preparation and sequencing of DNBs: for the specific steps, refer to the instructions of the MGI sequencing kit. DNBs were prepared from the double-stranded libraries. The GC coverage of the experiment with loop-forming linker (a) is shown in Fig. 3 A, and that of the experiment with loop-forming linker (b) is shown in Fig. 3 B.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biochemistry (AREA)
  • Microbiology (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Plant Pathology (AREA)
  • Immunology (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

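The invention belongs to the field of biotechnology and specifically discloses a method for constructing a circularized library and a loop-forming linker. The method comprises: 1) breaking a DNA sequence into fragments; 2) forming 3'-end overhangs at both ends of the fragments; and 3) circularizing the fragments bearing 3'-end overhangs with a loop-forming linker to form a circular library, wherein the loop-forming linker is an incompletely paired duplex with 3'-end overhangs at both ends, and the 3'-end overhangs of the loop-forming linker are complementary to the 3'-end overhangs of the fragments. In the invention, the fragmented DNA is end-repaired and A-tailed to form A sticky ends, which then pair with the T sticky ends of the specially designed linker to form a circular structure, and the nicks are sealed by a ligase.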

Description

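Method for rapidly constructing a circularized library and loop-forming linker
Technical Field
The invention belongs to the field of biotechnology; more specifically, it provides a method for rapidly constructing a circularized library and a loop-forming linker.
Background
Many mature library preparation products for next-generation sequencing (NGS) are currently available, but they generally follow the conventional library construction workflow. That workflow mainly includes the following steps: fragmenting the genomic nucleic acid by physical or enzymatic means; end-repairing the fragments with an exonuclease and filling in both ends so that they are blunt, then using a polymerase to add an A to the 3' end of the DNA to generate a single-base sticky end; adding the same linker to both ends of the A-tailed fragments, the linker being formed by complementary pairing of one long nucleic acid strand and one short strand, with the nick sealed by a ligase after A-T pairing; using the linker-ligated fragment as template and adding single-stranded nucleic acid primers complementary to the linker strands for a polymerase chain reaction, in which the two template strands are separated by denaturation and, after primer binding, each strand is extended into a fully complementary double-stranded target product with an A linker at one end and a B linker at the other; purifying and recovering the ligation product with magnetic beads; denaturing the recovered double-stranded DNA to obtain single-stranded DNA and circularizing the target single strand using a single-stranded circularization helper primer and selection for the 5'-terminal phosphate group; removing unwanted and remaining uncircularized single strands by exonuclease treatment or similar methods; purifying and recovering the circularized product with magnetic beads; and passing the circularized single-stranded circular nucleic acid to the subsequent sequencing steps, in which rolling circle replication forms DNA nanoballs (DNBs) from which the nucleic acid sequence is read. This workflow involves multiple tube transfers and purification steps, leaving room for improvement in ease of operation and library construction time.
The linker (adapter) used in NGS high-throughput sequencing is itself a specially designed DNA sequence. During sequencing, the characteristic sequence on the linker serves as the sequencing start site and pairs with the sequencing primer; extension of the primer then determines the downstream sequence. Linkers are attached to both ends of the DNA fragments by ligation or similar methods. To make this attachment directional and to avoid linker-to-linker ligation, sticky-end linkers are usually used. Conventional library construction must ensure that both ends of the double-stranded DNA carry linkers, and excess linker products must be removed by purification.
As many countries launch national-scale population cohort sequencing programs, there is a need for a circularized library construction method that is simple to perform and fast.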
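Summary of the Invention
To solve the problems in the prior art, the present invention proposes a method for rapidly constructing a circularized library and a loop-forming linker.
Accordingly, in one aspect, the present invention provides a method for constructing a circularized library, the method comprising:
1) breaking a DNA sequence into fragments;
2) forming 3'-end overhangs at both ends of the fragments;
3) circularizing the fragments bearing 3'-end overhangs with a loop-forming linker to form a circular library, wherein the loop-forming linker is an incompletely paired duplex with 5'-end overhangs at both ends, and the 5'-end overhangs of the loop-forming linker are complementary to the 3'-end overhangs of the fragments.
In one embodiment, in 1), the DNA sequence is broken into random fragments by sonication or enzymatic cleavage.
In one embodiment, the 3'-end overhang of the fragments and the 5'-end overhang of the loop-forming linker are 1-5 nt long, e.g. 3 nt or 2 nt, preferably 1 nt.
In one embodiment, the 3'-end overhang of the fragments is A and the 5'-end overhang of the loop-forming linker is T.
In one embodiment, in 2), the fragments are treated with exonuclease, polymerase and T4 polynucleotide kinase to give 5'-phosphorylated sticky ends carrying an extra 3' A deoxynucleotide.
In one embodiment, in 3), the incompletely paired duplex comprises a gap in one strand or a non-matching region between the two strands.
In one embodiment, in 3), the duplex includes two non-matching regions, with a barcode sequence for distinguishing samples located between them.
In one embodiment, in 3), the loop-forming linker comprises the following loop-forming linker (a) or loop-forming linker (b):
Loop-forming linker (a) includes one long strand and two short strands paired with the two ends of the long strand; the 5' end of the long strand carries a phosphate modification, and the 5' end of the short strand that pairs with the 3' end of the long strand carries a phosphate modification; the duplex linker formed by pairing has 3'-end T sticky ends and includes a single-stranded non-complementary region of 8-12 nt (e.g. 10 nt), which preferably includes a barcode sequence for distinguishing samples;
Loop-forming linker (b) includes two partially complementary strands whose two ends pair to form a duplex structure with 5'-end phosphate modifications and 3'-end T sticky ends; preferably the duplex includes a complementary portion of 8-12 nt (e.g. 10 nt) serving as a barcode sequence for distinguishing samples.
In one embodiment, in 3), the loop-forming linker is loop-forming linker (a), the ligated product is digested with exonuclease, and the digested product undergoes one-step purification to give a circular library.
In one embodiment, in 3), the loop-forming linker is loop-forming linker (b), and the ligated product is denatured to give a circular library.
In one embodiment, the circularized single-stranded circular library enters the subsequent sequencing step: after rolling circle replication, DNA nanoballs (DNBs) are formed and the nucleic acid sequence is read.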
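In another aspect, the present invention provides a loop-forming linker for constructing a circularized library; the loop-forming linker is an incompletely paired duplex with 5'-end overhangs at both ends, and these 5'-end overhangs are complementary to the 3'-end overhangs of the fragment to be circularized.
In one embodiment, the 3'-end overhang of the fragment to be circularized and the 5'-end overhang of the loop-forming linker are 1-5 nt long, e.g. 3 nt or 2 nt, preferably 1 nt.
In one embodiment, the 3'-end overhang of the fragment to be circularized is A and the 5'-end overhang of the loop-forming linker is T.
In one embodiment, the incompletely paired duplex comprises a gap in one strand or a non-matching region between the two strands.
In one embodiment, the duplex includes two non-matching regions, with a barcode sequence for distinguishing samples located between them.
In one embodiment, the loop-forming linker comprises the following loop-forming linker (a) or loop-forming linker (b):
Loop-forming linker (a) includes one long strand and two short strands paired with the two ends of the long strand; the 5' end of the long strand carries a phosphate modification, and the 5' end of the short strand that pairs with the 3' end of the long strand carries a phosphate modification; the duplex linker formed by pairing has 3'-end T sticky ends and includes a single-stranded non-complementary region of 8-12 nt (e.g. 10 nt), which preferably includes a barcode sequence for distinguishing samples;
Loop-forming linker (b) includes two partially complementary strands whose two ends pair to form a duplex structure with 5'-end phosphate modifications and 3'-end T sticky ends; preferably the duplex includes a complementary portion of 8-12 nt (e.g. 10 nt) serving as a barcode sequence for distinguishing samples.
The present invention achieves one-step circularization of circularized libraries through a specially designed linker. The fragmented DNA is end-repaired and A-tailed to form A sticky ends, which then pair with the T sticky ends of the specially designed linker to form a circular structure, and the nicks are sealed by a ligase.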
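Brief Description of the Drawings
To illustrate the technical solutions of the embodiments of the invention or of the prior art more clearly, the drawings are briefly introduced below. The drawings described below relate only to some embodiments of the invention.
Fig. 1 is a schematic diagram of loop-forming linker (a): A is the sequence of the final circular library structure, with the insert in the upper part, the linker sequence in the lower part, and the barcode underlined; B, C and D are schematic diagrams of the library construction workflow.
Fig. 2 is a schematic diagram of loop-forming linker (b): A is the sequence of the final circular library structure, with the insert in the upper part and the linker sequence in the lower part; the underlined non-italic sequence is the barcode, the underlined italic sequence is the specially designed complement of the barcode, and the underlined sequences together form a palindrome; B, C and D are schematic diagrams of the library construction workflow.
Fig. 3 shows the GC coverage of the experiment performed with loop-forming linker (a) (A) and of the experiment performed with loop-forming linker (b) (B).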
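Detailed Description
The invention is described clearly and completely below. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments that a person of ordinary skill in the art can obtain on the basis of the embodiments herein fall within the scope of protection of the invention. Unless otherwise defined, all technical and scientific terms used herein have the meanings commonly understood by those skilled in the art to which the invention belongs. The terms used in this specification serve only to describe specific embodiments and are not intended to limit the invention.
The invention addresses the long library construction time, low ease of operation and purification losses of the conventional MGI-based library construction workflow (genomic DNA fragmentation, end repair and A-tailing of the fragmented DNA, linker ligation, PCR amplification, single-strand separation and circularization). In response to these problems, the inventors changed the underlying approach: instead of adding linkers, amplifying and then circularizing, they designed a linker with a unique sequence structure that merges linker ligation and product circularization into a single reaction step, and optimized the reaction system and purification steps, greatly shortening the time needed to build a circularized library.
As shown in Fig. 1, for loop-forming linker (a), A shows the sequence of the final circular library structure: the upper part shows the insert, the lower part shows the linker sequence, and the underlined sequence is the barcode; B, C and D schematically show the library construction workflow. Loop-forming linker (a) is formed by complementary pairing of one long nucleic acid strand with two short nucleic acid strands; this structure allows the strand carrying the two short strands to be digested after circularization. The loop-forming linker is 84 bp long, and the two short strands are 10 bp apart. The 5' end of the long strand and the 5' end of the short strand at the linker end carry phosphate modifications; the duplex DNA linker formed by pairing has sticky ends with an extra 3' T deoxynucleotide, and preferably 10 bp in the middle of the linker is a single-stranded barcode sequence used for sample discrimination. After the extra-3'-T sticky ends of the duplex DNA linker pair with the extra-3'-A sticky ends of the fragmented double-stranded DNA, they can be joined by a ligase. After circularization, the ligation product can be digested to generate a single-stranded library, or used directly for DNB preparation.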
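The pairing described for loop-forming linker (a) can be checked against the sequences listed in the Examples (SEQ ID NO. 1-3). The Python sketch below is illustrative only and is not part of the patent: the helper names are hypothetical, the 10 nt barcode is represented only by its length, and the check covers Watson-Crick complementarity alone, not annealing behavior.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Long chain a (SEQ ID NO. 1), split around the 10 nt barcode placeholder.
LONG_5P = "AGTCGGAGGCCAAGCGGTCTTAGGAAGACAA"
LONG_3P = "CAACTCCTTGGCTCACAGAACGACATGGCTACGATCCGACTT"
BARCODE_LEN = 10
SHORT_1 = "AGTCGGATCGTAGCCATGTCGTTCTGTGAGCCAAGGAGTTG"  # SEQ ID NO. 2
SHORT_2 = "TTGTCTTCCTAAGACCGCTTGGCCTCCGACTT"           # SEQ ID NO. 3


def check_linker_a() -> dict:
    """Check strand pairing and the single-T overhangs of linker (a)."""
    long_nt = len(LONG_5P) + BARCODE_LEN + len(LONG_3P)
    return {
        # Short chain 1 pairs with the 3' arm of the long chain, leaving
        # the final T of the long chain unpaired.
        "short1_pairs_long_3p_arm": revcomp(SHORT_1) == LONG_3P[:-1],
        "long_chain_T_overhang": LONG_3P[-1] == "T",
        # Short chain 2 pairs with the 5' arm of the long chain, leaving
        # its own final T unpaired.
        "short2_pairs_long_5p_arm": revcomp(SHORT_2[:-1]) == LONG_5P,
        "short2_T_overhang": SHORT_2[-1] == "T",
        "long_chain_nt": long_nt,
        # One extra position for the T overhang contributed by short chain 2.
        "span_with_overhangs": long_nt + 1,
    }


if __name__ == "__main__":
    for name, value in check_linker_a().items():
        print(name, value)
```

With these sequences the four pairing checks hold, the long chain is 83 nt, and counting the extra T on short chain 2 gives 84 positions, which appears consistent with the stated 84 bp length and the 10 bp spacing between the two short chains.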
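As shown in Fig. 2, for loop-forming linker (b), A is the sequence of the final circular library structure, with the insert in the upper part and the linker sequence in the lower part; the underlined non-italic sequence is the barcode, the underlined italic sequence is the specially designed complement of the barcode, and the underlined sequences together form a palindrome; B, C and D schematically show the library construction workflow. Loop-forming linker (b) is formed by complementary pairing of two long nucleic acid strands with 5'-end phosphate modifications; the resulting duplex DNA linker has sticky ends with an extra 3' T deoxynucleotide, and a non-complementary sequence in the middle of the linker gives it an ∞-shaped structure. The loop-forming linker is 93 bp long. The ∞-shaped structure formed by the non-complementary middle sequence allows both the sense and antisense strands of the circularized linker to be read during sequencing, while reducing the formation of linker dimers. The non-complementary sequence is 34 bp long. The extra-3'-T sticky ends of the duplex DNA linker pair with the extra-3'-A sticky ends of the fragmented double-stranded DNA and are then joined by a ligase. After circularization, the ligation product can be denatured to generate a single-stranded library for DNB preparation.
The design innovation of the present invention lies in the structure of the circularization linker: library construction is completed on double-stranded DNA, A-T ligation is used, and the sequence design is compatible with MGI sequencers. Conventional library construction usually requires ligating two linkers to the two ends of the fragment, whereas the linker of the present invention attaches to both ends of the fragment simultaneously, so that circularization is achieved in one step by the ligation reaction. Using the circularization linker of the present invention skips the PCR and circularization reactions of conventional library construction and also omits the associated purification step, greatly reducing the operating time of the whole library construction workflow. Each reaction step is carried out consecutively in the same tube, avoiding tube transfers and the associated losses, and a circularized library is obtained after a final one-step purification.
The following examples are provided for a better understanding of the invention. Unless otherwise specified, the test methods in the following examples are conventional methods, and the test materials were purchased from conventional reagent suppliers. It should be noted that the above summary and the following detailed description serve only to illustrate the present invention and are not intended to limit it in any way. The scope of the present invention is determined by the appended claims, without departing from the spirit and gist of the present invention.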
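Examples
The sequences of the loop-forming linkers used in the examples are as follows:
Loop-forming linker (a)
Long chain a:
5'-AGTCGGAGGCCAAGCGGTCTTAGGAAGACAAxxxxxxxxxxCAACTCCTTGGCTCACAGAACGACATGGCTACGATCCGACTT-3', where xxxxxxxxxx is the barcode sequence (SEQ ID NO. 1)
Short chain 1: 5'-AGTCGGATCGTAGCCATGTCGTTCTGTGAGCCAAGGAGTTG-3' (SEQ ID NO. 2)
Short chain 2: 5'-TTGTCTTCCTAAGACCGCTTGGCCTCCGACTT-3' (SEQ ID NO. 3)
Loop-forming linker (b)
Long chain b:
5'-AGTCGGAGGCCAAGCGGTCTTAGGAAGACAAxxxxxxxxxxYYYYYYYYYYCAACTCCTTGGCTCACAGAACGACATGGCTACGATCCGACTT-3' (SEQ ID NO. 4), where xxxxxxxxxx is the barcode sequence and YYYYYYYYYY is the sequence complementary to the barcode.
Both strands of loop-forming linker (b) are long chain b; that is, the matching and non-matching regions at the two ends of loop-forming linker (b) mirror each other, and the matching region in the middle is a palindromic sequence.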
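The palindromic middle region of long chain b can be illustrated with a short Python sketch. It rests on an assumption that the source does not spell out: that the "barcode complementary sequence" YYYYYYYYYY is the reverse complement of the barcode xxxxxxxxxx, which is what makes the 20 nt block identical to its own reverse complement and lets two copies of the same strand pair there. The function names and the example barcode are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Flanking arms of long chain b, copied from SEQ ID NO. 4.
ARM_5P = "AGTCGGAGGCCAAGCGGTCTTAGGAAGACAA"
ARM_3P = "CAACTCCTTGGCTCACAGAACGACATGGCTACGATCCGACTT"
BARCODE_LEN = 10


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def build_long_chain_b(barcode: str) -> str:
    """Assemble long chain b around a 10 nt barcode.

    Assumption: the Y block is the reverse complement of the barcode.
    """
    if len(barcode) != BARCODE_LEN or set(barcode) - set("ACGT"):
        raise ValueError("barcode must be 10 nt of A, C, G or T")
    return ARM_5P + barcode + revcomp(barcode) + ARM_3P


def middle_is_palindrome(chain: str) -> bool:
    """True if the 20 nt barcode block equals its own reverse complement."""
    start = len(ARM_5P)
    core = chain[start:start + 2 * BARCODE_LEN]
    return core == revcomp(core)


if __name__ == "__main__":
    chain = build_long_chain_b("ACGTTGCAAG")  # hypothetical barcode
    print(len(chain), middle_is_palindrome(chain))
```

Under this assumption any 10 nt barcode gives a 93 nt strand, matching the stated linker length, and the barcode block pairs with itself when two copies of the strand anneal.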
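Linker annealing:
After delivery, the linker oligonucleotides are redissolved in TE to a concentration of 100 μM.
For loop-forming linker (a), dilute according to the following recipe and leave at room temperature for 30 minutes to prepare 10 μM loop-forming linker (a):
Long chain a (100 μM)                 10 μL
Short chain 1 (100 μM)                10 μL
Short chain 2 (100 μM)                10 μL
5× STE buffer                         20 μL
(component not named in the source)   50 μL
Total                                 100 μL
For loop-forming linker (b), dilute according to the following recipe and leave at room temperature for 30 minutes to prepare 10 μM loop-forming linker (b):
Long chain b (100 μM)                 20 μL
5× STE buffer                         20 μL
(component not named in the source)   60 μL
Total                                 100 μL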
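The annealing recipes can be sanity-checked with a simple dilution calculation. The Python sketch below is illustrative and assumes that volumes are additive and that every strand ends up in a duplex; the function names are hypothetical. It shows why 20 μL of long chain b gives the same 10 μM duplex concentration as 10 μL of each strand of linker (a): linker (b) uses two copies of the same strand per duplex.

```python
def diluted_conc(stock: float, added_ul: float, total_ul: float) -> float:
    """Concentration after dilution, in the same unit as `stock`."""
    return stock * added_ul / total_ul


def linker_a_duplex_um() -> float:
    """Linker (a): three different strands, each 10 uL of 100 uM in 100 uL."""
    # Each strand is present once per duplex, so the duplex concentration
    # equals the concentration of any one strand.
    return diluted_conc(100.0, 10.0, 100.0)


def linker_b_duplex_um() -> float:
    """Linker (b): 20 uL of 100 uM long chain b in 100 uL."""
    # Two identical strands form one duplex, so halve the strand concentration.
    return diluted_conc(100.0, 20.0, 100.0) / 2


def ste_fold() -> float:
    """Final STE buffer strength: 20 uL of 5x stock in 100 uL."""
    return diluted_conc(5.0, 20.0, 100.0)


if __name__ == "__main__":
    print(linker_a_duplex_um(), linker_b_duplex_um(), ste_fold())
```

Both recipes come out at 10 μM duplex in 1× STE, in line with the stated target concentration.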
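Source:
1. Genomic DNA fragmentation: genomic DNA can be fragmented in various ways; both physical sonication and enzymatic methods have very mature commercial solutions. Physical sonication is used in this example.
Take a 96-well PCR plate, add a polytetrafluoroethylene (PTFE) wire, add 1 μg of extracted genomic DNA, and make up to 80 μl with TE buffer or enzyme-free water. Seal the plate and sonicate it on an E210 sonicator.
The fragmentation conditions are set as follows:
Duty factor           20%
Intensity             5
Pulse factor          200
Fragmentation time    35 × 4 runs
2. Fragment selection: magnetic bead purification or gel recovery can be used; magnetic bead purification is used in this example. Take the fragmented DNA, add 80 μl of Ampure XP magnetic beads, mix well and leave for 7-15 min; place on a magnetic rack and collect the supernatant; add 40 μl of Ampure XP magnetic beads to the supernatant, mix well and leave for 7-15 min; place on a magnetic rack, discard the supernatant, and wash the beads twice with 75% ethanol; after drying, add 50 μl of TE buffer or enzyme-free water, mix well and leave for 7-15 min to dissolve the recovered product.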
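3. End repair and A-tailing: take 100 ng of the product recovered in the previous step, make up to 40 μL with TE, and prepare the reaction mixture shown in the following table:
Reagent name            Volume
Nuclease-free water     2.1 μL
10× PNK buffer          5 μL
5:1 dATP:dNTP           0.6 μL
Klenow fragment         0.1 μL
rTaq                    0.2 μL
T4 DNA polymerase       2 μL
Total                   10 μL
Immediately add the 10 μL of end-repair reaction mixture to the 40 μl of fragmented product and run the following reaction. The reaction conditions are shown in the following table:
Condition    Time
37 ℃         30 min
65 ℃         15 min
4 ℃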
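4. Linker ligation: after the reaction, immediately add 5 μL of loop-forming linker (a) or loop-forming linker (b) at a linker concentration of 10 μM. At the same time, prepare the linker ligation mixture shown in the following table:
Reagent name            Volume
10× PNK buffer          3 μL
100 mM ATP              0.8 μL
Nuclease-free water     3.6 μL
50% PEG8000             16 μL
T4 DNA ligase           1.6 μL
Total                   25 μL
After the linker has been added to the end-repair product, add 25 μL of the linker ligation mixture and run the following reaction. The reaction conditions are shown in the following table:
Condition    Time
23 ℃         30 min
4 ℃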
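Because end repair and ligation run in the same tube, the final concentrations in the ligation reaction follow from the cumulative volumes. The Python sketch below is a rough check under the assumption that volumes are additive and nothing is lost between steps; the dictionary keys and function names are hypothetical labels for the table rows.

```python
# Volumes (uL) copied from the end-repair and ligation tables.
END_REPAIR_MIX_UL = {
    "nuclease-free water": 2.1,
    "10x PNK buffer": 5.0,
    "5:1 dATP:dNTP": 0.6,
    "Klenow fragment": 0.1,
    "rTaq": 0.2,
    "T4 DNA polymerase": 2.0,
}
LIGATION_MIX_UL = {
    "10x PNK buffer": 3.0,
    "100 mM ATP": 0.8,
    "nuclease-free water": 3.6,
    "50% PEG8000": 16.0,
    "T4 DNA ligase": 1.6,
}
FRAGMENTED_DNA_UL = 40.0  # 100 ng of recovered product made up with TE
LINKER_UL = 5.0           # 10 uM loop-forming linker (a) or (b)


def total_volume_ul() -> float:
    """Final one-tube reaction volume, assuming additive volumes."""
    return (FRAGMENTED_DNA_UL + sum(END_REPAIR_MIX_UL.values())
            + LINKER_UL + sum(LIGATION_MIX_UL.values()))


def final_concentrations() -> dict:
    """Final concentrations of key components in the ligation reaction."""
    total = total_volume_ul()
    buffer_ul = END_REPAIR_MIX_UL["10x PNK buffer"] + LIGATION_MIX_UL["10x PNK buffer"]
    return {
        "total_ul": total,
        "peg8000_percent": 50.0 * LIGATION_MIX_UL["50% PEG8000"] / total,
        "atp_mM": 100.0 * LIGATION_MIX_UL["100 mM ATP"] / total,
        "linker_uM": 10.0 * LINKER_UL / total,
        "pnk_buffer_fold": 10.0 * buffer_ul / total,
    }


if __name__ == "__main__":
    for name, value in final_concentrations().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the reaction totals 80 μL with roughly 10% PEG8000, 1 mM added ATP, 0.625 μM linker and 1× PNK buffer, so the two buffer additions together keep the buffer at working strength.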
5. After the reaction, purify with 2× magnetic beads and redissolve in 20 μL of TE.
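6. Preparation and sequencing of DNBs: for the specific steps, refer to the instructions of the MGI sequencing kit. DNBs were prepared from the double-stranded libraries. The results of a preliminary sequencing run are given below. The GC coverage of the experiment with loop-forming linker (a) is shown in Fig. 3 A, and that of the experiment with loop-forming linker (b) is shown in Fig. 3 B.
Loop-forming linker (a)
Sample name                        Loop-forming linker a
Reads after filtering              1076236
Bases after filtering (Mb)         161.44
Proportion after filtering (%)     52.62
Mapping rate (%)                   11.1
Specificity rate (%)               98.26
Duplication rate (%)               0.83
Mismatch rate (%)                  3.5
Loop-forming linker (b)
Sample name                        Loop-forming linker b
Reads after filtering              16709122
Bases after filtering (Mb)         1754.46
Proportion after filtering (%)     38.24
Mapping rate (%)                   50.65
Specificity rate (%)               98.1
Duplication rate (%)               1.9
Mismatch rate (%)                  1.86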
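The read and base counts in the two tables can be cross-checked by deriving the mean length of the filtered reads. The Python sketch below is illustrative only; it assumes that "Mb" means 10^6 bases and that the counts refer to the same filtered read set, neither of which the source states explicitly. The names are hypothetical.

```python
# Filtered read counts and base yields copied from the two result tables.
RUNS = {
    "linker_a": {"reads": 1_076_236, "bases_mb": 161.44},
    "linker_b": {"reads": 16_709_122, "bases_mb": 1754.46},
}


def mean_read_length(reads: int, bases_mb: float) -> float:
    """Mean filtered read length in bases, taking 1 Mb as 1e6 bases."""
    return bases_mb * 1e6 / reads


if __name__ == "__main__":
    for name, run in RUNS.items():
        length = mean_read_length(run["reads"], run["bases_mb"])
        print(f"{name}: about {length:.1f} bases per filtered read")
```

The counts work out to about 150 bases per read for linker (a) and about 105 for linker (b), which suggests that the two runs used different read lengths or trimming, so the two tables are not directly comparable on yield alone.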
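Analysis of results: in principle, all of the loop-forming linkers address the problems of the prior art. They replace the conventional approach of adding linkers, amplifying and then circularizing, merge linker ligation and product circularization into a single reaction step, and greatly shorten the time needed to build a circularized library. The product of library construction is a circularized product that, after DNB preparation, can be sequenced directly on the instrument. Analysis of the sequencing results shows that all linkers can be used to build and sequence libraries successfully. Although the data are not optimal, the purpose of this experiment was to verify the feasibility of the linker design in practice and to provide a worked example. The experimental data therefore demonstrate that the linker design of the invention sufficiently meets the needs of the invention and is practicable.
The specific examples above are used to explain the invention; they serve only to aid understanding and are not intended to limit it. Those skilled in the art to which the invention belongs may make simple deductions, variations or substitutions based on the idea of the invention.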

Claims (10)

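  1. A method for constructing a circularized library, the method comprising:
    1) breaking a DNA sequence into fragments;
    2) forming 3'-end overhangs at both ends of the fragments;
    3) circularizing the fragments bearing 3'-end overhangs with a loop-forming linker to form a circular library, wherein the loop-forming linker is an incompletely paired duplex with 5'-end overhangs at both ends, and the 5'-end overhangs of the loop-forming linker are complementary to the 3'-end overhangs of the fragments.
  2. The method according to claim 1, wherein the 3'-end overhang of the fragments is A and the 5'-end overhang of the loop-forming linker is T.
  3. The method according to claim 2, wherein in 2) the fragments are treated with exonuclease, polymerase and T4 polynucleotide kinase to give 5'-phosphorylated sticky ends carrying an extra 3' A deoxynucleotide.
  4. The method according to any one of claims 1-3, wherein in 3) the incompletely paired duplex comprises a gap in one strand or a non-matching region between the two strands.
  5. The method according to claim 4, wherein in 3) the duplex includes two non-matching regions, with a barcode sequence for distinguishing samples located between them.
  6. The method according to any one of claims 1-4, wherein in 3) the loop-forming linker comprises the following loop-forming linker (a) or loop-forming linker (b):
    loop-forming linker (a) includes one long strand and two short strands paired with the two ends of the long strand; the 5' end of the long strand carries a phosphate modification, and the 5' end of the short strand that pairs with the 3' end of the long strand carries a phosphate modification; the duplex linker formed by pairing has 3'-end T sticky ends and includes a single-stranded non-complementary region of 8-12 nt (e.g. 10 nt), which preferably includes a barcode sequence for distinguishing samples;
    loop-forming linker (b) includes two partially complementary strands whose two ends pair to form a duplex structure with 5'-end phosphate modifications and 3'-end T sticky ends; preferably the duplex includes a complementary portion of 8-12 nt (e.g. 10 nt) serving as a barcode sequence for distinguishing samples.
  7. The method according to claim 6, wherein in 3) the loop-forming linker is loop-forming linker (a), the ligated product is digested with exonuclease, and the digested product undergoes one-step purification to give a circular library; or the loop-forming linker is loop-forming linker (b), and the ligated product is denatured to give a circular library.
  8. A loop-forming linker for constructing a circularized library, wherein the loop-forming linker is an incompletely paired duplex with 5'-end overhang structures at both ends, and the 5'-end overhangs of the loop-forming linker are complementary to the 3'-end overhangs of the fragment to be circularized.
  9. The loop-forming linker according to claim 8, wherein the incompletely paired duplex comprises a gap in one strand or a non-matching region between the two strands; preferably, the duplex includes two non-matching regions, with a barcode sequence for distinguishing samples located between them.
  10. The loop-forming linker according to claim 8 or 9, wherein the loop-forming linker comprises the following loop-forming linker (a) or loop-forming linker (b):
    loop-forming linker (a) includes one long strand and two short strands paired with the two ends of the long strand; the 5' end of the long strand carries a phosphate modification, and the 5' end of the short strand that pairs with the 3' end of the long strand carries a phosphate modification; the duplex linker formed by pairing has 3'-end T sticky ends and includes a single-stranded non-complementary region of 8-12 nt (e.g. 10 nt), which preferably includes a barcode sequence for distinguishing samples;
    loop-forming linker (b) includes two partially complementary strands whose two ends pair to form a duplex structure with 5'-end phosphate modifications and 3'-end T sticky ends; preferably the duplex includes a complementary portion of 8-12 nt (e.g. 10 nt) serving as a barcode sequence for distinguishing samples.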
PCT/CN2020/140877 2020-12-29 2020-12-29 Method for rapidly constructing a circularized library and loop-forming linker WO2022141061A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN202080107392.5A CN116670340A (zh) 2020-12-29 2020-12-29 Method for rapidly constructing a circularized library and loop-forming linker
PCT/CN2020/140877 WO2022141061A1 (zh) 2020-12-29 2020-12-29 Method for rapidly constructing a circularized library and loop-forming linker
EP20967409.2A EP4273307A1 (en) 2020-12-29 2020-12-29 Method for rapidly constructing cyclized library and ring-forming linker
US18/259,978 US20240076657A1 (en) 2020-12-29 2020-12-29 Method for rapidly constructing cyclized library and cyclization adapter

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/140877 WO2022141061A1 (zh) 2020-12-29 2020-12-29 Method for rapidly constructing a circularized library and loop-forming linker

Publications (1)

Publication Number Publication Date
WO2022141061A1 (zh) 2022-07-07

Family

ID=82258757

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/140877 WO2022141061A1 (zh) 2020-12-29 2020-12-29 Method for rapidly constructing a circularized library and loop-forming linker

Country Status (4)

Country Link
US (1) US20240076657A1 (zh)
EP (1) EP4273307A1 (zh)
CN (1) CN116670340A (zh)
WO (1) WO2022141061A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024098178A1 (zh) * 2022-11-07 2024-05-16 深圳华大智造科技股份有限公司 Reaction system for preparing DNA nanoballs and use thereof

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
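CN102864498A (zh) * 2012-09-24 2013-01-09 天津工业生物技术研究所 Method for constructing a long-fragment end library
WO2016078095A1 (zh) * 2014-11-21 2016-05-26 深圳华大基因科技有限公司 Bubble-shaped adapter element and method for constructing a sequencing library using the same
WO2018081666A1 (en) * 2016-10-28 2018-05-03 Silgentech Inc. Methods of single dna/rna molecule counting
CN112226821A (zh) * 2020-10-16 2021-01-15 鲲羽生物科技(江门)有限公司 Method for constructing a sequencing library for the MGI sequencing platform based on double-strand circularization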

Also Published As

Publication number Publication date
US20240076657A1 (en) 2024-03-07
EP4273307A1 (en) 2023-11-08
CN116670340A (zh) 2023-08-29

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20967409

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 202080107392.5

Country of ref document: CN

WWE Wipo information: entry into national phase

Ref document number: 18259978

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2020967409

Country of ref document: EP

Effective date: 20230731