EP3615683B1 - Method for linking polynucleotides - Google Patents

Method for linking polynucleotides

Info

Publication number
EP3615683B1
Authority
EP
European Patent Office
Prior art keywords
nucleic acid
sequence
dna
adapter
chain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
EP18724679.8A
Other languages
English (en)
French (fr)
Other versions
EP3615683A1 (de
Inventor
Xiaofeng Xin
Robert Nicol
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Broad Institute Inc
Original Assignee
Broad Institute Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Broad Institute Inc
Publication of EP3615683A1
Application granted
Publication of EP3615683B1
Active legal status
Anticipated expiration

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N5/00 Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
    • C12N5/06 Animal cells or tissues; Human cells or tissues
    • C12N5/0602 Vertebrate cells
    • C12N5/0634 Cells from the blood or the immune system
    • C12N5/0635 B lymphocytes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N5/00 Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
    • C12N5/06 Animal cells or tissues; Human cells or tissues
    • C12N5/0602 Vertebrate cells
    • C12N5/0634 Cells from the blood or the immune system
    • C12N5/0636 T lymphocytes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813 Hybridisation assays
    • C12Q1/6827 Hybridisation assays for detection of mutation or polymorphism
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844 Nucleic acid amplification reactions
    • C12Q1/6853 Nucleic acid amplification reactions using modified primers or templates
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869 Methods for sequencing

Definitions

  • the present disclosure relates to methods for sequencing, retrieving, assembling, and/or cloning nucleic acids. More specifically, the invention relates to the use of 5'-5' linked oligonucleotides for linking DNA molecules for sequencing of the ends of long DNA template molecules, or for sequencing polymorphisms, different target genes, or different RNAs simultaneously.
  • the present disclosure provides a method for linking two or more nucleic acid molecules or fragments thereof, comprising: (a) segregating the nucleic acid molecules into individual discrete volumes; (b) annealing, within each individual discrete volume, said two or more nucleic acid molecules using a first primer and a second primer, wherein the first and second primers are linked by a 5'-5'-covalent linkage, and wherein the first primer hybridizes to a first sequence of a first nucleic acid molecule of said two or more nucleic acid molecules and the second primer hybridizes to a second sequence of a second nucleic acid molecule of said two or more nucleic acid molecules, to create a complex comprising the first and the second nucleic acid molecules.
  • the first and the second nucleic acid molecules are RNAs or mRNAs.
  • the mRNAs may encode immunoglobulin light chain and heavy chain, or T cell receptor-α and T cell receptor-β.
  • the first and the second nucleic acid molecules are genomic DNAs.
  • the genomic DNAs may comprise polymorphic sequences.
  • the method further comprises amplifying the first and the second nucleic acid molecules in the complex with a reverse transcriptase under conditions to create a first cDNA complementary to the first nucleic acid molecule and a second cDNA complementary to the second nucleic acid molecule.
  • Amplifying the first and the second nucleic acid molecules may comprise contacting the complex with a third primer that hybridizes to a sequence of the first cDNA and a fourth primer that hybridizes to a sequence of the second cDNA and creating a third cDNA complementary to the first cDNA and a fourth cDNA complementary to the second cDNA.
  • the third and the fourth primers may be unlinked or linked by a 5'-5'-covalent linkage.
  • amplifying the first and the second nucleic acid molecules may comprise contacting the complex with a template switching adapter and a third primer that primes at the template switching adapter and creating a third cDNA complementary to the first cDNA and a fourth cDNA complementary to the second cDNA.
  • the method may further comprise amplifying the first and the second nucleic acid molecules in the complex under conditions to create a first DNA complementary to the first nucleic acid molecule and a second DNA complementary to the second nucleic acid molecule.
  • Amplifying the first and the second nucleic acid molecules may further comprise contacting the complex with a third primer that hybridizes to a sequence of the first DNA and a fourth primer that hybridizes to a sequence of the second DNA and creating a third DNA complementary to the first DNA and a fourth DNA complementary to the second DNA.
  • the first and the second primers are 5'-5' linked via PCR amplification, isothermal amplification, ligation, click chemistry, or oligonucleotide chemical synthesis.
  • the primers may be linked using a biocompatible reaction.
  • the biocompatible reaction can be selected from the group consisting of a copper(I)-catalyzed azide-alkyne cycloaddition (CuAAC) reaction, a copper-free strain-promoted azide-alkyne cycloaddition (SPAAC) reaction, and a thiol-ene reaction.
  • the first primer, the second primer, both primers, or the linkage can comprise a binding tag.
  • the binding tag may be an affinity pull-down functional group, for example, a biotin or desthiobiotin group.
  • the method can further comprise isolating the complex comprising the first and the second nucleic acid molecules and the first and second primers by affinity pull-down, for example, by contacting the complex with a streptavidin linked tag.
  • the individual discrete volume is a droplet generated by emulsification.
  • the droplet may be generated by vortexing or shaking, or on a microfluidic device.
  • the individual discrete volume is a hollow particle of sufficient size to hold reaction mixture, for example, a section of a thin capillary tube.
  • the present disclosure provides a method for linking nucleic acid molecules or fragments thereof, comprising: (a) segregating individual nucleic acid molecules labeled on both terminal ends with a first adapter pair comprising a forward sequence (F) and a reverse sequence (R), into individual discrete volumes; (b) inserting, within the individual discrete volumes, at least a second adapter into two or more interior locations of the nucleic acid molecule; (c) fragmenting the nucleic acid molecules to generate nucleic acid fragments of the nucleic acid molecule of which at least a portion are labeled with both the first adapter pair and the second adapter; (d) contacting the nucleic acid fragments with at least a first and a second primer, wherein the first primer comprises at least two 5'-5'-linked arms, wherein a first arm of the at least two 5'-5'-linked arms comprises a sequence that hybridizes to the forward (F) sequence of the first adapter pair and a second arm of the at least two 5'-5
  • the method may further comprise: (f) pooling the amplified nucleic acid fragments from each individual discrete volume; and (g) circularizing the amplified nucleic acid fragments by joining the second adapters.
  • the method may further comprise: (h) PCR amplification to generate linearized nucleic acid molecules comprising the second adapter; and (i) sequencing the linearized nucleic acid molecules to generate a set of nucleic acid reads.
  • the method may comprise isolating the amplified nucleic acid fragments labeled with the first primer prior to the circularization step, or exonuclease digestion prior to the PCR amplification step to generate linearized nucleic acid molecules.
  • the method may further comprise removing the first adapter pair sequence from the circularized nucleic acid molecules to generate linearized nucleic acid molecules comprising the second adapter prior to the PCR amplification step.
  • the method may comprise assembling a nucleic acid sequence of the nucleic acid molecules based, at least in part, on the set of nucleic acid sequencing reads.
  • the first arm and the second arm are each between 5 and 1000 bp in length.
  • the first and the second arms may be 5'-5' linked via PCR amplification, isothermal amplification, ligation, click chemistry, or oligonucleotide chemical synthesis.
  • the arms may be linked using a biocompatible reaction.
  • the biocompatible reaction can be selected from the group consisting of a copper(I)-catalyzed azide-alkyne cycloaddition (CuAAC) reaction, a copper-free strain-promoted azide-alkyne cycloaddition (SPAAC) reaction, and a thiol-ene reaction.
  • the first arm, the second arm, both arms, or the linkage can comprise a binding tag.
  • the binding tag may be an affinity pull-down functional group, for example, a biotin or desthiobiotin group.
  • the method can further comprise isolating the amplified nucleic acid fragments labeled with the first primer via the binding tag.
  • the forward (F) sequence and the reverse (R) sequence are between 6 and 5000 nucleotides in length.
  • the forward (F) sequence and the reverse (R) sequence can be the same or different.
  • the forward (F) sequence, the reverse (R) sequence, or the second adapter can further comprise a restriction site.
  • the restriction site can be a Type IIS restriction site, for example, a SapI, AcuI, BpuEI, or BsgI restriction site.
  • the method can further comprise removing or shortening the forward (F) sequence, the reverse (R) sequence, or the second adapter from the circularized nucleic acid fragments by a restriction enzyme recognizing the restriction site.
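A minimal sketch of how an adapter design might be checked for the Type IIS sites named above. The recognition sequences in the table are an assumption taken from commonly published enzyme data, not from this document, and should be verified against the supplier's specification before use. The function name `find_sites` and the example adapter are hypothetical.

```python
# Recognition sequences written 5'->3'. ASSUMPTION: values come from commonly
# published enzyme data, not from the patent text; verify before relying on them.
TYPE_IIS_SITES = {
    "SapI": "GCTCTTC",
    "AcuI": "CTGAAG",
    "BpuEI": "CTTGAG",
    "BsgI": "GTGCAG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sites(adapter):
    """Map each enzyme to a sorted list of (0-based position, strand) hits.

    Type IIS sites are asymmetric, so both the site itself ("+") and its
    reverse complement ("-") are searched on the given strand.
    """
    adapter = adapter.upper()
    hits = {}
    for enzyme, site in TYPE_IIS_SITES.items():
        found = []
        for strand, pattern in (("+", site), ("-", reverse_complement(site))):
            start = adapter.find(pattern)
            while start != -1:
                found.append((start, strand))
                start = adapter.find(pattern, start + 1)
        if found:
            hits[enzyme] = sorted(found)
    return hits


if __name__ == "__main__":
    # Hypothetical adapter carrying one SapI site on the forward strand.
    print(find_sites("AAGCTCTTCAA"))
```

Because these enzymes cut outside their recognition sequence, locating the site is only the first step; the cut offsets would also be needed to predict how much of an adapter is removed or shortened.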
  • the first arm of the first primer comprises a forward sequencing adapter sequence or a fragment thereof.
  • the second arm of the first primer comprises a reverse sequencing adapter sequence or a fragment thereof.
  • the end-labeled nucleic acid molecules are fragmented by a transposase.
  • the individual discrete volume is a droplet generated by emulsification.
  • the droplet may be generated by vortexing or shaking, or on a microfluidic device.
  • the droplet may comprise the transposase, the second adapter, and the first and the second primers.
  • the individual discrete volume is a hollow particle of sufficient size to hold reaction mixture, for example, a section of a thin capillary tube.
  • the nucleic acid molecules may be DNA or RNA molecules. In some embodiments, the nucleic acid molecules are 5 kb or longer, or 40-100 kb or longer.
  • the nucleic acid molecules may encode T-cell receptor, B-cell receptor, or immunoglobulin heavy or light chain.
  • Another aspect of the present disclosure provides a method for linking nucleic acid molecules or fragments thereof, comprising: (a) segregating individual nucleic acid molecules labeled on both terminal ends with an adapter pair comprising a forward sequence (F) and a reverse sequence (R), into individual discrete volumes; (b) contacting, within each individual discrete volume, the nucleic acid molecules labeled with the adapter pair with at least a first primer and a second primer, wherein the first primer comprises at least two 5'-5'-linked arms, wherein a first arm of the at least two 5'-5'-linked arms comprises a sequence that hybridizes to the forward (F) sequence of the adapter pair and a second arm of the at least two 5'-5'-linked arms comprises a sequence that hybridizes to the reverse (R) sequence of the adapter pair, wherein the second primer comprises a sequence that hybridizes to the forward (F) sequence of the adapter pair, the reverse (R) sequence of the adapter pair, or an internal conserved region of the nucleic
  • the method may further comprise: (f) pooling the amplified nucleic acid fragments from each individual discrete volume; and (g) circularizing the amplified nucleic acid fragments.
  • the method may further comprise: (h) PCR amplification to generate linearized nucleic acid molecules comprising the adapter pair sequences; and (i) sequencing the linearized nucleic acid molecules to generate a set of nucleic acid reads.
  • the method may comprise isolating the amplified nucleic acid fragments labeled with the first primer prior to the circularization step, or exonuclease digestion prior to the PCR amplification step.
  • the method may comprise assembling a nucleic acid sequence of the nucleic acid molecules based, at least in part, on the set of nucleic acid sequencing reads.
  • the first arm and the second arm are each between 5 and 1000 bp in length.
  • the first and the second arms may be 5'-5' linked via PCR amplification, isothermal amplification, ligation, click chemistry or oligonucleotide chemical synthesis.
  • the arms may be linked using a biocompatible reaction.
  • the biocompatible reaction can be selected from the group consisting of a copper(I)-catalyzed azide-alkyne cycloaddition (CuAAC) reaction, a copper-free strain-promoted azide-alkyne cycloaddition (SPAAC) reaction, and a thiol-ene reaction.
  • the first arm, the second arm, both arms, or the linkage can comprise a binding tag.
  • the binding tag may be an affinity pull-down functional group, for example, a biotin or desthiobiotin group.
  • the method can further comprise isolating the amplified nucleic acid fragments labeled with the first primer via the binding tag.
  • the forward (F) sequence and the reverse (R) sequence are between 6 and 5000 nucleotides in length.
  • the forward (F) sequence and the reverse (R) sequence can be the same or different.
  • the forward (F) sequence or the reverse (R) sequence can further comprise a restriction site.
  • the restriction site can be a Type IIS restriction site, for example, a SapI, AcuI, BpuEI, or BsgI restriction site.
  • the method can further comprise removing or shortening the forward (F) sequence or the reverse (R) sequence from the circularized nucleic acid fragments by a restriction enzyme recognizing the restriction site.
  • the first arm of the first primer comprises a forward sequencing adapter sequence or a fragment thereof.
  • the second arm of the first primer comprises a reverse sequencing adapter sequence or a fragment thereof.
  • the individual discrete volume is a droplet generated by emulsification.
  • the droplet may be generated by vortexing or shaking, or on a microfluidic device.
  • the individual discrete volume is a hollow particle of sufficient size to hold reaction mixture, for example, a section of a thin capillary tube.
  • the nucleic acid molecules can be DNA molecules or RNA molecules.
  • the nucleic acid molecules may encode T-cell receptor, B-cell receptor, or immunoglobulin heavy chain or light chain.
  • the present disclosure provides a composition for linking two or more nucleic acid molecules, comprising at least a first and a second nucleic acid molecule and a first and a second primer, wherein the first and the second primers are linked by a 5'-5'-covalent linkage, and wherein the first primer comprises a sequence that hybridizes to a first conserved sequence of the first nucleic acid molecule and the second primer comprises a sequence that hybridizes to a second conserved sequence of the second nucleic acid molecule.
  • the composition can further comprise a first DNA molecule amplified from the first primer and a second DNA molecule amplified from the second primer, wherein the first DNA molecule is complementary to the first nucleic acid molecule and the second DNA molecule is complementary to the second nucleic acid molecule.
  • the composition can further comprise a third primer and a fourth primer, wherein the third primer comprises a sequence that hybridizes to the first DNA molecule and the fourth primer comprises a sequence that hybridizes to the second DNA molecule.
  • the third primer and the fourth primer may be linked by a 5'-5'-covalent linkage.
  • a circularized DNA molecule for linking and/or sequencing two ends of a nucleic acid molecule comprising: a first primer, both ends of the nucleic acid molecule and an internal adapter sequence, wherein the internal adapter is inserted in the nucleic acid molecule, wherein the ends of the nucleic acid molecule are labeled with a forward (F) sequence and a reverse (R) sequence, wherein the first primer comprises at least two 5'-5'-linked arms linking the two ends of the nucleic acid molecule, wherein a first arm of the at least two 5'-5'-linked arms comprises a sequence that hybridizes to the forward (F) sequence and a second arm of the at least two 5'-5'-linked arms comprises a sequence that hybridizes to the reverse (R) sequence, and wherein the linked ends of the nucleic acid molecule are ligated by ligase at their distal ends.
  • the present disclosure provides a circularized DNA molecule for linking and/or sequencing two nucleic acid molecules, comprising: a primer and two nucleic acid molecules comprising a first and a second nucleic acid molecule, wherein the primer comprises at least two 5'-5'-linked arms linking the two nucleic acid molecules, wherein a first arm of the at least two 5'-5'-linked arms comprises a sequence that hybridizes to the first nucleic acid molecule and a second arm of the at least two 5'-5'-linked arms comprises a sequence that hybridizes to the second nucleic acid molecule, and wherein the linked two nucleic acid molecules are ligated by ligase at their distal ends.
  • the circularized DNA further comprises a second primer or a second adapter, wherein the second primer or second adapter labels the distal ends of the two nucleic acid molecules before they are ligated by ligase.
  • a "biological sample" may contain whole cells and/or live cells and/or cell debris.
  • the biological sample may contain (or be derived from) a "bodily fluid".
  • the present invention encompasses embodiments wherein the bodily fluid is selected from amniotic fluid, aqueous humour, vitreous humour, bile, blood serum, breast milk, cerebrospinal fluid, cerumen (earwax), chyle, chyme, endolymph, perilymph, exudates, feces, female ejaculate, gastric acid, gastric juice, lymph, mucus (including nasal drainage and phlegm), pericardial fluid, peritoneal fluid, pleural fluid, pus, rheum, saliva, sebum (skin oil), semen, sputum, synovial fluid, sweat, tears, urine, vaginal secretion, vomit and mixtures of one or more thereof.
  • Biological samples include cell cultures, bodily fluids, cell cultures
  • subject means a vertebrate, preferably a mammal, more preferably a human.
  • Mammals include, but are not limited to, murines, simians, humans, farm animals, sport animals, and pets. Tissues, cells and their progeny of a biological entity obtained in vivo or cultured in vitro are also encompassed.
  • the terms "chain-oligo or oligonucleotide" and "crab-oligo or crab nucleotide" may be used interchangeably herein.
  • Embodiments disclosed herein provide methods, primers, and kits for covalently linking polynucleotides, which have applications in, for example, de novo genome assembly, long range mutation detection, mapping of repeating regions, and synthetic biology construct validation.
  • the embodiments disclosed herein are well adapted for applications requiring the manipulation and/or sequencing of large polynucleotide molecules.
  • Existing techniques for sequencing long DNA reads are costly, require large DNA inputs, suffer from higher error rates than sequencing shorter reads, and are typically much less efficient when used with oligonucleotides that are longer than 5 kb.
  • the embodiments disclosed herein utilize amplification of target oligonucleotides using 5'-5' linked oligonucleotides, referred to herein as chain-seq oligos or crab oligos.
  • the chain-seq oligos comprise two or more separate oligonucleotide arms that are linked to one another at the 5' end of each arm.
  • the chain-oligo has two or more free 3' ends, and each arm comprises at least a hybridization domain capable of binding to a complementary sequence on a target oligonucleotide.
  • the chain-seq oligos may be designed to bind to RNA, DNA, or a combination thereof.
  • Amplification reactions utilizing the chain-seq oligo then result in a single amplicon or molecule that incorporates the sequence of both target molecules. This single amplicon or molecule may then be used in further processing steps such as, but not limited to, sequencing.
  • the target oligonucleotides are smaller fragments originating from the same larger oligonucleotide.
  • the large oligonucleotide is a DNA oligonucleotide.
  • a "large oligonucleotide" is an oligonucleotide that is at least about 5 kb, 6 kb, 7 kb, 8 kb, 9 kb, 10 kb, 11 kb, 12 kb, 13 kb, 14 kb, 15 kb, 16 kb, 17 kb, 18 kb, 19 kb, 20 kb, 21 kb, 22 kb, 23 kb, 24 kb, 25 kb, 26 kb, 27 kb, 28 kb, 29 kb, 30 kb, 31 kb, 32 kb, 33 kb, 34 kb, 35 kb, 36 kb, 37 kb, 38 kb, 39 kb, 40 kb, and the like.
  • covalent linkage of polynucleotides is achieved using 5'-5' linked oligonucleotides, also referred to herein as a "chain-seq" or "crab-seq" oligonucleotide.
  • a 5'-5' linked oligonucleotide comprises two or more "arms," each comprising an oligonucleotide sequence that is linked at the 5' end via a covalent or non-covalent biocompatible linkage.
  • the chain-seq oligonucleotide comprises two or more free 3' ends.
  • Each arm of the 5'-5' linked oligonucleotide may comprise the same oligonucleotide sequence, or each arm may comprise a different oligonucleotide sequence.
  • the oligonucleotide sequences of each arm may be of the same or different lengths. In certain embodiments, an individual arm may be from about 8 to about 1000 nucleotides in length.
  • an individual arm may be 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, or 1000 nucleotides in length, or the like.
  • Linking together individual nucleic acid molecules as described herein may result in a nucleic acid construct of any size, containing any number of joined nucleic acid molecules appropriate in accordance with the invention.
  • Each arm may be single stranded, double stranded, or a combination thereof.
  • the oligonucleotide may comprise DNA, RNA, or a combination thereof.
  • the arms may also comprise, in full or in part, nucleotide analogs such as peptide nucleic acids, morpholino and locked nucleic acids, glycol nucleic acid, and threose nucleic acids.
  • a portion of each arm of the 5'-5' linked oligonucleotide may comprise a binding domain comprising a nucleic acid sequence that is complementary to and hybridizes with a target sequence.
  • a target sequence may be a naturally occurring nucleic acid sequence or may be artificially introduced into a target polynucleotide as appropriate depending on the application.
  • a 5'-5' linked oligonucleotide may comprise at least two, at least three, at least four, at least five, at least six, at least seven, at least eight arms, or any number of arms as appropriate for the number of target oligonucleotides to be linked via the methods disclosed herein.
  • the arms of an oligonucleotide as described herein may be connected via a common linkage.
  • each arm may recognize the same target sequence.
  • each arm, or a subset of arms may recognize a different target sequence. For example, given a four-arm 5'-5' linked oligonucleotide, each arm may recognize up to four different target sequences. Alternatively, two arms could hybridize to a single target sequence and the remaining two arms could hybridize to a second target sequence. Other similar variations are contemplated and are within the scope of this invention.
  • each arm may be linked to each other using means known in the art for linking nucleic acids to each other.
  • nucleic acids are linked together via a biocompatible reaction.
  • a biocompatible reaction may comprise use of "click chemistry" (see, e.g., Rostovtsev et al., Angew Chem Int Ed 41:2596-2599, 2002; Himo et al., J Am Chem Soc 127:210-216, 2005; Boren et al., J Am Chem Soc 130:8923-8930, 2008).
  • An example of a click chemistry reaction is the Huisgen 1,3-dipolar cycloaddition of alkynes to azides to form 1,4-disubstituted-1,2,3-triazoles.
  • the copper(I)-catalyzed reaction is mild and very efficient, requiring no protecting groups, and requiring no purification, in many cases.
  • the azide (AZ) and alkyne (AK) functional groups are largely inert towards biological molecules and aqueous environments, which allows the use of the Huisgen 1,3-dipolar cycloaddition in target-guided synthesis and activity-based protein profiling.
  • a chain oligo is formed by linking the 5' end of one nucleic acid strand that includes an azide group to the 5' end of another nucleic acid strand that includes an alkyne group.
  • Other example biocompatible reactions include a copper(I)-catalyzed azide-alkyne cycloaddition (CuAAC) reaction, a copper-free strain-promoted azide-alkyne cycloaddition (SPAAC) reaction, and a thiol-ene reaction.
  • each arm of the 5'-5' linked oligonucleotide may be connected indirectly via a binding or scaffolding molecule.
  • indirect means for linking the arms of such an oligo may include use of binding or scaffolding molecules such as, but not limited to polymers, such as polyethylene glycol (PEG) and other polyethers.
  • spacers may be employed, for example, to reduce steric hindrance between individual arms.
  • the spacer may be an alkyne or an azide spacer.
  • a spacer may be joined to an oligo of the invention using direct or indirect means, including, but not limited to polymers, such as polyethylene glycol (PEG) and other polyethers.
  • PEG: polyethylene glycol
  • a spacer may be 8 to 1000 nucleotides in length.
  • a spacer may be 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, or the like.
  • a "nucleic acid" is a polymer of nucleotides.
  • a nucleic acid is "single-stranded" if nucleotides that form the nucleic acid are unpaired. That is, nucleotides of a single-stranded nucleic acid are not base-paired (via Watson-Crick base pairs, e.g., guanine-cytosine and adenine-thymine/uracil) to nucleotides of another nucleic acid.
  • a single-stranded nucleic acid may be contrasted with a double-stranded (paired) nucleic acid, a typical example of which is a DNA double helix.
  • Single-stranded nucleic acids may include a contiguous (uninterrupted) sequence of nucleotides or, in some embodiments, a single-stranded nucleic acid may be a conjugate that includes two nucleic acid strands joined together, for example, through a chemical (covalent) linkage.
  • a single strand of a nucleic acid (e.g., DNA or RNA) has a 5' end (five-prime end) and a 3' end (three-prime end).
  • the 5' end typically contains a phosphate group attached to the 5' carbon of the ribose ring of a nucleotide, and the 3' end typically is unmodified from the ribose -OH substituent.
  • Nucleic acids are synthesized in vivo in the 5' to 3' direction. Polymerase relies on the energy produced by breaking nucleoside triphosphate bonds to attach new nucleoside monophosphates to the 3'-hydroxyl (-OH) group via a phosphodiester bond.
  • An engineered single-stranded nucleic acid of the present disclosure has two 3' ends (a chain oligo). Each terminus of the single-stranded nucleic acid includes a 3'-hydroxyl (-OH) group.
  • a single-stranded chain oligo is formed by joining (linking) the 5' end of one single-stranded nucleic acid to the 5' end of another single-stranded nucleic acid.
  • in some embodiments, the linkage between the two 5' ends is a covalent linkage. In other embodiments, the linkage is non-covalent.
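The 5'-5' joining described above can be pictured with a small data model. The following is a minimal sketch, not the disclosed construct: the class name `ChainOligo`, its fields, the example arm sequences, and the linker symbol are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChainOligo:
    """Two single-stranded arms joined at their 5' ends (hypothetical model)."""

    arm_a: str  # written 5'->3'
    arm_b: str  # written 5'->3'
    linkage: str = "covalent"  # the text also allows non-covalent linkages

    def three_prime_termini(self):
        """Return the 3'-terminal nucleotide of each arm."""
        return self.arm_a[-1], self.arm_b[-1]

    def end_to_end(self, linker_symbol="-5'5'-"):
        """Lay the molecule out from one 3' end to the other.

        arm_a is reversed so that it reads 3'->5' up to the linkage,
        after which arm_b reads 5'->3'.
        """
        return self.arm_a[::-1] + linker_symbol + self.arm_b


oligo = ChainOligo(arm_a="ACGTTG", arm_b="GGATCA")
print(oligo.three_prime_termini())
print(oligo.end_to_end())
```

Storing each arm in its own 5'->3' orientation keeps both 3' termini at the ends of the strings, which mirrors the statement that each terminus of the chain oligo carries a 3'-hydroxyl group.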
  • Each arm of a chain-oligo may comprise a hybridization domain.
  • a “domain” refers to a discrete, contiguous sequence of nucleotides or nucleotide base pairs, depending on whether the domain is unpaired (single-stranded nucleotides) or paired (double-stranded nucleotide base pairs), respectively.
  • a hybridization domain facilitates binding of the chain-oligo to a complementary sequence on a target oligonucleotide, i.e., the target sequence.
  • a domain is "complementary to" a target sequence if the domain contains nucleotides that base pair (hybridize/bind through Watson-Crick nucleotide base pairing) with nucleotides of the target sequence such that a paired (double-stranded) or partially-paired molecular species/structure is formed.
  • Complementary domains need not be perfectly (100%) complementary to form a paired structure, although perfect complementarity is provided, in some embodiments.
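The notion of perfect versus partial complementarity can be illustrated numerically. The sketch below assumes DNA sequences of equal length written 5'->3' and scores only Watson-Crick pairs; the function names and example sequences are hypothetical, and no threshold for forming a stable paired structure is implied by the source.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def fraction_complementary(domain, target):
    """Fraction of positions at which `domain` pairs with `target`.

    Both sequences are written 5'->3' and must have equal length. Because
    paired strands are antiparallel, the domain is compared against the
    reverse complement of the target.
    """
    if len(domain) != len(target):
        raise ValueError("sequences must have equal length")
    expected = reverse_complement(target)
    matches = sum(d == e for d, e in zip(domain, expected))
    return matches / len(domain)


target = "ATGCCGTTAC"
perfect_domain = reverse_complement(target)
mismatched_domain = perfect_domain[:-1] + "A"  # one mismatch at the 3' end

print(fraction_complementary(perfect_domain, target))
print(fraction_complementary(mismatched_domain, target))
```

Gaps, wobble pairs, and RNA (uracil) are deliberately left out to keep the sketch short.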
  • the length of a hybridization domain may vary. In some embodiments, a hybridization domain may have a length of 5-50 nucleotides.
  • an anchor domain may have a length of 5-45, 5-40, 5-35, 5-30, 5-25, 5-20, 5-15, 5-10, 10-50, 10-45, 10-40, 10-35, 10-30, 10-25, 10-20, 10-15, 15-50, 15-45, 15-40, 15-35, 15-30, 15-25, 15-20, 20-50, 20-45, 20-40, 20-35, 20-30, 20-25, 25-40, 25-35, 25-30, 30-40, 30-35, or 35-40 nucleotides.
  • a hybridization domain may have a length of 5, 10, 15, 20, 25, 30, 35, 40, 45, or 50 nucleotides.
  • a hybridization domain may have a length of 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 nucleotides.
  • a hybridization domain, in some embodiments, may be longer than 50 nucleotides or shorter than 5 nucleotides.
  • one or more chain-oligo arms may further comprise a primer domain.
  • a primer domain is a domain to which a primer binds.
  • a primer is a strand of short nucleotide sequence that serves as a starting point for nucleic acid (e.g., DNA) synthesis.
  • chain oligos may comprise a pair of internal primer domains (e.g., near the linked 5' ends), which may be used for amplification of sequence-ready barcoded constructs produced using the methods of the present disclosure. The length of a primer domain may vary.
  • a primer domain may have a length of 5-50 nucleotides, for example, a length of 5-45, 5-40, 5-35, 5-30, 5-25, 5-20, 5-15, 5-10, 10-50, 10-45, 10-40, 10-35, 10-30, 10-25, 10-20, 10-15, 15-50, 15-45, 15-40, 15-35, 15-30, 15-25, 15-20, 20-50, 20-45, 20-40, 20-35, 20-30, 20-25, 25-40, 25-35, 25-30, 30-40, 30-35, or 35-40 nucleotides.
  • a primer domain has a length of 5, 10, 15, 20, 25, 30, 35, 40, 45, or 50 nucleotides.
  • a primer domain has a length of 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 nucleotides.
  • a primer domain, in some embodiments, may be longer than 50 nucleotides or shorter than 5 nucleotides.
  • one or more chain-oligo arms may further comprise a sequencing adapter.
  • a sequencing adapter is a nucleotide sequence that facilitates binding of oligonucleotide sequences generated using the methods disclosed herein to complementary sequences used in certain next-generation sequencing technologies.
  • one or more chain-oligo arms may further comprise a barcode.
  • a barcode is a short sequence of nucleotides used as an identifier for an associated molecule, such as a target molecule and/or target nucleic acid.
  • a nucleic acid barcode, or unique molecular identifier (UMI), can have a length of at least, for example, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, or 100 nucleotides, and can be in single- or double-stranded form.
  • Chain-oligos can be labeled with multiple nucleic acid barcodes in combinatorial fashion, such as a nucleic acid barcode concatemer.
  • a nucleic acid barcode is used to identify a nucleic acid as being from a particular compartment (for example a discrete volume), having a particular physical property (for example, affinity, length, sequence, etc.), or having been subject to certain treatment conditions.
  • Chain-seq oligos can be associated with multiple nucleic acid barcodes to provide information about all of these features (and more).
  • Each member of a given population of UMIs is typically associated with individual members of a particular set of identical, specific (for example, discrete volume-, physical property-, or treatment condition-specific) nucleic acid barcodes.
  • each member of a set of origin-specific nucleic acid barcodes, having identical or matched barcode sequences may be associated with (for example, covalently bound to or a component of the same molecule as) a distinct or different UMI.
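The relationship between origin-specific barcodes and UMIs can be sketched as a simple counting step. This is an illustrative assumption about how such labels might be used downstream, not a procedure stated in the source: the function name, the read tuples, and the example sequences are hypothetical.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs observed for each origin-specific barcode.

    `reads` is an iterable of (barcode, umi) pairs. Repeated pairs are
    treated as copies of one original molecule and counted once.
    """
    umis_by_barcode = defaultdict(set)
    for barcode, umi in reads:
        umis_by_barcode[barcode].add(umi)
    return {barcode: len(umis) for barcode, umis in umis_by_barcode.items()}


reads = [
    ("AACCGG", "TTAG"),
    ("AACCGG", "TTAG"),  # same barcode and UMI as the read above
    ("AACCGG", "CGAT"),
    ("GGTTAA", "TTAG"),  # same UMI, different origin-specific barcode
]
counts = count_molecules(reads)
print(counts)
```

Keying on the barcode first and the UMI second reflects the text: molecules sharing an origin-specific barcode are distinguished from one another by their distinct UMIs.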
  • a method of the invention may involve the use of a binding tag.
  • a nucleic acid as described herein may be labeled with an affinity tag, for example an affinity pull-down functional group, on the first arm, or the second arm, or both.
  • an affinity tag may be used to isolate a biomolecule of interest, for example an amplified nucleic acid, such as an amplified segment of a template DNA molecule or fragment or portion thereof.
  • an amplified nucleic acid may contain one or more adaptor molecules as described herein, which may serve as a means for isolation of the nucleic acid.
  • a chain oligo may have more than two arms, and thus an affinity tag may be present on a chain oligo on only a single arm, on multiple arms, or on all arms of the chain oligo.
  • an affinity tag may be used to isolate a biomolecule of interest, such as a nucleic acid, polynucleotide, protein, or the like. Affinity tags attached as described herein may be removed by chemical or enzymatic means.
  • One of skill in the art will be able to identify appropriate tagmentation methods and means for an affinity tag in accordance with the invention.
  • affinity tags include an enzymatic modification, such as biotin or desthiobiotin; a fluorescent tag, such as green fluorescent protein (GFP); and a solubilization tag, such as thioredoxin, maltose binding protein, glutathione-S-transferase, or poly(NANP).
  • any binding tag appropriate for the specific application may be used to isolate or separate a biomolecule of interest as described herein.
  • some embodiments of the invention involve the use of chain oligos as described herein to link together two RNA molecules.
  • a chain oligo having two arms may be bound to a first and a second RNA molecule such that the RNA molecules are linked together to form a single long RNA molecule, wherein the chain oligo is located between the first and second RNA molecules, as shown, for example, in FIG. 1 .
  • Reverse transcription may then be performed to produce cDNA of both the first and the second RNA molecules, wherein both 3' ends of the chain oligo serve as primer molecules for first-strand cDNA synthesis according to methods known in the art.
  • the newly produced cDNA may be dissociated from the template RNA molecules, and a second chain oligo, distinct from the first chain oligo, may hybridize to the 3' ends of the cDNA. Second-strand synthesis as known in the art may then be performed in order to produce a double-stranded DNA copy of the starting RNA molecule.
  • second strand cDNA may be synthesized using additional conventional mRNA specific primers, or using a common template switching adapter and a primer priming a sequence in the template switching adapter.
  • two or more chain oligos may be used to link together two or more RNA molecules.
  • a first chain oligo may be hybridized to a first RNA molecule, wherein the first 3' end of the first chain oligo hybridizes to the first RNA molecule and the second 3' end of the first chain oligo hybridizes to a second RNA molecule;
  • a second chain oligo may be hybridized to a second RNA molecule, wherein the first 3' end of the second chain oligo hybridizes to the second RNA molecule and the second 3' end of the second chain oligo hybridizes to a third RNA molecule;
  • a third chain oligo may be hybridized to a third RNA molecule, wherein the first 3' end of the third chain oligo hybridizes to the third RNA molecule and the second 3' end of the third chain oligo hybridizes to the first RNA molecule, such that a circular nucleic acid molecule is formed by the hybridization of the first
  • the nucleic acid molecules linked by the 5'-5' linked oligonucleotides may be mRNA molecules encoding different transcripts.
  • the nucleic acid molecules may encode immunoglobulin heavy and light chains, or T cell receptor α and T cell receptor β.
  • the nucleic acid molecules linked by the 5'-5' linked oligonucleotides may be DNA molecules, for example, genomic DNAs harboring different mutations or polymorphisms.
  • the nucleic acid molecules may be isolated from different cells, for example, immune cells including T cells, B cells, dendritic cells, macrophages, neutrophils, mast cells, eosinophils, basophils, and natural killer cells.
  • the nucleic acid molecules may encode any cellular receptors or lectins.
  • the 5'-5' linked oligonucleotides may be used in linking two ends of a nucleic acid molecule and downstream amplification and sequencing procedures.
  • the nucleic acid molecule may be a DNA or RNA.
  • the nucleic acid molecule may be at least 1 kb, at least 2 kb, at least 3 kb, at least 5 kb, at least 6 kb, at least 7 kb, at least 8 kb, at least 9 kb, or at least 10 kb in length.
  • the use of the 5'-5' linked oligonucleotides in sequencing may be applied to archiving an immune repertoire by capturing all sequences in a Crab-Seq library derived from a human sample, such as a blood or urine sample.
  • the 5'-5' linked oligonucleotides may be used to characterize cell type specific mRNAs and to identify or profile any cell types.
  • Adaptor molecules may be used, in some embodiments, to add one or more elements to target oligonucleotides.
  • the adapter may be used to add a universal sequence complementary to the hybridization domain on a set of chain-oligos.
  • Adapters may also be used to add primer binding sites. Whether an adaptor is single-stranded, double-stranded, or partially double-stranded (partially single-stranded) depends on the target molecule to which the adaptor is being added.
  • an adaptor molecule as described herein may be a forward sequencing adaptor sequence or a reverse sequencing adaptor sequence.
  • a homopolymer domain is simply a contiguous stretch of the same nucleotides, such as for example, GGGG.
  • a homopolymer may comprise adenines, guanines, cytosines or thymines (or variants thereof). It should be understood that the homopolymer domain is used to join the 3' ends of the extended whip molecule to each other to permit polymerization to form a circular, double-stranded molecule. Other means of joining the two 3' ends are encompassed by the present disclosure. Thus, the homopolymer domains may be substituted with other complementary nucleotide domains, for example.
  • an adaptor molecule may be added to the outside of a template DNA molecule, referred to herein as an "outside adaptor.”
  • Other embodiments of the invention utilize an internal adaptor, which is described in more detail below.
  • An adaptor molecule may provide a binding site for a chain oligo to hybridize to and/or provide a primer binding site for amplification or sequencing purposes.
  • the invention may provide for use of a number of adaptor molecules.
  • the methods described herein may use a single outside adaptor and a single internal adaptor.
  • a method of the invention may use multiple adaptor molecules, as appropriate for the particular application.
  • the present invention may provide in some embodiments an adaptor molecule containing one or more nucleic acid segments that may serve as a site location into which a nucleic acid segment is inserted or added.
  • a segment or sequence may be introduced into an adaptor molecule through techniques known in the art, or it may be a naturally occurring sequence.
  • an adaptor molecule as described herein may be engineered or produced to contain a particular splice site, recombination or crossover site, or "hot spot," such as a Chi site or Chi sequence. This may serve as a tag for enzymatic removal of the adaptor molecule prior to sequencing of the template DNA, such as removal of a particular segment of nucleic acid by the RecBCD enzyme.
  • other sequences or sites that may stimulate or result in double-stranded DNA breakage and are useful for removal of an adaptor molecule or other nucleic acid segment at a specific location are encompassed by the present invention.
  • an adaptor molecule may be any size appropriate for the particular use.
  • an adaptor as described herein may be from about 6 nucleotides in length to about 5000 nucleotides in length.
  • an adaptor may be 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 2000, 3000, 4000, 5000, or the like.
  • more than one Chi site may be employed, such as 2 Chi sites, 3 Chi sites, or more.
  • a binding site for a restriction endonuclease may be incorporated into an adaptor molecule for addition to a template DNA as described herein.
  • a restriction endonuclease may be employed in order to cut the template DNA molecule at a desired location.
  • a restriction endonuclease useful in accordance with the invention may be a type IIS restriction endonuclease.
  • One of skill in the art will recognize particular restriction endonucleases that may be useful for the invention, for example, SapI.
  • any restriction endonuclease that recognizes a particular sequence and will cut DNA as required for a particular application is appropriate for use with the invention, on the condition that the recognition site for the particular restriction endonuclease is added to the template DNA molecule.
  • a restriction endonuclease may be employed to remove an adaptor molecule from the template DNA molecule.
  • the template DNA molecule may be in circular form or linear form when the adaptor molecule is removed using a restriction endonuclease.
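Locating a type IIS recognition site in an adaptor or template can be sketched as a motif search on both strands. The example below uses the SapI recognition sequence GCTCTTC; the function names and the template sequence are hypothetical, and the sketch reports only where the recognition site lies, without modeling the offset cut positions.

```python
SAPI_SITE = "GCTCTTC"  # SapI recognition sequence, written 5'->3'
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq, site=SAPI_SITE):
    """Return sorted (position, strand) pairs for each occurrence of `site`.

    Positions are 0-based indices on the top strand. "+" means the site
    reads 5'->3' on the top strand; "-" means it lies on the bottom strand,
    so its reverse complement appears in `seq`. A palindromic site would be
    reported on both strands; the SapI site is not palindromic.
    """
    hits = []
    for strand, motif in (("+", site), ("-", reverse_complement(site))):
        start = seq.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(motif, start + 1)
    return sorted(hits)


template = "AAGCTCTTCTTTTGAAGAGCAA"  # hypothetical sequence with two sites
print(find_sites(template))
```

Because a type IIS enzyme cuts outside its recognition sequence, the strand of each hit matters: it determines on which side of the site the cut falls.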
  • the 5'-5' linked oligonucleotides described herein may be used to isolate, amplify and/or covalently link a number of nucleic acid fragments for a number of different applications, including but not limited to, de novo genome assembly, genomic deletion and insertion detection, genomic repeat detection, synthetic biology construct verification, T-cell receptor profiling, B-cell receptor profiling, T-cell receptor cloning, multiplex RNA or DNA molecule combination detection, such as multiple mutation combination detection and profiling in single cells, such as cancer cells, and cloning of multiplex DNA molecules, as described in further detail below.
  • a “barcoded nucleic acid” is a nucleic acid, typically single-stranded, that includes a barcode domain.
  • a “barcode domain” is a domain that includes a nucleotide sequence that can be used to identify the barcoded nucleic acid or to identify one or more biomolecules to which the barcoded nucleic acid is directly or indirectly linked.
  • a barcoded nucleic acid may include a barcode domain that is unique to that single nucleic acid (among a population of barcoded nucleic acids, the barcode is specific to that one nucleic acid) or a barcode domain that is unique to a subpopulation of nucleic acids (among multiple populations of barcoded nucleic acids, the barcode is specific to a single subpopulation of barcoded nucleic acids).
  • the length of a barcode domain may vary.
  • a barcode domain may have a length of 5-45, 5-40, 5-35, 5-30, 5-25, 5-20, 5-15, 5-10, 10-50, 10-45, 10-40, 10-35, 10-30, 10-25, 10-20, 10-15, 15-50, 15-45, 15-40, 15-35, 15-30, 15-25, 15-20, 20-50, 20-45, 20-40, 20-35, 20-30, 20-25, 25-40, 25-35, 25-30, 30-40, 30-35, or 35-40 nucleotides.
  • a barcode domain may have a length of 5, 10, 15, 20, 25, 30, 35, 40, 45, or 50 nucleotides.
  • a barcode domain may have a length of 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 nucleotides.
  • a barcode domain, in some embodiments, may be longer than 50 nucleotides or shorter than 5 nucleotides.
  • the length of a barcoded nucleic acid itself may vary.
  • the length of a barcoded nucleic acid may be 20-1000 nucleotides, for example a length of 20-900, 20-800, 20-700, 20-600, 20-500, 20-400, 20-300, 20-200, 20-100, 20-50, or 20-25 nucleotides.
  • a barcoded nucleic acid has a length of 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 225, 250, 275, 300, 350, 400, 450, or 500 nucleotides.
  • a barcoded nucleic acid may be longer than 1000 nucleotides.
  • a barcoded nucleic acid may further include a primer domain and/or an anchor domain that are complementary to one of the anchor domains of a chain oligo.
  • a barcoded nucleic acid may be linked to or serve as a means to identify any biomolecule, as discussed below.
  • the 3' end of a barcoded nucleic acid, in this example, may include an anchor domain that is complementary to one 3' end of a chain oligo such that the two anchor domains bind to each other to form a paired domain.
  • Anchor domains may be single-stranded, double-stranded, or partially double-stranded (containing a single-stranded and double-stranded nucleic acid).
  • Barcoded nucleic acids may include a single-stranded anchor domain.
  • An anchor domain may be added to, or may be a component of, a barcoded nucleic acid or a target biomolecule of interest.
  • Anchor domains are used for identifying or localizing a target biomolecule(s) of interest.
  • one of the biomolecules contains an anchor domain complementary to one of the anchor domains of a chain oligo
  • the other biomolecule contains an anchor domain complementary to the other of the anchor domains of a whip molecule.
  • a chain oligo is used in combination with a barcoded nucleic acid linked to a target biomolecule
  • typically the barcoded nucleic acid contains an anchor domain complementary to one anchor domain of the chain oligo
  • another biomolecule contains an anchor domain complementary to the other anchor domain of the chain oligo.
  • the methods disclosed herein can be used to link smaller oligonucleotide fragments originating from a single larger oligonucleotide.
  • individual oligonucleotides may first be isolated in individual discrete volumes prior to fragmentation. Fragmentation of oligonucleotides in individual discrete volumes may be accomplished using methods known in the art. In certain example embodiments, fragmentation of oligonucleotides is accomplished using tagmentation.
  • an "individual discrete volume” is a discrete volume or discrete space, such as a container, receptacle, or other arbitrary defined volume or space that can be defined by properties that prevent and/or inhibit migration of nucleic acids and reagents necessary to carry out the methods disclosed herein, for example a volume or space defined by physical properties such as walls, for example the walls of a well, tube, or a surface of a droplet, which may be impermeable or semipermeable, or as defined by other means such as chemical, diffusion rate limited, electro-magnetic, or light illumination, or any combination thereof that can contain a target molecule and an indexable nucleic acid identifier (for example nucleic acid barcode).
  • diffusion rate limited for example diffusion defined volumes
  • chemical defined volume or space
  • an electro-magnetically defined volume or space is a space where the electro-magnetic properties of the target molecules or their supports, such as charge or magnetic properties, can be used to define certain regions in a space, such as by capturing magnetic particles within a magnetic field or directly on magnets.
  • an optically defined volume is any region of space that may be defined by illuminating it with visible, ultraviolet, infrared, or other wavelengths of light such that only target molecules within the defined space or volume may be labeled.
  • reagents such as buffers, chemical activators, or other agents may be passed in or through the discrete volume, while other material, such as target molecules, may be maintained in the discrete volume or space.
  • a discrete volume will include a fluid medium, (for example, an aqueous solution, an oil, a buffer, and/or a media capable of supporting cell growth) suitable for labeling of the target molecule with the indexable nucleic acid identifier under conditions that permit labeling.
  • Exemplary discrete volumes or spaces useful in the disclosed methods include droplets (for example, microfluidic droplets and/or emulsion droplets), hydrogel beads or other polymer structures (for example poly-ethylene glycol di-acrylate beads or agarose beads), tissue slides (for example, formalin-fixed paraffin-embedded tissue slides with particular regions, volumes, or spaces defined by chemical, optical, or physical means), microscope slides with regions defined by depositing reagents in ordered arrays or random patterns, tubes (such as centrifuge tubes, microcentrifuge tubes, test tubes, cuvettes, conical tubes, and the like), bottles (such as glass bottles, plastic bottles, ceramic bottles, Erlenmeyer flasks, scintillation vials, and the like), wells (such as wells in a plate), plates, pipettes, or pipette tips, among others.
  • the compartment is an aqueous droplet in a water-in-oil emulsion.
  • Said droplets may be formed using microfluidic devices according to known techniques in the art. Other methods for generating droplets as described herein may be used as appropriate, including, but not limited to, high speed vortex, ultrasonic waves, extrusion, filtering, microsieve chips, or the like. Individual oligonucleotides may be loaded into separate droplets according to known methods in the art.
  • the disclosure provides a nucleic acid assembly method based on the technology described herein, wherein individual discrete containers or droplets are not required. This would enable a "one-pot" approach for performing the reactions described herein.
  • one or more chain oligos as described herein may be used to capture via PCR, two distinct, distal segments of DNA by having each arm of the chain oligo specific to a different target DNA, such that after PCR, they would then be joined by the chain oligo. This would allow linking of any number of nucleic acid sequences in "daisy chain” fashion, wherein multiple DNA segments may be held together or joined by chain oligos at each intersection.
  • a transposase may then be used to excise the intervening chain oligo, which would also contain recognition sequences for the transposase.
  • the resulting products would be "scarless" assemblies of the DNA segments targeted by the chain oligos. In some embodiments, this could be done in a "one-pot” assembly. In other embodiments, such methods may be performed either with custom unique barcodes on adapters for each fragment, or by creating primers to match "natural" or native sequences of the particular nucleic acid.
  • each chain oligo may be specifically designed to a particular genomic sequence. Alternatively, generic adaptors may also be used.
  • Such methods may be useful for any application, including, but not limited to, synthetic biology methods, Golden Gate technologies, assembly of genes, assembly of entire genomes, cloning, or plasmid assembly.
  • One particular advantage of such methods is that it is only necessary to perform several short PCRs, rather than long-range PCR, thereby eliminating the possibility of amplification error.
  • One of skill in the art will be able to identify useful and appropriate applications in accordance with the invention.
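The "daisy chain" idea, in which each chain oligo joins two segments, can be sketched as recovering the segment order from the list of pairwise links. The example assumes one chain oligo per junction and a single linear assembly; the function name and the segment labels are hypothetical, and transposase excision is not modeled.

```python
from collections import defaultdict


def daisy_chain_order(links):
    """Order segments joined pairwise by chain oligos into one linear chain.

    `links` is a list of (segment, segment) pairs, one pair per chain oligo.
    Raises ValueError if the links do not describe a single linear chain
    (for example, if they close into a circle or form separate pieces).
    """
    neighbors = defaultdict(list)
    for left, right in links:
        neighbors[left].append(right)
        neighbors[right].append(left)

    # A linear chain has exactly two segments with a single partner.
    ends = sorted(seg for seg, adj in neighbors.items() if len(adj) == 1)
    if len(ends) != 2 or any(len(adj) > 2 for adj in neighbors.values()):
        raise ValueError("links do not form a single linear chain")

    order, previous, current = [ends[0]], None, ends[0]
    while True:
        next_segments = [seg for seg in neighbors[current] if seg != previous]
        if not next_segments:
            break
        previous, current = current, next_segments[0]
        order.append(current)

    if len(order) != len(neighbors):
        raise ValueError("links do not form a single linear chain")
    return order


links = [("seg2", "seg3"), ("seg1", "seg2"), ("seg3", "seg4")]
print(daisy_chain_order(links))
```

The walk starts from the alphabetically first end segment, so the same set of links always yields the same orientation of the chain.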
  • the individual discrete volume is a section of a thin capillary tube.
  • alternative emulsion containers may be used, such as nano-particle hollow containers that act as capillaries (see, for example, Wang et al., "Synthesis, Properties, and Applications of Hollow Micro-/Nanostructures," Chemical Reviews 116(18): 10983-11060, 2016 ).
  • individual template DNA molecules may be encapsulated into droplets, and transposomes may be prepared to contain an internal adaptor.
  • the transposomes may then be used to insert the internal adaptor into the encapsulated template DNA by incubation.
  • the reactions may be incubated at 37°C for 2 hr, or at 55°C for 15 min. Other incubation times and/or temperatures may also be used in accordance with the invention as appropriate.
  • the droplets may then be incubated at 95°C for 10 minutes, to denature the transposase and break the interaction between transposase and the template DNA ( FIG. 4 , steps 3 and 4).
  • the reaction buffer may be optimized to be compatible with transposome insertion and/or PCR.
  • various buffers, polymerases, conditions for transposome insertion and denaturing, and PCR amplification protocols may be used, as would be recognized by one of skill in the art.
  • a buffer may be optimized in order to increase the efficiency of transposome insertion.
  • the buffer components may also be adjusted using different polymerases and PCR protocols to identify conditions that produce reliable chain-PCR products.
  • a transposome may be synthesized or prepared to comprise an internal adaptor.
  • An internal adaptor may be any size appropriate for use with the invention, and as appropriate for the particular application.
  • an internal adaptor may be 10-100 bp, including 10 bp, 15 bp, 20 bp, 25 bp, 30 bp, 35 bp, 40 bp, 45 bp, 50 bp, 55 bp, 60 bp, 65 bp, 70 bp, 75 bp, 80 bp, 85 bp, 90 bp, 95 bp, 100 bp, or the like.
  • An internal adaptor may also include a mosaic end. Similar to the internal adaptor, a mosaic end may be any size as described above.
  • Such elements may be artificial sequence or may be naturally occurring.
  • internal adaptors smaller or larger than described herein may be appropriate for use in accordance with the invention.
  • a mosaic end as described herein may be specifically recognized by a transposase, such as a Tn5 transposase, and may be included in order to provide additional length or sequence to the internal adaptor or for use in further embodiments of the invention, such as PCR.
  • transposomes may be used for insertion of an internal adaptor or a mosaic end, or both, into a template DNA molecule.
  • the transposome is part of the reaction buffer for PCR in an emulsion droplet as described herein. In other embodiments, the methods described herein take place without breaking or disruption of the emulsion droplets.
  • transposases may be denatured and dissociated from DNA after addition of the internal adaptor or mosaic end, or both, into the template DNA.
  • chain-PCR may be performed subsequent to transposome activity in order to link and amplify the outside adaptor-ligated ends of the template DNA.
  • the reaction buffer may be optimized to be compatible with both transposome insertion and subsequent PCR.
  • buffers, polymerases, conditions for transposome insertion and denaturing, and PCR amplification protocols may be appropriate and useful as described herein.
  • chain-PCR may then be performed to link and amplify the outside adaptor-ligated ends of the template DNA.
  • suppression chain-PCR, using internal adaptor (IA) oligos and a limited amount of chain-oligos, may be used in order to minimize IA-IA product ( FIG. 4 , step 5).
  • one or more IA-spacer-IA segments may be inserted into the encapsulated template DNA.
  • DNA amplified from internal DNA regions (“internal PCR products”) may have an IA at both ends
  • DNA amplified from the end regions of the template DNA (“end PCR products”) may have an IA at one end and an outside adaptor (OA) at the other end.
  • the IA used was 34 bp
  • the OA had no complementary sequence with the IA.
  • "internal PCR products” had 34 bp of IA sequence at both ends.
  • the ends of single-stranded DNA from "internal PCR products” may anneal with each other and prevent the binding of IA oligos.
  • the production of "end PCR products” may be favored, and the production of "internal PCR products” may be minimized.
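The suppression effect described above rests on the two ends of a single strand being able to pair with each other when the same adaptor flanks both sides. The sketch below checks that property for a strand flanked by IA on both sides versus IA on one side and OA on the other. The adaptor and insert sequences are hypothetical placeholders (the IA is given 34 nucleotides to match the length stated in the text), and annealing is reduced to an exact reverse-complement test.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def single_strand(five_prime_adaptor, insert, three_prime_adaptor):
    """One strand (5'->3') of a PCR product flanked by two adaptors.

    Each adaptor is given in its own 5'->3' orientation; the adaptor at the
    3' end appears on this strand as its reverse complement.
    """
    return five_prime_adaptor + insert + reverse_complement(three_prime_adaptor)


def ends_can_anneal(strand, length):
    """True if the last `length` bases pair exactly with the first `length`."""
    return strand[-length:] == reverse_complement(strand[:length])


# Hypothetical sequences, for illustration only.
IA = "ACGTTGCA" * 4 + "AG"  # 34 nt internal adaptor placeholder
OA = "TTGACCTA" * 4 + "GC"  # outside adaptor placeholder, unrelated to IA
INSERT = "GATTACAGATTACA"

internal_product = single_strand(IA, INSERT, IA)
end_product = single_strand(IA, INSERT, OA)

print(ends_can_anneal(internal_product, len(IA)))
print(ends_can_anneal(end_product, len(IA)))
```

Under these assumptions only the strand flanked by IA on both sides has self-complementary termini, which is consistent with the text's statement that annealed ends may prevent the binding of IA oligos on internal products.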
  • using a limited amount of chain-oligos may maximize the yield of two-ends chain-PCR products, and minimize free chain-oligos and one-end products.
  • a limited amount of chain-oligos may be used when making emulsion droplets.
  • Ends PCR products may have two arms, in which each arm represents one end of the template DNA, with the chain-oligo in between the two ends.
  • the ends of the "ends PCR products” may be the IAs, which were inserted into the template DNA molecule.
  • emulsion droplets may then be broken according to the emulsion droplet oil phase components ( FIG. 4 , step 6). DNA may then be purified accordingly using a compatible silica-based column or DNA precipitation.
  • end PCR products contain incorporated chain-oligos, which may have been engineered to carry a desthiobiotin label.
  • Streptavidin beads for example Dynabeads M-280 Streptavidin from Thermo Fisher Scientific, were used to pull down “ends PCR products," while the “internal PCR products” were washed away.
  • the "ends PCR products” were then eluted using biotin, which has a higher affinity for streptavidin than desthiobiotin does ( FIG. 4 , step 7).
  • T7 DNA PCR was used to check the size distribution of products, shown in FIG. 7 .
  • the ends of the template DNA which could be any combination of the head end and the tail end, were now physically closer to each other on the DNA sequence ( FIG. 4 , step 10).
  • Primers targeting the ends of T7 DNA were used to perform PCR.
  • Intact T7 DNA was used as a control template, which produced a 40-kb linear DNA amplicon.
  • Successful chain-seq produced a smear of DNA products ranging from approximately 150 bp to several kb.
  • ExoV was used to digest uncircularized DNA, and PCR was performed to amplify the products. Primers were used that annealed to the OA to amplify the end-joined products. The chain-OA was then removed using SapI digestion and DNA purification.
  • FIG. 5 provides an alternative embodiment wherein sequencing adaptors may be directly incorporated into the chain oligos. Such an alternative provides a means to streamline the process for sequencing purposes.
  • the first strategy is to make a traditional jumping library, which relies on circularization of the template DNA. Constructs with highly repetitive or similar elements require this step in order to resolve the order and orientation of sequences. It involves either the circularization of long 3-8 kb molecules or cloning-based libraries that use phage packaging to span 40 kb, a laborious and expensive process. The cost of sequencing has dropped dramatically during the past few years, but the cost of constructing jumping libraries has not. In addition, because the yield is low, a large amount of starting material is required. Furthermore, the efficiency of circularization, the key step in building a jumping library, is very low for DNA larger than 5 kb.
  • the second strategy is to use long-read next-generation sequencing technologies, such as those from PacBio and Oxford Nanopore, but both platforms have much higher error rates than Illumina sequencing.
  • Illumina's TruSeq Synthetic Long-Read DNA Library Prep Kit for Genome Assembly is claimed to be capable of sequencing DNA of up to 10 kb, but it relies on DNA barcoding and proprietary informatics.
  • the present invention provides a more efficient method that is capable of replacing the jumping library, PacBio, and Oxford Nanopore methodologies.
  • any two or more nucleic acid molecules may be joined together using the chain oligos and methods described herein.
  • one application of chain oligos is the screening of a biological sample for a desired molecule. Examples of this include T-cell and B-cell receptor profiling, in which one or more chain oligos may be used to screen a population of nucleic acids to identify a particular desired cell type or clonotype.
  • a chain oligo as described herein may be designed to screen a population of RNA molecules to identify one or more RNA molecules of interest.
  • a chain oligo may be designed to recognize and hybridize to such RNA sequence(s).
  • more than one particular RNA molecule or sequence may be captured at once.
  • a chain oligo may be designed to recognize two distinct RNA molecules. Such an oligo may have two arms, wherein one arm has a recognition sequence for the first RNA molecule, and the second arm has a recognition sequence for the second RNA molecule.
  • a biological sample may be obtained from a particular individual for which screening is necessary.
  • a chain oligo may be designed with a sequence on one arm to recognize a particular RNA molecule, and a sequence on the other arm that recognizes a second, distinct RNA molecule. Each arm of the chain oligo therefore hybridizes to a separate RNA molecule. Reverse transcription may then be performed using appropriate reaction conditions and reagents in order to produce cDNA of both the first and the second RNA molecules.
  • both 3' ends of the chain oligo may serve as primer molecules for first-strand cDNA synthesis according to methods known in the art.
  • Second strand synthesis may then be performed using a second, distinct chain oligo.
  • the newly produced cDNA may be dissociated from the template RNA molecules, and the second, distinct chain oligo, rather than the first chain oligo, may hybridize to the 3' ends of the cDNA. Second strand synthesis as known in the art may then be performed in order to produce a double-stranded DNA copy of the starting RNA molecule.
  • an appropriate reaction buffer may be optimized by one of skill in the art.
  • various buffers, polymerases, and amplification protocols may be used, as would be recognized by one of skill in the art.
  • 15-20 cycles of PCR amplification may be used in order to produce a desired amount of cDNA.
  • One of skill in the art will recognize that other conditions may be used in accordance with the invention.
  • emulsion droplets may then be broken following PCR, and cDNA may then be purified accordingly using a compatible silica-based column or DNA precipitation.
  • chain-oligos may be engineered to carry a desthiobiotin or other appropriate label.
  • Streptavidin beads, for example Dynabeads M-280 Streptavidin from Thermo Fisher Scientific, may be used to pull down desired cDNA. The pulled-down products may then be eluted using, for example, biotin, and eluted cDNA may be circularized for sequencing data analysis.
  • a first chain oligo may be hybridized to a first RNA molecule, wherein the first 3' end of the first chain oligo hybridizes to the first RNA molecule and the second 3' end of the first chain oligo hybridizes to a second RNA molecule;
  • a second chain oligo may be hybridized to a second RNA molecule, wherein the first 3' end of the second chain oligo hybridizes to the second RNA molecule and the second 3' end of the second chain oligo hybridizes to a third RNA molecule;
  • a third chain oligo may be hybridized to a third RNA molecule, wherein the first 3' end of the third chain oligo hybridizes to the third RNA molecule and the second 3' end of the
  • First and second strand cDNA synthesis may be performed as known in the art and as described herein above.
  • a circular nucleic acid is produced as shown in FIG. 2 .
  • Subsequent amplification cycles may be performed in order to produce a desired number of copies of cDNA for detection of, for example, mutations present in cancer cells.
  • one or more restriction endonucleases may be employed to cut the amplified cDNA for further analysis. Additional details of these applications are provided below.
  • T-cell and B-cell receptor profiling does not form part of the claimed invention.
  • T cells or T lymphocytes
  • Essential to T-cell function are highly specialized extracellular receptors (T-cell receptors or TCRs) that selectively bind specific antigens displayed by major histocompatibility complex (MHC) molecules on the surface of antigen-presenting cells (APCs).
  • TCR- ⁇ or TCR- ⁇ subunit TCR- ⁇ subunit
  • TCR- ⁇ + TCR- ⁇ TCR- ⁇
  • TCR- ⁇ + TCR- ⁇ TCR subunit variants
  • Turchaninova et al. Eur J Immunol 43:2507-15, 2013
  • Blocker oligos are needed to inhibit amplification of unfused molecules, so only a DNA polymerase without 3' to 5' exonuclease activity, such as Taq polymerase, can be used instead of a high-fidelity polymerase. Otherwise, the blocker oligos will be degraded. The final sequencing results will therefore contain many artificial sequencing errors introduced by the low-fidelity polymerase.
  • Some of the unfused molecules from different cells may be fused and amplified with each other in the nested PCR step, after the emulsion is broken and the molecules are pooled in bulk.
  • using chain-oligos as described herein, these shortcomings can be overcome and accuracy and efficiency can be significantly increased.
  • specifically designed chain-oligos can be used to amplify and clone the coding sequences of TCR-α and TCR-β. Once sequenced, the coding sequences of TCR-α and TCR-β can in turn be cloned into B-cells.
  • Chain-oligos may be designed that are specific to, for example, the constant regions in α-chain mRNAs, such as the 5' untranslated region (UTR) and "constant" (C) segment coding region, or to the constant regions in β-chain mRNAs.
  • two chain-oligos can be used to isolate and/or amplify even longer sequences for sequencing. After a single chain-oligo grabs a pair of particular DNA sequences, only around 300 bp of nucleic acid sequence can be sequenced using an Illumina platform. However, using two compatible chain-oligos, two separate sequenceable molecules can be produced, allowing sequencing of twice the length of nucleic acid, i.e., 300+300 bp.
  • the invention provides a diagnostic method to capture heavy and light chain transcripts of B-cells or TCR-α and TCR-β sequences wherein no adaptor is required.
  • one or more chain oligos may directly link the transcript pair via PCR.
  • Such a method may only require isolation of a cell either in a container or spatially on a surface.
  • each chain oligo contains a primer pair corresponding to a conserved framework in, for example, the heavy (H) or light (L) chains of an antibody sequence, which can then extend and capture the full-length chain information.
  • a cell or container barcode may be added to identify single cells or samples.
  • the B cell or T cell of interest can be isolated from a subject with a recent infection or with a vaccine administration.
  • chain oligos as described herein may be used to produce an antibody of any combination of components, such as H and L chains. Any number of coding regions for antibody components may be joined together in any configuration using any number of chain oligos to link the nucleic acid sequences together. In a particular embodiment, once such nucleic acids are joined together using chain oligos, the chain oligos themselves may then be removed or excised from the joined complex using specific transposases, including, but not limited to, a piggyBac transposase, such that there is no "scar" left in the joined nucleic acid complex. The resulting nucleic acid complex may then be introduced into a B-cell in order to produce a specific desired antibody.
  • the present invention therefore enables the production of engineered antibodies having any desired sequence.
  • One or more chain-oligos may be used to target multiple specific genomic regions or multiple mRNAs in a single cell, which can enable detection of the presence of certain mRNAs, mutation combinations, or polymorphisms, or profiling of single cells using sequencing.
  • the detection of such polymorphisms or mutations can be used in genotyping or cancer diagnostics.
  • Profiling or detection of the presence of two or more specific mutations can be performed in a single cell, such as a cancer cell.
  • chain oligos may be used to profile or detect two or more specific RNA molecules or RNA transcripts. Such oligos may be designed to specifically reverse transcribe, amplify, and link the resulting cDNA fragment converted from the target RNAs.
  • Kits (not part of the invention)
  • kits comprising one or more such reagents or components for use in a variety of assays, including for example, nucleic acid assays, e.g., an assay described herein.
  • kits may preferably include at least a first chain oligo as described herein, and means for detecting or visualizing amplification of a target sequence.
  • a kit may contain multiple chain oligos as described herein for the purpose of performing PCR or sequencing. Chain oligos may be provided in lyophilized, desiccated, or dried form, or may be provided in an aqueous solution or other liquid media appropriate for use in accordance with the invention.
  • Kits may also include additional reagents, e.g., PCR components, such as salts including MgCl 2 , a polymerase enzyme, and deoxyribonucleotides, and the like, reagents for DNA isolation or sequencing, as described herein.
  • reagents or components are well known in the art.
  • reagents included with such a kit may be provided either in the same container or media as the chain oligo or plurality of chain oligos,
  • Embodiments disclosed herein, but not part of the invention, provide methods, primers, and kits for covalently linking polynucleotides, which have application in, for example, de novo genome assembly, long range mutation detection, mapping of repeating regions, and synthetic biology construct validation.
  • the aspects disclosed herein that do not form part of the invention are well adapted for applications requiring the manipulation and/or sequencing of large polynucleotide molecules.
  • Existing techniques are costly, require large DNA inputs, suffer from error rates much higher than those of shorter-read sequencing, and are typically much less efficient once the size of the DNA exceeds 5 kb.
  • Example 11 including Figures 10-14 is a reference example which does not form part of the invention.
  • the outside adaptor contains two Chi sites and a rare cut site (SapI).
  • the Chi sites are included at the ends of the OA in order to provide protection of the ends of linear DNA from exonuclease V (RecBCD) digestion.
  • Other end protection methods can be used, such as a hairpin.
  • the SapI restriction enzyme recognizes a 7-bp sequence (on average one cut per 16,384 bp of random DNA sequence) and cuts outside of its recognition site.
  • the SapI site was used to remove the OA before making a sequencing library.
  • Other restriction enzyme sites may also be used to replace SapI.
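The "one cut per 16,384 bp" figure follows from the length of the recognition sequence. A minimal Python sketch of that arithmetic is given here, assuming an idealized random sequence in which the four bases are equally likely and independent; real genomes such as T7 deviate from this. The function names are illustrative only, and the 40-kb length is the T7 template size mentioned elsewhere in this description.

```python
def expected_site_spacing(site_length: int) -> int:
    """Average distance (bp) between occurrences of one specific site
    in random DNA with equal, independent base frequencies (assumption)."""
    return 4 ** site_length


def expected_site_count(template_length: int, site_length: int) -> float:
    """Expected number of occurrences of the site on one strand of a
    random template of the given length."""
    positions = max(template_length - site_length + 1, 0)
    return positions / expected_site_spacing(site_length)


if __name__ == "__main__":
    # 7-bp recognition sequence, as stated for SapI above.
    print(expected_site_spacing(7))
    # Expected occurrences on one strand of a random 40-kb template.
    print(round(expected_site_count(40_000, 7), 2))
```

Because the site is rare, a long template is unlikely to be cut internally when the outside adaptor is removed, which is the reason a rare cutter is chosen for this step.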
  • the SapI recognition and cutting site was as follows:
  • Chain oligos were linked using Copper(I)-Catalyzed Azide-Alkyne Cycloaddition (CuAAC) click chemistry reaction.
  • Other click chemistry reactions such as copper-free strain-promoted azide-alkyne cycloaddition (SPAAC) reaction, or other chemical reactions, such as the thiol-ene reaction, may be used, along with others known in the art.
  • the azide (AZ)-chain oligo contained the following components: AZ-Desthiobiotin-Chi-Chi-SapI.
  • Desthiobiotin is a modified form of biotin that binds less tightly to avidin and streptavidin than biotin, while still providing excellent specificity in affinity purification methods. Desthiobiotin can be released from streptavidin with biotin. Biotin can also be used to replace desthiobiotin.
  • the alkyne (AK)-chain oligo contained the following components: AK-Desthiobiotin-Chi-Chi-SapI.
  • the AZ-chain and AK-chain oligos were 5'-5'-linked as described herein, producing a chain oligo having two arms. Chain oligos may also be constructed with more than two arms, such as demonstrated in FIGs. 1 and 2 . Chain oligos having two or more arms may be used to link together multiple DNA molecules for any number of applications described herein.
  • the internal adaptor used included 15 bp of artificial sequence and a mosaic end (ME).
  • the mosaic end was 19 bp and is specifically recognized by Tn5 transposase.
  • the 15-bp artificial sequence was included to elongate the internal adaptor and to facilitate suppression chain-PCR.
  • the outside adaptor consisted of 11 bp of common sequence from Illumina sequencing F-adapter and R-adapter, plus two Chi sites.
  • the AZ-chain oligo contained the following components: AZ-Desthiobiotin-Chi-Chi-SapI-F.
  • the AK-chain oligo contained the following components: AK-Desthiobiotin-Chi-Chi-SapI-R.
  • the AZ-chain and AK-chain oligos were 5'-5'-linked as described herein.
  • the SapI site can be omitted in this version of the outside adaptor, if desired. In this case, the SapI cut step was skipped and the circularized chain-PCR products were used as template to make final sequencing library.
  • the internal adaptor used included 15 bp of artificial sequence and a mosaic end.
  • the mosaic end was 19 bp and is specifically recognized by Tn5 transposase.
  • the 15 bp of artificial sequence was included to elongate the internal adaptor (IA) and to facilitate suppression chain-PCR.
  • End repair and A-tailing of long template DNA - This step is used for sequencing the ends of long DNA molecules 5-100 kb in size.
  • the method was optimized using 40-kb T7 phage DNA.
  • size selection may be performed, depending on the purpose of the experiment.
  • the end repair and A-tailing steps are performed using NEBNext Ultra II kit, or other similar commercial kit, or home-made reagents.
  • the outside adaptor was ligated to the ends of the template DNA via T-A ligation using the NEBNext Ultra II kit, or other similar commercial kit. Alternatively, home-made reagents may be used.
  • ExoV digestion to remove incomplete products [optional] - Template DNA without a ligated outside adaptor at its ends may be digested using Exonuclease V (ExoV, RecBCD, from NEB).
  • outside adaptor ligation may be detected using primers, for example one annealed to the outside adaptor and one annealed to the T7 DNA at a position several hundred base pairs from the ends.
  • Click chemistry to make chain-oligos - Copper(I)-Catalyzed Azide-Alkyne Cycloaddition (CuAAC) click chemistry reaction was used to make chain-oligos, as shown in FIG. 3 .
  • Other click chemistry reactions or other addition chemistry reactions may also be used to make chain-oligos.
  • Transposomes were assembled using Tn5 transposase and an internal adapter (IA) containing a Tn5-specific binding sequence and mosaic end. Transposomes were used to insert a known sequence, such as an internal adaptor, into the template DNA for subsequent PCR amplification.
  • Other transposases, for example MuA, and their corresponding recognition/binding sequences, or other enzymes having similar functions, such as integrase, may also be used.
  • Single template DNA was first encapsulated into droplets, and transposomes were pre-assembled to contain the internal adaptor. The pre-assembled transposomes were then used to insert the internal adaptor into the encapsulated template DNA by incubating at 37°C for 2 hr, or at 55°C for 15 min. The droplets were then incubated at 95°C for 10 minutes to denature the transposase and break the interaction between the transposase and the template DNA ( FIG. 5 , steps 3 and 4).
  • the reaction buffer was optimized to be compatible with both transposome insertion and PCR.
  • Various buffers, polymerases, conditions for transposome insertion and denaturing, and PCR amplification protocols were tested.
  • the buffer was optimized to favor transposome insertion, without which the efficiency of transposome insertion was too low.
  • the buffer components were then adjusted using different polymerases and PCR protocols to identify conditions that would produce reliable chain-PCR products.
  • the results of integration of transposome insertion and PCR in bulk and in emulsion are shown in FIGs. 6 and 7 , respectively.
  • chain-PCR was performed to link and amplify the outside adaptor-ligated ends of the template DNA. Suppression chain-PCR, using internal adaptor (IA) oligos and a limited amount of chain-oligos, was used in order to minimize IA-IA product ( FIG. 5 , step 5).
  • IA-spacer-IA segments were inserted into the encapsulated template DNA.
  • DNA amplified from internal DNA regions (“internal PCR products”) had an IA at both ends
  • DNA amplified from the end regions of the template DNA (“end PCR products”) had an IA at one end and an OA at the other end.
  • the IA was 34 bp
  • the OA had no complementary sequence with the IA.
  • "internal PCR products" had 34 bp of IA sequence at both ends. After denaturing, the ends of the single-stranded DNA from "internal PCR products" could anneal with each other and prevent the binding of IA oligos.
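The suppression effect described above can be illustrated with a short Python sketch. It assumes a simplified model in which a single strand is suppressed when its 3' end is the reverse complement of its 5' end over the full adaptor length, so that the two ends can anneal to each other. The adaptor and insert sequences below are made-up placeholders for illustration, not the sequences used in the examples, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def ends_can_self_anneal(strand: str, adaptor_length: int) -> bool:
    """True if the 3' end of the strand is the reverse complement of its
    5' end over adaptor_length bases (simplified suppression model)."""
    if len(strand) < 2 * adaptor_length:
        return False
    five_prime = strand[:adaptor_length].upper()
    three_prime = strand[-adaptor_length:].upper()
    return three_prime == reverse_complement(five_prime)


# Placeholder sequences (assumptions for illustration only).
IA = "GACTGACTGACTGACAGATGTGTATAAGAGACAG"  # 34-nt stand-in for the internal adaptor
OA = "CCATTGGCAACCTTGGAACC"                # stand-in for the outside adaptor
INSERT = "ATGCGTACCGTTAGCATCGGATTACGCTAG"  # stand-in for template sequence

# Top strand of an "internal PCR product": IA at both ends.
internal_product = IA + INSERT + reverse_complement(IA)
# Top strand of an "end PCR product": OA at one end, IA at the other.
end_product = OA + INSERT + reverse_complement(IA)

if __name__ == "__main__":
    print(ends_can_self_anneal(internal_product, len(IA)))
    print(ends_can_self_anneal(end_product, len(IA)))
```

In this model only the strand flanked by the IA on both sides has self-complementary ends, which is consistent with the statement that the OA has no sequence complementary to the IA.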
  • After 15-20 cycles of PCR amplification, all chain-oligos were used up and incorporated into "ends PCR products." This avoids possible cross-reaction between the ends from different template DNA after the emulsion is broken.
  • successful "ends PCR products" have two arms, in which each arm is one end of the template DNA, with a chain-oligo in the middle that is already incorporated into the PCR product.
  • the ends of the "ends PCR products” are IAs.
  • Emulsion droplets were broken according to the emulsion droplet oil phase components ( FIG. 5 , step 6). DNA was purified accordingly using a compatible silica-based column or DNA precipitation.
  • the "ends PCR products” incorporated chain-oligos, which carry desthiobiotin.
  • Streptavidin beads, for example Dynabeads M-280 Streptavidin from Thermo Fisher Scientific, were used to pull down "ends PCR products," while the "internal PCR products" were washed away.
  • the "ends PCR products” were then eluted using biotin, which has higher affinity to streptavidin than desthiobiotin ( FIG. 5 , step 7).
  • For T7 DNA, PCR was used to check the size distribution of products, shown in FIG. 8 .
  • the ends of the template DNA, which could be any combination of the head end and the tail end, were now physically closer to each other on the DNA sequence ( FIG. 5 , step 10).
  • Primers targeting the ends of T7 DNA were used to perform PCR.
  • Intact T7 DNA was used as a control template, which produced a 40-kb linear DNA amplicon.
  • Successful chain-seq produced a smear of DNA products ranging from approximately 150 bp to several kb.
  • ExoV was used to digest uncircularized DNA, and PCR was performed to amplify the products. Primers were used that annealed to the OA to amplify the end-joined products. The chain-OA was then removed using SapI digestion and DNA purification.
  • Version 1: end repair, A-tailing, adapter ligation, PCR, size-selection
  • Tn5 transposome assembly: 25°C for 30-60 min.
  • Emulsion PCR: integrated tagmentation with PCR
  • Example 11: One-pot setup provides more flexibility for optimization (not part of the invention)
  • Transposome assembly and tagmentation reactions can be set up as a "one-pot" reaction, which can enable optimization of appropriate reaction conditions ( FIGs. 10-14 ).
  • A schematic overview of the Crab-Seq procedure is shown in FIG. 15 .
  • Outside Adapter is a hairpin formed from oligo OA.
  • Covalently linking oligos are formed by covalently linking these two oligos through their 5' ends (a 5'-5' linkage), using Azide-Alkyne Huisgen Cycloaddition.
  • IA_pcr (SEQ. ID. No: 21) 5'GAGGAGAGATGTGTATAAGAGACAG
  • E. coli DH5α genomic DNA was prepared using Gentra ® Puregene ® kit (Qiagen) following the vendor's protocol. DNA was stored at 4°C until use.
  • genomic DNA was sheared using g-TUBE (Covaris) with a benchtop centrifuge according to the vendor's protocol and then size-selected with BluePippin (Sage Science). The size ranges of the size-selected genomic DNA were confirmed using Fragment Analyzer (Advanced Analytical Technologies).
  • End Repair, dA-tailing and Adapter ligation steps are performed on the size-selected template DNA using KAPA Hyper Prep Kit (Kapa Biosystems), KAPA HyperPlus Kit (Kapa Biosystems) or NEBNext Ultra II kit (New England Biolabs) following the vendors' protocols.
  • Put the PCR tube in a thermocycler and perform the following steps:
  • DNA that was not ligated to Outside Adapters at both ends was digested with exonucleases.
  • Tn5 transposome was assembled using EZ-Tn5 Transposase (Epicentre) or Robust Tn5 Transposase (Creative Biogene) and Internal Adapter following the vendors' protocols. An example setup is shown below.
      13 µl  H2O
      2 µl   10x TPS buffer
      1 µl   Annealed Internal Adapter, 40 µM
      4 µl   Robust Tn5 transposase
      Total  20 µl
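As a quick check of the example setup above, the following Python sketch sums the component volumes and applies the standard dilution relation C1·V1 = C2·V2 to obtain the final adapter and buffer concentrations in the 20 µl reaction. The helper name and dictionary keys are illustrative and are not taken from any vendor protocol.

```python
def final_concentration(stock_conc: float, stock_volume: float,
                        total_volume: float) -> float:
    """Solve C1 * V1 = C2 * V2 for C2 (result has the units of stock_conc)."""
    if total_volume <= 0:
        raise ValueError("total_volume must be positive")
    return stock_conc * stock_volume / total_volume


# Volumes in µl, taken from the example setup above.
components_ul = {
    "H2O": 13.0,
    "10x TPS buffer": 2.0,
    "Annealed Internal Adapter (40 µM)": 1.0,
    "Robust Tn5 transposase": 4.0,
}

total_ul = sum(components_ul.values())
# 40 µM adapter stock, 1 µl in the final volume.
adapter_uM = final_concentration(40.0, 1.0, total_ul)
# 10x buffer stock, 2 µl in the final volume.
buffer_x = final_concentration(10.0, 2.0, total_ul)

if __name__ == "__main__":
    print(total_ul, adapter_uM, buffer_x)
```

Under these numbers the volumes add up to the stated 20 µl total, the buffer ends up at 1x, and the annealed Internal Adapter at 2 µM.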
  • the integrated tagmentation and emulsion PCR was set up using Micellula DNA Emulsion & Purification Kit (EURx) or in-house microfluidics devices. The average number of template copies per droplet was kept below 0.1, to minimize the occurrence of multiple template copies in a single droplet.
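The reason for keeping the average below 0.1 copies per droplet can be sketched numerically. The Python example below assumes that templates are distributed among droplets according to a Poisson distribution, which is a common model for droplet loading but is not stated in this description; the function names are illustrative.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k templates in a droplet (Poisson model)."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def droplet_occupancy(mean_copies: float) -> dict:
    """Fractions of droplets that are empty, hold one template, or hold
    more than one template, for a given average copy number."""
    p_empty = poisson_pmf(0, mean_copies)
    p_single = poisson_pmf(1, mean_copies)
    p_multiple = 1.0 - p_empty - p_single
    occupied = 1.0 - p_empty
    return {
        "empty": p_empty,
        "single": p_single,
        "multiple": p_multiple,
        # Share of template-containing droplets that hold more than one copy.
        "multiple_among_occupied": p_multiple / occupied if occupied > 0 else 0.0,
    }


if __name__ == "__main__":
    for mean in (0.05, 0.1, 0.5, 1.0):
        print(mean, droplet_occupancy(mean))
```

Under this model, at an average of 0.1 copies per droplet roughly 0.5% of all droplets, or about 5% of the droplets that contain any template, hold more than one copy; the price is that about 90% of droplets are empty.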
  • An example of emulsion PCR setup using Micellula DNA Emulsion & Purification Kit (EURx) is listed below.
  • Put in a thermocycler and run the following program:
  • PCR products covalently linked by the Covalently linking oligos were enriched by biotin pulldown, using the DesthioBiotin built into the Alkyne oligo and Dynabeads MyOne Streptavidin C1 (Thermo Fisher Scientific), following the vendor's protocol.
  • An example protocol is listed below.
  • biotin elution buffer (100 mM Tris-HCl (pH 8.0), 250 mM NaCl, 1 mM EDTA, 10 mM D-biotin).
  • PCR products were size-selected on 2% E-Gel (Thermo Fisher Scientific) for 500-1,000 bp range, and extracted using QIAquick Gel Extraction Kit (Qiagen).
  • Circularization ligation was performed using T4 DNA ligase (New England Biolabs) at 16°C for 1-16 hours, and T4 DNA ligase was inactivated at 65°C for 10 minutes.
  • Linear DNA was digested using Lambda Exonuclease (New England Biolabs) and Exonuclease I (New England Biolabs) at 37°C for 30 min following the vendor's protocol and cleaned up using DNA Clean & Concentrator Kit (Zymo Research).
  • Illumina sequencing adapters were added using PCR with indexed oligos (New England Biolabs) and NEBNext Ultra II Kit (New England Biolabs). Products were cleaned up using 0.9x Agencourt AMPure XP (Beckman Coulter) following NEBNext Ultra II Kit protocol.
  • Illumina sequencing reads were split at the internal adapter junction, and the 5' part and 3' part of each read were then mapped to the E. coli DH5α reference genome using Burrows-Wheeler Aligner (BWA) (Li H. and Durbin R. (2009) Bioinformatics, 25:1754). The distance between the mapped locations of the 5' and 3' parts on the reference genome was calculated for all reads. Reads whose distance falls within the length range of the template DNA were generated from the two ends of a template DNA molecule and provide long-distance sequencing information. The results of Crab-Seq on 5-10 kb E. coli gDNA are shown in FIGs. 17A-17B .
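The distance calculation and filtering described above can be sketched in Python as follows. This is a simplified illustration, not the analysis pipeline actually used: it assumes that the mapped positions of the 5' and 3' parts have already been extracted from the aligner output, and the `SplitRead` type, function names, read coordinates, and genome length are hypothetical. The optional `genome_length` argument treats the reference as circular, which is an added assumption.

```python
from typing import Iterable, List, NamedTuple, Optional


class SplitRead(NamedTuple):
    name: str
    pos_5p: int  # reference position where the 5' part mapped
    pos_3p: int  # reference position where the 3' part mapped


def mapped_distance(read: SplitRead, genome_length: Optional[int] = None) -> int:
    """Distance between the two mapped parts; if genome_length is given,
    use the shorter way around a circular reference (assumption)."""
    distance = abs(read.pos_5p - read.pos_3p)
    if genome_length is not None:
        distance = min(distance, genome_length - distance)
    return distance


def long_range_reads(reads: Iterable[SplitRead], min_len: int, max_len: int,
                     genome_length: Optional[int] = None) -> List[SplitRead]:
    """Keep reads whose mapped distance lies in the template size range."""
    return [r for r in reads
            if min_len <= mapped_distance(r, genome_length) <= max_len]


if __name__ == "__main__":
    # Made-up coordinates for illustration only.
    reads = [
        SplitRead("read_a", 1_000, 8_500),
        SplitRead("read_b", 2_000, 2_300),
        SplitRead("read_c", 100, 4_599_000),
    ]
    # 5-10 kb template range; 4,600,000 bp is a placeholder genome length.
    for read in long_range_reads(reads, 5_000, 10_000, genome_length=4_600_000):
        print(read.name, mapped_distance(read, 4_600_000))
```

Reads that fall well below the template size range, such as `read_b` here, would correspond to products that do not span the two ends of a template.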
  • Fiddler-Crab-Seq uses two different outside adapters to ligate to the ends of the template DNA. Although the protocols of these two versions of Crab-Seq look similar, they are significantly different.
  • In Crab-Seq, there is only one type of outside adapter. After outside adapter ligation, there is a mixture of products, including two-end ligated templates, one-end ligated templates, and non-ligated templates. In the subsequent emulsion PCR step, one-end ligated templates produce covalently linked PCR products that are generated from the same template end, i.e., the Crab links copies of the same template end in emulsion PCR. These "same end" products are not useful for long distance DNA sequencing and waste sequencing reads.
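To give a feel for how large the one-end ligated fraction can be, the Python sketch below assumes that each template end is ligated independently with the same probability. This independence model and the example efficiency of 0.8 are assumptions for illustration; this description does not report ligation efficiencies.

```python
def ligation_fractions(p_end: float) -> dict:
    """Expected fractions of templates with two, one, or no ligated ends,
    assuming each end is ligated independently with probability p_end."""
    if not 0.0 <= p_end <= 1.0:
        raise ValueError("p_end must be between 0 and 1")
    return {
        "two_end": p_end ** 2,
        "one_end": 2 * p_end * (1 - p_end),
        "non_ligated": (1 - p_end) ** 2,
    }


if __name__ == "__main__":
    # 0.8 is an arbitrary per-end efficiency chosen for illustration.
    for label, fraction in ligation_fractions(0.8).items():
        print(label, round(fraction, 2))
```

Under this model, even a per-end efficiency of 0.8 leaves about a third of templates ligated at only one end, which illustrates why "same end" products can consume a meaningful share of sequencing reads.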
  • To remove one-end ligated templates before emulsion PCR, one can use exonuclease digestion. However, because the DNA templates are long (up to 30 kb) and the ligation mixture is not optimal for the activity of typical exonucleases (such as Exonuclease V, or Exonuclease III plus Exonuclease VII), the exonuclease digestion is incomplete and thus does not remove all (if any) of the one-end ligated templates. If one chooses to clean up the outside adapter-ligated DNA before exonuclease digestion, the long DNA molecules are very easily broken or tangled together during the clean-up step. This results in many new one-end ligated templates and leads to the same problem described above.
  • FIG. 18 shows the Fiddler-Crab-Seq strategy for suppression PCR to eliminate damaged or partially end-labeled molecules and keep only proper "Forward and Reverse".
  • A schematic overview of the Fiddler-Crab-Seq procedure is shown in FIG. 19 .
  • Outside Adapter 1 is a hairpin formed from oligo OA1-F.
  • OA1-F carrying another type of dU-containing loop, such as TAUCG (SEQ. ID. No: 23), also works.
  • Outside Adapter 2 is a hairpin formed from oligo OA2-R.
  • OA2-R carrying another type of dU-containing loop, such as TAUCG (SEQ. ID. No: 25), also works.
  • Covalently linking oligos are formed by covalently linking these two oligos through their 5' ends (a 5'-5' linkage), using Azide-Alkyne Huisgen Cycloaddition.
  • IA_pcr (SEQ. ID. No: 32) 5'GAGGAGAGATGTGTATAAGAGACAG
  • E. coli DH5α genomic DNA was prepared using Gentra ® Puregene ® kit (Qiagen) following the vendor's protocol. DNA was stored at 4°C until use.
  • genomic DNA was treated with NEBNext ® dsDNA Fragmentase ® (New England Biolabs), size-selected on 1% E-Gel (Thermo Fisher Scientific), and purified using QIAquick Gel Extraction Kit (Qiagen).
  • genomic DNA was either used directly for size-selection with BluePippin (Sage Science), or was sheared using g-TUBE (Covaris) with a benchtop centrifuge according to vendor's protocol and then size-selected with BluePippin (Sage Science).
  • the size ranges of the size-selected genomic DNA were confirmed using Fragment Analyzer (Advanced Analytical Technologies).
  • End Repair, dA-tailing and Outside Adapter ligation steps are performed on the size-selected template DNA using KAPA Hyper Prep Kit (Kapa Biosystems), KAPA HyperPlus Kit (Kapa Biosystems) or NEBNext Ultra II kit (New England Biolabs) following the vendors' protocols.
  • Put the PCR tube in a thermocycler and perform the following steps:
  • DNA from the adapter ligation reaction was purified using DNA Clean & Concentrator Kit (Zymo Research). For 5-10 kb and 10-30 kb DNA, to minimize DNA damage we used a mild DNA clean up protocol.
  • Tn5 transposome was assembled using EZ-Tn5 Transposase (Epicentre) or Robust Tn5 Transposase (Creative Biogene) and Internal Adapter following the vendors' protocols.
  • An example setup is shown below.
      13 µl  H2O
      2 µl   10x TPS buffer
      1 µl   Annealed Internal Adapter, 40 µM
      4 µl   Robust Tn5 transposase
      Total  20 µl
  • the integrated tagmentation and emulsion PCR was set up using Micellula DNA Emulsion & Purification Kit (EURx) or in-house microfluidics devices. The average number of template copies per droplet was kept below 0.1, to minimize the occurrence of multiple template copies in a single droplet.
  • An example of emulsion PCR setup using Micellula DNA Emulsion & Purification Kit (EURx) is listed below.
  • Put in a thermocycler and run the following program:
  • PCR products covalently linked by the Covalently linking oligos were enriched by biotin pulldown, using the DesthioBiotin built into the Alkyne oligo and Dynabeads MyOne Streptavidin C1 (Thermo Fisher Scientific), following the vendor's protocol.
  • An example protocol is listed below.
  • biotin elution buffer (100 mM Tris-HCl (pH 8.0), 250 mM NaCl, 1 mM EDTA, 10 mM D-biotin).
  • PCR products were size-selected on 2% E-Gel (Thermo Fisher Scientific) for 500-1,000 bp range, and extracted using QIAquick Gel Extraction Kit (Qiagen).
  • Circularization ligation was performed using T4 DNA ligase (New England Biolabs) at 16°C for 1-16 hours, and T4 DNA ligase was inactivated at 65°C for 10 minutes.
  • Linear DNA was digested using Lambda Exonuclease (New England Biolabs) and Exonuclease I (New England Biolabs) at 37°C for 30 min following the vendor's protocol and cleaned up using DNA Clean & Concentrator Kit (Zymo Research).
  • Illumina sequencing adapters were added using PCR with indexed oligos (New England Biolabs) and NEBNext Ultra II Kit (New England Biolabs). Products were cleaned up using 0.9x Agencourt AMPure XP (Beckman Coulter) following NEBNext Ultra II Kit protocol.
  • Illumina sequencing reads were split at the internal adapter junction, and the 5' part and 3' part of each read were then mapped to the E. coli DH5α reference genome using the Burrows-Wheeler Aligner (BWA) (Li H. and Durbin R. (2009) Bioinformatics, 25:1754). The distance between the mapped locations of the 5' and 3' parts on the reference genome was calculated for every read. Reads whose distance falls within the length range of the template DNA were generated from the two ends of a template DNA molecule and provide long-distance sequencing information.
  • FIGs. 20A-20C show Fiddler-Crab-Seq results on E. coli gDNA.
  • Example 14: Crab-PCR to link and pair heavy and light chain mRNAs from individual B lymphocyte hybridoma cells
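The read processing described for the Illumina data (split each read at the internal adapter junction, map both parts, compare the mapped distance with the template length) can be sketched as follows. This is a minimal sketch under stated assumptions: the junction sequence, the mapped positions, and the template length range are hypothetical placeholders, exact string matching stands in for whatever junction detection was actually used, and the mapping itself (performed with BWA in the text) is not reproduced here.

```python
from typing import Optional, Tuple


def split_at_junction(read: str, junction: str) -> Optional[Tuple[str, str]]:
    """Split a read into its 5' and 3' parts at the first exact junction match.

    Returns None when the junction is absent from the read.
    """
    idx = read.find(junction)
    if idx == -1:
        return None
    return read[:idx], read[idx + len(junction):]


def mapped_distance(pos_5p: int, pos_3p: int,
                    genome_length: Optional[int] = None) -> int:
    """Distance between the mapped positions of the two read parts.

    If genome_length is given, the genome is treated as circular and the
    shorter way around is returned.
    """
    distance = abs(pos_3p - pos_5p)
    if genome_length is not None:
        distance = min(distance, genome_length - distance)
    return distance


def spans_template(distance: int, min_len: int, max_len: int) -> bool:
    """True if the distance lies within the expected template length range."""
    return min_len <= distance <= max_len


JUNCTION = "ACGTTGCA"  # hypothetical placeholder, not the real adapter junction
read = "TTTTGGGG" + JUNCTION + "CCCCAAAA"
parts = split_at_junction(read, JUNCTION)
print(parts)

# Hypothetical mapped coordinates; in practice these come from the aligner.
distance = mapped_distance(100_000, 108_500)
print(distance, spans_template(distance, 5_000, 10_000))
```

The optional `genome_length` argument is included because the E. coli chromosome is circular, so two parts mapped near opposite ends of the linear reference can still be close together on the molecule.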
  • FIG. 22 shows a schematic overview of the Crab-PCR using DNA templates.
  • FIG. 23 shows a schematic overview of the Crab-PCR using mRNA template and mRNA specific amplification primers.
  • FIG. 24 shows a schematic overview of the Crab-PCR using an mRNA template, a template switching adapter and a common amplification primer, as used in the Example below.
  • Mus musculus B lymphocyte hybridoma MYC1-9E10.2 [9E10] (ATCC® CRL-1729™), obtained from ATCC.
  • Single cells are sorted using FACS into PCR tubes or strips, with each well containing 2.5 µl of H2O and 2.5 µl of OneTaq One-Step Reaction mix (2x) (New England Biolabs).
  • Place in a thermocycler and run the following program:
  • AMPure beads purification, using 1x AMPure XP beads (Beckman Coulter).
  • Steps 5-8 are optional if Sanger sequencing is used to read out the final products, but they are required for next-generation sequencing such as Illumina sequencing. Without Type IIS digestion to remove part of the adapter, the junction region formed by the two adapters from the two ends of the covalently linked PCR products will form a hairpin on the Illumina flowcell and affect the sequencing results.
  • Steps 9-10 are optional if the DNA yield after step 4 is sufficient for size selection and Sanger sequencing. If the DNA yield is low, these steps are required, because the covalently linked PCR products need to be circularized and amplified to generate more DNA molecules. If next-generation sequencing, such as Illumina sequencing, will be performed, these steps are also required, because the covalently linked PCR products need to be circularized and sequencing adapters need to be added.
  • Place in a thermocycler and run the following program: 98 °C for 2 min; 52 °C for 30 sec; 68 °C for 3 min; 4 °C hold.
  • AMPure beads purification, using 1x AMPure XP beads (Beckman Coulter).
  • AMPure beads purification, using 1x AMPure XP beads (Beckman Coulter).
  • Place in a thermocycler and run the following program:

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Biotechnology (AREA)
  • Immunology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biomedical Technology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Biochemistry (AREA)
  • Microbiology (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Analytical Chemistry (AREA)
  • Hematology (AREA)
  • Cell Biology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Claims (16)

  1. A method for linking nucleic acid molecules or fragments thereof, comprising:
    (a) segregating individual nucleic acid molecules, which are tagged at both ends with a first adapter pair comprising a forward sequence (F) and a reverse sequence (R), into individual discrete volumes;
    (b) inserting at least one second adapter into two or more internal sites of the nucleic acid molecule within the individual discrete volumes;
    (c) fragmenting the nucleic acid molecules to generate nucleic acid fragments of the nucleic acid molecule, at least a portion of which are tagged with both the first adapter pair and the second adapter;
    (d) contacting the nucleic acid fragments with at least a first and a second primer, wherein the first primer comprises at least two 5'-5' linked arms, wherein a first arm of the at least two 5'-5' linked arms comprises a sequence that hybridizes to the forward sequence (F) of the first adapter pair, and a second arm of the at least two 5'-5' linked arms comprises a sequence that hybridizes to the reverse sequence (R) of the first adapter pair, and wherein the second primer comprises a sequence that hybridizes to the second adapter; and
    (e) amplifying the nucleic acid molecules using both the first and the second primer by PCR amplification or isothermal amplification.
  2. The method of claim 1, further comprising:
    (f) pooling the amplified nucleic acid fragments from each individual discrete volume; and
    (g) circularizing the amplified nucleic acid fragments by joining the second adapters, optionally further comprising isolating the amplified nucleic acid fragments tagged with the first primer prior to the circularizing step, and/or
    further comprising:
    (h) PCR amplification to generate linearized nucleic acid molecules comprising the second adapter; and
    (i) sequencing the linearized nucleic acid molecules to generate a set of nucleic acid reads, optionally
    further comprising exonuclease digestion prior to the PCR amplification step, or
    further comprising removing the first adapter pair sequence from the circularized nucleic acid molecules, to generate linearized nucleic acid molecules comprising the second adapter, prior to the PCR amplification step, or
    further comprising assembling a nucleic acid sequence of the nucleic acid molecules based at least in part on the set of nucleic acid sequencing reads.
  3. The method of claim 1, wherein the first arm and the second arm each have a length of between 5 and 1,000 bp, or
    wherein the first and the second arm are 5'-5' linked via PCR amplification, isothermal amplification, ligation, click chemistry, or chemical oligonucleotide synthesis,
    wherein optionally the first and the second arm are linked to each other using a biocompatible reaction, wherein further optionally
    the biocompatible reaction is selected from the group consisting of a copper(I)-catalyzed azide-alkyne cycloaddition (CuAAC) reaction, a copper-free, strain-promoted azide-alkyne cycloaddition (SPAAC) reaction, and a thiol-ene reaction.
  4. The method of claim 1, wherein the first arm, the second arm, both arms, or the linkage further comprises a binding tag,
    wherein the binding tag is optionally an affinity pulldown functional group,
    wherein the affinity pulldown functional group is optionally biotin or desthiobiotin.
  5. The method of claim 2, wherein the amplified nucleic acid fragments tagged with the first primer are isolated via the binding tag.
  6. The method of any one of claims 1-5, wherein the forward sequence (F) and the reverse sequence (R) are between 6 and 5,000 nucleotides in length, and/or
    wherein the forward sequence (F) and the reverse sequence (R) are identical, or
    wherein the forward sequence (F) and the reverse sequence (R) are different.
  7. The method of any one of claims 1-6, wherein the forward sequence (F), the reverse sequence (R), or the second adapter further comprises a restriction site,
    wherein optionally the restriction site is a Type IIS restriction site,
    wherein further optionally the Type IIS restriction site is a SapI, AcuI, BpuEI, BsgI, BseRI, or EciI restriction site.
  8. The method of claim 7, wherein the forward sequence (F), the reverse sequence (R), or the second adapter is removed from, or shortened in, the circularized nucleic acid fragments by a restriction enzyme that recognizes the restriction site.
  9. The method of claim 1, wherein the first arm of the first primer comprises a forward sequencing adapter sequence or a fragment thereof, and the second arm of the first primer comprises a reverse sequencing adapter sequence or a fragment thereof, or
    wherein the end-tagged nucleic acid molecules are fragmented by a transposase.
  10. The method of any one of claims 1-9, wherein the individual discrete volume is a droplet generated by emulsification,
    wherein optionally the individual discrete volume is a droplet generated by vortexing or shaking, or
    wherein the individual discrete volume is a droplet generated on a microfluidic device.
  11. The method of claim 10, wherein the droplet comprises the transposase, the second adapter, and the first and the second primer.
  12. The method of any one of claims 1-9, wherein the individual discrete volume is a hollow particle of sufficient size to hold the reaction mixture,
    wherein the particle is optionally a section of a thin capillary tube.
  13. The method of any one of claims 1-12, wherein the nucleic acid molecules are DNA molecules, or
    wherein the nucleic acid molecules are RNA molecules.
  14. The method of any one of claims 1-13, wherein the nucleic acid molecules are 5 kb or longer, or
    wherein the nucleic acid molecules are 40-100 kb or longer, or
    wherein the nucleic acid molecules encode a T cell receptor, a B cell receptor, or an immunoglobulin heavy or light chain.
  15. A circularized DNA molecule for linking and/or sequencing two ends of a nucleic acid molecule, comprising: a first primer, both ends of the nucleic acid molecule, and an internal adapter sequence, wherein the internal adapter is inserted into the nucleic acid molecule, wherein the ends of the nucleic acid molecule are tagged with a forward sequence (F) and a reverse sequence (R), wherein the first primer comprises at least two 5'-5' linked arms that link the two ends of the nucleic acid molecule, wherein a first arm of the at least two 5'-5' linked arms comprises a sequence that hybridizes to the forward sequence (F), and a second arm of the at least two 5'-5' linked arms comprises a sequence that hybridizes to the reverse sequence (R), and wherein the linked ends of the nucleic acid molecule are ligated at their distal ends by ligase.
  16. A circularized DNA molecule for linking and/or sequencing two nucleic acid molecules, comprising: a primer and two nucleic acid molecules comprising a first and a second nucleic acid molecule, wherein the primer comprises at least two 5'-5' linked arms that link the two nucleic acid molecules, wherein a first arm of the at least two 5'-5' linked arms comprises a sequence that hybridizes to the first nucleic acid molecule, and a second arm of the at least two 5'-5' linked arms comprises a sequence that hybridizes to the second nucleic acid molecule, and wherein the two linked nucleic acid molecules are ligated at their distal ends by ligase,
    optionally further comprising a second primer or a second adapter, wherein the second primer or the second adapter tags the distal ends of the two nucleic acid molecules before they are ligated by ligase.
EP18724679.8A 2017-04-26 2018-04-26 Methods for linking polynucleotides Active EP3615683B1 (de)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201762490453P 2017-04-26 2017-04-26
PCT/US2018/029663 WO2018200884A1 (en) 2017-04-26 2018-04-26 Methods for linking polynucleotides

Publications (2)

Publication Number Publication Date
EP3615683A1 (de) 2020-03-04
EP3615683B1 (de) 2022-10-12

Family

ID=62165657

Family Applications (1)

Application Number Title Priority Date Filing Date
EP18724679.8A Active EP3615683B1 (de) 2017-04-26 2018-04-26 Methods for linking polynucleotides

Country Status (3)

Country Link
US (1) US10982278B2 (de)
EP (1) EP3615683B1 (de)
WO (1) WO2018200884A1 (de)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20190039294A (ko) 2016-08-19 2019-04-10 Arizona Board of Regents on behalf of Arizona State University Massive high-efficiency oil-emulsion synthesis of bowtie barcodes for paired mRNA capture and sequencing from individual cells
WO2018200884A1 (en) 2017-04-26 2018-11-01 The Broad Institute, Inc. Methods for linking polynucleotides

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6054274A (en) * 1997-11-12 2000-04-25 Hewlett-Packard Company Method of amplifying the signal of target nucleic acid sequence analyte
JP6017458B2 (ja) 2011-02-02 2016-11-02 University of Washington through its Center for Commercialization Massively parallel contiguity mapping
WO2014093676A1 (en) * 2012-12-14 2014-06-19 10X Technologies, Inc. Methods and systems for processing polynucleotides
DK3553175T3 (da) 2013-03-13 2021-08-23 Illumina Inc Method for preparing a nucleic acid sequencing library
US10000799B2 (en) 2014-11-04 2018-06-19 Boreal Genomics, Inc. Methods of sequencing with linked fragments
CN114606228A (zh) * 2014-11-20 2022-06-10 安普里怀斯公司 Compositions and methods for nucleic acid amplification
WO2018200884A1 (en) 2017-04-26 2018-11-01 The Broad Institute, Inc. Methods for linking polynucleotides

Also Published As

Publication number Publication date
US20200208209A1 (en) 2020-07-02
EP3615683A1 (de) 2020-03-04
WO2018200884A1 (en) 2018-11-01
US10982278B2 (en) 2021-04-20

Similar Documents

Publication Publication Date Title
US11845924B1 (en) Methods of preparing nucleic acid samples for sequencing
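KR102433825B1 (ko) Analysis of nucleic acids associated with single cells using nucleic acid barcodes
CN104619894B (zh) Compositions and methods for negative selection of non-desired nucleic acid sequences
US11879151B2 (en) Linked ligation
KR20190034164A (ko) Single cell whole genome libraries and combinatorial indexing methods for making them
CN102625838A (zh) Methods, compositions, and kits for generating rRNA-depleted samples or isolating rRNA from samples
TW201321518A (zh) Library preparation method for trace nucleic acid samples and application thereof
KR20130113447A (ko) Direct capture, amplification and sequencing of target DNA using immobilized primers
WO2018227025A1 (en) Creation and use of guide nucleic acids
KR20210114918A (ko) Complex surface-bound transposome complexes
JP2020501554A (ja) Method for increasing the throughput of single-molecule sequencing by concatenating short DNA fragments
CN111712580A (zh) Method and kit for amplifying double-stranded DNA
CN114729349A (zh) Methods for barcoding nucleic acids for detection and sequencing
EP3615683B1 (de) Methods for linking polynucleotides
JP4446746B2 (ja) Constant-length signatures for parallel sequencing of polynucleotides
EP3559268A1 (de) Methods and reagents for molecular barcoding
CN111801428B (zh) Method for obtaining single-cell mRNA sequences
JP2023514388A (ja) Parallelized sample processing and library preparation
RU2790295C2 (ru) Complex surface-bound transposome complexes
WO2022251510A2 (en) Oligo-modified nucleotide analogues for nucleic acid preparation
WO2022195089A1 (en) Methods for the selective analysis of cells or organelles
WO2005010184A1 (ja) Method for detecting mutation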

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: UNKNOWN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20191114

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)
REG Reference to a national code

Ref country code: DE

Ref legal event code: R079

Ref document number: 602018041664

Country of ref document: DE

Free format text: PREVIOUS MAIN CLASS: C12Q0001680000

Ipc: C12Q0001685300

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: GRANT OF PATENT IS INTENDED

RIC1 Information provided on ipc code assigned before grant

Ipc: C12Q 1/6869 20180101ALI20220506BHEP

Ipc: C12Q 1/6853 20180101AFI20220506BHEP

INTG Intention to grant announced

Effective date: 20220527

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE PATENT HAS BEEN GRANTED

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: CH

Ref legal event code: EP

REG Reference to a national code

Ref country code: DE

Ref legal event code: R096

Ref document number: 602018041664

Country of ref document: DE

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: AT

Ref legal event code: REF

Ref document number: 1524199

Country of ref document: AT

Kind code of ref document: T

Effective date: 20221115

REG Reference to a national code

Ref country code: LT

Ref legal event code: MG9D

REG Reference to a national code

Ref country code: NL

Ref legal event code: MP

Effective date: 20221012

REG Reference to a national code

Ref country code: AT

Ref legal event code: MK05

Ref document number: 1524199

Country of ref document: AT

Kind code of ref document: T

Effective date: 20221012

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20221012

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20221012

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20230213

Ref country code: NO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20230112

Ref country code: LT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20221012

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20221012

Ref country code: ES

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20221012

Ref country code: AT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20221012

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: RS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20221012

Ref country code: PL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20221012

Ref country code: LV

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20221012

Ref country code: IS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20230212

Ref country code: HR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20221012

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20230113

P01 Opt-out of the competence of the unified patent court (upc) registered

Effective date: 20230523

REG Reference to a national code

Ref country code: DE

Ref legal event code: R097

Ref document number: 602018041664

Country of ref document: DE

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SM

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20221012

Ref country code: RO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20221012

Ref country code: EE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20221012

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20221012

Ref country code: CZ

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20221012

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: FR

Payment date: 20230425

Year of fee payment: 6

Ref country code: DE

Payment date: 20230427

Year of fee payment: 6

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20221012

Ref country code: AL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20221012

26N No opposition filed

Effective date: 20230713

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20230427

Year of fee payment: 6

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20221012

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LU

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20230426

REG Reference to a national code

Ref country code: BE

Ref legal event code: MM

Effective date: 20230430

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: MC

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20221012

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LI

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20230430

Ref country code: CH

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20230430

REG Reference to a national code

Ref country code: IE

Ref legal event code: MM4A

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: BE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20230430

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20230426
