WO2018172726A1

WO2018172726A1 - Single cell DNA sequencing

Info

Publication number: WO2018172726A1
Application number: PCT/GB2018/000042
Authority: WO
Inventors: Patrick Gilligan
Original assignee: Blacktrace Holdings Limited
Priority date: 2017-03-20
Filing date: 2018-03-20
Publication date: 2018-09-27
Also published as: GB201704402D0

Abstract

A bead for single cell sequencing comprises a plurality of oligonucleotides attached to the bead. At a minimum, the oligonucleotides include, in proximal to distal order from the bead: a transposase target sequence, a common barcode sequence and a common primer binding site. A preferred bead comprises oligonucleotides comprising, in proximal to distal order from the bead: a first transposase target sequence; a first common primer binding site; an intervening spacer sequence; a second transposase target sequence; and a second common primer binding site; and further comprising a common barcode sequence between either the first transposase target sequence and first common primer binding site or between the second transposase target sequence and second common primer binding site. The first and second transposase target sequences are oriented with the cleavage site towards the bead. The intervening spacer sequence facilitates transposase dimer binding to first and second transposase target sequences. Various sequencing based applications of such beads are described. A method of preparing cellular nucleic acids from a single cell or nucleus for sequencing comprises encapsulating the single cell or nucleus in a gel bead and adding a cell lysis reagent to the gel bead in order to lyse the cell to leave cellular nucleic acids entrapped within the gel bead.

Description

SINGLE CELL DNA SEQUENCING

FIELD OF THE INVENTION

The invention relates to reagents for use in single cell and single nucleus sequencing. In particular, the invention relates to beads that are useful in droplet-based single cell/nucleus sequencing methods. Barcoded beads are provided that enable an end user to conveniently create libraries of DNA sequences for various analyses. The beads advantageously rely upon the activity of a transposase enzyme to cleave the oligonucleotides from the beads and also insert sequencing primers and barcodes into genomic DNA. Applications include epigenetic studies, copy number variation analysis, lineage analysis and sequencing of environmental samples. Libraries of beads are also provided in which the separate beads each carry a different barcode on the attached oligonucleotides. The beads of the invention are adapted for droplet-based library generation and provide advantageous means of encapsulating cells. The invention also provides droplets comprising the beads of the invention together with a transposase enzyme. Related kits, uses and methods are also provided.

BACKGROUND TO THE INVENTION

A fundamental functional unit in biology is the cell, and one of the fundamental goals of biology is a mechanistic understanding of various processes (embryonic development, brain function, diseases (Alzheimer's, diabetes, cancer)) in terms of cells, their functions, and their behaviours. However, many of the powerful conventional molecular biology techniques such as PCR, RNA seq etc. only work on homogenates that are only averages of all the constituent cells, or cell types. Approaches such as in situ hybridisation and cytometry provide data about single cells, but only low content data.

Various technologies are currently available including:

- FACS/limiting dilution into multi-well plates, followed by SMART-seq 2 or CEL-seq 2

- Wafergen

- Fluidigm C1

- Droplet microfluidics

Recently, a technique was published in which cells are compartmentalised, together with biochemical reactions, in tens of thousands of microfluidic droplets. Each droplet behaves as a compartment, enabling RNA seq on thousands of single cells at once. See Macosko, Evan Z., et al. "Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets." Cell 161.5 (2015): 1202-1214 ("Drop-seq") and Klein, Allon M., et al. "Droplet Barcoding for Single-Cell Transcriptomics Applied to Embryonic Stem Cells." Cell 161.5 (2015): 1187-1201 ("inDrop"). These single cell methods generally rely on putting single cells in microfluidic droplets with barcoded beads, so that mRNA/cDNA from each cell can be uniquely barcoded. These methods permit the transcriptional profiling of thousands of single cells.

Bulk methods of assaying transposase accessible chromatin are known; see Buenrostro, Jason D., et al. "Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position." Nature methods 10.12 (2013): 1213-1218 and Buenrostro, Jason D., et al. "ATAC-seq: A Method for Assaying Chromatin Accessibility Genome-Wide." Current Protocols in Molecular Biology (2015): 21-29. A single cell method has also been developed based on use of Fluidigm's integrated fluidics circuit; see Buenrostro, Jason D., et al. "Single-cell chromatin accessibility reveals principles of regulatory variation." Nature 523.7561 (2015): 486-490. Combinatorial cellular indexing has also been applied to profiling chromatin accessibility; see Cusanovich, Darren A., et al. "Multiplex single-cell profiling of chromatin accessibility by combinatorial cellular indexing." Science 348.6237 (2015): 910-914.

Rotem et al recently described a single-cell method of chromatin immunoprecipitation followed by sequencing (ChIP-seq); see Rotem, A., Ram, O. & Shoresh, N. et al. Single-cell ChIP-seq reveals cell subpopulations defined by chromatin state. Nat. Biotechnol. http://dx.doi.org/10.1038/nbt.3383 (2015). This technique employs micrococcal nuclease (MNase) to preferentially cut accessible linker DNA and generated sparse data, on the order of 1000 unique reads, for each single cell.

WO00/17343 describes methods for making insertional mutations at random positions in cellular nucleic acid in a target cell by introducing a synaptic complex of a Tn5 transposase and a polynucleotide comprising a pair of nucleotide sequences that interact with the Tn5 transposase and comprise a transposable nucleotide sequence therebetween. Adey et al (Genome Biology, 2010, 11: R119) constructed shotgun fragment libraries in which transposase catalysed in vitro DNA fragmentation and adaptor incorporation simultaneously. This method was applied to human genome sequencing.

DESCRIPTION OF THE INVENTION

The inventor has identified a number of disadvantages associated with current single cell (sc) epigenetics/scDNA sequencing methods. For example, the methods may only produce very sparse data, because the DNA or chromatin in a single cell is limiting, and methods such as digestion, blunting, adaptor ligation etc. are inefficient, and result in much or most of the potential sequence reads being lost. Assaying transposase accessible chromatin (ATAC) seq is an efficient technique and consequently very useful for limiting material, because it directly introduces adaptor sequences in a single step (i.e., integration of an artificial "transposome" construct). The bulk ATAC seq method has been applied to single cells, in the Fluidigm C1 chips, but this has the disadvantage of limited throughput (dozens of cells, in expensive chips). The Fluidigm chips have the further disadvantage of, at least in some circumstances, having a high doublet rate (i.e., two or more cells in a single chamber on the chip), which leads to data that is confounded, at an unknown rate. There are other single cell DNA sequencing methods in multi-well plates, but they suffer from additional disadvantages, such as not being able to achieve satisfactory coverage of the genome; e.g., copy number sequencing requires good, unbiased coverage of the genome, but current methods may produce uneven amplification of the genome. The present invention achieves very efficient tagging of single cell genomic DNA, by generating and using barcoded active transposition complexes, referred to herein as "tagmosomes", to tag the genomes of single cells in microfluidic droplet compartments with adaptor primers and barcodes. The inventor has enabled further applications of droplet microfluidics in sequencing of single cells and nuclei. Generally, these applications focus on DNA sequencing as opposed to RNA seq. The inventor has devised bead-oligonucleotide constructs and other reagents that allow convenient, efficient and high throughput library preparation for implementing various droplet microfluidics-based DNA sequencing applications as applied to single cells and nuclei. The constructs interact with transposase enzymes in a droplet to integrate primer binding sequences and barcodes into DNA.

Accordingly, in a first aspect, the invention provides a bead for single cell sequencing comprising a plurality of oligonucleotides attached to the bead, the oligonucleotides comprising in proximal to distal order from the bead:

a. a transposase target sequence

b. a common barcode sequence

c. a common primer binding site.

Throughout the present disclosure, where single cells are described this is intended to also encompass isolated nuclei as appropriate and unless otherwise indicated.

It should be noted that throughout the specification the term "comprising" is intended to represent open-ended (i.e. including) language. However, for the avoidance of doubt, wherever the term "comprising" is used it is envisaged that the corresponding feature may be limited to that specified (i.e. consisting) as necessary.

Thus, the basic components of the reagents of the invention include:

A bead

The bead acts as the solid support for the plurality of oligonucleotides. The bead is dimensioned so as to facilitate attachment to a large number of oligonucleotides and encapsulation within a droplet. The bead may be composed of any suitable material as would be readily understood by one skilled in the art. In preferred embodiments, the bead is a polyacrylamide bead or a hydrogel bead. Such beads may be advantageous because they are compatible with efficient enzymatic reactions, allow for 'super-Poisson' loading in droplets and because they can be close-packed in a chip.

A plurality of oligonucleotides

The oligonucleotides are acted upon by a transposase enzyme to generate a sequencing library. Typically, at least on average 10⁴ or at least on average 10⁵ oligonucleotides are attached to each bead. The beads are intended to be used in high numbers to investigate large numbers of cells in parallel fashion. Thus, the number of oligonucleotides attached to a bead may be specified as an average across the library of beads. The average may be a mean or a median. The figure specified also allows for manufacturing tolerances. In some embodiments, around 10⁴-10⁹, such as 10⁶-10⁸, preferably 10⁷ oligonucleotides are attached to each bead.

Typically a single strand of each oligonucleotide (which may incorporate double stranded regions as outlined herein) is attached to the bead. Attachment may be via any suitable means. It may be via a linker, which may be a flexible linker. The linker may be one of a number known in the art, such as a hydrocarbon linker, or a PEG linker. The linker may be attached by one of a number of methods known in the art, e.g., the linker may be attached by reacting the carboxylic acid group of a molecule such as hexadecenoic acid with an amine group on the bead.

The essential features of the oligonucleotides include, stated in proximal to distal order from the bead:

a. a transposase target sequence

b. a common barcode sequence

c. a common primer binding site.

"Proximal" means the region of the oligonucleotide closest to the bead and "distal" means furthest away from the bead. Thus, the orientation of the components of the oligonucleotide can be more easily understood.

A transposase target sequence

A transposase target sequence may also be referred to as a "transposase half site". This sequence may be double stranded and incorporates a site that can be cleaved by a transposase enzyme. For the avoidance of doubt the cleavage site may be double stranded. The transposase target sequence is oriented so that the cleavage site is located at the distal end, with distal being defined relative to the common primer binding site to which the transposase target sequence is functionally linked on a given oligonucleotide (which sequences are integrated together into the DNA). Advantageously, in some embodiments, this cleavage site is employed in the constructs of the invention to facilitate cleavage of the oligonucleotides from the bead. In other embodiments, the oligonucleotide can be cleaved from the bead by other means and the cleavage site is then cleaved to permit insertion into the target DNA. The transposase target sequence is recognised and bound by a transposase enzyme, which can then integrate the oligonucleotides into cellular DNA. The transposase target sequence is typically at least 10 nucleotides in length, such as between 15 and 25 nucleotides in length, 19-40 nucleotides in length, preferably 18 or 19 nucleotides in length. Each transposase target sequence may comprise nucleotide A at position 10, nucleotide T at position 11, and nucleotide A at position 12. Each transposase target sequence may further comprise nucleotide A or T at position 4, nucleotide G or C at position 15, nucleotide A or T at position 17 or nucleotide G or C at position 18. In some embodiments, each transposase target sequence has the sequence 5'-CTGTCTCTTATACACATCT-3' (SEQ ID NO: 1) or 5'-CTGTCTCTTATACAGATCT-3' (SEQ ID NO: 2). In some embodiments, one or more transposase target sequences comprise, consist essentially of or consist of the following sequences: MEsso1 CGCATTGAGATGTGTATAAGAGACAGGTACTCTGCG (SEQ ID NO: 7) and/or MEsso2 CCGCTCACGAGATGTGTATAAGAGACAG (SEQ ID NO: 8).

Any suitable transposase enzyme may be utilised according to the invention and this will determine the precise sequence of the transposase target sequences. One example is Tn5 (Goryshin, Igor Yu, and William S. Reznikoff. "Tn5 in vitro transposition." Journal of Biological Chemistry 273.13 (1998): 7367-7374). Various mutant forms are available which display increased insertion activity, such as Tnp (EK54, MA56, LP372). See Goryshin and Reznikoff and WO98/010077.
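By way of illustration only, the positional features recited for the transposase target sequence can be checked computationally. The sketch below is not part of the disclosed method: the function names are hypothetical, positions are assumed to be counted from 1 at the 5' end, and the features at positions 4, 15, 17 and 18 are reported individually rather than required, since the text recites them as alternatives. It also checks that SEQ ID NO: 7 and SEQ ID NO: 8 contain the reverse complement of SEQ ID NO: 1.

```python
# Illustrative sketch only; helper names are hypothetical and not taken
# from the specification. Positions are 1-based from the 5' end (assumed).

SEQ_ID_1 = "CTGTCTCTTATACACATCT"
SEQ_ID_2 = "CTGTCTCTTATACAGATCT"
SEQ_ID_7 = "CGCATTGAGATGTGTATAAGAGACAGGTACTCTGCG"
SEQ_ID_8 = "CCGCTCACGAGATGTGTATAAGAGACAG"

# A at position 10, T at position 11, A at position 12.
CORE_POSITIONS = {10: "A", 11: "T", 12: "A"}
# Further features, each recited as an alternative in the text.
FURTHER_POSITIONS = {4: "AT", 15: "GC", 17: "AT", 18: "GC"}


def has_core_motif(seq):
    """Return True if the sequence has A, T, A at positions 10, 11, 12."""
    if len(seq) < 12:
        return False
    return all(seq[pos - 1] == base for pos, base in CORE_POSITIONS.items())


def further_features(seq):
    """Return the positions (4, 15, 17, 18) whose recited feature is present."""
    return [
        pos
        for pos, allowed in FURTHER_POSITIONS.items()
        if len(seq) >= pos and seq[pos - 1] in allowed
    ]


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


if __name__ == "__main__":
    for name, seq in (("SEQ ID NO: 1", SEQ_ID_1), ("SEQ ID NO: 2", SEQ_ID_2)):
        print(name, has_core_motif(seq), further_features(seq))
    rc = reverse_complement(SEQ_ID_1)
    print(rc in SEQ_ID_7, rc in SEQ_ID_8)
```

Both SEQ ID NO: 1 and SEQ ID NO: 2 satisfy the three core positions under this 1-based convention; a different numbering convention would require adjusting the position tables.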

A common barcode sequence

This barcode sequence is the same between all oligonucleotides attached to a given bead. It is different to the barcode sequence included in the oligonucleotides attached to another bead. Since a single bead is ultimately brought into contact with a single cell (in a droplet), the barcode functions to identify the cell from which a particular target sequence originated. Any suitable barcode sequence may be used. A useful method for synthesising barcodes is split and pool synthesis (e.g. as used by Macosko et al), although other methods may be employed. The common barcode sequence is typically at least 10 nucleotides in length, such as between 10 and 15 nucleotides in length, preferably 12 nucleotides in length. Due to the synthesis methods employed herein, the barcode sequence may be comprised within a larger overall sequence. For example, a 12 nucleotide barcode sequence may be synthesised in 4 nucleotide blocks, each block comprising additional adaptor sequences that permit the next block to be added. This may create an overall sequence that is around 32 nucleotides in length and contains the 12 nucleotide barcode sequence in three distinct 4 nucleotide blocks. This is shown schematically in Figures 13 and 14 and discussed in greater detail herein. Such an arrangement is preferred because it facilitates barcode sequence generation by a split and pool synthesis scheme as explained further herein.
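The block-wise barcode arrangement described above can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the synthesis protocol itself: the adaptor sequences and their 5 nucleotide lengths are hypothetical (chosen only so that three 4 nucleotide blocks plus adaptors total 32 nucleotides), and it is assumed that every possible 4 nucleotide block is available in each round.

```python
# Illustrative sketch only. ADAPTORS are hypothetical placeholder sequences;
# the specification does not give adaptor sequences or lengths here.
import itertools
import random

BASES = "ACGT"
BLOCK_LEN = 4   # nucleotides per barcode block
N_ROUNDS = 3    # three blocks give a 12 nucleotide barcode
ADAPTORS = ["GACTG", "CTAGC", "TGCAA", "AGTCC"]  # assumed, 5 nt each

# Every possible 4 nt block (assumption: one synthesis pool per block).
ALL_BLOCKS = ["".join(p) for p in itertools.product(BASES, repeat=BLOCK_LEN)]


def split_and_pool(n_beads, rng):
    """Assign each bead one random block per round (split), then pool."""
    beads = [[] for _ in range(n_beads)]
    for _ in range(N_ROUNDS):
        for blocks in beads:
            blocks.append(rng.choice(ALL_BLOCKS))
    return beads


def overall_sequence(blocks):
    """Interleave adaptors and blocks: adaptor, block, adaptor, block, ..."""
    parts = [ADAPTORS[0]]
    for block, adaptor in zip(blocks, ADAPTORS[1:]):
        parts.extend([block, adaptor])
    return "".join(parts)


def extract_barcode(sequence):
    """Recover the 12 nt barcode from the overall sequence by fixed offsets."""
    barcode = []
    pos = len(ADAPTORS[0])
    for adaptor in ADAPTORS[1:]:
        barcode.append(sequence[pos:pos + BLOCK_LEN])
        pos += BLOCK_LEN + len(adaptor)
    return "".join(barcode)


if __name__ == "__main__":
    rng = random.Random(0)
    for blocks in split_and_pool(3, rng):
        seq = overall_sequence(blocks)
        print(seq, len(seq), extract_barcode(seq))
    print("possible barcodes:", len(ALL_BLOCKS) ** N_ROUNDS)
```

Under these assumptions the barcode space is 256 × 256 × 256 = 16,777,216 combinations, the same as for any 12 nucleotide barcode drawn from four bases. Reading the barcode back requires only the known adaptor offsets.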

A common primer binding site

The common primer binding site may also be referred to as a "PCR handle"; the terms are used interchangeably herein. This primer binding site is the same between all oligonucleotides attached to a given bead (and across beads). It thus facilitates amplification of DNA using a single forward primer. Any suitable primer binding sequence may be used. Primer design is routine for one skilled in the art and various tools are available to facilitate the process (including freely available online software). Typically, the common primer binding site is between 15 and 40 nucleotides in length, more particularly between 20 and 35 nucleotides in length.

It is preferred that the oligonucleotides also facilitate insertion of a second, reverse, primer binding site into the DNA. This has corresponding properties to the common primer binding site and thus may be termed a "second common primer binding site". In combination, the two primer binding sites facilitate amplification of the DNA between the sites of two adjacent insertions. Thus, primers binding to first and second common primer binding sites may be termed forward and reverse primers respectively. The first and second common primer binding sites are different to one another but may be the same between all oligonucleotides attached to a given bead (and across beads).

Thus, in some embodiments, the bead further comprises a second plurality of oligonucleotides attached to the bead, the second oligonucleotides comprising in proximal to distal order from the bead:

a. a second transposase target sequence

b. a second common primer binding site.

The second transposase target sequence may be the same as the first transposase target sequence. The properties of a transposase target sequence are discussed above, which applies mutatis mutandis. The transposase enzyme functions as a dimer, with one subunit binding to each transposase target sequence. The inclusion of two transposase target sequences in the oligonucleotide thus facilitates excision of the desired sequences from the bead to create a diffusible transposition complex. This also facilitates transposition of the desired sequences into the target DNA.

The nature of the insertion events is dependent upon whether the transposase target sequences remain linked to one another following excision from the bead, as catalysed by the transposase enzyme. If the transposase target sequences remain linked to one another following excision then insertion does not introduce a double stranded break and the constructs are referred to as "non-fragmenting". Such "non-fragmenting" bead constructs are advantageous in specific applications, such as ATAC-seq or where it is advantageous to amplify the genome (e.g. by a WGA technique). Alternatively, if the transposase target sequences are separated from one another following excision from the bead, which includes providing separate oligonucleotides attached to the beads, the constructs are referred to as "fragmenting". Such "fragmenting" bead constructs are advantageous in specific applications, such as ChIP-seq. In various aspects of the invention, it may be advantageous to have the transposase dimer cleave the oligonucleotides from the bead. Since the transposase functions as a dimer, at reasonable concentrations of oligonucleotides on the bead, it may be advantageous to provide both transposase sites on a single oligonucleotide. This facilitates cleavage and thus formation of the diffusible transposome complex.

Accordingly, a preferred bead of the invention for single cell sequencing, of the fragmenting type, comprises a plurality of oligonucleotides attached to the bead, the oligonucleotides comprising in proximal to distal order from the bead:

a. a first transposase target sequence

b. a first common primer binding site

c. an intervening spacer sequence

d. a second transposase target sequence

e. a second common primer binding site; and

further comprising a common barcode sequence between either the first transposase target sequence and first common primer binding site or between the second transposase target sequence and second common primer binding site; and wherein the intervening spacer sequence (is of sufficient length to) facilitates transposase dimer binding to first and second transposase target sequences. Thus, according to this aspect, each oligonucleotide incorporates all sequences needed for insertion via a fragmenting mechanism. The common barcode sequence may be positioned between the first transposase target sequence and first common primer binding site. Alternatively, it may be positioned between the second transposase target sequence and second common primer binding site. In such constructs, the first and second transposase target sequences are oriented with the cleavage site towards the bead. Such a construct is shown schematically in Figure 1. Accordingly, upon binding by the transposase enzyme (dimer), the cleavage sites are oriented as shown in Figure 2. In this orientation, cleavage results in fragmentation and separation of the first and second common primer binding sites onto separate oligonucleotides that are then inserted into the DNA.
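The element order of the fragmenting construct, and the two permitted barcode positions, can be written out as a simple ordered list. The following is a minimal sketch for illustration: the element labels and function names are hypothetical shorthand, and the split at the spacer is a simplification that only groups the elements on either side of it (the fate of the spacer itself is not modelled).

```python
# Illustrative sketch only; labels and function names are hypothetical.
TTS1 = "first transposase target sequence"
PBS1 = "first common primer binding site"
SPACER = "intervening spacer sequence"
TTS2 = "second transposase target sequence"
PBS2 = "second common primer binding site"
BARCODE = "common barcode sequence"

# Proximal (bead end) to distal order of the fragmenting construct.
FRAGMENTING_ORDER = [TTS1, PBS1, SPACER, TTS2, PBS2]


def build_layout(barcode_arm):
    """Return the element order with the barcode placed in arm 1 or arm 2.

    Arm 1: between the first target sequence and first primer binding site.
    Arm 2: between the second target sequence and second primer binding site.
    """
    if barcode_arm not in (1, 2):
        raise ValueError("barcode_arm must be 1 or 2")
    layout = list(FRAGMENTING_ORDER)
    primer_site = PBS1 if barcode_arm == 1 else PBS2
    # Inserting immediately before the primer binding site puts the barcode
    # between that site and the target sequence proximal to it.
    layout.insert(layout.index(primer_site), BARCODE)
    return layout


def arms(layout):
    """Group the elements on each side of the spacer (simplified model)."""
    i = layout.index(SPACER)
    return layout[:i], layout[i + 1:]


if __name__ == "__main__":
    for arm in (1, 2):
        first, second = arms(build_layout(arm))
        print("barcode in arm", arm)
        print("  arm 1:", " | ".join(first))
        print("  arm 2:", " | ".join(second))
```

In either layout each arm keeps its own transposase target sequence and primer binding site, while only one arm carries the barcode.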

In order to adopt such a conformation, the oligonucleotide includes a so-called "intervening spacer sequence" between the first common primer binding site and second transposase target sequence. This intervening spacer sequence facilitates transposase dimer binding to first and second transposase target sequences. Generally, the intervening spacer sequence must be of sufficient length to allow the cleavage complex to form. For convenience of synthesis, the intervening spacer sequence is generally composed of DNA nucleotides. The intervening spacer sequence may be single stranded, double stranded or a mixture of both. In some embodiments, the intervening spacer sequence comprises a single stranded region of at least 20, 30 or 40 nucleotides in length. In specific embodiments, the intervening spacer sequence is between 40 and 65 nucleotides in length. In other embodiments, the intervening spacer sequence comprises a double stranded region of at least 100, 150 or 200 nucleotides in length. In specific embodiments, the intervening spacer sequence is between 100 and 300 nucleotides in length. A longer double stranded region (compared to a single stranded region) is likely needed to generate the conformation needed for effective transposase binding.

As already discussed, generally the oligonucleotides are single stranded at the site of attachment to the bead (i.e. only one strand is attached to the bead). Conveniently, all oligonucleotides may be attached to the bead in the same orientation. In some embodiments, the oligonucleotides are attached to the bead at their 3' end. In such embodiments, while each transposase target sequence is double stranded, this is not necessary for the common primer binding site and common barcode sequence. Each may be single stranded.

In other embodiments, the oligonucleotides are attached to the bead at their 5' end. Such an orientation may be preferred in view of preferred synthesis routes. In such embodiments, because the transposase target sequences are oriented with the cleavage site towards the bead, then the adjacent primer binding site must be double stranded. This is necessary because the strand not attached to the bead is the one that will be inserted into the DNA by the transposase. The same logic applies to the barcode sequence. Thus, in some embodiments, each transposase target sequence, the common primer binding sites and common barcode sequence are each double stranded. They are effectively provided as a double stranded construct (although only a single strand is attached to the bead and the intervening linker need not be double stranded as explained above).

In other aspects, the second transposase target sequence is in the opposite orientation. This results in a non-fragmenting method of insertion into the DNA. Thus, the invention also provides a bead for single cell sequencing comprising a plurality of oligonucleotides attached to the bead, the oligonucleotides comprising in proximal to distal order from the bead:

a. a first transposase target sequence

b. a first common primer binding site

c. optionally, an intervening spacer sequence

d. a second common primer binding site

e. a second transposase target sequence; and

further comprising a common barcode sequence between either the first transposase target sequence and first common primer binding site or between the second transposase target sequence and second common primer binding site; and wherein the intervening spacer sequence facilitates transposase dimer binding to first and second transposase target sequences.

In these aspects, for the avoidance of doubt, the first transposase target sequence (proximal to the bead) is oriented with the cleavage site towards the bead. However, the second transposase target site (distal to the bead) is oriented with the cleavage site away from the bead.

Thus, according to this aspect, each oligonucleotide attached to the bead incorporates all sequences needed for insertion via a non-fragmenting mechanism. The common barcode sequence may be positioned between the first transposase target sequence and first common primer binding site. Alternatively, it may be positioned between the second transposase target sequence and second common primer binding site. The intervening spacer sequence is of sufficient length to facilitate transposase dimer binding to first and second transposase target sequences. In view of this alternative orientation of the oligonucleotide components, the requirements of the intervening spacer sequence are less. This is shown schematically in Figure 6. Thus, in some embodiments, the intervening spacer sequence comprises a single stranded region of no more than 30, 20, 10 or 5 nucleotides in length. Indeed, no separate sequence may be needed in some embodiments; the second primer binding site and barcode may provide sufficient sequence to enable the relevant conformation to be adopted. Thus, the intervening spacer sequence as a separate sequence is optional.

As already discussed, generally the oligonucleotides are single stranded at the site of attachment to the bead (i.e. only one strand of each oligonucleotide is attached to the bead). Conveniently, all oligonucleotides may be attached to the bead in the same orientation. In some embodiments, the oligonucleotides are attached to the bead at their 5' end. In such embodiments, each transposase target sequence and the first common primer binding site are double stranded. The barcode may also be double stranded if located between the first transposase target sequence and first common primer binding site. For the second transposase target site, oriented with the cleavage site away from the bead, it is not necessary for the second common primer binding site (and barcode where relevant) to be double stranded. Preferably, such components of the oligonucleotides are single stranded for convenience of synthesis.

In other embodiments, the oligonucleotides are attached to the bead at their 3' end. In such embodiments, each transposase target sequence and the second common primer binding site may be double stranded. If the barcode is between the second transposase target sequence and second common primer binding site it may be double stranded. In such embodiments, at least the first common primer binding site and, where relevant, the barcode may be single stranded.

In each of the aforementioned aspects, it is advantageous for the first transposase target sequence to be oriented with the cleavage site towards the bead. In this arrangement, transposon mediated cleavage releases the oligonucleotide from the bead without requiring any additional separate cleavage step. Accordingly, in some embodiments, the oligonucleotides are attached to the bead via a non-cleavable linker. By "non-cleavable" is meant that the linker is not cleaved during the use of the beads of the invention (rather than that the linker cannot be cleaved at all). Thus, the linker is not, for example, cleaved by transposase activity. An example is a non-photocleavable linker (i.e. not cleaved by light, such as UV light).

In order to maximise efficiency of activity by the transposase enzyme and/or minimise any inhibition caused by the bead, the oligonucleotides may comprise a terminal spacer sequence between the bead and the (first) transposase target sequence. This is typically a nucleic acid sequence. In some embodiments, the terminal spacer sequence is at least 10, 20 or 30 nucleotides in length.

For the avoidance of doubt, it is possible for the oligonucleotides to be attached to the bead via a cleavable linker, such as a photocleavable linker (i.e. cleaved by light, such as UV light). This may obviate the need for the terminal spacer sequence because the bead can be removed before the transposase acts on the oligonucleotide.

If a cleavable linker, such as a photocleavable linker, is incorporated into the oligonucleotides of the invention then reliance on the transposase target sequence to cleave the oligonucleotides from the bead is removed. Thus, the invention also provides a bead for single cell sequencing comprising a plurality of oligonucleotides (that insert via a fragmenting mechanism) attached to the bead, the oligonucleotides comprising:

a. a first common primer binding site

b. a first transposase target sequence

c. an intervening spacer sequence

d. a second (common) primer binding site

e. a second transposase target sequence; and

further comprising a common barcode sequence between either the first transposase target sequence and first common primer binding site or between the second transposase target sequence and second common primer binding site; and wherein the intervening spacer sequence (is of sufficient length to) facilitates transposase dimer binding to first and second transposase target sequences; and wherein the oligonucleotides are attached to the bead via a cleavable linker.

According to this aspect, the orientation of the transposase target sequences may be different. Thus, in one embodiment, the oligonucleotides comprise in proximal to distal order from the bead:

a. a first common primer binding site

b. a first transposase target sequence

c. an intervening spacer sequence

d. a second (common) primer binding site

e. a second transposase target sequence.

In order to adopt such a conformation, the oligonucleotide includes a so-called "intervening spacer sequence" between the first transposase target sequence and second common primer binding site. This intervening spacer sequence facilitates transposase dimer binding to first and second transposase target sequences. Generally, the intervening spacer sequence must be of sufficient length to allow the cleavage complex to form. For convenience of synthesis, the intervening spacer sequence is generally composed of DNA nucleotides. The intervening spacer sequence may be single stranded, double stranded or a mixture of both. In some embodiments, the intervening spacer sequence comprises a single stranded region of at least 20, 30 or 40 nucleotides in length.

Alternatively, the oligonucleotides comprise in proximal to distal order from the bead:

a. a first common primer binding site

b. a first transposase target sequence

c. an intervening spacer sequence

d. a second transposase target sequence

e. a second (common) primer binding site; and

further comprising a common barcode sequence between either the first transposase target sequence and first common primer binding site or between the second transposase target sequence and second common primer binding site; and wherein the intervening spacer sequence (is of sufficient length to) facilitates transposase dimer binding to first and second transposase target sequences; and wherein the oligonucleotides are attached to the bead via a cleavable linker.

The intervening spacer sequence is of sufficient length to facilitate transposase dimer binding to first and second transposase target sequences. In view of this alternative orientation of the oligonucleotide components, in which the transposase target sequences are connected directly to one another via the intervening spacer sequence, the requirements of the intervening spacer sequence are less stringent. Thus, in some embodiments, the intervening spacer sequence comprises a single stranded region of no more than 30, 20, 10 or 5 nucleotides in length.
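By way of illustration only, the component order of this alternative orientation can be written down as a simple ordered list and checked against the stated placement rule for the common barcode sequence. The component labels and the helper `barcode_position_ok` are hypothetical shorthand introduced for this sketch; they are not terms or sequences defined in the specification.

```python
# Illustrative labels only; no actual nucleotide sequences are implied.
TTS1, TTS2 = "transposase_target_1", "transposase_target_2"
PBS1, PBS2 = "primer_site_1", "primer_site_2"
SPACER, BARCODE = "spacer", "barcode"


def barcode_position_ok(layout):
    """True if the single barcode sits directly between a transposase
    target sequence and its paired common primer binding site."""
    if layout.count(BARCODE) != 1:
        return False
    i = layout.index(BARCODE)
    if i == 0 or i == len(layout) - 1:
        return False
    neighbours = {layout[i - 1], layout[i + 1]}
    return neighbours in ({TTS1, PBS1}, {TTS2, PBS2})


def targets_flank_spacer(layout):
    """True if both transposase target sequences directly adjoin the spacer."""
    i = layout.index(SPACER)
    return 0 < i < len(layout) - 1 and {layout[i - 1], layout[i + 1]} == {TTS1, TTS2}


# Proximal-to-distal order for the alternative orientation, with the
# barcode placed between the first primer site and first target sequence.
layout = [PBS1, BARCODE, TTS1, SPACER, TTS2, PBS2]
print(barcode_position_ok(layout), targets_flank_spacer(layout))
```

A layout in which the barcode adjoins the spacer would fail the first check, since the barcode would then no longer lie between a target sequence and its paired primer binding site.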

According to each of these embodiments and aspects, the cleavable linker may be a photocleavable linker.

As for earlier aspects of the invention, the oligonucleotides may be attached to the bead at their 5' end or 3' end. When attached via the 5' end, each transposase target sequence may be double stranded but at least the common primer binding site and common barcode sequence may be single stranded. If attached via the 3' end, each transposase target sequence, the common primer binding sites and common barcode sequence may each be double stranded.

Whilst there are advantages associated with generating beads of the invention carrying a plurality of identical oligonucleotides, each of which carries two transposase cleavage sites, it is also within the scope of the invention to separate the transposase cleavage sites and common primer binding sites into two separate oligonucleotides where each oligonucleotide is attached to the bead via a cleavable linker. Once the oligonucleotides are released from the bead they are freely diffusible and thus there is no absolute requirement to provide both transposase sites on one oligonucleotide. Thus, the invention also provides a bead for single cell sequencing comprising a plurality of first and second oligonucleotides attached to the beads; the first oligonucleotides comprising:

a. a first transposase target sequence

b. a first common primer binding site; and

the second oligonucleotides comprising:

a. a second transposase target sequence

b. a common barcode sequence

c. a second common primer binding site;

wherein the plurality of first and second oligonucleotides are attached to the bead via a cleavable linker.

For the avoidance of doubt, the first and second oligonucleotides are discrete and separately attached to the bead. The components of the oligonucleotide can be provided in either orientation vis a vis the bead. Thus, in some embodiments, the invention provides a bead for single cell sequencing comprising a plurality of first and second oligonucleotides attached to the beads; the first oligonucleotides comprising in proximal to distal order from the bead:

a. a first transposase target sequence

b. a first common primer binding site; and

the second oligonucleotides comprising:

a. a second transposase target sequence

b. a common barcode sequence

c. a second common primer binding site;

Such a bead is shown schematically in Figure 10 and discussed herein in further detail.

In some embodiments, the invention provides a bead for single cell sequencing comprising a plurality of first and second oligonucleotides attached to the beads; the first oligonucleotides comprising in proximal to distal order from the bead:

a. a first common primer binding site; and

b. a first transposase target sequence

the second oligonucleotides comprising:

a. a second common primer binding site

b. a common barcode sequence

c. a second transposase target sequence; wherein the plurality of first and second oligonucleotides are attached to the bead via a cleavable linker.

Any suitable cleavable linker may be employed. The cleavable linker may be a photocleavable linker.

The first and second transposase target sequences are generally double stranded (including the cleavage site). The first and/or second oligonucleotides may each be attached to the bead at their 5' end. In such embodiments, the oligonucleotides may comprise a terminal spacer sequence between the cleavable linker and each transposase target sequence. The terminal spacer sequence may be at least 10, 20 or 30 nucleotides in length. In some embodiments, the first and/or second oligonucleotides are attached to the bead at their 3' end. In such embodiments, the oligonucleotides may also comprise a terminal spacer sequence between the cleavable linker and the first and/or second common primer binding site. The terminal spacer sequence may be at least 10, 20 or 30 nucleotides in length.

The orientation of the oligonucleotides in relation to the beads determines the preferred route of synthesis. Oligonucleotides may be synthesised by any suitable route. The oligonucleotides may be synthesised using polymerase and/or ligase activity. Synthesis in a 5'-3' direction may be catalysed by a polymerase. Synthesised components may be ligated together to form the final oligonucleotide in some embodiments. A preferred method of synthesis, particularly for the barcodes, is split and pool synthesis.
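By way of illustration only, split and pool synthesis of barcodes can be modelled as follows. In each round every bead is extended by one block chosen at random, so the number of reachable barcodes grows as the number of blocks raised to the power of the number of rounds. The block sequences, the function names and the choice of 96 blocks over three rounds are assumptions made for this sketch and are not taken from the specification.

```python
import random

# Hypothetical barcode blocks; real block sequences are not given in the text.
BLOCKS = ["ACGT", "TGCA", "GATC", "CTAG"]


def barcode_space(n_blocks, rounds):
    """Number of distinct barcodes reachable after the given number of rounds."""
    return n_blocks ** rounds


def split_pool(n_beads, rounds, blocks=BLOCKS, seed=0):
    """Extend each bead by one randomly chosen block per round."""
    rng = random.Random(seed)
    barcodes = [""] * n_beads
    for _ in range(rounds):
        # Split: each bead lands in a random well and receives that well's block.
        barcodes = [bc + rng.choice(blocks) for bc in barcodes]
        # Pool: beads are mixed again; bead order is irrelevant in this model.
    return barcodes


if __name__ == "__main__":
    print(barcode_space(96, 3))   # e.g. 96 wells over 3 rounds (assumed figures)
    print(split_pool(5, 3))
```

Because blocks are assigned at random, two beads can receive the same barcode; the barcode space therefore needs to be large relative to the number of beads actually used.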

Although the invention is defined in relation to the properties of a single bead, it is readily apparent that the beads are intended to be produced and sold in bulk. Thus, the invention also provides a library of beads. The library comprises a plurality of beads of the invention.

Typically, in the library, each bead is a bead of the invention. Within the library, each of the plurality of oligonucleotides attached to each bead comprises the same common primer binding sites and same transposase target sequence. This simplifies downstream handling of the beads. The same forward and reverse primer pair can be used for all amplification steps. The transposase target sequence is acted upon by the same transposase enzyme with minimal bias.

In contrast thereto, the common barcode sequence is different between beads. This ensures that the origin of each sequenced molecule can be assigned to a single cell/nucleus. Typically, a single sequencing run will include amplicons from many cells and thus the barcode plays a critical role in assigning the single cell of origin of the sequence data. Naturally, the number of beads included in a library can be varied as desired by the end user dependent on the experiment or experiments to be performed. In some embodiments, the library contains at least 10², 10³, 10⁴, 10⁵ or 10⁶ beads. Preferred numbers are around 10⁵ or 10⁶ beads. They may be used in experiments with a corresponding number of cells. In order to achieve this, the library of beads may be larger, e.g. 10 or 100 fold larger, than the number of cells to be analysed in order to minimise the chance of the same barcode being provided to more than one cell.
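The benefit of using a library 10 or 100 fold larger than the number of cells can be estimated with a short calculation. The sketch below assumes, purely for illustration, that each cell receives a barcode drawn uniformly at random from the distinct barcodes in the library; this uniform-draw model and the function name are assumptions and are not stated in the specification.

```python
def shared_barcode_fraction(n_cells, n_barcodes):
    """Expected fraction of cells whose barcode is also carried by at least
    one other cell, assuming uniform random barcode assignment."""
    if n_cells < 1 or n_barcodes < 1:
        raise ValueError("n_cells and n_barcodes must be positive")
    return 1.0 - (1.0 - 1.0 / n_barcodes) ** (n_cells - 1)


if __name__ == "__main__":
    cells = 10_000
    for fold in (1, 10, 100):
        frac = shared_barcode_fraction(cells, fold * cells)
        print(f"{fold:>3}-fold excess: {frac:.4f}")
```

Under this model the expected fraction of cells sharing a barcode falls roughly in proportion to the fold excess of barcodes over cells.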

The beads may be packaged in the form of a kit. Thus, the invention also provides a kit for single cell sequencing comprising a bead or a library of beads as defined herein, optionally together with a transposase enzyme. Whilst the transposase enzyme may be provided in active form, in some embodiments, the transposase enzyme is provided in inactive form. By "inactive" is meant that some additional condition is required in order to make the enzyme active. The enzyme can be activated by the user of the kit. In some embodiments, the kit also provides the necessary component to activate the enzyme. Thus, in some embodiments the kit further comprises a source of magnesium ions to activate the transposase enzyme.

Various additional components may be included in the kits. Thus, in some embodiments, the kit comprises at least one or more, up to all of: a cell suspension buffer, a bead suspension buffer, a lysis buffer, wash buffers, components of a PCR reaction, and primers.

The beads of the invention are preferably captured in droplets together with a transposase enzyme to enable production of a diffusible transposition complex that can integrate into DNA. Typically, this is achieved using a microfluidics device. This may be achieved by combining a stream containing beads with a stream containing the transposase enzyme and an oil stream to form a droplet containing the two components. Thus, there is also provided a droplet for use in single cell sequencing comprising:

a. a single bead of the invention or a single bead from the library of beads of the invention

b. a transposase enzyme.

As for the beads of the invention, although the invention is defined in relation to the properties of a single droplet, it is readily apparent that the droplets are intended to be produced during methods in which many cells are analysed in the same experiment (i.e. in massively parallel fashion). Thus, the invention also provides a droplet library. The library comprises a plurality of droplets of the invention. Typically, in the library, each of the plurality of droplets contains a single bead of the invention and a transposase enzyme. The common barcode sequence is different between the beads included in the droplets. This ensures that the origin of each sequenced molecule can be assigned to a single cell/nucleus. Typically, a single sequencing run will include amplicons from many cells and thus the barcode plays a critical role in assigning the single cell of origin of the sequence data.

It must be noted that, in order to generally achieve a single bead per droplet and to minimise droplets containing two or more beads, the beads are typically incorporated into the droplets at limiting dilution. Thus, many droplets will contain no bead at all. Nevertheless, the droplet libraries of the invention still comprise a plurality of droplets each containing a single bead of the invention and a transposase enzyme. Alternatively, the beads may be conformable beads provided at around the size of the channel, so that they can be fed into the droplets at almost exactly one per droplet and around 80% of the droplets contain a single bead.

Once both components are contained within the same droplet, the transposase enzyme cleaves the transposase target sequences from the bead to generate a diffusible transposition complex.
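The effect of loading beads at limiting dilution can be illustrated with a simple occupancy calculation. The sketch below assumes that the number of beads per droplet follows a Poisson distribution with mean `lam`; the Poisson model, the function names and the example mean of 0.1 are assumptions made for illustration and are not stated in the specification.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k beads in a droplet when the mean is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def loading_fractions(lam):
    """Fractions of droplets holding no bead, one bead, or several beads."""
    empty = poisson_pmf(0, lam)
    single = poisson_pmf(1, lam)
    return {"empty": empty, "single": single, "multiple": 1.0 - empty - single}


if __name__ == "__main__":
    for name, value in loading_fractions(0.1).items():
        print(f"{name:>8}: {value:.4f}")
```

At a low mean occupancy most droplets are empty and droplets with two or more beads are rare, which is the trade-off described above. Conformable beads fed at close to one per droplet would not follow this random-loading model.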

Preferably, the diffusible transposition complex is also brought into contact with a single cell or nucleus whilst within the droplet. Thus, in some embodiments, a droplet of the invention further comprises a single cell or a single purified nucleus within the droplet. The single purified nucleus may have been fixed prior to encapsulation within the droplet. Suitable fixing agents and processes are well known to those skilled in the art. Any cell may be amenable to transposition according to the invention. Thus, the cell may be a eukaryotic or a prokaryotic cell.

In order to lyse the cells and permit insertion into the genomic DNA, the droplet may also comprise a lysis reagent to lyse a cell. Preferably, the lysis reagent lyses the cell but not the nucleus within the cell (i.e. the nucleus remains intact). Any suitable lysis reagent may be used. In some embodiments, the lysis reagent comprises a detergent. The detergent may be a non-ionic detergent. Naturally, the number of droplets included in a library can be varied as desired by the end user dependent on the experiment or experiments to be performed. In some embodiments, the library contains at least 10², 10³, 10⁴, 10⁵, 10⁶ or 10⁷ droplets. Preferred numbers are around 10⁵, 10⁶ or 10⁷ droplets. They may be used in experiments with a corresponding number of cells (which may be, for example, around 20-fold fewer cells because the cells are introduced into the droplets at limiting dilution).

The invention also provides corresponding methods for making a barcoded diffusible transposition complex. Thus, the invention provides a method of making a barcoded diffusible transposition complex comprising combining a single bead of the invention or a single bead from the library of beads of the invention with transposase enzyme in a single droplet. Suitable microfluidics approaches to form such a droplet are known in the art. In some embodiments, vortexing may be employed in order to create a suitable emulsion. The invention also provides a diffusible barcoded transposition complex produced according to this method.

Such methods are also useful for making a library of diffusible transposition complexes. Thus, the invention also provides a method of making a library of diffusible transposition complexes comprising, for a plurality of single beads wherein each single bead comprises a different barcode sequence, combining a single bead of the invention or a single bead from the library of beads of the invention with a transposase enzyme in a single droplet. The invention also provides a library of diffusible barcoded transposition complex produced according to this method.

The invention also provides a method of generating a barcoded DNA library comprising combining the following components in a single droplet:

a. a single bead of the invention or a single bead from the library of beads of the invention

b. a transposase enzyme

c. a single cell

d. a lysis reagent.

The transposase enzyme binds to the oligonucleotides attached to the bead and cleaves them from the bead to generate a diffusible transposition complex. In parallel, the lysis reagent lyses the cell thereby enabling the generated transposition complex to insert first and second common primer binding sites and the common barcode sequence into cellular DNA, thereby generating a barcoded DNA library. Any suitable lysis reagent may be employed. As explained further herein, for applications where cell lysis is needed but the nuclei remain intact, a non-ionic detergent may be employed. Examples are commercially available and include TWEEN20, TRITONX100 and NP-40.

In some embodiments, droplets may be formed by flowing a bead together with a lysis reagent in a channel. A cell and a transposase enzyme may be flowed down a second channel to form a droplet at the interface containing the four components. This keeps the relevant components that act on one another (the transposase on the bead and the lysis reagent on the cell) separate until they are contained within the droplet.

Similarly, the invention further provides a method of generating a barcoded DNA library comprising combining the following components in a single droplet:

a. a single bead of the invention or a single bead from the library of beads of the invention

b. a transposase enzyme

c. a single purified nucleus.

Here, a lysis reagent is not needed because cell lysis is not required. In some embodiments, droplets may be formed by flowing a bead in a channel. A transposase enzyme may be flowed down a second channel to form a droplet at the interface. The purified nucleus can be included in either channel. This keeps the relevant components that act on one another (the transposase on the bead) separate until they are contained within the droplet. The transposase enzyme binds to the oligonucleotides attached to the bead and cleaves them from the bead to generate a diffusible transposition complex. The generated transposition complex then inserts first and second common primer binding sites and the common barcode sequence into cellular DNA, thereby generating a barcoded DNA library.

In general terms, the invention provides for use of a bead or library of beads as defined herein for single cell or single nucleus sequencing.

The methods for generating a barcoded DNA library have many potential applications. Advantageously, the barcoded DNA libraries are used in DNA sequencing applications. Typically, the DNA sequencing is by a next generation sequencing (NGS) method. Examples of NGS platforms include Illumina sequencing (such as HiSeq and MiSeq), SMRT sequencing (Pacific Biosciences), Nanopore sequencing, SOLiD sequencing, pyrosequencing (e.g. Roche 454), single molecule sequencing (SeqLL/Helicos) and Ion Torrent (Thermo Fisher), which are well known to the skilled person. In some embodiments, the sequencing technique employed is a short-read sequencing technique (such as Illumina sequencing).

Thus, the invention provides a method of single cell or single nucleus sequencing comprising:

a. Performing a method of the invention in order to generate a barcoded DNA library

b. Amplifying the barcoded DNA using primers that hybridise with the first and second common primer binding sites; and

c. Sequencing the amplified barcoded DNA.

Some applications may involve whole genome amplification. This is particularly relevant where the oligonucleotides insert via a non-fragmenting mechanism. Thus, the invention provides a method of single cell or single nucleus sequencing comprising:

a. Performing a method of the invention using a bead of the invention where the oligonucleotides insert by a non-fragmenting mechanism in order to generate a barcoded DNA library

b. Repairing gaps

c. Amplifying the DNA using a whole genome amplification method; and

d. Sequencing the amplified barcoded DNA.

The amplified DNA is preferably further amplified using primers binding to the common primer binding sites included in the oligonucleotides. Thus, the invention provides a method of single cell or single nucleus sequencing comprising:

b. Repairing gaps

c. Amplifying the DNA using a whole genome amplification method

d. Further amplifying the barcoded DNA using primers that hybridise with the first and second common primer binding sites; and

e. Sequencing the amplified barcoded DNA.

Gap repair can be achieved by any technique. For example, gap repair may be achieved by gap filling approaches and/or by ligation approaches (e.g. gap filling followed by ligation).

Various whole genome amplification (WGA) methods are known and may be PCR and/or multiple displacement amplification (MDA) based. MDA methods may involve use of a primase.

Once the barcoded DNA has been sequenced, the data may be processed in any desired manner. For example, the methods may include steps of clustering and assembling the sequences.
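As one illustration of such processing, sequence reads can be grouped by their barcode before any per-cell clustering or assembly. The sketch below assumes, for simplicity, that the barcode occupies a fixed number of bases at the start of each read; this read layout, the function name and the example reads are hypothetical and are not specified in the text.

```python
from collections import defaultdict


def group_by_barcode(reads, barcode_len):
    """Group reads by their leading barcode and return the remaining insert
    sequence for each read. Reads too short to hold a barcode plus at least
    one insert base are skipped."""
    groups = defaultdict(list)
    for read in reads:
        if len(read) <= barcode_len:
            continue
        groups[read[:barcode_len]].append(read[barcode_len:])
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical reads: 3-base barcode followed by insert sequence.
    reads = ["AAACGT", "AAATTG", "CCCGGA", "GG"]
    for barcode, inserts in group_by_barcode(reads, 3).items():
        print(barcode, inserts)
```

Each group then corresponds to the reads attributed to one bead, and hence to one cell or nucleus, and can be passed to downstream clustering or assembly.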

The methods of the invention are particularly suitable for single cell sequencing of environmental samples. Thus, according to such methods an environmental sample is used as the source of the single cell or single purified nucleus. An environmental sample may be of any nature. It may be a water, soil or air sample for example. It may be a clinical sample, such as a stool sample or a buccal or skin swab.

The beads of the invention are highly useful for assaying transposase accessible chromatin (ATAC). Thus, the invention provides for use of a bead or library of beads as defined herein for assaying transposase accessible chromatin (ATAC) in a single cell or single nucleus. The invention also provides a method of assaying transposase accessible chromatin (ATAC) in a single cell or single nucleus comprising:

a. Performing a method of the invention in order to generate a barcoded DNA library under conditions in which chromatin structure is maintained

b. Amplifying the barcoded DNA using primers that hybridise with the first and second common primer binding sites; and

c. Sequencing the amplified barcoded DNA.

ATAC relies upon retention of chromatin structure during the steps of creating a barcoded DNA library. In the context of chromatin, the transposase preferentially inserts the first and second common primer binding sites and the common barcode sequence into nucleosome-free regions. Thus, preferably the methods involve generating a barcoded DNA library under conditions which do not lyse the nucleus. This aids preservation of chromatin structure. Thus, this step typically does not involve exposing the cell or nucleus to an ionic (anionic or cationic) detergent (such as SDS) and/or a protease (such as proteinase K). Once the transposase has performed insertion, it is then not essential that chromatin structure is maintained throughout the remainder of the method.

ATAC-seq methods of the invention may, in some embodiments, use beads of the invention where the oligonucleotides insert by a non-fragmenting mechanism. Alternatively they may employ beads of the invention where the oligonucleotides insert by a fragmenting mechanism.

As already mentioned, the methods typically involve generating a barcoded DNA library under conditions which do not lyse the nucleus. For such applications where cell lysis is needed but the nuclei remain intact, a non-ionic detergent may be employed. Examples are commercially available and include TWEEN20, TRITONX100 and NP-40.

This allows intact cell nuclei to be recovered following generation of the barcoded DNA library (step a.) and prior to performing amplification of the barcoded DNA (step b). For example differential centrifugation may be employed in order to recover the nuclei and separate them from the released beads.

In some embodiments, the ATAC-seq methods are performed using purified single nuclei. They may be purified from tissue. Purification is preferably performed under mild conditions to minimise stress conditions applied to the cell which may affect chromatin status. The purified single nuclei may be fixed. Fixing may be helpful to preserve the native chromatin status that is then probed via the ATAC-seq methods of the invention. Suitable fixing agents and processes are well known to those skilled in the art. One example is paraformaldehyde cross-linking.

ATAC-seq methods may incorporate WGA methods, as discussed herein.

The beads of the invention are also highly useful for chromatin immunoprecipitation followed by sequencing (ChIP seq). Thus, the invention provides for use of a bead or library of beads as defined herein for chromatin immunoprecipitation followed by sequencing (ChIP seq) in a single cell or single nucleus.

The invention also provides a method of chromatin immunoprecipitation followed by sequencing (ChIP seq) in a single cell or single nucleus comprising:

a. Performing the method of the invention in order to generate a barcoded DNA library under conditions in which chromatin structure is maintained

b. Immunoprecipitation of the chromatin

c. Amplifying the barcoded DNA using primers that hybridise with the first and second common primer binding sites; and

d. Sequencing the amplified barcoded DNA.

ChIP-seq permits chromatin profiling and thus reveals information in relation to regulation of functional genomic elements. It can be used for mapping histone modifications (such as acetylation and methylation), transcription factor-DNA interactions and a range of other protein-DNA interactions across the genome.

It is preferred that the ChIP-seq methods of the invention employ beads of the invention where the oligonucleotides insert by a fragmenting mechanism. This facilitates downstream immunoprecipitation and further processing.

One of the challenges of ChIP-seq in relation to single cells is low input samples and low coverage. The beads of the invention are densely populated with oligonucleotides and provide high sequence coverage. In addition, immunoprecipitation (step b.) may be performed in the presence of carrier chromatin in order to minimise chromatin loss and avoid noise associated with low input samples. Carrier chromatin is not barcoded. It is preferably taken from a different organism to that being investigated.

ChIP-seq relies upon retention of chromatin structure during the steps of creating a barcoded DNA library. In the context of chromatin, the transposase preferentially inserts the first and second common primer binding sites and the common barcode sequence into nucleosome-free regions. Thus, preferably the methods involve generating a barcoded DNA library under conditions which do not lyse the nucleus. This aids preservation of chromatin structure. Thus, this step typically does not involve exposing the cell or nucleus to an ionic (anionic or cationic) detergent (such as SDS) and/or a protease (such as proteinase K). Once the transposase has performed insertion, it is then not essential that chromatin structure is maintained throughout the remainder of the method.

For such applications where cell lysis is needed but the nuclei remain intact, a non-ionic detergent may be employed. Examples are commercially available and include TWEEN20, TRITONX100 and NP-40. This allows intact cell nuclei to be recovered following generation of the barcoded DNA library (step a.) and prior to chromatin immunoprecipitation (step b.). For example, differential centrifugation may be employed in order to recover the nuclei and separate them from the released beads.

In some embodiments, the ChIP-seq methods are performed using purified single nuclei. They may be purified from tissue. Purification is preferably performed under mild conditions to minimise stress conditions applied to the cell which may affect chromatin status. The purified single nuclei may be fixed. Fixing may be helpful to preserve the native chromatin status that is then probed via the ChIP-seq methods of the invention. Suitable fixing agents and processes are well known to those skilled in the art. One example is paraformaldehyde cross-linking.

For ATAC-seq and ChIP-seq applications it is important that the barcoded library is generated with chromatin in its native state. However, the beads of the invention have other applications in which it is advantageous to remove chromatin (nucleosomes etc.) which otherwise interfere with the transposition process. This is particularly the case where unbiased insertions into the genome, or dense coverage, are required. To this end the invention provides a method of preparing cellular nucleic acids from a single cell or nucleus for sequencing comprising:

a. Encapsulating the single cell or nucleus in a gel bead

b. Adding a cell lysis reagent to the gel bead in order to lyse the cell to leave cellular nucleic acids entrapped within the gel bead.

This method produces gel beads containing purified single cell genomes which can then be used as a substrate for transposition using the beads of the invention. Advantageously, the gel beads containing purified single cell genomes and beads of the invention can be brought together in a droplet to facilitate transposition and single cell analysis.

Typically the method additionally comprises, following step b:

c. Washing the bead in order to remove the cell lysis reagent, whilst retaining the entrapped cellular nucleic acids.

The entrapped cellular nucleic acids are preferably high molecular weight molecules.

Thus, the washing step may remove low molecular weight molecules (e.g. mRNA, miRNA etc.) but this is not detrimental to the applications of the gel beads. The cellular nucleic acids thus generally comprise DNA, in particular genomic DNA. Preferably the genomic DNA is not fragmented.

According to these methods, the single cell may be provided in a droplet and thus the encapsulation step (step a.) comprises encapsulating a single cell in a droplet in a single gel bead. The gel may advantageously be a hydrogel.

The encapsulation step (step a.) may comprise the steps of:

i. encapsulating a single cell with a liquid capable of forming a gel bead, wherein the liquid forms a droplet surrounded by oil

ii. forming the gel bead to encapsulate the single cell

iii. isolating the gel bead from the oil

Encapsulation may be achieved microfluidically. For example, encapsulation may be achieved by flowing a single cell in a liquid capable of forming a gel bead into an oil flow such that the oil surrounds the liquid containing the single cell. The liquid can be any suitable liquid that can be used to form a gel. In one embodiment, a gel is heated to form a liquid that can then be set to form a gel bead. One suitable material is agarose gel which is heated to a molten state for encapsulation. Forming the gel bead may, therefore, rely on a cooling step for the gel to set. Other suitable materials include acrylamide and gelatine. The cell lysis reagent (used in step b.) digests chromatin to leave free DNA entrapped within the gel bead. Thus, in contrast to the mild detergents adopted in earlier methods where chromatin structure was to be retained, more harsh cell lysis reagents may be adopted. In some embodiments, the cell lysis reagent comprises a detergent and a protease. In specific embodiments, the cell lysis reagent comprises sodium dodecyl sulfate and/or proteinase K.

These methods are applicable to any cell type of interest. The cell may be a eukaryotic or prokaryotic cell. The methods are intended to be performed in relation to many individual cells in a single overall method. Thus, these methods may be performed for at least 10³, 10⁴, 10⁵ or 10⁶ cells in parallel.

These methods are very useful for preparing DNA for single cell bisulfite sequencing.

Single cell bisulfite sequencing can provide information about epigenetic modifications to DNA, such as the methylation status of the promoter of a gene. Thus, the methods may further comprise, following step b., adding a bisulfite reagent to the bead in order to convert unmethylated cytosine residues to uracil. Bisulfite is a known reagent which selectively modifies unmethylated cytosine residues but which does not convert methylated cytosine residues. Thus, sequencing can then reveal whether a particular cytosine residue was methylated or not within the cell.
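The logic of reading methylation status from bisulfite-treated DNA can be sketched as follows. After conversion and amplification, uracil is read as thymine, so an unmethylated cytosine appears as T while a methylated cytosine still appears as C. The function names, the toy sequence and the assumption of complete conversion on a single strand are illustrative simplifications and are not taken from the specification.

```python
def bisulfite_convert(seq, methylated_positions):
    """Return the sequence as read after bisulfite treatment and
    amplification: unmethylated C reads as T, methylated C stays C."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference, converted):
    """Map each reference cytosine position to True (methylated) or
    False (unmethylated) by comparing it with the converted read."""
    if len(reference) != len(converted):
        raise ValueError("reference and converted read must be the same length")
    return {
        i: converted[i] == "C"
        for i, base in enumerate(reference)
        if base == "C"
    }


if __name__ == "__main__":
    reference = "ACGTCCGA"            # toy sequence, for illustration only
    read = bisulfite_convert(reference, methylated_positions={1})
    print(read)
    print(call_methylation(reference, read))
```

Real data would additionally require alignment of converted reads and handling of incomplete conversion, which this sketch does not attempt.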

There is more than one form of methylation that can be usefully investigated. Thus, in some embodiments, 5-hydroxymethylated cytosine residues can be distinguished from 5-methylcytosine residues. Accordingly, the methods may further comprise, following step b., performing a chemical oxidation in order to convert hydroxymethylated cytosine residues to formylcytosine residues. This may be followed by adding a bisulfite reagent to the bead in order to convert unmethylated cytosine residues and formylcytosine residues to uracil. This treatment, as already explained, does not convert methylated cytosine residues.
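One way to picture how the oxidation step separates the two modifications is to compare the base read at a cytosine position after bisulfite treatment alone with the base read after oxidation followed by bisulfite treatment. The sketch below assumes that such a paired, bisulfite-only readout is available for the same position, which the text does not specify; the table and function names are hypothetical.

```python
# Expected base read at a cytosine position for each modification state:
# (bisulfite only, oxidation followed by bisulfite).
READOUT = {
    "C": ("T", "T"),      # unmethylated: converted in both treatments
    "5mC": ("C", "C"),    # methylated: protected in both treatments
    "5hmC": ("C", "T"),   # hydroxymethylated: converted only after oxidation
}


def classify_cytosine(bs_base, oxbs_base):
    """Infer the modification state from the paired readouts, or return
    'ambiguous' if the combination matches no expected pattern."""
    for state, expected in READOUT.items():
        if expected == (bs_base, oxbs_base):
            return state
    return "ambiguous"


if __name__ == "__main__":
    for pair in [("T", "T"), ("C", "C"), ("C", "T"), ("T", "C")]:
        print(pair, classify_cytosine(*pair))
```

A position read as C without oxidation but as T with oxidation is thus attributed to 5-hydroxymethylcytosine, whereas a position read as C in both cases is attributed to 5-methylcytosine.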

These methods produce gel beads comprising cellular nucleic acids in an accessible format for use with the beads of the invention that carry oligonucleotides for transposition. Thus, the invention further provides a gel bead, or gel beads, comprising cellular nucleic acids produced according to these methods.

Similarly, the invention provides a method of generating a barcoded DNA library comprising combining the following components in a single droplet:

a. a single gel bead comprising cellular nucleic acids produced according to a method of the invention

b. a single bead of the invention or a single bead from a library of beads of the invention as defined herein

c. a transposase enzyme.

The transposase enzyme binds to the oligonucleotides attached to the bead and cleaves them from the bead to generate a diffusible transposition complex. The generated transposition complex inserts first and second common primer binding sites and the common barcode sequence into cellular DNA, thereby generating a barcoded DNA library.

Once such a barcoded DNA library has been generated it has various applications. There is provided a method of single cell or single nucleus sequencing comprising:

a. Performing a method of generating a barcoded DNA library using a single gel bead of the invention as the source of the cellular DNA

c. Sequencing the amplified barcoded DNA.

a. Performing a method of generating a barcoded DNA library using a single gel bead of the invention in combination with a bead of the invention where the oligonucleotides insert by a non-fragmenting mechanism in order to generate a barcoded DNA library

b. Repairing gaps

c. Amplifying the DNA using a whole genome amplification method; and

d. Sequencing the amplified barcoded DNA.

The amplified DNA is preferably further amplified using primers binding to the common primer binding sites included in the oligonucleotides. Thus, the invention provides a method of single cell or single nucleus sequencing comprising:

b. Repairing gaps

e. Sequencing the amplified barcoded DNA.

Gap repair can be achieved by any technique. For example, gap repair may be achieved by gap filling approaches and/or by ligation approaches (e.g. gap filling followed by ligation).

Various whole genome amplification (WGA) methods are known and may be PCR and/or multiple displacement amplification (MDA) based. MDA methods may involve use of a primase.

These methods of the invention (involving gel beads containing the cellular DNA of interest) are particularly suitable for single cell sequencing of environmental samples. Thus, according to such methods an environmental sample is used as the source of the single cell or single purified nucleus. An environmental sample may be of any nature. It may be a water, soil or air sample for example. It may be a clinical sample, such as a stool sample or a buccal or skin swab.

The beads of the invention are highly useful for bisulphite sequencing applications. Thus, the invention provides for use of a bead or library of beads as defined herein for bisulphite sequencing in a single cell or single nucleus. Preferably this also involves use of the gel beads produced according to the invention as the source of the cellular DNA.

Accordingly, the invention provides a method of bisulphite sequencing in a single cell or single nucleus comprising:

c. Sequencing the amplified barcoded DNA.

Preferably step a. comprises performing a method of generating a barcoded DNA library using a single gel bead of the invention in combination with a bead of the invention as discussed herein.

The beads of the invention are highly useful for general DNA sequencing applications when applied to a single cell or nucleus. Thus, the invention provides for use of a bead or library of beads as defined herein for DNA sequencing in a single cell or single nucleus.

Preferably this also involves use of the gel beads produced according to the invention as the source of the cellular DNA.

Similarly, the invention also provides a method of DNA sequencing in a single cell or single nucleus comprising:

c. Sequencing the amplified barcoded DNA.

The beads of the invention are also useful for determining copy number variations in DNA by sequencing when applied to a single cell or nucleus. Thus, the invention provides for use of a bead or library of beads as defined herein for copy number variation DNA sequencing in a single cell or single nucleus. Preferably this also involves use of the gel beads produced according to the invention as the source of the cellular DNA.

Similarly, the invention also provides a method of copy number variation sequencing in a single cell or single nucleus comprising:

b. Amplifying the barcoded DNA using primers that hybridise with the first and second common primer binding sites

c. Sequencing the amplified barcoded DNA; and

d. Determining the copy number of one or more DNA sequences.

Each of the foregoing methods may be performed using a purified single nucleus. The cell of interest (or nucleus derived therefrom) can be a eukaryotic or prokaryotic cell. As already mentioned, the methods of the invention generally involve sequencing using a next generation sequencing technique.

Examples of NGS platforms include Illumina sequencing (such as HiSeq and MiSeq), SMRT sequencing (Pacific Biosciences), Nanopore sequencing, SOLiD sequencing, pyrosequencing (e.g. Roche 454), single molecule sequencing (SeqLL/Helicos) and Ion Torrent (Thermo Fisher), which are well-known to the skilled person. In some

As the skilled person is aware, many NGS platforms rely upon amplification products that incorporate specific sequences, often termed adaptor sequences. The adaptor sequence is complementary to an oligonucleotide immobilised on a suitable solid surface (the nature of which depends on the sequencing platform, such as a flow cell (Illumina), zero mode waveguide (SMRT) or bead (pyrosequencing)) for sequencing. Thus, in some embodiments, the methods of the invention incorporate sequencing adaptors into the amplification products. This facilitates subsequent sequencing of the DNA. This may be achieved by including adaptor sequences in the primers that hybridise with the first and second common primer binding sites, as would be readily understood by one skilled in the art.

As already discussed, the barcode sequences included in the oligonucleotides are used to identify the single cell or single nucleus from which a given sequence originates. This is an important step in the methods of the invention which allow single cell or single nucleus resolution.

It is also possible for certain applications that the oligonucleotides may comprise a unique molecular identifier (UMI). UMIs are used to identify the molecule from which a given sequence originates. A UMI is located between the common primer binding site and transposase target sequence. Conveniently, the UMI is located downstream of the common barcode sequence but it may be located upstream of the common barcode sequence. The UMI is preferably different on each oligonucleotide on a single bead. Any UMI sequence may be used. A useful method for synthesising UMIs is by degenerate oligonucleotide synthesis (e.g. as described in Macosko et al), although other methods may be employed. The UMI is typically at least 4 nucleotides in length, such as between 5 and 10 nucleotides in length, preferably 8 nucleotides in length. Depending upon how many oligonucleotides are attached to each bead, it is possible that the UMI will be repeated at least once, and potentially more than once, across different oligonucleotides attached to the bead. For example, if the UMI is 8 nucleotides in length there will be 4⁸ (65,536) different sequences available. If there are 10⁹ oligonucleotides attached to a bead, there may be repetition of the UMI. However, this is not detrimental to use of the UMIs because the number of DNA molecules captured by the oligonucleotides attached to a bead is much less than 10⁹ (likely in the order of 10⁴).
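The UMI arithmetic in this passage can be illustrated with a short calculation. The following is a minimal sketch only: it assumes that UMIs are drawn uniformly at random, which the specification does not state, and the function names are hypothetical. The figures of 8 nucleotides, 10⁹ oligonucleotides per bead and 10⁴ captured molecules are the ones quoted above.

```python
# Illustrative sketch: UMI diversity for the figures quoted in the text.
# Assumption (not from the specification): UMIs are uniformly random.

UMI_LENGTH = 8               # preferred UMI length from the text
OLIGOS_PER_BEAD = 10**9      # illustrative loading quoted in the text
CAPTURED_MOLECULES = 10**4   # order of magnitude of captured DNA molecules


def umi_space(length: int) -> int:
    """Number of distinct UMI sequences of the given length (4 bases)."""
    return 4 ** length


def expected_distinct(n_draws: int, space: int) -> float:
    """Expected number of distinct UMIs among n_draws uniform random draws."""
    return space * (1.0 - (1.0 - 1.0 / space) ** n_draws)


space = umi_space(UMI_LENGTH)
copies_per_umi = OLIGOS_PER_BEAD / space
distinct = expected_distinct(CAPTURED_MOLECULES, space)

print(f"UMI sequences available: {space}")
print(f"Average oligos per UMI on one bead: {copies_per_umi:.0f}")
print(f"Expected distinct UMIs among captured molecules: {distinct:.0f}")
```

The first two values show why repetition across the oligonucleotides on a bead is unavoidable at this loading, while the third estimates how many of the captured molecules would still carry a distinct UMI under the stated assumption.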

One particular utility of sequencing techniques is the ability to quantify the sequences. Many NGS platforms rely on digital analysis of the sequences. Thus, all sequencing methods of the invention may comprise quantifying the respective sequences. For example, the methods may allow variations in copy number to be measured.

DESCRIPTION OF THE FIGURES

Figure 1 shows a preferred "fragmenting" oligonucleotide construction in the context of a bead of the invention together with approximate nucleotide dimensions of each component.

Figure 2 shows schematically a transposase dimer bound to both transposase target sequences of the oligonucleotide shown in Figure 1 simultaneously.

Figure 3 shows a ribbon diagram of a transposase dimer with the transposase target sequences of the oligonucleotide shown in Figure 1 overlaid.

Figure 4 shows, in schematic form, the transposase dimer acting on the oligonucleotide and the formation of the separated primer binding site containing oligonucleotides.

Figure 5 shows schematically the insertion of the separated primer binding site containing oligonucleotides into genomic DNA via the active transposition complex (40).

Figure 6 shows a preferred "non-fragmenting" oligonucleotide construction in the context of a bead of the invention together with approximate nucleotide dimensions of each component.

Figure 7 shows schematically a transposase dimer bound to both transposase target sequences of the oligonucleotide shown in Figure 6 simultaneously.

Figure 8 shows a ribbon diagram of a transposase dimer with the transposase target sequences of the oligonucleotide shown in Figure 6 overlaid.

Figure 9 shows schematically the insertion of the oligonucleotide, which has been cleaved from the bead by the transposase, into genomic DNA via the active transposition complex (91).

Figure 10 shows an alternative oligonucleotide construction in the context of a bead of the invention.

Figure 11 shows schematically the oligonucleotides, as shown in Figure 10, released from the bead following cleavage of the linker.

Figure 12 shows an active transposition complex comprising a transposase dimer and the oligonucleotides formed following release of the first and second oligonucleotides.

Figure 13 shows, in schematic form, a split and pool synthesis scheme for making barcoded oligonucleotides on beads.

Figure 14 shows, in schematic form, a more detailed example of how an exemplary two-transposase-site barcoded construct, such as that shown in Figure 1, can be synthesised on beads.

Figure 15 shows a schematic of a synthesis scheme for making beads for generating barcoded transposomes, where the oligos are cleavable.

DETAILED DESCRIPTION

Figure 1 shows a preferred oligonucleotide construction in the context of a bead of the invention together with approximate nucleotide dimensions of each component. This generates a fragmenting active transposition complex.

A bead for single cell sequencing (1) comprises a plurality of oligonucleotides (2) attached to a bead (3) via a non-cleavable linker (4), such as a C16 linker. Each oligonucleotide (2) comprises, in proximal to distal order from the bead (3):

• a terminal spacer sequence (5), typically of around 20 nucleotides in length

• a first transposase target sequence (6) that incorporates a transposase cleavage site (7) at the bead proximal end, typically of around 19 nucleotides in length

• a common barcode sequence (8) for single cell identification, typically of around 12 nucleotides in length

• a first primer binding site (9), typically of around 20 nucleotides in length

• an intervening spacer sequence (10) that facilitates transposase dimer binding to first and second transposase target sequences. Generally, the intervening spacer sequence must be of sufficient length to allow the cleavage complex to form. This intervening spacer sequence is single stranded and around 40 nucleotides in length (can be 40-60 nucleotides)

• a second transposase target sequence (11) that incorporates a transposase cleavage site (12) at the bead proximal end, typically of around 19 nucleotides in length

• a second primer binding site (13), typically of around 20 nucleotides in length.
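The layout listed above can be written down as an ordered data structure to check the overall length of the construct. This is an illustrative sketch only: the lengths are the approximate values given in the list, and the variable and function names are hypothetical.

```python
# Illustrative sketch: the "fragmenting" construct of Figure 1 as an ordered
# list of (component, approximate length in nt), proximal to distal.
# Lengths are the approximate values quoted in the text.

FRAGMENTING_CONSTRUCT = [
    ("terminal spacer (5)", 20),
    ("first transposase target sequence (6)", 19),
    ("common barcode (8)", 12),
    ("first primer binding site (9)", 20),
    ("intervening spacer (10)", 40),
    ("second transposase target sequence (11)", 19),
    ("second primer binding site (13)", 20),
]


def total_length(construct):
    """Sum of the approximate component lengths."""
    return sum(length for _, length in construct)


def start_offsets(construct):
    """Distance in nt from the linker to the start of each component."""
    offsets, position = {}, 0
    for name, length in construct:
        offsets[name] = position
        position += length
    return offsets


print("Approximate total length:", total_length(FRAGMENTING_CONSTRUCT), "nt")
for name, offset in start_offsets(FRAGMENTING_CONSTRUCT).items():
    print(f"{offset:>4} nt  {name}")
```

With a 60 nucleotide intervening spacer, the upper end of the stated range, the same sum comes to 170 nt.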

In this arrangement, the oligonucleotides (2) can be attached to the bead at their 3' or 5' end. If attached at the 3' end, each transposase target sequence (6, 11) may be double stranded but the other components of the oligonucleotide can all be single stranded. If attached at the 5' end, each of the transposase target sequences (6, 11), primer binding sites (9, 13) and barcode (8) is double stranded and at least part of the intervening spacer can be single stranded.

As shown schematically in Figure 2, this oligonucleotide arrangement permits a transposase dimer (20) to bind to both transposase target sequences simultaneously. The two primer binding sites are marked for orientation purposes. Approximate nucleotide distances between the various components are also shown. The importance of the intervening spacer sequence (10) is clearly seen to facilitate transposase dimer binding to first and second transposase target sequences simultaneously. The functionality of the terminal spacer sequence (5) is also shown, which prevents the bead from interfering with transposase binding. This arrangement is further displayed in the ribbon diagram of Figure 3.

Figure 4 shows, in schematic form, the transposase dimer (20) acting on the oligonucleotide (2) to generate an active transposition complex (40). Following binding of the oligonucleotide (2) by the transposase dimer (20), the enzyme cleaves the two cleavage sites (7, 12), thus separating the oligonucleotide (2) into separate parts (41, 42), each containing a primer binding site and one part containing the barcode sequence. The bead (3) is also released.

Figure 5 shows the insertion of the separated primer binding site containing oligonucleotides (41, 42) into genomic DNA (50) via the active transposition complex (40). In this example, the genomic DNA is packaged in chromatin, with the DNA wound around nucleosomes (51). Thus, the regions (52) between the nucleosomes are accessible to the transposition complex whereas the nucleosomic regions (51) are relatively inaccessible. The active transposition complex (40) inserts the separated primer binding site containing oligonucleotides (41, 42) into the accessible genomic DNA (52) to generate a barcoded DNA library (53) that can be amplified for subsequent sequencing via universal primers that hybridise to the primer binding sites. A nucleosome (51) is now located between the primer binding sites and can be investigated via ChIP-seq at the single cell level (because the primer binding sites are close enough together).

Figure 6 shows a preferred oligonucleotide construction in the context of a bead of the invention together with approximate nucleotide dimensions of each component. This generates a non-fragmenting active transposition complex.

A bead for single cell sequencing (61) comprises a plurality of oligonucleotides (62) attached to a bead (63) via a non-cleavable linker (64), such as a C16 linker. Each oligonucleotide (62) comprises, in proximal to distal order from the bead (63):

• a terminal spacer sequence (65), typically of around 20 nucleotides in length

• a first transposase target sequence (66) that incorporates a transposase cleavage site (67) at the bead proximal end, typically of around 19 nucleotides in length

• a first primer binding site (68), typically of around 20 nucleotides in length

• a short intervening spacer sequence (69) that facilitates transposase dimer binding to first and second transposase target sequences. Generally, the intervening spacer sequence must be of sufficient length to allow the cleavage complex to form.

However, in view of the orientation of the components of this oligonucleotide only a minimal intervening sequence is needed, which may be less than 10 nucleotides in length. Indeed, no separate sequence may be needed in some embodiments; the second primer binding site and barcode may provide sufficient sequence to enable the relevant conformation to be adopted.

• a second primer binding site (610), typically of around 20 nucleotides in length

• a common barcode sequence (611) for single cell identification, typically of around 12 nucleotides in length

• a second transposase target sequence (612), typically of around 19 nucleotides in length. In this embodiment, the target sequence does not need to be cleaved by the transposase enzyme because it is synthesised to include the appropriate end sequence (and including a 5' phosphate). However, alternatively, the oligonucleotides may incorporate a transposase cleavage site at the bead distal end in some embodiments.

In this arrangement, the oligonucleotides (62) are attached to the bead at their 5' end. Each transposase target sequence (66, 612) is double stranded, as is the first primer binding site (68). The other components of the oligonucleotide can all be single stranded.

As shown schematically in Figure 7, this oligonucleotide arrangement permits a transposase dimer (70) to bind to both transposase target sequences simultaneously. The two primer binding sites (68, 610) are marked for orientation purposes. Approximate nucleotide distances between the various components are also shown. The functionality of the terminal spacer sequence (65) is shown, which prevents the bead from interfering with transposase binding. The figure shows the bead being cleaved from the oligonucleotide at the cleavage site (67) within the first transposase target sequence. This arrangement is further displayed in the ribbon diagram of Figure 8.

Figure 9 shows the insertion of the oligonucleotide (62) into genomic DNA (90) via the active transposition complex (91). In this example, the genomic DNA is packaged in chromatin, with the DNA wound around nucleosomes (92). Thus, the regions (93) between the nucleosomes are accessible to the transposition complex whereas the nucleosomic regions (92) are relatively inaccessible. The active transposition complex (91) inserts the oligonucleotide sequence between and including the transposase target sequences (66, 612) into the accessible genomic DNA (93) to generate a barcoded DNA library (94) that can be amplified for subsequent sequencing via universal primers that hybridise to the primer binding sites. A nucleosome (92) is now located between the primer binding sites of neighbouring oligonucleotides. The pattern of insertions can be analysed in an ATAC-seq method at the single cell level. An amplification product between neighbouring insertions (i.e. directed by a primer from each insertion) is shown (95).

Figure 10 shows an alternative oligonucleotide construction in the context of a bead of the invention. This relies on two separate oligonucleotides attached to the bead and generates a fragmenting active transposition complex.

A bead for single cell sequencing (101) comprises a plurality of first oligonucleotides (102) and second oligonucleotides (103) each attached to a bead (104) via a photo-cleavable linker (105). Each first oligonucleotide (102) comprises, in proximal to distal order from the bead (104):

• a terminal spacer sequence (106), typically of around 20 nucleotides in length

• a first transposase target sequence (107) that incorporates a transposase cleavage site at the bead proximal end, typically of around 19 nucleotides in length

• a first primer binding site (108), typically of around 20 nucleotides in length

Each second oligonucleotide (103) comprises, in proximal to distal order from the bead (104):

• a terminal spacer sequence (109), typically of around 20 nucleotides in length

• a second transposase target sequence (110) that incorporates a transposase cleavage site at the bead proximal end and is typically of around 19 nucleotides in length

• a common barcode sequence (111) for single cell identification, typically of around 12 nucleotides in length

• a second primer binding site (112), typically of around 20 nucleotides in length.

In this arrangement, the oligonucleotides (102, 103) may be attached to the bead at their 3' or 5' end. If attached at their 3' end, each transposase target sequence (107, 110) is double stranded but the other components of the oligonucleotide can all be single stranded. If attached at their 5' end, all functional components may be double stranded.

Due to the inclusion of a photo-cleavable linker (105) between the bead (104) and the oligonucleotides (102, 103), the transposase enzyme does not need to bind to the oligonucleotides whilst they are attached to the bead. Thus, the terminal spacer sequence may be short or, if desired, dispensed with altogether (provided a functional transposase target sequence is retained). Cleavage at the linker releases the first and second oligonucleotides (102, 103), which can then be complexed with the transposase enzyme. This release is shown in Figure 11. One disadvantage of this construction is that the oligonucleotides are released from the bead before the active transposition complex is formed with the transposase enzyme. Thus, there is less control over the interaction between the two monomers of the transposase enzyme dimer and the first and second oligonucleotides respectively. Thus, in some embodiments, instead of a photo-cleavable linker, the linker is not (photo) cleavable. Instead, generation of the active transposition complex relies upon the transposase enzyme binding to the respective transposase target sequences included in the first and second oligonucleotides to cleave the first and second oligonucleotides from the bead (not shown).

Figure 12 shows an active transposition complex (120) comprising a transposase dimer (121) and the oligonucleotides (102, 103) formed following release of the first and second oligonucleotides (102, 103). Because the oligonucleotides (102, 103) are separated, insertion will fragment the target DNA.

The oligonucleotide constructs of the invention are preferably synthesised by assembling each oligonucleotide from smaller oligonucleotides. The oligonucleotides may be assembled by DNA polymerase-mediated assembly, but other routes are also possible, such as assembling by ligation, e.g. with a ligase, e.g. a thermally stable ligase. Enzymatic assembly of smaller oligonucleotides may be preferred, as shorter oligonucleotides (e.g., 100nt or under, or 80 nt or under, or 60nt or under) are currently easier and cheaper to synthesise, with a higher yield.

Figure 13 shows, in schematic form, a split and pool synthesis scheme for making barcoded oligonucleotides on beads. The synthesis scheme relies on using enzymatic reactions to assemble synthetic oligonucleotides. The reactions may be carried out in multiwell plates (not shown). A similar synthesis scheme is described in Klein et al (Klein et al., 2015; Zilionis et al., 2017).

In the example here, the barcode is assembled from three fragments, each having 4nt of barcode, so that the total barcode size will be 12 nt, for a library size of 4¹², or about 16.8 million. It is advantageous to use 3 lots of 256 oligos, because each step fits easily in 96 or 384 well plates, but the total library size is still very large. It is further advantageous to use a short (e.g., 20nt, or 15nt, or 10nt) adaptor sequence, because it minimises the length of the sequencing read taken up by reading the barcode. In the end, the barcode will be interrupted by two intervening sequences (i.e., the adaptor sequences that were required for assembly), but this is not a significant disadvantage, and in fact is likely to be slightly advantageous: some of the possible barcodes contain homopolymeric tracts, and these would be advantageously interrupted by the intervening sequences. Also, there is a certain error rate in oligo synthesis, which degrades the quality of the library, and the intervening sequences, because they are of known sequence at known positions, will provide some error correction.
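The library size and read overhead described here follow from simple arithmetic, sketched below. The 10 nt adaptor length is an assumption taken from the shortest option mentioned in the text, and all names are hypothetical.

```python
# Illustrative sketch: split-and-pool barcode arithmetic.
# ADAPTOR_NT = 10 is an assumption (the shortest adaptor length mentioned).
import math

FRAGMENT_NT = 4      # barcode nucleotides added per round
ROUNDS = 3           # number of split-and-pool rounds
ADAPTOR_NT = 10      # assumed length of each intervening adaptor sequence

oligos_per_round = 4 ** FRAGMENT_NT                # wells needed per round
library_size = oligos_per_round ** ROUNDS          # distinct full barcodes
barcode_nt = FRAGMENT_NT * ROUNDS                  # informative barcode bases
read_nt = barcode_nt + (ROUNDS - 1) * ADAPTOR_NT   # read span incl. adaptors
plates_96 = math.ceil(oligos_per_round / 96)       # 96-well plates per round
plates_384 = math.ceil(oligos_per_round / 384)     # 384-well plates per round

print(f"Oligos (wells) per round: {oligos_per_round}")
print(f"Library size: {library_size:,}")
print(f"Barcode bases: {barcode_nt} nt, read span with adaptors: {read_nt} nt")
print(f"Plates per round: {plates_96} x 96-well or {plates_384} x 384-well")
```

The calculation shows the trade-off the text describes: only 768 synthesised barcode oligos in total (3 x 256) yield a library of several million barcodes, at the cost of reading two intervening adaptor sequences.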

In one implementation, shown in Figure 13A, a first oligo (131) may be synthesised by 'reverse' chemical synthesis on beads, so that the 5' ends of the oligos are attached to the bead, and the 3' ends are available to be extended by a polymerase. The first oligo comprises at least the first transposase site (132). The beads with the first oligo are aliquoted into 256 wells of one or more multiwell plates. Next, the first 4nt of the barcode (133) is synthesised as a library of 256 oligos (4⁴); the 4nt barcode fragment is flanked 5' and 3' by adaptor sequences (134, 135) that allow assembly, by annealing to complementary sequences on flanking oligos. The 256 oligos are released (not shown) and aliquoted into the 256 wells, one oligo (i.e., a unique 4nt barcode fragment) into each well. The barcode fragments are annealed to the first oligo on the bead, and extended with a polymerase to generate a first extended product (136). The beads in one well will all receive the same first barcode fragment sequence, but beads in different wells will have received different sequences. The beads from the different wells are then pooled (not shown). At this point, the oligo is double stranded, so the second strand is washed away (e.g., with NaOH). The beads are then aliquoted into 256 wells again.

In Figure 13B, a second set of 256 barcode fragments (137) are synthesised, again as 4nt fragments flanked by adaptor sequences, and aliquoted into the 256 wells, one barcode fragment per well. The oligos are assembled, so now each well contains beads, where each bead contains a different first barcode fragment, but the same second barcode fragment, but the oligos on each bead are identical to each other (138). The beads are pooled, the second strand washed away, and the library is aliquoted into 256 wells.
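The split, extend and pool cycle described for Figures 13A and 13B can be sketched as a small simulation. This is an illustration under stated assumptions, not the synthesis method itself: beads are assumed to distribute at random between wells, the adaptor sequences between fragments are omitted, and the function name is hypothetical.

```python
# Illustrative simulation of split-and-pool barcoding.
# Assumptions: random distribution of beads between wells; adaptor
# sequences between the barcode fragments are left out for brevity.
import itertools
import random

# One 4 nt fragment per well: 4**4 = 256 wells.
FRAGMENTS = ["".join(p) for p in itertools.product("ACGT", repeat=4)]


def split_and_pool(n_beads: int, rounds: int = 3, seed: int = 0) -> list:
    """Return the barcode carried by each bead after the given rounds."""
    rng = random.Random(seed)
    barcodes = [""] * n_beads
    for _ in range(rounds):
        # Split: each bead lands in one of the 256 wells.
        wells = [rng.randrange(len(FRAGMENTS)) for _ in range(n_beads)]
        # Extend: every bead in a well receives that well's fragment.
        barcodes = [bc + FRAGMENTS[w] for bc, w in zip(barcodes, wells)]
        # Pool: mixing is modelled by drawing fresh wells in the next round.
    return barcodes


beads = split_and_pool(10_000)
print("Example barcodes:", beads[:3])
print(len(set(beads)), "distinct barcodes among", len(beads), "beads")
```

Because all oligos on one bead pass through the same sequence of wells, they share one barcode, while different beads almost always follow different paths.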

In Figure 13C, a third set of 256 barcode fragments is synthesised, this time also containing the first PCR handle (139), second transposase site (1310), and second PCR handle (1311). In practice, this oligo may be supplied as two overlapping oligos (the second one being common), to keep the length of the oligos down. The 256 barcode fragments are then aliquoted into the 256 wells, one oligo per well, and assembled onto the beads. The bead library is then pooled, and the second strand washed away to generate the final oligonucleotide construct (1312). This is a "fragmenting" bead of the invention, such as is shown in Figure 1.

Other implementations are possible. For example, the oligo construct may be attached to the bead at its 3' end, and the oligo assembled with a ligase; the first oligo may be synthesised by 'forward' chemical DNA synthesis on another bead, cleaved off, and joined to the destination bead, either in bulk, or after aliquoting into wells, etc.

Figure 14 shows, in schematic form, a more detailed example of how an exemplary two-transposase-site barcoded construct, such as that shown in Figure 1, can be synthesised on beads. In practice, the transposome construct may be assembled from 5 oligos, each numbered in Figure 14A (140-144) and with the lengths as shown, so as to minimise the total length of the oligo.

As shown in Figure 14B, in order to make the transposase sites, PCR handles and barcodes all double stranded, two oligos are annealed. One oligo (145) is annealed to the first PCR handle and extended, to make the barcode and first transposase site double stranded. A second oligo (146) can then be annealed to the second transposase site and second PCR handle; this one is not extended, so as to leave the intervening spacer sequence single stranded. For convenience, this one may be blocked (147) so that it can't be extended, and added at the same time as the first.

Figure 15 shows a schematic of a synthesis scheme for making beads for generating barcoded transposomes, where the oligos are cleavable (e.g., photocleavable), such as shown in Figure 10 (with the same labelling applicable). With these beads, one implementation has two pluralities of oligos on the bead, a first plurality with a first transposase site and a first PCR handle, and a second plurality of oligos with a second transposase site, a barcode, and a second PCR handle. The schematic shows a scheme to synthesise a bead where there are two distinct oligos, one of which has a barcode, where the barcode is made by a split and pool approach with pre-synthesised oligos, and enzymatic assembly.

First, a bead is functionalised with a mixture of two oligos, where the first plurality of oligos is the full-length first plurality mentioned above, containing a first transposase site and a first PCR handle (150). The second plurality of oligos (151) are each incomplete, having only the second transposase site. These beads are split, e.g. by aliquoting out into 256 wells of one or more multi-well plates (for the sake of clarity, the first plurality of oligos (150) is not shown in subsequent drawings (dashed line, 152)). A set of 256 oligos with a first barcode fragment are made (153), where each oligo has a 4nt barcode fragment, flanked by adaptor sequences, where the 3' adaptor sequence anneals to the second transposase site in the second plurality of oligos (151), and the barcode fragment is used as a template in a polymerase reaction, so as to template the addition of the reverse complement of the barcode sequence fragment, to the second plurality of oligos on the bead. It doesn't template the addition of any sequence to the first plurality of oligos on the bead, because the 3' end of that oligo does not anneal to the barcode fragment template oligos (not shown). The 256 barcode fragment oligos are aliquoted, one per well, into the 256 wells, and used to template the addition of the first fragment of the barcode. The beads are pooled, the second strand (154) washed away, and the beads are split between 256 wells of one or more multi-well plates.

A second set of 256 barcode fragment oligos is synthesised (155), where the 3' flanking sequence anneals to the 3' end adaptor sequence (156) of the oligo on the bead (which was templated by the 5' flanking adaptor sequence of the first 256 barcode fragment oligos). These oligos are aliquoted, one per well, into the 256 wells, and used to template the addition of the second barcode fragment to the oligos on the beads. The beads are pooled, the second strand washed away, and the beads are split between 256 wells of one or more multiwell plates (not shown).

A third plurality of 256 barcode fragment oligos is synthesised (157), having a 4nt barcode fragment, where the 3' flanking sequence anneals to the 3' adaptor sequence (158) on the beads (which was templated by the 5' flanking sequence of the second plurality of 256 barcode fragment oligos). The 5' flanking sequence contains the second PCR handle. This is used to template the addition of the reverse complement of the third barcode fragment, and the second PCR handle.

At the end of this step, the second plurality of oligos is full length and double stranded (159). The second strand does not need to be washed away; in fact, it is required, because at least the transposase site must be double stranded. Depending on the orientation of the oligo that is attached to the bead, the barcode and PCR handle may or may not need to be double stranded. The transposase site on the first plurality of oligos (150) (containing the first transposase site and first PCR handle) must also be double stranded (1510). This may be achieved e.g. by including (in the last polymerase reaction) an oligo annealing to the first PCR handle, so as to make the first plurality of oligos double stranded.

Other synthesis schemes are possible. For instance, the bead might be functionalised with an oligo that contains the first transposase binding sites, then the first polymerase-mediated assembly step can use a mixture of two oligos, one containing the PCR handle 1 (which also functions to terminate that oligo), and a second template oligo with a first barcode fragment.

EXPERIMENTAL SECTION

1. Synthesis of initial bead

Beads are prepared by droplet microfluidics, by encapsulating a solution containing acrylamide, initiator, and an oligo with a 5' acrydite group that gets incorporated into the polymerised gel. The aqueous phase is a degassed solution of 6.4% acrylamide (35:1 acrylamide:bisacrylamide), 0.3% ammonium persulfate, and 0.5uM oligo (this may vary depending on the loading required and the oligo sequence used) with a 5' acrydite modification. The continuous phase is degassed 2% 008-FluoroSurfactant in HFE7500 (RAN Biotechnologies) supplemented with 0.4% TEMED. The droplets are made on a Dolomite microfluidics droplet system, typically at flow rates of around 50ul/min for the continuous phase, and 10ul/min for the aqueous phase. The emulsion is collected under mineral oil, and incubated at 65°C for 2 hours to fully polymerise the acrylamide. Droplets are typically around 50um diameter, or 65pl volume. The mineral oil, and excess continuous phase, are removed, and 3 volumes of perfluorooctanol (B20156, Alfa Aesar) are added for every remaining volume of continuous phase, plus 30 ml of TBSTE (50 mM TRIS pH7.4, 150 mM NaCl, 0.1% TRITON X100, 10 mM EDTA), and the tube is shaken to break the emulsion. The supernatant (containing the beads) is removed to a fresh 50ml centrifuge tube, and the beads are pelleted at 5,000g for 10 minutes. The beads are washed twice more and stored at 4°C.
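The droplet figures quoted above can be cross-checked with a short calculation. This is a rough sketch: it treats the droplets as perfect spheres and assumes, purely for illustration, that every oligo molecule in a droplet is incorporated into the gel. The function name is hypothetical.

```python
# Illustrative sketch: droplet volume, production rate and oligo content.
# Assumptions: spherical droplets; complete incorporation of the oligo.
import math

AVOGADRO = 6.022e23           # molecules per mole
DIAMETER_UM = 50.0            # droplet diameter from the text
AQUEOUS_FLOW_UL_MIN = 10.0    # aqueous flow rate from the text
OLIGO_UM = 0.5                # oligo concentration (micromolar)


def sphere_volume_pl(diameter_um: float) -> float:
    """Volume of a sphere in picolitres (1 pl = 1000 cubic micrometres)."""
    radius = diameter_um / 2.0
    return (4.0 / 3.0) * math.pi * radius ** 3 / 1000.0


volume_pl = sphere_volume_pl(DIAMETER_UM)
# 1 ul = 1e6 pl, so droplets per minute = aqueous flow / droplet volume.
droplets_per_min = AQUEOUS_FLOW_UL_MIN * 1e6 / volume_pl
# Moles = concentration (mol/l) x volume (l); then convert to molecules.
oligos_per_droplet = (OLIGO_UM * 1e-6) * (volume_pl * 1e-12) * AVOGADRO

print(f"Droplet volume: {volume_pl:.1f} pl")
print(f"Droplets per minute: {droplets_per_min:,.0f}")
print(f"Oligo molecules per droplet: {oligos_per_droplet:.2e}")
```

The computed volume agrees with the 65pl stated for a 50um droplet.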

Bead oligo /5Acryd/ 5'-AAGCAGTGGTATCAACGCAGAGTACCTGTCTCTTATACACATCTCAATGCGACCACATGACGATGAGAATTCGGGTTCACTCGC-3' (SEQ ID NO: 3)

BC1 GATGTGACACJJJJGCGAGTGAAC (SEQ ID NO: 4)

BC2 CGTGGATTCGJJJJGATGTGACAC (SEQ ID NO: 5)

BC3 CCGCTCACGAGATGTGTATAAGAGACAGJJJJCGTGGATTCG (SEQ ID NO: 6)

MEsso1 CGCATTGAGATGTGTATAAGAGACAGGTACTCTGCG (SEQ ID NO: 7)

MEsso2 CCGCTCACGAGATGTGTATAAGAGACAG (SEQ ID NO: 8)

Where "J" represents a nucleotide in a barcoded position.

Barcoding of bead.

The beads are barcoded by building up the oligo by split and pool DNA synthesis. The barcode is a 12nt barcode (about 16.8 million possible barcodes) split into 3 segments of 4 nt each, where each segment is represented by 256 oligos. To add the first segment of the barcode, the 256 BC1 oligos are aliquoted out into wells of a 384 well plate, 9ul of 0.15uM of oligo per well. Next, 6ul of a mix containing about 40,000 polyacrylamide beads, 2.5x isothermal amplification buffer (NEB) and 0.85mM dNTPs are added to each well. The beads are denatured at 85°C for 2 min, then 5ul of Bst enzyme mix (1.8U of Bst 2.0 and 0.3mM dNTPs in 1X isothermal amplification buffer) is added to each well. The reactions are incubated at 60°C for 1 hour, then 20ul of stop buffer is added to each well (100mM KCl, 10mM TRIS pH8.0, 50mM EDTA, 0.1% TWEEN20), and the plate is incubated on ice for 30 min. The beads are then collected in a 50 ml centrifuge tube, and pelleted at 1,000g for 5 min. The beads are washed once more with TBSTE, and then the second strand is washed away by re-suspending in 20 ml of 150mM NaOH, 0.5% Brij 35P, and washed once more with 20ml 100mM NaOH, 0.5% Brij 35P. The beads are then washed in TBSTE, followed by TE-TWEEN (10 mM TRIS pH 8.0, 1 mM EDTA, 0.1% TWEEN) and re-suspended in TE-TWEEN.
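The final composition of each extension reaction follows from the volumes and concentrations given above. The sketch below is a simple mixing calculation; it assumes the three additions are the only contributors to the reaction volume, and the variable names are hypothetical.

```python
# Illustrative sketch: final composition of one extension reaction well.
# Assumption: the three additions listed in the text make up the whole volume.

AVOGADRO = 6.022e23

OLIGO_UL, OLIGO_UM = 9.0, 0.15         # barcode oligo addition
BEAD_MIX_UL = 6.0                      # bead mix: 2.5x buffer, 0.85 mM dNTPs
BEAD_MIX_BUFFER_X, BEAD_MIX_DNTP_MM = 2.5, 0.85
ENZYME_UL = 5.0                        # enzyme mix: 1x buffer, 0.3 mM dNTPs
ENZYME_BUFFER_X, ENZYME_DNTP_MM = 1.0, 0.3
BEADS_PER_WELL = 40_000
WELLS = 256

total_ul = OLIGO_UL + BEAD_MIX_UL + ENZYME_UL
buffer_x = (BEAD_MIX_UL * BEAD_MIX_BUFFER_X + ENZYME_UL * ENZYME_BUFFER_X) / total_ul
dntp_mm = (BEAD_MIX_UL * BEAD_MIX_DNTP_MM + ENZYME_UL * ENZYME_DNTP_MM) / total_ul
oligo_pmol = OLIGO_UL * OLIGO_UM       # ul x umol/l = pmol
templates_per_bead = oligo_pmol * 1e-12 * AVOGADRO / BEADS_PER_WELL
total_beads = BEADS_PER_WELL * WELLS

print(f"Reaction volume: {total_ul:.0f} ul")
print(f"Final buffer strength: {buffer_x:.2f}x")
print(f"Final dNTPs: {dntp_mm:.2f} mM")
print(f"Template oligo: {oligo_pmol:.2f} pmol per well")
print(f"Template molecules per bead: {templates_per_bead:.2e}")
print(f"Beads across {WELLS} wells: {total_beads:,}")
```

On these figures the two buffered additions combine to a 1x buffer in the 20ul reaction.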

The synthesis and washing steps are repeated for the BC2 and BC3 oligo sets and then the second strand oligos are annealed.
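The three rounds of split and pool extension can be sketched in silico as follows. This is a minimal illustration rather than the method itself: it assumes each BC template anneals through its 10 nt 3' end to the current 3' end of the bead-bound strand, and that the polymerase copies the remaining 5' portion of the template (constant region plus the four "J" positions) onto the bead strand. The helper names (`revcomp`, `extend`, `build_bead_oligo`) are hypothetical; the sequences are SEQ ID NOs 3 to 6 as listed above.

```python
import itertools

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


# Bead-bound oligo (SEQ ID NO: 3), 5' to 3'.
BEAD_OLIGO = (
    "AAGCAGTGGTATCAACGCAGAGTACCTGTCTCTTATACACATCT"
    "CAATGCGACCACATGACGATGAGAATTCGGGTTCACTCGC"
)

# Each template is split as (name, 5' constant part, 3' annealing part);
# the four J positions sit between the two parts.
TEMPLATES = [
    ("BC1", "GATGTGACAC", "GCGAGTGAAC"),
    ("BC2", "CGTGGATTCG", "GATGTGACAC"),
    ("BC3", "CCGCTCACGAGATGTGTATAAGAGACAG", "CGTGGATTCG"),
]

# All 4-nt segments: one oligo (one well) per segment in each round.
SEGMENTS = ["".join(p) for p in itertools.product("ACGT", repeat=4)]


def extend(strand: str, five_prime: str, segment: str, anchor: str) -> str:
    """Extend the bead strand by copying the template's 5' overhang."""
    if not strand.endswith(revcomp(anchor)):
        raise ValueError("template does not anneal to the 3' end of the strand")
    return strand + revcomp(five_prime + segment)


def build_bead_oligo(segments) -> str:
    """Apply the BC1, BC2 and BC3 rounds with one chosen segment per round."""
    strand = BEAD_OLIGO
    for (_name, five_prime, anchor), segment in zip(TEMPLATES, segments):
        strand = extend(strand, five_prime, segment, anchor)
    return strand


if __name__ == "__main__":
    print(len(SEGMENTS), "segments per round")
    print(len(SEGMENTS) ** 3, "possible 12 nt barcodes")
    print(build_bead_oligo(["ACGT", "TTAG", "CCCA"]))
```

In this model the finished strand ends in the reverse complement of the BC3 constant region, which is the region MEsso2 (SEQ ID NO: 8) would anneal to, and 256 segments over three rounds give the 16 million barcode combinations mentioned above.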

The beads are pelleted, and the volume of pelleted beads estimated. The beads are then washed in TBSTE, and then resuspended in 0.5 volumes of TBSTE. First, the tube containing the beads is pre-heated to 80°C, and Mosaic End second strand oligo 1 (MEsso1; SEQ ID NO: 7) is added to a final concentration of 1 uM, and the beads are allowed to cool to 50°C, then incubated for 30 minutes. Next, ME second strand oligo 2 (MEsso2; SEQ ID NO: 8) is added to uM, and the beads are incubated for a further 30 minutes. The beads are then cooled to room temperature, washed twice in 30 ml TBSTE, re-suspended in TBSTE, and stored at 4°C.

Generation of barcoded tagmosome library in droplets.

The transposase enzyme is kept separate from the oligo construct until just before being introduced into the droplets, so that the diffusible tagmosome complex is not formed until the components are compartmentalised in the droplets. The tagmosome library is generated on a Dolomite Bio uEncapsulator module. The continuous phase is 2% 008-FluoroSurfactant in HFE7500, in a 20ml glass scintillation vial, which is loaded into the reservoir of a Dolomite P Pump. 100ul of barcoded beads are washed in 1 ml of buffer (10mM TAPS-NaOH at pH8.5, 100mM NaCl, 5mM MgCl₂, and 8% PEG 8000), and loaded into one reservoir on the uEncapsulator reservoir chip. 2ug of Tnp in 100 ul of 10mM TAPS-NaOH at pH8.5, 5mM MgCl₂, and 8% PEG 8000 is loaded into the other reservoir. Droplets are then made by flowing the continuous phase at 50ul/min and each of the two aqueous solutions at 7ul/min (a lower flow rate may be required for super-Poisson loading). The emulsion is then incubated at 55°C for 10 minutes.

2. DropATAC seq. using barcoded tagmosomes

Preparation of nuclei

Tissue is dissected, cut into pieces smaller than 5mm in at least one dimension, and stored overnight at 4°C in RNAlater (Ambion).

Nuclei are prepared using a Dounce homogenizer and centrifugation through a density cushion. Briefly, dissected tissue is transferred to a Dounce homogenizer (Sigma) on ice, with 2ml of chilled homogenization buffer (320 mM sucrose, 5 mM CaCl2, 3 mM Mg(Ac)2, 10 mM Tris pH7.8, 0.1 mM EDTA, 0.1% NP40, 0.1 mM PMSF, 1 mM β-mercaptoethanol). The tissue is gently homogenized with 10 strokes of an A fit pestle, followed by 10 strokes with the B fit pestle. Next, the volume is increased to 5 ml with an extra 3 ml of homogenization buffer. 5 ml of a solution containing 50% Optiprep (Sigma), 5 mM CaCl2, 3 mM Mg(Ac)2, 10 mM Tris pH 7.8, 0.1 mM PMSF, 1 mM β-mercaptoethanol is added and mixed. The lysate is filtered through a 35um cell strainer (Corning, NY, Falcon, #352235) and gently layered on top of 10 ml of ice cold 29% iso-osmolar OptiPrep in a 30 ml centrifuge tube, and spun at 10,000 g for 30 min at 4°C. The supernatant is discarded, and the nuclear pellet gently resuspended in 65 mM β-glycerophosphate (pH 7.0), 2 mM MgCl2, 25 mM KCl, 340 mM sucrose and 5% glycerol. The number and quality of purified nuclei are determined by bright-field microscopy.

Tagging of nuclei

The nuclei are tagged with the barcoded tagmosomes, using a Dolomite Bio uEncapsulator module mounted on a Dolomite TCU100 temperature control unit, pre-cooled to 1°C. The continuous phase is 2% 008-FluoroSurfactant in HFE7500, in a 20ml glass scintillation vial, which is loaded into the reservoir of a Dolomite P Pump. 100ul of approx. 45um barcoded beads are washed in 1 ml of buffer (10mM TAPS-NaOH at pH8.5, 100mM NaCl, 5mM MgCl₂, and 8% Ficoll PM400), and loaded into one reservoir on the uEncapsulator reservoir chip.

Nuclei are resuspended at 3,000 nuclei/ul in 100 ul of 20mM TAPS-NaOH at pH7.4, 5 mM MgCl2, 25 mM KCl, 325 mM sucrose, 8% Ficoll PM400 and 5% glycerol, with 2ug of Tnp transposase enzyme. The nuclei plus transposase are loaded into the other 100ul reservoir on the uEncapsulator reservoir chip. 55um droplets are made at 40 - 50ul/min for the oil phase, and 7ul/min for each of the two aqueous solutions. Approximately 10% of the droplets contain a nucleus, and most contain a bead. Droplets are collected, then incubated for 30 minutes at 37°C to insert the tagmosomes into the DNA in the nuclei.
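The nucleus occupancy quoted above can be roughly reproduced with a simple Poisson loading model. The sketch below is an illustration under stated assumptions, not a description of the instrument: it assumes nuclei are distributed randomly (Poisson) among spherical droplets, and that the two aqueous streams merge at equal flow rates so the nuclei suspension is diluted two-fold in each droplet. The function names are hypothetical.

```python
import math


def droplet_volume_ul(diameter_um: float) -> float:
    """Volume in microlitres of a spherical droplet of the given diameter (um)."""
    radius_um = diameter_um / 2.0
    volume_um3 = (4.0 / 3.0) * math.pi * radius_um ** 3
    return volume_um3 * 1e-9  # 1 ul = 1e9 cubic micrometres


def occupancy(conc_per_ul: float, dilution: float, diameter_um: float) -> dict:
    """Poisson estimate of droplet occupancy.

    conc_per_ul: particle concentration in the loaded suspension
    dilution:    fraction of the droplet volume contributed by that suspension
                 (assumed 0.5 for two aqueous streams at equal flow rates)
    """
    mean = conc_per_ul * dilution * droplet_volume_ul(diameter_um)
    p_empty = math.exp(-mean)
    p_single = mean * p_empty
    return {
        "mean": mean,
        "empty": p_empty,
        "single": p_single,
        "multiple": 1.0 - p_empty - p_single,
    }


if __name__ == "__main__":
    result = occupancy(conc_per_ul=3000, dilution=0.5, diameter_um=55)
    for key, value in result.items():
        print(f"{key}: {value:.4f}")
```

With 3,000 nuclei/ul and 55um droplets this model gives a little over one droplet in ten containing at least one nucleus, with under 1% containing more than one, which is consistent with the approximate 10% figure in the text.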

If a non-fragmenting transposome has been used, the nuclei are recovered from the emulsion. Excess emulsion oil is removed, then 30 ml of 50mM TRIS pH7.4, 25 mM KCl, 10mM EDTA, 340 mM sucrose, and 5% glycerol is added. The emulsion is broken by adding 3 volumes of perfluorooctanol for every volume of emulsion oil, and inverting the centrifuge tube several times. The supernatant, containing the nuclei, is removed to a fresh 50 ml centrifuge tube, and the nuclei are pelleted at 10,000 g for 30 minutes. The supernatant is discarded, and the nuclei are resuspended in 1 ml of TBS. Next, SDS is added to 0.1%, and proteinase K to 200ug/ml. The DNA is incubated at 55°C for 1 hour, then precipitated with 1 volume of isopropanol, pelleted at 14,000g for 1 minute, the pellet washed with 1 ml 70% ethanol, and then resuspended in TE (10mM TRIS pH 8.0, 1 mM EDTA).

Alternatively, if a fragmenting transposome is used, the emulsion is broken by adding 2ml of stop buffer (50mM TRIS pH7.4, 25 mM KCl, 20mM EDTA, 340 mM sucrose), followed by 3 volumes of perfluorooctanol, and inverting the tube. The supernatant is removed to two fresh 2ml microcentrifuge tubes, and lysed by adding 30ul of 10% SDS. The lysate is then purified using a Qiagen MinElute PCR Purification Kit. Transposed DNA is eluted in 10 μl Elution Buffer (10 mM Tris buffer, pH 8).

Library generation.

Sequencing libraries are generated by PCR, using universal primers that hybridise to the primer binding sites within the oligonucleotides. The universal primers include adaptor sequences that are needed for subsequent NGS.

3. Single cell ChIP

Preparation of nuclei

Nuclei are prepared with a Dounce homogeniser. Optionally, the tissue or isolated nuclei may be cross-linked, e.g., with paraformaldehyde, to facilitate immunoprecipitation. In this case, the cross-links can be reversed after immunoprecipitation.

Tissue is dissected, and nuclei prepared using a Dounce homogenizer and centrifugation through a density cushion. Briefly, dissected tissue is transferred to a Dounce homogenizer (Sigma) on ice, with 2ml of chilled homogenization buffer (320 mM sucrose, 5 mM CaCl2, 3 mM Mg(Ac)2, 10 mM Tris pH7.8, 0.1 mM EDTA, 0.1% NP40, 0.1 mM PMSF, 1 mM β-mercaptoethanol). The tissue is gently homogenized with 10 strokes of an A fit pestle, followed by 10 strokes with the B fit pestle. Next, the volume is increased to 5 ml with an extra 3 ml of homogenization buffer. 5 ml of a solution containing 50% Optiprep (Sigma), 5 mM CaCl2, 3 mM Mg(Ac)2, 10 mM Tris pH 7.8, 0.1 mM PMSF, 1 mM β-mercaptoethanol is added and mixed. The lysate is filtered through a 35um cell strainer (Corning, NY, Falcon, #352235) and gently layered on top of 10 ml of ice cold 29% iso-osmolar OptiPrep in a 30 ml centrifuge tube, and spun at 10,000 g for 30 min at 4°C. The supernatant is discarded, and the nuclear pellet gently resuspended in 65 mM β-glycerophosphate (pH 7.0), 2 mM MgCl2, 25 mM KCl, 340 mM sucrose and 5% glycerol. The number and quality of purified nuclei are determined by bright-field microscopy.

Preparation of carrier chromatin.

Carrier chromatin is prepared from cells, e.g. Drosophila SL2 cell nuclei, using micrococcal nuclease, according to methods known in the art (Hao, Haiping, et al. "A fast carrier chromatin immunoprecipitation method applicable to microdissected tissue samples." Journal of neuroscience methods 172.1 (2008): 38-42).

Tagging of nuclei

The nuclei are tagged, and the DNA fragmented, with barcoded tagmosomes, using a Dolomite Bio uEncapsulator module mounted on a Dolomite TCU100 temperature control unit, pre-cooled to 1°C. The continuous phase is 2% 008-FluoroSurfactant in HFE7500, in a 20ml glass scintillation vial, which is loaded into the reservoir of a Dolomite P Pump. 100ul of approx. 45um barcoded beads are washed in 1 ml of buffer (10mM TAPS-NaOH at pH8.5, 100mM NaCl, 5mM MgCl₂, and 8% Ficoll PM400), and loaded into one reservoir on the uEncapsulator reservoir chip. Nuclei are resuspended at 3,000 nuclei/ul in 100 ul of 20mM TAPS-NaOH at pH7.4, 5 mM MgCl2, 25 mM KCl, 325 mM sucrose, 8% Ficoll PM400 and 5% glycerol, with 2ug of Tnp transposase enzyme. The nuclei plus transposase are loaded into the other 100ul reservoir on the uEncapsulator reservoir chip. 55um droplets are made at 40 - 50ul/min for the oil phase, and 7ul/min for each of the two aqueous solutions. Approximately 10% of the droplets contain a nucleus, and most contain a bead. Droplets are collected, then incubated for 30 minutes at 37°C to insert the tagmosomes into the DNA in the nuclei.

The emulsion is split into two 2 ml tubes, and broken by adding to each tube 1 ml of stop buffer (50mM TRIS pH7.4, 150mM NaCl, 25 mM KCl, 20mM EDTA, and 0.1% TWEEN20) containing carrier chromatin from 10⁸ cells, followed by 3 volumes of perfluorooctanol, and inverting the tube. The supernatant is removed to two fresh 2ml microcentrifuge tubes, 100 - 300 ul of the appropriate primary antibody is added, and the tubes are rotated overnight at 4°C. 800ul of protein-A coated magnetic beads (10008D, Life Technologies, USA) are added to each tube to precipitate the complexes. Beads are washed sequentially with 2ml low salt buffer (50mM Tris-HCl (pH 7.5), 10mM EDTA, 50mM NaCl), 2ml medium salt buffer (50mM Tris-HCl (pH 7.5), 10mM EDTA, 100mM NaCl), 2ml high salt buffer (50mM Tris-HCl (pH 7.5), 10mM EDTA, 150mM NaCl), and 2ml TE (10mM Tris-HCl, 1 mM EDTA), then re-suspended in TE. The DNA is eluted with elution buffer (1% SDS, 100mM NaHCO₃). RNA is removed by adding 100 ul RNase A, and incubating at 37°C for 20 minutes. Nucleosomes are removed by adding proteinase K to 100ug/ml, incubating at 55°C for 30 minutes, and inactivating at 65°C for 30 minutes. The DNA is purified with 1.5X AMPure XP beads (A63880, Beckman Coulter, USA). Eluted DNA is used as a substrate for library preparation, using universal primers that anneal to the tags on the inserted tagmosome oligos, and append adaptors for sequencing.

4. Single cell genome sequencing

The barcoded tagmosomes may be used for single cell genome sequencing. This may be applied to Copy Number Variation sequencing for tumours and sequencing environmental cells. It is also useful for bisulphite sequencing. In these cases, it is advantageous to obtain unbiased insertions and even amplification across the genome. It may also be useful to obtain high density insertions, to obtain relatively high coverage genome sequences from individual cells. In these cases, two main modifications are made to the technique. First, cells are singly encapsulated in hydrogel beads, allowing for the partial purification of genomic DNA, by digesting away and extracting proteins and other material. In the other modification, the DNA is pre-amplified with a whole genome amplification method, such as TruePrime (Picher et al., 2016), after the tagged DNA is recovered from the droplets, and before the PCR-based preparation of the sequencing libraries.

Preparation of extracted single cell genomes

First, cells are singly encapsulated in acrylamide beads (encapsulating cells in agarose beads, for extraction of single cell genomes, is also possible), on a Dolomite uEncapsulator module preheated to 37°C. The continuous phase is degassed 2% 008-FluoroSurfactant in HFE7500 supplemented with 0.4% TEMED, overlaid with mineral oil to keep air out.

One aqueous reservoir is loaded with cells at 3,000 cells/ul, suspended in a degassed solution of PBS with 0.1% BSA and 6.4% acrylamide. The other reservoir is loaded with a degassed solution of 6.4% acrylamide (35:1 acrylamide:bisacrylamide), and 0.6% ammonium persulfate in PBS. 45 um droplets are made at flow rates of 5ul/min each for the aqueous phases, and 50 - 70 ul/min for the continuous phase, and the emulsion is collected under mineral oil. About 5% of the gel droplets contain a cell. The emulsion is left at room temperature for 30 minutes for the gel to polymerise.

To recover the beads, the mineral oil and excess continuous phase are removed, and 3 volumes of perfluorooctanol are added for every remaining volume of continuous phase, plus 30 ml of TBSTE (50 mM TRIS pH7.4, 150 mM NaCl, 0.1% TRITON X-100, 10 mM EDTA), and the tube is shaken to break the emulsion. The supernatant (containing the beads) is removed to a fresh 50ml centrifuge tube, and the beads are pelleted at 5,000g for 10 minutes. The beads are washed twice more, resuspended in 50mM TRIS pH8.5, 10 mM EDTA, 150 mM NaCl, 0.1% SDS and 200ug/ml of proteinase K, and incubated at 55°C for 30 mins. The beads are then washed twice in 50mM TRIS pH8.5, 10 mM EDTA, 150 mM NaCl and 0.1% TWEEN20, once in TE + 0.1% TWEEN20, then resuspended in TE + 0.1% TWEEN20 and stored at 4°C.

Next, the genomes in the single cell DNA beads are tagged with barcoded tagmosomes in droplets. The continuous phase is 2% 008-FluoroSurfactant in HFE7500. 45um barcoded beads are washed in 1 ml of buffer (10mM TAPS-NaOH at pH8.5, 100mM NaCl, 5mM MgCl₂, and 8% Ficoll PM400), and loaded into one reservoir on the uEncapsulator reservoir chip. The single cell DNA beads are washed in 1 ml of 20mM TAPS-NaOH at pH7.4, 5 mM MgCl2, and 8% Ficoll PM400, then 100 ul of suspension is taken, mixed with 2ug of Tnp transposase enzyme, and loaded in the other reservoir. 60um droplets are formed in the uEncapsulator, where most droplets have one 'DNA' bead (or empty bead) and one tagmosome bead. The emulsion is incubated at 37°C for 1 hour to insert the tagmosomes into the DNA.

The emulsion is broken by adding 2ml of stop buffer (50mM TRIS pH7.4, 25 mM KCl, 20mM EDTA, 340 mM sucrose), followed by 3 volumes of perfluorooctanol, and inverting the tube. If a fragmenting tagmosome has been used, the DNA is purified from the supernatant using a Qiagen MinElute PCR Purification Kit. Transposed DNA is eluted in 10 μl Elution Buffer (10 mM Tris buffer, pH 8).

Gap filling and ligation

The gap filling reaction is performed by adding 5 μl of the reaction mixture containing 1 unit Ampligase enzyme (Ampligase Thermostable DNA Ligase, Epicentre Biotechnologies, Wisconsin, USA), 1 unit Stoffel fragment DNA polymerase (Applied Biosystems, USA), 125 nM of each dNTP (Roche Diagnostics, Mannheim, Germany) in 1X ligase buffer (200 mM Tris-HCl (pH 8.3), 250 mM KCl, 100 mM MgCl2, 5 mM NAD, and 0.1% Triton X-100). The reaction mixture is incubated at 56°C for 2 h followed by cycling the reaction for four cycles using the following conditions: initial denaturation at 95°C for 6 min followed by 85°C for 10 min. The temperature is then gradually decreased from 65°C to 56°C at the rate of 1°C/30 sec followed by 4 h incubation at 56°C.

Whole Genome Amplification

Following gap filling and ligation, WGA is performed using TruePrime & MDA. For library preparation, labelled fragments are prepared by PCR, using primers that bind to the primer binding sites within the oligonucleotide. The primers incorporate sequencing adaptors, and the libraries can then be sequenced.

The present invention is not to be limited in scope by the specific embodiments described herein. Indeed, various modifications of the invention in addition to those described herein will become apparent to those skilled in the art from the foregoing description and accompanying figures. Such modifications are intended to fall within the scope of the appended claims. Moreover, all embodiments described herein are considered to be broadly applicable and combinable with any and all other consistent embodiments, as appropriate.

Various publications are cited herein, the disclosures of which are incorporated by reference in their entireties.

Claims

CLAIMS:

1. A bead for single cell sequencing comprising a plurality of oligonucleotides attached to the bead, the oligonucleotides comprising in proximal to distal order from the bead:

a. a transposase target sequence

b. a common barcode sequence

c. a common primer binding site.

2. The bead of claim 1 further comprising a second plurality of oligonucleotides attached to the bead, the second oligonucleotides comprising in proximal to distal order from the bead:

a. a second transposase target sequence

b. a second common primer binding site.

a. a first transposase target sequence

b. a first common primer binding site

c. an intervening spacer sequence

d. a second transposase target sequence

e. a second common primer binding site; and

further comprising a common barcode sequence between either the first transposase target sequence and first common primer binding site or between the second transposase target sequence and second common primer binding site; wherein the first and second transposase target sequences are oriented with the cleavage site towards the bead; and

wherein the intervening spacer sequence (is of sufficient length to) facilitates transposase dimer binding to first and second transposase target sequences.

4. The bead of claim 3 wherein the intervening spacer sequence comprises a single stranded region of at least 2, 3, 4 or 5 nucleotides and up to 20, 30 or 40 nucleotides in length.

5. The bead of claim 3 wherein the intervening spacer sequence comprises a double stranded region of at least 100, 150 or 200 nucleotides in length.

6. The bead of any one of claims 1 to 5 wherein the oligonucleotides are attached to the bead at their 3' end.

7. The bead of any one of claims 1 to 6 wherein the (or each) transposase target sequence is double stranded but at least the common primer binding site and common barcode sequence is single stranded.

8. The bead of any one of claims 1 to 5 wherein the oligonucleotides are attached to the bead at their 5' end.

9. The bead of claim 8 wherein the (or each) transposase target sequence, the common primer binding site(s) and common barcode sequence are each double stranded.

10. A bead for single cell sequencing comprising a plurality of oligonucleotides attached to the bead, the oligonucleotides comprising in proximal to distal order from the bead:

a. a first transposase target sequence

b. a first common primer binding site

c. optionally, an intervening spacer sequence

d. a second common primer binding site

e. a second transposase target sequence; and

further comprising a common barcode sequence between either the first transposase target sequence and first common primer binding site or between the second transposase target sequence and second common primer binding site; wherein the first transposase target sequence (proximal to the bead) is oriented with the cleavage site towards the bead,

wherein the second transposase target site (distal to the bead) is oriented with the potential cleavage site away from the bead,

11. The bead of claim 10 wherein the intervening spacer sequence comprises a single stranded region of no more than 60, 50, 40, 30, 20, 10 or 5 nucleotides in length.

12. The bead of claim 10 or 11 wherein the oligonucleotides are attached to the bead at their 5' end.

13. The bead of claim 12 wherein each transposase target sequence and the first common primer binding site are double stranded.

14. The bead of claim 10 or 11 wherein the oligonucleotides are attached to the bead at their 3' end.

15. The bead of claim 14 wherein each transposase target sequence and the second common primer binding site are double stranded.

16. The bead of any one of claims 1 to 15 wherein the oligonucleotides are attached to the bead via a non-cleavable linker, such as a non-photocleavable linker.

17. The bead of any one of claims 1 to 16 wherein the oligonucleotides comprise a terminal spacer sequence between the bead and the (first) transposase target sequence.

18. The bead of claim 17 wherein the terminal spacer sequence is at least 10, 20 or 30 nucleotides in length.

19. The bead of any one of claims 1 to 15 wherein the oligonucleotides are attached to the bead via a cleavable linker, such as a photocleavable linker.

20. A bead for single cell sequencing comprising a plurality of oligonucleotides attached to the bead, the oligonucleotides comprising:

a. a first common primer binding site

b. a first transposase target sequence

c. an intervening spacer sequence

d. a second (common) primer binding site

e. a second transposase target sequence; and further comprising a common barcode sequence between either the first transposase target sequence and first common primer binding site or between the second transposase target sequence and second common primer binding site; and wherein the intervening spacer sequence (is of sufficient length to) facilitates transposase dimer binding to first and second transposase target sequences; and wherein the oligonucleotides are attached to the bead via a cleavable linker.

21. The bead of claim 20, the oligonucleotides comprising in proximal to distal order from the bead:

a. a first common primer binding site

b. a first transposase target sequence

c. an intervening spacer sequence

d. a second (common) primer binding site

e. a second transposase target sequence; and

22. The bead of claim 20, the oligonucleotides comprising in proximal to distal order from the bead:

a. a first common primer binding site

b. a first transposase target sequence

c. optionally an intervening spacer sequence

d. a second transposase target sequence

e. a second (common) primer binding site; and

23. The bead of any one of claims 20 to 22 wherein the cleavable linker is a photocleavable linker.

24. The bead of any one of claims 20 to 23 wherein the intervening spacer sequence comprises a single stranded region of at least 2, 3, 4 or 5 nucleotides and up to 20, 30 or 40 nucleotides in length.

25. The bead of any one of claims 20 to 23 wherein the intervening spacer sequence comprises a double stranded region of at least 100, 150 or 200 nucleotides in length.

26. A bead for single cell sequencing comprising a plurality of first and second oligonucleotides attached to the beads; the first oligonucleotides comprising:

d. a first transposase target sequence

e. a first common primer binding site; and

the second oligonucleotides comprising:

f. a second transposase target sequence

g. a common barcode sequence

h. a second common primer binding site;

27. The bead of claim 26 wherein the cleavable linker is a photocleavable linker.

28. The bead of claim 26 or 27 wherein the first and second transposase target sequences are double stranded.

29. The bead of any one of claims 26 to 28 wherein the first and/or second oligonucleotides are attached to the bead at their 5' end.

30. The bead of claim 29 wherein the oligonucleotides comprise a terminal spacer sequence between the cleavable linker and each transposase target sequence.

31. The bead of claim 30 wherein the terminal spacer sequence is at least 10, 20 or 30 nucleotides in length.

32. The bead of any one of claims 26 to 31 wherein the first and/or second oligonucleotides are attached to the bead at their 3' end.

33. The bead of claim 32 wherein the oligonucleotides comprise a terminal spacer sequence between the cleavable linker and the first and/or second common primer binding site.

34. The bead of claim 33 wherein the terminal spacer sequence is at least 10, 20 or 30 nucleotides in length.

35. The bead of any one of claims 1 to 34 wherein the bead is a polyacrylamide bead or a hydrogel bead.

36. The bead of any one of claims 1 to 35 wherein at least (on average) 10⁴ or at least (on average) 10⁵ oligonucleotides are attached to each bead, optionally 10⁴-10⁹, such as 10⁶-10⁸, preferably 10⁷.

37. A library of beads in which each bead is a bead as defined in any one of claims 1 to 36 and wherein:

a. each of the plurality of oligonucleotides attached to each bead comprises the same first and second common primer binding sites

b. the common barcode sequence is different between beads.

38. The library of claim 37 containing at least 10², 10³, 10⁴, 10⁵ or 10⁶ beads.

39. A kit for single cell sequencing comprising:

a. a bead as defined in any one of claims 1 to 36 or a library of beads as defined in claim 37 or 38

b. a transposase enzyme.

40. The kit of claim 39 wherein the transposase enzyme is provided in inactive form.

41. The kit of claim 39 or 40 wherein the kit further comprises a source of magnesium ions to activate the transposase enzyme.

42. A droplet for use in single cell sequencing comprising:

a. a single bead as defined in any one of claims 1 to 36 or a single bead from the library of beads as defined in claim 37 or 38

b. a transposase enzyme.

43. The droplet of claim 42 wherein the transposase enzyme cleaves the transposase target sequences (from the bead) to generate a diffusible transposition complex.

44. The droplet of claim 42 or 43 further comprising a single cell or a single purified nucleus, optionally fixed, within the droplet.

45. The droplet of any one of claims 42 to 44 wherein the cell is a eukaryotic or a prokaryotic cell.

46. The droplet of any one of claims 42 to 45 further comprising a lysis reagent to lyse a cell (but not the nucleus within the cell).

47. The droplet of claim 46 wherein the lysis reagent comprises a detergent.

48. The droplet of claim 47 wherein the detergent is a non-ionic detergent.

49. A droplet library comprising a plurality of droplets that each comprise:

b. a transposase enzyme

wherein each single bead comprises a different barcode sequence.

50. The droplet library of claim 49 wherein the transposase cleaves the transposase target sequences in each droplet to generate a diffusible transposition complex.

51. The droplet library of claim 49 or 50 each droplet further comprising a lysis reagent.

52. The droplet library of claim 51 wherein the lysis reagent comprises a detergent.

53. The droplet library of claim 52 wherein the detergent is a non-ionic detergent.

54. The droplet library of any one of claims 49 to 53 wherein the plurality of droplets each comprise a single cell or a single purified nucleus, optionally fixed, within the droplet.

55. The droplet library of any one of claims 49 to 54 wherein the cell is a eukaryotic or a prokaryotic cell.

56. The droplet library of any one of claims 35 to 41 containing at least 10², 10³, 10⁴, 10⁵ or 10⁶ droplets.

57. A method of making a barcoded diffusible transposition complex comprising combining a single bead as defined in any one of claims 1 to 36 or a single bead from the library of beads as defined in claim 37 or 38 with a transposase enzyme in a single droplet.

58. A diffusible barcoded transposition complex produced according to the method of claim 57.

59. A method of making a library of diffusible transposition complexes comprising, for a plurality of single beads wherein each single bead comprises a different barcode sequence, combining a single bead as defined in any one of claims 1 to 36 or a single bead from the library of beads as defined in claim 37 or 38 with a transposase enzyme in a single droplet.

60. A library of diffusible transposition complexes produced according to the method of claim 59.

61. A method of generating a barcoded DNA library comprising combining the following components in a single droplet:

b. a transposase enzyme

c. a single cell

d. a lysis reagent

wherein the transposase enzyme binds to the oligonucleotides attached to the bead and cleaves them from the bead to generate a diffusible transposition complex wherein the lysis reagent lyses the cell thereby enabling the generated transposition complex to insert first and second common primer binding sites and the common barcode sequence into cellular DNA, thereby generating a barcoded DNA library.

62. A method of generating a barcoded DNA library comprising combining the following components in a single droplet:

b. a transposase enzyme

c. a single purified nucleus

wherein the transposase enzyme binds to the oligonucleotides attached to the bead and cleaves them from the bead to generate a diffusible transposition complex wherein the generated transposition complex inserts first and second common primer binding sites and the common barcode sequence into cellular DNA, thereby generating a barcoded DNA library.

63. Use of a bead or library of beads as defined in any one of claims 1 to 38 for single cell or single nucleus sequencing.

64. A method of single cell or single nucleus sequencing comprising:

a. Performing the method of claim 61 or 62 in order to generate a barcoded DNA library

c. Sequencing the amplified barcoded DNA.

65. A method of single cell or single nucleus sequencing comprising:

a. Performing the method of claim 61 or 62, using a single bead or beads as defined in any one of claims 9 to 17 or 35 to 38 when dependent from claims 9 to 17, in order to generate a barcoded DNA library

b. Repairing gaps

c. Amplifying the DNA using a whole genome amplification method; and

d. Sequencing the amplified barcoded DNA.

66. A method of single cell or single nucleus sequencing comprising:

b. Repairing gaps

c. Amplifying the DNA using a whole genome amplification method

d. Further amplifying the barcoded DNA using primers that hybridise with the first and second common primer binding sites; and

e. Sequencing the amplified barcoded DNA.

67. The method of claim 65 or 66 wherein the whole genome amplification method comprises Multiple Displacement Amplification.

68. The method of claim 67 which comprises use of a primase.

69. The method of any one of claims 64 to 68 further comprising clustering and assembling the sequences.

70. A method of single cell sequencing of environmental samples comprising performing a method as claimed in any one of claims 64 to 69 using an environmental sample as the source of the single cell or single purified nucleus.

71. Use of a bead or library of beads as defined in any one of claims 1 to 38 for assaying transposase accessible chromatin (ATAC) in a single cell or single nucleus.

72. A method of assaying transposase accessible chromatin (ATAC) in a single cell or single nucleus comprising:

a. Performing the method of claim 61 or 62 in order to generate a barcoded DNA library under conditions in which chromatin structure is maintained

b. Amplifying the barcoded DNA using primers that hybridise with the first and second common primer binding sites; and

c. Sequencing the amplified barcoded DNA.

73. The method of claim 72 which comprises:

a. Performing a method as claimed in any one of claims 64 to 69; and/or

b. use of beads as defined in any one of claims 10 to 19 or 35 to 38 when dependent from claims 10 to 19 (which insert via a non-fragmenting mechanism).

74. The method of claim 72 or 73 wherein step a. is performed under conditions which do not lyse the nucleus.

75. The method of any one of claims 72 to 74 wherein intact cell nuclei are recovered following step a. and prior to step b.

76. The method of any one of claims 72 to 75 which is performed using purified single nuclei.

77. The method of claim 76 wherein the purified single nuclei are fixed.

78. Use of a bead or library of beads as defined in any one of claims 1 to 38 for chromatin immunoprecipitation sequencing (ChIP seq) in a single cell or single nucleus.

79. A method of chromatin immunoprecipitation sequencing (ChIP seq) in a single cell or single nucleus comprising:

a. Performing the method of claim 61 or 62 in order to generate a barcoded DNA library under conditions in which chromatin structure is maintained

b. Immunoprecipitation of the chromatin

d. Sequencing the amplified barcoded DNA.

80. The method of claim 79 which comprises use of beads as defined in any one of claims 1 to 9 and 20 to 38 (which insert via a fragmenting mechanism).

81. The method of claim 79 or 80 wherein step b. is performed in the presence of carrier chromatin.

82. The method of any one of claims 79 to 81 wherein intact cell nuclei are recovered following step a. and prior to step b.

83. A method of preparing cellular nucleic acids from a single cell or nucleus for sequencing comprising:

a. Encapsulating the single cell or nucleus in a gel bead

84. The method of claim 83 further comprising, following step b:

85. The method of claim 83 or 84 wherein the cellular nucleic acids comprise DNA.

86. The method of any one of claims 83 to 85 wherein the single cell is provided in a droplet and thus step a. comprises encapsulating a single cell in a droplet in a single gel bead.

87. The method of any one of claims 83 to 86 wherein the gel is a hydrogel.

88. The method of any one of claims 83 to 87 wherein step a. comprises:

89. The method of any one of claims 83 to 88 wherein step b. digests chromatin to leave free DNA entrapped within the gel bead.

90. The method of any one of claims 83 to 89 wherein the cell is a eukaryotic or prokaryotic cell.

91. The method of any one of claims 83 to 90 wherein the cell lysis reagent comprises a detergent and a protease.

92. The method of any one of claims 83 to 91 wherein the cell lysis reagent comprises sodium dodecyl sulfate and/or proteinase K.

93. The method of any one of claims 83 to 92 which is performed for at least 10³, 10⁴, 10⁵ or 10⁶ cells in parallel.

94. The method of any one of claims 83 to 93 wherein the sequencing is DNA sequencing.

95. The method of any one of claims 83 to 94 further comprising, following step b., adding a bisulfite reagent to the bead in order to convert unmethylated cytosine residues to uracil (and which does not convert methylated cytosine residues).

96. The method of any one of claims 83 to 95 further comprising, following step b., performing a chemical oxidation in order to convert hydroxymethylated cytosine residues to formylcytosine residues, followed by adding a bisulfite reagent to the bead in order to convert unmethylated cytosine residues and formylcytosine residues to uracil (and which does not convert methylated cytosine residues).

97. A gel bead, or gel beads, comprising cellular nucleic acids produced according to the method of any one of claims 83 to 96.

98. A method of generating a barcoded DNA library comprising combining the following components in a single droplet:

a. a single gel bead comprising cellular nucleic acids produced according to the method as defined in any one of claims 83 to 96

b. a single bead as defined in any one of claims 1 to 36 or a single bead from the library of beads as defined in claim 37 or 38

c. a transposase enzyme

99. A method of single cell or single nucleus sequencing comprising:

a. Performing the method of claim 98 in order to generate a barcoded DNA library

c. Sequencing the amplified barcoded DNA.

100. A method of single cell or single nucleus sequencing comprising:

a. Performing the method of claim 98, using a single bead or beads as defined in any one of claims 9 to 17 or 35 to 38 when dependent from claims 9 to 17, in order to generate a barcoded DNA library;

b. Repairing gaps;

c. Amplifying the DNA using a whole genome amplification method; and

d. Sequencing the amplified barcoded DNA.

Repairing gaps

Amplifying the DNA using a whole genome amplification method

Further amplifying the barcoded DNA using primers that hybridise with the first and second common primer binding sites; and

Sequencing the amplified barcoded DNA.

102. The method of claim 100 or 101 wherein the whole genome amplification method comprises Multiple Displacement Amplification.

103. The method of claim 102 which comprises use of a primase.

104. The method of any one of claims 99 to 103 further comprising clustering and assembling the sequences.

105. A method of single cell sequencing of environmental samples comprising performing a method as claimed in any one of claims 99 to 104 using an environmental sample as the source of the cellular nucleic acids.

106. Use of a bead or library of beads as defined in any one of claims 1 to 38 for bisulphite sequencing in a single cell or single nucleus.

107. A method of bisulphite sequencing in a single cell or single nucleus comprising:

c. Sequencing the amplified barcoded DNA.

108. A method of DNA sequencing in a single cell or single nucleus comprising:

a. Performing the method of claim 98 in order to generate a barcoded DNA library

c. Sequencing the amplified barcoded DNA.

109. Use of a bead or library of beads as defined in any one of claims 1 to 38 for DNA sequencing in a single cell or single nucleus.

110. A method of copy number variation sequencing in a single cell or single nucleus comprising:

c. Sequencing the amplified barcoded DNA; and

d. Determining the copy number of one or more DNA sequences.

111. Use of a bead or library of beads as defined in any one of claims 1 to 38 for copy number variation sequencing in a single cell or single nucleus.

112. The method or use of any one of claims 98 to 111 which is performed using a purified single nucleus.

113. The method or use of any one of claims 98 to 112 wherein the cell is a eukaryotic or prokaryotic cell.

114. The use or method of any one of claims 63 to 113 wherein the sequencing is next generation sequencing.

115. The use or method of any one of claims 63 to 114 wherein the primers that hybridise with the first and second common primer binding sites introduce sequencing adaptors into the amplification products.

116. The use or method of any one of claims 63 to 115 wherein the barcode sequences are used to identify the single cell or single nucleus from which a given sequence originates.

117. The use or method of any one of claims 63 to 116 wherein the oligonucleotides comprise a UMI and the UMIs are used to identify the molecule from which a given sequence originates.

118. The use or method of any one of claims 63 to 117 further comprising quantifying the respective sequences.
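
By way of illustration only, the roles of the barcode and UMI in claims 116 to 118 can be sketched as follows. This is a minimal sketch and not part of the claimed subject matter. It assumes a hypothetical read layout in which each sequence begins with the cell barcode, followed by the UMI, followed by the genomic insert. The function name `count_molecules`, the parameters `barcode_len` and `umi_len`, and the example reads are all assumptions made for the example.

```python
from collections import defaultdict


def count_molecules(reads, barcode_len=8, umi_len=6):
    """Count unique molecules per cell barcode.

    Hypothetical layout assumed for each read:
    [cell barcode][UMI][genomic insert].
    """
    molecules_per_cell = defaultdict(set)
    for read in reads:
        barcode = read[:barcode_len]                    # identifies the cell or nucleus
        umi = read[barcode_len:barcode_len + umi_len]   # identifies the original molecule
        insert = read[barcode_len + umi_len:]           # genomic sequence
        # Reads sharing barcode, UMI and insert are treated as amplification
        # duplicates of one molecule, so a set keeps a single copy.
        molecules_per_cell[barcode].add((umi, insert))
    # Quantify: number of distinct molecules observed for each barcode.
    return {bc: len(mols) for bc, mols in molecules_per_cell.items()}


reads = [
    "AAAACCCC" + "GGGTTT" + "ACGTACGT",
    "AAAACCCC" + "GGGTTT" + "ACGTACGT",  # duplicate of the first read
    "AAAACCCC" + "GGGAAA" + "ACGTACGT",  # same cell, different UMI
    "TTTTGGGG" + "CCCAAA" + "TTGACA",    # different cell
]
counts = count_molecules(reads)
print(counts)
```

In this sketch the barcode assigns each sequence to a cell or nucleus, the UMI collapses amplification duplicates, and the resulting counts quantify the sequences. A practical pipeline would additionally need to tolerate sequencing errors in barcodes and UMIs, which is not attempted here.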