WO2022056231A1 - Compositions and methods for screening cell internalizing agents

Info

Publication number
WO2022056231A1
WO2022056231A1 PCT/US2021/049814 US2021049814W WO2022056231A1 WO 2022056231 A1 WO2022056231 A1 WO 2022056231A1 US 2021049814 W US2021049814 W US 2021049814W WO 2022056231 A1 WO2022056231 A1 WO 2022056231A1
Authority
WO
WIPO (PCT)
Prior art keywords
cell
nucleic acid
protein
target
linker
Application number
PCT/US2021/049814
Other languages
French (fr)
Inventor
Eric ESTRIN
Aaron CANTOR
Akshay TAMBE
Original Assignee
Spotlight Therapeutics
Application filed by Spotlight Therapeutics
Publication of WO2022056231A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/10: Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034: Isolating an individual clone by screening libraries
    • C12N15/1086: Preparation or screening of expression libraries, e.g. reporter assays
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00: Structure or type of the nucleic acid
    • C12N2310/10: Type of nucleic acid
    • C12N2310/20: Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]

Definitions

  • Described herein are methods and compositions for identifying cell internalizing agents capable of internalizing endonucleases (e.g., RNA-guided endonucleases) into a cell for gene editing.
  • Endonucleases such as Cas9 have become a versatile tool for genome engineering in various cell types and organisms (see, e.g., US 8,697,359).
  • RNA-guided endonucleases can generate site-specific double-stranded breaks (DSBs) or single-stranded breaks (SSBs) within target nucleic acids (e.g., double-stranded DNA (dsDNA), single-stranded DNA (ssDNA), or RNA).
  • There is a need for improved methods for identifying cell internalizing agents that can facilitate the internalization of nucleases, including RNA-guided endonucleases.
  • improved screening methods that can detect the ability of an endonuclease, such as an RNA-guided endonuclease, to perform genome editing.
  • compositions for testing agents in order to identify a cell internalizing agent that can internalize a site-directed modifying polypeptide or nucleoprotein as well as methods for determining gene editing activity once internalized.
  • the methods and compositions disclosed herein provide screening methods for identifying effective agents that can internalize a site-directed modifying polypeptide or nucleoprotein in such a manner that the site-directed modifying polypeptide or nucleoprotein is able to perform gene editing.
  • a method for identifying a cell internalizing agent comprising providing a population of target cells, wherein each target cell comprises a reporter construct that comprises a nucleic acid that provides a phenotype when activated or repressed, wherein each target cell is a eukaryotic cell that expresses a test agent on its cell surface, and wherein the test agent comprises a first conjugation moiety, contacting the population of target cells with a nucleoprotein comprising a site-directed modifying polypeptide and a guide RNA (gRNA) that specifically hybridizes to a target nucleic acid within the target cell, wherein the nucleoprotein comprises a second conjugation moiety that binds to the first conjugation moiety of the test agent, and selecting a modified target cell having the phenotype observed with activation or repression of the reporter construct, thereby identifying the cell internalizing agent.
  • the target nucleic acid is in the nucleus of the target cell.
  • the test agent is a protein, a lipid, or a carbohydrate.
  • the protein is an antigen-binding moiety.
  • the antigen-binding moiety is an antibody or an antibody fragment thereof.
  • the antigen-binding moiety is a nanobody, a domain antibody, an scFv, a Fab, a diabody, a BiTE, a DART, a minibody, a F(ab’)2, an intrabody, or an antibody mimetic.
  • the antibody mimetic is a fibronectin based binding molecule, an affilin, an affimer, an affitin, an alphabody, an affibody, a DARPin, an anticalin, an avimer, a fynomer, a Kunitz domain peptide, a monobody, a nanoCLAMP, a unibody, a versabody, an aptamer, or a cyclotide.
  • the protein is a cell-penetrating peptide (CPP).
  • the protein is a ligand, or binding fragment thereof.
  • the site-directed modifying polypeptide is a Class 2 Cas polypeptide.
  • the Class 2 Cas polypeptide is a Type II Cas polypeptide.
  • the Type II Cas polypeptide is Cas9.
  • the site-directed modifying polypeptide is conjugated to the second binding moiety via a linker.
  • the linker is a labile linker.
  • the labile linker is pH sensitive.
  • the labile linker is sensitive to reducing conditions.
  • the labile linker is a disulfide linker.
  • the linker is a hydrazone linker or a valine-citrate linker.
  • the modified target cell is selected by determining the RNA or protein expression level of the reporter construct.
  • an increase in the RNA or protein expression level of a nucleic acid in the reporter construct relative to a control indicates internalization of the test agent.
  • the RNA or protein expression level of a nucleic acid in the reporter construct is decreased or substantially eliminated upon internalization of the test agent.
  • the reporter construct comprises a nucleic acid encoding a selection marker.
  • the selection marker is a thymidine kinase gene construct and the population of target cells is cultured in the presence of ganciclovir.
  • the nucleic acid encoding a selection marker is an antibiotic resistance marker, a fluorescence marker, or a bioluminescence marker.
  • the fluorescence marker is a green fluorescent protein (GFP), a yellow fluorescent protein (YFP), a red fluorescent protein (RFP), or a split GFP reporter.
  • the modified target cell is selected by detecting a signal produced by reassembly of the split GFP reporter.
  • the selection is a positive selection.
  • the selection is a negative selection.
  • the modified target cell is identified using cell sorting.
  • the cell sorting is fluorescence-activated cell sorting (FACS), magnetic-activated cell sorting (MACS), or microfluidic-based cell sorting.
  • the cell internalizing agent binds to a cell surface moiety associated with a disorder. In some embodiments, the cell internalizing agent binds to a cell surface protein associated with a disorder. In certain embodiments, the cell surface protein is a tyrosine kinase, an epidermal growth factor receptor (EGFR), a platelet-derived growth factor receptor (PDGFR), a fibroblast growth factor receptor (FGFR), a hepatocyte growth factor receptor (HGFR), a nerve growth factor receptor (NGFR), CD3, CD4, Tim-3, CD278, TNFR-I, IL-1R, LT-betaR, IL-18R, CCR1, CD26, CD94, CD119, CD183, CD195, or DPIV.
  • the population of target cells comprises mammalian cells or yeast cells.
  • the mammalian cells are a cell type selected from the group consisting of a COP cell, an L cell, a C127 cell, an Sp2/0 cell, an NS-0 cell, an NIH3T3 cell, a PC12 cell, a PC12h cell, a BHK cell, a CHO cell, a COS1 cell, a COS3 cell, a COS7 cell, a CV1 cell, a Vero cell, a HeLa cell, an HEK-293 cell, a PER.C6 cell, a cell derived from diploid fibroblasts, a myeloma cell, and a HepG2 cell.
  • the cell internalizing agent is identified by polymerase chain reaction (PCR) or by deep sequencing of a PCR-amplified nucleic acid derived from the modified target cell.
  • a method for screening a library of cells having a plurality of genotypes for a cell that produces a cell internalizing agent comprising providing an array with a library of protein-variant-producing cells that each express a test agent; incubating the array under conditions that allow for the production of test agents from the protein-variant-producing cells; providing a target cell comprising a reporter construct comprising a nucleic acid that provides a phenotype when activated or repressed; contacting the target cell with the test agent under conditions that allow for the test agent to bind to the target cell, wherein the test agent comprises a first conjugation moiety; contacting the target cell with a nucleoprotein comprising a site-directed modifying polypeptide and a guide RNA (gRNA) that specifically hybridizes to a target nucleic acid of the target cell; wherein a second conjugation moiety is conjugated to the site-directed modifying polypeptide such that the first conjugation moiety of the test agent binds
  • the array is a microfluidic system, a microbubble system, or a microcavity array. In certain embodiments, the array is a microcavity array and comprises extracting the target cell with electromagnetic radiation.
  • the test agent is a protein, a lipid, or a carbohydrate.
  • the protein is a peptide.
  • the protein is an antigen-binding moiety.
  • the antigen-binding moiety is an antibody or an antibody fragment thereof.
  • the antigen-binding moiety is a nanobody, a domain antibody, an scFv, a Fab, a diabody, a BiTE, a DART, a minibody, a F(ab’)2, an intrabody, or an antibody mimetic.
  • the antibody mimetic is an adnectin (i.e.
  • the protein is a cell-penetrating peptide (CPP).
  • the protein is a ligand, or fragment thereof.
  • the site-directed modifying polypeptide is a Class 2 Cas polypeptide. In other embodiments, the Class 2 Cas polypeptide is a Type II Cas polypeptide. In yet another embodiment, the Type II Cas polypeptide is Cas9. In some embodiments, the site-directed modifying polypeptide is conjugated to the second binding moiety via a linker. In other embodiments, the linker is a labile linker. In another embodiment, the labile linker is pH sensitive. In another embodiment, the labile linker is sensitive to reducing conditions. In yet another embodiment, the labile linker is a disulfide linker. In other embodiments, the linker is a hydrazone linker or a valine-citrate linker.
  • the result is assayed by determining an RNA or protein expression level of the nucleic acid in the reporter construct. In other embodiments, an increase in the RNA or protein expression level of the nucleic acid indicates internalization of the test agent.
  • the nucleic acid encodes a selection marker.
  • the selection marker is an antibiotic resistance marker, a fluorescence marker, or a bioluminescence marker.
  • the fluorescence marker is green fluorescent protein (GFP), yellow fluorescent protein (YFP), red fluorescent protein (RFP), or a split GFP reporter.
  • RNA or protein expression level of the nucleic acid indicates internalization of the test agent.
  • the selection marker is a nucleic acid encoding a thymidine kinase gene construct and the target cells are cultured in the presence of ganciclovir.
  • the selection is a positive selection.
  • the selection is a negative selection.
  • the method further comprises measuring a signal produced from a labeled antibody capable of specifically binding to a cell surface protein expressed by the reporter construct.
  • the presence of the signal indicates internalization of the test agent.
  • the absence of the signal indicates internalization of the test agent.
  • the target cell is a mammalian cell or a yeast cell.
  • the mammalian cell is a cell type selected from the group consisting of a COP cell, an L cell, a C127 cell, an Sp2/0 cell, an NS-0 cell, an NIH3T3 cell, a PC12 cell, a PC12h cell, a BHK cell, a CHO cell, a COS1 cell, a COS3 cell, a COS7 cell, a CV1 cell, a Vero cell, a HeLa cell, an HEK-293 cell, a PER.C6 cell, a cell derived from diploid fibroblasts, a myeloma cell, and a HepG2 cell.
  • the cell internalizing agent is identified by sequencing a nucleic acid in the target cell.
  • the step of identifying the cell internalizing agent comprises detection using polymerase chain reaction (PCR) or deep sequencing of PCR-amplified nucleic acid from the cell that produces the cell internalizing agent.
  • the cell internalization agent binds to a cell surface antigen associated with a disease.
  • the disease is selected from the group consisting of cancer, autoimmune disease, and a hereditary genetic disease.
  • the cell surface antigen is selected from the group consisting of HLA-DR, CD44, CD22, CD3, CD20, CD33, CD32, CD47, CD59, CD54, CD25, AchR, CD70, CD74, CTLA4, EGFR, HER2, and EpCam.
  • the modified target cell is selected by detecting cells capable of propagating in the presence of ganciclovir.
  • Fig. 1 depicts results of an assay showing the formation of a complex of an anti-FAP antibody (28H1)-SpyTag (genetically encoded on the C-terminus) with SpyCatcher-Cas9 (genetically encoded on the N-terminus).
  • screening methods for rapidly detecting the ability of an endonuclease, e.g., an RNA-guided endonuclease or nucleoprotein, to perform genome editing activities.
  • cell internalizing agent refers to a binding moiety that can internalize into a cell.
  • a cell internalizing agent specifically binds to an extracellular target molecule (e.g., an extracellular protein, lipid or glycan) displayed on a cell surface and internalizes into the cell.
  • a cell internalizing agent is a molecule that is displayed on the surface of a cell and internalizes into the cell.
  • a cell internalizing agent is a protein (e.g., an antigen binding moiety (e.g., an antibody or an antibody fragment), a ligand, or a cell penetrating peptide (CPP)) that can associate with a site-directed modifying polypeptide and enable the site-directed modifying polypeptide (e.g., Cas9) to internalize into a target cell.
  • test agent refers to a molecule which is being assayed for a given characteristic.
  • a test agent represents a molecule that is being tested for its ability to both bind to a molecule (e.g., protein receptor, lipid, glycoprotein, glycolipid, carbohydrate, or others) on the surface of a cell and internalize into the cell.
  • a test agent represents a molecule that is displayed on the surface of a cell and is being tested for its ability to internalize into the cell.
  • a test agent is assayed for its ability to internalize an endonuclease where internalization is determined according to the activity of the endonuclease that the test agent helps to internalize.
  • a test agent associates with or is conjugated to a site-directed modifying polypeptide (e.g., Cas9) or a nucleoprotein, and facilitates the internalization of the site-directed modifying polypeptide or nucleoprotein into the intracellular space of a target cell.
  • the cell internalization activity of a test agent may be determined by detecting a target cell having a particular phenotype characterized by the activity of a reporter construct comprising a nucleic acid that provides a read out, e.g., repression or activation, to determine if gene editing has occurred in the target cell.
  • test agents include, but are not limited to, a protein (e.g., an antibody or antibody fragment thereof, a ligand or portion thereof, or a CPP), a lipid, a glycoprotein, a glycolipid, and a carbohydrate.
  • reporter construct refers to a nucleic acid that is within a cell (e.g., in the nucleus of the cell) that can be activated or repressed by, e.g., a site-directed modifying polypeptide or nucleoprotein, where the repression or activation results in a phenotype that allows for selection of the cell.
  • activation or repression of the reporter construct is achieved via gene editing.
  • the phenotype resulting from the activation or repression may be, for example, a selection (e.g., antibiotic resistance, fluorescence, etc.) imparting a characteristic that distinguishes the cell from other cells in a population.
  • target cell refers to a cell to which an agent binds or is associated with, e.g., a cell to which a cell internalizing agent or a test agent can bind or is associated with.
  • a target cell may, for example, express an extracellular molecule (i.e., an extracellular molecule embedded in or tethered to the cell membrane) which is bound by the cell internalizing agent (e.g., an antibody, a ligand, a CPP, a protein, etc.).
  • a target cell may be a healthy cell or may be a cell associated with a disease.
  • a target cell is a cell on which the screening assays are focused.
  • a target cell is a cell in vivo to which the cell internalizing agent binds.
  • modified target cell refers to a cell targeted by a cell internalizing agent wherein a reporter construct containing a nucleic acid sequence within the cell targeted is altered or modified by genome editing activity (e.g., genome editing activities of a nucleoprotein as described and disclosed herein).
  • a modified target cell has a phenotype associated with the reporter construct.
  • protein-variant-producing cell refers to a cell that expresses at least one (but preferably a plurality of) protein variants.
  • An example of a protein variant is an antigen binding moiety, such as, but not limited to, an antibody or antigen binding fragment thereof.
  • a “site-directed modifying polypeptide” refers to a nuclease, that is targeted to a specific nucleic acid sequence or set of similar sequences of a polynucleotide chain via recognition of the particular sequence(s) by the modifying polypeptide itself or an associated molecule (e.g., RNA), wherein the polypeptide can modify the polynucleotide chain.
  • a site-directed modifying polypeptide is an RNA-guided endonuclease, such as Cas9.
  • conjugation moiety refers to a molecule that is capable of associating with at least one other molecule.
  • the association may be covalent or non-covalent.
  • a first and a second conjugation moiety may be used to conjugate two proteins, such as a cell internalizing agent and a site-directed modifying polypeptide.
  • conjugation refers to the physical or chemical complex formed between a molecule (e.g., a cell internalizing agent) and a second molecule (e.g., a site-directed modifying polypeptide). In one embodiment, conjugation is achieved via a physical association or non-covalent complexation.
  • nucleic acids that specifically hybridize to one another are generally complements of one another and can be the same length or within 30, 20, or 10% of the length of the reference nucleic acid.
  • polypeptide or “protein”, as used interchangeably herein, refer to any polymeric chain of amino acids.
  • the terms encompass native or artificial proteins, protein fragments and polypeptide analogs of a protein sequence.
  • ligand refers to a molecule that is capable of specifically binding to another molecule on or in a cell, such as one or more cell surface receptors, and includes molecules such as proteins, hormones, neurotransmitters, cytokines, growth factors, cell adhesion molecules, or nutrients.
  • a site-directed modifying polypeptide can be associated with one or more ligands through covalent or non-covalent linkage. Examples of ligands useful herein, or targets bound by ligands, and further description of ligands in general, are disclosed in Bryant & Stow (2005). Traffic, 6(10), 947-953; Olsnes et al. (2003). Physiological reviews, 83(1), 163-182; and Planque, N. (2006). Cell Communication and Signaling, 4(1), 7, which are incorporated herein by reference.
  • an antigen binding moiety that specifically binds to an antigen binds to an antigen with a Kd of at least about 1 × 10⁻⁴ M, 1 × 10⁻⁵ M, 1 × 10⁻⁶ M, 1 × 10⁻⁷ M, 1 × 10⁻⁸ M, 1 × 10⁻⁹ M, 1 × 10⁻¹⁰ M, 1 × 10⁻¹¹ M, 1 × 10⁻¹² M, or more as determined by surface plasmon resonance or other approaches known in the art (e.g., filter binding assay, fluorescence polarization, isothermal titration calorimetry), including those described further herein.
  • an antigen binding moiety specifically binds to an antigen if the antigen binding moiety binds to an antigen with an affinity
  • a CPP can also be characterized in certain embodiments as being able to facilitate the movement or traversal of a molecular conjugate across/through one or more of a lipid bilayer, micelle, cell membrane, organelle membrane (e.g., nuclear membrane), vesicle membrane, or cell wall.
  • a CPP herein can be cationic, amphipathic, or hydrophobic in certain embodiments.
  • CPPs useful herein, and further description of CPPs in general, are disclosed in Borrelli, Antonella, et al. Molecules 23.2 (2016): 295; Milletti, Francesca. Drug discovery today 17.15-16 (2012): 850-860, which are incorporated herein by reference. Further, there exists a database of experimentally validated CPPs (CPPsite, Gautam et al., 2012).
  • the CPP can be any known CPP, such as a CPP shown in the CPPsite database.
  • antigen binding moiety refers to a molecule that binds to an antigen, such as an extracellular cell-membrane bound protein (e.g., a cell surface receptor).
  • an antigen binding moiety include, but are not limited to, a protein, an antibody, antigen-binding fragments of an antibody, and an antibody mimetic.
  • antibody is used herein in the broadest sense and encompasses various antibody structures, including but not limited to monoclonal antibodies, polyclonal antibodies, multispecific antibodies (e.g., bispecific antibodies), nanobodies, monobodies, and antibody fragments so long as they exhibit the desired antigen-binding activity.
  • antibody includes an immunoglobulin molecule comprising four polypeptide chains, two heavy (H) chains and two light (L) chains inter-connected by disulfide bonds, as well as multimers thereof (e.g., IgM).
  • Each heavy chain (HC) comprises a heavy chain variable region (or domain) (abbreviated herein as HCVR or VH) and a heavy chain constant region (or domain).
  • the heavy chain constant region comprises three domains, CH1, CH2 and CH3.
  • Each light chain (LC) comprises a light chain variable region (abbreviated herein as LCVR or VL) and a light chain constant region.
  • the light chain constant region comprises one domain (CL1).
  • Immunoglobulin molecules can be of any type (e.g., IgG, IgE, IgM, IgD, IgA and IgY), class (e.g., IgG1, IgG2, IgG3, IgG4, IgA1 and IgA2) or subclass.
  • VH and VL regions can be further subdivided into regions of hypervariability, termed complementarity determining regions (CDRs), interspersed with regions that are more conserved, termed framework regions (FR).
  • Each VH and VL is composed of three CDRs and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1 , CDR1 , FR2, CDR2, FR3, CDR3, FR4.
  • CDR refers to the noncontiguous antigen combining sites found within the variable region of both heavy and light chain polypeptides. These particular regions have been described by Kabat et al., J. Biol. Chem. 252, 6609-6616 (1977) and Kabat et al., Sequences of protein of immunological interest. (1991), and by Chothia et al., J. Mol. Biol. 196:901-917 (1987) and by MacCallum et al., J. Mol. Biol. 262:732-745 (1996) where the definitions include overlapping or subsets of amino acid residues when compared against each other. The amino acid residues which encompass the CDRs as defined by each of the above cited references are set forth for comparison.
  • the term “CDR” is a CDR as defined by Kabat, based on sequence comparisons.
  • an “intact” or a “full length” antibody refers to an antibody comprising four polypeptide chains, two heavy (H) chains and two light (L) chains.
  • an intact antibody is an intact IgG antibody.
  • monoclonal antibody refers to an antibody obtained from a population of substantially homogeneous antibodies, i.e., the individual antibodies comprising the population are identical and/or bind the same epitope, except for possible variant antibodies, e.g., containing naturally occurring mutations or arising during production of a monoclonal antibody preparation, such variants generally being present in minor amounts.
  • polyclonal antibody preparations typically include different antibodies directed against different determinants (epitopes)
  • each monoclonal antibody of a monoclonal antibody preparation is directed against a single determinant on an antigen.
  • the modifier “monoclonal” indicates the character of the antibody as being obtained from a substantially homogeneous population of antibodies and is not to be construed as requiring production of the antibody by any particular method.
  • the monoclonal antibodies to be used in accordance with the present invention may be made by a variety of techniques, including but not limited to the hybridoma method, recombinant DNA methods, phage-display methods, and methods utilizing transgenic animals containing all or part of the human immunoglobulin loci, such methods and other exemplary methods for making monoclonal antibodies being described herein.
  • human antibody refers to an antibody having variable regions in which both the framework and CDR regions are derived from human germline immunoglobulin sequences. Furthermore, if the antibody contains a constant region, the constant region also is derived from human germline immunoglobulin sequences.
  • the human antibodies of the invention may include amino acid residues not encoded by human germline immunoglobulin sequences (e.g., mutations introduced by random or site-specific mutagenesis in vitro or by somatic mutation in vivo).
  • the term “human antibody”, as used herein is not intended to include antibodies in which CDR sequences derived from the germline of another mammalian species, such as a mouse, have been grafted onto human framework sequences.
  • humanized antibody is intended to refer to antibodies in which CDR sequences derived from the germline of one mammalian species, such as a mouse, have been grafted onto human framework sequences. Additional framework region modifications may be made within the human framework sequences.
  • a "humanized form" of an antibody e.g., a non-human antibody, refers to an antibody that has undergone humanization.
  • chimeric antibody is intended to refer to antibodies in which the variable region sequences are derived from one species and the constant region sequences are derived from another species, such as an antibody in which the variable region sequences are derived from a mouse antibody and the constant region sequences are derived from a human antibody.
  • antibody fragment refers to a molecule other than an intact antibody that comprises a portion of an intact antibody and that binds the antigen to which the intact antibody binds.
  • antibody fragments include, but are not limited to, Fv, Fab, Fab', Fab'-SH, F(ab')2; diabodies; linear antibodies; single-chain antibody molecules (e.g. scFv); and multispecific antibodies formed from antibody fragments.
  • a “multispecific antigen binding polypeptide” or “multispecific antibody” is one that targets more than one antigen or epitope.
  • a “bispecific,” “dual-specific” or “bifunctional” antigen binding polypeptide or antibody is a hybrid antigen binding polypeptide or antibody, respectively, having two different antigen binding sites.
  • Bispecific antigen binding polypeptides and antibodies are examples of a multispecific antigen binding polypeptide or a multispecific antibody and may be produced by a variety of methods including, but not limited to, fusion of hybridomas or linking of Fab' fragments. See, e.g., Songsivilai and Lachmann, 1990, Clin. Exp. Immunol.
  • antibody mimetic or “antibody mimic” refers to a molecule that is not structurally related to an antibody but is capable of specifically binding to an antigen.
  • antibody mimetics include, but are not limited to, an adnectin (i.e., fibronectin based binding molecules), an affilin, an affimer, an affitin, an alphabody, an affibody, DARPins, an anticalin, an avimer, a fynomer, a Kunitz domain peptide, a monobody, a nanoCLAMP, a nanobody, a unibody, a versabody, an aptamer, a cyclotide, and a peptidic molecule, all of which employ binding structures that, while they mimic traditional antibody binding, are generated from and function via distinct mechanisms.
  • Amino acid sequences described herein may include “conservative mutations,” including the substitution, deletion or addition of nucleic acids that alter, add or delete a single amino acid or a small number of amino acids in a coding sequence where the nucleic acid alterations result in the substitution of a chemically similar amino acid.
  • a conservative amino acid substitution refers to the replacement of a first amino acid by a second amino acid that has chemical and/or physical properties (e.g., charge, structure, polarity, hydrophobicity/hydrophilicity) that are similar to those of the first amino acid.
  • Conservative substitutions include replacement of one amino acid by another within the following groups: lysine (K), arginine (R) and histidine (H); aspartate (D) and glutamate (E); asparagine (N) and glutamine (Q); N, Q, serine (S), threonine (T), and tyrosine (Y); K, R, H, D, and E; D, E, N, and Q; alanine (A), valine (V), leucine (L), isoleucine (I), proline (P), phenylalanine (F), tryptophan (W), methionine (M), cysteine (C), and glycine (G); F, W, and Y; H, F, W, and Y; C, S and T; C and A; S and T; C and S; S, T, and Y; V, I, and L; V, I, and T.
  • Other conservative amino acid substitutions are also recognized as valid, depending on the context of
  • isolated refers to a compound, which can be, e.g. a nucleoprotein, protein, or nucleic acid, that is substantially free of other cellular material.
  • a site-directed modifying polypeptide or nucleoprotein enable a site-directed modifying polypeptide (e.g., Cas9) or nucleoprotein (e.g., Cas9 and guide RNA) to be targeted to the surface of a cell of interest (e.g., a diseased cell expressing an antigen on its surface associated with the disease) and subsequently internalized by the cell of interest such that gene editing can occur in the cell.
  • the cell internalizing agent specifically binds to an extracellular target molecule (e.g., an extracellular protein, lipid, carbohydrate, glycan, etc.) displayed on (e.g., embedded in or tethered to) a cell membrane.
  • a cell internalizing agent is identified from a population of test agents using the methods disclosed herein.
  • identifying cell internalizing agents that have both the ability to target a cell of interest and internalize such that a site-directed modifying polypeptide or nucleoprotein is delivered to the cell for gene editing.
  • test agents can be used in the methods disclosed herein in order to identify cell internalizing agents. Examples include, but are not limited to, a protein (e.g., a glycoprotein), a lipid (e.g., a glycolipid), or a carbohydrate.
  • Test agents identified using the present invention can be natural, recombinant, or synthetic.
  • the test agent is a peptide.
  • the test agent is an antigen-binding moiety.
  • the test agent is an antibody or an antibody fragment thereof.
  • the test agent is a cell-penetrating peptide (CPP).
  • the test agent is an antibody mimetic.
  • the test agent is a lipid.
  • the test agent is a glycoprotein.
  • the test agent is a glycolipid.
  • the test agent is a carbohydrate.
  • the test agent is expressed on the surface of a cell membrane. In certain embodiments, the test agent is tethered to the surface of a cell membrane. In other embodiments, the test agent is secreted from a cell. In yet another embodiment, the test agent binds to or associates with the surface of a cell membrane.
  • antigen-binding moieties are screened as test agents using the methods disclosed herein to identify a cell internalizing agent.
  • An example of an antigen-binding moiety is an antibody (e.g., an intact antibody) or an antibody fragment thereof. In certain embodiments, an antibody or fragment thereof is humanized or human. In one embodiment, the antibody is a monoclonal antibody.
  • Antibodies or antigen-binding fragments thereof that can be screened as test agents can be in various forms known in the art, e.g., full-length antibodies, bispecific antibodies, dual variable domain antibodies, multiple chain or single chain antibodies, and/or binding fragments that specifically bind an extracellular molecule, including but not limited to Fab, Fab', (Fab')2, Fv, scFv (single chain Fv), surrobodies (including surrogate light chain construct), single domain antibodies, camelized antibodies and the like. They also can be of, or derived from, any isotype, including, for example, IgA (e.g., IgA1 or IgA2), IgD, IgE, IgG (e.g.
  • the anti-CD117 antibody is an IgG (e.g., IgG1, IgG2, IgG3 or IgG4).
  • antigen-binding moieties that can be screened as test agents include, but are not limited to, a nanobody, a domain antibody, an scFv, a Fab, a diabody, a BiTE, a DART, a minibody, a F(ab’)2, an intrabody, or an antibody mimetic, e.g., a fibronectin based binding molecule, an affilin, an affimer, an affitin, an alphabody, an affibody, a DARPin, an anticalin, an avimer, a fynomer, a Kunitz domain peptide, a monobody, a nanoCLAMP, a unibody, a versabody, an aptamer, or a cyclotide.
  • Test agents may be directed to an antigen of interest, including, but not limited to, HLA-DR, CD44, CD22, CD3, CD20, CD33, CD32, CD47, CD59, CD54, CD25, AchR, CD70, CD74, CTLA4, EGFR, HER2, or EpCam.
  • exemplary targets include: (i) tumor-associated antigens; (ii) cell surface receptors; (iii) CD proteins and their ligands, such as CD3, CD4, CD8, CD19, CD20, CD22, CD25, CD32, CD33, CD34, CD40, CD44, CD47, CD54, CD59, CD70, CD74, CD79α (CD79a), and CD79β (CD79b); (iv) members of the ErbB receptor family such as the EGF receptor, HER2, HER3 or HER4 receptor; (v) cell adhesion molecules such as LFA-1, Mac1, p150,95, VLA-4, ICAM-1, VCAM and αv/β3 integrin including either alpha or beta subunits thereof (e.g., anti-CD11a, anti-CD18 or anti-CD11b antibodies); growth factors such as VEGF; IgE; blood group antigens; flk2/flt3 receptor; obesity (OB) receptor; mpl receptor; CTLA4; protein C, BR3, c-met, tissue factor, β7 etc.
  • antigens that can be targeted by the antibody, or an antigen-binding fragment thereof, include cell surface receptors such as those described in Chen and Flies. Nature Reviews Immunology. 13.4 (2013): 227, which is incorporated herein by reference.
  • Exemplary antibodies that may be identified by screening test agents using the methods disclosed include those selected from, and without limitation, an anti-HLA-DR antibody, an anti-CD3 antibody, an anti-CD20 antibody, an anti-CD22 antibody, an anti-CD25 antibody, an anti-CD32 antibody, an anti-CD33 antibody, an anti-CD44 antibody, an anti-CD47 antibody, an anti-CD54 antibody, an anti-CD59 antibody, an anti-CD70 antibody, an anti-CD74 antibody, an anti-AchR antibody, an anti-CTLA4 antibody, an anti-CXCR4 antibody, an anti-EGFR antibody, an anti-Her2 antibody, an anti-EpCam antibody, an anti-PD-1 antibody, or an anti-FAP1 antibody.
  • a CPP which is a cell internalizing agent induces the absorption of a linked protein or peptide through the plasma membrane of a cell.
  • CPPs induce entry into the cell because of their general shape and tendency to either self-assemble into a membrane-spanning pore, or to have several positively charged residues, which interact with the negatively charged phospholipid outer membrane inducing curvature of the membrane, which in turn activates internalization.
  • Exemplary permeable peptides include, but are not limited to, transportan, PEP1, MPG, p-VEC, MAP, CADY, polyR, HIV-TAT, HIV-REV, Penetratin, R6W3, P22N, DPV3, DPV6, K-FGF, and C105Y, and are reviewed in van den Berg and Dowdy (2011) Current Opinion in Biotechnology 22:888-893 and Farkhani et al. (2014) Peptides 57:78-94, each of which is herein incorporated by reference in its entirety.
  • a cell internalizing agent identified using the methods disclosed herein is a ligand, or binding fragment thereof.
  • Cell internalizing agents identified herein may be a ligand that binds to another molecule on or in a cell, including one or more cell surface receptors.
  • test agents are aptamers and are screened to identify a cell internalizing agent.
  • An “aptamer” used in the compositions and methods disclosed herein includes aptamer molecules made from either peptides or nucleotides. Peptide aptamers share many properties with nucleotide aptamers (e.g., small size and ability to bind target molecules with high affinity) and they may be generated by selection methods that have similar principles to those used to generate nucleotide aptamers, for example, Baines and Colas. 2006. Drug Discov Today. 11(7-8):334-41; and Bickle et al. 2006. Nat Protoc. 1(3):1066-91, which are incorporated herein by reference.
  • an aptamer is a small nucleotide polymer that binds to specific molecular targets.
  • Aptamers may be single or double stranded nucleic acid molecules (DNA or RNA), although DNA based aptamers are most commonly double stranded.
  • aptamer molecules are most commonly between 15 and 40 nucleotides long.
  • Aptamers often form complex three-dimensional structures which determine their affinity for target molecules. Aptamers can offer many advantages over simple antibodies, primarily because they can be engineered and amplified almost entirely in vitro. Furthermore, aptamers often induce little or no immune response.
  • Aptamers may be generated for testing using a variety of techniques, but were originally developed using in vitro selection (Ellington and Szostak. (1990) Nature. 346(6287):818-22) and the SELEX method (systematic evolution of ligands by exponential enrichment) (Schneider et al. 1992. J Mol Biol. 228(3):862-9), the contents of which are incorporated herein by reference. Other methods to make and uses of aptamers have been published including Klussmann. The Aptamer Handbook Functional Oligonucleotides and Their Applications. ISBN: 978-3-527-31059-3; Ulrich et al. 2006. Comb Chem High Throughput Screen 9(8):619-32; Cerchia and de Franciscis.
  • the methods disclosed herein are used to identify a cell internalizing agent that binds to a cell surface protein associated with a disorder.
  • cell surface proteins include, but are not limited to, a tyrosine kinase, an epidermal growth factor receptor (EGFR), a platelet-derived growth factor receptor (PDGFR), a fibroblast growth factor receptor (FGFR), a hepatocyte growth factor receptor (HGFR), a nerve growth factor receptor (NGFR), CD3, CD4, Tim-3, CD278, TNFR-I, IL-1R, LT-betaR, IL-18R, CCR1, CD26, CD94, CD119, CD183, CD195, or DPIV.
  • cell surface proteins include, but are not limited to, HLA-DR, CD44, CD22, CD3, CD20, CD33, CD32, CD47, CD59, CD54, CD25, AchR, CD70, CD74, CTLA4, EGFR, HER2, or EpCam.
  • a test agent is a protein or peptide found in a protein or peptide database (for example, SWISS-PROT, TrEMBL, SBASE, PFAM, CPPsite, or others known in the art), or a fragment or variant thereof.
  • a test agent may be a protein or peptide that may be derived (for example, by transcription and/or translation) from a nucleic acid sequence known in the art, such as a nucleic acid sequence found in a nucleic acid database (for example, GenBank, TIGR, CPPsite, or others known in the art), or a fragment or variant thereof.
  • the methods disclosed herein are useful for screening a library of cells having a plurality of genotypes for a cell having a phenotype of interest, such as a cell producing a protein or other molecule having a phenotype of interest.
  • the method is available for screening all cell types, e.g., mammalian, yeast, fungal, bacterial, and insect, that are able to survive and/or multiply in the array.
  • Phenotypes of interest can include any biological process that renders a detectable result, including but not limited to production, secretion and/or display of polypeptides and nucleic acids.
  • Libraries of cells having a plurality of genotypes associated with detectable phenotypes can be generated by methods involving error prone PCR, random activation of gene expression, phage display, overhang-based DNA block shuffling, random mutagenesis, in vitro DNA shuffling, site-specific recombination, and other methods generally known to those of skill in the art.
  • the method may involve generating a library of test agents that are primarily expressed on the surface of a cell (e.g., an Expi293 cell (Innovative Targeting Solutions, Inc.)).
  • test agents can be selected from a library of randomly mutated proteins.
  • the method can include mutagenizing a test agent (e.g., through random mutagenesis) and preparing a library of mutagenized proteins. The mutagenized test agent can then be assessed as a cell internalizing agent, as described herein.
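As an in silico illustration of preparing a library of mutagenized proteins, the sketch below applies random point substitutions to a parent amino acid sequence. It is a toy model only: the per-residue mutation rate, the fixed seed, and the names `mutagenize` and `build_library` are assumptions made for illustration and do not describe any wet-lab protocol in the disclosure.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def mutagenize(seq: str, rate: float, rng: random.Random) -> str:
    """Replace each residue with a different one with probability `rate`."""
    out = []
    for residue in seq:
        if rng.random() < rate:
            # Pick any residue other than the current one.
            out.append(rng.choice([a for a in AMINO_ACIDS if a != residue]))
        else:
            out.append(residue)
    return "".join(out)


def build_library(parent: str, size: int, rate: float = 0.05, seed: int = 0) -> list:
    """Return `size` independently mutagenized variants of `parent`."""
    rng = random.Random(seed)  # seeded so that a run is reproducible
    return [mutagenize(parent, rate, rng) for _ in range(size)]


if __name__ == "__main__":
    # Hypothetical parent sequence, used only to exercise the function.
    for variant in build_library("MKTAYIAKQR", size=5, rate=0.2):
        print(variant)
```

Each variant would then be assessed as a candidate cell internalizing agent in the screen described in this section.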
  • the method for generating a library of test agents may involve incorporating chemically synthetic variant pools into a plasmid or virus for subsequent extra-genomic or genomic incorporation.
  • a test agent is an antibody or bispecific antibody found in the xEmplar library repertoire (xCella BioSciences).
  • the system and method for generating a library of test agents include an in vitro system for generating large repertoires of protein structural diversity de novo (e.g., antibodies and/or polypeptides) by harnessing V(D)J recombination biochemistry.
  • the system is an in vitro system for generating antibody diversity constructed using appropriately selected nucleic acid molecules that comprise immunoglobulin V, D, J and C region encoding polynucleotide sequences and recombination signal sequences (RSS).
  • the system allows for the generation of greater immunoglobulin structural diversity in vitro through selection of appropriate relative representation of the immunoglobulin gene elements to generate a highly diverse repertoire.
  • such enhanced structural diversity is obtained when the ratio of VH region genes to D segment genes is about 1:1 to 1:2 and the ratio of JH segment genes to D segment genes is about 1:1 to 1:2, or when the ratio of VH region genes to JH segment genes is about 1:2 (V to J) to 2:1 (V to J), or when the combined number of VH region genes together with JH segment genes is not greater than the number of D segment genes when there is a plurality of D gene segments, or when 6, 7, 8, 9, 10, 11 or 12 D segment genes are present.
  • a parameter that is described as being “about” a certain quantitative value typically may have a value that varies (i.e., may be greater than or less than) from the stated value by no more than 50%, and in preferred embodiments by no more than 40%, 30%, 25%, 20%, 15%, 10% or 5%.
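The gene-segment ratio criteria and the meaning of "about" stated above can be checked mechanically for a proposed composition. The following is a rough sketch in which each alternative criterion is reported separately; the function names, the default 5% tolerance, and the choice to widen both ends of each ratio range by that tolerance are assumptions made for illustration.

```python
def within_about(value: float, low: float, high: float, tol: float = 0.05) -> bool:
    """True if value lies in [low, high] after widening each end by tol."""
    return low * (1 - tol) <= value <= high * (1 + tol)


def check_repertoire(n_vh: int, n_d: int, n_jh: int, tol: float = 0.05) -> dict:
    """Evaluate each ratio criterion for the given gene segment counts.

    A ratio of 1:1 to 1:2 (X to D) is expressed as X/D between 0.5 and 1.0.
    """
    return {
        "VH:D about 1:1 to 1:2": within_about(n_vh / n_d, 0.5, 1.0, tol),
        "JH:D about 1:1 to 1:2": within_about(n_jh / n_d, 0.5, 1.0, tol),
        "VH:JH about 1:2 to 2:1": within_about(n_vh / n_jh, 0.5, 2.0, tol),
        "VH+JH not greater than D": n_vh + n_jh <= n_d,
        "6 to 12 D segments": 6 <= n_d <= 12,
    }


if __name__ == "__main__":
    for name, ok in check_repertoire(n_vh=6, n_d=12, n_jh=6).items():
        print(f"{name}: {ok}")
```

The tolerance can be raised as far as 0.5 to reflect the broadest reading of "about" given above.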
  • a nucleic acid composition for generating immunoglobulin structural diversity may be assembled from certain immunoglobulin gene elements, including naturally occurring and artificial sequences, using genetic engineering methodologies and molecular biology techniques with which those skilled in the art will be familiar.
  • Useful immunoglobulin genetic elements for producing the compositions described herein include mammalian Ig heavy chain variable (VH) and light chain variable (VL) region genes, natural or artificial Ig diversity (D) segment genes, Ig heavy chain joining (JH) and light chain joining (JL) segment genes, and Ig locus recombination signal sequences (RSSs).
  • Immunoglobulin variable (V) region genes are known in the art and include in their polypeptide-encoding sequences at least the polynucleotide coding sequence for one antibody complementarity determining region (CDR), for example, a first or a second CDR known as CDR1 or CDR2 according to conventional nomenclature with which those skilled in the art will be familiar, preferably coding sequence for two CDRs, for example, CDR1 and CDR2, and more preferably coding sequence for CDR1 and CDR2 and at least a portion (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13 or more amino acids) of CDR3, where it will be appreciated that typically one or more amino acids of CDR3 may be encoded at least in part by at least one nucleotide that is present in a D segment gene and/or in a J segment gene.
  • libraries containing test agents having sequence diversity within a known protein sequence are made by targeted introduction of two or more recombination signal sequences (RSSs) into the protein coding sequence and subsequent introduction of the modified protein coding sequence into a recombination-competent host cell, specifically a host cell that is capable of expressing at least RAG-1 and RAG-2, resulting in the generation and expression of variants of the protein.
  • Such mutations provide for the generation of a very large number of protein variants such that, in certain embodiments, mutations imparting the desired functionality to the protein can be identified, e.g., in a single round.
  • the methods as set forth herein generally comprise the steps of introducing a pair of RSSs at a selected location within the coding sequence for a test agent, and introducing the modified coding sequence into a recombination-competent host cell to allow for recombination and expression of variants of the test agent.
  • the method of generating variants of a test agent comprises the steps of: providing a polynucleotide comprising a nucleic acid sequence encoding a test agent and comprising a complementary pair of RSSs, introducing the polynucleotide into a recombination-competent host cell, the host cell capable of expressing at least RAG-1 and RAG-2, and culturing the host cell in vitro under conditions allowing recombination and expression of the polynucleotide, thereby generating variants of the test agent.
  • the methods further comprise screening the variant test agents for variants having defined functional characteristics.
  • the methods are applied to a test agent that is an antigen-binding moiety.
  • the methods are applied to an antigen-binding moiety in order to introduce sequence diversity into a loop region involved in antigen binding and comprise the steps of: providing a polynucleotide comprising a nucleic acid sequence encoding a test antigen-binding moiety, the nucleic acid sequence comprising a complementary pair of RSSs in a region of the sequence encoding an antigen-binding loop of the protein, introducing the polynucleotide into a recombination-competent host cell, and culturing the host cell under conditions allowing recombination and expression of the polynucleotide, thereby generating variants of the test antigen-binding moiety.
  • the host cell may constitutively express RAG-1 and RAG-2, and optionally TdT, or one or more of these proteins may be under inducible control.
  • expression of one or more of RAG-1 and RAG-2, and optionally TdT, in the host cell is under inducible control allowing, for example, for expansion of the host cell prior to the induction of sequence diversity generation.
  • the method comprises the steps of: providing a polynucleotide comprising a nucleic acid sequence encoding a test agent and comprising a pair of RSSs, introducing the polynucleotide into a recombination-competent host cell, the host cell capable of expressing at least RAG-1 and RAG-2 and optionally TdT, wherein expression of one or more of RAG-1, RAG-2 and TdT is under inducible control, culturing the host cell under conditions allowing expansion of the host cell, inducing expression of one or more of RAG-1, RAG-2 and TdT, and culturing the expanded host cells under conditions allowing recombination and expression of the polynucleotide, thereby generating variants of the test agent.
  • the polynucleotide may be introduced into the host cell on a suitable vector and may be, for example, stably integrated into the genome of the cell, stably maintained exogenously to the genome or transiently expressed.
  • the polynucleotide may comprise additional pairs of RSSs allowing for generation of additional sequence diversity in the protein.
  • the polynucleotide comprises two complementary pairs of RSSs, each pair positioned to introduce sequence diversity into a different region of the test agent.
  • the recombination signal sequence, as set forth herein, preferably consists of two conserved sequences (for example, heptamer, 5'-CACAGTG-3', and nonamer, 5'-ACAAAAACC-3'), separated by a spacer of either 12 +/- 1 bp (a “12-signal” RSS) or 23 +/- 1 bp (a “23-signal” RSS).
  • two RSSs one 12-signal RSS and one 23-signal RSS
  • Recombination does not occur between two RSS signals with the same size spacer.
  • the orientation of the RSS determines if recombination results in a deletion or inversion of the intervening sequence.
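The RSS layout and the spacer rule described above lend themselves to a short check. The sketch below classifies a candidate sequence as a 12-signal or 23-signal RSS and tests whether two RSSs have different spacer classes. It assumes the example consensus heptamer and nonamer given above, in heptamer-spacer-nonamer order on one strand, and ignores sequence variants and orientation; the function names are hypothetical.

```python
from typing import Optional

HEPTAMER = "CACAGTG"    # example consensus heptamer from the text
NONAMER = "ACAAAAACC"   # example consensus nonamer from the text


def classify_rss(seq: str) -> Optional[str]:
    """Return '12-signal', '23-signal', or None if seq is not a consensus RSS."""
    seq = seq.upper()
    if not (seq.startswith(HEPTAMER) and seq.endswith(NONAMER)):
        return None
    spacer_len = len(seq) - len(HEPTAMER) - len(NONAMER)
    if 11 <= spacer_len <= 13:   # 12 +/- 1 bp
        return "12-signal"
    if 22 <= spacer_len <= 24:   # 23 +/- 1 bp
        return "23-signal"
    return None


def can_recombine(rss_a: str, rss_b: str) -> bool:
    """True only for one 12-signal RSS paired with one 23-signal RSS."""
    kinds = {classify_rss(rss_a), classify_rss(rss_b)}
    return kinds == {"12-signal", "23-signal"}


if __name__ == "__main__":
    rss12 = HEPTAMER + "N" * 12 + NONAMER
    rss23 = HEPTAMER + "N" * 23 + NONAMER
    print(can_recombine(rss12, rss23))  # different spacer classes
    print(can_recombine(rss12, rss12))  # same spacer class
```

Whether a permitted pair produces a deletion or an inversion depends on orientation, which this sketch does not model.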
  • nucleotide positions within RSSs cannot be varied without compromising RSS functional activity in genetic recombination mechanisms, which nucleotide positions within RSSs can be varied to alter (for example, increase or decrease in a statistically significant manner) the efficiency of RSS functional activity in genetic recombination mechanisms, and which positions within RSSs can be varied without having any significant effect on RSS functional activity in genetic recombination mechanisms (see, for example, Ramsden et al, 1994, Proc Natl Acad Sci USA 88(23):10721-10725).
  • the RSS selected for inclusion in the test agent coding sequence is a RSS that is known to the art.
  • sequence variants of known RSSs that comprise one or more nucleotide substitutions (for example, about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18 or more substitutions) relative to the known RSS sequence and which, by virtue of such substitutions, predictably have low efficiency (for example, about 1% or less, relative to a high efficiency RSS), medium efficiency (for example, about 10% to about 20%, relative to a high efficiency RSS) or high efficiency.
  • RSS variants for which one or more nucleotide substitutions relative to a known RSS sequence will have no significant effect on the recombination efficiency of the RSS (for example, the success rate of the RSS in promoting formation of a recombination product, as known in the art).
  • RSSs selected for inclusion in the test agent coding sequence are pairs of RSSs in which the first RSS of the pair is capable of functional recombination with the second RSS of the pair (i.e. "complementary pairs").
  • a first RSS for example present in a first polynucleotide or nucleic acid sequence
  • a second RSS for example present in a second polynucleotide or nucleic acid sequence
  • such capability includes compliance with the above-noted 12/23 rule for RSS spacers, such that if the first RSS comprises a 12-nucleotide spacer then the second RSS will comprise a 23-nucleotide spacer, and similarly if the first RSS comprises a 23-nucleotide spacer then the second RSS will comprise a 12-nucleotide spacer.
  • the RSSs are positioned at a pre-determined (targeted) location or locations within the test agent coding sequence.
  • the targeted location is selected such that it is within a non-conformational region of the protein (i.e. a region of the protein that is not important for folding and/or adoption of the protein's functional conformation). In some embodiments, the targeted location is selected such that it is within an externally exposed region of the protein.
  • the targeted location is selected such that it is within or proximal to a ligand-binding domain, or in a region that otherwise impacts on ligand binding by the protein.
  • Immunoglobulin D segment genes are known in the art and may include coding regions for natural or non-naturally occurring D segments which coding regions comprise 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23 or 24 nucleotides.
  • Immunoglobulin J segment genes are also known in the art, for example, from immunoglobulin genes or cDNAs that have been sequenced, and typically comprise J segment-encoding regions of about 1-51 nucleotides.
  • Ig gene sequences are therefore known in the art (e.g., Kabat et al., Sequences of Proteins of Immunological Interest, Edition: 5, 1992 DIANE Publishing, 1992, Darby, PA, ISBN 094137565X, 9780941375658; Tomlinson et al., 1992 J Mol Biol 227:776; Milner et al., 1995 Ann N Y Acad Sci 764:50).
  • membrane anchor domain polypeptide encoding polynucleotide sequences and variants or fragments thereof (e.g., primary sequence variants or truncated products that retain 3D structural properties of the corresponding unmodified polypeptide, such as space-filling, charge distribution and/or hydrophobicity/hydrophilicity) that encode membrane anchor domain polypeptides that localize the polypeptides in which they are present to the surfaces of cells in which they are expressed.
  • Specific binding interactions such as a specific protein-protein association or a specific antibody-antigen binding interaction may preferably include a protein-protein binding event, or an antibody-antigen binding event, having an affinity constant, Ka, of greater than or equal to about 10⁴ M⁻¹, more preferably of greater than or equal to about 10⁵ M⁻¹, more preferably of greater than or equal to about 10⁶ M⁻¹, and still more preferably of greater than or equal to about 10⁷ M⁻¹.
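To relate the affinity thresholds above to the dissociation constants that many assays report, the sketch below converts between the two and finds the highest threshold a measured value meets. It assumes the affinity constant is an association constant in M⁻¹, so that Kd = 1/Ka, and it ignores the "about" qualifier; the function names are hypothetical.

```python
from typing import Optional

# Thresholds from the text, in M^-1, from least to most stringent.
THRESHOLDS_PER_MOLAR = [1e4, 1e5, 1e6, 1e7]


def kd_from_ka(ka_per_molar: float) -> float:
    """Dissociation constant (M) for a 1:1 interaction, Kd = 1 / Ka."""
    return 1.0 / ka_per_molar


def highest_threshold_met(ka_per_molar: float) -> Optional[float]:
    """Return the largest listed threshold that ka meets, or None."""
    met = [t for t in THRESHOLDS_PER_MOLAR if ka_per_molar >= t]
    return max(met) if met else None


if __name__ == "__main__":
    ka = 5e6  # hypothetical measurement, M^-1
    print(f"Kd = {kd_from_ka(ka):.2e} M, meets >= {highest_threshold_met(ka):.0e} M^-1")
```

Under this reading, the least stringent threshold of 10⁴ M⁻¹ corresponds to a Kd of at most 100 µM and the most stringent, 10⁷ M⁻¹, to a Kd of at most 100 nM.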
  • Affinities of specific binding partners including antibodies can be readily determined using conventional techniques, for example, those described by Scatchard et al. (Ann. N. Y. Acad. Sci. USA 51:660 (1949)), by Harlow et al.,
  • Binding or affinity between an antigen and a test agent can be determined using a variety of techniques known in the art, for example but not limited to, equilibrium methods (e.g., enzyme-linked immunosorbent assay (ELISA); KinExA, Rathanaswami et al. Analytical Biochemistry, Vol. 373:52-60, 2008; or radioimmunoassay (RIA)), or by a surface plasmon resonance assay or other mechanism of kinetics-based assay (e.g., BIACORE™ analysis or Octet™ analysis (forteBIO)), and other methods such as indirect binding assays, competitive binding assays, fluorescence resonance energy transfer (FRET), gel electrophoresis and chromatography (e.g., gel filtration).
  • binding affinities and kinetics can be found in Paul, W. E., ed., Fundamental Immunology, 4th Ed., Lippincott-Raven, Philadelphia (1999), which focuses on antibody-immunogen interactions.
  • a competitive binding assay is a radioimmunoassay comprising the incubation of labeled antigen with the cell internalizing agent of interest in the presence of increasing amounts of unlabeled antigen, and the detection of the cell internalizing agent bound to the labeled antigen.
  • the affinity of the test agent for a particular antigen and the binding off-rates can be determined from the data by Scatchard plot analysis. Competition with a second agent can also be determined using radioimmunoassays.
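A Scatchard analysis of equilibrium binding data can be sketched as a straight-line fit. The example below assumes a single class of non-interacting binding sites, for which bound/free = (Bmax − bound)/Kd, so the slope of bound/free against bound is −1/Kd and the x-intercept is Bmax. It covers only the equilibrium affinity, not off-rates; the function name and the synthetic data are hypothetical.

```python
def scatchard_fit(bound, free):
    """Least-squares line through (bound, bound/free); returns (Kd, Bmax).

    `bound` and `free` are equal-length sequences of ligand concentrations
    in the same units; Kd and Bmax are returned in those units.
    """
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope            # slope = -1 / Kd
    bmax = -intercept / slope    # x-intercept of the fitted line
    return kd, bmax


if __name__ == "__main__":
    # Synthetic, noise-free data generated with Kd = 2.0 and Bmax = 10.0.
    free = [0.5, 1.0, 2.0, 4.0, 8.0]
    bound = [10.0 * f / (2.0 + f) for f in free]
    kd, bmax = scatchard_fit(bound, free)
    print(f"Kd = {kd:.3f}, Bmax = {bmax:.3f}")
```

With real, noisy data the linear transform distorts the error structure, so a direct nonlinear fit of the binding isotherm is often preferred; the linear form is shown here because it is the analysis named in the text.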
  • the antigen is incubated with test agent conjugated to a labeled compound in the presence of increasing amounts of an unlabeled second cell internalizing agent.
  • the invention includes a method for identifying a cell internalizing agent, the method comprising providing a population of target cells expressing or displaying test agents.
  • Each of the target cells contains a reporter construct that comprises a nucleic acid that provides a certain phenotype when activated or repressed.
  • the phenotype is indicative of gene editing and represents internalization of the test agent in such a manner that when the test agent is conjugated to a site-directed modifying polypeptide, internalization of the complex results in gene editing, which is in turn represented by the observed phenotype.
  • the target cell is a eukaryotic cell.
  • eukaryotic cells that may be used to express the test agents include mammalian cells or yeast cells.
  • mammalian cell lines that may be used with the methods disclosed herein include, but are not limited to HEK 293 (Human embryonic kidney) and CHO (Chinese hamster ovary). These cell lines can be transfected by standard methods to express the test agents (e.g., using calcium phosphate or polyethyleneimine (PEI), or electroporation).
  • mammalian cell lines include, but are not limited to: HeLa, U2OS, A549, HT1080, CAD, P19, NIH 3T3, L929, N2a, Human embryonic kidney 293 cells, MCF-7, Y79, SO-Rb50, Hep G2, DUKX-X11, J558L, and Baby hamster kidney (BHK) cells.
  • the mammalian cell is a COP cell, an L cell, a C127 cell, an Sp2/0 cell, an NS-0 cell, an NIH3T3 cell, a PC12 cell, a PC12h cell, a BHK cell, a CHO cell, a COS1 cell, a COS3 cell, a COS7 cell, a CV1 cell, a Vero cell, a HeLa cell, an HEK-293 cell, a PER C6 cell, a cell derived from diploid fibroblasts, a myeloma cell, or HepG2.
  • test agents e.g., a protein, a lipid, or a carbohydrate
  • a conjugation moiety that allows the agent to be conjugated to a site-directed modifying polypeptide (described in more detail below).
  • site-directed modifying polypeptide also contains a conjugation moiety that is complementary to that on the test agent, such that the two molecules can be conjugated. Conjugation methods and moieties are provided in more detail below.
  • screening of the test agents is further performed by contacting a population of target cells with a nucleoprotein comprising a site-directed modifying polypeptide and a guide RNA (gRNA) that specifically hybridizes to a target nucleic acid within the target cell.
  • the reporter construct containing the target nucleic acid sequence is located within the nucleus of the target cell.
  • the nucleoprotein comprises a conjugation moiety (i.e., a second moiety) that binds to the first conjugation moiety of the test agent.
  • Cell internalizing agents are subsequently identified by selecting a target cell having the phenotype observed with activation or repression of the reporter construct (hereafter referred to as a modified target cell).
  • the phenotype of interest is associated with repression or activation of a reporter construct.
  • the phenotype of interest is a fluorescent protein that has at least one of an absorption or emission intensity of interest, an absorption or emission spectrum of interest, and a Stokes shift of interest.
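Selecting target cells by a fluorescent phenotype can be illustrated with a simple gate. The sketch below takes the Stokes shift as the emission peak wavelength minus the absorption peak wavelength and keeps cells that meet an intensity cutoff and a minimum shift. The `Reading` record, the cutoffs, and the example values are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    cell_id: str
    absorption_peak_nm: float
    emission_peak_nm: float
    emission_intensity: float  # arbitrary units


def stokes_shift_nm(reading: Reading) -> float:
    """Emission peak minus absorption peak, in nanometres."""
    return reading.emission_peak_nm - reading.absorption_peak_nm


def select_cells(readings, min_intensity: float, min_shift_nm: float = 0.0) -> list:
    """Return the IDs of cells that pass both the intensity and shift gates."""
    return [
        r.cell_id
        for r in readings
        if r.emission_intensity >= min_intensity
        and stokes_shift_nm(r) >= min_shift_nm
    ]


if __name__ == "__main__":
    readings = [
        Reading("cell-1", 488.0, 509.0, 1200.0),  # bright, 21 nm shift
        Reading("cell-2", 488.0, 509.0, 40.0),    # dim
        Reading("cell-3", 488.0, 495.0, 1500.0),  # bright, 7 nm shift
    ]
    print(select_cells(readings, min_intensity=500.0, min_shift_nm=15.0))
```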
  • the phenotype of interest may be the production of a protein having enzymatic activity, a protein having a lack of inhibition of enzyme activity, and a protein having activity in the presence of an inhibitor for the enzyme.
  • a target cell by applying selective pressure to a population of cells comprising a reporter construct, where expression of the reporter construct confers a survival benefit to a target cell.
  • a target cell or a plurality of target cells that survive the selection process can be collected and the test agent of the target cell can be isolated and identified using techniques known in the art (i.e., positive selection).
  • a guide RNA is complexed with a site-directed modifying polypeptide (e.g., Cas9) wherein the resulting nucleoprotein is capable of modifying a nucleic acid sequence encoding a reporter construct, wherein the nucleic acid editing activity of the resulting nucleoprotein results in the expression of an antibiotic resistance gene, capable of conferring a survival advantage to the target cell.
  • a target cell of interest i.e., a target cell comprising a test agent capable of internalizing the site-directed modifying polypeptide
  • a target cell of interest is selected for target cell survival in the presence of an antibiotic (i.e., by selecting a cell that is not sensitive to the antibiotic) due to modification of the reporter construct such that the reporter construct is capable of expressing the antibiotic resistance gene.
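The positive-selection readout can be summarized per test agent once surviving cells have been identified. The sketch below is a toy bookkeeping model: it assumes that a cell survives the antibiotic if and only if its reporter construct was edited, and it reports the surviving fraction for each test agent. The input format and the function name are hypothetical.

```python
from collections import Counter


def survival_fraction(cells) -> dict:
    """Fraction of surviving cells per test agent.

    `cells` is an iterable of (test_agent_id, reporter_edited) pairs, where
    reporter_edited is True when the resistance gene became expressible.
    """
    total = Counter()
    alive = Counter()
    for agent, edited in cells:
        total[agent] += 1
        if edited:  # assumed equivalent to surviving selection
            alive[agent] += 1
    return {agent: alive[agent] / total[agent] for agent in total}


if __name__ == "__main__":
    observed = [("A", True), ("B", False), ("A", True), ("C", True), ("C", False)]
    for agent, fraction in sorted(survival_fraction(observed).items()):
        print(agent, fraction)
```

Test agents with a high surviving fraction would be the candidates to isolate and identify as cell internalizing agents.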
  • the invention provides a method of screening test cells having a plurality of genotypes for a cell that produces a cell internalizing agent using an array.
  • arrays include, but are not limited to, a microfluidic system, a microbubble system, or a microcavity array.
  • An array-based screening method includes, but is not limited to, providing an array with a library of protein-variant-producing cells that each express a test agent, and then incubating the array under conditions that allow for the production of test agents from the protein-variant-producing cells.
  • a target cell comprising a reporter construct comprising a nucleic acid that provides a phenotype when activated or repressed is provided and contacted with the test agent under conditions that allow for the test agent to bind to the target cell, wherein the test agent comprises a first conjugation moiety.
  • the target cell is then contacted with a nucleoprotein comprising a site-directed modifying polypeptide and a guide RNA (gRNA) that specifically hybridizes to a target nucleic acid of the target cell, wherein a second conjugation moiety is conjugated to the site-directed modifying polypeptide such that the first conjugation moiety of the test agent binds to the second conjugation moiety of the site-directed modifying polypeptide.
  • the array is examined in order to determine a given result which is an expected phenotype associated with activation or repression of the nucleic acid in the reporter construct.
  • the result is representative of the test agent being capable of internalizing the nucleic acid guided nuclease such that gene editing can occur.
  • the target cell is extracted (e.g., with electromagnetic radiation) such that the cell internalizing agent (i.e., the successful test agent) can be identified using standard techniques.
  • the array may be designed such that some or all cavities (or chambers of a device) contain a single biological element (e.g., a single target cell) to screen for the cell internalizing agent.
  • concentration of the heterogeneous mixture of cells is therefore calculated according to the design of the array and desired agents to identify.
  • the array may be loaded by contacting a solution containing a plurality of cells, such as a heterogeneous population of cells, with the array.
  • loading a mixture of test agent displaying or secreting cells, e.g., yeast or mammalian cells, evenly into all the microcavities involves placing a 500 µL droplet on the upper side of the array and spreading it over all the micro-pores.
  • an initial concentration of approximately 10⁹ cells in the 500 µL droplet results in approximately 3 cells (or subpopulation) per micro-cavity.
  • the concentration conditions are set such that in most cavities of the array only single elements are present. This allows for the most precise screening of single elements. However, the exact number will depend on the volume of the cavity in the array and the concentration of cells in solution.
  • the cells may secrete or display at least one test agent.
  • concentration conditions can be readily calculated by the person of skill in the art.
  • the ratio of protein-producing cells to cavities is about 1 to 3
  • single cells are not desired in each pore.
  • the concentration of the heterogeneous population is set so that more than one cell is found in each pore.
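The loading figures above can be checked with a short calculation. The sketch below is illustrative only and is not taken from the disclosure: it assumes that cells distribute randomly and independently among cavities (a Poisson model), and it borrows the figure of about 330 million micro-pores given elsewhere in the description for a 10x10 cm array. The function names are hypothetical.

```python
import math


def mean_cells_per_cavity(total_cells: float, total_cavities: float) -> float:
    """Average number of cells per cavity for a uniformly spread sample."""
    return total_cells / total_cavities


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k cells in a cavity under a Poisson model (assumption)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def occupancy_fractions(lam: float) -> dict:
    """Expected fractions of empty, single-cell, and multi-cell cavities."""
    empty = poisson_pmf(0, lam)
    single = poisson_pmf(1, lam)
    return {"empty": empty, "single": single, "multiple": 1.0 - empty - single}


def cells_for_target_mean(target_mean: float, total_cavities: float) -> float:
    """Number of cells to load in order to reach a desired mean occupancy."""
    return target_mean * total_cavities


if __name__ == "__main__":
    cavities = 3.3e8  # about 330 million micro-pores (figure from the description)
    lam = mean_cells_per_cavity(1e9, cavities)  # about 10^9 cells in the droplet
    print(f"mean cells per cavity: {lam:.2f}")
    print(occupancy_fractions(lam))
    # A lower mean favors cavities that hold at most one cell.
    print(occupancy_fractions(0.3))
```

Under this model, a mean of roughly 3 cells per cavity leaves few cavities with a single cell, whereas a mean well below 1 leaves most occupied cavities with a single cell, which matches the two loading regimes described above.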
  • a sample containing the population and/or library of cells may require preparation steps prior to distribution to the array (e.g., genetic engineering of the genomic nucleic acid, or incubation or expansion of a particular cell line).
  • these preparation steps include an incubation time.
  • the incubation time will depend on the design of the screen and the cells being screened. Example times include 5 minutes, 1 hour, 3 hours, 6 hours, 12 hours, 1 day, 2 days and 3 days or more.
  • the heterogeneous population of cells may be expanded in media prior to adding and/or loading onto the array.
  • additional molecules or particles can be added or removed from the array without disturbing the cells.
  • any biological reactive molecule or particle useful in the detection of the target cells can be added.
  • These additional molecules or particles can be added to the array by introducing liquid reagents comprising the molecules or particles to the top of the array, such as for example by adding drop-wise as described herein in relation to the addition of the cells.
  • the top of the array is sealed with a membrane following the addition of sample to the cavities in order to reduce evaporation of the media from the cavities.
  • for example, typical food-service type plastic wraps such as polyvinylidene fluoride (PVDF) are suitable membranes.
  • the membrane allows water vapor to equilibrate with the top liquid layer of the liquid in the pore, which can help prevent evaporation.
  • the top of the array is covered with a semi-permeable composition that allows delivery of liquid and reagents to the cavities of the array while also preventing evaporation of the cavity contents.
  • the disclosure is directed to an array including a plurality of distinct cavities comprising open first ends and open second ends, wherein the open first ends of essentially all of the plurality of cavities collectively comprise a first porous planar surface, and the open second ends of essentially all of the plurality of cavities comprise a second porous planar surface, and a cover for the first surface that imparts at least one of moisture, a nutrient, or a biologically reactive molecule to contents of the cavities.
  • the array is scanned to identify cavities containing cells having a phenotype of interest, which may include cells capable of internalizing a test agent and allowing editing of a nucleic acid sequence by a site-directed modifying polypeptide (or nucleoprotein).
  • the intercapillary variability in fluorescence signals detected from the array may be measured.
  • the passive nature of the microcapillary filling process results in a uniform meniscus level across the entire array. This uniformity, coupled with gravitational sedimentation of the loaded cells, simplifies the establishment of the imaging focus plane without the need for autofocus. Rather, the focus may be set at three distantly spaced points on the array, for example the corners. From these three points, the plane of the microcapillary array may be calculated.
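The three-point focus calculation can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the disclosure: each focus point is taken to be a stage position (x, y) paired with the focus height z found there, and the function names and coordinate values are hypothetical.

```python
def plane_from_three_points(p1, p2, p3):
    """Return (a, b, c) such that z = a*x + b*y + c passes through all three points.

    Each point is an (x, y, z) tuple: stage position plus the focus height set there.
    """
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = p1, p2, p3
    # Two in-plane vectors starting at p1.
    ux, uy, uz = x2 - x1, y2 - y1, z2 - z1
    vx, vy, vz = x3 - x1, y3 - y1, z3 - z1
    # Plane normal is the cross product u x v.
    nx = uy * vz - uz * vy
    ny = uz * vx - ux * vz
    nz = ux * vy - uy * vx
    if nz == 0:
        raise ValueError("focus points are collinear in (x, y); choose spread-out points")
    a = -nx / nz
    b = -ny / nz
    c = z1 - a * x1 - b * y1
    return a, b, c


def focus_height(plane, x, y):
    """Interpolated focus height at stage position (x, y)."""
    a, b, c = plane
    return a * x + b * y + c


if __name__ == "__main__":
    # Hypothetical focus readings at three corners of the array (arbitrary units).
    plane = plane_from_three_points((0, 0, 1.0), (10, 0, 2.0), (0, 10, 3.0))
    print(focus_height(plane, 10, 10))
```

Choosing widely spaced points such as the corners keeps the fit well conditioned, since small focus errors at nearby points would otherwise be amplified across the array.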
  • target cavities with the desired properties are identified and their contents extracted for further characterizations and expansion.
  • the disclosed methods maintain the integrity of the biological elements in the cavities. Therefore, the methods disclosed herein provide for the display and independent recovery of a target population of test agents from a population of up to billions of test agents.
  • the signals from each cavity are scanned to locate the binding events of interest. This identifies the cavities of interest.
  • Individual cavities containing the desired clones can be extracted using a variety of methods. For all extraction techniques, the extracted cells or material can be expanded through culture or amplification reactions and identified for the recovery of the test agent, e.g., a protein or an antibody. Following screening, one or more cavities of interest can be extracted as described herein. In certain embodiments, the desired specificity will be a single biological element per pore or cavity.
  • a test agent having a desired characteristic is detected directly in the cavities of the array.
  • the biochemical sensing can be done using standard detection techniques including a sandwich immunoassay or similar binding or hybridizing reactions.
  • the property of interest may be at least one of an emission spectra or emission intensity, a Stokes shift, and an absorption spectra or absorption intensity.
  • the method of the disclosure can be used to screen a library of cells expressing test agents having a particular fluorescent absorbance spectrum, emission spectra and/or extinction coefficient.
  • the protein is a dimerization dependent orange fluorescent protein (ddOFP).
  • Detection of test agents in accordance with the disclosure requires, in some embodiments, the use of an apparatus capable of applying electromagnetic radiation to the sample, and in particular, to an array of cavities, such as a microarray.
  • the apparatus must also be capable of detecting electromagnetic radiation emitted from the sample, and in particular a sample cavity.
  • an electromagnetic radiation source of the apparatus is broad spectrum light or a monochromatic light source having a wavelength that matches the wavelength of at least one label in a sample.
  • the electromagnetic radiation source is a laser, such as a continuous wave laser.
  • the electromagnetic source is a solid state UV laser.
  • the apparatus may also include, in certain embodiments, a detector that receives electromagnetic (EM) radiation from the label(s) in the sample, array.
  • the detectors can identify at least one cavity (e.g., a microcavity) emitting electromagnetic radiation from one or more labels.
  • light (e.g., light in the ultra-violet, visible or infrared range) emitted by a fluorescent label after exposure to electromagnetic radiation is detected.
  • any suitable detection mechanism known in the art may be used without departing from the scope of the disclosure, for example a CCD camera, a video input module camera, a Streak camera, a bolometer, a photodiode, a photodiode array, avalanche photodiodes, and photomultipliers producing sequential signals, and combinations thereof.
  • the number of cells in the sample liquid results in a diverse population of cells in each cavity. Following extraction and expansion of the contents of a particular cavity, the resulting population can be screened in subsequent steps to identify particular cells of interest.
  • the number of cells in a sample liquid is less than the number of cavities in the array, resulting in the loading of one cell or fewer in each of the cavities.
  • the contents of the cavities can be extracted with the apparatus and methods known in the art.
  • the cavity contents can be further analyzed or expanded. Expanded cell populations from a cavity or cavities can be rescreened with the array according to the methods herein. For instance, if the number of biological elements in a population exceeds the number of cavities in the array, the population can be screened with more than one element in each pore.
  • the contents of the cavities that provide a positive signal can then be extracted to provide a subpopulation.
  • the subpopulation can be screened immediately or, when the subpopulation is cells, it can be expanded.
  • the screening process can be repeated until each cavity of the array contains only a single element.
  • the screen can also be applied to detect and/or extract the cavity that indicates the desired analyte is therein. Following the selection of the cavity, other conventional techniques may be used to isolate the individual agents of interest, such as techniques that provide for higher levels of protein production.
  • in an initial screen of the library or in the enrichment process, multiple cells may be added to any particular cavity.
  • Cell contents may be extracted and further analyzed or enriched in accordance with the method of the disclosure.
  • having one cell per cavity allows for identification of a particular genotype.
  • the extracting may include discretely directing electromagnetic radiation to the cavities having cells producing proteins having a phenotype of interest, wherein the directing of electromagnetic radiation to the cavities does not heat the liquid prior to extraction.
  • microcavity arrays are used to screen test agents where test cells produce test agents and secrete the agents into the microcavity.
  • the test agent may be an antibody, e.g., a recombinant antibody and/or a monoclonal antibody.
  • the cell or cells may produce more than one kind of antibody or multiple copies of the same antibody.
  • the microcavities containing the cells displaying or secreting compounds having the highest binding affinity can be identified with an appropriate reporter system.
  • the array can be imaged to identify one or more cavities containing cells having the phenotype of interest. The contents of the cavities may be extracted by directing electromagnetic radiation from a pulsed diode laser at a radiation absorbing material associated with the cavity.
  • microcavity arrays which may include reaction cavities assembled in an extreme density porous array.
  • micro-arrays contemplated herein can be manufactured by bundling millions or billions of cavities or pores, such as in the form of silica capillaries, and fusing them together through a thermal process.
  • Such a fusing process may comprise steps including but not limited to: i) heating a capillary single draw glass that is drawn under tension into a single clad fiber; ii) creating a capillary multi draw single capillary from the single draw glass by bundling, heating, and drawing; iii) creating a capillary multi-multi draw multi capillary from the multi draw single capillary by additional bundling, heating, and drawing; iv) creating a block assembly of drawn glass from the multi-multi draw multi capillary by stacking in a pressing block; v) creating a block pressing block from the block assembly by treating with heat and pressure; and vi) creating a block forming block by cutting the block pressing block at a precise length (e.g., 1 mm).
  • the capillaries are cut to approximately 1 millimeter in height, thereby forming a plurality of micro-pores having an internal diameter between approximately 1.0 micrometers and 500 micrometers.
  • the micro-pores range between approximately 10 micrometers and 1 millimeter long. In one embodiment, the micro-pores range between approximately 10 micrometers and 1 centimeter long. In one embodiment, the micro-pores range between approximately 10 micrometers and 100 millimeters long. In one embodiment, the micro-pores range between approximately 0.5 millimeter and 1 centimeter long.
  • each micro-pore can have a 5 µm diameter and approximately 66% open space (i.e., representing the lumen of each microcavity).
  • the proportion of the array that is open ranges between about 50% and about 90%, for example about 60 to 75%, more particularly about 67%.
  • a 10x10 cm array having 5 µm diameter microcavities and approximately 66% open space has about 330 million micro-pores.
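The pore count stated above follows from simple geometry. The sketch below is illustrative and assumes circular pore cross-sections, a square array, and a pore diameter of 5 micrometers; the function name is hypothetical.

```python
import math


def micropore_count(side_cm: float, pore_diameter_um: float, open_fraction: float) -> float:
    """Estimate the number of circular pores in a square array.

    side_cm: edge length of the square array in centimeters
    pore_diameter_um: internal pore diameter in micrometers
    open_fraction: fraction of the array face that is open lumen (0 to 1)
    """
    side_um = side_cm * 1e4                      # 1 cm = 10,000 micrometers
    array_area_um2 = side_um ** 2                # total face area
    pore_area_um2 = math.pi * (pore_diameter_um / 2) ** 2
    return open_fraction * array_area_um2 / pore_area_um2


if __name__ == "__main__":
    n = micropore_count(side_cm=10, pore_diameter_um=5, open_fraction=0.66)
    print(f"estimated pores: {n:.3g}")
```

With these inputs the estimate comes out in the low hundreds of millions, in line with the figure of about 330 million micro-pores.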
  • the internal diameter of micro-cavities may range between approximately 1.0 micrometers and 500 micrometers.
  • each of the micro-pores can have an internal diameter in the range between approximately 1.0 micrometers and 300 micrometers; optionally between approximately 1.0 micrometers and 100 micrometers; further optionally between approximately 1.0 micrometers and 75 micrometers; still further optionally between approximately 1.0 micrometers and 50 micrometers; still further optionally between approximately 5.0 micrometers and 50 micrometers.
  • a microcavity array can be manufactured by bonding billions of silica capillaries and then fusing them together through a thermal process. After that, slices (0.5 mm or more) are cut out to form a very high aspect ratio glass micro perforated array plate. See International Application PCT/EP2011/062015 (WO2012/007537), which is incorporated by reference herein in its entirety.
  • a number of useful arrays are commercially available, such as from Hamamatsu Photonics K. K. (Japan), Incom, Inc. (Massachusetts), Photonis Technologies, S.A.S. (France), and others.
  • the microcavities of the array are closed at one end with a solid substrate attached to the array.
  • the sidewalls of the cavities of the arrays are not transmissive to electromagnetic radiation, or the cavities are coated with a material that prevents the transmission of electromagnetic radiation between cavities of the arrays. Suitable coating should not interfere with the binding reaction within the cavities or the application of forces to the cavities.
  • Example coatings include sputtered nanometer layers of gold, silver and platinum.
  • the capillary walls of the array are comprised of multiple layers, wherein one or more layers of the walls are made of a low refractive index material that prevents or substantially diminishes transmission of electromagnetic radiation between cavities of the array.
  • the arrays are prepared under or subjected to either wet or dry hydrogen atmospheres in order to inhibit or block the transmission of electromagnetic radiation through the array.
  • the cavities of the array have a hydrophilic surface that facilitates the spontaneous uptake of the solution into the cavity.
  • a surface of the array may be treated to impart hydrophobicity.
  • one surface of the array may be hydrophobic and the other surface may be hydrophilic.
  • a top surface and a bottom surface of the array are treated differently to impart hydrophilic characteristics on the top and hydrophobic characteristics on the bottom.
  • the array may be treated sequentially, first with an agent to impart hydrophobicity, then on the opposite side with an agent to impart hydrophilicity.
  • the disclosure is directed to the use of an array including a plurality of distinct cavities comprising open first ends and open second ends, wherein the open first ends of essentially all of the plurality of cavities collectively encompass a porous planar hydrophilic surface, and the open second ends of essentially all of the plurality of cavities encompass a porous planar hydrophobic surface.
  • the surfaces include the open ends of the cavities and the interstitial spaces between the cavities.
  • the hydrophilic characteristics may be imparted using a corona treatment according to techniques known in the art.
  • the array may be treated with hydrophobic agents such as a polysiloxane, or composition comprising polysiloxane.
  • the hydrophobic agent is a hydroxy-terminated polydimethylsiloxane.
  • the hydrophobic agent is RAIN-X® water repellant.
  • one surface or the entire array can be treated to impart a hydrophilic characteristic. Thereafter the hydrophilic surfaces are protected, for example by application of a sealant, and the opposing surfaces are treated with a hydrophobic agent.
  • the method includes isolating cells located in the microcavities by pressure ejection.
  • a separated microcavity array is covered with a plastic film.
  • the method further provides a laser capable of making a hole through the plastic film, thereby exposing the spatially addressed micro-pore. Subsequently, exposure to a pressure source (e.g., air pressure) expels the contents from the spatially addressed microcavity. See WO2012/007537.
  • Another embodiment is directed to a method of extracting a solution including a biological element from a single microcavity in a microcavity array.
  • the microcavity is associated with an electromagnetic radiation absorbent material so that the material is within the cavity or is coating or covering the microcavity. Extraction occurs by focusing electromagnetic radiation at the microcavity to generate an expansion of the sample or of the material or both or evaporation that expels at least part of the sample from the microcavity.
  • the electromagnetic radiation source may be the same or different than the source that excites a fluorescent label. The source may be capable of emitting multiple wavelengths of electromagnetic radiation in order to accommodate different absorption spectra of the materials and the labels.
  • subjecting a selected microcavity to focused electromagnetic radiation can cause an expansion of the electromagnetic radiation absorbent material, which expels sample contents onto a substrate for collecting the expelled contents.
  • the electromagnetic radiation is focused on the electromagnetic radiation absorbing material, resulting in linear absorption of the laser energy and cavitation of the liquid sample at the material/liquid interface.
  • directing of electromagnetic radiation to the material should avoid heating liquid that is not in contact with the material at the focus of the radiation, so as to avoid heating the liquid contents of the microcavity and impacting the biological material in the cells. Accordingly, the amount of energy necessary to disrupt the meniscus is not sufficient to cause a significant increase in temperature of the entire liquid contents.
  • the laser is focused on the material of a cavity of the array adjacent the meniscus itself, causing a disruption of the meniscus without heating the liquid contents of the cavity other than the heating associated with the vaporization of a small amount of liquid at the portion of the meniscus adjacent the laser focus.
  • extraction from cavities of the array is accomplished by excitation of one or more particles in the microcavity, wherein excitation energy is focused on the particles.
  • some embodiments employ energy absorbing particles in the cavities and an electromagnetic radiation source capable of discretely delivering electromagnetic radiation to the particles in each cavity of the array.
  • a sequence of pulses repeatedly agitates magnetic beads in a cavity to disrupt a meniscus, which expels sample contents onto a substrate for collecting the expelled contents.
  • the electromagnetic radiation emission spectra from the electromagnetic radiation source must be such that there is at least a partial overlap in the absorption spectra of the electromagnetic radiation absorbent material associated with the cavity.
  • individual cavities from a microcavity array are extracted by a sequence of short laser pulses rather than a single large pulse.
  • a laser is pulsed at wavelengths of between about 300 nm and 650 nm, more particularly about 349 nm, 405 nm, 450 nm, or 635 nm.
  • cavities of interest are selected and then extracted by focusing a 349 nm solid state UV laser at 20-30% intensity power.
  • a diode laser may be used as an electromagnetic radiation source.
  • a diode laser is pulsed at between about 2 to 20 pulses, for instance 2, 4, 6, 8, 10, 12, 14, 16, 18, and 20 pulses, with a pulse length of about 1 to 10 msec, for instance 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10 msec, and having a pulse separation of approximately 10 msec to 100 msec, for instance 10, 20, 30, 40, 50, 60, 70, 80, 90 and 100 msec. If magnetic beads are in the capillary, the laser pulse energy is absorbed by the beads, primarily heating the surface of the bead that is directly exposed to the laser. The liquid in immediate proximity to this surface is explosively vaporized, which propels the beads within the capillary.
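As a rough illustration of the timing these pulse parameters imply, the sketch below computes the total duration of a pulse train. It assumes, as a labeled assumption, that "pulse separation" means the gap between the end of one pulse and the start of the next; if it instead denotes the pulse period, the arithmetic changes accordingly. The function name is hypothetical.

```python
def pulse_train_duration_ms(n_pulses: int, pulse_length_ms: float, separation_ms: float) -> float:
    """Total time from the start of the first pulse to the end of the last.

    Assumes separation_ms is the idle gap between consecutive pulses.
    """
    if n_pulses < 1:
        raise ValueError("at least one pulse is required")
    return n_pulses * pulse_length_ms + (n_pulses - 1) * separation_ms


if __name__ == "__main__":
    # Extremes of the stated ranges: 2 to 20 pulses, 1 to 10 msec long, 10 to 100 msec apart.
    shortest = pulse_train_duration_ms(2, 1, 10)
    longest = pulse_train_duration_ms(20, 10, 100)
    print(f"shortest train: {shortest} msec, longest train: {longest} msec")
```

Under this reading, extraction of a single cavity takes between roughly a hundredth of a second and about two seconds of laser time.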
  • Focusing electromagnetic radiation at a microcavity can cause the electromagnetic radiation absorbing material to expand, which causes at least part of the liquid volume of the cavity to be expelled.
  • Microcavities can be open at both ends, with the contents being held in place by hydrostatic force. During the extraction process, one of the ends of the cavities can be covered to prevent expulsion of the contents from the wrong end of the cavity.
  • the capture surface comprises a hygroscopic layer upon which the contents of the cavity are expelled.
  • the hygroscopic layer attracts water and prevents the deformation of the optical surface allowing clear imaging of the cavity contents.
  • the layer is a hygroscopic composition, such as witch hazel, a solution including glycerol, or a solution of phosphate buffered saline with bovine serum albumin and sorbitol in concentrations for example of 0.1% weight/volume BSA and 1 M sorbitol.
  • the surface may be contacted with a culture matrix to allow transfer of the contents of the cavity to the matrix.
  • cells may be allowed to propagate on the matrix.
  • the capture surface can be removed immediately after contact or within minutes, hours, days or weeks as appropriate to ensure viability of the cell culture(s) in the matrix.
  • the cells may be extracted directly onto the growth matrix, assuming the matrix has sufficient transparency to allow for the extraction laser to penetrate the matrix with sufficient focus to transfer energy to the array as described herein.
  • arrays and methods for high-throughput analysis of cells are further described in, e.g., US20160244749A1, US20180163198A1, US10370653B2, US10227583B2, WO2018111765A1, WO2018191180A1, and WO2018125832A1, which are hereby incorporated by reference.
  • the invention provides methods for selecting test agents that are able to internalize site-directed modifying polypeptides (or nucleoproteins) where the read out for the internalization, in certain preferred embodiments, is the gene editing activity of the site-directed modifying polypeptides (or nucleoproteins).
  • This read out, which is based on activation or repression of the reporter construct, is manifested in a phenotype that can be screened in order to separate out target cells having test agents that failed to internalize and/or provide gene editing activity when conjugated to a site-directed modifying polypeptide (or nucleoprotein).
  • label or “detectable label” means a molecule that can be directly (i.e., a primary label) or indirectly (i.e., a secondary label) detected.
  • a label can be visualized and/or measured and/or otherwise identified so that its presence, absence, or a parameter or characteristic thereof can be measured and/or determined.
  • fluorescent label refers to any molecule that can be detected via its inherent fluorescent properties, which include fluorescence detectable upon excitation.
  • fluorescent labels are described herein and elsewhere in the art (e.g., see The Tenth Edition of Haugland, RP. The Handbook: A Guide to Fluorescent Probes and Labeling Technologies. 10th. Invitrogen/Molecular Probes; Carlsbad, CA: 2005, hereby incorporated by reference).
  • internalization and nucleic acid editing activity of a site-directed modifying polypeptide is detected by introducing into the target cell a polynucleotide comprising a target sequence capable of being bound by the gNA and measuring for cleavage at the target sequence in the target cell.
  • the target sequence can be labelled with two detectable labels that generate a signal upon interaction (e.g., a FRET pair) such that cleavage of the target sequence disrupts interaction of the detectable labels and causes a reduction in fluorescence.
  • the target sequence is labelled with a quenching pair such that cleavage of the target sequence leads to a gain in signal.
  • fluorophores that can be used in the methods provided herein include Alexa Fluor® 350; Marina Blue®; Atto 390; Alexa Fluor® 405; Pacific Blue™; Pacific Green™; Atto 425; Alexa Fluor® 430; Atto 465; DY-485XL; DY-475XL; FAM™ 494; Alexa Fluor® 488; DY-495-05; Atto 495; Oregon Green® 488; DY-480XL 500; Atto 488; Alexa Fluor® 500; Rhodamin Green®; DY-505-05; DY-500XL; DY-510XL; Oregon Green® 514; Atto 520; Alexa Fluor® 514; JOE 520; TET™ 521; CAL Fluor® Gold 540; DY-521XL; Rhodamin 6G®; Yakima Yellow® 526; Atto 532; Alexa Fluor® 532; HEX 535; VIC 538; CAL Fluor
  • the methods provided herein involve negative selection against cells wherein the reporter construct is not edited.
  • a guide RNA is complexed with a site-directed modifying polypeptide (e.g., Cas9) wherein the resulting nucleoprotein is capable of modifying a nucleic acid sequence encoding thymidine kinase.
  • a target cell of interest (i.e., a target cell comprising a test agent capable of internalizing the site-directed modifying polypeptide, e.g., Cas9) is selected for target cell survival in the presence of ganciclovir (i.e., by selecting a cell that is not sensitive to ganciclovir) due to modification of the reporter construct capable of expressing thymidine kinase (i.e., knockout of the thymidine kinase by the nucleic acid editing activity of the site-directed modifying polypeptide).
  • a target cell containing a reporter construct can be screened for internalization of a site-directed modifying polypeptide (or nucleoprotein) or genome editing activities of a site-directed modifying polypeptide (or nucleoprotein) based on the level of the detectable label.
  • a detectable label is a molecule that can be visualized or otherwise observed.
  • the detectable label may be encoded by a polynucleotide that is operably linked to the polynucleotide encoding the nucleic acid-guided nuclease. In such instances, the expression construct will encode a nucleoprotein.
  • Detectable labels include any detectable protein domain, including but not limited to, a fluorescent protein or a protein domain that can be detected with a specific antibody.
  • fluorescent proteins include green fluorescent proteins (e.g., GFP, sfGFP, EGFP, ZsGreen1), yellow fluorescent proteins (e.g., YFP, EYFP, ZsYellow1), or red fluorescent protein (e.g., RFP).
  • Non-limiting examples of small molecule detectable labels include radioactive labels, such as ³H and ³⁵S.
  • a variety of detectable labels and methods of preparing fusion proteins comprising detectable labels are known in the art, e.g., see Thorn, K. (2017). Genetically encoded fluorescent tags.
  • a target cell comprising a reporter construct is detected using cell sorting.
  • the cell sorting is fluorescence-activated cell sorting (FACS).
  • the cell sorting is magnetic-activated cell sorting (MACS).
  • the cell sorting is microfluidic-based cell sorting.
  • a target cell comprising a reporter construct capable of expressing a fluorescent label is sorted by fluorescence activated cell sorting (FACS).
  • FACS sorting not only measures fluorescence signals in cells at a rapid rate, but also collects cells that have specified fluorescence properties. Screening for a target cell using FACS may be performed in accordance with the compositions and methods as described herein.
  • a target cell comprising a test agent capable of internalizing a site-directed modifying polypeptide (or nucleoprotein) may be identified using FACS by screening cells for the disappearance of a cell surface receptor due to the nucleic acid editing activities of the internalized site-directed modifying polypeptide (or nucleoprotein).
  • a site-directed modifying polypeptide (e.g., Cas9) complexed with a guide RNA (i.e., forming a nucleoprotein of Cas9 and the guide RNA) modifies a reporter construct capable of expressing a cell surface receptor expressed on the target cell of interest, resulting in loss of expression (e.g., knock-out of the cell surface receptor).
  • a target cell comprising a test agent capable of internalizing a site-directed modifying polypeptide (or nucleoprotein) may be identified using FACS by sequencing and isolating/expanding cells that lead to knockout of the cell surface receptor.
  • a target cell comprising a test agent expressed from a protein variant producing cell wherein the target cell comprising the test agent is capable of internalizing a site-directed modifying polypeptide (or nucleoprotein) may be identified by providing a labeled protein (or antibody) capable of binding to a cell surface receptor expressed by the reporter construct of the target cell, wherein the absence of labeled protein (or antibody) binding identifies a target cell of interest due to the loss of expression of the cell surface receptor (e.g., knock-out of the cell surface receptor).
  • a protein variant producing cell from these wells would be pooled for further screening and sequencing of test agent variants.
  • the selection can be performed in a variety of host cells such as yeast, bacteria, plant, insect, or mammalian cells depending on the requirements of the experiment and the capabilities of the expression vectors being used.
  • a spectrophotometer, a microtitre plate reader, a CCD, a fluorescence microscope, or other similar device may be used to detect fluorescence in a target cell or in an array comprising a single cell or plurality of cells.
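The receptor-loss readout described above can be sketched as a simple gating step. This is a minimal illustration, not the method of the disclosure: it assumes each cell has a single fluorescence value from a labeled antibody against the reporter-encoded surface receptor, and that an unstained control sample is available to set the negative gate. The names `Event`, `negative_gate_threshold`, and `receptor_negative`, and the default 99th-percentile gate, are hypothetical choices made for this example.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Event:
    """One sorted cell: an identifier and its receptor-stain fluorescence (arbitrary units)."""
    cell_id: str
    receptor_signal: float


def negative_gate_threshold(unstained: Sequence[float], percentile: float = 99.0) -> float:
    """Nearest-rank percentile of the unstained control, used as the upper bound of the negative gate."""
    if not unstained:
        raise ValueError("need at least one unstained-control measurement")
    ordered = sorted(unstained)
    rank = math.ceil(percentile / 100.0 * len(ordered))
    rank = min(max(rank, 1), len(ordered))  # clamp to a valid 1-based rank
    return ordered[rank - 1]


def receptor_negative(events: Sequence[Event], threshold: float) -> List[Event]:
    """Cells at or below the gate are candidate knock-outs (no detectable labeled-antibody binding)."""
    return [e for e in events if e.receptor_signal <= threshold]


if __name__ == "__main__":
    control = [10.0, 12.0, 15.0, 11.0, 9.0]
    cells = [Event("a", 8.0), Event("b", 500.0), Event("c", 15.0)]
    gate = negative_gate_threshold(control)
    print(gate, [e.cell_id for e in receptor_negative(cells, gate)])
```

In practice the gate would be drawn on the sorter itself; the sketch only shows the logic that absence of labeled-protein binding, relative to a control, marks a cell of interest.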
  • the present invention includes methods for determining whether a test agent, e.g., a test agent expressed on (or bound to) the surface of a target cell, is a cell internalization agent, and thereby facilitate the internalization of a site-directed modifying polypeptide (or nucleoprotein) into a target cell.
  • the test agent contains a first conjugation moiety and the site-directed modifying polypeptide contains a second conjugation moiety, such that the first conjugation moiety of the test agent binds to (or is stably associated with) the second conjugation moiety of the site-directed modifying polypeptide to form a complex between the test agent and the site-directed modifying polypeptide.
  • the first and second conjugation moieties provide for a covalent linkage between the test agent and the site-directed modifying polypeptide. In other embodiments, the first and second conjugation moieties provide for a non-covalent linkage between the test agent and the site-directed modifying polypeptide.
  • the test agent comprising the first conjugation moiety is expressed on the surface of a cell via a genetically-encoded first conjugation moiety.
  • the test agent comprising the first conjugation moiety is expressed on the surface of a cell and the first conjugation moiety is added post-production.
  • the test agent is secreted from a cell such that the test agent comprising the first conjugation moiety must be prepared in ex vivo culture.
  • the test agent and first conjugation moiety are site-specifically conjugated in ex vivo culture. Accordingly, in preferred embodiments, following ex vivo conjugation the test agent comprising the first conjugation moiety may be presented to a target cell under conditions that allow for the test agent to bind to the target cell.
  • the conjugation moieties include, but are not limited to, SpyTag/SpyCatcher, SnoopTag/SnoopCatcher, sortase, and split intein. In other embodiments, the conjugation moieties include, but are not limited to, Halo-tag, mono-avidin, ACP tag, a SNAP tag, or any other conjugation moieties known in the art.
  • the conjugation moiety is selected from Protein A, CBP, MBP, GST, poly(His), biotin/streptavidin, V5-tag, Myc-tag, HA-tag, NE-tag, His-tag, Flag tag, Halo-tag, Snap-tag, Fc-tag, Nus-tag, BCCP, thioredoxin, SnoopTag, SpyTag, SpyCatcher, Isopeptag, SBP-tag, S-tag, AviTag, and calmodulin.
  • the conjugation moiety is a chemical tag.
  • a chemical tag may be a SNAP tag, a CLIP tag, a HaloTag or a TMP-tag.
  • the chemical tag is a SNAP-tag or a CLIP-tag.
  • SNAP and CLIP fusion proteins enable the specific, covalent attachment of virtually any molecule to a protein of interest.
  • the chemical tag is a HaloTag.
  • HaloTag involves a modular protein tagging system that allows different molecules to be linked onto a single genetic fusion, either in solution, in living cells, or in chemically fixed cells.
  • the chemical tag is a TMP-tag.
  • the conjugation moiety is an epitope tag.
  • an epitope tag may be a poly-histidine tag such as a hexahistidine tag or a dodecahistidine, a FLAG tag, a Myc tag, a HA tag, a GST tag or a V5 tag.
  • the site-directed modifying polypeptide and the test agent may each be engineered to comprise complementary binding pairs that enable stable association upon contact.
  • Exemplary binding moiety pairings include (i) streptavidin-binding peptide (SBP) and streptavidin (STV), (ii) biotin and EMA (enhanced monomeric avidin), (iii) SpyTag (ST) and SpyCatcher (SC), (iv) Halo-tag and Halo-tag ligand, (v) and SNAP-Tag, (vi) Myc tag and anti-Myc immunoglobulins, (vii) FLAG tag and anti-FLAG immunoglobulins, and (ix) ybbR tag and coenzyme A groups.
  • the conjugation moiety is selected from SBP, biotin, SpyTag, SpyCatcher, halo-tag, SNAP-tag, Myc tag, or FLAG tag.
  • the site-directed modifying polypeptide can alternatively be associated with a test agent, via one or more linkers as described herein wherein the linker is a conjugation moiety.
  • test agent is associated with the first conjugation moiety via a linker.
  • site-directed modifying polypeptide is associated with the second conjugation moiety via a linker.
  • linker means a divalent chemical moiety comprising a covalent bond or a chain of atoms that covalently attaches, for example, a test agent with a first conjugation moiety, and a site-directed modifying polypeptide with a second conjugation moiety. Any known method of conjugation of peptides or macromolecules can be used in the context of the present disclosure. Generally, covalent attachment of the test agent with the first conjugation moiety or the site-directed modifying polypeptide with the second conjugation moiety requires the linker to have two reactive functional groups, i.e., bivalency in a reactive sense.
  • Bivalent linker reagents which are useful to attach two or more functional or biologically active moieties, such as peptides, nucleic acids, drugs, toxins, antibodies, haptens, and reporter groups are known, and methods for such conjugation have been described in, for example, Hermanson, G. T. (1996) Bioconjugate Techniques; Academic Press: New York, p234-242, the disclosure of which is incorporated herein by reference as it pertains to linkers suitable for covalent conjugation. Further linkers are disclosed in, for example, Tsuchikama, K. and Zhiqiang, A. Protein and Cell, 9(1), p.33-46, (2016), the disclosure of which is incorporated herein by reference as it pertains to linkers suitable for covalent conjugation.
  • linkers suitable for use in the compositions and methods disclosed are stable in circulation, but allow for release of the extracellular cell membrane binding moiety and/or the site-directed modifying polypeptide in the target cell or, alternatively, in close proximity to the target cell.
  • Linkers suitable for the present disclosure may be broadly categorized as non-cleavable or cleavable, as well as intracellular or extracellular, each of which is further described herein below.
  • the linker is cleaved once inside the cell cytoplasm (reducing conditions). In other embodiments, the linker is cleaved once inside the maturing endosome (acidic conditions).
  • the linker conjugating the test agent with the first conjugation moiety, and the site-directed modifying polypeptide with the second conjugation moiety, is non-cleavable.
  • Non-cleavable linkers comprise stable chemical bonds that are resistant to degradation (e.g., proteolysis).
  • non-cleavable linkers require proteolytic degradation inside the target cell, and exhibit high extracellular stability.
  • Non-limiting examples of non-cleavable linkers utilized in antibody-drug conjugation include those based on maleimidomethylcyclohexanecarboxylate, caproylmaleimide, and acetylphenylbutanoic acid.
  • the linker conjugating the test agent with the first conjugation moiety, and the site-directed modifying polypeptide with the second conjugation moiety, is cleavable, such that cleavage of the linker (e.g., by a protease, such as a metalloprotease) releases the test agent or the site-directed modifying polypeptide from its respective conjugation moiety in the intracellular or extracellular (e.g., upon binding of the molecule to the cell surface) environment.
  • Cleavable linkers are designed to exploit the differences in local environments, e.g., extracellular and intracellular environments, for example, pH, reduction potential or enzyme concentration, to trigger the release of site-directed modifying polypeptide component in the target cell in order to facilitate genome editing.
  • cleavable linkers are relatively stable in circulation in vivo, but are particularly susceptible to cleavage in the intracellular environment through one or more mechanisms (e.g., including, but not limited to, activity of proteases, peptidases, and glucuronidases).
  • Cleavable linkers used herein are stable outside the target cell and may be cleaved at some efficacious rate inside the target cell or in close proximity to the extracellular membrane of the target cell.
  • An effective linker will: (i) maintain the specific binding properties of the test agent, e.g., an antibody; (ii) allow intra- or extracellular delivery of the site-directed modifying polypeptide (or nucleoprotein); (iii) remain stable and intact, i.e.
  • Stability of the site-directed modifying polypeptide (or nucleoprotein) may be measured by standard analytical techniques such as mass spectroscopy, size determination by size exclusion chromatography or diffusion constant measurement by dynamic light scattering, HPLC, and the separation/analysis technique LC/MS.
  • Suitable cleavable linkers include those that may be cleaved, for instance, by enzymatic hydrolysis, photolysis, hydrolysis under acidic conditions, hydrolysis under basic conditions, oxidation, disulfide reduction, nucleophilic cleavage, or organometallic cleavage (see, for example, Leriche et al., Bioorg. Med. Chem., 20:571-582, 2012, the disclosure of which is incorporated herein by reference as it pertains to linkers suitable for covalent conjugation).
  • Suitable cleavable linkers may include, for example, chemical moieties such as a hydrazine, a disulfide, a thioether or a peptide.
  • Linkers hydrolyzable under acidic conditions include, for example, hydrazones, semicarbazones, thiosemicarbazones, cis-aconitic amides, orthoesters, acetals, ketals, or the like.
  • linkers are relatively stable under neutral pH conditions, such as those in the blood, but are unstable at below pH 5.5 or 5.0, the approximate pH of the lysosome.
  • linkers including such acid-labile functionalities tend to be relatively less stable extracellularly. This lower stability may be advantageous where extracellular cleavage is desired.
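The pH dependence described above can be illustrated with a small lookup. This is a sketch under stated assumptions, not a kinetic model: the cleavage threshold of pH 5.5 comes from the text, while the compartment pH values in `COMPARTMENT_PH` are illustrative figures chosen for this example and are not taken from the disclosure. The function name `expected_to_cleave` is likewise hypothetical.

```python
# Threshold from the text: acid-labile linkers are described as unstable below pH 5.5 (or 5.0).
ACID_LABILE_THRESHOLD_PH = 5.5

# Illustrative compartment pH values (assumptions for this sketch, not from the disclosure).
COMPARTMENT_PH = {
    "blood": 7.4,
    "early_endosome": 6.3,
    "late_endosome": 5.5,
    "lysosome": 4.8,
}


def expected_to_cleave(compartment: str, threshold: float = ACID_LABILE_THRESHOLD_PH) -> bool:
    """True when the compartment pH is strictly below the acid-lability threshold."""
    return COMPARTMENT_PH[compartment] < threshold


if __name__ == "__main__":
    for name in COMPARTMENT_PH:
        print(name, expected_to_cleave(name))
```

A stricter threshold of 5.0 can be passed as the second argument to compare the two cut-offs mentioned in the text.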
  • Linkers cleavable under reducing conditions include, for example, a disulfide.
  • a variety of disulfide linkers are known in the art, including, for example, those that can be formed using SATA (N-succinimidyl-S-acetylthioacetate), SPDP (N-succinimidyl-3-(2-pyridyldithio)propionate), SPDB (N-succinimidyl-3-(2-pyridyldithio)butyrate) and SMPT (N-succinimidyl-oxycarbonyl-alpha-methyl-alpha-(2-pyridyl-dithio)toluene), SPDB and SMPT (See, e.g., Thorpe et al., 1987, Cancer Res.
  • Linkers susceptible to enzymatic hydrolysis can be, e.g., a peptide-containing linker that is cleaved by an intracellular peptidase or protease enzyme, including, but not limited to, a lysosomal or endosomal protease.
  • the peptidyl linker is at least two amino acids long or at least three amino acids long.
  • Exemplary amino acid linkers include a dipeptide, a tripeptide, a tetrapeptide or a pentapeptide.
  • suitable peptides include those containing amino acids such as Valine, Alanine, Citrulline (Cit), Phenylalanine, Lysine, Leucine, and Glycine.
  • Amino acid residues which comprise an amino acid linker component include those occurring naturally, as well as minor amino acids and non-naturally occurring amino acid analogs, such as citrulline.
  • Exemplary dipeptides include valine-citrulline (vc or val-cit) and alanine-phenylalanine (af or ala-phe).
  • Exemplary tripeptides include glycine-valine-citrulline (gly-val-cit) and glycine-glycine-glycine (gly-gly-gly).
  • the linker includes a dipeptide such as Val-Cit, Ala-Val, Phe-Lys, Val-Lys, Ala-Lys, Phe-Cit, Leu-Cit, Ile-Cit, Phe-Arg, or Trp-Cit.
  • Linkers containing dipeptides such as Val-Cit or Phe-Lys are disclosed in, for example, U.S. Pat. No. 6,214,345, the disclosure of which is incorporated herein by reference in its entirety as it pertains to linkers suitable for covalent conjugation.
  • the linker includes a dipeptide selected from Val-Ala and Val-Cit.
  • linkers comprising a peptide moiety may be susceptible to varying degrees of cleavage both intra- and extracellularly. Accordingly, in some embodiments, the linker comprises a dipeptide, and the TAGE agent is substantially cleaved extracellularly. Accordingly, in some embodiments, the linker comprises a dipeptide, and the TAGE agent is stable extracellularly and is cleaved intracellularly.
  • Linkers suitable for conjugating a site-directed modifying polypeptide as disclosed herein to a second conjugation moiety, as disclosed herein, include those capable of releasing the site-directed modifying polypeptide by a 1,6-elimination process.
  • Chemical moieties capable of this elimination process include the p-aminobenzyl (PAB) group, 6-maleimidohexanoic acid, pH-sensitive carbonates, and other reagents as described in Jain et al., Pharm. Res. 32:3526-3540, 2015, the disclosure of which is incorporated herein by reference in its entirety as it pertains to linkers suitable for covalent conjugation.
  • the linker includes a "self-immolative" group such as the afore-mentioned PAB or PABC (para-aminobenzyloxycarbonyl), which are disclosed in, for example, Carl et al., J. Med. Chem. (1981) 24:479-480; Chakravarty et al (1983) J. Med. Chem. 26:638-644; US 6214345;
  • self-immolative linkers include methylene carbamates and heteroaryl groups such as aminothiazoles, aminoimidazoles, aminopyrimidines, and the like.
  • Linkers containing such heterocyclic self-immolative groups are disclosed in, for example, U.S. Patent Publication Nos. 20160303254 and 20150079114, and U.S. Patent No. 7,754,681; Hay et al. (1999) Bioorg. Med. Chem. Lett. 9:2237; US 2005/0256030; de Groot et al (2001) J. Org. Chem. 66:8815-8830; and US 7223837.
  • a dipeptide is used in combination with a self-immolative linker.
  • Linkers suitable for use herein further may include one or more groups selected from C1-C6 alkylene, C1-C6 heteroalkylene, C2-C6 alkenylene, C2-C6 heteroalkenylene, C2-C6 alkynylene, C2-C6 heteroalkynylene, C3-C6 cycloalkylene, heterocycloalkylene, arylene, heteroarylene, and combinations thereof, each of which may be optionally substituted.
  • the linker includes a p-aminobenzyl group (PAB).
  • the p-aminobenzyl group is disposed between the cytotoxic drug and a protease cleavage site in the linker.
  • the p-aminobenzyl group is part of a p-aminobenzyloxycarbonyl unit.
  • the p-aminobenzyl group is part of a p-aminobenzylamido unit.
  • the linker comprises PAB, Val-Cit-PAB, Val-Ala-PAB, Val-Lys(Ac)-PAB, Phe-Lys-PAB, Phe-Lys(Ac)-PAB, D-Val-Leu-Lys, Gly-Gly-Arg, Ala-Ala-Asn-PAB, or Ala-PAB.
  • the linker comprises a combination of one or more of a peptide, oligosaccharide, -(CH2)p-, -(CH2CH2O)p-, PAB, Val-Cit-PAB, Val-Ala-PAB, Val-Lys(Ac)-PAB, Phe-Lys-PAB, Phe-Lys(Ac)-PAB, D-Val-Leu-Lys, Gly-Gly-Arg, Ala-Ala-Asn-PAB, or Ala-PAB.
  • Suitable linkers may be substituted with groups which modulate solubility or reactivity. Suitable linkers may contain groups having solubility enhancing properties. Linkers including the (CH2CH2O)p unit (polyethylene glycol, PEG), for example, can enhance solubility, as can alkyl chains substituted with amino, sulfonic acid, phosphonic acid or phosphoric acid residues. Linkers including such moieties are disclosed in, for example, U.S. Patent Nos. 8,236,319 and 9,504,756, the disclosure of each of which is incorporated herein by reference as it pertains to linkers suitable for covalent conjugation. Linkers containing such groups are described, for example, in U.S. Patent No. 9,636,421 and U.S. Patent Application Publication No. 2017/0298145, the disclosures of which are incorporated herein by reference as they pertain to linkers suitable for covalent conjugation to cytotoxins and antibodies or antigen-binding fragments thereof.
  • Suitable linkers for covalently conjugating the test agent with the first conjugation moiety, and the site-directed modifying polypeptide with the second conjugation moiety, as disclosed herein can have two reactive functional groups (i.e., two reactive termini), one for conjugation to the test agent (or site-directed modifying polypeptide, respectively), and the other for conjugation to the first conjugation moiety (or second conjugation moiety, respectively).
  • Suitable sites for conjugation may include, in certain embodiments, nucleophilic groups, such as a thiol, amino, or hydroxyl group.
  • Reactive (e.g., nucleophilic) sites that may be present within a test agent (or site-directed modifying polypeptide) as disclosed herein may include, without limitation, nucleophilic substituents on amino acid residues such as (i) N-terminal amine groups, (ii) side chain amine groups, e.g. lysine, (iii) side chain thiol groups, e.g. cysteine, (iv) side chain hydroxyl groups, e.g. serine; or (v) sugar hydroxyl or amino groups where the antibody is glycosylated.
  • Suitable sites for conjugation on the first or second conjugation moiety may include, without limitation, hydroxyl moieties of serine, threonine, and tyrosine residues; amino moieties of lysine residues; carboxyl moieties of aspartic acid and glutamic acid residues; and thiol moieties of cysteine residues, as well as propargyl, azido, haloaryl (e.g., fluoroaryl), haloheteroaryl (e.g., fluoroheteroaryl), haloalkyl, and haloheteroalkyl moieties of non-naturally occurring amino acids.
  • the antibody conjugation reactive terminus on the linker is, in certain embodiments, a thiol-reactive group such as a double bond (as in maleimide), a leaving group such as a chloro, bromo, iodo, or an R-sulfanyl group, or a carboxyl group.
  • Suitable sites for conjugation on the site-directed modifying polypeptide can also be, in certain embodiments, nucleophilic.
  • Reactive (e.g., nucleophilic) sites that may be present within a site-directed modifying polypeptide as disclosed herein include, without limitation, nucleophilic substituents on amino acid residues such as (i) N-terminal amine groups, (ii) side chain amine groups, e.g. lysine, (iii) side chain thiol groups, e.g. cysteine, (iv) side chain hydroxyl groups, e.g. serine; or (v) sugar hydroxyl or amino groups where the antibody is glycosylated.
  • Suitable sites for conjugation on the site-directed modifying polypeptide include, without limitation, hydroxyl moieties of serine, threonine, and tyrosine residues; amino moieties of lysine residues; carboxyl moieties of aspartic acid and glutamic acid residues; and thiol moieties of cysteine residues, as well as propargyl, azido, haloaryl (e.g., fluoroaryl), haloheteroaryl (e.g., fluoroheteroaryl), haloalkyl, and haloheteroalkyl moieties of non-naturally occurring amino acids.
  • the site-directed modifying polypeptide conjugation reactive terminus on the linker is, in certain embodiments, a thiol-reactive group such as a double bond (as in maleimide), a leaving group such as a chloro, bromo, iodo, or an R-sulfanyl group, or a carboxyl group.
  • the reactive functional group attached to the linker is a nucleophilic group which is reactive with an electrophilic group present on an antigen binding moiety, the site-directed modifying polypeptide, or both.
  • Useful electrophilic groups on an antigen binding moiety or site-directed modifying polypeptide include, but are not limited to, aldehyde and ketone carbonyl groups.
  • the heteroatom of a nucleophilic group can react with an electrophilic group on an antigen binding moiety or site-directed modifying polypeptide and form a covalent bond to the antigen binding moiety or the site-directed modifying polypeptide.
  • Useful nucleophilic groups include, but are not limited to, hydrazide, oxime, amino, hydroxyl, hydrazine, thiosemicarbazone, hydrazine carboxylate, and arylhydrazide.
  • When the term "linker" is used in describing the linker in conjugated form, one or both of the reactive termini will be absent (having been converted to a chemical moiety) or incomplete (such as being only the carbonyl of a carboxylic acid) because of the formation of the bonds between the linker and the extracellular cell membrane binding moiety, and/or between the linker and the site-directed modifying polypeptide.
  • linkers useful herein include, without limitation, linkers containing a chemical moiety formed by a coupling reaction between a reactive functional group on the linker and a nucleophilic group or otherwise reactive substituent on the antigen binding moiety, and a chemical moiety formed by a coupling reaction between a reactive functional group on the linker and a nucleophilic group on the site-directed modifying polypeptide.
  • Examples of chemical moieties formed by these coupling reactions result from reactions between chemically reactive functional groups, including a nucleophile/electrophile pair (e.g., a thiol/haloalkyl pair, an amine/carbonyl pair, or a thiol/α,β-unsaturated carbonyl pair, and the like), a diene/dienophile pair (e.g., an azide/alkyne pair, or a diene/α,β-unsaturated carbonyl pair, among others), and the like.
  • Coupling reactions between the reactive functional groups to form the chemical moiety include, without limitation, thiol alkylation, hydroxyl alkylation, amine alkylation, amine or hydroxylamine condensation, hydrazine formation, amidation, esterification, disulfide formation, cycloaddition (e.g., [4+2] Diels-Alder cycloaddition, [3+2] Huisgen cycloaddition, among others), nucleophilic aromatic substitution, electrophilic aromatic substitution, and other reactive modalities known in the art or described herein.
  • Suitable linkers may contain an electrophilic functional group for reaction with a nucleophilic functional group on the antigen binding moiety, the site-directed modifying polypeptide, or both.
  • the reactive functional group present within the test agent, the site-directed modifying polypeptide, the first and/or second conjugation moieties, or all of these, as disclosed herein, is an amine or thiol moiety.
  • Certain extracellular cell membrane binding moieties have reducible interchain disulfides, i.e. cysteine bridges. Extracellular cell membrane binding moieties may be made reactive for conjugation with linker reagents by treatment with a reducing agent such as DTT (dithiothreitol). Each cysteine bridge will thus form, theoretically, two reactive thiol nucleophiles.
  • Additional nucleophilic groups can be introduced into antigen binding moieties through the reaction of lysines with 2-iminothiolane (Traut's reagent) resulting in conversion of an amine into a thiol.
  • Reactive thiol groups may be introduced into the antigen binding moiety by introducing one, two, three, four, or more cysteine residues (e.g., preparing mutant antibodies comprising one or more non-native cysteine amino acid residues).
  • U.S. Pat. No. 7,521,541 teaches engineering antibodies by introduction of reactive cysteine amino acids.
  • Linkers suitable for the synthesis of the covalent conjugates as disclosed herein include, without limitation, reactive functional groups such as maleimide or a haloalkyl group. These groups may be present in linkers or cross linking reagents such as succinimidyl 4-(N-maleimidomethyl)-cyclohexane-1-carboxylate (SMCC), N-succinimidyl iodoacetate (SIA), sulfo-SMCC, m-maleimidobenzoyl-N-hydroxysuccinimidyl ester (MBS), sulfo-MBS, and succinimidyl iodoacetate, among others described in, for instance, Liu et al., 18:690-697, 1979, the disclosure of which is incorporated herein by reference as it pertains to linkers for chemical conjugation.
  • one or both of the reactive functional groups attached to the linker is a maleimide, azide, or alkyne.
  • a maleimide-containing linker is the non-cleavable maleimidocaproyl-based linker.
  • linkers are described by Doronina et al., Bioconjugate Chem. 17:14-24, 2006, the disclosure of which is incorporated herein by reference as it pertains to linkers for chemical conjugation.
  • the reactive functional group is an N-maleimidyl group, halogenated N-alkylamido group, sulfonyloxy N-alkylamido group, carbonate group, sulfonyl halide group, thiol group or derivative thereof, alkynyl group comprising an internal carbon-carbon triple bond, (hetero)cycloalkynyl group, bicyclo[6.1.0]non-4-yn-9-yl group, alkenyl group comprising an internal carbon-carbon double bond, cycloalkenyl group, tetrazinyl group, azido group, phosphine group, nitrile oxide group, nitrone group, nitrile imine group, diazo group, ketone group, (O-alkyl)hydroxylamino group, hydrazine group, halogenated N-maleimidyl group, 1,1-bis(sulfonylmethyl)methylcarbony
  • Bivalent linker reagents suitable for preparing conjugates as disclosed herein include, but are not limited to, N-succinimidyl 4-(maleimidomethyl)cyclohexanecarboxylate (SMCC), N-succinimidyl-4-(N-maleimidomethyl)-cyclohexane-1-carboxy-(6-amidocaproate), which is a "long chain" analog of SMCC (LC-SMCC), κ-maleimidoundecanoic acid N-succinimidyl ester (KMUA), γ-maleimidobutyric acid N-succinimidyl ester (GMBS), ε-maleimidocaproic acid N-hydroxysuccinimide ester (EMCS), m-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS), N-(α-maleimidoacetoxy)-succinimide ester (AMAS), succinimidyl-6-(
  • Cross-linking reagents comprising a haloacetyl-based moiety include N-succinimidyl-4-(iodoacetyl)-aminobenzoate (SIAB), N-succinimidyl iodoacetate (SIA), N-succinimidyl bromoacetate (SBA), and N-succinimidyl 3-(bromoacetamido)propionate (SBAP).
  • any one or more of the chemical groups, moieties and features disclosed herein may be combined in multiple ways to form linkers useful for conjugation of the extracellular cell membrane binding moiety as disclosed herein to a site-directed modifying polypeptide, as disclosed herein.
  • Further linkers useful in conjunction with the compositions and methods described herein are described, for example, in U.S. Patent Application Publication No. 2015/0218220, the disclosure of which is incorporated herein by reference as it pertains to linkers suitable for covalent conjugation.
  • a goal of the present invention is to identify cell internalizing agents that can act to internalize nucleic acid-guided nucleases so as to provide cell specificity for gene editing nucleases.
  • the methods disclosed herein include conjugating a test agent to a site directed modifying polypeptide, which in turn will target a specific nucleic acid and provide a gene editing function.
  • a site-directed modifying polypeptide refers to a nuclease that is directed to a specific target sequence based on the complementarity (full or partial) between a guide nucleic acid (i.e., guide RNA or gRNA, guide DNA or gDNA, or guide DNA/RNA hybrid) that is associated with the nuclease and a target sequence.
  • the site-directed modifying polypeptide is an RNA guided nuclease.
  • the binding between the guide RNA and the target sequence serves to recruit the nuclease to the vicinity of the target sequence.
  • Site-directed modifying polypeptides suitable for the presently disclosed compositions and methods include naturally-occurring Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-associated (Cas) polypeptides from a prokaryotic organism (e.g., bacteria, archaea) or variants thereof.
  • CRISPR sequences found within prokaryotic organisms are sequences that are derived from fragments of polynucleotides from invading viruses and are used to recognize similar viruses during subsequent infections and cleave viral polynucleotides via CRISPR-associated (Cas) polypeptides that function as an RNA-guided nuclease to cleave the viral polynucleotides.
  • CRISPR-associated polypeptide or “Cas polypeptide” refers to a naturally-occurring polypeptide that is found within proximity to CRISPR sequences within a naturally-occurring CRISPR system. Certain Cas polypeptides function as RNA-guided nucleases.
  • nucleic acid-guided nucleases of the presently disclosed compositions and methods are Class 2 Cas polypeptides or variants thereof given that the Class 2 CRISPR systems comprise a single polypeptide with nucleic acid-guided nuclease activity, whereas Class 1 CRISPR systems require a complex of proteins for nuclease activity.
  • There are at least three known types of Class 2 CRISPR systems, Type II, Type V, and Type VI, among which there are multiple subtypes (subtype II-A, II-B, II-C, V-A, V-B, V-C, VI-A, VI-B, and VI-C, among other undefined or putative subtypes).
  • Type II and Type V-B systems require a tracrRNA, in addition to crRNA, for activity.
  • Type V-A and Type VI only require a crRNA for activity.
  • All known Type II and Type V RNA-guided nucleases target double-stranded DNA, whereas Type VI RNA-guided nucleases target single-stranded RNA.
  • RNA-guided nucleases of Type II CRISPR systems are referred to as Cas9 herein and in the literature.
  • the nucleic acid-guided nuclease of the presently disclosed compositions and methods is a Type II Cas9 protein or a variant thereof.
  • Type V Cas polypeptides that function as RNA-guided nucleases do not require tracrRNA for targeting and cleavage of target sequences.
  • RNA-guided nucleases of Type VA CRISPR systems are referred to as Cpf1; of Type VB CRISPR systems are referred to as C2C1; of Type VC CRISPR systems are referred to as Cas12C or C2C3; of Type VIA CRISPR systems are referred to as C2C2 or Cas13A1; of Type VIB CRISPR systems are referred to as Cas13B; and of Type VIC CRISPR systems are referred to as Cas13A2 herein and in the literature.
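The Class 2 nomenclature above can be collected into a small lookup table. The sketch below only restates what the preceding passages say (nuclease names, tracrRNA requirement, target type); where the text does not state a tracrRNA requirement for a subtype (Type V-C), the field is left as `None` rather than guessed. The names `Class2System`, `CLASS2_SYSTEMS`, and `subtypes_where` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Class2System:
    subtype: str
    nuclease_names: Tuple[str, ...]
    requires_tracrrna: Optional[bool]  # None where the text gives no statement
    target: str  # "dsDNA" or "ssRNA"


# Entries transcribed from the description above; nothing here is independently sourced.
CLASS2_SYSTEMS = (
    Class2System("II", ("Cas9",), True, "dsDNA"),
    Class2System("V-A", ("Cpf1",), False, "dsDNA"),
    Class2System("V-B", ("C2C1",), True, "dsDNA"),
    Class2System("V-C", ("Cas12C", "C2C3"), None, "dsDNA"),
    Class2System("VI-A", ("C2C2", "Cas13A1"), False, "ssRNA"),
    Class2System("VI-B", ("Cas13B",), False, "ssRNA"),
    Class2System("VI-C", ("Cas13A2",), False, "ssRNA"),
)


def subtypes_where(target: Optional[str] = None,
                   requires_tracrrna: Optional[bool] = None) -> List[str]:
    """Return subtype labels matching the given filters; a filter left as None is ignored."""
    hits = []
    for system in CLASS2_SYSTEMS:
        if target is not None and system.target != target:
            continue
        if requires_tracrrna is not None and system.requires_tracrrna is not requires_tracrrna:
            continue
        hits.append(system.subtype)
    return hits


if __name__ == "__main__":
    print(subtypes_where(target="ssRNA"))
    print(subtypes_where(requires_tracrrna=True))
```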
  • the nucleic acid-guided nuclease of the presently disclosed compositions and methods is a Type VA Cpf1 protein or a variant thereof.
  • Naturally-occurring Cas polypeptides and variants thereof that function as nucleic acid-guided nucleases are known in the art and include, but are not limited to Streptococcus pyogenes Cas9, Staphylococcus aureus Cas9, Streptococcus thermophilus Cas9, Francisella novicida Cpf1, or those described in Shmakov et al. (2017) Nat Rev Microbiol 15(3):169-182; Makarova et al. (2015) Nat Rev Microbiol 13(11):722-736; and U.S. Pat. No. 9790490, each of which is incorporated herein in its entirety.
  • Class 2 Type V CRISPR nucleases include Cas12 and any subtypes of Cas12, such as Cas12a, Cas12b, Cas12c, Cas12d, Cas12e, Cas12f, Cas12g, Cas12h, and Cas12i.
  • Class 2 Type VI CRISPR nucleases including Cas13 can be used in order to cleave RNA target sequences.
  • the site-directed modifying polypeptide (i.e., nucleic acid-guided nuclease) of the presently disclosed compositions and methods can be a naturally-occurring nucleic acid-guided nuclease (e.g., S. pyogenes Cas9) or a variant thereof.
  • Variant nucleic acid-guided nucleases can be engineered or naturally occurring variants that contain substitutions, deletions, or additions of amino acids that, for example, alter the activity of one or more of the nuclease domains, fuse the nucleic acid-guided nuclease to a heterologous domain that imparts a modifying property (e.g., transcriptional activation domain, epigenetic modification domain, detectable label), modify the stability of the nuclease, or modify the specificity of the nuclease.
  • a nucleic acid-guided nuclease includes one or more mutations to improve specificity for a target site and/or stability in the intracellular microenvironment.
  • the protein is Cas9 (e.g., SpCas9) or a modified Cas9
  • the nuclease comprises at least one substitution relative to a naturally-occurring version of the nuclease.
  • substitutions may include any of C80A, C80L, C80I, C80V, C80K, C574E, C574D, C574N, C574Q (in any combination) and in particular C80A. Substitutions may be included to reduce intracellular protein binding of the nuclease and/or increase target site specificity. Additionally or alternatively, substitutions may be included to reduce off-target toxicity of the composition.
  • the nucleic acid-guided nuclease is directed to a particular target sequence through its association with a guide nucleic acid (e.g., guide RNA (gRNA), guide DNA (gDNA)).
  • the nucleic acid-guided nuclease is bound to the guide nucleic acid via non-covalent interactions, thus forming a complex.
  • the polynucleotide-targeting nucleic acid provides target specificity to the complex by comprising a nucleotide sequence that is complementary to a sequence of a target sequence.
  • the nucleic acid-guided nuclease of the complex or a domain or label fused or otherwise conjugated thereto provides the site-specific activity.
  • the nucleic acid-guided nuclease is guided to a target polynucleotide sequence (e.g. a target sequence in a chromosomal nucleic acid; a target sequence in an extrachromosomal nucleic acid, e.g. an episomal nucleic acid, a minicircle; a target sequence in a mitochondrial nucleic acid; a target sequence in a chloroplast nucleic acid; a target sequence in a plasmid) by virtue of its association with the protein-binding segment of the polynucleotide-targeting guide nucleic acid.
  • the guide nucleic acid comprises two segments, a “polynucleotide-targeting segment” and a “polypeptide-binding segment.”
  • By "segment" it is meant a segment/section/region of a molecule (e.g., a contiguous stretch of nucleotides in an RNA).
  • a segment can also refer to a region/section of a complex such that a segment may comprise regions of more than one molecule.
  • the polypeptide-binding segment (described below) of a polynucleotide-targeting nucleic acid comprises only one nucleic acid molecule and the polypeptide-binding segment therefore comprises a region of that nucleic acid molecule.
  • the polypeptide-binding segment (described below) of a DNA-targeting nucleic acid comprises two separate molecules that are hybridized along a region of complementarity.
  • the polynucleotide-targeting segment (or "polynucleotide-targeting sequence” or “guide sequence”) comprises a nucleotide sequence that is complementary (fully or partially) to a specific sequence within a target sequence (for example, the complementary strand of a target DNA sequence).
  • the polypeptide-binding segment (or "polypeptide-binding sequence") interacts with a nucleic acid-guided nuclease (e.g., RNA-guided nuclease).
  • site-specific cleavage or modification of the target DNA by a nucleic acid-guided nuclease occurs at locations determined by both (i) base-pairing complementarity between the polynucleotide-targeting sequence of the nucleic acid and the target DNA; and (ii) a short motif (referred to as the protospacer adjacent motif (PAM)) in the target DNA.
  • a protospacer adjacent motif can be of different lengths and can be a variable distance from the target sequence, although the PAM is generally within about 1 to about 10 nucleotides from the target sequence, including about 1, about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, or about 10 nucleotides from the target sequence.
  • the PAM can be 5' or 3' of the target sequence.
  • the PAM is a consensus sequence of about 3-4 nucleotides, but in particular embodiments, can be 2, 3, 4, 5, 6, 7, 8, 9, or more nucleotides in length.
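As an illustration of how a PAM constrains target selection, the sketch below scans one DNA strand for candidate target sequences that are immediately followed on the 3' side by a PAM. The default `NGG` PAM (widely reported for S. pyogenes Cas9) and the 20-nucleotide target length are assumptions not taken from the text above, and the function name `find_targets` is hypothetical. Only the given strand is scanned, and only a PAM directly adjacent to the target is considered.

```python
import re


def find_targets(dna, pam="NGG", guide_len=20):
    """Yield (start, target, pam) for targets with the PAM immediately 3' of them."""
    # Minimal IUPAC handling: N matches any base (assumption for this sketch).
    iupac = {"N": "[ACGT]", "A": "A", "C": "C", "G": "G", "T": "T"}
    pam_re = "".join(iupac[base] for base in pam.upper())
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})({pam_re}))")
    for match in pattern.finditer(dna.upper()):
        yield match.start(), match.group(1), match.group(2)


# Toy sequence, invented for the example.
example = "TT" + "ACGTACGTACGTACGTACGT" + "TGG" + "AA"
for start, target, pam_seq in find_targets(example):
    print(start, target, pam_seq)
```

A 5' PAM, as used by some other nucleases, would need the pattern reversed so that the PAM precedes the target.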
  • Methods for identifying a preferred PAM sequence or consensus sequence for a given RNA-guided nuclease are known in the art and include, but are not limited to, the PAM depletion assay described by Karvelis et al. (2015) Genome Biol 16:253, or the assay disclosed in Pattanayak et al. (2013) Nat Biotechnol 31(9):839-43, each of which is incorporated by reference in its entirety.
  • the polynucleotide-targeting sequence is the nucleotide sequence that directly hybridizes with the target sequence of interest.
  • the guide sequence is engineered to be fully or partially complementary with the target sequence of interest.
  • the guide sequence can comprise from about 8 nucleotides to about 30 nucleotides, or more.
  • the guide sequence can be about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, about 20, about 21, about 22, about 23, about 24, about 25, about 26, about 27, about 28, about 29, about 30, or more nucleotides in length.
  • the guide sequence is about 10 to about 26 nucleotides in length, or about 12 to about 30 nucleotides in length. In particular embodiments, the guide sequence is about 30 nucleotides in length.
  • the degree of complementarity between a guide sequence and its corresponding target sequence, when optimally aligned using a suitable alignment algorithm, is about or more than about 50%, about 60%, about 70%, about 75%, about 80%, about 81%, about 82%, about 83%, about 84%, about 85%, about 86%, about 87%, about 88%, about 89%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or more.
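The degree of complementarity can be sketched as a simple percentage. The example below is a simplification: it assumes the guide RNA and the DNA strand it hybridizes to are already aligned, ungapped, and of equal length, so it does not perform the optimal alignment mentioned above. The function name and the toy sequences are hypothetical.

```python
# Watson-Crick partners of each guide RNA base on a DNA strand.
PAIRS = {"A": "T", "U": "A", "G": "C", "C": "G"}


def percent_complementarity(guide_rna, target_strand):
    """Percent of guide positions that base-pair with the target strand.

    Both sequences are written 5'->3'; pairing is antiparallel, so the
    target strand is read in reverse.
    """
    if len(guide_rna) != len(target_strand):
        raise ValueError("this sketch compares equal-length, ungapped sequences only")
    paired = sum(
        PAIRS.get(g) == t
        for g, t in zip(guide_rna.upper(), reversed(target_strand.upper()))
    )
    return 100.0 * paired / len(guide_rna)


print(percent_complementarity("GACUGACUGA", "TCAGTCAGTC"))  # fully complementary
print(percent_complementarity("GACUGACUGA", "TCAGTCAGTA"))  # one mismatch
```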
  • the guide sequence is free of secondary structure, which can be predicted using any suitable polynucleotide folding algorithm known in the art, including but not limited to mFold (see, e.g., Zuker and Stiegler (1981) Nucleic Acids Res. 9:133-148) and RNAfold (see, e.g., Gruber et al. (2008) Cell 106(1):23-24).
  • a guide nucleic acid comprises two separate nucleic acid molecules (an “activator-nucleic acid” and a “targeter-nucleic acid”, see below) and is referred to herein as a “double-molecule guide nucleic acid” or a "two-molecule guide nucleic acid.”
  • the subject guide nucleic acid is a single nucleic acid molecule (single polynucleotide) and is referred to herein as a “single-molecule guide nucleic acid,” a “single-guide nucleic acid,” or an “sgNA.”
  • the term “guide nucleic acid” or “gNA” is inclusive, referring both to double-molecule guide nucleic acids and to single-molecule guide nucleic acids (i.e., sgNAs).
  • the gRNA can be a double-molecule guide RNA or a single-guide RNA.
  • the gDNA can be a double-molecule guide DNA or a single-guide DNA.
  • An exemplary two-molecule guide nucleic acid comprises a crRNA-like (“CRISPR RNA” or “targeter-RNA” or “crRNA” or “crRNA repeat”) molecule and a corresponding tracrRNA-like (“trans-acting CRISPR RNA” or “activator-RNA” or “tracrRNA”) molecule.
  • a crRNA-like molecule comprises both the polynucleotide-targeting segment (single stranded) of the guide RNA and a stretch ("duplex-forming segment") of nucleotides that forms one half of the dsRNA duplex of the polypeptide-binding segment of the guide RNA, also referred to herein as the CRISPR repeat sequence.
  • The term “activator-nucleic acid” or “activator-NA” is used herein to mean a tracrRNA-like molecule of a double-molecule guide nucleic acid.
  • The term “targeter-nucleic acid” or “targeter-NA” is used herein to mean a crRNA-like molecule of a double-molecule guide nucleic acid.
  • The term “duplex-forming segment” is used herein to mean the stretch of nucleotides of an activator-NA or a targeter-NA that contributes to the formation of the dsRNA duplex by hybridizing to a stretch of nucleotides of a corresponding activator-NA or targeter-NA molecule.
  • an activator-NA comprises a duplex-forming segment that is complementary to the duplex-forming segment of the corresponding targeter-NA.
  • an activator-NA comprises a duplex-forming segment while a targeter-NA comprises both a duplex-forming segment and the DNA-targeting segment of the guide nucleic acid. Therefore, a subject double-molecule guide nucleic acid can be comprised of any corresponding activator-NA and targeter-NA pair.
  • the targeter-NA comprises a CRISPR repeat sequence comprising a nucleotide sequence that comprises a region with sufficient complementarity to hybridize to an activator-NA (the other part of the polypeptide-binding segment of the guide nucleic acid).
  • the CRISPR repeat sequence can comprise from about 8 nucleotides to about 30 nucleotides, or more.
  • the CRISPR repeat sequence can be about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, about 20, about 21, about 22, about 23, about 24, about 25, about 26, about 27, about 28, about 29, about 30, or more nucleotides in length.
  • the degree of complementarity between a CRISPR repeat sequence and the anti-repeat region of its corresponding tracr sequence, when optimally aligned using a suitable alignment algorithm, is about or more than about 50%, about 60%, about 70%, about 75%, about 80%, about 81%, about 82%, about 83%, about 84%, about 85%, about 86%, about 87%, about 88%, about 89%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or more.
  • a corresponding tracrRNA-like molecule comprises a stretch of nucleotides (duplex-forming segment) that forms the other part of the double-stranded duplex of the polypeptide-binding segment of the guide nucleic acid.
  • a stretch of nucleotides of a crRNA-like molecule (i.e., the CRISPR repeat sequence) and a stretch of nucleotides of a tracrRNA-like molecule (i.e., the anti-repeat sequence) are complementary and hybridize to form the double-stranded duplex of the polypeptide-binding segment.
  • the crRNA-like molecule additionally provides the single stranded DNA-targeting segment.
  • a crRNA-like and a tracrRNA-like molecule hybridize to form a guide nucleic acid.
  • the exact sequence of a given crRNA or tracrRNA molecule is characteristic of the CRISPR system and species in which the RNA molecules are found.
  • a subject double-molecule guide RNA can comprise any corresponding crRNA and tracrRNA pair.
  • a trans-activating-like CRISPR RNA or tracrRNA-like molecule (also referred to herein as an “activator-NA”) comprises a nucleotide sequence comprising a region that has sufficient complementarity to hybridize to a CRISPR repeat sequence of a crRNA, which is referred to herein as the anti-repeat region.
  • the tracrRNA-like molecule further comprises a region with secondary structure (e.g., stem-loop) or forms secondary structure upon hybridizing with its corresponding crRNA.
  • the region of the tracrRNA-like molecule that is fully or partially complementary to a CRISPR repeat sequence is at the 5' end of the molecule and the 3' end of the tracrRNA-like molecule comprises secondary structure.
  • This region of secondary structure generally comprises several hairpin structures, including the nexus hairpin, which is found adjacent to the anti-repeat sequence.
  • the nexus hairpin often has a conserved nucleotide sequence in the base of the hairpin stem, with the motif UNANNC found in many nexus hairpins in tracrRNAs.
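A consensus motif such as UNANNC can be located with a simple pattern search. The sketch below only finds sequence matches; it does not check that a match actually sits at the base of a hairpin stem, which would require structure prediction. The function name and the toy sequence are hypothetical.

```python
import re

# U, any base, A, any two bases, C (N = any of A, C, G, U).
NEXUS_MOTIF = re.compile(r"(?=(U[ACGU]A[ACGU]{2}C))")


def find_nexus_motifs(rna):
    """Return (position, match) pairs for every UNANNC occurrence, overlaps included."""
    return [(m.start(), m.group(1)) for m in NEXUS_MOTIF.finditer(rna.upper())]


# Toy RNA sequence, invented for the example.
print(find_nexus_motifs("GGAUCAGGCAA"))
```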
  • There are also terminal hairpins at the 3' end of the tracrRNA that can vary in structure and number, but often comprise a GC-rich Rho-independent transcriptional terminator hairpin followed by a string of U’s at the 3' end. See, for example, Briner et al. (2014) Molecular Cell 56:333-339, Briner and Barrangou (2016) Cold Spring Harb Protoc; doi: 10.1101/pdb.top090902, and U.S. Publication No. 2017/0275648, each of which is herein incorporated by reference in its entirety.
  • the anti-repeat region of the tracrRNA-like molecule that is fully or partially complementary to the CRISPR repeat sequence comprises from about 8 nucleotides to about 30 nucleotides, or more.
  • the region of base pairing between the tracrRNA-like anti-repeat sequence and the CRISPR repeat sequence can be about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, about 20, about 21, about 22, about 23, about 24, about 25, about 26, about 27, about 28, about 29, about 30, or more nucleotides in length.
  • the degree of complementarity between a CRISPR repeat sequence and its corresponding tracrRNA-like anti-repeat sequence, when optimally aligned using a suitable alignment algorithm, is about or more than about 50%, about 60%, about 70%, about 75%, about 80%, about 81%, about 82%, about 83%, about 84%, about 85%, about 86%, about 87%, about 88%, about 89%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or more.
  • the entire tracrRNA-like molecule can comprise from about 60 nucleotides to more than about 140 nucleotides.
  • the tracrRNA-like molecule can be about 60, about 65, about 70, about 75, about 80, about 85, about 90, about 95, about 100, about 105, about 110, about 115, about 120, about 125, about 130, about 135, about 140, or more nucleotides in length.
  • the tracrRNA-like molecule is about 80 to about 100 nucleotides in length, including about 80, about 81, about 82, about 83, about 84, about 85, about 86, about 87, about 88, about 89, about 90, about 91, about 92, about 93, about 94, about 95, about 96, about 97, about 98, about 99, and about 100 nucleotides in length.
  • a subject single-molecule guide nucleic acid comprises two stretches of nucleotides (a targeter-NA and an activator-NA) that are complementary to one another, are covalently linked by intervening nucleotides ("linkers” or "linker nucleotides”), and hybridize to form the double stranded nucleic acid duplex of the protein-binding segment, thus resulting in a stem-loop structure.
  • the targeter-NA and the activator-NA can be covalently linked via the 3' end of the targeter-NA and the 5' end of the activator-NA.
  • the targeter-NA and the activator-NA can be covalently linked via the 5' end of the targeter-NA and the 3' end of the activator-NA.
  • the linker of a single-molecule DNA-targeting nucleic acid can have a length of from about 3 nucleotides to about 100 nucleotides.
  • the linker can have a length of from about 3 nucleotides (nt) to about 90 nt, from about 3 nt to about 80 nt, from about 3 nt to about 70 nt, from about 3 nt to about 60 nt, from about 3 nt to about 50 nt, from about 3 nt to about 40 nt, from about 3 nt to about 30 nt, from about 3 nt to about 20 nt or from about 3 nt to about 10 nt, including but not limited to about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10, about 11 , about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, about 20, or more nucleotides.
  • the linker of a single-molecule DNA-targeting nucleic acid is 4 nt.
  • An exemplary single-molecule DNA-targeting nucleic acid comprises two complementary stretches of nucleotides that hybridize to form a double-stranded duplex, along with a guide sequence that hybridizes to a specific target sequence.
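The layout of a single-molecule guide can be sketched by joining its parts in order. This example assumes the arrangement in which the 3' end of the targeter-NA is linked to the 5' end of the activator-NA, uses a 4-nt linker, and for simplicity makes the anti-repeat the exact reverse complement of the repeat (the text also allows partial complementarity). The additional hairpins of the activator-NA are omitted. All sequences, including the `GAAA` linker, and the function names are placeholders chosen for illustration.

```python
def reverse_complement_rna(seq):
    """Return the reverse complement of an RNA sequence."""
    complement = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def build_single_guide(guide_seq, repeat, linker="GAAA"):
    """Assemble 5'-[guide][CRISPR repeat][linker][anti-repeat]-3'.

    The repeat and anti-repeat can hybridize to each other, so the linker
    becomes the loop of a stem-loop structure.
    """
    anti_repeat = reverse_complement_rna(repeat)
    return guide_seq + repeat + linker + anti_repeat


# Placeholder sequences, not taken from the text.
sg = build_single_guide("GACUGACUGACUGACUGACU", "GUUUUAGAGCUA")
print(sg, len(sg))
```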
  • Appropriate naturally-occurring cognate pairs of crRNAs (and, in some embodiments, tracrRNAs) are known for most Cas proteins that function as nucleic acid-guided nucleases that have been discovered, or can be determined for a specific naturally-occurring Cas protein that has nucleic acid-guided nuclease activity by sequencing and analyzing flanking sequences of the Cas nucleic acid-guided nuclease protein to identify the tracrRNA-coding sequence, and thus the tracrRNA sequence, by searching for known anti-repeat-coding sequences or a variant thereof.
  • Antirepeat regions of the tracrRNA comprise one-half of the ds protein-binding duplex.
  • The complementary repeat sequence that comprises one-half of the ds protein-binding duplex is called the CRISPR repeat.
  • CRISPR repeat and antirepeat sequences utilized by known CRISPR nucleic acid-guided nucleases are known in the art and can be found, for example, at the CRISPR database on the world wide web at crispr.i2bc.paris-saclay.fr/crispr/.
  • the single guide nucleic acid or dual-guide nucleic acid can be synthesized chemically or via in vitro transcription.
  • Assays for determining sequence-specific binding between a nucleic acid-guided nuclease and a guide nucleic acid are known in the art and include, but are not limited to, in vitro binding assays between an expressed nucleic acid-guided nuclease and the guide nucleic acid, which can be tagged with a detectable label (e.g., biotin) and used in a pull-down detection assay in which the nucleoprotein complex is captured via the detectable label (e.g., with streptavidin beads).
  • a control guide nucleic acid with an unrelated sequence or structure to the guide nucleic acid can be used as a negative control for non-specific binding of the nucleic acid-guided nuclease to nucleic acids.
  • the nucleic acid-guided nuclease of the presently disclosed compositions and methods comprises a nuclease variant that functions as a nickase, wherein the nuclease comprises a mutation in comparison to the wild-type nuclease that results in the nuclease only being capable of cleaving a single strand of a double-stranded nucleic acid molecule, or lacks nuclease activity altogether (i.e., nuclease-dead).
  • a nuclease such as a nucleic acid-guided nuclease, that functions as a nickase only comprises a single functioning nuclease domain.
  • additional nuclease domains have been mutated such that the nuclease activity of that particular domain is reduced or eliminated.
  • the nuclease (e.g., RNA-guided nuclease) lacks nuclease activity completely and is referred to herein as nuclease-dead.
  • In some of these embodiments, all nuclease domains within the nuclease have been mutated such that all nuclease activity of the polypeptide has been eliminated. Any method known in the art can be used to introduce mutations into one or more nuclease domains of a nucleic acid-guided nuclease, including those set forth in U.S. Publ. No. 2014/0068797 and U.S. Pat. No. 9,790,490, each of which is incorporated by reference in its entirety.
  • any mutation within a nuclease domain that reduces or eliminates the nuclease activity can be used to generate a nucleic acid-guided nuclease having nickase activity or a nuclease-dead nucleic acid-guided nuclease.
  • Such mutations are known in the art and include, but are not limited to, the D10A mutation within the RuvC domain or the H840A mutation within the HNH domain of the S. pyogenes Cas9 or at similar position(s) within another nucleic acid-guided nuclease when aligned for maximal homology with the S. pyogenes Cas9. Other positions within the nuclease domains of S. pyogenes Cas9 that can be mutated to generate a nickase or nuclease-dead protein include G12, G17, E762, N854, N863, H982, H983, and D986.
  • Other mutations within a nuclease domain of a nucleic acid-guided nuclease that can lead to nickase or nuclease-dead proteins include D917A, E1006A, E1028A, D1227A, D1255A, and N1257A of the Francisella novicida Cpf1 protein or at similar position(s) within another nucleic acid-guided nuclease when aligned for maximal homology with the F. novicida Cpf1 protein (U.S. Pat. No. 9,790,490, which is incorporated by reference in its entirety).
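Substitution codes such as D10A read as "wild-type residue, 1-based position, replacement residue". A minimal sketch of parsing such a code and applying it to a sequence follows. The function names are hypothetical, and the 12-residue peptide is a toy sequence, not the actual Cas9 or Cpf1 sequence. The check against the wild-type residue guards against applying a code to a sequence with different numbering.

```python
import re


def parse_substitution(code):
    """Split a code like 'D10A' into (wild_type, position, replacement)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitution(protein, code):
    """Return the protein with the substitution applied (positions are 1-based)."""
    wild_type, position, replacement = parse_substitution(code)
    if protein[position - 1] != wild_type:
        raise ValueError(f"expected {wild_type} at position {position}")
    return protein[: position - 1] + replacement + protein[position:]


toy_peptide = "MSTKYSIGLDIG"  # toy sequence with D at position 10
print(apply_substitution(toy_peptide, "D10A"))
```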
  • Nucleic acid-guided nucleases comprising a nuclease-dead domain can further comprise a domain capable of modifying a polynucleotide.
  • modifying domains that may be fused to a nuclease-dead domain include but are not limited to, a transcriptional activation or repression domain, a base editing domain, and an epigenetic modification domain.
  • the nucleic acid-guided nuclease comprising a nuclease-dead domain further comprises a detectable label that can aid in detecting the presence of the target sequence.
  • An epigenetic modification domain that can be fused to a nuclease-dead domain can serve to covalently modify DNA or histone proteins to alter histone structure and/or chromosomal structure without altering the DNA sequence itself, leading to changes in gene expression (upregulation or downregulation).
  • Non-limiting examples of epigenetic modifications that can be induced by nucleic acid-guided nuclease include the following alterations in histone residues and the reverse reactions thereof: sumoylation, methylation of arginine or lysine residues, acetylation or ubiquitination of lysine residues, phosphorylation of serine and/or threonine residues; and the following alterations of DNA and the reverse reactions thereof: methylation or hydroxymethylation of cytosine residues.
  • Non-limiting examples of epigenetic modification domains thus include histone acetyltransferase domains, histone deacetylation domains, histone methyltransferase domains, histone demethylase domains, DNA methyltransferase domains, and DNA demethylase domains.
  • the nucleic acid-guided nuclease comprises a transcriptional activation domain that activates the transcription of at least one adjacent gene through the interaction with transcriptional control elements and/or transcriptional regulatory proteins, such as transcription factors or RNA polymerases.
  • Suitable transcriptional activation domains are known in the art and include, but are not limited to, VP16 activation domains.
  • the nucleic acid-guided nuclease comprises a transcriptional repressor domain, which can also interact with transcriptional control elements and/or transcriptional regulatory proteins, such as transcription factors or RNA polymerases, to reduce or terminate transcription of at least one adjacent gene.
  • transcriptional repression domains are known in the art and include, but are not limited to, IκB and KRAB domains.
  • the nucleic acid-guided nuclease comprising a nuclease-dead domain further comprises a detectable label that can aid in detecting the presence of the target sequence, which may be a disease-associated sequence.
  • a detectable label is a molecule that can be visualized or otherwise observed.
  • the detectable label may be fused to the nucleic acid-guided nuclease as a fusion protein (e.g., fluorescent protein) or may be a small molecule conjugated to the nuclease polypeptide that can be detected visually or by other means.
  • Detectable labels that can be fused to the presently disclosed nucleic acid-guided nucleases as a fusion protein include any detectable protein domain, including but not limited to, a fluorescent protein or a protein domain that can be detected with a specific antibody.
  • fluorescent proteins include green fluorescent proteins (e.g., GFP, EGFP, ZsGreen1) and yellow fluorescent proteins (e.g., YFP, EYFP, ZsYellow1).
  • Non-limiting examples of small molecule detectable labels include radioactive labels, such as ³H and ³⁵S.
  • the nucleic acid-guided nuclease can be delivered as part of a nucleoprotein (e.g., RNA-guided nuclease protein and guide RNA) into a cell as a nucleoprotein complex comprising the nucleic acid-guided nuclease bound to its guide nucleic acid.
  • the nucleic acid-guided nuclease is delivered as a protein and the guide nucleic acid is provided separately.
  • a guide RNA can be introduced into a target cell as an RNA molecule.
  • the guide RNA can be transcribed in vitro or chemically synthesized.
  • a nucleotide sequence encoding the guide RNA is introduced into the cell.
  • the nucleotide sequence encoding the guide RNA is operably linked to a promoter (e.g., an RNA polymerase III promoter), which can be a native promoter or heterologous to the guide RNA-encoding nucleotide sequence.
  • a nucleic acid sequence encoding the guide RNA and RNA-guided nuclease operably linked to a promoter can be delivered on a vector, such as the expression vector described in detail herein.
  • the nucleoprotein can comprise additional amino acid sequences, such as at least one nuclear localization sequence (NLS).
  • Nuclear localization sequences enhance transport of the nucleic acid-guided nuclease into the nucleus of a cell.
  • Proteins that are imported into the nucleus bind to one or more of the proteins within the nuclear pore complex, such as importin/karyopherin proteins, which generally bind best to lysine and arginine residues.
  • the best characterized pathway for nuclear localization involves a short peptide sequence which binds to the importin-α protein.
  • nuclear localization sequences often comprise stretches of basic amino acids and given that there are two such binding sites on importin-α, two basic sequences separated by at least 10 amino acids can make up a bipartite NLS.
  • the second most characterized pathway of nuclear import involves proteins that bind to the importin-β1 protein, such as the HIV-TAT and HIV-REV proteins, which use the sequences RKKRRQRRR (SEQ ID NO: 1) and RQARRNRRRRWR (SEQ ID NO: 2), respectively, to bind to importin-β1.
  • Other nuclear localization sequences are known in the art (see, e.g., Lange et al., J. Biol. Chem. (2007) 282:5101-5105).
  • the NLS can be the naturally-occurring NLS of the nucleic acid-guided nuclease or a heterologous NLS.
  • As used herein, "heterologous" in reference to a sequence is a sequence that originates from a foreign species, or, if from the same species, is substantially modified from its native form in composition and/or genomic locus by deliberate human intervention.
  • Non-limiting examples of NLS sequences that can be used to enhance the nuclear localization of the nucleic acid-guided nuclease or nucleoprotein include the NLS of the SV40 Large T-antigen and c-Myc.
  • the NLS comprises the amino acid sequence PKKKRKV (SEQ ID NO: 3).
  • a nucleoprotein can comprise more than one NLS, such as two, three, four, five, six, or more NLS sequences. Each of the multiple NLSs can be unique in sequence or there can be more than one of the same NLS sequence used.
  • the NLS can be on the amino-terminal (N-terminal) end of the nucleoprotein, the carboxy-terminal (C-terminal) end, or both the N-terminal and C-terminal ends of the nucleoprotein.
  • the nucleoprotein comprises two NLS sequences on its N-terminal end.
  • the nucleoprotein comprises two NLS sequences on the C-terminal end of the site-directed polypeptide.
  • the site-directed polypeptide comprises four NLS sequences on its N-terminal end and two NLS sequences on its C-terminal end.
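The NLS arrangements described above can be sketched as simple sequence concatenation. The example below reuses the PKKKRKV sequence (SEQ ID NO: 3) and a toy protein sequence. The function name `add_nls` is hypothetical, and practical details such as an initiator methionine or spacer residues between repeats are deliberately left out of this sketch.

```python
SV40_NLS = "PKKKRKV"  # SEQ ID NO: 3 in the text


def add_nls(protein, n_term=0, c_term=0, nls=SV40_NLS):
    """Return the protein with the given number of NLS copies on each terminus."""
    return nls * n_term + protein + nls * c_term


toy_protein = "MSTKYSIGLDIG"  # toy sequence, invented for the example
two_n_terminal = add_nls(toy_protein, n_term=2)
four_n_two_c = add_nls(toy_protein, n_term=4, c_term=2)
print(two_n_terminal)
print(four_n_two_c.count(SV40_NLS))
```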
  • the site-directed modifying polypeptide contains a conjugation moiety that allows the protein to conjugate to a test agent.
  • Conjugation moieties include, but are not limited to Protein A, SpyCatcher tag, Halo-tag, Sortase, mono-avidin, ACP tag, a SNAP tag, or any other conjugation moieties known in the art.
  • the conjugation moiety is selected from Protein A, CBP, MBP, GST, poly(His), biotin/streptavidin, V5-tag, Myc-tag, HA-tag, NE-tag, His-tag, Flag tag, Halo-tag, Snap- tag, Fc-tag, Nus-tag, BCCP, thioredoxin, SnoopTag, SpyTag, SpyCatcher, Isopeptag, SBP-tag, S- tag, AviTag, and calmodulin.
  • Exemplary binding moiety pairings include (i) streptavidin-binding peptide (SBP) and streptavidin (STV), (ii) biotin and EMA (enhanced monomeric avidin), (iii) SpyTag (ST) and SpyCatcher (SC), (iv) Halo-tag and Halo-tag ligand, (v) and SNAP-Tag, (vi) Myc tag and anti-Myc immunoglobulins, (vii) FLAG tag and anti-FLAG immunoglobulins, and (ix) ybbR tag and coenzyme A groups.
  • the nucleic acid-guided nuclease comprises (1) the self-cleaving N-terminal portions (Npro) of polyproteins from pestiviruses such as Hog cholera virus (strain Alfort), also called classical swine fever virus (CSFV), from border disease virus (BDV), bovine viral diarrhea virus (BVDV), or fragments thereof; (2) the N-terminal portion of carboxypeptidase B (‘CPB’) precursor (amino acids 21-110 of Sus scrofa CPB, SwissProt P09955.5), and fragments thereof; and/or (3) small ubiquitin-related modifier (SUMO) (SwissProt P55853.1).
  • Any N-terminal tag may itself be further tagged at its N-terminus with a polyhistidine tag such as 6xHis (SEQ ID NO: 4), allowing for initial purification of the tagged polypeptide on a nickel column, followed by self-cleavage of tags such as Npro, or enzymatic cleavage of the CPB or SUMO N-terminal tag by trypsin or SUMO protease, respectively, and elution of the freed polypeptide from the column.
  • the SUMO protease polypeptides are also fusion proteins comprising 6xHis tags, allowing for a two-step purification: in the first step, the expressed 6xHis-SUMO-tagged nucleic acid-guided nuclease is purified by binding to a nickel column, followed by elution from the column.
  • the SUMO tags on the purified polypeptides are cleaved by the 6xHis-tagged SUMO protease, and the SUMO protease-nucleic acid-guided nuclease reaction mixture is run through a second nickel column, which retains the SUMO protease but allows the now untagged nucleic acid-guided nuclease to flow through.
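The logic of the two-step purification can be followed with a toy model in which a nickel column simply retains anything carrying a 6xHis tag. This is a bookkeeping sketch, not a model of real chromatography: the class and function names are hypothetical, and it assumes the cleaved 6xHis-SUMO fragment is retained on the second column together with the protease.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Species:
    name: str
    tags: tuple = ()


def nickel_column(sample):
    """Split a sample into (retained, flow_through) by presence of a 6xHis tag."""
    retained = [s for s in sample if "6xHis" in s.tags]
    flow_through = [s for s in sample if "6xHis" not in s.tags]
    return retained, flow_through


def sumo_cleave(sample):
    """Cut each SUMO-tagged species into a His-tagged fragment and an untagged product."""
    products = []
    for s in sample:
        if "SUMO" in s.tags:
            products.append(Species("cleaved 6xHis-SUMO tag", ("6xHis",)))
            products.append(Species(s.name))
        else:
            products.append(s)
    return products


lysate = [Species("nuclease", ("6xHis", "SUMO")), Species("host protein")]
step1_eluate, _ = nickel_column(lysate)            # step 1: bind, then elute
digest = sumo_cleave(step1_eluate) + [Species("SUMO protease", ("6xHis",))]
retained, purified = nickel_column(digest)         # step 2: collect flow-through
print([s.name for s in purified])
```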
  • fluorescent protein sequences can be expressed as part of a polypeptide gene product, with the amino acid sequence for the fluorescent protein preferably added at the N- or C-terminal end of the amino acid sequence of the polypeptide gene product.
  • the resulting fusion protein fluoresces when exposed to light of certain wavelengths, allowing the presence of the fusion protein to be detected visually.
  • a well-known fluorescent protein is the green fluorescent protein of Aequorea victoria, and many other fluorescent proteins are commercially available, along with nucleotide sequences encoding them.
  • the expression vectors herein comprise from 5’ to 3’ a promoter, a ribosome binding site, a nucleic acid-guided nuclease, and a detectable label, conjugation moiety, or other tag described herein.
  • the expression vectors herein comprise from 5' to 3’ a promoter, a ribosome binding site; a detectable label, conjugation moiety, or other tag described herein; and a nucleic acid-guided nuclease.
  • the detectable label or other tag is operably linked to the nucleic acid-guided nuclease by a cleavable linker (e.g., a SUMO protease cleavable linker).
  • Example 1 Gel binding assay for cell internalization agents that can effectively bind to a site-directed modifying polypeptide
  • An anti-FAP antibody (28H1)-SpyTag construct (where the SpyTag was genetically encoded on the C-terminus of the light chain) was incubated with SpyCatcher-Cas9 (where the SpyCatcher was genetically encoded on the N-terminus) in Expi293 media (or size exclusion column buffer) for 30 minutes at room temperature (i.e., Fig. 1, lane 8).
  • this assay can be used to determine the ability of a cell internalization agent (e.g., an anti-FAP antibody) to effectively bind the cell surface and internalize a site-directed modifying polypeptide (e.g., Cas9).

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biomedical Technology (AREA)
  • Organic Chemistry (AREA)
  • Biotechnology (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Plant Pathology (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Peptides Or Proteins (AREA)

Abstract

Methods and compositions related to identifying cell internalizing agents capable of internalizing endonucleases for gene editing, such as RNA-guided endonucleases, are provided.

Description

COMPOSITIONS AND METHODS FOR SCREENING CELL INTERNALIZING AGENTS
RELATED APPLICATIONS
This application claims priority to U.S. Provisional Application No. 63/077,042, filed September 11, 2020. The entire content of the foregoing priority application is incorporated by reference herein.
SEQUENCE LISTING
The instant application contains a Sequence Listing which has been submitted electronically in ASCII format, and which is hereby incorporated by reference in its entirety. Said ASCII copy, created on September 8, 2021, is named S106638 1070WO_SL_ST25.txt, and is 966 bytes in size.
FIELD OF THE INVENTION
Described herein are methods and compositions for identifying cell internalizing agents capable of internalizing endonucleases (e.g., RNA-guided endonucleases) into a cell for gene editing.
BACKGROUND OF THE INVENTION
Endonucleases, such as Cas9, have become a versatile tool for genome engineering in various cell types and organisms (see, e.g., US 8,697,359). Guided by a guide RNA (gRNA), such as a dual-RNA complex or a chimeric single-guide RNA, RNA-guided endonucleases can generate site-specific double-stranded breaks (DSBs) or single-stranded breaks (SSBs) within target nucleic acids (e.g., double-stranded DNA (dsDNA), single-stranded DNA (ssDNA), or RNA). Targeting such endonucleases to specific cells or tissues, particularly in vivo, remains difficult in the field. In particular, identifying agents that can assist with internalization of such nucleases remains a challenge.
SUMMARY OF THE INVENTION
There is a need for improved methods for identifying cell internalizing agents that can facilitate the internalization of nucleases, including RNA-guided endonucleases. In addition, there is a need in the art for improved screening methods that can detect the ability of an endonuclease, such as an RNA-guided endonuclease, to perform genome editing.
Provided herein are methods and compositions for testing agents in order to identify a cell internalizing agent that can internalize a site-directed modifying polypeptide or nucleoprotein, as well as methods for determining gene editing activity once internalized. Overall, the methods and compositions disclosed herein provide screening methods for identifying effective agents that can internalize a site-directed modifying polypeptide or nucleoprotein in such a manner that the site-directed modifying polypeptide or nucleoprotein is able to perform gene editing.
In one aspect, provided herein is a method for identifying a cell internalizing agent, the method comprising providing a population of target cells, wherein each target cell comprises a reporter construct that comprises a nucleic acid that provides a phenotype when activated or repressed, wherein each target cell is a eukaryotic cell that expresses a test agent on its cell surface, and wherein the test agent comprises a first conjugation moiety, contacting the population of target cells with a nucleoprotein comprising a site-directed modifying polypeptide and a guide RNA (gRNA) that specifically hybridizes to a target nucleic acid within the target cell, wherein the nucleoprotein comprises a second conjugation moiety that binds to the first conjugation moiety of the test agent, and selecting a modified target cell having the phenotype observed with activation or repression of the reporter construct, thereby identifying the cell internalizing agent.
In some embodiments, the target nucleic acid is in the nucleus of the target cell.
In some embodiments, the test agent is a protein, a lipid, or a carbohydrate. In certain embodiments, the protein is an antigen-binding moiety. In other embodiments, the antigen-binding moiety is an antibody or an antibody fragment thereof. In another embodiment, the antigen-binding moiety is a nanobody, a domain antibody, an scFv, a Fab, a diabody, a BiTE, a DART, a minibody, a F(ab’)2, an intrabody, or an antibody mimetic. In yet another embodiment, the antibody mimetic is a fibronectin based binding molecule, an affilin, an affimer, an affitin, an alphabody, an affibody, a DARPin, an anticalin, an avimer, a fynomer, a Kunitz domain peptide, a monobody, a nanoCLAMP, a unibody, a versabody, an aptamer, or a cyclotide. In other embodiments, the protein is a cell-penetrating peptide (CPP). In yet another embodiment, the protein is a ligand, or binding fragment thereof.
In some embodiments, the site-directed modifying polypeptide is a Class 2 Cas polypeptide. In other embodiments, the Class 2 Cas polypeptide is a Type II Cas polypeptide. In yet another embodiment, the Type II Cas polypeptide is Cas9.
In some embodiments, the site-directed modifying polypeptide is conjugated to the second binding moiety via a linker. In other embodiments, the linker is a labile linker. In another embodiment, the labile linker is pH sensitive. In another embodiment, the labile linker is sensitive to reducing conditions. In yet another embodiment, the labile linker is a disulfide linker. In other embodiments, the linker is a hydrazone linker or a valine-citrate linker.
In some embodiments, the modified target cell is selected by determining the RNA or protein expression level of the reporter construct. In another embodiment, an increase in the RNA or protein expression level of a nucleic acid in the reporter construct relative to a control indicates internalization of the test agent. In another embodiment, the RNA or protein expression level of a nucleic acid in the reporter construct is decreased or substantially eliminated upon internalization of the test agent. In other embodiments, the reporter construct comprises a nucleic acid encoding a selection marker. In yet another embodiment, the selection marker is a thymidine kinase gene construct and the population of target cells is cultured in the presence of ganciclovir. In another embodiment, the nucleic acid encoding a selection marker is an antibiotic resistance marker, a fluorescence marker, or a bioluminescence marker. In another embodiment, the fluorescence marker is a green fluorescent protein (GFP), a yellow fluorescent protein (YFP), a red fluorescent protein (RFP), or a split GFP reporter. In certain embodiments, the modified target cell is selected by detecting a signal produced by reassembly of the split GFP reporter. In another embodiment, the selection is a positive selection. In another embodiment, the selection is a negative selection.
In some embodiments, the modified target cell is identified using cell sorting. In certain embodiments, the cell sorting is fluorescence-activated cell sorting (FACS), magnetic-activated cell sorting (MACS), or microfluidic-based cell sorting.
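As a minimal sketch of how a fluorescence-based positive selection could be gated computationally, the following Python example flags cells whose reporter signal exceeds a threshold derived from an unedited control population. The function name, the mean-plus-three-standard-deviations rule, and the intensity values are illustrative assumptions and are not taken from this disclosure; for a negative selection the comparison would be reversed.

```python
from statistics import mean, stdev


def gate_modified_cells(control, sample, n_sd=3.0):
    """Return indices of sample cells whose reporter signal exceeds a gate
    placed at mean + n_sd * SD of the unedited control population.

    The gating rule is a hypothetical choice made for illustration only.
    """
    threshold = mean(control) + n_sd * stdev(control)
    return [i for i, signal in enumerate(sample) if signal > threshold]


# Hypothetical per-cell fluorescence intensities (arbitrary units).
control = [100.0, 110.0, 95.0, 105.0, 90.0]
sample = [102.0, 980.0, 99.0, 1500.0]

selected = gate_modified_cells(control, sample)
print(selected)
```

In this sketch the selected indices stand in for the modified target cells that a sorter would collect for downstream identification of the test agent.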
In some embodiments, the cell internalizing agent binds to a cell surface moiety associated with a disorder. In some embodiments, the cell internalizing agent binds to a cell surface protein associated with a disorder. In certain embodiments, the cell surface protein is a tyrosine kinase, an epidermal growth factor receptor (EGFR), a platelet-derived growth factor receptor (PDGFR), a fibroblast growth factor receptor (FGFR), a hepatocyte growth factor receptor (HGFR), a nerve growth factor receptor (NGFR), CD3, CD4, Tim-3, CD278, TNFR-I, IL-1R, LT-betaR, IL-18R, CCR1, CD26, CD94, CD119, CD183, CD195, or DPIV.
In some embodiments, the population of target cells comprises mammalian cells or yeast cells. In certain embodiments, the mammalian cells are a cell type selected from the group consisting of a COP cell, an L cell, a C127 cell, an Sp2/0 cell, an NS-0 cell, an NIH3T3 cell, a PC12 cell, a PC12h cell, a BHK cell, a CHO cell, a COS1 cell, a COS3 cell, a COS7 cell, a CV1 cell, a Vero cell, a HeLa cell, an HEK-293 cell, a PER C6 cell, a cell derived from diploid fibroblasts, a myeloma cell, and a HepG2 cell.
In some embodiments, the cell internalizing agent is identified by polymerase chain reaction (PCR) or by deep sequencing of a PCR-amplified nucleic acid derived from the modified target cell.
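One way to picture identification by deep sequencing is as an enrichment calculation: sequences encoding each test agent are counted in reads from the selected (modified) cells and compared with their frequency in the starting population. The Python sketch below assumes that each read has already been assigned to a test-agent identifier; the identifiers, the pseudocount, and the scoring rule are hypothetical and are offered only as an illustration, not as the method of this disclosure.

```python
from collections import Counter


def enrichment(selected_reads, input_reads, pseudocount=1.0):
    """Score each test agent by its frequency among reads from selected
    cells divided by its frequency among reads from the input population.

    A pseudocount keeps agents that are absent from one pool from
    producing a division by zero.
    """
    sel = Counter(selected_reads)
    inp = Counter(input_reads)
    agents = set(sel) | set(inp)
    sel_total = sum(sel.values()) + pseudocount * len(agents)
    inp_total = sum(inp.values()) + pseudocount * len(agents)
    scores = {}
    for agent in agents:
        f_sel = (sel[agent] + pseudocount) / sel_total
        f_inp = (inp[agent] + pseudocount) / inp_total
        scores[agent] = f_sel / f_inp
    return scores


# Hypothetical read assignments: agent "A" dominates the selected pool.
input_reads = ["A", "B", "C"] * 10
selected_reads = ["A"] * 8 + ["B"] + ["C"]

scores = enrichment(selected_reads, input_reads)
best = max(scores, key=scores.get)
print(best, round(scores[best], 2))
```

A score above 1 suggests that an agent is over-represented among modified target cells, which in this sketch marks it as a candidate cell internalizing agent.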
In one aspect, provided herein is a method for screening a library of cells having a plurality of genotypes for a cell that produces a cell internalizing agent, the method comprising (a) providing an array with a library of protein-variant-producing cells that each express a test agent; (b) incubating the array under conditions that allow for the production of test agents from the protein-variant-producing cells; (c) providing a target cell comprising a reporter construct comprising a nucleic acid that provides a phenotype when activated or repressed; (d) contacting the target cell with the test agent under conditions that allow for the test agent to bind to the target cell, wherein the test agent comprises a first conjugation moiety; (e) contacting the target cell with a nucleoprotein comprising a site-directed modifying polypeptide and a guide RNA (gRNA) that specifically hybridizes to a target nucleic acid of the target cell, wherein a second conjugation moiety is conjugated to the site-directed modifying polypeptide such that the first conjugation moiety of the test agent binds to the second conjugation moiety of the site-directed modifying polypeptide; (f) determining a result from the array, wherein the result identifies a target cell having a phenotype associated with activation or repression of the nucleic acid in the reporter construct; and (g) extracting the target cell identified in (f), thereby obtaining the cell that produces the cell internalizing agent.
In some embodiments, the array is a microfluidic system, a microbubble system, or a microcavity array. In certain embodiments, the array is a microcavity array and the method comprises extracting the target cell with electromagnetic radiation.
In some embodiments, the test agent is a protein, a lipid, or a carbohydrate. In another embodiment, the protein is a peptide. In other embodiments, the protein is an antigen-binding moiety. In yet another embodiment, the antigen-binding moiety is an antibody or an antibody fragment thereof. In another embodiment, the antigen-binding moiety is a nanobody, a domain antibody, an scFv, a Fab, a diabody, a BiTE, a DART, a minibody, a F(ab’)2, an intrabody, or an antibody mimetic. In yet another embodiment, the antibody mimetic is an adnectin (i.e., fibronectin based binding molecules), an affilin, an affimer, an affitin, an alphabody, an affibody, a DARPin, an anticalin, an avimer, a fynomer, a Kunitz domain peptide, a monobody, a nanoCLAMP, a unibody, a versabody, an aptamer, or a cyclotide. In another embodiment, the protein is a cell-penetrating peptide (CPP). In another embodiment, the protein is a ligand, or fragment thereof.
In some embodiments, the site-directed modifying polypeptide is a Class 2 Cas polypeptide. In other embodiments, the Class 2 Cas polypeptide is a Type II Cas polypeptide. In yet another embodiment, the Type II Cas polypeptide is Cas9. In some embodiments, the site-directed modifying polypeptide is conjugated to the second binding moiety via a linker. In other embodiments, the linker is a labile linker. In another embodiment, the labile linker is pH sensitive. In another embodiment, the labile linker is sensitive to reducing conditions. In yet another embodiment, the labile linker is a disulfide linker. In other embodiments, the linker is a hydrazone linker or a valine-citrate linker.
In some embodiments, the result is assayed by determining an RNA or protein expression level of the nucleic acid in the reporter construct. In other embodiments, an increase in the RNA or protein expression level of the nucleic acid indicates internalization of the test agent. In another embodiment, the nucleic acid encodes a selection marker. In another embodiment, the selection marker is an antibiotic resistance marker, a fluorescence marker, or a bioluminescence marker. In another embodiment, the fluorescence marker is green fluorescent protein (GFP), yellow fluorescent protein (YFP), red fluorescent protein (RFP), or a split GFP reporter. In certain embodiments, the method further comprises detecting a signal produced by reassembly of the split GFP reporter. In yet another embodiment, a decrease in the RNA or protein expression level of the nucleic acid indicates internalization of the test agent. In another embodiment, the selection marker is a nucleic acid encoding a thymidine kinase gene construct and the target cells are cultured in the presence of ganciclovir. In another embodiment, the selection is a positive selection. In another embodiment, the selection is a negative selection.
In some embodiments, the method further comprises measuring a signal produced from a labeled antibody capable of specifically binding to a cell surface protein expressed by the reporter construct. In one embodiment, the presence of the signal indicates internalization of the test agent. In another embodiment, the absence of the signal indicates internalization of the test agent.
In some embodiments, the target cell is a mammalian cell or a yeast cell. In certain embodiments, the mammalian cell is a cell type selected from the group consisting of a COP cell, an L cell, a C127 cell, an Sp2/0 cell, an NS-0 cell, an NIH3T3 cell, a PC12 cell, a PC12h cell, a BHK cell, a CHO cell, a COS1 cell, a COS3 cell, a COS7 cell, a CV1 cell, a Vero cell, a HeLa cell, an HEK-293 cell, a PER C6 cell, a cell derived from diploid fibroblasts, a myeloma cell, and a HepG2 cell.
In some embodiments, the cell internalizing agent is identified by sequencing a nucleic acid in the target cell.
In some embodiments, the method further comprises identifying the cell internalizing agent using polymerase chain reaction (PCR) or deep sequencing of PCR-amplified nucleic acid from the cell that produces the cell internalizing agent.
In some embodiments, the cell internalization agent binds to a cell surface antigen associated with a disease. In certain embodiments, the disease is selected from the group consisting of cancer, autoimmune disease, and a hereditary genetic disease. In another embodiment, the cell surface antigen is selected from the group consisting of HLA-DR, CD44, CD22, CD3, CD20, CD33, CD32, CD47, CD59, CD54, CD25, AchR, CD70, CD74, CTLA4, EGFR, HER2, and EpCam.
In some embodiments, the modified target cell is selected by detecting cells capable of propagating in the presence of ganciclovir.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 depicts results of an assay showing the formation of a complex of an anti-FAP antibody (28H1)-SpyTag (genetically encoded on the C-terminus) with SpyCatcher-Cas9 (genetically encoded on the N-terminus).
DETAILED DESCRIPTION OF THE INVENTION
Provided herein are methods and compositions to identify constructs that facilitate the internalization of an endonuclease alone or associated with a guide nucleic acid, e.g., an RNA-guided endonuclease or nucleoprotein. Further provided herein are screening methods for rapidly detecting the ability of an endonuclease, e.g., an RNA-guided endonuclease or nucleoprotein to perform genome editing activities.
I. Definitions
The term “cell internalizing agent” refers to a binding moiety that can internalize into a cell. In certain embodiments, a cell internalizing agent specifically binds to an extracellular target molecule (e.g., an extracellular protein, lipid or glycan) displayed on a cell surface and internalizes into the cell. In another embodiment, a cell internalizing agent is a molecule that is displayed on the surface of a cell and internalizes into the cell. An example of a cell internalizing agent is a protein (e.g., an antigen binding moiety (e.g., an antibody or an antibody fragment), a ligand, or a cell penetrating peptide (CPP)) that can associate with a site-directed modifying polypeptide and enable the site-directed modifying polypeptide (e.g., Cas9) to internalize into a target cell.
The term “test agent” refers to a molecule which is being assayed for a given characteristic. In particular, in the context of certain embodiments described herein, a test agent represents a molecule that is being tested for its ability to both bind to a molecule (e.g., protein receptor, lipid, glycoprotein, glycolipid, carbohydrate, or others) on the surface of a cell and internalize into the cell. In certain other embodiments, a test agent represents a molecule that is displayed on the surface of a cell and is being tested for its ability to internalize into the cell. In certain embodiments, a test agent is assayed for its ability to internalize an endonuclease where internalization is determined according to the activity of the endonuclease that the test agent helps to internalize. In some embodiments, a test agent associates with or is conjugated to a site-directed modifying polypeptide (e.g., Cas9) or a nucleoprotein, and facilitates the internalization of the site-directed modifying polypeptide or nucleoprotein into the intracellular space of a target cell. In one embodiment, the cell internalization activity of a test agent may be determined by detecting a target cell having a particular phenotype characterized by the activity of a reporter construct comprising a nucleic acid that provides a readout, e.g., repression or activation, to determine if gene editing has occurred in the target cell. Examples of test agents include, but are not limited to, a protein (e.g., an antibody or antibody fragment thereof, a ligand or portion thereof, or a CPP), a lipid, a glycoprotein, a glycolipid, and a carbohydrate.
The term “reporter construct” refers to a nucleic acid that is within a cell (e.g., in the nucleus of the cell) that can be activated or repressed by, e.g., a site-directed modifying polypeptide or nucleoprotein, where the repression or activation results in a phenotype that allows for selection of the cell. In one embodiment, activation or repression of the reporter construct is achieved via gene editing. The phenotype resulting from the activation or repression may be, for example, a selection (e.g., antibiotic resistance, fluorescence, etc.) imparting a characteristic that distinguishes the cell from other cells in a population.
The term “target cell” refers to a cell to which an agent binds or is associated with, e.g., a cell to which a cell internalizing agent or a test agent can bind or is associated with. In one embodiment, a target cell may, for example, express an extracellular molecule (i.e., an extracellular molecule embedded in or tethered to the cell membrane) which is bound by the cell internalizing agent (e.g., an antibody, a ligand, a CPP, a protein, etc.). A target cell may be a healthy cell or may be a cell associated with a disease. In certain of the screening methods described herein, a target cell is a cell on which the screening assays are focused. In an alternative, a target cell is a cell in vivo to which the cell internalizing agent binds.
The term “modified target cell”, as used herein, refers to a cell targeted by a cell internalizing agent wherein a reporter construct containing a nucleic acid sequence within the cell targeted is altered or modified by genome editing activity (e.g., genome editing activities of a nucleoprotein as described and disclosed herein). In certain embodiments, a modified target cell has a phenotype associated with the reporter construct.
The term “protein-variant-producing cell” refers to a cell that expresses at least one (but preferably a plurality of) protein variants. An example of a protein variant is an antigen binding moiety, such as, but not limited to, an antibody or antigen binding fragment thereof.
As used herein, a “site-directed modifying polypeptide” (i.e., a gene editing polypeptide, such as Cas9) refers to a nuclease that is targeted to a specific nucleic acid sequence or set of similar sequences of a polynucleotide chain via recognition of the particular sequence(s) by the modifying polypeptide itself or an associated molecule (e.g., RNA), wherein the polypeptide can modify the polynucleotide chain. An example of a site-directed modifying polypeptide is an RNA-guided endonuclease, such as Cas9.
The term “conjugation moiety” as used herein refers to a molecule that is capable of associating with at least one other molecule. The association may be covalent or non-covalent. For example, a first and a second conjugation moiety may be used to conjugate two proteins, such as a cell internalizing agent and a site-directed modifying polypeptide. The term "conjugation," as used herein, refers to the physical or chemical complex formed between a molecule (e.g., a cell internalizing agent) and a second molecule (e.g., a site-directed modifying polypeptide). In one embodiment, conjugation is achieved via a physical association or non-covalent complexation.
The term “specifically hybridizes” refers to hybridization of two nucleic acids under high stringency conditions or greater. Stringency is measured by hybridization and washing conditions. Such conditions are known in the art, e.g., Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6. Nucleic acids that specifically hybridize to one another are generally complements of one another and can be the same length or within 30, 20, or 10% of the length of the reference nucleic acid.
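The sequence-level part of this definition (complementarity and a length within 30, 20, or 10% of the reference) can be illustrated with a short Python sketch. The function names and example sequences are hypothetical, and the sketch does not model hybridization or washing stringency, which are determined experimentally.

```python
# Map each base to its DNA complement; U is treated like T so that an
# RNA query (e.g., a gRNA) can be compared against a DNA reference.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq):
    """Return the DNA reverse complement of a DNA or RNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def length_within(query, reference, tolerance):
    """True if the query length differs from the reference length by no
    more than the given fraction (e.g., 0.10 for 10%) of the reference."""
    return abs(len(query) - len(reference)) <= tolerance * len(reference)


# Hypothetical sequences chosen only to exercise the two checks.
print(reverse_complement("ATGCA"))
print(length_within("ACGTACGTA", "ACGTACGTAC", 0.10))
print(length_within("ACGTACG", "ACGTACGTAC", 0.20))
```

Under these assumptions, a nine-nucleotide query falls within 10% of a ten-nucleotide reference, whereas a seven-nucleotide query falls outside even a 20% tolerance.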
The terms “polypeptide” or “protein”, as used interchangeably herein, refer to any polymeric chain of amino acids. The terms encompass native or artificial proteins, protein fragments and polypeptide analogs of a protein sequence.
As used herein, the term "ligand" refers to a molecule that is capable of specifically binding to another molecule on or in a cell, such as one or more cell surface receptors, and includes molecules such as proteins, hormones, neurotransmitters, cytokines, growth factors, cell adhesion molecules, or nutrients. A site-directed modifying polypeptide can be associated with one or more ligands through covalent or non-covalent linkage. Examples of ligands useful herein, or targets bound by ligands, and further description of ligands in general, are disclosed in Bryant & Stow (2005). Traffic, 6(10), 947-953; Olsnes et al. (2003). Physiological Reviews, 83(1), 163-182; and Planque, N. (2006). Cell Communication and Signaling, 4(1), 7, which are incorporated herein by reference.
As used herein, the term “specifically binds” refers to the ability of an antigen binding moiety to recognize and bind an antigen present in a sample, where the antigen binding moiety does not substantially recognize or bind other molecules in the sample. In one embodiment, an antigen binding moiety that specifically binds to an antigen binds to the antigen with a Kd of at least about 1×10⁻⁴ M, 1×10⁻⁵ M, 1×10⁻⁶ M, 1×10⁻⁷ M, 1×10⁻⁸ M, 1×10⁻⁹ M, 1×10⁻¹⁰ M, 1×10⁻¹¹ M, 1×10⁻¹² M, or more as determined by surface plasmon resonance or other approaches known in the art (e.g., filter binding assay, fluorescence polarization, isothermal titration calorimetry), including those described further herein. In one embodiment, an antigen binding moiety specifically binds to an antigen if the antigen binding moiety binds to an antigen with an affinity that is at least two-fold greater as determined by surface plasmon resonance than its affinity for a nonspecific antigen.
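The two-fold affinity criterion can be expressed as a simple calculation. Because affinity is inversely related to the dissociation constant (the association constant is 1/Kd), the fold difference in affinity equals the ratio of the nonspecific Kd to the target Kd. The following Python sketch uses hypothetical function names and Kd values purely for illustration.

```python
def fold_affinity(kd_target_m, kd_nonspecific_m):
    """Fold difference in affinity for the target over a nonspecific
    antigen, computed as Kd(nonspecific) / Kd(target), both in molar."""
    return kd_nonspecific_m / kd_target_m


def is_specific(kd_target_m, kd_nonspecific_m, min_fold=2.0):
    """True if affinity for the target is at least min_fold greater than
    affinity for the nonspecific antigen (two-fold by default)."""
    return fold_affinity(kd_target_m, kd_nonspecific_m) >= min_fold


# Hypothetical values: 1 nM for the target versus 1 uM nonspecifically.
print(is_specific(1e-9, 1e-6))
# Hypothetical values: only a 1.5-fold difference in affinity.
print(is_specific(1e-6, 1.5e-6))
```

In this sketch the first pair of values satisfies the two-fold criterion by a wide margin, while the second does not.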
The term "cell-penetrating peptide" (CPP) refers to a peptide, generally of about 5-60 amino acid residues in length, that can facilitate cellular uptake of a conjugated molecule, particularly one or more site-specific modifying polypeptides. A CPP can also be characterized in certain embodiments as being able to facilitate the movement or traversal of a molecular conjugate across/through one or more of a lipid bilayer, micelle, cell membrane, organelle membrane (e.g., nuclear membrane), vesicle membrane, or cell wall. A CPP herein can be cationic, amphipathic, or hydrophobic in certain embodiments. Examples of CPPs useful herein, and further description of CPPs in general, are disclosed in Borrelli, Antonella, et al. Molecules 23.2 (2018): 295; Milletti, Francesca. Drug discovery today 17.15-16 (2012): 850-860, which are incorporated herein by reference. Further, there exists a database of experimentally validated CPPs (CPPsite, Gautam et al., 2012). The CPP can be any known CPP, such as a CPP shown in the CPPsite database.
The term “antigen binding moiety” as used herein refers to a molecule that binds to an antigen, such as an extracellular cell-membrane bound protein (e.g., a cell surface receptor). Examples of an antigen binding moiety include, but are not limited to, a protein, an antibody, antigen-binding fragments of an antibody, and an antibody mimetic.
The term "antibody" is used herein in the broadest sense and encompasses various antibody structures, including but not limited to monoclonal antibodies, polyclonal antibodies, multispecific antibodies (e.g., bispecific antibodies), nanobodies, monobodies, and antibody fragments so long as they exhibit the desired antigen-binding activity.
The term "antibody" includes an immunoglobulin molecule comprising four polypeptide chains, two heavy (H) chains and two light (L) chains inter-connected by disulfide bonds, as well as multimers thereof (e.g., IgM). Each heavy chain (HC) comprises a heavy chain variable region (or domain) (abbreviated herein as HCVR or VH) and a heavy chain constant region (or domain). The heavy chain constant region comprises three domains, CH1, CH2 and CH3. Each light chain (LC) comprises a light chain variable region (abbreviated herein as LCVR or VL) and a light chain constant region. The light chain constant region comprises one domain (CL1). Immunoglobulin molecules can be of any type (e.g., IgG, IgE, IgM, IgD, IgA and IgY), class (e.g., IgG1, IgG2, IgG3, IgG4, IgA1 and IgA2) or subclass. The VH and VL regions can be further subdivided into regions of hypervariability, termed complementarity determining regions (CDRs), interspersed with regions that are more conserved, termed framework regions (FR). Each VH and VL is composed of three CDRs and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.
As used herein, the term “CDR” or “complementarity determining region” refers to the noncontiguous antigen combining sites found within the variable region of both heavy and light chain polypeptides. These particular regions have been described by Kabat et al., J. Biol. Chem. 252, 6609-6616 (1977) and Kabat et al., Sequences of protein of immunological interest. (1991), and by Chothia et al., J. Mol. Biol. 196:901-917 (1987) and by MacCallum et al., J. Mol. Biol. 262:732-745 (1996) where the definitions include overlapping or subsets of amino acid residues when compared against each other. The amino acid residues which encompass the CDRs as defined by each of the above cited references are set forth for comparison. Preferably, the term “CDR” is a CDR as defined by Kabat, based on sequence comparisons.
An “intact” or a “full length” antibody, as used herein, refers to an antibody comprising four polypeptide chains, two heavy (H) chains and two light (L) chains. In one embodiment, an intact antibody is an intact IgG antibody.
The term "monoclonal antibody" as used herein refers to an antibody obtained from a population of substantially homogeneous antibodies, i.e., the individual antibodies comprising the population are identical and/or bind the same epitope, except for possible variant antibodies, e.g., containing naturally occurring mutations or arising during production of a monoclonal antibody preparation, such variants generally being present in minor amounts. In contrast to polyclonal antibody preparations, which typically include different antibodies directed against different determinants (epitopes), each monoclonal antibody of a monoclonal antibody preparation is directed against a single determinant on an antigen. Thus, the modifier "monoclonal" indicates the character of the antibody as being obtained from a substantially homogeneous population of antibodies and is not to be construed as requiring production of the antibody by any particular method. For example, the monoclonal antibodies to be used in accordance with the present invention may be made by a variety of techniques, including but not limited to the hybridoma method, recombinant DNA methods, phage-display methods, and methods utilizing transgenic animals containing all or part of the human immunoglobulin loci, such methods and other exemplary methods for making monoclonal antibodies being described herein.
The term “human antibody”, as used herein, refers to an antibody having variable regions in which both the framework and CDR regions are derived from human germline immunoglobulin sequences. Furthermore, if the antibody contains a constant region, the constant region also is derived from human germline immunoglobulin sequences. The human antibodies of the invention may include amino acid residues not encoded by human germline immunoglobulin sequences (e.g., mutations introduced by random or site-specific mutagenesis in vitro or by somatic mutation in vivo). However, the term “human antibody”, as used herein, is not intended to include antibodies in which CDR sequences derived from the germline of another mammalian species, such as a mouse, have been grafted onto human framework sequences.
The term “humanized antibody” is intended to refer to antibodies in which CDR sequences derived from the germline of one mammalian species, such as a mouse, have been grafted onto human framework sequences. Additional framework region modifications may be made within the human framework sequences. A "humanized form" of an antibody, e.g., a non-human antibody, refers to an antibody that has undergone humanization.
The term “chimeric antibody” is intended to refer to antibodies in which the variable region sequences are derived from one species and the constant region sequences are derived from another species, such as an antibody in which the variable region sequences are derived from a mouse antibody and the constant region sequences are derived from a human antibody.
An "antibody fragment", “antigen-binding fragment” or “antigen-binding portion” of an antibody refers to a molecule other than an intact antibody that comprises a portion of an intact antibody and that binds the antigen to which the intact antibody binds. Examples of antibody fragments include, but are not limited to, Fv, Fab, Fab', Fab'-SH, F(ab')2; diabodies; linear antibodies; single-chain antibody molecules (e.g. scFv); and multispecific antibodies formed from antibody fragments.
A "multispecific antigen binding polypeptide" or "multispecific antibody" is one that targets more than one antigen or epitope. A "bispecific," "dual-specific" or "bifunctional" antigen binding polypeptide or antibody is a hybrid antigen binding polypeptide or antibody, respectively, having two different antigen binding sites. Bispecific antigen binding polypeptides and antibodies are examples of a multispecific antigen binding polypeptide or a multispecific antibody and may be produced by a variety of methods including, but not limited to, fusion of hybridomas or linking of Fab' fragments. See, e.g., Songsivilai and Lachmann, 1990, Clin. Exp. Immunol. 79:315-321; Kostelny et al., 1992, J. Immunol. 148:1547-1553; Brinkmann and Kontermann, 2017, MABS 9(2):182-212. The two binding sites of a bispecific antigen binding polypeptide or antibody, for example, will bind to two different epitopes, which may reside on the same or different protein targets.
The term “antibody mimetic” or “antibody mimic” refers to a molecule that is not structurally related to an antibody but is capable of specifically binding to an antigen. Examples of antibody mimetics include, but are not limited to, an adnectin (i.e., a fibronectin-based binding molecule), an affilin, an affimer, an affitin, an alphabody, an affibody, a DARPin, an anticalin, an avimer, a fynomer, a Kunitz domain peptide, a monobody, a nanoCLAMP, a nanobody, a unibody, a versabody, an aptamer, a cyclotide, and a peptidic molecule, all of which employ binding structures that, while they mimic traditional antibody binding, are generated from and function via distinct mechanisms.
Amino acid sequences described herein may include “conservative mutations,” including the substitution, deletion or addition of nucleic acids that alter, add or delete a single amino acid or a small number of amino acids in a coding sequence where the nucleic acid alterations result in the substitution of a chemically similar amino acid. A conservative amino acid substitution refers to the replacement of a first amino acid by a second amino acid that has chemical and/or physical properties (e.g., charge, structure, polarity, hydrophobicity/hydrophilicity) that are similar to those of the first amino acid. Conservative substitutions include replacement of one amino acid by another within the following groups: lysine (K), arginine (R) and histidine (H); aspartate (D) and glutamate (E); asparagine (N) and glutamine (Q); N, Q, serine (S), threonine (T), and tyrosine (Y); K, R, H, D, and E; D, E, N, and Q; alanine (A), valine (V), leucine (L), isoleucine (I), proline (P), phenylalanine (F), tryptophan (W), methionine (M), cysteine (C), and glycine (G); F, W, and Y; H, F, W, and Y; C, S and T; C and A; S and T; C and S; S, T, and Y; V, I, and L; V, I, and T. Other conservative amino acid substitutions are also recognized as valid, depending on the context of the amino acid in question. For example, in some cases, methionine (M) can substitute for lysine (K). In addition, sequences that differ by conservative variations are generally homologous.
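By way of non-limiting illustration, the grouping of conservative substitutions listed above can be expressed as a simple lookup. The following Python sketch is an assumption made for explanatory purposes only: the names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical, and the sketch does not capture context-dependent substitutions (such as methionine for lysine) that may also be recognized as conservative.

```python
# Illustrative sketch only. The groups are transcribed from the list in the
# preceding paragraph, using one-letter amino acid codes.
CONSERVATIVE_GROUPS = [
    "KRH", "DE", "NQ", "NQSTY", "KRHDE", "DENQ",
    "AVLIPFWMCG", "FWY", "HFWY", "CST", "CA", "ST", "CS", "STY", "VIL", "VIT",
]


def is_conservative(first: str, second: str) -> bool:
    """Return True if two different residues share at least one listed group."""
    a, b = first.upper(), second.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


if __name__ == "__main__":
    for pair in [("K", "R"), ("V", "T"), ("K", "W")]:
        print(pair, is_conservative(*pair))
```

A substitution that falls outside every listed group is reported as non-conservative by this sketch, even though the text above notes that other substitutions may be accepted depending on context.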
The term "isolated" refers to a compound, which can be, e.g. a nucleoprotein, protein, or nucleic acid, that is substantially free of other cellular material.
Additional definitions are described in the sections below.
Various aspects of the invention are described in further detail in the following subsections.
II. METHODS FOR IDENTIFYING CELL TARGETING AGENTS
Provided herein are methods for identifying cell internalizing agents that, when associated with a site-directed modifying polypeptide or nucleoprotein, enable a site-directed modifying polypeptide (e.g., Cas9) or nucleoprotein (e.g., Cas9 and guide RNA) to be targeted to the surface of a cell of interest (e.g., a diseased cell expressing an antigen on its surface associated with the disease) and subsequently internalized by the cell of interest such that gene editing can occur in the cell.
The cell internalizing agent specifically binds to an extracellular target molecule (e.g., an extracellular protein, lipid, carbohydrate, glycan, etc.) displayed on (e.g., embedded in or tethered to) a cell membrane. A cell internalizing agent is identified from a population of test agents using the methods disclosed herein.
Thus, described herein are methods for identifying cell internalizing agents that have both the ability to target a cell of interest and internalize such that a site-directed modifying polypeptide or nucleoprotein is delivered to the cell for gene editing.
Test agents
Various types of test agents can be used in the methods disclosed herein in order to identify cell internalizing agents. Examples include, but are not limited to a protein (e.g., a glycoprotein), a lipid (e.g., a glycolipid), or a carbohydrate. Test agents identified using the present invention can be natural, recombinant, or synthetic. In one embodiment, the test agent is a peptide. In another embodiment, the test agent is an antigen-binding moiety. In other embodiments, the test agent is an antibody or an antibody fragment thereof. In yet another embodiment, the test agent is a cell-penetrating peptide (CPP). In another embodiment, the test agent is an antibody mimetic. In another embodiment, the test agent is a lipid. In some embodiments, the test agent is a glycoprotein. In other embodiments, the test agent is a glycolipid. In yet another embodiment, the test agent is a carbohydrate.
In certain embodiments, the test agent is expressed on the surface of a cell membrane. In certain embodiments, the test agent is tethered to the surface of a cell membrane. In other embodiments, the test agent is secreted from a cell. In yet another embodiment, the test agent binds to or associates with the surface of a cell membrane.
In one embodiment, antigen-binding moieties are screened as test agents using the methods disclosed herein to identify a cell internalizing agent. An example of an antigen-binding moiety is an antibody (e.g., an intact antibody) or an antibody fragment thereof. In certain embodiments, an antibody or fragment thereof is humanized or human. In one embodiment, the antibody is a monoclonal antibody. Antibodies or antigen-binding fragments thereof that can be screened as test agents can be in various forms known in the art, e.g., full-length antibodies, bispecific antibodies, dual variable domain antibodies, multiple chain or single chain antibodies, and/or binding fragments that specifically bind an extracellular molecule, including but not limited to Fab, Fab', F(ab')2, Fv, scFv (single chain Fv), surrobodies (including surrogate light chain construct), single domain antibodies, camelized antibodies and the like. They also can be of, or derived from, any isotype, including, for example, IgA (e.g., IgA1 or IgA2), IgD, IgE, IgG (e.g., IgG1, IgG2, IgG3 or IgG4), or IgM. In some embodiments, the anti-CD117 antibody is an IgG (e.g., IgG1, IgG2, IgG3 or IgG4).
Additional examples of antigen-binding moieties that can be screened as test agents include, but are not limited to, a nanobody, a domain antibody, an scFv, a Fab, a diabody, a BiTE, a DART, a minibody, a F(ab’)2, an intrabody, or an antibody mimetic, e.g., a fibronectin-based binding molecule, an affilin, an affimer, an affitin, an alphabody, an affibody, a DARPin, an anticalin, an avimer, a fynomer, a Kunitz domain peptide, a monobody, a nanoCLAMP, a unibody, a versabody, an aptamer, or a cyclotide.
Test agents may be directed to an antigen of interest, including, but not limited to, HLA-DR, CD44, CD22, CD3, CD20, CD33, CD32, CD47, CD59, CD54, CD25, AchR, CD70, CD74, CTLA4, EGFR, HER2, or EpCam. Other exemplary targets include: (i) tumor-associated antigens; (ii) cell surface receptors; (iii) CD proteins and their ligands, such as CD3, CD4, CD8, CD19, CD20, CD22, CD25, CD32, CD33, CD34, CD40, CD44, CD47, CD54, CD59, CD70, CD74, CD79α (CD79a), and CD79β (CD79b); (iv) members of the ErbB receptor family such as the EGF receptor, HER2, HER3 or HER4 receptor; (v) cell adhesion molecules such as LFA-1, Mac1, p150,95, VLA-4, ICAM-1, VCAM and αv/β3 integrin including either alpha or beta subunits thereof (e.g., anti-CD11a, anti-CD18 or anti-CD11b antibodies); and (vi) growth factors such as VEGF; IgE; blood group antigens; flk2/flt3 receptor; obesity (OB) receptor; mpl receptor; CTLA4; protein C, BR3, c-met, tissue factor, β7 etc. Other examples of antigens that can be targeted by the antibody, or an antigen-binding fragment thereof, include cell surface receptors such as those described in Chen and Flies. Nature Reviews Immunology. 13.4 (2013): 227, which is incorporated herein by reference.
Exemplary antibodies (or antigen-binding fragments thereof) that may be identified by screening test agents using the methods disclosed include those selected from, and without limitation, an anti-HLA-DR antibody, an anti-CD3 antibody, an anti-CD20 antibody, an anti-CD22 antibody, an anti-CD25 antibody, an anti-CD32 antibody, an anti-CD33 antibody, an anti-CD44 antibody, an anti-CD47 antibody, an anti-CD54 antibody, an anti-CD59 antibody, an anti-CD70 antibody, an anti-CD74 antibody, an anti-AchR antibody, an anti-CTLA4 antibody, an anti-CXCR4 antibody, an anti-EGFR antibody, an anti-Her2 antibody, an anti-EpCam antibody, an anti-PD-1 antibody, or an anti-FAP1 antibody.
Another example of a cell internalizing agent that can be identified using the methods disclosed herein is a cell-penetrating peptide (CPP). A CPP which is a cell internalizing agent induces the absorption of a linked protein or peptide through the plasma membrane of a cell. Generally, CPPs induce entry into the cell because of their general shape and tendency to either self-assemble into a membrane-spanning pore, or to have several positively charged residues, which interact with the negatively charged phospholipid outer membrane inducing curvature of the membrane, which in turn activates internalization. Exemplary permeable peptides include, but are not limited to, transportan, PEP1, MPG, p-VEC, MAP, CADY, polyR, HIV-TAT, HIV-REV, Penetratin, R6W3, P22N, DPV3, DPV6, K-FGF, and C105Y, and are reviewed in van den Berg and Dowdy (2011) Current Opinion in Biotechnology 22:888-893 and Farkhani et al. (2014) Peptides 57:78-94, each of which is herein incorporated by reference in its entirety.
In certain embodiments, a cell internalizing agent identified using the methods disclosed herein is a ligand, or binding fragment thereof. Cell internalizing agents identified herein may be a ligand that binds to another molecule on or in a cell, including one or more cell surface receptors.
In certain embodiments, test agents are aptamers and are screened to identify a cell internalizing agent. An “aptamer” used in the compositions and methods disclosed herein includes aptamer molecules made from either peptides or nucleotides. Peptide aptamers share many properties with nucleotide aptamers (e.g., small size and ability to bind target molecules with high affinity) and they may be generated by selection methods that have similar principles to those used to generate nucleotide aptamers, for example Baines and Colas. 2006. Drug Discov Today. 11(7-8):334-41; and Bickle et al. 2006. Nat Protoc. 1(3):1066-91, which are incorporated herein by reference. Thus, an aptamer is a small nucleotide polymer that binds to specific molecular targets. Aptamers may be single or double stranded nucleic acid molecules (DNA or RNA), although DNA based aptamers are most commonly double stranded. There is no defined length for an aptamer nucleic acid; however, aptamer molecules are most commonly between 15 and 40 nucleotides long.
Aptamers often form complex three-dimensional structures which determine their affinity for target molecules. Aptamers can offer many advantages over simple antibodies, primarily because they can be engineered and amplified almost entirely in vitro. Furthermore, aptamers often induce little or no immune response.
Aptamers may be generated for testing using a variety of techniques, but were originally developed using in vitro selection (Ellington and Szostak. (1990) Nature. 346(6287):818-22) and the SELEX method (systematic evolution of ligands by exponential enrichment) (Schneider et al. 1992. J Mol Biol. 228(3):862-9), the contents of which are incorporated herein by reference. Other methods to make and uses of aptamers have been published including Klussmann. The Aptamer Handbook: Functional Oligonucleotides and Their Applications. ISBN: 978-3-527-31059-3; Ulrich et al. 2006. Comb Chem High Throughput Screen 9(8):619-32; Cerchia and de Franciscis. 2007. Methods Mol Biol. 361:187-200; Ireson and Kelland. 2006. Mol Cancer Ther. 5(12):2957-62; U.S. Pat. Nos. 5,582,981; 5,840,867; 5,756,291; 6,261,783; 6,458,559; 5,792,613; 6,111,095; and U.S. patent application Ser. Nos. 11/482,671; 11/102,428; 11/291,610; and 10/627,543, which are all incorporated herein by reference.
In one embodiment, the methods disclosed herein are used to identify a cell internalizing agent that binds to a cell surface protein associated with a disorder. Examples of such cell surface proteins include, but are not limited to, a tyrosine kinase, an epidermal growth factor receptor (EGFR), a platelet-derived growth factor receptor (PDGFR), a fibroblast growth factor receptor (FGFR), a hepatocyte growth factor receptor (HGFR), a nerve growth factor receptor (NGFR), CD3, CD4, Tim-3, CD278, TNFR-I, IL-1R, LT-betaR, IL-18R, CCR1, CD26, CD94, CD119, CD183, CD195, or DPIV. Other examples of such cell surface proteins include, but are not limited to, HLA-DR, CD44, CD22, CD3, CD20, CD33, CD32, CD47, CD59, CD54, CD25, AchR, CD70, CD74, CTLA4, EGFR, HER2, or EpCam.
In other embodiments, a test agent is a protein or peptide found in a protein or peptide database (for example, SWISS-PROT, TrEMBL, SBASE, PFAM, CPPsite, or others known in the art), or a fragment or variant thereof. A test agent may be a protein or peptide that may be derived (for example, by transcription and/or translation) from a nucleic acid sequence known in the art, such as a nucleic acid sequence found in a nucleic acid database (for example, GenBank, TIGR, CPPsite, or others known in the art), or a fragment or variant thereof.
The methods disclosed herein are useful for screening a library of cells having a plurality of genotypes for a cell having a phenotype of interest, such as a cell producing a protein or other molecule having a phenotype of interest. In general, the method is available for screening all cell types, e.g., mammalian, yeast, fungal, bacterial, and insect, that are able to survive and/or multiply in the array. Phenotypes of interest can include any biological process that renders a detectable result, including but not limited to production, secretion and/or display of polypeptides and nucleic acids. Libraries of cells having a plurality of genotypes associated with detectable phenotypes can be generated by methods involving error prone PCR, random activation of gene expression, phage display, overhang-based DNA block shuffling, random mutagenesis, in vitro DNA shuffling, site-specific recombination, and other methods generally known to those of skill in the art.
In one aspect, provided herein are methods and compositions for generating a library of test agents. In some embodiments, the method may involve generating a library of test agents that are primarily expressed on the surface of a cell (e.g., an Expi293 cell (Innovative Targeting Solutions, Inc.)). Examples of systems capable of producing a library of test agents are known in the art, including, but not limited to, random mutagenesis, error prone PCR, rolling circle error-prone PCR, random activation of gene expression, overhang-based DNA block shuffling, in vitro DNA shuffling, site-specific recombination, chemical mutagenesis, mutagenesis by random insertion and deletion, transposon-based random mutagenesis, employing mutator strains, mega primed and ligase free focused mutagenesis, phage display, yeast display, cassette mutagenesis, directed evolution and other methods generally known to those of skill in the art. For example, the test agent can be selected from a library of randomly mutated proteins. Accordingly, the method can include mutagenizing a test agent (e.g., through random mutagenesis) and preparing a library of mutagenized proteins. The mutagenized test agent can then be assessed as a cell internalizing agent, as described herein.
In another embodiment, the method for generating a library of test agents may involve incorporating chemically synthetic variant pools into a plasmid or virus for subsequent extra-genomic or genomic incorporation.
In other embodiments, a test agent is an antibody or bispecific antibody found in the xEmplar library repertoire (xCella BioSciences).
In some embodiments, the system and method for generating a library of test agents include an in vitro system for generating large repertoires of protein structural diversity de novo (e.g., antibodies and/or polypeptides) by harnessing V(D)J recombination biochemistry. For example, in some embodiments, the system is an in vitro system for generating antibody diversity constructed using appropriately selected nucleic acid molecules that comprise immunoglobulin V, D, J and C region encoding polynucleotide sequences and recombination signal sequences (RSS). In another embodiment, the system allows for the generation of greater immunoglobulin structural diversity in vitro through selection of appropriate relative representation of the immunoglobulin gene elements to generate a highly diverse repertoire. In certain embodiments, such enhanced structural diversity is obtained when the ratio of VH region genes to D segment genes is about 1:1 to 1:2 and the ratio of JH segment genes to D segment genes is about 1:1 to 1:2, or when the ratio of VH region genes to JH segment genes is about 1:2 (V to J) to 2:1 (V to J), or when the combined number of VH region genes together with JH segment genes is not greater than the number of D segment genes when there is a plurality of D gene segments, or when 6, 7, 8, 9, 10, 11 or 12 D segment genes are present. A parameter that is described as being "about" a certain quantitative value typically may have a value that varies (i.e., may be greater than or less than) from the stated value by no more than 50%, and in preferred embodiments by no more than 40%, 30%, 25%, 20%, 15%, 10% or 5%.
In one embodiment, a nucleic acid composition for generating immunoglobulin structural diversity may be assembled from certain immunoglobulin gene elements, including naturally occurring and artificial sequences, using genetic engineering methodologies and molecular biology techniques with which those skilled in the art will be familiar. Useful immunoglobulin genetic elements for producing the compositions described herein include mammalian Ig heavy chain variable (VH) and light chain variable (VL) region genes, natural or artificial Ig diversity (D) segment genes, Ig heavy chain joining (JH) and light chain joining (JL) segment genes, and Ig locus recombination signal sequences (RSSs). Immunoglobulin variable (V) region genes are known in the art and include in their polypeptide-encoding sequences at least the polynucleotide coding sequence for one antibody complementarity determining region (CDR), for example, a first or a second CDR known as CDR1 or CDR2 according to conventional nomenclature with which those skilled in the art will be familiar, preferably coding sequence for two CDRs, for example, CDR1 and CDR2, and more preferably coding sequence for CDR1 and CDR2 and at least a portion (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13 or more amino acids) of CDR3, where it will be appreciated that typically one or more amino acids of CDR3 may be encoded at least in part by at least one nucleotide that is present in a D segment gene and/or in a J segment gene. (See, e.g., Lefranc M.-P., 1997 Immunology Today 18:509; Lefranc, 1999 The Immunologist, 7:132-13; Lefranc et al., 2003 Dev. Comp. Immunol. 27:55-77; Ruiz et al., 2002 Immunogenetics 53:857-883; Kaas et al., 2007 Current Bioinformatics 2:21-30; Kaas et al., 2004 Nucl. Acids. Res., 32:D208-D210.)
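As a non-limiting illustration of how the term "about" and the gene-segment ratios above may be evaluated numerically, the following Python sketch is offered. The function names `is_about` and `ratio_in_range`, and the choice to widen each end of a ratio range by the stated tolerance, are assumptions introduced here for explanation; they are not taken from the disclosure.

```python
def is_about(value: float, stated: float, tolerance: float = 0.5) -> bool:
    """True if `value` differs from `stated` by no more than `tolerance`.

    `tolerance` is a fraction of the stated value: 0.5 corresponds to the
    50% bound given above, and 0.05 to the narrowest preferred bound of 5%.
    """
    return abs(value - stated) <= tolerance * abs(stated)


def ratio_in_range(numerator_count: int, denominator_count: int,
                   low: float = 0.5, high: float = 1.0,
                   tolerance: float = 0.0) -> bool:
    """Check a gene-count ratio against a range such as 1:2 (0.5) to 1:1 (1.0).

    Assumption: "about" is applied by widening each end of the range by the
    given fractional tolerance.
    """
    ratio = numerator_count / denominator_count
    return low * (1 - tolerance) <= ratio <= high * (1 + tolerance)


if __name__ == "__main__":
    # Hypothetical counts: 6 VH region genes and 12 D segment genes (1:2).
    print(ratio_in_range(6, 12))
    print(is_about(15, 10))  # within 50% of the stated value
```

The default range corresponds to the VH-to-D ratio of 1:1 to 1:2; other ratios recited above can be checked by passing different `low` and `high` values.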
In yet another embodiment, libraries containing test agents having sequence diversity within a known protein sequence (e.g., a ligand), are made by targeted introduction of two or more recombination signal sequences (RSSs) into the protein coding sequence and subsequent introduction of the modified protein coding sequence into a recombination-competent host cell, specifically a host cell that is capable of expressing at least RAG-1 and RAG-2, resulting in the generation and expression of variants of the protein. Such mutations provide for the generation of a very large number of protein variants such that, in certain embodiments, mutations imparting the desired functionality to the protein can be identified, e.g., in a single round.
The methods set forth herein generally comprise the steps of introducing a pair of RSSs at a selected location within the coding sequence for a test agent, and introducing the modified coding sequence into a recombination-competent host cell to allow for recombination and expression of variants of the test agent. Accordingly, in certain embodiments, the method of generating variants of a test agent comprises the steps of: providing a polynucleotide comprising a nucleic acid sequence encoding a test agent and comprising a complementary pair of RSSs, introducing the polynucleotide into a recombination-competent host cell, the host cell capable of expressing at least RAG-1 and RAG-2, and culturing the host cell in vitro under conditions allowing recombination and expression of the polynucleotide, thereby generating variants of the test agent. In certain embodiments, the methods further comprise screening the variant test agents for variants having defined functional characteristics.
In certain embodiments, the methods are applied to a test agent that is an antigen-binding moiety. In some embodiments, the methods are applied to an antigen-binding moiety in order to introduce sequence diversity into a loop region involved in antigen binding and comprise the steps of: providing a polynucleotide comprising a nucleic acid sequence encoding a test antigen-binding moiety, the nucleic acid sequence comprising a complementary pair of RSSs in a region of the sequence encoding an antigen-binding loop of the protein, introducing the polynucleotide into a recombination-competent host cell, and culturing the host cell under conditions allowing recombination and expression of the polynucleotide, thereby generating variants of the test antigen-binding moiety.
The host cell may constitutively express RAG-1 and RAG-2, and optionally TdT, or one or more of these proteins may be under inducible control. In certain embodiments, expression of one or more of RAG-1 and RAG-2, and optionally TdT, in the host cell is under inducible control allowing, for example, for expansion of the host cell prior to the induction of sequence diversity generation. Accordingly, in some embodiments, the method comprises the steps of: providing a polynucleotide comprising a nucleic acid sequence encoding a test agent and comprising a pair of RSSs, introducing the polynucleotide into a recombination-competent host cell, the host cell capable of expressing at least RAG-1 and RAG-2 and optionally TdT, wherein expression of one or more of RAG-1, RAG-2 and TdT is under inducible control, culturing the host cell under conditions allowing expansion of the host cell, inducing expression of one or more of RAG-1, RAG-2 and TdT, and culturing the expanded host cells under conditions allowing recombination and expression of the polynucleotide, thereby generating variants of the test agent.
The polynucleotide may be introduced into the host cell on a suitable vector and may be, for example, stably integrated into the genome of the cell, stably maintained exogenously to the genome or transiently expressed.
In certain embodiments, the polynucleotide may comprise additional pairs of RSSs allowing for generation of additional sequence diversity in the protein. In some embodiments, the polynucleotide comprises two complementary pairs of RSSs, each pair positioned to introduce sequence diversity into a different region of the test agent.
The recombination signal sequence (RSS) as set forth herein preferably consists of two conserved sequences (for example, heptamer, 5'-CACAGTG-3', and nonamer, 5'-ACAAAAACC-3'), separated by a spacer of either 12 +/- 1 bp (a "12-signal" RSS) or 23 +/- 1 bp (a "23-signal" RSS). Within the host cell, two RSSs (one 12-signal RSS and one 23-signal RSS) are selected and rearranged under the "12/23 rule." Recombination does not occur between two RSS signals with the same size spacer. As would be appreciated by one of skill in the art, the orientation of the RSS determines if recombination results in a deletion or inversion of the intervening sequence.
As a result of extensive investigations of RSS processes, it is known in the art which nucleotide positions within RSSs cannot be varied without compromising RSS functional activity in genetic recombination mechanisms, which nucleotide positions within RSSs can be varied to alter (for example, increase or decrease in a statistically significant manner) the efficiency of RSS functional activity in genetic recombination mechanisms, and which positions within RSSs can be varied without having any significant effect on RSS functional activity in genetic recombination mechanisms (see, for example, Ramsden et al., 1994, Proc Natl Acad Sci USA 88(23):10721-10725).
In certain embodiments, the RSS selected for inclusion in the test agent coding sequence is an RSS that is known to the art. Also contemplated in some embodiments are sequence variants of known RSSs that comprise one or more nucleotide substitutions (for example, about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18 or more substitutions) relative to the known RSS sequence and which, by virtue of such substitutions, predictably have low efficiency (for example, about 1% or less, relative to a high efficiency RSS), medium efficiency (for example, about 10% to about 20%, relative to a high efficiency RSS) or high efficiency. Also contemplated in some embodiments are those RSS variants for which one or more nucleotide substitutions relative to a known RSS sequence will have no significant effect on the recombination efficiency of the RSS (for example, the success rate of the RSS in promoting formation of a recombination product, as known in the art).
In accordance with certain embodiments of the invention, RSSs selected for inclusion in the test agent coding sequence are pairs of RSSs in which the first RSS of the pair is capable of functional recombination with the second RSS of the pair (i.e., "complementary pairs"). It is to be understood that when a first RSS (for example present in a first polynucleotide or nucleic acid sequence) is described as being capable of functional recombination with a second RSS (for example present in a second polynucleotide or nucleic acid sequence), such capability includes compliance with the above-noted 12/23 rule for RSS spacers, such that if the first RSS comprises a 12-nucleotide spacer then the second RSS will comprise a 23-nucleotide spacer, and similarly if the first RSS comprises a 23-nucleotide spacer then the second RSS will comprise a 12-nucleotide spacer.
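By way of non-limiting illustration, the 12/23 rule described above can be sketched in Python as follows. The sketch assumes the exemplary consensus heptamer and nonamer given above and uses a placeholder spacer of "N" characters; the function names `build_rss`, `spacer_class`, and `can_recombine` are hypothetical. It does not model RSS orientation or functional sequence variants of the heptamer or nonamer.

```python
# Illustrative sketch only, assuming the exemplary consensus elements above.
HEPTAMER = "CACAGTG"
NONAMER = "ACAAAAACC"


def build_rss(spacer: str) -> str:
    """Assemble heptamer + spacer + nonamer (5' to 3')."""
    return HEPTAMER + spacer + NONAMER


def spacer_class(rss: str):
    """Return 12 or 23 for a consensus-flanked RSS, or None if unrecognized."""
    if not (rss.startswith(HEPTAMER) and rss.endswith(NONAMER)):
        return None
    spacer_len = len(rss) - len(HEPTAMER) - len(NONAMER)
    if 11 <= spacer_len <= 13:   # 12 +/- 1 bp
        return 12
    if 22 <= spacer_len <= 24:   # 23 +/- 1 bp
        return 23
    return None


def can_recombine(rss_a: str, rss_b: str) -> bool:
    """True only when one RSS is a 12-signal and the other a 23-signal."""
    return {spacer_class(rss_a), spacer_class(rss_b)} == {12, 23}


if __name__ == "__main__":
    rss12 = build_rss("N" * 12)  # placeholder spacer, not a real sequence
    rss23 = build_rss("N" * 23)
    print(can_recombine(rss12, rss23))
    print(can_recombine(rss12, rss12))
```

Under these assumptions, a pairing of two RSSs with the same spacer class is rejected, consistent with the statement that recombination does not occur between two RSS signals with the same size spacer.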
In accordance with the present invention, the RSSs are positioned at a pre-determined (targeted) location or locations within the test agent coding sequence.
In accordance with certain embodiments, the targeted location is selected such that it is within a non-conformational region of the protein (i.e. a region of the protein that is not important for folding and/or adoption of the protein's functional conformation). In some embodiments, the targeted location is selected such that it is within an externally exposed region of the protein.
In certain embodiments in which the test agent is a ligand-binding protein, the targeted location is selected such that it is within or proximal to a ligand-binding domain, or in a region that otherwise impacts on ligand binding by the protein.
Immunoglobulin D segment genes are known in the art and may include coding regions for natural or non-naturally occurring D segments which coding regions comprise 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23 or 24 nucleotides. Immunoglobulin J segment genes are also known in the art, for example, from immunoglobulin genes or cDNAs that have been sequenced, and typically comprise J segment-encoding regions of about 1-51 nucleotides. As described herein, many such Ig gene sequences are therefore known in the art (e.g., Kabat et al., Sequences of Proteins of Immunological Interest, Edition: 5, 1992 DIANE Publishing, 1992, Darby, PA, ISBN 094137565X, 9780941375658; Tomlinson et al., 1992 J Mol Biol 227:776; Milner et al., 1995 Ann N Y Acad Sci 764:50). Other genetic elements that may be useful in certain herein disclosed embodiments include membrane anchor domain polypeptide encoding polynucleotide sequences and variants or fragments thereof (e.g., primary sequence variants or truncated products that retain 3D structural properties of the corresponding unmodified polypeptide, such as space-filling, charge distribution and/or hydrophobicity/hydrophilicity) that encode membrane anchor domain polypeptides that localize the polypeptides in which they are present to the surfaces of cells in which they are expressed.
Specific binding interactions such as a specific protein-protein association or a specific antibody-antigen binding interaction may preferably include a protein-protein binding event, or an antibody-antigen binding event, having an affinity constant, Ka, of greater than or equal to about 10⁴ M⁻¹, more preferably of greater than or equal to about 10⁵ M⁻¹, more preferably of greater than or equal to about 10⁶ M⁻¹, and still more preferably of greater than or equal to about 10⁷ M⁻¹. Affinities of specific binding partners including antibodies can be readily determined using conventional techniques, for example, those described by Scatchard et al. (Ann. N.Y. Acad. Sci. USA 51:660 (1949)), by Harlow et al., in Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory (1988), by Weir).
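For illustration only, the affinity thresholds recited above may be related to the more commonly reported equilibrium dissociation constant, which for a simple one-to-one binding equilibrium is the reciprocal of the association (affinity) constant. The following Python sketch assumes that simple relationship; the names `THRESHOLDS`, `kd_from_ka`, and `highest_threshold_met` are hypothetical.

```python
# Illustrative sketch only. Thresholds are the affinity constants recited
# above, in units of 1/M (reciprocal molar).
THRESHOLDS = (1e4, 1e5, 1e6, 1e7)


def kd_from_ka(ka_per_molar: float) -> float:
    """Dissociation constant (M) as the reciprocal of the affinity constant."""
    return 1.0 / ka_per_molar


def highest_threshold_met(ka_per_molar: float):
    """Return the largest recited threshold that the value meets, or None."""
    met = [t for t in THRESHOLDS if ka_per_molar >= t]
    return max(met) if met else None


if __name__ == "__main__":
    ka = 3e6  # hypothetical measured affinity constant, 1/M
    print(kd_from_ka(ka), highest_threshold_met(ka))
```

Under this convention, an affinity constant of 10⁷ M⁻¹ corresponds to a dissociation constant of 100 nM, and larger affinity constants correspond to smaller dissociation constants.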
Accordingly, examples of systems and methods capable of producing libraries of test agents are further described in, e.g., WO2009129247A2, WO2013134880A1, WO2013134881A1, US201600297883A1, WO2017091905A1, and WO2018090144A1, which are hereby incorporated by reference.
Binding or affinity between an antigen and a test agent can be determined using a variety of techniques known in the art, for example but not limited to, equilibrium methods (e.g., enzyme-linked immunosorbent assay (ELISA); KinExA, Rathanaswami et al. Analytical Biochemistry, Vol. 373:52-60, 2008; or radioimmunoassay (RIA)), or by a surface plasmon resonance assay or other mechanism of kinetics-based assay (e.g., BIACORE™ analysis or Octet™ analysis (forteBIO)), and other methods such as indirect binding assays, competitive binding assays, fluorescence resonance energy transfer (FRET), gel electrophoresis and chromatography (e.g., gel filtration). These and other methods may utilize a label on one or more of the components being examined and/or employ a variety of detection methods including but not limited to chromogenic, fluorescent, luminescent, or isotopic labels. A detailed description of binding affinities and kinetics can be found in Paul, W. E., ed., Fundamental Immunology, 4th Ed., Lippincott-Raven, Philadelphia (1999), which focuses on antibody-immunogen interactions. One example of a competitive binding assay is a radioimmunoassay comprising the incubation of labeled antigen with the cell internalizing agent of interest in the presence of increasing amounts of unlabeled antigen, and the detection of the cell internalizing agent bound to the labeled antigen. The affinity of the test agent for a particular antigen and the binding off-rates can be determined from the data by Scatchard plot analysis. Competition with a second agent can also be determined using radioimmunoassays. In this case, the antigen is incubated with test agent conjugated to a labeled compound in the presence of increasing amounts of an unlabeled second cell internalizing agent.
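As a non-limiting illustration of Scatchard plot analysis, the following Python sketch fits a straight line to bound/free versus bound for a single class of non-interacting binding sites, for which the slope equals -1/Kd and the intercept on the bound axis equals the maximum binding (Bmax). The function name `scatchard_fit` and the synthetic data are assumptions introduced for explanation; the sketch estimates only the equilibrium dissociation constant and Bmax, not kinetic off-rates.

```python
def scatchard_fit(bound, free):
    """Least-squares fit of B/F = (Bmax - B) / Kd; returns (kd, bmax)."""
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # equals -1 / Kd
    intercept = mean_y - slope * mean_x  # equals Bmax / Kd
    kd = -1.0 / slope
    bmax = intercept * kd
    return kd, bmax


# Synthetic, noise-free data for a hypothetical one-site interaction:
# Kd = 2.0 nM, Bmax = 100 (arbitrary units), free ligand in nM.
free = [0.5, 1.0, 2.0, 4.0, 8.0]
bound = [100.0 * f / (2.0 + f) for f in free]

if __name__ == "__main__":
    kd_est, bmax_est = scatchard_fit(bound, free)
    print(round(kd_est, 6), round(bmax_est, 6))
```

With real measurements the transformed data are noisy and the linear fit can be biased, so direct nonlinear fitting of the binding isotherm is often preferred in practice; the linear form is shown here only because it is the analysis named above.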
Screening methods
The invention includes a method for identifying a cell internalizing agent, the method comprising providing a population of target cells expressing or displaying test agents. Each of the target cells contains a reporter construct that comprises a nucleic acid that provides a certain phenotype when activated or repressed. The phenotype is indicative of gene editing and represents internalization of the test agent in such a manner that when the test agent is conjugated to a site-directed modifying polypeptide, internalization of the complex results in gene editing, which is in turn represented by the observed phenotype.
In one embodiment, the target cell is a eukaryotic cell. Examples of eukaryotic cells that may be used to express the test agents include mammalian cells or yeast cells. Examples of mammalian cell lines that may be used with the methods disclosed herein include, but are not limited to, HEK 293 (Human embryonic kidney) and CHO (Chinese hamster ovary). These cell lines can be transfected by standard methods to express the test agents (e.g., using calcium phosphate or polyethyleneimine (PEI), or electroporation). Other mammalian cell lines include, but are not limited to: HeLa, U2OS, A549, HT1080, CAD, P19, NIH 3T3, L929, N2a, Human embryonic kidney 293 cells, MCF-7, Y79, SO-Rb50, Hep G2, DUKX-X11, J558L, and Baby hamster kidney (BHK) cells. In certain embodiments, the mammalian cell is a COP cell, an L cell, a C127 cell, an Sp2/0 cell, an NS-0 cell, an NIH3T3 cell, a PC12 cell, a PC12h cell, a BHK cell, a CHO cell, a COS1 cell, a COS3 cell, a COS7 cell, a CV1 cell, a Vero cell, a HeLa cell, an HEK-293 cell, a PER C6 cell, a cell derived from diploid fibroblasts, a myeloma cell, or HepG2.
As described above, test agents (e.g., a protein, a lipid, or a carbohydrate) contain a conjugation moiety that allows the agent to be conjugated to a site-directed modifying polypeptide (described in more detail below). As such, the site-directed modifying polypeptide also contains a conjugation moiety that is complementary to that on the test agent, such that the two molecules can be conjugated. Conjugation methods and moieties are provided in more detail below.
In one embodiment, screening of the test agents is further performed by contacting a population of target cells with a nucleoprotein comprising a site-directed modifying polypeptide and a guide RNA (gRNA) that specifically hybridizes to a target nucleic acid within the target cell. In one embodiment, the reporter construct containing the target nucleic acid sequence is located within the nucleus of the target cell. As noted above, the nucleoprotein comprises a conjugation moiety (i.e., a second moiety) that binds to the first conjugation moiety of the test agent.
Cell internalizing agents are subsequently identified by selecting a target cell having the phenotype observed with activation or repression of the reporter construct (hereafter referred to as a modified target cell).
In various aspects of the method, the phenotype of interest is associated with repression or activation of a reporter construct. In one embodiment, the phenotype of interest is a fluorescent protein that has at least one of an absorption or emission intensity of interest, an absorption or emission spectrum of interest, and a Stokes shift of interest. Moreover, the phenotype of interest may be the production of a protein having enzymatic activity, a protein having a lack of inhibition of enzyme activity, and a protein having activity in the presence of an inhibitor for the enzyme.
Thus, provided herein are methods of selecting a target cell by applying selective pressure to a population of cells comprising a reporter construct, where expression of the reporter construct confers a survival benefit to a target cell (i.e., positive selection). A target cell or a plurality of target cells that survive the selection process can be collected and the test agent of the target cell can be isolated and identified using techniques known in the art. For example, in one embodiment, a guide RNA is complexed with a site-directed modifying polypeptide (e.g., Cas9) wherein the resulting nucleoprotein is capable of modifying a nucleic acid sequence encoding a reporter construct, wherein the nucleic acid editing activity of the resulting nucleoprotein results in the expression of an antibiotic resistance gene, capable of conferring a survival advantage to the target cell. In such embodiments, a target cell of interest (i.e., a target cell comprising a test agent capable of internalizing the site-directed modifying polypeptide) is selected for target cell survival in the presence of an antibiotic (i.e., by selecting a cell that is not sensitive to the antibiotic) due to modification of the reporter construct such that the reporter construct is capable of expressing the antibiotic resistance gene.
Array-based screening methods
In certain embodiments, the invention provides a method of screening test cells having a plurality of genotypes for a cell that produces a cell internalizing agent using an array. Examples of such arrays include, but are not limited to, a microfluidic system, a microbubble system, or a microcavity array. An array-based screening method includes, but is not limited to, providing an array with a library of protein-variant-producing cells that each express a test agent, and then incubating the array under conditions that allow for the production of test agents from the protein-variant-producing cells. Subsequently, a target cell comprising a reporter construct comprising a nucleic acid that provides a phenotype when activated or repressed is provided and contacted with the test agent under conditions that allow for the test agent to bind to the target cell, wherein the test agent comprises a first conjugation moiety. The target cell is then contacted with a nucleoprotein comprising a site-directed modifying polypeptide and a guide RNA (gRNA) that specifically hybridizes to a target nucleic acid of the target cell, wherein a second conjugation moiety is conjugated to the site-directed modifying polypeptide such that the first conjugation moiety of the test agent binds to the second conjugation moiety of the site-directed modifying polypeptide. Following this contact, the array is examined in order to determine a given result which is an expected phenotype associated with activation or repression of the nucleic acid in the reporter construct. Once the result is achieved, it is representative of the test agent being capable of internalizing the nucleic acid guided nuclease such that gene editing can occur. Thereafter, the target cell is extracted (e.g., with electromagnetic radiation) such that the cell internalizing agent (i.e., the successful test agent) can be identified using standard techniques.
The array may be designed such that some or all cavities (or chambers of a device) contain a single biological element (e.g., a single target cell) to screen for the cell internalizing agent. The concentration of the heterogeneous mixture of cells is therefore calculated according to the design of the array and desired agents to identify.
The array may be loaded by contacting a solution containing a plurality of cells, such as a heterogeneous population of cells, with the array. In one embodiment, loading a mixture of test agent displaying or secreting cells, e.g., yeast or mammalian cells, evenly into all the microcavities involves placing a 500 µL droplet on the upper side of the array and spreading it over all the micro-pores. As an example, an initial concentration of approximately 10⁹ cells in the 500 µL droplet results in approximately 3 cells (or subpopulation) per micro-cavity. In a specific embodiment, the concentration conditions are set such that in most cavities of the array only single elements are present. This allows for the most precise screening of single elements. However, the exact number will depend on the volume of the cavity in the array and the concentration of cells in solution. The cells may secrete or display at least one test agent.
These concentration conditions can be readily calculated by the person of skill in the art. By way of example, in a cell screen, if the ratio of protein-producing cells to cavities is about 1 to 3, an array with 10⁹ cavities could be loaded with 3 × 10⁸ different protein-producing cells in a 6 mL volume (6 mL = 20 picoliter/pore × 3 × 10⁸ cavities), and the vast majority of the cavities will contain at most a single clone. In certain other embodiments, single cells are not desired in each pore. For these embodiments, the concentration of the heterogeneous population is set so that more than one cell is found in each pore.
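As a non-limiting illustration of the calculation described above, the following sketch estimates cavity occupancy for the 3 × 10⁸ cells and 10⁹ cavities example. It assumes that cells distribute among cavities independently at random, so that occupancy follows a Poisson distribution; that statistical model, and the function names `poisson_occupancy` and `loading_volume_ml`, are assumptions of this sketch rather than part of the disclosure.

```python
import math


def poisson_occupancy(cells, cavities):
    """Fractions of cavities holding 0, 1, or more than 1 cell.

    Assumes random, independent loading (Poisson statistics).
    """
    mean = cells / cavities           # average number of cells per cavity
    empty = math.exp(-mean)           # P(0 cells)
    single = mean * empty             # P(exactly 1 cell)
    return {
        "mean": mean,
        "empty": empty,
        "single": single,
        "multiple": 1.0 - empty - single,
    }


def loading_volume_ml(n_cavities, picoliters_per_cavity):
    """Total liquid volume in mL (1 mL = 1e9 pL)."""
    return n_cavities * picoliters_per_cavity / 1e9


occupancy = poisson_occupancy(cells=3e8, cavities=1e9)
volume_ml = loading_volume_ml(n_cavities=3e8, picoliters_per_cavity=20.0)

print(f"empty:    {occupancy['empty']:.3f}")
print(f"single:   {occupancy['single']:.3f}")
print(f"multiple: {occupancy['multiple']:.3f}")
print(f"loading volume: {volume_ml:.1f} mL")
```

Under this assumed model, roughly 96% of cavities hold at most one cell at a ratio of about 1 cell per 3 cavities, which is consistent with the statement that the vast majority of cavities contain at most a single clone.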
A sample containing the population and/or library of cells may require preparation steps prior to distribution to the array (e.g., genetic engineering of the genomic nucleic acid, or incubation or expansion of a particular cell line). In some embodiments, these preparation steps include an incubation time. The incubation time will depend on the design of the screen and the cells being screened. Example times include 5 minutes, 1 hour, 3 hours, 6 hours, 12 hours, 1 day, 2 days and 3 days or more. The heterogeneous population of cells may be expanded in media prior to adding and/or loading onto the array.
After the target cells have been loaded into the array, additional molecules or particles can be added or removed from the array without disturbing the cells. For example, any biologically reactive molecule or particle useful in the detection of the target cells can be added. These additional molecules or particles can be added to the array by introducing liquid reagents comprising the molecules or particles to the top of the array, such as for example by adding drop-wise as described herein in relation to the addition of the cells.
In certain embodiments, the top of the array is sealed with a membrane following the addition of sample to the cavities in order to reduce evaporation of the media from the cavities. For example, typical food-service type plastic wraps such as polyvinylidene fluoride (PVDF) are suitable. In another embodiment, the membrane allows water vapor to equilibrate with the top liquid layer of the liquid in the pore, which can help prevent evaporation.
In certain embodiments the top of the array is covered with a semi-permeable composition that allows delivery of liquid and reagents to the cavities of the array while also preventing evaporation of the cavity contents.
Accordingly, in one aspect, the disclosure is directed to an array including a plurality of distinct cavities comprising open first ends and open second ends, wherein the open first ends of essentially all of the plurality of cavities collectively comprise a first porous planar surface, and the open second ends of essentially all of the plurality of cavities comprise a second porous planar surface, and a cover for the first surface that imparts at least one of moisture, a nutrient, or a biologically reactive molecule, to contents of the cavities.
Following incubation, addition of components, and/or another preparation step, the array is scanned to identify cavities containing cells having a phenotype of interest, which may include cells capable of internalizing a test agent and allowing editing of a nucleic acid sequence by a site-directed modifying polypeptide (or nucleoprotein). For example, following established guidelines for quantitative wide-field microscopy, the intercapillary variability in fluorescence signals detected from the array may be measured. The passive nature of the microcapillary filling process results in a uniform meniscus level across the entire array. This uniformity, coupled with gravitational sedimentation of the loaded cells, simplifies the establishment of the imaging focus plane without the need for autofocus. Rather, the focus may be set at three distantly spaced points on the array, for example the corners. From these three points, the plane of the microcapillary array may be calculated.
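The three-point focus calculation described above can be illustrated with a short sketch that derives the plane z = a·x + b·y + c from three (x, y, z) focus readings and then predicts the focus height at any other stage position. The function names and the coordinate values are hypothetical and chosen only for illustration; units are arbitrary stage units.

```python
def focus_plane(p1, p2, p3):
    """Return (a, b, c) such that z = a*x + b*y + c passes through three points."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = p1, p2, p3
    # Two in-plane vectors, both starting at p1.
    ux, uy, uz = x2 - x1, y2 - y1, z2 - z1
    vx, vy, vz = x3 - x1, y3 - y1, z3 - z1
    # Plane normal is the cross product u x v.
    nx = uy * vz - uz * vy
    ny = uz * vx - ux * vz
    nz = ux * vy - uy * vx
    if nz == 0:
        raise ValueError("focus points must not be collinear in the x-y plane")
    a = -nx / nz
    b = -ny / nz
    c = z1 - a * x1 - b * y1
    return a, b, c


def focus_at(plane, x, y):
    """Predicted focus height at stage position (x, y)."""
    a, b, c = plane
    return a * x + b * y + c


# Hypothetical focus readings taken at three corners of the array.
plane = focus_plane((0.0, 0.0, 10.0), (100.0, 0.0, 12.0), (0.0, 100.0, 9.0))
print(focus_at(plane, 100.0, 100.0))
```

Choosing widely spaced points, such as the corners, reduces the sensitivity of the fitted plane to small errors in any single focus reading.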
Based on the optical information received from a detector associated with the array of cavities, target cavities with the desired properties are identified and their contents extracted for further characterizations and expansion. The disclosed methods maintain the integrity of the biological elements in the cavities. Therefore, the methods disclosed herein provide for the display and independent recovery of a target population of test agents from a population of up to billions of test agents.
For example, the signals from each cavity are scanned to locate the binding events of interest. This identifies the cavities of interest. Individual cavities containing the desired clones can be extracted using a variety of methods. For all extraction techniques, the extracted cells or material can be expanded through culture or amplification reactions and identified for the recovery of the test agent, e.g., a protein or an antibody. Following screening, one or more cavities of interest can be extracted as described herein. In certain embodiments, the desired specificity will be a single biological element per pore or cavity.
In an embodiment, a test agent having a desired characteristic (e.g., fluorescence) is detected directly in the cavities of the array. The biochemical sensing can be done using standard detection techniques including a sandwich immunoassay or similar binding or hybridizing reactions. The property of interest may be at least one of an emission spectrum or emission intensity, a Stokes shift, and an absorption spectrum or absorption intensity. In one particular example, the method of the disclosure can be used to screen a library of cells expressing test agents having a particular fluorescent absorbance spectrum, emission spectrum and/or extinction coefficient. In one particular embodiment, the protein is a dimerization-dependent orange fluorescent protein (ddOFP).
Detection of test agents in accordance with the disclosure requires, in some embodiments, the use of an apparatus capable of applying electromagnetic radiation to the sample, and in particular, to an array of cavities, such as a microarray. The apparatus must also be capable of detecting electromagnetic radiation emitted from the sample, and in particular a sample cavity.
In one embodiment, an electromagnetic radiation source of the apparatus is broad spectrum light or a monochromatic light source having a wavelength that matches the wavelength of at least one label in a sample. In a further embodiment, the electromagnetic radiation source is a laser, such as a continuous wave laser. In yet a further embodiment, the electromagnetic source is a solid state UV laser.
The apparatus may also include, in certain embodiments, a detector that receives electromagnetic (EM) radiation from the label(s) in the sample array. The detectors can identify at least one cavity (e.g., a microcavity) emitting electromagnetic radiation from one or more labels.
In one embodiment, light (e.g., light in the ultra-violet, visible or infrared range) emitted by a fluorescent label after exposure to electromagnetic radiation is detected.
Once a particle or element is labeled to render it detectable, or if the particle possesses an intrinsic characteristic rendering it detectable, any suitable detection mechanism known in the art may be used without departing from the scope of the disclosure, for example a CCD camera, a video input module camera, a Streak camera, a bolometer, a photodiode, a photodiode array, avalanche photodiodes, and photomultipliers producing sequential signals, and combinations thereof.
In some embodiments, the number of cells in the sample liquid results in a diverse population of cells in each cavity. Following extraction and expansion of the contents of a particular cavity, the resulting population can be screened in subsequent steps to identify particular cells of interest.
In other embodiments, the number of cells in a sample liquid is less than the number of cavities in the array, resulting in the loading of one cell or fewer in each of the cavities.
Once a cavity or cavities of interest are identified, the contents of the cavities can be extracted with the apparatus and methods known in the art. The cavity contents can be further analyzed or expanded. Expanded cell populations from a cavity or cavities can be rescreened with the array according to the methods herein. For instance, if the number of biological elements in a population exceeds the number of cavities in the array, the population can be screened with more than one element in each pore. The contents of the cavities that provide a positive signal can then be extracted to provide a subpopulation. The subpopulation can be screened immediately or, when the subpopulation is cells, it can be expanded. The screening process can be repeated until each cavity of the array contains only a single element. The screen can also be applied to detect and/or extract the cavity that indicates the desired analyte is therein. Following the selection of the cavity, other conventional techniques may be used to isolate the individual agents of interest, such as techniques that provide for higher levels of protein production.
In certain embodiments, in an initial screen of the library or in the enrichment process, multiple cells may be added to any particular cavity. Cell contents may be extracted and further analyzed or enriched in accordance with the method of the disclosure. Ultimately, having one cell per cavity allows for identification of a particular genotype. The extracting may include discretely directing electromagnetic radiation to the cavities having cells producing proteins having a phenotype of interest, wherein the directing of electromagnetic radiation to the cavities does not heat the liquid prior to extraction.
Microcavity array based screening
In certain embodiments, microcavity arrays are used to screen test agents where test cells produce test agents and secrete the agents into the microcavity. The test agent may be an antibody, e.g., a recombinant antibody and/or a monoclonal antibody. The cell or cells may produce more than one kind of antibody or multiple copies of the same antibody. The microcavities containing the cells displaying or secreting compounds having the highest binding affinity can be identified with an appropriate reporter system. The array can be imaged to identify one or more cavities containing cells having the phenotype of interest. The contents of the cavities may be extracted by directing electromagnetic radiation from a pulsed diode laser at a radiation absorbing material associated with the cavity.
In one particular aspect, the disclosure is directed to microcavity arrays which may include reaction cavities assembled in an extreme density porous array. As an example, micro-arrays contemplated herein can be manufactured by bundling millions or billions of cavities or pores, such as in the form of silica capillaries, and fusing them together through a thermal process. Such a fusing process may comprise steps including but not limited to: i) heating a capillary single draw glass that is drawn under tension into a single clad fiber; ii) creating a capillary multi draw single capillary from the single draw glass by bundling, heating, and drawing; iii) creating a capillary multi-multi draw multi capillary from the multi draw single capillary by additional bundling, heating, and drawing; iv) creating a block assembly of drawn glass from the multi-multi draw multi capillary by stacking in a pressing block; v) creating a block pressing block from the block assembly by treating with heat and pressure; and vi) creating a block forming block by cutting the block pressing block at a precise length (e.g., 1 mm).
In one embodiment, the capillaries are cut to approximately 1 millimeter in height, thereby forming a plurality of micro-pores having an internal diameter between approximately 1.0 micrometers and 500 micrometers. In one embodiment, the micro-pores range between approximately 10 micrometers and 1 millimeter long. In one embodiment, the micro-pores range between approximately 10 micrometers and 1 centimeter long. In one embodiment, the micro-pores range between approximately 10 micrometers and 100 millimeters long. In one embodiment, the micro-pores range between approximately 0.5 millimeter and 1 centimeter long.
Very high-density micro-pore arrays may be used in the various aspects of the disclosure. In example embodiments, each micro-pore can have a 5 µm diameter and approximately 66% open space (i.e., representing the lumen of each microcavity). In some arrays, the proportion of the array that is open ranges between about 50% and about 90%, for example about 60 to 75%, more particularly about 67%. In one example, a 10x10 cm array having 5 µm diameter microcavities and approximately 66% open space has about 330 million micro-pores. The internal diameter of micro-cavities may range between approximately 1.0 micrometers and 500 micrometers. In some arrays, each of the micro-pores can have an internal diameter in the range between approximately 1.0 micrometers and 300 micrometers; optionally between approximately 1.0 micrometers and 100 micrometers; further optionally between approximately 1.0 micrometers and 75 micrometers; still further optionally between approximately 1.0 micrometers and 50 micrometers; still further optionally, between approximately 5.0 micrometers and 50 micrometers.
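The pore count given above follows from simple geometry, as the following illustrative sketch shows. It assumes circular pore cross-sections and treats the open fraction as the ratio of total lumen area to total array area; the function names are hypothetical. The per-pore volume is computed for the approximately 1 millimeter capillary height described above.

```python
import math


def pore_count(side_cm, pore_diameter_um, open_fraction):
    """Approximate number of pores in a square array of the given side length."""
    array_area_um2 = (side_cm * 1e4) ** 2                 # 1 cm = 1e4 micrometers
    pore_area_um2 = math.pi * (pore_diameter_um / 2.0) ** 2
    return open_fraction * array_area_um2 / pore_area_um2


def pore_volume_pl(pore_diameter_um, length_um):
    """Volume of one cylindrical pore in picoliters (1 pL = 1000 cubic micrometers)."""
    return math.pi * (pore_diameter_um / 2.0) ** 2 * length_um / 1000.0


n_pores = pore_count(side_cm=10.0, pore_diameter_um=5.0, open_fraction=0.66)
volume_pl = pore_volume_pl(pore_diameter_um=5.0, length_um=1000.0)

print(f"pores: {n_pores:.3g}")
print(f"volume per pore: {volume_pl:.1f} pL")
```

Under these assumptions the count comes to roughly 3.4 × 10⁸ pores, in line with the figure of about 330 million, and a 5 µm by 1 mm pore holds close to 20 picoliters, which matches the per-pore volume used in the loading example.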
In one particular embodiment, a microcavity array can be manufactured by bonding billions of silica capillaries and then fusing them together through a thermal process. After that, slices (0.5 mm or more) are cut out to form a very high aspect ratio glass micro perforated array plate. See, International Application PCT/EP2011/062015 (WO2012/007537), which is incorporated by reference herein in its entirety. A number of useful arrays are commercially available, such as from Hamamatsu Photonics K. K. (Japan), Incom, Inc. (Massachusetts), Photonis Technologies, S.A.S. (France) Inc. and others. In some embodiments, the microcavities of the array are closed at one end with a solid substrate attached to the array.
In certain embodiments, the sidewalls of the cavities of the arrays are not transmissive to electromagnetic radiation, or the cavities are coated with a material that prevents the transmission of electromagnetic radiation between cavities of the arrays. Suitable coatings should not interfere with the binding reaction within the cavities or the application of forces to the cavities. Example coatings include sputtered nanometer layers of gold, silver and platinum. In another example, the capillary walls of the array are comprised of multiple layers, wherein one or more layers of the walls are made of a low refractive index material that prevents or substantially diminishes transmission of electromagnetic radiation between cavities of the array.
In particular embodiments, the arrays are prepared under or subjected to either wet or dry hydrogen atmospheres in order to inhibit or block the transmission of electromagnetic radiation through the array.
In one aspect of the disclosure, the cavities of the array have a hydrophilic surface that facilitates the spontaneous uptake of the solution into the cavity. In another aspect, a surface of the array may be treated to impart hydrophobicity. Combining these aspects, one surface of the array may be hydrophobic and the other surface may be hydrophilic. For example, a top surface and a bottom surface of the array are treated differently to impart hydrophilic characteristics on the top and hydrophobic characteristics on the bottom. The array may be treated sequentially, first with an agent to impart hydrophobicity, then on the opposite side with an agent to impart hydrophilicity.
Accordingly, the disclosure is directed to the use of an array including a plurality of distinct cavities comprising open first ends and open second ends, wherein the open first ends of essentially all of the plurality of cavities collectively encompass a porous planar hydrophilic surface, and the open second ends of essentially all of the plurality of cavities encompass a porous planar hydrophobic surface. The surfaces include the open ends of the cavities and the interstitial spaces between the cavities. The hydrophilic characteristics may be imparted using a corona treatment according to techniques known in the art. In addition, the array may be treated with hydrophobic agents such as a polysiloxane, or composition comprising polysiloxane. As an example, the hydrophobic agent is a hydroxy-terminated polydimethylsiloxane. In a particular embodiment, the hydrophobic agent is RAIN-X® water repellant.
In order to provide an array having opposed hydrophilic and hydrophobic surfaces, one surface or the entire array can be treated to impart a hydrophilic characteristic. Thereafter the hydrophilic surfaces are protected, for example by application of a sealant, and the opposing surfaces are treated with a hydrophobic agent.
In one embodiment, the method includes isolating cells located in the microcavities by pressure ejection. For example, a separated microcavity array is covered with a plastic film. In one embodiment, the method further provides a laser capable of making a hole through the plastic film, thereby exposing the spatially addressed micro-pore. Subsequently, exposure to a pressure source (e.g., air pressure) expels the contents from the spatially addressed microcavity. See WO2012/007537.
Another embodiment is directed to a method of extracting a solution including a biological element from a single microcavity in a microcavity array. In this embodiment, the microcavity is associated with an electromagnetic radiation absorbent material so that the material is within the cavity or is coating or covering the microcavity. Extraction occurs by focusing electromagnetic radiation at the microcavity to generate an expansion of the sample or of the material or both or evaporation that expels at least part of the sample from the microcavity. The electromagnetic radiation source may be the same or different than the source that excites a fluorescent label. The source may be capable of emitting multiple wavelengths of electromagnetic radiation in order to accommodate different absorption spectra of the materials and the labels.
In some embodiments, subjecting a selected microcavity to focused electromagnetic radiation can cause an expansion of the electromagnetic radiation absorbent material, which expels sample contents onto a substrate for collecting the expelled contents.
In one aspect, the electromagnetic radiation is focused on the electromagnetic radiation absorbing material, resulting in linear absorption of the laser energy and cavitation of the liquid sample at the material/liquid interface. In most applications, directing of electromagnetic radiation to the material should avoid heating liquid that is not in contact with the material at the focus of the radiation, so as to avoid heating the liquid contents of the microcavity and impacting the biological material in the cells. Accordingly, the amount of energy necessary to disrupt the meniscus is not sufficient to cause a significant increase in temperature of the entire liquid contents. In one aspect the laser is focused on the material of a cavity of the array adjacent the meniscus itself, causing a disruption of the meniscus without heating the liquid contents of the cavity other than the heating associated with the vaporization of a small amount of liquid at the portion of the meniscus adjacent the laser focus.
In certain embodiments, extraction from cavities of the array is accomplished by excitation of one or more particles in the microcavity, wherein excitation energy is focused on the particles. Accordingly, some embodiments employ energy absorbing particles in the cavities and an electromagnetic radiation source capable of discretely delivering electromagnetic radiation to the particles in each cavity of the array. In certain aspects, a sequence of pulses repeatedly agitates magnetic beads in a cavity to disrupt a meniscus, which expels sample contents onto a substrate for collecting the expelled contents. The electromagnetic radiation emission spectra from the electromagnetic radiation source must be such that there is at least a partial overlap in the absorption spectra of the electromagnetic radiation absorbent material associated with the cavity. In certain embodiments, individual cavities from a microcavity array are extracted by a sequence of short laser pulses rather than a single large pulse. For example, a laser is pulsed at wavelengths of between about 300 and 650 nm, more particularly about 349 nm, 405 nm, 450 nm, or 635 nm. In some embodiments, cavities of interest are selected and then extracted by focusing a 349 nm solid state UV laser at 20-30% intensity power.
In some embodiments, a diode laser may be used as an electromagnetic radiation source. In certain embodiments, a diode laser is pulsed at between about 2 to 20 pulses, for instance 2, 4, 6, 8, 10, 12, 14, 16, 18, and 20 pulses, with a pulse length of about 1 to 10 msec, for instance, 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10 msec, and having a pulse separation of approximately 10 msec to 100 msec, for instance 10, 20, 30, 40, 50, 60, 70, 80, 90 and 100 msec. If magnetic beads are in the capillary, the laser pulse energy is absorbed by the beads, primarily heating the surface of the bead that is directly exposed to the laser. The liquid in immediate proximity to this surface is explosively vaporized, which propels the beads within the capillary.
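For illustration only, the pulse parameters listed above can be captured in a small data structure that checks each value against the stated ranges and lays out the resulting pulse timing. The class `PulseTrain` and its fields are hypothetical names, and the sketch assumes that "pulse separation" means the gap between the end of one pulse and the start of the next; the disclosure does not specify this, and the sketch does not control any real laser hardware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PulseTrain:
    """Hypothetical description of a diode laser extraction pulse sequence."""
    n_pulses: int          # about 2 to 20 pulses
    pulse_ms: float        # pulse length, about 1 to 10 msec
    separation_ms: float   # assumed gap between pulses, about 10 to 100 msec

    def __post_init__(self):
        if not 2 <= self.n_pulses <= 20:
            raise ValueError("n_pulses outside the 2 to 20 range")
        if not 1 <= self.pulse_ms <= 10:
            raise ValueError("pulse_ms outside the 1 to 10 msec range")
        if not 10 <= self.separation_ms <= 100:
            raise ValueError("separation_ms outside the 10 to 100 msec range")

    def schedule(self):
        """List of (start_ms, end_ms) for each pulse, starting at t = 0."""
        period = self.pulse_ms + self.separation_ms
        return [(i * period, i * period + self.pulse_ms)
                for i in range(self.n_pulses)]

    def total_duration_ms(self):
        """Time from the start of the first pulse to the end of the last."""
        return (self.n_pulses * self.pulse_ms
                + (self.n_pulses - 1) * self.separation_ms)


train = PulseTrain(n_pulses=10, pulse_ms=5.0, separation_ms=50.0)
print(train.schedule()[:2])
print(train.total_duration_ms())
```

Even at the upper end of every range (20 pulses of 10 msec separated by 100 msec), the whole sequence under this interpretation lasts about 2.1 seconds per cavity.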
Focusing electromagnetic radiation at a microcavity can cause the electromagnetic radiation absorbing material to expand, which causes at least part of the liquid volume of the cavity to be expelled. Microcavities can be open at both ends, with the contents being held in place by hydrostatic force. During the extraction process, one of the ends of the cavities can be covered to prevent expulsion of the contents from the wrong end of the cavity.
In some embodiments, the capture surface comprises a hygroscopic layer upon which the contents of the cavity are expelled. The hygroscopic layer attracts water and prevents the deformation of the optical surface allowing clear imaging of the cavity contents. In certain embodiments, the layer is a hygroscopic composition, such as witch hazel, a solution including glycerol, or a solution of phosphate buffered saline with bovine serum albumin and sorbitol in concentrations for example of 0.1% weight/volume BSA and 1 M sorbitol.
In another embodiment, once the cellular contents of one or more cavities have been extracted onto a capture surface, the surface may be contacted with a culture matrix to allow transfer of the contents of the cavity to the matrix. Once the contents have been transferred to the matrix, cells may be allowed to propagate on the matrix. The capture surface can be removed immediately after contact or within minutes, hours, days or weeks as appropriate to ensure viability of the cell culture(s) in the matrix. Alternatively, the cells may be extracted directly onto the growth matrix, assuming the matrix has sufficient transparency to allow for the extraction laser to penetrate the matrix without sufficient focus to transfer energy to the array as described herein.
Accordingly, examples of arrays and methods for high-throughput analysis of cells are further described in, e.g., US20160244749A1, US20180163198A1, US10370653B2, US10227583B2, WO2018111765A1, WO2018191180A1, and WO2018125832A1, which are hereby incorporated by reference.
Selection
The invention provides methods for selecting test agents that are able to internalize site-directed modifying polypeptides (or nucleoproteins) where the read out for the internalization, in certain preferred embodiments, is the gene editing activity of the site-directed modifying polypeptides (or nucleoproteins). This read out, which is based on activation or repression of the reporter construct, is manifested in a phenotype that can be screened in order to separate out target cells having test agents that failed to internalize and/or provide gene editing activity when conjugated to a site-directed modifying polypeptide (or nucleoprotein).
The phenotype used to identify test agents that are cell internalizing agents can be observed using a label or other detection means. As used herein, the term "label" or "detectable label" means a molecule that can be directly (i.e., a primary label) or indirectly (i.e., a secondary label) detected. For example, a label can be visualized and/or measured and/or otherwise identified so that its presence, absence, or a parameter or characteristic thereof can be measured and/or determined. As used herein, the term "fluorescent label" refers to any molecule that can be detected via its inherent fluorescent properties, which include fluorescence detectable upon excitation. Examples of fluorescent labels are described herein and elsewhere in the art (e.g., see The Tenth Edition of Haugland, RP. The Handbook: A Guide to Fluorescent Probes and Labeling Technologies. 10th. Invitrogen/Molecular Probes; Carlsbad, CA: 2005, hereby incorporated by reference). In some embodiments of the methods provided herein, internalization and nucleic acid editing activity of a site-directed modifying polypeptide is detected by introducing into the target cell a polynucleotide comprising a target sequence capable of being bound by the gNA and measuring for cleavage at the target sequence in the target cell. To measure cleavage of the target sequence, in some embodiments, the target sequence can be labelled with two detectable labels that generate a signal upon interaction (e.g., a FRET pair) such that cleavage of the target sequence disrupts interaction of the detectable labels and causes a reduction in fluorescence. In other embodiments, the target sequence is labelled with a quenching pair such that cleavage of the target sequence leads to a gain in signal.
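By way of illustration only, the two labelling schemes described above give opposite signal changes on cleavage, which can be sketched as follows. The function name, the `label_mode` values, and the fold-change threshold are hypothetical and are not taken from this disclosure.

```python
def cleavage_detected(before: float, after: float, label_mode: str,
                      min_fold_change: float = 2.0) -> bool:
    """Return True if the signal change is consistent with target cleavage.

    label_mode "fret":   cleavage separates the interacting pair, so the
                         signal is expected to drop.
    label_mode "quench": cleavage separates fluorophore and quencher, so the
                         signal is expected to rise.
    min_fold_change is an assumed, illustrative threshold.
    """
    if before <= 0 or after <= 0:
        raise ValueError("fluorescence readings must be positive")
    if label_mode == "fret":
        return before / after >= min_fold_change
    if label_mode == "quench":
        return after / before >= min_fold_change
    raise ValueError(f"unknown label_mode: {label_mode!r}")
```

In practice the threshold would be set from control cells that received no site-directed modifying polypeptide.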
Additional examples of fluorophores that can be used in the methods provided herein include Alexa Fluor® 350; Marina Blue®; Atto 390; Alexa Fluor® 405; Pacific Blue™; Pacific Green™; Atto 425; Alexa Fluor® 430; Atto 465; DY-485XL; DY-475XL; FAM™ 494; Alexa Fluor® 488; DY-495-05; Atto 495; Oregon Green® 488; DY-480XL 500; Atto 488; Alexa Fluor® 500; Rhodamin Green®; DY-505-05; DY-500XL; DY-510XL; Oregon Green® 514; Atto 520; Alexa Fluor® 514; JOE 520; TET™ 521; CAL Fluor® Gold 540; DY-521XL; Rhodamin 6G®; Yakima Yellow® 526; Atto 532; Alexa Fluor® 532; HEX 535; VIC 538; CAL Fluor Orange 560; DY-530; TAMRA™; Quasar 570; Cy3™ 550; NED™; DY-550; Atto 550; Alexa Fluor® 555; DY-555; Alexa Fluor® 546; BMN™-3; DY-547; PET®; Rhodamin Red®; Atto 565; CAL Fluor RED 590; ROX; Alexa Fluor® 568; Texas Red®; CAL Fluor Red 610; LC Red® 610; Alexa Fluor® 594; Atto 590; Atto 594; DY-600XL; DY-610; Alexa Fluor® 610; CAL Fluor Red 635; Atto 620; DY-615; LC Red 640; Atto 633; Alexa Fluor® 633; DY-630; DY-633; DY-631; LIZ 638; Atto 647N; BMN™-5; Quasar 670; DY-635; Cy5™; Alexa Fluor® 647; CEQ8000 D4; LC Red 670; DY-647 652; DY-651; Atto 655; Alexa Fluor® 660; DY-675; DY-676; Cy5.5™ 675; Alexa Fluor® 680; LC Red 705; BMN™-6; CEQ8000 D3; IRDye® 700Dx 689; DY-680; DY-681; DY-700; Alexa Fluor® 700; DY-701; DY-730; DY-731; DY-732; DY-750; Alexa Fluor® 750; CEQ8000 D2; DY-751; DY-780; DY-776; IRDye® 800CW; DY-782; DY-781; Oyster® 556; Oyster® 645; IRDye® 700; IRDye® 800; WellRED D4; WellRED D3; WellRED D2 Dye; Rhodamine Green™; Rhodamine Red™; fluorescein; MAX 550 531 560 JOE NHS Ester (like Vic); TYE™ 563; TEX 615; TYE™ 665; TYE 705; BODIPY 493/503™; BODIPY 558/568™; BODIPY 564/570™; BODIPY 576/589™; BODIPY 581/591™; BODIPY TR-X™; BODIPY 530/550™; Carboxy-X-Rhodamine™; Carboxynaphthofluorescein; Carboxyrhodamine 6G™; Cascade Blue™; 7-Methoxycoumarin; 6-JOE; 7-Aminocoumarin-X; and 2',4',5',7'-Tetrabromosulfonefluorescein.
In some embodiments, the methods provided herein involve negative selection against cells wherein the reporter construct is not edited. For example, in one embodiment, a guide RNA is complexed with a site-directed modifying polypeptide (e.g., Cas9) wherein the resulting nucleoprotein is capable of modifying a nucleic acid sequence encoding thymidine kinase. In such embodiments, a target cell of interest (i.e., a target cell comprising a test agent capable of internalizing the site-directed modifying polypeptide) is selected for target cell survival in the presence of ganciclovir (i.e., by selecting a cell that is not sensitive to ganciclovir) due to modification of the reporter construct capable of expressing thymidine kinase (i.e., knockout of the thymidine kinase by the nucleic acid editing activity of the site-directed modifying polypeptide). See, e.g., Mortensen, Selection of Transfected Mammalian Cells, Current Protocols, 2001.
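The logic of this negative selection can be sketched, purely for illustration, as a survival rule: a cell that retains a functional thymidine kinase reporter is sensitive to ganciclovir, whereas a cell in which the reporter has been knocked out survives. The function names and the dictionary of cell identifiers below are hypothetical.

```python
def survives_ganciclovir(tk_functional: bool, ganciclovir_present: bool) -> bool:
    """A cell is lost only when functional thymidine kinase meets ganciclovir."""
    return not (tk_functional and ganciclovir_present)


def select_hits(cells: dict) -> list:
    """Return IDs of cells that survive ganciclovir treatment.

    `cells` maps a hypothetical cell ID to True if its thymidine kinase
    reporter is still functional (not edited), or False if it was knocked out.
    """
    return sorted(cell_id for cell_id, tk_functional in cells.items()
                  if survives_ganciclovir(tk_functional, True))
```

Under these assumptions, surviving cells are those in which the test agent mediated internalization and editing of the reporter.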
In one aspect, where the screening method as described herein involves a reporter construct providing for a phenotype which is a detectable label, a target cell containing a reporter construct can be screened for internalization of a site-directed modifying polypeptide (or nucleoprotein) or genome editing activities of a site-directed modifying polypeptide (or nucleoprotein) based on the level of the detectable label. A detectable label is a molecule that can be visualized or otherwise observed. The detectable label may be encoded by a polynucleotide that is operably linked to the polynucleotide encoding the nucleic acid-guided nuclease. In such instances, the expression construct will encode a nucleoprotein. Detectable labels include any detectable protein domain, including but not limited to, a fluorescent protein or a protein domain that can be detected with a specific antibody. Non-limiting examples of fluorescent proteins include green fluorescent proteins (e.g., GFP, sfGFP, EGFP, ZsGreen1), yellow fluorescent proteins (e.g., YFP, EYFP, ZsYellow1), or red fluorescent protein (e.g., RFP). Non-limiting examples of small molecule detectable labels include radioactive labels, such as ³H and ³⁵S. A variety of detectable labels and methods of preparing fusion proteins comprising detectable labels are known in the art, e.g., see Thorn, K. (2017). Genetically encoded fluorescent tags. Molecular Biology of the Cell, 28(7), 848-857, which is hereby incorporated by reference.
In certain embodiments, a target cell comprising a reporter construct is detected using cell sorting. In some embodiments, the cell sorting is fluorescence-activated cell sorting (FACS). In another embodiment, the cell sorting is magnetic-activated cell sorting (MACS). In yet another embodiment, the cell sorting is microfluidic-based cell sorting. In one embodiment, a target cell comprising a reporter construct capable of expressing a fluorescent label is sorted by FACS. FACS sorting not only measures fluorescence signals in cells at a rapid rate, but also collects cells that have specified fluorescence properties. Screening for a target cell using FACS may be performed in accordance with the compositions and methods as described herein. In certain embodiments, a target cell comprising a test agent capable of internalizing a site-directed modifying polypeptide (or nucleoprotein) may be identified using FACS by screening cells for the disappearance of a cell surface receptor due to the nucleic acid editing activities of the internalized site-directed modifying polypeptide (or nucleoprotein). In some embodiments, a site-directed modifying polypeptide (e.g., Cas9) complexed with a guide RNA (i.e., forming a nucleoprotein of Cas9 and the guide RNA) modifies a reporter construct capable of expressing a cell surface receptor expressed on a target cell of interest, resulting in loss of expression (e.g., knock-out of the cell surface receptor). In another embodiment, a target cell comprising a test agent capable of internalizing a site-directed modifying polypeptide (or nucleoprotein) may be identified using FACS by sequencing and isolating/expanding cells that exhibit knockout of the cell surface receptor.
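As a non-limiting sketch of the receptor-loss readout, cells can be gated on staining intensity for the cell surface receptor, with events below a chosen threshold treated as candidate knock-outs. The function name, the event dictionary, and the threshold are hypothetical and would in practice be set from stained and unstained control populations.

```python
from typing import Dict, List


def gate_receptor_negative(events: Dict[str, float], threshold: float) -> List[str]:
    """Return IDs of events whose receptor-stain intensity is below threshold.

    `events` maps a hypothetical event ID to its fluorescence intensity for a
    labeled antibody against the reporter-encoded cell surface receptor.
    """
    return sorted(event_id for event_id, intensity in events.items()
                  if intensity < threshold)
```

Events returned by such a gate correspond to cells that would be collected for expansion and sequencing of the associated test agent.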
In another embodiment, a target cell comprising a test agent expressed from a protein variant producing cell, wherein the target cell comprising the test agent is capable of internalizing a site-directed modifying polypeptide (or nucleoprotein) may be identified by providing a labeled protein (or antibody) capable of binding to a cell surface receptor expressed by the reporter construct of the target cell, wherein the absence of labeled protein (or antibody) binding identifies a target cell of interest due to the loss of expression of the cell surface receptor (e.g., knock-out of the cell surface receptor). In such embodiments, a protein variant producing cell from these wells would be pooled for further screening and sequencing of test agent variants. In other embodiments, the selection can be performed in a variety of host cells such as yeast, bacteria, plant, insect, or mammalian cells depending on the requirements of the experiment and the capabilities of the expression vectors being used. In some embodiments, a spectrophotometer, a microtitre plate reader, a CCD, a fluorescence microscope, or other similar device may be used to detect fluorescence in a target cell or in an array comprising a single cell or plurality of cells.
Conjugation moieties and linkers
The present invention includes methods for determining whether a test agent, e.g., a test agent expressed on (or bound to) the surface of a target cell, is a cell internalization agent, and thereby facilitates the internalization of a site-directed modifying polypeptide (or nucleoprotein) into a target cell. In some embodiments, the test agent contains a first conjugation moiety and the site-directed modifying polypeptide contains a second conjugation moiety, such that the first conjugation moiety of the test agent binds to (or is stably associated with) the second conjugation moiety of the site-directed modifying polypeptide to form a complex between the test agent and the site-directed modifying polypeptide. In certain embodiments, the first and second conjugation moieties provide for a covalent linkage between the test agent and the site-directed modifying polypeptide. In other embodiments, the first and second conjugation moieties provide for a non-covalent linkage between the test agent and the site-directed modifying polypeptide.
In one embodiment, the test agent comprising the first conjugation moiety is expressed on the surface of a cell via a genetically-encoded first conjugation moiety. In another embodiment, the test agent comprising the first conjugation moiety is expressed on the surface of a cell and the first conjugation moiety is added post-production. In other embodiments, the test agent is secreted from a cell such that the test agent comprising the first conjugation moiety must be prepared in ex vivo culture. In preferred embodiments, when the test agent is secreted, the test agent and first conjugation moiety are site-specifically conjugated in ex vivo culture. Accordingly, in preferred embodiments, following ex vivo conjugation the test agent comprising the first conjugation moiety may be presented to a target cell under conditions that allow for the test agent to bind to the target cell.
In one embodiment, the conjugation moieties include, but are not limited to, SpyTag/SpyCatcher, SnoopTag/SnoopCatcher, sortase, and split intein. In other embodiments, the conjugation moieties include, but are not limited to, Halo-tag, mono-avidin, ACP tag, a SNAP tag, or any other conjugation moieties known in the art. In one embodiment, the conjugation moiety is selected from Protein A, CBP, MBP, GST, poly(His), biotin/streptavidin, V5-tag, Myc-tag, HA-tag, NE-tag, His-tag, Flag tag, Halo-tag, Snap-tag, Fc-tag, Nus-tag, BCCP, thioredoxin, SnoopTag, SpyTag, SpyCatcher, Isopeptag, SBP-tag, S-tag, AviTag, and calmodulin. In one embodiment, the conjugation moiety is a chemical tag. For example, a chemical tag may be a SNAP tag, a CLIP tag, a HaloTag or a TMP-tag. In one example, the chemical tag is a SNAP-tag or a CLIP-tag. SNAP and CLIP fusion proteins enable the specific, covalent attachment of virtually any molecule to a protein of interest. In another example, the chemical tag is a HaloTag. HaloTag involves a modular protein tagging system that allows different molecules to be linked onto a single genetic fusion, either in solution, in living cells, or in chemically fixed cells. In another example, the chemical tag is a TMP-tag.
In one embodiment, the conjugation moiety is an epitope tag. For example, an epitope tag may be a poly-histidine tag such as a hexahistidine tag or a dodecahistidine tag, a FLAG tag, a Myc tag, an HA tag, a GST tag or a V5 tag.
In one embodiment, the site-directed modifying polypeptide and the test agent may each be engineered to comprise complementary binding pairs that enable stable association upon contact. Exemplary binding moiety pairings include (i) streptavidin-binding peptide (SBP) and streptavidin (STV), (ii) biotin and EMA (enhanced monomeric avidin), (iii) SpyTag (ST) and SpyCatcher (SC), (iv) Halo-tag and Halo-tag ligand, (v) and SNAP-Tag, (vi) Myc tag and anti-Myc immunoglobulins, (vii) FLAG tag and anti-FLAG immunoglobulins, and (viii) ybbR tag and coenzyme A groups. In some embodiments, the conjugation moiety is selected from SBP, biotin, SpyTag, SpyCatcher, halo-tag, SNAP-tag, Myc tag, or FLAG tag.
In one embodiment, the site-directed modifying polypeptide can alternatively be associated with a test agent via one or more linkers as described herein, wherein the linker is a conjugation moiety.
In one embodiment, the test agent is associated with the first conjugation moiety via a linker. In other embodiments, the site-directed modifying polypeptide is associated with the second conjugation moiety via a linker.
The term "linker" as used herein means a divalent chemical moiety comprising a covalent bond or a chain of atoms that covalently attaches, for example, a test agent with a first conjugation moiety, and a site-directed modifying polypeptide with a second conjugation moiety. Any known method of conjugation of peptides or macromolecules can be used in the context of the present disclosure. Generally, covalent attachment of the test agent with the first conjugation moiety or the site-directed modifying polypeptide with the second conjugation moiety requires the linker to have two reactive functional groups, i.e., bivalency in a reactive sense. Bivalent linker reagents which are useful to attach two or more functional or biologically active moieties, such as peptides, nucleic acids, drugs, toxins, antibodies, haptens, and reporter groups are known, and methods for such conjugation have been described in, for example, Hermanson, G. T. (1996) Bioconjugate Techniques; Academic Press: New York, p234-242, the disclosure of which is incorporated herein by reference as it pertains to linkers suitable for covalent conjugation. Further linkers are disclosed in, for example, Tsuchikama, K. and Zhiqiang, A. Protein and Cell, 9(1), p.33-46, (2018), the disclosure of which is incorporated herein by reference as it pertains to linkers suitable for covalent conjugation.
Generally, linkers suitable for use in the compositions and methods disclosed are stable in circulation, but allow for release of the extracellular cell membrane binding moiety and/or the site-directed modifying polypeptide in the target cell or, alternatively, in close proximity to the target cell. Linkers suitable for the present disclosure may be broadly categorized as non-cleavable or cleavable, as well as intracellular or extracellular, each of which is further described herein below. In some embodiments, the linker is cleaved once inside the cell cytoplasm (reducing conditions). In other embodiments, the linker is cleaved once inside the maturing endosome (acidic conditions).
Non-Cleavable Linkers
In some embodiments, the linker conjugating the test agent with the first conjugation moiety, and the linker conjugating the site-directed modifying polypeptide with the second conjugation moiety, are non-cleavable. Non-cleavable linkers comprise stable chemical bonds that are resistant to degradation (e.g., proteolysis). Generally, non-cleavable linkers require proteolytic degradation inside the target cell, and exhibit high extracellular stability. Non-cleavable linkers suitable for use herein further may include one or more groups selected from a bond, -(C=O)-, C1-C6 alkylene, C1-C6 heteroalkylene, C2-C6 alkenylene, C2-C6 heteroalkenylene, C2-C6 alkynylene, C2-C6 heteroalkynylene, C3-C6 cycloalkylene, heterocycloalkylene, arylene, heteroarylene, and combinations thereof, each of which may be optionally substituted, and/or may include one or more heteroatoms (e.g., S, N, or O) in place of one or more carbon atoms. Non-limiting examples of such groups include alkylene (CH2)p, (C=O)(CH2)p, and polyethylene glycol (PEG; (CH2CH2O)p) units, wherein p is an integer from 1-6, independently selected for each occasion. Non-limiting examples of non-cleavable linkers utilized in antibody-drug conjugation include those based on maleimidomethylcyclohexanecarboxylate, caproylmaleimide, and acetylphenylbutanoic acid.
Cleavable Linkers
In some embodiments, the linker conjugating the test agent with the first conjugation moiety, and the linker conjugating the site-directed modifying polypeptide with the second conjugation moiety, are cleavable, such that cleavage of the linker (e.g., by a protease, such as a metalloprotease) releases the test agent or the site-directed modifying polypeptide from its respective conjugation moiety in the intracellular or extracellular (e.g., upon binding of the molecule to the cell surface) environment. Cleavable linkers are designed to exploit the differences in local environments, e.g., extracellular and intracellular environments, for example, pH, reduction potential or enzyme concentration, to trigger the release of the site-directed modifying polypeptide component in the target cell in order to facilitate genome editing. Generally, cleavable linkers are relatively stable in circulation in vivo, but are particularly susceptible to cleavage in the intracellular environment through one or more mechanisms (e.g., including, but not limited to, activity of proteases, peptidases, and glucuronidases). Cleavable linkers used herein are stable outside the target cell and may be cleaved at some efficacious rate inside the target cell or in close proximity to the extracellular membrane of the target cell. An effective linker will: (i) maintain the specific binding properties of the test agent, e.g., an antibody; (ii) allow intra- or extracellular delivery of the site-directed modifying polypeptide (or nucleoprotein); (iii) remain stable and intact, i.e., not cleaved, until the site-directed modifying polypeptide (or nucleoprotein) has been delivered or transported to its targeted site; and (iv) maintain the gene targeting effect (e.g., CRISPR) of the site-directed modifying polypeptide.
Stability of the site-directed modifying polypeptide (or nucleoprotein) may be measured by standard analytical techniques such as mass spectroscopy, size determination by size exclusion chromatography or diffusion constant measurement by dynamic light scattering, HPLC, and the separation/analysis technique LC/MS.
Suitable cleavable linkers include those that may be cleaved, for instance, by enzymatic hydrolysis, photolysis, hydrolysis under acidic conditions, hydrolysis under basic conditions, oxidation, disulfide reduction, nucleophilic cleavage, or organometallic cleavage (see, for example, Leriche et al., Bioorg. Med. Chem., 20:571-582, 2012, the disclosure of which is incorporated herein by reference as it pertains to linkers suitable for covalent conjugation). Suitable cleavable linkers may include, for example, chemical moieties such as a hydrazine, a disulfide, a thioether or a peptide.
Linkers hydrolyzable under acidic conditions include, for example, hydrazones, semicarbazones, thiosemicarbazones, cis-aconitic amides, orthoesters, acetals, ketals, or the like. (See, e.g., U.S. Pat. Nos. 5,122,368; 5,824,805; 5,622,929; Dubowchik and Walker, 1999, Pharm. Therapeutics 83:67-123; Neville et al., 1989, Biol. Chem. 264:14653-14661, the disclosure of each of which is incorporated herein by reference in its entirety as it pertains to linkers suitable for covalent conjugation.) Such linkers are relatively stable under neutral pH conditions, such as those in the blood, but are unstable at below pH 5.5 or 5.0, the approximate pH of the lysosome. Generally, linkers including such acid-labile functionalities tend to be relatively less stable extracellularly. This lower stability may be advantageous where extracellular cleavage is desired.
Linkers cleavable under reducing conditions include, for example, a disulfide. A variety of disulfide linkers are known in the art, including, for example, those that can be formed using SATA (N-succinimidyl-S-acetylthioacetate), SPDP (N-succinimidyl-3-(2-pyridyldithio)propionate), SPDB (N-succinimidyl-3-(2-pyridyldithio)butyrate) and SMPT (N-succinimidyl-oxycarbonyl-alpha-methyl-alpha-(2-pyridyl-dithio)toluene). (See, e.g., Thorpe et al., 1987, Cancer Res. 47:5924-5931; Wawrzynczak et al., In Immunoconjugates: Antibody Conjugates in Radioimagery and Therapy of Cancer (C. W. Vogel ed., Oxford U. Press, 1987). See also U.S. Pat. No. 4,880,935, the disclosure of each of which is incorporated herein by reference in its entirety as it pertains to linkers suitable for covalent conjugation.) Disulfide-based linkers tend to be relatively unstable in circulation in plasma; however, this lower stability may be advantageous where extracellular cleavage is desired. Susceptibility to cleavage may also be tuned by, e.g., introducing steric bulk near the disulfide moiety to hinder reductive cleavage.
Linkers susceptible to enzymatic hydrolysis can be, e.g., a peptide-containing linker that is cleaved by an intracellular peptidase or protease enzyme, including, but not limited to, a lysosomal or endosomal protease. In some embodiments, the peptidyl linker is at least two amino acids long or at least three amino acids long. Exemplary amino acid linkers include a dipeptide, a tripeptide, a tetrapeptide or a pentapeptide. Examples of suitable peptides include those containing amino acids such as Valine, Alanine, Citrulline (Cit), Phenylalanine, Lysine, Leucine, and Glycine. Amino acid residues which comprise an amino acid linker component include those occurring naturally, as well as minor amino acids and non-naturally occurring amino acid analogs, such as citrulline. Exemplary dipeptides include valine-citrulline (vc or val-cit) and alanine-phenylalanine (af or ala-phe). Exemplary tripeptides include glycine-valine-citrulline (gly-val-cit) and glycine-glycine-glycine (gly-gly-gly). In some embodiments, the linker includes a dipeptide such as Val-Cit, Ala-Val, Phe-Lys, Val-Lys, Ala-Lys, Phe-Cit, Leu-Cit, Ile-Cit, Phe-Arg, or Trp-Cit. Linkers containing dipeptides such as Val-Cit or Phe-Lys are disclosed in, for example, U.S. Pat. No. 6,214,345, the disclosure of which is incorporated herein by reference in its entirety as it pertains to linkers suitable for covalent conjugation. In some embodiments, the linker includes a dipeptide selected from Val-Ala and Val-Cit. In certain embodiments, linkers comprising a peptide moiety may be susceptible to varying degrees of cleavage both intra- and extracellularly. Accordingly, in some embodiments, the linker comprises a dipeptide, and the TAGE agent is substantially cleaved extracellularly. In other embodiments, the linker comprises a dipeptide, and the TAGE agent is stable extracellularly and is cleaved intracellularly.
Linkers suitable for conjugating a site-directed modifying polypeptide as disclosed herein to a second conjugation moiety, as disclosed herein, include those capable of releasing the site-directed modifying polypeptide by a 1,6-elimination process. Chemical moieties capable of this elimination process include the p-aminobenzyl (PAB) group, 6-maleimidohexanoic acid, pH-sensitive carbonates, and other reagents as described in Jain et al., Pharm. Res. 32:3526-3540, 2015, the disclosure of which is incorporated herein by reference in its entirety as it pertains to linkers suitable for covalent conjugation.
In some embodiments, the linker includes a "self-immolative" group such as the afore-mentioned PAB or PABC (para-aminobenzyloxycarbonyl), which are disclosed in, for example, Carl et al., J. Med. Chem. (1981) 24:479-480; Chakravarty et al (1983) J. Med. Chem. 26:638-644; US 6214345; US20030130189; US20030096743; US6759509; US20040052793; US6218519; US6835807; US6268488; US20040018194; WO98/13059; US20040052793; US6677435; US5621002; US20040121940; and WO2004/032828. Other such chemical moieties capable of this process ("self-immolative linkers") include methylene carbamates and heteroaryl groups such as aminothiazoles, aminoimidazoles, aminopyrimidines, and the like. Linkers containing such heterocyclic self-immolative groups are disclosed in, for example, U.S. Patent Publication Nos. 20160303254 and 20150079114, and U.S. Patent No. 7,754,681; Hay et al. (1999) Bioorg. Med. Chem. Lett. 9:2237; US 2005/0256030; de Groot et al (2001) J. Org. Chem. 66:8815-8830; and US 7223837. In some embodiments, a dipeptide is used in combination with a self-immolative linker.
Linkers suitable for use herein further may include one or more groups selected from C1-C6 alkylene, C1-C6 heteroalkylene, C2-C6 alkenylene, C2-C6 heteroalkenylene, C2-C6 alkynylene, C2-C6 heteroalkynylene, C3-C6 cycloalkylene, heterocycloalkylene, arylene, heteroarylene, and combinations thereof, each of which may be optionally substituted. Non-limiting examples of such groups include (CH2)p, (CH2CH2O)p, and -(C=O)(CH2)p- units, wherein p is an integer from 1-6, independently selected for each occasion.
In some embodiments, the linker may include one or more of a hydrazine, a disulfide, a thioether, a dipeptide, a p-aminobenzyl (PAB) group, a heterocyclic self-immolative group, an optionally substituted C1-C6 alkyl, an optionally substituted C1-C6 heteroalkyl, an optionally substituted C2-C6 alkenyl, an optionally substituted C2-C6 heteroalkenyl, an optionally substituted C2-C6 alkynyl, an optionally substituted C2-C6 heteroalkynyl, an optionally substituted C3-C6 cycloalkyl, an optionally substituted heterocycloalkyl, an optionally substituted aryl, an optionally substituted heteroaryl, a solubility enhancing group, acyl, -(C=O)-, or -(CH2CH2O)p- group, wherein p is an integer from 1-6. One of skill in the art will recognize that one or more of the groups listed may be present in the form of a bivalent (diradical) species, e.g., C1-C6 alkylene and the like.
In some embodiments, the linker includes a p-aminobenzyl group (PAB). In one embodiment, the p-aminobenzyl group is disposed between the cytotoxic drug and a protease cleavage site in the linker. In one embodiment, the p-aminobenzyl group is part of a p-aminobenzyloxycarbonyl unit. In one embodiment, the p-aminobenzyl group is part of a p-aminobenzylamido unit.
In some embodiments, the linker comprises PAB, Val-Cit-PAB, Val-Ala-PAB, Val-Lys(Ac)-PAB, Phe-Lys-PAB, Phe-Lys(Ac)-PAB, D-Val-Leu-Lys, Gly-Gly-Arg, Ala-Ala-Asn-PAB, or Ala-PAB. In some embodiments, the linker comprises a combination of one or more of a peptide, oligosaccharide, -(CH2)p-, -(CH2CH2O)p-, PAB, Val-Cit-PAB, Val-Ala-PAB, Val-Lys(Ac)-PAB, Phe-Lys-PAB, Phe-Lys(Ac)-PAB, D-Val-Leu-Lys, Gly-Gly-Arg, Ala-Ala-Asn-PAB, or Ala-PAB.
Suitable linkers may be substituted with groups which modulate solubility or reactivity. Suitable linkers may contain groups having solubility enhancing properties. Linkers including the (CH2CH2O)p unit (polyethylene glycol, PEG), for example, can enhance solubility, as can alkyl chains substituted with amino, sulfonic acid, phosphonic acid or phosphoric acid residues. Linkers including such moieties are disclosed in, for example, U.S. Patent Nos. 8,236,319 and 9,504,756, the disclosure of each of which is incorporated herein by reference as it pertains to linkers suitable for covalent conjugation. Linkers containing such groups are described, for example, in U.S. Patent No. 9,636,421 and U.S. Patent Application Publication No. 2017/0298145, the disclosures of which are incorporated herein by reference as they pertain to linkers suitable for covalent conjugation to cytotoxins and antibodies or antigen-binding fragments thereof.
Suitable linkers for covalently conjugating the test agent with the first conjugation moiety, and the site-directed modifying polypeptide with the second conjugation moiety as disclosed herein can have two reactive functional groups (i.e., two reactive termini), one for conjugation to the test agent (or site-directed modifying polypeptide, respectively), and the other for conjugation to the first conjugation moiety (or second conjugation moiety, respectively). Suitable sites for conjugation may be, in certain embodiments, nucleophilic, such as a thiol, amino group, or hydroxyl group. Reactive (e.g., nucleophilic) sites that may be present within a test agent (or site-directed modifying polypeptide) as disclosed herein may include, without limitation, nucleophilic substituents on amino acid residues such as (i) N-terminal amine groups, (ii) side chain amine groups, e.g. lysine, (iii) side chain thiol groups, e.g. cysteine, (iv) side chain hydroxyl groups, e.g. serine; or (v) sugar hydroxyl or amino groups where the antibody is glycosylated. Suitable sites for conjugation on the first or second conjugation moiety may include, without limitation, hydroxyl moieties of serine, threonine, and tyrosine residues; amino moieties of lysine residues; carboxyl moieties of aspartic acid and glutamic acid residues; and thiol moieties of cysteine residues, as well as propargyl, azido, haloaryl (e.g., fluoroaryl), haloheteroaryl (e.g., fluoroheteroaryl), haloalkyl, and haloheteroalkyl moieties of non-naturally occurring amino acids. Accordingly, the antibody conjugation reactive terminus on the linker is, in certain embodiments, a thiol-reactive group such as a double bond (as in maleimide), a leaving group such as a chloro, bromo, iodo, or an R-sulfanyl group, or a carboxyl group.
Suitable sites for conjugation on the site-directed modifying polypeptide can also be, in certain embodiments, nucleophilic. Reactive (e.g., nucleophilic) sites that may be present within a site-directed modifying polypeptide as disclosed herein include, without limitation, nucleophilic substituents on amino acid residues such as (i) N-terminal amine groups, (ii) side chain amine groups, e.g. lysine, (iii) side chain thiol groups, e.g. cysteine, (iv) side chain hydroxyl groups, e.g. serine; or (v) sugar hydroxyl or amino groups where the antibody is glycosylated. Suitable sites for conjugation on the site-directed modifying polypeptide include, without limitation, hydroxyl moieties of serine, threonine, and tyrosine residues; amino moieties of lysine residues; carboxyl moieties of aspartic acid and glutamic acid residues; and thiol moieties of cysteine residues, as well as propargyl, azido, haloaryl (e.g., fluoroaryl), haloheteroaryl (e.g., fluoroheteroaryl), haloalkyl, and haloheteroalkyl moieties of non-naturally occurring amino acids. Accordingly, the site-directed modifying polypeptide conjugation reactive terminus on the linker is, in certain embodiments, a thiol-reactive group such as a double bond (as in maleimide), a leaving group such as a chloro, bromo, iodo, or an R-sulfanyl group, or a carboxyl group. In some embodiments, the reactive functional group attached to the linker is a nucleophilic group which is reactive with an electrophilic group present on an antigen binding moiety, the site-directed modifying polypeptide, or both. Useful electrophilic groups on an antigen binding moiety or site-directed modifying polypeptide include, but are not limited to, aldehyde and ketone carbonyl groups. The heteroatom of a nucleophilic group can react with an electrophilic group on an antigen binding moiety or site-directed modifying polypeptide and form a covalent bond to the antigen binding moiety or the site-directed modifying polypeptide.
Useful nucleophilic groups include, but are not limited to, hydrazide, oxime, amino, hydroxyl, hydrazine, thiosemicarbazone, hydrazine carboxylate, and arylhydrazide.
When the term "linker" is used in describing the linker in conjugated form, one or both of the reactive termini will be absent (having been converted to a chemical moiety) or incomplete (such as being only the carbonyl of a carboxylic acid) because of the formation of the bonds between the linker and the extracellular cell membrane binding moiety, and/or between the linker and the site-directed modifying polypeptide. Accordingly, linkers useful herein include, without limitation, linkers containing a chemical moiety formed by a coupling reaction between a reactive functional group on the linker and a nucleophilic group or otherwise reactive substituent on the antigen binding moiety, and a chemical moiety formed by a coupling reaction between a reactive functional group on the linker and a nucleophilic group on the site-directed modifying polypeptide.
Examples of chemical moieties formed by these coupling reactions result from reactions between chemically reactive functional groups, including a nucleophile/electrophile pair (e.g., a thiol/haloalkyl pair, an amine/carbonyl pair, or a thiol/α,β-unsaturated carbonyl pair, and the like), a diene/dienophile pair (e.g., an azide/alkyne pair, or a diene/α,β-unsaturated carbonyl pair, among others), and the like. Coupling reactions between the reactive functional groups to form the chemical moiety include, without limitation, thiol alkylation, hydroxyl alkylation, amine alkylation, amine or hydroxylamine condensation, hydrazine formation, amidation, esterification, disulfide formation, cycloaddition (e.g., [4+2] Diels-Alder cycloaddition, [3+2] Huisgen cycloaddition, among others), nucleophilic aromatic substitution, electrophilic aromatic substitution, and other reactive modalities known in the art or described herein. Suitable linkers may contain an electrophilic functional group for reaction with a nucleophilic functional group on the antigen binding moiety, the site-directed modifying polypeptide, or both.
In some embodiments, the reactive functional group present within the test agent, the site-directed modifying polypeptide, the first and/or second conjugation moieties, or all of these, as disclosed herein is an amine or thiol moiety. Certain extracellular cell membrane binding moieties have reducible interchain disulfides, i.e. cysteine bridges. Extracellular cell membrane binding moieties may be made reactive for conjugation with linker reagents by treatment with a reducing agent such as DTT (dithiothreitol). Each cysteine bridge will thus form, theoretically, two reactive thiol nucleophiles. Additional nucleophilic groups can be introduced into antigen binding moieties through the reaction of lysines with 2-iminothiolane (Traut's reagent), resulting in conversion of an amine into a thiol. Reactive thiol groups may be introduced into the antigen binding moiety by introducing one, two, three, four, or more cysteine residues (e.g., preparing mutant antibodies comprising one or more non-native cysteine amino acid residues). U.S. Pat. No. 7,521,541 teaches engineering antibodies by introduction of reactive cysteine amino acids.
Linkers suitable for the synthesis of the covalent conjugates as disclosed herein include, without limitation, reactive functional groups such as maleimide or a haloalkyl group. These groups may be present in linkers or cross linking reagents such as succinimidyl 4-(N-maleimidomethyl)-cyclohexane-1-carboxylate (SMCC), N-succinimidyl iodoacetate (SIA), sulfo-SMCC, m-maleimidobenzoyl-N-hydroxysuccinimidyl ester (MBS), sulfo-MBS, and succinimidyl iodoacetate, among others described in, for instance, Liu et al., 18:690-697, 1979, the disclosure of which is incorporated herein by reference as it pertains to linkers for chemical conjugation.
In some embodiments, one or both of the reactive functional groups attached to the linker is a maleimide, azide, or alkyne. An example of a maleimide-containing linker is the non-cleavable maleimidocaproyl-based linker. Such linkers are described by Doronina et al., Bioconjugate Chem. 17:14-24, 2006, the disclosure of which is incorporated herein by reference as it pertains to linkers for chemical conjugation.
In some embodiments, the reactive functional group is -(C=O)- or -NH(C=O)-, such that the linker may be joined to the extracellular cell membrane binding moiety or the site-directed modifying polypeptide by an amide or urea moiety, respectively, resulting from reaction of the -(C=O)- or -NH(C=O)- group with an amino group of the extracellular cell membrane binding moiety or the site-directed modifying polypeptide, or both.
In some embodiments, the reactive functional group is an N-maleimidyl group, halogenated N-alkylamido group, sulfonyloxy N-alkylamido group, carbonate group, sulfonyl halide group, thiol group or derivative thereof, alkynyl group comprising an internal carbon-carbon triple bond, (hetero)cycloalkynyl group, bicyclo[6.1.0]non-4-yn-9-yl group, alkenyl group comprising an internal carbon-carbon double bond, cycloalkenyl group, tetrazinyl group, azido group, phosphine group, nitrile oxide group, nitrone group, nitrile imine group, diazo group, ketone group, (O-alkyl)hydroxylamino group, hydrazine group, halogenated N-maleimidyl group, 1,1-bis(sulfonylmethyl)methylcarbonyl group or elimination derivatives thereof, carbonyl halide group, or an allenamide group, each of which may be optionally substituted. In some embodiments, the reactive functional group comprises a cycloalkene group, a cycloalkyne group, or an optionally substituted (hetero)cycloalkynyl group.
Examples of bivalent linker reagents suitable for preparing conjugates as disclosed herein include, but are not limited to, N-succinimidyl 4-(maleimidomethyl)cyclohexanecarboxylate (SMCC), N-succinimidyl-4-(N-maleimidomethyl)-cyclohexane-1-carboxy-(6-amidocaproate), which is a “long chain” analog of SMCC (LC-SMCC), κ-maleimidoundecanoic acid N-succinimidyl ester (KMUA), γ-maleimidobutyric acid N-succinimidyl ester (GMBS), ε-maleimidocaproic acid N-hydroxysuccinimide ester (EMCS), m-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS), N-(α-maleimidoacetoxy)-succinimide ester (AMAS), succinimidyl-6-(β-maleimidopropionamido)hexanoate (SMPH), N-succinimidyl 4-(p-maleimidophenyl)-butyrate (SMPB), and N-(p-maleimidophenyl)isocyanate (PMPI). Cross-linking reagents comprising a haloacetyl-based moiety include N-succinimidyl-4-(iodoacetyl)-aminobenzoate (SIAB), N-succinimidyl iodoacetate (SIA), N-succinimidyl bromoacetate (SBA), and N-succinimidyl 3-(bromoacetamido)propionate (SBAP).
It will be recognized by one of skill in the art that any one or more of the chemical groups, moieties and features disclosed herein may be combined in multiple ways to form linkers useful for conjugation of the extracellular cell membrane binding moiety as disclosed herein to a site-directed modifying polypeptide, as disclosed herein. Further linkers useful in conjunction with the compositions and methods described herein are described, for example, in U.S. Patent Application Publication No. 2015/0218220, the disclosure of which is incorporated herein by reference as it pertains to linkers suitable for covalent conjugation.
Site-directed modifying polypeptide
A goal of the present invention is to identify cell internalizing agents that can act to internalize nucleic acid-guided nucleases so as to provide cell specificity for gene editing nucleases. In particular, the methods disclosed herein include conjugating a test agent to a site-directed modifying polypeptide, which in turn will target a specific nucleic acid and provide a gene editing function.
A site-directed modifying polypeptide (also referred to herein as a nucleic acid-guided nuclease) refers to a nuclease that is directed to a specific target sequence based on the complementarity (full or partial) between a guide nucleic acid (i.e., guide RNA or gRNA, guide DNA or gDNA, or guide DNA/RNA hybrid) that is associated with the nuclease and a target sequence.
In specific embodiments, the site-directed modifying polypeptide is an RNA-guided nuclease. The binding between the guide RNA and the target sequence serves to recruit the nuclease to the vicinity of the target sequence. Non-limiting examples of site-directed modifying polypeptides suitable for the presently disclosed compositions and methods include naturally-occurring Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-associated (Cas) polypeptides from a prokaryotic organism (e.g., bacteria, archaea) or variants thereof. CRISPR sequences found within prokaryotic organisms are sequences that are derived from fragments of polynucleotides from invading viruses and are used to recognize similar viruses during subsequent infections and to cleave viral polynucleotides via CRISPR-associated (Cas) polypeptides that function as RNA-guided nucleases. As used herein, a “CRISPR-associated polypeptide” or “Cas polypeptide” refers to a naturally-occurring polypeptide that is found within proximity to CRISPR sequences within a naturally-occurring CRISPR system. Certain Cas polypeptides function as RNA-guided nucleases.
There are at least two classes of naturally-occurring CRISPR systems, Class 1 and Class 2. In general, the nucleic acid-guided nucleases of the presently disclosed compositions and methods are Class 2 Cas polypeptides or variants thereof given that the Class 2 CRISPR systems comprise a single polypeptide with nucleic acid-guided nuclease activity, whereas Class 1 CRISPR systems require a complex of proteins for nuclease activity. There are at least three known types of Class 2 CRISPR systems, Type II, Type V, and Type VI, among which there are multiple subtypes (subtype II-A, II-B, II-C, V-A, V-B, V-C, VI-A, VI-B, and VI-C, among other undefined or putative subtypes). In general, Type II and Type V-B systems require a tracrRNA, in addition to crRNA, for activity. In contrast, Type V-A and Type VI only require a crRNA for activity. All known Type II and Type V RNA-guided nucleases target double-stranded DNA, whereas all known Type VI RNA-guided nucleases target single-stranded RNA. The RNA-guided nucleases of Type II CRISPR systems are referred to as Cas9 herein and in the literature. In some embodiments, the nucleic acid-guided nuclease of the presently disclosed compositions and methods is a Type II Cas9 protein or a variant thereof. Type V Cas polypeptides that function as RNA-guided nucleases do not require tracrRNA for targeting and cleavage of target sequences. The RNA-guided nucleases of Type VA CRISPR systems are referred to as Cpf1; of Type VB CRISPR systems are referred to as C2C1; of Type VC CRISPR systems are referred to as Cas12C or C2C3; of Type VIA CRISPR systems are referred to as C2C2 or Cas13A1; of Type VIB CRISPR systems are referred to as Cas13B; and of Type VIC CRISPR systems are referred to as Cas13A2 herein and in the literature. In certain embodiments, the nucleic acid-guided nuclease of the presently disclosed compositions and methods is a Type VA Cpf1 protein or a variant thereof.
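The naming conventions in the preceding paragraph can be summarized as a small lookup table. The following Python sketch is illustrative only: the dictionary layout, key labels, and function name are assumptions introduced here, and the entries simply restate the names and target types given in the text.

```python
# Illustrative lookup of the Class 2 naming conventions stated in the text.
# The structure, key labels, and helper name are hypothetical.
CLASS2_NUCLEASES = {
    "II":   {"names": ("Cas9",), "target": "dsDNA"},
    "V-A":  {"names": ("Cpf1",), "target": "dsDNA"},
    "V-B":  {"names": ("C2C1",), "target": "dsDNA"},
    "V-C":  {"names": ("Cas12C", "C2C3"), "target": "dsDNA"},
    "VI-A": {"names": ("C2C2", "Cas13A1"), "target": "ssRNA"},
    "VI-B": {"names": ("Cas13B",), "target": "ssRNA"},
    "VI-C": {"names": ("Cas13A2",), "target": "ssRNA"},
}


def subtypes_targeting(target):
    """Return the sorted type/subtype labels whose nucleases act on `target`."""
    return sorted(
        label for label, info in CLASS2_NUCLEASES.items()
        if info["target"] == target
    )


if __name__ == "__main__":
    print(subtypes_targeting("ssRNA"))
```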
Naturally-occurring Cas polypeptides and variants thereof that function as nucleic acid-guided nucleases are known in the art and include, but are not limited to Streptococcus pyogenes Cas9, Staphylococcus aureus Cas9, Streptococcus thermophilus Cas9, Francisella novicida Cpf1, or those described in Shmakov et al. (2017) Nat Rev Microbiol 15(3):169-182; Makarova et al. (2015) Nat Rev Microbiol 13(11):722-736; and U.S. Pat. No. 9790490, each of which is incorporated herein in its entirety. Class 2 Type V CRISPR nucleases include Cas12 and any subtypes of Cas12, such as Cas12a, Cas12b, Cas12c, Cas12d, Cas12e, Cas12f, Cas12g, Cas12h, and Cas12i. Class 2 Type VI CRISPR nucleases including Cas13 can be used in order to cleave RNA target sequences.
The site-directed modifying polypeptide (i.e., nucleic acid-guided nuclease) of the presently disclosed compositions and methods can be a naturally-occurring nucleic acid-guided nuclease (e.g., S. pyogenes Cas9) or a variant thereof. Variant nucleic acid-guided nucleases can be engineered or naturally occurring variants that contain substitutions, deletions, or additions of amino acids that, for example, alter the activity of one or more of the nuclease domains, fuse the nucleic acid-guided nuclease to a heterologous domain that imparts a modifying property (e.g., transcriptional activation domain, epigenetic modification domain, detectable label), modify the stability of the nuclease, or modify the specificity of the nuclease.
In some embodiments, a nucleic acid-guided nuclease includes one or more mutations to improve specificity for a target site and/or stability in the intracellular microenvironment. For example, where the protein is Cas9 (e.g., SpCas9) or a modified Cas9, it may be beneficial to delete any or all residues from N175 to R307 (inclusive) of the Rec2 domain. It may be found that a smaller, or lower-molecular mass, version of the nuclease is more effective. In some embodiments, the nuclease comprises at least one substitution relative to a naturally-occurring version of the nuclease. For example, where the protein is Cas9 or a modified Cas9, it may be beneficial to mutate C80 or C574 (or homologs thereof, in modified proteins with indels). In Cas9, desirable substitutions may include any of C80A, C80L, C80I, C80V, C80K, C574E, C574D, C574N, C574Q (in any combination) and in particular C80A. Substitutions may be included to reduce intracellular protein binding of the nuclease and/or increase target site specificity. Additionally or alternatively, substitutions may be included to reduce off-target toxicity of the composition.
The nucleic acid-guided nuclease is directed to a particular target sequence through its association with a guide nucleic acid (e.g., guide RNA (gRNA), guide DNA (gDNA)). The nucleic acid-guided nuclease is bound to the guide nucleic acid via non-covalent interactions, thus forming a complex. The polynucleotide-targeting nucleic acid provides target specificity to the complex by comprising a nucleotide sequence that is complementary to a sequence of a target sequence. The nucleic acid-guided nuclease of the complex or a domain or label fused or otherwise conjugated thereto provides the site-specific activity. In other words, the nucleic acid-guided nuclease is guided to a target polynucleotide sequence (e.g. a target sequence in a chromosomal nucleic acid; a target sequence in an extrachromosomal nucleic acid, e.g. an episomal nucleic acid, a minicircle; a target sequence in a mitochondrial nucleic acid; a target sequence in a chloroplast nucleic acid; a target sequence in a plasmid) by virtue of its association with the protein-binding segment of the polynucleotide-targeting guide nucleic acid.
Thus, the guide nucleic acid comprises two segments, a "polynucleotide-targeting segment" and a "polypeptide-binding segment." By "segment" it is meant a segment/section/region of a molecule (e.g., a contiguous stretch of nucleotides in an RNA). A segment can also refer to a region/section of a complex such that a segment may comprise regions of more than one molecule. For example, in some cases the polypeptide-binding segment (described below) of a polynucleotide-targeting nucleic acid comprises only one nucleic acid molecule and the polypeptide-binding segment therefore comprises a region of that nucleic acid molecule. In other cases, the polypeptide-binding segment (described below) of a DNA-targeting nucleic acid comprises two separate molecules that are hybridized along a region of complementarity.
The polynucleotide-targeting segment (or "polynucleotide-targeting sequence" or “guide sequence”) comprises a nucleotide sequence that is complementary (fully or partially) to a specific sequence within a target sequence (for example, the complementary strand of a target DNA sequence). The polypeptide-binding segment (or "polypeptide-binding sequence") interacts with a nucleic acid-guided nuclease (e.g., RNA-guided nuclease). In general, site-specific cleavage or modification of the target DNA by a nucleic acid-guided nuclease occurs at locations determined by both (i) base-pairing complementarity between the polynucleotide-targeting sequence of the nucleic acid and the target DNA; and (ii) a short motif (referred to as the protospacer adjacent motif (PAM)) in the target DNA.
A protospacer adjacent motif can be of different lengths and can be a variable distance from the target sequence, although the PAM is generally within about 1 to about 10 nucleotides from the target sequence, including about 1, about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, or about 10 nucleotides from the target sequence. The PAM can be 5' or 3' of the target sequence. Generally, the PAM is a consensus sequence of about 3-4 nucleotides, but in particular embodiments, can be 2, 3, 4, 5, 6, 7, 8, 9, or more nucleotides in length. Methods for identifying a preferred PAM sequence or consensus sequence for a given RNA-guided nuclease are known in the art and include, but are not limited to the PAM depletion assay described by Karvelis et al. (2015) Genome Biol 16:253, or the assay disclosed in Pattanayak et al. (2013) Nat Biotechnol 31(9):839-43, each of which is incorporated by reference in its entirety.
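By way of illustration only, the dependence of target-site selection on a PAM can be sketched in a few lines of Python. The sketch assumes the 5'-NGG-3' PAM commonly reported for S. pyogenes Cas9, placed immediately 3' of a 20-nucleotide target sequence; the function name, parameters, and default values are hypothetical and are not part of the disclosure. Only the supplied strand is scanned.

```python
import re


def find_target_sites(dna, guide_len=20, pam_regex="[ACGT]GG"):
    """List (start, target, pam) for every position on the supplied strand
    where a PAM lies immediately 3' of a `guide_len`-nt target sequence.

    Assumption: the PAM is expressed as a regular expression over A/C/G/T;
    the default corresponds to NGG.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all found.
    pattern = re.compile(r"(?=([ACGT]{%d})(%s))" % (guide_len, pam_regex))
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


if __name__ == "__main__":
    # Toy sequence: a 5-nt target followed by the PAM "CGG".
    print(find_target_sites("AAAAACGGTT", guide_len=5))
```

A nuclease with a 5' PAM, or with a spacer between the PAM and the target, would need a different pattern; the regular expression is kept as a parameter for that reason.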
The polynucleotide-targeting sequence (i.e., guide sequence) is the nucleotide sequence that directly hybridizes with the target sequence of interest. The guide sequence is engineered to be fully or partially complementary with the target sequence of interest. In various embodiments, the guide sequence can comprise from about 8 nucleotides to about 30 nucleotides, or more. For example, the guide sequence can be about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, about 20, about 21, about 22, about 23, about 24, about 25, about 26, about 27, about 28, about 29, about 30, or more nucleotides in length. In some embodiments, the guide sequence is about 10 to about 26 nucleotides in length, or about 12 to about 30 nucleotides in length. In particular embodiments, the guide sequence is about 30 nucleotides in length. In some embodiments, the degree of complementarity between a guide sequence and its corresponding target sequence, when optimally aligned using a suitable alignment algorithm, is about or more than about 50%, about 60%, about 70%, about 75%, about 80%, about 81%, about 82%, about 83%, about 84%, about 85%, about 86%, about 87%, about 88%, about 89%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or more. In particular embodiments, the guide sequence is free of secondary structure, which can be predicted using any suitable polynucleotide folding algorithm known in the art, including but not limited to mFold (see, e.g., Zuker and Stiegler (1981) Nucleic Acids Res. 9:133-148) and RNAfold (see, e.g., Gruber et al. (2008) Cell 106(1):23-24). In some embodiments, a guide nucleic acid comprises two separate nucleic acid molecules (an "activator-nucleic acid" and a "targeter-nucleic acid", see below) and is referred to herein as a "double-molecule guide nucleic acid" or a "two-molecule guide nucleic acid."
In other embodiments, the subject guide nucleic acid is a single nucleic acid molecule (single polynucleotide) and is referred to herein as a "single-molecule guide nucleic acid," a "single-guide nucleic acid," or an "sgNA." The term "guide nucleic acid” or "gNA" is inclusive, referring both to double-molecule guide nucleic acids and to single-molecule guide nucleic acids (i.e., sgNAs). In those embodiments wherein the guide nucleic acid is an RNA, the gRNA can be a double-molecule guide RNA or a single-guide RNA. Likewise, in those embodiments wherein the guide nucleic acid is a DNA, the gDNA can be a double-molecule guide DNA or a single-guide DNA.
An exemplary two-molecule guide nucleic acid comprises a crRNA-like ("CRISPR RNA" or "targeter-RNA" or "crRNA" or "crRNA repeat") molecule and a corresponding tracrRNA-like ("trans-acting CRISPR RNA" or "activator-RNA" or "tracrRNA") molecule. A crRNA-like molecule (targeter-RNA) comprises both the polynucleotide-targeting segment (single stranded) of the guide RNA and a stretch ("duplex-forming segment") of nucleotides that forms one half of the dsRNA duplex of the polypeptide-binding segment of the guide RNA, also referred to herein as the CRISPR repeat sequence.
The term "activator-nucleic acid" or “activator-NA” is used herein to mean a tracrRNA-like molecule of a double-molecule guide nucleic acid. The term "targeter-nucleic acid" or “targeter-NA” is used herein to mean a crRNA-like molecule of a double-molecule guide nucleic acid. The term "duplex-forming segment" is used herein to mean the stretch of nucleotides of an activator-NA or a targeter-NA that contributes to the formation of the dsRNA duplex by hybridizing to a stretch of nucleotides of a corresponding activator-NA or targeter-NA molecule. In other words, an activator-NA comprises a duplex-forming segment that is complementary to the duplex-forming segment of the corresponding targeter-NA. As such, an activator-NA comprises a duplex-forming segment while a targeter-NA comprises both a duplex-forming segment and the DNA-targeting segment of the guide nucleic acid. Therefore, a subject double-molecule guide nucleic acid can be comprised of any corresponding activator-NA and targeter-NA pair.
The targeter-NA comprises a CRISPR repeat sequence comprising a nucleotide sequence that comprises a region with sufficient complementarity to hybridize to an activator-NA (the other part of the polypeptide-binding segment of the guide nucleic acid). In various embodiments, the CRISPR repeat sequence can comprise from about 8 nucleotides to about 30 nucleotides, or more. For example, the CRISPR repeat sequence can be about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, about 20, about 21, about 22, about 23, about 24, about 25, about 26, about 27, about 28, about 29, about 30, or more nucleotides in length. In some embodiments, the degree of complementarity between a CRISPR repeat sequence and the antirepeat region of its corresponding tracr sequence, when optimally aligned using a suitable alignment algorithm, is about or more than about 50%, about 60%, about 70%, about 75%, about 80%, about 81%, about 82%, about 83%, about 84%, about 85%, about 86%, about 87%, about 88%, about 89%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or more.
A corresponding tracrRNA-like molecule (i.e., activator-NA) comprises a stretch of nucleotides (duplex-forming segment) that forms the other part of the double-stranded duplex of the polypeptide-binding segment of the guide nucleic acid. In other words, a stretch of nucleotides of a crRNA-like molecule (i.e., the CRISPR repeat sequence) is complementary to and hybridizes with a stretch of nucleotides of a tracrRNA-like molecule (i.e., the anti-repeat sequence) to form the double-stranded duplex of the polypeptide-binding domain of the guide nucleic acid. The crRNA-like molecule additionally provides the single stranded DNA-targeting segment. Thus, a crRNA-like and a tracrRNA-like molecule (as a corresponding pair) hybridize to form a guide nucleic acid. The exact sequence of a given crRNA or tracrRNA molecule is characteristic of the CRISPR system and species in which the RNA molecules are found. A subject double-molecule guide RNA can comprise any corresponding crRNA and tracrRNA pair.
A trans-activating-like CRISPR RNA or tracrRNA-like molecule (also referred to herein as an “activator-NA”) comprises a nucleotide sequence comprising a region that has sufficient complementarity to hybridize to a CRISPR repeat sequence of a crRNA, which is referred to herein as the anti-repeat region. In some embodiments, the tracrRNA-like molecule further comprises a region with secondary structure (e.g., stem-loop) or forms secondary structure upon hybridizing with its corresponding crRNA. In particular embodiments, the region of the tracrRNA-like molecule that is fully or partially complementary to a CRISPR repeat sequence is at the 5' end of the molecule and the 3' end of the tracrRNA-like molecule comprises secondary structure. This region of secondary structure generally comprises several hairpin structures, including the nexus hairpin, which is found adjacent to the anti-repeat sequence. The nexus hairpin often has a conserved nucleotide sequence in the base of the hairpin stem, with the motif UNANNC found in many nexus hairpins in tracrRNAs. There are often terminal hairpins at the 3' end of the tracrRNA that can vary in structure and number, but often comprise a GC-rich Rho-independent transcriptional terminator hairpin followed by a string of U’s at the 3' end. See, for example, Briner et al. (2014) Molecular Cell 56:333-339, Briner and Barrangou (2016) Cold Spring Harb Protoc; doi: 10.1101/pdb.top090902, and U.S. Publication No. 2017/0275648, each of which is herein incorporated by reference in its entirety.
In various embodiments, the anti-repeat region of the tracrRNA-like molecule that is fully or partially complementary to the CRISPR repeat sequence comprises from about 8 nucleotides to about 30 nucleotides, or more. For example, the region of base pairing between the tracrRNA-like anti-repeat sequence and the CRISPR repeat sequence can be about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, about 20, about 21, about 22, about 23, about 24, about 25, about 26, about 27, about 28, about 29, about 30, or more nucleotides in length. In some embodiments, the degree of complementarity between a CRISPR repeat sequence and its corresponding tracrRNA-like anti-repeat sequence, when optimally aligned using a suitable alignment algorithm, is about or more than about 50%, about 60%, about 70%, about 75%, about 80%, about 81%, about 82%, about 83%, about 84%, about 85%, about 86%, about 87%, about 88%, about 89%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or more.
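The degree of complementarity recited above, whether between a guide sequence and its target or between a CRISPR repeat and its anti-repeat, can be illustrated with a simplified calculation. The Python sketch below is an assumption-laden example rather than the alignment algorithm contemplated herein: it treats the two strands as equal in length, pairs them antiparallel without gaps, and counts only Watson-Crick pairs (G-U wobble pairs are not counted). The function and constant names are hypothetical.

```python
# Watson-Crick pairs only; DNA (T) and RNA (U) letters are both accepted.
WATSON_CRICK = {
    ("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"),
}


def percent_complementarity(strand_a, strand_b):
    """Percent of positions that base-pair when two equal-length strands,
    both written 5'->3', are aligned antiparallel with no gaps.

    Simplification (assumption): no optimal-alignment search is performed.
    """
    a, b = strand_a.upper(), strand_b.upper()
    if not a or len(a) != len(b):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum((x, y) in WATSON_CRICK for x, y in zip(a, reversed(b)))
    return 100.0 * paired / len(a)


if __name__ == "__main__":
    # Toy 4-nt example: an RNA strand against a DNA strand.
    print(percent_complementarity("GACU", "AGTC"))
    print(percent_complementarity("GACU", "AGTT"))
```

A gapped or locally optimized alignment, as the text contemplates, could yield a higher value than this ungapped count for sequences containing insertions or deletions.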
In various embodiments, the entire tracrRNA-like molecule can comprise from about 60 nucleotides to more than about 140 nucleotides. For example, the tracrRNA-like molecule can be about 60, about 65, about 70, about 75, about 80, about 85, about 90, about 95, about 100, about 105, about 110, about 115, about 120, about 125, about 130, about 135, about 140, or more nucleotides in length. In particular embodiments, the tracrRNA-like molecule is about 80 to about 100 nucleotides in length, including about 80, about 81, about 82, about 83, about 84, about 85, about 86, about 87, about 88, about 89, about 90, about 91, about 92, about 93, about 94, about 95, about 96, about 97, about 98, about 99, and about 100 nucleotides in length.
A subject single-molecule guide nucleic acid (i.e., sgNA) comprises two stretches of nucleotides (a targeter-NA and an activator-NA) that are complementary to one another, are covalently linked by intervening nucleotides ("linkers" or "linker nucleotides"), and hybridize to form the double stranded nucleic acid duplex of the protein-binding segment, thus resulting in a stem-loop structure. The targeter-NA and the activator-NA can be covalently linked via the 3' end of the targeter-NA and the 5' end of the activator-NA. Alternatively, the targeter-NA and the activator-NA can be covalently linked via the 5' end of the targeter-NA and the 3' end of the activator-NA.
The linker of a single-molecule DNA-targeting nucleic acid can have a length of from about 3 nucleotides to about 100 nucleotides. For example, the linker can have a length of from about 3 nucleotides (nt) to about 90 nt, from about 3 nt to about 80 nt, from about 3 nt to about 70 nt, from about 3 nt to about 60 nt, from about 3 nt to about 50 nt, from about 3 nt to about 40 nt, from about 3 nt to about 30 nt, from about 3 nt to about 20 nt or from about 3 nt to about 10 nt, including but not limited to about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, about 20, or more nucleotides. In some embodiments, the linker of a single-molecule DNA-targeting nucleic acid is 4 nt.
An exemplary single-molecule DNA-targeting nucleic acid comprises two complementary stretches of nucleotides that hybridize to form a double-stranded duplex, along with a guide sequence that hybridizes to a specific target sequence.
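The arrangement of a single-molecule guide described above (targeter-NA joined through linker nucleotides to the activator-NA) can be sketched as a simple string assembly. The following Python example is a minimal illustration under stated assumptions: it shows only the orientation in which the 3' end of the targeter-NA is linked to the 5' end of the activator-NA, the default 4-nt linker "GAAA" and all sequences are placeholders chosen for the example, and the function name is hypothetical.

```python
def assemble_single_guide(guide_seq, repeat, activator, linker="GAAA"):
    """Concatenate, 5'->3': guide sequence + CRISPR repeat (together the
    targeter-NA), linker nucleotides, then the activator-NA.

    The linker length check mirrors the "about 3 to about 100 nucleotides"
    range recited in the text; the default linker is an assumption.
    """
    if not 3 <= len(linker) <= 100:
        raise ValueError("linker must be 3 to 100 nucleotides long")
    targeter = guide_seq + repeat
    return targeter + linker + activator


if __name__ == "__main__":
    # Placeholder sequences, far shorter than any functional guide.
    sg = assemble_single_guide("ACGU", "GUUUU", "AAAAC")
    print(sg, len(sg))
```

The alternative orientation mentioned in the text (5' end of the targeter-NA linked to the 3' end of the activator-NA) would reverse the order of concatenation.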
Appropriate naturally-occurring cognate pairs of crRNAs (and, in some embodiments, tracrRNAs) are known for most Cas proteins that function as nucleic acid-guided nucleases that have been discovered, or can be determined for a specific naturally-occurring Cas protein that has nucleic acid-guided nuclease activity by sequencing and analyzing flanking sequences of the Cas nucleic acid-guided nuclease protein to identify tracrRNA-coding sequence, and thus, the tracrRNA sequence, by searching for known antirepeat-coding sequences or a variant thereof. Antirepeat regions of the tracrRNA comprise one-half of the ds protein-binding duplex. The complementary repeat sequence that comprises one-half of the ds protein-binding duplex is called the CRISPR repeat. CRISPR repeat and antirepeat sequences utilized by known CRISPR nucleic acid-guided nucleases are known in the art and can be found, for example, at the CRISPR database on the world wide web at crispr.i2bc.paris-saclay.fr/crispr/.
The single guide nucleic acid or dual-guide nucleic acid can be synthesized chemically or via in vitro transcription. Assays for determining sequence-specific binding between a nucleic acid-guided nuclease and a guide nucleic acid are known in the art and include, but are not limited to, in vitro binding assays between an expressed nucleic acid-guided nuclease and the guide nucleic acid, which can be tagged with a detectable label (e.g., biotin) and used in a pull-down detection assay in which the nucleoprotein complex is captured via the detectable label (e.g., with streptavidin beads). A control guide nucleic acid with an unrelated sequence or structure to the guide nucleic acid can be used as a negative control for non-specific binding of the nucleic acid-guided nuclease to nucleic acids.
In certain embodiments, the nucleic acid-guided nuclease of the presently disclosed compositions and methods comprises a nuclease variant that functions as a nickase, wherein the nuclease comprises a mutation in comparison to the wild-type nuclease that results in the nuclease only being capable of cleaving a single strand of a double-stranded nucleic acid molecule, or lacks nuclease activity altogether (i.e., nuclease-dead).
A nuclease, such as a nucleic acid-guided nuclease, that functions as a nickase only comprises a single functioning nuclease domain. In some of these embodiments, additional nuclease domains have been mutated such that the nuclease activity of that particular domain is reduced or eliminated.
In other embodiments, the nuclease (e.g., RNA-guided nuclease) lacks nuclease activity completely and is referred to herein as nuclease-dead. In some of these embodiments, all nuclease domains within the nuclease have been mutated such that all nuclease activity of the polypeptide has been eliminated. Any method known in the art can be used to introduce mutations into one or more nuclease domains of a nucleic acid-guided nuclease, including those set forth in U.S. Publ. No. 2014/0068797 and U.S. Pat. No. 9,790,490, each of which is incorporated by reference in its entirety.
Any mutation within a nuclease domain that reduces or eliminates the nuclease activity can be used to generate a nucleic acid-guided nuclease having nickase activity or a nuclease-dead nucleic acid-guided nuclease. Such mutations are known in the art and include, but are not limited to, the D10A mutation within the RuvC domain or the H840A mutation within the HNH domain of the S. pyogenes Cas9, or a mutation at similar position(s) within another nucleic acid-guided nuclease when aligned for maximal homology with the S. pyogenes Cas9. Other positions within the nuclease domains of S. pyogenes Cas9 that can be mutated to generate a nickase or nuclease-dead protein include G12, G17, E762, N854, N863, H982, H983, and D986. Other mutations within a nuclease domain of a nucleic acid-guided nuclease that can lead to nickase or nuclease-dead proteins include D917A, E1006A, E1028A, D1227A, D1255A, and N1257A of the Francisella novicida Cpf1 protein, or mutations at similar position(s) within another nucleic acid-guided nuclease when aligned for maximal homology with the F. novicida Cpf1 protein (U.S. Pat. No. 9,790,490, which is incorporated by reference in its entirety).
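By way of illustration only, the relationship between inactivated nuclease domains and the resulting nickase or nuclease-dead phenotype can be sketched in Python. The domain assignments of D10A (RuvC) and H840A (HNH) are taken from the paragraph above; the function name, the labels it returns, and the simplifying assumption that each listed substitution fully inactivates its domain are hypothetical and are not part of the disclosure.

```python
# Substitutions and domains named in the text for S. pyogenes Cas9.
# Assumption for this sketch: each substitution fully inactivates its domain.
SPCAS9_DOMAIN_BY_MUTATION = {"D10A": "RuvC", "H840A": "HNH"}
SPCAS9_NUCLEASE_DOMAINS = frozenset({"RuvC", "HNH"})


def classify_variant(mutations):
    """Classify an S. pyogenes Cas9 variant from its set of substitutions.

    Returns 'fully active' (both domains intact), 'nickase' (exactly one
    functioning nuclease domain), or 'nuclease-dead' (no functioning domain).
    Substitutions not listed in the table are ignored.
    """
    inactivated = {
        SPCAS9_DOMAIN_BY_MUTATION[m]
        for m in mutations
        if m in SPCAS9_DOMAIN_BY_MUTATION
    }
    functioning = SPCAS9_NUCLEASE_DOMAINS - inactivated
    if not functioning:
        return "nuclease-dead"
    if len(functioning) == 1:
        return "nickase"
    return "fully active"
```

For example, under these assumptions `classify_variant({"D10A"})` yields a nickase, whereas `classify_variant({"D10A", "H840A"})` yields a nuclease-dead variant.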
Nucleic acid-guided nucleases comprising a nuclease-dead domain can further comprise a domain capable of modifying a polynucleotide. Non-limiting examples of modifying domains that may be fused to a nuclease-dead domain include a transcriptional activation or repression domain, a base editing domain, and an epigenetic modification domain. In other embodiments, the nucleic acid-guided nuclease comprising a nuclease-dead domain further comprises a detectable label that can aid in detecting the presence of the target sequence.
An epigenetic modification domain that can be fused to a nuclease-dead domain can serve to covalently modify DNA or histone proteins to alter histone structure and/or chromosomal structure without altering the DNA sequence itself, leading to changes in gene expression (upregulation or downregulation). Non-limiting examples of epigenetic modifications that can be induced by nucleic acid-guided nuclease include the following alterations in histone residues and the reverse reactions thereof: sumoylation, methylation of arginine or lysine residues, acetylation or ubiquitination of lysine residues, phosphorylation of serine and/or threonine residues; and the following alterations of DNA and the reverse reactions thereof: methylation or hydroxymethylation of cytosine residues. Non-limiting examples of epigenetic modification domains thus include histone acetyltransferase domains, histone deacetylation domains, histone methyltransferase domains, histone demethylase domains, DNA methyltransferase domains, and DNA demethylase domains. In some embodiments, the nucleic acid-guided nuclease comprises a transcriptional activation domain that activates the transcription of at least one adjacent gene through the interaction with transcriptional control elements and/or transcriptional regulatory proteins, such as transcription factors or RNA polymerases. Suitable transcriptional activation domains are known in the art and include, but are not limited to, VP16 activation domains.
In other embodiments, the nucleic acid-guided nuclease comprises a transcriptional repressor domain, which can also interact with transcriptional control elements and/or transcriptional regulatory proteins, such as transcription factors or RNA polymerases, to reduce or terminate transcription of at least one adjacent gene. Suitable transcriptional repression domains are known in the art and include, but are not limited to, IKB and KRAB domains.
In still other embodiments, the nucleic acid-guided nuclease comprising a nuclease-dead domain further comprises a detectable label that can aid in detecting the presence of the target sequence, which may be a disease-associated sequence. A detectable label is a molecule that can be visualized or otherwise observed. The detectable label may be fused to the nucleic acid-guided nuclease as a fusion protein (e.g., fluorescent protein) or may be a small molecule conjugated to the nuclease polypeptide that can be detected visually or by other means. Detectable labels that can be fused to the presently disclosed nucleic acid-guided nucleases as a fusion protein include any detectable protein domain, including but not limited to, a fluorescent protein or a protein domain that can be detected with a specific antibody. Non-limiting examples of fluorescent proteins include green fluorescent proteins (e.g., GFP, EGFP, ZsGreen1) and yellow fluorescent proteins (e.g., YFP, EYFP, ZsYellow1). Non-limiting examples of small molecule detectable labels include radioactive labels, such as ³H and ³⁵S.
The nucleic acid-guided nuclease can be delivered as part of a nucleoprotein (e.g., RNA-guided nuclease protein and guide RNA) into a cell as a nucleoprotein complex comprising the nucleic acid-guided nuclease bound to its guide nucleic acid. Alternatively, the nucleic acid-guided nuclease is delivered as a protein and the guide nucleic acid is provided separately. In certain embodiments, a guide RNA can be introduced into a target cell as an RNA molecule. The guide RNA can be transcribed in vitro or chemically synthesized. In other embodiments, a nucleotide sequence encoding the guide RNA is introduced into the cell. In some of these embodiments, the nucleotide sequence encoding the guide RNA is operably linked to a promoter (e.g., an RNA polymerase III promoter), which can be a native promoter or heterologous to the guide RNA-encoding nucleotide sequence. In specific embodiments, a nucleic acid sequence encoding the guide RNA and RNA-guided nuclease operably linked to a promoter can be delivered on a vector, such as the expression vector described in detail herein.
In certain embodiments, the nucleoprotein can comprise additional amino acid sequences, such as at least one nuclear localization sequence (NLS). Nuclear localization sequences enhance transport of the nucleic acid-guided nuclease into the nucleus of a cell. Proteins that are imported into the nucleus bind to one or more of the proteins within the nuclear pore complex, such as importin/karyopherin proteins, which generally bind best to lysine and arginine residues. The best characterized pathway for nuclear localization involves a short peptide sequence that binds to the importin-α protein. These nuclear localization sequences often comprise stretches of basic amino acids and, given that there are two such binding sites on importin-α, two basic sequences separated by at least 10 amino acids can make up a bipartite NLS. The second most characterized pathway of nuclear import involves proteins that bind to the importin-β1 protein, such as the HIV-TAT and HIV-REV proteins, which use the sequences RKKRRQRRR (SEQ ID NO: 1) and RQARRNRRRRWR (SEQ ID NO: 2), respectively, to bind to importin-β1. Other nuclear localization sequences are known in the art (see, e.g., Lange et al., J. Biol. Chem. (2007) 282:5101-5105). The NLS can be the naturally-occurring NLS of the nucleic acid-guided nuclease or a heterologous NLS. As used herein, “heterologous” in reference to a sequence is a sequence that originates from a foreign species, or, if from the same species, is substantially modified from its native form in composition and/or genomic locus by deliberate human intervention. Non-limiting examples of NLS sequences that can be used to enhance the nuclear localization of the nucleic acid-guided nuclease or nucleoprotein include the NLS of the SV40 Large T-antigen and c-Myc. In certain embodiments, the NLS comprises the amino acid sequence PKKKRKV (SEQ ID NO: 3).
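As a non-limiting illustration, the presence of the NLS sequences recited above in a candidate polypeptide can be checked with a simple exact-match scan. The three motifs are SEQ ID NOs: 1-3 as given in the preceding paragraph; the function name, the dictionary, and the example polypeptide used afterwards are hypothetical, and an exact-match scan of this kind does not detect bipartite or divergent NLS sequences.

```python
# NLS sequences recited in the text (one-letter amino acid code).
NLS_MOTIFS = {
    "SEQ ID NO: 1": "RKKRRQRRR",     # HIV-TAT sequence
    "SEQ ID NO: 2": "RQARRNRRRRWR",  # HIV-REV sequence
    "SEQ ID NO: 3": "PKKKRKV",       # NLS recited for certain embodiments
}


def find_nls(protein):
    """Return (motif name, 0-based start index) for every exact NLS match.

    Results are sorted by position in the polypeptide. Overlapping
    occurrences of the same motif are all reported.
    """
    hits = []
    for name, motif in NLS_MOTIFS.items():
        start = protein.find(motif)
        while start != -1:
            hits.append((name, start))
            start = protein.find(motif, start + 1)
    return sorted(hits, key=lambda hit: hit[1])
```

Applied to the made-up sequence `"MPKKKRKVAAARKKRRQRRR"`, the scan reports SEQ ID NO: 3 at index 1 and SEQ ID NO: 1 at index 11.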
A nucleoprotein can comprise more than one NLS, such as two, three, four, five, six, or more NLS sequences. Each of the multiple NLSs can be unique in sequence or there can be more than one of the same NLS sequence used. The NLS can be on the amino-terminal (N-terminal) end of the nucleoprotein, the carboxy-terminal (C-terminal) end, or both the N-terminal and C-terminal ends of the nucleoprotein. In certain embodiments, the nucleoprotein comprises two NLS sequences on its N-terminal end. In other embodiments, the nucleoprotein comprises two NLS sequences on the C-terminal end of the site-directed polypeptide. In still other embodiments, the site-directed polypeptide comprises four NLS sequences on its N-terminal end and two NLS sequences on its C-terminal end.
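The N-terminal and C-terminal NLS arrangements described above can be sketched as a simple sequence concatenation. This is a minimal illustration only: the helper name and its parameters are hypothetical, the same NLS (SEQ ID NO: 3) is reused at every position, and features of a real construct such as an initiator methionine or spacer residues between NLS copies are deliberately omitted.

```python
NLS_SEQ_ID_3 = "PKKKRKV"  # NLS sequence recited in the text (SEQ ID NO: 3)


def add_nls(core, n_term=0, c_term=0, nls=NLS_SEQ_ID_3):
    """Return `core` flanked by `n_term` N-terminal and `c_term` C-terminal NLS copies.

    Sequences are one-letter amino acid strings written N-terminus first.
    """
    if n_term < 0 or c_term < 0:
        raise ValueError("NLS copy numbers must be non-negative")
    return nls * n_term + core + nls * c_term
```

For instance, `add_nls(core, n_term=4, c_term=2)` corresponds to the embodiment with four NLS sequences on the N-terminal end and two on the C-terminal end.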
As described above, the site-directed modifying polypeptide contains a conjugation moiety that allows the protein to conjugate to a test agent. Conjugation moieties include, but are not limited to, Protein A, SpyCatcher tag, Halo-tag, Sortase, mono-avidin, ACP tag, a SNAP tag, or any other conjugation moieties known in the art. In one embodiment, the conjugation moiety is selected from Protein A, CBP, MBP, GST, poly(His), biotin/streptavidin, V5-tag, Myc-tag, HA-tag, NE-tag, His-tag, Flag tag, Halo-tag, Snap-tag, Fc-tag, Nus-tag, BCCP, thioredoxin, SnoopTag, SpyTag, SpyCatcher, Isopeptag, SBP-tag, S-tag, AviTag, and calmodulin. Exemplary binding moiety pairings include (i) streptavidin-binding peptide (SBP) and streptavidin (STV), (ii) biotin and EMA (enhanced monomeric avidin), (iii) SpyTag (ST) and SpyCatcher (SC), (iv) Halo-tag and Halo-tag ligand, (v) and SNAP-Tag, (vi) Myc tag and anti-Myc immunoglobulins, (vii) FLAG tag and anti-FLAG immunoglobulins, and (ix) ybbR tag and coenzyme A groups.
Additional tags can be operably linked to the site-directed modifying polypeptide, e.g., at the N-terminal end of the amino acid sequence. In certain embodiments, the nucleic acid-guided nuclease comprises (1) the self-cleaving N-terminal portions (NPro) of polyproteins from pestiviruses such as Hog cholera virus (strain Alfort), also called classical swine fever virus (CSFV), from border disease virus (BDV), or from bovine viral diarrhea virus (BVDV), or fragments thereof; (2) the N-terminal portion of carboxypeptidase B (‘CPB’) precursor (amino acids 21-110 of Sus scrofa CPB, SwissProt P09955.5), and fragments thereof; and/or (3) small ubiquitin-related modifier (SUMO) (SwissProt P55853.1). Any N-terminal tag may itself be further tagged at its N-terminus with a polyhistidine tag such as 6xHis (SEQ ID NO: 4), allowing for initial purification of the tagged polypeptide on a nickel column, followed by self-cleavage of tags such as NPro, or enzymatic cleavage of the CPB or SUMO N-terminal tag by trypsin or SUMO protease, respectively, and elution of the freed polypeptide from the column. In one embodiment of this method, the SUMO protease polypeptides are also fusion proteins comprising 6xHis tags, allowing for a two-step purification. In the first step, the expressed 6xHis-SUMO-tagged nucleic acid-guided nuclease is purified by binding to a nickel column, followed by elution from the column. In the second step, the SUMO tags on the purified polypeptides are cleaved by the 6xHis-tagged SUMO protease, and the SUMO protease-nucleic acid-guided nuclease reaction mixture is run through a second nickel column, which retains the SUMO protease but allows the now untagged nucleic acid-guided nuclease to flow through. As another example, fluorescent protein sequences can be expressed as part of a polypeptide gene product, with the amino acid sequence for the fluorescent protein preferably added at the N- or C-terminal end of the amino acid sequence of the polypeptide gene product.
The resulting fusion protein fluoresces when exposed to light of certain wavelengths, allowing the presence of the fusion protein to be detected visually. A well-known fluorescent protein is the green fluorescent protein of Aequorea victoria, and many other fluorescent proteins are commercially available, along with nucleotide sequences encoding them.
In some embodiments, the expression vectors herein comprise from 5’ to 3’ a promoter; a ribosome binding site; a nucleic acid-guided nuclease; and a detectable label, conjugation moiety, or other tag described herein. In other embodiments, the expression vectors herein comprise from 5’ to 3’ a promoter; a ribosome binding site; a detectable label, conjugation moiety, or other tag described herein; and a nucleic acid-guided nuclease. In some embodiments, the detectable label or other tag is operably linked to the nucleic acid-guided nuclease by a cleavable linker (e.g., a SUMO protease cleavable linker).
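The two 5’ to 3’ element orders described above can be summarized, purely as an illustrative sketch, by a small helper that lists the elements of the expression cassette in order. The function name, its parameters, and the generic element label "tag" (standing in for a detectable label, conjugation moiety, or other tag) are hypothetical; placing the optional cleavable linker between the tag and the nuclease is an assumption consistent with the linker operably linking the two.

```python
def vector_layout(tag_first=False, cleavable_linker=False):
    """Return the expression cassette elements in 5' to 3' order.

    tag_first=False: promoter, RBS, nuclease, tag.
    tag_first=True:  promoter, RBS, tag, nuclease.
    cleavable_linker=True inserts a linker between the tag and the nuclease.
    """
    nuclease = "nucleic acid-guided nuclease"
    tag = "tag"
    coding = [tag, nuclease] if tag_first else [nuclease, tag]
    if cleavable_linker:
        # The linker always sits between the two coding elements.
        coding.insert(1, "cleavable linker")
    return ["promoter", "ribosome binding site"] + coding
```

Calling `vector_layout(tag_first=True, cleavable_linker=True)` gives the arrangement in which an N-terminal tag can later be removed from the nuclease at the linker.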
EXAMPLE
The invention will be more fully understood by reference to the following examples. They should not, however, be construed as limiting the scope of the invention. All literature and patent citations are incorporated herein by reference.
Example 1: Gel binding assay for cell internalization agents that can effectively bind to a site-directed modifying polypeptide
An anti-FAP antibody (28H1)-SpyTag construct (where the SpyTag was genetically encoded on the C-terminus of the light chain) was incubated with spycatcher-Cas9 (where the SpyCatcher was genetically encoded on the N-terminus) in Expi293 media (or size exclusion column buffer) for 30 minutes at room temperature (Fig. 1, lane 8). The corresponding reaction was applied to the well of a non-reducing SDS-PAGE gel and subjected to electrophoresis alongside various controls: FAP-SpyTag alone (lane 2); spycatcher-Cas9 alone (lane 3); FAP=spycatcher-Cas9 (lane 4); Expi293 media alone (lane 5); FAP in Expi293 media (lane 6); and spycatcher-Cas9 in Expi293 media (lane 7) (see Fig. 1; note that “=” indicates a covalent conjugation between two molecules). The results indicate that incubating the anti-FAP antibody (28H1)-SpyTag construct with spycatcher-Cas9 in Expi293 media results in a reduction in electrophoretic mobility that corresponds to formation of a stable complex of the anti-FAP antibody and Cas9.
The results suggest this assay can be used to determine the ability of a cell internalization agent (e.g., an anti-FAP antibody) to effectively bind the cell surface and internalize a site-directed modifying polypeptide (e.g., Cas9).

Claims

What is claimed is:
1. A method for identifying a cell internalizing agent, the method comprising providing a population of target cells, wherein each target cell comprises a reporter construct that comprises a nucleic acid that provides a phenotype when activated or repressed, wherein each target cell is a eukaryotic cell that expresses a test agent on its cell surface, and wherein the test agent comprises a first conjugation moiety; contacting the population of target cells with a nucleoprotein comprising a site-directed modifying polypeptide and a guide RNA (gRNA) that specifically hybridizes to a target nucleic acid within the target cell, wherein the nucleoprotein comprises a second conjugation moiety that binds to the first conjugation moiety of the test agent; and selecting a modified target cell having the phenotype observed with activation or repression of the reporter construct, thereby identifying the cell internalizing agent.
2. The method of claim 1, wherein the target nucleic acid is in the nucleus of the target cell.
3. The method of claim 1 or 2, wherein the test agent is a protein, a lipid, or a carbohydrate.
4. The method of claim 3, wherein the protein is an antigen-binding moiety.
5. The method of claim 4, wherein the antigen-binding moiety is an antibody or an antibody fragment thereof.
6. The method of claim 4, wherein the antigen-binding moiety is a nanobody, a domain antibody, an scFv, a Fab, a diabody, a BiTE, a DART, a minibody, a F(ab’)2, an intrabody, or an antibody mimetic.
7. The method of claim 6, wherein the antibody mimetic is a fibronectin based binding molecule, an affilin, an affimer, an affitin, an alphabody, an affibody, a DARPin, an anticalin, an avimer, a fynomer, a Kunitz domain peptide, a monobody, a nanoCLAMP, a unibody, or a versabody, an aptamer, or a cyclotide.
8. The method of claim 3, wherein the protein is a cell-penetrating peptide (CPP).
9. The method of claim 3, wherein the protein is a ligand, or binding fragment thereof.
10. The method of any one of claims 1-9, wherein the site-directed modifying polypeptide is a Class 2 Cas polypeptide.
11. The method of claim 10, wherein the Class 2 Cas polypeptide is a Type II Cas polypeptide.
12. The method of claim 11, wherein the Type II Cas polypeptide is Cas9.
13. The method of any one of claims 1-12, wherein the site-directed modifying polypeptide is conjugated to the second binding moiety via a linker.
14. The method of claim 13, wherein the linker is a labile linker.
15. The method of claim 14, wherein the labile linker is pH sensitive.
16. The method of claim 14, wherein the labile linker is sensitive to reducing conditions.
17. The method of any one of claims 1-14 or 16, wherein the labile linker is a disulfide linker.
18. The method of any one of claims 1-13, wherein the linker is a hydrazone linker or a valine-citrate linker.
19. The method of any one of claims 1-18, wherein the modified target cell is selected by determining the RNA or protein expression level of the reporter construct.
20. The method of claim 19, wherein an increase in the RNA or protein expression level of a nucleic acid in the reporter construct relative to a control indicates internalization of the test agent.
21. The method of claim 19, wherein the RNA or protein expression level of a nucleic acid in the reporter construct is decreased or substantially eliminated upon internalization of the test agent.
22. The method of any one of claims 1-21, wherein the reporter construct comprises a nucleic acid encoding a selection marker.
23. The method of claim 22, wherein the selection marker is a thymidine kinase gene construct and the population of target cells is cultured in the presence of ganciclovir.
24. The method of claim 23, wherein the nucleic acid encoding a selection marker is an antibiotic resistance marker, a fluorescence marker, or a bioluminescence marker.
25. The method of claim 24, wherein the fluorescence marker is a green fluorescent protein (GFP), a yellow fluorescent protein (YFP), a red fluorescent protein (RFP), or a split GFP reporter.
26. The method of claim 25, wherein the modified target cell is selected by detecting a signal produced by reassembly of the split GFP reporter.
27. The method of any one of claims 1-26, wherein the modified target cell is identified using cell sorting.
28. The method of claim 27, wherein the cell sorting is fluorescence-activated cell sorting (FACS), magnetic-activated cell sorting (MACS), or microfluidic-based cell sorting.
29. The method of any one of claims 1-28, wherein the cell internalizing agent binds to a cell surface protein associated with a disorder.
30. The method of claim 29, wherein the cell surface protein is a tyrosine kinase, an epidermal growth factor receptor (EGFR), a platelet-derived growth factor receptor (PDGFR), a fibroblast growth factor receptor (FGFR), a hepatocyte growth factor receptor (HGFR), a nerve growth factor receptor (NGFR), CD3, CD4, Tim-3, CD278, TNFR-I, IL-1R, LT-betaR, IL-18R, CCR1, CD26, CD94, CD119, CD183, CD195, or DPIV.
31. The method of any one of claims 1-30, wherein the population of target cells comprises mammalian cells or yeast cells.
32. The method of claim 31, wherein the mammalian cells are a cell type selected from the group consisting of a COP cell, an L cell, a C127 cell, an Sp2/0 cell, an NS-0 cell, an NIH3T3 cell, a PC12 cell, a PC12h cell, a BHK cell, a CHO cell, a COS1 cell, a COS3 cell, a COS7 cell, a CV1 cell, a Vero cell, a HeLa cell, an HEK-293 cell, a PER.C6 cell, a cell derived from diploid fibroblasts, a myeloma cell, and a HepG2 cell.
33. The method of any one of claims 1-32, wherein the cell internalizing agent is identified by polymerase chain reaction (PCR) or by deep sequencing of a PCR-amplified nucleic acid derived from the modified target cell.
34. A method of screening a library of cells having a plurality of genotypes for a cell that produces a cell internalizing agent, the method comprising:
(a) providing an array with a library of protein-variant-producing cells that each express a test agent;
(b) incubating the array under conditions that allow for the production of test agents from the protein-variant-producing cells;
(c) providing a target cell comprising a reporter construct comprising a nucleic acid that provides a phenotype when activated or repressed;
(d) contacting the target cell with the test agent under conditions that allow for the test agent to bind to the target cell, wherein the test agent comprises a first conjugation moiety;
(e) contacting the target cell with a nucleoprotein comprising a site-directed modifying polypeptide and a guide RNA (gRNA) that specifically hybridizes to a target nucleic acid of the target cell, wherein a second conjugation moiety is conjugated to the site-directed modifying polypeptide such that the first conjugation moiety of the test agent binds to the second conjugation moiety of the site-directed modifying polypeptide;
(f) determining a result from the array, wherein the result identifies a target cell having a phenotype associated with activation or repression of the nucleic acid in the reporter construct; and
(g) extracting the target cell identified in (f), thereby obtaining the cell that produces the cell internalizing agent.
35. The method of claim 34, wherein the array is a microfluidic system, a microbubble system, or a microcavity array.
36. The method of claim 34, wherein the array is a microcavity array and step (g) comprises extracting the target cell with electromagnetic radiation.
37. The method of any one of claims 34-36, wherein the test agent is a protein, a lipid, or a carbohydrate.
38. The method of claim 37, wherein the protein is a peptide.
39. The method of claim 37, wherein the protein is an antigen-binding moiety.
40. The method of claim 39, wherein the antigen-binding moiety is an antibody or an antibody fragment thereof.
41. The method of claim 39, wherein the antigen-binding moiety is a nanobody, a domain antibody, an scFv, a Fab, a diabody, a BiTE, a DART, a minibody, a F(ab’)2, an intrabody, or an antibody mimetic.
42. The method of claim 41, wherein the antibody mimetic is an adnectin (i.e., fibronectin based binding molecules), an affilin, an affimer, an affitin, an alphabody, an affibody, a DARPin, an anticalin, an avimer, a fynomer, a Kunitz domain peptide, a monobody, a nanoCLAMP, a unibody, or a versabody, an aptamer, or a cyclotide.
43. The method of claim 37, wherein the protein is a cell-penetrating peptide (CPP).
44. The method of claim 37, wherein the protein is a ligand, or fragment thereof.
45. The method of any one of claims 34-44, wherein the site-directed modifying polypeptide is a Class 2 Cas polypeptide.
46. The method of claim 45, wherein the Class 2 Cas polypeptide is a Type II Cas polypeptide.
47. The method of claim 46, wherein the Type II Cas polypeptide is Cas9.
48. The method of any one of claims 34-47, wherein the site-directed modifying polypeptide is conjugated to the second conjugation moiety via a linker.
49. The method of claim 48, wherein the linker is a labile linker.
50. The method of claim 49, wherein the labile linker is pH sensitive.
51. The method of claim 49, wherein the labile linker is sensitive to reducing conditions.
52. The method of any one of claims 49 or 51, wherein the labile linker is a disulfide linker.
53. The method of claim 48, wherein the linker is a hydrazone linker or a valine-citrate linker.
54. The method of any one of claims 34-53, wherein the result of (f) is assayed by determining an RNA or protein expression level of the nucleic acid in the reporter construct.
55. The method of claim 54, wherein an increase in the RNA or protein expression level of the nucleic acid indicates internalization of the test agent.
56. The method of claim 55, wherein the nucleic acid encodes a selection marker.
57. The method of claim 56, wherein the selection marker is an antibiotic resistance marker, a fluorescence marker, or a bioluminescence marker.
58. The method of claim 57, wherein the fluorescence marker is green fluorescent protein (GFP), yellow fluorescent protein (YFP), red fluorescent protein (RFP), or a split GFP reporter.
59. The method of claim 58, further comprising detecting a signal produced by reassembly of the split GFP reporter.
60. The method of claim 54, wherein a decrease in the RNA or protein expression level of the nucleic acid indicates internalization of the test agent.
61. The method of claim 56, wherein the selection marker is a nucleic acid encoding a thymidine kinase gene construct and the target cells are cultured in the presence of ganciclovir.
62. The method of any one of claims 34-53, wherein step (f) comprises measuring a signal produced from a labeled antibody capable of specifically binding to a cell surface protein expressed by the reporter construct.
63. The method of claim 62, wherein the presence of the signal indicates internalization of the test agent.
64. The method of claim 62, wherein the absence of the signal indicates internalization of the test agent.
65. The method of any one of claims 34-64, wherein the target cell is a mammalian cell or a yeast cell.
66. The method of claim 65, wherein the mammalian cell is a cell type selected from the group consisting of a COP cell, an L cell, a C127 cell, an Sp2/0 cell, an NS-0 cell, an NIH3T3 cell, a PC12 cell, a PC12h cell, a BHK cell, a CHO cell, a COS1 cell, a COS3 cell, a COS7 cell, a CV1 cell, a Vero cell, a HeLa cell, an HEK-293 cell, a PER.C6 cell, a cell derived from diploid fibroblasts, a myeloma cell, and a HepG2 cell.
67. The method of any one of claims 34-66, wherein the cell internalizing agent of step (g) is identified by sequencing a nucleic acid in the target cell.
68. The method of claim 67, wherein the step of identifying the cell internalizing agent comprises detecting using polymerase chain reaction (PCR) or deep sequencing of PCR-amplified nucleic acid from the cell that produces the cell internalizing agent.
69. The method of any one of claims 1-68, wherein the cell internalization agent binds to a cell surface antigen associated with a disease.
70. The method of claim 69, wherein the disease is selected from the group consisting of cancer, autoimmune disease, and a hereditary genetic disease.
71. The method of claim 69, wherein the cell surface antigen is selected from the group consisting of HLA-DR, CD44, CD22, CD3, CD20, CD33, CD32, CD47, CD59, CD54, CD25, AchR, CD70, CD74, CTLA4, EGFR, HER2, and EpCam.
72. The method of claim 23 or 61, wherein the modified target cell is selected by detecting cells capable of propagating in the presence of ganciclovir.
PCT/US2021/049814 2020-09-11 2021-09-10 Compositions and methods for screening cell internalizing agents WO2022056231A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202063077042P 2020-09-11 2020-09-11
US63/077,042 2020-09-11

Publications (1)

Publication Number Publication Date
WO2022056231A1 true WO2022056231A1 (en) 2022-03-17

Family

ID=80629866

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2021/049814 WO2022056231A1 (en) 2020-09-11 2021-09-10 Compositions and methods for screening cell internalizing agents

Country Status (1)

Country Link
WO (1) WO2022056231A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030219826A1 (en) * 1999-09-01 2003-11-27 Robbins Paul D. Identification of peptides that facilitate uptake and cytoplasmic and/or nuclear transport of proteins, DNA and viruses
US20160244749A1 (en) * 2015-02-22 2016-08-25 The Board Of Trustees Of The Leland Stanford Junior University Micro-screening Apparatus, Process, and Products
WO2019051428A1 (en) * 2017-09-11 2019-03-14 The Regents Of The University Of California Antibody-mediated delivery of cas9 to mammalian cells

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21867657

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 01/08/2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21867657

Country of ref document: EP

Kind code of ref document: A1