WO2019134561A1 - High-efficiency in vivo knock-in using CRISPR - Google Patents

High-efficiency in vivo knock-in using CRISPR

Publication number
WO2019134561A1
Authority
WO
WIPO (PCT)
Prior art keywords
vector
sgrna
host cell
gene
locus
Application number
PCT/CN2018/123517
Other languages
English (en)
Inventor
Bo Feng
Xiangjun HE
Original Assignee
The Chinese University Of Hong Kong
Application filed by The Chinese University Of Hong Kong
Priority to CN201880090785.2A (published as CN111886341A)
Publication of WO2019134561A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14: Hydrolases (3)
    • C12N9/16: Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22: Ribonucleases RNAses, DNAses
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K: PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00: Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • A61K48/005: Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'active' part of the composition delivered, i.e. the nucleic acid delivered
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/11: DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/87: Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90: Stable introduction of foreign DNA into chromosome
    • C12N15/902: Stable introduction of foreign DNA into chromosome using homologous recombination
    • C12N15/907: Stable introduction of foreign DNA into chromosome using homologous recombination in mammalian cells
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00: Structure or type of the nucleic acid
    • C12N2310/10: Type of nucleic acid
    • C12N2310/20: Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2750/00: ssDNA viruses
    • C12N2750/00011: Details
    • C12N2750/14011: Parvoviridae
    • C12N2750/14111: Dependovirus, e.g. adenoassociated viruses
    • C12N2750/14141: Use of virus, viral particle or viral elements as a vector
    • C12N2750/14143: Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector

Definitions

  • Genome editing tools such as zinc-finger nucleases (ZFNs) (Urnov et al., Nature Reviews Genetics, 11: 636-46 (2010)), transcription activator-like effector nucleases (TALENs) (Sung et al., Nature Biotechnology, 31: 23-24 (2013)), and clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR associated protein 9 (Cas9) (Cho et al., Nature Biotechnology, 31: 230-2 (2013)), have proven to be effective tools for generating genetically modified cells and animals.
  • Genome editing tools can act in a variety of different ways including altering the genome of a cell by deleting, inserting, mutating, or substituting specific nucleic acid sequences. These genomic alterations can be gene- or location-specific.
  • The above engineered nucleases can cleave defined DNA sequences within a cell genome to generate DNA double-strand breaks (DSBs) and stimulate DNA repair through non-homologous end-joining (NHEJ) or homology-directed repair (HDR) mechanisms (Kim et al., Nature Reviews Genetics, 15: 321-34, (2014)).
  • “knocking out” a gene is considered easier than “knocking in” a gene because DSBs introduced into the genome by a nuclease complex are repaired more quickly by NHEJ.
  • NHEJ-induced mutations occur more frequently than HDR-mediated genetic modifications because NHEJ happens throughout the cell cycle, while HDR occurs only in S and G2 phase (Deriano and Roth, Annu. Rev. Genet., 47: 433-55, (2013)).
  • the efficiency of HDR-mediated precise genetic modifications is relatively low compared with NHEJ-induced mutations (Lombardo et al., Nat. Biotechnol., 25, 1298-1306, 2007).
  • Gene therapy is a promising strategy to treat inherited diseases but has been largely hindered in the past two decades due to inefficient delivery of therapeutic genes into live tissues as well as unstable and short-lived expression provided by current technologies.
  • gene disruption involves stimulation of NHEJ to create targeted indels in genetic elements, often resulting in loss of function (LOF) mutations that are beneficial to subjects.
  • gene correction uses HDR to directly reverse a disease causing mutation, restoring function while preserving physiological regulation of the corrected element.
  • HDR may also be used to insert a therapeutic transgene into a defined ‘safe harbor’ locus in the genome to recover missing gene function.
  • the efficiency of in vivo knock-in of therapeutic genes by HDR is very low.
  • NHEJ is a DNA repair mechanism by which one or more nucleotides are inserted or deleted (indels) from the ends of a DSB to facilitate joining of the broken ends (Lieber, Annu. Rev. Biochem., 79, 181-211, (2010)).
  • NHEJ-mediated DSB repair often leads to a frameshift mutation introducing a premature stop codon or resulting in nonsense-mediated decay of the mRNA transcripts, effectively “knocking out” the gene.
  • NHEJ was not considered for inserting genes into the genome of a cell because, in order to obtain a functional gene and resulting protein, greater insertion precision is needed to avoid frameshift mutations.
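The frameshift concern described above can be illustrated with a minimal sketch. It assumes a deliberately simplified model in which an indel inside a coding sequence preserves the reading frame only when its net length change is a multiple of three. The function name is hypothetical and not part of the disclosure, and the model ignores splice sites, in-frame stop codons, and loss of essential residues.

```python
def preserves_reading_frame(inserted_bp: int, deleted_bp: int) -> bool:
    """Return True when the net length change of an indel is a multiple of three.

    Simplified assumption: the indel lies entirely within a coding sequence,
    so only the net change in length decides whether downstream codons shift.
    """
    if inserted_bp < 0 or deleted_bp < 0:
        raise ValueError("base-pair counts must be non-negative")
    return (inserted_bp - deleted_bp) % 3 == 0


if __name__ == "__main__":
    # A 1-bp deletion shifts the frame; a 3-bp deletion does not.
    for ins, dele in [(0, 1), (0, 3), (2, 0), (4, 1)]:
        status = "in frame" if preserves_reading_frame(ins, dele) else "frameshift"
        print(f"+{ins}/-{dele} bp: {status}")
```

Under this model two of every three random indel lengths shift the frame, which is consistent with NHEJ being favored for gene disruption rather than for precise insertion within coding sequence.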
  • the inventors demonstrate targeted in vivo “knock-in” of large DNA fragments, including genes, into somatic tissues of living organisms through CRISPR/Cas9-coupled NHEJ, using a gene-specific donor vector in combination with a locus-specific helper and a universal helper construct to efficiently insert desired genes into a pre-selected locus at the genome of a host cell.
  • the development of targeted genomic integration of large DNA fragments, such as the genes used herein, provides the potential for long-term gene expression, thereby resulting in the production of sufficient levels of encoded proteins or mRNAs to treat various diseases in living organisms.
  • the invention provides a method for inserting a desired gene at a selected locus in a host cell genome, wherein the host cell is within a live organism, for example, at the 3’UTR of the selected genetic locus.
  • the method comprises the step of contacting the host cell or the live organism with: (i) a donor vector, which encodes the desired gene, optionally an internal ribosome entry site (ires) element at the 5' end of the gene, and one or two sgRNA target sites; (ii) a locus-specific helper vector, which encodes at least one sgRNA (optionally more than one, e.g., two or more) that is complementary to a selected nucleic acid sequence in the host cell genome; (iii) a universal helper vector, which encodes at least one sgRNA that is complementary to the one or two sgRNA target sites in the donor vector, for example, at 3’UTR of the selected gene locus; and (iv)
  • the Cas or Cpf1 protein in (iv) may be provided either in its protein form or in the form of its encoding polynucleotide sequence (e.g., an expression cassette or vector comprising a polynucleotide sequence encoding the Cas or Cpf1 protein and capable of directing the expression of the protein).
  • the locus-specific helper vector and universal helper vector could be combined into one vector to carry both locus-specific sgRNA(s) and the universal sgRNA.
  • the host cell is within the liver of a live organism.
  • the host cell is a hematopoietic stem cell of the live organism.
  • the host cell is a somatic cell within a live organism.
  • the method further comprises detecting a protein or RNA encoded by the inserted desired gene from a sample obtained from the live organism.
  • the sample includes a tissue section, blood sample, biopsy, or cell lysate obtained from a tissue of the live organism.
  • the method further comprises evaluating functionality of the NHEJ-mediated knock-in desired gene from a sample obtained from the live organism.
  • evaluating the functionality of the NHEJ-mediated knock-in desired gene includes assessing the level, amount or relative amount of gene expression products (e.g., mRNA and protein) in a tissue of live organism.
  • the donor vector has a single (i.e., only one) sgRNA target site.
  • the donor vector has two sgRNA target sites, wherein each of the sgRNA target sites is complementary to different nucleic acid sequence in the host cell genome.
  • the two sgRNA target sites can be designed such that they target the same genomic region (e.g., the same intron, exon, or same UTR region) but target different nucleic acid sequences in that genomic region (e.g., upstream or downstream from each other) .
  • the donor vector has a single (i.e., only one) gene of interest.
  • the desired gene comprises any gene.
  • the desired gene is associated with an inherited disease.
  • the desired gene is a gene known to be directly responsible for development or onset of a genetic disease (e.g., insulin for diabetes or Factor IX for Hemophilia B) .
  • the donor vector can comprise two genes of interest.
  • the donor vector can comprise one or more genes of interest.
  • the locus-specific helper vector encodes Cas9 or Cpf1. In another embodiment, the universal helper vector encodes Cas9 or Cpf1. In one embodiment, either the locus-specific helper vector or the universal helper vector encodes the Cas protein.
  • the donor vector, locus-specific helper vector and/or universal helper vector is selected from a plasmid, AAV viral particle, adenovirus particle, lentivirus particle, and DNA-nanoparticle complex.
  • the donor vector is a plasmid or AAV viral particle.
  • the locus-specific helper vector is a plasmid or AAV viral particle.
  • the universal helper vector is a plasmid or AAV viral particle.
  • the donor vector, locus-specific helper vector and/or universal helper vector are all plasmid or all viral vectors.
  • the donor vector, locus-specific helper vector and/or universal helper vector are all AAV viral vectors.
  • the live organism is a human or non-human animal.
  • the live organism is a human with diabetes, hemophilia, sickle cell anemia, cystic fibrosis, Duchenne muscular dystrophy, hemochromatosis, congenital deafness, familial hypercholesterolemia, Huntington’s disease, Tay-Sachs disease or phenylketonuria.
  • the live organism is a human diagnosed, suspected of having, or suffering from a genetically inherited disease.
  • the live organism is a human diagnosed, suspected of having, or suffering from hemophilia B.
  • the live organism is a human diagnosed, suspected of having, or suffering from diabetes.
  • the donor vector encodes a human gene. In one embodiment, the donor vector encodes a human gene and a single sgRNA target site. In one embodiment, the donor vector encodes one or more human genes and one or more sgRNA target sites. In one embodiment, the donor vector encodes a hINS gene and a single sgRNA target site. In some embodiments, the donor vector encodes a hINS gene and two sgRNA target sites. In another embodiment, the donor vector encodes a hF9 gene and a single sgRNA target site. In some embodiments, the donor vector encodes a hF9 gene and two sgRNA target sites.
  • the selected locus for insertion of the desired gene is Actb, Alb or GAPDH.
  • the locus-specific helper vector and the universal helper vector each encode a single sgRNA.
  • an sgRNA of the locus-specific helper vector and the universal helper vector comprise a nucleic acid region of about 20 nucleotides that is complementary to the selected nucleic acid sequence in the host cell genome.
  • the invention provides a method for treating a somatic tissue disease in a subject.
  • the method comprises the step of administering to the subject in need thereof a therapeutically effective amount of: (a) a donor vector encoding a desired gene and one or two sgRNA target sites; (b) a locus-specific helper vector encoding at least one sgRNA (optionally more than one, e.g., two or more) that is complementary to a selected nucleic acid sequence in the host cell genome; (c) a universal helper vector encoding at least one sgRNA that is complementary to the one or two sgRNA target sites in the donor vector; and (d) a Cas or Cpf1 protein if the locus-specific helper vector or universal helper vector do not encode the Cas or Cpf1 protein, to treat the somatic tissue disease in the subject.
  • the Cas or Cpf1 protein in (d) may be provided either in its protein form or in the form of its encoding polynucleotide sequence (e.g., an expression cassette or vector comprising a polynucleotide sequence encoding the Cas or Cpf1 protein and capable of directing the expression of the protein).
  • the locus-specific helper vector and universal helper vector could be combined into one vector to carry both locus-specific sgRNA(s) and the universal sgRNA.
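The component logic of the method above (a separate Cas or Cpf1 source is needed only when neither helper vector encodes the nuclease) can be sketched as a small data model. This is an illustrative bookkeeping aid under stated assumptions, not part of the disclosure: the class, field, and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HelperVector:
    """Hypothetical record for a locus-specific or universal helper vector."""
    name: str
    sgrna_count: int                # the text requires at least one sgRNA per helper
    encodes_nuclease: bool = False  # True if the vector carries Cas9 or Cpf1

    def __post_init__(self) -> None:
        if self.sgrna_count < 1:
            raise ValueError(f"{self.name}: a helper vector encodes at least one sgRNA")


def needs_separate_nuclease(locus_helper: HelperVector,
                            universal_helper: HelperVector) -> bool:
    """A separate Cas/Cpf1 protein or expression construct is required
    only when neither helper vector encodes the nuclease."""
    return not (locus_helper.encodes_nuclease or universal_helper.encodes_nuclease)


if __name__ == "__main__":
    locus = HelperVector("locus-specific helper", sgrna_count=1, encodes_nuclease=True)
    universal = HelperVector("universal helper", sgrna_count=1)
    print(needs_separate_nuclease(locus, universal))  # nuclease already encoded
```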
  • the donor vector has a single sgRNA target site.
  • the donor vector has two sgRNA target sites, wherein each of the sgRNA target sites is complementary to different nucleic acid sequence in the host cell genome.
  • the two sgRNA target sites can be designed such that they target the same genomic region (e.g., the same intron, exon, or same UTR region) but target different nucleic acid sequences in that genomic region (e.g., upstream or downstream from each other) .
  • the method further comprises detecting expression of a gene expression product encoded by the desired gene in a sample from the subject.
  • the gene expression products include mRNA transcripts and proteins expressed by the desired gene.
  • the method further comprises confirming the expression of the RNA or protein encoded by the desired gene is sufficient to treat the somatic tissue disease.
  • confirming the expression of the RNA or protein encoded by the desired gene can include detecting (e.g., visualizing), measuring, or quantitating the level, amount or relative amount of RNA or protein encoded by the desired gene for a period of time after administration to the subject.
  • the confirming can include one or more gene expression assays or one or more protein assays.
  • the confirming can include visualization of proteins by fluorescence microscopy.
  • the period of time after administration to the subject can range from 2 days to one year after administration to the subject.
  • the period of time after administration can include regular intervals, such as once a week, once a month, bimonthly, or quarterly. In some embodiments, the period of time after administration is less than a month after administration and optionally, with a monthly or quarterly follow up to confirm continued expression of the gene products in the sample.
  • the somatic tissue disease is an inherited disease.
  • the somatic tissue disease is a metabolic disorder.
  • the somatic tissue disease is a disease caused by mutation of a single gene.
  • the inherited disease is selected from diabetes, hemophilia, sickle cell anemia, cystic fibrosis, Duchenne muscular dystrophy, hemochromatosis, congenital deafness, familial hypercholesterolemia, Huntington’s disease, Tay-Sachs disease and phenylketonuria.
  • the somatic tissue disease is hemophilia.
  • the somatic tissue disease is type 1 diabetes mellitus.
  • the invention provides a kit for treating a somatic tissue disease comprising at least these components: (i) a first container comprising a donor vector, wherein the donor vector encodes a desired gene and one or two sgRNA target sites; (ii) a second container comprising a locus-specific helper vector, wherein the locus-specific helper vector encodes at least one sgRNA (optionally more than one, e.g., two or more) that is complementary to a selected nucleic acid sequence in the host cell genome; and (iii) a third container comprising a universal helper vector, wherein the universal helper vector encodes at least one sgRNA that is complementary to the one or two sgRNA target sites in the donor vector.
  • the locus-specific helper vector and universal helper vector could be combined into one vector to carry both locus-specific sgRNA(s) and the universal sgRNA, and this combined vector is placed in a separate container in addition to the first container described above in (i).
  • the kit further comprises a Cas or Cpf1 protein or a nucleic acid encoding a Cas or Cpf1 protein (e.g., an expression cassette or vector comprising a polynucleotide sequence encoding the Cas or Cpf1 protein and capable of directing the expression of the protein).
  • the kit further comprises instructions for use of the kit or an instruction manual.
  • the kit further comprises a host cell.
  • the host cell in the kit is a host cell from a somatic tissue of a live organism.
  • the host cell in the kit is a germ cell.
  • the first, second and third containers can further comprise one or more additional reagents or preservatives for storing or preserving the donor vector, universal helper vector and locus-specific helper vector.
  • the donor vector encodes a single sgRNA target site.
  • both the universal helper vector and the locus-specific helper vector encode a single sgRNA.
  • the universal helper vector and the locus-specific helper vector both encode two sgRNAs, for a total of four different sgRNAs.
  • the invention provides a composition for genome editing in vivo comprising at least these components: (i) a donor vector, wherein the donor vector encodes a desired gene and one or two sgRNA target sites; (ii) a locus-specific helper vector, wherein the locus-specific helper vector encodes at least one small guide RNA (sgRNA) (optionally more than one, e.g., two or more) that is complementary to a selected nucleic acid sequence in a host cell genome; (iii) a universal helper vector, wherein the universal helper vector encodes at least one sgRNA that is complementary to the one or two sgRNA target sites in the donor vector; and (iv) a host cell, wherein the host cell is a somatic cell within a live organism.
  • the locus-specific helper vector and universal helper vector could be combined into one vector to carry both locus-specific sgRNA(s) and the universal sgRNA.
  • the host cell is an animal cell. In some embodiments, the host cell is a non-human animal cell such as a rodent, feline, canine, porcine, equine, or ovine cell. In some embodiments, the host cell is a liver, kidney, bone, or blood cell. In one embodiment, the host cell is a somatic cell within a live organism. In yet another embodiment, the host cell is a germ cell. In some embodiments, the host cell is within the liver of a live organism. In some embodiments, the host cell is a hematopoietic stem cell of the live organism.
  • FIGS. 1A-1E: NHEJ-mediated in vivo knock-in through hydrodynamic injection.
  • FIG. 1A is a schematic of hydrodynamic injection for plasmid-mediated in vivo knock-in.
  • FIG. 1B is a schematic of plasmid-mediated NHEJ knock-in of an ires-GFP gene at a position within the 3’-UTR of the mouse Actb locus.
  • FIG. 1C shows direct images of fresh mouse livers harvested 5 days after hydrodynamic injection of different plasmid combinations. The upper row and middle row show liver images from control groups, while the lower row shows a NHEJ-mediated knock-in group.
  • FIG. 1D shows gel electrophoresis images for the PCR products amplified from the integration junctions, confirming non-directional NHEJ-mediated knock-in in vivo. Primers used in PCR reactions are indicated in FIG. 1B.
  • FIG. 1E shows immunohistochemistry staining for reporter signals expressed by the GFP gene introduced through NHEJ-mediated in vivo knock-in.
  • the left panel shows a tissue sample from a mouse injected with an empty vector (control); and the right panel shows significant staining of the liver tissue from a mouse injected with plasmids for NHEJ-mediated knock-in, indicating the successful knock-in and expression of the reporter gene in hepatocytes.
  • FIGS. 2A-2G: NHEJ-mediated in vivo knock-in through hydrodynamic injection reverses type 1 diabetes mellitus (T1DM) in mice.
  • FIG. 2A is a schematic outlining an exemplary method for Streptozotocin (STZ) injection via the tail vein.
  • FIG. 2B presents bar graph data for body weight, blood glucose and plasma insulin levels measured before and after STZ treatment (at day 0 and day 7).
  • FIG. 2C is a schematic showing NHEJ-mediated knock-in of hINS at 3’-UTR of mouse Alb gene.
  • sg-Alb, sg-A and NHEJ-hINS donor constructs were co-injected into STZ-induced T1DM mice through hydrodynamic injection.
  • FIG. 2D is a bar graph showing glucose levels detected before (Day 0) and after (Day 7) NHEJ-mediated in vivo knock-in of hINS.
  • Two sgRNAs targeting Alb were used, named sg-Alb1 and sg-Alb2.
  • FIG. 2E shows the modulation of glucose levels over time during a glucose tolerance test (GTT) in wild-type mice, T1DM mice, or T1DM mice with NHEJ-mediated hINS knock-in.
  • FIG. 2F shows plasma insulin levels detected before (D0) and after (D4) NHEJ-mediated in vivo knock-in of hINS.
  • FIG. 2G shows immunohistofluorescence staining for liver tissues indicating the successful hINS knock-in and expression in hepatocytes. The images shown are tissue from a mock mouse (left) and an injected mouse (right) that carries a successful knock-in of the hINS gene.
  • FIGS. 3A-3D: NHEJ-mediated in vivo knock-in through hydrodynamic injection for human coagulation factor IX (FIX) expression to correct hemophilia B.
  • FIG. 3A is an exemplary schematic showing hydrodynamic injection of plasmids for transient expression (CMV-hF9) or NHEJ-mediated in vivo knock-in of hF9, the gene encoding human FIX.
  • FIG. 3B demonstrates successful transient expression of hF9 in mice. Shown in the bar graph is the concentration of human FIX detected in mouse serum after hydrodynamic injection at days 1, 2 and 5 (D1, D2 and D5).
  • FIG. 3C is an exemplary schematic showing NHEJ-mediated knock-in of hF9 at Alb 3’-UTR (upper panel).
  • Secreted human FIX in serum was detected by ELISA after NHEJ-mediated in vivo knock-in at days 1, 3, 5 and 7 (D1, D3, D5 and D7) (lower panel).
  • FIG. 3D shows immunohistofluorescence staining for liver tissues indicating the successful knock-in and expression of hF9 in mouse hepatocytes. The images shown are tissue from a mock mouse (left) and an injected mouse (right) that carries a successful knock-in of the hF9 gene.
  • FIGS. 4A-4D: AAV-coupled NHEJ-mediated knock-in under in vitro and in vivo conditions.
  • FIG. 4A is an exemplary schematic demonstrating AAV-coupled NHEJ-mediated knock-in in vitro in human cells (at GAPDH locus) or in mouse cells (at Actb locus).
  • FIG. 4B shows flow analysis of AAV-coupled NHEJ-mediated knock-in in human HEK293T cells. The presence of GFP positive cells in the AAV-donor panel represents successful knock-in of the eGFP gene.
  • FIG. 4C is an exemplary schematic showing AAV-mediated transient expression (AAV-CMV-GFP) or AAV-coupled NHEJ-mediated in vivo knock-in in a mouse.
  • FIG. 4D shows direct imaging of fresh mouse livers harvested at day 5 after AAV injection.
  • the upper row shows expression of AAV-CMV-GFP, indicating transduction efficiency.
  • the middle row shows GFP expression from AAV-coupled NHEJ-mediated in vivo knock-in, while the lower row shows liver images from the control group.
  • Bright field images at low magnification (0.7x) are shown in the left column, while the fluorescent images at low (0.7x) and high magnification (4x) are shown in the middle and right columns, respectively.
  • FIGS. 5A-5C: AAV-coupled NHEJ-mediated hF9 knock-in.
  • FIG. 5A is an exemplary schematic demonstrating AAV-coupled NHEJ-mediated knock-in of hF9 in mice (at Alb locus).
  • FIG. 5B shows ELISA results for plasma collected from mice with AAV-coupled NHEJ-mediated knock-in of hF9.
  • FIG. 5C shows quantitative RT-PCR data detecting the RNA expression of hF9 in liver tissue, which was collected 3 months after AAV injection.
  • “CRISPR protein,” “Cas,” and “CRISPR/Cas protein” refer to CRISPR-associated proteins (Cas) including, but not limited to, Class 1 Type I CRISPR-associated proteins, Class 1 Type III CRISPR-associated proteins, Class 1 Type IV CRISPR-associated proteins, Class 2 Type II CRISPR-associated proteins, Class 2 Type V CRISPR-associated proteins, and Class 2 Type VI CRISPR-associated proteins.
  • Class 2 Cas proteins include Cas9 proteins, Cas9-like proteins encoded by Cas9 orthologs, Cas9-like synthetic proteins, Cpf1 proteins, proteins encoded by Cpf1 orthologs, Cpf1-like synthetic proteins, C2c1 proteins, C2c2 proteins, C2c3 proteins, and variants and modifications thereof.
  • Cas proteins are Class 2 CRISPR-associated proteins, for example one or more Class 2 Type II CRISPR-associated proteins, such as Cas9, one or more Class 2 Type V CRISPR-associated proteins, such as Cpf1, or one or more Class 2 Type VI CRISPR-associated proteins, such as C2c2.
  • Cas proteins are one or more Class 2 Type II CRISPR-associated proteins, such as Cas9, or one or more Class 2 Type V CRISPR-associated proteins, such as Cpf1.
  • a Cas protein is capable of interacting with one or more polynucleotides (typically RNA) to form a nucleoprotein complex (typically, a ribonucleoprotein complex) .
  • Cas9 or “Cas9 protein” refers to a Cas9 wild-type protein derived from Class 2 Type II CRISPR-Cas9 systems, modifications of Cas9 proteins, variants of Cas9 proteins, Cas9 orthologs, and combinations thereof.
  • Cas9 proteins include, but are not limited to, Cas9 from Streptococcus pyogenes (UniProtKB: Q99ZW2 (CAS9_STRP1)), Streptococcus thermophilus (UniProtKB: G3ECR1 (CAS9_STRTR)), and Staphylococcus aureus (UniProtKB: J7RUA5 (CAS9_STAAU)).
  • Cas9 homologs can be identified using sequence similarity search methods known to one skilled in the art. Targeting specificity is determined by complementary base pairing of guide RNA (typically, a single guide RNA “sgRNA”) to the genomic locus and the protospacer adjacent motif (PAM).
  • Cas9 is the signature protein characteristic for Class 2 Type II CRISPR systems.
  • Cpf1 protein refers to a Cpf1 wild-type protein derived from Class 2 Type V CRISPR-Cpf1 systems, modifications of Cpf1 proteins, variants of Cpf1 proteins, Cpf1 orthologs, and combinations thereof.
  • Cpf1 proteins include, but are not limited to, Cpf1 from Acidaminococcus sp. (UniProtKB: U2UMQ6 (CPF1_ACISB)), and Francisella tularensis (UniProtKB: A0Q7Q2 (CPF1_FRATN)).
  • Cpf1 homologs can be identified using sequence similarity search methods known to one skilled in the art (e.g., see Cas9 targeting specificity above).
  • Cpf1 is the signature protein characteristic for Class 2 Type V CRISPR systems.
  • small-guide RNA refers to a short polynucleotide component capable of associating with a Cas protein (e.g., a Cas9 protein) .
  • a sgRNA is capable of forming a nucleoprotein complex with a Cas9 protein, wherein the complex (Cas9-sgRNA) is capable of targeting a nucleic acid sequence adjacent to the protospacer adjacent motif (PAM) sequence.
  • a sgRNA contains a segment of about 20 nucleotides complementary to a target nucleic acid sequence (i.e., DNA) , such that the Cas9-sgRNA complex directs Cas9 cleavage of the target nucleic acid sequence upon the sgRNA recognizing the complementary sequence in the target nucleic acid.
  • a sgRNA is approximately a 20-base sequence (ranging from about 10-50, 15-45, or 20-40, for example, 15, 20, 25, or 30 bases) specific to the target nucleic acid (i.e., target DNA) 5’ of a non-variable scaffold sequence.
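The roughly 20-nucleotide targeting segment and its PAM can be illustrated with a minimal sketch that scans a DNA string for candidate sites. It assumes the Streptococcus pyogenes Cas9 PAM, 5'-NGG-3' located immediately 3' of the protospacer, which is a well-known property of that enzyme rather than something stated in the text above. The function name is hypothetical, only the given strand is scanned, and no off-target or efficiency scoring is attempted.

```python
import re
from typing import List, Tuple


def find_spcas9_sites(dna: str, guide_len: int = 20) -> List[Tuple[int, str, str]]:
    """Return (start, protospacer, PAM) for every guide_len-nt window that is
    immediately followed by an NGG PAM on the given strand.

    A zero-width lookahead is used so that overlapping candidate sites are
    all reported.
    """
    dna = dna.upper()
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


if __name__ == "__main__":
    # Arbitrary illustrative sequence, not taken from any locus in the document.
    seq = "TTACGATCGATCGGCTAGCTAGCTAGGATCCGATCGTAGCTAGCTAGGCTAAGG"
    for start, protospacer, pam in find_spcas9_sites(seq):
        print(start, protospacer, pam)
```

A complete design tool would also scan the reverse complement and handle other nucleases (for example, Cpf1 uses a different PAM on the opposite side of the protospacer); those steps are omitted here for brevity.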
  • Modifications of Cas9-sgRNAs are known in the art, including deletion of one or more 3’ hairpin elements, and modifications of the upper stem, bulge, lower stem, and 5’ hairpin region (see, e.g., U.S. Patent Publication Nos.: 20140315985; 20150376586; 20160257973; and 20160289673).
  • CRISPR system refers to a prokaryotic immune system that confers resistance to foreign genetic elements (See, Bhaya, D., Davison, M. & Barrangou, R. CRISPR-Cas systems in bacteria and archaea: versatile small RNAs for adaptive defense and regulation. Annu Rev Genet 45, 273-97 (2011)).
  • CRISPR is short for clustered regularly interspaced short palindromic repeats, which are segments of prokaryotic DNA containing short, repetitive base sequences acquired from plasmids or phages. These segments can be transcribed into RNA and serve as a scaffold that binds CRISPR-associated proteins (Cas proteins, such as Cas9).
  • The combined complex (e.g., the ribonucleoprotein complex Cas9-sgRNA) can be directed to degrade a target sequence present in DNA that is specifically recognized by these transcribed segments to acquire immunity to the foreign genetic elements (see, e.g., Jinek, M. et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816-21 (2012) and Mali, P. et al. RNA-guided human genome engineering via Cas9. Science 339, 823-6 (2013)).
  • target site refers to a polynucleotide construct of this invention (e.g., a donor construct) having a nucleic acid sequence that shares substantial sequence identity to a corresponding target site in the host cell genome (e.g., the nucleic acid sequence at the intended site of gene insertion) .
  • a sgRNA target site will share substantial sequence identity with the corresponding target site in the host cell genome (i.e., at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or complete sequence identity to the target site in the host cell genome).
  • vector refers to a polynucleotide vehicle to introduce genetic material (e.g., foreign genetic material) into a cell.
  • Vectors can be linear or circular.
  • Vectors can contain a replication sequence capable of effecting replication of the vector in a suitable host cell (i.e., an origin of replication).
  • the vector can replicate and function independently of the host genome or integrate into the host genome.
  • Vector design depends, among other things, on the intended use and host cell for the vector.
  • selecting or designing vectors for a particular use and host cell type is within the level of skill in the art.
  • the four major types of vectors are plasmids, viral vectors, cosmids, and artificial chromosomes.
  • vectors comprise an origin of replication, a multicloning site (MCS) , and/or a selectable marker.
  • An expression vector typically comprises an expression cassette.
  • a “donor vector” refers to a vector encoding at least one sgRNA target site and a desired gene of interest.
  • the donor vector contains a single sgRNA target site.
  • the donor vector includes two sgRNA target sites.
  • the donor vector includes a single desired gene.
  • the donor vector includes one or more genes, such as two desired genes.
  • the donor vector comprises an additional flanking nucleic acid sequence adjacent to the nucleic acid sequence encoding the desired gene, which is identical to a nucleic acid sequence in the target site of insertion in the host cell genome.
  • the donor vector is cleaved by a nuclease (such as a Cas protein) to form a linear DNA fragment.
  • the linear DNA fragment is inserted into the host cell genome without the formation of a frameshift mutation, resulting in the production of mRNA transcripts and proteins corresponding to the inserted desired gene.
  • a “locus-specific helper vector” refers to a vector encoding at least one sgRNA that is complementary to a nucleic acid sequence in the host cell genome.
  • the locus-specific helper vector encodes a single sgRNA.
  • the locus-specific helper vector encodes two sgRNAs.
  • each sgRNA of the locus-specific helper vector is complementary to a different selected nucleic acid sequence in the host cell genome.
  • a sgRNA of the locus-specific helper vector is complementary to a nucleic acid sequence flanking the desired gene in the host cell genome.
  • the encoded sgRNAs of the locus-specific helper vector create a nucleic acid strand break in the host cell’s genome through interaction with the CRISPR/Cas complex.
  • the nucleic acid strand break in the host cell’s genome removes an existing version of the gene (e.g., mutated or dysfunctional form) from the host cell’s genome allowing for the insertion of the desired gene by the linear DNA fragment from the donor vector.
  • a “universal helper vector” refers to a vector encoding at least one sgRNA that is complementary to one or two sgRNA target sites in the donor vector.
  • the universal helper vector encodes a single sgRNA.
  • the universal helper vector encodes two sgRNAs.
  • each encoded sgRNA of the universal helper vector is complementary to a different sgRNA target site in the donor vector.
  • a sgRNA encoded by the universal helper vector is complementary (e.g., along its length) to a sgRNA target site in the donor vector.
  • sgRNAs encoded by the universal helper vector create a nucleic acid strand break at a target site in the donor vector with the CRISPR/Cas complex.
  • the nucleic acid strand break at a target site in the donor vector produces a linear donor DNA fragment that can be inserted into the host cell genome.
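The difference between a donor vector with one sgRNA target site and one with two can be sketched by modeling the donor as a circular string and cutting it at given positions. This is only a toy model: the function name `linearize_donor`, the letter-based sequence, and the cut coordinates are hypothetical, and real Cas9 cleavage positions depend on the sgRNA and PAM.

```python
def linearize_donor(circular_seq, cut_positions):
    """Cut a circular sequence at 0-based positions; return the linear fragments."""
    n = len(circular_seq)
    cuts = sorted(set(p % n for p in cut_positions))
    if not cuts:
        return []  # no cut: the vector stays circular
    fragments = []
    for i, start in enumerate(cuts):
        end = cuts[(i + 1) % len(cuts)]
        if end > start:
            fragments.append(circular_seq[start:end])
        else:
            # Fragment wraps past the origin (this also covers a single cut).
            fragments.append(circular_seq[start:] + circular_seq[:end])
    return fragments

# One target site: a single full-length linear fragment.
print(linearize_donor("ABCDEFGHIJ", [3]))
# Two target sites: the vector splits into two fragments, e.g. insert and backbone.
print(linearize_donor("ABCDEFGHIJ", [2, 6]))
```

With a single cut the whole vector, backbone included, becomes the linear fragment; with two cuts the gene-carrying segment can be released from the backbone.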
  • expression cassette refers to a polynucleotide construct generated using recombinant methods or by synthetic means and comprising regulatory sequences operably linked to a selected polynucleotide to facilitate expression of the selected polynucleotide in a host cell (e.g., a somatic tissue cell, such as a hepatocyte) .
  • the regulatory sequences can facilitate transcription of the selected polynucleotide in a host cell, or transcription and translation of the selected polynucleotide in a host cell.
  • An expression cassette can, for example, be integrated into the genome of a host cell or be present in a vector to form an expression vector.
  • an expression cassette comprises a plasmid or viral vector for gene expression within somatic tissue cells that produces sufficient polypeptides or RNAs (e.g., mRNAs) to treat a human condition or disease.
  • an expression cassette is a construct that comprises a polynucleotide sequence encoding a polypeptide of the invention operably linked to a promoter, e.g., its native promoter, where the expression cassette is introduced into a heterologous microorganism.
  • an expression cassette comprises a polynucleotide sequence encoding a polypeptide of the invention where the polynucleotide is targeted to a position in the genome of a microorganism such that expression of the polynucleotide sequence is driven by a promoter that is present in the microorganism.
  • As used herein, the terms “nucleic acid sequence,” “nucleotide sequence,” “oligonucleotide,” and “polynucleotide” are interchangeable and refer to a polymeric form of nucleotides.
  • the nucleotides may be deoxyribonucleotides (DNA) , ribonucleotides (RNA) , analogs thereof, or combinations thereof, and may be of any length.
  • Polynucleotides may perform any function and may have any secondary and tertiary structures.
  • the terms encompass known analogs of natural nucleotides and nucleotides that are modified in the base, sugar and/or phosphate moieties.
  • a polynucleotide may comprise one modified nucleotide or multiple modified nucleotides. Examples of modified nucleotides include fluorinated nucleotides, methylated nucleotides, and nucleotide analogs. Nucleotide structure may be modified before or after a polymer is assembled. Following polymerization, polynucleotides may be additionally modified via, for example, conjugation with a labeling component or target binding component. A nucleotide sequence may incorporate non-nucleotide components.
  • nucleic acids comprising modified backbone residues or linkages, that are synthetic, naturally occurring, and non-naturally occurring, and have similar binding properties as a reference polynucleotide (e.g., DNA or RNA) .
  • analogs include, but are not limited to, phosphorothioates, phosphoramidates, methyl phosphonates, chiral-methyl phosphonates, 2-O-methyl ribonucleotides, peptide-nucleic acids (PNAs) , Locked Nucleic Acid (LNA) and morpholino structures.
  • selected nucleic acid sequence refers to a nucleic acid sequence that is complementary to the nucleic acid sequence identified, predetermined, selected or chosen as the site for insertion of the desired gene into the host cell genome.
  • the selected nucleic acid sequence can be identified by any appropriate means known in the art.
  • the selected nucleic acid sequence is identified through the use of nucleic acid sequence alignment processes (such as BLAST™) or nucleic acid sequence homology.
  • the selected nucleic acid sequence has a high degree of complementarity (e.g., at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or complete complementarity (i.e., 100%)) to the nucleic acid sequence identified as the site for insertion of the desired gene in the host cell genome.
  • double-strand break refers to both strands of a double-stranded segment of DNA being severed.
  • one strand can be said to have a “sticky end” or “overhang” , wherein one or more nucleotides are exposed and not hydrogen bonded to nucleotides on the other (opposite) strand of the DNA.
  • blunt end can occur where both strands of DNA remain fully base paired with each other despite the presence of a DSB.
  • “sequence identity” or “sequence similarity” generally refers to the percent identity of nucleotide bases or amino acids comparing a first polynucleotide or polypeptide to a second polynucleotide or polypeptide using algorithms having various weighting parameters. Sequence identity between two polynucleotides or two polypeptides can be determined using sequence alignment by various methods and computer programs (e.g., BLAST, CS-BLAST, FASTA, HMMER, L-ALIGN, and the like) available through the worldwide web at sites including but not limited to GENBANK (website: ncbi. nlm. nih.
  • Sequence identity between two polynucleotides or two polypeptide sequences is generally calculated using standard default parameters of the various methods or computer programs.
  • a high degree of sequence identity, as used herein, between two polynucleotides or two polypeptides is typically between about 90% identity and 100% identity, for example, about 90% identity or higher, preferably about 95% identity or higher, more preferably about 98% identity or higher.
  • a moderate degree of sequence identity, as used herein, between two polynucleotides or two polypeptides is typically between about 80% identity and about 85% identity, for example, about 80% identity or higher, preferably about 85% identity.
  • a low degree of sequence identity, as used herein, between two polynucleotides or two polypeptides is typically between about 50% identity and 75% identity, for example, about 50% identity, preferably about 60% identity, more preferably about 75% identity.
  • a Cas protein e.g., a Cas9 comprising amino acid substitutions or Cpf1 comprising amino acid substitutions
  • a Cas protein can have a moderate degree of sequence identity, or preferably a high degree of sequence identity, over its length to a reference Cas protein (e.g., a wild-type Cas9 or wild-type Cpf1, respectively) .
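The identity bands described above can be illustrated with a small sketch. It assumes the two sequences are already aligned, ungapped, and of equal length, which is a simplification of what alignment programs such as BLAST report; the function names are hypothetical. Values falling between the stated bands (for example 87%) are returned as "unclassified" because the definitions above leave those intervals open.

```python
def percent_identity(a, b):
    """Percent of matching positions in two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)

def identity_degree(pct):
    """Map a percent identity onto the high / moderate / low bands given in the text."""
    if 90 <= pct <= 100:
        return "high"
    if 80 <= pct <= 85:
        return "moderate"
    if 50 <= pct <= 75:
        return "low"
    return "unclassified"  # falls in a gap between the stated bands

pct = percent_identity("ACGTACGTAC", "ACGTACGTAA")  # 9 of 10 positions match
print(pct, identity_degree(pct))
```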
  • operably linked refers to polynucleotide sequences or amino acid sequences placed into a functional relationship with one another.
  • regulatory sequences e.g., a promoter or enhancer
  • operably linked regulatory elements are typically contiguous with the coding sequence.
  • enhancers can function if separated from a promoter by up to several kilobases or more. Accordingly, some regulatory elements may be operably linked to a polynucleotide sequence but not contiguous with the polynucleotide sequence.
  • translational regulatory elements contribute to the modulation of protein expression from a polynucleotide.
  • the term “expression” refers to transcription of a polynucleotide from a DNA template, resulting in, for example, a messenger RNA (mRNA) or other RNA transcript (e.g., non-coding, such as structural or scaffolding RNAs) .
  • the term further refers to the process through which transcribed mRNA is translated into peptides, polypeptides, or proteins.
  • Transcripts and encoded polypeptides may be referred to collectively as “gene product(s).”
  • a “genomic region” or “selected locus” is a specific position or location of a segment of a chromosome in the genome of a host cell. In some embodiments, it can include the specific location or position of a desired gene’s DNA sequence on a chromosome. In another embodiment, it can refer to a segment of a chromosome that is present on either side of the nucleic acid (e.g., DNA) target sequence site or, alternatively, also includes a portion of the nucleic acid target sequence site.
  • the donor construct has sufficient homology to undergo non-homologous end joining with the corresponding genomic region. In some embodiments, the donor construct shares significant sequence homology to the genomic region immediately flanking the target sequence site. In one embodiment, significant homology of the donor construct to the target sequence site comprises at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or complete (100%) sequence homology.
  • gene refers to a polynucleotide sequence comprising exon (s) and related regulatory sequences.
  • a gene may further comprise intron (s) and/or untranslated region (s) (UTR) .
  • the present application is directed to the insertion of one or more genes into the genome of a cell (e.g., a somatic tissue cell) within a live organism.
  • the one or more genes may be inserted to replace a dysfunctional gene or a gene that contains one or more mutations that optionally result in the manifestation of a disease in the host organism.
  • the host organism is a human and the gene is human insulin (hINS) or human Factor IX (hFIX) .
  • the term “desired gene” refers to identification and application of one or more genes that are desirable for genomic integration into the genome of a somatic tissue cell within the host organism. For example, it can be desirable to replace a dysfunctional gene in the genome of a hepatocyte with a corresponding functional gene such that sufficient amounts or levels of RNA transcripts or polypeptides are produced to alleviate, reduce, or treat a biological condition or disease.
  • a “host cell” generally refers to a cell of a live organism.
  • a cell is the basic structural, functional and/or biological unit of a live organism.
  • a cell can originate from any organism having one or more cells.
  • Examples of host cells include, but are not limited to: a prokaryotic cell, eukaryotic cell, a bacterial cell, an archaeal cell, a cell of a single-cell eukaryotic organism, a protozoal cell, a fungal cell (e.g., a yeast cell) , an animal cell, a cell from an invertebrate animal (e.g., fruit fly, cnidarian, echinoderm, and nematode) , a cell from a vertebrate animal (e.g., fish, amphibian, reptile, bird, mammal) , a plant cell (i.e., crop plants such as cotton, tobacco, maize, rice, wheat, tomatoes, strawberries) and a cell from a
  • host cell typically refers to an animal cell within the subphylum vertebrata and includes an individual animal cell or cell culture prepared from the animal that can be or has been a recipient of a vector (s) or isolated polynucleotide (s) of the invention.
  • Host cells include progeny of a single host cell, and the progeny may not necessarily be completely identical (in morphology or in total DNA complement) to the original parent cell due to natural, accidental, or deliberate mutation and/or modification.
  • a host cell includes cells into which a vector or a polynucleotide of the invention has been introduced, including by transformation or transfection.
  • the host cell is a human cell present in a somatic tissue of the human (e.g., liver, kidneys, spleen, gall bladder, stomach, bladder, uterus, intestines, pancreas, colon, lung, heart, brain, muscle, bone, pharynx, and larynx) .
  • the host cell is a germ cell.
  • somatic tissue refers to a cell of a live organism that does not contribute to the production of gametes (e.g., is not a germ cell) .
  • a somatic tissue includes cells that have differentiated into various internal organs.
  • a somatic tissue includes cells within the liver, kidneys, spleen, gall bladder, stomach, bladder, uterus, intestines, pancreas, colon, lung, heart, brain, muscle, bone, skin, eyes, pharynx, and larynx, or a cell obtained from one of these organs.
  • a somatic tissue can include specific cells from the liver, termed hepatocytes.
  • a somatic tissue can include specific cells from the brain, termed neurons.
  • a somatic tissue includes blood cells (e.g., erythrocytes, lymphocytes, red and white blood cells, T-cells, and helper cells) .
  • the present invention can be used to treat blood-related disorders such as hemoglobinopathies, including thalassemias and sickle cell disease (e.g., see PCT Publication Number: WO/2013/126794 for somatic tissues that could be targeted by the CRISPR-Cas9-NHEJ mediated constructs described herein).
  • subject, ” “host, ” or “organism” refers to any member of the phylum Chordata, more preferably any member of the subphylum vertebrata, or most preferably, any member of the class Mammalia, including, without limitation, humans and other primates, including non-human primates such as rhesus macaques, chimpanzees and other monkey and ape species; farm animals, such as cattle, sheep, pigs, goats and horses; domestic mammals, such as dogs and cats; laboratory animals, including rabbits, mice, rats and guinea pigs; birds, including domestic, wild, and game birds, such as chickens, turkeys, ducks, and geese.
  • a host cell is derived from a subject (e.g., tissue specific cells, such as hepatocytes) .
  • the subject is a non-human subject.
  • “live organism” or “living organism” refers to an organism that is capable of responding to external stimuli, such as heat, light, water, or atmospheric conditions.
  • a live organism refers to an animal that is presently living, not dead.
  • a live organism refers to an animal that is breathing such as, but not limited to a mouse, rat, rabbit, goat or human.
  • a live organism can include a cell culture (e.g., cells cloned from a human or animal somatic tissue) from which a response to external stimuli can be obtained.
  • a live organism is a transgenic animal.
  • a sample refers to a representative part or a single item from a larger whole or group.
  • a sample is a cell, cell lysate, tissue section, tissue biopsy, liquid biopsy, blood or other biological fluid such as, but not limited to, saliva, sputum, urine, stool, plasma/serum, breast milk, sperm, ejaculate, vaginal secretions, sweat, mucus, bile, and cerebrospinal fluid obtained from an organism.
  • the sample is a eukaryotic cell such as a cell from a mammal (e.g., a pig, a cow, a goat, a sheep, a rodent, a rat, a mouse, a dog, a cat, a non-human primate, and a human) .
  • the cell is a cell culture cell derived from an organism.
  • the sample is from a live organism.
  • the sample can comprise a plurality of samples from different sources (e.g., two or more closely related hosts such as, but not limited to, parent/progeny or siblings) .
  • the sample can comprise a plurality of samples from different sources (e.g., two or more unrelated hosts such as, but not limited to, different ethnicities) .
  • “wild-type,” “naturally occurring,” and “unmodified” are used herein to mean the typical (or most common) form, appearance, phenotype, or strain existing in nature; for example, the typical form of cells, organisms, characteristics, polynucleotides, proteins, macromolecular complexes, genes, RNAs, DNAs, or genomes as they occur in, and can be isolated from, a source in nature.
  • the wild-type serves as the original parent form before an intentional modification is introduced.
  • mutant, variant, engineered, recombinant, and modified forms are not wild-type forms.
  • transgenic animal refers to an animal whose genome is genetically modified.
  • the term includes the progeny (any generation) of a transgenic animal, provided that the progeny has the genetic modification.
  • the term refers to an entire live animal (e.g., a live mouse) as opposed to a tissue culture containing a cell from the transgenic animal.
  • hydrodynamic injection refers to a plasmid delivery method and can often include rapid injection of a large volume of fluid into a blood vessel (e.g., mouse tail vein) to deliver genetic materials into cells.
  • “AAV” or “adeno-associated virus” refers to a small (25 nm) non-enveloped virus that packages a linear single-stranded DNA genome and infects humans and some other primate species.
  • the AAV has a very low immunogenicity.
  • the AAV can infect both dividing and quiescent cells and can persist in an extrachromosomal state without integrating into the genome of the host cell.
  • HDR homology-directed repair (homologous recombination)
  • Facing a DNA double-strand break (DSB), a cell can use the sister chromatid or any provided donor as an intact allele or template to repair the break by pairing homologous sequences around the DSB site.
  • the efficiency of HDR in CRISPR/Cas9 editing can be increased by modulating repair factors (e.g., expression of Rad52 and/or treatment with Scr7) or by suppressing the NHEJ pathway (e.g., inhibition of DNA ligase IV) (see, e.g., Chu et al., Nat. Biotech., 33: 543-548 (2015)).
  • NHEJ non-homologous end joining
  • NHEJ is a DNA repair mechanism. Facing a DNA DSB, a cell can repair the break by recognizing and processing the nucleotides present at the site of the DSB and ligating the DNA ends together. During this process, a few nucleotides (1-10 bp) may be inserted or deleted (indels), which can result in a frameshift mutation.
  • NHEJ is referred to as non-homologous because the DNA DSB ends are ligated without the need for a homologous template (in contrast to HDR) .
  • NHEJ is evolutionarily conserved throughout the animal kingdom and is the predominant repair pathway in mammalian cells.
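Whether a small NHEJ indel disturbs the reading frame depends on its net length: within a coding sequence, a net change that is not a multiple of three shifts the frame. The following minimal sketch makes that arithmetic explicit; the function name `indel_effect` is hypothetical, and the check assumes the indel lies entirely inside a protein-coding region.

```python
def indel_effect(inserted=0, deleted=0):
    """Classify an indel inside a coding sequence by its net length change."""
    if inserted == 0 and deleted == 0:
        return "no change"
    net = inserted - deleted
    # Codons are three bases long, so only multiples of three keep the frame.
    return "in-frame" if net % 3 == 0 else "frameshift"

for ins, dele in [(1, 0), (0, 3), (0, 10), (2, 2)]:
    print(ins, dele, indel_effect(inserted=ins, deleted=dele))
```

Of the indel sizes from 1 to 10 bp mentioned above, only 3, 6, and 9 bp preserve the frame, which is why NHEJ repair within coding sequence frequently, though not always, produces a frameshift.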
  • the term “knock-in” or “knocking in” refers to a process of genetic engineering by which a desired DNA fragment, which may range from less than 100 bp up to several hundred kilobase pairs (kb) and contain one or more genes, is inserted into the genome of a host cell.
  • the term refers to insertion of a functional gene into a host cell, resulting in the production of normal mRNA transcripts and proteins corresponding to the functional gene.
  • the term refers to the substitution of a non-functional or dysfunctional gene in the host cell with a normal gene by the knock in process.
  • Knock-in refers to a specific site of insertion and thus can be regarded as targeted insertion.
  • Knock-in is distinct from “knock-out” processes which either delete part of the target site in the host cell genome or insert an incomplete/irrelevant nucleic acid sequence into the target site so as to disrupt expression of the gene of interest.
  • the knock-in method produces a gain of function (GOF) rather than loss of function (LOF) genotype.
  • the knock in process is mediated by non-homologous end joining (NHEJ) .
  • “INS” or “insulin” refers to a protein-coding gene that is highly expressed in pancreatic beta cells.
  • Human insulin is referred to herein as “hINS” (UniProt: P01308) .
  • Insulin protein functions to trigger the uptake of blood glucose by various cells to produce energy, thereby decreasing the blood glucose level. Therefore, it has been widely used to treat Type 1 diabetes mellitus (T1DM).
  • “F9” or “Factor IX” is a term of the art understood by skilled persons and means a protein-coding gene highly expressed in the liver. Human Factor IX is referred to herein as “hF9” (UniProt: P00740). A deficiency of this protein leads to a significant bleeding disorder, hemophilia B, which can be rescued through injection of purified Factor IX protein (FIX) originally encoded by the wild-type hF9 gene.
  • Hemophilia B refers to a disease that is caused by a deficient or limited level of FIX and is characterized as a blood clotting disorder. Intravenous infusions of FIX can be an effective treatment for this disease.
  • HEK293T is a term of the art understood by skilled persons and means a variant of human embryonic kidney 293 cells (HEK293) that contains the SV40 large T-antigen.
  • the antigen allows episomal replication of transfected plasmids containing the SV40 origin of replication, which leads to the amplification of transfected plasmids and extended temporal expression of the desired gene products.
  • 3’-UTR is a term of the art understood by skilled persons and means the section of messenger RNA (mRNA) that immediately follows the translation termination codon.
  • An mRNA molecule is transcribed from the DNA sequence of a gene and can be later translated into a corresponding protein.
  • the term “ires” is a term of the art understood by skilled persons and means internal ribosome entry site segments which are known to attract eukaryotic ribosomal translation initiation complex and thus promote translation initiation independently of the presence of the commonly utilized 5'-terminal 7mG cap structure.
  • eGFP is a term of the art understood by skilled persons and means enhanced green fluorescent protein carrying the F64L point mutation, which improves folding efficiency at 37°C.
  • this mutation substantially improves the performance of GFP in mammalian cells.
  • “Actb” or “actin” is a term of the art understood by skilled persons and means a genome locus termed beta-actin in mouse or human genomes.
  • Human Actb is referred to herein as “hActb” (UniProt: P60709 (ACTB_HUMAN) ) .
  • Mouse actin is referred to herein as “Actb” (UniProt: P60710 (ACTB_MOUSE) ) .
  • the actin gene produces highly conserved proteins that are involved in cell motility, structure, and integrity.
  • “Alb” or “albumin” is a term of the art understood by skilled persons and means the albumin genome locus in mouse or human genomes.
  • Human Alb is referred to herein as “hAlb” (UniProt: P02768 (ALBU_HUMAN) ) .
  • Mouse albumin is referred to herein as “Alb” (UniProt: P07724 (ALBU_MOUSE) ) .
  • the albumin gene is stably expressed at high levels, predominantly in hepatocytes of the liver.
  • GAPDH is a term of the art understood by skilled persons and means a housekeeping gene which produces Glyceraldehyde 3-phosphate dehydrogenase.
  • Human GAPDH is referred to herein as “hGAPDH” (UniProt: P04406 (G3P_HUMAN) ) .
  • Mouse GAPDH is referred to herein as “GAPDH” (UniProt: P16858 (G3P_MOUSE) ) .
  • the GAPDH gene is often stably and constitutively expressed at high levels in most human tissues and cells. Thus, GAPDH is commonly used as a control for Western blot to check protein expression levels or for qPCR to check mRNA expression levels.
  • “STZ” or “Streptozotocin” refers to a naturally occurring chemical that is particularly toxic to the insulin-producing pancreatic beta cells. Therefore, this drug has been used for building animal models of Type 1 diabetes mellitus or as a medical treatment for cancers of beta cells.
  • “T1DM,” “type 1 diabetes,” or “type 1 diabetes mellitus” refers to a type of metabolic disease characterized by insufficient insulin production leading to high blood glucose in the body.
  • the classical symptoms are frequent urination, increased thirst, increased hunger, and weight loss. Insulin therapy is usually given by injection to treat the disease.
  • an “inherited disease” is a condition or disease caused by the absence of or a defect in a desirable gene (loss of function) or by expression of an undesirable or defective gene (gain of function).
  • an example of a loss-of-function genetic disorder is hemophilia, an inherited bleeding disorder caused by deficiency in either coagulation factor VIII (FVIII, hemophilia A) or Factor IX (FIX, hemophilia B).
  • an example of a gain-of-function genetic disorder is Huntington's disease, a disease caused by a pathologic “HTT” gene encoding a mutated huntingtin protein that accumulates within neurons and leads to their gradual destruction, particularly in the basal ganglia and the cerebral cortex.
  • an inherited disease can be transmitted (e.g., by asymptomatic carrier parents) to subsequent generations or progeny.
  • an inherited disease can include an autosomal recessive disease.
  • an inherited disease can include an autosomal dominant disease.
  • an inherited disease includes, but is not limited to, neuromuscular, cardiovascular, developmental, and metabolic diseases.
  • an inherited disease includes, but is not limited to, autism spectrum disorders, cardiomyopathy, ciliopathies, congenital disorders of glycosylation, congenital myasthenic syndromes, epilepsy and seizure disorders, eye disorders, glycogen storage disorders, hereditary cancer syndrome, hereditary periodic fever syndromes, inflammatory bowel disease, lysosomal storage disorders, multiple epiphyseal dysplasia, neuromuscular disorders, or Noonan Syndrome and related disorders.
  • an inherited disease can be the result of a single nucleic acid mutation in a gene.
  • a single nucleic acid mutation inherited disease can include, but is not limited to, diabetes, hemophilia, sickle cell anemia, cystic fibrosis, Duchenne muscular dystrophy, hemochromatosis, congenital deafness, familial hypercholesterolemia, Huntington’s disease, Tay-Sachs disease and phenylketonuria.
  • a “metabolic disease” refers to disease or condition that disrupts normal metabolism.
  • Tay-Sachs disease is a genetically based metabolic disorder found among Ashkenazi Jewish families, French Canadians of southeastern Quebec and Cajuns of Louisiana.
  • Other metabolic conditions can arise due to changes in environmental conditions or lifestyle (e.g., pregnancy, prolonged fasting, malnutrition or obesity) .
  • the metabolic disease can include, but is not limited to, cystinosis, cystinuria, Fabry disease, galactosemia, Gaucher disease, Hartnup disease, homocystinuria, Hunter Syndrome, Lesch-Nyhan syndrome, Niemann-Pick disease, Pompe disease, porphyria, Scheie syndrome, tyrosinemia and von Gierke disease.
  • the metabolic disease is an endocrine disease, for example diabetes or hypothyroidism.
  • treatment refers to any indicia of success in the treatment or amelioration of a disease or condition, including any objective or subjective parameter such as abatement; remission; diminishing of symptoms or delaying the onset of symptoms; slowing in the rate of degeneration or decline; making the final point of degeneration less debilitating; and/or improving a subject's physical or mental well-being.
  • “hybridization” or “hybridize” or “hybridizing” or “anneal” or “annealing” is the process of combining two complementary single-stranded DNA or RNA molecules so as to form a single double-stranded molecule (DNA/DNA, DNA/RNA, RNA/RNA) through hydrogen-bonded base pairing.
  • Hybridization stringency is typically determined by the hybridization temperature and the salt concentration of the hybridization buffer; e.g., high temperature and low salt provide high stringency hybridization conditions.
  • salt concentration ranges and temperature ranges for different hybridization conditions are as follows: high stringency, approximately 0.01 M to approximately 0.05 M salt, hybridization temperature 5°C to 10°C below Tm; moderate stringency, approximately 0.16 M to approximately 0.33 M salt, hybridization temperature 20°C to 29°C below Tm; and low stringency, approximately 0.33 M to approximately 0.82 M salt, hybridization temperature 40°C to 48°C below Tm. The Tm of duplex nucleic acids is calculated by standard methods well-known in the art (see, e.g., Maniatis, T., et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press: New York (1982); Casey, J., et al., Nucleic Acids Research 4: 1539-1552 (1977); Bodkin, D.K., et al., Journal of Virological Methods 10 (1): 45-52 (1985); Wallace, R.B., et al., Nucleic Acid
  • High stringency conditions for hybridization typically refer to conditions under which a nucleic acid having complementarity to a target sequence predominantly hybridizes with the target sequence, and substantially does not hybridize to non-target or off-target sequences.
  • hybridization conditions are typically of moderate stringency, preferably high stringency.
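The salt and temperature ranges listed above can be encoded as a small lookup. The following is a minimal sketch, not part of the disclosure; the function name `classify_stringency` and its parameters are hypothetical, and the Tm is assumed to have been calculated separately by one of the standard methods.

```python
# Ranges copied from the text: (salt min M, salt max M, min deg C below Tm, max deg C below Tm)
STRINGENCY_RANGES = {
    "high": (0.01, 0.05, 5.0, 10.0),
    "moderate": (0.16, 0.33, 20.0, 29.0),
    "low": (0.33, 0.82, 40.0, 48.0),
}


def classify_stringency(salt_molar, hyb_temp_c, tm_c):
    """Return 'high', 'moderate' or 'low', or None if no listed range matches."""
    delta = tm_c - hyb_temp_c  # how far below Tm the hybridization is run
    for name, (s_lo, s_hi, d_lo, d_hi) in STRINGENCY_RANGES.items():
        if s_lo <= salt_molar <= s_hi and d_lo <= delta <= d_hi:
            return name
    return None


print(classify_stringency(0.02, 62.0, 70.0))
print(classify_stringency(0.20, 45.0, 70.0))
```

Conditions that fall between the listed ranges return `None` rather than being forced into the nearest class, since the text gives approximate ranges only.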
  • sgRNA programmable small guide RNAs
  • DSB double strand breaks
  • the CRISPR/Cas genomic sequence manipulation systems described herein are intended for universally targeting essentially any gene, in somatic tissue cells derived from essentially any live organism. These systems include a gene targeting system that requires an insertion event into the host cell genome and a donor construct containing a functional gene that is introduced into the target genomic locus. The insertion event is based on nucleotide sequence homology between the donor construct and its insertion site.
  • compositions useful as components of a CRISPR/Cas system for genome editing in vivo and/or targeting genetic elements can be used for various applications, including but not limited to, a screen to identify genetic elements that modulate a phenotype, to identify genetic interactions, to develop/identify optimized sgRNAs, for lead compound discovery/improvement and for gene-therapy (e.g., replacement of a mutated or dysfunctional gene with a functional copy of the gene in the host cell genome) .
  • the components include sgRNAs, Cas proteins, donor vectors, locus-specific helper vectors and universal helper vectors.
  • the sgRNAs typically contain from 5′ to 3′: a binding region, a 5′ hairpin region, a 3′ hairpin region, and a transcription termination sequence.
  • the sgRNA is configured to form a stable and active complex with a small guide RNA-mediated nuclease (e.g., a Cas protein such as, but not limited to, Cas9 or Cpf1).
  • the sgRNA can be optimized to enhance expression of a polynucleotide encoding the sgRNA in a host cell.
  • the 5′ hairpin region can be between about 15 and about 50 nucleotides in length (e.g., about 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or about 50 nucleotides in length). In some cases, the 5′ hairpin region is between about 30 and about 45 nucleotides in length (e.g., about 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, or 45 nucleotides in length).
  • the 5′ hairpin region is, or is at least about, 31 nucleotides in length (e.g., is at least about 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, or 45 nucleotides in length).
  • the 5′ hairpin region contains one or more loops or bulges, each loop or bulge of about 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleotides.
  • the 5′ hairpin region contains a stem of between about 10 and 30 complementary base pairs (e.g., 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 complementary base pairs).
  • the binding region of the 5’ hairpin can contain four or fewer consecutive uracil nucleotides (see, e.g., US20160289673) .
  • the 5′ hairpin region can contain protein-binding or small molecule-binding structures.
  • the 5′ hairpin function (e.g., interacting or assembling with an sgRNA-mediated nuclease)
  • the 5′ hairpin region can contain non-natural nucleotides.
  • non-natural nucleotides can be incorporated to enhance protein-RNA interaction, or to increase the thermal stability or resistance to degradation of the sgRNA.
  • the sgRNA typically contains an intervening sequence between the 5′ and 3′ hairpin regions.
  • the intervening sequence between the 5′ and 3′ hairpin regions can be between about 0 and about 50 nucleotides in length, preferably between about 10 and about 50 nucleotides in length (e.g., at a length of, or about a length of 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 nucleotides).
  • the intervening sequence is designed to be linear, unstructured, substantially linear, or substantially unstructured.
  • the intervening sequence can contain non-natural nucleotides.
  • non-natural nucleotides can be incorporated to enhance protein-RNA interaction or to increase the activity of the sgRNA: Cas protein complex.
  • natural nucleotides can be incorporated to enhance the thermal stability or resistance to degradation of the sgRNA.
  • the 3′ hairpin region can contain a loop of about 3, 4, 5, 6, 7, or 8 nucleotides and a stem of about 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 nucleotides or longer.
  • the 3′ hairpin region can contain a protein-binding, small molecule-binding, hormone-binding, or metabolite-binding structure that can conditionally stabilize the secondary and/or tertiary structure of the sgRNA.
  • the 3′ hairpin region can contain non-natural nucleotides.
  • non-natural nucleotides can be incorporated to enhance protein-RNA interaction or to increase the activity of the sgRNA: Cas complex.
  • natural nucleotides can be incorporated to enhance the thermal stability or resistance to degradation of the sgRNA.
  • the sgRNA includes a termination structure at its 3′ end.
  • the sgRNA includes an additional 3′ hairpin region, e.g., before the termination structure and after a first 3′ hairpin region, that can interact with proteins, small molecules, hormones, etc., for stabilization or additional functionality, such as conditional stabilization or conditional regulation of sgRNA:Cas assembly or activity.
  • the binding region is designed to complement (e.g., perfectly complement) or substantially complement the target genetic region or selected locus.
  • the 19 nucleotides at the 3′ or 5′ end of the binding region are perfectly complementary to the target genetic region or selected locus.
  • the binding region can be altered to increase stability. For example, non-natural nucleotides can be incorporated to increase RNA resistance to degradation.
  • the binding region can be altered or designed to avoid or reduce secondary structure formation in the binding region.
  • the binding region can be designed to optimize G-C content.
  • G-C content is preferably between about 40% and about 60% (e.g., 40%, 45%, 50%, 55%, 60%).
  • the binding region can be selected to begin with a sequence that facilitates efficient transcription of the sgRNA.
  • the binding region can begin at the 5′ end with a G nucleotide.
  • the binding region can contain modified nucleotides such as, without limitation, methylated or phosphorylated nucleotides.
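The design criteria for the binding region described above (G-C content of about 40% to 60%, a 5′ G, and limited runs of consecutive uracils) can be checked programmatically. The sketch below is illustrative only; the function names, the default thresholds as parameters, and the choice to apply the four-uracil limit to the whole candidate sequence are assumptions made for this example.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_run(seq, base):
    """Length of the longest run of consecutive `base` characters."""
    best = current = 0
    for ch in seq:
        current = current + 1 if ch == base else 0
        best = max(best, current)
    return best


def check_binding_region(seq, gc_min=0.40, gc_max=0.60, max_u_run=4):
    """Report which of the three design criteria a candidate sequence meets."""
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA spelling
    return {
        "gc_ok": gc_min <= gc_fraction(rna) <= gc_max,
        "starts_with_g": rna.startswith("G"),
        "u_run_ok": longest_run(rna, "U") <= max_u_run,
    }


print(check_binding_region("GACGUUACGGAUCCAGUCAG"))
```

Returning a per-criterion report, rather than a single pass or fail, makes it easier to see why a candidate was rejected.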
  • the sgRNAs are selected so as not to have significant off-target effects.
  • the similarity of an sgRNA binding region for off-target genetic region or selected locus can be determined.
  • sgRNAs having a high similarity exceeding a pre-designated threshold can be filtered out.
  • candidate binding regions, including the protospacer adjacent motif (PAM) sequences, can be scored using a scoring metric in a manual or automated fashion. Accordingly, sgRNA binding regions having an acceptable number of off-target mismatches can be selected for synthesis.
  • PAM protospacer adjacent motif
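The off-target filtering described above can be illustrated with a simple mismatch count. This is a minimal sketch under stated assumptions: the scoring metric here is a plain Hamming distance, the threshold of three mismatches is a hypothetical value chosen for illustration, and the function names are not taken from the source. Published scoring schemes typically weight mismatches by position, which this example does not attempt.

```python
def mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


def filter_candidates(candidates, off_target_sites, min_mismatches=3):
    """Keep candidates whose closest off-target site differs at >= min_mismatches positions."""
    kept = []
    for cand in candidates:
        closest = min(
            (mismatches(cand, site) for site in off_target_sites),
            default=len(cand),  # no off-target sites supplied: nothing is similar
        )
        if closest >= min_mismatches:
            kept.append(cand)
    return kept


print(filter_candidates(["ACGTACGT", "TTTTCCCC"], ["ACGTACGA", "GGGGAAAA"]))
```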
  • sgRNAs are targeted to specific regions at or near a gene.
  • an sgRNA can be targeted to a region at or near the 0-750 bp region 5′ (upstream) of the transcription start site of a gene.
  • targeting of the 0-750 bp region can provide, or provide increased, transcriptional activation by an sgRNA:Cas complex.
  • a somatic cell can be contacted with a Cas protein fused to a transcriptional activator or epitope fusion domain and an sgRNA (or library of sgRNAs) targeted to the 0-750 bp region 5′ of the transcription start site of one or more genes.
  • an sgRNA can be targeted to a region at or near the 0-1000 bp region 3′ (downstream) of the transcription start site of a gene.
  • targeting of the 0-1000 bp region can provide, or provide increased, transcriptional repression by an sgRNA:Cas complex.
  • a somatic cell can be contacted with a Cas protein fused to a transcriptional repressor or epitope fusion domain and an sgRNA (or library of sgRNAs) targeted to the 0-1000 bp region 3′ of the transcription start site of one or more genes.
  • the sgRNAs are targeted to a region at or near the transcription start site (TSS) based on an automated or manually annotated database.
  • TSS transcription start site
  • transcripts annotated by Ensembl/GENCODE or the APPRIS pipeline can be used to identify the TSS and target genetic elements 0-750 bp upstream (e.g., for targeting one or more transcriptional activator domains) or 0-1000 bp downstream (e.g., for targeting one or more transcriptional repressor domains) of the TSS.
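The targeting windows relative to the TSS can be computed from an annotated TSS coordinate and strand. The following sketch assumes 1-based genomic coordinates and a hypothetical function `targeting_window`; it does not query Ensembl/GENCODE or APPRIS, and it does not clamp the window at chromosome ends.

```python
def targeting_window(tss, strand, mode):
    """Return an inclusive (start, end) genomic window for sgRNA targeting.

    mode 'activation': 0-750 bp upstream (5') of the TSS.
    mode 'repression': 0-1000 bp downstream (3') of the TSS.
    Upstream and downstream are interpreted relative to the gene's strand.
    """
    if mode == "activation":
        span, direction = 750, -1
    elif mode == "repression":
        span, direction = 1000, 1
    else:
        raise ValueError("mode must be 'activation' or 'repression'")
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    sign = direction if strand == "+" else -direction
    other = tss + sign * span
    return (min(tss, other), max(tss, other))


print(targeting_window(10000, "+", "activation"))
print(targeting_window(10000, "-", "repression"))
```

For a gene on the minus strand, upstream corresponds to higher genomic coordinates, which is why the sign is flipped.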
  • the sgRNAs are targeted to a genomic region that is predicted to be relatively free of nucleosomes.
  • the locations and occupancies of nucleosomes can be assayed through use of enzymatic digestion with micrococcal nuclease (MNase) .
  • MNase micrococcal nuclease
  • MNase is an endo-exo nuclease that preferentially digests naked DNA and the DNA in linkers between nucleosomes, thus enriching for nucleosome-associated DNA.
  • MNase-seq high-throughput sequencing technologies
  • regions having a high MNase-seq signal are predicted to be relatively occupied by nucleosomes and regions having a low MNase-seq signal are predicted to be relatively unoccupied by nucleosomes.
  • the sgRNAs are targeted to a genomic region that has a low MNase-Seq signal.
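Selecting candidate sites in regions of low MNase-seq signal can be sketched as a windowed average over a per-base signal track. The function name, the flank size, and the threshold below are hypothetical; real MNase-seq data would first need normalization, which is outside the scope of this example.

```python
from statistics import mean


def low_occupancy_sites(site_positions, signal, threshold, flank=2):
    """Keep sites whose mean signal within +/- flank bases is below threshold.

    `signal` is a list of per-base MNase-seq values indexed from 0.
    """
    kept = []
    for pos in site_positions:
        lo = max(0, pos - flank)
        hi = min(len(signal), pos + flank + 1)
        if mean(signal[lo:hi]) < threshold:
            kept.append(pos)
    return kept


track = [9, 9, 9, 9, 1, 1, 1, 1, 1, 9]
print(low_occupancy_sites([2, 6], track, threshold=3))
```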
  • the sgRNA-mediated nuclease is a Cas protein (e.g., Cas9 or Cpf1) .
  • the sgRNA-mediated nuclease can be a class 1 or class 2, CRISPR associated protein.
  • the sgRNA-mediated nuclease can be a type I, II, III, IV or V CRISPR-associated protein.
  • the Cas protein is a Cpf1 protein or a Cpf1-like protein encoded by a Cpf1 ortholog.
  • the Cas protein is a Cas9 protein or a Cas9-like protein encoded by a Cas9 ortholog.
  • the sgRNA-mediated nuclease can be a modified Cas9 protein.
  • Cas9 proteins can be modified by any method known in the art.
  • the Cas9 protein can be codon optimized for expression in a host cell or an in vitro expression system.
  • the Cas protein can be engineered for improved characteristics such as, but not limited to, stability, enhanced target binding, or reduced aggregation.
  • the Cas protein catalyzes a DNA double strand break at a selected locus for insertion of the desired gene into the host cell genome.
  • the expression cassettes can contain a promoter (e.g., a heterologous or native promoter) operably linked to a polynucleotide encoding an sgRNA.
  • the promoter can be inducible or constitutive.
  • the promoter can be tissue specific.
  • the promoter is a U6, H1, or spleen focus-forming virus (SFFV) long terminal repeat promoter.
  • the promoter is a weak mammalian promoter as compared to the human elongation factor 1 promoter (EF1A) .
  • the weak mammalian promoter is a ubiquitin C promoter or a phosphoglycerate kinase 1 promoter (PGK) . In some cases, the weak mammalian promoter is a TetOn promoter in the absence of an inducer.
  • the strength of the selected sgRNA promoter is selected to express an amount of sgRNA that is proportional to the amount of Cas protein.
  • the expression cassette can be in a vector, such as a plasmid, a viral vector, a lentiviral vector, AAV vector, adenovirus particle, DNA-nanoparticle complex (see, e.g., WO/2017/097377) , and the like.
  • the sgRNA expression cassette can be episomal or integrated in the host cell.
  • the expression cassette is in a vector, such as a plasmid or AAV viral particle.
  • the Cas protein and one or more sgRNAs of the present invention can be delivered, for example to a live organism, using adeno-associated virus (AAV), lentivirus, adenovirus or other plasmid or viral vector types.
  • AAV adeno-associated virus
  • formulations and doses are described in U.S. Pat. No. 8,454,972 (formulations, doses for adenovirus) , U.S. Pat. No. 8,404,658 (formulations, doses for AAV) , and U.S. Pat. No. 5,846,946 (formulations, doses for DNA plasmids) and from clinical trials and publications regarding the clinical trials involving lentivirus, AAV and adenovirus.
  • the route of administration, formulation and dose can be as described in, but not limited to, U.S. Pat. No. 8,404,658 and as in clinical trials involving AAV.
  • the route of administration, formulation and dose can be as described in, but not limited to, U.S. Pat. No. 8,454,972 and as in clinical trials involving adenovirus.
  • the route of administration, formulation and dose can be as described, but not limited to, U.S. Pat. No. 5,846,946 and as in clinical studies involving plasmids.
  • Doses for any vector may be based on, or extrapolated to, an average 70 kg individual, and can be adjusted for patients, subjects, mammals of different weight and species. Frequency of administration is within the skill of the medical or veterinary practitioner (e.g., physician, veterinarian) , depending on usual factors including the age, sex, general health, other conditions of the patient or subject and the particular condition or symptoms being addressed.
  • the donor vector comprises an AAV, adenovirus, plasmid or other suitable vector.
  • the donor vector can include an expression cassette.
  • the donor vector encodes a single desired gene and one sgRNA target site.
  • the donor vector encodes a single desired gene and two sgRNA target sites.
  • the desired gene can be any suitable gene.
  • the desired gene is a gene known to cause, or closely associated with, an inherited disease (e.g., cystic fibrosis).
  • the desired gene can be a gene associated with a metabolic disease (e.g., diabetes or phenylketonuria).
  • the desired gene is primarily responsible for the manifestation of disease in the subject.
  • the disease can include a disease manifested as a result of a single gene mutation (e.g., sickle cell anemia) .
  • the donor vector encodes a human gene, e.g., human insulin or human Factor IX.
  • the donor vector is suitable for administration to an animal such as a non-human animal (e.g., a pig) .
  • the donor vector is suitable for administration to a human (e.g., an infant or minor).
  • the donor vector is suitable for administration to a human (e.g., an infant or minor) to treat a somatic tissue disease (e.g., cystic fibrosis).
  • the donor vector encodes a human insulin gene and a single sgRNA target site.
  • the sgRNA target site of the donor vector comprises a nucleic acid sequence that flanks the 3’ or 5’ region of the human insulin gene such that the donor vector upon linearization into a linear DNA fragment can be inserted at the intended position in the host cell genome.
  • the donor vector encodes a human Factor IX gene and a single sgRNA target site.
  • the sgRNA target site of the donor vector comprises a nucleic acid sequence that flanks the 3’ or 5’ region of the human Factor IX gene such that the donor vector upon linearization into a linear DNA fragment can be inserted at the intended position in the host cell genome.
  • the locus-specific helper vector comprises an AAV, adenovirus, plasmid or other suitable vector.
  • the locus-specific helper vector can include an expression cassette.
  • the locus-specific helper vector encodes a single sgRNA.
  • the locus-specific helper vector encodes one (or more) sgRNA that is complementary to a selected nucleic acid sequence in the host cell genome.
  • the locus-specific helper vector encodes two sgRNAs each of which is complementary to a different selected nucleic acid sequence in the host cell genome.
  • the locus-specific helper vector further comprises a nucleic acid sequence encoding a Cas protein.
  • the locus-specific helper vector further comprises a nucleic acid sequence encoding a Cas9 protein.
  • the locus-specific helper vectors may be combined with the universal helper vectors, described below, to form a single combined vector that harbours both the locus-specific sgRNA(s) and the universal sgRNA.
  • the universal helper vector comprises an AAV, adenovirus, plasmid or other suitable vector.
  • the universal helper vector can include an expression cassette.
  • the universal helper vector encodes a single sgRNA.
  • the universal helper vector encodes one (or more) sgRNA that is complementary to one (or more) sgRNA target sites in the donor vector.
  • the universal helper vector encodes two sgRNAs each of which is complementary to a different sgRNA target site in the donor vector.
  • the universal helper vector further comprises a nucleic acid sequence encoding a Cas protein.
  • the universal helper vector further comprises a nucleic acid sequence encoding a Cas9 protein.
  • a composition for genome editing in vivo comprises (i) a donor vector; (ii) a locus-specific helper vector; and (iii) a universal helper vector.
  • the composition further comprises a host cell located within a somatic tissue of a live organism.
  • the host cell can be an animal cell such as a human or non-human animal cell.
  • the host cell is a mammalian cell.
  • the host cell is a mammalian cell from a liver, kidney, spleen, gall bladder, stomach, bladder, uterus, intestine, pancreas, colon, lung, heart, brain, muscle, bone, pharynx, or larynx.
  • the host cell is a blood cell from a mammal (e.g., an erythrocyte, lymphocyte, red or white blood cell, T-cell, or helper cell).
  • the host cell can include a progeny of the host cell.
  • the donor vector is a plasmid or AAV viral particle.
  • the donor vector encodes a single gene for insertion into the host cell genome.
  • the donor vector encodes one or more genes (e.g., two genes) for insertion into the host cell genome.
  • the donor vector also comprises one or more sgRNA target sites.
  • an sgRNA target site comprises a nucleic acid region of approximately 20 bp.
  • the donor vector encodes a single sgRNA target site, while in other embodiments, the donor vector encodes two sgRNA target sites.
  • the sgRNA target sites allow the sgRNA: Cas complex to effectively target the donor vector such that a strand break occurs in the donor vector resulting in the formation of a linearized DNA fragment.
  • the linear DNA fragment retains the encoding for the one or more genes and the one or more sgRNA target sites of the linear DNA fragment allow for insertion of the linear DNA fragment into the host cell genome.
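The linearization step described above, in which the sgRNA:Cas complex cuts the circular donor at its sgRNA target site, can be modelled on a sequence string. This sketch makes several assumptions that are not stated in the source: an SpCas9-like nuclease with an NGG PAM immediately 3′ of the target site, a blunt cut 3 bp upstream of the PAM, a single target site, and a search on the given strand only. The function name and the example sequence are hypothetical.

```python
def linearize_donor(circular_seq, protospacer, cut_offset=3):
    """Cut a circular donor once at its sgRNA target site; return the linear fragment.

    Assumes an NGG PAM directly 3' of the protospacer and a blunt cut
    `cut_offset` bp upstream of the PAM (SpCas9-like behaviour).
    """
    n = len(circular_seq)
    doubled = circular_seq + circular_seq  # lets a site spanning the origin be found
    idx = doubled.find(protospacer)
    if idx == -1 or idx >= n:
        raise ValueError("target site not found in donor")
    pam_start = idx + len(protospacer)
    pam = doubled[pam_start:pam_start + 3]
    if len(pam) < 3 or pam[1:] != "GG":
        raise ValueError("no NGG PAM next to the target site")
    cut = (pam_start - cut_offset) % n
    # Rotate the circle so that the cut position becomes the two free ends.
    return circular_seq[cut:] + circular_seq[:cut]


donor = "AAAA" + "GATTACAGATTACAGATTAC" + "TGG" + "CCCC"
print(linearize_donor(donor, "GATTACAGATTACAGATTAC"))
```

The linear fragment has the same length and base content as the circular donor; only the position of the ends changes, so the encoded gene is retained.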
  • the locus-specific helper vector is a plasmid or AAV viral particle. In some embodiments, the locus-specific helper vector encodes at least one sgRNA that is complementary to a selected nucleic acid sequence in the host cell genome. In some embodiments, the locus-specific helper vector encodes a single sgRNA complementary to a selected nucleic acid sequence in the host cell genome. In another embodiment, the locus-specific helper vector encodes two sgRNAs complementary to a selected nucleic acid sequence in the host cell genome.
  • the universal helper vector is a plasmid or AAV viral particle. In some embodiments, the universal helper vector encodes at least one sgRNA that is complementary to an sgRNA target site in the donor vector. In some embodiments, the universal helper vector encodes a single sgRNA that is complementary to an sgRNA target site in the donor vector. In another embodiment, the universal helper vector encodes two sgRNAs that are complementary to two sgRNA target sites in the donor vector. In yet another embodiment, the universal helper vector encodes two or more sgRNAs that are complementary to two or more sgRNA target sites in the donor vector.
  • a viral (e.g., AV or AAV) or plasmid vector of the invention can be administered (e.g., injected) into a subject or tissue of interest in the subject.
  • the expression of the Cas protein can be driven by a cell-type specific promoter.
  • liver-specific expression might use an Albumin promoter (see, e.g., Wooddell et al., Journal of Gene Medicine, 10 (5): 551-63 (2008)) and neuron-specific expression might use a Synapsin I promoter (see, e.g., Kugler et al., Gene Therapy, 10: 337-347 (2003)).
  • Described herein are methods of inserting a desired gene at a selected locus in a host cell genome. Also described are methods for performing CRISPR/Cas genome editing. Further still, methods for treating somatic tissue disease in a subject are described.
  • the invention provides a method of inserting a desired gene at a selected locus in a host cell genome.
  • the host cell is a somatic cell within a live organism.
  • the method comprises contacting a live organism with (i) a donor vector; (ii) a locus-specific helper vector; (iii) a universal helper vector; and (iv) a Cas or Cpf1 protein if neither the locus-specific helper vector nor the universal helper vector encodes the Cas or Cpf1 protein; and linearizing the donor vector to form a donor DNA fragment, whereby the donor DNA fragment undergoes NHEJ-mediated knock-in into the host cell genome of the somatic tissue of the live organism.
  • the method comprises contacting a live organism with (i) a donor vector, which encodes the desired gene and one or two sgRNA target sites; (ii) a locus-specific helper vector, which encodes at least one sgRNA that is complementary to a selected nucleic acid sequence in the host cell genome; (iii) a universal helper vector, which encodes at least one sgRNA that is complementary to the one or two sgRNA target sites in the donor vector; and (iv) a Cas or Cpf1 protein if neither the locus-specific helper vector nor the universal helper vector encodes the Cas or Cpf1 protein, thereby linearizing the donor vector to form a donor DNA fragment, whereby the donor DNA fragment undergoes NHEJ-mediated knock-in into the host cell genome of the somatic tissue of the live organism.
  • the method comprises contacting a live organism with (i) a donor vector, which encodes the desired gene and a single sgRNA target site; (ii) a locus-specific helper vector, which encodes a single sgRNA that is complementary to a selected nucleic acid sequence in the host cell genome; (iii) a universal helper vector, which encodes a single sgRNA that is complementary to the single sgRNA target site in the donor vector; and (iv) a Cas or Cpf1 protein if neither the locus-specific helper vector nor the universal helper vector encodes the Cas or Cpf1 protein, thereby linearizing the donor vector to form a donor DNA fragment, whereby the donor DNA fragment undergoes NHEJ-mediated knock-in into the host cell genome of the somatic tissue of the live organism.
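The relationship between the three vectors, and the condition under which a separate Cas component is required, can be captured in a small data model. This is a sketch only: the class and function names are hypothetical, and complementarity between a universal sgRNA and a donor target site is simplified to matching identifiers rather than actual sequence comparison.

```python
from dataclasses import dataclass


@dataclass
class DonorVector:
    desired_gene: str
    target_sites: list  # identifiers of the one or two sgRNA target sites


@dataclass
class HelperVector:
    sgrnas: list  # identifiers of the targets each encoded sgRNA recognizes
    encodes_cas: bool = False


def plan_delivery(donor, locus_helper, universal_helper):
    """List the components to deliver, adding a separate Cas only when needed."""
    if not 1 <= len(donor.target_sites) <= 2:
        raise ValueError("donor vector should carry one or two sgRNA target sites")
    if not locus_helper.sgrnas:
        raise ValueError("locus-specific helper must encode at least one sgRNA")
    unmatched = [s for s in donor.target_sites if s not in universal_helper.sgrnas]
    if unmatched:
        raise ValueError(f"no universal sgRNA for donor target site(s): {unmatched}")
    components = ["donor vector", "locus-specific helper vector", "universal helper vector"]
    if not (locus_helper.encodes_cas or universal_helper.encodes_cas):
        components.append("separate Cas or Cpf1 component")
    return components


donor = DonorVector("human Factor IX", ["site_A"])
print(plan_delivery(donor, HelperVector(["genomic_locus_1"]), HelperVector(["site_A"])))
```

The check on `unmatched` reflects the requirement that the universal helper vector's sgRNA(s) recognize the target site(s) carried by the donor vector, so that the donor can be linearized.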
  • the donor vector comprises a single desired gene.
  • the single desired gene is a gene that is responsible for the manifestation of disease in the host (e.g., CFTR gene) .
  • the donor vector encodes a single sgRNA target site, wherein the single sgRNA target site is complementary to at least one sgRNA encoded by the universal helper vector.
  • the locus-specific helper vector or the universal helper vector further comprises a nucleic acid sequence encoding a Cas protein.
  • the locus-specific helper vector or universal helper vector further comprises a nucleic acid sequence encoding a Cas9 protein.
  • the method is applicable for the insertion of a desired gene into an animal such as a non-human animal (e.g., mouse) .
  • the method is also applicable for the insertion of a desired gene into a human (e.g., a newborn infant) .
  • the method is suitable for the insertion of a desired gene into a human having an inherited disease (e.g., familial dilated cardiomyopathy or hemophilia B).
  • the method is suitable for the insertion of a desired gene into a human having a metabolic disease (e.g., a newborn infant with a metabolic disorder, such as phenylketonuria) .
  • the method is particularly suitable for the replacement of a gene that is dysfunctional or absent from the genome of the host.
  • the desired gene is human insulin or human Factor IX.
  • the donor vector of the method encodes a human gene, e.g., human insulin or human Factor IX.
  • the donor vector of the method is suitable for administration to an animal such as a non-human animal (e.g., porcine) .
  • the donor vector of the method is suitable for administration to a human (e.g., an infant).
  • the donor vector of the method is suitable for administration to a human (e.g., an infant) to treat a somatic tissue disease (e.g., cystic fibrosis).
  • the donor vector of the method encodes a human insulin gene and a single sgRNA target site. In another embodiment, the donor vector of the method encodes a human Factor IX gene and a single sgRNA target site.
  • the sgRNAs of the locus-specific helper vector and/or universal helper vector comprise a nucleic acid region of about 20 nucleotides that are complementary to the selected nucleic acid sequence in the host cell genome or the sgRNA target sites in the donor vector.
  • the selected locus for insertion of the desired gene into the host cell genome is determined by identifying a desired gene of interest in the host and evaluating the nucleic acid sequence 3’ or 5’ of the desired gene to identify one or more sites for introduction of a double strand break that would allow insertion of the desired gene in the absence of a frameshift mutation. It will be apparent to one of skill in the art that various guidelines exist for the selection of a genomic locus (see, e.g., Ran et al., Nature Protocols, 8 (11): 2281-2308 (2013), incorporated herein by reference in its entirety).
  • the method further comprises detecting a gene product in the host cell that is encoded by the desired gene.
  • the gene product can include an mRNA transcript or protein encoded by the desired gene.
  • the gene product can be detected by a gene expression assay (e.g., Assays or RT-qPCR Assay) or protein expression assay (e.g., Western blot) .
  • the method can include detecting a protein or RNA encoded by the inserted gene from a sample obtained from a live organism.
  • Detecting a gene product in the host cell encoded by the desired gene can be achieved using any known method in the art.
  • suitable methods for detecting a gene product expressed by a desired gene inserted into the host cell genome include, but are not limited to, real-time polymerase chain reaction (RT-PCR), reverse-transcription-quantitative polymerase chain reaction (RT-qPCR), Northern blot, quantitative polymerase chain reaction (qPCR), enzyme-linked immunosorbent assay (ELISA), next-generation sequencing, fluorescence activated cell sorting (FACS), and Western blot.
  • RT-PCR real-time polymerase chain reaction
  • RT-qPCR reverse-transcription-quantitative polymerase chain reaction
  • qPCR quantitative polymerase chain reaction
  • ELISA enzyme-linked immunosorbent assay
  • FACS fluorescence activated cell sorting
  • FPs fluorescent proteins
  • GFP Green Fluorescent Protein
  • proteins expressed by the desired gene in the host cell can be detected via labelled antibodies having specificity for the expressed protein. Detection of the fluorescent mRNAs or proteins can then be observed, for example using a fluorescence microscope. In yet another embodiment, proteins expressed by the desired gene in the host cell can be detected using an enzyme-linked immunosorbent assay (ELISA) .
  • the method further comprises evaluating functionality of the inserted desired gene from a sample obtained from a live organism.
  • evaluating functionality of the inserted desired gene includes measuring or quantifying the amount (or relative amount) of one or more gene products in a host cell; and comparing the amount (or relative amount) of the one or more gene products with a reference amount (or relative amount) of the gene products from a host cell of a control sample.
  • the control sample includes a host cell from the same or related organism (e.g., by species or lineage) not having the inserted desired gene.
  • the control sample can include a host cell lacking the desired gene.
  • the control sample can include a host cell from the same or related organism (e.g., by species or lineage) having a wild-type version of the inserted desired gene.
  • evaluating the functionality of the inserted desired gene can include comparing the amount (or relative amount) of a gene product produced by a host cell having the inserted desired gene versus an amount (or relative amount) of a gene product produced by a host cell having a wild-type genome without genetic modification such as gene editing.
  • evaluating the functionality of the inserted desired gene can include comparing the amount (or relative amount) of a gene product produced by a host cell having the inserted desired gene from a subject having an inherited disease versus an amount (or relative amount) of a gene product produced by a host cell from a subject lacking the inherited disease (i.e., having wild-type genotype) .
  • the evaluating can include any method known in the art to compare two sets of biological data.
  • evaluating can include determining if a statistically relevant difference (e.g., P < 0.05) exists between a host cell expressing a gene product having the inserted desired gene (e.g., a test sample) and a host cell expressing a gene product having a wild type genotype (e.g., a control sample).
  • suitable statistical analyses include Fisher’s exact test, the Wilcoxon test, and Student’s t-test.
  • no statistically significant difference between the level of gene expression products in the control sample and the test sample (e.g., the inserted desired gene) indicates that the insertion of the desired gene into the host cell has restored wild-type functionality for the desired gene in the host cell.
  • restoration of wild-type functionality of the desired gene in the host cell can result in treatment of a disease or condition in the host.
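The comparison of gene product levels between a test sample and a control sample can be illustrated with a two-sided exact permutation test on the difference in means. This is a stand-in chosen because it needs only the Python standard library; it is not one of the tests named above (Fisher's exact test, the Wilcoxon test, Student's t-test), and the function name and the example measurements are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(test, control):
    """Two-sided exact permutation p-value for a difference in group means.

    Enumerates every way of relabelling the pooled values, so it is only
    practical for small samples.
    """
    pooled = list(test) + list(control)
    n_test = len(test)
    n_control = len(pooled) - n_test
    observed = abs(mean(test) - mean(control))
    total = sum(pooled)
    hits = count = 0
    for idx in combinations(range(len(pooled)), n_test):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_test - (total - group_sum) / n_control)
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
        count += 1
    return hits / count


# Hypothetical expression levels (arbitrary units)
edited = [10, 11, 12, 13]
wild_type = [1, 2, 3, 4]
p = permutation_p_value(edited, wild_type)
print(p, "differs at P < 0.05" if p < 0.05 else "no difference detected at P < 0.05")
```

With very small groups, a failure to reach P < 0.05 may reflect limited statistical power rather than equivalent expression, so sample size should be considered when interpreting the result.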
  • the method comprises treating a somatic tissue disease in the host.
  • the invention provides a method of genome editing, the method comprising contacting a live organism with (i) a donor vector; (ii) a locus-specific helper vector; (iii) a universal helper vector; and (iv) a Cas or Cpf1 protein (which may be in the form of a protein or encoding nucleic acid) if neither the locus-specific helper vector nor the universal helper vector encodes the Cas or Cpf1 protein; and linearizing the donor vector to form a donor DNA fragment, whereby the donor DNA fragment undergoes NHEJ-mediated knock-in into the host cell genome of the live organism.
  • the method comprises contacting a live organism with (i) a donor vector, which encodes the desired gene and one or two sgRNA target sites; (ii) a locus-specific helper vector, which encodes at least one sgRNA that is complementary to a selected nucleic acid sequence in the host cell genome; (iii) a universal helper vector, which encodes at least one sgRNA that is complementary to the one or two sgRNA target sites in the donor vector; and (iv) a Cas or Cpf1 protein (in the form of a protein or a nucleic acid encoding the protein) if neither the locus-specific helper vector nor the universal helper vector encodes the Cas or Cpf1 protein, thereby linearizing the donor vector to form a donor DNA fragment, whereby the donor DNA fragment undergoes NHEJ-mediated knock-in into the host cell genome of the live organism.
  • the method comprises contacting a live organism with (i) a donor vector, which encodes the desired gene and a single sgRNA target site; (ii) a locus-specific helper vector, which encodes a single sgRNA that is complementary to a selected nucleic acid sequence in the host cell genome; (iii) a universal helper vector, which encodes a single sgRNA that is complementary to the single sgRNA target site in the donor vector; and (iv) a Cas or Cpf1 protein (which may be in the form of a protein or an encoding nucleic acid) if the locus-specific helper vector or universal helper vector does not encode the Cas or Cpf1 protein, thereby linearizing the donor vector to form a donor DNA fragment, whereby the donor DNA fragment undergoes NHEJ-mediated knock-in into the host cell genome of the live organism.
  • the donor vector comprises a single desired gene.
  • the single desired gene is a gene that is responsible for the manifestation of disease in the live organism (e.g., CFTR gene) .
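The three-vector arrangement described above can be sketched as a small data model. This is only an illustrative reading of the text, not the inventors' implementation: the class names, field names, and site labels are hypothetical, and the sketch assumes that a separate Cas or Cpf1 component is needed only when neither helper vector encodes one, and that the universal helper should cover every sgRNA target site on the donor.

```python
from dataclasses import dataclass


@dataclass
class DonorVector:
    desired_gene: str
    sgrna_target_sites: list  # labels of the one or two sgRNA target sites


@dataclass
class HelperVector:
    sgrna_targets: list       # labels of the sequences its sgRNAs recognise
    encodes_cas: bool = False


def check_system(donor, locus_helper, universal_helper):
    """Return (problems, needs_separate_cas) for a donor/helper combination."""
    problems = []
    if not 1 <= len(donor.sgrna_target_sites) <= 2:
        problems.append("donor vector should carry one or two sgRNA target sites")
    if not locus_helper.sgrna_targets:
        problems.append("locus-specific helper needs at least one genomic target")
    # Assumption: the universal helper must be able to cut each donor target site
    missing = set(donor.sgrna_target_sites) - set(universal_helper.sgrna_targets)
    if missing:
        problems.append(f"universal helper does not target: {sorted(missing)}")
    # Assumption: Cas/Cpf1 is supplied separately only if no helper encodes it
    needs_separate_cas = not (locus_helper.encodes_cas or universal_helper.encodes_cas)
    return problems, needs_separate_cas


donor = DonorVector("hF9", ["sg-A site"])
locus = HelperVector(["Alb locus"], encodes_cas=True)
universal = HelperVector(["sg-A site"])
print(check_system(donor, locus, universal))
```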
  • the invention provides a method for treating somatic tissue disease in a subject, the method comprising (i) administering to the subject in need thereof a therapeutically effective amount of (a) a donor vector encoding a desired gene and one or two sgRNA target sites; (b) a locus-specific helper vector encoding at least one sgRNA that is complementary to a selected nucleic acid sequence in the host cell genome; (c) a universal helper vector encoding at least one sgRNA that is complementary to the one or two sgRNA target sites in the donor vector; and (d) a Cas or Cpf1 protein (which may be in the form of a protein or an encoding nucleic acid) if the locus-specific helper vector or universal helper vector does not encode the Cas or Cpf1 protein, to treat the somatic tissue disease in the subject.
  • the donor vector encodes a single sgRNA target site and a single gene.
  • the locus-specific helper vector encodes a single sgRNA that is complementary to a selected nucleic acid sequence in the host cell genome.
  • the universal helper vector encodes a single sgRNA that is complementary to a selected nucleic acid sequence in the host cell genome.
  • the universal helper vector or the locus-specific helper vector encodes a Cas protein.
  • the Cas protein is a Cas9 protein.
  • the method further comprises detecting expression of a gene product (e.g., mRNA or protein) encoded by the desired gene in a sample from the subject. In one embodiment, the method further comprises confirming expression of the gene product encoded by the desired gene is sufficient to treat the somatic tissue disease.
  • the method is suitable for the treatment of an inherited disease (e.g., familial dilated cardiomyopathy or Hemophilia B) .
  • the method is suitable for the treatment of a human having a metabolic disease (e.g., a newborn infant with a metabolic disorder, such as phenylketonuria) .
  • the method is particularly suitable for the treatment of a somatic tissue disease by replacing a defective or dysfunctional gene with a functional gene in the host cell genome.
  • the method treats human hemophilia B or type 1 diabetes mellitus.
  • the method further comprises detecting expression of a gene product in the host cell that is encoded by the desired gene.
  • the gene product can include a mRNA transcript or protein encoded by the desired gene.
  • the gene product can be detected by a gene expression assay (e.g., an RT-qPCR assay) or a protein expression assay (e.g., a Western blot).
  • the method can include detecting a protein or RNA encoded by the inserted gene from a sample obtained from a live organism.
  • Detecting expression of a gene product in the host cell encoded by the desired gene can be achieved using any known method in the art.
  • suitable methods for detecting a gene product expressed by a desired gene inserted into the host cell genome include, but are not limited to, real-time polymerase chain reaction (RT-PCR), reverse-transcription quantitative polymerase chain reaction (RT-qPCR), Northern blot, quantitative polymerase chain reaction (qPCR), enzyme-linked immunosorbent assay (ELISA), next-generation sequencing, fluorescence-activated cell sorting (FACS), and Western blot.
  • proteins expressed by the desired gene in the host cell can be detected via labelled antibodies having specificity for the expressed protein. The fluorescent mRNAs or proteins can then be observed, for example, using a fluorescence microscope. In yet another embodiment, proteins expressed by the desired gene in the host cell can be detected using an enzyme-linked immunosorbent assay (ELISA).
  • treating a somatic disease includes generating one or more mRNA transcripts or protein encoded by the desired gene in an amount (or relative amount) that is statistically significant over a sample lacking the inserted desired gene (e.g., a host cell lacking the desired gene) .
  • treating a somatic disease includes confirming the level of gene expression in the host cell having the inserted desired gene is sufficient to treat the somatic tissue disease.
  • treating a somatic tissue disease includes expression of one or more gene products that result in the amelioration of one or more symptoms associated with the somatic tissue disease.
  • treating a somatic tissue disease includes increasing the expression of one or more gene products by at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or more, as compared to a host cell lacking the desired gene.
  • treating a somatic tissue disease includes increasing the expression of one or more gene products by at least 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold or more as compared to a host cell lacking the desired gene.
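The percentage and fold-change thresholds above can be related with a short calculation. The sketch below is illustrative only: the function name and example values are hypothetical, and it assumes that "2-fold" means the treated-to-control expression ratio equals 2 (that is, a 100% increase).

```python
def expression_change(treated, control):
    """Return (fold_change, percent_increase) of treated relative to control."""
    if control <= 0:
        raise ValueError("control expression must be positive")
    fold_change = treated / control
    percent_increase = (fold_change - 1.0) * 100.0
    return fold_change, percent_increase


# Hypothetical normalised expression values (arbitrary units)
fold, percent = expression_change(treated=3.0, control=1.5)
print(f"{fold:.1f}-fold, {percent:.0f}% increase")
```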
  • the method comprises administering to the subject a donor vector, a special helper vector and a universal helper vector.
  • a distinct vector encoding a Cas protein can be administered to the subject if the universal helper vector or locus-specific helper vector does not encode a Cas protein.
  • the donor vector, special helper vector and universal helper vector can be administered intravenously, intrathecally, intraspinally, intraperitoneally, intramuscularly, intranasally, subcutaneously, orally, topically, and/or by inhalation.
  • the donor vector, special helper vector and universal helper vector are administered intravenously.
  • the donor vector, special helper vector and universal helper vector are co-administered to the host in the same volume, sequentially, or simultaneously.
  • the donor vector, special helper vector and universal helper vector are administered in a manner compatible with the dosage formulation, and in such amount as will be therapeutically effective.
  • therapeutically effective amount refers to that amount of an agent (e.g., the donor vector, special helper vector and universal helper vector as described herein) being administered that effectively treats a disease, disorder, or condition, e.g., relieves one or more of the symptoms of the disease being treated, and/or that amount that will prevent one or more of the symptoms of the disease that the subject being treated has or is at risk of developing.
  • the size of the dose will also be determined by the existence, nature, and extent of any adverse side-effects that accompany the administration of a particular agent in a particular host. Determination of the proper dosage for a particular situation is within the skill of the practitioner.
  • the somatic tissue disease is an inherited disease.
  • the inherited disease is selected from diabetes, hemophilia, sickle cell anemia, cystic fibrosis, Duchenne muscular dystrophy, hemochromatosis, congenital deafness, familial hypercholesterolemia, Huntington's disease, Tay-Sachs disease, and phenylketonuria.
  • the somatic tissue disease is a metabolic disorder.
  • the somatic tissue disease is a disease caused by mutation of a single gene.
  • Described herein are kits for treating somatic tissue diseases. Further described herein are kits for performing CRISPR/Cas genome editing and/or transcriptional modulation. In some embodiments, a kit disclosed herein can be used for performing CRISPR/Cas genome editing and transcriptional modulation.
  • a kit for treating somatic tissue disease comprises (i) a first container comprising a donor vector; (ii) a second container comprising a locus-specific helper vector; and (iii) a third container comprising a universal helper vector.
  • the donor vector of the first container encodes one or two sgRNA target sites and a desired gene.
  • the locus-specific helper vector of the second container encodes at least one sgRNA that is complementary to a selected nucleic acid sequence in the host cell genome.
  • the universal helper vector of the third container encodes at least one sgRNA that is complementary to the one or more sgRNA target sites in the donor vector.
  • the kit further comprises a host cell (e.g., human hepatocytes or neurons) in a fourth container.
  • the kit further comprises a Cas protein or a vector encoding a Cas protein in a fifth container.
  • the kit further comprises one or more additional reagents to detect/measure gene products (e.g., mRNA transcripts or proteins) produced by the desired gene.
  • the kit further comprises a reference, control, or standard by which expression of the gene products by the desired gene can be evaluated as treating the somatic tissue disease.
  • a kit for performing CRISPR/Cas genome editing comprises (i) a first container comprising a donor vector; (ii) a second container comprising a locus-specific helper vector; and (iii) a third container comprising a universal helper vector.
  • the donor vector of the first container encodes one or two sgRNA target sites and a desired gene.
  • the locus-specific helper vector of the second container encodes at least one sgRNA that is complementary to a selected nucleic acid sequence in the host cell genome.
  • the universal helper vector of the third container encodes at least one sgRNA that is complementary to the one or more sgRNA target sites in the donor vector.
  • the kit further comprises a host cell (e.g., human hepatocytes or neurons) in a fourth container.
  • the kit further comprises a Cas protein or a vector encoding a Cas protein in a fifth container.
  • the kit further comprises one or more additional reagents to detect/measure gene products (e.g., mRNA transcripts or proteins) produced by the desired gene.
  • the kit further comprises a reference, control, or standard by which expression of the gene products by the desired gene can be evaluated as an indicator of genome editing.
  • a kit for performing CRISPR/Cas transcriptional modulation comprises (i) a first container comprising a donor vector; (ii) a second container comprising a locus-specific helper vector; and (iii) a third container comprising a universal helper vector.
  • the donor vector of the first container encodes one or two sgRNA target sites and a desired gene.
  • the locus-specific helper vector of the second container encodes at least one sgRNA that is complementary to a selected nucleic acid sequence in the host cell genome.
  • the universal helper vector of the third container encodes at least one sgRNA that is complementary to the one or more sgRNA target sites in the donor vector.
  • the kit further comprises a host cell (e.g., human hepatocytes or neurons) in a fourth container.
  • the kit further comprises a Cas protein or a vector encoding a Cas protein in a fifth container.
  • the kit further comprises one or more additional reagents to detect/measure gene products (e.g., mRNA transcripts or proteins) produced by the desired gene.
  • the kit further comprises a reference, control, or standard by which expression of the gene products by the desired gene can be evaluated as modulating (i.e., increasing or decreasing) transcription in the host cell.
  • the kit comprises a plurality of containers, wherein the donor vector, locus-specific helper vector and universal helper vector are each contained in different (distinct) containers.
  • the first, second, and third containers can be of identical, similar, or distinct materials. Any appropriate storage or reaction vessel known in the art is contemplated (e.g., plastic tubes or Eppendorf tubes).
  • the first, second, and third containers are of the same material.
  • the first, second, and/or third containers are stored or maintained on ice, at -20°C, or preferably -80°C until needed.
  • first, second and/or third containers can include additional reagents, including but not limited to, buffers, dNTPs, or enzymes.
  • the kit includes instructions (e.g., on a computer readable medium or accessible via a hyperlink) or an instruction manual for use.
  • Plasmid was dissolved in saline to a concentration of 60 µg/ml, and the total volume injected into each mouse was 10% of its body weight (e.g., 2 ml for a 20 g mouse).
  • each plasmid component was given at the same dose. To ensure successful transfection, each injection was completed within 5 seconds.
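The injection arithmetic described above can be sketched as follows. This is a hedged illustration, not the authors' protocol: the function name is hypothetical, 1 g of body weight is treated as 1 ml of volume, and 60 µg/ml is assumed to be the total plasmid concentration that is then split equally among the plasmid components.

```python
def hydrodynamic_injection_plan(body_weight_g, n_plasmids,
                                conc_ug_per_ml=60.0, volume_fraction=0.10):
    """Return (volume_ml, total_plasmid_ug, per_plasmid_ug) for one mouse."""
    if body_weight_g <= 0 or n_plasmids < 1:
        raise ValueError("body weight and plasmid count must be positive")
    # Injection volume is 10% of body weight (assuming 1 g corresponds to 1 ml)
    volume_ml = body_weight_g * volume_fraction
    total_ug = volume_ml * conc_ug_per_ml
    # Equal dose for each plasmid component
    per_plasmid_ug = total_ug / n_plasmids
    return volume_ml, total_ug, per_plasmid_ug


# A 20 g mouse receiving three plasmids (donor plus two helper vectors)
volume, total, each = hydrodynamic_injection_plan(20, 3)
print(f"{volume:.1f} ml, {total:.0f} µg total, {each:.0f} µg per plasmid")
```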
  • The AAV-DJ and AAV-Helper system was used for virus packaging (see, e.g., Grieger, J.C., Choi, V.W. & Samulski, R.J. Production and characterization of adeno-associated viral vectors. Nat Protoc 1, 1412-28 (2006)).
  • Transfection was carried out in 293FT cells at 80% confluence, and the virus was collected 3 days after transfection. Briefly, the virus was extracted from the cytoplasmic and nuclear material, and gradient ultracentrifugation was then performed. Finally, the viral particles were concentrated and further purified with a 100 kDa ultra-centrifugal filter.
  • Streptozotocin (STZ) was purchased from Sigma-Aldrich (St. Louis, MO), stored at -20°C, and dissolved in sodium citrate (pH 4.5) for immediate use. A dose of 160 mg/kg was given as a single injection for this diabetic model.
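The per-animal STZ amount implied by the 160 mg/kg dose can be worked out with a one-line calculation. The sketch below is illustrative only; the function name and the 20 g example weight are hypothetical, and no injection volume is computed because the source does not give the concentration of the STZ solution.

```python
def stz_dose_mg(body_weight_g, dose_mg_per_kg=160.0):
    """Return the STZ mass (mg) for a single injection at the stated mg/kg dose."""
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    # Convert grams to kilograms before applying the mg/kg dose
    return dose_mg_per_kg * body_weight_g / 1000.0


# Hypothetical 20 g mouse
print(f"{stz_dose_mg(20):.1f} mg STZ")
```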
  • mice were fasted for 6 hours before undergoing a blood glucose measurement.
  • Blood glucose level was measured through tail vein bleeding using a Contour Next meter (Bayer AG, NJ) and associated detection strips, essentially according to the manufacturer's instructions.
  • the blood serum was collected after 6 hours of fasting for insulin detection.
  • Insulin level was measured by ELISA using ultra-sensitive mouse insulin kit bought from Hong Kong University essentially following the manufacturer’s instructions.
  • Mouse blood specimens were collected from the ophthalmic vein using capillary action. The blood samples were kept for 20 minutes at room temperature for clotting, and then centrifuged to collect the upper and transparent serum. The mouse serum samples were kept at -80°C for further analysis.
  • A fluorescence-activated cell sorting (FACS) analyzer (BD LSRFortessa Cell Analyzer) was configured with a single 488 nm argon ion laser (200 mW). The laser was used to induce light scattering by either the excitation of cellular fluorescent proteins (eGFP) or granularity within the cell. The recorded events within the gate on the FITC-A (GFP) log scale indicated the GFP expression level, and the counts indicated the number of GFP-positive cells. The ratio of GFP-positive cells to the total counts in the gated area is defined as the targeting efficiency.
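The targeting-efficiency definition above amounts to a simple ratio, sketched here under stated assumptions. The function name and the event counts are hypothetical; the counts are chosen only so that the result matches the order of magnitude of efficiencies discussed in this document.

```python
def targeting_efficiency(gfp_positive_events, total_gated_events):
    """Return GFP-positive events as a fraction of all events in the gated area."""
    if total_gated_events <= 0:
        raise ValueError("total gated events must be positive")
    if not 0 <= gfp_positive_events <= total_gated_events:
        raise ValueError("GFP-positive events must lie between 0 and the total")
    return gfp_positive_events / total_gated_events


# Hypothetical counts: 260 GFP-positive events out of 10,000 gated events
efficiency = targeting_efficiency(260, 10_000)
print(f"targeting efficiency = {efficiency:.1%}")
```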
  • Genomic DNA from cultured cells was extracted using a Genome DNA Extraction Kit (Tiangen), essentially following the manufacturer's instructions. To extract genomic DNA from liver tissues, Tris buffer and proteinase K were used for overnight digestion at 37°C, and the DNA was then purified with 75% ethanol. Typically, 200 ng of genomic DNA was used for PCR reactions containing Phusion High-Fidelity DNA Polymerase (New England Biolabs), performed essentially as set forth in the manufacturer's instructions.
  • Mouse livers were harvested immediately from sacrificed mice, saturated in PBS, and then directly imaged under Olympus SZX16 Stereomicroscope equipped with Fluorescence Imaging system.
  • PX330-Cas9 plasmid was purchased from Addgene (Addgene Catalog Number: 42230), and the sgRNA target sites were selected as previously described using the MIT-CRISPR Designer (He, X. et al. Knock-in of large reporter genes in human cells via CRISPR/Cas9-induced homology-dependent and independent DNA repair. Nucleic Acids Res 44, e85 (2016)).
  • AAV-Cas9-sgRNA plasmid was purchased from Addgene (Addgene Catalog Number: 61591), and sgRNA target sequences were designed based on the output from the website casblastr.org. All sgRNAs were constructed by annealing two synthesized oligonucleotides and inserting them at the BbsI site for PX330-Cas9 or the BsaI site for AAV-Cas9.
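The oligonucleotide-annealing step above can be illustrated with a small helper that builds the two oligos for a given guide sequence. This is a sketch under explicit assumptions: the source does not state the overhang sequences, so the `CACC` and `AAAC` overhangs and the added 5' G (for guides that do not start with G) follow a commonly used convention for BbsI cloning into PX330-type vectors and should be checked against the actual vector map. The function names and guide sequence are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_sgrna_oligos(guide, top_overhang="CACC", bottom_overhang="AAAC"):
    """Return (top_oligo, bottom_oligo), both written 5'->3', for annealing."""
    guide = guide.upper()
    if not guide or set(guide) - set("ACGT"):
        raise ValueError("guide must be a non-empty sequence of A, C, G, T")
    # Assumption: prepend a G when the guide does not already start with one
    insert = guide if guide.startswith("G") else "G" + guide
    top = top_overhang + insert
    bottom = bottom_overhang + reverse_complement(insert)
    return top, bottom


# Hypothetical 20-nt guide sequence
top_oligo, bottom_oligo = design_sgrna_oligos("GATTACAGATTACAGATTAC")
print(top_oligo)
print(bottom_oligo)
```

Once annealed, the two oligos form a duplex with single-stranded overhangs at both ends, which is what allows directional ligation into the digested vector.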
  • NHEJ-GFP donor was used as previously described (He, X. et al. Knock-in of large reporter genes in human cells via CRISPR/Cas9-induced homology-dependent and independent DNA repair. Nucleic Acids Res 44, e85 (2016) ) .
  • the GFP cassette was then replaced with the hINS or hF9 CDS through MscI and NsiI double digestion to construct the NHEJ-hINS and NHEJ-hF9 donors, respectively.
  • the backbone of the AAV donors was purchased from Addgene (Addgene Catalog Number: 21894).
  • Two oligonucleotides containing multiple cloning sites (MCS) were synthesized, annealed, and inserted into this backbone after NotI and SpeI digestion.
  • the sgRNA target sequences were synthesized and inserted into the MCS, then the IRES-eGFP-PA, IRES-hINS-PA or IRES-hF9-PA cassette was amplified from the corresponding NHEJ donor and inserted into the MCS to generate AAV-NHEJ-eGFP, AAV-NHEJ-luciferase, AAV-NHEJ-hINS or AAV-NHEJ-hF9, respectively.
  • the starting plasmid was pEGFP-N1 (Addgene Catalog Number: 6085-1).
  • the eGFP cassette was replaced with hINS or hF9 CDS to construct CMV-hINS and CMV-hF9 plasmids, respectively, for ectopic expression.
  • new systems were constructed to adopt a CRISPR/Cas9-NHEJ approach for in vivo knock-in and expression of transgenes in mouse liver. Excellent results with the new systems were demonstrated for efficient knock-in of different transgenes under in vivo conditions through the CRISPR/Cas9-NHEJ approach, which itself is a valuable new tool and technology for animal-based biomedical or preclinical research.
  • hF9 transgene was successfully introduced into mouse liver, and prolonged expression and secretion of human FIX was achieved at a significant level in a mouse model.
  • FIG. 1C clearly shows a large population of GFP-positive hepatocytes in injected mice (FIG. 1C, lower row), but not in the control groups (FIG. 1C, middle and upper rows).
  • PCR on genomic DNA extracted from the liver tissues verified successful integration of the ires-GFP transgene at the target site in the 3’-UTR of the mouse Actb gene; the transgene was found in both orientations (FIG. 1D, middle lane).
  • T1DM patients are characterized by the loss of pancreatic β-cells and a deficiency of insulin synthesis; for these patients, daily insulin administration throughout life is essential for survival (Huen et al., An Update on the Epidemiology of Childhood Diabetes in Hong Kong. HK J Paediatr (New Series) 14, 252-259 (2009)).
  • Streptozotocin (STZ) is selectively toxic to pancreatic β-cells and has been widely used to induce type 1 diabetes mellitus (T1DM) in animal models (Deeds, M.C. et al. Single dose streptozotocin-induced diabetes: considerations for study design in islet transplantation models. Lab Anim 45, 131-40 (2011)).
  • Hemophilia B is an X-linked bleeding disorder caused by a deficiency of coagulation Factor IX (FIX), which is encoded by the human F9 (hF9) gene; it affects 1 in 25,000 to 30,000 males worldwide (Au, W.Y. et al. A synopsis of current haemophilia care in Hong Kong. Hong Kong Med J 17, 189-94 (2011)). These patients require life-long protein infusions 2-3 times a week for treatment and prophylaxis.
  • Hemophilia B is caused by a deficiency or limited amount of coagulation Factor IX (FIX). It was examined whether the newly developed in vivo NHEJ knock-in system described above (e.g., for treatment of T1DM) could be used to supply FIX to alleviate or cure hemophilia B. As with the hINS knock-in, a novel donor carrying ires-hF9 was constructed and injected together with sg-A and sg-Alb into mice via the tail vein (FIG. 3A).
  • Delivery of constitutively expressed human F9 (hF9) cDNA to mouse liver via hydrodynamic injection resulted in successful FIX expression and secretion (FIG. 3C).
  • Secreted FIX protein was successfully detected in the mouse blood serum after administration and remained at a high level for at least seven days after administration, which would provide sufficient levels of FIX to alleviate abnormal bleeding symptoms over the long term.
  • immunohistofluorescence staining confirmed expression of FIX in the hepatocytes, indicating the significant potential of the newly developed constructs and methods to treat hemophilia B (FIG. 3D).
  • EXAMPLE 6: ADENO-ASSOCIATED VIRUSES (AAVs) FOR CRISPR/CAS9-NHEJ KNOCK-IN
  • Adeno-associated virus (AAV) is a non-pathogenic, non-integrating parvovirus.
  • AAV gene delivery vectors are being investigated as vehicles for gene therapy for a wide variety of hereditary and acquired human diseases.
  • AAV's inability to self-propagate, its ability to be maintained as an episome in the transduced cell, and its relatively innocuous effects on the immune system make it the vector of choice for prolonged in vivo gene expression.
  • the pseudovirus delivery system developed based on AAV has been regarded as the safest vehicle for in vivo application and gene therapy (see Hastie, E. & Samulski, R.J. Adeno-associated virus at 50: a golden anniversary of discovery, research, and gene therapy success--a personal perspective).
  • AAV-based plasmids required for NHEJ-mediated knock-in were constructed, including AAV-donor, AAV-Cas9-sgA and AAV-Cas9-sgTarget (FIG. 4A) .
  • the human GAPDH 3’-UTR was first chosen as the target site, and the AAV-based NHEJ knock-in was examined using ires-GFP reporter in a human cell line (see He, X. et al. Knock-in of large reporter genes in human cells via CRISPR/Cas9-induced homology-dependent and independent DNA repair. Nucleic Acids Res 44, e85 (2016) ) .
  • the purified AAV particles, including AAV-donor, AAV-Cas9-sgA and AAV-Cas9-sgGAPDH, were transduced into human HEK293T cells, and flow cytometry analysis showed that around 2.6% of cells were GFP positive and carried the transgene integration (FIG. 4B, right panel), while the control group showed nearly no detectable GFP-positive cells (FIG. 4B, left panel), indicating that the AAV system is suitable for delivering DNA components and supporting CRISPR/Cas9-NHEJ-mediated knock-in.
  • the AAV pseudoviral particles carrying the required DNA components were injected into mice via the tail vein (FIG. 4C) .
  • GFP-positive cells were detected in the mouse livers from the knock-in group (FIG. 4D, top row) but not in the control group (FIG. 4D, lower row).
  • EXAMPLE 7: ADENO-ASSOCIATED VIRUSES (AAVs) FOR CRISPR/CAS9-NHEJ IN VIVO KNOCK-IN OF HUMAN F9 GENE
  • the CRISPR/Cas9-NHEJ in vivo knock-in approach presented herein demonstrated improved integration efficiency, significantly higher than current HDR-based knock-in techniques, due to the dominant prevalence of the NHEJ mechanism under in vivo conditions.
  • this approach introduced efficient transgene integrations at specific target sites in the host cell genome, which are permanent in the host cells and can therefore support long-term expression and potentially lasting therapeutic effect.
  • the inventors have identified an active target site, which can ensure transgene integration at a high efficiency and render the transgene’s expression at high levels after knock-in.
  • the present disclosure provides and contemplates various designs of donor constructs, including optimized clinically acceptable AAV-donors in conjunction with the CRISPR/Cas9-NHEJ system. These advantages make the systems, compositions, kits, and methods disclosed herein user-friendly and adaptable to various research and clinical applications.
  • the CRISPR/Cas9-NHEJ system disclosed herein has proven itself to be an excellent tool for studying various diseases or gene functions using mouse models and can be easily adopted for any other species.
  • the T1DM and hemophilia B mouse models used herein have demonstrated that the CRISPR/Cas9-NHEJ knock-in approach disclosed herein shows great potential for developing novel gene-based therapies for various human diseases.

Abstract

The present invention relates to compositions, methods, and kits for in vivo genome editing. The invention also relates to compositions, methods, and kits for treating somatic tissue diseases.
PCT/CN2018/123517 2018-01-05 2018-12-25 High-efficiency in vivo knock-in using CRISPR WO2019134561A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201880090785.2A CN111886341A (zh) 2018-01-05 2018-12-25 High-efficiency in vivo knock-in using CRISPR

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201862614229P 2018-01-05 2018-01-05
US62/614,229 2018-01-05

Publications (1)

Publication Number Publication Date
WO2019134561A1 (fr) 2019-07-11

Family

ID=67143880

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/123517 WO2019134561A1 (fr) 2018-01-05 2018-12-25 High-efficiency in vivo knock-in using CRISPR

Country Status (2)

Country Link
CN (1) CN111886341A (fr)
WO (1) WO2019134561A1 (fr)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110628825A (zh) * 2019-10-14 2019-12-31 上海捷易生物科技有限公司 NHEJ-dependent reporter gene knock-in composition and method of use thereof
WO2021170089A1 (fr) * 2020-02-28 2021-09-02 The Chinese University Of Hong Kong Engineering immune cells via simultaneous knock-in and gene disruption
US11622547B2 (en) 2019-06-07 2023-04-11 Regeneron Pharmaceuticals, Inc. Genetically modified mouse that expresses human albumin

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112582024B (zh) * 2020-12-23 2021-11-02 广州赛业百沐生物科技有限公司 Method, system, and platform for constructing site-specific gene knock-in vectors

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103388006A (zh) * 2013-07-26 2013-11-13 华东师范大学 Method for constructing site-directed gene mutations
CN104342457A (zh) * 2014-10-17 2015-02-11 杭州师范大学 Method for site-specific integration of an exogenous gene into a target gene
CN106032540A (zh) * 2015-03-16 2016-10-19 中国科学院上海生命科学研究院 Construction of adeno-associated virus vectors for the CRISPR/Cas9 endonuclease system and uses thereof
WO2017059241A1 (fr) * 2015-10-02 2017-04-06 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Lentiviral protein delivery system for RNA-guided genome editing

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2959130A1 (fr) * 2014-08-11 2016-02-18 The Board Of Regents Of The University Of Texas System Prevention of muscular dystrophy by CRISPR/Cas9-mediated gene editing
CN107429263A (zh) * 2015-01-15 2017-12-01 斯坦福大学托管董事会 Methods of regulating genome editing
CN106893739A (zh) * 2015-11-17 2017-06-27 香港中文大学 Novel methods and systems for targeted gene manipulation
SG11201805680SA (en) * 2016-01-15 2018-07-30 Sangamo Therapeutics Inc Methods and compositions for the treatment of neurologic disease

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
BACHU, R. ET AL.: "CRISPR-Cas Targeted Plasmid Integration Into Mammalian Cells via Non-Homologous End Joining", BIOTECHNOLOGY AND BIOENGINEERING, vol. 112, no. 10, 7 July 2015 (2015-07-07), pages 2154 - 2162, XP055622158 *
GAJ, T. ET AL.: "Targeted gene knock-in by homology-directed genome editing using Cas9 ribonucleoprotein and AAV donor delivery", NUCLEIC ACIDS RESEARCH, vol. 45, no. 11, 2 March 2017 (2017-03-02), pages 1 - 11, XP055485687 *
HE, X.J. ET AL.: "Knock-in of large reporter genes in human cells via CRISPR/Cas9-induced homology-dependent and independent DNA repair", NUCLEIC ACIDS RESEARCH, vol. 44, no. 9, 4 February 2016 (2016-02-04), pages 1 - 14, XP055415869 *
YANG, Y. ET AL.: "A dual AAV system enables the Cas9-mediated correction of a metabolic liver disease in newborn mice", NATURE BIOTECHNOLOGY, vol. 34, no. 3, 1 February 2016 (2016-02-01), pages 334 - 340, XP055569763 *

Also Published As

Publication number Publication date
CN111886341A (zh) 2020-11-03

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18898307

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18898307

Country of ref document: EP

Kind code of ref document: A1