WO2023154877A2 - RNA-guided genome recombineering at kilobase scale - Google Patents
- Publication number
- WO2023154877A2 (PCT/US2023/062406)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- protein
- composition
- cell
- recombination
- sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K48/00—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/102—Mutagenizing nucleic acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/113—Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/115—Aptamers, i.e. nucleic acids binding a target molecule specifically and with high affinity without hybridising therewith ; Nucleic acids binding to non-nucleic acids, e.g. aptamers
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/87—Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
- C12N15/90—Stable introduction of foreign DNA into chromosome
- C12N15/902—Stable introduction of foreign DNA into chromosome using homologous recombination
- C12N15/907—Stable introduction of foreign DNA into chromosome using homologous recombination in mammalian cells
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/12—Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
- C12N9/1241—Nucleotidyltransferases (2.7.7)
- C12N9/1276—RNA-directed DNA polymerase (2.7.7.49), i.e. reverse transcriptase or telomerase
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases [RNase]; Deoxyribonucleases [DNase]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases [RNase]; Deoxyribonucleases [DNase]
- C12N9/222—Clustered regularly interspaced short palindromic repeats [CRISPR]-associated [CAS] enzymes
- C12N9/226—Class 2 CAS enzyme complex, e.g. single CAS protein
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y207/00—Transferases transferring phosphorus-containing groups (2.7)
- C12Y207/07—Nucleotidyltransferases (2.7.7)
- C12Y207/07049—RNA-directed DNA polymerase (2.7.7.49), i.e. telomerase or reverse-transcriptase
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/01—Fusion polypeptide containing a localisation/targetting motif
- C07K2319/09—Fusion polypeptide containing a localisation/targetting motif containing a nuclear localisation signal
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/80—Fusion polypeptide containing a DNA binding domain, e.g. Lacl or Tet-repressor
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/20—Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPR]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2750/00—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssDNA viruses
- C12N2750/00011—Details
- C12N2750/14011—Parvoviridae
- C12N2750/14111—Dependovirus, e.g. adenoassociated viruses
- C12N2750/14141—Use of virus, viral particle or viral elements as a vector
- C12N2750/14143—Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
Definitions
- the present invention relates to RNA-guided recombineering-editing systems using phage recombination enzymes as well as methods, vectors, nucleic acid compositions, and kits thereof.
- CRISPR Clustered Regularly Interspaced Short Palindromic Repeats
- systems and methods that facilitate nucleic acid editing in a manner that allows large-scale nucleic acid editing with high accuracy and low off-target errors.
- These systems and methods employ a recombination protein component and optionally a CRISPR component.
- a system comprising a binding protein, a nucleic acid molecule comprising a guide RNA sequence that is complementary to a target DNA sequence, and a recombination protein.
- the recombination protein may be a single stranded DNA annealing protein (SSAP), including but not limited to a microbial recombination protein, for example, RecE, RecT, lambda exonuclease (Exo), Bet protein (betA, redB), exonuclease gp6, single-stranded DNA-binding protein gp2.5, or a derivative or variant thereof.
- the system further comprises donor DNA.
- the target DNA sequence is a genomic DNA sequence in a host cell. In certain embodiments, there is no CRISPR component.
- the system comprises a recruitment system which recruits the recombination protein and a nucleic acid that directs the recombination protein to a target. In certain embodiments, the recruitment system recruits the recombination protein, the nucleic acid that directs the recombination protein, and a CRISPR component.
- the invention provides a recombination system comprising an SSAP and lacking a CRISPR component.
- the invention provides a system or composition comprising: (i) a nucleic acid molecule comprising a guide RNA sequence that is complementary to a target DNA sequence; and (ii) a recombination protein, wherein the recombination protein comprises an exonuclease, a single stranded DNA annealing protein (SSAP), or a single stranded DNA binding protein (SSB), or a combination of two or more thereof; or, (iii) nucleic acid molecule(s) encoding or delivering (i), and/or (ii) for expression in vivo in a cell; or, (iv) vector(s) containing the nucleic acid molecule(s) of (iii) for expression in vivo in a cell.
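The complementarity requirement in (i) can be illustrated with a minimal Python sketch. It assumes plain Watson-Crick pairing over the full length of the target and ignores PAM requirements, scaffold sequences, and mismatch tolerance, none of which are specified here. The function names are hypothetical and are not taken from the publication.

```python
# Hypothetical helpers for illustration only; not part of the disclosure.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def complementary_guide(target_dna: str) -> str:
    """Return the RNA, written 5'->3', that base-pairs with target_dna (5'->3')."""
    target = target_dna.upper()
    unknown = set(target) - set(DNA_TO_RNA_COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected bases: {sorted(unknown)}")
    # Complement each base, walking the target backwards so that the
    # returned RNA is also written in the 5'->3' direction.
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in reversed(target))


def is_complementary(guide_rna: str, target_dna: str) -> bool:
    """True if guide_rna is the exact reverse complement (as RNA) of target_dna."""
    return guide_rna.upper() == complementary_guide(target_dna)
```

For example, `complementary_guide("ATGC")` gives `"GCAU"`.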
- the system or composition does not comprise a CRISPR protein
- the system or composition comprises a recruitment system for recruiting a guide nucleic acid and a recombination protein.
- the recruitment system comprises at least one aptamer sequence and an aptamer binding protein functionally linked to the recombination protein as part of a fusion protein.
- the at least one aptamer sequence is an RNA aptamer sequence or a peptide aptamer sequence.
- the nucleic acid molecule or nucleic acid molecules additionally comprises the at least one RNA aptamer sequence or comprises one, two, three, or more RNA aptamer sequences.
- two aptamer sequences comprise the same sequence or comprise sequences that bind to the same aptamer binding protein.
- the aptamer binding protein comprises an MS2 coat protein, or a functional derivative or variant thereof. In certain embodiments, the aptamer binding protein comprises phage N peptide, or a functional derivative or variant thereof. In certain embodiments, the at least one peptide aptamer sequence is conjugated to the guide RNA. In certain embodiments, the at least one peptide aptamer sequence comprises between 1 and 24 peptide aptamer sequences. In certain embodiments, two or more aptamer sequences comprise the same sequence. In certain embodiments, an aptamer sequence comprises a GCN4 peptide sequence.
- the recombination protein N-terminus is linked to the aptamer binding protein C-terminus.
- the recombination protein and the aptamer binding protein are operably linked by a linker.
- the recombination system or composition comprises at least one nuclear localization sequence (NLS), optionally wherein the NLS(s) is linked to the recombination protein.
- NLS nuclear localization sequence
- the NLS is located at the recombination protein C-terminus or at the recombination protein N-terminus.
- the recombination protein comprises a microbial recombination protein or active portion thereof, a mitochondrial recombination protein or active portion thereof, a viral recombination protein or active portion thereof, or a eukaryotic recombination protein or active portion thereof, including without limitation, a recombination protein set forth in Table 12 or derivative or variant or functional portion thereof.
- the recombination protein comprises an amino acid sequence with at least 70% identity, or at least 75% identity, or at least 80% identity, or at least 85% identity, or at least 90% identity, or at least 92% identity, or at least 95% identity, or at least 96% identity, or at least 97% identity, or at least 98% identity, or at least 99% identity to a recombination protein set forth in Table 12 or derivative or variant or functional portion thereof.
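As an illustration of the identity thresholds, the following Python sketch computes percent identity from two sequences that have already been aligned (gaps written as "-"). The text does not say how identity is to be calculated, so the denominator used here (all alignment columns that are not gap-only) is an assumption, and the function names are hypothetical.

```python
# Thresholds recited in the text, in percent.
IDENTITY_THRESHOLDS = (70, 75, 80, 85, 90, 92, 95, 96, 97, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: columns where both sequences have a gap are skipped, and a
    residue aligned to a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


def thresholds_met(identity: float) -> list[int]:
    """Return every recited 'at least N%' threshold that the value satisfies."""
    return [t for t in IDENTITY_THRESHOLDS if identity >= t]
```

With the toy inputs `"ACDEFGHIKL"` and `"ACDEFGHIKV"`, nine of ten positions match, so the identity is 90.0 and the thresholds from 70 through 90 are met.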
- the system or composition comprises a donor nucleic acid.
- the donor nucleic acid comprises homology arms.
- the recombination system is comprised in a cell, for example, a eukaryotic cell, a mammalian cell, an animal cell, a human cell, or a plant cell.
- the recruitment system is adaptable to a multitude of combinations and configurations of recombination proteins.
- the system can comprise multiple recombination proteins, which may be the same or different and in various ratios.
- the system comprises an exonuclease.
- the system comprises an SSAP.
- the system comprises an SSB.
- the system comprises an exonuclease and an SSAP.
- the system comprises an exonuclease and an SSB.
- the system comprises an SSAP and an SSB.
- the system comprises an exonuclease and an SSAP and does not comprise an SSB. In certain embodiments, the system comprises an exonuclease and an SSB and does not comprise an SSAP. In certain embodiments, the system comprises an SSAP and an SSB and does not comprise an exonuclease. In certain embodiments, the system comprises an exonuclease, an SSAP, and an SSB.
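The component combinations listed above are the non-empty subsets of the three recombination protein classes. A short Python sketch enumerates them; the names are illustrative only.

```python
from itertools import combinations

# The three recombination protein classes named in the text.
COMPONENTS = ("exonuclease", "SSAP", "SSB")


def component_combinations() -> list[tuple[str, ...]]:
    """Enumerate every non-empty combination of the three component classes."""
    return [
        combo
        for size in range(1, len(COMPONENTS) + 1)
        for combo in combinations(COMPONENTS, size)
    ]


if __name__ == "__main__":
    for combo in component_combinations():
        print(" + ".join(combo))
```

This yields seven combinations: three single components, three pairs, and the full set of three.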
- the invention provides a recombination system comprising an SSAP and a reverse transcriptase (RT).
- the invention provides a system or composition comprising: (i) a reverse transcriptase(s) (RT); (ii) a nucleic acid molecule comprising a guide RNA sequence that is complementary to a target DNA sequence and RNA for reverse transcription, or nucleic acid molecules comprising a guide RNA sequence that is complementary to a target DNA sequence and RNA for reverse transcription; and (iii) a recombination protein, wherein the recombination protein comprises an exonuclease, a single stranded DNA annealing protein (SSAP), or a single stranded DNA binding protein (SSB), or a combination of two or more thereof; or, (iv) nucleic acid molecule(s) encoding or delivering (i), and/or (ii) and/or (iii) for expression in vivo
- SSAP single stranded DNA annealing protein
- the system or composition further comprises a Cas protein; or (iv) comprises nucleic acid molecule(s) encoding or delivering (i), and/or (ii) and/or (iii) and/or a Cas protein for expression in vivo in a cell; or the vector(s) of (v) additionally contains nucleic acid molecule(s) encoding a Cas protein.
- one or more of the components is provided as a complex.
- a protein or a fusion protein and a nucleic acid are provided as a ribonucleoprotein (RNP).
- RNP ribonucleoprotein
- Nonlimiting examples of an RNP include a CRISPR-guide RNA complex and an SSAP-guide RNA complex.
- a fusion protein comprises one or more components. Nonlimiting examples include a Cas9-SSAP fusion, a Cas9-RT fusion, and an SSAP-RT fusion.
- the system or composition comprises a recruitment system for recruiting a guide nucleic acid and a recombination protein.
- the recruitment system comprises at least one aptamer sequence and an aptamer binding protein functionally linked to the recombination protein as part of a fusion protein.
- the at least one aptamer sequence is an RNA aptamer sequence or a peptide aptamer sequence.
- the nucleic acid molecule or nucleic acid molecules additionally comprises the at least one RNA aptamer sequence or comprises one, two, three, or more RNA aptamer sequences.
- two aptamer sequences comprise the same sequence or comprise sequences that bind to the same aptamer binding protein.
- the aptamer binding protein comprises an MS2 coat protein, or a functional derivative or variant thereof.
- the aptamer binding protein comprises phage N peptide, or a functional derivative or variant thereof.
- the at least one peptide aptamer sequence is conjugated to the guide RNA.
- the at least one peptide aptamer sequence comprises between 1 and 24 peptide aptamer sequences.
- two or more aptamer sequences comprise the same sequence.
- an aptamer sequence comprises a GCN4 peptide sequence.
- the recombination protein N-terminus is linked to the aptamer binding protein C-terminus.
- the recombination protein and the aptamer binding protein are operably linked by a linker.
- the recombination system or composition comprises at least one nuclear localization sequence (NLS), optionally wherein the NLS(s) is linked to the recombination protein.
- the NLS is located at the recombination protein C-terminus or at the recombination protein N-terminus.
- the recombination protein comprises a microbial recombination protein or active portion thereof, a mitochondrial recombination protein or active portion thereof, a viral recombination protein or active portion thereof, or a eukaryotic recombination protein or active portion thereof, including without limitation, a recombination protein set forth in Table 12 or derivative or variant or functional portion thereof.
- the recombination protein comprises an amino acid sequence with at least 70% identity, or at least 75% identity, or at least 80% identity, or at least 85% identity, or at least 90% identity, or at least 92% identity, or at least 95% identity, or at least 96% identity, or at least 97% identity, or at least 98% identity, or at least 99% identity to a recombination protein set forth in Table 12 or derivative or variant or functional portion thereof.
- the system or composition comprises a donor nucleic acid.
- the donor nucleic acid comprises homology arms.
- the recombination system is comprised in a cell, for example, a eukaryotic cell, a mammalian cell, an animal cell, a human cell, or a plant cell.
- the invention provides a method of recombination, which comprises providing in a cell, a system or composition, (i) a nucleic acid molecule comprising a guide RNA sequence that is complementary to a target DNA sequence; wherein the target DNA sequence comprises a genomic DNA sequence in the cell, and (ii) a recombination protein, wherein the recombination protein comprises an exonuclease, a single stranded DNA annealing protein (SSAP), or a single stranded DNA binding protein (SSB), or a combination of two or more thereof; or, (iii) nucleic acid molecule(s) encoding or delivering (i), and/or (ii) for expression in vivo in a cell; or, (iv) vector(s) containing the nucleic acid molecule(s) of (iii) for expression in vivo in a cell.
- SSAP single stranded DNA annealing protein
- SSB single stranded DNA binding protein
- (i) and (ii) further comprise a Cas protein or a nucleic acid polymerase, including but not limited to a native or engineered polymerase having reverse transcriptase activity such as a reverse transcriptase (RT) or a Cas protein and RT; or (iii) comprises nucleic acid molecule(s) encoding or delivering (i), and/or (ii) and/or a Cas protein and/or an RT for expression in vivo in the cell; or the vector(s) of (iv) additionally contains nucleic acid molecule(s) encoding a Cas protein and/or RT.
- RT reverse transcriptase
- one or more of the components is provided as a complex.
- a protein or a fusion protein and a nucleic acid are provided as a ribonucleoprotein (RNP).
- Nonlimiting examples of an RNP include a CRISPR-guide RNA complex and an SSAP-guide RNA complex.
- a fusion protein comprises one or more components. Nonlimiting examples include a Cas9-SSAP fusion, a Cas9-RT fusion, and an SSAP-RT fusion.
- the target DNA sequence comprises a genomic sequence of albumin (ALB), AAVS1, HSP90AA1, DYNLT1, ACTB, BCAP31, HIST1H2BK, CLTA, or RAB11A.
- the system or composition comprises a recruitment system for recruiting a guide nucleic acid and a recombination protein.
- the recruitment system comprises at least one aptamer sequence and an aptamer binding protein functionally linked to the recombination protein as part of a fusion protein.
- the at least one aptamer sequence is an RNA aptamer sequence or a peptide aptamer sequence.
- the nucleic acid molecule or nucleic acid molecules additionally comprises the at least one RNA aptamer sequence or comprises one, two, three, or more RNA aptamer sequences.
- two aptamer sequences comprise the same sequence or comprise sequences that bind to the same aptamer binding protein.
- the aptamer binding protein comprises an MS2 coat protein, or a functional derivative or variant thereof. In certain embodiments, the aptamer binding protein comprises phage N peptide, or a functional derivative or variant thereof. In certain embodiments, the at least one peptide aptamer sequence is conjugated to the guide RNA. In certain embodiments, the at least one peptide aptamer sequence comprises between 1 and 24 peptide aptamer sequences. In certain embodiments, two or more aptamer sequences comprise the same sequence. In certain embodiments, an aptamer sequence comprises a GCN4 peptide sequence.
- the recombination protein N-terminus is linked to the aptamer binding protein C-terminus.
- the recombination protein and the aptamer binding protein are operably linked by a linker.
- the linker comprises the amino acid sequence of SEQ ID NO: 15.
- the recombination system or composition comprises at least one nuclear localization sequence (NLS), optionally wherein the NLS(s) is linked to the recombination protein.
- the NLS comprises the amino acid sequence of SEQ ID NO: 16.
- the NLS is located at the recombination protein C-terminus or at the recombination protein N-terminus.
- the recombination protein comprises a microbial recombination protein or active portion thereof, a mitochondrial recombination protein or active portion thereof, a viral recombination protein or active portion thereof, or a eukaryotic recombination protein or active portion thereof, including without limitation, a recombination protein set forth in Table 12 or derivative or variant or functional portion thereof.
- the recombination protein comprises an amino acid sequence with at least 70% identity, or at least 75% identity, or at least 80% identity, or at least 85% identity, or at least 90% identity, or at least 92% identity, or at least 95% identity, or at least 96% identity, or at least 97% identity, or at least 98% identity, or at least 99% identity to a recombination protein set forth in Table 12 or derivative or variant or functional portion thereof.
- the system or composition comprises a donor nucleic acid.
- the donor nucleic acid comprises homology arms.
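A donor with homology arms can be pictured as a payload flanked by copies of the genomic sequence on either side of the intended insertion site. The Python sketch below assembles such a donor from toy strings. The arm length, the pure-insertion design (no genomic bases replaced), and all names are assumptions made for illustration; the text does not prescribe them.

```python
def build_donor(genome: str, site: int, payload: str, arm_length: int) -> str:
    """Assemble left arm + payload + right arm around an insertion site.

    `site` is a 0-based index; the payload is placed between genome[site - 1]
    and genome[site]. Both arms are copied directly from the genome so that
    they are homologous to the sequence flanking the site.
    """
    if arm_length <= 0:
        raise ValueError("arm_length must be positive")
    if site - arm_length < 0 or site + arm_length > len(genome):
        raise ValueError("homology arms would run past the end of the genome")
    left_arm = genome[site - arm_length:site]
    right_arm = genome[site:site + arm_length]
    return left_arm + payload + right_arm


if __name__ == "__main__":
    toy_genome = "AAAACCCCGGGGTTTT"  # toy sequence, not a real locus
    print(build_donor(toy_genome, site=8, payload="ACGTAC", arm_length=4))
```

Real designs would use much longer arms and a real locus; the toy values only show how the three segments are laid out.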
- the recombination system is comprised in a cell, for example, a eukaryotic cell, a mammalian cell, an animal cell, a human cell, or a plant cell.
- the Cas protein is Cas9 or Cas12a. In some embodiments, the Cas protein is catalytically dead. In some embodiments, the Cas9 protein is wild-type Streptococcus pyogenes Cas9 or wild-type Staphylococcus aureus Cas9. In some embodiments, the Cas9 protein is a Cas9 nickase (e.g., wild-type Streptococcus pyogenes Cas9 with an amino acid substitution at position 10 of D10A).
- a eukaryotic cell comprising the systems or vectors disclosed herein.
- methods of altering a target genomic DNA sequence in a host cell comprise contacting the systems, compositions, or vectors described herein with a target DNA sequence (e.g., introducing the systems, compositions, or vectors described herein into a host cell comprising a target genomic DNA sequence). Kits containing one or more reagents or other components useful, necessary, or sufficient for practicing any of the methods are also disclosed herein.
- the invention provides a system or composition comprising: (i) a nucleic acid polymerase, such as a reverse transcriptase(s) (RT); (ii) a nucleic acid molecule comprising a guide RNA sequence that is complementary to a target DNA sequence and RNA for reverse transcription, or nucleic acid molecules comprising a guide RNA sequence that is complementary to a target DNA sequence and RNA for reverse transcription; and (iii) a recombination protein, wherein the recombination protein comprises an exonuclease, a single stranded DNA annealing protein (SSAP), or a single stranded DNA binding protein (SSB), or a combination of two or more thereof; or, (iv) nucleic acid molecule(s) encoding or delivering (i), (ii) and (iii) for expression in vivo in a cell; or, (v) vector(s) containing the nucleic acid molecule(s)
- the RT system or composition can involve (i) being enzyme, (ii) being nucleic acid molecule(s), and (iii) being nucleic acid molecules; or (i) being nucleic acid molecule(s) encoding the enzyme(s), (ii) being nucleic acid molecule(s), and (iii) being protein, or all of (i), (ii) and (iii) being nucleic acid molecules.
- the RT system or composition can include more than one reverse transcriptase. When there is more than one reverse transcriptase there can be more than one RNA for reverse transcription.
- composition (i), (ii) and (iii) further comprises a Cas protein; or (iv) further comprises nucleic acid molecule(s) encoding a Cas protein, e.g., (iv) comprises nucleic acid molecule(s) encoding or delivering (i), and/or (ii) and/or (iii) and/or a Cas protein for expression in vivo in a cell; or the vector(s) of (v) additionally contain nucleic acid molecule(s) encoding a Cas protein.
- Reverse transcriptases that can be used according to the invention include, without limitation, reverse transcriptases, retrotransposon reverse transcriptases, retron reverse transcriptases, LINE-1 reverse transcriptase, Ec86 reverse transcriptase, human immunodeficiency virus (HIV) RT, avian myeloblastosis virus (AMV) RT, Moloney murine leukemia virus (M-MLV) RT, a group II intron RT, a group II intron-like RT, a chimeric RT, Rous sarcoma virus (RSV) reverse transcriptase, Rous-associated virus (RAV) reverse transcriptase and myeloblastosis-associated virus (MAV) reverse transcriptase or other avian sarcoma-leukosis virus (Avian sarcoma le
- Such engineered polymerases include, without limitation, human DNA polymerase η, which has reverse transcriptase activity in cellular environments (Su et al. 2019, J. Biol. Chem. 294(15):6073-81), and Taq DNA polymerase engineered to enhance reverse transcription and strand displacement (Barnes et al., Front. Bioeng. Biotechnol., 14 January 2021, doi.org/10.3389/fbioe.2020.553474).
- the RT system or composition further comprises a recruitment system comprising at least one aptamer sequence; and an aptamer binding protein functionally linked to the recombination protein as part of a fusion protein.
- the at least one aptamer sequence is an RNA aptamer sequence or a peptide aptamer sequence.
- in the RT system or composition having a recruitment system, the nucleic acid molecule or nucleic acid molecules additionally comprise the at least one RNA aptamer sequence, for example two RNA aptamer sequences; for instance, the two RNA aptamer sequences comprise the same sequence.
- in the RT system or composition having a recruitment system, the aptamer binding protein comprises an MS2 coat protein, or a functional derivative or variant thereof; and/or the aptamer binding protein comprises phage N peptide, or a functional derivative or variant thereof; and/or the at least one peptide aptamer sequence is conjugated to the Cas protein; and/or the at least one peptide aptamer sequence comprises between 1 and 24 peptide aptamer sequences; and/or the aptamer sequences comprise the same sequence.
- in the RT system or composition having a recruitment system, the aptamer sequence comprises a GCN4 peptide sequence.
- the recombination protein N-terminus is linked to the aptamer binding protein C-terminus; and in some embodiments, the RT system or composition further comprises a linker between the recombination protein and the aptamer binding protein; for instance, in some embodiments, the linker comprises the amino acid sequence of SEQ ID NO: 15.
- the system or composition includes at least one nuclear localization sequence (NLS), optionally wherein the NLS(s) is/are linked to the recombination protein, the Cas protein, or the reverse transcriptase, or wherein at least one NLS is on each of at least two or all three of the recombination protein, the reverse transcriptase, and the Cas protein; for instance, the nuclear localization sequence in some embodiments comprises the amino acid sequence of SEQ ID NO: 16.
- the nuclear localization sequence is on the C-terminus of the recombination protein or the Cas protein.
- the recombination protein comprises a microbial recombination protein or active portion thereof. In some embodiments of the RT system or composition, the recombination protein comprises a mitochondrial recombination protein or active portion thereof. In some embodiments of the RT system or composition, the recombination protein comprises a viral recombination protein or active portion thereof. In some embodiments of the RT system or composition, the recombination protein comprises a eukaryotic recombination protein or active portion thereof. In some embodiments of the RT system or composition, the recombination protein comprises RecE or RecT or RecE and RecT or derivative or variant or functional portion thereof.
- the RecE, or derivative or variant thereof, comprises an amino acid sequence with at least 70% (or any whole number integer from 70 to 100%, e.g., at least 71%, 72%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100%) similarity or identity or homology to an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-8.
- the fusion protein comprises RecT, or derivative or variant thereof.
- the RecT, or derivative or variant thereof comprises an amino acid sequence with at least 70% (or any whole number integer from 70 to 100%, e.g., at least 71%, 72%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100%) similarity or identity or homology to an amino acid sequence selected from the group consisting of SEQ ID NOs: 9-14.
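The percent thresholds recited above for RecE and RecT variants can be illustrated with a short calculation. The following is a minimal sketch, assuming the two sequences have already been aligned to equal length with `-` as the gap character. It computes identity only (not similarity or homology), and the choice of denominator (all aligned columns that are not gap-to-gap) is an assumption, since identity conventions vary. The function names and toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns.

    Assumption: inputs are pre-aligned, equal-length strings with '-' for gaps.
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # skip columns that carry no residue in either sequence
        columns += 1
        if a == b:
            matches += 1  # a gap never equals a residue, so only residue matches count
    if columns == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / columns


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True if the pair reaches the given percent-identity threshold (70% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

In practice the alignment itself would come from a dedicated alignment tool; this sketch only covers the final percentage step.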
- the Cas protein is catalytically inactive (less than 5% nuclease activity as compared with a wild-type or non-mutated form of the Cas protein) or catalytically dead.
- the Cas protein comprises Cas9 or Cas12a.
- the Cas9 protein comprises wild-type Streptococcus pyogenes Cas9 or wild-type Staphylococcus aureus Cas9.
- the Cas protein comprises a nickase.
- the nickase comprises wild-type Streptococcus pyogenes Cas9 with the amino acid substitution D10A at position 10.
- the RT system or composition further comprises donor nucleic acid.
- the target DNA sequence is a genomic DNA sequence in a host cell.
- the RT and recombination protein are functionally linked to each other and comprise a fusion protein.
- the aptamer binding protein and the recombination protein are functionally linked to each other and comprise a fusion protein.
- the RT and the Cas protein are functionally linked to each other and comprise a fusion protein.
- the recombination protein and the Cas protein are functionally linked to each other and comprise a fusion protein.
- the RT, the Cas protein, and the recombination protein are functionally linked to each other and comprise a fusion protein.
- RTs of WO2020/191241, WO2020/191153, WO2020/191245, WO2020/191243, WO2020/191233, WO2020/191246, WO2020/191249, WO2020/191239, WO2020/191234, WO2020/191242, WO2020/191248, WO2020/191171 and WO2021/226558 can be used in the practice of the present invention.
- Linkers, or ways of functionally linking, described in WO2020/191241, WO2020/191153, WO2020/191245, WO2020/191243, WO2020/191233, WO2020/191246, WO2020/191249, WO2020/191239, WO2020/191234, WO2020/191242, WO2020/191248, WO2020/191171 and WO2021/226558 can be used in the practice of the present invention.
- the invention comprehends a cell or eukaryotic cell comprising any herein-described or discussed RT system or composition.
- the invention comprehends a method of altering a target genomic DNA sequence in a cell comprising a target genomic DNA sequence, comprising introducing any herein-discussed or described RT system or composition.
- the cell or eukaryotic cell is a mammalian cell, or in the methods the cell or eukaryotic cell is a mammalian cell; for instance, a human cell; for instance, a stem cell.
- the method involves the target genomic DNA sequence encoding a gene product.
- in the method, the introducing into a cell comprises administering to a subject.
- the method involves the subject being a mammalian non-human animal (e.g., a laboratory animal such as a rodent, rat, mouse, rabbit, or a domestic animal such as a horse, dog or canine, or cat or feline, or a zoo animal (a nondomesticated animal in human care and custody) or a production animal such as a cow or pig), or a human.
- the administering comprises in vivo administration.
- the cell or eukaryotic cell or mammalian cell is an ex vivo or in vitro cell.
- the method comprises, after the introducing step, administering to a subject the ex vivo or in vitro cells; and in such embodiments, the subject is a mammalian non-human animal or a human.
- the invention involves use of the RT system or composition for the alteration of a target DNA sequence in a cell.
- In systems or compositions discussed herein that do not involve RT, aspects of the RT system that do not pertain to RT (e.g., linkers) can be applied to the herein-discussed systems or compositions that do not include RT.
- the linker may be a peptide of 5-30, 10-30, 10-20 or 15 amino acid residues.
- the linker may be -(Gly-Gly-Gly-Gly-Ser)2- (SEQ ID NO:560), -(Gly-Gly-Gly-Gly-Ser)3- (SEQ ID NO:561), or -(Gly-Gly-Gly-Gly-Ser)4- (SEQ ID NO:562).
- the linker is SEQ ID NO:561. The amino acid sequence of SEQ ID NO:561 may be encoded by the nucleic acid sequence of SEQ ID NO:563.
- a linker is made up of a majority of amino acids that are sterically unhindered, such as glycine and alanine.
- exemplary linkers are polyglycines, poly(Gly-Ser), poly(Gly-Ala), and polyalanines.
- One exemplary suitable linker as shown in the Examples below is poly(Gly-Ser), such as -(Gly-Gly-Gly-Gly-Ser)2- (SEQ ID NO:560).
- Linkers may also be non-peptide linkers.
- These alkyl linkers may further be substituted by any non-sterically hindering group such as lower alkyl (e.g., C1-4), lower acyl, halogen (e.g., Cl, Br), CN, NH2, phenyl, etc.
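For the peptide linkers described above, the repeat structure and the stated length ranges can be checked with a few lines of code. This is a minimal sketch, assuming one-letter amino acid codes and that each (Gly-Gly-Gly-Gly-Ser)n linker is simply the five-residue unit repeated n times. It does not reproduce the sequence listing entries, and the helper names are hypothetical.

```python
GLY_SER_UNIT = "GGGGS"  # one-letter code for Gly-Gly-Gly-Gly-Ser


def gly_ser_linker(repeats: int) -> str:
    """Build a (Gly-Gly-Gly-Gly-Ser)n linker as a one-letter amino acid string."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return GLY_SER_UNIT * repeats


def unhindered_fraction(linker: str) -> float:
    """Fraction of residues that are glycine or alanine (sterically unhindered)."""
    if not linker:
        raise ValueError("linker must not be empty")
    return sum(1 for residue in linker.upper() if residue in "GA") / len(linker)


def is_majority_unhindered(linker: str) -> bool:
    """True if more than half of the residues are glycine or alanine."""
    return unhindered_fraction(linker) > 0.5
```

With n = 2, 3, and 4 the resulting linkers have 10, 15, and 20 residues, which fall within the 10-20 residue range mentioned above.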
- FIG. 1A and FIG. 1B are the reconstructed RecE (FIG. 1A) and RecT (FIG. 1B) phylogenetic trees with eukaryotic recombination enzymes from yeast and human.
- FIG. 2A is a phylogenetic tree and length distribution of RecE/RecT homologs.
- FIG. 2B is the metagenomics distribution of RecE/T.
- FIG. 2C is a schematic showing central models disclosed herein.
- FIG. 2D are graphs of the genome knock-in efficiency of RecE/T homologs.
- FIGS. 3A and 3B are graphs of the high-throughput sequencing (HTS) reads of homology directed repair (HDR) at the EMX1 (FIG. 3A) locus and the VEGFA (FIG. 3B) locus.
- FIGS. 3C-3E are graphs of the mKate knock-in efficiency at the HSP90AA1 (FIG. 3C), DYNLT1 (FIG. 3D), and AAVS1 (FIG. 3E) loci in HEK293T cells.
- FIG. 3F is images of mKate knock-in efficiency in HEK293T cells with RecT.
- FIG. 3G is a schematic of an exemplary AAVS1 knock-in strategy and chromatogram trace from RecT knock-in group.
- FIG. 3H is schematics and graphs of the recruitment control experiment and corresponding knock-in efficiency. All results are normalized to NR. (NC, no cutting; NR, no recombinator).
- FIGS. 4A-4C are graphs of the mKate knock-in efficiencies relative to the NR group at the HSP90AA1 (FIG. 4A), DYNLT1 (FIG. 4B), and AAVS1 (FIG. 4C) loci in HEK293T cells.
- NC no cutting control group.
- NR no recombinator control group.
- FIG. 4D is an image of an exemplary agarose gel of junction PCR that validates mKate knock-in at AAVS1 locus.
- FIGS. 4E and 4F are graphs of the absolute (FIG. 4E) and relative (FIG. 4F) LOV knock-in efficiencies at the AAVS1 locus.
- FIG. 4G is the Sanger sequencing results of the junction PCR product of an exemplary mKate knock-in at the AAVS1 locus.
- FIGS. 5A-5D are graphs of the genomic knock-in efficiencies at different loci across cell lines A549 (FIG. 5A), HepG2 (FIG. 5B), HeLa (FIG. 5C), and hESCs (H9) (FIG. 5D).
- FIG. 5E is images of mKate knock-ins in hESCs.
- FIGS. 5F and 5G are genome-wide off-target site (OTS) counts (FIG. 5F) and OTS chromosomal distribution (FIG. 5G) of REDITv1 tools.
- OTS off-target site
- FIGS. 6A-6D are graphs of the relative mKate knock-in efficiency at the AAVS1 locus and the DYNLT1 locus in A549 cell line (FIG. 6A), the DYNLT1 locus and the HSP90AA1 locus in HepG2 cell line (FIG. 6B), the DYNLT1 locus and the HSP90AA1 locus in HeLa cell line (FIG. 6C), and the HSP90AA1 locus and the OCT4 locus in hES-H9 cell line (FIG. 6D).
- NC no cutting control group.
- NR no recombinator control group. All data normalized to NR group.
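The relative efficiencies in these figures are described as normalized to the NR (no recombinator) group. A minimal sketch of that normalization follows, assuming each group is summarized by a single absolute knock-in efficiency (for example, a percentage of mKate-positive cells). The function name and the example group labels in the usage note are illustrative assumptions, not data from the figures.

```python
def normalize_to_reference(efficiencies: dict, reference: str = "NR") -> dict:
    """Divide every group's absolute efficiency by the reference group's value.

    Assumption: `efficiencies` maps a group label to one absolute efficiency.
    The reference group ends up at exactly 1.0.
    """
    if reference not in efficiencies:
        raise KeyError(f"reference group {reference!r} not found")
    ref_value = efficiencies[reference]
    if ref_value == 0:
        raise ZeroDivisionError("reference efficiency is zero; cannot normalize")
    return {group: value / ref_value for group, value in efficiencies.items()}
```

For instance, hypothetical absolute values of 0.5, 2.0, and 6.0 for NC, NR, and a recombinator group would become 0.25, 1.0, and 3.0 after normalization.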
- FIG. 6E is representative FACS results of HSP90AA1 mKate knock-in in hES-H9 cells.
- FIGS. 7A-7D are graphs of the absolute mKate knock-in efficiencies of different homology arm lengths at the DYNLT1 (FIG. 7A) and HSP90AA1 (FIG. 7B) loci and the no recombinator controls for DYNLT1 (FIG. 7C) and HSP90AA1 (FIG. 7D).
- FIGS. 8A-8E are graphs of the indel rates of the top 3 predicted off-target loci associated with sgEMX1 (FIGS. 8A-8C) or sgVEGFA (FIGS. 8D-8E) in the REDITv1 system.
- FIG. 9A is a schematic of select embodiments of REDITv2N and corresponding knock-in efficiencies in HEK293T cells.
- FIGS. 9B and 9C are graphs of genome-wide off-target site (OTS) counts (FIG. 9B) and OTS chromosomal distribution (FIG. 9C) comparing REDITv2N against REDITv1.
- FIG. 9D is a schematic of select embodiments of REDITv2D and corresponding knock-in efficiencies.
- FIG. 9E is a graph of editing efficiency of REDITv1, REDITv2N, and REDITv2D under serum starvation conditions.
- FIG. 9F is the knock-in efficiencies of REDITv3 in hESCs.
- FIG. 9G is images of mKate knock-in using REDITv3 in hESCs.
- FIGS. 10A and 10B are schematics and graphs of the relative mKate knock-in efficiencies of select embodiments of REDITv2N (FIG. 10A) and REDITv2D (FIG. 10B) at the DYNLT1 locus and the HSP90AA1 locus.
- FIGS. 11A-11D are images of agarose gels showing junction PCR of mKate knock-in at the DYNLT1 locus and the HSP90AA1 locus for a select REDITv2N system.
- FIG. 11E is the chromatogram sequence of junction PCR products at the DYNLT1 locus.
- FIGS. 12A and 12B are graphs of the genomic distribution of detected off-target cleavages of select embodiments of REDITv2 (FIG. 12A) and REDITv2N (FIG. 12B).
- a pileup includes alignments that have two or more reads overlapping with each other. Flanking pairs include alignments that show up on opposite strands within 200bp upstream of each other.
- Target matched includes alignments that match to a treated target in the upstream sequence (up to 6 mismatches, including 1 mismatch in the PAM, are allowed in the target sequence).
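The "target matched" criterion above (up to 6 mismatches, including 1 mismatch in the PAM) can be sketched as a simple filter. This is an illustration under stated assumptions, not the pipeline used for the figures: it assumes a protospacer immediately followed by a 3-nt `NGG` PAM, treats `N` in the pattern as matching any base, and reads the criterion as "at most 6 mismatches in total, of which at most 1 may fall in the PAM". The function names and the toy sequences in the usage are hypothetical.

```python
def count_mismatches(site: str, pattern: str) -> int:
    """Count positions where `site` differs from `pattern`; 'N' in pattern matches anything."""
    if len(site) != len(pattern):
        raise ValueError("site and pattern must have equal length")
    return sum(
        1
        for s, p in zip(site.upper(), pattern.upper())
        if p != "N" and s != p
    )


def is_target_matched(
    site: str,
    protospacer: str,
    pam: str = "NGG",      # assumed SpCas9-style PAM
    max_total: int = 6,    # total mismatches allowed (protospacer + PAM)
    max_pam: int = 1,      # of which at most this many in the PAM
) -> bool:
    """Check whether a candidate site (protospacer followed by PAM) matches a treated target."""
    spacer_len = len(protospacer)
    if len(site) != spacer_len + len(pam):
        return False
    spacer_mm = count_mismatches(site[:spacer_len], protospacer)
    pam_mm = count_mismatches(site[spacer_len:], pam)
    return pam_mm <= max_pam and (spacer_mm + pam_mm) <= max_total
```

A site with a perfect protospacer and PAM `TGA` passes (one PAM mismatch), while the same protospacer followed by `TAA` fails (two PAM mismatches).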
- FIG. 12C is a graph of the HTS HDR and indel reads at the EMX1 locus for the REDITv2N system.
- FIG. 13A is an image of an agarose gel showing junction PCR of mKate knock-ins at the DYNLT1 locus for the REDITv2D system.
- FIG. 13B is the chromatogram sequence of junction PCR products at the DYNLT1 locus.
- FIGS. 14A-14C are graphs of the mKate knock-in efficiencies at the HSP90AA1 locus in REDITv2 (FIG. 14A), REDITv2N (FIG. 14B) and REDITv2D (FIG. 14C) when treated with different FBS concentrations.
- FIGS. 14D-14F are graphs of the mKate knock-in efficiencies at the HSP90AA1 locus in REDITv2 (FIG. 14D), REDITv2N (FIG. 14E) and REDITv2D (FIG. 14F) when treated with different serum FBS concentrations.
- FIG. 15 is images of the nuclear localization of RecE_587 and RecT following EGFP fusion to the REDITv1 systems. Nuclei were stained with NucBlue Live Ready Probes Reagent.
- FIGS. 16A and 16B are the relative mKate knock-in efficiencies at the HSP90AA1 and DYNLT1 loci following fusion of different nuclear localization sequences to either the N- or C-terminus of RecT and RecE_587.
- FIGS. 16C and 16D are graphs of the absolute mKate knock-in efficiencies of the constructs from FIGS. 16A and 16B for the DYNLT1 locus (FIG. 16C) and the HSP90AA1 locus (FIG. 16D).
- FIGS. 17A-17D are graphs of the relative (FIGS. 17A and 17B) and absolute (FIGS. 17C and 17D) mKate knock-in efficiencies for the DYNLT1 locus (FIGS. 17A and 17C) and the HSP90AA1 locus (FIGS. 17B and 17D) following fusion of new NLS sequences as well as optimal linkers to REDITv2 and REDITv3 variants.
- the REDITv2 versions using REDITv2N (D10A or H840A) and REDITv2D (dCas9) are indicated on the horizontal axis, along with the number of guides used.
- the different colors represent the different control groups and REDIT versions.
- FIG. 18 is a graph of the relative editing efficiency of the REDITv3N system at the HSP90AA1 locus in hES-H9 cells.
- FIG. 19A is a diagram of an exemplary saCas9 expression vector.
- FIGS. 19B-19E are graphs of the relative mKate knock-in efficiencies at the AAVS1 locus (FIG. 19B) and HSP90AA1 locus (FIG. 19C) of different effectors in the saCas9 system and the respective absolute efficiencies (FIGS. 19D and 19E, respectively).
- NC no cutting control group.
- NR no recombinator control group.
- FIG. 20A is a schematic of RecT truncations.
- FIGS. 20B and 20C are graphs of the relative mKate knock-in efficiencies at the DYNLT1 locus for wild-type Streptococcus pyogenes Cas9 and Streptococcus pyogenes Cas9n(D10A) with single- and double-nicking.
- FIG. 21A is a schematic of RecE_587 truncations.
- FIGS. 21B and 21C are graphs of the relative mKate knock-in efficiencies at the DYNLT1 locus for wild-type Streptococcus pyogenes Cas9 and Streptococcus pyogenes Cas9n(D10A) with single- and double-nicking.
- FIGS. 22A and 22B are graphs comparing the efficiency of recombineering-based editing with various exonucleases (FIG. 22A) and single-strand DNA annealing proteins (SSAP) (FIG. 22B) from naturally occurring recombineering systems, including NR (no recombinator) as negative control.
- FIGS. 23A-23E show a compact recruitment system using boxB and N22.
- FIGS. 23B-23E are graphs of the gene-editing efficiency using mKate knock-in assay, with wildtype SpCas9, with side-by-side comparisons to the MS2-MCP recruitment system.
- FIGS. 23B and 23D are absolute mKate knock-in efficiency at DYNLT1, HSP90AA1 loci and
- FIGS. 24A-24C show a SunTag recruitment system.
- the REDIT recombinator proteins were fused to scFV antibody and the GCN4 peptide in tandem fashion (10 copies of GCN4 peptide separated by linkers) was fused to the Cas9 protein (FIG. 24A).
- An mKate knock-in experiment (FIG. 24B) with the DYNLT1 locus was used to measure the gene-editing knock-in efficiency (FIG. 24C). All data are measurements of gene-editing efficiency using mKate knock-in assay, with wildtype SpCas9.
- FIGS. 25A and 25B exemplify REDIT with a Cas12a system.
- a Cpf1/Cas12a based REDIT system via the SunTag recruitment design was created (FIG. 25A) for two different Cpf1/Cas12a proteins.
- the efficiencies at two endogenous loci were measured.
- FIGS. 27A and 27B are a schematic showing the SunTag-based recruitment of SSAP RecT to Cas9-gRNA complex for gene-editing (FIG. 27A) and a graph quantifying the editing efficiencies of SunTag compared to MS2-based strategies (FIG. 27B).
- FIGS. 28A-28C show comparisons of REDIT with alternative HDR-enhancing gene-editing approaches.
- FIG. 28A is schematics showing alternative HDR-enhancing approaches via fusing functional domains, CtIP or Geminin (Gem), to Cas9 protein (left) and when combined with REDIT (right).
- FIG. 28B is an alternative small-molecule HDR-enhancing approach through cell cycle control. Nocodazole was used to synchronize cells at the G2/M boundary (left) according to the timeline shown (right).
- FIG. 28C is comparisons of gene-editing efficiencies using REDIT and alternative HDR-enhancing tools, Cas9-HE (CtIP fusion), Cas9-Gem (Geminin fusion), and Nocodazole (noc), along with combination of REDIT with these methods (Cas9-HE/Cas9-Gem/noc+REDIT).
- Donor DNAs have 200 + 400 bp (DYNLT1) or 200 + 200 bp (HSP90AA1) of HAs. All assays performed with no donor, NTC and Cas9 (no enhancement) controls. #P < 0.05, compared to REDIT; ##P < 0.01, compared to REDIT.
- FIGS. 29A-29D show template design guideline, junction precision, and capacity of REDIT gene-editing methods.
- FIG. 29A is graphs of a homology arm (HA) length test comparing different template designs of HDR donors (longer HAs) or NHEJ/MMEJ donors (zero/shorter HAs) using REDIT and Cas9 references. Top and bottom are two genomic loci tested using mKate knock-in assay.
- FIG. 29B is a design of an exemplary junction profiling assay through isolation of knock-in clones, followed by genomic PCR using primers (fwd, rev) binding outside donor to avoid template amplification.
- FIG. 29C is a graph of the percentage of colonies with indicated junction profiles from the Sanger sequencing of knock-in clones as in FIG. 29B. Editing methods and donor DNA are listed at the bottom (HA lengths indicated in bracket).
- FIG. 29D is a graph of knock-in efficiencies using a 2-kb cassette to insert dual-GFP/mKate tags to validate REDIT methods with Cas9. HA lengths of donor DNAs indicated at the bottom.
- FIGS. 30A-30C show GIS-seq results (FIGS. 6C-6E) indicating that REDIT is an efficient method with the ability to insert kilobase-length sequences with fewer unwanted editing events.
- FIG. 30A is a schematic showing the design, procedures, and analysis steps for GIS-seq to measure genome-wide insertion sites of the knock-in cassettes. High-molecular-weight (HMW) genomic DNA purification was needed to remove potential contamination from donor DNAs. Donor DNAs had 200 bp HAs each side.
- FIG. 30B is representative GIS-seq results showing plus/minus reads at on-target locus DYNLT1.
- FIG. 30C is a summary of top GIS-seq insertion sites comparing Cas9dn and REDITdn groups, showing the expected on-target insertion site (highlighted) and reduced number of identified off-target insertion sites when using REDITdn. (Left) DYNLT1 and (Right) ACTB loci with MLE calculated from the distribution of filtered and trimmed GIS-seq reads.
- FIGS. 31A-31F show the dependence of REDIT gene-editing on endogenous DNA repair and applying REDIT methods for human stem cell engineering.
- FIG. 31A is a model showing the editing process and major repair pathways involved when using REDIT or Cas9 for gene-editing; the HDR pathway is highlighted for chemical perturbation (inhibition of RAD51). Donor DNAs with 200 + 200 bp HAs are used for all inhibitor experiments.
- FIGS. 31B and 31C are graphs showing the relative knock-in efficiency of REDIT tools compared with Cas9 references when treated with RAD51 inhibitors B02 and RI-1, or vehicle-treated, for the wtCas9-based REDIT and Cas9 (FIG. 31B) and the Cas9 nickase-based REDITdn and Cas9dn (FIG. 31C).
- FIG. 31D is graphs of knock-in efficiencies in hESCs (H9) using REDIT and REDITdn tested across three genomic loci, compared with corresponding Cas9 and Cas9dn references.
- FIGS. 31E and 31F are flow cytometry plots of mKate knock-in results in hESCs using REDIT, REDITdn with Cas9, Cas9dn, and NTC controls.
- Donor DNAs in the hESC experiments have 200 + 200 bp HAs across all loci tested.
- FIGS. 32A-32B show chemical perturbations to dCas9 REDIT. Gene editing efficiencies were determined when treated with mammalian DNA repair pathway inhibitors (Mirin, RI-1, and B02) with (FIG. 32A) and without (FIG. 32B) cell cycle inhibitor (Thy, double thymidine) blocking. Statistical analyses are from t-test results with 1% FDR via a two-stage step-up method.
- FIGS. 33A and 33B are schematics of the DNA components (gene-editing vectors and template DNA) and tail vein injection of mice, respectively.
- FIGS. 34A-34C are results from the tail vein injection of mice with gene-editing vectors.
- FIG. 34A is a schematic and gel electrophoresis of PCR analysis of liver hepatocytes from the injected mice.
- FIG. 34B is the Sanger sequencing results of the PCR amplicon.
- FIG. 34C is a schematic of next-generation sequencing and a graph of the quantification of knock-in junction errors.
- FIGS. 35A and 35B are schematics of the DNA components (gene-editing and control vector) and adeno-associated virus (AAV) treatment, respectively.
- FIG. 35C is fluorescent images of lungs from AAV treated mice and graphs of corresponding quantitation of tumor number.
- FIGS. 36A-36C show the predicted structure of E. coli RecT (EcRecT) alone (FIG. 36A) and with bound single-strand DNA (FIG. 36B, 36C).
- FIGS. 37A-37B show predicted interactions of EcRecT SSAP amino acids with ssDNA.
- FIGS. 38A-38F show development of the dCas9 gene-editor through mining microbial SSAPs.
- FIG. 38A Schematic model of dCas9 editor with single-strand annealing proteins (SSAP).
- FIG. 38B Design of the genomic knock-in assay to measure gene-editing efficiencies (left); workflow of the SSAP screening experiments (right).
- FIG. 38C Construct designs for screening gene-editing efficiency of SSAPs using the 2A-mKate knock-in assay, with an 800bp transgene.
- FIG. 38D Results of initial screen of three SSAPs: Bet protein from Lambda phage (LBet), RecT protein from Rac prophage (RacRecT), and gp2.5 from T7 phage (T7gp2.5).
- FIG. 38E Screening RecT-like SSAP candidates via metagenomic homolog mining and knock-in assay. The most active candidate is labeled as dCas9-SSAP.
- NTC non-target control.
- Donor templates were added in all groups except the no-donor controls, with the homology arm (HA) lengths: DYNLT1, 200+200bp; HSP90AA1, 200+400bp; ACTB, 200+400bp.
- HA homology arm
- FIGS. 39A-39H show on-target and off-target editing errors of dCas9-SSAP.
- FIG. 39A Deep sequencing to measure the levels of indel formation when using dCas9-SSAP and Cas9 references at endogenous targets.
- the donor templates used are 200bp-HA HDR templates. Details of the assay described in Methods.
- FIG. 39B Clonal Sanger sequencing to analyze the accuracy of knock-in editing using dCas9-SSAP and Cas9 references with different HDR and MMEJ donors.
- the donor templates used are the 200bp-HA HDR templates and 25bp-HA MMEJ templates (Methods and Supplementary Notes).
- FIG. 39C-FIG. 39E Genome-wide detection of insertion sites of knock-in cassette using unbiased sequencing, showing (FIG. 39C) workflow, (FIG. 39D) representative reads aligned at knock-in genomic site, and (FIG. 39E) summary of detected on-target and off-target insertion sites.
- FIG. 39F-FIG. 39G Workflow and results for measuring cell fitness effect as defined by percentage of live cells after editing (normalized to mock controls).
- FIG. 39H Summary analysis of knock-in accuracy of dCas9-SSAP editor, in comparison with Cas9 HDR and Cas9 MMEJ methods. Accuracy is defined as the overall yield (%) of correct knock-in within all edited outcomes (correct knock-in, knock-in with indels, and NHEJ indels).
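The accuracy definition given for FIG. 39H translates directly into a ratio. The following is a minimal sketch, assuming the three edited outcome classes are available as counts (for example, numbers of sequenced clones or reads). The function and argument names are hypothetical, and the numbers in the usage note are illustrative rather than measured values.

```python
def knock_in_accuracy(correct_knock_in: int, knock_in_with_indels: int, nhej_indels: int) -> float:
    """Accuracy (%) = correct knock-in / all edited outcomes * 100.

    All edited outcomes = correct knock-in + knock-in with indels + NHEJ indels,
    following the definition stated for FIG. 39H. Unedited outcomes are excluded.
    """
    counts = (correct_knock_in, knock_in_with_indels, nhej_indels)
    if any(c < 0 for c in counts):
        raise ValueError("counts must be non-negative")
    edited_total = sum(counts)
    if edited_total == 0:
        raise ValueError("no edited outcomes; accuracy is undefined")
    return 100.0 * correct_knock_in / edited_total
```

With illustrative counts of 80 correct knock-ins, 10 knock-ins with indels, and 10 NHEJ indels, the accuracy is 80%.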
- FIGS. 40A-40G show validation of dCas9-SSAP editor and comparison with Cas9 reference and other HDR-enhancing methods.
- FIG. 40B Imaging verification of mKate knock-in at endogenous genome locus using dCas9-SSAP editor.
- FIG. 40C Design of knock-in donor with different lengths of transgenes.
- FIG. 40D Knock-in efficiencies for different transgene lengths using dCas9-SSAP editors.
- Donor HA lengths are 200bp+200bp for DYNLT1, 200bp+400bp for HSP90AA1.
- FIG. 40E Performance of dCas9-SSAP editor compared with Cas9 references across 7 endogenous loci in HEK293T cells. ND, no-donor controls; NT, non-target controls.
- FIG. 40F-FIG. 40G Knock-in gene-editing in human embryonic stem cells (hESC, H9) using dCas9-SSAP editor, with quantified HDR efficiencies (FIG. 40F) and flow cytometry analysis (FIG. 40G). All statistical analyses are performed using multiple t-test to compare across all genomic targets, with 1% false-discovery rate (FDR) via a two-stage step-up method of Benjamini, Krieger and Yekutieli.
- FIGS. 41A-41D show chemical perturbations to probe the editing mechanism of dCas9-SSAP editor.
- Statistical analyses are from t-test results with 1% FDR via a two-stage step-up method of Benjamini, Krieger and Yekutieli.
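The two-stage step-up method of Benjamini, Krieger and Yekutieli named above can be sketched in plain Python. This is an illustrative implementation of the published two-stage linear step-up procedure, not the analysis code behind the figures: it assumes a list of per-comparison p-values (for example, from the multiple t-tests) and a target FDR `q` of 0.01. The function names are hypothetical.

```python
def _bh_reject_count(sorted_pvals, q):
    """Benjamini-Hochberg step-up: largest rank i with p_(i) <= q * i / m."""
    m = len(sorted_pvals)
    k = 0
    for i, p in enumerate(sorted_pvals, start=1):
        if p <= q * i / m:
            k = i
    return k


def two_stage_step_up(pvals, q=0.01):
    """Two-stage step-up FDR control; returns a reject flag per input p-value.

    Stage 1: BH at level q' = q / (1 + q), giving r1 rejections.
    Stage 2: estimate the number of true nulls as m0 = m - r1 and rerun BH
             at level q' * m / m0.
    """
    m = len(pvals)
    if m == 0:
        return []
    order = sorted(range(m), key=lambda i: pvals[i])
    sorted_p = [pvals[i] for i in order]
    q1 = q / (1.0 + q)
    r1 = _bh_reject_count(sorted_p, q1)
    if r1 == 0:
        k = 0            # nothing rejected in stage 1: stop
    elif r1 == m:
        k = m            # everything rejected in stage 1: stop
    else:
        m0 = m - r1      # estimated number of true null hypotheses
        k = _bh_reject_count(sorted_p, q1 * m / m0)
    rejected = [False] * m
    for rank in range(k):
        rejected[order[rank]] = True
    return rejected
```

Established statistics packages also provide this procedure; the hand-written version here is only meant to make the two stages explicit.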
- FIGS. 42A-42D show minimization of dCas9-SSAP editor as a compact CRISPR knock-in tool for convenient delivery.
- FIG. 42A Schematic showing the EcRecT predicted secondary structure and priming sites for constructing truncated EcRecT proteins based on the structural prediction.
- FIG. 42B Relative knock-in efficiencies of various truncated designs. All groups were normalized to Cas9 references (individually for each target).
- FIG. 42C Schematic of dSaCas9-mSSAP system in AAV construct using the compact SaCas9 (left, sizes of elements not shown to scale) and
- FIGS. 43A-43E show gel electrophoresis and sequencing verification of knock-in-specific PCR products using dCas9-SSAP.
- FIG. 43A Agarose gel results of knock-in-specific junction PCR at DYNLT1 locus.
- FIG. 43B-FIG. 43E Sanger sequencing chromatogram of genomic junctions from knock-in experiments at DYNLT1 locus. For all samples, Applicants amplified the 5’ (FIG. 43B, FIG. 43C) and 3’ (FIG. 43D, FIG. 43E) end of genomic DNA using junction-spanning primers outside of the donor DNAs to confirm knock-in.
- FIG. 44 shows a phylogenetic tree and amino acid alignment of representative RecT homologs along with the protein conserved domain annotated.
- FIGS. 45A-45B show deep sequencing of short-sequence editing comparing dCas9-SSAP and Cas9 editors.
- FIG. 45A Donor design of 16-bp replacement at EMX1.
- FIG. 45B Analysis of precision HDR and indel editing outcomes using deep sequencing at EMX1 genomic locus. The first round of PCR used sequencing primers completely outside of the donor to ensure the sequencing results will be free from the donor template contamination, validated by the nontarget control (where the donor DNAs are delivered into the cells).
- FIGS. 46A-46B are schematics showing the workflows used in Sanger sequencing of knock-in products (FIG. 46A) and the sequencing method used in deep on-target indel assay (FIG. 46B). Assays described here correspond to Fig. 41. gPCR, genomic PCR. Seq-F/seq-R are primers for Sanger sequencing binding upstream/downstream of the knock-in templates.
- FIGS. 47A-47B show Sanger sequencing chromatograms of genomic junctions from dCas9-SSAP experiments at DYNLT1 locus. The sequences in the red boxes were not precisely repaired. For all samples, the 5’ (FIG. 47A) and 3’ (FIG. 47B) ends of genomic DNA were amplified using junction-spanning primers to confirm knock-in precision. The genomic-binding primers used are completely outside of the donor DNAs to avoid contamination.
- FIGS. 48A-48B show Sanger sequencing chromatograms of genomic junctions from dCas9-SSAP experiments at HSP90AA1 locus.
- the 5’ (FIG. 48A) and 3’ (FIG. 48B) ends of genomic DNA were amplified using junction-spanning primers to confirm knock-in precision.
- the genomic-binding primers used are completely outside of the donor DNAs to avoid contamination.
- FIGS. 49A-49B show genome-wide insertion site mapping and quantification.
- FIG. 49A Overall workflow for unbiased genome-wide insertion site mapping process. On-target and off-target insertion sites are recovered from reads that align to the reference genome (hg38). Full protocol and data analysis pipeline are detailed in Methods.
- FIG. 49B Quantification of genome-wide insertion sites counting all aligned reads (with valid UMI) showed decreased insertion site abundance using Cas9-SSAP compared with Cas9 HDR, across two genomic loci (DYNLT1 and HSP90AA1). The abundance of insertion sites is measured as RPKU, or Reads Per Thousand UMIs.
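The RPKU metric named above (Reads Per Thousand UMIs) can be written as a one-line normalization. This is a minimal sketch under an assumption the text does not spell out: that the numerator is the number of aligned reads supporting insertion sites and the denominator is the total number of valid UMIs in the sample. The function name and the example counts are hypothetical.

```python
def reads_per_thousand_umis(insertion_site_reads: int, total_valid_umis: int) -> float:
    """RPKU = aligned insertion-site reads per 1,000 valid UMIs.

    Assumption: reads are counted only if they carry a valid UMI, and the
    denominator is the total number of valid UMIs observed in the sample.
    """
    if insertion_site_reads < 0:
        raise ValueError("read count must be non-negative")
    if total_valid_umis <= 0:
        raise ValueError("total UMI count must be positive")
    return 1000.0 * insertion_site_reads / total_valid_umis
```

For example, 50 insertion-site reads in a sample with 10,000 valid UMIs corresponds to an RPKU of 5.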
- FIGS. 50A-50B show testing of dCas9-SSAP editor tool using single-guide (FIG. 50A) and dual-guide (FIG. 50B) designs across three genomic targets (shown on the top).
- the donor DNAs used are the same as shown in Fig. 3a with 800-bp knock-in design.
- FIGS. 51A-51C show validation of dCas9-SSAP knock-in efficiencies in three additional cell lines: HepG2 (FIG. 51A), HeLa (FIG. 51B), and U2OS (FIG. 51C).
- the knock-in experiments used similar donor DNA with ~800-bp cassettes encoding 2A-mKate transgene for all cell lines tested.
- FIGS. 52A-52C show the full set of flow cytometry analysis data using dCas9-SSAP editor for human stem cell engineering.
- FIG. 53 is a schematic showing the RecT protein secondary structure predicted using an online tool (CFSSP, see Methods).
- the prediction results (secondary structure visualized at top, alignment at bottom) formed the basis for developing a truncated functional RecT variant.
- FIGS. 54A-54C show optimization of dCas9-SSAP for efficient and durable gene-editing.
- FIG. 54B Performance of dCas9-SSAP editor compared with Cas9 references across 7 endogenous loci in HEK293T cells after SSAP dosage optimization and donor HA extension.
- FIGS. 55A-55C show optimization of donor dosages and homology arms of donor DNA.
- FIG. 55A Quantification of genomic mKate knock-in efficiency at the DYNLT1, HSP90AA1, and ACTB loci for donor dosage optimization when using dCas9-SSAP editor. NT, non-target controls.
- Donor HA lengths are 200bp+200bp for DYNLT1, 200bp+400bp for HSP90AA1, 200bp+400bp for ACTB.
- FIG. 55B-FIG. 55C Quantification of mKate knock-in efficiency at HSP90AA1 (FIG. 55B) and ACTB (FIG. 55C).
- HA donor homology arm
- FIGS. 56A-56D show validation of dCas9-SSAP editor with protein functional assays.
- FIG. 56A Design of genomic Puromycin/Blasticidin-resistance-cassette knock-in assay to validate functional on-target editing by dCas9-SSAP.
- FIG. 56B Immunoblotting confirms the presence and sizes of on-target dCas9-SSAP knock-in products at HSP90AA1 and ACTB loci, performed with anti-V5 antibody recognizing in-frame fusion with endogenous protein. Data shown represent 3 biologically independent experiments.
- FIGS. 57A-57E show validation of the stability of on-target editing.
- FIG. 57A Workflow of the long-term time-course experiments to evaluate the editing outcome stability using dCas9-SSAP editor.
- FIG. 57B Flow cytometry analysis of knock-in gene-editing at HSP90AA1, ACTB endogenous loci at different time points post delivery of dCas9-SSAP and donor DNA.
- FIG. 58 shows SSAP + Cas9 mediated knock-in editing with deactivated guide RNA (dgRNA).
- the SSAP + Cas9 comprises RecT and wtCas9.
- mKate knock-ins are depicted at DYNLT1, HSP90AA1, and ACTB.
- FIG. 59 shows dCas9-SSAP mediated knock-in of luciferase-expressing or mKate expressing 600-bp transgenes at the human albumin (ALB) locus (top) or the AAVS1 locus (bottom) in human HEK293T cells or human hepatocytes.
- Transgene knock-ins at the albumin locus were highly expressed in hepatocytes but not HEK293T (top).
- Transgene knock-ins at the AAVS1 locus were highly expressed in HEK293T but not hepatocytes (bottom).
- FIG. 60 shows dCas9-SSAP mediated knock-in of luciferase-expressing or mKate expressing 800-bp transgenes at the human albumin (ALB) locus (top) or the AAVS1 locus (bottom) in human HEK293T cells or human hepatocytes.
- Transgene knock-ins at the albumin locus were highly expressed in hepatocytes but not HEK293T (top).
- Transgene knock-ins at the AAVS1 locus were highly expressed in HEK293T but not hepatocytes (bottom).
- FIG. 61 shows electroporation of an RNP comprising an 800bp mKate-encoding transgene in K562 cells.
- Cells were electroporated with RNP comprising purified Cas9 or dCas9 protein complexed with guide RNA, and a double-stranded 800bp mKate transgene, with and without RecT.
- Knock-ins were at the HSP90AA1 (left) or HIST1H2BK (right) locus.
- FIG. 62 shows delivery of RNP comprising Cas9 or dCas9, with or without SSAP, to mouse primary hematopoietic stem cells (HSC) and AAV6 to knock in a GFP-expressing transgene.
- FIG. 63 shows transgene expression.
- FIGS. 64A-64D depict SSAP-mediated knock-in of transgenes using an R-loop-forming guide without CRISPR components.
- FIG. 64A Model of guide-RNA-SSAP mediated gene editing showing MCP-MS2 aptamer pairing of SSAP and R-loop-gRNA.
- FIG. 64B Vector/plasmid designs to express guide RNA, dsDNA donor, and SSAP.
- FIG. 64C Raw FACS plot showing identification of mCherry expressing subset.
- FIG. 64D Transgene knock-in at the ACTB locus using varying guide lengths (18nt, 20nt, 25nt) with and without RecT.
- FIGS. 65A-65B show R-loop-guide RNA design.
- the R-loop-guideRNA comprises two components, guide and scaffold, depicted in guide-scaffold and scaffold-guide configurations (e.g., the guide at the 5' or 3' end of the scaffold).
- the guide sequence is designed to match a target DNA.
- One or more aptamers can be incorporated, including without limitation MS2, PP7, and BoxB. MS2 is depicted.
- FIG. 66 shows a chimeric guide RNA comprising an MS2/PP7-aptamer.
- FIG. 67 shows effect of varying guide length on knock-in efficiency at the ACTB locus comparing R-loop-SSAP (no CRISPR), Cas9 HDR, and dCas9-SSAP.
- Donor-only included only the mKate knock-in donor.
- FIG. 68 shows effect of varying guide length on knock-in efficiency at the HIST locus comparing R-loop-SSAP (no CRISPR), Cas9 HDR, and dCas9-SSAP. Donor-only included only the mKate knock-in donor.
- FIG. 69 shows effect of varying guide length on knock-in efficiency at the HSP90AA1 locus comparing R-loop-SSAP (no CRISPR), Cas9 HDR, and dCas9-SSAP. Donor-only included only the mKate knock-in donor.
- FIG. 70 shows R-loop-SSAP mediated knock-in of luciferase-expressing or mKate expressing 600-bp transgenes at the human albumin (ALB) locus (top) or the AAVS1 locus (bottom) in human HEK293T cells or human hepatocytes.
- Transgene knock-ins at the albumin locus were highly expressed in hepatocytes but not HEK293T (top).
- Transgene knock-ins at the AAVS1 locus were highly expressed in HEK293T but not hepatocytes (bottom).
- FIG. 71 shows R-loop-SSAP mediated knock-in of luciferase-expressing or mKate expressing 800-bp transgenes at the human albumin (ALB) locus (top) or the AAVS1 locus (bottom) in human HEK293T cells or human hepatocytes.
- Transgene knock-ins at the albumin locus were highly expressed in hepatocytes but not HEK293T (top).
- Transgene knock-ins at the AAVS1 locus were highly expressed in HEK293T but not hepatocytes (bottom).
- FIG. 72 shows schematics comparing RNA-mediated SSAP editing without reverse transcriptase (top) with RNA-mediated SSAP editing with reverse transcriptase (bottom).
- the RNA template/donor as depicted includes a Homology Arm (HA) region with one HA.
- the RNA template/donor comprises two HA regions, one at each end. The HA is matched with the genomic region next the editing site so SSAP can promote editing.
- FIG. 73 shows insertion rate for a 4bp sequence inserted at the human EMX1 locus using an in vitro transcribed (IVT) RNA template.
- System components are 1. gE3 or sg2 or sgHE SpCas9 guideRNA targeting human EMX1; 2. dg2 or dg3 dead/deactivated guideRNA binding to a region near gE3/sg2/sgHE; and 3. 100/200HA: 100bp or 200bp HA region next to the 4bp edits, on both ends.
- FIG. 74 shows a U6-expressed RNA template in sense or anti-sense orientation used to replace a 16bp sequence (install 16bp edits) at human EMX1.
- System components comprise 1. SpCas9 guideRNA targeting human EMX1; 2. dead/deactivated guideRNA binding to the region. Numbers at the top of each lane of the gel indicate lengths of homology arm (HA) region next to the edits, on both ends.
- FIG. 75 shows dosage relationship of a U6-expressed RNA template in sense or antisense orientation used to replace a 16bp sequence (install 16bp edits) at human EMX1.
- System components comprise 1. SpCas9 guideRNA targeting human EMX1; 2. dead/deactivated guideRNA binding to a region.
- FIG. 76 shows a system of the invention inserted at the human AAVS1 locus (FIG. 76A) and repair of a defective Venus (green fluorescent protein) locus (FIG. 76B).
- FIG. 77 shows a sgRNA+ dgRNA system schematic (top) and example based on the TLR locus (bottom).
- SpCas9 guide_20bp sgRNA: 20bp guide for sgRNA targeting TLR, used in the first guideRNA. Also shown are SaCas9 guide designs.
- dg1-dg6: different 15bp/16bp dead/deactivated guides in dgRNA targeting TLR, used in the second guideRNA with aptamer for recruitment.
- FIG. 78 shows a demonstration of the sgRNA+dgRNA system including signal achieved in GFP (green) channel indicating repair of Venus protein.
- pA19 encodes Cas9 and the guide RNA with sg318/sg530 that are two guides targeting the TLR gene-editing reporter genome region.
- BB is backbone, serving as negative control.
- Dg532/534/536/538 are different dead guideRNAs comprising an MS2/PP7 aptamer.
- the red box indicates that the design with the RNA repair template/donor for Venus at the 3’ end of the RNA works best.
- FIG. 79 shows a sgRNA+dgRNA system schematic with a direct fusion of RNA template/donor to dgRNA.
- the sgRNA and dgRNA can be circular RNA (e.g., a configuration where the 5’ end and 3’ end of the RNA are covalently linked).
- the circular RNA can enhance stability and efficiency.
- FIG. 80 shows a demonstration of a system with fusion of dgRNA to the RNA template donor and SSAP.
- the box highlights repair of Venus significantly higher than control.
- FIG. 81 shows a test of pol2 (CMV) vs. pol3 (U6) promoters for TLR genomic editing.
- FIG. 82 shows a sgRNA+ dgRNA system example based on the EMX1 locus.
- gE3: 20bp guide for sgRNA targeting EMX1, used in the first guideRNA.
- dg1-dg8: 15bp/16bp dead/deactivated guides in dgRNA targeting EMX1, used in the second guideRNA with fused RNA template/donor.
- FIG. 83 shows dgRNA with fusion RNA template donor and SSAP targeted at the human EMX1 site.
- pA19 has Cas9 and the guide RNA with sg334/sg516 that are two guides targeting the EMX1 gene-editing reporter genome region.
- BB is backbone, serving as negative control.
- dg518/520/522/526 are different dead guideRNAs that have an MS2/PP7 aptamer, bind to locations near sg334/sg516, and have an RNA template/donor fused by a 36bp linker (36L).
- the designs are all antisense with a 300bp homology arm region on each side (a300+300).
- The red box indicates that the design with the optimal dgRNA location supported higher editing efficiencies.
- FIG. 84 shows a schematic of a system incorporating SSAP and prime editing.
- An MS2 aptamer recruits SSAP-MCP to a Cas9-RT complex.
- the system 1) provides a locally reverse-transcribed template donor for SSAP, 2) bypasses the endogenous HDR machinery restricted to dividing cells, and 3) benefits from use of the homology arm region of the template/donor and allows SSAP editing with Cas/nCas/dCas.
- FIG. 85 shows SSAP + prime editing mediated editing at the HEK3 locus (top) and RNF2 locus (bottom).
- 293T cells were transfected with a Cas9n-RT construct, pegRNA construct, nicking/recruiting sgRNA-MS2 construct, and SSAP-MCP construct.
- FIG. 86 shows different length edits mediated by SSAP + prime editing at the HEK3 locus (top) and RNF2 locus (bottom).
- 293T cells were transfected with a Cas9n-RT construct, pegRNA construct, nicking/recruiting sgRNA-MS2 construct, and SSAP-MCP construct.
- FIG. 87 shows a schematic of an editing system incorporating SSAP and retron.
- A retron-SSAP editor has three components: (1) A retron-sgRNA, which can be subdivided into three regions: a region of RNA that is reverse transcribed (called “msd”), a region that remains as RNA in the final molecule (called “msr”), and the guide RNA region (guide RNA with MS2 or other aptamer to recruit SSAP).
- The gRNA region can be derived from a Cas9 scaffold. The msr/msd RNA helps initiate the RT process that generates reverse-transcribed ssDNA directly linked to the sgRNA.
- (2) The RT of a retron, which recognizes the retron RNA and completes reverse transcription of the donor template (a linked RNA-DNA hybrid molecule).
- This RT can be fused to Cas9 or to MCP-SSAP.
- (3) The SSAP protein fused to MCP, optionally also fused to RT if RT is not fused to Cas9.
- FIGS. 88A-88D depict SSAP array screening, showing cell viability vs. editing efficiency (fold over negative control (FIGS. 88A, 88C) or percent of mKate knock-in (FIGS. 88B, 88D)) for the ACTB target (FIGS. 88A, 88B) and the HSP90AA1 target (FIGS. 88C, 88D).
- the positive control is EcRecT.
- FIGS. 89A-89C depict normalized (FIG. 89A) and absolute (FIG. 89B) editing efficiency, comparing activity at two targets, HSP90AA1 and ACTB.
- FIG. 89C shows cell viability, comparing SSAP use for HSP90AA1 knock-ins with ACTB knock-ins.
- the positive control is EcRecT.
- FIG. 90 depicts by scatter plot a comparison of cell viability vs. normalized (A) or absolute (B) editing efficiency for all targets combined. Bar graphs compare editing efficiency at two targets, HSP90AA1 and ACTB, normalized (C) or absolute (D) for each of the candidates.
- the positive control is EcRecT.
- FIG. 91 depicts a tree and sequence alignment for SSAP_16 (1, SEQ ID NO:185), SSAP_10 (2, SEQ ID NO:179), SSAP_36 (3, SEQ ID NO:205), SSAP_152 (4, SEQ ID NO:321), and SSAP_184 (5, SEQ ID NO:353) compared with EcRecT (SEQ ID NO:171). See Table 12.
- FIG. 92 depicts a tree and sequence alignment for SSAP_16 (1, SEQ ID NO:185), SSAP_10 (2, SEQ ID NO:179), SSAP_36 (3, SEQ ID NO:205), SSAP_152 (4, SEQ ID NO:321), SSAP_184 (5, SEQ ID NO:353), SSAP_197 (6, SEQ ID NO:366), SSAP_305 (7, SEQ ID NO:424), SSAP_210 (8, SEQ ID NO:379), and SSAP_190 (9, SEQ ID NO:359) compared with EcRecT (SEQ ID NO:171). See Table 12.
- FIG. 93 depicts a tree and sequence alignment for SSAP_16 (1, SEQ ID NO:185), SSAP_10 (2, SEQ ID NO:179), SSAP_36 (3, SEQ ID NO:205), SSAP_197 (6, SEQ ID NO:366), and SSAP_210 (8, SEQ ID NO:379) compared with EcRecT (SEQ ID NO:171). See Table 12.
- FIG. 94 depicts an evolution tree of candidate SSAPs. 296 candidates were selected applying a set of filters and maximizing evolution distances.
- the SSAPs cover a diverse phylogenetic family (branches) within the SSAP family.
- FIGS. 95A-95C depict editing efficiencies of 10 top-ranked SSAPs compared to EcRecT and a negative control using the dCas9 editing system.
- FIG. 95A mKate knock-ins at the ACTB locus.
- FIG. 95B mKate knock-ins at the HSP90 locus.
- FIG. 95C Scatter plot depicting editing efficiency at the ACTB target and at the HSP90AA1 target for the candidate SSAPs.
- FIG. 96 compares editing efficiencies in a Cas-free system of 10 top-ranked SSAPs with a negative control (pA25, which expresses MCP-EBFP) and pCK914, which expresses MCP-EcRecT.
- Gene-editing efficiencies are for knock-in of an 800bp mKate donor with homology arms in HEK293 human cells, using a 20-base guideRNA matching the target genomic insertion site HSP90AA1 (top) or ACTB (bottom).
- Constructs used are i) guideRNA with MS2 aptamer to recruit SSAP; ii) MCP-SSAP fusion protein; and iii) donor DNA that inserts mKate cargo (without promoter).
- FIGS. 97A-97C depict editing efficiencies of top SSAPs compared to EcRecT using a dCas9 editing system with a transcribed AAV donor in primary mouse hepatocytes.
- the dCas9 is virally delivered separately using adeno-viral-Cas9 under the control of a strong CMV promoter (Adeno-CMV-Cas9).
- FIG. 97A AAV donor designs: (top) typical AAV donor DNA; (bottom) AAV vector includes a promoter that transcribes the donor cargo into RNA.
- the donor RNA is transcribed in anti-sense orientation to avoid cargo expression.
- a 600bp luciferase cargo was knocked in at the mouse Albumin locus (FIG. 97B) or ACTB locus (FIG. 97C).
- FIGS. 98A-98C depict AAV donor designs and editing efficiency.
- the dCas9 is virally delivered separately using adeno-viral-Cas9 under the control of a strong CMV promoter (Adeno-CMV-Cas9).
- FIG. 98A Top: 5’ release AAV design, in which a second guide RNA is provided to bind/cleave the 5’ end of the cargo (hsgRNA cleavage site adjacent to the left homology arm (Left HA)).
- Middle: 3’ release AAV design, in which a second guide RNA is provided to bind/cleave the 3’ end of the cargo (hsgRNA cleavage site adjacent to the right homology arm (Right HA)).
- Bottom: intact AAV design.
- a 600bp luciferase cargo was knocked in at the mouse Albumin locus (FIG. 98B) or ACTB locus (FIG. 98C).
- FIG. 99 depicts genome engineering across multiple human targets with SSAPs.
- the editing system included dCas9 (dSpCas9), guideRNA with MS2 aptamer, MCP protein fused to candidate SSAP, and donor DNA inserting a mKate fluorescent protein cargo in-frame into the indicated endogenous genomic loci.
- FIGS. 100A-100C depict a comparison of SSAP clones to RecT of Listeria innocua phage A118 (LiRecT) and models of RecT complexes in DNA annealing.
- FIG. 100A Evolution tree showing clones SSAP-10, SSAP-16, SSAP-152, SSAP-198, EcRecT, and LiRecT.
- FIG. 100B Model of RecT-dsDNA complex.
- FIG. 100C Model of RecT in complex with a duplex intermediate of DNA annealing. The models are based on deposited structures 7UBB and 7UB2.
- FIGS. 101A-101B depict engineering of RecT.
- the present invention is directed to a system and the components for DNA editing.
- the disclosed system is based on CRISPR targeting and homology-directed repair by phage recombination enzymes.
- the system results in superior recombination efficiency and accuracy at a kilobase scale.
- For example, for the range 6-9, the numbers 7 and 8 are contemplated in addition to 6 and 9, and for the range 6.0-7.0, the numbers 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9, and 7.0 are explicitly contemplated.
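The range convention above can be illustrated with a short sketch. The function name `contemplated_values` is hypothetical, and the sketch assumes that "the same degree of precision" means the number of decimal places written in the lower endpoint.

```python
from decimal import Decimal


def contemplated_values(low: str, high: str) -> list[str]:
    """List every value from low to high at the precision written in `low`.

    Endpoints are passed as strings so that "6.0" keeps its one decimal place.
    """
    lo, hi = Decimal(low), Decimal(high)
    # The exponent of the lower endpoint sets the step: "6" -> 1, "6.0" -> 0.1.
    step = Decimal(1).scaleb(lo.as_tuple().exponent)
    values = []
    current = lo
    while current <= hi:
        values.append(str(current))
        current += step
    return values


if __name__ == "__main__":
    print(contemplated_values("6", "9"))
    print(contemplated_values("6.0", "7.0"))
```

Decimal arithmetic is used rather than floats so that the intermediate values print exactly as written in the text.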
- The terms “complementary” and “complementarity” refer to the ability of a nucleic acid to form hydrogen bond(s) with another nucleic acid sequence by either traditional Watson-Crick base pairing or other non-traditional types of pairing.
- the degree of complementarity between two nucleic acid sequences can be indicated by the percentage of nucleotides in a nucleic acid sequence which can form hydrogen bonds (e.g., Watson-Crick base pairing) with a second nucleic acid sequence (e.g., 50%, 60%, 70%, 80%, 90%, and 100% complementary).
- Two nucleic acid sequences are “perfectly complementary” if all the contiguous nucleotides of a nucleic acid sequence will hydrogen bond with the same number of contiguous nucleotides in a second nucleic acid sequence.
- Two nucleic acid sequences are “substantially complementary” if the degree of complementarity between the two nucleic acid sequences is at least 60% (e.g., 65%, 70%, 75%, 80%, 85%, 90%, 95%).
- nucleic acid sequences hybridize under at least moderate, preferably high, stringency conditions.
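The percentage-based definitions above can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, only Watson-Crick pairs are counted (non-traditional pairings are ignored), and the two sequences are assumed to be of equal length and paired antiparallel without gaps.

```python
# Watson-Crick partners for DNA and RNA; other pairing types are not modelled.
WATSON_CRICK = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions in seq_a that pair with seq_b.

    Both sequences are written 5'->3'; seq_b is reversed to pair antiparallel.
    """
    a = seq_a.upper()
    b = seq_b.upper()[::-1]
    if len(a) != len(b) or not a:
        raise ValueError("this sketch expects two non-empty sequences of equal length")
    paired = sum((x, y) in WATSON_CRICK for x, y in zip(a, b))
    return 100.0 * paired / len(a)


def classify(percent: float) -> str:
    """Apply the 100% and 60% thresholds stated in the text."""
    if percent == 100.0:
        return "perfectly complementary"
    if percent >= 60.0:
        return "substantially complementary"
    return "not substantially complementary"


if __name__ == "__main__":
    pct = percent_complementarity("AAGGC", "GCCTA")
    print(pct, classify(pct))
```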
- Exemplary moderate stringency conditions include overnight incubation at 37 °C in a solution comprising 20% formamide, 5×SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5×Denhardt’s solution, 10% dextran sulfate, and 20 μg/ml denatured sheared salmon sperm DNA, followed by washing the filters in 1×SSC at about 37-50 °C, or substantially similar conditions, e.g., the moderately stringent conditions described in Sambrook et al., infra.
- High stringency conditions are conditions that, for example, (1) use low ionic strength and high temperature for washing, such as 0.015 M sodium chloride/0.0015 M sodium citrate/0.1% sodium dodecyl sulfate (SDS) at 50 °C, (2) employ a denaturing agent during hybridization, such as formamide, for example, 50% (v/v) formamide with 0.1% bovine serum albumin (BSA)/0.1% Ficoll/0.1% polyvinylpyrrolidone (PVP)/50 mM sodium phosphate buffer at pH 6.5 with 750 mM sodium chloride and 75 mM sodium citrate at 42 °C, or (3) employ 50% formamide, 5×SSC (0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5×Denhardt’s solution, sonicated salmon sperm DNA (50 μg/ml), 0.1% SDS, and 10% dextran sulfate
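The salt concentrations quoted for the washes and hybridization buffers are multiples of a common SSC stock. As a worked example, the sketch below scales a 1×SSC composition, assuming the conventional definition of 1×SSC as 150 mM NaCl and 15 mM sodium citrate; the function name `ssc_composition` is hypothetical.

```python
# Assumed composition of 1xSSC in millimolar (conventional laboratory definition).
SSC_1X_MM = {"NaCl": 150.0, "sodium citrate": 15.0}


def ssc_composition(fold: float) -> dict[str, float]:
    """Return the millimolar concentration of each solute at the given SSC strength."""
    # Rounding removes floating-point noise from fractional strengths such as 0.1x.
    return {solute: round(fold * mm, 6) for solute, mm in SSC_1X_MM.items()}


if __name__ == "__main__":
    for fold in (0.1, 1, 5):
        print(f"{fold}x SSC:", ssc_composition(fold))
```

Under this assumption, 5×SSC corresponds to 0.75 M NaCl and 0.075 M sodium citrate, and the 0.015 M sodium chloride/0.0015 M sodium citrate wash corresponds to 0.1×SSC.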
- a cell has been “genetically modified,” “transformed,” or “transfected” by exogenous DNA, e.g., a recombinant expression vector, when such DNA has been introduced inside the cell.
- the presence of the exogenous DNA results in permanent or transient genetic change.
- the transforming DNA may or may not be integrated (covalently linked) into the genome of the cell.
- the transforming DNA may be maintained on an episomal element such as a plasmid.
- a stably transformed cell is one in which the transforming DNA has become integrated into a chromosome so that it is inherited by daughter cells through chromosome replication.
- a “clone” is a population of cells derived from a single cell or common ancestor by mitosis.
- a “cell line” is a clone of a primary cell that is capable of stable growth in vitro for many generations.
- “Nucleic acid” or a “nucleic acid sequence” refers to a polymer or oligomer of pyrimidine and/or purine bases, preferably cytosine, thymine, and uracil, and adenine and guanine, respectively.
- the present technology contemplates any deoxyribonucleotide, ribonucleotide, or peptide nucleic acid component, and any chemical variants thereof, such as methylated, hydroxymethylated, or glycosylated forms of these bases, and the like.
- the polymers or oligomers may be heterogenous or homogenous in composition and may be isolated from naturally occurring sources or may be artificially or synthetically produced.
- nucleic acids may be DNA or RNA, or a mixture thereof, and may exist permanently or transitionally in single-stranded or double-stranded form, including homoduplex, heteroduplex, and hybrid states.
- a nucleic acid or nucleic acid sequence comprises other kinds of nucleic acid structures such as, for instance, a DNA/RNA helix, peptide nucleic acid (PNA), morpholino nucleic acid (see, e.g., Braasch and Corey, Biochemistry, 41(14): 4503-4510 (2002) and U.S. Pat. No. 5,034,506, incorporated herein by reference), locked nucleic acid (LNA; see Wahlestedt et al., Proc.
- “Nucleic acid” or “nucleic acid sequence” may also encompass a chain comprising non-natural nucleotides, modified nucleotides, and/or non-nucleotide building blocks that can exhibit the same function as natural nucleotides (e.g., “nucleotide analogs”); further, the term “nucleic acid sequence” as used herein refers to an oligonucleotide, nucleotide or polynucleotide, and fragments or portions thereof, and to DNA or RNA of genomic or synthetic origin, which may be single or double-stranded, and represent the sense or antisense strand.
- nucleic acid refers to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof.
- a “peptide” or “polypeptide” is a linked sequence of two or more amino acids linked by peptide bonds.
- the peptide or polypeptide can be natural, synthetic, or a modification or combination of natural and synthetic.
- Polypeptides include proteins such as binding proteins, receptors, and antibodies. The proteins may be modified by the addition of sugars, lipids or other moieties not included in the amino acid chain.
- the terms “polypeptide” and “protein,” are used interchangeably herein.
- percent sequence identity refers to the percentage of nucleotides or nucleotide analogs in a nucleic acid sequence, or amino acids in an amino acid sequence, that is identical with the corresponding nucleotides or amino acids in a reference sequence after aligning the two sequences and introducing gaps, if necessary, to achieve the maximum percent identity.
- additional nucleotides in the nucleic acid that do not align with the reference sequence are not taken into account for determining sequence identity.
- Methods and computer programs for alignment are well known in the art, including BLAST, Align 2, and FASTA.
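The percent identity calculation can be sketched as follows for two sequences that have already been aligned (for example by one of the programs named above). The function name and the `-` gap character are assumptions, and the choice of denominator (reference positions only) is one reading of the definition, chosen so that extra residues in the query that do not align with the reference are ignored.

```python
def percent_identity(aligned_query: str, aligned_reference: str, gap: str = "-") -> float:
    """Percent identity of a query to a reference, given a pairwise alignment.

    Both strings must be the same length, with `gap` marking introduced gaps.
    Columns where the reference has a gap are skipped.
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for q, r in zip(aligned_query.upper(), aligned_reference.upper()):
        if r == gap:
            continue  # query residue with no counterpart in the reference
        compared += 1
        if q == r:
            matches += 1
    if compared == 0:
        raise ValueError("reference contains no residues")
    return 100.0 * matches / compared


if __name__ == "__main__":
    print(round(percent_identity("ACGT-A", "ACCTGA"), 2))
```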
- a “vector” or “expression vector” is a replicon, such as plasmid, phage, virus, or cosmid, to which another DNA segment, e.g., an “insert,” may be attached or incorporated so as to bring about the replication of the attached segment in a cell.
- wild-type refers to a gene or a gene product that has the characteristics of that gene or gene product when isolated from a naturally occurring source.
- a wild-type gene is that which is most frequently observed in a population and is thus arbitrarily designated the “normal” or “wild-type” form of the gene.
- “Modified,” “mutant,” or “polymorphic” refers to a gene or gene product that displays modifications in sequence and/or functional properties (e.g., altered characteristics) when compared to the wild-type gene or gene product. It is noted that naturally occurring mutants can be isolated; these are identified by the fact that they have altered characteristics when compared to the wild-type gene or gene product.
- RNA-guided CRISPR Recombineering System
- In bacteria and archaea, CRISPR/Cas systems provide immunity by incorporating fragments of invading phage, virus, and plasmid DNA into CRISPR loci and using corresponding CRISPR RNAs (“crRNAs”) to guide the degradation of homologous sequences. Each CRISPR locus encodes acquired “spacers” that are separated by repeat sequences. Transcription of a CRISPR locus produces a “pre-crRNA,” which is processed to yield crRNAs containing spacer-repeat fragments that guide effector nuclease complexes to cleave dsDNA sequences complementary to the spacer.
- Three different types of CRISPR systems are known, type I, type II, and type III, classified based on the Cas protein type and the use of a proto-spacer-adjacent motif (PAM) for selection of proto-spacers in invading DNA.
- the endogenous type II systems comprise the Cas9 protein and two noncoding crRNAs: trans-activating crRNA (tracrRNA) and a precursor crRNA (pre-crRNA) array containing nuclease guide sequences (also referred to as “spacers”) interspaced by identical direct repeats (DRs).
- tracrRNA is important for processing the pre-crRNA and formation of the Cas9 complex.
- tracrRNAs hybridize to repeat regions of the pre-crRNA.
- each mature complex locates a target double-stranded DNA (dsDNA) sequence and cleaves both strands using the nuclease activity of Cas9.
- CRISPR/Cas gene editing systems have been developed to enable targeted modifications to a specific gene of interest in eukaryotic cells.
- CRISPR/Cas gene editing systems are commonly based on the RNA-guided Cas9 nuclease from the type II prokaryotic clustered regularly interspaced short palindromic repeats (CRISPR) adaptive immune system.
- Engineering CRISPR/Cas systems for use in eukaryotic cells typically involves reconstitution of the crRNA-tracrRNA-Cas9 complex.
- the Cas9 amino acid sequence may be codon-optimized and modified to include an appropriate nuclear localization signal, and the crRNA and tracrRNA sequences may be expressed individually or as a single chimeric molecule via an RNA polymerase II promoter.
- the crRNA and tracrRNA sequences are expressed as a chimera and are referred to collectively as “guide RNA” (gRNA) or single guide RNA (sgRNA).
- The terms “guide RNA,” “single guide RNA,” and “synthetic guide RNA” are used interchangeably herein and refer to a nucleic acid sequence comprising a tracrRNA and a pre-crRNA array containing a guide sequence.
- The term “guide sequence” refers to the approximately 20-nucleotide sequence within a guide RNA that specifies the target site.
- the guide RNA contains an approximately 20-nucleotide guide sequence followed by a protospacer adjacent motif (PAM) that directs Cas9 via Watson-Crick base pairing to a target sequence.
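As an illustration of how a roughly 20-nucleotide guide sequence and a PAM together specify a target site, the sketch below scans one strand of a DNA sequence for 20-nt stretches immediately followed by an NGG motif, the PAM recognized by Streptococcus pyogenes Cas9. The function name and the example sequence are hypothetical; other Cas proteins use other PAMs, and the opposite strand is not searched here.

```python
def find_spcas9_target_sites(dna: str, guide_len: int = 20) -> list[tuple[int, str, str]]:
    """Return (start, protospacer, pam) for each site followed by an NGG PAM.

    Only the strand given in `dna` (written 5'->3') is scanned.
    """
    dna = dna.upper()
    pam_len = 3
    sites = []
    for start in range(len(dna) - guide_len - pam_len + 1):
        protospacer = dna[start:start + guide_len]
        pam = dna[start + guide_len:start + guide_len + pam_len]
        # NGG: any base followed by two guanines; skip windows with ambiguous bases.
        if pam[1:] == "GG" and set(protospacer + pam) <= set("ACGT"):
            sites.append((start, protospacer, pam))
    return sites


if __name__ == "__main__":
    example = "TT" + "ACGT" * 5 + "TGG" + "AA"  # made-up sequence for illustration
    for start, protospacer, pam in find_spcas9_target_sites(example):
        # The guide sequence is the RNA version of the protospacer.
        print(start, protospacer.replace("T", "U"), pam)
```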
- the system comprises: a Cas protein, a nucleic acid molecule comprising a guide RNA sequence that is complementary to a target DNA sequence and a recombination protein.
- the recombination protein comprises a microbial recombination protein.
- the recombination protein comprises a viral recombination protein.
- the recombination protein comprises a eukaryotic recombination protein.
- the recombination protein comprises a mitochondrial recombination protein.
- Cas protein families are described in further detail in, e.g., Haft et al., PLoS Comput. Biol., 1(6): e60 (2005), incorporated herein by reference.
- the Cas protein may be any Cas endonuclease.
- the Cas protein is Cas9 or Cas12a, otherwise referred to as Cpf1.
- the Cas9 protein is a wild-type Cas9 protein.
- the Cas9 protein can be obtained from any suitable microorganism, and a number of bacteria express Cas9 protein orthologs or variants.
- the Cas9 is from Streptococcus pyogenes or Staphylococcus aureus.
- Cas9 proteins of other species are known in the art (see, e.g., U.S. Patent Application Publication 2017/0051312, incorporated herein by reference) and may be used in connection with the present invention.
- the amino acid sequences of Cas proteins from a variety of species are publicly available through the GenBank and UniProt databases.
- the Cas9 protein is a Cas9 nickase (Cas9n).
- Wild-type Cas9 has two catalytic nuclease domains facilitating double-stranded DNA breaks.
- a Cas9 nickase protein is typically engineered through inactivating point mutation(s) in one of the catalytic nuclease domains causing Cas9 to nick or enzymatically break only one of the two DNA strands using the remaining active nuclease domain.
- Cas9 nickases are known in the art (see, e.g., U.S. Patent Application Publication 2017/0051312, incorporated herein by reference).
- the Cas9 nickase is Streptococcus pyogenes Cas9n (D10A).
- the Cas protein is a catalytically dead Cas.
- catalytically dead Cas9 is essentially a DNA-binding protein due to, typically, two or more mutations within its catalytic nuclease domains, which leave the protein with very little or no catalytic nuclease activity.
- Streptococcus pyogenes Cas9 may be rendered catalytically dead by mutations of D10 and at least one of E762, H840, N854, N863, or D986, typically H840 and/or N863 (see, e.g., U.S. Patent Application Publication 2017/0051312, incorporated herein by reference). Mutations in corresponding orthologs are known, such as N580 in Staphylococcus aureus Cas9. Oftentimes, such mutations cause catalytically dead Cas proteins to possess no more than 3% of the normal nuclease activity.
- the system comprises a nucleic acid molecule comprising a guide RNA sequence complementary to a target DNA sequence.
- the guide RNA sequence specifies the target site with an approximately 20-nucleotide guide sequence followed by a protospacer adjacent motif (PAM) that directs Cas9 via Watson-Crick base pairing to a target sequence.
- the system comprises a nucleic acid molecule comprising a deactivated guide RNA (dgRNA) sequence complementary to a target DNA sequence.
- the deactivated guide is shortened or modified such that a CRISPR complex comprising the dgRNA binds to but does not cut or nick target DNA.
- Non-limiting examples include guides such as are described by WO/2017/094872, which are modified in a manner which allows for formation of a CRISPR complex and successful binding to a target, while at the same time, not allowing for successful nuclease activity (e.g., without nuclease activity / without indel activity).
- the guide nucleic acids can be thought of as catalytically inactive or conformationally inactive with regard to nuclease activity.
- dgRNAs with short target recognition sequences can dramatically improve Cas9-mediated editing specificity by binding to and shielding off-target sites from an active Cas9 sgRNA complex.
- Shortened / modified dgRNAs are used according to the invention to recruit Cas9-SSAP for cleavage-free knock-in of long sequences.
- target DNA sequence refers to a polynucleotide (nucleic acid, gene, chromosome, genome, etc.) to which a guide sequence (e.g., a guide RNA) is designed to have complementarity, wherein hybridization between the target sequence and a guide sequence promotes the formation of a Cas9/CRISPR complex, provided sufficient conditions for binding exist.
- the target sequence is a genomic DNA sequence.
- genomic refers to a nucleic acid sequence (e.g., a gene or locus) that is located on a chromosome in a cell.
- a target sequence may comprise any polynucleotide, such as DNA or RNA.
- Suitable DNA/RNA binding conditions include physiological conditions normally present in a cell.
- Other suitable DNA/RNA binding conditions (e.g., conditions in a cell-free system) are known in the art; see, e.g., Sambrook, referenced herein and incorporated by reference.
- the strand of the target DNA that is complementary to and hybridizes with the DNA-targeting RNA is referred to as the “complementary strand” and the strand of the target DNA that is complementary to the “complementary strand” (and is therefore not complementary to the DNA-targeting RNA) is referred to as the “noncomplementary strand” or “non-complementary strand.”
- the target genomic DNA sequence may encode a gene product.
- gene product refers to any biochemical product resulting from expression of a gene.
- Gene products may be RNA or protein.
- RNA gene products include non-coding RNA, such as tRNA, rRNA, micro RNA (miRNA), and small interfering RNA (siRNA), and coding RNA, such as messenger RNA (mRNA).
- the target genomic DNA sequence encodes a protein or polypeptide.
- two nucleic acid molecules comprising a guide RNA sequence may be utilized.
- the two nucleic acid molecules may have the same or different guide RNA sequences, thus complementary to the same or different target DNA sequence.
- the guide RNA sequences of the two nucleic acid molecules are complementary to target DNA sequences at opposite ends (e.g., 3’ or 5’) and/or on opposite strands of the insert location.
- the system further comprises a recruitment system comprising at least one aptamer sequence and an aptamer binding protein functionally linked to the recombination protein as part of a fusion protein.
- the aptamer sequence is an RNA aptamer sequence.
- the nucleic acid molecule comprising the guide RNA also comprises one or more RNA aptamers, or distinct RNA secondary structures or sequences that can recruit and bind another molecular species (an adaptor molecule), such as a nucleic acid or protein.
- RNA aptamers can be naturally occurring or synthetic oligonucleotides that have been engineered through repeated rounds of in vitro selection or SELEX (systematic evolution of ligands by exponential enrichment) to bind to a specific target molecular species.
- the nucleic acid comprises two or more aptamer sequences.
- the aptamer sequences may be the same or different and may target the same or different adaptor proteins.
- the nucleic acid comprises two aptamer sequences.
- Any RNA aptamer/aptamer binding protein pair known in the art may be selected and used in connection with the present invention (see, e.g., Jayasena, S.D., Clinical Chemistry, 1999. 45(9): p. 1628-1650; Gelinas, et al., Current Opinion in Structural Biology, 2016. 36: p. 122-132; and Hasegawa, H., Molecules, 2016; 21(4): p. 421, incorporated herein by reference).
- A number of RNA aptamer binding, or adaptor, proteins exist, including a diverse array of bacteriophage coat proteins.
- coat proteins include but are not limited to: MS2, Qβ, F2, GA, fr, JP501, M12, R17, BZ13, JP34, JP500, KU1, M11, MX1, TW18, VK, SP, FI, ID2, NL95, TW19, AP205, ϕCb5, ϕCb8r, ϕCb12r.
- the RNA aptamer binds MS2 bacteriophage coat protein or a functional derivative, fragment, or variant thereof.
- MS2 binding RNA aptamers commonly have a simple stem-loop structure, classically defined by a 19 nucleotide RNA molecule with a single bulged adenine on the 5’ leg of the stem (Witherell G.W., et al., (1991) Prog. Nucleic Acid Res. Mol. Biol., 40, 185-220, incorporated herein by reference).
- MS2 coat protein (Parrott AM, et al., Nucleic Acids Res. 2000;28(2):489-497, and Buenrostro JD, et al. Nature Biotechnology 2014; 32, 562-568, incorporated herein by reference).
- Any RNA aptamer sequence known to bind the MS2 bacteriophage coat protein may be utilized in connection with the present invention to bind to fusion proteins comprising MS2.
- the MS2 RNA aptamer sequence comprises: AACAUGAGGAUCACCCAUGUCUGCAG (SEQ ID NO: 145),
- N-proteins (Nut-utilization site proteins) of bacteriophages contain arginine-rich conserved RNA recognition motifs of ~20 amino acids, referred to as N peptides.
- the RNA aptamer may bind a phage N peptide or a functional derivative, fragment, or variant thereof.
- the phage N peptide is the lambda or P22 phage N peptide or a functional derivative, fragment, or variant thereof.
- the N peptide is lambda phage N22 peptide, or a functional derivative, fragment, or variant thereof.
- the N22 peptide comprises an amino acid sequence with at least 70% similarity to the amino acid sequence GNARTRRRERRAEKQAQWKAAN (SEQ ID NO: 149).
- the N22 peptide, the 22 amino acid RNA-binding domain of the λ bacteriophage antiterminator protein N (λN-(1-22) or λN peptide), is capable of specifically binding to specific stem-loop structures, including but not limited to the BoxB stem-loop. See, for example, Cilley and Williamson, RNA 1997; 3(1):57-67, incorporated herein by reference. A number of different BoxB stem-loop primary sequences are known to bind the N22 peptide and any of those may be utilized in connection with the present invention.
- the N22 peptide RNA aptamer sequence comprises a nucleotide sequence with at least 70% similarity to an RNA sequence selected from the group consisting of GCCCUGAAAAAGGGC (SEQ ID NO:150), GCCCUGAAGAAGGGC (SEQ ID NO:151), GCGCUGAAAAAGCGC (SEQ ID NO: 152), GCCCUGACAAAGGGC (SEQ ID NO: 153), and GCGCUGACAAAGCGC (SEQ ID NO: 154).
- the N22 peptide RNA aptamer sequence is selected from the group consisting of SEQ ID NOs: 150-154.
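- The BoxB sequences of SEQ ID NOs: 150-154 listed above can each be read as a short hairpin. The following is a minimal Python sketch, assuming a layout of a 5-nt stem, a 5-nt loop, and a 5-nt stem with Watson-Crick pairing only; that layout is an interpretation of the listed sequences rather than a statement from the specification, and the function names are hypothetical.

```python
# Illustrative only. ASSUMPTION: each 15-nt BoxB sequence folds as
# 5-nt stem / 5-nt loop / 5-nt stem with Watson-Crick pairs only.
PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Sequences copied from the text (SEQ ID NOs: 150-154).
BOXB = {
    150: "GCCCUGAAAAAGGGC",
    151: "GCCCUGAAGAAGGGC",
    152: "GCGCUGAAAAAGCGC",
    153: "GCCCUGACAAAGGGC",
    154: "GCGCUGACAAAGCGC",
}


def reverse_complement(rna: str) -> str:
    """Return the RNA reverse complement (A-U and G-C pairs only)."""
    return "".join(PAIRS[base] for base in reversed(rna))


def is_hairpin(rna: str, stem: int = 5) -> bool:
    """True if the first and last `stem` bases can pair to close a loop."""
    return len(rna) > 2 * stem and rna[:stem] == reverse_complement(rna[-stem:])


# Map each SEQ ID NO to whether its two ends are complementary.
hairpin_ok = {seq_id: is_hairpin(seq) for seq_id, seq in BOXB.items()}
```

- Under these assumptions the check only confirms that the two ends of a candidate sequence are complementary; it says nothing about binding to the N22 peptide, which would have to be established experimentally.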
- the N peptide is the P22 phage N peptide, or a functional derivative, fragment, or variant thereof.
- a number of different BoxB stem-loop primary sequences are known to bind the P22 phage N peptide and variants thereof and any of those may be utilized in connection with the present invention. See, for example Cocozaki, Ghattas, and Smith, Journal of Bacteriology 2008; 190(23):7699-7708, incorporated herein by reference.
- the P22 phage N peptide comprises an amino acid sequence with at least 70% similarity to the amino acid sequence GNAKTRRHERRRKLAIERDTI (SEQ ID NO: 155).
- the P22 phage N peptide RNA aptamer sequence comprises a sequence with at least 70% similarity to an RNA sequence selected from the group consisting of GCGCUGACAAAGCGC (SEQ ID NO: 156) and CCGCCGACAACGCGG (SEQ ID NO: 157). In some embodiments, the P22 phage N peptide RNA aptamer sequence is selected from the group consisting of SEQ ID NOs: 156-157, UGCGCUGACAAAGCGCG (SEQ ID NO:158) or ACCGCCGACAACGCGGU (SEQ ID NO: 159).
- aptamer sequence is a peptide aptamer sequence.
- the peptide aptamers can be naturally occurring or synthetic peptides that are specifically recognized by an affinity agent.
- Such aptamers include, but are not limited to, a c-Myc affinity tag, an HA affinity tag, a His affinity tag, an S affinity tag, a methionine-His affinity tag, an RGD-His affinity tag, a 7x His tag, a FLAG octapeptide, a strep tag or strep tag II, a V5 tag, or a VSV-G epitope.
- Corresponding aptamer binding proteins are well-known in the art and include, for example, primary antibodies, biotin, affimers, single domain antibodies, and antibody mimetics.
- An exemplary peptide aptamer includes a GCN4 peptide (Tanenbaum et al., Cell 2014; 159(3):635-646, incorporated herein by reference).
- Antibodies, or GCN4 binding protein can be used as the aptamer binding proteins.
- the peptide aptamer sequence is conjugated to the Cas protein.
- the peptide aptamer sequence may be fused to the Cas in any orientation (e.g., N-terminus to C-terminus, C-terminus to N-terminus, N-terminus to N-terminus).
- the peptide aptamer is fused to the C-terminus of the Cas protein.
- peptide aptamer sequences may be conjugated to the Cas protein.
- the aptamer sequences may be the same or different and may target the same or different aptamer binding proteins.
- 1 to 24 tandem repeats of the same peptide aptamer sequence are conjugated to the Cas protein.
- between 4 and 18 tandem repeats are conjugated to the Cas protein.
- the individual aptamers may be separated by a linker region. Suitable linker regions are known in the art. The linker may be flexible or configured to allow the binding of affinity agents to adjacent aptamers without or with decreased steric hindrance.
- the linker sequences may provide an unstructured or linear region of the polypeptide, for example, with the inclusion of one or more glycine and/or serine residues.
- the linker sequences can be at least about 2, 3, 4, 5, 6, 7, 8, 9, 10 or more amino acids in length.
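- The tandem-repeat arrangement described above (1 to 24 copies of a peptide aptamer separated by glycine/serine linkers) can be sketched as simple string assembly. This is a minimal illustration, not the Applicants' construct: the FLAG octapeptide is used because it is one of the tags named above, while the `GGSGGS` linker and the function name `tandem_array` are assumptions made for the example.

```python
FLAG = "DYKDDDDK"  # FLAG octapeptide, one of the peptide tags named in the text


def tandem_array(epitope: str, repeats: int, linker: str = "GGSGGS") -> str:
    """Join `repeats` copies of `epitope` with a linker between neighbours.

    The default linker is a HYPOTHETICAL glycine/serine spacer chosen for
    illustration; the text only requires a linker of about 2 or more residues.
    """
    if not 1 <= repeats <= 24:
        raise ValueError("repeats must be within the 1-24 range given in the text")
    if len(linker) < 2:
        raise ValueError("linker should be at least 2 residues long")
    return linker.join([epitope] * repeats)


# Four repeats, within the 4-18 range mentioned above.
array = tandem_array(FLAG, 4)
```

- The resulting sequence would then be fused to the Cas protein, for example at its C-terminus as described above.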
- the fusion protein comprises a recombination protein functionally linked to an aptamer binding protein.
- the recombination protein comprises a microbial recombination protein.
- the recombination protein comprises a recombinase.
- the recombination protein comprises 5'-3' exonuclease activity.
- the recombination protein comprises 3'-5' exonuclease activity.
- the recombination protein comprises ssDNA binding activity.
- the recombination protein comprises ssDNA annealing activity.
- the bacteriophage λ-encoded genetic recombination machinery comprises the exo and bet genes, assisted by the gam gene, together designated λ red genes.
- Exo is a 5'-3' exonuclease which targets dsDNA.
- Bet is a ssDNA-binding protein. Bet functions include protecting ssDNA from degradation and promoting annealing of complementary ssDNA strands.
- Another bacteriophage system found in E. coli is the Rac prophage system, comprising recE and recT genes which are functionally similar to exo and bet.
- the microbial recombination protein may be RecE, RecT, lambda exonuclease (Exo), Bet protein (betA, redB), exonuclease gp6, single-stranded DNA-binding protein gp2.5, or a derivative or variant thereof.
- Recombination proteins and functional fragments thereof useful in the invention include nucleases, ssDNA-binding proteins (SSBs), and ssDNA annealing proteins (SSAPs).
- E. coli proteins such as ExoI (xonA, sbcB), ExoIII (xthA), ExoIV (orn), ExoVII (xseA,
- SSBs include, without limitation, SSBs of prokaryotes, bacteriophage, eukaryotes, mammals, mitochondria, and viruses. While SSBs are found in every organism, the proteins themselves share surprisingly little sequence similarity, and may differ in subunit composition and oligomerization states. SSB proteins may comprise certain structural features. One is use of oligonucleotide/oligosaccharide-binding (OB) domains to bind ssDNA through a combination of electrostatic and base-stacking interactions with the phosphodiester backbone and nucleotide bases. Another feature is oligomerization that brings together DNA-binding OB folds.
- Eukaryotic SSBs are regulated by phosphorylation on serine and threonine residues. Tyrosine phosphorylation of microbial SSBs is observed in taxonomically distant bacteria and substantially increases affinity for ssDNA.
- the human mitochondrial ssDNA-binding protein is structurally similar to SSB from Escherichia coli (EcoSSB), but lacks the C-terminal disordered domain.
- Eukaryotic replication protein A (RPA) shares function, but not sequence homology with bacterial SSB.
- the herpes simplex virus (HSV-1) SSB, ICP8, is a nuclear protein that, along with other replication proteins, is required for viral DNA replication.
- exonuclease activities and ssDNA binding activities of the recombination proteins of the invention uncover and protect single stranded regions of template and target DNAs, thereby facilitating recombination.
- targeting can be cooperative, involving target directed CRISPR-mediated nicking of chromosomal DNA coordinated with recombination directed by homology arms designed into template DNAs.
- off-target effects are minimized. For example, whereas targeted recombination involves coordinated CRISPR and recombination functions, at off-target sites, homology with the HR template DNA is absent and nick repair may be favored.
- SSAPs Single stranded DNA annealing proteins
- phages are recognized to encode their own SSAP recombinases, which substitute for classic RecA proteins while functioning with host proteins to control DNA metabolism.
- Steczkiewicz classified SSAPs into seven families (RecA, Gp2.5, RecT/Redβ, Erf, Rad52/22, Sak3, and Sak4) organized into three superfamilies including prokaryotes, eukaryotes, and phage (Steczkiewicz et al., 2021, Front. Microbiol 12:644622).
- Nonlimiting examples of SSAPs that can be used according to the invention are provided in Table 7. Any one or more of the SSAPs can be employed in the invention.
- a microbial recombination protein is RecE or RecT, or a derivative or variant thereof.
- Derivatives or variants of RecE and RecT are functionally equivalent proteins or polypeptides which possess substantially similar function to wild type RecE and RecT.
- RecE and RecT derivatives or variants include biologically active amino acid sequences similar to the wild-type sequences but differing due to amino acid substitutions, additions, deletions, truncations, post-translational modifications, or other modifications.
- the derivatives may improve translation, purification, biological half-life, activity, or eliminate or lessen any undesirable side effects or reactions.
- the derivatives or variants may be naturally occurring polypeptides, synthetic or chemically synthesized polypeptides or genetically engineered peptide polypeptides.
- RecE and RecT bioactivities are known to, and easily assayed by, those of ordinary skill in the art, and include, for example exonuclease and single-stranded nucleic acid binding, respectively.
- the RecE or RecT may be from a number of organisms, including Escherichia coli, Pantoea brenneri, Type-F symbiont of Plautia stali, Providencia sp. MGF014, Shigella sonnei, Pseudobacteriovorax antillogorgiicola, among others. Other non-limiting sources include Desulfotalea psychrophila, Lactococcus lactis, Flavobacterium psychrophilum, Mycobacterium smegmatis, Lactobacillus rhamnosus, and Psychrobacter arcticus.
- the RecE and RecT protein is derived from Escherichia coli.
- the fusion protein comprises RecE, or a derivative or variant thereof.
- the RecE, or derivative or variant thereof may comprise an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-8.
- the RecE, or derivative or variant thereof, may comprise an amino acid sequence with at least 70% (e.g., 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100%) similarity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-8.
- the RecE, or derivative or variant thereof, comprises an amino acid sequence with at least 90% similarity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-8.
- the RecE, or derivative or variant thereof, comprises an amino acid sequence with at least 90% similarity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-3.
- the fusion protein comprises RecT, or a derivative or variant thereof.
- the RecT, or derivative or variant thereof may comprise an amino acid sequence selected from the group consisting of SEQ ID NOs: 9-14.
- the RecT, or derivative or variant thereof, may comprise an amino acid sequence with at least 70% (e.g., 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100%) similarity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 9-14.
- the RecT, or derivative or variant thereof, comprises an amino acid sequence with at least 90% similarity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 9-14. In exemplary embodiments, the RecT, or derivative or variant thereof, comprises an amino acid sequence with at least 90% similarity to the amino acid sequence of SEQ ID NO:9.
- the fusion protein comprises a recombination protein comprising an amino acid sequence at least 75% similar, or at least 75% identical to a recombination protein of SEQ ID NO: 166 to SEQ ID NO:491, a recombination protein of Table 9, a recombination protein of SEQ ID NO: 179, SEQ ID NO:185, SEQ ID NO:205, SEQ ID NO:321, SEQ ID NO:353, SEQ ID NO:359, SEQ ID NO:366, SEQ ID NO:424, or SEQ ID NO:479, or a recombination protein of SEQ ID NO: 166, SEQ ID NO: 168, SEQ ID NO: 169, SEQ ID NO: 170, SEQ ID NO: 171, SEQ ID NO:241, SEQ ID NO:253, SEQ ID NO:290, SEQ ID NO:408, SEQ ID NO:411, or SEQ ID NO:442.
- the fusion protein comprises a recombination protein comprising a sequence having at least 80%, at least 85%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 98.5%, at least 99%, at least 99.5%, or 100% similarity or identity to the above referenced recombination proteins.
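- The percentage thresholds recited above can be made concrete with a small calculation. The following Python sketch computes percent identity over a pair of pre-aligned, equal-length sequences. It is an illustration under stated assumptions: the specification does not name an alignment algorithm or a denominator convention, "similarity" would normally also credit conservative substitutions via a scoring matrix (not shown here), and the single-substitution variant is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both rows.

    ASSUMPTION: inputs are already aligned to equal length, with '-' for
    gaps, and the denominator is the full alignment length.
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and of equal aligned length")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 75.0) -> bool:
    """True if percent identity is at or above the recited threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# N22 peptide from the text (SEQ ID NO: 149) and a HYPOTHETICAL variant
# carrying one substitution at the first position.
REFERENCE = "GNARTRRRERRAEKQAQWKAAN"
VARIANT = "A" + REFERENCE[1:]

identity = percent_identity(REFERENCE, VARIANT)  # 21 of 22 positions match
```

- In practice an alignment tool would produce the aligned inputs; the sketch only shows how a threshold such as 75% or 90% is evaluated once an alignment exists.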
- Truncations may be from either the C-terminal or N-terminal ends, or both. For example, as demonstrated in Example 6 below, a diverse set of truncations from either end or both provided a functional product.
- one or more (2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 16, 18, 20, 25, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120 or more) amino acids may be truncated from the C-terminal end, the N-terminal end, or both, as compared to the wild-type sequence.
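- A truncation from either or both ends, as described above, amounts to slicing the wild-type sequence. The following is a minimal Python sketch; the function name `truncate` and the 10-residue placeholder sequence are hypothetical and do not correspond to any SEQ ID NO in the specification.

```python
def truncate(protein: str, n_term: int = 0, c_term: int = 0) -> str:
    """Remove `n_term` residues from the N-terminus and `c_term` from the C-terminus."""
    if n_term < 0 or c_term < 0:
        raise ValueError("truncation lengths must be non-negative")
    if n_term + c_term >= len(protein):
        raise ValueError("truncation would remove the entire sequence")
    return protein[n_term:len(protein) - c_term]


# HYPOTHETICAL placeholder sequence, used only to show the slicing.
WILD_TYPE = "MKTAYIAKQR"

both_ends = truncate(WILD_TYPE, n_term=2, c_term=3)
n_only = truncate(WILD_TYPE, n_term=2)
c_only = truncate(WILD_TYPE, c_term=3)
```

- Whether a given truncation retains function is an experimental question; the sketch only generates candidate sequences.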
- the recombination protein comprises a tyrosine recombinase or functional fragment thereof. In some embodiments, the recombination protein comprises a serine recombinase or functional fragment thereof. In some embodiments, the recombination protein comprises an integrase, resolvase, or invertase, or functional fragment thereof. In some embodiments, the recombinase protein comprises a site-specific recombinase protein or functional fragment thereof. In some embodiments, the recombination protein comprises an exonuclease or functional fragment thereof. In some embodiments, the recombination protein comprises an ssDNA-binding protein or functional fragment thereof.
- the fusion protein comprises, without limitation, Hin, Gin, Tn3, β-six, CinH, Min, ParA, γδ, Bxb1, ϕC31, TP901-1, TG1, Wβ, ϕ370.1, ϕK38, ϕBT1, R4, ϕRV1, ϕFC1, MR11, A118, U153, Bxz2, gp29, Cre, Dre, Vika, Flp, Kw, SprA, HK022, P22, L1, or L5, or a homolog of any of such proteins or functional fragment thereof.
- Such recombinases which may be classified in the art as integrases, resolvases, or invertases, may share substructures and activities with exonucleases and SSBs and be used according to the invention.
- the invention provides a system which comprises a reverse transcriptase, a guide nucleic acid, and a recombination protein, and optionally a Cas protein.
- the term “reverse transcriptase” describes a class of polymerases characterized as RNA-dependent DNA polymerases. All known reverse transcriptases require a primer to synthesize a DNA transcript from an RNA template. Historically, reverse transcriptase has been used primarily to transcribe mRNA into cDNA which can then be cloned into a vector for further manipulation. Avian myeloblastosis virus (AMV) reverse transcriptase was the first widely used RNA-dependent DNA polymerase (Verma, Biochim. Biophys.
- the enzyme has 5'-3' RNA-directed DNA polymerase activity, 5'-3' DNA-directed DNA polymerase activity, and RNase H activity.
- RNase H is a processive 5' and 3' ribonuclease specific for the RNA strand of RNA-DNA hybrids (Perbal, A Practical Guide to Molecular Cloning, New York: Wiley & Sons (1984)). Errors in transcription cannot be corrected by reverse transcriptase because known viral reverse transcriptases lack the 3'-5' exonuclease activity necessary for proof-reading (Saunders and Saunders, Microbial Genetics Applied to Biotechnology, London: Croom Helm (1987)).
- AMV reverse transcriptase and its associated RNase H activity has been presented by Berger et al, Biochemistry 22:2365-2372 (1983).
- Another reverse transcriptase which is used extensively in molecular biology is reverse transcriptase originating from Moloney murine leukemia virus (M-MLV).
- Gerard, G. R. DNA 5:271-279 (1986) and Kotewicz, M. L. et al., Gene 35:249-258 (1985).
- M-MLV reverse transcriptase substantially lacking in RNase H activity has also been described. See, e.g., U.S. Pat. No. 5,244,797.
- RT reverse transcriptases, or variants or mutants thereof
- as to linkers or ways to functionally link components of embodiments of the invention such as the RT system or composition of the invention (as well as linkers or ways to functionally link components of systems or compositions discussed herein that do not involve RT), mention is made of WO2020/191241, WO2020/191153, WO2020/191245, WO2020/191243, WO2020/191233, WO2020/191246, WO2020/191249, WO2020/191239, WO2020/191234, WO2020/191242, WO2020/191248, WO2020/191171 and WO2021/226558, which involve what is known as prime editing and twin prime editing.
- each of WO2020/191241, WO2020/191153, WO2020/191245, WO2020/191243, WO2020/191233, WO2020/191246, WO2020/191249, WO2020/191239, WO2020/191234, WO2020/191242, WO2020/191248, WO2020/191171 and WO2021/226558 is hereby incorporated herein by reference.
- RTs of WO2020/191241, WO2020/191153, WO2020/191245, WO2020/191243, WO2020/191233, WO2020/191246, WO2020/191249, WO2020/191239, WO2020/191234, WO2020/191242, WO2020/191248, WO2020/191171 and WO2021/226558 can be used in the practice of the present invention.
- Linkers or ways to functionally link of WO2020/191241, WO2020/191153, WO2020/191245, WO2020/191243, WO2020/191233, WO2020/191246, WO2020/191249, WO2020/191239, WO2020/191234, WO2020/191242, WO2020/191248, WO2020/191171 and WO2021/226558 can be used in the practice of the present invention.
- WO2020/191153 describes a system comprising a CRISPR protein (e.g., a Cas9 nickase) and a reverse transcriptase for use with a guide RNA that specifies a target site and templates synthesis of a desired edit in the form of a replacement DNA strand by way of an extension (either DNA or RNA) engineered onto a guide nucleic acid (e.g., at the 5' or 3' end, or at an internal portion of a guide RNA).
- the invention provides a single stranded binding protein (e.g., SSAP or SSB) used with a reverse transcriptase to edit without CRISPR-mediated nicking or cleavage of target DNA.
- RNA as a molecular entity to mediate gene editing.
- Applicants designed and validated components of systems and methods to apply RNA as template (donor) to insert, delete, replace, or control genomic DNA sequences, mediated through the activity of SSAP (single-strand annealing protein, exemplified by RecT, lambda Red, T7gp2.5).
- Applicants here show the efficiency of gene editing through the process of delivering three components into a cell: (1) Applicants introduced local DNA cleavage, nicking, or R-loop formation using the CRISPR system composed of CRISPR enzymes (Cas9/Cas9n/dCas9 or Cas12a/nCas12a/dCas12a, corresponding respectively to cleavage/nicking/R-loop formation) and a guide RNA, where the guide RNA contains an aptamer (such as MS2, PP7, or BoxB) to recruit the SSAP protein; (2) an RNA sequence bearing the desired DNA changes with one or more homology arm (HA) region(s) that is either fused/linked to the guide RNA in (1), or fused/linked to a second guide RNA.
- the HA region is at least 20 bp and provides a homology region next to the editing site for SSAP-mediated editing.
- this second guide RNA binds to a nearby genomic site, located between 0 bp and 150 bp away from the guide RNA in (1).
- This second guide RNA then forms a complex with CRISPR enzymes (such as Cas9/nCas9/dCas9 and Cas12a/nCas12a/dCas12a), is recruited to the target genomic loci, and serves to provide the RNA template/donor for the editing.
- the enzymes are either regular CRISPR enzymes or Cas proteins, but could also be nicking or deactivated CRISPR enzymes (dCas9, dCas12a, etc.) that only bind to target loci.
- the guide is a regular guide RNA or a shorter guide RNA (typically 2 to 6 bp shorter than the regular guide RNA, i.e., 14 bp to 18 bp) to allow efficient binding but not cleavage of targets.
- the RBP is MS2 coat protein (MCP), PP7 coat protein (PCP), or BoxB binding peptide from lambda phage (lambda N22 peptide).
- in RNA-templated SSAP gene editing, when Applicants fuse a reverse transcriptase (RT) to the SSAP protein via a long peptide linker, making this third component RBP-SSAP-RT or RBP-RT-SSAP (where "-" represents a linker), editing efficiency is further enhanced.
- the Cas9/nCas9/dCas9 or Cas12a/nCas12a/dCas12a protein is fused via a linker to a reverse transcriptase (RT).
- the guide RNA in this design also has a primer binding site (PBS) of at least 14 bp, which is complementary to a region at the editing site. This PBS helps to initiate RT activity.
- another design uses the same guide RNA as in the first embodiment and, to initiate RT activity, a short DNA oligo (14 bp or more in length) that is complementary to a region at the editing site is supplied to the cell.
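- The numeric design parameters stated above (homology arm of at least 20 bp, second guide site 0 to 150 bp from the first, shortened guides of 14 to 18 bp, and a PBS or DNA oligo of at least 14 bp) can be collected into a simple checker. The following Python sketch is an illustration only; the class, field, and function names are hypothetical, and the checks encode nothing beyond the ranges quoted in the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RnaTemplateDesign:
    """HYPOTHETICAL container for the design parameters named in the text."""
    homology_arm_len: int           # bp of homology next to the editing site
    second_guide_offset: int        # bp between the first and second guide sites
    guide_len: int                  # length of the guide (spacer) in bp
    pbs_len: Optional[int] = None   # primer binding site or DNA oligo length, if used


def is_shortened_guide(guide_len: int) -> bool:
    """True for the 14-18 bp guides described as binding without cleavage."""
    return 14 <= guide_len <= 18


def check_design(design: RnaTemplateDesign) -> List[str]:
    """Return a list of violated ranges; an empty list means none were found."""
    problems = []
    if design.homology_arm_len < 20:
        problems.append("homology arm shorter than 20 bp")
    if not 0 <= design.second_guide_offset <= 150:
        problems.append("second guide site not within 0-150 bp of the first")
    if design.pbs_len is not None and design.pbs_len < 14:
        problems.append("PBS or DNA oligo shorter than 14 bp")
    return problems


ok_design = RnaTemplateDesign(homology_arm_len=25, second_guide_offset=40,
                              guide_len=20, pbs_len=14)
bad_design = RnaTemplateDesign(homology_arm_len=15, second_guide_offset=200,
                               guide_len=16, pbs_len=10)
```

- Passing these checks does not predict editing efficiency; the checker only flags designs that fall outside the stated ranges.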
- the Cas9/nCas9/dCas9 or Cas12a/nCas12a/dCas12a protein is fused via a linker to a reverse transcriptase (RT) from a retron system.
- the guide RNA in this design has an msr/msd sequence from a retron, and also one or more homology arm (HA) region(s), which is complementary to a region at the editing site.
- the msr/msd sequence helps to initiate RT activity.
- the HA region helps to mediate SSAP gene-editing.
- this suite of tools and methods provides a novel and nonobvious RNA-mediated/RNA-templated gene editing in eukaryotic/mammalian cells.
- Applicants further demonstrated that through designing cleavable RNA templates using endogenous tRNA, ribozyme, or the direct repeat from the Cas12a system, Applicants also achieve multiple-target gene editing using RNA as template.
- the RNA-templated SSAP gene editing system: (1) it has reduced off-target effects or toxicity owing to the use of RNA, and is less immunogenic compared with the DNA used in existing gene editing processes; in addition, RNA cannot be integrated directly into unintended genomic DNA sites or off-target DNA sites; (2) Applicants easily multiplex the precision gene editing methods by using cleavable RNA templates in Applicants' methods; (3) RNA is easier to deliver into cells, easier to manufacture, and less expensive to scale up for clinical usage; (4) RNA has considerable engineering potential through combination with other regulatory or combinatorial payloads/components via chemical linkage or biochemical coupling, to enable more efficient delivery, editing, or synergistic action of RNA-templated gene editing with other types of gene editing or therapeutic modalities; and (5) the efficiency of RNA-templated gene editing could be enhanced via RNA and protein factors and is orthogonal to regular DNA-repair pathways that may be critical for the health of target cells.
- RNA-guided Recombination Protein System
- In certain embodiments of the invention, there is provided a system or composition for RNA-guided recombineering that does not rely primarily on CRISPR proteins.
- the system or composition comprises: a nucleic acid molecule comprising a guide RNA sequence that is complementary to a target DNA sequence and a recombination protein.
- the system or composition is capable of promoting R-loop formation.
- the system or composition is capable of recombination.
- the system or composition is free of CRISPR proteins.
- the recombination protein comprises a microbial recombination protein.
- the recombination protein comprises a viral recombination protein. In certain embodiments, the recombination protein comprises a eukaryotic recombination protein. In certain embodiments, the recombination protein comprises a mitochondrial recombination protein. In various embodiments, the recombination protein comprises a single stranded DNA annealing protein (SSAP), a single stranded DNA binding protein (SSB), an exonuclease, or a combination of two or more thereof.
- the system or composition does not comprise a Cas9. In certain embodiments, the system or composition does not comprise a Cas12a. In certain embodiments, the system or composition does not comprise a Cas. In certain embodiments, the system or composition does not comprise a CRISPR.
- the system can be thought of as comprising a guide nucleic acid that promotes R-loop formation by binding to target DNA and a recombination protein that promotes recombination between the target nucleic acids and donor nucleic acids.
- the guide RNA and the recombination protein are effectively linked.
- the linkage is covalent.
- the linkage is non-covalent.
- the guide nucleic acid comprises an aptamer sequence and the recombination protein comprises or is joined to an aptamer binding domain.
- the RLoop-guideRNA comprises a guide component and a scaffold component in various arrangements, e.g., guide-scaffold and scaffold-guide configurations.
- RLoop-guideRNA comprises the guide at 5' end of scaffold.
- RLoop-guideRNA comprises the guide at 3' end of scaffold.
- the guide sequence is engineered to bind to target DNA (genome target).
- the guide is from 17 to 160 bases.
- the scaffold comprises one or more aptamer sequences.
- Aptamers used in the invention include, without limitation, MS2, PP7, BoxB, and others.
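- The guide-scaffold and scaffold-guide arrangements described above can be sketched as string assembly. The following Python example is a minimal illustration: the MS2 aptamer is SEQ ID NO: 145 as listed earlier in the specification, while the 20-nt guide, the flanking scaffold bases, and the function name `build_rloop_guide` are hypothetical placeholders rather than sequences from the specification.

```python
MS2_APTAMER = "AACAUGAGGAUCACCCAUGUCUGCAG"  # SEQ ID NO: 145 as listed in the text

# HYPOTHETICAL placeholders, used only to demonstrate the two arrangements.
EXAMPLE_GUIDE = "GACUGACUGACUGACUGACU"           # 20-nt guide
EXAMPLE_SCAFFOLD = "GGCC" + MS2_APTAMER + "GGCC"  # scaffold carrying one aptamer


def build_rloop_guide(guide: str, scaffold: str, guide_at_5prime: bool = True) -> str:
    """Concatenate guide and scaffold in either arrangement described in the text."""
    if not 17 <= len(guide) <= 160:
        raise ValueError("guide length outside the 17-160 base range given in the text")
    if set(guide + scaffold) - set("ACGU"):
        raise ValueError("sequences must contain only A, C, G, U")
    return guide + scaffold if guide_at_5prime else scaffold + guide


guide_first = build_rloop_guide(EXAMPLE_GUIDE, EXAMPLE_SCAFFOLD)
scaffold_first = build_rloop_guide(EXAMPLE_GUIDE, EXAMPLE_SCAFFOLD, guide_at_5prime=False)
```

- A real scaffold would be chosen so that the aptamer folds and is accessible to its binding protein; the placeholder above makes no such claim.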
- the fusion protein comprises an RNA binding component that binds to an aptamer such as is described above and an SSAP protein such as but not limited to RecT, LambdaRed, T7gp2.5, and others.
- Donor nucleic acids can be single-stranded or double-stranded DNA and comprise (1) various lengths of homology arms (HA) to match a genomic target region, and (2) a transgene, e.g., knock-in sequence or replacement sequence etc. There is no limit to the size of the transgene. Insertions of 600-bp (FIG. 70) and 800-bp (FIG. 71) are exemplified herein.
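- The donor layout described above (homology arms matching the genomic target, flanking a transgene) can be sketched as follows. This Python example is an illustration under stated assumptions: the function name `build_donor`, the `arm_len` parameter, and the toy sequences are hypothetical, and the specification does not prescribe a particular arm length for DNA donors.

```python
def build_donor(genome: str, insert_pos: int, transgene: str, arm_len: int) -> str:
    """Return left homology arm + transgene + right homology arm.

    The arms are copied from `genome` on either side of `insert_pos`
    (0-based position between two bases where the transgene should land).
    """
    if arm_len <= 0:
        raise ValueError("arm_len must be positive")
    if insert_pos - arm_len < 0 or insert_pos + arm_len > len(genome):
        raise ValueError("homology arms would run past the end of the sequence")
    left_arm = genome[insert_pos - arm_len:insert_pos]
    right_arm = genome[insert_pos:insert_pos + arm_len]
    return left_arm + transgene + right_arm


# Toy sequences for illustration only.
TOY_GENOME = "A" * 30 + "C" * 30
TOY_TRANSGENE = "GGGG"

donor = build_donor(TOY_GENOME, insert_pos=30, transgene=TOY_TRANSGENE, arm_len=10)
```

- For a knock-in of the sizes exemplified in the text, the same function would simply be called with a longer transgene string.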
- an RLoop-guideRNA binds to an RNA-binding-protein or domain fused to a recombination protein such as but not limited to SSAP.
- the invention provides fusion proteins.
- a recombination protein may be linked to either terminus of an aptamer binding protein in any orientation (e.g., N-terminus to C-terminus, C-terminus to N-terminus, N-terminus to N-terminus).
- a recombination protein N-terminus is linked to the aptamer binding protein C-terminus.
- the overall fusion protein from N- to C-terminus comprises the aptamer binding protein (N- to C-terminus) linked to the recombination protein (N- to C-terminus).
- the recombination protein may be functionally linked as a fusion protein or chimera or chimeric molecule to a nuclease. In some embodiments, the recombination protein may be functionally linked as a fusion protein or chimera or chimeric molecule to an endonuclease. In some embodiments, the recombination protein may be functionally linked as a fusion protein or chimera or chimeric molecule to an exonuclease. In some embodiments, the recombination protein may be functionally linked as a fusion protein or chimera or chimeric molecule to a nuclease and/or a Cas or dCas.
- the recombination protein may be functionally linked as a fusion protein or chimera or chimeric molecule to an endonuclease and/or a Cas or dCas. In some embodiments, the recombination protein may be functionally linked as a fusion protein or chimera or chimeric molecule to an exonuclease and/or a Cas or dCas. In some embodiments, the recombination protein may be expressed independently from, not a fusion protein with a nuclease. In some embodiments, the recombination protein may be expressed independently from, not a fusion protein with an endonuclease.
- the recombination protein may be expressed independently from, not a fusion protein with an exonuclease. In some embodiments, the recombination protein may be expressed independently from, not a fusion protein with a nuclease and/or a Cas or dCas. In some embodiments, the recombination protein may be functionally linked as a fusion protein or chimera or chimeric molecule to an aptamer and/or aptamer binding protein. In some embodiments, the recombination protein may be expressed independently, not as a fusion protein, with an aptamer and/or aptamer binding protein.
- the recombination protein may be functionally linked as a fusion protein or chimera or chimeric molecule to a nuclease and/or Cas or dCas and/or to an aptamer and/or aptamer binding protein.
- the recombination protein may be expressed independently from, not a fusion protein with a nuclease and/or a Cas or dCas and/or an aptamer and/or aptamer binding protein.
- the aptamer and/or aptamer binding protein is an MCP protein.
- the recombination protein may be an SSAP.
- nuclease refers to an agent, such as a protein or small molecule, that is capable of cleaving phosphodiester bonds that join nucleotide residues in a nucleic acid molecule.
- the nuclease is a protein, e.g., an enzyme that is capable of binding to a nucleic acid molecule and cleaving phosphodiester bonds linking nucleotide residues in the nucleic acid molecule.
- the nuclease may be an endonuclease, which cleaves a phosphodiester bond in a polynucleotide strand, or an exonuclease, which cleaves a phosphodiester bond at the end of a polynucleotide strand.
- the nuclease is a site-specific nuclease that binds to and/or cleaves a particular phosphodiester bond within a particular nucleotide sequence, which is also referred to herein as a "recognition sequence," "nuclease target site", or "target site”.
- the nuclease is an RNA-guided (e.g., RNA-programmable) nuclease that complexes (e.g., binds) to RNA having a sequence complementary to the target site, thereby providing sequence specificity of the nuclease.
- the nuclease recognizes a single-stranded target site, while in other embodiments, the nuclease recognizes a double-stranded target site, e.g., a double-stranded DNA target site.
- Target sites for many naturally occurring nucleases, for example many naturally occurring DNA restriction nucleases, are well known to those skilled in the art.
- DNA nucleases such as EcoRI, HindIII or BamHI recognize palindromic double-stranded DNA target sites that are 4 to 10 base pairs in length and cut each of the two DNA strands at specific positions within the target site.
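- "Palindromic" here means that the site reads the same on both strands in the 5' to 3' direction, i.e., the site equals its own reverse complement. A minimal Python sketch of that check follows; the recognition sequences for EcoRI, HindIII, and BamHI are the commonly cited ones and are supplied here as background rather than taken from the specification, and the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Commonly cited recognition sequences (background knowledge, not from the text).
SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT", "BamHI": "GGATCC"}


def reverse_complement(dna: str) -> str:
    """Return the DNA reverse complement."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def is_palindromic(site: str) -> bool:
    """True if the site equals its own reverse complement."""
    return site == reverse_complement(site)


palindromic = {name: is_palindromic(seq) for name, seq in SITES.items()}
```

- All three example sites are six base pairs long, which falls within the 4 to 10 base pair range stated above.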
- Some endonucleases symmetrically cleave a double-stranded nucleic acid target site, e.g., cleave both strands at the same position, such that the ends comprise base-paired nucleotides, also referred to herein as blunt ends.
- endonucleases cleave double-stranded nucleic acid target sites asymmetrically, e.g., each strand is cleaved at a different position such that the ends contain unpaired nucleotides.
- Unpaired nucleotides at the ends of a double-stranded DNA molecule are also referred to as "overhangs", e.g., "5'-overhangs" or "3'-overhangs," depending on whether the unpaired nucleotide forms the 5' or 3' end of the corresponding DNA strand.
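- The distinction between blunt ends, 5'-overhangs, and 3'-overhangs follows from where each strand is cut. The following Python sketch is an illustration only; it assumes both cut positions are expressed in top-strand coordinates (the number of top-strand bases lying 5' of the cut), the function names are hypothetical, and the EcoRI, EcoRV, and PstI cut positions are commonly cited values supplied as background rather than taken from the specification.

```python
def classify_ends(top_cut: int, bottom_cut: int) -> str:
    """Classify the ends produced by a double-strand cut.

    ASSUMPTION: both positions are given in top-strand coordinates, i.e.
    the number of top-strand bases lying 5' of the cut on each strand.
    """
    if top_cut == bottom_cut:
        return "blunt"
    return "5' overhang" if top_cut < bottom_cut else "3' overhang"


def overhang_sequence(site: str, top_cut: int, bottom_cut: int) -> str:
    """Unpaired bases between the two cuts, written along the top strand."""
    low, high = sorted((top_cut, bottom_cut))
    return site[low:high]


# Commonly cited cut positions (background knowledge, not from the text):
# EcoRI  G^AATTC  top cut 1, bottom cut 5
# EcoRV  GAT^ATC  top cut 3, bottom cut 3
# PstI   CTGCA^G  top cut 5, bottom cut 1
ecori = (classify_ends(1, 5), overhang_sequence("GAATTC", 1, 5))
ecorv = (classify_ends(3, 3), overhang_sequence("GATATC", 3, 3))
psti = (classify_ends(5, 1), overhang_sequence("CTGCAG", 5, 1))
```

- In this coordinate convention a symmetric cut gives an empty overhang string, which corresponds to the blunt ends described above.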
- Nuclease proteins typically comprise a "binding domain” that mediates interaction of the protein with a nucleic acid substrate (in some cases also specifically binding to a target site) and a "cleavage domain” that catalyzes the cleavage of phosphodiester bonds within the nucleic acid backbone.
- the nuclease protein is capable of binding and cleaving a nucleic acid molecule in a monomeric form, while in other embodiments, the nuclease protein must dimerize or multimerize in order to cleave a target nucleic acid molecule. Binding and cleavage domains of naturally occurring nucleases, as well as modular binding and cleavage domains that can be fused to create nucleases, are well known to those of skill in the art.
- a zinc finger or transcriptional activator-like element can be used as a binding domain to specifically bind a desired target site and fused or conjugated to a cleavage domain, such as the cleavage domain of FokI, to create an engineered nuclease that cleaves the target site.
- Non-limiting examples of an exonuclease include exonuclease I, exonuclease II, exonuclease III, exonuclease IV, exonuclease V, exonuclease VII, exonuclease VIII, lambda exonuclease, Xrn1, mung bean nuclease, TREX2, exonuclease T, T7 exonuclease, strandase exonuclease, 3'-5' exophosphodiesterase, and Bal31 nuclease.
- the fusion protein further comprises a linker between the recombination protein and the aptamer binding protein.
- the linkers may comprise any amino acid sequence of any length.
- the linkers may be flexible such that they do not constrain either of the two components they link together in any particular orientation.
- the linkers may essentially act as a spacer.
- the linker links the C-terminus of the recombination protein to the N-terminus of the aptamer binding protein.
- the linker comprises the amino acid sequence of the 16-residue XTEN linker, SGSETPGTSESATPES (SEQ ID NO: 15) or the 37-residue EXTEN linker, SASGGSSGGSSGSETPGTSESATPESSGGSSGGSGGS (SEQ ID NO: 148).
- the fusion protein further comprises a nuclear localization sequence (NLS).
- the nuclear localization sequence may be at any location within the fusion protein (e.g., C-terminal of the aptamer binding protein, N-terminal of the aptamer binding protein, C-terminal of the recombination protein).
- the nuclear localization sequence is linked to the C-terminus of the recombination protein.
- a number of nuclear localization sequences are known in the art (see, e.g., Lange, A., et al., J Biol Chem. 2007; 282(8): 5101-5105, incorporated herein by reference) and may be used in connection with the present invention.
- the nuclear localization sequence may be the SV40 NLS, PKKKRKV (SEQ ID NO: 16); the Ty1 NLS, NSKKRSLEDNETEIKVSRDTWNTKNMRSLEPPRSKKRIH (SEQ ID NO: 17); the c-Myc NLS, PAAKRVKLD (SEQ ID NO: 18); the biSV40 NLS, KRTADGSEFESPKKKRKV (SEQ ID NO: 19); or the Mut NLS, PEKKRRRPSGSVPVLARPSPPKAGKSSCI (SEQ ID NO: 20).
- the nuclear localization sequence is the SV40 NLS, PKKKRKV (SEQ ID NO: 16).
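- As a minimal sketch of how the linker and NLS sequences recited above could be combined in silico, the following assembles a fusion in the order recombination protein, linker, aptamer binding protein, NLS (one of the NLS placements listed above). The function name and the two short placeholder protein sequences are hypothetical and stand in for real recombination protein and aptamer binding protein sequences, which are not reproduced here.

```python
# Linker and NLS sequences as recited in the text above.
XTEN_LINKER = "SGSETPGTSESATPES"                        # SEQ ID NO: 15
EXTEN_LINKER = "SASGGSSGGSSGSETPGTSESATPESSGGSSGGSGGS"  # SEQ ID NO: 148

NLS = {
    "SV40": "PKKKRKV",                                   # SEQ ID NO: 16
    "Ty1": "NSKKRSLEDNETEIKVSRDTWNTKNMRSLEPPRSKKRIH",    # SEQ ID NO: 17
    "c-Myc": "PAAKRVKLD",                                # SEQ ID NO: 18
    "biSV40": "KRTADGSEFESPKKKRKV",                      # SEQ ID NO: 19
    "Mut": "PEKKRRRPSGSVPVLARPSPPKAGKSSCI",              # SEQ ID NO: 20
}

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def assemble_fusion(recombination_protein: str,
                    aptamer_binding_protein: str,
                    linker: str = XTEN_LINKER,
                    nls: str = NLS["SV40"]) -> str:
    """Concatenate N- to C-terminus: recombination protein, linker,
    aptamer binding protein, NLS. Hypothetical helper for illustration."""
    fusion = recombination_protein + linker + aptamer_binding_protein + nls
    unknown = set(fusion) - AMINO_ACIDS
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return fusion


# Placeholder sequences (hypothetical, NOT real proteins).
RECOMBINATION_PLACEHOLDER = "MAAAA"
APTAMER_BINDING_PLACEHOLDER = "GGGGG"

fusion = assemble_fusion(RECOMBINATION_PLACEHOLDER, APTAMER_BINDING_PLACEHOLDER)
print(fusion, len(fusion))
```

- Swapping `linker=EXTEN_LINKER` or another entry of `NLS` changes only the corresponding segment, which mirrors the modular description of the fusion protein above.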
- the Cas protein and the fusion protein are desirably included in a single composition alone, in combination with each other, and/or the polynucleotide(s) (e.g., a vector) comprising the guide RNA sequence and the aptamer sequence.
- the Cas protein and/or the fusion protein may or may not be physically or chemically bound to the polynucleotide.
- the Cas protein and/or the recombination protein can be associated with a polynucleotide using any suitable method for protein-protein linking or protein-virus linking known in the art.
- compositions and vectors comprising a polynucleotide comprising a nucleic acid sequence encoding a fusion protein comprising a recombination protein functionally linked to an RNA aptamer binding protein.
- compositions or vectors may further comprise at least one or both of a polynucleotide comprising a nucleic acid sequence encoding a Cas protein and a nucleic acid molecule comprising a guide RNA sequence that is complementary to a target DNA sequence.
- the nucleic acid molecule comprising a guide RNA sequence further comprises at least one RNA aptamer sequence.
- the polynucleotide comprising a nucleic acid sequence encoding a Cas protein further comprises a sequence encoding at least one peptide aptamer sequence.
- nucleic acid molecule comprising a guide RNA sequence, the aptamer sequences, the Cas proteins, the recombination proteins, and the aptamer binding proteins set forth above in connection with the inventive system also are applicable to the polynucleotides of the recited compositions and vectors.
- the nucleic acid sequence encoding the Cas protein and/or the nucleic acid sequence encoding a fusion protein comprising a recombination protein functionally linked to an aptamer binding protein can be provided to a cell on the same vector (e.g., in cis) as the nucleic acid molecule comprising the guide RNA sequence and/or the RNA aptamer sequence.
- a unidirectional promoter can be used to control expression of each nucleic acid sequence.
- a combination of bidirectional and unidirectional promoters can be used to control expression of multiple nucleic acid sequences.
- a nucleic acid sequence encoding the Cas protein, the nucleic acid sequence encoding a fusion protein comprising a recombination protein functionally linked to an aptamer binding protein, and the nucleic acid molecule comprising the guide RNA sequence and/or the RNA aptamer sequence can be provided to a cell on separate vectors (e.g., in trans).
- Each of the nucleic acid sequences in each of the separate vectors can comprise the same or different expression control sequences.
- the separate vectors can be provided to cells simultaneously or sequentially.
- the vector(s) comprising the nucleic acid sequences encoding the Cas protein and encoding a fusion protein comprising a recombination protein functionally linked to an aptamer binding protein can be introduced into a host cell that is capable of expressing the polypeptide encoded thereby, including any suitable prokaryotic or eukaryotic cell.
- the invention provides an isolated cell comprising the vector or nucleic acid sequences disclosed herein.
- Preferred host cells are those that can be easily and reliably grown, have reasonably fast growth rates, have well characterized expression systems, and can be transformed or transfected easily and efficiently.
- suitable prokaryotic cells include, but are not limited to, cells from the genera Bacillus (such as Bacillus subtilis and Bacillus brevis), Escherichia (such as E. coli), Pseudomonas, Streptomyces, Salmonella, and Erwinia.
- Suitable eukaryotic cells are known in the art and include, for example, yeast cells, insect cells, and mammalian cells.
- suitable yeast cells include those from the genera Kluyveromyces, Pichia, Rhinosporidium, Saccharomyces, and Schizosaccharomyces.
- Exemplary insect cells include Sf-9 and HI5 (Invitrogen, Carlsbad, Calif.) and are described in, for example, Kitts et al., Biotechniques, 14: 810-817 (1993); Lucklow, Curr. Opin. Biotechnol., 4: 564-572 (1993); and Lucklow et al., J. Virol., 67: 4566-4579 (1993), incorporated herein by reference.
- the host cell is a mammalian cell, and in some embodiments, the host cell is a human cell.
- a number of suitable mammalian and human host cells are known in the art, and many are available from the American Type Culture Collection (ATCC, Manassas, Va.).
- suitable mammalian cells include, but are not limited to, Chinese hamster ovary cells (CHO) (ATCC No. CCL61), CHO DHFR- cells (Urlaub et al., Proc. Natl. Acad. Sci. USA, 77: 4216-4220 (1980)), human embryonic kidney (HEK) 293 or 293T cells (ATCC No. CRL1573), and 3T3 cells (ATCC No. CCL92).
- Other suitable mammalian cell lines are the monkey COS-1 (ATCC No. CRL1650) and COS-7 cell lines (ATCC No. CRL1651), as
- mammalian host cells include primate, rodent, and human cell lines, including transformed cell lines. Normal diploid cells, cell strains derived from in vitro culture of primary tissue, as well as primary explants, are also suitable.
- Other suitable mammalian cell lines include, but are not limited to, mouse neuroblastoma N2A cells, HeLa, HEK, A549, HepG2, mouse L-929 cells, and BHK or HaK hamster cell lines. Methods for selecting suitable mammalian host cells and methods for transformation, culture, amplification, screening, and purification of cells are known in the art.
- the invention also provides a method of altering a target DNA.
- the method alters genomic DNA sequence in a cell, although any desired nucleic acid may be modified.
- the method comprises introducing the systems, compositions, or vectors described herein into a cell comprising a target genomic DNA sequence.
- Descriptions of the nucleic acid molecule comprising a guide RNA sequence, the Cas proteins, the recombination proteins, the recruitment systems, and polynucleotides encoding thereof, the cell, the target genomic DNA sequence, and components thereof, set forth above in connection with the inventive system are also applicable to the method of altering a target genomic DNA sequence in a cell.
- the systems, composition or vectors may be introduced in any manner known in the art including, but not limited to, chemical transfection, electroporation, microinjection, biolistic delivery via gene guns, or magnetic-assisted transfection, depending on the cell type.
- delivery of editing systems or components comprises delivery of a ribonucleoprotein (RNP) complex.
- targeting nucleic acids including but not limited to gRNAs, dgRNAs can be provided in complexes, such as without limitation, complexes comprising Cas9 or dCas9.
- an RNP complex comprises a guide nucleic acid and a Cas9 fusion protein, such as without limitation a complex comprising dCas9-SSAP.
- an RNP complex comprises a guide nucleic acid and a recombination protein, e.g., an SSAP or SSB, which may be adapted or modified to bind to the guide nucleic acid.
- the guide nucleic acid and the recombination protein or Cas9 fusion protein comprise binding elements that promote complex formation.
- a recombination protein comprises an MCP domain and a guide RNA comprises an MS2 aptamer, whereby binding of the MS2 aptamer to the MCP domain produces an RNP.
- the guide RNA and the Cas and/or recombination protein polypeptide are incubated together to form a ribonucleoprotein (RNP) complex prior to introduction into a cell, for example mixed together in a vessel to form an RNP complex, and then the RNP complex is introduced into the cell.
- the Cas polypeptide described herein can be an mRNA encoding the Cas polypeptide, which Cas mRNA is introduced into the primary cell together with the modified sgRNA as an “All RNA” CRISPR system.
- the RNP complex and donor nucleic acid or vector are concomitantly introduced into a cell.
- the RNP complex and the donor nucleic acid or vector are sequentially introduced into the primary cell.
- the RNP complex is introduced into the primary cell before the donor.
- the donor is introduced into the primary cell before the RNP complex.
- the RNP complex can be introduced into a cell about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, 55, 60, 90, 120, 150, 180, 210, or 240 minutes or more before the donor nucleic acid or vector, or vice versa.
- Non-limiting examples include use of (1) purified Cas9 or dCas9 protein; (2) synthesized guide RNA with MS2 aptamer; (3) purified MCP-SSAP fusion protein; and (4) donor DNA (double- or single-stranded DNA donor for HEK293 and K562, and AAV donor for HSC), delivered into HEK293, K562, and primary hematopoietic stem cells (mouse and human) for knock-in editing.
- the following table provides exemplary sequences for generating knock-ins including at ALB and AAVS1.
- the sequences can be employed in RNPs, nucleic acids, vectors, for expression, and the like.
- the guide RNA sequence binds to the target genomic DNA sequence in the cell genome
- the Cas protein associates with the guide RNA and may induce a double strand break or single strand nick in the target genomic DNA sequence and the aptamer recruits the recombination proteins to the target genomic DNA sequence through the aptamer binding protein of the fusion protein, thereby altering the target genomic DNA sequence in the cell.
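- The sequence-complementarity step described above can be sketched as a simple string search. This is a minimal illustration under stated assumptions: the function names and the example spacer are hypothetical, PAM requirements and mismatch tolerance are deliberately ignored, and no claim is made that this reflects how target sites were selected in the disclosed examples.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def rna_to_dna(rna: str) -> str:
    """Write an RNA sequence in the DNA alphabet (U -> T)."""
    return rna.upper().replace("U", "T")


def find_target_sites(guide_rna: str, top_strand: str) -> list[tuple[int, str]]:
    """Find where a guide spacer can base-pair with a double-stranded DNA.

    A '+' hit means the spacer matches the top strand, so the guide
    pairs with the complementary bottom strand. A '-' hit means the
    guide pairs with the top strand itself. Positions are 0-based
    offsets on the top strand. PAM checks are omitted in this sketch.
    """
    spacer = rna_to_dna(guide_rna)
    dna = top_strand.upper()
    hits = []
    for strand, query in (("+", spacer), ("-", reverse_complement(spacer))):
        start = dna.find(query)
        while start != -1:
            hits.append((start, strand))
            start = dna.find(query, start + 1)
    return sorted(hits)


# Hypothetical 20-nt spacer and a toy target carrying one site per strand.
GUIDE = "GACGUUACCGGAUCAAGCUU"
TARGET = "CCC" + rna_to_dna(GUIDE) + "TTT" + reverse_complement(rna_to_dna(GUIDE)) + "GGG"

print(find_target_sites(GUIDE, TARGET))
```

- A practical design tool would additionally check for the PAM required by the chosen Cas protein and score near-matches elsewhere in the genome; both are outside the scope of this sketch.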
- the nucleic acid molecule comprising a guide RNA sequence, the Cas9 protein, and the fusion protein are first expressed in the cell.
- the cell is in an organism or host, such that introducing the disclosed systems, compositions, vectors into the cell comprises administration to a subject.
- the method may comprise providing or administering to the subject, in vivo, or by transplantation of ex vivo treated cells, systems, compositions, vectors of the present system.
- a “subject” may be human or non-human and may include, for example, animal strains or species used as “model systems” for research purposes, such as a mouse model as described herein. Likewise, subject may include either adults or juveniles (e.g., children). Moreover, subject may mean any living organism, preferably a mammal (e.g., human or non-human) that may benefit from the administration of compositions contemplated herein.
- mammals include, but are not limited to, any member of the Mammalian class: humans, non-human primates such as chimpanzees, and other apes and monkey species; farm animals such as cattle, horses, sheep, goats, swine; domestic animals such as rabbits, dogs, and cats; laboratory animals including rodents, such as rats, mice and guinea pigs, and the like.
- non-mammals include, but are not limited to, birds, fish, and the like.
- the mammal is a human.
- Plants include without limitation sugar cane, corn, wheat, rice, oil palm fruit, potatoes, soybeans, vegetables, cassava, sugar beets, tomatoes, barley, bananas, watermelon, onions, sweet potatoes, cucumbers, apples, seed cotton, oranges, and the like.
- the terms “providing”, “administering,” “introducing,” are used interchangeably herein and refer to the placement of the systems of the invention into a subject by a method or route which results in at least partial localization of the system to a desired site.
- the systems can be administered by any appropriate route which results in delivery to a desired location in the subject.
- altering a DNA sequence refers to modifying at least one physical feature of a DNA sequence of interest.
- DNA alterations include, for example, single or double strand DNA breaks, deletion or insertion of one or more nucleotides, and other modifications that affect the structural integrity or nucleotide sequence of the DNA sequence.
- the modifications of a target sequence in genomic DNA may lead to, for example, gene correction, gene replacement, gene tagging, transgene insertion, nucleotide deletion, gene disruption, gene mutation, gene knock-down, and the like.
- the systems and methods described herein may be used to correct one or more defects or mutations in a gene (referred to as “gene correction”).
- the target genomic DNA sequence encodes a defective version of a gene
- the system further comprises a donor nucleic acid molecule which encodes a wild-type or corrected version of the gene.
- the target genomic DNA sequence is a “disease-associated” gene.
- the term “disease-associated gene,” refers to any gene or polynucleotide whose gene products are expressed at an abnormal level or in an abnormal form in cells obtained from a disease-affected individual as compared with tissues or cells obtained from an individual not affected by the disease.
- a disease-associated gene may be expressed at an abnormally high level or at an abnormally low level, where the altered expression correlates with the occurrence and/or progression of the disease.
- a disease-associated gene also refers to a gene, the mutation or genetic variation of which is directly responsible or is in linkage disequilibrium with a gene(s) that is responsible for the etiology of a disease.
- genes responsible for such “single gene” or “monogenic” diseases include, but are not limited to, adenosine deaminase, α-1 antitrypsin, cystic fibrosis transmembrane conductance regulator (CFTR), β-hemoglobin (HBB), oculocutaneous albinism II (OCA2), Huntingtin (HTT), dystrophia myotonica-protein kinase (DMPK), low-density lipoprotein receptor (LDLR), apolipoprotein B (APOB), neurofibromin 1 (NF1), polycystic kidney disease 1 (PKD1), polycystic kidney disease 2 (PKD2), coagulation factor VIII (F8), dystrophin (DMD), phosphate-regulating endopeptidase homologue, X-linked (PHEX), methyl-CpG-binding protein 2 (MECP2), and ubiquitin-specific peptidase 9Y, Y-linked (USP9Y).
- the invention provides knock-ins of large transgenes at therapeutically relevant loci in the human genome.
- the locus provides cell or tissue-specific expression.
- the invention comprises insertion of nucleic acids into the albumin (ALB) locus.
- the ALB locus, which is highly expressed in a liver-specific manner, provides for liver targeting in human hepatocytes.
- the invention comprises insertion of nucleic acids into the AAVS1 locus.
- the AAVS1 locus is a safe-harbor locus for gene therapy that is well expressed in certain tissue types and can be used in a wide variety of treatments, with low expression in liver.
- US Patent Publication 2018/0214490 A1 describes gene therapy for lysosomal storage diseases, including targeting transgenes to “safe harbor” loci such as the AAVS1, HPRT and CCR5 genes in human cells, and Rosa26 in murine cells.
- US Patent 9267154 describes integration of exogenous nucleic acid sequences into the PPP1R12C locus, which is widely expressed in most tissues. Cell-specific expression by targeting transgenes (e.g., encoding chimeric antigen receptors (CARs)) to the T-cell receptor α constant (TRAC) locus has also been described.
- These loci are exemplary and nonlimiting as to loci that can be targeted according to the invention.
- the target genomic DNA sequence can comprise a gene, the mutation of which contributes to a particular disease in combination with mutations in other genes.
- Diseases caused by the contribution of multiple genes which lack simple (e.g., Mendelian) inheritance patterns are referred to in the art as a “multifactorial” or “polygenic” disease.
- multifactorial or polygenic diseases include, but are not limited to, asthma, diabetes, epilepsy, hypertension, bipolar disorder, and schizophrenia.
- Certain developmental abnormalities also can be inherited in a multifactorial or polygenic pattern and include, for example, cleft lip/palate, congenital heart defects, and neural tube defects.
- the method of altering a target genomic DNA sequence can be used to delete nucleic acids from a target sequence in a cell by cleaving the target sequence and allowing the cell to repair the cleaved sequence in the absence of an exogenously provided donor nucleic acid molecule.
- Deletion of a nucleic acid sequence in this manner can be used in a variety of applications, such as, for example, to remove disease-causing trinucleotide repeat sequences in neurons, to create gene knock-outs or knock-downs, and to generate mutations for disease models in research.
- donor nucleic acid molecule refers to a nucleotide sequence that is inserted into the target DNA (e.g., genomic DNA).
- the donor DNA may include, for example, a gene or part of a gene, a sequence encoding a tag or localization sequence, or a regulating element.
- the donor nucleic acid molecule may be of any length. In some embodiments, the donor nucleic acid molecule is between 10 and 10,000 nucleotides in length, for example between about 100 and 5,000 nucleotides in length, between about 200 and 2,000 nucleotides in length, between about 500 and 1,000 nucleotides in length, between about 500 and 5,000 nucleotides in length, between about 1,000 and 5,000 nucleotides in length, or between about 1,000 and 10,000 nucleotides in length.
- the disclosed systems and methods overcome challenges encountered during conventional gene editing, including low efficiency and off-target events, particularly with kilobase-scale nucleic acids.
- the disclosed systems and methods improve the efficiency of gene editing.
- the disclosed systems and methods can have a 2- to 10-fold increase in efficiency over conventional CRISPR-Cas9 systems and methods, as shown in Examples 2, 3, and 5.
- the improvement in efficiency is accompanied by a reduction in off-target events.
- the off-target events may be reduced by greater than 50% compared to conventional CRISPR-Cas9 systems and methods, for example, a reduction of off-target events by about 90% is shown in Example 3.
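- The two comparisons recited above (fold increase in efficiency and percent reduction in off-target events) reduce to simple ratios. The sketch below shows the arithmetic only; the function names are hypothetical and the numeric inputs are made-up placeholders, not measurements from the Examples.

```python
def fold_change(new_value: float, baseline_value: float) -> float:
    """Fold increase of a new measurement over a baseline measurement."""
    if baseline_value <= 0:
        raise ValueError("baseline must be positive")
    return new_value / baseline_value


def percent_reduction(baseline_events: float, new_events: float) -> float:
    """Percent reduction in event count relative to the baseline."""
    if baseline_events <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline_events - new_events) / baseline_events


# Placeholder inputs (hypothetical): knock-in efficiency in percent of cells,
# and off-target events counted per sample.
efficiency_fold = fold_change(new_value=20.0, baseline_value=4.0)
off_target_drop = percent_reduction(baseline_events=50.0, new_events=5.0)

print(f"{efficiency_fold:.1f}-fold efficiency, {off_target_drop:.0f}% fewer off-target events")
```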
- the invention further provides kits containing one or more reagents or other components useful, necessary, or sufficient for practicing any of the methods described herein.
- kits may include CRISPR reagents (Cas protein, guide RNA, vectors, compositions, etc.), recombineering reagents (recombination protein-aptamer binding protein fusion protein, the aptamer sequence, vectors, compositions, etc.) transfection or administration reagents, negative and positive control samples (e.g., cells, template DNA), cells, containers housing one or more components (e.g., microcentrifuge tubes, boxes), detectable labels, detection and analysis instruments, software, instructions, and the like.
- the RNAs may be delivered using adeno associated virus (AAV), lentivirus, adenovirus or other viral vector types, or combinations thereof.
- the RNAs can be packaged into one or more viral vectors.
- the viral vector is delivered to the tissue of interest by, for example, an intramuscular injection, while other times the viral delivery is via intravenous, transdermal, intranasal, oral, mucosal, or other delivery methods. Such delivery may be either via a single dose, or multiple doses.
- the actual dosage to be delivered herein may vary greatly depending upon a variety of factors, such as the vector chosen, the target cell, organism, or tissue, the general condition of the subject to be treated, the degree of transformation/modification sought, the administration route, the administration mode, the type of transformation/modification sought, etc.
- Such a dosage may further contain, for example, a carrier (water, saline, ethanol, glycerol, lactose, sucrose, calcium phosphate, gelatin, dextran, agar, pectin, peanut oil, sesame oil, etc.), a diluent, a pharmaceutically-acceptable carrier (e.g., phosphate-buffered saline), a pharmaceutically-acceptable excipient, and/or other compounds known in the art.
- the dosage may further contain one or more pharmaceutically acceptable salts such as, for example, a mineral acid salt such as a hydrochloride, a hydrobromide, a phosphate, a sulfate, etc.; and the salts of organic acids such as acetates, propionates, malonates, benzoates, etc.
- auxiliary substances such as wetting or emulsifying agents, pH buffering substances, gels or gelling materials, flavorings, colorants, microspheres, polymers, suspension agents, etc. may also be present herein.
- Suitable exemplary ingredients include microcrystalline cellulose, carboxymethylcellulose sodium, polysorbate 80, phenylethyl alcohol, chlorobutanol, potassium sorbate, sorbic acid, sulfur dioxide, propyl gallate, the parabens, ethyl vanillin, glycerin, phenol, parachlorophenol, gelatin, albumin, and a combination thereof.
- See REMINGTON'S PHARMACEUTICAL SCIENCES (Mack Pub. Co., N.J. 1991), which is incorporated by reference herein.
- the delivery is via an adenovirus, which may be at a single booster dose containing at least 1 × 10^5 particles (also referred to as particle units, pu) of adenoviral vector.
- the dose preferably is at least about 1 × 10^6 particles (for example, about 1 × 10^6 to 1 × 10^12 particles), more preferably at least about 1 × 10^7 particles, more preferably at least about 1 × 10^8 particles (e.g., about 1 × 10^8 to 1 × 10^11 particles or about 1 × 10^8 to 1 × 10^12 particles), and most preferably at least about 1 × 10^9 particles (e.g., about 1 × 10^9 to 1 × 10^10 particles or about 1 × 10^9 to 1 × 10^12 particles), or even at least about 1 × 10^10 particles (e.g., about 1 × 10^10 to 1 × 10
- the dose comprises no more than about 1 × 10^14 particles, preferably no more than about 1 × 10^13 particles, even more preferably no more than about 1 × 10^12 particles, even more preferably no more than about 1 × 10^11 particles, and most preferably no more than about 1 × 10^10 particles (e.g., no more than about 1 × 10^9 particles).
- the dose may contain a single dose of adenoviral vector with, for example, about 1 × 10^6 particle units (pu), about 2 × 10^6 pu, about 4 × 10^6 pu, about 1 × 10^7 pu, about 2 × 10^7 pu, about 4 × 10^7 pu, about 1 × 10^8 pu, about 2 × 10^8 pu, about 4 × 10^8 pu, about 1 × 10^9 pu, about 2 × 10^9 pu, about 4 × 10^9 pu, about 1 × 10^10 pu, about 2 × 10^10 pu, about 4 × 10^10 pu, about 1 × 10^11 pu, about 2 × 10^11 pu, about 4 × 10^11 pu, about 1 × 10^12 pu, about 2 × 10^12 pu, or about 4 × 10^12 pu of adenoviral vector.
- See, for example, the adenoviral vectors in U.S. Pat. No. 8,454,972 B2 to Nabel, et al., granted on Jun. 4, 2013, incorporated by reference herein, and the dosages at col. 29, lines 36-58 thereof.
- the adenovirus is delivered via multiple doses.
- the delivery is via an AAV.
- a therapeutically effective dosage for in vivo delivery of the AAV to a human is believed to be in the range of from about 20 to about 50 ml of saline solution containing from about 1 × 10^10 to about 1 × 10^10 functional AAV/ml solution. The dosage may be adjusted to balance the therapeutic benefit against any side effects.
- the AAV dose is generally in the range of concentrations of from about 1 × 10^5 to 1 × 10^50 genomes AAV, from about 1 × 10^8 to 1 × 10^20 genomes AAV, from about 1 × 10^10 to about 1 × 10^16 genomes, or about 1 × 10^11 to about 1 × 10^16 genomes AAV.
- a human dosage may be about 1 × 10^13 genomes AAV. Such concentrations may be delivered in from about 0.001 ml to about 100 ml, about 0.05 to about 50 ml, or about 10 to about 25 ml of a carrier solution. Other effective dosages can be readily established by one of ordinary skill in the art through routine trials establishing dose response curves. See, for example, U.S. Pat. No. 8,404,658 B2 to Hajjar, et al., granted on Mar. 26, 2013, at col. 27, lines 45-60.
- the delivery is via a plasmid.
- the dosage should be a sufficient amount of plasmid to elicit a response.
- suitable quantities of plasmid DNA in plasmid compositions can be from about 0.1 to about 2 mg, or from about 1 μg to about 10 μg.
- the doses herein are based on an average 70 kg individual.
- the frequency of administration is within the ambit of the medical or veterinary practitioner (e.g., physician, veterinarian), or scientist skilled in the art. Mice used in experiments are about 20 g. From that which is administered to a 20 g mouse, one can extrapolate to a 70 kg individual.
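- The extrapolation from a 20 g mouse to a 70 kg individual mentioned above can be sketched as follows. The text does not state which scaling rule is intended; the sketch assumes simple proportionality to body mass, which is an assumption made here for illustration (other conventions, such as body-surface-area scaling, give different numbers). The function name and the example dose are hypothetical.

```python
MOUSE_MASS_KG = 0.020   # about 20 g, as stated in the text
HUMAN_MASS_KG = 70.0    # average individual, as stated in the text


def scale_dose_by_mass(dose: float,
                       from_mass_kg: float = MOUSE_MASS_KG,
                       to_mass_kg: float = HUMAN_MASS_KG) -> float:
    """Scale a dose linearly with body mass (assumed rule, not from the text).

    The returned value is in the same unit as the input dose.
    """
    if from_mass_kg <= 0 or to_mass_kg <= 0:
        raise ValueError("masses must be positive")
    return dose * (to_mass_kg / from_mass_kg)


# Hypothetical mouse dose of 1e9 vector genomes, scaled by mass.
human_dose = scale_dose_by_mass(1e9)
print(f"scaling factor: {HUMAN_MASS_KG / MOUSE_MASS_KG:.0f}x, dose: {human_dose:.2e}")
```

- Under this assumption the mouse-to-human factor is 70 kg / 0.020 kg, or about 3,500-fold.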
- Lentiviruses are complex retroviruses that have the ability to infect and express their genes in both mitotic and post-mitotic cells.
- the most commonly known lentivirus is the human immunodeficiency virus (HIV), which uses the envelope glycoproteins of other viruses to target a broad range of cell types.
- VSV-g pseudotype VSV-g pseudotype
- psPAX2 gag/pol/rev/tat
- Transfection was done in 4 mL OptiMEM with a cationic lipid delivery agent (50 µL Lipofectamine 2000 and 100 µL Plus reagent). After 6 hours, the media was changed to antibiotic-free DMEM with 10% fetal bovine serum.
- Lentivirus may be purified as follows. Viral supernatants were harvested after 48 hours. Supernatants were first cleared of debris and filtered through a 0.45 µm low protein binding (PVDF) filter. They were then spun in an ultracentrifuge for 2 hours at 24,000 rpm. Viral pellets were resuspended in 50 µL of DMEM overnight at 4 °C. They were then aliquoted and immediately frozen at -80 °C.
- minimal non-primate lentiviral vectors based on the equine infectious anemia virus (EIAV) are also contemplated, especially for ocular gene therapy (see, e.g., Balagaan, J Gene Med 2006; 8: 275-285, published online 21 Nov. 2005 in Wiley InterScience; available at the website: interscience.wiley.com. DOI: 10.1002/jgm.845).
- RetinoStat®, an equine infectious anemia virus-based lentiviral gene therapy vector that expresses the angiostatic proteins endostatin and angiostatin and is delivered via a subretinal injection for the treatment of the wet form of age-related macular degeneration, is also contemplated (see, e.g., Binley et al., HUMAN GENE THERAPY 23:980-991 (September 2012)) and may be modified for the system of the present invention.
- Lentiviral vectors have been disclosed for the treatment of Parkinson's Disease, see, e.g., US Patent Publication No. 20120295960 and U.S. Pat. Nos. 7,303,910 and 7,351,585. Lentiviral vectors have also been disclosed for the treatment of ocular diseases, see e.g., US Patent Publication Nos. 20060281180, 20090007284, US20110117189; US20090017543;
- a particle is defined as a small object that behaves as a whole unit with respect to its transport and properties. Particles are further classified according to diameter. Coarse particles cover a range between 2,500 and 10,000 nanometers. Fine particles are sized between 100 and 2,500 nanometers. Ultrafine particles, or nanoparticles, are generally between 1 and 100 nanometers in size. The basis of the 100-nm limit is the fact that novel properties that differentiate particles from the bulk material typically develop at a critical length scale of under 100 nm.
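- The diameter classes defined above can be expressed as a small lookup. This is a minimal sketch: the function name is hypothetical, and because the stated ranges share their endpoints (100 nm and 2,500 nm), the sketch assumes that a boundary value belongs to the smaller class.

```python
def classify_particle(diameter_nm: float) -> str:
    """Classify a particle by diameter using the ranges stated in the text.

    Assumption of this sketch: shared boundary values (100 nm, 2,500 nm)
    are assigned to the smaller size class.
    """
    if diameter_nm <= 0:
        raise ValueError("diameter must be positive")
    if diameter_nm < 1:
        return "below ultrafine range"
    if diameter_nm <= 100:
        return "ultrafine (nanoparticle)"
    if diameter_nm <= 2500:
        return "fine"
    if diameter_nm <= 10000:
        return "coarse"
    return "above coarse range"


for d in (50, 100, 250, 5000):
    print(d, "nm ->", classify_particle(d))
```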
- a particle delivery system/formulation is defined as any biological delivery system/formulation which includes a particle in accordance with the present invention.
- a particle in accordance with the present invention is any entity having a greatest dimension (e.g., diameter) of less than 100 microns (μm).
- inventive particles have a greatest dimension of less than 10
- inventive particles have a greatest dimension of less than 2000 nanometers (nm).
- inventive particles have a greatest dimension of less than 1000 nanometers (nm).
- inventive particles have a greatest dimension of less than 900 nm, 800 nm, 700 nm, 600 nm, 500 nm, 400 nm, 300 nm, 200 nm, or 100 nm.
- inventive particles have a greatest dimension (e.g., diameter) of 500 nm or less.
- inventive particles have a greatest dimension (e.g., diameter) of 250 nm or less.
- inventive particles have a greatest dimension (e.g., diameter) of 200 nm or less.
- inventive particles have a greatest dimension (e.g., diameter) of 150 nm or less.
- inventive particles have a greatest dimension (e.g., diameter) of 100 nm or less. Smaller particles, e.g., having a greatest dimension of 50 nm or less are used in some embodiments of the invention. In some embodiments, inventive particles have a greatest dimension ranging between 25 nm and 200 nm.
- Particle characterization is done using a variety of different techniques. Common techniques are electron microscopy (TEM, SEM), atomic force microscopy (AFM), dynamic light scattering (DLS), X-ray photoelectron spectroscopy (XPS), powder X-ray diffraction (XRD), Fourier transform infrared spectroscopy (FTIR), matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF), ultraviolet-visible spectroscopy, dual polarisation interferometry and nuclear magnetic resonance (NMR).
- Characterization may be made as to native particles (e.g., preloading) or after loading of the cargo (herein cargo refers to one or more RNAs and/or vectors encoding the same, and may include additional components, carriers and/or excipients) to provide particles of an optimal size for delivery for any in vitro, ex vivo and/or in vivo application of the present invention.
- particle dimension (e.g., diameter) characterization is based on measurements using dynamic light scattering (DLS).
- any of the delivery systems described herein including but not limited to, e.g., lipid-based systems, liposomes, micelles, microvesicles, exosomes, or gene gun may be provided as particle delivery systems within the scope of the present invention.
- CRISPR enzyme mRNA and guide RNA may be delivered simultaneously using nanoparticles or lipid envelopes.
- Other delivery systems or vectors may be used in conjunction with the nanoparticle aspects of the invention.
- nanoparticle refers to any particle having a diameter of less than 1000 nm.
- nanoparticles of the invention have a greatest dimension (e.g., diameter) of 500 nm or less.
- nanoparticles of the invention have a greatest dimension ranging between 25 nm and 200 nm.
- nanoparticles of the invention have a greatest dimension of 100 nm or less.
- nanoparticles of the invention have a greatest dimension ranging between 35 nm and 60 nm.
- Nanoparticles encompassed in the present invention may be provided in different forms, e.g., as solid nanoparticles (e.g., metal such as silver, gold, iron, titanium; non-metal; lipid-based solids; polymers), suspensions of nanoparticles, or combinations thereof.
- Metal, dielectric, and semiconductor nanoparticles may be prepared, as well as hybrid structures (e.g., core-shell nanoparticles).
- Nanoparticles made of semiconducting material may also be labeled quantum dots if they are small enough (typically sub 10 nm) that quantization of electronic energy levels occurs. Such nanoscale particles are used in biomedical applications as drug carriers or imaging agents and may be adapted for similar purposes in the present invention.
- Nanoparticles with one half hydrophilic and the other half hydrophobic are termed Janus particles and are particularly effective for stabilizing emulsions. They can self-assemble at water/oil interfaces and act as solid surfactants.
- nanoparticles based on self-assembling bioadhesive polymers are contemplated, which may be applied to oral delivery of peptides, intravenous delivery of peptides and nasal delivery of peptides, all to the brain.
- Other embodiments, such as oral absorption and ocular delivery of hydrophobic drugs, are also contemplated.
- the molecular envelope technology involves an engineered polymer envelope which is protected and delivered to the site of the disease (see, e.g., Mazza, M. et al. ACS Nano, 2013. 7(2): 1016-1026; Siew, A., et al. Mol Pharm, 2012. 9(1): 14-28; Lalatsa, A., et al.
- nanoparticles that can deliver RNA to a cancer cell to stop tumor growth, developed by Dan Anderson's lab at MIT, may be used and/or adapted to the CRISPR Cas system of the present invention.
- the Anderson lab developed fully automated, combinatorial systems for the synthesis, purification, characterization, and formulation of new biomaterials and nanoformulations. See, e.g., Alabi et al., Proc Natl Acad Sci USA. 2013 Aug. 6; 110(32): 12881-6; Zhang et al., Adv Mater. 2013 Sep. 6; 25(33):4641-5; Jiang et al., Nano Lett. 2013 Mar.
- US patent application 20110293703 relates to lipidoid compounds that are also particularly useful in the administration of polynucleotides, which may be applied to deliver the CRISPR Cas system of the present invention.
- the aminoalcohol lipidoid compounds are combined with an agent to be delivered to a cell or a subject to form microparticles, nanoparticles, liposomes, or micelles.
- the agent to be delivered by the particles, liposomes, or micelles may be in the form of a gas, liquid, or solid, and the agent may be a polynucleotide, protein, peptide, or small molecule.
- aminoalcohol lipidoid compounds may be combined with other aminoalcohol lipidoid compounds, polymers (synthetic or natural), surfactants, cholesterol, carbohydrates, proteins, lipids, etc. to form the particles. These particles may then optionally be combined with a pharmaceutical excipient to form a pharmaceutical composition.
- US Patent Publication No. 20110293703 also provides methods of preparing the aminoalcohol lipidoid compounds.
- One or more equivalents of an amine are allowed to react with one or more equivalents of an epoxide-terminated compound under suitable conditions to form an aminoalcohol lipidoid compound of the present invention.
- all the amino groups of the amine are fully reacted with the epoxide-terminated compound to form tertiary amines.
- all the amino groups of the amine are not fully reacted with the epoxide-terminated compound to form tertiary amines thereby resulting in primary or secondary amines in the aminoalcohol lipidoid compound.
- a diamine or polyamine may include one, two, three, or four epoxide-derived compound tails off the various amino moieties of the molecule resulting in primary, secondary, and tertiary amines. In certain embodiments, all the amino groups are not fully functionalized. In certain embodiments, two of the same types of epoxide-terminated compounds are used. In other embodiments, two or more different epoxide-terminated compounds are used.
- the synthesis of the aminoalcohol lipidoid compounds is performed with or without solvent, and the synthesis may be performed at higher temperatures ranging from 30-100 °C, preferably at approximately 50-90 °C.
- the prepared aminoalcohol lipidoid compounds may be optionally purified.
- the mixture of aminoalcohol lipidoid compounds may be purified to yield an aminoalcohol lipidoid compound with a particular number of epoxide-derived compound tails. Or the mixture may be purified to yield a particular stereo- or regioisomer.
- the aminoalcohol lipidoid compounds may also be alkylated using an alkyl halide (e.g., methyl iodide) or other alkylating agent, and/or they may be acylated.
- US Patent Publication No. 20110293703 also provides libraries of aminoalcohol lipidoid compounds prepared by the inventive methods. These aminoalcohol lipidoid compounds may be prepared and/or screened using high-throughput techniques involving liquid handlers, robots, microtiter plates, computers, etc. In certain embodiments, the aminoalcohol lipidoid compounds are screened for their ability to transfect polynucleotides or other agents (e.g., proteins, peptides, small molecules) into the cell.
- US Patent Publication No. 20130302401 relates to a class of poly(beta-amino alcohols) (PBAAs) that has been prepared using combinatorial polymerization.
- the inventive PBAAs may be used in biotechnology and biomedical applications as coatings (such as coatings of films or multilayer films for medical devices or implants), additives, materials, excipients, non-biofouling agents, micropatterning agents, and cellular encapsulation agents.
- these coatings reduce the recruitment of inflammatory cells, and reduce fibrosis, following the subcutaneous implantation of carboxylated polystyrene microparticles.
- These polymers may be used to form polyelectrolyte complex capsules for cell encapsulation.
- the invention may also have many other biological applications such as antimicrobial coatings, DNA or siRNA delivery, and stem cell tissue engineering.
- US Patent Publication No. 20130302401 may be applied to the system of the present invention.
- lipid nanoparticles are contemplated.
- an antitransthyretin small interfering RNA encapsulated in lipid nanoparticles may be applied to the system of the present invention.
- Doses of about 0.01 to about 1 mg per kg of body weight administered intravenously are contemplated.
- Medications to reduce the risk of infusion-related reactions are contemplated, such as dexamethasone, acetaminophen, diphenhydramine or cetirizine, and ranitidine.
- Lipids include, but are not limited to, DLin-KC2-DMA, C12-200 and the colipids distearoylphosphatidylcholine, cholesterol, and PEG-DMG, which may be formulated with RNA instead of siRNA (see, e.g., Novobrantseva, Molecular Therapy - Nucleic Acids (2012) 1, e4; doi: 10.1038/mtna.2011.3) using a spontaneous vesicle formation procedure.
- the component molar ratio may be about 50/10/38.5/1.5 (DLin-KC2-DMA or C12-200/distearoylphosphatidylcholine/cholesterol/PEG-DMG).
- the final lipid:siRNA weight ratio may be ~12:1 and 9:1 in the case of DLin-KC2-DMA and C12-200 lipid nanoparticles (LNPs), respectively.
- the formulations may have mean particle diameters of ~80 nm with >90% entrapment efficiency. A 3 mg/kg dose may be contemplated.
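The molar and weight ratios quoted above translate directly into mole fractions and lipid mass per unit of RNA. The following is an illustrative sketch only; the function and dictionary names are hypothetical, and the 1 mg RNA input is an arbitrary example quantity rather than a value from the disclosure.

```python
def mole_fractions(molar_parts: dict) -> dict:
    """Normalize a molar ratio (parts per component) to mole fractions."""
    total = sum(molar_parts.values())
    return {name: parts / total for name, parts in molar_parts.items()}


def lipid_mass_for_rna(rna_mass_mg: float, lipid_to_rna_weight_ratio: float) -> float:
    """Total lipid mass (mg) needed for a given RNA mass at a lipid:RNA weight ratio."""
    return rna_mass_mg * lipid_to_rna_weight_ratio


# 50/10/38.5/1.5 molar ratio quoted in the text
# (ionizable lipid / distearoylphosphatidylcholine / cholesterol / PEG-DMG).
ratio = {"ionizable lipid": 50, "DSPC": 10, "cholesterol": 38.5, "PEG-DMG": 1.5}
fractions = mole_fractions(ratio)

if __name__ == "__main__":
    for name, fraction in fractions.items():
        print(f"{name}: {fraction:.3f}")
    # Example quantity (assumption): 1 mg RNA at the ~12:1 and 9:1 weight ratios.
    print("DLin-KC2-DMA LNP lipid (mg):", lipid_mass_for_rna(1.0, 12))
    print("C12-200 LNP lipid (mg):", lipid_mass_for_rna(1.0, 9))
```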
- LNPs have been shown to be highly effective in delivering siRNAs to the liver (see, e.g., Tabernero et al., Cancer Discovery, April 2013, Vol. 3, No. 4, pages 363-470) and are therefore contemplated for delivering CRISPR Cas to the liver.
- a dosage of about four doses of 6 mg/kg of the LNP (or RNA of the CRISPR-Cas) every two weeks may be contemplated.
- Tabernero et al. demonstrated that tumor regression was observed after the first 2 cycles of LNPs dosed at 0.7 mg/kg, and by the end of 6 cycles the patient had achieved a partial response with complete regression of the lymph node metastasis and substantial shrinkage of the liver tumors.
- the charge of the LNP must be taken into consideration.
- cationic lipids combine with negatively charged lipids to induce nonbilayer structures that facilitate intracellular delivery.
- ionizable cationic lipids with pKa values below 7 were developed (see, e.g., Rosin et al, Molecular Therapy, vol. 19, no. 12, pages 1286-2200, December 2011).
- Negatively charged polymers such as siRNA oligonucleotides may be loaded into LNPs at low pH values (e.g., pH 4) where the ionizable lipids display a positive charge.
- LNPs exhibit a low surface charge compatible with longer circulation times.
- ionizable cationic lipids have been focused upon, namely 1,2-dilineoyl-3-dimethylammoniumpropane (DLinDAP), 1,2-dilinoleyloxy-3-N,N-dimethylaminopropane (DLinDMA), 1,2-dilinoleyloxy-keto-N,N-dimethyl-3-aminopropane (DLinKDMA), and 1,2-dilinoleyl-4-(2-dimethylaminoethyl)-[1,3]-dioxolane (DLinKC2-DMA).
- LNP siRNA systems containing these lipids exhibit remarkably different gene silencing properties in hepatocytes in vivo, with potencies varying according to the series DLinKC2-DMA > DLinKDMA > DLinDMA >> DLinDAP employing a Factor VII gene silencing model (see, e.g., Rosin et al, Molecular Therapy, vol. 19, no. 12, pages 1286-2200, December 2011).
- a dosage of 1 μg/ml levels may be contemplated, especially for a formulation containing DLinKC2-DMA.
- Preparation of LNPs and CRISPR Cas encapsulation may be used and/or adapted from Rosin et al, Molecular Therapy, vol. 19, no.
- Cholesterol may be purchased from Sigma (St Louis, Mo.).
- the specific CRISPR Cas RNA may be encapsulated in LNPs containing DLinDAP, DLinDMA, DLinK-DMA, and DLinKC2-DMA (cationic lipid:DSPC:CHOL:PEGS-DMG or PEG-C-DOMG at 40:10:40:10 molar ratios).
- 0.2% SP-DiOC18 Invitrogen, Burlington, Canada
- Encapsulation may be performed by dissolving lipid mixtures comprised of cationic lipid:DSPC:cholesterol:PEG-c-DOMG (40:10:40:10 molar ratio) in ethanol to a final lipid concentration of 10 mmol/l.
- This ethanol solution of lipid may be added drop-wise to 50 mmol/l citrate, pH 4.0 to form multilamellar vesicles to produce a final concentration of 30% ethanol vol/vol.
- Large unilamellar vesicles may be formed following extrusion of multilamellar vesicles through two stacked 80 nm Nuclepore polycarbonate filters using the Extruder (Northern Lipids, Vancouver, Canada).
- Encapsulation may be achieved by adding RNA dissolved at 2 mg/ml in 50 mmol/l citrate, pH 4.0 containing 30% ethanol vol/vol drop-wise to extruded preformed large unilamellar vesicles and incubation at 31 °C for 30 minutes with constant mixing to a final RNA/lipid weight ratio of 0.06/1 wt/wt. Removal of ethanol and neutralization of formulation buffer were performed by dialysis against phosphate-buffered saline (PBS), pH 7.4 for 16 hours using Spectra/Por 2 regenerated cellulose dialysis membranes.
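The quantities in the encapsulation steps above (30% ethanol vol/vol, RNA stock at 2 mg/ml, RNA/lipid weight ratio of 0.06) can be worked through as simple arithmetic. This is a hedged sketch, not a validated protocol: the helper names are hypothetical, the example inputs (3 ml ethanolic lipid, 10 mg lipid) are arbitrary, and volumes are assumed to be additive, which is only approximately true for ethanol and water.

```python
def aqueous_volume_for_ethanol_fraction(ethanol_volume_ml: float,
                                        final_ethanol_fraction: float) -> float:
    """Buffer volume (ml) to mix with an ethanolic solution to reach a target
    ethanol fraction (vol/vol). Assumes additive volumes."""
    if not 0 < final_ethanol_fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    return ethanol_volume_ml * (1 - final_ethanol_fraction) / final_ethanol_fraction


def rna_mass_for_lipid(lipid_mass_mg: float, rna_to_lipid_ratio: float = 0.06) -> float:
    """RNA mass (mg) for a given lipid mass at the quoted 0.06/1 wt/wt ratio."""
    return lipid_mass_mg * rna_to_lipid_ratio


def rna_stock_volume_ml(rna_mass_mg: float, stock_mg_per_ml: float = 2.0) -> float:
    """Volume (ml) of the 2 mg/ml RNA stock that contains the required RNA mass."""
    return rna_mass_mg / stock_mg_per_ml


if __name__ == "__main__":
    # Example inputs (assumptions): 3 ml ethanolic lipid solution, 10 mg total lipid.
    print("citrate buffer (ml):", aqueous_volume_for_ethanol_fraction(3.0, 0.30))
    rna_mg = rna_mass_for_lipid(10.0)
    print("RNA (mg):", rna_mg, "stock volume (ml):", rna_stock_volume_ml(rna_mg))
```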
- PBS phosphate-buffered saline
- Nanoparticle size distribution may be determined by dynamic light scattering using a NICOMP 370 particle sizer, the vesicle/intensity modes, and Gaussian fitting (Nicomp Particle Sizing, Santa Barbara, Calif.). The particle size for all three LNP systems may be ~70 nm in diameter.
- siRNA encapsulation efficiency may be determined by removal of free siRNA using VivaPureD MiniH columns (Sartorius Stedim Biotech) from samples collected before and after dialysis. The encapsulated RNA may be extracted from the eluted nanoparticles and quantified at 260 nm.
- siRNA to lipid ratio was determined by measurement of cholesterol content in vesicles using the Cholesterol E enzymatic assay from Wako Chemicals USA (Richmond, Va.). PEGylated liposomes (or LNPs) can also be used for delivery.
- Preparation of large LNPs may be used and/or adapted from Rosin et al, Molecular Therapy, vol. 19, no. 12, pages 1286-2200, December 2011.
- a lipid premix solution (20.4 mg/ml total lipid concentration) may be prepared in ethanol containing DLinKC2-DMA, DSPC, and cholesterol at 50:10:38.5 molar ratios.
- Sodium acetate may be added to the lipid premix at a molar ratio of 0.75:1 (sodium acetate:DLinKC2-DMA).
- the lipids may be subsequently hydrated by combining the mixture with 1.85 volumes of citrate buffer (10 mmol/l, pH 3.0) with vigorous stirring, resulting in spontaneous liposome formation in aqueous buffer containing 35% ethanol.
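As a consistency check on the hydration step above, one volume of ethanolic premix combined with 1.85 volumes of buffer gives 1/(1 + 1.85), or about 35% ethanol, matching the stated figure. The sketch below also applies the 0.75:1 sodium acetate:DLinKC2-DMA molar ratio; function names and the 2 mmol example are hypothetical, and additive volumes are assumed.

```python
def ethanol_fraction_after_hydration(buffer_volumes: float) -> float:
    """Ethanol fraction (vol/vol) after adding `buffer_volumes` of aqueous buffer
    to one volume of ethanolic premix. Assumes additive volumes."""
    return 1.0 / (1.0 + buffer_volumes)


def sodium_acetate_mmol(dlinkc2_dma_mmol: float, molar_ratio: float = 0.75) -> float:
    """Sodium acetate (mmol) for a given DLinKC2-DMA amount at the 0.75:1 molar ratio."""
    return dlinkc2_dma_mmol * molar_ratio


if __name__ == "__main__":
    print(f"ethanol after hydration: {ethanol_fraction_after_hydration(1.85):.1%}")
    # Example amount (assumption): 2 mmol DLinKC2-DMA in the premix.
    print("sodium acetate (mmol):", sodium_acetate_mmol(2.0))
```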
- the liposome solution may be incubated at 37° C. to allow for time-dependent increase in particle size. Aliquots may be removed at various times during incubation to investigate changes in liposome size by dynamic light scattering (Zetasizer Nano ZS, Malvern Instruments, Worcestershire, UK).
- the liposomes should maintain their size, effectively quenching further growth.
- RNA may then be added to the empty liposomes at an siRNA to total lipid ratio of approximately 1:10 (wt:wt), followed by incubation for 30 minutes at 37 °C to form loaded LNPs. The mixture may be subsequently dialyzed overnight in PBS and filtered with a 0.45-μm syringe filter.
- Spherical Nucleic Acid (SNA™) constructs and other nanoparticles (particularly gold nanoparticles) are also contemplated as a means to deliver the CRISPR/Cas system to intended targets.
- Significant data show that AuraSense Therapeutics' Spherical Nucleic Acid (SNA™) constructs, based upon nucleic acid-functionalized gold nanoparticles, are superior to alternative platforms based on multiple key success factors, such as:
- the constructs can enter a variety of cultured cells, primary cells, and tissues with no apparent toxicity.
- constructs elicit minimal changes in global gene expression as measured by whole-genome microarray studies and cytokine-specific protein assays.
- Chemical tailorability: any number of single or combinatorial agents (e.g., proteins, peptides, small molecules) can be used to tailor the surface of the constructs.
- nucleic acid-based therapeutics may be applicable to numerous disease states, including inflammation and infectious disease, cancer, skin disorders and cardiovascular disease.
- Citable literature includes: Cutler et al., J. Am. Chem. Soc. 2011 133:9254-9257, Hao et al., Small. 2011 7:3158-3162, Zhang et al., ACS Nano. 2011 5:6962-6970, Cutler et al., J. Am. Chem. Soc. 2012 134:1376-1391, Young et al., Nano Lett. 2012 12:3867-71, Zheng et al., Proc. Natl. Acad. Sci. USA. 2012 109:11975-80, Mirkin, Nanomedicine 2012 7:635-638, Zhang et al., J. Am. Chem. Soc.
- Self-assembling nanoparticles with siRNA may be constructed with polyethyleneimine (PEI) that is PEGylated with an Arg-Gly-Asp (RGD) peptide ligand attached at the distal end of the polyethylene glycol (PEG), for example, as a means to target tumor neovasculature expressing integrins and used to deliver siRNA inhibiting vascular endothelial growth factor receptor-2 (VEGF R2) expression and thereby tumor angiogenesis (see, e.g., Schiffelers et al., Nucleic Acids Research, 2004, Vol. 32, No. 19).
- Nanoplexes may be prepared by mixing equal volumes of aqueous solutions of cationic polymer and nucleic acid to give a net molar excess of ionizable nitrogen (polymer) to phosphate (nucleic acid) over the range of 2 to 6.
- the electrostatic interactions between cationic polymers and nucleic acid resulted in the formation of polyplexes with average particle size distribution of about 100 nm, hence referred to here as nanoplexes.
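The nitrogen-to-phosphate (N/P) ratio described above can be estimated from component masses. This is a rough sketch under stated assumptions: the function names are hypothetical, about 330 g/mol per nucleotide phosphate is a common approximation, and 43 g/mol per ionizable nitrogen corresponds to an unmodified ethyleneimine repeat unit; a PEGylated polymer such as the one described would need its own measured value.

```python
def np_ratio(polymer_mass_ug: float, mass_per_nitrogen: float,
             nucleic_acid_mass_ug: float, mass_per_phosphate: float = 330.0) -> float:
    """Molar ratio of ionizable polymer nitrogen to nucleic acid phosphate.

    mass_per_nitrogen and mass_per_phosphate are in g/mol per charged group
    (approximate, user-supplied values).
    """
    nitrogen_umol = polymer_mass_ug / mass_per_nitrogen
    phosphate_umol = nucleic_acid_mass_ug / mass_per_phosphate
    return nitrogen_umol / phosphate_umol


def polymer_mass_for_np(target_np: float, nucleic_acid_mass_ug: float,
                        mass_per_nitrogen: float,
                        mass_per_phosphate: float = 330.0) -> float:
    """Polymer mass (ug) needed to reach a target N/P ratio."""
    phosphate_umol = nucleic_acid_mass_ug / mass_per_phosphate
    return target_np * phosphate_umol * mass_per_nitrogen


if __name__ == "__main__":
    # Example (assumptions): 10 ug nucleic acid, 43 g/mol per nitrogen, N/P from 2 to 6.
    for target in (2, 4, 6):
        mass = polymer_mass_for_np(target, 10.0, 43.0)
        print(f"N/P {target}: {mass:.2f} ug polymer")
```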
- a dosage of about 100 to 200 mg of CRISPR Cas is envisioned for delivery in the self-assembling nanoparticles of Schiffelers et al.
- the nanoplexes of Bartlett et al. may also be applied to the present invention.
- the nanoplexes of Bartlett et al. are prepared by mixing equal volumes of aqueous solutions of cationic polymer and nucleic acid to give a net molar excess of ionizable nitrogen (polymer) to phosphate (nucleic acid) over the range of 2 to 6.
- the electrostatic interactions between cationic polymers and nucleic acid resulted in the formation of polyplexes with average particle size distribution of about 100 nm, hence referred to here as nanoplexes.
- DOTA-NHS-ester: 1,4,7,10-tetraazacyclododecane-1,4,7,10-tetraacetic acid mono(N-hydroxysuccinimide ester)
- the amine modified RNA sense strand with a 100-fold molar excess of DOTA-NHS-ester in carbonate buffer (pH 9) was added to a microcentrifuge tube. The contents were reacted by stirring for 4 h at room temperature.
- the DOTA-RNAsense conjugate was ethanol-precipitated, resuspended in water, and annealed to the unmodified antisense strand to yield DOTA-siRNA.
- Tf-targeted and nontargeted siRNA nanoparticles may be formed by using cyclodextrin-containing polycations. Typically, nanoparticles were formed in water at a charge ratio of 3 (+/-) and an siRNA concentration of 0.5 g/liter. One percent of the adamantane-PEG molecules on the surface of the targeted nanoparticles were modified with Tf (adamantane-PEG-Tf). The nanoparticles were suspended in a 5% (wt/vol) glucose carrier solution for injection. Davis et al. (Nature, Vol 464, 15 Apr.
- the nanoparticles consist of a synthetic delivery system containing: (1) a linear, cyclodextrin-based polymer (CDP), (2) a human transferrin protein (TF) targeting ligand displayed on the exterior of the nanoparticle to engage TF receptors (TFR) on the surface of the cancer cells, (3) a hydrophilic polymer (polyethylene glycol (PEG) used to promote nanoparticle stability in biological fluids), and (4) siRNA designed to reduce the expression of the RRM2 (sequence used in the clinic was previously denoted siR2B+5).
- the TFR has long been known to be upregulated in malignant cells, and RRM2 is an established anti-cancer target.
- Similar doses may also be contemplated for the CRISPR Cas system of the present invention.
- the delivery of the invention may be achieved with nanoparticles containing a linear, cyclodextrin-based polymer (CDP), a human transferrin protein (TF) targeting ligand displayed on the exterior of the nanoparticle to engage TF receptors (TFR) on the surface of the cancer cells and/or a hydrophilic polymer (for example, polyethylene glycol (PEG) used to promote nanoparticle stability in biological fluids).
- Liposomes are spherical vesicle structures composed of a uni- or multilamellar lipid bilayer surrounding internal aqueous compartments and a relatively impermeable outer lipophilic phospholipid bilayer. Liposomes have gained considerable attention as drug delivery carriers because they are biocompatible, nontoxic, can deliver both hydrophilic and lipophilic drug molecules, protect their cargo from degradation by plasma enzymes, and transport their load across biological membranes and the blood brain barrier (BBB) (see, e.g., Spuch and Navarro, Journal of Drug Delivery, vol. 2011, Article ID 469679, 12 pages, 2011. doi : 10.1155/2011/469679 for review).
- Liposomes can be made from several different types of lipids; however, phospholipids are most commonly used to generate liposomes as drug carriers. Although liposome formation is spontaneous when a lipid film is mixed with an aqueous solution, it can also be expedited by applying force in the form of shaking by using a homogenizer, sonicator, or an extrusion apparatus (see, e.g., Spuch and Navarro, Journal of Drug Delivery, vol. 2011, Article ID 469679, 12 pages, 2011. doi: 10.1155/2011/469679 for review).
- other additives may be added to liposomes in order to modify their structure and properties.
- either cholesterol or sphingomyelin may be added to the liposomal mixture in order to help stabilize the liposomal structure and to prevent the leakage of the liposomal inner cargo.
- liposomes are prepared from hydrogenated egg phosphatidylcholine or egg phosphatidylcholine, cholesterol, and dicetyl phosphate, and their mean vesicle sizes were adjusted to about 50 and 100 nm (see, e.g., Spuch and Navarro, Journal of Drug Delivery, vol. 2011, Article ID 469679, 12 pages, 2011. doi: 10.1155/2011/469679 for review).
- Conventional liposome formulation is mainly comprised of natural phospholipids and lipids such as 1,2-distearoyl-sn-glycero-3-phosphatidyl choline (DSPC), sphingomyelin, egg phosphatidylcholines and monosialoganglioside. Since this formulation is made up of phospholipids only, liposomal formulations have encountered many challenges, one being instability in plasma. Several attempts to overcome these challenges have been made, specifically in the manipulation of the lipid membrane. One of these attempts focused on the manipulation of cholesterol.
- Trojan Horse liposomes are desirable and protocols may be found at cshprotocols.cshlp.org/content/2010/4/pdb.prot5407.long. These particles allow delivery of a transgene to the entire brain after an intravascular injection. Without being bound by limitation, it is believed that neutral lipid particles with specific antibodies conjugated to the surface allow crossing of the blood brain barrier via endocytosis. Applicant postulates utilizing Trojan Horse Liposomes to deliver the CRISPR family of nucleases to the brain via an intravascular injection, which would allow whole brain transgenic animals without the need for embryonic manipulation. About 1-5 g of nucleic acid molecule, e.g., DNA, RNA, may be contemplated for in vivo administration in liposomes.
- the system may be administered in liposomes, such as a stable nucleic-acid-lipid particle (SNALP) (see, e.g., Morrissey et al., Nature Biotechnology, Vol. 23, No. 8, August 2005).
- Daily intravenous injections of about 1, 3 or 5 mg/kg/day of a specific CRISPR Cas targeted in a SNALP are contemplated.
- the daily treatment may be over about three days and then weekly for about five weeks.
- a specific CRISPR Cas encapsulated SNALP administered by intravenous injection at doses of about 1 or 2.5 mg/kg is also contemplated (see, e.g., Zimmerman et al., Nature Letters, Vol. 441, 4 May 2006).
- the SNALP formulation may contain the lipids 3-N-[(ω-methoxypoly(ethylene glycol)2000)carbamoyl]-1,2-dimyristyloxy-propylamine (PEG-C-DMA), 1,2-dilinoleyloxy-N,N-dimethyl-3-aminopropane (DLinDMA), 1,2-distearoyl-sn-glycero-3-phosphocholine (DSPC) and cholesterol, in a 2:40:10:48 molar percent ratio (see, e.g., Zimmerman et al., Nature Letters, Vol. 441, 4 May 2006).
- the SNALP liposomes may be prepared by formulating D-Lin-DMA and PEG-C-DMA with distearoylphosphatidylcholine (DSPC), Cholesterol and siRNA using a 25:1 lipid/siRNA ratio and a 48/40/10/2 molar ratio of Cholesterol/D-Lin-DMA/DSPC/PEG-C-DMA.
- the resulting SNALP liposomes are about 80-100 nm in size.
- a SNALP may comprise synthetic cholesterol (Sigma-Aldrich, St Louis, Mo., USA), dipalmitoylphosphatidylcholine (Avanti Polar Lipids, Alabaster, Ala., USA), 3-N-[(ω-methoxy poly(ethylene glycol)2000)carbamoyl]-1,2-dimyristyloxypropylamine, and cationic 1,2-dilinoleyloxy-3-N,N-dimethylaminopropane (see, e.g., Geisbert et al., Lancet 2010; 375: 1896-905).
- a SNALP may comprise synthetic cholesterol (Sigma-Aldrich), 1,2-distearoyl-sn-glycero-3-phosphocholine (DSPC; Avanti Polar Lipids Inc.), PEG-cDMA, and 1,2-dilinoleyloxy-3-(N,N-dimethyl)aminopropane (DLinDMA) (see, e.g., Judge, J. Clin. Invest. 119:661-673 (2009)).
- Formulations used for in vivo studies may comprise a final lipid/RNA mass ratio of about 9:1.
- DLin-KC2-DMA: the amino lipid 2,2-dilinoleyl-4-dimethylaminoethyl-[1,3]-dioxolane
- a preformed vesicle with the following lipid composition may be contemplated: amino lipid, distearoylphosphatidylcholine (DSPC), cholesterol and (R)-2,3-bis(octadecyloxy)propyl-1-(methoxy poly(ethylene glycol)2000)propylcarbamate (PEG-lipid) in the molar ratio 40/10/40/10, respectively, and a FVII siRNA/total lipid ratio of approximately 0.05 (w/w).
- the particles may be extruded up to three times through 80 nm membranes prior to adding the CRISPR Cas RNA.
- Particles containing the highly potent amino lipid 16 may be used, in which the molar ratio of the four lipid components 16, DSPC, cholesterol and PEG-lipid is 50/10/38.5/1.5, which may be further optimized to enhance in vivo activity.
- Any element of any suitable CRISPR/Cas gene editing system known in the art can be employed in the systems and methods described herein, as appropriate.
- CRISPR/Cas gene editing technology is described in detail in, for example, U.S. Patent Application Publication 2014/0068797; U.S.
- RecE/T Homolog Screening: The RefSeq non-redundant protein database was downloaded from NCBI on October 29, 2019. The database was searched with E. coli Rac prophage RecT (NP_415865.1) and RecE (NP_415866.1) as queries using position-specific iterated (PSI)-BLAST¹ to retrieve protein homologs. Hits were clustered with CD-HIT² and representative sequences were selected from each cluster for multiple alignment with MUSCLE³. Then, FastTree⁴ was used for maximum likelihood tree reconstruction with default parameters. A diverse set of RecET homologs were selected, synthesized by GenScript, and cloned into pMPH MCP vectors for testing.
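The screening steps above map onto a chain of command-line tools. The sketch below only builds the argument lists and does not run anything. It is an illustration, not the authors' pipeline: the file names, the three PSI-BLAST iterations, and the 0.9 CD-HIT identity threshold are assumptions not given in the text, the `muscle -in/-out` syntax applies to MUSCLE v3 (v5 uses different flags), and the step that converts PSI-BLAST hits into a FASTA file is omitted.

```python
from typing import List


def build_homolog_screen_commands(query_fasta: str, blast_db: str, prefix: str,
                                  iterations: int = 3,
                                  identity: float = 0.9) -> List[List[str]]:
    """Return argv lists for PSI-BLAST, CD-HIT, MUSCLE (v3 syntax) and FastTree.

    All file names and numeric parameters are illustrative assumptions.
    """
    hits_tsv = f"{prefix}.hits.tsv"
    hits_fasta = f"{prefix}.hits.fasta"   # produced by a retrieval step not shown here
    reps_fasta = f"{prefix}.reps.fasta"
    aln_fasta = f"{prefix}.aln.fasta"
    tree_nwk = f"{prefix}.tree.nwk"
    return [
        # Iterative homolog search with tabular output
        ["psiblast", "-query", query_fasta, "-db", blast_db,
         "-num_iterations", str(iterations), "-outfmt", "6", "-out", hits_tsv],
        # Cluster hits and keep one representative per cluster
        ["cd-hit", "-i", hits_fasta, "-o", reps_fasta, "-c", str(identity)],
        # Multiple sequence alignment of the representatives
        ["muscle", "-in", reps_fasta, "-out", aln_fasta],
        # Approximate maximum-likelihood tree with default parameters
        ["FastTree", "-out", tree_nwk, aln_fasta],
    ]


if __name__ == "__main__":
    for cmd in build_homolog_screen_commands("recT.fasta", "refseq_protein", "recT"):
        print(" ".join(cmd))
```

Each list could be passed to `subprocess.run(cmd, check=True)` once the tools are installed and the intermediate FASTA file exists.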
- Plasmid construction: pX330, pMPH and pU6-(BbsI)_CBh-Cas9-T2A-BFP plasmids were obtained from Addgene. Tested effector DNA fragments were ordered from IDT, Genewiz, and GenScript. The fragments were Gibson assembled into the backbones using NEBuilder HiFi DNA Assembly Master Mix (New England BioLabs). All sgRNAs (Table 3) were inserted into backbones using Golden Gate cloning. All constructs were sequence-verified with Sanger sequencing of prepped plasmids.
- Cell culture: Human Embryonic Kidney (HEK) 293T, HeLa and HepG2 cells were maintained in Dulbecco’s Modified Eagle’s Medium (DMEM, Life Technologies), with 10% fetal bovine serum (FBS, HyClone), 100 U/mL penicillin, and 100 μg/mL streptomycin (Life Technologies) at 37 °C with 5% CO2.
- hES-H9 cells were maintained in mTeSR1 medium (StemCell Technologies) at 37 °C with 5% CO2. Culture plates were pre-coated with Matrigel (Corning) 12 hours prior to use, and cells were supplemented with 10 μM Y27632 (Sigma) for the first 24 hours after passaging. Culture media was changed every 24 hours.
- Transfection: HEK293T cells were seeded into 96-well plates (Corning) 12-24 hours prior to transfection at a density of 30,000 cells/well, and 250 ng of total DNA was transfected per well.
- HeLa and HepG2 cells were seeded into 48-well plates (Corning) one day prior to transfection at a density of 50,000 and 30,000 cells/well respectively, and 400 ng of total DNA was transfected per well. Transfections were performed with Lipofectamine 3000 (Life Technologies) following the manufacturer’s instructions.
- Fluorescence-activated cell sorting (FACS): mKate knock-in efficiency was analyzed on a CytoFLEX flow cytometer (Beckman Coulter; Stanford Stem Cell FACS Core). 72 hours after transfection, cells were washed once with PBS and dissociated with TrypLE Express Enzyme (Thermo Fisher Scientific). Cell suspension was then transferred to a 96-well U-bottom plate (Thermo Fisher Scientific) and centrifuged at 300 × g for 5 minutes. After removing the supernatant, pelleted cells were resuspended with 50 μl of 4% FBS in PBS, and cells were sorted within 30 minutes of preparation.
- RFLP: HEK293T cells were transfected with plasmid DNA and PCR templates and harvested after 72 hours for genomic DNA using the QuickExtract DNA Extraction Solution (Biosearch Technologies) following the manufacturer’s protocol.
- the target genomic region was amplified using specific primers outside of the homology arms of the PCR template.
- PCR products were purified with Monarch PCR & DNA Cleanup Kit (New England BioLabs). 300 ng of purified product was digested with BsrGI (EMX1, New England BioLabs) or Xbal (VEGFA, NEB), and the digested products were analyzed on a 5% Mini-PROTEAN TBE gel (Bio-Rad).
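- The source does not state how the RFLP gel was quantified. A common convention, assumed here, is to report the cleaved fraction of total band intensity as the percentage of edited alleles. The function name and the band intensities below are hypothetical.

```python
from typing import Sequence


def rflp_edited_percent(uncut: float, cut_bands: Sequence[float]) -> float:
    """Estimate edited alleles as cleaved intensity over total intensity.

    Assumption: the restriction site is introduced only by the knock-in,
    so cleaved bands come from edited alleles and the uncut band from
    unedited alleles.
    """
    cut_total = sum(cut_bands)
    total = uncut + cut_total
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return 100.0 * cut_total / total


# Hypothetical densitometry values (arbitrary units).
print(rflp_edited_percent(uncut=900.0, cut_bands=[60.0, 40.0]))  # 10.0
```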
- iGUIDE Off-target Analysis. Genome-wide, unbiased off-target analysis was performed following the iGUIDE pipeline (Nobles, C.L., et al. Genome Biol 20, 14 (2019), incorporated herein by reference), which is based on the previously described GUIDE-seq method (Tsai, S., et al. Nat Biotechnol 33, 187-197 (2015), incorporated herein by reference).
- HEK293T cells were transfected in 20 µL Lonza SF Cell Line Nucleofector Solution on a Lonza Nucleofector 4-D with program DS-150 according to the manufacturer’s instructions.
- gRNA-Cas9 plasmids or 150 ng of each gRNA-Cas9n plasmid for the double nickase
- 150 ng of the effector plasmids and 5 pmol of double-stranded oligonucleotides (dsODN) were transfected.
- Cells were harvested after 72 hours for genomic DNA using the Agencourt DNAdvance reagent kit. 400 ng of purified gDNA was then fragmented to an average of 500 bp and ligated with adaptors using the NEBNext Ultra II FS DNA Library Prep kit following the manufacturer’s instructions.
- Recombineering-edit tools are available for bacteria, e.g., the phage lambda Red and RecE/T systems.
- Microbial recombineering has two major steps: template DNA is chewed back by exonucleases (Exo), then the single-strand annealing protein (SSAP) supports homology directed repair by the template, optionally facilitated by nuclease inhibitor.
- A system for RNA-guided targeting of RecE/T recombineering activities was developed and achieved kilobase (kb) human gene-editing without DNA cutting.
- The NCBI protein database was systematically searched for RecE/T homologs.
- To develop a portable tool, evolutionary relationships and lengths were examined (FIG. 2A). Co-occurrence analysis revealed that most RecE/T systems have only one of the two proteins (FIG. 2B). As prophage integration could be imprecise, the 11% of species harboring both homologs were prioritized as evidence for intact functionality.
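- The co-occurrence tally described above can be sketched as follows. This is an illustration under assumptions, not the search pipeline used in the source: the input mapping, the species names, and the function name `cooccurrence_percentages` are hypothetical.

```python
from collections import Counter
from typing import Dict, Set


def cooccurrence_percentages(hits: Dict[str, Set[str]]) -> Dict[str, float]:
    """Percentage of species carrying both homologs, RecE only, or RecT only.

    `hits` maps a species name to the set of homolog families found in it.
    Species with neither homolog are ignored.
    """
    counts: Counter = Counter()
    for families in hits.values():
        has_e = "RecE" in families
        has_t = "RecT" in families
        if has_e and has_t:
            counts["both"] += 1
        elif has_e:
            counts["RecE only"] += 1
        elif has_t:
            counts["RecT only"] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: 100.0 * counts[key] / total
            for key in ("both", "RecE only", "RecT only")}


# Hypothetical search results for four species.
example_hits = {
    "species_A": {"RecE", "RecT"},
    "species_B": {"RecT"},
    "species_C": {"RecT"},
    "species_D": {"RecE"},
}
print(cooccurrence_percentages(example_hits))
```

Species in the "both" category would then be the ones prioritized for candidate selection.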
- The top 12 candidates were codon-optimized, and MS2 coat protein (MCP) fusions were constructed to recruit these RecE/T homologs, hereafter termed “recombinator”, to wild-type Streptococcus pyogenes Cas9 (wtCas9) via MS2 RNA aptamers.
- RecT is only 269 amino acids (AA) long
- RecE was truncated from AA587 (RecE_587) and to the carboxy-terminal domain (RecE CTD) based on functional studies (Muyrers, J.P., Genes Dev. (2000); 14, 1971-1982, incorporated herein by reference).
- HDR homology directed repair
- RecE had activities without recruitment, whereas RecT showed efficiency increases in a recruitment-dependent manner (FIG. 3H). Without being bound by theory, this may be explained by RecE exonuclease activity acting promiscuously (FIG. 2C).
- The RecE/T recombineering-edit (REDIT) tools were termed REDITv1, with REDITv1 RecT as the preferred variant.
- REDITv1 activity was robust across multiple genomic sites in HEK, A549, HepG2, and HeLa cells (FIGS. 5A-C, FIGS. 6A-C). Notably, in human embryonic stem cells (hESCs), REDITv1 exhibited consistent increases of kilobase knock-in efficiency at HSP90AA1 and OCT4, with up to 3.5-fold improvement relative to Cas9-HDR (FIGS. 5D-E, FIGS. 6D-E). Different template designs were also tested.
- REDITv1 performed efficient kilobase editing using homology arm (HA) lengths as short as 200 bp total, with longer HAs supporting higher efficiency. It achieved up to 10% efficiency (without selection) for kb-scale knock-in, a 5-fold increase over Cas9-HDR and significantly higher than the typical 1-2% efficiency (FIG. 7). Lastly, the accuracy of REDITv1 was determined using deep sequencing of predicted off-target sites (OTSs) and GUIDE-seq. Although REDITv1 did not increase off-target effects, detectable OTSs remained at previously reported sites for EMX1 and VEGFA (FIGS. 5F-G, FIG. 8). In short, REDITv1 showcased kilobase-scale genome recombineering but retained the off-target issues, with REDITv1 RecT having the highest efficiency.
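- As a quick consistency check on the figures quoted above, a 10% knock-in efficiency that is 5-fold over Cas9-HDR implies a Cas9-HDR baseline of about 2%, which matches the typical range cited. The helper names below are hypothetical and the sketch only restates this arithmetic.

```python
def fold_change(treated_percent: float, control_percent: float) -> float:
    """Fold increase of a treated efficiency over a control efficiency."""
    if control_percent <= 0:
        raise ValueError("control efficiency must be positive")
    return treated_percent / control_percent


def implied_control(treated_percent: float, fold: float) -> float:
    """Control efficiency implied by a treated efficiency and a fold change."""
    if fold <= 0:
        raise ValueError("fold change must be positive")
    return treated_percent / fold


baseline = implied_control(10.0, 5.0)
print(baseline)                     # 2.0
print(fold_change(10.0, baseline))  # 5.0
```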
- Concepts from GUIDE-seq, LAM-PCR, and TLA were used to develop an NGS-based assay to identify genome-wide insertion sites (GIS), or GIS-seq (FIG. 30A).
- GIS-seq NGS read clusters/peaks representing knock-in insertion sites were obtained (FIG. 30B, showing representative reads from the on-target site).
- GIS-seq was applied to the DYNLT1 and ACTB loci to measure the knock-in accuracy. Sequencing results indicated that, when considering sites with high confidence based on maximum likelihood estimation, REDIT had fewer off-target insertion sites identified compared with Cas9 (FIG. 30C).
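- The idea of grouping reads into insertion-site clusters can be illustrated with a simple distance-based grouping. This is a sketch under assumptions and not the GIS-seq analysis itself: it omits the maximum likelihood estimation mentioned above, and the window size, read threshold, coordinates, and function name are hypothetical.

```python
from typing import Iterable, List, Tuple

Read = Tuple[str, int]               # (chromosome, mapped junction position)
Cluster = Tuple[str, int, int, int]  # (chromosome, start, end, read count)


def cluster_insertion_sites(reads: Iterable[Read],
                            window: int = 50,
                            min_reads: int = 3) -> List[Cluster]:
    """Group reads whose sorted positions lie within `window` bp of each other.

    Clusters supported by fewer than `min_reads` reads are discarded.
    """
    clusters: List[Cluster] = []
    current: List[Read] = []

    def flush() -> None:
        # Record the current group if it has enough supporting reads.
        if len(current) >= min_reads:
            clusters.append((current[0][0], current[0][1],
                             current[-1][1], len(current)))

    for chrom, pos in sorted(reads):
        if current and (chrom != current[-1][0]
                        or pos - current[-1][1] > window):
            flush()
            current = []
        current.append((chrom, pos))
    flush()
    return clusters


# Hypothetical junction reads: one well-supported site and scattered noise.
example_reads = [("chr6", 1000), ("chr6", 1010), ("chr6", 1025),
                 ("chr7", 500), ("chr6", 90000), ("chr6", 90020)]
print(cluster_insertion_sites(example_reads))
```

Comparing the number of retained clusters outside the target locus between conditions gives a rough analogue of the off-target insertion comparison described above.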
- REDIT was examined for long sequence editing ability in the absence of any nicking/cutting of the target DNA.
- dCas9 catalytically dead Cas9
- An exact genomic knock-in of a kilobase cassette was observed in human cells (FIG. 9D, top; FIG. 13).
- Although REDITv2D has lower efficiency than REDITv2N, it achieved programmable DNA-damage-free editing at kilobase scale with 1-2% efficiency and no selection (FIG. 9D, FIG. 10B). It was hypothesized that two processes could be contributing to the REDITv2D recombineering. One possibility was via dCas9 unwinding.
- Microscopy analysis revealed incomplete nuclear targeting of REDITv1, particularly REDITv1 RecT (FIG. 15). Hence, different designs of protein linkers and nuclear localization signals (NLSs) were tested (FIG. 15A). The extended XTEN-linker with C-terminal SV40-NLS was identified as a preferred configuration, termed REDITv3 (FIG. 16). REDITv3 further achieved a 2- to 3-fold increase of HDR efficiencies over REDITv2 across genome targets and Cas9 variants (wtCas9, Cas9n, dCas9) (FIG. 17).
- REDITv3 was utilized in hESCs to engineer kilobase knock-in alleles in human stem cells.
- REDITv3N single- and double-nicking designs resulted in 5-fold and 20-fold increased HDR efficiencies over no-recombinator controls, respectively (FIG. 9F).
- The efficacy and fidelity were confirmed via a combination of assays described for previous REDIT versions (FIGS. 9F-G, FIG. 18).
- REDITv3 works effectively with Staphylococcus aureus Cas9 (SaCas9), a compact CRISPR system suitable for in vivo delivery (FIG. 19).
- Both RecT and RecE_587 were truncated at various lengths as shown in FIG. 20A and FIG. 21A, respectively.
- The resulting efficiencies were measured using an mKate knock-in assay, with both wild-type SpCas9 and Cas9n(D10A) with single- and double-nicking at the DYNLT1 locus (FIGS. 20B-C and FIGS. 21B-C, respectively). Efficiencies of the no-recombination group are shown as the control.
- The truncated versions of both RecT and RecE_587 retained significant recombineering activity when used with different Cas9s.
- The new truncated versions such as RecT(93-264aa) are over 30% smaller, yet they preserved essentially the full activity of RecT in stimulating recombination in eukaryotic cells.
- Truncated versions such as RecE_587(120-221aa) and RecE_587(120-209aa) are over 60% smaller but still retained high recombination activities in human cells.
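- The size reductions quoted for the truncations can be checked with simple arithmetic. The sketch below assumes parent lengths that are not stated in this section: 269 AA for E. coli RecT, and 280 AA for RecE_587 (residues 587 to 866 of an 866-AA E. coli RecE). It also assumes that the truncation coordinates are inclusive and, for RecE_587, numbered relative to the RecE_587 fragment.

```python
# Assumed parent lengths in amino acids (not taken from this section).
PARENT_LENGTH_AA = {
    "RecT": 269,
    "RecE_587": 866 - 587 + 1,  # 280 AA, assuming an 866-AA full-length RecE
}


def percent_smaller(start: int, end: int, parent_length: int) -> float:
    """Percent size reduction of an inclusive residue span vs. its parent."""
    span = end - start + 1
    if not 0 < span <= parent_length:
        raise ValueError("span must fit within the parent protein")
    return 100.0 * (1.0 - span / parent_length)


truncations = [
    ("RecT", 93, 264),
    ("RecE_587", 120, 221),
    ("RecE_587", 120, 209),
]
for parent, start, end in truncations:
    reduction = percent_smaller(start, end, PARENT_LENGTH_AA[parent])
    print(f"{parent}({start}-{end}aa): {reduction:.1f}% smaller")
```

Under these assumptions the RecT truncation comes out a little over 36% smaller and the two RecE_587 truncations roughly 64% and 68% smaller, in line with the "over 30%" and "over 60%" statements.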
- REDIT harnessed the specificity of CRISPR genome-targeting with the efficiency of RecE/RecT recombineering.
- The disclosed high-efficiency, low-error system makes a powerful addition to existing CRISPR toolkits.
- The balanced efficiency and accuracy of REDITv3N make it an attractive therapeutic option for knock-in of large cassettes in immune and stem cells.
- Three exonuclease proteins were used: the exonuclease from phage Lambda, the RecE587 core domain of E. coli RecE protein, and the exonuclease (gene name gp6) from phage T7 (FIG. 22A).
- The gene-editing activity was measured using the mKate knock-in assay at genomic loci (DYNLT1 and HSP90AA1).
- Exonucleases showed ~3-fold higher recombination efficiency (up to 4% mKate genome knock-in) when compared with no-recombinator controls.
- The single-strand annealing proteins (SSAPs) showed higher activities, with 4-fold to 8-fold higher gene-editing activities over the control groups. This demonstrated that microbial recombination proteins in the exonuclease and SSAP families can generally be engineered via the Cas9-based fusion protein system to achieve highly efficient genome recombination in mammalian cells.
- A REDIT system using SunTag recruitment was developed (FIGS. 24A and 27A). Because SunTag is based on a fusion protein design, the sgRNAs or guide RNAs are the same as in the wild-type CRISPR system. Specifically, the REDIT recombinator proteins were fused to the scFv antibody peptide (replacing MCP), and the GCN4 peptide was fused in tandem fashion (10 copies of GCN4 peptide separated by linkers) to the Cas9 protein. Thus, the scFv-REDIT could be recruited to the Cas9 complex via GCN4’s affinity to scFv.
- mKate knock-in experiments were used to measure the editing efficiencies at the DYNLT1 locus and the HSP90AA1 locus, respectively.
- This SunTag-based REDIT system demonstrated significant increase of gene-editing knock-in efficiency at the DYNLT1 genomic sites tested.
- The SunTag design significantly increased HDR efficiencies to ~2-fold better than Cas9 but did not achieve increases as high as the MS2-aptamer design.
- Fifteen different species of microbes having RecE/RecT proteins were selected for a screen of various RecE and RecT proteins across the microbial kingdom (Table 5). Each protein was codon-optimized and synthesized. As previously described for E. coli RecE/RecT-based REDIT systems, each protein was fused via the E-XTEN linker to the MCP protein with an additional nuclear localization signal. The mKate knock-in gene-editing assay was used to measure efficiencies at the DYNLT1 locus (FIG. 26A, Table 6) and the HSP90AA1 locus (FIG. 26B, Table 6). The homologs demonstrated the ability to enable and enhance precision gene-editing.
- RecT-based REDIT design was combined with three different approaches (conveniently through the MS2-aptamer) (FIG. 28A, right).
- The RecT-based REDIT design could indeed further enhance the HDR-promoting activities of the tested tools (FIG. 28C).
- The knock-in cells were clonally isolated, and the target genomic region was amplified using primers binding completely outside of the donor DNAs for colony Sanger sequencing (FIG. 29B).
- Junction sequencing analysis (~48 colonies per gene per condition) revealed varying degrees of indels at the 5’ and 3’ knock-in junctions, including at single or both junctions (FIG. 29C).
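- A tally of junction outcomes per condition, of the kind summarized above, could be organized as in the sketch below. The colony records are hypothetical and do not reproduce any data from the source; the category labels and function names are assumptions for illustration.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Colony(NamedTuple):
    """Sanger result for one clonal colony (hypothetical record layout)."""
    indel_5p: bool  # indel detected at the 5' knock-in junction
    indel_3p: bool  # indel detected at the 3' knock-in junction


def classify(colony: Colony) -> str:
    """Assign a colony to one of four junction-outcome categories."""
    if colony.indel_5p and colony.indel_3p:
        return "both junctions"
    if colony.indel_5p:
        return "5' junction only"
    if colony.indel_3p:
        return "3' junction only"
    return "precise"


def tally(colonies: Iterable[Colony]) -> Counter:
    """Count colonies per junction-outcome category."""
    return Counter(classify(colony) for colony in colonies)


# Hypothetical set of nine colonies for one gene and one condition.
example_colonies = ([Colony(False, False)] * 5
                    + [Colony(True, False)] * 2
                    + [Colony(False, True)]
                    + [Colony(True, True)])
print(tally(example_colonies))
```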
- HDR donors had better precision than MMEJ donors, and REDIT modestly improved the knock-in yield compared with Cas9, though junction indels were still observed.
- The sensitivity of REDIT’s ability to promote HDR was examined in the presence or absence of two distinct pharmacological inhibitors of RAD51, B02 and RI-1 (FIG. 31A).
- RAD51 inhibition significantly lowered HDR efficiencies (FIGS. 31B, 31C, and 32A).
- RAD51 inhibition decreased REDIT and REDITdn efficiencies only moderately, as both REDIT/REDITdn methods maintained significantly higher knock-in efficiencies compared with Cas9/Cas9dn under RAD51 inhibition.
- Mirin, a potent chemical inhibitor of DSB repair that has also been shown to prevent MRN complex formation and MRN-dependent ATM activation and to inhibit Mre11 exonuclease activity, was also used.
- Only the editing efficiencies of the Cas9 reference experiments were affected by the mirin treatment, whereas the REDIT versions were essentially the same as vehicle-treated groups across all genomic targets (FIG. 32A).
- REDIT was applied in human embryonic stem cells (hESCs) to test its ability to engineer long sequences in non-transformed human cells.
- Robust stimulation of HDR was observed across all three genomic sites (HSP90AA1, ACTB, OCT4/POU5F1) using REDIT and REDITdn (FIGS. 31D and 31E).
- REDIT and REDITdn editing used donor DNAs with 200-bp HAs on each side and achieved up to over 5% efficiency for kb-scale gene-editing without selection compared with ~1% efficiency using non-REDIT methods.
- REDIT improved knock-in efficiencies in A549 (lung-derived), HepG2 (liver-derived), and HeLa (cervix-derived) cells, demonstrating up to ~15% kb-scale genomic knock-in without selection. This improvement was up to 4-fold higher than the Cas9 groups, supporting the potential of using REDIT methods in different cell types.
- A gene editing vector (60 µg) and template DNA (60 µg) were injected via hydrodynamic tail vein injection to deliver the components to the mouse (FIG. 33A). Successful gene editing of liver hepatocytes was monitored by transgene-encoded protein expression from the albumin locus.
- A schematic of the experimental procedure is shown in FIG. 33B.
- The perfused mouse livers were dissected.
- The lobes of the liver were homogenized and processed to extract liver genomic DNA from the primary hepatocytes.
- The extracted genomic DNA was used for three different downstream analyses: 1) PCR using knock-in-specific primers and agarose gel electrophoresis (FIG. 34A); 2) Sanger sequencing of the knock-in PCR product (FIG. 34B); 3) high-throughput deep sequencing of the knock-in junction to confirm and quantify the accuracy of gene-editing using SAFE-dCas9 in vivo (FIG. 34C).
- Each downstream analysis confirmed knock-in success with .
- LTC mice include three genome alleles: 1) the Lkb1 (flox/flox) allele allows Lkb1-KO when expressing Cre; 2) the R26(LSL-TdTom) allele allows detection of AAV-transduced cells via TdTom red fluorescent protein; and 3) the H11(LSL-Cas9) allele allows expression of Cas9 in AAV-transduced cells.
- Schematics of the REDIT gene editing vector and Cas9 control vectors are shown in FIG. 35A.
- Successful gene editing using the gene editing vector leads to Kras alleles that drive tumor growth in the lung of the treated mice.
- Pantoea brenneri RecE amino acid sequence (SEQ ID NO:4):
- Type-F symbiont of Plautia stali RecE amino acid sequence (SEQ ID NO:5):
- Pantoea brenneri RecT amino acid sequence (SEQ ID NO:10):
- Type-F symbiont of Plautia stali RecT amino acid sequence (SEQ ID NO:11):
- PAAKRVKLD biSV40 NLS amino acid sequence (SEQ ID NO: 19):
- Template DNA sequences (underlining marks the replaced or inserted editing sequences)
- VEGFA HDR template sequence (SEQ ID NO:80):
- HSP90AA1 HDR template sequence (SEQ ID NO:82):
- OCT4 HDR template sequence (SEQ ID NO:84):
- Pantoea stewartii RecT DNA SEQ ID NO:85:
- Pantoea stewartii RecE DNA SEQ ID NO:86:
- Pantoea brenneri RecT DNA (SEQ ID NO: 87):
- Pantoea brenneri RecE DNA (SEQ ID NO: 88):
- Pantoea dispersa RecE DNA SEQ ID NO:90:
- Type-F symbiont of Plautia stali RecE DNA (SEQ ID NO:92):
- Salmonella enterica RecT DNA SEQ ID NO: 1023
- Salmonella enterica RecE DNA SEQ ID NO: 1044:
- Acetobacter RecT DNA SEQ ID NO: 1057
- Acetobacter RecE DNA SEQ ID NO: 1036:
- Salmonella enterica subsp. enterica serovar Javiana str. 10721 RecT DNA (SEQ ID NO: 107): CCAAAGCAGCCCCCTATCGCCAAGGCAGACCTGCAGAAAACCCAGGGAGCACGGAC CCCAACAGCAGTGAAGAACAATAACGATGTGATCTCCTTTATCAATCAGCCTTCTAT GAAGGAGCAGCTGGCCGCCGCCCTGCCAAGGCACATGACCGCCGAGCGGATGATCA GAATCGCCACCACAGAGATCAGGAAGGTGCCCGCCCTGGGCGACTGCGATACAATG TCTTTTGTGAGCGCCATCGTGCAGTGTAGCCAGCTGGGCCTGGAGCCTGGCGGCGCC CTGGGCCACGCCTACCTGCTTTCGGCAATCGGAACGAGAAGTCCGGCAAGAA GAATGTGCAGCTGATCATCGGCTATAGAGGCATGATCGACCTGGCCCGGAGATCCG GACAGATCGCCAGCCTGTCCGCCAGGGTGGTGCGCGAGGGCGACGATTTCTCTT
- Salmonella enterica subsp. enterica serovar Javiana str. 10721 RecE DNA (SEQ ID NO: 108): TACTATGACATCCCAAACGAGGCCTACCACGCAGGCCCCGGCGTGTCTAAGAGCCA GCTGGACGACATCGCCGATACCCCCGCCATCTATCTGTGGCGGAAGAATGCCCCTGT GGACACCGAGAAAACCAAGTCCCTGGATACCGGCACAGCCTTCCACTGCAGGGTGC TGGAGCCAGAGGAGTTCAGCAAGCGGTTCATCATCGCCCCCGAGTTCAACCGGAGA ACCTCCGCCGGCAAGGAGGAGGAAAACCTTCCTGGAGGAGTGTACCCGGACAG GCAGAACCGTGCTGACAGCCGAGGAGGGCAGGAAGATCGAGCTGATGTACCAGTC CGTGATGGCACTGCCACTGGGACAGTGGCTGGTGGAGTCTGCCGGCTACGCCGAGA GCTCCGTGTATTGGGAGGACCCTGAGACAGGCATCCTGCCGGTGTAGACCCGAT
- Pseudobacteriovorax antillogorgiicola RecT DNA SEQ ID NO: 109: GGCCACCTGGTGAGCAAGACCGAGCAGGATTACATCAAGCAGCACTATGCCAAGGG
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Genetics & Genomics (AREA)
- Engineering & Computer Science (AREA)
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Wood Science & Technology (AREA)
- Zoology (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Biotechnology (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Biochemistry (AREA)
- Microbiology (AREA)
- Biophysics (AREA)
- Physics & Mathematics (AREA)
- Plant Pathology (AREA)
- Medicinal Chemistry (AREA)
- Mycology (AREA)
- Cell Biology (AREA)
- Animal Behavior & Ethology (AREA)
- Epidemiology (AREA)
- Pharmacology & Pharmacy (AREA)
- Public Health (AREA)
- Veterinary Medicine (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Gastroenterology & Hepatology (AREA)
- Crystallography & Structural Chemistry (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
- Pharmaceuticals Containing Other Organic And Inorganic Compounds (AREA)
- Peptides Or Proteins (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Priority Applications (8)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| AU2023217087A AU2023217087A1 (en) | 2022-02-10 | 2023-02-10 | Rna-guided genome recombineering at kilobase scale |
| IL314544A IL314544A (en) | 2022-02-10 | 2023-02-10 | Ribonucleic acid-guided genome recombination engineering at the scale of thousands of bases |
| EP23753709.7A EP4476333A4 (en) | 2022-02-10 | 2023-02-10 | KILOBASE-SIZED RNA-GUIDED GENOME RECOMBINATION |
| JP2024547600A JP2025505732A (ja) | 2022-02-10 | 2023-02-10 | キロ塩基スケールでのrnaガイドゲノムリコンビニアリング |
| CN202380033469.2A CN119855904A (zh) | 2022-02-10 | 2023-02-10 | Rna指导的千碱基规模基因组重组工程 |
| US18/832,052 US20250354164A1 (en) | 2022-02-10 | 2023-02-10 | Rna-guided genome recombineering at kilobase scale |
| KR1020247030192A KR20240139088A (ko) | 2022-02-10 | 2023-02-10 | 킬로베이스 규모의 rna 가이드 게놈 리콤비니어링 |
| CA3249564A CA3249564A1 (en) | 2022-02-10 | 2023-02-10 | KILOBASE-SIZED RNA-GUIDED GENOME RECOMBINATION |
Applications Claiming Priority (8)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202263308834P | 2022-02-10 | 2022-02-10 | |
| US202263308837P | 2022-02-10 | 2022-02-10 | |
| US202263308830P | 2022-02-10 | 2022-02-10 | |
| US63/308,837 | 2022-02-10 | ||
| US63/308,834 | 2022-02-10 | ||
| US63/308,830 | 2022-02-10 | ||
| USPCT/US2022/075850 | 2022-09-01 | ||
| PCT/US2022/075850 WO2023034925A1 (en) | 2021-09-01 | 2022-09-01 | Rna-guided genome recombineering at kilobase scale |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| WO2023154877A2 (en) | 2023-08-17 |
| WO2023154877A3 (en) | 2023-09-14 |
Family
ID=87565146
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2023/062406 Ceased WO2023154877A2 (en) | 2022-02-10 | 2023-02-10 | Rna-guided genome recombineering at kilobase scale |
Country Status (9)
| Country | Link |
|---|---|
| US (1) | US20250354164A1 |
| EP (1) | EP4476333A4 |
| JP (1) | JP2025505732A |
| KR (1) | KR20240139088A |
| CN (1) | CN119855904A |
| AU (1) | AU2023217087A1 |
| CA (1) | CA3249564A1 |
| IL (1) | IL314544A |
| WO (1) | WO2023154877A2 |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN117247858A (zh) * | 2023-07-06 | 2023-12-19 | 云南农业大学 | 一种消减烟草连作障碍并提质增产的菌剂及其制法和应用 |
| WO2024168265A1 (en) * | 2023-02-10 | 2024-08-15 | Possible Medicines Llc | Aav delivery of rna guided recombination system |
| EP4396340A4 (en) * | 2021-09-01 | 2025-09-24 | Univ Leland Stanford Junior | RNA-GUIDED GENOME RECOMBINATION AT THE KILOBASE SCALE |
| WO2026010985A1 (en) * | 2024-07-02 | 2026-01-08 | The General Hospital Corporation | Regeneration of biological tissues |
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2017106657A1 (en) * | 2015-12-18 | 2017-06-22 | The Broad Institute Inc. | Novel crispr enzymes and systems |
| WO2018081535A2 (en) * | 2016-10-28 | 2018-05-03 | Massachusetts Institute Of Technology | Dynamic genome engineering |
| EP3728604A2 (en) * | 2017-12-22 | 2020-10-28 | KWS SAAT SE & Co. KGaA | Targeted transcriptional regulation using synthetic transcription factors |
| IL296057A (en) * | 2020-03-03 | 2022-10-01 | Univ Leland Stanford Junior | Ribonucleic acid-guided genome recombination engineering at the scale of thousands of bases |
| WO2023154892A1 (en) * | 2022-02-10 | 2023-08-17 | Possible Medicines Llc | Rna-guided genome recombineering at kilobase scale |
-
2023
- 2023-02-10 JP JP2024547600A patent/JP2025505732A/ja active Pending
- 2023-02-10 CA CA3249564A patent/CA3249564A1/en active Pending
- 2023-02-10 WO PCT/US2023/062406 patent/WO2023154877A2/en not_active Ceased
- 2023-02-10 EP EP23753709.7A patent/EP4476333A4/en active Pending
- 2023-02-10 AU AU2023217087A patent/AU2023217087A1/en active Pending
- 2023-02-10 IL IL314544A patent/IL314544A/en unknown
- 2023-02-10 US US18/832,052 patent/US20250354164A1/en active Pending
- 2023-02-10 CN CN202380033469.2A patent/CN119855904A/zh active Pending
- 2023-02-10 KR KR1020247030192A patent/KR20240139088A/ko active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| US20250354164A1 (en) | 2025-11-20 |
| CA3249564A1 (en) | 2023-08-17 |
| EP4476333A4 (en) | 2026-02-25 |
| CN119855904A (zh) | 2025-04-18 |
| WO2023154877A3 (en) | 2023-09-14 |
| IL314544A (en) | 2024-09-01 |
| AU2023217087A1 (en) | 2024-08-22 |
| KR20240139088A (ko) | 2024-09-20 |
| EP4476333A2 (en) | 2024-12-18 |
| JP2025505732A (ja) | 2025-02-28 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12168789B2 (en) | Engineering and optimization of systems, methods, enzymes and guide scaffolds of CAS9 orthologs and variants for sequence manipulation | |
| US11149259B2 (en) | CRISPR-Cas systems and methods for altering expression of gene products, structural information and inducible modular Cas enzymes | |
| US20210277371A1 (en) | Engineering of systems, methods and optimized guide compositions with new architectures for sequence manipulation | |
| US20250354164A1 (en) | Rna-guided genome recombineering at kilobase scale | |
| EP3237615B2 (en) | Crispr having or associated with destabilization domains | |
| US20250034594A1 (en) | Rna-guided genome recombineering at kilobase scale | |
| EP3230451B1 (en) | Protected guide rnas (pgrnas) | |
| CA3077086A1 (en) | Systems, methods, and compositions for targeted nucleic acid editing | |
| US20180057810A1 (en) | Functional screening with optimized functional crispr-cas systems | |
| US20260103690A1 (en) | Programmable dna transposases for nucleic acid manipulation | |
| WO2025038989A1 (en) | Rna-guided genome recombineering at kilobase scale | |
| WO2024168265A1 (en) | Aav delivery of rna guided recombination system | |
| WO2025160203A1 (en) | Engineered programmable dna transposases and engineered bridge rna systems for nucleic acid manipulation |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23753709 Country of ref document: EP Kind code of ref document: A2 |
| | WWE | Wipo information: entry into national phase | Ref document number: 314544 Country of ref document: IL |
| | WWE | Wipo information: entry into national phase | Ref document number: 2024547600 Country of ref document: JP |
| | ENP | Entry into the national phase | Ref document number: 2023217087 Country of ref document: AU Date of ref document: 20230210 Kind code of ref document: A |
| | ENP | Entry into the national phase | Ref document number: 20247030192 Country of ref document: KR Kind code of ref document: A |
| | WWE | Wipo information: entry into national phase | Ref document number: 1020247030192 Country of ref document: KR |
| | WWE | Wipo information: entry into national phase | Ref document number: 2023753709 Country of ref document: EP |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 2023753709 Country of ref document: EP Effective date: 20240910 |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23753709 Country of ref document: EP Kind code of ref document: A2 |
| | WWE | Wipo information: entry into national phase | Ref document number: 202380033469.2 Country of ref document: CN |
| | WWP | Wipo information: published in national office | Ref document number: 202380033469.2 Country of ref document: CN |
| | WWP | Wipo information: published in national office | Ref document number: 18832052 Country of ref document: US |