WO2024012435A1 - Gene editing systems and methods for treating hereditary angioedema - Google Patents

Gene editing systems and methods for treating hereditary angioedema

Info

Publication number
WO2024012435A1
Authority
WO
WIPO (PCT)
Prior art keywords
protein
cell
gene editing
rna
editing system
Application number
PCT/CN2023/106730
Other languages
French (fr)
Other versions
WO2024012435A9 (en)
Inventor
Lijie Wang
Liguo HUANG
Yichuan Wang
Peixue Li
Xiaodun MOU
Original Assignee
CorrectSequence Therapeutics Co., Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by CorrectSequence Therapeutics Co., Ltd
Publication of WO2024012435A1
Publication of WO2024012435A9

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C12N15/1137 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing against enzymes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y304/00 Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
    • C12Y304/21 Serine endopeptidases (3.4.21)
    • C12Y304/21034 Plasma kallikrein (3.4.21.34)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2320/00 Applications; Uses
    • C12N2320/50 Methods for regulating/modulating their activity

Definitions

  • the present disclosure generally relates to gene editing systems and methods for treating hereditary angioedema (HAE). Also disclosed are polynucleotides, vectors, cells, kits, and compositions comprising components of the gene editing systems, and methods related to treatment of HAE.
  • HAE hereditary angioedema
  • Hereditary angioedema is a rare inherited disease that can cause recurrent attacks of painful swelling in any part of the body. These painful swellings interfere with patients' daily activities. Swelling of the throat is potentially life-threatening because it carries a risk of asphyxiation. Most HAE cases are caused by C1-INH deficiency. C1-INH helps to regulate plasma kallikrein, so C1-INH deficiency and/or dysfunction can increase the activity of plasma kallikrein. Increased kallikrein activity leads to excess bradykinin, which triggers vascular leakage, causes blood vessels to release fluid, and results in localized swelling in an HAE attack.
  • C1-INH is encoded by the SERPING1 gene, and 450 known mutations in SERPING1 have been found to be associated with HAE.
  • given this number of distinct mutations, correction of the SERPING1 gene is not practical for the treatment of HAE. Therefore, there is a need to find a therapeutic target other than the SERPING1 gene for the treatment of HAE.
  • the present disclosure provides gene editing systems, polynucleotides, vectors, cells, compositions, kits, and methods to disrupt the KLKB1 gene, which encodes the precursor of plasma kallikrein.
  • Plasma kallikrein can cleave Lys-Arg and Arg-Ser bonds in human kininogen to release bradykinin.
  • suppressing the expression of plasma kallikrein reduces bradykinin production.
  • the disruption of the KLKB1 gene prevents the release of bradykinin. In some embodiments, the disruption of the KLKB1 gene treats hereditary angioedema (HAE). In some embodiments, the disruption of the KLKB1 gene treats HAE caused by C1 inhibitor (C1-INH) deficiency and/or dysfunction.
  • C1-INH C1 inhibitor
  • the present disclosure provides a gene editing system for disrupting the KLKB1 gene, wherein the gene editing system comprises a base editor and at least one guide RNA that is capable of binding to the KLKB1 gene.
  • a highly specific base editor, the transformer base editor (tBE), is used to induce efficient and precise gene editing at genomic sites for disrupting the KLKB1 gene.
  • a tBE is used with a combination of a main guide RNA (mgRNA) and a helper guide RNA (hgRNA), wherein the mgRNA and hgRNA are capable of binding to the KLKB1 gene.
  • mgRNA main guide RNA
  • hgRNA helper guide RNA
  • the present disclosure provides a gene editing system comprising a main guide RNA (mgRNA) and a helper guide RNA (hgRNA), or at least one DNA polynucleotide encoding the mgRNA and/or the hgRNA, wherein the mgRNA comprises an mgRNA spacer and the hgRNA comprises an hgRNA spacer, wherein the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise the respective sequences as set forth in Table 3.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 81, respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 82, respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 83, respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 02 and SEQ ID NO: 84, respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 334, respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 335, respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 336, respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 02 and SEQ ID NO: 337, respectively.
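The eight spacer pairings listed above can be kept as a small lookup structure. The following is a minimal sketch only: the class and function names are hypothetical, and only the SEQ ID NO numbers come from the text (the sequences themselves are in Table 3, which is not reproduced here).

```python
from typing import List, NamedTuple


class SpacerPair(NamedTuple):
    """One mgRNA/hgRNA spacer pairing, identified by SEQ ID NO (hypothetical type)."""
    mgrna_seq_id: int  # SEQ ID NO of the mgRNA spacer
    hgrna_seq_id: int  # SEQ ID NO of the hgRNA spacer


# Pairings exactly as enumerated in the text above.
SPACER_PAIRS = [
    SpacerPair(1, 81), SpacerPair(1, 82), SpacerPair(1, 83), SpacerPair(2, 84),
    SpacerPair(1, 334), SpacerPair(1, 335), SpacerPair(1, 336), SpacerPair(2, 337),
]


def hgrna_partners(mgrna_seq_id: int) -> List[int]:
    """Return the hgRNA spacer SEQ ID NOs listed alongside a given mgRNA spacer."""
    return [p.hgrna_seq_id for p in SPACER_PAIRS if p.mgrna_seq_id == mgrna_seq_id]
```

For example, `hgrna_partners(2)` collects the hgRNA spacers paired with SEQ ID NO: 02.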
  • the gene editing system disclosed herein comprises (1) an hgRNA comprising a first CRISPR motif, an hgRNA spacer, and a first protein-binding motif, or a DNA polynucleotide encoding the hgRNA, (2) an mgRNA comprising a second CRISPR motif and an mgRNA spacer, or a DNA polynucleotide encoding the mgRNA, (3) a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first CRISPR motif, (4) a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second CRISPR motif, and (5) a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide
  • the gene editing system disclosed herein comprises (1) an hgRNA comprising a first CRISPR motif, an hgRNA spacer, and a first protein-binding motif, or a DNA polynucleotide encoding the hgRNA, (2) an mgRNA comprising a second CRISPR motif and the mgRNA spacer, or a DNA polynucleotide encoding the mgRNA, (3) a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first CRISPR motif, (4) a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second CRISPR motif, (5) a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide
  • the gene editing system disclosed herein comprises (1) an hgRNA comprising a first CRISPR motif, an hgRNA spacer, and a first protein-binding motif, or a DNA polynucleotide encoding the hgRNA, (2) an mgRNA comprising a second CRISPR motif and an mgRNA spacer, or a DNA polynucleotide encoding the mgRNA, (3) a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first CRISPR motif, (4) a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second CRISPR motif, (5) a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide
  • the protease is split into a first protease fragment and a second protease fragment, wherein the first or second protease fragment alone is not able to cleave the cleavage site.
  • the gene editing system disclosed herein comprises (1) an hgRNA comprising a first CRISPR motif, an hgRNA spacer, and a first protein-binding motif, or a DNA polynucleotide encoding the hgRNA, (2) an mgRNA comprising a second CRISPR motif and an mgRNA spacer, or a DNA polynucleotide encoding the mgRNA, (3) a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first CRISPR motif, (4) a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second CRISPR motif, (5) a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide
  • the gene editing system disclosed herein comprises (1) an hgRNA comprising a first CRISPR motif, an hgRNA spacer, and a first protein-binding motif, or a DNA polynucleotide encoding the hgRNA, (2) the mgRNA comprising a second CRISPR motif and an mgRNA spacer, or a DNA polynucleotide encoding the mgRNA, (3) a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first CRISPR motif, (4) a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second CRISPR motif, (5) a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide
  • the protease is a TEV protease, a TuMV protease, a PPV protease, a PVY protease, a ZIKV protease, or a WNV protease.
  • the protease is a TEV protease.
  • the TEV protease comprises a sequence as set forth in SEQ ID NO: 590.
  • the first TEV protease fragment comprises a sequence of SEQ ID NO: 591.
  • the nucleobase deaminase inhibitor is an inhibitory domain of a nucleobase deaminase.
  • the nucleobase deaminase inhibitor is an inhibitory domain of a cytidine deaminase.
  • the inhibitory domain of a cytidine deaminase comprises an amino acid sequence as set forth in SEQ ID NO: 607 or SEQ ID NO: 608.
  • the nucleobase deaminase is a cytidine deaminase.
  • the cytidine deaminase is selected from the group consisting of APOBEC3B (A3B), APOBEC3C (A3C), APOBEC3D (A3D), APOBEC3F (A3F), APOBEC3G (A3G), APOBEC3H (A3H), APOBEC1 (A1), APOBEC3 (A3), APOBEC2 (A2), APOBEC4 (A4), and AICDA (AID).
  • the cytidine deaminase comprises an amino acid sequence of any one of SEQ ID NOs: 765-800.
  • the cytidine deaminase is a human or mouse cytidine deaminase.
  • the catalytic domain of the cytidine deaminase is a mouse A3 cytidine deaminase domain 1 (mA3-CDA1) or human A3B cytidine deaminase domain 2 (hA3B-CDA2).
  • the first fusion protein further comprises a uracil glycosylase inhibitor (UGI).
  • UGI uracil glycosylase inhibitor
  • the first fusion protein further comprises a nuclear localization sequence (NLS).
  • the Cas protein is Cas9, a dead Cas9 (dCas9), or a Cas9 nickase (nCas9) selected from the group consisting of SpCas9, FnCas9, St1Cas9, St3Cas9, NmCas9, SaCas9, AsCpf1, LbCpf1, FnCpf1, VQR SpCas9, EQR SpCas9, VRER SpCas9, SpCas9-NG, xSpCas9, RHA FnCas9, KKH SaCas9, NmeCas9, StCas9, CjCas9, AsCpf1, FnCpf1, SsCpf1, PcCpf1, BpCpf1, CmtCpf1, LiCpf1, PmCpf1, Pb3310Cpf1, Pb4417Cpf1, BsCpf1, EeCpf1, BhCasl
  • the Cas protein comprises an amino acid sequence of any one of SEQ ID NOs: 713-764.
  • the Cas protein is an nCas9. In some embodiments, the nCas9 protein is an nCas9-D10A protein. In some embodiments, the nCas9-D10A protein has an amino acid sequence of SEQ ID NO: 612.
  • the first protein-binding RNA motif and the first RNA binding domain, the second protein-binding RNA motif and the second RNA binding domain, and the third protein-binding RNA motif and the third RNA binding domain are each independently selected from the group consisting of a MS2 phage operator stem-loop and MS2 coat protein (MCP) or an RNA-binding section thereof; a BoxB and N22P or an RNA-binding section thereof; a telomerase Ku binding motif and Ku protein or an RNA-binding section thereof; a telomerase Sm7 binding motif and Sm7 protein or an RNA-binding section thereof; a PP7 phage operator stem-loop and PP7 coat protein (PCP) or an RNA-binding section thereof; a SfMu phage Com stem-loop and Com RNA binding protein or an RNA-binding section thereof; and a non-natural RNA aptamer and corresponding aptamer ligand or an RNA-
  • MCP MS2 coat protein
  • the mgRNA and/or the hgRNA comprises a dual-RNA structure.
  • the dual-RNA structure is formed by a CRISPR RNA (crRNA) and a trans-activating crRNA (tracrRNA), wherein the crRNA comprises the spacer.
  • crRNA CRISPR RNA
  • tracrRNA trans-activating crRNA
  • the mgRNA comprises a mcrRNA and a first tracrRNA
  • the mcrRNA comprises the mgRNA spacer
  • the hgRNA comprises a hcrRNA and a second tracrRNA
  • the hcrRNA comprises the hgRNA spacer
  • the first tracrRNA and the second tracrRNA are the same or different.
  • the mcrRNA and the hcrRNA are SEQ ID NO: 805 and SEQ ID NO: 808, respectively; or SEQ ID NO: 806 and SEQ ID NO: 809, respectively; or SEQ ID NO: 807 and SEQ ID NO: 810, respectively; or SEQ ID NO: 812 and SEQ ID NO: 815, respectively; or SEQ ID NO: 813 and SEQ ID NO: 816, respectively; or SEQ ID NO: 814 and SEQ ID NO: 817, respectively.
  • the tracrRNA is SEQ ID NO: 804 or 811.
  • the present disclosure provides a polynucleotide encoding the hgRNA and/or the mgRNA disclosed herein.
  • the present disclosure provides a polynucleotide encoding all components except the first and the second Cas protein in the gene editing system disclosed herein.
  • the present disclosure provides a kit comprising a polynucleotide encoding all components except the first and the second Cas protein in the gene editing system disclosed herein, and a polynucleotide encoding the first and/or second Cas protein in the gene editing system disclosed herein.
  • the first and the second Cas proteins are the same Cas protein.
  • the present disclosure provides a vector comprising the polynucleotide encoding the hgRNA and/or the mgRNA disclosed herein.
  • the present disclosure provides a vector comprising the polynucleotide encoding all components except the first and the second Cas protein in the gene editing system disclosed herein.
  • the vector is a plasmid or a viral vector.
  • the vector is a polycistronic vector.
  • the present disclosure provides a kit comprising the vector disclosed above, and a vector comprising the polynucleotide encoding the first and/or second Cas protein in the gene editing system disclosed herein.
  • the present disclosure provides a cell comprising the gene editing system disclosed herein.
  • the present disclosure provides a cell comprising the polynucleotide disclosed herein.
  • the cell further comprises a polynucleotide encoding the first and/or second Cas protein in the gene editing system disclosed herein.
  • the present disclosure provides a cell comprising the vector disclosed herein.
  • the cell further comprises a vector comprising a polynucleotide encoding the first and/or second Cas protein in the gene editing system disclosed herein.
  • the present disclosure provides a cell comprising the kit disclosed herein.
  • the cell is a stem cell.
  • the cell is a pluripotent stem cell.
  • the cell is an embryonic stem cell (ESC).
  • ESC embryonic stem cell
  • the cell is an induced pluripotent stem cell (iPSC).
  • iPSC induced pluripotent stem cell
  • the cell is an endothelial cell.
  • the cell is a primary cell.
  • the cell is a differentiated cell.
  • the present disclosure provides a composition comprising the gene editing system disclosed herein.
  • the present disclosure provides a composition comprising the cell disclosed herein.
  • the present disclosure provides a method for disrupting a KLKB1 gene in a cell, comprising introducing into the cell the gene editing system disclosed herein.
  • the present disclosure provides a method for decreasing the expression of plasma kallikrein in a cell, comprising introducing into the cell the gene editing system disclosed herein.
  • the present disclosure provides a method for decreasing the expression of bradykinin in a cell, comprising introducing into the cell the gene editing system disclosed herein.
  • the present disclosure provides a method for decreasing vascular leakage regulated by a kinin B2 receptor (B2R) in a cell, comprising introducing into the cell the gene editing system disclosed herein.
  • B2R kinin B2 receptor
  • the HAE is caused by C1-INH deficiency or dysfunction.
  • the HAE is HAE-1.
  • the HAE is HAE-2.
  • the cell is a stem cell.
  • the cell is a pluripotent stem cell.
  • the cell is an induced pluripotent stem cell (iPSC) or an embryonic stem cell.
  • the cell is an endothelial cell.
  • the cell is a primary cell or a differentiated cell.
  • Fig. 1 illustrates exemplary base editors that can be used in the gene editing systems disclosed herein.
  • the various versions of base editors are denoted as V1, V2, V3, V4, and V5, with constructs denoted as tBE-V1-rA1, tBE-V2-rA1, tBE-V3-rA1, tBE-V4-rA1, tBE-V5-rA1, and tBE-V5-mA3.
  • Fig. 1A shows schematic diagrams illustrating the construction and development of various versions of base editors.
  • Fig. 1B shows interactions of molecular components in different versions of the base editors.
  • Base editors of V2 to V5 illustrate different strategies to cleave mA3dCDI off.
  • the dCDI domain can be cleaved off from APOBEC through a two-component interaction of the TEV site and a free TEV protease (V2), an N22p-fused TEV protease (V3), or a TEV protease reconstituted by an mgRNA-boxB (V4).
  • V5 version 5 of the base editor
  • the dCDI is cleaved off from APOBEC through a three-component interaction of TEV site, TEVn, and N22p-TEVc.
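The three-component requirement described for V5 behaves like a logical AND: neither protease fragment alone can cleave the site. The sketch below is a deliberately simplified boolean model under that assumption; the class and field names are hypothetical, and real cleavage depends on concentrations, localization, and kinetics that are not modeled here.

```python
from dataclasses import dataclass


@dataclass
class V5Components:
    """Presence of each V5 component at the target site (hypothetical model)."""
    tev_site: bool    # TEV cleavage site linking dCDI to the deaminase
    tevn: bool        # first TEV protease fragment (TEVn)
    n22p_tevc: bool   # second TEV protease fragment fused to N22p (N22p-TEVc)


def dcdi_cleaved(c: V5Components) -> bool:
    """dCDI is released only when the site and both protease fragments are present."""
    return c.tev_site and c.tevn and c.n22p_tevc
```

With only one protease fragment present, `dcdi_cleaved` returns `False`, which mirrors the statement that a single fragment cannot cleave the site.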
  • Fig. 2 illustrates editing efficiencies induced by a tBE gene editing system with pairs of mgRNA-KLKB1-1~4 and corresponding hgRNAs targeting the human KLKB1 gene in human cells.
  • Fig. 2A is a schematic diagram illustrating co-transfection of mgRNA-KLKB1-1~4 and corresponding hgRNA-KLKB1-1~4 with tBE-V5-mA3 and nCas9.
  • Fig. 2B shows editing efficiency induced by tBE-V5-mA3 with indicated pairs of mgRNA/hgRNA at indicated sites.
  • Fig. 2C shows base substitution frequency at each target site calculated by EditR analysis.
  • Fig. 3 illustrates editing efficiencies induced by a tBE gene editing system with pairs of mgRNA-KLKB1-5~6 and corresponding hgRNAs targeting the human KLKB1 gene in human cells.
  • Fig. 3A is a schematic diagram illustrating co-transfection of mgRNA-KLKB1-5~6 and corresponding hgRNA-KLKB1-5~6 with tBE-V5-mA3 and nCas9.
  • Fig. 3B shows editing efficiency induced by tBE-V5-mA3 with indicated pairs of mgRNA/hgRNA at indicated sites.
  • Fig. 3C shows base substitution frequency at each target site calculated by EditR analysis.
  • Fig. 4 illustrates relative KLKB1 protein expression induced by a tBE gene editing system with pairs of mgRNA and hgRNA targeting human KLKB1 gene in human cells.
  • Fig. 4A shows KLKB1 protein expression level examined with western blot using GAPDH as the loading control.
  • Fig. 4B shows relative densitometry representation of protein expression level examined in Fig. 4A.
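A relative densitometry value of the kind plotted in Fig. 4B is commonly obtained by normalizing each band to its loading control and then to an untreated control lane. The text does not state the exact calculation used, so the sketch below assumes that common convention; the function name and all inputs are hypothetical.

```python
def relative_expression(target: float, loading: float,
                        control_target: float, control_loading: float) -> float:
    """Band intensity normalized to the loading control, relative to a control lane.

    Assumed convention: (target / loading) / (control_target / control_loading).
    """
    if loading <= 0 or control_target <= 0 or control_loading <= 0:
        raise ValueError("intensities used as denominators must be positive")
    return (target / loading) / (control_target / control_loading)
```

Under this convention a value of 1.0 means no change relative to the control lane, and values below 1.0 indicate reduced KLKB1 protein.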
  • Fig. 5 illustrates editing efficiencies induced by a tBE gene editing system delivered by RNA electroporation in human cells.
  • Fig. 5A is a schematic diagram illustrating co-transfection of mgRNA-KLKB1 and corresponding hgRNA-KLKB1 with mRNA of tBE-V5-mA3 and nCas9.
  • Fig. 5B shows editing efficiency induced by tBE-V5-mA3 with indicated pairs of mgRNA/hgRNA at indicated sites.
  • Fig. 5C shows base substitution frequency at each target site calculated by EditR analysis.
  • Fig. 6 illustrates editing efficiencies induced by a V5-LigoRNA-based editing system delivered by RNA electroporation in human and mouse cells.
  • Fig. 6A is a schematic diagram illustrating co-transfection of mcrRNA-KLKB1, corresponding hcrRNA-KLKB1 and tracrRNA with mRNA of tBE-V5-mA3 and nCas9.
  • Fig. 6B shows editing efficiency induced by the V5-LigoRNA-based editing system at indicated sites in HepG2 cells.
  • Fig. 6C shows base substitution frequency at the target sites calculated by EditR analysis.
  • Fig. 6D shows editing efficiency induced by the V5-LigoRNA-based editing system at indicated sites in Hepa1-6 cells.
  • Fig. 6E shows base substitution frequency at the target sites calculated by EditR analysis.
  • nucleic acids are written left to right in the 5' to 3' orientation, and amino acid sequences are written left to right in amino to carboxy orientation.
  • a number “n” when used in the context of an amino acid sequence, refers to the nth amino acid in the amino acid sequence counting from the amino end.
  • amino acid 15 refers to the 15th amino acid in a certain amino acid sequence.
  • R15 refers to the 15th amino acid, which is an arginine (R), in a certain amino acid sequence.
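The residue notation above (one-letter code followed by a 1-based position from the amino end) can be checked mechanically. This is a minimal sketch; the function name and the example sequence in the usage note are made up for illustration.

```python
import re


def matches_residue(notation: str, sequence: str) -> bool:
    """Check a label such as 'R15' against a sequence written amino to carboxy.

    Positions are 1-based and counted from the amino end, as defined above.
    """
    m = re.fullmatch(r"([A-Z])(\d+)", notation)
    if m is None:
        raise ValueError(f"unrecognized residue notation: {notation!r}")
    residue, position = m.group(1), int(m.group(2))
    if not 1 <= position <= len(sequence):
        return False
    return sequence[position - 1] == residue
```

For an invented sequence of one M, thirteen A residues, then R and G, `matches_residue("R15", seq)` is true because the arginine sits at position 15.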
  • the term “about” will be understood by persons of ordinary skill in the art and will vary to some extent depending on the context in which it is used. In some embodiments, the term “about” when referring to a value is meant to encompass art-accepted variations. In some embodiments, the term “about” when referring to such values, is meant to encompass variations of ±20% or ±10%, more preferably ±5%, even more preferably ±1%, and still more preferably ±0.1% from the specified value, as such variations are appropriate in the context in which the term “about” is used.
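The tolerance bands in the definition of “about” amount to a relative-deviation check. A minimal sketch follows, assuming the percentage is taken relative to the specified value; the function name is hypothetical.

```python
def within_about(value: float, specified: float, tolerance_pct: float) -> bool:
    """True if `value` deviates from `specified` by at most tolerance_pct percent."""
    return abs(value - specified) <= abs(specified) * tolerance_pct / 100.0
```

For instance, 108 is “about” 100 under the ±10% band but not under the ±5% band.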
  • “percent identity” and “% identity,” as applied to nucleic acid or polynucleotide sequences, refer to the percentage of residue matches between at least two nucleic acid or polynucleotide sequences aligned using a standardized algorithm. Such an algorithm may insert, in a standardized and reproducible way, gaps in the sequences being compared in order to optimize alignment between two sequences, and therefore achieve a more meaningful comparison of the two sequences.
  • Percent identity between nucleic acid or polynucleotide sequences may be determined using a suite of commonly used and freely available sequence comparison algorithms provided by the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST) (Altschul, S. F. et al. (1990) J. Mol. Biol. 215: 403-410), which is available from several sources, including the NCBI, Bethesda, Md., and on the Internet at http://www.ncbi.nlm.nih.gov/BLAST/.
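Once two sequences have been aligned, percent identity reduces to counting matching columns. The sketch below assumes the alignment has already been produced elsewhere (for example by BLAST), that gaps are written as `-`, and that the denominator is the alignment length; it does not perform the alignment itself, and the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns holding the same residue in both sequences.

    Both inputs must be pre-aligned to equal length, with '-' marking gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)
```

Other denominators (such as the length of the shorter sequence) are also in use, so the convention should be stated whenever a percent identity is reported.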
  • NCBI National Center for Biotechnology Information
  • BLAST Basic Local Alignment Search Tool
  • Nucleic acid or polynucleotide sequences that do not show a high degree of identity may nevertheless encode similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes in a nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid sequences that all encode substantially the same protein. Specifically, degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al. (1991) Nucleic Acid Res 19: 5081; Ohtsuka et al.
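The degeneracy point can be illustrated by translating two different DNA sequences with the standard genetic code. The sketch below builds the standard codon table from its conventional compact form; the example sequences in the usage note are invented for illustration and do not come from the disclosure.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence (5' to 3') codon by codon; '*' marks a stop."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))
```

Here `translate("GCTCGTAAA")` and `translate("GCACGGAAG")` give the same peptide even though the two sequences differ at the third position of every codon.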
  • nucleic acid refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides which have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides.
  • nucleic acid is used interchangeably with polynucleotide, and (in appropriate contexts) gene, cDNA, and mRNA encoded by a gene.
  • percent (%) amino acid sequence identity with respect to a peptide, polypeptide or protein sequence is defined as the percentage of amino acid residues in a candidate sequence that are identical with the amino acid residues in another peptide or polypeptide sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity, and not considering any conservative substitutions as part of the sequence identity. Percent amino acid sequence identity in the current disclosure is measured using BLAST software. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared.
  • amino acid substitution refers to the replacement of one amino acid in a polypeptide with another amino acid.
  • Amino acid substitutions can be conservative or non-conservative substitutions. Exemplary substitutions are shown in Table 1. Amino acid substitutions may be introduced into a protein of interest and the products screened for a desired activity, for example, retained/improved biological activity.
  • Amino acids may be grouped according to common side-chain properties:
  • polypeptide is intended to encompass a singular “polypeptide” as well as plural “polypeptides,” and refers to a molecule composed of monomers (amino acids) linearly linked by amide bonds (also known as peptide bonds).
  • polypeptide refers to any chain or chains of two or more amino acids, and does not refer to a specific length of the product.
  • “peptides,” “protein,” or any other term used to refer to a chain or chains of two or more amino acids are included within the definition of “polypeptide,” and the term “polypeptide” may be used instead of, or interchangeably with any of these terms.
  • polypeptide is also intended to refer to the products of post-expression modifications of the polypeptide, including without limitation glycosylation, acetylation, phosphorylation, amidation, derivatization by known protecting/blocking groups, proteolytic cleavage, or modification by non-naturally occurring amino acids.
  • a polypeptide may be derived from a natural biological source or produced by recombinant technology, but is not necessarily translated from a designated nucleic acid sequence. It may be generated in any manner, including by chemical synthesis.
  • encode or “encoding” as it is applied to polynucleotides refers to a polynucleotide which is said to “encode” a polypeptide if, in its native state or when manipulated by methods well known to those skilled in the art, it can be transcribed and/or translated to produce the mRNA for the polypeptide and/or a fragment thereof.
  • the antisense strand is the complement of such a nucleic acid, and the encoding sequence can be deduced therefrom.
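Because sequences are written 5' to 3', the antisense strand of a coding sequence is obtained as its reverse complement, and the coding sequence can be recovered the same way. A minimal sketch for DNA letters A, C, G, and T (the function name is hypothetical):

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of a DNA sequence, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]
```

Applying the function twice returns the original sequence, which is the sense in which the encoding sequence "can be deduced" from the antisense strand.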
  • a “guide RNA” refers to a synthetic or expressed RNA sequence that comprises a CRISPR binding motif and a spacer.
  • the guide RNA is a single guide RNA.
  • the guide RNA is a dual-RNA structure.
  • the guide RNA is a dual-RNA structure formed by a ligand-bound CRISPR RNA (crRNA) and a trans-activating crRNA (tracrRNA) .
  • the guide RNA is a LigoRNA.
  • a “spacer” is a DNA-targeting motif, which is a sequence that is complementary to a target specific DNA region.
  • the guide RNA is a crRNA-tracrRNA dual RNA structure, and the crRNA comprises the spacer.
  • the CRISPR binding motif of a guide RNA can bind to a Cas enzyme, and the DNA-targeting motif of the gRNA can guide the complex to a specific target location on a DNA.
  • the guide RNA is a crRNA-tracrRNA dual RNA structure, and the base-pair structure formed by the crRNA and the tracrRNA comprises the CRISPR binding motif.
  • a guide RNA may further comprise one or more other motifs, such as one or more protein-binding motifs, or the like.
  • a “fusion protein” is a protein comprising at least two polypeptides that have been joined as a single polypeptide.
  • a fusion protein can comprise two domains that are encoded by separate genes that have been joined so that they are transcribed and translated as a single unit, producing a single polypeptide.
  • the at least two domains are fused together directly.
  • the domains are connected by one or more linkers.
  • genetic modification and its grammatical equivalents, as used herein, can refer to one or more alterations of a nucleic acid, e.g., the nucleic acid within an organism’s genome.
  • genetic modification can refer to alterations, additions, and/or deletion of genes or portions of genes or other nucleic acid sequences.
  • a genetically modified cell can also refer to a cell with an added, deleted, and/or altered gene or portion of a gene.
  • a genetically modified cell can also refer to a cell with an added nucleic acid sequence that is not a gene or gene portion.
  • Genetic modifications include, for example, both transient knock-in or knock-down mechanisms, and mechanisms that result in permanent knock-in, knock-down, or knock-out of target genes or portions of genes or nucleic acid sequences. Genetic modifications include, for example, both transient knock-in and mechanisms that result in permanent knock-in of nucleic acid sequences. Genetic modifications also include, for example, reduced or increased transcription, reduced or increased mRNA stability, reduced or increased translation, and reduced or increased protein stability.
  • composition refers to any mixture of two or more products, substances, or compounds, including cells.
  • subject means any animal such as a mammal, e.g., a human.
  • “treat,” “treating,” or “treatment” refers to ameliorating a disease or disorder, e.g., slowing or arresting or reducing the development of the disease or disorder or reducing at least one of the clinical symptoms thereof.
  • ameliorating a disease or disorder can include obtaining a beneficial or desired clinical result that includes, but is not limited to, any one or more of: alleviation of one or more symptoms, diminishment of extent of disease, preventing or delaying spread of disease, preventing or delaying recurrence of disease, delay or slowing of disease progression, amelioration of the disease state, inhibiting or eliminating the disease or progression of the disease, inhibiting or slowing the disease or its progression, arresting its development, and remission (whether partial or total).
  • Hereditary angioedema (HAE) is a disorder characterized by recurrent episodes of severe swelling (angioedema).
  • the most common areas of the body to develop swelling are the limbs, face, intestinal tract, and airway. Minor trauma or stress may trigger an attack, but swelling often occurs without a known trigger.
  • Episodes involving the intestinal tract cause severe abdominal pain, nausea, and vomiting. Swelling in the airway can restrict breathing and lead to life-threatening obstruction of the airway.
  • HAE occurs as types I, II, and III (HAE-1, HAE-2, and HAE-3), which are distinguished with reference to C1 inhibitor (C1-INH).
  • HAE-1, HAE-2, and HAE-3 can be characterized by low C1-INH serum levels (HAE-1), C1-INH dysfunction (HAE-2), and normal C1-INH serum levels and function (HAE-3).
  • HAE-1 is the most common, accounting for 85 percent of cases.
  • HAE-2 occurs in 15 percent of cases, and HAE-3 is very rare.
  • the uncontrolled activity of the plasma contact system forms the basis for the pathological tissue swelling in C1-INH-related HAE.
  • the plasma contact system designates a group of serine proteases and their substrates that assemble on surfaces of circulating blood cells and vessel walls. It is composed of the serine protease zymogens factor XII (FXII), factor XI (FXI), and plasma prekallikrein (PPK), and the substrate of plasma kallikrein (PK), high molecular weight kininogen (HK).
  • the contact system is initiated by FXII binding to “contact” activators, including negatively charged surfaces such as the silicate kaolin and high molecular weight dextran sulfate (DXS) in vitro, or platelet polyphosphate (polyP) and mast cell heparin in vivo.
  • Human plasma prekallikrein is the precursor of plasma kallikrein (PK), the serine protease that liberates bradykinin (BK) from high molecular weight kininogen (HK). Additionally, PK cleaves FXII to generate active FXII (FXIIa).
  • PPK is encoded by the KLKB1 gene (e.g., SEQ ID NO: 588) located on chromosome 4 in humans.
  • Human PPK (e.g., SEQ ID NO: 587, UniProtKB-P03952) is mostly synthesized in the liver and secreted into the bloodstream as a 619 amino acid single chain glycoprotein with five N-linked glycosylations.
  • Two differentially glycosylated PPK forms (85 and 88 kDa) exist and circulate with a total plasma concentration of 35–50 μg/ml (350–500 nM). Most PPK circulates in a non-covalently bound complex with HK (about 75% in complex and 25% in free form). PPK is activated to PK by limited proteolysis at the peptide bond R371–I372, yielding the heavy (amino acids 1–371) and light (amino acids 372–619) chain fragments. Both chains remain connected by a disulfide bond spanning C364–C484.
  • the N-terminal heavy chain is composed of four apple domains (A1-A4), wherein A2 contains the major HK binding sites with some contribution of A1 and A4. Each apple domain is stabilized by three disulfide bridges (four in Apple 4) and the entire PPK molecule has 18 disulfide bonds.
  • the catalytic triad is composed of H415, D464 and S559 within the C-terminal light chain of the protein.
  • C1-INH is encoded by the SERPING1 gene, which is located on chromosome 11. This 105-kDa glycoprotein (seven N-linked and eight O-linked glycans) is the main inhibitor of the classical complement enzymes C1r and C1 esterase (C1s).
  • the complement component C1 is a protein complex involved in the complement system, which is composed of one molecule of C1q, two molecules of C1r, and two molecules of C1s.
  • C1-INH is also the primary inhibitor of the contact factors. It is responsible for 93% of the inhibition of FXIIa in plasma. Similarly, C1-INH is responsible for 52% of the inhibition of PK and 47% of the inhibition of activated FXI (FXIa).
  • C1-INH also modestly inhibits plasmin, but the physiological relevance of this inhibition may be limited.
  • By acting as an inhibitor of PK, C1-INH limits bradykinin release from HK, because cleavage of HK by PK leads to the release of bradykinin (e.g., SEQ ID NO: 589).
  • PK can cleave Lys-Arg and Arg-Ser bonds in HK to release bradykinin.
  • Bradykinin is recognized by the kinin B2 receptor (B2R), which is constitutively expressed on vascular endothelial cells.
  • Activation of B2R, a G-protein-coupled receptor, leads to vascular leakage through induction of endothelial cell contractility, uncoupling of endothelial cell junctions, and production of nitric oxide and prostacyclin.
  • B2R is internalized, leading to (temporary) desensitization of the tissue to bradykinin.
  • in C1-INH-related HAE, reduced C1-INH activity leads to increased PK activity.
  • the present disclosure provides gene editing systems and methods that aim to disrupt PK expression by targeting the KLKB1 gene, thereby decreasing bradykinin release and B2R activation and ultimately reducing vascular leakage.
  • the present disclosure provides highly specific gene editing systems, polynucleotides, vectors, cells, compositions, kits, and methods to disrupt the KLKB1 gene, which encodes the precursor of plasma kallikrein.
  • Plasma kallikrein can cleave Lys-Arg and Arg-Ser bonds in human kininogen to release bradykinin.
  • suppressing the expression of plasma kallikrein reduces bradykinin production.
  • the disruption of the KLKB1 gene prevents or reduces the release of bradykinin. In some embodiments, the disruption of the KLKB1 gene leads to the suppression of C1-INH deficiency and/or dysfunction. In some embodiments, the disruption of the KLKB1 gene treats HAE caused by C1 inhibitor (C1-INH) deficiency and/or dysfunction. In some embodiments, the disruption of the KLKB1 gene treats hereditary angioedema (HAE) . In some embodiments, the disruption of the KLKB1 gene treats HAE-1 and/or HAE-2.
  • the present disclosure provides a gene editing system for disrupting the KLKB1 gene, wherein the gene editing system comprises a base editor and at least one guide RNA that is capable of binding to the KLKB1 gene.
  • a highly specific base editor, the transformer base editor (tBE), is used to induce efficient and precise gene editing at genomic sites for disrupting the KLKB1 gene.
  • a tBE is used with a combination of main guide RNA (mgRNA) and helper guide RNA (hgRNA), wherein the mgRNA and hgRNA are capable of binding to the KLKB1 gene.
  • a base editor as used herein is a cytosine base editor (CBE) , which comprises a combination of a CRISPR system and cytidine deaminase.
  • a CBE effectuates a programmable cytosine to thymine (C-to-T) substitution. Because the base editing process does not depend on the generation of a DNA double-strand break (DSB), unwanted nucleotide insertions/deletions (indels) or DNA damage responses (DDRs) can be largely avoided.
  • the tBE is any one of the base editors described in WO2020156575A1, incorporated herein by reference in its entirety.
  • the tBE can be any base editor as illustrated in Fig. 1.
  • the transformer base editor (tBE) system comprises a cytidine deaminase inhibitor (dCDI) domain and a split-TEV protease (e.g., as illustrated in Fig. 1, V5).
  • the tBE remains inactive at off-target sites because it carries a cleavable fusion of the dCDI domain, and thus eliminates or reduces unintended off-target mutations.
  • the tBE is transformed to cleave off the dCDI domain and catalyzes targeted deamination for precise editing.
  • the tBE uses one mgRNA (normally about 20 nt) to bind at the target genomic site and one helper guide RNA (hgRNA, normally about 10 to about 20 nt) to bind at a nearby region (preferably upstream of the target genomic site).
  • the binding of two gRNAs can guide the components of tBE system to correctly assemble at the target genomic site for base editing.
  • tBE can specifically edit cytosine in target regions with no observable off-target mutations.
  • the tBE can specifically edit cytosine in a target region to induce a premature stop codon to repress KLKB1 protein expression or break the GU-AG rule to disrupt a splicing site.
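  • As an illustration of the premature-stop-codon strategy described above, the following minimal sketch scans an in-frame coding sequence for codons that a single C-to-T change converts into a stop codon (CAA, CAG, and CGA become TAA, TAG, and TGA). The function name and the toy sequence are hypothetical and are not taken from KLKB1 or from the sequence listing; edits on the opposite strand and splice-site edits are not covered.

```python
# Stop codons on the coding strand (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def c_to_t_stop_sites(cds: str):
    """Return (codon_index, codon, edited_codon) for every in-frame codon that
    becomes a stop codon when exactly one C is replaced by T.

    Hypothetical helper for illustration only; `cds` is assumed to start in frame.
    """
    cds = cds.upper()
    hits = []
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        codon = cds[i:i + 3]
        if codon in STOP_CODONS:
            continue  # already a stop codon
        for pos, base in enumerate(codon):
            if base != "C":
                continue
            edited = codon[:pos] + "T" + codon[pos + 1:]
            if edited in STOP_CODONS:
                hits.append((i // 3, codon, edited))
    return hits


if __name__ == "__main__":
    toy_cds = "ATGCAGGGTCGATAA"  # made-up sequence, not KLKB1
    for index, codon, edited in c_to_t_stop_sites(toy_cds):
        print(f"codon {index}: {codon} -> {edited}")
```

  • In practice a candidate codon would also have to lie within the editing window of an mgRNA spacer; that constraint is outside the scope of this sketch.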
  • the tBE system is used to disrupt a KLKB1 gene, which leads to the suppression of C1-INH deficiency and/or dysfunction, for the treatment of HAE.
  • the base editors and base editing methods described herein can be applied to perform high-specificity and high-efficiency base editing in the genome of various eukaryotes.
  • the tBE comprises a Cas9 nickase (D10A), which is less toxic to cells than Cas9 nuclease because Cas9 nickase activates a lower level of p53-mediated DDR. In addition, tBE achieves highly specific and efficient base editing at most sites.
  • the present disclosure provides a gene editing system comprising a main guide RNA (mgRNA) and a helper guide RNA (hgRNA), or at least one DNA polynucleotide encoding the mgRNA and/or the hgRNA, wherein the mgRNA comprises an mgRNA spacer of about 20 nucleotides (about 20 nt) that binds to a target site on a KLKB1 gene and the hgRNA comprises an hgRNA spacer of about 10 to about 20 nt that binds to a site that is close to the target site that the mgRNA spacer binds to.
  • the gene editing system comprises an mgRNA comprising an mgRNA spacer selected from SEQ ID NOs: 1 to 80, and an hgRNA comprising an hgRNA spacer of about 10 to about 20 nt, e.g., 7 nt to 23 nt, 8 nt to 22 nt, 9 nt to 21 nt, and 10 nt to 20 nt, that binds to a site close to the target site that the mgRNA spacer binds to.
  • the gene editing system comprises an hgRNA comprising an hgRNA spacer selected from SEQ ID NOs: 81-586.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise the sequences as set forth in Table 2.
  • the mgRNA having the pairing code of “1” can be used in combination with any one of the hgRNAs having the pairing codes of “1-1,” “1-2,” “1-3,” “1-4,” “1-5,” and “1-6.”
  • the mgRNA having the pairing code of “2” can be used in combination with any one of the hgRNAs having the pairing codes of “2-1” and “2-2,” and so on and so forth.
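  • The pairing-code convention above can be sketched as a simple grouping step: an hgRNA code of the form “M-N” belongs to the mgRNA with code “M”. The function name is hypothetical, and the codes used are only those quoted in the text; Table 2 itself is not reproduced here.

```python
from collections import defaultdict


def group_hgrnas_by_mgrna(hgrna_codes):
    """Group hgRNA pairing codes such as "1-3" under their mgRNA code ("1").

    Hypothetical helper; it assumes every hgRNA code has the form "<mgRNA>-<index>".
    """
    groups = defaultdict(list)
    for code in hgrna_codes:
        mg_code, sep, index = code.partition("-")
        if not (mg_code and sep and index):
            raise ValueError(f"not an hgRNA pairing code: {code!r}")
        groups[mg_code].append(code)
    return dict(groups)


if __name__ == "__main__":
    codes = ["1-1", "1-2", "1-3", "1-4", "1-5", "1-6", "2-1", "2-2"]
    for mg_code, partners in group_hgrnas_by_mgrna(codes).items():
        print(f"mgRNA {mg_code}: {', '.join(partners)}")
```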
  • any appropriate fragment of the 20-nt sequences of SEQ ID NOs: 81-333, e.g., a fragment having 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, or 19 nucleotides, can be used as an hgRNA spacer in the gene editing system disclosed herein.
  • SEQ ID NOs: 334-586 are exemplary hgRNA spacers having 10 nucleotides.
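  • To make the fragment option concrete, the sketch below enumerates every contiguous fragment of 7 to 19 nucleotides from a 20-nt spacer. It assumes that a “fragment” is a contiguous subsequence, which the text does not state explicitly, and it does not decide which fragments are “appropriate”. The spacer shown is a made-up placeholder, not one of SEQ ID NOs: 81-333.

```python
def hgrna_spacer_fragments(spacer20: str, min_len: int = 7, max_len: int = 19):
    """List all contiguous fragments of a 20-nt spacer with min_len..max_len nt.

    Hypothetical helper; contiguity is an assumption made for this sketch.
    """
    if len(spacer20) != 20:
        raise ValueError("expected a 20-nt spacer")
    fragments = []
    for length in range(min_len, max_len + 1):
        for start in range(len(spacer20) - length + 1):
            fragments.append(spacer20[start:start + length])
    return fragments


if __name__ == "__main__":
    placeholder = "GACUUCAGGAUCCAUGCAAG"  # made-up 20-nt sequence
    fragments = hgrna_spacer_fragments(placeholder)
    ten_mers = [f for f in fragments if len(f) == 10]
    print(len(fragments), "fragments in total;", len(ten_mers), "of them are 10 nt long")
```

  • A 20-nt spacer has 21 − L contiguous fragments of length L, so 11 candidate 10-nt fragments exist per spacer; the text does not say which of them corresponds to the exemplary 10-nt spacers.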
  • the present disclosure provides a gene editing system comprising a main guide RNA (mgRNA) and a helper guide RNA (hgRNA), or at least one DNA polynucleotide encoding the mgRNA and/or the hgRNA, wherein the mgRNA comprises an mgRNA spacer and the hgRNA comprises an hgRNA spacer, wherein the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise the sequences respectively as set forth in Table 3.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 81, respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 82, respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 83, respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 02 and SEQ ID NO: 84, respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 334, respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 335, respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 336, respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 02 and SEQ ID NO: 337, respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 03 and SEQ ID NO: 85 respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 03 and SEQ ID NO: 338 respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 04 and SEQ ID NO: 86 respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 04 and SEQ ID NO: 87 respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 04 and SEQ ID NO: 88 respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 04 and SEQ ID NO: 339 respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 04 and SEQ ID NO: 340 respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 04 and SEQ ID NO: 341 respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 05 and SEQ ID NO: 91 respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 05 and SEQ ID NO: 92 respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 05 and SEQ ID NO: 93 respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 05 and SEQ ID NO: 344 respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 05 and SEQ ID NO: 345 respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 05 and SEQ ID NO: 346 respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 06 and SEQ ID NO: 94 respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 06 and SEQ ID NO: 95 respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 06 and SEQ ID NO: 96 respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 06 and SEQ ID NO: 347 respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 06 and SEQ ID NO: 348 respectively.
  • the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 06 and SEQ ID NO: 349 respectively.
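  • The mgRNA/hgRNA pairs enumerated above can be collected into a small lookup table. The sketch below records only the SEQ ID NO pairs listed in this passage and checks an observation about them: each 10-nt hgRNA entry is numbered 253 higher than a 20-nt entry paired with the same mgRNA. That offset is an observation about this listing, not a rule stated in the text, and the variable and function names are hypothetical.

```python
# mgRNA SEQ ID NO -> hgRNA SEQ ID NOs listed in the text, split by spacer length.
LISTED_PAIRS = {
    1: {"20nt": [81, 82, 83], "10nt": [334, 335, 336]},
    2: {"20nt": [84], "10nt": [337]},
    3: {"20nt": [85], "10nt": [338]},
    4: {"20nt": [86, 87, 88], "10nt": [339, 340, 341]},
    5: {"20nt": [91, 92, 93], "10nt": [344, 345, 346]},
    6: {"20nt": [94, 95, 96], "10nt": [347, 348, 349]},
}

OBSERVED_OFFSET = 334 - 81  # 253, taken from the first listed pair


def offsets_consistent(pairs, offset=OBSERVED_OFFSET):
    """Check that every 10-nt entry equals a listed 20-nt entry plus `offset`."""
    return all(
        [seq_id + offset for seq_id in entry["20nt"]] == entry["10nt"]
        for entry in pairs.values()
    )


def partners(mgrna_seq_id, pairs=LISTED_PAIRS):
    """Return all hgRNA SEQ ID NOs listed for one mgRNA SEQ ID NO."""
    entry = pairs[mgrna_seq_id]
    return entry["20nt"] + entry["10nt"]


if __name__ == "__main__":
    print("offset of 253 holds for all listed pairs:", offsets_consistent(LISTED_PAIRS))
    print("hgRNA partners listed for SEQ ID NO: 04:", partners(4))
```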
  • the gene editing system disclosed herein comprises (1) the hgRNA comprising a first CRISPR motif, the hgRNA spacer, and a first protein-binding motif, or a DNA polynucleotide encoding the hgRNA, (2) the mgRNA comprising a second CRISPR motif and the mgRNA spacer, or a DNA polynucleotide encoding the mgRNA, (3) a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first CRISPR motif, (4) a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second CRISPR motif, and (5) a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide
  • the gene editing system disclosed herein comprises (1) the hgRNA comprising a first CRISPR motif, the hgRNA spacer, and a first protein-binding motif, or a DNA polynucleotide encoding the hgRNA, (2) the mgRNA comprising a second CRISPR motif and the mgRNA spacer, or a DNA polynucleotide encoding the mgRNA, (3) a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first CRISPR motif, (4) a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second CRISPR motif, (5) a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide
  • the protease is split into a first protease fragment and a second protease fragment, wherein the first or second protease fragment alone is not able to cleave the cleavage site.
  • a “protease” refers to an enzyme that catalyzes proteolysis.
  • a “cleavage site for a protease” refers to a short peptide that the protease recognizes, and within which it creates a proteolytic cleavage.
  • Non-limiting examples of proteases include TEV protease, TuMV protease, PPV protease, PVY protease, ZIKV protease, and WNV protease.
  • the protein sequences of example proteases and their corresponding cleavage sites are provided in Table 4.
  • the protease is a TEV protease, a TuMV protease, a PPV protease, a PVY protease, a ZIKV protease, or a WNV protease.
  • the protease cleavage site is a self-cleaving peptide, such as the 2A peptides.
  • 2A peptides are 18-22 amino-acid-long viral oligopeptides that mediate “cleavage” of polypeptides during translation in eukaryotic cells.
  • the designation “2A” refers to a specific region of the viral genome and different viral 2As have generally been named after the virus they were derived from.
  • the first discovered 2A was F2A (foot-and-mouth disease virus), after which E2A (equine rhinitis A virus), P2A (porcine teschovirus-1 2A), and T2A (Thosea asigna virus 2A) were also identified.
  • SEQ ID NO: 604 GSGATNFSLLKQAGDVEENPGP
  • SEQ ID NO: 605 GSGEGRGSLLTCGDVEENPGP
  • SEQ ID NO: 606 GSGQCTNYALLKLAGDVESNPGP
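  • The three 2A sequences listed above share a C-terminal core of the form D-x-E-x-N-P-G-P. The sketch below checks for that core and splits each peptide between the final glycine and proline, which is where 2A-mediated “cleavage” is generally understood to occur during translation. The regular expression and function name are assumptions made for this illustration; only the three sequences come from the text.

```python
import re

# 2A peptide sequences as listed in the text, keyed by SEQ ID NO.
PEPTIDES_2A = {
    604: "GSGATNFSLLKQAGDVEENPGP",
    605: "GSGEGRGSLLTCGDVEENPGP",
    606: "GSGQCTNYALLKLAGDVESNPGP",
}

# Assumed core motif at the C-terminus: D-x-E-x-N-P-G-P.
CORE_MOTIF = re.compile(r"D.E.NPGP$")


def split_at_skip_site(peptide: str):
    """Return (upstream_part, downstream_part) split between the last G and P,
    or None when the assumed core motif is absent."""
    if CORE_MOTIF.search(peptide) is None:
        return None
    return peptide[:-1], peptide[-1]


if __name__ == "__main__":
    for seq_id, peptide in PEPTIDES_2A.items():
        upstream, downstream = split_at_skip_site(peptide)
        print(f"SEQ ID NO: {seq_id}: {upstream} | {downstream}")
```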
  • the protease is a TEV protease.
  • the TEV protease comprises a sequence as set forth in SEQ ID NO: 590.
  • the first and/or the second TEV protease fragment is not able to cleave the TEV cleavage site on its own. However, in the presence of the remaining portion of the TEV protease, this fragment will be able to effectuate the cleavage.
  • the TEV fragment may be the TEV N-terminal domain (e.g., SEQ ID NO: 591) or the TEV C-terminal domain (e.g., SEQ ID NO: 592) .
  • the first TEV protease fragment comprises a sequence of SEQ ID NO: 591.
  • the first TEV protease fragment comprises a sequence of SEQ ID NO: 592.
  • a “nucleobase deaminase inhibitor” or an “inhibitory domain” refers to a protein or a protein domain that inhibits the deaminase activity of a nucleobase deaminase.
  • the nucleobase deaminase inhibitor is an inhibitory domain of a nucleobase deaminase.
  • the nucleobase deaminase inhibitor is an inhibitory domain of a cytidine deaminase.
  • the nucleobase deaminase inhibitor is the mouse APOBEC3 cytidine deaminase domain 2 (mA3-CDA2, SEQ ID NO: 607) .
  • the nucleobase deaminase inhibitor is the human APOBEC3B cytidine deaminase domain 1 (hA3B-CDA1, SEQ ID NO: 608) .
  • Table 5 shows 44 proteins/domains that have significant sequence homology to mA3-CDA2 core sequence and Table 6 shows 43 proteins/domains that have significant sequence homology to hA3B-CDA1. All of these proteins and domains, as well as their variants and equivalents, are contemplated to have nucleobase deaminase inhibition activities.
  • the inhibitory domain of a cytidine deaminase comprises an amino acid sequence as set forth in SEQ ID NO: 607 or SEQ ID NO: 608.
  • nucleobase deaminase refers to a group of enzymes that catalyze the hydrolytic deamination of nucleobases such as cytidine, deoxycytidine, adenosine and deoxyadenosine.
  • nucleobase deaminases include cytidine deaminases and adenosine deaminases.
  • the gene editing system disclosed herein only includes the catalytic domain, such as mouse A3 cytidine deaminase domain 1 (mA3-CDA1, SEQ ID NO: 609) and human A3B cytidine deaminase domain 2 (hA3B-CDA2, SEQ ID NO: 610) .
  • the gene editing system disclosed herein includes at least a catalytic core of the catalytic domain. For instance, when mA3-CDA1 was truncated at residues 196/197 the CDA1 domain still retained substantial editing efficiencies.
  • the nucleobase deaminase is a cytidine deaminase. In some embodiments, the nucleobase deaminase is a cytidine deaminase comprising an amino acid sequence of SEQ ID NO: 609. In some embodiments, the nucleobase deaminase is a cytidine deaminase comprising an amino acid sequence of SEQ ID NO: 610.
  • Cytidine deaminase refers to enzymes that catalyze the hydrolytic deamination of cytidine and deoxycytidine to uridine and deoxyuridine, respectively. Cytidine deaminases maintain the cellular pyrimidine pool.
  • a family of cytidine deaminases is APOBEC (“apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like”). Members of this family are C-to-U editing enzymes. Some APOBEC family members have two domains: one domain of APOBEC-like proteins is the catalytic domain, while the other domain is a pseudocatalytic domain.
  • the catalytic domain is a zinc dependent cytidine deaminase domain and is important for cytidine deamination.
  • RNA editing by APOBEC-1 requires homodimerisation and this complex interacts with RNA binding proteins to form the editosome.
  • Non-limiting examples of APOBEC proteins include APOBEC1, APOBEC2, APOBEC3A, APOBEC3B, APOBEC3C, APOBEC3D, APOBEC3F, APOBEC3G, APOBEC3H, APOBEC4, and activation-induced (cytidine) deaminase (AID) .
  • mutants of the APOBEC proteins are also known that have brought about different editing characteristics for base editors, e.g., the mutants W98Y, Y130F, Y132D, W104A, D131Y and P134Y.
  • the term APOBEC and each of its family members also encompasses variants and mutants that have a certain level (e.g., 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99%) of sequence identity to the corresponding wildtype APOBEC protein or the catalytic domain and retain the cytidine deaminating activity.
  • the variants and mutants can be derived with amino acid additions, deletions and/or substitutions. Such substitutions, in some embodiments, are conservative substitutions.
  • the cytidine deaminase is selected from the group consisting of APOBEC3B (A3B), APOBEC3C (A3C), APOBEC3D (A3D), APOBEC3F (A3F), APOBEC3G (A3G), APOBEC3H (A3H), APOBEC1 (A1), APOBEC3 (A3), APOBEC2 (A2), APOBEC4 (A4), and AICDA (AID).
  • the cytidine deaminase comprises an amino acid sequence of any one of SEQ ID NOs: 765-800 (Table 8).
  • the cytidine deaminase is a human or mouse cytidine deaminase.
  • the catalytic domain of the cytidine deaminase is a mouse A3 cytidine deaminase domain 1 (CDA1) or human A3B cytidine deaminase domain 2 (CDA2).
  • the first fusion protein further comprises a uracil glycosylase inhibitor (UGI).
  • Uracil glycosylase inhibitor (UGI), which can be prepared from Bacillus subtilis bacteriophage PBS1, is a small protein (9.5 kDa) that inhibits E. coli uracil-DNA glycosylase (UDG) as well as UDG from other species. Inhibition of UDG occurs by reversible protein binding with a 1:1 UDG:UGI stoichiometry. UGI is capable of dissociating UDG-DNA complexes. A non-limiting example of UGI is found in Bacillus phage AR9 (YP_009283008.1).
  • the UGI comprises the amino acid sequence of SEQ ID NO: 611 (TNLSDIIEKETGKQLVIQESILMLPEEVEEVIGNKPESDILVHTAYDES) or has at least 70%, 75%, 80%, 85%, 90% or 95% sequence identity to SEQ ID NO: 611 and retains the uracil glycosylase inhibition activity.
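  • Sequence-identity thresholds such as those above can be illustrated with a minimal calculation over two pre-aligned sequences. The text does not specify how identity is computed, so the definition used here (identical columns divided by aligned columns, with gaps counted as mismatches) is an assumption, as are the function names and the single-substitution variant. Only the reference sequence is taken from SEQ ID NO: 611 as quoted in the text.

```python
UGI_SEQ_611 = "TNLSDIIEKETGKQLVIQESILMLPEEVEEVIGNKPESDILVHTAYDES"


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Gaps are written as '-'. Columns that are gaps in both sequences are ignored.
    This is one of several common definitions and is assumed for this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True when the percent identity is at least `threshold`."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    variant = "S" + UGI_SEQ_611[1:]  # hypothetical variant with one substitution
    print(f"identity: {percent_identity(UGI_SEQ_611, variant):.1f}%")
    print("meets 95% threshold:", meets_threshold(UGI_SEQ_611, variant, 95.0))
```

  • Whether a variant also retains uracil glycosylase inhibition activity has to be established experimentally; an identity value alone does not show it.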
  • the first fusion protein further comprises a nuclear localization sequence (NLS).
  • NLS refers to a nuclear localization signal or sequence, NES refers to a nuclear export signal, and iNLS refers to an internal SV40 nuclear localization sequence.
  • a peptide linker is optionally provided between each of the fragments in any of the fusion proteins.
  • the peptide linker has from 1 to 100 amino acid residues (or 3-20, 4-15, without limitation).
  • at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80% or 90% of the amino acid residues of the peptide linker are amino acid residues selected from the group consisting of alanine, glycine, cysteine, and serine.
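  • The linker criteria above (a length of 1 to 100 residues and a minimum share of alanine, glycine, cysteine, and serine) can be checked with a few lines of code. The function names, the default threshold of 50%, and the two example linkers are hypothetical and are not taken from the disclosure.

```python
SMALL_RESIDUES = frozenset("AGCS")  # alanine, glycine, cysteine, serine


def small_residue_fraction(linker: str) -> float:
    """Fraction of linker residues that are A, G, C, or S."""
    linker = linker.upper()
    if not linker:
        raise ValueError("linker must not be empty")
    return sum(residue in SMALL_RESIDUES for residue in linker) / len(linker)


def linker_within_limits(linker: str, min_fraction: float = 0.5,
                         min_len: int = 1, max_len: int = 100) -> bool:
    """Check the length range and the minimum A/G/C/S fraction (assumed defaults)."""
    return (min_len <= len(linker) <= max_len
            and small_residue_fraction(linker) >= min_fraction)


if __name__ == "__main__":
    for linker in ("GGGGSGGGGS", "GSAEAAAKEA"):  # made-up example linkers
        print(linker, small_residue_fraction(linker), linker_within_limits(linker))
```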
  • “Cas protein” or “clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) protein” refers to RNA-guided DNA endonuclease enzymes associated with the CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) adaptive immunity system in Streptococcus pyogenes, as well as other bacteria.
  • Cas proteins include Cas9 proteins, Cas12a (Cpf1) proteins, Cas12b (formerly known as C2c1) proteins, Cas13 proteins and various engineered counterparts.
  • Example Cas proteins include SpCas9, FnCas9, St1Cas9, St3Cas9, NmCas9, SaCas9, AsCpf1, LbCpf1, FnCpf1, VQR SpCas9, EQR SpCas9, VRER SpCas9, SpCas9-NG, xSpCas9, RHA FnCas9, KKH SaCas9, NmeCas9, StCas9, CjCas9, AsCpf1, FnCpf1, SsCpf1, PcCpf1, BpCpf1, CmtCpf1, LiCpf1, PmCpf1, Pb3310Cpf1, Pb4417Cpf1, BsCpf1, EeCpf1, BhCas12b, AkCas12b, EbCas12b, LsCas12b, RfCas13
  • the Cas protein comprises an amino acid sequence of any one of SEQ ID NOs: 713-764 (Table 10).
  • the Cas protein is a Cas9, a dead Cas9 (dCas9), or a Cas9 nickase (nCas9).
  • the Cas protein is an nCas9. In some embodiments, the nCas9 protein is an nCas9-D10A protein. In some embodiments, the nCas9-D10A protein has an amino acid sequence of SEQ ID NO: 612.
  • the first protein-binding RNA motif and the first RNA binding domain, the second protein-binding RNA motif and the second RNA binding domain, and the third protein-binding RNA motif and the third RNA binding domain are each independently selected from the group consisting of an MS2 phage operator stem-loop and MS2 coat protein (MCP) or an RNA-binding section thereof; a BoxB and N22P or an RNA-binding section thereof; a telomerase Ku binding motif and Ku protein or an RNA-binding section thereof; a telomerase Sm7 binding motif and Sm7 protein or an RNA-binding section thereof; a PP7 phage operator stem-loop and PP7 coat protein (PCP) or an RNA-binding section thereof; an SfMu phage Com stem-loop and Com RNA binding protein or an RNA-binding section thereof; and a non-natural RNA aptamer and corresponding aptamer ligand or an RNA-
  • biological equivalents thereof are also provided.
  • the biological equivalents have at least about 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99%sequence identity with the reference protein.
  • the biological equivalents retain the desired activity of the reference protein.
  • the biological equivalents are derived by including one, two, three, four, five or more amino acid additions, deletions, substitutions, or the combinations thereof.
  • the substitution is a conservative amino acid substitution.
  • the guide RNA (the main guide RNA and/or the helper guide RNA) is a dual-RNA structure formed by a ligand-bound CRISPR RNA (crRNA) and a trans-activating crRNA (tracrRNA).
  • the crRNA comprises a spacer sequence and is capable of forming a base-pair structure with the tracrRNA, and wherein the base-pair structure binds to a Cas protein.
  • the crRNA further comprises a linker sequence which comprises a protein-binding motif.
  • the “CRISPR motif” refers to the base-pair structure formed between the crRNA and the tracrRNA.
  • the gene editing system is a LigoRNA-based gene editing system, as described in PCT/CN2023/096482, which is incorporated herein by reference in its entirety.
  • at least one guide RNA is a LigoRNA.
  • a LigoRNA system comprises a dual-RNA structure, which can be used as a guide RNA in CRISPR-based gene editing systems.
  • the dual-RNA structure can be formed by a ligand-bound CRISPR RNA (crRNA) and a trans-activating crRNA (tracrRNA).
  • the LigoRNA system comprises an hgRNA set of a hcrRNA and a tracrRNA, and an mgRNA set of mcrRNA and a tracrRNA.
  • none of these RNA molecules is longer than 100 nucleotides.
  • Since the LigoRNA system is formed by two short RNAs, it helps to solve the problem of synthesizing long single guide RNAs in previous gene editing systems. Chemically synthesized RNAs over 100 nt show much lower yield and purity, which creates challenges for large-scale production and cost control.
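As a rough illustration of why length matters for chemical synthesis, the sketch below checks the 100 nt limit stated above and estimates an idealized full-length fraction as the per-step coupling efficiency raised to the number of couplings. The efficiency value of 0.99 and the function names are assumptions chosen for illustration, not figures from this disclosure.

```python
MAX_LENGTH_NT = 100  # length limit stated in the text


def within_length_limit(length_nt: int, max_length_nt: int = MAX_LENGTH_NT) -> bool:
    """True if an RNA of the given length respects the stated limit."""
    return length_nt <= max_length_nt


def full_length_fraction(length_nt: int, coupling_efficiency: float = 0.99) -> float:
    """Idealized fraction of full-length product in stepwise synthesis.

    Assumes one coupling per nucleotide added after the first, each
    succeeding independently with the same (assumed) efficiency.
    """
    if length_nt < 1:
        raise ValueError("length must be at least 1 nt")
    if not 0.0 < coupling_efficiency <= 1.0:
        raise ValueError("coupling efficiency must be in (0, 1]")
    return coupling_efficiency ** (length_nt - 1)
```

Under this simple model the full-length fraction falls geometrically with length, so two short RNAs are each expected to be recovered at a higher fraction than a single long guide RNA of their combined length.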
  • crRNA and tracrRNA are capable of guiding nCas9 to the target DNA.
  • the crRNAs and the tracrRNAs in the LigoRNA system are further modified.
  • an MS2 or boxB hairpin is fused to crRNA in multiple different sites.
  • at least one nucleotide in the crRNAs and the tracrRNAs is modified, such as by a 2’-O-methyl modification and/or a 3’-phosphorothioate modification.
  • the crRNA comprises a spacer sequence and a linker sequence, wherein the linker sequence comprises at least one protein-binding motif, wherein the protein-binding motif is an RNA aptamer motif.
  • the protein binding motif is selected from MS2, PP7, boxB, SfMu hairpin motif, telomerase Ku, and Sm7 binding motif, or a variant thereof.
  • Aptamers are single-stranded oligonucleotides that fold into defined architectures and selectively bind to a specific target, including proteins, peptides, carbohydrates, small molecules, toxins, and even live cells.
  • the crRNA is capable of forming a base-pair structure with a trans-activating crRNA (tracrRNA).
  • the tracrRNA has a sequence of SEQ ID NO: 804 or 811.
  • the crRNA comprises at least one nucleotide with modification.
  • the modification is selected from 2’-O-alkyl, 2’-substituted alkoxy, 2’-substituted alkyl, 2’-halo, 3’-phosphorothioate, bridged nucleic acid (BNA), and locked nucleic acid (LNA).
  • the at least one nucleotide with modification is any one of the first three nucleotides from the 3’-end of the engineered crRNA.
  • the tracrRNA comprises at least one nucleotide with modification.
  • the modification is selected from 2’-O-alkyl, 2’-substituted alkoxy, 2’-substituted alkyl, 2’-halo, 3’-phosphorothioate, bridged nucleic acid (BNA), and locked nucleic acid (LNA).
  • the at least one nucleotide with modification is any one of the first three nucleotides from the 3’-end of the engineered tracrRNA.
  • the crRNA and/or tracrRNA comprises at least one nucleotide with modification.
  • the modification is selected from 2’-O-alkyl (such as 2’-O-methyl), 2’-substituted alkoxy, 2’-substituted alkyl, 2’-halo (such as 2’-fluoro), 3’-phosphorothioate, bridged nucleic acid (BNA), and locked nucleic acid (LNA).
  • the crRNA and/or tracrRNA comprises nucleotides comprising 2’-O-methyl and 3’-phosphorothioate.
  • the first three nucleotides from the 5’-end of the crRNA and/or tracrRNA are modified with 2’-O-methyl and 3’-phosphorothioate. In some embodiments, the first three nucleotides from the 3’-end of the crRNA and/or tracrRNA are modified with 2’-O-methyl, and the second to fourth nucleotides from the 3’-end of the crRNA and/or tracrRNA are modified with 3’-phosphorothioate.
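The terminal modification pattern described above can be written out as a position map. The following is a minimal sketch under the assumption that positions are numbered from 1 at the 5’-end and that the RNA is long enough for the 5’ and 3’ windows not to overlap; the function name and label strings are hypothetical.

```python
def terminal_modifications(length: int) -> dict:
    """Map 1-based positions (from the 5' end) to sets of modification labels.

    Pattern taken from the text:
      - first three nucleotides from the 5' end: 2'-O-methyl + 3'-phosphorothioate
      - first three nucleotides from the 3' end: 2'-O-methyl
      - second to fourth nucleotides from the 3' end: 3'-phosphorothioate
    """
    if length < 7:
        # Assumption: require non-overlapping 5' and 3' windows.
        raise ValueError("sequence too short for non-overlapping windows")
    mods = {}

    def add(position: int, label: str) -> None:
        mods.setdefault(position, set()).add(label)

    for position in (1, 2, 3):
        add(position, "2'-O-methyl")
        add(position, "3'-phosphorothioate")
    for k in (1, 2, 3):        # k-th nucleotide counted from the 3' end
        add(length - k + 1, "2'-O-methyl")
    for k in (2, 3, 4):
        add(length - k + 1, "3'-phosphorothioate")
    return mods
```

For a 20 nt RNA this marks positions 1 to 3 and 17 to 20, with the 3’-terminal nucleotide carrying 2’-O-methyl only.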
  • a tBE system comprising two LigoRNA structures: an mcrRNA-tracrRNA base-paired structure and an hcrRNA-tracrRNA base-paired structure.
  • the mcrRNA contains a boxB hairpin to generate an R-loop region for intended base editing, and the hcrRNA contains an MS2 hairpin to recruit a nucleotide deaminase (e.g., an APOBEC) linked to a nucleobase deaminase inhibitor domain (e.g., a cytosine deaminase inhibitor (dCDI) domain) through a cleavage site such as a TEV protease cleavage site.
  • an N22p-fused TEVc is recruited by the boxB-containing mcrRNA and, together with free TEVn, works as the key in the tBE system.
  • mcrRNA and hcrRNA form a base-paired structure with the same tracrRNA to locate a target DNA, and the dCDI domain is cleaved off at the target site to induce efficient base editing.
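The recruitment logic described above can be summarized as a toy boolean model. This is a deliberate simplification for illustration only: it treats each binding event as all-or-nothing and ignores kinetics, and the class, field, and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteState:
    """Simplified state of one target site (all fields are assumptions)."""
    hcrrna_bound: bool   # MS2-containing hcrRNA/tracrRNA located at the site
    mcrrna_bound: bool   # boxB-containing mcrRNA/tracrRNA located at the site
    tevn_present: bool   # free TEVn fragment available


def deaminase_recruited(state: SiteState) -> bool:
    # The MS2 hairpin on the hcrRNA recruits the inhibited deaminase fusion.
    return state.hcrrna_bound


def protease_reconstituted(state: SiteState) -> bool:
    # N22p-fused TEVc is recruited via boxB and needs free TEVn to be active.
    return state.mcrrna_bound and state.tevn_present


def editing_enabled(state: SiteState) -> bool:
    # The dCDI domain is cleaved off only where both conditions coincide.
    return deaminase_recruited(state) and protease_reconstituted(state)
```

In this model a site bound by only one of the two crRNAs never enables editing, which reflects the intended requirement that both guide RNAs locate the same target.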
  • the gene editing system comprises
  • an hcrRNA comprising a first spacer sequence and a first linker sequence, wherein the first linker sequence comprises a first protein-binding motif
  • an mcrRNA comprising a second spacer sequence and a second linker sequence, wherein the second linker sequence comprises a second protein-binding motif
  • a first tracrRNA which is capable of forming a first base-pair structure with the hcrRNA
  • a second tracrRNA which is capable of forming a second base-pair structure with the mcrRNA
  • a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first base-pair structure
  • a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide encoding the first fusion protein, wherein the nucleobase deaminase or the catalytic domain thereof and the first RNA binding domain are optionally connected by a linker, and wherein the first RNA binding domain binds to the first protein-binding motif
  • wherein the first Cas protein and the second Cas protein are the same or different, and the first tracrRNA and the second tracrRNA are the same or different.
  • the gene editing system further comprises
  • a protease, or a polynucleotide encoding the protease
  • wherein the nucleobase deaminase inhibitor domain is connected to the nucleobase deaminase or the catalytic domain thereof in the first fusion protein optionally by a linker, and wherein there is a cleavage site for the protease between the nucleobase deaminase inhibitor domain and the nucleobase deaminase or the catalytic domain thereof.
  • the gene editing system comprises
  • an hcrRNA comprising a first spacer sequence and a first linker sequence, wherein the first linker sequence comprises a first protein-binding motif
  • an mcrRNA comprising a second spacer sequence and a second linker sequence, wherein the second linker sequence comprises a second protein-binding motif
  • a first tracrRNA which is capable of forming a first base-pair structure with the hcrRNA
  • a second tracrRNA which is capable of forming a second base-pair structure with the mcrRNA
  • a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first base-pair structure
  • a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide encoding the first fusion protein, wherein the nucleobase deaminase or the catalytic domain thereof and the first RNA binding domain are optionally connected by a linker, and wherein the first RNA binding domain binds to the first protein-binding motif
  • a protease, or a polynucleotide encoding the protease
  • a nucleobase deaminase inhibitor domain
  • a second fusion protein comprising the protease and a second RNA binding
  • wherein the first Cas protein and the second Cas protein are the same or different, and the first tracrRNA and the second tracrRNA are the same or different,
  • wherein the nucleobase deaminase inhibitor domain is connected to the nucleobase deaminase or the catalytic domain thereof in the first fusion protein optionally by a linker, and wherein there is a cleavage site for the protease between the nucleobase deaminase inhibitor domain and the nucleobase deaminase or the catalytic domain thereof,
  • wherein the protease and the second RNA binding domain are optionally connected by a linker
  • the protease is split into a first protease fragment and a second protease fragment, wherein the first and/or second protease fragment alone is not able to cleave the cleavage site.
  • the gene editing system comprises
  • an hcrRNA comprising a first spacer sequence and a first linker sequence, wherein the first linker sequence comprises a first protein-binding motif
  • an mcrRNA comprising a second spacer sequence and a second linker sequence, wherein the second linker sequence comprises a second protein-binding motif
  • a first tracrRNA which is capable of forming a first base-pair structure with the hcrRNA
  • a second tracrRNA which is capable of forming a second base-pair structure with the mcrRNA
  • a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first base-pair structure
  • a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide encoding the first fusion protein, wherein the nucleobase deaminase or the catalytic domain thereof and the first RNA binding domain are optionally connected by a linker, and wherein the first RNA binding domain binds to the first protein-binding motif
  • a protease, or a polynucleotide encoding the protease, wherein the protease is split into a first protease fragment and a second protease fragment, wherein the first and/or second protease fragment alone is not able to cleave the cleavage site,
  • a nucleobase deaminase inhibitor domain
  • a second fusion protein comprising the first protease fragment and a second RNA binding domain, or a polynucleotide encoding the second fusion protein, wherein the first protease fragment and the second RNA binding domain are optionally connected by a linker, and
  • a third fusion protein comprising the second protease fragment and a third RNA binding domain, or a polynucleotide encoding the third fusion protein, wherein the second protease fragment and the third RNA binding domain are optionally connected by a linker,
  • wherein the first Cas protein and the second Cas protein are the same or different, and the first tracrRNA and the second tracrRNA are the same or different,
  • wherein the nucleobase deaminase inhibitor domain is connected to the nucleobase deaminase or the catalytic domain thereof in the first fusion protein optionally by a linker, and wherein there is a cleavage site for the protease between the nucleobase deaminase inhibitor domain and the nucleobase deaminase or the catalytic domain thereof,
  • wherein the mcrRNA further comprises a third protein-binding motif
  • the second RNA binding domain binds to the second protein-binding motif
  • the third RNA binding domain binds to the third protein-binding motif
  • the gene editing system comprises
  • an hcrRNA comprising a first spacer sequence and a first linker sequence, wherein the first linker sequence comprises a first protein-binding motif
  • an mcrRNA comprising a second spacer sequence and a second linker sequence, wherein the second linker sequence comprises a second protein-binding motif
  • a first tracrRNA which is capable of forming a first base-pair structure with the hcrRNA
  • a second tracrRNA which is capable of forming a second base-pair structure with the mcrRNA
  • a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first base-pair structure
  • a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide encoding the first fusion protein, wherein the nucleobase deaminase or the catalytic domain thereof and the first RNA binding domain are optionally connected by a linker, and wherein the first RNA binding domain binds to the first protein-binding motif
  • a protease, or a polynucleotide encoding the protease, wherein the protease is split into a first protease fragment and a second protease fragment, wherein the first and/or second protease fragment alone is not able to cleave the cleavage site,
  • a nucleobase deaminase inhibitor domain
  • a second fusion protein comprising the first protease fragment and a second RNA binding domain, or a polynucleotide encoding the second fusion protein, wherein the first protease fragment and the second RNA binding domain are optionally connected by a linker, and
  • a third fusion protein comprising the second protease fragment and a third RNA binding domain, or a polynucleotide encoding the third fusion protein, wherein the second protease fragment and the third RNA binding domain are optionally connected by a linker,
  • wherein the first Cas protein and the second Cas protein are the same or different, and the first tracrRNA and the second tracrRNA are the same or different,
  • wherein the nucleobase deaminase inhibitor domain is connected to the nucleobase deaminase or the catalytic domain thereof in the first fusion protein optionally by a linker, and wherein there is a cleavage site for the protease between the nucleobase deaminase inhibitor domain and the nucleobase deaminase or the catalytic domain thereof,
  • wherein the mcrRNA further comprises a third protein-binding motif
  • the gene editing system comprises
  • an hcrRNA comprising a first spacer sequence and a first linker sequence, wherein the first linker sequence comprises a first protein-binding motif
  • an mcrRNA comprising a second spacer sequence and a second linker sequence, wherein the second linker sequence comprises a second protein-binding motif
  • a first tracrRNA which is capable of forming a first base-pair structure with the hcrRNA
  • a second tracrRNA which is capable of forming a second base-pair structure with the mcrRNA
  • a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first base-pair structure
  • a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide encoding the first fusion protein, wherein the nucleobase deaminase or the catalytic domain thereof and the first RNA binding domain are optionally connected by a linker, and wherein the first RNA binding domain binds to the first protein-binding motif
  • a protease, or a polynucleotide encoding the protease, wherein the protease is split into a first protease fragment and a second protease fragment, wherein the first and/or second protease fragment alone is not able to cleave the cleavage site,
  • a nucleobase deaminase inhibitor domain
  • a second fusion protein comprising the first protease fragment and a second RNA binding domain, or a polynucleotide encoding the second fusion protein
  • wherein the first Cas protein and the second Cas protein are the same or different, and the first tracrRNA and the second tracrRNA are the same or different, wherein the nucleobase deaminase inhibitor domain is connected to the nucleobase deaminase or the catalytic domain thereof in the first fusion protein optionally by a linker, and wherein there is a cleavage site for the protease between the nucleobase deaminase inhibitor domain and the nucleobase deaminase or the catalytic domain thereof,
  • wherein the first protease fragment and the second RNA binding domain are optionally connected by a linker
  • the LigoRNA-based gene editing system comprises a main crRNA (mcrRNA) , a helper crRNA (hcrRNA) , and a tracrRNA respectively:
  • the mgRNA and/or the hgRNA comprises a dual-RNA structure.
  • the dual-RNA structure is formed by a CRISPR RNA (crRNA) and a trans-activating crRNA (tracrRNA) , wherein the crRNA comprises the spacer.
  • the mgRNA comprises a mcrRNA and a first tracrRNA
  • the mcrRNA comprises the mgRNA spacer
  • the hgRNA comprises a hcrRNA and a second tracrRNA
  • the hcrRNA comprises the hgRNA spacer
  • the first tracrRNA and the second tracrRNA are the same or different.
  • the mcrRNA and the hcrRNA are SEQ ID NO: 805 and SEQ ID NO: 808, respectively; or SEQ ID NO: 806 and SEQ ID NO: 809, respectively; or SEQ ID NO: 807 and SEQ ID NO: 810, respectively; or SEQ ID NO: 812 and SEQ ID NO: 815, respectively; or SEQ ID NO: 813 and SEQ ID NO: 816, respectively; or SEQ ID NO: 814 and SEQ ID NO: 817, respectively.
  • the tracrRNA is SEQ ID NO: 804 or 811.
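The listed mcrRNA/hcrRNA pairings and tracrRNA options can be captured in a small lookup, sketched below. The SEQ ID NO values are taken from the text; the text does not state which tracrRNA accompanies which crRNA pair, so the check treats the two choices independently, and the constant and function names are hypothetical.

```python
# (mcrRNA SEQ ID NO, hcrRNA SEQ ID NO) pairs as listed in the text
LISTED_CRRNA_PAIRS = {
    (805, 808), (806, 809), (807, 810),
    (812, 815), (813, 816), (814, 817),
}

# tracrRNA SEQ ID NOs as listed in the text
LISTED_TRACRRNAS = {804, 811}


def is_listed_combination(mcrrna_id: int, hcrrna_id: int, tracrrna_id: int) -> bool:
    """True if the crRNA pair and the tracrRNA each appear in the listed options.

    Assumption: any listed tracrRNA may be combined with any listed pair,
    because the text does not restrict the combination.
    """
    return ((mcrrna_id, hcrrna_id) in LISTED_CRRNA_PAIRS
            and tracrrna_id in LISTED_TRACRRNAS)
```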
  • the present disclosure provides a polynucleotide encoding the hgRNA and/or the mgRNA disclosed herein.
  • the present disclosure provides a polynucleotide encoding all components except the first and the second Cas protein in the gene editing system disclosed herein.
  • the present disclosure provides a polynucleotide encoding all components in the gene editing system disclosed herein.
  • the present disclosure provides a kit comprising a polynucleotide encoding all components except the first and the second Cas protein in the gene editing system disclosed herein, and a polynucleotide encoding the first and/or second Cas protein in the gene editing system disclosed herein.
  • the first and the second Cas proteins are the same Cas protein.
  • polynucleotides disclosed herein can be obtained by methods known in the art.
  • the polynucleotide can be obtained from cloned DNA (e.g., from a DNA library) , by chemical synthesis, by cDNA cloning, or by the cloning of genomic DNA or fragments thereof, purified from the desired cell.
  • any method known to those skilled in the art for identification of nucleic acids that encode desired genes can be used. Any method available in the art can be used to obtain a full length (i.e., encompassing the entire coding region) cDNA or genomic DNA encoding a desired protein, such as from a cell or tissue source.
  • Modified or variant polynucleotides can be engineered from a wildtype polynucleotide using standard recombinant DNA methods.
  • Polynucleotides can be cloned or isolated using any available methods known in the art for cloning and isolating nucleic acid molecules. Such methods include PCR amplification of nucleic acids and screening of libraries, including nucleic acid hybridization screening, antibody-based screening, and activity-based screening.
  • Methods for amplification of polynucleotides can be used to isolate polynucleotides encoding a desired protein, including for example, polymerase chain reaction (PCR) methods.
  • PCR can be carried out using any known methods or procedures in the art. Exemplary methods include use of a Perkin-Elmer Cetus thermal cycler and Taq polymerase (Gene Amp) .
  • a nucleic acid containing a gene of interest can be used as a source material from which a desired polypeptide-encoding nucleic acid molecule can be amplified.
  • DNA and mRNA preparations, cell extracts, tissue extracts from an appropriate source e.g. testis, prostate, breast
  • fluid samples e.g.
  • the source can be from any eukaryotic species including, but not limited to, vertebrate, mammalian, human, porcine, bovine, feline, avian, equine, canine, and other primate sources.
  • Nucleic acid libraries also can be used as a source material.
  • Primers can be designed to amplify a desired polynucleotide. For example, primers can be designed based on expressed sequences from which a desired polynucleotide is generated. Primers can be designed based on back-translation of a polypeptide amino acid sequence. If desired, degenerate primers can be used for amplification.
  • Oligonucleotide primers that hybridize to sequences at the 3’ and 5’ termini of the desired sequence can be used as primers to amplify by PCR from a nucleic acid sample.
  • Primers can be used to amplify the entire full-length polynucleotide, or a truncated sequence thereof.
  • Nucleic acid molecules generated by amplification can be sequenced and confirmed to encode a desired polypeptide.
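The primer design step above, in which primers hybridize to the 5’ and 3’ termini of the desired sequence, can be sketched as follows. The sketch assumes the template is given as the 5’→3’ sense strand and uses a fixed primer length; it ignores melting temperature, GC content, and secondary structure, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def terminal_primers(template: str, length: int = 20) -> tuple:
    """Return (forward, reverse) primers, each written 5'->3'.

    The forward primer copies the first `length` bases of the sense strand;
    the reverse primer is the reverse complement of the last `length` bases.
    """
    if length < 1 or len(template) < length:
        raise ValueError("template shorter than requested primer length")
    forward = template[:length]
    reverse = reverse_complement(template[-length:])
    return forward, reverse
```

Amplifying a truncated sequence instead of the full-length polynucleotide amounts to passing the corresponding sub-sequence as the template.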
  • the present disclosure provides a vector comprising the polynucleotide encoding the hgRNA and/or the mgRNA disclosed herein.
  • the present disclosure provides a vector comprising the polynucleotide encoding all components except the first and the second Cas protein in the gene editing system disclosed herein.
  • the present disclosure provides a vector comprising the polynucleotide encoding all components in the gene editing system disclosed herein.
  • the vector is a plasmid or a viral vector.
  • the vector is a polycistronic vector.
  • the present disclosure provides a kit comprising the vector disclosed above, and a vector comprising the polynucleotide encoding the first and/or second Cas protein in the gene editing system disclosed herein.
  • any methods known in the art for the insertion of DNA fragments into a vector can be used to construct expression vectors comprising a polynucleotide disclosed herein. These methods can include in vitro recombinant DNA and synthetic techniques and in vivo (genetic) recombination.
  • the polynucleotide disclosed herein can be operably linked to control sequences in the expression vector (s) to ensure protein expression.
  • control sequences may include, but are not limited to, leader or signal sequences, promoters (e.g., naturally associated or heterologous promoters) , ribosomal binding sites, enhancer or activator elements, translational start and termination sequences, and transcription start and termination sequences, and are chosen to be compatible with the host cell chosen to express the proteins.
  • the promoters may be either naturally occurring promoters, hybrid promoters that combine elements of more than one promoter, or synthetic promoters.
  • An expression construct may be present in a cell on an episome, such as a plasmid, or the expression construct may be inserted in a chromosome such as in a gene locus.
  • the expression vector includes a selectable marker gene to allow the selection of transformed host cells.
  • the vector is an expression vector comprising a nucleotide sequence encoding a variant polypeptide operably linked to at least one regulatory control sequence. Regulatory control sequences for use herein include promoters, enhancers, and other expression control elements.
  • the expression vector is designed for the choice of the host cell to be transformed, the particular variant polypeptide desired to be expressed, the vector's copy number, the ability to control that copy number, and/or the expression of any other protein encoded by the vector, such as antibiotic markers.
  • the vector can include, but is not limited to, viral vectors and plasmid DNA.
  • Viral vectors can include, but are not limited to, adenoviral vectors, lentiviral vectors, retroviral vectors, and adeno-associated viral vectors.
  • expression vectors contain selection markers such as ampicillin-resistance, hygromycin-resistance, tetracycline resistance, kanamycin resistance, or neomycin resistance to permit detection of those cells transformed with the desired DNA sequences.
  • Suitable vectors, promoter, and enhancer elements are known in the art; many are commercially available for generating subject recombinant constructs.
  • the vector is a polycistronic vector.
  • the vector is a bicistronic vector or a tricistronic vector.
  • Bicistronic or polycistronic expression vectors may include (1) multiple promoters fused to each of the open reading frames; (2) insertion of splicing signals between genes; (3) fusion of genes whose expressions are driven by a single promoter; and (4) insertion of proteolytic cleavage sites between genes (self-cleavage peptide) or insertion of internal ribosomal entry sites (IRESs) between genes.
  • a polycistronic vector is used to co-express multiple genes in the same cell.
  • Two strategies are most commonly used to construct a multicistronic vector.
  • an Internal Ribosome Entry Site (IRES) element is typically used for bi-cistronic vectors.
  • the IRES element, acting as another ribosome recruitment site, allows initiation of translation from an internal region of the mRNA. Thus, two proteins are translated from one mRNA.
  • IRES elements are quite large (usually 500-600 bp) (Pelletier et al., 1988; Jang et al., 1988) .
  • the engineered CD47 proteins disclosed herein have a smaller size compared to the wild-type full-length human CD47, and thus could be used with an IRES element in a multicistronic vector having limited packaging capacity.
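The size consideration for IRES-containing vectors can be made concrete with a simple cargo budget check, sketched below. Only the 500-600 bp IRES range comes from the text; the other element sizes and the packaging capacity used in any call are placeholders supplied by the user, and the function names are hypothetical.

```python
def cargo_length(elements: dict) -> int:
    """Total length in bp of all named cassette elements."""
    return sum(elements.values())


def fits_packaging_capacity(elements: dict, capacity_bp: int) -> bool:
    """True if the summed element lengths do not exceed the stated capacity."""
    if capacity_bp <= 0:
        raise ValueError("capacity must be positive")
    return cargo_length(elements) <= capacity_bp


# Hypothetical bicistronic cassette; every size except the IRES is a placeholder.
example_cassette = {
    "promoter": 500,
    "orf_1": 1200,
    "ires": 600,    # upper end of the 500-600 bp range given in the text
    "orf_2": 900,
}
```

Because the IRES alone consumes several hundred base pairs, shorter open reading frames leave more room within a fixed capacity.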
  • the present disclosure provides a cell comprising the gene editing system disclosed herein.
  • the present disclosure provides a cell comprising the polynucleotide disclosed herein.
  • the cell further comprises a polynucleotide encoding the first and/or second Cas protein in the gene editing system disclosed herein.
  • the present disclosure provides a cell comprising the vector disclosed herein.
  • the cell further comprises a vector comprising a polynucleotide encoding the first and/or second Cas protein in the gene editing system disclosed herein.
  • the present disclosure provides a cell comprising the kit disclosed herein.
  • the cell is a stem cell.
  • the cell is a pluripotent stem cell.
  • Pluripotent stem cells are cells that have the capacity to self-renew by dividing and to develop into the three primary germ cell layers of the early embryo and therefore into all cells of the adult body, but not extra-embryonic tissues such as the placenta.
  • Embryonic stem cells and induced pluripotent stem cells are pluripotent stem cells.
  • the cell is an embryonic stem cell (ESC) .
  • Embryonic stem cells are pluripotent stem cells derived from the inner cell mass of a blastocyst, an early-stage pre-implantation embryo.
  • the cell is an induced pluripotent stem cell (iPSC) .
  • iPSCs are derived from adult somatic cells that have been genetically reprogrammed back into an embryonic-like pluripotent state that enables the development of an unlimited source of any type of cell needed for therapeutic purposes.
  • Pluripotent stem cells as used herein have the potential to differentiate into any of the three germ layers: endoderm (e.g., the stomach lining, gastrointestinal tract, lungs, etc. ) , mesoderm (e.g., muscle, bone, blood, urogenital tissue, etc. ) or ectoderm (e.g., epidermal tissues and nervous system tissues) .
  • pluripotent stem cells as used herein, also encompasses induced pluripotent stem cells (iPSCs or iPS cells) , or a type of pluripotent stem cell derived from a non-pluripotent cell.
  • a pluripotent stem cell is produced or generated from a cell that is not a pluripotent cell.
  • pluripotent stem cells can be direct or indirect progeny of a non-pluripotent cell.
  • parent cells include somatic cells that have been reprogrammed to induce a pluripotent, undifferentiated phenotype by various means.
  • Such "iPS" or “iPSC” cells can be created by inducing the expression of certain regulatory genes or by the exogenous application of certain proteins. Methods for the induction of iPS cells are known in the art and are further described below.
  • hiPSCs are human induced pluripotent stem cells.
  • As used herein, “pluripotent stem cells” also encompasses mesenchymal stem cells (MSCs) and/or embryonic stem cells (ESCs).
  • the cell is an endothelial cell.
  • Endothelial cells form the endothelium, which is a single layer that lines the interior surface of blood vessels and lymphatic vessels, providing an anticoagulant barrier between the vessel wall and blood.
  • the endothelial cell is a unique multifunctional cell with critical basal and inducible metabolic and synthetic functions.
  • the endothelial cell reacts with physical and chemical stimuli within the circulation and regulates hemostasis, vasomotor tone, and immune and inflammatory responses.
  • the endothelial cell is pivotal in angiogenesis and vasculogenesis. (Sumpio et al., Cells in focus: endothelial cell, Int. J. Biochem Cell Biol., 2002. )
  • the cell is a primary cell.
  • Primary cells are isolated directly from human or animal tissue using enzymatic or mechanical methods. Once isolated, they are placed in an artificial environment in plastic or glass containers supported with specialized medium containing essential nutrients and growth factors to support proliferation.
  • Primary cells could be of two types: adherent or suspension.
  • Adherent cells require attachment for growth and are said to be anchorage-dependent cells.
  • Adherent cells are usually derived from tissues of organs. Suspension cells do not require attachment for growth and are said to be anchorage-independent cells.
  • Most suspension cells are isolated from the blood system, but some tissue-derived cells can also be used in suspension, such as hepatocytes or intestinal cells.
  • Although primary cells usually have a limited lifespan, they offer a number of advantages compared to cell lines.
  • Primary cell culture enables researchers to study donors and not just cells. Several factors such as age, medical history, race, and sex can be considered when building an experimental model. With a growing trend towards personalized medicine, such donor variability and tissue complexity can be achieved with use of primary cells, but are difficult to replicate with cell lines that are more systematic and uniform in nature and do not capture the true diversity of a living tissue.
  • the cell is a differentiated cell.
  • Differentiated cells are cells that have undergone differentiation. They are mature cells that perform a specialized function.
  • Some examples of differentiated cells are epithelial cells, skin fibroblasts, endothelial cells lining the blood vessels, smooth muscle cells, liver cells, nerve cells, human cardiac muscle cells, etc. Generally, these cells have a unique morphology, metabolic activity, membrane potential, and responsiveness to signals facilitating their function in a body tissue or organ.
  • the present disclosure provides a composition comprising the gene editing system disclosed herein.
  • the present disclosure provides a composition comprising the cell disclosed herein.
  • composition includes, but is not limited to, a pharmaceutical composition.
  • a “pharmaceutical composition” refers to an active pharmaceutical agent formulated in pharmaceutically acceptable or physiologically acceptable solutions for administration to a cell or an animal, either alone, or in combination with one or more other modalities of therapy. It will also be understood that, if desired, the compositions of the invention may be administered in combination with other agents, such as, e.g., cytokines, growth factors, hormones, small molecules, chemotherapeutics, pro-drugs, drugs, antibodies, or other various pharmaceutically active agents. There is virtually no limit to other components that may also be included in the compositions, provided that the additional agents do not adversely affect the ability of the composition to deliver the intended therapy.
  • The phrase “pharmaceutically acceptable” is used herein to refer to those compounds, materials, compositions, and/or dosage forms which are, within the scope of sound medical judgment, suitable for use in contact with the tissues of human beings and animals without excessive toxicity, irritation, allergic response, or other problem or complication, commensurate with a reasonable benefit/risk ratio.
  • compositions may also comprise a pharmaceutically acceptable carrier, diluent, or excipient.
  • pharmaceutically acceptable carrier, diluent, or excipient includes, without limitation, any adjuvant, carrier, excipient, glidant, sweetening agent, diluent, preservative, dye/colorant, flavor enhancer, surfactant, wetting agent, dispersing agent, suspending agent, stabilizer, isotonic agent, solvent, surfactant, or emulsifier which has been approved by the United States Food and Drug Administration as being acceptable for use in humans or domestic animals.
  • Exemplary pharmaceutically acceptable carriers include, but are not limited to, sugars, such as lactose, glucose, and sucrose; starches, such as corn starch and potato starch; cellulose and its derivatives, such as sodium carboxymethyl cellulose, ethyl cellulose, and cellulose acetate; tragacanth; malt; gelatin; talc; cocoa butter; waxes; animal and vegetable fats; paraffins; silicones; bentonites; silicic acid; zinc oxide; oils, such as peanut oil, cottonseed oil, safflower oil, sesame oil, olive oil, corn oil, and soybean oil; glycols, such as propylene glycol; polyols, such as glycerin, sorbitol, mannitol, and polyethylene glycol; esters, such as ethyl oleate and ethyl laurate; agar; buffering agents, such as magnesium hydroxide and aluminum hydroxide; alginic acid;
  • the liquid pharmaceutical compositions may include one or more of the following: sterile diluents such as water for injection, saline solution, preferably physiological saline; Ringer’s solution; isotonic sodium chloride; fixed oils such as synthetic mono or diglycerides which may serve as the solvent or suspending medium; polyethylene glycols; glycerin; propylene glycol or other solvents; antibacterial agents, such as benzyl alcohol or methyl paraben; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents, such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates, or phosphates; and agents for the adjustment of tonicity, such as sodium chloride or dextrose.
  • the parenteral preparation can be enclosed in ampoules, disposable syringes, or multiple dose vials made of glass or plastic.
  • An injectable pharmaceutical composition is preferably sterile.
  • composition may be suitably developed for intravenous, intratumoral, oral, rectal, vaginal, parenteral, topical, pulmonary, intranasal, buccal, ophthalmic, or another route of administration.
  • the present disclosure provides a method for disrupting a KLKB1 gene in a cell, comprising introducing into the cell the gene editing system disclosed herein.
  • “disrupt” or “disruption” of a gene refers to gene knock-down or gene knock-out.
  • In a gene knock-down, the expression of the gene is reduced via methods such as genetic modification and treatment as disclosed herein.
  • In a gene knock-out, a gene is made inoperative, partially or completely, via methods such as genetic modification.
  • the KLKB1 gene is disrupted by adding stop codons using the gene editing system disclosed herein.
  • the gene editing system disclosed herein is used to induce C-to-T base editing in the codons CAA (Gln), CAG (Gln), or CGA (Arg) in the KLKB1 gene to create a TAA, TAG, or TGA stop codon, respectively. See Fig. 2.
  • the gene editing system disclosed herein is used to induce G-to-A (C-to-T on the opposite strand) base editing in TGG (Trp) codons in the KLKB1 gene to create a TAA, TAG, or TGA stop codon.
  • the KLKB1 gene is disrupted by destroying a splice site in the gene using the gene editing system disclosed herein.
  • the gene editing system disclosed herein is used to induce G-to-A (C-to-T on the opposite strand) base editing in a 5’ GU or 3’ AG splice site to destroy the canonical GU-AG splicing pattern. See Fig. 3.
  • the present disclosure provides a method for decreasing the expression of plasma kallikrein in a cell, comprising introducing into the cell the gene editing system disclosed herein.
  • the present disclosure provides a method for decreasing the expression of bradykinin in a cell, comprising introducing into the cell the gene editing system disclosed herein.
  • the present disclosure provides a method for decreasing vascular leakage regulated by a kinin B2 receptor (B2R) in a cell, comprising introducing into the cell the gene editing system disclosed herein.
  • the HAE is caused by C1-INH deficiency or dysfunction.
  • the HAE is HAE-1.
  • the HAE is HAE-2.
  • the cell is a stem cell.
  • the cell is a pluripotent stem cell.
  • the cell is an induced pluripotent stem cell (iPSC) or an embryonic stem cell.
  • the cell is an endothelial cell.
  • the cell is a primary cell or a differentiated cell.
  • the present disclosure provides a method for treating HAE in a subject in need thereof, the method comprising: administering to the subject a composition comprising a gene editing system disclosed herein, wherein the KLKB1 gene in the subject is disrupted.
  • the present disclosure provides a method for treating HAE in a subject in need thereof, the method comprising: administering to the subject a composition comprising a cell disclosed herein, wherein the KLKB1 gene in the subject is disrupted.
  • the present disclosure provides use of a gene editing system, a cell, or a composition as disclosed herein for the manufacture of a medicament for treating HAE.
  • the present disclosure provides a gene editing system, a cell, or a composition as disclosed herein for treating HAE.
  • the various protein components and the gRNAs of a gene editing system disclosed herein may be introduced into a subject or a cell via one or more vectors expressing the protein components and gRNAs.
  • the gRNAs and the protein components of a gene editing system disclosed herein can be delivered into a cell in a form of ribonucleoprotein (RNP) via electroporation.
  • Primer sets (hgKLKB1-E2-1-U1_FOR/mgKLKB1-E2-1_REV) were used to amplify the fragment hgKLKB1-E2 (Exon Number) -1 (mgRNA Number) -U1 (hgRNA Number) -MS2 (the operator in hgRNA scaffold) -U6 (mgRNA promoter) -mgKLKB1-E2-1 using the template pUC57-mgRNA-MS2-U6.
  • the fragment hgKLKB1-E2-1-U1-MS2-U6-mgKLKB1-E2-1 was then ligated into BsmBI-linearized U6-ccdB-boxB-tBE-V5 to generate the vector ptBE-V5-KLKB1-E2-1-U1.
  • Other combinations with different on-target hgRNA and mgRNA were constructed using the same strategy, respectively.
  • 293FT cells were maintained in DMEM + 10% FBS and regularly tested to exclude mycoplasma contamination.
  • 293FT cells were seeded in a 24-well plate at a density of 1 × 10⁵ per well and transfected with 250 μl serum-free Opti-MEM containing 2.5 μl LIPOFECTAMINE LTX, 1 μl LIPOFECTAMINE plus, 0.5 μg tBE-V5 expression vector, and 0.5 μg pEFS-nSpCas9 or pEFS-nSpCas9-NG expression vector. After 24 h, puromycin was added to the medium at a final concentration of 4 μg ml⁻¹.
  • genomic DNA was extracted from the cells using QuickExtract™ DNA Extraction Solution for subsequent sequencing analysis.
  • Target genomic sequences were PCR-amplified using high-fidelity DNA polymerase PrimeSTAR HS with primer sets flanking the examined mgRNA target sites.
  • Base substitution frequency at each target site was calculated by EditR analysis. See http://baseeditr.com/.
  • Total proteins from NHDFs were extracted using a RIPA buffer with 1 mM PMSF and proteinase inhibitor cocktail. The protein concentration was measured using a BCA protein assay kit. Equal amounts of protein (5 μg) were separated by SDS-PAGE using 4–12% SurePage Mini-PROTEAN Gels and then transferred onto a nitrocellulose membrane.
  • Membranes were blocked using 5% skim milk in Tris-buffered saline containing 0.1% (v/v) Tween-20 (TBST) for 1 h, and incubated with primary antibodies against KLKB1 (1:1000) and GAPDH (1:10000) overnight at 4 °C, followed by incubation with anti-rabbit IgG or anti-mouse IgG conjugated with horseradish peroxidase. GAPDH was used as an internal control. The probed protein was visualized using Amersham Image 680. Densitometric analysis was performed semi-quantitatively using ImageQuant TL.
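The stop-codon strategy listed above (C-to-T editing of CAA, CAG, or CGA codons, and G-to-A editing of TGG codons) can be illustrated with a minimal Python sketch. It is not part of the disclosure: the function names and the toy coding sequence are hypothetical, and the sketch ignores the editing window, PAM requirements, and guide design, all of which determine whether a given codon can actually be edited.

```python
from itertools import combinations

STOP_CODONS = {"TAA", "TAG", "TGA"}


def edited_variants(codon, src, dst):
    """Yield every codon obtained by converting one or more `src` bases to `dst`."""
    positions = [i for i, base in enumerate(codon) if base == src]
    for r in range(1, len(positions) + 1):
        for subset in combinations(positions, r):
            bases = list(codon)
            for i in subset:
                bases[i] = dst
            yield "".join(bases)


def reachable_stops(codon):
    """Stop codons reachable by C-to-T edits, or by G-to-A edits
    (C-to-T on the opposite strand). The two edit types are not mixed."""
    if codon in STOP_CODONS:
        return set()  # already a stop codon, nothing to create
    variants = set(edited_variants(codon, "C", "T"))
    variants |= set(edited_variants(codon, "G", "A"))
    return variants & STOP_CODONS


def find_candidates(cds):
    """Scan an in-frame coding sequence and list (offset, codon, stops) hits."""
    cds = cds.upper()
    hits = []
    for start in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[start:start + 3]
        stops = reachable_stops(codon)
        if stops:
            hits.append((start, codon, sorted(stops)))
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from KLKB1.
    for hit in find_candidates("ATGCAATGGGCTCGATAA"):
        print(hit)
```

Under these assumptions, only CAA, CAG, CGA, and TGG are reported as candidates among the 64 possible codons, which matches the four codons named in the text.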

Abstract

The present disclosure generally relates to gene editing systems and methods for treating hereditary angioedema (HAE). Also disclosed are polynucleotides, vectors, cells, kits and compositions comprising the components of the gene editing systems, and methods related to treatment of HAE.

Description

GENE EDITING SYSTEMS AND METHODS FOR TREATING HEREDITARY ANGIOEDEMA
RELATED APPLICATION
This application claims priority to International Application No. PCT/CN2022/104999, filed on July 11, 2022, the content of which is incorporated by reference in its entirety.
FIELD OF DISCLOSURE
The present disclosure generally relates to gene editing systems and methods for treating hereditary angioedema (HAE). Also disclosed are polynucleotides, vectors, cells, kits, and compositions comprising components of the gene editing systems, and methods related to treatment of HAE.
SEQUENCE LISTING
This application contains a Sequence Listing electronically submitted as an XML file entitled “CU3106CST33WO-sql.xml” having a size of 822,087 bytes and created on July 11, 2023. The information contained in the Sequence Listing is incorporated by reference herein.
BACKGROUND
Hereditary angioedema (HAE) is a rare inherited disease that causes recurrent attacks of painful swelling in any part of the body. These swellings interfere with patients’ daily activities. Swelling of the throat is potentially life-threatening because it can lead to asphyxiation. Most cases of HAE are caused by C1 inhibitor (C1-INH) deficiency. C1-INH helps to regulate plasma kallikrein, so C1-INH deficiency and/or dysfunction can increase the activity of plasma kallikrein. Increased kallikrein activity leads to excess bradykinin, which triggers vascular leakage, causes blood vessels to release fluid, and results in localized swelling in an HAE attack.
C1-INH is encoded by the SERPING1 gene, and 450 known mutations in SERPING1 have been found to be associated with HAE. De Maat S, Hofman ZLM, Maas C. Hereditary angioedema: the plasma contact system out of control. J Thromb Haemost 16, 1674-85 (2018). Because the causative mutations are so numerous, correcting the SERPING1 gene mutation by mutation is not practical for the treatment of HAE. Therefore, there is a need for a therapeutic target other than the SERPING1 gene for the treatment of HAE.
SUMMARY
The present disclosure provides gene editing systems, polynucleotides, vectors, cells, compositions, kits, and methods to disrupt the KLKB1 gene, which encodes the precursor of plasma kallikrein. Plasma kallikrein cleaves the Lys-Arg and Arg-Ser bonds in human kininogen to release bradykinin. Thus, suppressing the expression of plasma kallikrein reduces bradykinin production.
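The cleavage step described above can be pictured with a small Python sketch. It is an illustration only, under loudly stated assumptions: the peptide `MKRPPGFSPFRSS` is a toy fragment built from the bradykinin nonapeptide (RPPGFSPFR) with short illustrative flanking residues, not the full kininogen sequence, and the rule "cut every Lys-Arg and Arg-Ser bond" is a deliberate simplification of protease specificity. The function name `cleave` is hypothetical.

```python
def cleave(peptide, bonds=(("K", "R"), ("R", "S"))):
    """Split a one-letter peptide string at every listed residue-residue bond.

    Simplified model: a bond (X, Y) is cut between X and Y wherever the
    pair occurs, with no regard to surrounding sequence context.
    """
    fragments = []
    start = 0
    for i in range(len(peptide) - 1):
        if (peptide[i], peptide[i + 1]) in bonds:
            fragments.append(peptide[start:i + 1])
            start = i + 1
    fragments.append(peptide[start:])
    return fragments


if __name__ == "__main__":
    # Toy fragment: illustrative flanks around the bradykinin nonapeptide.
    print(cleave("MKRPPGFSPFRSS"))
```

In this toy model the middle fragment is the released bradykinin peptide; with less plasma kallikrein available, fewer such cleavage events would occur.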
In some embodiments, the disruption of the KLKB1 gene prevents the release of bradykinin. In some embodiments, the disruption of the KLKB1 gene treats hereditary angioedema (HAE). In some embodiments, the disruption of the KLKB1 gene treats HAE caused by C1 inhibitor (C1-INH) deficiency and/or dysfunction.
In some embodiments, the present disclosure provides a gene editing system for disrupting the KLKB1 gene, wherein the gene editing system comprises a base editor and at least one guide RNA that is capable of binding to the KLKB1 gene.
In some embodiments, a highly specific base editor, the transformer base editor (tBE), is used to induce efficient and precise gene editing at genomic sites for disrupting the KLKB1 gene. A tBE is used with a combination of a main guide RNA (mgRNA) and a helper guide RNA (hgRNA), wherein the mgRNA and hgRNA are capable of binding to the KLKB1 gene.
In an aspect, the present disclosure provides a gene editing system comprising a main guide RNA (mgRNA) and a helper guide RNA (hgRNA), or at least one DNA polynucleotide encoding the mgRNA and/or the hgRNA, wherein the mgRNA comprises an mgRNA spacer and the hgRNA comprises an hgRNA spacer, wherein the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise the respective sequences as set forth in Table 3.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 81, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 82, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 83, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 02 and SEQ ID NO: 84, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 334, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 335, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 336, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 02 and SEQ ID NO: 337, respectively.
In some embodiments, the gene editing system disclosed herein comprises (1) an hgRNA comprising a first CRISPR motif, an hgRNA spacer, and a first protein-binding motif, or a DNA polynucleotide encoding the hgRNA, (2) an mgRNA comprising a second CRISPR motif and an mgRNA spacer, or a DNA polynucleotide encoding the mgRNA, (3) a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first CRISPR motif, (4) a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second CRISPR motif, and (5) a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide encoding the first fusion protein, wherein the nucleobase deaminase or the catalytic domain thereof and the first RNA binding domain are optionally connected by a linker, and wherein the first RNA binding domain binds to the first protein-binding motif, and wherein the first Cas protein and second Cas protein are the same or different.
In some embodiments, the gene editing system disclosed herein comprises (1) an hgRNA comprising a first CRISPR motif, an hgRNA spacer, and a first protein-binding motif, or a DNA polynucleotide encoding the hgRNA, (2) an mgRNA comprising a second CRISPR motif and an mgRNA spacer, or a DNA polynucleotide encoding the mgRNA, (3) a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first CRISPR motif, (4) a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second CRISPR motif, (5) a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide encoding the first fusion protein, wherein the nucleobase deaminase or the catalytic domain thereof and the first RNA binding domain are optionally connected by a linker, and wherein the first RNA binding domain binds to the first protein-binding motif, (6) a protease, or a polynucleotide encoding the protease, and (7) a nucleobase deaminase inhibitor domain, wherein the first Cas protein and second Cas protein are the same or different, wherein the nucleobase deaminase inhibitor domain is connected to the nucleobase deaminase or the catalytic domain thereof in the first fusion protein optionally by a linker, and wherein there is a cleavage site for the protease between the nucleobase deaminase inhibitor domain and the nucleobase deaminase or the catalytic domain thereof.
In some embodiments, the gene editing system disclosed herein comprises (1) an hgRNA comprising a first CRISPR motif, an hgRNA spacer, and a first protein-binding motif, or a DNA polynucleotide encoding the hgRNA, (2) an mgRNA comprising a second CRISPR motif and an mgRNA spacer, or a DNA polynucleotide encoding the mgRNA, (3) a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first CRISPR motif, (4) a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second CRISPR motif, (5) a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide encoding the first fusion protein, wherein the nucleobase deaminase or the catalytic domain thereof and the first RNA binding domain are optionally connected by a linker, and wherein the first RNA binding domain binds to the first protein-binding motif, (6) a protease, or a polynucleotide encoding the protease, (7) a nucleobase deaminase inhibitor domain, and (8) a second fusion protein comprising the protease and a second RNA binding domain, or a polynucleotide encoding the second fusion protein, wherein the first Cas protein and second Cas protein are the same or different, wherein the nucleobase deaminase inhibitor domain is connected to the nucleobase deaminase or the catalytic domain thereof in the first fusion protein optionally by a linker, and wherein there is a cleavage site for the protease between the nucleobase deaminase inhibitor domain and the nucleobase deaminase or the catalytic domain thereof, wherein the protease and the second RNA binding domain are optionally connected by a linker, wherein the mgRNA further comprises a second protein-binding motif, and wherein the second RNA binding domain binds to the second protein-binding motif.
In some embodiments, the protease is split into a first protease fragment and a second protease fragment, wherein the first or second protease fragment alone is not able to cleave the cleavage site.
In some embodiments, the gene editing system disclosed herein comprises (1) an hgRNA comprising a first CRISPR motif, an hgRNA spacer, and a first protein-binding motif, or a DNA polynucleotide encoding the hgRNA, (2) an mgRNA comprising a second CRISPR motif and an mgRNA spacer, or a DNA polynucleotide encoding the mgRNA, (3) a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first CRISPR motif, (4) a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second CRISPR motif, (5) a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide encoding the first fusion protein, wherein the nucleobase deaminase or the catalytic domain thereof and the first RNA binding domain are optionally connected by a linker, and wherein the first RNA binding domain binds to the first protein-binding motif, (6) a protease, or a polynucleotide encoding the protease, (7) a nucleobase deaminase inhibitor domain, (8) a second fusion protein comprising the first protease fragment and a second RNA binding domain, or a polynucleotide encoding the second fusion protein, wherein the first protease fragment and the second RNA binding domain are optionally connected by a linker, and (9) a third fusion protein comprising the second protease fragment and a third RNA binding domain, or a polynucleotide encoding the third fusion protein, wherein the second protease fragment and the third RNA binding domain are optionally connected by a linker, wherein the first Cas protein and second Cas protein are the same or different, wherein the nucleobase deaminase inhibitor domain is connected to the nucleobase deaminase or the catalytic domain thereof in the first fusion protein optionally by a linker, and wherein there is a cleavage site for the protease between the nucleobase deaminase inhibitor domain and the nucleobase deaminase or the catalytic domain thereof, wherein the mgRNA further comprises a second protein-binding motif and a third protein-binding motif, wherein the second RNA binding domain binds to the second protein-binding motif, and wherein the third RNA binding domain binds to the third protein-binding motif.
In some embodiments, the gene editing system disclosed herein comprises (1) an hgRNA comprising a first CRISPR motif, an hgRNA spacer, and a first protein-binding motif, or a DNA polynucleotide encoding the hgRNA, (2) an mgRNA comprising a second CRISPR motif and an mgRNA spacer, or a DNA polynucleotide encoding the mgRNA, (3) a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first CRISPR motif, (4) a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second CRISPR motif, (5) a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide encoding the first fusion protein, wherein the nucleobase deaminase or the catalytic domain thereof and the first RNA binding domain are optionally connected by a linker, and wherein the first RNA binding domain binds to the first protein-binding motif, (6) a protease, or a polynucleotide encoding the protease, (7) a nucleobase deaminase inhibitor domain, (8) a second fusion protein comprising the first protease fragment and a second RNA binding domain, or a polynucleotide encoding the second fusion protein, wherein the first protease fragment and the second RNA binding domain are optionally connected by a linker, and (9) a third fusion protein comprising the second protease fragment and a third RNA binding domain, or a polynucleotide encoding the third fusion protein, wherein the second protease fragment and the third RNA binding domain are optionally connected by a linker, wherein the first Cas protein and second Cas protein are the same or different, wherein the nucleobase deaminase inhibitor domain is connected to the nucleobase deaminase or the catalytic domain thereof in the first fusion protein optionally by a linker, and wherein there is a cleavage site for the protease between the nucleobase deaminase inhibitor domain and the nucleobase deaminase or the catalytic domain thereof, wherein the mgRNA further comprises a second protein-binding motif and a third protein-binding motif, wherein the second RNA binding domain binds to the second protein-binding motif, wherein the third RNA binding domain binds to the third protein-binding motif, wherein the second and third RNA binding domains are the same or different, and the second and third protein-binding motifs are the same or different.
In some embodiments, the gene editing system disclosed herein comprises (1) an hgRNA comprising a first CRISPR motif, an hgRNA spacer, and a first protein-binding motif, or a DNA polynucleotide encoding the hgRNA, (2) an mgRNA comprising a second CRISPR motif and an mgRNA spacer, or a DNA polynucleotide encoding the mgRNA, (3) a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first CRISPR motif, (4) a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second CRISPR motif, (5) a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide encoding the first fusion protein, wherein the nucleobase deaminase or the catalytic domain thereof and the first RNA binding domain are optionally connected by a linker, and wherein the first RNA binding domain binds to the first protein-binding motif, (6) a protease, or a polynucleotide encoding the protease, (7) a nucleobase deaminase inhibitor domain, and (8) a second fusion protein comprising the first protease fragment and a second RNA binding domain, or a polynucleotide encoding the second fusion protein, wherein the first Cas protein and second Cas protein are the same or different, wherein the nucleobase deaminase inhibitor domain is connected to the nucleobase deaminase or the catalytic domain thereof in the first fusion protein optionally by a linker, and wherein there is a cleavage site for the protease between the nucleobase deaminase inhibitor domain and the nucleobase deaminase or the catalytic domain thereof, wherein the first protease fragment and the second RNA binding domain are optionally connected by a linker, wherein the mgRNA further comprises a second protein-binding motif, and wherein the second RNA binding domain binds to the second protein-binding motif.
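The split-protease embodiments above can be read, very loosely, as a logical AND gate. The following Python sketch is a conceptual aid only and rests on assumptions that go beyond the text: it assumes that the deaminase stays inhibited until the inhibitor domain is cleaved off, and that cleavage requires both protease fragments to be present together at a site where the mgRNA is bound. The class and function names are hypothetical, and no kinetics, binding affinities, or spatial constraints are modeled.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class SiteState:
    """Hypothetical yes/no description of one genomic site."""
    hgrna_bound: bool        # hgRNA complex bound; recruits the inhibited deaminase fusion
    mgrna_bound: bool        # mgRNA complex bound; recruits protease fragment fusion(s)
    fragment1_present: bool  # first protease fragment available at the site
    fragment2_present: bool  # second protease fragment available at the site


def protease_reconstituted(state):
    """Assumption: neither fragment alone cleaves; both are needed at the mgRNA site."""
    return state.mgrna_bound and state.fragment1_present and state.fragment2_present


def deaminase_active(state):
    """Assumption: the deaminase edits only once its inhibitor domain has been cleaved."""
    return state.hgrna_bound and protease_reconstituted(state)


if __name__ == "__main__":
    # Enumerate all 16 combinations to show that only one permits editing.
    for flags in product([False, True], repeat=4):
        state = SiteState(*flags)
        print(state, "->", deaminase_active(state))
```

In this simplified picture, editing is permitted only when every condition holds at once, which is one way to think about why requiring two guide RNAs at the same locus could limit activity elsewhere.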
In some embodiments, the protease is a TEV protease, a TuMV protease, a PPV protease, a PVY protease, a ZIKV protease, or a WNV protease.
In some embodiments, the protease is a TEV protease. In some embodiments, the TEV protease comprises a sequence as set forth in SEQ ID NO: 590.
In some embodiments, the first TEV protease fragment comprises a sequence of SEQ ID NO: 591.
In some embodiments, the nucleobase deaminase inhibitor is an inhibitory domain of a nucleobase deaminase.
In some embodiments, the nucleobase deaminase inhibitor is an inhibitory domain of a cytidine deaminase.
In some embodiments, the inhibitory domain of a cytidine deaminase comprises an amino acid sequence as set forth in SEQ ID NO: 607 or SEQ ID NO: 608.
In some embodiments, the nucleobase deaminase is a cytidine deaminase.
In some embodiments, the cytidine deaminase is selected from the group consisting of APOBEC3B (A3B), APOBEC3C (A3C), APOBEC3D (A3D), APOBEC3F (A3F), APOBEC3G (A3G), APOBEC3H (A3H), APOBEC1 (A1), APOBEC3 (A3), APOBEC2 (A2), APOBEC4 (A4), and AICDA (AID).
In some embodiments, the cytidine deaminase comprises an amino acid sequence of any one of SEQ ID NOs: 765-800.
In some embodiments, the cytidine deaminase is a human or mouse cytidine deaminase.
In some embodiments, the catalytic domain of the cytidine deaminase is a mouse A3 cytidine deaminase domain 1 (mA3-CDA1) or a human A3B cytidine deaminase domain 2 (hA3B-CDA2).
In some embodiments, the first fusion protein further comprises a uracil glycosylase inhibitor (UGI).
In some embodiments, the first fusion protein further comprises a nuclear localization sequence (NLS).
In some embodiments, the Cas protein is Cas9, a dead Cas9 (dCas9), or a Cas9 nickase (nCas9) selected from the group consisting of SpCas9, FnCas9, St1Cas9, St3Cas9, NmCas9, SaCas9, AsCpf1, LbCpf1, FnCpf1, VQR SpCas9, EQR SpCas9, VRER SpCas9, SpCas9-NG, xSpCas9, RHA FnCas9, KKH SaCas9, NmeCas9, StCas9, CjCas9, AsCpf1, FnCpf1, SsCpf1, PcCpf1, BpCpf1, CmtCpf1, LiCpf1, PmCpf1, Pb3310Cpf1, Pb4417Cpf1, BsCpf1, EeCpf1, BhCas12b, AkCas12b, EbCas12b, LsCas12b, RfCas13d, LwaCas13a, PspCas13b, PguCas13b, and RanCas13b.
In some embodiments, the Cas protein comprises an amino acid sequence of any one of SEQ ID NOs: 713-764.
In some embodiments, the Cas protein is a nCas9. In some embodiments, the nCas9 protein is a nCas9-D10A protein. In some embodiments, the nCas9-D10A protein has an amino acid sequence of SEQ ID NO: 612.
In some embodiments, the first protein-binding RNA motif and the first RNA binding domain, the second protein-binding RNA motif and the second RNA binding domain, and the third protein-binding RNA motif and the third RNA binding domain, are each independently selected from the group consisting of an MS2 phage operator stem-loop and MS2 coat protein (MCP) or an RNA-binding section thereof; a BoxB and N22P or an RNA-binding section thereof; a telomerase Ku binding motif and Ku protein or an RNA-binding section thereof; a telomerase Sm7 binding motif and Sm7 protein or an RNA-binding section thereof; a PP7 phage operator stem-loop and PP7 coat protein (PCP) or an RNA-binding section thereof; a SfMu phage Com stem-loop and Com RNA binding protein or an RNA-binding section thereof; and a non-natural RNA aptamer and corresponding aptamer ligand or an RNA-binding section thereof.
In some embodiments of the gene editing system described herein, the mgRNA and/or the hgRNA comprises a dual-RNA structure.
In some embodiments, the dual-RNA structure is formed by a CRISPR RNA (crRNA) and a trans-activating crRNA (tracrRNA), wherein the crRNA comprises the spacer.
In some embodiments of the gene editing system described herein, the mgRNA comprises a mcrRNA and a first tracrRNA, and the mcrRNA comprises the mgRNA spacer, wherein the hgRNA comprises a hcrRNA and a second tracrRNA, and the hcrRNA comprises the hgRNA spacer, and wherein the first tracrRNA and the second tracrRNA are the same or different.
In some embodiments, the mcrRNA and the hcrRNA are SEQ ID NO: 805 and SEQ ID NO: 808, respectively; or SEQ ID NO: 806 and SEQ ID NO: 809, respectively; or SEQ ID NO: 807 and SEQ ID NO: 810, respectively; or SEQ ID NO: 812 and SEQ ID NO: 815, respectively; or SEQ ID NO: 813 and SEQ ID NO: 816, respectively; or SEQ ID NO: 814 and SEQ ID NO: 817, respectively.
In some embodiments, the tracrRNA is SEQ ID NO: 804 or 811.
In another aspect, the present disclosure provides a polynucleotide encoding the hgRNA and/or the mgRNA disclosed herein.
In another aspect, the present disclosure provides a polynucleotide encoding all components except the first and the second Cas protein in the gene editing system disclosed herein.
In another aspect, the present disclosure provides a kit comprising a polynucleotide encoding all components except the first and the second Cas protein in the gene editing system disclosed herein, and a polynucleotide encoding the first and/or second Cas protein in the gene editing system disclosed herein. In some embodiments, the first and the second Cas proteins are the same Cas protein.
In another aspect, the present disclosure provides a vector comprising the polynucleotide encoding the hgRNA and/or the mgRNA disclosed herein.
In another aspect, the present disclosure provides a vector comprising the polynucleotide encoding all components except the first and the second Cas protein in the gene editing system disclosed herein.
In some embodiments, the vector is a plasmid or a viral vector.
In some embodiments, the vector is a polycistronic vector.
In another aspect, the present disclosure provides a kit comprising the vector disclosed above, and a vector comprising the polynucleotide encoding the first and/or second Cas protein in the gene editing system disclosed herein.
In another aspect, the present disclosure provides a cell comprising the gene editing system disclosed herein.
In another aspect, the present disclosure provides a cell comprising the polynucleotide disclosed herein. In some embodiments, the cell further comprises a polynucleotide encoding the first and/or second Cas protein in the gene editing system disclosed herein.
In another aspect, the present disclosure provides a cell comprising the vector disclosed herein. In some embodiments, the cell further comprises a vector comprising a polynucleotide encoding the first and/or second Cas protein in the gene editing system disclosed herein.
In another aspect, the present disclosure provides a cell comprising the kit disclosed herein.
In some embodiments, the cell is a stem cell.
In some embodiments, the cell is a pluripotent stem cell.
In some embodiments, the cell is an embryonic stem cell (ESC).
In some embodiments, the cell is an induced pluripotent stem cell (iPSC).
In some embodiments, the cell is an endothelial cell.
In some embodiments, the cell is a primary cell.
In some embodiments, the cell is a differentiated cell.
In another aspect, the present disclosure provides a composition comprising the gene editing system disclosed herein.
In another aspect, the present disclosure provides a composition comprising the cell disclosed herein.
In another aspect, the present disclosure provides a method for disrupting a KLKB1 gene in a cell, comprising introducing into the cell the gene editing system disclosed herein.
In another aspect, the present disclosure provides a method for decreasing the expression of plasma kallikrein in a cell, comprising introducing into the cell the gene editing system disclosed herein.
In another aspect, the present disclosure provides a method for decreasing the expression of bradykinin in a cell, comprising introducing into the cell the gene editing system disclosed herein.
In another aspect, the present disclosure provides a method for decreasing vascular leakage regulated by a kinin B2 receptor (B2R) in a cell, comprising introducing into the cell the gene editing system disclosed herein. In some embodiments, the HAE is caused by C1-INH deficiency or dysfunction. In some embodiments, the HAE is HAE-1. In some embodiments, the HAE is HAE-2.
In some embodiments, the cell is a stem cell.
In some embodiments, the cell is a pluripotent stem cell.
In some embodiments, the cell is an induced pluripotent stem cell (iPSC) or an embryonic stem cell.
In some embodiments, the cell is an endothelial cell.
In some embodiments, the cell is a primary cell or a differentiated cell.
BRIEF DESCRIPTION OF THE FIGURES
Fig. 1 illustrates exemplary base editors that can be used in the gene editing systems disclosed herein. The various versions of base editors are denoted as V1, V2, V3, V4, and V5, with constructs denoted as tBE-V1-rA1, tBE-V2-rA1, tBE-V3-rA1, tBE-V4-rA1, tBE-V5-rA1, and tBE-V5-mA3. Fig. 1A shows schematic diagrams illustrating the construction and development of various versions of base editors. Fig. 1B shows interactions of molecular components in different versions of the base editors. Base editors of V2 to V5 illustrate different strategies to cleave mA3dCDI off. The dCDI domain could be cleaved off from APOBEC through a two-component interaction of the TEV site and a free TEV protease (V2), an N22p-fused TEV protease (V3), or a TEV protease reconstituted by an mgRNA-boxB (V4). In version 5 (V5) of the base editor, the dCDI is cleaved off from APOBEC through a three-component interaction of the TEV site, TEVn, and N22p-TEVc.
Fig. 2 illustrates editing efficiencies induced by a tBE gene editing system with pairs of mgRNA-KLKB1-1~4 and corresponding hgRNAs targeting human KLKB1 gene in human cells. Fig. 2A is a schematic diagram illustrating co-transfection of mgRNA-KLKB1-1~4 and corresponding hgRNA-KLKB1-1~4s with tBE-V5-mA3 and nCas9. Fig. 2B shows editing efficiency induced by tBE-V5-mA3 with indicated pairs of mgRNA/hgRNA at indicated sites. Fig. 2C shows base substitution frequency at each target site calculated by EditR analysis.
Fig. 3 illustrates editing efficiencies induced by a tBE gene editing system with pairs of mgRNA-KLKB1-5~6 and corresponding hgRNAs targeting human KLKB1 gene in human cells. Fig. 3A is a schematic diagram illustrating co-transfection of mgRNA-KLKB1-5~6 and corresponding hgRNA-KLKB1-5~6s with tBE-V5-mA3 and nCas9. Fig. 3B shows editing efficiency induced by tBE-V5-mA3 with indicated pairs of mgRNA/hgRNA at indicated sites. Fig. 3C shows base substitution frequency at each target site calculated by EditR analysis.
Fig. 4 illustrates relative KLKB1 protein expression induced by a tBE gene editing system with pairs of mgRNA and hgRNA targeting human KLKB1 gene in human cells. Fig. 4A shows KLKB1 protein expression level examined with western blot using GAPDH as the loading control. Fig. 4B shows relative densitometry representation of protein expression level examined in Fig. 4A.
Fig. 5 illustrates editing efficiencies induced by a tBE gene editing system in RNA electroporation delivery system in human cells. Fig. 5A is a schematic diagram illustrating co-transfection of mgRNA-KLKB1 and corresponding hgRNA-KLKB1 with mRNA of tBE-V5-mA3 and nCas9. Fig. 5B shows editing efficiency induced by tBE-V5-mA3 with indicated pairs of mgRNA/hgRNA at indicated sites. Fig. 5C shows base substitution frequency at each target site calculated by EditR analysis.
Fig. 6 illustrates editing efficiencies induced by a V5-LigoRNA-based editing system in RNA electroporation delivery system in human and mouse cells. Fig. 6A is a schematic diagram illustrating co-transfection of mcrRNA-KLKB1, corresponding hcrRNA-KLKB1 and tracrRNA with mRNA of tBE-V5-mA3 and nCas9. Fig. 6B shows editing efficiency induced by the V5-LigoRNA-based editing system at indicated sites in HepG2 cells. Fig. 6C shows base substitution frequency at the target sites calculated by EditR analysis. Fig. 6D shows editing efficiency induced by the V5-LigoRNA-based editing system at indicated sites in Hepa1-6 cells. Fig. 6E shows base substitution frequency at the target sites calculated by EditR analysis.
DETAILED DESCRIPTION
All publications, patents, and patent applications referred to herein are incorporated by reference in their entirety to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference in its entirety.
In the present disclosure, unless otherwise specified, the scientific and technical terms used herein have the meanings generally understood by a person skilled in the art. Although any methods and materials similar or equivalent to those described herein find use in the practice of the present disclosure, the preferred methods and materials are described herein. Accordingly, the terms defined herein are more fully described by reference to the Specification as a whole.
As used herein, the singular terms “a, ” “an, ” and “the” include the plural reference unless the context clearly indicates otherwise.
As used herein, “and/or” refers to and encompasses any and all possible combinations of one or more of the associated listed items, as well as the lack of combinations when interpreted in the alternative ( “or” ) . Moreover, the present invention also contemplates that in some embodiments of the invention, any feature or combination of features set forth herein can be excluded or omitted.
Unless the context requires otherwise, the terms “comprise, ” “comprises, ” and “comprising, ” or similar terms are intended to mean a non-exclusive inclusion, such that a recited list of elements or features does not include those stated or listed elements solely, but may include other elements or features that are not listed or stated.
Unless otherwise indicated, nucleic acids are written left to right in the 5' to 3' orientation, and amino acid sequences are written left to right in amino to carboxy orientation, respectively. A number “n”, when used in the context of an amino acid sequence, refers to the nth amino acid in the amino acid sequence counting from the amino end. For example, “amino acid 15” refers to the 15th amino acid in a certain amino acid sequence. For example, “R15” refers to the 15th amino acid, which is an arginine (R), in a certain amino acid sequence.
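By way of illustration only, the residue numbering convention described above can be sketched in Python. The function names `parse_residue` and `residue_matches` are hypothetical and are not part of the present disclosure; the sketch assumes one-letter codes for the 20 standard amino acids and 1-based numbering from the amino end.

```python
import re

# One-letter codes for the 20 standard amino acids (assumption: no
# non-standard residues are needed for this illustration).
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def parse_residue(label):
    """Split a label such as 'R15' into its letter and 1-based position."""
    match = re.fullmatch(r"([A-Z])(\d+)", label)
    if match is None or match.group(1) not in STANDARD_RESIDUES:
        raise ValueError(f"not a residue label: {label!r}")
    return match.group(1), int(match.group(2))


def residue_matches(sequence, label):
    """Return True if the n-th residue, counted from the amino end, matches."""
    letter, position = parse_residue(label)
    if position < 1 or position > len(sequence):
        return False
    # Sequences are written amino to carboxy, so position n is index n - 1.
    return sequence[position - 1] == letter
```

For instance, `residue_matches(seq, "R15")` is true only when the 15th character of `seq` is `R`.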
It is to be understood that this disclosure is not limited to the particular methodology, protocols, and reagents described, as these may vary, depending upon the context in which they are used by those skilled in the art.
As used herein, the term “about” will be understood by persons of ordinary skill in the art and will vary to some extent depending on the context in which it is used. In some embodiments, the term “about” when referring to a value is meant to encompass art-accepted variations. In some embodiments, the term “about” when referring to such values, is meant to encompass variations of ±20% or ±10%, more preferably ±5%, even more preferably ±1%, and still more preferably ±0.1% from the specified value, as such variations are appropriate in the context in which the term “about” is used.
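As a non-limiting sketch, the percentage ranges recited for the term “about” can be expressed as a simple tolerance check. The function name `is_about` and the default tolerance of 10% are assumptions chosen for illustration; the applicable tolerance depends on context, as stated above.

```python
def is_about(value, target, tolerance_percent=10.0):
    """Return True if value lies within +/- tolerance_percent of target.

    tolerance_percent is hypothetical here; the text recites 20, 10, 5,
    1, and 0.1 as possible choices depending on context.
    """
    allowed = abs(target) * tolerance_percent / 100.0
    return abs(value - target) <= allowed
```

Under this sketch, a 21-nt spacer would count as “about 20 nt” at a 10% tolerance, whereas a 23-nt spacer would count only at the 20% tolerance.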
As used herein, the terms “percent identity” and “% identity,” as applied to nucleic acid or polynucleotide sequences, refer to the percentage of residue matches between at least two nucleic acid or polynucleotide sequences aligned using a standardized algorithm. Such an algorithm may insert, in a standardized and reproducible way, gaps in the sequences being compared in order to optimize alignment between two sequences, and therefore achieve a more meaningful comparison of the two sequences.
Percent identity between nucleic acid or polynucleotide sequences may be determined using a suite of commonly used and freely available sequence comparison algorithms provided by the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST) (Altschul, S. F. et al. (1990) J. Mol. Biol. 215: 403-410), which is available from several sources, including the NCBI, Bethesda, Md., and on the Internet at http://www.ncbi.nlm.nih.gov/BLAST/.
Nucleic acid or polynucleotide sequences that do not show a high degree of identity may nevertheless encode similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes in a nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid sequences that all encode substantially the same protein. Specifically, degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al. (1991) Nucleic Acid Res 19: 5081; Ohtsuka et al. (1985) J Biol Chem 260: 2605-2608; Cassol et al. (1992); Rossolini et al. (1994) Mol Cell Probes 8: 91-98). The term “nucleic acid” refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides which have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. The term nucleic acid is used interchangeably with polynucleotide, and (in appropriate contexts) gene, cDNA, and mRNA encoded by a gene.
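The effect of codon degeneracy can be illustrated with a short Python sketch that translates two different DNA sequences using the standard genetic code. The example sequences and the helper names `translate` and `matching_positions` are hypothetical and chosen only for illustration; they are not sequences of the present disclosure.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA sequence; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def matching_positions(a, b):
    """Count positions at which two equal-length sequences agree."""
    return sum(1 for x, y in zip(a, b) if x == y)


# Two hypothetical sequences that use different codons for Leu, Ser, Arg.
seq_a = "ATGCTTTCTCGT"
seq_b = "ATGTTAAGCAGA"
```

In this sketch the two sequences agree at only 5 of 12 nucleotide positions, yet both translate to the same peptide, Met-Leu-Ser-Arg.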
As used herein, “percent (%) amino acid sequence identity” with respect to a peptide, polypeptide or protein sequence is defined as the percentage of amino acid residues in a candidate sequence that are identical with the amino acid residues in another peptide or polypeptide sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity, and not considering any conservative substitutions as part of the sequence identity. Percent amino acid sequence identity in the current disclosure is measured using BLAST software. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared.
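For illustration, the definition above can be sketched for two sequences that have already been aligned. This sketch is not BLAST and does not perform alignment; it assumes the aligned strings have equal length, that `-` denotes a gap, and that the denominator is the number of residues in the candidate sequence. The function name `percent_identity` is hypothetical.

```python
def percent_identity(candidate_aligned, reference_aligned, gap="-"):
    """Percent of candidate residues identical to the aligned reference.

    Assumptions: inputs are pre-aligned and of equal length; conservative
    substitutions are not counted as identical.
    """
    if len(candidate_aligned) != len(reference_aligned):
        raise ValueError("aligned sequences must have equal length")
    candidate_residues = 0
    identical = 0
    for cand, ref in zip(candidate_aligned, reference_aligned):
        if cand == gap:
            continue  # a gap in the candidate is not a candidate residue
        candidate_residues += 1
        if cand == ref:
            identical += 1
    if candidate_residues == 0:
        raise ValueError("candidate sequence contains no residues")
    return 100.0 * identical / candidate_residues
```

Other conventions for the denominator exist (for example, the alignment length), so values computed this way may differ from those reported by a given alignment tool.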
An amino acid substitution refers to the replacement of one amino acid in a polypeptide with another amino acid. Amino acid substitutions can be conservative or non-conservative substitutions. Exemplary substitutions are shown in Table 1. Amino acid substitutions may be introduced into a protein of interest and the products screened for a desired activity, for example, retained/improved biological activity.
Table 1

Amino acids may be grouped according to common side-chain properties:
(1) hydrophobic: Norleucine, Met, Ala, Val, Leu, Ile;
(2) neutral hydrophilic: Cys, Ser, Thr, Asn, Gln;
(3) acidic: Asp, Glu;
(4) basic: His, Lys, Arg;
(5) residues that influence chain orientation: Gly, Pro;
(6) aromatic: Trp, Tyr, Phe.
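The side-chain groupings listed above can be represented as a lookup structure, sketched below in Python. Treating an exchange within one group as a candidate conservative substitution is an assumption made for illustration and does not replace the exemplary substitutions of Table 1; the names `SIDE_CHAIN_GROUPS`, `group_of`, and `same_group` are hypothetical, and “Nle” is used as the abbreviation for norleucine.

```python
# Groups (1) to (6) as listed in the text, keyed by a short label.
SIDE_CHAIN_GROUPS = {
    "hydrophobic": {"Nle", "Met", "Ala", "Val", "Leu", "Ile"},
    "neutral hydrophilic": {"Cys", "Ser", "Thr", "Asn", "Gln"},
    "acidic": {"Asp", "Glu"},
    "basic": {"His", "Lys", "Arg"},
    "chain orientation": {"Gly", "Pro"},
    "aromatic": {"Trp", "Tyr", "Phe"},
}


def group_of(residue):
    """Return the group label for a three-letter residue code."""
    for label, members in SIDE_CHAIN_GROUPS.items():
        if residue in members:
            return label
    raise KeyError(f"residue not in any listed group: {residue!r}")


def same_group(original, replacement):
    """True if both residues fall in the same side-chain group."""
    return group_of(original) == group_of(replacement)
```

Under this sketch, a Leu-to-Ile exchange stays within the hydrophobic group, whereas a Lys-to-Glu exchange moves from the basic group to the acidic group.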
As used herein, the term “polypeptide” is intended to encompass a singular “polypeptide” as well as plural “polypeptides,” and refers to a molecule composed of monomers (amino acids) linearly linked by amide bonds (also known as peptide bonds). The term “polypeptide” refers to any chain or chains of two or more amino acids, and does not refer to a specific length of the product. Thus, “peptides,” “protein”, or any other term used to refer to a chain or chains of two or more amino acids, are included within the definition of “polypeptide,” and the term “polypeptide” may be used instead of, or interchangeably with any of these terms. The term “polypeptide” is also intended to refer to the products of post-expression modifications of the polypeptide, including without limitation glycosylation, acetylation, phosphorylation, amidation, derivatization by known protecting/blocking groups, proteolytic cleavage, or modification by non-naturally occurring amino acids. A polypeptide may be derived from a natural biological source or produced by recombinant technology, but is not necessarily translated from a designated nucleic acid sequence. It may be generated in any manner, including by chemical synthesis.
As used herein, the term “encode” or “encoding” as it is applied to polynucleotides refers to a polynucleotide which is said to “encode” a polypeptide if, in its native state or when manipulated by methods well known to those skilled in the art, it can be transcribed and/or translated to produce the mRNA for the polypeptide and/or a fragment thereof. The antisense strand is the complement of such a nucleic acid, and the encoding sequence can be deduced therefrom.
A “guide RNA” (gRNA) refers to a synthetic or expressed RNA sequence that comprises a CRISPR binding motif and a spacer. In some embodiments, the guide RNA is a single guide RNA. In some embodiments, the guide RNA is a dual-RNA structure. In some embodiments, the guide RNA is a dual-RNA structure formed by a ligand-bound CRISPR RNA (crRNA) and a trans-activating crRNA (tracrRNA) . In some embodiments, the guide RNA is a LigoRNA. A “spacer” is a DNA-targeting motif, which is a sequence that is complementary to a target specific DNA region. In some embodiments, the guide RNA is a crRNA-tracrRNA dual RNA structure, and the crRNA comprises the spacer. The CRISPR binding motif of a guide RNA can bind to a Cas enzyme and DNA-targeting motif of the gRNA can guide the complex to a specific target location on a DNA. In some embodiments, the guide RNA is a crRNA-tracrRNA dual RNA structure, and the base-pair structure formed by the crRNA and the tracrRNA comprises the CRISPR binding motif. A guide RNA may further comprise one or more other motifs, such as one or more protein-binding motifs, or the like.
As used herein, a “fusion protein” is a protein comprising at least two polypeptides that have been joined as a single polypeptide. For example, a fusion protein can comprise two domains that are encoded by separate genes that have been joined so that they are transcribed and translated as a single unit, producing a single polypeptide. In some embodiments, the at least two domains are fused together directly. In some embodiments, the domains are connected by one or more linkers.
The term “genetic modification” and its grammatical equivalents, as used herein can refer to one or more alterations of a nucleic acid, e.g., the nucleic acid within an organism’s genome. For example, genetic modification can refer to alterations, additions, and/or deletion of genes or portions of genes or other nucleic acid sequences. A genetically modified cell can also refer to a cell with an added, deleted, and/or altered gene or portion of a gene. A genetically modified cell can also refer to a cell with an added nucleic acid sequence that is not a gene or gene portion. Genetic modifications include, for example, both transient knock-in or knock-down mechanisms, and mechanisms that result in permanent knock-in, knock-down, or knock-out of target genes or portions of genes or nucleic acid sequences. Genetic modifications include, for example, both transient knock-in and mechanisms that result in permanent knock-in of nucleic acids sequences. Genetic modifications also include, for example, reduced or increased transcription, reduced or increased mRNA stability, reduced or increased translation, and reduced or increased protein stability.
As used herein, a composition refers to any mixture of two or more products, substances, or compounds, including cells.
The term “subject” means any animal such as a mammal, e.g., a human.
As used herein, the term “treat, ” “treating, ” or “treatment” refers to ameliorating a disease or disorder, e.g., slowing or arresting or reducing the development of the disease or disorder or reducing at least one of the clinical symptoms thereof. For example, in some embodiments, ameliorating a disease or disorder can include obtaining a beneficial or desired clinical result that includes, but is not limited to, any one or more of: alleviation of one or more symptoms, diminishment of extent of disease, preventing or delaying spread of disease, preventing or delaying recurrence of disease, delay or slowing of disease progression, amelioration of the disease state, inhibiting or eliminating the disease or progression of the disease, inhibiting or slowing the disease or its progression, arresting its development, and remission (whether partial or total) .
Hereditary angioedema (HAE)
Hereditary angioedema is a disorder characterized by recurrent episodes of severe swelling (angioedema). The most common areas of the body to develop swelling are the limbs, face, intestinal tract, and airway. Minor trauma or stress may trigger an attack, but swelling often occurs without a known trigger. Episodes involving the intestinal tract cause severe abdominal pain, nausea, and vomiting. Swelling in the airway can restrict breathing and lead to life-threatening obstruction of the airway. There are three types of hereditary angioedema, called types I, II, and III (HAE-1, HAE-2, and HAE-3), which can be distinguished by their underlying causes and levels of a protein called C1 inhibitor (C1-INH) in the blood. HAE-1, HAE-2, and HAE-3 can be characterized by low C1-INH serum levels (HAE-1), C1-INH dysfunction (HAE-2), and normal C1-INH serum levels and function (HAE-3). The different types have similar signs and symptoms. Hereditary angioedema is estimated to affect 1 in 50,000 people. HAE-1 is the most common, accounting for 85 percent of cases. HAE-2 occurs in 15 percent of cases, and HAE-3 is very rare.
The uncontrolled activity of the plasma contact system forms the basis for the pathological tissue swelling in C1-INH-related HAE. The plasma contact system designates a group of serine proteases and their substrates that assemble on surfaces of circulating blood cells and vessel walls. It is composed of the serine protease zymogens factor XII (FXII) , factor XI (FXI) , plasma prekallikrein (PPK) and the substrate of plasma kallikrein (PK) , high molecular weight kininogen (HK) . The contact system is started by FXII binding to “contact” -activators including negatively charged surfaces, such as the silicate kaolin and high molecular weight dextran sulfate (DXS) in vitro or platelet polyphosphate (polyP) and mast cell heparin in vivo.
Human plasma prekallikrein (PPK) is the precursor of plasma kallikrein (PK), the serine protease that liberates bradykinin (BK) from high molecular weight kininogen (HK). Additionally, PK cleaves FXII to generate active FXII (FXIIa). PPK is encoded by the KLKB1 gene (e.g., SEQ ID NO: 588) located on chromosome 4 in humans. Human PPK (e.g., SEQ ID NO: 587, UniProtKB-P03952) is mostly synthesized in the liver and secreted into the bloodstream as a 619 amino acid single chain glycoprotein with five N-linked glycosylations. Two differentially glycosylated PPK forms (85 and 88 kDa) exist and circulate with a total plasma concentration of 35–50 μg/ml (350–500 nM). Most of PPK circulates in a non-covalently bound complex with HK (about 75% in complex and 25% in free form). PPK is activated to PK by limited proteolysis at the peptide bond R371–I372, leading to the heavy (amino acids 1–371) and light (amino acids 372–619) chain fragments. Both chains remain connected by a disulfide bond spanning C364–C484. The N-terminal heavy chain is composed of four apple domains (A1-A4), wherein A2 contains the major HK binding sites with some contribution of A1 and A4. Each apple domain is stabilized by three disulfide bridges (four in Apple 4) and the entire PPK molecule has 18 disulfide bonds. The catalytic triad is composed of H415, D464 and S559 within the C-terminal light chain of the protein. (Weidmann et al., The plasma contact system, a protease cascade at the nexus of inflammation, coagulation and immunity, Molecular Cell Research, 2017).
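The chain boundaries recited above can be checked with a short Python sketch. Only the residue numbers stated in the preceding paragraph are used; no sequence is embedded, and the names `HEAVY_CHAIN`, `LIGHT_CHAIN`, and `chain_of` are hypothetical.

```python
# Residue ranges as stated in the text (1-based, inclusive):
# heavy chain 1-371, light chain 372-619, cleavage between R371 and I372.
PPK_LENGTH = 619
HEAVY_CHAIN = range(1, 372)
LIGHT_CHAIN = range(372, PPK_LENGTH + 1)

# Positions named in the text.
CATALYTIC_TRIAD = {"H415": 415, "D464": 464, "S559": 559}
INTERCHAIN_DISULFIDE = (364, 484)


def chain_of(position):
    """Return 'heavy' or 'light' for a 1-based residue position."""
    if position in HEAVY_CHAIN:
        return "heavy"
    if position in LIGHT_CHAIN:
        return "light"
    raise ValueError(f"position outside 1-{PPK_LENGTH}: {position}")
```

Consistent with the description, this places all three catalytic triad residues in the light chain, and the C364–C484 disulfide bond links a heavy-chain cysteine to a light-chain cysteine.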
C1-INH is encoded by the SERPING1 gene, which is located on chromosome 11. This 105-kDa glycoprotein (seven N-linked and eight O-linked glycans) is the main inhibitor of the classical complement enzymes C1r and C1 esterase (C1s). The complement component C1 is a protein complex involved in the complement system, which is composed of one molecule of C1q, two molecules of C1r, and two molecules of C1s. C1-INH is also the primary inhibitor of the contact factors. It is responsible for 93% of the inhibition of FXIIa in plasma. Similarly, C1-INH is responsible for 52% of the inhibition of PK and 47% of the inhibition of activated FXI (FXIa). C1-INH also modestly inhibits plasmin, but the physiological relevance of this inhibition may be limited.
By acting as an inhibitor of PK, C1-INH thereby limits bradykinin release from HK, because cleavage of HK by PK leads to the release of bradykinin (e.g., SEQ ID NO: 589). PK could cleave Lys-Arg and Arg-Ser bonds in HK to release bradykinin. Bradykinin is recognized by the kinin B2 receptor (B2R), which is constitutively expressed on vascular endothelial cells. Activation of B2R, a G-protein-coupled receptor, leads to vascular leakage through induction of endothelial cell contractility, uncoupling of endothelial cell junctions, and production of nitric oxide and prostacyclin. After activation, B2R is internalized, leading to (temporary) desensitization of the tissue to bradykinin. (Matt et al., Hereditary angioedema: the plasma contact system out of control, J. Thrombosis and Haemostasis, 2018)
In C1-INH-related HAE (HAE-1 and HAE-2), deficiency or dysfunction of C1-INH leads to increased PK activity. In order to treat HAE, the present disclosure provides gene editing systems and methods that aim to disrupt PK expression by targeting the KLKB1 gene, thereby decreasing bradykinin release and B2R activation, and ultimately reducing vascular leakage.
Gene Editing Systems
The safety and efficiency of gene editing tools are of great importance in clinical applications. Previous studies have reported that the DSBs induced by Cas9 nuclease can activate a p53-mediated DDR pathway and then lead to cell death. Moreover, APOBEC/AID family members can trigger C-to-T base substitutions in single-stranded DNA (ssDNA) regions, which are formed randomly during various cellular processes including DNA replication, repair and transcription. Thus, the specificity of previous base editing systems is compromised, limiting the applications of base editors (BEs) for therapeutic purposes.
In some embodiments, the present disclosure provides highly specific gene editing systems, polynucleotides, vectors, cells, compositions, kits, and methods to disrupt the KLKB1 gene, which encodes the precursor of plasma kallikrein. Plasma kallikrein could cleave Lys-Arg and Arg-Ser bonds in human kininogen to release bradykinin. Thus, suppressing the expression of plasma kallikrein reduces bradykinin production.
In some embodiments, the disruption of the KLKB1 gene prevents or reduces the release of bradykinin. In some embodiments, the disruption of the KLKB1 gene leads to the suppression of C1-INH deficiency and/or dysfunction. In some embodiments, the disruption of the KLKB1 gene treats HAE caused by C1 inhibitor (C1-INH) deficiency and/or dysfunction. In some embodiments, the disruption of the KLKB1 gene treats hereditary angioedema (HAE) . In some embodiments, the disruption of the KLKB1 gene treats HAE-1 and/or HAE-2.
In some embodiments, the present disclosure provides a gene editing system for disrupting the KLKB1 gene, wherein the gene editing system comprises a base editor and at least one guide RNA that is capable of binding to the KLKB1 gene.
In some embodiments, a highly specific base editor, transformer base editor (tBE) , is used to induce efficient and precise gene editing at genomic sites for disrupting the KLKB1 gene. A tBE is used with a combination of main guide RNA (mgRNA) and helper guide RNA (hgRNA) , wherein the mgRNA and hgRNA are capable of binding to the KLKB1 gene.
In some embodiments, a base editor as used herein is a cytosine base editor (CBE) , which comprises a combination of a CRISPR system and cytidine deaminase. A CBE effectuates a programmable cytosine to thymine (C-to-T) substitution. Because the base editing process does not depend on the generation of DNA double strand break (DSB) , unwanted nucleotide insertions/deletions (indels) or DNA damage responses (DDRs) can be largely avoided.
In some embodiments, a highly specific base editing system, transformer base editor (tBE) , is used, which can edit cytosine in target regions with no observable off-target mutations. In some embodiments, the tBE is any one of the base editors described in WO2020156575A1, incorporated herein by reference in its entirety. For instance, the tBE can be any base editor as illustrated in Fig. 1.
In some embodiments, the transformer base editor (tBE) system comprises a cytidine deaminase inhibitor (dCDI) domain and a split-TEV protease (e.g., as illustrated in Fig. 1, V5). The tBE remains inactive at off-target sites with a cleavable fusion of dCDI domain and thus eliminates or reduces unintended off-target mutations. When binding at on-target sites, the tBE is transformed to cleave off the dCDI domain and catalyzes targeted deamination for precise editing. In some embodiments, the tBE uses one mgRNA (normally about 20 nt) to bind at the target genomic site and one helper guide RNA (hgRNA, normally about 10 to about 20 nt) to bind at a nearby region (preferably upstream to the target genomic site). The binding of two gRNAs can guide the components of tBE system to correctly assemble at the target genomic site for base editing. In some embodiments, tBE can specifically edit cytosine in target regions with no observable off-target mutations. For example, the tBE can specifically edit cytosine in a target region to induce a premature stop codon to repress KLKB1 protein expression or break the GU-AG rule to disrupt a splicing site.
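How a single C-to-T substitution can introduce a premature stop codon may be sketched as follows. This Python example is an illustration only: it scans an in-frame coding sequence for codons in which one C-to-T change on the coding strand, or one C-to-T change on the opposite strand (read as G-to-A on the coding strand), yields TAA, TAG, or TGA. It ignores the editing window, PAM requirements, and guide RNA placement, and the function name `stop_creating_edits` and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_creating_edits(cds):
    """List single-base edits that turn a sense codon into a stop codon.

    Each hit is (1-based position, original codon, edited codon, strand),
    where strand names the strand on which the cytosine is deaminated.
    """
    cds = cds.upper()
    hits = []
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for start in range(0, usable, 3):
        codon = cds[start:start + 3]
        if codon in STOP_CODONS:
            continue
        for offset, base in enumerate(codon):
            if base == "C":
                new_base, strand = "T", "coding"
            elif base == "G":
                # C-to-T on the opposite strand reads as G-to-A here.
                new_base, strand = "A", "template"
            else:
                continue
            edited = codon[:offset] + new_base + codon[offset + 1:]
            if edited in STOP_CODONS:
                hits.append((start + offset + 1, codon, edited, strand))
    return hits
```

For the hypothetical input `"ATGCAGTGGAAA"`, the sketch reports the CAG-to-TAG change at position 4 and the TGG-to-TAG and TGG-to-TGA changes at positions 8 and 9.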
In some embodiments, the tBE system is used to disrupt a KLKB1 gene, which leads to the suppression of C1-INH deficiency and/or dysfunction, for the treatment of HAE. The base editors and base editing methods described herein can be applied to perform high-specificity and high-efficiency base editing in the genome of various eukaryotes. In some embodiments, the tBE comprises a Cas9 nickase (D10A), which is less toxic to cells than Cas9 nuclease, because Cas9 nickase activates a lower level of p53-mediated DDR. Besides, tBE achieves highly specific and efficient base editing at most sites.
In an aspect, the present disclosure provides a gene editing system comprising a main guide RNA (mgRNA) and a helper guide RNA (hgRNA) , or at least one DNA polynucleotide encoding the mgRNA and/or the hgRNA, wherein the mgRNA comprises an mgRNA spacer of about 20 nucleotides (about 20 nt) that binds to a target site on a KLKB1 gene and the hgRNA comprises an hgRNA spacer of about 10 to about 20 nt that binds to a site that is close to the target site that the mgRNA spacer binds to.
In some embodiments, the gene editing system comprises an mgRNA comprising an mgRNA spacer selected from SEQ ID NOs: 1 to 80, and an hgRNA comprising an hgRNA spacer of about 10 to about 20 nt, e.g., 7 nt to 23 nt, 8 nt to 22 nt, 9 nt to 21 nt, and 10 nt to 20 nt, that binds to a site close to the target site that the mgRNA spacer binds to. In some embodiments, the gene editing system comprises an hgRNA comprising an hgRNA spacer selected from SEQ ID NOs: 81-586. In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise the sequences as set forth in Table 2.
As shown in Table 2, the mgRNA having the pairing code of “1” can be used in combination with any one of the hgRNAs having the pairing codes of “1-1,” “1-2,” “1-3,” “1-4,” “1-5,” and “1-6.” The mgRNA having the pairing code of “2” can be used in combination with any one of the hgRNAs having the pairing codes of “2-1” and “2-2,” and so on and so forth. As a further example, any appropriate fragment of the 20-nt sequences of SEQ ID NOs: 81-333, e.g., having 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, or 19 nucleotides, can be used as an hgRNA spacer in the gene editing system disclosed herein. SEQ ID NOs: 334-586 are exemplary hgRNA spacers having 10 nucleotides.
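The code scheme and the fragment lengths described above can be sketched in Python as follows. The sketch assumes that a “fragment” is a contiguous subsequence of the 20-nt spacer and that hgRNA codes have the form “1-3”, where the part before the hyphen is the code of the matching mgRNA. The placeholder spacer is hypothetical and is not a sequence from Table 2; the function names are likewise hypothetical.

```python
def mgrna_code_for(hgrna_code):
    """Return the mgRNA code that an hgRNA code such as '1-3' pairs with."""
    main, sep, index = hgrna_code.partition("-")
    if not (sep and main.isdigit() and index.isdigit()):
        raise ValueError(f"not an hgRNA code: {hgrna_code!r}")
    return main


def spacer_fragments(spacer, min_len=7, max_len=19):
    """Enumerate contiguous fragments as (1-based start, fragment) pairs."""
    return [
        (start + 1, spacer[start:start + length])
        for length in range(min_len, max_len + 1)
        for start in range(len(spacer) - length + 1)
    ]


# Hypothetical 20-nt placeholder, used only to count fragments.
PLACEHOLDER_SPACER = "ACGTTGCAAGCTTAGGCATC"
```

For a 20-nt spacer there are 21 minus k contiguous fragments of length k, which gives 104 fragments in total for lengths 7 through 19, of which 11 have 10 nucleotides.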
Table 2





















In some embodiments, the present disclosure provides a gene editing system comprising a main guide RNA (mgRNA) and a helper guide RNA (hgRNA), or at least one DNA polynucleotide encoding the mgRNA and/or the hgRNA, wherein the mgRNA comprises an mgRNA spacer and the hgRNA comprises an hgRNA spacer, wherein the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise the sequences respectively as set forth in Table 3.
Table 3 Combinations of mgRNA spacer and hgRNA spacer













In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 81, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 82, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 83, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 02 and SEQ ID NO: 84, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 334, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 335, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 01 and SEQ ID NO: 336, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 02 and SEQ ID NO: 337, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 03 and SEQ ID NO: 85, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 03 and SEQ ID NO: 338, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 04 and SEQ ID NO: 86, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 04 and SEQ ID NO: 87, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 04 and SEQ ID NO: 88, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 04 and SEQ ID NO: 339, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 04 and SEQ ID NO: 340, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 04 and SEQ ID NO: 341, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 05 and SEQ ID NO: 91, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 05 and SEQ ID NO: 92, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 05 and SEQ ID NO: 93, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 05 and SEQ ID NO: 344, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 05 and SEQ ID NO: 345, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 05 and SEQ ID NO: 346, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 06 and SEQ ID NO: 94, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 06 and SEQ ID NO: 95, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 06 and SEQ ID NO: 96, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 06 and SEQ ID NO: 347, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 06 and SEQ ID NO: 348, respectively.
In some embodiments, the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise SEQ ID NO: 06 and SEQ ID NO: 349, respectively.
In some embodiments, the gene editing system disclosed herein comprises (1) the hgRNA comprising a first CRISPR motif, the hgRNA spacer, and a first protein-binding motif, or a DNA polynucleotide encoding the hgRNA, (2) the mgRNA comprising a second CRISPR motif and the mgRNA spacer, or a DNA polynucleotide encoding the mgRNA, (3) a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first CRISPR motif, (4) a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second CRISPR motif, and (5) a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide encoding the first fusion protein, wherein the nucleobase deaminase or the catalytic domain thereof and the first RNA binding domain are optionally connected by a linker, wherein the first RNA binding domain binds to the first protein-binding motif, and wherein the first Cas protein and second Cas protein are the same or different.
In some embodiments, the gene editing system disclosed herein comprises (1) the hgRNA comprising a first CRISPR motif, the hgRNA spacer, and a first protein-binding motif, or a DNA polynucleotide encoding the hgRNA, (2) the mgRNA comprising a second CRISPR motif and the mgRNA spacer, or a DNA polynucleotide encoding the mgRNA, (3) a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first CRISPR motif, (4) a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second CRISPR motif, (5) a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide encoding the first fusion protein, wherein the nucleobase deaminase or the catalytic domain thereof and the first RNA binding domain are optionally connected by a linker, and wherein the first RNA binding domain binds to the first protein-binding motif, (6) a protease, or a polynucleotide encoding the protease, and (7) a nucleobase deaminase inhibitor domain, wherein the first Cas protein and second Cas protein are the same or different, wherein the nucleobase deaminase inhibitor domain is connected to the nucleobase deaminase or the catalytic domain thereof in the first fusion protein optionally by a linker, and wherein there is a cleavage site for the protease between the nucleobase deaminase inhibitor domain and the nucleobase deaminase or the catalytic domain thereof.
In some embodiments, the gene editing system disclosed herein comprises (1) the hgRNA comprising a first CRISPR motif, the hgRNA spacer, and a first protein-binding motif, or a DNA polynucleotide encoding the hgRNA, (2) the mgRNA comprising a second CRISPR motif and the mgRNA spacer, or a DNA polynucleotide encoding the mgRNA, (3) a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first CRISPR motif, (4) a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second CRISPR motif, (5) a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide encoding the first fusion protein, wherein the nucleobase deaminase or the catalytic domain thereof and the first RNA binding domain are optionally connected by a linker, and wherein the first RNA binding domain binds to the first protein-binding motif, (6) a protease, or a polynucleotide encoding the protease, (7) a nucleobase deaminase inhibitor domain, and (8) a second fusion protein comprising the protease and a second RNA binding domain, or a polynucleotide encoding the second fusion protein, wherein the first Cas protein and second Cas protein are the same or different, wherein the nucleobase deaminase inhibitor domain is connected to the nucleobase deaminase or the catalytic domain thereof in the first fusion protein optionally by a linker, wherein there is a cleavage site for the protease between the nucleobase deaminase inhibitor domain and the nucleobase deaminase or the catalytic domain thereof, wherein the protease and the second RNA binding domain are optionally connected by a linker, wherein the mgRNA further comprises a second protein-binding motif, and wherein the second RNA binding domain binds to the second protein-binding motif.
In some embodiments, the protease is split into a first protease fragment and a second protease fragment, wherein the first or second protease fragment alone is not able to cleave the cleavage site.
In some embodiments, the gene editing system disclosed herein comprises (1) the hgRNA comprising a first CRISPR motif, the hgRNA spacer, and a first protein-binding motif, or a DNA polynucleotide encoding the hgRNA, (2) the mgRNA comprising a second CRISPR motif and the mgRNA spacer, or a DNA polynucleotide encoding the mgRNA, (3) a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first CRISPR motif, (4) a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second CRISPR motif, (5) a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide encoding the first fusion protein, wherein the nucleobase deaminase or the catalytic domain thereof and the first RNA binding domain are optionally connected by a linker, and wherein the first RNA binding domain binds to the first protein-binding motif, (6) a protease, or a polynucleotide encoding the protease, (7) a nucleobase deaminase inhibitor domain, (8) a second fusion protein comprising the first protease fragment and a second RNA binding domain, or a polynucleotide encoding the second fusion protein, wherein the first protease fragment and the second RNA binding domain are optionally connected by a linker, and (9) a third fusion protein comprising the second protease fragment and a third RNA binding domain, or a polynucleotide encoding the third fusion protein, wherein the second protease fragment and the third RNA binding domain are optionally connected by a linker, wherein the first Cas protein and second Cas protein are the same or different, wherein the nucleobase deaminase inhibitor domain is connected to the nucleobase deaminase or the catalytic domain thereof in the first fusion protein optionally by a linker, wherein there is a cleavage site for the protease between the nucleobase deaminase inhibitor domain and the nucleobase deaminase or the catalytic domain thereof, wherein the mgRNA further comprises a second protein-binding motif and a third protein-binding motif, wherein the second RNA binding domain binds to the second protein-binding motif, and wherein the third RNA binding domain binds to the third protein-binding motif.
In some embodiments, the gene editing system disclosed herein comprises (1) the hgRNA comprising a first CRISPR motif, the hgRNA spacer, and a first protein-binding motif, or a DNA polynucleotide encoding the hgRNA, (2) the mgRNA comprising a second CRISPR motif and the mgRNA spacer, or a DNA polynucleotide encoding the mgRNA, (3) a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first CRISPR motif, (4) a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second CRISPR motif, (5) a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide encoding the first fusion protein, wherein the nucleobase deaminase or the catalytic domain thereof and the first RNA binding domain are optionally connected by a linker, and wherein the first RNA binding domain binds to the first protein-binding motif, (6) a protease, or a polynucleotide encoding the protease, (7) a nucleobase deaminase inhibitor domain, (8) a second fusion protein comprising the first protease fragment and a second RNA binding domain, or a polynucleotide encoding the second fusion protein, wherein the first protease fragment and the second RNA binding domain are optionally connected by a linker, and (9) a third fusion protein comprising the second protease fragment and a third RNA binding domain, or a polynucleotide encoding the third fusion protein, wherein the second protease fragment and the third RNA binding domain are optionally connected by a linker, wherein the first Cas protein and second Cas protein are the same or different, wherein the nucleobase deaminase inhibitor domain is connected to the nucleobase deaminase or the catalytic domain thereof in the first fusion protein optionally by a linker, wherein there is a cleavage site for the protease between the nucleobase deaminase inhibitor domain and the nucleobase deaminase or the catalytic domain thereof, wherein the mgRNA further comprises a second protein-binding motif and a third protein-binding motif, wherein the second RNA binding domain binds to the second protein-binding motif, wherein the third RNA binding domain binds to the third protein-binding motif, and wherein the second and third RNA binding domains are the same or different, and the second and third protein-binding motifs are the same or different.
In some embodiments, the gene editing system disclosed herein comprises (1) the hgRNA comprising a first CRISPR motif, the hgRNA spacer, and a first protein-binding motif, or a DNA polynucleotide encoding the hgRNA, (2) the mgRNA comprising a second CRISPR motif and the mgRNA spacer, or a DNA polynucleotide encoding the mgRNA, (3) a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first CRISPR motif, (4) a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second CRISPR motif, (5) a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide encoding the first fusion protein, wherein the nucleobase deaminase or the catalytic domain thereof and the first RNA binding domain are optionally connected by a linker, and wherein the first RNA binding domain binds to the first protein-binding motif, (6) a protease, or a polynucleotide encoding the protease, (7) a nucleobase deaminase inhibitor domain, and (8) a second fusion protein comprising the first protease fragment and a second RNA binding domain, or a polynucleotide encoding the second fusion protein, wherein the first Cas protein and second Cas protein are the same or different, wherein the nucleobase deaminase inhibitor domain is connected to the nucleobase deaminase or the catalytic domain thereof in the first fusion protein optionally by a linker, wherein there is a cleavage site for the protease between the nucleobase deaminase inhibitor domain and the nucleobase deaminase or the catalytic domain thereof, wherein the first protease fragment and the second RNA binding domain are optionally connected by a linker, wherein the mgRNA further comprises a second protein-binding motif, and wherein the second RNA binding domain binds to the second protein-binding motif.
A “protease” refers to an enzyme that catalyzes proteolysis. A “cleavage site for a protease” refers to a short peptide that the protease recognizes and within which it creates a proteolytic cleavage. Non-limiting examples of proteases include TEV protease, TuMV protease, PPV protease, PVY protease, ZIKV protease, and WNV protease. The protein sequences of example proteases and their corresponding cleavage sites are provided in Table 4.
Table 4 Exemplary proteases and their cleavage sites

In some embodiments, the protease is a TEV protease, a TuMV protease, a PPV protease, a PVY protease, a ZIKV protease, or a WNV protease.
In some embodiments, the protease cleavage site is a self-cleaving peptide, such as the 2A peptides. “2A peptides” are 18-22 amino-acid-long viral oligopeptides that mediate “cleavage” of polypeptides during translation in eukaryotic cells. The designation “2A” refers to a specific region of the viral genome, and different viral 2As have generally been named after the virus they were derived from. The first discovered 2A was F2A (foot-and-mouth disease virus), after which E2A (equine rhinitis A virus), P2A (porcine teschovirus-1 2A), and T2A (Thosea asigna virus 2A) were also identified. A few non-limiting examples of 2A peptides are provided in SEQ ID NO: 604 (GSGATNFSLLKQAGDVEENPGP); SEQ ID NO: 605 (GSGEGRGSLLTCGDVEENPGP); and SEQ ID NO: 606 (GSGQCTNYALLKLAGDVESNPGP).
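As an illustration only, the three example peptides above can be checked for the conserved C-terminal 2A motif. The sketch below assumes the commonly described consensus D-(V/I)-E-x-N-P-G-P, with the "cleavage" falling between the final glycine and proline; the motif pattern, the function name `split_at_2a`, and the dictionary name are assumptions introduced for this example and are not part of the disclosure.

```python
import re

# 2A peptide examples quoted in the text (SEQ ID NOs: 604-606).
PEPTIDES_2A = {
    "SEQ ID NO: 604": "GSGATNFSLLKQAGDVEENPGP",
    "SEQ ID NO: 605": "GSGEGRGSLLTCGDVEENPGP",
    "SEQ ID NO: 606": "GSGQCTNYALLKLAGDVESNPGP",
}

# Assumed consensus motif at the C-terminus: D-(V/I)-E-x-N-P-G-P.
MOTIF_2A = re.compile(r"D[VI]E.NPGP$")


def split_at_2a(peptide: str):
    """Split a 2A peptide at the Gly/Pro junction of the terminal NPGP.

    Returns (upstream_part, downstream_part), or None when the assumed
    motif is not found at the C-terminus.
    """
    if MOTIF_2A.search(peptide) is None:
        return None
    # The final proline stays with the downstream polypeptide.
    return peptide[:-1], peptide[-1]


results = {name: split_at_2a(seq) for name, seq in PEPTIDES_2A.items()}
for name, parts in results.items():
    print(name, parts)
```

All three listed sequences end in the assumed motif, so each is split into an upstream part ending in glycine and a downstream part beginning with proline.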
In some embodiments, the protease is a TEV protease. In some embodiments, the TEV protease comprises a sequence as set forth in SEQ ID NO: 590.
In some embodiments, the first and/or the second TEV protease fragment is not able to cleave the TEV cleavage site on its own. However, in the presence of the remaining portion of the TEV protease, this fragment will be able to effectuate the cleavage. The TEV fragment may be the TEV N-terminal domain (e.g., SEQ ID NO: 591) or the TEV C-terminal domain (e.g., SEQ ID NO: 592) . In some embodiments, the first TEV protease fragment comprises a sequence of SEQ ID NO: 591. In some embodiments, the first TEV protease fragment comprises a sequence of SEQ ID NO: 592.
A “nucleobase deaminase inhibitor” or an “inhibitory domain” refers to a protein or a protein domain that inhibits the deaminase activity of a nucleobase deaminase.
In some embodiments, the nucleobase deaminase inhibitor is an inhibitory domain of a nucleobase deaminase.
In some embodiments, the nucleobase deaminase inhibitor is an inhibitory domain of a cytidine deaminase. In some embodiments, the nucleobase deaminase inhibitor is the mouse APOBEC3 cytidine deaminase domain 2 (mA3-CDA2, SEQ ID NO: 607) . In some embodiments, the nucleobase deaminase inhibitor is the human APOBEC3B cytidine deaminase domain 1 (hA3B-CDA1, SEQ ID NO: 608) .
Table 5 shows 44 proteins/domains that have significant sequence homology to the mA3-CDA2 core sequence, and Table 6 shows 43 proteins/domains that have significant sequence homology to hA3B-CDA1. All of these proteins and domains, as well as their variants and equivalents, are contemplated to have nucleobase deaminase inhibition activities.
Table 5



Table 6



In some embodiments, the inhibitory domain of a cytidine deaminase comprises an amino acid sequence as set forth in SEQ ID NO: 607 or SEQ ID NO: 608.
The term "nucleobase deaminase" as used herein, refers to a group of enzymes that catalyze the hydrolytic deamination of nucleobases such as cytidine, deoxycytidine, adenosine and deoxyadenosine. Non-limiting examples of nucleobase deaminases include cytidine deaminases and adenosine deaminases.
Some of the nucleobase deaminases have a single, catalytic domain, while others also have other domains, such as an inhibitory domain as described in WO2020156575A1. In some embodiments, therefore, the gene editing system disclosed herein only includes the catalytic domain, such as mouse A3 cytidine deaminase domain 1 (mA3-CDA1, SEQ ID NO: 609) and human A3B cytidine deaminase domain 2 (hA3B-CDA2, SEQ ID NO: 610) . In some embodiments, the gene editing system disclosed herein includes at least a catalytic core of the catalytic domain. For instance, when mA3-CDA1 was truncated at residues 196/197 the CDA1 domain still retained substantial editing efficiencies.
In some embodiments, the nucleobase deaminase is a cytidine deaminase. In some embodiments, the nucleobase deaminase is a cytidine deaminase comprising an amino acid sequence of SEQ ID NO: 609. In some embodiments, the nucleobase deaminase is a cytidine deaminase comprising an amino acid sequence of SEQ ID NO: 610.
Table 7

“Cytidine deaminase” refers to enzymes that catalyze the hydrolytic deamination of cytidine and deoxycytidine to uridine and deoxyuridine, respectively. Cytidine deaminases maintain the cellular pyrimidine pool. A family of cytidine deaminases is APOBEC (“apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like”). Members of this family are C-to-U editing enzymes. Some APOBEC family members have two domains: one domain is the catalytic domain, while the other is a pseudocatalytic domain. More specifically, the catalytic domain is a zinc-dependent cytidine deaminase domain and is important for cytidine deamination. RNA editing by APOBEC-1 requires homodimerisation, and this complex interacts with RNA binding proteins to form the editosome.
Non-limiting examples of APOBEC proteins include APOBEC1, APOBEC2, APOBEC3A, APOBEC3B, APOBEC3C, APOBEC3D, APOBEC3F, APOBEC3G, APOBEC3H, APOBEC4, and activation-induced (cytidine) deaminase (AID) .
Various mutants of the APOBEC proteins are also known that have brought about different editing characteristics for base editors. For instance, for human APOBEC3A, certain mutants (e.g., W98Y, Y130F, Y132D, W104A, D131Y and P134Y) even outperform the wildtype human APOBEC3A in terms of editing efficiency or editing window. Accordingly, the term APOBEC and each of its family members also encompass variants and mutants that have a certain level (e.g., 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99%) of sequence identity to the corresponding wildtype APOBEC protein or the catalytic domain and retain the cytidine deaminating activity. The variants and mutants can be derived with amino acid additions, deletions and/or substitutions. Such substitutions, in some embodiments, are conservative substitutions.
In some embodiments, the cytidine deaminase is selected from the group consisting of APOBEC3B (A3B), APOBEC3C (A3C), APOBEC3D (A3D), APOBEC3F (A3F), APOBEC3G (A3G), APOBEC3H (A3H), APOBEC1 (A1), APOBEC3 (A3), APOBEC2 (A2), APOBEC4 (A4), and AICDA (AID).
In some embodiments, the cytidine deaminase comprises an amino acid sequence of any one of SEQ ID NOs: 765-800 (Table 8).
Table 8 Cytidine deaminase 

In some embodiments, the cytidine deaminase is a human or mouse cytidine deaminase.
In some embodiments, the catalytic domain of the cytidine deaminase is a mouse A3 cytidine deaminase domain 1 (CDA1) or human A3B cytidine deaminase domain 2 (CDA2).
In some embodiments, the first fusion protein further comprises a uracil glycosylase inhibitor (UGI).
The “Uracil Glycosylase Inhibitor” (UGI), which can be prepared from Bacillus subtilis bacteriophage PBS1, is a small protein (9.5 kDa) which inhibits E. coli uracil-DNA glycosylase (UDG) as well as UDG from other species. Inhibition of UDG occurs by reversible protein binding with a 1:1 UDG:UGI stoichiometry. UGI is capable of dissociating UDG-DNA complexes. A non-limiting example of UGI is found in Bacillus phage AR9 (YP_009283008.1). In some embodiments, the UGI comprises the amino acid sequence of SEQ ID NO: 611 (TNLSDIIEKETGKQLVIQESILMLPEEVEEVIGNKPESDILVHTAYDES) or has at least 70%, 75%, 80%, 85%, 90%, or 95% sequence identity to SEQ ID NO: 611 and retains the uracil glycosylase inhibition activity.
In some embodiments, the first fusion protein further comprises a nuclear localization sequence (NLS).
A “nuclear localization signal or sequence” (NLS) is an amino acid sequence that tags a protein for import into the cell nucleus by nuclear transport. Typically, this signal consists of one or more short sequences of positively charged lysines or arginines exposed on the protein surface. Different nuclear localized proteins may share the same NLS. An NLS has the opposite function of a nuclear export signal (NES) , which targets proteins out of the nucleus. A non-limiting example of NLS is the internal SV40 nuclear localization sequence (iNLS) .
In some embodiments, a peptide linker is optionally provided between each of the fragments in any of the fusion proteins. In some embodiments, the peptide linker has from 1 to 100 amino acid residues (or 3-20, 4-15, without limitation). In some embodiments, at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the amino acid residues of the peptide linker are amino acid residues selected from the group consisting of alanine, glycine, cysteine, and serine.
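The length and composition criteria for a peptide linker can be checked mechanically. The following is a minimal sketch, assuming one-letter amino acid codes; the function names, the default thresholds, and the example linkers are hypothetical and chosen only for illustration.

```python
# One-letter codes for alanine, glycine, cysteine, and serine.
SMALL_RESIDUES = frozenset("AGCS")


def small_residue_fraction(linker: str) -> float:
    """Fraction of linker residues that are Ala, Gly, Cys, or Ser."""
    if not linker:
        raise ValueError("linker must contain at least one residue")
    hits = sum(residue in SMALL_RESIDUES for residue in linker.upper())
    return hits / len(linker)


def linker_fits(linker: str, min_len: int = 1, max_len: int = 100,
                min_fraction: float = 0.5) -> bool:
    """Check a linker against a length range and a composition threshold.

    The defaults (1-100 residues, at least 50%) are one of the
    combinations listed in the text; other listed values can be passed in.
    """
    if not (min_len <= len(linker) <= max_len):
        return False
    return small_residue_fraction(linker) >= min_fraction


# Hypothetical example linkers, not taken from the disclosure.
print(linker_fits("GGGGSGGGGS", min_len=3, max_len=20, min_fraction=0.9))
print(linker_fits("EAAAK", min_fraction=0.9))
```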
The term “Cas protein” or “clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) protein” refers to RNA-guided DNA endonuclease enzymes associated with the CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) adaptive immunity system in Streptococcus pyogenes, as well as other bacteria. Cas proteins include Cas9 proteins, Cas12a (Cpf1) proteins, Cas12b (formerly known as C2c1) proteins, Cas13 proteins and various engineered counterparts. Example Cas proteins include SpCas9, FnCas9, St1Cas9, St3Cas9, NmCas9, SaCas9, AsCpf1, LbCpf1, FnCpf1, VQR SpCas9, EQR SpCas9, VRER SpCas9, SpCas9-NG, xSpCas9, RHA FnCas9, KKH SaCas9, NmeCas9, StCas9, CjCas9, SsCpf1, PcCpf1, BpCpf1, CmtCpf1, LiCpf1, PmCpf1, Pb3310Cpf1, Pb4417Cpf1, BsCpf1, EeCpf1, BhCas12b, AkCas12b, EbCas12b, LsCas12b, RfCas13d, LwaCas13a, PspCas13b, PguCas13b, RanCas13b and those provided in Tables 9-10 below.
In some embodiments, the Cas protein comprises an amino acid sequence of any one of SEQ ID NOs: 713-764 (Table 10).
Table 9 Exemplary Cas Proteins

Table 10 Cas proteins

In some embodiments, the Cas protein is a Cas9, a dead Cas9 (dCas9), or a Cas9 nickase (nCas9).
In some embodiments, the Cas protein is an nCas9. In some embodiments, the nCas9 protein is an nCas9-D10A protein. In some embodiments, the nCas9-D10A protein has an amino acid sequence of SEQ ID NO: 612.
In some embodiments, the first protein-binding RNA motif and the first RNA binding domain, the second protein-binding RNA motif and the second RNA binding domain, and the third protein-binding RNA motif and the third RNA binding domain, are each independently selected from the group consisting of a MS2 phage operator stem-loop and MS2 coat protein (MCP) or an RNA-binding section thereof; a BoxB and N22P or an RNA-binding section thereof; a telomerase Ku binding motif and Ku protein or an RNA-binding section thereof; a telomerase Sm7 binding motif and Sm7 protein or an RNA-binding section thereof; a PP7 phage operator stem-loop and PP7 coat protein (PCP) or an RNA-binding section thereof; a SfMu phage Com stem-loop and Com RNA binding protein or an RNA-binding section thereof; and a non-natural RNA aptamer and corresponding aptamer ligand or an RNA-binding section thereof. See Table 11.
Table 11

For any protein of the present disclosure, biological equivalents thereof are also provided. In some embodiments, the biological equivalents have at least about 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% sequence identity with the reference protein. Preferably, the biological equivalents retain the desired activity of the reference protein. In some embodiments, the biological equivalents are derived by including one, two, three, four, five or more amino acid additions, deletions, substitutions, or combinations thereof. In some embodiments, the substitution is a conservative amino acid substitution.
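For illustration, a sequence identity threshold of this kind can be evaluated on two sequences that have already been aligned. The sketch below assumes pre-aligned, equal-length strings with "-" as the gap character and counts identity over all aligned columns; other conventions for the denominator exist, and the disclosure does not prescribe one. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_ref: str, aligned_query: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: identity = matching columns / aligned columns, where a
    column with a gap in both sequences is ignored and a column with a
    gap in only one sequence counts as a mismatch.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_ref, aligned_query)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(a == b and a != gap for a, b in columns)
    return 100.0 * matches / len(columns)


# Hypothetical 10-residue reference and a variant with one substitution.
reference = "MKTAYIAKQR"
variant = "MKTAYLAKQR"
identity = percent_identity(reference, variant)
print(f"{identity:.1f}% identity; meets 85% threshold: {identity >= 85.0}")
```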
In some embodiments of the gene editing systems described herein, the guide RNA (the main guide RNA and/or the helper guide RNA) is a dual-RNA structure formed by a ligand-bound CRISPR RNA (crRNA) and a trans-activating crRNA (tracrRNA) . In some embodiments, the crRNA comprises a spacer sequence and is capable of forming a base-pair structure with the tracrRNA, and wherein the base-pair structure binds to a Cas protein. In some embodiments, the crRNA further comprises a linker sequence which comprises a protein-binding motif.
For the purpose of the present disclosure, when the guide RNA is a dual-RNA structure of crRNA and tracrRNA, the “CRISPR motif” refers to the base-pair structure formed between the crRNA and the tracrRNA.
In some embodiments, the gene editing system is a LigoRNA-based gene editing system, as described in PCT/CN2023/096482, which is incorporated herein by reference in its entirety. In the LigoRNA-based gene editing system, at least one guide RNA is a LigoRNA. A LigoRNA system comprises a dual-RNA structure, which can be used as a guide RNA in CRISPR-based gene editing systems. The dual-RNA structure can be formed by a ligand-bound CRISPR RNA (crRNA) and a trans-activating crRNA (tracrRNA). For example, the LigoRNA system comprises an hgRNA set of an hcrRNA and a tracrRNA, and an mgRNA set of an mcrRNA and a tracrRNA. Preferably, none of these RNA molecules is longer than 100 nucleotides.
Since the LigoRNA system is formed by two short RNAs, it helps to solve the problem of synthesizing long single guide RNAs in previous gene editing systems. Chemically synthesized RNAs over 100 nt demonstrated much lower yield and purity, resulting in challenges for large-scale production and cost control.
Unmodified crRNAs and tracrRNAs are capable of guiding nCas9 to a target DNA site. The crRNAs and the tracrRNAs in the LigoRNA system are further modified. In some embodiments, an MS2 or boxB hairpin is fused to the crRNA at any of multiple different sites. In some embodiments, at least one nucleotide in the crRNAs and the tracrRNAs is modified, such as by a 2’-O-methyl modification and/or a 3’-phosphorothioate modification.
In some embodiments, the crRNA comprises a spacer sequence and a linker sequence, wherein the linker sequence comprises at least one protein-binding motif, wherein the protein-binding motif is an RNA aptamer motif. In some embodiments, the protein binding motif is selected from MS2, PP7, boxB, SfMu hairpin motif, telomerase Ku, and Sm7 binding motif, or a variant thereof. Aptamers are single-stranded oligonucleotides that fold into defined architectures and selectively bind to a specific target, including proteins, peptides, carbohydrates, small molecules, toxins, and even live cells.
In some embodiments, the crRNA is capable of forming a base-pair structure with a trans-activating crRNA (tracrRNA). In some embodiments, the tracrRNA has a sequence of SEQ ID NO: 804 or 811.
In some embodiments, the crRNA comprises at least one nucleotide with modification. In some embodiments, the modification is selected from 2’-O-alkyl, 2’-substituted alkoxy, 2’-substituted alkyl, 2’-halo, 3’-phosphorothioate, bridged nucleic acid (BNA), and locked nucleic acid (LNA). In some embodiments, the at least one nucleotide with modification is any one of the first three nucleotides from the 3’-end of the engineered crRNA.
In some embodiments, the tracrRNA comprises at least one nucleotide with modification. In some embodiments, the modification is selected from 2’-O-alkyl, 2’-substituted alkoxy, 2’-substituted alkyl, 2’-halo, 3’-phosphorothioate, bridged nucleic acid (BNA), and locked nucleic acid (LNA). In some embodiments, the at least one nucleotide with modification is any one of the first three nucleotides from the 3’-end of the engineered tracrRNA.
In some embodiments, the crRNA and/or tracrRNA comprises at least one nucleotide with modification. In some embodiments, the modification is selected from 2’-O-alkyl (such as 2’-O-methyl), 2’-substituted alkoxy, 2’-substituted alkyl, 2’-halo (such as 2’-fluoro), 3’-phosphorothioate, bridged nucleic acid (BNA), and locked nucleic acid (LNA). In some embodiments, the crRNA and/or tracrRNA comprises nucleotides comprising 2’-O-methyl and 3’-phosphorothioate. In some embodiments, the first three nucleotides from the 5’-end of the crRNA and/or tracrRNA are modified with 2’-O-methyl and 3’-phosphorothioate. In some embodiments, the first three nucleotides from the 3’-end of the crRNA and/or tracrRNA are modified with 2’-O-methyl, and the second to fourth nucleotides from the 3’-end of the crRNA and/or tracrRNA are modified with 3’-phosphorothioate. In some embodiments, the first three nucleotides from the 5’-end of the crRNA and/or tracrRNA are modified with 2’-O-methyl and 3’-phosphorothioate, and the first three nucleotides from the 3’-end of the crRNA and/or tracrRNA are modified with 2’-O-methyl, and the second to fourth nucleotides from the 3’-end of the crRNA and/or tracrRNA are modified with 3’-phosphorothioate.
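The combined end-modification pattern in the last embodiment above can be written out position by position. The sketch below is one reading of that pattern, assuming 0-based indexing from the 5’-end and an RNA long enough that the 5’ and 3’ modified blocks do not overlap; the function name and the ASCII modification labels are assumptions made for this example.

```python
def end_modifications(length: int) -> list:
    """Return one set of modifications per nucleotide (index 0 = 5'-most).

    Pattern sketched here:
      - first three nucleotides from the 5'-end: 2'-O-methyl and 3'-phosphorothioate
      - first three nucleotides from the 3'-end: 2'-O-methyl
      - second to fourth nucleotides from the 3'-end: 3'-phosphorothioate
    """
    if length < 7:
        raise ValueError("sketch assumes the 5' and 3' modified blocks do not overlap")
    mods = [set() for _ in range(length)]
    for i in range(3):                        # 5'-end block
        mods[i].update({"2'-O-methyl", "3'-phosphorothioate"})
    for i in range(length - 3, length):       # last three nucleotides
        mods[i].add("2'-O-methyl")
    for i in range(length - 4, length - 1):   # second to fourth from the 3'-end
        mods[i].add("3'-phosphorothioate")
    return mods


# Example: a hypothetical 10-nt RNA, printed 5' to 3'.
for position, modification in enumerate(end_modifications(10), start=1):
    print(position, sorted(modification) or "unmodified")
```

In this reading, the 3’-terminal nucleotide carries only 2’-O-methyl, while the fourth nucleotide from the 3’-end carries only 3’-phosphorothioate.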
In some embodiments, the present disclosure provides a tBE system comprising two LigoRNA structures: an mcrRNA-tracrRNA base-paired structure and an hcrRNA-tracrRNA base-paired structure. In some embodiments, the mcrRNA contains a boxB hairpin to generate an R-loop region for intended base editing and the hcrRNA contains an MS2 hairpin to recruit a nucleobase deaminase (e.g., an APOBEC) linked to a nucleobase deaminase inhibitor (e.g., a cytosine deaminase inhibitor (dCDI)) domain through a cleavage site such as a TEV protease cleavage site. For example, to cleave off the dCDI domain at the on-target sites, an N22p-fused TEVc is recruited by the boxB-containing mcrRNA, working as the key in the tBE system with free TEVn. In some embodiments, the mcrRNA and the hcrRNA each form a base-paired structure with the same tracrRNA to locate a target DNA, and the dCDI domain is cleaved off at the target site to induce efficient base editing.
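The activation logic described above can be summarized as a conjunction of conditions. The following toy model is a deliberately simplified boolean abstraction, not a description of binding kinetics or of any actual implementation; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteState:
    """Simplified state of one genomic site (illustrative assumption)."""
    hcr_complex_bound: bool  # hcrRNA-tracrRNA/Cas bound; MS2 hairpin recruits deaminase-dCDI
    mcr_complex_bound: bool  # mcrRNA-tracrRNA/Cas bound; boxB hairpin recruits N22p-TEVc
    free_tevn_present: bool  # free TEVn available to complement TEVc


def dcdi_cleaved(site: SiteState) -> bool:
    """dCDI is cleaved only when the inhibited deaminase and both TEV
    fragments are present at the same site."""
    return (site.hcr_complex_bound
            and site.mcr_complex_bound
            and site.free_tevn_present)


def deaminase_active(site: SiteState) -> bool:
    """In this toy model the deaminase edits only after dCDI removal."""
    return dcdi_cleaved(site)


on_target = SiteState(True, True, True)
hcr_only_off_target = SiteState(True, False, True)
print(deaminase_active(on_target), deaminase_active(hcr_only_off_target))
```

Under these assumptions, a site bound only by the hcrRNA complex keeps the deaminase inhibited, which is the intended safeguard against off-target editing.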
In some embodiments of the gene editing system described herein, the gene editing system comprises
a. an hcrRNA comprising a first spacer sequence and a first linker sequence, wherein the first linker sequence comprises a first protein-binding motif,
b. an mcrRNA comprising a second spacer sequence and a second linker sequence, wherein the second linker sequence comprises a second protein-binding motif,
c. a first tracrRNA which is capable of forming a first base-pair structure with the hcrRNA,
d. a second tracrRNA which is capable of forming a second base-pair structure with the mcrRNA,
e. a first CRISPR-associated protein (Cas protein) , or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first base-pair structure,
f. a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second base pair structure,
g. a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide encoding the first fusion protein, wherein the nucleobase deaminase or the catalytic domain thereof and the first RNA binding domain are optionally connected by a linker, and wherein the first RNA binding domain binds to the first protein-binding motif,
wherein the first Cas protein and the second Cas protein are the same or different, and the first tracrRNA and the second tracrRNA are the same or different.
In some embodiments of the gene editing system described herein, the gene editing system further comprises
a. a protease, or a polynucleotide encoding the protease, and
b. a nucleobase deaminase inhibitor domain,
wherein the nucleobase deaminase inhibitor domain is connected to the nucleobase deaminase or the catalytic domain thereof in the first fusion protein optionally by a linker, and wherein there is a cleavage site for the protease between the nucleobase deaminase inhibitor domain and the nucleobase deaminase or the catalytic domain thereof.
In some embodiments of the gene editing system described herein, the gene editing system comprises
a. an hcrRNA comprising a first spacer sequence and a first linker sequence, wherein the first linker sequence comprises a first protein-binding motif,
b. an mcrRNA comprising a second spacer sequence and a second linker sequence, wherein the second linker sequence comprises a second protein-binding motif,
c. a first tracrRNA which is capable of forming a first base-pair structure with the hcrRNA,
d. a second tracrRNA which is capable of forming a second base-pair structure with the mcrRNA,
e. a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first base-pair structure,
f. a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second base pair structure,
g. a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide encoding the first fusion protein, wherein the nucleobase deaminase or the catalytic domain thereof and the first RNA binding domain are optionally connected by a linker, and wherein the first RNA binding domain binds to the first protein-binding motif,
h. a protease, or a polynucleotide encoding the protease,
i. a nucleobase deaminase inhibitor domain, and
j. a second fusion protein comprising the protease and a second RNA binding domain, or a polynucleotide encoding the second fusion protein,
wherein the first Cas protein and the second Cas protein are the same or different, and the first tracrRNA and the second tracrRNA are the same or different,
wherein the nucleobase deaminase inhibitor domain is connected to the nucleobase deaminase or the catalytic domain thereof in the first fusion protein optionally by a linker, and wherein there is a cleavage site for the protease between the nucleobase deaminase inhibitor domain and the nucleobase deaminase or the catalytic domain thereof,
wherein the protease and the second RNA binding domain are optionally connected by a linker, and
wherein the second RNA binding domain binds to the second protein-binding motif.
In some embodiments of the gene editing system described herein, the protease is split into a first protease fragment and a second protease fragment, wherein the first and/or second protease fragment alone is not able to cleave the cleavage site.
In some embodiments of the gene editing system described herein, the gene editing system comprises
a. an hcrRNA comprising a first spacer sequence and a first linker sequence, wherein the first linker sequence comprises a first protein-binding motif,
b. an mcrRNA comprising a second spacer sequence and a second linker sequence, wherein the second linker sequence comprises a second protein-binding motif,
c. a first tracrRNA which is capable of forming a first base-pair structure with the hcrRNA,
d. a second tracrRNA which is capable of forming a second base-pair structure with the mcrRNA,
e. a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first base-pair structure,
f. a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second base pair structure,
g. a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide encoding the first fusion protein, wherein the nucleobase deaminase or the catalytic domain thereof and the first RNA binding domain are optionally connected by a linker, and wherein the first RNA binding domain binds to the first protein-binding motif,
h. a protease, or a polynucleotide encoding the protease, wherein the protease is split into a first protease fragment and a second protease fragment, wherein the first and/or second protease fragment alone is not able to cleave the cleavage site,
i. a nucleobase deaminase inhibitor domain,
j. a second fusion protein comprising the first protease fragment and a second RNA binding domain, or a polynucleotide encoding the second fusion protein, wherein the first protease fragment and the second RNA binding domain are optionally connected by a linker, and
k. a third fusion protein comprising the second protease fragment and a third RNA binding domain, or a polynucleotide encoding the third fusion protein, wherein the second protease fragment and the third RNA binding domain are optionally connected by a linker,
wherein the first Cas protein and the second Cas protein are the same or different, and the first tracrRNA and the second tracrRNA are the same or different,
wherein the nucleobase deaminase inhibitor domain is connected to the nucleobase deaminase or the catalytic domain thereof in the first fusion protein optionally by a linker, and wherein there is a cleavage site for the protease between the nucleobase deaminase inhibitor domain and the nucleobase deaminase or the catalytic domain thereof,
wherein the mcrRNA further comprises a third protein-binding motif,
wherein the second RNA binding domain binds to the second protein-binding motif, and wherein the third RNA binding domain binds to the third protein-binding motif.
In some embodiments of the gene editing system described herein, the gene editing system comprises
a. an hcrRNA comprising a first spacer sequence and a first linker sequence, wherein the first linker sequence comprises a first protein-binding motif,
b. an mcrRNA comprising a second spacer sequence and a second linker sequence, wherein the second linker sequence comprises a second protein-binding motif,
c. a first tracrRNA which is capable of forming a first base-pair structure with the hcrRNA,
d. a second tracrRNA which is capable of forming a second base-pair structure with the mcrRNA,
e. a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first base-pair structure,
f. a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second base pair structure,
g. a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide encoding the first fusion protein, wherein the nucleobase deaminase or the catalytic domain thereof and the first RNA binding domain are optionally connected by a linker, and wherein the first RNA binding domain binds to the first protein-binding motif,
h. a protease, or a polynucleotide encoding the protease, wherein the protease is split into a first protease fragment and a second protease fragment, wherein the first and/or second protease fragment alone is not able to cleave the cleavage site,
i. a nucleobase deaminase inhibitor domain,
j. a second fusion protein comprising the first protease fragment and a second RNA binding domain, or a polynucleotide encoding the second fusion protein, wherein the first protease fragment and the second RNA binding domain are optionally connected by a linker, and
k. a third fusion protein comprising the second protease fragment and a third RNA binding domain, or a polynucleotide encoding the third fusion protein, wherein the second protease fragment and the third RNA binding domain are optionally connected by a linker,
wherein the first Cas protein and the second Cas protein are the same or different, and the first tracrRNA and the second tracrRNA are the same or different,
wherein the nucleobase deaminase inhibitor domain is connected to the nucleobase deaminase or the catalytic domain thereof in the first fusion protein optionally by a linker, and wherein there is a cleavage site for the protease between the nucleobase deaminase inhibitor domain and the nucleobase deaminase or the catalytic domain thereof,
wherein the mcrRNA further comprises a third protein-binding motif,
wherein the second RNA binding domain binds to the second protein-binding motif,
wherein the third RNA binding domain binds to the third protein-binding motif, and
wherein the second and the third RNA binding domains are the same or different, and the second and the third protein-binding motifs are the same or different.
In some embodiments of the gene editing system described herein, the gene editing system comprises
a. an hcrRNA comprising a first spacer sequence and a first linker sequence, wherein the first linker sequence comprises a first protein-binding motif,
b. an mcrRNA comprising a second spacer sequence and a second linker sequence, wherein the second linker sequence comprises a second protein-binding motif,
c. a first tracrRNA which is capable of forming a first base-pair structure with the hcrRNA,
d. a second tracrRNA which is capable of forming a second base-pair structure with the mcrRNA,
e. a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first base-pair structure,
f. a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second base pair structure,
g. a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide encoding the first fusion protein, wherein the nucleobase deaminase or the catalytic domain thereof and the first RNA binding domain are optionally connected by a linker, and wherein the first RNA binding domain binds to the first protein-binding motif,
h. a protease, or a polynucleotide encoding the protease, wherein the protease is split into a first protease fragment and a second protease fragment, wherein the first and/or second protease fragment alone is not able to cleave the cleavage site,
i. a nucleobase deaminase inhibitor domain,
j. a second fusion protein comprising the first protease fragment and a second RNA binding domain, or a polynucleotide encoding the second fusion protein,
wherein the first Cas protein and the second Cas protein are the same or different, and the first tracrRNA and the second tracrRNA are the same or different,
wherein the nucleobase deaminase inhibitor domain is connected to the nucleobase deaminase or the catalytic domain thereof in the first fusion protein optionally by a linker, and wherein there is a cleavage site for the protease between the nucleobase deaminase inhibitor domain and the nucleobase deaminase or the catalytic domain thereof,
wherein the first protease fragment and the second RNA binding domain are optionally connected by a linker, and
wherein the second RNA binding domain binds to the second protein-binding motif.
In some embodiments, the LigoRNA-based gene editing system comprises a main crRNA (mcrRNA), a helper crRNA (hcrRNA), and a tracrRNA, respectively:
Table 12

In some embodiments of the gene editing system described herein, the mgRNA and/or the hgRNA comprises a dual-RNA structure.
In some embodiments, the dual-RNA structure is formed by a CRISPR RNA (crRNA) and a trans-activating crRNA (tracrRNA) , wherein the crRNA comprises the spacer.
In some embodiments of the gene editing system described herein, the mgRNA comprises a mcrRNA and a first tracrRNA, and the mcrRNA comprises the mgRNA spacer, wherein the hgRNA comprises a hcrRNA and a second tracrRNA, and the hcrRNA comprises the hgRNA spacer, and wherein the first tracrRNA and the second tracrRNA are same or different.
In some embodiments, the mcrRNA and the hcrRNA are SEQ ID NO: 805 and SEQ ID NO: 808, respectively; or SEQ ID NO: 806 and SEQ ID NO: 809, respectively; or SEQ ID NO: 807 and SEQ ID NO: 810, respectively; or SEQ ID NO: 812 and SEQ ID NO: 815, respectively; or SEQ ID NO: 813 and SEQ ID NO: 816, respectively; or SEQ ID NO: 814 and SEQ ID NO: 817, respectively.
In some embodiments, the tracrRNA is SEQ ID NO: 804 or 811.
Polynucleotides
In another aspect, the present disclosure provides a polynucleotide encoding the hgRNA and/or the mgRNA disclosed herein.
In another aspect, the present disclosure provides a polynucleotide encoding all components except the first and the second Cas protein in the gene editing system disclosed herein.
In another aspect, the present disclosure provides a polynucleotide encoding all components in the gene editing system disclosed herein.
In another aspect, the present disclosure provides a kit comprising a polynucleotide encoding all components except the first and the second Cas protein in the gene editing system disclosed herein, and a polynucleotide encoding the first and/or second Cas protein in the gene editing system disclosed herein. In some embodiments, the first and the second Cas proteins are the same Cas protein.
The polynucleotides disclosed herein can be obtained by methods known in the art. For example, the polynucleotide can be obtained from cloned DNA (e.g., from a DNA library), by chemical synthesis, by cDNA cloning, or by the cloning of genomic DNA or fragments thereof, purified from the desired cell. When the polynucleotides are produced by recombinant means, any method known to those skilled in the art for identification of nucleic acids that encode desired genes can be used. Any method available in the art can be used to obtain a full length (i.e., encompassing the entire coding region) cDNA or genomic DNA encoding a desired protein, such as from a cell or tissue source. Modified or variant polynucleotides can be engineered from a wildtype polynucleotide using standard recombinant DNA methods. Polynucleotides can be cloned or isolated using any available methods known in the art for cloning and isolating nucleic acid molecules. Such methods include PCR amplification of nucleic acids and screening of libraries, including nucleic acid hybridization screening, antibody-based screening, and activity-based screening.
Methods for amplification of polynucleotides can be used to isolate polynucleotides encoding a desired protein, including, for example, polymerase chain reaction (PCR) methods. PCR can be carried out using any known methods or procedures in the art. Exemplary methods include use of a Perkin-Elmer Cetus thermal cycler and Taq polymerase (Gene Amp). A nucleic acid containing a gene of interest can be used as a source material from which a desired polypeptide-encoding nucleic acid molecule can be amplified. For example, DNA and mRNA preparations, cell extracts, tissue extracts from an appropriate source (e.g., testis, prostate, breast), fluid samples (e.g., blood, serum, saliva), and samples from healthy and/or diseased subjects can be used in amplification methods. The source can be from any eukaryotic species including, but not limited to, vertebrate, mammalian, human, porcine, bovine, feline, avian, equine, canine, and other primate sources. Nucleic acid libraries also can be used as a source material. Primers can be designed to amplify a desired polynucleotide. For example, primers can be designed based on expressed sequences from which a desired polynucleotide is generated. Primers can be designed based on back-translation of a polypeptide amino acid sequence. If desired, degenerate primers can be used for amplification. Oligonucleotide primers that hybridize to sequences at the 3’ and 5’ termini of the desired sequence can be used as primers to amplify by PCR from a nucleic acid sample. Primers can be used to amplify the entire full-length polynucleotide, or a truncated sequence thereof. Nucleic acid molecules generated by amplification can be sequenced and confirmed to encode a desired polypeptide.
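As a minimal illustration of primers that hybridize at the two termini of a desired sequence, the following Python sketch derives a forward primer from the 5’ terminus of the template strand and a reverse primer as the reverse complement of its 3’ terminus. The function names, the example template, and the default primer length are illustrative assumptions; practical primer design also considers melting temperature, GC content, and secondary structure, which are not modeled here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def flanking_primers(template, primer_len=20):
    """Return (forward, reverse) primers, each written 5'->3'.

    Hypothetical helper: the forward primer copies the 5' end of the
    template strand; the reverse primer is the reverse complement of
    its 3' end, so that it anneals to the template strand.
    """
    template = template.upper()
    if len(template) < 2 * primer_len:
        raise ValueError("template is too short for non-overlapping primers")
    forward = template[:primer_len]
    reverse = reverse_complement(template[-primer_len:])
    return forward, reverse

# Made-up 21-nt template, used only to show the call.
forward, reverse = flanking_primers("ATGGCCAAGTTTCCCGGGTAA", primer_len=6)
print(forward, reverse)
```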
Vectors
In another aspect, the present disclosure provides a vector comprising the polynucleotide encoding the hgRNA and/or the mgRNA disclosed herein.
In another aspect, the present disclosure provides a vector comprising the polynucleotide encoding all components except the first and the second Cas protein in the gene editing system disclosed herein.
In another aspect, the present disclosure provides a vector comprising the polynucleotide encoding all components in the gene editing system disclosed herein.
In some embodiments, the vector is a plasmid or a viral vector.
In some embodiments, the vector is a polycistronic vector.
In another aspect, the present disclosure provides a kit comprising the vector disclosed above, and a vector comprising the polynucleotide encoding the first and/or second Cas protein in the gene editing system disclosed herein.
Any methods known in the art for the insertion of DNA fragments into a vector can be used to construct expression vectors comprising a polynucleotide disclosed herein. These methods can include in vitro recombinant DNA and synthetic techniques and in vivo (genetic) recombination. The polynucleotide disclosed herein can be operably linked to control sequences in the expression vector(s) to ensure protein expression. Such control sequences may include, but are not limited to, leader or signal sequences, promoters (e.g., naturally associated or heterologous promoters), ribosomal binding sites, enhancer or activator elements, translational start and termination sequences, and transcription start and termination sequences, and are chosen to be compatible with the host cell chosen to express the proteins. Constitutive or inducible promoters as known in the art are also contemplated. The promoters may be either naturally occurring promoters, hybrid promoters that combine elements of more than one promoter, or synthetic promoters. An expression construct may be present in a cell on an episome, such as a plasmid, or the expression construct may be inserted in a chromosome such as in a gene locus. In some embodiments, the expression vector includes a selectable marker gene to allow the selection of transformed host cells. In some embodiments, the vector is an expression vector comprising a nucleotide sequence encoding a variant polypeptide operably linked to at least one regulatory control sequence. Regulatory control sequences for use herein include promoters, enhancers, and other expression control elements. In some embodiments, the expression vector is designed for the choice of the host cell to be transformed, the particular variant polypeptide desired to be expressed, the vector's copy number, the ability to control that copy number, and/or the expression of any other protein encoded by the vector, such as antibiotic markers.
The vector can include, but is not limited to, viral vectors and plasmid DNA. Viral vectors can include, but are not limited to, adenoviral vectors, lentiviral vectors, retroviral vectors, and adeno-associated viral vectors. Commonly, expression vectors contain selection markers such as ampicillin resistance, hygromycin resistance, tetracycline resistance, kanamycin resistance, or neomycin resistance to permit detection of those cells transformed with the desired DNA sequences. Suitable vectors, promoters, and enhancer elements are known in the art; many are commercially available for generating subject recombinant constructs. In some embodiments, the vector is a polycistronic vector. In some embodiments, the vector is a bicistronic vector or a tricistronic vector. Bicistronic or polycistronic expression vectors may include (1) multiple promoters fused to each of the open reading frames; (2) insertion of splicing signals between genes; (3) fusion of genes whose expressions are driven by a single promoter; and (4) insertion of proteolytic cleavage sites between genes (self-cleavage peptide) or insertion of internal ribosomal entry sites (IRESs) between genes.
A polycistronic vector is used to co-express multiple genes in the same cell. Two strategies are most commonly used to construct a multicistronic vector. First, an Internal Ribosome Entry Site (IRES) element is typically used for bi-cistronic vectors. The IRES element, acting as another ribosome recruitment site, allows initiation of translation from an internal region of the mRNA. Thus, two proteins are translated from one mRNA. IRES elements are quite large (usually 500-600 bp) (Pelletier et al., 1988; Jang et al., 1988). The engineered CD47 proteins disclosed herein have a smaller size compared to the wild-type full-length human CD47, and thus could be used with an IRES element in multicistronic vectors having limited packaging capacity.
Cells
In another aspect, the present disclosure provides a cell comprising the gene editing system disclosed herein.
In another aspect, the present disclosure provides a cell comprising the polynucleotide disclosed herein. In some embodiments, the cell further comprises a polynucleotide encoding the first and/or second Cas protein in the gene editing system disclosed herein.
In another aspect, the present disclosure provides a cell comprising the vector disclosed herein. In some embodiments, the cell further comprises a vector comprising a polynucleotide encoding the first and/or second Cas protein in the gene editing system disclosed herein.
In another aspect, the present disclosure provides a cell comprising the kit disclosed herein.
In some embodiments, the cell is a stem cell.
In some embodiments, the cell is a pluripotent stem cell. Pluripotent stem cells are cells that have the capacity to self-renew by dividing and to develop into the three primary germ cell layers of the early embryo and therefore into all cells of the adult body, but not extra-embryonic tissues such as the placenta. Embryonic stem cells and induced pluripotent stem cells are pluripotent stem cells.
In some embodiments, the cell is an embryonic stem cell (ESC). Embryonic stem cells are pluripotent stem cells derived from the inner cell mass of a blastocyst, an early-stage pre-implantation embryo.
In some embodiments, the cell is an induced pluripotent stem cell (iPSC). iPSCs are derived from adult somatic cells that have been genetically reprogrammed back into an embryonic-like pluripotent state that enables the development of an unlimited source of any type of cell needed for therapeutic purposes.
"Pluripotent stem cells" as used herein have the potential to differentiate into any of the three germ layers: endoderm (e.g., the stomach lining, gastrointestinal tract, lungs, etc. ) , mesoderm (e.g., muscle, bone, blood, urogenital tissue, etc. ) or ectoderm (e.g., epidermal tissues and nervous system tissues) . The term "pluripotent stem cells, " as used herein, also encompasses induced pluripotent stem cells (iPSCs or iPS cells) , or a type of pluripotent stem cell derived from a non-pluripotent cell. In some embodiments, a pluripotent stem cell is produced or generated from a cell that is not a pluripotent cell. In other words, pluripotent stem cells can be direct or indirect progeny of a non-pluripotent cell. Examples of parent cells include somatic cells that have been reprogrammed to induce a pluripotent, undifferentiated phenotype by various means. Such "iPS" or "iPSC" cells can be created by inducing the expression of certain regulatory genes or by the exogenous application of certain proteins. Methods for the induction of iPS cells are known in the art and are further described below. (See, e.g., Zhou et al., Stem Cells 27 (11) : 2667-74 (2009) ; Huangfu et al., Nature Biotechnol. 26 (7) : 795 (2008) ; Woltjen et al., Nature 458 (7239) : 766-770 (2009) ; and Zhou et al., Cell Stem Cell 8: 381-384 (2009) ; each of which is incorporated by reference herein in their entirety. ) As used herein, "hiPSCs" are human induced pluripotent stem cells. In some embodiments, "pluripotent stem cells, " as used herein, also encompasses mesenchymal stem cells (MSCs) , and/or embryonic stem cells (ESCs) .
In some embodiments, the cell is an endothelial cell. Endothelial cells form the endothelium, which is a single layer that lines the interior surface of blood vessels and lymphatic vessels, providing an anticoagulant barrier between the vessel wall and blood. In addition to its role as a selective permeability barrier, the endothelial cell is a unique multifunctional cell with critical basal and inducible metabolic and synthetic functions. The endothelial cell reacts with physical and chemical stimuli within the circulation and regulates hemostasis, vasomotor tone, and immune and inflammatory responses. In addition, the endothelial cell is pivotal in angiogenesis and vasculogenesis. (Sumpio et al., Cells in focus: endothelial cell, Int. J. Biochem Cell Biol., 2002.)
In some embodiments, the cell is a primary cell. Primary cells are isolated directly from human or animal tissue using enzymatic or mechanical methods. Once isolated, they are placed in an artificial environment in plastic or glass containers supported with specialized medium containing essential nutrients and growth factors to support proliferation. Primary cells could be of two types: adherent or suspension. Adherent cells require attachment for growth and are said to be anchorage-dependent cells. Adherent cells are usually derived from tissues of organs. Suspension cells do not require attachment for growth and are said to be anchorage-independent cells. Most suspension cells are isolated from the blood system, but some tissue-derived cells can also be used in suspension, such as hepatocytes or intestinal cells. Although primary cells usually have a limited lifespan, they offer a number of advantages compared to cell lines. Primary cell culture enables researchers to study donors and not just cells. Several factors such as age, medical history, race, and sex can be considered when building an experimental model. With a growing trend towards personalized medicine, such donor variability and tissue complexity can be achieved with use of primary cells, but are difficult to replicate with cell lines that are more systematic and uniform in nature and do not capture the true diversity of a living tissue.
In some embodiments, the cell is a differentiated cell. Differentiated cells are cells that have undergone differentiation. They are mature cells that perform a specialized function. Some examples of differentiated cells are epithelial cells, skin fibroblasts, endothelial cells lining the blood vessels, smooth muscle cells, liver cells, nerve cells, human cardiac muscle cells, etc. Generally, these cells have a unique morphology, metabolic activity, membrane potential, and responsiveness to signals facilitating their function in a body tissue or organ.
Composition
In another aspect, the present disclosure provides a composition comprising the gene editing system disclosed herein.
In another aspect, the present disclosure provides a composition comprising the cell disclosed herein.
As used herein, the term “composition” includes, but is not limited to, a pharmaceutical composition. A “pharmaceutical composition” refers to an active pharmaceutical agent formulated in pharmaceutically acceptable or physiologically acceptable solutions for administration to a cell or an animal, either alone, or in combination with one or more other modalities of therapy. It will also be understood that, if desired, the compositions of the invention may be administered in combination with other agents, such as, e.g., cytokines, growth factors, hormones, small molecules, chemotherapeutics, pro-drugs, drugs, antibodies, or other various pharmaceutically active agents. There is virtually no limit to other components that may also be included in the compositions, provided that the additional agents do not adversely affect the ability of the composition to deliver the intended therapy. The phrase “pharmaceutically acceptable” is used herein to refer to those compounds, materials, compositions, and/or dosage forms which are, within the scope of sound medical judgment, suitable for use in contact with the tissues of human beings and animals without excessive toxicity, irritation, allergic response, or other problem or complication, commensurate with a reasonable benefit/risk ratio.
The compositions may also comprise a pharmaceutically acceptable carrier, diluent, or excipient. As used herein “pharmaceutically acceptable carrier, diluent, or excipient” includes, without limitation, any adjuvant, carrier, excipient, glidant, sweetening agent, diluent, preservative, dye/colorant, flavor enhancer, surfactant, wetting agent, dispersing agent, suspending agent, stabilizer, isotonic agent, solvent, or emulsifier which has been approved by the United States Food and Drug Administration as being acceptable for use in humans or domestic animals. Exemplary pharmaceutically acceptable carriers include, but are not limited to, sugars, such as lactose, glucose, and sucrose; starches, such as corn starch and potato starch; cellulose, and its derivatives, such as sodium carboxymethyl cellulose, ethyl cellulose, and cellulose acetate; tragacanth; malt; gelatin; talc; cocoa butter; waxes; animal and vegetable fats; paraffins; silicones; bentonites; silicic acid; zinc oxide; oils, such as peanut oil, cottonseed oil, safflower oil, sesame oil, olive oil, corn oil, and soybean oil; glycols, such as propylene glycol; polyols, such as glycerin, sorbitol, mannitol, and polyethylene glycol; esters, such as ethyl oleate, and ethyl laurate; agar; buffering agents, such as magnesium hydroxide and aluminum hydroxide; alginic acid; pyrogen-free water; isotonic saline; Ringer's solution; ethyl alcohol; phosphate buffer solutions; and any other compatible substances employed in pharmaceutical formulations.
The liquid pharmaceutical compositions, whether they be solutions, suspensions or other like form, may include one or more of the following: sterile diluents such as water for injection, saline solution, preferably physiological saline; Ringers solution; isotonic sodium chloride; fixed oils such as synthetic mono or diglycerides which may serve as the solvent or suspending medium; polyethylene glycols; glycerin; propylene glycol or other solvents; antibacterial agents, such as benzyl alcohol or methyl paraben; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents, such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates, or phosphates; and agents for the adjustment of tonicity, such as sodium chloride or dextrose. The parenteral preparation can be enclosed in ampoules, disposable syringes, or multiple dose vials made of glass or plastic. An injectable pharmaceutical composition is preferably sterile.
The composition may be suitably developed for intravenous, intratumoral, oral, rectal, vaginal, parenteral, topical, pulmonary, intranasal, buccal, ophthalmic, or another route of administration.
Methods of treatment
In another aspect, the present disclosure provides a method for disrupting a KLKB1 gene in a cell, comprising introducing into the cell the gene editing system disclosed herein. As used herein, “disrupt” or “disruption” of a gene refers to gene knock-down or gene knock-out. In a gene knock-down, the expression of the gene is reduced via methods such as genetic modification and treatment as disclosed herein. In a gene knock-out, a gene is made inoperative, partially or completely, via methods such as genetic modification.
In some embodiments, the KLKB1 gene is disrupted by adding stop codons using the gene editing system disclosed herein. In some embodiments, the gene editing system disclosed herein is used to induce C-to-T base editing in CAA (Gln), CAG (Gln), or CGA (Arg) codons in the KLKB1 gene to create a TAA, TAG, or TGA stop codon, respectively. See Fig. 2. In some embodiments, the gene editing system disclosed herein is used to induce G-to-A (C-to-T on the opposite strand) base editing in TGG (Trp) codons in the KLKB1 gene to create a TAA, TAG, or TGA stop codon.
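The codon logic described above can be checked with a short enumeration. The following Python sketch is illustrative only and is not part of the disclosed system; the function names are hypothetical. It lists every sense codon that becomes a stop codon when one or more bases of a single type are converted (C to T, or G to A on the coding strand).

```python
# Illustrative sketch: which sense codons can a base edit turn into a stop codon?
STOP_CODONS = {"TAA", "TAG", "TGA"}
BASES = "ACGT"

def edited_variants(codon, src, dst):
    """Return all codons obtained by converting one or more `src` bases to `dst`."""
    positions = [i for i, b in enumerate(codon) if b == src]
    variants = set()
    # Walk through every non-empty subset of the editable positions.
    for mask in range(1, 2 ** len(positions)):
        bases = list(codon)
        for bit, pos in enumerate(positions):
            if (mask >> bit) & 1:
                bases[pos] = dst
        variants.add("".join(bases))
    return variants

def stop_creating_codons(src, dst):
    """Map each sense codon to the stop codons reachable by src-to-dst edits."""
    result = {}
    for a in BASES:
        for b in BASES:
            for c in BASES:
                codon = a + b + c
                if codon in STOP_CODONS:
                    continue  # already a stop codon
                stops = edited_variants(codon, src, dst) & STOP_CODONS
                if stops:
                    result[codon] = sorted(stops)
    return result

c_to_t = stop_creating_codons("C", "T")  # edits on the coding strand
g_to_a = stop_creating_codons("G", "A")  # C-to-T edits on the opposite strand
```

Under the standard genetic code, the enumeration returns only CAA, CAG, and CGA for C-to-T editing and only TGG for G-to-A editing, which matches the codons named in the paragraph above.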
In some embodiments, the KLKB1 gene is disrupted by destroying a splice site in the gene using the gene editing system disclosed herein. In some embodiments, the gene editing system disclosed herein is used to induce G-to-A (C-to-T on the opposite strand) base editing in a 5’ GU or 3’ AG splice site to destroy the canonical GU-AG splicing pattern. See Fig. 3.
In another aspect, the present disclosure provides a method for decreasing the expression of plasma kallikrein in a cell, comprising introducing into the cell the gene editing system disclosed herein.
In another aspect, the present disclosure provides a method for decreasing the expression of bradykinin in a cell, comprising introducing into the cell the gene editing system disclosed herein.
In another aspect, the present disclosure provides a method for decreasing vascular leakage regulated by a kinin B2 receptor (B2R) in a cell, comprising introducing into the cell the gene editing system disclosed herein. In some embodiments, the HAE is caused by C1-INH deficiency or dysfunction. In some embodiments, the HAE is HAE-1. In some embodiments, the HAE is HAE-2.
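The effect of a G-to-A edit on the canonical splice sites can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the disclosed method: the intron is given as coding-strand DNA (so the 5’ GU donor reads GT), the function names are hypothetical, and the example intron is an invented sequence, not a KLKB1 sequence.

```python
def is_canonical_intron(intron):
    """True if the intron (coding-strand DNA, 5'->3') starts with GT and ends with AG."""
    intron = intron.upper()
    return intron.startswith("GT") and intron.endswith("AG")

def disrupt_splice_sites(intron):
    """Return the intron after a G-to-A edit at the donor site and at the acceptor site."""
    intron = intron.upper()
    if not is_canonical_intron(intron):
        raise ValueError("intron does not follow the canonical GT-AG pattern")
    return {
        "donor": "A" + intron[1:],      # 5' splice site: GT -> AT
        "acceptor": intron[:-1] + "A",  # 3' splice site: AG -> AA
    }

# Invented toy intron used only to exercise the functions.
toy_intron = "GTAAGTCCTTTTCAG"
edited = disrupt_splice_sites(toy_intron)
```

Either edit alone removes the GT-AG signature, so `is_canonical_intron` returns `False` for both edited sequences.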
In some embodiments, the cell is a stem cell.
In some embodiments, the cell is a pluripotent stem cell.
In some embodiments, the cell is an induced pluripotent stem cell (iPSC) or an embryonic stem cell.
In some embodiments, the cell is an endothelial cell.
In some embodiments, the cell is a primary cell or a differentiated cell.
In some embodiments, the present disclosure provides a method for treating HAE in a subject in need thereof, the method comprising: administering to the subject a composition comprising a gene editing system disclosed herein, wherein the KLKB1 gene in the subject is disrupted.
In some embodiments, the present disclosure provides a method for treating HAE in a subject in need thereof, the method comprising: administering to the subject a composition comprising a cell disclosed herein, wherein the KLKB1 gene in the subject is disrupted.
In some embodiments, the present disclosure provides use of a gene editing system, a cell, or a composition as disclosed herein for the manufacture of a medicament for treating HAE.
In some embodiments, the present disclosure provides a gene editing system, a cell, or a composition as disclosed herein for treating HAE.
In some embodiments, the various protein components and the gRNAs of a gene editing system disclosed herein may be introduced into a subject or a cell via one or more vectors expressing the protein components and gRNAs.
In some embodiments, the gRNAs and the protein components of a gene editing system disclosed herein can be delivered into a cell in the form of a ribonucleoprotein (RNP) via electroporation.
While the disclosure has been particularly shown and described with reference to specific embodiments, it should be understood by those having skill in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the present disclosure as disclosed herein.
Table 13


EXAMPLES
Example 1: Gene Editing Efficiency Tests
Plasmid construction
Primer sets (hgKLKB1-E2-1-U1_FOR/mgKLKB1-E2-1_REV) were used to amplify the fragment hgKLKB1-E2 (Exon Number) -1 (mgRNA Number) -U1 (hgRNA Number) -MS2 (the operator in hgRNA scaffold) -U6 (mgRNA promoter) -mgKLKB1-E2-1 using the template pUC57-mgRNA-MS2-U6. The fragment hgKLKB1-E2-1-U1-MS2-U6-mgKLKB1-E2-1 was then ligated into BsmBI-linearized U6-ccdB-boxB-tBE-V5 to generate the vector ptBE-V5-KLKB1-E2-1-U1. Other combinations with different on-target hgRNA and mgRNA were constructed using the same strategy, respectively.
Cell culture and Lipofectamine transfection
293FT cells were maintained in DMEM + 10% FBS and regularly tested to exclude mycoplasma contamination. For base editing with transformer BEs, 293FT cells were seeded in a 24-well plate at a density of 1 × 10⁵ per well and transfected with 250 μl serum-free Opti-MEM containing 2.5 μl LIPOFECTAMINE LTX, 1 μl LIPOFECTAMINE plus, 0.5 μg tBE-V5 expression vector, and 0.5 μg pEFS-nSpCas9 or pEFS-nSpCas9-NG expression vector. After 24 h, puromycin was added to the medium at a final concentration of 4 μg ml⁻¹. After another 48 h, genomic DNA was extracted from the cells using QuickExtract™ DNA Extraction Solution for subsequent sequencing analysis. Target genomic sequences were PCR-amplified using the high-fidelity DNA polymerase PrimeSTAR HS with primer sets flanking the examined mgRNA target sites.
gRNA and mRNA preparation and electroporation
Chemically modified mgRNA (2’-O-methyl modifications were made to the first and last three nucleotides; the phosphate backbones between the first and last four nucleotides were modified to phosphorothioate internucleotide linkages) was synthesized by GenScript. mRNAs encoding a tBE system were transcribed in vitro. HepG2 cells or Hepa1-6 cells were electroporated with the end-modified mgRNA and the mRNAs described above. Electroporation was performed using a Lonza 4D Nucleofector with an officially recommended program (e.g., EH-100). For 20-μl Nucleocuvette Strips, 0.2 million HepG2 cells or Hepa1-6 cells were resuspended in 20 μl SF Cell Line 4D-Nucleofector buffer and about 160 pmol RNA complex was added. The editing frequencies of target sequences were measured in cells cultured in medium for 120 hours after electroporation.
Base substitution frequency at each target site was calculated by EditR analysis. See http://baseeditr.com/.
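The end-modification pattern described above (2’-O-methyl on the first and last three nucleotides, phosphorothioate linkages between the first four and between the last four nucleotides) can be written out with a small Python sketch. The notation is an assumption made for illustration: an `m` prefix marks a 2’-O-methyl nucleotide and a trailing `*` marks a phosphorothioate linkage to the next nucleotide. The function name and the example sequence are hypothetical and do not correspond to any SEQ ID NO.

```python
def annotate_end_modifications(seq, n_methyl=3, n_ps_links=3):
    """Annotate an RNA sequence with end modifications.

    'm' before a base  = 2'-O-methyl nucleotide (first/last `n_methyl` bases).
    '*' after a base   = phosphorothioate linkage to the following base
                         (first/last `n_ps_links` linkages, i.e. the linkages
                         between the first/last `n_ps_links + 1` nucleotides).
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2 * (n_ps_links + 1):
        raise ValueError("sequence too short for non-overlapping end modifications")
    parts = []
    for i, base in enumerate(seq):
        methylated = i < n_methyl or i >= n - n_methyl
        token = ("m" if methylated else "") + base
        # Linkage i joins nucleotide i and nucleotide i + 1 (n - 1 linkages in total).
        if i < n - 1 and (i < n_ps_links or i >= n - 1 - n_ps_links):
            token += "*"
        parts.append(token)
    return "".join(parts)

# Invented 12-nt example used only to show the pattern.
annotated = annotate_end_modifications("ACGUACGUACGU")
```

For the 12-nt toy sequence the result is `mA*mC*mG*UACGUA*mC*mG*mU`: three methylated nucleotides and three phosphorothioate linkages at each end.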
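As a rough illustration of what a per-site base substitution frequency means, the Python sketch below computes the share of one base among the signal for all four bases at a single position. It is not EditR: EditR works from Sanger chromatograms and applies its own statistical model, whereas the function name, the input format, and the numbers here are hypothetical.

```python
def substitution_frequency(signal_by_base, edited_base):
    """Percentage of the total signal at one position contributed by `edited_base`.

    `signal_by_base` maps each of A, C, G, T to a non-negative signal value
    (for example, a peak area at that position).
    """
    total = sum(signal_by_base.values())
    if total <= 0:
        raise ValueError("no signal at this position")
    return 100.0 * signal_by_base.get(edited_base, 0.0) / total

# Invented values for a target C that is partly converted to T.
freq_t = substitution_frequency({"A": 0.0, "C": 30.0, "G": 0.0, "T": 70.0}, "T")
```

With the invented values, 70% of the signal at the position is T, which would be read as a 70% C-to-T substitution frequency.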
Western blot analysis
Total proteins from NHDFs were extracted using RIPA buffer with 1 mM PMSF and a proteinase inhibitor cocktail. The protein concentration was measured using a BCA protein assay kit. Equal amounts of protein (5 μg) were separated by SDS-PAGE using 4–12% SurePage Mini-PROTEAN Gels and then transferred onto a nitrocellulose membrane. Membranes were blocked using 5% skim milk in Tris-buffered saline containing 0.1% (v/v) Tween-20 (TBST) for 1 h, and incubated with primary antibodies against KLKB1 (1:1000) and GAPDH (1:10000) overnight at 4 ℃, followed by incubation with anti-rabbit IgG or anti-mouse IgG conjugated with horseradish peroxidase. GAPDH was used as an internal control. The probed protein was visualized using an Amersham Imager 680. Band densities were semi-quantified using ImageQuant TL.
Base substitution calculation, statistical analysis, and other relevant steps for obtaining the data illustrated in Figures 2-6 are essentially the same as disclosed in the “Methods” section of Wang, Lijie, et al., Eliminating base-editor-induced genome-wide and transcriptome-wide off-target mutations, Nature Cell Biology 23.5 (2021): 552-563, the content of which is incorporated herein by reference in its entirety.
Gene editing results obtained from the above experiments are illustrated in Figs. 2-3 and Figs. 5-6. Editing frequencies of the gene editing systems (identified by the respective hgRNA-mgRNA or hcrRNA-mcrRNA pairs used therein) and annotations in Figs. 2-3 and Figs. 5-6 are listed in Table 14.
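Semi-quantitative densitometry with an internal control is commonly reported as the target band intensity divided by the loading-control band intensity, expressed relative to a reference lane. The Python sketch below shows that arithmetic under the assumption that this is how the GAPDH control is applied; the source does not spell out the calculation, and the function name and band intensities are hypothetical.

```python
def relative_expression(target, loading, control_index=0):
    """Normalize target band intensities to a loading control and a reference lane.

    target, loading: per-lane band intensities (same lane order).
    control_index:   lane used as the reference (set to 1.0 in the output).
    """
    if len(target) != len(loading):
        raise ValueError("target and loading must have the same number of lanes")
    # Step 1: correct each lane for loading differences.
    ratios = [t / l for t, l in zip(target, loading)]
    # Step 2: express every lane relative to the reference lane.
    reference = ratios[control_index]
    return [r / reference for r in ratios]

# Invented intensities: lane 0 = untreated control, lane 1 = edited cells.
klkb1 = [200.0, 50.0]
gapdh = [100.0, 100.0]
relative = relative_expression(klkb1, gapdh)
```

With these invented numbers the edited lane comes out at 0.25 of the control, i.e. a 75% reduction in normalized signal.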



Claims (60)

  1. A gene editing system comprising a main guide RNA (mgRNA) and a helper guide RNA (hgRNA), or at least one DNA polynucleotide encoding the mgRNA and/or the hgRNA, wherein the mgRNA comprises an mgRNA spacer and the hgRNA comprises an hgRNA spacer of about 10 to about 20 nucleotides, wherein the nucleic acid sequence of the mgRNA spacer comprises a sequence selected from SEQ ID NOs: 1-80 and 801, and wherein the hgRNA spacer binds to a site on a KLKB1 gene that is close to the target site that the mgRNA spacer binds to.
  2. The gene editing system of claim 1, wherein the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise respectively:













  3. A gene editing system comprising a main guide RNA (mgRNA) and a helper guide RNA (hgRNA), or at least one DNA polynucleotide encoding the mgRNA and/or the hgRNA, wherein the mgRNA comprises an mgRNA spacer and the hgRNA comprises an hgRNA spacer, wherein the nucleic acid sequences of the mgRNA spacer and the hgRNA spacer comprise respectively:
  4. The gene editing system of any one of claims 1-3, comprising
    a. the hgRNA comprising a first CRISPR motif, the hgRNA spacer, and a first protein-binding motif, or a DNA polynucleotide encoding the hgRNA,
    b. the mgRNA comprising a second CRISPR motif and the mgRNA spacer, or a DNA polynucleotide encoding the mgRNA,
    c. a first CRISPR-associated protein (Cas protein), or a polynucleotide encoding the first Cas protein, wherein the first Cas protein binds to the first CRISPR motif,
    d. a second Cas protein, or a polynucleotide encoding the second Cas protein, wherein the second Cas protein binds to the second CRISPR motif, and
    e. a first fusion protein comprising a nucleobase deaminase or a catalytic domain thereof and a first RNA binding domain, or a polynucleotide encoding the first fusion protein, wherein the nucleobase deaminase or the catalytic domain thereof and the first RNA binding domain are optionally connected by a linker, and wherein the first RNA binding domain binds to the first protein-binding motif,
    wherein the first Cas protein and second Cas protein are the same or different.
  5. The gene editing system of claim 4, further comprising
    a. a protease, or a polynucleotide encoding the protease, and
    b. a nucleobase deaminase inhibitor domain,
    wherein the nucleobase deaminase inhibitor domain is connected to the nucleobase deaminase or the catalytic domain thereof in the first fusion protein optionally by a linker, and wherein there is a cleavage site for the protease between the nucleobase deaminase inhibitor domain and the nucleobase deaminase or the catalytic domain thereof.
  6. The gene editing system of claim 5, further comprising
    a second fusion protein comprising the protease and a second RNA binding domain, or a polynucleotide encoding the second fusion protein,
    wherein the protease and the second RNA binding domain are optionally connected by a linker,
    wherein the mgRNA further comprises a second protein-binding motif,
    and wherein the second RNA binding domain binds to the second protein-binding motif.
  7. The gene editing system of claim 5, wherein the protease is split into a first protease fragment and a second protease fragment, wherein the first and/or second protease fragment alone is not able to cleave the cleavage site.
  8. The gene editing system of claim 7, further comprising
    a. a second fusion protein comprising the first protease fragment and a second RNA binding domain, or a polynucleotide encoding the second fusion protein, wherein the first protease fragment and the second RNA binding domain are optionally connected by a linker, and
    b. a third fusion protein comprising the second protease fragment and a third RNA binding domain, or a polynucleotide encoding the third fusion protein, wherein the second protease fragment and the third RNA binding domain are optionally connected by a linker,
    wherein the mgRNA further comprises a second protein-binding motif and a third protein-binding motif,
    wherein the second RNA binding domain binds to the second protein-binding motif, and
    wherein the third RNA binding domain binds to the third protein-binding motif.
  9. The gene editing system of claim 8, wherein the second and third RNA binding domains are the same or different, and the second and third protein-binding motifs are the same or different.
  10. The gene editing system of claim 7, further comprising
    a second fusion protein comprising the first protease fragment and a second RNA binding domain, or a polynucleotide encoding the second fusion protein,
    wherein the first protease fragment and the second RNA binding domain are optionally connected by a linker,
    wherein the mgRNA further comprises a second protein-binding motif, and
    wherein the second RNA binding domain binds to the second protein-binding motif.
  11. The gene editing system of any one of claims 5-10, wherein the protease is a TEV protease, a TuMV protease, a PPV protease, a PVY protease, a ZIKV protease, or a WNV protease.
  12. The gene editing system in claim 11, wherein the protease is a TEV protease comprising a sequence of SEQ ID NO: 590.
  13. The gene editing system in claim 12, wherein the first TEV protease fragment comprises a sequence of SEQ ID NO: 591.
  14. The gene editing system in any one of claims 5-13, wherein the nucleobase deaminase inhibitor is an inhibitory domain of a nucleobase deaminase.
  15. The gene editing system in any one of claims 5-14, wherein the nucleobase deaminase inhibitor is an inhibitory domain of a cytidine deaminase.
  16. The gene editing system in claim 15, wherein the inhibitory domain of a cytidine deaminase comprises an amino acid sequence of SEQ ID NO: 607 or SEQ ID NO: 608.
  17. The gene editing system in any one of claims 5-16, wherein the nucleobase deaminase is a cytidine deaminase.
  18. The gene editing system in claim 17, wherein the cytidine deaminase is selected from the group consisting of APOBEC3B (A3B), APOBEC3C (A3C), APOBEC3D (A3D), APOBEC3F (A3F), APOBEC3G (A3G), APOBEC3H (A3H), APOBEC1 (A1), APOBEC3 (A3), APOBEC2 (A2), APOBEC4 (A4), and AICDA (AID).
  19. The gene editing system in claim 17, wherein the cytidine deaminase comprises an amino acid sequence of any one of SEQ ID NOs: 765-800.
  20. The gene editing system in claim 17, wherein the cytidine deaminase is a human or mouse cytidine deaminase.
  21. The gene editing system in claim 20, wherein the catalytic domain of the cytidine deaminase is a mouse A3 cytidine deaminase domain 1 (mA3-CDA1) or human A3B cytidine deaminase domain 2 (hA3B-CDA2).
  22. The gene editing system of any one of claims 4-21, wherein the first fusion protein further comprises a uracil glycosylase inhibitor (UGI).
  23. The gene editing system of any one of claims 4-22, wherein the Cas protein is a Cas9, a dead Cas9 (dCas9), or a Cas9 nickase (nCas9) selected from the group consisting of SpCas9, FnCas9, St1Cas9, St3Cas9, NmCas9, SaCas9, AsCpf1, LbCpf1, FnCpf1, VQR SpCas9, EQR SpCas9, VRER SpCas9, SpCas9-NG, xSpCas9, RHA FnCas9, KKH SaCas9, NmeCas9, StCas9, CjCas9, AsCpf1, FnCpf1, SsCpf1, PcCpf1, BpCpf1, CmtCpf1, LiCpf1, PmCpf1, Pb3310Cpf1, Pb4417Cpf1, BsCpf1, EeCpf1, BhCas12b, AkCas12b, EbCas12b, LsCas12b, RfCas13d, LwaCas13a, PspCas13b, PguCas13b, and RanCas13b.
  24. The gene editing system of any one of claims 4-23, wherein the first protein-binding RNA motif and the first RNA binding domain, the second protein-binding RNA motif and the second RNA binding domain, and the third protein-binding RNA motif and the third RNA binding domain, are each independently selected from the group consisting of a MS2 phage operator stem-loop and MS2 coat protein (MCP) or an RNA-binding section thereof,
    a BoxB and N22P or an RNA-binding section thereof,
    a telomerase Ku binding motif and Ku protein or an RNA-binding section thereof,
    a telomerase Sm7 binding motif and Sm7 protein or an RNA-binding section thereof,
    a PP7 phage operator stem-loop and PP7 coat protein (PCP) or an RNA-binding section thereof,
    a SfMu phage Com stem-loop and Com RNA binding protein or an RNA-binding section thereof, and
    a non-natural RNA aptamer and corresponding aptamer ligand or an RNA-binding section thereof.
  25. The gene editing system of any one of claims 1-24, wherein the mgRNA and/or the hgRNA comprise a dual-RNA structure.
  26. The gene editing system of claim 25, wherein the dual-RNA structure is formed by a CRISPR RNA (crRNA) and a trans-activating crRNA (tracrRNA) , wherein the crRNA comprises the spacer.
  27. The gene editing system of claim 25 or 26, wherein the mgRNA comprises a mcrRNA and a first tracrRNA, and the mcrRNA comprises the mgRNA spacer, wherein the hgRNA comprises a hcrRNA and a second tracrRNA, and the hcrRNA comprises the hgRNA spacer, and wherein the first tracrRNA and the second tracrRNA are the same or different.
  28. The gene editing system of claim 27, wherein the mcrRNA and the hcrRNA are
    a. SEQ ID NO: 805 and SEQ ID NO: 808, respectively; or
    b. SEQ ID NO: 806 and SEQ ID NO: 809, respectively; or
    c. SEQ ID NO: 807 and SEQ ID NO: 810, respectively; or
    d. SEQ ID NO: 812 and SEQ ID NO: 815, respectively; or
    e. SEQ ID NO: 813 and SEQ ID NO: 816, respectively; or
    f. SEQ ID NO: 814 and SEQ ID NO: 817, respectively.
  29. The gene editing system of any one of claims 26-28, wherein the tracrRNA is SEQ ID NO: 804 or 811.
  30. A polynucleotide encoding the hgRNA and/or the mgRNA in claims 1, 2, or 3.
  31. A polynucleotide encoding all components except the first and the second Cas proteins in the gene editing system in any one of claims 4-29.
  32. A kit comprising
    a. the polynucleotide in claim 30,
    b. a polynucleotide encoding the first and/or second Cas protein in any one of claims 4-29.
  33. A vector comprising the polynucleotide in claim 30.
  34. A vector comprising the polynucleotide in claim 31.
  35. The vector of any one of claims 33-34, wherein the vector is a plasmid or a viral vector.
  36. The vector of any one of claims 33-35, wherein the vector is a polycistronic vector.
  37. A kit comprising
    a. the vector in any one of claims 33-36,
    b. a vector comprising the polynucleotide encoding the first and/or second Cas protein in any one of claims 4-29.
  38. A cell comprising the gene editing system in any one of claims 1-29.
  39. A cell comprising the polynucleotide in any one of claims 30-31.
  40. The cell in claim 39, further comprising a polynucleotide encoding the first and/or second Cas protein in any one of claims 4-29.
  41. A cell comprising the vector in any one of claims 33-36.
  42. The cell in claim 41, further comprising a vector comprising a polynucleotide encoding the first and/or second Cas protein in any one of claims 4-29.
  43. The cell of any one of claims 38-42, wherein the cell is a stem cell.
  44. The cell in claim 43, wherein the stem cell is a pluripotent stem cell.
  45. The cell in claim 44, wherein the pluripotent stem cell is an induced pluripotent stem cell (iPSC) or an embryonic stem cell.
  46. The cell in any one of claims 38-42, wherein the cell is an endothelial cell.
  47. The cell in any one of claims 38-46, wherein the cell is a primary cell or a differentiated cell.
  48. A composition comprising the gene editing system in any one of claims 1-29.
  49. A composition comprising the cell in any one of claims 38-47.
  50. A method for disrupting a KLKB1 gene in a cell, comprising introducing into the cell the gene editing system in any one of claims 1-29.
  51. A method for decreasing the expression of plasma kallikrein in a cell, comprising introducing into the cell the gene editing system in any one of claims 1-29.
  52. A method for decreasing the expression of bradykinin in a cell, comprising introducing into the cell the gene editing system in any one of claims 1-29.
  53. A method for decreasing vascular leakage regulated by a kinin B2 receptor (B2R) in a cell, comprising introducing into the cell the gene editing system in any one of claims 1-29.
  54. A method for treating hereditary angioedema (HAE), comprising introducing into a cell the gene editing system in any one of claims 1-29.
  55. The method in claim 54, wherein the HAE is caused by C1-INH deficiency or dysfunction.
  56. The method in any one of claims 50-55, wherein the cell is a stem cell.
  57. The method in claim 56, wherein the stem cell is a pluripotent stem cell.
  58. The method in claim 57, wherein the pluripotent stem cell is an induced pluripotent stem cell (iPSC) or an embryonic stem cell.
  59. The method in any one of claims 50-55, wherein the cell is an endothelial cell.
  60. The method in any one of claims 50-55, wherein the cell is a primary cell or a differentiated cell.
Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2022104999 2022-07-11
CNPCT/CN2022/104999 2022-07-11

Publications (2)

Publication Number Publication Date
WO2024012435A1 true WO2024012435A1 (en) 2024-01-18
WO2024012435A9 WO2024012435A9 (en) 2024-04-04

Family

ID=87934026

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/106730 WO2024012435A1 (en) 2022-07-11 2023-07-11 Gene editing systems and methods for treating hereditary angioedema

Country Status (1)

Country Link
WO (1) WO2024012435A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020156575A1 (en) 2019-02-02 2020-08-06 Shanghaitech University Inhibition of unintended mutations in gene editing
WO2021158858A1 (en) * 2020-02-07 2021-08-12 Intellia Therapeutics, Inc. Compositions and methods for kallikrein ( klkb1) gene editing

Non-Patent Citations (14)

* Cited by examiner, † Cited by third party
Title
ALTSCHUL, S. F ET AL., J. MOL. BIOL, vol. 215, 1990, pages 403 - 410
BATZER ET AL., NUCLEIC ACID RES, vol. 19, 1991, pages 5081
HUANGFU ET AL., NATURE BIOTECHNOL, vol. 26, no. 7, 2008, pages 795
JOBERTY G. ET AL.: "supplementary online data", THE CRISPR JOURNAL, 1 January 2020 (2020-01-01), XP093093916, Retrieved from the Internet <URL:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7194318/> [retrieved on 20231023] *
JOBERTY GÉRARD ET AL.: "A tandem guide RNA-based strategy for efficient CRISPR gene editing of cell populations with low heterogeneity of edited alleles", THE CRISPR JOURNAL, vol. 3, no. 2, 1 April 2020 (2020-04-01), pages 123 - 134, XP093093872, ISSN: 2573-1599, Retrieved from the Internet <URL:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7194318/pdf/crispr.2019.0064.pdf> DOI: 10.1089/crispr.2019.0064 *
MATT ET AL.: "Hereditary angioedema: the plasma contact system out of control", J. THROMBOSIS AND HAEMOSTASIS, 2018
OHTSUKA ET AL., J BIOL CHEM, vol. 260, 1985, pages 2605 - 2608
ROSSOLINI ET AL., MOL CELL PROBES, vol. 8, 1994, pages 91 - 98
SUMPIO ET AL.: "Cells in focus: endothelial cell,", INT. J. BIOCHEM CELL BIOL., 2002
WANG, LIJIE ET AL.: "Eliminating base-editor-induced genome-wide and transcriptome-wide off-target mutations", NATURE CELL BIOLOGY, vol. 23.5, 2021, pages 552 - 563, XP037452073, DOI: 10.1038/s41556-021-00671-4
WEIDMANN ET AL.: "The plasma contact system, a protease cascade at the nexus of inflammation. coagulation and immunity", MOLECULAR CELL RESEARCH, 2017
WOLTJEN ET AL., NATURE, vol. 458, no. 7239, 2009, pages 766 - 770
ZHOU ET AL., CELL STEM CELL, vol. 8, 2009, pages 381 - 384
ZHOU ET AL., STEM CELLS, vol. 27, no. 11, 2009, pages 2667 - 74

Also Published As

Publication number Publication date
WO2024012435A9 (en) 2024-04-04

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 23765143; Country of ref document: EP; Kind code of ref document: A1