CN114685685B - Fusion protein for editing RNA and application thereof

Info

Publication number
CN114685685B
CN114685685B CN202210419197.7A CN202210419197A CN114685685B CN 114685685 B CN114685685 B CN 114685685B CN 202210419197 A CN202210419197 A CN 202210419197A CN 114685685 B CN114685685 B CN 114685685B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210419197.7A
Other languages
Chinese (zh)
Other versions
CN114685685A (en)
Inventor
池天
吕君君
邸明慧
荆征宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ShanghaiTech University
Original Assignee
ShanghaiTech University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ShanghaiTech University
Priority to CN202210419197.7A
Publication of CN114685685A
Application granted
Publication of CN114685685B
Legal status: Active
Anticipated expiration

Classifications

    • C12N9/78 Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)
    • A61K47/26 Carbohydrates, e.g. sugar alcohols, amino sugars, nucleic acids, mono-, di- or oligo-saccharides; Derivatives thereof, e.g. polysorbates, sorbitan fatty acid esters or glycyrrhizin
    • A61K47/42 Proteins; Polypeptides; Degradation products thereof; Derivatives thereof, e.g. albumin, gelatin or zein
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/4702 Regulators; Modulating activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N5/0618 Cells of the nervous system
    • C12N5/0686 Kidney cells
    • C12Y305/04 Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5) in cyclic amidines (3.5.4)
    • C07K2319/02 Fusion polypeptide containing a localisation/targetting motif containing a signal sequence
    • C07K2319/21 Fusion polypeptide containing a tag with affinity for a non-protein ligand containing a His-tag
    • C12N2310/31 Chemical structure of the backbone
    • C12N2310/531 Stem-loop; Hairpin
    • C12N2510/00 Genetically modified cells
    • C12N2800/107 Plasmid DNA for vertebrates for mammalian

Abstract

The invention discloses a fusion protein for editing RNA and applications thereof. From the N-terminus to the C-terminus, the fusion protein comprises, in order: a single-stranded RNA binding protein fragment, an RNA hairpin binding protein fragment, a nuclear export signal peptide fragment, and a deaminase fragment; the amino acid sequence of the deaminase fragment is shown in SEQ ID NO. 99. The fusion protein is small and has low immunogenicity and a low off-target effect. A base editing system comprising the fusion protein and an optimized sgRNA can target mRNA and activate the WNT signaling pathway, and therefore has good on-target editing efficiency and high industrial value.

Description

Fusion protein for editing RNA and application thereof
Technical Field
The invention relates to the field of biotechnology, in particular to a fusion protein for editing RNA, application thereof, a base editing system and a pharmaceutical composition containing the same, and a base editing method for non-therapeutic purposes.
Background
In site-directed RNA editing (SDRE), an editor consisting of an adenosine or cytosine deaminase fused to a programmable RNA-binding domain converts A to I or C to U, respectively, at the mRNA target site, thereby recoding genetic information at the mRNA level (Porto et al., 2020; Reardon, 2020; Rees and Liu, 2018; Vogel and Stafforst, 2019). SDRE is reversible: once editing ceases, the edited mRNA is replaced by nascent, unedited mRNA as transcription continues. This reversibility gives SDRE unique advantages over irreversible DNA base editing. First, since off-target edits on RNA are reversible, SDRE may be safer than DNA editing. Second, and more importantly, SDRE can be used to introduce mutations into normal proteins to manipulate their function, an application for which DNA base editing is largely unsuitable for safety and ethical reasons. A typical example is the Wnt/β-catenin signaling pathway (abbreviated as the Wnt pathway; shown in FIGS. 1 and 2) (Liu et al., 2002; MacDonald et al., 2009).
In almost all cell types in fetuses and adults, the Wnt pathway regulates cell proliferation, differentiation and migration and maintains stem cell pluripotency, thereby broadly and profoundly affecting embryonic development and adult tissue and organ homeostasis (including adult tissue renewal, regeneration and repair) (Clevers and Nusse, 2012; Kretzschmar and Clevers, 2017; MacDonald et al., 2009). A variety of diseases involving tissue damage are characterized by low Wnt pathway activity, including deafness, neurodegenerative diseases, osteoporosis, hair loss, defective wound healing, and ischemia-reperfusion injury. The Wnt pathway is a drug target with great therapeutic potential in regenerative medicine, and more than 20 Wnt pathway agonists (small molecules and Wnt proteins) have been produced over the last 30 years of research (Huang et al., 2019). However, as a drug target the Wnt pathway is a double-edged sword: the pathway itself must be precisely regulated, and its overactivation can lead to serious consequences such as malignancy, pulmonary fibrosis and excessive bone formation (Anastas and Moon, 2013; Blagodatski et al., 2014; Huang et al., 2019; Kahn, 2014). The Wnt pathway agonists described above therefore raise serious safety concerns. First, small-molecule agonists lack sufficient specificity. For example, the direct target of well-known agonists such as CHIR99021 and BIO is GSK-3β, and about 100 substrates of GSK-3β other than β-catenin are currently known, so these agonists have complex, unpredictable pleiotropic effects (Beurel et al., 2015). Second, the Wnt pathway is widely expressed, and small-molecule agonists can diffuse freely to non-target cells, leading to ectopic activation of the pathway and a risk of tumorigenesis (Blagodatski et al., 2014; Huang et al., 2019; Kahn, 2014). Another strategy for Wnt pathway activation is to selectively express, in target cells, β-catenin mutants that lack the β-catenin phosphorylation sites. This approach does not have the safety problems mentioned above, but it is clearly difficult to ensure that the mutants are not overexpressed. Thirty years after the discovery of this pathway, safety remains the key issue in targeting the Wnt pathway (Kahn, 2014). The safety risk of Wnt pathway agonists severely hampers the development of regenerative medicine.
Kannan et al. reported successfully reducing the size of RESCUE-S (RESCUE: RNA Editing for Specific C to U Exchange; RESCUE-S is the S375A mutant of RESCUE, in which the ADAR, a bifunctional A and C deaminase, carries the S375A mutation) by replacing dRanCas13b (1094 aa) with Cas13bt1, a small Cas13b of only 804 aa (Kannan et al., 2021). Although replacement with Cas13bt1 does not compromise on-target editing by RESCUE-S, the off-target effect is expected to be greatly exacerbated; this can be inferred from the fact that replacing dRanCas13b with Cas13bt1 in similar editors (dRanCas13b-REPAIR and dRanCas13b-REPAIR-S) substantially increases off-target editing. In addition, in 2020 Xu et al. truncated a dCas13 family protein to obtain dCas13X (mini), which is also a relatively small editor. However, the editing efficiency of dCas13X (mini) is far lower than that of C-RESCUEv2 (FIG. 14). Furthermore, it is still of bacterial origin and therefore immunogenic.
β-catenin RNA editing offers an attractive prospect for safely activating the Wnt pathway. However, the two currently published Cas13-based C > U editors (RESCUE and CURE) have the following limitations, which prevent their practical application: Cas13 is of bacterial origin and is immunogenic; it is large and difficult to package into AAV (adeno-associated virus); and both RESCUE and CURE have low editing efficiency (15%-30%) at the β-catenin phosphorylation sites.
Disclosure of Invention
The invention aims to overcome the low safety, low editing efficiency and difficulty of packaging for delivery of existing RNA editors, and provides an RNA editing fusion protein and applications thereof. The fusion protein is small and has low immunogenicity and a low off-target effect, and the base editing system comprising the fusion protein and the optimized sgRNA can target mRNA, has good on-target editing efficiency and has high industrial value.
Through extensive exploratory research, the inventors designed a number of fusion proteins using a CIRTS fragment only about 200 amino acids long together with a mutated deaminase, and found that fusion proteins comprising the CIRTS fragment and an ADAR fragment can, under the guidance of an sgRNA, efficiently edit UC > UU in a target gene region that forms a mismatched structure, with relatively high efficiency and specificity. The inventors also unexpectedly found that a histidine-rich domain (HRD) sequence native to an intrinsically disordered region (IDR), rather than other types of IDR, can enhance the editing efficiency of the fusion protein without strong phase transition phenomena, presumably because the HRD sequence stabilizes the conformation of the protein, promotes its interaction with other proteins, or acts through an as yet unknown mechanism.
The invention solves the technical problems by the following technical proposal:
In a first aspect, the present invention provides a fusion protein comprising, in order from the N-terminus to the C-terminus: a single-stranded RNA binding protein fragment, an RNA hairpin binding protein fragment, a nuclear export signal (NES) peptide fragment, and a deaminase fragment; the amino acid sequence of the deaminase fragment is shown in SEQ ID NO. 99.
In the present invention, the deaminase ADAR corresponding to the deaminase fragment is a derivative of ADAR2dd, an adenosine deaminase that targets double-stranded RNA, and can edit both A > G and C > U.
In the present invention, the single-stranded RNA binding protein fragment and the RNA hairpin binding protein fragment constitute a CIRTS fragment.
In the present invention, the CIRTS fragment is about 200 amino acids long (a coding sequence of about 0.6 kb) and is composed of human proteins, and thus does not induce an immune response in humans.
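The size figure above follows from simple arithmetic: three nucleotides encode one amino acid, so about 200 amino acids correspond to about 0.6 kb of coding sequence. The short sketch below applies the same rule to the lengths quoted in the Background for comparison. It ignores stop codons, linkers and regulatory elements, and the function and dictionary names are illustrative only.

```python
# Coding length in nucleotides is three times the length in amino acids.
# Stop codons, linkers and regulatory elements are ignored in this sketch.
NT_PER_CODON = 3


def coding_kb(n_amino_acids: int) -> float:
    """Return the approximate coding-sequence size in kilobases."""
    return n_amino_acids * NT_PER_CODON / 1000


# Lengths quoted in the text; the keys are labels only.
lengths_aa = {"CIRTS": 200, "Cas13bt1": 804, "dRanCas13b": 1094}
sizes_kb = {name: coding_kb(aa) for name, aa in lengths_aa.items()}

for name, kb in sizes_kb.items():
    print(f"{name}: {lengths_aa[name]} aa ~ {kb:.1f} kb")
```

On these numbers the RNA-targeting module of the fusion protein needs roughly a quarter of the coding space of Cas13bt1.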
In some embodiments of the invention, a histidine-rich domain (HRD) fragment is attached to the C-terminus of the deaminase fragment, and the copy number of the histidine-rich domain fragment is a natural number of 1 or more.
Preferably, the histidine-rich domain fragment has a copy number of 1, 2, 3 or 4.
In the present invention, the inventors used HRD sequences, the addition of which improved editing efficiency to some extent, but this improvement in editing efficiency appeared not to be mainly caused by phase transitions.
In some embodiments of the invention, the histidine-rich domain fragment has an amino acid sequence as set forth in SEQ ID NO. 4.
In some embodiments of the invention, the single-stranded RNA binding protein fragment, the RNA hairpin binding protein fragment, the nuclear export signal peptide fragment and the deaminase fragment are linked by linkers.
In some embodiments of the invention, the deaminase fragment and the histidine-rich domain fragment are linked by a linker.
In some embodiments of the invention, more than one copy of the histidine-rich domain fragment is linked by a linker.
In the present invention, the linker is (GmS)n or an amino acid sequence as shown in SEQ ID NO. 97, 98 or 100, where m and n are each 0 or a positive integer and are not simultaneously 0; examples are (G2S)6 and (GS)3.
In some embodiments of the invention, the single-stranded RNA binding protein fragment and the RNA hairpin binding protein fragment are linked by linker 1, which is (G2S)6.
In some embodiments of the invention, the RNA hairpin binding protein fragment and the nuclear export signal peptide fragment are linked by linker 2, whose amino acid sequence is shown in SEQ ID NO. 97.
In some embodiments of the invention, the nuclear export signal peptide fragment and the deaminase fragment are linked by linker 3 and linker 4 connected in series; the amino acid sequence of linker 3 is EF, and that of linker 4 is shown in SEQ ID NO. 98.
In some embodiments of the invention, the deaminase fragment and the histidine-rich domain fragment are linked by linker 5, which is (GS)3.
In some embodiments of the invention, more than one copy of the histidine-rich domain fragment is linked by a linker 6, the amino acid sequence of said linker 6 being shown in SEQ ID NO. 100.
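As a minimal sketch of the glycine-serine linker notation used above, the helper below expands a linker written with repeat counts m and n into a one-letter amino acid string. It assumes the notation means n tandem repeats of a unit made of m glycines followed by one serine; the function name `expand_linker` is hypothetical and not part of the patent.

```python
def expand_linker(m: int, n: int) -> str:
    """Expand a Gly-Ser linker with repeat counts m and n.

    Assumption: the unit is m glycines followed by one serine,
    and the unit is repeated n times.
    """
    if m < 0 or n < 0:
        raise ValueError("m and n must be non-negative")
    if m == 0 and n == 0:
        raise ValueError("m and n must not both be 0")
    return ("G" * m + "S") * n


linker_1 = expand_linker(2, 6)  # linker 1: m = 2, n = 6
linker_5 = expand_linker(1, 3)  # linker 5: m = 1, n = 3

print(linker_1)  # GGSGGSGGSGGSGGSGGS
print(linker_5)  # GSGSGS
```

Under this reading linker 1 is 18 residues long and linker 5 is 6 residues long.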
In some embodiments of the invention, the single-stranded RNA binding protein fragment is a β-defensin fragment.
In the present invention, the β-catenin may be human β-catenin (Gene ID: 1499) or murine β-catenin (Gene ID: 12387).
The amino acid sequence of the β-defensin fragment is preferably shown in SEQ ID NO. 38.
In some embodiments of the invention, the RNA hairpin binding protein fragment is a TAR binding protein (TBP) fragment having the amino acid sequence shown in SEQ ID NO. 96.
In some embodiments of the invention, the amino acid sequence of the nuclear export signal peptide fragment is shown in SEQ ID NO. 3.
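The embodiments of the first aspect can be summarized as an ordered N-to-C layout. The sketch below records each part with the SEQ ID NO. or linker notation given in the text and appends optional HRD copies. It is only a bookkeeping aid: it contains no actual sequences, and the names `Part`, `CORE` and `layout` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    name: str
    source: str  # SEQ ID NO. or literal notation quoted from the text


# Core fusion protein, listed from the N-terminus to the C-terminus.
CORE = [
    Part("single-stranded RNA binding protein fragment", "SEQ ID NO. 38"),
    Part("linker 1", "(G2S)6"),
    Part("RNA hairpin binding protein (TBP) fragment", "SEQ ID NO. 96"),
    Part("linker 2", "SEQ ID NO. 97"),
    Part("nuclear export signal peptide fragment", "SEQ ID NO. 3"),
    Part("linker 3", "EF"),
    Part("linker 4", "SEQ ID NO. 98"),
    Part("deaminase fragment", "SEQ ID NO. 99"),
]


def layout(hrd_copies: int = 0) -> list[Part]:
    """Return the ordered parts, with optional C-terminal HRD copies."""
    if hrd_copies < 0:
        raise ValueError("hrd_copies must be non-negative")
    parts = list(CORE)
    for i in range(hrd_copies):
        # Linker 5 joins the deaminase to the first HRD copy;
        # linker 6 joins each further HRD copy to the previous one.
        if i == 0:
            parts.append(Part("linker 5", "(GS)3"))
        else:
            parts.append(Part("linker 6", "SEQ ID NO. 100"))
        parts.append(Part("HRD fragment", "SEQ ID NO. 4"))
    return parts


for part in layout(hrd_copies=2):
    print(f"{part.name}: {part.source}")
```

Calling `layout()` with no argument gives the eight-part core; each HRD copy adds one linker and one HRD fragment.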
In a second aspect the invention provides a base editing system comprising a guide RNA and a fusion protein according to the first aspect.
In the invention, the guide RNA comprises a scaffold structure and a recognition structure; the scaffold structure comprises at least one hairpin structure.
In some embodiments of the invention, the scaffold structure comprises 2 hairpin structures.
In some embodiments of the invention, the transcription template of the hairpin structure has a nucleotide sequence as shown in SEQ ID NO. 101; and/or the length of the recognition structure is 30 nt.
In the present invention, when the RNA hairpin binding protein is a TAR binding protein, the guide RNA (sgRNA) consists of a spacer and a TAR hairpin. The spacer serves as the recognition structure of the guide RNA and recognizes the mRNA fragment containing the target C (cytosine). Introducing C or U (uracil) at the position complementary to the target C creates a C-C or C-U mismatch in the spacer-mRNA duplex (C-flip and U-flip, respectively), which makes the editor edit the target C preferentially. The spacer is typically 30 nt long, but the optimal position of the C/U-flip varies with the target site. The TAR hairpin carried by the sgRNA recruits C-RESCUE to the target site by binding the TBP on the CIRTS fragment.
One skilled in the art can select an appropriate sgRNA according to the target editing region of the gene. For example, the sgRNA sequence may be partially complementary to the target region, with a mismatch at the editing position, so that it cooperates with the fusion protein and positions the fusion protein at the target region, thereby achieving base editing of the target region by deamination: cytosine (C) is edited to uracil (U), and adenine (A) may be edited to hypoxanthine (I). In the base editing system provided by the invention, the sgRNA pairs with the target transcript and creates a mismatch at the editing position, thereby achieving editing specific to C > U RNA bases. For example, a 30 nt sgRNA sequence that pairs with the target gene and creates a mismatch at the editing position shows good editing activity with the mismatch at different positions (e.g., 15-26 nt) of the 30 nt sgRNA. As another example, adding a TAR structure to the sgRNA can also give higher editing efficiency. As another example, adding an HRD sequence after the fusion protein can also increase the editing efficiency of the editor.
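To illustrate what a C > U edit does at the protein level, the sketch below lists, for a given codon, every single C > U change and the amino acid encoded afterwards, using the standard genetic code. The codons in the example are chosen for illustration only; the actual Ctnnb1 codons are not given in this section, and the function name `c_to_u_outcomes` is hypothetical.

```python
# Standard genetic code, ordered by first, second, third base over U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def c_to_u_outcomes(codon: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, edited codon, amino acid) per single C>U edit."""
    codon = codon.upper().replace("T", "U")
    if len(codon) != 3 or any(base not in BASES for base in codon):
        raise ValueError("codon must be three letters from A, C, G, U/T")
    outcomes = []
    for pos, base in enumerate(codon):
        if base == "C":
            edited = codon[:pos] + "U" + codon[pos + 1:]
            outcomes.append((pos + 1, edited, CODON_TABLE[edited]))
    return outcomes


# Illustrative codons only: UCU encodes serine, ACC encodes threonine.
print(c_to_u_outcomes("UCU"))  # [(2, 'UUU', 'F')]
print(c_to_u_outcomes("ACC"))  # [(2, 'AUC', 'I'), (3, 'ACU', 'T')]
```

A serine codon of the form UCU would give a Ser-to-Phe change, the same type of substitution as the S33F site named in the figure descriptions, while an edit at the third position of ACC is silent.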
In a third aspect of the invention, there is provided a base deaminase having the amino acid sequence shown in SEQ ID NO. 99.
In a fourth aspect the invention provides an isolated polynucleotide encoding a fusion protein as described in the first aspect or a base editing system as described in the second aspect.
In a fifth aspect the invention provides an expression cassette comprising a promoter and a polynucleotide as described in the fourth aspect.
In the present invention, the promoter is a U6 promoter, preferably a human U6 promoter.
In some embodiments of the invention, the nucleotide sequence of the expression cassette is shown in SEQ ID NO. 5.
In a sixth aspect the invention provides a construct comprising a polynucleotide as described in the fourth aspect or an expression cassette as described in the fifth aspect.
In the present invention, the construct may be generally obtained by constructing the isolated polynucleotide by inserting it into a suitable expression vector, which may be selected by those skilled in the art, for example, the expression vector may be, but is not limited to, a bacterial expression vector, a fungal expression vector, an animal expression vector (e.g., insect, drosophila, nematode, fish, zebra fish, mammal, mouse, rat, rabbit, pig, monkey, human, etc.), a plant expression vector, and the like. In the present invention, the construct is generally used interchangeably with "recombinant expression vector".
In some embodiments of the invention, the scaffold of the construct is selected from the group consisting of a bacterium, a fungus, an animal cell, and a plant cell.
In a seventh aspect the invention provides an expression system which is a host cell expressing a construct according to the sixth aspect.
In the present invention, the expression system comprises the construct according to the sixth aspect, or has integrated into its genome the exogenous isolated polynucleotide according to the fourth aspect or the expression cassette according to the fifth aspect. The host cell can express the fusion protein, and the fusion protein can cooperate with the sgRNA so that the fusion protein is positioned at a target region and base editing of the target region is achieved. In the present invention, the expression system generally has the same or a similar meaning as "transformant" or "genetically modified cell".
In some embodiments of the invention, the host cell is a eukaryotic cell or a prokaryotic cell.
In some embodiments of the invention, the eukaryotic cell is a human cell line or a murine cell, preferably a human kidney epithelial cell line or a murine neuroblastoma cell, such as HEK293T or N2A.
An eighth aspect of the invention provides a pharmaceutical composition comprising a fusion protein as described in the first aspect or a base editing system as described in the second aspect.
A ninth aspect of the present invention provides a base editing method for non-therapeutic purposes, the base editing method comprising:
expressing the fusion protein according to the first aspect or the base editing system according to the second aspect in a target cell, so that base editing occurs in the target cell.
The non-therapeutic purpose may be for scientific research purposes, such as gene function discovery and animal disease model construction; but may also be other non-commercial research purposes.
In the present invention, the base editing method may include: culturing the expression system of the seventh aspect under appropriate conditions to express the fusion protein, which carries out base editing of the target region in the presence of an sgRNA that cooperates with it and targets the target region. Methods of providing conditions under which the sgRNA is present should be known to those skilled in the art; for example, an expression system capable of expressing the sgRNA may be cultured under appropriate conditions, and this expression system may be a host cell comprising an expression vector that contains a polynucleotide encoding the sgRNA, or a host cell having a polynucleotide encoding the sgRNA integrated in the chromosome.
In some embodiments of the invention, the gene editing is in vitro gene editing.
Preferably, the target cell is a eukaryotic cell or a prokaryotic cell.
More preferably, the eukaryotic cell is a human cell line or a murine cell, preferably a human kidney epithelial cell line or a murine neuroblastoma cell, such as HEK293T or N2A.
A tenth aspect of the invention provides the use of a fusion protein according to the first aspect, a base editing system according to the second aspect, a base deaminase according to the third aspect, a polynucleotide according to the fourth aspect, an expression cassette according to the fifth aspect, a construct according to the sixth aspect or an expression system according to the seventh aspect for the preparation of a base edited drug, for the preparation of a gene therapy drug or for the preparation of a base editing tool.
Preferably, the base editing is selected from repairing a T-to-C gene mutation, restoring translation of an inactive protein that carries a missense mutation resulting from a single-base mutation or that fails to terminate properly, RNA editing, and activating the WNT signaling pathway.
In the present invention, the use is preferably for regulating gene expression in eukaryotes, in particular metazoans, including but not limited to human cells. Such uses may include, but are not limited to, repairing T-to-C gene mutations, restoring translation of inactive proteins that carry missense mutations resulting from single-base mutations or that fail to terminate properly, altering RNA editing, and activating the WNT signaling pathway.
On the basis of conforming to the common knowledge in the field, the above preferred conditions can be arbitrarily combined to obtain the preferred examples of the invention.
The reagents and materials used in the present invention are commercially available.
The beneficial effects of the invention are as follows:
The fusion protein comprises a CIRTS fragment and an ADAR fragment, is small, and has low immunogenicity and a low off-target effect. The base editing system containing the fusion protein and the optimized sgRNA can target mRNA, has good on-target editing efficiency and has high industrial value.
Drawings
FIG. 1 is a simplified view of the Wnt pathway; β-catenin is phosphorylated by the APC-Axin-GSK3β complex and then degraded, until a Wnt ligand inhibits the complex, which results in β-catenin stabilization and subsequent transcription of the target genes.
FIG. 2 shows the phosphorylation sites (marked at each site) on β-catenin (amino acids 33-45) and the SDRE-based editing strategy; phosphorylation of each of the 4 residues on β-catenin (S33, S37, T41 and S45) is necessary for β-catenin degradation; all 4 residues can be mutated at the mRNA level by a C > U conversion, while T41 can also be mutated by an A > I conversion.
FIG. 3 shows the editors tested in the present invention; editors 2-5 are the editors designed in the invention; ADAR is the bifunctional A and C deaminase (S375A); TBP is the TAR binding protein; CIRTS is a CRISPR/Cas-inspired RNA targeting system.
FIG. 4 shows the sgRNA structure for C-RESCUE; the spacer binds the mRNA target sequence and the TAR hairpin binds the TBP portion of C-RESCUE; an additional hairpin may be added. When a C on the mRNA is edited, the base within the spacer opposite the target C is C or U (labeled X); when the editing site is an A, the A is opposite a C.
FIG. 5 shows the Ctnnb1 reporter construct; Ctnnb1 is the full-length mouse cDNA sequence, asterisks mark the phosphorylation sites, SD and SA are the splice donor and acceptor, and the arrows are the RT-PCR primers used to amplify the cDNA after reverse transcription of the mRNA; because the primers can also amplify plasmid DNA, an SV40 intron is inserted to distinguish the two amplicons.
FIG. 6 shows sgRNA optimization in N2a cells using the Ctnnb1 reporter gene (S33F); the sgRNA hairpin copy number (1×TAR versus 2×TAR), the mismatch position in the sgRNA and the mismatched base (C-C mismatch versus C-U mismatch) were optimized; NT is a non-targeting sgRNA with a random spacer that does not target any mRNA.
FIG. 7 shows sgRNA optimization in N2a cells using the endogenous Ctnnb1 gene; as in FIG. 6, the S33F site of endogenous Ctnnb1 was edited in N2a cells and the editing efficiencies of 1×TAR and 2×TAR were compared, showing that the editing efficiency of 2×TAR was not lower than that of 1×TAR.
FIG. 8 shows editing of the Ctnnb1 S33F site on the endogenous gene and the reporter gene by HRD-enhanced C-RESCUE; to enhance C-RESCUEv1, a 71-aa histidine-rich domain (HRD) taken from human Cyclin T1 was fused to the editor; this domain (aa 480-550) lies within the 191-aa IDR (aa 462-652) of Cyclin T1 and is key to the LLPS mediated by the Cyclin T1 IDR; the results indicate that HRD does tend to promote S33F editing in N2a cells on both the endogenous gene and the reporter gene, with the best editing when 2 HRD copies are added (all sgRNAs are 30 nt long).
FIG. 9 shows that HRD increases the editing efficiency on endogenous genes in HEK293T cells; to determine whether the effect of HRD at Ctnnb1 S33 in mouse N2a cells generalizes to human target cells, C-RESCUEv1 and C-RESCUEv2 were compared at 9 sites in HEK293T cells; C-RESCUEv2 clearly outperformed C-RESCUEv1 at 4 sites (CTNNB1 S33, CTNNB1 P44, PPIB I18, SMARCA4 S85L), and the other 5 sites showed the same trend except for NFKB1 P33; the introduction of HRD therefore improves the editing efficiency at different target sites to different extents.
FIG. 10 shows editing by C-RESCUEv2 at the S33F, S37F, T41I, T41A and S45F sites on the endogenous Ctnnb1 gene (A of FIG. 10) and the reporter gene (B of FIG. 10); on the endogenous gene in N2a cells, the editing efficiency at the S33F site is higher than at the S37F, T41I, T41A and S45F sites, except that in the test of different sgRNA lengths (right side) the editing efficiency is lower at lengths of 40 and 50 bases.
FIG. 11 shows fluorescence recovery after photobleaching (FRAP) results; mCherry-tagged C-RESCUEv2 was used to visualize the phase-separation process, with mCherry-Cyclin T1 (mCherry-CYCT1) as a positive control; mCherry-CYCT1 transiently expressed in HeLa cells formed a large number of distinct puncta as expected, whereas puncta were far fewer and less prominent in cells expressing C-RESCUEv2-mCherry (A of FIG. 11); furthermore, FRAP analysis showed that C-RESCUEv2-mCherry puncta hardly recovered fluorescence after bleaching (B of FIG. 11), in contrast to the highly dynamic recovery of mCherry-CYCT1 puncta (τ1/2 = 1.6 s).
FIG. 12 compares the editing efficiency of C-RESCUEv2 and RESCUE-S on endogenous genes in HEK293T cells; all C-RESCUEv2 sgRNAs are 30 nt long with a C-C mismatch at base 22; C-RESCUEv2 was significantly more active at 2 of the 9 sites (CTNNB1 P44, SMARCA4 D86), and the two editors were approximately equivalent at the remaining sites, except that C-RESCUEv2 was weaker at PPIB R7. Although C-RESCUEv2 appeared approximately equivalent to RESCUE-S at CTNNB1 T41 in HEK293T cells, it was significantly more active than RESCUE-S (9-fold) in N2a cells.
FIG. 13 shows RNA-seq detection of transcriptome-wide off-target edits; C-RESCUEv2 produced 5.8-fold more off-target edits than RESCUE-S (19151 versus 3321); importantly, the off-target editing of C-RESCUEv2 was similar to that of C-RESCUEv1 (23339) and C-RESCUEv1-1×HRD (21895), indicating that the higher off-target effect of C-RESCUEv2 relative to RESCUE-S comes not from HRD but from CIRTS, possibly because CIRTS has a higher non-specific affinity for RNA than RanCas13b (editor 1 in FIG. 3).
FIG. 14 shows the editing results of C-RESCUEv2 and dCas13e(mini)-ADAR on the endogenous gene (A of FIG. 14) and the reporter gene (B of FIG. 14); dCas13e(mini)-ADAR is a small RNA base editor comprising a deletion mutant of dCas13e fused to the bifunctional deaminase ADAR (editor 6 in FIG. 3). dCas13e(mini)-ADAR barely edited Ctnnb1 and was much less active than C-RESCUEv2 at PPIB (a site known to be editable by dCas13e(mini)-ADAR).
FIG. 15 shows the A > I editing results of the MCP-ADAR base editors at the endogenous Ctnnb1 T41A site in N2a cells; two MCP-based A > I editors were tested, containing the ADAR1 and ADAR2 deaminase domains, respectively (editors 7-8 in FIG. 3) (Katrekar et al., 2019). The results indicate that MCP-ADAR1 and MCP-ADAR2 achieved 30% and 9% editing, respectively, of the T41A conversion on endogenous Ctnnb1 in N2a cells.
FIG. 16 compares C-RESCUEv1 and C-RESCUEv1-1×HRD editing of the S33F site on the Ctnnb1 reporter gene in N2a cells; the sgRNAs are 30 nt long with a C-C mismatch, and the mismatch position is varied; the 1×HRD in C-RESCUEv1-1×HRD improves editing efficiency to different extents at different mismatch positions.
FIG. 17 shows editing of the Ctnnb1 S33F site after adding a nuclear localization signal sequence at the N-terminus, the C-terminus or both ends of C-RESCUEv2; the S33F site of the endogenous Ctnnb1 gene was edited in HEK293T cells, with an sgRNA length of 30 nt, mismatch position 22 and a C-C mismatch; adding the nuclear localization signal to any of the editors greatly reduced its editing efficiency.
FIG. 18 shows that, on the reporter gene, different copy numbers of the histidine-rich short peptide from DYRK1A (B) edit the S33F site no more efficiently than C-RESCUEv2; the histidine-rich DYRK1A short peptide sequence was placed at the N-terminus and the C-terminus of C-RESCUEv1 in different copy numbers, and in no case was the editing efficiency at the S33F site of the Ctnnb1 reporter gene higher than that of C-RESCUEv2.
FIG. 19 shows editing at the S33F site of the Ctnnb1 reporter gene and endogenous gene with different IDR sequences; three IDR sequences, from h-CYCT1 (A) (1×), FUS (C) and hnRNPA1 (D), were each inserted at the C-terminus of the C-RESCUEv1 editor, and the editing efficiency on both the Ctnnb1 reporter gene and the endogenous gene was lower than that of C-RESCUEv2.
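The figure descriptions refer to residue changes such as S33F, T41I and T41A that arise from a single C > U or A > I edit in the mRNA. The following minimal Python sketch illustrates that reasoning with the standard genetic code. The codons used (TCT for a serine, ACC for a threonine) are assumptions chosen for illustration, not values stated in the figure descriptions, and the function name `single_base_edits` is hypothetical.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# How each deamination product is read at the cDNA level:
# C > U is read as T, and A > I (inosine) is read as G.
READOUT = {"C": ("C>U", "T"), "A": ("A>I", "G")}


def single_base_edits(codon):
    """List every single C>U or A>I edit of a codon with the residue change."""
    codon = codon.upper()
    results = []
    for pos, base in enumerate(codon):
        if base in READOUT:
            label, new_base = READOUT[base]
            edited = codon[:pos] + new_base + codon[pos + 1:]
            change = CODON_TABLE[codon] + "->" + CODON_TABLE[edited]
            # Positions are reported 1-based within the codon.
            results.append((label, pos + 1, edited, change))
    return results


# Assumed example codons (hypothetical inputs, for illustration only).
for example in ("TCT", "ACC"):
    for edit in single_base_edits(example):
        print(example, edit)
```

Under these assumed codons, a C > U edit at the middle base of TCT gives a Ser-to-Phe change, and ACC can give either Thr-to-Ile (C > U) or Thr-to-Ala (A > I), which matches the kinds of changes named in the figure descriptions.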
Detailed Description
The invention is further illustrated by means of the following examples, which are not intended to limit the scope of the invention. The experimental methods for which specific conditions are not given in the following examples were carried out according to conventional methods and conditions, or according to the manufacturer's instructions.
In order to make the objects, technical solutions and advantageous technical effects of the present invention clearer, the present invention will be further described in detail with reference to examples. It should be understood that the examples described in this specification are for the purpose of illustrating the invention only and are not intended to limit the invention.
For simplicity, only a few numerical ranges are explicitly disclosed herein. However, any lower limit may be combined with any upper limit to form a range not explicitly recited; and any lower limit may be combined with any other lower limit to form a range not explicitly recited, and any upper limit may be combined with any other upper limit to form a range not explicitly recited. Furthermore, each point or individual value between the endpoints of the range is included within the range, although not explicitly recited. Thus, each point or individual value may be combined as a lower or upper limit on itself with any other point or individual value or with other lower or upper limit to form a range that is not explicitly recited.
In the description herein, unless otherwise indicated, "above" and "below" include the stated number, and "more" in "one or more" means two or more.
The above summary of the present invention is not intended to describe each disclosed embodiment or every implementation of the present invention. The following description more particularly exemplifies illustrative embodiments. Guidance is provided throughout this application by a series of embodiments, which may be used in various combinations. In the various examples, the list is merely a representative group and should not be construed as exhaustive.
In the fusion proteins provided by the present invention, the substitution, deletion or addition may be a conservative amino acid substitution. The term "conservative amino acid substitution" refers specifically to the replacement of an amino acid residue by another amino acid residue having a similar side chain. Families of amino acid residues with similar side chains are known to those skilled in the art and include, but are not limited to, basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Conservative amino acid substitutions may more particularly include, but are not limited to, the cases listed in the following tables: the numbers in Table 1 (amino acid similarity matrix) represent the similarity between two amino acids, and a substitution is considered conservative when the number is 0 or more; Table 2 gives exemplary conservative amino acid substitutions.
TABLE 1 amino acid similarity
TABLE 2 conservative amino acid substitutions
Amino acid residue | Conservative substitutions
Alanine | D-Ala, Gly, Aib, β-Ala, L-Cys, D-Cys
Arginine | D-Arg, Lys, D-Lys, Orn, D-Orn
Asparagine | D-Asn, Asp, D-Asp, Glu, D-Glu, Gln, D-Gln
Aspartic acid | D-Asp, D-Asn, Asn, Glu, D-Glu, Gln, D-Gln
Cysteine | D-Cys, S-Me-Cys, Met, D-Met, Thr, D-Thr, L-Ser, D-Ser
Glutamine | D-Gln, Asn, D-Asn, Glu, D-Glu, Asp, D-Asp
Glutamic acid | D-Glu, D-Asp, Asp, Asn, D-Asn, Gln, D-Gln
Glycine | Ala, D-Ala, Pro, D-Pro, Aib, β-Ala
Isoleucine | D-Ile, Val, D-Val, Leu, D-Leu, Met, D-Met
Leucine | Val, D-Val, Met, D-Met, D-Ile, D-Leu, Ile
Lysine | D-Lys, Arg, D-Arg, Orn, D-Orn
Methionine | D-Met, S-Me-Cys, Ile, D-Ile, Leu, D-Leu, Val, D-Val
Phenylalanine | D-Phe, Tyr, D-Tyr, His, D-His, Trp, D-Trp
Proline | D-Pro
Serine | D-Ser, Thr, D-Thr, allo-Thr, L-Cys, D-Cys
Threonine | D-Thr, Ser, D-Ser, allo-Thr, Met, D-Met, Val, D-Val
Tyrosine | D-Tyr, Phe, D-Phe, His, D-His, Trp, D-Trp
Valine | D-Val, Leu, D-Leu, Ile, D-Ile, Met, D-Met
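As a rough illustration of the side-chain families named in the description of conservative substitutions, the following Python sketch treats a substitution as conservative when both residues share at least one of those families. This is a simplification and an assumption: it does not reproduce the Table 1 similarity matrix, and the names `FAMILIES`, `shared_families` and `is_conservative` are hypothetical.

```python
# Side-chain families as listed in the description (one-letter codes).
FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "aromatic": set("YFWH"),
}


def shared_families(a, b):
    """Return the sorted names of all families that contain both residues."""
    return sorted(
        name for name, members in FAMILIES.items() if a in members and b in members
    )


def is_conservative(a, b):
    """Family-based approximation: distinct residues sharing at least one family."""
    return a != b and bool(shared_families(a, b))


if __name__ == "__main__":
    for pair in (("K", "R"), ("D", "E"), ("F", "Y"), ("K", "D")):
        print(pair, is_conservative(*pair), shared_families(*pair))
```

A matrix-based test (score of 0 or more, as described for Table 1) could give different answers for some pairs; the family rule is only meant to make the grouping concrete.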
It should be understood that the process equipment or devices not specifically identified in the examples below are all conventional in the art.
Unless otherwise indicated, the experimental methods, detection methods, and preparation methods disclosed in the present invention employ techniques conventional in the art of molecular biology, biochemistry, chromatin structure and analysis, analytical chemistry, cell culture, recombinant DNA techniques, and related fields. These techniques are well described in the literature; see, in particular, Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, Second edition, Cold Spring Harbor Laboratory Press, 1989 and Third edition, 2001; Ausubel et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, New York, 1987 and periodic updates; the series METHODS IN ENZYMOLOGY, Academic Press, San Diego; Wolffe, CHROMATIN STRUCTURE AND FUNCTION, Third edition, Academic Press, San Diego, 1998; METHODS IN ENZYMOLOGY, Vol. 304, Chromatin (P. M. Wassarman and A. P. Wolffe, eds.), Academic Press, San Diego, 1999; and METHODS IN MOLECULAR BIOLOGY, Vol. 119, Chromatin Protocols (P. B. Becker, ed.), Humana Press, Totowa, 1999.
The reagents and materials used in the examples are shown in table 3 below:
TABLE 3 reagents and materials
The primer sequences involved in the examples are shown in Table 4 below.
TABLE 4 partial primer sequences
For the editing efficiencies reported throughout the experiments, values are mean ± SEM (n = 3 unless otherwise stated). *p ≤ 0.05, **p ≤ 0.01, ***p ≤ 0.001 (t-test).
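The reporting convention above (mean and SEM over replicates, with significance thresholds of 0.05, 0.01 and 0.001) can be sketched in Python as follows. The replicate values are hypothetical, the mapping of the three thresholds to one, two and three asterisks is an assumed convention, and the p-value itself is taken as an input rather than computed here.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of replicates."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed for an SEM")
    mean = statistics.mean(values)
    # SEM = sample standard deviation / sqrt(n)
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


def significance_stars(p_value):
    """Map a p-value to asterisks using the 0.05 / 0.01 / 0.001 thresholds."""
    if p_value <= 0.001:
        return "***"
    if p_value <= 0.01:
        return "**"
    if p_value <= 0.05:
        return "*"
    return ""


if __name__ == "__main__":
    # Hypothetical editing efficiencies (%) from n = 3 replicates.
    replicates = [10.0, 12.0, 14.0]
    mean, sem = mean_sem(replicates)
    print(f"{mean:.1f} +/- {sem:.2f} (n = {len(replicates)})")
```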
As shown in FIG. 3, the fusion protein C-RESCUEv1 in the examples comprises, in order from the N-terminus to the C-terminus: a CIRTS fragment (comprising β-defensin and TAR binding protein (TBP)), an HIV NES (nuclear export signal) peptide fragment and an ADAR fragment. C-RESCUEv2 comprises, in order from the N-terminus to the C-terminus: a CIRTS fragment (comprising β-defensin and TAR binding protein), an HIV NES peptide fragment, an ADAR fragment and a 2×HRD fragment (in the present invention, "n×" denotes copy number; e.g., 1×HRD denotes one copy of HRD, and 2×HRD denotes two copies of HRD joined in tandem). The fusion protein C-RESCUEv1-1×HRD is formed by adding one HRD sequence to the C-terminus of the C-RESCUEv1 fusion protein; C-RESCUEv1-4×HRD is formed by adding four HRD sequences to the C-terminus of C-RESCUEv1, with the HRDs connected by the corresponding linkers. As shown in FIG. 4, the sgRNA comprises, from left to right: a TAR sequence fragment, a UUAUU linking sequence, the target site sequence, a UUAUU linking sequence and a TAR sequence fragment.
The amino acid sequence of the C-RESCUEv1 is shown as SEQ ID NO. 1, and the amino acid sequence of the C-RESCUEv2 is shown as SEQ ID NO. 2.
MGIINTLQKYYCRVRGGRCAVLSCLPKEEQIGKCSTRGRKCCRRKKGGSGGSGGSGGSGGSGGSMAVPETRPNHTIYINNLNSKIKKDELKKSLYAIFSQFGQILDILVPRQRTPRGQAFVIFKEVSSATNALRSMQGFPFYDKPMRIQYAKTDKRIPAKMKGTFVMHGSLQLPPLERLTLEFGGSGGSGGSGGSGGSGGSGSSQLHLPQVLADAVSRLVIGKFGDLTDNFSSPHARRIGLAGVVMTTGTDVKDAKVICVSTGAKCINGEYLSDRGLALNDCHAEIVSRRSLLRFLYTQLELYLNNEDDQKRSIFQKSERGGFRLKENIQFHLYISTSPCGDARIFSPHEAILEEPADRHPNRKARGQLRTKIEAGQGTIPVRNNASIQTWDGVLQGERLLTMSCSDKIARWNVVGIQGSLLSIFVEPIYFSSIILGSLYHGDHLSRAMYQRISNIEDLPPLYTLNKPLLTGISNAEARQPGKAPIFSVNWTVGDSAIEVINATTGKGELGRASRLCKHALYCRWMRVHGKVPSHLLRSKITKPNVYHETKLAAKEYQAAKARLFTAFIKAGLGAWVEKPTEQDQFSLT(SEQ ID NO:1)
MGIINTLQKYYCRVRGGRCAVLSCLPKEEQIGKCSTRGRKCCRRKKGGSGGSGGSGGSGGSGGSMAVPETRPNHTIYINNLNSKIKKDELKKSLYAIFSQFGQILDILVPRQRTPRGQAFVIFKEVSSATNALRSMQGFPFYDKPMRIQYAKTDKRIPAKMKGTFVMHGSLQLPPLERLTLEFGGSGGSGGSGGSGGSGGSGSSQLHLPQVLADAVSRLVIGKFGDLTDNFSSPHARRIGLAGVVMTTGTDVKDAKVICVSTGAKCINGEYLSDRGLALNDCHAEIVSRRSLLRFLYTQLELYLNNEDDQKRSIFQKSERGGFRLKENIQFHLYISTSPCGDARIFSPHEAILEEPADRHPNRKARGQLRTKIEAGQGTIPVRNNASIQTWDGVLQGERLLTMSCSDKIARWNVVGIQGSLLSIFVEPIYFSSIILGSLYHGDHLSRAMYQRISNIEDLPPLYTLNKPLLTGISNAEARQPGKAPIFSVNWTVGDSAIEVINATTGKGELGRASRLCKHALYCRWMRVHGKVPSHLLRSKITKPNVYHETKLAAKEYQAAKARLFTAFIKAGLGAWVEKPTEQDQFSLTGSGSGSIKMRIKVHAAADKHNSVEDSVTKSREHKEKHKTHPSNHHHHHNHHSHKHSHSQLPVGTGNKRPGDPKHSSQGGSAAAGGSIKMRIKVHAAADKHNSVEDSVTKSREHKEKHKTHPSNHHHHHNHHSHKHSHSQLPVGTGNKRPGDPKHSSQ(SEQ ID NO:2)
The amino acid sequence of β-defensin in the CIRTS fragment of the examples is:
MGIINTLQKYYCRVRGGRCAVLSCLPKEEQIGKCSTRGRKCCRRKK(SEQ ID NO:38);
the amino acid sequence of TBP is:
MAVPETRPNHTIYINNLNSKIKKDELKKSLYAIFSQFGQILDILVPRQRTPRGQAFVIFKEVSSATNALRSMQGFPFYDKPMRIQYAKTDKRIPAKMKGTFV(SEQ ID NO:96);
the amino acid sequence of ADAR is:
QLHLPQVLADAVSRLVIGKFGDLTDNFSSPHARRIGLAGVVMTTGTDVKDAKVICVSTGAKCINGEYLSDRGLALNDCHAEIVSRRSLLRFLYTQLELYLNNEDDQKRSIFQKSERGGFRLKENIQFHLYISTSPCGDARIFSPHEAILEEPADRHPNRKARGQLRTKIEAGQGTIPVRNNASIQTWDGVLQGERLLTMSCSDKIARWNVVGIQGSLLSIFVEPIYFSSIILGSLYHGDHLSRAMYQRISNIEDLPPLYTLNKPLLTGISNAEARQPGKAPIFSVNWTVGDSAIEVINATTGKGELGRASRLCKHALYCRWMRVHGKVPSHLLRSKITKPNVYHETKLAAKEYQAAKARLFTAFIKAGLGAWVEKPTEQDQFSLT(SEQ ID NO:99).
The NES peptide fragment can be located between the CIRTS fragment and the ADAR fragment, and can comprise the amino acid sequence shown in SEQ ID NO: 3.
LQLPPLERLTL(SEQ ID NO:3)
The fusion protein provided by the invention can also comprise a histidine-rich domain 71 amino acids in length, namely the HRD fragment; the HRD fragment can be located at the C-terminus of C-RESCUEv2 and can comprise the amino acid sequence shown in SEQ ID NO: 4.
IKMRIKVHAAADKHNSVEDSVTKSREHKEKHKTHPSNHHHHHNHHSHKHSHSQLPVGTGNKRPGDPKHSSQ(SEQ ID NO:4)
Linker 2: MHGS (SEQ ID NO: 97)
Linker 4: GGSGGSGGSGGSGGSGGSGSS (SEQ ID NO: 98)
Linker 6: GGSAAAGGS (SEQ ID NO: 100)
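To make the domain layout concrete, the following Python sketch composes the short junctions of the fusion proteins from the NES, HRD and connector (linker) sequences listed above and checks them against an excerpt copied from SEQ ID NO: 1. The "EF" dipeptide after the NES and the "GSGSGS" spacer before the first HRD are read off SEQ ID NO: 1 and SEQ ID NO: 2 by inspection and are labeled as assumptions; the function names are hypothetical, and the arrangement for more than two HRD copies is not confirmed by the text.

```python
NES = "LQLPPLERLTL"  # SEQ ID NO: 3
HRD = ("IKMRIKVHAAADKHNSVEDSVTKSREHKEKHKTHPSNHHHHHNHHSHKHSHSQ"
       "LPVGTGNKRPGDPKHSSQ")  # SEQ ID NO: 4
LINKER_2 = "MHGS"                    # SEQ ID NO: 97
LINKER_4 = "GGSGGSGGSGGSGGSGGSGSS"   # SEQ ID NO: 98
LINKER_6 = "GGSAAAGGS"               # SEQ ID NO: 100
GS3 = "GSGSGS"   # assumption: spacer seen between ADAR and the first HRD in SEQ ID NO: 2
EF = "EF"        # assumption: dipeptide seen after the NES in SEQ ID NO: 1

# Excerpt copied from SEQ ID NO: 1 (end of TBP through the start of ADAR).
SEQ1_EXCERPT = "KMKGTFVMHGSLQLPPLERLTLEFGGSGGSGGSGGSGGSGGSGSSQLHLPQ"


def nes_junction():
    """Linker 2 + NES + EF + linker 4, as it appears between TBP and ADAR."""
    return LINKER_2 + NES + EF + LINKER_4


def hrd_tail(copies):
    """(GS)3 spacer followed by HRD copies joined by linker 6 (sketch only)."""
    return GS3 + LINKER_6.join([HRD] * copies)


if __name__ == "__main__":
    print("HRD length:", len(HRD), "histidines:", HRD.count("H"))
    print("NES junction found in SEQ ID NO: 1 excerpt:", nes_junction() in SEQ1_EXCERPT)
    print("2xHRD tail length:", len(hrd_tail(2)))
```

The length check confirms that the HRD sequence is 71 residues, in line with the description of a 71-amino-acid histidine-rich domain.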
Example 1 construction of plasmid for C-RESCUE System:
The CIRTS fragment (SEQ ID NO: 6) was synthesized by a company (gold Style). The TC1309: CMV bpNLS-dPspCas13b-Apobec3A (Y132D) plasmid was used as the vector; it was digested with XhoI and PmeI, and the digested vector fragment (5078 bp) was recovered using the Monarch DNA Gel Extraction Kit. The two fragments (the digested vector fragment and the CIRTS fragment synthesized by the company) were joined by recombination using HiFi DNA Assembly Master Mix. After incubation at 50 °C for 15 min, the product was transformed and plated, and Sanger sequencing identified TC1688: CMV CIRTS-Apobec3A (SEQ ID NO: 7).
ATGGGAATTATCAACACCCTTCAAAAATACTACTGTCGCGTCCGTGGGGGTCGTTGCGCAGTCCTGTCGTGCCTGCCTAAAGAGGAGCAGATCGGTAAATGCTCAACCCGTGGACGTAAGTGCTGTCGTCGCAAGAAAGGCGGTTCTGGAGGCTCAGGAGGAAGCGGAGGATCCGGTGGATCAGGAGGCTCCATGGCAGTTCCCGAGACCCGCCCTAACCACACTATTTATATCAACAACCTCAATAGCAAGATCAAGAAGGATGAGCTAAAAAAGTCCCTGTACGCCATCTTCTCCCAGTTTGGCCAGATCCTGGATATCCTGGTACCGCGGCAGCGTACCCCGAGGGGCCAGGCCTTTGTCATCTTCAAGGAGGTCAGCAGCGCCACCAACGCCCTGCGCTCCATGCAGGGTTTCCCTTTCTATGACAAACCTATGCGTATCCAGTATGCCAAGACCGACAAACGTATCCCGGCCAAGATGAAAGGCACCTTCGTGATGCATGGATCCCTTCAACTGCCTCCACTTGAAAGACTGACACTGGAATTCGGCGGTAGTGGAGGAAGCGGAGGAAGCGGAGGAAGTGGTGGATCCGGAGGCTCC(SEQ ID NO:6)
GTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTCTCTGGCTAACTAGAGAACCCACTGCTTACTGGCTTATCGAAATTAATACGACTCACTATAGGGAGACCCAAGCTGGCTAGCGTTTGCCACCATGGGAATTATCAACACCCTTCAAAAATACTACTGTCGCGTCCGTGGGGGTCGTTGCGCAGTCCTGTCGTGCCTGCCTAAAGAGGAGCAGATCGGTAAATGCTCAACCCGTGGACGTAAGTGCTGTCGTCGCAAGAAAGGCGGTTCTGGAGGCTCAGGAGGAAGCGGAGGATCCGGTGGATCAGGAGGCTCCATGGCAGTTCCCGAGACCCGCCCTAACCACACTATTTATATCAACAACCTCAATAGCAAGATCAAGAAGGATGAGCTAAAAAAGTCCCTGTACGCCATCTTCTCCCAGTTTGGCCAGATCCTGGATATCCTGGTACCGCGGCAGCGTACCCCGAGGGGCCAGGCCTTTGTCATCTTCAAGGAGGTCAGCAGCGCCACCAACGCCCTGCGCTCCATGCAGGGTTTCCCTTTCTATGACAAACCTATGCGTATCCAGTATGCCAAGACCGACAAACGTATCCCGGCCAAGATGAAAGGCACCTTCGTGATGCATGGATCCCTTCAACTGCCTCCACTTGAAAGACTGACACTGGAATTCGGCGGTAGTGGAGGAAGCGGAGGAAGCGGAGGAAGTGGTGGATCCGGAGGCTCCGGCTCGAGCATGGAAGCCAGCCCAGCATCCGGGCCCAGACACTTGATGGATCCACACATATTCACTTCCAACTTTAACAATGGCATTGGAAGGCATAAGACCTACCTGTGCTACGAAGTGGAGCGCCTGGACAATGGCACCTCGGTCAAGATGGACCAGCACAGGGGCTTTCTACACAACCAGGCTAAGAATCTTCTCTGTGGCTTTTACGGCCGCCATGCGGAGCTGCGCTTCTTGGACCTGGTTCCTTCTTTGCAGTTGGACCCGGCCCAGATCTACAGGGTCACTTGGTTCATCTCCTGGAGCCCCTGCTTCTCCTGGGGCTGTGCCGGGGAAGTGCGTGCGTTCCTTCAGGAGAACACACACGTGAGACTGCGTATCTTCGCTGCCCGCATCTATGATGACGACCCCCTATATAAGGAGGCACTGCAAATGCTGCGGGATGCTGGGGCCCAAGTCTCCATCATGACCTACGATGAATTTAAGCACTGCTGGGACACCTTTGTGGACCACCAGGGATGTCCCTTCCAGCCCTGGGATGGACTAGATGAGCACAGCCAAGCCCTGAGTGGGAGGCTGCGGGCCATTCTCCAGAATCAGGGAAAC(SEQ ID NO:7)
The TC1688: CMV CIRTS-Apobec3A plasmid was used as the vector; Apobec3A was excised with NotI-HF and XhoI, and the digested vector fragment (5085 bp) was recovered using the Monarch DNA Gel Extraction Kit. The 6937F and 7693 primer pair was diluted to 10 μM, and the ADAR fragment of RESCUE was amplified from the TC1738 template using Phanta Super-Fidelity DNA Polymerase.
After the PCR product was purified and recovered with the AxyPrep PCR Clean-up kit, the ADAR fragment and the digested vector fragment were recombined using HiFi DNA Assembly Master Mix. The recombination product was incubated at 50 °C for 15 min, then transformed and plated, and the correct C-RESCUE plasmid TC1798B: CMV CIRTS-ADAR was identified by Sanger sequencing.
The TC1798B: CMV CIRTS-ADAR plasmid was used as the vector; it was cut open downstream of ADAR with NotI-HF, and the digested vector fragment (6248 bp) was recovered using the Select-a-Size DNA MagBead Kit. The 8332F and 8332R primer pair was diluted to 10 μM, and the (GS)3 fragment was amplified using Phanta Super-Fidelity DNA Polymerase. The 8331F and 8331R primer pair was diluted to 10 μM, and the HRD fragment was amplified from the human epithelial cell genome using Phanta Super-Fidelity DNA Polymerase.
The PCR products were purified and recovered with the Select-a-Size DNA MagBead Kit, and the three fragments were recombined using HiFi DNA Assembly Master Mix. The recombination product was incubated at 50 °C for 15 min, then transformed and plated, and the correct HRD-containing plasmid TC1879: CMV CIRTS-ADAR-HRD was identified by Sanger sequencing.
The 8395F and 8395R primer pair was diluted to 10 μM, and the whole plasmid was amplified from TC1879 using Phanta Super-Fidelity DNA Polymerase.
The 8395F and 8394R primer pair was diluted to 10 μM and used to amplify the HRD fragment from TC1879 with Phanta Super-Fidelity DNA Polymerase; this product was then amplified with primers 8396F and 8396R (diluted to 10 μM) to give the HRD fragment carrying the GS linker. The two PCR products, linearized TC1879 and the HRD fragment carrying the GS linker, were purified and recovered with the AxyPrep PCR Clean-up kit and then recombined using HiFi DNA Assembly Master Mix. The recombination product was incubated at 50 °C for 15 min, then transformed and plated, and the correct plasmid TC1889: CMV CIRTS-ADAR-2×HRD was identified by Sanger sequencing.
The 8420F and 8420R primer pair and the 8421F and 8421R primer pair were each diluted to 10 μM, and both primer pairs, which carry BsaI restriction sites, were used to amplify HRD fragments from TC1879 using Phanta Super-Fidelity DNA Polymerase. The PCR products were purified and recovered with the AxyPrep PCR Clean-up kit and then ligated with T4 DNA Ligase at room temperature for 10 min. The ligation product was purified using the Select-a-Size DNA MagBead Kit. The TC1889: CMV CIRTS-ADAR-2×HRD plasmid was used as the vector; it was cut open with NotI-HF, and the digested vector fragment (6713 bp) was recovered using the Monarch DNA Gel Extraction Kit. The T4-ligated fragment and the vector were recombined using HiFi DNA Assembly Master Mix. The recombination product was incubated at 50 °C for 15 min, then transformed and plated, and Sanger sequencing identified the plasmid with 4×HRD added to the C-terminus of the editor, TC1902: CMV CIRTS-ADAR-4×HRD.
The TC1798B: CMV CIRTS-ADAR plasmid was used as the vector; it was digested with NheI-HF and EcoRV-HF, and the digested vector fragment (5915 bp) was recovered using the Select-a-Size DNA MagBead Kit. The 8559F-LJJ10 and 8559R-LJJ10 primer pair was diluted to 10 μM, and the bpNLS fragment (SEQ ID NO: 37) was amplified with Phanta Super-Fidelity DNA Polymerase using TC1614 as the template. The 8560F-LJJ11 and 8560R-LJJ11 primer pair was diluted to 10 μM, and the β-defensin fragment (SEQ ID NO: 34) was amplified with Phanta Super-Fidelity DNA Polymerase using TC1798B as the template. The two PCR products were purified and recovered with the AxyPrep PCR Clean-up kit and then recombined with the vector fragment using HiFi DNA Assembly Master Mix. The recombination product was incubated at 50 °C for 15 min, then transformed and plated, and the correct plasmid TC1940: CMV bpNLS-CIRTS-ADAR was identified by Sanger sequencing.
The TC1798B: CMV CIRTS-ADAR plasmid was used as the vector; it was cut open downstream of ADAR with NotI-HF, and the digested vector fragment (6248 bp) was recovered using the Select-a-Size DNA MagBead Kit. The 8561F-LJJ12 and 8561R-LJJ12 primer pair was diluted to 10 μM, and the bpNLS-3×FLAG fragment (SEQ ID NO: 39: KRTADGSEFESPKKKRKVGSDYKDHDGDYKDHDIDYKDDDDK) was amplified with Phanta Super-Fidelity DNA Polymerase using TC1811 as the template. The PCR product was purified and recovered with the AxyPrep PCR Clean-up kit and then recombined with the vector fragment using HiFi DNA Assembly Master Mix. The recombination product was incubated at 50 °C for 15 min, then transformed and plated, and the correct plasmid TC1941: CMV CIRTS-ADAR-bpNLS-3×FLAG was identified by Sanger sequencing.
The TC1798B: CMV CIRTS-ADAR plasmid was used as the vector; it was cut open downstream of ADAR with NotI-HF, and the digested vector fragment (6248 bp) was recovered using the Select-a-Size DNA MagBead Kit. The 8562-LJJ13 and 8561R-LJJ12 primer pair was diluted to 10 μM, and the 2×bpNLS-3×FLAG fragment was amplified with Phanta Super-Fidelity DNA Polymerase using TC1811 as the template. The PCR product was purified and recovered with the AxyPrep PCR Clean-up kit and then recombined with the vector fragment using HiFi DNA Assembly Master Mix. The recombination product was incubated at 50 °C for 15 min, then transformed and plated, and the correct plasmid TC1942: CMV CIRTS-ADAR-2×bpNLS-3×FLAG was identified by Sanger sequencing.
The TC1798B: CMV CIRTS-ADAR plasmid was used as the vector; it was cut open downstream of ADAR with NotI-HF, and the digested vector fragment (6248 bp) was recovered using the Select-a-Size DNA MagBead Kit. The three primers 8408F, 8408Fb and 8408R were diluted to 10 μM, and the IDR fragment of FUS (SEQ ID NO: 40: MASNDYTQQATQSYGAYPTQPGQGYSQQSSQPYGQQSYSGYSQSTDTSGYGQSSYSSYGQSQNSYGTQSTPQGYGSTGGYGSSQSSQSSYGQQSSYPGYGQQPAPSSTSGSYGSSSQSSSYGQPQSGSYSQQPSYGGQQQSYGQQQSYNPPQGYGQQNQYNSSSGGGGGGGGGGNYGQDQSSMSSGGGSGGGYGNQDQSGGGGSGGYGQQDRGG) was amplified with Phanta Super-Fidelity DNA Polymerase using cDNA reverse-transcribed from the 293T cell transcriptome as the template. The PCR product was purified and recovered with the AxyPrep PCR Clean-up kit and then recombined with the vector fragment using HiFi DNA Assembly Master Mix. The recombination product was incubated at 50 °C for 15 min, then transformed and plated, and the correct plasmid TC1895: CMV CIRTS-ADAR-IDR (FUS) was identified by Sanger sequencing.
The TC1798B: CMV CIRTS-ADAR plasmid was used as the vector; it was cut open downstream of ADAR with NotI-HF, and the digested vector fragment (6248 bp) was recovered using the Select-a-Size DNA MagBead Kit. The 8409F and 8409R primer pair was diluted to 10 μM, and the IDR fragment of HNRNPA1 (SEQ ID NO: 41: MAASSQRGRSGSGNFGGGRGGGFGGGGNDNGGGGGGGGGGGGYGGGGGYNGGGNGGGNGGYNGGGNGGGNGGYNGGVGGVGGVGGGPQNGYNGYNGYNNQSNPMKGPMKGGGGNGGGGGGRSSGPGGGQYFAKPQGGYGSSSSSSGSGSGRRF) was amplified. The PCR product was purified and recovered with the AxyPrep PCR Clean-up kit and then recombined with the vector fragment using HiFi DNA Assembly Master Mix. The recombination product was incubated at 50 °C for 15 min, then transformed and plated, and the correct plasmid TC1896: CMV CIRTS-ADAR-IDR (hnRNPA1) was identified by Sanger sequencing.
The TC1889: CMV CIRTS-ADAR-2×HRD plasmid was used as the vector; it was digested with NotI-HF and BsmBI, and the digested vector fragment (6449 bp) was recovered using the Monarch DNA Gel Extraction Kit. The 8934F and 8934R primer pair was diluted to 10 μM, and the HRD fragment was amplified from TC1889 using Phanta Super-Fidelity DNA Polymerase. The PCR product was purified and recovered with the AxyPrep PCR Clean-up kit and then recombined with the vector fragment using HiFi DNA Assembly Master Mix. The recombination product was incubated at 50 °C for 15 min, then transformed and plated, and Sanger sequencing identified the plasmid encoding the editor fused to the red fluorescent protein, TC2055: CMV CIRTS-ADAR-2×HRD-mCherry.
C-RESCUEv1 and the Ctnnb1 reporter gene (shown in FIG. 5) were transiently co-expressed with an S33-targeting sgRNA in mouse neuroblastoma N2a cells; the sgRNAs carry 1-2 copies of the TAR hairpin (hairpin transcription template sequence SEQ ID NO: 101: GGCCAGATCTGGCCTGGGGAGCTCTCTGGCC) and a 30-nt spacer, and differ in the mismatched base (C or U) and the mismatch distance (5-26 nt). 40-48 hours after transfection, the cells were lysed, the mRNA of the reporter gene or endogenous gene was reverse-transcribed into cDNA and PCR-amplified, and editing was detected by NGS (FIG. 6). The results show that 2 copies of the TAR hairpin perform better overall than 1 copy (FIG. 7), consistent with a previous study (Rauch et al., 2020). The mismatch distance also affects editing, with distances of 15-26 nt being more effective than 5-10 nt. The backbone of the sgRNA, which does not target any mRNA sequence, is expressed from the U6 promoter and contains two copies of the TAR hairpin:
GAGGGCCTATTTCCCATGATTCCTTCATATTTGCATATACGATACAAGGCTGTTAGAGAGATAATTGGAATTAATTTGACTGTAAACACAAAGATATTAGTACAAAATACGTGACGTAGAAAGTAATAATTTCTTGGGTAGTTTGCAGTTTTAAAATTATGTTTTAAAATGGACTATCATATGCTTACCGTAACTTGAAAGTATTTCGATTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACCG GGCCAGATCTGAGCCTGGGAGCTCTCTGGCCTTATTNNNNNNNNNNNNNNNNNNNNTTATTGGCCAGATCTGAGCCTGGGAGCTCTCTGGCCTTTTTTT(SEQ ID NO:5)
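The spacer listings later in this example are labeled in the form "length/distance" (for example 30/22), with the mismatched base written in upper case. The following Python sketch shows one reading of that notation that is consistent with the listed spacers: the distance counts bases from the mismatched base to the 3' end of the spacer, inclusive. This interpretation is inferred from the sequences and is an assumption; the function name is hypothetical.

```python
def mismatch_label(spacer):
    """Return the 'length/distance' label for a spacer with one upper-case base.

    The distance is counted from the upper-case (mismatched) base to the
    3' end of the spacer, inclusive. This convention is an assumption
    inferred from the listed spacer sequences.
    """
    flagged = [i for i, ch in enumerate(spacer) if ch.isupper()]
    if len(flagged) != 1:
        raise ValueError("expected exactly one upper-case (mismatched) base")
    length = len(spacer)
    distance = length - flagged[0]
    return f"{length}/{distance}"


if __name__ == "__main__":
    # Spacers copied from the listing in this example.
    for spacer in (
        "ggattccaCagtccaggtaagactgttgct",
        "ggtggtggcaccagaatggattccaCaatc",
        "atggattccaCaatccaagt",
    ):
        print(mismatch_label(spacer), spacer)
```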
The sequences of the U6-TAR-BbsI-BbsI (SEQ ID NO: 42) and U6-TAR-BbsI-BbsI-TAR crRNA (SEQ ID NO: 43) plasmids are shown below, respectively. For each target site sequence, an upstream primer and a complementary downstream primer carrying the specific adapter bases were designed and dissolved in sterilized water to 100 μM. After annealing, they were ligated into the U6-TAR-BbsI-BbsI or U6-TAR-BbsI-BbsI-TAR vector to construct the target-specific sgRNA.
GAGGGCCTATTTCCCATGATTCCTTCATATTTGCATATACGATACAAGGCTGTTAGAGAGATAATTGGAATTAATTTGACTGTAAACACAAAGATATTAGTACAAAATACGTGACGTAGAAAGTAATAATTTCTTGGGTAGTTTGCAGTTTTAAAATTATGTTTTAAAATGGACTATCATATGCTTACCGTAACTTGAAAGTATTTCGATTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACCG GGCCAGATCTGAGCCTGGGAGCTCTCTGGCCTTATTGGCCAGATCTGAGCCTGGGAGCTCTCTGGCCTTTTTTT(SEQ ID NO:42)
GAGGGCCTATTTCCCATGATTCCTTCATATTTGCATATACGATACAAGGCTGTTAGAGAGATAATTGGAATTAATTTGACTGTAAACACAAAGATATTAGTACAAAATACGTGACGTAGAAAGTAATAATTTCTTGGGTAGTTTGCAGTTTTAAAATTATGTTTTAAAATGGACTATCATATGCTTACCGTAACTTGAAAGTATTTCGATTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACCG GGCCAGATCTGAGCCTGGGAGCTCTCTGGCCTTATTGTCTTCGATATCGAAGACTTTTATTGGCCAGATCTGAGCCTGGGAGCTCTCTGGCCTTTTTTT(SEQ ID NO:43)
Structure of the primer: adapter-sgRNA; the upstream primer adapter is cttatt, the 1×TAR downstream primer adapter is aaaa, and the 2×TAR downstream primer adapter is ataa.
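The primer structure just described can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used: the function names are hypothetical, the adapters are assumed to sit at the 5' end of each oligo, the downstream oligo is assumed to be the reverse complement of the sgRNA sequence, and a U in a U-flip sgRNA is assumed to be ordered as T in the DNA oligo.

```python
# Adapters as given in the text; everything else here is an assumption.
UPSTREAM_ADAPTER = "cttatt"
DOWNSTREAM_ADAPTER = {1: "aaaa", 2: "ataa"}  # keyed by number of TAR copies

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of an upper-case A/C/G/T string."""
    return dna.translate(_COMPLEMENT)[::-1]


def design_oligos(sgrna: str, tar_copies: int = 2) -> tuple:
    """Return (upstream, downstream) oligos, both written 5'->3' in lower case."""
    if tar_copies not in DOWNSTREAM_ADAPTER:
        raise ValueError("tar_copies must be 1 or 2")
    dna = sgrna.upper().replace("U", "T")  # assumption: order U-flip bases as T
    if not dna or set(dna) - set("ACGT"):
        raise ValueError("sgRNA sequence must contain only A/C/G/T/U")
    upstream = UPSTREAM_ADAPTER + dna.lower()
    downstream = DOWNSTREAM_ADAPTER[tar_copies] + reverse_complement(dna).lower()
    return upstream, downstream


if __name__ == "__main__":
    # hCTNNB1 S33F C-flip 30/22 (SEQ ID NO:44) cloned into the 2xTAR vector.
    up, down = design_oligos("ggattccaCagtccaggtaagactgttgct", tar_copies=2)
    print("upstream:  ", up)
    print("downstream:", down)
```

The capitalized mismatch base is lower-cased in the output because capitalization is only a visual marker in the listed sequences.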
The sgRNA sequences (as carried on the upstream primer; the downstream primer is complementary to it) are as follows:
hCTNNB1 S33F C-flip
30/22 ggattccaCagtccaggtaagactgttgct(SEQ ID NO:44)
hCTNNB1 P44S C-flip
30/22 cagagaagCagctgtggtagtggcaccaga(SEQ ID NO:45)
hCTNNB1 T41I C-flip
30/22 gagctgtgCtagtggcaccagaatggattc(SEQ ID NO:46)
hPPIB I18I C-flip
30/22 gaccccgcCatgagggcggcggcaaggagc(SEQ ID NO:47)
hPPIB R7C C-flip
30/22 catgttgcCttcggagaggcgcagcatcca(SEQ ID NO:48)
hSMARCA4 S85L C-flip
30/22 ggtcgtccCacatgcccttctcatgcatgg(SEQ ID NO:49)
hSMARCA4 D86D C-flip
30/22 cgcgggtcCtccgacatgcccttctcatgc(SEQ ID NO:50)
hNRAS I21I C-flip
30/22 attagctgCattgtcagtgcgcttttccca(SEQ ID NO:51)
hNFKB1 P33S C-flip
30/22 catctgtgCttgaaatacttctggattaaa(SEQ ID NO:52)
mCTNNB1 S33F C-flip
20/10 atggattccaCaatccaagt(SEQ ID NO:53)
20/15 ttccaCaatccaagtaagac(SEQ ID NO:54)
30/5 ggtggtggcaccagaatggattccaCaatc(SEQ ID NO:55)
30/10 tggcaccagaatggattccaCaatccaagt(SEQ ID NO:56)
30/15 ccagaatggattccaCaatccaagtaagac(SEQ ID NO:57)
30/20 atggattccaCaatccaagtaagactgctg(SEQ ID NO:58)
30/22 ggattccaCaatccaagtaagactgctgct(SEQ ID NO:59)
30/24 attccaCaatccaagtaagactgctgctgc(SEQ ID NO:60)
30/25 ttccaCaatccaagtaagactgctgctgcc(SEQ ID NO:61)
30/26 tccaCaatccaagtaagactgctgctgcca(SEQ ID NO:62)
40/20 tggcaccagaatggattccaCaatccaagtaagactgctg(SEQ ID NO:63)
50/25 ggtggtggcaccagaatggattccaCaatccaagtaagactgctgctgcc(SEQ ID NO:64)
50/30 tggcaccagaatggattccaCaatccaagtaagactgctgctgccagtgg(SEQ ID NO:65)
mCTNNB1 T41A C-flip
20/10 ggagctgtggCggtggcacc(SEQ ID NO:66)
20/15 tgtggCggtggcaccagaat(SEQ ID NO:67)
30/15 gggaaggagctgtggCggtggcaccagaat(SEQ ID NO:68)
30/20 ggagctgtggCggtggcaccagaatggatt(SEQ ID NO:69)
30/22 agctgtggCggtggcaccagaatggattcc(SEQ ID NO:70)
30/24 ctgtggCggtggcaccagaatggattccag(SEQ ID NO:71)
30/26 gtggCggtggcaccagaatggattccagaa(SEQ ID NO:72)
40/20 actcagggaaggagctgtggCggtggcaccagaatggatt(SEQ ID NO:73)
50/25 ttgccactcagggaaggagctgtggCggtggcaccagaatggattccaga(SEQ ID NO:74)
50/30 actcagggaaggagctgtggCggtggcaccagaatggattccagaatcca(SEQ ID NO:75)
mCTNNB1 S33F U-flip
30/5 ggtggtggcaccagaatggattccaUaatc(SEQ ID NO:76)
30/10 tggcaccagaatggattccaUaatccaagt(SEQ ID NO:77)
30/15 ccagaatggattccaUaatccaagtaagac(SEQ ID NO:78)
30/20 atggattccaUaatccaagtaagactgctg(SEQ ID NO:79)
30/22 ggattccaUaatccaagtaagactgctgct(SEQ ID NO:80)
30/24 attccaUaatccaagtaagactgctgctgc(SEQ ID NO:81)
30/25 ttccaUaatccaagtaagactgctgctgcc(SEQ ID NO:82)
30/26 tccaUaatccaagtaagactgctgctgcca(SEQ ID NO:83)
mCTNNB1 S37F U-flip
30/15 gtggtggtggcaccaUaatggattccagaa(SEQ ID NO:84)
30/20 ggtggcaccaUaatggattccagaatccaa(SEQ ID NO:85)
30/22 tggcaccaUaatggattccagaatccaagt(SEQ ID NO:86)
30/24 gcaccaUaatggattccagaatccaagtaa(SEQ ID NO:87)
mCTNNB1 T41I U-flip
30/15 agggaaggagctgtgUtggtggcaccagaa(SEQ ID NO:88)
30/20 aggagctgtgUtggtggcaccagaatggat(SEQ ID NO:89)
30/22 gagctgtgUtggtggcaccagaatggattc(SEQ ID NO:90)
30/24 gctgtgUtggtggcaccagaatggattcca(SEQ ID NO:91)
mCTNNB1 S45F U-flip
30/15 cccttgccactcaggUaaggagctgtggtg(SEQ ID NO:92)
30/20 gccactcaggUaaggagctgtggtggtggc(SEQ ID NO:93)
30/22 cactcaggUaaggagctgtggtggtggcac(SEQ ID NO:94)
30/24 ctcaggUaaggagctgtggtggtggcacca(SEQ ID NO:95)
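The capitalized sequences listed above are consistent with reading each "X/Y" label as spacer length X and mismatch distance Y, where Y counts the nucleotides from the mismatched (capitalized) base to the 3' end of the spacer, including that base. This reading is inferred from the listed sequences and is not stated explicitly in the text. Under that assumption, a short Python sketch (with the hypothetical helper name `flip_position`) locates the mismatched base from the label:

```python
def flip_position(spacer: str, distance: int) -> tuple:
    """Return (1-based position, base) of the mismatched base.

    Assumes 'distance' is the number of nucleotides from the mismatched base
    to the 3' end of the spacer, counting the mismatched base itself.
    """
    index = len(spacer) - distance  # 0-based index from the 5' end
    if not 0 <= index < len(spacer):
        raise ValueError("distance must be between 1 and the spacer length")
    return index + 1, spacer[index]


if __name__ == "__main__":
    examples = {
        "hCTNNB1 S33F C-flip 30/22": ("ggattccaCagtccaggtaagactgttgct", 22),
        "mCTNNB1 S45F U-flip 30/15": ("cccttgccactcaggUaaggagctgtggtg", 15),
    }
    for label, (spacer, distance) in examples.items():
        position, base = flip_position(spacer, distance)
        print(f"{label}: base {base} at position {position} of {len(spacer)}")
```

The same check can be used to confirm that the capitalized base in a newly designed spacer sits at the intended distance.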
The U6-TAR-BbsI-BbsI or U6-TAR-BbsI-BbsI-TAR plasmid was digested with BbsI-HF and recovered with the Monarch DNA Gel Extraction Kit to obtain the linearized vector. 10 ng of linearized vector was ligated with 0.3 μL of annealed product using T4 DNA Ligase, incubated at room temperature for 10 min, then transformed and plated, and the correct target-specific sgRNA was identified by Sanger sequencing.
Example 2: Comparison of the base editing efficiency of the C-RESCUE system on endogenous genes
HEK293T cells were transfected using the C-RESCUE system described above, as follows:
1) HEK293T cells (from ATCC, CRL-11268) were thawed and cultured in 10 cm dishes in DMEM supplemented with 10% (v/v) fetal bovine serum, at 37 °C and 5% carbon dioxide. After several passages, cells were seeded into 48-well plates at a cell density of 80%.
2) When the cells reached 70-80% confluence, the medium was replaced with DMEM containing 10% (v/v) serum for 2 hours to bring the cells to an optimal state. Each well was transfected with 200 ng of C-RESCUE plasmid and 300 ng of sgRNA plasmid. The plasmids were mixed in 25 μL of Opti-MEM medium together with 0.75 μL of Lipofectamine P3000 reagent.
3) … μL of Lipofectamine 3000 was mixed with 25 μL of Opti-MEM medium and allowed to stand for 5 minutes.
4) The Opti-MEM containing the plasmids and P3000 was added to the Opti-MEM containing Lipofectamine 3000, mixed gently, and allowed to stand for 15 minutes.
5) The Opti-MEM containing the plasmids and transfection reagent was added to the individual wells of the 48-well plate.
6) 12 hours after transfection, the medium was replaced with DMEM containing 10% (v/v) FBS.
7) Cells were harvested 48 hours after transfection, residual DMEM was washed off with PBS, and the cells were lysed with the prepared lysis buffer to release mRNA (Joung et al., 2017). … μL of the lysate was used for reverse transcription with the Vazyme HiScript II Q RT SuperMix kit in a 10 μL reaction.
8) The target site fragment was PCR-amplified with a Vazyme high-fidelity kit, and the PCR product was gel-purified with the AxyPrep DNA gel recovery kit to remove non-specific bands, then submitted for next-generation sequencing (NGS) analysis.
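The per-well amounts given in step 2 (200 ng of C-RESCUE plasmid, 300 ng of sgRNA plasmid, 25 μL of Opti-MEM, 0.75 μL of P3000) can be scaled to a DNA master mix for several wells. The Python sketch below is only an arithmetic aid; the function name `dna_mix_for_wells` and the `overage` parameter (extra volume to cover pipetting loss) are assumptions that do not appear in the protocol, and the Lipofectamine 3000 tube of step 3 is left out because its volume is not given.

```python
# Per-well amounts taken from step 2 of the protocol.
PER_WELL = {
    "editor_plasmid_ng": 200.0,
    "sgrna_plasmid_ng": 300.0,
    "opti_mem_ul": 25.0,
    "p3000_ul": 0.75,
}


def dna_mix_for_wells(n_wells: int, overage: float = 0.1) -> dict:
    """Scale the per-well DNA/P3000 mix to n_wells.

    'overage' is a hypothetical safety margin (0.1 means 10% extra); it is an
    assumption of this sketch, not part of the described protocol.
    """
    if n_wells < 1:
        raise ValueError("n_wells must be at least 1")
    if overage < 0:
        raise ValueError("overage must not be negative")
    factor = n_wells * (1.0 + overage)
    return {name: round(amount * factor, 2) for name, amount in PER_WELL.items()}


if __name__ == "__main__":
    for name, amount in dna_mix_for_wells(n_wells=6).items():
        print(f"{name}: {amount}")
```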
Transcriptome-wide random off-target editing:
Transcriptome-wide random off-target editing is important for accurate RNA base editing, because it bears on whether the final base editor can be applied in vivo. The base editors used for the transcriptome-wide off-target experiments were RESCUE-S, C-RESCUEv1, C-RESCUEv1-1×HRD, and C-RESCUEv2 (FIG. 3, editors 1-4). Each editor was co-transfected with the corresponding sgRNA, cells that had not taken up the plasmids were removed by sorting, and the editing efficiency at the endogenous gene was analyzed by NGS as the editing efficiency of each editor. As shown in FIGS. 10, 11, and 12, the main disadvantage of C-RESCUEv2 compared to the previous RESCUE-S is that off-target editing is increased 5.8-fold (FIG. 13), which may be due to CIRTS. Of the two RNA-binding modules in CIRTS, TBP specifically binds to the TAR hairpin and therefore seems unlikely to increase off-target editing. In contrast, β-defensin is a non-specific single-stranded RNA-binding protein. The purpose of β-defensin is to protect the gRNA and enhance the binding of CIRTS to the target RNA, and β-defensin is thought to also contribute to the increase in off-target editing. Future studies will need to fine-tune β-defensin to minimize off-target editing without affecting the editing efficiency at the target site.
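For orientation, a per-site editing efficiency from NGS data is commonly expressed as the fraction of reads that show the edited base at the target position. The Python sketch below assumes C-to-U editing is read as C-to-T in the sequenced cDNA and that base counts at the target position are already available; the function name and the example counts are hypothetical and are not data from this study, and the analysis pipeline actually used is not described in the text.

```python
def c_to_u_rate(base_counts: dict) -> float:
    """Fraction of reads carrying T among reads carrying C or T at one site.

    'base_counts' maps a base (e.g. "C", "T") to its read count at the target
    position of the cDNA amplicon. Reads with other bases are ignored here,
    which is a simplifying assumption of this sketch.
    """
    unedited = base_counts.get("C", 0)
    edited = base_counts.get("T", 0)
    total = unedited + edited
    if total == 0:
        raise ValueError("no C or T reads at this position")
    return edited / total


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    counts = {"A": 3, "C": 750, "G": 2, "T": 250}
    print(f"editing rate: {c_to_u_rate(counts):.1%}")
```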
Summary:
C-RESCUEv2 is the first small, human-derived RNA base editor and is also relatively efficient compared to previous RNA base editors. Xu et al. reported successfully reducing the size of RESCUE-S by replacing dRanCas13b (1094 aa) with Cas13e.1, a novel Cas13 of only 775 aa (Xu et al., 2020) (FIG. 3, editor 6). However, the results for Cas13e.1 at CTNNB1 show that its editing efficiency is very low (FIG. 14), far inferior to that of C-RESCUE. Furthermore, Cas13e.1 is still much larger than CIRTS (775 vs 200 aa) (FIG. 3) and is of bacterial origin. Thus, C-RESCUEv2 has unique advantages over the Cas13e.1-based editor. Other smaller editors, such as MCP-ADAR, also edit the CTNNB1 sites with relatively low efficiency; in particular, the editing efficiency of MCP-ADAR2, which has low off-target activity, is very low (as shown in FIG. 15).
Since C-RESCUEv2 has no nuclear localization signal, this study also examined editing by C-RESCUEv2 after nuclear import and found that the editing efficiency of the editor decreased after nuclear import (as shown in FIG. 17). The reason for the decrease is not known; it may be caused by a decrease in expression level. Further investigation is needed.
The main disadvantage of C-RESCUEv2 compared to the previous RESCUE-S is the 5.8-fold increase in off-target editing (FIG. 13). This may be due to CIRTS. Of the two RNA-binding modules in CIRTS, TBP specifically binds to the TAR hairpin and therefore seems unlikely to increase off-target editing. In contrast, β-defensin is a non-specific single-stranded RNA-binding protein. The purpose of β-defensin is to protect the sgRNA and enhance the binding of CIRTS to the target RNA, and β-defensin is thought to also contribute to the increase in off-target editing. Future research may need to fine-tune β-defensin to reduce off-target editing as much as possible without affecting the editing efficiency at the target site. It is also possible that the expression levels of the different editors differ; this would need to be verified by western blot.
An important finding of this study is that HRD can enhance the editing efficiency of C-RESCUE, whereas IDR cannot (FIGS. 8, 9, 18, 19). As shown in FIG. 16, HRD may stabilize the protein conformation of C-RESCUE, promote its interaction with other proteins, and/or act through unknown mechanisms. It is currently unclear whether the effect of HRD on C-RESCUE applies to other editors, and further research is needed.
Wherein: the HRD sequence of hCYCT1 is: IKMRIKVHAAADKHNSVEDSVTKSREHKEKHKTHPSNHHHHHNHHSHKHSHSQLPVGTGNKRPGDPKHSSQ (SEQ ID NO: 102);
the HRD sequence of hDYRK1A is: QNALHHHHGNSSHHHHHHHHHHHHHGQQALG (SEQ ID NO: 103).
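The one-letter hCYCT1 HRD sequence given above can be cross-checked against the three-letter entry labeled HRD in the sequence listing (SEQ ID NO:4). The following Python sketch converts three-letter residue codes to one-letter codes and compares the two; the helper name `to_one_letter` is hypothetical, and the residues are copied from the listing.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter: str) -> str:
    """Convert whitespace-separated three-letter residues to a one-letter string."""
    return "".join(THREE_TO_ONE[residue] for residue in three_letter.split())


# SEQ ID NO:4 (HRD) as written in the sequence listing, without position numbers.
SEQ_ID_4 = """
Ile Lys Met Arg Ile Lys Val His Ala Ala Ala Asp Lys His Asn Ser
Val Glu Asp Ser Val Thr Lys Ser Arg Glu His Lys Glu Lys His Lys
Thr His Pro Ser Asn His His His His His Asn His His Ser His Lys
His Ser His Ser Gln Leu Pro Val Gly Thr Gly Asn Lys Arg Pro Gly
Asp Pro Lys His Ser Ser Gln
"""

# hCYCT1 HRD as given in one-letter code in the description.
HCYCT1_HRD = (
    "IKMRIKVHAAADKHNSVEDSVTKSREHKEKHK"
    "THPSNHHHHHNHHSHKHSHSQLPVGTGNKRPG"
    "DPKHSSQ"
)

if __name__ == "__main__":
    converted = to_one_letter(SEQ_ID_4)
    print(len(converted), "residues; identical:", converted == HCYCT1_HRD)
```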
The above embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Those skilled in the art may modify or vary the above embodiments without departing from the spirit and scope of the invention. Accordingly, all equivalent modifications and variations made by those of ordinary skill in the art without departing from the spirit and technical ideas disclosed herein shall be covered by the claims of the present invention.
SEQUENCE LISTING
<110> ShanghaiTech University
<120> fusion protein for editing RNA and use thereof
<130> P21018859C
<160> 103
<170> PatentIn version 3.5
<210> 1
<211> 589
<212> PRT
<213> Artificial Sequence
<220>
<223> C-RESCUEv1
<400> 1
Met Gly Ile Ile Asn Thr Leu Gln Lys Tyr Tyr Cys Arg Val Arg Gly
1 5 10 15
Gly Arg Cys Ala Val Leu Ser Cys Leu Pro Lys Glu Glu Gln Ile Gly
20 25 30
Lys Cys Ser Thr Arg Gly Arg Lys Cys Cys Arg Arg Lys Lys Gly Gly
35 40 45
Ser Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly Gly Ser
50 55 60
Met Ala Val Pro Glu Thr Arg Pro Asn His Thr Ile Tyr Ile Asn Asn
65 70 75 80
Leu Asn Ser Lys Ile Lys Lys Asp Glu Leu Lys Lys Ser Leu Tyr Ala
85 90 95
Ile Phe Ser Gln Phe Gly Gln Ile Leu Asp Ile Leu Val Pro Arg Gln
100 105 110
Arg Thr Pro Arg Gly Gln Ala Phe Val Ile Phe Lys Glu Val Ser Ser
115 120 125
Ala Thr Asn Ala Leu Arg Ser Met Gln Gly Phe Pro Phe Tyr Asp Lys
130 135 140
Pro Met Arg Ile Gln Tyr Ala Lys Thr Asp Lys Arg Ile Pro Ala Lys
145 150 155 160
Met Lys Gly Thr Phe Val Met His Gly Ser Leu Gln Leu Pro Pro Leu
165 170 175
Glu Arg Leu Thr Leu Glu Phe Gly Gly Ser Gly Gly Ser Gly Gly Ser
180 185 190
Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly Ser Ser Gln Leu His Leu
195 200 205
Pro Gln Val Leu Ala Asp Ala Val Ser Arg Leu Val Ile Gly Lys Phe
210 215 220
Gly Asp Leu Thr Asp Asn Phe Ser Ser Pro His Ala Arg Arg Ile Gly
225 230 235 240
Leu Ala Gly Val Val Met Thr Thr Gly Thr Asp Val Lys Asp Ala Lys
245 250 255
Val Ile Cys Val Ser Thr Gly Ala Lys Cys Ile Asn Gly Glu Tyr Leu
260 265 270
Ser Asp Arg Gly Leu Ala Leu Asn Asp Cys His Ala Glu Ile Val Ser
275 280 285
Arg Arg Ser Leu Leu Arg Phe Leu Tyr Thr Gln Leu Glu Leu Tyr Leu
290 295 300
Asn Asn Glu Asp Asp Gln Lys Arg Ser Ile Phe Gln Lys Ser Glu Arg
305 310 315 320
Gly Gly Phe Arg Leu Lys Glu Asn Ile Gln Phe His Leu Tyr Ile Ser
325 330 335
Thr Ser Pro Cys Gly Asp Ala Arg Ile Phe Ser Pro His Glu Ala Ile
340 345 350
Leu Glu Glu Pro Ala Asp Arg His Pro Asn Arg Lys Ala Arg Gly Gln
355 360 365
Leu Arg Thr Lys Ile Glu Ala Gly Gln Gly Thr Ile Pro Val Arg Asn
370 375 380
Asn Ala Ser Ile Gln Thr Trp Asp Gly Val Leu Gln Gly Glu Arg Leu
385 390 395 400
Leu Thr Met Ser Cys Ser Asp Lys Ile Ala Arg Trp Asn Val Val Gly
405 410 415
Ile Gln Gly Ser Leu Leu Ser Ile Phe Val Glu Pro Ile Tyr Phe Ser
420 425 430
Ser Ile Ile Leu Gly Ser Leu Tyr His Gly Asp His Leu Ser Arg Ala
435 440 445
Met Tyr Gln Arg Ile Ser Asn Ile Glu Asp Leu Pro Pro Leu Tyr Thr
450 455 460
Leu Asn Lys Pro Leu Leu Thr Gly Ile Ser Asn Ala Glu Ala Arg Gln
465 470 475 480
Pro Gly Lys Ala Pro Ile Phe Ser Val Asn Trp Thr Val Gly Asp Ser
485 490 495
Ala Ile Glu Val Ile Asn Ala Thr Thr Gly Lys Gly Glu Leu Gly Arg
500 505 510
Ala Ser Arg Leu Cys Lys His Ala Leu Tyr Cys Arg Trp Met Arg Val
515 520 525
His Gly Lys Val Pro Ser His Leu Leu Arg Ser Lys Ile Thr Lys Pro
530 535 540
Asn Val Tyr His Glu Thr Lys Leu Ala Ala Lys Glu Tyr Gln Ala Ala
545 550 555 560
Lys Ala Arg Leu Phe Thr Ala Phe Ile Lys Ala Gly Leu Gly Ala Trp
565 570 575
Val Glu Lys Pro Thr Glu Gln Asp Gln Phe Ser Leu Thr
580 585
<210> 2
<211> 746
<212> PRT
<213> Artificial Sequence
<220>
<223> C-RESCUEv2
<400> 2
Met Gly Ile Ile Asn Thr Leu Gln Lys Tyr Tyr Cys Arg Val Arg Gly
1 5 10 15
Gly Arg Cys Ala Val Leu Ser Cys Leu Pro Lys Glu Glu Gln Ile Gly
20 25 30
Lys Cys Ser Thr Arg Gly Arg Lys Cys Cys Arg Arg Lys Lys Gly Gly
35 40 45
Ser Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly Gly Ser
50 55 60
Met Ala Val Pro Glu Thr Arg Pro Asn His Thr Ile Tyr Ile Asn Asn
65 70 75 80
Leu Asn Ser Lys Ile Lys Lys Asp Glu Leu Lys Lys Ser Leu Tyr Ala
85 90 95
Ile Phe Ser Gln Phe Gly Gln Ile Leu Asp Ile Leu Val Pro Arg Gln
100 105 110
Arg Thr Pro Arg Gly Gln Ala Phe Val Ile Phe Lys Glu Val Ser Ser
115 120 125
Ala Thr Asn Ala Leu Arg Ser Met Gln Gly Phe Pro Phe Tyr Asp Lys
130 135 140
Pro Met Arg Ile Gln Tyr Ala Lys Thr Asp Lys Arg Ile Pro Ala Lys
145 150 155 160
Met Lys Gly Thr Phe Val Met His Gly Ser Leu Gln Leu Pro Pro Leu
165 170 175
Glu Arg Leu Thr Leu Glu Phe Gly Gly Ser Gly Gly Ser Gly Gly Ser
180 185 190
Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly Ser Ser Gln Leu His Leu
195 200 205
Pro Gln Val Leu Ala Asp Ala Val Ser Arg Leu Val Ile Gly Lys Phe
210 215 220
Gly Asp Leu Thr Asp Asn Phe Ser Ser Pro His Ala Arg Arg Ile Gly
225 230 235 240
Leu Ala Gly Val Val Met Thr Thr Gly Thr Asp Val Lys Asp Ala Lys
245 250 255
Val Ile Cys Val Ser Thr Gly Ala Lys Cys Ile Asn Gly Glu Tyr Leu
260 265 270
Ser Asp Arg Gly Leu Ala Leu Asn Asp Cys His Ala Glu Ile Val Ser
275 280 285
Arg Arg Ser Leu Leu Arg Phe Leu Tyr Thr Gln Leu Glu Leu Tyr Leu
290 295 300
Asn Asn Glu Asp Asp Gln Lys Arg Ser Ile Phe Gln Lys Ser Glu Arg
305 310 315 320
Gly Gly Phe Arg Leu Lys Glu Asn Ile Gln Phe His Leu Tyr Ile Ser
325 330 335
Thr Ser Pro Cys Gly Asp Ala Arg Ile Phe Ser Pro His Glu Ala Ile
340 345 350
Leu Glu Glu Pro Ala Asp Arg His Pro Asn Arg Lys Ala Arg Gly Gln
355 360 365
Leu Arg Thr Lys Ile Glu Ala Gly Gln Gly Thr Ile Pro Val Arg Asn
370 375 380
Asn Ala Ser Ile Gln Thr Trp Asp Gly Val Leu Gln Gly Glu Arg Leu
385 390 395 400
Leu Thr Met Ser Cys Ser Asp Lys Ile Ala Arg Trp Asn Val Val Gly
405 410 415
Ile Gln Gly Ser Leu Leu Ser Ile Phe Val Glu Pro Ile Tyr Phe Ser
420 425 430
Ser Ile Ile Leu Gly Ser Leu Tyr His Gly Asp His Leu Ser Arg Ala
435 440 445
Met Tyr Gln Arg Ile Ser Asn Ile Glu Asp Leu Pro Pro Leu Tyr Thr
450 455 460
Leu Asn Lys Pro Leu Leu Thr Gly Ile Ser Asn Ala Glu Ala Arg Gln
465 470 475 480
Pro Gly Lys Ala Pro Ile Phe Ser Val Asn Trp Thr Val Gly Asp Ser
485 490 495
Ala Ile Glu Val Ile Asn Ala Thr Thr Gly Lys Gly Glu Leu Gly Arg
500 505 510
Ala Ser Arg Leu Cys Lys His Ala Leu Tyr Cys Arg Trp Met Arg Val
515 520 525
His Gly Lys Val Pro Ser His Leu Leu Arg Ser Lys Ile Thr Lys Pro
530 535 540
Asn Val Tyr His Glu Thr Lys Leu Ala Ala Lys Glu Tyr Gln Ala Ala
545 550 555 560
Lys Ala Arg Leu Phe Thr Ala Phe Ile Lys Ala Gly Leu Gly Ala Trp
565 570 575
Val Glu Lys Pro Thr Glu Gln Asp Gln Phe Ser Leu Thr Gly Ser Gly
580 585 590
Ser Gly Ser Ile Lys Met Arg Ile Lys Val His Ala Ala Ala Asp Lys
595 600 605
His Asn Ser Val Glu Asp Ser Val Thr Lys Ser Arg Glu His Lys Glu
610 615 620
Lys His Lys Thr His Pro Ser Asn His His His His His Asn His His
625 630 635 640
Ser His Lys His Ser His Ser Gln Leu Pro Val Gly Thr Gly Asn Lys
645 650 655
Arg Pro Gly Asp Pro Lys His Ser Ser Gln Gly Gly Ser Ala Ala Ala
660 665 670
Gly Gly Ser Ile Lys Met Arg Ile Lys Val His Ala Ala Ala Asp Lys
675 680 685
His Asn Ser Val Glu Asp Ser Val Thr Lys Ser Arg Glu His Lys Glu
690 695 700
Lys His Lys Thr His Pro Ser Asn His His His His His Asn His His
705 710 715 720
Ser His Lys His Ser His Ser Gln Leu Pro Val Gly Thr Gly Asn Lys
725 730 735
Arg Pro Gly Asp Pro Lys His Ser Ser Gln
740 745
<210> 3
<211> 11
<212> PRT
<213> Artificial Sequence
<220>
<223> NES
<400> 3
Leu Gln Leu Pro Pro Leu Glu Arg Leu Thr Leu
1 5 10
<210> 4
<211> 71
<212> PRT
<213> Artificial Sequence
<220>
<223> HRD
<400> 4
Ile Lys Met Arg Ile Lys Val His Ala Ala Ala Asp Lys His Asn Ser
1 5 10 15
Val Glu Asp Ser Val Thr Lys Ser Arg Glu His Lys Glu Lys His Lys
20 25 30
Thr His Pro Ser Asn His His His His His Asn His His Ser His Lys
35 40 45
His Ser His Ser Gln Leu Pro Val Gly Thr Gly Asn Lys Arg Pro Gly
50 55 60
Asp Pro Lys His Ser Ser Gln
65 70
<210> 5
<211> 349
<212> DNA
<213> Artificial Sequence
<220>
<223> U6-gRNA
<220>
<221> misc_feature
<222> (287)..(306)
<223> n is a, c, g, or t
<400> 5
gagggcctat ttcccatgat tccttcatat ttgcatatac gatacaaggc tgttagagag 60
ataattggaa ttaatttgac tgtaaacaca aagatattag tacaaaatac gtgacgtaga 120
aagtaataat ttcttgggta gtttgcagtt ttaaaattat gttttaaaat ggactatcat 180
atgcttaccg taacttgaaa gtatttcgat ttcttggctt tatatatctt gtggaaagga 240
cgaaacaccg ggccagatct gagcctggga gctctctggc cttattnnnn nnnnnnnnnn 300
nnnnnnttat tggccagatc tgagcctggg agctctctgg ccttttttt 349
<210> 6
<211> 603
<212> DNA
<213> Artificial Sequence
<220>
<223> CIRTS
<400> 6
atgggaatta tcaacaccct tcaaaaatac tactgtcgcg tccgtggggg tcgttgcgca 60
gtcctgtcgt gcctgcctaa agaggagcag atcggtaaat gctcaacccg tggacgtaag 120
tgctgtcgtc gcaagaaagg cggttctgga ggctcaggag gaagcggagg atccggtgga 180
tcaggaggct ccatggcagt tcccgagacc cgccctaacc acactattta tatcaacaac 240
ctcaatagca agatcaagaa ggatgagcta aaaaagtccc tgtacgccat cttctcccag 300
tttggccaga tcctggatat cctggtaccg cggcagcgta ccccgagggg ccaggccttt 360
gtcatcttca aggaggtcag cagcgccacc aacgccctgc gctccatgca gggtttccct 420
ttctatgaca aacctatgcg tatccagtat gccaagaccg acaaacgtat cccggccaag 480
atgaaaggca ccttcgtgat gcatggatcc cttcaactgc ctccacttga aagactgaca 540
ctggaattcg gcggtagtgg aggaagcgga ggaagcggag gaagtggtgg atccggaggc 600
tcc 603
<210> 7
<211> 1888
<212> DNA
<213> Artificial Sequence
<220>
<223> TC1688:CMV CIRTS-Apobec3A
<400> 7
gttgacattg attattgact agttattaat agtaatcaat tacggggtca ttagttcata 60
gcccatatat ggagttccgc gttacataac ttacggtaaa tggcccgcct ggctgaccgc 120
ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt tcccatagta acgccaatag 180
ggactttcca ttgacgtcaa tgggtggagt atttacggta aactgcccac ttggcagtac 240
atcaagtgta tcatatgcca agtacgcccc ctattgacgt caatgacggt aaatggcccg 300
cctggcatta tgcccagtac atgaccttat gggactttcc tacttggcag tacatctacg 360
tattagtcat cgctattacc atggtgatgc ggttttggca gtacatcaat gggcgtggat 420
agcggtttga ctcacgggga tttccaagtc tccaccccat tgacgtcaat gggagtttgt 480
tttggcacca aaatcaacgg gactttccaa aatgtcgtaa caactccgcc ccattgacgc 540
aaatgggcgg taggcgtgta cggtgggagg tctatataag cagagctctc tggctaacta 600
gagaacccac tgcttactgg cttatcgaaa ttaatacgac tcactatagg gagacccaag 660
ctggctagcg tttgccacca tgggaattat caacaccctt caaaaatact actgtcgcgt 720
ccgtgggggt cgttgcgcag tcctgtcgtg cctgcctaaa gaggagcaga tcggtaaatg 780
ctcaacccgt ggacgtaagt gctgtcgtcg caagaaaggc ggttctggag gctcaggagg 840
aagcggagga tccggtggat caggaggctc catggcagtt cccgagaccc gccctaacca 900
cactatttat atcaacaacc tcaatagcaa gatcaagaag gatgagctaa aaaagtccct 960
gtacgccatc ttctcccagt ttggccagat cctggatatc ctggtaccgc ggcagcgtac 1020
cccgaggggc caggcctttg tcatcttcaa ggaggtcagc agcgccacca acgccctgcg 1080
ctccatgcag ggtttccctt tctatgacaa acctatgcgt atccagtatg ccaagaccga 1140
caaacgtatc ccggccaaga tgaaaggcac cttcgtgatg catggatccc ttcaactgcc 1200
tccacttgaa agactgacac tggaattcgg cggtagtgga ggaagcggag gaagcggagg 1260
aagtggtgga tccggaggct ccggctcgag catggaagcc agcccagcat ccgggcccag 1320
acacttgatg gatccacaca tattcacttc caactttaac aatggcattg gaaggcataa 1380
gacctacctg tgctacgaag tggagcgcct ggacaatggc acctcggtca agatggacca 1440
gcacaggggc tttctacaca accaggctaa gaatcttctc tgtggctttt acggccgcca 1500
tgcggagctg cgcttcttgg acctggttcc ttctttgcag ttggacccgg cccagatcta 1560
cagggtcact tggttcatct cctggagccc ctgcttctcc tggggctgtg ccggggaagt 1620
gcgtgcgttc cttcaggaga acacacacgt gagactgcgt atcttcgctg cccgcatcta 1680
tgatgacgac cccctatata aggaggcact gcaaatgctg cgggatgctg gggcccaagt 1740
ctccatcatg acctacgatg aatttaagca ctgctgggac acctttgtgg accaccaggg 1800
atgtcccttc cagccctggg atggactaga tgagcacagc caagccctga gtgggaggct 1860
gcgggccatt ctccagaatc agggaaac 1888
<210> 8
<211> 42
<212> DNA
<213> Artificial Sequence
<220>
<223> 6937F
<400> 8
tccggaggct ccggctcgag ccagctgcat ttaccgcagg tt 42
<210> 9
<211> 46
<212> DNA
<213> Artificial Sequence
<220>
<223> 7693
<400> 9
gaagttagta gcggatccag cggccgccgt gagtgagaac tggtcc 46
<210> 10
<211> 35
<212> DNA
<213> Artificial Sequence
<220>
<223> 8332F
<400> 10
accagttctc actcacgggc tccggatccg ggtcc 35
<210> 11
<211> 37
<212> DNA
<213> Artificial Sequence
<220>
<223> 8332R
<400> 11
ggacttttat gcgcattttt atggacccgg atccgga 37
<210> 12
<211> 27
<212> DNA
<213> Artificial Sequence
<220>
<223> 8331F
<400> 12
ataaaaatgc gcataaaagt ccatgct 27
<210> 13
<211> 46
<212> DNA
<213> Artificial Sequence
<220>
<223> 8331R
<400> 13
tgaagttagt agcggatccc ccctggctac tatgttttgg atcacc 46
<210> 14
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> 8395F
<400> 14
ataaaaatgc gcataaaagt ccat 24
<210> 15
<211> 28
<212> DNA
<213> Artificial Sequence
<220>
<223> 8395R
<400> 15
tccggagccc gtgagtgaga actggtcc 28
<210> 16
<211> 51
<212> DNA
<213> Artificial Sequence
<220>
<223> 8394R
<400> 16
tgttgataat tcccatggaa ccgccagcgg ccgcggatcc cccctggcta c 51
<210> 17
<211> 48
<212> DNA
<213> Artificial Sequence
<220>
<223> 8396F
<400> 17
ctcacgggct ccggatccgg gtccataaaa atgcgcataa aagtccat 48
<210> 18
<211> 41
<212> DNA
<213> Artificial Sequence
<220>
<223> 8396R
<400> 18
acttttatgc gcatttttat ggaaccgcca gcggccgcgg a 41
<210> 19
<211> 48
<212> DNA
<213> Artificial Sequence
<220>
<223> 8420F
<400> 19
caggggggat ccgcgggtgg ttcaataaaa atgcgcataa aagtccat 48
<210> 20
<211> 46
<212> DNA
<213> Artificial Sequence
<220>
<223> 8420R
<400> 20
ccccggtctc tgggactacc cccctggcta ctatgttttg gatcac 46
<210> 21
<211> 48
<212> DNA
<213> Artificial Sequence
<220>
<223> 8421F
<400> 21
ggggggtctc atcccggggg ttccataaaa atgcgcataa aagtccat 48
<210> 22
<211> 47
<212> DNA
<213> Artificial Sequence
<220>
<223> 8421R
<400> 22
tatggaaccg ccagcacttc ccccctggct actatgtttt ggatcac 47
<210> 23
<211> 40
<212> DNA
<213> Artificial Sequence
<220>
<223> 8559F-LJJ10
<400> 23
ggagacccaa gctggctagc gccaccatga aacggactgc 40
<210> 24
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> 8559R-LJJ10
<400> 24
ggtgttgata attccggatc cgactttcct ctt 33
<210> 25
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> 8560F-LJJ11
<400> 25
ggaattatca acacccttca a 21
<210> 26
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> 8560R-LJJ11
<400> 26
ccgcggtacc aggatatcca g 21
<210> 27
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> 8561F-LJJ12
<400> 27
cagttctcac tcacgggttc taagcgcacc gcc 33
<210> 28
<211> 38
<212> DNA
<213> Artificial Sequence
<220>
<223> 8561R-LJJ12
<400> 28
agtagcggat ccagcggccg cggatccctt gtcatcgt 38
<210> 29
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> 8562-LJJ13
<400> 29
cagttctcac tcacgggatc caaaagaact gcg 33
<210> 30
<211> 27
<212> DNA
<213> Artificial Sequence
<220>
<223> 8408F
<400> 30
accagttctc actcacggcg gccgctg 27
<210> 31
<211> 57
<212> DNA
<213> Artificial Sequence
<220>
<223> 8408Fb
<400> 31
cacggcggcc gctggctccg gatccgggtc catggcctca aacgattata cccaaca 57
<210> 32
<211> 47
<212> DNA
<213> Artificial Sequence
<220>
<223> 8408R
<400> 32
gaagttagta gcggatccgc ctccacggtc ctgctgtcca tagccac 47
<210> 33
<211> 51
<212> DNA
<213> Artificial Sequence
<220>
<223> 8409F
<400> 33
cacggcggcc gctggctccg gatccgggtc catggctagt gcttcatcca g 51
<210> 34
<211> 44
<212> DNA
<213> Artificial Sequence
<220>
<223> 8409R
<400> 34
gaagttagta gcggatccaa atcttctgcc actgccatag ctac 44
<210> 35
<211> 58
<212> DNA
<213> Artificial Sequence
<220>
<223> 8934F
<400> 35
tagccagggg ggatccgcgg ccgctggcgg ttccataaaa atgcgcataa aagtccat 58
<210> 36
<211> 38
<212> DNA
<213> Artificial Sequence
<220>
<223> 8934R
<400> 36
agggttctcc tccacgtctc cggatccccc ctggctac 38
<210> 37
<211> 18
<212> PRT
<213> Artificial Sequence
<220>
<223> bpNLS
<400> 37
Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu Ser Pro Lys Lys Lys Arg
1 5 10 15
Lys Val
<210> 38
<211> 46
<212> PRT
<213> Artificial Sequence
<220>
<223> beta-defensin
<400> 38
Met Gly Ile Ile Asn Thr Leu Gln Lys Tyr Tyr Cys Arg Val Arg Gly
1 5 10 15
Gly Arg Cys Ala Val Leu Ser Cys Leu Pro Lys Glu Glu Gln Ile Gly
20 25 30
Lys Cys Ser Thr Arg Gly Arg Lys Cys Cys Arg Arg Lys Lys
35 40 45
<210> 39
<211> 42
<212> PRT
<213> Artificial Sequence
<220>
<223> bpNLS-3FLAG
<400> 39
Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu Ser Pro Lys Lys Lys Arg
1 5 10 15
Lys Val Gly Ser Asp Tyr Lys Asp His Asp Gly Asp Tyr Lys Asp His
20 25 30
Asp Ile Asp Tyr Lys Asp Asp Asp Asp Lys
35 40
<210> 40
<211> 214
<212> PRT
<213> Artificial Sequence
<220>
<223> FUS-IDR
<400> 40
Met Ala Ser Asn Asp Tyr Thr Gln Gln Ala Thr Gln Ser Tyr Gly Ala
1 5 10 15
Tyr Pro Thr Gln Pro Gly Gln Gly Tyr Ser Gln Gln Ser Ser Gln Pro
20 25 30
Tyr Gly Gln Gln Ser Tyr Ser Gly Tyr Ser Gln Ser Thr Asp Thr Ser
35 40 45
Gly Tyr Gly Gln Ser Ser Tyr Ser Ser Tyr Gly Gln Ser Gln Asn Ser
50 55 60
Tyr Gly Thr Gln Ser Thr Pro Gln Gly Tyr Gly Ser Thr Gly Gly Tyr
65 70 75 80
Gly Ser Ser Gln Ser Ser Gln Ser Ser Tyr Gly Gln Gln Ser Ser Tyr
85 90 95
Pro Gly Tyr Gly Gln Gln Pro Ala Pro Ser Ser Thr Ser Gly Ser Tyr
100 105 110
Gly Ser Ser Ser Gln Ser Ser Ser Tyr Gly Gln Pro Gln Ser Gly Ser
115 120 125
Tyr Ser Gln Gln Pro Ser Tyr Gly Gly Gln Gln Gln Ser Tyr Gly Gln
130 135 140
Gln Gln Ser Tyr Asn Pro Pro Gln Gly Tyr Gly Gln Gln Asn Gln Tyr
145 150 155 160
Asn Ser Ser Ser Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Asn Tyr
165 170 175
Gly Gln Asp Gln Ser Ser Met Ser Ser Gly Gly Gly Ser Gly Gly Gly
180 185 190
Tyr Gly Asn Gln Asp Gln Ser Gly Gly Gly Gly Ser Gly Gly Tyr Gly
195 200 205
Gln Gln Asp Arg Gly Gly
210
<210> 41
<211> 135
<212> PRT
<213> Artificial Sequence
<220>
<223> HNRNPA1-IDR
<400> 41
Met Ala Ser Ala Ser Ser Ser Gln Arg Gly Arg Ser Gly Ser Gly Asn
1 5 10 15
Phe Gly Gly Gly Arg Gly Gly Gly Phe Gly Gly Asn Asp Asn Phe Gly
20 25 30
Arg Gly Gly Asn Phe Ser Gly Arg Gly Gly Phe Gly Gly Ser Arg Gly
35 40 45
Gly Gly Gly Tyr Gly Gly Ser Gly Asp Gly Tyr Asn Gly Phe Gly Asn
50 55 60
Asp Gly Ser Asn Phe Gly Gly Gly Gly Ser Tyr Asn Asp Phe Gly Asn
65 70 75 80
Tyr Asn Asn Gln Ser Ser Asn Phe Gly Pro Met Lys Gly Gly Asn Phe
85 90 95
Gly Gly Arg Ser Ser Gly Pro Tyr Gly Gly Gly Gly Gln Tyr Phe Ala
100 105 110
Lys Pro Arg Asn Gln Gly Gly Tyr Gly Gly Ser Ser Ser Ser Ser Ser
115 120 125
Tyr Gly Ser Gly Arg Arg Phe
130 135
<210> 42
<211> 324
<212> DNA
<213> Artificial Sequence
<220>
<223> U6-TAR-BbsI-BbsI
<400> 42
gagggcctat ttcccatgat tccttcatat ttgcatatac gatacaaggc tgttagagag 60
ataattggaa ttaatttgac tgtaaacaca aagatattag tacaaaatac gtgacgtaga 120
aagtaataat ttcttgggta gtttgcagtt ttaaaattat gttttaaaat ggactatcat 180
atgcttaccg taacttgaaa gtatttcgat ttcttggctt tatatatctt gtggaaagga 240
cgaaacaccg ggccagatct gagcctggga gctctctggc cttattggcc agatctgagc 300
ctgggagctc tctggccttt tttt 324
<210> 43
<211> 349
<212> DNA
<213> Artificial Sequence
<220>
<223> U6-TAR-BbsI-BbsI-TAR crRNA
<400> 43
gagggcctat ttcccatgat tccttcatat ttgcatatac gatacaaggc tgttagagag 60
ataattggaa ttaatttgac tgtaaacaca aagatattag tacaaaatac gtgacgtaga 120
aagtaataat ttcttgggta gtttgcagtt ttaaaattat gttttaaaat ggactatcat 180
atgcttaccg taacttgaaa gtatttcgat ttcttggctt tatatatctt gtggaaagga 240
cgaaacaccg ggccagatct gagcctggga gctctctggc cttattgtct tcgatatcga 300
agacttttat tggccagatc tgagcctggg agctctctgg ccttttttt 349
<210> 44
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> hCTNNB1 S33F C-flip
<400> 44
ggattccaca gtccaggtaa gactgttgct 30
<210> 45
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> hCTNNB1 P44S C-flip
<400> 45
cagagaagca gctgtggtag tggcaccaga 30
<210> 46
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> hCTNNB1 T41I C-flip
<400> 46
gagctgtgct agtggcacca gaatggattc 30
<210> 47
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> hPPIB I18I C-flip
<400> 47
gaccccgcca tgagggcggc ggcaaggagc 30
<210> 48
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> hPPIB R7C C-flip
<400> 48
catgttgcct tcggagaggc gcagcatcca 30
<210> 49
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> hSMARCA4 S85L C-flip
<400> 49
ggtcgtccca catgcccttc tcatgcatgg 30
<210> 50
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> hSMARCA4 D86D C-flip
<400> 50
cgcgggtcct ccgacatgcc cttctcatgc 30
<210> 51
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> hNRAS I21I C-flip
<400> 51
attagctgca ttgtcagtgc gcttttccca 30
<210> 52
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> hNFKB1 P33S C-flip
<400> 52
catctgtgct tgaaatactt ctggattaaa 30
<210> 53
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S33F C-flip-20/10
<400> 53
atggattcca caatccaagt 20
<210> 54
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S33F C-flip-20/15
<400> 54
ttccacaatc caagtaagac 20
<210> 55
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S33F C-flip-30/5
<400> 55
ggtggtggca ccagaatgga ttccacaatc 30
<210> 56
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S33F C-flip-30/10
<400> 56
tggcaccaga atggattcca caatccaagt 30
<210> 57
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S33F C-flip-30/15
<400> 57
ccagaatgga ttccacaatc caagtaagac 30
<210> 58
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S33F C-flip-30/20
<400> 58
atggattcca caatccaagt aagactgctg 30
<210> 59
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S33F C-flip-30/22
<400> 59
ggattccaca atccaagtaa gactgctgct 30
<210> 60
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S33F C-flip-30/24
<400> 60
attccacaat ccaagtaaga ctgctgctgc 30
<210> 61
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S33F C-flip-30/25
<400> 61
ttccacaatc caagtaagac tgctgctgcc 30
<210> 62
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S33F C-flip-30/26
<400> 62
tccacaatcc aagtaagact gctgctgcca 30
<210> 63
<211> 40
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S33F C-flip-40/20
<400> 63
tggcaccaga atggattcca caatccaagt aagactgctg 40
<210> 64
<211> 50
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S33F C-flip-50/25
<400> 64
ggtggtggca ccagaatgga ttccacaatc caagtaagac tgctgctgcc 50
<210> 65
<211> 50
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S33F C-flip-50/30
<400> 65
tggcaccaga atggattcca caatccaagt aagactgctg ctgccagtgg 50
<210> 66
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 T41A C-flip-20/10
<400> 66
ggagctgtgg cggtggcacc 20
<210> 67
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 T41A C-flip-20/15
<400> 67
tgtggcggtg gcaccagaat 20
<210> 68
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 T41A C-flip-30/15
<400> 68
gggaaggagc tgtggcggtg gcaccagaat 30
<210> 69
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 T41A C-flip-30/20
<400> 69
ggagctgtgg cggtggcacc agaatggatt 30
<210> 70
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 T41A C-flip-30/22
<400> 70
agctgtggcg gtggcaccag aatggattcc 30
<210> 71
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 T41A C-flip-30/24
<400> 71
ctgtggcggt ggcaccagaa tggattccag 30
<210> 72
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 T41A C-flip-30/26
<400> 72
gtggcggtgg caccagaatg gattccagaa 30
<210> 73
<211> 40
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 T41A C-flip-40/20
<400> 73
actcagggaa ggagctgtgg cggtggcacc agaatggatt 40
<210> 74
<211> 50
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 T41A C-flip-50/25
<400> 74
ttgccactca gggaaggagc tgtggcggtg gcaccagaat ggattccaga 50
<210> 75
<211> 50
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 T41A C-flip-50/30
<400> 75
actcagggaa ggagctgtgg cggtggcacc agaatggatt ccagaatcca 50
<210> 76
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S33F U-flip-30/5
<400> 76
ggtggtggca ccagaatgga ttccauaatc 30
<210> 77
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S33F U-flip-30/10
<400> 77
tggcaccaga atggattcca uaatccaagt 30
<210> 78
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S33F U-flip-30/15
<400> 78
ccagaatgga ttccauaatc caagtaagac 30
<210> 79
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S33F U-flip-30/20
<400> 79
atggattcca uaatccaagt aagactgctg 30
<210> 80
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S33F U-flip-30/22
<400> 80
ggattccaua atccaagtaa gactgctgct 30
<210> 81
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S33F U-flip-30/24
<400> 81
attccauaat ccaagtaaga ctgctgctgc 30
<210> 82
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S33F U-flip-30/25
<400> 82
ttccauaatc caagtaagac tgctgctgcc 30
<210> 83
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S33F U-flip-30/26
<400> 83
tccauaatcc aagtaagact gctgctgcca 30
<210> 84
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S37F U-flip-30/15
<400> 84
gtggtggtgg caccauaatg gattccagaa 30
<210> 85
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S37F U-flip-30/20
<400> 85
ggtggcacca uaatggattc cagaatccaa 30
<210> 86
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S37F U-flip-30/22
<400> 86
tggcaccaua atggattcca gaatccaagt 30
<210> 87
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S37F U-flip-30/24
<400> 87
gcaccauaat ggattccaga atccaagtaa 30
<210> 88
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 T41I U-flip-30/15
<400> 88
agggaaggag ctgtgutggt ggcaccagaa 30
<210> 89
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 T41I U-flip-30/20
<400> 89
aggagctgtg utggtggcac cagaatggat 30
<210> 90
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 T41I U-flip-30/22
<400> 90
gagctgtgut ggtggcacca gaatggattc 30
<210> 91
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 T41I U-flip-30/24
<400> 91
gctgtgutgg tggcaccaga atggattcca 30
<210> 92
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S45F U-flip-30/15
<400> 92
cccttgccac tcagguaagg agctgtggtg 30
<210> 93
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S45F U-flip-30/20
<400> 93
gccactcagg uaaggagctg tggtggtggc 30
<210> 94
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S45F U-flip-30/22
<400> 94
cactcaggua aggagctgtg gtggtggcac 30
<210> 95
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> mCTNNB1 S45F U-flip-30/24
<400> 95
ctcagguaag gagctgtggt ggtggcacca 30
<210> 96
<211> 102
<212> PRT
<213> Artificial Sequence
<220>
<223> TBP
<400> 96
Met Ala Val Pro Glu Thr Arg Pro Asn His Thr Ile Tyr Ile Asn Asn
1 5 10 15
Leu Asn Ser Lys Ile Lys Lys Asp Glu Leu Lys Lys Ser Leu Tyr Ala
20 25 30
Ile Phe Ser Gln Phe Gly Gln Ile Leu Asp Ile Leu Val Pro Arg Gln
35 40 45
Arg Thr Pro Arg Gly Gln Ala Phe Val Ile Phe Lys Glu Val Ser Ser
50 55 60
Ala Thr Asn Ala Leu Arg Ser Met Gln Gly Phe Pro Phe Tyr Asp Lys
65 70 75 80
Pro Met Arg Ile Gln Tyr Ala Lys Thr Asp Lys Arg Ile Pro Ala Lys
85 90 95
Met Lys Gly Thr Phe Val
100
<210> 97
<211> 4
<212> PRT
<213> Artificial Sequence
<220>
<223> linker
<400> 97
Met His Gly Ser
1
<210> 98
<211> 21
<212> PRT
<213> Artificial Sequence
<220>
<223> linker
<400> 98
Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly
1 5 10 15
Gly Ser Gly Ser Ser
20
<210> 99
<211> 385
<212> PRT
<213> Artificial Sequence
<220>
<223> ADAR*
<400> 99
Gln Leu His Leu Pro Gln Val Leu Ala Asp Ala Val Ser Arg Leu Val
1 5 10 15
Ile Gly Lys Phe Gly Asp Leu Thr Asp Asn Phe Ser Ser Pro His Ala
20 25 30
Arg Arg Ile Gly Leu Ala Gly Val Val Met Thr Thr Gly Thr Asp Val
35 40 45
Lys Asp Ala Lys Val Ile Cys Val Ser Thr Gly Ala Lys Cys Ile Asn
50 55 60
Gly Glu Tyr Leu Ser Asp Arg Gly Leu Ala Leu Asn Asp Cys His Ala
65 70 75 80
Glu Ile Val Ser Arg Arg Ser Leu Leu Arg Phe Leu Tyr Thr Gln Leu
85 90 95
Glu Leu Tyr Leu Asn Asn Glu Asp Asp Gln Lys Arg Ser Ile Phe Gln
100 105 110
Lys Ser Glu Arg Gly Gly Phe Arg Leu Lys Glu Asn Ile Gln Phe His
115 120 125
Leu Tyr Ile Ser Thr Ser Pro Cys Gly Asp Ala Arg Ile Phe Ser Pro
130 135 140
His Glu Ala Ile Leu Glu Glu Pro Ala Asp Arg His Pro Asn Arg Lys
145 150 155 160
Ala Arg Gly Gln Leu Arg Thr Lys Ile Glu Ala Gly Gln Gly Thr Ile
165 170 175
Pro Val Arg Asn Asn Ala Ser Ile Gln Thr Trp Asp Gly Val Leu Gln
180 185 190
Gly Glu Arg Leu Leu Thr Met Ser Cys Ser Asp Lys Ile Ala Arg Trp
195 200 205
Asn Val Val Gly Ile Gln Gly Ser Leu Leu Ser Ile Phe Val Glu Pro
210 215 220
Ile Tyr Phe Ser Ser Ile Ile Leu Gly Ser Leu Tyr His Gly Asp His
225 230 235 240
Leu Ser Arg Ala Met Tyr Gln Arg Ile Ser Asn Ile Glu Asp Leu Pro
245 250 255
Pro Leu Tyr Thr Leu Asn Lys Pro Leu Leu Thr Gly Ile Ser Asn Ala
260 265 270
Glu Ala Arg Gln Pro Gly Lys Ala Pro Ile Phe Ser Val Asn Trp Thr
275 280 285
Val Gly Asp Ser Ala Ile Glu Val Ile Asn Ala Thr Thr Gly Lys Gly
290 295 300
Glu Leu Gly Arg Ala Ser Arg Leu Cys Lys His Ala Leu Tyr Cys Arg
305 310 315 320
Trp Met Arg Val His Gly Lys Val Pro Ser His Leu Leu Arg Ser Lys
325 330 335
Ile Thr Lys Pro Asn Val Tyr His Glu Thr Lys Leu Ala Ala Lys Glu
340 345 350
Tyr Gln Ala Ala Lys Ala Arg Leu Phe Thr Ala Phe Ile Lys Ala Gly
355 360 365
Leu Gly Ala Trp Val Glu Lys Pro Thr Glu Gln Asp Gln Phe Ser Leu
370 375 380
Thr
385
<210> 100
<211> 9
<212> PRT
<213> Artificial Sequence
<220>
<223> linker
<400> 100
Gly Gly Ser Ala Ala Ala Gly Gly Ser
1 5
<210> 101
<211> 31
<212> DNA
<213> Artificial Sequence
<220>
<223> TAR
<400> 101
ggccagatct gagcctggga gctctctggc c 31
<210> 102
<211> 71
<212> PRT
<213> Artificial Sequence
<220>
<223> hCYCT1-HRD
<400> 102
Ile Lys Met Arg Ile Lys Val His Ala Ala Ala Asp Lys His Asn Ser
1 5 10 15
Val Glu Asp Ser Val Thr Lys Ser Arg Glu His Lys Glu Lys His Lys
20 25 30
Thr His Pro Ser Asn His His His His His Asn His His Ser His Lys
35 40 45
His Ser His Ser Gln Leu Pro Val Gly Thr Gly Asn Lys Arg Pro Gly
50 55 60
Asp Pro Lys His Ser Ser Gln
65 70
<210> 103
<211> 31
<212> PRT
<213> Artificial Sequence
<220>
<223> hDYRK1A-HRD
<400> 103
Gln Asn Ala Leu His His His His Gly Asn Ser Ser His His His His
1 5 10 15
His His His His His His His His His Gly Gln Gln Ala Leu Gly
20 25 30
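
The antisense labels in the listing above (for example "U-flip-30/15") appear to encode the oligonucleotide length followed by the position of the flipped base counted from the 3' end. This reading is inferred from the sequences themselves and is not stated in the listing. The Python sketch below checks it against SEQ ID NO. 76 to 83; the names U_FLIP_S33F, flip_position_from_3prime and check_labels are hypothetical helpers introduced only for illustration.

```python
# Sequences copied from SEQ ID NO. 76-83 (mCTNNB1 S33F U-flip series),
# keyed by the "length/position" suffix of their <223> labels.
U_FLIP_S33F = {
    "30/5": "ggtggtggca ccagaatgga ttccauaatc",
    "30/10": "tggcaccaga atggattcca uaatccaagt",
    "30/15": "ccagaatgga ttccauaatc caagtaagac",
    "30/20": "atggattcca uaatccaagt aagactgctg",
    "30/22": "ggattccaua atccaagtaa gactgctgct",
    "30/24": "attccauaat ccaagtaaga ctgctgctgc",
    "30/25": "ttccauaatc caagtaagac tgctgctgcc",
    "30/26": "tccauaatcc aagtaagact gctgctgcca",
}


def flip_position_from_3prime(seq, flip_base="u"):
    """Return the 1-based position of the single flipped base, counted from the 3' end."""
    seq = seq.replace(" ", "")
    if seq.count(flip_base) != 1:
        raise ValueError("expected exactly one flipped base in the sequence")
    return len(seq) - seq.index(flip_base)


def check_labels(entries):
    """Compare each label's 'length/position' with the values measured on the sequence."""
    results = {}
    for label, seq in entries.items():
        length, position = (int(part) for part in label.split("/"))
        measured_length = len(seq.replace(" ", ""))
        results[label] = (
            measured_length == length
            and flip_position_from_3prime(seq) == position
        )
    return results


if __name__ == "__main__":
    for label, consistent in check_labels(U_FLIP_S33F).items():
        print(label, "consistent" if consistent else "inconsistent")
```

The check relies on each U-flip sequence containing exactly one u. The C-flip entries cannot be tested the same way, because c occurs many times in those sequences.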

Claims (32)

1. A fusion protein, comprising, in order from N-terminus to C-terminus: a single-stranded RNA binding protein fragment, an RNA hairpin binding protein fragment, a nuclear export signal peptide fragment, and a deaminase fragment; wherein the amino acid sequence of the deaminase fragment is shown as SEQ ID NO. 99;
the amino acid sequence of the single-stranded RNA binding protein fragment is shown as SEQ ID NO. 38; and the amino acid sequence of the RNA hairpin binding protein fragment is shown as SEQ ID NO. 96.
2. The fusion protein of claim 1, wherein the fusion protein further comprises a histidine-rich domain fragment attached to the N-terminus and/or the C-terminus, and the copy number of the histidine-rich domain fragment is a natural number of 1 or more.
3. The fusion protein of claim 2, wherein the histidine-rich domain fragment has a copy number of 1, 2, 3 or 4; and/or the amino acid sequence of the histidine-rich domain fragment is shown as SEQ ID NO. 4.
4. The fusion protein of claim 2 or 3, wherein the single-stranded RNA binding protein fragment, the RNA hairpin binding protein fragment, the nuclear export signal peptide fragment and the deaminase fragment are linked by linkers; and/or the deaminase fragment and the histidine-rich domain fragment are linked by a linker; and/or, where there is more than one copy of the histidine-rich domain fragment, the copies are linked by a linker.
5. The fusion protein of claim 4, wherein the linker is (GmS)n or an amino acid sequence as shown in SEQ ID NO. 97, 98 or 100; wherein m and n are each 0 or a positive integer, and are not simultaneously 0.
6. The fusion protein of claim 5, wherein the linker is (G2S)6 or (GS)3.
7. The fusion protein of claim 5, wherein the single-stranded RNA binding protein fragment and the RNA hairpin binding protein fragment are linked by (G2S)6; and/or the RNA hairpin binding protein fragment and the nuclear export signal peptide fragment are linked by the amino acid sequence shown as SEQ ID NO. 97; and/or the nuclear export signal peptide fragment and the deaminase fragment are linked by a tandem linker; and/or the deaminase fragment and the histidine-rich domain fragment are linked by (GS)3; and/or, where there is more than one copy of the histidine-rich domain fragment, the copies are linked by the amino acid sequence shown as SEQ ID NO. 100.
8. The fusion protein of claim 7, wherein the tandem linker is made up of EF and the amino acid sequence set forth in SEQ ID NO. 98.
9. The fusion protein of claim 1, wherein the amino acid sequence of the nuclear export signal peptide fragment is shown as SEQ ID NO. 3.
10. A base editing system comprising a guide RNA and the fusion protein of any one of claims 1 to 9.
11. The base editing system of claim 10, wherein the guide RNA comprises a backbone structure and a recognition structure; the backbone structure comprises at least one hairpin structure.
12. The base editing system of claim 11, wherein the backbone structure comprises 2 hairpin structures;
and/or the transcription template of the hairpin structure has the nucleotide sequence shown as SEQ ID NO. 101; and/or the length of the recognition structure is 30 nt.
13. An isolated polynucleotide encoding the fusion protein of any one of claims 1 to 9 or the base editing system of any one of claims 10 to 12.
14. An expression cassette comprising a promoter and the polynucleotide of claim 13.
15. The expression cassette of claim 14, wherein the promoter is a U6 promoter.
16. The expression cassette of claim 15, wherein the promoter is a human U6 promoter.
17. The expression cassette of claim 16, wherein the nucleotide sequence of the expression cassette is set forth in SEQ ID No. 5.
18. A construct comprising the polynucleotide of claim 13 or the expression cassette of any one of claims 14 to 17.
19. The construct of claim 18, wherein the construct is selected from the group consisting of a bacterial construct, a fungal construct, an animal cell construct, and a plant cell construct.
20. An expression system, wherein the expression system is a host cell expressing the construct of claim 18 or 19.
21. The expression system of claim 20, wherein the host cell is a eukaryotic cell or a prokaryotic cell.
22. The expression system of claim 21, wherein the eukaryotic cell is a human cell line or a murine cell.
23. The expression system of claim 22, wherein the eukaryotic cell is a human kidney epithelial cell line or a murine neuroblastoma cell.
24. The expression system of claim 23, wherein the eukaryotic cell is HEK293T or N2A.
25. A pharmaceutical composition comprising the fusion protein of any one of claims 1 to 9 or the base editing system of any one of claims 10 to 12, and a pharmaceutically acceptable carrier.
26. A base editing method for non-therapeutic purposes, characterized in that the base editing method comprises:
expressing the fusion protein according to any one of claims 1 to 9 or the base editing system according to any one of claims 10 to 12 in a target cell, thereby causing base editing in the target cell.
27. The base editing method according to claim 26, wherein the target cell is a eukaryotic cell or a prokaryotic cell.
28. The base editing method according to claim 27, wherein said eukaryotic cell is a human cell line or a murine cell.
29. The base editing method of claim 28, wherein said eukaryotic cell is a human kidney epithelial cell line or a murine neuroblastoma cell.
30. The base editing method according to claim 29, wherein said eukaryotic cell is HEK293T or N2A.
31. Use of the fusion protein according to any one of claims 1 to 9, the base editing system according to any one of claims 10 to 12, the polynucleotide according to claim 13, the expression cassette according to any one of claims 14 to 17, the construct according to claim 18 or 19, or the expression system according to any one of claims 20 to 24, for the preparation of a base editing medicament, for the preparation of a gene therapy medicament, or for the preparation of a base editing tool.
32. The use of claim 31, wherein said base editing is selected from the group consisting of repairing a gene mutation from T to C, restoring a missense mutation resulting from a single base mutation, translation of an abnormally terminated inactive protein, RNA editing, and activating the WNT signaling pathway.
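
The linker notation in claims 5 to 8 can be made concrete with a short Python sketch. It expands (GmS)n literally, converts the linker entries of the sequence listing (SEQ ID NO. 97, 98 and 100) to one-letter code, and lists the domain order of claim 1 with the linkers of claim 7. Several points are assumptions rather than statements of the claims: the function names, the placeholder domain labels, the placement of EF before SEQ ID NO. 98 in the tandem linker, and the choice to attach the histidine-rich domain copies at the C-terminus only (claim 2 also allows the N-terminus).

```python
# One-letter codes for the residues that occur in the linker entries.
THREE_TO_ONE = {"Gly": "G", "Ser": "S", "Ala": "A", "Met": "M", "His": "H"}


def to_one_letter(three_letter):
    """Convert a space-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in three_letter.split())


def expand_linker(m, n):
    """Expand (GmS)n: n repeats of m glycines followed by one serine."""
    if m < 0 or n < 0 or (m == 0 and n == 0):
        raise ValueError("m and n must be non-negative and not both 0")
    return ("G" * m + "S") * n


# Linker sequences copied from the sequence listing.
SEQ_97 = to_one_letter("Met His Gly Ser")
SEQ_98 = to_one_letter(
    "Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly "
    "Gly Ser Gly Ser Ser"
)
SEQ_100 = to_one_letter("Gly Gly Ser Ala Ala Ala Gly Gly Ser")


def architecture(hrd_copies=0):
    """List domains and linkers from N-terminus to C-terminus.

    Domain entries are placeholder labels, not sequences. C-terminal placement
    of the histidine-rich domain (HRD) copies and the EF-first order of the
    tandem linker are assumptions made for this sketch.
    """
    parts = [
        "ssRNA-binding(SEQ 38)",
        expand_linker(2, 6),          # (G2S)6
        "hairpin-binding(SEQ 96)",
        SEQ_97,
        "signal-peptide(SEQ 3)",
        "EF" + SEQ_98,                # tandem linker of claim 8
        "deaminase(SEQ 99)",
    ]
    for copy_index in range(hrd_copies):
        # (GS)3 joins the deaminase to the first copy; SEQ 100 joins later copies.
        parts.append(expand_linker(1, 3) if copy_index == 0 else SEQ_100)
        parts.append("HRD(SEQ 4)")
    return parts


if __name__ == "__main__":
    print(" - ".join(architecture(hrd_copies=2)))
```

Read this way, SEQ ID NO. 98 is (G2S)6 followed by GSS. Note that a literal reading of the claim 5 formula with n = 0 yields an empty linker.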
CN202210419197.7A 2022-04-20 2022-04-20 Fusion protein for editing RNA and application thereof Active CN114685685B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210419197.7A CN114685685B (en) 2022-04-20 2022-04-20 Fusion protein for editing RNA and application thereof

Publications (2)

Publication Number Publication Date
CN114685685A CN114685685A (en) 2022-07-01
CN114685685B true CN114685685B (en) 2023-11-07

Family

ID=82144111

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210419197.7A Active CN114685685B (en) 2022-04-20 2022-04-20 Fusion protein for editing RNA and application thereof

Country Status (1)

Country Link
CN (1) CN114685685B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114058607B (en) * 2020-07-31 2024-02-27 上海科技大学 Fusion protein for editing C to U base, and preparation method and application thereof

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9816093B1 (en) * 2016-12-06 2017-11-14 Caribou Biosciences, Inc. Engineered nucleic acid-targeting nucleic acids
CN110527697A (en) * 2018-05-23 2019-12-03 中国科学院上海生命科学研究院 RNA based on CRISPR-Cas13a pinpoints editing technique
CA3125299A1 (en) * 2019-01-04 2020-07-09 The University Of Chicago Systems and methods for modulating rna
CN113543797A (en) * 2019-01-04 2021-10-22 芝加哥大学 Systems and methods for modulating RNA
CN111793627A (en) * 2019-04-08 2020-10-20 中国科学院上海生命科学研究院 RNA fixed-point editing by utilizing artificially constructed RNA editing enzyme and related application
CN111206053A (en) * 2019-10-23 2020-05-29 马信龙 RNA editing system based on CRISPR-Cas13 and application

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
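Identification of RNA-binding protein targets with HyperTRIBE; Reazur Rahman; Nature Protocols; Vol. 13 (No. 8); 1829-1849 *
Single-base gene editing technology based on the CRISPR/Cas9 system and its application in pharmaceutical research; Zhang Aixia; Chinese Journal of Pharmacology and Toxicology; Vol. 32 (No. 7); 507-514 *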

Also Published As

Publication number Publication date
CN114685685A (en) 2022-07-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant