WO2021032155A1 - Base editing system and method of use thereof - Google Patents

Base editing system and method of use thereof

Publication number
WO2021032155A1
Authority
WO
WIPO (PCT)
Prior art keywords
base editing
fusion protein
deaminase
domain
nucleic acid
Prior art date
Application number
PCT/CN2020/110207
Other languages
English (en)
Chinese (zh)
Inventor
高彩霞
李超
Original Assignee
中国科学院遗传与发育生物学研究所
Priority date
Filing date
Publication date
Application filed by 中国科学院遗传与发育生物学研究所
Priority to CN202080059623.XA (CN114945670A)
Publication of WO2021032155A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K19/00 Hybrid peptides, i.e. peptides covalently bound to nucleic acids, or non-covalently bound protein-protein complexes
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/62 DNA sequences coding for fusion proteins
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells

Definitions

  • The invention belongs to the field of genetic engineering. Specifically, the present invention relates to a base editing system and a method of using it. More specifically, the present invention relates to a base editing system and method capable of generating de novo mutations, such as saturation mutations, in situ in an organism.
  • The base editing system and method are based on a base editing fusion protein comprising a deaminase and a CRISPR effector protein.
  • Clustered regularly interspaced short palindromic repeats and their associated (CRISPR/Cas) systems can produce double-strand breaks (DSBs) at endogenous target sites.
  • Item 1 A base editing fusion protein comprising a nucleic acid targeting domain, a cytosine deamination domain, and an adenine deamination domain.
  • Item 2 The base editing fusion protein of Item 1, wherein the nucleic acid targeting domain comprises at least one CRISPR effector protein polypeptide.
  • Item 3 The base editing fusion protein of item 2, wherein the CRISPR effector protein is Cas9 nuclease or a functional variant thereof; preferably, the CRISPR effector protein is nuclease-inactivated Cas9; more preferably, the nuclease-inactivated Cas9 includes the amino acid sequence shown in SEQ ID NO: 2; and most preferably, the nuclease-inactivated Cas9 includes the amino acid sequence shown in SEQ ID NO: 3.
  • Item 4 The base editing fusion protein of any one of items 1 to 3, wherein the cytosine deaminase domain comprises at least one cytosine deaminase polypeptide.
  • Item 5 The base editing fusion protein of item 4, wherein the cytosine deaminase is selected from the group consisting of APOBEC1 deaminase, activation-induced cytidine deaminase (AID), APOBEC3G, CDA1, human APOBEC3A deaminase, and functional variants thereof.
  • Item 6 The base editing fusion protein of Item 5, wherein the cytosine deaminase is human APOBEC3A deaminase or a functional variant thereof, for example, the human APOBEC3A deaminase comprises the amino acid sequence shown in SEQ ID NO: 4.
  • Item 7 The base editing fusion protein of any one of items 1 to 6, wherein the adenine deaminase domain comprises at least one DNA-dependent adenine deaminase polypeptide.
  • Item 8 The base editing fusion protein of item 7, wherein the DNA-dependent adenine deaminase is derived from wild-type E. coli tRNA adenine deaminase TadA (ecTadA), for example, the DNA-dependent adenine deaminase includes the amino acid sequence shown in SEQ ID NO: 6.
  • Item 9 The base editing fusion protein of Item 7 or 8, wherein the adenine deaminase domain comprises two DNA-dependent adenine deaminases.
  • Item 10 The base editing fusion protein of item 7, wherein the adenine deaminase domain further comprises wild-type E. coli tRNA adenine deaminase TadA (ecTadA) fused to the DNA-dependent adenine deaminase. The DNA-dependent adenine deaminase is fused to the C-terminus of the wild-type E. coli tRNA adenine deaminase TadA (ecTadA).
  • Item 11 The base editing fusion protein of Item 7 or 8, wherein the adenine deamination domain comprises the amino acid sequence shown in SEQ ID NO: 7 or 8.
  • Item 12 The base editing fusion protein of any one of items 1-11, wherein the nucleic acid targeting domain, the cytosine deamination domain and the adenine deamination domain are fused via a linker, for example, the linker comprises an amino acid sequence selected from SEQ ID NO: 9-11.
  • Item 13 The base editing fusion protein of any one of items 1-12, wherein the base editing fusion protein comprises, in the following order from N-terminus to C-terminus: a cytosine deamination domain, an adenine deamination domain, and a nucleic acid targeting domain; or the base editing fusion protein comprises, in the following order from N-terminus to C-terminus: an adenine deamination domain, a cytosine deamination domain, and a nucleic acid targeting domain.
  • UGI: uracil DNA glycosylase inhibitor.
  • The uracil DNA glycosylase inhibitor (UGI) comprises the amino acid sequence shown in SEQ ID NO: 12.
  • Item 15 The base editing fusion protein of any one of items 1-14, wherein the base editing fusion protein further comprises one or more nuclear localization sequences (NLS).
  • Item 16 The base editing fusion protein of Item 1, wherein the base editing fusion protein comprises the amino acid sequence shown in any one of SEQ ID NOs: 13-19.
  • a base editing system for modifying target nucleic acid regions in the genome which comprises:
  • the at least one guide RNA is directed to at least one target sequence in the target nucleic acid region.
  • Item 18 The base editing system of Item 17, wherein the guide RNA is sgRNA, for example, the sgRNA includes the scaffold sequence shown in SEQ ID NO: 27 or SEQ ID NO: 28.
  • Item 19 The base editing system of Item 17 or 18, wherein the target sequence targeted by the guide RNA contains a PAM sequence at the 3' end, such as 5'-NGG-3' or 5'-NG-3'.
  • Item 20 The base editing system of any one of Items 17-19, wherein the at least one guide RNA is directed to a target sequence located on the sense strand and/or antisense strand in the target nucleic acid region of the cell genome.
  • Item 21 The base editing system of any one of items 17-20, wherein the nucleotide sequence encoding the base editing fusion protein is codon-optimized for the organism whose genome is to be modified.
  • Item 22 The base editing system of Item 21, wherein the base editing fusion protein is encoded by any nucleotide sequence shown in SEQ ID NO: 20-26.
  • Item 23 A method of producing at least one genetically modified cell, comprising introducing the base editing system of any one of items 17-22 into at least one of the cells, thereby resulting in one or more nucleotide substitutions within a target nucleic acid region in the at least one cell.
  • Item 24 The method of Item 23, further comprising the step of selecting cells having the desired one or more nucleotide substitutions from the at least one cell.
  • Item 25 The method of Item 23 or 24, wherein the base editing system is introduced into the cell by a method selected from the group consisting of calcium phosphate transfection, protoplast fusion, electroporation, liposome transfection, microinjection, viral infection (such as baculovirus, vaccinia virus, adenovirus, adeno-associated virus, lentivirus and other viruses), gene bombardment, PEG-mediated transformation of protoplasts, and Agrobacterium-mediated transformation.
  • Item 26 The method of any one of items 23-25, wherein the cells are derived from mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle, cats; poultry such as chickens, ducks, and geese; or plants, including monocots and dicots, preferably crop plants such as wheat, rice, corn, soybean, sunflower, sorghum, rape, alfalfa, cotton, barley, millet, sugar cane, tomato, tobacco, cassava and potato.
  • Item 27 A method for producing a genetically modified plant, comprising introducing the base editing system of any one of items 17-22 into at least one of the plants, thereby resulting in one or more nucleotide substitutions within a target nucleic acid region in the genome of the at least one plant.
  • Item 28 The method of Item 27, further comprising selecting plants having the desired one or more nucleotide substitutions from the at least one plant.
  • Item 29 The method of Item 27 or 28, wherein the base editing system is introduced into the plant by a method selected from the group consisting of: gene bombardment, PEG-mediated transformation of protoplasts, Agrobacterium-mediated transformation, plant virus-mediated transformation, the pollen tube passage method and the ovary injection method.
  • Item 30 The method of Item 28, wherein the introducing is performed in the absence of selective pressure.
  • Item 31 The method of any one of items 27-30, wherein the introducing comprises transforming the base editing system into an isolated plant cell or tissue, and then regenerating the transformed plant cell or tissue into a whole plant; preferably, the regeneration is performed in the absence of selective pressure.
  • Item 32 The method of any one of items 27-30, wherein the introducing comprises transforming the base editing system into leaves, stem tips, pollen tubes, young ears, or hypocotyls on intact plants.
  • Item 33 The method of any one of items 27-30, wherein the expression construct is an in vitro transcribed RNA molecule.
  • Item 34 The method of any one of items 27 to 33, wherein the genetically modified plant does not contain an exogenous polynucleotide integrated into its genome.
  • Item 35 The method of any one of items 27 to 34, wherein the modified target nucleic acid region is related to plant traits such as agronomic traits.
  • Item 36 The method of any one of items 27 to 35, further comprising the step of screening plants for desired traits such as agronomic traits.
  • Item 37 The method of any one of items 27 to 36, further comprising obtaining progeny of the genetically modified plant.
  • Item 38 A method for plant breeding, comprising crossing a genetically modified first plant, obtained by the method of any one of items 27-37 and containing one or more nucleotide substitutions in the target nucleic acid region, with a second plant that does not contain the one or more nucleotide substitutions, thereby introducing the one or more nucleotide substitutions into the second plant.
  • The genetically modified first plant has desired traits, such as agronomic traits.
  • Item 39 A method for in-situ saturation mutation of an endogenous target nucleic acid region in a cell or organism to obtain a mutation of interest in the target nucleic acid region, comprising
  • the base editing system of any one of items 17-22 is introduced into the population of said cells or organisms, resulting in one or more mutations in the endogenous target nucleic acid region of the cells or organisms of said population;
  • Item 40 The method of Item 39, wherein the base editing system comprises a plurality of guide RNAs and/or a plurality of expression constructs containing nucleotide sequences encoding the plurality of guide RNAs; preferably, the plurality of guide RNAs target different target sequences within the target nucleic acid region.
  • Item 41 The method of Item 39 or 40, wherein the plurality of guide RNAs comprise 2 to 250 or more, such as 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 75, 100, 150, 200, 250, 300 or more guide RNAs.
  • Item 42 The method of any one of items 39-41, wherein the target sequences and/or complementary sequences targeted by at least some of the plurality of guide RNAs partially overlap each other and/or are adjacent to each other.
  • Item 43 The method of any one of items 39-42, wherein the target nucleic acid region has a length of about 20bp to about 10000bp or longer, for example about 20bp, about 40bp, about 60bp, about 80bp, about 100bp, about 120bp, about 140bp, about 160bp, about 180bp, about 200bp, about 300bp, about 400bp, about 500bp, about 1000bp, about 1500bp, about 2000bp, about 3000bp, about 4000bp, about 5000bp, about 6000bp or longer; or, wherein the target nucleic acid sequence encodes an amino acid sequence of about 5 to about 2000 amino acids in length, for example, it can encode about 5, about 10, about 15, about 20, about 25, about 30, or about 35.
  • Item 44 The method of any one of items 39 to 43, wherein the target sequence targeted by the plurality of guide RNAs substantially covers the target nucleic acid region.
  • Item 45 The method of any one of items 39-44, wherein at least a portion of the plurality of guide RNAs target the sense strand of the target nucleic acid region.
  • Item 46 The method of any one of items 39-45, wherein at least a portion of the plurality of guide RNAs target the antisense strand of the target nucleic acid region.
  • Item 47 The method of any one of items 39 to 46, wherein the plurality of guide RNAs and/or a plurality of expression constructs containing nucleotide sequences encoding the plurality of guide RNAs are each independently introduced into the population of cells or organisms; or the plurality of guide RNAs and/or a plurality of expression constructs containing nucleotide sequences encoding the plurality of guide RNAs are introduced into the population of cells or organisms in combination with each other.
  • Item 48 The method of any one of items 39-47, wherein the mutation is a nucleotide substitution, such as a C to T substitution, A to G substitution, G to A substitution, or T to C substitution.
  • Item 49 The method of any one of items 39-48, wherein the target nucleic acid region is located in a coding region of a protein, for example, the target nucleic acid region encodes a functionally related motif or domain of the protein.
  • Item 50 The method of Item 49, wherein the mutation in the target nucleic acid region causes an amino acid substitution in the amino acid sequence of the protein, preferably, the mutation causes a change in the function of the protein.
  • Item 51 The method of any one of items 39-50, wherein the target nucleic acid region is related to a trait of the cell or organism, for example, a mutation in the target nucleic acid region causes a change in the trait of the cell or organism.
  • Item 52 The method of any one of items 39-51, wherein in step iii), cells or organisms with mutations of interest are screened by screening for cells or organisms with changes in traits of interest; for example, the trait change of interest is selected from increased growth rate, increased yield, increased nutrient content, increased cold resistance, increased drought resistance, increased insect resistance, increased disease resistance, and increased herbicide resistance.
  • Item 53 The method of any one of items 39-52, wherein the cells are derived from mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle, cats; poultry such as chickens, ducks, and geese; or plants, including monocot plants and dicot plants, preferably crop plants, including but not limited to wheat, rice, corn, soybean, sunflower, sorghum, rape, alfalfa, cotton, barley, millet, sugar cane, tomato, tobacco, cassava and potato; preferably, the cell is a plant cell, more preferably a crop plant cell, more preferably a rice cell.
  • Item 54 The method of any one of items 39-52, wherein the organism is selected from mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle, cats; poultry such as chickens, ducks, and geese; or plants, including monocots and dicots, preferably crop plants, including but not limited to wheat, rice, corn, soybean, sunflower, sorghum, rape, alfalfa, cotton, barley, millet, sugarcane, tomato, tobacco, cassava and potato; preferably, the organism is a plant, more preferably a crop plant, more preferably rice.
  • Item 55 A method of treating a disease in a subject in need thereof, comprising delivering to the subject an effective amount of the base editing system of any one of items 17-22 to modify genes related to the disease.
  • Item 56 Use of the base editing system of any one of items 17-22 in the preparation of a pharmaceutical composition for treating a disease in a subject in need, wherein the base editing system is used to modify genes related to the disease.
  • Item 57 A pharmaceutical composition for treating a disease in a subject in need, comprising the base editing system of any one of items 17-22, and optionally a pharmaceutically acceptable carrier, wherein the base editing system is used to modify genes related to the disease.
  • Item 58 The method of item 55, the use of item 56 or the pharmaceutical composition of item 57, wherein the subject is a mammal, such as a human.
  • Item 59 The method of item 55, the use of item 56 or the pharmaceutical composition of item 57, wherein the disease is selected from the group consisting of tumor, inflammation, Parkinson's disease, cardiovascular disease, Alzheimer's disease, autism, drug addiction, age-related macular degeneration, schizophrenia, and genetic diseases.
  • Item 60 A kit comprising the base editing fusion protein of any one of items 1-16 and/or an expression construct containing a nucleotide sequence encoding the base editing fusion protein, or comprising the base editing system of any one of items 17-22.
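
Items 19, 20 and 42 to 44 describe guide RNAs whose target sequences carry a 3' PAM (5'-NGG-3' or 5'-NG-3'), lie on either strand, and together cover the target nucleic acid region. The Python sketch below shows one way such candidate sites could be enumerated. The 20-nt protospacer length, the function names and the example sequences are illustrative assumptions and are not taken from the application.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_targets(region: str, pam: str = "NGG", spacer_len: int = 20):
    """List (strand, start, protospacer, pam) for every site with a 3' PAM.

    `start` is the 0-based position of the protospacer on the given (+) strand.
    Only "N" is treated as a wildcard; spacer_len=20 is an assumed length.
    """
    region = region.upper()
    pam_re = re.compile(pam.upper().replace("N", "[ACGT]"))
    hits = []
    for strand, seq in (("+", region), ("-", reverse_complement(region))):
        for i in range(len(seq) - spacer_len - len(pam) + 1):
            pam_seq = seq[i + spacer_len:i + spacer_len + len(pam)]
            if pam_re.fullmatch(pam_seq):
                # Map minus-strand hits back to plus-strand coordinates.
                start = i if strand == "+" else len(seq) - i - spacer_len
                hits.append((strand, start, seq[i:i + spacer_len], pam_seq))
    return hits


def covered_fraction(region_len: int, hits, spacer_len: int = 20) -> float:
    """Fraction of region positions that fall inside at least one protospacer."""
    covered = set()
    for _strand, start, _spacer, _pam in hits:
        covered.update(range(start, start + spacer_len))
    return len(covered) / region_len
```

On the 23-nt toy sequence `"A" * 20 + "TGG"`, `find_targets(..., "NGG")` finds one site while `find_targets(..., "NG")` finds two overlapping sites, which illustrates why a relaxed PAM gives denser tiling of a region.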
  • STEME performs base editing through the fusion of cytidine and adenine deaminase.
  • FIG. 1 pOsU3-esgRNA expression vector.
  • the present invention preferably uses esgRNA.
  • Figure 3 Editing efficiency of STEME-1, STEME-2, STEME-3 and STEME-4 in protoplasts.
  • (a) Comparison of C>T base editing frequency between A3A-PBE and four STEME constructs (n = 3).
  • (b) Comparison of A>G base editing frequency between PABE-7 and four STEME constructs (n = 3).
  • Untreated protoplast samples were used as controls. The values and error bars reflect the mean ± s.e.m. of three independent biological replicates.
  • FIG. 1 Activity and product purity of STEME-1, STEME-2, STEME-3 and STEME-4 in rice protoplasts.
  • (a)-(f) are OsAAT, OsACC, OsCDC48, OsDEP1, OsEV and OsOD target sequences, respectively.
  • Figure 5 Comparison of base editing efficiency of STEME-1, STEME-5 and STEME-6 in rice protoplasts.
  • (a) The structure of STEME-5 and STEME-6. ecTadA7.10: evolved E. coli TadA; aa, amino acid; XTEN: 16 amino acid linker.
  • (b) Comparison of C>T base editing frequency of A3A-PBE, STEME-1, STEME-5 and STEME-6 constructs.
  • STEME-NG is used for saturation de novo mutation in rice protoplasts.
  • Figure 8. Design of NG PAM targets at different rice sites.
  • STEME-NG can widely edit target sequences with NGA, NGT, NGC or NGG PAM.
  • STEME-1 at NGG PAM targets and untreated protoplast samples were used as controls. The values and error bars reflect the mean ± s.e.m. of three independent biological replicates.
  • Figure 10 The activity and product purity of STEME-NG in rice protoplasts.
  • FIG. 11 The base editing efficiency of C>T and A>G in rice protoplasts by A3A-PBE-NG and PABE7-NG, respectively. Both A3A-PBE-NG (a) and PABE7-NG (b) can edit a broad range of NGA, NGT, NGC or NGG PAM targets. Untreated protoplast samples were used as controls. The values and error bars reflect the mean ± s.e.m. of three independent biological replicates.
  • FIG. 12 De novo saturation mutation of key domains of OsACC protein with STEME-NG.
  • (c) The heat map shows the conversion saturation of STEME-NG on the 56 amino acids involved in (b) in rice protoplasts. Statistics are collected for silent mutation, missense mutation and nonsense mutation.
  • FIG. 13 Saturation mutation of 168bp of the CT domain of OsACC by A3A-PBE-NG in rice protoplasts.
  • A3A-PBE-NG (a) and untreated control (b) are shown.
  • sgRNAs convert the same cytidine and guanosine
  • The values and error bars reflect the mean ± s.e.m. of three independent biological replicates.
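
The description of FIG. 12(c) tallies silent, missense and nonsense mutations across 56 amino acids, which corresponds to the 168bp region named in FIG. 13 (168 / 3 = 56 codons). The sketch below shows how a single-nucleotide substitution in a coding sequence can be sorted into those three classes. It assumes the standard genetic code and considers only the four substitutions listed in Item 48 (C to T, T to C, A to G, G to A); the function names and the example sequence are hypothetical and do not reproduce the applicants' analysis.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# The four substitutions named in Item 48.
TRANSITION = {"C": "T", "T": "C", "A": "G", "G": "A"}


def classify(ref_codon: str, alt_codon: str) -> str:
    """Return 'silent', 'nonsense' or 'missense' for a codon change."""
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "silent"
    if alt_aa == "*":
        return "nonsense"
    return "missense"


def tally_transitions(cds: str) -> dict:
    """Count the outcome of substituting each base of an in-frame sequence."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    counts = {"silent": 0, "missense": 0, "nonsense": 0}
    for pos, base in enumerate(cds):
        offset = pos % 3
        ref = cds[pos - offset:pos - offset + 3]
        alt = ref[:offset] + TRANSITION[base] + ref[offset + 1:]
        counts[classify(ref, alt)] += 1
    return counts
```

For the two-codon toy input `"CAGTGG"` (Gln-Trp), `tally_transitions` reports 1 silent, 2 missense and 3 nonsense outcomes, because CAG and TGG each sit one transition away from a stop codon.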
  • the term “and/or” encompasses all combinations of items connected by the term, and should be treated as if each combination has been individually listed herein.
  • “A and/or B” encompasses “A”, “A and B”, and “B”.
  • “A, B, and/or C” encompasses "A”, “B”, “C”, “A and B”, “A and C”, “B and C”, and "A and B and C”.
  • The protein or nucleic acid may consist of the sequence, or may have additional amino acids or nucleotides at one or both ends of the protein or nucleic acid, but still has the activity described in the present invention.
  • methionine encoded by the start codon at the N-terminus of the polypeptide will be retained under certain practical conditions (for example, when expressed in a specific expression system), but does not substantially affect the function of the polypeptide.
  • "Genome" as used herein not only covers chromosomal DNA present in the nucleus, but also includes organelle DNA present in the subcellular components of the cell (such as mitochondria, plastids).
  • "Genetically modified organism" or "genetically modified cell" means an organism or cell that contains exogenous polynucleotides or contains modified genes or expression control sequences in its genome.
  • exogenous polynucleotides can be stably integrated into the genome of organisms or cells, and inherited for successive generations.
  • the exogenous polynucleotide can be integrated into the genome alone or as part of a recombinant DNA construct.
  • A modified gene or expression control sequence is one in which the gene or expression control sequence in the genome of an organism or cell contains one or more deoxynucleotide substitutions, deletions or additions.
  • "Exogenous" in terms of sequence means a sequence from a foreign species or, if from the same species, a sequence whose composition and/or locus has been significantly altered from its natural form through deliberate human intervention.
  • The term "nucleic acid sequence" and related terms are used interchangeably and refer to single-stranded or double-stranded RNA or DNA polymers, optionally containing synthetic, non-natural or altered nucleotide bases.
  • Nucleotides are referred to by their single letter names as follows: "A" is adenosine or deoxyadenosine (respectively for RNA or DNA), "C" is cytidine or deoxycytidine, "G" is guanosine or deoxyguanosine, "U" means uridine, "T" means deoxythymidine, "R" means purine (A or G), "Y" means pyrimidine (C or T), "K" means G or T, "H" means A or C or T, "D" means A, T or G, "I" means inosine, and "N" means any nucleotide.
  • "Polypeptide", "peptide", and "protein" are used interchangeably in the present invention and refer to a polymer of amino acid residues.
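
The single-letter codes above can be treated as sets of allowed bases, which is how degenerate motifs such as the PAM notation in Item 19 are usually read. The sketch below encodes only the DNA codes listed in this definition; "U" and "I" (inosine) are left out because they are not ambiguity codes for DNA bases. The dictionary and function names are illustrative assumptions.

```python
from itertools import product

# Ambiguity codes exactly as defined in the paragraph above (DNA alphabet).
AMBIGUITY = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "K": "GT",
    "H": "ACT",
    "D": "AGT",
    "N": "ACGT",  # any nucleotide
}


def matches(pattern: str, seq: str) -> bool:
    """True if `seq` is one of the concrete sequences described by `pattern`."""
    if len(pattern) != len(seq):
        return False
    return all(b in AMBIGUITY[p] for p, b in zip(pattern.upper(), seq.upper()))


def expand(pattern: str) -> list:
    """Enumerate every concrete DNA sequence described by `pattern`."""
    return ["".join(p) for p in product(*(AMBIGUITY[c] for c in pattern.upper()))]
```

For example, `expand("NG")` returns `["AG", "CG", "GG", "TG"]`, and `matches("NGG", "TGG")` is true.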
  • the term applies to amino acid polymers in which one or more amino acid residues are artificial chemical analogs of the corresponding naturally occurring amino acids, as well as to naturally occurring amino acid polymers.
  • The terms "polypeptide", "peptide", "amino acid sequence" and "protein" may also include modified forms, including but not limited to glycosylation, lipid linkage, sulfation, gamma carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.
  • expression construct refers to a vector, such as a recombinant vector, suitable for expression of a nucleotide sequence of interest in an organism.
  • “Expression” refers to the production of a functional product.
  • the expression of a nucleotide sequence may refer to the transcription of the nucleotide sequence (such as transcription to generate mRNA or functional RNA) and/or the translation of RNA into a precursor or mature protein.
  • the "expression construct" of the present invention can be a linear nucleic acid fragment, a circular plasmid, a viral vector, or, in some embodiments, can be RNA (such as mRNA) that can be translated, for example, RNA generated by in vitro transcription.
  • the "expression construct" of the present invention may contain regulatory sequences and nucleotide sequences of interest from different sources, or regulatory sequences and nucleotide sequences of interest from the same source but arranged in a way different from those normally occurring in nature.
  • "Regulatory sequence" and "regulatory element" can be used interchangeably and refer to nucleotide sequences located upstream (5' non-coding sequence), within, or downstream (3' non-coding sequence) of a coding sequence, and that affect the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include, but are not limited to, promoters, translation leader sequences, introns, and polyadenylation recognition sequences.
  • "Promoter" refers to a nucleic acid fragment capable of controlling the transcription of another nucleic acid fragment.
  • the promoter is a promoter capable of controlling gene transcription in a cell, regardless of whether it is derived from the cell.
  • the promoter can be a constitutive promoter or a tissue-specific promoter or a developmentally regulated promoter or an inducible promoter.
  • "Tissue-specific promoter" and "tissue-preferred promoter" are used interchangeably, and refer to a promoter that is expressed mainly, but not necessarily exclusively, in a tissue or organ, and that can also be expressed in a specific cell or cell type.
  • "Developmentally regulated promoter" refers to a promoter whose activity is determined by developmental events.
  • inducible promoters selectively express operably linked DNA sequences in response to endogenous or exogenous stimuli (environment, hormones, chemical signals, etc.).
  • promoters include, but are not limited to, polymerase (pol) I, pol II, or pol III promoters.
  • pol I promoters include chicken RNA pol I promoter.
  • pol II promoters include, but are not limited to, cytomegalovirus immediate early (CMV) promoter, Rous sarcoma virus long terminal repeat (RSV-LTR) promoter, and simian virus 40 (SV40) immediate early promoter.
  • pol III promoters include U6 and H1 promoters.
  • An inducible promoter such as a metallothionein promoter can be used.
  • Promoters include T7 phage promoter, T3 phage promoter, β-galactosidase promoter, and Sp6 phage promoter.
  • the promoter may be cauliflower mosaic virus 35S promoter, corn Ubi-1 promoter, wheat U6 promoter, rice U3 promoter, corn U3 promoter, rice actin promoter.
  • "Operably linked" refers to the connection of regulatory elements (for example, but not limited to, promoter sequences, transcription termination sequences, etc.) to nucleic acid sequences (for example, coding sequences or open reading frames) such that the transcription of the nucleotide sequence is controlled and regulated by the transcription control element.
  • Introducing a nucleic acid molecule (such as a plasmid, linear nucleic acid fragment, RNA, etc.) or protein into an organism refers to transforming the cell of the organism with the nucleic acid or protein so that the nucleic acid or protein can function in the cell.
  • the "transformation” used in the present invention includes stable transformation and transient transformation.
  • Stable transformation refers to the introduction of exogenous nucleotide sequences into the genome, resulting in stable inheritance of the exogenous gene. Once stably transformed, the exogenous nucleic acid sequence is stably integrated into the genome of the organism and any successive generations thereof.
  • Transient transformation refers to the introduction of nucleic acid molecules or proteins into cells to perform functions without stable inheritance of foreign genes. In transient transformation, the foreign nucleic acid sequence is not integrated into the genome.
  • “Traits” refer to the physiological, morphological, biochemical or physical characteristics of cells or organisms.
  • “Agronomic traits” especially refer to the measurable index parameters of crop plants, including but not limited to: leaf greenness, grain yield, growth rate, total biomass or accumulation rate, fresh weight at maturity, dry weight at maturity, fruit yield, seed yield, plant total nitrogen content, fruit nitrogen content, seed nitrogen content, plant vegetative tissue nitrogen content, plant total free amino acid content, fruit free amino acid content, seed free amino acid content, plant vegetative tissue free amino acid content, plant total protein content, fruit protein content, seed protein content, plant vegetative tissue protein content, herbicide resistance, drought resistance, nitrogen absorption, root lodging, harvest index, stem lodging, plant height, ear height, ear length, disease resistance, cold resistance, salt resistance and tiller number.
  • the present invention provides a base editing fusion protein comprising a nucleic acid targeting domain, a cytosine deamination domain, and an adenine deamination domain.
  • base editing fusion protein and “base editor” are used interchangeably, and refer to proteins that can mediate the substitution of one or more nucleotides of a target sequence in the genome in a sequence-specific manner.
  • nucleic acid targeting domain refers to a domain capable of mediating the attachment of the base editing fusion protein to a specific target sequence in the genome in a sequence-specific manner (for example, through a guide RNA).
  • the nucleic acid targeting domain comprises at least one (e.g., one) CRISPR effector polypeptide.
  • CRISPR effector protein generally refers to a nuclease (CRISPR nuclease) or a functional variant thereof found in a naturally occurring CRISPR system.
  • the term covers any effector protein based on the CRISPR system that can achieve sequence-specific targeting in cells.
  • a "functional variant" in terms of CRISPR nuclease means that it retains at least the guide RNA-mediated sequence-specific targeting ability.
  • the functional variant is a nuclease-inactivated variant, that is, it lacks double-stranded nucleic acid cleavage activity.
  • CRISPR nucleases lacking double-stranded nucleic acid cleavage activity also encompass nickases, which form nicks in double-stranded nucleic acid molecules, but do not completely cut double-stranded nucleic acids.
  • the CRISPR effector protein of the present invention has nickase activity.
  • the functional variant recognizes a different PAM (protospacer adjacent motif) sequence relative to the wild-type nuclease.
  • the "CRISPR effector protein” can be derived from Cas9 nuclease, including Cas9 nuclease or functional variants thereof.
  • the Cas9 nuclease may be a Cas9 nuclease from different species, such as SpCas9 from S. pyogenes or SaCas9 from S. aureus.
  • Cas9 nuclease and Cas9 are used interchangeably herein, and refer to an RNA-guided nuclease comprising the Cas9 protein or a fragment thereof (for example, a protein containing the active DNA cleavage domain of Cas9 and/or the gRNA binding domain of Cas9).
  • Cas9 is a component of the CRISPR/Cas (clustered regularly interspaced short palindromic repeats and associated systems) genome editing system, which can target and cut DNA target sequences under the guidance of a guide RNA to form DNA double-strand breaks (DSBs).
  • An exemplary amino acid sequence of wild-type spCas9 is shown in SEQ ID NO:1.
  • CRISPR effector protein can also be derived from Cpf1 nuclease, including Cpf1 nuclease or functional variants thereof.
  • the Cpf1 nuclease may be Cpf1 nuclease from different species, for example, Cpf1 nuclease from Francisella novicida U112, Acidaminococcus sp. BV3L6 and Lachnospiraceae bacterium ND2006.
  • CRISPR effector proteins can also be derived from Cas3, Cas8a, Cas5, Cas8b, Cas8c, Cas10d, Cse1, Cse2, Csy1, Csy2, Csy3, GSU0054, Cas10, Csm2, Cmr5, Cas10, Csx11, Csx10, Csf1, Csn2, Cas4, C2c1, C2c3 or C2c2 nucleases, for example, including these nucleases or functional variants thereof.
  • the CRISPR effector protein is nuclease-inactivated Cas9.
  • the DNA cleavage domain of Cas9 nuclease is known to contain two subdomains: HNH nuclease subdomain and RuvC subdomain.
  • the HNH subdomain cleaves the strand complementary to gRNA, while the RuvC subdomain cleaves the non-complementary strand. Mutations in these subdomains can inactivate the nuclease activity of Cas9, forming "nuclease-inactivated Cas9".
  • the nuclease-inactivated Cas9 still retains the DNA binding ability guided by gRNA.
  • the nuclease-inactivated Cas9 of the present invention can be derived from Cas9 of different species, for example, from S. pyogenes Cas9 (SpCas9) or from Staphylococcus aureus (S. aureus) Cas9 (SaCas9). Simultaneously mutating the HNH nuclease subdomain and the RuvC subdomain of Cas9 (for example, by introducing the mutations D10A and H840A) abolishes the nuclease activity of Cas9, yielding nuclease-dead Cas9 (dCas9). Mutating and inactivating only one of the subdomains gives Cas9 nickase activity, that is, yields a Cas9 nickase (nCas9), for example, nCas9 carrying only the D10A mutation.
  • the nuclease-inactivated Cas9 variant of the present invention contains amino acid substitution D10A and/or H840A relative to wild-type Cas9, wherein the amino acid number refers to SEQ ID NO:1.
  • the nuclease-inactivated Cas9 contains the amino acid substitution D10A relative to the wild-type Cas9, wherein the amino acid number refers to SEQ ID NO:1.
  • the nuclease-inactivated Cas9 comprises the amino acid sequence shown in SEQ ID NO: 2 (nCas9(D10A)).
  • When Cas9 nuclease is used for gene editing, it usually requires the target sequence to have a 5'-NGG-3' PAM (protospacer adjacent motif) sequence at the 3' end.
  • CRISPR effector proteins that recognize different PAM sequences are preferably used in the present invention, for example, functional variants of Cas9 nuclease with different PAM sequences.
  • the CRISPR effector protein is a Cas9 variant that recognizes the PAM sequence 5'-NG-3'.
  • the Cas9 variant that recognizes the PAM sequence 5'-NG-3' is also referred to herein as Cas9-NG.
  • Cas9-NG includes the following amino acid substitutions R1335V, L1111R, D1135V, G1218R, E1219F, A1322R, T1337R relative to wild-type Cas9, wherein the amino acid number refers to SEQ ID NO:1.
  • the CRISPR effector protein is a nuclease-inactivated Cas9 variant that recognizes the PAM sequence 5'-NG-3'.
  • the nuclease-inactivated Cas9 variant that recognizes the PAM sequence 5'-NG-3' contains the following amino acid substitutions relative to wild-type Cas9: D10A, R1335V, L1111R, D1135V, G1218R, E1219F, A1322R, T1337R, wherein the amino acid number refers to SEQ ID NO:1.
  • the nuclease-inactivated Cas9 variant that recognizes the PAM sequence 5'-NG-3' comprises the amino acid sequence shown in SEQ ID NO: 3 (nCas9-NG(D10A)).
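  • The substitution notation used above (wild-type residue, 1-based position, replacement residue, e.g. D10A) can be applied mechanically to a reference sequence. The following is a minimal Python sketch under stated assumptions: the helper name `apply_substitutions` and the 13-residue toy sequence are hypothetical illustrations, and the toy sequence is not SEQ ID NO:1.

```python
import re

def apply_substitutions(protein: str, substitutions: list[str]) -> str:
    """Apply substitutions written as <wild-type residue><1-based position><new residue>."""
    residues = list(protein)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub}")
        wild_type = match.group(1)
        position = int(match.group(2))
        replacement = match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{sub}: position outside the sequence")
        # Refuse to edit if the reference does not carry the stated wild-type residue.
        if residues[position - 1] != wild_type:
            raise ValueError(f"{sub}: reference has {residues[position - 1]} at {position}")
        residues[position - 1] = replacement
    return "".join(residues)

# Hypothetical toy sequence with D at position 10 (for illustration only).
toy_reference = "MDKKYSIGLDIGT"
toy_nickase = apply_substitutions(toy_reference, ["D10A"])
```

  • Checking the wild-type residue before substituting guards against numbering a mutation against the wrong reference sequence.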
  • cytosine deamination domain refers to a domain that can accept single-stranded DNA as a substrate and catalyze the deamination of cytidine or deoxycytidine into uridine or deoxyuridine, respectively.
  • the cytosine deaminase domain comprises at least one (eg, one or two) cytosine deaminase polypeptides.
  • the cytosine deamination domain in the fusion protein can deaminate cytidine in the single-stranded DNA produced during formation of the fusion protein-guide RNA-DNA complex, converting it into U, and a C to T base substitution is then achieved through base mismatch repair.
  • cytosine deaminase examples include, but are not limited to, APOBEC1 deaminase, activation-induced cytidine deaminase (AID), APOBEC3G, CDA1, human APOBEC3A deaminase, or their functional variants.
  • the cytosine deaminase is human APOBEC3A deaminase or a functional variant thereof.
  • the human APOBEC3A deaminase comprises the amino acid sequence shown in SEQ ID NO:4.
  • adenine deamination domain refers to a domain that can accept single-stranded DNA as a substrate and catalyze the formation of inosine (I) from adenosine or deoxyadenosine (A).
  • the adenine deaminase domain comprises at least one (eg, one) DNA-dependent adenine deaminase polypeptide.
  • the adenine deamination domain in the fusion protein can deaminate adenosine in the single-stranded DNA generated during formation of the CRISPR effector protein-guide RNA-DNA complex, converting it into inosine (I). Because DNA polymerase treats inosine (I) as guanine (G), an A to G substitution can be achieved through base mismatch repair.
  • the DNA-dependent adenine deaminase is a variant of E. coli tRNA adenine deaminase TadA (ecTadA).
  • An exemplary wild-type ecTadA amino acid sequence is shown in SEQ ID NO: 5.
  • the wild-type ecTadA amino acid sequence may not include the N-terminal methionine in SEQ ID NO: 5.
  • the DNA-dependent adenine deaminase comprises one or more sets of mutations selected from the following relative to wild-type ecTadA:
  • the amino acid number refers to SEQ ID NO: 5.
  • the DNA-dependent adenine deaminase contains the following mutations relative to wild-type ecTadA: W23R, H36L, R51L, S146C, K157N, A106V, D108N, P48A, L84F, H123Y, I156F, D147Y, E155V and R152P, wherein the amino acid number refers to SEQ ID NO: 5.
  • the DNA-dependent adenine deaminase comprises the amino acid sequence shown in SEQ ID NO:6.
  • Because E. coli tRNA adenine deaminase usually functions as a dimer, it is expected that dimer formation between two DNA-dependent adenine deaminases, or between a DNA-dependent adenine deaminase and a wild-type adenine deaminase, can significantly increase the A to G editing activity of the fusion protein.
  • the adenine deaminase domain comprises two of the DNA-dependent adenine deaminase.
  • the adenine deaminase domain further comprises a corresponding wild-type adenine deaminase (e.g., E. coli tRNA adenine deaminase TadA) fused to the DNA-dependent adenine deaminase (such as a DNA-dependent variant of E. coli tRNA adenine deaminase TadA).
  • the DNA-dependent adenine deaminase (such as a DNA-dependent variant of E. coli tRNA adenine deaminase TadA) is fused to the C-terminus of the corresponding wild-type adenine deaminase (such as E. coli tRNA adenine deaminase TadA).
  • the adenine deamination domain comprises the amino acid sequence shown in SEQ ID NO: 7 or 8.
  • the nucleic acid targeting domain, the cytosine deamination domain and the adenine deamination domain are fused via a linker.
  • linkers can be non-functional amino acid sequences, without secondary or higher-order structure, that are 1-50 (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 20-25 or 25-50) or more amino acids in length.
  • the linker may be a flexible linker, such as GGGGS, GS, GAP, (GGGGS)x3, GGS and (GGS)x7, etc.
  • the linker is 32 amino acids long, for example, the linker comprises the amino acid sequence shown in SEQ ID NO:9.
  • the linker is 48 amino acids long, for example, the linker comprises the amino acid sequence shown in SEQ ID NO:10.
  • the linker is an XTEN linker comprising the amino acid sequence shown in SEQ ID NO: 11.
  • the base editing fusion protein comprises, in the following order from the N-terminus to the C-terminus: a cytosine deamination domain, an adenine deamination domain, and a nucleic acid targeting domain. In some embodiments, the base editing fusion protein comprises, in the following order from the N-terminus to the C-terminus: an adenine deamination domain, a cytosine deamination domain, and a nucleic acid targeting domain.
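  • As a rough illustration of the domain orders and flexible linkers described above, the sketch below joins domains from the N-terminus to the C-terminus with a linker written in the repeat notation used in this text, such as (GGGGS)x3. The function names and the short placeholder domain strings are hypothetical; real domain sequences would come from the SEQ ID NOs, which are not reproduced here.

```python
import re

def expand_linker(notation: str) -> str:
    """Expand repeat notation such as '(GGGGS)x3'; plain sequences are returned unchanged."""
    match = re.fullmatch(r"\(([A-Z]+)\)x(\d+)", notation)
    if match:
        return match.group(1) * int(match.group(2))
    return notation

def assemble_fusion(domains: list[str], linker: str) -> str:
    """Join domain sequences N-terminus to C-terminus with the same linker between each pair."""
    return expand_linker(linker).join(domains)

# Hypothetical placeholder sequences standing in for the three domains.
cytosine_deaminase = "MCCC"
adenine_deaminase = "MAAA"
targeting_domain = "MTTT"

# Order: cytosine deamination domain, adenine deamination domain, nucleic acid targeting domain.
fusion = assemble_fusion(
    [cytosine_deaminase, adenine_deaminase, targeting_domain], "(GGS)x2"
)
```

  • Swapping the first two list entries gives the alternative order in which the adenine deamination domain is N-terminal.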
  • uracil DNA glycosylase catalyzes the removal of U from DNA and initiates base excision repair (BER), resulting in the repair of U:G to C:G. Therefore, without being limited by any theory, the combination of the base editing fusion protein of the present invention and Uracil DNA Glycosylase Inhibitor (UGI) will increase the efficiency of C to T base editing.
  • the base editing fusion protein is co-expressed with uracil DNA glycosylase inhibitor (UGI).
  • the base editing fusion protein further comprises Uracil DNA Glycosylase Inhibitor (UGI).
  • UGI is connected to other parts of the base editing fusion protein through a linker.
  • UGI is connected to other parts of the base editing fusion protein through a "self-cleaving peptide”.
  • self-cleaving peptide means a peptide that can achieve self-cleavage within a cell.
  • the self-cleaving peptide may include a protease recognition site, so that it can be recognized and specifically cleaved by the protease in the cell.
  • the self-cleaving peptide may be a 2A polypeptide.
  • the 2A polypeptide is a type of short peptide derived from viruses, and its self-cleavage occurs during translation. When a 2A polypeptide is used to connect two different target polypeptides expressed in the same reading frame, the two target polypeptides are produced at a ratio of almost 1:1.
  • 2A polypeptides can be P2A from porcine teschovirus-1, T2A from Thosea asigna virus, E2A from equine rhinitis A virus, and F2A from foot-and-mouth disease virus.
  • a variety of functional variants of these 2A polypeptides are also known in the art, and these variants can also be used in the present invention.
  • the self-cleavable peptide does not exist between or within the nucleic acid targeting domain, the cytosine deamination domain and the adenine deamination domain.
  • UGI is located at the N-terminus or C-terminus of the base editing fusion protein, preferably the C-terminus.
  • the uracil DNA glycosylase inhibitor comprises the amino acid sequence shown in SEQ ID NO: 12.
  • the fusion protein of the present invention may also include a nuclear localization sequence (NLS).
  • one or more NLS in the fusion protein should have sufficient strength to drive the accumulation of the fusion protein in an amount that can achieve its base editing function in the nucleus of the cell.
  • the strength of nuclear localization activity is determined by the number and location of NLS in the fusion protein, one or more specific NLS used, or a combination of these factors.
  • the NLS of the fusion protein of the present invention may be located at the N-terminal and/or C-terminal. In some embodiments of the present invention, the NLS of the fusion protein of the present invention may be located between the adenine deamination domain, cytosine deamination domain, nucleic acid targeting domain and/or UGI. In some embodiments, the fusion protein comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLS. In some embodiments, the fusion protein comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLS at or near the N-terminus. In some embodiments, the fusion protein comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLS at or near the C-terminus.
  • the polypeptide includes a combination of these, such as one or more NLS at the N-terminus and one or more NLS at the C-terminus. When there is more than one NLS, each one can be selected as not dependent on the other NLS.
  • NLS consists of one or more short sequences of positively charged lysine or arginine exposed on the surface of the protein, but other types of NLS are also known.
  • Non-limiting examples of NLS include: KKRKV, PKKKRKV, or KRPAATKKAGQAKKKK.
  • the fusion protein of the present invention may also include other localization sequences, such as cytoplasmic localization sequences, chloroplast localization sequences, mitochondrial localization sequences and the like.
  • the base editing fusion protein comprises the amino acid sequence shown in any one of SEQ ID NO: 13-19.
  • the present invention provides a base editing system for modifying target nucleic acid regions in the genome, which comprises:
  • the at least one guide RNA is directed to at least one target sequence in the target nucleic acid region.
  • base editing system refers to a combination of components required for base editing of the genome of a cell or organism.
  • the various components of the system such as base editing fusion protein, one or more guide RNAs, can exist independently of each other, or can exist in any combination as a composition.
  • guide RNA and “gRNA” are used interchangeably, and refer to RNA molecules that can form a complex with the CRISPR effector protein and can target the complex to the target sequence because they share a certain identity with the target sequence.
  • the guide RNA targets the target sequence by base pairing with the complementary strand of the target sequence.
  • the gRNA used by Cas9 nuclease or its functional variants is usually composed of crRNA and tracrRNA molecules that are partially complementary and form a complex, wherein the crRNA contains a guide sequence (also called a seed sequence) that has sufficient identity with the target sequence to hybridize with the complementary strand of the target sequence and direct the CRISPR complex (Cas9+crRNA+tracrRNA) to bind specifically to the target sequence.
  • sgRNA refers to single guide RNA.
  • the gRNA used by Cpf1 nuclease or its functional variants is usually composed of mature crRNA molecules only, which can also be called sgRNA. Designing a suitable gRNA based on the CRISPR nuclease used and the target sequence to be edited is within the abilities of those skilled in the art.
  • the sequence of sgRNA may include the following scaffold sequence:
  • the target sequence targeted by the guide RNA usually contains 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30, preferably 20, nucleotides.
  • the guide sequence in the guide RNA usually includes 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30, preferably 20, nucleotides.
  • the target sequence targeted by the guide RNA also requires a PAM sequence at one end, such as the 3' end, to be recognized by the CRISPR effector protein (or fusion protein)-guide RNA complex.
  • the location, type, and length of the PAM required for the target sequence depend on the CRISPR effector protein used.
  • the PAM sequence is 5'-NGG-3' located at the 3' end of the target sequence.
  • the PAM sequence is 5'-NG-3' located at the 3' end of the target sequence.
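  • The PAM requirement described above can be illustrated with a short scan for candidate target sequences. This is a minimal Python sketch, assuming a 20-nucleotide target sequence immediately followed at its 3' end by the PAM on the same strand; the function name `find_targets` and the example DNA string are hypothetical, and only the strand given as input is scanned.

```python
import re

# Only the IUPAC codes needed for the PAMs named in the text (N = any base).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}

def find_targets(dna: str, pam: str = "NGG", target_len: int = 20):
    """Return (start, target, pam) for each target followed at its 3' end by the PAM.

    `start` is the 0-based index of the first target nucleotide in `dna`.
    """
    dna = dna.upper()
    pam_regex = re.compile("".join(IUPAC[base] for base in pam.upper()))
    hits = []
    for start in range(len(dna) - target_len - len(pam) + 1):
        pam_start = start + target_len
        pam_end = pam_start + len(pam)
        if pam_regex.fullmatch(dna, pam_start, pam_end):
            hits.append((start, dna[start:pam_start], dna[pam_start:pam_end]))
    return hits

# Hypothetical example: the same site is usable with an NG PAM but not with NGG.
example = "A" * 20 + "TGA"
ngg_hits = find_targets(example, "NGG")
ng_hits = find_targets(example, "NG")
```

  • Relaxing the PAM from NGG to NG increases the number of candidate target sequences in a given region, which is the practical motivation for the Cas9-NG variant.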
  • the base editing fusion protein and the guide RNA can form a complex, and the complex specifically targets the target sequence under the guidance of the guide RNA, and causes one or more Cs to be replaced by T and/or one or more As to be replaced by G in the target sequence.
  • the C to T base editing window of the base editing fusion protein of the present invention is located at positions 1-17 of the target sequence. That is to say, the base editing fusion protein of the present invention can cause one or more Cs within positions 1-17, counted from the 5' end of the target sequence, to be replaced by T.
  • the A to G base editing window of the base editing fusion protein of the present invention is located at positions 4-8 of the target sequence. That is to say, the base editing fusion protein of the present invention can cause one or more As within positions 4-8, counted from the 5' end of the target sequence, to be replaced by G.
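  • The two editing windows stated above can be turned into a simple lookup of which bases in a target sequence are candidates for editing. The sketch below is illustrative only: positions are counted from 1 at the 5' end of the target sequence, the function names and the example sequence are hypothetical, and `fully_edited` shows the outcome if every candidate base were converted, whereas real editing typically yields a mixture of partially edited products.

```python
C_TO_T_WINDOW = range(1, 18)  # positions 1-17 of the target sequence
A_TO_G_WINDOW = range(4, 9)   # positions 4-8 of the target sequence

def editable_positions(target: str):
    """Return (C positions in the C-to-T window, A positions in the A-to-G window), 1-based."""
    target = target.upper()
    c_sites = [p for p in C_TO_T_WINDOW if p <= len(target) and target[p - 1] == "C"]
    a_sites = [p for p in A_TO_G_WINDOW if p <= len(target) and target[p - 1] == "A"]
    return c_sites, a_sites

def fully_edited(target: str) -> str:
    """Convert every candidate C to T and every candidate A to G (an upper bound on editing)."""
    c_sites, a_sites = editable_positions(target)
    bases = list(target.upper())
    for p in c_sites:
        bases[p - 1] = "T"
    for p in a_sites:
        bases[p - 1] = "G"
    return "".join(bases)

# Hypothetical 20-nt target sequence.
example_target = "ACGTACGTACGTACGTACGT"
c_sites, a_sites = editable_positions(example_target)
```

  • For the example sequence, the C at position 18 lies outside the C to T window and the A at position 9 lies outside the A to G window, so both are left unchanged.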
  • the at least one guide RNA may be directed to a target sequence located on the sense strand (such as the protein coding strand) and/or the antisense strand within the target nucleic acid region of the genome.
  • the base editing composition of the present invention can cause one or more Cs in the target sequence on the sense strand (e.g., protein coding strand) to be replaced by T and/or one or more As to be replaced by G.
  • the base editing composition of the present invention can cause one or more Gs in the target sequence on the sense strand (for example, protein coding strand) to be replaced by A and/or one or more Ts to be replaced by C.
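  • The relationship between the two preceding statements follows from base complementarity: a C to T or A to G change made on the antisense strand reads as G to A or T to C on the sense strand. A minimal Python sketch of that bookkeeping is shown below; the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def substitution_on_sense(ref: str, alt: str, target_on_sense: bool):
    """Express a single-base edit as it reads on the sense strand.

    If the edited target sequence lies on the antisense strand, both the
    original and the new base are complemented.
    """
    if target_on_sense:
        return ref, alt
    return ref.translate(COMPLEMENT), alt.translate(COMPLEMENT)

# C-to-T and A-to-G edits made on the antisense strand, read on the sense strand.
c_edit = substitution_on_sense("C", "T", target_on_sense=False)
a_edit = substitution_on_sense("A", "G", target_on_sense=False)
```

  • Using guide RNAs against both strands therefore makes all four transition substitutions (C to T, A to G, G to A, T to C) accessible on the sense strand.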
  • the nucleotide sequence encoding the base editing fusion protein is codon optimized for the organism whose genome is to be modified.
  • Codon optimization refers to modifying a nucleic acid sequence to enhance expression in the host cell of interest by replacing at least one codon (e.g., about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50 or more codons) of the natural sequence with a codon that is used more frequently or most frequently in the genes of the host cell, while maintaining the natural amino acid sequence.
  • Codon preference (the difference in codon usage between organisms) is often related to the translation efficiency of messenger RNA (mRNA), and the translation efficiency is considered to depend on, among other factors, the nature of the codon being translated and the availability of the corresponding transfer RNA (tRNA) molecules.
  • genes can therefore be tailored for optimal gene expression in a given organism based on codon optimization. Codon usage tables are readily available, for example in the "Codon Usage Database" available at www.kazusa.orjp/codon/ , and these tables can be adapted in different ways. See Nakamura Y. et al., "Codon usage tabulated from the international DNA sequence databases: status for the year 2000", Nucl. Acids Res., 28:292 (2000).
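  • The idea of codon optimization described above can be sketched as a synonymous codon replacement that leaves the encoded amino acid sequence unchanged. In the Python sketch below, the codon-to-amino-acid entries are a small subset of the standard genetic code, while the `PREFERRED` table is a hypothetical illustration and is not taken from any real codon usage table; a real application would derive it from a usage table for the host organism.

```python
# Subset of the standard genetic code, enough for the example below.
CODON_TO_AA = {
    "ATG": "M",
    "AAA": "K", "AAG": "K",
    "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L", "TTA": "L", "TTG": "L",
    "TAA": "*", "TAG": "*", "TGA": "*",
}

# Hypothetical preferred codon per amino acid (illustrative values only).
PREFERRED = {"M": "ATG", "K": "AAG", "L": "CTC", "*": "TGA"}

def translate(cds: str) -> str:
    """Translate a coding sequence using the partial table above."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))

def codon_optimize(cds: str) -> str:
    """Replace each codon with the preferred synonymous codon for its amino acid."""
    return "".join(PREFERRED[aa] for aa in translate(cds))

original = "ATGAAATTATAA"          # encodes M, K, L, stop
optimized = codon_optimize(original)
```

  • The key invariant is that `translate(optimized)` equals `translate(original)`: only the nucleotide sequence changes, never the protein.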
  • the base editing fusion protein of the present invention is encoded by any one of the nucleotide sequences shown in SEQ ID NO: 20-26.
  • Organisms whose genome can be modified by the base editing system of the present invention include any organisms suitable for base editing, preferably eukaryotes.
  • organisms include, but are not limited to, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle, and cats; poultry such as chickens, ducks, and geese; plants, including monocots and dicots
  • the plant is a crop plant, including but not limited to wheat, rice, corn, soybean, sunflower, sorghum, rape, alfalfa, cotton, barley, millet, sugarcane, tomato, tobacco, cassava, and potato.
  • the organism is a plant. More preferably, the organism is rice.
  • the present invention also provides a method for producing at least one genetically modified cell, comprising introducing the base editing system of the present invention into at least one cell, thereby causing one or more nucleotide substitutions within a target nucleic acid region in the at least one cell.
  • the method further includes the step of selecting cells having the desired one or more nucleotide substitutions from the at least one cell.
  • the methods of the invention are performed in vitro.
  • the cell is an isolated cell, or a cell in an isolated tissue or organ.
  • the present invention also provides a genetically modified organism, which comprises a genetically modified cell or its progeny cells produced by the method of the present invention.
  • the genetically modified cell or its progeny cells have the desired one or more nucleotide substitutions.
  • the target nucleic acid region to be modified can be located anywhere in the genome, for example, within a functional gene such as a protein coding gene, or, for example, in a gene expression regulatory region such as a promoter region or an enhancer region, so as to achieve the modification of gene function or the modification of gene expression.
  • the desired nucleotide substitution results in a desired gene function modification or gene expression modification.
  • the target nucleic acid region is related to the trait of the cell or organism. In some embodiments, the mutation in the target nucleic acid region causes a change in the trait of the cell or organism. In some embodiments, the target nucleic acid region is located in the coding region of the protein. In some embodiments, the target nucleic acid region encodes a functionally related motif or domain of the protein. In some preferred embodiments, one or more nucleotide substitutions in the target nucleic acid region result in amino acid substitutions in the amino acid sequence of the protein. In some embodiments, the one or more nucleotide substitutions result in a change in the function of the protein.
  • the base editing system can be introduced into cells by various methods well known to those skilled in the art.
  • Methods that can be used to introduce the base editing system of the present invention into cells include, but are not limited to: calcium phosphate transfection, protoplast fusion, electroporation, liposome transfection, microinjection, viral infection (such as baculovirus, vaccinia virus, adenovirus, adeno-associated virus, lentivirus and other viruses), particle bombardment (gene gun), PEG-mediated transformation of protoplasts, and Agrobacterium-mediated transformation.
  • Cells that can be base-edited by the method of the present invention can be derived from, for example, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle, and cats; poultry such as chickens, ducks, and geese; plants, including monocotyledonous plants and dicotyledonous plants, preferably crop plants, including but not limited to wheat, rice, corn, soybean, sunflower, sorghum, rape, alfalfa, cotton, barley, millet, sugarcane, tomato, tobacco, cassava and potato.
  • the base editing fusion protein, the base editing system and the method for producing genetically modified cells of the present invention are particularly suitable for genetic modification of plants.
  • the plant is a crop plant, including but not limited to wheat, rice, corn, soybean, sunflower, sorghum, rape, alfalfa, cotton, barley, millet, sugarcane, tomato, tobacco, cassava and potato. More preferably, the plant is rice.
  • the present invention provides a method for producing a genetically modified plant, comprising introducing the base editing system of the present invention into at least one plant, thereby causing one or more nucleotide substitutions within a target nucleic acid region in the genome of the at least one plant.
  • the method further includes screening the at least one plant for plants having the desired one or more nucleotide substitutions.
  • the base editing composition can be introduced into plants by various methods well known to those skilled in the art.
  • Methods that can be used to introduce the base editing system of the present invention into plants include, but are not limited to: particle bombardment (gene gun), PEG-mediated transformation of protoplasts, Agrobacterium-mediated transformation, plant virus-mediated transformation, the pollen tube pathway method, and ovary injection.
  • the base editing composition is introduced into the plant by transient transformation.
  • the target sequence can be modified by introducing the base editing fusion protein and guide RNA into plant cells, or producing them there, and the modification can be inherited stably, without the need to stably transform the plant with exogenous polynucleotides encoding the components of the base editing system. This avoids the potential off-target effects of a stably present (continuously produced) base editing composition, and also avoids the integration of foreign nucleotide sequences in the plant genome, thereby providing higher biological safety.
  • the introduction is carried out in the absence of selective pressure, so as to avoid the integration of foreign nucleotide sequences in the plant genome.
  • the introduction includes transforming the base editing system of the present invention into an isolated plant cell or tissue, and then regenerating the transformed plant cell or tissue into a whole plant.
  • the regeneration is performed in the absence of selective pressure, that is, no selective agent for the selectable marker gene carried on the expression vector is used in the tissue culture process.
  • omitting the selection agent can improve the regeneration efficiency of plants and yield modified plants free of exogenous nucleotide sequences.
  • the base editing system of the present invention can be transformed to a specific part of the whole plant, such as leaves, stem tips, pollen tubes, young ears or hypocotyls. This is particularly suitable for the transformation of plants that are difficult to undergo tissue culture regeneration.
  • the protein expressed in vitro and/or the RNA molecule transcribed in vitro (for example, the expression construct is an RNA molecule transcribed in vitro) is directly transformed into the plant.
  • the protein and/or RNA molecule can realize base editing in plant cells and then be degraded by the cell, avoiding the integration of foreign nucleotide sequences in the plant genome.
  • genetic modification and breeding of plants using the method of the present invention can obtain plants whose genomes are not integrated with foreign polynucleotides, that is, transgene-free modified plants.
  • the modified target nucleic acid region is related to plant traits such as agronomic traits, whereby the one or more nucleotide substitutions result in the plant having altered (preferably improved) traits, such as agronomic traits, relative to wild-type plants.
  • the method further includes the step of screening for plants having desired one or more nucleotide substitutions and/or desired traits such as agronomic traits.
  • the method further includes obtaining progeny of the genetically modified plant.
  • the genetically modified plant or its progeny has desired one or more nucleotide substitutions and/or desired traits such as agronomic traits.
  • the present invention also provides a genetically modified plant or its progeny or part thereof, wherein the plant is obtained by the above-mentioned method of the present invention.
  • the genetically modified plant or progeny or part thereof is non-transgenic.
  • the genetically modified plant or its progeny have desired genetic modification and/or desired traits such as agronomic traits.
  • the present invention also provides a plant breeding method, which comprises crossing a genetically modified first plant, obtained by the above-mentioned method of the present invention and containing one or more nucleotide substitutions in the target nucleic acid region, with a second plant that does not contain the one or more nucleotide substitutions, thereby introducing the one or more nucleotide substitutions into the second plant.
  • the genetically modified first plant has desired traits such as agronomic traits.
  • the present invention provides a method for in situ saturation mutation of an endogenous target nucleic acid region in a cell or organism to obtain a mutation of interest in the target nucleic acid region, comprising
  • the methods of the invention are performed in vitro.
  • the cell is an isolated cell, or a cell in an isolated tissue or organ.
  • the base editing system includes multiple guide RNAs and/or multiple expression constructs containing nucleotide sequences encoding the multiple guide RNAs.
  • the multiple guide RNAs are directed to different target sequences in the target nucleic acid region.
  • the plurality of guide RNAs may number 2 to 250 or more, for example 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 75, 100, 150, 200, 250, 300 or more guide RNAs.
  • the target sequences and/or complementary sequences targeted by at least some of the plurality of guide RNAs partially overlap each other and/or are adjacent to each other.
  • the base editing system of the present invention can realize base editing of a longer target nucleic acid region.
  • the target nucleic acid region may have a length of about 20bp to about 10000bp or longer, such as about 20bp, about 40bp, about 60bp, about 80bp, about 100bp, about 120bp, about 140bp, about 160bp, about 180bp, about 200bp, about 300bp, about 400bp, about 500bp, about 1000bp, about 1500bp, about 2000bp, about 3000bp, about 4000bp, about 5000bp, about 6000bp or longer.
  • the target nucleic acid sequence may encode an amino acid sequence of about 5 to about 2000 amino acids in length, for example, may encode an amino acid sequence of about 5, about 10, or about 15 in length.
  • the target sequence targeted by the plurality of guide RNAs substantially covers the target nucleic acid region.
  • At least a portion of the plurality of guide RNAs target the sense strand of the target nucleic acid region.
  • At least a portion of the plurality of guide RNAs target the antisense strand of the target nucleic acid region.
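To illustrate how guide RNAs can be tiled over both strands of a target nucleic acid region, the following minimal Python sketch enumerates candidate protospacers. It assumes an SpCas9-type layout (a 20 nt protospacer immediately followed by the PAM on its 3' side). The function names and the coordinate convention are illustrative assumptions and are not part of the disclosure.

```python
# Illustrative sketch only: enumerate candidate target sites on both strands.
# Assumption: a 20 nt protospacer is followed directly by the PAM (5'->3').

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def pam_matches(site, pam):
    """True if `site` matches `pam`, where 'N' in the PAM matches any base."""
    return len(site) == len(pam) and all(
        p == "N" or p == s for p, s in zip(pam, site)
    )


def find_targets(region, pam="NGG", spacer_len=20):
    """List (strand, start, protospacer) for every site followed by `pam`.

    `start` is the 0-based leftmost sense-strand coordinate covered by the
    protospacer, for both '+' (sense) and '-' (antisense) hits.
    """
    region = region.upper()
    hits = []
    for strand, seq in (("+", region), ("-", reverse_complement(region))):
        for i in range(len(seq) - spacer_len - len(pam) + 1):
            spacer = seq[i:i + spacer_len]
            site = seq[i + spacer_len:i + spacer_len + len(pam)]
            if pam_matches(site, pam):
                start = i if strand == "+" else len(seq) - (i + spacer_len)
                hits.append((strand, start, spacer))
    return hits
```

Calling `find_targets(region, pam="NG")` instead of the default `"NGG"` models a relaxed PAM requirement, which generally yields more candidate sites for the same region.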
  • the plurality of guide RNAs and/or a plurality of expression constructs containing nucleotide sequences encoding the plurality of guide RNAs can each be independently introduced into the population of the cell or organism. In some embodiments, the plurality of guide RNAs and/or a plurality of expression constructs containing nucleotide sequences encoding the plurality of guide RNAs can be introduced into the population of the cell or organism in combination with each other.
  • each guide RNA or its expression construct is introduced into a subpopulation of the cells or organisms, and all the subpopulations together constitute the population of cells or organisms into which the gene editing system has been introduced; or a mixture of every two guide RNAs or their expression constructs is introduced into a subpopulation of the cells or organisms, and all the subpopulations together constitute the population of cells or organisms into which the gene editing system has been introduced; and so on.
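One possible reading of this pooling scheme is that every combination of a fixed number of guide RNAs defines one subpopulation. The short Python sketch below enumerates such pools under that assumption. The function name `guide_pools` and the guide labels are hypothetical, and other pooling designs (for example disjoint pairs) would equally fit the description.

```python
from itertools import combinations


def guide_pools(guides, pool_size):
    """Return every combination of `pool_size` guides, one tuple per subpopulation."""
    return [tuple(pool) for pool in combinations(guides, pool_size)]


# Hypothetical guide labels, used only for illustration.
single_pools = guide_pools(["g1", "g2", "g3"], 1)  # one guide per subpopulation
paired_pools = guide_pools(["g1", "g2", "g3"], 2)  # mixtures of two guides
```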
  • the mutation is a nucleotide substitution, such as a C to T substitution, A to G substitution, G to A substitution, or T to C substitution.
  • the target nucleic acid region is located in the coding region of the protein. In some embodiments, the target nucleic acid region encodes a functionally related motif or domain of the protein. In some embodiments, the mutations in the target nucleic acid region may be silent mutations, missense mutations, or nonsense mutations. In some preferred embodiments, mutations in the target nucleic acid region result in amino acid substitutions in the amino acid sequence of the protein. In some embodiments, the mutation results in a change in the function of the protein.
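The distinction between silent, missense and nonsense mutations can be made concrete with a small Python sketch that compares a reference codon with its edited counterpart. It uses the standard genetic code; the function name `classify_codon_change` is an illustrative assumption, and the sketch deliberately rejects reference stop codons rather than classifying stop-loss events.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon, alt_codon):
    """Classify a codon substitution as 'silent', 'missense' or 'nonsense'."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == "*":
        raise ValueError("reference codon is a stop codon")
    if alt_aa == ref_aa:
        return "silent"
    if alt_aa == "*":
        return "nonsense"
    return "missense"
```

For example, a C to T substitution that turns CAG (Gln) into TAG is classified as nonsense, while GCC to GCT leaves alanine unchanged and is classified as silent.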
  • the "saturation" mutation does not necessarily mean that the population of cells or organisms contains all nucleotide mutations in the target nucleic acid region or all amino acid mutations in the amino acid sequence encoded by the target nucleic acid region. "Near saturation" mutations are also encompassed, for example where the population of cells or organisms contains mutations of more than 50% of the nucleotides in the target nucleic acid region, or more than 50% of the amino acid mutations in the amino acid sequence encoded by the target nucleic acid region.
  • the target nucleic acid region is related to the trait of the cell or organism.
  • the mutation in the target nucleic acid region causes a change in the trait of the cell or organism. Therefore, in some embodiments, mutations of interest can be screened for changes in the traits of cells or organisms.
  • a "mutation of interest" generally refers to a mutation that causes a change in a trait of interest of a cell or organism. Therefore, in step iii), cells or organisms with mutations of interest can be screened by screening cells or organisms with changes in traits of interest.
  • the cells can be derived from, for example, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle, and cats; poultry such as chickens, ducks, and geese; plants, including monocots and dicots, preferably crop plants, including but not limited to wheat, rice, corn, soybean, sunflower, sorghum, rape, alfalfa, cotton, barley, millet, sugarcane, tomato, tobacco, cassava, and potato.
  • the cell is a plant cell, more preferably a crop plant cell, more preferably a rice cell.
  • the organism may be, for example, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle, and cats; poultry such as chickens, ducks, and geese; plants, including monocots and dicots, preferably crop plants, including but not limited to wheat, rice, corn, soybean, sunflower, sorghum, rape, alfalfa, cotton, barley, millet, sugarcane, tomato, tobacco, cassava, and potato.
  • the organism is a plant, more preferably a crop plant, more preferably rice.
  • the traits of interest include improved agronomic traits, including but not limited to increased growth rate, increased yield, increased nutrient content, increased cold resistance, increased drought resistance, increased insect resistance, increased disease resistance, increased herbicide resistance, etc.
  • the traits of interest include but are not limited to drug resistance.
  • the present invention also covers the application of the base editing system of the present invention in the treatment of diseases.
  • Modification of disease-related genes by the base editing system of the present invention can realize up-regulation, down-regulation, inactivation, activation or mutation correction of disease-related genes, etc., thereby achieving disease prevention and/or treatment.
  • the target nucleic acid region in the present invention can be located in the protein coding region of a disease-related gene, or, for example, in a gene expression regulatory region such as a promoter region or an enhancer region, so that functional modification of the disease-related gene or modification of its expression can be achieved.
  • Therefore, modifying disease-related genes as described herein includes modification of the disease-related genes themselves (such as protein coding regions), as well as modification of their expression regulatory regions (such as promoters, enhancers, introns, etc.).
  • a "disease-related" gene refers to any gene that produces transcription or translation products at abnormal levels or in an abnormal form in cells derived from disease-affected tissues, compared with non-disease control tissues or cells. Where the altered expression is related to the appearance and/or progression of the disease, it may be a gene expressed at an abnormally high level, or it may be a gene expressed at an abnormally low level.
  • Disease-related genes also refer to genes that have one or more mutations or genetic variations that are directly responsible for, or in linkage disequilibrium with, one or more genes responsible for the etiology of the disease.
  • the mutation or genetic variation is, for example, a single nucleotide variation (SNV).
  • the present invention also provides a method of treating a disease in a subject in need thereof, comprising delivering to the subject an effective amount of the base editing system of the present invention to modify genes related to the disease.
  • the present invention also provides the use of the base editing system of the present invention in preparing a pharmaceutical composition for treating diseases in a subject in need, wherein the base editing system is used to modify genes related to the disease.
  • the present invention also provides a pharmaceutical composition for treating diseases in a subject in need, which comprises the base editing system of the present invention, and optionally a pharmaceutically acceptable carrier, wherein the base editing system is used to modify genes related to the disease.
  • the subject is a mammal, such as a human.
  • diseases include, but are not limited to, tumors, inflammation, Parkinson's disease, cardiovascular diseases, Alzheimer's disease, autism, drug addiction, age-related macular degeneration, schizophrenia, genetic diseases and the like.
  • the present invention also includes a kit for the method of the present invention, which includes the base editing fusion protein of the present invention and/or an expression construct containing a nucleotide sequence encoding the base editing fusion protein, or the base editing system of the present invention.
  • the kit generally includes a label indicating the intended use and/or method of use of the contents of the kit.
  • the term label includes any written or recorded material provided on or with the kit or otherwise provided with the kit.
  • the kit of the present invention may also contain suitable materials for constructing the expression vector in the base editing system of the present invention.
  • the kit of the present invention may also include reagents suitable for transforming the base editing fusion protein or base editing composition of the present invention into cells.
  • the cytidine deaminase, adenosine deaminase, nCas9 (D10A) and UGI parts of STEME-1, STEME-2, STEME-3 and STEME-4 are codon-optimized for cereal plants and synthesized commercially (GENEWIZ, Suzhou, China).
  • the Cas9 variant nCas9-NG (D10A), containing the R1335V/L1111R/D1135V/G1218R/E1219F/A1322R/T1337R mutations, was obtained by mutating nCas9 (D10A) using the Gibson assembly method (Mut Express MultiS Fast Mutagenesis Kit, Vazyme, Nanjing, China).
  • the japonica rice variety Nipponbare was used to prepare the protoplasts used in this study. Protoplast isolation and transformation were performed as described previously (reference 3). 10 μg of base editor and sgRNA plasmid DNA were introduced into protoplasts by PEG-mediated transfection, and the average transformation efficiency was measured to be 40-55%. The transfected protoplasts were incubated at 23°C. 60 hours after transfection, protoplasts were collected to extract genomic DNA for deep sequencing of amplicons.
  • PCR was directly performed to amplify the protospacer of the binary vector from the transgenic callus.
  • the barcode is added to both ends of the PCR product through primers.
  • two rounds of PCR were performed. In the first round of PCR, site-specific primers are used to amplify the target region.
  • forward and reverse barcodes were added to the ends of PCR products for library construction. Equal amounts of PCR products were pooled and purified by gel DNA extraction, and the samples were sequenced commercially on the Illumina NextSeq 500 platform (Genewiz, Suzhou, China). The protospacer sequence in the sequencing reads was examined to analyze C>T and/or A>G substitutions and indels. Amplicon sequencing was repeated three times for each target sequence, using genomic DNA extracted from three independent protoplast samples.
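The substitution analysis of the protospacer can be sketched in Python as follows. This is a simplified illustration, not the pipeline actually used: it assumes each read has already been trimmed to the protospacer and is given in the same orientation as the reference, and it simply skips reads whose length differs (such reads would be handled by a separate indel analysis). The function name `editing_efficiency` is hypothetical.

```python
from collections import Counter


def editing_efficiency(protospacer, reads, conversions=(("C", "T"), ("A", "G"))):
    """Per-position frequency of the given base conversions.

    Returns {(position, ref_base, observed_base): fraction of usable reads},
    with positions counted from 1 at the 5' end of the protospacer.
    """
    counts = Counter()
    usable = 0
    for read in reads:
        if len(read) != len(protospacer):
            continue  # length mismatch suggests an indel; not scored here
        usable += 1
        for pos, (ref, obs) in enumerate(zip(protospacer, read), start=1):
            if (ref, obs) in conversions:
                counts[(pos, ref, obs)] += 1
    if usable == 0:
        return {}
    return {key: n / usable for key, n in counts.items()}
```

If the sgRNA targets the antisense strand and reads are reported on the sense strand, the same edits appear as G>A and T>C, which can be requested through the `conversions` argument.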
  • CBE cytosine base editor
  • ABE adenine base editor
  • a new type of base editor is designed, which can simultaneously generate C:G>T:A and A:T>G:C mutations at the same target site with only one sgRNA, in order to carry out saturated targeted endogenous mutagenesis (STEM) on a selected target gene.
  • the inventors fused cytosine deaminase and adenine deaminase into a new deaminase, and developed a saturated targeting endogenous mutagenesis editor (STEME) for endogenous sequence targeting.
  • the components of STEME also include nCas9 (D10A) and uracil DNA glycosylase inhibitor (UGI) ( Figure 1a).
  • the fused deaminase can deaminate C and/or A within the deamination window, nCas9 can stimulate the cellular mismatch repair (MMR) mechanism, and UGI is used to inhibit uracil DNA glycosylase (UDG), so that the damaged DNA strand can be repaired using the deaminated strand as the template.
  • PABE-7 is composed of artificially evolved ecTadA-ecTadA7.10 heterodimer and nCas9 with 3 NLS at the N-terminal.
  • the inventors designed two forms of fusion deaminase, APOBEC3A-ecTadA-ecTadA7.10 and ecTadA-ecTadA7.10-APOBEC3A, fused each of them to the N-terminus of nCas9 (D10A), and fused one UGI, or two freely expressed UGIs (through the T2A connecting peptide), to the C-terminus of nCas9 (D10A), to construct four vectors: STEME-1, STEME-2, STEME-3 and STEME-4 (Figure 1b). All STEME vectors are codon-optimized for crop plants and expressed from the maize Ubiquitin-1 (Ubi-1) promoter.
  • the 20nt sgRNA spacer sequence was constructed on the esgRNA vector driven by the OsU3 promoter ( Figure 2).
  • A3A-PBE and PABE-7 were used as controls for C>T and A>G, and wild-type Cas9 was used as a control for indel production.
  • Amplicon sequencing was performed on each sample, yielding about 30,000-310,000 sequence reads per sample, and the base editing efficiency was analyzed.
  • the results show that all four STEME vectors can produce high-efficiency C>T and/or A>G base transitions in rice protoplasts.
  • STEME-1 has the highest C>T efficiency (0.1%-61.61%) ( Figure 3a and Figure 4).
  • the base editing window of STEME to C>T is the same as that of A3A-PBE, which is C1 to C17 ( Figure 3a and Figure 4).
  • the editing efficiency of STEME-1 from C5 to C14 averaged 29.59%, which was 1.3 times that of A3A-PBE ( Figure 3a and Figure 4).
  • STEME-1 still has the highest A>G efficiency (0.69%-15.5%) among the four vectors ( Figure 3b and Figure 4).
  • the A>G base editing window of STEME is the same as that of PABE-7, ranging from A4 to A8, but the A>G base editing efficiency of all four STEME vectors (0.07%-15.5%) is lower than that of PABE-7 (1.74%-21.54%) (Figure 3b and Figure 4).
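The two editing windows reported here (C1 to C17 for C>T, A4 to A8 for A>G) can be applied to a given protospacer with a short Python sketch. It assumes that positions are counted from 1 at the 5' (PAM-distal) end of a 20 nt protospacer, which is the usual convention but is not stated explicitly in the text; the function name `editable_bases` is illustrative.

```python
def editable_bases(protospacer, c_window=(1, 17), a_window=(4, 8)):
    """List (site label, expected product base) for bases inside the windows.

    Assumption: position 1 is the 5' end of the protospacer.
    """
    windows = {"C": c_window, "A": a_window}
    products = {"C": "T", "A": "G"}
    sites = []
    for pos, base in enumerate(protospacer.upper(), start=1):
        if base in windows:
            low, high = windows[base]
            if low <= pos <= high:
                sites.append((f"{base}{pos}", products[base]))
    return sites
```

The window bounds are parameters, so narrower windows (for example C5 to C14, where the highest average efficiency was observed) can be explored without changing the function.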
  • According to the amplicon sequencing data from rice protoplasts, STEME has high product purity at the six target sites tested (Figure 4), and its indel frequency is consistent with that of the untreated group and much lower than that of Cas9 (6.3%-15.61%) (Figure 6).
  • STEME-1 can simultaneously achieve C:G>T:A and/or A:T>G:C base transitions using only one sgRNA.
  • STEME-1 produces higher C:G>T:A base editing efficiency from C5 to C14 than A3A-PBE, and the additional A:T>G:C base transitions can increase the number of mutation types available for directed evolution through saturation mutagenesis.
  • Cas9 derived from Streptococcus pyogenes requires NGG PAM in the target sequence, which limits the number of sgRNAs available on the rice genome.
  • bioinformatic analysis of the rice reference genome showed that the use of Cas9-NG (VRVRFRR) extended the editable range to 79%, whereas Cas9 could target only 19% of the rice genome (Figure 7a). Therefore, to expand the editing scope of STEME, nCas9 (D10A) in STEME-1 was replaced with nCas9-NG (D10A) to construct STEME-NG (Figure 7b).
  • A3A-PBE-NG, PABE7-NG and pCas9-NG were also constructed (Figure 8a).
  • 20nt target sequences with NGA, NGT, NGC and NGG PAMs were designed to reduce the impact of chromatin state (Figure 8b and c), and the target sequences were constructed on the pOsU3-esgRNA vector.
  • STEME-NG has editing activity on target sequences with PAM as NGA, NGT, NGC and NGG ( Figure 9 and Figure 10).
  • STEME-NG can edit the cytosine in C1 to C17 and the adenine in A4 to A8, but like STEME-1, the base editing efficiency of A>G is lower than that of C>T ( Figure 9).
  • the activity of STEME-NG on NGG PAM targets is lower than that of STEME-1, and it has higher editing activity on NGA and NGT PAM targets (Figure 9 and Figure 10).
  • STEME-NG has a higher editing activity on the OsODEV-NGC target, and its editing activity on the other three NGC targets is relatively low (Figure 9 and Figure 10).
  • the activities of A3A-PBE-NG and PABE7-NG at target sites with NGA, NGT, NGC and NGG PAMs are consistent with those of STEME-NG (Figure 11).
  • STEME-NG, A3A-PBE-NG and PABE7-NG all have lower indel frequencies than pCas9-NG.
  • STEME-NG greatly expands the range of C>T and/or A>G base editing on the genome, and facilitates saturation mutagenesis and directed evolution in plants.
  • This example uses OsACC as an example to illustrate the ability of STEME-mediated saturation de novo mutation to produce directed evolution in protoplasts.
  • Acetyl-coenzyme A carboxylase is a key enzyme in the lipid synthesis pathway, and the carboxyltransferase domain (CT) is the herbicide-resistant active site of this enzyme ( Figure 12a).
  • these 20 target sites can cover 90.32% of C, 40.43% of A, 77.78% of G and 38.89% of T on the coding strand, covering a total of 61.31% of the bases on the coding strand (Figure 12a, Table 3).
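Coverage figures of this kind can be estimated with a sketch along the following lines. It is an illustration under stated assumptions rather than the calculation actually used: a base counts as covered when it lies in the C1 to C17 or A4 to A8 window of at least one protospacer, positions are counted from the 5' end of the protospacer, and an sgRNA on the antisense strand is taken to edit G and T as read on the coding strand. The function name `covered_fraction` and the `(strand, start)` target format are hypothetical.

```python
from collections import Counter


def covered_fraction(coding, targets, spacer_len=20,
                     c_window=(1, 17), a_window=(4, 8)):
    """Return (per-base coverage dict, overall coverage) on the coding strand.

    `targets` holds (strand, start) pairs, where `start` is the 0-based
    leftmost coding-strand coordinate of the protospacer.
    """
    coding = coding.upper()
    # Antisense-strand C and A appear as G and T on the coding strand.
    windows = {("+", "C"): c_window, ("+", "A"): a_window,
               ("-", "G"): c_window, ("-", "T"): a_window}
    covered = set()
    for strand, start in targets:
        for pos in range(1, spacer_len + 1):
            idx = start + pos - 1 if strand == "+" else start + spacer_len - pos
            if not 0 <= idx < len(coding):
                continue
            window = windows.get((strand, coding[idx]))
            if window and window[0] <= pos <= window[1]:
                covered.add(idx)
    totals = Counter(coding)
    hits = Counter(coding[i] for i in covered)
    per_base = {b: hits[b] / totals[b] for b in "ACGT" if totals[b]}
    return per_base, len(covered) / len(coding)
```

Overlapping protospacers are counted once per base because covered positions are collected in a set.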
  • STEME-NG produced a total of 212 mutation types, which is 2.7 times that of A3A-PBE-NG.
  • of the 212 mutation types, 18.4% were caused by simultaneous C:G>T:A and A:T>G:C mutations.
  • pCas9-NG still has a higher indel efficiency (0.32%-39.72%) than STEME-NG and A3A-PBE-NG.
  • STEME-NG can generate a variety of mutation types in the coding strand. Unlike recently reported directed evolution methods that rely on bacteria and yeast (such as PACE, EvolvR, CREATE, CHAnGE, etc.), STEME can directly generate saturating de novo mutation types in situ and can be used for directed evolution of plant proteins.
  • STEME-1 (APOBEC3A-48aa linker-ecTadA-32aa linker-ecTadA7.10-32aa linker-nCas9(D10A)-NLS-UGI-NLS) coding sequence
  • STEME-NG (APOBEC3A-48aa linker-ecTadA-32aa linker-ecTadA7.10-32aa linker-nCas9-NG(D10A)-NLS-UGI-NLS) coding sequence

Landscapes

  • Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Biomedical Technology (AREA)
  • Zoology (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Wood Science & Technology (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Microbiology (AREA)
  • Plant Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Cell Biology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Medicinal Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

The invention relates to a base editing fusion protein comprising a nucleic acid targeting domain, a cytosine deaminase domain and an adenine deaminase domain. The invention also relates to a base editing system and to a method for generating at least one genetically modified cell.
PCT/CN2020/110207 2019-08-20 2020-08-20 Base editing system and method of use thereof WO2021032155A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202080059623.XA CN114945670A (zh) 2019-08-20 2020-08-20 一种碱基编辑系统和其使用方法

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910767418 2019-08-20
CN201910767418.8 2019-08-20

Publications (1)

Publication Number Publication Date
WO2021032155A1 true WO2021032155A1 (fr) 2021-02-25

Family

ID=74659768

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/110207 WO2021032155A1 (fr) 2019-08-20 2020-08-20 Base editing system and method of use thereof

Country Status (2)

Country Link
CN (1) CN114945670A (fr)
WO (1) WO2021032155A1 (fr)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113201517A (zh) * 2021-05-12 2021-08-03 广州大学 一种胞嘧啶单碱基编辑器工具及其应用
CN115704015A (zh) * 2021-08-12 2023-02-17 清华大学 基于腺嘌呤和胞嘧啶双碱基编辑器的靶向诱变系统
WO2024051850A1 (fr) * 2022-09-09 2024-03-14 中国科学院遗传与发育生物学研究所 Système et procédé d'édition du génome fondé sur une adn polymérase
CN118086285A (zh) * 2024-04-23 2024-05-28 天津凯莱英生物科技有限公司 蛋白定向进化的方法

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115820603B (zh) * 2022-11-15 2024-07-05 吉林大学 一种基于dCasRx-NSUN6单基因特异性M5C修饰编辑方法
CN116751799B (zh) * 2023-06-14 2024-01-26 江南大学 一种多位点双重碱基编辑器及其应用

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109517841A (zh) * 2018-12-05 2019-03-26 华东师范大学 一种用于核苷酸序列修饰的组合物、方法与应用
CN109957569A (zh) * 2017-12-22 2019-07-02 中国科学院遗传与发育生物学研究所 基于cpf1蛋白的碱基编辑系统和方法
WO2019147014A1 (fr) * 2018-01-23 2019-08-01 기초과학연구원 Arn guide simple étendu et utilisation associée
CN110835634A (zh) * 2018-08-15 2020-02-25 华东师范大学 一种新型碱基转换编辑系统及其应用

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109957569A (zh) * 2017-12-22 2019-07-02 中国科学院遗传与发育生物学研究所 基于cpf1蛋白的碱基编辑系统和方法
WO2019147014A1 (fr) * 2018-01-23 2019-08-01 기초과학연구원 Arn guide simple étendu et utilisation associée
CN110835634A (zh) * 2018-08-15 2020-02-25 华东师范大学 一种新型碱基转换编辑系统及其应用
CN109517841A (zh) * 2018-12-05 2019-03-26 华东师范大学 一种用于核苷酸序列修饰的组合物、方法与应用

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LI CHAO; ZHANG RUI; MENG XIANGBING; CHEN SHA; ZONG YUAN; LU CHUNJU; QIU JIN-LONG; CHEN YU-HANG; LI JIAYANG; GAO CAIXIA: "Targeted, random mutagenesis of plant genes with dual cytosine and adenine base editors", NATURE BIOTECHNOLOGY, GALE GROUP INC., NEW YORK, US, vol. 38, no. 7, 13 January 2020 (2020-01-13), us, pages 875 - 882, XP037187539, ISSN: 1087-0156, DOI: 10.1038/s41587-019-0393-7 *
LIU, JIAHUI ET AL.: "Research Progress of Base Editing System", WORLD SCI-TECH R&D, vol. 39, no. 6, 31 December 2017 (2017-12-31), XP055763468, ISSN: 1006-6055, DOI: 10.16507/j.issn.1006-6055.2017.09.004 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113201517A (zh) * 2021-05-12 2021-08-03 广州大学 一种胞嘧啶单碱基编辑器工具及其应用
CN113201517B (zh) * 2021-05-12 2022-11-01 广州大学 一种胞嘧啶单碱基编辑器工具及其应用
CN115704015A (zh) * 2021-08-12 2023-02-17 清华大学 基于腺嘌呤和胞嘧啶双碱基编辑器的靶向诱变系统
WO2024051850A1 (fr) * 2022-09-09 2024-03-14 中国科学院遗传与发育生物学研究所 Système et procédé d'édition du génome fondé sur une adn polymérase
CN118086285A (zh) * 2024-04-23 2024-05-28 天津凯莱英生物科技有限公司 蛋白定向进化的方法

Also Published As

Publication number Publication date
CN114945670A (zh) 2022-08-26

Similar Documents

Publication Publication Date Title
WO2021032155A1 (fr) Base editing system and method of use thereof
US11820990B2 (en) Method for base editing in plants
WO2019120310A1 (fr) Système et procédé d'édition de bases reposant sur la protéine cpf1
US11447785B2 (en) Method for base editing in plants
KR20200103769A (ko) 연장된 단일 가이드 rna 및 그 용도
CN108866092A (zh) 抗除草剂基因的产生及其用途
CN107027313A (zh) 用于多元rna引导的基因组编辑和其它rna技术的方法和组合物
WO2021185358A1 (fr) Procédé d'amélioration de la transformation génétique de plantes et de l'efficacité d'édition génomique
US20210403901A1 (en) Targeted mutagenesis using base editors
WO2021175289A1 (fr) Procédé et système d'édition de génome multiplex
WO2023169454A1 (fr) Adénine désaminase et son utilisation dans la réécriture de base
WO2021082830A1 (fr) Procédé de modification ciblée de séquence de génome de plante
JP2022511508A (ja) ゲノム編集による遺伝子サイレンシング
WO2023169410A1 (fr) Cytosine désaminase et son utilisation dans l'édition de bases
CN112805385B (zh) 基于人apobec3a脱氨酶的碱基编辑器及其用途
CN117295817A (zh) Dna修饰酶及其活性片段和变体以及使用方法
WO2021175288A1 (fr) Système amélioré d'édition de base de cytosine
WO2023227050A1 (fr) Procédé pour l'insertion spécifique d'un site d'une séquence exogène dans le génome
WO2022199665A1 (fr) Procédé pour améliorer l'efficacité de la transformation génétique des plantes et de l'édition génétique
WO2024051850A1 (fr) Système et procédé d'édition du génome fondé sur une adn polymérase
WO2022127894A1 (fr) Plante résistante aux herbicides
WO2023232109A1 (fr) Nouveau système d'édition de gène crispr
US20230227835A1 (en) Method for base editing in plants
IL303583A (en) Cannabis plant resistant to herbicides

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20854097

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20854097

Country of ref document: EP

Kind code of ref document: A1