WO2023207607A1 - Deaminase mutant, composition, and method for modifying mitochondrial dna - Google Patents

Publication number
WO2023207607A1
Authority
WO
WIPO (PCT)
Prior art keywords
polypeptide
sequence
binding protein
amino acid
double
Prior art date
Application number
PCT/CN2023/088008
Other languages
French (fr)
Chinese (zh)
Inventor
伊成器
雷芷芯
孟浩巍
刘璐璐
赵华男
Original Assignee
北京大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京大学
Priority to CN202380013022.9A (CN117751133A)
Publication of WO2023207607A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/64 General methods for preparing the vector, for introducing it into the cell or for selecting the vector-containing host
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/78 Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)

Abstract

A double-stranded DNA deaminase mutant for constructing a mitochondrial base editor, a base editor for mitochondrial DNA base editing, and a method for base editing.

Description

Deaminase mutants, compositions and methods for modifying mitochondrial DNA
Technical field
The present application relates to the technical field of base editing, in particular mitochondrial base editing. Specifically, it relates to a double-stranded DNA deaminase mutant that can be used to construct a mitochondrial base editor, to a base editor for editing bases in mitochondrial DNA, and to a base editing method.
Background
The human genome consists of two parts. The first is the nuclear genome, located in the cell nucleus, where DNA and histones are organized together as chromatin or chromosomes; the second is the circular DNA located in the mitochondria. Nuclear genomic DNA (gDNA) exists as 23 pairs of chromosomes, with a haploid length of about 3.1 Gbp; the circular mitochondrial DNA (mtDNA) is about 16 kbp long.
A normal human cell contains approximately 1,000 to 10,000 mitochondria, and each mitochondrion contains approximately 2 to 10 copies of mtDNA, each comprising 16,569 base pairs. Unlike nuclear gDNA, mtDNA is generally circular. A single mtDNA encodes 37 genes in total: 13 protein-coding genes, 22 tRNA genes and 2 rRNA genes. Mutations in the mitochondrial genome can cause serious diseases (Stewart, J.B. et al., The dynamics of mitochondrial DNA heteroplasmy: implications for human health and disease. Nat Rev Genet, 2015. 16(9): p. 530-42.), so correcting mtDNA that carries pathogenic mutations by gene editing is of great significance.
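As a rough illustration of the scale implied by these figures, the ranges can be multiplied out. This is only a back-of-the-envelope sketch that treats the approximate ranges quoted in the background as fixed bounds; the constant and function names are illustrative and do not come from the application.

```python
# Approximate figures quoted in the background section (treated here as bounds).
MITOCHONDRIA_PER_CELL = (1_000, 10_000)
MTDNA_COPIES_PER_MITOCHONDRION = (2, 10)
MTDNA_LENGTH_BP = 16_569
GENE_COUNTS = {"protein_coding": 13, "tRNA": 22, "rRNA": 2}


def mtdna_copies_per_cell():
    """Return the (low, high) estimate of mtDNA copies in one cell."""
    low = MITOCHONDRIA_PER_CELL[0] * MTDNA_COPIES_PER_MITOCHONDRION[0]
    high = MITOCHONDRIA_PER_CELL[1] * MTDNA_COPIES_PER_MITOCHONDRION[1]
    return low, high


def mtdna_base_pairs_per_cell():
    """Return the (low, high) estimate of total mtDNA base pairs in one cell."""
    low, high = mtdna_copies_per_cell()
    return low * MTDNA_LENGTH_BP, high * MTDNA_LENGTH_BP


if __name__ == "__main__":
    print("mtDNA copies per cell:", mtdna_copies_per_cell())
    print("mtDNA base pairs per cell:", mtdna_base_pairs_per_cell())
    print("genes per mtDNA:", sum(GENE_COUNTS.values()))
```

Under these assumptions a single cell carries thousands of mtDNA copies, which is why pathogenic variants are usually discussed in terms of heteroplasmy (the fraction of copies affected) rather than as present or absent.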
In 2020, the Joseph Mougous and David Liu groups developed DdCBE, a TALE-based mitochondrial base editor that targets mitochondrial DNA and performs site-specific C-to-T base editing (Mok, B.Y., et al., A bacterial cytidine deaminase toxin enables CRISPR-free mitochondrial base editing. Nature, 2020. 583(7817): p. 631-637.). Using bioinformatics, the two groups first identified in Burkholderia cenocepacia a bacterial deaminase toxin, DddA, that acts specifically on double-stranded DNA, and found in another Burkholderia strain lacking the dddA gene a bacterial immunity protein, DddIA, that antagonizes DddA activity. Structural analysis indicated that a polypeptide fragment near the C-terminus of DddA is the core region that catalyzes deamination of double-stranded DNA; this fragment was therefore named DddAtox.
Because full-length DddAtox can deaminate double-stranded DNA, it must be split to permit normal culture in conventional systems such as E. coli. Depending on the split site, there are two schemes, G1333 and G1397, where G indicates that the amino acid at the split site is glycine (Gly). Drawing on earlier work on mitoTALEN, the two groups linked TALE elements to the N-terminal or C-terminal half of the split DddAtox; fused a mitochondrial targeting signal (MTS), which promotes import of the protein into mitochondria, to the N-terminus of the TALE; and, following the cytosine base editors (CBE) previously developed on the CRISPR-Cas9 system, fused one UGI to the C-terminus of the whole TALE element to inhibit uracil glycosylase (UDG) activity in mitochondria, so that the dU produced when DddAtox deaminates C is not cleaved by mitochondrial UDG to form a nick. By changing the TALE sequences, the N-terminal and C-terminal constructs can be targeted precisely to a specific region of the mitochondrial genome, where the split DddAtox halves reassemble into an element with full deaminase activity and carry out precise C-to-T editing in the adjacent region. No DNA double-strand break is produced during editing, which prevents degradation of the mitochondrial DNA.
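The two split schemes can be sketched as simple index arithmetic. The sketch below assumes that the DddAtox region spans positions 1290-1427 of SEQ ID NO:1 and that the split falls on the peptide bond immediately after the named glycine, as stated later in this description. The placeholder sequence and the function name `split_tox` are hypothetical; the real sequence is not reproduced here.

```python
# Numbering follows SEQ ID NO:1; the DddAtox region is taken as residues 1290-1427.
TOX_START, TOX_END = 1290, 1427
TOX_LENGTH = TOX_END - TOX_START + 1

# Last residue of the N-terminal half for each split scheme.
SPLIT_SITES = {"G1333": 1333, "G1397": 1397}


def split_tox(tox_seq, scheme):
    """Split a DddAtox-region sequence after the residue named by the scheme.

    Returns (n_terminal_half, c_terminal_half).
    """
    if len(tox_seq) != TOX_LENGTH:
        raise ValueError(f"expected {TOX_LENGTH} residues, got {len(tox_seq)}")
    last_n_residue = SPLIT_SITES[scheme]
    # Number of residues kept in the N-terminal half (1290 .. last_n_residue).
    cut = last_n_residue - TOX_START + 1
    return tox_seq[:cut], tox_seq[cut:]


if __name__ == "__main__":
    placeholder = "X" * TOX_LENGTH  # hypothetical stand-in, not the real sequence
    for scheme in SPLIT_SITES:
        n_half, c_half = split_tox(placeholder, scheme)
        print(scheme, len(n_half), len(c_half))
```

The two halves are of unequal length in both schemes, and each is intended to be inactive until the two TALE fusions bring them together at the target site.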
Summary of the invention
The treatment of mitochondrial genetic diseases bears on human health. The mitochondrial base editor DdCBE, reported in 2020 by the Joseph Mougous and David Liu groups, can edit single bases of mitochondrial DNA in cells, making precise treatment of mitochondrial genetic diseases possible. However, non-specific binding of DdCBE to mitochondrial DNA causes off-target editing at the mitochondrial level (Mok, B.Y., et al., A bacterial cytidine deaminase toxin enables CRISPR-free mitochondrial base editing. Nature, 2020. 583(7817): p. 631-637.). In addition, using Detect-seq, the inventors of the present application found that although the current DdCBE already contains an MTS sequence, a fraction of DdCBE is still mislocalized to the nucleus, where it can cause severe off-target editing. Off-target editing at both the mitochondrial DNA level and the nuclear genome level raises safety concerns for the precise treatment of mitochondrial genetic diseases.
To reduce off-target editing by existing DdCBE in mitochondria and/or the nucleus, the inventors, after extensive research, provide double-stranded DNA deaminase mutants that can be used to construct mitochondrial base editors with reduced off-target editing. Mitochondrial base editors built from these mutants maintain comparable on-target editing efficiency while effectively reducing off-target editing in mitochondria and/or the nucleus. The present application also provides base editing compositions and methods with reduced off-target editing in mitochondria and/or the nucleus.
Accordingly, in a first aspect, the present application provides a polypeptide having double-stranded DNA deaminase activity, or a mutant thereof, wherein the polypeptide or mutant thereof comprises the amino acid residues of a wild-type double-stranded DNA deaminase at the positions corresponding to positions 1290-1427 of SEQ ID NO:1 and, relative to those residues, has the following mutations:
(1) the amino acid residue at the position corresponding to position 1308 of SEQ ID NO:1 is replaced by an alanine residue or by an amino acid residue that is a conservative substitution for alanine (for example, a glycine, leucine, isoleucine or valine residue); or
(2) the amino acid residue at the position corresponding to position 1310 of SEQ ID NO:1 is replaced by an alanine residue or by an amino acid residue that is a conservative substitution for alanine (for example, a glycine, leucine, isoleucine or valine residue);
wherein the mutant, compared with the polypeptide having double-stranded DNA deaminase activity, has at least 90%, for example at least 95%, at least 96%, at least 97%, at least 98% or at least 99%, sequence identity, or has a substitution (preferably a conservative substitution), addition or deletion of one or several (for example, 1, 2, 3, 4, 5, 6, 7, 8 or 9) amino acids, and has double-stranded DNA deaminase activity; and
the amino acid residues of the polypeptide or mutant thereof at the positions corresponding to positions 1309, 1367 and 1368 of SEQ ID NO:1 are not mutated.
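The positional conditions of the first aspect can be expressed as a small check. The sketch below is a simplification under stated assumptions: both sequences cover exactly the region corresponding to positions 1290-1427 of SEQ ID NO:1 and are aligned without gaps; only the substitution and "not mutated" conditions are tested, not sequence identity or enzymatic activity. The function names and the placeholder wild-type sequence are hypothetical.

```python
# Numbering follows SEQ ID NO:1; sequences cover residues 1290-1427 only.
TOX_START, TOX_END = 1290, 1427
TOX_LENGTH = TOX_END - TOX_START + 1

# Alanine plus the conservative replacements given as examples in the text.
ALLOWED_REPLACEMENTS = frozenset("AGLIV")
CANDIDATE_POSITIONS = (1308, 1310)        # at least one must be substituted
PROTECTED_POSITIONS = (1309, 1367, 1368)  # must match the wild type


def residue_at(seq, position):
    """Return the residue at a SEQ ID NO:1 position within the 1290-1427 region."""
    return seq[position - TOX_START]


def mutate(seq, position, new_residue):
    """Return a copy of seq with the residue at `position` replaced."""
    i = position - TOX_START
    return seq[:i] + new_residue + seq[i + 1:]


def has_claimed_substitution(wild_type, candidate):
    """Check the positional conditions only (no identity or activity test)."""
    if len(wild_type) != TOX_LENGTH or len(candidate) != TOX_LENGTH:
        raise ValueError(f"both sequences must have {TOX_LENGTH} residues")
    for p in PROTECTED_POSITIONS:
        if residue_at(wild_type, p) != residue_at(candidate, p):
            return False
    return any(
        residue_at(candidate, p) != residue_at(wild_type, p)
        and residue_at(candidate, p) in ALLOWED_REPLACEMENTS
        for p in CANDIDATE_POSITIONS
    )


if __name__ == "__main__":
    wild_type = "K" * TOX_LENGTH  # hypothetical placeholder, not SEQ ID NO:1
    variant = mutate(wild_type, 1308, "A")
    print(has_claimed_substitution(wild_type, variant))
```

A real comparison would first align the candidate to SEQ ID NO:1 to find the corresponding positions; the fixed-offset indexing here stands in for that step.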
In certain embodiments, the wild-type double-stranded DNA deaminase has the amino acid sequence shown in SEQ ID NO:1.
In certain embodiments, the polypeptide having double-stranded DNA deaminase activity has the amino acid sequence shown in SEQ ID NO:3 or 5.
In a second aspect, the present application provides a mutated double-stranded DNA deaminase, or a variant thereof, which has the following mutations compared with a wild-type double-stranded DNA deaminase:
(1) the amino acid residue at the position corresponding to position 1308 of SEQ ID NO:1 is replaced by an alanine residue or by an amino acid residue that is a conservative substitution for alanine (for example, a glycine, leucine, isoleucine or valine residue); or
(2) the amino acid residue at the position corresponding to position 1310 of SEQ ID NO:1 is replaced by an alanine residue or by an amino acid residue that is a conservative substitution for alanine (for example, a glycine, leucine, isoleucine or valine residue);
wherein the variant, compared with the mutated double-stranded DNA deaminase, has at least 90%, for example at least 95%, at least 96%, at least 97%, at least 98% or at least 99%, sequence identity, or has a substitution (preferably a conservative substitution), addition or deletion of one or several (for example, 1, 2, 3, 4, 5, 6, 7, 8 or 9) amino acids, and has double-stranded DNA deaminase activity; and
the amino acid residues of the mutated double-stranded DNA deaminase or variant thereof at the positions corresponding to positions 1309, 1367 and 1368 of SEQ ID NO:1 are not mutated.
In certain embodiments, the amino acid sequence of the mutated double-stranded DNA deaminase or variant thereof at the positions corresponding to positions 1290-1427 of SEQ ID NO:1 is the amino acid sequence of the polypeptide or mutant thereof described above.
In certain embodiments, the wild-type double-stranded DNA deaminase has the amino acid sequence shown in SEQ ID NO:1.
In a third aspect, the present application provides a polypeptide polymer comprising a first polypeptide and a second polypeptide, wherein:
the first polypeptide comprises an N-terminal fragment and the second polypeptide comprises a C-terminal fragment;
the amino acid sequences of the N-terminal fragment and the C-terminal fragment are, respectively, those of the N-terminal fragment and the C-terminal fragment formed by breaking the polypeptide or mutant thereof of the first aspect at a split site;
wherein the polypeptide polymer is formed by association of the N-terminal fragment with the C-terminal fragment. For example, the polypeptide polymer is a dimer formed by the N-terminal fragment and the C-terminal fragment.
In certain embodiments, the N-terminal fragment and the C-terminal fragment, when each is present alone, have no double-stranded DNA deaminase activity or have significantly reduced deaminase activity (for example, at most 40%, at most 30%, at most 20%, at most 10%, at most 5% or at most 1% of the double-stranded DNA deaminase activity of the polypeptide described above).
In certain embodiments, when the N-terminal fragment associates with the C-terminal fragment, the polymer has double-stranded DNA deaminase activity (for example, at least 70%, at least 80%, at least 90% or at least 95% of the double-stranded DNA deaminase activity of the polypeptide).
In certain embodiments, the split site is the peptide bond immediately following the amino acid residue at the position corresponding to position 1333 of SEQ ID NO:1 in the polypeptide having double-stranded DNA deaminase activity or mutant thereof.
In certain embodiments, the N-terminal fragment has the amino acid sequence shown in SEQ ID NO:104 or 106.
In certain embodiments, the C-terminal fragment has the amino acid sequence shown in SEQ ID NO:55.
In certain embodiments, the split site is the peptide bond immediately following the amino acid residue at the position corresponding to position 1397 of SEQ ID NO:1 in the polypeptide having double-stranded DNA deaminase activity or mutant thereof.
In certain embodiments, the N-terminal fragment has the amino acid sequence shown in SEQ ID NO:35 or 37.
In certain embodiments, the C-terminal fragment has the amino acid sequence shown in SEQ ID NO:15.
In certain embodiments, the first polypeptide further comprises a first DNA-binding protein linked to the N-terminal fragment, and/or the second polypeptide further comprises a second DNA-binding protein linked to the C-terminal fragment.
In certain embodiments, the first DNA-binding protein and/or the second DNA-binding protein are each independently a programmable DNA-binding protein.
In certain embodiments, the first polypeptide and the second polypeptide each independently further comprise a mitochondrial targeting sequence (MTS) and/or a uracil glycosylase inhibitor (UGI) domain.
In certain embodiments, the first polypeptide comprises the following structures: a first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the N-terminal fragment, and the uracil glycosylase inhibitor (UGI) domain; the second polypeptide comprises the following structures: a second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the C-terminal fragment, and the uracil glycosylase inhibitor (UGI) domain.
In certain embodiments, adjacent structures in the first polypeptide and in the second polypeptide are each independently linked directly or through a linker (for example a peptide linker, such as a flexible peptide comprising one or more glycine (G) and/or serine (S) residues).
In certain embodiments, the first mitochondrial targeting sequence (MTS) is located at the N-terminus of the first polypeptide, and/or the second mitochondrial targeting sequence (MTS) is located at the N-terminus of the second polypeptide.
In certain embodiments, the first polypeptide comprises, in order from N-terminus to C-terminus: the first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the N-terminal fragment, and the uracil glycosylase inhibitor (UGI) domain; the second polypeptide comprises, in order from N-terminus to C-terminus: the second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the C-terminal fragment, and the uracil glycosylase inhibitor (UGI) domain.
In certain embodiments, the first mitochondrial targeting sequence and the second mitochondrial targeting sequence are each independently selected from mitochondrial targeting sequences derived from COX8 (cytochrome c oxidase subunit 8A), ATP5G2 (ATP synthase F0 complex subunit C2), SOD2 (superoxide dismutase 2) and COQ8A (mitochondrial atypical kinase COQ8A).
In certain embodiments, the first mitochondrial targeting sequence and the second mitochondrial targeting sequence are the same or different.
In certain embodiments, the first mitochondrial targeting sequence is a mitochondrial targeting sequence derived from SOD2. In certain embodiments, the first mitochondrial targeting sequence has the amino acid sequence shown in SEQ ID NO:9.
In certain embodiments, the second mitochondrial targeting sequence is a mitochondrial targeting sequence derived from COX8. In certain embodiments, the second mitochondrial targeting sequence has the amino acid sequence shown in SEQ ID NO:19.
In certain embodiments, the first DNA-binding protein and the second DNA-binding protein are each independently selected from a TALE (transcription activator-like effector) protein, a zinc finger protein and a Cas protein. In certain embodiments, the first DNA-binding protein and the second DNA-binding protein are each independently a TALE protein or a zinc finger protein.
In certain embodiments, the first DNA-binding protein and the second DNA-binding protein are the same or different.
In certain embodiments, the first DNA-binding protein and the second DNA-binding protein are both TALE proteins.
In certain embodiments, the first polypeptide and/or the second polypeptide each independently further comprise a nuclear export signal (NES) sequence.
In certain embodiments, the NES sequence is linked to the other domains of the first polypeptide or the second polypeptide directly or through a linker (for example a peptide linker, such as a flexible peptide comprising one or more glycine (G) and/or serine (S) residues).
In certain embodiments, the first polypeptide comprises a first NES sequence, and/or the second polypeptide comprises a second NES sequence.
In certain embodiments, the first NES sequence is located C-terminal to the first DNA-binding protein.
In certain embodiments, the second NES sequence is located C-terminal to the second DNA-binding protein.
In certain embodiments, the first NES sequence and the second NES sequence are the same or different.
In certain embodiments, the first NES sequence and the second NES sequence are each independently selected from NES sequences derived from the HIV Rev protein (HIV regulator of virion), mitogen-activated protein kinase (MAPK), cellular tumor antigen p53, and 60S ribosomal export protein NMD3. In certain embodiments, the first NES sequence or the second NES sequence has the amino acid sequence shown in SEQ ID NO:47 or 56, respectively.
In certain embodiments,
the first polypeptide comprises, in order from N-terminus to C-terminus:
(i) the first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the first NES sequence, the N-terminal fragment, and the uracil glycosylase inhibitor (UGI) domain;
(ii) the first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the N-terminal fragment, the first NES sequence, and the uracil glycosylase inhibitor (UGI) domain;
or
(iii) the first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the N-terminal fragment, the uracil glycosylase inhibitor (UGI) domain, and the first NES sequence;
and/or
the second polypeptide comprises, in order from N-terminus to C-terminus:
(i) the second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the second NES sequence, the C-terminal fragment, and the uracil glycosylase inhibitor (UGI) domain;
(ii) the second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the C-terminal fragment, the second NES sequence, and the uracil glycosylase inhibitor (UGI) domain; or
(iii) the second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the C-terminal fragment, the uracil glycosylase inhibitor (UGI) domain, and the second NES sequence.
In certain embodiments, the first polypeptide comprises, in order from N-terminus to C-terminus: the first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the N-terminal fragment, the uracil glycosylase inhibitor (UGI) domain, and the first NES sequence; and/or the second polypeptide comprises, in order from N-terminus to C-terminus: the second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the C-terminal fragment, the uracil glycosylase inhibitor (UGI) domain, and the second NES sequence.
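The domain orders listed above can be enumerated programmatically, which may help when comparing constructs. The following is only an illustrative sketch: the labels are plain strings rather than sequences, linkers are ignored, and the function name `domain_order` and the option keys are hypothetical.

```python
# Insertion index of the NES within the base order
# [MTS, DNA-binding protein, fragment, UGI] for each listed arrangement.
NES_INDEX = {
    "i": 2,    # NES directly after the DNA-binding protein
    "ii": 3,   # NES directly after the deaminase fragment
    "iii": 4,  # NES after the UGI domain (C-terminal end)
}


def domain_order(fragment_label, nes_option=None):
    """Return the N-to-C domain order for one half of the editor.

    fragment_label: e.g. "N-terminal fragment" or "C-terminal fragment".
    nes_option: None for no NES, or one of "i", "ii", "iii".
    """
    domains = ["MTS", "DNA-binding protein", fragment_label, "UGI"]
    if nes_option is not None:
        domains.insert(NES_INDEX[nes_option], "NES")
    return domains


if __name__ == "__main__":
    for option in (None, "i", "ii", "iii"):
        print(option, " - ".join(domain_order("N-terminal fragment", option)))
```

Arrangement (iii), with the NES at the C-terminal end, is the order singled out in the final embodiment of this passage.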
在第四方面,本申请提供了多肽聚合物,其包含第一多肽和第二多肽,其中,所述第一多肽包含第一线粒体靶向序列(MTS)、所述第一DNA结合蛋白、N-末端片段,以及,所述尿嘧啶糖基化酶抑制剂(UGI)结构域;所述第二多肽包含:第二线粒体靶向序列(MTS)、所述第二DNA结合蛋白、C-末端片段,以及,所述尿嘧啶糖基化酶抑制剂(UGI)结构域;In a fourth aspect, the application provides a polypeptide polymer comprising a first polypeptide and a second polypeptide, wherein the first polypeptide comprises a first mitochondrial targeting sequence (MTS), the first DNA binding protein, an N-terminal fragment, and the uracil glycosylase inhibitor (UGI) domain; the second polypeptide includes: a second mitochondrial targeting sequence (MTS), the second DNA binding protein , C-terminal fragment, and, the uracil glycosylase inhibitor (UGI) domain;
所述N-末端片段和所述C-末端片段的氨基酸序列是由具有双链DNA脱氨酶活性的多肽在切割位点断裂形成的N-末端片段和C-末端片段的氨基酸序列;所述具有双链DNA脱氨酶活性的多肽包含野生型双链DNA脱氨酶中与SEQ ID NO:1的第1290-1427位对应位置处的氨基酸残基;The amino acid sequences of the N-terminal fragment and the C-terminal fragment are the amino acid sequences of the N-terminal fragment and the C-terminal fragment formed by cleavage of a polypeptide with double-stranded DNA deaminase activity at the cleavage site; the The polypeptide having double-stranded DNA deaminase activity includes the amino acid residues at positions corresponding to positions 1290-1427 of SEQ ID NO: 1 in the wild-type double-stranded DNA deaminase;
其中,所述多肽聚合物由所述N-末端片段和所述C-末端片段聚合形成。Wherein, the polypeptide polymer is formed by polymerizing the N-terminal fragment and the C-terminal fragment.
在某些实施方案中,所述第一多肽还包含第一出核信号(NES)序列;和/或,所述第二多肽还包含第二出核信号(NES)序列。In certain embodiments, the first polypeptide further comprises a first nuclear export signal (NES) sequence; and/or the second polypeptide further comprises a second nuclear export signal (NES) sequence.
在某些实施方案中,所述野生型双链DNA脱氨酶具有如SEQ ID NO:1所示的氨基酸序列。 In certain embodiments, the wild-type double-stranded DNA deaminase has the amino acid sequence set forth in SEQ ID NO:1.
在某些实施方案中,所述具有双链DNA脱氨酶活性的多肽具有如SEQ ID NO:2所示的氨基酸序列。In certain embodiments, the polypeptide having double-stranded DNA deaminase activity has an amino acid sequence as shown in SEQ ID NO: 2.
In certain embodiments, the N-terminal fragment and the C-terminal fragment, when each is present alone, have no double-stranded DNA deaminase activity or have significantly reduced deaminase activity (for example, at most 40%, at most 30%, at most 20%, at most 10%, at most 5%, or at most 1% of the double-stranded DNA deaminase activity of the polypeptide having double-stranded DNA deaminase activity).
In certain embodiments, when the N-terminal fragment is assembled with the C-terminal fragment, the resulting polymer has double-stranded DNA deaminase activity (for example, at least 70%, at least 80%, at least 90%, or at least 95% of the double-stranded DNA deaminase activity of the polypeptide having double-stranded DNA deaminase activity).
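The two activity criteria above can be read as simple ratios against the unsplit polypeptide. The following is a minimal sketch only: it assumes the loosest example values from the text (at most 40% for a lone fragment, at least 70% for the assembled pair), and the function names and the notion of a single measured activity value per sample are hypothetical, introduced purely for illustration.

```python
# Example thresholds taken from the percentages listed in the text
# (loosest values only; these names are illustrative assumptions).
MAX_FRAGMENT_FRACTION = 0.40   # "at most 40%" for a fragment on its own
MIN_ASSEMBLED_FRACTION = 0.70  # "at least 70%" for the assembled pair


def relative_activity(measured: float, reference: float) -> float:
    """Activity of a sample as a fraction of the unsplit polypeptide's activity."""
    if reference <= 0:
        raise ValueError("reference activity must be positive")
    return measured / reference


def split_pair_meets_examples(n_alone: float, c_alone: float,
                              assembled: float, reference: float) -> bool:
    """True if both lone fragments are weakly active and the pair is active."""
    fragments_quiet = all(
        relative_activity(x, reference) <= MAX_FRAGMENT_FRACTION
        for x in (n_alone, c_alone)
    )
    pair_active = relative_activity(assembled, reference) >= MIN_ASSEMBLED_FRACTION
    return fragments_quiet and pair_active
```

Tighter embodiments (for example, at most 1% and at least 95%) would correspond to changing the two constants.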
In certain embodiments, the cleavage site is the peptide bond in the polypeptide having double-stranded DNA deaminase activity that immediately follows the amino acid residue at the position corresponding to position 1333 of SEQ ID NO: 1.
In certain embodiments, the N-terminal fragment has the amino acid sequence set forth in SEQ ID NO: 54.
In certain embodiments, the C-terminal fragment has the amino acid sequence set forth in SEQ ID NO: 55.
In certain embodiments, the cleavage site is the peptide bond in the polypeptide having double-stranded DNA deaminase activity that immediately follows the amino acid residue at the position corresponding to position 1397 of SEQ ID NO: 1.
In certain embodiments, the N-terminal fragment has the amino acid sequence set forth in SEQ ID NO: 14.
In certain embodiments, the C-terminal fragment has the amino acid sequence set forth in SEQ ID NO: 15.
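The position arithmetic behind the two cleavage sites can be sketched as follows. This is an illustration under stated assumptions: the deaminase region is taken to consist of exactly the residues at positions 1290-1427 (SEQ ID NO: 1 numbering), the sequence is a placeholder string rather than any sequence from the sequence listing, and the function name is hypothetical. The fragments of SEQ ID NOs: 54/55 and 14/15 may differ from this simplified picture.

```python
from typing import Tuple

# Assumed numbering of the deaminase region in SEQ ID NO: 1 coordinates.
REGION_START = 1290
REGION_END = 1427


def split_region(region: str, split_after: int) -> Tuple[str, str]:
    """Split at the peptide bond immediately following residue `split_after`."""
    expected_len = REGION_END - REGION_START + 1
    if len(region) != expected_len:
        raise ValueError(f"expected {expected_len} residues, got {len(region)}")
    if not REGION_START <= split_after < REGION_END:
        raise ValueError("split position must lie inside the region")
    cut = split_after - REGION_START + 1  # number of residues in the N-terminal fragment
    return region[:cut], region[cut:]


# Placeholder residues only; this is not a real protein sequence.
placeholder = "X" * (REGION_END - REGION_START + 1)
for site in (1333, 1397):
    n_frag, c_frag = split_region(placeholder, site)
    print(site, len(n_frag), len(c_frag))
```

Under these assumptions, a split after position 1333 gives fragments of 44 and 94 residues, and a split after position 1397 gives fragments of 108 and 30 residues.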
In certain embodiments, the first DNA-binding protein and/or the second DNA-binding protein is a programmable DNA-binding protein.
In certain embodiments, the structural elements are optionally connected to one another by a linker (for example, a peptide linker, such as a flexible peptide comprising one or more glycine (G) and/or serine (S) residues).
In certain embodiments, the first mitochondrial targeting sequence (MTS) is located at the N-terminus of the first polypeptide, and/or the second mitochondrial targeting sequence (MTS) is located at the N-terminus of the second polypeptide.
In certain embodiments, the first mitochondrial targeting sequence and the second mitochondrial targeting sequence are each independently selected from mitochondrial targeting sequences derived from COX8 (cytochrome c oxidase subunit 8A), ATP5G2 (ATP synthase F0 complex subunit C2), SOD2 (superoxide dismutase 2), and COQ8A (atypical mitochondrial kinase COQ8A).
In certain embodiments, the first mitochondrial targeting sequence is the same as or different from the second mitochondrial targeting sequence.
In certain embodiments, the first mitochondrial targeting sequence is a mitochondrial targeting sequence derived from SOD2.
In certain embodiments, the second mitochondrial targeting sequence is a mitochondrial targeting sequence derived from COX8.
In certain embodiments, the first DNA-binding protein and the second DNA-binding protein are each independently selected from: a TALE (transcription activator-like effector) protein, a zinc finger protein, and a Cas protein. In certain embodiments, the first DNA-binding protein and the second DNA-binding protein are each independently a TALE (transcription activator-like effector) protein or a zinc finger protein.
In certain embodiments, the first DNA-binding protein is the same as or different from the second DNA-binding protein.
In certain embodiments, the first DNA-binding protein and the second DNA-binding protein are both TALE proteins.
In certain embodiments, the first NES sequence is located C-terminal to the first DNA-binding protein.
In certain embodiments, the second NES sequence is located C-terminal to the second DNA-binding protein.
In certain embodiments, the first NES sequence is the same as or different from the second NES sequence.
In certain embodiments, the first NES sequence and the second NES sequence are each independently selected from NES sequences derived from the HIV Rev protein (HIV regulator of virion), mitogen-activated protein kinase (MAPK), cellular tumor antigen p53, and the 60S ribosomal export protein NMD3.
In certain embodiments, the first polypeptide comprises, in order from the N-terminus to the C-terminus:
(i) the first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the first NES sequence, the N-terminal fragment, and the uracil glycosylase inhibitor (UGI) domain;
(ii) the first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the N-terminal fragment, the first NES sequence, and the uracil glycosylase inhibitor (UGI) domain;
or
(iii) the first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the N-terminal fragment, the uracil glycosylase inhibitor (UGI) domain, and the first NES sequence;
and/or
the second polypeptide comprises, in order from the N-terminus to the C-terminus:
(i) the second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the second NES sequence, the C-terminal fragment, and the uracil glycosylase inhibitor (UGI) domain;
(ii) the second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the C-terminal fragment, the second NES sequence, and the uracil glycosylase inhibitor (UGI) domain; or
(iii) the second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the C-terminal fragment, the uracil glycosylase inhibitor (UGI) domain, and the second NES sequence.
In certain embodiments, the first polypeptide comprises, in order from the N-terminus to the C-terminus: the first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the N-terminal fragment, the uracil glycosylase inhibitor (UGI) domain, and the first NES sequence; and/or the second polypeptide comprises, in order from the N-terminus to the C-terminus: the second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the C-terminal fragment, the uracil glycosylase inhibitor (UGI) domain, and the second NES sequence.
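The three domain arrangements (i) to (iii) listed above differ only in where the NES sequence sits relative to the deaminase fragment and the UGI domain. The following sketch tabulates them as ordered lists of labels; the labels (such as "MTS1" and "DBP1") and the function name are hypothetical shorthand introduced for illustration, and optional linkers between elements are omitted.

```python
from typing import Dict, List


def domain_order(polypeptide: str, option: str) -> List[str]:
    """Return illustrative domain labels from N-terminus to C-terminus."""
    if polypeptide == "first":
        mts, dbp, nes, frag = "MTS1", "DBP1", "NES1", "N-terminal fragment"
    elif polypeptide == "second":
        mts, dbp, nes, frag = "MTS2", "DBP2", "NES2", "C-terminal fragment"
    else:
        raise ValueError("polypeptide must be 'first' or 'second'")

    layouts: Dict[str, List[str]] = {
        "i": [mts, dbp, nes, frag, "UGI"],    # NES directly after the DNA-binding protein
        "ii": [mts, dbp, frag, nes, "UGI"],   # NES between the fragment and UGI
        "iii": [mts, dbp, frag, "UGI", nes],  # NES at the C-terminal end
    }
    if option not in layouts:
        raise ValueError("option must be 'i', 'ii', or 'iii'")
    return layouts[option]


for option in ("i", "ii", "iii"):
    print(option, " - ".join(domain_order("first", option)))
```

In every arrangement the MTS is N-terminal and the NES lies C-terminal to the DNA-binding protein, which matches the placement statements given for those elements.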
In a fifth aspect, the present application further provides an isolated nucleic acid molecule encoding the polypeptide having double-stranded DNA deaminase activity or mutant thereof according to the first aspect, the mutated double-stranded DNA deaminase or variant thereof according to the second aspect, or the first polypeptide, the second polypeptide, or a combination thereof according to the third or fourth aspect.
In certain embodiments, the isolated nucleic acid molecule comprises a first nucleotide sequence encoding the first polypeptide according to the third aspect and a second nucleotide sequence encoding the second polypeptide according to the third aspect, wherein the first nucleotide sequence and the second nucleotide sequence are present on the same isolated nucleic acid molecule or on different isolated nucleic acid molecules. When the first nucleotide sequence and the second nucleotide sequence are present on different isolated nucleic acid molecules, the isolated nucleic acid molecule of the present invention comprises a first nucleic acid molecule containing the first nucleotide sequence and a second nucleic acid molecule containing the second nucleotide sequence.
In certain embodiments, the isolated nucleic acid molecule comprises a first nucleotide sequence encoding the first polypeptide according to the fourth aspect and a second nucleotide sequence encoding the second polypeptide according to the fourth aspect, wherein the first nucleotide sequence and the second nucleotide sequence are present on the same isolated nucleic acid molecule or on different isolated nucleic acid molecules. When the first nucleotide sequence and the second nucleotide sequence are present on different isolated nucleic acid molecules, the isolated nucleic acid molecule of the present invention comprises a first nucleic acid molecule containing the first nucleotide sequence and a second nucleic acid molecule containing the second nucleotide sequence.
In a sixth aspect, the present application provides a vector comprising the isolated nucleic acid molecule described above. In certain embodiments, the vector is a cloning vector or an expression vector.
In certain embodiments, the vector comprises a first nucleotide sequence encoding the first polypeptide according to the third aspect and a second nucleotide sequence encoding the second polypeptide according to the third aspect, wherein the first nucleotide sequence and the second nucleotide sequence are present on the same vector or on different vectors. When the first nucleotide sequence and the second nucleotide sequence are present on different vectors, the vector of the present invention comprises a first vector containing the first nucleotide sequence and a second vector containing the second nucleotide sequence.
In certain embodiments, the vector comprises a first nucleotide sequence encoding the first polypeptide according to the fourth aspect and a second nucleotide sequence encoding the second polypeptide according to the fourth aspect, wherein the first nucleotide sequence and the second nucleotide sequence are present on the same vector or on different vectors. When the first nucleotide sequence and the second nucleotide sequence are present on different vectors, the vector of the present invention comprises a first vector containing the first nucleotide sequence and a second vector containing the second nucleotide sequence.
In a seventh aspect, the present application further provides a host cell comprising the nucleic acid molecule or vector described above. Such host cells include, but are not limited to, prokaryotic cells such as bacterial cells (e.g., E. coli cells), and eukaryotic cells such as fungal cells (e.g., yeast cells), insect cells, plant cells, and animal cells (e.g., mammalian cells, such as mouse cells and human cells).
In an eighth aspect, the present application further provides a method for preparing the polypeptide having double-stranded DNA deaminase activity or mutant thereof according to the first aspect, the mutated double-stranded DNA deaminase or variant thereof according to the second aspect, the first polypeptide or second polypeptide according to the third aspect, or the first polypeptide or second polypeptide according to the fourth aspect, the method comprising culturing the host cell described above under conditions that allow protein expression, and recovering from the cultured host cell culture the polypeptide having double-stranded DNA deaminase activity or mutant thereof, the mutated double-stranded DNA deaminase or variant thereof, the first polypeptide, or the second polypeptide.
In certain embodiments, the first polypeptide and the second polypeptide are not co-expressed in the same host cell.
In a ninth aspect, the present application further provides a composition comprising a first component and a second component that are separate from each other, wherein:
(i) the first component comprises the first polypeptide according to the third aspect or a first polynucleotide encoding the first polypeptide; and the second component comprises the second polypeptide according to the third aspect or a second polynucleotide encoding the second polypeptide;
or
(ii) the first component comprises the first polypeptide according to the fourth aspect or a first polynucleotide encoding the first polypeptide; and the second component comprises the second polypeptide according to the fourth aspect or a second polynucleotide encoding the second polypeptide.
In certain embodiments, the composition further comprises a third component, the third component comprising a fusion protein or a third polynucleotide encoding the fusion protein; wherein the fusion protein comprises one or more nuclear localization signal (NLS) sequences and a polypeptide capable of inhibiting double-stranded DNA deaminase activity.
In certain embodiments, the fusion protein is capable of inhibiting the double-stranded DNA deaminase activity of the polypeptide polymer formed by the first polypeptide and the second polypeptide.
In certain embodiments, the NLS sequence is located at the N-terminus and/or the C-terminus of the polypeptide capable of inhibiting double-stranded DNA deaminase activity.
In certain embodiments, the NLS sequence is connected to the polypeptide capable of inhibiting double-stranded DNA deaminase activity directly or via a linker (for example, a peptide linker, such as a flexible peptide comprising one or more glycine (G) and/or serine (S) residues).
In certain embodiments, the NLS sequence is selected from NLS sequences derived from simian vacuolating virus 40 (SV40), testis-determining factor (SRY), and nucleoplasmin, and the commonly used bipartite nuclear localization signal (bipartite NLS, bpNLS).
In certain embodiments, in the fusion protein, an NLS sequence is connected to each of the N-terminus and the C-terminus of the polypeptide capable of inhibiting double-stranded DNA deaminase activity. In certain embodiments, a first NLS sequence is connected to the N-terminus of the polypeptide capable of inhibiting double-stranded DNA deaminase activity, and a second NLS sequence is connected to the C-terminus of the polypeptide capable of inhibiting double-stranded DNA deaminase activity.
In certain embodiments, the first NLS sequence has the amino acid sequence set forth in SEQ ID NO: 62. In certain embodiments, the second NLS sequence has the amino acid sequence set forth in SEQ ID NO: 63.
In certain embodiments, the first NLS sequence and the second NLS sequence have the amino acid sequence set forth in SEQ ID NO: 61.
In certain embodiments, the polypeptide capable of inhibiting double-stranded DNA deaminase activity comprises the minimal active domain of the DddIA protein. In certain embodiments, the polypeptide capable of inhibiting double-stranded DNA deaminase activity has the amino acid sequence set forth in SEQ ID NO: 60.
In certain embodiments, the fusion protein has the amino acid sequence set forth in SEQ ID NO: 109 or 110.
In certain embodiments, the fusion protein has the amino acid sequence set forth in SEQ ID NO: 64 or 65.
In a tenth aspect, the present application further provides a method for editing a target nucleotide sequence outside a cell, the method comprising contacting the target nucleotide sequence with the polypeptide polymer or composition described above under conditions suitable for editing the target nucleic acid, thereby inducing deamination of a target base in the target nucleotide sequence.
In certain embodiments, the target base is cytosine.
In certain embodiments, the method comprises contacting the target nucleotide sequence with the composition described above, wherein the composition comprises a first component and a second component that are separate from each other, the first component comprising the first polypeptide described above and the second component comprising the second polypeptide described above; or the method comprises contacting the target nucleotide sequence with the polypeptide polymer described above, the polypeptide polymer comprising the first polypeptide described above and the second polypeptide described above.
In certain embodiments, the first polypeptide comprises a first DNA-binding protein, and the second polypeptide comprises a second DNA-binding protein.
In certain embodiments, the first DNA-binding protein targets a first nucleotide sequence flanking the target base on one side, and the second DNA-binding protein targets a second nucleotide sequence flanking the target base on the other side; as a result, the first polypeptide and the second polypeptide are able to form a polypeptide polymer, thereby inducing deamination of the target base.
In certain embodiments, the first polypeptide and/or the second polypeptide each independently further comprises a nuclear export signal (NES) sequence.
In certain embodiments, the NES sequence is connected to the other domains of the first polypeptide or the second polypeptide directly or via a linker (for example, a peptide linker, such as a flexible peptide comprising one or more glycine (G) and/or serine (S) residues).
In certain embodiments, the first polypeptide comprises a first NES sequence, and/or the second polypeptide comprises a second NES sequence.
In certain embodiments, the first NES sequence is located C-terminal to the first DNA-binding protein.
In certain embodiments, the second NES sequence is located C-terminal to the second DNA-binding protein.
In certain embodiments, the first NES sequence is the same as or different from the second NES sequence.
In certain embodiments, the first NES sequence and the second NES sequence are each independently selected from NES sequences derived from the HIV Rev protein (HIV regulator of virion), mitogen-activated protein kinase (MAPK), cellular tumor antigen p53, and the 60S ribosomal export protein NMD3. In certain embodiments, the first NES sequence or the second NES sequence has the amino acid sequence set forth in SEQ ID NO: 47 or 56, respectively.
In certain embodiments, the first polypeptide comprises, in order from the N-terminus to the C-terminus:
(i) the first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the first NES sequence, the N-terminal fragment, and the uracil glycosylase inhibitor (UGI) domain;
(ii) the first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the N-terminal fragment, the first NES sequence, and the uracil glycosylase inhibitor (UGI) domain;
or
(iii) the first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the N-terminal fragment, the uracil glycosylase inhibitor (UGI) domain, and the first NES sequence;
and/or
the second polypeptide comprises, in order from the N-terminus to the C-terminus:
(i) the second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the second NES sequence, the C-terminal fragment, and the uracil glycosylase inhibitor (UGI) domain;
(ii) the second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the C-terminal fragment, the second NES sequence, and the uracil glycosylase inhibitor (UGI) domain; or
(iii) the second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the C-terminal fragment, the uracil glycosylase inhibitor (UGI) domain, and the second NES sequence.
In certain embodiments, the first polypeptide comprises, in order from the N-terminus to the C-terminus: the first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the N-terminal fragment, the uracil glycosylase inhibitor (UGI) domain, and the first NES sequence; and/or the second polypeptide comprises, in order from the N-terminus to the C-terminus: the second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the C-terminal fragment, the uracil glycosylase inhibitor (UGI) domain, and the second NES sequence.
In an eleventh aspect, the present application further provides a method for editing a target nucleotide sequence inside a cell, the method comprising delivering the polypeptide polymer or composition described above into a cell containing the target nucleotide sequence, thereby inducing deamination of a target base at the target site.
In certain embodiments, the target base is cytosine.
In certain embodiments, the method comprises delivering the composition described above into a cell containing the target nucleotide sequence.
In certain embodiments, the composition comprises a first component and a second component that are separate from each other, the first component comprising the first polypeptide described above and the second component comprising the second polypeptide described above; or the first component comprising a first polynucleotide encoding the first polypeptide and the second component comprising a second polynucleotide encoding the second polypeptide.
In certain embodiments, the first component comprises the first polypeptide described above, and the second component comprises the second polypeptide described above. After the first component and the second component are delivered into the cell, the first polypeptide and the second polypeptide are able to form a polypeptide polymer, thereby inducing deamination of the target base.
In certain embodiments, the first component comprises a first polynucleotide encoding the first polypeptide, and the second component comprises a second polynucleotide encoding the second polypeptide. After the first component and the second component are delivered into the cell, the first polypeptide encoded by the first polynucleotide and the second polypeptide encoded by the second polynucleotide are able to form a polypeptide polymer, thereby inducing deamination of the target base.
In certain embodiments, the first polypeptide comprises a first DNA-binding protein, and the second polypeptide comprises a second DNA-binding protein.
In certain embodiments, the first DNA-binding protein targets a first nucleotide sequence flanking the target base on one side, and the second DNA-binding protein targets a second nucleotide sequence flanking the target base on the other side; as a result, the first polypeptide and the second polypeptide are able to form a polypeptide polymer, thereby inducing deamination of the target base.
In certain embodiments, the method comprises delivering the composition described above that further comprises the third component into a cell containing the target nucleotide sequence; as a result, the fusion protein in the composition, or the fusion protein encoded by the third polynucleotide in the composition, localizes to the cell nucleus by means of the NLS sequence it contains and reduces the double-stranded DNA deaminase activity of the first polypeptide, the second polypeptide, or the combination of the first polypeptide and the second polypeptide located in the nucleus.
In certain embodiments, the method comprises delivering the polypeptide polymer described above into a cell containing the target nucleotide sequence.
In certain embodiments, the polypeptide polymer comprises the first polypeptide described above and the second polypeptide described above, wherein the first polypeptide comprises a first DNA-binding protein and the second polypeptide comprises a second DNA-binding protein.
In certain embodiments, the first DNA-binding protein targets a first nucleotide sequence flanking the target base on one side, and the second DNA-binding protein targets a second nucleotide sequence flanking the target base on the other side, thereby inducing deamination of the target base.
In certain embodiments, the method further comprises delivering the fusion protein described above, or a polynucleotide encoding the fusion protein, into a cell containing the target nucleotide sequence.
In certain embodiments, the first polypeptide and/or the second polypeptide each independently further comprises a nuclear export signal (NES) sequence.
In certain embodiments, the NES sequence is connected to the other domains of the first polypeptide or the second polypeptide directly or via a linker (for example, a peptide linker, such as a flexible peptide comprising one or more glycine (G) and/or serine (S) residues).
In certain embodiments, the first polypeptide comprises a first NES sequence, and/or the second polypeptide comprises a second NES sequence.
In certain embodiments, the first NES sequence is located C-terminal to the first DNA-binding protein.
In certain embodiments, the second NES sequence is located C-terminal to the second DNA-binding protein.
In certain embodiments, the first NES sequence is the same as or different from the second NES sequence.
In certain embodiments, the first NES sequence and the second NES sequence are each independently selected from NES sequences derived from the HIV Rev protein (HIV regulator of virion), mitogen-activated protein kinase (MAPK), cellular tumor antigen p53, and the 60S ribosomal export protein NMD3. In certain embodiments, the first NES sequence or the second NES sequence has the amino acid sequence set forth in SEQ ID NO: 47 or 56, respectively.
In certain embodiments, the first polypeptide comprises, in order from N-terminus to C-terminus:
(i) the first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the first NES sequence, the N-terminal fragment, and the uracil glycosylase inhibitor (UGI) domain;
(ii) the first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the N-terminal fragment, the first NES sequence, and the uracil glycosylase inhibitor (UGI) domain;
or,
(iii) the first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the N-terminal fragment, the uracil glycosylase inhibitor (UGI) domain, and the first NES sequence;
and/or,
the second polypeptide comprises, in order from N-terminus to C-terminus:
(i) the second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the second NES sequence, the C-terminal fragment, and the uracil glycosylase inhibitor (UGI) domain;
(ii) the second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the C-terminal fragment, the second NES sequence, and the uracil glycosylase inhibitor (UGI) domain; or,
(iii) the second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the C-terminal fragment, the uracil glycosylase inhibitor (UGI) domain, and the second NES sequence.
In certain embodiments, the first polypeptide comprises, in order from N-terminus to C-terminus: the first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the N-terminal fragment, the uracil glycosylase inhibitor (UGI) domain, and the first NES sequence; and/or the second polypeptide comprises, in order from N-terminus to C-terminus: the second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the C-terminal fragment, the uracil glycosylase inhibitor (UGI) domain, and the second NES sequence.
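For illustration only, the three alternative NES positions listed above can be written out as ordered domain lists. The following Python sketch is not part of the application; the function name `domain_order` and the short domain labels are hypothetical and were chosen here for readability.

```python
# Illustrative sketch; all names and labels are assumptions, not taken from the application.
# The backbone without the NES is: MTS, DNA-binding protein, split fragment, UGI.
# Arrangements (i), (ii) and (iii) differ only in where the NES is inserted.
NES_INSERT_INDEX = {"i": 2, "ii": 3, "iii": 4}


def domain_order(fragment, arrangement):
    """Return the N-to-C domain order for arrangement "i", "ii" or "iii".

    fragment is "N-terminal fragment" for the first polypeptide or
    "C-terminal fragment" for the second polypeptide.
    """
    backbone = ["MTS", "DNA-binding protein", fragment, "UGI"]
    index = NES_INSERT_INDEX[arrangement]
    return backbone[:index] + ["NES"] + backbone[index:]
```

Calling `domain_order("N-terminal fragment", "iii")` gives the arrangement in which the NES follows the UGI domain, which is the one singled out in the last embodiment above.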
In a twelfth aspect, the present application further provides a kit comprising the polypeptide having double-stranded DNA deaminase activity or a mutant thereof as described in the first aspect, the mutated double-stranded DNA deaminase or a variant thereof as described in the second aspect, the polypeptide polymer as described in the third or fourth aspect, the isolated nucleic acid molecule as described in the fifth aspect, the vector as described in the sixth aspect, the host cell as described in the seventh aspect, or the composition as described in the ninth aspect.
In certain embodiments, the kit comprises the polypeptide polymer as described in the third aspect. In certain embodiments, the kit further comprises a fusion protein as described above or a polynucleotide encoding the fusion protein.
In certain embodiments, the kit comprises the composition as described in the ninth aspect.
In a thirteenth aspect, the present application further provides use of the polypeptide having double-stranded DNA deaminase activity or a mutant thereof as described in the first aspect, the mutated double-stranded DNA deaminase or a variant thereof as described in the second aspect, the polypeptide polymer as described in the third or fourth aspect, the first polypeptide as described in the third or fourth aspect, the second polypeptide as described in the third or fourth aspect, the isolated nucleic acid molecule as described in the fifth aspect, the vector as described in the sixth aspect, the host cell as described in the seventh aspect, or the composition as described in the ninth aspect, for preparing a kit for editing a target nucleotide sequence or for editing a target nucleotide sequence.
Definition of Terms
In the present invention, unless otherwise stated, the scientific and technical terms used herein have the meanings commonly understood by those skilled in the art. Moreover, the virology, biochemistry, and immunology laboratory procedures used herein are routine procedures widely used in the corresponding fields. To provide a better understanding of the present invention, definitions and explanations of relevant terms are given below.
When the terms "for example", "e.g.", "such as", "including", "comprising" or variants thereof are used herein, these terms are not to be regarded as limiting and are to be interpreted as meaning "but not limited to" or "without limitation".
Unless otherwise indicated herein or clearly contradicted by context, the terms "a", "an" and "the" and similar referents in the context of describing the invention (especially in the context of the following claims) are to be construed to cover both the singular and the plural.
As used herein, the term "base editing" refers to a genome editing technique that converts a specific nucleobase into another nucleobase at a target genomic site (e.g., including in mtDNA). In certain embodiments, this can be achieved without requiring a double-stranded DNA break (DSB) or a single-strand break (i.e., a nick).
As used herein, the term "deaminase" refers to a protein or enzyme that catalyzes a deamination reaction. In some embodiments, the deaminase is an adenosine (or adenine) deaminase, which catalyzes the hydrolytic deamination of adenine or adenosine. In some embodiments, the adenosine deaminase catalyzes the hydrolytic deamination of adenine or adenosine in deoxyribonucleic acid (DNA) to inosine. In certain embodiments, the deaminase is a cytidine (or cytosine) deaminase that catalyzes the hydrolytic deamination of cytidine or cytosine. In certain embodiments, the deaminase is a double-stranded DNA deaminase, or has been modified, evolved, or otherwise altered so as to be able to use double-stranded DNA as a substrate for deamination. In certain embodiments, the deaminase is a cytidine (or cytosine) deaminase that catalyzes the hydrolytic deamination of cytidine or cytosine using double-stranded DNA directly as the substrate.
As used herein, the term "identity" refers to the degree of sequence match between two polypeptides or between two nucleic acids. To determine the percent identity of two amino acid sequences or two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps may be introduced in the first amino acid or nucleic acid sequence for optimal alignment with the second amino acid or nucleic acid sequence). The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, the molecules are identical at that position. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences (i.e., percent identity = number of identical overlapping positions / total number of positions × 100%). In certain embodiments, the two sequences are the same length.
The determination of percent identity between two sequences can also be accomplished using a mathematical algorithm. A non-limiting example of a mathematical algorithm used for the comparison of two sequences is the algorithm of Karlin and Altschul, 1990, Proc. Natl. Acad. Sci. U.S.A. 87:2264-2268, as modified in Karlin and Altschul, 1993, Proc. Natl. Acad. Sci. U.S.A. 90:5873-5877. Such an algorithm is incorporated into the NBLAST and XBLAST programs of Altschul et al., 1990, J. Mol. Biol. 215:403.
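The percent identity formula given above can be illustrated with a short Python sketch. It is not part of the application. It assumes that the two sequences have already been aligned to equal length with "-" marking gaps, and that "total number of positions" means the number of alignment columns; both are assumptions made here, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: gaps are written as "-" and every alignment column counts
    toward the total number of positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    # A column is identical when both sequences carry the same non-gap symbol.
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * identical / len(aligned_a)
```

For example, the aligned pair "ACGT" and "ACGA" shares three of four positions, so the function returns 75.0.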
As used herein, the term "variant", in the context of polypeptides, refers to a polypeptide or peptide comprising an amino acid sequence that has been altered by introducing amino acid residue substitutions, deletions, or additions. In some cases, the term "variant" also refers to a polypeptide or peptide that has been modified (i.e., by covalently attaching any type of molecule to the polypeptide or peptide). For example, but without limitation, a polypeptide may be modified by glycosylation, acetylation, PEGylation, phosphorylation, amidation, derivatization by known protecting/blocking groups, proteolytic cleavage, linkage to a cellular ligand or other protein, and the like. Derivatized polypeptides or peptides can be produced by chemical modification using techniques known to those skilled in the art, including, but not limited to, specific chemical cleavage, acetylation, formylation, metabolic synthesis of tunicamycin, and the like. Furthermore, a variant has a function similar to, the same as, or improved over that of the polypeptide or peptide from which it is derived.
In certain embodiments, the programmable DNA-binding protein may be selected from TALEs, ZFs, Casx, Casy, Cpf1, C2c1, C2c2, C2c3, Argonaute proteins, or derivatives thereof. In certain embodiments, the programmable DNA-binding protein has no nuclease activity. In certain embodiments, the programmable DNA-binding protein can cleave only one strand of a nucleic acid duplex. In certain embodiments, the programmable DNA-binding protein lacks the activity of generating double-strand breaks in nucleic acids.
As used herein, the term "transcription activator-like effector protein" or "TALE protein" refers to a class of natural proteins secreted by plant-pathogenic bacteria of the genus Xanthomonas when they invade a host; these proteins play an important role when Xanthomonas bacteria invade plants. Xanthomonas bacteria inject TALE proteins into plant cells through the Type III secretion system. After entering the host cell, a TALE protein can bind specifically to sites upstream of certain gene expression regions and regulate the expression of those genes, thereby assisting bacterial infection of the plant host. A natural TALE protein consists of an N-terminal translocation signal domain, a C-terminal nuclear localization signal (NLS) and transcription activation domain (AD), and a central highly repetitive domain responsible for recognizing DNA. The highly repetitive domain uses 33-35 amino acids as its basic unit, with 12-25 repeats in total. In each basic unit, all amino acids are highly conserved except the two adjacent residues at positions 12 and 13, which are therefore called the repeat variable di-residue (RVD). Different RVDs specifically recognize particular DNA bases: for example, NI recognizes A, HD recognizes C, NG recognizes T, and NN recognizes G. By expressing TALE basic units carrying different RVDs as tandem repeats, a recombinant TALE protein that recognizes any given DNA sequence can be produced.
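The RVD code described above (NI for A, HD for C, NG for T, NN for G) can be sketched as a simple lookup. The following Python snippet is an illustration only and is not part of the application; the function name `rvds_for_target` is hypothetical, and the sketch covers only the four example RVDs named in the text, ignoring any other design considerations for real TALE arrays.

```python
# Example RVD code taken from the definition above: one RVD per target base.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "T": "NG", "G": "NN"}


def rvds_for_target(dna):
    """Return the list of RVDs, one per repeat unit, for a target DNA sequence."""
    rvds = []
    for base in dna.upper():
        if base not in RVD_FOR_BASE:
            raise ValueError(f"unsupported base: {base!r}")
        rvds.append(RVD_FOR_BASE[base])
    return rvds
```

For the target "ACTG", the function returns the RVD series NI, HD, NG, NN, one for each tandem repeat unit.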
As used herein, "zinc finger proteins" or "ZFs" are proteins or domains that bind DNA in a sequence-specific manner through one or more zinc fingers, whose structure is stabilized by coordination of zinc ions.
As used herein, the term "vector" refers to a nucleic acid vehicle into which a polynucleotide can be inserted. When a vector enables expression of the protein encoded by the inserted polynucleotide, the vector is called an expression vector. A vector can be introduced into a host cell by transformation, transduction, or transfection, so that the genetic material elements it carries are expressed in the host cell. Vectors are well known to those skilled in the art and include, but are not limited to: plasmids; phagemids; cosmids; artificial chromosomes, such as yeast artificial chromosomes (YAC), bacterial artificial chromosomes (BAC), or P1-derived artificial chromosomes (PAC); bacteriophages, such as λ phage or M13 phage; and animal viruses. Animal viruses that can be used as vectors include, but are not limited to, retroviruses (including lentiviruses), adenoviruses, adeno-associated viruses, herpesviruses (such as herpes simplex virus), poxviruses, baculoviruses, papillomaviruses, and papovaviruses (such as SV40). A vector may contain a variety of elements that control expression, including, but not limited to, promoter sequences, transcription initiation sequences, enhancer sequences, selection elements, and reporter genes. In addition, the vector may contain an origin of replication.
In the present invention, the terms "polypeptide" and "protein" have the same meaning and are used interchangeably. In the present invention, amino acids are generally represented by the one-letter and three-letter abbreviations well known in the art. For example, alanine can be represented by A or Ala.
Beneficial Effects of the Invention
The double-stranded DNA deaminase mutants provided by the present application can be advantageously used to construct mitochondrial base editors. A mitochondrial base editor constructed from them maintains comparable editing efficiency at the target site while effectively reducing off-target editing within mitochondria and/or within the nucleus.
In addition, the present application provides base editing compositions and methods with reduced off-target editing in mitochondria and/or in the nucleus.
Embodiments of the present invention are described in detail below with reference to the accompanying drawings and examples, but those skilled in the art will understand that the following drawings and examples serve only to illustrate the present invention and do not limit its scope. Various objects and advantageous aspects of the present invention will become apparent to those skilled in the art from the drawings and the following detailed description of preferred embodiments.
Description of the Drawings
Figure 1 shows the large number of off-target edits produced by DdCBEs, identified across the nuclear genome by Detect-seq. The outer ring is labeled with chromosome numbers, and the dots represent off-target edits caused by the corresponding DdCBE in the nuclear genome.
Figure 2 shows a schematic of the component composition of DdCBE tools with TALE elements deleted, using ND6-L1397N as an example.
Figure 3 shows a schematic of reducing DdCBE off-target activity by mutating DddA (Figure 3a), and a schematic of the DdCBE structure after mutation (Figure 3b).
Figure 4 shows the editing efficiency at the mitochondrial target site when ND6-L1397N carries different DddA point mutations (Figure 4a); that the Q1310A mutation significantly reduces the mitochondrial off-target level (Figure 4b); and that the Q1310A mutation reduces the Detect-seq signal intensity at certain nuclear genome off-target sites (Figure 4c).
Figure 5 shows the results of using targeted amplicon sequencing to measure the editing levels of N1308A-ND6 and Q1310A-ND6 at 8 known nuclear genome off-target sites.
Figure 6 shows a schematic of the positions at which the NES is fused to DdCBE.
Figure 7 shows the results of testing strategies that fuse the NES sequence at different positions, using ND5.1-L1397N. Figure 7a: editing efficiency at the mitochondrial target site; Figure 7b: off-target editing efficiency at 8 known nuclear genome off-target sites.
Figure 8 shows the off-target editing efficiency of the UGI-NES fusion format relative to a control without NES fusion. Figure 8a: the mitochondrial off-target level decreases slightly after the NES is fused to ND6-L1397N in the UGI-NES format; Figure 8b: the Detect-seq signal at nuclear genome off-target sites decreases significantly after the NES is fused to ND6-L1397N in the UGI-NES format.
Figure 9 shows the results of testing the strategy of fusing the mapk-NES sequence, using ND5.1-L1333N. Figure 9a: editing efficiency at the mitochondrial target site; Figure 9b: off-target editing efficiency at 8 known nuclear genome off-target sites.
Figure 10 shows the results of testing the strategy of fusing HIV-NES together with the DddA-Q1310A mutation, using ND6-L1397N. Figure 10a: editing efficiency at the mitochondrial target site; Figure 10b: off-target editing efficiency at 8 known nuclear genome off-target sites.
Figure 11 shows a schematic of reducing the off-target effects of DdCBE by co-using DddIA.
Figure 12 shows a schematic of the structure of the tools when DddIA is used together with DdCBE.
Figure 13 shows the effect of different DddIA doses on the editing efficiency of ND6-L1397N at the mitochondrial target site (Figure 13a); that the mitochondrial off-target level decreases significantly after co-transfection with the optimal DddIA dose (Figure 13b); and that the Detect-seq signal at nuclear genome off-target sites decreases significantly after co-transfection with the optimal DddIA dose (Figure 13c).
Figure 14 shows the effect of different DddIA doses on the editing efficiency of ND6-L1397N at nuclear genome off-target sites.
Figure 15 shows the results of testing the SV40-NLS-DddIA co-transfection strategy, using ND6-L1397N. Figure 15a: editing efficiency at the mitochondrial target site; Figure 15b: off-target editing efficiency at 8 known nuclear genome off-target sites.
Figure 16 shows the results of testing the Q1310A mutant combined with the DddIA strategy, using ND6-L1397N. Figure 16a: editing efficiency at the mitochondrial target site; Figure 16b: off-target editing efficiency at 8 known nuclear genome off-target sites.
Figure 17 shows the off-target editing efficiency at 8 known nuclear genome off-target sites of Q1310A-ND6-L1397N combined with DddIA or with NES individually, or combined with both NES and DddIA.
Figure 18 shows a performance comparison of all strategies. The value plotted is the "mean on-target editing efficiency / mean off-target editing efficiency" under a given treatment; a larger value indicates a better reduction of off-target effects. Here, "canonical" denotes the conventional DdCBE constructed from wild-type DddAtox; "hivNES", "TALE-hivNES" and "DddA-hivNES" denote DdCBEs constructed from wild-type DddAtox and an HIV-derived NES; "SV40-NLS-DddIA" denotes the conventional DdCBE constructed from wild-type DddAtox combined with NLS-DddIA; "hivNES+Q1310A" denotes the DdCBE constructed from the DddAtox mutant Q1310A and an HIV-derived NES; "Q1310A+DddIA" denotes the DdCBE constructed from the DddAtox mutant Q1310A combined with NLS-DddIA; "hivNES+Q1310A+DddIA" denotes the DdCBE constructed from the DddAtox mutant Q1310A and an HIV-derived NES combined with NLS-DddIA; "UGI-hivNES" denotes the DdCBE constructed from wild-type DddAtox, UGI and an HIV-derived NES; and "mapKNES" denotes the DdCBE constructed from wild-type DddAtox and a mapK-derived NES.
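The comparison metric used for Figure 18 is a simple ratio of means. The Python sketch below is an illustration only and is not part of the application; the function name `on_off_ratio` is hypothetical, and any numbers passed to it in an example are placeholders rather than measured values.

```python
from statistics import mean


def on_off_ratio(on_target_efficiencies, off_target_efficiencies):
    """Mean on-target editing efficiency divided by mean off-target editing efficiency.

    Both arguments are sequences of editing efficiencies (for example in percent)
    measured under one treatment. A larger ratio means off-target editing is
    reduced more effectively relative to on-target editing.
    """
    off_mean = mean(off_target_efficiencies)
    if off_mean == 0:
        raise ValueError("mean off-target efficiency is zero; the ratio is undefined")
    return mean(on_target_efficiencies) / off_mean
```

With placeholder inputs `[40.0, 50.0]` for on-target and `[1.0, 3.0]` for off-target efficiencies, the means are 45.0 and 2.0, so the ratio is 22.5.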
Sequence Information
A description of the sequences referred to in this application is provided in the table below.
Table 1: Sequence information
Detailed Description of Embodiments
The invention is now described with reference to the following examples, which are intended to illustrate the invention and not to limit it.
Unless otherwise specified, the molecular biology experimental methods and immunoassays used in the present invention were carried out essentially with reference to the methods described in J. Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, Cold Spring Harbor Laboratory Press, 1989, and F. M. Ausubel et al., Short Protocols in Molecular Biology, 3rd edition, John Wiley & Sons, Inc., 1995; restriction enzymes were used under the conditions recommended by the product manufacturer. Those skilled in the art will appreciate that the examples describe the invention by way of illustration and are not intended to limit the scope of the invention as claimed.
Example 1: Evaluation of nuclear genome off-target levels of existing DdCBEs
Base editing by DdCBE proceeds similarly to editing by the cytosine base editor (CBE) previously developed by David Liu's research group: a deaminase first converts the C at the target position to deoxyuridine (dU) by deamination, and DNA replication or repair mechanisms then read the dU as T, completing the C-to-T conversion. Therefore, both successful DdCBE editing and off-target editing by DdCBE produce the intermediate dU. Detect-seq, a CBE off-target detection technology already developed in our laboratory (see Lei, Z., et al., Detect-seq reveals out-of-protospacer editing and target-strand editing by cytosine base editors. Nat Methods, 2021. 18(6): p. 643-651), captures dU and can thus be used to evaluate off-target editing by existing DdCBEs on genomic DNA.
1.1 Experimental methods
Two types of DdCBE built on the G1397 DddAtox-split (L1397N and L1397C) were used in the HEK293T cell line to edit the four human mitochondrial sites ND4, ND5.1, ND5.3 and ND6 reported in Mok, B.Y., et al., A bacterial cytidine deaminase toxin enables CRISPR-free mitochondrial base editing. Nature, 2020. 583(7817): p. 631-637. In the abbreviations L1397N and L1397C, "1397" indicates that the DddAtox split site is G1397; "L" denotes Left-TALE; and "N" or "C" denotes the N-terminal or C-terminal half of the DddAtox-split. Thus, L1397N means that DddAtox is split at G1397 and the Left-TALE carries the N-terminal half of the DddAtox-split.
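The naming convention explained above can be made concrete with a small parser. The Python sketch below is an illustration only and is not part of the application; the function name `parse_ddcbe_name` and the returned field names are hypothetical. Because the text defines only the "L" (Left-TALE) prefix, the sketch accepts only "L", and it treats an optional leading site label such as "ND6-" as the targeted mitochondrial site.

```python
import re

# Optional site label (e.g. "ND5.1-"), then "L" for Left-TALE, the split
# position, and "N" or "C" for the half of the DddAtox-split on the Left-TALE.
_NAME_PATTERN = re.compile(
    r"(?:(?P<site>[A-Za-z0-9.]+)-)?L(?P<split>\d+)(?P<half>[NC])"
)


def parse_ddcbe_name(name):
    """Split a name such as "ND6-L1397N" into its documented parts."""
    match = _NAME_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized DdCBE name: {name!r}")
    return {
        "site": match.group("site"),  # None when no site label is given
        "tale": "Left",
        "split_position": int(match.group("split")),
        "half_on_left_tale": match.group("half"),
    }
```

For "ND6-L1397N" the parser reports site ND6, split position 1397, and the N-terminal half on the Left-TALE, matching the reading given in the text.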
In this example, the DdCBEs ND4-L1397N, ND5.1-L1397N, ND6-L1397N, ND4-L1397C and ND5.3-L1397C were used for editing in the HEK293T cell line. The components of each DdCBE are shown in Table 2.
Table 2: Components of each DdCBE
1.1.1 Editing with DdCBE
Cell culture: HEK293T cells were cultured in medium supplemented with 10% FBS (PAA fetal bovine serum, Cat. No. A15-151/101), 1% GlutaMAX™ (Gibco™, Cat. No. 35050061) and 0.5% penicillin/streptomycin (Gibco™, Cat. No. 15140122) in a 37 °C incubator at 5% CO2, and were passaged when the cell density reached about 80%.
Transfection: 1.6×10^6 HEK293T cells were seeded into a 24-well cell culture plate (CORNING, Cat. No. 3524) and cultured for 16 h. Lipofectamine LTX (ThermoFisher, Cat. No. 15338100) was then used according to its instructions to transfect 840 ng of each plasmid of a pair of DdCBE-encoding plasmids into the adherent cells. After a further 72 h of culture, the transfected cells were collected by centrifugation, genomic DNA was extracted from the cell pellet using the Universal Genomic DNA Kit (CWBIO, Cat. No. CW2298M) and eluted with 10 mM Tris-HCl (pH 8.0), and its concentration was measured with a Nanodrop spectrophotometer.
1.1.2 Off-target assessment of DdCBE using Detect-seq
For the detailed experimental steps, see Lei, Z., et al., Detect-seq reveals out-of-protospacer editing and target-strand editing by cytosine base editors. Nat Methods, 2021. 18(6): p. 643-651.
1.1.3 Targeted amplicon sequencing
The amplification primers for each site to be tested are shown in Table 3:
Table 3: Primer sequences
For each site to be tested, the first-round PCR reaction system is shown in Table 4:
Table 4: First-round PCR reaction system
The PCR amplification program was set according to the instructions for the high-fidelity DNA polymerase, with the number of amplification cycles set to 10.
After the first round of PCR, 5 μL of product from each PCR reaction that used the same genomic DNA with different amplification primers was pooled and purified with 0.9× Agencourt AMPure XP (BECKMAN, Cat. No. A63882). The DNA was eluted from the XP beads with 18 μL of nuclease-free water and used for the second round of PCR. The second-round PCR reaction system is shown in Table 5:
Table 5: Second-round PCR reaction system
The amplification program for the second round of PCR was set according to the instructions for the high-fidelity DNA polymerase, with the number of amplification cycles set to 15.
第二轮PCR反应结束后,可将具有不同index的PCR反应产物混合在一起,再使用0.8×Agencourt AMPure XP进行纯化,然后使用20μL无核酸酶的水洗脱,得到目标扩增子测序文库。取1μL洗脱的DNA,使用Qubit 2.0荧光计测定浓度;取1μL洗脱的DNA,使用Agilent 4150片段分析仪检测文库质量。After the second round of PCR reaction, the PCR reaction products with different indexes can be mixed together, purified using 0.8×Agencourt AMPure XP, and then eluted with 20 μL nuclease-free water to obtain the target amplicon sequencing library. Take 1 μL of eluted DNA and use a Qubit 2.0 fluorometer to measure the concentration; take 1 μL of eluted DNA and use an Agilent 4150 fragment analyzer to check the library quality.
The target amplicon sequencing library was sequenced on an MGI (MGI Tech) sequencing platform.
1.2 Experimental results
The evaluation showed that ND4-L1397N, ND5.1-L1397N, ND6-L1397N, ND4-L1397C and ND5.3-L1397C produced 100, 652, 697, 91 and 610 off-target sites, respectively, at the nuclear genome level (Figure 1).
To explore the mechanism by which DdCBE produces off-target edits in the nucleus, the Left-TALE and Right-TALE of three DdCBEs (ND4-L1397N, ND5.1-L1397N and ND6-L1397N) were deleted on one side or on both sides. Taking ND6-L1397N as an example, the composition of the DdCBE editing tools after single-sided or complete deletion of the Left-TALE and Right-TALE is shown schematically in Figure 2. HEK293T cells transfected with the DdCBEs carrying these deletions were then subjected to Detect-seq library construction and sequencing. By comparing the Detect-seq data from the single-TALE deletions, the complete TALE deletion and the intact DdCBE, two types of nuclear off-target sites were defined: TAS-dependent and TAS-independent, where TAS stands for TALE array sequence.
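The comparison described above can be pictured as a set operation on the off-target sites called in each condition. The sketch below is only an illustration under an assumed rule (a site that persists after all TALE elements are deleted is treated as TAS-independent, otherwise as TAS-dependent); the application does not state the exact criterion, and the site identifiers are hypothetical.

```python
def classify_off_targets(full_sites: set[str], no_tale_sites: set[str]) -> dict[str, str]:
    """Label each site detected with the intact DdCBE.

    Assumed rule (not taken from the source): a site still detected when all
    TALE elements are deleted does not need TALE binding, so it is labelled
    TAS-independent; every other site is labelled TAS-dependent.
    """
    labels = {}
    for site in sorted(full_sites):
        if site in no_tale_sites:
            labels[site] = "TAS-independent"
        else:
            labels[site] = "TAS-dependent"
    return labels


# Hypothetical site identifiers, for illustration only.
intact = {"site_A", "site_B", "site_C"}
all_tale_deleted = {"site_B"}
labels = classify_off_targets(intact, all_tale_deleted)
```

A fuller analysis would also use the single-TALE deletion data and signal thresholds, which this sketch leaves out.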
Example 2: Reducing the off-target effects of DdCBE by mutating DddAtox
Selective point mutations were introduced into DddAtox. The mutation positions and types were N1308A (SEQ ID NO:3), G1309A (SEQ ID NO:4), Q1310A (SEQ ID NO:5), N1367A (SEQ ID NO:6) and N1368A (SEQ ID NO:7). The mutated DddAtox-split was used in place of the wild-type DddAtox-split to construct the modified DdCBEs. A schematic of the modified DdCBE structure is shown in Figure 3b, and the component elements of each tool are listed in Table 6; a schematic of how mutating DddA reduces DdCBE off-target activity is shown in Figure 3a.
Table 6 Components of each DdCBE

Targeted amplicon validation of the mutated ND6-L1397N DdCBEs showed that the DdCBEs built with mutants N1308A and Q1310A retained editing at the mitochondrial target site, whereas all the other mutations caused a substantial drop in on-target mitochondrial editing (Figure 4a). For Q1310A-ND6, ATAC-seq and Detect-seq were used to compare the off-target editing caused by DdCBE in HEK293T cells at the mitochondrial level and at the nuclear genome level, respectively. Compared with the original DdCBE (WT-ND6), Q1310A markedly reduced off-target editing at the mitochondrial level, to one third of that of the original DdCBE (Figure 4b); it also slightly reduced the intensity of off-target editing on the nuclear genome (Figure 4c).
Subsequently, targeted amplicon sequencing of N1308A-ND6 and Q1310A-ND6 was performed at eight known nuclear off-target sites. The results showed that Q1310A-ND6 reduced the editing level at these eight off-target sites more than N1308A-ND6 did (Figure 5).
Example 3: Reducing the off-target effects of DdCBE by adding a nuclear export signal (NES)
A nuclear export signal (NES) is a short peptide rich in hydrophobic amino acids. Proteins carrying an NES can be relocated to the cytoplasm by nuclear pore transport proteins (Azmi, A.S., M.H. Uddin, and R.M. Mohammad, The nuclear export protein XPO1-from biology to targeted therapy. Nat Rev Clin Oncol, 2021. 18(3): p. 152-169.). Currently known nuclear export signals generally conform to the amino acid sequence pattern Φ1-X3-Φ2-X2-Φ3-X-Φ4, where Φn denotes n tandem hydrophobic amino acids (such as Leu, Val, Ile, Phe or Met) and Xn denotes n tandem amino acids of any type. In eukaryotic cell systems, the two most commonly used NES sequences are the NES derived from HIV (HIV-NES, amino acid sequence shown in SEQ ID NO:47) and the NES found in the cellular kinase pathway (mapk-NES, amino acid sequence shown in SEQ ID NO:56).
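The consensus pattern above can be expressed as a regular expression for scanning a protein sequence. The sketch below follows a common reading of the consensus, in which each Φ is a single hydrophobic position (Leu, Val, Ile, Phe or Met) and Xn is a run of n arbitrary residues; that reading is an assumption, and the peptide used here is a made-up toy sequence rather than SEQ ID NO:47 or SEQ ID NO:56.

```python
import re

HYDROPHOBIC = "LVIFM"  # residues listed in the text: Leu, Val, Ile, Phe, Met

# Phi-X3-Phi-X2-Phi-X-Phi, wrapped in a lookahead so overlapping hits are found.
NES_PATTERN = re.compile(
    rf"(?=([{HYDROPHOBIC}].{{3}}[{HYDROPHOBIC}].{{2}}[{HYDROPHOBIC}].[{HYDROPHOBIC}]))"
)


def find_nes_like(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched segment) for every NES-like stretch."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in NES_PATTERN.finditer(seq)]


# Toy peptide, hypothetical and for illustration only.
hits = find_nes_like("MGSLQEKLAELSIGS")
```

A match only indicates that a stretch fits the spacing pattern; it does not show that the stretch functions as an export signal.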
When fusing the NES, three different fusion positions can be chosen: the C-terminus of the TALE element of the DdCBE, the C-terminus of the DddA-split element, and the C-terminus of the UGI element (Figure 6). These three fusion formats are referred to as TALE-NES-DdCBE, DddA-NES-DdCBE and UGI-NES-DdCBE, respectively.
Based on the above, this application used, by way of example, combinations of two NES sequences, three NES fusion positions and different mitochondrial target sites to modify the original DdCBE and compare the results. The component elements of each DdCBE tool used are shown in Table 7:
Table 7 Components of each DdCBE


TALE-NES-DdCBE, DddA-NES-DdCBE and UGI-NES-DdCBE were first constructed using HIV-NES. Comparing the three NES fusion positions showed that the fusion position had almost no effect on editing efficiency at the mitochondrial target site (Figure 7a). Targeted amplicon sequencing was then used to evaluate eight known nuclear off-target sites of ND5.1-L1397N, and fusing the NES at the C-terminus of UGI gave the largest decrease in editing efficiency at these eight sites (Figure 7b).
For UGI-NES-ND6-L1397N, ATAC-seq and Detect-seq were used to compare the off-target editing caused by DdCBE in HEK293T cells at the mitochondrial level and at the nuclear genome level, respectively. Compared with the original DdCBE (WT-ND6), UGI-NES did not noticeably reduce off-target editing at the mitochondrial level (Figure 8a), but it markedly reduced the intensity of off-target editing on the nuclear genome (Figure 8b).
To verify that adding an NES to DdCBE is a general strategy for reducing off-target editing in the nucleus, the NES in UGI-NES was changed from HIV-NES to the mapk-NES sequence. Targeted amplicon sequencing at eight known off-target sites was then used to assess off-target editing by mapk-UGI-NES-ND6-L1397N. The DdCBE with the mapk-NES sequence fused at the C-terminus of UGI still reduced off-target editing in the nucleus (Figure 9), demonstrating the generality of the NES fusion strategy.
Q1310A markedly reduces off-target editing by DdCBE at the mitochondrial level, while adding an NES to DdCBE markedly reduces the signal intensity at nuclear off-target sites. The Q1310A mutation can therefore be introduced into DddA while an NES sequence is added at the C-terminus of UGI, combining the two strategies to reduce off-target editing at both the mitochondrial level and the nuclear genome level.
Targeted amplicon sequencing of the Q1310A-ND6-L1397N-UGI-NES sample at eight known nuclear off-target sites showed that, compared with the original DdCBE (WT-ND6), Q1310A-ND6-L1397N-UGI-NES did markedly reduce editing efficiency at these eight sites (Figure 10).
Example 4: Reducing the off-target effects of DdCBE with DddIA
The teams of Joseph Mougous and David Liu identified in Burkholderia cenocepacia the bacterial toxin DddA, a deaminase that acts specifically on double-stranded DNA, and at the same time found, in another Burkholderia strain lacking the dddA gene, a bacterial immunity protein, DddIA, that antagonizes DddA activity. To preserve the editing activity of DdCBE in mitochondria while reducing its off-target editing in the nucleus, the inventors found, after extensive research, that DdCBE editing can be accompanied by co-transfection of DddIA, an inhibitor of DddAtox activity, fused to a nuclear localization signal (NLS), thereby lowering the catalytic activity of DdCBE in the nucleus. A schematic of reducing DdCBE off-target effects by co-use of DddIA is shown in Figure 11, the structures of the constructed tools are shown in Figure 12, and their component elements are listed in Table 8.
Table 8 Components of each tool
When NLS-tagged DddIA is co-used to inhibit the catalytic activity of DdCBE in the nucleus, the molar dose ratio of DdCBE to DddIA must be determined. ND6-L1397N was therefore first tested in HEK293T cells under co-transfection conditions spanning a dose gradient of DdCBE:DddIA molar ratios from 1:0 to 1:1.5. The results showed that co-transfection of DddIA had almost no effect on editing at the mitochondrial target site (Figure 13a).
Targeted amplicon sequencing was used to evaluate nine known nuclear off-target sites of ND6-L1397N. As the proportion of DddIA in the DdCBE:DddIA molar ratio increased, editing efficiency at the nuclear off-target sites decreased accordingly (Figure 14). Taking on-target editing into account as well, a DdCBE:DddIA molar ratio of 1:1.2 was considered the optimal co-transfection ratio.
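Because plasmids of different sizes have different masses per mole, a molar ratio such as 1:1.2 has to be converted into plasmid masses before transfection. The sketch below shows one way to do this, assuming that the mass of double-stranded plasmid DNA is proportional to its length; the plasmid sizes and the 500 ng input are hypothetical values, not taken from this application.

```python
def partner_mass_ng(mass_a_ng: float, len_a_bp: int, len_b_bp: int,
                    molar_ratio_b_per_a: float) -> float:
    """Mass of plasmid B giving `molar_ratio_b_per_a` moles of B per mole of A.

    Assumes dsDNA mass is proportional to length, so
    mass_B = mass_A * (moles_B / moles_A) * (len_B / len_A).
    """
    if mass_a_ng < 0 or molar_ratio_b_per_a < 0:
        raise ValueError("mass and ratio must be non-negative")
    if len_a_bp <= 0 or len_b_bp <= 0:
        raise ValueError("plasmid lengths must be positive")
    return round(mass_a_ng * molar_ratio_b_per_a * (len_b_bp / len_a_bp), 1)


# Hypothetical plasmids: 10,000 bp DdCBE vector, 5,000 bp DddIA vector.
ddcbe_ng = 500.0
ratios = [0.0, 1.2, 1.5]  # DddIA per DdCBE; endpoints and optimum named in the text
dddia_ng = {r: partner_mass_ng(ddcbe_ng, 10_000, 5_000, r) for r in ratios}
```

With these assumed sizes, the 1:1.2 condition corresponds to 300 ng of the DddIA plasmid for 500 ng of the DdCBE plasmid. Since a DdCBE is delivered as two halves, a real calculation would treat each half-plasmid separately.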
At the optimal DdCBE:DddIA co-transfection ratio, ATAC-seq and Detect-seq were used to compare the off-target editing caused by DdCBE in HEK293T cells at the mitochondrial level and at the nuclear genome level, respectively. Compared with the original DdCBE, co-transfection of DddIA markedly reduced off-target editing at the mitochondrial level, to one quarter of that of the original DdCBE (Figure 13b); it also markedly reduced the intensity of off-target editing on the nuclear genome, with the Detect-seq signal at some off-target sites falling to background level (Figure 13c).
After the bpNLS sequence described above was replaced with another common sequence, SV40-NLS, targeted amplicon sequencing was performed at several known nuclear off-target sites.
The sequencing results showed that, even with the SV40-NLS sequence, co-transfection of DddIA with DdCBE still markedly reduced editing efficiency at off-target sites on the nuclear genome without affecting on-target mitochondrial editing, demonstrating the generality of the strategy (Figure 15).
Example 5: Evaluation of off-target levels when Q1310A-ND6 is combined with the DddIA strategy
Q1310A markedly reduces off-target editing by DdCBE at the mitochondrial level, while co-transfected DddIA entering the nucleus has almost no effect on on-target editing at the mitochondrial level but markedly reduces the signal intensity at nuclear off-target sites. The Q1310A mutation can therefore be introduced into DddA together with co-transfection of DddIA, combining the two strategies to reduce off-target editing at both the mitochondrial level and the nuclear genome level.
Targeted amplicon sequencing of samples co-transfected with Q1310A-ND6-L1397N and DddIA at eight known nuclear off-target sites showed that, compared with the original DdCBE, the combined strategy did markedly reduce editing efficiency at these eight sites, demonstrating its effectiveness (Figure 16).
Example 6: Evaluation of off-target levels when Q1310A-ND6, NES addition and DddIA are combined
Among the combinations of the Q1310A mutant of ND6-L1397N with the NES and DddIA strategies, we found that using all three strategies together gave the best result, followed by using two strategies together (Figures 17 and 18). The results show that, compared with the original DdCBE, the combined strategies markedly reduced editing efficiency at these eight nuclear off-target sites, and that the three-strategy combination outperformed the two-strategy combinations. In addition, the statistics in Figure 18 show that all of the exemplary optimization strategies proposed in this application significantly reduced the off-target effects of the DdCBE tool. The main structure of each construct in Figure 18 is shown in Table 9:
Table 9 Main components of each construct

Although specific embodiments of the present invention have been described in detail, those skilled in the art will understand that, in light of all the teachings disclosed, various modifications and changes can be made to the details, and that all such changes fall within the scope of protection of the present invention. The full scope of the invention is given by the appended claims and any equivalents thereof.

Claims (23)

  1. A polypeptide having double-stranded DNA deaminase activity, or a mutant thereof, wherein the polypeptide or mutant thereof comprises the amino acid residues at the positions of a wild-type double-stranded DNA deaminase corresponding to positions 1290-1427 of SEQ ID NO:1; and, compared with the amino acid residues at the positions of the wild-type double-stranded DNA deaminase corresponding to positions 1290-1427 of SEQ ID NO:1, the polypeptide or mutant thereof has the following mutation:
    (1) the amino acid residue at the position corresponding to position 1308 of SEQ ID NO:1 is replaced by an alanine residue or by an amino acid residue that is a conservative substitution for alanine (for example, a glycine, leucine, isoleucine or valine residue); or,
    (2) the amino acid residue at the position corresponding to position 1310 of SEQ ID NO:1 is replaced by an alanine residue or by an amino acid residue that is a conservative substitution for alanine (for example, a glycine, leucine, isoleucine or valine residue);
    wherein the mutant, compared with the polypeptide having double-stranded DNA deaminase activity, has at least 90%, for example at least 95%, at least 96%, at least 97%, at least 98% or at least 99%, sequence identity; or has a substitution (preferably a conservative substitution), addition or deletion of one or several (for example, 1, 2, 3, 4, 5, 6, 7, 8 or 9) amino acids; and has double-stranded DNA deaminase activity; and,
    the amino acid residues of the polypeptide or mutant thereof at the positions corresponding to positions 1309, 1367 and 1368 of SEQ ID NO:1 are not mutated;
    preferably, the wild-type double-stranded DNA deaminase has the amino acid sequence shown in SEQ ID NO:1;
    preferably, the polypeptide having double-stranded DNA deaminase activity has the amino acid sequence shown in SEQ ID NO:3 or 5.
  2. A mutated double-stranded DNA deaminase, or a variant thereof, which, compared with a wild-type double-stranded DNA deaminase, has the following mutation:
    (1) the amino acid residue at the position corresponding to position 1308 of SEQ ID NO:1 is replaced by an alanine residue or by an amino acid residue that is a conservative substitution for alanine (for example, a glycine, leucine, isoleucine or valine residue); or,
    (2) the amino acid residue at the position corresponding to position 1310 of SEQ ID NO:1 is replaced by an alanine residue or by an amino acid residue that is a conservative substitution for alanine (for example, a glycine, leucine, isoleucine or valine residue);
    wherein the variant, compared with the mutated double-stranded DNA deaminase, has at least 90%, for example at least 95%, at least 96%, at least 97%, at least 98% or at least 99%, sequence identity; or has a substitution (preferably a conservative substitution), addition or deletion of one or several (for example, 1, 2, 3, 4, 5, 6, 7, 8 or 9) amino acids; and has double-stranded DNA deaminase activity; and
    the amino acid residues of the mutated double-stranded DNA deaminase or variant thereof at the positions corresponding to positions 1309, 1367 and 1368 of SEQ ID NO:1 are not mutated;
    preferably, the amino acid sequence of the mutated double-stranded DNA deaminase or variant thereof at the positions corresponding to positions 1290-1427 of SEQ ID NO:1 is the amino acid sequence of the polypeptide or mutant thereof of claim 1;
    preferably, the wild-type double-stranded DNA deaminase has the amino acid sequence shown in SEQ ID NO:1.
  3. A polypeptide polymer comprising a first polypeptide and a second polypeptide, wherein:
    the first polypeptide comprises an N-terminal fragment and the second polypeptide comprises a C-terminal fragment;
    the amino acid sequences of the N-terminal fragment and the C-terminal fragment are, respectively, the amino acid sequences of the N-terminal fragment and the C-terminal fragment formed by breaking the polypeptide or mutant thereof of claim 1 at a split site;
    wherein the polypeptide polymer is formed by polymerization of the N-terminal fragment and the C-terminal fragment;
    preferably, the N-terminal fragment and the C-terminal fragment, when each is present alone, have no double-stranded DNA deaminase activity or have markedly reduced deaminase activity (for example, at most 40%, at most 30%, at most 20%, at most 10%, at most 5% or at most 1% of the double-stranded DNA deaminase activity of the polypeptide of claim 1);
    preferably, when the N-terminal fragment and the C-terminal fragment polymerize, the polymer has double-stranded DNA deaminase activity (for example, at least 70%, at least 80%, at least 90% or at least 95% of the double-stranded DNA deaminase activity of the polypeptide of claim 1).
  4. The polypeptide polymer of claim 3, wherein the split site is the peptide bond immediately following the amino acid residue, in the polypeptide having double-stranded DNA deaminase activity or mutant thereof, at the position corresponding to position 1333 of SEQ ID NO:1;
    preferably, the N-terminal fragment has the amino acid sequence shown in SEQ ID NO:104 or 106;
    preferably, the C-terminal fragment has the amino acid sequence shown in SEQ ID NO:55.
  5. The polypeptide polymer of claim 3, wherein the split site is the peptide bond immediately following the amino acid residue, in the polypeptide having double-stranded DNA deaminase activity or mutant thereof, at the position corresponding to position 1397 of SEQ ID NO:1;
    preferably, the N-terminal fragment has the amino acid sequence shown in SEQ ID NO:35 or 37;
    preferably, the C-terminal fragment has the amino acid sequence shown in SEQ ID NO:15.
  6. The polypeptide polymer of claim 3, wherein the first polypeptide further comprises a first DNA-binding protein linked to the N-terminal fragment, and/or the second polypeptide further comprises a second DNA-binding protein linked to the C-terminal fragment;
    preferably, the first DNA-binding protein and/or the second DNA-binding protein are each independently a programmable DNA-binding protein.
  7. The polypeptide polymer of claim 6, wherein the first polypeptide and the second polypeptide each independently further comprise a mitochondrial targeting sequence (MTS) and/or a uracil glycosylase inhibitor (UGI) domain;
    preferably, the first polypeptide comprises the following structures: a first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the N-terminal fragment, and the uracil glycosylase inhibitor (UGI) domain; and the second polypeptide comprises the following structures: a second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the C-terminal fragment, and the uracil glycosylase inhibitor (UGI) domain;
    preferably, adjacent structures in the first polypeptide and in the second polypeptide are each independently linked directly or via a linker (for example a peptide linker, such as a flexible peptide comprising one or more glycine (G) and/or serine (S) residues);
    preferably, the first mitochondrial targeting sequence (MTS) is located at the N-terminus of the first polypeptide, and/or the second mitochondrial targeting sequence (MTS) is located at the N-terminus of the second polypeptide;
    preferably, the first polypeptide comprises, in order from N-terminus to C-terminus: the first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the N-terminal fragment, and the uracil glycosylase inhibitor (UGI) domain; and the second polypeptide comprises, in order from N-terminus to C-terminus: the second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the C-terminal fragment, and the uracil glycosylase inhibitor (UGI) domain.
  8. The polypeptide polymer of any one of claims 3-7, wherein the first mitochondrial targeting sequence or the second mitochondrial targeting sequence is each independently selected from mitochondrial targeting sequences derived from COX8 (cytochrome C oxidase subunit 8A), ATP5G2 (ATP synthase F0 complex subunit C2), SOD2 (superoxide dismutase 2) and COQ8A (atypical mitochondrial kinase COQ8A);
    preferably, the first mitochondrial targeting sequence and the second mitochondrial targeting sequence are the same or different;
    preferably, the first mitochondrial targeting sequence is a mitochondrial targeting sequence derived from SOD2; preferably, the first mitochondrial targeting sequence has the amino acid sequence shown in SEQ ID NO:9;
    preferably, the second mitochondrial targeting sequence is a mitochondrial targeting sequence derived from COX8; preferably, the second mitochondrial targeting sequence has the amino acid sequence shown in SEQ ID NO:19.
  9. The polypeptide polymer of any one of claims 3-8, wherein the first DNA-binding protein or the second DNA-binding protein is each independently selected from: a TALE (transcription activator-like effector) protein, a zinc finger protein and a Cas protein;
    preferably, the first DNA-binding protein and the second DNA-binding protein are the same or different;
    preferably, the first DNA-binding protein and the second DNA-binding protein are both TALE proteins.
  10. The polypeptide polymer of any one of claims 3-9, wherein the first polypeptide and/or the second polypeptide each independently further comprise a nuclear export signal (NES) sequence;
    preferably, the NES sequence is linked to the other domains in the first polypeptide or the second polypeptide directly or via a linker (for example a peptide linker, such as a flexible peptide comprising one or more glycine (G) and/or serine (S) residues);
    preferably, the first polypeptide comprises a first NES sequence, and/or the second polypeptide comprises a second NES sequence;
    preferably, the first NES sequence is located at the C-terminus of the first DNA-binding protein;
    preferably, the second NES sequence is located at the C-terminus of the second DNA-binding protein;
    preferably, the first NES sequence and the second NES sequence are the same or different;
    preferably, the first NES sequence or the second NES sequence is each independently selected from NES sequences derived from the Rev protein of HIV (HIV regulator of virion), mitogen-activated protein kinase (MAPK), cellular tumor antigen p53, and the 60S ribosomal export protein NMD3; for example, the first NES sequence or the second NES sequence has the amino acid sequence shown in SEQ ID NO:47 or 56, respectively.
  11. The polypeptide polymer of claim 10, wherein,
    the first polypeptide comprises, in order from the N-terminus to the C-terminus:
    (i) the first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the first NES sequence, the N-terminal fragment, and the uracil glycosylase inhibitor (UGI) domain;
    (ii) the first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the N-terminal fragment, the first NES sequence, and the uracil glycosylase inhibitor (UGI) domain;
    or,
    (iii) the first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the N-terminal fragment, the uracil glycosylase inhibitor (UGI) domain, and the first NES sequence;
    and/or,
    the second polypeptide comprises, in order from the N-terminus to the C-terminus:
    (i) the second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the second NES sequence, the C-terminal fragment, and the uracil glycosylase inhibitor (UGI) domain;
    (ii) the second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the C-terminal fragment, the second NES sequence, and the uracil glycosylase inhibitor (UGI) domain; or,
    (iii) the second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the C-terminal fragment, the uracil glycosylase inhibitor (UGI) domain, and the second NES sequence;
    preferably, the first polypeptide comprises, in order from the N-terminus to the C-terminus: the first mitochondrial targeting sequence (MTS), the first DNA-binding protein, the N-terminal fragment, the uracil glycosylase inhibitor (UGI) domain, and the first NES sequence; and/or, the second polypeptide comprises, in order from the N-terminus to the C-terminus: the second mitochondrial targeting sequence (MTS), the second DNA-binding protein, the C-terminal fragment, the uracil glycosylase inhibitor (UGI) domain, and the second NES sequence.
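The three domain orders recited in claim 11 differ only in where the NES sequence sits relative to the deaminase fragment and the UGI domain. The following Python sketch is an illustration only and is not part of the claims; the function name `layouts` and the string labels are hypothetical and chosen purely for readability. It lists the three N-to-C arrangements for each polypeptide.

```python
# Illustrative sketch only: labels mirror the wording of claim 11,
# but the function and variable names are hypothetical.

def layouts(index: str, fragment: str) -> dict[str, list[str]]:
    """Return the three N-to-C domain orders recited for one polypeptide."""
    mts = f"{index} MTS"
    dbp = f"{index} DNA-binding protein"
    nes = f"{index} NES"
    ugi = "UGI"
    return {
        "i": [mts, dbp, nes, fragment, ugi],    # NES directly after the DNA-binding protein
        "ii": [mts, dbp, fragment, nes, ugi],   # NES between the fragment and UGI
        "iii": [mts, dbp, fragment, ugi, nes],  # NES at the C-terminal end (preferred order)
    }

first = layouts("first", "N-terminal fragment")
second = layouts("second", "C-terminal fragment")

for name, options in (("first polypeptide", first), ("second polypeptide", second)):
    for key, order in options.items():
        print(f"{name} ({key}): " + " - ".join(order))
```

In every arrangement the MTS comes first and the DNA-binding protein second; only the position of the NES changes.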
  12. An isolated nucleic acid molecule encoding the polypeptide having double-stranded DNA deaminase activity of claim 1 or a mutant thereof, the mutated double-stranded DNA deaminase of claim 2 or a variant thereof, the first polypeptide as defined in any one of claims 3-11, the second polypeptide as defined in any one of claims 3-11, or a combination thereof.
  13. A vector comprising the isolated nucleic acid molecule of claim 12; preferably, the vector is a cloning vector or an expression vector.
  14. A host cell comprising the nucleic acid molecule of claim 12 or the vector of claim 13.
  15. A method for preparing the polypeptide having double-stranded DNA deaminase activity of claim 1 or a mutant thereof, the mutated double-stranded DNA deaminase of claim 2 or a variant thereof, the first polypeptide as defined in any one of claims 3-11, or the second polypeptide as defined in any one of claims 3-11, the method comprising culturing the host cell of claim 14 under conditions that allow protein expression, and recovering the polypeptide having double-stranded DNA deaminase activity or the mutant thereof, the mutated double-stranded DNA deaminase or the variant thereof, the first polypeptide, or the second polypeptide from the cultured host cell culture;
    preferably, the first polypeptide and the second polypeptide are not co-expressed in the same host cell.
  16. A composition comprising a first component and a second component that are separate from each other, the first component comprising: the first polypeptide as defined in any one of claims 3-11, or a first polynucleotide encoding the first polypeptide;
    the second component comprising: the second polypeptide as defined in any one of claims 3-11, or a second polynucleotide encoding the second polypeptide.
  17. The composition of claim 16, wherein the composition further comprises a third component, the third component comprising a fusion protein or a third polynucleotide encoding the fusion protein; wherein the fusion protein comprises one or more nuclear localization signal (NLS) sequences and a polypeptide capable of inhibiting double-stranded DNA deaminase activity;
    preferably, the fusion protein is capable of inhibiting the double-stranded DNA deaminase activity of the polypeptide polymer formed by the first polypeptide and the second polypeptide;
    preferably, the NLS sequence is located at the N-terminus and/or the C-terminus of the polypeptide capable of inhibiting double-stranded DNA deaminase activity;
    preferably, the NLS sequence is linked to the polypeptide capable of inhibiting double-stranded DNA deaminase activity directly or via a linker (e.g., a peptide linker, such as a flexible peptide comprising one or more glycine (G) and/or serine (S) residues);
    preferably, the NLS sequence is selected from NLS sequences derived from simian vacuolating virus 40 (SV40), testis-determining factor (SRY), or nucleoplasmin, and the commonly used bipartite nuclear localization signal (bipartite NLS, bpNLS);
    preferably, the polypeptide capable of inhibiting double-stranded DNA deaminase activity comprises the minimal active domain of the DddIA protein; for example, the polypeptide capable of inhibiting double-stranded DNA deaminase activity has the amino acid sequence shown in SEQ ID NO: 60;
    preferably, the fusion protein has the amino acid sequence shown in SEQ ID NO: 109 or 110.
  18. A method for editing a target nucleotide sequence outside a cell, comprising contacting the target nucleotide sequence with the polypeptide polymer of any one of claims 3-11 or the composition of claim 16 under conditions suitable for editing the target nucleic acid, thereby inducing deamination of a target base in the target nucleotide sequence;
    preferably, the target base is cytosine;
    preferably, the method comprises contacting the target nucleotide sequence with the composition of claim 16, wherein the composition comprises a first component and a second component that are separate from each other, the first component comprising the first polypeptide as defined in any one of claims 3-11 and the second component comprising the second polypeptide as defined in any one of claims 3-11; or, the method comprises contacting the target nucleotide sequence with the polypeptide polymer of any one of claims 3-11, the polypeptide polymer comprising the first polypeptide as defined in any one of claims 3-11 and the second polypeptide as defined in any one of claims 3-11;
    preferably, the first polypeptide comprises a first DNA-binding protein, and the second polypeptide comprises a second DNA-binding protein;
    preferably, the first DNA-binding protein targets a first nucleotide sequence flanking one side of the target base, and the second DNA-binding protein targets a second nucleotide sequence flanking the other side of the target base; the first polypeptide and the second polypeptide are thereby able to form the polypeptide polymer, which induces deamination of the target base.
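The flanking arrangement described in claim 18 can be pictured with a toy sequence search. The Python sketch below is a simplified illustration under stated assumptions and is not the applicants' method: the function `spacer_cytosines`, the example sequence, and the two site strings are all hypothetical, both binding sites are matched as plain top-strand text, and real design constraints (DNA-binding protein recognition rules, spacer length, sequence-context preferences of the deaminase) are ignored. It reports candidate cytosines in the spacer between the two sites: C on the top strand, and G on the top strand standing for C on the bottom strand.

```python
# Toy model only: names, sequence, and sites are hypothetical assumptions.

def spacer_cytosines(seq: str, left_site: str, right_site: str) -> tuple[list[int], list[int]]:
    """Return 0-based positions of candidate target bases in the spacer.

    The first list holds top-strand C positions; the second holds top-strand
    G positions, which correspond to C on the complementary strand.
    """
    seq = seq.upper()
    left_start = seq.find(left_site.upper())
    if left_start < 0:
        raise ValueError("first flanking sequence not found")
    spacer_start = left_start + len(left_site)
    right_start = seq.find(right_site.upper(), spacer_start)
    if right_start < 0:
        raise ValueError("second flanking sequence not found downstream of the first")
    spacer = range(spacer_start, right_start)
    top_c = [i for i in spacer if seq[i] == "C"]
    bottom_c = [i for i in spacer if seq[i] == "G"]
    return top_c, bottom_c

top, bottom = spacer_cytosines("TTACGTAGGCTCAGGATCCA", "TTACG", "GATCCA")
print("top-strand C positions:", top)
print("bottom-strand C positions (top-strand G):", bottom)
```

Only bases lying between the two flanking sequences are reported, which mirrors the claim's requirement that the two DNA-binding proteins sit on either side of the target base.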
  19. A method for editing a target nucleotide sequence in a cell, comprising delivering the polypeptide polymer of any one of claims 3-11 or the composition of claim 16 or 17 into a cell containing the target nucleotide sequence, thereby inducing deamination of a target base at the target site;
    preferably, the target base is cytosine.
  20. The method of claim 19, wherein the method comprises delivering the composition of claim 16 or 17 into a cell containing the target nucleotide sequence;
    preferably, the composition comprises a first component and a second component that are separate from each other, wherein the first component comprises the first polypeptide as defined in any one of claims 3-11 and the second component comprises the second polypeptide as defined in any one of claims 3-11; or, the first component comprises a first polynucleotide encoding the first polypeptide and the second component comprises a second polynucleotide encoding the second polypeptide;
    preferably, the first polypeptide comprises a first DNA-binding protein, and the second polypeptide comprises a second DNA-binding protein;
    preferably, the first DNA-binding protein targets a first nucleotide sequence flanking one side of the target base, and the second DNA-binding protein targets a second nucleotide sequence flanking the other side of the target base; the first polypeptide and the second polypeptide are thereby able to form a polypeptide polymer, which induces deamination of the target base;
    preferably, the composition further comprises a third component, the third component being as defined in claim 17.
  21. The method of claim 19, wherein the method comprises delivering the polypeptide polymer of any one of claims 3-11 into a cell containing the target nucleotide sequence;
    preferably, the polypeptide polymer comprises the first polypeptide as defined in any one of claims 3-11 and the second polypeptide as defined in any one of claims 3-11, wherein the first polypeptide comprises a first DNA-binding protein and the second polypeptide comprises a second DNA-binding protein;
    preferably, the first DNA-binding protein targets a first nucleotide sequence flanking one side of the target base, and the second DNA-binding protein targets a second nucleotide sequence flanking the other side of the target base, thereby inducing deamination of the target base;
    preferably, the method further comprises delivering a fusion protein, or a polynucleotide encoding the fusion protein, into the cell containing the target nucleotide sequence; wherein the fusion protein is as defined in claim 17.
  22. A kit comprising the polypeptide having double-stranded DNA deaminase activity of claim 1 or a mutant thereof, the mutated double-stranded DNA deaminase of claim 2 or a variant thereof, the polypeptide polymer of any one of claims 3-11, the isolated nucleic acid molecule of claim 12, the vector of claim 13, the host cell of claim 14, or the composition of claim 16 or 17;
    preferably, the kit comprises the polypeptide polymer of any one of claims 3-11; preferably, the kit further comprises a fusion protein or a polynucleotide encoding the fusion protein, wherein the fusion protein is as defined in claim 17.
    preferably, the kit comprises the composition of claim 16 or 17.
  23. Use of the polypeptide having double-stranded DNA deaminase activity of claim 1 or a mutant thereof, the mutated double-stranded DNA deaminase of claim 2 or a variant thereof, the polypeptide polymer of any one of claims 3-11, the first polypeptide as defined in any one of claims 3-11, the second polypeptide as defined in any one of claims 3-11, the isolated nucleic acid molecule of claim 12, the vector of claim 13, the host cell of claim 14, or the composition of claim 16 or 17, for preparing a kit for editing a target nucleotide sequence or for editing a target nucleotide sequence.
PCT/CN2023/088008 2022-04-29 2023-04-13 Deaminase mutant, composition, and method for modifying mitochondrial dna WO2023207607A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202380013022.9A CN117751133A (en) 2022-04-29 2023-04-13 Deaminase mutants, compositions and methods for modifying mitochondrial DNA

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210466218.0 2022-04-29
CN202210466218 2022-04-29

Publications (1)

Publication Number Publication Date
WO2023207607A1 true WO2023207607A1 (en) 2023-11-02

Family

ID=88517455

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/088008 WO2023207607A1 (en) 2022-04-29 2023-04-13 Deaminase mutant, composition, and method for modifying mitochondrial dna

Country Status (2)

Country Link
CN (1) CN117751133A (en)
WO (1) WO2023207607A1 (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1816635A (en) * 2003-07-04 2006-08-09 强生研究有限公司 Method for detection of alkylated cytosine in DNA
WO2018176009A1 (en) * 2017-03-23 2018-09-27 President And Fellows Of Harvard College Nucleobase editors comprising nucleic acid programmable dna binding proteins
CN109957569A (en) * 2017-12-22 2019-07-02 中国科学院遗传与发育生物学研究所 Base editing system and method based on CPF1 albumen
CN111793627A (en) * 2019-04-08 2020-10-20 中国科学院上海生命科学研究院 RNA fixed-point editing by utilizing artificially constructed RNA editing enzyme and related application
WO2021155065A1 (en) * 2020-01-28 2021-08-05 The Broad Institute, Inc. Base editors, compositions, and methods for modifying the mitochondrial genome
CN113584064A (en) * 2021-07-01 2021-11-02 五邑大学 Rapid TALE expression vector construction method based on codon degeneracy
CN113699160A (en) * 2021-08-16 2021-11-26 中国医学科学院医学实验动物研究所 Mutation method of rat mitochondrial gene G14098A and application thereof

Also Published As

Publication number Publication date
CN117751133A (en) 2024-03-22

Similar Documents

Publication Publication Date Title
US11912985B2 (en) Methods and compositions for simultaneous editing of both strands of a target double-stranded nucleotide sequence
CN112195164B (en) Engineered Cas effector proteins and methods of use thereof
CN113631708B (en) Methods and compositions for editing RNA
AU2022200262B2 (en) Target-specific CRISPR mutant
AU2015299850B2 (en) Genome editing using Campylobacter jejuni CRISPR/CAS system-derived RGEN
KR102494449B1 (en) Engineered cas9 systems for eukaryotic genome modification
EP3481434A1 (en) Crispr/cas9-based compositions and methods for treating retinal degenerations
CN114686483A (en) Compositions and methods for expressing CRISPR guide RNA using H1 promoter
JP2019517802A (en) Method for screening target specific nucleases using on target and off target multi-target systems and use thereof
US20230167454A1 (en) Programmable nucleases and methods of use
US11970720B2 (en) RNA targeting methods and compositions
WO2019120193A1 (en) Split single-base gene editing systems and application thereof
US20230203481A1 (en) Effector proteins and methods of use
WO2023078384A1 (en) Isolated cas13 protein and use thereof
US20220228133A1 (en) Single base substitution protein, and composition comprising same
CN111051509A (en) Composition for dielectric calibration containing C2CL endonuclease and method for dielectric calibration using the same
WO2023207607A1 (en) Deaminase mutant, composition, and method for modifying mitochondrial dna
US20230323406A1 (en) Effector proteins and methods of use
US20230257739A1 (en) Effector proteins and methods of use
US20240158779A1 (en) Methods and compositions for simultaneous editing of both strands of a target double-stranded nucleotide sequence
US20240026345A1 (en) Parallel single-cell reporter assays and compositions
WO2022253277A1 (en) Type i-c crispr-cas3 system and application thereof
WO2022241032A1 (en) Enhanced guide nucleic acids and methods of use
WO2022221581A1 (en) Programmable nucleases and methods of use
WO2023122663A2 (en) Effector proteins and methods of use

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23795040

Country of ref document: EP

Kind code of ref document: A1