CN117136233A - Compositions comprising variant Cas12i4 polypeptides and uses thereof - Google Patents

Publication number
CN117136233A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202280027316.2A
Other languages
Chinese (zh)
Inventor
S. Chong
W-C. Lu
B. J. Hilbert
Q. N. Wessells
T. M. DiTommaso
A. J. Garrity
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Arbor Biotechnologies, Inc.
Original Assignee
Arbor Biotechnologies, Inc.
Application filed by Arbor Biotechnologies, Inc.
Priority claimed from PCT/US2022/016214 (published as WO2022174099A2)

Abstract

The present application relates to variant Cas12i4 polypeptides, methods of producing these variant Cas12i4 polypeptides, methods for characterizing these variant Cas12i4 polypeptides, cells comprising these variant Cas12i4 polypeptides, and methods of using these variant Cas12i4 polypeptides. The application further relates to complexes comprising variant Cas12i4 polypeptides and RNA guides, methods of producing these complexes, methods for characterizing these complexes, cells comprising these complexes, and methods of using these complexes.

Description

Compositions comprising variant Cas12i4 polypeptides and uses thereof
Sequence listing
The present application contains a sequence listing that has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. The ASCII copy, created on February 11, 2022, is named 51451-023wo3_sequence_listing_2_10_22_st25 and is 607,386 bytes in size.
Cross Reference to Related Applications
The present application claims priority from U.S. Ser. No. 63/148,421, filed on February 11, 2021, and U.S. Ser. No. 63/154,437, filed on February 26, 2021. The contents of each of these prior applications are incorporated herein by reference in their entirety.
Background
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and CRISPR-associated (Cas) genes (collectively, CRISPR-Cas or CRISPR/Cas systems) are adaptive immune systems in archaea and bacteria that defend particular species against foreign genetic elements.
Disclosure of Invention
In light of the foregoing background, the present invention provides certain advantages and advances over the prior art. Although the invention disclosed herein is not limited to a particular advantage or function, the invention provides a variant Cas12i4 polypeptide comprising a sequence having at least 95% identity to the sequence set forth in any one of SEQ ID NOs 3-59.
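As an illustration of the "at least 95% identity" threshold, the following minimal sketch computes percent identity between two sequences that have already been aligned to equal length. The function names, the use of "-" as the gap character, and the choice of denominator (all aligned columns except those that are gaps in both sequences) are assumptions made for illustration only; the governing definition of identity is the one given in the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of two pre-aligned sequences.

    Assumption: both strings come from the same pairwise alignment, so they
    have equal length and use '-' to mark gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only if both residues are identical non-gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 95.0) -> bool:
    """True if the two aligned sequences share at least `threshold` % identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned 20-residue toy sequences that differ at a single position share 95% identity and would satisfy the threshold, whereas two 10-residue sequences with one mismatch (90%) would not. The toy sequences used to exercise the sketch are not taken from the sequence listing.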
In one aspect of the variant, the variant Cas12i4 polypeptide is a variant of the parent polypeptide of SEQ ID No. 2.
In another aspect of the variant, the variant Cas12i4 polypeptide comprises the substitutions of table 2.
In another aspect of the variant, the variant comprises the sequence set forth in any one of SEQ ID NOs 3-59.
In another aspect of the variant, the variant comprises the sequence set forth in SEQ ID NO. 3.
In another aspect of the variant, the variant comprises the sequence set forth in SEQ ID NO. 4.
In another aspect of the variant, the variant Cas12i4 polypeptide exhibits increased binary complex formation with the RNA guide relative to the parent polypeptide.
In another aspect of the variant, the binary complex comprising the variant Cas12i4 polypeptide exhibits increased stability relative to the parent binary complex.
In another aspect of the variant, the variant Cas12i4 polypeptide exhibits increased nuclease activity relative to the parent polypeptide.
In another aspect of the variant, the variant Cas12i4 polypeptide further comprises a substitution of table 4.
In another aspect of the variant, the substitution of table 4 increases binary complex formation with the RNA guide relative to the parent polypeptide.
In another aspect of the variant, the substitution of table 4 increases the stability of the binary complex comprising the variant Cas12i4 polypeptide relative to the parent binary complex.
In another aspect of the variant, the variant Cas12i4 polypeptide further comprises a substitution that increases ternary complex formation with the RNA guide and the target nucleic acid relative to the parent polypeptide.
In another aspect of the variant, the variant Cas12i4 polypeptide further comprises a substitution that increases the stability of the ternary complex relative to the parent polypeptide.
In another aspect of the variant, the substitutions are those of table 5, table 6, table 7, table 8, table 9, and/or table 10.
In another aspect of the variant, the variant Cas12i4 polypeptide further comprises a substitution that increases on-target binding to the target nucleic acid relative to the parent polypeptide.
In another aspect of the variant, the substitution is a substitution of table 11.
The invention still further provides a composition comprising a variant Cas12i4 polypeptide as described herein, wherein the composition further comprises an RNA guide or a nucleic acid encoding an RNA guide, wherein the RNA guide comprises a direct repeat sequence and a spacer sequence.
In one aspect of the composition, the direct repeat sequence comprises:
a. nucleotide 1 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
b. nucleotide 2 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
c. nucleotide 3 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
d. nucleotide 4 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
e. nucleotide 5 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
f. nucleotide 6 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
g. nucleotide 7 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
h. nucleotide 8 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
i. nucleotide 9 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
j. nucleotide 10 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
k. nucleotide 11 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
l. nucleotide 12 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
m. nucleotide 13 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
n. nucleotide 14 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124; or
o. a sequence having at least 90% identity to the sequence of SEQ ID NO. 61 or a portion thereof.
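The first fourteen options in the list above differ only in where the direct repeat starts: nucleotide k to nucleotide 36, for k from 1 to 14. A minimal sketch of that enumeration follows, using 1-based inclusive coordinates as in the list. The 36-nucleotide input is a made-up placeholder rather than any of the listed SEQ ID NOs, and the function name is hypothetical.

```python
def direct_repeat_truncations(direct_repeat: str, end: int = 36,
                              max_start: int = 14) -> dict[int, str]:
    """Map each 1-based start position k to nucleotides k..end (inclusive)."""
    if len(direct_repeat) < end:
        raise ValueError(f"direct repeat must have at least {end} nucleotides")
    # Python slices are 0-based and end-exclusive, hence [k - 1:end].
    return {k: direct_repeat[k - 1:end] for k in range(1, max_start + 1)}


# Placeholder 36-nt RNA sequence, for illustration only.
placeholder = "ACGU" * 9
truncations = direct_repeat_truncations(placeholder)
for start, seq in truncations.items():
    print(f"nucleotide {start} to 36 ({len(seq)} nt): {seq}")
```

Each truncation would then be compared against the stated identity threshold separately; that comparison is not shown here.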
In another aspect of the composition, the direct repeat sequence comprises:
a. nucleotide 1 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
b. nucleotide 2 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
c. nucleotide 3 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
d. nucleotide 4 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
e. nucleotide 5 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
f. nucleotide 6 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
g. nucleotide 7 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
h. nucleotide 8 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
i. nucleotide 9 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
j. nucleotide 10 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
k. nucleotide 11 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
l. nucleotide 12 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
m. nucleotide 13 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
n. nucleotide 14 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124; or
o. a sequence having at least 95% identity to the sequence of SEQ ID NO. 61 or a portion thereof.
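In another aspect of the composition, the direct repeat sequence comprises:
a. nucleotide 1 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
b. nucleotide 2 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
c. nucleotide 3 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
d. nucleotide 4 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
e. nucleotide 5 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
f. nucleotide 6 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
g. nucleotide 7 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
h. nucleotide 8 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
i. nucleotide 9 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
j. nucleotide 10 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
k. nucleotide 11 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
l. nucleotide 12 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
m. nucleotide 13 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
n. nucleotide 14 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124; or
o. SEQ ID NO. 61 or a portion thereof.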
In another aspect of the composition, the direct repeat sequence comprises AGN1N2N3N4GUGUN5N6N7CAGN8GACN9C (SEQ ID NO: 125), wherein N1 is A or G, N2 is C or U, N3 is A or G, N4 is U or C, N5 is C or U, N6 is C or U, N7 is U, A, C, or G, N8 is U or C, and N9 is A or C.
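The degenerate motif of SEQ ID NO: 125 can be written as a regular expression in which each variable position becomes a character class. The sketch below is one way to test whether an RNA sequence contains the motif; the function name is hypothetical, and treating T as U on input is a convenience assumption rather than something stated in the text.

```python
import re

# One character class per variable position N1..N9 of SEQ ID NO: 125:
# AG N1 N2 N3 N4 GUGU N5 N6 N7 CAG N8 GAC N9 C
CONSENSUS_DIRECT_REPEAT = re.compile(
    r"AG[AG][CU][AG][UC]GUGU[CU][CU][ACGU]CAG[UC]GAC[AC]C"
)


def contains_consensus(rna: str) -> bool:
    """True if the sequence contains the SEQ ID NO: 125 motif.

    Assumption: input may be upper or lower case, and T is read as U.
    """
    normalized = rna.upper().replace("T", "U")
    return CONSENSUS_DIRECT_REPEAT.search(normalized) is not None


# Constructed example that satisfies every position of the motif
# (not a sequence from the sequence listing).
example = "AGACAUGUGUCCUCAGUGACAC"
print(contains_consensus(example))
```

Changing the third nucleotide of the constructed example from A to C violates the constraint on the first variable position, so the modified sequence no longer matches.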
In another aspect of the composition, the spacer sequence is about 15 nucleotides to about 35 nucleotides in length.
In another aspect of the composition, the spacer sequence binds to a target strand sequence of a target nucleic acid, and a non-target strand sequence of the target nucleic acid sequence is adjacent to a Protospacer Adjacent Motif (PAM) sequence.
In another aspect of the composition, the PAM sequence is 5'-TTN-3', 5'-NTTN-3', 5'-NTN-3', 5'-NNTN-3', 5'-VTN-3', or 5'-NVTN-3', wherein N is any nucleotide and V is A, G, or C.
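The PAM patterns above use degenerate codes (N for any nucleotide, V for A, G, or C), so checking a site against a pattern is a position-by-position set membership test. The following sketch illustrates this and scans a non-target strand for PAM hits. It assumes that the PAM lies immediately 5' of the protospacer on the non-target strand and uses a 20-nucleotide protospacer by default; both are illustrative assumptions, and the function names are hypothetical.

```python
# Degenerate codes used in the PAM patterns, mapped to the bases they allow.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "V": "ACG"}
PAM_PATTERNS = ("TTN", "NTTN", "NTN", "NNTN", "VTN", "NVTN")


def matches_pam(site: str, pattern: str) -> bool:
    """True if a DNA site matches a PAM pattern written with degenerate codes."""
    site = site.upper()
    return len(site) == len(pattern) and all(
        base in IUPAC[code] for base, code in zip(site, pattern)
    )


def find_protospacers(non_target_strand: str, pattern: str = "TTN",
                      spacer_len: int = 20) -> list[tuple[int, str, str]]:
    """Return (pam_start, pam, protospacer) for every PAM hit.

    Assumption: the PAM sits immediately 5' of the protospacer on the
    non-target strand, which is read 5'->3'. pam_start is 0-based.
    """
    seq = non_target_strand.upper()
    k = len(pattern)
    hits = []
    for i in range(len(seq) - k - spacer_len + 1):
        pam = seq[i:i + k]
        if matches_pam(pam, pattern):
            hits.append((i, pam, seq[i + k:i + k + spacer_len]))
    return hits
```

Because V excludes T, a site such as TTT matches 5'-TTN-3' but not 5'-VTN-3'.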
In another aspect of the variant or composition, the variant Cas12i4 polypeptide further comprises a Nuclear Localization Signal (NLS).
In another aspect of the variant or composition, the variant Cas12i4 polypeptide further comprises a peptide tag, a fluorescent protein, a base editing domain, a DNA methylation domain, a histone residue modification domain, a localization factor, a transcription modification factor, a light gating factor, a chemically inducible factor, or a chromatin visualization factor.
The invention still further provides nucleic acids encoding Cas12i4 polypeptides or compositions as described herein.
In one aspect of the composition, the nucleic acid is codon optimized for expression in the cell.
In another aspect of the composition, the nucleic acid is operably linked to a promoter.
In another aspect of the composition, the nucleic acid is in a vector.
In another aspect of the composition, the vector comprises a retroviral vector, a lentiviral vector, a phage vector, an adenoviral vector, an adeno-associated vector, or a herpes simplex vector.
In another aspect of the variant or composition, the variant Cas12i4 polypeptide is present in a delivery system comprising a nanoparticle (e.g., a lipid nanoparticle), a liposome, an exosome, a microvesicle, or a gene gun.
The invention still further provides a cell comprising a variant Cas12i4 polypeptide or composition as described herein.
In one aspect of the cell, the cell is a eukaryotic cell.
In another aspect of the cell, the cell is a mammalian cell or a plant cell.
In another aspect of the cell, the cell is a human cell.
The invention still further provides a composition comprising a variant Cas12i4 polypeptide or a complex comprising the variant Cas12i4 polypeptide, wherein the variant Cas12i4 polypeptide comprises a sequence having at least 95% identity to the sequence set forth in any one of SEQ ID NOs 3-59, and wherein the variant Cas12i4 polypeptide or the complex exhibits enhanced enzymatic activity, enhanced binding specificity, and/or enhanced stability relative to the parent polypeptide or the complex comprising the parent polypeptide.
In one aspect of the composition, the variant Cas12i4 polypeptide comprises the substitutions of table 2, table 4, table 5, table 6, table 7, table 8, table 9, table 10, and/or table 11.
In another aspect of the composition, the variant Cas12i4 polypeptide comprises the sequence set forth in any one of SEQ ID NOs 3-59.
In another aspect of the composition, the variant Cas12i4 polypeptide comprises the sequence set forth in SEQ ID No. 3.
In another aspect of the composition, the variant Cas12i4 polypeptide comprises the sequence set forth in SEQ ID No. 4.
In another aspect of the composition, the enhanced enzymatic activity is enhanced nuclease activity.
In another aspect of the composition, the variant Cas12i4 polypeptide exhibits enhanced binding activity to the RNA guide relative to the parent polypeptide.
In another aspect of the composition, the variant Cas12i4 polypeptide exhibits enhanced binding specificity to the RNA guide relative to the parent polypeptide.
In another aspect of the composition, the complex comprising the variant Cas12i4 polypeptide is a variant binary complex further comprising an RNA guide, and the variant binary complex exhibits enhanced binding activity (e.g., on-target binding activity) to a target nucleic acid relative to the parent binary complex.
In another aspect of the composition, the complex comprising the variant Cas12i4 polypeptide is a variant binary complex further comprising an RNA guide, and the variant binary complex exhibits enhanced binding specificity (e.g., on-target binding specificity) to a target nucleic acid relative to the parent binary complex.
In another aspect of the composition, the complex comprising the variant Cas12i4 polypeptide is a variant binary complex further comprising an RNA guide, and the variant binary complex exhibits enhanced stability relative to the parent binary complex.
In another aspect of the composition, the variant binary complex and the target nucleic acid form a variant ternary complex, and the variant ternary complex exhibits increased stability relative to the parent ternary complex.
In another aspect of the composition, the variant Cas12i4 polypeptide further exhibits enhanced binary complex formation, enhanced protein-RNA interactions, and/or reduced dissociation from RNA guides relative to the parent polypeptide.
In another aspect of the composition, the variant binary complex further exhibits reduced dissociation from the target nucleic acid, and/or reduced off-target binding to non-target nucleic acid relative to the parent binary complex.
In another aspect of the composition, the enhanced enzymatic activity, enhanced binding specificity, and/or enhanced stability occurs at a temperature ranging, for example, from 20 ℃ to 65 ℃.
In another aspect of the composition, the enhanced enzymatic activity, enhanced binding specificity, and/or enhanced stability occurs over a range of incubation times.
In another aspect of the composition, the enhanced enzymatic activity, enhanced binding specificity, and/or enhanced stability occurs in a buffer having a pH in the range of about 7.3 to about 8.6.
In another aspect of the composition, the enhanced enzymatic activity, enhanced binding specificity, and/or enhanced stability occurs when the Tm value of the variant Cas12i4 polypeptide, the variant binary complex, or the variant ternary complex is at least 8 ℃ greater than the Tm value of the parent polypeptide, parent binary complex, or parent ternary complex.
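The melting-temperature criterion in this aspect reduces to a simple difference check: the Tm of the variant species minus the Tm of the corresponding parent species must be at least 8 ℃. A minimal sketch follows; the function name is hypothetical, and the numbers used to exercise it are illustrative values, not measurements from the application.

```python
def tm_shift_meets_threshold(variant_tm_c: float, parent_tm_c: float,
                             min_shift_c: float = 8.0) -> bool:
    """True if the variant Tm exceeds the parent Tm by at least min_shift_c.

    Both temperatures are in degrees Celsius and must refer to the same kind
    of species (polypeptide, binary complex, or ternary complex).
    """
    return (variant_tm_c - parent_tm_c) >= min_shift_c


# Illustrative values only: a 9-degree shift meets the criterion,
# a 5-degree shift does not.
print(tm_shift_meets_threshold(59.0, 50.0))
print(tm_shift_meets_threshold(55.0, 50.0))
```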
In another aspect of the composition, the variant Cas12i4 polypeptide comprises a RuvC domain or a split RuvC domain.
In another aspect of the composition, the parent polypeptide comprises the sequence of SEQ ID NO. 2.
In another aspect of the composition, the RNA guide comprises a direct repeat sequence and a spacer sequence.
In another aspect of the composition, the direct repeat sequence comprises:
a. nucleotide 1 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
b. nucleotide 2 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
c. nucleotide 3 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
d. nucleotide 4 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
e. nucleotide 5 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
f. nucleotide 6 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
g. nucleotide 7 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
h. nucleotide 8 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
i. nucleotide 9 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
j. nucleotide 10 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
k. nucleotide 11 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
l. nucleotide 12 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
m. nucleotide 13 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
n. nucleotide 14 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124; or
o. a sequence having at least 90% identity to the sequence of SEQ ID NO. 61 or a portion thereof.
In another aspect of the composition, the direct repeat sequence comprises:
a. nucleotide 1 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
b. nucleotide 2 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
c. nucleotide 3 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
d. nucleotide 4 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
e. nucleotide 5 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
f. nucleotide 6 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
g. nucleotide 7 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
h. nucleotide 8 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
i. nucleotide 9 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
j. nucleotide 10 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
k. nucleotide 11 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
l. nucleotide 12 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
m. nucleotide 13 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
n. nucleotide 14 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124; or
o. a sequence having at least 95% identity to the sequence of SEQ ID NO. 61 or a portion thereof.
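In another aspect of the composition, the direct repeat sequence comprises:
a. nucleotide 1 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
b. nucleotide 2 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
c. nucleotide 3 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
d. nucleotide 4 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
e. nucleotide 5 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
f. nucleotide 6 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
g. nucleotide 7 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
h. nucleotide 8 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
i. nucleotide 9 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
j. nucleotide 10 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
k. nucleotide 11 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
l. nucleotide 12 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
m. nucleotide 13 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
n. nucleotide 14 to nucleotide 36 of the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124; or
o. SEQ ID NO. 61 or a portion thereof.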
In another aspect of the composition, the direct repeat sequence comprises AGN1N2N3N4GUGUN5N6N7CAGN8GACN9C (SEQ ID NO: 125), wherein N1 is A or G, N2 is C or U, N3 is A or G, N4 is U or C, N5 is C or U, N6 is C or U, N7 is U, A, C, or G, N8 is U or C, and N9 is A or C.
In another aspect of the composition, the spacer sequence is 15 to 35 nucleotides in length.
In another aspect of the composition, the spacer sequence comprises complementarity to a target strand sequence of the target nucleic acid.
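To make the relationship between spacer and target strand concrete, the sketch below derives an RNA spacer that is fully complementary to a DNA target strand and checks that its length falls within 15 to 35 nucleotides. Full complementarity is the simplest case and is an assumption here, since the text only requires that the spacer comprise complementarity; the function names and the toy target sequence are hypothetical.

```python
# DNA base on the target strand -> complementary RNA base in the spacer.
COMPLEMENT_RNA = {"A": "U", "C": "G", "G": "C", "T": "A"}


def spacer_for_target_strand(target_strand_dna: str) -> str:
    """RNA spacer fully complementary to a DNA target strand.

    Both sequences are written 5'->3', so the spacer is the reverse
    complement of the target strand with U in place of T.
    """
    return "".join(COMPLEMENT_RNA[base]
                   for base in reversed(target_strand_dna.upper()))


def spacer_length_ok(spacer: str, min_len: int = 15, max_len: int = 35) -> bool:
    """True if the spacer length lies within the stated 15 to 35 nt range."""
    return min_len <= len(spacer) <= max_len


toy_target = "AAACCCGGGTTTACG"  # made-up 15-nt target strand
spacer = spacer_for_target_strand(toy_target)
print(spacer, spacer_length_ok(spacer))
```

Because the spacer pairs with the target strand, its sequence equals that of the corresponding non-target strand segment with U substituted for T.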
In another aspect of the composition, the target nucleic acid comprises a non-target strand sequence adjacent to a Protospacer Adjacent Motif (PAM) sequence.
In another aspect of the composition, the PAM sequence is 5'-TTN-3', 5'-NTTN-3', 5'-NTN-3', 5'-NNTN-3', 5'-VTN-3', or 5'-NVTN-3', where N is any nucleotide (e.g., A, G, T, or C) and V is A, G, or C.
In another aspect of the composition, the variant Cas12i4 polypeptide further comprises a peptide tag, a fluorescent protein, a base editing domain, a DNA methylation domain, a histone residue modification domain, a localization factor, a transcription modification factor, a light gating factor, a chemically inducible factor, or a chromatin visualization factor.
The invention still further provides a composition comprising a nucleic acid encoding a Cas12i4 polypeptide as described herein, wherein optionally the nucleic acid is codon optimized for expression in a cell.
In one aspect of the composition, the cell is a eukaryotic cell.
In another aspect of the composition, the cell is a mammalian cell or a plant cell.
In another aspect of the composition, the cell is a human cell.
In another aspect of the composition, the nucleic acid encoding the variant Cas12i4 polypeptide is operably linked to a promoter.
In another aspect of the composition, the nucleic acid encoding the variant Cas12i4 polypeptide is located in a vector.
In another aspect of the composition, the vector comprises a retroviral vector, a lentiviral vector, a phage vector, an adenoviral vector, an adeno-associated vector, or a herpes simplex vector.
In another aspect of the composition, the composition is present in a delivery composition comprising a nanoparticle (e.g., a lipid nanoparticle), a liposome, an exosome, a microvesicle, or a gene gun.
The invention still further provides an RNA guide or a nucleic acid encoding the RNA guide, wherein the RNA guide comprises a direct repeat sequence comprising:
a. nucleotide 1 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
b. nucleotide 2 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
c. nucleotide 3 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
d. nucleotide 4 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
e. nucleotide 5 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
f. nucleotide 6 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
g. nucleotide 7 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
h. nucleotide 8 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
i. nucleotide 9 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
j. nucleotide 10 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
k. nucleotide 11 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
l. nucleotide 12 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
m. nucleotide 13 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
n. nucleotide 14 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124; or
o. a sequence having at least 90% identity to the sequence of SEQ ID NO. 61 or a portion thereof.
In one aspect of the RNA guide or nucleic acid encoding the RNA guide, the direct repeat sequence comprises:
a. nucleotide 1 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
b. nucleotide 2 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
c. nucleotide 3 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
d. nucleotide 4 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
e. nucleotide 5 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
f. nucleotide 6 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
g. nucleotide 7 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
h. nucleotide 8 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
i. nucleotide 9 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
j. nucleotide 10 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
k. nucleotide 11 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
l. nucleotide 12 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
m. nucleotide 13 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
n. nucleotide 14 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124; or
o. a sequence having at least 95% identity to the sequence of SEQ ID NO. 61 or a portion thereof.
In another aspect of the RNA guide or nucleic acid encoding the RNA guide, the direct repeat sequence comprises:
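a. nucleotide 1 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
b. nucleotide 2 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
c. nucleotide 3 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
d. nucleotide 4 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
e. nucleotide 5 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
f. nucleotide 6 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
g. nucleotide 7 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
h. nucleotide 8 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
i. nucleotide 9 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
j. nucleotide 10 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
k. nucleotide 11 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
l. nucleotide 12 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
m. nucleotide 13 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
n. nucleotide 14 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124; or
o. SEQ ID NO. 61 or a portion thereof.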
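The truncation pattern recited in the lists above (nucleotide k to nucleotide 36, for k from 1 to 14) can be enumerated programmatically. The following is a minimal Python sketch under stated assumptions: the function name `direct_repeat_truncations` and the 36-nucleotide placeholder string are illustrative only and do not correspond to any SEQ ID NO of this disclosure.

```python
def direct_repeat_truncations(dr: str, max_start: int = 14, end: int = 36) -> dict:
    """Map each 1-based start position k (1..max_start) to nucleotides k..end of dr."""
    if len(dr) < end:
        raise ValueError(f"direct repeat must be at least {end} nucleotides long")
    # Python slices are 0-based and end-exclusive, so nucleotides k..end are dr[k-1:end].
    return {k: dr[k - 1:end] for k in range(1, max_start + 1)}


# Hypothetical placeholder sequence, NOT a sequence from the disclosure.
placeholder_dr = "ACGU" * 9  # 36 nucleotides

truncations = direct_repeat_truncations(placeholder_dr)
for start, fragment in truncations.items():
    print(f"nucleotide {start} to nucleotide 36: {len(fragment)} nt")
```

Each truncated fragment could then be compared against a candidate direct repeat at the recited identity threshold (e.g., 90% or 95%).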
In another aspect of the RNA guide or nucleic acid encoding the RNA guide, the direct repeat comprises AGN₁N₂N₃N₄GUGUN₅N₆N₇CAGN₈GACN₉C (SEQ ID NO: 125), wherein N₁ is A or G, N₂ is C or U, N₃ is A or G, N₄ is U or C, N₅ is C or U, N₆ is C or U, N₇ is U, A, C, or G, N₈ is U or C, and N₉ is A or C.
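For illustration, the degenerate motif of SEQ ID NO: 125 can be expressed as a regular expression, with each position N1 to N9 becoming a character class. This is a sketch only: the names `DEGENERATE_POSITIONS`, `MOTIF_LAYOUT`, and `contains_motif` are hypothetical, and the test sequences are made up to exercise the pattern rather than taken from the disclosure.

```python
import re

# Allowed nucleotides at each degenerate position, as stated for SEQ ID NO: 125.
DEGENERATE_POSITIONS = {
    "N1": "AG", "N2": "CU", "N3": "AG", "N4": "UC", "N5": "CU",
    "N6": "CU", "N7": "UACG", "N8": "UC", "N9": "AC",
}

# Motif layout: AG N1 N2 N3 N4 GUGU N5 N6 N7 CAG N8 GAC N9 C
MOTIF_LAYOUT = ["AG", "N1", "N2", "N3", "N4", "GUGU", "N5", "N6", "N7",
                "CAG", "N8", "GAC", "N9", "C"]


def build_motif_regex():
    """Turn the layout into a regex; degenerate positions become character classes."""
    parts = []
    for token in MOTIF_LAYOUT:
        if token in DEGENERATE_POSITIONS:
            parts.append("[" + DEGENERATE_POSITIONS[token] + "]")
        else:
            parts.append(token)
    return re.compile("".join(parts))


MOTIF_REGEX = build_motif_regex()


def contains_motif(rna: str) -> bool:
    """Return True if the RNA sequence contains the degenerate motif anywhere."""
    return MOTIF_REGEX.search(rna.upper()) is not None


# Made-up example sequences (assumptions for illustration only).
print(contains_motif("AGACAUGUGUCCUCAGUGACAC"))
print(contains_motif("AGCCAUGUGUCCUCAGUGACAC"))
```

The search is unanchored because the text says the direct repeat "comprises" the motif, so flanking nucleotides are permitted.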
In another aspect of the RNA guide or the nucleic acid encoding the RNA guide, the RNA guide further comprises a spacer sequence.
In another aspect of the RNA guide or nucleic acid encoding the RNA guide, the spacer sequence is about 15 to about 35 nucleotides in length.
In another aspect of the RNA guide or nucleic acid encoding the RNA guide, the spacer sequence recognizes the target nucleic acid.
In another aspect of the RNA guide or nucleic acid encoding the RNA guide, the target nucleic acid comprises a target sequence adjacent to a protospacer adjacent motif (PAM) sequence, wherein the PAM sequence comprises a nucleotide sequence as shown by 5'-TTN-3', 5'-NTTN-3', 5'-NTN-3', 5'-NNTN-3', 5'-VTN-3', or 5'-NVTN-3', wherein N is any nucleotide (e.g., A, G, T, or C) and V is A, G, or C.
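The degenerate PAM codes above can be checked mechanically. The sketch below, offered as an illustration rather than a validated tool, expands N and V as defined in the text (N is any nucleotide; V is A, G, or C) and reports which of the listed PAM patterns a candidate DNA sequence satisfies. The names `matches_pam` and `matching_patterns` are hypothetical.

```python
# Degenerate codes as defined in the text: N = any nucleotide, V = A, G, or C.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "V": "ACG"}

# PAM patterns listed in the text, written 5' to 3'.
PAM_PATTERNS = ["TTN", "NTTN", "NTN", "NNTN", "VTN", "NVTN"]


def matches_pam(candidate: str, pattern: str) -> bool:
    """Return True if candidate (5' to 3') satisfies the pattern position by position."""
    candidate = candidate.upper()
    if len(candidate) != len(pattern):
        return False
    return all(base in IUPAC[code] for base, code in zip(candidate, pattern))


def matching_patterns(candidate: str) -> list:
    """List every PAM pattern from the text that the candidate satisfies."""
    return [p for p in PAM_PATTERNS if matches_pam(candidate, p)]


print(matching_patterns("ATTG"))
print(matching_patterns("ACTG"))
print(matching_patterns("CTA"))
```

Note that the patterns overlap: any sequence matching 5'-NTTN-3' or 5'-NVTN-3' also matches the broader 5'-NNTN-3'.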
The invention still further provides a composition comprising an RNA guide or a nucleic acid encoding the RNA guide as described herein.
In one aspect of the composition, the composition is a delivery composition comprising nanoparticles (e.g., lipid nanoparticles), liposomes, exosomes, microbubbles, or gene-guns.
In another aspect of the RNA guides or nucleic acids encoding the RNA guides described herein, the nucleic acids encoding the RNA guides are operably linked to a promoter.
In another aspect of the RNA guide or the nucleic acid encoding the RNA guide, the nucleic acid encoding the RNA guide is in a vector.
In another aspect of the RNA guide or nucleic acid encoding the RNA guide, the vector comprises a retroviral vector, a lentiviral vector, a phage vector, an adenoviral vector, an adeno-associated vector, or a herpes simplex vector.
The invention still further provides a cell comprising an RNA guide as described herein or a nucleic acid encoding the RNA guide.
In one aspect of the cell, the cell is a eukaryotic cell.
In another aspect of the cell, the cell is a mammalian cell or a plant cell.
In another aspect of the cell, the cell is a human cell.
The invention still further provides a method for editing a gene in a cell, the method comprising contacting the cell with a variant, composition, RNA guide, or nucleic acid molecule as described herein.
The invention still further provides a nucleic acid molecule encoding a Cas12i4 variant of SEQ ID NO. 4, wherein the sequence of the nucleic acid molecule has 95% identity to a sequence selected from the group consisting of SEQ ID NOs: 222-228.
In one embodiment, the sequence of the nucleic acid molecule comprises a sequence selected from the group consisting of SEQ ID NOS: 222-228.
Definitions
The present invention will be described with respect to particular embodiments and with reference to certain drawings but the invention is not limited thereto but only by the claims. Unless otherwise indicated, the terms set forth below should generally be understood in their ordinary sense.
As used herein, the term "activity" refers to biological activity. In some embodiments, activity comprises enzymatic activity, e.g., the catalytic ability of a nuclease. For example, activity may comprise nuclease activity. In some embodiments, activity comprises binding activity, e.g., binding of a nuclease to an RNA guide and/or a target nucleic acid.
As used herein, the term "complex" refers to the grouping of two or more molecules. In some embodiments, a complex comprises a polypeptide and a nucleic acid molecule that interact (e.g., bind, contact, adhere) with each other.
As used herein, the term "binary complex" refers to a grouping of two molecules (e.g., a polypeptide and a nucleic acid molecule). In some embodiments, a binary complex refers to a grouping of a polypeptide and a targeting moiety (e.g., an RNA guide). In some embodiments, the binary complex is referred to as a ribonucleoprotein (RNP). As used herein, the term "variant binary complex" refers to a grouping of a variant Cas12i4 polypeptide and an RNA guide. As used herein, the term "parent binary complex" refers to a grouping of a parent polypeptide and an RNA guide, or of a reference polypeptide and an RNA guide.
As used herein, the term "ternary complex" refers to a grouping of three molecules (e.g., one polypeptide and two nucleic acid molecules). In some embodiments, a ternary complex refers to a grouping of a polypeptide, an RNA molecule, and a DNA molecule. In some embodiments, a ternary complex refers to a grouping of a polypeptide, a targeting moiety (e.g., an RNA guide), and a target nucleic acid (e.g., a target DNA molecule). In some embodiments, a ternary complex refers to a grouping of a binary complex (e.g., a ribonucleoprotein) and a third molecule (e.g., a target nucleic acid).
As used herein, the term "domain" refers to a distinct functional and/or structural unit of a polypeptide. In some embodiments, the domain may comprise a conserved amino acid sequence.
As used herein, the term "interface" refers to one or more residues (e.g., domains/motifs or portions of domains/motifs) of a variant Cas12i4 polypeptide that contact (e.g., interact with or are adjacent to) a nucleic acid molecule or a different domain/motif, or portion of a different domain/motif, of the variant Cas12i4 polypeptide. In some aspects, the interface is a buried surface region between adjacent domains or motifs. In some aspects, the interface is a surface region between the polypeptide and a ligand (e.g., DNA or RNA) at which the polypeptide and the ligand are in contact. As used herein, the term "nucleic acid interface" refers to residues of a variant Cas12i4 polypeptide that are in close proximity to (e.g., adjacent to) or interact with a nucleic acid sequence (e.g., a DNA sequence or an RNA sequence). As used herein, the term "RNA-binding interface" refers to residues of the variant Cas12i4 polypeptide that are in close proximity to (e.g., adjacent to) or interact with an RNA guide (e.g., the direct repeat of the RNA guide). As used herein, the term "double-stranded DNA-binding interface" refers to residues of the variant Cas12i4 polypeptide that are in close proximity to (e.g., adjacent to) and/or interact with double-stranded DNA. As used herein, the term "single-stranded DNA-binding interface" refers to residues of the variant Cas12i4 polypeptide that are in close proximity to (e.g., adjacent to) and/or interact with single-stranded DNA. As used herein, the term "domain-domain interface" refers to a domain that is in close proximity to (e.g., adjacent to) a separate domain. In some embodiments, the domain-domain interface (e.g., the helix II domain-Nuc domain interface) is formed upon complex formation (e.g., ternary complex formation).
As used herein, the terms "parent," "parent polypeptide," and "parent sequence" refer to the original polypeptide (e.g., the starting polypeptide) to which an alteration is made to produce a variant Cas12i4 polypeptide of the invention. In some embodiments, the parent is a polypeptide having the same amino acid sequence as the variant at one or more specified positions. The parent may be a naturally occurring (wild-type) polypeptide. In a particular embodiment, the parent is a polypeptide having at least 60%, at least 61%, at least 62%, at least 63%, at least 64%, at least 65%, at least 70%, at least 72%, at least 73%, at least 74%, at least 75%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to the polypeptide of SEQ ID NO. 2.
As used herein, the term "protospacer adjacent motif" or "PAM" refers to a DNA sequence adjacent to a "target sequence" that binds to a complex comprising a Cas12i4 polypeptide and an RNA guide. The "target nucleic acid" is a double-stranded molecule: one strand comprises the target sequence adjacent to the PAM and is referred to as the "PAM strand" (e.g., the non-target strand or the non-spacer-complementary strand), while the other, complementary strand is referred to as the "non-PAM strand" (e.g., the target strand or the spacer-complementary strand). As used herein, the term "adjacent" includes the case where the RNA guide of the complex specifically binds, interacts or associates with a target sequence immediately adjacent to the PAM. In such cases, there are no nucleotides between the target sequence and the PAM. The term "adjacent" also includes the case where there are a few (e.g., 1, 2, 3, 4 or 5) nucleotides between the target sequence to which the targeting moiety binds and the PAM.
As used herein, the terms "reference composition," "reference molecule," "reference sequence," and "reference" refer to a control, such as a negative control or a parent (e.g., a parent sequence, a parent protein, or a wild-type protein). For example, a reference molecule refers to a polypeptide that is compared to a variant Cas12i4 polypeptide. Likewise, a reference RNA guide refers to a targeting moiety that is compared to a modified RNA guide. The variant or modified molecule may be compared to the reference molecule based on sequence (e.g., the variant or modified molecule may have X% sequence identity or homology to the reference molecule), thermostability, or activity (e.g., the variant or modified molecule may have X% of the activity of the reference molecule). For example, a variant or modified molecule may be characterized as having no more than 10% of the activity of the reference polypeptide, or may be characterized as having at least 10% greater activity than the reference polypeptide. Examples of reference polypeptides include naturally occurring unmodified polypeptides, such as naturally occurring polypeptides from archaeal or bacterial species. In certain embodiments, the reference polypeptide is a naturally occurring polypeptide having the closest sequence identity or homology to the variant Cas12i4 polypeptide with which it is compared. In certain embodiments, the reference polypeptide is a parent molecule having a naturally occurring or known sequence on which mutations have been made to obtain a variant Cas12i4 polypeptide.
As used herein, the term "RNA guide" or "RNA guide sequence" refers to any RNA molecule that facilitates targeting of a Cas12i4 polypeptide described herein to a target nucleic acid. For example, an RNA guide can be a molecule that recognizes (e.g., binds to) a target nucleic acid. The RNA guide can be designed to be complementary to a target strand (e.g., a non-PAM strand) of a target nucleic acid sequence. The RNA guide comprises a DNA targeting sequence and a direct repeat (DR) sequence. The terms CRISPR RNA (crRNA), pre-crRNA, mature crRNA and gRNA are also used herein to refer to RNA guides. As used herein, the term "pre-crRNA" refers to an unprocessed RNA molecule comprising a DR-spacer-DR sequence. As used herein, the term "mature crRNA" refers to the processed form of the pre-crRNA; the mature crRNA can comprise a DR-spacer sequence, wherein the DR is a truncated form of the DR of the pre-crRNA and/or the spacer is a truncated form of the spacer of the pre-crRNA.
As used herein, the term "substantially identical" refers to a sequence, polynucleotide, or polypeptide that has a certain degree of identity to a reference sequence.
As used herein, the terms "target nucleic acid," "target sequence," and "target substrate" refer to a nucleic acid that specifically binds to an RNA guide. In some embodiments, the DNA targeting sequence of the RNA guide binds to the target nucleic acid.
As used herein, the terms "variant Cas12i4 polypeptide" and "variant nuclease polypeptide" refer to polypeptides comprising alterations (e.g., substitutions, insertions, deletions, and/or fusions) at one or more residue positions as compared to the parent polypeptide. As used herein, the terms "variant Cas12i4 polypeptide" and "variant nuclease polypeptide" refer to polypeptides comprising alterations compared to the polypeptide of SEQ ID No. 2.
Drawings
FIG. 1A is a DNA EMSA gel showing the ability of RNPs prepared with a) wild-type Cas12i4 (SEQ ID NO: 2) or variant Cas12i4 of SEQ ID NO:4 and b) RNA guide of SEQ ID NO:62 to bind to AAVS1 dsDNA target (SEQ ID NO: 65). Unbound dsDNA bands are indicated.
FIG. 1B is a DNA EMSA gel showing the ability of RNPs prepared with a) wild-type Cas12i4 (SEQ ID NO: 2) or variant Cas12i4 of SEQ ID NO:4 and b) RNA guide of SEQ ID NO:63 to bind to AAVS1 dsDNA target (SEQ ID NO: 66). Bound dsDNA and unbound dsDNA bands are indicated.
FIG. 1C is a DNA EMSA gel showing the ability of RNPs prepared with a) wild-type Cas12i4 (SEQ ID NO: 2) or variant Cas12i4 of SEQ ID NO:4 and b) RNA guide of SEQ ID NO:64 to bind to an EMX1 dsDNA target (SEQ ID NO: 67). Bound dsDNA and unbound dsDNA bands are indicated.
FIG. 1D is a control DNA EMSA gel showing the ability of RNPs prepared with a) wild-type Cas12i4 (SEQ ID NO: 2) or variant Cas12i4 of SEQ ID NO:4 and b) RNA guide of SEQ ID NO:62 to bind to EMX1 dsDNA target (SEQ ID NO: 67). Unbound dsDNA bands are indicated.
FIG. 2A is a gel showing cleavage of AAVS1 dsDNA target (SEQ ID NO: 65) by RNPs prepared with a) wild-type Cas12i4 (SEQ ID NO: 2) or variant Cas12i4 of SEQ ID NO:4 and b) RNA guide of SEQ ID NO: 62. Full-length and cleaved DNA bands are indicated.
FIG. 2B is a gel showing cleavage of AAVS1 dsDNA target (SEQ ID NO: 66) by RNPs prepared with a) wild-type Cas12i4 (SEQ ID NO: 2) or variant Cas12i4 of SEQ ID NO:4 and b) RNA guide of SEQ ID NO: 63. Full-length and cleaved DNA bands are indicated.
FIG. 2C is a gel showing cleavage of EMX1 dsDNA target (SEQ ID NO: 67) by RNPs prepared with a) wild-type Cas12i4 (SEQ ID NO: 2) or variant Cas12i4 of SEQ ID NO:4 and b) RNA guide of SEQ ID NO: 64. Full-length and cleaved DNA bands are indicated.
FIG. 3 is a diagram showing the induction of insertions/deletions (indels) in AAVS1, EMX1 and VEGFA targets (SEQ ID NOs: 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, and 107) by wild-type Cas12i4 (SEQ ID NO: 2) and the Cas12i4 variants of SEQ ID NO:3 and SEQ ID NO:4 in mammalian cells.
FIG. 4 is a diagram showing indels in AAVS1, EMX1 and VEGFA targets adjacent to 5'-NTTN-3' or 5'-NVTN-3' PAM sequences induced in mammalian cells by wild type Cas12i4 (SEQ ID NO: 2) and Cas12i4 variants of SEQ ID NO: 4.
Fig. 5 is a schematic diagram showing the domain structure of a Cas12i4 polypeptide.
Fig. 6A depicts the position of the V592R substitution in the Cas12i4 structure. The V592R substitution may interact with the single-stranded non-target strand.
Fig. 6B depicts the positions of the E480R and G564R substitutions in the Cas12i4 structure, which are close to the PAM sequence of the double-stranded DNA. The E480R and G564R substitutions may stabilize interactions with double-stranded DNA.
Detailed Description
The present disclosure relates to novel variants of the polypeptide of SEQ ID NO. 2, methods of producing the same, and uses thereof. The disclosure further relates to complexes comprising variants of the polypeptide of SEQ ID NO. 2, methods of producing the same, and uses thereof. In some aspects, described herein are compositions comprising a complex having one or more characteristics. In some aspects, methods of delivering a composition comprising a complex are described.
Composition and method for producing the same
In some embodiments, the compositions of the invention include variant Cas12i4 polypeptides that exhibit enhanced enzymatic activity, enhanced binding specificity, and/or enhanced stability relative to the parent polypeptide. In some embodiments, the compositions of the invention include a complex comprising a variant Cas12i4 polypeptide that exhibits enhanced enzymatic activity, enhanced binding specificity, and/or enhanced stability relative to the parent complex.
In some embodiments, the compositions of the invention comprise a variant Cas12i4 polypeptide and an RNA guide. In some embodiments, the compositions of the invention comprise a variant binary complex comprising a variant Cas12i4 polypeptide and an RNA guide.
In some aspects of the composition, the variant Cas12i4 polypeptide has increased complex formation (e.g., increased binary complex formation) with the RNA guide as compared to the parent polypeptide. In some aspects of the composition, the variant Cas12i4 polypeptide and the RNA guide have greater binding affinity than the parent polypeptide and the RNA guide. In some aspects of the composition, the variant Cas12i4 polypeptide and the RNA guide have stronger protein-RNA interactions (e.g., ionic interactions) than the parent polypeptide and the RNA guide. In some aspects of the composition, the variant binary complex is more stable than the parent binary complex.
In some embodiments, the compositions of the invention include a variant Cas12i4 polypeptide, an RNA guide, and a target nucleic acid. In some embodiments, the compositions of the invention comprise a variant ternary complex comprising a variant Cas12i4 polypeptide, an RNA guide, and a target nucleic acid.
In some aspects of the composition, the variant Cas12i4 polypeptide has increased complex formation (e.g., increased ternary complex formation) with the RNA guide and the target nucleic acid as compared to the parent polypeptide. In some aspects of the composition, the variant Cas12i4 polypeptide and the RNA guide (e.g., variant binary complex) have greater binding affinity to the target nucleic acid than the parent polypeptide and the RNA guide (e.g., parent binary complex). In some aspects of the composition, the variant ternary complex is more stable than the parent ternary complex.
Variant Cas12i4 polypeptides
In some embodiments, the compositions of the invention comprise a variant Cas12i4 polypeptide described herein.
In some embodiments, the polypeptide of the invention is a variant of a parent polypeptide, wherein the parent is encoded by a polynucleotide comprising a nucleotide sequence, e.g., SEQ ID NO. 1, or comprises an amino acid sequence, e.g., SEQ ID NO. 2.
Table 1. Parental sequences.
The nucleic acid sequence encoding a parent polypeptide described herein may be substantially identical to a reference nucleic acid sequence (e.g., SEQ ID NO: 1). In some embodiments, the variant Cas12i4 polypeptide is encoded by a nucleic acid comprising a sequence having at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 99.5% sequence identity to a reference nucleic acid sequence (e.g., a nucleic acid sequence encoding a parent polypeptide, such as SEQ ID NO: 1). The percent identity between two such nucleic acids can be determined manually by inspecting the two optimally aligned nucleic acid sequences, or by using software programs or algorithms (e.g., BLAST, ALIGN, CLUSTAL) with standard parameters. One indication that two nucleic acid sequences are substantially identical is that one nucleic acid molecule hybridizes to the complement of the other nucleic acid molecule under stringent conditions (e.g., in the medium to high stringency range).
In some embodiments, the variant Cas12i4 polypeptide is encoded by a nucleic acid sequence having at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99% or more sequence identity, but not 100% sequence identity, to a reference nucleic acid sequence (e.g., a nucleic acid sequence encoding a parent polypeptide, such as SEQ ID NO: 1).
In some embodiments, a variant Cas12i4 polypeptide of the invention comprises a polypeptide sequence that is 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% but not 100% identical to SEQ ID No. 2. In some embodiments, a variant Cas12i4 polypeptide of the invention comprises a polypeptide sequence that is greater than 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% but not 100% identical to SEQ ID No. 2. In some embodiments, the variant Cas12i4 polypeptide retains amino acid changes (or at least 1, 2, 3, 4, 5, etc. of these changes) that distinguish the polypeptide from its corresponding parent/reference sequence.
In some embodiments, the invention describes variant Cas12i4 polypeptides having a specified degree of amino acid sequence identity to one or more reference polypeptides (e.g., parent polypeptides), e.g., at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or even at least 99% but not 100% sequence identity to the amino acid sequence of SEQ ID No. 2. Homology or identity can be determined, for example, by amino acid sequence alignment using a program as described herein (such as BLAST, ALIGN, or CLUSTAL). In some embodiments, the variant Cas12i4 polypeptide retains amino acid changes (or at least 1, 2, 3, 4, 5, etc. of these changes) that distinguish the polypeptide from its corresponding parent/reference sequence.
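As a rough illustration of the "at least X% but not 100%" identity criterion described above, the following Python sketch computes percent identity over two sequences that have already been aligned (gaps written as "-"). It is an assumption-laden simplification: it counts identical columns over the full alignment length, whereas BLAST, ALIGN, and CLUSTAL each apply their own alignment procedures and may report identity over a different denominator. The function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns holding the same (non-gap) residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    return 100.0 * matches / len(aligned_a)


def is_variant_within_threshold(aligned_variant: str, aligned_parent: str,
                                threshold: float) -> bool:
    """True if identity is at least `threshold` percent but less than 100 percent."""
    identity = percent_identity(aligned_variant, aligned_parent)
    return threshold <= identity < 100.0


# Toy five-residue sequences (illustrative only, not from the disclosure).
print(percent_identity("MKVLE", "MKRLE"))
print(is_variant_within_threshold("MKRLE", "MKVLE", 60.0))
print(is_variant_within_threshold("MKVLE", "MKVLE", 60.0))
```

The upper bound of strictly less than 100% reflects the requirement that a variant differ from its parent at one or more positions.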
In some embodiments, the variant Cas12i4 polypeptide comprises alterations at one or more (e.g., several) amino acids of the parent polypeptide, wherein at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200 or more amino acids are altered. In some embodiments, the variant Cas12i4 polypeptide retains amino acid changes (or at least 1, 2, 3, 4, 5, etc. of these changes) that distinguish the polypeptide from its corresponding parent/reference sequence.
In some embodiments, the variant Cas12i4 polypeptide comprises one or more of the amino acid substitutions listed in table 2.
Table 2. Single amino acid substitutions in variant Cas12i4 polypeptides.
In some embodiments, the variant Cas12i4 polypeptide comprises an alteration that increases the interaction of the variant Cas12i4 polypeptide with the RNA guide. In some embodiments, the alteration that increases interaction with the RNA guide is an arginine, lysine, glutamine, asparagine, or histidine substitution. In some embodiments, the variant Cas12i4 polypeptide comprises an alteration that increases the interaction of the variant Cas12i4 polypeptide with the target nucleic acid. In some embodiments, the alteration that increases interaction with the target nucleic acid is an arginine, lysine, glutamine, asparagine, or histidine substitution. In some embodiments, the variant Cas12i4 polypeptide comprises an alanine substitution.
In some embodiments, the variant Cas12i4 polypeptide comprises an arginine substitution relative to the parent polypeptide of SEQ ID No. 2. For example, in some embodiments, the variant Cas12i4 polypeptide comprises an arginine substitution at residues 480, 482, 484, 486, 487, 490, 503, 545, 564, 566, 568, 569, 570, 587, 591, 592, 595, 598, 599, 612, 625, 629, 633, 635, 641, 668, 679, 713, 727, 735, 753, 754, 812, 825, 826, 831, 845, 846, 863, 865, 867, 870, 875, 886, 906, 945, 1028, 1032, 1042, 1049, 1055, 1058, 1059, 1071 of SEQ ID No. 2.
In some embodiments, the variant Cas12i4 polypeptide comprises a glycine substitution relative to the parent polypeptide of SEQ ID No. 2. For example, in some embodiments, the variant Cas12i4 polypeptide comprises a glycine substitution at residues 480, 482, 484, 486, 490, 503, 545, 566, 568, 569, 570, 587, 591, 592, 595, 598, 599, 612, 621, 625, 633, 635, 641, 668, 679, 689, 713, 727, 735, 753, 754, 812, 818, 825, 826, 831, 845, 846, 863, 865, 867, 870, 875, 886, 906, 945, 1028, 1032, 1042, 1049, 1055, 1058, 1059, 1071 of SEQ ID No. 2.
In some embodiments, the variant Cas12i4 polypeptide comprises two or more substitutions relative to the parent polypeptide of SEQ ID No. 2. For example, a variant polypeptide may comprise two, three, four, five, six, seven, eight, nine, ten or more substitutions as compared to SEQ ID NO. 2. Non-limiting examples of two or more substitutions are shown in table 3. In some embodiments, the variant Cas12i4 polypeptide comprises two or more substitutions listed in table 3 and further comprises a substitution listed in table 2.
Table 3. Multiple amino acid substitutions in variant Cas12i4 polypeptides.
In some embodiments, the variant Cas12i4 polypeptide comprises one or more additional substitutions over the sequence of SEQ ID NO:3 (e.g., the variant Cas12i4 polypeptide comprises a V592R substitution and an E1042R substitution and further comprises one or more of the substitutions shown in table 2 or table 3). In some embodiments, the variant Cas12i4 polypeptide comprises one or more additional substitutions over the sequence of SEQ ID NO:4 (e.g., the variant Cas12i4 polypeptide comprises an E480R substitution, a G564R substitution, a V592R substitution, an E1042R substitution, and further comprises one or more of the substitutions shown in table 2 or table 3). In some embodiments, the variant Cas12i4 polypeptide comprises one or more additional substitutions over any one of the sequences of SEQ ID NOs:5-59 (e.g., the variant Cas12i4 polypeptide further comprises one or more substitutions shown in table 2 or table 3). As described above, in some embodiments, the variant Cas12i4 polypeptide retains amino acid changes (or at least 1, 2, 3, 4, 5, etc. of these changes) that distinguish the polypeptide from its corresponding parent/reference sequence.
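The substitution labels used above (e.g., V592R, E1042R) follow the usual pattern of parent residue, 1-based position, and replacement residue. As a minimal sketch of how such labels could be parsed and applied to a parent sequence, the following Python may be useful. The function names and the four-residue toy sequence in the usage note are hypothetical and are not SEQ ID NO: 2.

```python
import re
from typing import NamedTuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
_LABEL = re.compile(rf"^([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])$")


class Substitution(NamedTuple):
    wild_type: str  # residue expected in the parent sequence
    position: int   # 1-based residue number
    mutant: str     # replacement residue


def parse_substitution(label: str) -> Substitution:
    """Split a label such as 'V592R' into its three parts."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, mutant = match.groups()
    return Substitution(wild_type, int(position), mutant)


def apply_substitutions(parent: str, labels: list[str]) -> str:
    """Return a copy of `parent` carrying every substitution in `labels`."""
    residues = list(parent)
    for label in labels:
        sub = parse_substitution(label)
        index = sub.position - 1
        # Reject labels whose position or parent residue does not match.
        if not 0 <= index < len(residues) or residues[index] != sub.wild_type:
            raise ValueError(f"{label} does not match the parent sequence")
        residues[index] = sub.mutant
    return "".join(residues)
```

For example, `apply_substitutions("MVEG", ["V2R", "E3R"])` applies two substitutions to the toy parent. The residue check guards against applying a label to the wrong reference sequence.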
In some embodiments, the variant Cas12i4 polypeptide comprises at least one RuvC motif or RuvC domain.
The domains of the Cas12i4 polypeptides disclosed herein are depicted in fig. 5. The Wedge domain comprises residues 1-14 and 447-593 of the Cas12i4 polypeptide. The Rec1 domain comprises residues 15-171 and 266-446 of the Cas12i4 polypeptide. The PI domain comprises residues 172-265 of the Cas12i4 polypeptide. The Rec2 domain comprises residues 647-839 of the Cas12i4 polypeptide. The Nuc domain comprises residues 891-1018 of the Cas12i4 polypeptide. The RuvC domain comprises residues 594-646 (RuvC1 motif), residues 840-890 (RuvC2 motif), and residues 1019-1074 (RuvC3 motif) of the Cas12i4 polypeptide.
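The domain boundaries listed above can be captured in a small lookup table, which makes it easy to see where a given substitution falls. The sketch below is illustrative only. It assumes that the Nuc domain ends at residue 1018 (immediately before the RuvC3 motif at 1019) and labels the first domain (residues 1-14 and 447-593) as Wedge, the usual name for that Cas12 domain. The names `DOMAINS` and `domain_of` are hypothetical.

```python
# Residue ranges (1-based, inclusive) taken from the domain description.
# Assumption: Nuc spans 891-1018, ending just before RuvC3 starts at 1019.
DOMAINS = {
    "Wedge": [(1, 14), (447, 593)],
    "Rec1": [(15, 171), (266, 446)],
    "PI": [(172, 265)],
    "RuvC1": [(594, 646)],
    "Rec2": [(647, 839)],
    "RuvC2": [(840, 890)],
    "Nuc": [(891, 1018)],
    "RuvC3": [(1019, 1074)],
}


def domain_of(position: int) -> str:
    """Return the name of the domain or RuvC motif containing `position`."""
    for name, ranges in DOMAINS.items():
        for start, end in ranges:
            if start <= position <= end:
                return name
    raise ValueError(f"residue {position} is outside 1-1074")
```

Under these boundaries, `domain_of(592)` places the V592R substitution in the Wedge domain, and `domain_of(1042)` places E1042R in the RuvC3 motif.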
Although the changes described herein may be changes in one or more amino acids, the changes in the variant Cas12i4 polypeptide may also be more substantial, e.g., a fusion of polypeptides as an amino- and/or carboxy-terminal extension. For example, a variant Cas12i4 polypeptide may contain one or more additional peptides. Examples of additional peptides include epitope peptides for tagging, such as a polyhistidine tag (His-tag), Myc, and FLAG. In some embodiments, a variant Cas12i4 polypeptide described herein can be fused to a detectable moiety, such as a fluorescent protein (e.g., green fluorescent protein (GFP) or yellow fluorescent protein (YFP)).
In some embodiments, the variant Cas12i4 polypeptide comprises at least one (e.g., two, three, four, five, six, or more) Nuclear Localization Signal (NLS). In some embodiments, the variant Cas12i4 polypeptide comprises at least one (e.g., two, three, four, five, six, or more) Nuclear Export Signal (NES).
In some embodiments, the variant Cas12i4 polypeptide comprises at least one (e.g., two, three, four, five, six, or more) NLS and at least one (e.g., two, three, four, five, six, or more) NES.
In some embodiments, a variant Cas12i4 polypeptide described herein can be self-inactivating. See Epstein et al., "Engineering a Self-Inactivating CRISPR System for AAV Vectors," Mol. Ther., 24 (2016): S50, which is incorporated by reference in its entirety.
In some embodiments, the nucleotide sequence encoding a variant Cas12i4 polypeptide described herein may be codon optimized for a particular host cell or organism. For example, the nucleic acid can be codon optimized for use in any non-human eukaryotic organism, including mice, rats, rabbits, dogs, livestock, or non-human primates. Codon usage tables are readily available, for example in the "Codon Usage Database" available at www.kazusa.or.jp/codon, and these tables can be adapted in a variety of ways. See Nakamura et al., Nucleic Acids Res. 28:292 (2000), which is incorporated herein by reference in its entirety. Computer algorithms for codon optimization of specific sequences for expression in specific host cells are also available, such as Gene Forge (Aptagen, Inc.; Jacobus, PA).
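One simple way to use a codon usage table is to back-translate a protein by picking the most frequent codon for each amino acid. The sketch below illustrates only that idea and is not the algorithm of any particular tool. The usage fractions are invented placeholders for a hypothetical host and are not taken from any database, and the names `PLACEHOLDER_USAGE` and `back_translate` are hypothetical.

```python
# Placeholder codon preferences for a hypothetical host. The fractions are
# invented for illustration and are NOT taken from any codon usage database.
PLACEHOLDER_USAGE = {
    "M": {"ATG": 1.0},
    "V": {"GTG": 0.6, "GTT": 0.4},
    "R": {"AGA": 0.6, "CGT": 0.4},
    "E": {"GAG": 0.6, "GAA": 0.4},
}


def most_frequent_codon(amino_acid: str, usage: dict) -> str:
    """Return the codon with the highest usage fraction for one amino acid."""
    if amino_acid not in usage:
        raise ValueError(f"no codon data for residue {amino_acid!r}")
    codons = usage[amino_acid]
    return max(codons, key=codons.get)


def back_translate(protein: str, usage: dict = PLACEHOLDER_USAGE) -> str:
    """Encode `protein` using the preferred codon for every residue."""
    return "".join(most_frequent_codon(aa, usage) for aa in protein)
```

A real table would cover all twenty amino acids and the stop codons. Practical optimizers typically also balance other factors, such as GC content and repeated sequence, rather than always choosing the single most frequent codon.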
Functionality of variant polypeptides
As used herein, a "biologically active portion" is a portion that retains at least one function (e.g., fully, partially, or minimally) of a parent polypeptide (e.g., a "minimal" or "core" domain). In some embodiments, the variant Cas12i4 polypeptide retains enzymatic activity that is at least as great as that of the parent polypeptide. In some embodiments, the variant Cas12i4 polypeptide has greater enzymatic activity than the parent polypeptide.
Also provided are variant Cas12i4 polypeptides of the invention that have enzymatic activity (e.g., nuclease or endonuclease activity) and comprise an amino acid sequence that differs from the amino acid sequence of the parent polypeptide of SEQ ID NO: 2 by 50, 40, 35, 30, 25, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 amino acid residue(s) when aligned using any of the foregoing alignment methods.
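Once two sequences have been aligned, the number of differing residues can be counted column by column. The following is a minimal sketch that assumes both inputs are already aligned to equal length, with "-" marking gaps, and that counts each gap column as one difference. That counting convention and the function name are assumptions made for illustration.

```python
def count_differences(aligned_a: str, aligned_b: str) -> int:
    """Count alignment columns where the two sequences differ.

    Both inputs must already be aligned to the same length. A gap ('-')
    opposite a residue is counted as one difference.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(aligned_a, aligned_b) if a != b)
```

With toy aligned strings, `count_differences("MVEG-K", "MREGAK")` reports one substitution column and one gap column.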
In some embodiments, a variant Cas12i4 polypeptide comprising a V592R substitution exhibits enhanced enzymatic activity. In some embodiments, the V592R substitution interacts with the non-target strand (NTS). In some embodiments, the V592R substitution contacts the NTS near the PAM sequence. See fig. 6A.
In some embodiments, a variant Cas12i4 polypeptide comprising an E480R substitution exhibits enhanced enzymatic activity. In some embodiments, the E480R substitution interacts with double stranded DNA. In some embodiments, the E480R substitution interacts with double stranded DNA upstream of the PAM sequence. In some embodiments, the E480R substitution stabilizes the interaction of the variant Cas12i4 polypeptide with the target nucleic acid. See fig. 6B.
In some embodiments, a variant Cas12i4 polypeptide comprising a G564R substitution exhibits enhanced enzymatic activity. In some embodiments, the G564R substitution interacts with double stranded DNA. In some embodiments, the G564R substitution interacts with double stranded DNA upstream of the PAM sequence. In some embodiments, the G564R substitution stabilizes the interaction of the variant Cas12i4 polypeptide with the target nucleic acid. See fig. 6B.
In some embodiments, a variant Cas12i4 polypeptide comprising an E1042R substitution exhibits enhanced enzymatic activity.
In some embodiments, the variant Cas12i4 polypeptide has reduced nuclease activity or is a nuclease-dead polypeptide. As used herein, the catalytic residues of the polypeptides disclosed herein are D608, E844, and D1022. In some embodiments, a variant Cas12i4 polypeptide comprising a substitution at one or more of D608, E844, and D1022 (e.g., D608A, E844A, and D1022A) exhibits reduced nuclease activity or no nuclease activity relative to the parent polypeptide.
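A list of substitution labels can be screened for changes at the catalytic residues named above. The sketch below only flags which catalytic positions are altered and does not predict or measure nuclease activity. The function name is hypothetical, and labels are assumed to have the simple form of parent residue, position, and replacement residue.

```python
# Catalytic residues named in the text: D608, E844, and D1022.
CATALYTIC_RESIDUES = {608: "D", 844: "E", 1022: "D"}


def altered_catalytic_positions(labels: list[str]) -> list[int]:
    """Return the catalytic positions changed by the given substitutions."""
    hits = []
    for label in labels:
        wild_type, position, mutant = label[0], int(label[1:-1]), label[-1]
        # Count a hit only if the label names the catalytic residue itself
        # and actually replaces it with a different amino acid.
        if CATALYTIC_RESIDUES.get(position) == wild_type and mutant != wild_type:
            hits.append(position)
    return sorted(hits)
```

For instance, `altered_catalytic_positions(["D608A", "V592R", "D1022A"])` flags positions 608 and 1022, whereas a variant carrying only V592R and E1042R returns an empty list.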
In some embodiments, a variant Cas12i4 polypeptide of the invention has an enzymatic activity that is equivalent to or greater than that of the parent polypeptide. In some embodiments, the variant Cas12i4 polypeptides of the invention have enzymatic activity at a temperature ranging from about 20 ℃ to about 90 ℃. In some embodiments, the variant Cas12i4 polypeptides of the invention have enzymatic activity at a temperature of about 20 ℃ to about 25 ℃ or at a temperature of about 37 ℃.
In some embodiments, the variant Cas12i4 polypeptide comprises at least one alteration that enhances affinity for RNA (e.g., RNA affinity) as compared to the parent polypeptide. In some embodiments, the variant Cas12i4 polypeptide exhibits enhanced RNA affinity, as compared to the parent polypeptide, at a temperature of about any one of 20 ℃, 21 ℃, 22 ℃, 23 ℃, 24 ℃, 25 ℃, 26 ℃, 27 ℃, 28 ℃, 29 ℃, 30 ℃, 31 ℃, 32 ℃, 33 ℃, 34 ℃, 35 ℃, 36 ℃, 37 ℃, 38 ℃, 39 ℃, 40 ℃, 41 ℃, 42 ℃, 43 ℃, 44 ℃, 45 ℃, 50 ℃, 51 ℃, 52 ℃, 53 ℃, 54 ℃, 55 ℃, 56 ℃, 57 ℃, 58 ℃, 59 ℃, 60 ℃, or 65 ℃. In some embodiments, the variant Cas12i4 polypeptide exhibits enhanced RNA affinity, as compared to the parent polypeptide, in a buffer having a pH in the range of about 7.3 to about 8.6. In some embodiments, the variant Cas12i4 polypeptide exhibits enhanced RNA affinity, as compared to the parent polypeptide, when the Tm value of the variant Cas12i4 polypeptide is at least 1 ℃, 2 ℃, 3 ℃, 4 ℃, 5 ℃, 6 ℃, 7 ℃, 8 ℃, 9 ℃, 10 ℃, 11 ℃, 12 ℃, 13 ℃, 14 ℃, 15 ℃, 16 ℃, 17 ℃, 18 ℃, 19 ℃, or 20 ℃ greater than the Tm value of the parent polypeptide. In one embodiment, the variant Cas12i4 polypeptide exhibits enhanced RNA affinity when the Tm value of the variant Cas12i4 polypeptide is at least 8 ℃ greater than the Tm value of the parent polypeptide.
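The melting-temperature criterion described here reduces to comparing the Tm shift between variant and parent against a chosen threshold. As a minimal sketch of that comparison, the function names are hypothetical, the default threshold of 8 ℃ mirrors the single-value embodiment, and the Tm values in the usage note are invented for illustration rather than measured data.

```python
def tm_shift(variant_tm_c: float, parent_tm_c: float) -> float:
    """Return the Tm difference (variant minus parent) in degrees Celsius."""
    return variant_tm_c - parent_tm_c


def meets_tm_threshold(
    variant_tm_c: float, parent_tm_c: float, min_shift_c: float = 8.0
) -> bool:
    """True if the variant Tm exceeds the parent Tm by at least `min_shift_c`."""
    return tm_shift(variant_tm_c, parent_tm_c) >= min_shift_c
```

With hypothetical values, `meets_tm_threshold(58.0, 49.0)` passes the 8 ℃ criterion, while `meets_tm_threshold(52.0, 49.0)` passes only if `min_shift_c` is lowered to 3.0 or less.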
In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) reduced enzymatic activity and (b) enhanced RNA affinity relative to the parent polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) increased enzymatic activity and (b) increased RNA affinity relative to the parent polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) retained enzymatic activity and (b) enhanced RNA affinity relative to the parent polypeptide of SEQ ID No. 2.
In some embodiments, the variant Cas12i4 polypeptide comprises at least one alteration that enhances complex formation (e.g., binary complex formation) with the RNA guide as compared to the parent polypeptide. In some embodiments, the variant Cas12i4 polypeptide exhibits enhanced binary complex formation, as compared to the parent polypeptide, at a temperature of about any one of 20 ℃, 21 ℃, 22 ℃, 23 ℃, 24 ℃, 25 ℃, 26 ℃, 27 ℃, 28 ℃, 29 ℃, 30 ℃, 31 ℃, 32 ℃, 33 ℃, 34 ℃, 35 ℃, 36 ℃, 37 ℃, 38 ℃, 39 ℃, 40 ℃, 41 ℃, 42 ℃, 43 ℃, 44 ℃, 45 ℃, 50 ℃, 51 ℃, 52 ℃, 53 ℃, 54 ℃, 55 ℃, 56 ℃, 57 ℃, 58 ℃, 59 ℃, 60 ℃, or 65 ℃. In some embodiments, the variant Cas12i4 polypeptide exhibits enhanced binary complex formation, as compared to the parent polypeptide, in a buffer having a pH in the range of about 7.3 to about 8.6. In some embodiments, the variant Cas12i4 polypeptide exhibits enhanced binary complex formation, as compared to the parent polypeptide, when the Tm value of the variant Cas12i4 polypeptide is at least 1 ℃, 2 ℃, 3 ℃, 4 ℃, 5 ℃, 6 ℃, 7 ℃, 8 ℃, 9 ℃, 10 ℃, 11 ℃, 12 ℃, 13 ℃, 14 ℃, 15 ℃, 16 ℃, 17 ℃, 18 ℃, 19 ℃, or 20 ℃ greater than the Tm value of the parent polypeptide. In one embodiment, the variant Cas12i4 polypeptide exhibits enhanced binary complex formation when the Tm value of the variant Cas12i4 polypeptide is at least 8 ℃ greater than the Tm value of the parent polypeptide.
In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) reduced enzymatic activity and (b) enhanced binary complex formation relative to the parent polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) increased enzymatic activity and (b) enhanced binary complex formation relative to the parent polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) retained enzymatic activity and (b) enhanced binary complex formation relative to the parent polypeptide of SEQ ID No. 2.
In some embodiments, the variant Cas12i4 polypeptide comprises at least one alteration that enhances binding activity to the RNA guide as compared to the parent polypeptide. In some embodiments, the variant Cas12i4 polypeptide exhibits enhanced RNA guide binding activity, as compared to the parent polypeptide, at a temperature lower than about any one of 20 ℃, 21 ℃, 22 ℃, 23 ℃, 24 ℃, 25 ℃, 26 ℃, 27 ℃, 28 ℃, 29 ℃, 30 ℃, 31 ℃, 32 ℃, 33 ℃, 34 ℃, 35 ℃, 36 ℃, 37 ℃, 38 ℃, 39 ℃, 40 ℃, 41 ℃, 42 ℃, 43 ℃, 44 ℃, 45 ℃, 50 ℃, 51 ℃, 52 ℃, 53 ℃, 54 ℃, 55 ℃, 56 ℃, 57 ℃, 58 ℃, 59 ℃, 60 ℃, or 65 ℃. In some embodiments, the variant Cas12i4 polypeptide exhibits enhanced RNA guide binding activity, as compared to the parent polypeptide, in a buffer having a pH in the range of about 7.3 to about 8.6. In some embodiments, the variant Cas12i4 polypeptide exhibits enhanced RNA guide binding activity, as compared to the parent polypeptide, when the Tm value of the variant Cas12i4 polypeptide is at least 1 ℃, 2 ℃, 3 ℃, 4 ℃, 5 ℃, 6 ℃, 7 ℃, 8 ℃, 9 ℃, 10 ℃, 11 ℃, 12 ℃, 13 ℃, 14 ℃, 15 ℃, 16 ℃, 17 ℃, 18 ℃, 19 ℃, or 20 ℃ greater than the Tm value of the parent polypeptide. In one embodiment, the variant Cas12i4 polypeptide exhibits enhanced RNA guide binding activity when the Tm value of the variant Cas12i4 polypeptide is at least 8 ℃ greater than the Tm value of the parent polypeptide.
In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) reduced enzymatic activity and (b) enhanced RNA guide binding activity relative to the parent polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) increased enzymatic activity and (b) enhanced RNA guide binding activity relative to the parent polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) retained enzymatic activity and (b) enhanced RNA guide binding activity relative to the parent polypeptide of SEQ ID No. 2.
In some embodiments, the variant Cas12i4 polypeptide comprises at least one alteration that enhances binding specificity to the RNA guide as compared to the parent polypeptide. In some embodiments, the variant Cas12i4 polypeptide exhibits enhanced RNA guide binding specificity, as compared to the parent polypeptide, at a temperature lower than about any one of 20 ℃, 21 ℃, 22 ℃, 23 ℃, 24 ℃, 25 ℃, 26 ℃, 27 ℃, 28 ℃, 29 ℃, 30 ℃, 31 ℃, 32 ℃, 33 ℃, 34 ℃, 35 ℃, 36 ℃, 37 ℃, 38 ℃, 39 ℃, 40 ℃, 41 ℃, 42 ℃, 43 ℃, 44 ℃, 45 ℃, 50 ℃, 51 ℃, 52 ℃, 53 ℃, 54 ℃, 55 ℃, 56 ℃, 57 ℃, 58 ℃, 59 ℃, 60 ℃, or 65 ℃. In some embodiments, the variant Cas12i4 polypeptide exhibits enhanced RNA guide binding specificity, as compared to the parent polypeptide, in a buffer having a pH in the range of about 7.3 to about 8.6. In some embodiments, the variant Cas12i4 polypeptide exhibits enhanced RNA guide binding specificity, as compared to the parent polypeptide, when the Tm value of the variant Cas12i4 polypeptide is at least 1 ℃, 2 ℃, 3 ℃, 4 ℃, 5 ℃, 6 ℃, 7 ℃, 8 ℃, 9 ℃, 10 ℃, 11 ℃, 12 ℃, 13 ℃, 14 ℃, 15 ℃, 16 ℃, 17 ℃, 18 ℃, 19 ℃, or 20 ℃ greater than the Tm value of the parent polypeptide. In one embodiment, the variant Cas12i4 polypeptide exhibits enhanced RNA guide binding specificity when the Tm value of the variant Cas12i4 polypeptide is at least 8 ℃ greater than the Tm value of the parent polypeptide.
In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) reduced enzymatic activity and (b) enhanced RNA guide binding specificity relative to the parent polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) increased enzymatic activity and (b) enhanced RNA guide binding specificity relative to the parent polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) retained enzymatic activity and (b) enhanced RNA guide binding specificity relative to the parent polypeptide of SEQ ID No. 2.
In some embodiments, the variant Cas12i4 polypeptide comprises at least one alteration that enhances protein-RNA interactions as compared to the parent polypeptide. In some embodiments, the variant Cas12i4 polypeptide exhibits enhanced protein-RNA interactions, as compared to the parent polypeptide, at a temperature of about any one of 20 ℃, 21 ℃, 22 ℃, 23 ℃, 24 ℃, 25 ℃, 26 ℃, 27 ℃, 28 ℃, 29 ℃, 30 ℃, 31 ℃, 32 ℃, 33 ℃, 34 ℃, 35 ℃, 36 ℃, 37 ℃, 38 ℃, 39 ℃, 40 ℃, 41 ℃, 42 ℃, 43 ℃, 44 ℃, 45 ℃, 50 ℃, 51 ℃, 52 ℃, 53 ℃, 54 ℃, 55 ℃, 56 ℃, 57 ℃, 58 ℃, 59 ℃, 60 ℃, or 65 ℃. In some embodiments, the variant Cas12i4 polypeptide exhibits enhanced protein-RNA interactions, as compared to the parent polypeptide, in a buffer having a pH in the range of about 7.3 to about 8.6. In some embodiments, the variant Cas12i4 polypeptide exhibits enhanced protein-RNA interactions, as compared to the parent polypeptide, when the Tm value of the variant Cas12i4 polypeptide is at least 1 ℃, 2 ℃, 3 ℃, 4 ℃, 5 ℃, 6 ℃, 7 ℃, 8 ℃, 9 ℃, 10 ℃, 11 ℃, 12 ℃, 13 ℃, 14 ℃, 15 ℃, 16 ℃, 17 ℃, 18 ℃, 19 ℃, or 20 ℃ greater than the Tm value of the parent polypeptide. In one embodiment, the variant Cas12i4 polypeptide exhibits enhanced protein-RNA interactions when the Tm value of the variant Cas12i4 polypeptide is at least 8 ℃ greater than the Tm value of the parent polypeptide.
In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) reduced enzymatic activity and (b) enhanced protein-RNA interaction relative to the parent polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) increased enzymatic activity and (b) enhanced protein-RNA interaction relative to the parent polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) retained enzymatic activity and (b) enhanced protein-RNA interaction relative to the parent polypeptide of SEQ ID No. 2.
In some embodiments, the variant Cas12i4 polypeptide comprises at least one alteration that enhances protein stability as compared to the parent polypeptide. In some embodiments, the variant Cas12i4 polypeptide exhibits enhanced protein stability, as compared to the parent polypeptide, at a temperature of about any one of 20 ℃, 21 ℃, 22 ℃, 23 ℃, 24 ℃, 25 ℃, 26 ℃, 27 ℃, 28 ℃, 29 ℃, 30 ℃, 31 ℃, 32 ℃, 33 ℃, 34 ℃, 35 ℃, 36 ℃, 37 ℃, 38 ℃, 39 ℃, 40 ℃, 41 ℃, 42 ℃, 43 ℃, 44 ℃, 45 ℃, 50 ℃, 51 ℃, 52 ℃, 53 ℃, 54 ℃, 55 ℃, 56 ℃, 57 ℃, 58 ℃, 59 ℃, 60 ℃, or 65 ℃. In some embodiments, the variant Cas12i4 polypeptide exhibits enhanced protein stability, as compared to the parent polypeptide, in a buffer having a pH in the range of about 7.3 to about 8.6. In some embodiments, the variant Cas12i4 polypeptide exhibits enhanced protein stability, as compared to the parent polypeptide, when the Tm value of the variant Cas12i4 polypeptide is at least 1 ℃, 2 ℃, 3 ℃, 4 ℃, 5 ℃, 6 ℃, 7 ℃, 8 ℃, 9 ℃, 10 ℃, 11 ℃, 12 ℃, 13 ℃, 14 ℃, 15 ℃, 16 ℃, 17 ℃, 18 ℃, 19 ℃, or 20 ℃ greater than the Tm value of the parent polypeptide. In one embodiment, the variant Cas12i4 polypeptide exhibits enhanced protein stability when the Tm value of the variant Cas12i4 polypeptide is at least 8 ℃ greater than the Tm value of the parent polypeptide.
In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) reduced enzymatic activity and (b) enhanced protein stability relative to the parent polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) increased enzymatic activity and (b) enhanced protein stability relative to the parent polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) retained enzymatic activity and (b) enhanced protein stability relative to the parent polypeptide of SEQ ID No. 2.
In some embodiments, the variant Cas12i4 polypeptide comprises at least one alteration that reduces dissociation from the RNA guide (e.g., binary complex dissociation) as compared to the parent polypeptide. In some embodiments, the variant Cas12i4 polypeptide exhibits reduced dissociation from the RNA guide, as compared to the parent polypeptide, at a temperature lower than about any one of 20 ℃, 21 ℃, 22 ℃, 23 ℃, 24 ℃, 25 ℃, 26 ℃, 27 ℃, 28 ℃, 29 ℃, 30 ℃, 31 ℃, 32 ℃, 33 ℃, 34 ℃, 35 ℃, 36 ℃, 37 ℃, 38 ℃, 39 ℃, 40 ℃, 41 ℃, 42 ℃, 43 ℃, 44 ℃, 45 ℃, 50 ℃, 51 ℃, 52 ℃, 53 ℃, 54 ℃, 55 ℃, 56 ℃, 57 ℃, 58 ℃, 59 ℃, 60 ℃, or 65 ℃. In some embodiments, the variant Cas12i4 polypeptide exhibits reduced dissociation from the RNA guide, as compared to the parent polypeptide, in a buffer having a pH in the range of about 7.3 to about 8.6. In some embodiments, the variant Cas12i4 polypeptide exhibits reduced dissociation from the RNA guide, as compared to the parent polypeptide, when the Tm value of the variant Cas12i4 polypeptide is at least 1 ℃, 2 ℃, 3 ℃, 4 ℃, 5 ℃, 6 ℃, 7 ℃, 8 ℃, 9 ℃, 10 ℃, 11 ℃, 12 ℃, 13 ℃, 14 ℃, 15 ℃, 16 ℃, 17 ℃, 18 ℃, 19 ℃, or 20 ℃ greater than the Tm value of the parent polypeptide. In one embodiment, the variant Cas12i4 polypeptide exhibits reduced dissociation from the RNA guide when the Tm value of the variant Cas12i4 polypeptide is at least 8 ℃ greater than the Tm value of the parent polypeptide. In some embodiments, the variant Cas12i4 polypeptide exhibits reduced dissociation from the RNA guide, as compared to the parent polypeptide, over an incubation period of at least any one of 10 minutes, 15 minutes, 20 minutes, 25 minutes, 30 minutes, 35 minutes, 40 minutes, 45 minutes, 50 minutes, 55 minutes, 1 hour, 2 hours, 3 hours, 4 hours, or more. In some embodiments, the variant ribonucleoprotein (RNP) complex does not exchange the RNA guide with a different RNA.
In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) reduced enzymatic activity and (b) reduced dissociation from the RNA guide relative to the parent polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) increased enzymatic activity and (b) decreased dissociation from the RNA guide relative to the parent polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) retained enzymatic activity and (b) reduced dissociation from the RNA guide relative to the parent polypeptide of SEQ ID No. 2.
In some embodiments, the variant Cas12i4 polypeptide comprises at least one alteration that enhances ternary complex formation with the RNA guide and the target nucleic acid as compared to the parent polypeptide. In some embodiments, the variant Cas12i4 polypeptide exhibits enhanced ternary complex formation, as compared to the parent polypeptide, at a temperature of about any one of 20 ℃, 21 ℃, 22 ℃, 23 ℃, 24 ℃, 25 ℃, 26 ℃, 27 ℃, 28 ℃, 29 ℃, 30 ℃, 31 ℃, 32 ℃, 33 ℃, 34 ℃, 35 ℃, 36 ℃, 37 ℃, 38 ℃, 39 ℃, 40 ℃, 41 ℃, 42 ℃, 43 ℃, 44 ℃, 45 ℃, 50 ℃, 51 ℃, 52 ℃, 53 ℃, 54 ℃, 55 ℃, 56 ℃, 57 ℃, 58 ℃, 59 ℃, 60 ℃, or 65 ℃. In some embodiments, the variant Cas12i4 polypeptide exhibits enhanced ternary complex formation, as compared to the parent polypeptide, in a buffer having a pH in the range of about 7.3 to about 8.6. In some embodiments, the variant Cas12i4 polypeptide exhibits enhanced ternary complex formation, as compared to the parent polypeptide, when the Tm value of the variant Cas12i4 polypeptide is at least 1 ℃, 2 ℃, 3 ℃, 4 ℃, 5 ℃, 6 ℃, 7 ℃, 8 ℃, 9 ℃, 10 ℃, 11 ℃, 12 ℃, 13 ℃, 14 ℃, 15 ℃, 16 ℃, 17 ℃, 18 ℃, 19 ℃, or 20 ℃ greater than the Tm value of the parent polypeptide. In one embodiment, the variant Cas12i4 polypeptide exhibits enhanced ternary complex formation when the Tm value of the variant Cas12i4 polypeptide is at least 8 ℃ greater than the Tm value of the parent polypeptide.
In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) reduced enzymatic activity and (b) enhanced ternary complex formation relative to the parent polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) increased enzymatic activity and (b) enhanced ternary complex formation relative to the parent polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide exhibiting (a) retained enzymatic activity and (b) enhanced ternary complex formation relative to the parent polypeptide of SEQ ID No. 2.
In some embodiments, the variant Cas12i4 polypeptide comprises at least one alteration such that a binary complex comprising the variant Cas12i4 polypeptide (e.g., a variant binary complex) exhibits enhanced binding affinity to a target nucleic acid as compared to the parent binary complex. In some embodiments, the variant binary complex exhibits enhanced binding affinity to the target nucleic acid, as compared to the parent binary complex, at a temperature lower than about any one of 20 ℃, 21 ℃, 22 ℃, 23 ℃, 24 ℃, 25 ℃, 26 ℃, 27 ℃, 28 ℃, 29 ℃, 30 ℃, 31 ℃, 32 ℃, 33 ℃, 34 ℃, 35 ℃, 36 ℃, 37 ℃, 38 ℃, 39 ℃, 40 ℃, 41 ℃, 42 ℃, 43 ℃, 44 ℃, 45 ℃, 50 ℃, 51 ℃, 52 ℃, 53 ℃, 54 ℃, 55 ℃, 56 ℃, 57 ℃, 58 ℃, 59 ℃, 60 ℃, or 65 ℃. In some embodiments, the variant binary complex exhibits enhanced binding affinity to the target nucleic acid, as compared to the parent binary complex, in a buffer having a pH in the range of about 7.3 to about 8.6. In some embodiments, the variant binary complex exhibits enhanced binding affinity to the target nucleic acid, as compared to the parent binary complex, when the Tm value of the variant binary complex is at least 1 ℃, 2 ℃, 3 ℃, 4 ℃, 5 ℃, 6 ℃, 7 ℃, 8 ℃, 9 ℃, 10 ℃, 11 ℃, 12 ℃, 13 ℃, 14 ℃, 15 ℃, 16 ℃, 17 ℃, 18 ℃, 19 ℃, or 20 ℃ greater than the Tm value of the parent binary complex. In one embodiment, the variant binary complex exhibits enhanced binding affinity to the target nucleic acid when the Tm value of the variant binary complex is at least 8 ℃ greater than the Tm value of the parent binary complex.
In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide that forms a variant binary complex that exhibits (a) reduced enzymatic activity and (b) enhanced binding affinity to a target nucleic acid relative to the parent binary complex comprising the polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide that forms a variant binary complex that exhibits (a) increased enzymatic activity and (b) increased binding affinity to a target nucleic acid relative to the parent binary complex comprising the polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide that forms a variant binary complex that exhibits (a) retained enzymatic activity and (b) enhanced binding affinity to a target nucleic acid relative to the parent binary complex comprising the polypeptide of SEQ ID No. 2.
In some embodiments, the variant Cas12i4 polypeptide comprises at least one alteration such that a binary complex comprising the variant Cas12i4 polypeptide (e.g., a variant binary complex) exhibits enhanced on-target binding activity as compared to the parent binary complex. In some embodiments, the variant binary complex exhibits enhanced on-target binding activity, as compared to the parent binary complex, at a temperature lower than about any one of 20 ℃, 21 ℃, 22 ℃, 23 ℃, 24 ℃, 25 ℃, 26 ℃, 27 ℃, 28 ℃, 29 ℃, 30 ℃, 31 ℃, 32 ℃, 33 ℃, 34 ℃, 35 ℃, 36 ℃, 37 ℃, 38 ℃, 39 ℃, 40 ℃, 41 ℃, 42 ℃, 43 ℃, 44 ℃, 45 ℃, 50 ℃, 51 ℃, 52 ℃, 53 ℃, 54 ℃, 55 ℃, 56 ℃, 57 ℃, 58 ℃, 59 ℃, 60 ℃, or 65 ℃. In some embodiments, the variant binary complex exhibits enhanced on-target binding activity, as compared to the parent binary complex, in a buffer having a pH in the range of about 7.3 to about 8.6. In some embodiments, the variant binary complex exhibits enhanced on-target binding activity, as compared to the parent binary complex, when the Tm value of the variant binary complex is at least 1 ℃, 2 ℃, 3 ℃, 4 ℃, 5 ℃, 6 ℃, 7 ℃, 8 ℃, 9 ℃, 10 ℃, 11 ℃, 12 ℃, 13 ℃, 14 ℃, 15 ℃, 16 ℃, 17 ℃, 18 ℃, 19 ℃, or 20 ℃ greater than the Tm value of the parent binary complex. In one embodiment, the variant binary complex exhibits enhanced on-target binding activity when the Tm value of the variant binary complex is at least 8 ℃ greater than the Tm value of the parent binary complex.
In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide that forms a variant binary complex that exhibits (a) reduced enzymatic activity and (b) enhanced on-target binding activity relative to the parent binary complex comprising the polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide that forms a variant binary complex that exhibits (a) increased enzymatic activity and (b) enhanced on-target binding activity relative to the parent binary complex comprising the polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide that forms a variant binary complex that exhibits (a) retained enzymatic activity and (b) enhanced on-target binding activity relative to the parent binary complex comprising the polypeptide of SEQ ID No. 2.
In some embodiments, the variant Cas12i4 polypeptide comprises at least one alteration such that a binary complex comprising the variant Cas12i4 polypeptide (e.g., a variant binary complex) exhibits enhanced on-target binding specificity as compared to the parent binary complex. In some embodiments, the variant binary complex exhibits enhanced on-target binding specificity, as compared to the parent binary complex, at a temperature lower than about any one of 20 ℃, 21 ℃, 22 ℃, 23 ℃, 24 ℃, 25 ℃, 26 ℃, 27 ℃, 28 ℃, 29 ℃, 30 ℃, 31 ℃, 32 ℃, 33 ℃, 34 ℃, 35 ℃, 36 ℃, 37 ℃, 38 ℃, 39 ℃, 40 ℃, 41 ℃, 42 ℃, 43 ℃, 44 ℃, 45 ℃, 50 ℃, 51 ℃, 52 ℃, 53 ℃, 54 ℃, 55 ℃, 56 ℃, 57 ℃, 58 ℃, 59 ℃, 60 ℃, or 65 ℃. In some embodiments, the variant binary complex exhibits enhanced on-target binding specificity, as compared to the parent binary complex, in a buffer having a pH in the range of about 7.3 to about 8.6. In some embodiments, the variant binary complex exhibits enhanced on-target binding specificity, as compared to the parent binary complex, when the Tm value of the variant binary complex is at least 1 ℃, 2 ℃, 3 ℃, 4 ℃, 5 ℃, 6 ℃, 7 ℃, 8 ℃, 9 ℃, 10 ℃, 11 ℃, 12 ℃, 13 ℃, 14 ℃, 15 ℃, 16 ℃, 17 ℃, 18 ℃, 19 ℃, or 20 ℃ greater than the Tm value of the parent binary complex. In one embodiment, the variant binary complex exhibits enhanced on-target binding specificity when the Tm value of the variant binary complex is at least 8 ℃ greater than the Tm value of the parent binary complex.
In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide that forms a variant binary complex that exhibits (a) reduced enzymatic activity and (b) enhanced on-target binding specificity relative to the parent binary complex comprising the polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide that forms a variant binary complex that exhibits (a) increased enzymatic activity and (b) increased on-target binding specificity relative to the parent binary complex comprising the polypeptide of SEQ ID No. 2.
In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide that forms a variant binary complex that exhibits (a) retained enzymatic activity and (b) enhanced on-target binding specificity relative to the parent binary complex comprising the polypeptide of SEQ ID No. 2.
In some embodiments, the variant Cas12i4 polypeptide comprises at least one alteration such that a binary complex comprising the variant Cas12i4 polypeptide (e.g., a variant binary complex) exhibits reduced off-target binding to a non-target nucleic acid as compared to the parent binary complex. In some embodiments, the variant binary complex exhibits reduced off-target binding to non-target nucleic acids compared to the parent binary complex at a temperature lower than about any one of 20 ℃, 21 ℃, 22 ℃, 23 ℃, 24 ℃, 25 ℃, 26 ℃, 27 ℃, 28 ℃, 29 ℃, 30 ℃, 31 ℃, 32 ℃, 33 ℃, 34 ℃, 35 ℃, 36 ℃, 37 ℃, 38 ℃, 39 ℃, 40 ℃, 41 ℃, 42 ℃, 43 ℃, 44 ℃, 45 ℃, 50 ℃, 51 ℃, 52 ℃, 53 ℃, 54 ℃, 55 ℃, 56 ℃, 57 ℃, 58 ℃, 59 ℃, 60 ℃, or 65 ℃. In some embodiments, the variant binary complex exhibits reduced off-target binding to the non-target nucleic acid in a buffer having a pH in the range of about 7.3 to about 8.6 as compared to the parent binary complex. In some embodiments, the variant binary complex exhibits reduced off-target binding to non-target nucleic acids compared to the parent binary complex when the Tm value of the variant Cas12i4 polypeptide is at least 1 ℃, 2 ℃, 3 ℃, 4 ℃, 5 ℃, 6 ℃, 7 ℃, 8 ℃, 9 ℃, 10 ℃, 11 ℃, 12 ℃, 13 ℃, 14 ℃, 15 ℃, 16 ℃, 17 ℃, 18 ℃, 19 ℃, or 20 ℃ greater than the Tm value of the parent polypeptide. In one embodiment, the variant binary complex exhibits reduced off-target binding to non-target nucleic acids when the Tm value of the variant binary complex is at least 8 ℃ greater than the Tm value of the parent polypeptide.
In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide that forms a variant binary complex that exhibits (a) reduced enzymatic activity and (b) reduced off-target binding to a non-target nucleic acid relative to the parent binary complex comprising the polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide that forms a variant binary complex that exhibits (a) increased enzymatic activity and (b) decreased off-target binding to a non-target nucleic acid relative to the parent binary complex comprising the polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide that forms a variant binary complex that exhibits (a) retained enzymatic activity and (b) reduced off-target binding to a non-target nucleic acid relative to the parent binary complex comprising the polypeptide of SEQ ID No. 2.
In some embodiments, the variant Cas12i4 polypeptide comprises at least one alteration such that a binary complex comprising the variant Cas12i4 polypeptide (e.g., a variant binary complex) exhibits reduced dissociation from the target nucleic acid compared to the parent binary complex. In some embodiments, the variant binary complex exhibits reduced dissociation from the target nucleic acid compared to the parent binary complex at a temperature lower than about any one of 20 ℃, 21 ℃, 22 ℃, 23 ℃, 24 ℃, 25 ℃, 26 ℃, 27 ℃, 28 ℃, 29 ℃, 30 ℃, 31 ℃, 32 ℃, 33 ℃, 34 ℃, 35 ℃, 36 ℃, 37 ℃, 38 ℃, 39 ℃, 40 ℃, 41 ℃, 42 ℃, 43 ℃, 44 ℃, 45 ℃, 50 ℃, 51 ℃, 52 ℃, 53 ℃, 54 ℃, 55 ℃, 56 ℃, 57 ℃, 58 ℃, 59 ℃, 60 ℃, or 65 ℃. In some embodiments, the variant binary complex exhibits reduced dissociation from the target nucleic acid in a buffer having a pH in the range of about 7.3 to about 8.6 as compared to the parent binary complex. In some embodiments, the variant binary complex exhibits reduced dissociation from the target nucleic acid compared to the parent binary complex when the Tm value of the variant Cas12i4 polypeptide is at least 1 ℃, 2 ℃, 3 ℃, 4 ℃, 5 ℃, 6 ℃, 7 ℃, 8 ℃, 9 ℃, 10 ℃, 11 ℃, 12 ℃, 13 ℃, 14 ℃, 15 ℃, 16 ℃, 17 ℃, 18 ℃, 19 ℃, or 20 ℃ greater than the Tm value of the parent polypeptide. In one embodiment, the variant binary complex exhibits reduced dissociation from the target nucleic acid when the Tm value of the variant binary complex is at least 8 ℃ greater than the Tm value of the parent polypeptide.
In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide that forms a variant binary complex that exhibits (a) reduced enzymatic activity and (b) enhanced dissociation from the target nucleic acid relative to the parent binary complex comprising the polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide that forms a variant binary complex that exhibits (a) increased enzymatic activity and (b) enhanced dissociation from the target nucleic acid relative to the parent binary complex comprising the polypeptide of SEQ ID No. 2. In some embodiments, at least one alteration is introduced into the parent polypeptide of SEQ ID No. 2 to produce a variant Cas12i4 polypeptide that forms a variant binary complex that exhibits (a) retained enzymatic activity and (b) enhanced dissociation from the target nucleic acid relative to the parent binary complex comprising the polypeptide of SEQ ID No. 2.
In some embodiments, the variant Cas12i4 polypeptide comprises at least one alteration such that a ternary complex comprising the variant Cas12i4 polypeptide (e.g., a variant ternary complex) exhibits enhanced stability compared to the parent ternary complex. In some embodiments, the variant ternary complex exhibits enhanced stability compared to the parent ternary complex at a temperature lower than about any one of 20 ℃, 21 ℃, 22 ℃, 23 ℃, 24 ℃, 25 ℃, 26 ℃, 27 ℃, 28 ℃, 29 ℃, 30 ℃, 31 ℃, 32 ℃, 33 ℃, 34 ℃, 35 ℃, 36 ℃, 37 ℃, 38 ℃, 39 ℃, 40 ℃, 41 ℃, 42 ℃, 43 ℃, 44 ℃, 45 ℃, 50 ℃, 51 ℃, 52 ℃, 53 ℃, 54 ℃, 55 ℃, 56 ℃, 57 ℃, 58 ℃, 59 ℃, 60 ℃, or 65 ℃. In some embodiments, the variant ternary complex exhibits enhanced stability in a buffer having a pH in the range of about 7.3 to about 8.6 as compared to the parent ternary complex. In some embodiments, the variant ternary complex exhibits enhanced stability compared to the parent ternary complex when the Tm value of the variant ternary complex is at least 1 ℃, 2 ℃, 3 ℃, 4 ℃, 5 ℃, 6 ℃, 7 ℃, 8 ℃, 9 ℃, 10 ℃, 11 ℃, 12 ℃, 13 ℃, 14 ℃, 15 ℃, 16 ℃, 17 ℃, 18 ℃, 19 ℃, or 20 ℃ greater than the Tm value of the parent ternary complex. In one embodiment, the variant ternary complex exhibits enhanced stability when the Tm value of the variant ternary complex is at least 8 ℃ greater than the Tm value of the parent ternary complex.
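By way of illustration only, the Tm comparison recited above can be expressed as a simple difference between a variant melting temperature and a parent melting temperature, checked against a chosen minimum shift. The following Python sketch is not part of the disclosed methods; the function names and the example temperatures are hypothetical, and only the 8 ℃ default threshold is taken from the text.

```python
def tm_shift(variant_tm_c: float, parent_tm_c: float) -> float:
    """Return the Tm difference (variant minus parent) in degrees Celsius."""
    return variant_tm_c - parent_tm_c


def meets_tm_threshold(variant_tm_c: float, parent_tm_c: float,
                       min_shift_c: float = 8.0) -> bool:
    """True when the variant Tm exceeds the parent Tm by at least min_shift_c."""
    return tm_shift(variant_tm_c, parent_tm_c) >= min_shift_c


# Hypothetical values chosen only to exercise the comparison.
example_parent_tm = 49.0
example_variant_tm = 58.0
example_result = meets_tm_threshold(example_variant_tm, example_parent_tm)
```

Other recited thresholds (for example, 1 ℃ or 20 ℃) can be checked by passing a different `min_shift_c` value.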
Increased RNA guide interaction
In some embodiments, the variant Cas12i4 polypeptide comprises an alteration that increases the interaction and/or affinity between the variant Cas12i4 polypeptide and the RNA guide as compared to the parent polypeptide. In some embodiments, the alteration that increases the interaction and/or affinity between the variant Cas12i4 polypeptide and the RNA guide is a substitution of one or more amino acids to an arginine, lysine, glutamine, asparagine, histidine, serine, or tyrosine residue. In some embodiments, the variant Cas12i4 polypeptide comprises a substitution of one or more amino acids in the RNA binding interface to an arginine, lysine, glutamine, asparagine, histidine, serine, tyrosine, phenylalanine, glutamic acid, or methionine residue. In some embodiments, the variant Cas12i4 polypeptide comprises a change in one or more amino acids in at least one domain or motif (e.g., a Wedge domain, RuvC1 motif, RuvC2 motif, or Rec2 domain). In some embodiments, the RNA binding interface substitution increases RNA guide binding or RNA guide binding affinity by about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 160%, 180%, 200%, or more as compared to the parent polypeptide.
In some embodiments, the substitution increases RNA guide complex (binary complex) formation relative to the parent polypeptide. Non-limiting examples of substitutions that can alter the ability of a variant Cas12i4 polypeptide to interact with the direct repeat of an RNA guide are shown in table 4. In some embodiments, Cas12i4 polypeptides comprising one or more of the substitutions listed in table 4 exhibit enhanced RNA guide complex (binary complex) formation relative to the parent polypeptide. In some embodiments, a Cas12i4 polypeptide comprising one or more of the substitutions listed in table 4 forms a more stable binary complex with an RNA guide as compared to a binary complex comprising a parent polypeptide.
Table 4. Substitutions that increase contact with the direct repeat.
In some embodiments, the variant Cas12i4 polypeptide of any one of SEQ ID NOs 2-59 further comprises one or more substitutions listed in table 4. In some embodiments, the variant Cas12i4 polypeptide comprises one or more substitutions listed in table 2 and table 4.
In some embodiments, a variant Cas12i4 polypeptide exhibiting enhanced RNA guide complex (binary complex) formation comprises two or more substitutions. In some embodiments, the variant Cas12i4 polypeptide further comprises K545R and K546R. In some embodiments, the variant Cas12i4 polypeptide further comprises K545R, K546R, and N654R.
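For orientation, substitution labels such as K545R follow the conventional notation of parent residue, 1-based position, and replacement residue (here, lysine at position 545 replaced by arginine). The following Python sketch shows one way such labels might be parsed and applied to a sequence string. It is illustrative only: the helper names are hypothetical, and the five-residue parent sequence is a made-up toy string, not SEQ ID No. 2.

```python
import re
from typing import Iterable, NamedTuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
_LABEL = re.compile(rf"^([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])$")


class Substitution(NamedTuple):
    parent_residue: str
    position: int  # 1-based, as written in the label
    new_residue: str


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'K545R' into its three parts."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    parent, position, new = match.groups()
    return Substitution(parent, int(position), new)


def apply_substitutions(parent_seq: str, labels: Iterable[str]) -> str:
    """Apply each substitution after checking the expected parent residue."""
    residues = list(parent_seq)
    for label in labels:
        sub = parse_substitution(label)
        index = sub.position - 1
        if index < 0 or index >= len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[index] != sub.parent_residue:
            raise ValueError(f"{label}: parent residue is {residues[index]}")
        residues[index] = sub.new_residue
    return "".join(residues)


# Toy parent sequence (hypothetical, for illustration only).
toy_parent = "MKKNA"
toy_variant = apply_substitutions(toy_parent, ["K2R", "K3R"])
```

Checking the parent residue before replacing it guards against applying a label to the wrong sequence or the wrong numbering scheme.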
In some embodiments, a variant Cas12i4 polypeptide of any one of SEQ ID NOs 2-59 further comprising one or more substitutions listed in table 4 exhibits increased enzymatic activity. In some embodiments, Cas12i4 polypeptides comprising one or more of the substitutions listed in table 4 exhibit increased enzymatic activity. In some embodiments, the variant Cas12i4 polypeptide exhibits increased enzymatic activity (e.g., about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 170%, 180%, 200% or more) compared to the parent polypeptide.
Increased double-stranded DNA interactions
In some aspects, the variant Cas12i4 polypeptide comprises an alteration that increases interaction with double-stranded DNA relative to the parent polypeptide. In some embodiments, the increased interaction with double-stranded DNA is increased electrostatic interaction. In some embodiments, the variant Cas12i4 polypeptide comprises an alteration that increases the affinity between the variant Cas12i4 polypeptide and the double-stranded DNA relative to the parent polypeptide. In some embodiments, the alteration that increases the interaction and/or affinity between the variant Cas12i4 polypeptide and the double-stranded DNA increases the binding of the variant Cas12i4 polypeptide to the PAM sequence.
In some embodiments, the alteration that increases the interaction and/or affinity between the variant Cas12i4 polypeptide and the double-stranded DNA is a substitution of one or more amino acids. In some embodiments, the variant Cas12i4 polypeptide comprises a substitution of one or more amino acids in the double-stranded DNA binding interface. In some embodiments, the variant Cas12i4 polypeptide comprises a change in one or more amino acids in at least one domain (e.g., a Rec1 domain, PI domain, or Wedge domain) to an arginine, lysine, glutamine, asparagine, histidine, tryptophan, glycine, leucine, alanine, or serine residue. In some embodiments, the double-stranded DNA binding interface substitution increases double-stranded DNA interactions and/or affinities by about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 170%, 180%, 190%, 200%, or more as compared to the parent polypeptide.
In some embodiments, the double-stranded DNA binding interface substitution increases the binding of the variant Cas12i4 polypeptide to the PAM sequence by about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 170%, 200%, or more than the parent polypeptide.
In some embodiments, the substitution that increases double-stranded DNA interactions increases ternary complex formation relative to the parent polypeptide. Non-limiting examples of substitutions that can alter the ability of the variant Cas12i4 polypeptide to interact with double-stranded DNA are shown in table 5. In some embodiments, Cas12i4 polypeptides comprising one or more of the substitutions listed in table 5 exhibit increased double-stranded DNA interactions (ternary complex formation) relative to the parent polypeptide. In some embodiments, Cas12i4 polypeptides comprising one or more of the substitutions listed in table 5 form a more stable ternary complex as compared to the parent polypeptide.
Table 5. Substitutions that alter double-stranded DNA interactions.
In some embodiments, the variant Cas12i4 polypeptide of any one of SEQ ID NOs 2-59 further comprises one or more substitutions listed in table 5. In some embodiments, the variant Cas12i4 polypeptide comprises one or more substitutions listed in table 2 and table 5.
In some embodiments, a variant Cas12i4 polypeptide exhibiting increased double-stranded DNA interactions comprises two or more substitutions listed in table 5. In some embodiments, the variant Cas12i4 polypeptide that exhibits increased double-stranded DNA interaction comprises K232R and D228A. In some embodiments, a variant Cas12i4 polypeptide that exhibits increased double-stranded DNA interaction comprises a286R and Y160L. In some embodiments, the variant Cas12i4 polypeptide that exhibits increased double-stranded DNA interaction comprises N287K and V456A. In some embodiments, the variant Cas12i4 polypeptide that exhibits increased double-stranded DNA interaction comprises K178R and E173S.
In some embodiments, the variant Cas12i4 polypeptide comprises any one or more substitutions in table 4 and/or table 5. In some embodiments, a variant Cas12i4 polypeptide having one or more substitutions in table 4 and/or table 5 exhibits increased double-stranded DNA interactions and/or affinities (e.g., an increase of about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 98%, 99%, 120%, 150%, 170%, 180%, or more) compared to the parent polypeptide. In some embodiments, a variant Cas12i4 polypeptide having one or more substitutions in table 4 and/or table 5 exhibits increased ternary complex formation and/or ternary complex stability (e.g., an increase of about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 94%, 98%, 120%, 140%, 150%, 180%, or more) compared to the parent polypeptide.
In some embodiments, a variant Cas12i4 polypeptide of any one of SEQ ID NOs 2-59 further comprising one or more substitutions listed in table 4 and/or table 5 exhibits increased enzymatic activity. In some embodiments, Cas12i4 polypeptides comprising one or more of the substitutions listed in table 4 and/or table 5 exhibit increased enzymatic activity. In some embodiments, the variant Cas12i4 polypeptide exhibits increased enzymatic activity (e.g., about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 170%, 180%, 200% or more) compared to the parent polypeptide.
Increased single-stranded DNA interactions
In some embodiments, the variant Cas12i4 polypeptide comprises an alteration that increases interaction with single-stranded DNA relative to the parent polypeptide. In some embodiments, the variant Cas12i4 polypeptide comprises an alteration that increases the affinity between the variant Cas12i4 polypeptide and the single-stranded DNA relative to the parent polypeptide. In some embodiments, the single-stranded DNA comprises a non-target strand (NTS). In some embodiments, the increased interaction with single-stranded DNA (e.g., NTS) is an interaction between the PAM sequence and the active site of the variant Cas12i4 polypeptide. In some embodiments, the single-stranded DNA comprises single-stranded DNA that interacts with the variant Cas12i4 polypeptide at or near the active site of the variant Cas12i4 polypeptide. In some embodiments, an alteration that increases the interaction and/or affinity between the variant Cas12i4 polypeptide and the single-stranded DNA stabilizes the R-loop. As used herein, "R-loop" refers to a nucleic acid comprising an RNA guide paired with a target strand (TS) and a single-stranded non-target strand (NTS).
In some embodiments, the alteration that increases the interaction and/or affinity between the variant Cas12i4 polypeptide and the single-stranded DNA is a substitution of one or more amino acids. In some embodiments, the variant Cas12i4 polypeptide comprises a substitution of one or more amino acids in a single-stranded DNA-binding interface. In some embodiments, the variant Cas12i4 polypeptide comprises a change in one or more amino acids in at least one domain/motif (e.g., PI domain, Rec1 domain, RuvC1 motif, Rec2 domain, RuvC2 motif, Nuc domain, or RuvC3 motif) to arginine, lysine, or alanine.
In some embodiments, single-stranded DNA binding interface substitutions increase single-stranded DNA interactions and/or affinities by about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 170%, 180%, 190%, 200%, or more than the parent polypeptide.
In some embodiments, the substitution that increases single-stranded DNA interactions increases ternary complex formation relative to the parent polypeptide. Non-limiting examples of substitutions that can alter the ability of the variant Cas12i4 polypeptide to interact with single-stranded DNA are shown in table 6. In some embodiments, Cas12i4 polypeptides comprising one or more of the substitutions listed in table 6 exhibit increased single-stranded DNA interactions (ternary complex formation) relative to the parent polypeptide. In some embodiments, Cas12i4 polypeptides comprising one or more of the substitutions listed in table 6 form a more stable ternary complex as compared to the parent polypeptide.
Table 6. Substitutions that alter single-stranded DNA interactions.
In some embodiments, the variant Cas12i4 polypeptide of any one of SEQ ID NOs 2-59 further comprises one or more substitutions set forth in table 6. In some embodiments, the variant Cas12i4 polypeptide comprises one or more substitutions listed in table 2 and table 6.
In some embodiments, a variant Cas12i4 polypeptide exhibiting increased single-stranded DNA interactions comprises two or more substitutions listed in table 6. In some embodiments, a variant Cas12i4 polypeptide exhibiting increased ternary complex formation/stability comprises two or more substitutions listed in table 6. In some embodiments, the variant Cas12i4 polypeptide comprises any one or more substitutions in table 4 and/or table 5 and/or table 6. In some embodiments, a variant Cas12i4 polypeptide having one or more substitutions in table 4 and/or table 5 and/or table 6 exhibits increased single-stranded DNA interactions and/or affinities (e.g., an increase of about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 92%, 95%, 98%, 120%, 140%, 150%, or more) compared to the parent polypeptide.
In some embodiments, a variant Cas12i4 polypeptide having one or more substitutions in table 4 and/or table 5 and/or table 6 exhibits increased ternary complex formation and/or ternary complex stability as compared to the parent polypeptide (e.g., an increase of about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 160%, 170%, 180%, 190%, 200% or more, or any percentage therebetween).
In some embodiments, the variant Cas12i4 polypeptide comprises a substitution that increases the stability of the single-stranded DNA (e.g., the substitution increases the electrostatic interaction between the single-stranded DNA and the active site of the variant Cas12i4 polypeptide). In some embodiments, a variant Cas12i4 polypeptide increases single-stranded DNA stability by about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 160%, 170%, 180%, 200% or more than the parent polypeptide. Non-limiting examples of substitutions that can alter the ability of the variant Cas12i4 polypeptide to stabilize single-stranded DNA are shown in table 6. In some embodiments, cas12i4 polypeptides comprising one or more of the substitutions listed in table 6 exhibit increased single-stranded DNA stability relative to the parent polypeptide.
In some embodiments, a variant Cas12i4 polypeptide of any one of SEQ ID NOs 2-59 further comprising one or more substitutions listed in table 4 and/or table 5 and/or table 6 exhibits increased enzymatic activity. In some embodiments, Cas12i4 polypeptides comprising one or more of the substitutions listed in table 4 and/or table 5 and/or table 6 exhibit increased enzymatic activity. In some embodiments, the variant Cas12i4 polypeptide exhibits increased enzymatic activity (e.g., about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 170%, 180%, 200% or more) compared to the parent polypeptide.
Increased heteroduplex interactions
In some embodiments, the variant Cas12i4 polypeptide comprises a substitution that increases the interaction with a DNA/RNA hybrid molecule relative to the parent polypeptide. In some embodiments, the variant Cas12i4 polypeptide comprises an alteration that increases the affinity between the variant Cas12i4 polypeptide and the DNA/RNA hybrid relative to the parent polypeptide. In some embodiments, the DNA/RNA hybrid molecule is a heteroduplex. As used herein, "heteroduplex" refers to a duplex formed by the spacer of an RNA guide and a target strand (TS). As used herein, the term "seed region" refers to the portion of the TS in the heteroduplex immediately downstream of the PAM sequence. The seed region contains the first bases in the heteroduplex to pair with the RNA guide and is required for RNA-DNA binding and TS substitution. In some embodiments, the alteration that increases the interaction and/or affinity between the variant Cas12i4 polypeptide and the heteroduplex increases non-specific nucleic acid contacts. In some embodiments, the alteration that increases the interaction and/or affinity between the variant Cas12i4 polypeptide and the heteroduplex increases ternary complex formation/stability relative to the parent polypeptide.
In some embodiments, the alteration that increases the interaction and/or affinity between the variant Cas12i4 polypeptide and the heteroduplex is a substitution of one or more amino acids. In some embodiments, the variant Cas12i4 polypeptide comprises a substitution of one or more amino acids that contact the heteroduplex. In some embodiments, the variant Cas12i4 polypeptide comprises a change in one or more amino acids in at least one domain/motif (e.g., a Wedge domain, Rec1 domain, Rec2 domain, or RuvC2 motif) to lysine, arginine, histidine, serine, glutamine, or asparagine. In some embodiments, the nucleic acid interface substitution increases heteroduplex interactions and/or affinities by about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 160%, 180%, 190%, 200% or more as compared to the parent polypeptide.
In some embodiments, the substitution that increases heteroduplex interactions increases ternary complex formation/stability relative to the parent polypeptide. Non-limiting examples of substitutions that can alter the ability of a variant Cas12i4 polypeptide to interact with a heteroduplex are shown in table 7. In some embodiments, Cas12i4 polypeptides comprising one or more of the substitutions listed in table 7 exhibit increased heteroduplex interactions (ternary complex formation) relative to the parent polypeptide. In some embodiments, Cas12i4 polypeptides comprising one or more of the substitutions listed in table 7 form a more stable ternary complex as compared to the parent polypeptide.
Table 7. Substitutions that alter heteroduplex interactions.
* Substitution in the seed region
In some embodiments, the variant Cas12i4 polypeptide of any one of SEQ ID NOs 2-59 further comprises one or more substitutions set forth in table 7. In some embodiments, the variant Cas12i4 polypeptide comprises one or more substitutions listed in table 2 and table 7.
In some embodiments, a variant Cas12i4 polypeptide that exhibits increased heteroduplex interactions comprises two or more substitutions listed in table 7. In some embodiments, a variant Cas12i4 polypeptide exhibiting increased ternary complex formation/stability comprises two or more substitutions listed in table 7. In some embodiments, the variant Cas12i4 polypeptide comprises V585R and Y447S. In some embodiments, the variant Cas12i4 polypeptide comprises V585K and Y447S. In some embodiments, the variant Cas12i4 polypeptide comprises V585R and Y447K. In some embodiments, the variant Cas12i4 polypeptide comprises V585K and Y447K. In some embodiments, the variant Cas12i4 polypeptide of any one of SEQ ID NOs 2-59 further comprises V585R and Y447S, V585K and Y447S, V585R and Y447K, or V585K and Y447K. In some embodiments, the variant Cas12i4 polypeptide comprises any one or more substitutions in table 4 and/or table 5 and/or table 6 and/or table 7. In some embodiments, a variant Cas12i4 polypeptide having one or more substitutions in table 4 and/or table 5 and/or table 6 and/or table 7 exhibits increased heteroduplex interactions and/or affinities compared to the parent polypeptide (e.g., an increase of about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 88%, 90%, 92%, 94%, 95%, 120%, 150%, or more, or any percentage therebetween).
In some embodiments, a variant Cas12i4 polypeptide having one or more substitutions in table 4 and/or table 5 and/or table 6 and/or table 7 exhibits increased ternary complex formation and/or ternary complex stability (e.g., an increase of about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 88%, 92%, 98%, 95%, 90%, 95%, 150%, 120%, or more) compared to the parent polypeptide.
In some embodiments, a variant Cas12i4 polypeptide comprising any one or more of the substitutions set forth in table 4 and/or table 5 and/or table 6 and/or table 7 exhibits increased enzymatic activity. In some embodiments, Cas12i4 polypeptides comprising one or more of the substitutions listed in table 4 and/or table 5 and/or table 6 and/or table 7 exhibit increased enzymatic activity. In some embodiments, the variant Cas12i4 polypeptide exhibits increased enzymatic activity (e.g., about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 170%, 180%, 200% or more) compared to the parent polypeptide.
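The percentage increases recited above are all expressed relative to the parent polypeptide. As an illustrative sketch only, the Python snippet below shows one way such a relative increase could be computed from two measurements of the same property (for example, an activity readout). The function name and the example values are hypothetical and are not taken from the disclosure.

```python
def percent_increase(variant_value: float, parent_value: float) -> float:
    """Return the increase of the variant over the parent, as a percentage.

    Both values are assumed to be measurements of the same property made
    under the same conditions; parent_value must be positive.
    """
    if parent_value <= 0:
        raise ValueError("parent_value must be positive")
    return (variant_value - parent_value) / parent_value * 100.0


# Hypothetical readouts: the variant measures 1.5 units, the parent 1.0 unit.
example = percent_increase(1.5, 1.0)
print(example)  # 50.0
```

Under this convention, a value of 100% corresponds to a two-fold readout relative to the parent, and 200% to a three-fold readout.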
Increased double-stranded DNA duplex and heteroduplex stability
During ternary complex formation, double-stranded DNA downstream of the PAM sequence is melted (e.g., unwound) into a target strand (TS) and a non-target strand (NTS). The spacer of the RNA guide binds to the TS, forming a double helix called the heteroduplex. The PAM sequence does not melt and remains intact double-stranded DNA. As a result, the terminal base pairs of the PAM dsDNA are partially exposed to the environment and to the protein, which can be energetically unfavorable. Similarly, the terminal base pairs of the heteroduplex are exposed, which may be energetically unfavorable. In some embodiments, an alteration that increases aromatic, hydrophobic, van der Waals, and/or cation-pi interactions between the variant Cas12i4 polypeptide and the exposed terminal PAM bases of the double-stranded DNA duplex or the terminal bases of the heteroduplex increases the stability of DNA melting during ternary complex formation.
In some embodiments, an alteration that increases aromatic, hydrophobic, van der Waals, and/or cation-pi interactions between the variant Cas12i4 polypeptide and the exposed bases of the double-stranded DNA duplex or heteroduplex increases R-loop stability during ternary complex formation. In some embodiments, an alteration that increases aromatic, hydrophobic, van der Waals, and/or cation-pi interactions between the variant Cas12i4 polypeptide and the exposed bases of the double-stranded DNA duplex or heteroduplex increases ternary complex formation. In some embodiments, an alteration that increases aromatic, hydrophobic, van der Waals, and/or cation-pi interactions between the variant Cas12i4 polypeptide and the exposed bases of the double-stranded DNA duplex or heteroduplex increases ternary complex stability.
In some embodiments, the alteration that increases aromatic, hydrophobic, van der Waals, and/or cation-pi interactions is a substitution of one or more residues. In some embodiments, the alteration that increases aromatic, hydrophobic, van der Waals, and/or cation-pi interactions is a substitution of one or more residues that contact the double-stranded DNA duplex and/or heteroduplex. In some embodiments, the alteration that increases aromatic, hydrophobic, van der Waals, and/or cation-pi interactions is a substitution listed in table 8. In some embodiments, a variant Cas12i4 polypeptide comprising the substitutions listed in table 8 exhibits increased aromatic, hydrophobic, van der Waals, and/or cation-pi interactions between the variant Cas12i4 polypeptide and exposed bases of a double-stranded DNA duplex or heteroduplex as compared to the parent polypeptide. In some embodiments, the alteration includes substitution of an amino acid adjacent to a terminal duplex base pair with a positively charged, aromatic, hydrophobic, or branched amino acid to create conditions that are more energetically favorable for the double-stranded DNA and the heteroduplex.
Table 8. Substitutions that stabilize the R-loop.
Residue    Substitution
I4         A
S5         Q, I, M
Y876       W, H
E156       R
E158       Q, K, R
A161       M, R, Y
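Each row of Table 8 pairs a parent residue and position with one or more replacement residues. As an illustrative sketch only, the Python snippet below expands those rows into conventional substitution labels (for example, S5Q). The dictionary and function names are hypothetical; only the residue data come from Table 8.

```python
# Rows of Table 8: parent residue plus position, mapped to replacement residues.
TABLE_8 = {
    "I4": ["A"],
    "S5": ["Q", "I", "M"],
    "Y876": ["W", "H"],
    "E156": ["R"],
    "E158": ["Q", "K", "R"],
    "A161": ["M", "R", "Y"],
}


def expand_substitutions(table):
    """Return one label per residue/replacement pair, e.g. 'S5Q'."""
    return [f"{site}{new}" for site, replacements in table.items()
            for new in replacements]


labels = expand_substitutions(TABLE_8)
print(labels)
```

The paired embodiments named in the text, such as I4A with Y876W or E156R with E158Q, are combinations of two labels from this expanded list.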
In some embodiments, the variant Cas12i4 polypeptide of any one of SEQ ID NOs 2-59 further comprises one or more substitutions set forth in table 8. In some embodiments, the variant Cas12i4 polypeptide comprises one or more substitutions listed in table 2 and table 8.
In some embodiments, a variant Cas12i4 polypeptide that exhibits increased ternary complex formation and/or ternary complex stability (e.g., by stabilizing DNA melting and/or the R-loop) comprises two or more substitutions listed in table 8. In some embodiments, the variant Cas12i4 polypeptide comprises I4A and Y876W. In some embodiments, the variant Cas12i4 polypeptide comprises E156R and E158Q. In some embodiments, the variant Cas12i4 polypeptide of any one of SEQ ID NOs 2-59 further comprises one or more substitutions set forth in table 8. In some embodiments, the variant Cas12i4 polypeptide comprises any one or more substitutions of table 4, table 5, table 6, table 7, and/or table 8. In some embodiments, a variant Cas12i4 polypeptide having one or more substitutions in table 4, table 5, table 6, table 7, and/or table 8 exhibits increased ternary complex formation and/or ternary complex stability as compared to the parent polypeptide (e.g., about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 160%, 170%, 180%, 190%, 200% or more, or any percentage therebetween).
In some embodiments, a variant Cas12i4 polypeptide of any one of SEQ ID NOs 2-59 comprising any one or more of the substitutions set forth in table 4, table 5, table 6, table 7, and/or table 8 exhibits increased enzymatic activity. In some embodiments, Cas12i4 polypeptides comprising one or more of the substitutions listed in table 4, table 5, table 6, table 7, and/or table 8 exhibit increased enzymatic activity. In some embodiments, the variant Cas12i4 polypeptide exhibits increased enzymatic activity (e.g., about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 170%, 180%, 200% or more) compared to the parent polypeptide.
Increased conformational change
Conformational changes (e.g., upon binding to the RNA guide or target DNA) affect the function of the variant Cas12i4 polypeptide; for example, conformational changes may alter the kinetics of the variant Cas12i4 polypeptide. The Rec1 (helix II) domain of Cas12i4 moves and rotates during ternary complex formation to accommodate DNA binding. In some embodiments, alterations that increase movement (e.g., flexibility or conformational change) of the helix II domain increase DNA binding/DNA binding affinity. In some embodiments, substitutions that increase flexibility in the helix II domain (e.g., substitution of bulky amino acids with amino acids having smaller side chains, such as alanine, valine, glycine, or serine residues) increase ternary complex formation. In some embodiments, an alteration that increases movement (e.g., flexibility or conformational change) of the helix II domain increases ternary complex stability. In some embodiments, the alteration that increases the conformational change of the helix II domain is a substitution of one or more residues with an alanine, valine, glycine, or serine residue. In some embodiments, the alteration that increases the flexibility of the helix II domain is a substitution of one or more residues. In some embodiments, the variant Cas12i4 polypeptide comprises an alteration of one or more amino acids near the helix II domain. In some embodiments, the variant Cas12i4 polypeptide comprises the substitutions shown in table 9.
Table 9. Substitutions that alter the flexibility of the helix II domain.
In some embodiments, the variant Cas12i4 polypeptide of any one of SEQ ID NOs 2-59 further comprises one or more substitutions set forth in table 9. In some embodiments, the variant Cas12i4 polypeptide comprises one or more substitutions listed in table 2 and table 9.
In some embodiments, the alteration that increases the flexibility of the helix II domain is a substitution listed in table 9. In some embodiments, a variant Cas12i4 polypeptide having one or more of the substitutions listed in table 9 exhibits an increase in helix II domain flexibility of about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 160%, 170%, 180%, 190%, 200% or more, or any percentage therebetween, compared to the parent polypeptide. In some embodiments, the alteration that increases DNA binding/DNA affinity is a substitution listed in table 9. In some embodiments, a variant Cas12i4 polypeptide having one or more of the substitutions listed in table 9 exhibits an increase in DNA binding/DNA affinity of about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 160%, 170%, 180%, 190%, 200% or more, or any percentage therebetween, compared to the parent polypeptide.
In some embodiments, a variant Cas12i4 polypeptide comprising a substitution set forth in table 9 exhibits increased ternary complex formation and/or ternary complex stability (e.g., an increase of about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 160%, 170%, 180%, 190%, 200% or more, or any percentage therebetween) compared to the parent polypeptide.
In some embodiments, a variant Cas12i4 polypeptide exhibiting increased helix II domain flexibility comprises two or more substitutions listed in table 9. In some embodiments, a variant Cas12i4 polypeptide exhibiting increased DNA binding/affinity comprises two or more substitutions listed in table 9. In some embodiments, a variant Cas12i4 polypeptide exhibiting increased ternary complex formation/stability comprises two or more substitutions listed in table 9. In some embodiments, the variant Cas12i4 polypeptide of any one of SEQ ID NOs 2-59 further comprises one or more substitutions set forth in table 9. In some embodiments, the variant Cas12i4 polypeptide comprises any one or more substitutions in table 4 and/or table 5 and/or table 6 and/or table 7 and/or table 8 and/or table 9. In some embodiments, a variant Cas12i4 polypeptide having one or more substitutions in table 4 and/or table 5 and/or table 6 and/or table 7 and/or table 8 and/or table 9 exhibits increased DNA binding/affinity (e.g., an increase of about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 160%, 170%, 180%, 190%, 200% or more, or any percentage therebetween) compared to the parent polypeptide.
In some embodiments, a variant Cas12i4 polypeptide having one or more substitutions in table 4 and/or table 5 and/or table 6 and/or table 7 and/or table 8 and/or table 9 exhibits increased ternary complex formation and/or ternary complex stability as compared to the parent polypeptide (e.g., about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 160%, 170%, 180%, 190%, 200% or more, or any percentage therebetween).
In some embodiments, a variant Cas12i4 polypeptide comprising any one of the one or more substitutions set forth in table 4 and/or table 5 and/or table 6 and/or table 7 and/or table 8 and/or table 9 exhibits increased enzymatic activity. In some embodiments, a Cas12i4 polypeptide comprising one or more of the substitutions listed in table 4 and/or table 5 and/or table 6 and/or table 7 and/or table 8 and/or table 9 exhibits increased enzymatic activity. In some embodiments, the variant Cas12i4 polypeptide exhibits increased enzymatic activity (e.g., about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 170%, 180%, 200% or more) compared to the parent polypeptide.
In some embodiments, an alteration that increases the connection between Nuc and the helix II interface, which is formed when the target single-stranded DNA is in the active site of the Cas12i4 polypeptide, increases the transition from a binary complex to a ternary complex. In some embodiments, an alteration that increases the connection between Nuc and the helix II interface increases ternary complex formation. In some embodiments, an alteration that increases the connection between Nuc and the helix II interface increases ternary complex stability. In some embodiments, the alteration that increases the connection between Nuc and the helix II interface is a substitution of one or more residues with an aspartic acid, glutamic acid, arginine, or lysine residue. In some embodiments, the variant Cas12i4 polypeptide comprises the substitutions shown in table 10.
Table 10. Substitutions that increase the connection at the Nuc and helix II interface.
Amino acid substitutions
Q386E+Q387D+N966R
Q386E+Q387E+N966R
Q386E+N966R
Q386E+Q387D+A936K+N966R
Q386E+Q387E+A936K+N966R
Q387D
A936K
Q387E
S931K
N932K
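Several entries in Table 10 are combinations written with "+" (for example, Q386E+Q387D+N966R). As an illustrative sketch only, the Python snippet below parses such an entry and applies it to a sequence using 1-based positions, checking that each named parent residue is actually present. The function name and the placeholder sequence are hypothetical; the placeholder is NOT a Cas12i4 sequence and carries only the parent residues needed for the demonstration.

```python
import re

# One substitution label: parent residue, 1-based position, replacement residue.
SUB_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(parent: str, combo: str) -> str:
    """Apply a '+'-joined set of substitutions such as 'Q386E+N966R'."""
    residues = list(parent)
    for label in combo.split("+"):
        match = SUB_PATTERN.match(label.strip())
        if match is None:
            raise ValueError(f"unrecognized substitution label: {label!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if pos < 1 or pos > len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        if residues[pos - 1] != old:
            raise ValueError(
                f"expected {old} at position {pos}, found {residues[pos - 1]}"
            )
        residues[pos - 1] = new
    return "".join(residues)


# Placeholder parent (hypothetical, not a real Cas12i4 sequence): glycine
# everywhere except the positions named in the example entry.
toy = ["G"] * 1000
for position, residue in {386: "Q", 387: "Q", 936: "A", 966: "N"}.items():
    toy[position - 1] = residue
toy_parent = "".join(toy)

variant = apply_substitutions(toy_parent, "Q386E+Q387D+N966R")
print(variant[385], variant[386], variant[965])  # E D R
```

Checking the parent residue before each replacement guards against numbering mismatches, for example when a label is applied to a sequence with a different numbering scheme.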
In some embodiments, the variant Cas12i4 polypeptide of any one of SEQ ID NOs 2-59 further comprises one or more substitutions set forth in table 10. In some embodiments, the variant Cas12i4 polypeptide comprises one or more substitutions listed in table 2 and table 10.
In some embodiments, the substitutions in table 10 increase the connection between Nuc and the helix II interface. In some embodiments, a variant Cas12i4 polypeptide having one or more of the substitutions in table 10 exhibits an increase in the connection between Nuc and the helix II interface of about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 160%, 170%, 180%, 190%, 200% or more, or any percentage therebetween, compared to the parent polypeptide. In some embodiments, a variant Cas12i4 polypeptide comprising a substitution set forth in table 10 exhibits increased ternary complex formation and/or ternary complex stability (e.g., an increase of about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 160%, 170%, 180%, 190%, 200% or more, or any percentage therebetween) compared to the parent polypeptide.
In some embodiments, a variant Cas12i4 polypeptide that exhibits an increased connection between Nuc and the helix II interface comprises two or more substitutions listed in table 10. In some embodiments, a variant Cas12i4 polypeptide exhibiting increased ternary complex formation/stability comprises two or more substitutions listed in table 10. In some embodiments, the variant Cas12i4 polypeptide of any one of SEQ ID NOs 2-59 further comprises one or more substitutions set forth in table 10. In some embodiments, the variant Cas12i4 polypeptide comprises any one or more substitutions in table 4 and/or table 5 and/or table 6 and/or table 7 and/or table 8 and/or table 9 and/or table 10. In some embodiments, a variant Cas12i4 polypeptide having one or more substitutions in table 4 and/or table 5 and/or table 6 and/or table 7 and/or table 8 and/or table 9 and/or table 10 exhibits an increased connection between Nuc and the helix II interface as compared to the parent polypeptide (e.g., about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 160%, 170%, 180%, 190%, 200% or more, or any percentage therebetween).
In some embodiments, a variant Cas12i4 polypeptide having one or more substitutions in table 4 and/or table 5 and/or table 6 and/or table 7 and/or table 8 and/or table 9 and/or table 10 exhibits increased ternary complex formation and/or ternary complex stability as compared to the parent polypeptide (e.g., about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 160%, 170%, 180%, 190%, 200% or more, or any percentage therebetween).
In some embodiments, a variant Cas12i4 polypeptide comprising any one of the one or more substitutions set forth in table 4 and/or table 5 and/or table 6 and/or table 7 and/or table 8 and/or table 9 and/or table 10 exhibits increased enzymatic activity. In some embodiments, a Cas12i4 polypeptide comprising one or more of the substitutions listed in table 4 and/or table 5 and/or table 6 and/or table 7 and/or table 8 and/or table 9 and/or table 10 exhibits increased enzymatic activity.
In some embodiments, the variant Cas12i4 polypeptide exhibits increased enzymatic activity (e.g., about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 170%, 180%, 200% or more) compared to the parent polypeptide.
In some embodiments, the alteration reduces the connection between Nuc and the helix II interface. In some embodiments, an alteration that reduces the connection between Nuc and the helix II interface increases ternary complex formation. In some embodiments, the alteration that reduces the connection between Nuc and the helix II interface is a substitution of one or more residues. In some embodiments, the variant Cas12i4 polypeptide of any one of SEQ ID NOs 2-59 further comprises one or more substitutions set forth in table 10.
Increased fidelity
In some aspects, the variant Cas12i4 polypeptide comprises an alteration that increases on-target specificity relative to the parent polypeptide. In some aspects, the variant Cas12i4 polypeptide comprises an alteration that increases on-target binding relative to the parent polypeptide. In some embodiments, the variant Cas12i4 polypeptide comprises an alteration that increases the interaction (e.g., affinity) between the variant Cas12i4 polypeptide and the on-target DNA relative to the parent polypeptide.
In some embodiments, the alteration that increases on-target specificity is a substitution of one or more amino acids. In some aspects, the alteration that increases on-target specificity is a truncation of a residue that contacts the spacer sequence of the RNA guide (e.g., substitution of the residue contacting the spacer sequence with a residue having a smaller side chain). In some aspects, the alteration that increases on-target specificity is a truncation of a residue that contacts the spacer sequence of the RNA guide.
In some embodiments, the variant Cas12i4 polypeptide comprises a substitution of one or more amino acids that contact the spacer sequence of the RNA guide. In some embodiments, the variant Cas12i4 polypeptide comprises an alteration of one or more amino acids in at least one domain/motif (e.g., a ridge domain, Rec1 domain, Rec2 domain, or RuvC2 motif). In some embodiments, a truncating substitution in the helix II domain results in a variant Cas12i4 polypeptide that exhibits increased on-target binding specificity.
In some embodiments, the substitution increases the on-target specificity of the variant Cas12i4 polypeptide by about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 170%, 190% or more compared to the parent polypeptide.
In some embodiments, the substitution increases the on-target binding of the variant Cas12i4 polypeptide by about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 160%, 170%, 190%, 200% or more compared to the parent polypeptide.
In some embodiments, the substitution increases the on-target binding affinity of the variant Cas12i4 polypeptide by about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 160%, 170%, 190%, 200% or more compared to the parent polypeptide.
Non-limiting examples of alterations that can change the ability of the variant Cas12i4 polypeptide to selectively bind to on-target DNA are the substitutions listed in table 11. In some embodiments, Cas12i4 polypeptides comprising one or more of the substitutions listed in table 11 exhibit increased on-target specificity relative to the parent polypeptide. In some embodiments, Cas12i4 polypeptides comprising one or more of the substitutions listed in table 11 exhibit increased on-target binding relative to the parent polypeptide. In some embodiments, Cas12i4 polypeptides comprising one or more of the substitutions listed in table 11 exhibit increased on-target binding affinity relative to the parent polypeptide.
Table 11. Substitutions that increase on-target specificity.
In some embodiments, an alteration that increases on-target specificity (e.g., a substitution listed in table 11) further increases on-target ternary complex formation and/or on-target ternary complex stability. In some embodiments, an increase in on-target specificity increases on-target ternary complex formation and/or on-target ternary complex stability by about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 170%, 180% or more, or any percentage therebetween, compared to the parent polypeptide.
In some aspects, the variant Cas12i4 polypeptide comprises an alteration that reduces off-target specificity relative to the parent polypeptide. In some aspects, the variant Cas12i4 polypeptide comprises an alteration that reduces off-target binding relative to the parent polypeptide. In some embodiments, the variant Cas12i4 polypeptide comprises an alteration that reduces the interaction (e.g., affinity) between the variant Cas12i4 polypeptide and the off-target DNA relative to the parent polypeptide.
Methods for detecting off-target activity are known in the art. In some embodiments, off-target activity is detected by tagmentation-based tag integration site sequencing (TTISS) or by genome-wide, unbiased identification of DSBs enabled by sequencing (GUIDE-Seq). For example, in some embodiments, TTISS is performed with a Cas12i4 polypeptide or a variant Cas12i4 polypeptide using the TTISS method described in PCT/US2021/025257, which is incorporated by reference in its entirety.
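Sequencing-based assays of this kind are commonly summarized as read counts at the intended site and at each detected off-target site. As an illustrative sketch only, the Python snippet below computes one simple summary, the fraction of reads at the on-target site, so that a parent and a variant can be compared. This metric, the function name, and all read counts are assumptions made for illustration; they are not a definition of specificity from the disclosure and are not the output of any particular TTISS or GUIDE-Seq pipeline.

```python
from typing import Sequence


def on_target_fraction(on_target_reads: int,
                       off_target_reads: Sequence[int]) -> float:
    """Fraction of all counted reads that map to the intended target site."""
    total = on_target_reads + sum(off_target_reads)
    if total == 0:
        raise ValueError("no reads were counted")
    return on_target_reads / total


# Hypothetical read counts, invented for this example.
parent_fraction = on_target_fraction(800, [150, 50])
variant_fraction = on_target_fraction(950, [40, 10])
print(parent_fraction, variant_fraction)
```

With these invented counts the variant shows a higher on-target fraction than the parent, which is the direction of change described for alterations that reduce off-target binding.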
In some embodiments, the alteration that reduces off-target specificity is a substitution of one or more amino acids with alanine, serine, valine, glutamine, or asparagine residues. In some aspects, the alteration that reduces off-target specificity is a truncation of a residue that contacts the spacer sequence of the RNA guide (e.g., substitution of the residue contacting the spacer sequence with a residue having a smaller side chain). In some aspects, the alteration that reduces off-target specificity is a truncation of a residue that contacts the spacer sequence of the RNA guide, e.g., a substitution with alanine, serine, or valine. In some embodiments, the variant Cas12i4 polypeptide comprises a substitution of one or more amino acids in at least one domain/motif (e.g., a ridge domain, Rec1 domain, Rec2 domain, or RuvC2 motif) with alanine. In some embodiments, a truncating substitution in the helix II domain results in a variant Cas12i4 polypeptide that exhibits reduced off-target binding specificity. In some embodiments, the substitution reduces the off-target specificity of the variant Cas12i4 polypeptide by about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% compared to the parent polypeptide.
In some embodiments, the substitution reduces off-target binding of a variant Cas12i4 polypeptide by about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% compared to the parent polypeptide. In some embodiments, the substitution reduces the off-target binding affinity of a variant Cas12i4 polypeptide by about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% compared to the parent polypeptide.
Non-limiting examples of alterations that can change the ability of a variant Cas12i4 polypeptide to bind to off-target DNA are the substitutions listed in table 11. In some embodiments, Cas12i4 polypeptides comprising one or more of the substitutions listed in table 11 exhibit reduced off-target specificity relative to the parent polypeptide. In some embodiments, Cas12i4 polypeptides comprising one or more of the substitutions listed in table 11 exhibit reduced off-target binding relative to the parent polypeptide. In some embodiments, Cas12i4 polypeptides comprising one or more of the substitutions listed in table 11 exhibit reduced off-target binding affinity relative to the parent polypeptide.
In some embodiments, compared to the parent polypeptide, the substitution increases the on-target specificity of the variant Cas12i4 polypeptide by about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 160%, 170%, 180%, 190%, 200% or more, or any percentage therebetween, and further reduces the off-target specificity of the variant Cas12i4 polypeptide by about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100%. In some embodiments, compared to the parent polypeptide, the substitution increases the on-target binding of the variant Cas12i4 polypeptide by about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 160%, 170%, 180%, 190%, 200% or more, or any percentage therebetween, and further reduces the off-target binding of the variant Cas12i4 polypeptide by about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100%. In some embodiments, compared to the parent polypeptide, the substitution increases the on-target binding affinity of the variant Cas12i4 polypeptide by about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 160%, 170%, 180%, 190%, 200% or more, or any percentage therebetween, and further reduces the off-target binding affinity of the variant Cas12i4 polypeptide by about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100%.
In some embodiments, the variant Cas12i4 polypeptide of any one of SEQ ID NOs 2-59 further comprises one or more substitutions set forth in table 11. In some embodiments, the variant Cas12i4 polypeptide comprises one or more substitutions listed in table 2 and table 11.
In some embodiments, a variant Cas12i4 polypeptide comprising any one of the one or more substitutions set forth in table 4 and/or table 5 and/or table 6 and/or table 7 and/or table 8 and/or table 9 and/or table 10 and/or table 11 exhibits increased on-target enzymatic activity. In some embodiments, a Cas12i4 polypeptide comprising one or more of the substitutions listed in table 4 and/or table 5 and/or table 6 and/or table 7 and/or table 8 and/or table 9 and/or table 10 and/or table 11 exhibits increased on-target enzymatic activity. In some embodiments, the variant Cas12i4 polypeptide exhibits increased on-target enzymatic activity (e.g., about 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 170%, 200%, or more) compared to the parent polypeptide.
In some embodiments, a variant Cas12i4 polypeptide comprising any one of the one or more substitutions set forth in table 4 and/or table 5 and/or table 6 and/or table 7 and/or table 8 and/or table 9 and/or table 10 and/or table 11 exhibits an increased ratio of on-target to off-target enzymatic activity. In some embodiments, a Cas12i4 polypeptide comprising one or more of the substitutions listed in table 4 and/or table 5 and/or table 6 and/or table 7 and/or table 8 and/or table 9 and/or table 10 and/or table 11 exhibits an increased ratio of on-target to off-target enzymatic activity. In some embodiments, the ratio of on-target to off-target enzymatic activity of the variant Cas12i4 polypeptide (e.g., a variant Cas12i4 polypeptide of any one of SEQ ID NOs: 2-59, or a variant Cas12i4 polypeptide comprising one or more substitutions set forth in table 4 and/or table 5 and/or table 6 and/or table 7 and/or table 8 and/or table 9 and/or table 10 and/or table 11) is at least 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 150%, 170%, 180%, 190%, or more, higher than that of the parent polypeptide.
In some embodiments, the on-target enzymatic activity of the variant Cas12i4 polypeptide (e.g., a variant Cas12i4 polypeptide comprising one or more of the substitutions listed in table 4 and/or table 5 and/or table 6 and/or table 7 and/or table 8 and/or table 9 and/or table 10 and/or table 11) is at least 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, 110%, 120%, 130%, 140%, 150%, 160%, 170%, 180%, 190%, 200% or more, or any percentage therebetween, higher than the off-target enzymatic activity of the variant Cas12i4 polypeptide.
In some embodiments, the enzymatic activity of a variant Cas12i4 polypeptide (e.g., a variant Cas12i4 polypeptide of any one of SEQ ID NOs: 2-59) at the off-target locus does not exceed 10% (e.g., 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, or 0%) of the enzymatic activity at the on-target locus. In some embodiments, the enzymatic activity of the variant Cas12i4 polypeptide (e.g., a variant Cas12i4 polypeptide comprising any one of the one or more substitutions set forth in table 4 and/or table 5 and/or table 6 and/or table 7 and/or table 8 and/or table 9 and/or table 10 and/or table 11) at the off-target locus does not exceed 10% (e.g., 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, or 0%) of the enzymatic activity at the on-target locus. In some embodiments, the enzymatic activity of a variant Cas12i4 polypeptide (e.g., a variant Cas12i4 polypeptide of any one of SEQ ID NOs: 2-59) at the off-target locus does not exceed 5% (e.g., 5%, 4%, 3%, 2%, 1%, or 0%) of the enzymatic activity at the on-target locus. In some embodiments, the enzymatic activity of the variant Cas12i4 polypeptide (e.g., a variant Cas12i4 polypeptide comprising any one of the one or more substitutions set forth in table 4 and/or table 5 and/or table 6 and/or table 7 and/or table 8 and/or table 9 and/or table 10 and/or table 11) at the off-target locus does not exceed 5% (e.g., 5%, 4%, 3%, 2%, 1%, or 0%) of the enzymatic activity at the on-target locus.
By comparison, the enzymatic activity of SpCas9 at the off-target locus is up to 95% (e.g., 95%, 94%, 93%, 92%, 91%, 90%, 89%, 88%, 87%, 86%, 85%, 84%, 83%, 82%, 81%, 80%, 79%, 78%, 77%, 76%, 75%, 74%, 73%, 72%, 71%, 70%, 69%, 68%, 67%, 66%, 65%, 64%, 63%, 62%, 61%, 60%, 59%, 58%, 57%, 56%, 55%, 54%, 53%, 52%, 51%, 50%, 49%, 48%, 47%, 46%, 45%, 44%, 43%, 42%, 41%, 40%, 39%, 38%, 37%, 36%, 35%, 34%, 33%, 32%, 31%, 30%, 29%, 28%, 27%, 26%, 25%, 24%, 23%, 22%, 21%, 20%, 19%, 18%, 17%, 16%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1% or 0%) of the enzymatic activity at the on-target locus.
In some embodiments, the editing efficiency of a variant Cas12i4 polypeptide (e.g., a variant Cas12i4 polypeptide of any one of SEQ ID NOs: 2-59) at an off-target locus does not exceed 10% (e.g., 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, or 0%) of the editing efficiency at an on-target locus. In some embodiments, the editing efficiency of a variant Cas12i4 polypeptide (e.g., a variant Cas12i4 polypeptide comprising any one of the one or more substitutions set forth in table 4 and/or table 5 and/or table 6 and/or table 7 and/or table 8 and/or table 9 and/or table 10 and/or table 11) at an off-target locus is no more than 10% (e.g., 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, or 0%) of the editing efficiency at an on-target locus. In some embodiments, the editing efficiency of a variant Cas12i4 polypeptide (e.g., a variant Cas12i4 polypeptide of any one of SEQ ID NOs: 2-59) at an off-target locus is no more than 5% (e.g., 5%, 4%, 3%, 2%, 1%, or 0%) of the editing efficiency at an on-target locus. In some embodiments, the editing efficiency of a variant Cas12i4 polypeptide (e.g., a variant Cas12i4 polypeptide comprising any one of the one or more substitutions set forth in table 4 and/or table 5 and/or table 6 and/or table 7 and/or table 8 and/or table 9 and/or table 10 and/or table 11) at an off-target locus is no more than 5% (e.g., 5%, 4%, 3%, 2%, 1%, or 0%) of the editing efficiency at an on-target locus.
By comparison, the editing efficiency of SpCas9 at the off-target locus is as high as 95% (e.g., 95%, 94%, 93%, 92%, 91%, 90%, 89%, 88%, 87%, 86%, 85%, 84%, 83%, 82%, 81%, 80%, 79%, 78%, 77%, 76%, 75%, 74%, 73%, 72%, 71%, 70%, 69%, 68%, 67%, 66%, 65%, 64%, 63%, 62%, 61%, 60%, 59%, 58%, 57%, 56%, 55%, 54%, 53%, 52%, 51%, 50%, 49%, 48%, 47%, 46%, 45%, 44%, 43%, 42%, 41%, 40%, 39%, 38%, 37%, 36%, 35%, 34%, 33%, 32%, 31%, 30%, 29%, 28%, 27%, 26%, 25%, 24%, 23%, 22%, 21%, 20%, 19%, 18%, 17%, 16%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1% or 0%) of the editing efficiency at the on-target locus.
In some embodiments, the editing of a variant Cas12i4 polypeptide (e.g., a variant Cas12i4 polypeptide of any one of SEQ ID NOs: 2-59) at the off-target locus is no more than 10% (e.g., 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, or 0%) of the editing at the on-target locus. In some embodiments, the editing of the variant Cas12i4 polypeptide (e.g., a variant Cas12i4 polypeptide comprising any one of the one or more substitutions set forth in table 4 and/or table 5 and/or table 6 and/or table 7 and/or table 8 and/or table 9 and/or table 10 and/or table 11) at the off-target locus is no more than 10% (e.g., 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, or 0%) of the editing at the on-target locus.
In some embodiments, the editing of the variant Cas12i4 polypeptide (e.g., the variant Cas12i4 polypeptide of any one of SEQ ID NOs: 2-59) at the off-target locus is no more than 5% (e.g., 5%, 4%, 3%, 2%, 1%, or 0%) of the editing at the on-target locus. In some embodiments, the editing of the variant Cas12i4 polypeptide (e.g., a variant Cas12i4 polypeptide comprising any one of the one or more substitutions set forth in table 4 and/or table 5 and/or table 6 and/or table 7 and/or table 8 and/or table 9 and/or table 10 and/or table 11) at the off-target locus is no more than 5% (e.g., 5%, 4%, 3%, 2%, 1% or 0%) of the editing at the on-target locus. By comparison, the editing of SpCas9 at the off-target locus is as high as 95% (e.g., 95%, 94%, 93%, 92%, 91%, 90%, 89%, 88%, 87%, 86%, 85%, 84%, 83%, 82%, 81%, 80%, 79%, 78%, 77%, 76%, 75%, 74%, 73%, 72%, 71%, 70%, 69%, 68%, 67%, 66%, 65%, 64%, 63%, 62%, 61%, 60%, 59%, 58%, 57%, 56%, 55%, 54%, 53%, 52%, 51%, 50%, 49%, 48%, 47%, 46%, 45%, 44%, 43%, 42%, 41%, 40%, 39%, 38%, 37%, 36%, 35%, 34%, 33%, 32%, 31%, 30%, 29%, 28%, 27%, 26%, 25%, 24%, 23%, 22%, 21%, 20%, 19%, 18%, 17%, 16%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, or 0%) of the editing at the on-target locus.
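As a non-limiting illustration, the thresholds above compare activity or editing at an off-target locus with that at the on-target locus as a percentage. The following minimal sketch assumes both values are measured in the same units (e.g., indel frequency); the function names and the example values are hypothetical and are not data from this disclosure.

```python
def off_target_percent_of_on_target(on_target: float, off_target: float) -> float:
    """Express off-target activity as a percentage of on-target activity."""
    if on_target <= 0:
        raise ValueError("on-target activity must be positive")
    if off_target < 0:
        raise ValueError("off-target activity cannot be negative")
    return 100.0 * off_target / on_target


def within_limit(on_target: float, off_target: float, limit_percent: float = 10.0) -> bool:
    """True if off-target activity does not exceed limit_percent of on-target activity."""
    return off_target_percent_of_on_target(on_target, off_target) <= limit_percent


# Hypothetical indel frequencies (percent of reads), for illustration only.
print(within_limit(on_target=60.0, off_target=3.0))        # checked against the 10% limit
print(within_limit(60.0, 3.0, limit_percent=5.0))          # checked against the 5% limit
```

The same comparison applies whether the measured quantity is enzymatic activity, editing efficiency, or editing, provided the on-target and off-target values are obtained with the same assay.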
RNA guide
In some embodiments, a composition or complex as described herein comprises a targeting moiety (e.g., an RNA guide, an antisense oligonucleotide, a peptide oligonucleotide conjugate) that binds to a target nucleic acid and interacts with a Cas12i4 polypeptide (e.g., a parent polypeptide or a variant Cas12i4 polypeptide). The targeting moiety can bind to the target nucleic acid (e.g., utilizing a specific binding affinity for the target nucleic acid).
In some embodiments, the targeting moiety comprises or is an RNA guide. In some embodiments, the RNA guide directs the Cas12i4 polypeptide (e.g., the parent polypeptide or the variant Cas12i4 polypeptide) to a particular nucleic acid sequence. Those of skill in the art reading the following examples of specific classes of RNA guides will understand that in some embodiments, the RNA guides are site-specific. That is, in some embodiments, the RNA guide specifically associates with one or more target nucleic acid sequences (e.g., specific DNA or genomic DNA sequences) but not with non-target nucleic acid sequences (e.g., non-specific DNA or random sequences).
In some embodiments, the compositions as described herein comprise an RNA guide that associates with a Cas12i4 polypeptide (e.g., a parent polypeptide or a variant Cas12i4 polypeptide) and directs the Cas12i4 polypeptide to a target nucleic acid sequence (e.g., DNA).
The RNA guide can target (e.g., associate, direct, contact, or bind) one or more nucleotides of a target sequence (e.g., a site-specific sequence or site-specific target). In some embodiments, a nucleoprotein (e.g., a parent polypeptide or variant Cas12i4 polypeptide plus an RNA guide) is activated upon binding to a target nucleic acid (e.g., a sequence-specific substrate or target nucleic acid) that is complementary to a DNA targeting sequence in the RNA guide.
In some embodiments, the RNA guide comprises a spacer having a length of about 11 nucleotides to about 100 nucleotides. For example, the DNA targeting segment can have a length of about 11 nucleotides to about 80 nucleotides, about 11 nucleotides to about 50 nucleotides, about 11 nucleotides to about 40 nucleotides, about 11 nucleotides to about 30 nucleotides, about 11 nucleotides to about 25 nucleotides, about 11 nucleotides to about 20 nucleotides, or about 11 nucleotides to about 19 nucleotides. For example, the spacer can have a length of about 19 nucleotides to about 20 nucleotides, about 19 nucleotides to about 25 nucleotides, about 19 nucleotides to about 30 nucleotides, about 19 nucleotides to about 35 nucleotides, about 19 nucleotides to about 40 nucleotides, about 19 nucleotides to about 45 nucleotides, about 19 nucleotides to about 50 nucleotides, about 19 nucleotides to about 60 nucleotides, about 19 nucleotides to about 70 nucleotides, about 19 nucleotides to about 80 nucleotides, about 19 nucleotides to about 90 nucleotides, about 19 nucleotides to about 100 nucleotides, about 20 nucleotides to about 25 nucleotides, about 20 nucleotides to about 30 nucleotides, about 20 nucleotides to about 35 nucleotides, about 20 nucleotides to about 40 nucleotides, about 20 nucleotides to about 45 nucleotides, about 20 nucleotides to about 50 nucleotides, about 20 nucleotides to about 60 nucleotides, about 20 nucleotides to about 70 nucleotides, about 20 nucleotides to about 80 nucleotides, about 20 nucleotides to about 90 nucleotides, or about 20 nucleotides to about 100 nucleotides.
In some embodiments, the spacer of the RNA guide can be generally designed to have a length of between 11 and 50 nucleotides (e.g., 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 nucleotides) and be complementary to a particular target nucleic acid sequence. In some particular embodiments, the RNA guide can be designed to be complementary to a particular DNA strand, e.g., a genomic locus. In some embodiments, the DNA targeting sequence is designed to be complementary to a particular DNA strand, e.g., a genomic locus.
The RNA guide can be substantially identical to the complementary strand of the reference nucleic acid sequence. In some embodiments, the RNA guide comprises a sequence having at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 99.5% sequence identity to the complementary strand of a reference nucleic acid sequence (e.g., a target nucleic acid). The percent identity between two such nucleic acids can be determined manually by examining the two optimally aligned nucleic acid sequences, or by using software programs or algorithms (e.g., BLAST, ALIGN, CLUSTAL) with standard parameters.
In some embodiments, the RNA guide has at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 99.5% sequence identity to the complementary strand of the target nucleic acid.
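By way of illustration only, percent sequence identity for two sequences that have already been optimally aligned can be computed by counting matching positions. The sketch below assumes an ungapped alignment of equal-length sequences; it is not a substitute for BLAST, ALIGN, or CLUSTAL, which also handle gaps and scoring, and the function name is hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions at which two equal-length sequences match."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# A 20-nucleotide example with a single mismatch at the last position.
print(percent_identity("GACUUCAGGCUAAGCUAGCA", "GACUUCAGGCUAAGCUAGCU"))
```

With a 20-nucleotide sequence, each mismatch lowers identity by 5 percentage points, so thresholds such as at least about 95% or at least about 90% correspond to at most one or two mismatches, respectively.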
In some embodiments, the RNA guide comprises a spacer that is between 11 and 50 nucleotides in length (e.g., 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 nucleotides) and is at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% complementary to the target nucleic acid. In some embodiments, the RNA guide comprises a sequence that is at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% complementary to the target DNA sequence. In some embodiments, the RNA guide comprises a sequence that is at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% complementary to the target genomic sequence. In some embodiments, the RNA guide comprises a sequence (e.g., an RNA sequence) that is up to 50 nucleotides in length and at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% complementary to the target nucleic acid. In some embodiments, the RNA guide comprises a sequence that is at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% complementary to the target DNA sequence. In some embodiments, the RNA guide comprises a sequence that is at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% complementary to the target genomic sequence.
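As a non-limiting sketch, the length and complementarity criteria above can be checked together for a candidate spacer. The code assumes that only Watson-Crick pairs count as complementary (G-U wobble pairs are ignored), that both sequences are written 5' to 3', and that the spacer and target strand are the same length; all names and the example spacer are hypothetical.

```python
# Watson-Crick pairs between an RNA base (spacer) and a DNA base (target strand).
RNA_DNA_PAIRS = {("A", "T"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(spacer_rna: str, target_dna: str) -> float:
    """Percent of spacer positions that base-pair with the target strand.

    Both sequences are written 5'->3'. Pairing is antiparallel, so the
    target strand is read in reverse against the spacer.
    """
    if not spacer_rna or len(spacer_rna) != len(target_dna):
        raise ValueError("spacer and target must be non-empty and equal in length")
    paired = sum(
        (r, d) in RNA_DNA_PAIRS
        for r, d in zip(spacer_rna.upper(), reversed(target_dna.upper()))
    )
    return 100.0 * paired / len(spacer_rna)


def spacer_is_acceptable(spacer_rna: str, target_dna: str,
                         min_len: int = 11, max_len: int = 50,
                         min_percent: float = 80.0) -> bool:
    """Check spacer length (11-50 nt by default) and minimum complementarity."""
    return (min_len <= len(spacer_rna) <= max_len
            and percent_complementarity(spacer_rna, target_dna) >= min_percent)


# Hypothetical 20-nt spacer and its fully complementary DNA target strand.
spacer = "GACUUCAGGCUAAGCUAGCA"
target = "TGCTAGCTTAGCCTGAAGTC"
print(percent_complementarity(spacer, target))
print(spacer_is_acceptable(spacer, target, min_percent=95.0))
```

The default thresholds mirror the broadest ranges recited above; stricter embodiments can be modeled by raising `min_percent` or narrowing the length bounds.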
In certain embodiments, the RNA guide comprises, consists essentially of, or consists of a direct repeat sequence linked to a DNA targeting sequence. In some embodiments, the RNA guide comprises a direct repeat sequence and a DNA targeting sequence, or a direct repeat sequence-DNA targeting sequence-direct repeat sequence. In some embodiments, the RNA guide includes a truncated direct repeat sequence and a DNA targeting sequence, which is typical of processed or mature crRNAs. In some embodiments, the Cas12i4 polypeptide (e.g., the parent polypeptide or variant Cas12i4 polypeptide) forms a complex with the RNA guide, and the RNA guide directs the complex to associate with a site-specific target nucleic acid that is complementary to at least a portion of the RNA guide.
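For illustration, the two arrangements described above (direct repeat plus DNA targeting sequence, or direct repeat, DNA targeting sequence, direct repeat) can be assembled by simple concatenation. The sketch below assumes the direct repeat is placed 5' of the spacer, which is an assumption not stated in this passage, and uses the sequence of SEQ ID NO: 61 from table 12 as a default; the function name and example spacer are hypothetical.

```python
RNA_BASES = set("ACGU")
MATURE_DR = "AGACAUGUGUCCUCAGUGACAC"  # SEQ ID NO: 61 (table 12)


def build_rna_guide(spacer: str, direct_repeat: str = MATURE_DR,
                    flank_both: bool = False) -> str:
    """Concatenate a direct repeat and a spacer into an RNA guide (5'->3').

    flank_both=False gives direct repeat + spacer;
    flank_both=True gives direct repeat + spacer + direct repeat.
    """
    spacer = spacer.upper()
    direct_repeat = direct_repeat.upper()
    for name, seq in (("spacer", spacer), ("direct repeat", direct_repeat)):
        if not seq or set(seq) - RNA_BASES:
            raise ValueError(f"{name} must be a non-empty RNA sequence (A, C, G, U)")
    guide = direct_repeat + spacer
    if flank_both:
        guide += direct_repeat
    return guide


# Hypothetical 20-nt spacer, for illustration only.
print(build_rna_guide("GACUUCAGGCUAAGCUAGCA"))
```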
In some embodiments, the length of the direct repeat sequence of the RNA guide is between 12-100, 13-75, 14-50, or 15-40 nucleotides (e.g., 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 nucleotides).
In some embodiments, the direct repeat sequence is a sequence of table 12 or a portion of a sequence of table 12. The direct repeat sequence may comprise the nucleotides running from any one of nucleotide 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14 to nucleotide 36 of any one of SEQ ID NOs: 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124.
In some embodiments, the direct repeat sequence has at least 95% identity (e.g., at least 95%, 96%, 97%, 98%, or 99% identity) to a sequence of table 12 or a portion of a sequence of table 12. The direct repeat sequence may have at least 95% identity to a sequence comprising the nucleotides running from any one of nucleotide 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or 13 to nucleotide 36 of any one of SEQ ID NOs: 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124.
In some embodiments, the direct repeat sequence has at least 90% identity (e.g., at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity) to a sequence of table 12 or a portion of a sequence of table 12. The direct repeat sequence may have at least 90% identity to a sequence comprising the nucleotides running from any one of nucleotide 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or 13 to nucleotide 36 of any one of SEQ ID NOs: 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124.
In some embodiments, the direct repeat sequence has at least 90% identity to the reverse complement of any one of SEQ ID NOs: 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124. In some embodiments, the direct repeat sequence has at least 95% identity to the reverse complement of any one of SEQ ID NOs: 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124. In some embodiments, the direct repeat sequence is the reverse complement of any one of SEQ ID NOs: 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124.
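As an illustrative sketch, the reverse complement of an RNA sequence is obtained by complementing each base (A with U, C with G) and reversing the result. The function name below is hypothetical, and the example input is the sequence listed as SEQ ID NO: 61 in table 12.

```python
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    seq = seq.upper()
    if set(seq) - set("ACGU"):
        raise ValueError("sequence must contain only A, C, G, U")
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement_rna("AGACAUGUGUCCUCAGUGACAC"))
```

Applying the function twice returns the original sequence, which is a convenient sanity check when preparing reverse-complement variants of the sequences in table 12.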
Table 12. Direct repeat sequences.
Sequence identifier Direct repeat sequence
SEQ ID NO:60 UCUCAACGAUAGUCAGACAUGUGUCCUCAGUGACAC
SEQ ID NO:108 UUUUAACAACACUCAGGCAUGUGUCCACAGUGACAC
SEQ ID NO:109 UUGAACGGAUACUCAGACAUGUGUUUCCAGUGACAC
SEQ ID NO:110 UGCCCUCAAUAGUCAGAUGUGUGUCCACAGUGACAC
SEQ ID NO:111 UCUCAAUGAUACUUAGAUACGUGUCCUCAGUGACAC
SEQ ID NO:112 UCUCAAUGAUACUCAGACAUGUGUCCCCAGUGACAC
SEQ ID NO:113 UCUCAAUGAUACUAAGACAUGUGUCCUCAGUGACAC
SEQ ID NO:114 UCUCAACUAUACUCAGACAUGUGUCCUCAGUGACAC
SEQ ID NO:115 UCUCAACGAUACUCAGACAUGUGUCCUCAGUGACAC
SEQ ID NO:116 UCUCAACGAUACUAAGAUAUGUGUCCUCAGCGACAC
SEQ ID NO:117 UCUCAACGAUACUAAGAUAUGUGUCCCCAGUGACAC
SEQ ID NO:118 UCUCAACGAUACUAAGAUAUGUGUCCACAGUGACAC
SEQ ID NO:119 UCUCAACAAUACUCAGACAUGUGUCCCCAGUGACAC
SEQ ID NO:120 UCUCAACAAUACUAAGGCAUGUGUCCCCAGUGACCC
SEQ ID NO:121 UCUCAAAGAUACUCAGACACGUGUCCCCAGUGACAC
SEQ ID NO:122 UCUCAAAAAUACUCAGACAUGUGUCCUCAGUGACAC
SEQ ID NO:123 GCGAAACAACAGUCAGACAUGUGUCCCCAGUGACAC
SEQ ID NO:124 CCUCAACGAUAUUAAGACAUGUGUCCGCAGUGACAC
SEQ ID NO:61 AGACAUGUGUCCUCAGUGACAC
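The sub-range, reverse-complement, and percent-identity language used for the repeat sequences of Table 12 can be illustrated with a minimal Python sketch. The helper names are hypothetical, and the ungapped, position-by-position identity metric is an assumption made for illustration only; this passage does not specify an alignment method.

```python
# Illustrative helpers only. The ungapped identity metric is an assumption;
# the text does not state how percent identity is to be computed.
COMPLEMENT = str.maketrans("ACGU", "UGCA")

SEQ_ID_NO_60 = "UCUCAACGAUAGUCAGACAUGUGUCCUCAGUGACAC"
SEQ_ID_NO_115 = "UCUCAACGAUACUCAGACAUGUGUCCUCAGUGACAC"
SEQ_ID_NO_61 = "AGACAUGUGUCCUCAGUGACAC"


def subsequence(seq: str, start: int, end: int) -> str:
    """Return nucleotides start..end, 1-based and inclusive
    (e.g. 'nucleotide 8 to nucleotide 36')."""
    return seq[start - 1:end]


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence written with A, C, G, U."""
    return rna.translate(COMPLEMENT)[::-1]


def percent_identity(a: str, b: str) -> float:
    """Percentage of matching positions between two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    print(subsequence(SEQ_ID_NO_60, 8, 36))
    print(reverse_complement(SEQ_ID_NO_60))
    print(round(percent_identity(SEQ_ID_NO_60, SEQ_ID_NO_115), 1))
```

Under this metric, SEQ ID NO: 60 and SEQ ID NO: 115 differ at a single position (nucleotide 12), and nucleotides 15 to 36 of SEQ ID NO: 60 are identical to SEQ ID NO: 61.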
In some embodiments, the direct repeat sequence is AGN1N2N3N4GUGUN5N6N7CAGN8GACN9C (SEQ ID NO: 125), wherein N1 is A or G, N2 is C or U, N3 is A or G, N4 is U or C, N5 is C or U, N6 is C or U, N7 is U, A, C, or G, N8 is U or C, and N9 is A or C. In some embodiments, the direct repeat sequence of SEQ ID NO: 125 is referred to as the Cas12i4 mature DR.
In some embodiments, the direct repeat sequence has at least 90% identity to SEQ ID NO: 61 or a portion of SEQ ID NO: 61. In some embodiments, the direct repeat sequence has at least 95% identity to SEQ ID NO: 61 or a portion of SEQ ID NO: 61. In some embodiments, the direct repeat sequence has 100% identity to SEQ ID NO: 61 or a portion of SEQ ID NO: 61. In some embodiments, the direct repeat sequence of SEQ ID NO: 61 is referred to as the Cas12i4 mature DR.
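The degenerate 22-nucleotide sequence of SEQ ID NO: 125 can be expressed as a regular expression, which makes it easy to check whether a candidate repeat fits the pattern. The following Python sketch is illustrative only; the function and variable names are hypothetical and are not part of the disclosure.

```python
import re

# Bases permitted at positions N1..N9 of SEQ ID NO: 125, as listed in the text.
N_OPTIONS = {1: "AG", 2: "CU", 3: "AG", 4: "UC", 5: "CU",
             6: "CU", 7: "UACG", 8: "UC", 9: "AC"}


def build_mature_dr_pattern() -> "re.Pattern[str]":
    """Assemble AG-N1-N2-N3-N4-GUGU-N5-N6-N7-CAG-N8-GAC-N9-C as a regex."""
    n = {k: f"[{v}]" for k, v in N_OPTIONS.items()}
    body = ("AG" + n[1] + n[2] + n[3] + n[4] + "GUGU" + n[5] + n[6] + n[7]
            + "CAG" + n[8] + "GAC" + n[9] + "C")
    return re.compile(body)


MATURE_DR = build_mature_dr_pattern()


def matches_mature_dr(seq: str) -> bool:
    """True if the whole sequence fits the SEQ ID NO: 125 pattern."""
    return MATURE_DR.fullmatch(seq) is not None


if __name__ == "__main__":
    seq_id_no_61 = "AGACAUGUGUCCUCAGUGACAC"
    seq_id_no_120 = "UCUCAACAAUACUAAGGCAUGUGUCCCCAGUGACCC"
    print(matches_mature_dr(seq_id_no_61))
    print(matches_mature_dr(seq_id_no_120[-22:]))
```

SEQ ID NO: 61 fits the pattern, as do the 22 nucleotides at the 3' end of SEQ ID NO: 120 from Table 12.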
In some embodiments, a composition or complex described herein includes one or more (e.g., two, three, four, five, six, seven, eight, or more) RNA guides, such as a plurality of RNA guides.
In some embodiments, the RNA guide has an architecture similar to that described in, for example, International Publication Nos. WO 2014/093622 and WO 2015/070083, the entire contents of each of which are incorporated herein by reference.
Unless otherwise indicated, all compositions and complexes and polypeptides provided herein are made with reference to the level of activity of the composition or complex or polypeptide, and do not include impurities, such as residual solvents or byproducts that may be present in commercial sources. The enzyme component weight is based on total active protein. All percentages and ratios are by weight unless otherwise indicated. All percentages and ratios are calculated based on the total composition unless otherwise indicated. In the exemplary compositions, enzyme levels are expressed as pure enzymes by weight of the total composition, and ingredients are expressed as weight of the total composition unless otherwise indicated.
Modification
The RNA guide or any nucleic acid sequence encoding the Cas12i4 polypeptide may comprise one or more covalent modifications with respect to the reference sequence, in particular the parent polyribonucleotide, which are included within the scope of the invention.
Exemplary modifications may include any modification to a sugar, a nucleobase, or an internucleoside linkage (e.g., to a linking phosphate, to a phosphodiester linkage, or to the phosphodiester backbone), and any combination thereof. Some exemplary modifications provided herein are described in detail below.
The RNA guide or any nucleic acid sequence encoding a component of the Cas12i4 polypeptides described herein can include any useful modification, such as a modification to a sugar, a nucleobase, or an internucleoside linkage (e.g., to a linking phosphate, to a phosphodiester linkage, or to the phosphodiester backbone). One or more atoms of a pyrimidine nucleobase may be replaced or substituted with an optionally substituted amino group, an optionally substituted thiol, an optionally substituted alkyl (e.g., methyl or ethyl), or a halo (e.g., chloro or fluoro). In certain embodiments, a modification (e.g., one or more modifications) is present in each of the sugar and the internucleoside linkage. The modification may be a modification of ribonucleic acid (RNA) to deoxyribonucleic acid (DNA), threose nucleic acid (TNA), glycol nucleic acid (GNA), peptide nucleic acid (PNA), locked nucleic acid (LNA), or hybrids thereof. Additional modifications are described herein.
In some embodiments, the modification may include a chemical or cell-induced modification. For example, some non-limiting examples of intracellular RNA modifications are described by Lewis and Pan in "RNA modifications and structures cooperate to guide RNA-protein interactions", Nat Reviews Mol Cell Biol, 2017, 18:202-210.
Different sugar modifications, nucleotide modifications, and/or internucleoside linkages (e.g., backbone structures) may be present at different positions in the sequence. One of ordinary skill in the art will appreciate that nucleotide analogs or one or more other modifications may be located at any one or more positions in the sequence such that the function of the sequence is not substantially reduced. The sequence may include about 1% to about 100% modified nucleotides (relative to the total nucleotide content, or relative to one or more types of nucleotides, i.e., any one or more of A, G, U or C) or any intervening percentages (e.g., 1% to 20%, 1% to 25%, 1% to 50%, 1% to 60%, 1% to 70%, 1% to 80%, 1% to 90%, 1% to 95%, 10% to 20%, 10% to 25%, 10% to 50%, 10% to 60%, 10% to 70%, 10% to 80%, 10% to 90%, 10% to 95%, 10% to 100%, 20% to 25%, 20% to 50%, 20% to 60%, 20% to 70%, 20% to 80%, 20% to 90%, 20% to 95%, 20% to 100%, 50% to 60%, 50% to 70%, 50% to 80%, 50% to 95%, 50% to 100%, 70% to 80%, 70% to 90%, 70% to 95%, 80% to 80%, 80% to 90%, 80% to 95%, and 100% to 95%).
In some embodiments, sugar modifications (e.g., at the 2' position or the 4' position) or replacement of the sugar at one or more ribonucleotides of the sequence may, like backbone modifications, include modification or replacement of the phosphodiester linkages. Specific examples of sequences include, but are not limited to, sequences that include a modified backbone or non-natural internucleoside linkages (e.g., internucleoside modifications, including modification or replacement of the phosphodiester linkages). Sequences having modified backbones include, inter alia, those that do not have a phosphorus atom in the backbone. For the purposes of the present application, and as sometimes referenced in the art, modified RNAs that do not have a phosphorus atom in their internucleoside backbone can also be considered oligonucleosides. In particular embodiments, the sequence will include ribonucleotides with a phosphorus atom in their internucleoside backbone.
Modified sequence backbones can include, for example, phosphorothioates, chiral phosphorothioates, phosphorodithioates, phosphotriesters, aminoalkylphosphotriesters, methyl and other alkyl phosphonates (such as 3'-alkylene phosphonates and chiral phosphonates), phosphinates, phosphoramidates (such as 3'-amino phosphoramidates and aminoalkylphosphoramidates), thionophosphoramidates, thionoalkylphosphonates, thionoalkylphosphotriesters, and boranophosphates having normal 3'-5' linkages, 2'-5' linked analogs of these, and those having inverted polarity, wherein adjacent pairs of nucleoside units are linked 3'-5' to 5'-3' or 2'-5' to 5'-2'. Various salts, mixed salts, and free acid forms are also included. In some embodiments, the sequence may be negatively or positively charged.
Modified nucleotides that can be incorporated into the sequence can be modified on the internucleoside linkage (e.g., the phosphate backbone). Herein, the terms "phosphate" and "phosphodiester" are used interchangeably in the context of a polynucleotide backbone. Backbone phosphate groups can be modified by replacing one or more of the oxygen atoms with a different substituent. In addition, modified nucleosides and nucleotides can include the wholesale replacement of an unmodified phosphate moiety with another internucleoside linkage as described herein. Examples of modified phosphate groups include, but are not limited to, phosphorothioates, phosphoroselenates, boranophosphates, boranophosphate esters, hydrogen phosphonates, phosphoramidates, phosphorodiamidates, alkyl or aryl phosphonates, and phosphotriesters. Phosphorodithioates have both non-linking oxygens replaced by sulfur. The phosphate linker can also be modified by replacing a linking oxygen with nitrogen (bridged phosphoramidates), sulfur (bridged phosphorothioates), or carbon (bridged methylene-phosphonates).
Alpha-thio substituted phosphate moieties are provided to impart stability to RNA and DNA polymers through non-natural phosphorothioate backbone linkages. Phosphorothioate DNA and RNA have enhanced nuclease resistance and therefore have a longer half-life in the cellular environment.
In particular embodiments, the modified nucleoside includes an α-thio-nucleoside (e.g., 5'-O-(1-phosphorothioate)-adenosine, 5'-O-(1-phosphorothioate)-cytidine (α-thio-cytidine), 5'-O-(1-phosphorothioate)-guanosine, 5'-O-(1-phosphorothioate)-uridine, or 5'-O-(1-phosphorothioate)-pseudouridine).
Other internucleoside linkages, including those that do not contain a phosphorus atom, that can be used in accordance with the present invention are described herein.
In some embodiments, the sequence may include one or more cytotoxic nucleosides. For example, cytotoxic nucleosides can be incorporated into the sequence, such as by bifunctional modification. Cytotoxic nucleosides can include, but are not limited to, adenosine arabinoside, 5-azacytidine, 4'-thio-aracytidine, cyclopentenylcytosine, cladribine, clofarabine, cytarabine, cytosine arabinoside, 1-(2-C-cyano-2-deoxy-β-D-arabino-pentofuranosyl)-cytosine, decitabine, 5-fluorouracil, fludarabine, floxuridine, gemcitabine, a combination of tegafur and uracil, tegafur ((RS)-5-fluoro-1-(tetrahydrofuran-2-yl)pyrimidine-2,4(1H,3H)-dione), troxacitabine, tezacitabine, 2'-deoxy-2'-methylidenecytidine (DMDC), and 6-mercaptopurine. Other examples include fludarabine phosphate, N4-behenoyl-1-β-D-arabinofuranosylcytosine, N4-octadecyl-1-β-D-arabinofuranosylcytosine, N4-palmitoyl-1-(2-C-cyano-2-deoxy-β-D-arabino-pentofuranosyl) cytosine, and P-4055 (cytarabine 5'-elaidic acid ester).
In some embodiments, the sequence includes one or more post-transcriptional modifications (e.g., capping, cleavage, polyadenylation, splicing, poly-A sequence, methylation, acylation, phosphorylation, methylation of lysine and arginine residues, acetylation, and nitrosylation of thiol and tyrosine residues, etc.). The one or more post-transcriptional modifications may be any post-transcriptional modification, such as any of the more than one hundred different nucleoside modifications that have been identified in RNA (Rozenski, J., Crain, P., and McCloskey, J. (1999) The RNA Modification Database: 1999 update. Nucl Acids Res 27:196-197). In some embodiments, the first isolated nucleic acid comprises messenger RNA (mRNA). In some embodiments, the mRNA comprises at least one nucleoside selected from the group consisting of: pyridin-4-one ribonucleoside, 5-aza-uridine, 2-thio-uridine, 4-thio-pseudouridine, 2-thio-pseudouridine, 5-hydroxy-uridine, 3-methyluridine, 5-carboxymethyl-uridine, 1-carboxymethyl-pseudouridine, 5-propynyl-uridine, 1-propynyl-pseudouridine, 5-taurinomethyl-uridine, 1-taurinomethyl-pseudouridine, 5-taurinomethyl-2-thio-uridine, 1-taurinomethyl-4-thio-uridine, 5-methyl-uridine, 1-methyl-pseudouridine, 4-thio-1-methyl-pseudouridine, 2-thio-1-methyl-pseudouridine, 1-methyl-1-deaza-pseudouridine, 2-thio-1-methyl-1-deaza-pseudouridine, dihydrouridine, 2-thio-uridine, 2-dihydro-pseudouridine, 2-methoxy-4-thio-uridine and pseudouridine.
In some embodiments, the mRNA comprises at least one nucleoside selected from the group consisting of: 5-aza-cytidine, pseudoiso-cytidine, 3-methyl-cytidine, N4-acetyl-cytidine, 5-formyl-cytidine, N4-methylcytidine, 5-hydroxymethyl cytidine, 1-methyl-pseudoiso-cytidine, pyrrolo-pseudoiso-cytidine, 2-thio-5-methyl-cytidine, 4-thio-pseudoiso-cytidine, 4-thio-1-methyl-deaza-pseudoiso-cytidine, 1-methyl-1-deaza-pseudoiso-cytidine, zebularine, 5-aza-zebularine, 5-methyl-zebularine, 5-aza-2-thio-zebularine, 2-methoxy-5-methyl-cytidine, 4-methoxy-pseudoiso-cytidine, and 4-methyl-pseudoiso-cytidine. In some embodiments, the mRNA comprises at least one nucleoside selected from the group consisting of: 2-aminopurine, 2,6-diaminopurine, 7-deaza-adenine, 7-deaza-8-aza-adenine, 7-deaza-2-aminopurine, 7-deaza-8-aza-2-aminopurine, 7-deaza-2,6-diaminopurine, 7-deaza-8-aza-2,6-diaminopurine, 1-methyladenosine, N6-isopentenyladenosine, N6-(cis-hydroxyisopentenyl)adenosine, 2-methylthio-N6-(cis-hydroxyisopentenyl)adenosine, N6-glycinylcarbamoyladenosine, N6-threonylcarbamoyladenosine, 2-methylthio-N6-threonylcarbamoyladenosine, N6,N6-dimethyladenosine, 7-methyladenosine, 2-methylthio-adenine and 2-methoxy-adenine. In some embodiments, the mRNA comprises at least one nucleoside selected from the group consisting of: inosine, 1-methyl-inosine, wyosine, wybutosine, 7-deaza-guanosine, 7-deaza-8-aza-guanosine, 6-thio-7-deaza-8-aza-guanosine, 7-methyl-guanosine, 6-thio-7-methyl-guanosine, 7-methyl-inosine, 6-methoxy-guanosine, 1-methylguanosine, N2-dimethyl guanosine, 8-oxo-guanosine, 7-methyl-8-oxo-guanosine, 1-methyl-6-thio-guanosine, N2-methyl-6-thio-guanosine, and N2,N2-dimethyl-6-thio-guanosine.
The sequence may or may not be uniformly modified along the entire length of the molecule. For example, one or more or all types of nucleotide (e.g., naturally occurring nucleotides, purines or pyrimidines, or any one or more or all of A, G, U, C, I, and pU) may or may not be uniformly modified in the sequence or in a given predetermined sequence region thereof. In some embodiments, the sequence comprises pseudouridine. In some embodiments, the sequence includes inosine, which may help the immune system characterize the sequence as endogenous rather than viral RNA. The incorporation of inosine may also mediate improved RNA stability and reduced degradation. See, e.g., Yu, Z. et al. (2015) RNA editing by ADAR1 marks dsRNA as "self".
Target nucleic acid
The methods disclosed herein are applicable to a variety of target nucleic acids. In some embodiments, the target nucleic acid is DNA, e.g., a DNA locus. In some embodiments, the target nucleic acid is RNA, e.g., an RNA locus or mRNA. In some embodiments, the target nucleic acid is single-stranded (e.g., single-stranded DNA). In some embodiments, the target nucleic acid is double-stranded (e.g., double-stranded DNA). In some embodiments, the target nucleic acid comprises both a single-stranded region and a double-stranded region. In some embodiments, the target nucleic acid is linear. In some embodiments, the target nucleic acid is circular. In some embodiments, the target nucleic acid comprises one or more modified nucleotides, such as methylated nucleotides, damaged nucleotides, or nucleotide analogs. In some embodiments, the target nucleic acid is unmodified.
The target nucleic acid may have any length, for example, at least about any one of 100 bp, 200 bp, 500 bp, 1000 bp, 2000 bp, 5000 bp, 10 kb, 20 kb, 50 kb, 100 kb, 200 kb, 500 kb, 1 Mb, or more. The target nucleic acid may also comprise any sequence. In some embodiments, the target nucleic acid is GC-rich, such as having a GC content of at least about any one of 40%, 45%, 50%, 55%, 60%, 65%, or higher. In some embodiments, the target nucleic acid has a GC content of at least about 70%, 80%, or higher. In some embodiments, the target nucleic acid is a GC-rich fragment of a non-GC-rich target nucleic acid. In some embodiments, the target nucleic acid is not GC-rich. In some embodiments, the target nucleic acid has one or more secondary or higher-order structures. In some embodiments, the target nucleic acid is in a condensed state, such as in chromatin, such that the target nucleic acid is not readily accessible to the Cas12i4 polypeptide/RNA guide complex.
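As an illustration of the GC-content thresholds mentioned above, a short Python sketch follows. The function names are hypothetical, and the default 40% cutoff is simply the lowest "GC-rich" value named in the text, chosen here as an adjustable parameter rather than a definition.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a DNA or RNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def is_gc_rich(seq: str, threshold: float = 40.0) -> bool:
    """True if GC content meets the chosen threshold (an assumed cutoff)."""
    return gc_content(seq) >= threshold


if __name__ == "__main__":
    for s in ("ATGCATGC", "ATATATAT", "GGCCGGCC"):
        print(s, gc_content(s), is_gc_rich(s))
```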
In some embodiments, the target nucleic acid is present in a cell. In some embodiments, the target nucleic acid is present in the nucleus. In some embodiments, the target nucleic acid is endogenous to the cell. In some embodiments, the target nucleic acid is genomic DNA. In some embodiments, the target nucleic acid is chromosomal DNA. In some embodiments, the target nucleic acid is a protein-encoding gene or a functional region thereof (such as a coding region) or regulatory element (such as a promoter, enhancer, 5 'or 3' untranslated region, etc.). In some embodiments, the target nucleic acid is a non-coding gene, such as a transposon, miRNA, tRNA, ribosomal RNA, ribozyme, or lincRNA. In some embodiments, the target nucleic acid is a plasmid.
In some embodiments, the target nucleic acid is exogenous to the cell. In some embodiments, the target nucleic acid is a viral nucleic acid, such as viral DNA or viral RNA. In some embodiments, the target nucleic acid is a horizontally transferred plasmid. In some embodiments, the target nucleic acid is integrated in the genome of the cell. In some embodiments, the target nucleic acid is not integrated in the genome of the cell. In some embodiments, the target nucleic acid is a plasmid in the cell. In some embodiments, the target nucleic acid is present in an extrachromosomal array.
In some embodiments, the target nucleic acid is an isolated nucleic acid, such as an isolated DNA or an isolated RNA. In some embodiments, the target nucleic acid is present in a cell-free environment. In some embodiments, the target nucleic acid is an isolated vector, such as a plasmid. In some embodiments, the target nucleic acid is an ultrapure plasmid.
The target sequence is a segment of the target nucleic acid that hybridizes to the RNA guide. In some embodiments, the target nucleic acid has only one copy of the target sequence. In some embodiments, the target nucleic acid has more than one copy, such as at least about any one of 2, 3, 4, 5, 10, 100, or more copies, of the target sequence. For example, a target sequence comprising a repeat sequence in the genome of a virus or bacterium may be targeted by the nucleoprotein.
The target sequence is adjacent to a protospacer adjacent motif (PAM) of the present disclosure, as described herein. The PAM may be immediately adjacent to the target sequence or, for example, within a small number (e.g., 1, 2, 3, 4, or 5) of nucleotides of the target sequence. In the case of a double-stranded target, the targeting moiety (e.g., RNA guide) binds to a first strand of the target, and the PAM sequence as described herein is present in the second, complementary strand. In this case, the PAM sequence is immediately adjacent to (or within a small number, e.g., 1, 2, 3, 4, or 5 nucleotides, of) the sequence in the second strand that is complementary to the sequence in the first strand to which the targeting moiety binds.
In some embodiments, sequence specificity requires that the spacer sequence in the RNA guide be perfectly matched to the non-PAM strand of the target nucleic acid. In other embodiments, sequence specificity requires that the spacer sequence in the RNA guide match with a portion (contiguous or non-contiguous) of the non-PAM strand of the target nucleic acid.
In some embodiments, the RNA guides or complexes comprising the RNA guides and Cas12i4 polypeptides described herein bind to a target nucleic acid at a sequence defined by a region of complementarity between the RNA guides and the target sequence. In some embodiments, a PAM sequence described herein is located directly upstream of (e.g., directly 5' of) a target sequence of a target nucleic acid. In some embodiments, a PAM sequence described herein is located directly 5' of a target sequence on a non-spacer complementary strand (e.g., a non-target strand) of a target nucleic acid.
In some embodiments, PAM sequences corresponding to Cas12i4 (e.g., a parent Cas12i4 polypeptide or a variant Cas12i4 polypeptide) include 5'-TTN-3' and 5'-NTTN-3', where N is any nucleotide (e.g., A, G, T, or C). In some embodiments, the PAM sequence comprises 5'-TTH-3', 5'-TTY-3', 5'-TTC-3', 5'-NTTH-3', 5'-NTTY-3', or 5'-NTTC-3', where N is any nucleotide, H is A, C, or T, and Y is C or T. In some embodiments, the PAM sequence comprises 5'-TTA-3', 5'-TTC-3', 5'-TTG-3', 5'-TTT-3', 5'-NTTA-3', 5'-NTTC-3', 5'-NTTG-3', or 5'-NTTT-3'. For example, in some embodiments, the PAM comprises 5'-ATTA-3', 5'-CTTA-3', 5'-TTTC-3', 5'-TTA-3', 5'-GTTA-3', 5'-CTTC-3', 5'-CTTG-3', 5'-TTC-3', 5'-TTTA-3', 5'-GTTC-3', 5'-TTG-3', 5'-ATTC-3', 5'-TTTTT-3', 5'-GTTT-3', 5'-ATTTT-3', 5'-TTT-3', 5'-CTTTT-3', 5'-TTTG-3', or 5'-CTT-3'.
In some embodiments, PAM sequences corresponding to Cas12i4 (e.g., a parent Cas12i4 polypeptide or a variant Cas12i4 polypeptide) include 5'-NTN-3', 5'-NNTN-3', 5'-VTN-3', and 5'-NVTN-3', where N is any nucleotide (e.g., A, G, T, or C) and V is A, G, or C. In some embodiments, the PAM sequence comprises 5'-NTC-3', 5'-NTA-3', 5'-NTG-3', 5'-NTT-3', 5'-NNTC-3', 5'-NNTA-3', 5'-NNTG-3', or 5'-NNTT-3'. For example, in some embodiments, the PAM sequence comprises 5'-AATC-3', 5'-CCTG-3', 5'-CTA-3', 5'-TCTC-3', 5'-CTG-3', 5'-GCTG-3', 5'-CTC-3', 5'-GCTC-3', 5'-TCTG-3', 5'-ACTG-3', 5'-GATA-3', 5'-TATC-3', 5'-ATC-3', 5'-ATA-3', 5'-GATC-3', 5'-ACTA-3', 5'-GATG-3', 5'-TGTG-3', 5'-TCTT-3', 5'-CCTT-3', 5'-GCTT-3', or 5'-ACTT-3'.
In some embodiments, the Cas12i4 polypeptide (e.g., the parent Cas12i4 polypeptide or the variant Cas12i4 polypeptide) recognizes a PAM sequence as shown by 5'-ATAA-3', 5'-CAAT-3', 5'-CGAT-3', 5'-GAGA-3', 5'-CAAG-3', 5'-ACGT-3', 5'-GGCC-3', 5'-GGAC-3', 5'-GGCA-3', 5'-GTAC-3', 5'-GACC-3', or 5'-TTAC-3'.
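The degenerate PAM notation used above (N, H, Y, V) follows the IUPAC nucleotide codes, so a PAM such as 5'-NTTH-3' can be expanded into a regular expression and used to scan a non-target strand for candidate target sequences located directly 3' of the PAM. The Python sketch below is illustrative only: the function names are hypothetical, and the 20-nucleotide target length is an assumed example value, not one taken from this passage.

```python
import re

# Subset of IUPAC nucleotide codes used in the PAM definitions above.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "H": "[ACT]", "Y": "[CT]", "V": "[ACG]"}


def pam_to_regex(pam: str) -> str:
    """Expand a degenerate PAM (e.g. 'NTTH') into a regex character pattern."""
    return "".join(IUPAC[code] for code in pam.upper())


def find_target_sites(non_target_strand: str, pam: str = "TTN",
                      target_len: int = 20):
    """List (pam_start, pam_sequence, target_sequence) tuples.

    The strand is read 5' to 3'; each target is the target_len nucleotides
    directly 3' of a PAM match. target_len=20 is an assumed example value.
    """
    strand = non_target_strand.upper()
    # A lookahead is used so that overlapping PAM matches are all reported.
    pattern = re.compile(f"(?=({pam_to_regex(pam)}))")
    sites = []
    for match in pattern.finditer(strand):
        start = match.start()
        target_start = start + len(pam)
        target = strand[target_start:target_start + target_len]
        if len(target) == target_len:
            sites.append((start, match.group(1), target))
    return sites


if __name__ == "__main__":
    demo = "CCTTA" + "ACGTACGTACGTACGTACGG"
    print(pam_to_regex("NTTH"))
    print(find_target_sites(demo, pam="TTN", target_len=20))
```

The scan covers a single strand only; a fuller search would also examine the reverse complement of the input.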
In some embodiments, the target sequence is present in a readily accessible region of the target nucleic acid. In some embodiments, the target sequence is in an exon of a target gene. In some embodiments, the target sequence spans an exon-intron junction of the target gene. In some embodiments, the target sequence is present in a non-coding region (such as a regulatory region of a gene). In some embodiments, wherein the target nucleic acid is exogenous to the cell, the target sequence comprises a sequence not found in the genome of the cell.
Suitable DNA/RNA binding conditions include physiological conditions that are normally present in cells. Other suitable DNA/RNA binding conditions (e.g., conditions in a cell-free system) are known in the art; see, e.g., Sambrook, supra. The strand of the target nucleic acid that is complementary to and hybridizes with the RNA guide is referred to as the "complementary strand", and the strand of the target nucleic acid that is complementary to the "complementary strand" (and thus not complementary to the RNA guide) is referred to as the "non-complementary strand".
Preparation
In some embodiments, a variant Cas12i4 polypeptide of the invention can be prepared by (a) culturing a bacterium that produces the variant Cas12i4 polypeptide, isolating the variant Cas12i4 polypeptide, optionally purifying it, and complexing it with an RNA guide. The variant Cas12i4 polypeptide can also be prepared by (b) a known genetic engineering technique, specifically by isolating a gene encoding the variant Cas12i4 polypeptide of the invention from the bacterium, constructing a recombinant expression vector, and then transferring the vector into a suitable host cell that expresses the RNA guide, so that the recombinant protein is expressed and complexes with the RNA guide in the host cell. Alternatively, the variant Cas12i4 polypeptide can be prepared by (c) an in vitro coupled transcription-translation system and then complexed with an RNA guide. The bacteria that can be used to prepare the variant Cas12i4 polypeptides of the invention are not particularly limited, as long as they can produce the variant Cas12i4 polypeptides of the invention. Some non-limiting examples of such bacteria include the Escherichia coli (E. coli) cells described herein.
Vectors
The present invention provides vectors for expressing a variant Cas12i4 polypeptide described herein; nucleic acids encoding the variants described herein can be incorporated into such vectors. In some embodiments, the vectors of the invention comprise a nucleotide sequence encoding a variant Cas12i4 polypeptide.
The invention also provides vectors useful for preparing a variant Cas12i4 polypeptide or a composition comprising a variant Cas12i4 polypeptide as described herein. In some embodiments, the invention includes a composition or vector described herein in a cell. In some embodiments, the invention includes methods of expressing, in a cell, a composition comprising a variant Cas12i4 polypeptide, or a vector or nucleic acid encoding a variant Cas12i4 polypeptide. The method can include the steps of providing a composition (e.g., a vector or nucleic acid) and delivering the composition to a cell.
Expression of the natural or synthetic polynucleotide is typically achieved by operably linking a polynucleotide encoding a gene of interest (e.g., a nucleotide sequence encoding a variant Cas12i4 polypeptide) to a promoter and incorporating the construct into an expression vector. The expression vector is not particularly limited as long as it includes a polynucleotide encoding the variant Cas12i4 polypeptide of the invention and may be suitable for replication and integration in eukaryotic cells.
Typical expression vectors include transcription and translation terminators, initiation sequences, and promoters useful for expression of the desired polynucleotide. For example, plasmid vectors carrying a recognition sequence for RNA polymerase (pSP64, pBluescript, etc.) may be used. Vectors, including those derived from retroviruses such as lentiviruses, are suitable tools for achieving long-term gene transfer, since they allow long-term, stable integration of a transgene and its propagation in daughter cells. Examples of vectors include expression vectors, replication vectors, probe generation vectors, and sequencing vectors. The expression vector may be provided to the cell in the form of a viral vector.
Viral vector technology is well known in the art and is described in a variety of virology and molecular biology manuals. Viruses that are useful as vectors include, but are not limited to, phage viruses, retroviruses, adenoviruses, adeno-associated viruses, herpesviruses, and lentiviruses. In general, a suitable vector contains an origin of replication functional in at least one organism, a promoter sequence, convenient restriction endonuclease sites, and one or more selectable markers.
The kind of the vector is not particularly limited, and a vector that can be expressed in a host cell may be appropriately selected. More specifically, depending on the kind of host cell, the promoter sequence is appropriately selected to ensure expression of the variant Cas12i4 polypeptide from the polynucleotide, and the promoter sequence and polynucleotide are inserted into any of various plasmids or the like to prepare an expression vector.
Additional promoter elements (e.g., enhancer sequences) regulate the frequency of transcription initiation. Typically, these elements are located in the region 30-110bp upstream of the start site, although many promoters have recently been shown to also contain functional elements downstream of the start site. Depending on the promoter, it appears that individual elements may function together or independently to activate transcription.
Furthermore, the present disclosure should not be limited to the use of constitutive promoters. Inducible promoters are also contemplated as part of the present disclosure. The use of an inducible promoter provides a molecular switch that can either initiate expression of the polynucleotide sequence to which the promoter is operably linked when such expression is desired, or shut down expression when expression is not desired. Examples of inducible promoters include, but are not limited to, metallothionein promoters, glucocorticoid promoters, progesterone promoters, and tetracycline promoters.
The expression vector to be introduced may also contain a selectable marker gene, a reporter gene, or both, to facilitate identification and selection of expressing cells from the population of cells sought to be transfected or infected by the viral vector. In other aspects, the selectable marker may be carried on a separate piece of DNA and used in a co-transfection procedure. Both the selectable marker and the reporter gene may be flanked by appropriate transcriptional control sequences to enable expression in the host cell. Examples of such markers include the dihydrofolate reductase gene and the neomycin resistance gene for eukaryotic cell culture, and the tetracycline resistance gene and the ampicillin resistance gene for culture of E. coli and other bacteria. By using such a selectable marker, it can be confirmed whether the polynucleotide encoding the variant Cas12i4 polypeptide of the invention has been transferred into a host cell and then successfully expressed.
The method for preparing the recombinant expression vector is not particularly limited, and examples thereof include a method using a plasmid, phage or cosmid.
Expression method
The invention includes methods for protein expression comprising translating a variant Cas12i4 polypeptide described herein.
In some embodiments, the host cells described herein are used to express a variant Cas12i4 polypeptide. The host cell is not particularly limited, and various known cells may be preferably used. Specific examples of host cells include bacteria such as E.coli, yeasts such as Saccharomyces cerevisiae (Saccharomyces cerevisiae) and Schizosaccharomyces pombe (Schizosaccharomyces pombe), nematodes such as caenorhabditis elegans (Caenorhabditis elegans), xenopus laevis (Xenopus laevis) oocytes and animal cells such as CHO cells, COS cells and HEK293 cells. The method for transferring the above-mentioned expression vector into a host cell (i.e., transformation method) is not particularly limited, and known methods such as electroporation, calcium phosphate method, liposome method and DEAE dextran method may be used.
After transformation of the host with the expression vector, the host cell can be cultured, bred, or propagated to produce the variant Cas12i4 polypeptide. After expression of the variant Cas12i4 polypeptide, the host cells can be collected and the variant Cas12i4 polypeptide purified from culture, etc., according to conventional methods (e.g., filtration, centrifugation, cell disruption, gel filtration chromatography, ion exchange chromatography, etc.).
In some embodiments, the method for variant Cas12i4 polypeptide expression comprises translating at least 5 amino acids, at least 10 amino acids, at least 15 amino acids, at least 20 amino acids, at least 50 amino acids, at least 100 amino acids, at least 150 amino acids, at least 200 amino acids, at least 250 amino acids, at least 300 amino acids, at least 400 amino acids, at least 500 amino acids, at least 600 amino acids, at least 700 amino acids, at least 800 amino acids, at least 900 amino acids, or at least 1000 amino acids of the variant Cas12i4 polypeptide. In some embodiments, the method for protein expression comprises translating about 5 amino acids, about 10 amino acids, about 15 amino acids, about 20 amino acids, about 50 amino acids, about 100 amino acids, about 150 amino acids, about 200 amino acids, about 250 amino acids, about 300 amino acids, about 400 amino acids, about 500 amino acids, about 600 amino acids, about 700 amino acids, about 800 amino acids, about 900 amino acids, about 1000 amino acids, or more of the variant Cas12i4 polypeptide.
Various methods can be used to determine the level of production of the mature variant Cas12i4 polypeptide in a host cell. Such methods include, but are not limited to, methods utilizing polyclonal or monoclonal antibodies specific for the variant Cas12i4 polypeptide or for a labeling tag as described elsewhere herein. Exemplary methods include, but are not limited to, enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA), fluorescence immunoassay (FIA), and fluorescence-activated cell sorting (FACS). These and other assays are well known in the art (see, e.g., Maddox et al., J. Exp. Med. 158:1211 [1983]).
The present disclosure provides methods of expressing a variant Cas12i4 polypeptide in vivo in a cell, the methods comprising providing a host cell with a polyribonucleotide encoding the variant Cas12i4 polypeptide; expressing the variant Cas12i4 polypeptide in the cell; and obtaining the variant Cas12i4 polypeptide from the cell.
Introduction of alterations or mutations
Nucleic acid sequences encoding a variant polypeptide or polypeptides may be produced by synthetic methods known in the art. One or more changes or mutations may be inserted at a time to alter the nucleic acid sequence encoding the parent polypeptide using the nucleic acid sequence encoding the parent polypeptide itself as a framework. Along the same lines, a parent polypeptide may be altered or mutated by introducing changes into the polypeptide sequence as synthesized in a synthetic manner. This may be accomplished by methods well known in the art.
The creation of alterations or mutations and their introduction into the parent polypeptide sequence may be accomplished using any method known to those skilled in the art. In particular, in some embodiments, oligonucleotide primers for PCR can be used for rapid synthesis of DNA templates that include one or more changes or mutations in a nucleic acid sequence encoding a variant polypeptide. Site-specific mutagenesis is also a useful technique for preparing individual peptides, or biologically functional equivalent proteins or peptides, by specific mutagenesis of the underlying DNA. The technique further provides a ready ability to prepare and test variants, incorporating one or more of the foregoing considerations, by introducing one or more nucleotide sequence changes into the DNA. Site-specific mutagenesis allows variants to be generated by using specific oligonucleotide sequences that encode the DNA sequence of the desired mutation, as well as a sufficient number of adjacent nucleotides, to provide a primer sequence of sufficient size and sequence complexity to form a stable duplex on both sides of the deletion junction being traversed. Typically, primers of about 17 to 25 nucleotides in length are preferred, with about 5 to 10 residues on each side of the junction of the sequence being altered.
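The primer geometry described above (a changed position flanked by unchanged template bases on each side) can be illustrated with a minimal sketch. The function names, the default flank length, and the template sequence used in the usage note are hypothetical and are not taken from the disclosure; real primer design would also consider melting temperature, GC content, and secondary structure.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def design_mutagenic_primer(template: str, pos: int, new_base: str, flank: int = 10) -> str:
    """Build a forward primer carrying one base substitution.

    `pos` is the 0-based position of the base to change; `flank` is the
    number of unchanged template bases kept on each side (an assumed
    default, giving a 21-nt primer).
    """
    template = template.upper()
    new_base = new_base.upper()
    if len(new_base) != 1 or new_base not in "ACGT":
        raise ValueError("new_base must be a single A, C, G, or T")
    if pos - flank < 0 or pos + flank >= len(template):
        raise ValueError("not enough flanking sequence around pos")
    if template[pos] == new_base:
        raise ValueError("new_base matches the template; nothing to mutate")
    # left flank + substituted base + right flank
    return template[pos - flank:pos] + new_base + template[pos + 1:pos + flank + 1]
```

For example, calling `design_mutagenic_primer(template, 15, "T")` on a 30-nt template yields a 21-nt forward primer with the substitution at its center, and `reverse_complement` of that primer gives the matching reverse primer.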
Introduction of structural variations (such as fusion of the polypeptide as an amino-and/or carboxy-terminal extension) may be accomplished in a similar manner as the introduction of alterations or mutations into the parent polypeptide. The additional peptide may be added to the parent polypeptide or variant polypeptide by including an appropriate nucleic acid sequence encoding the additional peptide into the nucleic acid sequence encoding the parent polypeptide or variant polypeptide. Optionally, additional peptides can be directly attached to the variant polypeptide by synthetic polypeptide production.
In one aspect, the invention also provides methods for introducing alterations or mutations into a parent polypeptide sequence to produce a variant Cas12i4 polypeptide having increased on-target binding to two or more loci (e.g., 2, 3, 4, 5, 6, 7, 8, 9, or more) of a target nucleic acid as compared to the parent polypeptide.
In one aspect, the invention also provides methods for introducing alterations or mutations into a parent polypeptide sequence to produce a plurality of variant Cas12i4 polypeptides (e.g., individual variant Cas12i4 polypeptides having the same amino acid sequence), the plurality of variant Cas12i4 polypeptides having increased on-target binding to two or more loci of a target nucleic acid when separately complexed with a plurality of different RNA guides, as compared to the plurality of parent polypeptides and RNA guides.
In one aspect, the invention also provides methods for introducing alterations or mutations into a parent polypeptide sequence to produce a variant Cas12i4 polypeptide having increased on-target ternary complex formation with two or more target loci of a target nucleic acid as compared to the parent polypeptide.
In one aspect, the invention also provides methods for introducing alterations or mutations into a parent polypeptide sequence to produce a plurality of variant Cas12i4 polypeptides (e.g., individual variant Cas12i4 polypeptides having the same amino acid sequence), the plurality of variant Cas12i4 polypeptides having increased ternary complex formation with two or more loci of a target nucleic acid when separately complexed with a plurality of different RNA guides, as compared to the plurality of parent polypeptides and RNA guides.
In one aspect, the invention also provides methods for introducing alterations or mutations into a parent polypeptide sequence to produce a variant Cas12i4 polypeptide that exhibits targeting of an increased number of target nucleic acids or target loci as compared to the parent polypeptide.
In one aspect, the invention also provides methods for introducing alterations or mutations into a parent polypeptide sequence to produce a plurality of variant Cas12i4 polypeptides (e.g., individual variant Cas12i4 polypeptides having the same amino acid sequence), the plurality of variant Cas12i4 polypeptides exhibiting targeting of an increased number of target nucleic acids or target loci when separately complexed with a plurality of different RNA guides, as compared to the plurality of parent polypeptides and RNA guides.
In one aspect, the invention also provides methods for introducing alterations or mutations into a parent polypeptide sequence to enhance stability of a Cas12i4 polypeptide. Stability of Cas12i4 polypeptides may be determined by techniques that include, but are not limited to, the following: thermal denaturation assays, thermal transition assays, differential scanning calorimetry (DSC), differential scanning fluorimetry (DSF), isothermal titration calorimetry (ITC), pulse-chase methods, bleach-chase methods, cycloheximide-chase methods, circular dichroism (CD) spectroscopy, crystallization, and fluorescence-based activity assays.
Variant binary complex
Typically, the variant Cas12i4 polypeptide and the RNA guide bind to each other in a molar ratio of about 1:1 to form a variant binary complex. The variant Cas12i4 polypeptide and the RNA guide, alone or together, do not occur naturally.
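As a worked illustration of the approximately 1:1 molar ratio, the following sketch computes the stock volumes needed to combine equal molar amounts of polypeptide and RNA guide. The function name, the stock concentrations, and the amount used in the usage note are hypothetical values chosen only for the arithmetic; they are not taken from the disclosure.

```python
def rnp_volumes_ul(protein_pmol: float, protein_stock_um: float,
                   guide_stock_um: float, guide_to_protein_ratio: float = 1.0):
    """Return (protein_volume_ul, guide_volume_ul) for complexing.

    Relies on the identity 1 uM = 1 pmol/uL, so pmol / uM gives uL.
    `guide_to_protein_ratio` defaults to 1.0 for a 1:1 molar ratio.
    """
    if min(protein_pmol, protein_stock_um, guide_stock_um, guide_to_protein_ratio) <= 0:
        raise ValueError("all inputs must be positive")
    guide_pmol = protein_pmol * guide_to_protein_ratio
    return protein_pmol / protein_stock_um, guide_pmol / guide_stock_um
```

Under these assumed values, 100 pmol of polypeptide from a 20 µM stock and a 50 µM guide stock would call for 5 µL of polypeptide and 2 µL of guide.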
In some embodiments, the variant Cas12i4 polypeptide may be overexpressed in a host cell and purified as described herein before it is complexed with an RNA guide (e.g., in a tube) to form a variant Ribonucleoprotein (RNP) (e.g., a variant binary complex).
In some embodiments, the variant binary complex exhibits increased binding affinity to the target nucleic acid, increased on-target binding activity, increased on-target binding specificity, increased ternary complex formation with the target nucleic acid, and/or increased stability over a range of incubation times. In some embodiments, the variant binary complex exhibits reduced off-target binding to a non-target nucleic acid and/or reduced dissociation from the target nucleic acid over a range of incubation times. In some embodiments, the variant binary complex exhibits increased target nucleic acid complex formation, target nucleic acid activity, and/or target nucleic acid specificity over a range of incubation times.
In some embodiments, the complexing of the binary complex occurs at a temperature that is about any one of 20 ℃, 21 ℃, 22 ℃, 23 ℃, 24 ℃, 25 ℃, 26 ℃, 27 ℃, 28 ℃, 29 ℃, 30 ℃, 31 ℃, 32 ℃, 33 ℃, 34 ℃, 35 ℃, 36 ℃, 37 ℃, 38 ℃, 39 ℃, 40 ℃, 41 ℃, 42 ℃, 43 ℃, 44 ℃, 45 ℃, 50 ℃, or 55 ℃. In some embodiments, the variant Cas12i4 polypeptide does not dissociate from the RNA guide or bind to free RNA during the incubation period of at least any of 10 minutes, 15 minutes, 20 minutes, 25 minutes, 30 minutes, 35 minutes, 40 minutes, 45 minutes, 50 minutes, 55 minutes, 1 hour, 2 hours, 3 hours, 4 hours, or more at about 37 ℃. In some embodiments, the variant ribonucleoprotein complex does not exchange RNA guides with a different RNA after binary complex formation.
In some embodiments, the variant Cas12i4 polypeptide and the RNA guide are complexed in a binary complexing buffer. In some embodiments, the variant Cas12i4 polypeptide is stored in a buffer that is replaced with a binary complexing buffer to form a complex with the RNA guide. In some embodiments, the variant Cas12i4 polypeptide is stored in a binary complex buffer.
In some embodiments, the binary complex buffer has a pH in the range of about 7.3 to 8.6. In one embodiment, the pH of the binary complex buffer is about 7.3. In one embodiment, the pH of the binary complex buffer is about 7.4. In one embodiment, the pH of the binary complex buffer is about 7.5. In one embodiment, the pH of the binary complex buffer is about 7.6. In one embodiment, the pH of the binary complex buffer is about 7.7. In one embodiment, the pH of the binary complex buffer is about 7.8. In one embodiment, the pH of the binary complex buffer is about 7.9. In one embodiment, the pH of the binary complex buffer is about 8.0. In one embodiment, the pH of the binary complex buffer is about 8.1. In one embodiment, the pH of the binary complex buffer is about 8.2. In one embodiment, the pH of the binary complex buffer is about 8.3. In one embodiment, the pH of the binary complex buffer is about 8.4. In one embodiment, the pH of the binary complex buffer is about 8.5. In one embodiment, the pH of the binary complex buffer is about 8.6.
The thermostability of the variant Cas12i4 polypeptide may be increased under favorable conditions (such as the addition of an RNA guide, e.g., binding to an RNA guide).
In some embodiments, the variant Cas12i4 polypeptide may be overexpressed in a host cell and complexed with an RNA guide prior to purification as described herein. In some embodiments, mRNA or DNA encoding the variant Cas12i4 polypeptide is introduced into the cell such that the variant Cas12i4 polypeptide is expressed in the cell. The RNA guide (which directs the variant Cas12i4 polypeptide to the desired target nucleic acid) is also introduced into the cell simultaneously, separately or sequentially with a single mRNA or DNA construct, such that the necessary ribonucleoprotein complex is formed in the cell.
Assessment of variant binary Complex stability and functionality
Provided herein in certain embodiments are methods for identifying optimal variant Cas12i4 polypeptide/RNA guide complexes (referred to herein as variant binary complexes) comprising (a) combining a variant Cas12i4 polypeptide and an RNA guide in a sample to form a variant binary complex; (b) measuring the value of the variant binary complex; and (c) if the value of the variant binary complex is greater than the value of the reference molecule, determining that the variant binary complex is optimal relative to the reference molecule. In some embodiments, the value may include, but is not limited to, a stability measurement (e.g., Tm value, thermostability), binary complex formation rate, RNA guide binding specificity, and/or complex activity.
In some embodiments, the optimal variant Cas12i4 polypeptide/RNA guide complex (i.e., variant binary complex) is identified by: (a) combining the variant Cas12i4 polypeptide and the RNA guide in a sample to form a variant binary complex; (b) detecting the Tm value of the variant binary complex; and (c) determining that the variant binary complex is stable if the Tm value of the variant binary complex is at least 8 ℃ greater than the Tm value of a reference molecule or a Tm reference value.
Methods involving the step of measuring the thermostability of the variant Cas12i4 polypeptide/RNA guide complex (i.e., the variant binary complex) may include, but are not limited to, methods of determining the stability of the variant binary complex, methods of determining conditions that promote stable variant binary complex, methods of screening for stable variant binary complex, and methods for identifying optimal gRNA to form stable variant binary complex. In certain embodiments, the thermal stability value of the variant binary complex may be measured.
Additionally, in certain embodiments, the thermal stability value of the reference molecule may also be measured. In certain embodiments, a variant binary complex may be determined to be stable if the measured thermostability value of the variant binary complex is greater than the measured thermostability value of the reference molecule or the thermostability reference value measured under the same experimental conditions, as described herein. In certain embodiments, the reference molecule can be a variant Cas12i4 polypeptide lacking an RNA guide.
In certain embodiments, the measured thermal stability value may be a denaturation temperature value. In these embodiments, the thermal stability reference value is a denaturation temperature reference value. In some embodiments, the measured thermal stability value may be a Tm value. In these embodiments, the thermal stability reference value may be a Tm reference value. In certain embodiments, thermal stability values may be measured using a thermal transition assay. In certain embodiments, assays for measuring thermal stability may involve techniques described herein, including, but not limited to, thermal denaturation assays, thermal transition assays, differential scanning calorimetry (DSC), differential scanning fluorimetry (DSF), isothermal titration calorimetry (ITC), pulse-chase methods, bleach-chase methods, cycloheximide-chase methods, circular dichroism (CD) spectroscopy, crystallization, and fluorescence-based activity assays.
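The Tm-based stability criterion (a complex is deemed stable if its Tm exceeds that of the reference by at least 8 ℃) can be sketched as follows. The Tm estimate used here, the midpoint of the steepest rise in a melt curve, is a common simplification assumed for illustration and is not stated in the disclosure; the function names and the data in the usage note are likewise hypothetical.

```python
def estimate_tm(temps, signal):
    """Estimate Tm as the midpoint of the interval with the steepest signal rise.

    `temps` must be strictly increasing; `signal` holds the matching readings.
    This first-difference approach is an assumed simplification of a DSF analysis.
    """
    if len(temps) != len(signal) or len(temps) < 2:
        raise ValueError("need two or more paired readings")
    best_slope, best_tm = None, None
    for i in range(len(temps) - 1):
        dt = temps[i + 1] - temps[i]
        if dt <= 0:
            raise ValueError("temps must be strictly increasing")
        slope = (signal[i + 1] - signal[i]) / dt
        if best_slope is None or slope > best_slope:
            best_slope, best_tm = slope, (temps[i] + temps[i + 1]) / 2
    return best_tm


def is_stable(tm_complex: float, tm_reference: float, min_delta: float = 8.0) -> bool:
    """Apply the criterion: complex Tm at least `min_delta` degrees above the reference Tm."""
    return (tm_complex - tm_reference) >= min_delta
```

With a hypothetical reference curve whose steepest rise lies between 45 ℃ and 50 ℃, `estimate_tm` returns 47.5; a complex with a Tm of 56.0 ℃ would then satisfy `is_stable`, whereas one with a Tm of 55.0 ℃ would not.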
In certain embodiments, a variant binary complex can be identified if the variant Cas12i4 polypeptide/RNA guide complex formation rate, RNA guide binding specificity, and/or complex activity of the variant binary complex is greater than the value of the reference molecule or reference value (e.g., the value of the parent polypeptide/RNA guide complex, referred to herein as the parent binary complex). For example, in certain embodiments, a variant binary complex can be identified if the variant Cas12i4 polypeptide/RNA guide complex formation rate, RNA guide binding specificity, and/or complex activity value of the variant binary complex is at least X% greater than the value of the reference molecule or reference value (e.g., the value of the parent binary complex). In certain embodiments, the methods described herein may further comprise a plurality of steps comprising measuring the activity of a variant binary complex as described herein.
Variant ternary complex
In some embodiments, a variant Cas12i4 polypeptide, RNA guide, and target nucleic acid as described herein form a variant ternary complex (e.g., in a test tube or cell). Typically, the variant Cas12i4 polypeptide, RNA guide, and target nucleic acid associate with each other at a molar ratio of about 1:1:1 to form a variant ternary complex. The variant Cas12i4 polypeptide, RNA guide, and target nucleic acid, alone or together, do not occur naturally.
In some embodiments, a variant binary complex (e.g., a complex of a variant Cas12i4 polypeptide and an RNA guide) as described herein is further complexed with a target nucleic acid (e.g., in a test tube or cell) to form a variant ternary complex.
In some embodiments, the complexing of the ternary complex occurs at a temperature that is about any one of 20 ℃, 21 ℃, 22 ℃, 23 ℃, 24 ℃, 25 ℃, 26 ℃, 27 ℃, 28 ℃, 29 ℃, 30 ℃, 31 ℃, 32 ℃, 33 ℃, 34 ℃, 35 ℃, 36 ℃, 37 ℃, 38 ℃, 39 ℃, 40 ℃, 41 ℃, 42 ℃, 43 ℃, 44 ℃, 45 ℃, 50 ℃, or 55 ℃. In some embodiments, the variant binary complex does not dissociate from the target nucleic acid or bind to free nucleic acid (e.g., free DNA) at about 37 ℃ during an incubation period of at least any of 10 minutes, 15 minutes, 20 minutes, 25 minutes, 30 minutes, 35 minutes, 40 minutes, 45 minutes, 50 minutes, 55 minutes, 1 hour, 2 hours, 3 hours, 4 hours, or more. In some embodiments, the variant binary complex does not exchange target nucleic acid with a different nucleic acid after ternary complex formation.
In some embodiments, the variant Cas12i4 polypeptide, the RNA guide, and the target nucleic acid are complexed in a ternary complexing buffer. In some embodiments, the variant Cas12i4 polypeptide is stored in a buffer that is replaced with a ternary complex buffer to form a complex with the RNA guide and the target nucleic acid. In some embodiments, the variant Cas12i4 polypeptide is stored in a ternary complex buffer.
In some embodiments, the variant binary complex and the target nucleic acid are complexed in a ternary complexing buffer. In some embodiments, the variant binary complex is stored in a buffer that is replaced with a ternary complex buffer to form a complex with the target nucleic acid. In some embodiments, the variant binary complex is stored in a ternary complex buffer.
In some embodiments, the ternary complex buffer has a pH in the range of about 7.3 to 8.6. In one embodiment, the pH of the ternary complex buffer is about 7.3. In one embodiment, the pH of the ternary complex buffer is about 7.4. In one embodiment, the pH of the ternary complex buffer is about 7.5. In one embodiment, the pH of the ternary complex buffer is about 7.6. In one embodiment, the pH of the ternary complex buffer is about 7.7. In one embodiment, the pH of the ternary complex buffer is about 7.8. In one embodiment, the pH of the ternary complex buffer is about 7.9. In one embodiment, the pH of the ternary complex buffer is about 8.0. In one embodiment, the pH of the ternary complex buffer is about 8.1. In one embodiment, the pH of the ternary complex buffer is about 8.2. In one embodiment, the pH of the ternary complex buffer is about 8.3. In one embodiment, the pH of the ternary complex buffer is about 8.4. In one embodiment, the pH of the ternary complex buffer is about 8.5. In one embodiment, the pH of the ternary complex buffer is about 8.6.
The thermostability of the variant Cas12i4 polypeptide may be increased under favorable conditions (such as the addition of RNA guides and target nucleic acids).
Assessment of variant ternary complex stability and functionality
Provided herein in certain embodiments are methods for identifying optimal variant ternary complexes, the methods comprising (a) combining a variant Cas12i4 polypeptide, an RNA guide, and a target nucleic acid in a sample to form a variant ternary complex; (b) measuring the value of the variant ternary complex; and (c) if the value of the variant ternary complex is greater than the value of the reference molecule, determining that the variant ternary complex is optimal relative to the reference molecule. In some embodiments, the value may include, but is not limited to, a stability measurement (e.g., Tm value, thermostability), ternary complex formation rate, DNA binding affinity measurements, DNA binding specificity measurements, and/or complex activity measurements (e.g., nuclease activity measurements).
In some embodiments, the optimal variant ternary complex is identified by: (a) combining the variant Cas12i4 polypeptide, the RNA guide, and the target nucleic acid in a sample to form a variant ternary complex; (b) detecting the Tm value of the variant ternary complex; and (c) determining that the variant ternary complex is stable if the Tm value of the variant ternary complex is at least 8 ℃ greater than the Tm value of a reference molecule or a Tm reference value.
Methods involving the step of measuring the thermostability of a variant ternary complex may include, but are not limited to, methods of determining the stability of the variant ternary complex, methods of determining conditions that promote stable variant ternary complex, methods of screening for stable variant ternary complex, and methods for identifying the optimal binary complex to form a stable variant ternary complex. In certain embodiments, the thermal stability value of the variant ternary complex can be measured.
Additionally, in certain embodiments, the thermal stability value of the reference molecule may also be measured. In certain embodiments, a variant ternary complex may be determined to be stable if the measured thermal stability value of the variant ternary complex is greater than the measured thermal stability value of a reference molecule or the thermal stability reference value measured under the same experimental conditions, as described herein. In certain embodiments, the reference molecule can be a variant Cas12i4 polypeptide lacking an RNA guide and/or target nucleic acid.
In certain embodiments, the measured thermal stability value may be a denaturation temperature value. In these embodiments, the thermal stability reference value is a denaturation temperature reference value. In some embodiments, the measured thermal stability value may be a Tm value. In these embodiments, the thermal stability reference value may be a Tm reference value. In certain embodiments, thermal stability values may be measured using a thermal transition assay. In certain embodiments, the assays for measuring thermal stability may involve techniques described herein, including, but not limited to, differential scanning fluorimetry (DSF), differential scanning calorimetry (DSC), or isothermal titration calorimetry (ITC).
In certain embodiments, a variant ternary complex can be identified if the ternary complex formation rate, DNA binding affinity, DNA binding specificity, and/or complex activity (e.g., nuclease activity) of the variant ternary complex is greater than the value of the reference molecule or reference value (e.g., the value of the parent ternary complex). For example, in certain embodiments, a variant ternary complex can be identified if the ternary complex formation rate, DNA binding affinity, DNA binding specificity, and/or complex activity of the variant ternary complex is at least X% greater than the value of the reference molecule or reference value (e.g., the value of the parent ternary complex). In certain embodiments, the methods described herein may further comprise a plurality of steps comprising measuring the activity of a variant ternary complex as described herein.
Delivery
The compositions or complexes described herein can be formulated to include, for example, a carrier (such as a carrier and/or a polymeric carrier, e.g., a liposome) and delivered to a cell (e.g., a prokaryotic cell, eukaryotic cell, plant cell, mammalian cell, etc.) by known methods. Such methods include, but are not limited to, transfection (e.g., lipid-mediated, cationic polymers, calcium phosphate, dendrimers); electroporation or other methods of membrane disruption (e.g., nucleofection); viral delivery (e.g., lentivirus, retrovirus, adenovirus, AAV); microinjection; microprojectile bombardment ("gene gun"); FuGENE; direct sonic loading; cell squeezing; optical transfection; protoplast fusion; impalefection; magnetofection; exosome-mediated transfer; lipid nanoparticle-mediated transfer; and any combination thereof.
In some embodiments, the method comprises delivering to the cell one or more nucleic acids (e.g., nucleic acids encoding a variant Cas12i4 polypeptide, RNA guides, donor DNA, etc.), one or more transcripts thereof, and/or a preformed variant Cas12i4 polypeptide/RNA guide complex (i.e., variant binary complex). Exemplary intracellular delivery methods include, but are not limited to: a viral or virus-like agent; chemical-based transfection methods, such as transfection methods using calcium phosphate, dendrimers, liposomes, or cationic polymers (e.g., DEAE-dextran or polyethylenimine); non-chemical methods such as microinjection, electroporation, cell squeezing, sonoporation, optical transfection, impalefection, protoplast fusion, bacterial conjugation, and delivery of plasmids or transposons; particle-based methods, such as use of a gene gun, magnetofection or magnet-assisted transfection, and particle bombardment; and hybrid methods such as nucleofection. In some embodiments, the application further provides cells produced by such methods, and organisms (e.g., animals, plants, or fungi) comprising or produced by such cells.
Cells
The compositions or complexes described herein can be delivered to a variety of cells. In some embodiments, the cell is an isolated cell. In some embodiments, the cell is in a cell culture. In some embodiments, the cell is ex vivo. In some embodiments, the cells are obtained from a living organism and maintained in cell culture. In some embodiments, the cell is a unicellular organism.
In some embodiments, the cell is a prokaryotic cell. In some embodiments, the cell is a bacterial cell or is derived from a bacterial cell. In some embodiments, the bacterial cell is unrelated to the bacterial species from which the parent polypeptide is derived. In some embodiments, the cell is an archaeal cell or is derived from an archaeal cell. In some embodiments, the cell is a eukaryotic cell. In some embodiments, the cell is a plant cell or is derived from a plant cell. In some embodiments, the cell is a fungal cell or is derived from a fungal cell. In some embodiments, the cell is an animal cell or is derived from an animal cell. In some embodiments, the cell is an invertebrate cell or is derived from an invertebrate cell. In some embodiments, the cell is a vertebrate cell or is derived from a vertebrate cell. In some embodiments, the cell is a mammalian cell or is derived from a mammalian cell. In some embodiments, the cell is a human cell. In some embodiments, the cell is a zebrafish cell. In some embodiments, the cell is a rodent cell. In some embodiments, the cell is synthetically made, sometimes referred to as an artificial cell.
In some embodiments, the cells are derived from a cell line. A wide variety of cell lines for tissue culture are known in the art. Examples of cell lines include, but are not limited to, 293T, MF, K562, HeLa, and transgenic varieties thereof. Cell lines can be obtained from a variety of sources known to those skilled in the art (see, e.g., the American Type Culture Collection (ATCC) (Manassas, Va.)). In some embodiments, cells transfected with one or more nucleic acids described herein (such as Ago encoding vectors and gDNA) or Ago-gDNA complexes are used to create new cell lines comprising one or more vector-derived sequences to create new cell lines comprising modifications to the target nucleic acid. In some embodiments, cells transiently or non-transiently transfected with, or cell lines derived from cells transfected with, one or more nucleic acids described herein (such as variant Cas12i4 polypeptide encoding vectors and RNA guides) or variant Cas12i4 polypeptide/RNA guide complexes (i.e., variant binary complexes) are used to evaluate one or more test compounds.
In some embodiments, the cell is a primary cell. For example, a culture of primary cells may be passaged 0, 1, 2, 4, 5, 10, 15, or more times. In some embodiments, the primary cells are harvested from the individual by any known method. For example, leukocytes can be harvested by apheresis, leukapheresis, density gradient separation, and the like. Cells of tissue (such as skin, muscle, bone marrow, spleen, liver, pancreas, lung, intestine, stomach, etc.) may be collected by biopsy. The harvested cells may be dispersed or suspended using a suitable solution. Such a solution may typically be a balanced salt solution (e.g., normal saline, phosphate-buffered saline (PBS), Hank's balanced salt solution, etc.), conveniently supplemented with fetal bovine serum or other naturally occurring factors, along with a low concentration of an acceptable buffer. Buffers may include HEPES, phosphate buffer, lactate buffer, and the like. The cells may be used immediately, or they may be stored (e.g., by freezing). Frozen cells can be thawed and reused. Cells can be frozen in DMSO, serum, and buffered medium (e.g., 10% DMSO, 50% serum, 40% buffered medium), and/or some other such commonly used solution for preserving cells at freezing temperatures.
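The example freezing composition given above (10% DMSO, 50% serum, 40% buffered medium) translates into component volumes by simple arithmetic, sketched below. The function name and the batch volume in the usage note are hypothetical; the percentages are treated as volume fractions, which is an assumption, since the text does not state the basis.

```python
def freezing_medium_volumes_ml(total_ml: float, dmso_pct: float = 10.0,
                               serum_pct: float = 50.0, medium_pct: float = 40.0) -> dict:
    """Split a total volume into DMSO, serum, and buffered medium by percentage (v/v assumed)."""
    if total_ml <= 0:
        raise ValueError("total_ml must be positive")
    if abs(dmso_pct + serum_pct + medium_pct - 100.0) > 1e-9:
        raise ValueError("percentages must sum to 100")
    return {
        "DMSO": total_ml * dmso_pct / 100.0,
        "serum": total_ml * serum_pct / 100.0,
        "buffered medium": total_ml * medium_pct / 100.0,
    }
```

For a hypothetical 10 mL batch, this gives 1 mL DMSO, 5 mL serum, and 4 mL buffered medium.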
In some embodiments, the variant Cas12i4 polypeptide has nuclease activity that induces a double-strand break or single-strand break in a target nucleic acid (e.g., genomic DNA). Double-strand breaks can stimulate cellular endogenous DNA repair pathways, including homology-directed recombination (HDR), non-homologous end joining (NHEJ), or alternative non-homologous end joining (A-NHEJ). NHEJ can repair cleaved target nucleic acids without the need for a homologous template. This may result in deletion or insertion of one or more nucleotides in the target nucleic acid. HDR can occur with a homologous template (such as donor DNA). The homologous template may comprise sequences homologous to sequences flanking the target nucleic acid cleavage site. In some cases, HDR can insert an exogenous polynucleotide sequence into the cleaved target nucleic acid. Modification of the target DNA due to NHEJ and/or HDR can result in, for example, mutation, deletion, alteration, integration, gene correction, gene replacement, gene tagging, transgene knock-in, gene disruption, and/or gene knockout.
In some embodiments, the cell culture is synchronized to increase the efficiency of these methods. In some embodiments, cells in S and G2 phases are used for HDR-mediated gene editing. In some embodiments, the cells may be subjected to the method at any stage of the cell cycle. In some embodiments, cell over-confluence significantly reduces the efficacy of the method. In some embodiments, the method is applied to the cell culture at no more than about any of 40%, 45%, 50%, 55%, 60%, 65%, or 70% confluence.
In some embodiments, the binding of the variant Cas12i4 polypeptide/RNA guide complex (i.e., the variant binary complex) to the target nucleic acid in the cell recruits one or more endogenous cellular molecules or pathways other than the DNA repair pathway to modify the target nucleic acid. In some embodiments, binding of the variant binary complex blocks access of one or more endogenous cellular molecules or pathways to the target nucleic acid, thereby modifying the target nucleic acid. For example, binding of variant binary complexes may block endogenous transcription or translation machinery to reduce expression of the target nucleic acid.
Kits
The invention also provides kits that can be used, for example, to carry out the methods described herein. In some embodiments, the kit comprises a variant Cas12i4 polypeptide of the invention, e.g., a variant comprising a substitution of table 2 or a variant polypeptide of table 3. In some embodiments, the kit comprises a polynucleotide encoding such a variant Cas12i4 polypeptide, and optionally the polynucleotide is contained within a vector, e.g., as described herein. The kit may also optionally include RNA guides, e.g., as described herein. The RNA guides of the kits of the invention can be designed to target sequences of interest, as known in the art. The nuclease variant and the RNA guide can be packaged in the same vial or other container within the kit, or can be packaged in separate vials or other containers, the contents of which can be mixed prior to use. The kit may additionally optionally include a buffer and/or instructions for using the nuclease variants and/or RNA guides.
All references and publications cited herein are hereby incorporated by reference.
Sequences encoding Cas12i4 variants of SEQ ID No. 4
Examples
The following examples are provided to further illustrate some embodiments of the invention but are not intended to limit the scope of the invention; it will be appreciated by their exemplary nature that other procedures, methods or techniques known to those skilled in the art may alternatively be used.
Example 1-preparation of variant constructs
In this example, variant constructs were generated.
Using mutagenic forward and mutagenic reverse primers ordered from IDT™ (Integrated DNA Technologies, Inc.), a DNA template comprising a single mutation was constructed via two PCR steps. In the first step, two sets of PCR reactions were performed in 384-well plates to generate two fragments. The overlapping region of the two PCR fragments contains the desired single mutation and allows the entire DNA template to be assembled via the second PCR. In the second step, the purified fragments from the first step were used as the template for overlap PCR (OL PCR), and Fw and Rv oligonucleotides that anneal to the vector backbone were used as the OL PCR primers. The resulting linear DNA template contains the T7 promoter, the T7 terminator and the open reading frame of the polypeptide.
These linear DNA templates are used directly in cell-free transcription and translation systems to express polypeptide variants containing single mutations. The variant constructs were further transferred individually into transient transfection vectors. Alternatively, DNA templates containing combinatorial mutations were prepared by PCR and subsequently transferred into transient transfection vectors.
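The two-step overlap PCR described above depends on a pair of mutagenic primers that both carry the desired codon change, so that the two first-round fragments share an overlapping region containing the mutation. The following is a minimal sketch of how such a primer pair might be derived from an open reading frame. The function names, the `flank` length and the toy sequence are illustrative assumptions and are not taken from this example.

```python
# Sketch only: designs a fully overlapping mutagenic primer pair for one codon change.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def mutagenic_primers(orf: str, codon_index: int, new_codon: str, flank: int = 15):
    """Return (forward, reverse) primers that replace one codon of `orf`.

    codon_index is 0-based; `flank` is the number of unchanged bases kept on
    each side of the mutated codon (an illustrative default, not from the text).
    """
    orf = orf.upper()
    start = codon_index * 3
    if len(new_codon) != 3 or codon_index < 0 or start + 3 > len(orf):
        raise ValueError("codon_index or new_codon is out of range")
    left = orf[max(0, start - flank):start]
    right = orf[start + 3:start + 3 + flank]
    forward = left + new_codon.upper() + right
    # The reverse primer is the reverse complement, so both fragments overlap
    # across the mutated codon.
    return forward, reverse_complement(forward)


if __name__ == "__main__":
    toy_orf = "ATGGCTAAAGGTCTGGAACGTTAA"  # hypothetical 8-codon ORF
    fwd, rev = mutagenic_primers(toy_orf, codon_index=2, new_codon="GCG", flank=6)
    print(fwd)  # ATGGCTGCGGGTCTG
    print(rev)  # CAGACCCGCAGCCAT
```

In a first-round reaction, the reverse primer would be paired with the backbone Fw oligonucleotide and the forward primer with the backbone Rv oligonucleotide. Real primer design would also consider melting temperature and secondary structure, which this sketch ignores.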
Example 2-fluorescence polarization assay for variant binary complex detection
In this example, the ability of a wild-type or variant nuclease polypeptide and an RNA guide to form a binary complex is assessed by fluorescence polarization assay.
Linear ssDNA fragments comprising the reverse complement of the T7 RNA polymerase promoter sequence upstream of the direct repeat and the desired 20 bp RNA guide target were synthesized by IDT™. Linear dsDNA in vitro transcription (IVT) templates were then generated by annealing a universal T7 forward oligonucleotide (95-4 °C at 5 °C/min) to the reverse complement ssDNA and filling in with Klenow fragment (New England Biolabs) at 25 °C for 15 minutes. The resulting IVT templates were then transcribed into RNA guides using the HiScribe T7 High Yield RNA synthesis kit (New England Biolabs) at 37 °C for 4 hours. After transcription, each RNA guide was purified using an RNA Clean and Concentrator kit (Zymo) and stored at -20 °C until use.
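The template design described above can be sketched in a few lines: the sense strand is a T7 promoter followed by the direct repeat and the 20-nucleotide guide target, and the ordered ssDNA is its reverse complement. The sketch below assumes the commonly used minimal T7 promoter sequence. The direct repeat and spacer passed to the function are hypothetical placeholders, not sequences from this application.

```python
# Sketch only: builds the reverse-complement ssDNA oligo for an IVT template.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Commonly used minimal T7 promoter; transcription starts at the final G.
T7_PROMOTER = "TAATACGACTCACTATAG"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def ivt_template_oligo(direct_repeat: str, spacer: str, spacer_len: int = 20) -> str:
    """Return the ssDNA oligo (reverse complement of T7 + direct repeat + spacer).

    Both inputs are given as DNA in the sense (RNA-like) orientation.
    """
    if len(spacer) != spacer_len:
        raise ValueError(f"spacer must be {spacer_len} nt long")
    sense = T7_PROMOTER + direct_repeat.upper() + spacer.upper()
    return reverse_complement(sense)


if __name__ == "__main__":
    # Hypothetical direct repeat and spacer, for illustration only.
    oligo = ivt_template_oligo("GTTGGA", "ACGTACGTACGTACGTACGT")
    print(oligo)
```

Annealing a universal T7 forward oligonucleotide to the 3' end of this oligo and extending it yields the double-stranded template.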
The RNA guide was then labeled with 6-carboxyfluorescein (6-FAM) (IDT™). The nuclease polypeptide (wild-type or variant Cas12i4 polypeptide) in 1X assay buffer (20 mM Tris-HCl (pH 7.5), 150 mM KCl, 5 mM MgCl2, 1 mM DTT) was titrated with increasing concentrations of labeled RNA guide (7.5-250 nM). The complexes were incubated at 37 °C for 30 minutes and then analyzed for fluorescence polarization signal using a microplate reader (Infinite 200 Pro, Tecan).
Binary complex formation at different temperatures was also investigated. Further binding experiments as described above were performed isothermally at 25 ℃, 50 ℃, 60 ℃ and 70 ℃.
The formation of a binary complex upon titration of the nuclease polypeptide (wild-type or variant Cas12i4 polypeptide) with increasing concentrations of the RNA guide (or the formation of a binary complex upon titration of the RNA guide with increasing concentrations of the nuclease polypeptide) results in a change in fluorescence polarization signal (in millipolarization (mP)). Binding curves were generated by plotting the change in fluorescence polarization signal over the concentration range of the RNA guide.
This example indicates how the binding affinity of a nuclease polypeptide (wild-type or variant Cas12i4 polypeptide) to an RNA guide can be determined and compared.
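One way to turn a binding curve of this kind into a number that can be compared between polypeptides is to fit it to a binding model. The sketch below assumes a simple one-site hyperbolic model, mP = baseline + amplitude × c / (Kd + c). That model is not specified in this example, and it is only a rough approximation when the fixed component is not in large deficit. The Kd search grid and the synthetic data are illustrative assumptions.

```python
# Sketch only: estimates an apparent Kd from fluorescence polarization data
# using a one-site model and a simple grid search (standard library only).

def fit_one_site(conc_nM, signal_mP):
    """Return (kd_nM, baseline_mP, amplitude_mP) minimizing squared error.

    Kd is searched on a log grid from 0.1 nM to 1000 nM; for each candidate Kd
    the baseline and amplitude follow from ordinary linear least squares.
    """
    n = len(conc_nM)
    mean_y = sum(signal_mP) / n
    best = None
    for step in range(401):
        kd = 10 ** (-1 + step * 0.01)
        x = [c / (kd + c) for c in conc_nM]  # fractional saturation
        mean_x = sum(x) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, signal_mP))
        amplitude = sxy / sxx
        baseline = mean_y - amplitude * mean_x
        sse = sum((baseline + amplitude * xi - yi) ** 2
                  for xi, yi in zip(x, signal_mP))
        if best is None or sse < best[0]:
            best = (sse, kd, baseline, amplitude)
    _, kd, baseline, amplitude = best
    return kd, baseline, amplitude


if __name__ == "__main__":
    # Synthetic, noise-free data generated with Kd = 50 nM (illustration only).
    conc = [7.5, 15, 30, 60, 125, 250]
    signal = [80 + 120 * c / (50 + c) for c in conc]
    print(fit_one_site(conc, signal))
```

A lower fitted Kd for one polypeptide than for another, measured under the same conditions, would indicate tighter binding to the RNA guide.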
Example 3-RNA electrophoretic mobility shift assay for variant binary complex detection
This example describes the use of RNA EMSA to determine the ability of nuclease polypeptides (wild-type or variants) to bind to RNA guides.
Synthetic RNA guides from IDT™ were labeled at the 5' end with IRDye 800CW (also referred to as IR800 dye or IR800) using a 5' EndTag labeling kit (Vector Laboratories) and IRDye 800CW maleimide (LI-COR Biosciences), as previously detailed in Yan et al., 2018. After labeling, the RNA guides were cleaned and concentrated via phenol chloroform extraction. Concentrations were quantified by Nanodrop™.
For RNA binding assays, the nuclease polypeptide (wild-type or variant Cas12i4 polypeptide) was diluted to 2.5 μM in 1X binding buffer (50 mM NaCl, 10 mM Tris-HCl, 10 mM MgCl2, 1 mM DTT, pH 7.9). The polypeptides were then serially diluted from 2.5 μM to 37.5 μM in 1X binding buffer. The polypeptides were again diluted 1:10 in 1X binding buffer plus 50 nM IR800-labeled RNA guide and mixed thoroughly. These reactions can further include 0.5-5 μg tRNA, which acts as a competitive inhibitor to reduce non-specific binding of the polypeptide to RNA, thereby facilitating accurate specific binding assays. The reactions were incubated at 37 °C for 1 hour. 1 μL of 100X bromophenol blue was added to each reaction to visualize the dye front, and the entire reaction was loaded onto a 6% DNA retardation gel (ThermoFisher Scientific™), which was run at 80 V for 90 minutes. The gel was imaged on an Odyssey CLx.
The assay relies on the principle that the rate at which RNA migrates through the gel is determined by its size. An RNA-only sample migrates a specific distance. However, if the RNA binds to a polypeptide, a band representing a larger, less mobile RNA complex appears, which is "shifted up" on the gel.
Thus, the intensities of two bands are measured: 1) the RNA-only band and 2) the "up-shifted" band of RNA bound to polypeptide. If all of the RNA is bound to the polypeptide, only the up-shifted band is observed. As the concentration of polypeptide decreases, the intensity of the up-shifted band decreases, while the intensity of the RNA-only band increases. When comparing the RNA binding affinities of nuclease polypeptides (wild-type or variant Cas12i4 polypeptides), a higher polypeptide/RNA affinity is characterized by more specific binding at lower concentrations of the polypeptide.
This example indicates how the binding affinity of a wild-type nuclease polypeptide to an RNA guide and the binding affinity of a variant Cas12i4 polypeptide to an RNA guide can be determined and compared.
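The band-intensity comparison described above can be expressed as a fraction of RNA bound at each polypeptide concentration. The sketch below computes that fraction and interpolates the concentration at which half of the RNA is shifted. The intensity values, the concentrations and the log-linear interpolation are illustrative assumptions and do not come from this example.

```python
# Sketch only: quantifies an EMSA titration from two band intensities per lane.
import math


def fraction_bound(free_intensity: float, shifted_intensity: float) -> float:
    """Fraction of labeled RNA found in the up-shifted (bound) band."""
    total = free_intensity + shifted_intensity
    if total <= 0:
        raise ValueError("lane has no signal")
    return shifted_intensity / total


def half_bound_concentration(concs_nM, fractions):
    """Interpolate, on a log-concentration scale, where the bound fraction
    crosses 0.5. Returns None if the titration never crosses 0.5."""
    points = sorted(zip(concs_nM, fractions))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        if f_lo < 0.5 <= f_hi:
            t = (0.5 - f_lo) / (f_hi - f_lo)
            return math.exp(math.log(c_lo) + t * (math.log(c_hi) - math.log(c_lo)))
    return None


if __name__ == "__main__":
    # Hypothetical band intensities (free, shifted) for four lanes.
    lanes = {250: (100, 900), 125: (300, 700), 62.5: (600, 400), 31.25: (800, 200)}
    concs = list(lanes)
    fracs = [fraction_bound(free, shifted) for free, shifted in lanes.values()]
    print(half_bound_concentration(concs, fracs))
```

A polypeptide that reaches half-maximal shifting at a lower concentration than another, under identical conditions, would be read as having the higher apparent affinity.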
Example 4-DNA electrophoretic mobility shift assay for variant Cas12i4 ternary complex detection
This example describes the use of a DNA Electrophoretic Mobility Shift Assay (EMSA) to determine the ability of an RNA guide, a Cas12i4 polypeptide (wild-type or variant Cas12i4) and a target DNA substrate to form a ternary complex.
The wild-type Cas12i4 of SEQ ID NO. 2 and the Cas12i4 variant of SEQ ID NO. 4 were transformed into E. coli BL21(DE3) (New England Biolabs) and BL21(DE3)pLysS, respectively, and expressed under the T7 promoter. Transformed cells were initially grown overnight in 5 mL Luria broth (TEKNOVA™) + 50 μg/mL kanamycin and then inoculated into 1 L of Terrific broth (TEKNOVA™) + 50 μg/mL kanamycin. Cas12i4 wild-type and variant cells were grown at 37 °C to an OD600 of 0.6-0.8 and 3, respectively, and protein expression was then induced with 0.5 mM IPTG. The cultures were then grown at 18 °C for a further 14-18 hours. Cultures were harvested and pelleted via centrifugation, and then resuspended in 1 mL extraction buffer (50 mM HEPES, pH 7.5, 500 mM NaCl, 5% glycerol, 0.5 mM TCEP) per 5 g of cell pellet. Cells were lysed via a cell disruptor (Constant Systems Limited) and then centrifuged at 20,000 x g for 20 minutes at 4 °C to clarify the lysate. 0.2% polyethylenimine (PEI) was added to the clarified lysate, which was incubated at 4 °C for 20 minutes with constant end-over-end rotation. The lysate was then centrifuged again at 20,000 x g for 10 minutes. The wild-type Cas12i4 was purified via ion exchange and hydrophobic chromatography, and the variant Cas12i4 was purified via immobilized metal affinity chromatography and ion exchange chromatography. After purification, fractions were run on SDS-PAGE gels, and fractions containing protein of the appropriate size were pooled and concentrated using 30 kD Amicon Ultra-15 centrifugal devices. The protein was buffer exchanged into 12.5 mM HEPES pH 7.0, 120 mM NaCl, 0.5 mM TCEP and 50% glycerol. The concentration was then measured using a Nanodrop™ (ThermoFisher Scientific™), and the protein was stored at -20 °C.
RNPs were prepared using a 2:1 ratio of synthetic RNA guide (Integrated DNA Technologies, IDT™) to polypeptide. The RNA guide sequences are shown in Table 13. crRNA 1 (SEQ ID NO: 62) corresponds to target 1 (SEQ ID NO: 65), crRNA 2 (SEQ ID NO: 63) corresponds to target 2 (SEQ ID NO: 66), and crRNA 3 (SEQ ID NO: 64) corresponds to target 3 (SEQ ID NO: 67). The RNPs were complexed at 37 °C for 30 minutes in 1X NEBuffer™ 2 (NEB2; New England Biolabs; 50 mM NaCl, 10 mM Tris-HCl, 10 mM MgCl2, 1 mM DTT, pH 7.9). After complexing, a 5-point 1:2 serial dilution from 5 μM to 37.5 μM was performed using 1X NEB2 as the dilution buffer. Apo reactions (polypeptide without RNA guide) were prepared in the same manner, with H2O making up the volume of the RNA guide.
Table 13. DNA EMSA RNA guide sequences.
The dsDNA target substrates having the sequences in Table 14 were generated by PCR from oligonucleotides (Integrated DNA Technologies) using the primers in Table 15. The 5' end of the forward primer was labeled with IR800 dye prior to PCR, as described in Yan et al., 2018. The dsDNA substrates were then amplified with AmpliTaq (ThermoFisher Scientific™) using the IR800-labeled forward primer and the unlabeled reverse primer. The resulting dsDNA was purified using a DNA Clean and Concentrator kit (Zymo) and quantified by Nanodrop™ (ThermoFisher Scientific™).
Table 14. DNA EMSA target substrates.
Table 15. Primers for DNA EMSA target substrate production.
RNP samples and Apo (control) samples were diluted 1:10 into 1X binding buffer (50 mM NaCl, 10 mM Tris-HCl, 1 mM TCEP, 10% glycerol, 2 mM EDTA, pH 8.0) plus 20 nM IR800-labeled target DNA substrate and mixed thoroughly. The reactions were incubated at 37 °C for 1 hour. Bromophenol blue was added to each reaction to visualize the dye front, and the entire reaction was loaded onto a 6% DNA retardation gel (ThermoFisher Scientific™), which was run at 80 V for 90 minutes. The gel was imaged on an Odyssey CLx.
FIGS. 1A, 1B and 1C show EMSA gels of target 1 (AAVS1_T6), target 2 (AAVS1_T7) and target 3 (EMX1_T4), respectively. In each gel, the "Apo" lanes (lanes 1 and 8) include target DNA plus wild-type Cas12i4 (lane 1) or the Cas12i4 variant of SEQ ID NO. 4 (lane 8). The "Ref" lane includes only target DNA. Lanes 2-6 in FIGS. 1A, 1B and 1C correspond to decreasing concentrations, from 500 nM to 37 nM, of RNP comprising wild-type Cas12i4 (SEQ ID NO: 2). Lanes 9-13 in FIGS. 1A, 1B and 1C correspond to decreasing concentrations, from 500 nM to 37 nM, of RNP comprising the Cas12i4 variant of SEQ ID NO. 4.
The gels of FIGS. 1A, 1B and 1C show DNA bands migrating different distances. In this assay, the rate at which DNA migrates through the gel is determined by its size. A DNA-only sample migrates a specific distance. However, if an RNP binds to the DNA, a band representing a larger, less mobile DNA complex appears, which is "shifted up" on the gel. Thus, the arrows in FIGS. 1A, 1B, and 1C point to "unbound dsDNA" and "bound dsDNA", where "bound dsDNA" migrates a shorter distance than "unbound dsDNA".
FIG. 1A shows that even at the highest concentration of wild-type Cas12i4 RNP (lane 2) and the highest concentration of variant Cas12i4 RNP (lane 9), only unbound dsDNA bands are present, indicating that the wild-type and variant Cas12i4 RNPs do not form a ternary complex with AAVS1_T6 target DNA.
Fig. 1B shows that even at the highest concentration of wild-type Cas12i4 RNP (lane 2), only unbound dsDNA bands are present, indicating that the wild-type Cas12i4 RNP does not form a ternary complex with aavs1_t7 target DNA. However, a bound dsDNA band was observed in RNPs prepared with variant Cas12i4 (lanes 9-10). Thus, RNPs prepared with variant Cas12i4 have higher affinity for aavs1_t7 target DNA than wild-type Cas12i 4.
Also, fig. 1C shows that even at the highest concentration of wild-type Cas12i4 RNP (lane 2), there is only an unbound dsDNA band, indicating that the wild-type Cas12i4 RNP does not form a ternary complex with EMX1 target DNA. However, a bound dsDNA band was observed in RNP prepared with variant Cas12i4 (lane 10). Thus, RNPs prepared with variants have higher affinity for EMX1 target DNA than wild-type Cas12i 4.
Based on the data in FIGS. 1A, 1B, and 1C, RNPs prepared with variant Cas12i4 have higher affinity for multiple dsDNA targets than wild-type Cas12i4 RNPs.
To demonstrate that the substrate DNA up-shift is sequence dependent, RNPs were incubated with mismatched target substrates. These reactions were performed in the same manner, with 1X NEB2 buffer making up any volume of polypeptide. The reactions comprising the Cas12i4 polypeptide (wild-type or variant), crRNA 1 (SEQ ID NO: 62), and DNA target 3 (SEQ ID NO: 67) are shown in FIG. 1D.
In the gel of FIG. 1D, the "Apo" lanes (lanes 1 and 8) include target 3 DNA (SEQ ID NO: 67) plus wild-type Cas12i4 (lane 1) or variant Cas12i4 (lane 8). The "Ref" lane includes only target 3 DNA. Lanes 2-6 in FIG. 1D correspond to decreasing concentrations, from 500 nM to 37 nM, of wild-type Cas12i4 RNP prepared with crRNA 1 (SEQ ID NO: 62). Lanes 9-13 in FIG. 1D correspond to decreasing concentrations, from 500 nM to 37 nM, of RNP prepared with the Cas12i4 variant of SEQ ID NO. 4 and crRNA 1 (SEQ ID NO. 62).
As shown in fig. 1D, dsDNA remained unbound by RNP at all concentrations, indicating that RNP of wild-type and variant Cas12i4 were unable to form ternary complexes. Thus, as shown in FIGS. 1B and 1C, the ability of RNPs to bind to target DNA substrates depends on the sequences of the RNA guide and target DNA substrates.
Taken together, this example demonstrates that RNP (binary complex) prepared with variant Cas12i4 polypeptides has a higher affinity for multiple DNA targets (to produce ternary complexes) than wild-type Cas12i4 RNP.
Example 5-in vitro cleavage assay for determining variant Cas12i4 ternary complex formation
This example describes a method for assessing the in vitro biochemical activity of Cas12i4 (wild-type or variant Cas12i 4) RNP on a target DNA substrate as a means for determining ternary complex formation.
The RNA guides and dsDNA substrates in this example are the same as in Tables 13 and 14, respectively. The dsDNA substrates in this assay were left unlabeled. RNP and apo samples were generated and incubated in the same manner as described in Example 4, and then serially diluted from 1 μM to 15.7 nM in 1X NEB2. RNP and apo samples were then further diluted 1:10 into 1X NEB2, and target dsDNA substrate was added at 20 nM. The reactions were mixed thoroughly and incubated at 37 °C for 1 hour, and then quenched with 1 μL of 20 mg/mL proteinase K (ThermoFisher Scientific™). The reactions were incubated at 50 °C for an additional 15 minutes, and the entire reactions were then run on a 2% agarose E-Gel (ThermoFisher Scientific™). The gels were visualized by ethidium bromide on a Gel Doc™ EZ gel imager.
FIGS. 2A, 2B and 2C show the cleavage gels of target 1 (AAVS1_T6), target 2 (AAVS1_T7) and target 3 (EMX1_T4), respectively. In each gel, the "Apo" lanes (lanes 1 and 11) include target DNA plus wild-type Cas12i4 (lane 1) or the Cas12i4 variant of SEQ ID NO. 4 (lane 11). The "Ref" lane includes only target DNA. Lanes 2-9 in FIGS. 2A, 2B and 2C correspond to decreasing concentrations, from 1 μM to 15.7 nM, of RNP comprising wild-type Cas12i4 (SEQ ID NO: 2). Lanes 12-19 in FIGS. 2A, 2B and 2C correspond to decreasing concentrations, from 1 μM to 15.7 nM, of RNP comprising the Cas12i4 variant of SEQ ID NO. 4.
In FIGS. 2A, 2B and 2C, the intensities of two types of bands were measured: 1) the full-length (uncleaved) DNA band and 2) one or more down-shifted, cleaved DNA bands. An inactive RNP is characterized by a full-length DNA band only (e.g., the RNP is unable to form a ternary complex with the DNA substrate). An active RNP produces one or more down-shifted, cleaved DNA bands (e.g., the RNP is capable of forming a ternary complex with the DNA substrate). As the concentration of active RNP decreases, the intensity of the full-length band increases and the intensity of the one or more cleaved bands decreases. When comparing the activities of multiple RNPs, an RNP with higher activity than another is characterized by stronger cleaved bands at lower RNP concentrations.
FIGS. 2A, 2B and 2C show that both wild-type Cas12i4 and variant Cas12i4 cleave each target in vitro. However, variant Cas12i4 is able to cleave each target at lower RNP concentrations. Thus, the variant Cas12i4 of SEQ ID NO. 4 exhibits higher cleavage activity than the wild-type Cas12i4.
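The comparison made above, namely which RNP still cleaves at the lowest concentration, can be stated numerically from the band intensities. The following sketch computes the cleaved fraction per lane and reports the lowest RNP concentration at which that fraction meets a chosen threshold. The threshold of 0.5 and all intensity values are hypothetical and are not measurements from FIGS. 2A-2C.

```python
# Sketch only: compares cleavage titrations of two RNPs from band intensities.

def fraction_cleaved(uncleaved: float, cleaved_bands) -> float:
    """Cleaved signal divided by total signal in the lane."""
    cleaved = sum(cleaved_bands)
    total = uncleaved + cleaved
    if total <= 0:
        raise ValueError("lane has no signal")
    return cleaved / total


def lowest_active_concentration(titration, threshold: float = 0.5):
    """titration maps RNP concentration (nM) to (uncleaved, [cleaved bands]).

    Returns the lowest concentration whose cleaved fraction is at least
    `threshold`, or None if no lane reaches it.
    """
    active = [conc for conc, (uncleaved, bands) in titration.items()
              if fraction_cleaved(uncleaved, bands) >= threshold]
    return min(active) if active else None


if __name__ == "__main__":
    # Hypothetical intensities for illustration only.
    wild_type = {1000: (200, [400, 400]), 500: (600, [200, 200]), 250: (900, [50, 50])}
    variant = {1000: (0, [500, 500]), 500: (100, [450, 450]), 250: (400, [300, 300])}
    print(lowest_active_concentration(wild_type))  # 1000
    print(lowest_active_concentration(variant))    # 250
```

Under this reading, the RNP with the lower value is the more active one at the tested conditions.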
Example 6-in vitro stability assay for variant Cas12i4 polypeptides and variant binary complexes
In this example, stability of the variant RNP was assessed.
For the accelerated stability study, RNP (5 μm) was produced in the same manner as described in example 4, followed by storage of the samples at 25 ℃ for 48 hours.
RNP samples were subjected to an in vitro cleavage assay (described in example 5). These results were compared with the results of example 5 to determine the extent to which variant RNPs stored at 25 ℃ for 48 hours remained biochemically active.
Apo polypeptides (without RNA guide) were also incubated for 48 hours at 25 ℃. The RNA EMSA assay was performed on apo samples using the method described in example 3. These results were compared to the results of example 3 to determine the extent to which the variant nuclease was able to form a binary complex with the RNA guide.
Apo samples incubated for 48 hours at 25℃were also complexed with RNA guides to form RNPs using the method described in example 4. An in vitro cleavage assay was then performed according to the method of example 5. The assay results were compared to the assay results of example 5 to assess the activity level of the variant RNP formed with the protein incubated at 25 ℃.
The methods of the present examples allow comparison of the stability of wild-type and variant Cas12i4 polypeptides and wild-type and variant RNPs (binary complexes). A nuclease polypeptide that demonstrates greater specific binding to the RNA guide than another nuclease polypeptide to the RNA guide indicates a more stable polypeptide. RNP that demonstrates more robust cleavage of target DNA in vitro than cleavage of another RNP with a different nuclease polypeptide indicates a more stable binary complex.
Example 7-targeting of mammalian genes by variant nucleases
This example describes indel assessment of multiple targets using wild-type Cas12i4 and Cas12i4 variants introduced into mammalian cells by transient transfection.
The nucleases of SEQ ID NO. 2, SEQ ID NO. 3 and SEQ ID NO. 4 were cloned into the pcDNA3.1 backbone. RNA guides were cloned into the pUC19 backbone (New England Biolabs). The plasmids were then prepared in large quantities and diluted to 1 μg/μL. RNA guides and target sequences are shown in Table 16.
Table 16. Mammalian targets and corresponding crRNAs.
Approximately 16 hours prior to transfection, 25,000 HEK293T cells in 100 μL of DMEM/10% FBS + penicillin/streptomycin were plated into each well of a 96-well plate. On the day of transfection, the cells were 70%-90% confluent. For each well to be transfected, a mixture of 0.5 μL Lipofectamine™ 2000 and 9.5 μL Opti-MEM™ was prepared and then incubated at room temperature for 5-20 minutes (Solution 1). After incubation, the Lipofectamine™:Opti-MEM™ mixture was added to a separate mixture containing 126 ng of nuclease plasmid and 174 ng of guide plasmid, brought up to 10 μL with water (Solution 2). For the negative control, the crRNA was not included in Solution 2. The Solution 1 and Solution 2 mixtures were mixed by pipetting up and down and then incubated at room temperature for 25 minutes. After incubation, 20 μL of the combined Solution 1 and Solution 2 mixture was added dropwise to each well of the 96-well plate containing the cells. At 72 hours post-transfection, the cells were trypsinized by adding 10 μL of TrypLE™ to the center of each well and incubating for approximately 5 minutes. 100 μL of culture medium was then added to each well and mixed to resuspend the cells. The cells were then spun down at 500 g for 10 minutes, and the supernatant was discarded. QuickExtract™ buffer was added at 1/5 the volume of the original cell suspension. The cells were incubated at 65 °C for 15 minutes, 68 °C for 15 minutes, and 98 °C for 10 minutes.
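The per-well quantities given above scale linearly with the number of wells, so a master mix can be calculated directly. The sketch below uses the stated per-well amounts (0.5 μL Lipofectamine, 9.5 μL Opti-MEM, 126 ng nuclease plasmid, 174 ng guide plasmid, Solution 2 brought to 10 μL) and the 1 μg/μL plasmid stock concentration from this example. The function name and the 10% pipetting overage are assumptions added for illustration.

```python
# Sketch only: scales the per-well transfection recipe to a master mix.

PLASMID_STOCK_NG_PER_UL = 1000.0   # plasmids diluted to 1 ug/uL
LIPOFECTAMINE_UL = 0.5             # per well, Solution 1
OPTIMEM_UL = 9.5                   # per well, Solution 1
NUCLEASE_NG = 126.0                # per well, Solution 2
GUIDE_NG = 174.0                   # per well, Solution 2
SOLUTION2_UL = 10.0                # per well, final volume of Solution 2


def transfection_master_mix(n_wells: int, overage: float = 0.10) -> dict:
    """Return component volumes (uL) for n_wells plus a fractional overage.

    The overage (extra volume for pipetting loss) is an assumed convenience
    and is not part of the described protocol.
    """
    if n_wells <= 0 or overage < 0:
        raise ValueError("n_wells must be positive and overage non-negative")
    scale = n_wells * (1.0 + overage)
    nuclease_ul = NUCLEASE_NG / PLASMID_STOCK_NG_PER_UL
    guide_ul = GUIDE_NG / PLASMID_STOCK_NG_PER_UL
    water_ul = SOLUTION2_UL - nuclease_ul - guide_ul
    return {
        "lipofectamine_ul": LIPOFECTAMINE_UL * scale,
        "optimem_ul": OPTIMEM_UL * scale,
        "nuclease_plasmid_ul": nuclease_ul * scale,
        "guide_plasmid_ul": guide_ul * scale,
        "water_ul": water_ul * scale,
    }


if __name__ == "__main__":
    for name, volume in transfection_master_mix(96).items():
        print(f"{name}: {volume:.2f}")
```

For a negative control without crRNA, the guide plasmid volume would be replaced by water so that Solution 2 still totals 10 μL per well.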
Samples for next generation sequencing were prepared by two rounds of PCR. The first round (PCR1) was used to amplify specific genomic regions depending on the target. The PCR1 products were purified by column purification. The second round (PCR2) was performed to add adapters and indexes. The reactions were then pooled and purified by column purification. Sequencing runs were performed with a 150-cycle NextSeq™ v2.5 mid or high output kit.
FIG. 3 shows the indel activity of wild-type Cas12i4 of SEQ ID NO. 2, variant Cas12i4 of SEQ ID NO. 3, and variant Cas12i4 of SEQ ID NO. 4. The variant Cas12i4 of SEQ ID No. 3 and the variant Cas12i4 of SEQ ID No. 4 demonstrate a higher indel activity at each target compared to the wild type Cas12i4 of SEQ ID No. 2. Thus, the engineered Cas12i4 variants demonstrate increased nuclease activity in mammalian cells.
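Indel activity of the kind reported above is typically expressed as the percentage of sequencing reads at a target that carry an insertion or deletion. The analysis pipeline used for FIG. 3 is not described here, so the sketch below is only a deliberately simplified stand-in: it assumes reads already trimmed to span the full amplicon and counts a read as an indel when its length differs from the reference. Production tools align reads to the reference instead.

```python
# Sketch only: naive length-based indel percentage for amplicon reads.

def indel_percent(reads, reference: str) -> float:
    """Percentage of reads whose length differs from the reference amplicon.

    Assumes each read covers the whole amplicon. Substitutions, and indels
    that happen to preserve length, are not detected by this simplification.
    """
    if not reads:
        raise ValueError("no reads supplied")
    indel_reads = sum(1 for read in reads if len(read) != len(reference))
    return 100.0 * indel_reads / len(reads)


def background_subtracted(treated_percent: float, control_percent: float) -> float:
    """Subtract the no-crRNA control rate, never returning a negative value."""
    return max(0.0, treated_percent - control_percent)


if __name__ == "__main__":
    reference = "ACGTACGTAC"                       # hypothetical amplicon
    reads = ([reference] * 6                        # unedited
             + ["ACGTCGTAC"] * 2                    # 1-nt deletion
             + ["ACGTAACGTAC"]                      # 1-nt insertion
             + ["ACGTTCGTAC"])                      # substitution only
    print(indel_percent(reads, reference))          # 30.0
```

Comparing the treated sample against the negative control lacking crRNA helps separate editing from sequencing and PCR error.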
Example 8-targeting of mammalian genes by variant nucleases using 5'-NTTN-3' and 5'-NVTN-3' PAM sequences
This example describes indel assessment of multiple targets adjacent to a 5'-NTTN-3' or 5'-NVTN-3' PAM using wild-type Cas12i4 and Cas12i4 variants introduced into mammalian cells by transient transfection.
Nuclease and RNA guide constructs were prepared and transfected into HEK293T cells as described in Example 7. RNA guides and target sequences are shown in Table 17.
Table 17. Mammalian targets, PAMs and corresponding crRNAs.
Indels were analyzed as described in Example 7, and the results are shown in FIG. 4. Open shapes represent targets using a 5'-NVTN-3' PAM, and solid shapes represent targets using a 5'-NTTN-3' PAM. Circles represent wild-type Cas12i4 (SEQ ID NO: 2), and squares represent the Cas12i4 variant of SEQ ID NO: 4. Bars represent the average indels across all targets. The variant Cas12i4 of SEQ ID NO. 4 shows higher indel activity than the wild-type Cas12i4 of SEQ ID NO. 2, and the use of a 5'-NTTN-3' PAM results in higher indel levels than the use of a 5'-NVTN-3' PAM.
This example shows that indels can be induced by Cas12i4 (wild-type or variant Cas12i4) using a 5'-NTTN-3' PAM or a 5'-NVTN-3' PAM.

Claims (95)

1. A variant Cas12i4 polypeptide comprising a sequence having at least 95% identity to the sequence set forth in any one of SEQ ID NOs 3-59.
2. The variant Cas12i4 polypeptide of claim 1, wherein the variant Cas12i4 polypeptide is a variant of the parent polypeptide of SEQ ID No. 2.
3. The variant Cas12i4 polypeptide of claim 1 or 2, wherein the variant Cas12i4 polypeptide comprises the substitution of table 2.
4. The variant Cas12i4 polypeptide of any one of claims 1-3, comprising the sequence set forth in any one of SEQ ID NOs 3-59.
5. The variant Cas12i4 polypeptide of any one of claims 1-4, comprising the sequence set forth in SEQ ID No. 3.
6. The variant Cas12i4 polypeptide of any one of claims 1-4, comprising the sequence set forth in SEQ ID No. 4.
7. The variant Cas12i4 polypeptide of any one of claims 1-6, wherein the variant Cas12i4 polypeptide exhibits increased binary complex formation with an RNA guide relative to a parent polypeptide.
8. The variant Cas12i4 polypeptide of any one of claims 1-7, wherein a binary complex comprising the variant Cas12i4 polypeptide exhibits increased stability relative to a parent binary complex.
9. The variant Cas12i4 polypeptide of any one of claims 1-8, wherein the variant Cas12i4 polypeptide exhibits increased nuclease activity relative to the parent polypeptide.
10. The variant Cas12i4 polypeptide of any one of claims 1-9, wherein the variant Cas12i4 polypeptide further comprises the substitution of table 4.
11. The variant Cas12i4 polypeptide of claim 10, wherein the substitution of table 4 increases binary complex formation with the RNA guide relative to the parent polypeptide.
12. The variant Cas12i4 polypeptide of claim 10 or 11, wherein the substitution of table 4 increases the stability of the binary complex comprising the variant Cas12i4 polypeptide relative to the parent binary complex.
13. The variant Cas12i4 polypeptide of any one of claims 1-12, wherein the variant Cas12i4 polypeptide further comprises a substitution that increases ternary complex formation with the RNA guide and the target nucleic acid relative to the parent polypeptide.
14. The variant Cas12i4 polypeptide of any one of claims 1-13, wherein the variant Cas12i4 polypeptide further comprises a substitution that increases ternary complex stability relative to the parent polypeptide.
15. The variant Cas12i4 polypeptide of claim 13 or 14, wherein the substitution is a substitution of table 5, table 6, table 7, table 8, table 9, and/or table 10.
16. The variant Cas12i4 polypeptide of any one of claims 1-15, wherein the variant Cas12i4 polypeptide further comprises a substitution that increases on-target binding to a target nucleic acid relative to the parent polypeptide.
17. The variant Cas12i4 polypeptide of any one of claims 1-10, wherein the substitution is a substitution of table 11.
18. A composition comprising the variant Cas12i4 polypeptide of any one of claims 1-17, wherein the composition further comprises an RNA guide or a nucleic acid encoding the RNA guide, wherein the RNA guide comprises a direct repeat sequence and a spacer sequence.
19. The composition of claim 18, wherein the direct repeat sequence comprises:
a. nucleotide 1 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
b. nucleotide 2 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
c. Nucleotide 3 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
d. nucleotide 4 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
e. nucleotide 5 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
f. nucleotide 6 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
g. nucleotide 7 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
h. Nucleotide 8 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
i. nucleotide 9 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
j. nucleotide 10 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
k. nucleotide 11 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
l. nucleotide 12 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
m. nucleotide 13 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
n. nucleotide 14 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124; or
o. a sequence having at least 90% identity to the sequence of SEQ ID NO. 61 or a portion thereof.
20. The composition of claim 19, wherein the direct repeat sequence comprises:
a. nucleotide 1 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
b. nucleotide 2 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
c. nucleotide 3 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
d. Nucleotide 4 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
e. nucleotide 5 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
f. nucleotide 6 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
g. nucleotide 7 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
h. nucleotide 8 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
i. Nucleotide 9 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
j. nucleotide 10 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
k. nucleotide 11 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
l. nucleotide 12 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
m. nucleotide 13 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
n. nucleotide 14 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124; or
o. a sequence having at least 95% identity to the sequence of SEQ ID NO. 61 or a portion thereof.
21. The composition of claim 20, wherein the direct repeat sequence comprises:
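a. nucleotide 1 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
b. nucleotide 2 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
c. nucleotide 3 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
d. nucleotide 4 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
e. nucleotide 5 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
f. nucleotide 6 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
g. nucleotide 7 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
h. nucleotide 8 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
i. nucleotide 9 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
j. nucleotide 10 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
k. nucleotide 11 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
l. nucleotide 12 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
m. nucleotide 13 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
n. nucleotide 14 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124; or
o. SEQ ID NO. 61 or a portion thereof.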
22. The composition of any one of claims 18 to 21, wherein the direct repeat sequence comprises AGN₁N₂N₃N₄GUGUN₅N₆N₇CAGN₈GACN₉C (SEQ ID NO: 125), wherein N₁ is A or G, N₂ is C or U, N₃ is A or G, N₄ is U or C, N₅ is C or U, N₆ is C or U, N₇ is U, A, C, or G, N₈ is U or C, and N₉ is A or C.
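The degenerate motif recited in claim 22 (SEQ ID NO: 125) can be read as a pattern with nine variable positions. The following is a minimal Python sketch of that reading, not part of the claims: the function name `contains_dr_motif` and the pattern name `DR_MOTIF` are hypothetical, and the handling of lowercase and DNA-style (T for U) input is an added assumption for convenience.

```python
import re

# Positions N1..N9 from the claim written as character classes:
# N1=[AG] N2=[CU] N3=[AG] N4=[UC] N5=[CU] N6=[CU] N7=[ACGU] N8=[UC] N9=[AC]
DR_MOTIF = re.compile(r"AG[AG][CU][AG][UC]GUGU[CU][CU][ACGU]CAG[UC]GAC[AC]C")


def contains_dr_motif(rna: str) -> bool:
    """Return True if the sequence contains the 22-nt degenerate motif."""
    # Assumption: accept lowercase and DNA-style input by mapping T to U.
    normalized = rna.upper().replace("T", "U")
    return DR_MOTIF.search(normalized) is not None
```

A sequence satisfies the motif only if every fixed nucleotide matches and each variable position carries one of its permitted nucleotides; any test sequence passed to the function is the caller's own and is not taken from the sequence listing.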
23. The composition of any one of claims 18 to 22, wherein the spacer sequence is about 15 nucleotides to about 35 nucleotides in length.
24. The composition of any one of claims 18 to 23, wherein the spacer sequence binds to a target strand sequence of a target nucleic acid, and wherein a non-target strand sequence of the target nucleic acid sequence is adjacent to a Protospacer Adjacent Motif (PAM) sequence.
25. The composition of claim 24, wherein the PAM sequence is 5'-TTN-3', 5'-NTTN-3', 5'-NTN-3', 5'-NNTN-3', 5'-VTN-3', or 5'-NVTN-3', wherein N is any nucleotide and V is A, G, or C.
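The PAM alternatives in claim 25 use degenerate codes, with N standing for any nucleotide and V for A, G, or C. The sketch below, offered only as an illustration, checks a candidate sequence against each listed PAM; the names `IUPAC`, `PAMS`, `matches_pam`, and `matching_pams` are hypothetical, and the code says nothing about where the PAM sits relative to the target.

```python
# Degenerate codes used in the claim: N is any nucleotide, V is A, G, or C.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "V": "ACG"}

# PAM alternatives as listed in the claim, written 5' to 3'.
PAMS = ("TTN", "NTTN", "NTN", "NNTN", "VTN", "NVTN")


def matches_pam(candidate: str, pam: str) -> bool:
    """Return True if candidate (5' to 3') satisfies the degenerate PAM."""
    candidate = candidate.upper()
    if len(candidate) != len(pam):
        return False
    # Every base must be one of the nucleotides allowed at that position.
    return all(base in IUPAC[code] for base, code in zip(candidate, pam))


def matching_pams(candidate: str) -> list:
    """Return every listed PAM that the candidate sequence satisfies."""
    return [pam for pam in PAMS if matches_pam(candidate, pam)]
```

For example, `matching_pams("ATG")` returns `["NTN", "VTN"]`: the first base A is excluded by 5'-TTN-3' but permitted by both N and V.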
26. The variant Cas12i4 polypeptide or composition of any preceding claim, wherein the variant Cas12i4 polypeptide further comprises a Nuclear Localization Signal (NLS).
27. The variant Cas12i4 polypeptide or composition of any preceding claim, wherein the variant Cas12i4 polypeptide further comprises a peptide tag, a fluorescent protein, a base editing domain, a DNA methylation domain, a histone residue modification domain, a localization factor, a transcription modification factor, a light gating factor, a chemically inducible factor, or a chromatin visualization factor.
28. A composition comprising a nucleic acid encoding the Cas12i4 polypeptide or composition of any preceding claim.
29. The composition of claim 28, wherein the nucleic acid is codon optimized for expression in a cell.
30. The composition of claim 28 or 29, wherein the nucleic acid is operably linked to a promoter.
31. The composition of any one of claims 28 to 30, wherein the nucleic acid is in a vector.
32. The composition of claim 31, wherein the vector comprises a retroviral vector, a lentiviral vector, a phage vector, an adenoviral vector, an adeno-associated vector, or a herpes simplex vector.
33. The variant Cas12i4 polypeptide or composition of any preceding claim, wherein the variant Cas12i4 polypeptide is present in a delivery system comprising a nanoparticle (e.g., a lipid nanoparticle), a liposome, an exosome, a microbubble, or a gene gun.
34. A cell comprising the variant Cas12i4 polypeptide or composition of any preceding claim.
35. The cell of claim 34, wherein the cell is a eukaryotic cell.
36. The cell of claim 34 or 35, wherein the cell is a mammalian cell or a plant cell.
37. The cell of any one of claims 34 to 36, wherein the cell is a human cell.
38. A composition comprising a variant Cas12i4 polypeptide or a complex comprising the variant Cas12i4 polypeptide, wherein the variant Cas12i4 polypeptide comprises a sequence having at least 95% identity to the sequence set forth in any one of SEQ ID NOs 3-59, and wherein the variant Cas12i4 polypeptide or the complex exhibits enhanced enzymatic activity, enhanced binding specificity, and/or enhanced stability relative to the parent polypeptide or the complex comprising the parent polypeptide.
39. The composition of claim 38, wherein the variant Cas12i4 polypeptide comprises the substitutions of table 2, table 4, table 5, table 6, table 7, table 8, table 9, table 10, and/or table 11.
40. The composition of claim 38 or 39, wherein the variant Cas12i4 polypeptide comprises the sequence set forth in any one of SEQ ID NOs 3-59.
41. The composition of any one of claims 38 to 40, wherein the variant Cas12i4 polypeptide comprises the sequence set forth in SEQ ID No. 3.
42. The composition of any one of claims 38 to 41, wherein the variant Cas12i4 polypeptide comprises the sequence set forth in SEQ ID No. 4.
43. The composition of any one of claims 38 to 42, wherein the enhanced enzymatic activity is enhanced nuclease activity.
44. The composition of any one of claims 38 to 43, wherein the variant Cas12i4 polypeptide exhibits enhanced binding activity to an RNA guide relative to the parent polypeptide.
45. The composition of any one of claims 38 to 44, wherein the variant Cas12i4 polypeptide exhibits enhanced binding specificity to an RNA guide relative to the parent polypeptide.
46. The composition of any one of claims 38 to 45, wherein the complex comprising the variant Cas12i4 polypeptide is a variant binary complex further comprising an RNA guide, and the variant binary complex exhibits enhanced binding activity (e.g., on-target binding activity) to a target nucleic acid relative to a parent binary complex.
47. The composition of any one of claims 38 to 46, wherein the complex comprising the variant Cas12i4 polypeptide is a variant binary complex further comprising an RNA guide, and the variant binary complex exhibits enhanced binding specificity (e.g., on-target binding specificity) to a target nucleic acid relative to a parent binary complex.
48. The composition of any one of claims 38 to 47, wherein the complex comprising the variant Cas12i4 polypeptide is a variant binary complex further comprising an RNA guide and the variant binary complex exhibits enhanced stability relative to the parent binary complex.
49. The composition of any one of claims 38 to 48, wherein the variant binary complex and the target nucleic acid form a variant ternary complex, and the variant ternary complex exhibits increased stability relative to the parent ternary complex.
50. The composition of any one of claims 38 to 49, wherein the variant Cas12i4 polypeptide further exhibits enhanced binary complex formation, enhanced protein-RNA interactions, and/or reduced dissociation with RNA guides relative to the parent polypeptide.
51. The composition of any one of claims 38 to 50, wherein the variant binary complex further exhibits reduced dissociation from the target nucleic acid, and/or reduced off-target binding to non-target nucleic acid relative to the parent binary complex.
52. The composition of any one of claims 38 to 51, wherein the enhanced enzymatic activity, enhanced binding specificity, and/or enhanced stability occurs over a range of temperatures, e.g., 20 °C to 65 °C.
53. The composition of any one of claims 38 to 52, wherein the enhanced enzymatic activity, enhanced binding specificity, and/or enhanced stability occurs over a range of incubation times.
54. The composition of any one of claims 38 to 53, wherein the enhanced enzymatic activity, enhanced binding specificity, and/or enhanced stability occurs in a buffer having a pH in the range of about 7.3 to about 8.6.
55. The composition of any one of claims 38 to 54, wherein the enhanced enzymatic activity, enhanced binding specificity, and/or enhanced stability occurs when the Tm value of the variant Cas12i4 polypeptide, variant binary complex, or variant ternary complex is at least 8 °C greater than the Tm value of the parent polypeptide, parent binary complex, or parent ternary complex.
56. The composition of any one of claims 38 to 55, wherein the variant Cas12i4 polypeptide comprises a RuvC domain or split RuvC domain.
57. The composition of any one of claims 38 to 56, wherein the parent polypeptide comprises the sequence of SEQ ID No. 2.
58. The composition of any one of claims 38 to 57, wherein the RNA guide comprises a direct repeat sequence and a spacer sequence.
59. The composition of claim 58, wherein the direct repeat sequence comprises:
a. nucleotide 1 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
b. nucleotide 2 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
c. nucleotide 3 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
d. nucleotide 4 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
e. nucleotide 5 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
f. nucleotide 6 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
g. nucleotide 7 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
h. nucleotide 8 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
i. nucleotide 9 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
j. nucleotide 10 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
k. nucleotide 11 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
l. nucleotide 12 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
m. nucleotide 13 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
n. nucleotide 14 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124; or
o. a sequence having at least 90% identity to the sequence of SEQ ID NO. 61 or a portion thereof.
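The truncated direct repeat alternatives listed in claim 59 differ only in their starting position (nucleotide 1 through nucleotide 14) and all end at nucleotide 36. The sketch below illustrates that numbering under two stated assumptions: positions are 1-based and inclusive, and percent identity is computed as an ungapped, position-by-position comparison of equal-length sequences. The second assumption is a simplification, since the application may define identity through sequence alignment. All function names are hypothetical, and any sequence passed in is the caller's own placeholder rather than a sequence from the listing.

```python
def subsequence(seq: str, start: int, end: int = 36) -> str:
    """Return nucleotides start..end of seq (1-based, inclusive positions)."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("positions fall outside the sequence")
    return seq[start - 1:end]


def truncation_series(seq: str, first: int = 1, last: int = 14, end: int = 36) -> dict:
    """Map each start position to the fragment running from it to `end`."""
    return {start: subsequence(seq, start, end) for start in range(first, last + 1)}


def percent_identity(a: str, b: str) -> float:
    """Ungapped identity between equal-length sequences (simplifying assumption)."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)
```

Under this numbering, a 36-nt repeat yields 14 fragments, ranging from the full 36 nucleotides (start at nucleotide 1) down to 23 nucleotides (start at nucleotide 14).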
60. The composition of claim 58 or 59, wherein the direct repeat sequence comprises:
a. nucleotide 1 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
b. nucleotide 2 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
c. nucleotide 3 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
d. nucleotide 4 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
e. nucleotide 5 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
f. nucleotide 6 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
g. nucleotide 7 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
h. nucleotide 8 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
i. nucleotide 9 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
j. nucleotide 10 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
k. nucleotide 11 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
l. nucleotide 12 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
m. nucleotide 13 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
n. nucleotide 14 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124; or
o. a sequence having at least 95% identity to the sequence of SEQ ID NO. 61 or a portion thereof.
61. The composition of any one of claims 58 to 60, wherein the direct repeat sequence comprises:
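a. nucleotide 1 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
b. nucleotide 2 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
c. nucleotide 3 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
d. nucleotide 4 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
e. nucleotide 5 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
f. nucleotide 6 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
g. nucleotide 7 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
h. nucleotide 8 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
i. nucleotide 9 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
j. nucleotide 10 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
k. nucleotide 11 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
l. nucleotide 12 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
m. nucleotide 13 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
n. nucleotide 14 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124; or
o. SEQ ID NO. 61 or a portion thereof.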
62. The composition of any one of claims 58 to 61, wherein the direct repeat sequence comprises AGN₁N₂N₃N₄GUGUN₅N₆N₇CAGN₈GACN₉C (SEQ ID NO: 125), wherein N₁ is A or G, N₂ is C or U, N₃ is A or G, N₄ is U or C, N₅ is C or U, N₆ is C or U, N₇ is U, A, C, or G, N₈ is U or C, and N₉ is A or C.
63. The composition of any one of claims 58 to 62, wherein the spacer sequence is 15 to 35 nucleotides in length.
64. The composition of any one of claims 58 to 63, wherein the spacer sequence comprises complementarity to a target strand sequence of a target nucleic acid.
65. The composition of claim 64, wherein the target nucleic acid comprises a non-target strand sequence adjacent to a Protospacer Adjacent Motif (PAM) sequence.
66. The composition of claim 65, wherein the PAM sequence is 5'-TTN-3', 5'-NTTN-3', 5'-NTN-3', 5'-NNTN-3', 5'-VTN-3', or 5'-NVTN-3', wherein N is any nucleotide (e.g., A, G, T, or C) and V is A, G, or C.
67. The composition of any one of claims 38 to 66, wherein the variant Cas12i4 polypeptide further comprises a peptide tag, a fluorescent protein, a base editing domain, a DNA methylation domain, a histone residue modification domain, a localization factor, a transcription modification factor, a light gating factor, a chemically inducible factor, or a chromatin visualization factor.
68. A composition comprising a nucleic acid encoding the Cas12i4 polypeptide of any one of claims 38-67, wherein optionally the nucleic acid is codon optimized for expression in a cell.
69. The composition of claim 68, wherein the cell is a eukaryotic cell.
70. The composition of claim 68 or 69, wherein the cell is a mammalian cell or a plant cell.
71. The composition of any one of claims 68 to 70, wherein the cell is a human cell.
72. The composition of claim 68, wherein the nucleic acid encoding the variant Cas12i4 polypeptide is operably linked to a promoter.
73. The composition of claim 68 or 72, wherein the nucleic acid encoding the variant Cas12i4 polypeptide is located in a vector.
74. The composition of claim 73, wherein the vector comprises a retroviral vector, a lentiviral vector, a phage vector, an adenoviral vector, an adeno-associated vector, or a herpes simplex vector.
75. The composition of any one of claims 38 to 74, wherein the composition is present in a delivery composition comprising nanoparticles (e.g., lipid nanoparticles), liposomes, exosomes, microbubbles, or gene-guns.
76. An RNA guide or a nucleic acid encoding the RNA guide, wherein the RNA guide comprises a direct repeat sequence comprising:
a. nucleotide 1 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
b. nucleotide 2 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
c. nucleotide 3 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
d. nucleotide 4 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
e. nucleotide 5 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
f. nucleotide 6 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
g. nucleotide 7 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
h. nucleotide 8 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
i. nucleotide 9 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
j. nucleotide 10 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
k. nucleotide 11 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
l. nucleotide 12 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
m. nucleotide 13 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
n. nucleotide 14 to nucleotide 36 of a sequence having at least 90% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124; or
o. a sequence having at least 90% identity to the sequence of SEQ ID NO. 61 or a portion thereof.
77. The RNA guide of claim 76, or a nucleic acid encoding the RNA guide, wherein the direct repeat sequence comprises:
a. nucleotide 1 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
b. nucleotide 2 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
c. nucleotide 3 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
d. nucleotide 4 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
e. nucleotide 5 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
f. nucleotide 6 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
g. nucleotide 7 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
h. nucleotide 8 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
i. nucleotide 9 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
j. nucleotide 10 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
k. nucleotide 11 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
l. nucleotide 12 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
m. nucleotide 13 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
n. nucleotide 14 to nucleotide 36 of a sequence having at least 95% identity to the sequence of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124; or
o. a sequence having at least 95% identity to the sequence of SEQ ID NO. 61 or a portion thereof.
78. The RNA guide of claim 76 or 77, or a nucleic acid encoding the RNA guide, wherein the direct repeat sequence comprises:
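a. nucleotide 1 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
b. nucleotide 2 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
c. nucleotide 3 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
d. nucleotide 4 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
e. nucleotide 5 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
f. nucleotide 6 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
g. nucleotide 7 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
h. nucleotide 8 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
i. nucleotide 9 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
j. nucleotide 10 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
k. nucleotide 11 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
l. nucleotide 12 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
m. nucleotide 13 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124;
n. nucleotide 14 to nucleotide 36 of any one of SEQ ID NOs 60, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, or 124; or
o. SEQ ID NO. 61 or a portion thereof.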
79. The RNA guide of any one of claims 76-78, or a nucleic acid encoding the RNA guide, wherein the direct repeat sequence comprises AGN₁N₂N₃N₄GUGUN₅N₆N₇CAGN₈GACN₉C (SEQ ID NO: 125), wherein N₁ is A or G, N₂ is C or U, N₃ is A or G, N₄ is U or C, N₅ is C or U, N₆ is C or U, N₇ is U, A, C, or G, N₈ is U or C, and N₉ is A or C.
80. The RNA guide or nucleic acid encoding the RNA guide of any one of claims 76-79, wherein the RNA guide further comprises a spacer sequence.
81. The RNA guide of claim 80, or a nucleic acid encoding the RNA guide, wherein the spacer sequence is about 15 to about 35 nucleotides in length.
82. The RNA guide of claim 80 or claim 81, or a nucleic acid encoding the RNA guide, wherein the spacer sequence recognizes a target nucleic acid.
83. The RNA guide or nucleic acid encoding the RNA guide of claim 82, wherein the target nucleic acid comprises a target sequence adjacent to a protospacer adjacent motif (PAM) sequence, wherein the PAM sequence comprises a nucleotide sequence set forth as 5'-TTN-3', 5'-NTTN-3', 5'-NTN-3', 5'-NNTN-3', 5'-VTN-3', or 5'-NVTN-3', wherein N is any nucleotide (e.g., A, G, T, or C) and V is A, G, or C.
84. A composition comprising the RNA guide of any one of claims 76-83 or a nucleic acid encoding the RNA guide.
85. The composition of claim 84, wherein the composition is a delivery composition comprising nanoparticles (e.g., lipid nanoparticles), liposomes, exosomes, microbubbles, or gene-guns.
86. The RNA guide or nucleic acid encoding the RNA guide of any one of claims 76-83, wherein the nucleic acid encoding the RNA guide is operably linked to a promoter.
87. The RNA guide or nucleic acid encoding the RNA guide of any one of claims 76-83 or 86, wherein the nucleic acid encoding the RNA guide is in a vector.
88. The RNA guide of any one of claims 76-83 or 86-87, or a nucleic acid encoding the RNA guide, wherein the vector comprises a retroviral vector, a lentiviral vector, a phage vector, an adenoviral vector, an adeno-associated vector, or a herpes simplex vector.
89. A cell comprising the RNA guide of any one of claims 76-83 or 86-88 or a nucleic acid encoding the RNA guide.
90. The cell of claim 89, wherein the cell is a eukaryotic cell.
91. The cell of claim 89 or 90, wherein the cell is a mammalian cell or a plant cell.
92. The cell of any one of claims 89 to 91, wherein the cell is a human cell.
93. A method for editing a gene in a cell, the method comprising contacting the cell with: the variant of any one of claims 1-17, 26, 27, or 33; the composition of any one of claims 18-32, 38-75, 84, or 85; or the RNA guide of any one of claims 76-83 or 86-88.
94. A nucleic acid molecule encoding a Cas12i4 variant of SEQ ID NO. 4, wherein the sequence of the nucleic acid molecule has 95% identity to a sequence selected from the group consisting of SEQ ID NOs 222-228.
95. The nucleic acid molecule of claim 94, wherein the sequence of the nucleic acid molecule comprises a sequence selected from the group consisting of SEQ ID NOs 222-228.
CN202280027316.2A 2021-02-11 2022-02-11 Compositions comprising variant Cas12i4 polypeptides and uses thereof Pending CN117136233A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US63/148,421 2021-02-11
US202163154437P 2021-02-26 2021-02-26
US63/154,437 2021-02-26
PCT/US2022/016214 WO2022174099A2 (en) 2021-02-11 2022-02-11 Compositions comprising a variant cas12i4 polypeptide and uses thereof

Publications (1)

Publication Number Publication Date
CN117136233A true CN117136233A (en) 2023-11-28

Family

ID=88860462

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202280027316.2A Pending CN117136233A (en) 2021-02-11 2022-02-11 Compositions comprising variant Cas12i4 polypeptides and uses thereof

Country Status (1)

Country Link
CN (1) CN117136233A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination