CN114292831A - Novel Cas enzyme and application - Google Patents


Publication number
CN114292831A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210115774.3A
Other languages
Chinese (zh)
Other versions
CN114292831B (en)
Inventor
梁亚峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Shunfeng Biotechnology Co Ltd
Original Assignee
Shandong Shunfeng Biotechnology Co Ltd
Application filed by Shandong Shunfeng Biotechnology Co Ltd
Priority to CN202310515419.XA (CN116555227A)
Publication of CN114292831A
Application granted
Publication of CN114292831B
Legal status: Active

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844 Nucleic acid amplification reactions
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y02A50/30 Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Medicinal Chemistry (AREA)
  • Plant Pathology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Analytical Chemistry (AREA)
  • Immunology (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The invention belongs to the field of nucleic acid editing, and particularly relates to the technical field of clustered regularly interspaced short palindromic repeats (CRISPR). Specifically, the invention provides a novel Cas enzyme that has low homology to previously reported Cas enzymes, exhibits nuclease activity both inside and outside cells, and has broad application prospects.

Description

Novel Cas enzyme and application
Technical Field
The invention relates to the field of gene editing, in particular to the technical field of clustered regularly interspaced short palindromic repeats (CRISPR). More specifically, the present invention relates to a novel Cas effector protein, fusion proteins comprising such proteins, and nucleic acid molecules encoding them. The invention also relates to complexes and compositions for nucleic acid editing (e.g., gene or genome editing) comprising a Cas protein or fusion protein of the invention, or a nucleic acid molecule encoding the same.
Background
CRISPR/Cas technology is a widely used gene editing technology in which a guide RNA directs specific binding to a target sequence in the genome, the DNA is cleaved to generate a double-strand break, and site-directed gene editing is achieved through the cell's own non-homologous end joining or homologous recombination.
The CRISPR/Cas9 system is the most commonly used type II CRISPR system; it recognizes a 3'-NGG PAM motif and makes a blunt-end cut in the target sequence. The CRISPR/Cas type V system is a more recently discovered type of CRISPR system that recognizes a 5'-TTN PAM motif and makes a sticky-end cut in the target sequence; examples include Cpf1, C2C1, CasX and CasY. However, the CRISPR/Cas systems currently available each have different advantages and disadvantages. For example, Cas9, C2C1 and CasX all require two RNAs as the guide RNA, whereas Cpf1 requires only one guide RNA and can be used for multiplex gene editing. CasX is 980 amino acids in size, while the common Cas9, C2C1, CasY and Cpf1 are typically around 1300 amino acids in size. In addition, the PAM sequences of Cas9, Cpf1, CasX, and CasY are complex and diverse, while C2C1 recognizes a strict 5'-TTN, so its target sites are more easily predicted than those of other systems, which reduces potential off-target effects.
In summary, given that the currently available CRISPR/Cas systems are all limited by certain drawbacks, the development of a new, more robust CRISPR/Cas system with good performance in multiple respects is of great significance for the development of biotechnology.
Disclosure of Invention
The inventors of the present application, through extensive experimentation, have unexpectedly discovered a novel class of endonucleases (Cas enzymes). The Cas enzymes comprise one or more of Cas-sf1, Cas-sf3, Cas-sf4, Cas-sf6, Cas-sf8, Cas-sf9 and Cas-sf10. Based on this discovery, the inventors developed a novel CRISPR/Cas system, as well as a gene editing method and a nucleic acid detection method based on that system.
Cas effector protein
In one aspect, the invention provides a Cas protein (alternatively referred to as a Cas enzyme) which is an effector protein in a CRISPR/Cas system, wherein the effector protein is selected from any one or more of Cas-sf4, Cas-sf1, Cas-sf3, Cas-sf6, Cas-sf8, Cas-sf9 and Cas-sf10.
The amino acid sequences of Cas-sf4, Cas-sf1, Cas-sf3, Cas-sf6, Cas-sf8, Cas-sf9 and Cas-sf10 are shown in SEQ ID Nos. 1-7, respectively.
In one embodiment, the amino acid sequence of the Cas protein has at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to any one of SEQ ID Nos. 1-7 and substantially retains the biological function of the sequence from which it is derived.
In one embodiment, the amino acid sequence of the Cas protein has one or more amino acid substitutions, deletions, or additions compared to any one of SEQ ID Nos. 1-7 and substantially retains the biological function of the sequence from which it is derived; the one or more amino acid substitutions, deletions, or additions include substitutions, deletions or additions of 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 amino acids.
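By way of illustration only, the sequence identity thresholds recited above can be checked with a short calculation on two sequences that have already been aligned. The following Python sketch is not part of the original disclosure. It assumes a pairwise alignment is supplied with "-" as the gap character and, as a further assumption, uses the number of aligned columns as the denominator (other conventions exist). The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the aligned columns of two pre-aligned sequences.

    Assumption: '-' marks a gap, and columns where both sequences are gaps
    are ignored. The denominator is the number of remaining columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as identical only if both residues are the same non-gap letter.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the aligned pair reaches the given percent identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy peptides (hypothetical, not taken from SEQ ID Nos. 1-7).
    reference = "MKLNEI"
    variant = "MKLQEI"
    print(round(percent_identity(reference, variant), 2))
    print(meets_identity_threshold(reference, variant, threshold=80.0))
```

In practice the alignment itself would be produced by a dedicated alignment tool; the sketch only covers the counting step.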
It will be clear to those skilled in the art that the structure of a protein may be altered without adversely affecting its activity and functionality, for example one or more conservative amino acid substitutions may be introduced in the amino acid sequence of the protein without adversely affecting the activity and/or the three-dimensional structure of the protein molecule. Examples and embodiments of conservative amino acid substitutions will be apparent to those skilled in the art. Specifically, the amino acid residue may be substituted with another amino acid residue belonging to the same group as the site to be substituted, i.e., a nonpolar amino acid residue is substituted for another nonpolar amino acid residue, a polar uncharged amino acid residue is substituted for another polar uncharged amino acid residue, a basic amino acid residue is substituted for another basic amino acid residue, and an acidic amino acid residue is substituted for another acidic amino acid residue. Such substituted amino acid residues may or may not be encoded by the genetic code. Conservative substitutions where one amino acid is replaced by another amino acid belonging to the same group are within the scope of the present invention, as long as the substitution does not result in inactivation of the biological activity of the protein. Thus, the proteins of the invention may comprise one or more conservative substitutions in the amino acid sequence, which are preferably made by substitution according to Table 1. In addition, proteins that also comprise one or more other non-conservative substitutions are also encompassed by the present invention, provided that the non-conservative substitutions do not significantly affect the desired function and biological activity of the proteins of the present invention.
Conservative amino acid substitutions may be made at one or more predicted nonessential amino acid residues. A "nonessential" amino acid residue is an amino acid residue that can be altered (e.g., deleted or substituted) without altering the biological activity, while an "essential" amino acid residue is required for biological activity. A "conservative amino acid substitution" is one in which an amino acid residue is replaced with an amino acid residue having a similar side chain. Amino acid substitutions can be made in non-conserved regions of the Cas enzyme. In general, such substitutions are not made to conserved amino acid residues, or to amino acid residues located within conserved motifs, where such residues are required for protein activity. However, one skilled in the art will appreciate that functional variants may have a small number of conservative or non-conservative changes in conserved regions.
TABLE 1
Initial residue   Representative substitutions    Preferred substitution
Ala (A)           Val; Leu; Ile                   Val
Arg (R)           Lys; Gln; Asn                   Lys
Asn (N)           Gln; His; Lys; Arg              Gln
Asp (D)           Glu                             Glu
Cys (C)           Ser                             Ser
Gln (Q)           Asn                             Asn
Glu (E)           Asp                             Asp
Gly (G)           Pro; Ala                        Ala
His (H)           Asn; Gln; Lys; Arg              Arg
Ile (I)           Leu; Val; Met; Ala; Phe         Leu
Leu (L)           Ile; Val; Met; Ala; Phe         Ile
Lys (K)           Arg; Gln; Asn                   Arg
Met (M)           Leu; Phe; Ile                   Leu
Phe (F)           Leu; Val; Ile; Ala; Tyr         Leu
Pro (P)           Ala                             Ala
Ser (S)           Thr                             Thr
Thr (T)           Ser                             Ser
Trp (W)           Tyr; Phe                        Tyr
Tyr (Y)           Trp; Phe; Thr; Ser              Phe
Val (V)           Ile; Leu; Met; Phe; Ala         Leu
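As an illustrative aid only, Table 1 can be expressed as a lookup structure so that a proposed substitution can be checked against the listed representative and preferred substitutions. The Python sketch below is not part of the original disclosure. The dictionary is transcribed from Table 1, while the names TABLE_1, is_table1_substitution and preferred_substitution are hypothetical.

```python
# Table 1 transcribed as:
#   initial residue -> (representative substitutions, preferred substitution)
TABLE_1 = {
    "Ala": (["Val", "Leu", "Ile"], "Val"),
    "Arg": (["Lys", "Gln", "Asn"], "Lys"),
    "Asn": (["Gln", "His", "Lys", "Arg"], "Gln"),
    "Asp": (["Glu"], "Glu"),
    "Cys": (["Ser"], "Ser"),
    "Gln": (["Asn"], "Asn"),
    "Glu": (["Asp"], "Asp"),
    "Gly": (["Pro", "Ala"], "Ala"),
    "His": (["Asn", "Gln", "Lys", "Arg"], "Arg"),
    "Ile": (["Leu", "Val", "Met", "Ala", "Phe"], "Leu"),
    "Leu": (["Ile", "Val", "Met", "Ala", "Phe"], "Ile"),
    "Lys": (["Arg", "Gln", "Asn"], "Arg"),
    "Met": (["Leu", "Phe", "Ile"], "Leu"),
    "Phe": (["Leu", "Val", "Ile", "Ala", "Tyr"], "Leu"),
    "Pro": (["Ala"], "Ala"),
    "Ser": (["Thr"], "Thr"),
    "Thr": (["Ser"], "Ser"),
    "Trp": (["Tyr", "Phe"], "Tyr"),
    "Tyr": (["Trp", "Phe", "Thr", "Ser"], "Phe"),
    "Val": (["Ile", "Leu", "Met", "Phe", "Ala"], "Leu"),
}


def is_table1_substitution(initial: str, replacement: str) -> bool:
    """True if `replacement` is listed in Table 1 as a representative substitution."""
    representative, _preferred = TABLE_1[initial]
    return replacement in representative


def preferred_substitution(initial: str) -> str:
    """Return the preferred substitution listed in Table 1 for `initial`."""
    return TABLE_1[initial][1]


if __name__ == "__main__":
    print(is_table1_substitution("Asp", "Glu"))
    print(is_table1_substitution("Asp", "Lys"))
    print(preferred_substitution("His"))
```

A substitution absent from Table 1 is not thereby excluded; the description also contemplates non-conservative substitutions that do not significantly affect function.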
It is well known in the art that one or more amino acid residues may be altered (substituted, deleted, truncated, or inserted) at the N- and/or C-terminus of a protein while still retaining its functional activity. Thus, proteins that have one or more amino acid residues altered at the N- and/or C-terminus of the Cas protein of the present invention, while retaining their desired functional activity, are also within the scope of the present invention. These changes may include changes introduced by modern molecular methods such as PCR, for example PCR amplifications that alter or extend the protein coding sequence by including amino acid coding sequences in the oligonucleotides used for amplification.
It will be appreciated that proteins may be altered in various ways, including amino acid substitutions, deletions, truncations, and insertions, and methods for such manipulations are generally known in the art. For example, amino acid sequence variants of Cas proteins can be made by mutation of DNA. This may also be accomplished by other forms of mutagenesis and/or by directed evolution, e.g., using known methods of mutagenesis, recombination and/or shuffling, in conjunction with related screening methods, to make single or multiple amino acid substitutions, deletions and/or insertions.
Those skilled in the art will appreciate that these minor amino acid changes in the Cas protein of the invention can occur (e.g., as naturally occurring mutations) or be generated (e.g., using recombinant DNA technology) without loss of protein function or activity. If these mutations occur in the catalytic domain, active site or other functional domain of the protein, the properties of the polypeptide may change, but the polypeptide may retain its activity. Only minor effects are to be expected if the mutations are not close to the catalytic domain, active site or other functional domains.
One skilled in the art can identify essential amino acids of a Cas protein according to methods known in the art, such as site-directed mutagenesis, analysis of protein evolution, or bioinformatic analysis. The catalytic domain, active site or other functional domain of a protein can also be determined by physical analysis of the structure, for example by techniques such as nuclear magnetic resonance, crystallography, electron diffraction, or photoaffinity labeling, in combination with mutation of putative key-site amino acids.
In one embodiment, the Cas protein comprises the amino acid sequence shown in any one of SEQ ID Nos. 1 to 7.
In one embodiment, the Cas protein consists of the amino acid sequence shown in any one of SEQ ID Nos. 1 to 7.
In one embodiment, the Cas protein is a derivatized protein having the same biological function as a protein having the sequence shown in any one of SEQ ID Nos. 1-7.
Such biological functions include, but are not limited to, binding to a guide RNA, endonuclease activity, binding to a specific site of a target sequence under the guidance of a guide RNA, and cleavage activity, including, but not limited to, cis cleavage activity and trans cleavage activity.
The invention also provides a fusion protein which comprises any one Cas protein selected from Cas-sf1, Cas-sf3, Cas-sf4, Cas-sf6, Cas-sf8, Cas-sf9 and Cas-sf10, together with a further modifying moiety.
In one embodiment, the modifying moiety is selected from an additional protein or polypeptide, a detectable label, or any combination thereof.
In one embodiment, the modifying moiety is selected from the group consisting of an epitope tag, a reporter sequence, a nuclear localization signal (NLS) sequence, a targeting moiety, a transcription activation domain (e.g., VP64), a transcription repression domain (e.g., a KRAB domain or SID domain), a nuclease domain (e.g., FokI), and a domain having an activity selected from the group consisting of: nucleotide deaminase activity, methylase activity, demethylase activity, transcriptional activation activity, transcriptional repression activity, transcriptional release factor activity, histone modification activity, nuclease activity, single-stranded RNA cleavage activity, double-stranded RNA cleavage activity, single-stranded DNA cleavage activity, double-stranded DNA cleavage activity and nucleic acid binding activity; and any combination thereof. Such NLS sequences are well known to those skilled in the art, examples of which include, but are not limited to, the NLS sequences of the SV40 large T antigen, EGL-13, c-Myc, and TUS protein.
In one embodiment, the NLS sequence is located at or near a terminus (e.g., the N-terminus, the C-terminus, or both) of a Cas protein of the invention.
Such epitope tags are well known to those skilled in the art and include, but are not limited to, His, V5, FLAG, HA, Myc, VSV-G, Trx, etc.; other suitable epitope tags (e.g., for purification, detection, or tracking) may be selected by those skilled in the art.
Such reporter sequences are well known to those skilled in the art, examples of which include, but are not limited to, GST, HRP, CAT, GFP, HcRed, DsRed, CFP, YFP, BFP, and the like.
In one embodiment, the fusion protein of the invention comprises a domain capable of binding to a DNA molecule or an intracellular molecule, such as maltose binding protein (MBP), the DNA binding domain (DBD) of LexA, the DBD of GAL4, and the like.
In one embodiment, the fusion protein of the invention comprises a detectable label, such as a fluorescent dye, e.g. FITC or DAPI.
In one embodiment, the Cas protein of the present invention is coupled, conjugated or fused to the modifying moiety, optionally via a linker.
In one embodiment, the modifying moiety is directly linked to the N-terminus or C-terminus of the Cas protein of the present invention.
In one embodiment, the modifying moiety is linked to the N-terminus or C-terminus of the Cas protein of the present invention via a linker. Such linkers are well known in the art, examples of which include, but are not limited to, linkers comprising one or more (e.g., 1, 2, 3, 4, or 5) amino acids (e.g., Glu or Ser) or amino acid derivatives (e.g., Ahx, β-Ala, GABA, or Ava), or PEG, and the like.
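Purely as an illustration of the N-terminal and C-terminal fusion arrangements described above, the following Python sketch concatenates a modifying moiety, an optional amino acid linker, and a Cas protein sequence. It is not part of the original disclosure. The SV40 large T antigen NLS is given as its commonly cited sequence PKKKRKV; the Cas sequence is a short hypothetical placeholder (not any of SEQ ID Nos. 1-7), and the function name fuse is an assumption.

```python
# Commonly cited monopartite NLS of the SV40 large T antigen.
SV40_NLS = "PKKKRKV"


def fuse(cas: str, moiety: str, terminus: str = "N", linker: str = "") -> str:
    """Return the amino acid sequence of a fusion of `moiety` to `cas`.

    terminus: "N" places the moiety before the Cas protein,
              "C" places it after the Cas protein.
    linker:   optional amino acid linker inserted between the two parts;
              an empty string models a direct linkage.
    """
    if terminus == "N":
        return moiety + linker + cas
    if terminus == "C":
        return cas + linker + moiety
    raise ValueError("terminus must be 'N' or 'C'")


if __name__ == "__main__":
    cas_placeholder = "MKLNEI"  # hypothetical stand-in for a Cas protein sequence
    print(fuse(cas_placeholder, SV40_NLS, terminus="N", linker="ES"))
    print(fuse(cas_placeholder, SV40_NLS, terminus="C"))
```

The sketch only models sequence order; expression of an actual N-terminal fusion would additionally require an appropriate start codon in the encoding polynucleotide.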
The Cas protein, protein derivative or fusion protein of the present invention is not limited by the manner of its production, and for example, it may be produced by a genetic engineering method (recombinant technology) or may be produced by a chemical synthesis method.
Nucleic acid of Cas protein
In another aspect, the invention provides an isolated polynucleotide comprising: a polynucleotide sequence encoding a Cas protein or a fusion protein of the present invention.
In one embodiment, the polynucleotide sequence is codon optimized for expression in a prokaryotic cell. In one embodiment, the polynucleotide sequence is codon optimized for expression in a eukaryotic cell.
In one embodiment, the polynucleotide is preferably single-stranded or double-stranded.
Direct Repeat (Direct Repeat) sequences
In another aspect, the invention provides an engineered direct repeat that forms a complex with any one of the Cas proteins selected from Cas-sf1, Cas-sf3, Cas-sf4, Cas-sf6, Cas-sf8, Cas-sf9 and Cas-sf10 described above.
The direct repeat sequence is connected with a guide sequence capable of hybridizing with a target sequence to form a guide RNA (guide RNA or gRNA).
Hybridization of the target sequence to the gRNA means that the target sequence and the nucleic acid sequence of the gRNA have at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity, such that they can hybridize to form a complex; or that at least 12, 15, 16, 17, 18, 19, 20, 21, 22, or more bases of the target sequence and of the nucleic acid sequence of the gRNA can pair complementarily to form a complex.
In some embodiments, the direct repeat sequence has at least 90% sequence identity, e.g., 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity, to any one of SEQ ID Nos. 8-14. In some embodiments, the direct repeat sequence has a substitution, deletion, or addition of one or more bases (e.g., a substitution, deletion, or addition of 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 bases) as compared to the sequence set forth in any one of SEQ ID Nos. 8-14.
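For illustration only, whether a candidate direct repeat differs from a reference by no more than a given number of base substitutions, deletions, or additions can be assessed with the Levenshtein edit distance, which is the minimum number of such single-base changes needed to convert one sequence into the other. The Python sketch below is not part of the original disclosure; the function names and the example sequences are hypothetical and are not taken from SEQ ID Nos. 8-14.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-base substitutions,
    deletions, or insertions needed to turn `a` into `b`."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of `a`
    for i, base_a in enumerate(a, start=1):
        current = [i]
        for j, base_b in enumerate(b, start=1):
            substitution_cost = 0 if base_a == base_b else 1
            current.append(
                min(
                    previous[j] + 1,                      # delete base_a
                    current[j - 1] + 1,                   # insert base_b
                    previous[j - 1] + substitution_cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


def within_allowed_changes(reference: str, variant: str, max_changes: int = 10) -> bool:
    """True if `variant` can be reached from `reference` with at most `max_changes` edits."""
    return edit_distance(reference, variant) <= max_changes


if __name__ == "__main__":
    reference = "GUUGCAGAAC"  # hypothetical repeat fragment
    variant = "GUUGCUGAAC"    # one substitution relative to the reference
    print(edit_distance(reference, variant))
    print(within_allowed_changes(reference, variant, max_changes=10))
```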
In some embodiments, the direct repeat sequence is as shown in any one of SEQ ID Nos. 8-14, or as shown in any one of SEQ ID Nos. 16-22.
In the invention, SEQ ID Nos. 8-14 correspond to the prototype direct repeat sequences of Cas-sf4, Cas-sf1, Cas-sf3, Cas-sf6, Cas-sf8, Cas-sf9 and Cas-sf10, respectively; SEQ ID Nos. 16-22 correspond to the mature direct repeat sequences of Cas-sf4, Cas-sf1, Cas-sf3, Cas-sf6, Cas-sf8, Cas-sf9, and Cas-sf10, respectively.
Guide RNA (gRNA)
In another aspect, the present invention provides a gRNA comprising a first segment and a second segment; the first segment is also referred to as the "framework region", "protein-binding segment", "protein-binding sequence", or "direct repeat sequence"; the second segment is also referred to as the "nucleic acid-targeting sequence", the "nucleic acid-targeting segment", or the "targeting sequence that targets a target sequence".
The first segment of the gRNA is capable of interacting with a Cas protein of the invention, thereby allowing the Cas protein and the gRNA to form a complex.
The nucleic acid-targeting sequence or nucleic acid-targeting segment of the invention comprises a nucleotide sequence that is complementary to a sequence in the target nucleic acid. In other words, the nucleic acid-targeting sequence or segment interacts with the target nucleic acid in a sequence-specific manner through hybridization (i.e., base pairing). Thus, the nucleic acid-targeting sequence or segment may be altered or modified to hybridize to any desired sequence within the target nucleic acid. The nucleic acid is selected from DNA or RNA.
The percent complementarity between the nucleic acid-targeting sequence or segment and the target sequence of the target nucleic acid can be at least 60% (e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100%).
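As a non-limiting illustration of the percent complementarity recited above, the following Python sketch counts Watson-Crick base pairs between an RNA targeting segment and a DNA or RNA target strand of equal length, pairing the two strands in antiparallel orientation. It is not part of the original disclosure. The assumptions are that both sequences are written 5' to 3', that only canonical pairs are counted (G-U wobble pairs are not), and that the function name and example sequences are hypothetical.

```python
# Canonical pairs as (base in the RNA targeting segment, base in the target strand).
# The target may be DNA (T) or RNA (U); G-U wobble pairs are deliberately excluded.
CANONICAL_PAIRS = {("A", "T"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(targeting_rna: str, target_strand: str) -> float:
    """Percent of positions that form canonical base pairs.

    Both sequences are given 5'->3'; the target strand is reversed so that
    the two strands are compared in antiparallel orientation.
    """
    if not targeting_rna or len(targeting_rna) != len(target_strand):
        raise ValueError("sequences must be non-empty and of equal length")
    rna = targeting_rna.upper()
    target_3_to_5 = target_strand.upper()[::-1]
    paired = sum(1 for r, t in zip(rna, target_3_to_5) if (r, t) in CANONICAL_PAIRS)
    return 100.0 * paired / len(rna)


if __name__ == "__main__":
    # Hypothetical 4-nt example: 5'-AUGC-3' pairs fully with 5'-GCAT-3'.
    print(percent_complementarity("AUGC", "GCAT"))
    print(percent_complementarity("AUGC", "GCAA"))
```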
The "framework region", "protein-binding segment", "protein-binding sequence", or "direct repeat sequence" of a gRNA of the invention can interact with a CRISPR protein (i.e., a Cas protein). The gRNA of the invention directs the Cas protein with which it interacts to a specific nucleotide sequence within a target nucleic acid through the action of the nucleic acid-targeting sequence.
Preferably, the guide RNA comprises a first segment and a second segment in the 5 'to 3' direction.
In the context of the present invention, the second segment is also understood to be a guide sequence which hybridizes to the target sequence.
The gRNAs of the invention are capable of forming a complex with the Cas protein.
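The preferred arrangement described above, in which the first segment (direct repeat) precedes the second segment (guide sequence) in the 5' to 3' direction, can be sketched as follows. This Python example is illustrative only and is not part of the original disclosure. The class and function names are hypothetical, the sequences are placeholders rather than any of SEQ ID Nos. 8-14 or 16-22, and the conversion of T to U simply allows DNA-style input.

```python
from dataclasses import dataclass

RNA_ALPHABET = set("ACGU")


@dataclass(frozen=True)
class GuideRNA:
    """A gRNA modeled as a direct repeat (first segment) followed by a guide (second segment)."""

    direct_repeat: str
    guide: str

    @property
    def sequence(self) -> str:
        # 5' - direct repeat - guide - 3'
        return self.direct_repeat + self.guide


def to_rna(seq: str) -> str:
    """Upper-case the sequence and write any T as U."""
    return seq.upper().replace("T", "U")


def build_grna(direct_repeat: str, guide: str) -> GuideRNA:
    """Assemble a gRNA after checking that both segments contain only A, C, G, U."""
    dr, g = to_rna(direct_repeat), to_rna(guide)
    for name, seq in (("direct_repeat", dr), ("guide", g)):
        if not seq or set(seq) - RNA_ALPHABET:
            raise ValueError(f"{name} must be a non-empty sequence of A, C, G, U/T")
    return GuideRNA(direct_repeat=dr, guide=g)


if __name__ == "__main__":
    grna = build_grna("guugcagaac", "ATGCATGC")  # hypothetical placeholder segments
    print(grna.sequence)
```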
Vector
The present invention also provides a vector comprising a Cas protein, an isolated nucleic acid molecule or a polynucleotide as described above; preferably, it further comprises a regulatory element operably linked thereto.
In one embodiment, the regulatory element is selected from one or more of the group consisting of: enhancers, transposons, promoters, terminators, leader sequences, polyadenylation sequences, marker genes.
In one embodiment, the vector comprises a cloning vector, an expression vector, a shuttle vector, an integration vector.
In some embodiments, the vectors included in the system are viral vectors (e.g., retroviral vectors, lentiviral vectors, adenoviral vectors, adeno-associated virus vectors and herpes simplex virus vectors); they may also be plasmids, viruses, cosmids, phages, and the like, all of which are well known to those skilled in the art.
Vector system
The present invention provides an engineered non-naturally occurring vector system, or CRISPR-Cas system, comprising a Cas protein or a nucleic acid sequence encoding said Cas protein and nucleic acid encoding one or more guide RNAs.
In one embodiment, the nucleic acid sequence encoding the Cas protein and the nucleic acid encoding the one or more guide RNAs are artificially synthesized.
In one embodiment, the nucleic acid sequence encoding the Cas protein and the nucleic acid encoding the one or more guide RNAs do not occur naturally together.
The one or more guide RNAs target one or more target sequences in the cell. The one or more guide RNAs hybridize to the target sequences at the genomic loci of DNA molecules encoding one or more gene products and direct the Cas protein to those genomic loci; upon reaching the target sequence position, the Cas protein modifies, edits, or cleaves the target sequence, whereby expression of the one or more gene products is altered or modified.
The cells of the invention include cells of one or more of animals, plants, or microorganisms.
In some embodiments, the Cas protein is codon optimized for expression in a cell.
In some embodiments, the Cas protein directs cleavage of one or both strands at the target sequence position.
The present invention also provides an engineered non-naturally occurring vector system, which may include one or more vectors, the one or more vectors including:
a) a first regulatory element operably linked to the gRNA,
b) a second regulatory element operably linked to the Cas protein;
wherein components (a) and (b) are located on the same or different vectors of the system.
The first and second regulatory elements include promoters (e.g., constitutive promoters or inducible promoters), enhancers (e.g., 35S promoter or 35S enhanced promoter), Internal Ribosome Entry Sites (IRES), and other expression control elements (e.g., transcription termination signals, such as polyadenylation signals and poly-U sequences).
In some embodiments, the vector in the system is a viral vector (e.g., a retroviral vector, lentiviral vector, adenoviral vector, adeno-associated virus vector or herpes simplex virus vector); it may also be a plasmid, virus, cosmid, phage, or the like, all of which are well known to those skilled in the art.
In some embodiments, the systems provided herein are in a delivery system. In some embodiments, the delivery system is a nanoparticle, a liposome, an exosome, a microvesicle, or a gene gun.
In one embodiment, when the target sequence is DNA, the target sequence is located 3' of the Protospacer Adjacent Motif (PAM) and the PAM has a sequence represented by TTN, where N is selected from A, G, T, C.
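By way of example only, candidate target sequences located immediately 3' of a TTN PAM can be enumerated by scanning a DNA sequence. The Python sketch below is not part of the original disclosure. It assumes that the input is one strand written 5' to 3', that only this strand is scanned (the complementary strand would be handled by scanning its reverse complement), and that the target length is a user-chosen parameter; the default of 20 nt and the function name are assumptions.

```python
def find_ttn_targets(dna: str, target_len: int = 20) -> list[tuple[int, str, str]]:
    """List (pam_start_index, pam, target) for every TTN PAM on the given strand.

    The target is the `target_len` bases immediately 3' of the PAM.
    PAMs too close to the 3' end to be followed by a full-length target are skipped.
    """
    dna = dna.upper()
    hits = []
    for i in range(len(dna) - 3 - target_len + 1):
        pam = dna[i:i + 3]
        # TTN: two thymines followed by any of A, G, T, C.
        if pam[:2] == "TT" and pam[2] in "ACGT":
            hits.append((i, pam, dna[i + 3:i + 3 + target_len]))
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence with a single TTA PAM at index 2.
    for start, pam, target in find_ttn_targets("GGTTACCCCGGGGAAAA", target_len=5):
        print(start, pam, target)
```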
In one embodiment, the target sequence is a DNA or RNA sequence from a prokaryotic or eukaryotic cell. In one embodiment, the target sequence is a non-naturally occurring DNA or RNA sequence.
In one embodiment, the target sequence is present within a cell. In one embodiment, the target sequence is present within the nucleus or within the cytoplasm (e.g., organelle). In one embodiment, the cell is a eukaryotic cell. In other embodiments, the cell is a prokaryotic cell.
In one embodiment, the Cas protein has one or more NLS sequences attached thereto. In one embodiment, the fusion protein comprises one or more NLS sequences. In one embodiment, the NLS sequence is linked to the N-terminus or C-terminus of the protein. In one embodiment, the NLS sequence is fused to the N-terminus or C-terminus of the protein.
In another aspect, the invention relates to an engineered CRISPR system comprising a Cas protein as described above and one or more guide RNAs, wherein the guide RNA comprises a direct repeat and a spacer sequence capable of hybridizing to a target nucleic acid, the Cas protein being capable of binding to the guide RNA and targeting a target nucleic acid sequence complementary to the spacer sequence.
Protein-nucleic acid complexes/compositions
In another aspect, the present invention provides a complex or composition comprising:
(i) a protein component selected from: the above Cas protein, derivatized protein, or fusion protein, and any combination thereof; and
(ii) a nucleic acid component comprising (a) a guide sequence capable of hybridizing to a target sequence; and (b) a direct repeat sequence capable of binding to a Cas protein of the present invention.
The protein component and the nucleic acid component are combined with each other to form a complex.
In one embodiment, the nucleic acid component is a guide RNA in a CRISPR-Cas system.
In one embodiment, the complex or composition is non-naturally occurring or modified. In one embodiment, at least one component of the complex or composition is non-naturally occurring or modified. In one embodiment, the first component is non-naturally occurring or modified; and/or, the second component is non-naturally occurring or modified.
Activated CRISPR complexes
In another aspect, the present invention also provides an activated CRISPR complex comprising: (1) a protein component selected from: a Cas protein, a derivatized protein, or a fusion protein of the invention, and any combination thereof; (2) a gRNA comprising (a) a guide sequence capable of hybridizing to a target sequence; and (b) a direct repeat sequence capable of binding to a Cas protein of the present invention; and (3) a target sequence that binds to the gRNA. Preferably, the binding is via a targeting sequence of a targeting nucleic acid on the gRNA to the target nucleic acid.
The terms "activated CRISPR complex", "activation complex" or "ternary complex" as used herein refer to a complex of a Cas protein, a gRNA, and a target nucleic acid in a CRISPR system after binding or modification.
The Cas protein and gRNA of the invention can form a binary complex that is activated upon binding to a nucleic acid substrate to form an activated CRISPR complex. The nucleic acid substrate is complementary to a spacer sequence in the gRNA (alternatively referred to as a guide sequence that hybridizes to the target nucleic acid). In some embodiments, the spacer sequence of the gRNA is perfectly matched to the target substrate. In other embodiments, the spacer sequence of the gRNA matches a portion (continuous or discontinuous) of the target substrate.
In a preferred embodiment, the activated CRISPR complex may exhibit a collateral nuclease activity, which refers to the non-specific or random cleavage activity of the activated CRISPR complex on single-stranded nucleic acids, also referred to in the art as trans cleavage activity.
Delivery and delivery compositions
The Cas proteins, gRNAs, fusion proteins, nucleic acid molecules, vectors, systems, complexes, and compositions of the invention can be delivered by any method known in the art. Such methods include, but are not limited to, electroporation, lipofection, nucleofection, microinjection, sonoporation, gene gun, calcium phosphate-mediated transfection, cationic transfection, dendrimer transfection, heat shock transfection, magnetofection, impalefection, optical transfection, agent-enhanced nucleic acid uptake, and delivery via liposomes, immunoliposomes, viral particles, artificial virosomes, and the like.
Thus, in another aspect, the present invention provides a delivery composition comprising a delivery vehicle and one or more of the following: the Cas protein, fusion protein, nucleic acid molecule, vector, system, complex and composition of the present invention.
In one embodiment, the delivery vehicle is a particle.
In one embodiment, the delivery vehicle is selected from a lipid particle, a sugar particle, a metal particle, a protein particle, a liposome, an exosome, a microvesicle, a gene gun, or a viral vector (e.g., a replication-defective retrovirus, lentivirus, adenovirus, or adeno-associated virus).
Host cell
The invention also relates to an in vitro, ex vivo or in vivo cell or cell line, or progeny thereof, comprising: the Cas proteins, fusion proteins, nucleic acid molecules, protein-nucleic acid complexes, activated CRISPR complexes, vectors, and delivery compositions of the invention described herein.
In certain embodiments, the cell is a prokaryotic cell.
In certain embodiments, the cell is a eukaryotic cell. In certain embodiments, the cell is a mammalian cell. In certain embodiments, the cell is a human cell. In certain embodiments, the cell is a non-human mammalian cell, e.g., a cell of a non-human primate, bovine, ovine, porcine, canine, monkey, rabbit, or rodent (e.g., rat or mouse). In certain embodiments, the cell is a non-mammalian eukaryotic cell, such as a cell of poultry (e.g., chicken), fish, or shellfish (e.g., clam, shrimp). In certain embodiments, the cell is a plant cell, e.g., a cell of a monocot or dicot, or a cell of a cultivated plant or food crop such as cassava, corn, sorghum, soybean, wheat, oat, or rice, or a cell of an alga, a tree, a production plant, a fruit, or a vegetable (e.g., a tree such as a citrus tree or a nut tree; a nightshade (Solanum) plant; cotton, tobacco, tomato, grape, coffee, cocoa, etc.).
In certain embodiments, the cell is a stem cell or stem cell line.
In certain instances, a host cell of the invention comprises a modification of a gene or genome that is not present in its wild type.
Gene editing method and application
The Cas protein, the nucleic acid, the composition as described above, the CRISPR/Cas system as described above, the vector system as described above, the delivery composition as described above, the activated CRISPR complex as described above, or the host cell as described above may be used for any one or several of the following uses: targeting and/or editing a target nucleic acid; cleaving double-stranded DNA, single-stranded DNA, or single-stranded RNA; non-specifically cleaving and/or degrading collateral nucleic acids; non-specifically cleaving single-stranded nucleic acids; detecting nucleic acids; detecting nucleic acids in a target sample; specifically editing double-stranded nucleic acids; base-editing double-stranded nucleic acids; base-editing single-stranded nucleic acids. In other embodiments, they may also be used to prepare reagents or kits for any one or more of the uses described above.
The invention also provides the use of the Cas protein, the nucleic acid, the composition, the CRISPR/Cas system, the vector system, the delivery composition or the activated CRISPR complex in gene editing, gene targeting or gene cleavage; or the use thereof in the manufacture of a reagent or kit for gene editing, gene targeting or gene cleavage.
In one embodiment, the gene editing, gene targeting or gene cleavage is gene editing, gene targeting or gene cleavage inside and/or outside a cell.
The present invention also provides a method of editing, targeting or cleaving a target nucleic acid, comprising contacting the target nucleic acid with the above-described Cas protein, nucleic acid, the above-described composition, the above-described CRISPR/Cas system, the above-described vector system, the above-described delivery composition or the above-described activated CRISPR complex. In one embodiment, the method edits, targets or cleaves a target nucleic acid intracellularly and/or extracellularly.
The gene editing, or the editing of a target nucleic acid, includes modifying a gene, knocking out a gene, altering expression of a gene product, repairing a mutation, inserting a polynucleotide, and/or introducing a gene mutation.
The editing can be performed in prokaryotic cells and/or eukaryotic cells.
In another aspect, the invention also provides the use of the above Cas protein, nucleic acid, the above composition, the above CRISPR/Cas system, the above vector system, the above delivery composition or the above activated CRISPR complex in nucleic acid detection, or in the preparation of a reagent or kit for nucleic acid detection.
In another aspect, the invention also provides a method of cleaving single-stranded nucleic acid, the method comprising contacting a nucleic acid population with the Cas protein and the gRNA described above, wherein the nucleic acid population comprises a target nucleic acid and a plurality of non-target single-stranded nucleic acids, and the Cas protein cleaves the plurality of non-target single-stranded nucleic acids.
The gRNA is capable of binding the Cas protein.
The gRNA is capable of targeting the target nucleic acid.
The contacting may be in vitro, ex vivo, or inside a cell in vivo.
Preferably, the cleavage of the single-stranded nucleic acid is non-specific.
In another aspect, the invention also provides the use of the above Cas protein, nucleic acid, the above composition, the above CRISPR/Cas system, the above vector system, the above delivery composition or the above activated CRISPR complex for non-specific cleavage of single-stranded nucleic acids, or for the preparation of a reagent or kit for non-specific cleavage of single-stranded nucleic acids.
In another aspect, the invention also provides a kit for gene editing, gene targeting or gene cleavage, comprising the above Cas protein, gRNA, nucleic acid, the above composition, the above CRISPR/Cas system, the above vector system, the above delivery composition, the above activated CRISPR complex, or the above host cell.
In another aspect, the present invention also provides a kit for detecting a target nucleic acid in a sample, the kit comprising: (a) a Cas protein, or a nucleic acid encoding the Cas protein; (b) a guide RNA, or a nucleic acid encoding the guide RNA, or a precursor RNA comprising the guide RNA, or a nucleic acid encoding the precursor RNA; and (c) a single-stranded nucleic acid detector that is single-stranded and does not hybridize to the guide RNA.
It is known in the art that precursor RNAs can be cleaved or processed into mature guide RNAs as described above.
In another aspect, the invention provides the use of the above Cas protein, nucleic acid, the above composition, the above CRISPR/Cas system, the above vector system, the above delivery composition, the above activated CRISPR complex or the above host cell in the preparation of a formulation or kit for:
(i) gene or genome editing;
(ii) target nucleic acid detection and/or diagnosis;
(iii) editing a target sequence in a target locus to modify an organism or non-human organism;
(iv) treatment of diseases;
(v) targeting a target gene;
(vi) cleaving the target gene.
Preferably, the gene or genome editing is carried out intracellularly or extracellularly.
Preferably, the target nucleic acid detection and/or diagnosis is in vitro.
Preferably, the treatment of the disease is the treatment of a condition caused by a defect in the target sequence in the target locus.
In another aspect, the invention provides a method of detecting a target nucleic acid in a sample, the method comprising: contacting the sample with the Cas protein, a gRNA (guide RNA) comprising a region that binds to the Cas protein and a guide sequence that hybridizes to the target nucleic acid, and a single-stranded nucleic acid detector; and detecting a detectable signal generated by cleavage of the single-stranded nucleic acid detector by the Cas protein, thereby detecting the target nucleic acid; wherein the single-stranded nucleic acid detector does not hybridize to the gRNA.
Method for specifically modifying target nucleic acid
In another aspect, the present invention also provides a method of specifically modifying a target nucleic acid, the method comprising: contacting the target nucleic acid with the Cas protein, the nucleic acid, the composition, the CRISPR/Cas system, the vector system, the delivery composition, or the activated CRISPR complex.
The specific modification may occur in vivo or in vitro.
The specific modification may occur intracellularly or extracellularly.
In some cases, the cell is selected from a prokaryotic cell or a eukaryotic cell, e.g., an animal cell, a plant cell, or a microbial cell.
In one embodiment, the modification refers to a break in the target sequence, e.g., a single/double strand break in DNA, or a single strand break in RNA.
In some cases, the method further comprises contacting the target nucleic acid with a donor polynucleotide, wherein the donor polynucleotide, a portion of the donor polynucleotide, a copy of the donor polynucleotide, or a portion of the copy of the donor polynucleotide is integrated into the target nucleic acid.
In one embodiment, the modification further comprises inserting an editing template (e.g., an exogenous nucleic acid) into the break.
In one embodiment, the method further comprises: contacting the editing template with the target nucleic acid, or delivering into a cell comprising the target nucleic acid. In this embodiment, the method repairs the disrupted target gene by homologous recombination with an exogenous template polynucleotide; in some embodiments, the repair results in a mutation, including an insertion, deletion, or substitution of one or more nucleotides of the target gene, and in other embodiments, the mutation results in one or more amino acid changes in a protein expressed from a gene comprising the target sequence.
Detection (non-specific cleavage)
In another aspect, the invention provides a method of detecting a target nucleic acid in a sample, the method comprising: contacting the sample with the above-described Cas protein, nucleic acid, the above-described composition, the above-described CRISPR/Cas system, the above-described vector system, the above-described delivery composition, or the above-described activated CRISPR complex, and a single-stranded nucleic acid detector; and detecting a detectable signal generated by cleavage of the single-stranded nucleic acid detector by the Cas protein, thereby detecting the target nucleic acid.
In the present invention, the target nucleic acid comprises a ribonucleotide or a deoxyribonucleotide; including single-stranded nucleic acids, double-stranded nucleic acids, e.g., single-stranded DNA, double-stranded DNA, single-stranded RNA, double-stranded RNA. In one embodiment, the target nucleic acid is derived from a sample of a virus, bacterium, microorganism, soil, water source, human, animal, plant, or the like.
Preferably, the target nucleic acid is a product enriched or amplified by PCR, NASBA, RPA, SDA, LAMP, HDA, NEAR, MDA, RCA, LCR, RAM and the like.
In one embodiment, the target nucleic acid is a viral nucleic acid, a bacterial nucleic acid, or a specific nucleic acid associated with a disease, such as a specific mutation site, an SNP site, or a nucleic acid that differs from a control; preferably, the virus is a plant virus or an animal virus, e.g., papillomavirus, hepadnavirus, herpesvirus, adenovirus, poxvirus, parvovirus, or coronavirus; preferably, the virus is a coronavirus, preferably SARS, SARS-CoV-2 (COVID-19), HCoV-229E, HCoV-OC43, HCoV-NL63, HCoV-HKU1, or MERS-CoV.
In some embodiments, the target nucleic acid is derived from a cell, e.g., from a cell lysate.
In one embodiment, the target nucleic acid comprises DNA or RNA, preferably a single-stranded nucleic acid, a double-stranded nucleic acid, or a modified nucleic acid.
In the present invention, the gRNA matches the target sequence on the target nucleic acid to a degree of at least 50%, preferably at least 60%, preferably at least 70%, preferably at least 80%, preferably at least 90%.
In one embodiment, when the target sequence contains one or more characteristic sites (e.g., a particular mutation site or SNP), the characteristic site is a perfect match to the gRNA.
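The two matching conditions above (an overall match of at least 50% and a perfect match at each characteristic site) can be illustrated with a minimal Python sketch. The function names, the example sequences, and the choice to count only Watson-Crick pairs (ignoring G-U wobble) are assumptions made for illustration and are not taken from the specification.

```python
# Illustrative sketch only: names and sequences are hypothetical.
# Watson-Crick pairs between an RNA guide base and a DNA target base.
PAIRS = {("A", "T"), ("U", "A"), ("G", "C"), ("C", "G")}


def paired_positions(guide_rna: str, target_dna: str) -> list:
    """Return one boolean per target position (target read 5'->3').

    The guide is also given 5'->3', so it is reversed to model
    antiparallel hybridization with the target strand.
    """
    if not guide_rna or len(guide_rna) != len(target_dna):
        raise ValueError("sequences must be non-empty and of equal length")
    guide_3to5 = guide_rna.upper()[::-1]
    return [(g, t) in PAIRS for g, t in zip(guide_3to5, target_dna.upper())]


def guide_matches(guide_rna, target_dna, site_positions=(), min_match=0.5):
    """Check the overall match fraction and perfect pairing at given sites.

    site_positions are 0-based indices into target_dna marking
    characteristic sites (e.g., an SNP) that must be perfectly paired.
    """
    paired = paired_positions(guide_rna, target_dna)
    fraction = sum(paired) / len(paired)
    return fraction >= min_match and all(paired[i] for i in site_positions)


target = "GATTACAG"    # hypothetical target strand, 5'->3'
perfect = "CUGUAAUC"   # fully complementary guide
mutant = "CUGUGAUC"    # one mismatch, opposite target position 3

print(guide_matches(perfect, target, site_positions=[3]))  # True
print(guide_matches(mutant, target))                       # True (7 of 8 paired)
print(guide_matches(mutant, target, site_positions=[3]))   # False
```

In this sketch a guide with one mismatch still passes the 50% threshold, but it is rejected as soon as the mismatch falls on a position declared to be a characteristic site.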
In one embodiment, one or more gRNAs with targeting sequences different from each other can be included in the detection method, targeting different target sequences.
In the present invention, the single-stranded nucleic acid detector includes, but is not limited to, a single-stranded DNA, a single-stranded RNA, a DNA-RNA hybrid, a nucleic acid analog, a base-modified nucleic acid, a single-stranded nucleic acid detector containing an abasic spacer, and the like; "nucleic acid analogs" include, but are not limited to: locked nucleic acids, bridged nucleic acids, morpholino nucleic acids, ethylene glycol nucleic acids, hexitol nucleic acids, threose nucleic acids, arabinose nucleic acids, 2'-O-methyl RNA, 2'-methoxyacetyl RNA, 2'-fluoro RNA, 2'-amino RNA, 4'-thio RNA, and combinations thereof, including optional ribonucleotide or deoxyribonucleotide residues.
In the present invention, the detectable signal is realized by: vision-based detection, sensor-based detection, color detection, fluorescence signal-based detection, gold nanoparticle-based detection, fluorescence polarization, fluorescence detection, colloidal phase transition/dispersion, electrochemical detection, and semiconductor-based detection.
In the present invention, it is preferable that a fluorescent group and a quencher group are respectively disposed at the two ends of the single-stranded nucleic acid detector, so that a detectable fluorescent signal is produced when the single-stranded nucleic acid detector is cleaved. The fluorescent group is selected from one or more of FAM, FITC, VIC, JOE, TET, CY3, CY5, ROX, Texas Red or LC RED 460; the quenching group is selected from one or more of BHQ1, BHQ2, BHQ3, Dabcyl or TAMRA.
In other embodiments, different labeling molecules are respectively disposed at the 5' end and the 3' end of the single-stranded nucleic acid detector, and colloidal gold detection is used to compare the test results before and after the single-stranded nucleic acid detector is cleaved by the Cas protein; the single-stranded nucleic acid detector produces different color development results on the detection line and the quality control line of the colloidal gold test before and after being cleaved by the Cas protein.
In some embodiments, the method of detecting a target nucleic acid can further comprise comparing the level of the detectable signal to a reference signal level, and determining the amount of the target nucleic acid in the sample based on the level of the detectable signal.
In some embodiments, the method of detecting a target nucleic acid can further comprise using an RNA reporter nucleic acid and a DNA reporter nucleic acid on different channels (e.g., different fluorescent colors), determining the level of detectable signal by measuring the signal levels of the RNA and DNA reporter molecules, and determining the amount of target nucleic acid in the sample by combining (e.g., using the minimum or the product of) the levels of detectable signal.
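The two quantification embodiments above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, the signal is assumed to respond linearly to the amount of target between background and the reference, and the channel values are arbitrary example numbers rather than measured data.

```python
# Illustrative sketch only: function names and numbers are hypothetical.

def estimate_amount(signal, reference_signal, reference_amount, background=0.0):
    """Estimate the target amount by comparison with a reference signal.

    Assumes a linear signal response; a real assay would typically
    rely on a calibration curve instead.
    """
    reference_net = reference_signal - background
    if reference_net <= 0:
        raise ValueError("reference signal must exceed background")
    signal_net = max(signal - background, 0.0)
    return signal_net * reference_amount / reference_net


def combine_channels(rna_signal, dna_signal, mode="min"):
    """Combine the RNA-reporter and DNA-reporter signal levels."""
    if mode == "min":
        return min(rna_signal, dna_signal)
    if mode == "product":
        return rna_signal * dna_signal
    raise ValueError("mode must be 'min' or 'product'")


# A sample reading of 600 against a reference of 1100 (10 units of target),
# with a background of 100 in both readings.
print(estimate_amount(600, 1100, 10.0, background=100))  # 5.0

# Normalized signals from the two reporter channels.
print(combine_channels(0.8, 0.5, mode="min"))            # 0.5
```

Taking the minimum is the more conservative combination, since both channels must report signal for the combined level to be high; the product has a similar effect while weighting the two channels continuously.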
In one embodiment, the target nucleic acid is present within a cell.
In one embodiment, the cell is a prokaryotic cell.
In one embodiment, the cell is a eukaryotic cell.
In one embodiment, the cell is an animal cell.
In one embodiment, the cell is a human cell.
In one embodiment, the cell is a plant cell, such as a cell of a cultivated plant (e.g., cassava, corn, sorghum, wheat, or rice), an alga, a tree, or a vegetable.
In one embodiment, the target gene is present in a nucleic acid molecule (e.g., a plasmid) in vitro.
In one embodiment, the target gene is present in a plasmid.
Definition of terms
In the present invention, unless otherwise specified, scientific and technical terms used herein have the meanings that are commonly understood by those skilled in the art. Also, the procedures of molecular genetics, nucleic acid chemistry, molecular biology, biochemistry, cell culture, microbiology, cell biology, genomics, and recombinant DNA, etc., used herein, are all conventional procedures widely used in the corresponding field. Meanwhile, in order to better understand the present invention, the definitions and explanations of related terms are provided below.
Cas protein
In the present invention, Cas protein, Cas enzyme, Cas effector protein may be used interchangeably; the present inventors have for the first time discovered and identified a Cas effector protein having an amino acid sequence selected from the group consisting of:
(i) a sequence as shown in any one of SEQ ID Nos. 1 to 7;
(ii) a sequence having one or more amino acid substitutions, deletions or additions (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 amino acid substitutions, deletions or additions) compared to the sequence set forth in any one of SEQ ID Nos. 1 to 7; or
(iii) a sequence having at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to a sequence set forth in any one of SEQ ID Nos. 1 to 7.
Nucleic acid cleavage, or cleaving a nucleic acid, as used herein includes DNA or RNA breaks in a target nucleic acid (cis cleavage) and DNA or RNA breaks in a collateral nucleic acid substrate (a single-stranded nucleic acid substrate) (i.e., non-specific or non-targeted trans cleavage) produced by a Cas enzyme as described herein. In some embodiments, the cleavage is a double-stranded DNA break. In some embodiments, the cleavage is a single-stranded DNA break or a single-stranded RNA break.
CRISPR system
As used herein, the terms "clustered regularly interspaced short palindromic repeats (CRISPR)-CRISPR-associated (Cas) (CRISPR-Cas) system" or "CRISPR system" are used interchangeably and have the meaning generally understood by those skilled in the art; such a system generally comprises transcripts or other elements involved in the expression of CRISPR-associated ("Cas") genes, or transcripts or other elements capable of directing the activity of said Cas genes.
CRISPR/Cas complexes
As used herein, the term "CRISPR/Cas complex" refers to a complex formed by the binding of a guide RNA or mature crRNA to a Cas protein, wherein the guide RNA or mature crRNA comprises a guide sequence that hybridizes to a target sequence and a direct repeat that binds to the Cas protein; the complex is capable of recognizing and cleaving a polynucleotide that hybridizes to the guide RNA or mature crRNA.
Guide RNA (guideRNA, gRNA)
As used herein, the terms "guide RNA", "gRNA", "mature crRNA", "guide sequence" are used interchangeably and have the meaning commonly understood by those skilled in the art. In general, the guide RNA may comprise, consist essentially of, or consist of a direct repeat (direct repeat) and a guide sequence.
In certain instances, the guide sequence is any polynucleotide sequence that is sufficiently complementary to the target sequence to hybridize to the target sequence and direct specific binding of the CRISPR/Cas complex to the target sequence. In one embodiment, the degree of complementarity between a guide sequence and its corresponding target sequence is at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or at least 99%, when optimally aligned. Determining the optimal alignment is within the ability of one of ordinary skill in the art. For example, there are published and commercially available alignment algorithms and programs such as, but not limited to, ClustalW, the Smith-Waterman algorithm (e.g., as implemented in MATLAB), Bowtie, Geneious, Biopython, and SeqMan.
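As an illustration of what "optimally aligned" can mean in practice, the following minimal Python sketch computes a Smith-Waterman local alignment score by dynamic programming. The scoring values (match +2, mismatch -1, linear gap -2) and the example sequences are assumptions chosen for illustration; the specification does not prescribe any scoring scheme, and established tools such as those listed above would normally be used instead.

```python
# Illustrative sketch only: scoring parameters are assumed, not specified.

def smith_waterman_score(a: str, b: str, match=2, mismatch=-1, gap=-2) -> int:
    """Return the best local alignment score between sequences a and b."""
    prev = [0] * (len(b) + 1)   # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# The shared core "ACGT" scores 4 matches x 2 = 8 in both cases.
print(smith_waterman_score("ACGT", "ACGT"))          # 8
print(smith_waterman_score("TTACGTTT", "GGACGTGG"))  # 8
```

To assess complementarity rather than identity, one of the two sequences would first be converted to its reverse complement before alignment.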
Target sequence
By "target sequence" is meant a polynucleotide that is targeted by a guide sequence in the gRNA, e.g., a sequence that is complementary to the guide sequence, wherein hybridization between the target sequence and the guide sequence will promote formation of a CRISPR/Cas complex (including Cas protein and gRNA). Complete complementarity is not necessary as long as there is sufficient complementarity to cause hybridization and promote formation of a CRISPR/Cas complex.
The target sequence may comprise any polynucleotide, such as DNA or RNA. In some cases, the target sequence is located intracellularly or extracellularly. In some cases, the target sequence is located in the nucleus or cytoplasm of the cell. In some cases, the target sequence may be located within an organelle of the eukaryotic cell, such as a mitochondrion or chloroplast. Sequences or templates that can be used for recombination into a target locus containing the target sequence are referred to as "editing templates" or "editing polynucleotides" or "editing sequences". In one embodiment, the editing template is an exogenous nucleic acid. In one embodiment, the recombination is homologous recombination.
In the present invention, a "target sequence" or "target polynucleotide" or "target nucleic acid" can be any polynucleotide endogenous or exogenous to a cell (e.g., a eukaryotic cell). For example, the target polynucleotide may be a polynucleotide present in the nucleus of a eukaryotic cell. The target polynucleotide may be a sequence encoding a gene product (e.g., a protein) or a non-coding sequence (e.g., a regulatory polynucleotide or junk DNA). In some cases, the target sequence should be associated with a protospacer adjacent motif (PAM).
Single-stranded nucleic acid detector
The single-stranded nucleic acid detector of the present invention refers to a sequence containing 2 to 200 nucleotides, preferably 2 to 150 nucleotides, preferably 3 to 100 nucleotides, preferably 3 to 30 nucleotides, preferably 4 to 20 nucleotides, and more preferably 5 to 15 nucleotides. Preferably, it is a single-stranded DNA molecule, a single-stranded RNA molecule or a single-stranded DNA-RNA hybrid.
The single-stranded nucleic acid detector comprises different reporter groups or marker molecules at both ends, and does not present a reporter signal when in an initial state (i.e., an uncleaved state), and presents a detectable signal when the single-stranded nucleic acid detector is cleaved, i.e., presents a detectable difference after cleavage from before cleavage.
In one embodiment, the reporter group or the marker molecule comprises a fluorescent group and a quenching group, wherein the fluorescent group is selected from one or more of FAM, FITC, VIC, JOE, TET, CY3, CY5, ROX, Texas Red or LC RED 460; the quenching group is selected from one or more of BHQ1, BHQ2, BHQ3, Dabcyl or TAMRA.
In one embodiment, the single-stranded nucleic acid detector has a first molecule (e.g., FAM or FITC) attached to the 5' end and a second molecule (e.g., biotin) attached to the 3' end. The reaction system containing the single-stranded nucleic acid detector is used in combination with a lateral flow strip to detect the target nucleic acid (preferably, in a colloidal gold detection manner). The lateral flow strip is designed with two capture lines, with an antibody that binds to the first molecule (i.e., a first molecular antibody) at the sample contacting end (colloidal gold), an antibody that binds to the first molecular antibody at the first line (control line), and an antibody that binds to the second molecule (i.e., a second molecular antibody, such as avidin) at the second line (test line). As the reaction flows along the strip, the first molecular antibody binds to the first molecule and carries the cleaved or uncleaved oligonucleotide to the capture lines; the cleaved reporter will bind to the antibody against the first molecular antibody at the first capture line, and the uncleaved reporter will bind to the second molecular antibody at the second capture line. Binding of the reporter group at each line will result in a strong readout/signal (e.g., color). As more reporters are cleaved, more signal will accumulate at the first capture line and less signal will appear at the second line. In certain aspects, the invention relates to the use of a lateral flow strip as described herein for detecting nucleic acids. In certain aspects, the invention relates to a method of detecting nucleic acids using a lateral flow strip as defined herein, e.g., a lateral flow test or a lateral flow immunochromatographic assay.
In some aspects, the molecules in the single-stranded nucleic acid detector may be replaced with each other, or the positions of the molecules may be changed, and the modified form is also included in the present invention as long as the reporting principle is the same as or similar to that of the present invention.
The detection method of the present invention can be used for quantitative detection of a target nucleic acid. The quantitative detection index can be quantified according to the signal intensity of the reporter group, such as the luminous intensity of a fluorescent group, or according to the width of a color development band.
Wild type
As used herein, the term "wild-type" has the meaning commonly understood by those skilled in the art to mean a typical form of an organism, strain, gene, or characteristic that, when it exists in nature, is distinguished from a mutant or variant form, which may be isolated from a source in nature and which has not been intentionally modified by man.
Derivatization
As used herein, the term "derivatize" refers to a chemical modification of an amino acid, polypeptide, or protein to which one or more substituents have been covalently attached. The substituents may also be referred to as side chains.
The derivatized protein is a derivative of the protein, and generally, derivatization of the protein does not adversely affect the desired activity of the protein (e.g., activity in binding to a guide RNA, endonuclease activity, activity in binding to a specific site of a target sequence under the guidance of a guide RNA and cleavage), i.e., the derivative of the protein has the same activity as the protein.
Derivatized proteins
Derivatized proteins, also referred to as "protein derivatives", are modified forms of proteins, for example, forms in which one or more amino acids of the protein are deleted, inserted, modified and/or substituted.
Not naturally occurring
As used herein, the terms "non-naturally occurring" or "engineered" are used interchangeably and indicate human involvement. When these terms are used to describe a nucleic acid molecule or polypeptide, they mean that the nucleic acid molecule or polypeptide is at least substantially free from at least one other component with which it is associated in nature or as found in nature.
Ortholog (orthologue)
As used herein, the term "ortholog" has the meaning commonly understood by those skilled in the art. By way of further guidance, an "ortholog" of a protein as described herein refers to a protein belonging to a different species that performs the same or similar function as the protein being its ortholog.
Identity
As used herein, the term "identity" refers to the degree of sequence matching between two polypeptides or between two nucleic acids. When a position in both of the sequences being compared is occupied by the same base or amino acid monomer subunit (e.g., a position in each of two DNA molecules is occupied by adenine, or a position in each of two polypeptides is occupied by lysine), then the molecules are identical at that position. The "percent identity" between two sequences is the number of matching positions shared by the two sequences divided by the number of positions compared, multiplied by 100. For example, if 6 of 10 positions of two sequences match, then the two sequences have 60% identity. For example, the DNA sequences CTGACT and CAGGTT share 50% identity (3 of the total 6 positions match). Typically, the comparison is made when the two sequences are aligned to yield maximum identity. Such alignments can be performed by using, for example, the method of Needleman et al. (1970) J. Mol. Biol. 48:443-453. The algorithm of E. Meyers and W. Miller (Comput. Appl. Biosci., 4:11-17 (1988)), which has been incorporated into the ALIGN program (version 2.0), can also be used to determine percent identity between two amino acid sequences using a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4. Furthermore, percent identity between two amino acid sequences can be determined using the Needleman and Wunsch (J. Mol. Biol. 48:444-.
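The percent identity calculation defined above can be reproduced with a minimal Python sketch. The function name is hypothetical, and the sketch assumes the two sequences have already been aligned to equal length without gaps; it does not perform the alignment itself.

```python
# Illustrative sketch only: assumes pre-aligned, gap-free sequences.

def percent_identity(seq_a: str, seq_b: str) -> float:
    """Matching positions divided by positions compared, times 100."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Worked example from the definition: positions 1, 3 and 6 match.
print(percent_identity("CTGACT", "CAGGTT"))  # 50.0
```

The same function applies to amino acid sequences written in one-letter code, since it only compares characters position by position.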
Vector
The term "vector" refers to a nucleic acid molecule capable of transporting another nucleic acid molecule to which it is linked. Vectors include, but are not limited to, single-stranded, double-stranded, or partially double-stranded nucleic acid molecules; nucleic acid molecules comprising one or more free ends or no free ends (e.g., circular); nucleic acid molecules comprising DNA, RNA, or both; and other various polynucleotides known in the art. The vector may be introduced into a host cell by transformation, transduction, or transfection, so that the genetic material elements it carries are expressed in the host cell. A vector can be introduced into a host cell to thereby produce a transcript, protein, or peptide, including the proteins, fusion proteins, isolated nucleic acid molecules, and the like described herein (e.g., a CRISPR transcript, such as a nucleic acid transcript, protein, or enzyme). A vector may contain a variety of elements that control expression, including, but not limited to, promoter sequences, transcription initiation sequences, enhancer sequences, selection elements, and reporter genes. In addition, the vector may contain a replication initiation site.
One type of vector is a "plasmid," which refers to a circular double-stranded DNA loop into which additional DNA segments can be inserted, for example, by standard molecular cloning techniques.
Another type of vector is a viral vector, in which the virus-derived DNA or RNA sequences are present in a vector for packaging of viruses (e.g., retroviruses, replication-defective retroviruses, adenoviruses, replication-defective adenoviruses, and adeno-associated viruses). Viral vectors also comprise polynucleotides carried by viruses for transfection into a host cell. Certain vectors (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors) are capable of autonomous replication in a host cell into which they are introduced.
Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operably linked. Such vectors are referred to herein as "expression vectors".
Host cell
As used herein, the term "host cell" refers to a cell into which a vector can be introduced, and includes, but is not limited to, prokaryotic cells such as Escherichia coli or Bacillus subtilis, and eukaryotic cells such as microbial cells, fungal cells, animal cells, and plant cells.
One skilled in the art will appreciate that the design of an expression vector may depend on factors such as the choice of host cell to be transformed, the level of expression desired, and the like.
Regulatory element
As used herein, the term "regulatory element" is intended to include promoters, enhancers, internal ribosome entry sites (IRES), and other expression control elements (e.g., transcription termination signals such as polyadenylation signals and poly-U sequences), which are described in detail in Goeddel, GENE EXPRESSION TECHNOLOGY: METHODS IN ENZYMOLOGY 185, Academic Press, San Diego, Calif. (1990). In some cases, regulatory elements include those sequences that direct constitutive expression of a nucleotide sequence in many types of host cells as well as those sequences that direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences). Tissue-specific promoters may primarily direct expression in a desired tissue of interest, such as muscle, neuron, bone, skin, blood, a particular organ (e.g., liver, pancreas), or a particular cell type (e.g., lymphocyte). In certain instances, the regulatory element may also direct expression in a time-dependent manner (e.g., in a cell cycle-dependent or developmental stage-dependent manner), which may or may not be tissue or cell type specific. In certain instances, the term "regulatory element" encompasses enhancer elements, such as WPRE; a CMV enhancer; the R-U5' fragment in the LTR of HTLV-I (Mol. Cell. Biol., Vol. 8 (1), pp. 466-472, 1988); the SV40 enhancer; and the intron sequence between exons 2 and 3 of rabbit β-globin (Proc. Natl. Acad. Sci. USA, Vol. 78 (3), pp. 1527-31, 1981).
Promoters
As used herein, the term "promoter" has the meaning well known to those skilled in the art and refers to a non-coding nucleotide sequence located upstream of a gene that promotes expression of the downstream gene. A constitutive promoter is a nucleotide sequence that, when operably linked to a polynucleotide that encodes or defines a gene product, results in production of the gene product in the cell under most or all physiological conditions of the cell. An inducible promoter is a nucleotide sequence that, when operably linked to a polynucleotide that encodes or defines a gene product, causes the gene product to be produced in the cell substantially only when an inducer corresponding to the promoter is present in the cell. A tissue-specific promoter is a nucleotide sequence that, when operably linked to a polynucleotide that encodes or defines a gene product, results in production of the gene product in the cell substantially only when the cell is of the tissue type to which the promoter corresponds.
NLS
A "nuclear localization signal" or "nuclear localization sequence" (NLS) is an amino acid sequence that "tags" a protein for import into the nucleus by nuclear transport, i.e., a protein bearing an NLS is transported to the nucleus. Typically, an NLS contains positively charged Lys or Arg residues exposed at the surface of the protein. Exemplary nuclear localization sequences include, but are not limited to, NLSs from: SV40 large T antigen, EGL-13, c-Myc and TUS protein. In some embodiments, the NLS comprises the sequence PKKKRKV. In some embodiments, the NLS comprises the sequence AVKRPAATKKAGQAKKKKLD. In some embodiments, the NLS comprises the sequence PAAKRVKLD. In some embodiments, the NLS comprises the sequence MSRRRKANPTKLSENAKKLAKEVEN. In some embodiments, the NLS comprises the sequence KLKIKRPVK. Other nuclear localization sequences include, but are not limited to, the acidic M9 domain of hnRNP A1, the sequence KIPIK in the yeast transcriptional repressor Matα2, and PY-NLS.
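As an illustration of how the exemplary NLS sequences listed above might be located in a protein sequence, the following is a minimal sketch only. The function name `find_nls` and the simple exact-match search are assumptions made for this example and are not part of the described method; real NLS prediction generally requires more than exact motif matching.

```python
# Exemplary NLS sequences quoted in the definition above.
NLS_MOTIFS = [
    "PKKKRKV",
    "AVKRPAATKKAGQAKKKKLD",
    "PAAKRVKLD",
    "MSRRRKANPTKLSENAKKLAKEVEN",
    "KLKIKRPVK",
]


def find_nls(protein):
    """Return (motif, 0-based start) for every exact occurrence of a listed motif.

    Hypothetical helper: exact substring matching only, one-letter amino acid code.
    """
    seq = protein.upper()
    hits = []
    for motif in NLS_MOTIFS:
        start = seq.find(motif)
        while start != -1:
            hits.append((motif, start))
            # Continue searching after the current hit to catch repeats.
            start = seq.find(motif, start + 1)
    return hits


if __name__ == "__main__":
    # Made-up test protein carrying the PKKKRKV motif near its N-terminus.
    print(find_nls("MAPKKKRKVGS"))
```

A fusion protein carrying one of these motifs at either terminus would be reported with the motif and its start position; a protein without any listed motif yields an empty list.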
Operably linked
As used herein, the term "operably linked" is intended to mean that the nucleotide sequence of interest is linked to the one or more regulatory elements in a manner that allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell).
Complementarity
As used herein, the term "complementarity" refers to the ability of a nucleic acid to form one or more hydrogen bonds with another nucleic acid sequence by conventional Watson-Crick base pairing or other non-conventional types of pairing. Percent complementarity refers to the percentage of residues in a nucleic acid molecule that can form hydrogen bonds (e.g., Watson-Crick base pairing) with a second nucleic acid sequence (e.g., 5, 6, 7, 8, 9, or 10 out of 10 being 50%, 60%, 70%, 80%, 90%, and 100% complementary, respectively). "Completely complementary" means that all consecutive residues of one nucleic acid sequence hydrogen bond with the same number of consecutive residues in a second nucleic acid sequence. As used herein, "substantially complementary" refers to a degree of complementarity of at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99%, or 100% over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50 or more nucleotides, or to two nucleic acids that hybridize under stringent conditions.
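The percent complementarity defined above can be illustrated with a short calculation. The following is a minimal sketch, assuming two strands of equal length, both written 5' to 3', that are paired in antiparallel orientation with Watson-Crick pairs only; the function name `percent_complementarity` is hypothetical.

```python
# Watson-Crick pairs for DNA and RNA alphabets (no wobble pairs in this sketch).
WC_PAIRS = {
    ("A", "T"), ("T", "A"),
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
}


def percent_complementarity(strand_a, strand_b):
    """Percentage of positions in strand_a that pair with strand_b.

    Both strands are given 5'->3'; strand_b is reversed so that the two
    strands are compared in antiparallel orientation.
    """
    a = strand_a.upper()
    b = strand_b.upper()
    if not a or len(a) != len(b):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum((x, y) in WC_PAIRS for x, y in zip(a, reversed(b)))
    return 100.0 * paired / len(a)


if __name__ == "__main__":
    # 10 of 10 residues pair.
    print(percent_complementarity("AAAAACCCCC", "GGGGGTTTTT"))
    # Only the five C residues find a G partner: 5 of 10.
    print(percent_complementarity("AAAAACCCCC", "GGGGGAAAAA"))
```

The second call reproduces the "5 out of 10 is 50%" case from the definition.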
Stringent conditions
As used herein, "stringent conditions" for hybridization refer to conditions under which a nucleic acid having complementarity to a target sequence predominantly hybridizes to the target sequence and does not substantially hybridize to non-target sequences. Stringent conditions are generally sequence dependent and vary depending on a number of factors. In general, the longer the sequence, the higher the temperature at which the sequence specifically hybridizes to its target sequence.
Hybridization
The terms "hybridize", "complementary", and "substantially complementary" mean that a nucleic acid (e.g., RNA or DNA) comprises a nucleotide sequence that enables it to bind non-covalently to another nucleic acid, i.e., to form Watson-Crick base pairs and/or G/U base pairs, or to "anneal" or "hybridize", in a sequence-specific, antiparallel manner (i.e., the nucleic acid binds specifically to the complementary nucleic acid).
Hybridization requires that the two nucleic acids contain complementary sequences, although mismatches between bases are possible. Suitable conditions for hybridization between two nucleic acids depend on the length and degree of complementarity of the nucleic acids, variables well known in the art. Typically, the length of the hybridizable nucleic acid is 8 nucleotides or more (e.g., 10 nucleotides or more, 12 nucleotides or more, 15 nucleotides or more, 20 nucleotides or more, 22 nucleotides or more, 25 nucleotides or more, or 30 nucleotides or more).
It is understood that the sequence of a polynucleotide need not be 100% complementary to the sequence of its target nucleic acid to specifically hybridize. A polynucleotide may have 60% or more, 65% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 98% or more, 99% or more, 99.5% or more, or 100% sequence complementarity to the target region to which it hybridizes.
Hybridization of a target sequence to a gRNA means that at least 60%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% of the nucleic acid sequences of the target sequence and the gRNA can hybridize to form a complex; or that at least 12, 15, 16, 17, 18, 19, 20, 21, 22 or more bases of the nucleic acid sequences of the target sequence and the gRNA can pair complementarily and hybridize to form a complex.
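The two alternative criteria above (a minimum percentage of paired bases, or a minimum number of paired bases) can be expressed as a simple check. The sketch below is illustrative only: the function names, the default thresholds (0.60 and 12, the lowest values in the lists above), and the restriction to standard RNA-DNA pairs without G/U wobble are assumptions made for the example.

```python
# Standard RNA (guide) to DNA (target) base pairs; wobble pairs are not counted here.
RNA_DNA_PAIRS = {("A", "T"), ("U", "A"), ("G", "C"), ("C", "G")}


def paired_bases(grna_spacer, target_site):
    """Count paired positions between an RNA spacer and a DNA site of equal length.

    Both sequences are written 5'->3' and are compared in antiparallel orientation.
    """
    spacer = grna_spacer.upper()
    site = target_site.upper()
    if not spacer or len(spacer) != len(site):
        raise ValueError("spacer and site must be non-empty and of equal length")
    return sum((r, d) in RNA_DNA_PAIRS for r, d in zip(spacer, reversed(site)))


def can_hybridize(grna_spacer, target_site, min_fraction=0.60, min_bases=12):
    """True if either the fraction criterion or the base-count criterion is met."""
    n = paired_bases(grna_spacer, target_site)
    return n / len(grna_spacer) >= min_fraction or n >= min_bases


if __name__ == "__main__":
    # Hypothetical 16-nt spacer and its fully complementary DNA site.
    print(can_hybridize("ACGUACGUACGUACGU", "ACGTACGTACGTACGT"))
    # Same spacer against a poly-A site: only the U residues can pair.
    print(can_hybridize("ACGUACGUACGUACGU", "AAAAAAAAAAAAAAAA"))
```

Stricter embodiments can be modelled by raising `min_fraction` or `min_bases` to any of the other values listed.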
Expression
As used herein, the term "expression" refers to the process by which a polynucleotide is transcribed from a DNA template (e.g., into mRNA or other RNA transcript) and/or the process by which transcribed mRNA is subsequently translated into a peptide, polypeptide, or protein. The transcripts and encoded polypeptides may be collectively referred to as "gene products". If the polynucleotide is derived from genomic DNA, expression may include splicing of mRNA in eukaryotic cells.
Linker
As used herein, the term "linker" refers to a linear polypeptide formed from a plurality of amino acid residues joined by peptide bonds. The linker of the present invention may be an artificially synthesized amino acid sequence, or a naturally occurring polypeptide sequence, such as a polypeptide having a hinge region function. Such linker polypeptides are well known in the art (see, e.g., Holliger, P. et al (1993) Proc. Natl. Acad. Sci. USA 90: 6444-.
Treatment
As used herein, the term "treating" refers to treating or curing a disorder, delaying the onset of symptoms of a disorder, and/or delaying the development of a disorder.
Subject
As used herein, the term "subject" includes, but is not limited to, various animals, plants, and microorganisms.
Animal
For example, a mammal, such as a bovine, equine, ovine, porcine, canine, feline, lagomorph, rodent (e.g., mouse or rat), non-human primate (e.g., macaque or cynomolgus monkey), or human. In certain embodiments, the subject (e.g., human) has a disorder (e.g., a disorder resulting from a deficiency in a disease-associated gene).
Plant
The term "plant" is to be understood as including any differentiated multicellular organism capable of photosynthesis, including crop plants at any stage of maturity or development, in particular monocotyledonous or dicotyledonous plants; vegetable crops, including artichokes, corm cabbages, sesames, leeks, asparagus, lettuce (e.g., head lettuce, leaf lettuce), bok choy, yellow croaker, melons (e.g., melons, watermelons, crow's melon, honeydew melon, cantaloupe), cole crops (e.g., brussels sprouts, cabbage, cauliflower, broccoli, collards, headless cabbages, Chinese cabbages), cephalanoplos, carrots, napa cabbage, okra, onions, celery, chickpea, parsnip, endive, potato, cucurbits (e.g., zucchini, squash, pumpkin), radish, dried onion, turnip cabbage, eggplant, salsify, endive, shallot, endive, garlic, spinach, green onion, squash, leafy vegetables (greens), beets (sugar and fodder beets), sweet potato, lettuce, horseradish, tomato, turnip, and spices; fruit and/or vine crops, such as apple, apricot, cherry, nectarine, peach, pear, plum, prune, cherry, quince, almond, chestnut, hazelnut, pecan, pistachio, walnut, citrus, blueberry, boysenberry, raspberry, gooseberry, loganberry, raspberry, strawberry, blackberry, grape, avocado, banana, kiwi, persimmon, pomegranate, pineapple, tropical fruit, pome, melon, mango, papaya, and lychee; field crops, such as clover, alfalfa, evening primrose, meadowfoam, corn/maize (fodder corn, sweet corn, popcorn), hops, jojoba, peanuts, rice, safflower, small grain crops (barley, oats, rye, wheat, etc.), sorghum, tobacco, kapok, legumes (beans, lentils, peas, soybeans), oleaginous plants (oilseed rape, mustard, poppy, olives, sunflowers, coconut, castor oil plants, cocoa beans, groundnuts), Arabidopsis, fibrous plants (cotton, flax, jute), Lauraceae (cinnamon, camphor), or a plant such as coffee, sugar cane, tea, and natural rubber plants; and/or bedding plants, such as flowering plants, cacti, succulents and/or ornamental plants; and trees, such as forest trees (broad-leaved and evergreen trees, such as conifers), fruit trees, ornamental trees, and nut-bearing trees, as well as shrubs and other plantlets.
Advantageous effects of the invention
The invention provides novel Cas enzymes that exhibit nuclease activity both in vivo and in vitro and that have broad application prospects.
Embodiments of the present invention will be described in detail below with reference to the drawings and examples, but those skilled in the art will understand that the following drawings and examples are only for illustrating the present invention and do not limit the scope of the present invention. Various objects and advantageous aspects of the present invention will become apparent to those skilled in the art from the accompanying drawings and the following detailed description of the preferred embodiments.
Drawings
FIG. 1 shows the results of in vitro cleavage activity experiments for different Cas proteins.
FIG. 2 shows the editing efficiency of Cas-sf4 and Cas-sf1 in protoplasts.
FIG. 3 shows the types of gene edits produced by Cas-sf4 and Cas-sf1 in protoplasts.
FIG. 4 is a graph of fluorescence results of Cas-sf1 when used for in vitro nucleic acid detection.
FIG. 5 is a graph of fluorescence results of Cas-sf4 when used for in vitro nucleic acid detection.
FIG. 6 is a graph of fluorescence results of Cas-sf8 when used for in vitro nucleic acid detection.
FIG. 7 is a graph of fluorescence results of Cas-sf9 when used for in vitro nucleic acid detection.
FIG. 8 is a graph of fluorescence results of Cas-sf10 when used for in vitro nucleic acid detection.
Sequence information
Figure BDA0003496308980000131
Figure BDA0003496308980000141
Detailed Description
The following examples are intended only to illustrate the invention and are not intended to limit it. Unless otherwise indicated, the experiments and procedures described in the examples were performed essentially according to conventional methods well known in the art and described in various references. For example, conventional techniques in immunology, biochemistry, chemistry, molecular biology, microbiology, cell biology, genomics, and recombinant DNA used in the present invention can be found in Sambrook, Fritsch, and Maniatis, MOLECULAR CLONING: A LABORATORY MANUAL, 2nd edition (1989); CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (F. M. Ausubel et al., eds. (1987)); the METHODS IN ENZYMOLOGY series (Academic Press); PCR 2: A PRACTICAL APPROACH (M. J. MacPherson, B. D. Hames and G. R. Taylor, eds. (1995)); Harlow and Lane, eds. (1988), ANTIBODIES: A LABORATORY MANUAL; and ANIMAL CELL CULTURE (R. I. Freshney, ed. (1987)).
In addition, where specific conditions are not specified in the examples, the experiments were conducted under conventional conditions or under the conditions recommended by the manufacturer. Reagents or instruments for which no manufacturer is indicated are conventional, commercially available products. The examples are given by way of illustration and are not intended to limit the scope of the invention as claimed. All publications and other references mentioned herein are incorporated by reference in their entirety.
Example 1 acquisition of Cas protein
The inventors analyzed uncultured metagenomes and, through redundancy removal and protein clustering analysis, identified novel Cas enzymes, named Cas-sf4, Cas-sf1, Cas-sf3, Cas-sf6, Cas-sf8, Cas-sf9 and Cas-sf10, whose amino acid sequences are shown in SEQ ID Nos. 1-7, respectively. BLAST results show that these Cas proteins have low sequence identity with previously reported Cas proteins.
Analysis shows that the prototype direct repeat sequences of the gRNAs corresponding to Cas-sf4, Cas-sf1, Cas-sf3, Cas-sf6, Cas-sf8, Cas-sf9 and Cas-sf10 are shown in SEQ ID Nos. 8-14, respectively, and that the corresponding PAM is TTN (where N can be any base); the corresponding mature direct repeat sequences are shown in SEQ ID Nos. 16-22, respectively.
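To illustrate how candidate target sites with a TTN PAM might be enumerated in a DNA sequence, the following is a minimal sketch. It rests on assumptions that are not stated in this example: the PAM is taken to lie immediately 5' of the protospacer on the scanned strand, the protospacer length is set to 20 nt, and only the given strand is scanned. The function name `find_pam_sites` is hypothetical.

```python
import re


def find_pam_sites(dna, spacer_len=20):
    """List (pam_start, pam, protospacer) for every TTN PAM on the given strand.

    Assumption for this sketch: the protospacer is the spacer_len nucleotides
    immediately 3' of the PAM. Sites too close to the end are skipped.
    """
    seq = dna.upper()
    sites = []
    # A lookahead is used so that overlapping PAMs (e.g. in "TTTT") are all found.
    for match in re.finditer(r"(?=(TT[ACGT]))", seq):
        start = match.start() + 3
        protospacer = seq[start:start + spacer_len]
        if len(protospacer) == spacer_len:
            sites.append((match.start(), match.group(1), protospacer))
    return sites


if __name__ == "__main__":
    # Made-up sequence: 2 nt flank, a TTC PAM, a 20-nt protospacer, 2 nt flank.
    example = "GG" + "TTC" + "ACGTACGTACGTACGTACGT" + "GG"
    for site in find_pam_sites(example):
        print(site)
```

Scanning the reverse complement of the input in the same way would cover the opposite strand.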
Example 2 validation of in vitro cleavage Activity of Cas protein
The Cas-sf4, Cas-sf1, Cas-sf3, Cas-sf6, Cas-sf8, Cas-sf9 and Cas-sf10 proteins were each cloned into the pET30a expression vector, transformed into Escherichia coli, and purified to obtain the purified target proteins.
1 µg of purified Cas protein, 500 ng of in vitro transcribed gRNA, and 300 ng of a PCR product carrying a TTC PAM were incubated at 37 °C for 1 h or overnight for cleavage; the sequence of the PCR product is shown in SEQ ID No. 15. The gRNA sequences for the different Cas proteins are shown in Table 1.
TABLE 1 gRNA sequences utilized in vitro cleavage experiments
Figure BDA0003496308980000142
Figure BDA0003496308980000151
The results are shown in FIG. 1, in which the arrow marks the position of the PCR product. As can be seen from FIG. 1, the different Cas proteins cleaved the PCR product to different extents, with Cas-sf4 showing the highest cleavage efficiency.
Example 2 efficiency of Cas protein editing in maize protoplasts
To verify whether the Cas proteins can produce an editing effect in eukaryotic cells, plasmids expressing the different Cas proteins and plasmids expressing crRNA were first constructed for a target with a TTTC PAM and transformed into maize protoplasts. After transformation, DNA was extracted, a fragment containing the target site was amplified, and next-generation sequencing was performed. The targeted maize gene was SBE2.2 (Zm00001d003817); the gRNA sequences for the different Cas proteins are shown in Table 2.
TABLE 2 gRNA sequences utilized in maize protoplasts
Figure BDA0003496308980000152
The test results show that Cas-sf4 and Cas-sf1 exhibit clear editing activity in maize protoplasts. As shown in FIG. 2, the editing efficiency of Cas-sf1 in maize protoplasts was 2.4%, and that of Cas-sf4 was 0.7%. The types of edits produced by Cas-sf4 and Cas-sf1 in SBE2.2 are shown in FIG. 3.
Example 3 application of Cas protein in vitro nucleic acid detection
In this example, the trans cleavage activity of the Cas enzymes was verified by in vitro detection. A gRNA capable of pairing with the target nucleic acid is used to guide the Cas enzyme to recognize and bind the target nucleic acid; the Cas enzyme then activates its trans cleavage activity against any single-stranded nucleic acid, thereby cleaving the single-stranded nucleic acid detector in the system. The two ends of the single-stranded nucleic acid detector carry a fluorescent group and a quenching group, respectively, so that fluorescence is emitted when the detector is cleaved. In other embodiments, the two ends of the single-stranded nucleic acid detector may instead carry labels that can be detected by colloidal gold.
In this example, the target nucleic acid was a single-stranded DNA, N-B-i3g1-ssDNA0, having the sequence: cgacattccgaagaacgctgaagcgctgggggcaaattgtgcaatttgcggc.
From the 5' end to the 3' end, the gRNA consists of the DR region of the respective Cas protein followed by a sequence targeting the target nucleic acid; the targeting sequence is cccccagcgcuucagcguuc;
the single-stranded nucleic acid detector sequence was FAM-TTATT-BHQ1.
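The relationship between the gRNA targeting sequence and the single-stranded DNA target given above can be checked with a short script. This is a minimal sketch: the helper name `rna_revcomp_as_dna` is hypothetical, and the script simply tests whether the reverse complement of the targeting sequence occurs within the target.

```python
# Complement map covering both RNA (U) and DNA (T) input, producing DNA output.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}

# Sequences quoted from this example.
TARGET_SSDNA = "cgacattccgaagaacgctgaagcgctgggggcaaattgtgcaatttgcggc".upper()
GRNA_TARGETING = "cccccagcgcuucagcguuc"


def rna_revcomp_as_dna(rna):
    """Reverse complement of an RNA sequence, written as DNA (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


# The DNA site that the targeting sequence would pair with.
binding_site = rna_revcomp_as_dna(GRNA_TARGETING)
# 0-based position of that site within the target; -1 would mean "not found".
position = TARGET_SSDNA.find(binding_site)

if __name__ == "__main__":
    print(binding_site, position)
```

With the sequences as quoted, the 20-nt binding site GAACGCTGAAGCGCTGGGGG is found within the target starting at 0-based position 12, i.e., the targeting sequence is fully complementary to an internal region of N-B-i3g1-ssDNA0.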
The following reaction system was used: Cas enzyme at a final concentration of 50 nM, gRNA at a final concentration of 50 nM, target nucleic acid at a final concentration of 500 nM, and single-stranded nucleic acid detector at a final concentration of 200 nM. The reaction was incubated at 37 °C and FAM fluorescence was read every 1 min. No target nucleic acid was added to the control group.
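For setting up such a reaction, the volume of each stock solution follows from the dilution relation C1·V1 = C2·V2. The sketch below uses the final concentrations stated above; the stock concentrations and the 20 µL reaction volume are hypothetical values chosen only for illustration and are not taken from this example.

```python
# Final concentrations (nM) as stated in the reaction system above.
FINAL_NM = {
    "Cas enzyme": 50,
    "gRNA": 50,
    "target nucleic acid": 500,
    "ssDNA detector": 200,
}

# Hypothetical stock concentrations (nM); not specified in this example.
STOCK_NM = {
    "Cas enzyme": 1000,
    "gRNA": 1000,
    "target nucleic acid": 10000,
    "ssDNA detector": 2000,
}


def component_volumes(final_nm, stock_nm, total_ul):
    """Volume (µL) of each stock needed, plus the remaining buffer/water volume."""
    volumes = {
        name: final_nm[name] * total_ul / stock_nm[name] for name in final_nm
    }
    remainder = total_ul - sum(volumes.values())
    if remainder < 0:
        raise ValueError("stocks are too dilute for the requested reaction volume")
    volumes["buffer/water"] = remainder
    return volumes


if __name__ == "__main__":
    # Hypothetical 20 µL reaction.
    for name, vol in component_volumes(FINAL_NM, STOCK_NM, 20.0).items():
        print(f"{name}: {vol:.2f} µL")
```

For the control group, the target nucleic acid volume would be replaced by an equal volume of buffer or water.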
In this example, the trans cleavage activity of Cas-sf4, Cas-sf1, Cas-sf6, Cas-sf8, Cas-sf9 and Cas-sf10 was assayed, and the DR region of each gRNA was the mature direct repeat of the corresponding protein, shown in SEQ ID Nos. 18, 19, 21, 22, 23, and 24 for Cas-sf4, Cas-sf1, Cas-sf6, Cas-sf8, Cas-sf9, and Cas-sf10, respectively. Of these, no significant trans cleavage activity was detected for Cas-sf6. As shown in FIGS. 4-8, compared with the control without target nucleic acid, Cas-sf4, Cas-sf1, Cas-sf8, Cas-sf9 and Cas-sf10 cleaved the single-stranded nucleic acid detector in the system in the presence of the target nucleic acid, and fluorescence was rapidly reported. These experiments show that, in combination with a single-stranded nucleic acid detector, Cas-sf4, Cas-sf1, Cas-sf8, Cas-sf9 and Cas-sf10 can be used for the detection of target nucleic acids. In FIGS. 4 to 8, line 1 shows the experimental group with target nucleic acid added, and line 2 shows the control group without target nucleic acid.
While specific embodiments of the invention have been described in detail, those skilled in the art will understand that: various modifications and changes in detail can be made in light of the overall teachings of the disclosure, and such changes are intended to be within the scope of the present invention. A full appreciation of the invention is gained by taking the entire specification as a whole in the light of the appended claims and any equivalents thereof.
SEQUENCE LISTING
<110> Shunheng Biotech Co., Ltd
<120> novel Cas enzyme and use
<130> SF063
<160> 22
<170> PatentIn version 3.5
<210> 1
<211> 1285
<212> PRT
<213> Artificial Sequence
<220>
<223> Cas-sf4
<400> 1
Met Ile Lys Met Met Lys Glu Lys Ser Ile Trp Asn Glu Phe Thr Asn
1 5 10 15
Met Tyr Ser Ile Ser Lys Thr Leu Arg Phe Lys Leu Lys Pro Ile Gly
20 25 30
Lys Thr Phe Asp Asn Ile Lys Lys Lys Gly Leu Ile Glu Glu Asp Lys
35 40 45
Asp Arg Glu Lys Gly Phe Asn Asn Ile Lys Lys Ile Met Asp Asp Tyr
50 55 60
Tyr Arg Tyr Phe Ile Glu Lys Cys Leu Asn Gly Ile Lys Leu Glu Lys
65 70 75 80
Lys Asp Leu Glu Ala Tyr Gln Lys Val Tyr Glu Asp Leu Lys Lys Asp
85 90 95
Asn Lys Asn Gln Lys Leu Lys Asn Lys Tyr Ala Lys Asn Gln Thr Ile
100 105 110
Leu Arg Lys Glu Ile Tyr Asn His Ile Lys Ser Gln Lys Glu Phe Ser
115 120 125
Gln Leu Phe Lys Lys Glu Leu Ile Thr His Ile Leu Pro Glu Trp Leu
130 135 140
Glu Lys Asn Lys Arg Leu Lys Asp Lys Asn Leu Val Asn Gln Phe Asn
145 150 155 160
Asn Trp Ser Thr Tyr Phe Thr Gly Phe Phe Asn Asn Arg Lys Asn Val
165 170 175
Phe Ser Glu Lys Glu Ile Pro Thr Ser Ile Ile Tyr Arg Ile Val His
180 185 190
Val Asn Leu Pro Lys Tyr Leu Asp Asn Val Ser Arg Phe Glu Lys Ile
195 200 205
Lys Glu Phe Asn Leu Asp Leu Lys Thr Leu Glu Asn Asp Phe Lys Asp
210 215 220
Val Leu Asp Asn Met Asp Leu Asn Glu Phe Phe Ser Val Asn Asn Phe
225 230 235 240
Asn Asn Phe Leu Asn Gln Ser Gly Ile Asp Lys Phe Asn Leu Val Ile
245 250 255
Gly Gly Lys Ser Leu Glu Asp Asn Lys Lys Ile Lys Gly Leu Asn Glu
260 265 270
Tyr Ile Asn Glu Phe Ser Gln Lys Glu Ser Asp Lys Ala Lys Arg Lys
275 280 285
Asn Ile Arg Lys Leu Lys Phe Ala Val Leu Phe Lys Gln Ile Leu Ser
290 295 300
Asp Ser Glu Ser Ser Ser Phe Val Ile Glu Lys Phe Lys Asp Lys Lys
305 310 315 320
Glu Ile Phe Glu Thr Ile Asp Val Phe Tyr Lys Glu Phe Asn Lys Tyr
325 330 335
Ser Ser Lys Ile Lys Glu Ser Ile Thr Lys Leu Asn Asn Cys Asp Ser
340 345 350
Lys Asn Val Tyr Ile Lys Asn Asp Thr Asn Leu Thr Gln Ile Ser Lys
355 360 365
Gly Leu Phe Asn Asp Trp Asn Lys Ile Asp Gly Gly Leu Arg Arg His
370 375 380
Phe Glu Asn Glu Leu Lys Ile Lys Lys Leu Thr Asp Lys Gln Arg Glu
385 390 395 400
Lys Glu Leu Asp Lys Cys Met Lys Ser Lys Tyr Phe Ser Leu Tyr Glu
405 410 415
Ile Glu Lys Gly Ile Asn Ser Leu Glu Leu Lys Asp Lys Lys Ser Ile
420 425 430
Ile Asp Tyr Phe Leu Asn Phe Ser Lys Ser Lys Asn Asp Ser Lys Val
435 440 445
Asp Leu Phe Glu Asn Ile Lys Ser Lys Tyr Ser Glu Phe Asn Lys Ile
450 455 460
Asp Arg Asn Lys Thr Thr Lys Leu Thr Glu Lys Ser Ser Glu Asn Asp
465 470 475 480
Val Glu Leu Ile Lys Thr Phe Leu Asp Ala Ile Met Glu Leu Tyr His
485 490 495
Phe Ile Lys Pro Leu His Leu Asn Phe Lys Lys Asn Glu Asp Glu Lys
500 505 510
Gly Ser Asn Ala Leu Glu Thr Asp Ser Asp Phe Tyr Asn Tyr Phe Asn
515 520 525
Glu Ile Phe Asp Lys Leu Gly Glu Ile Ile Pro Leu Tyr Asn Lys Val
530 535 540
Arg Asn Tyr Val Thr Gln Lys Pro Phe Ser Thr Lys Lys Phe Lys Leu
545 550 555 560
Asn Phe Glu Asn Ser Thr Leu Ala Ala Gly Trp Asp Ile Asn Lys Glu
565 570 575
Thr Ala Asn Thr Ala Ile Ile Leu Lys Lys Gly Thr Asp Phe Tyr Leu
580 585 590
Gly Ile Ile Asp Lys Asn Asn Thr Lys Ile Phe Leu Asn Gln Gln Asn
595 600 605
Ser Asn Ser Ser Val Val Tyr Glu Lys Leu Cys Tyr Lys Leu Val Ser
610 615 620
Gly Ala Asn Lys Met Leu Pro Lys Val Phe Leu Ser Glu Lys Gly Val
625 630 635 640
Lys Thr Phe Lys Pro Ser Lys Glu Ile Leu Lys Leu Tyr Lys Asn Glu
645 650 655
Glu His Lys Lys Gly Asn Thr Phe Ser Ile Glu Ser Cys His Lys Leu
660 665 670
Ile Asp Tyr Phe Lys Glu Cys Met Pro Asn Tyr Lys Pro Asn Pro Asn
675 680 685
Asp Lys Tyr Gly Trp Asp Val Phe Lys Phe Lys Phe Ser Asp Thr Lys
690 695 700
Thr Tyr Lys Asp Ile Ser Asp Phe Tyr Arg Glu Val Glu Asn Gln Gly
705 710 715 720
Tyr Lys Ile Trp Phe Glu Asn Ile Asp Glu Ser Tyr Leu Asn Lys Leu
725 730 735
Val Asp Glu Gly Lys Leu Tyr Leu Phe Gln Ile Trp Asn Lys Asp Phe
740 745 750
Ser Lys Tyr Ser Lys Gly Lys Pro Asn Leu His Thr Met Tyr Trp Lys
755 760 765
Glu Leu Phe Ser Glu Glu Asn Leu Lys Asp Val Ile Tyr Lys Leu Asn
770 775 780
Gly Glu Ala Glu Leu Phe Tyr Arg Glu Ala Ser Ile Lys Arg Gln Ile
785 790 795 800
Thr His Pro Lys Asn Ile Ser Ile Asp Asn Lys Asn Pro Ile Lys Asn
805 810 815
Lys Glu Lys Ser Thr Phe Asn Tyr Asp Leu Ile Lys Asn Lys Arg Tyr
820 825 830
Ser Glu Asp Ser Phe Met Phe His Cys Pro Ile Thr Leu Asn Phe Lys
835 840 845
Ala Lys Asp Gln Ser Lys Ser Ile His Lys Leu Val Asn Lys Phe Ile
850 855 860
His Asp Thr Asp Lys Lys Ile Asn Ile Val Gly Ile Asp Arg Gly Glu
865 870 875 880
Arg Asn Leu Ala Tyr Tyr Thr Leu Val Asn Ser Asp Gly Asn Ile Ile
885 890 895
Glu Gln Glu Ser Phe Asn Ile Ile Ser Asp Asp Leu Gln Arg Lys Phe
900 905 910
Asp Tyr Gln Glu Lys Leu Asp Gln Ile Glu Gly Asp Arg Asp Lys Ala
915 920 925
Arg Lys Asn Trp Lys Lys Ile Ala Asn Ile Lys Glu Met Lys Thr Gly
930 935 940
Tyr Leu Ser Gln Val Ile His Lys Ile Ser Lys Leu Val Ile Glu His
945 950 955 960
Asp Ala Ile Ile Val Leu Glu Asp Leu Asn Tyr Gly Phe Lys Arg Gly
965 970 975
Arg Phe Lys Ile Glu Lys Gln Ile Tyr Gln Lys Phe Glu Lys Met Leu
980 985 990
Val Asp Lys Leu Asn Tyr Leu Val Phe Lys Gly Ile Asp Lys Thr Leu
995 1000 1005
Ser Gly Gly Asn Leu Asn Ala Tyr Gln Leu Thr Asn Lys Phe Glu
1010 1015 1020
Ser Phe Gln Lys Leu Gly Lys Gln Ser Gly Ile Ile Tyr Tyr Val
1025 1030 1035
Asp Ala Tyr Lys Thr Ser Lys Ile Cys Pro Lys Thr Gly Phe Val
1040 1045 1050
Asn Leu Leu Tyr Pro Lys Phe Glu Asn Ile Leu Lys Ser Gln Glu
1055 1060 1065
Phe Ile Lys Lys Phe Lys Ser Ile Lys Tyr His Lys Asp Glu Asp
1070 1075 1080
Leu Phe Glu Phe Asn Phe Asn Tyr Ser Asp Phe Lys Lys Asp Gln
1085 1090 1095
Lys Glu Lys Leu Glu Gln Asp Asn Trp Ser Ile Trp Ser Asn Gly
1100 1105 1110
Thr Lys Leu Ile Asn Phe Arg Asp Lys Glu Asn Asn Asn Gln Trp
1115 1120 1125
Thr Thr Lys Glu Phe Lys Val Thr Glu Lys Leu Lys Glu Leu Phe
1130 1135 1140
Glu Asn His Asn Ile Asp Tyr Asn Ser Gly Asn Asp Leu Ile Glu
1145 1150 1155
Gln Ile Val Thr Ile Glu Asn Lys Ser Phe Tyr Glu Ser Leu Ile
1160 1165 1170
Tyr Ile Leu Lys Ile Ile Leu Lys Leu Arg Asn Ser Tyr Ser Asp
1175 1180 1185
Phe Glu Val Lys Gln Phe Lys Lys Lys Leu Gly Asn Lys Phe Lys
1190 1195 1200
Glu Cys Asp Tyr Asp Tyr Ile Leu Ser Cys Val Lys Asp Lys Glu
1205 1210 1215
Gly Asn Phe Phe Asp Ser Arg His Ala Lys Thr Asn Glu Val Lys
1220 1225 1230
Asp Ala Asp Ala Asn Gly Ala Phe His Ile Ala Leu Lys Gly Leu
1235 1240 1245
Met Val Ile Asp Lys Ile Lys Lys Phe Asp Asp Val Asp Glu Lys
1250 1255 1260
Thr Lys Ile Asp Leu Lys Ile Pro Arg Thr Asp Phe Leu Asn Tyr
1265 1270 1275
Val Val Lys Arg Ile Asn Arg
1280 1285
<210> 2
<211> 1297
<212> PRT
<213> Artificial Sequence
<220>
<223> Cas-sf1
<400> 2
Met Ala Thr Leu Val Ser Phe Thr Lys Gln Tyr Gln Val Gln Lys Thr
1 5 10 15
Leu Arg Phe Glu Leu Ile Pro Gln Gly Lys Thr Gln Ala Asn Ile Asp
20 25 30
Ala Lys Gly Phe Ile Asn Asp Asp Leu Lys Arg Asp Glu Asn Tyr Met
35 40 45
Lys Val Lys Gly Val Ile Asp Glu Leu His Lys Asn Phe Ile Glu Gln
50 55 60
Thr Leu Val Asn Val Asp Tyr Asp Trp Arg Ser Leu Ala Thr Ala Ile
65 70 75 80
Lys Asn Tyr Arg Lys Asp Arg Ser Asp Thr Asn Lys Lys Asn Leu Glu
85 90 95
Lys Thr Gln Glu Ala Ala Arg Lys Glu Ile Ile Ala Trp Phe Glu Gly
100 105 110
Lys Arg Gly Asn Ser Ala Phe Lys Asn Asn Gln Lys Ser Phe Tyr Gly
115 120 125
Lys Leu Phe Lys Lys Glu Leu Phe Ser Glu Ile Leu Arg Ser Asp Asp
130 135 140
Leu Glu Tyr Asp Glu Glu Thr Gln Asp Ala Ile Ala Cys Phe Asp Lys
145 150 155 160
Phe Thr Thr Tyr Phe Val Gly Phe His Glu Asn Arg Lys Asn Met Tyr
165 170 175
Ser Thr Glu Ala Lys Ser Thr Ser Val Ala Tyr Arg Val Val Asn Glu
180 185 190
Asn Phe Ser Lys Phe Leu Ser Asn Cys Glu Ala Phe Ser Val Leu Glu
195 200 205
Ala Val Cys Pro Asn Val Leu Val Glu Ala Glu Gln Glu Leu His Leu
210 215 220
His Lys Ala Phe Ser Asp Leu Lys Leu Ser Asp Val Phe Lys Val Glu
225 230 235 240
Ala Tyr Asn Lys Tyr Leu Ser Gln Thr Gly Ile Asp Tyr Tyr Asn Gln
245 250 255
Ile Ile Gly Gly Ile Ser Ser Ala Glu Gly Val Arg Lys Ile Arg Gly
260 265 270
Val Asn Glu Val Val Asn Asn Ala Ile Gln Gln Asn Asp Glu Leu Lys
275 280 285
Val Ala Leu Arg Asn Lys Gln Phe Thr Met Val Gln Leu Phe Lys Gln
290 295 300
Ile Leu Ser Asp Arg Ser Thr Leu Ser Phe Val Ser Glu Gln Phe Thr
305 310 315 320
Ser Asp Gln Glu Val Ile Thr Val Val Lys Gln Phe Asn Asp Asp Ile
325 330 335
Val Asn Asn Lys Val Leu Ala Val Val Lys Thr Leu Phe Glu Asn Phe
340 345 350
Asn Ser Tyr Asp Leu Glu Lys Ile Tyr Ile Asn Ser Lys Glu Leu Ala
355 360 365
Ser Val Ser Asn Ala Leu Leu Lys Asp Trp Ser Lys Ile Arg Asn Ala
370 375 380
Val Leu Glu Asn Lys Ile Ile Glu Leu Gly Ala Asn Pro Pro Lys Thr
385 390 395 400
Lys Ile Ser Ala Val Glu Lys Glu Val Lys Asn Lys Asp Phe Ser Ile
405 410 415
Ala Glu Leu Ala Ser Tyr Asn Asp Lys Tyr Leu Asp Lys Glu Gly Asn
420 425 430
Asp Lys Glu Ile Cys Ser Ile Ala Asn Val Val Leu Glu Ala Val Gly
435 440 445
Ala Leu Glu Ile Met Leu Ala Glu Ser Leu Pro Ala Asp Leu Lys Thr
450 455 460
Leu Glu Asn Lys Asn Lys Val Lys Gly Ile Leu Asp Ala Tyr Glu Asn
465 470 475 480
Leu Leu His Leu Leu Asn Tyr Phe Lys Val Ser Ala Val Asn Asp Val
485 490 495
Asp Leu Ala Phe Tyr Gly Ala Phe Glu Lys Val Tyr Val Asp Ile Ser
500 505 510
Gly Val Met Pro Leu Tyr Asn Lys Val Arg Asn Tyr Ala Thr Lys Lys
515 520 525
Pro Tyr Ser Val Glu Lys Phe Lys Leu Asn Phe Ala Met Pro Thr Leu
530 535 540
Ala Asp Gly Trp Asp Lys Asn Lys Glu Arg Asp Asn Gly Ser Ile Ile
545 550 555 560
Leu Leu Lys Asp Gly Gln Tyr Tyr Leu Gly Val Met Asn Pro Gln Asn
565 570 575
Lys Pro Val Ile Asp Asn Ala Val Cys Asn Asp Ala Lys Gly Tyr Gln
580 585 590
Lys Met Val Tyr Lys Met Phe Pro Glu Ile Ser Lys Met Val Thr Lys
595 600 605
Cys Ser Thr Gln Leu Asn Ala Val Lys Ala His Phe Glu Asp Asn Thr
610 615 620
Asn Asp Phe Val Leu Asp Asp Thr Asp Lys Phe Ile Ser Asp Leu Thr
625 630 635 640
Ile Thr Lys Glu Ile Tyr Asp Leu Asn Asn Val Leu Tyr Asp Gly Lys
645 650 655
Lys Lys Phe Gln Ile Asp Tyr Leu Arg Asn Thr Gly Asp Phe Ala Gly
660 665 670
Tyr His Lys Ala Leu Glu Thr Trp Ile Asp Phe Val Lys Glu Phe Leu
675 680 685
Ser Lys Tyr Arg Ser Thr Ala Ile Tyr Asp Leu Thr Thr Leu Leu Pro
690 695 700
Thr Asn Tyr Tyr Glu Lys Leu Asp Val Phe Tyr Ser Asp Val Asn Asn
705 710 715 720
Leu Cys Tyr Lys Ile Asp Tyr Glu Asn Ile Ser Val Glu Gln Val Asn
725 730 735
Glu Trp Val Glu Glu Gly Asn Leu Tyr Leu Phe Lys Ile Tyr Asn Lys
740 745 750
Asp Phe Ala Thr Gly Ser Thr Gly Lys Pro Asn Leu His Thr Met Tyr
755 760 765
Trp Asn Ala Val Phe Ala Glu Glu Asn Leu His Asp Val Val Val Lys
770 775 780
Leu Asn Gly Gly Ala Glu Leu Phe Tyr Arg Pro Lys Ser Asn Met Pro
785 790 795 800
Lys Val Glu His Arg Val Gly Glu Lys Leu Val Asn Arg Lys Asn Val
805 810 815
Asn Gly Glu Pro Ile Ala Asp Ser Val His Lys Glu Ile Tyr Ala Tyr
820 825 830
Ala Asn Gly Lys Ile Ser Lys Ser Glu Leu Ser Glu Asn Ala Gln Glu
835 840 845
Glu Leu Pro Leu Ala Ile Ile Lys Asp Val Lys His Asn Ile Thr Lys
850 855 860
Asp Lys Arg Tyr Leu Ser Asp Lys Tyr Phe Phe His Val Pro Ile Thr
865 870 875 880
Leu Asn Tyr Lys Ala Asn Gly Asn Pro Ser Ala Phe Asn Thr Lys Val
885 890 895
Gln Ala Phe Leu Lys Asn Asn Pro Asp Val Asn Ile Ile Gly Ile Asp
900 905 910
Arg Gly Glu Arg Asn Leu Leu Tyr Val Val Val Ile Asp Gln Gln Gly
915 920 925
Asn Ile Ile Asp Lys Lys Gln Val Ser Tyr Asn Lys Val Asn Gly Tyr
930 935 940
Asp Tyr Tyr Glu Lys Leu Asn Gln Arg Glu Lys Glu Arg Ile Glu Ala
945 950 955 960
Arg Gln Ser Trp Gly Ala Val Gly Lys Ile Lys Glu Leu Lys Glu Gly
965 970 975
Tyr Leu Ser Leu Val Val Arg Glu Ile Ala Asp Met Met Val Lys Tyr
980 985 990
Asn Ala Ile Val Val Met Glu Asn Leu Asn Ala Gly Phe Lys Arg Val
995 1000 1005
Arg Gly Gly Ile Ala Glu Lys Ala Val Tyr Gln Lys Phe Glu Lys
1010 1015 1020
Met Leu Ile Asp Lys Leu Asn Tyr Leu Val Phe Lys Asp Val Glu
1025 1030 1035
Ala Lys Glu Ala Gly Gly Val Leu Asn Ala Tyr Gln Leu Thr Asp
1040 1045 1050
Lys Phe Asp Ser Phe Glu Lys Met Gly Asn Gln Ser Gly Phe Leu
1055 1060 1065
Phe Tyr Val Pro Ala Ala Tyr Thr Ser Lys Ile Asp Pro Val Thr
1070 1075 1080
Gly Phe Ala Asn Val Phe Ser Thr Lys His Ile Thr Asn Thr Glu
1085 1090 1095
Ala Lys Lys Glu Phe Ile Cys Ser Phe Asn Ser Leu Arg Tyr Asp
1100 1105 1110
Glu Ala Lys Asp Lys Phe Val Leu Glu Cys Asp Leu Asn Lys Phe
1115 1120 1125
Lys Ile Val Ala Asn Ser His Ile Lys Asn Trp Lys Phe Ile Ile
1130 1135 1140
Gly Gly Lys Arg Ile Val Tyr Asn Ser Lys Asn Lys Thr Tyr Met
1145 1150 1155
Glu Lys Tyr Pro Cys Glu Asp Leu Lys Ala Thr Leu Asn Ala Ser
1160 1165 1170
Gly Ile Asp Phe Ser Ser Ser Glu Ile Ile Asn Leu Leu Lys Asn
1175 1180 1185
Val Pro Ala Asn Arg Glu Tyr Gly Lys Leu Phe Asp Glu Thr Tyr
1190 1195 1200
Trp Ala Ile Met Asn Thr Leu Gln Met Arg Asn Ser Asn Ala Leu
1205 1210 1215
Thr Gly Glu Asp Tyr Ile Ile Ser Ala Val Ala Asp Asp Asn Glu
1220 1225 1230
Lys Val Phe Asp Ser Arg Thr Cys Gly Ala Glu Leu Pro Lys Asp
1235 1240 1245
Ala Asp Ala Asn Gly Ala Tyr His Ile Ala Leu Lys Gly Leu Tyr
1250 1255 1260
Leu Leu Gln Arg Ile Asp Ile Ser Glu Glu Gly Glu Lys Val Asp
1265 1270 1275
Leu Ser Ile Lys Asn Glu Glu Trp Phe Lys Phe Val Gln Gln Lys
1280 1285 1290
Glu Tyr Ala Arg
1295
<210> 3
<211> 1288
<212> PRT
<213> Artificial Sequence
<220>
<223> Cas-sf3
<400> 3
Met Tyr Ser Leu Ile Asn Tyr Phe Thr Thr Phe Thr Gly Asn Phe Ile
1 5 10 15
Asn Asn Leu Phe Thr Leu Thr Glu Tyr Ile Met Lys Thr Phe Gln Gln
20 25 30
Phe Ser Arg Val Tyr Pro Leu Ser Lys Thr Leu Arg Phe Glu Leu Lys
35 40 45
Pro Ile Gly Ser Thr Leu Glu His Ile Asn Lys Asn Gly Leu Leu Asp
50 55 60
Gln Asp Gln His Arg Ala Lys Ser Tyr Ile Gln Met Lys Asn Ile Ile
65 70 75 80
Asp Glu Tyr His Lys Glu Phe Ile Glu Asp Val Leu Asp Asp Leu Glu
85 90 95
Leu Gln Tyr Asp Asn Glu Gly Arg Asn Asn Ser Ile Ser Glu Phe Tyr
100 105 110
Thr Cys Tyr Met Ile Lys Ser Lys Asp Asp Asn Gln Arg Lys Leu Tyr
115 120 125
Glu Lys Ile Gln Glu Glu Leu Arg Lys Gln Ile Ala Asn Ala Phe Asn
130 135 140
Lys Ser Asp Ile Tyr Lys Arg Ile Phe Ser Glu Lys Leu Ile Lys Glu
145 150 155 160
Asp Leu Lys Asn Phe Ile Thr Asn Gln Lys Asp Asn Asp Lys Arg Glu
165 170 175
Gln Asp Ile Gln Ile Ile Glu Glu Phe Lys Asn Phe Thr Thr Tyr Phe
180 185 190
Thr Gly Phe His Glu Asn Arg Lys Asn Met Tyr Thr Ser Glu Ala Gln
195 200 205
Ser Thr Ala Ile Ala Tyr Arg Leu Ile His Glu Asn Leu Pro Lys Phe
210 215 220
Ile Asp Asn Ile Met Val Phe Asp Lys Val Ala Ala Ser Pro Ile Ala
225 230 235 240
Asp Ser Phe Ser Glu Leu Tyr Thr Asn Phe Glu Glu Cys Leu Asn Val
245 250 255
Met Ser Ile Glu Glu Met Phe Lys Leu Asn Tyr Phe Asn Val Val Leu
260 265 270
Thr Gln Lys Gln Ile Asp Val Tyr Asn Ala Ile Ile Gly Gly Lys Thr
275 280 285
Ile Asp Asn Thr Asn Ile Lys Ile Lys Gly Leu Asn Glu Tyr Ile Asn
290 295 300
Leu Tyr Asn Gln Gln Gln Lys Asp Lys Ser Ala Arg Leu Pro Lys Leu
305 310 315 320
Lys Pro Leu Tyr Lys Gln Ile Leu Ser Asp Arg Asn Ala Ile Ser Trp
325 330 335
Leu Pro Glu Gln Phe Glu Ser Asp Asp Lys Leu Leu Glu Ala Ile Gln
340 345 350
Lys Ala Tyr Gln Glu Leu Asp Glu Gln Val Leu Asn Arg Lys Ile Glu
355 360 365
Gly Glu His Ser Leu Arg Glu Leu Leu Val Gly Leu Ala Asp Tyr Asp
370 375 380
Leu Ser Lys Ile Tyr Ile Arg Asn Asp Leu Gln Leu Thr Asp Ile Ser
385 390 395 400
Gln Lys Val Phe Gly His Trp Gly Val Ile Ser Lys Ala Leu Leu Glu
405 410 415
Glu Leu Lys Asn Glu Val Pro Lys Lys Ser Lys Lys Glu Ser Asp Glu
420 425 430
Ala Tyr Glu Asp Arg Leu Asn Lys Val Ile Lys Ser Gln Gly Ser Ile
435 440 445
Ser Ile Ala Phe Ile Asn Asp Cys Ile Asn Lys Gln Leu Pro Glu Lys
450 455 460
Gln Lys Thr Ile Gln Gly Tyr Phe Ala Glu Leu Gly Ala Val Asn Asn
465 470 475 480
Glu Thr Ile Gln Lys Glu Asn Leu Phe Ala Gln Ile Glu Asn Ala Tyr
485 490 495
Thr Glu Val Lys Asp Leu Leu Asn Thr Pro Tyr Thr Gly Lys Asn Leu
500 505 510
Ala Gln Asp Lys Val Asn Val Glu Lys Ile Lys Asn Leu Leu Asp Ala
515 520 525
Ile Lys Ala Leu Gln His Phe Ile Lys Pro Leu Leu Gly Asp Gly Thr
530 535 540
Glu Pro Glu Lys Asp Glu Lys Phe Tyr Gly Glu Phe Ala Ala Leu Trp
545 550 555 560
Glu Glu Leu Asp Lys Ile Thr Pro Leu Tyr Asn Met Val Arg Asn Tyr
565 570 575
Met Thr Arg Lys Pro Tyr Ser Thr Glu Lys Ile Lys Leu Asn Phe Glu
580 585 590
Asn Ser Thr Leu Met Asp Gly Trp Asp Leu Asn Lys Glu Gln Ala Asn
595 600 605
Thr Thr Val Ile Leu Arg Lys Asp Gly Leu Tyr Tyr Leu Ala Ile Met
610 615 620
Asn Lys Lys His Asn Arg Val Phe Asp Val Lys Ala Met Pro Asp Asp
625 630 635 640
Gly Asp Cys Tyr Glu Lys Met Glu Tyr Lys Leu Leu Pro Gly Ala Asn
645 650 655
Lys Met Leu Pro Lys Val Phe Phe Ser Lys Ser Arg Ile Gln Glu Phe
660 665 670
Ala Pro Ser Ser Gln Leu Leu Glu Asn Tyr His Asn Asp Thr His Lys
675 680 685
Lys Gly Val Thr Phe Asn Ile Lys Asp Cys His Ala Leu Ile Asp Phe
690 695 700
Phe Lys Ala Ser Ile Asn Lys His Glu Asp Trp Cys Lys Phe Gly Phe
705 710 715 720
Arg Phe Ser Pro Thr Glu Thr Tyr Glu Asp Leu Ser Gly Phe Tyr Arg
725 730 735
Glu Val Glu Gln Gln Gly Tyr Lys Ile Ser Phe Arg Asn Val Ser Val
740 745 750
Asp Tyr Ile His Ser Leu Val Glu Glu Gly Lys Ile Phe Leu Phe Gln
755 760 765
Ile Tyr Asn Lys Asp Phe Ser Pro Tyr Ser Lys Gly Thr Pro Asn Leu
770 775 780
His Thr Leu Tyr Trp Lys Met Leu Phe Asp Glu Lys Asn Leu Ala Asp
785 790 795 800
Val Val Tyr Lys Leu Asn Gly Gln Ala Glu Val Phe Phe Arg Lys Ser
805 810 815
Ser Ile Asn Tyr Glu Gln Pro Thr His Pro Ala Asn Lys Ala Ile Asp
820 825 830
Asn Lys Asn Glu Leu Asn Lys Lys Lys Gln Ser Leu Phe Thr Tyr Asp
835 840 845
Leu Ile Lys Asp Lys Arg Tyr Thr Ile Asp Lys Phe Gln Phe His Val
850 855 860
Pro Ile Thr Met Asn Phe Lys Ser Thr Gly Asn Asp Asn Ile Asn Gln
865 870 875 880
Ser Val Asn Glu Tyr Ile Gln Gln Ser Asp Asp Leu His Ile Ile Gly
885 890 895
Ile Asp Arg Gly Glu Arg His Leu Leu Tyr Leu Thr Val Ile Asn Leu
900 905 910
Lys Gly Glu Ile Lys Glu Gln Tyr Ser Leu Asn Glu Ile Val Asn Thr
915 920 925
Tyr Lys Gly Asn Glu Tyr Arg Thr Asp Tyr His Asp Leu Leu Ser Lys
930 935 940
Arg Glu Asp Glu Arg Met Lys Ala Arg Gln Ser Trp Gln Thr Ile Glu
945 950 955 960
Asn Ile Lys Glu Leu Lys Glu Gly Tyr Leu Ser Gln Val Val His Lys
965 970 975
Ile Ala Glu Leu Met Ile Lys Tyr Asn Ala Ile Val Val Leu Glu Asp
980 985 990
Leu Asn Ala Gly Phe Met Arg Gly Arg Gln Lys Val Glu Ser Ser Val
995 1000 1005
Tyr Gln Lys Phe Glu Lys Met Leu Ile Asp Lys Leu Asn Tyr Leu
1010 1015 1020
Ala Asp Lys Lys Lys Gln Pro Glu Glu Pro Gly Gly Ile Leu Asn
1025 1030 1035
Ala Tyr Gln Leu Thr Asn Lys Phe Val Ser Phe Gln Lys Met Gly
1040 1045 1050
Lys Gln Cys Gly Phe Leu Phe Tyr Thr Gln Ala Trp Asn Thr Ser
1055 1060 1065
Lys Ile Asp Pro Val Thr Gly Phe Val Asn Leu Phe Asp Thr Arg
1070 1075 1080
Tyr Glu Thr Arg Glu Lys Ala Lys Thr Phe Phe Gly Lys Phe Asp
1085 1090 1095
Ser Ile Arg Tyr Asn Asp Glu Lys Asp Trp Phe Glu Phe Ala Phe
1100 1105 1110
Asp Tyr Thr Asn Phe Thr Ser Lys Ala Asp Gly Ser Arg Thr Asn
1115 1120 1125
Trp Lys Leu Cys Thr Tyr Gly Lys Arg Ile Glu Thr Phe Arg Asp
1130 1135 1140
Glu Lys Gln Asn Ser Asn Trp Thr Ser Lys Glu Val Val Leu Thr
1145 1150 1155
Asp Lys Phe Lys Glu Phe Phe Lys Glu Ser Asn Ile Asp Ile His
1160 1165 1170
Ser Asn Leu Lys Glu Ala Ile Met Gln Gln Asp Ser Ala Asp Phe
1175 1180 1185
Phe Lys Lys Leu Leu Tyr Leu Leu Lys Leu Thr Leu Gln Met Arg
1190 1195 1200
Asn Ser Glu Thr Gly Thr Asn Val Asp Tyr Met Gln Ser Pro Val
1205 1210 1215
Ala Asp Glu Glu Gly Asn Phe Tyr Asn Ser Asp Thr Cys Asp Ser
1220 1225 1230
Ser Leu Pro Lys Asn Ala Asp Ala Asn Gly Ala Tyr Asn Ile Ala
1235 1240 1245
Arg Lys Gly Leu Trp Ile Val Gln Gln Ile Lys Thr Ser Asp Asp
1250 1255 1260
Leu Arg Asn Leu Lys Leu Ala Ile Thr Asn Lys Glu Trp Leu Gln
1265 1270 1275
Phe Ala Gln Arg Lys Pro Tyr Leu Asp Glu
1280 1285
<210> 4
<211> 1283
<212> PRT
<213> Artificial Sequence
<220>
<223> Cas-sf6
<400> 4
Met Ser Asn Met Gln Gln Tyr Asp Asn Phe Ile Asn His Tyr Ala Ile
1 5 10 15
Gln Lys Thr Leu Arg Phe Glu Leu Gln Pro Ile Gly Lys Thr Arg Glu
20 25 30
His Ile Gln Lys Asn Gly Ile Ile Glu His Asp Glu Ala Leu Glu Gln
35 40 45
Lys Tyr Gln Ile Val Lys Lys Ile Ile Asp Arg Phe His Arg Lys His
50 55 60
Ile Asp Glu Ala Leu Ser Leu Ala Asp Phe Ser Lys Asp Thr Ala Met
65 70 75 80
Leu Lys Arg Phe Glu Glu Leu Tyr Trp Lys Lys Asn Lys Asn Glu Asn
85 90 95
Glu Lys Asn Glu Phe Val Lys Ile Gln Ser Asp Leu Arg Lys Arg Val
100 105 110
Val Ser Phe Leu Glu Gly Lys Val Glu Gly Asp Ala Arg Phe Ala Lys
115 120 125
Val Gln Gln Arg Tyr Gly Ile Leu Phe Asp Ala Lys Ile Phe Lys Asp
130 135 140
Lys Glu Phe Ile Ser Thr Ala Cys Asp Asp Ile Glu Lys Asp Ala Ile
145 150 155 160
Glu Ala Phe Lys Arg Phe Ala Thr Tyr Phe Thr Gly Phe His Glu Asn
165 170 175
Arg Lys Asn Met Tyr Ser Ala Asp Glu Glu Ser Thr Ala Ile Ala Tyr
180 185 190
Arg Val Ile Asn Glu Asn Leu Pro Arg Phe Leu Glu Asn Lys Ala Arg
195 200 205
Phe Glu Lys Ile Gln His Thr Val Asp Ser Lys Thr Leu Asn Glu Ile
210 215 220
Ala Thr Glu Leu Lys Pro Val Leu Glu Lys Asn Lys Leu Glu Thr Ile
225 230 235 240
Phe Thr Leu Asn Tyr Phe Gln Asn Thr Leu Ser Gln Ala Gly Ile Thr
245 250 255
Tyr Tyr Asn Thr Ile Leu Gly Gly Lys Thr Lys Glu Asn Gly Glu Lys
260 265 270
Val Gln Gly Leu Asn Glu Ile Ile Asn Leu Phe Asn Gln Lys Asn Lys
275 280 285
Asp Thr Met Leu Pro Leu Leu Lys Pro Leu Tyr Lys Gln Ile Leu Ser
290 295 300
Glu Glu Tyr Ser Thr Ser Phe Thr Ile Ser Ala Phe Glu Lys Asp Asn
305 310 315 320
Asp Val Leu Gln Ala Ile Gly Ser Phe Cys Asn Asp Cys Ile Phe Tyr
325 330 335
Ala Lys Asn Asn Val Asn Gly Lys Ala Tyr Asn Leu Leu Gln Thr Val
340 345 350
Gln Ala Phe Cys Asn Ser Ile Asp Thr Tyr Asn Asp Asn Arg Leu Asp
355 360 365
Gly Leu His Ile Glu Arg Lys Asn Leu Ala Thr Leu Ser His Gln Val
370 375 380
Tyr Gly Glu Trp Asn Ile Leu Arg Asp Ala Leu Gln Ile His Tyr Glu
385 390 395 400
Ala Tyr Glu Gln Lys Asp Asn Gly Asn Asn Asn Asn Tyr Leu Glu Ser
405 410 415
Lys Thr Phe Ser Trp Lys Ala Leu Lys Asp Ala Leu Thr Thr Tyr Lys
420 425 430
Ser Leu Val Glu Glu Ala Gln Asp Ile Asp Glu Asn Gly Phe Ile Ala
435 440 445
Tyr Phe Lys Asp Met Lys Phe Lys Glu Glu Ile Asp Gly Lys Thr Thr
450 455 460
Ser Ile Asp Leu Ile Glu Asn Ile Gln Thr Arg Tyr Lys Ser Ile Glu
465 470 475 480
Thr Ile Leu Gln Glu Asp Arg Asn Asn Lys Asn Asn Leu His Gln Glu
485 490 495
Lys Glu Lys Val Ala Thr Ile Lys Gly Phe Leu Asp Ser Val Lys Tyr
500 505 510
Leu Gln Trp Phe Leu Asn Leu Met Tyr Ile Ala Ser Pro Val Asp Asp
515 520 525
Lys Asp Tyr Asp Phe Tyr Asn Glu Leu Glu Met Tyr His Asp Thr Leu
530 535 540
Leu Pro Leu Thr Thr Leu Tyr Asn Lys Val Arg Asn Tyr Met Thr Arg
545 550 555 560
Lys Pro Tyr Ser Val Glu Lys Phe Lys Leu Thr Phe Glu Lys Ser Thr
565 570 575
Leu Leu Asp Gly Trp Asp Lys Asn Lys Glu Arg Ala Asn Leu Gly Val
580 585 590
Ile Leu Arg Lys Gly Asn Asn Tyr Tyr Leu Gly Ile Met Asn Lys Lys
595 600 605
Tyr Asn Asp Ile Phe Asp Ser Ile Pro Gly Leu Thr Thr Thr Asp Tyr
610 615 620
Cys Glu Lys Met Asn Tyr Lys Leu Leu Pro Gly Pro Asn Lys Met Leu
625 630 635 640
Pro Lys Val Phe Phe Ser Lys Lys Gly Val Gln Phe Tyr Lys Pro Ser
645 650 655
Gln Glu Ile Ile Arg Leu Tyr Asn Asn Lys Glu Phe Lys Lys Gly Asp
660 665 670
Thr Phe Asn Lys Asn Ser Leu His Lys Leu Ile Asn Phe Tyr Lys Glu
675 680 685
Ser Ile Ala Lys Thr Glu Asp Trp Ser Val Phe Gln Phe Lys Phe Lys
690 695 700
Asn Thr Asn Asp Tyr Ala Asp Ile Ser Gln Phe Tyr Lys Asp Val Glu
705 710 715 720
Arg Gln Gly Tyr Lys Ile Ser Phe Asp Lys Ile Asp Trp Glu Tyr Ile
725 730 735
Leu Leu Leu Val Asp Glu Gly Lys Leu Phe Leu Phe Lys Ile Tyr Asn
740 745 750
Lys Asp Phe Ser Pro Tyr Ser Lys Gly Lys Pro Asn Leu His Thr Ile
755 760 765
Tyr Trp Lys Asn Ile Phe Ser His Asp Asn Leu Asn Asn Val Val Tyr
770 775 780
Lys Leu Asn Gly Glu Ala Glu Val Phe Tyr Arg Lys Lys Ser Ile Glu
785 790 795 800
Tyr Pro Glu Glu Ile Leu Gln Lys Gly His His Val Asn Glu Leu Lys
805 810 815
Asp Lys Phe Lys Tyr Pro Ile Ile Lys Asp Lys Arg Tyr Ala Glu Asp
820 825 830
Lys Phe Leu Phe His Val Pro Ile Thr Met Asn Phe Leu Ser Lys Gly
835 840 845
Glu Pro Asn Ile Asn Gln Arg Val Gln Gln Tyr Ile Ala Ser Thr Ser
850 855 860
Glu Asp Tyr His Ile Ile Gly Ile Asp Arg Gly Glu Arg Asn Leu Leu
865 870 875 880
Tyr Leu Ser Leu Ile Asp Ala Thr Gly Lys Ile Ile Lys Gln Leu Ser
885 890 895
Leu Asn Thr Ile Lys Asn Glu Asn Phe Asn Thr Thr Ile Asp Tyr His
900 905 910
Ala Lys Leu Asp Glu Lys Glu Lys Lys Arg Glu Glu Ala Arg Lys Asn
915 920 925
Trp Asp Val Ile Glu Asn Ile Lys Glu Leu Lys Glu Gly Tyr Leu Ser
930 935 940
Gln Val Val His Gln Ile Ala Lys Leu Met Val Glu Tyr Lys Ala Ile
945 950 955 960
Leu Val Met Glu Asp Leu Asn Thr Gly Phe Lys Arg Gly Arg Phe Lys
965 970 975
Val Glu Lys Gln Val Tyr Gln Lys Phe Glu Lys Met Met Ile Asp Lys
980 985 990
Leu Asn Tyr Leu Val Leu Lys Asp Arg Gln Ala Thr Gln Pro Gly Gly
995 1000 1005
Ser Leu Lys Ala Tyr Gln Leu Ala Ser Ser Leu Glu Ser Phe Lys
1010 1015 1020
Lys Leu Gly Lys Gln Cys Gly Met Ile Phe Tyr Val Pro Ala Val
1025 1030 1035
Tyr Thr Ser Lys Ile Asp Pro Thr Thr Gly Phe Tyr Asn Phe Leu
1040 1045 1050
Arg Val Asp Val Ser Thr Leu Asn Ser Ala His Ser Phe Phe Asn
1055 1060 1065
Arg Phe Asn Ala Ile Val Tyr Asn Asn Glu Gln Asp Tyr Phe Glu
1070 1075 1080
Phe His Cys Thr Tyr Lys Asn Phe Val Ser Glu Pro Ser Leu Gln
1085 1090 1095
Lys Asn Val Lys Ser Ser Lys Met His Glu Tyr Asn Asn Leu Lys
1100 1105 1110
Asp Thr Thr Trp Val Leu Cys Ser Thr His His Glu Arg Tyr Lys
1115 1120 1125
Lys Phe Lys Asn Lys Ser Gly Tyr Phe Glu Tyr Lys Pro Val Asn
1130 1135 1140
Val Thr Gln Ser Leu Lys Gln Leu Phe Asp Glu Ala Gly Ile Asp
1145 1150 1155
Tyr Gln Ala Gly Ala Asp Leu Lys Glu Ala Ile Val Thr Gly Lys
1160 1165 1170
Asn Thr Lys Leu Leu Lys Gly Leu Gly Glu Gln Leu Asn Ile Leu
1175 1180 1185
Leu Ala Met Arg Tyr Asn Asn Gly Lys His Gly Asn Glu Glu Lys
1190 1195 1200
Asp Tyr Ile Val Ser Pro Val Lys Asn Asn Tyr Gly Lys Phe Phe
1205 1210 1215
Cys Thr Leu Asp Gly Asp Ala Ser Leu Pro Val Asp Ala Asp Ala
1220 1225 1230
Asn Gly Ala Tyr Ala Ile Ala Leu Lys Gly Leu Met Leu Val Glu
1235 1240 1245
Arg Met Lys Ser Asn Lys Asp Ile Lys Gly Arg Ile Asp Tyr Phe
1250 1255 1260
Ile Ser Asn Asn Glu Trp Phe Asn Tyr Leu Ile Ala Lys Asn Thr
1265 1270 1275
Leu Asn Lys Ser Lys
1280
<210> 5
<211> 1275
<212> PRT
<213> Artificial Sequence
<220>
<223> Cas-sf8
<400> 5
Met Arg Lys Ser Phe Lys Asp Phe Thr Asn Met Tyr Pro Val Gln Lys
1 5 10 15
Thr Leu Arg Phe Glu Leu Lys Pro Leu Gly Lys Thr Glu Gln His Ile
20 25 30
Lys Glu Ser Phe Ile Ile Glu His Asp Glu Gln Arg Ser Asn Asp Tyr
35 40 45
Lys Ala Ala Lys Lys Ile Ile Asp Asp Tyr His Arg Leu Phe Ile Gln
50 55 60
Lys Thr Leu Ser Gln Thr Asp Leu Asp Trp Lys Asp Leu Lys Glu Ala
65 70 75 80
Leu Glu Tyr Asp Gly Glu Asp Lys Asp Lys Arg Leu Glu Thr Val Gln
85 90 95
Lys Asp Lys Arg Ser Lys Ile Ile Cys Arg Phe Thr Glu Gln Pro Glu
100 105 110
Phe Lys Lys Leu Phe Gly Lys Glu Leu Phe Ser Glu Leu Leu Pro Glu
115 120 125
Met Ile Asn Ala Glu Asn Ala Asp Asn Lys Asp Glu Lys Leu His Ala
130 135 140
Ala Ala Ala Phe Asp Lys Phe Ser Thr Tyr Phe Lys Gly Phe His Asp
145 150 155 160
Asn Arg Arg Asn Ile Tyr Ser Asn Glu Glu Ile Ser Thr Ser Val Ala
165 170 175
Tyr Arg Ile Val His Gln Asn Phe Pro Lys Phe Leu Ala Asn Ala Glu
180 185 190
Thr Phe Lys Thr Ile Cys Lys Lys Ala Pro Glu Ile Ile Glu Gln Thr
195 200 205
Gln Lys Glu Leu Ser Lys Ile Leu Gly Lys His Lys Leu Glu Asp Ile
210 215 220
Phe Arg Ile Glu Ser Phe Asn Asn Val Met Thr Gln Asp Gly Ile Asp
225 230 235 240
Tyr Tyr Asn Asn Ile Ile Asp Gly Val Pro Cys Glu Ala Gly Lys Lys
245 250 255
Lys Leu Arg Gly Val Asn Glu Phe Ala Ser Ile Tyr Arg Gln Gln His
260 265 270
Pro Asp Thr Lys Ile Gln Ile Lys Met Val Pro Leu Tyr Lys Gln Ile
275 280 285
Leu Ser Asp Arg Ala Thr Leu Ser Phe Met Pro Ala Ala Leu Asp Asn
290 295 300
Asp Gly Asp Ala Phe Glu Ala Val Ala Gly Leu Glu Lys Met Leu Asn
305 310 315 320
Glu Pro Asp Ala Glu Thr Lys Thr Ser Val Leu Gln Gln Ile Ser Ala
325 330 335
Leu Phe Ala Lys Pro Ser Asp Tyr Ser Gln Glu Arg Val Trp Ile Asn
340 345 350
Gln Lys Ser Val Pro Val Val Ser Ala Ala Leu Phe Gly Ser Trp Asp
355 360 365
Thr Leu Gly Ser Ala Leu Ala Ala Tyr Lys Glu Asn Glu Leu Gly Asp
370 375 380
Thr Arg Gly Lys Asp Lys Lys Val Glu Lys Trp Ile Lys Ser Lys Ala
385 390 395 400
Phe Ser Phe Ala Ser Leu Asp Ala Ala Ala Asp Phe Tyr Lys Asp Ser
405 410 415
Leu Pro Gly Glu Lys Ser Ala Arg Arg Ile Lys Asp Tyr Phe Ala Gly
420 425 430
Cys Arg Glu Leu Val Lys Asn Thr Ser Glu Lys Gln Lys Glu Phe Asp
435 440 445
Lys Ile Lys Asp Ser Ala Leu Phe Gly Asn Glu Thr Asn Thr Ser Ala
450 455 460
Val Lys Ala Tyr Leu Asp Ser Leu Asn Asp Ile Leu Arg Phe Met Arg
465 470 475 480
Pro Phe Glu Thr Glu Asp Ile Thr Asp Ile Asp Thr Glu Phe Tyr Ser
485 490 495
Ala Tyr Ser Val Leu Leu Glu Lys Ile Lys Met Val Ile Pro Val Tyr
500 505 510
Asn Thr Val Arg Asn Tyr Val Thr Lys Lys Pro Phe Lys Thr Asp Lys
515 520 525
Phe Lys Leu Asn Phe Glu Asn Pro Thr Leu Ala Tyr Gly Trp Asp Lys
530 535 540
Ser Lys Glu Gln Ala Asn Thr Ala Ile Leu Leu Met Lys Asp Asp Lys
545 550 555 560
Tyr Tyr Leu Gly Ile Met Asn Ala Lys His Lys Ile Lys Pro Ala Glu
565 570 575
Leu Ala Asp Asp His Asn Gly Asp Gly Tyr Lys Lys Met Gln Tyr Met
580 585 590
Gln Met Ser Gly Pro Thr Lys Asp Leu Pro Asn Leu Leu Val Ile Asp
595 600 605
Gly Lys Thr Val Arg Lys Thr Gly Ser Lys Asp Ala Asn Gly Val Asn
610 615 620
Arg Lys Gln Glu Gln Leu Lys Asn Thr Tyr Leu Pro Pro Asp Ile Asn
625 630 635 640
Glu Ile Arg Leu Asp Gly Ser Tyr Leu Glu Thr Ser Asn Asn Phe Ser
645 650 655
Lys Lys Asn Ser Gln Lys Tyr Leu Ala Tyr Tyr Met Lys Leu Leu Lys
660 665 670
Glu Tyr Lys Ser Asn Phe Asp Phe Asn Phe Lys Lys Ala Asn Glu Tyr
675 680 685
Glu Ser Tyr Tyr Asp Phe Thr Asn Asp Ile Lys Lys Gln Cys Tyr Ser
690 695 700
Leu Thr Phe Thr Asn Leu Ala Glu Asn Lys Val Asp Lys Trp Val Asp
705 710 715 720
Glu Gly Arg Leu Tyr Leu Phe Gln Ile Trp Asn Lys Asp Phe Ala Glu
725 730 735
Gly Val Ser Gly Arg Pro Asn Leu His Thr Leu Tyr Trp Lys Ala Leu
740 745 750
Phe Ser Pro Glu Asn Leu Lys Asn Val Val Tyr Lys Leu Asn Gly Lys
755 760 765
Ala Glu Leu Phe Phe Arg Arg Lys Ser Ile Asn Glu Pro Val Val His
770 775 780
Pro Thr Gly Ser Lys Lys Val Asn Arg Arg Asp Ile Asp Gly Ser Pro
785 790 795 800
Ile Asp Asp Glu Thr Phe Asn Glu Ile Tyr Leu Tyr Ala Asn Gly Lys
805 810 815
Arg Ala Leu Gly Ser Leu Gly Ala Ala Ala Arg Ala Leu Val Glu Ser
820 825 830
Lys Arg Val Arg Ile Thr Asp Val Lys His Glu Leu Val Lys Asp Lys
835 840 845
Arg Tyr Thr Gln Asp Lys Phe Phe Phe His Val Ser Leu Thr Ile Asn
850 855 860
Phe Lys Ala Ser Gly Lys Glu Asn Ile Asn Ser Asp Val Asn Leu Phe
865 870 875 880
Leu Lys Asn Asn Lys Asp Val Lys Ile Ile Gly Ile Asp Arg Gly Glu
885 890 895
Arg Asn Leu Ile Tyr Ile Ser Leu Ile Asp Arg Lys Gly Asn Ile Ile
900 905 910
Glu Gln Lys His Phe Asn Thr Val Gly Gly Met Asp Tyr His Ala Lys
915 920 925
Leu Asp Gln Arg Glu Lys Ala Arg Asp Glu Ala Arg Lys Ser Trp Lys
930 935 940
Thr Ile Gly Asn Ile Lys Glu Leu Lys Glu Gly Tyr Leu Ser Gln Val
945 950 955 960
Ile His Glu Ile Thr Lys Met Ala Val Glu Asn Asp Ala Ile Ile Ala
965 970 975
Met Glu Asp Leu Asn Val Gly Phe Lys Arg Gly Arg Phe Lys Val Glu
980 985 990
Lys Gln Val Tyr Gln Lys Phe Glu Glu Met Leu Ile Asn Lys Leu Asn
995 1000 1005
Tyr Leu Ser Phe Lys Asp Thr Gly Glu Asn Lys Gln Cys Gly Ile
1010 1015 1020
Arg Asn Gly Leu Gln Leu Ala Gly Lys Phe Thr Ser Phe Lys Lys
1025 1030 1035
Ile Gly Lys Gln Cys Gly Ile Ile Phe Tyr Val Pro Ala Gly Tyr
1040 1045 1050
Thr Ser Lys Ile Asp Pro Val Thr Gly Phe Val Ser Val Phe Asn
1055 1060 1065
Leu Ser Ala Val Thr Ser Gln Glu Lys Gln Lys Glu Phe Ile Asp
1070 1075 1080
Arg Leu Asp Ser Ile Arg Tyr Asp Lys Lys Leu Asp Met Phe Val
1085 1090 1095
Phe Ser Phe Asp Tyr Ser Glu Phe Lys Thr Tyr Gln Thr Leu Pro
1100 1105 1110
Val Thr Lys Trp Asp Val Tyr Thr Asn Gly Lys Arg Ile Ile Asn
1115 1120 1125
Lys Arg Glu Gly Ser Arg Trp Ile Pro Gln Asn Val Val Pro Thr
1130 1135 1140
Glu Glu Met Lys Arg Thr Leu Lys Gln Leu Gly Ile Glu Tyr Glu
1145 1150 1155
Ser Gly Arg Asp Ile Leu Pro Val Ile Met Glu Arg Asp Lys Lys
1160 1165 1170
Leu Ala Ser Asp Val Phe Tyr Ile Phe Lys Asn Thr Leu Gln Met
1175 1180 1185
Arg Asn Ser Asn Ala Ala Thr Gly Glu Asp Tyr Ile Ile Ser Pro
1190 1195 1200
Val Lys Gly Lys Lys Gly Val Phe Phe Ser Ser Ser Ala Lys Asp
1205 1210 1215
Lys Ser Leu Pro Gln Asp Ala Asp Ala Asn Gly Ala Tyr His Ile
1220 1225 1230
Ala Leu Lys Gly Ser Leu Val Leu Asp Ala Ile Asp Glu Lys Leu
1235 1240 1245
Lys Asp Asp Gly Lys Met Ser Tyr Lys Asp Met Tyr Ile Ser Asn
1250 1255 1260
Pro Asp Trp Phe Lys Phe Met Gln Thr Gly Lys His
1265 1270 1275
<210> 6
<211> 1273
<212> PRT
<213> Artificial Sequence
<220>
<223> Cas-sf9
<400> 6
Met Lys Glu Asn Phe Ile Gly Lys Tyr Gln Ile Thr Lys Thr Leu Arg
1 5 10 15
Phe Ser Leu Ile Pro Ile Gly Lys Thr Glu Glu Tyr Phe Asn Ala Arg
20 25 30
Cys Met Leu Glu Glu Asp Glu Gln Arg Ala Glu Asp Tyr Val Lys Val
35 40 45
Lys Ser Phe Ile Asp Glu Tyr His Lys Ala Phe Ile Glu Arg Ile Leu
50 55 60
Ser Asn Leu Ile Lys Gln Lys Ser Thr Ser Lys Gly Thr Glu Phe Ile
65 70 75 80
Glu Lys Val Arg Asp Tyr Ala Asp Leu Tyr Asn Ser Ser Gln Arg Asp
85 90 95
Asp Lys Lys Leu Asn Lys Ile Gly Glu Glu Leu Arg Lys Ser Ile Ser
100 105 110
Glu Ala Phe Thr Lys Asp Asp His Tyr Asp Arg Leu Phe Asn Lys Asp
115 120 125
Ile Ile Glu Glu Leu Leu Pro Glu Tyr Leu Gly Asp Ser Arg Lys Glu
130 135 140
Asp Thr Lys Ile Val Glu Asn Phe Val Gly Phe Lys Thr Tyr Phe Asn
145 150 155 160
Gly Phe Phe Glu Asn Arg Lys Asn Met Tyr Val Lys Glu Gln Glu Thr
165 170 175
Thr Ala Ile Ala Tyr Arg Cys Ile Asp Glu Asn Leu Pro Arg Phe Leu
180 185 190
Asp Asn Ala Thr Ile Trp Lys Lys Lys Leu Arg Asp Ala Leu Pro Glu
195 200 205
Glu Asp Ile Cys Arg Leu Asn Lys Glu Cys Thr Asp Phe His Asp Lys
210 215 220
Lys Val Glu Asp Ile Phe Asp Ile Asp Phe Phe Thr Gln Val Leu Ser
225 230 235 240
Gln Ser Gly Ile Asp Trp Tyr Asn Gln Ile Leu Gly Gly Tyr Thr Lys
245 250 255
Glu Gly Asn Ile Lys Ile Gln Gly Leu Asn Glu Tyr Ile Asn Thr Tyr
260 265 270
Asn Asp Lys Val Ser Glu Lys Glu Arg Ser His Arg Leu Pro Leu Leu
275 280 285
Lys Pro Leu Tyr Lys Gln Ile Leu Ser Asp Arg Val Ser Thr Ser Phe
290 295 300
Ile Pro Glu Lys Phe Thr Ser Asp Glu Glu Leu Leu Ser Ala Val His
305 310 315 320
Lys Leu Tyr Thr Val Lys Glu Asp Gly Arg Val Ser Leu Lys Glu Ala
325 330 335
Ile Ser Glu Ile Lys Glu Leu Phe Ala Glu Leu Ser Ile Phe Asn Leu
340 345 350
Ser Gly Ile Phe Val Ser Ala Lys Thr Gly Leu Ser Asp Val Ser Asn
355 360 365
Arg Val Phe Gly Tyr Trp Gly Ala Val Lys Glu Gly Trp Ile Asp Asn
370 375 380
Tyr His Glu Asn Asn Pro Leu Gly Lys Arg Glu Ser Ile Glu Leu Tyr
385 390 395 400
Glu Lys Lys Leu Asn Lys Glu Tyr Gly Asn Ile Pro Ser Phe Ser Ile
405 410 415
Glu Glu Ile Gln Gln Phe Gly Glu Gly Lys Ala Lys Glu Glu Tyr Arg
420 425 430
Asn Glu Thr Val Ile His Phe Tyr Ser Gly Thr Val Arg Lys Gln Ser
435 440 445
Asn Lys Ile Cys Asp Ser Tyr Lys Asp Ala Tyr Lys Arg Ile Lys Pro
450 455 460
Leu Leu Glu Ala Pro Asn Glu Ser Gly Asn Asp Leu Arg Ser Asn Lys
465 470 475 480
Glu Ala Ile Glu Leu Leu Lys Ile Phe Leu Asp Ser Val Lys Glu Leu
485 490 495
Glu Phe Leu Val Lys Pro Phe Arg Gly Glu Gly Asn Glu Thr Asp Lys
500 505 510
Asp Asn Asn Phe Tyr Asn Arg Phe Leu Val Ala Phe Asp Thr Phe Thr
515 520 525
Asp Phe Asp Phe Leu Tyr Asp Lys Val Arg Asn Tyr Ile Thr Gln Lys
530 535 540
Pro Phe Ser Thr Glu Lys Ile Lys Leu Asn Phe Asn Asn Pro Gln Phe
545 550 555 560
Leu Gly Gly Trp His Glu Asn Lys Glu Ser Ser Tyr Ser Ser Ile Leu
565 570 575
Leu Arg Ser Ala Gly Lys Tyr Tyr Leu Gly Val Met Asp Thr Lys Ser
580 585 590
Lys His Ser Phe Lys Lys Tyr Pro Ser Pro Lys Ser Lys Asn Asp Val
595 600 605
Val Glu Lys Met Phe Leu His Gln Val Ala Asn Pro Ala Lys Asp Val
610 615 620
Gln Asn Leu Met Val Ile Asn Gly Lys Thr Val Arg Arg Thr Gly Arg
625 630 635 640
Lys Glu Thr Glu Gly Glu Tyr Lys Gly Glu Asn Leu Arg Leu Glu Glu
645 650 655
Leu Lys Asn Thr His Leu Pro Glu Glu Ile Asn Arg Ile Arg Lys Ser
660 665 670
Gln Ser Tyr Leu Lys Ser Ser Gly Glu Ile Phe Ser Lys Gln Asp Leu
675 680 685
Val Ala Phe Ile Lys Phe Tyr Met Glu Arg Thr Lys Glu Tyr Tyr Thr
690 695 700
Asn Ser His Phe Glu Phe Arg Asn Ala Glu Asn Tyr Gln Asp Phe Lys
705 710 715 720
Glu Phe Thr Asp Asp Ile Asp Ala Gln Ala Tyr Gln Val His Phe Lys
725 730 735
Glu Ile Ser His Ser Phe Ile Asn Ser Leu Val Asp Lys Gly Glu Leu
740 745 750
Tyr Leu Phe Gln Ile Tyr Asn Lys Asp Phe Ser Pro Tyr Ser Arg Gly
755 760 765
Thr Pro Asn Leu His Thr Leu Tyr Phe Lys Met Leu Phe Asp Glu Arg
770 775 780
Asn Leu Ala Asp Val Val Phe Lys Leu Asp Gly Asn Ala Glu Met Phe
785 790 795 800
Tyr Arg Lys Ala Ser Leu Lys Lys Gln Ile Thr His Pro Ala Asn Lys
805 810 815
Pro Ile Pro Asn Lys Asn Thr Met Asn Pro Lys Lys Glu Ser Thr Phe
820 825 830
Gly Tyr Asp Ile Ile Lys Asp Lys Arg Tyr Thr Glu Arg Gln Phe Ser
835 840 845
Leu His Phe Pro Ile Thr Leu Asn Phe Lys Glu Ala Lys Asn Ala Asn
850 855 860
Ile Ser Lys Glu Val Arg Asp Thr Leu Tyr Lys Ser Asp Leu Pro Tyr
865 870 875 880
Ile Ile Gly Ile Asp Arg Gly Glu Arg Asn Leu Leu Tyr Ile Cys Val
885 890 895
Ile Asp Gly Asn Gly Asn Ile Val Glu Gln Met Ser Met Asn Glu Ile
900 905 910
Thr Thr Asp Asn Asn Tyr Lys Val Asn Tyr His Asn Leu Leu Gln Arg
915 920 925
Lys Glu Glu Glu Arg Lys Lys Ala Arg Gly Asn Trp Ser Val Ile Glu
930 935 940
Asn Ile Lys Glu Leu Lys Glu Gly Tyr Leu Ser Gln Val Ile Asn Lys
945 950 955 960
Ile Cys Gly Leu Val Ile Lys Tyr Asn Ala Val Ile Ala Met Glu Asn
965 970 975
Leu Asn Tyr Gly Phe Lys Arg Gly Arg Phe Arg Val Glu Lys Gln Val
980 985 990
Tyr Gln Lys Phe Glu Asn Asn Leu Ile Lys Lys Leu Asn Tyr Leu Ala
995 1000 1005
Asp Lys Lys Leu Pro Pro Glu Gln Asp Gly Gly Leu Leu Arg Ala
1010 1015 1020
Tyr Gln Leu Thr Glu Lys Phe Glu Lys Ile Asn Lys Ser Asn Gln
1025 1030 1035
Asn Gly Ile Ile Phe Phe Val Pro Ala Trp Leu Thr Ser Lys Ile
1040 1045 1050
Asp Pro Thr Thr Gly Phe Thr Asn Leu Leu Tyr Pro Arg Tyr Glu
1055 1060 1065
Ser Val Lys Lys Ala Lys Asn Phe Phe Ala Asn Phe Asn Leu Ile
1070 1075 1080
Thr Tyr Asp Ala Ser Glu Asp Met Phe Arg Phe Asp Phe Asp Tyr
1085 1090 1095
Thr Lys Phe Leu Cys Gly Val Ala Asp Phe Lys Lys Lys Trp Ser
1100 1105 1110
Val Trp Ser Tyr Gly Glu Arg Ile Lys Thr Arg Arg Lys Glu Lys
1115 1120 1125
His Asn Asn Asp Ile Glu Tyr Thr Thr Val Gln Leu Thr Asp Glu
1130 1135 1140
Phe Lys Asn Leu Phe Glu Asn Tyr Arg Ile Asn Tyr Leu Asp Asn
1145 1150 1155
Leu Gln Lys Gln Ile Ile Glu Ala Asp Asp Lys Glu Phe Phe Tyr
1160 1165 1170
Ser Leu Tyr Ser Leu Leu Asn Leu Thr Leu Gln Met Arg Asn Ser
1175 1180 1185
Asn Pro Asn Ser Gly Asp Asp Tyr Leu Ile Ser Pro Val Arg Asn
1190 1195 1200
Thr Ser Gly Gly Phe Tyr Asp Ser Arg Asn Tyr Leu Lys Ser Gly
1205 1210 1215
Asn Leu Ser Leu Pro Val Asp Ala Asp Ala Asn Gly Ala Tyr Asn
1220 1225 1230
Ile Ala Arg Lys Cys Leu Trp Gln Ile Met Lys Leu Lys Ser Leu
1235 1240 1245
Ser Glu Asp Glu Thr Lys Lys Pro Asn Leu Thr Ile Ser Asn Lys
1250 1255 1260
Asp Trp Leu Cys Tyr Ala Gln Glu Asn Lys
1265 1270
<210> 7
<211> 1273
<212> PRT
<213> Artificial Sequence
<220>
<223> Cas-sf10
<400> 7
Met Gln Asp Lys Thr Gly Trp Ser Ser Phe Thr Asn Lys Tyr Ser Leu
1 5 10 15
Ser Lys Thr Leu Arg Phe Glu Leu Lys Pro Val Gly Asn Thr Gln Lys
20 25 30
Met Leu Glu Asp Asp Gly Val Phe Gln Lys Asp Arg Glu Arg Gln Glu
35 40 45
Asn Tyr Lys Lys Val Lys Pro Phe Met Asp Lys Leu His Arg Glu Phe
50 55 60
Ile Lys Glu Ala Leu Asn Asn Leu Lys Leu Glu Gly Leu Thr Glu Tyr
65 70 75 80
Phe Glu Ile Phe Lys Lys Phe Arg Lys Asp Lys Asn Asn Lys Glu Leu
85 90 95
Lys Asn Ala Glu Lys Lys Leu Arg Gln Ile Ile Gly Arg Cys Tyr Thr
100 105 110
Glu Thr Ala Gln Ile Trp Val Glu Lys Tyr Lys Glu Phe Gly Phe Lys
115 120 125
Lys Lys Asn Ile Gly Phe Leu Phe Glu Glu Gly Val Phe Glu Leu Met
130 135 140
Lys Leu Lys Tyr Gly Asn Asp Glu Ala Ser Gln Ile Glu Lys Asn Gly
145 150 155 160
Glu Val Leu Ser Ile Phe Asp Gly Trp Lys Gly Phe Leu Gly Tyr Phe
165 170 175
Lys Lys Phe Phe Glu Thr Arg Asn Asn Phe Tyr Lys Asp Asp Gly Thr
180 185 190
Ser Thr Ala Val Ser Thr Arg Ile Ile Asn Glu Asn Leu Lys Ile Tyr
195 200 205
Leu Asp Asn Leu Ile Lys Tyr Asn Lys Ile Lys Asp Lys Val Asp Phe
210 215 220
Lys Glu Ala Asp Ile Leu Gln Glu Asn Lys Leu Asn Leu Ser Asp Phe
225 230 235 240
Phe Asn Val Glu Ser Tyr Ala Lys Tyr Ser Leu Gln Lys Gly Ile Asp
245 250 255
Tyr Tyr Asn Glu Ile Leu Gly Gly Lys Thr Leu Lys Asn Gly Thr Lys
260 265 270
Leu Lys Gly Leu Asn Glu Val Ile Asn Glu Tyr Lys Gln Lys Asn Lys
275 280 285
Ser Gly Glu Leu Ser Lys Phe Lys Met Leu Lys Lys Gln Ile Leu Gly
290 295 300
Glu Gly Glu Asp Arg Thr Leu Phe Glu Glu Ile Glu Asn Glu Asp Glu
305 310 315 320
Leu Lys Asp Val Leu Lys Asp Phe Phe Tyr Asn Ala Asp Pro Lys Ile
325 330 335
Thr Leu Phe Lys Thr Leu Leu Glu Asp Phe Phe Ser Asn Thr Glu Lys
340 345 350
Tyr Lys Asp Glu Leu Asp Lys Ile Tyr Phe Asn Thr Val Ala Ile Asn
355 360 365
Gly Ile Leu His Arg Trp Val Asp Asp Ser Gly Val Phe Gln Lys Tyr
370 375 380
Leu Phe Glu Val Leu Lys Ser Asn Lys Leu Val Lys Ser Asn His Tyr
385 390 395 400
Asp Lys Lys Glu Asp Ser Tyr Lys Phe Pro Asp Phe Ile Ser Phe Glu
405 410 415
His Ile Lys Val Ala Leu Glu Asn Cys Glu Arg Asp Gly Leu Lys Asp
420 425 430
Lys Phe Trp Lys Glu Lys Tyr Tyr Thr Lys Glu Cys Leu Thr Glu Asn
435 440 445
Gly Leu Ala Asn Leu Trp Gln Glu Phe Leu Glu Ile Tyr Lys Cys Glu
450 455 460
Phe Lys Lys Leu Tyr Asp Tyr Lys Thr Asp Asp Asn Asp Cys Tyr Leu
465 470 475 480
Gln Tyr Arg Asp Asn Tyr Lys Lys Tyr Ile Leu Asp Ala Asn Phe Asn
485 490 495
Pro Lys Glu Lys Ser Ala Lys Asp Ile Ile Lys Asp Tyr Leu Asp Ser
500 505 510
Val Leu Ser Ile Tyr Gln Leu Ala Lys Tyr Phe Ala Leu Glu Lys Lys
515 520 525
Lys Val Trp Thr Thr Asp Tyr Glu Thr Gly Asp Phe Tyr Tyr Glu Tyr
530 535 540
Ile Lys Phe Tyr Glu Asp Thr Tyr Glu Gln Ile Ile Lys Pro Tyr Asn
545 550 555 560
Leu Val Arg Asn Tyr Leu Thr Arg Lys Pro Ile Asn Thr Ala Lys Lys
565 570 575
Trp Lys Leu Asn Phe Asp Asn Ala Tyr Leu Ala Ser Gly Trp Asp Lys
580 585 590
Asp Lys Glu Val Ser Asn Leu Thr Val Ile Leu Arg Arg Asp Glu Gln
595 600 605
Tyr Tyr Leu Ala Ile Met Lys Lys Gly Lys Asn Lys Ile Phe Glu Lys
610 615 620
Lys Phe Ser Cys Gly Glu Phe Glu Lys Met Glu Tyr Lys Gln Ile Ala
625 630 635 640
Glu Ala Ser Ser Asp Ile His Asn Leu Val Leu Met Asn Asp Gly Ser
645 650 655
Cys Arg Arg Cys Ile Lys Met His Asp Lys Arg Lys Tyr Trp Pro Leu
660 665 670
Asp Ile Ser Ile Ile Lys Glu Lys Lys Ser Tyr Ala Lys Glu Asn Phe
675 680 685
Val Arg Arg Asp Phe Glu Arg Phe Val Asn Tyr Met Lys Lys Cys Ser
690 695 700
Leu Leu Tyr Trp Lys Glu Tyr Asp Leu Lys Phe Ser Asp Thr Ser Thr
705 710 715 720
Tyr Lys Asn Ile Asn Asp Phe Thr Asn Glu Ile Ala Ser Gln Gly Tyr
725 730 735
Lys Leu Ser Phe Ser Ala Ile Pro Glu Ser Tyr Ile Asn Glu Lys Asn
740 745 750
Asn Asn Gly Glu Leu Tyr Leu Phe Gln Ile Tyr Asn Lys Asp Phe Gly
755 760 765
Ile Lys Thr Glu Gly Asn Lys Asn Leu His Thr Met Tyr Trp Glu Ser
770 775 780
Ile Phe Ser Glu Glu Asn Arg Phe Arg Asn Phe Ile Val Lys Leu Asn
785 790 795 800
Gly Lys Ala Glu Ile Phe Tyr Arg Pro Lys Ser Glu Gln Val Glu Lys
805 810 815
Glu Gln Arg Asn Phe Thr Arg Glu Ile Ile Lys Asn Arg Arg Tyr Thr
820 825 830
Glu Asn Lys Ile Tyr Phe His Cys Pro Ile Thr Leu Asn Arg Ile Ser
835 840 845
Arg Glu Asn Val Lys Lys Phe Asn Asn Gly Ile Asn Asn Tyr Ile Ala
850 855 860
Thr Asn Pro Asn Ile Asn Ile Leu Gly Val Asp Arg Gly Glu Lys His
865 870 875 880
Leu Val Tyr Tyr Ala Ile Val Asp Gln Asp Gly Lys Leu Ile Asp Ala
885 890 895
Glu Asp Ala Thr Gly Ser Phe Asn Thr Ile Gly Ser Thr Asp Tyr His
900 905 910
Arg Leu Leu Glu Glu Lys Ala Lys Asp Arg Glu Lys Glu Arg Lys Asp
915 920 925
Trp Asp Leu Ile Arg Gly Ile Lys Asp Leu Lys Lys Gly Tyr Ile Ser
930 935 940
Leu Val Val Arg Lys Ile Ala Asp Leu Ala Ile Lys Tyr Asn Ala Ile
945 950 955 960
Ile Ile Phe Glu Asp Leu Asn Thr Arg Phe Lys Gln Ile Arg Gly Gly
965 970 975
Met Glu Lys Ser Val Tyr Gln Gln Leu Glu Lys Ala Leu Ile Asn Lys
980 985 990
Leu Ser Phe Leu Val Asn Lys Gly Glu Lys Asp Pro Glu Gln Ala Gly
995 1000 1005
His Leu Leu Lys Ala Tyr Gln Leu Ala Ala Pro Phe Gln Thr Phe
1010 1015 1020
Asp Lys Met Gly Arg Gln Thr Gly Ile Ile Phe Tyr Thr Gln Ala
1025 1030 1035
Ser Tyr Thr Ser Lys Ile Asp Pro Ile Thr Gly Trp Arg Pro Asn
1040 1045 1050
Leu Tyr Leu Lys Tyr Arg Asn Ile Asp Asp Ser Lys Glu Ser Ile
1055 1060 1065
Lys Lys Phe Lys Ser Ile Leu Phe Asn Lys Glu Lys Asn Arg Phe
1070 1075 1080
Glu Phe Thr Tyr Asp Leu Lys Asp Phe Val Asp Phe Glu Glu Asp
1085 1090 1095
Lys Ile Pro Glu Lys Thr Glu Trp Thr Leu Cys Ser Ser Val Glu
1100 1105 1110
Arg His Lys Trp Asn Arg His Met Asn Asn Asn Lys Gly Gly Tyr
1115 1120 1125
Glu Val Tyr Lys Asp Leu Thr Glu Asn Phe Tyr Lys Leu Phe Asp
1130 1135 1140
Glu Asn Asn Ile Ser Met Asn Lys Asp Ile Val Asp Gln Val Glu
1145 1150 1155
Ser Ile Ser Asn Gly Asn Phe Phe Arg Gln Phe Ile Tyr Leu Phe
1160 1165 1170
Asn Leu Val Cys Gln Ile Arg Asn Thr Asp Glu Lys Ala Glu Asp
1175 1180 1185
Val Asp Lys Arg Asp Phe Ile Leu Ser Pro Val Glu Pro Phe Phe
1190 1195 1200
Asp Ser Arg Arg Ala Lys Asp Phe Lys Ala Tyr Gly Asp Asn Leu
1205 1210 1215
Pro Lys Asn Gly Asp Glu Asn Gly Ala Tyr Asn Ile Ala Arg Lys
1220 1225 1230
Gly Val Leu Ile Ile Lys Lys Ile Lys Glu Tyr Tyr Asn Gln Asn
1235 1240 1245
Gly Ser Cys Asp Lys Leu Gly Trp Gly Asp Leu Ser Ile Ser His
1250 1255 1260
Lys Glu Trp Asp Asp Phe Ala Thr Asn Asn
1265 1270
<210> 8
<211> 36
<212> RNA
<213> Artificial Sequence
<220>
<223> Cas-sf4
<400> 8
guuuagaagc augcuuuaau uucuacuguu guagau 36
<210> 9
<211> 36
<212> RNA
<213> Artificial Sequence
<220>
<223> Cas-sf1
<400> 9
gucuaaaccu caaugaaaau uucuacuguu guagau 36
<210> 10
<211> 36
<212> RNA
<213> Artificial Sequence
<220>
<223> Cas-sf3
<400> 10
gucuauaaga cuauuauaau uucuacuauu guagau 36
<210> 11
<211> 36
<212> RNA
<213> Artificial Sequence
<220>
<223> Cas-sf6
<400> 11
gucuaaaggu auuauaaaau uucuacuauu guagau 36
<210> 12
<211> 36
<212> RNA
<213> Artificial Sequence
<220>
<223> Cas-sf8
<400> 12
gucuaaaggc cuuauauaau uucuacuuuu guagau 36
<210> 13
<211> 36
<212> RNA
<213> Artificial Sequence
<220>
<223> Cas-sf9
<400> 13
guuuaagacc uccuuuuaau uucuacuguu guagau 36
<210> 14
<211> 36
<212> RNA
<213> Artificial Sequence
<220>
<223> Cas-sf10
<400> 14
cucaauuccu uacaauagau uucuacuuuu guagau 36
<210> 15
<211> 709
<212> DNA
<213> Artificial Sequence
<220>
<223> PCR
<400> 15
tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60
cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120
ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180
accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcaggcgcc 240
attcgccatt caggctgcgc aactgttggg aagggcgatc ggtgcgggcc tcttcgctat 300
tacgccagct ggcgaaaggg ggatgtgctg caaggcgatt aagttgggta acgccagggt 360
tttcccagtc acgacgttgt aaaacgacgg ccagtgaatt cgagctcggt accttcggta 420
taacaacttc gacgagctct acaaagcttg gcgtaatcat ggtcatagct gtttcctgtg 480
tgaaattgtt atccgctcac aattccacac aacatacgag ccggaagcat aaagtgtaaa 540
gcctggggtg cctaatgagt gagctaactc acattaattg cgttgcgctc actgcccgct 600
ttccagtcgg gaaacctgtc gtgccagctg cattaatgaa tcggccaacg cgcggggaga 660
ggcggtttgc gtattgggcg ctcttccgct tcctcgctca ctgactcgc 709
<210> 16
<211> 19
<212> RNA
<213> Artificial Sequence
<220>
<223> Cas-sf4
<400> 16
aauuucuacu guuguagau 19
<210> 17
<211> 19
<212> RNA
<213> Artificial Sequence
<220>
<223> Cas-sf1
<400> 17
aauuucuacu guuguagau 19
<210> 18
<211> 19
<212> RNA
<213> Artificial Sequence
<220>
<223> Cas-sf3
<400> 18
aauuucuacu auuguagau 19
<210> 19
<211> 19
<212> RNA
<213> Artificial Sequence
<220>
<223> Cas-sf6
<400> 19
aauuucuacu auuguagau 19
<210> 20
<211> 19
<212> RNA
<213> Artificial Sequence
<220>
<223> Cas-sf8
<400> 20
aauuucuacu uuuguagau 19
<210> 21
<211> 19
<212> RNA
<213> Artificial Sequence
<220>
<223> Cas-sf9
<400> 21
aauuucuacu guuguagau 19
<210> 22
<211> 19
<212> RNA
<213> Artificial Sequence
<220>
<223> Cas-sf10
<400> 22
gauuucuacu uuuguagau 19
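
Comparing the entries above, each 19-nt sequence in SEQ ID Nos. 16-22 appears to be the 3'-terminal portion of the 36-nt direct repeat carrying the same label in SEQ ID Nos. 8-14. The Python sketch below checks that relationship on the sequences as transcribed from this listing. The helper names (clean, check_truncations) and the ID offset of 8 are illustrative assumptions and are not part of the patent.

```python
# Direct repeats as printed in the listing (spaces kept, removed by clean()).
FULL_REPEATS = {
    8: "guuuagaagc augcuuuaau uucuacuguu guagau",   # Cas-sf4
    9: "gucuaaaccu caaugaaaau uucuacuguu guagau",   # Cas-sf1
    10: "gucuauaaga cuauuauaau uucuacuauu guagau",  # Cas-sf3
    11: "gucuaaaggu auuauaaaau uucuacuauu guagau",  # Cas-sf6
    12: "gucuaaaggc cuuauauaau uucuacuuuu guagau",  # Cas-sf8
    13: "guuuaagacc uccuuuuaau uucuacuguu guagau",  # Cas-sf9
    14: "cucaauuccu uacaauagau uucuacuuuu guagau",  # Cas-sf10
}
SHORT_REPEATS = {
    16: "aauuucuacu guuguagau",  # Cas-sf4
    17: "aauuucuacu guuguagau",  # Cas-sf1
    18: "aauuucuacu auuguagau",  # Cas-sf3
    19: "aauuucuacu auuguagau",  # Cas-sf6
    20: "aauuucuacu uuuguagau",  # Cas-sf8
    21: "aauuucuacu guuguagau",  # Cas-sf9
    22: "gauuucuacu uuuguagau",  # Cas-sf10
}


def clean(seq):
    """Remove the block-of-ten spacing used in the listing and lowercase."""
    return seq.replace(" ", "").lower()


def check_truncations(full, short, offset=8):
    """Map each full-length SEQ ID to (full length, short length, is 3' suffix).

    offset is the assumed difference between the paired SEQ ID numbers.
    """
    results = {}
    for seq_id, long_seq in full.items():
        long_clean = clean(long_seq)
        short_clean = clean(short[seq_id + offset])
        results[seq_id] = (
            len(long_clean),
            len(short_clean),
            long_clean.endswith(short_clean),
        )
    return results


if __name__ == "__main__":
    for seq_id, row in check_truncations(FULL_REPEATS, SHORT_REPEATS).items():
        print(seq_id, seq_id + 8, row)
```

The lengths reported by the check can also be compared with the <211> fields of the corresponding entries.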

Claims (19)

1. A Cas protein, characterized in that the Cas protein is any one of the following I-III:
I. the amino acid sequence of the Cas protein has at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to any one of the amino acid sequences of SEQ ID Nos. 1-7, and substantially retains the biological function of the sequence from which it is derived;
II. the amino acid sequence of the Cas protein has one or more amino acid substitutions, deletions or additions compared with any one of the amino acid sequences of SEQ ID Nos. 1-7, and substantially retains the biological function of the sequence from which it is derived;
III. the Cas protein comprises the amino acid sequence shown in any one of SEQ ID Nos. 1-7.
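
As a rough illustration of the identity thresholds in item I, the Python sketch below computes percent identity over a pair of sequences that have already been aligned (gaps written as '-') and reports which of the listed thresholds are met. The claim text here does not specify the alignment method or the denominator, so the definition used (identical columns divided by aligned columns), the function names, and the toy peptides are assumptions for illustration only.

```python
# Thresholds enumerated in item I of claim 1.
THRESHOLDS = (80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99)


def percent_identity(aligned_a, aligned_b):
    """Percent identity over two pre-aligned sequences of equal length.

    Assumed definition: identical residue columns / all columns in which
    at least one sequence has a residue. Gaps are written as '-'.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column empty in both sequences: ignore
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


def thresholds_met(identity):
    """Return the claim-1 thresholds that the given identity satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


if __name__ == "__main__":
    # Hypothetical 10-residue peptides differing at one position.
    identity = percent_identity("MKKFKSILFN", "MKKFRSILFN")
    print(identity, thresholds_met(identity))
```

In practice the aligned strings would come from a pairwise alignment tool; the choice of tool and scoring scheme can change the resulting percentage.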
2. A fusion protein comprising the Cas protein of claim 1 and other modifying moieties.
3. An isolated polynucleotide, wherein the polynucleotide is a polynucleotide sequence encoding a Cas protein of claim 1, or a polynucleotide sequence encoding a fusion protein of claim 2.
4. A gRNA comprising a direct repeat sequence capable of binding the Cas protein of claim 1 and a guide sequence capable of targeting a target sequence.
5. A direct repeat comprising a sequence as set forth in any one of SEQ ID Nos. 8 to 14 or 16 to 22.
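
A minimal Python sketch of how a gRNA of the kind described in claims 4 and 5 might be assembled in silico by joining a direct repeat to a guide sequence. The order used (direct repeat at the 5' end, guide at the 3' end) follows the usual Cas12a-type crRNA layout and is an assumption, since the claims do not state the order. The guide shown is a made-up 20-nt example, not a sequence from the patent, and the function names are hypothetical.

```python
RNA_ALPHABET = set("acgu")

# SEQ ID No. 16 as printed in the sequence listing (Cas-sf4, 19 nt).
DIRECT_REPEAT_SEQ_ID_16 = "aauuucuacu guuguagau"

# Hypothetical 20-nt guide written as DNA; not taken from the patent.
EXAMPLE_GUIDE_DNA = "ACGTTGCAAGCTTGGATCCA"


def to_rna(seq):
    """Normalise a sequence to lowercase RNA (strip spaces, T -> U)."""
    cleaned = seq.replace(" ", "").lower().replace("t", "u")
    unexpected = set(cleaned) - RNA_ALPHABET
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return cleaned


def build_grna(direct_repeat, guide):
    """Concatenate direct repeat and guide (assumed 5'-repeat, 3'-guide)."""
    return to_rna(direct_repeat) + to_rna(guide)


if __name__ == "__main__":
    grna = build_grna(DIRECT_REPEAT_SEQ_ID_16, EXAMPLE_GUIDE_DNA)
    print(grna, len(grna))
```

Any of the other listed direct repeats could be passed to build_grna in the same way; which repeat pairs with which Cas protein is indicated by the <223> labels in the listing.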
6. A vector comprising the polynucleotide of claim 3 operably linked to a regulatory element.
7. A CRISPR-Cas system, comprising a Cas protein of claim 1 and at least one gRNA of claim 4.
8. A vector system, wherein the vector system comprises one or more vectors comprising:
a) a first regulatory element operably linked to the gRNA of claim 4,
b) a second regulatory element operably linked to the Cas protein of claim 1;
wherein components (a) and (b) are located on the same vector or on different vectors of the system.
9. A composition, characterized in that the composition comprises:
(i) a protein component selected from: a Cas protein according to claim 1 or a fusion protein according to claim 2;
(ii) a nucleic acid component selected from the group consisting of: the gRNA of claim 4, or a nucleic acid encoding the gRNA of claim 4, or a precursor RNA of the gRNA of claim 4, or a nucleic acid encoding the precursor RNA of the gRNA of claim 4;
wherein the protein component and the nucleic acid component are combined with each other to form a complex.
10. An activated CRISPR complex comprising:
(i) a protein component selected from: a Cas protein according to claim 1 or a fusion protein according to claim 2;
(ii) a nucleic acid component selected from the group consisting of: the gRNA of claim 4, or a nucleic acid encoding the gRNA of claim 4, or a precursor RNA of the gRNA of claim 4, or a nucleic acid encoding the precursor RNA of the gRNA of claim 4;
(iii) a target sequence bound to the gRNA of claim 4.
11. An engineered host cell comprising the Cas protein of claim 1, or the fusion protein of claim 2, or the polynucleotide of claim 3, or the vector of claim 6, or the CRISPR-Cas system of claim 7, or the vector system of claim 8, or the composition of claim 9, or the activated CRISPR complex of claim 10.
12. Use of a Cas protein of claim 1, or a fusion protein of claim 2, or a polynucleotide of claim 3, or a vector of claim 6, or a CRISPR-Cas system of claim 7, or a vector system of claim 8, or a composition of claim 9, or an activated CRISPR complex of claim 10, or a host cell of claim 11 in gene editing, gene targeting, or gene cleavage; alternatively, use in the manufacture of a reagent or kit for gene editing, gene targeting or gene cleavage.
13. Use of a Cas protein of claim 1, or a fusion protein of claim 2, or a polynucleotide of claim 3, or a vector of claim 6, or a CRISPR-Cas system of claim 7, or a vector system of claim 8, or a composition of claim 9, or an activated CRISPR complex of claim 10, or a host cell of claim 11 for any one or more of the following:
targeting and/or editing a target nucleic acid; cleaving double-stranded DNA, single-stranded DNA, or single-stranded RNA; non-specifically cleaving and/or degrading collateral nucleic acids; non-specifically cleaving single-stranded nucleic acids; detecting nucleic acids; specifically editing double-stranded nucleic acids; base-editing double-stranded nucleic acids; base-editing single-stranded nucleic acids.
14. A method of editing, targeting or cleaving a target nucleic acid, the method comprising contacting the target nucleic acid with the Cas protein of claim 1, or the fusion protein of claim 2, or the polynucleotide of claim 3, or the vector of claim 6, or the CRISPR-Cas system of claim 7, or the vector system of claim 8, or the composition of claim 9, or the activated CRISPR complex of claim 10, or the host cell of claim 11.
15. A method of cleaving single-stranded nucleic acid, the method comprising contacting a nucleic acid population with the Cas protein of claim 1 and the gRNA of claim 4, wherein the nucleic acid population comprises a target nucleic acid and at least one non-target single-stranded nucleic acid, the gRNA being capable of targeting the target nucleic acid, the Cas protein cleaving the non-target single-stranded nucleic acid.
16. A kit for gene editing, gene targeting or gene cleavage comprising the Cas protein of claim 1, or the fusion protein of claim 2, or the polynucleotide of claim 3, or the vector of claim 6, or the CRISPR-Cas system of claim 7, or the vector system of claim 8, or the composition of claim 9, or the activated CRISPR complex of claim 10, or the host cell of claim 11.
17. A kit for detecting a target nucleic acid in a sample, the kit comprising: (a) the Cas protein of claim 1, or a nucleic acid encoding the Cas protein; (b) the gRNA of claim 4, or a nucleic acid encoding the gRNA, or a precursor RNA comprising the gRNA, or a nucleic acid encoding the precursor RNA; and (c) a single-stranded nucleic acid detector that does not hybridize to the gRNA.
18. Use of a Cas protein of claim 1, or a fusion protein of claim 2, or a polynucleotide of claim 3, or a vector of claim 6, or a CRISPR-Cas system of claim 7, or a vector system of claim 8, or a composition of claim 9, or an activated CRISPR complex of claim 10, or a host cell of claim 11 in the preparation of a formulation or kit for:
(i) gene or genome editing;
(ii) target nucleic acid detection and/or diagnosis;
(iii) editing a target sequence in a target locus to modify an organism or non-human organism;
(iv) treatment of diseases;
(v) targeting a target gene;
(vi) cleaving a target gene.
19. A method of detecting a target nucleic acid in a sample, the method comprising contacting the sample with a Cas protein of claim 1, a gRNA (guide RNA) comprising a region that binds to the Cas protein and a guide sequence that hybridizes to the target nucleic acid, and a single-stranded nucleic acid detector; and detecting a detectable signal generated by cleavage of the single-stranded nucleic acid detector by the Cas protein, thereby detecting the target nucleic acid; wherein the single-stranded nucleic acid detector does not hybridize to the gRNA.
CN202210115774.3A 2021-02-03 2022-02-07 Novel Cas enzyme and application Active CN114292831B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310515419.XA CN116555227A (en) 2021-02-03 2022-02-07 Novel Cas enzyme and application

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110146620 2021-02-03
CN2021101466206 2021-02-03

Related Child Applications (1)

Application Number Priority Date Filing Date Title
CN202310515419.XA Division CN116555227A (en) 2021-02-03 2022-02-07 Novel Cas enzyme and application

Publications (2)

Publication Number Publication Date
CN114292831A true CN114292831A (en) 2022-04-08
CN114292831B CN114292831B (en) 2023-04-07

Family

ID=80978393

Family Applications (2)

Application Number Priority Date Filing Date Title
CN202310515419.XA Pending CN116555227A (en) 2021-02-03 2022-02-07 Novel Cas enzyme and application
CN202210115774.3A Active CN114292831B (en) 2021-02-03 2022-02-07 Novel Cas enzyme and application

Family Applications Before (1)

Application Number Priority Date Filing Date Title
CN202310515419.XA Pending CN116555227A (en) 2021-02-03 2022-02-07 Novel Cas enzyme and application

Country Status (1)

Country Link
CN (2) CN116555227A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020088450A1 (en) * 2018-10-29 2020-05-07 中国农业大学 Novel crispr/cas12f enzyme and system
CN111615557A (en) * 2017-11-22 2020-09-01 国立大学法人神户大学 Stable genome editing complex with few side effects and nucleic acid encoding same
CN111996236A (en) * 2020-05-29 2020-11-27 山东舜丰生物科技有限公司 Method for detecting target nucleic acid based on CRISPR technology

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111615557A (en) * 2017-11-22 2020-09-01 国立大学法人神户大学 Stable genome editing complex with few side effects and nucleic acid encoding same
WO2020088450A1 (en) * 2018-10-29 2020-05-07 中国农业大学 Novel crispr/cas12f enzyme and system
CN111757889A (en) * 2018-10-29 2020-10-09 中国农业大学 Novel CRISPR/Cas12f enzymes and systems
CN111996236A (en) * 2020-05-29 2020-11-27 山东舜丰生物科技有限公司 Method for detecting target nucleic acid based on CRISPR technology

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MAKAROVA K.S. et al.: "Evolutionary classification of CRISPR–cas systems: A burst of class 2 and derived variants" *

Also Published As

Publication number Publication date
CN114292831B (en) 2023-04-07
CN116555227A (en) 2023-08-08

Similar Documents

Publication Publication Date Title
CN113881652B (en) Novel Cas enzymes and systems and applications
CN114672473B (en) Optimized Cas protein and application thereof
CN114517190B (en) CRISPR enzymes and systems and uses
CN114410609B (en) Cas protein with improved activity and application thereof
CN114507654B (en) Cas enzymes and systems and applications
CN114438055B (en) Novel CRISPR enzymes and systems and uses
CN113337502B (en) gRNA and its use
CN117106752A (en) Optimized Cas12 proteins and uses thereof
CN116004573B (en) Cas protein with improved editing activity and application thereof
CN114292831B (en) Novel Cas enzyme and application
CN116286739A (en) Mutant Cas proteins and uses thereof
CN114277015A (en) Novel CRISPR enzymes and uses
CN115851666B (en) Novel Cas enzymes and systems and uses
WO2023173682A1 (en) Optimized cas protein and use thereof
WO2023143150A1 (en) Novel cas enzyme and system and use
WO2024041299A1 (en) Mutated crispr-cas protein and use thereof
WO2024040874A1 (en) Mutated cas12j protein and use thereof
WO2023174249A1 (en) Cas protein having improved activity and use thereof
CN117050971A (en) Cas muteins and uses thereof
CN117247920A (en) Novel CRISPR enzyme, system and application
CN117286123A (en) Optimized Cas protein and application thereof
CN117603943A (en) Cas protein with improved editing efficiency and application thereof
CN116200369A (en) Novel Cas enzyme and application thereof
CN116555225A (en) Cas proteins with improved activity and uses thereof
CN118006585A (en) Optimized Cas protein and application thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant