CN117683749B

CN117683749B - Cas proteins and uses thereof

Info

Publication number: CN117683749B
Application number: CN202410155116.6A
Authority: CN
Inventors: 梁峻彬; 陈重建; 孙阳; 潘伟业; 徐辉; 黄连成
Original assignee: Zhejiang Xunshi Gene Technology Co ltd; Zhejiang Xunzhi Biotechnology Co ltd; Guangzhou Ruifeng Biotechnology Co ltd
Current assignee: Zhejiang Xunshi Gene Technology Co ltd; Zhejiang Xunzhi Biotechnology Co ltd; Guangzhou Ruifeng Biotechnology Co ltd
Priority date: 2024-02-04
Filing date: 2024-02-04
Publication date: 2024-05-17
Anticipated expiration: 2044-02-04
Also published as: CN117683749A

Abstract

The invention discloses a Cas protein and application thereof. The amino acid sequence of the Cas protein is shown as SEQ ID NO. 1 or 2. The invention also discloses guide RNAs, fusion proteins or conjugates, isolated nucleic acids, CRISPR-Cas systems, vector systems, and uses thereof.

Description

Cas proteins and uses thereof

Technical Field

The disclosure relates to the field of CRISPR gene editing, in particular to a Cas protein and application thereof.

Background

CRISPR-Cas systems are an adaptive immune defense that bacteria and archaea have developed during long-term evolution and use against invading viruses and foreign DNA. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and CRISPR-associated protein systems (CRISPR-Cas systems) can directly alter gene sequences in cells, providing a fast and efficient method of gene editing.

Many researchers in the field are working to find new Cas proteins and CRISPR-Cas gene editing systems.

Disclosure of Invention

The invention provides Cas proteins and uses thereof.

In one aspect, the invention provides a technical scheme as follows: a Cas protein, the amino acid sequence of which comprises or is an amino acid sequence having at least 50% identity compared to SEQ ID No. 1 or 2.

In particular embodiments of the invention, the at least 50% identity is at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9% or 100% identity.

In particular embodiments of the invention, the amino acid sequence of the Cas protein comprises or is an amino acid sequence having at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8% or at least 99.9% identity compared to SEQ ID No. 1 or 2.

In a specific embodiment of the invention, the amino acid sequence of the Cas protein comprises or is an amino acid sequence having 100% identity to either one of SEQ ID NOs. 1 and 2.

In a specific embodiment of the invention, the amino acid sequence of the Cas protein comprises the sequence set forth in any one of SEQ ID NOs. 1, 2, 9 and 10.

In a specific embodiment of the invention, the amino acid sequence of the Cas protein is shown as SEQ ID NO. 1. In a specific embodiment of the invention, the amino acid sequence of the Cas protein is shown as SEQ ID NO. 2. In a specific embodiment of the invention, the amino acid sequence of the Cas protein is shown in SEQ ID NO. 9. In a specific embodiment of the invention, the amino acid sequence of the Cas protein is shown in SEQ ID NO. 10.

In particular embodiments of the invention, the Cas protein may form a complex with a guide RNA. In particular embodiments of the invention, the Cas protein may specifically bind to a target nucleic acid with a guide RNA.

In particular embodiments of the invention, the Cas protein may form a complex with a guide RNA that may specifically bind to a target nucleic acid. In particular embodiments of the invention, the Cas protein may form a complex with a guide RNA that may specifically bind to a target DNA.

In particular embodiments of the invention, the Cas protein may specifically bind to a guide RNA and cleave a target nucleic acid. In particular embodiments of the invention, the Cas protein may specifically bind to a guide RNA and cleave a target DNA. In particular embodiments of the invention, the Cas protein may form a complex with a guide RNA that may specifically bind to and cleave a target nucleic acid. In particular embodiments of the invention, the Cas protein may form a complex with a guide RNA that may specifically bind to and cleave a target DNA.

In a specific embodiment of the invention, the Cas protein recognizes a PAM having the sequence 5'-TTN-3', where N is any one selected from A, T, C and G.
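As an illustration only, the 5'-TTN-3' motif can be located in a DNA sequence with a short script. This is a minimal sketch: the function name `find_pam_sites` and the example sequence are hypothetical, and the scan covers only the strand supplied, read 5' to 3'. It makes no assumption about which side of the PAM the target sequence lies on.

```python
import re

# Lookahead so that overlapping sites (e.g. "TTTA" contains TTT and TTA) are all reported.
PAM_PATTERN = re.compile(r"(?=(TT[ACGT]))")


def find_pam_sites(dna):
    """Return (0-based start, PAM) for each 5'-TTN-3' match on the given strand."""
    return [(m.start(), m.group(1)) for m in PAM_PATTERN.finditer(dna.upper())]


# Hypothetical example sequence, not taken from the sequence listing.
sites = find_pam_sites("ATTGCTTTA")
```

Each returned tuple gives the start position of a candidate PAM and the three bases that matched.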

In some embodiments of the invention, the Cas protein is a Cas protein inactivating variant. In some embodiments of the invention, the Cas protein inactivating variant is a dead Cas (dCas) or a nickase Cas (nCas).

In some embodiments of the invention, the Cas protein is selected from active fragments of any one of the Cas proteins of the invention.

In another aspect, the present invention provides a technical solution that is: a guide RNA comprising (i) a direct repeat sequence having at least 50% identity to SEQ ID NO. 3 or 4, and (ii) a guide sequence engineered to hybridize to a target nucleic acid; the direct repeat sequence is linked to the guide sequence, and the guide RNA is capable of forming a complex with the Cas protein and directing sequence-specific binding of the complex to the target nucleic acid.

In some embodiments of the invention, the direct repeat sequence has at least 50% identity to the sequence set forth in SEQ ID NO. 3 or 4.

In some embodiments of the invention, the direct repeat sequence has at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity compared to SEQ ID NO. 3 or 4.

In some embodiments of the invention, the direct repeat sequence is the sequence shown in SEQ ID NO. 3 or 4.

In a preferred embodiment, the Cas protein is a Cas protein according to the present invention.

In a specific embodiment of the invention, the guide sequence comprises 15-60 nucleotides. In a specific embodiment of the invention, the guide sequence comprises 15-50 nucleotides. In a specific embodiment of the invention, the guide sequence comprises 15-40 nucleotides. In a specific embodiment of the invention, the guide sequence comprises 15-35 nucleotides. In a specific embodiment of the invention, the guide sequence comprises 15-30 nucleotides. In a specific embodiment of the invention, the guide sequence comprises 15-25 nucleotides. In a specific embodiment of the invention, the guide sequence comprises 18-25 nucleotides. In a specific embodiment of the invention, the guide sequence comprises 20-25 nucleotides. In a specific embodiment of the invention, the guide sequence comprises 18-22 nucleotides. In a specific embodiment of the invention, the guide sequence comprises 20-22 nucleotides. In specific embodiments of the invention, the guide sequence comprises 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40 nucleotides.

In a specific embodiment of the invention, the guide sequence is located 3' to the direct repeat sequence.

In a specific embodiment of the invention, the guide sequence is located 5' to the direct repeat sequence.
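The guide RNA layout described above (a repeat sequence linked to a guide sequence of 15-60 nucleotides, with the guide on either the 3' or the 5' side) can be sketched as follows. This is an illustrative sketch only: `build_guide_rna`, its parameters and the placeholder sequences are hypothetical, and the placeholder repeat is not SEQ ID NO. 3 or 4.

```python
RNA_BASES = set("ACGU")


def build_guide_rna(direct_repeat, guide, guide_side="3prime", min_len=15, max_len=60):
    """Concatenate a repeat and a guide sequence, checking the guide length range."""
    direct_repeat = direct_repeat.upper()
    guide = guide.upper()
    for name, seq in (("direct_repeat", direct_repeat), ("guide", guide)):
        if not seq or set(seq) - RNA_BASES:
            raise ValueError(f"{name} must be a non-empty RNA sequence (A, C, G, U)")
    if not min_len <= len(guide) <= max_len:
        raise ValueError(f"guide length {len(guide)} is outside {min_len}-{max_len} nt")
    if guide_side == "3prime":   # guide located 3' to the repeat
        return direct_repeat + guide
    if guide_side == "5prime":   # guide located 5' to the repeat
        return guide + direct_repeat
    raise ValueError("guide_side must be '3prime' or '5prime'")


# Placeholder sequences for illustration; not from the sequence listing.
PLACEHOLDER_REPEAT = "GGAAUUCCGGAAUUCC"
PLACEHOLDER_GUIDE = "ACGUACGUACGUACGUACGU"  # 20 nt
grna = build_guide_rna(PLACEHOLDER_REPEAT, PLACEHOLDER_GUIDE)
```

The narrower ranges listed above (for example 18-25 or 20-22 nucleotides) can be checked by passing different `min_len` and `max_len` values.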

In another aspect, the present invention provides a technical solution that is: a Cas protein inactivating variant, characterized in that the Cas protein inactivating variant is a nuclease activity inactivating variant of a Cas protein according to the present invention.

In a specific embodiment of the invention, the Cas protein inactivating variant is a variant in which nuclease activity is completely inactivated, i.e., a dead Cas (dCas). Under the mediation of the guide RNA, the dCas can only bind to the target nucleic acid and has no or little ability to cleave it. For example, the target nucleic acid cleavage efficiency of the dCas is ≤20%, ≤15%, ≤10%, ≤5%, ≤4%, ≤3%, ≤2% or ≤1% of the target nucleic acid cleavage efficiency of the Cas protein prior to the inactivating mutation.
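The comparison in the preceding example is a simple ratio of the variant's cleavage efficiency to that of the Cas protein before mutation. The sketch below assumes the listed percentages are upper bounds. The function names and input values are hypothetical and are not measured data.

```python
def relative_cleavage_pct(variant_efficiency, parent_efficiency):
    """Cleavage efficiency of the variant as a percentage of the unmutated protein's."""
    if parent_efficiency <= 0:
        raise ValueError("parent_efficiency must be positive")
    return 100.0 * variant_efficiency / parent_efficiency


def meets_threshold(variant_efficiency, parent_efficiency, threshold_pct=20.0):
    """True if the variant's relative cleavage efficiency is at or below the threshold."""
    return relative_cleavage_pct(variant_efficiency, parent_efficiency) <= threshold_pct


# Hypothetical numbers: the variant cleaves 4% of targets, the parent protein 80%.
ratio = relative_cleavage_pct(4.0, 80.0)   # 5.0 (% of the parent's efficiency)
```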

In a specific embodiment of the invention, the Cas protein inactivating variant is a variant in which nuclease activity is partially inactivated. Further, the variant with partially inactivated nuclease activity is a Cas nickase (nickase Cas, nCas), which binds to the target nucleic acid under the mediation of the guide RNA and then cleaves one strand of the double-stranded target nucleic acid without cleaving the other strand.

In a preferred embodiment of the invention, the Cas protein inactivating variant is a variant of the Cas protein in which the RuvC domain is inactivated.

In a preferred embodiment of the invention, the Cas protein inactivating variant is a variant of the Cas protein in which the RuvC-I, RuvC-II or RuvC-III domain is inactivated.

In a preferred embodiment of the invention, the Cas protein inactivating variant is obtained by introducing an inactivating mutation in the RuvC-I, RuvC-II or RuvC-III domain of the Cas protein.

In particular embodiments of the invention, the PAM sequence recognizable by the Cas protein inactivating variant is identical to the PAM sequence recognizable by the Cas protein.

In another aspect, the present invention provides a technical solution that is: a fusion protein or conjugate comprising the following elements: (1) a Cas protein according to the invention, or a Cas protein inactivating variant according to the invention; and (2) a homologous or heterologous functional domain.

In a specific embodiment of the present invention, there is provided a fusion protein comprising: (1) a Cas protein according to the invention, or a Cas protein inactivating variant according to the invention; and (2) a homologous or heterologous functional domain.

In a specific embodiment of the present invention, there is provided a fusion protein comprising: (1) a Cas protein according to the invention; and (2) a homologous or heterologous functional domain.

In a specific embodiment of the present invention, there is provided a conjugate comprising: (1) a Cas protein according to the invention, or a Cas protein inactivating variant according to the invention; and (2) a homologous or heterologous functional domain.

In a specific embodiment of the present invention, there is provided a conjugate comprising: (1) a Cas protein according to the invention; and (2) a homologous or heterologous functional domain.

In a specific embodiment of the invention, the homologous or heterologous functional domain is any one or more selected from the group consisting of: subcellular localization signals, DNA binding domains, protease domains, transcription activation domains, transcription inhibition domains, nuclease domains, deaminase domains, uracil DNA glycosylase domains (UDG), uracil DNA glycosylase inhibition domains (UGI), methylases, demethylases, transcription release factors, histone acetylase domains, histone deacetylase domains, DNA ligases, affinity tags, reporter tags.

In some embodiments of the invention, the subcellular localization signal is selected from: a nuclear localization signal, a nuclear export signal, a mitochondrial localization signal and a chloroplast localization signal.

In specific embodiments of the invention, the fusion protein or conjugate comprises 1, 2, 3, 4, 5, 6, 7, 8, 9 or more of the homologous or heterologous functional domains; the functional domains are the same or different.

In some embodiments, the fusion protein or conjugate optionally has 0, 1, 2, 3, 4, 5, 6, 7, 8 or more of the protein domains linked at the N-terminus and/or C-terminus of the Cas protein.

In specific embodiments of the invention, the fusion protein comprises 1, 2, 3, 4 or more nuclear localization signals.

In another aspect, the present invention provides a technical solution that is: an isolated nucleic acid encoding a Cas protein according to the invention, a Cas protein inactivating variant according to the invention, or a fusion protein or conjugate according to the invention.

In some embodiments of the invention, the nucleic acid encodes a Cas protein as described herein or a fusion protein as described herein.

In a preferred embodiment of the invention, the nucleic acid is codon optimized for expression in a cell.

In another aspect, the present invention provides a technical solution that is: a CRISPR-Cas system, comprising:

a. a Cas protein according to the invention, a Cas protein inactivating variant according to the invention, a fusion protein or conjugate according to the invention or a nucleic acid according to the invention; and

b. a guide RNA, or a polynucleotide sequence encoding the guide RNA;

the Cas protein, the Cas protein inactivating variant, or the fusion protein or conjugate forms a complex with the guide RNA; the guide RNA comprises a guide sequence engineered to direct sequence-specific binding of the complex to a target nucleic acid.

In a specific embodiment of the invention, the guide RNA comprises a direct repeat sequence linked to a guide sequence.

In a specific embodiment of the invention, the direct repeat sequence has at least 50% identity to SEQ ID NO. 3 or 4.

In specific embodiments of the invention, the target nucleic acid is a gene associated with a disease or disorder or a gene associated with a biochemical signaling pathway, or the target nucleic acid is a reporter gene; for example, the disease or disorder is a hematological disease or disorder, an ophthalmic disease or disorder, a neurological disease or disorder, a respiratory disease or disorder, a liver disease or disorder, a metabolic disease or disorder, cancer, or an infectious disease.

In some embodiments of the invention, the target nucleic acid is a gene associated with a disease or disorder, the disease or disorder being any one selected from the group consisting of: hemophilia A, Best vitelliform macular dystrophy, B-cell acute lymphoblastic leukemia, hemophilia B, CDKL5 deficiency, CLN2 disease, Niemann-Pick disease type C, Dravet syndrome, FOXG1 syndrome, GM1 gangliosidosis, GM2 gangliosidosis, HIV infection, HSV infection, type IB Usher syndrome, type IIA Usher syndrome, type IIIA mucopolysaccharidosis, type IIIB mucopolysaccharidosis, type III Gaucher disease, type II mucopolysaccharidosis, type II diabetes, type IV mucopolysaccharidosis, type I Gaucher disease, type I mucopolysaccharidosis, type I diabetes, type I Usher syndrome, KCNQ2 epileptic encephalopathy, Leber hereditary optic neuropathy, Leigh syndrome, Prader-Willi syndrome, SLC13A5 deficiency, X-linked myotubular myopathy, X-linked retinal degeneration, alpha 1-antitrypsin deficiency, alpha-mannosidosis, alpha-thalassemia, beta-thalassemia, Alzheimer's disease, Bardet-Biedl syndrome, retinitis pigmentosa, leukocyte adhesion deficiency type I, galactosemia, bladder cancer, overactive bladder, phenylketonuria, nasopharyngeal carcinoma, beta-holothurian crystal dystrophy, pyruvate kinase deficiency, erectile dysfunction, autosomal recessive congenital ichthyosis, adult glucan disease, traumatic arthritis, homozygous familial hypercholesterolemia, fragile X syndrome, thalassemia, hypophosphatasia, epilepsy, multiple myeloma, multiple system atrophy, frontotemporal dementia, catecholaminergic polymorphic ventricular tachycardia, Fabry disease, Fanconi anemia, aromatic amino acid decarboxylase deficiency, catecholaminergic polymorphic tachycardia, radiation-induced xerostomia, non-Hodgkin lymphoma, non-muscle-invasive bladder cancer, non-alcoholic fatty liver disease, non-small cell lung cancer, hypertrophic cardiomyopathy, hypertrophic scar, obesity, Charcot-Marie-Tooth disease type 1A, Charcot-Marie-Tooth disease type 2A, pulmonary hypertension, Friedreich's ataxia, peritoneal cancer, liver cancer, hepatocellular carcinoma, dry age-related macular degeneration, Sjögren's syndrome, hyperuricemia, hyperlipidemia, Gaucher disease, autism spectrum disorders, osteoarthritis, bone marrow failure syndrome, citrullinemia type I, coronary heart disease, cystinosis, melanoma, Huntington's disease, amyotrophic lateral sclerosis, urge urinary incontinence, acute intermittent porphyria, acute lymphocytic leukemia, spinocerebellar ataxia, spinal muscular atrophy with respiratory distress type 1, spinal muscular atrophy, familial black dementia, chronic lymphocytic leukemia, methylmalonic acidemia, thyroid cancer, pseudohypertrophic muscular dystrophy, anaplastic astrocytoma, intermittent claudication, junctional epidermolysis bullosa, glioma, glioblastoma, corneal graft rejection, colorectal cancer, progressive multifocal leukoencephalopathy, progressive familial intrahepatic cholestasis, giant axonal neuropathy, Canavan disease, cocaine addiction, Krabbe disease, Crigler-Najjar syndrome, oral cancer, happy puppet syndrome (Angelman syndrome), diffuse intrinsic pontine glioma, love Lash, rheumatoid arthritis, sickle cell disease, lymphedema, ovarian cancer, chronic lymphocytic leukemia, chronic granulomatous disease, chronic renal anemia, chronic pain, chronic hepatitis B, Menkes disease, cystic fibrosis, inner-Joseph syndrome, ornithine carbamoyltransferase deficiency, Parkinson's disease, Pompe disease, uveitis, prostate cancer, vestibular schwannoma, myotonic dystrophy, ankylosing spondylitis, castration-resistant prostate cancer, glaucoma, holoceanopia, ischemic heart failure, lysosomal storage disease, sarcoma, breast cancer, rayleigh's syndrome, triple negative breast cancer, Sandhoff disease, achromatopsia, heart failure with reduced ejection fraction, neuronal ceroid lipofuscinosis, adrenoleukodystrophy, renal cell carcinoma, wet age-related macular degeneration, eczema, thrombocytopenia with immunodeficiency syndrome, esophageal cancer, optic neuropathy, optic atrophy, retinal vein occlusion, retinal pigment degeneration, rhodopsin-mediated autosomal inherited retinal pigment degeneration, ependymoma, fallopian tube cancer, bilateral vestibular disease, Stargardt disease, diabetic macular edema, adrenoleukosis diabetic neuropathy, diabetic retinopathy, diabetic peripheral neuralgia, diabetic foot, glycogen storage disease type Ia, glycogen storage disease type IIb, atopic dermatitis, hearing loss, hearing impairment, head and neck cancer, head and neck squamous cell carcinoma, Wilson's disease, stable angina, Usher syndrome, choroideremia, congenital amaurosis, congenital adrenocortical hyperplasia, cardiomyopathy, angina pectoris, heart failure, novel coronavirus infection, pleural mesothelioma, acne vulgaris, severe combined immunodeficiency disease, severe limb ischemia, oculopharyngeal muscular dystrophy, pancreatic cancer, graft versus host disease, hereditary retinal dystrophy, hereditary angioedema, hepatitis B, metachromatic leukodystrophy, psoriatic arthritis, recessive hereditary epidermolysis bullosa, infant osteosclerosis, dystrophic epidermolysis bullosa, scleroderma, primary immunodeficiency, heterozygous familial hypercholesterolemia, limb-girdle muscular dystrophy type 2B, limb-girdle muscular dystrophy type 2C, limb-girdle muscular dystrophy type 2D, limb-girdle muscular dystrophy type 2E, limb-girdle muscular dystrophy type 2I, limb-girdle muscular dystrophy type 2L, limb ischemic disease, lipoprotein lipase deficiency, severe congenital neutrophil deficiency, wrinkles, stroke, sciatica, schizophrenia, depression, drug addiction, autism, idiopathic pulmonary fibrosis, transthyretin (ATTR) amyloidosis, AATD liver disease, AATD pulmonary disease and elevated blood lipids.

In some embodiments, the transthyretin (ATTR) amyloidosis-associated genes include, but are not limited to ATTR;

genes associated with Leber hereditary optic neuropathy include, but are not limited to, MT-ND4;

Genes associated with AATD liver disease include, but are not limited to, AATD;

related genes of the AATD lung disease include, but are not limited to, AATD;

Genes associated with graft versus host disease include, but are not limited to, thymidine kinase genes;

genes associated with the hereditary retinal dystrophy include, but are not limited to, RPE65;

the spinal muscular atrophy related genes include but are not limited to SMN1;

genes associated with osteoarthritis include, but are not limited to, TGF-β1;

genes associated with hemophilia A include, but are not limited to, factor VIII;

Genes associated with hemophilia B include, but are not limited to factor IX;

genes associated with cystic fibrosis include, but are not limited to, CFTR;

genes associated with Parkinson's disease include, but are not limited to, GAD1, GAD2, PTBP1, and REST;

genes associated with Usher syndrome include, but are not limited to, USH2A;

Genes associated with alpha-thalassemia, beta-thalassemia, sickle cell disease include, but are not limited to BCL11A, HBG, HBA and HBB;

genes associated with pulmonary hypertension include, but are not limited to, eNOS;

genes associated with Stargardt disease include, but are not limited to, ABCA4;

genes associated with age-related macular degeneration include, but are not limited to, VEGFA and VEGFR;

related genes for glaucoma include, but are not limited to, AQP1;

Genes associated with idiopathic pulmonary fibrosis include, but are not limited to CTGF;

genes associated with Alzheimer's disease include, but are not limited to, NGF;

Genes associated with coronary heart disease include, but are not limited to, VEGFA and bFGF;

genes associated with anemia of chronic kidney disease include, but are not limited to, EPO;

Genes associated with congenital amaurosis include, but are not limited to, RPE65;

Genes associated with retinal pigment degeneration include, but are not limited to PDE6B;

genes associated with phenylketonuria include, but are not limited to, PAH;

genes associated with epilepsy include, but are not limited to, GAT1;

Genes associated with elevated blood lipids include, but are not limited to, PCSK9.

In another aspect, the present invention provides a technical solution that is: a vector system comprising one or more recombinant vectors comprising an isolated nucleic acid according to the invention, or a CRISPR-Cas system according to the invention.

In a specific embodiment of the invention, the recombinant vector further comprises regulatory sequences.

In particular embodiments of the invention, the vector system comprises one or more recombinant vectors comprising a polynucleotide sequence encoding the Cas protein, Cas protein inactivating variant, fusion protein or conjugate of the invention, and a polynucleotide sequence encoding the guide RNA.

In specific embodiments of the invention, the polynucleotide sequence encoding the Cas protein, Cas protein inactivating variant, fusion protein or conjugate is operably linked to regulatory sequence 1.

In a specific embodiment of the invention, the polynucleotide sequence encoding the guide RNA is operably linked to regulatory sequence 2.

Further, in a specific embodiment of the present invention, the regulatory sequence 1 and regulatory sequence 2 are the same or different sequences.

In a preferred embodiment of the invention, the regulatory sequences are selected from the group consisting of: one or more of a promoter, an enhancer, an internal ribosome entry site and a transcriptional termination signal; such as a constitutive promoter, an inducible promoter, a broad-spectrum promoter or a tissue-specific promoter, and/or, such as a polyadenylation signal or a poly U sequence.

In a specific embodiment of the invention, the backbone of the recombinant vector is an adeno-associated viral vector, a lentiviral vector, a ribonucleoprotein complex or a virus-like particle.

In another aspect, the present invention provides a technical solution that is: a method of detecting, binding or cleaving a target nucleic acid, the method comprising contacting a target nucleic acid with a Cas protein according to the invention, a guide RNA according to the invention, a fusion protein or conjugate according to the invention, a nucleic acid according to the invention, a CRISPR-Cas system according to the invention or a vector system according to the invention.

In a preferred embodiment of the invention, the method is a method for non-diagnostic and/or non-therapeutic purposes; and/or the fusion protein or conjugate comprises a detectable label, e.g., a label detectable by fluorescence, Southern blotting, or FISH.

In a more preferred embodiment of the invention, when the method is cleavage of a target nucleic acid, the method further comprises performing the cleavage reaction using a cleavage buffer (Cut Buffer). The cleavage buffer may be any suitable buffer known in the art for cleavage of a target nucleic acid by a Cas protein.

In another aspect, the present invention provides a technical solution that is: a method of altering a cell state, the method comprising contacting a cell with a Cas protein according to the invention, a guide RNA according to the invention, a fusion protein or conjugate according to the invention, a nucleic acid according to the invention, a CRISPR-Cas system according to the invention, or a vector system according to the invention, thereby altering a cell state.

In some embodiments of the invention, the method results in one or more of the following: an increase or decrease in expression of a particular gene, in vitro or in vivo induction of cellular senescence, in vitro or in vivo cell cycle arrest, in vitro or in vivo promotion of cell growth and/or inhibition of cell growth, in vitro or in vivo induction of anergy, in vitro or in vivo induction of apoptosis, and in vitro or in vivo induction of necrosis.

In a preferred embodiment of the invention, the method is a method for non-diagnostic and/or non-therapeutic purposes.

In another aspect, the present invention provides a technical solution that is: a method of diagnosing, treating or preventing a disease or disorder associated with a target nucleic acid, the method comprising administering to a sample from a subject in need thereof, or to a subject in need thereof, a Cas protein according to the present invention, a guide RNA according to the present invention, a fusion protein or conjugate according to the present invention, a nucleic acid according to the present invention, a CRISPR-Cas system according to the present invention or a vector system according to the present invention.

In a specific embodiment of the invention, the disease or disorder is a disease or disorder of the blood system, an ophthalmic disease or disorder, a disease or disorder of the nervous system, a disease or disorder of the respiratory system, a disease or disorder of the liver, a disease or disorder of the metabolic system, cancer or an infectious disease.

In another aspect, the present invention provides a technical solution that is: a Cas protein according to the invention, a guide RNA according to the invention, a fusion protein or conjugate according to the invention, a nucleic acid according to the invention, a CRISPR-Cas system according to the invention or a vector system according to the invention for use in the diagnosis, treatment or prevention of a disease or disorder associated with a target nucleic acid.

Detailed Description

In the present invention, unless otherwise indicated, scientific and technical terms used herein have the meanings commonly understood by one of ordinary skill in the art. Further, the procedures of molecular genetics, nucleic acid chemistry, molecular biology, biochemistry, cell culture, microbiology, cell biology, genomics and recombinant DNA, etc., as used herein, are all conventional procedures widely used in the corresponding field. Meanwhile, in order to better understand the present invention, definitions and explanations of related terms are provided below.

In the present invention, "plural" means two or more.

In the present invention, the letters in the amino acid sequence represent single-letter abbreviations for amino acids well known in the art, as described, for example, in J. Biol. Chem., 243, p. 3558 (1968): alanine: Ala-A, arginine: Arg-R, aspartic acid: Asp-D, cysteine: Cys-C, glutamine: Gln-Q, glutamic acid: Glu-E, histidine: His-H, glycine: Gly-G, asparagine: Asn-N, tyrosine: Tyr-Y, proline: Pro-P, serine: Ser-S, methionine: Met-M, lysine: Lys-K, valine: Val-V, isoleucine: Ile-I, phenylalanine: Phe-F, leucine: Leu-L, tryptophan: Trp-W, threonine: Thr-T.
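For convenience, the three-letter and single-letter abbreviations listed above can be held in a lookup table. The snippet is only an illustrative sketch, and the helper name `three_to_one` is hypothetical.

```python
# Three-letter to single-letter amino acid abbreviations, as listed above.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asp": "D", "Cys": "C", "Gln": "Q",
    "Glu": "E", "His": "H", "Gly": "G", "Asn": "N", "Tyr": "Y",
    "Pro": "P", "Ser": "S", "Met": "M", "Lys": "K", "Val": "V",
    "Ile": "I", "Phe": "F", "Leu": "L", "Trp": "W", "Thr": "T",
}


def three_to_one(residues):
    """Convert a list of three-letter codes (any capitalization) to a one-letter string."""
    return "".join(THREE_TO_ONE[r.capitalize()] for r in residues)


peptide = three_to_one(["Met", "Ser", "Arg"])   # "MSR"
```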

In the present invention, "amino acid difference" refers to a difference in amino acid residues at a specific position in the amino acid sequence of a protein, including substitution, insertion or deletion.

As is well known to those skilled in the art, in proteins or peptides, two adjacent amino acids each lose an OH or an H and condense by dehydration to form a peptide bond, so that each amino acid is present in the form of an amino acid residue. Thus, in the present disclosure, the terms "amino acid" and "amino acid residue" generally have the same meaning. In addition, to simplify notation, a substitution is written around the position number: the letter before the position represents the original amino acid residue, and the letter after the position represents the substituted amino acid residue. For example, S211 indicates that the original amino acid residue at position 211 is S, and when it is substituted with R, the substitution may be denoted as S211R.
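The substitution notation in the example above (original residue, position, substituted residue) can be parsed and applied mechanically. The following is a minimal sketch, assuming 1-based positions and single-letter codes. The function names and the test sequence are hypothetical and unrelated to SEQ ID NO. 1 or 2.

```python
import re

SUBSTITUTION = re.compile(r"([A-Z])(\d+)([A-Z])")


def parse_substitution(notation):
    """Split e.g. 'S211R' into (original residue, 1-based position, new residue)."""
    match = SUBSTITUTION.fullmatch(notation)
    if match is None:
        raise ValueError(f"not a substitution in the form S211R: {notation!r}")
    original, position, new = match.groups()
    return original, int(position), new


def apply_substitution(sequence, notation):
    """Return the sequence with the substitution applied, checking the original residue."""
    original, position, new = parse_substitution(notation)
    if position < 1 or position > len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != original:
        raise ValueError(f"residue {position} is {sequence[position - 1]}, not {original}")
    return sequence[:position - 1] + new + sequence[position:]


# Made-up sequence with S at position 211, for illustration only.
toy = "M" + "A" * 209 + "S" + "A" * 9
mutated = apply_substitution(toy, "S211R")
```

Checking the original residue before substituting catches numbering mismatches between a notation and the sequence it is applied to.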

In the present invention, if an amino acid is substituted, it means that it is substituted with another amino acid residue different from the original amino acid residue. If the original amino acid is a positively charged amino acid and is described as being substituted with a positively charged amino acid, it means that it is substituted with another positively charged amino acid residue different from the original amino acid residue. For example, if the original amino acid residue is R and it is substituted with a positively charged amino acid, it is substituted with H or K.
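The rule in the example above can be written as a small lookup, taking R, H and K as the positively charged residues in line with that example. The helper name is hypothetical.

```python
# Positively charged residues, per the example above (R may be replaced by H or K).
POSITIVELY_CHARGED = ("H", "K", "R")


def positive_substitutes(original):
    """List the positively charged residues that differ from the original residue."""
    if original not in POSITIVELY_CHARGED:
        raise ValueError(f"{original} is not a positively charged residue")
    return [aa for aa in POSITIVELY_CHARGED if aa != original]


options = positive_substitutes("R")   # ["H", "K"]
```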

Sequence identity

As used herein, the term "identity" or "percent identity" refers to the degree of sequence matching between two polypeptides or between two nucleic acids. When a position in both compared sequences is occupied by the same base or amino acid monomer subunit (e.g., a position in each of two DNA molecules is occupied by adenine, or a position in each of two polypeptides is occupied by lysine), the molecules are identical at that position. The "percent sequence identity" (percent identity) between two sequences is the number of matched positions shared by the two sequences divided by the number of positions compared, × 100%. For example, if 6 out of 10 positions of two sequences match, the two sequences have 60% sequence identity. Typically, the comparison is made when two sequences are aligned to produce maximum sequence identity. Such alignment may be performed using published and commercially available alignment algorithms and programs such as, but not limited to, Clustal Omega, MAFFT, ProbCons, T-Coffee, Probalign and BLAST, from which one of ordinary skill in the art can make a reasonable choice. One skilled in the art can determine suitable parameters for aligning sequences, including, for example, any algorithm required to achieve a superior or optimal alignment over the full length of the compared sequences, and any algorithm required to achieve a superior or optimal alignment over parts of the compared sequences.
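The percent identity calculation defined above can be sketched for two sequences that have already been aligned (for example by one of the listed programs). This is an illustrative sketch under stated assumptions. Columns where both sequences carry a gap are skipped, a gap opposite a residue counts as a compared, non-matching position, and the function name and gap symbol are hypothetical choices.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Matched positions divided by compared positions, times 100, for two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue            # column carries no residue in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no positions to compare")
    return 100.0 * matches / compared


# 6 of 10 positions match, as in the example in the text.
identity = percent_identity("AAAAAAAAAA", "AAAAAACCCC")   # 60.0
```

How gaps are counted varies between tools, so a value computed this way may differ slightly from the figure reported by a given alignment program.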

Protein domains

In some embodiments, the Cas protein or Cas protein inactivating variant is covalently linked or fused to a homologous or heterologous protein domain.

In some embodiments, the protein domain is any one or more selected from the group consisting of: subcellular localization signals, DNA binding domains, protease domains, transcription activation domains, transcription inhibition domains, nuclease domains, deaminase domains, uracil DNA glycosylase domains (UDG), uracil DNA glycosylase inhibition domains (UGI), methylases, demethylases, transcription release factors, histone acetylase domains, histone deacetylase domains, DNA ligases, epitope tags, and reporter domains.

In some embodiments of the invention, the deaminase domain is optionally selected from: APOBEC1, APOBEC3A, APOBEC3B, APOBEC3C, APOBEC3D, APOBEC3F, activation-induced cytidine deaminase (AID), CDA from lamprey, and mutants of adenosine deaminase (TadA) engineered to act on DNA.

In some embodiments, the transcriptional activation domain is optionally selected from: p65, VPR, VP16, VP64, VTR1, VTR2, VTR3, MyoD1, HSF1, RTA, SET7/9, and histone acetyltransferase.

In some embodiments, the transcription repression domain is optionally selected from: KOX1, KAP-1, MAD, FKHR, EGR-1, ERD, SID (e.g., SID4X), tigg, v-ERB-A, MBD2, MBD3, TRa, histone methyltransferase, histone deacetylase (HDAC), nuclear hormone receptors (e.g., estrogen receptor or thyroid hormone receptor), DNMT family members (e.g., DNMT1, DNMT3A, DNMT3B), KRAB domain of MeCP2, ROM2, and AtHD2A.

In some embodiments, the transcription repression domain is a KRAB domain from a KOX1 protein.

In some embodiments, the nuclease domain is optionally selected from FokI, a polypeptide having ssDNA cleavage activity, and a polypeptide having dsDNA cleavage activity.

In some embodiments, the methylase domain is selected from DNA methylases, including but not limited to DNMT1, DNMT3a, DNMT3b.

In some embodiments, the demethylase is selected from TET1CD, TET1, ROS1, DME, DML2, and DML3.

Methylation and demethylation are recognized in the art as important ways of epigenetic gene regulation.

In some embodiments, the homologous or heterologous protein domain is a sequence tag useful for the solubilization, purification, or detection of the fusion protein or conjugate. Suitable protein tag sequences include, but are not limited to, biotin carboxyl carrier protein (BCCP) tags, myc tags, calmodulin tags, FLAG tags, hemagglutinin (HA) tags, polyhistidine tags (also known as His tags), maltose binding protein (MBP) tags, Nus tags, glutathione-S-transferase (GST) tags, green fluorescent protein (GFP) tags, thioredoxin tags, S-tags, Softags (e.g., Softag 1, Softag 3), Strep-tags, biotin ligase tags, FlAsH tags, V5 tags, and SBP tags. Additional suitable sequences will be apparent to those of ordinary skill in the art.

Therapeutic application

Another aspect of the disclosure relates to a pharmaceutical composition comprising a Cas protein according to the present invention, a guide RNA according to the present invention, a Cas protein inactivating variant according to the present invention, a fusion protein or conjugate according to the present invention, a nucleic acid according to the present invention, a CRISPR-Cas system according to the present invention, a vector system according to the present invention, a delivery system according to the present invention, or a cell according to the present invention. The pharmaceutical composition can comprise, for example, an AAV vector encoding a Cas protein or a Cas protein inactivating variant and a guide RNA described herein. The pharmaceutical composition can comprise, for example, a lipid nanoparticle comprising a guide RNA described herein and an mRNA encoding a Cas protein. The pharmaceutical composition can comprise, for example, a lentiviral vector comprising a guide RNA as described herein and an mRNA encoding a Cas protein. The pharmaceutical composition can comprise, for example, a virus-like particle comprising a guide RNA and a Cas protein described herein or a ribonucleoprotein complex formed from the guide RNA and Cas protein.

Another aspect of the disclosure relates to the use of a Cas protein according to the present invention, a guide RNA according to the present invention, a fusion protein or conjugate according to the present invention, a nucleic acid according to the present invention, a CRISPR-Cas system according to the present invention or a vector system according to the present invention for cleaving or editing a target nucleic acid in a mammalian cell.

Another aspect of the disclosure relates to the use of a Cas protein according to the present invention, a guide RNA according to the present invention, a fusion protein or conjugate according to the present invention, a nucleic acid according to the present invention, a CRISPR-Cas system according to the present invention or a vector system according to the present invention in any of the following: cleavage or nicking of one or more target nucleic acid molecules, activating or upregulating expression of one or more target nucleic acid molecules, activating or inhibiting transcription of one or more target nucleic acid molecules, inactivating one or more target nucleic acid molecules, visualizing, labeling or detecting one or more target nucleic acid molecules, binding to one or more target nucleic acid molecules, transporting one or more target nucleic acid molecules, and masking one or more target nucleic acid molecules.

Another aspect of the disclosure relates to the use of a Cas protein according to the present invention, a guide RNA according to the present invention, a fusion protein or conjugate according to the present invention, a nucleic acid according to the present invention, a CRISPR-Cas system according to the present invention or a vector system according to the present invention to modify one or more target nucleic acid molecules comprising one or more of the following: nucleobase substitution, nucleobase deletion, nucleobase insertion, fragmentation of a target nucleic acid, nucleic acid methylation, and nucleic acid demethylation.

Another aspect of the disclosure relates to the use of a Cas protein according to the present invention, a guide RNA according to the present invention, a fusion protein or conjugate according to the present invention, a nucleic acid according to the present invention, a CRISPR-Cas system according to the present invention or a vector system according to the present invention in the diagnosis, treatment or prevention of a disease or disorder associated with a target nucleic acid.

Another aspect of the present disclosure relates to the use of a Cas protein according to the present invention, a guide RNA according to the present invention, a fusion protein or conjugate according to the present invention, a nucleic acid according to the present invention, a CRISPR-Cas system according to the present invention or a vector system according to the present invention for the manufacture of a medicament for the diagnosis, treatment or prevention of a disease.

Another aspect of the present disclosure relates to the use of a Cas protein, a guide RNA, a fusion protein or conjugate, a nucleic acid, a CRISPR-Cas system or a vector system according to the present invention in the manufacture of a medicament for diagnosing, treating or preventing a disease associated with a target nucleic acid.

In some embodiments, the pharmaceutical composition is delivered to a human subject in vivo. The pharmaceutical composition may be delivered by any effective route. Exemplary routes of administration include, but are not limited to, intravenous infusion, intravenous injection, intraperitoneal injection, intramuscular injection, intratumoral injection, subcutaneous injection, intradermal injection, intraventricular injection, intravascular injection, intracerebral injection, intraocular injection, subretinal injection, intravitreal injection, intracameral injection, intrathecal injection, intranasal administration, and inhalation.

Examples

The invention is further illustrated by the following examples, which are not intended to limit the scope of the invention. Experimental methods for which specific conditions are not noted in the following examples were carried out according to conventional methods and conditions, or according to the manufacturer's instructions.

Example 1 Preparation and purification of Cas proteins

Through bioinformatics analysis combined with AI, the inventors identified two novel Cas12 proteins, named Cas12i-Z1 and Cas12i-Z2.

Amino acid sequence of Cas12i-Z1 protein (SEQ ID NO: 1):

MTHYVDPTRVAYWDQMPPDITACPNIQFQNSVPPNLAIPCQATLTYAVATYIRTIQATYVNPQAPFITATYISPQFATNVNPQAPFITATYISPNFATNVNPQAPFITATYVSPNFATNVNPQAPFITATYISPAQVTNINPQAPFITATYVSPNFATNVNPQAPFITATYISPAQVTNINPQAPNIQFNSIQNINITNPNVSIQNATNAQFYNPVFQFNSIQFINVTYPTVLQYQTNEVKVPTQFPTNISSTYFAFAPQYFPQTQSPNIFAFAPTDTPSIQSTQYFAFVPTLFPYTQSPNFFAFAPTPTPSYQSTQYFNFVPVQFPYTQSPTDFAFVPQTFPQTQSPNPTVPVPVTGPSLDSPTNFAFVPYTFFYTPSPAPFAFAPIQFPQNTSPFYNVFIPYTFPSNQSQFTYNVVGFHCQPIEQNTLTPILAKCARQDNTYTQPIAHLQTPNYEVNYQDTVSAQFATNAQFYVPNFQFNKIQFINVTNPNLSIQNATNAQFYNPNFQFNSIQFINVTYPTVLQYQTNAVKVPTQFPVNTSSNYFAFVPQPFPSLNSPDHFATVPQTFPSTQSTNRNAFVPYINPSTNSPTLFATVPDIFPSQHSPNTFVFVPTDFPKNQSPAFTAFVPELTPQAPSITYTAFVPQLFPSIQSPFYFAFIPYTFPSNQSQFTYVAVQFTFNFIQNAATPIYAKCARQDFTYTQPIAHLQTPNTEVNYQDTVSAQFATNVNFYMPNFQMNSIQFINVTYTPILQCQNSEVKVPTQFPENTSSTYFAFVPDEFPSTQSPTFFVFVPIWFPRYQSPVQNAFIPYTFPQTQSPNPTVPVPVTGPEIDSPIPFAFVPTYFFYTPSPEPFAFAPQTFPQNTSPFYFAFVPYTFPSNTSQFTYVAVGFHCTPVNTPFTPILVKCASFPNPNTPTIKYQNPNTEVNYDPFKNLAAQHCNQCEKHIAHFGIEPCEFPAYYRPVPNVDPTYNSVEICDYAKHVFTELNVSANCVYQPNARNIQFTYATIQTLNAYPTYPELYWHAYPKRMTVEAECDKTFANAQPDPCYRWVPT

Amino acid sequence of Cas12i-Z2 protein (SEQ ID NO: 2):

MTHFWDPTRWIFYDQLPPDATICPNAQVQNWWPPNMIAPCQITMTYIWITYARTIQITYWNPQIPVATITYASPQVITNWNPQIPVATITYASPNVITNWNPQIPVATITYWSPNVITNWNPQIPVATITYASPIQWTNANPQIPVATITYWSPNVITNWNPQIPVATITYASPIQWTNANPQIPNAQVNSAQNANATNPNWSAQNITNIQVFLPNVNVNSIQVANWTYPTWMQFQTNEWKWPTQVPTNASSTFVIVIPQFVPQTQSPNAVIVIPTDTPSIQSTQFVIVWPTMVPFTQSPNSVIVIPTPTPSFQSTQFVNVWPWQVPFTQSPTDVIVWPQTVPQTQSPNPTWPWPWTGPSMDSPTNVIVWPFTNVFTPSPIPVIVIPIQVPQNTSPVFNWVAPFTVPSNQSQVTYWWWGVHCQPAEQNTMTPAMIKCIRQDNTFTQPAIHMQTPNFEWNFQDTWSIQVITNIQVFWPNVQVNKAQVANWTNPNMSAQNITNIQVFNPNVQVNSAQVANWTYPTWMQFQTNIWKWPTQVPWNTSSNFVIVWPQPVPSMNSPDHVITWPQTVPSTQSTNRNIVWPFANPSTNSPTMVITWPDAVPSQHSPNTVWVWPTDVPKNQSPIVTIVWPEMTPQIPSATFTIVWPQMVPSIQSPVFVIVAPFTVPSNQSQVTYWIWQVTVNVAQNIITPAFIKCIRQDVTFTQPAIHMQTPNTEWNFQDTWSIQVITNWNVFLPNVQLNSAQVANWTYTPAMQCQNSEWKWPTQVPTNASVTFVIVWPDWVPSTQSPTNVWVWPMIVPRFQSPPQVIVAPFTVPHTQSPNPTWPWPWTGPSMDSPAPVIVWPFTNVFTPSPIPVIVIPQTVPQNTSPVFVIVWPFTVPSNTSQVTYWIWGVHCTPWNTPVTPAMWKCISVPNPNTPTAKFQNPNTEWNFDPVKNMIIQHCNQCEKHAIHVGAEPCEVPIFFRPWPNWDPTYNSWEACDFIKHWVTEMNWSINCWFQPNIRNAQVTFITIQTMNIFPTYPEMFYHIFPKRLTWEIECDKTVINIQPDPCFRYWPT

DR sequence (SEQ ID NO: 3) corresponding to Cas12i-Z1 protein:

AGAGAGTTCGCGTAGTTCTGTAGTGTGGAACTTGAGAC

DR sequence (SEQ ID NO: 4) corresponding to Cas12i-Z2 protein:

GCCGGTAGTAATCCCCGAGCTCAGCGCAGCCGGATG
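The claims place the guide sequence 3' of the repeat (DR) sequence. A minimal sketch of assembling such a guide RNA from the DR sequence of Cas12i-Z1 is given below. It assumes the DR is listed in DNA letters and is converted to RNA by replacing T with U; the function name and the 20 nt spacer are hypothetical placeholders and are not the guide sequences of SEQ ID NO: 12-15.

```python
DR_Z1_DNA = "AGAGAGTTCGCGTAGTTCTGTAGTGTGGAACTTGAGAC"  # SEQ ID NO: 3


def build_guide_rna(dr_dna: str, spacer_dna: str) -> str:
    """Return 5'-DR-spacer-3' as an RNA string (T replaced by U)."""
    for name, seq in (("repeat", dr_dna), ("spacer", spacer_dna)):
        if not seq or set(seq.upper()) - set("ACGT"):
            raise ValueError(f"{name} must be a non-empty A/C/G/T string")
    # The guide (spacer) sequence is placed 3' of the direct repeat.
    return (dr_dna + spacer_dna).upper().replace("T", "U")


# Hypothetical spacer, for illustration only.
guide = build_guide_rna(DR_Z1_DNA, "GATTACAGATTACAGATTAC")
print(guide)
```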

1. Vector construction

The pET28a vector plasmid was digested with BamHI and XhoI, and the linearized vector was recovered by agarose gel electrophoresis. DNA (SEQ ID NO: 5, SEQ ID NO: 6) containing the coding sequences of the Cas proteins (Cas12i-Z1 and Cas12i-Z2) was synthesized by a commercial service provider, double-digested with BamHI and XhoI, and ligated to the linearized pET28a vector with T4 ligase, yielding the recombinant vectors pET28a-Cas12i-Z1 (SEQ ID NO: 7) and pET28a-Cas12i-Z2 (SEQ ID NO: 8). The ligation reaction was transformed into Stbl3 competent cells, which were plated on LB plates containing kanamycin sulfate; after overnight culture at 37 ℃, clones were picked and verified by sequencing.

Positive clones with the correct sequence were picked and cultured overnight; the plasmid was extracted and transformed into the expression strain Rosetta (DE3), and the cells were plated on LB plates containing kanamycin sulfate and cultured overnight at 37 ℃.

2. Protein expression

A single colony was inoculated into 5 mL of LB medium containing kanamycin sulfate and cultured overnight at 37 ℃.

The overnight culture was inoculated at a ratio of 1:100 into 500 mL of LB medium containing kanamycin sulfate and cultured at 37 ℃ and 220 rpm to an OD of 0.6; IPTG was then added to a final concentration of 0.2 mM, followed by induction at 16 ℃ for 24 hours.

The cells were rinsed with 15 mL of PBS and collected by centrifugation, lysis buffer was added, and the cells were disrupted by sonication. After centrifugation at 10000 g for 30 min, the supernatant containing the recombinant protein was collected, filtered through a 0.45 μm filter membrane, and purified by column chromatography.
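The inoculation and induction volumes in the expression step follow from simple dilution arithmetic, sketched below. The 1:100 ratio, the 500 mL culture, and the 0.2 mM final IPTG concentration are from the text; the 1 M IPTG stock is an assumption made only for illustration, and the small volume added by the stock is ignored.

```python
def inoculum_volume_ml(culture_ml: float, ratio: float) -> float:
    """Volume of starter culture for a 1:ratio inoculation."""
    return culture_ml / ratio


def stock_volume_ul(final_mm: float, culture_ml: float, stock_mm: float) -> float:
    """C1*V1 = C2*V2, solved for the stock volume V1 and returned in microliters."""
    return final_mm * culture_ml / stock_mm * 1000.0


starter_ml = inoculum_volume_ml(500.0, 100.0)
# Assumed IPTG stock concentration: 1 M (1000 mM); not stated in the text.
iptg_ul = stock_volume_ul(0.2, 500.0, 1000.0)
print(f"starter culture: {starter_ml:.1f} mL, IPTG stock: {iptg_ul:.0f} uL")
```

Under these assumptions the 5 mL overnight culture is exactly the volume needed for a 1:100 inoculation of 500 mL.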

3. Protein purification

Purification was performed on a ProPac™ IMAC-10 HPLC column, using the N-terminal 6×His tag as the purification tag (eluent A: 20 mM HEPES + 0.5 M NaCl, 25 mM imidazole, pH 7.5; eluent B: 20 mM HEPES + 0.5 M NaCl, 500 mM imidazole, pH 7.5; elution gradient A/B from 100%/0% to 0%/100%; flow rate 0.5 mL/min; UV detection at 280 nm). Recombinant Cas12i-Z1 and Cas12i-Z2 proteins (recombinant protein structure: His tag-NLS-Cas-NLS-NLS) were obtained; their sequences are SEQ ID NO: 9 and SEQ ID NO: 10, respectively. Each purified recombinant protein ran as a single band on SDS-PAGE.
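Because eluents A and B differ only in imidazole (25 mM versus 500 mM), the imidazole concentration at any point of the gradient follows directly from the percentage of eluent B. A minimal sketch of that mixing calculation is given below; the function name is hypothetical, and the time profile of the gradient is not stated in the text and is not modeled.

```python
IMIDAZOLE_A_MM = 25.0   # eluent A
IMIDAZOLE_B_MM = 500.0  # eluent B


def imidazole_mm(percent_b: float) -> float:
    """Imidazole concentration (mM) of an A/B mixture containing percent_b % eluent B."""
    if not 0.0 <= percent_b <= 100.0:
        raise ValueError("percent_b must be between 0 and 100")
    fraction_b = percent_b / 100.0
    return IMIDAZOLE_A_MM + (IMIDAZOLE_B_MM - IMIDAZOLE_A_MM) * fraction_b


for pct in (0, 50, 100):
    print(pct, imidazole_mm(pct))
```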

Example 2 Determination of the PAM sequences recognized by the Cas proteins

In this example, an sgRNA (single guide RNA) containing a specific guide sequence was mixed with the recombinant protein purified in Example 1 and used to cleave an in vitro cleavage substrate (containing a spacer sequence and a 7 nt random sequence) at 37 ℃. The cleavage products were then purified, a library was constructed, and NGS sequencing and analysis were performed to determine the PAM sequence recognized by each Cas protein.

The designed in vitro cleavage substrate sequence is shown as SEQ ID NO. 11.

N in the sequence represents any of A, T, C, or G.

The cleavage substrates were sent to a sequencing company for PCR-free library construction and NGS sequencing.
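Since each N can be any of four bases, a 7 nt random region corresponds to 4^7 = 16384 possible sequences. The short sketch below enumerates them; the function name is hypothetical, and the sketch says nothing about how evenly the variants are represented in the actual substrate.

```python
from itertools import product


def expand_degenerate(n_positions: int, alphabet: str = "ACGT") -> list[str]:
    """Enumerate every sequence of n_positions fully degenerate (N) bases."""
    return ["".join(p) for p in product(alphabet, repeat=n_positions)]


library = expand_degenerate(7)
print(len(library))  # 16384
```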

Preparation of sgRNA

In a reaction system containing T7 RNA polymerase, the four ribonucleoside triphosphates, and a DNA template carrying a T7 promoter, sgRNA containing a specific guide sequence was synthesized by in vitro transcription at 37 ℃, and the transcription product was purified by LiCl precipitation.

The sgRNA sequences are:

Cas12i-Z1-sgRNA (SEQ ID NO: 12), Cas12i-Z1-sgRNA-Rev (SEQ ID NO: 13),

Cas12i-Z2-sgRNA (SEQ ID NO: 14), Cas12i-Z2-sgRNA-Rev (SEQ ID NO: 15).

PAM library cutting

A reaction system containing Cas12i-Z1 recombinant protein (about 10 mg/mL, 0.5 μL), 2.8 μg of Cas12i-Z1-sgRNA or Cas12i-Z1-sgRNA-Rev, in vitro cleavage substrate (60 ng/μL, 36 μL), and buffer (200 mM HEPES, 1 M NaCl, 50 mM MgCl₂, 1 mM EDTA; 5 μL) was prepared and reacted at 37 ℃ for 3 h, then at 75 ℃ for 15 min.

A reaction system containing Cas12i-Z2 recombinant protein (about 10 mg/mL, 0.5 μL), 2.8 μg of Cas12i-Z2-sgRNA or Cas12i-Z2-sgRNA-Rev, in vitro cleavage substrate (60 ng/μL, 36 μL), and buffer (200 mM HEPES, 1 M NaCl, 50 mM MgCl₂, 1 mM EDTA; 5 μL) was prepared and reacted at 37 ℃ for 3 h, then at 75 ℃ for 15 min.

T4 DNA Polymerase treatment and filling in of cleavage products

T4 DNA Polymerase (Thermo Scientific) was added to the cleavage product, and the reaction was carried out at 37 ℃ for 20 min and then at 85 ℃ for 10 min.

3'-terminal addition of A and addition of biotin-labeled linker

a. 78 μL of SPRIselect beads (Beckman Coulter) was added to the T4 DNA Polymerase reaction product, mixed, and left at room temperature for 5 min; the tube was placed on a magnetic rack for 5 min, and the supernatant was transferred to a new 1.5 mL tube. A further 39 μL of SPRIselect beads (Beckman Coulter) was added, mixed, and left at room temperature for 5 min; the tube was placed on a magnetic rack for 5 min, the supernatant was discarded, and the beads were washed twice with 85% ethanol, air-dried at room temperature for 10 min, and eluted with 50 μL of ddH₂O.

b. A 3' A was added to the product of step a using the SYNPLSEQ DNA Library Prep Kit for Illumina, with the program 37 ℃ for 10 min, 65 ℃ for 20 min, and hold at 4 ℃.

c. Adapter 1 was obtained by annealing an upstream primer 5'-/5Biosg/GTTGACATGCTGGATTGAGACTTCCTACACTCTTTCCCTACACGACGCTCTTCCGATC*t (SEQ ID NO: 16), where * represents a phosphorothioate modification at the t base, and a downstream primer GATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGGAAGTCTCAATCCAGCATGTCAAC (SEQ ID NO: 17). Adapter 1 was added, and the reaction was carried out at 20 ℃ for 30 min and then at 16 ℃ overnight. The reaction product was purified using SPRIselect beads.

d. The reaction product was purified using streptavidin-labeled magnetic beads, Dynabeads™ M-280 Streptavidin (Invitrogen).

e. Recover PCR

Primers were designed and a Recover PCR reaction was performed using Q5 Hot Start High-Fidelity 2X Master Mix (NEB).

Recover PCR primer F: GGAGTTCAGACGTGTGCTC (SEQ ID NO: 18)

Recover PCR primer R: GTTGACATGCTGGATTGAGACTTC (SEQ ID NO: 19)

The Recover PCR product was placed on a magnetic rack for 5 min, the supernatant was transferred to a fresh 1.5 mL centrifuge tube, and 3 μL of the Recover PCR product was diluted with 148.5 μL of ddH₂O.

g. Index PCR

Index PCR was performed using the following primers.

Index PCR primer IF501:

aatgatacggcgaccaccgagatctacactatagcctacactctttccctacacgacg (SEQ ID NO: 20)

Index PCR primer IR701:

caagcagaagacggcatacgagatcgagtaatgtgactggagttcagacgtgtgctc (SEQ ID NO: 21).

The Index PCR product was purified with 0.7x SPRIselect beads and eluted with 38 μL of ddH₂O; the concentration was determined by Qubit, and the library was sent for NGS sequencing.
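The two Adapter 1 oligonucleotides from step c can be checked for complementarity with a short script. The sketch below uses only the base sequences of SEQ ID NO: 16 and SEQ ID NO: 17 (the 5' biotin and the phosphorothioate bond are not modeled) and reports the unpaired 3' tail of the upstream strand; it also checks that Recover PCR primer R matches the 5' end of the upstream strand. The function names are hypothetical.

```python
# Base sequences only; the final "T" is the phosphorothioate-linked t of SEQ ID NO: 16.
UPSTREAM = "GTTGACATGCTGGATTGAGACTTCCTACACTCTTTCCCTACACGACGCTCTTCCGATC" + "T"
DOWNSTREAM = "GATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGGAAGTCTCAATCCAGCATGTCAAC"  # SEQ ID NO: 17
RECOVER_R = "GTTGACATGCTGGATTGAGACTTC"  # SEQ ID NO: 19

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def three_prime_overhang(top: str, bottom: str) -> str:
    """Unpaired 3' tail of `top` when `bottom` pairs with the 5' part of `top`."""
    paired = reverse_complement(bottom)
    if not top.startswith(paired):
        raise ValueError("bottom strand does not pair with the 5' part of top strand")
    return top[len(paired):]


print(three_prime_overhang(UPSTREAM, DOWNSTREAM))  # single-base 3' overhang
print(UPSTREAM.startswith(RECOVER_R))
```

A single 3' T on the annealed adapter is consistent with ligation to the 3' A added in step b.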

NGS result analysis: following the method of the reference (A compact Cas9 ortholog from Staphylococcus auricularis (SauriCas9) expands the DNA targeting scope. PLoS Biology, 2020, 18(3), e3000686), the captured 7 nt random sequences were analyzed with WebLogo software. The PAM sequences were thereby identified: the PAM sequence recognized by Cas12i-Z1 is 5'-TTN-3', and the PAM sequence recognized by Cas12i-Z2 is 5'-TTN-3'.
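A degenerate PAM such as 5'-TTN-3' can be tested against candidate sites with a small pattern match. The sketch below is illustrative only and is not the analysis pipeline of the cited reference; the function names are hypothetical, and only the codes A, C, G, T, and N are supported.

```python
import re
from itertools import product

# Only the codes needed for 5'-TTN-3'.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def pam_regex(pam: str) -> re.Pattern:
    """Compile a degenerate PAM (A/C/G/T/N) into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in pam.upper()))


def matches_pam(site: str, pam: str = "TTN") -> bool:
    """True if `site` (written 5' to 3') has the same length as the PAM and matches it."""
    return pam_regex(pam).fullmatch(site.upper()) is not None


matching = ["".join(p) for p in product("ACGT", repeat=3) if matches_pam("".join(p))]
print(matching)  # ['TTA', 'TTC', 'TTG', 'TTT']
```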

Example 3.

The Cas12i-Z1-N2-Target plasmid (SEQ ID NO: 23) was constructed by a commercial service provider, and an sgRNA targeting the plasmid was synthesized (SEQ ID NO: 22, N2-sgRNA). The Cas12i-Z1-N2-Target plasmid was linearized by digestion with XmnI (NEB, R0194) at 37 ℃; after the reaction was complete, the product was purified with the Wizard SV Gel and PCR Clean-Up System (Promega, A9282), and the concentration was determined with a NanoDrop.

The Cas12i-Z1 recombinant protein prepared in Example 1 (10 mg/mL, 0.5 μL) was mixed with the linearized plasmid described above (140 ng/μL, 18 μL), N2-sgRNA (180 ng/μL, 15 μL), and 10x Cut Buffer (200 mM HEPES/1 M NaCl/50 mM MgCl₂/1 mM EDTA, 5 μL), and ultrapure water was added to 50 μL. The reaction was carried out at 37 ℃ for 1 h, followed by a 75 ℃ water bath for 10 min.

6 μL of loading buffer was added to the reaction product, and 30 μL was analyzed by electrophoresis. Cleavage fragments were observed at 1000-2000 bp, demonstrating that the Cas12i-Z1 protein cleaves the target nucleic acid.
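The composition of the 50 μL cleavage reaction can be worked out from the stated volumes and concentrations. The sketch below computes the water volume, the input masses, and the final buffer concentrations; all input numbers are taken from the text, the variable names are hypothetical, and buffer components carried in with the protein or nucleic acid stocks are ignored.

```python
TOTAL_UL = 50.0

# Volumes (uL) of each component as stated in the text.
volumes_ul = {
    "Cas12i-Z1 protein": 0.5,
    "linearized plasmid": 18.0,
    "N2-sgRNA": 15.0,
    "10x Cut Buffer": 5.0,
}
water_ul = TOTAL_UL - sum(volumes_ul.values())

protein_ug = 10.0 * volumes_ul["Cas12i-Z1 protein"]    # 10 mg/mL equals 10 ug/uL
plasmid_ng = 140.0 * volumes_ul["linearized plasmid"]  # ng/uL times uL
sgrna_ng = 180.0 * volumes_ul["N2-sgRNA"]

# 10x buffer stock concentrations (mM) and their final values after dilution.
buffer_stock_mm = {"HEPES": 200.0, "NaCl": 1000.0, "MgCl2": 50.0, "EDTA": 1.0}
dilution = volumes_ul["10x Cut Buffer"] / TOTAL_UL
buffer_final_mm = {name: conc * dilution for name, conc in buffer_stock_mm.items()}

print(f"water: {water_ul} uL")
print(f"protein: {protein_ug} ug, plasmid: {plasmid_ng} ng, sgRNA: {sgrna_ng} ng")
print(buffer_final_mm)
```

With these inputs the buffer is at 1x in the final reaction (20 mM HEPES, 100 mM NaCl, 5 mM MgCl₂, 0.1 mM EDTA).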

Claims

1. A Cas protein, characterized in that the amino acid sequence of the Cas protein is as shown in SEQ ID NO. 1.

2. A guide RNA, comprising:

(i) a direct repeat sequence, wherein the direct repeat sequence is the sequence shown as SEQ ID NO. 3; and

(ii) a guide sequence engineered to hybridize to a target nucleic acid; the direct repeat sequence is linked to the guide sequence, and the guide RNA is capable of forming a complex with the Cas protein of claim 1 and directing the specific binding of the complex to the sequence of the target nucleic acid;

the guide sequence is located 3' to the direct repeat sequence.

3. A fusion protein comprising the following elements:

(1) the Cas protein of claim 1, and

(2) a homologous or heterologous functional domain;

the homologous or heterologous functional domain is any one or more selected from the following: subcellular localization signals, transcriptional activation domains, transcriptional inhibition domains, deaminase domains, methylases, and demethylases.

4. An isolated nucleic acid encoding the Cas protein of claim 1 or the fusion protein of claim 3.

5. A CRISPR-Cas system, characterized in that the CRISPR-Cas system comprises:

a. the Cas protein of claim 1, the fusion protein of claim 3, or the nucleic acid of claim 4; and

b. the guide RNA of claim 2, or a polynucleotide sequence encoding the guide RNA.