CN117683749B - Cas proteins and uses thereof
- Publication number: CN117683749B
- Application number: CN202410155116.6A
- Authority: CN (China)
- Prior art keywords: sequence, cas protein, cas, protein, nucleic acid
- Prior art date
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- 201000002150 Progressive familial intrahepatic cholestasis Diseases 0.000 description 1
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Natural products OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 description 1
- 102100038955 Proprotein convertase subtilisin/kexin type 9 Human genes 0.000 description 1
- 102100023347 Proto-oncogene tyrosine-protein kinase ROS Human genes 0.000 description 1
- 201000001263 Psoriatic Arthritis Diseases 0.000 description 1
- 208000036824 Psoriatic arthropathy Diseases 0.000 description 1
- 108700014121 Pyruvate Kinase Deficiency of Red Cells Proteins 0.000 description 1
- 102100038187 RNA binding protein fox-1 homolog 2 Human genes 0.000 description 1
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 1
- 101150065817 ROM2 gene Proteins 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 208000006265 Renal cell carcinoma Diseases 0.000 description 1
- 108700008625 Reporter Genes Proteins 0.000 description 1
- 102100033617 Retinal-specific phospholipid-transporting ATPase ABCA4 Human genes 0.000 description 1
- 208000007014 Retinitis pigmentosa Diseases 0.000 description 1
- 102100040756 Rhodopsin Human genes 0.000 description 1
- 108090000820 Rhodopsin Proteins 0.000 description 1
- 108091028664 Ribonucleotide Proteins 0.000 description 1
- 102100039174 Rod cGMP-specific 3',5'-cyclic phosphodiesterase subunit beta Human genes 0.000 description 1
- 108091006587 SLC13A5 Proteins 0.000 description 1
- 208000021811 Sandhoff disease Diseases 0.000 description 1
- 206010039491 Sarcoma Diseases 0.000 description 1
- 208000008765 Sciatica Diseases 0.000 description 1
- 206010039710 Scleroderma Diseases 0.000 description 1
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 1
- 206010073677 Severe myoclonic epilepsy of infancy Diseases 0.000 description 1
- 208000021386 Sjogren Syndrome Diseases 0.000 description 1
- 206010040954 Skin wrinkling Diseases 0.000 description 1
- 102100033927 Sodium- and chloride-dependent GABA transporter 1 Human genes 0.000 description 1
- 102100035210 Solute carrier family 13 member 5 Human genes 0.000 description 1
- 238000002105 Southern blotting Methods 0.000 description 1
- 208000033145 Spinal muscular atrophy with respiratory distress type 1 Diseases 0.000 description 1
- 208000000102 Squamous Cell Carcinoma of Head and Neck Diseases 0.000 description 1
- 208000007718 Stable Angina Diseases 0.000 description 1
- 208000006011 Stroke Diseases 0.000 description 1
- 102100021947 Survival motor neuron protein Human genes 0.000 description 1
- 208000001871 Tachycardia Diseases 0.000 description 1
- 208000002903 Thalassemia Diseases 0.000 description 1
- 241001052560 Thallis Species 0.000 description 1
- 102100036407 Thioredoxin Human genes 0.000 description 1
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 1
- 239000004473 Threonine Substances 0.000 description 1
- 108020004440 Thymidine kinase Proteins 0.000 description 1
- 208000024770 Thyroid neoplasm Diseases 0.000 description 1
- 102100022012 Transcription intermediary factor 1-beta Human genes 0.000 description 1
- 206010048873 Traumatic arthritis Diseases 0.000 description 1
- 208000003721 Triple Negative Breast Neoplasms Diseases 0.000 description 1
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 1
- 206010067584 Type 1 diabetes mellitus Diseases 0.000 description 1
- 208000007930 Type C Niemann-Pick Disease Diseases 0.000 description 1
- 208000000921 Urge Urinary Incontinence Diseases 0.000 description 1
- 208000007097 Urinary Bladder Neoplasms Diseases 0.000 description 1
- 206010046851 Uveitis Diseases 0.000 description 1
- 108091008605 VEGF receptors Proteins 0.000 description 1
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 1
- 102100033177 Vascular endothelial growth factor receptor 2 Human genes 0.000 description 1
- 208000014070 Vestibular schwannoma Diseases 0.000 description 1
- 241000700605 Viruses Species 0.000 description 1
- 208000000208 Wet Macular Degeneration Diseases 0.000 description 1
- 208000018839 Wilson disease Diseases 0.000 description 1
- 208000005946 Xerostomia Diseases 0.000 description 1
- 201000000761 achromatopsia Diseases 0.000 description 1
- 206010000496 acne Diseases 0.000 description 1
- 208000004064 acoustic neuroma Diseases 0.000 description 1
- 230000003044 adaptive effect Effects 0.000 description 1
- 229960000643 adenine Drugs 0.000 description 1
- 230000001780 adrenocortical effect Effects 0.000 description 1
- 238000000246 agarose gel electrophoresis Methods 0.000 description 1
- 206010064930 age-related macular degeneration Diseases 0.000 description 1
- 238000007605 air drying Methods 0.000 description 1
- 235000004279 alanine Nutrition 0.000 description 1
- 208000006682 alpha 1-Antitrypsin Deficiency Diseases 0.000 description 1
- 201000008333 alpha-mannosidosis Diseases 0.000 description 1
- 206010002224 anaplastic astrocytoma Diseases 0.000 description 1
- 208000007502 anemia Diseases 0.000 description 1
- 238000000137 annealing Methods 0.000 description 1
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 1
- 235000009582 asparagine Nutrition 0.000 description 1
- 229960001230 asparagine Drugs 0.000 description 1
- 235000003704 aspartic acid Nutrition 0.000 description 1
- 201000008937 atopic dermatitis Diseases 0.000 description 1
- 208000010668 atopic eczema Diseases 0.000 description 1
- 208000029560 autism spectrum disease Diseases 0.000 description 1
- 201000000751 autosomal recessive congenital ichthyosis Diseases 0.000 description 1
- 201000000527 autosomal recessive distal spinal muscular atrophy 1 Diseases 0.000 description 1
- 201000009563 autosomal recessive limb-girdle muscular dystrophy type 2B Diseases 0.000 description 1
- 201000009561 autosomal recessive limb-girdle muscular dystrophy type 2D Diseases 0.000 description 1
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 1
- 230000002146 bilateral effect Effects 0.000 description 1
- 230000008238 biochemical pathway Effects 0.000 description 1
- 238000003766 bioinformatics method Methods 0.000 description 1
- 230000033228 biological regulation Effects 0.000 description 1
- 208000014905 bone marrow failure syndrome Diseases 0.000 description 1
- 210000004556 brain Anatomy 0.000 description 1
- 238000004113 cell culture Methods 0.000 description 1
- 230000025084 cell cycle arrest Effects 0.000 description 1
- 230000010261 cell growth Effects 0.000 description 1
- 230000010094 cellular senescence Effects 0.000 description 1
- 230000011088 chloroplast localization Effects 0.000 description 1
- 208000003571 choroideremia Diseases 0.000 description 1
- 238000004587 chromatography analysis Methods 0.000 description 1
- 208000020832 chronic kidney disease Diseases 0.000 description 1
- 208000016617 citrullinemia type I Diseases 0.000 description 1
- 201000006145 cocaine dependence Diseases 0.000 description 1
- 201000007254 color blindness Diseases 0.000 description 1
- 238000004440 column chromatography Methods 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 239000013078 crystal Substances 0.000 description 1
- 238000005520 cutting process Methods 0.000 description 1
- 235000018417 cysteine Nutrition 0.000 description 1
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 238000012217 deletion Methods 0.000 description 1
- 230000037430 deletion Effects 0.000 description 1
- 206010012601 diabetes mellitus Diseases 0.000 description 1
- 201000011190 diabetic macular edema Diseases 0.000 description 1
- 208000016097 disease of metabolism Diseases 0.000 description 1
- 208000021347 distal spinal muscular atrophy 1 Diseases 0.000 description 1
- 208000011325 dry age related macular degeneration Diseases 0.000 description 1
- 206010013781 dry mouth Diseases 0.000 description 1
- 239000003623 enhancer Substances 0.000 description 1
- 208000004298 epidermolysis bullosa dystrophica Diseases 0.000 description 1
- 230000001973 epigenetic effect Effects 0.000 description 1
- 230000001037 epileptic effect Effects 0.000 description 1
- 201000004101 esophageal cancer Diseases 0.000 description 1
- 102000015694 estrogen receptors Human genes 0.000 description 1
- 108010038795 estrogen receptors Proteins 0.000 description 1
- 238000002474 experimental method Methods 0.000 description 1
- 229960004222 factor ix Drugs 0.000 description 1
- 229960000301 factor viii Drugs 0.000 description 1
- 238000011049 filling Methods 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 238000013467 fragmentation Methods 0.000 description 1
- 238000006062 fragmentation reaction Methods 0.000 description 1
- 239000000499 gel Substances 0.000 description 1
- 238000010362 genome editing Methods 0.000 description 1
- 208000005017 glioblastoma Diseases 0.000 description 1
- 235000013922 glutamic acid Nutrition 0.000 description 1
- 239000004220 glutamic acid Substances 0.000 description 1
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 1
- 201000004502 glycogen storage disease II Diseases 0.000 description 1
- 208000011460 glycogen storage disease due to glucose-6-phosphatase deficiency type IA Diseases 0.000 description 1
- 201000010536 head and neck cancer Diseases 0.000 description 1
- 208000014829 head and neck neoplasm Diseases 0.000 description 1
- 201000000459 head and neck squamous cell carcinoma Diseases 0.000 description 1
- 230000010370 hearing loss Effects 0.000 description 1
- 231100000888 hearing loss Toxicity 0.000 description 1
- 208000038002 heart failure with reduced ejection fraction Diseases 0.000 description 1
- 208000018706 hematopoietic system disease Diseases 0.000 description 1
- 206010073071 hepatocellular carcinoma Diseases 0.000 description 1
- 231100000844 hepatocellular carcinoma Toxicity 0.000 description 1
- 238000004128 high performance liquid chromatography Methods 0.000 description 1
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 1
- 208000033519 human immunodeficiency virus infectious disease Diseases 0.000 description 1
- 206010020718 hyperplasia Diseases 0.000 description 1
- 206010020871 hypertrophic cardiomyopathy Diseases 0.000 description 1
- 230000001969 hypertrophic effect Effects 0.000 description 1
- 210000003026 hypopharynx Anatomy 0.000 description 1
- 230000007124 immune defense Effects 0.000 description 1
- 201000001881 impotence Diseases 0.000 description 1
- 208000033065 inborn errors of immunity Diseases 0.000 description 1
- 238000011534 incubation Methods 0.000 description 1
- 230000006882 induction of apoptosis Effects 0.000 description 1
- 230000001939 inductive effect Effects 0.000 description 1
- 238000001802 infusion Methods 0.000 description 1
- 208000017326 inherited epidermolysis bullosa Diseases 0.000 description 1
- 230000002401 inhibitory effect Effects 0.000 description 1
- 238000003780 insertion Methods 0.000 description 1
- 230000037431 insertion Effects 0.000 description 1
- 208000021156 intermittent vascular claudication Diseases 0.000 description 1
- 238000010255 intramuscular injection Methods 0.000 description 1
- 239000007927 intramuscular injection Substances 0.000 description 1
- 239000007928 intraperitoneal injection Substances 0.000 description 1
- 238000007913 intrathecal administration Methods 0.000 description 1
- 230000002601 intratumoral effect Effects 0.000 description 1
- 238000001990 intravenous administration Methods 0.000 description 1
- 238000010253 intravenous injection Methods 0.000 description 1
- 238000007914 intraventricular administration Methods 0.000 description 1
- 230000000302 ischemic effect Effects 0.000 description 1
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 1
- 229960000310 isoleucine Drugs 0.000 description 1
- BPHPUYQFMNQIOC-NXRLNHOXSA-N isopropyl beta-D-thiogalactopyranoside Chemical compound CC(C)S[C@@H]1O[C@H](CO)[C@H](O)[C@H](O)[C@H]1O BPHPUYQFMNQIOC-NXRLNHOXSA-N 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 201000008105 leukocyte adhesion deficiency 1 Diseases 0.000 description 1
- 208000012987 lip and oral cavity carcinoma Diseases 0.000 description 1
- 201000007270 liver cancer Diseases 0.000 description 1
- 208000014018 liver neoplasm Diseases 0.000 description 1
- 239000012160 loading buffer Substances 0.000 description 1
- 230000007774 longterm Effects 0.000 description 1
- 208000002502 lymphedema Diseases 0.000 description 1
- 239000012139 lysis buffer Substances 0.000 description 1
- 208000002780 macular degeneration Diseases 0.000 description 1
- 229910001629 magnesium chloride Inorganic materials 0.000 description 1
- 208000015486 malignant pancreatic neoplasm Diseases 0.000 description 1
- 210000004962 mammalian cell Anatomy 0.000 description 1
- 230000000873 masking effect Effects 0.000 description 1
- 230000001404 mediated effect Effects 0.000 description 1
- 201000001441 melanoma Diseases 0.000 description 1
- 239000012528 membrane Substances 0.000 description 1
- 229930182817 methionine Natural products 0.000 description 1
- ZIYVHBGGAOATLY-UHFFFAOYSA-N methylmalonic acid Chemical compound OC(=O)C(C)C(O)=O ZIYVHBGGAOATLY-UHFFFAOYSA-N 0.000 description 1
- 230000025608 mitochondrion localization Effects 0.000 description 1
- 239000000203 mixture Substances 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 239000000178 monomer Substances 0.000 description 1
- 208000022018 mucopolysaccharidosis type 2 Diseases 0.000 description 1
- 208000036710 mucopolysaccharidosis type 3A Diseases 0.000 description 1
- 208000036709 mucopolysaccharidosis type 3B Diseases 0.000 description 1
- 201000006938 muscular dystrophy Diseases 0.000 description 1
- 230000001114 myogenic effect Effects 0.000 description 1
- 239000002105 nanoparticle Substances 0.000 description 1
- 201000011216 nasopharynx carcinoma Diseases 0.000 description 1
- 230000017074 necrotic cell death Effects 0.000 description 1
- 230000017095 negative regulation of cell growth Effects 0.000 description 1
- 208000004296 neuralgia Diseases 0.000 description 1
- 201000008051 neuronal ceroid lipofuscinosis Diseases 0.000 description 1
- 201000001119 neuropathy Diseases 0.000 description 1
- 230000007823 neuropathy Effects 0.000 description 1
- 210000000440 neutrophil Anatomy 0.000 description 1
- 208000008338 non-alcoholic fatty liver disease Diseases 0.000 description 1
- 208000002154 non-small cell lung carcinoma Diseases 0.000 description 1
- 235000020824 obesity Nutrition 0.000 description 1
- 208000001749 optic atrophy Diseases 0.000 description 1
- 208000020911 optic nerve disease Diseases 0.000 description 1
- 201000011278 ornithine carbamoyltransferase deficiency Diseases 0.000 description 1
- 208000020629 overactive bladder Diseases 0.000 description 1
- 201000002528 pancreatic cancer Diseases 0.000 description 1
- 208000008443 pancreatic carcinoma Diseases 0.000 description 1
- 230000002093 peripheral effect Effects 0.000 description 1
- 208000033808 peripheral neuropathy Diseases 0.000 description 1
- 201000002628 peritoneum cancer Diseases 0.000 description 1
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 1
- 230000008488 polyadenylation Effects 0.000 description 1
- 229920002704 polyhistidine Polymers 0.000 description 1
- 208000014321 polymorphic ventricular tachycardia Diseases 0.000 description 1
- 208000028529 primary immunodeficiency disease Diseases 0.000 description 1
- 230000000750 progressive effect Effects 0.000 description 1
- 238000001742 protein purification Methods 0.000 description 1
- 230000007026 protein scission Effects 0.000 description 1
- 239000002096 quantum dot Substances 0.000 description 1
- 230000005855 radiation Effects 0.000 description 1
- 230000000717 retained effect Effects 0.000 description 1
- 230000004258 retinal degeneration Effects 0.000 description 1
- 208000004644 retinal vein occlusion Diseases 0.000 description 1
- 206010039073 rheumatoid arthritis Diseases 0.000 description 1
- 239000002336 ribonucleotide Substances 0.000 description 1
- 125000002652 ribonucleotide group Chemical group 0.000 description 1
- 231100000241 scar Toxicity 0.000 description 1
- 201000000980 schizophrenia Diseases 0.000 description 1
- 208000002491 severe combined immunodeficiency Diseases 0.000 description 1
- 230000011664 signaling Effects 0.000 description 1
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 1
- 238000005063 solubilization Methods 0.000 description 1
- 230000007928 solubilization Effects 0.000 description 1
- 238000001179 sorption measurement Methods 0.000 description 1
- 125000006850 spacer group Chemical group 0.000 description 1
- 238000010254 subcutaneous injection Methods 0.000 description 1
- 239000007929 subcutaneous injection Substances 0.000 description 1
- 208000011117 substance-related disease Diseases 0.000 description 1
- 230000006794 tachycardia Effects 0.000 description 1
- 230000008685 targeting Effects 0.000 description 1
- RYYWUUFWQRZTIU-UHFFFAOYSA-K thiophosphate Chemical compound [O-]P([O-])([O-])=S RYYWUUFWQRZTIU-UHFFFAOYSA-K 0.000 description 1
- 108060008226 thioredoxin Proteins 0.000 description 1
- 229940094937 thioredoxin Drugs 0.000 description 1
- 206010043554 thrombocytopenia Diseases 0.000 description 1
- 201000002510 thyroid cancer Diseases 0.000 description 1
- 102000004217 thyroid hormone receptors Human genes 0.000 description 1
- 108090000721 thyroid hormone receptors Proteins 0.000 description 1
- 210000001519 tissue Anatomy 0.000 description 1
- 101150069263 tra gene Proteins 0.000 description 1
- 208000022679 triple-negative breast carcinoma Diseases 0.000 description 1
- 208000029729 tumor suppressor gene on chromosome 11 Diseases 0.000 description 1
- 208000001072 type 2 diabetes mellitus Diseases 0.000 description 1
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 1
- 229910021642 ultra pure water Inorganic materials 0.000 description 1
- 239000012498 ultrapure water Substances 0.000 description 1
- 238000011144 upstream manufacturing Methods 0.000 description 1
- 201000005112 urinary bladder cancer Diseases 0.000 description 1
- 239000004474 valine Substances 0.000 description 1
- 206010047302 ventricular tachycardia Diseases 0.000 description 1
- 208000027491 vestibular disease Diseases 0.000 description 1
- 239000013603 viral vector Substances 0.000 description 1
- 201000007790 vitelliform macular dystrophy Diseases 0.000 description 1
- 238000005406 washing Methods 0.000 description 1
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 1
- 230000037303 wrinkles Effects 0.000 description 1
Landscapes
- Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Peptides Or Proteins (AREA)
Abstract
The invention discloses a Cas protein and application thereof. The amino acid sequence of the Cas protein is shown as SEQ ID NO. 1 or 2. The invention also discloses guide RNAs, fusion proteins or conjugates, isolated nucleic acids, CRISPR-Cas systems, vector systems, and uses thereof.
Description
Technical Field
The disclosure relates to the field of CRISPR gene editing, in particular to a Cas protein and application thereof.
Background
CRISPR-Cas systems are an adaptive immune defense that bacteria and archaea have developed over long-term evolution and use against invading viruses and foreign DNA. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and CRISPR-associated protein systems (CRISPR-Cas systems) can directly alter gene sequences in cells, providing a fast and efficient method of gene editing.
Many researchers in the field are working to find new Cas proteins and CRISPR-Cas gene editing systems.
Disclosure of Invention
The invention provides Cas proteins and uses thereof.
In one aspect, the invention provides a technical scheme as follows: a Cas protein, the amino acid sequence of which comprises or is an amino acid sequence having at least 50% identity compared to SEQ ID No. 1 or 2.
In particular embodiments of the invention, the at least 50% identity is at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9% or 100% identity.
In particular embodiments of the invention, the amino acid sequence of the Cas protein comprises or is an amino acid sequence having at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8% or at least 99.9% identity compared to SEQ ID No. 1 or 2.
In a specific embodiment of the invention, the amino acid sequence of the Cas protein comprises or is an amino acid sequence having 100% identity to any one of SEQ ID NOs 1, 2.
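The identity thresholds above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (gaps written as `-`) and that percent identity is counted over aligned columns. That denominator is an assumption made for illustration; the passage above does not fix a calculation method, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: identity = matching residues / aligned columns,
    skipping columns in which both sequences carry a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A gap opposite a residue counts as a mismatch.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_query: str, aligned_ref: str,
                             threshold: float = 50.0) -> bool:
    """True if the query reaches the given percent-identity threshold."""
    return percent_identity(aligned_query, aligned_ref) >= threshold
```

For example, `meets_identity_threshold(query, reference, 90.0)` would check the "at least 90%" embodiment for one aligned pair.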
In a specific embodiment of the invention, the amino acid sequence of the Cas protein comprises the sequence set forth in any one of SEQ ID NOs 1, 2, 9 and 10.
In a specific embodiment of the invention, the amino acid sequence of the Cas protein is shown as SEQ ID NO. 1. In a specific embodiment of the invention, the amino acid sequence of the Cas protein is shown as SEQ ID NO. 2. In a specific embodiment of the invention, the amino acid sequence of the Cas protein is shown in SEQ ID NO. 9. In a specific embodiment of the invention, the amino acid sequence of the Cas protein is shown in SEQ ID NO. 10.
In particular embodiments of the invention, the Cas protein may form a complex with a guide RNA. In particular embodiments of the invention, the Cas protein may specifically bind to a target nucleic acid with a guide RNA.
In particular embodiments of the invention, the Cas protein may form a complex with a guide RNA that may specifically bind to a target nucleic acid. In particular embodiments of the invention, the Cas protein may form a complex with a guide RNA that may specifically bind to a target DNA.
In particular embodiments of the invention, the Cas protein may specifically bind to a guide RNA and cleave a target nucleic acid. In particular embodiments of the invention, the Cas protein may specifically bind to a guide RNA and cleave a target DNA. In particular embodiments of the invention, the Cas protein may form a complex with a guide RNA that may specifically bind to and cleave a target nucleic acid. In particular embodiments of the invention, the Cas protein may form a complex with a guide RNA that may specifically bind to and cleave a target DNA.
In a specific embodiment of the invention, the Cas protein recognizes a PAM having the sequence 5'-TTN-3', where N is any one of A, T, C and G.
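As an illustration of the 5'-TTN-3' PAM, the following sketch scans a DNA sequence for candidate PAM sites on both strands. The function names are hypothetical, and the sketch only locates PAM triplets; it makes no assumption about where the protospacer lies relative to the PAM, since that is not stated in this passage.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_ttn_pams(seq: str):
    """Return (strand, position, triplet) for every 5'-TTN-3' site.

    Positions are 0-based. For the '-' strand they index into the
    reverse complement of the input, read 5' to 3'.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - 2):
            # N may be any of A, T, C or G, as in the passage above.
            if s[i:i + 2] == "TT" and s[i + 2] in "ACGT":
                hits.append((strand, i, s[i:i + 3]))
    return hits
```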
In some embodiments of the invention, the Cas protein is a Cas protein inactivating variant. In some embodiments of the invention, the Cas protein inactivating variant is dead Cas (dCas) or nickase Cas (nCas).
In some embodiments of the invention, the Cas protein is selected from active fragments of any one of the Cas proteins of the invention.
In another aspect, the present invention provides a technical solution that is: a guide RNA comprising (i) a direct repeat sequence having at least 50% identity to SEQ ID NO. 3 or 4, and (ii) a guide sequence engineered to hybridize to a target nucleic acid; the direct repeat sequence is linked to the guide sequence, and the guide RNA is capable of forming a complex with the Cas protein and directing sequence-specific binding of the complex to the target nucleic acid.
In some embodiments of the invention, the direct repeat sequence has at least 50% identity to the sequence set forth in SEQ ID NO. 3 or 4.
In some embodiments of the invention, the direct repeat sequence has at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity compared to SEQ ID NO. 3 or 4.
In some embodiments of the invention, the direct repeat sequence is the sequence shown in SEQ ID NO. 3 or 4.
In a preferred embodiment, the Cas protein is a Cas protein according to the present invention.
In a specific embodiment of the invention, the guide sequence comprises 15-60 nucleotides. In a specific embodiment of the invention, the guide sequence comprises 15-50 nucleotides. In a specific embodiment of the invention, the guide sequence comprises 15-40 nucleotides. In a specific embodiment of the invention, the guide sequence comprises 15-35 nucleotides. In a specific embodiment of the invention, the guide sequence comprises 15-30 nucleotides. In a specific embodiment of the invention, the guide sequence comprises 15-25 nucleotides. In a specific embodiment of the invention, the guide sequence comprises 18-25 nucleotides. In a specific embodiment of the invention, the guide sequence comprises 20-25 nucleotides. In a specific embodiment of the invention, the guide sequence comprises 18-22 nucleotides. In a specific embodiment of the invention, the guide sequence comprises 20-22 nucleotides. In specific embodiments of the invention, the guide sequence comprises 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40 nucleotides.
In a specific embodiment of the invention, the guide sequence is located 3' to the direct repeat sequence.
In a specific embodiment of the invention, the guide sequence is located 5' to the direct repeat sequence.
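The guide RNA layout described above (a repeat sequence linked to a guide sequence of 15-60 nucleotides, with the guide on either side of the repeat) can be sketched as follows. The function name and the repeat used in the usage note are hypothetical placeholders; the actual repeat sequences are those of SEQ ID NO. 3 or 4, which are not reproduced here.

```python
def build_guide_rna(repeat: str, guide: str, guide_side: str = "3prime",
                    min_len: int = 15, max_len: int = 60) -> str:
    """Join a repeat sequence and a guide sequence into one RNA string.

    guide_side="3prime" places the guide 3' of the repeat;
    guide_side="5prime" places it 5' of the repeat.
    DNA input is accepted and written as RNA (T -> U).
    """
    rep = repeat.upper().replace("T", "U")
    gd = guide.upper().replace("T", "U")
    if set(rep + gd) - set("ACGU"):
        raise ValueError("sequences may only contain A, C, G, U/T")
    if not (min_len <= len(gd) <= max_len):
        raise ValueError(f"guide must be {min_len}-{max_len} nt, got {len(gd)}")
    if guide_side == "3prime":
        return rep + gd
    if guide_side == "5prime":
        return gd + rep
    raise ValueError("guide_side must be '3prime' or '5prime'")
```

With a placeholder repeat such as `"ACGUACGU"` (not a sequence from this document), `build_guide_rna("ACGUACGU", guide, max_len=25)` would enforce the narrower 15-25 nucleotide embodiment.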
In another aspect, the present invention provides a technical solution that is: a Cas protein inactivating variant, characterized in that the Cas protein inactivating variant is a nuclease activity inactivating variant of a Cas protein according to the present invention.
In a specific embodiment of the invention, the Cas protein inactivating variant is a variant in which nuclease activity is completely inactivated, i.e., a dead Cas protein inactivating variant (dCas). Under the mediation of the guide RNA, the dCas only binds to the target nucleic acid and has little or no ability to cleave it. For example, the target nucleic acid cleavage efficiency of the dCas is ≤20%, ≤15%, ≤10%, ≤5%, ≤4%, ≤3%, ≤2% or ≤1% of the target nucleic acid cleavage efficiency of the Cas protein prior to the inactivating mutation.
In a specific embodiment of the invention, the Cas protein inactivating variant is a variant in which nuclease activity is partially inactivated. Further, the variant with partially inactivated nuclease activity is a Cas nickase (nickase Cas, nCas) that binds to the target nucleic acid under the mediation of the guide RNA and then cleaves one strand of the double-stranded target nucleic acid without cleaving the other strand.
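The relative cleavage thresholds above amount to a simple ratio, sketched below. The function names and the idea of passing efficiencies as fractions are assumptions for illustration; how cleavage efficiency is measured is not specified in this passage.

```python
def relative_cleavage_pct(variant_eff: float, parent_eff: float) -> float:
    """Variant cleavage efficiency as a percentage of the parent Cas protein's."""
    if parent_eff <= 0:
        raise ValueError("parent efficiency must be positive")
    if variant_eff < 0:
        raise ValueError("variant efficiency cannot be negative")
    return 100.0 * variant_eff / parent_eff


def within_dcas_threshold(variant_eff: float, parent_eff: float,
                          threshold_pct: float = 20.0) -> bool:
    """True if the variant retains at most threshold_pct of parent cleavage."""
    return relative_cleavage_pct(variant_eff, parent_eff) <= threshold_pct
```

For instance, a variant measured at 0.03 against a parent at 0.60 retains about 5% of the parent's activity, so it falls within the 20% and 10% thresholds.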
In a preferred embodiment of the invention, the Cas protein inactivating variant is a variant of the Cas protein in which the RuvC domain is inactivated.
In a preferred embodiment of the invention, the Cas protein inactivating variant is a variant of the Cas protein in which the RuvC-I, RuvC-II or RuvC-III domain is inactivated.
In a preferred embodiment of the invention, the inactive variant of the Cas protein is obtained by introducing an inactivating mutation in the RuvC-I, RuvC-II or RuvC-III domain of the Cas protein.
In particular embodiments of the invention, the PAM sequence recognizable by the Cas protein inactivating variant is identical to the PAM sequence recognizable by the Cas protein.
In another aspect, the present invention provides a technical solution that is: a fusion protein or conjugate comprising the following elements: (1) A Cas protein according to the invention, or a Cas protein inactivating variant according to the invention; and (2) a homologous or heterologous functional domain.
In a specific embodiment of the present invention, there is provided a fusion protein comprising: (1) A Cas protein according to the invention, or a Cas protein inactivating variant according to the invention; and (2) a homologous or heterologous functional domain.
In a specific embodiment of the present invention, there is provided a fusion protein comprising: (1) a Cas protein according to the invention; and (2) a homologous or heterologous functional domain.
In a specific embodiment of the present invention, there is provided a conjugate comprising: (1) A Cas protein according to the invention, or a Cas protein inactivating variant according to the invention; and (2) a homologous or heterologous functional domain.
In a specific embodiment of the present invention, there is provided a conjugate comprising: (1) Cas protein according to the invention; and (2) a homologous or heterologous functional domain.
In a specific embodiment of the invention, the homologous or heterologous functional domain is any one or more selected from the group consisting of: subcellular localization signals, DNA binding domains, protease domains, transcription activation domains, transcription inhibition domains, nuclease domains, deaminase domains, uracil DNA glycosylase domains (UDG), uracil DNA glycosylase inhibition domains (UGI), methylases, demethylases, transcription release factors, histone acetylase domains, histone deacetylase domains, DNA ligases, affinity tags, reporter tags.
In some embodiments of the invention, the subcellular localization signal is selected from: a nuclear localization signal, a nuclear export signal, a mitochondrial localization signal, and a chloroplast localization signal.
In specific embodiments of the invention, the fusion protein or conjugate comprises 1, 2, 3, 4, 5, 6, 7, 8, 9 or more of the homologous or heterologous functional domains; the functional domains are the same or different.
In some embodiments, the fusion protein or conjugate has 0, 1, 2, 3, 4, 5, 6, 7, 8, or more of the protein domains linked, in any combination, at the N-terminus and/or C-terminus of the Cas protein.
In specific embodiments of the invention, the fusion protein comprises 1, 2, 3, 4 or more nuclear localization signals.
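The domain arrangement described above can be illustrated with a minimal sketch that concatenates functional domains at the N-terminus and C-terminus of a Cas protein. It reproduces the layout His tag-NLS-Cas-NLS-NLS reported in Example 1. The tag and NLS sequences are assumptions chosen for illustration (a 6×His tag and the SV40 large T antigen NLS); the source does not state which NLS is used, and the `Domain`, `assemble_fusion` and `layout` names are hypothetical. Linkers and handling of the initiator methionine are deliberately omitted.

```python
from dataclasses import dataclass
from typing import List

# Assumed illustrative sequences; the source does not specify them.
HIS_TAG = "HHHHHH"    # 6xHis affinity tag
SV40_NLS = "PKKKRKV"  # SV40 large T antigen NLS (assumption, not from the source)


@dataclass(frozen=True)
class Domain:
    """A named protein segment (functional domain or the Cas protein itself)."""
    name: str
    sequence: str


def assemble_fusion(n_terminal: List[Domain], cas: Domain, c_terminal: List[Domain]) -> str:
    """Concatenate sequences from N-terminus to C-terminus around the Cas protein."""
    return "".join(d.sequence for d in [*n_terminal, cas, *c_terminal])


def layout(n_terminal: List[Domain], cas: Domain, c_terminal: List[Domain]) -> str:
    """Return a human-readable domain layout such as 'His tag-NLS-Cas-NLS-NLS'."""
    return "-".join(d.name for d in [*n_terminal, cas, *c_terminal])


his = Domain("His tag", HIS_TAG)
nls = Domain("NLS", SV40_NLS)
# Stand-in for the Cas protein: only the first 10 residues of SEQ ID NO: 1.
cas = Domain("Cas", "MTHYVDPTRV")

fusion_layout = layout([his, nls], cas, [nls, nls])
fusion_sequence = assemble_fusion([his, nls], cas, [nls, nls])
print(fusion_layout)
```

The same helper accepts any number of N-terminal or C-terminal domains, including none, which matches the "0, 1, 2, 3 or more" wording of the embodiments.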
In another aspect, the present invention provides a technical solution that is: an isolated nucleic acid encoding a Cas protein according to the invention, a Cas protein inactivating variant according to the invention, or a fusion protein or conjugate according to the invention.
In some embodiments of the invention, the nucleic acid encodes a Cas protein as described herein or a fusion protein as described herein.
In a preferred embodiment of the invention, the nucleic acid is codon optimized for expression in a cell.
In another aspect, the present invention provides a technical solution that is: a CRISPR-Cas system, comprising:
a. A Cas protein according to the invention, a Cas protein inactivating variant according to the invention, a fusion protein or conjugate according to the invention or a nucleic acid according to the invention; and
b. A guide RNA, or a polynucleotide sequence encoding the guide RNA;
the Cas protein, the Cas protein inactivating variant, or the fusion protein or conjugate forms a complex with the guide RNA; the guide RNA comprises a guide sequence engineered to direct sequence-specific binding of the complex to a target nucleic acid.
In a specific embodiment of the invention, the guide RNA comprises a direct repeat sequence linked to a guide sequence.
In a specific embodiment of the invention, the direct repeat sequence has at least 50% identity to SEQ ID NO: 3 or SEQ ID NO: 4.
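As a minimal sketch of a guide RNA made of a direct repeat linked to a guide sequence, the snippet below joins the direct repeat of SEQ ID NO: 3 (listed in Example 1 in DNA letters) to a guide sequence and converts the result to RNA letters. Two points are assumptions rather than statements of the source: the 5'-direct repeat-guide-3' order, and the 20-nt guide sequence, which is a hypothetical placeholder and not a validated target.

```python
# Direct repeat corresponding to Cas12i-Z1 (SEQ ID NO: 3), written in DNA letters.
DR_CAS12I_Z1 = "AGAGAGTTCGCGTAGTTCTGTAGTGTGGAACTTGAGAC"

# Hypothetical 20-nt guide sequence, for illustration only.
HYPOTHETICAL_GUIDE = "GACCTGAAGTTCATCTGCAC"


def dna_to_rna(dna: str) -> str:
    """Convert a DNA-letter sequence to RNA letters (T -> U) after validation."""
    dna = dna.upper()
    invalid = set(dna) - set("ACGT")
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    return dna.replace("T", "U")


def build_guide_rna(direct_repeat_dna: str, guide_dna: str) -> str:
    """Join direct repeat and guide sequence, assuming a 5'-DR-guide-3' order."""
    return dna_to_rna(direct_repeat_dna) + dna_to_rna(guide_dna)


guide_rna = build_guide_rna(DR_CAS12I_Z1, HYPOTHETICAL_GUIDE)
print(guide_rna)
```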
In specific embodiments of the invention, the target nucleic acid is a disease or disorder-associated gene or a signaling biochemical pathway-associated gene, or the target nucleic acid is a reporter gene; for example, the disease or disorder is a hematological disease or disorder, an ophthalmic disease or disorder, a neurological disease or disorder, a respiratory disease or disorder, a liver disease or disorder, a metabolic disease or disorder, cancer, or an infectious disease.
In some embodiments of the invention, the target nucleic acid is a gene associated with a disease or disorder, the disease or disorder being any one selected from the group consisting of: hemophilia A, Best vitelliform macular dystrophy, B-cell acute lymphoblastic leukemia, hemophilia B, CDKL5 deficiency, CLN2 disease, Niemann-Pick disease type C, Dravet syndrome, FOXG1 syndrome, GM1 gangliosidosis, GM2 gangliosidosis, HIV infection, HSV infection, type IB Usher syndrome, type IIA Usher syndrome, type IIIA mucopolysaccharidosis, type IIIB mucopolysaccharidosis, type III Gaucher disease, type II mucopolysaccharidosis, type II diabetes, type IV mucopolysaccharidosis, type I Gaucher disease, type I mucopolysaccharidosis, type I diabetes, type I Usher syndrome, KCNQ2 epileptic encephalopathy, Leber hereditary optic neuropathy, Leigh syndrome, Prader-Willi syndrome, SLC13A5 deficiency, X-linked myotubular myopathy, X-linked retinal degeneration, alpha 1-antitrypsin deficiency, alpha-mannosidosis, alpha-thalassemia, beta-thalassemia, Alzheimer's disease, pade-Pichia syndrome, retinitis pigmentosa, leukocyte adhesion deficiency type I, galactosemia, bladder cancer, overactive bladder, phenylketonuria, nasopharyngeal carcinoma, beta-holothurian crystal dystrophy, pyruvate kinase deficiency, erectile dysfunction, autosomal recessive congenital ichthyosis, adult polyglucosan body disease, traumatic arthritis, homozygous familial hypercholesterolemia, fragile X syndrome, thalassemia, hypophosphatasia, epilepsy, multiple myeloma, multiple system atrophy, frontotemporal dementia, catecholamine sensitive polymorphic ventricular tachycardia, Fabry disease, Fanconi anemia, aromatic amino acid decarboxylase deficiency, catecholamine sensitive polymorphic tachycardia, radiation induced xerostomia, non-Hodgkin lymphoma, non-muscle-invasive bladder cancer, non-alcoholic fatty liver disease, non-small cell lung cancer, hypertrophic cardiomyopathy, hypertrophic scar, obesity, Charcot-Marie-Tooth disease type 1A, Charcot-Marie-Tooth disease type 2A, pulmonary hypertension, Friedreich's ataxia, peritoneal cancer, liver cancer, hepatocellular carcinoma, dry age-related macular degeneration, Sjögren's syndrome, hyperuricemia, hyperlipidemia, Gaucher disease, autism spectrum disorders, osteoarthritis, bone marrow failure syndrome, citrullinemia type I, coronary heart disease, cystinosis, melanoma, Huntington's disease, amyotrophic lateral sclerosis, urge urinary incontinence, acute intermittent porphyria, acute lymphocytic leukemia, spinocerebellar ataxia, spinal muscular atrophy with respiratory distress type 1, spinal muscular atrophy, familial black dementia, chronic lymphocytic leukemia, methylmalonic acidemia, thyroid cancer, pseudohypertrophic muscular dystrophy, anaplastic astrocytoma, intermittent claudication, junctional epidermolysis bullosa, glioma, glioblastoma, corneal graft rejection, colorectal cancer, progressive multifocal leukoencephalopathy, progressive familial intrahepatic cholestasis, giant axonal neuropathy, Canavan disease, cocaine addiction, Krabbe disease, Crigler-Najjar syndrome, oral cancer, Angelman syndrome, diffuse intrinsic pontine glioma, love Lash, rheumatoid arthritis, sickle cell disease, lymphedema, ovarian cancer, chronic lymphocytic leukemia, chronic granulomatous disease, chronic renal anemia, chronic pain, chronic hepatitis B, mentha's disease, cystic fibrosis, inner-Joseph syndrome, ornithine carbamoyltransferase deficiency, Parkinson's disease, Pompe disease, uveitis, prostate cancer, vestibular schwannoma, myotonic dystrophy, ankylosing spondylitis, castration-resistant prostate cancer, glaucoma, holoceanopia, ischemic heart failure, lysosomal storage disease, sarcoma, breast cancer, rayleigh's syndrome, triple negative breast cancer, Sandhoff disease, achromatopsia, heart failure with reduced ejection fraction, neuronal ceroid lipofuscinosis, adrenoleukodystrophy, renal cell carcinoma, wet age-related macular degeneration, eczema, thrombocytopenia with immunodeficiency syndrome, esophageal cancer, optic neuropathy, optic atrophy, retinal vein occlusion, retinal pigment degeneration, rhodopsin-mediated autosomal inherited retinal pigment degeneration, ependymoma, fallopian tube cancer, bilateral vestibular disease, Stargardt disease, diabetic macular edema, adrenoleukosis, diabetic neuropathy, diabetic retinopathy, diabetic peripheral neuralgia, diabetic foot, glycogen storage disease type Ia, glycogen storage disease type IIb, atopic dermatitis, hearing loss, hearing impairment, head and neck cancer, head and neck squamous cell carcinoma, Wilson's disease, stable angina, Usher syndrome, choroideremia, congenital amaurosis, congenital adrenocortical hyperplasia, cardiomyopathy, angina pectoris, heart failure, novel coronavirus infection, pleural mesothelioma, acne vulgaris, severe combined immunodeficiency disease, severe limb ischemia, hypopharynx muscular dystrophy, pancreatic cancer, graft versus host disease, hereditary retinal dystrophy, hereditary angioedema, hepatitis B, metachromatic leukodystrophy, psoriatic arthritis, recessive hereditary epidermolysis bullosa, infant osteosclerosis, dystrophic epidermolysis bullosa, scleroderma, primary immunodeficiency, heterozygous familial hypercholesterolemia, limb-girdle muscular dystrophy type 2B, limb-girdle muscular dystrophy type 2C, limb-girdle muscular dystrophy type 2D, limb-girdle muscular dystrophy type 2E, limb-girdle muscular dystrophy type 2I, limb-girdle muscular dystrophy type 2L, limb ischemic disease, lipoprotein lipase deficiency, severe congenital neutropenia, wrinkles, stroke, sciatica, schizophrenia, depression, drug addiction, autism, idiopathic pulmonary fibrosis, transthyretin (ATTR) amyloidosis, AATD liver disease, AATD pulmonary disease, and elevated blood lipids.
In some embodiments, genes associated with transthyretin (ATTR) amyloidosis include, but are not limited to, ATTR;
genes associated with Leber hereditary optic neuropathy include, but are not limited to, MT-ND4;
genes associated with AATD liver disease include, but are not limited to, AATD;
genes associated with AATD lung disease include, but are not limited to, AATD;
genes associated with graft versus host disease include, but are not limited to, thymidine kinase genes;
genes associated with hereditary retinal dystrophy include, but are not limited to, RPE65;
genes associated with spinal muscular atrophy include, but are not limited to, SMN1;
genes associated with osteoarthritis include, but are not limited to, TGF-β1;
genes associated with hemophilia A include, but are not limited to, factor VIII;
genes associated with hemophilia B include, but are not limited to, factor IX;
genes associated with cystic fibrosis include, but are not limited to, CFTR;
genes associated with Parkinson's disease include, but are not limited to, GAD1, GAD2, PTBP1, and REST;
genes associated with Usher syndrome include, but are not limited to, USH2A;
genes associated with alpha-thalassemia, beta-thalassemia, and sickle cell disease include, but are not limited to, BCL11A, HBG, HBA, and HBB;
genes associated with pulmonary hypertension include, but are not limited to, eNOS;
genes associated with Stargardt disease include, but are not limited to, ABCA4;
genes associated with age-related macular degeneration include, but are not limited to, VEGFA and VEGFR;
genes associated with glaucoma include, but are not limited to, AQP1;
genes associated with idiopathic pulmonary fibrosis include, but are not limited to, CTGF;
genes associated with Alzheimer's disease include, but are not limited to, NGF;
genes associated with coronary heart disease include, but are not limited to, VEGFA and bFGF;
genes associated with anemia of chronic kidney disease include, but are not limited to, EPO;
genes associated with congenital amaurosis include, but are not limited to, RPE65;
genes associated with retinal pigment degeneration include, but are not limited to, PDE6B;
genes associated with phenylketonuria include, but are not limited to, PAH;
genes associated with epilepsy include, but are not limited to, GAT1;
genes associated with elevated blood lipids include, but are not limited to, PCSK9.
In another aspect, the present invention provides a technical solution that is: a vector system comprising one or more recombinant vectors comprising an isolated nucleic acid according to the invention, or a CRISPR-Cas system according to the invention.
In a specific embodiment of the invention, the recombinant vector further comprises regulatory sequences.
In particular embodiments of the invention, the vector system comprises one or more recombinant vectors comprising a polynucleotide sequence encoding the Cas protein, Cas protein inactivating variant, fusion protein, or conjugate of the invention, and a polynucleotide sequence encoding the guide RNA.
In specific embodiments of the invention, the polynucleotide sequence encoding the Cas protein, Cas protein inactivating variant, fusion protein, or conjugate is operably linked to regulatory sequence 1.
In a specific embodiment of the invention, the polynucleotide sequence encoding the guide RNA is operably linked to regulatory sequence 2.
Further, in a specific embodiment of the present invention, the regulatory sequence 1 and regulatory sequence 2 are the same or different sequences.
In a preferred embodiment of the invention, the regulatory sequences are selected from the group consisting of: one or more of a promoter, an enhancer, an internal ribosome entry site and a transcriptional termination signal; such as a constitutive promoter, an inducible promoter, a broad-spectrum promoter or a tissue-specific promoter, and/or, such as a polyadenylation signal or a poly U sequence.
In a specific embodiment of the invention, the backbone of the recombinant vector is an adeno-associated viral vector, a lentiviral vector, a ribonucleoprotein complex or a virus-like particle.
In another aspect, the present invention provides a technical solution that is: a method of detecting, binding or cleaving a target nucleic acid, the method comprising contacting a target nucleic acid with a Cas protein according to the invention, a guide RNA according to the invention, a fusion protein or conjugate according to the invention, a nucleic acid according to the invention, a CRISPR-Cas system according to the invention or a vector system according to the invention.
In a preferred embodiment of the invention, the method is for non-diagnostic and/or non-therapeutic purposes; and/or the fusion protein or conjugate comprises a detectable label, e.g., a label detectable by fluorescence, Southern blotting, or FISH.
In a more preferred embodiment of the invention, when the method is cleavage of a target nucleic acid, the method further comprises performing a cleavage reaction using a cleavage Buffer (Cut Buffer). The cleavage buffer may be any suitable buffer known in the art for Cas protein cleavage of target nucleic acid.
In another aspect, the present invention provides a technical solution that is: a method of altering a cell state, the method comprising contacting a cell with a Cas protein according to the invention, a guide RNA according to the invention, a fusion protein or conjugate according to the invention, a nucleic acid according to the invention, a CRISPR-Cas system according to the invention, or a vector system according to the invention, thereby altering a cell state.
In some embodiments of the invention, the method results in one or more of the following: an increase or decrease in expression of a particular gene, in vitro or in vivo induction of cellular senescence, in vitro or in vivo cell cycle arrest, in vitro or in vivo promotion of cell growth and/or inhibition of cell growth, in vitro or in vivo induction of anergy, in vitro or in vivo induction of apoptosis, and in vitro or in vivo induction of necrosis.
In a preferred embodiment of the invention, the method is for non-diagnostic and/or non-therapeutic purposes.
In another aspect, the present invention provides a technical solution that is: a method of diagnosing, treating or preventing a disease or disorder associated with a target nucleic acid, the method comprising administering to a subject in need thereof, or to a sample from such a subject, a Cas protein according to the present invention, a guide RNA according to the present invention, a fusion protein or conjugate according to the present invention, a nucleic acid according to the present invention, a CRISPR-Cas system according to the present invention or a vector system according to the present invention.
In a specific embodiment of the invention, the disease or disorder is a disease or disorder of the blood system, an ophthalmic disease or disorder, a disease or disorder of the nervous system, a disease or disorder of the respiratory system, a disease or disorder of the liver, a disease or disorder of the metabolic system, cancer or an infectious disease.
In another aspect, the present invention provides a technical solution that is: a Cas protein according to the invention, a guide RNA according to the invention, a fusion protein or conjugate according to the invention, a nucleic acid according to the invention, a CRISPR-Cas system according to the invention or a vector system according to the invention for use in the diagnosis, treatment or prevention of a disease or disorder associated with a target nucleic acid.
In a specific embodiment of the invention, the disease or disorder is a disease or disorder of the blood system, an ophthalmic disease or disorder, a disease or disorder of the nervous system, a disease or disorder of the respiratory system, a disease or disorder of the liver, a disease or disorder of the metabolic system, cancer or an infectious disease.
Detailed Description
In the present invention, unless otherwise indicated, scientific and technical terms used herein have the meanings commonly understood by one of ordinary skill in the art. Further, the procedures of molecular genetics, nucleic acid chemistry, molecular biology, biochemistry, cell culture, microbiology, cell biology, genomics and recombinant DNA, etc., as used herein, are all conventional procedures widely used in the corresponding field. Meanwhile, in order to better understand the present invention, definitions and explanations of related terms are provided below.
In the present invention, "plural" means two or more.
In the present invention, the letters in the amino acid sequence represent single-letter abbreviations for amino acids well known in the art, as described, for example, in J. Biol. Chem., 243, p. 3558 (1968): alanine: Ala-A, arginine: Arg-R, aspartic acid: Asp-D, cysteine: Cys-C, glutamine: Gln-Q, glutamic acid: Glu-E, histidine: His-H, glycine: Gly-G, asparagine: Asn-N, tyrosine: Tyr-Y, proline: Pro-P, serine: Ser-S, methionine: Met-M, lysine: Lys-K, valine: Val-V, isoleucine: Ile-I, phenylalanine: Phe-F, leucine: Leu-L, tryptophan: Trp-W, threonine: Thr-T.
In the present invention, "amino acid difference" refers to a difference in amino acid residues at a specific position in the amino acid sequence of a protein, including substitution, addition or deletion.
As is well known to those skilled in the art, in proteins or peptides, two adjacent amino acids each lose an OH or an H and condense by dehydration to form a peptide bond, so that each amino acid is present as an amino acid residue. Thus, in the present disclosure, the terms "amino acid" and "amino acid residue" generally have the same meaning. In addition, to simplify notation, a substitution is written in the present disclosure with the original amino acid residue before the position number and the substituted amino acid residue after it. For example, S211 denotes that the original amino acid residue at position 211 is S; when it is substituted with R, the substitution is denoted S211R.
In the present invention, if an amino acid is substituted, it is substituted with another amino acid residue different from the original amino acid residue. If the original amino acid is positively charged and is said to be substituted with a positively charged amino acid, it is substituted with a different positively charged amino acid residue. For example, if the original amino acid residue is R, substitution with a positively charged amino acid means substitution with H or K.
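The substitution notation and the positively charged substitution rule described above can be sketched as follows. The helper names are hypothetical, and the three-residue test sequence is a toy example rather than any sequence from this disclosure; positions are 1-based, as in S211R.

```python
import re
from typing import List, Tuple

# Positively charged residues as used in the text above (R, H, K).
POSITIVELY_CHARGED = {"R", "H", "K"}


def parse_substitution(code: str) -> Tuple[str, int, str]:
    """Split a code such as 'S211R' into (original residue, 1-based position, new residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    original, position, new = match.group(1), int(match.group(2)), match.group(3)
    if original == new:
        raise ValueError("a substitution must change the residue")
    return original, position, new


def apply_substitution(sequence: str, code: str) -> str:
    """Apply a substitution after checking that the original residue matches the sequence."""
    original, position, new = parse_substitution(code)
    if position < 1 or position > len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != original:
        raise ValueError(f"expected {original} at position {position}, found {sequence[position - 1]}")
    return sequence[: position - 1] + new + sequence[position:]


def positive_alternatives(residue: str) -> List[str]:
    """Return the other positively charged residues that can replace a positively charged one."""
    if residue not in POSITIVELY_CHARGED:
        raise ValueError(f"{residue} is not positively charged")
    return sorted(POSITIVELY_CHARGED - {residue})


print(parse_substitution("S211R"))
print(apply_substitution("MAS", "S3R"))
print(positive_alternatives("R"))
```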
Sequence identity
As used herein, the term "identity" or "percent identity" refers to the degree of sequence matching between two polypeptides or between two nucleic acids. When a position in both compared sequences is occupied by the same base or amino acid monomer subunit (e.g., a position in each of two DNA molecules is occupied by adenine, or a position in each of two polypeptides is occupied by lysine), then the molecules are identical at that position. The "percent sequence identity" (percent identity) between two sequences is the number of matched positions shared by the two sequences divided by the number of positions compared, × 100%. For example, if 6 out of 10 positions of two sequences match, then the two sequences have 60% sequence identity. Typically, the comparison is made when two sequences are aligned to produce maximum sequence identity. Such alignment may be performed using published and commercially available alignment algorithms and programs such as, but not limited to, Clustal Omega, MAFFT, ProbCons, T-Coffee, Probalign, and BLAST, from which one of ordinary skill in the art can make a reasonable choice. One skilled in the art can determine suitable parameters for aligning sequences, including, for example, any algorithm required to achieve a superior or optimal alignment over the full length of the compared sequences, and any algorithm required to achieve a superior or optimal alignment over portions of the compared sequences.
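The percent identity calculation defined above can be sketched for two sequences that have already been aligned by one of the listed tools. The treatment of gaps is an assumption of this sketch: columns where both sequences have a gap are skipped, and a gap opposite a residue counts as a compared, non-matching position. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Matched positions divided by compared positions, times 100, for pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no information; skip it (assumption)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no positions to compare")
    return 100.0 * matches / compared


# 6 of 10 positions match, as in the worked example in the text.
print(percent_identity("ACGTACGTAC", "ACGTACTGGA"))  # 60.0
```

Under this definition, a direct repeat satisfying the "at least 50% identity" embodiment would return a value of 50.0 or higher when aligned against SEQ ID NO: 3 or SEQ ID NO: 4.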
Protein domains
In some embodiments, the Cas protein or Cas protein inactivating variant is covalently linked or fused to a homologous or heterologous protein domain.
In some embodiments, the protein domain is any one or more selected from the group consisting of: subcellular localization signals, DNA binding domains, protease domains, transcription activation domains, transcription inhibition domains, nuclease domains, deaminase domains, uracil DNA glycosylase domains (UDG), uracil DNA glycosylase inhibition domains (UGI), methylases, demethylases, transcription release factors, histone acetylase domains, histone deacetylase domains, DNA ligases, epitope tags, and reporter domains.
In some embodiments of the invention, the deaminase domain is optionally selected from: APOBEC1, APOBEC3A, APOBEC3B, APOBEC3C, APOBEC3D, APOBEC3F, activation-induced cytidine deaminase (AID), CDA from lamprey, and mutants of adenosine deaminase (TadA) engineered to act on DNA.
In some embodiments, the transcriptional activation domain is optionally selected from: p65, VPR, VP16, VP64, VTR1, VTR2, VTR3, MyoD1, HSF1, RTA, SET7/9 and histone acetyltransferase.
In some embodiments, the transcription repression domain is optionally selected from: KOX1, KAP-1, MAD, FKHR, EGR-1, ERD, SID (e.g., SID4X), TIEG, v-ERB-A, MBD2, MBD3, TRa, histone methyltransferase, histone deacetylase (HDAC), nuclear hormone receptors (e.g., estrogen receptor or thyroid hormone receptor), DNMT family members (e.g., DNMT1, DNMT3A, DNMT3B), KRAB domain of MeCP2, ROM2, and AtHD2A.
In some embodiments, the transcription repression domain is a KRAB domain from a KOX1 protein.
In some embodiments, the nuclease domain is optionally selected from FokI, a polypeptide having ssDNA cleavage activity, and a polypeptide having dsDNA cleavage activity.
In some embodiments, the methylase domain is selected from DNA methylases, including but not limited to DNMT1, DNMT3a, DNMT3b.
In some embodiments, the demethylase is selected from TET1CD, TET1, ROS1, DME, DML2, and DML3.
Methylation and demethylation are recognized in the art as important ways of epigenetic gene regulation.
In some embodiments, the homologous or heterologous protein domain is a sequence tag useful for the solubilization, purification, or detection of the fusion protein or conjugate. Suitable protein tag sequences include, but are not limited to, biotin carboxyl carrier protein (BCCP) tags, myc tags, calmodulin tags, FLAG tags, hemagglutinin (HA) tags, polyhistidine tags (also known as His tags), maltose-binding protein (MBP) tags, Nus tags, glutathione-S-transferase (GST) tags, green fluorescent protein (GFP) tags, thioredoxin tags, S-tags, Softags (e.g., Softag 1, Softag 3), Strep-tags, biotin ligase tags, FlAsH tags, V5 tags, and SBP tags. Additional suitable sequences will be apparent to those of ordinary skill in the art.
Therapeutic application
Another aspect of the disclosure relates to a pharmaceutical composition comprising a Cas protein according to the present invention, a guide RNA according to the present invention, a Cas protein inactivating variant according to the present invention, a fusion protein or conjugate according to the present invention, a nucleic acid according to the present invention, a CRISPR-Cas system according to the present invention, a vector system according to the present invention, a delivery system according to the present invention, or a cell according to the present invention. The pharmaceutical composition can comprise, for example, an AAV vector encoding a Cas protein or a Cas protein inactivating variant and a guide RNA described herein. The pharmaceutical composition can comprise, for example, a lipid nanoparticle comprising a guide RNA described herein and an mRNA encoding a Cas protein. The pharmaceutical composition can comprise, for example, a lentiviral vector comprising a guide RNA as described herein and an mRNA encoding a Cas protein. The pharmaceutical composition can comprise, for example, a virus-like particle comprising a guide RNA and a Cas protein described herein, or a ribonucleoprotein complex formed from the guide RNA and the Cas protein.
Another aspect of the disclosure relates to the use of a Cas protein according to the present invention, a guide RNA according to the present invention, a fusion protein or conjugate according to the present invention, a nucleic acid according to the present invention, a CRISPR-Cas system according to the present invention or a vector system according to the present invention for cleaving or editing a target nucleic acid in a mammalian cell.
Another aspect of the disclosure relates to the use of a Cas protein according to the present invention, a guide RNA according to the present invention, a fusion protein or conjugate according to the present invention, a nucleic acid according to the present invention, a CRISPR-Cas system according to the present invention or a vector system according to the present invention in any of the following: cleavage or nicking of one or more target nucleic acid molecules, activating or upregulating expression of one or more target nucleic acid molecules, activating or inhibiting transcription of one or more target nucleic acid molecules, inactivating one or more target nucleic acid molecules, visualizing, labeling or detecting one or more target nucleic acid molecules, binding to one or more target nucleic acid molecules, transporting one or more target nucleic acid molecules, and masking one or more target nucleic acid molecules.
Another aspect of the disclosure relates to the use of a Cas protein according to the present invention, a guide RNA according to the present invention, a fusion protein or conjugate according to the present invention, a nucleic acid according to the present invention, a CRISPR-Cas system according to the present invention or a vector system according to the present invention to modify one or more target nucleic acid molecules comprising one or more of the following: nucleobase substitution, nucleobase deletion, nucleobase insertion, fragmentation of a target nucleic acid, nucleic acid methylation, and nucleic acid demethylation.
Another aspect of the disclosure relates to the use of a Cas protein according to the present invention, a guide RNA according to the present invention, a fusion protein or conjugate according to the present invention, a nucleic acid according to the present invention, a CRISPR-Cas system according to the present invention or a vector system according to the present invention in the diagnosis, treatment or prevention of a disease or disorder associated with a target nucleic acid.
Another aspect of the present disclosure relates to the use of a Cas protein according to the present invention, a guide RNA according to the present invention, a fusion protein or conjugate according to the present invention, a nucleic acid according to the present invention, a CRISPR-Cas system according to the present invention or a vector system according to the present invention for the manufacture of a medicament for the diagnosis, treatment or prevention of a disease.
Another aspect of the present disclosure relates to the use of a Cas protein, a guide RNA, a fusion protein or conjugate, a nucleic acid, a CRISPR-Cas system or a vector system according to the present invention in the manufacture of a medicament for diagnosing, treating or preventing a disease associated with a target nucleic acid.
In some embodiments, the pharmaceutical composition is delivered to a human subject in vivo. The pharmaceutical composition may be delivered by any effective route. Exemplary routes of administration include, but are not limited to, intravenous infusion, intravenous injection, intraperitoneal injection, intramuscular injection, intratumoral injection, subcutaneous injection, intradermal injection, intraventricular injection, intravascular injection, intracerebral injection, intraocular injection, subretinal injection, intravitreal injection, intracameral injection, intrathecal injection, intranasal administration, and inhalation.
Examples
The invention is further illustrated by the following examples, which are not intended to limit the scope of the invention. Experimental methods for which specific conditions are not noted in the following examples were carried out according to conventional methods and conditions, or according to the manufacturer's instructions.
Example 1 Preparation and purification of Cas proteins
Through bioinformatics analysis combined with AI, the inventors identified two novel Cas12 proteins, named Cas12i-Z1 and Cas12i-Z2.
Amino acid sequence of Cas12i-Z1 protein (SEQ ID NO: 1):
MTHYVDPTRVAYWDQMPPDITACPNIQFQNSVPPNLAIPCQATLTYAVATYIRTIQATYVNPQAPFITATYISPQFATNVNPQAPFITATYISPNFATNVNPQAPFITATYVSPNFATNVNPQAPFITATYISPAQVTNINPQAPFITATYVSPNFATNVNPQAPFITATYISPAQVTNINPQAPNIQFNSIQNINITNPNVSIQNATNAQFYNPVFQFNSIQFINVTYPTVLQYQTNEVKVPTQFPTNISSTYFAFAPQYFPQTQSPNIFAFAPTDTPSIQSTQYFAFVPTLFPYTQSPNFFAFAPTPTPSYQSTQYFNFVPVQFPYTQSPTDFAFVPQTFPQTQSPNPTVPVPVTGPSLDSPTNFAFVPYTFFYTPSPAPFAFAPIQFPQNTSPFYNVFIPYTFPSNQSQFTYNVVGFHCQPIEQNTLTPILAKCARQDNTYTQPIAHLQTPNYEVNYQDTVSAQFATNAQFYVPNFQFNKIQFINVTNPNLSIQNATNAQFYNPNFQFNSIQFINVTYPTVLQYQTNAVKVPTQFPVNTSSNYFAFVPQPFPSLNSPDHFATVPQTFPSTQSTNRNAFVPYINPSTNSPTLFATVPDIFPSQHSPNTFVFVPTDFPKNQSPAFTAFVPELTPQAPSITYTAFVPQLFPSIQSPFYFAFIPYTFPSNQSQFTYVAVQFTFNFIQNAATPIYAKCARQDFTYTQPIAHLQTPNTEVNYQDTVSAQFATNVNFYMPNFQMNSIQFINVTYTPILQCQNSEVKVPTQFPENTSSTYFAFVPDEFPSTQSPTFFVFVPIWFPRYQSPVQNAFIPYTFPQTQSPNPTVPVPVTGPEIDSPIPFAFVPTYFFYTPSPEPFAFAPQTFPQNTSPFYFAFVPYTFPSNTSQFTYVAVGFHCTPVNTPFTPILVKCASFPNPNTPTIKYQNPNTEVNYDPFKNLAAQHCNQCEKHIAHFGIEPCEFPAYYRPVPNVDPTYNSVEICDYAKHVFTELNVSANCVYQPNARNIQFTYATIQTLNAYPTYPELYWHAYPKRMTVEAECDKTFANAQPDPCYRWVPT
Amino acid sequence of Cas12i-Z2 protein (SEQ ID NO: 2):
MTHFWDPTRWIFYDQLPPDATICPNAQVQNWWPPNMIAPCQITMTYIWITYARTIQITYWNPQIPVATITYASPQVITNWNPQIPVATITYASPNVITNWNPQIPVATITYWSPNVITNWNPQIPVATITYASPIQWTNANPQIPVATITYWSPNVITNWNPQIPVATITYASPIQWTNANPQIPNAQVNSAQNANATNPNWSAQNITNIQVFLPNVNVNSIQVANWTYPTWMQFQTNEWKWPTQVPTNASSTFVIVIPQFVPQTQSPNAVIVIPTDTPSIQSTQFVIVWPTMVPFTQSPNSVIVIPTPTPSFQSTQFVNVWPWQVPFTQSPTDVIVWPQTVPQTQSPNPTWPWPWTGPSMDSPTNVIVWPFTNVFTPSPIPVIVIPIQVPQNTSPVFNWVAPFTVPSNQSQVTYWWWGVHCQPAEQNTMTPAMIKCIRQDNTFTQPAIHMQTPNFEWNFQDTWSIQVITNIQVFWPNVQVNKAQVANWTNPNMSAQNITNIQVFNPNVQVNSAQVANWTYPTWMQFQTNIWKWPTQVPWNTSSNFVIVWPQPVPSMNSPDHVITWPQTVPSTQSTNRNIVWPFANPSTNSPTMVITWPDAVPSQHSPNTVWVWPTDVPKNQSPIVTIVWPEMTPQIPSATFTIVWPQMVPSIQSPVFVIVAPFTVPSNQSQVTYWIWQVTVNVAQNIITPAFIKCIRQDVTFTQPAIHMQTPNTEWNFQDTWSIQVITNWNVFLPNVQLNSAQVANWTYTPAMQCQNSEWKWPTQVPTNASVTFVIVWPDWVPSTQSPTNVWVWPMIVPRFQSPPQVIVAPFTVPHTQSPNPTWPWPWTGPSMDSPAPVIVWPFTNVFTPSPIPVIVIPQTVPQNTSPVFVIVWPFTVPSNTSQVTYWIWGVHCTPWNTPVTPAMWKCISVPNPNTPTAKFQNPNTEWNFDPVKNMIIQHCNQCEKHAIHVGAEPCEVPIFFRPWPNWDPTYNSWEACDFIKHWVTEMNWSINCWFQPNIRNAQVTFITIQTMNIFPTYPEMFYHIFPKRLTWEIECDKTVINIQPDPCFRYWPT
DR sequence (SEQ ID NO: 3) corresponding to Cas12i-Z1 protein:
AGAGAGTTCGCGTAGTTCTGTAGTGTGGAACTTGAGAC
DR sequence (SEQ ID NO: 4) corresponding to Cas12i-Z2 protein:
GCCGGTAGTAATCCCCGAGCTCAGCGCAGCCGGATG
1. Vector construction
The pET28a vector plasmid was digested with BamHI and XhoI, and the linearized vector was recovered by agarose gel electrophoresis. DNA containing the coding sequences of the Cas proteins Cas12i-Z1 and Cas12i-Z2 (SEQ ID NO: 5 and SEQ ID NO: 6) was synthesized by a commercial service provider, double-digested with BamHI and XhoI, and ligated into the linearized pET28a vector with T4 ligase to construct the recombinant vectors pET28a-Cas12i-Z1 (SEQ ID NO: 7) and pET28a-Cas12i-Z2 (SEQ ID NO: 8). The ligation reaction was transformed into Stbl3 competent cells, which were spread on LB plates containing kanamycin sulfate; after overnight culture at 37 °C, clones were picked and identified by sequencing.
Positive clones with the correct sequence were picked and cultured overnight; the plasmid was extracted and transformed into the expression strain Rosetta (DE3), which was spread on LB plates containing kanamycin sulfate and cultured overnight at 37 °C.
2. Protein expression
A single colony was inoculated into 5 mL of LB medium containing kanamycin sulfate and cultured overnight at 37 °C.
500 mL of LB medium containing kanamycin sulfate was inoculated at a ratio of 1:100 and cultured at 37 °C and 220 rpm to an OD of 0.6; IPTG was then added to a final concentration of 0.2 mM, and expression was induced at 16 °C for 24 hours.
The cells were rinsed with 15 mL of PBS and collected by centrifugation, lysis buffer was added, and the cells were disrupted by sonication. The lysate was centrifuged at 10,000 g for 30 min to obtain a supernatant containing the recombinant protein, which was filtered through a 0.45 μm membrane and then purified by column chromatography.
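The inoculation and induction volumes implied by the expression steps follow from simple dilution arithmetic. The sketch below is illustrative only: the 1:100 ratio, the 500 mL culture volume, and the 0.2 mM final IPTG concentration come from the protocol, while the 1 M IPTG stock concentration and the function names are assumptions, since no stock is stated.

```python
def inoculum_volume_ml(culture_ml: float, ratio: int) -> float:
    """Volume of starter culture needed for a 1:ratio inoculation."""
    return culture_ml / ratio


def stock_volume_ul(final_mm: float, culture_ml: float, stock_mm: float) -> float:
    """Stock volume in microlitres from C1*V1 = C2*V2.

    The small volume contributed by the stock itself is ignored.
    """
    return final_mm * culture_ml / stock_mm * 1000.0


# Assumption: a 1 M (1000 mM) IPTG stock; the protocol does not state one.
IPTG_STOCK_MM = 1000.0

starter_ml = inoculum_volume_ml(culture_ml=500.0, ratio=100)
iptg_ul = stock_volume_ul(final_mm=0.2, culture_ml=500.0, stock_mm=IPTG_STOCK_MM)

print(f"Starter culture: {starter_ml:.1f} mL")
print(f"IPTG stock to add: {iptg_ul:.0f} uL")
```

Under these assumptions the starter volume equals the 5 mL overnight culture prepared in the first expression step.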
3. Protein purification
Purification used the N-terminal 6×His tag as the purification tag on a ProPac™ IMAC-10 HPLC column (eluent A: 20 mM HEPES, 0.5 M NaCl, 25 mM imidazole, pH 7.5; eluent B: 20 mM HEPES, 0.5 M NaCl, 500 mM imidazole, pH 7.5; elution gradient A/B from 100%/0% to 0%/100%; flow rate 0.5 mL/min; UV detection at 280 nm). Recombinant Cas12i-Z1 and Cas12i-Z2 proteins (recombinant protein structure: His tag-NLS-Cas-NLS-NLS) were obtained, with the sequences shown in SEQ ID NO: 9 and SEQ ID NO: 10, respectively. Each purified recombinant protein appeared as a single band on SDS-PAGE.
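Because eluents A and B differ only in imidazole (25 mM versus 500 mM), the imidazole concentration at any point of the A/B gradient can be estimated from the fraction of B. A minimal sketch, assuming ideal linear mixing of the two eluents; the function name is hypothetical and the gradient shape over time is not specified in the text.

```python
def imidazole_mm(fraction_b: float, a_mm: float = 25.0, b_mm: float = 500.0) -> float:
    """Imidazole concentration (mM) for a given fraction of eluent B (0.0 to 1.0)."""
    if not 0.0 <= fraction_b <= 1.0:
        raise ValueError("fraction_b must be between 0 and 1")
    # Ideal mixing: weighted average of the two eluent concentrations.
    return (1.0 - fraction_b) * a_mm + fraction_b * b_mm


for percent_b in (0, 25, 50, 75, 100):
    conc = imidazole_mm(percent_b / 100.0)
    print(f"{percent_b:3d}% B -> {conc:.1f} mM imidazole")
```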
Example 2. Determination of the PAM sequences recognized by the Cas proteins
In this example, an sgRNA (single guide RNA) containing a specific guide sequence was mixed with the recombinant protein purified in Example 1 and used to cleave an in vitro cleavage substrate (containing a spacer sequence and a 7 nt random sequence). After incubation at 37 °C, the products were purified, a library was constructed, and NGS sequencing and analysis were performed to determine the PAM sequences recognized by the Cas proteins.
The designed in vitro cleavage substrate sequence is shown as SEQ ID NO. 11.
N in the sequence represents A, T, C, or G.
The cleavage substrate was sent to a sequencing company for PCR-free library construction and NGS sequencing.
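With N standing for any of four bases, a 7 nt random region spans 4^7 possible sequences. The sketch below enumerates them to show the size of the candidate PAM space; the helper name is hypothetical, and nothing here reproduces SEQ ID NO: 11 itself.

```python
from itertools import product

BASES = "ATCG"


def random_region_variants(length: int = 7) -> list[str]:
    """All possible sequences of the given length over A, T, C, G."""
    return ["".join(combo) for combo in product(BASES, repeat=length)]


variants = random_region_variants()
print(len(variants))  # 4**7 = 16384
```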
Preparation of sgRNA
sgRNA containing a specific guide sequence was synthesized by in vitro transcription at 37 °C in a reaction containing T7 RNA polymerase, the four ribonucleoside triphosphates, and a DNA template carrying a T7 promoter; the transcription product was precipitated and purified with LiCl.
The sgRNA sequences are:
Cas12i-Z1-sgRNA (SEQ ID NO: 12), Cas12i-Z1-sgRNA-Rev (SEQ ID NO: 13),
Cas12i-Z2-sgRNA (SEQ ID NO: 14), Cas12i-Z2-sgRNA-Rev (SEQ ID NO: 15).
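The general layout of such a guide RNA, with the guide sequence placed 3' of the direct repeat as recited in claim 2, can be sketched as follows. This is an illustration under stated assumptions: the direct repeat is SEQ ID NO: 3 written as DNA and converted to RNA, whereas the 20 nt guide is a hypothetical placeholder and the result is not SEQ ID NO: 12 to 15, which are not reproduced in this section.

```python
# Direct repeat for Cas12i-Z1 (SEQ ID NO: 3), given in the text as DNA.
DR_Z1_DNA = "AGAGAGTTCGCGTAGTTCTGTAGTGTGGAACTTGAGAC"

# Hypothetical placeholder guide; not a sequence from the patent.
EXAMPLE_GUIDE_DNA = "GATTACAGATTACAGATTAC"


def to_rna(dna: str) -> str:
    """Convert a DNA-alphabet sequence to RNA by replacing T with U."""
    return dna.upper().replace("T", "U")


def build_guide_rna(direct_repeat_dna: str, guide_dna: str) -> str:
    """Concatenate 5'-direct repeat-guide-3' and return it as RNA."""
    return to_rna(direct_repeat_dna) + to_rna(guide_dna)


guide_rna = build_guide_rna(DR_Z1_DNA, EXAMPLE_GUIDE_DNA)
print(guide_rna)
```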
PAM library cleavage
A reaction containing Cas12i-Z1 recombinant protein (about 10 mg/mL, 0.5 μL), 2.8 μg of Cas12i-Z1-sgRNA or Cas12i-Z1-sgRNA-Rev, in vitro cleavage substrate (60 ng/μL, 36 μL), and buffer (200 mM HEPES, 1 M NaCl, 50 mM MgCl2, 1 mM EDTA; 5 μL) was prepared and incubated at 37 °C for 3 h and then at 75 °C for 15 min.
A reaction containing Cas12i-Z2 recombinant protein (about 10 mg/mL, 0.5 μL), 2.8 μg of Cas12i-Z2-sgRNA or Cas12i-Z2-sgRNA-Rev, in vitro cleavage substrate (60 ng/μL, 36 μL), and buffer (200 mM HEPES, 1 M NaCl, 50 mM MgCl2, 1 mM EDTA; 5 μL) was prepared and incubated at 37 °C for 3 h and then at 75 °C for 15 min.
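The component masses in each cleavage reaction follow directly from the stated concentrations and volumes. A minimal worked calculation, using only the figures given above (the protein concentration is stated as approximate, so its mass is approximate too); the function name is hypothetical.

```python
def mass_ug(conc_ug_per_ul: float, volume_ul: float) -> float:
    """Mass in micrograms from a concentration (ug/uL) and a volume (uL)."""
    return conc_ug_per_ul * volume_ul


# 10 mg/mL is the same as 10 ug/uL; 60 ng/uL is 0.060 ug/uL.
protein_ug = mass_ug(10.0, 0.5)
substrate_ug = mass_ug(0.060, 36.0)
sgrna_ug = 2.8  # stated directly in the protocol

print(f"Cas protein: ~{protein_ug:.1f} ug")
print(f"Cleavage substrate: {substrate_ug:.2f} ug")
print(f"sgRNA: {sgrna_ug:.1f} ug")
```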
T4 DNA Polymerase treatment to fill in the cleavage products
T4 DNA Polymerase (Thermo Scientific) was added to the cleavage product, and the reaction was carried out at 37 °C for 20 min and then at 85 °C for 10 min.
3'-terminal A addition and addition of a biotin-labeled adapter
a. 78 μL of SPRIselect beads (Beckman Coulter) was added to the T4 DNA Polymerase reaction product, mixed, and left at room temperature for 5 min; the tube was placed on a magnetic rack for 5 min, and the supernatant was transferred to a new 1.5 mL tube. A further 39 μL of SPRIselect beads (Beckman Coulter) was added, mixed, and left at room temperature for 5 min; the tube was placed on a magnetic rack for 5 min and the supernatant was discarded. The beads were washed twice with 85% ethanol, air-dried at room temperature for 10 min, and eluted with 50 μL of ddH2O.
b. A 3' A was added to the product of step a using the SYNPLSEQ DNA Library Prep Kit for Illumina (37 °C for 10 min, 65 °C for 20 min, then hold at 4 °C).
c. Adapter 1 was obtained by annealing an upstream primer, 5' biosg/GTTGACATGCTGGATTGAGACTTCCTACACTCTTTCCCTACACGACGCTCTTCCGATC*T (SEQ ID NO: 16), where * represents a phosphorothioate modification at the T base, and a downstream primer, GATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGGAAGTCTCAATCCAGCATGTCAAC (SEQ ID NO: 17). Adapter 1 was added, and the reaction was carried out at 20 °C for 30 min and then at 16 °C overnight. The reaction product was purified using SPRIselect beads.
d. The reaction product was purified using streptavidin-labeled magnetic beads, Dynabeads M-280 Streptavidin (Invitrogen).
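The two oligonucleotides of Adapter 1 can be checked for complementarity in a few lines. The sketch below assumes the upstream primer ends in a single T after ...GATC (the phosphorothioate-modified base) and ignores the 5' biotin; under that reading, the downstream primer is the exact reverse complement of the remainder, leaving a one-base 3' T overhang that would pair with the 3' A added in step b. The variable and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# SEQ ID NO: 16 without the 5' biotin; the terminal T is an assumed reading.
UPSTREAM = "GTTGACATGCTGGATTGAGACTTCCTACACTCTTTCCCTACACGACGCTCTTCCGATC" + "T"
# SEQ ID NO: 17
DOWNSTREAM = "GATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGGAAGTCTCAATCCAGCATGTCAAC"

duplex_region = UPSTREAM[:-1]  # part expected to pair with the downstream primer
overhang = UPSTREAM[-1]        # unpaired 3' base

is_complementary = reverse_complement(duplex_region) == DOWNSTREAM
print(f"Duplex fully complementary: {is_complementary}")
print(f"3' overhang on upstream strand: {overhang}")
```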
e. Recover PCR
Primers were designed and a Recover PCR reaction was performed using Q5 Hot Start High-Fidelity 2X Master Mix (NEB).
Recover PCR primer F: GGAGTTCAGACGTGTGCTC (SEQ ID NO: 18)
Recover PCR primer R: GTTGACATGCTGGATTGAGACTTC (SEQ ID NO: 19)
The Recover PCR product was placed on a magnetic rack for 5 min, the supernatant was transferred to a fresh 1.5 mL centrifuge tube, and 3 μL of the Recover PCR product was diluted with 148.5 μL of ddH2O.
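The dilution applied to the Recover PCR product before Index PCR can be expressed as a fold factor, taking the total volume as the sum of sample and diluent. A minimal sketch; the function name is hypothetical.

```python
def dilution_factor(sample_ul: float, diluent_ul: float) -> float:
    """Fold dilution when sample_ul of sample is mixed with diluent_ul of diluent."""
    return (sample_ul + diluent_ul) / sample_ul


factor = dilution_factor(sample_ul=3.0, diluent_ul=148.5)
print(f"{factor:.1f}-fold dilution")  # 151.5 / 3 = 50.5
```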
g. Index PCR
Index PCR was performed using the following primers.
Index PCR primer IF501:
aatgatacggcgaccaccgagatctacactatagcctacactctttccctacacgacg(SEQ ID NO: 20)
Index PCR primer IR701:
caagcagaagacggcatacgagatcgagtaatgtgactggagttcagacgtgtgctc (SEQ ID NO: 21)
The Index PCR product was purified with 0.7x SPRIselect beads and eluted with 38 μL of ddH2O; the concentration was measured with Qubit, and the sample was sent for NGS sequencing.
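The primer sequences listed for the Recover PCR and Index PCR steps share terminal segments with one another and with the adapter, which can be confirmed by simple string comparison. The sketch below only reports those overlaps; reading them as the annealing sites that let Index PCR extend the Recover PCR product is an interpretation, not a statement from the text, and the helper name is hypothetical.

```python
ADAPTER_UPSTREAM = "GTTGACATGCTGGATTGAGACTTCCTACACTCTTTCCCTACACGACGCTCTTCCGATC"  # SEQ ID NO: 16 body
RECOVER_F = "GGAGTTCAGACGTGTGCTC"        # SEQ ID NO: 18
RECOVER_R = "GTTGACATGCTGGATTGAGACTTC"   # SEQ ID NO: 19
IF501 = "aatgatacggcgaccaccgagatctacactatagcctacactctttccctacacgacg".upper()  # SEQ ID NO: 20
IR701 = "caagcagaagacggcatacgagatcgagtaatgtgactggagttcagacgtgtgctc".upper()   # SEQ ID NO: 21


def shared_3prime_length(primer: str, template: str, min_len: int = 12) -> int:
    """Length of the longest 3'-terminal stretch of primer found in template.

    Returns 0 if no stretch of at least min_len bases is shared.
    """
    for n in range(len(primer), min_len - 1, -1):
        if primer[-n:] in template:
            return n
    return 0


print("Recover R is the 5' end of the adapter strand:", ADAPTER_UPSTREAM.startswith(RECOVER_R))
print("IR701 ends with Recover F:", IR701.endswith(RECOVER_F))
print("IF501 3' bases shared with adapter strand:", shared_3prime_length(IF501, ADAPTER_UPSTREAM))
```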
NGS result analysis: following the method of the reference (A compact Cas9 ortholog from Staphylococcus auricularis (SauriCas9) expands the DNA targeting scope. PLoS Biology, 2020, 18(3), e3000686), the captured 7 nt random sequences were analyzed with WebLogo software. The PAM sequences were thereby identified: the PAM sequence recognized by Cas12i-Z1 is 5'-TTN-3', and the PAM sequence recognized by Cas12i-Z2 is 5'-TTN-3'.
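The kind of per-position analysis that a sequence logo summarizes can be sketched as follows: base frequencies are tallied at each position of the captured 7 nt sequences, and positions with high information content indicate conserved PAM bases. This is a simplified illustration, not the WebLogo implementation (no small-sample correction is applied); the four reads are hypothetical toy data, and which positions lie next to the protospacer is an assumption of the example.

```python
from collections import Counter
from math import log2

BASES = "ACGT"


def position_frequencies(seqs: list[str]) -> list[dict[str, float]]:
    """Per-position base frequencies for equal-length sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    freqs = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        total = sum(counts[b] for b in BASES)
        freqs.append({b: counts[b] / total for b in BASES})
    return freqs


def information_bits(freq: dict[str, float]) -> float:
    """Information content in bits: 2 minus the Shannon entropy of the position."""
    entropy = -sum(p * log2(p) for p in freq.values() if p > 0)
    return 2.0 - entropy


# Hypothetical captured sequences: positions 5 and 6 are always T, the rest vary.
toy_reads = ["ACGATTA", "CGTCTTC", "GTAGTTG", "TACTTTT"]

freqs = position_frequencies(toy_reads)
bits = [information_bits(f) for f in freqs]
for pos, (f, b) in enumerate(zip(freqs, bits), start=1):
    top = max(BASES, key=lambda base: f[base])
    print(f"position {pos}: most frequent {top}, information {b:.2f} bits")
```

In this toy set, only positions 5 and 6 carry information, the pattern expected of a TTN-type motif followed by an unconstrained base.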
Example 3.
The Cas12i-Z1-N2-Target plasmid (SEQ ID NO: 23) was constructed by a commercial service provider, and an sgRNA targeting the plasmid (SEQ ID NO: 22, N2-sgRNA) was synthesized. The Cas12i-Z1-N2-Target plasmid was linearized by digestion with XmnI (NEB, R0194) at 37 °C; after the reaction was complete, the product was purified with the Wizard SV Gel and PCR Clean-Up System (Promega, A9282) and its concentration was measured with a NanoDrop.
Cas12i-Z1 recombinant protein (10 mg/mL, 0.5 μL) prepared as in Example 1 was mixed with the linearized plasmid described above (140 ng/μL, 18 μL), N2-sgRNA (180 ng/μL, 15 μL), and 10x Cut Buffer (200 mM HEPES, 1 M NaCl, 50 mM MgCl2, 1 mM EDTA; 5 μL), and ultrapure water was added to 50 μL. The reaction was carried out at 37 °C for 1 h, followed by a 75 °C water bath for 10 min.
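The final composition of this 50 μL reaction can be worked out from the listed volumes: 5 μL of 10x buffer in 50 μL gives a 1x working concentration, and the water volume is whatever remains after the other components. A minimal sketch using only the figures stated above; the dictionary and variable names are hypothetical.

```python
TOTAL_UL = 50.0

# Component volumes in microlitres, as listed for the reaction.
VOLUMES_UL = {
    "Cas12i-Z1 protein": 0.5,
    "linearized plasmid": 18.0,
    "N2-sgRNA": 15.0,
    "10x Cut Buffer": 5.0,
}

# 10x Cut Buffer composition in mM.
BUFFER_10X_MM = {"HEPES": 200.0, "NaCl": 1000.0, "MgCl2": 50.0, "EDTA": 1.0}

water_ul = TOTAL_UL - sum(VOLUMES_UL.values())
final_mm = {
    name: conc * VOLUMES_UL["10x Cut Buffer"] / TOTAL_UL
    for name, conc in BUFFER_10X_MM.items()
}
plasmid_ng = 140.0 * VOLUMES_UL["linearized plasmid"]
sgrna_ng = 180.0 * VOLUMES_UL["N2-sgRNA"]

print(f"Ultrapure water to add: {water_ul:.1f} uL")
for name, conc in final_mm.items():
    print(f"{name}: {conc:g} mM final")
print(f"Plasmid: {plasmid_ng:.0f} ng, sgRNA: {sgrna_ng:.0f} ng")
```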
6 μL of loading buffer was added to the reaction product, and 30 μL was analyzed by electrophoresis. Cleavage fragments were observed at 1000-2000 bp, demonstrating that the Cas12i-Z1 protein cleaves the target nucleic acid.
Claims (5)
1. A Cas protein, characterized in that the amino acid sequence of the Cas protein is as shown in SEQ ID NO: 1.
2. A guide RNA comprising:
(i) a direct repeat sequence, wherein the direct repeat sequence is the sequence shown in SEQ ID NO: 3; and
(ii) a guide sequence engineered to hybridize to a target nucleic acid; the direct repeat sequence is linked to the guide sequence, and the guide RNA is capable of forming a complex with the Cas protein of claim 1 and directing specific binding of the complex to the sequence of the target nucleic acid;
the guide sequence is located 3' of the direct repeat sequence.
3. A fusion protein comprising the following elements:
(1) the Cas protein of claim 1, and
(2) a homologous or heterologous functional domain;
the homologous or heterologous functional domain is any one or more selected from the following: subcellular localization signals, transcriptional activation domains, transcriptional inhibition domains, deaminase domains, methylases, and demethylases.
4. An isolated nucleic acid encoding the Cas protein of claim 1 or the fusion protein of claim 3.
5. A CRISPR-Cas system, characterized in that the CRISPR-Cas system comprises:
a. the Cas protein of claim 1, the fusion protein of claim 3, or the nucleic acid of claim 4; and
b. the guide RNA of claim 2, or a polynucleotide sequence encoding the guide RNA.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410155116.6A CN117683749B (en) | 2024-02-04 | 2024-02-04 | Cas proteins and uses thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410155116.6A CN117683749B (en) | 2024-02-04 | 2024-02-04 | Cas proteins and uses thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117683749A CN117683749A (en) | 2024-03-12 |
CN117683749B CN117683749B (en) | 2024-05-17 |
Family
ID=90133859
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410155116.6A Active CN117683749B (en) | 2024-02-04 | 2024-02-04 | Cas proteins and uses thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117683749B (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023078314A1 (en) * | 2021-11-02 | 2023-05-11 | Huidagene Therapeutics Co., Ltd. | Novel crispr-cas12i systems and uses thereof |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114015674A (en) * | 2021-11-02 | 2022-02-08 | 辉二(上海)生物科技有限公司 | Novel CRISPR-Cas12i system |
Non-Patent Citations (2)
Title |
---|
Upgrading the CRISPR-Cas gene editing system: focusing on Cas proteins and PAM; Tang Lianchao; Yi Chuan (遗传); 20200331; Vol. 42 (No. 3); pp. 236-249 * |
Mechanisms for target recognition and cleavage by the Cas12i RNA-guided endonuclease; Heng Zhang; Nature Structural & Molecular Biology; 20200907; Vol. 27; pp. 1069-1092 * |
Also Published As
Publication number | Publication date |
---|---|
CN117683749A (en) | 2024-03-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11155803B2 (en) | Adenosine deaminase base editors and methods of using same to modify a nucleobase in a target sequence | |
AU2020267249B2 (en) | Genome editing using campylobacter jejuni crispr/cas system-derived rgen | |
US20240132877A1 (en) | Genome editing systems comprising repair-modulating enzyme molecules and methods of their use | |
US20190062734A1 (en) | Grna fusion molecules, gene editing systems, and methods of use thereof | |
CN106852157B (en) | Compositions and methods for expressing CRISPR guide RNA using H1 promoter | |
CA3100014A1 (en) | Methods of suppressing pathogenic mutations using programmable base editor systems | |
US20170152508A1 (en) | Using Truncated Guide RNAs (tru-gRNAs) to Increase Specificity for RNA-Guided Genome Editing | |
US11344609B2 (en) | Compositions and methods for treating hemoglobinopathies | |
CA3100034A1 (en) | Methods of editing single nucleotide polymorphism using programmable base editor systems | |
JP2022526695A (en) | Inhibition of unintentional mutations in gene editing | |
Krishnamurthy et al. | Functional correction of CFTR mutations in human airway epithelial cells using adenine base editors | |
CA3172178A1 (en) | Compositions and methods for the targeting of c9orf72 | |
AU722909B2 (en) | Method for the production of a therapeutic DNA | |
JP7123982B2 (en) | A platform for expressing proteins of interest in the liver | |
JP2022516647A (en) | Non-toxic CAS9 enzyme and its uses | |
US20220313799A1 (en) | Compositions and methods for editing a mutation to permit transcription or expression | |
KR20230076820A (en) | Synthetic Miniature CRISPR-CAS (CASMINI) System for Eukaryotic Genome Engineering | |
CN115768487A (en) | CRISPR inhibition for facioscapulohumeral muscular dystrophy | |
WO2020069331A1 (en) | Artificial rna-guided splicing factors | |
CN117683749B (en) | Cas proteins and uses thereof | |
CN114144519A (en) | Single base replacement proteins and compositions comprising the same | |
CN112585266A (en) | Novel transcriptional activator | |
Qian et al. | A new compact adenine base editor generated through deletion of HNH and REC2 domain of SpCas9 | |
US20240132855A1 (en) | Compositions and methods for epigenetic regulation of hbv gene expression | |
WO2024078645A2 (en) | Cas protein and use thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||