CN116144629A - Cas9 protein, gene editing system containing Cas9 protein and application - Google Patents

Cas9 protein, gene editing system containing Cas9 protein and application

Info

Publication number
CN116144629A
Authority
CN
China
Prior art keywords
sequence
nucleic acid
seq
base
protein
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211134139.6A
Other languages
Chinese (zh)
Inventor
王永明
陶晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fudan University
Original Assignee
Fudan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fudan University
Priority to CN202211134139.6A
Publication of CN116144629A
Legal status: Pending

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biomedical Technology (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Plant Pathology (AREA)
  • Medicinal Chemistry (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The invention relates to the technical field of gene editing, in particular to a CRISPR/Cas9 gene editing system and its application. The disclosed gene editing system involves three specific Cas9 proteins, namely SauriCas9-HF, Sha2Cas9-HF and Sa-SlugCas9-HF, and the complexes they form with an sgRNA. The system can accurately locate a target DNA sequence and cleave it, producing a double-strand break in the target sequence. It has high specificity, can reduce the off-target rate of gene editing in cells or in vitro, and has broad application prospects.

Description

Cas9 protein, gene editing system containing Cas9 protein and application
Technical Field
The application relates to the technical field of gene editing, in particular to a Cas9 protein, a gene editing system containing the Cas9 protein and related applications thereof.
Background
The CRISPR/Cas9 system is an adaptive immune system that bacteria and archaea have evolved to protect against invasion by foreign viruses or plasmids. The CRISPR/Cas9 system contains a tracrRNA (trans-activating crRNA) and a crRNA (CRISPR RNA), which form a complex with Cas9 in order to function. The tracrRNA and crRNA can be fused into a single guide RNA (sgRNA) by a linker sequence. After DNA break damage occurs, two major DNA damage repair mechanisms within the cell are responsible for repair: non-homologous end joining (NHEJ) and homologous recombination (HR). NHEJ repair results in base deletions or insertions and can be used for gene knockout; when a homologous template is provided, HR repair can be used for site-directed gene insertion and precise base substitution.
Besides basic scientific research, the CRISPR/Cas9 gene editing system also has broad prospects for clinical application. When the system is used for gene therapy, the Cas nuclease must mediate precise editing; if it cannot accurately distinguish the target site from highly similar sequences, off-target editing can result, which is one of the problems that current gene therapy needs to solve. One approach is to develop Cas9 mutants with improved editing specificity, that is, an enhanced ability to distinguish target site sequences from highly similar sequences. The inventors previously developed the SauriCas9, Sha2Cas9 and Sa-SlugCas9 proteins, but these have low specificity, are prone to off-target editing, and are difficult to use widely. Thus, increasing the specificity of these three proteins and decreasing their off-target rate is key to applying them in gene therapy.
Disclosure of Invention
The present inventors conducted intensive studies to solve the above problems and completed the present invention by constructing three highly specific Cas9 proteins, each of which can form, with the same single-stranded guide RNA, a CRISPR/Cas9 gene editing system capable of efficient gene editing.
Accordingly, in a first aspect, the present invention provides a Cas9 protein, the Cas9 protein being:
a SauriCas9-HF protein having the amino acid sequence set forth in SEQ ID NO: 1, or a homolog thereof that has at least 80% sequence identity to the amino acid sequence set forth in SEQ ID NO: 1 and retains its biological activity;
a Sha2Cas9-HF protein having the amino acid sequence set forth in SEQ ID NO: 2, or a protein having at least 80% sequence identity to the amino acid sequence set forth in SEQ ID NO: 2; or
a Sa-SlugCas9-HF protein having the amino acid sequence set forth in SEQ ID NO: 3, or a homolog thereof that has at least 80% sequence identity to the amino acid sequence set forth in SEQ ID NO: 3 and retains its biological activity.
In a second aspect, the present invention provides a conjugate comprising:
a) The Cas9 protein of the first aspect;
b) A modifying moiety; and
c) Optionally a linker for linking the Cas9 protein to the modifying moiety.
In a third aspect, the present invention provides a fusion protein comprising:
a) The Cas9 protein of the first aspect;
b) Additional proteins and polypeptides; and
c) Optionally a linker for linking the Cas9 protein to the additional proteins and polypeptides.
In a fourth aspect, the invention provides an isolated nucleic acid molecule comprising a nucleic acid sequence encoding:
a) The Cas9 protein of the first aspect;
b) The conjugate of the second aspect; or alternatively
c) The fusion protein of the third aspect.
In a fifth aspect, the invention provides a vector comprising a nucleic acid sequence encoding:
a) The Cas9 protein of the first aspect;
b) The conjugate of the second aspect; or alternatively
c) The fusion protein of the third aspect.
In a sixth aspect, the invention provides a CRISPR/Cas9 gene editing system comprising:
a) A protein component comprising the Cas9 protein of the first aspect, the conjugate of the second aspect, or the fusion protein of the third aspect; and
b) A nucleic acid component comprising a single stranded guide RNA that comprises a scaffold sequence having:
(i) the nucleic acid sequence set forth in SEQ ID NO: 7;
(ii) a nucleic acid sequence having at least 90% sequence identity to the nucleic acid sequence set forth in SEQ ID NO: 7; or
(iii) a nucleic acid sequence based on SEQ ID NO: 7 that retains its biological activity;
wherein the protein component and the nucleic acid component bind to each other to form a complex.
In a seventh aspect, the invention provides a cell comprising: the isolated nucleic acid molecule of the fourth aspect, or the vector of the fifth aspect.
In an eighth aspect, the invention provides a method of gene editing a target sequence in an intracellular or in vitro environment, the method comprising: contacting any one of the following (1) to (3) with a target sequence in an intracellular or in vitro environment:
(1) The Cas9 protein of the first aspect, the conjugate of the second aspect, or the fusion protein of the third aspect, and a single stranded guide RNA;
(2) The vector of the fifth aspect; and
(3) The CRISPR/Cas9 gene editing system of the sixth aspect;
wherein, upon contact with a target sequence, the Cas9 protein, the conjugate, or the fusion protein recognizes a respective protospacer adjacent motif (PAM) located at the 5' end of the target sequence and having the sequence 5'-NNGG;
wherein the single stranded guide RNA comprises a scaffold sequence having:
(i) the nucleic acid sequence set forth in SEQ ID NO: 7;
(ii) a nucleic acid sequence having at least 90% sequence identity to the nucleic acid sequence set forth in SEQ ID NO: 7; or
(iii) a nucleic acid sequence based on SEQ ID NO: 7 that retains its biological activity.
In a ninth aspect, the present invention provides a kit for gene editing of a target sequence in an intracellular or in vitro environment, comprising:
a) Any one selected from the following 1) to 4):
1) The Cas9 protein of the first aspect, the conjugate of the second aspect, or the fusion protein of the third aspect, and a single stranded guide RNA;
2) The isolated nucleic acid molecule of the fourth aspect;
3) The vector of the fifth aspect; or alternatively
4) The CRISPR/Cas9 gene editing system of the sixth aspect; and
b) Instructions for how to perform gene editing of a target sequence in an intracellular or in vitro environment;
wherein the single stranded guide RNA comprises a scaffold sequence having:
(i) the nucleic acid sequence set forth in SEQ ID NO: 7;
(ii) a nucleic acid sequence having at least 90% sequence identity to the nucleic acid sequence set forth in SEQ ID NO: 7; or
(iii) a nucleic acid sequence based on SEQ ID NO: 7 that retains its biological activity.
As described above, the invention modifies Cas9 proteins capable of gene editing in a eukaryotic cell environment. The resulting Cas9 proteins have higher specificity, can form complexes with the same sgRNA to carry out gene editing, have an off-target rate close to 0%, and are well suited to further development as gene therapy tools. The invention expands the range of gene editing and has broad application prospects in the field of gene editing.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
FIG. 1 is a schematic diagram showing the editing efficiency of the CRISPR/SauriCas9-HF gene editing system after gene editing of four target sites;
FIG. 2 is a schematic diagram showing the editing efficiency of the CRISPR/Sha2Cas9-HF gene editing system after gene editing of eight target sites;
FIG. 3 is a schematic diagram showing the editing efficiency of the CRISPR/Sa-SlugCas9-HF gene editing system after gene editing of eight target sites;
FIG. 4 is a schematic diagram showing the specificity test results for the CRISPR/SauriCas9-HF gene editing system, with improved specificity, in a GFP reporter HEK293T cell line;
FIG. 5 is a schematic diagram showing the specificity test results for the CRISPR/Sha2Cas9-HF gene editing system, with improved specificity, in a GFP reporter HEK293T cell line;
FIG. 6 is a schematic diagram showing the specificity test results for the CRISPR/Sa-SlugCas9-HF gene editing system, with improved specificity, in a GFP reporter HEK293T cell line.
Detailed Description
Hereinafter, the present invention will be described in detail with reference to the accompanying drawings. It is to be understood that the following description is intended to illustrate the invention by way of example only, and is not intended to limit the scope of the invention as defined by the appended claims. And, it is understood by those skilled in the art that modifications may be made to the technical scheme of the present invention without departing from the spirit and gist of the present invention. The technical means used in the examples are conventional means well known to those skilled in the art unless otherwise indicated.
Where a range of values is provided, such as a range of concentrations, a range of percentages, or a range of ratios, it is to be understood that each intervening value, to the tenth of the unit of the lower limit, between the upper and lower limit of the range, and any other stated or intervening value in that stated range, is encompassed within the subject matter unless the context clearly dictates otherwise. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges, and such embodiments are also included in the subject matter, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the subject matter.
In the context of the present invention, many embodiments use the expression "comprising", "including" or "consisting essentially/mainly of". The expression "comprising", "including" or "consisting essentially of" is generally understood to be an open-ended expression that covers not only the individual elements, components, assemblies, and method steps specifically listed thereafter, but also other elements, components, assemblies, and method steps. In addition, the expression "comprising", "including" or "consisting essentially of" is also to be understood in some instances as a closed expression, meaning that only the elements, components, assemblies, and method steps specifically listed thereafter are included, and no others. In that case, the expression is equivalent to the expression "consisting of".
For a better understanding of the present teachings and without limiting the scope of the present teachings, all numbers expressing quantities, percentages or proportions used in the specification and claims, and other numerical values, are to be understood as being modified in all instances by the term "about" unless otherwise indicated. Accordingly, unless indicated to the contrary, the numerical parameters set forth in the following specification and attached claims are approximations that may vary depending upon the desired properties sought to be obtained. At the very least, each numerical parameter should at least be construed in light of the number of reported significant digits and by applying ordinary rounding techniques.
Definitions
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the subject matter described herein belongs. Before describing the present invention in detail, the following definitions are provided to better understand the present invention.
The terms "Cas9 protein", "Cas9" and "Cas" are used interchangeably herein to refer to RNA-guided nucleases, including Cas9 proteins or functionally active fragments thereof. The Cas9 protein is the protein component of the CRISPR/Cas9 genome editing system; under the direction of a single-stranded guide RNA (gRNA), it can target and cleave a DNA target sequence to form a DNA double-strand break (DSB). The DNA double-strand break activates the cell's intrinsic repair mechanisms, non-homologous end joining (NHEJ) and homologous recombination (HR), which repair the DNA damage in the cell. During repair, site-directed editing of the specific DNA sequence takes place.
The terms "single stranded guide RNA", "gRNA", "sgRNA (single guide RNA)" and "mature crRNA" are used interchangeably in this application and have the meaning commonly understood by those skilled in the art. In general, a single-stranded guide RNA, also referred to herein as a guide RNA (gRNA), may comprise a scaffold sequence and a guide sequence. In the context of endogenous CRISPR systems, guide sequences are also known as spacer sequences (spacers). In certain instances, the guide sequence is any polynucleotide sequence that has sufficient complementarity to a target sequence to hybridize to the target sequence and guide the specific binding of the CRISPR/Cas9 complex to the target sequence. In certain embodiments, the degree of complementarity between a guide sequence and its corresponding target sequence is at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or at least 99% when optimally aligned. Determining the optimal alignment is within the ability of one of ordinary skill in the art. For example, there are published and commercially available alignment algorithms and programs such as, but not limited to, ClustalW, the Smith-Waterman algorithm in MATLAB, Bowtie, Geneious, Biopython, and SeqMan.
The term "CRISPR/Cas9 complex" as used herein refers to a complex formed by single stranded guide RNA (single guide RNA) or mature crRNA binding to a Cas9 protein, comprising a guide sequence that hybridizes to a target sequence and thereby binds Cas9 protein to the target sequence. The complex is capable of recognizing and cleaving a polynucleotide that hybridizes to the single stranded guide RNA or mature crRNA.
Thus, in the context of forming a CRISPR/Cas9 complex, a "target sequence" refers to a polynucleotide that a guide sequence is designed to target, e.g., a sequence that has complementarity to the guide sequence, wherein hybridization between the target sequence and the guide sequence promotes the formation of the CRISPR/Cas9 complex. Complete complementarity is not necessary so long as sufficient complementarity exists to cause hybridization and promote the formation of a CRISPR/Cas9 complex. The target sequence may comprise any polynucleotide, such as DNA or RNA. In some cases, the target sequence is located in the nucleus or cytoplasm of the cell. In some cases, the target sequence may be located within an organelle of a eukaryotic cell, such as a mitochondrion or chloroplast.
The term "target sequence" or "target polynucleotide" as used herein may be any polynucleotide that is endogenous or exogenous to a cell (e.g., a eukaryotic cell). For example, the target polynucleotide may be a polynucleotide that is present in the nucleus of a eukaryotic cell. The target polynucleotide may be a sequence encoding a gene product (e.g., a protein) or a non-coding sequence (e.g., a regulatory polynucleotide or unwanted DNA). In some cases, the target sequence should be associated with a protospacer adjacent motif (PAM). The exact sequence and length requirements for the PAM vary depending on the Cas protein used, but a PAM is typically a 2-5 base sequence adjacent to the protospacer sequence (target sequence). Those skilled in the art are able to identify PAM sequences for use with a given Cas protein.
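As an illustration of PAM identification, the following minimal sketch scans one strand of a DNA sequence for the 5'-NNGG motif recited in this application. The function name `find_pam_sites` and the example sequence are hypothetical and are not part of the application; which side of the motif the target sequence lies on, and which strand is scanned, is left to the caller.

```python
def find_pam_sites(sequence, pam="NNGG"):
    """Return 0-based start indices at which `pam` matches `sequence`.

    'N' in the PAM pattern matches any base; every other letter must
    match the sequence exactly (case-insensitive).
    """
    sequence = sequence.upper()
    pam = pam.upper()
    hits = []
    for start in range(len(sequence) - len(pam) + 1):
        window = sequence[start:start + len(pam)]
        if all(p == "N" or p == b for p, b in zip(pam, window)):
            hits.append(start)
    return hits


if __name__ == "__main__":
    # Hypothetical 9-base example; prints the start index of each NNGG match.
    print(find_pam_sites("ACGGTTGGA"))
```

Only the given strand is scanned; to cover both strands, the same function could be applied to the reverse complement of the sequence as well.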
The terms "polynucleotide", "nucleic acid sequence", "nucleotide sequence" and "nucleic acid fragment" are used interchangeably herein and refer to single- or double-stranded RNA or DNA polymers, optionally containing synthetic, non-natural or altered nucleotide bases. Nucleotides are referred to by their single-letter designations as follows: "A" represents adenosine or deoxyadenosine (for RNA or DNA, respectively), "C" represents cytidine or deoxycytidine, "G" represents guanosine or deoxyguanosine, "U" represents uridine, "T" represents deoxythymidine, "R" represents purine (A or G), "Y" represents pyrimidine (C or T), "K" represents G or T, "H" represents A or C or T, "I" represents inosine, and "N" represents any nucleotide.
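The single-letter designations above can be captured in a small lookup table. The sketch below is illustrative only: the names `IUPAC_DNA` and `expand` are hypothetical, the table is restricted to the DNA alphabet, and "U" and "I" (inosine) are deliberately left out.

```python
from itertools import product

# Ambiguity codes taken from the definitions above (DNA alphabet only).
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "K": "GT",
    "H": "ACT",
    "N": "ACGT",  # any nucleotide
}


def expand(pattern):
    """List every concrete DNA sequence described by a degenerate pattern."""
    choices = [IUPAC_DNA[symbol] for symbol in pattern.upper()]
    return ["".join(bases) for bases in product(*choices)]


if __name__ == "__main__":
    print(expand("RY"))
    print(len(expand("NNGG")))
```

Expanding a pattern this way is practical only for short motifs, since the number of sequences grows multiplicatively with each ambiguous position.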
The terms "polypeptide", "peptide", and "protein" as used herein are used interchangeably herein to refer to a polymer of amino acid residues. The term applies to amino acid polymers in which one or more amino acid residues are artificial chemical analogues of the corresponding naturally occurring amino acid, and to naturally occurring amino acid polymers. The terms "polypeptide", "peptide", "amino acid sequence" and "protein" may also include modified forms including, but not limited to, glycosylation, lipid attachment, sulfation, gamma carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.
The term sequence "identity" or "homology" as used herein has its art-recognized meaning, and the percent sequence identity between two nucleic acid or polypeptide molecules or regions can be calculated using published techniques. Sequence identity can be measured along the full length of a polynucleotide or polypeptide or along a region of the molecule (see, e.g., Computational Molecular Biology, Lesk, A.M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D.W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A.M., and Griffin, H.G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heinje, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991). Although there are many ways to measure identity between two polynucleotides or polypeptides, the term "identity" is well known to the skilled person. Suitable conservative amino acid substitutions in peptides or proteins are known to those skilled in the art and can generally be made without altering the biological activity of the resulting molecule. In general, one skilled in the art recognizes that single amino acid substitutions in non-essential regions of a polypeptide do not substantially alter biological activity (see, e.g., Watson et al., Molecular Biology of the Gene, 4th Edition, 1987, The Benjamin/Cummings Pub. Co., p. 224).
The term "vector" as used herein refers to a nucleic acid vehicle into which a polynucleotide may be inserted. A vector is referred to as an expression vector when it is capable of allowing expression of a protein encoded by an inserted polynucleotide, or when it is capable of allowing transcription (e.g., transcription of an mRNA or functional RNA) of an inserted polynucleotide. The vector may be introduced into a host cell by transformation, transduction or transfection such that the genetic material elements carried thereby are expressed in the host cell. Vectors are well known to those skilled in the art and include, but are not limited to: plasmid vectors, viral vectors, and the like. The vector may also contain a variety of regulatory sequences that regulate expression. "regulatory sequence" and "regulatory element" are used interchangeably herein to refer to a nucleotide sequence that is located upstream (5 'non-coding sequence), intermediate or downstream (3' non-coding sequence) of a coding sequence, and affects transcription, RNA processing or stability, or translation of the relevant coding sequence. Regulatory sequences may include, but are not limited to, promoter sequences, transcription initiation sequences, enhancer sequences, selection elements, reporter genes, and the like. The regulatory sequences may be of different origin or may be of the same origin but arranged in a manner different from that normally found in nature. In addition, the vector may also contain a replication origin.
The term "promoter" as used herein refers to a nucleic acid fragment capable of controlling transcription of another nucleic acid fragment. In some embodiments of the invention, the promoter is a promoter capable of controlling transcription of a gene in a cell, whether or not it is derived from the cell. The promoter may be a constitutive or tissue specific or developmentally regulated or inducible promoter.
The term "constitutive promoter" as used herein refers to a promoter that will generally cause a gene to be expressed in most cell types under most conditions. "Tissue-specific promoter" and "tissue-preferred promoter" are used interchangeably and refer to promoters that are expressed primarily, but not necessarily exclusively, in one tissue or organ, but that may also be expressed in one particular cell or cell type. "Developmentally regulated promoter" refers to a promoter whose activity is determined by developmental events. An "inducible promoter" selectively expresses an operably linked DNA sequence in response to an endogenous or exogenous stimulus (environmental, hormonal, chemical signal, etc.).
"introducing" a nucleic acid molecule (e.g., plasmid, linear nucleic acid fragment, RNA, etc.) or protein into an organism refers to transforming a cell of the organism with the nucleic acid or protein such that the nucleic acid or protein is capable of functioning in the cell. "transformation" as used herein includes both stable transformation and transient transformation.
The term "stable transformation" as used herein refers to the introduction of an exogenous nucleotide sequence into the genome, resulting in stable inheritance of an exogenous gene. Once stably transformed, the exogenous nucleic acid sequence is stably integrated into the genome of the organism and any successive generation thereof.
The term "transient transformation" as used herein refers to the introduction of a nucleic acid molecule or protein into a cell to perform a function without stable inheritance of an exogenous gene. In transient transformation, the exogenous nucleic acid sequence is not integrated into the genome.
The term "complementarity" as used herein refers to the ability of one nucleic acid sequence to form one or more hydrogen bonds with another nucleic acid sequence by means of conventional Watson-Crick or other non-conventional base pairing. Percent complementarity means the percentage of residues in one nucleic acid molecule that can form hydrogen bonds (e.g., Watson-Crick base pairing) with another nucleic acid sequence (e.g., if 5, 6, 7, 8, 9 or 10 out of 10 residues are complementary, the percent complementarity is 50%, 60%, 70%, 80%, 90% or 100%, respectively). "Fully complementary" means that all consecutive residues of one nucleic acid sequence form hydrogen bonds with the same number of consecutive residues in another nucleic acid sequence. "Substantially complementary" as used herein refers to a degree of complementarity of at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99% or 100% over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50 or more nucleotides, or to two nucleic acids that hybridize under stringent conditions.
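The percent complementarity calculation described above can be sketched as follows. This is a simplified illustration, not a method prescribed by the application: it assumes two strands of equal length, both written 5' to 3', paired in antiparallel orientation without gaps, and it counts only Watson-Crick pairs (A-T, A-U, G-C). The function name and example sequences are hypothetical.

```python
# Watson-Crick pairs, covering both DNA (T) and RNA (U).
WATSON_CRICK = {
    ("A", "T"), ("T", "A"),
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
}


def percent_complementarity(strand_a, strand_b):
    """Percentage of positions in strand_a that pair with strand_b.

    Both strands are given 5'->3'; strand_b is reversed so that the two
    strands are compared in antiparallel orientation.
    """
    if len(strand_a) != len(strand_b) or not strand_a:
        raise ValueError("strands must be non-empty and of equal length")
    a = strand_a.upper()
    b = strand_b.upper()[::-1]
    paired = sum(1 for x, y in zip(a, b) if (x, y) in WATSON_CRICK)
    return 100.0 * paired / len(a)


if __name__ == "__main__":
    print(percent_complementarity("ACGTACGTAC", "GTACGTACGT"))
    print(percent_complementarity("ACGTACGTAC", "GTACGTACAA"))
```

Under these assumptions, 8 paired positions out of 10 give 80%, matching the arithmetic in the definition; non-conventional pairing and optimal alignment with gaps are outside the scope of this sketch.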
The term "stringent conditions" used herein in connection with hybridization refers to conditions under which a nucleic acid having complementarity to a target sequence hybridizes predominantly to the target sequence and does not substantially hybridize to non-target sequences. Stringent conditions are typically sequence-dependent and depend on a number of factors. In general, the longer the sequence, the higher the temperature at which the sequence specifically hybridizes to its target sequence. Non-limiting examples of stringent conditions are described in Tijssen, Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization With Nucleic Acid Probes, Part I, Chapter 2, "Overview of principles of hybridization and the strategy of nucleic acid probe assay", Elsevier, NY, 1993.
The term "hybridization" as used herein refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues. Hydrogen bonding may occur by Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner. The complex may comprise two strands forming a duplex, three or more strands forming a multi-strand complex, a single self-hybridizing strand, or any combination of these. Hybridization reactions may constitute a step in a broader process, such as the initiation of PCR or the cleavage of polynucleotides by an enzyme. A sequence that hybridizes to a given sequence is referred to as the "complement" of the given sequence.
Cas9 proteins
In a first aspect, the invention provides a Cas9 protein, the Cas9 protein being:
a SauriCas9-HF protein having the amino acid sequence set forth in SEQ ID NO: 1, or a homolog thereof that has at least 80% sequence identity to the amino acid sequence set forth in SEQ ID NO: 1 and retains its biological activity;
a Sha2Cas9-HF protein having the amino acid sequence set forth in SEQ ID NO: 2, or a protein having at least 80% sequence identity to the amino acid sequence set forth in SEQ ID NO: 2; or
a Sa-SlugCas9-HF protein having the amino acid sequence set forth in SEQ ID NO: 3, or a homolog thereof that has at least 80% sequence identity to the amino acid sequence set forth in SEQ ID NO: 3 and retains its biological activity.
The "at least 80% sequence identity" may be at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, at least 99.95%, at least 99.99%, at least 99.999%, 100%, or any percentage of sequence identity between 80% and 100%.
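To make the identity threshold concrete, the sketch below computes percent identity for two sequences that have already been aligned (gaps written as "-") and checks the result against a cutoff such as 80%. It is an assumption-laden illustration: the application does not specify how identity is computed, and dividing identical positions by the alignment length is only one of several conventions in use. The function names and the short peptide strings are hypothetical and are not taken from SEQ ID NOs: 1 to 3.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    Identical, non-gap positions are divided by the alignment length.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    identical = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * identical / len(aligned_a)


def meets_identity_threshold(aligned_a, aligned_b, threshold=80.0):
    """True if the percent identity is at least `threshold`."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    print(percent_identity("MKRNYILGLD", "MKRNYILGLA"))
    print(meets_identity_threshold("MKRN-ILGLD", "MKRNYILGAA"))
```

In practice the alignment itself would come from one of the alignment programs mentioned in the definitions; this sketch covers only the final counting step.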
In the present invention, the "biological activity" of the Cas9 protein refers to the activity of binding to a single-stranded guide RNA, endonuclease activity (including single-stranded cleavage activity and double-stranded cleavage activity), and/or the activity of binding to and cleaving at a specific site of a target sequence under the guidance of a guide RNA (gRNA), but is not limited thereto.
Derivatizing proteins
Cas9 proteins may be derivatized, e.g., linked to additional molecules (e.g., additional proteins or polypeptides). In general, derivatization (e.g., labeling) of a protein does not adversely affect the desired activity of the protein (e.g., its activity of binding to a single-stranded guide RNA, its endonuclease activity, or its activity of binding to and cleaving at a specific site of a target sequence under the guidance of a guide RNA). Accordingly, the Cas9 proteins of the invention are also intended to include such derivatized forms. For example, the Cas9 proteins of the invention may be functionally linked (by chemical coupling, gene fusion, non-covalent linkage, or otherwise) to one or more other molecular moieties, such as additional proteins or polypeptides, detectable labels, pharmaceutical agents, and the like.
In particular, Cas9 proteins may be linked to other functional units. For example, a Cas9 protein may be linked to a Nuclear Localization Signal (NLS) sequence to increase the ability of the proteins of the invention to enter the nucleus. For example, it may be linked to a targeting moiety to give the Cas9 proteins of the invention targeting ability. For example, it may be linked to a detectable label to facilitate detection of the Cas9 protein of the invention. For example, it may be linked to an epitope tag to facilitate expression, detection, tracking, and/or purification of the Cas9 proteins of the invention.
Accordingly, in a second aspect, the present invention provides a conjugate comprising:
a) The Cas9 protein of the first aspect;
b) A modifying moiety; and
c) Optionally a linker for linking the Cas9 protein to the modifying moiety.
It will be appreciated that the Cas9 protein may be used by itself or may be conjugated to other substances, such as other proteins or detectable labels, to impart additional functionality.
Thus, in a particular embodiment, the modifying moiety may be an additional protein or polypeptide, a detectable label, or a combination thereof.
In a further embodiment, the additional protein or polypeptide is selected from one or more of an epitope tag, a reporter protein, a Nuclear Localization Signal (NLS) sequence, cytosine deaminase (CBE), adenine deaminase (ABE), cytosine methylases DNMT3A and MQ1, cytosine demethylase Tet1, transcriptional activator proteins VP64, p65 and RTA, transcriptional repressor protein KRAB, histone acetylase p300, histone deacetylase LSD1, and endonuclease FokI.
Epitope tags are well known to those skilled in the art, examples of which include, but are not limited to, his, V5, FLAG, HA, myc, VSV-G, trx, etc., and it is known to those skilled in the art how to select an appropriate epitope tag according to the intended purpose (e.g., purification, detection, or labeling).
Reporter proteins are well known to those skilled in the art, examples of which include, but are not limited to GST, HRP, CAT, GFP, hcRed, dsRed, CFP, YFP, BFP, etc.
Detectable labels are well known to those skilled in the art, examples of which include fluorescent dyes, such as Fluorescein Isothiocyanate (FITC) or DAPI.
The Cas9 proteins of the invention may be coupled, conjugated or fused to the modifying moiety through a linker, or may be directly linked to the modifying moiety without a linker. Linkers are well known in the art, examples of which may include, but are not limited to, linkers comprising 1-50 amino acids (e.g., glu or Ser) or amino acid derivatives (e.g., ahx, beta-Ala, GABA, or Ava), or PEG, etc.
In a third aspect, the present invention provides a fusion protein comprising:
a) The Cas9 protein of the first aspect;
b) An additional protein or polypeptide; and
c) Optionally a linker for linking the Cas9 protein to the additional protein or polypeptide.
As in the second aspect of the invention, the additional protein or polypeptide may be selected from one or more of an epitope tag, a reporter protein, a Nuclear Localization Signal (NLS) sequence, cytosine deaminase (CBE), adenine deaminase (ABE), cytosine methylases DNMT3A and MQ1, cytosine demethylase Tet1, transcriptional activator proteins VP64, p65 and RTA, transcriptional repressor protein KRAB, histone acetylase p300, histone deacetylase LSD1, and endonuclease FokI.
Epitope tags are well known to those skilled in the art, examples of which include, but are not limited to, his, V5, FLAG, HA, myc, VSV-G, trx, etc., and it is known to those skilled in the art how to select an appropriate epitope tag according to the intended purpose (e.g., purification, detection, or labeling). Reporter proteins are well known to those skilled in the art, examples of which include, but are not limited to GST, HRP, CAT, GFP, hcRed, dsRed, CFP, YFP, BFP, etc.
Detectable labels are well known to those skilled in the art, examples of which include fluorescent dyes, such as Fluorescein Isothiocyanate (FITC) or DAPI.
The Cas9 proteins of the invention may be coupled, conjugated or fused to the additional protein or polypeptide through a linker, or may be directly linked to the additional protein or polypeptide without a linker. Linkers are well known in the art, examples of which include, but are not limited to, linkers comprising 1-50 amino acids (e.g., glu or Ser) or amino acid derivatives (e.g., ahx, beta-Ala, GABA, or Ava), or PEG, etc.
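As a non-limiting sketch of the fusion architecture described above, the following joins a Cas9 amino acid sequence, an optional peptide linker and an additional polypeptide into one chain. The function name `build_fusion`, the C-terminal placement of the partner and the toy Cas9 fragment are assumptions for illustration only; the SV40 large T antigen NLS (PKKKRKV) is a commonly used NLS and is not stated in the text to be the one used here.

```python
SV40_NLS = "PKKKRKV"  # commonly used NLS; its use here is an assumption


def build_fusion(cas9: str, partner: str, linker: str = "") -> str:
    """Concatenate Cas9, an optional linker and a partner polypeptide (N->C).

    An empty linker models direct linkage; otherwise the linker is limited
    to the 1-50 amino acid range mentioned in the text.
    """
    if len(linker) > 50:
        raise ValueError("linker longer than 50 amino acids")
    return cas9 + linker + partner


# Toy Cas9 fragment (hypothetical, not SEQ ID NO: 1-3) with a Ser/Glu linker.
print(build_fusion("MKRNYILGLD", SV40_NLS, linker="SES"))  # MKRNYILGLDSESPKKKRKV
```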
The invention engineers three Cas9 proteins capable of gene editing in a eukaryotic cell environment, namely the SauriCas9-HF, Sha2Cas9-HF and Sa-SlugCas9-HF proteins. The engineered proteins have higher specificity, can each form a complex with the same sgRNA to perform more precise gene editing, and are well suited to the later development of gene therapy tools. The invention expands the range of gene editing and has broad application prospects in the field of gene editing.
Coding nucleic acid and vector
In a fourth aspect, the invention provides an isolated nucleic acid molecule comprising a nucleic acid sequence encoding:
a) The Cas9 protein of the first aspect;
b) The conjugate of the second aspect; or alternatively
c) The fusion protein of the third aspect.
In a specific embodiment, the isolated nucleic acid molecule further comprises a nucleic acid sequence encoding a single stranded guide RNA; the single stranded guide RNA includes a scaffold sequence having:
(i) the nucleic acid sequence shown in SEQ ID NO: 7;
(ii) a nucleic acid sequence that has at least 90% sequence identity to the nucleic acid sequence shown in SEQ ID NO: 7 and retains its biological activity; or
(iii) a nucleic acid sequence that is obtained by modification based on the nucleic acid sequence shown in SEQ ID NO: 7 and retains its biological activity.
The "at least 90% sequence identity" may be at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.9%, or 100% sequence identity.
In a specific embodiment, the modification may be one or more of base phosphorylation, base thio-modification, base methylation, base hydroxylation, shortening of the sequence, and lengthening of the sequence.
In a further embodiment, the shortening of the sequence and the lengthening of the sequence comprises a deletion or addition of 1 to 10 bases relative to the base sequence, e.g., a deletion or addition of one, two, three, four, five, six, seven, eight, nine, or ten bases relative to the base sequence.
In yet another specific embodiment, the single stranded guide RNA can further comprise a CRISPR spacer sequence at the 5' end of the scaffold sequence, the CRISPR spacer sequence being a sequence of 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides in length and capable of complementary pairing with a target sequence.
In a preferred embodiment, the CRISPR spacer sequence is a sequence 21 nucleotides in length and capable of complementary pairing with a target sequence.
In a further embodiment, the single stranded guide RNA further comprises a terminator at the 3' end of the spacer sequence. As an example, the terminator may be a run of multiple U's, such as at least six (e.g., seven or eight) U's.
The single-stranded guide RNA is capable of binding to the Cas9 protein, conjugate or fusion protein described above to form a complex that recognizes the corresponding PAM and thereby binds to the target sequence, thereby enabling cleavage or gene editing of the target sequence.
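The layout of the single stranded guide RNA described above can be sketched, purely for illustration, as a spacer of 20 to 30 nucleotides followed by the scaffold and a run of at least six U's. The function name `assemble_sgrna` is hypothetical, the scaffold string in the example is an arbitrary placeholder and not SEQ ID NO: 7, and placing the U run at the 3' end of the whole molecule is an assumption made for this sketch.

```python
def assemble_sgrna(spacer: str, scaffold: str, n_terminal_u: int = 6) -> str:
    """Return spacer + scaffold + poly-U terminator as an RNA string (5'->3').

    DNA-style input (T) is converted to RNA (U). The spacer must be 20-30 nt
    and the terminator at least six U's, following the ranges in the text.
    """
    spacer = spacer.upper().replace("T", "U")
    scaffold = scaffold.upper().replace("T", "U")
    if not 20 <= len(spacer) <= 30:
        raise ValueError("spacer must be 20 to 30 nucleotides long")
    if n_terminal_u < 6:
        raise ValueError("terminator must contain at least six U's")
    return spacer + scaffold + "U" * n_terminal_u


# Placeholder scaffold for demonstration only (NOT SEQ ID NO: 7).
PLACEHOLDER_SCAFFOLD = "GUUUUAGUACUCUG"
sgrna = assemble_sgrna("GATTACAGATTACAGATTACA", PLACEHOLDER_SCAFFOLD)
print(len(sgrna))  # 41
```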
In a further embodiment, the isolated nucleic acid molecule comprises the sequence of SEQ ID NO:8 or a degenerate sequence thereof, and preferably further comprises a nucleic acid sequence encoding a CRISPR spacer sequence.
After transfection of the isolated nucleic acid molecules of the invention into a corresponding cell using certain means known in the art, such as expression vectors, the isolated nucleic acid molecules of the invention can express the Cas9 protein, conjugates or fusion proteins thereof, and/or the single stranded guide RNAs described above of the invention and perform the corresponding function therein, such as gene editing.
In addition, depending on the particular manner of expression, the isolated nucleic acid molecules of the invention may express the Cas9 protein, its conjugates or fusion proteins, and the single stranded guide RNA separately, or may express them together.
Furthermore, the expression products have the corresponding actions and/or functions described above, and are not described here again for brevity.
In a fifth aspect, the invention provides a vector comprising a nucleic acid sequence encoding:
a) The Cas9 protein of the first aspect;
b) The conjugate of the second aspect; or alternatively
c) The fusion protein of the third aspect.
In a specific embodiment, the vector comprises SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 or a degenerate sequence thereof.
The vector may be an expression vector, for example a plasmid vector such as a pUC19 vector, a cosmid vector, a pAAV2_ITR vector, a retroviral vector, a lentiviral vector, an adenoviral vector or an adeno-associated viral vector.
In yet another specific embodiment, the vector further comprises a nucleic acid sequence encoding a single stranded guide RNA. The single stranded guide RNA includes a scaffold sequence having:
(i) the nucleic acid sequence shown in SEQ ID NO: 7;
(ii) a nucleic acid sequence that has at least 90% sequence identity to the nucleic acid sequence shown in SEQ ID NO: 7 and retains its biological activity; or
(iii) a nucleic acid sequence that is obtained by modification based on the nucleic acid sequence shown in SEQ ID NO: 7 and retains its biological activity.
The "at least 90% sequence identity" may be at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.9%, or 100% sequence identity.
In a specific embodiment, the modification may be one or more of base phosphorylation, base thio-modification, base methylation, base hydroxylation, shortening of the sequence, and lengthening of the sequence.
In a further embodiment, the shortening of the sequence and the lengthening of the sequence comprises a deletion or addition of 1 to 10 bases relative to the base sequence, e.g., a deletion or addition of one, two, three, four, five, six, seven, eight, nine, or ten bases relative to the base sequence.
In yet another specific embodiment, the single stranded guide RNA can further comprise a CRISPR spacer sequence at the 5' end of the scaffold sequence, the CRISPR spacer sequence being a sequence of 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides in length and capable of complementary pairing with a target sequence.
In a preferred embodiment, the CRISPR spacer sequence is a sequence 21 nucleotides in length and capable of complementary pairing with a target sequence.
In a further embodiment, the single stranded guide RNA further comprises a terminator at the 3' end of the spacer sequence. As an example, the terminator may be a run of multiple U's, such as at least six (e.g., seven or eight) U's.
As can be seen from the above, after the vector of the present invention is transfected into cells, the coding sequence cloned in the vector can be expressed as the Cas9 protein, its conjugate or fusion protein, and/or the single stranded guide RNA described above, and perform the corresponding function in the cells, for example, gene editing.
In addition, multiple vectors, e.g., two vectors, can be transfected into the cell, wherein one vector expresses the Cas9 protein, conjugate or fusion protein thereof and the other vector expresses the single stranded guide RNA. Subsequently, the expressed Cas9 protein, conjugate or fusion protein thereof complexes with the expressed single-stranded guide RNA to form a complex, and performs a corresponding function therein, for example, gene editing.
Of course, the nucleic acid sequence encoding the Cas9 protein, its conjugate or fusion protein, and the nucleic acid sequence encoding the single-stranded guide RNA may also be cloned into a vector, such that the vector expresses both the Cas9 protein, its conjugate or fusion protein, and the single-stranded guide RNA after transfection into a cell, and performs the corresponding function therein, e.g., gene editing.
CRISPR/Cas9 gene editing system
In a sixth aspect, the invention provides a CRISPR/Cas9 gene editing system comprising:
a) A protein component comprising the Cas9 protein of the first aspect, the conjugate of the second aspect, or the fusion protein of the third aspect;
b) A nucleic acid component comprising: a single stranded guide RNA comprising a scaffold sequence having:
(i) the nucleic acid sequence shown in SEQ ID NO: 7;
(ii) a nucleic acid sequence that has at least 90% sequence identity to the nucleic acid sequence shown in SEQ ID NO: 7 and retains its biological activity; or
(iii) a nucleic acid sequence that is obtained by modification based on the nucleic acid sequence shown in SEQ ID NO: 7 and retains its biological activity;
and, the protein component and the nucleic acid component are bound to each other to form a complex.
The "at least 90% sequence identity" may be at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.9%, or 100% sequence identity.
In a specific embodiment, the modification may be one or more of base phosphorylation, base thio-modification, base methylation, base hydroxylation, shortening of the sequence, and lengthening of the sequence.
In a further embodiment, the shortening of the sequence and the lengthening of the sequence comprises a deletion or addition of 1 to 10 bases relative to the base sequence, e.g., a deletion or addition of one, two, three, four, five, six, seven, eight, nine, or ten bases relative to the base sequence.
In yet another specific embodiment, the single stranded guide RNA can further comprise a CRISPR spacer sequence at the 5' end of the scaffold sequence, the CRISPR spacer sequence being a sequence of 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides in length and capable of complementary pairing with a target sequence.
In a preferred embodiment, the CRISPR spacer sequence is a sequence 21 nucleotides in length and capable of complementary pairing with a target sequence.
In a further embodiment, the single stranded guide RNA further comprises a terminator at the 3' end of the spacer sequence. As an example, the terminator may be a run of multiple U's, such as at least six (e.g., seven or eight) U's.
The CRISPR/Cas9 gene editing system of the present invention may be formed directly from the Cas9 protein described herein, a homolog thereof, or a conjugate or fusion protein thereof, together with the single stranded guide RNA described herein, or from the expression products of the vectors described herein. Through the joint action of the Cas9 protein and the single stranded guide RNA it contains, the CRISPR/Cas9 gene editing system recognizes, locates, cleaves and edits target sequences.
The CRISPR/Cas9 gene editing system can accurately position a target sequence. "Accurately position" has two meanings here: first, that the CRISPR/Cas9 gene editing system of the present invention is itself capable of recognizing and binding to a target sequence; and second, that the system is capable of bringing other proteins fused to the Cas9 protein, or proteins that specifically recognize the sgRNA, to the position of the target sequence.
The CRISPR/Cas9 gene editing system of the present invention has low tolerance to non-target sequences. By "low tolerance" is meant herein that the CRISPR/Cas9 gene editing system of the invention is essentially or completely incapable of recognizing and binding to non-target sequences, or is essentially or completely incapable of bringing other proteins fused to the Cas9 protein, or proteins that specifically recognize the sgRNA, to the location of non-target sequences.
Cells
In a seventh aspect, the invention provides a cell comprising: the isolated nucleic acid molecule of the fourth aspect, or the vector of the fifth aspect.
As an example, the cell may be a prokaryotic cell or a eukaryotic cell. The eukaryotic cell may be, for example, an animal cell. The animal cell may be, for example, a mammalian cell such as a human cell.
Method
In an eighth aspect, the invention provides a method of gene editing a target sequence in an intracellular or in vitro environment, the method comprising: contacting any one of the following (1) to (3) with a target sequence in an intracellular or in vitro environment:
(1) The Cas9 protein of the first aspect, the conjugate of the second aspect, or the fusion protein of the third aspect, and a single stranded guide RNA;
(2) The vector of the fifth aspect; and
(3) The CRISPR/Cas9 gene editing system of the sixth aspect;
wherein upon contact with a target sequence, the Cas9 protein, the conjugate, or the fusion protein recognizes a corresponding protospacer adjacent motif (PAM) located at the 5' end of the target sequence and having the sequence 5'-NNGG;
wherein the single stranded guide RNA comprises a scaffold sequence; the scaffold sequence has:
(i) the nucleic acid sequence shown in SEQ ID NO: 7;
(ii) a nucleic acid sequence that has at least 90% sequence identity to the nucleic acid sequence shown in SEQ ID NO: 7 and retains its biological activity; or
(iii) a nucleic acid sequence that is obtained by modification based on the nucleic acid sequence shown in SEQ ID NO: 7 and retains its biological activity.
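As a non-limiting illustration of the 5'-NNGG PAM recited above, the following minimal sketch tests whether a four-base sequence matches the motif, treating N as any base. The function name `matches_pam` is hypothetical; the sketch only checks the motif itself and makes no assumption about where the PAM lies relative to the target sequence.

```python
def matches_pam(candidate: str, pam: str = "NNGG") -> bool:
    """True when `candidate` fits the PAM pattern; 'N' in the pattern is a wildcard."""
    candidate = candidate.upper()
    if len(candidate) != len(pam):
        return False
    return all(p == "N" or p == c for p, c in zip(pam, candidate))


print(matches_pam("ATGG"))  # True
print(matches_pam("ATGA"))  # False
```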
In a specific embodiment, the modification may be one or more of base phosphorylation, base thio-modification, base methylation, base hydroxylation, shortening of the sequence, and lengthening of the sequence.
In a further specific embodiment, the shortening of the sequence and the lengthening of the sequence comprises a deletion or addition of 1 to 10 bases relative to the base sequence, e.g., a deletion or addition of one, two, three, four, five, six, seven, eight, nine, or ten bases relative to the base sequence.
In yet another specific embodiment, the single stranded guide RNA can further comprise a CRISPR spacer sequence at the 5' end of the scaffold sequence, the CRISPR spacer sequence being a sequence of 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides in length and capable of complementary pairing with a target sequence.
In a preferred embodiment, the CRISPR spacer sequence is a sequence 21 nucleotides in length and capable of complementary pairing with a target sequence.
In a further embodiment, the single stranded guide RNA further comprises a terminator at the 3' end of the spacer sequence. As an example, the terminator may be a run of multiple U's, such as at least six (e.g., seven or eight) U's.
In a specific embodiment, the cell is a prokaryotic cell or a eukaryotic cell, e.g., an animal cell, e.g., a mammalian cell, e.g., a human cell.
In yet another specific embodiment, the gene editing comprises one or more of gene knockout, site-directed base change, site-directed insertion, regulation of gene transcription level, DNA methylation regulation, DNA acetylation modification, histone acetylation modification, single base conversion, and chromatin imaging tracking of the target sequence.
In a further embodiment, the single base conversion comprises adenine to guanine conversion, cytosine to thymine conversion, or cytosine to uracil conversion.
In yet another specific embodiment, in the method, the CRISPR spacer of the single stranded guide RNA forms a complete base complementary pairing structure with the target sequence and an incomplete base complementary pairing structure with a non-target sequence.
As used herein, the incomplete base complementary pairing structure refers to a structure that includes both base-paired and non-base-paired portions, including, for example, base mismatches and/or base bulges.
In yet another specific embodiment, the incomplete base-complementary pairing structure comprises one or more, e.g., two or more, base mismatches.
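The distinction between complete and incomplete pairing can be sketched, for illustration only, by counting mismatched positions between a spacer and a candidate site of equal length. The names `count_mismatches` and `pairing_type` are hypothetical. For simplicity the spacer is compared with the same-strand (protospacer) form of the site rather than with its complement, and bulges are not modeled; both simplifications are assumptions of this sketch.

```python
def count_mismatches(spacer: str, site: str) -> int:
    """Number of positions at which the spacer and the site differ."""
    if len(spacer) != len(site):
        raise ValueError("spacer and site must have equal length (bulges not modeled)")
    return sum(a != b for a, b in zip(spacer.upper(), site.upper()))


def pairing_type(spacer: str, site: str) -> str:
    """'complete' for zero mismatches, otherwise 'incomplete'."""
    return "complete" if count_mismatches(spacer, site) == 0 else "incomplete"


on_target = "GATTACAGATTACAGATTACA"
off_target = "GATTACAGCTTACAGATTAGA"  # hypothetical site with two mismatches
print(count_mismatches(on_target, off_target))  # 2
print(pairing_type(on_target, on_target))       # complete
```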
Therefore, the Cas9 protein can cut the target site on the target sequence, and double-strand break of the target sequence occurs under the cutting action of the Cas9 protein. Further, when the method is performed in a cell, the cleaved target sequence may be repaired by a non-homologous end joining repair or homologous recombination repair pathway in the cell, thereby achieving gene editing of the target sequence.
Experiments show that, in the CRISPR/Cas9 gene editing system of the invention and the gene editing method using it, the Cas9 proteins of SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3 can each form a complex with the same sgRNA to perform gene editing, with an editing efficiency of 5% to 25%. In addition, the gene editing systems of the SauriCas9-HF, Sha2Cas9-HF and Sa-SlugCas9-HF proteins have a tolerance of approximately 0% for guide RNAs containing mismatches. These gene editing systems can therefore edit target genes with high specificity, combine high editing efficiency with a low off-target rate, and can be widely applied to gene editing in cells or in in vitro environments.
Kit
In a ninth aspect, the present invention provides a kit for gene editing of a target sequence in an intracellular or in vitro environment, comprising:
a) Any one selected from the following 1) to 4):
1) The Cas9 protein of the first aspect, the conjugate of the second aspect, or the fusion protein of the third aspect, and a single stranded guide RNA;
2) The isolated nucleic acid molecule of the fourth aspect;
3) The vector of the fifth aspect; or alternatively
4) The CRISPR/Cas9 gene editing system of the sixth aspect; and
b) Instructions for how to perform gene editing of a target sequence in an intracellular or in vitro environment;
wherein the single stranded guide RNA comprises a scaffold sequence; the scaffold sequence has:
(i) the nucleic acid sequence shown in SEQ ID NO: 7;
(ii) a nucleic acid sequence that has at least 90% sequence identity to the nucleic acid sequence shown in SEQ ID NO: 7 and retains its biological activity; or
(iii) a nucleic acid sequence that is obtained by modification based on the nucleic acid sequence shown in SEQ ID NO: 7 and retains its biological activity.
In a specific embodiment, the modification may be one or more of base phosphorylation, base thio-modification, base methylation, base hydroxylation, shortening of the sequence, and lengthening of the sequence.
In a further specific embodiment, the shortening of the sequence and the lengthening of the sequence comprises a deletion or addition of 1 to 10 bases relative to the base sequence, for example a deletion or addition of one, two, three, four, five, six, seven, eight, nine or ten bases relative to the base sequence.
In yet another specific embodiment, the single stranded guide RNA can further comprise a CRISPR spacer sequence at the 5' end of the scaffold sequence, the CRISPR spacer sequence being a sequence of 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides in length and capable of complementary pairing with a target sequence.
In a preferred embodiment, the CRISPR spacer sequence is a sequence 21 nucleotides in length and capable of complementary pairing with a target sequence.
In a further embodiment, the single stranded guide RNA further comprises a terminator at the 3' end of the spacer sequence. As an example, the terminator may be a run of multiple U's, such as at least six (e.g., seven or eight) U's.
Of course, it will be appreciated by those skilled in the art that other reagents that facilitate gene editing may also be included in the kits of the invention.
Brief description of the sequences related to the invention
SEQ ID NO: 1: SauriCas9-HF protein sequence
SEQ ID NO: 2: Sha2Cas9-HF protein sequence
SEQ ID NO: 3: Sa-SlugCas9-HF protein sequence
SEQ ID NO: 4: coding sequence of the SauriCas9-HF protein
SEQ ID NO: 5: coding sequence of the Sha2Cas9-HF protein
SEQ ID NO: 6: coding sequence of the Sa-SlugCas9-HF protein
SEQ ID NO: 7: scaffold sequence of the single stranded guide RNA used with the Cas9 proteins
SEQ ID NO: 8: DNA sequence of the scaffold sequence of the single stranded guide RNA used with the Cas9 proteins
Examples
In the following examples, exemplary CRISPR/Cas9 gene editing systems of the present invention and related applications are shown. Unless otherwise indicated, all test procedures used herein were conventional, and all test materials used in the examples described below were purchased from a conventional reagent store, unless otherwise indicated. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It should be noted that the terminology used in the description of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The foregoing summary of the invention and the following detailed description are only for the purpose of illustrating the invention and are not intended to limit the invention in any way. The scope of the invention is determined by the appended claims without departing from the spirit and scope of the invention.
Example 1
(1) Construction of plasmid pAAV2_Cas9_ITR
The amino acid sequences of the SauriCas9-HF protein, the Sha2Cas9-HF protein and the Sa-SlugCas9-HF protein are shown in SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3, respectively.
Table 1: Cas9 proteins with their NCBI protein search IDs and sequence numbers
Cas9 protein full name | NCBI protein search ID | Amino acid sequence
SauriCas9-HF | None | SEQ ID NO: 1
Sha2Cas9-HF | None | SEQ ID NO: 2
Sa-SlugCas9-HF | None | SEQ ID NO: 3
The amino acid sequence of each Cas9 protein was codon-optimized to obtain a gene sequence that expresses the Cas9 protein at a high level in human cells. The optimized gene sequences of the SauriCas9-HF protein, the Sha2Cas9-HF protein and the Sa-SlugCas9-HF protein are shown in SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, respectively.
The highly expressed gene sequence of each Cas9 protein obtained above (SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6) was cloned into a SlugCas9 backbone plasmid (Addgene, catalog #163793) to obtain plasmid pAAV2_Cas9_ITR.
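A codon-optimized or degenerate coding sequence must still translate to the same amino acid sequence. As a hedged illustration, the sketch below translates a coding sequence with the standard genetic code and compares the result with an expected protein. The names `translate` and `encodes` and the short toy sequences are assumptions for demonstration; they are not SEQ ID NO: 1 to 6, and the sketch says nothing about how the optimization in this example was actually carried out.

```python
_BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence; trailing stop codons are dropped."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    protein = "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))
    return protein.rstrip("*")


def encodes(cds: str, protein: str) -> bool:
    """True when `cds` translates exactly to `protein`."""
    return translate(cds) == protein.upper()


# Two synonymous toy coding sequences for the peptide M-A-K.
print(translate("ATGGCCAAGTAA"))        # MAK
print(encodes("ATGGCTAAATAA", "MAK"))   # True
```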
(2) Preparation of linearized plasmid hU6-Sa_tracr
Plasmid hU6-Sa_tracr (Addgene, catalog #135973) was digested with BsaI restriction enzyme; the scaffold sequence in this plasmid is SEQ ID NO: 8. The digestion system was as follows: μg of plasmid hU6-Sa_tracr, 5 μL of 10× CutSmart buffer (from NEB), 1 μL of BsaI restriction endonuclease (from NEB), and water to make up to 50 μL. The digestion system was left to stand at 37 °C overnight.
Then, the digested product was electrophoresed on a 1% agarose gel at 120 V for 30 min.
The DNA fragment was excised from the agarose gel, recovered using a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water. The DNA fragment is the linearized plasmid hU6-Sa_tracr containing the scaffold of the SaCas9 guide RNA, and its size is 3088 bp.
The DNA concentration of the recovered linearized plasmid hU6-Sa_tracr was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the plasmid was set aside for later use or stored long-term at -20 °C.
(3) Preparation of plasmid hU6-Sa_sgRNA
Each gRNA was designed and its sequence is shown in Table 2 below. The cohesive end sequences corresponding to both sides of the linearized plasmid hU6-Sa_tracr were added to the sense strand and the antisense strand for each gRNA sequence pair designed, respectively, and two single-stranded oligonucleotides were synthesized, the specific sequences of which are also shown in Table 2 below.
[Table 2 (image): gRNA sequences and the corresponding single-stranded oligonucleotides]
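The oligonucleotide design step described above can be sketched as follows, for illustration only: an overhang is prepended to the sense strand of the gRNA sequence and another to its reverse complement. The function names and the overhangs `CACC` and `AAAC` are placeholders chosen for demonstration; the actual cohesive ends of the BsaI-linearized plasmid hU6-Sa_tracr are given in Table 2 and are not reproduced here.

```python
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence, written 5'->3'."""
    return "".join(_COMPLEMENT[b] for b in reversed(seq.upper()))


def design_oligos(spacer: str, fwd_overhang: str, rev_overhang: str) -> tuple[str, str]:
    """Return (oligo_F, oligo_R) for annealing into a linearized vector.

    oligo_F = forward overhang + spacer (sense strand)
    oligo_R = reverse overhang + reverse complement of the spacer
    """
    spacer = spacer.upper()
    oligo_f = fwd_overhang.upper() + spacer
    oligo_r = rev_overhang.upper() + reverse_complement(spacer)
    return oligo_f, oligo_r


# Placeholder overhangs and spacer; substitute the real values from Table 2.
oligo_f, oligo_r = design_oligos("GATTACAGATTACAGATTACA", "CACC", "AAAC")
print(oligo_f)  # CACCGATTACAGATTACAGATTACA
print(oligo_r)  # AAACTGTAATCTGTAATCTGTAATC
```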
The single-stranded oligonucleotides were annealed to obtain double-stranded DNA. The annealing reaction system was as follows: 1 μL of 100 μM oligo-F, 1 μL of 100 μM oligo-R, and 28 μL of water. After the annealing system was mixed thoroughly by shaking, it was placed in a PCR instrument to run the following annealing program: 95 °C for 1 min, 85 °C for 1 min, 75 °C for 1 min, 65 °C for 1 min, 55 °C for 1 min, 45 °C for 1 min, 35 °C for 1 min, 25 °C for 1 min, then hold at 4 °C, with a cooling rate of 0.3 °C/s. After annealing, the resulting product was ligated to the linearized hU6-Sa_tracr plasmid obtained in step (2) using DNA ligase (purchased from NEB).
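As a worked illustration of the annealing setup above, the sketch below computes the final concentration of each oligonucleotide in the 30 μL mix and estimates the run time of the step-down program. The function names are hypothetical, and applying the 0.3 °C/s cooling rate to every transition, including the final drop to 4 °C, is an assumption of this sketch.

```python
def final_concentration_um(stock_um: float, stock_ul: float, total_ul: float) -> float:
    """Dilution by C1*V1 = C2*V2; returns the final concentration in uM."""
    return stock_um * stock_ul / total_ul


def program_seconds(hold_temps_c, hold_s=60.0, final_temp_c=4.0, ramp_c_per_s=0.3):
    """Estimated run time: all hold steps plus cooling from the first hold to the final temperature."""
    cooled_span_c = hold_temps_c[0] - final_temp_c
    return len(hold_temps_c) * hold_s + cooled_span_c / ramp_c_per_s


total_ul = 1 + 1 + 28  # oligo-F + oligo-R + water
holds = [95, 85, 75, 65, 55, 45, 35, 25]

print(round(final_concentration_um(100, 1, total_ul), 2))  # 3.33
print(round(program_seconds(holds) / 60, 1))               # 13.1
```

Under these assumptions each oligonucleotide ends up at about 3.33 μM and the program takes roughly 13 minutes before the 4 °C hold.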
1 μL of the obtained ligation product was added to E. coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.), incubated on ice for 30 min, heat-shocked at 42 °C for 1 min, and placed on ice for 2 min. Then 900 μL of LB medium was added, and the cells were cultured at 37 °C for 1 hour to recover the E. coli DH5α competent cells.
The recovered E. coli DH5α competent cells were spread on LB solid plates containing the corresponding antibiotic and cultured inverted in an incubator at 37 °C, and the resulting E. coli DH5α single clones were verified by Sanger sequencing.
E. coli DH5α clones verified by sequencing to be correctly ligated were cultured with shaking, and plasmids were extracted to obtain plasmid hU6-Sa_sgRNA containing the target sgRNA expression sequence for later use.
(4) Transfection of HEK293T cell line with plasmid pAAV2_Cas9_ITR expressing Cas protein and plasmid hU6-Sa_sgRNA expressing sgRNA
On day 0 HEK293T cells containing the target sequences were plated in 24 well plates at a cell density of around 30% depending on the transfection requirements.
On day 1, transfection was performed as follows:
500 ng of plasmid pAAV2_Cas9_ITR and 300 ng of plasmid hU6-Sa_sgRNA were mixed, added to 25 μL of Opti-MEM medium (purchased from Gibco), and gently mixed.
The transfection reagent, liposome (purchased from Invitrogen) or polyethylenimine (hereinafter abbreviated as PEI, 100 μM; purchased from Polysciences), was gently mixed; 1.6 μL or 0.8 μL PEI was pipetted into 25 μL of Opti-MEM medium (purchased from Gibco), gently mixed, and allowed to stand at room temperature for 5 min.
The diluted transfection reagent and the diluted plasmid were combined, gently mixed, allowed to stand at room temperature for 20 min, and then added to the culture medium containing the HEK293T cells to be transfected. The cells were cultured in an incubator at 37 °C and 5% CO2 for a further 3 days.
(5) Preparation of second Generation sequencing library
HEK293T cells were collected three days after editing, and genomic DNA was extracted using a DNA kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP304) according to the instructions provided with the kit.
The first round of PCR was performed using 2× Q5 Master Mix (purchased from NEB) with the following primers:
Table 3: Primers for the first round of PCR for second-generation sequencing
Figure BDA0003848964650000291
The reaction system is as follows:
Figure BDA0003848964650000301
The PCR program was as follows:
Figure BDA0003848964650000302
The second round of PCR was performed for sequencing library construction using 2× Q5 Master Mix, with the following PCR primers:
F2 primer: AATGAracggccaccgagattactacactatagacctcactcttttcccacacgac
R2 primer: CAAGCAGAAGACGGCATACGAGATTGCTGGGTGTGACTGGAGTTCAGACGTGTG
The reaction system is as follows:
Figure BDA0003848964650000303
The PCR program was as follows:
Figure BDA0003848964650000304
The second-round PCR products were purified using a gel recovery kit according to the manufacturer's instructions to obtain DNA fragments of 241 bp, 185 bp, 266 bp, 274 bp, 266 bp, 249 bp, 241 bp, 185 bp and 385 bp, corresponding to the E4, E7, G1, G3, G4, G5, G6, G8, G9, G10 and S3 sites. The second-generation sequencing library was thus prepared.
(6) Analysis of second generation sequencing results
The prepared second-generation sequencing library was subjected to paired-end sequencing on a HiSeq X Ten high-throughput sequencer (Illumina).
The editing efficiency calculated for each target site from the second-generation sequencing data is shown in FIGS. 1-3, where the X-axis represents the target site and the Y-axis represents the editing efficiency (Indels %). As can be seen from FIGS. 1-3, the gene editing systems containing the SauriCas9-HF protein, the Sha2Cas9-HF protein and the Sa-SlugCas9-HF protein can be used for gene editing in cells.
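The Indels % value plotted on the Y-axis is the fraction of sequencing reads at a site that carry an insertion or deletion. The sketch below illustrates that ratio under a strong simplifying assumption: every read spans exactly the amplicon window, so any read whose length differs from the reference is counted as an indel. The function name `indel_percentage` and the toy reads are hypothetical; the source does not describe its analysis software, and a real pipeline would align reads to the reference first.

```python
def indel_percentage(reads: list[str], reference: str) -> float:
    """Return the percentage of reads whose length differs from the reference.

    Simplification: a length change is treated as evidence of an indel.
    Substitutions and alignment are ignored in this sketch.
    """
    if not reads:
        raise ValueError("no reads supplied")
    edited = sum(1 for read in reads if len(read) != len(reference))
    return 100.0 * edited / len(reads)


if __name__ == "__main__":
    reference = "ACGTACGTAC"
    reads = (
        [reference] * 6      # unedited reads
        + ["ACGTCGTAC"] * 3  # 1-bp deletion
        + ["ACGTAACGTAC"]    # 1-bp insertion
    )
    print(f"Indels: {indel_percentage(reads, reference):.1f}%")
```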
Example 2
(1) Construction of plasmid pAAV2_Cas9_ITR
The amino acid sequences of the SauriCas9-HF protein, the Sha2Cas9-HF protein and the Sa-SlugCas9-HF protein are shown in SEQ ID NO: 1 to SEQ ID NO: 3, respectively.
The amino acid sequences of the Cas9 proteins were codon-optimized to obtain gene sequences that express the Cas9 proteins at high levels in human cells. The gene sequences of the SauriCas9-HF protein, the Sha2Cas9-HF protein and the Sa-SlugCas9-HF protein are shown in SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, respectively.
The gene sequences of SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 obtained above were each cloned into the SlugCas9 backbone plasmid (Addgene, catalog #163793) to obtain plasmid pAAV2_Cas9_ITR.
(2) Preparation of linearized plasmid hU6-Sa_tracr
Plasmid hU6-Sa_tracr, whose scaffold sequence is SEQ ID NO: 8, was digested with the restriction enzyme BsaI. The digestion system was as follows: μg of plasmid hU6-Sa_tracr, 5 μL of 10× CutSmart buffer (from NEB), 1 μL of BsaI restriction endonuclease (from NEB), and water to a final volume of 50 μL. The digestion reaction was incubated at 37 °C overnight.
The digested product was then electrophoresed on a 1% agarose gel at 120 V for 30 min.
The DNA fragment was excised from the agarose gel, recovered using a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the manufacturer's instructions, and finally eluted with ultrapure water. The DNA fragment is the linearized plasmid hU6-Sa_tracr containing the SaCas9 guide RNA scaffold, with a size of 3088 bp.
The DNA concentration of the recovered linearized plasmid hU6-Sa_tracr was measured with a NanoDrop™ Lite spectrophotometer, and the plasmid was kept for use or stored long-term at -20 °C.
(3) Preparation of plasmid hU6-Sa-on target sgRNA or hU6-Sa-mismatch sgRNA
The sequences of the on-target gRNA and the mismatch gRNAs were designed; their corresponding oligonucleotide single-stranded DNAs are shown in Table 4 below, where the mismatched bases are shown as underlined bold bases.
The single-stranded oligonucleotide DNAs corresponding to the on-target gRNA and to the different mismatch gRNAs were annealed separately. The annealing reaction system was as follows: μL of 100 μM oligo-F, 1 μL of 100 μM oligo-R, and 28 μL of water. After the annealing system was vortexed to mix, it was placed in a PCR instrument to run the annealing program, which was as follows: 95 °C for 1 min, 85 °C for 1 min, 75 °C for 1 min, 65 °C for 1 min, 55 °C for 1 min, 45 °C for 1 min, 35 °C for 1 min, 25 °C for 1 min, then hold at 4 °C, with a cooling rate of 0.3 °C/s. After annealing, the resulting products were ligated to the linearized hU6-Sa_tracr plasmid obtained above using DNA ligase (purchased from NEB).
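The annealing program is a step-down ramp: eight 1-minute holds from 95 °C to 25 °C in 10 °C decrements, cooling at 0.3 °C/s between holds. As a rough sketch of the schedule and its duration, assuming the cooling rate applies only to the ramps between holds and excluding the final drop to the 4 °C storage hold (the function name `annealing_schedule` is illustrative):

```python
def annealing_schedule(start_c: int = 95, stop_c: int = 25, step_c: int = 10,
                       hold_s: float = 60.0, ramp_c_per_s: float = 0.3):
    """Return the hold temperatures and the estimated run time in seconds."""
    temps = list(range(start_c, stop_c - 1, -step_c))
    hold_time = len(temps) * hold_s
    # Total temperature drop across all ramps between the first and last hold.
    ramp_time = (temps[0] - temps[-1]) / ramp_c_per_s
    return temps, hold_time + ramp_time


if __name__ == "__main__":
    temps, total_s = annealing_schedule()
    print("Hold temperatures (°C):", temps)
    print(f"Estimated run time before the 4 °C hold: {total_s / 60:.1f} min")
```

Under these assumptions the holds take 8 min and the ramps add a little under 4 min.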
The recovered E. coli DH5α competent cells were spread on LB solid plates containing the corresponding antibiotic and cultured inverted in an incubator at 37 °C, and the resulting E. coli DH5α single clones were verified by Sanger sequencing.
E. coli DH5α clones confirmed by sequencing to carry the correct ligation were grown in shaking culture, and plasmids were extracted to obtain plasmid hU6-Sa-on target sgRNA, which expresses the on-target gRNA sequence, and plasmids hU6-Sa-mismatch sgRNA, which express the different mismatch gRNA sequences, for later use.
(4) The obtained plasmid hU6-Sa-on target sgRNA expressing the on-target gRNA sequence, or each plasmid hU6-Sa-mismatch sgRNA expressing a different mismatch gRNA sequence, was transfected together with pAAV2_Cas9_ITR by liposome into a GFP reporter system HEK293T cell line containing the target sequence (GGCTCGGAGATCATCATTGCG).
Table 4: oligonucleotide single-stranded DNA corresponding to on target gRNA and mismatch sgRNA
Figure BDA0003848964650000331
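The mismatch gRNA panel can be pictured as a set of variants of the 21-nt target sequence in which one base, or two neighbouring bases, are substituted. The sketch below is only an illustration of that idea: the actual substitutions used are those in the table above, and the complement-swap rule (`SWAP`), the choice of adjacent positions, and the function names are assumptions made here.

```python
TARGET = "GGCTCGGAGATCATCATTGCG"  # target sequence given in the text

# Hypothetical substitution rule: replace each base with its complement.
SWAP = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_variants(target: str, n_adjacent: int = 1) -> list[str]:
    """Return variants with n_adjacent consecutive bases substituted."""
    variants = []
    for start in range(len(target) - n_adjacent + 1):
        bases = list(target)
        for pos in range(start, start + n_adjacent):
            bases[pos] = SWAP[bases[pos]]
        variants.append("".join(bases))
    return variants


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    singles = mismatch_variants(TARGET, 1)
    doubles = mismatch_variants(TARGET, 2)
    print(len(singles), "single-mismatch variants, e.g.", singles[0])
    print(len(doubles), "double-mismatch variants, e.g.", doubles[0])
```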
The GFP reporter system HEK293T cell line containing the target sequence was obtained as follows: a PAM sequence and a specific target sequence were inserted between the start codon ATG and the GFP coding sequence, causing a GFP frameshift mutation, and the construct was then integrated into HEK293T cells by lentiviral infection, yielding a GFP reporter HEK293T cell line containing the target sequence. After the gene editing system cleaves the target sequence, some cells restore the GFP reading frame through their own repair systems and produce green fluorescence. The editing capability and specificity of the gene editing system can therefore be evaluated by counting the proportion of GFP-positive cells by flow cytometry.
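Only some repair outcomes restore fluorescence because the reading frame returns only when the net length change brings the inserted segment back to a multiple of three. A minimal sketch of that frame arithmetic follows; the 25-bp insert length (a 21-nt target plus a 4-nt PAM) is an assumption for illustration, since the source does not state the exact insert length, and stop codons are ignored.

```python
def restores_frame(insert_len: int, indel_net: int) -> bool:
    """Return True if an indel puts the downstream GFP back in frame.

    insert_len: length of the PAM + target segment between ATG and GFP.
    indel_net:  net length change from repair (+ insertion, - deletion).
    """
    if insert_len % 3 == 0:
        raise ValueError("this insert would not cause a frameshift")
    return (insert_len + indel_net) % 3 == 0


if __name__ == "__main__":
    INSERT_LEN = 25  # hypothetical: 21-nt target + 4-nt PAM
    for net in (-4, -2, -1, 1, 2, 3):
        outcome = "in frame" if restores_frame(INSERT_LEN, net) else "out of frame"
        print(f"net change {net:+d} bp: {outcome}")
```

With this insert length, roughly one in three indel sizes restores the frame, which is consistent with the text's statement that only part of the edited cells become GFP positive.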
The transfection process comprises the following steps:
On day 0, the GFP reporter HEK293T cell line containing the target sequence was plated in 24-well plates, with the cell density controlled at about 30% as required for transfection.
The GFP reporter system HEK293T cell line containing the target sequence comprises a CMV-ATG-PAM-target site-GFP nucleotide sequence, wherein the PAM sequence is shown in figure 4, and the sequence of the target site (target site) is the target sequence GGCTCGGAGATCATCATTGCG.
On day 1, transfection was performed as follows:
(1) 500 ng of plasmid pAAV2_Cas9_ITR and 300 ng of plasmid hU6-Sa_on target gRNA, or (2) 500 ng of plasmid pAAV2_Cas9_ITR and 300 ng of plasmid hU6-Sa_mismatch gRNA, were mixed, added to 25 μL of Opti-MEM medium, and gently pipetted to mix. Figure BDA0003848964650000341 2000 (available from Invitrogen) or PEI (available from Polysciences) was gently flicked to mix, and 1.6 μL of Figure BDA0003848964650000342 2000 or 0.8 μL of PEI was added to 25 μL of Opti-MEM medium, gently mixed, and allowed to stand at room temperature for 5 min.
The diluted plasmid and the diluted transfection reagent were combined, gently pipetted to mix, allowed to stand at room temperature for 20 min, and added to the culture medium of the GFP reporter system HEK293T cell line containing the target sequence. The cells were then returned to an incubator at 37 °C and 5% CO2 for continued culture.
The editing efficiency and off-target rate of the CRISPR/Cas9 gene editing system provided by the invention on the target sequence were analyzed by flow cytometry.
Specifically, after 5 days of culture in the CO2 incubator, the HEK293T cell line was tested for specificity using a flow cytometer (BD Biosciences FACSCalibur), and the proportion of GFP-positive cells was analyzed and plotted using FlowJo analysis software.
The results of specificity detection of the CRISPR/Cas9 gene editing system of the present invention in the GFP reporter system HEK293T cell line containing the target sequence are shown in FIGS. 4-6. The upper panel of each figure shows a schematic of the GFP reporter system, in which a specific PAM sequence and target sequence are inserted between the start codon ATG and the GFP coding sequence, resulting in a GFP frameshift mutation. Thus, when the gene editing system cleaves the target sequence, some cells restore the GFP reading frame through their own repair systems and produce green fluorescence. In the bar graphs in the lower panels of FIGS. 4-6, the Y-axis represents the percentage (%) of GFP-positive cells, and the X-axis represents the oligonucleotide single-stranded DNA sequences corresponding to the on-target gRNA and the mismatch gRNAs. As can be seen from FIGS. 4-6, the CRISPR gene editing system of the present invention edits the target site in the GFP reporter HEK293T cell line, and the proportion of gene editing mediated by the mismatch gRNAs is significantly lower than that mediated by the on-target gRNA. Moreover, for the gene editing systems comprising the SauriCas9-HF protein, the Sha2Cas9-HF protein and the Sa-SlugCas9-HF protein, no obvious mismatch tolerance was observed for any of the double mismatches. This indicates that these systems place extremely high requirements on complete pairing between the gRNA and the target sequence, have a lower mismatch tolerance, and offer higher safety in practical applications.

Claims (14)

1. A Cas9 protein, the Cas9 protein being:
a SauriCas9-HF protein having the amino acid sequence shown in SEQ ID NO: 1, or a homolog that has at least 80% sequence identity to the amino acid sequence shown in SEQ ID NO: 1 and retains its biological activity;
a Sha2Cas9-HF protein having the amino acid sequence shown in SEQ ID NO: 2, or a homolog that has at least 80% sequence identity to the amino acid sequence shown in SEQ ID NO: 2; or alternatively
a Sa-SlugCas9-HF protein having the amino acid sequence shown in SEQ ID NO: 3, or a homolog that has at least 80% sequence identity to the amino acid sequence shown in SEQ ID NO: 3 and retains its biological activity.
2. A conjugate, the conjugate comprising:
a) The Cas9 protein of claim 1;
b) A modifying moiety; for example, the modifying moiety is selected from an additional protein or polypeptide, a detectable label, or a combination thereof; for example, the additional protein or polypeptide is selected from one or more of an epitope tag, a reporter protein or Nuclear Localization Signal (NLS) sequence, cytosine deaminase (CBE), adenine deaminase (ABE), cytosine methylase DNMT3A and MQ1, cytosine demethylase Tet1, transcriptional activator proteins VP64, p65 and RTA, transcriptional repressor protein KRAB, histone acetylase p300, histone deacetylase LSD1, and endonuclease FokI; and
c) Optionally a linker for connecting the Cas9 protein to the modifying moiety; for example, the linker is a linker of 1-50 amino acids in length.
3. A fusion protein, the fusion protein comprising:
a) The Cas9 protein of claim 1;
b) Additional proteins and polypeptides; for example, one or more selected from the group consisting of epitope tag, reporter protein or Nuclear Localization Signal (NLS) sequence, cytosine deaminase (CBE), adenine deaminase (ABE), cytosine methylase DNMT3A and MQ1, cytosine demethylase Tet1, transcriptional activator proteins VP64, p65 and RTA, transcriptional repressor protein KRAB, histone acetylase p300, histone deacetylase LSD1, and endonuclease FokI; and
c) Optionally a linker for linking the Cas9 protein to the additional proteins and polypeptides; for example, the linker is a linker of 1-50 amino acids in length.
4. An isolated nucleic acid molecule comprising a nucleic acid sequence encoding:
a) The Cas9 protein of claim 1;
b) The conjugate of claim 2; or alternatively
c) A fusion protein according to claim 3.
5. The isolated nucleic acid molecule of claim 4, wherein the isolated nucleic acid molecule further comprises a nucleic acid sequence encoding a single stranded guide RNA; the single stranded guide RNA includes a scaffold sequence having:
(i) A nucleic acid sequence shown in SEQ ID NO. 7;
(ii) A nucleic acid sequence which has at least 90% sequence identity to the nucleic acid sequence shown in SEQ ID NO. 7 and retains its biological activity; or alternatively
(iii) A nucleic acid sequence engineered based on the nucleic acid sequence set forth in SEQ ID NO. 7 and retaining its biological activity;
wherein the modification is, for example, one or more of base phosphorylation, base vulcanization, base methylation, base hydroxylation, shortening of the sequence, and lengthening of the sequence; shortening of the sequence and lengthening of the sequence includes, for example, deletions or additions of 1 to 10 bases relative to the base sequence;
for example, the isolated nucleic acid molecule comprises SEQ ID NO:8 or a degenerate sequence thereof.
6. The isolated nucleic acid molecule of claim 5, wherein the single stranded guide RNA further comprises a CRISPR spacer sequence at the 5' end of the scaffold sequence, the CRISPR spacer sequence being a sequence of 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 nucleotides (preferably 21 nucleotides) in length and capable of complementary pairing with a target sequence.
7. A vector comprising a nucleic acid sequence encoding:
a) The Cas9 protein of claim 1;
b) The conjugate of claim 2; or alternatively
c) A fusion protein of claim 3;
for example, the vector comprises SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 or a degenerate sequence thereof;
for example, the vector is a plasmid vector such as a pUC19 vector, a cosmid vector, a pAAV2_ITR vector, a retrovirus vector, a lentivirus vector, an adenovirus vector or an adeno-associated virus vector.
8. The vector of claim 7, wherein the vector further comprises a nucleic acid sequence encoding a single stranded guide RNA; the single stranded guide RNA includes a scaffold sequence having:
(i) A nucleic acid sequence shown in SEQ ID NO. 7;
(ii) A nucleic acid sequence which has at least 90% sequence identity to the nucleic acid sequence shown in SEQ ID NO. 7 and retains its biological activity; or alternatively
(iii) A nucleic acid sequence engineered based on the nucleic acid sequence set forth in SEQ ID NO. 7 and retaining its biological activity;
wherein the modification is, for example, one or more of base phosphorylation, base vulcanization, base methylation, base hydroxylation, shortening of the sequence, and lengthening of the sequence; shortening of the sequence and lengthening of the sequence includes, for example, deletions or additions of 1 to 10 bases relative to the base sequence;
For example, the vector comprises SEQ ID NO:8 or a degenerate sequence thereof.
9. The vector of claim 8, wherein the single stranded guide RNA further comprises a CRISPR spacer sequence at the 5' end of the scaffold sequence, the CRISPR spacer sequence being a sequence of 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 nucleotides (preferably 21 nucleotides) in length and capable of complementary pairing with a target sequence.
10. A CRISPR/Cas9 gene editing system, comprising:
a) A protein component comprising the Cas9 protein of claim 1, the conjugate of claim 2; or the fusion protein of claim 3;
b) A nucleic acid component comprising: a single stranded guide RNA comprising a scaffold sequence having:
(i) A nucleic acid sequence shown in SEQ ID NO. 7;
(ii) A nucleic acid sequence which has at least 90% sequence identity to the nucleic acid sequence shown in SEQ ID NO. 7 and retains its biological activity; or alternatively
(iii) A nucleic acid sequence engineered based on the nucleic acid sequence set forth in SEQ ID NO. 7 and retaining its biological activity;
wherein the modification is, for example, one or more of base phosphorylation, base vulcanization, base methylation, base hydroxylation, shortening of the sequence, and lengthening of the sequence; shortening of the sequence and lengthening of the sequence includes, for example, deletions or additions of 1 to 10 bases relative to the base sequence;
And preferably, the single stranded guide RNA further comprises a CRISPR spacer sequence at the 5' end of the scaffold sequence; the CRISPR spacer sequence is a sequence of 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 nucleotides (preferably 21 nucleotides) in length and capable of complementary pairing with a target sequence;
and, the protein component and the nucleic acid component are bound to each other to form a complex.
11. A cell, the cell comprising: the isolated nucleic acid molecule of any one of claims 4 to 6, or the vector of any one of claims 7 to 9; the cell is, for example, a prokaryotic cell or a eukaryotic cell, for example, an animal cell, for example, a mammalian cell such as a human cell.
12. A method of gene editing a target sequence in an intracellular or in vitro environment, the method comprising: contacting any one of the following (1) to (3) with a target sequence in an intracellular or in vitro environment:
(1) The Cas9 protein according to claim 1, the conjugate according to claim 2 or the fusion protein according to claim 3, and a single stranded guide RNA;
(2) The vector of claim 9; and
(3) The CRISPR/Cas9 gene editing system according to claim 10;
wherein upon contact with a target sequence, the Cas9 protein, the conjugate, or the fusion protein recognizes a respective protospacer adjacent motif (PAM) located at the 5' end of the target sequence and having the sequence 5'-NNGG;
for example, the cell is a prokaryotic cell or a eukaryotic cell, for example, an animal cell, for example, a mammalian cell such as a human cell;
for example, the gene editing includes one or more of gene knockout to a target sequence, site-directed base changes, site-directed insertion, regulation of gene transcription levels, DNA methylation regulation, DNA acetylation modification, histone acetylation modification, single base conversion including, for example, conversion of the base adenine to guanine, cytosine to thymine, or cytosine to uracil, and chromatin imaging tracking;
wherein the single stranded guide RNA comprises a scaffold sequence; the scaffold sequence has:
(i) A nucleic acid sequence shown in SEQ ID NO. 7;
(ii) A nucleic acid sequence which has at least 90% sequence identity to the nucleic acid sequence shown in SEQ ID NO. 7 and retains its biological activity; or alternatively
(iii) A nucleic acid sequence engineered based on the nucleic acid sequence set forth in SEQ ID NO. 7 and retaining its biological activity;
wherein the modification is, for example, one or more of base phosphorylation, base vulcanization, base methylation, base hydroxylation, shortening of the sequence, and lengthening of the sequence; shortening of the sequence and lengthening of the sequence includes, for example, deletions or additions of 1 to 10 bases relative to the base sequence;
and preferably, the single stranded guide RNA further comprises a CRISPR spacer sequence at the 5' end of the scaffold sequence; the CRISPR spacer sequence is a sequence of 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 nucleotides (preferably 21 nucleotides) in length and capable of complementary pairing with a target sequence.
13. The method of claim 12, wherein the CRISPR spacer sequence forms a complete base complementary pairing structure with the target sequence and an incomplete base complementary pairing structure with a non-target sequence; for example, the incomplete base complementary pairing structure includes one or more base mismatches, such as two or more base mismatches.
14. A kit for gene editing of a target sequence in an intracellular or in vitro environment, comprising:
a) Any one selected from the following 1) to 4):
1) The Cas9 protein according to claim 1, the conjugate according to claim 2, or the fusion protein according to claim 3, and a single-stranded guide RNA;
2) The isolated nucleic acid molecule of claim 6;
3) The vector of claim 9; or alternatively
4) The CRISPR/Cas9 gene editing system according to claim 10; and
b) Instructions for how to perform gene editing of a target sequence in an intracellular or in vitro environment;
wherein the single stranded guide RNA comprises a scaffold sequence; the scaffold sequence has:
(i) A nucleic acid sequence shown in SEQ ID NO. 7;
(ii) A nucleic acid sequence which has at least 90% sequence identity to the nucleic acid sequence shown in SEQ ID NO. 7 and retains its biological activity; or alternatively
(iii) A nucleic acid sequence engineered based on the nucleic acid sequence set forth in SEQ ID NO. 7 and retaining its biological activity;
wherein the modification is, for example, one or more of base phosphorylation, base vulcanization, base methylation, base hydroxylation, shortening of the sequence, and lengthening of the sequence; shortening of the sequence and lengthening of the sequence includes, for example, deletions or additions of 1 to 10 bases relative to the base sequence;
And preferably, the single stranded guide RNA further comprises a CRISPR spacer sequence at the 5' end of the scaffold sequence; the CRISPR spacer sequence is a sequence of 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 nucleotides (preferably 21 nucleotides) in length and capable of complementary pairing with a target sequence.
CN202211134139.6A 2022-09-16 2022-09-16 Cas9 protein, gene editing system containing Cas9 protein and application Pending CN116144629A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211134139.6A CN116144629A (en) 2022-09-16 2022-09-16 Cas9 protein, gene editing system containing Cas9 protein and application

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211134139.6A CN116144629A (en) 2022-09-16 2022-09-16 Cas9 protein, gene editing system containing Cas9 protein and application

Publications (1)

Publication Number Publication Date
CN116144629A true CN116144629A (en) 2023-05-23

Family

ID=86357062

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211134139.6A Pending CN116144629A (en) 2022-09-16 2022-09-16 Cas9 protein, gene editing system containing Cas9 protein and application

Country Status (1)

Country Link
CN (1) CN116144629A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination