CN113234701B - Cpf1 protein and gene editing system - Google Patents

Publication number
CN113234701B
Authority
CN
China
Prior art keywords
lys
sequence
protein
leu
crispr
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110325073.8A
Other languages
Chinese (zh)
Other versions
CN113234701A (en)
Inventor
谢红娴
程欢欢
黄龙
兰凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuhai Shutong Medical Technology Co ltd
Original Assignee
Zhuhai Shutong Medical Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhuhai Shutong Medical Technology Co ltd
Publication of CN113234701A
Application granted
Publication of CN113234701B
Legal status: Active

Classifications

    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14Hydrolases (3)
    • C12N9/16Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22Ribonucleases RNAses, DNAses
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/11DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00Structure or type of the nucleic acid
    • C12N2310/10Type of nucleic acid
    • C12N2310/20Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Chemical & Material Sciences (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Organic Chemistry (AREA)
  • Biomedical Technology (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Plant Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Medicinal Chemistry (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention relates to a Cpf1 protein and a gene editing system. The gene editing system comprises a Cpf1 protein or one or more polynucleotides encoding the Cpf1 protein, and a CRISPR RNA or one or more polynucleotides encoding the CRISPR RNA, wherein the amino acid sequence of the Cpf1 protein is as shown in SEQ ID NO.1 or is a sequence with at least 80% homology to SEQ ID NO.1. According to the invention, a new class 2 CRISPR/Cas gene editing system is mined through bioinformatic analysis of a metagenome and applied to editing genes of prokaryotes or eukaryotes, providing a new choice for the gene editing toolkit. The invention provides a novel V-type CRISPR/Cas12a gene editing system which has novel physicochemical properties and can recognize multiple PAM sequences (TTNA).

Description

Cpf1 protein and gene editing system
Technical Field
The invention belongs to the technical field of gene editing, and particularly relates to a Cpf1 protein and a gene editing system.
Background
Gene editing technology makes it possible to modify specific sites in a DNA sequence. For example, zinc finger nucleases (ZFNs), the first-generation gene editing tools, and transcription activator-like effector nucleases (TALENs), the second-generation gene editing tools, can be used to modify targeted genomes, but these methods are difficult to design and manufacture, expensive, and not broadly applicable.
The CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) system is a natural immune system from archaea and bacteria and a third-generation gene editing tool. Unlike earlier gene editing tools, which rely on protein-DNA recognition, it uses nucleic acid base pairing to recognize a target DNA sequence and guides a Cas effector protein to perform site-directed cleavage, with the advantages of broad applicability, simple design, low cost and high efficiency. Cas proteins contain a variety of effector domains that function in activities such as nucleic acid recognition, stabilization of the complex structure, and hydrolysis of DNA phosphodiester bonds. Among them, the type II CRISPR/Cas system derived from Streptococcus pyogenes (Streptococcus pyogenes Cas9, SpCas9) is currently the most widely used CRISPR/Cas system due to its high cleavage efficiency. This system achieves gene editing by recognizing a protospacer adjacent motif (PAM) sequence, i.e., "NGG", on the targeted polynucleotide and cleaving the target to leave blunt ends. In recent years, class 2 CRISPR systems comprising Cpf1, classified as type V, have also been increasingly used in the field of gene editing. Unlike the Cas9 endonuclease, cleavage by the Cpf1 protein requires only a single RNA guide, which simplifies the design and use of gene editing tools. Moreover, Cpf1 systems produce gene editing effects by recognizing T-rich PAMs and leaving sticky ends in the targeted DNA sequence, and the T/C preference of Cpf1 family proteins in PAM recognition expands the range of nucleic acids that CRISPR can target for editing.
In large and diverse metagenomes, microorganisms that have not been cultured or even discovered are hidden, and there may be a large number of undiscovered CRISPR/Cas systems whose activity in prokaryotes and eukaryotes, as well as in an in vitro environment, needs to be confirmed.
Disclosure of Invention
The object of the present invention is to provide a novel gene editing system to enrich the existing gene editing tool family.
In order to achieve the purpose, the invention adopts the technical scheme that:
according to a first aspect of the present invention there is provided a Cpf1 protein, the amino acid sequence of the Cpf1 protein being as shown in SEQ ID No.1 or having at least 80% homology thereto.
Preferably, the amino acid sequence of the Cpf1 protein has at least 85% homology with the amino acid sequence shown in SEQ ID No.1, preferably at least 90% homology, more preferably at least 95% homology, even more preferably at least 96%, 97%, 98%, 99% homology.
Preferably, the Cpf1 protein comprises mutant amino acid residues at one or more of the following positions: 117, 150, 267, 281, 538, 593 and 776.
The Cpf1 protein is a DNA endonuclease, and the Cpf1 protein cleaves double-stranded DNA complementary to CRISPR RNA (crRNA) downstream of the PAM sequence through different nuclease domains; the different nuclease domains are HNH-like nuclease domains or RuvC-like nuclease domains.
The Cpf1 protein is abbreviated as LtCpf1, comprises 1296 amino acids and is a multifunctional, multi-domain DNA endonuclease. It has both RNA and DNA endonuclease activities, participates in the maturation of pre-crRNA, and, through a RuvC nuclease domain and an HNH-like nuclease domain, recognizes and efficiently cleaves double-stranded DNA that is complementary to the crRNA and located downstream of the PAM. It cleaves at the 23rd nucleotide of the target DNA strand and the 17th nucleotide of the non-target DNA strand downstream of the PAM sequence, generating a sticky end with a 6-nucleotide overhang.
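The overhang length stated above follows directly from the two cut positions. A minimal sketch of that arithmetic, assuming both positions are counted from the PAM in the same direction (the function name is hypothetical and not part of the patent):

```python
def overhang_length(target_cut: int, non_target_cut: int) -> int:
    """Length of the single-stranded overhang left by a staggered cut.

    Both arguments are cut positions counted downstream of the PAM,
    one on the target strand and one on the non-target strand.
    """
    return abs(target_cut - non_target_cut)


# Cut positions stated for LtCpf1 in the text: 23rd and 17th nucleotide.
print(overhang_length(23, 17))
```

A result of 0 would correspond to a blunt cut, as described for SpCas9 in the Background.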
In a second aspect, the present invention provides a polynucleotide encoding a Cpf1 protein as defined above.
Preferably, the polynucleotide is codon optimized for expression in the cell of interest.
In a third aspect, the present invention provides a vector comprising the polynucleotide as described above.
In a fourth aspect, the invention provides a vector system comprising one or more vectors comprising a polynucleotide as described above and comprising one or more polynucleotides encoding CRISPR RNA on the same or different vectors.
In a fifth aspect of the present invention there is provided a complex comprising the Cpf1 protein, and CRISPR RNA.
The sixth aspect of the present invention provides a V-type CRISPR/Cas12a gene editing system comprising said Cpf1 protein or one or more nucleotide sequences encoding said Cpf1 protein, and CRISPR RNA or one or more polynucleotide sequences encoding said CRISPR RNA.
Preferably, the V-type CRISPR/Cas12a gene editing system further comprises an accessory protein, which may be involved in the capture of a foreign gene, or one or more polynucleotides encoding the accessory protein.
Further preferably, the auxiliary protein comprises one or more of a Cas1 protein, a Cas2 protein and a Cas4 protein.
Further preferably, the Cas1 protein has the amino acid sequence shown in SEQ ID No.2, or an amino acid sequence at least 80% homologous to the amino acid sequence shown in SEQ ID No.2, preferably at least 85% homologous, further preferably at least 90% homologous, more preferably at least 95% homologous, still more preferably at least 96%, 97%, 98%, 99% homologous;
the Cas2 protein has an amino acid sequence shown in SEQ ID NO.3 or an amino acid sequence at least 80% homologous with the amino acid sequence shown in SEQ ID NO.3, preferably at least 85% homologous, further preferably at least 90% homologous, more preferably at least 95% homologous, and still more preferably at least 96%, 97%, 98%, 99% homologous;
the Cas4 protein has an amino acid sequence shown in SEQ ID NO.4 or an amino acid sequence at least 80% homologous with the amino acid sequence shown in SEQ ID NO.4, preferably at least 85% homologous, further preferably at least 90% homologous, more preferably at least 95% homologous, and still more preferably at least 96%, 97%, 98%, 99% homologous.
The seventh aspect of the present invention also provides design principles for CRISPR RNA, including one or more of the following:
a) the sequence format of the CRISPR RNA is: 5'-direct repeat binding to Cpf1 protein-spacer complementary to target sequence-3';
b) the length of the spacer sequence of the CRISPR RNA is 10-30 bases;
c) the length of the direct repeat sequence of the CRISPR RNA is 12-37 bases;
d) the CRISPR RNA should contain a stem-loop structure.
Specifically, the target sequence is an exogenous DNA fragment or a target sequence designed and artificially synthesized aiming at a target gene.
Preferably, the direct repeat sequence of CRISPR RNA is as shown in SEQ ID NO.5 or has at least 80% homology thereto, more preferably the direct repeat sequence is as shown in any one of SEQ ID NO.6 to 12 or has at least 80% homology, preferably at least 85% homology, more preferably at least 90% homology, even more preferably at least 95% homology, even more preferably at least 96%, 97%, 98%, 99% homology to any one of SEQ ID NO.6 to 12.
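Design principles a) to c) can be checked mechanically. The following is a minimal sketch, assuming the spacer is supplied already written as the RNA that pairs with the target; the function name and the example sequences are hypothetical placeholders and are not SEQ ID NO.5 or any sequence from the patent. Principle d) (the stem-loop) is not checked here.

```python
import re

DR_LENGTHS = range(12, 38)      # principle c): direct repeat of 12-37 bases
SPACER_LENGTHS = range(10, 31)  # principle b): spacer of 10-30 bases


def build_crrna(direct_repeat: str, spacer: str) -> str:
    """Assemble a crRNA as 5'-direct repeat-spacer-3' (principle a)."""
    dr = direct_repeat.upper().replace("T", "U")
    sp = spacer.upper().replace("T", "U")
    for name, seq in (("direct repeat", dr), ("spacer", sp)):
        if not re.fullmatch(r"[ACGU]+", seq):
            raise ValueError(f"{name} must contain only A, C, G, U/T")
    if len(dr) not in DR_LENGTHS:
        raise ValueError("direct repeat must be 12-37 bases long")
    if len(sp) not in SPACER_LENGTHS:
        raise ValueError("spacer must be 10-30 bases long")
    return dr + sp


# Placeholder sequences for illustration only.
example = build_crrna("AAUUUCUACUGUUGUAGAU", "GACCTTGCAAGTCCGATTAGCAT")
print(example, len(example))
```

Keeping the length limits as named constants makes it easy to narrow them to the optimal lengths once those are determined experimentally.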
Preferably, the CRISPR RNA is generated by transcription of a CRISPR Array: transcription yields a precursor CRISPR RNA (pre-crRNA), which is processed and cleaved to form the CRISPR RNA. The CRISPR RNA serves as a guide RNA and forms a complex with the Cpf1 protein; the spacer sequence of the mature, processed CRISPR RNA is complementary to the targeted gene of interest and thereby directs the Cpf1 protein to cleave that gene in the genome of interest.
The CRISPR Array comprises a direct repetitive sequence matched with the Cpf1 protein and a spacer sequence, wherein the spacer sequence comprises a target sequence and an element related to the Cpf1 protein.
Further preferably, the sequence of the element related to the Cpf1 protein is as shown in one or more of SEQ ID NO 13 to NO 17 or a sequence having at least 80% homology with the sequence shown in SEQ ID NO 13 to NO 17, preferably at least 85% homology, more preferably at least 90% homology, even more preferably at least 95% homology, even more preferably at least 96%, 97%, 98%, 99% homology.
The CRISPR RNA (crRNA) according to the present invention directs the Cpf1 protein to recognize an invading foreign genome through base complementarity. When bacteria are invaded by bacteriophages, viruses or the like, short fragments of the exogenous DNA are integrated as new spacer sequences between the repeats of the CRISPR array in the host chromosome, thereby providing a genetic record of the infection. When the organism is invaded by the exogenous gene again, the CRISPR array is transcribed to generate a precursor crRNA (pre-crRNA), which is cleaved and processed into a mature crRNA with a direct repeat sequence at the 5' end and a spacer sequence at the 3' end. The direct repeat sequence at the 5' end guides the Cpf1 protein to bind the target sequence, the spacer sequence at the 3' end is complementary to the sequence of the invading exogenous gene, and the mature crRNA serves as the guide RNA (sgRNA) of the Cpf1 protein to cleave the target sequence.
The vector system, the complex, or the V-type CRISPR/Cas12a gene editing system of the invention binds or cleaves DNA through functional structures in a biological process; preferably, the functional structures include, but are not limited to, the crRNA secondary structure, the Cpf1 effector protein domains, or the Cpf1-crRNA complex structure; preferably, the DNA is prokaryotic or eukaryotic DNA.
The Cpf1 protein (LtCpf1) can recognize a variety of protospacer adjacent motifs (PAMs) immediately upstream of a targeting sequence and generate double-strand breaks (DSBs) in the DNA. Two factors are required for LtCpf1 to recognize a targeting sequence: one is a nucleotide sequence complementary to the crRNA spacer sequence, and the other is a protospacer adjacent motif (PAM) sequence adjacent to the complementary sequence. A depletion experiment shows that LtCpf1 has cleavage activity in a prokaryotic system and provides preliminary evidence that positions 1 and 2 of the PAM sequence recognized by the newly discovered V-type CRISPR/Cas12a system are TT. Interference experiments and in vitro cleavage further show that the PAM upstream of the target sequence recognized by LtCpf1 can be TTNA (N is A, T, C or G). By artificially designing the spacer sequence in the crRNA, this CRISPR-Cas system can target almost any DNA sequence of interest in the genome, creating site-specific double-strand breaks (DSBs) with cohesive ends. The DSB can be repaired by non-homologous end joining, generating small random insertions/deletions (indels) at the cleavage site that inactivate the gene of interest; alternatively, through high-fidelity homologous repair, a homologous repair template can be used to make precise genomic modifications at the DSB site. In addition to directed cleavage of double-stranded DNA, LtCpf1 activates its single-stranded DNA cleavage activity upon targeting the double-stranded DNA of interest, producing collateral cleavage of single-stranded DNA. This indiscriminate collateral cleavage of single-stranded DNA can be applied to rapid nucleic acid detection of DNA or RNA viruses and has significant clinical application value.
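Because the PAM lies immediately upstream of the targeting sequence, candidate target sites can be listed by scanning a DNA strand for TTNA and taking the bases that follow. This is a minimal sketch under stated assumptions: the function name is hypothetical, the default protospacer length of 23 is an arbitrary choice within the 10-30 base range given for the spacer, and only the strand passed in is scanned (the reverse complement would need a second call).

```python
import re

# Lookahead so that overlapping TTNA occurrences are all reported.
TTNA = re.compile(r"(?=(TT[ACGT]A))")


def find_ttna_targets(dna: str, protospacer_len: int = 23):
    """Return (pam_start, pam, protospacer) for each TTNA PAM that is
    followed by a full-length protospacer on the given strand."""
    dna = dna.upper()
    hits = []
    for match in TTNA.finditer(dna):
        start = match.start()
        protospacer = dna[start + 4:start + 4 + protospacer_len]
        if len(protospacer) == protospacer_len:
            hits.append((start, match.group(1), protospacer))
    return hits


# Made-up example sequence, for illustration only.
demo = "GG" + "TTCA" + "ACGT" * 5 + "ACG" + "CC"
for hit in find_ttna_targets(demo):
    print(hit)
```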
The vector system, the complex and the V-type CRISPR/Cas12a gene editing system are applied to editing or detecting genes of prokaryotes or eukaryotes; preferably, they are used for binding, targeted cleavage or non-targeted cleavage of DNA.
Due to the application of the technical scheme, compared with the prior art, the invention has the following advantages:
(1) According to the invention, a new class 2 CRISPR/Cas gene editing system is mined through bioinformatic analysis of a metagenome and applied to editing genes of prokaryotes or eukaryotes, providing a new choice for the gene editing toolkit.
(2) The novel V-type CRISPR/Cas12a gene editing system provided by the invention has novel physicochemical properties and can recognize multiple PAM sequences; the specific PAM sequence recognized by the gene editing system is TTNA (N represents A, T, C or G).
Drawings
Fig. 1 is a composition diagram of the homologous V-type CRISPR/Cas12a gene editing system described in the present invention.
FIG. 2 is a schematic diagram of sRNA-seq of the homologous V-type CRISPR/Cas12a gene editing system.
Fig. 3 is a schematic diagram of a preferred crRNA backbone of the homologous V-type CRISPR/Cas12a gene editing system described in the present invention.
Fig. 4 is a conserved PAM sequence of the homologous V-type CRISPR/Cas12a gene editing system of the present invention.
Fig. 5 shows the interference experiment result of the homologous V-type CRISPR/Cas12a gene editing system of the present invention.
Fig. 6 is a schematic diagram of targeted cleavage of target DNA at the in vitro level by the homologous V-type CRISPR/Cas12a gene editing system of the present invention.
Fig. 7 shows the specific PCR gel and Sanger sequencing results for insertion of a double-stranded oligonucleotide (dsODN) after the homologous V-type CRISPR/Cas12a system of the present invention cleaves a target in a eukaryotic cell line.
Fig. 8 is a schematic diagram of the optimal length of the direct repeat sequence of the homologous V-type CRISPR/Cas12a gene editing system described in the present invention.
Fig. 9 is a schematic diagram of the optimal length of the recognition sequence of the homologous V-type CRISPR/Cas12a gene editing system described in the present invention.
FIG. 10 is a schematic diagram showing the result of the rapid nucleic acid detection of the COVID-19 gene by the homologous V-type CRISPR/Cas12a gene editing system of the present invention.
Detailed Description
To better illustrate the objects, aspects and advantages of the present invention, the present invention will be further described with reference to specific examples.
Example 1
In the embodiment, the protein and related elements related to the CRISPR-Cas system are obtained by analyzing, predicting and screening intestinal metagenome; the V-type CRISPR/Cas12a gene editing system is composed as shown in figure 1.
As can be seen from Fig. 1, the V-type CRISPR/Cas12a gene editing system described in the present invention comprises the following components: the endonuclease Cpf1 (LtCpf1), the helper proteins Cas1, Cas2 and Cas4, and the crRNA. LtCpf1 comprises 1296 amino acids, has the sequence shown in SEQ ID NO.1, and efficiently cleaves double-stranded DNA complementary to the sgRNA downstream of the PAM through different nuclease domains. The helper proteins Cas1 (sequence shown in SEQ ID NO.2), Cas2 (sequence shown in SEQ ID NO.3) and Cas4 (sequence shown in SEQ ID NO.4) are involved in foreign gene capture and crRNA maturation. The crRNA includes a direct repeat (sequence shown in SEQ ID NO.5) and a spacer (complementary to the foreign gene fragment or the artificially designed target sequence), and can be transcribed from the CRISPR Array. The CRISPR Array includes direct repeats (corresponding to the direct repeat of the crRNA) and spacers; the spacers of the CRISPR Array, which correspond to the spacer of the crRNA, include the foreign gene fragment or the artificially designed target sequence together with elements related to the LtCpf1 protein, as shown in SEQ ID NO.13 to NO.17.
With the aid of the proteins Cas1, Cas2 and Cas4, foreign gene fragments or artificially designed sequences (target sequences) are integrated as new spacer sequences between the direct repeats of the CRISPR Array, which is transcribed to give the precursor CRISPR RNA (pre-crRNA). From 5' to 3', the pre-crRNA sequence is: 5'-direct repeat-spacer-direct repeat-3'; the spacer of the pre-crRNA includes sequences complementary to the target sequence as well as sequences complementary to the elements associated with the LtCpf1 protein. The pre-crRNA is then cleaved and processed into a mature CRISPR RNA (crRNA) with a direct repeat sequence at the 5' end and a spacer sequence at the 3' end. The spacer sequence at the 3' end is complementary to the exogenous gene fragment or the artificially designed sequence (target sequence), and the direct repeat sequence at the 5' end guides the LtCpf1 protein to bind the target sequence, so that the crRNA, acting as a guide RNA (sgRNA), directs the LtCpf1 protein to cleave the target sequence.
Example 2
This example is to verify the crRNA structure of the V-type CRISPR/Cas12a gene editing system predicted in example 1 of the present invention.
(1) Materials: the CRISPR/Cas gene editing system-associated genes predicted in example 1 above.
(2) The verification method comprises the following steps: constructing an Escherichia coli editing system with the V-type CRISPR/Cas12a gene editing system, extracting total RNA from the recombinant Escherichia coli, purifying it, adding sequencing adapters, and analyzing the result to obtain the wild-type mature CRISPR/Cas12a crRNA structure;
the specific operation is as follows:
(a) inserting the V-type CRISPR/Cas12a gene editing system described in Example 1 (comprising the endonuclease LtCpf1, the helper proteins Cas1, Cas2 and Cas4, and the CRISPR array) into a pACYC184 vector, codon-optimizing the Cpf1 protein for Escherichia coli, adding the elements related to the LtCpf1 protein described in Example 1 (SEQ ID NO.13 to NO.17) and the library sequence (SEQ ID NO.18) used in reference (1) into the CRISPR array as target sequences, and placing the strong heterologous promoter J23119 upstream of the LtCpf1 protein and of the CRISPR array, to construct the expression plasmid pACYC184-LtCpf1 and obtain recombinant Escherichia coli;
(b) extracting total RNA from the recombinant Escherichia coli, and subjecting the sample to DNA removal, rRNA removal and phosphorylation;
(c) retaining sRNA of 15-120 nt in the RNA sample, adding Illumina platform sequencing adapters to both ends of the sRNA, reverse-transcribing the RNA library, amplifying it by PCR, and performing high-throughput sequencing of the resulting high-yield sRNA-seq library;
(d) aligning the sequencing reads to the CRISPR/Cas12a sequence to obtain the wild-type mature CRISPR/Cas12a crRNA structure.
By comparing the CRISPR/Cas12a sequence alignments shown in Fig. 2, it can be seen that, under the action of the LtCpf1 nuclease, 15 nt upstream of the repeat sequence and 7-9 nt downstream of the spacer sequence are removed from the pre-crRNA, forming a mature crRNA with a direct repeat sequence at the 5' end and a spacer sequence at the 3' end; the mature crRNA binds LtCpf1 to form a crRNA-Cpf1 complex.
Reference (1): Esvelt, K.M., Mali, P., Braff, J.L., Moosburner, M., Yaung, S.J. and Church, G.M. (2013) Orthogonal Cas9 proteins for RNA-guided gene regulation and editing. Nat Methods, 10, 1116-1121.
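The trimming described for Fig. 2 can be expressed as two slices on one repeat-spacer unit of the pre-crRNA. This is a minimal sketch, assuming that "15 nt upstream of the repeat sequence" means the 5'-most 15 nt of the repeat and that "7-9 nt downstream of the spacer sequence" means the 3'-most 7-9 nt of the spacer; the function name and the example sequences are hypothetical.

```python
def mature_crrna(direct_repeat: str, spacer: str,
                 repeat_trim: int = 15, spacer_trim: int = 8) -> str:
    """Trim one repeat-spacer unit of a pre-crRNA into a mature crRNA.

    repeat_trim nt are removed from the 5' end of the repeat and
    spacer_trim nt (7-9 in the text) from the 3' end of the spacer.
    """
    if not 7 <= spacer_trim <= 9:
        raise ValueError("the text reports 7-9 nt removed from the spacer")
    if repeat_trim >= len(direct_repeat) or spacer_trim >= len(spacer):
        raise ValueError("trim length exceeds sequence length")
    return direct_repeat[repeat_trim:] + spacer[:-spacer_trim]


# Placeholder sequences for illustration only.
unit_repeat = "G" * 15 + "AAUUUCUACUGUUGUAGAU"
unit_spacer = "A" * 30
print(mature_crrna(unit_repeat, unit_spacer))
```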
Example 3
This example is to predict the secondary structure of crRNA of V-type CRISPR/Cas12a gene editing system described in example 1 of the present invention.
The association of the RNA transcribed from a spacer sequence (Spacer) with that of a direct repeat sequence (repeat sequence) is simulated, and the structure of the crRNA formed by the spacer and repeat sequences is predicted.
(1) Materials: predicted spacer sequence and direct repeat (DR) sequence.
(2) Software: NUPACK (http://www.nupack.org/partition/new)
(3) The prediction method comprises the following steps: the secondary structure of the CRISPR RNA (crRNA) at 37 ℃ was simulated using the online NUPACK application.
Fig. 3 is a schematic diagram of a preferred crRNA backbone identified, after screening and optimization, for the homologous V-type CRISPR/Cas12a gene editing system of the present invention. Fig. 3 shows that the repeat sequence of the crRNA (shown in SEQ ID NO.5) forms a stem-loop secondary structure with 5 base pairs, which is the key domain through which the crRNA and the LtCpf1 protein recognize each other and which directs the LtCpf1 protein to cleave the targeted double-stranded DNA. Fig. 3 also shows that preferred variants of the repeat sequence (SEQ ID NO.5) forming the stem-loop structure in the crRNA backbone include the repeat sequences shown in SEQ ID NO.6 to 12.
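A quick sanity check for a 5-base-pair stem-loop can be done without thermodynamic modelling by searching for two Watson-Crick complementary arms separated by a short loop. The sketch below is only a naive pattern search, not a substitute for the NUPACK prediction used in this example; the function name, the loop-size limits and the example sequence are assumptions, and the sequence is a placeholder rather than SEQ ID NO.5.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def find_stem_loop(rna: str, stem: int = 5, min_loop: int = 3, max_loop: int = 8):
    """Return (stem5_start, loop_start, stem3_start) of the first hairpin
    whose stem has `stem` Watson-Crick pairs, or None if there is none."""
    rna = rna.upper()
    n = len(rna)
    for i in range(n - 2 * stem - min_loop + 1):
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop          # start of the 3' arm
            if j + stem > n:
                break
            left = rna[i:i + stem]
            right = rna[j:j + stem]
            # The 5' arm pairs with the 3' arm read in reverse.
            if all((a, b) in WATSON_CRICK for a, b in zip(left, reversed(right))):
                return i, i + stem, j
    return None


# Placeholder repeat for illustration only.
print(find_stem_loop("AAUUUCUACUGUUGUAGAU"))
```

G-U wobble pairs are deliberately excluded here; adding them to `WATSON_CRICK` would make the search more permissive.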
Example 4
The purpose of this example is to identify the protospacer adjacent motif (PAM) recognized by the V-type CRISPR/Cas12a gene editing system of the present invention in a prokaryotic system.
(1) Materials: the pACYC184-LtCpf1 plasmid, the target-library plasmid, E.coli DH5a obtained in example 2.
(2) The verification method comprises the following steps: in this embodiment, a prokaryotic verification system is constructed for the V-type CRISPR/Cas12a gene editing system described in the present invention, its cleavage activity is verified, and the recognized PAM sequence is preliminarily identified by second-generation sequencing.
The specific operation is as follows:
(a) adding 7 random bases (16384 inserts in total) at the 3' end of the spacer sequence (SEQ ID NO.18) of the library, selecting the EcoRI and NcoI restriction sites in the multiple cloning site (MCS) of a pUC19 vector, and cloning the library into the vector to construct the target-library plasmid;
(b) transforming Escherichia coli DH5a with the pACYC184-LtCpf1 plasmid or the empty pACYC184 plasmid, preparing competent cells, transforming each with 200 ng of target-library plasmid, recovering at 25 ℃ for 2 h, spreading evenly on SOB medium containing both ampicillin sodium (100 ug/ml) and chloramphenicol (25 ug/ml), incubating at 25 ℃ for 30 h, and collecting the plasmids by alkaline lysis;
(c) amplifying by PCR the region containing the spacer sequence and the seven random bases, adding adapters to both ends of the PCR product for second-generation sequencing, calculating the PAM depletion threshold (PPDV) relative to the empty-vector control group, and generating the PAM sequence of the V-type CRISPR/Cas12a gene editing system with Weblogo.
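The library size quoted in step (a) is the number of distinct 7-base sequences over four nucleotides, 4^7. A short sketch that enumerates them (the function name is hypothetical):

```python
from itertools import product


def random_flank_library(n_random: int = 7):
    """Enumerate every possible n-base flank over the alphabet A, C, G, T."""
    return ["".join(bases) for bases in product("ACGT", repeat=n_random)]


library = random_flank_library()
print(len(library))
```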
Fig. 4 is a schematic diagram of the conserved protospacer adjacent motif (PAM) sequence recognized by the V-type CRISPR/Cas12a gene editing system in a prokaryotic system. The library DNA obtained from the depletion experiment was analyzed by second-generation sequencing, the PAM depletion threshold (PPDV) relative to the empty-vector control group was calculated, and the PAM sequence generated by Weblogo for LtCpf1 is TTNA.
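The text does not give the formula behind the PPDV, so the following is only an illustrative sketch of one common way to score depletion: the negative log2 ratio of each flank's normalized read frequency in the LtCpf1 sample to its frequency in the empty-vector control. The function name, the pseudocount and the scoring formula are all assumptions, not the patent's method.

```python
import math
from collections import Counter


def depletion_scores(control_pams, sample_pams, pseudocount: float = 1.0):
    """Return {pam: -log2(sample_freq / control_freq)}.

    Higher scores mean the flank is more depleted in the sample than in
    the control. A pseudocount avoids division by zero for absent flanks.
    """
    control, sample = Counter(control_pams), Counter(sample_pams)
    pams = set(control) | set(sample)
    n_control = sum(control.values()) + pseudocount * len(pams)
    n_sample = sum(sample.values()) + pseudocount * len(pams)
    scores = {}
    for pam in pams:
        f_control = (control[pam] + pseudocount) / n_control
        f_sample = (sample[pam] + pseudocount) / n_sample
        scores[pam] = -math.log2(f_sample / f_control)
    return scores


# Toy read counts, for illustration only.
control_reads = ["TTCA"] * 10 + ["GGGG"] * 10
sample_reads = ["TTCA"] * 1 + ["GGGG"] * 19
print(depletion_scores(control_reads, sample_reads))
```

Flanks whose score exceeds a chosen threshold would then be passed to a sequence-logo tool to visualize the conserved positions.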
Example 5
In this example, the possible protospacer adjacent motifs (PAMs) recognized by the V-type CRISPR/Cas12a gene editing system described in Example 1 of the present invention were further determined and verified by interference experiments, and the cleavage ability of the system at the prokaryotic level was determined.
(1) Materials: pACYC184-LtCpf1 plasmid, target-library plasmid, PAM preliminarily recognized by LtCpf1 obtained in example 4, and Escherichia coli DH5a.
(2) The verification method comprises the following steps: in this embodiment, an Escherichia coli interference experiment is used to further determine the protospacer adjacent motif (PAM) recognized by the V-type CRISPR/Cas12a gene editing system in a prokaryotic system;
the specific operation is as follows:
(a) TTNA (N stands for A, T, C or G) is added at the 5' end of the spacer sequence (SEQ ID NO.18), and each construct is cloned into pUC19 through the EcoRI and NcoI restriction sites to construct target-library plasmids with TTNA as the PAM;
(b) each target-library plasmid containing TTNA as the 5' PAM is electroporated into electrocompetent E. coli DH5a cells containing the LtCpf1-related locus, recovered at 25 ℃ for 2 h, serially diluted, spotted onto SOB medium containing both ampicillin sodium (100 ug/ml) and chloramphenicol (25 ug/ml), and incubated overnight at 25 ℃, after which the number of single colonies is observed.
Fig. 5 is a schematic diagram of the interference experiment of the V-type CRISPR/Cas12a gene editing system. From left to right, the PAMs of the target-library plasmids used in columns 1, 2, 3 and 4 of Fig. 5 are TTAA, TTTA, TTCA and TTGA, respectively, and column 5 is a non-target plasmid transformed into E. coli carrying LtCpf1. Single-colony numbers were observed after serial dilution; the colony numbers in columns 1 to 4 (TTNA-target) are markedly lower than in column 5 (non-target), indicating that LtCpf1 can effectively recognize TTNA as a PAM and target and cleave a DNA sequence in Escherichia coli.
Example 6
In this example, the in vitro cleavage experiment is used to verify the cleavage capability of the V-type CRISPR/Cas12a gene editing system described in example 1 of the present invention at the in vitro level and the potential gene editing capability in eukaryotes.
(1) Materials: the PAM preliminarily identified for LtCpf1 in example 4; the structure of the LtCpf1 wild-type crRNA obtained by sRNA-seq; LtCpf1 protein purified in vitro; and HEK293T cell DNA;
(2) the verification method comprises the following steps: in this embodiment, an in vitro cleavage experiment on HEK293T cell DNA is used to further determine whether the V-type CRISPR/Cas12a gene editing system of the present invention can cleave double-stranded DNA in a targeted manner at the in vitro level, and to confirm the recognized protospacer adjacent motif (PAM);
the specific operation is as follows:
(a) selecting a sequence with TTNA near a target spot tested by a literature as an in vitro cutting target spot (SEQ ID NO. 19-NO. 22) to be added into a CRISPR array, and designing a specific primer to amplify to obtain an HEK293T cell DNA in vitro cutting template;
(b) CRISPR array transcribes 4 pieces of LtCpf1-crRNA aiming at 4 different PAMs (TTAA, TTTA, TTCA and TTGA) in vitro;
(c) according to LtCpf 1: crRNA: mixing samples with the molar weight of the template being 10:10:1, incubating for 15 minutes at 37 ℃, performing agarose gel electrophoresis, and observing whether the DNA template has a cutting effect.
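The 10:10:1 molar ratio in step (c) can be turned into reaction amounts with simple arithmetic. The sketch below is only an illustration under stated assumptions: the template quantity is a hypothetical input (the source gives no absolute amounts), and the mass-to-mole conversion uses the common approximation of about 650 Da per base pair of double-stranded DNA, which is not taken from the source.

```python
def dsdna_pmol(mass_ng: float, length_bp: int, da_per_bp: float = 650.0) -> float:
    """Convert a dsDNA mass (ng) to pmol, assuming ~650 Da per base pair."""
    if mass_ng < 0 or length_bp <= 0:
        raise ValueError("mass must be non-negative and length positive")
    return mass_ng * 1000.0 / (length_bp * da_per_bp)


def reaction_amounts(template_pmol: float, ratio=(10, 10, 1)) -> dict[str, float]:
    """Scale protein and crRNA to the template at a protein:crRNA:template ratio."""
    protein, crrna, template = ratio
    unit = template_pmol / template
    return {
        "LtCpf1_pmol": protein * unit,
        "crRNA_pmol": crrna * unit,
        "template_pmol": template_pmol,
    }


if __name__ == "__main__":
    # Hypothetical template: 65 ng of a 1000 bp PCR product (0.1 pmol).
    amounts = reaction_amounts(dsdna_pmol(65.0, 1000))
    for name, value in amounts.items():
        print(f"{name}: {value:.2f}")
```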
FIG. 6 shows in vitro cleavage by the V-type CRISPR/Cas12a gene editing system of the present invention. LtCpf1 effectively recognizes TTNA and performs targeted cleavage in vitro: clear double-strand breaks are observed in the TTAA, TTTA and TTCA groups, and relatively weak double-strand breaks in the TTGA group. Thus, by combining the PAM sequence preliminarily identified in the depletion experiment with the crRNA backbone structure obtained by sRNA-seq, LtCpf1 can produce effective targeted DNA cleavage in vitro.
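Selecting candidate targets next to a TTNA PAM, as done for the in vitro cleavage templates, can be sketched as a simple sequence scan. This is a minimal illustration, not the procedure used in the examples: it assumes the PAM lies directly 5' of the protospacer on the scanned strand, searches only that one strand, and uses a toy input sequence; `find_targets` is a hypothetical helper name.

```python
import re


def find_targets(dna: str, spacer_len: int = 23) -> list[tuple[int, str, str]]:
    """Return (PAM start index, PAM, protospacer) for each TTNA site in `dna`.

    A lookahead is used so that overlapping candidate sites are all reported.
    """
    dna = dna.upper()
    pattern = re.compile(rf"(?=(TT[ACGT]A)([ACGT]{{{spacer_len}}}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


if __name__ == "__main__":
    # Toy sequence for illustration only; it is not a real genomic region.
    toy = "GG" + "TTCA" + "GATTACA" * 3 + "GA" + "CC"
    for start, pam, protospacer in find_targets(toy):
        print(start, pam, protospacer)
```

A fuller design tool would also scan the reverse complement and screen candidates for off-target matches; both are left out here to keep the sketch short.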
Example 7
In this example, an ODN experiment is used to verify that the V-type CRISPR/Cas12a gene editing system of the present invention can cleave a target DNA sequence in eukaryotic cells; the ODN-PCR and Sanger sequencing results are shown in FIG. 7.
(1) Materials: the CRISPR/Cas gene editing system of example 1 and the PAM recognized by LtCpf1 obtained in example 4; an existing V-type CRISPR/Cas12a gene editing system (LbCpf1 protein), for which the LbCpf1 plasmid was purchased from Addgene (catalog number pY016);
(2) Verification method: in this example, an ODN experiment is used to verify that the LtCpf1 V-type CRISPR/Cas12a gene editing system can cleave a target DNA sequence in eukaryotic cells;
the specific operation is as follows:
(a) synthesizing a human codon-optimized LtCpf1 coding sequence, cloning it into the PX330 eukaryotic vector, and constructing the PX330-LtCpf1 plasmid;
(b) selecting the human gene locus CDKN2A for crRNA design, with the following design principles (based on the sRNA-seq sequencing results):
1) the crRNA comprises a spacer and a direct repeat (DR) in the format 5'-direct repeat-spacer-3', where the direct repeat binds the LtCpf1 protein;
2) the spacer of the crRNA is 10-30 bases long;
3) the direct repeat of the crRNA is 12-37 bases long;
4) the direct repeat of the crRNA should contain a stem-loop structure;
the crRNA (designed after the structure of the wild-type mature crRNA obtained by sRNA-seq: direct repeat length 22 nt, spacer length 23 nt), with a spacer designed against CDKN2A such as SEQ ID NO.20, is inserted into the target plasmid by the Gibson method, and the strong human U6 promoter is added upstream of the crRNA to construct the PX330-LtCpf1-crRNA plasmid;
(c) electroporating 2.5 µg of the constructed PX330-LtCpf1-crRNA plasmid targeting the CDKN2A locus together with 1.5 µL of ODN into healthy HEK293T cells, and collecting all cells after 72 h to extract DNA;
(d) designing a pair of primers located near the target locus and on the ODN sequence for ODN-PCR, and using agarose gel electrophoresis to check for a band as a preliminary indication of whether a targeted cleavage event occurred.
As shown in FIG. 7, agarose gel electrophoresis showed that both the positive control LbCpf1 and the experimental group LtCpf1 gave an ODN-PCR band of the correct size, with substantially the same band intensity. Sanger sequencing showed that the ODN fragment was successfully integrated into the targeted sequence of the CDKN2A gene.
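The crRNA design principles of this example (5'-direct repeat-spacer-3', spacer of 10-30 bases, direct repeat of 12-37 bases, stem loop in the direct repeat) can be sketched as a small check. The sequences are SEQ ID NO.12 (22 nt direct repeat) and SEQ ID NO.20 (23 nt spacer) from the sequence listing. The stem-loop test is a deliberately crude heuristic assumed for illustration: it only looks for a perfectly Watson-Crick-paired 5 bp stem and is not an RNA structure prediction. All function names are hypothetical.

```python
DR_22 = "auuucuacug" "aaaguguaga" "ua"            # SEQ ID NO.12, 22 nt
SPACER_CDKN2A = "gccccaauaa" "uccccacaug" "uca"   # SEQ ID NO.20, 23 nt

PAIR = {"a": "u", "u": "a", "g": "c", "c": "g"}


def revcomp_rna(seq: str) -> str:
    """Reverse complement of an RNA sequence (Watson-Crick pairs only)."""
    return "".join(PAIR[b] for b in reversed(seq.lower()))


def has_stem_loop(dr: str, stem: int = 5, min_loop: int = 3) -> bool:
    """Heuristic: is there a `stem`-bp perfect hairpin with a loop >= min_loop?"""
    dr = dr.lower()
    for i in range(len(dr) - 2 * stem - min_loop + 1):
        right_arm = revcomp_rna(dr[i:i + stem])
        if dr.find(right_arm, i + stem + min_loop) != -1:
            return True
    return False


def build_crrna(dr: str, spacer: str,
                dr_range=(12, 37), spacer_range=(10, 30)) -> str:
    """Assemble 5'-direct repeat-spacer-3' after checking the stated limits."""
    if not dr_range[0] <= len(dr) <= dr_range[1]:
        raise ValueError("direct repeat length outside the allowed range")
    if not spacer_range[0] <= len(spacer) <= spacer_range[1]:
        raise ValueError("spacer length outside the allowed range")
    if not has_stem_loop(dr):
        raise ValueError("no stem loop found in the direct repeat")
    return dr.lower() + spacer.lower()


if __name__ == "__main__":
    crrna = build_crrna(DR_22, SPACER_CDKN2A)
    print(len(crrna), crrna)
```

In SEQ ID NO.12 the heuristic finds the arms `ucuac` and `guaga` separated by a 7 nt loop, which is consistent with the stem-loop requirement in principle 4).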
Example 8
In this example, amplicon experiments are used to determine the optimal crRNA direct repeat (DR) length for editing a target DNA sequence in eukaryotic cells with the V-type CRISPR/Cas12a gene editing system of the present invention; a histogram of the editing efficiency quantified from amplicons is shown in FIG. 8.
(1) Materials: the CRISPR/Cas gene editing system of example 1 and the PAM recognized by LtCpf1 obtained in example 4;
(2) Verification method: this example uses amplicon experiments to compare the effect of direct repeats (DR) of different lengths on editing the CDKN2A gene in eukaryotic cells.
The specific operation is as follows:
(a) synthesizing a human codon-optimized LtCpf1 coding sequence, cloning it into the PX330 eukaryotic vector, and constructing the PX330-LtCpf1 plasmid;
(b) selecting the human gene locus CDKN2A for crRNA design, with the following design principles:
1) the crRNA comprises a spacer and a direct repeat (DR) in the format 5'-direct repeat-spacer-3', where the direct repeat binds the LtCpf1 protein;
2) the spacer of the crRNA is 23 bases long;
3) the direct repeat of the crRNA is 14, 16, 18, 20, 22, 24, 26 or 28 bases long;
4) the direct repeat of the crRNA should contain a stem-loop structure;
5) the 8 different crRNAs are each inserted into the target plasmid by the Gibson method, and the strong human U6 promoter is added upstream of the crRNA to construct 8 PX330-LtCpf1-crRNA plasmids;
(c) electroporating 2.5 µg of each constructed PX330-LtCpf1-crRNA plasmid targeting the human CDKN2A locus together with 1.5 µL of ODN into healthy HEK293T cells, and collecting all cells after 72 h to extract DNA;
(d) performing ODN-PCR with a pair of primers located near the target locus and on the ODN sequence, and observing a band of the correct size by agarose gel electrophoresis as a preliminary indication that a targeted cleavage event occurred;
(e) designing suitable amplification primers for the target gene sequence, constructing high-throughput amplicon libraries from the DNA, quantifying the Indel rate of the target region, and comparing the gene editing efficiency for DR lengths of 14, 16, 18, 20, 22, 24, 26 and 28 bases;
the histogram of the amplicon-quantified gene editing efficiency is shown in FIG. 8. The results show that the crRNA cannot effectively guide LtCpf1 to perform targeted genome editing when the DR is shorter than 20 nt, and can do so effectively when the DR is 22 to 28 nt long.
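A direct repeat length series like the one compared above can be generated programmatically. The sketch below assumes that shorter variants are obtained by trimming the 5' end of the 37 nt direct repeat (SEQ ID NO.5) while keeping its 3' end. The sequence listing supports this for the 22 to 28 nt variants (SEQ ID NO.6 to NO.12 are 3'-terminal portions of SEQ ID NO.5), but for the 14 to 20 nt variants it is an assumption, since their sequences are not listed. `truncate_dr` is a hypothetical helper name.

```python
DR_FULL = "cuuugaaaga" "auauaauuuc" "uacugaaagu" "guagaua"  # SEQ ID NO.5, 37 nt


def truncate_dr(dr: str, length: int) -> str:
    """Keep the 3'-terminal `length` nucleotides of a direct repeat."""
    if not 0 < length <= len(dr):
        raise ValueError("length must be between 1 and the DR length")
    return dr[-length:]


# DR lengths compared in this example: 14, 16, ..., 28 nt.
DR_SERIES = {n: truncate_dr(DR_FULL, n) for n in range(14, 29, 2)}

if __name__ == "__main__":
    for n, seq in DR_SERIES.items():
        print(f"{n:2d} nt  {seq}")
```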
Example 9
In this example, amplicon experiments are used to determine the optimal recognition sequence (spacer) length for editing a target DNA sequence in eukaryotic cells with the V-type CRISPR/Cas12a gene editing system of the present invention; a histogram of the editing efficiency quantified from amplicons is shown in FIG. 9.
(1) Materials: the CRISPR/Cas gene editing system of example 1 and the PAM recognized by LtCpf1 obtained in example 4;
(2) Verification method: in this example, amplicon experiments are used to determine the optimal recognition sequence (spacer) length for a target DNA sequence in eukaryotic cells with the V-type CRISPR/Cas12a gene editing system;
the specific operation is as follows:
(a) synthesizing a human codon-optimized LtCpf1 coding sequence, cloning it into the PX330 eukaryotic vector, and constructing the PX330-LtCpf1 plasmid;
(b) selecting the human gene locus CDKN2A for crRNA design, with the following design principles:
1) the crRNA comprises a spacer and a direct repeat (DR) in the format 5'-direct repeat-spacer-3', where the direct repeat binds the LtCpf1 protein;
2) the spacer of the crRNA is 15, 17, 19, 21, 23, 25, 27 or 29 bases long;
3) the direct repeat of the crRNA is 22 bases long;
4) the direct repeat of the crRNA should contain a stem-loop structure;
5) the 8 different crRNAs are each inserted into the target plasmid by the Gibson method, and the strong human U6 promoter is added upstream of the crRNA to construct 8 different PX330-LtCpf1-crRNA plasmids;
(c) electroporating 2.5 µg of each constructed PX330-LtCpf1-crRNA plasmid targeting the human CDKN2A locus together with 1.5 µL of ODN into healthy HEK293T cells, and collecting all cells after 72 h to extract DNA;
(d) performing ODN-PCR with a pair of primers located near the target locus and on the ODN sequence, and observing a band of the correct size by agarose gel electrophoresis as a preliminary indication that a targeted cleavage event occurred;
(e) designing suitable amplification primers for the target gene sequence, constructing high-throughput amplicon libraries from the DNA, quantifying the Indel rate of the target region, and comparing the gene editing efficiency for spacer lengths of 15, 17, 19, 21, 23, 25, 27 and 29 bases;
the histogram of the amplicon-quantified gene editing efficiency is shown in FIG. 9. The results show that the crRNA can effectively guide LtCpf1 to perform targeted genome editing when the spacer is 15 to 29 nt long.
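The Indel rate quantified in step (e) is, in its simplest form, the fraction of amplicon reads at the target region that carry an insertion or deletion. The source does not describe the analysis pipeline, so the sketch below only illustrates this basic definition; the optional background subtraction from an untreated control and the read counts in the usage example are assumptions made for illustration, not data from the experiments.

```python
def indel_rate(total_reads: int, indel_reads: int) -> float:
    """Percentage of reads at the target region that contain an indel."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= indel_reads <= total_reads:
        raise ValueError("indel_reads must be between 0 and total_reads")
    return 100.0 * indel_reads / total_reads


def corrected_indel_rate(treated: float, control: float = 0.0) -> float:
    """Subtract the background rate of an untreated control, floored at zero."""
    return max(0.0, treated - control)


if __name__ == "__main__":
    # Hypothetical read counts, for illustration only.
    treated = indel_rate(total_reads=20000, indel_reads=1500)
    control = indel_rate(total_reads=20000, indel_reads=20)
    print(f"{corrected_indel_rate(treated, control):.2f} %")
```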
Example 10
In this example, an in vitro collateral-cleavage probe experiment is used to verify that the V-type CRISPR/Cas12a gene editing system described in example 1 of the present invention can rapidly detect COVID-19 nucleic acid in vitro.
(1) Materials: the PAM preliminarily identified for LtCpf1 in example 4; the optimal LtCpf1 crRNA structure obtained in examples 8 and 9; LtCpf1 protein purified in vitro.
(2) Verification method: total RNA from a COVID-19 gene expression plasmid vector is collected, the N gene of the novel coronavirus is amplified isothermally with an RT-fluorescent nucleic acid amplification reagent (RAA method), a suitable target is selected according to the PAM, and an in vitro single-stranded nucleic acid fluorescent probe cleavage experiment is used to assess the ability of LtCpf1 to rapidly detect COVID-19 nucleic acid in vitro and to compare it with the ability of LbCpf1 reported in reference (2);
the specific operation is as follows:
(a) extracting the total RNA expressed from the COVID-19 plasmid vector in Escherichia coli, designing suitable isothermal amplification primers, and obtaining the N gene fragment of COVID-19 with the RT-fluorescent nucleic acid amplification reagent (RAA method);
(b) based on the PAM (TTNA) recognized by LtCpf1, selecting a target that reference (2) verified to give a good detection effect, and designing and transcribing in vitro an LtCpf1 crRNA against the N gene according to the crRNA design principles;
(c) synthesizing a FAM-ssDNA-TAMRA fluorescent probe;
(d) mixing Cas12a, crRNA and template at a molar ratio of 10:10:1, incubating at 37 ℃ for 15 minutes, adding the synthesized single-stranded DNA fluorescent probe, and incubating at 37 ℃ for 1 hour in a fluorescence quantitative PCR detection system. When double-stranded DNA of the novel coronavirus N gene is present in the sample, the Cas12a protein is activated to cleave the ssDNA in the FAM-ssDNA-TAMRA fluorescent probe, so that the fluorophore is no longer suppressed by the quencher; the fluorescence intensity of the reaction system is recorded as a function of time.
FIG. 10 shows the results of rapid nucleic acid detection of the COVID-19 gene by the V-type CRISPR/Cas12a gene editing system. Panel A: fluorescence-time curves recorded for different input copy numbers of the N gene PCR product (6.67 × 10^10, 6.67 × 10^9, 6.67 × 10^8 and 6.67 × 10^7, respectively). The results show that LtCpf1 still produces a rapid and sensitive nucleic acid detection effect when the N gene copy number is as low as 6.67 × 10^7. Panel B: histogram of the fluorescence intensity recorded 60 min after the start of the reaction for the 4 experimental groups with different N gene copy numbers. The results show that all 4 N gene copy numbers effectively activate the non-specific ssDNA fluorescent probe cleavage activity of the LtCpf1 protein.
The reaction conditions of this example are the same as those of reference (2), which reports that the lower limit of effective N gene detection for the existing V-type CRISPR/Cas12a gene editing system (LbCpf1 protein) is about 10^10 copies; the V-type CRISPR/Cas12a gene editing system of the present invention therefore has higher sensitivity.
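The copy numbers quoted above form a ten-fold dilution series, and the sensitivity comparison reduces to a ratio of detection limits. The following sketch reproduces that arithmetic using only the figures stated in this example (6.67 × 10^10 down to 6.67 × 10^7 copies for LtCpf1, and about 10^10 copies for LbCpf1 per reference (2)). Because the LbCpf1 value is approximate, the resulting fold difference is only a rough estimate; the function names are hypothetical.

```python
def dilution_series(start: float, steps: int, factor: float = 10.0) -> list[float]:
    """Copy numbers of a serial dilution starting at `start`."""
    return [start / factor ** i for i in range(steps)]


def fold_difference(reference_limit: float, test_limit: float) -> float:
    """How many times lower the test detection limit is than the reference."""
    if reference_limit <= 0 or test_limit <= 0:
        raise ValueError("detection limits must be positive")
    return reference_limit / test_limit


if __name__ == "__main__":
    series = dilution_series(6.67e10, 4)
    print([f"{c:.2e}" for c in series])
    # Lowest input tested for LtCpf1 versus the approximate LbCpf1 limit.
    print(f"about {fold_difference(1e10, series[-1]):.0f}-fold")
```

Under these figures the lowest LtCpf1 input tested is roughly 150 times lower than the approximate LbCpf1 limit. Note that 6.67 × 10^7 is the lowest input tested rather than a measured detection limit.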
Reference (2) is: Wang, X., Zhong, M., Liu, Y., Ma, P., Dang, L., Meng, Q., Wan, W., Ma, X., Liu, J., Yang, G. et al. (2020) Rapid and sensitive detection of COVID-19 using CRISPR/Cas12a-based detection with naked eye readout, CRISPR/Cas12a-NER. Sci. Bull., 65, 1436-1439.
The above embodiments are merely illustrative of the technical ideas and features of the present invention, and the purpose thereof is to enable those skilled in the art to understand the contents of the present invention and implement the present invention, and not to limit the protection scope of the present invention. All equivalent changes and modifications made according to the spirit of the present invention should be covered within the protection scope of the present invention.
Sequence listing
<110> Zhuhai Shutong Medical Technology Co., Ltd.
<120> Cpf1 protein and gene editing system
<150> 2020111230731
<151> 2020-10-20
<160> 22
<170> SIPOSequenceListing 1.0
<210> 1
<211> 1296
<212> PRT
<213> Artificial sequence (rengongxulie)
<400> 1
Met Lys Ser Ile Tyr Glu Asp Phe Ile Gly Leu Glu Ser Lys Asn Leu
1 5 10 15
Thr Leu Arg Phe Ala Leu Lys Pro Glu Pro Lys Thr Glu Glu Asn Leu
20 25 30
Lys Gln Tyr Trp Asp Lys Leu Arg Asp Glu Glu Arg Ala Lys Ala Tyr
35 40 45
Pro Ile Val Lys Lys Ile Leu Asp Arg Glu Tyr Gln Arg Leu Ile Ser
50 55 60
Glu Gly Leu Lys Ser Leu Glu Asn Gln Asn Ala Leu Asp Trp Thr Glu
65 70 75 80
Leu Ala Glu Tyr Ile Arg Thr Ser Ser Leu Asn Lys Lys Lys Asn Glu
85 90 95
Glu Lys Arg Leu Arg Lys Leu Ile Ala Gln Ser Leu Lys Ala His Pro
100 105 110
Leu Val Asp Lys Leu Lys Val Lys Asn Ala Phe Gly Lys Asn Gly Tyr
115 120 125
Leu Glu Thr Leu Pro Leu Gly Lys Glu Glu Lys Glu Ala Val Lys Val
130 135 140
Phe Ala Gly Phe Gly Gly Phe Phe Asn Asn Tyr Asn Lys Asn Arg Glu
145 150 155 160
Asn Tyr Phe Ser Thr Glu Glu Lys Ser Thr Ala Ile Ala Asn Arg Ile
165 170 175
Val Asn Glu Asn Phe Ser Lys His Phe Ser Asn Val Glu Ile Val Thr
180 185 190
Lys Ile Gln Lys Glu Val Pro Glu Leu Ile Gln Ile Val Glu Ala Gln
195 200 205
Phe Lys Gly Tyr Asp Ala Ile Phe Thr Val Asn Gly Tyr Asn Met Ala
210 215 220
Leu Ser Gln Ala Gly Ile Asp Thr Tyr Asn Glu Met Val Ala Ile Trp
225 230 235 240
Asn Lys Glu Ala Asn Leu Tyr Ala Gln Lys Ala Gly Lys Leu Pro Asp
245 250 255
Gly His Pro Leu Lys Lys Lys Arg Asn Tyr Leu Leu Ser Ala Leu Phe
260 265 270
Lys Gln Ile Gly Ser Glu Lys Glu His Leu Ile Gln Ile Asp Arg Phe
275 280 285
Asp Gly Asp Glu Glu Val Ile Glu Ala Leu Thr Gly Val Lys Lys Met
290 295 300
Leu Gln Glu Ala Asp Val Phe Glu Lys Leu Asn Met Leu Val Glu Asp
305 310 315 320
Met Glu Asn Trp Asp Tyr Ser Lys Ile Tyr Leu Ser Ala Gln Ser Leu
325 330 335
Ser Asn Val Ser Val Phe Leu Asn Asn Leu Tyr Glu Asp Glu Arg Glu
340 345 350
Asn Ser Trp Asn Tyr Leu Asp Asn Val Leu Arg Glu Lys Trp Gln Ile
355 360 365
Glu Leu Gln Gly Lys Lys Lys Gly Thr Asp Leu Glu Glu Ala Ile Arg
370 375 380
Lys Lys Lys Lys Ser Phe Tyr Ser Ile Ala Glu Leu Gln Glu Thr Val
385 390 395 400
Asn Ala Leu Glu Glu Thr Asp Lys Cys Tyr Ser Val Ser Lys Trp Leu
405 410 415
Leu Glu Ala Leu Lys Ser Glu Thr Val Ile Glu Glu Lys Glu Lys Asp
420 425 430
Ala Glu Asp Phe Cys Thr Lys Trp Lys Thr Glu Arg Asn Pro Leu Lys
435 440 445
Glu Thr Asp Ile Thr Ala Leu Lys Glu Tyr Leu Glu Gln Trp Ile Leu
450 455 460
Leu Ala Arg Tyr Cys Lys Ser Phe Tyr Ala Asn Gly Ile Glu Lys Lys
465 470 475 480
Glu Arg Asp Glu Ala Phe Tyr His Ile Leu Glu Asp Val Leu Tyr Val
485 490 495
Leu Lys Glu Val Ile Tyr Phe Tyr Asn Lys Val Arg Asn Tyr Val Thr
500 505 510
Lys Lys Pro Tyr Ser Leu Glu Lys Ile His Leu Lys Phe Gly His Val
515 520 525
Thr Leu Gly Asn Gly Trp His Ile Asn Gln Glu Lys Asp Asn Gly Thr
530 535 540
Thr Leu Leu Arg Lys Asp Gly Lys Tyr Tyr Leu Ala Ile Thr Asn Ser
545 550 555 560
Leu Asn Lys Lys Ile Cys Ile Pro Ser Gln Ile Glu Gly Thr Gly Asn
565 570 575
Asp Tyr Glu Lys Met Val Leu Asn Ala Phe Lys Lys Asp Lys Ile Tyr
580 585 590
Met Leu Ile Pro Lys Cys Thr Thr Glu Arg Lys Asn Val Glu Ser Cys
595 600 605
Phe Glu Ser Lys Glu Ser Ala Gln Tyr Phe Ile Ile Asp Thr Pro Lys
610 615 620
Phe Val Lys Pro Phe Lys Val Leu Arg Glu Glu Tyr Glu Leu Asn Lys
625 630 635 640
Ile Thr Tyr Asp Gly Val Lys Lys Trp Gln Ser Asp Tyr Leu Lys Lys
645 650 655
Thr Lys Asp Glu Lys Gly Tyr Lys Glu Ala Val Ala Lys Trp Ile Arg
660 665 670
Phe Cys Met Arg Phe Leu Gln Ser Tyr Lys Ser Thr Ala Ile Tyr Asp
675 680 685
Tyr Ser Thr Leu Gln Gln Pro Glu Glu Tyr Glu Thr Val Asp Ser Phe
690 695 700
Tyr Gln Asp Val Gly Lys Ile Thr Tyr Glu Cys His Phe Glu Tyr Val
705 710 715 720
Pro Thr Ser Glu Ile Glu Arg Leu Glu Asn Glu Gly Ser Ile Phe Leu
725 730 735
Phe Gln Ile Tyr Asn Lys Asp Phe Ser Glu Asn Arg Arg Pro Asp Ser
740 745 750
Lys Lys Asn Leu His Thr Leu Tyr Trp Glu Ala Leu Phe Ser Glu Glu
755 760 765
Asn Gln Lys Ala Lys Val Ile Gln Leu Ser Gly Asn Ala Glu Val Phe
770 775 780
Arg Arg Glu Lys Ser Ile Glu Asn Pro Ile Val His Lys Ala Gly Glu
785 790 795 800
Val Leu Val Asn Lys Arg Thr Lys Lys Gly Glu Pro Ile Pro Asp Asp
805 810 815
Ile Tyr Arg Asp Leu Cys Asn Tyr Phe Asn Gly Lys Asp Val Pro Ser
820 825 830
Glu Lys Glu Asp Tyr Lys Glu Tyr Leu Asp Lys Val Tyr Thr Ser Thr
835 840 845
Lys Lys Tyr Asp Ile Thr Lys Asp Lys Arg Phe Thr Glu Asn Lys Tyr
850 855 860
Glu Phe His Val Pro Ile Thr Leu Asn His Gln Ala Glu Gly Val Lys
865 870 875 880
Tyr Leu Asp Gln Lys Ile Leu Arg Met Leu Arg Asp Asn Pro Asp Val
885 890 895
Asn Ile Ile Gly Leu Asp Arg Gly Glu Arg Asn Leu Ile Ser Tyr Val
900 905 910
Val Leu Asn Gln Glu Gly Lys Ile Val Asn Asn Gln Gln Gly Ser Phe
915 920 925
Asn Ile Val Gly Lys Met Asp Tyr Gln Lys Lys Leu Tyr Gln Lys Glu
930 935 940
Lys Asn Arg Asp Lys Glu Arg Lys Thr Trp Lys Asn Ile Glu Thr Ile
945 950 955 960
Lys Asp Leu Lys Glu Gly Tyr Ile Ser Gln Val Val His Glu Leu Thr
965 970 975
Asp Met Ala Ile Arg Asn Asn Ala Ile Ile Val Met Glu Asp Leu Asn
980 985 990
Phe Gly Phe Lys Arg Val Arg Thr Lys Val Glu Arg Gln Val Tyr Gln
995 1000 1005
Lys Phe Glu Leu Ala Leu Leu Lys Lys Leu His Tyr Leu Val Thr Asp
1010 1015 1020
Lys Thr Glu Gly Lys Ala Met Leu Lys Pro Gly Gly Val Leu Gln Gly
1025 1030 1035 1040
Tyr Gln Leu Ala Arg Glu Val Lys Thr Leu Lys Glu Ile Gly Lys Gln
1045 1050 1055
Cys Gly Cys Val Phe Tyr Val Pro Pro Gly Tyr Thr Ser Lys Ile Asp
1060 1065 1070
Pro Thr Thr Gly Phe Val Asp Val Phe Asn Met Ser Gly Val Thr Asn
1075 1080 1085
Arg Glu Lys Lys Lys Ala Phe Phe Glu Lys Phe Asp Asn Met Phe Tyr
1090 1095 1100
Asp Glu Lys Arg Asp Met Phe Gly Phe Ser Phe Asn Tyr Glu Lys Phe
1105 1110 1115 1120
Ala Thr Tyr Gln Ser Ser His Arg Asn Asp Trp Ile Val Tyr Ser Asn
1125 1130 1135
Gly Ser Lys Tyr Val Trp Asn Ser Leu Asn Lys Thr Asn Glu Leu Ile
1140 1145 1150
Asp Val Thr Lys Glu Leu Lys Met Leu Phe Glu Lys Tyr Ala Ile Asn
1155 1160 1165
Tyr Arg Asn Glu Ala Leu Phe Glu Gln Ile Ile Ser Lys Asp Thr Asp
1170 1175 1180
Lys Asn Asn Ala Asp Phe Trp Asn Lys Leu Phe Trp Tyr Phe Arg Val
1185 1190 1195 1200
Leu Leu Arg Ile Arg Asn Ser Ser Gly Glu Leu Asp Gln Ile Ile Ser
1205 1210 1215
Pro Val Leu Asn Gln Asn Gly Glu Phe Phe Glu Thr Pro Lys Lys Ile
1220 1225 1230
Thr Glu Lys Ser Tyr Leu Ser Asp Tyr Pro Met Asp Ala Asp Thr Asn
1235 1240 1245
Gly Ala Tyr His Ile Ala Leu Lys Gly Leu Tyr Leu Ile Gln Glu Lys
1250 1255 1260
Ile Ala Asp Glu Ser Val Asp Leu Asp Asp Lys Leu Pro Asn Asp Phe
1265 1270 1275 1280
Tyr Lys Ile Ser Asn Ala Glu Trp Phe Arg Phe Arg Gln Lys Glu Lys
1285 1290 1295
<210> 2
<211> 333
<212> PRT
<213> Artificial sequence (rengongxulie)
<400> 2
Met Asn Gln Leu Val Thr Gly Gly Ile Ser Val Leu Asn Lys Gly Glu
1 5 10 15
Phe Ile Lys Lys Gln Ile Leu Val Tyr Glu Pro Phe Leu Gly Asp Lys
20 25 30
Met Ser Tyr Lys Asn Asp Asn Met Val Ile Arg Asp Gly Asn Gly Lys
35 40 45
Ile Lys Tyr Gln Val Ser Cys Tyr Arg Ile Phe Met Val Leu Ile Val
50 55 60
Gly Asp Val Thr Ile Thr Thr Gly Ile Leu Arg Arg Gln Gln Lys Phe
65 70 75 80
Gly Phe Arg Leu Cys Phe Leu Thr Leu Gly Leu Lys Val Tyr Ser Val
85 90 95
Ile Gly Pro Gln Leu Gln Gly Asn Thr Leu Leu His Cys Lys Gln Tyr
100 105 110
Ala Tyr Asp Glu Leu Thr Val Gly Lys Ser Ile Ile Ile Asn Lys Ile
115 120 125
Leu Asn Gln Arg Ala Ala Leu Thr Arg Leu Arg Ser Lys Thr Glu Asp
130 135 140
Val Trp Glu Cys Ile Ser Leu Leu Glu Gln Tyr Ser Lys Arg Leu Gln
145 150 155 160
Asn Asp Ser Leu Asn Leu Gln Glu Ile Ile Gly Ile Glu Gly Met Ala
165 170 175
Ser Lys Ile Tyr Phe Pro Arg Ile Phe Ser Asn Thr Gln Trp Ile Gly
180 185 190
Arg Lys Pro Arg Ile Lys Phe Asp Tyr Ile Asn Thr Leu Leu Asp Ile
195 200 205
Gly Tyr Asn Ala Leu Phe Asn Phe Ile Asp Ala Ile Leu Gln Val Phe
210 215 220
Gly Phe Asp Val Tyr Tyr Gly Val Leu His Thr Cys Phe Tyr Met Arg
225 230 235 240
Lys Ser Leu Val Cys Asp Ile Met Glu Pro Met Arg Pro Ile Val Asp
245 250 255
Trp Gln Ile Arg Lys Ser Ile Asn Leu Lys Gln Phe Lys Gln Asp Asp
260 265 270
Phe Val Gln Val Gly Lys Gln Tyr Gln Leu Lys Tyr Lys Lys Ser Thr
275 280 285
Gln Tyr Leu Gln Val Phe Leu Glu Ala Ile Leu Asn Tyr Lys Glu Glu
290 295 300
Ile Phe Val Tyr Val Arg Asp Tyr Tyr Arg Ser Phe Met Lys Asn Asn
305 310 315 320
Pro Ile Glu Ala Tyr Pro Val Phe Lys Leu Glu Glu Leu
325 330
<210> 3
<211> 97
<212> PRT
<213> Artificial sequence (rengongxulie)
<400> 3
Met Ile Ile Val Ser Tyr Asp Ile Ser Asp Asp Lys Leu Arg Thr Lys
1 5 10 15
Phe Ser Lys Tyr Leu Ser Arg Phe Gly His Arg Ile Gln Tyr Ser Met
20 25 30
Phe Glu Ile Asp Asn Ser Glu Arg Ile Leu Asn Asn Ile Ile Cys Asp
35 40 45
Ile His Asn Gln Phe Glu Lys Lys Phe Ser Gln Glu Asp Ser Ile Tyr
50 55 60
Ile Phe Asn Leu Ser Lys Trp Cys Lys Ile Glu Arg Phe Gly Tyr Ala
65 70 75 80
Lys Asn Glu Thr Asn Asp Leu Leu Val Leu Thr Gly Cys Lys Pro Arg
85 90 95
Pro
<210> 4
<211> 189
<212> PRT
<213> Artificial sequence (rengongxulie)
<400> 4
Met Glu Asp Ile Ile Leu Ile Thr Glu Leu Asn Asp Phe Ile Phe Cys
1 5 10 15
Pro Ala Ser Ile Tyr Phe His His Leu Tyr Gly Ser Arg Asp Pro Val
20 25 30
Leu Phe Gln Ser Glu Ala Gln Ile Lys Gly Thr Lys Ala His Glu Ala
35 40 45
Val Asp Ser Gly Cys Tyr Ser Lys Lys Ser Ser Ile Leu Gln Ser Leu
50 55 60
Asp Val Tyr Cys Glu Lys Tyr Arg Leu Leu Gly Lys Ile Asp Ile Tyr
65 70 75 80
Asp Gly Lys Lys Lys Ile Leu Arg Glu Arg Lys Arg Gln Ile Lys Gln
85 90 95
Val Tyr Asp Gly Tyr Ile Phe Gln Leu Tyr Gly Gln Tyr Phe Ser Leu
100 105 110
Ile Glu Met Gly Tyr Glu Val Asp Lys Met Glu Leu Tyr Ser Met Ile
115 120 125
Asp Asn Lys Lys Tyr Pro Ile Glu Leu Pro His Asn Asn Ile Asn Met
130 135 140
Leu Met Lys Phe Glu Met Leu Ile His Glu Met Arg Glu Phe Arg Leu
145 150 155 160
Asp Asp Arg Phe Ile Gln Glu Asn Ala Asn Lys Cys Lys Asn Cys Ile
165 170 175
Tyr Glu Pro Ala Cys Asp Arg Gly Asn Ile Gly Ala Lys
180 185
<210> 5
<211> 37
<212> RNA
<213> Artificial sequence (rengongxulie)
<400> 5
cuuugaaaga auauaauuuc uacugaaagu guagaua 37
<210> 6
<211> 28
<212> RNA
<213> Artificial sequence (rengongxulie)
<400> 6
aauauaauuu cuacugaaag uguagaua 28
<210> 7
<211> 27
<212> RNA
<213> Artificial sequence (rengongxulie)
<400> 7
auauaauuuc uacugaaagu guagaua 27
<210> 8
<211> 26
<212> RNA
<213> Artificial sequence (rengongxulie)
<400> 8
uauaauuucu acugaaagug uagaua 26
<210> 9
<211> 25
<212> RNA
<213> Artificial sequence (rengongxulie)
<400> 9
auaauuucua cugaaagugu agaua 25
<210> 10
<211> 24
<212> RNA
<213> Artificial sequence (rengongxulie)
<400> 10
uaauuucuac ugaaagugua gaua 24
<210> 11
<211> 23
<212> RNA
<213> Artificial sequence (rengongxulie)
<400> 11
aauuucuacu gaaaguguag aua 23
<210> 12
<211> 22
<212> RNA
<213> Artificial sequence (rengongxulie)
<400> 12
auuucuacug aaaguguaga ua 22
<210> 13
<211> 25
<212> RNA
<213> Artificial sequence (rengongxulie)
<400> 13
augcaaacuu uaccgaugau gaaga 25
<210> 14
<211> 28
<212> RNA
<213> Artificial sequence (rengongxulie)
<400> 14
uacgagguug ugaucgaagu ccauaacc 28
<210> 15
<211> 25
<212> RNA
<213> Artificial sequence (rengongxulie)
<400> 15
uuccuaaaau uacaaauaaa uccug 25
<210> 16
<211> 26
<212> RNA
<213> Artificial sequence (rengongxulie)
<400> 16
gaugcaguuu ucagauuuug uuuuug 26
<210> 17
<211> 26
<212> RNA
<213> Artificial sequence (rengongxulie)
<400> 17
agaaaaguca agauauucaa acuaaa 26
<210> 18
<211> 30
<212> RNA
<213> Artificial sequence (rengongxulie)
<400> 18
auggcgaaua cuuuuaaagu cauguccaug 30
<210> 19
<211> 23
<212> RNA
<213> Artificial sequence (rengongxulie)
<400> 19
agguaaaacu ccaaucuggc uug 23
<210> 20
<211> 23
<212> RNA
<213> Artificial sequence (rengongxulie)
<400> 20
gccccaauaa uccccacaug uca 23
<210> 21
<211> 23
<212> RNA
<213> Artificial sequence (rengongxulie)
<400> 21
ucccugucuu cugcaaaggu gag 23
<210> 22
<211> 23
<212> RNA
<213> Artificial sequence (rengongxulie)
<400> 22
uucugugugg guucaaacac auu 23

Claims (10)

1. A V-type CRISPR/Cas12a gene editing system, characterized in that: it comprises a Cpf1 protein or one or more polynucleotides encoding said Cpf1 protein, and a CRISPR RNA or one or more polynucleotides encoding said CRISPR RNA; wherein the Cpf1 protein is a DNA endonuclease, and the Cpf1 protein cleaves double-stranded DNA complementary to the CRISPR RNA downstream of the PAM sequence through different nuclease domains; the different nuclease domains are HNH-like nuclease domains or RuvC-like nuclease domains; the PAM sequence is TTNA, wherein N is A, T, C or G; the amino acid sequence of the Cpf1 protein is shown as SEQ ID NO.1;
the CRISPR RNA is characterized in that:
1) the CRISPR RNA has the sequence format: 5'-direct repeat binding to the Cpf1 protein-spacer complementary to the target sequence-3';
2) the spacer sequence of the CRISPR RNA is 15-29 bases in length;
3) the direct repeat sequence of the CRISPR RNA is 22-37 bases in length;
4) the CRISPR RNA contains a stem-loop structure.
2. The type V CRISPR/Cas12a gene editing system of claim 1, characterized in that: the direct repeat sequence of CRISPR RNA is shown in SEQ ID NO: 5.
3. The type V CRISPR/Cas12a gene editing system of claim 2, characterized in that: the direct repetitive sequence of CRISPR RNA is shown in any sequence of SEQ ID NO 6-12.
4. The type V CRISPR/Cas12a gene editing system of claim 1, characterized in that: the gene editing system further comprises an accessory protein or one or more polynucleotides encoding the accessory protein; the auxiliary protein comprises one or more of Cas1 protein, Cas2 protein and Cas4 protein,
the Cas1 protein has an amino acid sequence shown as SEQ ID NO. 2;
the Cas2 protein has an amino acid sequence shown as SEQ ID NO. 3;
the Cas4 protein has an amino acid sequence shown as SEQ ID NO. 4.
5. A Cpf1 protein as described in the V-type CRISPR/Cas12a gene editing system of any of claims 1 to 4.
6. A polynucleotide encoding the Cpf1 protein of claim 5.
7. The polynucleotide of claim 6, wherein: the polynucleotides are codon-optimized for expression in a cell of interest.
8. A vector comprising the polynucleotide of claim 6 or 7.
9. Use of the V-type CRISPR/Cas12a gene editing system of any one of claims 1 to 4, or the Cpf1 protein of claim 5, or the polynucleotide of claim 6 or 7, or the vector of claim 8 for editing or detecting prokaryotic or eukaryotic genes.
10. Use according to claim 9, characterized in that: the application is binding, targeted cleavage or non-targeted cleavage of DNA.
CN202110325073.8A 2020-10-20 2021-03-26 Cpf1 protein and gene editing system Active CN113234701B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011123073 2020-10-20
CN2020111230731 2020-10-20

Publications (2)

Publication Number Publication Date
CN113234701A CN113234701A (en) 2021-08-10
CN113234701B true CN113234701B (en) 2022-08-16

Family

ID=77130541

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110325073.8A Active CN113234701B (en) 2020-10-20 2021-03-26 Cpf1 protein and gene editing system

Country Status (1)

Country Link
CN (1) CN113234701B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115998341B (en) * 2023-01-18 2024-04-12 珠海舒桐医疗科技有限公司 HPV rapid detection system for non-disease treatment and diagnosis
CN116376874A (en) * 2023-03-24 2023-07-04 尧唐(上海)生物科技有限公司 Cas protein, gene editing system and application thereof
CN116751763B (en) * 2023-05-08 2024-02-13 珠海舒桐医疗科技有限公司 Cpf1 protein, V-type gene editing system and application
CN117757774B (en) * 2023-05-08 2024-08-06 珠海舒桐医疗科技有限公司 Cas9 protein, type II CRISPR/Cas9 gene editing system and application

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107083392A (en) * 2017-06-13 2017-08-22 中国医学科学院病原生物学研究所 A kind of CRISPR/Cpf1 gene editings system and its application in mycobacteria
CN108486146A (en) * 2018-03-16 2018-09-04 中国农业科学院作物科学研究所 LbCpf1-RR mutant is used for application of the CRISPR/Cpf1 systems in plant gene editor
WO2019070762A1 (en) * 2017-10-02 2019-04-11 Genedit Inc. Modified cpf1 guide rna
CN110878290A (en) * 2019-11-15 2020-03-13 武汉大学 II type V type CRISPR protein BfCas12a and application thereof in gene editing
CN111235232A (en) * 2020-01-19 2020-06-05 华中农业大学 Visual rapid nucleic acid detection method based on CRISPR-Cas12a system and application
CN112331264A (en) * 2020-09-11 2021-02-05 中山大学附属第一医院 Construction method of homologous type 2 CRISPR/Cas gene editing system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107312761B (en) * 2017-07-18 2019-07-05 江苏溥博生物科技有限公司 A kind of AsCpf1 mutant protein, encoding gene, recombinant expression carrier and the preparation method and application thereof

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CRISPR-Cas12a: Functional overview and applications;Bijoya Paul et al.;《Biomedical Journal》;20200205;第43卷;第8-17页 *
Rapid and sensitive detection of COVID-19 using CRISPR/Cas12a-based detection with naked eye readout, CRISPR/Cas12a-NER;Xinjie Wang et al.;《Science Bulletin》;20200505;第65卷;第1436–1439页 *
基因编辑新技术最新进展;张梦娜等;《中国细胞生物学学报》;20181231;第40卷(第12期);第2098–2107页 *

Also Published As

Publication number Publication date
CN113234701A (en) 2021-08-10


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant