CN110835630A - Efficient sgRNA and application thereof in gene editing

Info

Publication number
CN110835630A
Authority
CN
China
Prior art keywords
sequence
sgrna
protein
sakkhn
rna
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911200779.0A
Other languages
Chinese (zh)
Other versions
CN110835630B (en)
Inventor
张成伟
徐雯
刘亚
赵思
杨进孝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Academy of Agriculture and Forestry Sciences
Original Assignee
Beijing Academy of Agriculture and Forestry Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Academy of Agriculture and Forestry Sciences
Priority to CN201911200779.0A
Publication of CN110835630A
Application granted
Publication of CN110835630B
Legal status: Active

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8216 Methods for controlling, regulating or enhancing expression of transgenes in plant cells
    • C12N15/8218 Antisense, co-suppression, viral induced gene silencing [VIGS], post-transcriptional induced gene silencing [PTGS]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/78 Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y305/00 Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5)
    • C12Y305/04 Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5) in cyclic amidines (3.5.4)
    • C12Y305/04001 Cytosine deaminase (3.5.4.1)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]

Abstract

The invention provides an efficient sgRNA and its application in gene editing. The sgRNA has the structure of formula I: RNA transcribed from the target sequence - modified sgRNA backbone (formula I). The modified sgRNA backbone is an RNA molecule obtained by inserting an RNA fragment A between positions 14 and 15 of the sgRNA backbone and an RNA fragment B between positions 18 and 19 of the sgRNA backbone; RNA fragment A and RNA fragment B are reverse complementary and are each 3 nt long; the sgRNA backbone is the RNA molecule shown in sequence 9. Experiments show that the modified sgRNA significantly improves the C-to-T base substitution efficiency of a cytosine base editor (CBE), with the highest efficiency reaching 86.4%.

Description

Efficient sgRNA and application thereof in gene editing
Technical Field
The invention belongs to the technical field of biology, and particularly relates to an efficient sgRNA and application thereof in gene editing.
Background
CRISPR-Cas9 technology has become a powerful genome editing tool and is widely applied in many tissues and cells. The CRISPR/Cas9 protein-RNA complex is directed to the target site by a guide RNA and cleaves it to generate a DNA double-strand break (DSB); the organism then initiates a DNA repair mechanism to repair the DSB. There are generally two repair mechanisms: non-homologous end joining (NHEJ) and homology-directed repair (HDR). NHEJ usually dominates, so repair produces random indels (insertions or deletions) far more often than precise repair. Because HDR is inefficient and requires a DNA template, its use for precise base substitution is greatly limited.
In 2016, the laboratories of David Liu and Akihiko Kondo independently reported two different types of cytosine base editors (CBEs), which use two different cytidine deaminases: rAPOBEC1 (rat APOBEC1) and PmCDA1 (an ortholog of activation-induced cytidine deaminase (AID)). Both work on the principle that the cytidine deaminase directly edits a single cytosine (C) base, without generating a DSB and initiating HDR repair, which greatly improves the efficiency of replacing C with thymine (T). Specifically, dead Cas9 (dCas9) or Cas9 nickase (Cas9n), together with rAPOBEC1 or PmCDA1, is directed to the target site by the sgRNA; rAPOBEC1 or PmCDA1 catalyzes the deamination of C on the unpaired single-stranded DNA to uracil (U); U pairs with adenine (A) during DNA repair, and A finally pairs with T after DNA replication, achieving C-to-T conversion. Among the editors tested, the SpCas9n (D10A) & rAPOBEC1/PmCDA1 & UGI base editing system, which contains a uracil DNA glycosylase inhibitor (UGI), had a higher average mutation rate for two reasons: first, UGI inhibits uracil DNA glycosylase (UDG) from catalyzing the removal of U from DNA; second, SpCas9n (D10A) nicks the non-edited strand and induces the eukaryotic mismatch repair mechanism or the long-patch base excision repair (BER) mechanism, which favors repair of the U:G mismatch to U:A. To improve working efficiency and reduce cost, improving the efficiency of C-to-T base substitution has been a research direction for base editing systems in animal and plant genomes.
SaCas9, the Cas9 from Staphylococcus aureus, is a homolog of SpCas9 and recognizes the NNGRRT PAM; the SaCas9 variant SaKKH recognizes the broader NNNRRT PAM. Both have been developed into efficient CBEs, greatly expanding the range of editable C in animal and plant genomes. At present, there is no report of improving the C-to-T base substitution efficiency of SaKKH-based CBEs by modifying the structure of the sgRNA corresponding to SaCas9 (SaCas9 sgRNA).
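The difference between the two PAMs can be made concrete with a short sketch. It is an illustration only and not part of the invention: the function names are hypothetical, and it assumes the standard IUPAC codes N (any base) and R (A or G). Enumerating every sequence of PAM length shows how much broader NNNRRT is than NNGRRT.

```python
import re
from itertools import product

# IUPAC codes needed for the two PAMs named in the text (N = any base, R = A or G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "R": "[AG]"}


def pam_to_regex(pam):
    """Translate a PAM written with IUPAC codes into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in pam.upper()))


def matches_pam(site, pam):
    """Return True if the DNA string `site` satisfies the PAM pattern exactly."""
    return pam_to_regex(pam).fullmatch(site.upper()) is not None


def count_matching_sites(pam):
    """Count how many concrete DNA sequences of the PAM's length satisfy it."""
    return sum(
        matches_pam("".join(bases), pam)
        for bases in product("ACGT", repeat=len(pam))
    )


if __name__ == "__main__":
    for pam in ("NNGRRT", "NNNRRT"):
        print(pam, count_matching_sites(pam))
```

Under these assumptions NNGRRT matches 64 of the 4096 possible 6-nt sequences and NNNRRT matches 256, a fourfold increase in candidate PAM sites.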
Disclosure of Invention
The purpose of the present invention is to improve the efficiency of C-to-T base substitution by a cytosine base editor (CBE).
To achieve the above object, the present invention provides a kit comprising a sgRNA or a biological material related to the sgRNA, a Cas9 nuclease or a biological material related to the Cas9 nuclease, a cytosine deaminase or a biological material related to the cytosine deaminase;
the sgRNA targets a target sequence;
the sgRNA has the structure of formula I: RNA transcribed from the target sequence - modified sgRNA backbone (formula I);
the modified sgRNA backbone is an RNA molecule obtained by inserting an RNA fragment A between positions 14 and 15 of the sgRNA backbone and inserting an RNA fragment B between positions 18 and 19 of the sgRNA backbone;
RNA fragment A and RNA fragment B are reverse complementary;
RNA fragment A and RNA fragment B are each 3 nt long;
the sgRNA backbone is m1) or m2) or m3):
m1) the RNA molecule shown in sequence 9;
m2) an RNA molecule obtained by substituting and/or deleting and/or adding one or more nucleotides in the RNA molecule shown in m1) and having the same function;
m3) an RNA molecule having 75% or more identity with the RNA molecule defined in m1) or m2) and having the same function.
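The insertion rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the inventors' procedure: the function names are hypothetical, positions are counted from 1 at the 5' end, fragment B is derived as the reverse complement of fragment A, and the backbone used in the usage note is a made-up placeholder rather than sequence 9.

```python
# Watson-Crick complement table for RNA.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(rna):
    """Return the reverse complement of an RNA string."""
    return rna.upper().translate(RNA_COMPLEMENT)[::-1]


def modify_backbone(backbone, fragment_a):
    """Insert fragment A between positions 14 and 15, and its reverse
    complement (fragment B) between positions 18 and 19, of the backbone.

    Positions are 1-based and refer to the unmodified backbone.
    """
    if len(fragment_a) != 3:
        raise ValueError("fragment A must be 3 nt long")
    if len(backbone) < 19:
        raise ValueError("backbone must have at least 19 nt")
    backbone = backbone.upper()
    fragment_a = fragment_a.upper()
    fragment_b = reverse_complement_rna(fragment_a)
    return (
        backbone[:14]      # positions 1-14
        + fragment_a       # inserted between 14 and 15
        + backbone[14:18]  # positions 15-18
        + fragment_b       # inserted between 18 and 19
        + backbone[18:]    # positions 19 to the 3' end
    )
```

For example, with the placeholder backbone `"A" * 14 + "GAAA" + "U" * 14` and fragment A `"GCU"`, fragment B is `"AGC"` and the result is 6 nt longer than the input, with the two fragments flanking positions 15-18.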
In the above kit, the modified sgRNA backbone is n1) or n2) or n3):
n1) the RNA molecule shown in sequence 10;
n2) an RNA molecule obtained by substituting and/or deleting and/or adding one or more nucleotides in the RNA molecule shown in n1) and having the same function;
n3) an RNA molecule having 75% or more identity with the RNA molecule defined in n1) or n2) and having the same function.
In the kit, the Cas9 nuclease can be a protein such as SaKKHn, SaCas9, SaKKH-HF or SaCas9-HF. In a particular embodiment of the invention, the Cas9 nuclease is specifically a SaKKHn protein.
The SaKKHn protein is E1) or E2) or E3):
E1) a protein whose amino acid sequence is shown in sequence 2;
E2) a protein obtained by substituting and/or deleting and/or adding one or more amino acid residues in the amino acid sequence shown in sequence 2 of the sequence listing and having the same function;
E3) a fusion protein obtained by attaching a tag to the N-terminus and/or the C-terminus of E1) or E2);
the biological material related to the SaKKHn protein is any one of F1) to F5):
F1) a nucleic acid molecule encoding the SaKKHn protein;
F2) an expression cassette comprising the nucleic acid molecule of F1);
F3) a recombinant vector comprising the nucleic acid molecule of F1) or a recombinant vector comprising the expression cassette of F2);
F4) a recombinant microorganism comprising the nucleic acid molecule of F1), or a recombinant microorganism comprising the expression cassette of F2), or a recombinant microorganism comprising the recombinant vector of F3);
F5) a transgenic cell line comprising the nucleic acid molecule of F1) or a transgenic cell line comprising the expression cassette of F2).
In the kit, the cytosine deaminase can be a protein such as human APOBEC3A, human AID, PmCDA1 or rAPOBEC1. In a particular embodiment of the invention, the cytosine deaminase is specifically the PmCDA1 protein.
The PmCDA1 protein is G1) or G2) or G3):
G1) a protein whose amino acid sequence is shown in sequence 3;
G2) a protein obtained by substituting and/or deleting and/or adding one or more amino acid residues in the amino acid sequence shown in sequence 3 of the sequence listing and having the same function;
G3) a fusion protein obtained by attaching a tag to the N-terminus and/or the C-terminus of G1) or G2);
the biological material related to the PmCDA1 protein is any one of H1) to H5):
H1) a nucleic acid molecule encoding the PmCDA1 protein;
H2) an expression cassette comprising the nucleic acid molecule of H1);
H3) a recombinant vector comprising the nucleic acid molecule of H1) or a recombinant vector comprising the expression cassette of H2);
H4) a recombinant microorganism comprising the nucleic acid molecule of H1), or a recombinant microorganism comprising the expression cassette of H2), or a recombinant microorganism comprising the recombinant vector of H3);
H5) a transgenic cell line comprising the nucleic acid molecule of H1) or a transgenic cell line comprising the expression cassette of H2).
In the kit, the sgRNA can be a tRNA-sgRNA;
the tRNA-sgRNA has the structure of formula I: tRNA - RNA transcribed from the target sequence - modified sgRNA backbone (formula I);
the tRNA is 1) or 2) or 3):
1) an RNA molecule obtained by replacing T with U in positions 474-550 of sequence 1;
2) an RNA molecule obtained by substituting and/or deleting and/or adding one or more nucleotides in the RNA molecule shown in 1) and having the same function;
3) an RNA molecule having 75% or more identity with the nucleotide sequence defined in 1) or 2) and having the same function.
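Item 1) above amounts to taking a sub-range of a DNA sequence and writing it as RNA. A minimal sketch follows; the function name is hypothetical, and positions are assumed to be 1-based and inclusive with position 1 at the 5' end, as the sequence listing convention stated later in the description suggests.

```python
def transcribe_region(dna, start, end):
    """Return positions start..end (1-based, inclusive) of `dna` with T replaced by U."""
    if not 1 <= start <= end <= len(dna):
        raise ValueError("start and end must satisfy 1 <= start <= end <= len(dna)")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return dna[start - 1:end].upper().replace("T", "U")
```

Applied to sequence 1 with `start=474` and `end=550`, this would return a 77-nt RNA string (550 - 474 + 1 = 77).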
The kit may further include a UGI protein or a biological material associated with the UGI protein;
the UGI protein is I1) or I2) or I3):
I1) a protein whose amino acid sequence is shown in sequence 4;
I2) a protein obtained by substituting and/or deleting and/or adding one or more amino acid residues in the amino acid sequence shown in sequence 4 of the sequence listing and having the same function;
I3) a fusion protein obtained by attaching a tag to the N-terminus and/or the C-terminus of I1) or I2);
the biological material related to the UGI protein is any one of J1) to J5):
J1) a nucleic acid molecule encoding the UGI protein;
J2) an expression cassette comprising the nucleic acid molecule of J1);
J3) a recombinant vector comprising the nucleic acid molecule of J1) or a recombinant vector comprising the expression cassette of J2);
J4) a recombinant microorganism comprising the nucleic acid molecule of J1), or a recombinant microorganism comprising the expression cassette of J2), or a recombinant microorganism comprising the recombinant vector of J3);
J5) a transgenic cell line comprising the nucleic acid molecule of J1) or a transgenic cell line comprising the expression cassette of J2).
To facilitate purification of the proteins in E1), G1) and I1), a tag shown in the following table may be attached to the amino terminus or the carboxyl terminus of the protein consisting of the amino acid sequence shown in sequence 2, sequence 3 or sequence 4 of the sequence listing.
Table: Tag sequences
Tag             Residues            Sequence
Poly-Arg        5-6 (usually 5)     RRRRR
Poly-His        2-10 (usually 6)    HHHHHH
FLAG            8                   DYKDDDDK
Strep-tag II    8                   WSHPQFEK
c-myc           10                  EQKLISEEDL
The proteins in E2), G2) and I2) are proteins having 75% or more identity with the amino acid sequence of the protein shown in SEQ ID NO. 2, SEQ ID NO. 3 or SEQ ID NO. 4 and having the same function. The identity of 75% or more includes 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% identity.
The proteins in E2), G2) and I2) can be artificially synthesized, or obtained by first synthesizing their coding genes and then expressing them biologically.
The genes encoding the proteins in E2), G2) and I2) can be obtained from the DNA sequence shown in positions 3013-6225 of sequence 1 (encoding the protein shown in sequence 2), positions 6511-7134 of sequence 1 (encoding the protein shown in sequence 3) or positions 7156-7452 of sequence 1 (encoding the protein shown in sequence 4) by deleting the codons of one or more amino acid residues, and/or introducing missense mutations of one or more base pairs, and/or attaching the coding sequence of a tag shown in the table above to the 5' end and/or the 3' end.
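As a quick consistency check on the coordinates quoted above, the length of each coding region can be computed and tested for divisibility by three. This is a sketch only; the dictionary and function names are hypothetical, and positions are assumed to be 1-based and inclusive.

```python
# Coordinates within sequence 1 as quoted in the text (1-based, inclusive).
CODING_REGIONS = {
    "SaKKHn": (3013, 6225),
    "PmCDA1": (6511, 7134),
    "UGI": (7156, 7452),
}


def region_length(start, end):
    """Number of nucleotides in an inclusive 1-based range."""
    return end - start + 1


def codon_count(start, end):
    """Number of complete codons in the range; raises if the length is not a multiple of 3."""
    length = region_length(start, end)
    if length % 3 != 0:
        raise ValueError("region length is not a multiple of 3")
    return length // 3


if __name__ == "__main__":
    for name, (start, end) in CODING_REGIONS.items():
        print(name, region_length(start, end), codon_count(start, end))
```

The three regions span 3213, 624 and 297 nt, that is 1071, 208 and 99 codons respectively. The coordinates alone do not indicate whether a stop codon is included in each range.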
Further, the nucleic acid molecule of F1) is f1) or f2) or f3):
f1) a cDNA molecule or DNA molecule shown in positions 3013-6225 of sequence 1 in the sequence listing;
f2) a cDNA molecule or DNA molecule having 75% or more identity with the nucleotide sequence defined in f1) and encoding the SaKKHn;
f3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions with the nucleotide sequence defined in f1) or f2) and encodes the SaKKHn;
the nucleic acid molecule of H1) is h1) or h2) or h3):
h1) a cDNA molecule or DNA molecule shown in positions 6511-7134 of sequence 1 in the sequence listing;
h2) a cDNA molecule or DNA molecule having 75% or more identity with the nucleotide sequence defined in h1) and encoding the PmCDA1;
h3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions with the nucleotide sequence defined in h1) or h2) and encodes the PmCDA1;
the nucleic acid molecule of J1) is j1) or j2) or j3):
j1) a cDNA molecule or DNA molecule shown in positions 7156-7452 of sequence 1 in the sequence listing;
j2) a cDNA molecule or DNA molecule having 75% or more identity with the nucleotide sequence defined in j1) and encoding the UGI;
j3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions with the nucleotide sequence defined in j1) or j2) and encodes the UGI.
Wherein the nucleic acid molecule may be DNA, such as cDNA, genomic DNA or recombinant DNA; the nucleic acid molecule may also be RNA, such as mRNA or hnRNA, etc.
The nucleotide sequence encoding the SaKKHn, the PmCDA1 or the UGI of the present invention can easily be mutated by a person of ordinary skill in the art using known methods such as directed evolution and point mutation. Artificially modified nucleotides having 75% or more identity with the nucleotide sequence encoding the SaKKHn, the PmCDA1 or the UGI of the present invention are derived from, and equivalent to, the nucleotide sequences of the present invention as long as they encode the SaKKHn, the PmCDA1 or the UGI and have the same function.
The term "identity" as used herein refers to sequence similarity to a native nucleic acid sequence. "Identity" includes nucleotide sequences that are 75% or more, 85% or more, 90% or more, or 95% or more identical to the nucleotide sequence encoding the protein consisting of the amino acid sequence shown in sequence 2, 3 or 4 of the present invention. Identity can be assessed visually or with computer software. With computer software, the identity between two or more sequences can be expressed as a percentage (%), which can be used to assess the identity between related sequences.
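How a percentage identity might be computed can be illustrated with a simplified sketch. It is not the software referred to in the text: the function names are hypothetical, the two sequences are assumed to be already aligned to equal length (with "-" marking gaps), and identity is defined here as matching columns divided by alignment length. Real alignment tools may use other definitions.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns where both sequences carry the same residue.

    Both inputs must already be aligned to the same length; '-' denotes a gap
    and never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a, aligned_b, threshold=75.0):
    """True if the pair reaches the given identity threshold (75% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

With this definition, two 4-nt sequences that differ at one position are 75% identical and so just meet the 75% threshold used throughout the description.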
The stringent conditions are: hybridizing and washing the membrane at 68 ℃ in a solution of 2× SSC and 0.1% SDS, 2 times for 5 min each, and then at 68 ℃ in a solution of 0.5× SSC and 0.1% SDS, 2 times for 15 min each; alternatively, hybridizing and washing the membrane at 65 ℃ in a solution of 0.1× SSPE (or 0.1× SSC) and 0.1% SDS.
The above-mentioned identity of 75% or more may be 80%, 85%, 90% or 95% or more.
F2) The expression cassette containing a nucleic acid molecule encoding a SaKKHn protein (SaKKHn gene expression cassette) refers to a DNA capable of expressing the SaKKHn protein in a host cell, and the DNA may include not only a promoter which initiates transcription of the SaKKHn gene but also a terminator which terminates transcription of the SaKKHn gene. Further, the expression cassette may also include an enhancer sequence. The recombinant vector containing the SaKKHn gene expression cassette can be constructed by using the existing expression vector.
H2) The expression cassette containing the nucleic acid molecule encoding the PmCDA1 protein (PmCDA1 gene expression cassette) refers to a DNA capable of expressing the PmCDA1 protein in a host cell, and the DNA may include not only a promoter for initiating transcription of the PmCDA1 gene, but also a terminator for terminating transcription of the PmCDA1 gene. Further, the expression cassette may also include an enhancer sequence. The recombinant vector containing the PmCDA1 gene expression cassette can be constructed by using the existing expression vector.
J2) The expression cassette containing a nucleic acid molecule encoding the UGI protein (UGI gene expression cassette) refers to a DNA capable of expressing the UGI protein in a host cell, and the DNA may include not only a promoter for initiating transcription of the UGI gene but also a terminator for terminating transcription of the UGI gene. Further, the expression cassette may also include an enhancer sequence. The recombinant vector containing the UGI gene expression cassette can be constructed using an existing expression vector.
The vector may be a plasmid, cosmid, phage or viral vector. In a specific embodiment of the invention, the recombinant vector is specifically a SaKKHn-pBE +3bp-1 recombinant expression vector, a SaKKHn-pBE +3bp-2 recombinant expression vector, a SaKKHn-pBE +3bp-3 recombinant expression vector, a SaKKHn-pBE +3bp-4 recombinant expression vector, a SaKKHn-pBE +3bp-5 recombinant expression vector, a SaKKHn-pBE +3bp-6 recombinant expression vector or a SaKKHn-pBE +3bp-7 recombinant expression vector.
The nucleotide sequence of the SaKKHn-pBE +3bp-1 recombinant expression vector is obtained by replacing the DNA sequence of an origin sgRNA framework in the sequence of the SaKKHn-pBE-1 recombinant expression vector with the DNA sequence of a +3bp sgRNA framework shown in a sequence 6 and keeping other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +3bp-2 recombinant expression vector is obtained by replacing the DNA sequence of an origin sgRNA framework in the sequence of the SaKKHn-pBE-2 recombinant expression vector with the DNA sequence of a +3bp sgRNA framework shown in a sequence 6 and keeping other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +3bp-3 recombinant expression vector is obtained by replacing the DNA sequence of an origin sgRNA framework in the sequence of the SaKKHn-pBE-3 recombinant expression vector with the DNA sequence of a +3bp sgRNA framework shown in a sequence 6 and keeping other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +3bp-4 recombinant expression vector is obtained by replacing the DNA sequence of an origin sgRNA framework in the sequence of the SaKKHn-pBE-4 recombinant expression vector with the DNA sequence of a +3bp sgRNA framework shown in a sequence 6 and keeping other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +3bp-5 recombinant expression vector is obtained by replacing the DNA sequence of an origin sgRNA framework in the sequence of the SaKKHn-pBE-5 recombinant expression vector with the DNA sequence of a +3bp sgRNA framework shown in a sequence 6 and keeping other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +3bp-6 recombinant expression vector is obtained by replacing the DNA sequence of an origin sgRNA framework in the sequence of the SaKKHn-pBE-6 recombinant expression vector with the DNA sequence of a +3bp sgRNA framework shown in a sequence 6 and keeping other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +3bp-7 recombinant expression vector is obtained by replacing the DNA sequence of an origin sgRNA framework in the sequence of the SaKKHn-pBE-7 recombinant expression vector with the DNA sequence of a +3bp sgRNA framework shown in a sequence 6 and keeping other sequences unchanged.
The microorganism may be a yeast, bacterium, alga or fungus. The bacterium can be an Agrobacterium, such as Agrobacterium EHA105. In a specific embodiment of the present invention, the recombinant microorganism is Agrobacterium EHA105 which contains the SaKKHn-pBE +3bp-1 recombinant expression vector, the SaKKHn-pBE +3bp-2 recombinant expression vector, the SaKKHn-pBE +3bp-3 recombinant expression vector, the SaKKHn-pBE +3bp-4 recombinant expression vector, the SaKKHn-pBE +3bp-5 recombinant expression vector, the SaKKHn-pBE +3bp-6 recombinant expression vector, or the SaKKHn-pBE +3bp-7 recombinant expression vector.
The transgenic cell line does not include propagation material.
The kit has the following uses:
X1) editing a target sequence in the genome of an organism or a biological cell;
X2) preparing a product for editing a target sequence in the genome of an organism or a biological cell;
X3) improving the efficiency of editing a target sequence in the genome of an organism or a biological cell;
X4) preparing a product for improving the efficiency of editing a target sequence in the genome of an organism or a biological cell.
The sgRNA or the modified sgRNA backbone in the kit also belongs to the protection scope of the present invention.
In order to achieve the above object, the present invention also provides a new use of the above kit or sgRNA or modified sgRNA backbone.
The invention provides the use of the kit, the sgRNA or the modified sgRNA backbone in any one of X1) to X4):
X1) editing a target sequence in the genome of an organism or a biological cell;
X2) preparing a product for editing a target sequence in the genome of an organism or a biological cell;
X3) improving the efficiency of editing a target sequence in the genome of an organism or a biological cell;
X4) preparing a product for improving the efficiency of editing a target sequence in the genome of an organism or a biological cell.
To achieve the above object, the present invention finally provides the method of Y1) or Y2):
Y1) a method for editing, or for improving the efficiency of editing, a genomic target sequence of an organism or a biological cell, comprising expressing the sgRNA, the Cas9 nuclease and the cytosine deaminase in the organism or the biological cell to edit the genomic target sequence; the sgRNA targets the target sequence;
Y2) a method for preparing a biological mutant, comprising the following step: editing the genome of the organism according to the method described in Y1) to obtain a biological mutant.
In Y1) of the above method, the sgRNA is the tRNA-sgRNA. The tRNA-sgRNA transcribed from the DNA molecule that encodes it is an immature RNA precursor; the tRNA in the precursor is cleaved by two enzymes (RNase P and RNase Z) to yield mature RNA. As many independent mature RNAs are obtained as there are targets in a recombinant expression vector; each mature RNA consists, in order, of the RNA transcribed from the target sequence and the sgRNA backbone, or of the RNA transcribed from the target sequence, the sgRNA backbone and a few residual bases of the tRNA.
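The processing step described above can be modeled as string splitting. The sketch below is a deliberately idealized illustration, not a model of RNase P or RNase Z activity: the function names and all sequences in the usage note are hypothetical, the tRNA is assumed to be excised completely, and the residual tRNA bases mentioned in the text are ignored.

```python
def build_precursor(trna, guides, backbone):
    """Concatenate tRNA - guide - backbone units into one precursor transcript."""
    return "".join(trna + guide + backbone for guide in guides)


def process_precursor(precursor, trna):
    """Remove every copy of the tRNA and return the remaining segments in order.

    Each returned segment stands for one mature RNA (guide followed by backbone).
    """
    if not trna:
        raise ValueError("tRNA sequence must not be empty")
    return [segment for segment in precursor.split(trna) if segment]
```

For a precursor built from two guides, `process_precursor` returns two mature RNAs, matching the statement that the number of mature RNAs equals the number of targets. The sketch only works if the tRNA string does not also occur inside a guide or the backbone.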
In the above method, Y1) further comprises a step of expressing UGI in the organism or the biological cell; the number of UGI copies may be 1, or 2 or more. In a specific embodiment of the present invention, the number of UGI copies is 1.
Further, the sgRNA, the Cas9 nuclease, the cytosine deaminase, and the UGI are expressed in an organism or an organism cell by introducing a gene encoding the Cas9 nuclease, a DNA molecule that transcribes the sgRNA, a gene encoding the cytosine deaminase, and a gene encoding the UGI into the organism or the organism cell.
Further, the gene encoding Cas9 nuclease, the DNA molecule transcribing the sgRNA, the gene encoding cytosine deaminase, and the gene encoding the UGI are introduced into an organism or an organism cell via a recombinant expression vector.
The encoding gene of the Cas9 nuclease, the DNA molecule for transcribing the sgRNA, the encoding gene of the cytosine deaminase and the encoding gene of the UGI can be introduced into an organism or an organism cell through the same recombinant expression vector, or can be introduced into the organism or the organism cell through two or more recombinant expression vectors.
In a specific embodiment of the invention, the gene encoding Cas9 nuclease, the DNA molecule transcribing the sgRNA, the gene encoding cytosine deaminase and the gene encoding the UGI are introduced into the organism or the biological cell through the same recombinant expression vector. The recombinant expression vector contains an expression cassette A and an expression cassette B; the expression cassette A expresses the sgRNA, and the expression cassette B expresses a fusion protein consisting of the Cas9 nuclease, the cytosine deaminase and the UGI.
The recombinant expression vector is specifically the SaKKHn-pBE +3bp-1 recombinant expression vector, the SaKKHn-pBE +3bp-2 recombinant expression vector, the SaKKHn-pBE +3bp-3 recombinant expression vector, the SaKKHn-pBE +3bp-4 recombinant expression vector, the SaKKHn-pBE +3bp-5 recombinant expression vector, the SaKKHn-pBE +3bp-6 recombinant expression vector or the SaKKHn-pBE +3bp-7 recombinant expression vector.
In the kit or use or method, the number of target sequences may be 1 or 2 or more. The PAM sequence of the target sequence is NNNRRT.
In the kit or the use or the method, the genomic target sequence is edited by mutating C in the target sequence to T. The C may be a C at any position in the target sequence.
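The editing outcome described above, C mutated to T at one or more positions of the target sequence, can be sketched as follows. The function names and the example sequence are hypothetical, positions are 1-based, and the sketch says nothing about which positions a real editor would actually convert.

```python
def cytosine_positions(target):
    """Return the 1-based positions of every C in the target sequence."""
    return [i for i, base in enumerate(target.upper(), start=1) if base == "C"]


def apply_c_to_t(target, positions):
    """Return the target with the C at each given 1-based position replaced by T."""
    bases = list(target.upper())
    for pos in positions:
        if bases[pos - 1] != "C":
            raise ValueError(f"position {pos} is not a C")
        bases[pos - 1] = "T"
    return "".join(bases)
```

For the toy sequence `"ACCGT"`, `cytosine_positions` returns `[2, 3]`, and converting only position 3 gives `"ACTGT"`.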
In the above kit or use or method, the organism is S1) or S2) or S3) or S4):
S1) plants or animals;
S2) a monocot or dicot;
S3) gramineous plants;
S4) rice;
the biological cell is T1) or T2) or T3) or T4):
T1) plant cells or animal cells;
T2) a monocotyledonous or dicotyledonous plant cell;
T3) gramineous plant cells;
T4) rice cells.
The invention provides a modified sgRNA with the structure of formula I: RNA transcribed from the target sequence - modified sgRNA backbone (formula I). The modified sgRNA backbone is an RNA molecule obtained by inserting an RNA fragment A between positions 14 and 15 of the sgRNA backbone and an RNA fragment B between positions 18 and 19 of the sgRNA backbone; RNA fragment A and RNA fragment B are reverse complementary and are each 3 nt long; the sgRNA backbone is the RNA molecule shown in sequence 9. Experiments show that the modified sgRNA significantly improves the C-to-T base substitution efficiency of a cytosine base editor (CBE), with the highest efficiency reaching 86.4%.
Drawings
FIG. 1 shows the unmodified SaCas9 sgRNA structure and the modified SaCas9 sgRNA structure.
FIG. 2 is a schematic structural diagram of a recombinant expression vector.
Detailed Description
The present invention is described in further detail below with reference to specific embodiments, which are given for the purpose of illustration only and are not intended to limit the scope of the invention. The experimental procedures in the following examples are conventional unless otherwise specified. Materials, reagents, instruments and the like used in the following examples are commercially available unless otherwise specified. In the following examples, unless otherwise specified, the 1 st position of each nucleotide sequence in the sequence listing is the 5 'terminal nucleotide of the corresponding DNA/RNA, and the last position is the 3' terminal nucleotide of the corresponding DNA/RNA.
Primer pair T1 consists of primer T1-F (5'-ttcaaattctaatccccaatcc-3') and primer T1-R (5'-tcgtacctgtctgcaaccttg-3') and is used to amplify target T1.
Primer pair T2 consists of primer T2-F (5'-gctttagatgatttgttacatttcgc-3') and primer T2-R (5'-tgagttggtatggcaagaacaag-3') and is used to amplify target T2.
Primer pair T3 consists of primer T3-F (5'-aacacggtcaccaacttcatc-3') and primer T3-R (5'-acaacctggcttgctatatatgc-3') and is used to amplify target T3.
Primer pair T4 consists of primer T4-F (5'-tggatcggatatggacttctc-3') and primer T4-R (5'-gaaatgaacaatcacctgagatctttg-3') and is used to amplify targets T4 and T7.
Primer pair T5 consists of primer T5-F (5'-cgagctacctgaagaacaactacc-3') and primer T5-R (5'-cctcgattgcctgaaatttg-3') and is used to amplify target T5.
Primer pair T6 consists of primer T6-F (5'-tgcgagctcgacaacatcatg-3') and primer T6-R (5'-gacggcccatgtggaaacc-3') and is used to amplify target T6.
Primer pair T8 consists of primer T8-F (5'-gacgcccatagtcgaggtc-3') and primer T8-R (5'-ctctgctggatcaatgtcaatg-3') and is used to amplify target T8.
Primer pair T9 consists of primer T9-F (5'-cctcatccaatcgactgacac-3') and primer T9-R (5'-gtaattgtgcttggtgatggag-3') and is used to amplify target T9.
In the following examples, a C·T base substitution refers to a mutation from C to T at any position in the target sequence.
C·T base substitution efficiency = number of positive T0 seedlings with a C·T base substitution / total number of positive T0 seedlings analyzed × 100%.
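The two definitions above can be sketched in a few lines of Python. This is a minimal illustration and not the inventors' analysis pipeline: it assumes each positive T0 seedling is represented by a single base-called string covering the 20-nt target, read on the same strand as the target sequence. The function names and the seedling reads are hypothetical; only the T5 target sequence is taken from Table 3.

```python
def ct_positions(reference: str, observed: str) -> list[int]:
    """Return 1-based positions where a reference C is read as T."""
    if len(reference) != len(observed):
        raise ValueError("reference and observed sequences must have equal length")
    pairs = zip(reference.lower(), observed.lower())
    return [i for i, (ref, obs) in enumerate(pairs, start=1)
            if ref == "c" and obs == "t"]


def ct_substitution_efficiency(reference: str, reads: dict[str, str]) -> float:
    """Percentage of analyzed positive T0 seedlings with at least one C-to-T change."""
    if not reads:
        raise ValueError("at least one seedling must be analyzed")
    edited = sum(1 for seq in reads.values() if ct_positions(reference, seq))
    return edited / len(reads) * 100.0


# T5 target sequence from Table 3; the four reads below are invented for illustration.
T5_TARGET = "tcctcggcgtagtacgggct"
HYPOTHETICAL_READS = {
    "seedling_1": "tcctcggcgtagtacgggct",  # unedited
    "seedling_2": "ttttcggcgtagtacgggct",  # C2 and C3 changed to T
    "seedling_3": "tcctcggcgtagtacgggct",  # unedited
    "seedling_4": "tcttcggcgtagtacgggct",  # C3 changed to T
}

efficiency = ct_substitution_efficiency(T5_TARGET, HYPOTHETICAL_READS)
print(f"{efficiency:.1f}%")
```

A seedling counts as edited as soon as one C in the target reads as T, which mirrors the "any position" wording of the definition.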
Nipponbare rice: described in the reference: Effects of sodium nitroprusside and its photolysis products on the growth of Nipponbare rice seedlings and the expression of 5 hormone marker genes [J]. Journal of Henan Normal University (Natural Science Edition), 2017(2): 48-52; available to the public from the Beijing Academy of Agriculture and Forestry Sciences.
Recovery medium: N6 solid medium containing 200 mg/L timentin.
Screening medium: N6 solid medium containing 50 mg/L hygromycin.
Differentiation medium: N6 solid medium containing 2 mg/L KT, 0.2 mg/L NAA, 0.5 g/L glutamic acid and 0.5 g/L proline.
Rooting medium: N6 solid medium containing 0.2 mg/L NAA, 0.5 g/L glutamic acid and 0.5 g/L proline.
Example 1 Modification of the sgRNA backbone structure in the SaCas9 sgRNA
The structure of the SaCas9 sgRNA is as follows: RNA transcribed from the target sequence - sgRNA backbone.
The sgRNA backbone in the SaCas9 sgRNA structure was modified in two ways, as shown in FIG. 1.
Original represents the unmodified SaCas9 sgRNA structure. The unmodified SaCas9 sgRNA is designated Original sgRNA, and its sgRNA backbone is designated the Original sgRNA backbone. The RNA sequence of the Original sgRNA backbone is: GUUUUAGUACUCUGGAAACAGAAUCUACUAAAACAAGGCAAAAUGCCGUGUUUAUCUCGUCAACUUGUUGGCGAGAU (SEQ ID NO: 9); the DNA sequence of the Original sgRNA backbone is: GTTTTAGTACTCTGGAAACAGAATCTACTAAAACAAGGCAAAATGCCGTGTTTATCTCGTCAACTTGTTGGCGAGAT.
+3bp represents the SaCas9 sgRNA structure obtained by adding 3 base pairs to the unmodified SaCas9 sgRNA structure. The modified SaCas9 sgRNA is designated +3bp sgRNA, and its sgRNA backbone is designated the +3bp sgRNA backbone. The RNA sequence of the +3bp sgRNA backbone is: GUUUUAGUACUCUGCUGGAAACAGCAGAAUCUACUAAAACAAGGCAAAAUGCCGUGUUUAUCUCGUCAACUUGUUGGCGAGAU (SEQ ID NO: 10; the added bases, underlined in the original publication, are CUG after position 14 and CAG after position 18 of the Original sgRNA backbone); the DNA sequence of the +3bp sgRNA backbone is shown in sequence 6.
+8bp represents the SaCas9 sgRNA structure obtained by adding 8 base pairs to the unmodified SaCas9 sgRNA structure. The modified SaCas9 sgRNA is designated +8bp sgRNA, and its sgRNA backbone is designated the +8bp sgRNA backbone. The RNA sequence of the +8bp sgRNA backbone is: GUUUUAGUACUCUGUAAUUUUAGAAAUAAAAUUACAGAAUCUACUAAAACAAGGCAAAAUGCCGUGUUUAUCUCGUCAACUUGUUGGCGAGAU (the added bases, underlined in the original publication, are UAAUUUUA after position 14 and UAAAAUUA after position 18 of the Original sgRNA backbone); the DNA sequence of the +8bp sgRNA backbone is shown in sequence 7.
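The relationship between the three backbones can be checked with a short Python sketch. It inserts fragment A between positions 14 and 15 of the Original backbone and the reverse complement of fragment A (fragment B) between positions 18 and 19, as the summary of the invention describes. The function names are illustrative; the sequences are those given above.

```python
ORIGINAL_BACKBONE = (
    "GUUUUAGUACUCUGGAAACAGAAUCUACUAAAACAAGGCAAAAUGCCGUGUUUAUCUCGUCAACUUGUUGGCGAGAU"
)

_RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Reverse complement of an RNA string (A/U/G/C only)."""
    return "".join(_RNA_COMPLEMENT[base] for base in reversed(seq))


def extend_stem(backbone: str, fragment_a: str) -> str:
    """Insert fragment A after position 14 and its reverse complement after position 18.

    Positions are 1-based, so the slices are [:14], [14:18] and [18:].
    """
    fragment_b = reverse_complement_rna(fragment_a)
    return backbone[:14] + fragment_a + backbone[14:18] + fragment_b + backbone[18:]


plus3_backbone = extend_stem(ORIGINAL_BACKBONE, "CUG")       # +3bp sgRNA backbone
plus8_backbone = extend_stem(ORIGINAL_BACKBONE, "UAAUUUUA")  # +8bp sgRNA backbone

print(len(ORIGINAL_BACKBONE), len(plus3_backbone), len(plus8_backbone))
```

With fragment A set to CUG or UAAUUUUA, the function reproduces the +3bp and +8bp backbone sequences listed in this example, which confirms that the two inserted fragments flank the GAAA tetraloop at positions 15-18 and lengthen the adjacent stem.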
Example 2 Application of the modified SaCas9 sgRNA in improving the C·T base substitution efficiency of the SaKKHn & PmCDA1 & UGI base editing system
First, construction of recombinant expression vectors
The following recombinant expression vectors were artificially synthesized; each is a circular plasmid:
two recombinant expression vectors containing Original sgRNA: SaKKHn-pBE-1 and SaKKHn-pBE-2;
two recombinant expression vectors containing +3bp sgRNA: SaKKHn-pBE +3bp-1 and SaKKHn-pBE +3bp-2;
two recombinant expression vectors containing +8bp sgRNA: SaKKHn-pBE +8bp-1 and SaKKHn-pBE +8bp-2.
The nucleotide sequence of the SaKKHn-pBE-1 recombinant expression vector is sequence 1 in the sequence listing. In sequence 1, positions 131-467 are the nucleotide sequence of the OsU3 promoter; positions 474-550 and 648-724 are nucleotide sequences of tRNA; positions 551-647 and 725-821 are the nucleotide sequences of two sgRNAs targeting the OsWaxy gene, whose common sgRNA backbone (the Original sgRNA backbone) has the DNA sequence at positions 571-647 or 745-821 of sequence 1; positions 996-1286 are the nucleotide sequence of the OsU3 terminator; positions 1293-3006 are the nucleotide sequence of the OsUbq3 promoter; positions 3013-6225 are the coding sequence of the SaKKHn protein (without stop codon), encoding the SaKKHn protein shown in sequence 2; positions 6511-7134 are the coding sequence of the PmCDA1 protein (without stop codon), encoding the PmCDA1 protein shown in sequence 3; positions 7156-7452 are the coding sequence of the UGI protein, encoding the UGI protein shown in sequence 4; positions 7459-7653 are the nucleotide sequence of the 35S terminator; positions 7728-9720 are the nucleotide sequence of the ZmUbi1 promoter; positions 9727-10749 are the coding sequence of hygromycin phosphotransferase; and positions 10779-10994 are the nucleotide sequence of CaMV35S polyA. The two targets in the SaKKHn-pBE-1 recombinant expression vector are T1 and T2, whose sequences are shown in Table 1.
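The coordinates above are 1-based and inclusive, which is easy to get wrong when slicing a sequence programmatically. The sketch below copies positions 541-660 of sequence 1 from the sequence listing and extracts two of the annotated features. The helper names are illustrative, and reading positions 551-570 as the target-derived part of the first sgRNA is an inference from the stated coordinates (sgRNA at 551-647, backbone at 571-647), not a statement taken from Table 1.

```python
# Positions 541-660 of sequence 1, copied from the sequence listing (two rows of 60 nt).
REGION_START = 541
REGION = (
    "ggctggtgca" "catccaatgc" "gatgatcaag" "gttttagtac" "tctggaaaca" "gaatctacta"
    "aaacaaggca" "aaatgccgtg" "tttatctcgt" "caacttgttg" "gcgagataac" "aaagcaccag"
)

ORIGINAL_BACKBONE_DNA = (
    "GTTTTAGTACTCTGGAAACAGAATCTACTAAAACAAGGCAAAATGCCGTGTTTATCTCGTCAACTTGTTGGCGAGAT"
)


def feature_length(start: int, end: int) -> int:
    """Length of a feature given 1-based inclusive coordinates."""
    return end - start + 1


def extract(start: int, end: int) -> str:
    """Slice REGION using 1-based inclusive coordinates of sequence 1."""
    if start < REGION_START or end >= REGION_START + len(REGION):
        raise ValueError("coordinates fall outside the copied region")
    return REGION[start - REGION_START: end - REGION_START + 1]


sgrna_1_length = feature_length(551, 647)   # full first sgRNA
backbone_1 = extract(571, 647)              # Original sgRNA backbone, first copy
target_part_1 = extract(551, 570)           # inferred target-derived 20 nt

print(sgrna_1_length, len(backbone_1), target_part_1)
```

The 97-nt sgRNA therefore splits into a 20-nt target-derived part and the 77-nt backbone, and the extracted backbone matches the DNA sequence of the Original sgRNA backbone given in Example 1.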
The nucleotide sequence of the SaKKHn-pBE-2 recombinant expression vector is obtained by replacing positions 474-995 of sequence 1 with sequence 5 and keeping the other sequences unchanged. In sequence 5, positions 1-77 and 175-251 are both nucleotide sequences of tRNA, and positions 78-174 and 252-348 are the nucleotide sequences of two sgRNAs targeting the OsNRT1.1B gene and the OsGRF4 gene, respectively. The DNA sequences of the Original sgRNA backbone are at positions 98-174 and 272-348. The two targets in the SaKKHn-pBE-2 recombinant expression vector are T3 and T4, whose sequences are shown in Table 1.
The nucleotide sequence of the SaKKHn-pBE +3bp-1 recombinant expression vector is obtained by replacing the DNA sequences of the Original sgRNA backbone in the sequence of the SaKKHn-pBE-1 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in sequence 6 and keeping the other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +3bp-2 recombinant expression vector is obtained by replacing the DNA sequences of the Original sgRNA backbone in the sequence of the SaKKHn-pBE-2 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in sequence 6 and keeping the other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +8bp-1 recombinant expression vector is obtained by replacing the DNA sequences of the Original sgRNA backbone in the sequence of the SaKKHn-pBE-1 recombinant expression vector with the DNA sequence of the +8bp sgRNA backbone shown in sequence 7 and keeping the other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +8bp-2 recombinant expression vector is obtained by replacing the DNA sequences of the Original sgRNA backbone in the sequence of the SaKKHn-pBE-2 recombinant expression vector with the DNA sequence of the +8bp sgRNA backbone shown in sequence 7 and keeping the other sequences unchanged.
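The derivation of the modified vectors amounts to a sequence substitution, which can be sketched as follows. This is an in-silico illustration only, not the synthesis procedure. Sequences 6 and 7 are not reproduced in this part of the document, so the sketch assumes that the DNA form of the +3bp backbone is the RNA sequence of Example 1 with U replaced by T. The toy cassette, its 20-nt spacers and the function names are hypothetical stand-ins for a real vector sequence.

```python
ORIGINAL_BACKBONE_DNA = (
    "GTTTTAGTACTCTGGAAACAGAATCTACTAAAACAAGGCAAAATGCCGTGTTTATCTCGTCAACTTGTTGGCGAGAT"
)
PLUS3_BACKBONE_RNA = (
    "GUUUUAGUACUCUGCUGGAAACAGCAGAAUCUACUAAAACAAGGCAAAAUGCCGUGUUUAUCUCGUCAACUUGUUGGCGAGAU"
)


def rna_to_dna(rna: str) -> str:
    """DNA form of an RNA string (assumed here to correspond to sequence 6)."""
    return rna.upper().replace("U", "T")


def swap_backbone(vector: str, old_backbone: str, new_backbone: str) -> tuple[str, int]:
    """Replace every copy of old_backbone; return the new sequence and the copy count."""
    vector = vector.upper()
    copies = vector.count(old_backbone)
    if copies == 0:
        raise ValueError("backbone not found in vector sequence")
    return vector.replace(old_backbone, new_backbone), copies


# Toy cassette: two 20-nt spacers, each followed by the Original backbone,
# separated by a short illustrative linker.
toy_cassette = (
    "CATCCAATGCGATGATCAAG" + ORIGINAL_BACKBONE_DNA
    + "AACAAAGCACCAG"
    + "AATCACCAGTGGAAGCTAAG" + ORIGINAL_BACKBONE_DNA
)

plus3_cassette, replaced = swap_backbone(
    toy_cassette, ORIGINAL_BACKBONE_DNA, rna_to_dna(PLUS3_BACKBONE_RNA)
)
print(replaced, len(plus3_cassette) - len(toy_cassette))
```

Each replaced backbone adds 6 nt, so a cassette with two sgRNAs grows by 12 nt while the spacers and all other elements stay unchanged.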
The target nucleotide sequence and the corresponding PAM sequence of each vector are shown in table 1.
TABLE 1
[Table 1 is provided as an image in the original publication.]
Second, obtaining positive T0 rice seedlings
The SaKKHn-pBE-1, SaKKHn-pBE-2, SaKKHn-pBE +3bp-1, SaKKHn-pBE +3bp-2, SaKKHn-pBE +8bp-1 and SaKKHn-pBE +8bp-2 vectors obtained in step one were each processed according to the following steps 1-9:
1. The vector was introduced into Agrobacterium EHA105 (product of Shanghai Diego Biotechnology Ltd., CAT#: AC1010) to obtain recombinant Agrobacterium.
2. The recombinant Agrobacterium was cultured in medium (YEP medium containing 50 μg/ml kanamycin and 25 μg/ml rifampicin) with shaking at 28 ℃ and 150 rpm until the target OD600 was reached, then centrifuged at room temperature at 10000 rpm for 1 min; the cells were resuspended in infection solution (N6 liquid medium in which the sugar is replaced by glucose and sucrose, at concentrations of 10 g/L and 20 g/L, respectively) and diluted to OD600 = 0.2 to obtain the Agrobacterium infection solution.
3. Mature seeds of the rice variety Nipponbare were dehusked, placed in a 100 mL Erlenmeyer flask, soaked in 70% (v/v) aqueous ethanol for 30 sec, then transferred to 25% (v/v) aqueous sodium hypochlorite and sterilized with shaking at 120 rpm for 30 min, rinsed 3 times with sterile water, blotted dry with filter paper, placed embryo-side down on N6 solid medium, and cultured in the dark at 28 ℃ for 4-6 weeks to obtain rice callus.
4. After step 3, the rice callus was soaked for 10 min in Agrobacterium infection solution A (the Agrobacterium infection solution supplemented with acetosyringone at a volume ratio of 25 μl acetosyringone to 50 ml Agrobacterium infection solution), then placed on a culture dish lined with two layers of sterile filter paper (containing about 200 ml of infection solution without Agrobacterium) and cultured in the dark at 21 ℃ for 1 day.
5. The rice callus obtained in step 4 was placed on recovery medium and cultured in the dark at 25-28 ℃ for 3 days.
6. The rice callus obtained in step 5 was placed on screening medium and cultured in the dark at 28 ℃ for 2 weeks.
7. The rice callus obtained in step 6 was placed on fresh screening medium and cultured in the dark at 28 ℃ for 2 weeks to obtain resistant rice callus.
8. The resistant rice callus obtained in step 7 was placed on differentiation medium and cultured in the light at 25 ℃ for about 1 month; the differentiated plantlets were then transferred to rooting medium and cultured in the light at 25 ℃ for 2 weeks to obtain rice T0 seedlings.
9. Genomic DNA of the rice T0 seedlings was extracted and used as a template for PCR amplification with a primer pair consisting of primer F (5'-attatgtagcttgtgcgtttcg-3') and primer R (5'-ctccacctcattgacattatgc-3') to obtain a PCR amplification product; the PCR amplification product was subjected to agarose gel electrophoresis and judged as follows: if the PCR amplification product contains a DNA fragment of about 898 bp, the corresponding rice T0 seedling is a positive T0 rice seedling; if the PCR amplification product does not contain a DNA fragment of about 898 bp, the corresponding rice T0 seedling is not a positive T0 rice seedling.
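The expected amplicon size in step 9 can be reasoned about with a small in-silico PCR sketch. In the sequence 1 listing, primer F appears to match positions 2538-2559 and the reverse complement of primer R positions 3414-3435, which is consistent with a product of 3435 - 2538 + 1 = 898 bp. The code below does not embed that region; it uses a synthetic toy template of the same geometry, and the function names are illustrative rather than part of the original protocol.

```python
_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.lower().translate(_COMPLEMENT)[::-1]


def amplicon_length(template: str, forward: str, reverse: str):
    """Length of the product spanning forward primer to reverse primer, or None.

    Only exact matches on the given template strand are considered.
    """
    template = template.lower()
    start = template.find(forward.lower())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


PRIMER_F = "attatgtagcttgtgcgtttcg"
PRIMER_R = "ctccacctcattgacattatgc"

# Synthetic template: 2 nt, primer F site, 854 nt of filler, primer R site, 2 nt.
toy_template = "gg" + PRIMER_F + "ac" * 427 + reverse_complement(PRIMER_R) + "cc"

print(amplicon_length(toy_template, PRIMER_F, PRIMER_R))
```

A template lacking either primer site returns `None`, which corresponds to a seedling that would be scored as not positive on the gel.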
Third, result analysis
1. For each vector, genomic DNA of the positive T0 rice seedlings obtained in step two was used as a template. The T1 target was PCR-amplified with primer pair T1, the T2 target with primer pair T2, the T3 target with primer pair T3, and the T4 target with primer pair T4, to obtain PCR amplification products.
2. The PCR amplification products obtained in step 1 were subjected to Sanger sequencing and analysis. Only the target region of each sequencing result was analyzed. The number of positive T0 seedlings with a C·T base substitution at T1, T2, T3 and T4 was counted and the C·T base substitution efficiency was calculated; the results are shown in Table 2.
The results show that, for all four targets, the SaKKHn & PmCDA1 & UGI base editing system using the +3bp sgRNA achieved a higher C·T base substitution efficiency than the system using the Original sgRNA; at the T2 target alone, the efficiency was improved 3-fold. The SaKKHn & PmCDA1 & UGI base editing system using the +8bp sgRNA was unstable: it improved the C·T base substitution efficiency to varying degrees at the T2, T3 and T4 targets but reduced it to some extent at the T1 target. Overall, except at the T4 target, the system using the +3bp sgRNA achieved C·T base substitution more efficiently than the system using the +8bp sgRNA.
TABLE 2
Example 3 Application of the +3bp sgRNA in improving the C·T base substitution efficiency of the SaKKHn & PmCDA1 & UGI base editing system
First, construction of recombinant expression vectors
The following recombinant expression vectors were artificially synthesized; each is a circular plasmid:
five recombinant expression vectors containing Original sgRNA: SaKKHn-pBE-3, SaKKHn-pBE-4, SaKKHn-pBE-5, SaKKHn-pBE-6 and SaKKHn-pBE-7;
five recombinant expression vectors containing +3bp sgRNA: SaKKHn-pBE +3bp-3, SaKKHn-pBE +3bp-4, SaKKHn-pBE +3bp-5, SaKKHn-pBE +3bp-6 and SaKKHn-pBE +3bp-7.
The nucleotide sequence of the SaKKHn-pBE-3 recombinant expression vector is obtained by replacing positions 474-995 of sequence 1 with sequence 8 and keeping the other sequences unchanged. In sequence 8, positions 1-77 are the nucleotide sequence of tRNA, positions 78-174 are the nucleotide sequence of the sgRNA targeting the OsWaxy gene, and positions 98-174 are the DNA sequence of the Original sgRNA backbone. The target in the SaKKHn-pBE-3 recombinant expression vector is T5, whose sequence is shown in Table 3.
The nucleotide sequence of the SaKKHn-pBE-4 recombinant expression vector is obtained by replacing a T5 target sequence in the sequence of the SaKKHn-pBE-3 recombinant expression vector with a T6 target sequence and keeping other sequences unchanged. The T6 target sequences are shown in Table 3.
The nucleotide sequence of the SaKKHn-pBE-5 recombinant expression vector is obtained by replacing a T5 target sequence in the sequence of the SaKKHn-pBE-3 recombinant expression vector with a T7 target sequence and keeping other sequences unchanged. The T7 target sequences are shown in Table 3.
The nucleotide sequence of the SaKKHn-pBE-6 recombinant expression vector is obtained by replacing a T5 target sequence in the sequence of the SaKKHn-pBE-3 recombinant expression vector with a T8 target sequence and keeping other sequences unchanged. The T8 target sequences are shown in Table 3.
The nucleotide sequence of the SaKKHn-pBE-7 recombinant expression vector is obtained by replacing the T5 target sequence in the sequence of the SaKKHn-pBE-3 recombinant expression vector with the T9 target sequence and keeping the other sequences unchanged. The T9 target sequence is shown in Table 3.
The nucleotide sequence of the SaKKHn-pBE +3bp-3 recombinant expression vector is obtained by replacing the DNA sequence of the Original sgRNA backbone in the sequence of the SaKKHn-pBE-3 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in sequence 6 and keeping the other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +3bp-4 recombinant expression vector is obtained by replacing the DNA sequence of the Original sgRNA backbone in the sequence of the SaKKHn-pBE-4 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in sequence 6 and keeping the other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +3bp-5 recombinant expression vector is obtained by replacing the DNA sequence of the Original sgRNA backbone in the sequence of the SaKKHn-pBE-5 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in sequence 6 and keeping the other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +3bp-6 recombinant expression vector is obtained by replacing the DNA sequence of the Original sgRNA backbone in the sequence of the SaKKHn-pBE-6 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in sequence 6 and keeping the other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +3bp-7 recombinant expression vector is obtained by replacing the DNA sequence of the Original sgRNA backbone in the sequence of the SaKKHn-pBE-7 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in sequence 6 and keeping the other sequences unchanged.
The target nucleotide sequence and the corresponding PAM sequence for each vector are shown in table 3.
TABLE 3
Target  Target gene  Target sequence (5'-3')  PAM     Recombinant expression vectors
T5      OsWaxy       tcctcggcgtagtacgggct     CACGGT  SaKKHn-pBE-3; SaKKHn-pBE+3bp-3
T6      OsWaxy       tatccgggcaaggtgagggc     CGTGGT  SaKKHn-pBE-4; SaKKHn-pBE+3bp-4
T7      OsGRF4       acgccggcaccgccctggct     CTGGGT  SaKKHn-pBE-5; SaKKHn-pBE+3bp-5
T8      OsALS        cccaagcatgcgcagggaca     ACGGGT  SaKKHn-pBE-6; SaKKHn-pBE+3bp-6
T9      OsALS        cacgtccttcccgctcgagg     CCGGGT  SaKKHn-pBE-7; SaKKHn-pBE+3bp-7
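The PAM requirement stated earlier (NNNRRT) can be checked against the entries of Table 3 with a short sketch. It expands the IUPAC codes N (any base) and R (A or G) into a regular expression; the dictionary and function names are illustrative, and only the IUPAC codes needed for this PAM are included.

```python
import re

# Minimal IUPAC expansion, limited to the codes used in NNNRRT plus plain bases.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "N": "[ACGT]"}


def pam_regex(pattern: str) -> re.Pattern:
    """Compile an IUPAC PAM pattern such as NNNRRT into a regular expression."""
    return re.compile("".join(IUPAC[code] for code in pattern.upper()))


def matches_pam(pam: str, pattern: str = "NNNRRT") -> bool:
    """True if the whole PAM string satisfies the pattern."""
    return pam_regex(pattern).fullmatch(pam.upper()) is not None


# Target sequence and PAM for each target, copied from Table 3.
TABLE_3 = {
    "T5": ("tcctcggcgtagtacgggct", "CACGGT"),
    "T6": ("tatccgggcaaggtgagggc", "CGTGGT"),
    "T7": ("acgccggcaccgccctggct", "CTGGGT"),
    "T8": ("cccaagcatgcgcagggaca", "ACGGGT"),
    "T9": ("cacgtccttcccgctcgagg", "CCGGGT"),
}

for name, (target, pam) in TABLE_3.items():
    print(name, len(target), pam, matches_pam(pam))
```

All five PAMs in Table 3 satisfy NNNRRT, and every target sequence is 20 nt long.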
Second, obtaining positive T0 rice seedlings
The SaKKHn-pBE-3, SaKKHn-pBE-4, SaKKHn-pBE-5, SaKKHn-pBE-6, SaKKHn-pBE-7, SaKKHn-pBE +3bp-3, SaKKHn-pBE +3bp-4, SaKKHn-pBE +3bp-5, SaKKHn-pBE +3bp-6 and SaKKHn-pBE +3bp-7 vectors constructed in step one were each processed according to steps 1-9 of step two in Example 2 to obtain positive T0 rice seedlings.
Third, result analysis
1. For each vector, genomic DNA of the positive T0 rice seedlings obtained in step two was used as a template. The T5 target was PCR-amplified with primer pair T5, the T6 target with primer pair T6, the T7 target with primer pair T4, the T8 target with primer pair T8, and the T9 target with primer pair T9, to obtain PCR amplification products.
2. The PCR amplification products obtained in step 1 were subjected to Sanger sequencing and analysis. Only the target region of each sequencing result was analyzed. The number of positive T0 seedlings with a C·T base substitution at T5, T6, T7, T8 and T9 was counted and the C·T base substitution efficiency was calculated; the results are shown in Table 4.
The results show that, for all five targets, the SaKKHn & PmCDA1 & UGI base editing system using the +3bp sgRNA achieved a higher C·T base substitution efficiency than the system using the Original sgRNA. At the T9 target in particular, the system using the Original sgRNA produced no C·T base substitution, whereas the system using the +3bp sgRNA successfully produced C·T base substitutions.
TABLE 4
[Table 4 is provided as an image in the original publication.]
The present invention has been described in detail above. It will be apparent to those skilled in the art that the invention can be practiced in a wide range of equivalent parameters, concentrations, and conditions without departing from the spirit and scope of the invention and without undue experimentation. While the invention has been described with reference to specific embodiments, it will be appreciated that the invention can be further modified. In general, this application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. The use of some of the essential features is possible within the scope of the claims attached below.
Sequence listing
<110> Beijing Academy of Agriculture and Forestry Sciences
<120> Efficient sgRNA and application thereof in gene editing
<160>10
<170>PatentIn version 3.5
<210>1
<211>17400
<212>DNA
<213>Artificial Sequence
<400>1
ggtggcagga tatattgtgg tgtaaacatg gcactagcct caccgtcttc gcagacgagg 60
ccgctaagtc gcagctacgc tctcaacggc actgactagg tagtttaaac gtgcacttaa 120
ttaaggtacc gaagcaactt aaagttatca ggcatgcatg gatcttggag gaatcagatg 180
tgcagtcagg gaccatagca caagacaggc gtcttctact ggtgctacca gcaaatgctg 240
gaagccggga acactgggta cgttggaaac cacgtgatgt gaagaagtaa gataaactgt 300
aggagaaaag catttcgtag tgggccatga agcctttcag gacatgtatt gcagtatggg 360
ccggcccatt acgcaattgg acgacaacaa agactagtat tagtaccacc tcggctatcc 420
acatagatca aagctgattt aaaagagttg tgcagatgat ccgtggcgga tccaacaaag 480
caccagtggt ctagtggtag aatagtaccc tgccacggta cagacccggg ttcgattccc 540
ggctggtgca catccaatgc gatgatcaag gttttagtac tctggaaaca gaatctacta 600
aaacaaggca aaatgccgtg tttatctcgt caacttgttg gcgagataac aaagcaccag 660
tggtctagtg gtagaatagt accctgccac ggtacagacc cgggttcgat tcccggctgg 720
tgcaaatcac cagtggaagc taaggtttta gtactctgga aacagaatct actaaaacaa 780
ggcaaaatgc cgtgtttatc tcgtcaactt gttggcgaga taacaaagca ccagtggtct 840
agtggtagaa tagtaccctg ccacggtaca gacccgggtt cgattcccgg ctggtgcaac 900
cggatttgaa cgatggacgt tttagtactc tggaaacaga atctactaaa acaaggcaaa 960
atgccgtgtt tatctcgtca acttgttggc gagatttttt tttttcgttt tgcattgagt 1020
tttctccgtc gcatgtttgc agttttattt tccgttttgc attgaaattt ctccgtctca 1080
tgtttgcagc gtgttcaaaa agtacgcagc tgtatttcac ttatttacgg cgccacattt 1140
tcatgccgtt tgtgccaact atcccgagct agtgaataca gcttggcttc acacaacact 1200
ggtgacccgc tgacctgctc gtacctcgta ccgtcgtacg gcacagcatt tggaattaaa 1260
gggtgtgatc gatactgctt gctgctaagc ttacaaattc gggtcaaggc ggaagccagc 1320
gcgccacccc acgtcagcaa atacggaggc gcggggttga cggcgtcacc cggtcctaac 1380
ggcgaccaac aaaccagcca gaagaaatta cagtaaaaaa aaagtaaatt gcactttgat 1440
ccacctttta ttacctaagt ctcaatttgg atcaccctta aacctatctt ttcaatttgg 1500
gccgggttgt ggtttggact accatgaaca acttttcgtc atgtctaact tccctttcag 1560
caaacatatg aaccatatat agaggagatc ggccgtatac tagagctgat gtgtttaagg 1620
tcgttgattg cacgagaaaa aaaaatccaa atcgcaacaa tagcaaattt atctggttca 1680
aagtgaaaag atatgtttaa aggtagtcca aagtaaaact tatagataat aaaatgtggt 1740
ccaaagcgta attcactcaa aaaaaatcaa cgagacgtgt accaaacgga gacaaacggc 1800
atcttctcga aatttcccaa ccgctcgctc gcccgcctcg tcttcccgga aaccgcggtg 1860
gtttcagcgt ggcggattct ccaagcagac ggagacgtca cggcacggga ctcctcccac 1920
cacccaaccg ccataaatac cagccccctc atctcctctc ctcgcatcag ctccaccccc 1980
gaaaaatttc tccccaatct cgcgaggctc tcgtcgtcga atcgaatcct ctcgcgtcct 2040
caaggtacgc tgcttctcct ctcctcgctt cgtttcgatt cgatttcgga cgggtgaggt 2100
tgttttgttg ctagatccga ttggtggtta gggttgtcga tgtgattatc gtgagatgtt 2160
taggggttgt agatctgatg gttgtgattt gggcacggtt ggttcgatag gtggaatcgt 2220
ggttaggttt tgggattgga tgttggttct gatgattggg gggaattttt acggttagat 2280
gaattgttgg atgattcgat tggggaaatc ggtgtagatc tgttggggaa ttgtggaact 2340
agtcatgcct gagtgattgg tgcgatttgt agcgtgttcc atcttgtagg ccttgttgcg 2400
agcatgttca gatctactgt tccgctcttg attgagttat tggtgccatg ggttggtgca 2460
aacacaggct ttaatatgtt atatctgttt tgtgtttgat gtagatctgt agggtagttc 2520
ttcttagaca tggttcaatt atgtagcttg tgcgtttcga tttgatttca tatgttcaca 2580
gattagataa tgatgaactc ttttaattaa ttgtcaatgg taaataggaa gtcttgtcgc 2640
tatatctgtc ataatgatct catgttacta tctgccagta atttatgcta agaactatat 2700
tagaatatca tgttacaatc tgtagtaata tcatgttaca atctgtagtt catctatata 2760
atctattgtg gtaatttctt tttactatct gtgtgaagat tattgccact agttcattct 2820
acttatttct gaagttcagg atacgtgtgc tgttactacc tatctgaata catgtgtgat 2880
gtgcctgtta ctatcttttt gaatacatgt atgttctgtt ggaatatgtt tgctgtttga 2940
tccgttgttg tgtccttaat cttgtgctag ttcttaccct atctgtttgg tgattatttc 3000
ttgcagtacg taatggctcc taagaagaag cggaaggttg gcatccacgg tgtcccggcg 3060
gcaaagagaa actacatcct gggtctggcc atcggtatta catcggtggg ctacggcatc 3120
atcgactacg agacaaggga tgtcatcgat gccggcgtcc ggctcttcaa ggaggccaac 3180
gtggagaata acgagggcag gcgctccaag cgcggcgcgc ggaggctgaa gcgcaggcgg 3240
aggcatcgca tccagcgggt gaagaagctc ctcttcgact acaatctgct cacggatcat 3300
tccgagctgt ctggcatcaa cccatacgag gcgcgggtga agggcctgtc ccagaagctc 3360
tcggaggagg agttctcggc ggccctgctg catctcgcga agaggcgcgg cgtgcataat 3420
gtcaatgagg tggaggagga taccggcaat gagctgtcaa ccaaggagca gatcagcagg 3480
aactccaagg cgctggagga gaagtatgtg gcggagctcc agctcgagag gctgaagaag 3540
gatggcgagg tccggggctc catcaatagg ttcaagacat cggactacgt gaaggaggcc 3600
aagcagctcc tgaaggtgca gaaggcgtac caccagctgg accagagctt catcgacacc 3660
tacatcgatc tgctcgagac acgccggacg tactacgagg gcccgggcga gggctcaccg 3720
ttcggctgga aggacatcaa ggagtggtac gagatgctga tgggccactg cacctacttc 3780
cctgaggagc tgaggagcgt gaagtacgcg tacaatgcgg acctctacaa cgccctgaac 3840
gacctcaata acctcgtgat cacgcgcgac gagaatgaga agctcgagta ctacgagaag 3900
ttccagatca tcgagaacgt gttcaagcag aagaagaagc cgaccctcaa gcagatcgcc 3960
aaggagatcc tcgtcaatga ggaggacatc aagggctaca gggtgacctc gaccggcaag 4020
ccagagttca ccaacctgaa ggtctaccac gacatcaagg atatcaccgc ccgcaaggag 4080
atcatcgaga atgcggagct cctggatcag atcgcgaaga tcctcaccat ctaccagtcc 4140
agcgaggaca tccaggagga gctcacgaac ctgaatagcg agctgaccca ggaggagatc 4200
gagcagatct ccaacctcaa gggctacacc ggcacgcaca atctgagcct caaggcgatc 4260
aatctcatcc tcgatgagct ctggcataca aatgataacc agatcgccat cttcaatcgc 4320
ctcaagctgg tcccaaagaa ggtcgatctg tcgcagcaga aggagatccc aacgacactg 4380
gtcgatgact tcatcctctc acctgtcgtg aagaggtcgt tcatccagtc gatcaaggtc 4440
atcaatgcga tcatcaagaa gtacggcctc cctaatgata tcatcatcga gctggcccgc 4500
gagaagaatt caaaggacgc gcagaagatg atcaacgaga tgcagaagag gaatcggcag 4560
acaaacgagc gcatcgagga gatcatccgc acaaccggca aggagaatgc caagtacctg 4620
atcgagaaga tcaagctgca tgacatgcag gagggcaagt gcctctactc actggaggcc 4680
atcccactcg aggacctgct gaataaccca ttcaattacg aggtcgacca tatcatcccg 4740
cgctccgtgt cgttcgacaa ttccttcaat aacaaggtcc tcgtcaagca ggaggagaac 4800
tccaagaagg gcaatcgcac cccgttccag tacctgtcct cttcggacag caagatctct 4860
tacgagacat tcaagaagca catcctcaac ctggccaagg gcaagggccg gatctccaag 4920
accaagaagg agtacctcct ggaggagagg gatatcaacc ggttcagcgt gcagaaggac 4980
ttcatcaatc gcaacctggt cgatacccgg tacgccacca ggggcctcat gaacctgctc 5040
cggtcctact tccgggtgaa caatctcgac gtgaaggtca agagcatcaa cggcggcttc 5100
acctcgttcc tcaggcggaa gtggaagttc aagaaggagc ggaacaaggg ctacaagcac 5160
catgccgagg acgccctcat catcgcgaac gcggacttca tcttcaagga gtggaagaag 5220
ctcgataagg cgaagaaggt catggagaac cagatgttcg aggagaagca ggccgagtcg 5280
atgccagaga tcgagacaga gcaggagtac aaggagatct tcatcacccc gcaccagatc 5340
aagcacatca aggacttcaa ggactacaag tactcccatc gggtcgataa gaagccaaat 5400
cggaagctca tcaatgatac cctctactcg acacgcaagg atgacaaggg caacaccctg 5460
atcgtcaata acctcaatgg cctctacgac aaggataacg acaagctgaa gaagctcatc 5520
aacaagagcc cagagaagct cctcatgtac caccacgatc cgcagacata ccagaagctc 5580
aagctgatca tggagcagta cggcgacgag aagaacccac tctacaagta ctacgaggag 5640
acaggcaact acctgaccaa gtactccaag aaggacaatg gcccagtgat caagaagatc 5700
aagtactacg gcaataagct gaacgcccac ctcgatatca cggacgatta ccctaacagc 5760
cggaataagg tggtcaagct gtccctcaag ccgtaccgct tcgacgtcta cctggataac 5820
ggcgtctaca agttcgtgac agtcaagaat ctcgacgtca tcaagaagga gaactactac 5880
gaggtcaatt ctaagtgcta cgaggaggcc aagaagctca agaagatcag caaccaggcc 5940
gagttcatcg ccagcttcta caagaacgat ctgatcaaga tcaacggcga gctctacagg 6000
gtcatcggcg tgaacaatga cctgctcaat aggatcgagg tgaacatgat cgacatcacc 6060
taccgcgagt acctcgagaa catgaacgat aagcggcctc cacacatcat caagacaatc 6120
gcctctaaga cccagtccat caagaagtac tccacggata tcctcggcaa cctctacgag 6180
gtgaagtcaa agaagcaccc gcagatcatc aagaagggct cggctggagg aggaggcacg 6240
ggaggaggag gctccgccga gtatgtgcgc gcgctcttcg acttcaacgg caatgacgag 6300
gaggatctcc ctttcaagaa gggcgacatc ctccgcatcc gcgataagcc ggaggagcag 6360
tggtggaacg cagaggactc cgagggcaag cggggcatga tcctggtgcc atacgtcgag 6420
aagtacagcg gcgattacaa ggaccacgat ggcgactaca aggatcatga catcgattac 6480
aaggacgatg acgataagtc cggcgtcgac atgacggacg cggagtatgt gcgcatccac 6540
gagaagctcg atatctacac cttcaagaag cagttcttca acaataagaa gtcggtgtcc 6600
catcggtgct acgtcctctt cgagctgaag cgcaggggag agcgccgcgc ctgcttctgg 6660
ggctacgcgg tgaataagcc gcagtcaggc acagagcgcg gcatccacgc cgagatcttc 6720
tcgatccgga aggtcgagga gtacctccgc gacaacccag gccagttcac gatcaattgg 6780
tactccagct ggtccccttg cgcagattgc gcagagaaga tcctcgagtg gtacaaccag 6840
gagctgaggg gcaatggcca taccctcaag atctgggcct gcaagctgta ctacgagaag 6900
aacgcgagga atcagatcgg cctctggaac ctgcgggata atggcgtggg cctcaacgtg 6960
atggtgtccg agcactacca gtgctgccgc aagatcttca tccagtcctc ccacaatcag 7020
ctgaacgaga ataggtggct cgaaaagacc ctgaagcgcg ccgagaagtg gaggagcgag 7080
ctgtctatca tgatccaggt caagatcctg cacaccacaa agtcaccggc ggtgggcggc 7140
ggcggcagcg aattctccgg cggcagcacg aacctcagcg acatcatcga gaaggagaca 7200
ggcaagcagc tcgtgatcca ggagtctatc ctcatgctgc ctgaggaggt ggaggaggtc 7260
atcggcaaca agccggagtc cgatatcctc gtgcacaccg cctacgacga gtcgacagat 7320
gagaatgtca tgctcctgac ctccgacgca ccagagtaca agccatgggc gctcgtgatc 7380
caggattcca acggcgagaa taagatcaag atgctgtctg gcggctcccc gaagaagaag 7440
cgcaaggtct agactagtct gaaatcacca gtctctctct acaaatctat ctctctctat 7500
aataatgtgt gagtagttcc cagataaggg aattagggtt cttatagggt ttcgctcatg 7560
tgttgagcat ataagaaacc cttagtatgt atttgtattt gtaaaatact tctatcaata 7620
aaatttctaa ttcctaaaac caaaatccag tggggcgccc gacctgtact cgcgaaggtt 7680
aacttacaga gagtgtccgg gcgcgcctgg tggatcgtcc gcctaggctg cagtgcagcg 7740
tgacccggtc gtgcccctct ctagagataa tgagcattgc atgtctaagt tataaaaaat 7800
taccacatat tttttttgtc acacttgttt gaagtgcagt ttatctatct ttatacatat 7860
atttaaactt tactctacga ataatataat ctatagtact acaataatat cagtgtttta 7920
gagaatcata taaatgaaca gttagacatg gtctaaagga caattgagta ttttgacaac 7980
aggactctac agttttatct ttttagtgtg catgtgttct cctttttttt tgcaaatagc 8040
ttcacctata taatacttca tccattttat tagtacatcc atttagggtt tagggttaat 8100
ggtttttata gactaatttt tttagtacat ctattttatt ctattttagc ctctaaatta 8160
agaaaactaa aactctattt tagttttttt atttaataat ttagatataa aatagaataa 8220
aataaagtga ctaaaaatta aacaaatacc ctttaagaaa ttaaaaaaac taaggaaaca 8280
tttttcttgt ttcgagtaga taatgccagc ctgttaaacg ccgtcgacga gtctaacgga 8340
caccaaccag cgaaccagca gcgtcgcgtc gggccaagcg aagcagacgg cacggcatct 8400
ctgtcgctgc ctctggaccc ctctcgagag ttccgctcca ccgttggact tgctccgctg 8460
tcggcatcca gaaattgcgt ggcggagcgg cagacgtgag ccggcacggc aggcggcctc 8520
ctcctcctct cacggcaccg gcagctacgg gggattcctt tcccaccgct ccttcgcttt 8580
cccttcctcg cccgccgtaa taaatagaca ccccctccac accctctttc cccaacctcg 8640
tgttgttcgg agcgcacaca cacacaacca gatctccccc aaatccaccc gtcggcacct 8700
ccgcttcaag gtacgccgct cgtcctcccc ccccccccct ctctaccttc tctagatcgg 8760
cgttccggtc catggttagg gcccggtagt tctacttctg ttcatgtttg tgttagatcc 8820
gtgtttgtgt tagatccgtg ctgctagcgt tcgtacacgg atgcgacctg tacgtcagac 8880
acgttctgat tgctaacttg ccagtgtttc tctttgggga atcctgggat ggctctagcc 8940
gttccgcaga cgggatcgat ttcatgattt tttttgtttc gttgcatagg gtttggtttg 9000
cccttttcct ttatttcaat atatgccgtg cacttgtttg tcgggtcatc ttttcatgct 9060
tttttttgtc ttggttgtga tgatgtggtc tggttgggcg gtcgttctag atcggagtag 9120
aattctgttt caaactacct ggtggattta ttaattttgg atctgtatgt gtgtgccata 9180
catattcata gttacgaatt gaagatgatg gatggaaata tcgatctagg ataggtatac 9240
atgttgatgc gggttttact gatgcatata cagagatgct ttttgttcgc ttggttgtga 9300
tgatgtggtg tggttgggcg gtcgttcatt cgttctagat cggagtagaa tactgtttca 9360
aactacctgg tgtatttatt aattttggaa ctgtatgtgt gtgtcataca tcttcatagt 9420
tacgagttta agatggatgg aaatatcgat ctaggatagg tatacatgtt gatgtgggtt 9480
ttactgatgc atatacatga tggcatatgc agcatctatt catatgctct aaccttgagt 9540
acctatctat tataataaac aagtatgttt tataattatt ttgatcttga tatacttgga 9600
tgatggcata tgcagcagct atatgtggat ttttttagcc ctgccttcat acgctattta 9660
tttgcttggt actgtttctt ttgtcgatgc tcaccctgtt gtttggtgtt acttctgcag 9720
gagctcatga aaaagcctga actcaccgcg acgtctgtcg agaagtttct gatcgaaaag 9780
ttcgacagcg tctccgacct gatgcagctc tcggagggcg aagaatctcg tgctttcagc 9840
ttcgatgtag gagggcgtgg atatgtcctg cgggtaaata gctgcgccga tggtttctac 9900
aaagatcgtt atgtttatcg gcactttgca tcggccgcgc tcccgattcc ggaagtgctt 9960
gacattgggg agtttagcga gagcctgacc tattgcatct cccgccgttc acagggtgtc 10020
acgttgcaag acctgcctga aaccgaactg cccgctgttc tacaaccggt cgcggaggct 10080
atggatgcga tcgctgcggc cgatcttagc cagacgagcg ggttcggccc attcggaccg 10140
caaggaatcg gtcaatacac tacatggcgt gatttcatat gcgcgattgc tgatccccat 10200
gtgtatcact ggcaaactgt gatggacgac accgtcagtg cgtccgtcgc gcaggctctc 10260
gatgagctga tgctttgggc cgaggactgc cccgaagtcc ggcacctcgt gcacgcggat 10320
ttcggctcca acaatgtcct gacggacaat ggccgcataa cagcggtcat tgactggagc 10380
gaggcgatgt tcggggattc ccaatacgag gtcgccaaca tcttcttctg gaggccgtgg 10440
ttggcttgta tggagcagca gacgcgctac ttcgagcgga ggcatccgga gcttgcagga 10500
tcgccacgac tccgggcgta tatgctccgc attggtcttg accaactcta tcagagcttg 10560
gttgacggca atttcgatga tgcagcttgg gcgcagggtc gatgcgacgc aatcgtccga 10620
tccggagccg ggactgtcgg gcgtacacaa atcgcccgca gaagcgcggc cgtctggacc 10680
gatggctgtg tagaagtact cgccgatagt ggaaaccgac gccccagcac tcgtccgagg 10740
gcaaagaaat agagtagatg ccgaccggga tctgtcgatc gacaagctcg agtttctcca 10800
taataatgtg tgagtagttc ccagataagg gaattagggt tcctataggg tttcgctcat 10860
gtgttgagca tataagaaac ccttagtatg tatttgtatt tgtaaaatac ttctatcaat 10920
aaaatttcta attcctaaaa ccaaaatcca gtactaaaat ccagatcccc cgaattaatt 10980
cggcgttaat tcagcctgca ggacgcgttt aattaagtgc acgcggccgc ctacttagtc 11040
aagagcctcg cacgcgactg tcacgcggcc aggatcgcct cgtgagcctc gcaatctgta 11100
cctagtgttt aaactatcag tgtttgacag gatatattgg cgggtaaacc taagagaaaa 11160
gagcgtttat tagaataacg gatatttaaa agggcgtgaa aaggtttatc cgttcgtcca 11220
tttgtatgtg catgccaacc acagggttcc cctcgggatc aaagtacttt gatccaaccc 11280
ctccgctgct atagtgcagt cggcttctga cgttcagtgc agccgtcttc tgaaaacgac 11340
atgtcgcaca agtcctaagt tacgcgacag gctgccgccc tgcccttttc ctggcgtttt 11400
cttgtcgcgt gttttagtcg cataaagtag aatacttgcg actagaaccg gagacattac 11460
gccatgaaca agagcgccgc cgctggcctg ctgggctatg cccgcgtcag caccgacgac 11520
caggacttga ccaaccaacg ggccgaactg cacgcggccg gctgcaccaa gctgttttcc 11580
gagaagatca ccggcaccag gcgcgaccgc ccggagctgg ccaggatgct tgaccaccta 11640
cgccctggcg acgttgtgac agtgaccagg ctagaccgcc tggcccgcag cacccgcgac 11700
ctactggaca ttgccgagcg catccaggag gccggcgcgg gcctgcgtag cctggcagag 11760
ccgtgggccg acaccaccac gccggccggc cgcatggtgt tgaccgtgtt cgccggcatt 11820
gccgagttcg agcgttccct aatcatcgac cgcacccgga gcgggcgcga ggccgccaag 11880
gcccgaggcg tgaagtttgg cccccgccct accctcaccc cggcacagat cgcgcacgcc 11940
cgcgagctga tcgaccagga aggccgcacc gtgaaagagg cggctgcact gcttggcgtg 12000
catcgctcga ccctgtaccg cgcacttgag cgcagcgagg aagtgacgcc caccgaggcc 12060
aggcggcgcg gtgccttccg tgaggacgca ttgaccgagg ccgacgccct ggcggccgcc 12120
gagaatgaac gccaagagga acaagcatga aaccgcacca ggacggccag gacgaaccgt 12180
ttttcattac cgaagagatc gaggcggaga tgatcgcggc cgggtacgtg ttcgagccgc 12240
ccgcgcacgt ctcaaccgtg cggctgcatg aaatcctggc cggtttgtct gatgccaagc 12300
tggcggcctg gccggccagc ttggccgctg aagaaaccga gcgccgccgt ctaaaaaggt 12360
gatgtgtatt tgagtaaaac agcttgcgtc atgcggtcgc tgcgtatatg atgcgatgag 12420
taaataaaca aatacgcaag gggaacgcat gaaggttatc gctgtactta accagaaagg 12480
cgggtcaggc aagacgacca tcgcaaccca tctagcccgc gccctgcaac tcgccggggc 12540
cgatgttctg ttagtcgatt ccgatcccca gggcagtgcc cgcgattggg cggccgtgcg 12600
ggaagatcaa ccgctaaccg ttgtcggcat cgaccgcccg acgattgacc gcgacgtgaa 12660
ggccatcggc cggcgcgact tcgtagtgat cgacggagcg ccccaggcgg cggacttggc 12720
tgtgtccgcg atcaaggcag ccgacttcgt gctgattccg gtgcagccaa gcccttacga 12780
catatgggcc accgccgacc tggtggagct ggttaagcag cgcattgagg tcacggatgg 12840
aaggctacaa gcggcctttg tcgtgtcgcg ggcgatcaaa ggcacgcgca tcggcggtga 12900
ggttgccgag gcgctggccg ggtacgagct gcccattctt gagtcccgta tcacgcagcg 12960
cgtgagctac ccaggcactg ccgccgccgg cacaaccgtt cttgaatcag aacccgaggg 13020
cgacgctgcc cgcgaggtcc aggcgctggc cgctgaaatt aaatcaaaac tcatttgagt 13080
taatgaggta aagagaaaat gagcaaaagc acaaacacgc taagtgccgg ccgtccgagc 13140
gcacgcagca gcaaggctgc aacgttggcc agcctggcag acacgccagc catgaagcgg 13200
gtcaactttc agttgccggc ggaggatcac accaagctga agatgtacgc ggtacgccaa 13260
ggcaagacca ttaccgagct gctatctgaa tacatcgcgc agctaccaga gtaaatgagc 13320
aaatgaataa atgagtagat gaattttagc ggctaaagga ggcggcatgg aaaatcaaga 13380
acaaccaggc accgacgccg tggaatgccc catgtgtgga ggaacgggcg gttggccagg 13440
cgtaagcggc tgggttgtct gccggccctg caatggcact ggaaccccca agcccgagga 13500
atcggcgtga cggtcgcaaa ccatccggcc cggtacaaat cggcgcggcg ctgggtgatg 13560
acctggtgga gaagttgaag gccgcgcagg ccgcccagcg gcaacgcatc gaggcagaag 13620
cacgccccgg tgaatcgtgg caagcggccg ctgatcgaat ccgcaaagaa tcccggcaac 13680
cgccggcagc cggtgcgccg tcgattagga agccgcccaa gggcgacgag caaccagatt 13740
ttttcgttcc gatgctctat gacgtgggca cccgcgatag tcgcagcatc atggacgtgg 13800
ccgttttccg tctgtcgaag cgtgaccgac gagctggcga ggtgatccgc tacgagcttc 13860
cagacgggca cgtagaggtt tccgcagggc cggccggcat ggccagtgtg tgggattacg 13920
acctggtact gatggcggtt tcccatctaa ccgaatccat gaaccgatac cgggaaggga 13980
agggagacaa gcccggccgc gtgttccgtc cacacgttgc ggacgtactc aagttctgcc 14040
ggcgagccga tggcggaaag cagaaagacg acctggtaga aacctgcatt cggttaaaca 14100
ccacgcacgt tgccatgcag cgtacgaaga aggccaagaa cggccgcctg gtgacggtat 14160
ccgagggtga agccttgatt agccgctaca agatcgtaaa gagcgaaacc gggcggccgg 14220
agtacatcga gatcgagcta gctgattgga tgtaccgcga gatcacagaa ggcaagaacc 14280
cggacgtgct gacggttcac cccgattact ttttgatcga tcccggcatc ggccgttttc 14340
tctaccgcct ggcacgccgc gccgcaggca aggcagaagc cagatggttg ttcaagacga 14400
tctacgaacg cagtggcagc gccggagagt tcaagaagtt ctgtttcacc gtgcgcaagc 14460
tgatcgggtc aaatgacctg ccggagtacg atttgaagga ggaggcgggg caggctggcc 14520
cgatcctagt catgcgctac cgcaacctga tcgagggcga agcatccgcc ggttcctaat 14580
gtacggagca gatgctaggg caaattgccc tagcagggga aaaaggtcga aaaggtctct 14640
ttcctgtgga tagcacgtac attgggaacc caaagccgta cattgggaac cggaacccgt 14700
acattgggaa cccaaagccg tacattggga accggtcaca catgtaagtg actgatataa 14760
aagagaaaaa aggcgatttt tccgcctaaa actctttaaa acttattaaa actcttaaaa 14820
cccgcctggc ctgtgcataa ctgtctggcc agcgcacagc cgaagagctg caaaaagcgc 14880
ctacccttcg gtcgctgcgc tccctacgcc ccgccgcttc gcgtcggcct atcgcggccg 14940
ctggccgctc aaaaatggct ggcctacggc caggcaatct accagggcgc ggacaagccg 15000
cgccgtcgcc actcgaccgc cggcgcccac atcaaggcac cctgcctcgc gcgtttcggt 15060
gatgacggtg aaaacctctg acacatgcag ctcccggaga cggtcacagc ttgtctgtaa 15120
gcggatgccg ggagcagaca agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg 15180
ggcgcagcca tgacccagtc acgtagcgat agcggagtgt atactggctt aactatgcgg 15240
catcagagca gattgtactg agagtgcacc atatgcggtg tgaaataccg cacagatgcg 15300
taaggagaaa ataccgcatc aggcgctctt ccgcttcctc gctcactgac tcgctgcgct 15360
cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca 15420
cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga 15480
accgtaaaaa ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc 15540
acaaaaatcg acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg 15600
cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat 15660
acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt 15720
atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc 15780
agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg 15840
acttatcgcc actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg 15900
gtgctacaga gttcttgaag tggtggccta actacggcta cactagaagg acagtatttg 15960
gtatctgcgc tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg 16020
gcaaacaaac caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca 16080
gaaaaaaagg atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga 16140
acgaaaactc acgttaaggg attttggtca tgcattctag gtactaaaac aattcatcca 16200
gtaaaatata atattttatt ttctcccaat caggcttgat ccccagtaag tcaaaaaata 16260
gctcgacata ctgttcttcc ccgatatcct ccctgatcga ccggacgcag aaggcaatgt 16320
cataccactt gtccgccctg ccgcttctcc caagatcaat aaagccactt actttgccat 16380
ctttcacaaa gatgttgctg tctcccaggt cgccgtggga aaagacaagt tcctcttcgg 16440
gcttttccgt ctttaaaaaa tcatacagct cgcgcggatc tttaaatgga gtgtcttctt 16500
cccagttttc gcaatccaca tcggccagat cgttattcag taagtaatcc aattcggcta 16560
agcggctgtc taagctattc gtatagggac aatccgatat gtcgatggag tgaaagagcc 16620
tgatgcactc cgcatacagc tcgataatct tttcagggct ttgttcatct tcatactctt 16680
ccgagcaaag gacgccatcg gcctcactca tgagcagatt gctccagcca tcatgccgtt 16740
caaagtgcag gacctttgga acaggcagct ttccttccag ccatagcatc atgtcctttt 16800
cccgttccac atcataggtg gtccctttat accggctgtc cgtcattttt aaatataggt 16860
tttcattttc tcccaccagc ttatatacct tagcaggaga cattccttcc gtatctttta 16920
cgcagcggta tttttcgatc agttttttca attccggtga tattctcatt ttagccattt 16980
attatttcct tcctcttttc tacagtattt aaagataccc caagaagcta attataacaa 17040
gacgaactcc aattcactgt tccttgcatt ctaaaacctt aaataccaga aaacagcttt 17100
ttcaaagttg ttttcaaagt tggcgtataa catagtatcg acggagccga ttttgaaacc 17160
gcggtgatca caggcagcaa cgctctgtca tcgttacaat caacatgcta ccctccgcga 17220
gatcatccgt gtttcaaacc cggcagctta gttgccgttc ttccgaatag catcggtaac 17280
atgagcaaag tctgccgcct tacaacggct ctcccgctga cgccgtcccg gactgatggg 17340
ctgcctgtat cgagtggtga ttttgtgccg agctgccggt cggggagctg ttggctggct 17400
<210>2
<211>1071
<212>PRT
<213>Artificial Sequence
<400>2
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Lys Arg Asn Tyr Ile Leu Gly Leu Ala Ile Gly Ile Thr Ser Val
20 25 30
Gly Tyr Gly Ile Ile Asp Tyr Glu Thr Arg Asp Val Ile Asp Ala Gly
35 40 45
Val Arg Leu Phe Lys Glu Ala Asn Val Glu Asn Asn Glu Gly Arg Arg
50 55 60
Ser Lys Arg Gly Ala Arg Arg Leu Lys Arg Arg Arg Arg His Arg Ile
65 70 75 80
Gln Arg Val Lys Lys Leu Leu Phe Asp Tyr Asn Leu Leu Thr Asp His
85 90 95
Ser Glu Leu Ser Gly Ile Asn Pro Tyr Glu Ala Arg Val Lys Gly Leu
100 105 110
Ser Gln Lys Leu Ser Glu Glu Glu Phe Ser Ala Ala Leu Leu His Leu
115 120 125
Ala Lys Arg Arg Gly Val His Asn Val Asn Glu Val Glu Glu Asp Thr
130 135 140
Gly Asn Glu Leu Ser Thr Lys Glu Gln Ile Ser Arg Asn Ser Lys Ala
145 150 155 160
Leu Glu Glu Lys Tyr Val Ala Glu Leu Gln Leu Glu Arg Leu Lys Lys
165 170 175
Asp Gly Glu Val Arg Gly Ser Ile Asn Arg Phe Lys Thr Ser Asp Tyr
180 185 190
Val Lys Glu Ala Lys Gln Leu Leu Lys Val Gln Lys Ala Tyr His Gln
195 200 205
Leu Asp Gln Ser Phe Ile Asp Thr Tyr Ile Asp Leu Leu Glu Thr Arg
210 215 220
Arg Thr Tyr Tyr Glu Gly Pro Gly Glu Gly Ser Pro Phe Gly Trp Lys
225 230 235 240
Asp Ile Lys Glu Trp Tyr Glu Met Leu Met Gly His Cys Thr Tyr Phe
245 250 255
Pro Glu Glu Leu Arg Ser Val Lys Tyr Ala Tyr Asn Ala Asp Leu Tyr
260 265 270
Asn Ala Leu Asn Asp Leu Asn Asn Leu Val Ile Thr Arg Asp Glu Asn
275 280 285
Glu Lys Leu Glu Tyr Tyr Glu Lys Phe Gln Ile Ile Glu Asn Val Phe
290 295 300
Lys Gln Lys Lys Lys Pro Thr Leu Lys Gln Ile Ala Lys Glu Ile Leu
305 310 315 320
Val Asn Glu Glu Asp Ile Lys Gly Tyr Arg Val Thr Ser Thr Gly Lys
325 330 335
Pro Glu Phe Thr Asn Leu Lys Val Tyr His Asp Ile Lys Asp Ile Thr
340 345 350
Ala Arg Lys Glu Ile Ile Glu Asn Ala Glu Leu Leu Asp Gln Ile Ala
355 360 365
Lys Ile Leu Thr Ile Tyr Gln Ser Ser Glu Asp Ile Gln Glu Glu Leu
370 375 380
Thr Asn Leu Asn Ser Glu Leu Thr Gln Glu Glu Ile Glu Gln Ile Ser
385 390 395 400
Asn Leu Lys Gly Tyr Thr Gly Thr His Asn Leu Ser Leu Lys Ala Ile
405 410 415
Asn Leu Ile Leu Asp Glu Leu Trp His Thr Asn Asp Asn Gln Ile Ala
420 425 430
Ile Phe Asn Arg Leu Lys Leu Val Pro Lys Lys Val Asp Leu Ser Gln
435 440 445
Gln Lys Glu Ile Pro Thr Thr Leu Val Asp Asp Phe Ile Leu Ser Pro
450 455 460
Val Val Lys Arg Ser Phe Ile Gln Ser Ile Lys Val Ile Asn Ala Ile
465 470 475 480
Ile Lys Lys Tyr Gly Leu Pro Asn Asp Ile Ile Ile Glu Leu Ala Arg
485 490 495
Glu Lys Asn Ser Lys Asp Ala Gln Lys Met Ile Asn Glu Met Gln Lys
500 505 510
Arg Asn Arg Gln Thr Asn Glu Arg Ile Glu Glu Ile Ile Arg Thr Thr
515 520 525
Gly Lys Glu Asn Ala Lys Tyr Leu Ile Glu Lys Ile Lys Leu His Asp
530 535 540
Met Gln Glu Gly Lys Cys Leu Tyr Ser Leu Glu Ala Ile Pro Leu Glu
545 550 555 560
Asp Leu Leu Asn Asn Pro Phe Asn Tyr Glu Val Asp His Ile Ile Pro
565 570 575
Arg Ser Val Ser Phe Asp Asn Ser Phe Asn Asn Lys Val Leu Val Lys
580 585 590
Gln Glu Glu Asn Ser Lys Lys Gly Asn Arg Thr Pro Phe Gln Tyr Leu
595 600 605
Ser Ser Ser Asp Ser Lys Ile Ser Tyr Glu Thr Phe Lys Lys His Ile
610 615 620
Leu Asn Leu Ala Lys Gly Lys Gly Arg Ile Ser Lys Thr Lys Lys Glu
625 630 635 640
Tyr Leu Leu Glu Glu Arg Asp Ile Asn Arg Phe Ser Val Gln Lys Asp
645 650 655
Phe Ile Asn Arg Asn Leu Val Asp Thr Arg Tyr Ala Thr Arg Gly Leu
660 665 670
Met Asn Leu Leu Arg Ser Tyr Phe Arg Val Asn Asn Leu Asp Val Lys
675 680 685
Val Lys Ser Ile Asn Gly Gly Phe Thr Ser Phe Leu Arg Arg Lys Trp
690 695 700
Lys Phe Lys Lys Glu Arg Asn Lys Gly Tyr Lys His His Ala Glu Asp
705 710 715 720
Ala Leu Ile Ile Ala Asn Ala Asp Phe Ile Phe Lys Glu Trp Lys Lys
725 730 735
Leu Asp Lys Ala Lys Lys Val Met Glu Asn Gln Met Phe Glu Glu Lys
740 745 750
Gln Ala Glu Ser Met Pro Glu Ile Glu Thr Glu Gln Glu Tyr Lys Glu
755 760 765
Ile Phe Ile Thr Pro His Gln Ile Lys His Ile Lys Asp Phe Lys Asp
770 775 780
Tyr Lys Tyr Ser His Arg Val Asp Lys Lys Pro Asn Arg Lys Leu Ile
785 790 795 800
Asn Asp Thr Leu Tyr Ser Thr Arg Lys Asp Asp Lys Gly Asn Thr Leu
805 810 815
Ile Val Asn Asn Leu Asn Gly Leu Tyr Asp Lys Asp Asn Asp Lys Leu
820 825 830
Lys Lys Leu Ile Asn Lys Ser Pro Glu Lys Leu Leu Met Tyr His His
835 840 845
Asp Pro Gln Thr Tyr Gln Lys Leu Lys Leu Ile Met Glu Gln Tyr Gly
850 855 860
Asp Glu Lys Asn Pro Leu Tyr Lys Tyr Tyr Glu Glu Thr Gly Asn Tyr
865 870 875 880
Leu Thr Lys Tyr Ser Lys Lys Asp Asn Gly Pro Val Ile Lys Lys Ile
885 890 895
Lys Tyr Tyr Gly Asn Lys Leu Asn Ala His Leu Asp Ile Thr Asp Asp
900 905 910
Tyr Pro Asn Ser Arg Asn Lys Val Val Lys Leu Ser Leu Lys Pro Tyr
915 920 925
Arg Phe Asp Val Tyr Leu Asp Asn Gly Val Tyr Lys Phe Val Thr Val
930 935 940
Lys Asn Leu Asp Val Ile Lys Lys Glu Asn Tyr Tyr Glu Val Asn Ser
945 950 955 960
Lys Cys Tyr Glu Glu Ala Lys Lys Leu Lys Lys Ile Ser Asn Gln Ala
965 970 975
Glu Phe Ile Ala Ser Phe Tyr Lys Asn Asp Leu Ile Lys Ile Asn Gly
980 985 990
Glu Leu Tyr Arg Val Ile Gly Val Asn Asn Asp Leu Leu Asn Arg Ile
995 1000 1005
Glu Val Asn Met Ile Asp Ile Thr Tyr Arg Glu Tyr Leu Glu Asn
1010 1015 1020
Met Asn Asp Lys Arg Pro Pro His Ile Ile Lys Thr Ile Ala Ser
1025 1030 1035
Lys Thr Gln Ser Ile Lys Lys Tyr Ser Thr Asp Ile Leu Gly Asn
1040 1045 1050
Leu Tyr Glu Val Lys Ser Lys Lys His Pro Gln Ile Ile Lys Lys
1055 1060 1065
Gly Ser Ala
1070
<210>3
<211>208
<212>PRT
<213>Artificial Sequence
<400>3
Met Thr Asp Ala Glu Tyr Val Arg Ile His Glu Lys Leu Asp Ile Tyr
1 5 10 15
Thr Phe Lys Lys Gln Phe Phe Asn Asn Lys Lys Ser Val Ser His Arg
20 25 30
Cys Tyr Val Leu Phe Glu Leu Lys Arg Arg Gly Glu Arg Arg Ala Cys
35 40 45
Phe Trp Gly Tyr Ala Val Asn Lys Pro Gln Ser Gly Thr Glu Arg Gly
50 55 60
Ile His Ala Glu Ile Phe Ser Ile Arg Lys Val Glu Glu Tyr Leu Arg
65 70 75 80
Asp Asn Pro Gly Gln Phe Thr Ile Asn Trp Tyr Ser Ser Trp Ser Pro
85 90 95
Cys Ala Asp Cys Ala Glu Lys Ile Leu Glu Trp Tyr Asn Gln Glu Leu
100 105 110
Arg Gly Asn Gly His Thr Leu Lys Ile Trp Ala Cys Lys Leu Tyr Tyr
115 120 125
Glu Lys Asn Ala Arg Asn Gln Ile Gly Leu Trp Asn Leu Arg Asp Asn
130 135 140
Gly Val Gly Leu Asn Val Met Val Ser Glu His Tyr Gln Cys Cys Arg
145 150 155 160
Lys Ile Phe Ile Gln Ser Ser His Asn Gln Leu Asn Glu Asn Arg Trp
165 170 175
Leu Glu Lys Thr Leu Lys Arg Ala Glu Lys Trp Arg Ser Glu Leu Ser
180 185 190
Ile Met Ile Gln Val Lys Ile Leu His Thr Thr Lys Ser Pro Ala Val
195 200 205
<210>4
<211>98
<212>PRT
<213>Artificial Sequence
<400>4
Ser Gly Gly Ser Thr Asn Leu Ser Asp Ile Ile Glu Lys Glu Thr Gly
1 5 10 15
Lys Gln Leu Val Ile Gln Glu Ser Ile Leu Met Leu Pro Glu Glu Val
20 25 30
Glu Glu Val Ile Gly Asn Lys Pro Glu Ser Asp Ile Leu Val His Thr
35 40 45
Ala Tyr Asp Glu Ser Thr Asp Glu Asn Val Met Leu Leu Thr Ser Asp
50 55 60
Ala Pro Glu Tyr Lys Pro Trp Ala Leu Val Ile Gln Asp Ser Asn Gly
65 70 75 80
Glu Asn Lys Ile Lys Met Leu Ser Gly Gly Ser Pro Lys Lys Lys Arg
85 90 95
Lys Val
<210>5
<211>522
<212>DNA
<213>Artificial Sequence
<400>5
aacaaagcac cagtggtcta gtggtagaat agtaccctgc cacggtacag acccgggttc 60
gattcccggc tggtgcagat gccacacagc aaggagtgtt ttagtactct ggaaacagaa 120
tctactaaaa caaggcaaaa tgccgtgttt atctcgtcaa cttgttggcg agataacaaa 180
gcaccagtgg tctagtggta gaatagtacc ctgccacggt acagacccgg gttcgattcc 240
cggctggtgc acagaaccga caacagatga ggttttagta ctctggaaac agaatctact 300
aaaacaaggc aaaatgccgt gtttatctcg tcaacttgtt ggcgagataa caaagcacca 360
gtggtctagt ggtagaatag taccctgcca cggtacagac ccgggttcga ttcccggctg 420
gtgcaccagc tcatttggct cggcggtttt agtactctgg aaacagaatc tactaaaaca 480
aggcaaaatg ccgtgtttat ctcgtcaact tgttggcgag at 522
<210>6
<211>83
<212>DNA
<213>Artificial Sequence
<400>6
gttttagtac tctgctggaa acagcagaat ctactaaaac aaggcaaaat gccgtgttta 60
tctcgtcaac ttgttggcga gat 83
<210>7
<211>93
<212>DNA
<213>Artificial Sequence
<400>7
gttttagtac tctgtaattt tagaaataaa attacagaat ctactaaaac aaggcaaaat 60
gccgtgttta tctcgtcaac ttgttggcga gat 93
<210>8
<211>174
<212>DNA
<213>Artificial Sequence
<400>8
aacaaagcac cagtggtcta gtggtagaat agtaccctgc cacggtacag acccgggttc 60
gattcccggc tggtgcatcc tcggcgtagt acgggctgtt ttagtactct ggaaacagaa 120
tctactaaaa caaggcaaaa tgccgtgttt atctcgtcaa cttgttggcg agat 174
<210>9
<211>77
<212>RNA
<213>Artificial Sequence
<400>9
guuuuaguac ucuggaaaca gaaucuacua aaacaaggca aaaugccgug uuuaucucgu 60
caacuuguug gcgagau 77
<210>10
<211>83
<212>RNA
<213>Artificial Sequence
<400>10
guuuuaguac ucugcuggaa acagcagaau cuacuaaaac aaggcaaaau gccguguuua 60
ucucgucaac uuguuggcga gau 83
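The listing above prints residues in space-separated groups of ten, with a running residue count at the end of each line, and declares the total length in the <211> field. A minimal Python sketch of how such lines could be joined and checked against the declared length is given below, using the two lines of sequence 9. The function name `parse_listing` is hypothetical and is not part of the patent.

```python
import re

def parse_listing(lines):
    """Join sequence-listing lines into one residue string.

    Assumes each line ends with a running residue count (e.g. " 60")
    and that residues are printed in space-separated groups.
    """
    parts = []
    for line in lines:
        body = re.sub(r"\s+\d+\s*$", "", line)  # drop the trailing count
        parts.append(body.replace(" ", ""))     # drop the group spacing
    return "".join(parts)

# The two lines of sequence 9 (<211> 77, <212> RNA) as printed above.
seq9_lines = [
    "guuuuaguac ucuggaaaca gaaucuacua aaacaaggca aaaugccgug uuuaucucgu 60",
    "caacuuguug gcgagau 77",
]
seq9 = parse_listing(seq9_lines)
declared_length = 77  # value of the <211> field for sequence 9

print(len(seq9) == declared_length)
```

The same check can be applied to any other entry by substituting its lines and its <211> value.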

Claims (10)

1. A kit comprising an sgRNA or a biological material associated with the sgRNA, a Cas9 nuclease or a biological material associated with the Cas9 nuclease, and a cytosine deaminase or a biological material associated with the cytosine deaminase;
the sgRNA targets a target sequence;
the sgRNA is as shown in formula I: RNA transcribed from the target sequence-engineered sgRNA backbone (formula I);
the engineered sgRNA backbone is an RNA molecule obtained by inserting an RNA fragment A between positions 14 and 15 of the sgRNA backbone and inserting an RNA fragment B between positions 18 and 19 of the sgRNA backbone;
the RNA fragment A and the RNA fragment B are reverse complementary to each other;
the RNA fragment A and the RNA fragment B are each 3 nt in length;
the sgRNA backbone is m1) or m2) or m3):
m1) the RNA molecule shown in sequence 9;
m2) an RNA molecule obtained by substitution and/or deletion and/or addition of one or more nucleotides in the RNA molecule shown in m1) and having the same function;
m3) an RNA molecule that has sequence identity with the RNA molecule defined in m1) or m2) and has the same function.
2. The kit of claim 1, wherein: the engineered sgRNA backbone is n1) or n2) or n3):
n1) the RNA molecule shown in sequence 10;
n2) an RNA molecule obtained by substitution and/or deletion and/or addition of one or more nucleotides in the RNA molecule shown in n1) and having the same function;
n3) an RNA molecule that has sequence identity with the RNA molecule defined in n1) or n2) and has the same function.
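The construction recited in claims 1 and 2 can be checked against the sequence listing. The Python sketch below inserts a 3-nt fragment A between positions 14 and 15 of sequence 9 and its reverse complement (fragment B) between positions 18 and 19, then compares the result with sequence 10. The fragment value `cug` is not stated in the claims; it is inferred here by comparing sequences 9 and 10. The function names are illustrative only.

```python
# Sequences 9 and 10 from the sequence listing, with spacing and counters removed.
SEQ9 = ("guuuuaguacucuggaaacagaaucuacuaaaacaaggcaaaaugccguguuuaucucgu"
        "caacuuguuggcgagau")
SEQ10 = ("guuuuaguacucugcuggaaacagcagaaucuacuaaaacaaggcaaaaugccguguuua"
         "ucucgucaacuuguuggcgagau")

def reverse_complement_rna(rna: str) -> str:
    """Return the reverse complement of a lowercase RNA string."""
    pairs = {"a": "u", "u": "a", "g": "c", "c": "g"}
    return "".join(pairs[base] for base in reversed(rna))

def engineer_backbone(backbone: str, fragment_a: str) -> str:
    """Insert fragment A after position 14 and fragment B after position 18.

    Positions are 1-based as in the claims, so the backbone is split into
    residues 1-14, 15-18 and 19-end.
    """
    fragment_b = reverse_complement_rna(fragment_a)
    return backbone[:14] + fragment_a + backbone[14:18] + fragment_b + backbone[18:]

engineered = engineer_backbone(SEQ9, "cug")  # "cug" is inferred, see above
print(engineered == SEQ10)
print(len(SEQ9), len(engineered))
```

With this choice of fragment A, the two 3-nt insertions flank residues 15 to 18 of the original backbone and account for the 6-nt length difference between sequence 9 (77 nt) and sequence 10 (83 nt).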
3. The kit of claim 1 or 2, wherein: the Cas9 nuclease is a SaKKHn protein;
the SaKKHn protein is E1) or E2) or E3):
E1) a protein whose amino acid sequence is shown in sequence 2;
E2) a protein obtained by substitution and/or deletion and/or addition of one or more amino acid residues in the amino acid sequence shown in sequence 2 of the sequence listing and having the same function;
E3) a fusion protein obtained by attaching a tag to the N-terminus and/or the C-terminus of E1) or E2);
the biological material related to the SaKKHn protein is any one of F1) to F5):
F1) a nucleic acid molecule encoding the SaKKHn protein;
F2) an expression cassette comprising the nucleic acid molecule of F1);
F3) a recombinant vector comprising the nucleic acid molecule of F1) or a recombinant vector comprising the expression cassette of F2);
F4) a recombinant microorganism comprising the nucleic acid molecule of F1), or a recombinant microorganism comprising the expression cassette of F2), or a recombinant microorganism comprising the recombinant vector of F3);
F5) a transgenic cell line comprising the nucleic acid molecule of F1) or a transgenic cell line comprising the expression cassette of F2).
4. The kit of claim 1 or 2, wherein: the cytosine deaminase is PmCDA1 protein;
the PmCDA1 protein is G1) or G2) or G3):
G1) a protein whose amino acid sequence is shown in sequence 3;
G2) a protein obtained by substitution and/or deletion and/or addition of one or more amino acid residues in the amino acid sequence shown in sequence 3 of the sequence listing and having the same function;
G3) a fusion protein obtained by attaching a tag to the N-terminus and/or the C-terminus of G1) or G2);
the biological material related to the PmCDA1 protein is any one of H1) to H5):
H1) a nucleic acid molecule encoding the PmCDA1 protein;
H2) an expression cassette comprising the nucleic acid molecule of H1);
H3) a recombinant vector comprising the nucleic acid molecule of H1) or a recombinant vector comprising the expression cassette of H2);
H4) a recombinant microorganism comprising the nucleic acid molecule of H1), or a recombinant microorganism comprising the expression cassette of H2), or a recombinant microorganism comprising the recombinant vector of H3);
H5) a transgenic cell line comprising the nucleic acid molecule of H1) or a transgenic cell line comprising the expression cassette of H2).
5. The kit of any one of claims 1 to 4, wherein: the sgRNA is a tRNA-sgRNA;
the tRNA-sgRNA is as shown in formula I: tRNA-RNA transcribed from the target sequence-engineered sgRNA backbone (formula I);
the tRNA is 1) or 2) or 3):
1) an RNA molecule obtained by replacing each T with U in positions 474-550 of sequence 1;
2) an RNA molecule obtained by substitution and/or deletion and/or addition of one or more nucleotides in the RNA molecule shown in 1) and having the same function;
3) an RNA molecule having 75% or more identity with the nucleotide sequence defined in 1) or 2) and having the same function.
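The tRNA-target-backbone layout in claim 5 can be illustrated with sequence 8 of the listing. The claims do not say what sequence 8 encodes, so treating it as one tRNA-sgRNA unit is an assumption made here for illustration. The split lengths are also assumptions: 77 nt for the tRNA (positions 474-550 span 77 residues) and 77 nt for the unmodified backbone of sequence 9, which leaves a 20-nt middle segment as the target-derived part. The function names are hypothetical.

```python
# Sequence 8 (DNA, 174 nt) from the listing, one string per printed line.
SEQ8 = ("aacaaagcaccagtggtctagtggtagaatagtaccctgccacggtacagacccgggttc"
        "gattcccggctggtgcatcctcggcgtagtacgggctgttttagtactctggaaacagaa"
        "tctactaaaacaaggcaaaatgccgtgtttatctcgtcaacttgttggcgagat")

# Sequence 9 (RNA, 77 nt): the unmodified sgRNA backbone.
SEQ9 = ("guuuuaguacucuggaaacagaaucuacuaaaacaaggcaaaaugccguguuuaucucgu"
        "caacuuguuggcgagau")

def transcribe(dna: str) -> str:
    """Write a lowercase DNA string as RNA by replacing t with u."""
    return dna.replace("t", "u")

def split_unit(dna_unit: str, trna_len: int = 77, backbone_len: int = 77):
    """Split a unit into (tRNA, target, backbone) RNA segments.

    The default lengths are assumptions taken from claim 5 and sequence 9.
    """
    rna = transcribe(dna_unit)
    cut = len(rna) - backbone_len
    return rna[:trna_len], rna[trna_len:cut], rna[cut:]

trna, target, backbone = split_unit(SEQ8)
print(len(trna), len(target), len(backbone))
print(backbone == SEQ9)
```

Under these assumptions the last 77 residues of sequence 8, written as RNA, coincide with sequence 9, which is consistent with the order tRNA, target, backbone given in the formula.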
6. The kit of any one of claims 1 to 5, wherein: the kit further comprises a UGI protein or a biological material associated with the UGI protein;
the UGI protein is I1) or I2) or I3):
I1) a protein whose amino acid sequence is shown in sequence 4;
I2) a protein obtained by substitution and/or deletion and/or addition of one or more amino acid residues in the amino acid sequence shown in sequence 4 of the sequence listing and having the same function;
I3) a fusion protein obtained by attaching a tag to the N-terminus and/or the C-terminus of I1) or I2);
the biological material related to the UGI protein is any one of J1) to J5):
J1) a nucleic acid molecule encoding the UGI protein;
J2) an expression cassette comprising the nucleic acid molecule of J1);
J3) a recombinant vector comprising the nucleic acid molecule of J1) or a recombinant vector comprising the expression cassette of J2);
J4) a recombinant microorganism comprising the nucleic acid molecule of J1), or a recombinant microorganism comprising the expression cassette of J2), or a recombinant microorganism comprising the recombinant vector of J3);
J5) a transgenic cell line comprising the nucleic acid molecule of J1) or a transgenic cell line comprising the expression cassette of J2).
7. The sgRNA of any one of claims 1-6 or the engineered sgRNA backbone of any one of claims 1-6.
8. Use of the kit of any one of claims 1-6, or the sgRNA of claim 7, or the engineered sgRNA backbone of claim 7, in any one of X1) to X4):
X1) editing a target sequence in the genome of an organism or a cell of an organism;
X2) preparing a product for editing a target sequence in the genome of an organism or a cell of an organism;
X3) increasing the efficiency of editing a target sequence in the genome of an organism or a cell of an organism;
X4) preparing a product for increasing the efficiency of editing a target sequence in the genome of an organism or a cell of an organism.
9. Y1) or Y2):
Y1) a method of increasing the efficiency of editing a genomic target sequence of an organism or a cell of an organism, comprising expressing the sgRNA of any one of claims 1 to 6, the Cas9 nuclease of any one of claims 1 to 6, and the cytosine deaminase of any one of claims 1 to 6 in the organism or the cell of the organism to effect editing of the genomic target sequence; the sgRNA targets the target sequence;
Y2) a method for producing a biological mutant, comprising the following steps: editing the genome of the organism according to the method of Y1) to obtain the biological mutant.
10. The kit of any one of claims 1 to 6 or the use of claim 8 or the method of claim 9, wherein:
the editing of the genomic target sequence mutates C in the target sequence to T;
and/or, the organism is S1) or S2) or S3) or S4):
S1) a plant or an animal;
S2) a monocotyledonous or dicotyledonous plant;
S3) a gramineous plant;
S4) rice;
and/or, the cell of the organism is T1) or T2) or T3) or T4):
T1) a plant cell or an animal cell;
T2) a monocotyledonous or dicotyledonous plant cell;
T3) a gramineous plant cell;
T4) a rice cell.
CN201911200779.0A 2019-11-29 2019-11-29 Efficient sgRNA and application thereof in gene editing Active CN110835630B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911200779.0A CN110835630B (en) 2019-11-29 2019-11-29 Efficient sgRNA and application thereof in gene editing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911200779.0A CN110835630B (en) 2019-11-29 2019-11-29 Efficient sgRNA and application thereof in gene editing

Publications (2)

Publication Number Publication Date
CN110835630A (en) 2020-02-25
CN110835630B (en) 2023-01-03

Family

ID=69577858

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911200779.0A Active CN110835630B (en) 2019-11-29 2019-11-29 Efficient sgRNA and application thereof in gene editing

Country Status (1)

Country Link
CN (1) CN110835630B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023005935A1 (en) * 2021-07-30 2023-02-02 中国科学院天津工业生物技术研究所 Method for reducing editing window of base editor, base editor and use

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109652440A (en) * 2018-12-28 2019-04-19 北京市农林科学院 Application of the VQRn-Cas9&PmCDA1&UGI base editing system in plant gene editing

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
BENJAMIN P KLEINSTIVER et al.: "Broadening the targeting range of Staphylococcus aureus CRISPR-Cas9 by modifying PAM recognition", NATURE BIOTECHNOLOGY *
F. ANN RAN et al.: "In vivo genome editing using Staphylococcus aureus Cas9", NATURE *
YING WU et al.: "Increasing Cytosine Base Editing Scope and Efficiency With Engineered Cas9-PmCDA1 Fusions and the modified sgRNA in Rice", FRONTIERS IN GENETICS *

Also Published As

Publication number Publication date
CN110835630B (en) 2023-01-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant