CN110835630B - Efficient sgRNA and application thereof in gene editing - Google Patents

Info

Publication number
CN110835630B
Authority
CN
China
Prior art keywords
sequence
sgrna
plant
lys
sakkhn
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911200779.0A
Other languages
Chinese (zh)
Other versions
CN110835630A (en)
Inventor
张成伟
徐雯
刘亚
赵思
杨进孝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Academy of Agriculture and Forestry Sciences
Original Assignee
Beijing Academy of Agriculture and Forestry Sciences
Application filed by Beijing Academy of Agriculture and Forestry Sciences
Priority to CN201911200779.0A
Publication of CN110835630A
Application granted
Publication of CN110835630B

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/11: DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113: Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79: Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82: Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8216: Methods for controlling, regulating or enhancing expression of transgenes in plant cells
    • C12N15/8218: Antisense, co-suppression, viral induced gene silencing [VIGS], post-transcriptional induced gene silencing [PTGS]
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14: Hydrolases (3)
    • C12N9/16: Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22: Ribonucleases RNAses, DNAses
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14: Hydrolases (3)
    • C12N9/78: Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y: ENZYMES
    • C12Y305/00: Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5)
    • C12Y305/04: Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5) in cyclic amidines (3.5.4)
    • C12Y305/04001: Cytosine deaminase (3.5.4.1)
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00: Structure or type of the nucleic acid
    • C12N2310/10: Type of nucleic acid
    • C12N2310/20: Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Wood Science & Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Zoology (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • Microbiology (AREA)
  • Plant Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Medicinal Chemistry (AREA)
  • Cell Biology (AREA)
  • Virology (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

The invention provides an efficient sgRNA and its application in gene editing. The sgRNA has the structure shown in formula I: RNA transcribed from the target sequence-modified sgRNA backbone (formula I). The modified sgRNA backbone is an RNA molecule obtained by inserting an RNA fragment A between positions 14 and 15 of the sgRNA backbone and inserting an RNA fragment B between positions 18 and 19 of the sgRNA backbone; RNA fragment A and RNA fragment B are reverse complementary to each other and are each 3 nt long; the sgRNA backbone is the RNA molecule shown in sequence 9. Experiments show that the modified sgRNA significantly improves the C-to-T base substitution efficiency of a cytosine base editor (CBE), with a highest efficiency of 86.4%.

Description

Efficient sgRNA and application thereof in gene editing
Technical Field
The invention belongs to the technical field of biology, and particularly relates to an efficient sgRNA and application thereof in gene editing.
Background
CRISPR-Cas9 technology has become a powerful genome editing tool and is widely applied in many tissues and cells. The CRISPR/Cas9 protein-RNA complex is directed to the target by a guide RNA and cleaves it to generate a DNA double-strand break (DSB); the organism then initiates its DNA repair mechanisms to repair the DSB. There are generally two repair mechanisms: non-homologous end joining (NHEJ) and homology-directed repair (HDR). NHEJ usually dominates, so repair produces random indels (insertions or deletions) far more often than precise repair. Using HDR to achieve precise base replacement is therefore greatly limited, because HDR is inefficient and requires a DNA template.
In 2016, the laboratories of David Liu and Akihiko Kondo each reported a different type of cytosine base editor (CBE), using two different cytidine deaminases: rAPOBEC1 (rat APOBEC1) and PmCDA1 (an activation-induced cytidine deaminase (AID) ortholog from sea lamprey). Both work on the same principle: the cytidine deaminase directly edits a single cytosine (C) base, so that no DSB is generated and no HDR repair is initiated, which greatly improves the efficiency of C-to-thymine (T) base editing. Specifically, dead Cas9 (dCas9) or Cas9 nickase (Cas9n), together with rAPOBEC1 or PmCDA1, is directed to the target by the sgRNA; rAPOBEC1 or PmCDA1 catalyzes the deamination of C on the unpaired single-stranded DNA to uracil (U); U is paired with adenine (A) through DNA repair, and finally T is paired with A through DNA replication, which completes the C-to-T conversion. Among the editors tested, the SpCas9n (D10A) & rAPOBEC1/PmCDA1 & UGI base editing system (which contains a uracil DNA glycosylase inhibitor, UGI) had a higher mean mutation rate, for two reasons: first, UGI inhibits uracil DNA glycosylase (UDG) from catalyzing the removal of U from DNA; second, SpCas9n (D10A) nicks the non-edited strand, inducing the eukaryotic mismatch repair mechanism or the long-patch base excision repair (BER) mechanism, which favors repair of the U:G mismatch to U:A. To improve working efficiency and reduce cost, improving the efficiency of C-to-T base substitution has long been a research direction for base editing systems in animal and plant genomes.
SaCas9, the Cas9 derived from Staphylococcus aureus, is a SpCas9 homologue that recognizes an NNGRRT PAM; a SaCas9 variant, SaKKH, recognizes the broader NNNRRT PAM. Both have been developed into effective CBEs, which greatly expands the range of editable C in animal and plant genomes. To date, there has been no report of improving the C-to-T base substitution efficiency of SaKKH-based CBEs by modifying the structure of the sgRNA corresponding to SaCas9 (SaCas9 sgRNA).
Disclosure of Invention
The purpose of the present invention is to improve the C-to-T base substitution efficiency of a cytosine base editor (CBE).
To achieve the above object, the present invention provides a kit comprising an sgRNA or a biological material related to the sgRNA, a Cas9 nuclease or a biological material related to the Cas9 nuclease, and a cytosine deaminase or a biological material related to the cytosine deaminase;
the sgRNA targets a target sequence;
the sgRNA has the structure shown in formula I: RNA transcribed from the target sequence-modified sgRNA backbone (formula I);
the modified sgRNA backbone is an RNA molecule obtained by inserting an RNA fragment A between positions 14 and 15 of the sgRNA backbone and inserting an RNA fragment B between positions 18 and 19 of the sgRNA backbone;
RNA fragment A and RNA fragment B are reverse complementary to each other;
RNA fragment A and RNA fragment B are each 3 nt long;
the sgRNA backbone is m1) or m2) or m3):
m1) the RNA molecule shown in sequence 9;
m2) an RNA molecule obtained by substitution and/or deletion and/or addition of one or more nucleotides in the RNA molecule shown in m1) and having the same function;
m3) an RNA molecule having 75% or more identity to the nucleotide sequence defined in m1) or m2) and having the same function.
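The insertion rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the backbone string is a made-up 24-nt placeholder (sequence 9 is not reproduced here), the function names are hypothetical, and fragment B is derived as the reverse complement of fragment A, as the text requires.

```python
# Illustrative sketch only. The backbone below is a placeholder, NOT sequence 9.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA string (A/C/G/U only)."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def modify_backbone(backbone: str, fragment_a: str) -> str:
    """Insert fragment A between positions 14 and 15, and its reverse
    complement (fragment B) between positions 18 and 19 (1-based)."""
    if len(fragment_a) != 3:
        raise ValueError("fragment A must be 3 nt long")
    if len(backbone) < 19:
        raise ValueError("backbone must have at least 19 positions")
    fragment_b = reverse_complement_rna(fragment_a)
    # positions 1-14 | A | positions 15-18 | B | positions 19-end
    return (backbone[:14] + fragment_a + backbone[14:18]
            + fragment_b + backbone[18:])


# Hypothetical 24-nt stand-in for the backbone, chosen so the result is easy to read.
placeholder_backbone = "A" * 14 + "GAAA" + "C" * 6
modified = modify_backbone(placeholder_backbone, "GCU")
print(modified)
```

With this placeholder, the four nucleotides at positions 15-18 end up flanked by fragment A on the 5' side and fragment B on the 3' side, and the backbone grows by 6 nt in total.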
In the above kit, the modified sgRNA backbone is n1) or n2) or n3):
n1) the RNA molecule shown in sequence 10;
n2) an RNA molecule obtained by substitution and/or deletion and/or addition of one or more nucleotides in the RNA molecule shown in n1) and having the same function;
n3) an RNA molecule having 75% or more identity to the nucleotide sequence defined in n1) or n2) and having the same function.
In the above kit, the Cas9 nuclease may be a protein such as SaKKHn, SaCas9, SaKKH-HF or SaCas9-HF. In a specific embodiment of the invention, the Cas9 nuclease is the SaKKHn protein.
The SaKKHn protein is E1) or E2) or E3):
E1) a protein whose amino acid sequence is shown in sequence 2;
E2) a protein with the same function obtained by substituting and/or deleting and/or adding one or more amino acid residues in the amino acid sequence shown in sequence 2 of the sequence listing;
E3) a fusion protein obtained by attaching a tag to the N-terminus and/or the C-terminus of E1) or E2);
the biological material related to the SaKKHn protein is any one of F1) to F5):
F1) a nucleic acid molecule encoding the SaKKHn protein;
F2) an expression cassette containing the nucleic acid molecule of F1);
F3) a recombinant vector containing the nucleic acid molecule of F1), or a recombinant vector containing the expression cassette of F2);
F4) a recombinant microorganism containing the nucleic acid molecule of F1), or a recombinant microorganism containing the expression cassette of F2), or a recombinant microorganism containing the recombinant vector of F3);
F5) a transgenic cell line containing the nucleic acid molecule of F1), or a transgenic cell line containing the expression cassette of F2).
In the above kit, the cytosine deaminase may be human APOBEC3A, human AID, PmCDA1 or rAPOBEC1 protein. In a specific embodiment of the invention, the cytosine deaminase is the PmCDA1 protein.
The PmCDA1 protein is G1) or G2) or G3):
G1) a protein whose amino acid sequence is shown in sequence 3;
G2) a protein with the same function obtained by substituting and/or deleting and/or adding one or more amino acid residues in the amino acid sequence shown in sequence 3 of the sequence listing;
G3) a fusion protein obtained by attaching a tag to the N-terminus and/or the C-terminus of G1) or G2);
the biological material related to the PmCDA1 protein is any one of H1) to H5):
H1) a nucleic acid molecule encoding the PmCDA1 protein;
H2) an expression cassette containing the nucleic acid molecule of H1);
H3) a recombinant vector containing the nucleic acid molecule of H1), or a recombinant vector containing the expression cassette of H2);
H4) a recombinant microorganism containing the nucleic acid molecule of H1), or a recombinant microorganism containing the expression cassette of H2), or a recombinant microorganism containing the recombinant vector of H3);
H5) a transgenic cell line containing the nucleic acid molecule of H1), or a transgenic cell line containing the expression cassette of H2).
In the above kit, the sgRNA may be a tRNA-sgRNA;
the tRNA-sgRNA has the structure shown in formula I: tRNA-RNA transcribed from the target sequence-modified sgRNA backbone (formula I);
the tRNA is 1) or 2) or 3):
1) an RNA molecule obtained by replacing T with U in positions 474-550 of sequence 1;
2) an RNA molecule obtained by substitution and/or deletion and/or addition of one or more nucleotides in the RNA molecule shown in 1) and having the same function;
3) an RNA molecule having 75% or more identity to the nucleotide sequence defined in 1) or 2) and having the same function.
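Item 1) above combines a 1-based, inclusive coordinate range with a T-to-U replacement. A minimal Python sketch of that operation follows; the function name is hypothetical and the input is a toy DNA string, since sequence 1 itself is not reproduced here.

```python
def dna_region_to_rna(dna: str, start: int, end: int) -> str:
    """Extract the 1-based, inclusive region start..end from a DNA string
    and replace every T with U."""
    if not 1 <= start <= end <= len(dna):
        raise ValueError("region lies outside the sequence")
    return dna[start - 1:end].upper().replace("T", "U")


# Toy 800-nt stand-in for sequence 1 (an assumption made only for illustration).
toy_sequence_1 = "ACGT" * 200
trna_like = dna_region_to_rna(toy_sequence_1, 474, 550)
print(len(trna_like))  # 550 - 474 + 1 = 77 nucleotides
```

The range 474-550 spans 77 nucleotides whatever the underlying sequence is.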
The kit may further include a UGI protein or a biological material related to the UGI protein;
the UGI protein is I1) or I2) or I3):
I1) a protein whose amino acid sequence is shown in sequence 4;
I2) a protein with the same function obtained by substituting and/or deleting and/or adding one or more amino acid residues in the amino acid sequence shown in sequence 4 of the sequence listing;
I3) a fusion protein obtained by attaching a tag to the N-terminus and/or the C-terminus of I1) or I2);
the biological material related to the UGI protein is any one of J1) to J5):
J1) a nucleic acid molecule encoding the UGI protein;
J2) an expression cassette containing the nucleic acid molecule of J1);
J3) a recombinant vector containing the nucleic acid molecule of J1), or a recombinant vector containing the expression cassette of J2);
J4) a recombinant microorganism containing the nucleic acid molecule of J1), or a recombinant microorganism containing the expression cassette of J2), or a recombinant microorganism containing the recombinant vector of J3);
J5) a transgenic cell line containing the nucleic acid molecule of J1), or a transgenic cell line containing the expression cassette of J2).
To facilitate purification of the proteins in E1), G1) and I1), a tag shown in the following table may be attached to the amino terminus or the carboxyl terminus of the protein consisting of the amino acid sequence shown in sequence 2, sequence 3 or sequence 4 of the sequence listing.
Table: Tag sequences
Tag            Residues            Sequence
Poly-Arg       5-6 (typically 5)   RRRRR
Poly-His       2-10 (typically 6)  HHHHHH
FLAG           8                   DYKDDDDK
Strep-tag II   8                   WSHPQFEK
c-myc          10                  EQKLISEEDL
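As a rough illustration of attaching one of the tags in the table to either end of a protein, the sketch below concatenates one-letter amino acid strings. The function name and the three-residue example protein are hypothetical, and the sketch ignores practical details such as start codons and linkers.

```python
# Tag sequences copied from the table above (one-letter amino acid codes).
TAGS = {
    "Poly-Arg": "RRRRR",
    "Poly-His": "HHHHHH",
    "FLAG": "DYKDDDDK",
    "Strep-tag II": "WSHPQFEK",
    "c-myc": "EQKLISEEDL",
}


def add_tag(protein: str, tag_name: str, terminus: str = "N") -> str:
    """Return the protein sequence with the named tag fused to its
    N-terminus ("N") or C-terminus ("C")."""
    tag = TAGS[tag_name]
    if terminus == "N":
        return tag + protein
    if terminus == "C":
        return protein + tag
    raise ValueError("terminus must be 'N' or 'C'")


# "MKV" is a made-up three-residue protein used only to show the call.
print(add_tag("MKV", "FLAG", terminus="C"))
```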
The protein in E2), G2) or I2) above is a protein having 75% or more identity to the amino acid sequence of the protein shown in SEQ ID NO. 2, SEQ ID NO. 3 or SEQ ID NO. 4 and having the same function. The identity of 75% or more may be 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% identity.
The proteins in E2), G2) and I2) above can be synthesized artificially, or obtained by first synthesizing their coding genes and then expressing them biologically.
The genes encoding the proteins in E2), G2) and I2) above can be obtained by deleting the codons of one or several amino acid residues from the DNA sequence shown in positions 3013-6225 of sequence 1 (encoding the protein shown in sequence 2), positions 6511-7134 of sequence 1 (encoding the protein shown in sequence 3) or positions 7156-7452 of sequence 1 (encoding the protein shown in sequence 4), and/or introducing missense mutations of one or several base pairs, and/or attaching the coding sequence of a tag shown in the above table to the 5' end and/or 3' end.
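The three coordinate ranges quoted above can be checked with simple arithmetic: an inclusive range start..end contains end - start + 1 nucleotides, and a coding region should be a whole number of codons. The sketch below performs only that check; the dictionary keys are labels chosen for readability, and nothing here inspects the actual sequence.

```python
# Coordinates on sequence 1 as quoted in the text (1-based, inclusive).
CODING_REGIONS = {
    "SaKKHn": (3013, 6225),
    "PmCDA1": (6511, 7134),
    "UGI": (7156, 7452),
}


def region_length(start: int, end: int) -> int:
    """Number of nucleotides in a 1-based, inclusive range."""
    return end - start + 1


summary = {}
for name, (start, end) in CODING_REGIONS.items():
    length = region_length(start, end)
    codons, remainder = divmod(length, 3)
    summary[name] = (length, codons, remainder)
    print(f"{name}: {length} nt = {codons} codons, remainder {remainder}")
```

All three ranges come out as exact multiples of three (3213, 624 and 297 nt), which is consistent with their being read as coding regions.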
Further, the nucleic acid molecule of F1) is f1) or f2) or f3):
f1) a cDNA molecule or DNA molecule shown in positions 3013-6225 of sequence 1 of the sequence listing;
f2) a cDNA molecule or DNA molecule having 75% or more identity to the nucleotide sequence defined in f1) and encoding said SaKKHn;
f3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions with the nucleotide sequence defined in f1) or f2) and encodes said SaKKHn;
the nucleic acid molecule of H1) is h1) or h2) or h3):
h1) a cDNA molecule or DNA molecule shown in positions 6511-7134 of sequence 1 of the sequence listing;
h2) a cDNA molecule or DNA molecule having 75% or more identity to the nucleotide sequence defined in h1) and encoding said PmCDA1;
h3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions with the nucleotide sequence defined in h1) or h2) and encodes said PmCDA1;
the nucleic acid molecule of J1) is j1) or j2) or j3):
j1) a cDNA molecule or DNA molecule shown in positions 7156-7452 of sequence 1 of the sequence listing;
j2) a cDNA molecule or DNA molecule having 75% or more identity to the nucleotide sequence defined in j1) and encoding said UGI;
j3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions with the nucleotide sequence defined in j1) or j2) and encodes said UGI.
Wherein the nucleic acid molecule may be DNA, such as cDNA, genomic DNA or recombinant DNA; the nucleic acid molecule may also be RNA, such as mRNA or hnRNA, etc.
A person of ordinary skill in the art can easily mutate the nucleotide sequence of the present invention encoding said SaKKHn, said PmCDA1 or said UGI using known methods such as directed evolution and point mutation. Artificially modified nucleotides having 75% or more identity to the nucleotide sequence encoding said SaKKHn, said PmCDA1 or said UGI are derived from, and equivalent to, the sequences of the present invention, as long as they encode said SaKKHn, said PmCDA1 or said UGI and have the same function.
The term "identity" as used herein refers to sequence similarity to a native nucleic acid sequence. "Identity" includes nucleotide sequences that are 75% or more, 85% or more, 90% or more, or 95% or more identical to the nucleotide sequence encoding the protein consisting of the amino acid sequence shown in sequence 2, 3 or 4 of the present invention. Identity can be assessed by eye or with computer software. With computer software, the identity between two or more sequences can be expressed as a percentage (%), which can be used to assess the identity between related sequences.
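To make the percentage concrete, the sketch below computes identity over two sequences that are assumed to be already aligned to equal length, with "-" marking gaps. The text does not say which software or which identity convention is used, so the definition here (matching columns divided by alignment columns) and the function names are assumptions for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.
    Columns where both sequences have a gap are ignored; a gap opposite
    a residue counts as a mismatch."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 75.0) -> bool:
    """True when identity is at or above the threshold (75% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


print(percent_identity("ACGTACGT", "ACGTACGA"))
```

Other conventions (for example, dividing by the shorter sequence length) give different numbers for gapped alignments, so any threshold comparison should state which one is used.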
The stringent conditions are: hybridization in a solution of 2×SSC, 0.1% SDS at 68 ℃, with the membrane washed twice for 5 min each, followed by hybridization in a solution of 0.5×SSC, 0.1% SDS at 68 ℃, with the membrane washed twice for 15 min each; alternatively, hybridization in a solution of 0.1×SSPE (or 0.1×SSC), 0.1% SDS, with the membrane washed at 65 ℃.
The above identity of 75% or more may be 80%, 85%, 90%, or 95% or more.
The expression cassette of F2) containing a nucleic acid molecule encoding the SaKKHn protein (SaKKHn gene expression cassette) is DNA capable of expressing the SaKKHn protein in a host cell; the DNA may include not only a promoter that initiates transcription of the SaKKHn gene but also a terminator that terminates its transcription. The expression cassette may further include an enhancer sequence. A recombinant vector containing the SaKKHn gene expression cassette can be constructed from an existing expression vector.
The expression cassette of H2) containing a nucleic acid molecule encoding the PmCDA1 protein (PmCDA1 gene expression cassette) is DNA capable of expressing the PmCDA1 protein in a host cell; the DNA may include not only a promoter that initiates transcription of the PmCDA1 gene but also a terminator that terminates its transcription. The expression cassette may further include an enhancer sequence. A recombinant vector containing the PmCDA1 gene expression cassette can be constructed from an existing expression vector.
The expression cassette of J2) containing a nucleic acid molecule encoding the UGI protein (UGI gene expression cassette) is DNA capable of expressing the UGI protein in a host cell; the DNA may include not only a promoter that initiates transcription of the UGI gene but also a terminator that terminates its transcription. The expression cassette may further include an enhancer sequence. A recombinant vector containing the UGI gene expression cassette can be constructed from an existing expression vector.
The vector may be a plasmid, cosmid, phage or viral vector. In the specific embodiment of the invention, the recombinant vector is specifically a SaKKHn-pBE +3bp-1 recombinant expression vector, a SaKKHn-pBE +3bp-2 recombinant expression vector, a SaKKHn-pBE +3bp-3 recombinant expression vector, a SaKKHn-pBE +3bp-4 recombinant expression vector, a SaKKHn-pBE +3bp-5 recombinant expression vector, a SaKKHn-pBE +3bp-6 recombinant expression vector or a SaKKHn-pBE +3bp-7 recombinant expression vector.
The nucleotide sequence of the SaKKHn-pBE +3bp-1 recombinant expression vector is obtained by replacing the DNA sequence of an origin sgRNA framework in the sequence of the SaKKHn-pBE-1 recombinant expression vector with the DNA sequence of a +3bp sgRNA framework shown in a sequence 6 and keeping other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +3bp-2 recombinant expression vector is obtained by replacing the DNA sequence of an origin sgRNA framework in the sequence of the SaKKHn-pBE-2 recombinant expression vector with the DNA sequence of a +3bp sgRNA framework shown in a sequence 6 and keeping other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +3bp-3 recombinant expression vector is obtained by replacing the DNA sequence of an origin sgRNA framework in the sequence of the SaKKHn-pBE-3 recombinant expression vector with the DNA sequence of a +3bp sgRNA framework shown in a sequence 6 and keeping other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +3bp-4 recombinant expression vector is obtained by replacing the DNA sequence of an origin sgRNA framework in the sequence of the SaKKHn-pBE-4 recombinant expression vector with the DNA sequence of a +3bp sgRNA framework shown in a sequence 6 and keeping other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +3bp-5 recombinant expression vector is obtained by replacing the DNA sequence of an origin sgRNA framework in the sequence of the SaKKHn-pBE-5 recombinant expression vector with the DNA sequence of a +3bp sgRNA framework shown in a sequence 6 and keeping other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +3bp-6 recombinant expression vector is obtained by replacing the DNA sequence of an origin sgRNA framework in the sequence of the SaKKHn-pBE-6 recombinant expression vector with the DNA sequence of a +3bp sgRNA framework shown in a sequence 6 and keeping other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +3bp-7 recombinant expression vector is obtained by replacing the DNA sequence of an origin sgRNA framework in the sequence of the SaKKHn-pBE-7 recombinant expression vector with the DNA sequence of a +3bp sgRNA framework shown in a sequence 6 and keeping other sequences unchanged.
The microorganism may be a yeast, bacterium, algae or fungus. Wherein the bacterium can be an Agrobacterium, such as Agrobacterium EHA105. In a specific embodiment of the present invention, the recombinant microorganism is agrobacterium EHA105 which contains the SaKKHn-pBE +3bp-1 recombinant expression vector, the SaKKHn-pBE +3bp-2 recombinant expression vector, the SaKKHn-pBE +3bp-3 recombinant expression vector, the SaKKHn-pBE +3bp-4 recombinant expression vector, the SaKKHn-pBE +3bp-5 recombinant expression vector, the SaKKHn-pBE +3bp-6 recombinant expression vector, or the SaKKHn-pBE +3bp-7 recombinant expression vector.
The transgenic cell line does not include propagation material.
The kit has the following uses:
X1) editing a genomic target sequence of an organism or a cell of an organism;
X2) preparing a product for editing a genomic target sequence of an organism or a cell of an organism;
X3) improving the editing efficiency of a genomic target sequence of an organism or a cell of an organism;
X4) preparing a product for improving the editing efficiency of a genomic target sequence of an organism or a cell of an organism.
The sgRNA or the modified sgRNA backbone in the kit also falls within the protection scope of the present invention.
To achieve the above object, the present invention also provides a new use of the above kit, sgRNA or modified sgRNA backbone.
The invention provides use of the above kit, sgRNA or modified sgRNA backbone in any one of X1) to X4):
X1) editing a genomic target sequence of an organism or a cell of an organism;
X2) preparing a product for editing a genomic target sequence of an organism or a cell of an organism;
X3) improving the editing efficiency of a genomic target sequence of an organism or a cell of an organism;
X4) preparing a product for improving the editing efficiency of a genomic target sequence of an organism or a cell of an organism.
To achieve the above object, the present invention finally provides a method as defined in Y1) or Y2):
Y1) a method for editing a genomic target sequence of an organism or a cell of an organism, or for improving the efficiency of such editing, comprising expressing the sgRNA, the Cas9 nuclease and the cytosine deaminase in the organism or the cell of the organism so as to edit the genomic target sequence; the sgRNA targets the target sequence;
Y2) a method for preparing a biological mutant, comprising the following step: editing the genome of the organism according to the method of Y1) to obtain the biological mutant.
In the above method, in Y1), the sgRNA is the tRNA-sgRNA. The tRNA-sgRNA transcribed from the DNA molecule encoding it is an immature RNA precursor; the tRNA in the precursor is cleaved off by two enzymes (RNase P and RNase Z) to yield mature RNA. The number of independent mature RNAs obtained equals the number of targets in the recombinant expression vector; each mature RNA consists, in order, of the RNA transcribed from the target sequence and the sgRNA backbone, or of the RNA transcribed from the target sequence, the sgRNA backbone and a few residual tRNA bases.
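The relation between the number of targets and the number of mature RNAs can be pictured with a toy model in which the tRNA is removed wherever it occurs in the precursor. This is a deliberate simplification: RNase P and RNase Z recognize tRNA structure rather than a literal string, residual tRNA bases are not modelled, and every sequence and name below is a made-up placeholder.

```python
def process_precursor(precursor: str, trna: str) -> list[str]:
    """Toy model of tRNA excision: split the precursor at each exact
    occurrence of the tRNA string and keep the non-empty pieces."""
    return [piece for piece in precursor.split(trna) if piece]


# Placeholder sequences (assumptions for illustration, not real tRNA or sgRNA).
toy_trna = "GCAUGC"
toy_backbone = "GUUUU"
toy_targets = ["AAAAA", "CCCCC"]

# Precursor layout: tRNA-target1-backbone-tRNA-target2-backbone
toy_precursor = "".join(toy_trna + t + toy_backbone for t in toy_targets)
mature_rnas = process_precursor(toy_precursor, toy_trna)
print(mature_rnas)
```

In this toy layout, two targets give two mature RNAs, each made of a target-derived part followed by the backbone.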
In the above method, Y1) further comprises a step of expressing UGI in the organism or the organism cell, and the number of the UGI may be 1 or 2 or more. In a specific embodiment of the present invention, the number of the UGIs is specifically 1.
Further, the sgRNA, the Cas9 nuclease, the cytosine deaminase, and the UGI are expressed in an organism or an organism cell by introducing a gene encoding the Cas9 nuclease, a DNA molecule that transcribes the sgRNA, a gene encoding the cytosine deaminase, and a gene encoding the UGI into the organism or the organism cell.
Further, the gene encoding Cas9 nuclease, the DNA molecule transcribing the sgRNA, the gene encoding cytosine deaminase, and the gene encoding UGI are introduced into an organism or biological cell via a recombinant expression vector.
The encoding gene of the Cas9 nuclease, the DNA molecule for transcribing the sgRNA, the encoding gene of the cytosine deaminase and the encoding gene of the UGI can be introduced into an organism or an organism cell through the same recombinant expression vector, or can be introduced into the organism or the organism cell through two or more recombinant expression vectors.
In a specific embodiment of the present invention, the gene encoding Cas9 nuclease, the DNA molecule transcribing the sgRNA, the gene encoding cytosine deaminase, and the gene encoding UGI are introduced into an organism or an organism cell through the same recombinant expression vector. The recombinant expression vector contains an expression cassette A and an expression cassette B; the expression cassette A expresses the sgRNA, and the expression cassette B expresses a fusion protein consisting of the Cas9 nuclease, the cytosine deaminase and the UGI.
The recombinant expression vector is specifically the SaKKHn-pBE +3bp-1 recombinant expression vector, the SaKKHn-pBE +3bp-2 recombinant expression vector, the SaKKHn-pBE +3bp-3 recombinant expression vector, the SaKKHn-pBE +3bp-4 recombinant expression vector, the SaKKHn-pBE +3bp-5 recombinant expression vector, the SaKKHn-pBE +3bp-6 recombinant expression vector or the SaKKHn-pBE +3bp-7 recombinant expression vector.
In the kit or use or method, the number of target sequences may be 1 or 2 or more. The PAM sequence of the target sequence is NNNRRT.
In the above kit, use or method, the genomic target sequence is edited by mutating C in the target sequence to T. The C may be at any position in the target sequence.
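As an illustration of what an NNNRRT PAM and "C at any position" mean in practice, the sketch below scans the forward strand of a DNA string for NNNRRT (N = any base, R = A or G), takes the bases immediately 5' of each PAM as a candidate target, and lists the positions of every C in it. The target length is not stated in this part of the text, so `target_len` is a parameter with an assumed default of 20; the function name and the example sequence are also hypothetical.

```python
import re

# NNNRRT: three arbitrary bases, two purines (A or G), then T.
# The lookahead lets overlapping PAM candidates be reported as well.
PAM_PATTERN = re.compile(r"(?=([ACGT]{3}[AG]{2}T))")


def find_targets(dna: str, target_len: int = 20):
    """Return (target, pam, c_positions) for each forward-strand NNNRRT PAM
    that has at least target_len bases upstream. c_positions are 1-based
    positions of C within the target. The reverse strand is not scanned."""
    dna = dna.upper()
    hits = []
    for match in PAM_PATTERN.finditer(dna):
        pam_start = match.start()
        if pam_start < target_len:
            continue  # not enough sequence upstream of this PAM
        target = dna[pam_start - target_len:pam_start]
        c_positions = [i + 1 for i, base in enumerate(target) if base == "C"]
        hits.append((target, match.group(1), c_positions))
    return hits


# Made-up example: a 20-nt stretch followed by the PAM TTTGAT.
example = "ACCGTACCGTACCGTACCGT" + "TTTGAT"
hits = find_targets(example)
print(hits)
```

Each listed C position is a candidate for C-to-T conversion; which positions are actually edited in practice is an experimental question the sketch does not address.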
In the above kit, use or method, the organism is S1) or S2) or S3) or S4):
S1) a plant or an animal;
S2) a monocotyledon or a dicotyledon;
S3) a gramineous plant;
S4) rice;
the cell of the organism is T1) or T2) or T3) or T4):
T1) a plant cell or an animal cell;
T2) a monocotyledonous or dicotyledonous plant cell;
T3) a gramineous plant cell;
T4) a rice cell.
The invention provides a modified sgRNA having the structure shown in Formula I: RNA transcribed from the target sequence, followed by the modified sgRNA backbone (Formula I). The modified sgRNA backbone is the RNA molecule obtained by inserting an RNA fragment A between positions 14 and 15 of the sgRNA backbone and an RNA fragment B between positions 18 and 19 of the sgRNA backbone; RNA fragment A and RNA fragment B are reverse complementary to each other, and each is 3 nt long; the sgRNA backbone is the RNA molecule shown in sequence 9. Experiments show that the modified sgRNA markedly improves the C·T base substitution efficiency of a cytosine base editor (CBE), with the highest efficiency reaching 86.4%.
Drawings
FIG. 1 shows the unmodified and modified SaCas9 sgRNA structures.
FIG. 2 is a schematic structural diagram of a recombinant expression vector.
Detailed Description
The present invention is described in further detail below with reference to specific embodiments, and the examples are given only for illustrating the present invention and not for limiting the scope of the present invention. The experimental procedures in the following examples are conventional unless otherwise specified. Materials, reagents, instruments and the like used in the following examples are commercially available unless otherwise specified. In the following examples, unless otherwise specified, the 1 st position of each nucleotide sequence in the sequence listing is the 5 'terminal nucleotide of the corresponding DNA/RNA, and the last position is the 3' terminal nucleotide of the corresponding DNA/RNA.
Primer pair T1 consists of primer T1-F: 5'-ttcaaattctaatccccaatcc-3' and primer T1-R: 5'-tcgtacctgtctgcaaccttg-3', and is used to amplify target T1.
Primer pair T2 consists of primer T2-F: 5'-gctttagatgatttgttacatttcgc-3' and primer T2-R: 5'-tgagttggtatggcaagaacaag-3', and is used to amplify target T2.
Primer pair T3 consists of primer T3-F: 5'-aacacggtcaccaacttcatc-3' and primer T3-R: 5'-acaacctggcttgctatatatgc-3', and is used to amplify target T3.
Primer pair T4 consists of primer T4-F: 5'-tggatcggatatggacttctc-3' and primer T4-R: 5'-gaaatgaacaatcacctgagatctttg-3', and is used to amplify targets T4 and T7.
Primer pair T5 consists of primer T5-F: 5'-cgagctacctgaagaacaactacc-3' and primer T5-R: 5'-cctcgattgcctgaaatttg-3', and is used to amplify target T5.
Primer pair T6 consists of primer T6-F: 5'-tgcgagctcgacaacatcatg-3' and primer T6-R: 5'-gacggcccatgtggaaacc-3', and is used to amplify target T6.
Primer pair T8 consists of primer T8-F: 5'-gacgcccatagtcgaggtc-3' and primer T8-R: 5'-ctctgctggatcaatgtcaatg-3', and is used to amplify target T8.
Primer pair T9 consists of primer T9-F: 5'-cctcatccaatcgactgacac-3' and primer T9-R: 5'-gtaattgtgcttggtgatggag-3', and is used to amplify target T9.
In the following examples, a C·T base substitution refers to a mutation from C to T at any position in the target sequence.
C·T base substitution efficiency = (number of positive T0 seedlings in which a C·T base substitution occurred / total number of positive T0 seedlings analyzed) × 100%.
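The efficiency definition above is a plain ratio of seedling counts. A minimal Python sketch follows; the function name and the counts used in the call are hypothetical illustrations and are not taken from the tables of this document.

```python
def ct_substitution_efficiency(edited_seedlings: int, analyzed_seedlings: int) -> float:
    """Return the C-to-T base substitution efficiency as a percentage.

    edited_seedlings:   positive T0 seedlings carrying at least one C-to-T
                        substitution in the target sequence
    analyzed_seedlings: all positive T0 seedlings that were analyzed
    """
    if analyzed_seedlings <= 0:
        raise ValueError("at least one positive T0 seedling must be analyzed")
    if not 0 <= edited_seedlings <= analyzed_seedlings:
        raise ValueError("edited seedlings must lie between 0 and the number analyzed")
    return 100.0 * edited_seedlings / analyzed_seedlings


# Hypothetical counts, for illustration only.
example = ct_substitution_efficiency(edited_seedlings=10, analyzed_seedlings=40)
print(f"{example:.1f}%")
```

Because the denominator is the number of positive T0 seedlings analyzed, seedlings that failed the transgene PCR screen do not enter the calculation.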
Nipponbare rice: described in Liang Weigong, Wang Gaohua, Du Jingyao, et al. Effects of sodium nitroprusside and its photolysis products on the growth of Nipponbare rice seedlings and the expression of 5 hormone marker genes [J]. Journal of Henan Normal University (Natural Science Edition), 2017(2): 48-52; available to the public from the Beijing Academy of Agriculture and Forestry Sciences.
Recovery medium: N6 solid medium containing 200 mg/L timentin.
Screening medium: N6 solid medium containing 50 mg/L hygromycin.
Differentiation medium: N6 solid medium containing 2 mg/L KT, 0.2 mg/L NAA, 0.5 g/L glutamic acid and 0.5 g/L proline.
Rooting medium: N6 solid medium containing 0.2 mg/L NAA, 0.5 g/L glutamic acid and 0.5 g/L proline.
Example 1 Modification of the sgRNA backbone structure in SaCas9 sgRNA
The structure of the SaCas9 sgRNA is as follows: RNA transcribed from the target sequence, followed by the sgRNA backbone.
The sgRNA backbone in the SaCas9 sgRNA was modified in two ways, both of which are shown in FIG. 1.
Original represents the unmodified SaCas9 sgRNA structure. The unmodified SaCas9 sgRNA is designated Original sgRNA, and the sgRNA backbone in the Original sgRNA is designated the Original sgRNA backbone. The RNA sequence of the Original sgRNA backbone is as follows: GUUUUAGUACUCUGGAAACAGAAUCUACUAAAACAAGGCAAAAUGCCGUGUUUAUCUCGUCAACUUGUUGGCGAGAU (sequence 9); the DNA sequence of the Original sgRNA backbone is as follows: GTTTTAGTACTCTGGAAACAGAATCTACTAAAACAAGGCAAAATGCCGTGTTTATCTCGTCAACTTGTTGGCGAGAT.
+3bp represents the SaCas9 sgRNA structure obtained by adding 3 base pairs to the unmodified SaCas9 sgRNA structure. This modified SaCas9 sgRNA is designated +3bp sgRNA, and the sgRNA backbone in the +3bp sgRNA is designated the +3bp sgRNA backbone. The RNA sequence of the +3bp sgRNA backbone is as follows: GUUUUAGUACUCUGCUGGAAACAGCAGAAUCUACUAAAACAAGGCAAAAUGCCGUGUUUAUCUCGUCAACUUGUUGGCGAGAU (sequence 10; the 3 added base pairs, underlined in the original publication, are the CUG and CAG segments on either side of GAAA); the DNA sequence of the +3bp sgRNA backbone is shown in sequence 6.
+8bp represents the SaCas9 sgRNA structure obtained by adding 8 base pairs to the unmodified SaCas9 sgRNA structure. This modified SaCas9 sgRNA is designated +8bp sgRNA, and the sgRNA backbone in the +8bp sgRNA is designated the +8bp sgRNA backbone. The RNA sequence of the +8bp sgRNA backbone is as follows: GUUUUAGUACUCUGUAAUUUUAGAAAUAAAAUUACAGAAUCUACUAAAACAAGGCAAAAUGCCGUGUUUAUCUCGUCAACUUGUUGGCGAGAU (the 8 added base pairs, underlined in the original publication, are the UAAUUUUA and UAAAAUUA segments on either side of GAAA); the DNA sequence of the +8bp sgRNA backbone is shown in sequence 7.
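The backbone modification can be reproduced as a string operation: a fragment A is inserted between positions 14 and 15 of the unmodified backbone, and its reverse complement (fragment B) between positions 18 and 19. The sketch below is illustrative only; the function names are hypothetical, and the fragments CUG and UAAUUUUA were read off by comparing the modified backbone sequences with sequence 9.

```python
# Unmodified SaCas9 sgRNA backbone (sequence 9), written 5' to 3'.
ORIGINAL_BACKBONE = (
    "GUUUUAGUACUCUGGAAACAGAAUCUACUAAAACAAGGCAAAAUGCCGUGUUUAUCUCGUCAACUUGUUGGCGAGAU"
)

RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence (Watson-Crick pairs only)."""
    return "".join(RNA_PAIRS[base] for base in reversed(seq.upper()))


def extend_stem(backbone: str, fragment_a: str, left: int = 14, right: int = 18) -> str:
    """Insert fragment A after position `left` and its reverse complement
    (fragment B) after position `right`; positions are 1-based."""
    fragment_b = reverse_complement_rna(fragment_a)
    return (
        backbone[:left] + fragment_a
        + backbone[left:right] + fragment_b
        + backbone[right:]
    )


plus_3bp = extend_stem(ORIGINAL_BACKBONE, "CUG")       # fragment B becomes CAG
plus_8bp = extend_stem(ORIGINAL_BACKBONE, "UAAUUUUA")  # fragment B becomes UAAAAUUA
print(len(ORIGINAL_BACKBONE), len(plus_3bp), len(plus_8bp))  # 77 83 93
```

With these inputs the function returns the +3bp and +8bp backbone sequences exactly as written above, which confirms that both modifications lengthen the same stem around the GAAA segment.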
Example 2 Application of the modified SaCas9 sgRNA in improving the C·T base substitution efficiency of the SaKKHn&PmCDA1&UGI base editing system
1. Construction of recombinant expression vectors
The following recombinant expression vectors were artificially synthesized; each is a circular plasmid:
two recombinant expression vectors containing Original sgRNA: SaKKHn-pBE-1 and SaKKHn-pBE-2;
two recombinant expression vectors containing +3bp sgRNA: SaKKHn-pBE +3bp-1 and SaKKHn-pBE +3bp-2;
two recombinant expression vectors containing +8bp sgRNA: SaKKHn-pBE +8bp-1 and SaKKHn-pBE +8bp-2.
The nucleotide sequence of the SaKKHn-pBE-1 recombinant expression vector is sequence 1 in the sequence listing. In sequence 1, positions 131-467 are the nucleotide sequence of the OsU promoter; positions 474-550 and positions 648-724 are both tRNA nucleotide sequences; positions 551-647 and positions 725-821 are the nucleotide sequences of two sgRNAs targeting the OsWaxy gene; the DNA sequence of the sgRNA backbone shared by the two sgRNAs (the Original sgRNA backbone) is positions 571-647 or positions 745-821 of sequence 1; and positions 996-1286 are the nucleotide sequence of the OsU terminator. Positions 1293-3006 of sequence 1 are the nucleotide sequence of the OsUbq3 promoter; positions 3013-6225 are the coding sequence of the SaKKHn protein (without a stop codon), encoding the SaKKHn protein shown in sequence 2; positions 6511-7134 are the coding sequence of the PmCDA1 protein (without a stop codon), encoding the PmCDA1 protein shown in sequence 3; positions 7156-7452 are the coding sequence of the UGI protein, encoding the UGI protein shown in sequence 4; positions 7459-7653 are the nucleotide sequence of the 35S terminator; positions 7728-9720 are the nucleotide sequence of the ZmUbi1 promoter; positions 9727-10749 are the coding sequence of hygromycin phosphotransferase; and positions 10779-10994 are the nucleotide sequence of CaMV35S polyA. The two targets in the SaKKHn-pBE-1 recombinant expression vector are T1 and T2, whose sequences are shown in Table 1.
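The feature coordinates above are 1-based and inclusive, as is usual for sequence listings. The following sketch shows one way to pull a feature out of listing-formatted rows; it uses only the two rows of sequence 1 that cover positions 541-660, and the helper names (`parse_listing_rows`, `feature`) are hypothetical.

```python
import re

# Two rows copied from sequence 1 of the sequence listing (positions 541-660).
# Each row holds six blocks of ten bases followed by a running position count.
ROWS = """
ggctggtgca catccaatgc gatgatcaag gttttagtac tctggaaaca gaatctacta 600
aaacaaggca aaatgccgtg tttatctcgt caacttgttg gcgagataac aaagcaccag 660
"""
FIRST_POSITION = 541  # 1-based position of the first base contained in ROWS


def parse_listing_rows(text: str) -> str:
    """Join the base blocks of listing rows, dropping spaces and position counts."""
    return "".join(re.findall(r"[acgt]+", text.lower()))


def feature(seq: str, start: int, end: int, first_position: int = 1) -> str:
    """Return the 1-based, inclusive interval start..end of a sequence whose
    first base sits at `first_position` of the full-length molecule."""
    if start < first_position or end < start or end - first_position >= len(seq):
        raise ValueError("interval lies outside the supplied sequence")
    return seq[start - first_position : end - first_position + 1]


fragment = parse_listing_rows(ROWS)
spacer = feature(fragment, 551, 570, FIRST_POSITION)    # target-derived part of the first sgRNA
backbone = feature(fragment, 571, 647, FIRST_POSITION)  # Original sgRNA backbone (DNA)
print(spacer, len(backbone))
```

The 77 nt slice at positions 571-647 is identical to the DNA sequence of the Original sgRNA backbone given in Example 1, and the 20 nt immediately upstream (positions 551-570) is the part of the first sgRNA that is transcribed from the target sequence.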
The nucleotide sequence of the SaKKHn-pBE-2 recombinant expression vector is obtained by replacing positions 474-995 of sequence 1 with sequence 5 and keeping the other sequences unchanged. In sequence 5, positions 1-77 and positions 175-251 are tRNA nucleotide sequences, and positions 78-174 and positions 252-348 are the nucleotide sequences of two sgRNAs targeting the OsNRT1.1B gene and the OsGRF4 gene, respectively. Positions 98-174 and positions 272-348 are the DNA sequence of the Original sgRNA backbone. The two targets in the SaKKHn-pBE-2 recombinant expression vector are T3 and T4, whose sequences are shown in Table 1.
The nucleotide sequence of the SaKKHn-pBE +3bp-1 recombinant expression vector is obtained by replacing each DNA sequence of the Original sgRNA backbone in the sequence of the SaKKHn-pBE-1 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in sequence 6 and keeping the other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +3bp-2 recombinant expression vector is obtained by replacing each DNA sequence of the Original sgRNA backbone in the sequence of the SaKKHn-pBE-2 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in sequence 6 and keeping the other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +8bp-1 recombinant expression vector is obtained by replacing each DNA sequence of the Original sgRNA backbone in the sequence of the SaKKHn-pBE-1 recombinant expression vector with the DNA sequence of the +8bp sgRNA backbone shown in sequence 7 and keeping the other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +8bp-2 recombinant expression vector is obtained by replacing each DNA sequence of the Original sgRNA backbone in the sequence of the SaKKHn-pBE-2 recombinant expression vector with the DNA sequence of the +8bp sgRNA backbone shown in sequence 7 and keeping the other sequences unchanged.
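The derivation of the +3bp and +8bp vectors amounts to substituting every copy of the Original backbone DNA with the modified backbone DNA. The sketch below illustrates this on a short toy construct rather than the full vector. It assumes that the DNA sequence of the +3bp backbone (sequence 6, which is not reproduced in this part of the text) is the +3bp RNA sequence with U replaced by T; the function names and the toy filler sequences are hypothetical.

```python
ORIGINAL_BACKBONE_DNA = (
    "GTTTTAGTACTCTGGAAACAGAATCTACTAAAACAAGGCAAAATGCCGTGTTTATCTCGTCAACTTGTTGGCGAGAT"
)
PLUS_3BP_BACKBONE_RNA = (
    "GUUUUAGUACUCUGCUGGAAACAGCAGAAUCUACUAAAACAAGGCAAAAUGCCGUGUUUAUCUCGUCAACUUGUUGGCGAGAU"
)


def rna_to_dna(rna: str) -> str:
    """Write an RNA sequence as the corresponding DNA sequence (U -> T)."""
    return rna.upper().replace("U", "T")


def swap_backbone(vector: str, old_backbone: str, new_backbone: str) -> tuple[str, int]:
    """Replace every copy of old_backbone in vector; return the new sequence
    (lowercase) and the number of copies replaced."""
    vector, old, new = vector.lower(), old_backbone.lower(), new_backbone.lower()
    copies = vector.count(old)
    if copies == 0:
        raise ValueError("backbone not found in vector sequence")
    return vector.replace(old, new), copies


# Toy construct with two sgRNA units; spacers and linker are illustrative filler.
toy_vector = (
    "catccaatgcgatgatcaag" + ORIGINAL_BACKBONE_DNA
    + "aacaaagcacc"
    + "aaatcaccagtggaagctaa" + ORIGINAL_BACKBONE_DNA
)
plus_3bp_vector, copies = swap_backbone(
    toy_vector, ORIGINAL_BACKBONE_DNA, rna_to_dna(PLUS_3BP_BACKBONE_RNA)
)
print(copies, len(plus_3bp_vector) - len(toy_vector))
```

Each replaced copy lengthens the construct by 6 bp for the +3bp backbone (16 bp for the +8bp backbone), so coordinates downstream of the sgRNA units shift accordingly in the derived vectors.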
The target nucleotide sequence and the corresponding PAM sequence of each vector are shown in table 1.
Table 1 (provided as image BDA0002295817890000101 in the original publication)
2. Obtaining rice positive T0 seedlings
Each of the SaKKHn-pBE-1, SaKKHn-pBE-2, SaKKHn-pBE +3bp-1, SaKKHn-pBE +3bp-2, SaKKHn-pBE +8bp-1 and SaKKHn-pBE +8bp-2 vectors obtained in step 1 was processed according to the following steps 1-9:
1. The vector was introduced into Agrobacterium EHA105 (product of Shanghai Diego Biotechnology Ltd., CAT #: AC 1010) to obtain recombinant Agrobacterium.
2. The recombinant Agrobacterium was cultured in medium (YEP medium containing 50 μg/ml kanamycin and 25 μg/ml rifampicin) with shaking at 28 ℃ and 150 rpm to an OD600 of 1.0-2.0, then centrifuged at room temperature at 10000 rpm for 1 min. The cells were resuspended in infection solution (N6 liquid medium in which the sucrose is replaced by glucose and sucrose, at concentrations of 10 g/L and 20 g/L respectively) and diluted to an OD600 of 0.2 to obtain the Agrobacterium infection solution.
3. Mature seeds of the rice variety Nipponbare were dehulled, placed in a 100 mL Erlenmeyer flask, soaked in 70% (v/v) aqueous ethanol for 30 sec, then transferred to 25% (v/v) aqueous sodium hypochlorite and sterilized with shaking at 120 rpm for 30 min, rinsed 3 times with sterile water, blotted dry with filter paper, placed embryo-side down on N6 solid medium and cultured in the dark at 28 ℃ for 4-6 weeks to obtain rice callus.
4. After step 3, the rice callus was soaked for 10 min in Agrobacterium infection solution A (the Agrobacterium infection solution supplemented with acetosyringone at a volume ratio of 25 μl acetosyringone to 50 ml Agrobacterium infection solution), then placed in a culture dish lined with two layers of sterile filter paper (containing about 200 ml of infection solution without Agrobacterium) and cultured in the dark at 21 ℃ for 1 day.
5. The rice callus obtained in step 4 was placed on recovery medium and cultured in the dark at 25-28 ℃ for 3 days.
6. The rice callus obtained in step 5 was placed on screening medium and cultured in the dark at 28 ℃ for 2 weeks.
7. The rice callus obtained in step 6 was placed on fresh screening medium and cultured in the dark at 28 ℃ for a further 2 weeks to obtain resistant rice callus.
8. The resistant rice callus obtained in step 7 was placed on differentiation medium and cultured under light at 25 ℃ for about 1 month; the differentiated plantlets were transferred to rooting medium and cultured under light at 25 ℃ for 2 weeks to obtain rice T0 seedlings.
9. Genomic DNA of the rice T0 seedlings was extracted and used as template for PCR amplification with the primer pair consisting of primer F (5'-attatgtagcttgtgcgtttcg-3') and primer R (5'-ctccacctcattgacattatgc-3'). The PCR amplification product was subjected to agarose gel electrophoresis and judged as follows: if the product contains a DNA fragment of about 898 bp, the corresponding rice T0 seedling is a rice positive T0 seedling; if the product does not contain a DNA fragment of about 898 bp, the corresponding rice T0 seedling is not a rice positive T0 seedling.
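The size criterion in step 9 can be checked in silico by locating the forward primer and the reverse complement of the reverse primer on the template. The sketch below is a simplified illustration: it requires exact primer matches, searches one strand only, and uses a synthetic template with hypothetical filler rather than the full vector sequence.

```python
DNA_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (lowercase output)."""
    return seq.lower().translate(DNA_COMPLEMENT)[::-1]


def amplicon_length(template: str, forward: str, reverse: str):
    """Return the expected product size in bp, or None if either primer
    site is missing from the given strand of the template."""
    template = template.lower()
    start = template.find(forward.lower())
    if start < 0:
        return None
    reverse_site = reverse_complement(reverse)
    site_start = template.find(reverse_site, start + len(forward))
    if site_start < 0:
        return None
    return site_start + len(reverse_site) - start


PRIMER_F = "attatgtagcttgtgcgtttcg"
PRIMER_R = "ctccacctcattgacattatgc"

# Synthetic template: the filler length is chosen so that the two primer
# sites span 898 bp, mimicking a transgene-positive seedling.
filler = "a" * (898 - len(PRIMER_F) - len(PRIMER_R))
toy_template = "ccc" + PRIMER_F + filler + reverse_complement(PRIMER_R) + "ggg"

print(amplicon_length(toy_template, PRIMER_F, PRIMER_R))  # 898
print(amplicon_length("a" * 1000, PRIMER_F, PRIMER_R))    # None
```

In sequence 1 of the sequence listing, primer F appears to match positions 2538-2559 (within the OsUbq3 promoter) and the reverse complement of primer R positions 3414-3435 (within the SaKKHn coding sequence), which is consistent with the 898 bp product and with the product being specific to the introduced T-DNA.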
3. Analysis of results
1. For each vector, genomic DNA of the rice positive T0 seedlings obtained in step 2 was used as template for PCR amplification: primer pair T1 for target T1, primer pair T2 for target T2, primer pair T3 for target T3 and primer pair T4 for target T4, giving the PCR amplification products.
2. The PCR amplification products obtained in step 1 were subjected to Sanger sequencing and analysis. Only the target region was analyzed in each sequencing result. The number of positive T0 seedlings with a C·T base substitution at T1, T2, T3 and T4 was counted and the C·T base substitution efficiency was calculated; the results are shown in Table 2.
The results show that, for all four targets, the SaKKHn&PmCDA1&UGI base editing system using +3bp sgRNA improved the C·T base substitution efficiency compared with the SaKKHn&PmCDA1&UGI base editing system using Original sgRNA; for the T2 target alone, the C·T base substitution efficiency was improved by 3 times. The SaKKHn&PmCDA1&UGI base editing system using +8bp sgRNA was unstable: it improved the C·T base substitution efficiency at the T2, T3 and T4 targets to varying degrees, but reduced it to some extent at the T1 target. Overall, except at the T4 target, the SaKKHn&PmCDA1&UGI base editing system using +3bp sgRNA achieved C·T base substitution more efficiently than the system using +8bp sgRNA.
Table 2 (provided as image BDA0002295817890000111 in the original publication)
Example 3 Application of +3bp sgRNA in improving the C·T base substitution efficiency of the SaKKHn&PmCDA1&UGI base editing system
1. Construction of recombinant expression vectors
The following recombinant expression vectors were artificially synthesized; each is a circular plasmid:
five recombinant expression vectors containing Original sgRNA: SaKKHn-pBE-3, SaKKHn-pBE-4, SaKKHn-pBE-5, SaKKHn-pBE-6 and SaKKHn-pBE-7;
five recombinant expression vectors containing +3bp sgRNA: SaKKHn-pBE +3bp-3, SaKKHn-pBE +3bp-4, SaKKHn-pBE +3bp-5, SaKKHn-pBE +3bp-6 and SaKKHn-pBE +3bp-7.
The nucleotide sequence of the SaKKHn-pBE-3 recombinant expression vector is obtained by replacing positions 474-995 of sequence 1 with sequence 8 and keeping the other sequences unchanged. In sequence 8, positions 1-77 are the nucleotide sequence of tRNA, positions 78-174 are the nucleotide sequence of the sgRNA targeting the OsWaxy gene, and positions 98-174 are the DNA sequence of the Original sgRNA backbone. The target in the SaKKHn-pBE-3 recombinant expression vector is T5, whose sequence is shown in Table 3.
The nucleotide sequence of the SaKKHn-pBE-4 recombinant expression vector is obtained by replacing the T5 target sequence in the sequence of the SaKKHn-pBE-3 recombinant expression vector with the T6 target sequence and keeping the other sequences unchanged. The T6 target sequence is shown in Table 3.
The nucleotide sequence of the SaKKHn-pBE-5 recombinant expression vector is obtained by replacing the T5 target sequence in the sequence of the SaKKHn-pBE-3 recombinant expression vector with the T7 target sequence and keeping the other sequences unchanged. The T7 target sequence is shown in Table 3.
The nucleotide sequence of the SaKKHn-pBE-6 recombinant expression vector is obtained by replacing the T5 target sequence in the sequence of the SaKKHn-pBE-3 recombinant expression vector with the T8 target sequence and keeping the other sequences unchanged. The T8 target sequence is shown in Table 3.
The nucleotide sequence of the SaKKHn-pBE-7 recombinant expression vector is obtained by replacing the T5 target sequence in the sequence of the SaKKHn-pBE-3 recombinant expression vector with the T9 target sequence and keeping the other sequences unchanged. The T9 target sequence is shown in Table 3.
The nucleotide sequence of the SaKKHn-pBE +3bp-3 recombinant expression vector is obtained by replacing the DNA sequence of the Original sgRNA backbone in the sequence of the SaKKHn-pBE-3 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in sequence 6 and keeping the other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +3bp-4 recombinant expression vector is obtained by replacing the DNA sequence of the Original sgRNA backbone in the sequence of the SaKKHn-pBE-4 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in sequence 6 and keeping the other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +3bp-5 recombinant expression vector is obtained by replacing the DNA sequence of the Original sgRNA backbone in the sequence of the SaKKHn-pBE-5 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in sequence 6 and keeping the other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +3bp-6 recombinant expression vector is obtained by replacing the DNA sequence of the Original sgRNA backbone in the sequence of the SaKKHn-pBE-6 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in sequence 6 and keeping the other sequences unchanged.
The nucleotide sequence of the SaKKHn-pBE +3bp-7 recombinant expression vector is obtained by replacing the DNA sequence of the Original sgRNA backbone in the sequence of the SaKKHn-pBE-7 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in sequence 6 and keeping the other sequences unchanged.
The target nucleotide sequence and the corresponding PAM sequence for each vector are shown in table 3.
TABLE 3
Target name  Target gene  Target sequence (5'-3')  PAM     Recombinant expression vectors
T5           OsWaxy       tcctcggcgtagtacgggct     CACGGT  SaKKHn-pBE-3; SaKKHn-pBE+3bp-3
T6           OsWaxy       tatccgggcaaggtgagggc     CGTGGT  SaKKHn-pBE-4; SaKKHn-pBE+3bp-4
T7           OsGRF4       acgccggcaccgccctggct     CTGGGT  SaKKHn-pBE-5; SaKKHn-pBE+3bp-5
T8           OsALS        cccaagcatgcgcagggaca     ACGGGT  SaKKHn-pBE-6; SaKKHn-pBE+3bp-6
T9           OsALS        cacgtccttcccgctcgagg     CCGGGT  SaKKHn-pBE-7; SaKKHn-pBE+3bp-7
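The PAM of each target sequence is stated earlier in this document to be NNNRRT, where N is any base and R is a purine (A or G) in IUPAC notation. A short check of the Table 3 PAMs against that pattern is sketched below; the function and variable names are hypothetical.

```python
import re

# NNNRRT: N = any base, R = purine (A or G), followed by T.
NNNRRT = re.compile(r"[ACGT]{3}[AG]{2}T")

# PAM sequences as listed in Table 3.
TABLE_3_PAMS = {
    "T5": "CACGGT",
    "T6": "CGTGGT",
    "T7": "CTGGGT",
    "T8": "ACGGGT",
    "T9": "CCGGGT",
}


def matches_nnnrrt(pam: str) -> bool:
    """Return True if the 6 nt PAM conforms to the NNNRRT pattern."""
    return NNNRRT.fullmatch(pam.upper()) is not None


for target, pam in TABLE_3_PAMS.items():
    print(target, pam, matches_nnnrrt(pam))
```

All five PAMs in Table 3 satisfy the pattern; a sequence such as CACGCT, with a pyrimidine at the fifth position, would not.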
2. Obtaining rice positive T0 seedlings
Each of the SaKKHn-pBE-3, SaKKHn-pBE-4, SaKKHn-pBE-5, SaKKHn-pBE-6, SaKKHn-pBE-7, SaKKHn-pBE +3bp-3, SaKKHn-pBE +3bp-4, SaKKHn-pBE +3bp-5, SaKKHn-pBE +3bp-6 and SaKKHn-pBE +3bp-7 vectors constructed in step 1 was processed according to steps 1-9 of step 2 in Example 2 to obtain rice positive T0 seedlings.
3. Analysis of results
1. For each vector, genomic DNA of the rice positive T0 seedlings obtained in step 2 was used as template for PCR amplification: primer pair T5 for target T5, primer pair T6 for target T6, primer pair T4 for target T7, primer pair T8 for target T8 and primer pair T9 for target T9, giving the PCR amplification products.
2. The PCR amplification products obtained in step 1 were subjected to Sanger sequencing and analysis. Only the target region was analyzed in each sequencing result. The number of positive T0 seedlings with a C·T base substitution at T5, T6, T7, T8 and T9 was counted and the C·T base substitution efficiency was calculated; the results are shown in Table 4.
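Scoring a seedling as edited comes down to comparing the sequenced target region with the reference target sequence and looking for C-to-T changes. The sketch below is a simplification: it assumes a single base call per position (no heterozygous double peaks) and a read already oriented on the same strand as the target. The reference is the T5 target from Table 3; the observed read is a hypothetical example, not data from this document.

```python
def ct_substitutions(reference: str, observed: str) -> list[int]:
    """Return the 1-based positions in the target at which the reference has C
    and the observed sequence has T. Other differences are ignored."""
    if len(reference) != len(observed):
        raise ValueError("reference and observed sequences must have equal length")
    ref, obs = reference.lower(), observed.lower()
    return [
        position
        for position, (ref_base, obs_base) in enumerate(zip(ref, obs), start=1)
        if ref_base == "c" and obs_base == "t"
    ]


T5_TARGET = "tcctcggcgtagtacgggct"       # target sequence from Table 3
HYPOTHETICAL_READ = "ttttcggcgtagtacgggct"  # illustrative read, C2 and C3 changed to T

positions = ct_substitutions(T5_TARGET, HYPOTHETICAL_READ)
is_edited = bool(positions)
print(positions, is_edited)  # [2, 3] True
```

A seedling counts toward the numerator of the C·T base substitution efficiency as soon as at least one such position is found, since the definition covers a C-to-T mutation at any position in the target sequence.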
The results show that, for all five targets, the SaKKHn&PmCDA1&UGI base editing system using +3bp sgRNA improved the C·T base substitution efficiency compared with the SaKKHn&PmCDA1&UGI base editing system using Original sgRNA. For the T9 target in particular, the system using Original sgRNA failed to produce any C·T base substitution, whereas the system using +3bp sgRNA achieved it successfully.
Table 4 (provided as image BDA0002295817890000131 in the original publication)
The present invention has been described in detail above. It will be apparent to those skilled in the art that the invention can be practiced in a wide range of equivalent parameters, concentrations, and conditions without departing from the spirit and scope of the invention and without undue experimentation. While the invention has been described with reference to specific examples, it will be appreciated that the invention may be further modified. In general, this application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. The use of some of the essential features is possible within the scope of the claims attached below.
Sequence listing
<110> Beijing Academy of Agriculture and Forestry Sciences
<120> Efficient sgRNA and application thereof in gene editing
<160>10
<170>PatentIn version 3.5
<210>1
<211>17400
<212>DNA
<213>Artificial Sequence
<400>1
ggtggcagga tatattgtgg tgtaaacatg gcactagcct caccgtcttc gcagacgagg 60
ccgctaagtc gcagctacgc tctcaacggc actgactagg tagtttaaac gtgcacttaa 120
ttaaggtacc gaagcaactt aaagttatca ggcatgcatg gatcttggag gaatcagatg 180
tgcagtcagg gaccatagca caagacaggc gtcttctact ggtgctacca gcaaatgctg 240
gaagccggga acactgggta cgttggaaac cacgtgatgt gaagaagtaa gataaactgt 300
aggagaaaag catttcgtag tgggccatga agcctttcag gacatgtatt gcagtatggg 360
ccggcccatt acgcaattgg acgacaacaa agactagtat tagtaccacc tcggctatcc 420
acatagatca aagctgattt aaaagagttg tgcagatgat ccgtggcgga tccaacaaag 480
caccagtggt ctagtggtag aatagtaccc tgccacggta cagacccggg ttcgattccc 540
ggctggtgca catccaatgc gatgatcaag gttttagtac tctggaaaca gaatctacta 600
aaacaaggca aaatgccgtg tttatctcgt caacttgttg gcgagataac aaagcaccag 660
tggtctagtg gtagaatagt accctgccac ggtacagacc cgggttcgat tcccggctgg 720
tgcaaatcac cagtggaagc taaggtttta gtactctgga aacagaatct actaaaacaa 780
ggcaaaatgc cgtgtttatc tcgtcaactt gttggcgaga taacaaagca ccagtggtct 840
agtggtagaa tagtaccctg ccacggtaca gacccgggtt cgattcccgg ctggtgcaac 900
cggatttgaa cgatggacgt tttagtactc tggaaacaga atctactaaa acaaggcaaa 960
atgccgtgtt tatctcgtca acttgttggc gagatttttt tttttcgttt tgcattgagt 1020
tttctccgtc gcatgtttgc agttttattt tccgttttgc attgaaattt ctccgtctca 1080
tgtttgcagc gtgttcaaaa agtacgcagc tgtatttcac ttatttacgg cgccacattt 1140
tcatgccgtt tgtgccaact atcccgagct agtgaataca gcttggcttc acacaacact 1200
ggtgacccgc tgacctgctc gtacctcgta ccgtcgtacg gcacagcatt tggaattaaa 1260
gggtgtgatc gatactgctt gctgctaagc ttacaaattc gggtcaaggc ggaagccagc 1320
gcgccacccc acgtcagcaa atacggaggc gcggggttga cggcgtcacc cggtcctaac 1380
ggcgaccaac aaaccagcca gaagaaatta cagtaaaaaa aaagtaaatt gcactttgat 1440
ccacctttta ttacctaagt ctcaatttgg atcaccctta aacctatctt ttcaatttgg 1500
gccgggttgt ggtttggact accatgaaca acttttcgtc atgtctaact tccctttcag 1560
caaacatatg aaccatatat agaggagatc ggccgtatac tagagctgat gtgtttaagg 1620
tcgttgattg cacgagaaaa aaaaatccaa atcgcaacaa tagcaaattt atctggttca 1680
aagtgaaaag atatgtttaa aggtagtcca aagtaaaact tatagataat aaaatgtggt 1740
ccaaagcgta attcactcaa aaaaaatcaa cgagacgtgt accaaacgga gacaaacggc 1800
atcttctcga aatttcccaa ccgctcgctc gcccgcctcg tcttcccgga aaccgcggtg 1860
gtttcagcgt ggcggattct ccaagcagac ggagacgtca cggcacggga ctcctcccac 1920
cacccaaccg ccataaatac cagccccctc atctcctctc ctcgcatcag ctccaccccc 1980
gaaaaatttc tccccaatct cgcgaggctc tcgtcgtcga atcgaatcct ctcgcgtcct 2040
caaggtacgc tgcttctcct ctcctcgctt cgtttcgatt cgatttcgga cgggtgaggt 2100
tgttttgttg ctagatccga ttggtggtta gggttgtcga tgtgattatc gtgagatgtt 2160
taggggttgt agatctgatg gttgtgattt gggcacggtt ggttcgatag gtggaatcgt 2220
ggttaggttt tgggattgga tgttggttct gatgattggg gggaattttt acggttagat 2280
gaattgttgg atgattcgat tggggaaatc ggtgtagatc tgttggggaa ttgtggaact 2340
agtcatgcct gagtgattgg tgcgatttgt agcgtgttcc atcttgtagg ccttgttgcg 2400
agcatgttca gatctactgt tccgctcttg attgagttat tggtgccatg ggttggtgca 2460
aacacaggct ttaatatgtt atatctgttt tgtgtttgat gtagatctgt agggtagttc 2520
ttcttagaca tggttcaatt atgtagcttg tgcgtttcga tttgatttca tatgttcaca 2580
gattagataa tgatgaactc ttttaattaa ttgtcaatgg taaataggaa gtcttgtcgc 2640
tatatctgtc ataatgatct catgttacta tctgccagta atttatgcta agaactatat 2700
tagaatatca tgttacaatc tgtagtaata tcatgttaca atctgtagtt catctatata 2760
atctattgtg gtaatttctt tttactatct gtgtgaagat tattgccact agttcattct 2820
acttatttct gaagttcagg atacgtgtgc tgttactacc tatctgaata catgtgtgat 2880
gtgcctgtta ctatcttttt gaatacatgt atgttctgtt ggaatatgtt tgctgtttga 2940
tccgttgttg tgtccttaat cttgtgctag ttcttaccct atctgtttgg tgattatttc 3000
ttgcagtacg taatggctcc taagaagaag cggaaggttg gcatccacgg tgtcccggcg 3060
gcaaagagaa actacatcct gggtctggcc atcggtatta catcggtggg ctacggcatc 3120
atcgactacg agacaaggga tgtcatcgat gccggcgtcc ggctcttcaa ggaggccaac 3180
gtggagaata acgagggcag gcgctccaag cgcggcgcgc ggaggctgaa gcgcaggcgg 3240
aggcatcgca tccagcgggt gaagaagctc ctcttcgact acaatctgct cacggatcat 3300
tccgagctgt ctggcatcaa cccatacgag gcgcgggtga agggcctgtc ccagaagctc 3360
tcggaggagg agttctcggc ggccctgctg catctcgcga agaggcgcgg cgtgcataat 3420
gtcaatgagg tggaggagga taccggcaat gagctgtcaa ccaaggagca gatcagcagg 3480
aactccaagg cgctggagga gaagtatgtg gcggagctcc agctcgagag gctgaagaag 3540
gatggcgagg tccggggctc catcaatagg ttcaagacat cggactacgt gaaggaggcc 3600
aagcagctcc tgaaggtgca gaaggcgtac caccagctgg accagagctt catcgacacc 3660
tacatcgatc tgctcgagac acgccggacg tactacgagg gcccgggcga gggctcaccg 3720
ttcggctgga aggacatcaa ggagtggtac gagatgctga tgggccactg cacctacttc 3780
cctgaggagc tgaggagcgt gaagtacgcg tacaatgcgg acctctacaa cgccctgaac 3840
gacctcaata acctcgtgat cacgcgcgac gagaatgaga agctcgagta ctacgagaag 3900
ttccagatca tcgagaacgt gttcaagcag aagaagaagc cgaccctcaa gcagatcgcc 3960
aaggagatcc tcgtcaatga ggaggacatc aagggctaca gggtgacctc gaccggcaag 4020
ccagagttca ccaacctgaa ggtctaccac gacatcaagg atatcaccgc ccgcaaggag 4080
atcatcgaga atgcggagct cctggatcag atcgcgaaga tcctcaccat ctaccagtcc 4140
agcgaggaca tccaggagga gctcacgaac ctgaatagcg agctgaccca ggaggagatc 4200
gagcagatct ccaacctcaa gggctacacc ggcacgcaca atctgagcct caaggcgatc 4260
aatctcatcc tcgatgagct ctggcataca aatgataacc agatcgccat cttcaatcgc 4320
ctcaagctgg tcccaaagaa ggtcgatctg tcgcagcaga aggagatccc aacgacactg 4380
gtcgatgact tcatcctctc acctgtcgtg aagaggtcgt tcatccagtc gatcaaggtc 4440
atcaatgcga tcatcaagaa gtacggcctc cctaatgata tcatcatcga gctggcccgc 4500
gagaagaatt caaaggacgc gcagaagatg atcaacgaga tgcagaagag gaatcggcag 4560
acaaacgagc gcatcgagga gatcatccgc acaaccggca aggagaatgc caagtacctg 4620
atcgagaaga tcaagctgca tgacatgcag gagggcaagt gcctctactc actggaggcc 4680
atcccactcg aggacctgct gaataaccca ttcaattacg aggtcgacca tatcatcccg 4740
cgctccgtgt cgttcgacaa ttccttcaat aacaaggtcc tcgtcaagca ggaggagaac 4800
tccaagaagg gcaatcgcac cccgttccag tacctgtcct cttcggacag caagatctct 4860
tacgagacat tcaagaagca catcctcaac ctggccaagg gcaagggccg gatctccaag 4920
accaagaagg agtacctcct ggaggagagg gatatcaacc ggttcagcgt gcagaaggac 4980
ttcatcaatc gcaacctggt cgatacccgg tacgccacca ggggcctcat gaacctgctc 5040
cggtcctact tccgggtgaa caatctcgac gtgaaggtca agagcatcaa cggcggcttc 5100
acctcgttcc tcaggcggaa gtggaagttc aagaaggagc ggaacaaggg ctacaagcac 5160
catgccgagg acgccctcat catcgcgaac gcggacttca tcttcaagga gtggaagaag 5220
ctcgataagg cgaagaaggt catggagaac cagatgttcg aggagaagca ggccgagtcg 5280
atgccagaga tcgagacaga gcaggagtac aaggagatct tcatcacccc gcaccagatc 5340
aagcacatca aggacttcaa ggactacaag tactcccatc gggtcgataa gaagccaaat 5400
cggaagctca tcaatgatac cctctactcg acacgcaagg atgacaaggg caacaccctg 5460
atcgtcaata acctcaatgg cctctacgac aaggataacg acaagctgaa gaagctcatc 5520
aacaagagcc cagagaagct cctcatgtac caccacgatc cgcagacata ccagaagctc 5580
aagctgatca tggagcagta cggcgacgag aagaacccac tctacaagta ctacgaggag 5640
acaggcaact acctgaccaa gtactccaag aaggacaatg gcccagtgat caagaagatc 5700
aagtactacg gcaataagct gaacgcccac ctcgatatca cggacgatta ccctaacagc 5760
cggaataagg tggtcaagct gtccctcaag ccgtaccgct tcgacgtcta cctggataac 5820
ggcgtctaca agttcgtgac agtcaagaat ctcgacgtca tcaagaagga gaactactac 5880
gaggtcaatt ctaagtgcta cgaggaggcc aagaagctca agaagatcag caaccaggcc 5940
gagttcatcg ccagcttcta caagaacgat ctgatcaaga tcaacggcga gctctacagg 6000
gtcatcggcg tgaacaatga cctgctcaat aggatcgagg tgaacatgat cgacatcacc 6060
taccgcgagt acctcgagaa catgaacgat aagcggcctc cacacatcat caagacaatc 6120
gcctctaaga cccagtccat caagaagtac tccacggata tcctcggcaa cctctacgag 6180
gtgaagtcaa agaagcaccc gcagatcatc aagaagggct cggctggagg aggaggcacg 6240
ggaggaggag gctccgccga gtatgtgcgc gcgctcttcg acttcaacgg caatgacgag 6300
gaggatctcc ctttcaagaa gggcgacatc ctccgcatcc gcgataagcc ggaggagcag 6360
tggtggaacg cagaggactc cgagggcaag cggggcatga tcctggtgcc atacgtcgag 6420
aagtacagcg gcgattacaa ggaccacgat ggcgactaca aggatcatga catcgattac 6480
aaggacgatg acgataagtc cggcgtcgac atgacggacg cggagtatgt gcgcatccac 6540
gagaagctcg atatctacac cttcaagaag cagttcttca acaataagaa gtcggtgtcc 6600
catcggtgct acgtcctctt cgagctgaag cgcaggggag agcgccgcgc ctgcttctgg 6660
ggctacgcgg tgaataagcc gcagtcaggc acagagcgcg gcatccacgc cgagatcttc 6720
tcgatccgga aggtcgagga gtacctccgc gacaacccag gccagttcac gatcaattgg 6780
tactccagct ggtccccttg cgcagattgc gcagagaaga tcctcgagtg gtacaaccag 6840
gagctgaggg gcaatggcca taccctcaag atctgggcct gcaagctgta ctacgagaag 6900
aacgcgagga atcagatcgg cctctggaac ctgcgggata atggcgtggg cctcaacgtg 6960
atggtgtccg agcactacca gtgctgccgc aagatcttca tccagtcctc ccacaatcag 7020
ctgaacgaga ataggtggct cgaaaagacc ctgaagcgcg ccgagaagtg gaggagcgag 7080
ctgtctatca tgatccaggt caagatcctg cacaccacaa agtcaccggc ggtgggcggc 7140
ggcggcagcg aattctccgg cggcagcacg aacctcagcg acatcatcga gaaggagaca 7200
ggcaagcagc tcgtgatcca ggagtctatc ctcatgctgc ctgaggaggt ggaggaggtc 7260
atcggcaaca agccggagtc cgatatcctc gtgcacaccg cctacgacga gtcgacagat 7320
gagaatgtca tgctcctgac ctccgacgca ccagagtaca agccatgggc gctcgtgatc 7380
caggattcca acggcgagaa taagatcaag atgctgtctg gcggctcccc gaagaagaag 7440
cgcaaggtct agactagtct gaaatcacca gtctctctct acaaatctat ctctctctat 7500
aataatgtgt gagtagttcc cagataaggg aattagggtt cttatagggt ttcgctcatg 7560
tgttgagcat ataagaaacc cttagtatgt atttgtattt gtaaaatact tctatcaata 7620
aaatttctaa ttcctaaaac caaaatccag tggggcgccc gacctgtact cgcgaaggtt 7680
aacttacaga gagtgtccgg gcgcgcctgg tggatcgtcc gcctaggctg cagtgcagcg 7740
tgacccggtc gtgcccctct ctagagataa tgagcattgc atgtctaagt tataaaaaat 7800
taccacatat tttttttgtc acacttgttt gaagtgcagt ttatctatct ttatacatat 7860
atttaaactt tactctacga ataatataat ctatagtact acaataatat cagtgtttta 7920
gagaatcata taaatgaaca gttagacatg gtctaaagga caattgagta ttttgacaac 7980
aggactctac agttttatct ttttagtgtg catgtgttct cctttttttt tgcaaatagc 8040
ttcacctata taatacttca tccattttat tagtacatcc atttagggtt tagggttaat 8100
ggtttttata gactaatttt tttagtacat ctattttatt ctattttagc ctctaaatta 8160
agaaaactaa aactctattt tagttttttt atttaataat ttagatataa aatagaataa 8220
aataaagtga ctaaaaatta aacaaatacc ctttaagaaa ttaaaaaaac taaggaaaca 8280
tttttcttgt ttcgagtaga taatgccagc ctgttaaacg ccgtcgacga gtctaacgga 8340
caccaaccag cgaaccagca gcgtcgcgtc gggccaagcg aagcagacgg cacggcatct 8400
ctgtcgctgc ctctggaccc ctctcgagag ttccgctcca ccgttggact tgctccgctg 8460
tcggcatcca gaaattgcgt ggcggagcgg cagacgtgag ccggcacggc aggcggcctc 8520
ctcctcctct cacggcaccg gcagctacgg gggattcctt tcccaccgct ccttcgcttt 8580
cccttcctcg cccgccgtaa taaatagaca ccccctccac accctctttc cccaacctcg 8640
tgttgttcgg agcgcacaca cacacaacca gatctccccc aaatccaccc gtcggcacct 8700
ccgcttcaag gtacgccgct cgtcctcccc ccccccccct ctctaccttc tctagatcgg 8760
cgttccggtc catggttagg gcccggtagt tctacttctg ttcatgtttg tgttagatcc 8820
gtgtttgtgt tagatccgtg ctgctagcgt tcgtacacgg atgcgacctg tacgtcagac 8880
acgttctgat tgctaacttg ccagtgtttc tctttgggga atcctgggat ggctctagcc 8940
gttccgcaga cgggatcgat ttcatgattt tttttgtttc gttgcatagg gtttggtttg 9000
cccttttcct ttatttcaat atatgccgtg cacttgtttg tcgggtcatc ttttcatgct 9060
tttttttgtc ttggttgtga tgatgtggtc tggttgggcg gtcgttctag atcggagtag 9120
aattctgttt caaactacct ggtggattta ttaattttgg atctgtatgt gtgtgccata 9180
catattcata gttacgaatt gaagatgatg gatggaaata tcgatctagg ataggtatac 9240
atgttgatgc gggttttact gatgcatata cagagatgct ttttgttcgc ttggttgtga 9300
tgatgtggtg tggttgggcg gtcgttcatt cgttctagat cggagtagaa tactgtttca 9360
aactacctgg tgtatttatt aattttggaa ctgtatgtgt gtgtcataca tcttcatagt 9420
tacgagttta agatggatgg aaatatcgat ctaggatagg tatacatgtt gatgtgggtt 9480
ttactgatgc atatacatga tggcatatgc agcatctatt catatgctct aaccttgagt 9540
acctatctat tataataaac aagtatgttt tataattatt ttgatcttga tatacttgga 9600
tgatggcata tgcagcagct atatgtggat ttttttagcc ctgccttcat acgctattta 9660
tttgcttggt actgtttctt ttgtcgatgc tcaccctgtt gtttggtgtt acttctgcag 9720
gagctcatga aaaagcctga actcaccgcg acgtctgtcg agaagtttct gatcgaaaag 9780
ttcgacagcg tctccgacct gatgcagctc tcggagggcg aagaatctcg tgctttcagc 9840
ttcgatgtag gagggcgtgg atatgtcctg cgggtaaata gctgcgccga tggtttctac 9900
aaagatcgtt atgtttatcg gcactttgca tcggccgcgc tcccgattcc ggaagtgctt 9960
gacattgggg agtttagcga gagcctgacc tattgcatct cccgccgttc acagggtgtc 10020
acgttgcaag acctgcctga aaccgaactg cccgctgttc tacaaccggt cgcggaggct 10080
atggatgcga tcgctgcggc cgatcttagc cagacgagcg ggttcggccc attcggaccg 10140
caaggaatcg gtcaatacac tacatggcgt gatttcatat gcgcgattgc tgatccccat 10200
gtgtatcact ggcaaactgt gatggacgac accgtcagtg cgtccgtcgc gcaggctctc 10260
gatgagctga tgctttgggc cgaggactgc cccgaagtcc ggcacctcgt gcacgcggat 10320
ttcggctcca acaatgtcct gacggacaat ggccgcataa cagcggtcat tgactggagc 10380
gaggcgatgt tcggggattc ccaatacgag gtcgccaaca tcttcttctg gaggccgtgg 10440
ttggcttgta tggagcagca gacgcgctac ttcgagcgga ggcatccgga gcttgcagga 10500
tcgccacgac tccgggcgta tatgctccgc attggtcttg accaactcta tcagagcttg 10560
gttgacggca atttcgatga tgcagcttgg gcgcagggtc gatgcgacgc aatcgtccga 10620
tccggagccg ggactgtcgg gcgtacacaa atcgcccgca gaagcgcggc cgtctggacc 10680
gatggctgtg tagaagtact cgccgatagt ggaaaccgac gccccagcac tcgtccgagg 10740
gcaaagaaat agagtagatg ccgaccggga tctgtcgatc gacaagctcg agtttctcca 10800
taataatgtg tgagtagttc ccagataagg gaattagggt tcctataggg tttcgctcat 10860
gtgttgagca tataagaaac ccttagtatg tatttgtatt tgtaaaatac ttctatcaat 10920
aaaatttcta attcctaaaa ccaaaatcca gtactaaaat ccagatcccc cgaattaatt 10980
cggcgttaat tcagcctgca ggacgcgttt aattaagtgc acgcggccgc ctacttagtc 11040
aagagcctcg cacgcgactg tcacgcggcc aggatcgcct cgtgagcctc gcaatctgta 11100
cctagtgttt aaactatcag tgtttgacag gatatattgg cgggtaaacc taagagaaaa 11160
gagcgtttat tagaataacg gatatttaaa agggcgtgaa aaggtttatc cgttcgtcca 11220
tttgtatgtg catgccaacc acagggttcc cctcgggatc aaagtacttt gatccaaccc 11280
ctccgctgct atagtgcagt cggcttctga cgttcagtgc agccgtcttc tgaaaacgac 11340
atgtcgcaca agtcctaagt tacgcgacag gctgccgccc tgcccttttc ctggcgtttt 11400
cttgtcgcgt gttttagtcg cataaagtag aatacttgcg actagaaccg gagacattac 11460
gccatgaaca agagcgccgc cgctggcctg ctgggctatg cccgcgtcag caccgacgac 11520
caggacttga ccaaccaacg ggccgaactg cacgcggccg gctgcaccaa gctgttttcc 11580
gagaagatca ccggcaccag gcgcgaccgc ccggagctgg ccaggatgct tgaccaccta 11640
cgccctggcg acgttgtgac agtgaccagg ctagaccgcc tggcccgcag cacccgcgac 11700
ctactggaca ttgccgagcg catccaggag gccggcgcgg gcctgcgtag cctggcagag 11760
ccgtgggccg acaccaccac gccggccggc cgcatggtgt tgaccgtgtt cgccggcatt 11820
gccgagttcg agcgttccct aatcatcgac cgcacccgga gcgggcgcga ggccgccaag 11880
gcccgaggcg tgaagtttgg cccccgccct accctcaccc cggcacagat cgcgcacgcc 11940
cgcgagctga tcgaccagga aggccgcacc gtgaaagagg cggctgcact gcttggcgtg 12000
catcgctcga ccctgtaccg cgcacttgag cgcagcgagg aagtgacgcc caccgaggcc 12060
aggcggcgcg gtgccttccg tgaggacgca ttgaccgagg ccgacgccct ggcggccgcc 12120
gagaatgaac gccaagagga acaagcatga aaccgcacca ggacggccag gacgaaccgt 12180
ttttcattac cgaagagatc gaggcggaga tgatcgcggc cgggtacgtg ttcgagccgc 12240
ccgcgcacgt ctcaaccgtg cggctgcatg aaatcctggc cggtttgtct gatgccaagc 12300
tggcggcctg gccggccagc ttggccgctg aagaaaccga gcgccgccgt ctaaaaaggt 12360
gatgtgtatt tgagtaaaac agcttgcgtc atgcggtcgc tgcgtatatg atgcgatgag 12420
taaataaaca aatacgcaag gggaacgcat gaaggttatc gctgtactta accagaaagg 12480
cgggtcaggc aagacgacca tcgcaaccca tctagcccgc gccctgcaac tcgccggggc 12540
cgatgttctg ttagtcgatt ccgatcccca gggcagtgcc cgcgattggg cggccgtgcg 12600
ggaagatcaa ccgctaaccg ttgtcggcat cgaccgcccg acgattgacc gcgacgtgaa 12660
ggccatcggc cggcgcgact tcgtagtgat cgacggagcg ccccaggcgg cggacttggc 12720
tgtgtccgcg atcaaggcag ccgacttcgt gctgattccg gtgcagccaa gcccttacga 12780
catatgggcc accgccgacc tggtggagct ggttaagcag cgcattgagg tcacggatgg 12840
aaggctacaa gcggcctttg tcgtgtcgcg ggcgatcaaa ggcacgcgca tcggcggtga 12900
ggttgccgag gcgctggccg ggtacgagct gcccattctt gagtcccgta tcacgcagcg 12960
cgtgagctac ccaggcactg ccgccgccgg cacaaccgtt cttgaatcag aacccgaggg 13020
cgacgctgcc cgcgaggtcc aggcgctggc cgctgaaatt aaatcaaaac tcatttgagt 13080
taatgaggta aagagaaaat gagcaaaagc acaaacacgc taagtgccgg ccgtccgagc 13140
gcacgcagca gcaaggctgc aacgttggcc agcctggcag acacgccagc catgaagcgg 13200
gtcaactttc agttgccggc ggaggatcac accaagctga agatgtacgc ggtacgccaa 13260
ggcaagacca ttaccgagct gctatctgaa tacatcgcgc agctaccaga gtaaatgagc 13320
aaatgaataa atgagtagat gaattttagc ggctaaagga ggcggcatgg aaaatcaaga 13380
acaaccaggc accgacgccg tggaatgccc catgtgtgga ggaacgggcg gttggccagg 13440
cgtaagcggc tgggttgtct gccggccctg caatggcact ggaaccccca agcccgagga 13500
atcggcgtga cggtcgcaaa ccatccggcc cggtacaaat cggcgcggcg ctgggtgatg 13560
acctggtgga gaagttgaag gccgcgcagg ccgcccagcg gcaacgcatc gaggcagaag 13620
cacgccccgg tgaatcgtgg caagcggccg ctgatcgaat ccgcaaagaa tcccggcaac 13680
cgccggcagc cggtgcgccg tcgattagga agccgcccaa gggcgacgag caaccagatt 13740
ttttcgttcc gatgctctat gacgtgggca cccgcgatag tcgcagcatc atggacgtgg 13800
ccgttttccg tctgtcgaag cgtgaccgac gagctggcga ggtgatccgc tacgagcttc 13860
cagacgggca cgtagaggtt tccgcagggc cggccggcat ggccagtgtg tgggattacg 13920
acctggtact gatggcggtt tcccatctaa ccgaatccat gaaccgatac cgggaaggga 13980
agggagacaa gcccggccgc gtgttccgtc cacacgttgc ggacgtactc aagttctgcc 14040
ggcgagccga tggcggaaag cagaaagacg acctggtaga aacctgcatt cggttaaaca 14100
ccacgcacgt tgccatgcag cgtacgaaga aggccaagaa cggccgcctg gtgacggtat 14160
ccgagggtga agccttgatt agccgctaca agatcgtaaa gagcgaaacc gggcggccgg 14220
agtacatcga gatcgagcta gctgattgga tgtaccgcga gatcacagaa ggcaagaacc 14280
cggacgtgct gacggttcac cccgattact ttttgatcga tcccggcatc ggccgttttc 14340
tctaccgcct ggcacgccgc gccgcaggca aggcagaagc cagatggttg ttcaagacga 14400
tctacgaacg cagtggcagc gccggagagt tcaagaagtt ctgtttcacc gtgcgcaagc 14460
tgatcgggtc aaatgacctg ccggagtacg atttgaagga ggaggcgggg caggctggcc 14520
cgatcctagt catgcgctac cgcaacctga tcgagggcga agcatccgcc ggttcctaat 14580
gtacggagca gatgctaggg caaattgccc tagcagggga aaaaggtcga aaaggtctct 14640
ttcctgtgga tagcacgtac attgggaacc caaagccgta cattgggaac cggaacccgt 14700
acattgggaa cccaaagccg tacattggga accggtcaca catgtaagtg actgatataa 14760
aagagaaaaa aggcgatttt tccgcctaaa actctttaaa acttattaaa actcttaaaa 14820
cccgcctggc ctgtgcataa ctgtctggcc agcgcacagc cgaagagctg caaaaagcgc 14880
ctacccttcg gtcgctgcgc tccctacgcc ccgccgcttc gcgtcggcct atcgcggccg 14940
ctggccgctc aaaaatggct ggcctacggc caggcaatct accagggcgc ggacaagccg 15000
cgccgtcgcc actcgaccgc cggcgcccac atcaaggcac cctgcctcgc gcgtttcggt 15060
gatgacggtg aaaacctctg acacatgcag ctcccggaga cggtcacagc ttgtctgtaa 15120
gcggatgccg ggagcagaca agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg 15180
ggcgcagcca tgacccagtc acgtagcgat agcggagtgt atactggctt aactatgcgg 15240
catcagagca gattgtactg agagtgcacc atatgcggtg tgaaataccg cacagatgcg 15300
taaggagaaa ataccgcatc aggcgctctt ccgcttcctc gctcactgac tcgctgcgct 15360
cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca 15420
cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga 15480
accgtaaaaa ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc 15540
acaaaaatcg acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg 15600
cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat 15660
acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt 15720
atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc 15780
agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg 15840
acttatcgcc actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg 15900
gtgctacaga gttcttgaag tggtggccta actacggcta cactagaagg acagtatttg 15960
gtatctgcgc tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg 16020
gcaaacaaac caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca 16080
gaaaaaaagg atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga 16140
acgaaaactc acgttaaggg attttggtca tgcattctag gtactaaaac aattcatcca 16200
gtaaaatata atattttatt ttctcccaat caggcttgat ccccagtaag tcaaaaaata 16260
gctcgacata ctgttcttcc ccgatatcct ccctgatcga ccggacgcag aaggcaatgt 16320
cataccactt gtccgccctg ccgcttctcc caagatcaat aaagccactt actttgccat 16380
ctttcacaaa gatgttgctg tctcccaggt cgccgtggga aaagacaagt tcctcttcgg 16440
gcttttccgt ctttaaaaaa tcatacagct cgcgcggatc tttaaatgga gtgtcttctt 16500
cccagttttc gcaatccaca tcggccagat cgttattcag taagtaatcc aattcggcta 16560
agcggctgtc taagctattc gtatagggac aatccgatat gtcgatggag tgaaagagcc 16620
tgatgcactc cgcatacagc tcgataatct tttcagggct ttgttcatct tcatactctt 16680
ccgagcaaag gacgccatcg gcctcactca tgagcagatt gctccagcca tcatgccgtt 16740
caaagtgcag gacctttgga acaggcagct ttccttccag ccatagcatc atgtcctttt 16800
cccgttccac atcataggtg gtccctttat accggctgtc cgtcattttt aaatataggt 16860
tttcattttc tcccaccagc ttatatacct tagcaggaga cattccttcc gtatctttta 16920
cgcagcggta tttttcgatc agttttttca attccggtga tattctcatt ttagccattt 16980
attatttcct tcctcttttc tacagtattt aaagataccc caagaagcta attataacaa 17040
gacgaactcc aattcactgt tccttgcatt ctaaaacctt aaataccaga aaacagcttt 17100
ttcaaagttg ttttcaaagt tggcgtataa catagtatcg acggagccga ttttgaaacc 17160
gcggtgatca caggcagcaa cgctctgtca tcgttacaat caacatgcta ccctccgcga 17220
gatcatccgt gtttcaaacc cggcagctta gttgccgttc ttccgaatag catcggtaac 17280
atgagcaaag tctgccgcct tacaacggct ctcccgctga cgccgtcccg gactgatggg 17340
ctgcctgtat cgagtggtga ttttgtgccg agctgccggt cggggagctg ttggctggct 17400
<210>2
<211>1071
<212>PRT
<213>Artificial Sequence
<400>2
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Lys Arg Asn Tyr Ile Leu Gly Leu Ala Ile Gly Ile Thr Ser Val
20 25 30
Gly Tyr Gly Ile Ile Asp Tyr Glu Thr Arg Asp Val Ile Asp Ala Gly
35 40 45
Val Arg Leu Phe Lys Glu Ala Asn Val Glu Asn Asn Glu Gly Arg Arg
50 55 60
Ser Lys Arg Gly Ala Arg Arg Leu Lys Arg Arg Arg Arg His Arg Ile
65 70 75 80
Gln Arg Val Lys Lys Leu Leu Phe Asp Tyr Asn Leu Leu Thr Asp His
85 90 95
Ser Glu Leu Ser Gly Ile Asn Pro Tyr Glu Ala Arg Val Lys Gly Leu
100 105 110
Ser Gln Lys Leu Ser Glu Glu Glu Phe Ser Ala Ala Leu Leu His Leu
115 120 125
Ala Lys Arg Arg Gly Val His Asn Val Asn Glu Val Glu Glu Asp Thr
130 135 140
Gly Asn Glu Leu Ser Thr Lys Glu Gln Ile Ser Arg Asn Ser Lys Ala
145 150 155 160
Leu Glu Glu Lys Tyr Val Ala Glu Leu Gln Leu Glu Arg Leu Lys Lys
165 170 175
Asp Gly Glu Val Arg Gly Ser Ile Asn Arg Phe Lys Thr Ser Asp Tyr
180 185 190
Val Lys Glu Ala Lys Gln Leu Leu Lys Val Gln Lys Ala Tyr His Gln
195 200 205
Leu Asp Gln Ser Phe Ile Asp Thr Tyr Ile Asp Leu Leu Glu Thr Arg
210 215 220
Arg Thr Tyr Tyr Glu Gly Pro Gly Glu Gly Ser Pro Phe Gly Trp Lys
225 230 235 240
Asp Ile Lys Glu Trp Tyr Glu Met Leu Met Gly His Cys Thr Tyr Phe
245 250 255
Pro Glu Glu Leu Arg Ser Val Lys Tyr Ala Tyr Asn Ala Asp Leu Tyr
260 265 270
Asn Ala Leu Asn Asp Leu Asn Asn Leu Val Ile Thr Arg Asp Glu Asn
275 280 285
Glu Lys Leu Glu Tyr Tyr Glu Lys Phe Gln Ile Ile Glu Asn Val Phe
290 295 300
Lys Gln Lys Lys Lys Pro Thr Leu Lys Gln Ile Ala Lys Glu Ile Leu
305 310 315 320
Val Asn Glu Glu Asp Ile Lys Gly Tyr Arg Val Thr Ser Thr Gly Lys
325 330 335
Pro Glu Phe Thr Asn Leu Lys Val Tyr His Asp Ile Lys Asp Ile Thr
340 345 350
Ala Arg Lys Glu Ile Ile Glu Asn Ala Glu Leu Leu Asp Gln Ile Ala
355 360 365
Lys Ile Leu Thr Ile Tyr Gln Ser Ser Glu Asp Ile Gln Glu Glu Leu
370 375 380
Thr Asn Leu Asn Ser Glu Leu Thr Gln Glu Glu Ile Glu Gln Ile Ser
385 390 395 400
Asn Leu Lys Gly Tyr Thr Gly Thr His Asn Leu Ser Leu Lys Ala Ile
405 410 415
Asn Leu Ile Leu Asp Glu Leu Trp His Thr Asn Asp Asn Gln Ile Ala
420 425 430
Ile Phe Asn Arg Leu Lys Leu Val Pro Lys Lys Val Asp Leu Ser Gln
435 440 445
Gln Lys Glu Ile Pro Thr Thr Leu Val Asp Asp Phe Ile Leu Ser Pro
450 455 460
Val Val Lys Arg Ser Phe Ile Gln Ser Ile Lys Val Ile Asn Ala Ile
465 470 475 480
Ile Lys Lys Tyr Gly Leu Pro Asn Asp Ile Ile Ile Glu Leu Ala Arg
485 490 495
Glu Lys Asn Ser Lys Asp Ala Gln Lys Met Ile Asn Glu Met Gln Lys
500 505 510
Arg Asn Arg Gln Thr Asn Glu Arg Ile Glu Glu Ile Ile Arg Thr Thr
515 520 525
Gly Lys Glu Asn Ala Lys Tyr Leu Ile Glu Lys Ile Lys Leu His Asp
530 535 540
Met Gln Glu Gly Lys Cys Leu Tyr Ser Leu Glu Ala Ile Pro Leu Glu
545 550 555 560
Asp Leu Leu Asn Asn Pro Phe Asn Tyr Glu Val Asp His Ile Ile Pro
565 570 575
Arg Ser Val Ser Phe Asp Asn Ser Phe Asn Asn Lys Val Leu Val Lys
580 585 590
Gln Glu Glu Asn Ser Lys Lys Gly Asn Arg Thr Pro Phe Gln Tyr Leu
595 600 605
Ser Ser Ser Asp Ser Lys Ile Ser Tyr Glu Thr Phe Lys Lys His Ile
610 615 620
Leu Asn Leu Ala Lys Gly Lys Gly Arg Ile Ser Lys Thr Lys Lys Glu
625 630 635 640
Tyr Leu Leu Glu Glu Arg Asp Ile Asn Arg Phe Ser Val Gln Lys Asp
645 650 655
Phe Ile Asn Arg Asn Leu Val Asp Thr Arg Tyr Ala Thr Arg Gly Leu
660 665 670
Met Asn Leu Leu Arg Ser Tyr Phe Arg Val Asn Asn Leu Asp Val Lys
675 680 685
Val Lys Ser Ile Asn Gly Gly Phe Thr Ser Phe Leu Arg Arg Lys Trp
690 695 700
Lys Phe Lys Lys Glu Arg Asn Lys Gly Tyr Lys His His Ala Glu Asp
705 710 715 720
Ala Leu Ile Ile Ala Asn Ala Asp Phe Ile Phe Lys Glu Trp Lys Lys
725 730 735
Leu Asp Lys Ala Lys Lys Val Met Glu Asn Gln Met Phe Glu Glu Lys
740 745 750
Gln Ala Glu Ser Met Pro Glu Ile Glu Thr Glu Gln Glu Tyr Lys Glu
755 760 765
Ile Phe Ile Thr Pro His Gln Ile Lys His Ile Lys Asp Phe Lys Asp
770 775 780
Tyr Lys Tyr Ser His Arg Val Asp Lys Lys Pro Asn Arg Lys Leu Ile
785 790 795 800
Asn Asp Thr Leu Tyr Ser Thr Arg Lys Asp Asp Lys Gly Asn Thr Leu
805 810 815
Ile Val Asn Asn Leu Asn Gly Leu Tyr Asp Lys Asp Asn Asp Lys Leu
820 825 830
Lys Lys Leu Ile Asn Lys Ser Pro Glu Lys Leu Leu Met Tyr His His
835 840 845
Asp Pro Gln Thr Tyr Gln Lys Leu Lys Leu Ile Met Glu Gln Tyr Gly
850 855 860
Asp Glu Lys Asn Pro Leu Tyr Lys Tyr Tyr Glu Glu Thr Gly Asn Tyr
865 870 875 880
Leu Thr Lys Tyr Ser Lys Lys Asp Asn Gly Pro Val Ile Lys Lys Ile
885 890 895
Lys Tyr Tyr Gly Asn Lys Leu Asn Ala His Leu Asp Ile Thr Asp Asp
900 905 910
Tyr Pro Asn Ser Arg Asn Lys Val Val Lys Leu Ser Leu Lys Pro Tyr
915 920 925
Arg Phe Asp Val Tyr Leu Asp Asn Gly Val Tyr Lys Phe Val Thr Val
930 935 940
Lys Asn Leu Asp Val Ile Lys Lys Glu Asn Tyr Tyr Glu Val Asn Ser
945 950 955 960
Lys Cys Tyr Glu Glu Ala Lys Lys Leu Lys Lys Ile Ser Asn Gln Ala
965 970 975
Glu Phe Ile Ala Ser Phe Tyr Lys Asn Asp Leu Ile Lys Ile Asn Gly
980 985 990
Glu Leu Tyr Arg Val Ile Gly Val Asn Asn Asp Leu Leu Asn Arg Ile
995 1000 1005
Glu Val Asn Met Ile Asp Ile Thr Tyr Arg Glu Tyr Leu Glu Asn
1010 1015 1020
Met Asn Asp Lys Arg Pro Pro His Ile Ile Lys Thr Ile Ala Ser
1025 1030 1035
Lys Thr Gln Ser Ile Lys Lys Tyr Ser Thr Asp Ile Leu Gly Asn
1040 1045 1050
Leu Tyr Glu Val Lys Ser Lys Lys His Pro Gln Ile Ile Lys Lys
1055 1060 1065
Gly Ser Ala
1070
<210>3
<211>208
<212>PRT
<213>Artificial Sequence
<400>3
Met Thr Asp Ala Glu Tyr Val Arg Ile His Glu Lys Leu Asp Ile Tyr
1 5 10 15
Thr Phe Lys Lys Gln Phe Phe Asn Asn Lys Lys Ser Val Ser His Arg
20 25 30
Cys Tyr Val Leu Phe Glu Leu Lys Arg Arg Gly Glu Arg Arg Ala Cys
35 40 45
Phe Trp Gly Tyr Ala Val Asn Lys Pro Gln Ser Gly Thr Glu Arg Gly
50 55 60
Ile His Ala Glu Ile Phe Ser Ile Arg Lys Val Glu Glu Tyr Leu Arg
65 70 75 80
Asp Asn Pro Gly Gln Phe Thr Ile Asn Trp Tyr Ser Ser Trp Ser Pro
85 90 95
Cys Ala Asp Cys Ala Glu Lys Ile Leu Glu Trp Tyr Asn Gln Glu Leu
100 105 110
Arg Gly Asn Gly His Thr Leu Lys Ile Trp Ala Cys Lys Leu Tyr Tyr
115 120 125
Glu Lys Asn Ala Arg Asn Gln Ile Gly Leu Trp Asn Leu Arg Asp Asn
130 135 140
Gly Val Gly Leu Asn Val Met Val Ser Glu His Tyr Gln Cys Cys Arg
145 150 155 160
Lys Ile Phe Ile Gln Ser Ser His Asn Gln Leu Asn Glu Asn Arg Trp
165 170 175
Leu Glu Lys Thr Leu Lys Arg Ala Glu Lys Trp Arg Ser Glu Leu Ser
180 185 190
Ile Met Ile Gln Val Lys Ile Leu His Thr Thr Lys Ser Pro Ala Val
195 200 205
<210>4
<211>98
<212>PRT
<213>Artificial Sequence
<400>4
Ser Gly Gly Ser Thr Asn Leu Ser Asp Ile Ile Glu Lys Glu Thr Gly
1 5 10 15
Lys Gln Leu Val Ile Gln Glu Ser Ile Leu Met Leu Pro Glu Glu Val
20 25 30
Glu Glu Val Ile Gly Asn Lys Pro Glu Ser Asp Ile Leu Val His Thr
35 40 45
Ala Tyr Asp Glu Ser Thr Asp Glu Asn Val Met Leu Leu Thr Ser Asp
50 55 60
Ala Pro Glu Tyr Lys Pro Trp Ala Leu Val Ile Gln Asp Ser Asn Gly
65 70 75 80
Glu Asn Lys Ile Lys Met Leu Ser Gly Gly Ser Pro Lys Lys Lys Arg
85 90 95
Lys Val
<210>5
<211>522
<212>DNA
<213>Artificial Sequence
<400>5
aacaaagcac cagtggtcta gtggtagaat agtaccctgc cacggtacag acccgggttc 60
gattcccggc tggtgcagat gccacacagc aaggagtgtt ttagtactct ggaaacagaa 120
tctactaaaa caaggcaaaa tgccgtgttt atctcgtcaa cttgttggcg agataacaaa 180
gcaccagtgg tctagtggta gaatagtacc ctgccacggt acagacccgg gttcgattcc 240
cggctggtgc acagaaccga caacagatga ggttttagta ctctggaaac agaatctact 300
aaaacaaggc aaaatgccgt gtttatctcg tcaacttgtt ggcgagataa caaagcacca 360
gtggtctagt ggtagaatag taccctgcca cggtacagac ccgggttcga ttcccggctg 420
gtgcaccagc tcatttggct cggcggtttt agtactctgg aaacagaatc tactaaaaca 480
aggcaaaatg ccgtgtttat ctcgtcaact tgttggcgag at 522
<210>6
<211>83
<212>DNA
<213>Artificial Sequence
<400>6
gttttagtac tctgctggaa acagcagaat ctactaaaac aaggcaaaat gccgtgttta 60
tctcgtcaac ttgttggcga gat 83
<210>7
<211>93
<212>DNA
<213>Artificial Sequence
<400>7
gttttagtac tctgtaattt tagaaataaa attacagaat ctactaaaac aaggcaaaat 60
gccgtgttta tctcgtcaac ttgttggcga gat 93
<210>8
<211>174
<212>DNA
<213>Artificial Sequence
<400>8
aacaaagcac cagtggtcta gtggtagaat agtaccctgc cacggtacag acccgggttc 60
gattcccggc tggtgcatcc tcggcgtagt acgggctgtt ttagtactct ggaaacagaa 120
tctactaaaa caaggcaaaa tgccgtgttt atctcgtcaa cttgttggcg agat 174
<210>9
<211>77
<212>RNA
<213>Artificial Sequence
<400>9
guuuuaguac ucuggaaaca gaaucuacua aaacaaggca aaaugccgug uuuaucucgu 60
caacuuguug gcgagau 77
<210>10
<211>83
<212>RNA
<213>Artificial Sequence
<400>10
guuuuaguac ucugcuggaa acagcagaau cuacuaaaac aaggcaaaau gccguguuua 60
ucucgucaac uuguuggcga gau 83
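
The short sequences above can be cross-checked against one another. The following is a minimal sketch, not part of the patent: it assumes the sequences are copied from the listing with spaces and position counters removed, and the helper name `transcribe` is hypothetical. It checks that sequence 6 is the DNA form of sequence 10 and that sequence 8 ends with the DNA form of sequence 9. The 77 nt leader length is taken from the position range 474-550 given for the tRNA in claim 3. Reading the 20 nt between leader and backbone as the target-derived segment is an assumption.

```python
# Sequences copied from the listing above (spaces and position counters removed).
SEQ6_DNA = "".join([
    "gttttagtac", "tctgctggaa", "acagcagaat", "ctactaaaac", "aaggcaaaat",
    "gccgtgttta", "tctcgtcaac", "ttgttggcga", "gat",
])
SEQ8_DNA = "".join([
    "aacaaagcac", "cagtggtcta", "gtggtagaat", "agtaccctgc", "cacggtacag",
    "acccgggttc", "gattcccggc", "tggtgcatcc", "tcggcgtagt", "acgggctgtt",
    "ttagtactct", "ggaaacagaa", "tctactaaaa", "caaggcaaaa", "tgccgtgttt",
    "atctcgtcaa", "cttgttggcg", "agat",
])
SEQ9_RNA = "".join([
    "guuuuaguac", "ucuggaaaca", "gaaucuacua", "aaacaaggca", "aaaugccgug",
    "uuuaucucgu", "caacuuguug", "gcgagau",
])
SEQ10_RNA = "".join([
    "guuuuaguac", "ucugcuggaa", "acagcagaau", "cuacuaaaac", "aaggcaaaau",
    "gccguguuua", "ucucgucaac", "uuguuggcga", "gau",
])


def transcribe(dna: str) -> str:
    """Return the RNA form of a lowercase DNA string (t replaced by u)."""
    return dna.replace("t", "u")


# Sequence 6 (DNA) versus sequence 10 (RNA).
seq6_matches_seq10 = transcribe(SEQ6_DNA) == SEQ10_RNA

# Sequence 8 ends with the DNA form of the sequence 9 backbone.
seq8_ends_with_seq9 = transcribe(SEQ8_DNA).endswith(SEQ9_RNA)

# Assumption: the first 77 nt of sequence 8 are the tRNA leader
# (550 - 474 + 1 = 77, the range recited in claim 3).
TRNA_LEN = 550 - 474 + 1
middle = SEQ8_DNA[TRNA_LEN:len(SEQ8_DNA) - len(SEQ9_RNA)]

print(seq6_matches_seq10, seq8_ends_with_seq9, len(middle), middle)
```

Working on the concatenated ten-base groups exactly as printed keeps transcription errors easy to spot against the listing.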

Claims (13)

1. A kit comprising an sgRNA, a Cas9 nuclease or a biological material related to the Cas9 nuclease, a cytosine deaminase or a biological material related to the cytosine deaminase, and a UGI protein or a biological material related to the UGI protein;
the sgRNA targets a target sequence;
the sgRNA is shown in formula I: [RNA transcribed from the target sequence]-[modified sgRNA backbone];
the modified sgRNA backbone is an RNA molecule obtained by inserting an RNA fragment A between positions 14 and 15 of the sgRNA backbone and inserting an RNA fragment B between positions 18 and 19 of the sgRNA backbone;
RNA fragment A and RNA fragment B are reverse complementary to each other;
RNA fragment A and RNA fragment B are each 3 nt in length;
the sgRNA backbone is the RNA molecule shown in sequence 9;
the Cas9 nuclease is a protein having the amino acid sequence shown in sequence 2; the biological material related to the Cas9 nuclease is a nucleic acid molecule encoding the Cas9 nuclease, or an expression cassette, recombinant vector or recombinant microorganism containing the nucleic acid molecule;
the cytosine deaminase is a protein having the amino acid sequence shown in sequence 3; the biological material related to the cytosine deaminase is a nucleic acid molecule encoding the cytosine deaminase, or an expression cassette, recombinant vector or recombinant microorganism containing the nucleic acid molecule;
the UGI protein is a protein having the amino acid sequence shown in sequence 4; the biological material related to the UGI protein is a nucleic acid molecule encoding the UGI protein, or an expression cassette, recombinant vector or recombinant microorganism containing the nucleic acid molecule.
2. The kit of claim 1, wherein: the modified sgRNA backbone is the RNA molecule shown in sequence 10.
3. The kit of claim 1, wherein: the sgRNA is a tRNA-sgRNA;
the tRNA-sgRNA is shown in formula II: [tRNA]-[RNA transcribed from the target sequence]-[modified sgRNA backbone];
the tRNA is an RNA molecule obtained by replacing T with U in positions 474-550 of sequence 1.
4. Use of a kit according to any one of claims 1 to 3 in any one of X1) to X4):
X1) editing a genomic target sequence of a plant or plant cell;
X2) preparing a product for editing a genomic target sequence of a plant or plant cell;
X3) improving the editing efficiency of a genomic target sequence of a plant or plant cell;
X4) preparing a product for improving the editing efficiency of a genomic target sequence of a plant or plant cell.
5. Use according to claim 4, characterized in that: the editing of the genomic target sequence is the mutation of C to T in the target sequence.
6. Use according to claim 4, characterized in that: the plant is a monocotyledon or a dicotyledon; the plant cell is a monocotyledon cell or a dicotyledon cell.
7. Use according to claim 4, characterized in that: the plant is a gramineous plant; the plant cell is a gramineous plant cell.
8. Use according to claim 4, characterized in that: the plant is rice; the plant cell is a rice cell.
9. A method according to Y1) or Y2):
Y1) a method for editing a genomic target sequence of a plant or plant cell, or for improving the efficiency of such editing, comprising: expressing in the plant or plant cell the sgRNA, the Cas9 nuclease, the cytosine deaminase and the UGI protein of any one of claims 1-3, thereby editing the genomic target sequence; the sgRNA targets the target sequence;
Y2) a method for producing a plant mutant, comprising: editing the genome of a plant according to the method of Y1) to obtain the plant mutant.
10. The method of claim 9, wherein: the editing of the genomic target sequence is the mutation of C to T in the target sequence.
11. The method of claim 9, wherein: the plant is a monocotyledon or a dicotyledon; the plant cell is a monocotyledon cell or a dicotyledon cell.
12. The method of claim 9, wherein: the plant is a gramineous plant; the plant cell is a gramineous plant cell.
13. The method of claim 9, wherein: the plant is rice; the plant cell is a rice cell.
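
The backbone modification recited in claims 1 and 2 can be reproduced directly from the sequence listing. The following is a minimal sketch under stated assumptions, not the applicants' own procedure. The function names are hypothetical. Fragment A is taken to be `cug`, a value inferred here by comparing sequence 9 with sequence 10 and not stated in the claims themselves. Fragment B is derived as the reverse complement of fragment A, as claim 1 requires.

```python
# Sequence 9 (sgRNA backbone) and sequence 10 (modified backbone),
# copied from the sequence listing with spaces and counters removed.
SEQ9_RNA = "".join([
    "guuuuaguac", "ucuggaaaca", "gaaucuacua", "aaacaaggca", "aaaugccgug",
    "uuuaucucgu", "caacuuguug", "gcgagau",
])
SEQ10_RNA = "".join([
    "guuuuaguac", "ucugcuggaa", "acagcagaau", "cuacuaaaac", "aaggcaaaau",
    "gccguguuua", "ucucgucaac", "uuguuggcga", "gau",
])

RNA_PAIRS = {"a": "u", "u": "a", "g": "c", "c": "g"}


def reverse_complement_rna(seq: str) -> str:
    """Reverse complement of a lowercase RNA string (Watson-Crick pairs only)."""
    return "".join(RNA_PAIRS[base] for base in reversed(seq))


def modify_backbone(backbone: str, fragment_a: str) -> str:
    """Insert fragment A between positions 14 and 15 and its reverse
    complement (fragment B) between positions 18 and 19 (1-based positions)."""
    if len(fragment_a) != 3:
        raise ValueError("claim 1 requires 3 nt fragments")
    fragment_b = reverse_complement_rna(fragment_a)
    # 1-based positions 1..14, 15..18 and 19..end map to these Python slices.
    return (backbone[:14] + fragment_a + backbone[14:18]
            + fragment_b + backbone[18:])


# Assumption: fragment A = "cug", inferred from sequences 9 and 10.
FRAGMENT_A = "cug"
modified = modify_backbone(SEQ9_RNA, FRAGMENT_A)
print(modified == SEQ10_RNA, len(modified))
```

Under this reading, fragment B is `cag` and the two inserts lengthen the backbone from 77 nt to 83 nt, which matches the length given for sequence 10.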
CN201911200779.0A 2019-11-29 2019-11-29 Efficient sgRNA and application thereof in gene editing Active CN110835630B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911200779.0A CN110835630B (en) 2019-11-29 2019-11-29 Efficient sgRNA and application thereof in gene editing

Publications (2)

Publication Number Publication Date
CN110835630A CN110835630A (en) 2020-02-25
CN110835630B CN110835630B (en) 2023-01-03

Family

ID=69577858

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911200779.0A Active CN110835630B (en) 2019-11-29 2019-11-29 Efficient sgRNA and application thereof in gene editing

Country Status (1)

Country Link
CN (1) CN110835630B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115678900A (en) * 2021-07-30 2023-02-03 中国科学院天津工业生物技术研究所 Method for reducing editing window of base editor, base editor and use

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109652440A (en) * 2018-12-28 2019-04-19 北京市农林科学院 Application of the VQRn-Cas9&PmCDA1&UGI base editing system in plant gene editor

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Broadening the targeting range of Staphylococcus aureus CRISPR-Cas9 by modifying PAM recognition;Benjamin P Kleinstiver et al.;Nature Biotechnology;20151102;Vol. 33, No. 12;pp. 1293-1298 *
In vivo genome editing using Staphylococcus aureus Cas9;F. Ann Ran et al.;Nature;20150409;Vol. 520, No. 7546;pp. 186-191 *
Increasing Cytosine Base Editing Scope and Efficiency With Engineered Cas9-PmCDA1 Fusions and the modified sgRNA in Rice;Ying Wu et al.;Frontiers in Genetics;20190426;Vol. 10;Article 379, pp. 1-10 *

Also Published As

Publication number Publication date
CN110835630A (en) 2020-02-25

Similar Documents

Publication Publication Date Title
CN108085287B (en) Recombinant corynebacterium glutamicum, preparation method and application thereof
CN107418954B (en) Populus tomentosa gene PtomiR390a and application thereof
CN103205458B (en) Intermediate expression carrier applicable to monocotyledon transformation and construction method thereof
CN110835630B (en) Efficient sgRNA and application thereof in gene editing
CN110835631B (en) Modified sgRNA and application thereof in improving base editing efficiency
CN108342409B (en) Plant RNAi expression vector and construction method and application thereof
CN112778405B (en) Protein related to plant flowering phase and coding gene and application thereof
CN110408646B (en) Plant genetic transformation screening vector and application thereof
CN109206496B (en) Application of protein GhFLS1 in regulation and control of plant heat resistance
CN111304242A (en) Method for preparing single mutant based on SaKKHn-pBE system
CN111154797B (en) Genetic transformation method of maize backbone inbred line mediated by gene gun
CN114990112B (en) Specific promoter for spiny skin
CN113121662B (en) Application of cotton GhBZR3 protein and coding gene thereof in regulating plant growth and development
CN110923263B (en) Rice beta-amylase BA1 and coding gene and application thereof
CN110747186B (en) CRISPR/Cas9 systems and methods for efficient generation of mutants not carrying a transgenic element in plants
CN110878321B (en) Expression vector for klebsiella pneumoniae gene editing
CN113604412A (en) High-yield strain with sub-appropriate amount of L-glutamic acid, construction method thereof and NH4+Staged control of fermentation process
CN111154796B (en) Genetic transformation method of agrobacterium-mediated corn backbone inbred line
CN111187787A (en) Multifunctional plant expression vector and construction method and application thereof
CN109321594B (en) Method for improving artemisinin content in artemisia annua by taking artemisia annua suspension cell line as receptor through iaaM gene transfer
CN111269298B (en) Application of protein GhCCOAOMT7 in regulation and control of plant heat resistance
CN112458113A (en) Plant transgenic dominant suppression vector and application thereof
CN112575028A (en) RNAi plant expression vector for inhibiting expression of HIS1 gene and application thereof
CN109694402B (en) Plant lignin synthesis related protein and coding gene and application thereof
CN112458111A (en) RNAi plant expression vector for inhibiting HSL1 gene expression by using rice endogenous sequence and application thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant