CN112538471B - CRISPR SpCas9 (K510A) mutant and application thereof - Google Patents

CRISPR SpCas9 (K510A) mutant and application thereof

Info

Publication number
CN112538471B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011587591.9A
Other languages
Chinese (zh)
Other versions
CN112538471A (en)
Inventor
李娟�
王国华
王灿茂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southern Medical University
Original Assignee
Southern Medical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southern Medical University
Priority to CN202011587591.9A
Publication of CN112538471A
Application granted
Publication of CN112538471B
Legal status: Active
Anticipated expiration

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • C12N15/902 Stable introduction of foreign DNA into chromosome using homologous recombination
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]

Abstract

The invention discloses a CRISPR SpCas9 (K510A) mutant and application thereof. The SpCas9 mutant is designed by mutating the lysine residue at position 510 of wild-type SpCas9 to an alanine residue; the amino acid sequence of wild-type SpCas9 is shown in SEQ ID NO.1, and the amino acid sequence of the SpCas9 mutant is shown in SEQ ID NO.2. The SpCas9 (K510A) mutant significantly reduces the editing efficiency at off-target sites, that is, it significantly reduces the off-target rate, while its cleavage efficiency at the target site remains substantially the same as that of wild-type SpCas9.

Description

CRISPR SpCas9 (K510A) mutant and application thereof
Technical Field
The invention belongs to the field of molecular biology, and particularly relates to a CRISPR SpCas9 (K510A) mutant and application thereof.
Background
SpCas9 is widely used in gene editing. The SpCas9 gene editing system includes the SpCas9 protein and an sgRNA. The SpCas9 protein and the sgRNA bind to form a complex that specifically recognizes the target site through the PAM-interacting domain of SpCas9 and the guide sequence of the sgRNA, and that cleaves both strands of the target DNA, leaving blunt ends, using the HNH and RuvC domains. In living cells, the cleaved genomic DNA may trigger NHEJ repair, resulting in insertion/deletion mutations (indels).
However, in the related art, although wild-type SpCas9 can edit genes efficiently, it also edits some non-target sites that resemble the target site while editing the target site, which seriously limits the clinical application of SpCas9.
Therefore, there is a need to develop a new SpCas9.
Disclosure of Invention
The present invention aims to solve at least one of the technical problems in the prior art described above. To this end, the invention provides a SpCas9 mutant or an active fragment thereof which, compared with conventional wild-type SpCas9, significantly reduces the editing efficiency at off-target sites, that is, significantly reduces the off-target rate, while its cleavage efficiency at the target site remains substantially the same as that of wild-type SpCas9.
The invention also provides a gene for encoding the SpCas9 mutant or the active fragment thereof.
The invention also provides a composition.
The invention also provides a polynucleotide.
The invention also proposes a guide-polynucleotide/Cas complex.
The invention also provides a recombinant vector, recombinant bacteria or cells containing the genes.
The invention also provides a method for modifying a target site in a genome of a cell.
The invention also provides application of the SpCas9 mutant or the active fragment thereof or the gene in gene editing.
According to a first aspect of the present invention, there is provided a SpCas9 mutant or an active fragment thereof that has at least 90% amino acid identity with the wild-type SpCas9 polypeptide shown in SEQ ID NO.1 and in which the amino acid residue at position 510, in the REC3 domain of the wild-type SpCas9 polypeptide, is mutated to an alanine residue, wherein the SpCas9 mutant has endonuclease activity.
According to a preferred embodiment of the invention, there is at least the following advantageous effect: the SpCas9 (K510A) mutant is designed by mutating the lysine residue at position 510 of wild-type SpCas9 (SEQ ID NO.1) to an alanine residue. The inventors analyzed the interactions between the amino acid residues of the REC3 domain of wild-type SpCas9 and the sgRNA/target DNA hybrid duplex and found that some residues in the REC3 domain form non-specific hydrogen-bond interactions with the duplex. After further screening by the type and strength of the interaction and by structure, the hydrophilic lysine residue at position 510 of the REC3 domain was identified as the optimal mutation site. Repeated verification shows that, with this lysine mutated to alanine, the SpCas9 (K510A) mutant significantly reduces the editing efficiency at off-target sites, that is, significantly reduces the off-target rate, while maintaining substantially the same cleavage efficiency at the target site as wild-type SpCas9.
Wherein, the amino acid sequence of the wild SpCas9 shown in the SEQ ID NO.1 is as follows:
MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDGSPKKKRKVSSDYKDHDGDYKDHDIDYKDDDDKAAG(SEQ ID NO.1)。
In the original sequence listing, the residue shown in bold and underlined is the lysine residue before mutation;
residues that are underlined only belong to modification sequences.
In some embodiments of the invention, the SpCas9 mutant or active fragment thereof described above may instead have at least 89%, 88%, 87%, 86%, 85%, 84%, 83%, 82%, 81%, 80%, 79%, 78%, 77%, 76% or 75% amino acid identity with the wild-type SpCas9 polypeptide shown in SEQ ID NO.1, with the amino acid residue at position 510, in the REC3 domain of the wild-type SpCas9 polypeptide, mutated to an alanine residue, wherein the SpCas9 mutant has endonuclease activity.
In some embodiments of the invention, the amino acid sequence of the SpCas9 mutant or active fragment thereof described above includes:
(1) The amino acid sequence shown in SEQ ID NO.2; or
(2) A sequence that is derived from the amino acid sequence of the SpCas9 (K510A) mutant shown in SEQ ID NO.2 by substitution, deletion and/or addition of one or more amino acids and/or by terminal modification, and that retains endonuclease activity.
Wherein, the amino acid sequence of the SpCas9 (K510A) mutant shown in the SEQ ID NO.2 is as follows:
MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPAHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDGSPKKKRKVSSDYKDHDGDYKDHDIDYKDDDDKAAG(SEQ ID NO.2)。
In the original sequence listing, the residue shown in bold and underlined is the alanine residue after mutation;
residues that are underlined only belong to modification sequences.
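The relationship between SEQ ID NO.1 and SEQ ID NO.2 is a single substitution written in the usual "K510A" notation. The following is a minimal Python sketch of how such a substitution can be applied and checked. The helper names `apply_point_mutation` and `differences` and the 12-residue toy peptide are assumptions made for illustration only; they are not part of the patent.

```python
import re


def apply_point_mutation(protein: str, mutation: str) -> str:
    """Apply a substitution written like 'K510A' (1-based position)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation notation: {mutation!r}")
    old, pos, new = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= pos <= len(protein):
        raise ValueError(f"position {pos} is outside the sequence")
    if protein[pos - 1] != old:
        raise ValueError(
            f"expected {old} at position {pos}, found {protein[pos - 1]}"
        )
    # Rebuild the sequence with the single residue replaced.
    return protein[:pos - 1] + new + protein[pos:]


def differences(seq_a: str, seq_b: str):
    """List (position, residue_a, residue_b) where two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# Hypothetical toy peptide, NOT SEQ ID NO.1; it only demonstrates the notation.
toy_wild_type = "MDKLPKHSLLYE"
toy_mutant = apply_point_mutation(toy_wild_type, "K6A")
print(toy_mutant)
print(differences(toy_wild_type, toy_mutant))
```

Applied to the full listings, `differences(seq_id_1, seq_id_2)` would be expected to report a single K-to-A entry if the two sequences differ only as described above.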
In some preferred embodiments of the invention, the amino acid sequence of the SpCas9 mutant or active fragment thereof described above further comprises a nuclear localization signal peptide, a 3×FLAG or His6 tag, or other modification sequences conventional in the art, added to the amino acid sequence shown in SEQ ID NO.2.
According to a second aspect of the invention there is provided a gene encoding a SpCas9 mutant or an active fragment thereof according to the first aspect of the invention.
According to a preferred embodiment of the invention, there is at least the following advantageous effect: the above-mentioned gene encoding the SpCas9 mutant or the active fragment thereof according to the first aspect of the present invention may be used to express the SpCas9 mutant or the active fragment thereof according to the first aspect of the present invention, and the SpCas9 (K510A) mutant obtained by expression may significantly reduce the editing efficiency of off-target sites, that is, significantly reduce the off-target rate, while maintaining substantially the same cleavage efficiency of the target sites as that of the wild-type SpCas9.
According to a third aspect of the present invention there is provided a composition comprising a SpCas9 mutant or an active fragment thereof according to the first aspect of the present invention.
According to a preferred embodiment of the invention, there is at least the following advantageous effect: the composition contains the SpCas9 mutant or the active fragment thereof according to the first aspect of the invention, and the off-target rate in the process of gene editing can be remarkably reduced by using the composition.
According to a fourth aspect of the present invention there is provided a polynucleotide comprising a gene according to the second aspect of the present invention.
In some preferred embodiments of the invention, the polynucleotide is a guide-polynucleotide.
In some more preferred embodiments of the invention, the guide-polynucleotide is sgRNA.
In some more preferred embodiments of the present invention, the guide sequence of the sgRNA described above is:
5’-gagtccgagcagaagaagaa-3’ (SEQ ID NO.3).
The targeting specificity of Cas9 depends on the 20-nt guide sequence of the sgRNA and on the presence of a PAM adjacent to the target sequence in the genome; a rational and efficient design of the sgRNA guide sequence can therefore effectively improve the usability of Cas9.
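The dependence on a 20-nt guide sequence plus an adjacent PAM can be sketched in Python as follows. The sketch assumes the canonical NGG PAM of SpCas9, which the text does not state explicitly, and searches the forward strand only. The function name `find_target_sites` and the two test sequences are hypothetical and serve only as illustration.

```python
from typing import List


def find_target_sites(dna: str, guide: str) -> List[int]:
    """Return 0-based start positions where `guide` is immediately
    followed by an NGG PAM (assumed PAM; forward strand only)."""
    dna = dna.upper()
    guide = guide.upper()
    hits = []
    start = dna.find(guide)
    while start != -1:
        pam = dna[start + len(guide): start + len(guide) + 3]
        # N can be any base; the last two bases must be GG.
        if len(pam) == 3 and pam[1:] == "GG":
            hits.append(start)
        start = dna.find(guide, start + 1)
    return hits


GUIDE = "gagtccgagcagaagaagaa"  # SEQ ID NO.3

# Hypothetical sequences built around the guide, for illustration only.
with_pam = "TTT" + GUIDE + "AGG" + "CCC"
without_pam = "TTT" + GUIDE + "ATC" + "CCC"

print(find_target_sites(with_pam, GUIDE))
print(find_target_sites(without_pam, GUIDE))
```

A practical design tool would also scan the reverse strand and tolerate mismatches; both are left out here to keep the sketch short.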
An antibiotic resistance gene and a reporter gene (GFP) sequence may also be inserted into the sgRNA plasmid vector constructed using the guide sequence of the above sgRNA.
In some more preferred embodiments of the invention, the antibiotic resistance gene comprises a eukaryotic puromycin resistance gene or a hygromycin resistance gene.
In some more preferred embodiments of the invention, the antibiotic resistance gene is a eukaryotic puromycin resistance gene.
In some more preferred embodiments of the invention, the reporter gene comprises an EGFP gene.
Of course, a person skilled in the art can insert other functional elements according to actual needs, such as a doxycycline-inducible promoter, to enhance the transient expression of Cas9.
According to a fifth aspect of the present invention there is provided a guide-polynucleotide/Cas complex comprising at least one guide-polynucleotide and at least one SpCas9 mutant or active fragment thereof according to the first aspect of the present invention;
wherein the guide-polynucleotide is a chimeric non-naturally occurring guide-polynucleotide;
the guide-polynucleotide/Cas complex is capable of recognizing and binding to all or part of a target sequence, and of nicking, unwinding or cleaving the target sequence.
In some preferred embodiments of the invention, the polynucleotide is a guide-polynucleotide.
In some more preferred embodiments of the invention, the guide-polynucleotide is sgRNA.
In some more preferred embodiments of the present invention, the guide sequence of the above sgRNA is shown in SEQ ID NO.3.
The CRISPR/Cas9 components are typically introduced into a cell using a DNA delivery system, for example by transfecting the cell with plasmids encoding Cas9 and the sgRNA. However, when plasmid delivery is problematic, gene editing can be achieved by directly introducing a Cas9/sgRNA ribonucleoprotein (RNP) complex or by using viral vectors (e.g., lentiviral vectors). In contrast to plasmid delivery, editing with a Cas9/sgRNA RNP complex can start quickly after the complex enters the cell. The Cas9/sgRNA RNP complex causes fewer off-target effects and is less immunogenic, whereas delivery with viral vectors is more efficient and longer lasting.
According to a sixth aspect of the present invention there is provided a recombinant vector, recombinant bacterium or cell comprising a gene according to the second aspect of the present invention.
In some embodiments of the invention, the cells comprise prokaryotic cells or eukaryotic cells.
In some preferred embodiments of the invention, the cells comprise: cells of animal, bacterial, fungal, insect, yeast and plant origin.
In some more preferred embodiments of the invention, the above-described cells comprise: human cells, animal cells, plant cells and unicellular organisms in vivo, ex vivo or in vitro; wherein the human cells and animal cells are preferably human and animal blood cells.
According to a seventh aspect of the present invention there is provided a method of modifying a target site in a genome of a cell, comprising: introducing the guide-polynucleotide/Cas complex of the fifth aspect of the invention into a cell such that the target site in the cell has the following modifications:
substitution of at least one nucleotide; and/or
A deletion of at least one nucleotide; and/or
Insertion of at least one nucleotide.
According to an eighth aspect of the present invention there is provided the use of the SpCas9 mutant or the active fragment thereof according to the first aspect of the present invention or the gene according to the second aspect of the present invention in gene editing.
According to a preferred embodiment of the invention, there is at least the following advantageous effect: the SpCas9 (K510A) mutant is designed by mutating the lysine residue at position 510 of wild-type SpCas9 to an alanine residue. The inventors analyzed the interactions between the amino acid residues of the REC3 domain of wild-type SpCas9 and the sgRNA/target DNA hybrid duplex and found that some residues in the REC3 domain form non-specific hydrogen-bond interactions with the duplex. After further screening by the type and strength of the interaction and by structure, the hydrophilic lysine residue at position 510 of the REC3 domain was identified as the optimal mutation site. Repeated verification shows that, with this lysine mutated to alanine, the SpCas9 (K510A) mutant significantly reduces the editing efficiency at off-target sites, that is, significantly reduces the off-target rate, while maintaining substantially the same cleavage efficiency at the target site as wild-type SpCas9.
Drawings
The invention is further described with reference to the accompanying drawings and examples, in which:
FIG. 1 is a plasmid map of SpCas9 in an embodiment of the invention;
FIG. 2 is a map of an sgRNA plasmid in an example of the present invention;
FIG. 3 shows two-dimensional plots of the editing efficiency results of wild-type SpCas9 at the three sites EMX1-1 (A), EMX1-1-OT1 (B) and EMX1-1-OT2 (C); blue dots: FAM single positive; green dots: VIC single positive; grey dots: FAM/VIC double negative; brown dots: FAM/VIC double positive;
FIG. 4 shows two-dimensional plots of the editing efficiency results of mutant SpCas9 (K510A) at the three sites EMX1-1 (A), EMX1-1-OT1 (B) and EMX1-1-OT2 (C); blue dots: FAM single positive; green dots: VIC single positive; grey dots: FAM/VIC double negative; brown dots: FAM/VIC double positive.
Detailed Description
The conception and the technical effects produced by the present invention will be clearly and completely described in conjunction with the embodiments below to fully understand the objects, features and effects of the present invention. It is apparent that the described embodiments are only some embodiments of the present invention, but not all embodiments, and that other embodiments obtained by those skilled in the art without inventive effort are within the scope of the present invention based on the embodiments of the present invention.
Terms used in this specification: SpCas9 refers to Streptococcus pyogenes Cas9; Cas refers to CRISPR-associated, where CRISPR stands for clustered regularly interspaced short palindromic repeats.
The term "Cas protein" or "Cas polypeptide" refers to a polypeptide encoded by a Cas (CRISPR-associated) gene. Cas proteins include Cas endonucleases.
The term "Cas endonuclease" refers to a Cas polypeptide (Cas protein) that is capable of recognizing, binding to, and optionally nicking or cleaving all or part of a specific DNA target sequence when complexed with a suitable polynucleotide component. Cas endonucleases are directed by the guide-polynucleotides to recognize, bind to, and optionally nick or cleave all or part of a particular target site in double-stranded DNA (e.g., at a target site in the genome of a cell). The Cas endonucleases described herein comprise one or more nuclease domains. Cas endonucleases employed in the donor DNA insertion methods described herein are endonucleases that introduce single-or double-strand breaks into DNA at a target site. Alternatively, cas endonucleases herein may lack DNA cleavage or nicking activity, but may still specifically bind to DNA target sequences when complexed with suitable RNA components.
Plasmid design and construction
In the following examples, the BPK4410 plasmid (Addgene plasmid # 101178) is taken as an example, but it should be noted that the present invention is not limited to BPK4410 plasmid, and alternative plasmids can be reasonably selected as experimental vectors within the knowledge of one of ordinary skill in the art.
(1) SpCas9 mutant plasmid vector construction:
The inventors hypothesize that compaction of the sgRNA/target DNA hybrid duplex may increase the hydrophobicity of the duplex and thereby strengthen the hydrophobic interaction between the duplex and the REC3 domain, and that this stronger hydrophobic interaction moves the REC3 domain towards the duplex and so initiates cleavage. On this basis, the ideal case would be that the only forces between the REC3 domain and the sgRNA/target DNA hybrid duplex are hydrophobic interactions, with no non-specific interactions such as hydrogen bonds, so that movement of the REC3 domain towards the duplex is initiated only when the sgRNA matches the target DNA strand 100%. Following this idea, the inventors analyzed the interactions between the amino acid residues of the REC3 domain and the sgRNA/target DNA duplex and found that some residues form non-specific hydrogen-bond interactions with the duplex. Taking into account factors such as the type and strength of the interaction and the structure, they identified the hydrophilic lysine residue at position 510 by screening and mutated it to an alanine residue, obtaining the optimal mutant SpCas9 (K510A).
The specific construction steps are as follows:
Using plasmid BPK4410 (HypaCas9, Addgene plasmid #101178) as a template, the alanine residue at position 692 of HypaCas9 was first mutated to an asparagine residue, the alanine residue at position 694 to a methionine residue, the alanine residue at position 695 to a glutamine residue, and the alanine residue at position 698 to a histidine residue, giving wild-type SpCas9.
The lysine residue at position 510 of the SpCas9 amino acid sequence was then mutated to an alanine residue, giving the mutant SpCas9 (K510A).
The amino acid sequence of wild-type SpCas9 is shown in SEQ ID NO.1, and the amino acid sequence of the mutant SpCas9 (K510A) is shown in SEQ ID NO.2.
The nucleotide sequence of the plasmid expressing wild-type SpCas9 is shown in SEQ ID NO.4.
The nucleotide sequence of the plasmid expressing SpCas9 (K510A) is shown in SEQ ID NO.5.
The plasmid map of the constructed SpCas9 (K510A) mutant is shown in FIG. 1.
(2) Construction of sgRNA plasmid vector (sgRNA-EMX 1-1-puro):
The sgRNA was designed and, after screening, the sequence shown in SEQ ID NO.3 was selected as the guide sequence.
The sequence shown in SEQ ID NO.3 is:
5’-gagtccgagcagaagaagaa-3’ (SEQ ID NO.3).
The constructed sgRNA plasmid vector also comprises a promoter, a eukaryotic puromycin resistance gene sequence and an EGFP reporter gene sequence. It expresses green fluorescent protein after transfection, and puromycin can be used for drug selection to obtain successfully transfected positive cells.
To verify the actual effect of the SpCas9 mutant, the EMX1-1 site of the EMX1 gene was used as the target site, and the sgRNA plasmid vector targeting this site was designated sgRNA-EMX1-1-puro.
The nucleotide sequence of the constructed sgRNA plasmid vector (sgRNA-EMX1-1-puro) is shown in SEQ ID NO.6, and the plasmid map is shown in FIG. 2.
Cell culture
HEK293T cells were used as transfection targets.
HEK293T cells were cultured in DMEM medium (10% FBS, 2 mM glutamine and two antibiotics, penicillin and streptomycin) at 37 °C under 5% CO2.
Construction of transfected cells
HEK293T cells cultured as described above were seeded into 24-well plates (2 × 10^5 cells/well), 0.5 mL of complete medium was added to each well, and the cells were incubated overnight. Transfection was performed the next day, when cell confluence was about 70-80%. Cells were transfected with a total of 1 μg of plasmid (750 ng SpCas9 or SpCas9 (K510A) plasmid and 250 ng sgRNA-EMX1-1-puro plasmid) using Lipofectamine 3000 (Invitrogen; 1.5 μL Lipofectamine 3000 reagent, 2 μL P3000 reagent) as the transfection reagent, following the reagent instructions or routine procedures in the art.
The transfected cells were selected with 2 μg/mL puromycin on days 2-4 after transfection, and genomic DNA was extracted the day after selection was completed and kept for later use.
Editing efficiency detection at target site and off-target site
Two sites reported to show relatively high off-target rates when wild-type SpCas9 cleaves the EMX1-1 site, EMX1-1-OT1 and EMX1-1-OT2, were selected to assess the off-target behaviour of the SpCas9 (K510A) mutant in this embodiment. The target sequences and PAM sequences of the target site EMX1-1 and the off-target sites EMX1-1-OT1 and EMX1-1-OT2 are shown in Table 1.
TABLE 1 Target sequences and PAM sequences of the target site and off-target sites
Bases shown in bold are those at which the off-target site differs from the target site.
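Marking the bases at which an off-target site differs from the target site, as Table 1 does with bold type, can be sketched in Python as below. The candidate sequence used here is hypothetical and is NOT one of the Table 1 off-target sites; the function names are likewise assumptions made for illustration.

```python
def mismatch_positions(target: str, candidate: str):
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(target) != len(candidate):
        raise ValueError("sequences must have the same length")
    return [
        i + 1
        for i, (t, c) in enumerate(zip(target.upper(), candidate.upper()))
        if t != c
    ]


def mark_mismatches(target: str, candidate: str) -> str:
    """Render `candidate` with matching bases in lower case and
    mismatching bases in upper case (a plain-text stand-in for bold)."""
    if len(target) != len(candidate):
        raise ValueError("sequences must have the same length")
    return "".join(
        c.upper() if t.lower() != c.lower() else c.lower()
        for t, c in zip(target, candidate)
    )


TARGET = "gagtccgagcagaagaagaa"                   # SEQ ID NO.3 guide sequence
HYPOTHETICAL_OFF_TARGET = "gagtccgtgcagaagaagca"  # invented for this example

print(mismatch_positions(TARGET, HYPOTHETICAL_OFF_TARGET))
print(mark_mismatches(TARGET, HYPOTHETICAL_OFF_TARGET))
```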
ddPCR (Droplet Digital PCR, microdroplet digital PCR) was used to verify the efficiency of editing at the target site and off-target site.
For the sequences of the target site EMX1-1 and the off-target sites EMX1-1-OT1 and EMX1-1-OT2 shown in Table 1, primer sets were designed to amplify a fragment of about 60-200 bp containing the target site or the off-target site. For each site, a Reference probe (carrying a FAM fluorescent group) was designed against sequence flanking the cleavage site, and an NHEJ probe (carrying a VIC fluorescent group) was designed against the cleavage site itself. The NHEJ probe binds the site sequence when the site is unedited and can no longer bind once the site has been edited, so the editing status of each site, and hence the off-target behaviour, can be judged from the VIC fluorescence on the target sequence.
Wherein, the primer group, reference probe and NHEJ probe sequences of the target site EMX1-1, off-target site EMX1-1-OT1 and EMX1-1-OT2 are respectively as follows:
(1) Target site EMX1-1:
Primer set sequences:
Upstream primer F: 5'-cggaggacaaagtacaaacgg-3' (SEQ ID NO. 10);
Downstream primer R: 5'-gtcattggaggtgacatcgatg-3' (SEQ ID NO. 11).
Reference probe sequence: 5'-FAM-ccattggcctgcttcgtggcaatgcg-BHQ1-3' (SEQ ID NO. 12).
NHEJ probe sequence: 5'-VIC-cgagcagaagaagaag-MGB-3' (SEQ ID NO. 13).
(2) Off-target site EMX1-1-OT1:
Primer set sequences:
Upstream primer F: 5'-gctacctgtacatctgcacaag-3' (SEQ ID NO. 14);
Downstream primer R: 5'-aagaaatgcccaatcattgatgc-3' (SEQ ID NO. 15).
Reference probe sequence: 5'-FAM-ctgtcttgccatgccataagcccctatt-BHQ1-3' (SEQ ID NO. 16).
NHEJ probe sequence: 5'-VIC-atgcctttcttcttc-MGB-3' (SEQ ID NO. 17).
(3) Off-target site EMX1-1-OT2:
Primer set sequences:
Upstream primer F: 5'-agcctctttctcaatgtgcttc-3' (SEQ ID NO. 18);
Downstream primer R: 5'-agagtagatggttgggtagtgg-3' (SEQ ID NO. 19).
Reference probe sequence: 5'-FAM-ccatcacggcctttgcaaatagagccct-BHQ1-3' (SEQ ID NO. 20).
NHEJ probe sequence: 5'-VIC-ctaagcagaagaagaagag-MGB-3' (SEQ ID NO. 21).
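As a consistency check on the probe design described above, the short Python sketch below tests whether the EMX1-1 NHEJ probe (SEQ ID NO. 13) overlaps the 3' end of the guide sequence (SEQ ID NO.3) and therefore spans the expected cleavage position. It assumes the commonly described behaviour of SpCas9 cutting about 3 bp upstream of the PAM, which this document does not state; the function names are hypothetical.

```python
def suffix_prefix_overlap(left: str, right: str) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for n in range(min(len(left), len(right)), 0, -1):
        if left[-n:] == right[:n]:
            return n
    return 0


def cut_offset_in_probe(guide: str, probe: str, cut_upstream_of_pam: int = 3):
    """Return how many probe bases lie 5' of the assumed cut position,
    or None if the probe does not span that position."""
    overlap = suffix_prefix_overlap(guide, probe)
    if overlap == 0:
        return None
    probe_start = len(guide) - overlap            # 0-based guide index of probe base 1
    cut_index = len(guide) - cut_upstream_of_pam  # cut lies before this guide index
    offset = cut_index - probe_start
    return offset if 0 < offset < len(probe) else None


GUIDE = "gagtccgagcagaagaagaa"   # SEQ ID NO.3
NHEJ_PROBE = "cgagcagaagaagaag"  # SEQ ID NO. 13 without the VIC/MGB labels

print(suffix_prefix_overlap(GUIDE, NHEJ_PROBE))
print(cut_offset_in_probe(GUIDE, NHEJ_PROBE))
```

Under these assumptions the probe shares its first 15 bases with the 3' end of the guide and extends one base beyond it, so the assumed cut position falls inside the probe, which is what a drop-off style probe requires.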
Droplets for ddPCR were prepared with reference to the QX200™ droplet digital PCR system, or using other procedures conventional in the art.
The ddPCR reaction system is as follows:
TABLE 2 ddPCR reaction System
The reaction program was: pre-denaturation at 95 °C for 10 min; 40 cycles of denaturation at 94 °C for 30 s and annealing at 50-65 °C for 1 min; 98 °C for 10 min; hold at 4 °C.
Detection was performed with a QX200™ droplet digital PCR system: the droplets were passed one by one through the droplet analyzer, and the fluorescence signal of each droplet in the FAM channel and in the VIC channel was measured. Droplets with a fluorescence signal were scored as positive and droplets without a signal as negative. The number and proportion of positive droplets in each sample were recorded, and the editing efficiency was calculated from these proportions.
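One simple way to turn such droplet counts into an editing percentage is sketched below in Python. It is an illustrative assumption, not the formula used in this document: with the probe design described above, droplets that are FAM positive but VIC negative are taken as edited alleles and FAM/VIC double-positive droplets as unedited alleles, and no Poisson correction is applied. The function name and the droplet counts are hypothetical.

```python
def nhej_editing_efficiency(fam_only: int, double_positive: int) -> float:
    """Percentage of template-containing droplets scored as edited.

    fam_only        -- droplets positive for FAM (Reference probe) only
    double_positive -- droplets positive for both FAM and VIC (unedited)
    """
    if fam_only < 0 or double_positive < 0:
        raise ValueError("droplet counts must be non-negative")
    total = fam_only + double_positive
    if total == 0:
        raise ValueError("no template-containing droplets were counted")
    return 100.0 * fam_only / total


# Hypothetical droplet counts, for illustration only.
example = nhej_editing_efficiency(fam_only=500, double_positive=9500)
print(f"{example:.2f}% edited")
```

Instrument software typically converts droplet fractions to copy numbers with a Poisson model before taking the ratio; at low template occupancy the simple fraction above is a close approximation.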
The editing efficiency is calculated by using the NHEJ mutation rate, and the formula is as follows:
The results are shown in FIGS. 3-4 and Table 3.
FIGS. 3-4 are two-dimensional plots of the editing efficiency results of wild-type SpCas9 and mutant SpCas9 (K510A), respectively, at the three sites EMX1-1, EMX1-1-OT1 and EMX1-1-OT2. In these plots, blue dots are FAM single positive, green dots are VIC single positive, grey dots are FAM/VIC double negative, and brown dots are FAM/VIC double positive. Further analysis showed that wild-type SpCas9 had an editing rate of 53.31% at the target site EMX1-1, 2.51% at the off-target site EMX1-1-OT1 and 1.76% at the off-target site EMX1-1-OT2. The SpCas9 (K510A) mutant had an editing rate of 41.47% at the target site EMX1-1, 0.24% at the off-target site EMX1-1-OT1 and 0.08% at the off-target site EMX1-1-OT2. Compared with wild-type SpCas9, the mutant SpCas9 (K510A) retains substantially the same editing efficiency at the target site (more than 70% of the wild-type level), while its editing rate at the off-target sites is greatly reduced.
TABLE 3 editing efficiency of wild-type SpCas9 and mutant SpCas9 (K510A) at EMX1-1 target site and off-target site
vs. wild type, P <0.01, P <0.001.
Editing efficiency verification of SpCas9 (K510A) mutant at target site EMX1-1
Editing efficiency of SpCas9 (K510A) mutants was detected using T7E1 cleavage.
Primers were designed for the EMX1-1 site (amplification product length: 720 bp):
Upstream primer F: 5'-CTTCCAGAGCCTGCACTCCT-3' (SEQ ID NO. 22);
Downstream primer R: 5'-AGGCTCTCCGAGGAGAAGGC-3' (SEQ ID NO. 23).
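As a quick sanity check on the two primers, their length, GC content, and reverse complement can be computed directly from the sequences given above. The helper names are illustrative; no melting-temperature model is assumed.

```python
# Primer sequences as listed in the text (5' to 3').
PRIMERS = {
    "F (SEQ ID NO. 22)": "CTTCCAGAGCCTGCACTCCT",
    "R (SEQ ID NO. 23)": "AGGCTCTCCGAGGAGAAGGC",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_count(seq):
    """Number of G or C bases in the sequence."""
    return sum(base in "GC" for base in seq)


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


for name, seq in PRIMERS.items():
    gc_percent = 100.0 * gc_count(seq) / len(seq)
    print(f"{name}: {len(seq)} nt, GC {gc_percent:.0f}%, "
          f"reverse complement {reverse_complement(seq)}")
```

Both primers are 20 nt long, with 12 and 13 G/C bases respectively (60% and 65% GC).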
Using these primers, the EMX1-1 gene sequence was amplified with Hot Start High-Fidelity 2X Master Mix (M0494, NEB). The amplified product was subjected to T7E1 cleavage (E3321, NEB). The results are shown in Table 4.
TABLE 4 T7E1 enzymatic cleavage assay results
vs. wild type, P < 0.05.
The results showed that the cleavage rate of the SpCas9 (K510A) mutant at the target site EMX1-1 was 72.0% of that of wild-type SpCas9 (Table 4), consistent with the ddPCR results.
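The text does not state how the T7E1 cleavage rate was quantified. One commonly used estimate, shown here purely as an assumption, derives the indel fraction from gel band intensities as 1 - sqrt(1 - f_cut), where f_cut is the cleaved fraction (b + c) / (a + b + c), a is the intensity of the uncut band, and b and c are the intensities of the two cleavage products. The band intensities in the example are made up.

```python
import math


def indel_percent(uncut, cut1, cut2):
    """Estimate indel frequency (%) from T7E1 band intensities.

    Uses the commonly cited formula 100 * (1 - sqrt(1 - f_cut)); whether the
    source used this formula is not stated, so treat it as an assumption.
    """
    total = uncut + cut1 + cut2
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    f_cut = (cut1 + cut2) / total
    return 100.0 * (1.0 - math.sqrt(1.0 - f_cut))


# Hypothetical densitometry values (arbitrary units), for illustration only.
print(f"{indel_percent(60.0, 25.0, 15.0):.2f}% estimated indels")
```

A relative cleavage rate such as the one reported would then be the ratio of the mutant estimate to the wild-type estimate.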
In summary, the SpCas9 (K510A) mutant of the embodiments of the present invention is obtained by mutating the lysine residue at position 510 of wild-type SpCas9 to an alanine residue. Editing of the target sequence was assessed by the T7E1 assay, and ddPCR showed that the SpCas9 (K510A) mutant, while retaining an on-target cleavage efficiency broadly comparable to that of wild-type SpCas9, significantly reduces the editing efficiency at off-target sites; that is, the off-target rate is significantly reduced.
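The substitution can be checked against the sequence listing with a small comparison. The excerpts below are transcribed from the listing: residues 497-512 of SEQ ID NO. 1 and SEQ ID NO. 2 (converted to one-letter codes) and nucleotides 2401-2460 of SEQ ID NO. 4 and SEQ ID NO. 5. The helper name and the two-entry codon table are illustrative, and the start-codon position (894) is read off the listing rather than stated in the text.

```python
# Residues 497-512 of SEQ ID NO. 1 (wild type) and SEQ ID NO. 2 (K510A).
WT_PROTEIN = "NFDKNLPNEKVLPKHS"
MUT_PROTEIN = "NFDKNLPNEKVLPAHS"
PROTEIN_START = 497

# Nucleotides 2401-2460 of SEQ ID NO. 4 and SEQ ID NO. 5.
WT_DNA = "cgaacgaaaaagtattgcctaagcacagtttactttacgagtatttcacagtgtacaatg"
MUT_DNA = "cgaacgaaaaagtattgcctgcccacagtttactttacgagtatttcacagtgtacaatg"
DNA_START = 2401
ORF_START = 894  # position of the 'a' of the atg start codon in the listing

# Only the two codons needed here, not a full genetic code table.
CODONS = {"aag": "K", "gcc": "A"}


def differences(a, b, start):
    """Return (position, char_in_a, char_in_b) for every mismatch."""
    return [(start + i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


protein_diff = differences(WT_PROTEIN, MUT_PROTEIN, PROTEIN_START)
dna_diff = differences(WT_DNA, MUT_DNA, DNA_START)

first = dna_diff[0][0]                       # first changed nucleotide
codon_number = (first - ORF_START) // 3 + 1  # 1-based codon index in the ORF
offset = first - DNA_START
wt_codon = WT_DNA[offset:offset + 3]
mut_codon = MUT_DNA[offset:offset + 3]

print(protein_diff)
print(codon_number, wt_codon, CODONS[wt_codon], "->", mut_codon, CODONS[mut_codon])
```

In these excerpts the only protein difference is Lys to Ala at position 510, and the only nucleotide differences are at positions 2421-2423, where codon 510 changes from aag to gcc.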
The embodiments of the present invention have been described in detail with reference to the accompanying drawings, but the present invention is not limited to the above embodiments, and various changes can be made within the knowledge of one of ordinary skill in the art without departing from the spirit of the present invention. Furthermore, embodiments of the invention and features of the embodiments may be combined with each other without conflict.
SEQUENCE LISTING
<110> Southern Medical University
<120> a CRISPR SpCas9 mutant and application thereof
<130>
<160> 23
<170> PatentIn version 3.5
<210> 1
<211> 1404
<212> PRT
<213> artificial sequence
<400> 1
Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1100 1105 1110
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly
1205 1210 1215
Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala
1310 1315 1320
Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser
1325 1330 1335
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
Gly Ser Pro Lys Lys Lys Arg Lys Val Ser Ser Asp Tyr Lys Asp
1370 1375 1380
His Asp Gly Asp Tyr Lys Asp His Asp Ile Asp Tyr Lys Asp Asp
1385 1390 1395
Asp Asp Lys Ala Ala Gly
1400
<210> 2
<211> 1404
<212> PRT
<213> artificial sequence
<400> 2
Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Ala His Ser
500 505 510
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1100 1105 1110
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly
1205 1210 1215
Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala
1310 1315 1320
Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser
1325 1330 1335
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
Gly Ser Pro Lys Lys Lys Arg Lys Val Ser Ser Asp Tyr Lys Asp
1370 1375 1380
His Asp Gly Asp Tyr Lys Asp His Asp Ile Asp Tyr Lys Asp Asp
1385 1390 1395
Asp Asp Lys Ala Ala Gly
1400
<210> 3
<211> 20
<212> DNA
<213> artificial sequence
<400> 3
gagtccgagc agaagaagaa 20
<210> 4
<211> 7614
<212> DNA
<213> artificial sequence
<400> 4
cggatcggga gatcgatctc ccgatcccct agggtcgact ctcagtacaa tctgctctga 60
tgccgcatag ttaagccagt atctgctccc tgcttgtgtg ttggaggtcg ctgagtagtg 120
cgcgagcaaa atttaagcta caacaaggca aggcttgacc gacaattgca tgaagaatct 180
gcttagggtt aggcgttttg cgctgcttcg cgatgtacgg gccagatata cgcgttgaca 240
ttgattattg actagttatt aatagtaatc aattacgggg tcattagttc atagcccata 300
tatggagttc cgcgttacat aacttacggt aaatggcccg cctggctgac cgcccaacga 360
cccccgccca ttgacgtcaa taatgacgta tgttcccata gtaacgccaa tagggacttt 420
ccattgacgt caatgggtgg agtatttacg gtaaactgcc cacttggcag tacatcaagt 480
gtatcatatg ccaagtacgc cccctattga cgtcaatgac ggtaaatggc ccgcctggca 540
ttatgcccag tacatgacct tatgggactt tcctacttgg cagtacatct acgtattagt 600
catcgctatt accatggtga tgcggttttg gcagtacatc aatgggcgtg gatagcggtt 660
tgactcacgg ggatttccaa gtctccaccc cattgacgtc aatgggagtt tgttttggca 720
ccaaaatcaa cgggactttc caaaatgtcg taacaactcc gccccattga cgcaaatggg 780
cggtaggcgt gtacggtggg aggtctatat aagcagagct ggtttagtga accgtcagat 840
ccgctagaga tccgcggccg ctaatacgac tcactatagg gagagccgcc accatggata 900
aaaagtattc tattggttta gacatcggca ctaattccgt tggatgggct gtcataaccg 960
atgaatacaa agtaccttca aagaaattta aggtgttggg gaacacagac cgtcattcga 1020
ttaaaaagaa tcttatcggt gccctcctat tcgatagtgg cgaaacggca gaggcgactc 1080
gcctgaaacg aaccgctcgg agaaggtata cacgtcgcaa gaaccgaata tgttacttac 1140
aagaaatttt tagcaatgag atggccaaag ttgacgattc tttctttcac cgtttggaag 1200
agtccttcct tgtcgaagag gacaagaaac atgaacggca ccccatcttt ggaaacatag 1260
tagatgaggt ggcatatcat gaaaagtacc caacgattta tcacctcaga aaaaagctag 1320
ttgactcaac tgataaagcg gacctgaggt taatctactt ggctcttgcc catatgataa 1380
agttccgtgg gcactttctc attgagggtg atctaaatcc ggacaactcg gatgtcgaca 1440
aactgttcat ccagttagta caaacctata atcagttgtt tgaagagaac cctataaatg 1500
caagtggcgt ggatgcgaag gctattctta gcgcccgcct ctctaaatcc cgacggctag 1560
aaaacctgat cgcacaatta cccggagaga agaaaaatgg gttgttcggt aaccttatag 1620
cgctctcact aggcctgaca ccaaatttta agtcgaactt cgacttagct gaagatgcca 1680
aattgcagct tagtaaggac acgtacgatg acgatctcga caatctactg gcacaaattg 1740
gagatcagta tgcggactta tttttggctg ccaaaaacct tagcgatgca atcctcctat 1800
ctgacatact gagagttaat actgagatta ccaaggcgcc gttatccgct tcaatgatca 1860
aaaggtacga tgaacatcac caagacttga cacttctcaa ggccctagtc cgtcagcaac 1920
tgcctgagaa atataaggaa atattctttg atcagtcgaa aaacgggtac gcaggttata 1980
ttgacggcgg agcgagtcaa gaggaattct acaagtttat caaacccata ttagagaaga 2040
tggatgggac ggaagagttg cttgtaaaac tcaatcgcga agatctactg cgaaagcagc 2100
ggactttcga caacggtagc attccacatc aaatccactt aggcgaattg catgctatac 2160
ttagaaggca ggaggatttt tatccgttcc tcaaagacaa tcgtgaaaag attgagaaaa 2220
tcctaacctt tcgcatacct tactatgtgg gacccctggc ccgagggaac tctcggttcg 2280
catggatgac aagaaagtcc gaagaaacga ttactccatg gaattttgag gaagttgtcg 2340
ataaaggtgc gtcagctcaa tcgttcatcg agaggatgac caactttgac aagaatttac 2400
cgaacgaaaa agtattgcct aagcacagtt tactttacga gtatttcaca gtgtacaatg 2460
aactcacgaa agttaagtat gtcactgagg gcatgcgtaa acccgccttt ctaagcggag 2520
aacagaagaa agcaatagta gatctgttat tcaagaccaa ccgcaaagtg acagttaagc 2580
aattgaaaga ggactacttt aagaaaattg aatgcttcga ttctgtcgag atctccgggg 2640
tagaagatcg atttaatgcg tcacttggta cgtatcatga cctcctaaag ataattaaag 2700
ataaggactt cctggataac gaagagaatg aagatatctt agaagatata gtgttgactc 2760
ttaccctctt tgaagatcgg gaaatgattg aggaaagact aaaaacatac gctcacctgt 2820
tcgacgataa ggttatgaaa cagttaaaga ggcgtcgcta tacgggctgg ggacgattgt 2880
cgcggaaact tatcaacggg ataagagaca agcaaagtgg taaaactatt ctcgattttc 2940
taaagagcga cggcttcgcc aataggaact ttatgcagct gatccatgat gactctttaa 3000
ccttcaaaga ggatatacaa aaggcacagg tttccggaca aggggactca ttgcacgaac 3060
atattgcgaa tcttgctggt tcgccagcca tcaaaaaggg catactccag acagtcaaag 3120
tagtggatga gctagttaag gtcatgggac gtcacaaacc ggaaaacatt gtaatcgaga 3180
tggcacgcga aaatcaaacg actcagaagg ggcaaaaaaa cagtcgagag cggatgaaga 3240
gaatagaaga gggtattaaa gaactgggca gccagatctt aaaggagcat cctgtggaaa 3300
atacccaatt gcagaacgag aaactttacc tctattacct acaaaatgga agggacatgt 3360
atgttgatca ggaactggac ataaaccgtt tatctgatta cgacgtcgat cacattgtac 3420
cccaatcctt tttgaaggac gattcaatcg acaataaagt gcttacacgc tcggataaga 3480
accgagggaa aagtgacaat gttccaagcg aggaagtcgt aaagaaaatg aagaactatt 3540
ggcggcagct cctaaatgcg aaactgataa cgcaaagaaa gttcgataac ttaactaaag 3600
ctgagagggg tggcttgtct gaacttgaca aggccggatt tattaaacgt cagctcgtgg 3660
aaacccgcca aatcacaaag catgttgcac agatactaga ttcccgaatg aatacgaaat 3720
acgacgagaa cgataagctg attcgggaag tcaaagtaat cactttaaag tcaaaattgg 3780
tgtcggactt cagaaaggat tttcaattct ataaagttag ggagataaat aactaccacc 3840
atgcgcacga cgcttatctt aatgccgtcg tagggaccgc actcattaag aaatacccga 3900
agctagaaag tgagtttgtg tatggtgatt acaaagttta tgacgtccgt aagatgatcg 3960
cgaaaagcga acaggagata ggcaaggcta cagccaaata cttcttttat tctaacatta 4020
tgaatttctt taagacggaa atcactctgg caaacggaga gatacgcaaa cgacctttaa 4080
ttgaaaccaa tggggagaca ggtgaaatcg tatgggataa gggccgggac ttcgcgacgg 4140
tgagaaaagt tttgtccatg ccccaagtca acatagtaaa gaaaactgag gtgcagaccg 4200
gagggttttc aaaggaatcg attcttccaa aaaggaatag tgataagctc atcgctcgta 4260
aaaaggactg ggacccgaaa aagtacggtg gcttcgatag ccctacagtt gcctattctg 4320
tcctagtagt ggcaaaagtt gagaagggaa aatccaagaa actgaagtca gtcaaagaat 4380
tattggggat aacgattatg gagcgctcgt cttttgaaaa gaaccccatc gacttccttg 4440
aggcgaaagg ttacaaggaa gtaaaaaagg atctcataat taaactacca aagtatagtc 4500
tgtttgagtt agaaaatggc cgaaaacgga tgttggctag cgccggagag cttcaaaagg 4560
ggaacgaact cgcactaccg tctaaatacg tgaatttcct gtatttagcg tcccattacg 4620
agaagttgaa aggttcacct gaagataacg aacagaagca actttttgtt gagcagcaca 4680
aacattatct cgacgaaatc atagagcaaa tttcggaatt cagtaagaga gtcatcctag 4740
ctgatgccaa tctggacaaa gtattaagcg catacaacaa gcacagggat aaacccatac 4800
gtgagcaggc ggaaaatatt atccatttgt ttactcttac caacctcggc gctccagccg 4860
cattcaagta ttttgacaca acgatagatc gcaaacgata cacttctacc aaggaggtgc 4920
tagacgcgac actgattcac caatccatca cgggattata tgaaactcgg atagatttgt 4980
cacagcttgg gggtgacgga tcccccaaga agaagaggaa agtctcgagc gactacaaag 5040
accatgacgg tgattataaa gatcatgaca tcgattacaa ggatgacgat gacaaggctg 5100
caggatgacc ggtcatcatc accatcacca ttgagtttaa acccgctgat cagcctcgac 5160
tgtgccttct agttgccagc catctgttgt ttgcccctcc cccgtgcctt ccttgaccct 5220
ggaaggtgcc actcccactg tcctttccta ataaaatgag gaaattgcat cgcattgtct 5280
gagtaggtgt cattctattc tggggggtgg ggtggggcag gacagcaagg gggaggattg 5340
ggaagacaat agcaggcatg ctggggatgc ggtgggctct atggcttctg aggcggaaag 5400
aaccagctgg ggctcgatac cgtcgacctc tagctagagc ttggcgtaat catggtcata 5460
gctgtttcct gtgtgaaatt gttatccgct cacaattcca cacaacatac gagccggaag 5520
cataaagtgt aaagcctagg gtgcctaatg agtgagctaa ctcacattaa ttgcgttgcg 5580
ctcactgccc gctttccagt cgggaaacct gtcgtgccag ctgcattaat gaatcggcca 5640
acgcgcgggg agaggcggtt tgcgtattgg gcgctcttcc gcttcctcgc tcactgactc 5700
gctgcgctcg gtcgttcggc tgcggcgagc ggtatcagct cactcaaagg cggtaatacg 5760
gttatccaca gaatcagggg ataacgcagg aaagaacatg tgagcaaaag gccagcaaaa 5820
ggccaggaac cgtaaaaagg ccgcgttgct ggcgtttttc cataggctcc gcccccctga 5880
cgagcatcac aaaaatcgac gctcaagtca gaggtggcga aacccgacag gactataaag 5940
ataccaggcg tttccccctg gaagctccct cgtgcgctct cctgttccga ccctgccgct 6000
taccggatac ctgtccgcct ttctcccttc gggaagcgtg gcgctttctc atagctcacg 6060
ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag ctgggctgtg tgcacgaacc 6120
ccccgttcag cccgaccgct gcgccttatc cggtaactat cgtcttgagt ccaacccggt 6180
aagacacgac ttatcgccac tggcagcagc cactggtaac aggattagca gagcgaggta 6240
tgtaggcggt gctacagagt tcttgaagtg gtggcctaac tacggctaca ctagaagaac 6300
agtatttggt atctgcgctc tgctgaagcc agttaccttc ggaaaaagag ttggtagctc 6360
ttgatccggc aaacaaacca ccgctggtag cggtggtttt tttgtttgca agcagcagat 6420
tacgcgcaga aaaaaaggat ctcaagaaga tcctttgatc ttttctacgg ggtctgacgc 6480
tcagtggaac gaaaactcac gttaagggat tttggtcatg agattatcaa aaaggatctt 6540
cacctagatc cttttaaatt aaaaatgaag ttttaaatca atctaaagta tatatgagta 6600
aacttggtct gacagttacc aatgcttaat cagtgaggca cctatctcag cgatctgtct 6660
atttcgttca tccatagttg cctgactccc cgtcgtgtag ataactacga tacgggaggg 6720
cttaccatct ggccccagtg ctgcaatgat accgcgagac ccacgctcac cggctccaga 6780
tttatcagca ataaaccagc cagccggaag ggccgagcgc agaagtggtc ctgcaacttt 6840
atccgcctcc atccagtcta ttaattgttg ccgggaagct agagtaagta gttcgccagt 6900
taatagtttg cgcaacgttg ttgccattgc tacaggcatc gtggtgtcac gctcgtcgtt 6960
tggtatggct tcattcagct ccggttccca acgatcaagg cgagttacat gatcccccat 7020
gttgtgcaaa aaagcggtta gctccttcgg tcctccgatc gttgtcagaa gtaagttggc 7080
cgcagtgtta tcactcatgg ttatggcagc actgcataat tctcttactg tcatgccatc 7140
cgtaagatgc ttttctgtga ctggtgagta ctcaaccaag tcattctgag aatagtgtat 7200
gcggcgaccg agttgctctt gcccggcgtc aatacgggat aataccgcgc cacatagcag 7260
aactttaaaa gtgctcatca ttggaaaacg ttcttcgggg cgaaaactct caaggatctt 7320
accgctgttg agatccagtt cgatgtaacc cactcgtgca cccaactgat cttcagcatc 7380
ttttactttc accagcgttt ctgggtgagc aaaaacagga aggcaaaatg ccgcaaaaaa 7440
gggaataagg gcgacacgga aatgttgaat actcatactc ttcctttttc aatattattg 7500
aagcatttat cagggttatt gtctcatgag cggatacata tttgaatgta tttagaaaaa 7560
taaacaaata ggggttccgc gcacatttcc ccgaaaagtg ccacctgacg tcga 7614
<210> 5
<211> 7614
<212> DNA
<213> artificial sequence
<400> 5
cggatcggga gatcgatctc ccgatcccct agggtcgact ctcagtacaa tctgctctga 60
tgccgcatag ttaagccagt atctgctccc tgcttgtgtg ttggaggtcg ctgagtagtg 120
cgcgagcaaa atttaagcta caacaaggca aggcttgacc gacaattgca tgaagaatct 180
gcttagggtt aggcgttttg cgctgcttcg cgatgtacgg gccagatata cgcgttgaca 240
ttgattattg actagttatt aatagtaatc aattacgggg tcattagttc atagcccata 300
tatggagttc cgcgttacat aacttacggt aaatggcccg cctggctgac cgcccaacga 360
cccccgccca ttgacgtcaa taatgacgta tgttcccata gtaacgccaa tagggacttt 420
ccattgacgt caatgggtgg agtatttacg gtaaactgcc cacttggcag tacatcaagt 480
gtatcatatg ccaagtacgc cccctattga cgtcaatgac ggtaaatggc ccgcctggca 540
ttatgcccag tacatgacct tatgggactt tcctacttgg cagtacatct acgtattagt 600
catcgctatt accatggtga tgcggttttg gcagtacatc aatgggcgtg gatagcggtt 660
tgactcacgg ggatttccaa gtctccaccc cattgacgtc aatgggagtt tgttttggca 720
ccaaaatcaa cgggactttc caaaatgtcg taacaactcc gccccattga cgcaaatggg 780
cggtaggcgt gtacggtggg aggtctatat aagcagagct ggtttagtga accgtcagat 840
ccgctagaga tccgcggccg ctaatacgac tcactatagg gagagccgcc accatggata 900
aaaagtattc tattggttta gacatcggca ctaattccgt tggatgggct gtcataaccg 960
atgaatacaa agtaccttca aagaaattta aggtgttggg gaacacagac cgtcattcga 1020
ttaaaaagaa tcttatcggt gccctcctat tcgatagtgg cgaaacggca gaggcgactc 1080
gcctgaaacg aaccgctcgg agaaggtata cacgtcgcaa gaaccgaata tgttacttac 1140
aagaaatttt tagcaatgag atggccaaag ttgacgattc tttctttcac cgtttggaag 1200
agtccttcct tgtcgaagag gacaagaaac atgaacggca ccccatcttt ggaaacatag 1260
tagatgaggt ggcatatcat gaaaagtacc caacgattta tcacctcaga aaaaagctag 1320
ttgactcaac tgataaagcg gacctgaggt taatctactt ggctcttgcc catatgataa 1380
agttccgtgg gcactttctc attgagggtg atctaaatcc ggacaactcg gatgtcgaca 1440
aactgttcat ccagttagta caaacctata atcagttgtt tgaagagaac cctataaatg 1500
caagtggcgt ggatgcgaag gctattctta gcgcccgcct ctctaaatcc cgacggctag 1560
aaaacctgat cgcacaatta cccggagaga agaaaaatgg gttgttcggt aaccttatag 1620
cgctctcact aggcctgaca ccaaatttta agtcgaactt cgacttagct gaagatgcca 1680
aattgcagct tagtaaggac acgtacgatg acgatctcga caatctactg gcacaaattg 1740
gagatcagta tgcggactta tttttggctg ccaaaaacct tagcgatgca atcctcctat 1800
ctgacatact gagagttaat actgagatta ccaaggcgcc gttatccgct tcaatgatca 1860
aaaggtacga tgaacatcac caagacttga cacttctcaa ggccctagtc cgtcagcaac 1920
tgcctgagaa atataaggaa atattctttg atcagtcgaa aaacgggtac gcaggttata 1980
ttgacggcgg agcgagtcaa gaggaattct acaagtttat caaacccata ttagagaaga 2040
tggatgggac ggaagagttg cttgtaaaac tcaatcgcga agatctactg cgaaagcagc 2100
ggactttcga caacggtagc attccacatc aaatccactt aggcgaattg catgctatac 2160
ttagaaggca ggaggatttt tatccgttcc tcaaagacaa tcgtgaaaag attgagaaaa 2220
tcctaacctt tcgcatacct tactatgtgg gacccctggc ccgagggaac tctcggttcg 2280
catggatgac aagaaagtcc gaagaaacga ttactccatg gaattttgag gaagttgtcg 2340
ataaaggtgc gtcagctcaa tcgttcatcg agaggatgac caactttgac aagaatttac 2400
cgaacgaaaa agtattgcct gcccacagtt tactttacga gtatttcaca gtgtacaatg 2460
aactcacgaa agttaagtat gtcactgagg gcatgcgtaa acccgccttt ctaagcggag 2520
aacagaagaa agcaatagta gatctgttat tcaagaccaa ccgcaaagtg acagttaagc 2580
aattgaaaga ggactacttt aagaaaattg aatgcttcga ttctgtcgag atctccgggg 2640
tagaagatcg atttaatgcg tcacttggta cgtatcatga cctcctaaag ataattaaag 2700
ataaggactt cctggataac gaagagaatg aagatatctt agaagatata gtgttgactc 2760
ttaccctctt tgaagatcgg gaaatgattg aggaaagact aaaaacatac gctcacctgt 2820
tcgacgataa ggttatgaaa cagttaaaga ggcgtcgcta tacgggctgg ggacgattgt 2880
cgcggaaact tatcaacggg ataagagaca agcaaagtgg taaaactatt ctcgattttc 2940
taaagagcga cggcttcgcc aataggaact ttatgcagct gatccatgat gactctttaa 3000
ccttcaaaga ggatatacaa aaggcacagg tttccggaca aggggactca ttgcacgaac 3060
atattgcgaa tcttgctggt tcgccagcca tcaaaaaggg catactccag acagtcaaag 3120
tagtggatga gctagttaag gtcatgggac gtcacaaacc ggaaaacatt gtaatcgaga 3180
tggcacgcga aaatcaaacg actcagaagg ggcaaaaaaa cagtcgagag cggatgaaga 3240
gaatagaaga gggtattaaa gaactgggca gccagatctt aaaggagcat cctgtggaaa 3300
atacccaatt gcagaacgag aaactttacc tctattacct acaaaatgga agggacatgt 3360
atgttgatca ggaactggac ataaaccgtt tatctgatta cgacgtcgat cacattgtac 3420
cccaatcctt tttgaaggac gattcaatcg acaataaagt gcttacacgc tcggataaga 3480
accgagggaa aagtgacaat gttccaagcg aggaagtcgt aaagaaaatg aagaactatt 3540
ggcggcagct cctaaatgcg aaactgataa cgcaaagaaa gttcgataac ttaactaaag 3600
ctgagagggg tggcttgtct gaacttgaca aggccggatt tattaaacgt cagctcgtgg 3660
aaacccgcca aatcacaaag catgttgcac agatactaga ttcccgaatg aatacgaaat 3720
acgacgagaa cgataagctg attcgggaag tcaaagtaat cactttaaag tcaaaattgg 3780
tgtcggactt cagaaaggat tttcaattct ataaagttag ggagataaat aactaccacc 3840
atgcgcacga cgcttatctt aatgccgtcg tagggaccgc actcattaag aaatacccga 3900
agctagaaag tgagtttgtg tatggtgatt acaaagttta tgacgtccgt aagatgatcg 3960
cgaaaagcga acaggagata ggcaaggcta cagccaaata cttcttttat tctaacatta 4020
tgaatttctt taagacggaa atcactctgg caaacggaga gatacgcaaa cgacctttaa 4080
ttgaaaccaa tggggagaca ggtgaaatcg tatgggataa gggccgggac ttcgcgacgg 4140
tgagaaaagt tttgtccatg ccccaagtca acatagtaaa gaaaactgag gtgcagaccg 4200
gagggttttc aaaggaatcg attcttccaa aaaggaatag tgataagctc atcgctcgta 4260
aaaaggactg ggacccgaaa aagtacggtg gcttcgatag ccctacagtt gcctattctg 4320
tcctagtagt ggcaaaagtt gagaagggaa aatccaagaa actgaagtca gtcaaagaat 4380
tattggggat aacgattatg gagcgctcgt cttttgaaaa gaaccccatc gacttccttg 4440
aggcgaaagg ttacaaggaa gtaaaaaagg atctcataat taaactacca aagtatagtc 4500
tgtttgagtt agaaaatggc cgaaaacgga tgttggctag cgccggagag cttcaaaagg 4560
ggaacgaact cgcactaccg tctaaatacg tgaatttcct gtatttagcg tcccattacg 4620
agaagttgaa aggttcacct gaagataacg aacagaagca actttttgtt gagcagcaca 4680
aacattatct cgacgaaatc atagagcaaa tttcggaatt cagtaagaga gtcatcctag 4740
ctgatgccaa tctggacaaa gtattaagcg catacaacaa gcacagggat aaacccatac 4800
gtgagcaggc ggaaaatatt atccatttgt ttactcttac caacctcggc gctccagccg 4860
cattcaagta ttttgacaca acgatagatc gcaaacgata cacttctacc aaggaggtgc 4920
tagacgcgac actgattcac caatccatca cgggattata tgaaactcgg atagatttgt 4980
cacagcttgg gggtgacgga tcccccaaga agaagaggaa agtctcgagc gactacaaag 5040
accatgacgg tgattataaa gatcatgaca tcgattacaa ggatgacgat gacaaggctg 5100
caggatgacc ggtcatcatc accatcacca ttgagtttaa acccgctgat cagcctcgac 5160
tgtgccttct agttgccagc catctgttgt ttgcccctcc cccgtgcctt ccttgaccct 5220
ggaaggtgcc actcccactg tcctttccta ataaaatgag gaaattgcat cgcattgtct 5280
gagtaggtgt cattctattc tggggggtgg ggtggggcag gacagcaagg gggaggattg 5340
ggaagacaat agcaggcatg ctggggatgc ggtgggctct atggcttctg aggcggaaag 5400
aaccagctgg ggctcgatac cgtcgacctc tagctagagc ttggcgtaat catggtcata 5460
gctgtttcct gtgtgaaatt gttatccgct cacaattcca cacaacatac gagccggaag 5520
cataaagtgt aaagcctagg gtgcctaatg agtgagctaa ctcacattaa ttgcgttgcg 5580
ctcactgccc gctttccagt cgggaaacct gtcgtgccag ctgcattaat gaatcggcca 5640
acgcgcgggg agaggcggtt tgcgtattgg gcgctcttcc gcttcctcgc tcactgactc 5700
gctgcgctcg gtcgttcggc tgcggcgagc ggtatcagct cactcaaagg cggtaatacg 5760
gttatccaca gaatcagggg ataacgcagg aaagaacatg tgagcaaaag gccagcaaaa 5820
ggccaggaac cgtaaaaagg ccgcgttgct ggcgtttttc cataggctcc gcccccctga 5880
cgagcatcac aaaaatcgac gctcaagtca gaggtggcga aacccgacag gactataaag 5940
ataccaggcg tttccccctg gaagctccct cgtgcgctct cctgttccga ccctgccgct 6000
taccggatac ctgtccgcct ttctcccttc gggaagcgtg gcgctttctc atagctcacg 6060
ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag ctgggctgtg tgcacgaacc 6120
ccccgttcag cccgaccgct gcgccttatc cggtaactat cgtcttgagt ccaacccggt 6180
aagacacgac ttatcgccac tggcagcagc cactggtaac aggattagca gagcgaggta 6240
tgtaggcggt gctacagagt tcttgaagtg gtggcctaac tacggctaca ctagaagaac 6300
agtatttggt atctgcgctc tgctgaagcc agttaccttc ggaaaaagag ttggtagctc 6360
ttgatccggc aaacaaacca ccgctggtag cggtggtttt tttgtttgca agcagcagat 6420
tacgcgcaga aaaaaaggat ctcaagaaga tcctttgatc ttttctacgg ggtctgacgc 6480
tcagtggaac gaaaactcac gttaagggat tttggtcatg agattatcaa aaaggatctt 6540
cacctagatc cttttaaatt aaaaatgaag ttttaaatca atctaaagta tatatgagta 6600
aacttggtct gacagttacc aatgcttaat cagtgaggca cctatctcag cgatctgtct 6660
atttcgttca tccatagttg cctgactccc cgtcgtgtag ataactacga tacgggaggg 6720
cttaccatct ggccccagtg ctgcaatgat accgcgagac ccacgctcac cggctccaga 6780
tttatcagca ataaaccagc cagccggaag ggccgagcgc agaagtggtc ctgcaacttt 6840
atccgcctcc atccagtcta ttaattgttg ccgggaagct agagtaagta gttcgccagt 6900
taatagtttg cgcaacgttg ttgccattgc tacaggcatc gtggtgtcac gctcgtcgtt 6960
tggtatggct tcattcagct ccggttccca acgatcaagg cgagttacat gatcccccat 7020
gttgtgcaaa aaagcggtta gctccttcgg tcctccgatc gttgtcagaa gtaagttggc 7080
cgcagtgtta tcactcatgg ttatggcagc actgcataat tctcttactg tcatgccatc 7140
cgtaagatgc ttttctgtga ctggtgagta ctcaaccaag tcattctgag aatagtgtat 7200
gcggcgaccg agttgctctt gcccggcgtc aatacgggat aataccgcgc cacatagcag 7260
aactttaaaa gtgctcatca ttggaaaacg ttcttcgggg cgaaaactct caaggatctt 7320
accgctgttg agatccagtt cgatgtaacc cactcgtgca cccaactgat cttcagcatc 7380
ttttactttc accagcgttt ctgggtgagc aaaaacagga aggcaaaatg ccgcaaaaaa 7440
gggaataagg gcgacacgga aatgttgaat actcatactc ttcctttttc aatattattg 7500
aagcatttat cagggttatt gtctcatgag cggatacata tttgaatgta tttagaaaaa 7560
taaacaaata ggggttccgc gcacatttcc ccgaaaagtg ccacctgacg tcga 7614
<210> 6
<211> 8408
<212> DNA
<213> artificial sequence
<400> 6
aatgtagtct tatgcaatac tcttgtagtc ttgcaacatg gtaacgatga gttagcaaca 60
tgccttacaa ggagagaaaa agcaccgtgc atgccgattg gtggaagtaa ggtggtacga 120
tcgtgcctta ttaggaaggc aacagacggg tctgacatgg attggacgaa ccactgaatt 180
gccgcattgc agagatattg tatttaagtg cctagctcga tacataaacg ggtctctctg 240
gttagaccag atctgagcct gggagctctc tggctaacta gggaacccac tgcttaagcc 300
tcaataaagc ttgccttgag tgcttcaagt agtgtgtgcc cgtctgttgt gtgactctgg 360
taactagaga tccctcagac ccttttagtc agtgtggaaa atctctagca gtggcgcccg 420
aacagggact tgaaagcgaa agggaaacca gaggagctct ctcgacgcag gactcggctt 480
gctgaagcgc gcacggcaag aggcgagggg cggcgactgg tgagtacgcc aaaaattttg 540
actagcggag gctagaagga gagagatggg tgcgagagcg tcagtattaa gcgggggaga 600
attagatcgc gatgggaaaa aattcggtta aggccagggg gaaagaaaaa atataaatta 660
aaacatatag tatgggcaag cagggagcta gaacgattcg cagttaatcc tggcctgtta 720
gaaacatcag aaggctgtag acaaatactg ggacagctac aaccatccct tcagacagga 780
tcagaagaac ttagatcatt atataataca gtagcaaccc tctattgtgt gcatcaaagg 840
atagagataa aagacaccaa ggaagcttta gacaagatag aggaagagca aaacaaaagt 900
aagaccaccg cacagcaagc ggccgctgat cttcagacct ggaggaggag atatgaggga 960
caattggaga agtgaattat ataaatataa agtagtaaaa attgaaccat taggagtagc 1020
acccaccaag gcaaagagaa gagtggtgca gagagaaaaa agagcagtgg gaataggagc 1080
tttgttcctt gggttcttgg gagcagcagg aagcactatg ggcgcagcgt caatgacgct 1140
gacggtacag gccagacaat tattgtctgg tatagtgcag cagcagaaca atttgctgag 1200
ggctattgag gcgcaacagc atctgttgca actcacagtc tggggcatca agcagctcca 1260
ggcaagaatc ctggctgtgg aaagatacct aaaggatcaa cagctcctgg ggatttgggg 1320
ttgctctgga aaactcattt gcaccactgc tgtgccttgg aatgctagtt ggagtaataa 1380
atctctggaa cagatttgga atcacacgac ctggatggag tgggacagag aaattaacaa 1440
ttacacaagc ttaatacact ccttaattga agaatcgcaa aaccagcaag aaaagaatga 1500
acaagaatta ttggaattag ataaatgggc aagtttgtgg aattggttta acataacaaa 1560
ttggctgtgg tatataaaat tattcataat gatagtagga ggcttggtag gtttaagaat 1620
agtttttgct gtactttcta tagtgaatag agttaggcag ggatattcac cattatcgtt 1680
tcagacccac ctcccaaccc cgaggggacc cgacaggccc gaaggaatag aagaagaagg 1740
tggagagaga gacagagaca gatccattcg attagtgaac ggatctcgac ggtatcgcta 1800
gcttttaaaa gaaaaggggg gattgggggg tacagtgcag gggaaagaat agtagacata 1860
atagcaacag acatacaaac taaagaatta caaaaacaaa ttacaaaaat tcaaaatttt 1920
actagtgagg gcctatttcc catgattcct tcatatttgc atatacgata caaggctgtt 1980
agagagataa ttggaattaa tttgactgta aacacaaaga tattagtaca aaatacgtga 2040
cgtagaaagt aataatttct tgggtagttt gcagttttaa aattatgttt taaaatggac 2100
tatcatatgc ttaccgtaac ttgaaagtat ttcgatttct tggctttata tatcttgtgg 2160
aaaggacgaa acaccggagt ccgagcagaa gaagaagttt tagagctaga aatagcaagt 2220
taaaataagg ctagtccgtt atcaacttga aaaagtggca ccgagtcggt gcttttttgt 2280
ctagaggtac cgaattccaa ctttgtatag aaaagttggg gttgcgcctt ttccaaggca 2340
gccctgggtt tgcgcaggga cgcggctgct ctgggcgtgg ttccgggaaa cgcagcggcg 2400
ccgaccctgg gtctcgcaca ttcttcacgt ccgttcgcag cgtcacccgg atcttcgccg 2460
ctacccttgt gggccccccg gcgacgcttc ctgctccgcc cctaagtcgg gaaggttcct 2520
tgcggttcgc ggcgtgccgg acgtgacaaa cggaagccgc acgtctcact agtaccctcg 2580
cagacggaca gcgccaggga gcaatggcag cgcgccgacc gcgatgggct gtggccaata 2640
gcggctgctc agcagggcgc gccgagagca gcggccggga aggggcggtg cgggaggcgg 2700
ggtgtggggc ggtagtgtgg gccctgttcc tgcccgcgcg gtgttccgca ttctgcaagc 2760
ctccggagcg cacgtcggca gtcggctccc tcgttgaccg aatcaccgac ctctctcccc 2820
aggcaagttt gtacaaaaaa gcaggctgcc accatggtga gcaagggcga ggagctgttc 2880
accggggtgg tgcccatcct ggtcgagctg gacggcgacg taaacggcca caagttcagc 2940
gtgtccggcg agggcgaggg cgatgccacc tacggcaagc tgaccctgaa gttcatctgc 3000
accaccggca agctgcccgt gccctggccc accctcgtga ccaccctgac ctacggcgtg 3060
cagtgcttca gccgctaccc cgaccacatg aagcagcacg acttcttcaa gtccgccatg 3120
cccgaaggct acgtccagga gcgcaccatc ttcttcaagg acgacggcaa ctacaagacc 3180
cgcgccgagg tgaagttcga gggcgacacc ctggtgaacc gcatcgagct gaagggcatc 3240
gacttcaagg aggacggcaa catcctgggg cacaagctgg agtacaacta caacagccac 3300
aacgtctata tcatggccga caagcagaag aacggcatca aggtgaactt caagatccgc 3360
cacaacatcg aggacggcag cgtgcagctc gccgaccact accagcagaa cacccccatc 3420
ggcgacggcc ccgtgctgct gcccgacaac cactacctga gcacccagtc cgccctgagc 3480
aaagacccca acgagaagcg cgatcacatg gtcctgctgg agttcgtgac cgccgccggg 3540
atcactctcg gcatggacga gctgtacaag ggctccggag agggcagggg aagtcttcta 3600
acatgcgggg acgtggagga aaatcccggc cccatgaccg agtacaagcc cacggtgcgc 3660
ctcgccaccc gcgacgacgt ccccagggcc gtacgcaccc tcgccgccgc gttcgccgac 3720
taccccgcca cgcgccacac cgtcgatccg gaccgccaca tcgagcgggt caccgagctg 3780
caagaactct tcctcacgcg cgtcgggctc gacatcggca aggtgtgggt cgcggacgac 3840
ggcgccgcgg tggcggtctg gaccacgccg gagagcgtcg aagcgggggc ggtgttcgcc 3900
gagatcggcc cgcgcatggc cgagttgagc ggttcccggc tggccgcgca gcaacagatg 3960
gaaggcctcc tggcgccgca ccggcccaag gagcccgcgt ggttcctggc caccgtcggc 4020
gtctcgcccg accaccaggg caagggtctg ggcagcgccg tcgtgctccc cggagtggag 4080
gcggccgagc gcgccggggt gcccgccttc ctggagacct ccgcgccccg caacctcccc 4140
ttctacgagc ggctcggctt caccgtcacc gccgacgtcg aggtgcccga aggaccgcgc 4200
acctggtgca tgacccgcaa gcccggtgcc tgaacccagc tttcttgtac aaagtggtgg 4260
tacccgataa tcaacctctg gattacaaaa tttgtgaaag attgactggt attcttaact 4320
atgttgctcc ttttacgcta tgtggatacg ctgctttaat gcctttgtat catgctattg 4380
cttcccgtat ggctttcatt ttctcctcct tgtataaatc ctggttgctg tctctttatg 4440
aggagttgtg gcccgttgtc aggcaacgtg gcgtggtgtg cactgtgttt gctgacgcaa 4500
cccccactgg ttggggcatt gccaccacct gtcagctcct ttccgggact ttcgctttcc 4560
ccctccctat tgccacggcg gaactcatcg ccgcctgcct tgcccgctgc tggacagggg 4620
ctcggctgtt gggcactgac aattccgtgg tgttgtcggg gaagctgacg tcctttccat 4680
ggctgctcgc ctgtgttgcc acctggattc tgcgcgggac gtccttctgc tacgtccctt 4740
cggccctcaa tccagcggac cttccttccc gcggcctgct gccggctctg cggcctcttc 4800
cgcgtcttcg ccttcgccct cagacgagtc ggatctccct ttgggccgcc tccccgcatc 4860
ggctttaaga ccaatgactt acaaggcagc tgtagatctt agccactttt taaaagaaaa 4920
ggggggactg gaagggctaa ttcactccca acgaagacaa gatctgcttt ttgcttgtac 4980
tgggtctctc tggttagacc agatctgagc ctgggagctc tctggctaac tagggaaccc 5040
actgcttaag cctcaataaa gcttgccttg agtgcttcaa gtagtgtgtg cccgtctgtt 5100
gtgtgactct ggtaactaga gatccctcag acccttttag tcagtgtgga aaatctctag 5160
cagtagtagt tcatgtcatc ttattattca gtatttataa cttgcaaaga aatgaatatc 5220
agagagtgag aggaacttgt ttattgcagc ttataatggt tacaaataaa gcaatagcat 5280
cacaaatttc acaaataaag catttttttc actgcattct agttgtggtt tgtccaaact 5340
catcaatgta tcttatcatg tctggctcta gctatcccgc ccctaactcc gcccatcccg 5400
cccctaactc cgcccagttc cgcccattct ccgccccatg gctgactaat tttttttatt 5460
tatgcagagg ccgaggccgc ctcggcctct gagctattcc agaagtagtg aggaggcttt 5520
tttggaggcc tagggacgta cccaattcgc cctatagtga gtcgtattac gcgcgctcac 5580
tggccgtcgt tttacaacgt cgtgactggg aaaaccctgg cgttacccaa cttaatcgcc 5640
ttgcagcaca tccccctttc gccagctggc gtaatagcga agaggcccgc accgatcgcc 5700
cttcccaaca gttgcgcagc ctgaatggcg aatgggacgc gccctgtagc ggcgcattaa 5760
gcgcggcggg tgtggtggtt acgcgcagcg tgaccgctac acttgccagc gccctagcgc 5820
ccgctccttt cgctttcttc ccttcctttc tcgccacgtt cgccggcttt ccccgtcaag 5880
ctctaaatcg ggggctccct ttagggttcc gatttagtgc tttacggcac ctcgacccca 5940
aaaaacttga ttagggtgat ggttcacgta gtgggccatc gccctgatag acggtttttc 6000
gccctttgac gttggagtcc acgttcttta atagtggact cttgttccaa actggaacaa 6060
cactcaaccc tatctcggtc tattcttttg atttataagg gattttgccg atttcggcct 6120
attggttaaa aaatgagctg atttaacaaa aatttaacgc gaattttaac aaaatattaa 6180
cgcttacaat ttaggtggca cttttcgggg aaatgtgcgc ggaaccccta tttgtttatt 6240
tttctaaata cattcaaata tgtatccgct catgagacaa taaccctgat aaatgcttca 6300
ataatattga aaaaggaaga gtatgagtat tcaacatttc cgtgtcgccc ttattccctt 6360
ttttgcggca ttttgccttc ctgtttttgc tcacccagaa acgctggtga aagtaaaaga 6420
tgctgaagat cagttgggtg cacgagtggg ttacatcgaa ctggatctca acagcggtaa 6480
gatccttgag agttttcgcc ccgaagaacg ttttccaatg atgagcactt ttaaagttct 6540
gctatgtggc gcggtattat cccgtattga cgccgggcaa gagcaactcg gtcgccgcat 6600
acactattct cagaatgact tggttgagta ctcaccagtc acagaaaagc atcttacgga 6660
tggcatgaca gtaagagaat tatgcagtgc tgccataacc atgagtgata acactgcggc 6720
caacttactt ctgacaacga tcggaggacc gaaggagcta accgcttttt tgcacaacat 6780
gggggatcat gtaactcgcc ttgatcgttg ggaaccggag ctgaatgaag ccataccaaa 6840
cgacgagcgt gacaccacga tgcctgtagc aatggcaaca acgttgcgca aactattaac 6900
tggcgaacta cttactctag cttcccggca acaattaata gactggatgg aggcggataa 6960
agttgcagga ccacttctgc gctcggccct tccggctggc tggtttattg ctgataaatc 7020
tggagccggt gagcgtgggt ctcgcggtat cattgcagca ctggggccag atggtaagcc 7080
ctcccgtatc gtagttatct acacgacggg gagtcaggca actatggatg aacgaaatag 7140
acagatcgct gagataggtg cctcactgat taagcattgg taactgtcag accaagttta 7200
ctcatatata ctttagattg atttaaaact tcatttttaa tttaaaagga tctaggtgaa 7260
gatccttttt gataatctca tgaccaaaat cccttaacgt gagttttcgt tccactgagc 7320
gtcagacccc gtagaaaaga tcaaaggatc ttcttgagat cctttttttc tgcgcgtaat 7380
ctgctgcttg caaacaaaaa aaccaccgct accagcggtg gtttgtttgc cggatcaaga 7440
gctaccaact ctttttccga aggtaactgg cttcagcaga gcgcagatac caaatactgt 7500
tcttctagtg tagccgtagt taggccacca cttcaagaac tctgtagcac cgcctacata 7560
cctcgctctg ctaatcctgt taccagtggc tgctgccagt ggcgataagt cgtgtcttac 7620
cgggttggac tcaagacgat agttaccgga taaggcgcag cggtcgggct gaacgggggg 7680
ttcgtgcaca cagcccagct tggagcgaac gacctacacc gaactgagat acctacagcg 7740
tgagctatga gaaagcgcca cgcttcccga agagagaaag gcggacaggt atccggtaag 7800
cggcagggtc ggaacaggag agcgcacgag ggagcttcca gggggaaacg cctggtatct 7860
ttatagtcct gtcgggtttc gccacctctg acttgagcgt cgatttttgt gatgctcgtc 7920
aggggggcgg agcctatgga aaaacgccag caacgcggcc tttttacggt tcctggcctt 7980
ttgctggcct tttgctcaca tgttctttcc tgcgttatcc cctgattctg tggataaccg 8040
tattaccgcc tttgagtgag ctgataccgc tcgccgcagc cgaacgaccg agcgcagcga 8100
gtcagtgagc gaggaagcgg aagagcgccc aatacgcaaa ccgcctctcc ccgcgcgttg 8160
gccgattcat taatgcagct ggcacgacag gtttcccgac tggaaagcgg gcagtgagcg 8220
caacgcaatt aatgtgagtt agctcactca ttaggcaccc caggctttac actttatgct 8280
tccggctcgt atgttgtgtg gaattgtgag cggataacaa tttcacacag gaaacagcta 8340
tgaccatgat tacgccaagc gcgcaattaa ccctcactaa agggaacaaa agctggagct 8400
gcaagctt 8408
<210> 7
<211> 20
<212> DNA
<213> artificial sequence
<400> 7
gagtccgagc agaagaagaa 20
<210> 8
<211> 20
<212> DNA
<213> artificial sequence
<400> 8
gagttagagc agaagaagaa 20
<210> 9
<211> 20
<212> DNA
<213> artificial sequence
<400> 9
gagtctaagc agaagaagaa 20
<210> 10
<211> 21
<212> DNA
<213> artificial sequence
<400> 10
cggaggacaa agtacaaacg g 21
<210> 11
<211> 22
<212> DNA
<213> artificial sequence
<400> 11
gtcattggag gtgacatcga tg 22
<210> 12
<211> 26
<212> DNA
<213> artificial sequence
<400> 12
ccattggcct gcttcgtggc aatgcg 26
<210> 13
<211> 16
<212> DNA
<213> artificial sequence
<400> 13
cgagcagaag aagaag 16
<210> 14
<211> 22
<212> DNA
<213> artificial sequence
<400> 14
gctacctgta catctgcaca ag 22
<210> 15
<211> 23
<212> DNA
<213> artificial sequence
<400> 15
aagaaatgcc caatcattga tgc 23
<210> 16
<211> 28
<212> DNA
<213> artificial sequence
<400> 16
ctgtcttgcc atgccataag cccctatt 28
<210> 17
<211> 15
<212> DNA
<213> artificial sequence
<400> 17
atgcctttct tcttc 15
<210> 18
<211> 22
<212> DNA
<213> artificial sequence
<400> 18
agcctctttc tcaatgtgct tc 22
<210> 19
<211> 22
<212> DNA
<213> artificial sequence
<400> 19
agagtagatg gttgggtagt gg 22
<210> 20
<211> 28
<212> DNA
<213> artificial sequence
<400> 20
ccatcacggc ctttgcaaat agagccct 28
<210> 21
<211> 19
<212> DNA
<213> artificial sequence
<400> 21
ctaagcagaa gaagaagag 19
<210> 22
<211> 20
<212> DNA
<213> artificial sequence
<400> 22
cttccagagc ctgcactcct 20
<210> 23
<211> 20
<212> DNA
<213> artificial sequence
<400> 23
aggctctccg aggagaaggc 20
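
The short records in the listing above can be checked mechanically. The following is a minimal sketch, not part of the patent, that parses a few records copied verbatim from the listing (SEQ ID NO. 7, 8 and 9), confirms that each declared `<211>` length matches the sequence actually given, and reports the positions at which SEQ ID NO. 8 and SEQ ID NO. 9 differ from SEQ ID NO. 7. The helper names `parse_records` and `mismatch_positions` are hypothetical, and the sketch makes no claim about the biological role of these sequences.

```python
import re

# Records copied from the sequence listing above (SEQ ID NO. 7, 8 and 9).
LISTING = """\
<210> 7
<211> 20
<212> DNA
<213> artificial sequence
<400> 7
gagtccgagc agaagaagaa 20
<210> 8
<211> 20
<212> DNA
<213> artificial sequence
<400> 8
gagttagagc agaagaagaa 20
<210> 9
<211> 20
<212> DNA
<213> artificial sequence
<400> 9
gagtctaagc agaagaagaa 20
"""


def parse_records(text):
    """Return {seq_id: (declared_length, sequence)} for each <210> record."""
    records = {}
    # Each chunk after a <210> tag starts with the sequence identifier.
    for chunk in re.split(r"<210>\s*", text)[1:]:
        seq_id = int(chunk.split()[0])
        declared = int(re.search(r"<211>\s*(\d+)", chunk).group(1))
        # Everything after the <400> tag is sequence data plus position counters.
        body = re.split(r"<400>\s*\d+", chunk, maxsplit=1)[1]
        # Keep only lowercase nucleotide letters; spaces and counters are dropped.
        sequence = "".join(re.findall(r"[acgtn]+", body))
        records[seq_id] = (declared, sequence)
    return records


def mismatch_positions(a, b):
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [i for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


records = parse_records(LISTING)

for seq_id, (declared, sequence) in records.items():
    status = "ok" if declared == len(sequence) else "LENGTH MISMATCH"
    print(f"SEQ ID NO. {seq_id}: declared {declared}, found {len(sequence)} ({status})")

reference = records[7][1]
print(mismatch_positions(reference, records[8][1]))  # [5, 6]
print(mismatch_positions(reference, records[9][1]))  # [6, 7]
```

The same parser should also work on the longer records, provided the full text of the record (including the trailing position counters) is pasted into `LISTING`, since the counters are digits and are ignored by the nucleotide pattern.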

Claims (7)

1. A SpCas9 mutant, characterized in that the amino acid sequence of the SpCas9 mutant is as shown in SEQ ID NO. 2.
2. A composition comprising the SpCas9 mutant of claim 1.
3. A polynucleotide encoding the SpCas9 mutant of claim 1.
4. A guide-polynucleotide/Cas complex, characterized in that the guide-polynucleotide/Cas complex comprises a guide-polynucleotide and the SpCas9 mutant of claim 1;
wherein the nucleotide sequence of the guide-polynucleotide is as shown in SEQ ID NO. 3.
5. A recombinant vector, recombinant bacterium or cell comprising the polynucleotide of claim 3, wherein the cell is derived from a non-human animal, a bacterium, a fungus, an insect or a yeast.
6. A method of modifying a target site in a cell genome for non-diagnostic and non-therapeutic purposes, comprising: introducing the guide-polynucleotide/Cas complex of claim 4 into a cell.
7. Use of the SpCas9 mutant of claim 1 in the preparation of a gene editing product.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011587591.9A CN112538471B (en) 2020-12-28 2020-12-28 CRISPR SpCas9 (K510A) mutant and application thereof

Publications (2)

Publication Number Publication Date
CN112538471A CN112538471A (en) 2021-03-23
CN112538471B CN112538471B (en) 2023-12-12

Family

ID=75017796

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011587591.9A Active CN112538471B (en) 2020-12-28 2020-12-28 CRISPR SpCas9 (K510A) mutant and application thereof

Country Status (1)

Country Link
CN (1) CN112538471B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108350449A (en) * 2015-08-28 2018-07-31 The General Hospital Corporation Engineered CRISPR-Cas9 nucleases
CN110272881A (en) * 2019-06-29 2019-09-24 Fudan University High-specificity truncated variants TSpCas9-V1/V2 of endonuclease SpCas9 and application thereof

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015089486A2 (en) * 2013-12-12 2015-06-18 The Broad Institute Inc. Systems, methods and compositions for sequence manipulation with optimized functional crispr-cas systems
US11214779B2 (en) * 2015-04-08 2022-01-04 University of Pittsburgh—of the Commonwealth System of Higher Education Activatable CRISPR/CAS9 for spatial and temporal control of genome editing
US9926546B2 (en) * 2015-08-28 2018-03-27 The General Hospital Corporation Engineered CRISPR-Cas9 nucleases
AU2018279829B2 (en) * 2017-06-09 2024-01-04 Editas Medicine, Inc. Engineered Cas9 nucleases
CA3069296A1 (en) * 2017-07-07 2019-01-10 Toolgen Incorporated Target-specific crispr mutant

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Minghui Guo et al. Structural insights into a high fidelity variant of SpCas9. Cell Research. 2019, Vol. 29, pp. 183-192. *

Also Published As

Publication number Publication date
CN112538471A (en) 2021-03-23

Similar Documents

Publication Publication Date Title
CN109735479B (en) Recombinant bacillus subtilis for synthesizing 2'-fucosyllactose and construction method and application thereof
US6429001B1 (en) Recombinant AAV packaging systems
KR101992345B1 (en) Promoter polynucleotide, signal polypeptide and use thereof
AU2023241335A1 (en) Biosynthesis Of Benzylisoquinoline Alkaloids And Benzylisoquinoline Alkaloid Precursors
KR20230035362A (en) Improved RNA editing methods
US20050260624A1 (en) Novel nucleic acid complexes and detection thereof
CN110343698B (en) Method for constructing B2m site-directed knock-in human B2M cDNA mouse model
CN107190001A (en) A kind of method for synthesizing gene
CN112538471B (en) CRISPR SpCas9 (K510A) mutant and application thereof
CN112680430B (en) CRISPR SpCas9 mutant and application thereof
CN113215155A (en) Primer designed for TCR with epitope point of FLYALALLL and application thereof
CN113388612A (en) Primer designed for TCR with epitope point of IYVLVMLVL and application thereof
BRPI0619665A2 (en) method for enhancing expression of a transgene in a host cell, mammalian expression cassette, vector, host cell, DNA vaccine and pharmaceutical composition
CN112680477A (en) Seamless cloning technology-based H9N2 subtype avian influenza virus rescue method
CN114085874A (en) Method for preparing immortalized liver cells with reversible liver functions and application thereof
CN112342216B (en) CRISPR-Cas13d system for improving expression efficiency of CHO cells and recombinant CHO cells
CN102361977B (en) Nucleic acid derived from hepatitis c virus, and expression vector, transformed cell and hepatitis c virus particles each prepared by using same
CN113897359A (en) Improved RNA editing method
CN110331164B (en) Targeting vector for mouse with LILRA3 gene knock-in and construction method of mouse with LILRA3 gene knock-in
CN107988202B (en) Method for knocking out saccharomyces cerevisiae chromosome
CN108277232A (en) A kind of Se-enriched yeast and preparation method thereof of ease constipation function
CN112342240B (en) 59R mutant vector for expressing rFC protein and preparation method and application thereof
CN101899472A (en) Pig endogenous retrovirus vector and construction method thereof
CN112359059B (en) 84E mutant vector for expressing rFC protein and preparation method and application thereof
KR101973007B1 (en) Recombinant transition vector for enhancement of foreign protein expression

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant