Disclosure of Invention
The invention provides a novel high-activity transposase that shows very high transposition activity in Escherichia coli, insect cells, yeast cells, mammalian cells and the like. Compared with the existing high-activity transposase hyPBase, it is applicable to a broad spectrum of host cells and has high transposition activity in mammalian cells, particularly in human cells. It thereby provides a new lead and basis for the search for transposases, particularly transposases active in human cells.
The invention also provides the amino acid sequence and peptide fragment on which the novel high-activity transposase is based; a nucleotide sequence encoding the amino acid sequence, the peptide fragment and the high-activity transposase protein; a nucleic acid construct, a recombinant vector and a host cell based on the nucleotide sequence; and a gene transfer system and applications based on the peptide fragment, the protein, the nucleic acid construct, the recombinant vector and the host cell.
In the amino acid sequence of the existing high-activity transposase hyPBase (shown in SEQ ID NO: 1), the invention simultaneously mutates isoleucine at position 92 to asparagine, valine at position 119 to alanine, and glutamine at position 601 to arginine, giving the target mutant amino acid sequence shown in SEQ ID NO: 2. In CHO cells, the existing high-activity transposase hyPBase, after codon optimization and addition of a nuclear localization signal, has a transposition efficiency of 30.9%, whereas the target high-activity transposase bz-hyPBase generated from the amino acid sequence of SEQ ID NO: 2 has a transposition efficiency of 51.7%, an increase of nearly 21 percentage points. In PBMC cells, the corresponding efficiencies are 9.81% for hyPBase and 19.4% for bz-hyPBase, an increase of nearly 10 percentage points. This demonstrates that the target high-activity enzyme based on the mutated amino acid sequence of the invention has transposition activity superior to that of the existing high-activity transposase hyPBase, especially in mammalian cells and human-derived cells. The invention therefore provides a novel high-activity transposase that contains one or more amino acid sequences shown in SEQ ID NO: 2; it shows very high transposition activity in Escherichia coli, insect cells, yeast cells and mammalian cells, and in particular meets the need for high transposition activity in mammalian cells and human cells.
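The reported gains can be read either as absolute differences (percentage points) or as relative improvements over hyPBase. The following minimal sketch computes both from the four efficiencies given above; the function name `gains` is illustrative only.

```python
def gains(baseline_pct, new_pct):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = new_pct - baseline_pct
    relative = absolute / baseline_pct * 100.0
    return absolute, relative

# Transposition efficiencies reported above, in percent: (hyPBase, bz-hyPBase).
reported = {
    "CHO": (30.9, 51.7),
    "PBMC": (9.81, 19.4),
}

for cell_type, (hypbase, bz_hypbase) in reported.items():
    absolute, relative = gains(hypbase, bz_hypbase)
    print(f"{cell_type}: +{absolute:.2f} points, +{relative:.1f}% relative")
# CHO: +20.80 points, +67.3% relative
# PBMC: +9.59 points, +97.8% relative
```

Read this way, the "nearly 21" and "nearly 10" figures are absolute differences; in relative terms the efficiency rises by about two thirds in CHO cells and roughly doubles in PBMC cells.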
The amino acid sequence of hyPBase (SEQ ID NO: 1):
MGPAAKRVKLDGSSLDDEHILSALLQSDDELVGEDSDSEVSDHVSEDDVQSDTEEAFIDEVHEVQPTSSGSEILDEQNVIEQPGSSLASNRILTLPQRTIRGKNKHCWSTSKPTRRSRVSALNIVRSQRGPTRMCRNIYDPLLCFKLFFTDEIISEIVKWTNAEISLKRRESMTSATFRDTNEDEIYAFFGILVMTAVRKDNHMSTDDLFDRSLSMVYVSVMSRDRFDFLIRCLRMDDKSIRPTLRENDVFTPVRKIWDLFIHQCIQNYTPGAHLTIDEQLLGFRGRCPFRVYIPNKPSKYGIKILMMCDSGTKYMINGMPYLGRGTQTNGVPLGEYYVKELSKPVHGSCRNITCDNWFTSIPLAKNLLQEPYKLTIVGTVRSNKREIPEVLKNSRSRPVGTSMFCFDGPLTLVSYKPKPAKMVYLLSSCDEDASINESTGKPQMVMYYNQTKGGVDTLDQMCSVMTCSRKTNRWPMALLYGMINIACINSFIIYSHNVSSKGEKVQSRKKFMRNLYMGLTSSFMRKRLEAPTLKRYLRDNISNILPKEVPGTSDDSTEEPVMKKRTYCTYCPSKIRRKASASCKKCKKVICREHNIDMCQSCF*
Target mutant amino acid sequence (SEQ ID NO: 2):
MGPAAKRVKLDGSSLDDEHILSALLQSDDELVGEDSDSEVSDHVSEDDVQSDTEEAFIDEVHEVQPTSSGSEILDEQNVIEQPGSSLASNRNLTLPQRTIRGKNKHCWSTSKPTRRSRASALNIVRSQRGPTRMCRNIYDPLLCFKLFFTDEIISEIVKWTNAEISLKRRESMTSATFRDTNEDEIYAFFGILVMTAVRKDNHMSTDDLFDRSLSMVYVSVMSRDRFDFLIRCLRMDDKSIRPTLRENDVFTPVRKIWDLFIHQCIQNYTPGAHLTIDEQLLGFRGRCPFRVYIPNKPSKYGIKILMMCDSGTKYMINGMPYLGRGTQTNGVPLGEYYVKELSKPVHGSCRNITCDNWFTSIPLAKNLLQEPYKLTIVGTVRSNKREIPEVLKNSRSRPVGTSMFCFDGPLTLVSYKPKPAKMVYLLSSCDEDASINESTGKPQMVMYYNQTKGGVDTLDQMCSVMTCSRKTNRWPMALLYGMINIACINSFIIYSHNVSSKGEKVQSRKKFMRNLYMGLTSSFMRKRLEAPTLKRYLRDNISNILPKEVPGTSDDSTEEPVMKKRTYCTYCPSKIRRKASASCKKCKKVICREHNIDMCRSCF*
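As a consistency check on the two listings, the substitutions can be applied programmatically. The sketch below is illustrative only: it uses residues 1 to 120 of SEQ ID NO: 1 as listed above, so it covers the changes at positions 92 and 119; the change at position 601 would be applied the same way to the full-length sequence. The helper name `apply_substitutions` is hypothetical.

```python
def apply_substitutions(seq, substitutions):
    """Apply (original, position, replacement) substitutions; positions are 1-based."""
    residues = list(seq)
    for original, position, replacement in substitutions:
        found = residues[position - 1]
        if found != original:
            # Guard against numbering errors before changing anything.
            raise ValueError(f"expected {original} at {position}, found {found}")
        residues[position - 1] = replacement
    return "".join(residues)

# Residues 1-120 of SEQ ID NO: 1, copied from the listing above.
HYPBASE_1_120 = (
    "MGPAAKRVKLDGSSLDDEHILSALLQSDDELVGEDSDSEV"
    "SDHVSEDDVQSDTEEAFIDEVHEVQPTSSGSEILDEQNVI"
    "EQPGSSLASNRILTLPQRTIRGKNKHCWSTSKPTRRSRVS"
)

mutated = apply_substitutions(HYPBASE_1_120, [("I", 92, "N"), ("V", 119, "A")])
print(mutated[85:95])    # residues 86-95 after the change at position 92
print(mutated[110:120])  # residues 111-120 after the change at position 119
```

The original-residue check makes the function fail loudly if a position is miscounted, which is useful because the numbering here includes the residues added at the N-terminus.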
A mutated amino acid sequence obtained by applying the above amino acid mutations at positions 92, 119 and 601 of the amino acid sequence of the existing high-activity transposase hyPBase (shown in SEQ ID NO: 1), or at any two of these positions, where an enzyme formed from one or more such mutated amino acid sequences has transposition efficiency the same as or similar to that of the target high-activity transposase bz-hyPBase described in the embodiments of the invention or of the existing hyPBase, is also a mutated amino acid sequence of the novel high-activity transposase to be protected by the invention, and an enzyme formed from such a mutated amino acid sequence is likewise a novel high-activity transposase to be protected by the invention.
As described above, an amino acid sequence that is obtained by applying the above amino acid mutations at position 92, 119 or 601 of the amino acid sequence of the existing high-activity transposase hyPBase (shown in SEQ ID NO: 1), at any two of these positions, or at all three, and then deleting, substituting, inserting or adding one or more amino acids, and that still maintains or improves the enzymatic activity, is an alternative with the same or similar technical effect and falls within the protection scope of the invention. Such a sequence is also a mutated amino acid sequence of the novel high-activity transposase to be protected by the invention, and an enzyme formed from one or more such sequences is likewise a novel high-activity transposase to be protected by the invention.
As described above, the mutant amino acid sequence obtained by applying the above amino acid mutations at position 92, 119 or 601 of the amino acid sequence of the existing high-activity transposase hyPBase (shown in SEQ ID NO: 1), at any two of these positions, or at all three, may also contain the amino acid sequence of a functional protein that is added to the novel high-activity transposase to improve or extend its functions, such as the amino acid sequence of a nuclear localization signal, of EGFP green fluorescent protein, of a tag protein or of an antibody. These functional proteins can increase the transposition activity of the novel high-activity transposase (for example, a nuclear localization signal can help increase transposition activity); can enhance monitoring of transposition (for example, EGFP green fluorescent protein or a tag protein facilitates qualitative and/or quantitative monitoring of transposition activity); or can add new functions to the novel high-activity transposase (for example, an antibody may additionally confer immune activity).
The invention also protects a peptide fragment, that is, a chain compound formed by dehydration condensation of amino acids linked by peptide bonds, based on a mutant amino acid sequence obtained by applying the above amino acid mutations at position 92, 119 or 601 of the amino acid sequence of the existing high-activity transposase hyPBase (shown in SEQ ID NO: 1), at any two of these positions, or at all three, or based on a derived amino acid sequence that is obtained from the mutant amino acid sequence by deleting, substituting, inserting or adding one or more amino acids and that still maintains or improves the enzyme activity. The peptide fragment may contain one or more of the above mutant amino acid sequences or derived amino acid sequences. The peptide fragment may also be linked to a peptide fragment of a functional protein, formed from the amino acid sequence of the functional protein by dehydration condensation and peptide bonds, such as a nuclear localization signal peptide fragment, an EGFP green fluorescent protein peptide fragment, a tag protein peptide fragment or an antibody peptide fragment.
A protein formed from any of the following belongs to the novel high-activity transposase protected by the invention: a mutant amino acid sequence obtained by applying the amino acid mutations at positions 92, 119 and 601 of the amino acid sequence of the existing high-activity transposase hyPBase (shown in SEQ ID NO: 1); a peptide fragment formed from the mutant amino acid sequence; a derived amino acid sequence obtained from the mutant amino acid sequence by deleting, substituting, inserting or adding one or more amino acids while still maintaining or improving the enzyme activity; or a peptide fragment formed from the derived amino acid sequence. The novel high-activity transposase may contain one or more of these mutant amino acid sequences, derived amino acid sequences and peptide fragments.
The following are all mutant nucleotide sequences to be protected by the invention, encoding the novel high-activity transposase, its peptide fragment or its amino acid sequence, and each may be present in one or multiple copies: the mutant nucleotide sequence encoding the above novel high-activity transposase, peptide fragment or amino acid sequence; a nucleotide sequence that is complementary to, hybridizes with or overlaps the mutant nucleotide sequence; a nucleotide sequence that is derived from it by base substitution, deletion or addition and still encodes the novel high-activity transposase; and a nucleotide sequence having at least 80%, preferably at least 90%, more preferably at least 96% homology with the mutant nucleotide sequence. Specifically:
the nucleotide sequence encoding the amino acid sequence of the existing high-activity enzyme hyPBase (shown in SEQ ID NO: 1) is optimized for human codon usage to give the human codon-optimized nucleotide sequence (SEQ ID NO: 4), and the following base mutations are made in that sequence: base T at position 276 to C, base T at position 356 to C, base G at position 900 to A, and base A at position 1802 to G. This gives the mutant nucleotide sequence shown in SEQ ID NO: 3, which encodes the amino acid sequence (shown in SEQ ID NO: 2) of the novel high-activity transposase bz-hyPBase.
Nucleotide sequence of human codon optimized prior high activity enzyme hyPBase (SEQ ID NO: 4): atgggccctgctgccaagagggtcaagttggacggcagcagcctggacgacgagcacatcctgagcgccctgctgcagagcgacgacgagctggtgggcgaggacagcgacagcgaggtgagcgaccacgtgagcgaggacgacgtgcagagcgacaccgaggaggccttcatcgacgaggtgcacgaggtgcagcccaccagcagcggcagcgagatcctggacgagcagaacgtgatcgagcagcccggcagcagcctggccagcaaccgcaatctgaccctgccccagcgcaccatccgcggcaagaacaagcactgctggagcaccagcaagcccacccgccgcagccgcgtcagcgccctgaacatcgtgcgcagccagcgcggccccacccgcatgtgccgcaacatctacgaccccctgctgtgcttcaagctgttcttcaccgacgagatcatcagcgagatcgtgaagtggaccaacgccgagatcagcctgaagcgccgcgagagcatgaccagcgccaccttccgcgacaccaacgaggacgagatctacgccttcttcggcatcctggtgatgaccgccgtgcgcaaggacaaccacatgagcaccgacgacctgttcgaccgcagcctgagcatggtgtacgtgagcgtgatgagccgcgaccgcttcgacttcctgatccgctgcctgcgcatggacgacaagagcatccgccccaccctgcgcgagaacgacgtgttcacccccgtgcgcaagatctgggacctgttcatccaccagtgcatccagaactacacccccggcgcccacctgaccatcgacgagcagctgctgggcttccgcggccgctgccccttccgcgtgtacatccccaacaagcccagcaagtacggcatcaagatcctgatgatgtgcgacagcggcaccaagtacatgatcaacggcatgccctacctgggccgcggcacccagaccaacggcgtgcccctgggcgagtactacgtgaaggagctgagcaagcccgtgcacggcagctgccgcaacatcacctgcgacaactggttcaccagcatccccctggccaagaacctgctgcaggagccctacaagctgaccatcgtgggcaccgtgcgcagcaacaagcgcgagatccccgaggtgctgaagaacagccgcagccgccccgtgggcaccagcatgttctgcttcgacggccccctgaccctggtgagctacaagcccaagcccgccaagatggtgtacctgctgagcagctgcgacgaggacgccagcatcaacgagagcaccggcaagccccagatggtgatgtactacaaccagaccaagggcggcgtggacaccctggaccagatgtgcagcgtgatgacctgcagccgcaagaccaaccgctggcccatggccctgctgtacggcatgatcaacatcgcctgcatcaacagcttcatcatctacagccacaacgtgagcagcaagggcgagaaggtgcagagccgcaagaagttcatgcgcaacctgtacatgggcctgaccagcagcttcatgcgcaagcgcctggaggcccccaccctgaagcgctacctgcgcgacaacatcagcaacatcctgcccaaggaggtgcccggcaccagcgacgacagcaccgaggagcccgtgatgaagaagcgcacctactgcacctactgccccagcaagatccgccgcaaggccagcgccagctgcaagaagtgcaagaaggtgatctgccgcgagcacaacatcgacatgtgccagagctgcttctaa
Mutant nucleotide sequence (SEQ ID NO: 3):
atgggccctgctgccaagagggtcaagttggacggcagcagcctggacgacgagcacatcctgagcgccctgctgcagagcgacgacgagctggtgggcgaggacagcgacagcgaggtgagcgaccacgtgagcgaggacgacgtgcagagcgacaccgaggaggccttcatcgacgaggtgcacgaggtgcagcccaccagcagcggcagcgagatcctggacgagcagaacgtgatcgagcagcccggcagcagcctggccagcaaccgcaacctgaccctgccccagcgcaccatccgcggcaagaacaagcactgctggagcaccagcaagcccacccgccgcagccgcgccagcgccctgaacatcgtgcgcagccagcgcggccccacccgcatgtgccgcaacatctacgaccccctgctgtgcttcaagctgttcttcaccgacgagatcatcagcgagatcgtgaagtggaccaacgccgagatcagcctgaagcgccgcgagagcatgaccagcgccaccttccgcgacaccaacgaggacgagatctacgccttcttcggcatcctggtgatgaccgccgtgcgcaaggacaaccacatgagcaccgacgacctgttcgaccgcagcctgagcatggtgtacgtgagcgtgatgagccgcgaccgcttcgacttcctgatccgctgcctgcgcatggacgacaagagcatccgccccaccctgcgcgagaacgacgtgttcacccccgtgcgcaagatctgggacctgttcatccaccagtgcatccagaactacacccccggcgcccacctgaccatcgacgagcagctgctgggcttccgcggccgctgccccttccgcgtgtacatccccaacaagcccagcaaatacggcatcaagatcctgatgatgtgcgacagcggcaccaagtacatgatcaacggcatgccctacctgggccgcggcacccagaccaacggcgtgcccctgggcgagtactacgtgaaggagctgagcaagcccgtgcacggcagctgccgcaacatcacctgcgacaactggttcaccagcatccccctggccaagaacctgctgcaggagccctacaagctgaccatcgtgggcaccgtgcgcagcaacaagcgcgagatccccgaggtgctgaagaacagccgcagccgccccgtgggcaccagcatgttctgcttcgacggccccctgaccctggtgagctacaagcccaagcccgccaagatggtgtacctgctgagcagctgcgacgaggacgccagcatcaacgagagcaccggcaagccccagatggtgatgtactacaaccagaccaagggcggcgtggacaccctggaccagatgtgcagcgtgatgacctgcagccgcaagaccaaccgctggcccatggccctgctgtacggcatgatcaacatcgcctgcatcaacagcttcatcatctacagccacaacgtgagcagcaagggcgagaaggtgcagagccgcaagaagttcatgcgcaacctgtacatgggcctgaccagcagcttcatgcgcaagcgcctggaggcccccaccctgaagcgctacctgcgcgacaacatcagcaacatcctgcccaaggaggtgcccggcaccagcgacgacagcaccgaggagcccgtgatgaagaagcgcacctactgcacctactgccccagcaagatccgccgcaaggccagcgccagctgcaagaagtgcaagaaggtgatctgccgcgagcacaacatcgacatgtgccggagctgcttctaa
or a nucleotide sequence that is obtained from the mutant nucleotide sequence (shown in SEQ ID NO: 3) by base substitution, deletion or addition and that encodes the novel high-activity transposase bz-hyPBase;
or a nucleotide sequence that is complementary, according to the base complementary pairing principle, to the mutant nucleotide sequence (shown in SEQ ID NO: 3) or to a sequence derived from it by base substitution, deletion or addition that encodes the novel high-activity transposase bz-hyPBase;
or a nucleotide sequence that overlaps the mutant nucleotide sequence (shown in SEQ ID NO: 3) and encodes the novel high-activity transposase bz-hyPBase;
or a nucleotide sequence that hybridizes with the mutant nucleotide sequence (shown in SEQ ID NO: 3) and encodes the novel high-activity transposase bz-hyPBase;
or a nucleotide sequence that has more than 80% homology with the mutant nucleotide sequence (shown in SEQ ID NO: 3) and encodes the novel high-activity transposase bz-hyPBase, preferably 90% or more homology, and more preferably more than 96% homology;
all of these are mutant nucleotide sequences, to be protected by the invention, encoding the novel high-activity transposase bz-hyPBase, its peptide fragment or its amino acid sequence.
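To relate the base positions listed for SEQ ID NO: 4 to amino acid positions, each 1-based nucleotide position can be mapped to its codon. This is a minimal sketch assuming the open reading frame starts at base 1; the function name `locate_base` is illustrative.

```python
def locate_base(position):
    """Map a 1-based nucleotide position in an ORF to (codon number, base within codon)."""
    codon_number = (position - 1) // 3 + 1
    base_in_codon = (position - 1) % 3 + 1
    return codon_number, base_in_codon

# Base changes listed for SEQ ID NO: 4 -> SEQ ID NO: 3 (position: (old, new)).
base_changes = {276: ("t", "c"), 356: ("t", "c"), 900: ("g", "a"), 1802: ("a", "g")}

for position, (old, new) in base_changes.items():
    codon, base = locate_base(position)
    print(f"{position} {old}>{new}: codon {codon}, base {base} of 3")
# 276 t>c: codon 92, base 3 of 3
# 356 t>c: codon 119, base 2 of 3
# 900 g>a: codon 300, base 3 of 3
# 1802 a>g: codon 601, base 2 of 3
```

Positions 276, 356 and 1802 fall within codons 92, 119 and 601, the same positions as the three amino acid substitutions. Position 900 falls within codon 300, where the listings show aag changed to aaa; both are lysine codons in the standard genetic code, so that change is expected to leave the protein sequence unchanged.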
If the novel high-activity transposase of the invention is also linked to a functional protein, the mutant nucleotide sequence encoding the transposase also contains a nucleotide sequence encoding the functional protein, such as a nucleotide sequence encoding a nuclear localization signal, a nucleotide sequence expressing EGFP green fluorescent protein, a nucleotide sequence encoding a tag protein peptide fragment or a nucleotide sequence encoding an antibody.
The present invention also provides the above-mentioned nucleic acid polymerized from a mutant nucleotide sequence encoding the novel high-activity transposase of the present invention, or a peptide fragment thereof, or an amino acid sequence thereof. When the novel high activity transposase of the present invention is linked to a functional protein, the nucleic acid also contains a nucleotide sequence encoding a functional protein (nuclear localization signal, EGFP green fluorescent protein, tag protein or antibody).
The present invention also provides a nucleic acid construct comprising one or more operably linked control sequences that direct the expression of a target sequence in a host cell, where expression of the coding sequence covers any step involved in producing the protein or polypeptide, including but not limited to transcription, post-transcriptional modification, translation, post-translational modification and secretion. The nucleic acid construct further comprises the above-mentioned mutant nucleotide sequence encoding the novel high-activity transposase of the present invention, its peptide fragment or its amino acid sequence, or a nucleic acid obtained by polymerizing the mutant nucleotide sequence.
The present invention also provides a recombinant vector comprising the above-mentioned mutant nucleotide sequence encoding the novel high-activity transposase of the present invention, or a peptide fragment thereof, or an amino acid sequence thereof, or a nucleic acid obtained by polymerizing the mutant nucleotide sequence, or the above-mentioned nucleic acid construct. The recombinant vector comprises a recombinant cloning vector, a recombinant eukaryotic expression vector or a recombinant virus vector, wherein the recombinant cloning vector comprises a pRS vector, a T vector or a pUC vector and the like, the recombinant eukaryotic expression vector comprises pEGFP, pCMVp-NEO-BAN or pSV2 and the like, and the recombinant virus vector comprises a recombinant adenovirus vector or a lentivirus vector and the like.
The present invention also provides a host cell comprising the above-mentioned mutant nucleotide sequence encoding the novel high-activity transposase of the present invention, or a peptide fragment thereof, or an amino acid sequence thereof, or a nucleic acid polymerized from the mutant nucleotide sequence, or the above-mentioned nucleic acid construct, or the above-mentioned recombinant vector. The host cell includes Escherichia coli cell, insect cell, yeast cell or mammal cell, etc.
The novel high-activity transposase provided by the invention, which improves the transposition activity of the transposon in a transposition system, or the peptide fragment forming it, or the nucleic acid construct or recombinant vector encoding it, or a host cell (Escherichia coli cell, insect cell, yeast cell, mammalian cell, etc.) containing the novel high-activity transposase and/or the nucleic acid construct and/or the recombinant vector encoding it, can be used to integrate exogenous genes into the host cell genome in a site-directed, stable and efficient manner and to achieve long-term, stable expression without affecting stable expression of the host's own genes. It can be used to construct a new gene transfer system; to prepare, or serve as, a drug or preparation for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation; and to prepare, or serve as, a tool for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation.
A gene transfer system comprising the novel high-activity transposase of the present invention, or a nucleic acid construct encoding the novel high-activity transposase, or a recombinant vector encoding the novel high-activity transposase, or a host cell comprising the novel high-activity transposase and/or a nucleic acid construct encoding the novel high-activity transposase and/or a recombinant vector encoding the novel high-activity transposase.
In the gene transfer system, a transposon gene is also contained, and a nucleic acid or a nucleic acid construct encoding a novel high-activity transposase is integrated with the transposon gene; or nucleic acids or nucleic acid constructs encoding the novel highly active transposase are independent of the transposon gene; or the nucleic acid or nucleic acid construct encoding the novel high activity transposase is located on the same recombinant vector as the transposon gene; or the nucleic acid or nucleic acid construct encoding the novel high activity transposase is located on a different recombinant vector from the transposon gene; or the transposon gene is integrated into a nucleic acid construct encoding a novel highly active transposase; or the transposon gene is integrated into a recombinant vector encoding a new transposase with high activity; or the transposon gene is independent of a recombinant vector encoding a new transposase with high activity; or the transposon gene is transferred into a host cell containing a new high-activity transposase and/or a nucleic acid construct encoding the new high-activity transposase and/or a recombinant vector encoding the new high-activity transposase; or the transposon gene is located outside the host cell containing the novel high activity transposase and/or the nucleic acid construct encoding the novel high activity transposase and/or the recombinant vector encoding the novel high activity transposase.
A drug and/or a preparation for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation, which comprises the novel high-activity transposase of the present invention, or a nucleic acid construct encoding the novel high-activity transposase, or a recombinant vector encoding the novel high-activity transposase, or a host cell comprising the novel high-activity transposase and/or a nucleic acid construct encoding the novel high-activity transposase and/or a recombinant vector encoding the novel high-activity transposase, or a gene transfer system as described above.
The drug for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation also contains pharmaceutically acceptable excipients, can be prepared in any pharmaceutically feasible dosage form, and can be supplemented with auxiliary therapeutic components.
A means for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation, comprising the novel high-activity transposase of the present invention, or a nucleic acid construct encoding the novel high-activity transposase, or a recombinant vector encoding the novel high-activity transposase, or a host cell comprising the novel high-activity transposase and/or a nucleic acid construct encoding the novel high-activity transposase and/or a recombinant vector encoding the novel high-activity transposase, or the above-mentioned gene transfer system.
Detailed Description
The invention is described more clearly below with reference to the drawings and specific embodiments, which are intended to illustrate the invention and not to limit it. Unless otherwise specified, the experimental methods and conditions in the examples are conventional; reagents and the like are used according to the manufacturer's instructions unless otherwise noted.
EXAMPLE 1 obtaining of highly active bz-hyPBase mutant
Based on the original sequence (shown in SEQ ID NO: 1) of the existing high-activity piggyBac transposase (hyPBase for short), the sequence of the piggyBac transposase to be protected (bz-hyPBase for short) was obtained through the following changes:
(1) based on human codon usage preference, the existing high-activity piggyBac transposase was codon-optimized to give the nucleotide sequence shown in SEQ ID NO: 4, in order to raise the expression level of the transposase;
(2) a human c-myc nuclear localization signal was added after the initiation codon, in order to improve the integration efficiency of the exogenous gene in the host cell;
(3) the nucleotide sequence shown in SEQ ID NO: 4 was randomly mutated by the following method to obtain a mutant whose transposition efficiency is clearly superior to that of the existing high-activity piggyBac transposase; the mutant is named bz-hyPBase (amino acid sequence shown in SEQ ID NO: 2, nucleotide sequence shown in SEQ ID NO: 3). The method is as follows:
a. construction of screening reporter vectors
The G418 resistance gene is inserted between the 5' IR and 3' IR of the transposon element by gene synthesis to form the transposon G418-IR. After PCR, the transposon is inserted by recombination at a TTAA site in the URA3 gene, and the transposase with an inducible promoter is inserted into the multiple cloning site of PRS316, finally giving the screening reporter vector PRS316-URA-PBase. The specific operation is as follows:
(1) PCR was performed on the template PRS316 using primers pURA-F (SEQ ID NO: 5: aagccgctaaaggcattatccgcc) and pURA-R (SEQ ID NO: 6: aactgtgccctccatggaaaaatcagtc) to give linearized fragment 1 of plasmid PRS316.
(2) PCR was performed on the synthetic transposon G418-IR using primers pURA-IR-F (SEQ ID NO: 7) and pURA-IR-R (SEQ ID NO: 8) to give linearized fragment 2, the transposon carrying sequences homologous to PRS316.
pURA-IR-F(SEQ ID NO:7):
gactgatttttccatggagggcacagttaaccctagaaagatagtctgcgtaaaattgacgcatgcgac
pURA-IR-R(SEQ ID NO:8):
ggcggataatgcctttagcggcttaaccctagaaagataatcatattgtg
(3) Fragment 1 and fragment 2 were ligated using NEBuilder homologous recombinase to construct plasmid PRS316-URA.
(4) The PB transposase gene with the GALS inducible promoter was synthesized and cloned into the vector PRS316-URA using SacI and EcoRI, resulting in the plasmid PRS316-URA-PBase. The PRS316-URA-PBase vector map is shown in FIG. 1.
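The homology between fragment 1 and fragment 2 in steps (1) to (3) can be checked directly from the four primer sequences: the 5' end of each transposon primer should be the reverse complement of the vector primer at the same junction. The sketch below is an illustrative check only; the variable and function names are not from the source.

```python
def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA sequence."""
    complement = {"a": "t", "t": "a", "g": "c", "c": "g"}
    return "".join(complement[base] for base in reversed(seq.lower()))

# Primer sequences as given in SEQ ID NO: 5 to 8.
pura_f = "aagccgctaaaggcattatccgcc"
pura_r = "aactgtgccctccatggaaaaatcagtc"
pura_ir_f = "gactgatttttccatggagggcacagttaaccctagaaagatagtctgcgtaaaattgacgcatgcgac"
pura_ir_r = "ggcggataatgcctttagcggcttaaccctagaaagataatcatattgtg"

# Each vector primer defines one end of linearized fragment 1; the matching
# transposon primer should start with the reverse complement of that end.
arm_f = reverse_complement(pura_r)
arm_r = reverse_complement(pura_f)

print(pura_ir_f.startswith(arm_f), len(arm_f))  # True 28
print(pura_ir_r.startswith(arm_r), len(arm_r))  # True 24

# The four bases spanning each junction (last two of the arm, first two after it).
print(pura_ir_f[len(arm_f) - 2:len(arm_f) + 2])  # ttaa
print(pura_ir_r[len(arm_r) - 2:len(arm_r) + 2])  # ttaa
```

Both transposon primers therefore carry a vector-homologous arm (28 and 24 bases) followed by the transposon end sequence, and each junction reads ttaa, consistent with insertion of the transposon at a TTAA site in URA3.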
b. Construction of mutant pools
PCR primers were designed outside the transposase open reading frame (ORF): GR-F (SEQ ID NO: 9: taatcagcgaagcgatga) and GR-R (SEQ ID NO: 10: cagcatgcctgctattgtcttcc), so that the amplified fragment carries homologous sequences of about 50 bp flanking each end of the transposase ORF on the PRS316-URA-PBase vector. The transposase was mutated using a Clontech error-prone PCR kit, and the number of mutations can be accumulated by using recovered PCR fragments as templates for further rounds (see the upper flow chart in FIG. 2), finally giving transposase fragments containing point mutations. The screening reporter vector PRS316-URA-PBase was linearized with XbaI and EcoRI to remove the original, unmutated transposase. The transposase fragment recovered from PCR and the linearized vector were transformed into a ura-deficient yeast strain at a molar ratio of 10:1 (see the lower flow chart in FIG. 2, and FIG. 3). Yeast uses its own homologous recombination repair mechanism to insert the exogenous target fragment into the gapped plasmid via the homology arms, so that the complete plasmid carrying the target fragment is assembled in the yeast cell. This method achieves one-step cloning of the DNA fragment into the yeast strain and reduces the high-frequency duplication of mutants that occurs when plasmids constructed and amplified in Escherichia coli are transferred into yeast. The clones obtained on the plate after transformation are mutants, and a mutant library of the desired size can be obtained by picking single clones.
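A 10:1 molar ratio of fragment to vector translates into DNA masses through the ratio of their lengths. The following sketch shows that conversion for double-stranded DNA; the vector mass and both lengths are hypothetical placeholders, since the source does not state them.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio=10.0):
    """Mass of insert (ng) that gives insert:vector = molar_ratio for double-stranded DNA.

    Moles are proportional to mass / length, so the required insert mass
    scales with the insert-to-vector length ratio.
    """
    return vector_ng * (insert_bp / vector_bp) * molar_ratio

# Hypothetical example values (not from the source): 100 ng of a 7000 bp
# linearized vector and a 1900 bp mutagenized transposase fragment.
needed = insert_mass_ng(vector_ng=100.0, vector_bp=7000, insert_bp=1900)
print(f"{needed:.1f} ng of fragment for a 10:1 molar ratio")
```

With these placeholder numbers the fragment mass comes to roughly 271 ng; the actual amounts depend on the measured lengths and concentrations of the fragment and the linearized vector.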
c. Process for screening high-efficiency transposase
As shown in FIG. 3, screening is carried out in two rounds. The first round screens all mutants broadly to find those whose transposition efficiency is clearly higher than that of the unmutated control. The second round is carried out on the yeast obtained from the first round: the exact transposition efficiency is calculated to identify the mutant with increased transposition efficiency in yeast, namely bz-hyPBase (amino acid sequence SEQ ID NO: 2, nucleotide sequence SEQ ID NO: 3).
First screening: single clones were picked from the transformed mutant library into 96-well plates containing YPD medium with G418 antibiotic for activation. After 24 hours of activation, they were transferred with a replicator and inoculated into YPD medium containing 2% galactose for induction. After 24 hours of induction, the culture was diluted to 10^-2 or 10^-3 (depending on yeast growth), and 10 μl was spotted onto ura-deficient solid medium. After 48 hours of culture, the growth of the mutants was observed and compared with the unmutated clone; clones with clearly improved transposition efficiency were selected for the second screening.
Second screening: the candidate mutants obtained from the first screening were activated for 24 hours, after which the OD600 values were adjusted to be equal. They were inoculated at 1:100 into YPD medium containing 2% galactose and induced for 24 hours, and the OD600 values were again adjusted to be equal. The cultures were serially diluted to 10^-2, 10^-3 and 10^-4. 20 μl of the 10^-2 and 10^-3 dilutions was spread on ura-deficient solid medium and cultured for 24 hours, and the clones were counted; clones that grow on ura-deficient solid medium are those in which transposition has occurred. At the same time, 20 μl of the 10^-3 and 10^-4 dilutions was spread on YPD complete solid medium as a comparison control; the clones that grow give the total yeast count. Transposition efficiency = number of transposed clones / total number of clones = (number of clones on ura-deficient medium × dilution factor) / (number of clones on YPD medium × dilution factor) × 100%. This method allows high-throughput screening: one person can screen 96 to 960 mutants in a single run, which greatly increases the probability of finding a high-activity transposase.
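The efficiency formula above can be written as a short calculation. This is a minimal sketch assuming equal volumes (20 μl) are plated on both media, so the volume cancels; the colony counts are hypothetical and only illustrate the arithmetic.

```python
def transposition_efficiency(ura_colonies, ura_dilution_factor,
                             ypd_colonies, ypd_dilution_factor):
    """Transposition efficiency in percent.

    Dilution factors are fold dilutions, e.g. 100 for a 10^-2 dilution.
    Assumes the same volume was plated on both media.
    """
    transposed = ura_colonies * ura_dilution_factor
    total = ypd_colonies * ypd_dilution_factor
    if total == 0:
        raise ValueError("no colonies on the YPD control plate")
    return transposed / total * 100.0

# Hypothetical counts: 150 colonies on ura-deficient medium from the 10^-2
# dilution, 200 colonies on YPD medium from the 10^-4 dilution.
efficiency = transposition_efficiency(150, 100, 200, 10_000)
print(f"{efficiency:.2f}%")  # 0.75%
```

Scaling each count by its own dilution factor is what allows plates from different dilutions to be compared on the same basis.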
From this calculation the exact transposition efficiency of each mutant is obtained, and strains with increased transposition efficiency are selected for mutation site analysis. Yeast from the initially activated 96-well plate is inoculated for expansion culture, the yeast plasmid is extracted and sent to a sequencing company for analysis, and the mutation sites are identified by comparison with the original sequence.
The amino acid sequence of hyPBase (SEQ ID NO: 1):
MGPAAKRVKLDGSSLDDEHILSALLQSDDELVGEDSDSEVSDHVSEDDVQSDTEEAFIDEVHEVQPTSSGSEILDEQNVIEQPGSSLASNRILTLPQRTIRGKNKHCWSTSKPTRRSRVSALNIVRSQRGPTRMCRNIYDPLLCFKLFFTDEIISEIVKWTNAEISLKRRESMTSATFRDTNEDEIYAFFGILVMTAVRKDNHMSTDDLFDRSLSMVYVSVMSRDRFDFLIRCLRMDDKSIRPTLRENDVFTPVRKIWDLFIHQCIQNYTPGAHLTIDEQLLGFRGRCPFRVYIPNKPSKYGIKILMMCDSGTKYMINGMPYLGRGTQTNGVPLGEYYVKELSKPVHGSCRNITCDNWFTSIPLAKNLLQEPYKLTIVGTVRSNKREIPEVLKNSRSRPVGTSMFCFDGPLTLVSYKPKPAKMVYLLSSCDEDASINESTGKPQMVMYYNQTKGGVDTLDQMCSVMTCSRKTNRWPMALLYGMINIACINSFIIYSHNVSSKGEKVQSRKKFMRNLYMGLTSSFMRKRLEAPTLKRYLRDNISNILPKEVPGTSDDSTEEPVMKKRTYCTYCPSKIRRKASASCKKCKKVICREHNIDMCQSCF*
The amino acid sequence of bz-hyPBase (SEQ ID NO: 2):
MGPAAKRVKLDGSSLDDEHILSALLQSDDELVGEDSDSEVSDHVSEDDVQSDTEEAFIDEVHEVQPTSSGSEILDEQNVIEQPGSSLASNRNLTLPQRTIRGKNKHCWSTSKPTRRSRASALNIVRSQRGPTRMCRNIYDPLLCFKLFFTDEIISEIVKWTNAEISLKRRESMTSATFRDTNEDEIYAFFGILVMTAVRKDNHMSTDDLFDRSLSMVYVSVMSRDRFDFLIRCLRMDDKSIRPTLRENDVFTPVRKIWDLFIHQCIQNYTPGAHLTIDEQLLGFRGRCPFRVYIPNKPSKYGIKILMMCDSGTKYMINGMPYLGRGTQTNGVPLGEYYVKELSKPVHGSCRNITCDNWFTSIPLAKNLLQEPYKLTIVGTVRSNKREIPEVLKNSRSRPVGTSMFCFDGPLTLVSYKPKPAKMVYLLSSCDEDASINESTGKPQMVMYYNQTKGGVDTLDQMCSVMTCSRKTNRWPMALLYGMINIACINSFIIYSHNVSSKGEKVQSRKKFMRNLYMGLTSSFMRKRLEAPTLKRYLRDNISNILPKEVPGTSDDSTEEPVMKKRTYCTYCPSKIRRKASASCKKCKKVICREHNIDMCRSCF*
In the amino acid sequence of the existing high-activity transposase hyPBase (shown in SEQ ID NO: 1), isoleucine at position 92 is mutated to asparagine, valine at position 119 is mutated to alanine, and glutamine at position 601 is mutated to arginine, giving the amino acid sequence of bz-hyPBase shown in SEQ ID NO: 2.
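The comparison of a mutant sequence with the original sequence can be illustrated with a short Python sketch. The helper name `list_substitutions` is hypothetical, and only a short window (residues 85 to 96 of SEQ ID NO: 1 and SEQ ID NO: 2) is used here to keep the example compact; the same function could be applied to the full-length sequences with `start=1`.

```python
def list_substitutions(reference, variant, start=1):
    """List substitutions between two equal-length protein sequences.

    Each substitution is reported as <original residue><position><new residue>,
    with positions counted from `start` (the position of the first residue).
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    return [
        f"{ref_res}{pos}{var_res}"
        for pos, (ref_res, var_res) in enumerate(zip(reference, variant), start=start)
        if ref_res != var_res
    ]


# Residues 85-96 copied from SEQ ID NO: 1 (hyPBase) and SEQ ID NO: 2 (bz-hyPBase)
hypbase_window = "SSLASNRILTLP"
bz_hypbase_window = "SSLASNRNLTLP"
print(list_substitutions(hypbase_window, bz_hypbase_window, start=85))  # ['I92N']
```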
Nucleotide sequence of human codon optimized hyPBase (SEQ ID NO: 4):
atgggccctgctgccaagagggtcaagttggacggcagcagcctggacgacgagcacatcctgagcgccctgctgcagagcgacgacgagctggtgggcgaggacagcgacagcgaggtgagcgaccacgtgagcgaggacgacgtgcagagcgacaccgaggaggccttcatcgacgaggtgcacgaggtgcagcccaccagcagcggcagcgagatcctggacgagcagaacgtgatcgagcagcccggcagcagcctggccagcaaccgcaatctgaccctgccccagcgcaccatccgcggcaagaacaagcactgctggagcaccagcaagcccacccgccgcagccgcgtcagcgccctgaacatcgtgcgcagccagcgcggccccacccgcatgtgccgcaacatctacgaccccctgctgtgcttcaagctgttcttcaccgacgagatcatcagcgagatcgtgaagtggaccaacgccgagatcagcctgaagcgccgcgagagcatgaccagcgccaccttccgcgacaccaacgaggacgagatctacgccttcttcggcatcctggtgatgaccgccgtgcgcaaggacaaccacatgagcaccgacgacctgttcgaccgcagcctgagcatggtgtacgtgagcgtgatgagccgcgaccgcttcgacttcctgatccgctgcctgcgcatggacgacaagagcatccgccccaccctgcgcgagaacgacgtgttcacccccgtgcgcaagatctgggacctgttcatccaccagtgcatccagaactacacccccggcgcccacctgaccatcgacgagcagctgctgggcttccgcggccgctgccccttccgcgtgtacatccccaacaagcccagcaagtacggcatcaagatcctgatgatgtgcgacagcggcaccaagtacatgatcaacggcatgccctacctgggccgcggcacccagaccaacggcgtgcccctgggcgagtactacgtgaaggagctgagcaagcccgtgcacggcagctgccgcaacatcacctgcgacaactggttcaccagcatccccctggccaagaacctgctgcaggagccctacaagctgaccatcgtgggcaccgtgcgcagcaacaagcgcgagatccccgaggtgctgaagaacagccgcagccgccccgtgggcaccagcatgttctgcttcgacggccccctgaccctggtgagctacaagcccaagcccgccaagatggtgtacctgctgagcagctgcgacgaggacgccagcatcaacgagagcaccggcaagccccagatggtgatgtactacaaccagaccaagggcggcgtggacaccctggaccagatgtgcagcgtgatgacctgcagccgcaagaccaaccgctggcccatggccctgctgtacggcatgatcaacatcgcctgcatcaacagcttcatcatctacagccacaacgtgagcagcaagggcgagaaggtgcagagccgcaagaagttcatgcgcaacctgtacatgggcctgaccagcagcttcatgcgcaagcgcctggaggcccccaccctgaagcgctacctgcgcgacaacatcagcaacatcctgcccaaggaggtgcccggcaccagcgacgacagcaccgaggagcccgtgatgaagaagcgcacctactgcacctactgccccagcaagatccgccgcaaggccagcgccagctgcaagaagtgcaagaaggtgatctgccgcgagcacaacatcgacatgtgccagagctgcttctaa
The nucleotide sequence of bz-hyPBase (SEQ ID NO: 3):
atgggccctgctgccaagagggtcaagttggacggcagcagcctggacgacgagcacatcctgagcgccctgctgcagagcgacgacgagctggtgggcgaggacagcgacagcgaggtgagcgaccacgtgagcgaggacgacgtgcagagcgacaccgaggaggccttcatcgacgaggtgcacgaggtgcagcccaccagcagcggcagcgagatcctggacgagcagaacgtgatcgagcagcccggcagcagcctggccagcaaccgcaacctgaccctgccccagcgcaccatccgcggcaagaacaagcactgctggagcaccagcaagcccacccgccgcagccgcgccagcgccctgaacatcgtgcgcagccagcgcggccccacccgcatgtgccgcaacatctacgaccccctgctgtgcttcaagctgttcttcaccgacgagatcatcagcgagatcgtgaagtggaccaacgccgagatcagcctgaagcgccgcgagagcatgaccagcgccaccttccgcgacaccaacgaggacgagatctacgccttcttcggcatcctggtgatgaccgccgtgcgcaaggacaaccacatgagcaccgacgacctgttcgaccgcagcctgagcatggtgtacgtgagcgtgatgagccgcgaccgcttcgacttcctgatccgctgcctgcgcatggacgacaagagcatccgccccaccctgcgcgagaacgacgtgttcacccccgtgcgcaagatctgggacctgttcatccaccagtgcatccagaactacacccccggcgcccacctgaccatcgacgagcagctgctgggcttccgcggccgctgccccttccgcgtgtacatccccaacaagcccagcaaatacggcatcaagatcctgatgatgtgcgacagcggcaccaagtacatgatcaacggcatgccctacctgggccgcggcacccagaccaacggcgtgcccctgggcgagtactacgtgaaggagctgagcaagcccgtgcacggcagctgccgcaacatcacctgcgacaactggttcaccagcatccccctggccaagaacctgctgcaggagccctacaagctgaccatcgtgggcaccgtgcgcagcaacaagcgcgagatccccgaggtgctgaagaacagccgcagccgccccgtgggcaccagcatgttctgcttcgacggccccctgaccctggtgagctacaagcccaagcccgccaagatggtgtacctgctgagcagctgcgacgaggacgccagcatcaacgagagcaccggcaagccccagatggtgatgtactacaaccagaccaagggcggcgtggacaccctggaccagatgtgcagcgtgatgacctgcagccgcaagaccaaccgctggcccatggccctgctgtacggcatgatcaacatcgcctgcatcaacagcttcatcatctacagccacaacgtgagcagcaagggcgagaaggtgcagagccgcaagaagttcatgcgcaacctgtacatgggcctgaccagcagcttcatgcgcaagcgcctggaggcccccaccctgaagcgctacctgcgcgacaacatcagcaacatcctgcccaaggaggtgcccggcaccagcgacgacagcaccgaggagcccgtgatgaagaagcgcacctactgcacctactgccccagcaagatccgccgcaaggccagcgccagctgcaagaagtgcaagaaggtgatctgccgcgagcacaacatcgacatgtgccggagctgcttctaa
The nucleotide sequence of the existing high-activity transposase hyPBase was optimized for human codon usage to give the human codon-optimized nucleotide sequence (SEQ ID NO: 4), and the following nucleotide mutations were made in that sequence: base T at position 276 was mutated to C, base T at position 356 was mutated to C, base G at position 900 was mutated to A, and base A at position 1802 was mutated to G. This gives the mutant nucleotide sequence shown in SEQ ID NO: 3, which encodes the novel high-activity transposase bz-hyPBase.
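How a listed point mutation maps onto the sequence can be sketched in Python. The function `apply_point_mutations` is a hypothetical helper written for this illustration; it uses 1-based positions in the numbering of the full-length sequence and checks that the original base matches before changing it. Only bases 351 to 360 of SEQ ID NO: 4 are used here.

```python
def apply_point_mutations(seq, mutations, start=1):
    """Apply point mutations given as (position, original_base, new_base).

    Positions are 1-based in the numbering of the full-length sequence;
    `start` is the position of the first base of `seq`.
    """
    bases = list(seq.lower())
    for position, original, new in mutations:
        index = position - start
        if not 0 <= index < len(bases):
            raise IndexError(f"position {position} is outside the sequence")
        if bases[index] != original.lower():
            raise ValueError(
                f"expected {original} at position {position}, found {bases[index]}"
            )
        bases[index] = new.lower()
    return "".join(bases)


# Bases 351-360 copied from SEQ ID NO: 4 (human codon-optimized hyPBase)
fragment = "ccgcgtcagc"
mutated = apply_point_mutations(fragment, [(356, "t", "c")], start=351)
print(mutated)  # ccgcgccagc, matching bases 351-360 of SEQ ID NO: 3
```

In this window the change at position 356 converts the codon at bases 355 to 357 from gtc (valine) to gcc (alanine), which corresponds to the substitution at amino acid position 119.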
Example 2 higher transposition efficiency of bz-hyPBase in Yeast
A transposon carrying a G418 resistance gene was inserted into the URA3 gene of the yeast plasmid PRS316 to disrupt expression of the URA3 gene, and a transposase under the control of an inducible promoter was cloned into the same PRS316 backbone to generate the plasmid PRS316-URA-Pbase. Plasmids carrying the different transposases WT PBase, hyPBase, optimized hyPBase and bz-hyPBase were prepared in parallel. Each plasmid was transformed into the ura-deficient Saccharomyces cerevisiae strain BJ2168, which cannot grow on ura-deficient medium. When induced with galactose, the transposase is expressed and drives transposition of the transposon; once transposition has occurred, the URA3 gene is expressed normally, and the clones in which transposition has occurred regain normal growth on ura-deficient medium. The transposition efficiency of a transposase in Saccharomyces cerevisiae can therefore be calculated by counting the number of clones in which transposition has occurred within a given number of yeast cells. Using this method, the transposition efficiencies of the wild-type piggyBac transposase WT PBase, the existing high-activity piggyBac transposase hyPBase, the codon-optimized transposase with an added nuclear localization signal (optimized hyPBase) and bz-hyPBase were compared. The experimental result in FIG. 4 shows that the transposition efficiency of bz-hyPBase is 3 times higher than that of hyPBase, demonstrating that bz-hyPBase has a higher transposition efficiency in yeast.
WT PBase denotes the plasmid carrying the mammalian codon-optimized piggyBac transposase; hyPBase denotes the plasmid carrying the existing high-activity piggyBac transposase (obtained by mutating 7 amino acid sites of WT PBase, as described in the Background Art); optimized hyPBase denotes the plasmid carrying the transposase obtained by human codon optimization of the existing high-activity piggyBac transposase and addition of a nuclear localization signal; and bz-hyPBase denotes the plasmid carrying the new high-activity transposase screened in the present invention (that is, the transposase obtained by mutating three amino acid sites of optimized hyPBase, as described in the examples of the invention).
Example 3 higher Gene editing efficiency of bz-hyPBase in CHO cells
We cloned optimized hyPBase and bz-hyPBase into mammalian cell expression vectors to generate the plasmids ploxP-optimized hyPBase (same structure as in FIG. 5, except that the transposase bz-hyPBase in FIG. 5 is replaced with optimized hyPBase) and ploxP-bz-HyPB (FIG. 5), so that they express the transposase. A human c-myc nuclear localization signal is connected downstream of the promoter of optimized hyPBase and of bz-hyPBase. The transposon carrying the EGFP gene was cloned into the vector pSAD-EGFP (FIG. 6) to express green fluorescent protein. The transposase-expressing plasmid and the transposon plasmid were co-electroporated into CHO cells. Under the action of the transposase, the transposon carrying EGFP is inserted into the genome, so that the cells stably express green fluorescent protein. After two passages, the cells expressing green fluorescent protein were counted by flow cytometry on days 7 and 14; the more cells that express the fluorescent protein, the higher the transposition efficiency of the transposase. The statistical results in FIG. 7 show that the transposition activity of bz-hyPBase was significantly superior to that of optimized hyPBase.
Example 4 higher Gene editing efficiency of bz-hyPBase in T cells
We electroporated the ploxP-optimized hyPBase and ploxP-bz-HyPB plasmids of Example 3 into peripheral blood mononuclear cells (PBMCs) for T cell genome editing. Under the action of the transposase, the T cell genome is edited by the transposon carrying the EGFP green fluorescent protein gene, and the editing efficiency in T cells reflects the activity of the transposase. We performed multiple sets of experiments using PBMCs from 3 different healthy human donors and measured the gene editing efficiency by flow cytometry on day 5; a higher EGFP-positive rate represents a higher transposase activity. As shown in FIG. 8, the transposition activity of bz-hyPBase was superior to that of optimized hyPBase in PBMCs from the different donors.
Finally, it should be noted that the above embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the protection scope of the present invention, and although the present invention is described in detail with reference to the foregoing embodiments, it should be understood by those skilled in the art that several modifications can be made without departing from the principle of the present invention, and these modifications should also be regarded as the protection scope of the present invention.
Sequence listing
<110> Shanghai cell therapy institute
SHANGHAI CELL THERAPY GROUP Co.,Ltd.
<120> a high-activity transposase and use thereof
<130> 199908Z1
<150> CN 201911227263.5
<151> 2019-12-04
<160> 10
<170> SIPOSequenceListing 1.0
<210> 1
<211> 604
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 1
Met Gly Pro Ala Ala Lys Arg Val Lys Leu Asp Gly Ser Ser Leu Asp
1 5 10 15
Asp Glu His Ile Leu Ser Ala Leu Leu Gln Ser Asp Asp Glu Leu Val
20 25 30
Gly Glu Asp Ser Asp Ser Glu Val Ser Asp His Val Ser Glu Asp Asp
35 40 45
Val Gln Ser Asp Thr Glu Glu Ala Phe Ile Asp Glu Val His Glu Val
50 55 60
Gln Pro Thr Ser Ser Gly Ser Glu Ile Leu Asp Glu Gln Asn Val Ile
65 70 75 80
Glu Gln Pro Gly Ser Ser Leu Ala Ser Asn Arg Ile Leu Thr Leu Pro
85 90 95
Gln Arg Thr Ile Arg Gly Lys Asn Lys His Cys Trp Ser Thr Ser Lys
100 105 110
Pro Thr Arg Arg Ser Arg Val Ser Ala Leu Asn Ile Val Arg Ser Gln
115 120 125
Arg Gly Pro Thr Arg Met Cys Arg Asn Ile Tyr Asp Pro Leu Leu Cys
130 135 140
Phe Lys Leu Phe Phe Thr Asp Glu Ile Ile Ser Glu Ile Val Lys Trp
145 150 155 160
Thr Asn Ala Glu Ile Ser Leu Lys Arg Arg Glu Ser Met Thr Ser Ala
165 170 175
Thr Phe Arg Asp Thr Asn Glu Asp Glu Ile Tyr Ala Phe Phe Gly Ile
180 185 190
Leu Val Met Thr Ala Val Arg Lys Asp Asn His Met Ser Thr Asp Asp
195 200 205
Leu Phe Asp Arg Ser Leu Ser Met Val Tyr Val Ser Val Met Ser Arg
210 215 220
Asp Arg Phe Asp Phe Leu Ile Arg Cys Leu Arg Met Asp Asp Lys Ser
225 230 235 240
Ile Arg Pro Thr Leu Arg Glu Asn Asp Val Phe Thr Pro Val Arg Lys
245 250 255
Ile Trp Asp Leu Phe Ile His Gln Cys Ile Gln Asn Tyr Thr Pro Gly
260 265 270
Ala His Leu Thr Ile Asp Glu Gln Leu Leu Gly Phe Arg Gly Arg Cys
275 280 285
Pro Phe Arg Val Tyr Ile Pro Asn Lys Pro Ser Lys Tyr Gly Ile Lys
290 295 300
Ile Leu Met Met Cys Asp Ser Gly Thr Lys Tyr Met Ile Asn Gly Met
305 310 315 320
Pro Tyr Leu Gly Arg Gly Thr Gln Thr Asn Gly Val Pro Leu Gly Glu
325 330 335
Tyr Tyr Val Lys Glu Leu Ser Lys Pro Val His Gly Ser Cys Arg Asn
340 345 350
Ile Thr Cys Asp Asn Trp Phe Thr Ser Ile Pro Leu Ala Lys Asn Leu
355 360 365
Leu Gln Glu Pro Tyr Lys Leu Thr Ile Val Gly Thr Val Arg Ser Asn
370 375 380
Lys Arg Glu Ile Pro Glu Val Leu Lys Asn Ser Arg Ser Arg Pro Val
385 390 395 400
Gly Thr Ser Met Phe Cys Phe Asp Gly Pro Leu Thr Leu Val Ser Tyr
405 410 415
Lys Pro Lys Pro Ala Lys Met Val Tyr Leu Leu Ser Ser Cys Asp Glu
420 425 430
Asp Ala Ser Ile Asn Glu Ser Thr Gly Lys Pro Gln Met Val Met Tyr
435 440 445
Tyr Asn Gln Thr Lys Gly Gly Val Asp Thr Leu Asp Gln Met Cys Ser
450 455 460
Val Met Thr Cys Ser Arg Lys Thr Asn Arg Trp Pro Met Ala Leu Leu
465 470 475 480
Tyr Gly Met Ile Asn Ile Ala Cys Ile Asn Ser Phe Ile Ile Tyr Ser
485 490 495
His Asn Val Ser Ser Lys Gly Glu Lys Val Gln Ser Arg Lys Lys Phe
500 505 510
Met Arg Asn Leu Tyr Met Gly Leu Thr Ser Ser Phe Met Arg Lys Arg
515 520 525
Leu Glu Ala Pro Thr Leu Lys Arg Tyr Leu Arg Asp Asn Ile Ser Asn
530 535 540
Ile Leu Pro Lys Glu Val Pro Gly Thr Ser Asp Asp Ser Thr Glu Glu
545 550 555 560
Pro Val Met Lys Lys Arg Thr Tyr Cys Thr Tyr Cys Pro Ser Lys Ile
565 570 575
Arg Arg Lys Ala Ser Ala Ser Cys Lys Lys Cys Lys Lys Val Ile Cys
580 585 590
Arg Glu His Asn Ile Asp Met Cys Gln Ser Cys Phe
595 600
<210> 2
<211> 604
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 2
Met Gly Pro Ala Ala Lys Arg Val Lys Leu Asp Gly Ser Ser Leu Asp
1 5 10 15
Asp Glu His Ile Leu Ser Ala Leu Leu Gln Ser Asp Asp Glu Leu Val
20 25 30
Gly Glu Asp Ser Asp Ser Glu Val Ser Asp His Val Ser Glu Asp Asp
35 40 45
Val Gln Ser Asp Thr Glu Glu Ala Phe Ile Asp Glu Val His Glu Val
50 55 60
Gln Pro Thr Ser Ser Gly Ser Glu Ile Leu Asp Glu Gln Asn Val Ile
65 70 75 80
Glu Gln Pro Gly Ser Ser Leu Ala Ser Asn Arg Asn Leu Thr Leu Pro
85 90 95
Gln Arg Thr Ile Arg Gly Lys Asn Lys His Cys Trp Ser Thr Ser Lys
100 105 110
Pro Thr Arg Arg Ser Arg Ala Ser Ala Leu Asn Ile Val Arg Ser Gln
115 120 125
Arg Gly Pro Thr Arg Met Cys Arg Asn Ile Tyr Asp Pro Leu Leu Cys
130 135 140
Phe Lys Leu Phe Phe Thr Asp Glu Ile Ile Ser Glu Ile Val Lys Trp
145 150 155 160
Thr Asn Ala Glu Ile Ser Leu Lys Arg Arg Glu Ser Met Thr Ser Ala
165 170 175
Thr Phe Arg Asp Thr Asn Glu Asp Glu Ile Tyr Ala Phe Phe Gly Ile
180 185 190
Leu Val Met Thr Ala Val Arg Lys Asp Asn His Met Ser Thr Asp Asp
195 200 205
Leu Phe Asp Arg Ser Leu Ser Met Val Tyr Val Ser Val Met Ser Arg
210 215 220
Asp Arg Phe Asp Phe Leu Ile Arg Cys Leu Arg Met Asp Asp Lys Ser
225 230 235 240
Ile Arg Pro Thr Leu Arg Glu Asn Asp Val Phe Thr Pro Val Arg Lys
245 250 255
Ile Trp Asp Leu Phe Ile His Gln Cys Ile Gln Asn Tyr Thr Pro Gly
260 265 270
Ala His Leu Thr Ile Asp Glu Gln Leu Leu Gly Phe Arg Gly Arg Cys
275 280 285
Pro Phe Arg Val Tyr Ile Pro Asn Lys Pro Ser Lys Tyr Gly Ile Lys
290 295 300
Ile Leu Met Met Cys Asp Ser Gly Thr Lys Tyr Met Ile Asn Gly Met
305 310 315 320
Pro Tyr Leu Gly Arg Gly Thr Gln Thr Asn Gly Val Pro Leu Gly Glu
325 330 335
Tyr Tyr Val Lys Glu Leu Ser Lys Pro Val His Gly Ser Cys Arg Asn
340 345 350
Ile Thr Cys Asp Asn Trp Phe Thr Ser Ile Pro Leu Ala Lys Asn Leu
355 360 365
Leu Gln Glu Pro Tyr Lys Leu Thr Ile Val Gly Thr Val Arg Ser Asn
370 375 380
Lys Arg Glu Ile Pro Glu Val Leu Lys Asn Ser Arg Ser Arg Pro Val
385 390 395 400
Gly Thr Ser Met Phe Cys Phe Asp Gly Pro Leu Thr Leu Val Ser Tyr
405 410 415
Lys Pro Lys Pro Ala Lys Met Val Tyr Leu Leu Ser Ser Cys Asp Glu
420 425 430
Asp Ala Ser Ile Asn Glu Ser Thr Gly Lys Pro Gln Met Val Met Tyr
435 440 445
Tyr Asn Gln Thr Lys Gly Gly Val Asp Thr Leu Asp Gln Met Cys Ser
450 455 460
Val Met Thr Cys Ser Arg Lys Thr Asn Arg Trp Pro Met Ala Leu Leu
465 470 475 480
Tyr Gly Met Ile Asn Ile Ala Cys Ile Asn Ser Phe Ile Ile Tyr Ser
485 490 495
His Asn Val Ser Ser Lys Gly Glu Lys Val Gln Ser Arg Lys Lys Phe
500 505 510
Met Arg Asn Leu Tyr Met Gly Leu Thr Ser Ser Phe Met Arg Lys Arg
515 520 525
Leu Glu Ala Pro Thr Leu Lys Arg Tyr Leu Arg Asp Asn Ile Ser Asn
530 535 540
Ile Leu Pro Lys Glu Val Pro Gly Thr Ser Asp Asp Ser Thr Glu Glu
545 550 555 560
Pro Val Met Lys Lys Arg Thr Tyr Cys Thr Tyr Cys Pro Ser Lys Ile
565 570 575
Arg Arg Lys Ala Ser Ala Ser Cys Lys Lys Cys Lys Lys Val Ile Cys
580 585 590
Arg Glu His Asn Ile Asp Met Cys Arg Ser Cys Phe
595 600
<210> 3
<211> 1815
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 3
atgggccctg ctgccaagag ggtcaagttg gacggcagca gcctggacga cgagcacatc 60
ctgagcgccc tgctgcagag cgacgacgag ctggtgggcg aggacagcga cagcgaggtg 120
agcgaccacg tgagcgagga cgacgtgcag agcgacaccg aggaggcctt catcgacgag 180
gtgcacgagg tgcagcccac cagcagcggc agcgagatcc tggacgagca gaacgtgatc 240
gagcagcccg gcagcagcct ggccagcaac cgcaacctga ccctgcccca gcgcaccatc 300
cgcggcaaga acaagcactg ctggagcacc agcaagccca cccgccgcag ccgcgccagc 360
gccctgaaca tcgtgcgcag ccagcgcggc cccacccgca tgtgccgcaa catctacgac 420
cccctgctgt gcttcaagct gttcttcacc gacgagatca tcagcgagat cgtgaagtgg 480
accaacgccg agatcagcct gaagcgccgc gagagcatga ccagcgccac cttccgcgac 540
accaacgagg acgagatcta cgccttcttc ggcatcctgg tgatgaccgc cgtgcgcaag 600
gacaaccaca tgagcaccga cgacctgttc gaccgcagcc tgagcatggt gtacgtgagc 660
gtgatgagcc gcgaccgctt cgacttcctg atccgctgcc tgcgcatgga cgacaagagc 720
atccgcccca ccctgcgcga gaacgacgtg ttcacccccg tgcgcaagat ctgggacctg 780
ttcatccacc agtgcatcca gaactacacc cccggcgccc acctgaccat cgacgagcag 840
ctgctgggct tccgcggccg ctgccccttc cgcgtgtaca tccccaacaa gcccagcaaa 900
tacggcatca agatcctgat gatgtgcgac agcggcacca agtacatgat caacggcatg 960
ccctacctgg gccgcggcac ccagaccaac ggcgtgcccc tgggcgagta ctacgtgaag 1020
gagctgagca agcccgtgca cggcagctgc cgcaacatca cctgcgacaa ctggttcacc 1080
agcatccccc tggccaagaa cctgctgcag gagccctaca agctgaccat cgtgggcacc 1140
gtgcgcagca acaagcgcga gatccccgag gtgctgaaga acagccgcag ccgccccgtg 1200
ggcaccagca tgttctgctt cgacggcccc ctgaccctgg tgagctacaa gcccaagccc 1260
gccaagatgg tgtacctgct gagcagctgc gacgaggacg ccagcatcaa cgagagcacc 1320
ggcaagcccc agatggtgat gtactacaac cagaccaagg gcggcgtgga caccctggac 1380
cagatgtgca gcgtgatgac ctgcagccgc aagaccaacc gctggcccat ggccctgctg 1440
tacggcatga tcaacatcgc ctgcatcaac agcttcatca tctacagcca caacgtgagc 1500
agcaagggcg agaaggtgca gagccgcaag aagttcatgc gcaacctgta catgggcctg 1560
accagcagct tcatgcgcaa gcgcctggag gcccccaccc tgaagcgcta cctgcgcgac 1620
aacatcagca acatcctgcc caaggaggtg cccggcacca gcgacgacag caccgaggag 1680
cccgtgatga agaagcgcac ctactgcacc tactgcccca gcaagatccg ccgcaaggcc 1740
agcgccagct gcaagaagtg caagaaggtg atctgccgcg agcacaacat cgacatgtgc 1800
cggagctgct tctaa 1815
<210> 4
<211> 1815
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 4
atgggccctg ctgccaagag ggtcaagttg gacggcagca gcctggacga cgagcacatc 60
ctgagcgccc tgctgcagag cgacgacgag ctggtgggcg aggacagcga cagcgaggtg 120
agcgaccacg tgagcgagga cgacgtgcag agcgacaccg aggaggcctt catcgacgag 180
gtgcacgagg tgcagcccac cagcagcggc agcgagatcc tggacgagca gaacgtgatc 240
gagcagcccg gcagcagcct ggccagcaac cgcaatctga ccctgcccca gcgcaccatc 300
cgcggcaaga acaagcactg ctggagcacc agcaagccca cccgccgcag ccgcgtcagc 360
gccctgaaca tcgtgcgcag ccagcgcggc cccacccgca tgtgccgcaa catctacgac 420
cccctgctgt gcttcaagct gttcttcacc gacgagatca tcagcgagat cgtgaagtgg 480
accaacgccg agatcagcct gaagcgccgc gagagcatga ccagcgccac cttccgcgac 540
accaacgagg acgagatcta cgccttcttc ggcatcctgg tgatgaccgc cgtgcgcaag 600
gacaaccaca tgagcaccga cgacctgttc gaccgcagcc tgagcatggt gtacgtgagc 660
gtgatgagcc gcgaccgctt cgacttcctg atccgctgcc tgcgcatgga cgacaagagc 720
atccgcccca ccctgcgcga gaacgacgtg ttcacccccg tgcgcaagat ctgggacctg 780
ttcatccacc agtgcatcca gaactacacc cccggcgccc acctgaccat cgacgagcag 840
ctgctgggct tccgcggccg ctgccccttc cgcgtgtaca tccccaacaa gcccagcaag 900
tacggcatca agatcctgat gatgtgcgac agcggcacca agtacatgat caacggcatg 960
ccctacctgg gccgcggcac ccagaccaac ggcgtgcccc tgggcgagta ctacgtgaag 1020
gagctgagca agcccgtgca cggcagctgc cgcaacatca cctgcgacaa ctggttcacc 1080
agcatccccc tggccaagaa cctgctgcag gagccctaca agctgaccat cgtgggcacc 1140
gtgcgcagca acaagcgcga gatccccgag gtgctgaaga acagccgcag ccgccccgtg 1200
ggcaccagca tgttctgctt cgacggcccc ctgaccctgg tgagctacaa gcccaagccc 1260
gccaagatgg tgtacctgct gagcagctgc gacgaggacg ccagcatcaa cgagagcacc 1320
ggcaagcccc agatggtgat gtactacaac cagaccaagg gcggcgtgga caccctggac 1380
cagatgtgca gcgtgatgac ctgcagccgc aagaccaacc gctggcccat ggccctgctg 1440
tacggcatga tcaacatcgc ctgcatcaac agcttcatca tctacagcca caacgtgagc 1500
agcaagggcg agaaggtgca gagccgcaag aagttcatgc gcaacctgta catgggcctg 1560
accagcagct tcatgcgcaa gcgcctggag gcccccaccc tgaagcgcta cctgcgcgac 1620
aacatcagca acatcctgcc caaggaggtg cccggcacca gcgacgacag caccgaggag 1680
cccgtgatga agaagcgcac ctactgcacc tactgcccca gcaagatccg ccgcaaggcc 1740
agcgccagct gcaagaagtg caagaaggtg atctgccgcg agcacaacat cgacatgtgc 1800
cagagctgct tctaa 1815
<210> 5
<211> 24
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 5
aagccgctaa aggcattatc cgcc 24
<210> 6
<211> 28
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 6
aactgtgccc tccatggaaa aatcagtc 28
<210> 7
<211> 69
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 7
gactgatttt tccatggagg gcacagttaa ccctagaaag atagtctgcg taaaattgac 60
gcatgcgac 69
<210> 8
<211> 50
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 8
ggcggataat gcctttagcg gcttaaccct agaaagataa tcatattgtg 50
<210> 9
<211> 18
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 9
taatcagcga agcgatga 18
<210> 10
<211> 23
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 10
cagcatgcct gctattgtct tcc 23