CN111718949B - Introduction of unnatural amino acids in proteins using a two-plasmid system - Google Patents

Publication number
CN111718949B
Authority
CN
China
Prior art keywords
leu
ser
lys
glu
ile
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910210100.XA
Other languages
Chinese (zh)
Other versions
CN111718949A (en)
Inventor
查若鹏
吴松
张振山
刘慧玲
陈卫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ningbo Kunpeng Biotech Co Ltd
Original Assignee
Ningbo Kunpeng Biotech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ningbo Kunpeng Biotech Co Ltd
Priority to CN201910210100.XA (CN111718949B)
Priority to PCT/CN2020/080039 (WO2020187271A1)
Priority to CN202080023302.4A (CN113631712A)
Publication of CN111718949A
Application granted
Publication of CN111718949B
Legal status: Active

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P13/00 Preparation of nitrogen-containing organic compounds
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P21/00 Preparation of peptides or proteins
    • C12P21/02 Preparation of peptides or proteins having a known sequence of two or more amino acids, e.g. glutathione

Abstract

The present invention provides methods for introducing unnatural amino acids into proteins using a two-plasmid system. Specifically, the present invention provides a dual plasmid system comprising a first plasmid comprising a first expression cassette for expressing a protein of interest and a second plasmid comprising a second expression cassette for expressing an aminoacyl-tRNA synthetase; the system also contains a third expression cassette for encoding an artificial tRNA. The result shows that the double-plasmid system can directly introduce the unnatural amino acid into the protein, and has the advantages of low cost, high yield and small environmental pollution. In addition, the mutant lysyl-tRNA synthetases of the invention can increase the amount of unnatural amino acid inserted and the amount of a protein of interest that contains an unnatural amino acid, as compared to a wild-type lysyl-tRNA synthetase.

Description

Introduction of unnatural amino acids in proteins using a two-plasmid system
Technical Field
The invention belongs to the technical field of biomedical engineering. In particular, the invention relates to the introduction of unnatural amino acids into proteins using a two-plasmid system.
Background
Unnatural amino acids can serve as key residues in post-translational modification and in the active centers of enzymes, and they play an important role in the physiological and pathological functions of proteins. Incorporating unnatural amino acids into proteins, and into polypeptide drugs in particular, can enhance efficacy and reduce toxicity. It can also greatly reduce the immunogenicity of a polypeptide drug and the associated immune rejection. Because certain proteases no longer recognize polypeptides that carry unnatural amino acids, the drug can persist in vivo for longer without being degraded, which prolongs its half-life and addresses the need for continuous injection of polypeptide drugs. Such modification is also expected to allow a polypeptide drug to carry other chemical moieties, opening the way to new methods of treating disease. At present, however, most polypeptide drugs can be modified only by chemical synthesis.
Therefore, there is a need in the art to develop a method for efficiently introducing unnatural amino acids into polypeptide drugs at a desired site.
Disclosure of Invention
The invention aims to provide a method for efficiently and site-specifically introducing unnatural amino acids into polypeptide drugs.
In a first aspect of the present invention, there is provided a dual plasmid system comprising:
(1) a first plasmid comprising a first expression cassette for expression of a protein of interest, the first expression cassette comprising a first coding sequence encoding the protein of interest, the first coding sequence comprising non-natural codons for introduction of a predetermined modified amino acid, the non-natural codons being UAG (amber), UAA (ochre), or UGA (opal); and
(2) a second plasmid comprising a second expression cassette for expression of an aminoacyl-tRNA synthetase;
and, the system further comprises a third expression cassette encoding an artificial tRNA, wherein the artificial tRNA comprises an anticodon corresponding to the unnatural codon, wherein the third expression cassette is located in the first plasmid and/or the second plasmid;
and said aminoacyl-tRNA synthetase specifically catalyzes the formation of an "artificial tRNA-Xa" complex by said artificial tRNA, wherein Xa is said predetermined modified amino acid in aminoacyl form.
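The requirement that the artificial tRNA carry "an anticodon corresponding to the unnatural codon" can be illustrated with a minimal sketch. It assumes plain Watson-Crick, antiparallel pairing between codon and anticodon; the function and constant names are illustrative and do not come from the patent.

```python
# Complement table for RNA bases (Watson-Crick pairing only).
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# The three stop codons named in the text, written 5'->3'.
STOP_CODONS = {"UAG": "amber", "UAA": "ochre", "UGA": "opal"}


def anticodon_for(codon: str) -> str:
    """Return the Watson-Crick anticodon (5'->3') for an mRNA codon (5'->3')."""
    codon = codon.upper()
    if len(codon) != 3 or any(base not in RNA_COMPLEMENT for base in codon):
        raise ValueError(f"not an RNA codon: {codon!r}")
    # Codon and anticodon pair antiparallel: complement each base, then reverse.
    return "".join(RNA_COMPLEMENT[base] for base in reversed(codon))


if __name__ == "__main__":
    for codon, name in STOP_CODONS.items():
        print(f"{codon} ({name}) pairs with anticodon {anticodon_for(codon)}")
```

Under this assumption, an amber (UAG) codon pairs with a CUA anticodon, which is the anticodon an amber-suppressor tRNA would need.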
In another preferred embodiment, the non-natural codon is UAG (amber) or UGA (opal).
In another preferred embodiment, the codon is a three-nucleotide sequence on mRNA or DNA that corresponds to an amino acid.
In another preferred embodiment, the predetermined modified amino acid is lysine with a modification group.
In another preferred embodiment, the modified amino acid is selected from the group consisting of: an alkynyloxycarbonyl lysine derivative, a tert-butoxycarbonyl (Boc)-lysine derivative, a fatty acylated lysine derivative, or a combination thereof.
In another preferred embodiment, the alkynyloxycarbonyl lysine has the structure shown in formula I below:
[Formula I: structure image not reproduced]
wherein n is 0 to 8.
In another preferred embodiment, the third expression cassette is located in a second plasmid.
In another preferred embodiment, the second expression cassette comprises a second coding sequence that encodes an aminoacyl-tRNA synthetase.
In another preferred embodiment, the aminoacyl-tRNA synthetase is a wild-type aminoacyl-tRNA synthetase or a mutant aminoacyl-tRNA synthetase.
In another preferred embodiment, the aminoacyl-tRNA synthetase is a lysyl-tRNA synthetase.
In another preferred embodiment, the aminoacyl-tRNA synthetase is a mutant lysyl-tRNA synthetase.
In another preferred embodiment, the mutant lysyl-tRNA synthetase is mutated at the position corresponding to arginine (R) at position 19 and/or histidine (H) at position 29 of the amino acid sequence of the wild-type lysyl-tRNA synthetase.
In another preferred embodiment, the wild-type lysyl-tRNA synthetase is from the methanogenic archaea Methanosarcina mazei, Methanosarcina barkeri or Methanosarcina acetivorans.
In another preferred embodiment, the amino acid sequence of the wild-type lysyl-tRNA synthetase is as shown in SEQ ID No. 1.
In another preferred embodiment, the amino acid sequence of the wild-type lysyl-tRNA synthetase is as shown in SEQ ID No. 2.
In another preferred embodiment, the arginine (R) at position 19 is mutated to histidine (H) or lysine (K); and/or
the histidine (H) at position 29 is mutated to arginine (R) or lysine (K).
In another preferred embodiment, the mutation in the mutant lysyl-tRNA synthetase is selected from the group consisting of: R19H, R19K, H29R, H29K, or a combination thereof.
In another preferred embodiment, the mutant lysyl-tRNA synthetase further comprises a mutation at a site selected from the group consisting of: isoleucine (I) at position 26, threonine (T) at position 122, leucine (L) at position 309, cysteine (C) at position 348, tyrosine (Y) at position 384, or a combination thereof.
In another preferred embodiment, the mutation site of the mutant lysyl-tRNA synthetase further comprises isoleucine (I) at position 26; preferably, the isoleucine (I) at position 26 is mutated to valine (V).
In another preferred embodiment, the mutation site of the mutant lysyl-tRNA synthetase further comprises threonine (T) at position 122; preferably, the threonine (T) at position 122 is mutated to serine (S).
In another preferred embodiment, the mutation site of the mutant lysyl-tRNA synthetase further comprises leucine (L) at position 309; preferably, the leucine (L) at position 309 is mutated to alanine (A).
In another preferred embodiment, the mutation site of the mutant lysyl-tRNA synthetase further comprises cysteine (C) at position 348; preferably, the cysteine (C) at position 348 is mutated to serine (S).
In another preferred embodiment, the mutation site of the mutant lysyl-tRNA synthetase further comprises tyrosine (Y) at position 384; preferably, the tyrosine (Y) at position 384 is mutated to phenylalanine (F).
In another preferred embodiment, the mutant lysyl-tRNA synthetase further comprises a mutation selected from the group consisting of: I26V, T122S, L309A, C348S, Y384F, or a combination thereof.
In another preferred embodiment, the mutant lysyl-tRNA synthetase comprises a mutation selected from the group consisting of: R19H and H29R.
In another preferred embodiment, the mutant lysyl-tRNA synthetase comprises a mutation selected from the group consisting of: R19K and H29R.
In another preferred embodiment, the mutant lysyl-tRNA synthetase comprises a mutation selected from the group consisting of: R19H and H29K.
In another preferred embodiment, the mutant lysyl-tRNA synthetase comprises a mutation selected from the group consisting of: R19K and H29K.
In another preferred embodiment, the mutant lysyl-tRNA synthetase comprises a mutation selected from the group consisting of: R19H, I26V and H29R.
In another preferred embodiment, the mutant lysyl-tRNA synthetase comprises a mutation selected from the group consisting of: R19H, H29R, T122S and Y384F.
In another preferred embodiment, the mutant lysyl-tRNA synthetase comprises a mutation selected from the group consisting of: R19H, H29R, L309A and C348S.
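The mutation labels used above (for example R19H or T122S) follow the usual "wild-type residue, 1-based position, new residue" notation. The sketch below shows one way such labels could be applied to a sequence and checked against it. The 30-residue `TOY_SEQUENCE` is a hypothetical placeholder, not SEQ ID NO. 1 or 2, and all names are illustrative assumptions.

```python
import re
from typing import Iterable

# One-letter wild-type residue, 1-based position, one-letter new residue.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutations(sequence: str, mutations: Iterable[str]) -> str:
    """Apply point mutations such as 'R19H' to a protein sequence."""
    residues = list(sequence)
    for label in mutations:
        match = MUTATION_RE.match(label)
        if match is None:
            raise ValueError(f"unrecognized mutation label: {label!r}")
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        # Refuse to mutate if the sequence does not carry the stated residue.
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{label}: expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


# Hypothetical toy sequence (NOT from the patent): R at 19, I at 26, H at 29.
TOY_SEQUENCE = "A" * 18 + "R" + "A" * 6 + "I" + "A" * 2 + "H" + "A"

if __name__ == "__main__":
    print(apply_mutations(TOY_SEQUENCE, ["R19H", "I26V", "H29R"]))
```

Checking the wild-type residue before substituting guards against numbering the mutation against the wrong reference sequence.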
In another preferred embodiment, the mutant lysyl-tRNA synthetase has the same or substantially the same amino acid sequence as shown in SEQ ID No.1 or SEQ ID No. 2 except for the mutation (e.g., positions 19 and/or 29, and optionally positions 26, 122, 309, 348, and/or 384).
In another preferred embodiment, the substantial identity is a difference of up to 50 (preferably 1-20, more preferably 1-10) amino acids, wherein the difference comprises a substitution, deletion or addition of an amino acid and the mutein still has lysyl-tRNA synthetase activity.
In another preferred embodiment, the amino acid sequence of the mutant lysyl-tRNA synthetase has at least 70%, preferably at least 75%, 80%, 85%, 90%, more preferably at least 95%, 96%, 97%, 98%, 99% or more sequence identity compared to SEQ ID No.1 or SEQ ID No. 2.
In another preferred embodiment, the mutant lysyl-tRNA synthetase is mutated from a wild-type lysyl-tRNA synthetase as set forth in SEQ ID No.1 or SEQ ID No. 2.
In another preferred embodiment, the mutant lysyl-tRNA synthetase is selected from the group consisting of:
(1) a polypeptide having an amino acid sequence as set forth in any one of SEQ ID No. 3-9; or
(2) A polypeptide which is formed by substituting, deleting or adding one or more, preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-8, more preferably 1-3 and most preferably 1 amino acid residue in the amino acid sequence shown in any one of SEQ ID NO. 3-9, has the function of the polypeptide shown in (1) and is derived from the polypeptide of the amino acid sequence shown in any one of SEQ ID NO. 3-9.
In another preferred embodiment, the amino acid sequence of the mutant lysyl-tRNA synthetase is as set forth in any one of SEQ ID No. 3-9.
In another preferred embodiment, the mutant lysyl-tRNA synthetase is a non-natural protein.
In another preferred embodiment, the mutant lysyl-tRNA synthetase is used to introduce a lysine derivative into a protein of interest.
In another preferred embodiment, the mutant lysyl-tRNA synthetase has the following characteristic:
compared with the wild-type lysyl-tRNA synthetase, it can introduce lysine derivatives bearing large functional groups into proteins.
In another preferred embodiment, the coding nucleic acid sequence of the artificial tRNA is shown in SEQ ID NO. 10.
GGAAACCTGATCATGTAGATCGAATGGACTCTAAATCCGTTCAGCCGGGTTAGATTCCCGGGGTTTCCGCCA(SEQ ID NO.:10)
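As a quick sanity check on the sequence above, the following sketch transcribes the coding DNA to RNA and reports two simple features: whether it ends in the 3' CCA tail typical of tRNAs, and where a CUA triplet (the Watson-Crick partner of an amber UAG codon) occurs. This is only a string-level inspection under those assumptions. Finding a CUA triplet does not by itself locate the anticodon loop, and the helper names are illustrative.

```python
# Coding sequence of the artificial tRNA, copied from SEQ ID NO. 10 in the text.
ARTIFICIAL_TRNA_DNA = "GGAAACCTGATCATGTAGATCGAATGGACTCTAAATCCGTTCAGCCGGGTTAGATTCCCGGGGTTTCCGCCA"


def transcribe(dna: str) -> str:
    """Return the RNA with the same sequence as the coding DNA strand."""
    return dna.upper().replace("T", "U")


def summarize_trna(dna: str, anticodon: str = "CUA") -> dict:
    """Report length, 3' CCA tail, and 1-based positions of candidate anticodons."""
    rna = transcribe(dna)
    positions = [
        i + 1 for i in range(len(rna) - 2) if rna[i:i + 3] == anticodon
    ]
    return {
        "length": len(rna),
        "has_3prime_cca": rna.endswith("CCA"),
        "anticodon_positions": positions,
    }


if __name__ == "__main__":
    print(summarize_trna(ARTIFICIAL_TRNA_DNA))
```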
In another preferred embodiment, the protein of interest is selected from the group consisting of: insulin, human insulin precursor protein, insulin lispro precursor protein, insulin glargine precursor protein, parathyroid hormone, clielin, calcitonin, bivalirudin, glucagon-like peptide and its derivatives exenatide and liraglutide, semaglutide, ziconotide, sertraline, ghrelin, secretin, teduglutide, hirudin, growth hormone, growth factor, growth hormone releasing factor, adrenocorticotropic hormone, releasing factor, dessertraline, desmopressin, elcatonin, glucagon, leuprorelin, luteinizing hormone releasing hormone, somatostatin, thyroid hormone releasing hormone, triptorelin, vasoactive intestinal peptide, interferon, parathyroid hormone, BH3 peptide, amyloid peptide, or fragments of the above peptides, or combinations thereof.
In another preferred embodiment, the first and/or second plasmid further comprises one or more promoters operably linked to the first coding sequence, enhancer, transcription termination signal, polyadenylation sequence, origin of replication, selectable marker, nucleic acid restriction site, and/or homologous recombination site.
In another preferred embodiment, the first plasmid is an expression vector selected from the group consisting of: pBAD-His ABC, pBAD/His ABC, pET28a, pETDuet-1.
In another preferred embodiment, the first plasmid further comprises a resistance gene, a tag sequence, a repressor gene (araC), a promoter gene (araBAD), or a combination thereof.
In another preferred embodiment, the resistance gene is selected from the group consisting of: an ampicillin resistance gene (AmpR), a chloramphenicol resistance gene (CmR), a kanamycin resistance gene (KanaR), a tetracycline resistance gene (TetR), or a combination thereof.
In another preferred embodiment, the first expression cassette further comprises a first promoter, preferably the first promoter is an inducible promoter.
In another preferred embodiment, the first promoter is selected from the group consisting of: arabinose promoter (AraBAD), lactose promoter (Plac), pLacUV5 promoter, pTac promoter, or a combination thereof.
In another preferred embodiment, said first expression cassette comprises, in order from 5 'to 3', a promoter, a ribosome binding site RBS, said first coding sequence, a terminator or a tag sequence.
In another preferred embodiment, the second plasmid is a pEvol-pBpF vector.
In another preferred embodiment, the second plasmid further comprises a resistance gene, a tag sequence, a repressor gene (araC), a promoter gene (araBAD), or a combination thereof.
In another preferred embodiment, the resistance gene is selected from the group consisting of: an ampicillin resistance gene (AmpR), a chloramphenicol resistance gene (CmR), a kanamycin resistance gene (KanaR), a tetracycline resistance gene (TetR), or a combination thereof.
In another preferred embodiment, the second expression cassette further comprises a second promoter, preferably the second promoter is an inducible promoter.
In another preferred embodiment, the second promoter is selected from the group consisting of: arabinose promoter (AraBAD), glnS promoter, proK promoter, or a combination thereof.
In another preferred embodiment, the second expression cassette comprises, in order from 5 'to 3', a promoter (araBAD), a ribosome binding site RBS, the second coding sequence, and a terminator (rrnB).
In another preferred embodiment, the third expression cassette further comprises a third promoter, preferably the third promoter is a constitutive promoter.
In another preferred embodiment, the third promoter is the proK promoter.
In another preferred embodiment, the third expression cassette comprises, in order from 5 'to 3', a promoter, a ribosome binding site RBS, an artificial tRNA coding sequence, a terminator or a tag sequence.
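The layout described in the first aspect can be summarized as a small data model: two plasmids, each carrying ordered 5'-to-3' expression cassettes, with the artificial-tRNA cassette allowed on either plasmid. The sketch below is illustrative only. The class names, cassette labels and the validation helper are assumptions, and the element lists simply restate the orderings given in the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ExpressionCassette:
    name: str            # illustrative label, e.g. "protein_of_interest"
    elements: List[str]  # elements listed in 5'->3' order


@dataclass
class Plasmid:
    name: str
    cassettes: List[ExpressionCassette] = field(default_factory=list)


def validate_dual_plasmid_system(first: Plasmid, second: Plasmid) -> List[str]:
    """Return a list of problems; an empty list means the layout matches the text."""
    problems = []
    first_names = {c.name for c in first.cassettes}
    second_names = {c.name for c in second.cassettes}
    if "protein_of_interest" not in first_names:
        problems.append("first plasmid lacks the protein-of-interest cassette")
    if "aminoacyl_tRNA_synthetase" not in second_names:
        problems.append("second plasmid lacks the synthetase cassette")
    # The third cassette may sit on the first and/or the second plasmid.
    if "artificial_tRNA" not in (first_names | second_names):
        problems.append("no plasmid carries the artificial tRNA cassette")
    return problems


first_plasmid = Plasmid("pBAD", [
    ExpressionCassette(
        "protein_of_interest",
        ["araBAD promoter", "RBS", "first coding sequence", "terminator"],
    ),
])
second_plasmid = Plasmid("pEvol", [
    ExpressionCassette(
        "aminoacyl_tRNA_synthetase",
        ["araBAD promoter", "RBS", "second coding sequence", "rrnB terminator"],
    ),
    ExpressionCassette(
        "artificial_tRNA",
        ["proK promoter", "RBS", "artificial tRNA coding sequence", "terminator"],
    ),
])

if __name__ == "__main__":
    print(validate_dual_plasmid_system(first_plasmid, second_plasmid))
```

Here the tRNA cassette is placed on the second plasmid, matching the preferred embodiment stated earlier. Moving it to the first plasmid would also pass the check.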
In a second aspect of the invention, there is provided a host cell or cell extract comprising the dual plasmid system of the first aspect of the invention.
In another preferred embodiment, the host cell is selected from the group consisting of: Escherichia coli, Bacillus subtilis, yeast cells, insect cells, mammalian cells, or a combination thereof.
In another preferred embodiment, the cell extract is from a cell selected from the group consisting of: Escherichia coli, Bacillus subtilis, yeast cells, insect cells, mammalian cells, or a combination thereof.
In a third aspect of the invention, there is provided a kit comprising (a) a container, and (b) located within the container:
(1) a first plasmid comprising a first expression cassette for expression of a protein of interest, the first expression cassette comprising a first coding sequence encoding the protein of interest, the first coding sequence comprising non-natural codons for introduction of a predetermined modified amino acid, the non-natural codons being UAG (amber), UAA (ochre), or UGA (opal); and
(2) a second plasmid comprising a second expression cassette for expression of an aminoacyl-tRNA synthetase;
and, the kit further comprises a third expression cassette encoding an artificial tRNA, wherein the artificial tRNA comprises an anticodon corresponding to the unnatural codon, wherein the third expression cassette is located in the first plasmid and/or the second plasmid;
and said aminoacyl-tRNA synthetase specifically catalyzes the formation of an "artificial tRNA-Xa" complex by said artificial tRNA, wherein Xa is said predetermined modified amino acid in aminoacyl form.
In another preferred embodiment, the kit further comprises a cell extract.
In another preferred embodiment, the first plasmid and the second plasmid are in the same or different containers.
In another preferred embodiment, the non-natural codon is a UAG (amber) or UGA (opal) codon.
In another preferred embodiment, the predetermined modified amino acid is lysine with a modification group.
In another preferred embodiment, the modified amino acid is selected from the group consisting of: an alkynyloxycarbonyl lysine derivative, a tert-butoxycarbonyl (Boc)-lysine derivative, a fatty acylated lysine derivative, or a combination thereof.
In another preferred embodiment, the third expression cassette is located in a second plasmid.
In another preferred embodiment, the second expression cassette comprises a second coding sequence that encodes an aminoacyl-tRNA synthetase.
In another preferred embodiment, the aminoacyl-tRNA synthetase is a wild-type aminoacyl-tRNA synthetase or a mutant aminoacyl-tRNA synthetase.
In another preferred embodiment, the aminoacyl-tRNA synthetase is a lysyl-tRNA synthetase.
In another preferred embodiment, the aminoacyl-tRNA synthetase is a mutant lysyl-tRNA synthetase.
In another preferred embodiment, the protein of interest is selected from the group consisting of: insulin, human insulin precursor protein, insulin lispro precursor protein, insulin glargine precursor protein, parathyroid hormone, clielin, calcitonin, bivalirudin, glucagon-like peptide and its derivatives exenatide and liraglutide, semaglutide, ziconotide, sertraline, ghrelin, secretin, teduglutide, hirudin, growth hormone, growth factor, growth hormone releasing factor, adrenocorticotropic hormone, releasing factor, dessertraline, desmopressin, elcatonin, glucagon, leuprorelin, luteinizing hormone releasing hormone, somatostatin, thyroid hormone releasing hormone, triptorelin, vasoactive intestinal peptide, interferon, parathyroid hormone, BH3 peptide, amyloid peptide, or fragments of the above peptides, or combinations thereof.
In another preferred embodiment, the first and/or second plasmid further comprises one or more promoters operably linked to the first coding sequence, enhancer, transcription termination signal, polyadenylation sequence, origin of replication, selectable marker, nucleic acid restriction site, and/or homologous recombination site.
In a fourth aspect of the invention, there is provided the use of a dual plasmid system according to the first aspect of the invention, or a host cell or cell extract according to the second aspect of the invention, or a kit according to the third aspect of the invention, for the preparation of a protein comprising a predetermined modified amino acid.
In another preferred embodiment, the predetermined modified amino acid is lysine with a modification group.
In another preferred embodiment, the modified amino acid is selected from the group consisting of: an alkynyloxycarbonyl lysine derivative, a tert-butoxycarbonyl (Boc)-lysine derivative, a fatty acylated lysine derivative, or a combination thereof.
In a fifth aspect of the present invention, there is provided a method for producing a protein containing a predetermined modified amino acid, the method comprising the steps of:
(1) providing a host cell or cell extract according to the second aspect of the invention, and
(2) adding the predetermined modified amino acid, and culturing the cell or cell extract to obtain a protein containing the predetermined modified amino acid.
In another preferred embodiment, the host cell is selected from the group consisting of: Escherichia coli, Bacillus subtilis, yeast cells, insect cells, mammalian cells, or a combination thereof.
In another preferred embodiment, the cell extract is from a cell selected from the group consisting of: Escherichia coli, Bacillus subtilis, yeast cells, insect cells, mammalian cells, or a combination thereof.
In another preferred embodiment, the predetermined modified amino acid is lysine with a modification group.
In another preferred embodiment, the modified amino acid is selected from the group consisting of: an alkynyloxycarbonyl lysine derivative, a tert-butoxycarbonyl (Boc)-lysine derivative, a fatty acylated lysine derivative, or a combination thereof.
In another preferred embodiment, the protein of interest is selected from the group consisting of: insulin, human insulin precursor protein, insulin lispro precursor protein, insulin glargine precursor protein, parathyroid hormone, clielin, calcitonin, bivalirudin, glucagon-like peptide and its derivatives exenatide and liraglutide, semaglutide, ziconotide, sertraline, ghrelin, secretin, teduglutide, hirudin, growth hormone, growth factor, growth hormone releasing factor, adrenocorticotropic hormone, releasing factor, dessertraline, desmopressin, elcatonin, glucagon, leuprorelin, luteinizing hormone releasing hormone, somatostatin, thyroid hormone releasing hormone, triptorelin, vasoactive intestinal peptide, interferon, parathyroid hormone, BH3 peptide, amyloid peptide, or fragments of the above peptides, or combinations thereof.
It is to be understood that within the scope of the present invention, the above-described technical features and those specifically described below (e.g., in the examples) may be combined with each other to form new or preferred embodiments. For reasons of space, these combinations are not described one by one here.
Drawings
FIG. 1 shows a map of plasmid pBAD-A1-u4-u5-TEV-R-MiniINS.
FIG. 2 shows a map of the plasmid pEvol-pylRs(R19K, H29K, T122S, Y384F)-pylT.
FIG. 3 shows the expression of Boc-modified fusion proteins by wild-type lysyl-tRNA synthetase pylRs (SEQ ID NO: 1) and mutant lysyl-tRNA synthetase pylRs (R19K, H29K, T122S, Y384F). FIG. 3a: expression of Boc-modified GFP-TEV-R-MiniINS fusion protein by the two enzymes. Lane 1, strain 1, wild-type pylRs; lane 2, strain 3, mutant pylRs (R19K, H29K, T122S, Y384F); M, protein standard. FIG. 3b: expression of Boc-modified A1-u4-u5-TEV-R-MiniINS fusion protein by the two enzymes. M, protein standard; lane 1, strain 2, wild-type pylRs; lane 2, strain 4, mutant pylRs (R19K, H29K, T122S, Y384F).
Detailed Description
After extensive and intensive study, the inventors unexpectedly found that an unnatural amino acid can be introduced directly into a protein by the dual-plasmid system of the present invention, which offers low cost, high yield, and low environmental pollution. In addition, the inventors surprisingly obtained a mutant lysyl-tRNA synthetase. The mutant lysyl-tRNA synthetases of the invention can increase the amount of unnatural amino acid inserted and the amount of a protein of interest that contains an unnatural amino acid, as compared to a wild-type lysyl-tRNA synthetase. The mutant lysyl-tRNA synthetase of the invention can also improve the stability of the target protein, making it less prone to fragmentation. On this basis, the inventors have completed the present invention.
Description of the terms
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
As used herein, the term "about" when used in reference to a specifically recited value means that the value may vary by no more than 1% from the recited value. For example, as used herein, the expression "about 100" includes 99 and 101 and all values in between (e.g., 99.1, 99.2, 99.3, 99.4, etc.).
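The definition of "about" above amounts to a 1% relative tolerance around the recited value. A minimal sketch, with an illustrative function name:

```python
def is_about(value: float, recited: float, tolerance: float = 0.01) -> bool:
    """True if `value` differs from `recited` by no more than 1% of `recited`."""
    return abs(value - recited) <= tolerance * abs(recited)


if __name__ == "__main__":
    # "about 100" covers 99 through 101 inclusive.
    for candidate in (98.9, 99, 99.1, 100, 101, 101.5):
        print(candidate, is_about(candidate, 100))
```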
As used herein, the terms "comprising" and "including" can be open, semi-closed, or closed. In other words, the terms also cover "consisting essentially of" and "consisting of".
Sequence identity (or homology) is determined by comparing two aligned sequences along a predetermined comparison window (which may be 50%, 60%, 70%, 80%, 90%, 95% or 100% of the length of the reference nucleotide sequence or protein) and determining the number of positions at which identical residues occur. Typically, this is expressed as a percentage. The measurement of sequence identity of nucleotide sequences is a method well known to those skilled in the art.
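The identity calculation described above can be sketched for two sequences that have already been aligned. Several conventions exist for the denominator. This sketch assumes identity is counted over every alignment column in which at least one sequence has a residue, and it takes the whole alignment as the comparison window. The function name and the gap symbol are illustrative assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned, equal-length sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # a column of two gaps carries no information
        compared += 1
        if a == b:
            identical += 1  # both are residues here, since double gaps were skipped
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / compared


if __name__ == "__main__":
    print(percent_identity("MKTA", "MKSA"))
```

With this convention, a single substitution in a four-residue alignment gives 75% identity, and a gap opposite a residue counts as a mismatch.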
Vector
As used herein, the term "construct" or "vector" generally refers to a nucleic acid capable of transporting the coding sequence of the protein of interest to which it is linked. One type of vector is a "plasmid," which refers to a circular double-stranded DNA loop into which additional DNA segments can be ligated.
The coding sequence for the protein of interest may be incorporated into a vector. The vectors may be used to replicate the nucleic acid in a compatible host cell. The vector may be recovered from the host cell. The vector may be an expression vector for expressing a nucleic acid sequence of interest in a compatible host cell. Suitably, the coding sequence for the protein of interest is operably linked to a control sequence (e.g. a promoter or enhancer) capable of providing for expression of the coding sequence for the protein of interest in the host cell. The term "operably linked" means that the components being described are in a relationship that allows them to function in their intended manner. Regulatory sequences operably linked to the coding sequence of the protein of interest are ligated in such a way that expression of the nucleic acid sequence of interest is achieved under conditions compatible with the control sequences.
The vector may be transformed or transfected into a suitable host cell to provide for expression of the protein. The process may comprise culturing a host cell transformed with an expression vector under conditions that provide for expression of the vector encoding the nucleic acid sequence of interest of the protein, and optionally recovering the expressed protein.
The vector may be, for example, a plasmid or viral vector provided with an origin of replication, optionally a promoter for expression of the nucleic acid sequence of interest and optionally a regulator of the promoter. In the case of bacterial plasmids, the vector may contain one or more selectable marker genes, such as the kanamycin resistance gene.
Methods well known to those skilled in the art can be used to construct expression vectors containing a DNA sequence encoding a protein of the invention and appropriate transcription/translation control signals. Commercially available vectors are preferred: bacterial plasmids, bacteriophages, yeast plasmids, plant cell viruses, mammalian cell viruses such as adenoviruses and retroviruses, or other vectors. These methods include in vitro recombinant DNA techniques, DNA synthesis techniques, in vivo recombinant techniques, and the like. The DNA sequence may be operably linked to a suitable promoter in an expression vector to direct mRNA synthesis. Representative examples of such promoters are: the lac or trp promoter of E. coli; the PL promoter of lambda phage; and eukaryotic promoters, including the CMV early promoter, the HSV thymidine kinase promoter, the early and late SV40 promoters, retroviral LTRs, and other known promoters capable of controlling the expression of genes in prokaryotic or eukaryotic cells or viruses. Expression vectors also include a ribosome binding site for translation initiation and a transcription terminator. In higher eukaryotes, transcription can be enhanced by inserting enhancer sequences into the vector. Enhancers are cis-acting DNA elements, usually about 10-300 bp, that act on a promoter to increase gene transcription; one example is the adenovirus enhancer. In addition, the expression vector preferably comprises one or more selectable marker genes to provide a phenotypic trait for selection of transformed host cells.
The present invention also provides a recombinant vector comprising the DNA sequences of the protein of interest of the present invention, a mutant lysyl-tRNA synthetase gene, and optionally a tRNA. In a preferred embodiment, the recombinant vector comprises a multiple cloning site or at least one restriction enzyme cleavage site downstream of the promoter. When the target gene is to be expressed, it is ligated into a suitable multiple cloning site or cleavage site so that the target gene is operably linked to the promoter.
In another preferred embodiment, the recombinant vector comprises in the 5' to 3' direction: a promoter, a gene of interest, and a terminator. If desired, the recombinant vector may further comprise the following elements: a protein purification tag; a 3' polyadenylation signal; an untranslated nucleic acid sequence; transport and targeting nucleic acid sequences; selection markers (antibiotic resistance genes, fluorescent proteins, etc.); an enhancer; or an operator.
Methods for preparing recombinant vectors are well known to those of ordinary skill in the art. The expression vector may be a bacterial plasmid, phage, yeast plasmid, plant cell virus, mammalian cell virus or other vector, and in a preferred embodiment, the expression vector may be pET, pCW, pUC, pPIC9k, pMA5 or other vectors. In general, any plasmid and vector may be used as long as it can replicate and is stable in the host.
One of ordinary skill in the art can construct vectors containing the promoter and/or gene sequence of interest of the present invention using well known methods. These methods include in vitro recombinant DNA techniques, DNA synthesis techniques, in vivo recombinant techniques, and the like.
The expression vector of the present invention can be used to transform an appropriate host cell so that the host transcribes the target RNA or expresses the target protein. The host cell may be a prokaryotic cell, such as E. coli, C. glutamicum, Brevibacterium flavum, Streptomyces, or Agrobacterium; a lower eukaryotic cell, such as a yeast cell; or a higher eukaryotic cell, such as a plant cell (preferably rape, tobacco, or soybean), an insect cell (such as Drosophila S2 or Sf9), or an animal cell (such as CHO, COS, or Bowes melanoma cells). In a preferred embodiment, the expression host may be E. coli, B. subtilis, Pichia pastoris, Streptomyces, or another host cell. It will be clear to one of ordinary skill in the art how to select an appropriate vector and host cell. Transformation of a host cell with recombinant DNA can be carried out using conventional techniques well known to those skilled in the art. When the host is a prokaryote (e.g., Escherichia coli), CaCl2 treatment or electroporation may be used. When the host is a eukaryote, the following DNA transfection methods may be used: calcium phosphate coprecipitation, or conventional mechanical methods (e.g., microinjection, electroporation, liposome encapsulation, etc.). The cells are grown or cultured by methods known to those skilled in the art, chosen according to the host cell. For example, microbial cells are usually cultured at a temperature of 0 to 100 ℃, preferably 10 to 60 ℃, in the presence of oxygen. The culture medium contains a carbon source, such as glucose; a nitrogen source, usually in the form of organic nitrogen, such as yeast extract or amino acids; salts, such as ammonium sulfate; trace elements, such as iron and magnesium salts; and vitamins if desired. During cultivation, the pH of the medium may be held at a fixed value (controlled) or left uncontrolled. The culture may be carried out as a batch culture, a semi-continuous culture, or a continuous culture.
After culturing, the cells are collected and then either disrupted or used directly. Plants may be transformed by methods such as Agrobacterium-mediated transformation or biolistic transformation, for example, the leaf disc method, immature embryo transformation, or the flower bud soaking method. The transformed plant cells, tissues, or organs can be regenerated into plants by conventional methods to obtain transgenic plants.
The obtained transformant can be cultured by a conventional method to express the target protein of the present invention. The medium used in the culture may be selected from various conventional media depending on the host cell used. The culturing is performed under conditions suitable for growth of the host cell. After the host cells have been grown to an appropriate cell density, the selected promoter is induced by suitable means (e.g., temperature shift or chemical induction) and the cells are cultured for an additional period of time.
The recombinant polypeptide in the above method may be expressed intracellularly or on the cell membrane, or secreted extracellularly. If necessary, the recombinant protein can be isolated and purified by various separation methods using its physical, chemical and other properties. These methods are well known to those skilled in the art. Examples of such methods include, but are not limited to: conventional renaturation treatment, treatment with a protein precipitant (such as salt precipitation), centrifugation, cell lysis by osmosis, sonication, ultracentrifugation, molecular sieve chromatography (gel filtration), adsorption chromatography, ion exchange chromatography, High Performance Liquid Chromatography (HPLC), and other various liquid chromatography techniques, and combinations thereof.
The term "operably linked" means that the gene of interest to be transcribed and expressed is linked to its control sequences in a manner conventional in the art so that it can be expressed.
Two plasmid system
The present invention provides a dual plasmid system comprising:
(1) a first plasmid comprising a first expression cassette for expression of a protein of interest, the first expression cassette comprising a first coding sequence encoding the protein of interest, the first coding sequence comprising non-natural codons for introduction of a predetermined modified amino acid, the non-natural codons being UAG (amber), UAA (ochre), or UGA (opal); and
(2) a second plasmid comprising a second expression cassette for expression of an aminoacyl-tRNA synthetase;
and, the system further comprises a third expression cassette encoding an artificial tRNA, wherein the artificial tRNA comprises an anticodon corresponding to the unnatural codon, wherein the third expression cassette is located in the first plasmid and/or the second plasmid;
and said aminoacyl-tRNA synthetase specifically catalyzes the formation of an "artificial tRNA-Xa" complex by said artificial tRNA, wherein Xa is said predetermined modified amino acid in aminoacyl form.
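The relationship between the unnatural (stop) codon and the anticodon of the artificial tRNA can be illustrated with a short sketch. It assumes plain antiparallel Watson-Crick pairing, so the anticodon written 5' to 3' is the reverse complement of the codon; the function name `anticodon_for` is hypothetical and is not part of the invention.

```python
# Illustrative sketch only: derive the anticodon (written 5'->3') that pairs
# with a given stop codon, assuming antiparallel Watson-Crick base pairing.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Stop codons named in the text as candidate "non-natural codons".
STOP_CODONS = {"UAG": "amber", "UAA": "ochre", "UGA": "opal"}


def anticodon_for(codon: str) -> str:
    """Return the reverse complement of an RNA codon, written 5'->3'."""
    codon = codon.upper()
    if len(codon) != 3 or any(base not in RNA_COMPLEMENT for base in codon):
        raise ValueError(f"not a valid RNA codon: {codon!r}")
    return "".join(RNA_COMPLEMENT[base] for base in reversed(codon))


if __name__ == "__main__":
    for codon, name in STOP_CODONS.items():
        print(f"{codon} ({name}) -> anticodon {anticodon_for(codon)}")
```

For the amber codon UAG this gives CUA, which is consistent with the name pylTcua used for the tRNA in Example 2.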
In another preferred embodiment, the non-natural codon is UAG (amber) or UGA (opal).
In another preferred embodiment, the codon comprises a three base nucleotide sequence corresponding to amino acids on mRNA or DNA.
In another preferred embodiment, the predetermined modified amino acid is lysine with a modification group.
In another preferred embodiment, the modified amino acid is selected from the group consisting of: an alkynyloxycarbonyl lysine derivative, a tert-Butoxycarbonyl (BOC) -lysine derivative, a fatty acylated lysine derivative, or a combination thereof.
In another preferred embodiment, the third expression cassette is located in a second plasmid.
In another preferred embodiment, the second expression cassette comprises a second coding sequence that encodes an aminoacyl-tRNA synthetase.
In another preferred embodiment, the first plasmid is an expression vector selected from the group consisting of: pBAD-His ABC, pBAD/His ABC, pET28a, pETDuet-1.
In another preferred embodiment, the first plasmid further comprises a resistance gene, a tag sequence, a repressor gene (araC), a promoter gene (araBAD), or a combination thereof.
In another preferred embodiment, the resistance gene is selected from the group consisting of: an ampicillin resistance gene (AmpR), a chloramphenicol resistance gene (CmR), a kanamycin resistance gene (KanaR), a tetracycline resistance gene (TetR), or a combination thereof.
In another preferred embodiment, the first expression cassette further comprises a first promoter, preferably the first promoter is an inducible promoter.
In another preferred embodiment, the first promoter is selected from the group consisting of: arabinose promoter (AraBAD), lactose promoter (Plac), pLacUV5 promoter, pTac promoter, or a combination thereof.
In another preferred embodiment, said first expression cassette comprises, in order from 5 'to 3', a promoter, a ribosome binding site RBS, said first coding sequence, a terminator or a tag sequence.
In another preferred embodiment, the second plasmid is a pEvol-pBpF vector.
In another preferred embodiment, the second plasmid further comprises a resistance gene, a repressor gene (araC), a promoter gene (araBAD), a tag sequence, or a combination thereof.
In another preferred embodiment, the resistance gene is selected from the group consisting of: an ampicillin resistance gene (AmpR), a chloramphenicol resistance gene (CmR), a kanamycin resistance gene (KanaR), a tetracycline resistance gene (TetR), or a combination thereof.
In another preferred embodiment, the second expression cassette further comprises a second promoter, preferably the second promoter is an inducible promoter.
In another preferred embodiment, the second promoter is selected from the group consisting of: arabinose promoter (AraBAD), glnS promoter, proK promoter, or a combination thereof.
In another preferred embodiment, the second expression cassette comprises, in order from 5 'to 3', a promoter (araBAD), a ribosome binding site RBS, the second coding sequence, and a terminator (rrnB).
In another preferred embodiment, the third expression cassette further comprises a third promoter, preferably the third promoter is a constitutive promoter.
In another preferred embodiment, the third promoter is the reverse transcription promoter proK.
In another preferred embodiment, the third expression cassette comprises, in order from 5 'to 3', a promoter, a ribosome binding site RBS, an artificial tRNA coding sequence, a terminator or a tag sequence.
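As a reading aid, the layout described above can be written down as a small data model. The class names, field names, and the `check_system` helper are hypothetical. The check encodes two points from the text (the protein cassette is on the first plasmid and the synthetase cassette on the second; the tRNA cassette is on the first and/or second plasmid) plus one assumption drawn from common practice rather than from the text: the two plasmids carry different resistance markers so that both can be selected for at once.

```python
from dataclasses import dataclass, field


@dataclass
class Cassette:
    name: str       # "protein", "synthetase" or "tRNA" (labels chosen for this sketch)
    promoter: str
    product: str


@dataclass
class Plasmid:
    name: str
    resistance: str
    cassettes: list = field(default_factory=list)

    def has(self, cassette_name: str) -> bool:
        return any(c.name == cassette_name for c in self.cassettes)


def check_system(first: Plasmid, second: Plasmid) -> list:
    """Return a list of problems; an empty list means the layout is consistent."""
    problems = []
    if not first.has("protein"):
        problems.append("first plasmid lacks the protein-of-interest cassette")
    if not second.has("synthetase"):
        problems.append("second plasmid lacks the aminoacyl-tRNA synthetase cassette")
    if not (first.has("tRNA") or second.has("tRNA")):
        problems.append("no artificial tRNA cassette on either plasmid")
    if first.resistance == second.resistance:
        # Assumption of this sketch, not a requirement stated in the text.
        problems.append("both plasmids share one resistance marker")
    return problems


first = Plasmid("pBAD-based", "KanaR",
                [Cassette("protein", "araBAD", "protein of interest")])
second = Plasmid("pEvol-based", "CmR",
                 [Cassette("synthetase", "araBAD", "aminoacyl-tRNA synthetase"),
                  Cassette("tRNA", "proK", "artificial tRNA")])
print(check_system(first, second))  # prints []
```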
Wild-type lysyl-tRNA synthetase
As used herein, "wild-type lysyl-tRNA synthetase" and "wild-type enzyme pylRs" refer to a naturally occurring aminoacyl-tRNA synthetase that has not been artificially engineered. Its nucleotide sequence can be obtained by genetic engineering techniques, such as genomic sequencing or polymerase chain reaction (PCR), and its amino acid sequence can be deduced from the nucleotide sequence. The source of the wild-type lysyl-tRNA synthetase is not particularly limited; preferred sources include, but are not limited to, Methanosarcina mazei, Methanosarcina barkeri, and Methanosarcina acetivorans.
In a preferred embodiment of the invention, the amino acid sequence of the wild-type lysyl-tRNA synthetase is shown in SEQ ID No. 1.
MDKKPLNTLISATGLWMSRTGTIHKIKHHEVSRSKIYIEMACGDHLVVNNSRSSRTARALRHHKYRKTCKRCRVSDEDLNKFLTKANEDQTSVKVKVVSAPTRTKKAMPKSVARAPKPLENTEAAQAQPSGSKFSPAIPVSTQESVSVPASVSTSISSISTGATASALVKGNTNPITSMSAPVQASAPALTKSQTDRLEVLLNPKDEISLNSGKPFRELESELLSRRKKDLQQIYAEERENYLGKLEREITRFFVDRGFLEIKSPILIPLEYIERMGIDNDTELSKQIFRVDKNFCLRPMLAPNLYNYLRKLDRALPDPIKIFEIGPCYRKESDGKEHLEEFTMLNFCQMGSGCTRENLESIITDFLNHLGIDFKIVGDSCMVYGDTLDVMHGDLELSSAVVGPIPLDREWGIDKPWIGAGFGLERLLKVKHDFKNIKRAARSESYYNGISTNL(SEQ ID NO.:1)
In a preferred embodiment of the invention, the amino acid sequence of the wild-type lysyl-tRNA synthetase is shown in SEQ ID No. 2.
MDKKPLNTLISATGLWMSRTGTIHKIKHHEVSRSKIYIEMACGDHLVVNNSRSSRTARALRHHKYRKTCKRCRVSDEDLNKFLTKANEDQTSVKVKVVSAPTRTKKAMPKSVARAPKPLENTEAAQAQPSGSKFSPAIPVSTQESVSVPASVSTSISSISTGATASALVKGNTNPITSMSAPVQASAPALTKSQTDRLEVLLNPKDEISLNSGKPFRELESELLSRRKKDLQQIYAEERENYLGKLEREITRFFVDRGFLEIKSPILIPLEYIERMGIDNDTELSKQIFRVDKNFCLRPMLAPNLYNYLRKLDRALPDPIKIFEIGPCYRKESDGKEHLEEFTMLNFCQMGSGCTRENLESIITDFLNHLGIDFKIVGDSCMVYGDTLDVMHGDLELSSAVVGPIPLDREWGIDKPWIGAGFGLERLLKVKHDFKNIKRAARSGSYYNGISTNL(SEQ ID NO.:2)
Mutant lysyl-tRNA synthetases
As used herein, the terms "mutein", "mutein of the invention", "mutated aminoacyl-tRNA synthetase of the invention", "mutated lysyl-tRNA synthetase", "mutant enzyme", "mutant of aminoacyl-tRNA synthetase", all used interchangeably, refer to a mutated aminoacyl-tRNA synthetase that does not occur naturally, and the mutated aminoacyl-tRNA synthetase is a protein that has been artificially engineered from a polypeptide as set forth in SEQ ID No.1 or SEQ ID No. 2. In particular, the mutant aminoacyl-tRNA synthetase is as described in the first aspect of the invention.
It is understood that the numbering of amino acids in the mutant lysyl-tRNA synthetases of the invention is based on the wild-type lysyl-tRNA synthetase (preferably, SEQ ID NO: 1 or SEQ ID NO: 2). When a particular mutein has 80% or more homology to the sequence shown in SEQ ID No. 1 or SEQ ID No. 2, its amino acid numbering may be offset relative to that of SEQ ID No. 1 or SEQ ID No. 2, e.g., by 1-5 positions towards the N-terminus or C-terminus. Using sequence alignment techniques conventional in the art, one of ordinary skill in the art will generally appreciate that such an offset is within a reasonable range, and that muteins having 80% (e.g., 90%, 95%, 98%) homology and the same or similar aminoacyl-tRNA synthetase activity should not be excluded from the scope of the muteins of the invention merely because of the offset in amino acid numbering.
The muteins of the present invention are synthetic or recombinant proteins, i.e., they may be chemically synthesized products or produced using recombinant techniques from prokaryotic or eukaryotic hosts (e.g., bacteria, yeast, plants). Depending on the host used in the recombinant production protocol, the muteins of the invention may be glycosylated or may be non-glycosylated. The mutant proteins of the present invention may or may not also include an initial methionine residue.
The invention also includes fragments, derivatives and analogues of the muteins. As used herein, the terms "fragment," "derivative," and "analog" refer to a protein that retains substantially the same biological function or activity as the mutein.
The mutein fragment, derivative or analogue of the invention may be (i) a mutein wherein one or more conserved or non-conserved amino acid residues, preferably conserved amino acid residues, are substituted, and such substituted amino acid residues may or may not be encoded by the genetic code, or (ii) a mutein having a substituent group in one or more amino acid residues, or (iii) a mutein wherein the mature mutein is fused to another compound, such as a compound that extends the half-life of the mutein, e.g. polyethylene glycol, or (iv) a mutein wherein an additional amino acid sequence is fused to the mutein sequence, such as a leader or secretory sequence or a sequence used to purify the mutein or a proprotein sequence, or a fusion protein with an antigenic IgG fragment. Such fragments, derivatives and analogs are within the purview of those skilled in the art in view of the teachings herein. In the present invention, conservatively substituted amino acids are preferably generated by amino acid substitutions according to Table I.
TABLE I
Initial residue(s) Representative substitutions Preferred substitutions
Ala(A) Val;Leu;Ile Val
Arg(R) Lys;Gln;Asn Lys
Asn(N) Gln;His;Lys;Arg Gln
Asp(D) Glu Glu
Cys(C) Ser Ser
Gln(Q) Asn Asn
Glu(E) Asp Asp
Gly(G) Pro;Ala Ala
His(H) Asn;Gln;Lys;Arg Arg
Ile(I) Leu;Val;Met;Ala;Phe Leu
Leu(L) Ile;Val;Met;Ala;Phe Ile
Lys(K) Arg;Gln;Asn Arg
Met(M) Leu;Phe;Ile Leu
Phe(F) Leu;Val;Ile;Ala;Tyr Leu
Pro(P) Ala Ala
Ser(S) Thr Thr
Thr(T) Ser Ser
Trp(W) Tyr;Phe Tyr
Tyr(Y) Trp;Phe;Thr;Ser Phe
Val(V) Ile;Leu;Met;Phe;Ala Leu
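Table I can be used as a simple lookup. The sketch below transcribes the table into a dictionary keyed by one-letter code and classifies a substitution as "preferred", "representative", or "not listed"; the function name `classify` and these three labels are choices made for this illustration only.

```python
# Table I transcribed as: initial residue -> (representative substitutions, preferred substitution)
TABLE_I = {
    "A": ("VLI", "V"),   "R": ("KQN", "K"),   "N": ("QHKR", "Q"),  "D": ("E", "E"),
    "C": ("S", "S"),     "Q": ("N", "N"),     "E": ("D", "D"),     "G": ("PA", "A"),
    "H": ("NQKR", "R"),  "I": ("LVMAF", "L"), "L": ("IVMAF", "I"), "K": ("RQN", "R"),
    "M": ("LFI", "L"),   "F": ("LVIAY", "L"), "P": ("A", "A"),     "S": ("T", "T"),
    "T": ("S", "S"),     "W": ("YF", "Y"),    "Y": ("WFTS", "F"),  "V": ("ILMFA", "L"),
}


def classify(initial: str, replacement: str) -> str:
    """Classify a substitution against Table I."""
    representative, preferred = TABLE_I[initial.upper()]
    replacement = replacement.upper()
    if replacement == preferred:
        return "preferred"
    if replacement in representative:
        return "representative"
    return "not listed"


for mutation in ("R19K", "H29K", "T122S", "Y384F"):
    print(mutation, classify(mutation[0], mutation[-1]))
```

Applied to the mutations named in Example 2, this lookup reports R19K, T122S, and Y384F as preferred substitutions and H29K as a representative one.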
Recognition of the amino acid substrate by PylRS depends on the three-dimensional structure of its catalytic domain. Wild-type PylRS can activate lysine derivatives only up to a limited size, so a lysine derivative bearing a large functional group cannot be introduced into a protein. Mutating sites in PylRS improves the result, either by relieving steric hindrance with the bound substrate or through interaction between the mutated amino acid and the substrate amino acid or its main chain portion.
Preferably, the mutein is as shown in any one of SEQ ID No. 3-9.
MDKKPLNTLISATGLWMSHTGTIHKVKHREVSRSKIYIEMACGDHLVVNNSRSSRTARALRHHKYRKTCKRCRVSDEDLNKFLTKANEDQTSVKVKVVSAPTRTKKAMPKSVARAPKPLENTEAAQAQPSGSKFSPAIPVSTQESVSVPASVSTSISSISTGATASALVKGNTNPITSMSAPVQASAPALTKSQTDRLEVLLNPKDEISLNSGKPFRELESELLSRRKKDLQQIYAEERENYLGKLEREITRFFVDRGFLEIKSPILIPLEYIERMGIDNDTELSKQIFRVDKNFCLRPMLAPNLYNYLRKLDRALPDPIKIFEIGPCYRKESDGKEHLEEFTMLNFCQMGSGCTRENLESIITDFLNHLGIDFKIVGDSCMVYGDTLDVMHGDLELSSAVVGPIPLDREWGIDKPWIGAGFGLERLLKVKHDFKNIKRAARSESYYNGISTNL(SEQ ID NO.:3)
MDKKPLNTLISATGLWMSHTGTIHKIKHREVSRSKIYIEMACGDHLVVNNSRSSRTARALRHHKYRKTCKRCRVSDEDLNKFLTKANEDQTSVKVKVVSAPTRTKKAMPKSVARAPKPLENSEAAQAQPSGSKFSPAIPVSTQESVSVPASVSTSISSISTGATASALVKGNTNPITSMSAPVQASAPALTKSQTDRLEVLLNPKDEISLNSGKPFRELESELLSRRKKDLQQIYAEERENYLGKLEREITRFFVDRGFLEIKSPILIPLEYIERMGIDNDTELSKQIFRVDKNFCLRPMLAPNLYNYLRKLDRALPDPIKIFEIGPCYRKESDGKEHLEEFTMLNFCQMGSGCTRENLESIITDFLNHLGIDFKIVGDSCMVFGDTLDVMHGDLELSSAVVGPIPLDREWGIDKPWIGAGFGLERLLKVKHDFKNIKRAARSESYYNGISTNL(SEQ ID NO.:4)
MDKKPLNTLISATGLWMSHTGTIHKIKHREVSRSKIYIEMACGDHLVVNNSRSSRTARALRHHKYRKTCKRCRVSDEDLNKFLTKANEDQTSVKVKVVSAPTRTKKAMPKSVARAPKPLENTEAAQAQPSGSKFSPAIPVSTQESVSVPASVSTSISSISTGATASALVKGNTNPITSMSAPVQASAPALTKSQTDRLEVLLNPKDEISLNSGKPFRELESELLSRRKKDLQQIYAEERENYLGKLEREITRFFVDRGFLEIKSPILIPLEYIERMGIDNDTELSKQIFRVDKNFCLRPMLAPNLYNYARKLDRALPDPIKIFEIGPCYRKESDGKEHLEEFTMLNFSQMGSGCTRENLESIITDFLNHLGIDFKIVGDSCMVYGDTLDVMHGDLELSSAVVGPIPLDREWGIDKPWIGAGFGLERLLKVKHDFKNIKRAARSESYYNGISTNL(SEQ ID NO.:5)
MDKKPLNTLISATGLWMSHTGTIHKIKHREVSRSKIYIEMACGDHLVVNNSRSSRTARALRHHKYRKTCKRCRVSDEDLNKFLTKANEDQTSVKVKVVSAPTRTKKAMPKSVARAPKPLENTEAAQAQPSGSKFSPAIPVSTQESVSVPASVSTSISSISTGATASALVKGNTNPITSMSAPVQASAPALTKSQTDRLEVLLNPKDEISLNSGKPFRELESELLSRRKKDLQQIYAEERENYLGKLEREITRFFVDRGFLEIKSPILIPLEYIERMGIDNDTELSKQIFRVDKNFCLRPMLAPNLYNYLRKLDRALPDPIKIFEIGPCYRKESDGKEHLEEFTMLNFCQMGSGCTRENLESIITDFLNHLGIDFKIVGDSCMVYGDTLDVMHGDLELSSAVVGPIPLDREWGIDKPWIGAGFGLERLLKVKHDFKNIKRAARSESYYNGISTNL(SEQ ID NO.:6)
MDKKPLNTLISATGLWMSHTGTIHKIKHHEVSRSKIYIEMACGDHLVVNNSRSSRTARALRHHKYRKTCKRCRVSDEDLNKFLTKANEDQTSVKVKVVSAPTRTKKAMPKSVARAPKPLENTEAAQAQPSGSKFSPAIPVSTQESVSVPASVSTSISSISTGATASALVKGNTNPITSMSAPVQASAPALTKSQTDRLEVLLNPKDEISLNSGKPFRELESELLSRRKKDLQQIYAEERENYLGKLEREITRFFVDRGFLEIKSPILIPLEYIERMGIDNDTELSKQIFRVDKNFCLRPMLAPNLYNYLRKLDRALPDPIKIFEIGPCYRKESDGKEHLEEFTMLNFCQMGSGCTRENLESIITDFLNHLGIDFKIVGDSCMVYGDTLDVMHGDLELSSAVVGPIPLDREWGIDKPWIGAGFGLERLLKVKHDFKNIKRAARSESYYNGISTNL(SEQ ID NO.:7)
MDKKPLNTLISATGLWMSRTGTIHKIKHREVSRSKIYIEMACGDHLVVNNSRSSRTARALRHHKYRKTCKRCRVSDEDLNKFLTKANEDQTSVKVKVVSAPTRTKKAMPKSVARAPKPLENTEAAQAQPSGSKFSPAIPVSTQESVSVPASVSTSISSISTGATASALVKGNTNPITSMSAPVQASAPALTKSQTDRLEVLLNPKDEISLNSGKPFRELESELLSRRKKDLQQIYAEERENYLGKLEREITRFFVDRGFLEIKSPILIPLEYIERMGIDNDTELSKQIFRVDKNFCLRPMLAPNLYNYLRKLDRALPDPIKIFEIGPCYRKESDGKEHLEEFTMLNFCQMGSGCTRENLESIITDFLNHLGIDFKIVGDSCMVYGDTLDVMHGDLELSSAVVGPIPLDREWGIDKPWIGAGFGLERLLKVKHDFKNIKRAARSESYYNGISTNL(SEQ ID NO.:8)
MDKKPLNTLISATGLWMSKTGTIHKIKHKEVSRSKIYIEMACGDHLVVNNSRSSRTARALRHHKYRKTCKRCRVSDEDLNKFLTKANEDQTSVKVKVVSAPTRTKKAMPKSVARAPKPLENTEAAQAQPSGSKFSPAIPVSTQESVSVPASVSTSISSISTGATASALVKGNTNPITSMSAPVQASAPALTKSQTDRLEVLLNPKDEISLNSGKPFRELESELLSRRKKDLQQIYAEERENYLGKLEREITRFFVDRGFLEIKSPILIPLEYIERMGIDNDTELSKQIFRVDKNFCLRPMLAPNLYNYLRKLDRALPDPIKIFEIGPCYRKESDGKEHLEEFTMLNFCQMGSGCTRENLESIITDFLNHLGIDFKIVGDSCMVYGDTLDVMHGDLELSSAVVGPIPLDREWGIDKPWIGAGFGLERLLKVKHDFKNIKRAARSESYYNGISTNL(SEQ ID NO.:9)
It is understood that the muteins of the invention generally have high homology (identity) to the sequence shown in SEQ ID No. 1 or SEQ ID No. 2. Preferably, said muteins have a homology of at least 80%, preferably at least 85% to 90%, more preferably at least 95%, more preferably at least 98%, and most preferably at least 99% to the sequence shown in SEQ ID No. 1 or SEQ ID No. 2.
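For sequences of equal length that differ only by point substitutions, identity and the list of substitutions can be computed position by position. The following is a minimal sketch under that assumption; it performs no alignment, so sequences with insertions or deletions would first need a conventional alignment tool. The function names are hypothetical, and the inputs are the first 30 residues of SEQ ID NO: 1 and SEQ ID NO: 9 as printed above.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def substitutions(wild_type: str, mutant: str) -> list:
    """List differences as <wild-type residue><1-based position><mutant residue>."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must be of equal length")
    return [
        f"{w}{i}{m}"
        for i, (w, m) in enumerate(zip(wild_type, mutant), start=1)
        if w != m
    ]


# First 30 residues of SEQ ID NO: 1 (wild type) and SEQ ID NO: 9 (mutein).
WT_PREFIX = "MDKKPLNTLISATGLWMSRTGTIHKIKHHE"
MUT_PREFIX = "MDKKPLNTLISATGLWMSKTGTIHKIKHKE"

print(substitutions(WT_PREFIX, MUT_PREFIX))                # ['R19K', 'H29K']
print(round(percent_identity(WT_PREFIX, MUT_PREFIX), 1))   # 93.3
```

Numbering positions from 1 on the wild-type sequence matches the convention stated for the muteins, where positions refer to SEQ ID NO: 1 or SEQ ID NO: 2.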
In addition, the mutant protein can be modified. Modified (generally without altering primary structure) forms include: chemically derivatized forms of the mutein such as acetylation or carboxylation, in vivo or in vitro. Modifications also include glycosylation, such as those resulting from glycosylation modifications during synthesis and processing of the mutein or during further processing steps. Such modification may be accomplished by exposing the mutein to an enzyme that performs glycosylation, such as mammalian glycosylase or deglycosylase. Modified forms also include sequences having phosphorylated amino acid residues (e.g., phosphotyrosine, phosphoserine, phosphothreonine). Also included are muteins which have been modified to increase their resistance to proteolysis or to optimize solubility.
The term "polynucleotide encoding a mutant lysyl-tRNA synthetase" can include a polynucleotide that encodes a mutant lysyl-tRNA synthetase of the invention, and can also include additional coding and/or non-coding sequences.
The invention also relates to variants of the above polynucleotides which encode polypeptides or muteins having the same amino acid sequence as those of the present invention, or fragments, analogs, and derivatives thereof. These nucleotide variants include substitution variants, deletion variants, and insertion variants. As is known in the art, an allelic variant is an alternative form of a polynucleotide, which may involve a substitution, deletion, or insertion of one or more nucleotides without substantially altering the function of the mutein it encodes.
The present invention also relates to polynucleotides which hybridize to the sequences described above and which have at least 50%, preferably at least 70%, and more preferably at least 80% identity between the two sequences. The present invention particularly relates to polynucleotides hybridizable under stringent conditions with the polynucleotides of the present invention. In the present invention, "stringent conditions" mean: (1) hybridization and elution at lower ionic strength and higher temperature, such as 0.2× SSC, 0.1% SDS, 60 ℃; or (2) addition of a denaturant during hybridization, such as 50% (v/v) formamide, 0.1% calf serum/0.1% Ficoll, 42 ℃, etc.; or (3) hybridization that occurs only when the identity between the two sequences is at least 90%, preferably at least 95%.
The muteins and polynucleotides of the present invention are preferably provided in isolated form, and more preferably, purified to homogeneity.
The full-length sequence of the polynucleotide of the present invention can be obtained by PCR amplification, recombination, or artificial synthesis. For PCR amplification, primers can be designed based on the nucleotide sequences disclosed herein, particularly open reading frame sequences, and the sequences can be amplified using commercially available cDNA libraries or cDNA libraries prepared by conventional methods known to those skilled in the art as templates. When the sequence is long, two or more PCR amplifications are often required, and then the amplified fragments are spliced together in the correct order.
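The orientation of a primer pair designed from an open reading frame can be sketched as follows. This is a simplified illustration, not the primer design used in the invention: it takes the first bases of the sense strand as the forward primer and the reverse complement of the last bases as the reverse primer, and it ignores melting temperature, GC content, and added restriction sites. The primer length of 18 and the function names are arbitrary choices, and the template is simply the first 36 nucleotides of the A1-u4-u5-TEV-R-MiniINS coding sequence given in Example 1, used here as a stand-in.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(DNA_COMPLEMENT)[::-1]


def simple_primers(template: str, length: int = 18) -> tuple:
    """Return (forward, reverse) primers, both written 5'->3'."""
    template = template.upper()
    if len(template) < length:
        raise ValueError("template is shorter than the requested primer length")
    forward = template[:length]                       # same sense as the template
    reverse = reverse_complement(template[-length:])  # anneals to the 3' end
    return forward, reverse


# Stand-in template: first 36 nt of the coding sequence from Example 1.
FRAGMENT = "ATGGTTAGCAAAGGTGAAGAACTGTTTACCGGCGTT"
forward, reverse = simple_primers(FRAGMENT)
print(forward)  # ATGGTTAGCAAAGGTGAA
print(reverse)  # AACGCCGGTAAACAGTTC
```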
Once the sequence of interest has been obtained, it can be obtained in large quantities by recombinant methods. This is usually done by cloning it into a vector, transferring it into a cell, and isolating the relevant sequence from the propagated host cell by conventional methods.
In addition, the sequence can be produced by artificial synthesis, especially when the fragment is short. Generally, long fragments are obtained by first synthesizing a number of small fragments and then ligating them.
At present, DNA sequences encoding the proteins of the present invention (or fragments or derivatives thereof) can be obtained entirely by chemical synthesis. The DNA sequence may then be introduced into various existing DNA molecules (e.g., vectors) and cells known in the art. Furthermore, mutations can also be introduced into the protein sequences of the invention by chemical synthesis.
Methods for amplifying DNA/RNA using PCR techniques are preferably used to obtain the polynucleotides of the invention. In particular, when it is difficult to obtain a full-length cDNA from a library, the RACE method (rapid amplification of cDNA ends) is preferably used. Primers for PCR can be appropriately selected based on the sequence information of the present invention disclosed herein and synthesized by conventional methods. The amplified DNA/RNA fragments can be isolated and purified by conventional methods, such as gel electrophoresis.
The technical scheme of the invention has the following beneficial effects:
(1) The invention introduces the unnatural amino acid directly during protein synthesis by means of a dual plasmid system, and has the advantages of low cost, high yield, and low environmental pollution.
(2) Only the amino acid bearing the modification group needs to be chemically synthesized, which saves the time otherwise spent searching for a synthetic route in chemical peptide synthesis; compared with organic synthesis, both the modification efficiency and the final product yield are high.
(3) The mutant lysyl-tRNA synthetases of the invention can increase the amount of unnatural amino acid inserted and the amount of a protein of interest that contains an unnatural amino acid, as compared to a wild-type lysyl-tRNA synthetase. In addition, the mutant lysyl-tRNA synthetase of the invention can also improve the stability of target proteins, so that the target proteins are less prone to breakage. Mutation of part of the sequence of the lysyl-tRNA synthetase can promote its soluble expression, which is more favorable for separating and purifying target proteins in different expression forms.
The invention is further illustrated with reference to specific embodiments. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. Experimental procedures for which no specific conditions are noted in the following examples generally follow conventional conditions, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989), or the manufacturer's recommendations. Unless otherwise indicated, percentages and parts are by weight.
Example 1 construction of expression vector for target protein
A DNA sequence (SEQ ID NO: 12) of A1-u4-u5-TEV-R-MiniINS was synthesized based on the amino acid sequence (SEQ ID NO: 11) of the fusion protein A1-u4-u5-TEV-R-MiniINS according to the codon preference of E.coli, and cloned into the NcoI-XhoI site downstream of the araBAD promoter of the expression vector plasmid pBAD/His A (purchased from NTCC, kanamycin resistance). The original tag gene (6 × His) of the expression vector plasmid pBAD/His A was not retained.
Amino acid sequence of A1-u4-u5-TEV-R-MiniINS (SEQ ID NO: 11)
MVSKGEELFTGVYVQERTISFKDTYKTRAEVKFEGDENLYFQGRFVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN*
DNA sequence 1 of A1-u4-u5-TEV-R-MiniINS (SEQ ID NO: 12)
ATGGTTAGCAAAGGTGAAGAACTGTTTACCGGCGTTTATGTGCAGGAACGTACCATTAGCTTCAAAGATACCTATAAAACCCGTGCGGAAGTTAAATTTGAAGGCGATGAAAACCTGTATTTTCAGGGACGTTTCGTTAACCAACACCTGTGCGGCAGCCACCTGGTAGAGGCACTGTATCTGGTTTGTGGTGAACGTGGCTTCTTCTATACTCCGTAGACTCGTGGTATCGTGGAACAGTGTTGCACTTCTATTTGCTCTCTGTATCAGCTGGAAAATTACTGTAAT
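The nucleotide sequence printed above contains an in-frame TAG codon, which appears to correspond to the lysine (K) in the "TPKT" stretch of the amino acid sequence. A short sketch of how such in-frame stop codons can be located is given below; the function name `in_frame_stops` is hypothetical, and the input is an in-frame excerpt of the sequence rather than the full sequence.

```python
# DNA spellings of the stop codons discussed in the text.
STOP_CODONS = {"TAG": "amber", "TAA": "ochre", "TGA": "opal"}


def in_frame_stops(cds: str) -> list:
    """Return (1-based codon number, codon) for every in-frame stop codon.

    The sequence is read in frame 0, i.e. its first base is taken to be the
    first base of a codon.
    """
    cds = cds.upper()
    hits = []
    for start in range(0, len(cds) - 2, 3):
        codon = cds[start:start + 3]
        if codon in STOP_CODONS:
            hits.append((start // 3 + 1, codon))
    return hits


# In-frame excerpt of the sequence above: GGT GAA CGT GGC TTC TTC TAT ACT CCG TAG ...
EXCERPT = "GGTGAACGTGGCTTCTTCTATACTCCGTAGACTCGTGGTATCGTGGAACAG"
print(in_frame_stops(EXCERPT))  # [(10, 'TAG')]
```

Codon numbers here count from the start of the excerpt, not from the start of the full coding sequence.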
Sequence 1 was excised from the cloning vector pUC57-A1-u4-u5-TEV-R-MiniINS with the restriction enzymes NcoI and XhoI, and the expression vector plasmid pBAD/His A was cut with the same enzymes. The fragments were separated by nucleic acid electrophoresis, recovered with an agarose gel DNA recovery kit, and ligated using T4 DNA ligase. The ligation product was transformed chemically (CaCl2 method) into E. coli Top10 competent cells, which were cultured overnight at 37 ℃ on LB agar medium (10 g/L yeast peptone, 5 g/L yeast extract, 10 g/L NaCl, 1.5% agar) containing kanamycin. Single viable colonies were picked and cultured overnight at 37 ℃ and 220 rpm in liquid LB medium (10 g/L yeast peptone, 5 g/L yeast extract, 10 g/L NaCl) containing kanamycin. The plasmid was extracted with a plasmid miniprep kit and named pBAD-A1-u4-u5-TEV-R-MiniINS. The plasmid map is shown in FIG. 1.
Example 2 construction of lysyl-tRNA synthetase plasmid
The DNA sequence of the wild-type lysyl-tRNA synthetase pylRs was synthesized according to the codon preference of E. coli based on the amino acid sequence of pylRs (SEQ ID NO.: 1) and cloned into the SpeI-SalI site downstream of the araBAD promoter of the expression vector plasmid pEvol-pBpF (available from NTCC, chloramphenicol resistance). The SpeI cleavage site was added by PCR; the SalI site was already present in the vector. The glnS promoter originally present in the expression vector plasmid pEvol-pBpF was retained. The DNA sequence (SEQ ID NO: 10) of the tRNA (pylTcua) corresponding to the lysyl-tRNA synthetase was inserted by PCR downstream of the proK promoter of the expression vector plasmid pEvol-pBpF. This plasmid was designated pEvol-pylRs-pylT.
The mutant lysyl-tRNA synthetase pylRs (R19K, H29K, T122S, Y384F), whose amino acid sequence is shown in SEQ ID NO.: 9, was obtained by introducing the mutations R19K, H29K, T122S, and Y384F into the amino acid sequence of the wild-type lysyl-tRNA synthetase pylRs (SEQ ID NO.: 1). The DNA sequence of pylRs (R19K, H29K, T122S, Y384F) was synthesized according to the codon usage of E. coli. Sequence 6 was excised from the cloning vector pUC57-pylRs (R19K, H29K, T122S, Y384F) with the restriction enzymes SpeI and SalI, and plasmid pEvol-pylRs-pylT was cut with the same enzymes (the desired DNA fragment being its 4.3 kb large fragment). The fragments were separated by nucleic acid electrophoresis, recovered with an agarose gel DNA recovery kit, ligated with T4 DNA ligase, and transformed chemically (CaCl2 method) into E. coli Top10 competent cells. The transformed cells were cultured overnight at 37 ℃ on LB agar medium (10 g/L yeast peptone, 5 g/L yeast extract, 10 g/L NaCl, 1.5% agar) containing chloramphenicol. Single viable colonies were picked and cultured overnight at 37 ℃ and 220 rpm in liquid LB medium (10 g/L yeast peptone, 5 g/L yeast extract, 10 g/L NaCl) containing chloramphenicol. The plasmid was extracted with a plasmid miniprep kit and designated pEvol-pylRs (R19K, H29K, T122S, Y384F)-pylT. The plasmid map is shown in FIG. 2.
Example 3 Strain construction and high Density expression of Tert-Butoxycarbonyl (Boc) modified fusion proteins
A DNA sequence (SEQ ID NO: 14) of GFP-TEV-R-MiniINS was synthesized based on the amino acid sequence (SEQ ID NO: 13) of the fusion protein GFP-TEV-R-MiniINS and the codon preference of Escherichia coli, and cloned into an expression vector plasmid pBAD/His A, and the resulting plasmid was named pBAD-GFP-TEV-R-MiniINS, in the same manner as described in example 1.
Amino acid sequence of GFP-TEV-R-MiniINS (SEQ ID NO.: 13), 301 aa
MVSKGEELFTGVVPILVELDGDVNGHKFSVRGEGEGDATNGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPDHMKRHDFFKSAMPEGYVQERTISFKDDGTYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNFNSHNVYITADKQKNGIKANFKIRHNVEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSVLSKDPNEKRDHMVLLEFVTAAGITHGMDELYAGSENLYFQGRFVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN*
DNA sequence of GFP-TEV-R-MiniINS (SEQ ID NO.: 14), 906 bp
ATGGTTAGCAAAGGTGAAGAACTGTTTACCGGCGTTGTGCCGATTCTGGTGGAACTGGATGGTGATGTGAATGGCCATAAATTTAGCGTTCGTGGCGAAGGCGAAGGTGATGCGACCAACGGTAAACTGACCCTGAAATTTATTTGCACCACCGGTAAACTGCCGGTTCCGTGGCCGACCCTGGTGACCACCCTGACCTATGGCGTTCAGTGCTTTAGCCGCTATCCGGATCATATGAAACGCCATGATTTCTTTAAAAGCGCGATGCCGGAAGGCTATGTGCAGGAACGTACCATTAGCTTCAAAGATGATGGCACCTATAAAACCCGTGCGGAAGTTAAATTTGAAGGCGATACCCTGGTGAACCGCATTGAACTGAAAGGTATTGATTTTAAAGAAGATGGCAACATTCTGGGTCATAAACTGGAATATAATTTCAACAGCCATAATGTGTATATTACCGCCGATAAACAGAAAAATGGCATCAAAGCGAACTTTAAAATCCGTCACAACGTGGAAGATGGTAGCGTGCAGCTGGCGGATCATTATCAGCAGAATACCCCGATTGGTGATGGCCCGGTGCTGCTGCCGGATAATCATTATCTGAGCACCCAGAGCGTTCTGAGCAAAGATCCGAATGAAAAACGTGATCATATGGTGCTGCTGGAATTTGTTACCGCCGCGGGCATTACCCACGGTATGGATGAACTGTATGCGGGCAGCGAAAACCTGTATTTTCAGGGACGTTTCGTTAACCAACACCTGTGCGGCAGCCACCTGGTAGAGGCACTGTATCTGGTTTGTGGTGAACGTGGCTTCTTCTATACTCCGTAGACTCGTGGTATCGTGGAACAGTGTTGCACTTCTATTTGCTCTCTGTATCAGCTGGAAAATTACTGTAATTAA
Plasmid pEvol-pylRs-pylT, plasmid pBAD-GFP-TEV-R-MiniINS, plasmid pEvol-pylRs (R19K, H29K, T122S, Y384F)-pylT and plasmid pBAD-A1-u4-u5-TEV-R-MiniINS were combined in pairs as shown in Table 1, and each pair was co-transformed by the chemical method (CaCl2 method) into E. coli Top10 competent cells (purchased from Thermo). The transformed cells were cultured overnight at 37 ℃ on LB agar medium (10 g/L yeast peptone, 5 g/L yeast extract, 10 g/L NaCl, 1.5% agar) containing 25 μg/mL kanamycin and 17 μg/mL chloramphenicol. Single viable colonies were picked and cultured overnight at 37 ℃ and 220 rpm in liquid LB medium (10 g/L yeast peptone, 5 g/L yeast extract, 10 g/L NaCl) containing 25 μg/mL kanamycin and 17 μg/mL chloramphenicol. Glycerol was added to a final concentration of 20% to preserve the strains.
TABLE 1 expression Strain construction
Figure BDA0002000213430000181
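The pairwise combination of plasmids can be sketched as follows. Table 1 is only available as an image, so this example assumes that every pEvol plasmid (synthetase and tRNA) is paired with every pBAD plasmid (target fusion protein); the actual pairing and strain names in Table 1 may differ, and the numbering printed here is purely illustrative.

```python
from itertools import product

# Plasmids named in the text; the full cross below is an assumption.
pevol_plasmids = [
    "pEvol-pylRs-pylT",
    "pEvol-pylRs (R19K, H29K, T122S, Y384F)-pylT",
]
pbad_plasmids = [
    "pBAD-GFP-TEV-R-MiniINS",
    "pBAD-A1-u4-u5-TEV-R-MiniINS",
]

# One candidate expression strain per (pEvol, pBAD) pair.
strains = list(product(pevol_plasmids, pbad_plasmids))
for number, (synthetase_plasmid, target_plasmid) in enumerate(strains, start=1):
    print(f"strain {number}: {synthetase_plasmid} + {target_plasmid}")
```

Each resulting pair carries two plasmids, which is consistent with the double selection on kanamycin and chloramphenicol described above.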
Each strain was inoculated into liquid LB medium and cultured overnight at 37 ℃ and 220 rpm. The overnight culture was inoculated at 1% (v/v) into tank fermentation medium (12 g/L yeast peptone, 24 g/L yeast extract powder, 4 mL/L glycerol, 12.8 g/L disodium hydrogen phosphate, 3 g/L potassium dihydrogen phosphate and 0.3% defoaming agent) and cultured at 35 ± 3 ℃, 200-1000 rpm and an air flow of 2-6 L/min. After 3-10 h of culture, a feed medium containing glycerol and yeast peptone was fed at a stepwise rate until the end of fermentation. When the OD600 reached 25-80, L-ara (final concentration 0.25%) and Boc-Lys (final concentration 5 mM) were added for induction. Culture was continued until the OD600 reached 180-220, at which point the tank was discharged. The cells were then collected by centrifugation (5000 rpm, 30 min, 25 ℃). Expression of the fusion protein containing Boc-modified lysine in whole cells of each strain was examined by SDS-polyacrylamide gel electrophoresis.
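The inoculation and induction amounts above scale linearly with the working volume, as the following sketch shows. It rests on several assumptions that are not stated in the text: Boc-Lys is taken to be the free amino acid with formula C11H22N2O4, the 0.25% L-ara concentration is read as weight per volume, and the 10 L working volume is only an example. The function names are hypothetical.

```python
# Standard atomic weights (abridged values).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# Assumed formula of Boc-Lys as the free amino acid: C11H22N2O4.
BOC_LYS_FORMULA = {"C": 11, "H": 22, "N": 2, "O": 4}


def molar_mass(formula: dict[str, int]) -> float:
    """Average molar mass in g/mol from an element-count dictionary."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def fermentation_additions(volume_l: float) -> dict[str, float]:
    """Amounts needed for a given working volume (litres)."""
    return {
        # 1% (v/v) inoculum, expressed in mL.
        "inoculum_ml": 0.01 * volume_l * 1000,
        # 5 mM Boc-Lys final concentration, expressed in grams.
        "boc_lys_g": 0.005 * molar_mass(BOC_LYS_FORMULA) * volume_l,
        # 0.25% L-ara, assumed w/v (0.25 g per 100 mL), expressed in grams.
        "l_ara_g": 0.25 * 10 * volume_l,
    }


additions = fermentation_additions(10.0)
for name, amount in additions.items():
    print(f"{name}: {amount:.2f}")
```

The calculation ignores the volume change caused by feeding, so the amounts should be treated as a first estimate based on the volume at the time of induction.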
The results are shown in FIG. 3. With the same target fusion protein, the mutant enzyme pylRs (R19K, H29K, T122S, Y384F) expressed a significantly higher amount of Boc fusion protein than the wild-type enzyme pylRs.
The fusion protein is expressed as insoluble "inclusion bodies". To release the inclusion bodies, the E.coli cells were disrupted with a high-pressure homogenizer. Nucleic acids, cell debris and soluble proteins were removed by centrifugation at 10000 g. The inclusion bodies containing the fusion protein were washed with pure water, and the resulting inclusion body precipitates were used as a raw material for folding.
To refold the fusion protein, the inclusion bodies were dissolved in a 7.5 M urea solution (pH 10.5) containing 2-10 mM mercaptoethanol, to a total protein concentration of 10-25 mg/mL after dissolution. The sample was diluted 5- to 10-fold and folded conventionally for 16-30 hours at 4-8 ℃ and pH 10.5-11.7. The fusion protein was then digested with trypsin and carboxypeptidase B for 10-20 hours at 18-25 ℃ with the pH maintained at 8.0-9.5, and the digestion was stopped by adding 0.45 M ammonium sulfate. Reversed-phase HPLC analysis showed that the yield of this enzymatic step was higher than 90%. The insulin analogue obtained after digestion with trypsin and carboxypeptidase B was designated Boc-lysine insulin. Boc-lysine insulin was not enzymatically cleaved under the above conditions. The sample was clarified by membrane filtration and initially purified by hydrophobic chromatography using 0.45 mM ammonium sulfate as buffer; the purity determined by SDS-polyacrylamide gel electrophoresis reached 90%. MALDI-TOF mass spectrometry of the resulting Boc-human insulin gave a molecular weight consistent with the theoretical molecular weight of 5907.7 Da. The sample eluted from hydrophobic chromatography was collected, hydrochloric acid was added to deprotect the Boc-human insulin, sodium hydroxide solution was added to adjust the pH to 2.8-3.2 to terminate the reaction, and two steps of high-pressure reversed-phase chromatography gave recombinant human insulin with a yield higher than 85%.
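The theoretical molecular weight quoted above can be checked with a short calculation. This sketch assumes the commonly cited molecular formula of human insulin, C257H383N65O77S6, and treats the Boc group as replacing one hydrogen of a lysine side-chain amine, a net addition of C5H8O2. Neither assumption is stated in the text, and the atomic weights are abridged standard values.

```python
# Standard atomic weights (abridged values).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}

# Assumed formula of human insulin (not given in the text).
HUMAN_INSULIN = {"C": 257, "H": 383, "N": 65, "O": 77, "S": 6}

# Net change when a Boc group (C5H9O2) replaces one amine hydrogen.
BOC_NET_ADDITION = {"C": 5, "H": 8, "O": 2}


def average_mass(formula: dict[str, int]) -> float:
    """Average molecular mass in Da from an element-count dictionary."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


insulin_mass = average_mass(HUMAN_INSULIN)
boc_insulin_mass = insulin_mass + average_mass(BOC_NET_ADDITION)

print(f"human insulin:     {insulin_mass:.1f} Da")
print(f"Boc-human insulin: {boc_insulin_mass:.1f} Da")
```

Under these assumptions the computed mass of the Boc-modified species agrees with the 5907.7 Da given in the text to within about 0.1 Da, and the mass shift of roughly 100 Da relative to unmodified insulin is what distinguishes the two species in the MALDI-TOF spectrum.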
TABLE 2 production of human insulin by each expression strain
Figure BDA0002000213430000191
As a result, it was found that when a target protein containing Boc-lysine was prepared using the mutant enzyme of the present invention and the protein promotion element (A1-u4-u5) in the fusion protein, the amount of unnatural amino acid insertion and the amount of the target protein containing unnatural amino acid could be significantly increased.
Example 4
The mutant lysyl-tRNA synthetase pylRs (R19H), whose amino acid sequence is shown in SEQ ID NO.: 7, was obtained by introducing the mutation R19H into the amino acid sequence of the wild-type lysyl-tRNA synthetase pylRs (SEQ ID NO.: 1). The DNA sequence of pylRs (R19H) was synthesized according to the codon preference of E. coli.
The mutant lysyl-tRNA synthetase pylRs (H29R), whose amino acid sequence is shown in SEQ ID NO.: 8, was obtained by introducing the mutation H29R into the amino acid sequence of the wild-type lysyl-tRNA synthetase pylRs (SEQ ID NO.: 1). The DNA sequence of pylRs (H29R) was synthesized according to the codon preference of E. coli.
The mutant lysyl-tRNA synthetase pylRs (R19H, H29R), whose amino acid sequence is shown in SEQ ID NO.: 6, was obtained by introducing the mutations R19H and H29R into the amino acid sequence of the wild-type lysyl-tRNA synthetase pylRs (SEQ ID NO.: 1). The DNA sequence of pylRs (R19H, H29R) was synthesized according to the codon preference of E. coli.
The mutant lysyl-tRNA synthetase pylRs (R19H, I26V, H29R), whose amino acid sequence is shown in SEQ ID NO.: 3, was obtained by introducing the mutations R19H, I26V and H29R into the amino acid sequence of the wild-type lysyl-tRNA synthetase pylRs (SEQ ID NO.: 1). The DNA sequence of pylRs (R19H, I26V, H29R) was synthesized according to the codon preference of E. coli.
The mutant lysyl-tRNA synthetase pylRs (R19H, H29R, T122S, Y384F), whose amino acid sequence is shown in SEQ ID NO.: 4, was obtained by introducing the mutations R19H, H29R, T122S and Y384F into the amino acid sequence of the wild-type lysyl-tRNA synthetase pylRs (SEQ ID NO.: 1). The DNA sequence of pylRs (R19H, H29R, T122S, Y384F) was synthesized according to the codon preference of E. coli.
The mutant lysyl-tRNA synthetase pylRs (R19H, H29R, L309A, C348S), whose amino acid sequence is shown in SEQ ID NO.: 5, was obtained by introducing the mutations R19H, H29R, L309A and C348S into the amino acid sequence of the wild-type lysyl-tRNA synthetase pylRs (SEQ ID NO.: 1). The DNA sequence of pylRs (R19H, H29R, L309A, C348S) was synthesized according to the codon preference of E. coli.
According to the method of Example 2, plasmids pEvol-pylRs (R19H)-pylT, pEvol-pylRs (H29R)-pylT, pEvol-pylRs (R19H, H29R)-pylT, pEvol-pylRs (R19H, I26V, H29R)-pylT, pEvol-pylRs (R19H, H29R, T122S, Y384F)-pylT and pEvol-pylRs (R19H, H29R, L309A, C348S)-pylT were constructed by replacing the DNA sequence of pylRs (R19K, H29K, T122S, Y384F) with the DNA sequence of the corresponding mutant pylRs.
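The mutation notation used in this example (wild-type residue, position, new residue) can be applied to a sequence programmatically. The sketch below uses the first 32 residues of SEQ ID NO.: 1, converted to one-letter code, and checks that each position really carries the stated wild-type residue before substituting it. The function name `apply_mutations` is hypothetical and not part of the patent.

```python
import re


def apply_mutations(sequence: str, mutations: list[str]) -> str:
    """Apply point mutations such as 'R19H' (1-based positions) to a protein sequence."""
    residues = list(sequence)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"unrecognised mutation: {mutation}")
        wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{mutation}: expected {wild_type} at {position}, found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Residues 1-32 of the wild-type pylRs (SEQ ID NO.: 1) in one-letter code.
WILD_TYPE_N_TERMINUS = "MDKKPLNTLISATGLWMSRTGTIHKIKHHEVS"

print(apply_mutations(WILD_TYPE_N_TERMINUS, ["R19H", "I26V", "H29R"]))
```

The result can be compared with the corresponding residues of SEQ ID NO.: 3 in the sequence listing (Met Ser His Thr Gly Thr Ile His Lys Val Lys His Arg Glu Val Ser for positions 17 to 32). Mutations at positions beyond residue 32, such as T122S or Y384F, require the full-length sequence.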
Example 5 Strain construction and high-density expression of tert-butoxycarbonyl (Boc)-modified fusion proteins
Plasmids pEvol-pylRs-pylT, pEvol-pylRs (R19H)-pylT, pEvol-pylRs (H29R)-pylT, pEvol-pylRs (R19H, H29R)-pylT, pEvol-pylRs (R19H, I26V, H29R)-pylT, pEvol-pylRs (R19H, H29R, T122S, Y384F)-pylT and pEvol-pylRs (R19H, H29R, L309A, C348S)-pylT were each co-transformed with the insulin fusion protein expression vector pBAD-INS (kanamycin resistance) by the chemical method (CaCl2 method) into E. coli Top10 competent cells (purchased from Thermo). The transformed cells were cultured overnight at 37 ℃ on LB agar medium (10 g/L yeast peptone, 5 g/L yeast extract, 10 g/L NaCl, 1.5% agar) containing 25 μg/mL kanamycin and 17 μg/mL chloramphenicol. Single viable colonies were picked and cultured overnight at 37 ℃ and 220 rpm in liquid LB medium (10 g/L yeast peptone, 5 g/L yeast extract, 10 g/L NaCl) containing 25 μg/mL kanamycin and 17 μg/mL chloramphenicol. Glycerol was added to a final concentration of 20% to preserve the strains.
Each strain was inoculated into liquid LB medium and cultured overnight at 37 ℃ and 220 rpm. The overnight culture was inoculated at 1% (v/v) into tank fermentation medium (12 g/L yeast peptone, 24 g/L yeast extract powder, 4 mL/L glycerol, 12.8 g/L disodium hydrogen phosphate, 3 g/L potassium dihydrogen phosphate and 0.3% defoaming agent) and cultured at 35 ± 3 ℃, 200-1000 rpm and an air flow of 2-6 L/min. After 3-10 h of culture, a feed medium containing glycerol and yeast peptone was fed at a stepwise rate until the end of fermentation. When the OD600 reached 25-80, L-ara (final concentration 0.25%) and Boc-Lys (final concentration 5 mM) were added for induction. Culture was continued until the OD600 reached 180-220, at which point the tank was discharged. The cells were then collected by centrifugation (5000 rpm, 30 min, 25 ℃). Expression of the fusion protein containing Boc-modified lysine in whole cells of each strain was examined by SDS-polyacrylamide gel electrophoresis.
The fusion protein is expressed as insoluble "inclusion bodies". To release the inclusion bodies, the E.coli cells were disrupted with a high-pressure homogenizer. Nucleic acids, cell debris and soluble proteins were removed by centrifugation at 10000 g. The inclusion bodies containing the fusion protein were washed with pure water, and the resulting inclusion body precipitates were used as a raw material for folding.
The expression levels of the fusion proteins of the different mutant enzymes are shown in the following table:
Enzyme                        Expression level of Boc-lysine fusion protein (g/L fermentation broth)
pylRs                         7.9
R19H                          8.9
H29R                          8.6
R19H, H29R                    9.4
R19H, I26V, H29R              11.4
R19H, H29R, T122S, Y384F      14.0
To refold the fusion protein, the inclusion bodies were dissolved in a 7.5 M urea solution (pH 10.5) containing 2-10 mM mercaptoethanol, to a total protein concentration of 10-25 mg/mL after dissolution. The sample was diluted 5- to 10-fold and folded conventionally for 16-30 hours at 4-8 ℃ and pH 10.5-11.7. The fusion protein was then digested with trypsin and carboxypeptidase B for 10-20 hours at 18-25 ℃ with the pH maintained at 8.0-9.5, and the digestion was stopped by adding 0.45 M ammonium sulfate. Reversed-phase HPLC analysis showed that the yield of this enzymatic step was higher than 90%. The insulin analogue obtained after digestion with trypsin and carboxypeptidase B was designated Boc-lysine insulin. Boc-lysine insulin was not enzymatically cleaved under the above conditions. The sample was clarified by membrane filtration and initially purified by hydrophobic chromatography using 0.45 mM ammonium sulfate as buffer; the purity determined by SDS-polyacrylamide gel electrophoresis reached 90%. MALDI-TOF mass spectrometry of the resulting Boc-human insulin gave a molecular weight consistent with the theoretical molecular weight of 5907.7 Da. The sample eluted from hydrophobic chromatography was collected, hydrochloric acid was added to deprotect the Boc-human insulin, sodium hydroxide solution was added to adjust the pH to 2.8-3.2 to terminate the reaction, and two steps of high-pressure reversed-phase chromatography gave recombinant human insulin with a yield higher than 85%.
The expression level of recombinant human insulin for different mutant enzymes is shown in the following table:
Enzyme                        Yield of Boc-human insulin (mg/L fermentation broth)
pylRs                         360
R19H                          440
H29R                          400
R19H, H29R                    450
R19H, I26V, H29R              580
R19H, H29R, T122S, Y384F      700
The results show that the mutant enzyme of the invention can be used for preparing the target protein containing Boc-lysine, and the insertion amount of the unnatural amino acid and the amount of the target protein containing the unnatural amino acid can be obviously improved.
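The size of the improvement can be made explicit by normalising each value to the wild-type enzyme. The sketch below uses only the figures from the two tables of this example; the function name `fold_over_wild_type` is hypothetical.

```python
# Values copied from the two tables of this example.
fusion_protein_g_per_l = {
    "pylRs": 7.9,
    "R19H": 8.9,
    "H29R": 8.6,
    "R19H,H29R": 9.4,
    "R19H,I26V,H29R": 11.4,
    "R19H,H29R,T122S,Y384F": 14.0,
}
boc_insulin_mg_per_l = {
    "pylRs": 360,
    "R19H": 440,
    "H29R": 400,
    "R19H,H29R": 450,
    "R19H,I26V,H29R": 580,
    "R19H,H29R,T122S,Y384F": 700,
}


def fold_over_wild_type(values: dict[str, float], wild_type: str = "pylRs") -> dict[str, float]:
    """Express every entry as a multiple of the wild-type value."""
    reference = values[wild_type]
    return {enzyme: value / reference for enzyme, value in values.items()}


fusion_fold = fold_over_wild_type(fusion_protein_g_per_l)
insulin_fold = fold_over_wild_type(boc_insulin_mg_per_l)
for enzyme in fusion_protein_g_per_l:
    print(f"{enzyme}: fusion x{fusion_fold[enzyme]:.2f}, insulin x{insulin_fold[enzyme]:.2f}")
```

For the quadruple mutant, this gives roughly a 1.8-fold increase in fusion protein expression and a 1.9-fold increase in Boc-human insulin yield relative to wild-type pylRs.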
Example 6 Strain construction and high-density expression of butynyloxycarbonyl-modified fusion proteins
Plasmids pEvol-pylRs-pylT, pEvol-pylRs (R19H)-pylT, pEvol-pylRs (H29R)-pylT, pEvol-pylRs (R19H, H29R)-pylT, pEvol-pylRs (R19H, I26V, H29R)-pylT, pEvol-pylRs (R19H, H29R, T122S, Y384F)-pylT and pEvol-pylRs (R19H, H29R, L309A, C348S)-pylT were each co-transformed with the insulin fusion protein expression vector pBAD-INS (a plasmid constructed by this company, kanamycin resistance) by the chemical method (CaCl2 method) into E. coli Top10 competent cells (purchased from Thermo). The transformed cells were cultured overnight at 37 ℃ on LB agar medium (10 g/L yeast peptone, 5 g/L yeast extract, 10 g/L NaCl, 1.5% agar) containing 25 μg/mL kanamycin and 17 μg/mL chloramphenicol. Single viable colonies were picked and cultured overnight at 37 ℃ and 220 rpm in liquid LB medium (10 g/L yeast peptone, 5 g/L yeast extract, 10 g/L NaCl) containing 25 μg/mL kanamycin and 17 μg/mL chloramphenicol. Glycerol was added to a final concentration of 20% to preserve the strains.
Each strain was inoculated into liquid LB medium and cultured overnight at 37 ℃ and 220 rpm. The overnight culture was inoculated at 1% (v/v) into fermentation medium (12 g/L yeast peptone, 24 g/L yeast extract powder, 4 mL/L glycerol, 12.8 g/L disodium hydrogen phosphate, 3 g/L potassium dihydrogen phosphate and 0.3‰ defoaming agent) and cultured at 35 ± 3 ℃, 200-1000 rpm and an air flow of 2-6 L/min. After 3-10 h of culture, a feed medium containing glycerol and yeast peptone was fed at a stepwise rate until the end of fermentation. When the OD600 reached 25-80, L-ara (final concentration 0.25%) and butynyloxycarbonyl-Lys (final concentration 5 mM) were added for induction. Culture was continued until the OD600 reached 180-220, at which point the tank was discharged. The cells were then collected by centrifugation (5000 rpm, 30 min, 25 ℃). Expression of the fusion protein containing butynyloxycarbonyl-modified lysine in whole cells of each strain was examined by SDS-polyacrylamide gel electrophoresis.
Enzyme                        Expression level of butynyloxycarbonyl-lysine fusion protein (g/L fermentation broth)
pylRs                         4.5
H29R                          5.0
R19H                          5.2
R19H, H29R                    6.8
R19H, I26V, H29R              6.7
R19H, H29R, L309A, C348S      8.4
The fusion protein is expressed as insoluble "inclusion bodies". To release the inclusion bodies, the E. coli cells were disrupted with a high-pressure homogenizer. Nucleic acids, cell debris and soluble proteins were removed by centrifugation at 10000 g. The inclusion bodies containing the fusion protein were washed with pure water, and the resulting inclusion body precipitates were used as the raw material for folding. To refold the fusion protein, the inclusion bodies were dissolved in a 7.5 M urea solution (pH 10.5) containing 2-10 mM mercaptoethanol, to a total protein concentration of 10-25 mg/mL after dissolution. The sample was diluted 5- to 10-fold and folded conventionally for 16-30 hours at 4-8 ℃ and pH 10.5-11.7. The fusion protein was then digested with trypsin and carboxypeptidase B for 10-20 hours at 18-25 ℃ with the pH maintained at 8.0-9.5, and the digestion was stopped by adding 0.45 M ammonium sulfate. Reversed-phase HPLC analysis showed that the yield of this enzymatic step was higher than 90%. The insulin analogue obtained after digestion with trypsin and carboxypeptidase B was designated butynyloxycarbonyl-lysine insulin. Butynyloxycarbonyl-lysine insulin was not enzymatically cleaved under the above conditions. The sample was clarified by membrane filtration and initially purified by hydrophobic chromatography using 0.45 mM ammonium sulfate as buffer; the purity determined by SDS-polyacrylamide gel electrophoresis reached 90%. MALDI-TOF mass spectrometry of the resulting butynyloxycarbonyl-human insulin gave a molecular weight consistent with the theoretical molecular weight of 5907.7 Da.
Enzyme                        Yield of butynyloxycarbonyl-human insulin (mg/L fermentation broth)
pylRs                         530
H29R                          590
R19H                          610
R19H, H29R                    810
R19H, I26V, H29R              790
R19H, H29R, L309A, C348S      1000
The result shows that the mutant enzyme of the invention is used for preparing the target protein modified by butynyloxycarbonyl, and the insertion amount of the unnatural amino acid and the amount of the target protein containing the unnatural amino acid can be obviously improved.
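The two tables of this example can also be combined into a single figure of merit: the amount of modified insulin recovered per gram of fusion protein expressed. This ratio is not reported in the text; it is computed here only as an illustration, from the tabulated values, and the variable names are hypothetical.

```python
# Values copied from the two tables of this example.
fusion_protein_g_per_l = {
    "pylRs": 4.5,
    "H29R": 5.0,
    "R19H": 5.2,
    "R19H,H29R": 6.8,
    "R19H,I26V,H29R": 6.7,
    "R19H,H29R,L309A,C348S": 8.4,
}
modified_insulin_mg_per_l = {
    "pylRs": 530,
    "H29R": 590,
    "R19H": 610,
    "R19H,H29R": 810,
    "R19H,I26V,H29R": 790,
    "R19H,H29R,L309A,C348S": 1000,
}

# mg of butynyloxycarbonyl-human insulin obtained per g of fusion protein.
insulin_mg_per_g_fusion = {
    enzyme: modified_insulin_mg_per_l[enzyme] / fusion_protein_g_per_l[enzyme]
    for enzyme in fusion_protein_g_per_l
}

for enzyme, ratio in insulin_mg_per_g_fusion.items():
    print(f"{enzyme}: {ratio:.1f} mg/g")
```

The ratio stays close to 118 mg/g across all enzymes, which suggests that the higher insulin yields follow from higher fusion protein expression rather than from a change in downstream recovery.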
All documents referred to herein are incorporated by reference into this application as if each were individually incorporated by reference. Furthermore, it should be understood that various changes and modifications of the present invention can be made by those skilled in the art after reading the above teachings of the present invention, and these equivalents also fall within the scope of the present invention as defined by the appended claims.
Sequence listing
<110> Ningbo Kunpeng Biotech Co Ltd
<120> introduction of unnatural amino acids into proteins using two-plasmid System
<130> P2019-0305
<160> 14
<170> PatentIn version 3.5
<210> 1
<211> 454
<212> PRT
<213> Methanosarcina mazei (Methanosarcina mazei)
<400> 1
Met Asp Lys Lys Pro Leu Asn Thr Leu Ile Ser Ala Thr Gly Leu Trp
1 5 10 15
Met Ser Arg Thr Gly Thr Ile His Lys Ile Lys His His Glu Val Ser
20 25 30
Arg Ser Lys Ile Tyr Ile Glu Met Ala Cys Gly Asp His Leu Val Val
35 40 45
Asn Asn Ser Arg Ser Ser Arg Thr Ala Arg Ala Leu Arg His His Lys
50 55 60
Tyr Arg Lys Thr Cys Lys Arg Cys Arg Val Ser Asp Glu Asp Leu Asn
65 70 75 80
Lys Phe Leu Thr Lys Ala Asn Glu Asp Gln Thr Ser Val Lys Val Lys
85 90 95
Val Val Ser Ala Pro Thr Arg Thr Lys Lys Ala Met Pro Lys Ser Val
100 105 110
Ala Arg Ala Pro Lys Pro Leu Glu Asn Thr Glu Ala Ala Gln Ala Gln
115 120 125
Pro Ser Gly Ser Lys Phe Ser Pro Ala Ile Pro Val Ser Thr Gln Glu
130 135 140
Ser Val Ser Val Pro Ala Ser Val Ser Thr Ser Ile Ser Ser Ile Ser
145 150 155 160
Thr Gly Ala Thr Ala Ser Ala Leu Val Lys Gly Asn Thr Asn Pro Ile
165 170 175
Thr Ser Met Ser Ala Pro Val Gln Ala Ser Ala Pro Ala Leu Thr Lys
180 185 190
Ser Gln Thr Asp Arg Leu Glu Val Leu Leu Asn Pro Lys Asp Glu Ile
195 200 205
Ser Leu Asn Ser Gly Lys Pro Phe Arg Glu Leu Glu Ser Glu Leu Leu
210 215 220
Ser Arg Arg Lys Lys Asp Leu Gln Gln Ile Tyr Ala Glu Glu Arg Glu
225 230 235 240
Asn Tyr Leu Gly Lys Leu Glu Arg Glu Ile Thr Arg Phe Phe Val Asp
245 250 255
Arg Gly Phe Leu Glu Ile Lys Ser Pro Ile Leu Ile Pro Leu Glu Tyr
260 265 270
Ile Glu Arg Met Gly Ile Asp Asn Asp Thr Glu Leu Ser Lys Gln Ile
275 280 285
Phe Arg Val Asp Lys Asn Phe Cys Leu Arg Pro Met Leu Ala Pro Asn
290 295 300
Leu Tyr Asn Tyr Leu Arg Lys Leu Asp Arg Ala Leu Pro Asp Pro Ile
305 310 315 320
Lys Ile Phe Glu Ile Gly Pro Cys Tyr Arg Lys Glu Ser Asp Gly Lys
325 330 335
Glu His Leu Glu Glu Phe Thr Met Leu Asn Phe Cys Gln Met Gly Ser
340 345 350
Gly Cys Thr Arg Glu Asn Leu Glu Ser Ile Ile Thr Asp Phe Leu Asn
355 360 365
His Leu Gly Ile Asp Phe Lys Ile Val Gly Asp Ser Cys Met Val Tyr
370 375 380
Gly Asp Thr Leu Asp Val Met His Gly Asp Leu Glu Leu Ser Ser Ala
385 390 395 400
Val Val Gly Pro Ile Pro Leu Asp Arg Glu Trp Gly Ile Asp Lys Pro
405 410 415
Trp Ile Gly Ala Gly Phe Gly Leu Glu Arg Leu Leu Lys Val Lys His
420 425 430
Asp Phe Lys Asn Ile Lys Arg Ala Ala Arg Ser Glu Ser Tyr Tyr Asn
435 440 445
Gly Ile Ser Thr Asn Leu
450
<210> 2
<211> 454
<212> PRT
<213> Methanosarcina mazei (Methanosarcina mazei)
<400> 2
Met Asp Lys Lys Pro Leu Asn Thr Leu Ile Ser Ala Thr Gly Leu Trp
1 5 10 15
Met Ser Arg Thr Gly Thr Ile His Lys Ile Lys His His Glu Val Ser
20 25 30
Arg Ser Lys Ile Tyr Ile Glu Met Ala Cys Gly Asp His Leu Val Val
35 40 45
Asn Asn Ser Arg Ser Ser Arg Thr Ala Arg Ala Leu Arg His His Lys
50 55 60
Tyr Arg Lys Thr Cys Lys Arg Cys Arg Val Ser Asp Glu Asp Leu Asn
65 70 75 80
Lys Phe Leu Thr Lys Ala Asn Glu Asp Gln Thr Ser Val Lys Val Lys
85 90 95
Val Val Ser Ala Pro Thr Arg Thr Lys Lys Ala Met Pro Lys Ser Val
100 105 110
Ala Arg Ala Pro Lys Pro Leu Glu Asn Thr Glu Ala Ala Gln Ala Gln
115 120 125
Pro Ser Gly Ser Lys Phe Ser Pro Ala Ile Pro Val Ser Thr Gln Glu
130 135 140
Ser Val Ser Val Pro Ala Ser Val Ser Thr Ser Ile Ser Ser Ile Ser
145 150 155 160
Thr Gly Ala Thr Ala Ser Ala Leu Val Lys Gly Asn Thr Asn Pro Ile
165 170 175
Thr Ser Met Ser Ala Pro Val Gln Ala Ser Ala Pro Ala Leu Thr Lys
180 185 190
Ser Gln Thr Asp Arg Leu Glu Val Leu Leu Asn Pro Lys Asp Glu Ile
195 200 205
Ser Leu Asn Ser Gly Lys Pro Phe Arg Glu Leu Glu Ser Glu Leu Leu
210 215 220
Ser Arg Arg Lys Lys Asp Leu Gln Gln Ile Tyr Ala Glu Glu Arg Glu
225 230 235 240
Asn Tyr Leu Gly Lys Leu Glu Arg Glu Ile Thr Arg Phe Phe Val Asp
245 250 255
Arg Gly Phe Leu Glu Ile Lys Ser Pro Ile Leu Ile Pro Leu Glu Tyr
260 265 270
Ile Glu Arg Met Gly Ile Asp Asn Asp Thr Glu Leu Ser Lys Gln Ile
275 280 285
Phe Arg Val Asp Lys Asn Phe Cys Leu Arg Pro Met Leu Ala Pro Asn
290 295 300
Leu Tyr Asn Tyr Leu Arg Lys Leu Asp Arg Ala Leu Pro Asp Pro Ile
305 310 315 320
Lys Ile Phe Glu Ile Gly Pro Cys Tyr Arg Lys Glu Ser Asp Gly Lys
325 330 335
Glu His Leu Glu Glu Phe Thr Met Leu Asn Phe Cys Gln Met Gly Ser
340 345 350
Gly Cys Thr Arg Glu Asn Leu Glu Ser Ile Ile Thr Asp Phe Leu Asn
355 360 365
His Leu Gly Ile Asp Phe Lys Ile Val Gly Asp Ser Cys Met Val Tyr
370 375 380
Gly Asp Thr Leu Asp Val Met His Gly Asp Leu Glu Leu Ser Ser Ala
385 390 395 400
Val Val Gly Pro Ile Pro Leu Asp Arg Glu Trp Gly Ile Asp Lys Pro
405 410 415
Trp Ile Gly Ala Gly Phe Gly Leu Glu Arg Leu Leu Lys Val Lys His
420 425 430
Asp Phe Lys Asn Ile Lys Arg Ala Ala Arg Ser Gly Ser Tyr Tyr Asn
435 440 445
Gly Ile Ser Thr Asn Leu
450
<210> 3
<211> 454
<212> PRT
<213> Artificial sequence (artificial sequence)
<400> 3
Met Asp Lys Lys Pro Leu Asn Thr Leu Ile Ser Ala Thr Gly Leu Trp
1 5 10 15
Met Ser His Thr Gly Thr Ile His Lys Val Lys His Arg Glu Val Ser
20 25 30
Arg Ser Lys Ile Tyr Ile Glu Met Ala Cys Gly Asp His Leu Val Val
35 40 45
Asn Asn Ser Arg Ser Ser Arg Thr Ala Arg Ala Leu Arg His His Lys
50 55 60
Tyr Arg Lys Thr Cys Lys Arg Cys Arg Val Ser Asp Glu Asp Leu Asn
65 70 75 80
Lys Phe Leu Thr Lys Ala Asn Glu Asp Gln Thr Ser Val Lys Val Lys
85 90 95
Val Val Ser Ala Pro Thr Arg Thr Lys Lys Ala Met Pro Lys Ser Val
100 105 110
Ala Arg Ala Pro Lys Pro Leu Glu Asn Thr Glu Ala Ala Gln Ala Gln
115 120 125
Pro Ser Gly Ser Lys Phe Ser Pro Ala Ile Pro Val Ser Thr Gln Glu
130 135 140
Ser Val Ser Val Pro Ala Ser Val Ser Thr Ser Ile Ser Ser Ile Ser
145 150 155 160
Thr Gly Ala Thr Ala Ser Ala Leu Val Lys Gly Asn Thr Asn Pro Ile
165 170 175
Thr Ser Met Ser Ala Pro Val Gln Ala Ser Ala Pro Ala Leu Thr Lys
180 185 190
Ser Gln Thr Asp Arg Leu Glu Val Leu Leu Asn Pro Lys Asp Glu Ile
195 200 205
Ser Leu Asn Ser Gly Lys Pro Phe Arg Glu Leu Glu Ser Glu Leu Leu
210 215 220
Ser Arg Arg Lys Lys Asp Leu Gln Gln Ile Tyr Ala Glu Glu Arg Glu
225 230 235 240
Asn Tyr Leu Gly Lys Leu Glu Arg Glu Ile Thr Arg Phe Phe Val Asp
245 250 255
Arg Gly Phe Leu Glu Ile Lys Ser Pro Ile Leu Ile Pro Leu Glu Tyr
260 265 270
Ile Glu Arg Met Gly Ile Asp Asn Asp Thr Glu Leu Ser Lys Gln Ile
275 280 285
Phe Arg Val Asp Lys Asn Phe Cys Leu Arg Pro Met Leu Ala Pro Asn
290 295 300
Leu Tyr Asn Tyr Leu Arg Lys Leu Asp Arg Ala Leu Pro Asp Pro Ile
305 310 315 320
Lys Ile Phe Glu Ile Gly Pro Cys Tyr Arg Lys Glu Ser Asp Gly Lys
325 330 335
Glu His Leu Glu Glu Phe Thr Met Leu Asn Phe Cys Gln Met Gly Ser
340 345 350
Gly Cys Thr Arg Glu Asn Leu Glu Ser Ile Ile Thr Asp Phe Leu Asn
355 360 365
His Leu Gly Ile Asp Phe Lys Ile Val Gly Asp Ser Cys Met Val Tyr
370 375 380
Gly Asp Thr Leu Asp Val Met His Gly Asp Leu Glu Leu Ser Ser Ala
385 390 395 400
Val Val Gly Pro Ile Pro Leu Asp Arg Glu Trp Gly Ile Asp Lys Pro
405 410 415
Trp Ile Gly Ala Gly Phe Gly Leu Glu Arg Leu Leu Lys Val Lys His
420 425 430
Asp Phe Lys Asn Ile Lys Arg Ala Ala Arg Ser Glu Ser Tyr Tyr Asn
435 440 445
Gly Ile Ser Thr Asn Leu
450
<210> 4
<211> 454
<212> PRT
<213> Artificial sequence (artificial sequence)
<400> 4
Met Asp Lys Lys Pro Leu Asn Thr Leu Ile Ser Ala Thr Gly Leu Trp
1 5 10 15
Met Ser His Thr Gly Thr Ile His Lys Ile Lys His Arg Glu Val Ser
20 25 30
Arg Ser Lys Ile Tyr Ile Glu Met Ala Cys Gly Asp His Leu Val Val
35 40 45
Asn Asn Ser Arg Ser Ser Arg Thr Ala Arg Ala Leu Arg His His Lys
50 55 60
Tyr Arg Lys Thr Cys Lys Arg Cys Arg Val Ser Asp Glu Asp Leu Asn
65 70 75 80
Lys Phe Leu Thr Lys Ala Asn Glu Asp Gln Thr Ser Val Lys Val Lys
85 90 95
Val Val Ser Ala Pro Thr Arg Thr Lys Lys Ala Met Pro Lys Ser Val
100 105 110
Ala Arg Ala Pro Lys Pro Leu Glu Asn Ser Glu Ala Ala Gln Ala Gln
115 120 125
Pro Ser Gly Ser Lys Phe Ser Pro Ala Ile Pro Val Ser Thr Gln Glu
130 135 140
Ser Val Ser Val Pro Ala Ser Val Ser Thr Ser Ile Ser Ser Ile Ser
145 150 155 160
Thr Gly Ala Thr Ala Ser Ala Leu Val Lys Gly Asn Thr Asn Pro Ile
165 170 175
Thr Ser Met Ser Ala Pro Val Gln Ala Ser Ala Pro Ala Leu Thr Lys
180 185 190
Ser Gln Thr Asp Arg Leu Glu Val Leu Leu Asn Pro Lys Asp Glu Ile
195 200 205
Ser Leu Asn Ser Gly Lys Pro Phe Arg Glu Leu Glu Ser Glu Leu Leu
210 215 220
Ser Arg Arg Lys Lys Asp Leu Gln Gln Ile Tyr Ala Glu Glu Arg Glu
225 230 235 240
Asn Tyr Leu Gly Lys Leu Glu Arg Glu Ile Thr Arg Phe Phe Val Asp
245 250 255
Arg Gly Phe Leu Glu Ile Lys Ser Pro Ile Leu Ile Pro Leu Glu Tyr
260 265 270
Ile Glu Arg Met Gly Ile Asp Asn Asp Thr Glu Leu Ser Lys Gln Ile
275 280 285
Phe Arg Val Asp Lys Asn Phe Cys Leu Arg Pro Met Leu Ala Pro Asn
290 295 300
Leu Tyr Asn Tyr Leu Arg Lys Leu Asp Arg Ala Leu Pro Asp Pro Ile
305 310 315 320
Lys Ile Phe Glu Ile Gly Pro Cys Tyr Arg Lys Glu Ser Asp Gly Lys
325 330 335
Glu His Leu Glu Glu Phe Thr Met Leu Asn Phe Cys Gln Met Gly Ser
340 345 350
Gly Cys Thr Arg Glu Asn Leu Glu Ser Ile Ile Thr Asp Phe Leu Asn
355 360 365
His Leu Gly Ile Asp Phe Lys Ile Val Gly Asp Ser Cys Met Val Phe
370 375 380
Gly Asp Thr Leu Asp Val Met His Gly Asp Leu Glu Leu Ser Ser Ala
385 390 395 400
Val Val Gly Pro Ile Pro Leu Asp Arg Glu Trp Gly Ile Asp Lys Pro
405 410 415
Trp Ile Gly Ala Gly Phe Gly Leu Glu Arg Leu Leu Lys Val Lys His
420 425 430
Asp Phe Lys Asn Ile Lys Arg Ala Ala Arg Ser Glu Ser Tyr Tyr Asn
435 440 445
Gly Ile Ser Thr Asn Leu
450
<210> 5
<211> 454
<212> PRT
<213> Artificial sequence (artificial sequence)
<400> 5
Met Asp Lys Lys Pro Leu Asn Thr Leu Ile Ser Ala Thr Gly Leu Trp
1 5 10 15
Met Ser His Thr Gly Thr Ile His Lys Ile Lys His Arg Glu Val Ser
20 25 30
Arg Ser Lys Ile Tyr Ile Glu Met Ala Cys Gly Asp His Leu Val Val
35 40 45
Asn Asn Ser Arg Ser Ser Arg Thr Ala Arg Ala Leu Arg His His Lys
50 55 60
Tyr Arg Lys Thr Cys Lys Arg Cys Arg Val Ser Asp Glu Asp Leu Asn
65 70 75 80
Lys Phe Leu Thr Lys Ala Asn Glu Asp Gln Thr Ser Val Lys Val Lys
85 90 95
Val Val Ser Ala Pro Thr Arg Thr Lys Lys Ala Met Pro Lys Ser Val
100 105 110
Ala Arg Ala Pro Lys Pro Leu Glu Asn Thr Glu Ala Ala Gln Ala Gln
115 120 125
Pro Ser Gly Ser Lys Phe Ser Pro Ala Ile Pro Val Ser Thr Gln Glu
130 135 140
Ser Val Ser Val Pro Ala Ser Val Ser Thr Ser Ile Ser Ser Ile Ser
145 150 155 160
Thr Gly Ala Thr Ala Ser Ala Leu Val Lys Gly Asn Thr Asn Pro Ile
165 170 175
Thr Ser Met Ser Ala Pro Val Gln Ala Ser Ala Pro Ala Leu Thr Lys
180 185 190
Ser Gln Thr Asp Arg Leu Glu Val Leu Leu Asn Pro Lys Asp Glu Ile
195 200 205
Ser Leu Asn Ser Gly Lys Pro Phe Arg Glu Leu Glu Ser Glu Leu Leu
210 215 220
Ser Arg Arg Lys Lys Asp Leu Gln Gln Ile Tyr Ala Glu Glu Arg Glu
225 230 235 240
Asn Tyr Leu Gly Lys Leu Glu Arg Glu Ile Thr Arg Phe Phe Val Asp
245 250 255
Arg Gly Phe Leu Glu Ile Lys Ser Pro Ile Leu Ile Pro Leu Glu Tyr
260 265 270
Ile Glu Arg Met Gly Ile Asp Asn Asp Thr Glu Leu Ser Lys Gln Ile
275 280 285
Phe Arg Val Asp Lys Asn Phe Cys Leu Arg Pro Met Leu Ala Pro Asn
290 295 300
Leu Tyr Asn Tyr Ala Arg Lys Leu Asp Arg Ala Leu Pro Asp Pro Ile
305 310 315 320
Lys Ile Phe Glu Ile Gly Pro Cys Tyr Arg Lys Glu Ser Asp Gly Lys
325 330 335
Glu His Leu Glu Glu Phe Thr Met Leu Asn Phe Ser Gln Met Gly Ser
340 345 350
Gly Cys Thr Arg Glu Asn Leu Glu Ser Ile Ile Thr Asp Phe Leu Asn
355 360 365
His Leu Gly Ile Asp Phe Lys Ile Val Gly Asp Ser Cys Met Val Tyr
370 375 380
Gly Asp Thr Leu Asp Val Met His Gly Asp Leu Glu Leu Ser Ser Ala
385 390 395 400
Val Val Gly Pro Ile Pro Leu Asp Arg Glu Trp Gly Ile Asp Lys Pro
405 410 415
Trp Ile Gly Ala Gly Phe Gly Leu Glu Arg Leu Leu Lys Val Lys His
420 425 430
Asp Phe Lys Asn Ile Lys Arg Ala Ala Arg Ser Glu Ser Tyr Tyr Asn
435 440 445
Gly Ile Ser Thr Asn Leu
450
<210> 6
<211> 454
<212> PRT
<213> Artificial sequence (artificial sequence)
<400> 6
Met Asp Lys Lys Pro Leu Asn Thr Leu Ile Ser Ala Thr Gly Leu Trp
1 5 10 15
Met Ser His Thr Gly Thr Ile His Lys Ile Lys His Arg Glu Val Ser
20 25 30
Arg Ser Lys Ile Tyr Ile Glu Met Ala Cys Gly Asp His Leu Val Val
35 40 45
Asn Asn Ser Arg Ser Ser Arg Thr Ala Arg Ala Leu Arg His His Lys
50 55 60
Tyr Arg Lys Thr Cys Lys Arg Cys Arg Val Ser Asp Glu Asp Leu Asn
65 70 75 80
Lys Phe Leu Thr Lys Ala Asn Glu Asp Gln Thr Ser Val Lys Val Lys
85 90 95
Val Val Ser Ala Pro Thr Arg Thr Lys Lys Ala Met Pro Lys Ser Val
100 105 110
Ala Arg Ala Pro Lys Pro Leu Glu Asn Thr Glu Ala Ala Gln Ala Gln
115 120 125
Pro Ser Gly Ser Lys Phe Ser Pro Ala Ile Pro Val Ser Thr Gln Glu
130 135 140
Ser Val Ser Val Pro Ala Ser Val Ser Thr Ser Ile Ser Ser Ile Ser
145 150 155 160
Thr Gly Ala Thr Ala Ser Ala Leu Val Lys Gly Asn Thr Asn Pro Ile
165 170 175
Thr Ser Met Ser Ala Pro Val Gln Ala Ser Ala Pro Ala Leu Thr Lys
180 185 190
Ser Gln Thr Asp Arg Leu Glu Val Leu Leu Asn Pro Lys Asp Glu Ile
195 200 205
Ser Leu Asn Ser Gly Lys Pro Phe Arg Glu Leu Glu Ser Glu Leu Leu
210 215 220
Ser Arg Arg Lys Lys Asp Leu Gln Gln Ile Tyr Ala Glu Glu Arg Glu
225 230 235 240
Asn Tyr Leu Gly Lys Leu Glu Arg Glu Ile Thr Arg Phe Phe Val Asp
245 250 255
Arg Gly Phe Leu Glu Ile Lys Ser Pro Ile Leu Ile Pro Leu Glu Tyr
260 265 270
Ile Glu Arg Met Gly Ile Asp Asn Asp Thr Glu Leu Ser Lys Gln Ile
275 280 285
Phe Arg Val Asp Lys Asn Phe Cys Leu Arg Pro Met Leu Ala Pro Asn
290 295 300
Leu Tyr Asn Tyr Leu Arg Lys Leu Asp Arg Ala Leu Pro Asp Pro Ile
305 310 315 320
Lys Ile Phe Glu Ile Gly Pro Cys Tyr Arg Lys Glu Ser Asp Gly Lys
325 330 335
Glu His Leu Glu Glu Phe Thr Met Leu Asn Phe Cys Gln Met Gly Ser
340 345 350
Gly Cys Thr Arg Glu Asn Leu Glu Ser Ile Ile Thr Asp Phe Leu Asn
355 360 365
His Leu Gly Ile Asp Phe Lys Ile Val Gly Asp Ser Cys Met Val Tyr
370 375 380
Gly Asp Thr Leu Asp Val Met His Gly Asp Leu Glu Leu Ser Ser Ala
385 390 395 400
Val Val Gly Pro Ile Pro Leu Asp Arg Glu Trp Gly Ile Asp Lys Pro
405 410 415
Trp Ile Gly Ala Gly Phe Gly Leu Glu Arg Leu Leu Lys Val Lys His
420 425 430
Asp Phe Lys Asn Ile Lys Arg Ala Ala Arg Ser Glu Ser Tyr Tyr Asn
435 440 445
Gly Ile Ser Thr Asn Leu
450
<210> 7
<211> 454
<212> PRT
<213> Artificial sequence (artificial sequence)
<400> 7
Met Asp Lys Lys Pro Leu Asn Thr Leu Ile Ser Ala Thr Gly Leu Trp
1 5 10 15
Met Ser His Thr Gly Thr Ile His Lys Ile Lys His His Glu Val Ser
20 25 30
Arg Ser Lys Ile Tyr Ile Glu Met Ala Cys Gly Asp His Leu Val Val
35 40 45
Asn Asn Ser Arg Ser Ser Arg Thr Ala Arg Ala Leu Arg His His Lys
50 55 60
Tyr Arg Lys Thr Cys Lys Arg Cys Arg Val Ser Asp Glu Asp Leu Asn
65 70 75 80
Lys Phe Leu Thr Lys Ala Asn Glu Asp Gln Thr Ser Val Lys Val Lys
85 90 95
Val Val Ser Ala Pro Thr Arg Thr Lys Lys Ala Met Pro Lys Ser Val
100 105 110
Ala Arg Ala Pro Lys Pro Leu Glu Asn Thr Glu Ala Ala Gln Ala Gln
115 120 125
Pro Ser Gly Ser Lys Phe Ser Pro Ala Ile Pro Val Ser Thr Gln Glu
130 135 140
Ser Val Ser Val Pro Ala Ser Val Ser Thr Ser Ile Ser Ser Ile Ser
145 150 155 160
Thr Gly Ala Thr Ala Ser Ala Leu Val Lys Gly Asn Thr Asn Pro Ile
165 170 175
Thr Ser Met Ser Ala Pro Val Gln Ala Ser Ala Pro Ala Leu Thr Lys
180 185 190
Ser Gln Thr Asp Arg Leu Glu Val Leu Leu Asn Pro Lys Asp Glu Ile
195 200 205
Ser Leu Asn Ser Gly Lys Pro Phe Arg Glu Leu Glu Ser Glu Leu Leu
210 215 220
Ser Arg Arg Lys Lys Asp Leu Gln Gln Ile Tyr Ala Glu Glu Arg Glu
225 230 235 240
Asn Tyr Leu Gly Lys Leu Glu Arg Glu Ile Thr Arg Phe Phe Val Asp
245 250 255
Arg Gly Phe Leu Glu Ile Lys Ser Pro Ile Leu Ile Pro Leu Glu Tyr
260 265 270
Ile Glu Arg Met Gly Ile Asp Asn Asp Thr Glu Leu Ser Lys Gln Ile
275 280 285
Phe Arg Val Asp Lys Asn Phe Cys Leu Arg Pro Met Leu Ala Pro Asn
290 295 300
Leu Tyr Asn Tyr Leu Arg Lys Leu Asp Arg Ala Leu Pro Asp Pro Ile
305 310 315 320
Lys Ile Phe Glu Ile Gly Pro Cys Tyr Arg Lys Glu Ser Asp Gly Lys
325 330 335
Glu His Leu Glu Glu Phe Thr Met Leu Asn Phe Cys Gln Met Gly Ser
340 345 350
Gly Cys Thr Arg Glu Asn Leu Glu Ser Ile Ile Thr Asp Phe Leu Asn
355 360 365
His Leu Gly Ile Asp Phe Lys Ile Val Gly Asp Ser Cys Met Val Tyr
370 375 380
Gly Asp Thr Leu Asp Val Met His Gly Asp Leu Glu Leu Ser Ser Ala
385 390 395 400
Val Val Gly Pro Ile Pro Leu Asp Arg Glu Trp Gly Ile Asp Lys Pro
405 410 415
Trp Ile Gly Ala Gly Phe Gly Leu Glu Arg Leu Leu Lys Val Lys His
420 425 430
Asp Phe Lys Asn Ile Lys Arg Ala Ala Arg Ser Glu Ser Tyr Tyr Asn
435 440 445
Gly Ile Ser Thr Asn Leu
450
<210> 8
<211> 454
<212> PRT
<213> Artificial sequence (artificial sequence)
<400> 8
Met Asp Lys Lys Pro Leu Asn Thr Leu Ile Ser Ala Thr Gly Leu Trp
1 5 10 15
Met Ser Arg Thr Gly Thr Ile His Lys Ile Lys His Arg Glu Val Ser
20 25 30
Arg Ser Lys Ile Tyr Ile Glu Met Ala Cys Gly Asp His Leu Val Val
35 40 45
Asn Asn Ser Arg Ser Ser Arg Thr Ala Arg Ala Leu Arg His His Lys
50 55 60
Tyr Arg Lys Thr Cys Lys Arg Cys Arg Val Ser Asp Glu Asp Leu Asn
65 70 75 80
Lys Phe Leu Thr Lys Ala Asn Glu Asp Gln Thr Ser Val Lys Val Lys
85 90 95
Val Val Ser Ala Pro Thr Arg Thr Lys Lys Ala Met Pro Lys Ser Val
100 105 110
Ala Arg Ala Pro Lys Pro Leu Glu Asn Thr Glu Ala Ala Gln Ala Gln
115 120 125
Pro Ser Gly Ser Lys Phe Ser Pro Ala Ile Pro Val Ser Thr Gln Glu
130 135 140
Ser Val Ser Val Pro Ala Ser Val Ser Thr Ser Ile Ser Ser Ile Ser
145 150 155 160
Thr Gly Ala Thr Ala Ser Ala Leu Val Lys Gly Asn Thr Asn Pro Ile
165 170 175
Thr Ser Met Ser Ala Pro Val Gln Ala Ser Ala Pro Ala Leu Thr Lys
180 185 190
Ser Gln Thr Asp Arg Leu Glu Val Leu Leu Asn Pro Lys Asp Glu Ile
195 200 205
Ser Leu Asn Ser Gly Lys Pro Phe Arg Glu Leu Glu Ser Glu Leu Leu
210 215 220
Ser Arg Arg Lys Lys Asp Leu Gln Gln Ile Tyr Ala Glu Glu Arg Glu
225 230 235 240
Asn Tyr Leu Gly Lys Leu Glu Arg Glu Ile Thr Arg Phe Phe Val Asp
245 250 255
Arg Gly Phe Leu Glu Ile Lys Ser Pro Ile Leu Ile Pro Leu Glu Tyr
260 265 270
Ile Glu Arg Met Gly Ile Asp Asn Asp Thr Glu Leu Ser Lys Gln Ile
275 280 285
Phe Arg Val Asp Lys Asn Phe Cys Leu Arg Pro Met Leu Ala Pro Asn
290 295 300
Leu Tyr Asn Tyr Leu Arg Lys Leu Asp Arg Ala Leu Pro Asp Pro Ile
305 310 315 320
Lys Ile Phe Glu Ile Gly Pro Cys Tyr Arg Lys Glu Ser Asp Gly Lys
325 330 335
Glu His Leu Glu Glu Phe Thr Met Leu Asn Phe Cys Gln Met Gly Ser
340 345 350
Gly Cys Thr Arg Glu Asn Leu Glu Ser Ile Ile Thr Asp Phe Leu Asn
355 360 365
His Leu Gly Ile Asp Phe Lys Ile Val Gly Asp Ser Cys Met Val Tyr
370 375 380
Gly Asp Thr Leu Asp Val Met His Gly Asp Leu Glu Leu Ser Ser Ala
385 390 395 400
Val Val Gly Pro Ile Pro Leu Asp Arg Glu Trp Gly Ile Asp Lys Pro
405 410 415
Trp Ile Gly Ala Gly Phe Gly Leu Glu Arg Leu Leu Lys Val Lys His
420 425 430
Asp Phe Lys Asn Ile Lys Arg Ala Ala Arg Ser Glu Ser Tyr Tyr Asn
435 440 445
Gly Ile Ser Thr Asn Leu
450
<210> 9
<211> 454
<212> PRT
<213> Artificial sequence (artificial sequence)
<400> 9
Met Asp Lys Lys Pro Leu Asn Thr Leu Ile Ser Ala Thr Gly Leu Trp
1 5 10 15
Met Ser Lys Thr Gly Thr Ile His Lys Ile Lys His Lys Glu Val Ser
20 25 30
Arg Ser Lys Ile Tyr Ile Glu Met Ala Cys Gly Asp His Leu Val Val
35 40 45
Asn Asn Ser Arg Ser Ser Arg Thr Ala Arg Ala Leu Arg His His Lys
50 55 60
Tyr Arg Lys Thr Cys Lys Arg Cys Arg Val Ser Asp Glu Asp Leu Asn
65 70 75 80
Lys Phe Leu Thr Lys Ala Asn Glu Asp Gln Thr Ser Val Lys Val Lys
85 90 95
Val Val Ser Ala Pro Thr Arg Thr Lys Lys Ala Met Pro Lys Ser Val
100 105 110
Ala Arg Ala Pro Lys Pro Leu Glu Asn Thr Glu Ala Ala Gln Ala Gln
115 120 125
Pro Ser Gly Ser Lys Phe Ser Pro Ala Ile Pro Val Ser Thr Gln Glu
130 135 140
Ser Val Ser Val Pro Ala Ser Val Ser Thr Ser Ile Ser Ser Ile Ser
145 150 155 160
Thr Gly Ala Thr Ala Ser Ala Leu Val Lys Gly Asn Thr Asn Pro Ile
165 170 175
Thr Ser Met Ser Ala Pro Val Gln Ala Ser Ala Pro Ala Leu Thr Lys
180 185 190
Ser Gln Thr Asp Arg Leu Glu Val Leu Leu Asn Pro Lys Asp Glu Ile
195 200 205
Ser Leu Asn Ser Gly Lys Pro Phe Arg Glu Leu Glu Ser Glu Leu Leu
210 215 220
Ser Arg Arg Lys Lys Asp Leu Gln Gln Ile Tyr Ala Glu Glu Arg Glu
225 230 235 240
Asn Tyr Leu Gly Lys Leu Glu Arg Glu Ile Thr Arg Phe Phe Val Asp
245 250 255
Arg Gly Phe Leu Glu Ile Lys Ser Pro Ile Leu Ile Pro Leu Glu Tyr
260 265 270
Ile Glu Arg Met Gly Ile Asp Asn Asp Thr Glu Leu Ser Lys Gln Ile
275 280 285
Phe Arg Val Asp Lys Asn Phe Cys Leu Arg Pro Met Leu Ala Pro Asn
290 295 300
Leu Tyr Asn Tyr Leu Arg Lys Leu Asp Arg Ala Leu Pro Asp Pro Ile
305 310 315 320
Lys Ile Phe Glu Ile Gly Pro Cys Tyr Arg Lys Glu Ser Asp Gly Lys
325 330 335
Glu His Leu Glu Glu Phe Thr Met Leu Asn Phe Cys Gln Met Gly Ser
340 345 350
Gly Cys Thr Arg Glu Asn Leu Glu Ser Ile Ile Thr Asp Phe Leu Asn
355 360 365
His Leu Gly Ile Asp Phe Lys Ile Val Gly Asp Ser Cys Met Val Tyr
370 375 380
Gly Asp Thr Leu Asp Val Met His Gly Asp Leu Glu Leu Ser Ser Ala
385 390 395 400
Val Val Gly Pro Ile Pro Leu Asp Arg Glu Trp Gly Ile Asp Lys Pro
405 410 415
Trp Ile Gly Ala Gly Phe Gly Leu Glu Arg Leu Leu Lys Val Lys His
420 425 430
Asp Phe Lys Asn Ile Lys Arg Ala Ala Arg Ser Glu Ser Tyr Tyr Asn
435 440 445
Gly Ile Ser Thr Asn Leu
450
<210> 10
<211> 72
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 10
ggaaacctga tcatgtagat cgaatggact ctaaatccgt tcagccgggt tagattcccg 60
gggtttccgc ca 72
<210> 11
<211> 96
<212> PRT
<213> Artificial sequence (artificial sequence)
<400> 11
Met Val Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Tyr Val Gln Glu
1 5 10 15
Arg Thr Ile Ser Phe Lys Asp Thr Tyr Lys Thr Arg Ala Glu Val Lys
20 25 30
Phe Glu Gly Asp Glu Asn Leu Tyr Phe Gln Gly Arg Phe Val Asn Gln
35 40 45
His Leu Cys Gly Ser His Leu Val Glu Ala Leu Tyr Leu Val Cys Gly
50 55 60
Glu Arg Gly Phe Phe Tyr Thr Pro Lys Thr Arg Gly Ile Val Glu Gln
65 70 75 80
Cys Cys Thr Ser Ile Cys Ser Leu Tyr Gln Leu Glu Asn Tyr Cys Asn
85 90 95
<210> 12
<211> 288
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 12
atggttagca aaggtgaaga actgtttacc ggcgtttatg tgcaggaacg taccattagc 60
ttcaaagata cctataaaac ccgtgcggaa gttaaatttg aaggcgatga aaacctgtat 120
tttcagggac gtttcgttaa ccaacacctg tgcggcagcc acctggtaga ggcactgtat 180
ctggtttgtg gtgaacgtgg cttcttctat actccgtaga ctcgtggtat cgtggaacag 240
tgttgcactt ctatttgctc tctgtatcag ctggaaaatt actgtaat 288
<210> 13
<211> 301
<212> PRT
<213> Artificial sequence (artificial sequence)
<400> 13
Met Val Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu
1 5 10 15
Val Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Arg Gly
20 25 30
Glu Gly Glu Gly Asp Ala Thr Asn Gly Lys Leu Thr Leu Lys Phe Ile
35 40 45
Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr
50 55 60
Leu Thr Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys
65 70 75 80
Arg His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu
85 90 95
Arg Thr Ile Ser Phe Lys Asp Asp Gly Thr Tyr Lys Thr Arg Ala Glu
100 105 110
Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly
115 120 125
Ile Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr
130 135 140
Asn Phe Asn Ser His Asn Val Tyr Ile Thr Ala Asp Lys Gln Lys Asn
145 150 155 160
Gly Ile Lys Ala Asn Phe Lys Ile Arg His Asn Val Glu Asp Gly Ser
165 170 175
Val Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly
180 185 190
Pro Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Val Leu
195 200 205
Ser Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe
210 215 220
Val Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Ala Gly
225 230 235 240
Ser Glu Asn Leu Tyr Phe Gln Gly Arg Phe Val Asn Gln His Leu Cys
245 250 255
Gly Ser His Leu Val Glu Ala Leu Tyr Leu Val Cys Gly Glu Arg Gly
260 265 270
Phe Phe Tyr Thr Pro Lys Thr Arg Gly Ile Val Glu Gln Cys Cys Thr
275 280 285
Ser Ile Cys Ser Leu Tyr Gln Leu Glu Asn Tyr Cys Asn
290 295 300
<210> 14
<211> 906
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 14
atggttagca aaggtgaaga actgtttacc ggcgttgtgc cgattctggt ggaactggat 60
ggtgatgtga atggccataa atttagcgtt cgtggcgaag gcgaaggtga tgcgaccaac 120
ggtaaactga ccctgaaatt tatttgcacc accggtaaac tgccggttcc gtggccgacc 180
ctggtgacca ccctgaccta tggcgttcag tgctttagcc gctatccgga tcatatgaaa 240
cgccatgatt tctttaaaag cgcgatgccg gaaggctatg tgcaggaacg taccattagc 300
ttcaaagatg atggcaccta taaaacccgt gcggaagtta aatttgaagg cgataccctg 360
gtgaaccgca ttgaactgaa aggtattgat tttaaagaag atggcaacat tctgggtcat 420
aaactggaat ataatttcaa cagccataat gtgtatatta ccgccgataa acagaaaaat 480
ggcatcaaag cgaactttaa aatccgtcac aacgtggaag atggtagcgt gcagctggcg 540
gatcattatc agcagaatac cccgattggt gatggcccgg tgctgctgcc ggataatcat 600
tatctgagca cccagagcgt tctgagcaaa gatccgaatg aaaaacgtga tcatatggtg 660
ctgctggaat ttgttaccgc cgcgggcatt acccacggta tggatgaact gtatgcgggc 720
agcgaaaacc tgtattttca gggacgtttc gttaaccaac acctgtgcgg cagccacctg 780
gtagaggcac tgtatctggt ttgtggtgaa cgtggcttct tctatactcc gtagactcgt 840
ggtatcgtgg aacagtgttg cacttctatt tgctctctgt atcagctgga aaattactgt 900
aattaa 906
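
The coding sequences in the listing can be checked for the in-frame stop codon that marks the site where the modified amino acid is introduced. The following is a minimal sketch in Python and is not part of the patent; the function names `clean` and `find_inframe_stops` are assumptions introduced for illustration. It reads SEQ ID NO: 12 as printed above, strips spaces and position counts, and reports every in-frame TAG, TAA, or TGA codon with its 1-based codon number.

```python
import re

# SEQ ID NO: 12 copied from the sequence listing (position counts included).
SEQ_ID_NO_12 = """
atggttagca aaggtgaaga actgtttacc ggcgtttatg tgcaggaacg taccattagc 60
ttcaaagata cctataaaac ccgtgcggaa gttaaatttg aaggcgatga aaacctgtat 120
tttcagggac gtttcgttaa ccaacacctg tgcggcagcc acctggtaga ggcactgtat 180
ctggtttgtg gtgaacgtgg cttcttctat actccgtaga ctcgtggtat cgtggaacag 240
tgttgcactt ctatttgctc tctgtatcag ctggaaaatt actgtaat 288
"""

# DNA forms of the codons UAG, UAA and UGA named in the claims.
STOP_CODONS = {"TAG", "TAA", "TGA"}


def clean(listing_text):
    """Keep only nucleotide letters; drop spaces, newlines and counts."""
    return re.sub(r"[^acgtACGT]", "", listing_text).upper()


def find_inframe_stops(dna):
    """Return (codon_number, codon) for each in-frame stop codon (1-based).

    The reading frame is assumed to start at the first base.
    """
    hits = []
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        codon = dna[i:i + 3]
        if codon in STOP_CODONS:
            hits.append((i // 3 + 1, codon))
    return hits


dna = clean(SEQ_ID_NO_12)
print(len(dna), find_inframe_stops(dna))
```

Read this way, SEQ ID NO: 12 contains a single in-frame TAG at codon 73, the position occupied by Lys in SEQ ID NO: 11. Out-of-frame occurrences of "tag" elsewhere in the sequence are ignored because only whole codons are compared.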

Claims (17)

1. A dual plasmid system, comprising:
(1) a first plasmid comprising a first expression cassette for expression of a protein of interest, the first expression cassette comprising a first coding sequence encoding the protein of interest, the first coding sequence comprising non-natural codons for introduction of a predetermined modified amino acid, the non-natural codons being UAG, UAA, or UGA; and
(2) a second plasmid comprising a second expression cassette for expression of an aminoacyl-tRNA synthetase;
and the system further comprises a third expression cassette encoding an artificial tRNA, wherein the artificial tRNA comprises an anticodon corresponding to the non-natural codon, wherein the third expression cassette is located in the first plasmid and/or the second plasmid;
and said aminoacyl-tRNA synthetase specifically catalyzes the formation of an "artificial tRNA-Xa" complex from said artificial tRNA, wherein Xa is said predetermined modified amino acid in aminoacyl form;
wherein said aminoacyl-tRNA synthetase is a mutant lysyl-tRNA synthetase, and said mutant lysyl-tRNA synthetase is mutated relative to a wild-type lysyl-tRNA synthetase, the mutations being as follows:
R19H;
H29R;
R19H and H29R;
R19H, I26V and H29R;
R19H, H29R, T122S and Y384F; or
R19H, H29R, L309A and C348S;
and the amino acid sequence of the wild-type lysyl-tRNA synthetase is shown in SEQ ID NO: 1.
2. The dual plasmid system of claim 1, wherein the modified amino acid is selected from the group consisting of: an alkynyloxycarbonyl lysine derivative, a tert-butoxycarbonyl (BOC)-lysine derivative, a fatty acylated lysine derivative, or a combination thereof.
3. The dual plasmid system of claim 1, wherein the third expression cassette is located in the second plasmid.
4. The dual plasmid system of claim 1, wherein the amino acid sequence of the mutant lysyl-tRNA synthetase is as set forth in any one of SEQ ID NOs: 3-9.
5. The dual plasmid system of claim 1 wherein the mutant lysyl-tRNA synthetase is used to introduce a lysine derivative into a protein of interest.
6. The dual plasmid system of claim 1, wherein the mutant lysyl-tRNA synthetase has the following characteristics:
compared with the wild-type lysyl-tRNA synthetase, it can introduce lysine derivatives bearing large functional groups into proteins.
7. The dual plasmid system of claim 1, wherein the artificial tRNA has the nucleic acid sequence set forth in SEQ ID NO: 10.
8. The dual plasmid system of claim 1, wherein the protein of interest is selected from the group consisting of: insulin, human insulin precursor protein, insulin lispro precursor protein, insulin glargine precursor protein, exenatide and its derivatives exenatide and liraglutide, semaglutide, teduglutide, hirudin, growth hormone, glucagon, interferon, parathyroid hormone, or combinations thereof.
9. The dual plasmid system of claim 1 wherein the first plasmid and/or the second plasmid further comprises one or more promoters operably linked to the first coding sequence, enhancer, transcription termination signal, polyadenylation sequence, origin of replication, selectable marker, nucleic acid restriction site, and/or homologous recombination site.
10. The dual plasmid system of claim 1 wherein the first plasmid is an expression vector selected from the group consisting of: pBAD-His ABC, pBAD/His ABC, pET28a, pETDuet-1.
11. The dual plasmid system of claim 1 wherein the second plasmid is a pEvol-pBpF vector.
12. The dual plasmid system of claim 1 wherein the third promoter is the reverse transcription promoter proK.
13. A host cell or an extract thereof, wherein the host cell comprises the dual plasmid system of claim 1.
14. The host cell or extract thereof of claim 13, wherein the host cell is selected from the group consisting of: Escherichia coli, Bacillus subtilis, yeast cells, insect cells, mammalian cells, or a combination thereof.
15. A kit comprising (a) a container, and (b) located within the container:
(1) a first plasmid comprising a first expression cassette for expression of a protein of interest, the first expression cassette comprising a first coding sequence encoding the protein of interest, the first coding sequence comprising non-natural codons for introduction of a predetermined modified amino acid, the non-natural codons being UAG (amber), UAA (ochre), or UGA (opal); and
(2) a second plasmid comprising a second expression cassette for expression of an aminoacyl-tRNA synthetase;
and the kit further comprises a third expression cassette encoding an artificial tRNA, wherein the artificial tRNA comprises an anticodon corresponding to the non-natural codon, wherein the third expression cassette is located in the first plasmid and/or the second plasmid;
and said aminoacyl-tRNA synthetase specifically catalyzes the formation of an "artificial tRNA-Xa" complex from said artificial tRNA, wherein Xa is said predetermined modified amino acid in aminoacyl form;
wherein the aminoacyl-tRNA synthetase is a mutant lysyl-tRNA synthetase, and the mutant lysyl-tRNA synthetase is mutated in an amino acid sequence corresponding to a wild-type lysyl-tRNA synthetase, the mutations being as follows:
R19H;
H29R;
R19H and H29R;
R19H, I26V and H29R;
R19H, H29R, T122S and Y384F; or
R19H, H29R, L309A and C348S;
and the amino acid sequence of the wild-type lysyl-tRNA synthetase is shown in SEQ ID NO: 1 or 2.
16. Use of the dual plasmid system of claim 1, or the host cell or extract thereof of claim 13, or the kit of claim 15 for the preparation of a protein comprising a predetermined modified amino acid.
17. A method of producing a protein comprising a predetermined modified amino acid, said method comprising the steps of:
(1) providing a host cell comprising the dual plasmid system of claim 1, and
(2) adding the predetermined modified amino acid, and culturing the host cell, thereby obtaining a protein containing the predetermined modified amino acid.
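
The substitution notation used in claims 1 and 15 (for example R19H: the wild-type Arg at position 19 replaced by His) can be checked against a listed sequence. The following is a minimal sketch in Python and is not part of the patent; the helper names `to_one_letter` and `classify_site` are assumptions introduced for illustration. Only residues 1-32 of SEQ ID NO: 7 are used, copied from the sequence listing.

```python
import re

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Residues 1-32 of SEQ ID NO: 7, copied from the sequence listing.
SEQ_ID_NO_7_START = (
    "Met Asp Lys Lys Pro Leu Asn Thr Leu Ile Ser Ala Thr Gly Leu Trp "
    "Met Ser His Thr Gly Thr Ile His Lys Ile Lys His His Glu Val Ser"
)


def to_one_letter(three_letter_text):
    """Convert space-separated three-letter codes to a one-letter string."""
    return "".join(THREE_TO_ONE[res] for res in three_letter_text.split())


def classify_site(protein, substitution):
    """Classify the residue found at the position named by e.g. 'R19H'.

    Returns 'mutant' if the residue equals the substituted amino acid,
    'wild-type' if it equals the original one, and 'other' otherwise.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", substitution)
    if match is None:
        raise ValueError(f"not a substitution like 'R19H': {substitution!r}")
    wild, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(protein):
        raise ValueError(f"position {position} is outside the sequence")
    residue = protein[position - 1]  # substitution positions are 1-based
    if residue == mutant:
        return "mutant"
    if residue == wild:
        return "wild-type"
    return "other"


fragment = to_one_letter(SEQ_ID_NO_7_START)
for sub in ("R19H", "I26V", "H29R"):
    print(sub, classify_site(fragment, sub))
```

For this fragment the sketch reports His at position 19 (the R19H residue) and the unchanged residues at positions 26 and 29. That is consistent with a sequence carrying R19H alone, although confirming it would require comparing the full sequence with SEQ ID NO: 1, which is not reproduced in this section.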
CN201910210100.XA 2019-03-19 2019-03-19 Introduction of unnatural amino acids in proteins using a two-plasmid system Active CN111718949B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201910210100.XA CN111718949B (en) 2019-03-19 2019-03-19 Introduction of unnatural amino acids in proteins using a two-plasmid system
PCT/CN2020/080039 WO2020187271A1 (en) 2019-03-19 2020-03-18 Introduction of unnatural amino acids in proteins using dual plasmid system
CN202080023302.4A CN113631712A (en) 2019-03-19 2020-03-18 Introduction of unnatural amino acids in proteins using a two-plasmid system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910210100.XA CN111718949B (en) 2019-03-19 2019-03-19 Introduction of unnatural amino acids in proteins using a two-plasmid system

Publications (2)

Publication Number Publication Date
CN111718949A CN111718949A (en) 2020-09-29
CN111718949B true CN111718949B (en) 2021-10-01

Family

ID=72518972

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201910210100.XA Active CN111718949B (en) 2019-03-19 2019-03-19 Introduction of unnatural amino acids in proteins using a two-plasmid system
CN202080023302.4A Pending CN113631712A (en) 2019-03-19 2020-03-18 Introduction of unnatural amino acids in proteins using a two-plasmid system

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202080023302.4A Pending CN113631712A (en) 2019-03-19 2020-03-18 Introduction of unnatural amino acids in proteins using a two-plasmid system

Country Status (2)

Country Link
CN (2) CN111718949B (en)
WO (1) WO2020187271A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111850020B (en) * 2019-04-25 2021-05-07 苏州鲲鹏生物技术有限公司 Introduction of unnatural amino acids in proteins using plasmid systems
CN111849929B (en) * 2019-04-30 2021-05-11 苏州鲲鹏生物技术有限公司 aminoacyl-tRNA synthetase for efficiently introducing lysine derivative
WO2023282315A1 (en) * 2021-07-07 2023-01-12 味の素株式会社 Method for secretory production of unnatural-amino-acid-containing protein
CN115701451B (en) * 2021-08-02 2023-08-01 宁波鲲鹏生物科技有限公司 aminoacyl-tRNA synthetase for high-efficiency introducing lysine derivative and application thereof
CN114634958A (en) * 2021-12-22 2022-06-17 清华大学 Method for insertion of unnatural amino acids using cell-free protein synthesis system
CN114672524B (en) * 2022-03-30 2024-01-26 吉林大学 Bifunctional heme protein for catalyzing unnatural amino acid derivatives

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107022568A (en) * 2016-02-01 2017-08-08 Peking University System for efficient multi-site insertion of unnatural amino acids in mammalian cells
CN109295100A (en) * 2017-07-25 2019-02-01 Peking University Construction of stable cell lines carrying an orthogonal tRNA/aminoacyl-tRNA synthetase

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2192185B1 (en) * 2007-09-20 2014-01-22 Riken Mutant pyrrolysyl-trna synthetase, and method for production of protein having non-natural amino acid integrated therein by using the same
CN102504022A (en) * 2011-11-30 2012-06-20 苏州元基生物技术有限公司 Proinsulin containing protecting lysine and preparation method for insulin by utilizing proinsulin
US20170292139A1 (en) * 2014-06-17 2017-10-12 B.G NEGEV TECHNOLOGIES AND APPLICATIONS LTD., at BEN-GURION UNIVERSITY Genetically expanded cell free protein synthesis systems, methods and kits
GB201419109D0 (en) * 2014-10-27 2014-12-10 Medical Res Council Incorporation of unnatural amino acids into proteins

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Patrick O’Donoghue, et al. Near-cognate suppression of amber, opal and quadruplet codons competes with aminoacyl-tRNAPyl for genetic code expansion. FEBS Letters. 2012, Vol. 586 (No. 21), 3931-3937. *

Also Published As

Publication number Publication date
CN111718949A (en) 2020-09-29
WO2020187271A1 (en) 2020-09-24
CN113631712A (en) 2021-11-09

Similar Documents

Publication Publication Date Title
CN111718949B (en) Introduction of unnatural amino acids in proteins using a two-plasmid system
CN111849929B (en) aminoacyl-tRNA synthetase for efficiently introducing lysine derivative
CN111850020B (en) Introduction of unnatural amino acids in proteins using plasmid systems
CN111718920B (en) aminoacyl-tRNA synthetases with high efficiency of lysine derivatives incorporation into proteins
CN108239633B (en) Mutant of D-psicose-3-epimerase with improved catalytic activity and application thereof
CN110551701A (en) carbonyl reductase mutant and application thereof in reduction of cyclopentadione compounds
CN113667685B (en) Signal peptide related sequence and application thereof in protein synthesis
JP7266325B2 (en) Fusion proteins containing fluorescent protein fragments and uses thereof
CN109136209B (en) Enterokinase light chain mutant and application thereof
RU2790662C1 (en) AMINOACIL-tRNA SYNTHASE, EFFECTIVE INTRODUCTION OF LYSINE DERIVATIVES
RU2799794C2 (en) AMINOACIL-tRNA SYNTHASE FOR EFFECTIVE INTRODUCTION OF LYSINE DERIVATIVE INTO PROTEIN
CN113801236A (en) Preparation method of insulin lispro
CN113801235A (en) Insulin lispro derivative and application thereof
RU2801248C2 (en) Hybrid protein containing fragments of fluorescent proteins and its application
CN115873837A (en) High-expression novel phenylalanine ammonia lyase
CN115701451A (en) aminoacyl-tRNA synthetase for efficiently introducing lysine derivative and application thereof
US20230272004A1 (en) Ramp tag for insulin overexpression and method for manufacturing insulin using same

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Introducing Non Natural Amino Acids into Proteins Using a Dual Plasmid System

Effective date of registration: 20230612

Granted publication date: 20211001

Pledgee: Huarong Financial Leasing Co.,Ltd. Ningbo Branch

Pledgor: Ningbo Kunpeng Biotechnology Co.,Ltd.

Registration number: Y2023980043585
