WO2004016648A1

WO2004016648A1 - Fluorescent protein

Info

Publication number: WO2004016648A1
Application number: PCT/DE2003/002436
Authority: WO
Inventors: Ludger Altrogge; Tatjana Males
Original assignee: Amaxa Gmbh
Priority date: 2002-07-19
Filing date: 2003-07-19
Publication date: 2004-02-26
Also published as: WO2004016648A8; AU2003257392A1; DE10233082A1; DE10393454D2

Abstract

The invention relates to artificially produced autofluorescent proteins. Novel autofluorescent proteins are disclosed which can be detected in cells by their fluorescence and may thus be used as markers for gene expression and protein localisation in cell, developmental and molecular biology. Some of said proteins also have novel properties, for example regeneration of the fluorescence after fading by irradiation with light of a particular wavelength. The invention also relates to a method for the production of said fluorescent proteins.

Description

Fluorescent protein

Background of the Invention

The invention relates to an artificially produced autofluorescent protein and a method for its production.

State of the art

Fluorescent proteins are excellent markers for gene expression and protein localization in various biological systems (Kendall et al. (1998), Trends Biotechnol. 16, 216-224). Since the publication of the green fluorescent protein (GFP) from the bioluminescent jellyfish Aequorea victoria (Prasher et al. (1992), Gene 111, 229-233), numerous attempts have been made to modify this protein advantageously. For example, the solubility of the protein and its detectability in fluorescence-activated cell sorting (FACS) have been improved (Cormack et al. (1996), Gene 173, 33-38). Various GFP color mutants that fluoresce, for example, yellow, blue or red have also been generated or isolated (e.g. Sawano et al. (2000), NAR 28 (16), E78; Yang et al. (1998), J. Biol. Chem. 273 (14), 8212; Heim et al. (1996), Curr. Biol. 6 (2), 178-182; Lewis et al. (1999), Anal. Chem. 71 (19), 4321; Patterson et al. (2001), J. Cell. Sci. 114, 837-838; Matz et al. (1999), Nature Biotechnol. 17, 969-973). In addition, numerous attempts have been made to generate brighter fluorescence in the GFP from Aequorea by mutation (Sacchetti (2001), FEBS 492 (1-2), 151; Battistutta (2000), Proteins 41 (4), 429; Ito (1999), Biochem. Biophys. Res. Com. 264 (2), 556; Kim et al. (1998), Brain Res. Bull. 47 (1), 35; US 5,491,084 and Nature Biotechnol. (1996) 14, 315-319). These numerous efforts to modify and improve the known fluorescent proteins show that there is a great need for such autofluorescent proteins with different properties. The article by van Roessel et al. provides an overview of the various possible uses of fluorescent proteins (Nature Cell Biology (2002) Vol. 4, E15-E20).

In addition, new fluorescent proteins from natural sources, such as anthozoa (corals), have been increasingly isolated and characterized (e.g. Matz et al. (1999), Nat. Biotechnol. 17, 969-973; Fradkov et al. (2000), FEBS Letters 479, 127-130). Such isolation of new proteins from natural sources and their subsequent cloning is very complex and costly.

WO 99/49019 A2, for example, discloses green fluorescent proteins from anthozoa of the genera Renilla and Ptilosarcus, as well as the isolated nucleic acids which code for these proteins.

WO 00/46233 A1 also discloses a fluorescent protein from corals, as well as the genes coding therefor and possible uses for the proteins.

WO 01/32688 A1 discloses the amino acid sequences of green fluorescent proteins from Renilla reniformis (Anthozoa, Coelenterata), as well as the nucleotide sequences of the nucleic acids derived therefrom and a multitude of possible uses.

WO 01/34824 A2 shows a sequence comparison ("alignment") of different fluorescent proteins from Aequorea victoria, Ptilosarcus gurneyi and Renilla mulleri. This sequence comparison serves to determine homologies of the proteins, i.e. to determine the relative similarities between these proteins, and does not lead to the production of new fluorescent proteins. The efforts to isolate new autofluorescent proteins from ever new sources show that there is a very great need for new fluorescent proteins with advantageous properties.

Summary of the invention

It is therefore an object of the invention to provide new fluorescent proteins.

The object is achieved according to the invention by a fluorescent protein whose amino acid sequence has at least 80% homology with one of the amino acid sequences according to Seq ID Nos. 1, 15, 17, 19 and 21. Thus, new autofluorescent proteins are made available which are detectable in cells due to their fluorescence and can therefore be used as markers for gene expression and protein localization, for example in cell, developmental and molecular biology. Some of the proteins according to the invention also have new properties, such as, for example, the ability of the fluorescence, once it has bleached, to be regenerated by irradiation with light of a specific wavelength.

In the context of the invention, fluorescent proteins are also to be understood as fusion proteins and multimers which contain at least one fluorescent protein according to the invention. This applies in particular to fusion proteins, since the proteins according to the invention can be used, inter alia, as expression markers and thus a fusion with other proteins is appropriate.

For the purposes of the invention, the term homology means the degree of agreement between two protein sequences, i.e. the percentage of amino acid positions in the proteins that match. One or more gaps can be inserted into one or both protein sequences so that the highest possible number of identical amino acids are assigned to each other with respect to their respective position. A conventional data processing program can, for example, also be used to determine the homology.
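The homology measure described above can be illustrated with a minimal Python sketch. It assumes that the two sequences have already been aligned, with gaps written as "-". As a further assumption that the text does not fix, the percentage is taken over all alignment columns in which at least one of the two sequences carries an amino acid. The function name percent_homology is hypothetical.

```python
def percent_homology(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of compared alignment columns that hold identical amino acids.

    Assumption: columns that are gaps in BOTH sequences are ignored; a column
    with a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns in which at least one sequence has a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no comparable positions")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy example: three of four columns match.
print(percent_homology("MVSK", "MV-K"))  # 75.0
```

With a different choice of denominator (for example the length of the shorter sequence) the percentage would change, so the convention should be stated whenever a threshold such as 80% or 90% is applied.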

In a particularly advantageous embodiment of the invention, the fluorescent protein according to the invention has at least 90% homology with one of the amino acid sequences according to Seq ID Nos. 1, 15, 17, 19 and 21.

The properties of the protein according to the invention can be advantageously influenced by targeted or undirected mutation. It has proven to be particularly advantageous if, based on the amino acid sequence according to Seq ID No. 1, the amino acid at position 2 is valine or glutamic acid, the amino acid at position 3 is alanine or leucine, the amino acid at position 4 is lysine or cysteine, the amino acid at position 6 is lysine or valine or glutamic acid, the amino acid at position 7 is asparagine or alanine, the amino acid at position 10 is lysine or threonine, the amino acid at position 44 is threonine or alanine, the amino acid at position 98 is isoleucine or phenylalanine, the amino acid at position 108 is isoleucine or alanine, the amino acid at position 125 is leucine or phenylalanine, the amino acid at position 128 is valine or alanine, the amino acid at position 150 is lysine or glutamic acid, the amino acid at position 174 is tyrosine or histidine, the amino acid at position 183 is lysine or glutamic acid, the amino acid at position 213 is valine or alanine, the amino acid at position 223 is glycine or lysine, the amino acid at position 224 is valine or isoleucine or serine, the amino acid at position 225 is alanine or arginine or tryptophan, the amino acid at position 226 is leucine or glycine or serine, the amino acid at position 227 is proline or threonine and/or the amino acid at position 228 is lysine or serine. The presence of these amino acids at the respective positions has an advantageous effect, for example, on the expression, stability, solubility and fluorescence intensity of the proteins according to the invention. Some of the fluorescent proteins according to the invention have the advantageous and surprising property that the fluorescence decays after a short time on irradiation with light suitable for exciting the fluorescence and can be regenerated by subsequent irradiation with light of a different wavelength. The irradiation to excite the fluorescence can take place with light of a wavelength between 375 and 580 nm, and the irradiation to regenerate the fluorescence with light of shorter wavelength, in particular a wavelength between 320 and 400 nm. According to the invention, this fluorescent protein can advantageously be used, for example, for the detection of time-dependent cellular processes, such as protein diffusion or transport, or for optical information storage.

The invention further relates to a nucleic acid molecule with a nucleotide sequence which codes for a fluorescent protein, the nucleotide sequence being selected from the group consisting of a) an isolated or artificial nucleotide sequence which codes for the fluorescent protein according to the invention, b) a nucleotide sequence according to Seq ID No. 2, 16, 18, 20 or 22, or c) a nucleotide sequence which differs from the nucleotide sequences according to a) or b) by the exchange of at least one codon for a synonymous codon, i.e. which has at least one silent mutation.
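Option c) can be made concrete with a small Python sketch that checks whether two coding sequences differ only by synonymous codons. The sketch assumes the standard genetic code and in-frame sequences of equal length; the function names and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated with each base in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence ('*' marks a stop codon)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def differs_only_by_silent_mutations(dna_a: str, dna_b: str) -> bool:
    """True if the sequences differ in at least one codon but encode the same protein."""
    if len(dna_a) != len(dna_b):
        return False
    return dna_a.upper() != dna_b.upper() and translate(dna_a) == translate(dna_b)


# Toy example: GTG -> GTC (both valine) and GCC -> GCT (both alanine).
print(differs_only_by_silent_mutations("ATGGTGGCC", "ATGGTCGCT"))  # True
# GTG -> GAG changes valine to glutamic acid, so this is not silent.
print(differs_only_by_silent_mutations("ATGGTGGCC", "ATGGAGGCC"))  # False
```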

The invention further relates to a vector for expressing a fluorescent protein in a suitable cell, which contains the previously described nucleic acid molecule in an expressible form.

The invention also relates to cells which contain the protein according to the invention, the nucleic acid molecule and/or the vector mentioned, and to a kit which contains the protein according to the invention, the nucleic acid molecule described, the vector described and/or at least one of the cells mentioned. A pharmaceutical composition is also provided which contains the protein according to the invention, the nucleic acid molecule and/or the vector mentioned, and preferably conventional auxiliaries and/or carriers.

The invention further relates to a method for producing a fluorescent protein according to the invention. In this method, the amino acid sequences of at least three known autofluorescent proteins are compared in that they are initially arranged next to one another in such a way that, with the introduction of gaps, the positions of invariant amino acids or similar regions match as far as possible for all proteins. The amino acid sequences can, for example, be aligned with one another in such a way that the highest possible number of identical amino acids are assigned to one another in relation to their respective positions, this preferably being done while maintaining the respective order of the amino acids. Similar regions are all sequence sections, preferably sections of at least three consecutive amino acids, which have an essentially identical primary structure or which form a domain in the folded protein whose function is already known. The similar regions can also be in different positions in different proteins. In this case, the similar regions can be matched, for example, by inserting gaps or by moving these regions within the sequence.

With the aid of this comparison of the respective positions of the selected amino acid sequences, an average sequence over at least a substantial part of the total length of the amino acid sequence is determined. For each given position, the amino acid that occurs most frequently in the underlying amino acid sequences at this position is selected, with a gap being treated like an amino acid. This results in a sequence of amino acids and gaps with positions in between where no amino acid occurs most frequently or where two or more amino acids occur with the same frequency. At these points, an amino acid must now be selected according to previously defined, meaningful, reproducible and generally applicable criteria. The method according to the invention provides several options for this.

According to a preferred embodiment of the invention, an amino acid at such a position is determined by first establishing a ranking of the underlying proteins before comparing the amino acid sequences, i.e. rank numbers from 1 to n are assigned to the individual proteins. If the most common amino acid cannot be clearly identified at a position, the amino acid that, among the most common amino acids, belongs to the protein sequence with the lowest rank number is used.
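The determination of the average sequence with the rank-based tie-break can be sketched in Python as follows. This is only an illustration under stated assumptions: the input sequences are already aligned and of equal length, gaps are written as "-" and counted like an amino acid, and the list is ordered by rank (index 0 corresponds to rank 1). The function name and the toy sequences are hypothetical.

```python
from collections import Counter


def average_sequence(aligned_by_rank: list[str]) -> str:
    """Column-wise most frequent residue; ties go to the lowest rank number.

    aligned_by_rank: gapped sequences of equal length, ordered by rank
    (first entry = rank 1). A gap ('-') is treated like an amino acid.
    """
    length = len(aligned_by_rank[0])
    if any(len(seq) != length for seq in aligned_by_rank):
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for column in zip(*aligned_by_rank):
        counts = Counter(column)
        top = max(counts.values())
        most_common = {res for res, n in counts.items() if n == top}
        # The column is in rank order, so the first residue that belongs to
        # the most common ones comes from the lowest rank number.
        result.append(next(res for res in column if res in most_common))
    return "".join(result)


toy_alignment = ["MSKG-",   # rank 1
                 "MAKGE",   # rank 2
                 "MARGE"]   # rank 3
print(average_sequence(toy_alignment))          # MAKGE
print(average_sequence(["AK", "AR", "GR", "GK"]))  # AK (both columns are ties)
```

Positions at which the gap wins remain as "-" in the result and would be removed (for example with `.replace("-", "")`) before the protein sequence is used further.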

According to a further preferred embodiment of the invention, an amino acid is determined at such a position in that the amino acids are divided into groups on the basis of functional and/or structural criteria and, in the case of several amino acids with the same frequency, an amino acid is selected from the group that occurs most frequently at the respective position.
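A group-based tie-break of this kind might look like the following Python sketch. The text does not specify how the amino acids are grouped, so the four groups below are purely an illustrative assumption, as is the fallback to column order when the groups are tied as well. The function name pick_by_group is hypothetical.

```python
from collections import Counter

# Illustrative grouping only; the document does not define the groups.
GROUPS = {
    "hydrophobic": "AVLIMFWC",
    "polar": "STNQYGP",
    "basic": "KRH",
    "acidic": "DE",
}
GROUP_OF = {aa: name for name, members in GROUPS.items() for aa in members}


def pick_by_group(column: str) -> str:
    """Choose the residue for one alignment column.

    A unique most frequent residue wins outright. Otherwise the tied residue
    whose group is most frequent in the column is chosen; if the groups are
    tied too, the residue that appears first in the column is used (assumption).
    """
    counts = Counter(column)
    top = max(counts.values())
    # dict.fromkeys keeps the first-seen order of the residues in the column.
    tied = [res for res in dict.fromkeys(column) if counts[res] == top]
    if len(tied) == 1:
        return tied[0]
    group_counts = Counter(GROUP_OF.get(res, "other") for res in column)
    return max(tied, key=lambda res: group_counts[GROUP_OF.get(res, "other")])


print(pick_by_group("AAKKR"))  # K: A and K are tied, but the basic group occurs 3 times
print(pick_by_group("DLIVE"))  # L: all residues tied, hydrophobic group is largest
```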

According to a further preferred embodiment of the invention, an amino acid at such a position is determined in that, in the case of several amino acids with the same frequency, the amino acid is selected on the basis of functional and/or structural criteria, taking into account the properties of the known proteins.

When creating the average sequence, the selection among several amino acids with the same frequency can also be made on the basis of different criteria within one amino acid sequence, i.e. the aforementioned embodiments can be advantageously combined with one another. Furthermore, functional and structural criteria or other knowledge of the known proteins can also determine, or at least influence, the aforementioned ranking of the amino acid sequences before their comparison.

The method according to the invention surprisingly enables the production of new fluorescent proteins which can be expressed in a suitable system. The process steps described above lead to a complete, artificial amino acid sequence which represents a protein which shows autofluorescence and can be expressed in a suitable system.

For the expression of the protein and for the detection of fluorescence, an artificial nucleic acid sequence which codes for the determined average sequence is first created and at least one corresponding nucleic acid molecule is synthesized. This nucleic acid molecule is linked to a suitable promoter and the protein is then expressed in a suitable system. The method according to the invention thus leads to a new, fully functional fluorescent protein. The method can be carried out easily and without great expenditure on equipment. Furthermore, the method according to the invention does not merely modify one known protein but is based on several known proteins, so that the positive properties of the different proteins can be combined in the new protein. In this way, new fluorescent proteins can be produced without having to isolate and clone previously unknown natural proteins with great effort.

In a preferred embodiment, the average sequence is modified at the N-terminus and/or C-terminus by exchanging at least one amino acid before the artificial nucleic acid sequence is created. To achieve an optimal Kozak region in the region of the start codon, the additional amino acid valine can be introduced, for example, after the start amino acid methionine. At the C-terminus, for example, the additional hydrophilic amino acid serine can be added to improve the solubility of the protein.
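The two terminal modifications mentioned as examples can be written as a short Python sketch. It assumes that the average sequence starts with methionine; the function name and the toy sequence are hypothetical. Since every valine codon begins with G, inserting valine directly after the start methionine places a G immediately after the ATG start codon, which is in line with the aim of an optimal Kozak region.

```python
def modify_termini(average_seq: str, after_start: str = "V", c_terminal: str = "S") -> str:
    """Insert a residue after the start methionine and append one at the C-terminus.

    Defaults follow the examples in the text: valine after Met, serine at the end.
    """
    if not average_seq.startswith("M"):
        raise ValueError("expected the sequence to start with methionine (M)")
    return average_seq[0] + after_start + average_seq[1:] + c_terminal


print(modify_termini("MSKGEEL"))  # MVSKGEELS
```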

In a particularly advantageous embodiment of the method according to the invention it is provided that subsequently, i.e. after creation and expression of the average sequence, at least one codon is exchanged and/or changed by mutagenesis in the artificial nucleic acid molecule and thereby at least one amino acid of the average sequence is exchanged. By exchanging one or more amino acids, certain properties of the artificial protein, such as stability, solubility, compatibility or functionality, can be changed in an advantageous manner. In particular, the fluorescence, i.e. the common function of the originally known proteins, can be influenced or improved in a targeted manner. According to the invention, it was possible to generate further mutants of the fluorescent average protein (average FP) which are significantly improved with regard to solubility in the cell and brightness of the fluorescence. The deviations of the mutant clones from the average FP with respect to the amino acid sequence are a maximum of 10%, i.e. the homology of the proteins according to the invention with one another is at least 90%.

In a further embodiment of the method according to the invention it is provided that the protein is isolated and/or purified after expression so that it is available in a suitable form for further use.

Brief description of the figures

The invention is explained in more detail below using the figures as an example.

FIG. 1 shows a comparison of the amino acid sequences of 9 fluorescent proteins (FPs), in which the sequences are arranged relative to one another in such a way that an optimal match of invariant amino acids or similar regions results for all proteins. Rank numbers were assigned to the individual FPs:

Rank 1: Aequorea victoria GFP (Gene (1992) 111, 229)
Rank 2: Zoanthus sp. zFP506 (Nature Biotechnol. (1999) 17, 969)
Rank 3: Zoanthus sp. zFP538 (Nature Biotechnol. (1999) 17, 969)
Rank 4: Discosoma striata dsFP483 (Nature Biotechnol. (1999) 17, 969)
Rank 5: Discosoma sp. "red" dsFP583 (Nature Biotechnol. (1999) 17, 969)
Rank 6: Anemonia majano amFP486 (Nature Biotechnol. (1999) 17, 969)
Rank 7: Clavularia sp. cFP484 (Nature Biotechnol. (1999) 17, 969)
Rank 8: Renilla mulleri GFP (WO 99/49019)
Rank 9: Ptilosarcus gurneyi GFP (WO 99/49019)

From this, the average sequence or, by adding additional modifications, the average FP was determined in accordance with the method according to the invention.

a) Amino acid positions 1 to 53
b) Amino acid positions 54 to 109
c) Amino acid positions 110 to 164
d) Amino acid positions 165 to 222
e) Amino acid positions 223 to 234

The numbering of the positions refers to the average sequence determined. At the points where there is a gap in the average sequence, the positions are not numbered consecutively, but are given the additions "b" or "c". Exceptions to this are only positions 1b and 226b, where additional amino acids are inserted in the average FP compared to the average sequence determined.

FIG. 2 shows a schematic representation of the structure of the plasmids which are used to carry out the described method and to express the fluorescent proteins according to the invention in Escherichia coli and mammalian cells.

FIG. 3 shows fluorescence microscopic images of NIH3T3 cells which were transfected with expression plasmids for various mutagenesis products of the proteins according to the invention. A: 9 hours after transfection, B: 24 hours after transfection.

FIG. 4 shows fluorescence microscopic images of NIH3T3 cells (fluorescein filter set) after transfection with an expression plasmid for a mutagenesis product of a protein produced according to the invention (p2A9-c15m3). A: Recording 6 hours after transfection when irradiated with fluorescein excitation filter, B: recording after 10 seconds irradiation with fluorescein excitation filter, C: recording after 2 seconds irradiation with DAPI excitation filter for regeneration and subsequent irradiation with fluorescein excitation filter.

FIG. 5 shows fluorescence microscopic images of Escherichia coli cells (DH5) which were each transformed with 7.1 μg of a plasmid without fluorescent protein (pMCS5, negative control; A) or of an expression plasmid (pExpA-cFP; B) which contains the gene for the average FP. 2 ml of a culture of the transformed cells were pelleted, the supernatant was decanted, the pellet was resuspended in the last drop and 10 μl of this suspension were placed on a slide with a cover slip.

Description of the invention

The following example serves to explain the invention in greater detail, but without restricting it to the substances and methods disclosed by way of example.

Example: Production of the artificial autofluorescent protein

For the production of the artificial autofluorescent protein by the method described above, the amino acid sequences of the following naturally occurring proteins, which have autofluorescence as a common function, were taken as a basis, with a ranking:

Rank 1: Aequorea victoria GFP
Rank 2: Zoanthus sp. zFP506
Rank 3: Zoanthus sp. zFP538
Rank 4: Discosoma striata dsFP483
Rank 5: Discosoma sp. "red" dsFP583
Rank 6: Anemonia majano amFP486
Rank 7: Clavularia sp. cFP484
Rank 8: Renilla mulleri GFP
Rank 9: Ptilosarcus gurneyi GFP

The amino acid sequences were initially arranged in the form of an alignment in which, with the introduction of gaps, the positions of invariant amino acids were matched for all proteins. The sequences were aligned with one another in such a way that, while maintaining the respective order of the amino acids, the highest possible number of identical amino acids were assigned to one another in relation to their respective position (cf. Matz et al. (1999), Nat. Biotechnol. 17, 969-973). The amino acid sequences of Renilla mulleri GFP and Ptilosarcus gurneyi GFP were then added, likewise taking into account obviously invariant amino acids (see FIG. 1 a-e). The sequence of the clearly most common amino acids resulting from this arrangement was supplemented at the remaining positions by that one of the most common amino acids which belongs to the protein with the highest rank (i.e. the lowest rank number) at this position. The average sequence was then modified at both the N-terminus and the C-terminus. To achieve an optimal Kozak region in the area of the start codon, the additional amino acid valine was introduced after the start amino acid methionine. The additional hydrophilic amino acid serine was added to the C-terminus so that a hydrophobic C-terminus would not impair the solubility of the protein. A coding DNA sequence (Seq ID No. 2) was then designed for this protein sequence (Seq ID No. 1), taking into account the codons that occur particularly frequently in mammalian cells. For the cloning of the desired FP gene (FP = fluorescent protein) into the expression plasmid pExpA, 12 partially complementary DNA oligonucleotides (Seq ID Nos. 3-14) were used, from which the entire FP gene was amplified in two successive PCR reactions. Subsequent restriction digestion with the restriction enzymes Xba I and Not I gave a DNA fragment which, after ligation with the Xba I / Not I fragment from pExpA, gave the expression plasmid pExpA-cFP (see FIG. 2A).

After transformation of Escherichia coli cells (DH5), this plasmid leads to ampicillin-resistant bacterial colonies which show green fluorescence in the fluorescence microscope (see FIG. 5B) when fluorescence filters are used that excite the fluorescent molecules with light in the blue wavelength range from 460 nm to 505 nm and allow emitted light in the green wavelength range from 510 nm to 560 nm to pass (filter set for fluorescein), while no fluorescence could be detected in the corresponding negative control (see FIG. 5A). It was therefore possible to detect autofluorescence for the average FP (Seq ID No. 1) that arose directly from the method, without the need for subsequent mutagenesis.

Several strategies were used to optimize the function of the autofluorescent gene obtained in this way by mutagenesis. Initially, mutations were introduced into the FP gene by PCR using pExpA-cFP as template. This was done using primers chosen to hybridize to pExpA-cFP in regions outside the coding sequence, but in such a way that the recognition sequences for the restriction enzymes Xba I and Not I were still included. This allowed random mutagenesis across the entire coding region. The PCR was carried out in the presence of Mn²⁺ ions, which leads to a desired increase in the error rate of the Taq DNA polymerase used. The PCR products obtained in this way were digested with Xba I and Not I, then ligated with the appropriate fragment of pExpA and transformed into Escherichia coli cells (DH5). Of the ampicillin-resistant colonies, those that were significantly brighter when viewed under a fluorescence microscope were selected. The increase in fluorescence intensity was verified by repeated transformation into Escherichia coli cells, the plasmid DNA was prepared and the sequence of the FP gene was determined.

In a further mutagenesis, the improved pExpA-cFP plasmids were combined and again amplified in the presence of Mn²⁺ ions. This time, however, the restriction-digested PCR products were ligated with the p2A9 fragment cut with Xba I and Not I in order to obtain plasmids which are suitable for expression both in Escherichia coli cells and in mammalian cells.

Alternatively, PCR reactions were carried out to specifically change the N- and C-terminal regions. The 5'-primer hybridized with the FP gene and contained a mixture of all 4 bases at 18 positions in place of the first 6 codons, and the 3'-primer likewise in place of the last 6 codons, so that a random sequence of 6 arbitrary amino acids was created at each of the N- and C-termini of the translated protein. These PCR products were also digested with Xba I and Not I and ligated to the corresponding fragment of p2A9.
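The size of such a randomized terminus can be estimated with a few lines of Python. The figures follow directly from 18 positions with 4 possible bases each; the estimate of the stop-free fraction additionally assumes the standard genetic code (3 stop codons out of 64) and an equimolar base mixture, neither of which is stated in the text. The function name is hypothetical.

```python
def random_terminus_library(n_codons: int = 6) -> dict:
    """Theoretical diversity of a terminus with n_codons fully randomized codons."""
    positions = 3 * n_codons                  # 6 codons -> 18 randomized bases
    return {
        "randomized_positions": positions,
        "dna_variants": 4 ** positions,       # all 4 bases at every position
        "peptide_variants": 20 ** n_codons,   # upper bound on distinct peptides
        # Assumption: standard code, equimolar bases -> 61 of 64 codons are sense codons.
        "stop_free_fraction": (61 / 64) ** n_codons,
    }


library = random_terminus_library()
print(library["dna_variants"])      # 68719476736
print(library["peptide_variants"])  # 64000000
```

Because the number of DNA variants far exceeds the number of colonies that can be screened, only a small sample of the theoretical library is examined in practice.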

In a third approach, selected codons were specifically mutated by PCR with partially homologous primers in order to exchange individual amino acid codons. Here too, the PCR products obtained were digested with Xba I and Not I and ligated to the corresponding p2A9 fragment. From all three approaches, after ligation and subsequent transformation of Escherichia coli cells (DH5), 48 colonies which showed bright fluorescence were selected and the plasmid DNA was prepared. This was then transfected into NIH3T3 cells in a subsequent experiment in order to select for sequencing those clones which lead to a higher fluorescence when these cells are transfected.

Cloning the expression plasmids

For the expression of the above-mentioned proteins in suitable Escherichia coli cells, plasmids were constructed which contain a coding sequence for FP (FP = fluorescent protein) under the control of the lac promoter (pExpA-cFP, see FIG. 2A). The plasmids were produced by modification of pMCS5 (MoBiTec, Göttingen, Germany). pMCS5 has a similar structure to pBluescript SK (-) (Stratagene) and differs from it only in the 5' region of the coding sequence for the lacZα fragment. pMCS5 therefore has the lac promoter and, downstream, the lac operator, as a result of which the expression of an inserted coding sequence depends on the absence of an active lac repressor. In order to achieve constitutive expression in Escherichia coli cells in every case, pMCS5 was modified in such a way that the lac operator is no longer included. In its place, the early mRNA polyadenylation signal from SV40 was inserted in order to prepare for the conversion of this expression plasmid for Escherichia coli into an expression plasmid for mammalian cells (p2A9-cFP, see FIG. 2B). This conversion was done by additionally inserting regulatory promoter sequences which cause expression of the downstream gene in mammalian cells. A combination of the complete "immediate early" cytomegalovirus (CMV) promoter, followed by a 100 base pair fragment of the lac promoter from Escherichia coli and the 180 base pair 3' end of the "immediate early" CMV promoter, proved to be particularly effective. These fragments were each prepared by PCR using primers which, in addition to the homologous regions, contained recognition sequences for restriction enzymes. For the insertion of the artificial FP genes, recognition sites for the restriction enzymes Xba I and Not I were inserted between the combined promoter and the polyadenylation sequence.

Experiment description: The expression of FP in mammalian cells leads to autofluorescence of the cells:

NIH3T3 cells (adherent, 2.5 x 10⁵ cells per well in a 12-well plate, cultivated to 70-80% confluence) were transfected with 1 μg vector DNA using Exgene (MBI Fermentas). The plasmids p2A9-c15m2 (Seq ID No. 15), p2A9-c15m3 (Seq ID No. 17), p2A9-c15m12 (Seq ID No. 19) and p2A9-c15m48 (Seq ID No. 21) contained differently mutated genes, c15m2 (Seq ID No. 16), c15m3 (Seq ID No. 18), c15m12 (Seq ID No. 20) and c15m48 (Seq ID No. 22), for the FP (FP = fluorescent protein) under the control of a combined promoter that leads to the constitutive expression of the downstream gene in Escherichia coli cells and mammalian cells. After 9 hours (see FIG. 3A) or 24 hours (see FIG. 3B), the cells were analyzed in a fluorescence microscope. Using fluorescence filters that excite fluorescent molecules with light in the blue wavelength range from 460 nm to 505 nm and allow emitted light in the green wavelength range from 510 nm to 560 nm to pass (filter set for fluorescein), green fluorescence was observed in all cells that had been transfected with plasmids containing FP or mutated FP.

Experiment description: The autofluorescence of FP fades and can be regenerated by irradiation with ultraviolet light:

NIH3T3 cells (adherent, 2.5 x 10⁵ cells per well in a 12-well plate, cultivated to 70-80% confluence) were transfected with 1 μg vector DNA using Exgene (MBI Fermentas). The plasmid p2A9-c15m3 contained a mutated gene c15m3 for FP (FP = fluorescent protein) under the control of a combined promoter which leads to the constitutive expression of the downstream gene in Escherichia coli cells and mammalian cells. After 6 hours, the cells were analyzed in a fluorescence microscope. Using fluorescence filters that excite fluorescent molecules with light in the blue wavelength range from 460 nm to 505 nm and allow emitted light in the green wavelength range from 510 nm to 560 nm to pass (filter set for fluorescein), green fluorescence was observed in cells that had been transfected with the plasmid containing this mutated FP (see FIG. 4A). This fluorescence weakened during the observation within a few seconds until it had almost completely disappeared (see FIG. 4B). After irradiation of the field of view with short-wave light in the visible blue to ultraviolet range (320 nm to 400 nm, DAPI excitation filter), the fluorescence could be observed again with the fluorescein filter set (see FIG. 4C), i.e. it could be regenerated with the shorter-wave light. This mutated FP can thus be used, for example, for the detection of time-dependent cellular processes, such as protein diffusion or transport, or for optical information storage.

Comparison of the amino acid sequences of the average FP and its mutants with the known fluorescent proteins:

Table 1 shows the differences in the amino acid sequence between the individual modified proteins cFP2, cFP3, cFP12 and cFP48, which are encoded by the mutated genes c15m2 (Seq ID No. 16), c15m3 (Seq ID No. 18), c15m12 (Seq ID No. 20) and c15m48 (Seq ID No. 22), in comparison with the average FP (D-FP, from the German "Durchschnitt-FP"). The numbering of the amino acid positions is taken from that of the average sequence according to FIG. 1. This results in deviations of the individual proteins from the average FP of between 6% and 8%, i.e. relative similarities between 92% and 94% (Table 2). In contrast, Table 2 shows that the differences between the new fluorescent proteins according to the invention and the known proteins, which were the starting point for determining the average sequence, are relatively high. Here the sequence homologies are only between 32% and 69%. It was therefore possible to provide a completely new group of amino acid sequences or proteins which show clear autofluorescence and offer advantageous possible uses. In addition, an additional function or advantageous property could be generated in the proteins according to the invention.

List of abbreviations used:

CMV cytomegalovirus

C-terminus C-terminal end of an amino acid sequence

DNA deoxyribonucleic acid

FACS fluorescence activated cell sorting

FP fluorescent protein

GFP green fluorescent protein

nm nanometer (unit of wavelength)

N-terminus N-terminal end of an amino acid sequence

PCR polymerase chain reaction

Seq ID No. Sequence identification number (according to sequence listing)

SV40 Simian Virus 40

μg microgram

Table 1: Differences in the amino acid sequences between the average FP (D-FP) and the individual mutated fluorescent proteins (clones cFP2 / Seq ID No. 15, cFP3 / Seq ID No. 17, cFP12 / Seq ID No. 19 and cFP48 / Seq ID No. 21), based on the positions of the amino acids in the average FP according to Seq ID No. 1.

Table 2: Relative similarities of the fluorescent proteins in percent, based on the respective amino acid sequences. (D-FP (Seq ID No. 1) = average FP; cFP2 (Seq ID No. 15), cFP3 (Seq ID No. 17), cFP12 (Seq ID No. 19) and cFP48 (Seq ID No. 21) = mutated D-FP clones)

Claims

1. Fluorescent protein with an amino acid sequence which has at least 80% homology with one of the amino acid sequences according to Seq ID Nos. 1, 15, 17, 19 and 21.

2. Fluorescent protein according to claim 1, characterized in that it has at least 90% homology with one of the amino acid sequences according to Seq ID Nos. 1, 15, 17, 19 and 21.

3. Fluorescent protein according to claim 1 or 2, characterized in that, based on the amino acid sequence according to Seq ID No. 1, the amino acid at position 2 is valine or glutamic acid, the amino acid at position 3 is alanine or leucine, the amino acid at position 4 is lysine or cysteine, the amino acid at position 6 is lysine or valine or glutamic acid, the amino acid at position 7 is asparagine or alanine, the amino acid at position 10 is lysine or threonine, the amino acid at position 44 is threonine or alanine, the amino acid at position 98 is isoleucine or phenylalanine, the amino acid at position 108 is isoleucine or alanine, the amino acid at position 125 is leucine or phenylalanine, the amino acid at position 128 is valine or alanine, the amino acid at position 150 is lysine or glutamic acid, the amino acid at position 174 is tyrosine or histidine, the amino acid at position 183 is lysine or glutamic acid, the amino acid at position 213 is valine or alanine, the amino acid at position 223 is glycine or lysine, the amino acid at position 224 is valine or isoleucine or serine, the amino acid at position 225 is alanine or arginine or tryptophan, the amino acid at position 226 is leucine or glycine or serine, the amino acid at position 227 is proline or threonine and / or the amino acid at position 228 is lysine or serine.

4. Fluorescent protein according to one of claims 1 to 3, characterized by the property that the fluorescence fades within a short time upon irradiation with light of a wavelength suitable for exciting the fluorescence and can be regenerated by subsequent irradiation with light of a different wavelength.

5. Fluorescent protein according to claim 4, characterized in that the irradiation to excite the fluorescence is carried out with light of a wavelength between 375 and 580 nm and the irradiation to regenerate the fluorescence is carried out with light of a shorter wavelength, in particular a wavelength between 320 and 400 nm.

6. Nucleic acid molecule with a nucleotide sequence which codes for a fluorescent protein, the nucleotide sequence being selected from the group consisting of

a) isolated or artificial nucleotide sequence which codes for the fluorescent protein according to one of claims 1 to 5,

b) nucleotide sequence according to Seq ID No. 2, 16, 18, 20 or 22,

c) nucleotide sequence which differs from the nucleotide sequences according to a) or b) in that at least one codon is replaced by a synonymous codon.
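
By way of illustration only, the notion of a nucleotide sequence that differs merely by synonymous codons can be checked computationally: two coding sequences are synonymous variants if they differ as DNA but translate to the same amino acid sequence. The sketch below assumes the standard genetic code and complete, in-frame coding sequences; the function names and the short example sequences are hypothetical and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_synonymous_variant(dna_a: str, dna_b: str) -> bool:
    """True if the sequences differ as DNA but encode the same protein."""
    return dna_a.upper() != dna_b.upper() and translate(dna_a) == translate(dna_b)


# Hypothetical example: three codons replaced by synonymous codons.
original = "ATGGTGAGCAAG"
recoded = "ATGGTTTCTAAA"
print(translate(original), is_synonymous_variant(original, recoded))
```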

7. Vector for expressing a protein in a suitable cell, characterized in that it contains the nucleic acid molecule according to claim 6 in an expressible form.

8. Cell which contains the protein according to any one of claims 1 to 5, the nucleic acid molecule according to claim 6 and / or the vector according to claim 7.

9. Kit which contains the protein according to one of claims 1 to 5, the nucleic acid molecule according to claim 6, the vector according to claim 7 and / or at least one cell according to claim 8.

10. A pharmaceutical composition which contains the protein according to any one of claims 1 to 5, the nucleic acid molecule according to claim 6 and / or the vector according to claim 7, and preferably conventional auxiliaries and / or carriers.

11. A method for producing the protein according to one of claims 1 to 5, characterized by the following steps,

a) comparison of the amino acid sequences of at least three known fluorescent proteins,

b) matching the positions of the invariant amino acids or regions of the amino acid sequences by mutual alignment and / or inserting at least one gap in one or more amino acid sequences in order to produce a match of the positions of the invariant amino acids or regions,

c) creation of an average sequence over at least a substantial part of the entire length of the amino acid sequence, the most common amino acid being selected at the respective position or, in the case of several amino acids with the same frequency, one of these amino acids being selected, a gap being treated like an amino acid,

d) creating an artificial nucleic acid sequence which codes for the average sequence from step c) and synthesizing at least one corresponding nucleic acid molecule,

e) linking the nucleic acid molecule from step d) to a suitable promoter and expressing the protein in a suitable system.
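
Purely as an illustration of step c), the formation of an average sequence from already aligned sequences can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure actually used: the gap is represented by "-" and counted like an amino acid, and a tie between equally frequent residues is resolved here by simple character order (which places "-" before any letter), a choice made only to keep the example deterministic. The function name `average_sequence` and the toy alignment are hypothetical.

```python
from collections import Counter


def average_sequence(aligned: list[str]) -> str:
    """Column-wise majority sequence of at least three aligned sequences.

    A gap ("-") is counted like an amino acid. Ties are broken by character
    order, which is an assumption made for this sketch only.
    """
    if len(aligned) < 3:
        raise ValueError("at least three aligned sequences are required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same aligned length")

    result = []
    for column in zip(*aligned):
        counts = Counter(column)
        top = max(counts.values())
        candidates = sorted(res for res, n in counts.items() if n == top)
        result.append(candidates[0])
    return "".join(result)


# Hypothetical toy alignment of three short sequences.
alignment = ["MSK-EL", "MSKGEL", "MAK-DL"]
print(average_sequence(alignment))
```

In this toy alignment the fourth column contains two gaps and one glycine, so the gap is the most common symbol and is carried into the average sequence.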

12. The method according to claim 11, characterized in that the amino acid sequences in step b), preferably while maintaining the respective order of the amino acids, are aligned in such a way that the highest possible number of identical amino acids are assigned to one another with respect to their respective position.

13. The method according to claim 11 or 12, characterized in that in step c) a ranking of the amino acid sequences is determined, whereby the individual amino acid sequences are assigned rank numbers from 1 to n, and, in the case of several amino acids with the same frequency, the amino acid from the sequence with the lowest rank number is selected for the average sequence, or that in step c) the amino acids are divided into groups on the basis of functional and / or structural criteria and, in the case of several amino acids with the same frequency, an amino acid is selected from the group which occurs most frequently at the respective position, or that in step c), in the case of several amino acids with the same frequency, the amino acid is selected on the basis of functional and / or structural criteria, taking into account the properties of the known proteins.
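
The first alternative, tie-breaking by rank number, can be illustrated for a single alignment position. The sketch assumes that the residues of a column are listed in rank order (first entry = rank 1) and that, among equally frequent residues, the one contributed by the sequence with the lowest rank number is taken. The function name `pick_by_rank` and the example columns are hypothetical.

```python
from collections import Counter


def pick_by_rank(column: tuple[str, ...]) -> str:
    """Select the residue for one alignment position.

    `column` holds the residues of the aligned sequences in rank order
    (index 0 = rank 1). The most frequent residue wins; among equally
    frequent residues, the one from the lowest-ranked sequence is chosen.
    """
    if not column:
        raise ValueError("column must not be empty")
    counts = Counter(column)
    top = max(counts.values())
    for residue in column:  # iterates from rank 1 upwards
        if counts[residue] == top:
            return residue
    raise AssertionError("unreachable: at least one residue has the top count")


# Hypothetical columns from an alignment of four ranked sequences.
print(pick_by_rank(("A", "V", "V", "A")))  # tie between A and V
print(pick_by_rank(("L", "I", "I", "F")))  # clear majority for I
```

In the first column both residues occur twice, so the residue of the rank 1 sequence is taken; in the second column the majority decides and the ranking is not consulted.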

14. The method according to one or more of claims 11 to 13, characterized in that, when creating the average sequence, the selection of one amino acid from several amino acids with the same frequency is made on the basis of different criteria within one amino acid sequence.

15. The method according to any one of claims 11 to 14, characterized in that the average sequence from step c) is modified at the N-terminus and / or C-terminus by exchanging at least one amino acid before performing step d).

16. The method according to any one of claims 11 to 15, characterized in that in the nucleic acid molecule from step d) at least one codon is exchanged and / or changed by mutagenesis and thereby at least one amino acid of the average sequence is exchanged.

17. The method according to any one of claims 11 to 16, characterized in that the protein is isolated and / or purified after step e).