WO2013144595A1 - Peridinin-chlorophyll binding proteins - Google Patents

Peridinin-chlorophyll binding proteins

Info

Publication number
WO2013144595A1
Authority
WO
WIPO (PCT)
Prior art keywords
protein
carotene
caroten
seq
fragment
Prior art date
Application number
PCT/GB2013/050754
Other languages
French (fr)
Inventor
Farid KHAN
Ping Wang
Hans-Peter Meyer
Original Assignee
Protein Technologies Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Protein Technologies Limited
Priority to GB1418608.4A (patent GB2515698A)
Publication of WO2013144595A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/405: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from algae
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K2319/00: Fusion polypeptide
    • C07K2319/20: Fusion polypeptide containing a tag with affinity for a non-protein ligand
    • C07K2319/21: Fusion polypeptide containing a tag with affinity for a non-protein ligand containing a His-tag
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K2319/00: Fusion polypeptide
    • C07K2319/60: Fusion polypeptide containing spectroscopic/fluorescent detection, e.g. green fluorescent protein [GFP]
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/11: DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/62: DNA sequences coding for fusion proteins

Definitions

  • the present invention relates to novel peridinin-chlorophyll binding proteins (PCPs) and their uses.
  • the invention also relates to nucleic acid sequences encoding the PCPs and to the production of the PCPs.
  • a variety of fluorescent proteins have been used to visualise cells and components and molecules in cells, including fluorescent proteins derived from jellyfish and corals.
  • haemoglobin effectively absorbs the blue, green, red and other wavelengths used both to excite standard fluorescent proteins and the various wavelengths that they generate on fluorescence. This interference results in an increase in either scattering or absorption in the tissue or sample and a resultant decrease in light penetration - the so-called attenuation length.
  • With near-infrared light, however, both scattering and absorption in biological tissues are generally less severe and attenuation lengths are correspondingly longer. This property allows near-infrared fluorescent proteins to be used as probes with substantial increases in imaging sensitivity and performance.
  • PCPs have been used as fluorescent labels. These proteins exist in the thylakoid lumen of algae. A unique class of light-harvesting proteins, PCPs use blue-green carotenoids as their primary light-absorbers, transferring the absorbed energy to bound chlorophylls. PCPs are present in most photosynthetic dinoflagellates and zooxanthellae.
  • PCPs are almost exclusively obtained from extracts of natural organisms including Zooxanthellae (Tridacna species), Amphidinium carterae (Plymouth 450), Cachonina niei, Gonyaulax polyedra, Glenodinium species, Amphidinium rhynchocephalum and Gymnodinium splendens.
  • PCPs form trimers, in which each polypeptide contains an unusual jellyroll fold of the α-helical amino- and carboxyl-terminal domains. These domains constitute a scaffold with pseudo-twofold symmetry surrounding a hydrophobic cavity filled by two lipid, eight peridinin and two chlorophyll a molecules. Each monomer consists of two peridinin chlorophyll-binding units termed the PCP 'block'.
  • There are two forms of PCP monomer: a short version with a molecular weight of 14-16 kDa and a longer version with a molecular weight of 30-35 kDa.
  • the protein complex has a unique emission spectrum that peaks at 676 nm.
  • holo-PCP i.e. a PCP fully bound with peridinin and chlorophyll
  • apo-PCP i.e. a PCP without the bound peridinin and chlorophyll
  • apo-PCP is non-fluorescent.
  • PCPs used in fluorescent labelling have typically been natural protein isolates.
  • US 4876190 and US 6133429 disclose the use of naturally-sourced PCPs from living organisms as fluorescent labels using chemical conjugation methods.
  • It would be preferable to use recombinant DNA technology to produce PCPs and their derivatives.
  • a recombinant PCP derived from the dinoflagellate Amphidinium has been produced in E. coli (Miller et al., Photosynthesis Research (2005), 86, 229-240).
  • the recombinant PCP was expressed as an unfolded protein that produced inclusion bodies. The study did not generate soluble, active protein; subsequent refolding of the inclusion bodies was required to produce native-like PCP.
  • When recombinant DNA encoding a particular recombinant protein is introduced into a host organism, the protein in question will not necessarily be expressed in a bioactive, soluble form.
  • recombinant proteins may be expressed either in their soluble form or as inclusion bodies.
  • the production of soluble, bioactive protein in the cytosol is highly desirable as subsequent downstream purification is normally quite straightforward.
  • obtaining bioactive protein from inclusion bodies is a far more cumbersome process. Solubilisation and refolding necessitate many operational steps and generally result in very low levels of recovered, refolded protein which, moreover, may have lost much of its biological activity. Accordingly, the formation of inclusion bodies remains a significant barrier to the industrial production of recombinant proteins.
  • Inclusion bodies are dense particles of aggregated protein found both in the cytoplasmic and periplasmic spaces of E. coli during the expression of high levels of heterologous protein. There is growing evidence to indicate that the formation of inclusion bodies occurs as a result of the intracellular accumulation of partially folded expressed proteins which aggregate through non-covalent hydrophobic or ionic interactions or a combination of the two. The exact mechanisms by which proteins form inclusion bodies have not been fully elucidated and it is not yet possible to predict in advance whether a given protein is likely to be expressed as such.
  • E. coli was the first host used to express Eli Lilly's human recombinant insulin in 1982, and the protein was in fact produced as inclusion bodies requiring oxidative protein folding steps to achieve natively folded, bioactive insulin. Subsequently, insulin has been recombinantly produced as soluble protein in yeast in which insulin precursors are simultaneously secreted and processed without the need for refolding.
  • Another example of the formation of inclusion bodies is Monsanto's bovine growth hormone product launched in 1994, the manufacturing process for which also required additional, labour-intensive steps of solubilising and refolding when implemented at industrial scale.
  • The formation of inclusion bodies reduces the economic feasibility of protein production by increasing manufacturing costs and, for this reason, most biopharmaceutical companies are unwilling to implement processes in which in vitro protein refolding is required. Therefore, in industrial bioprocessing, one of the major challenges in E. coli protein expression is obtaining the desired recombinant product in its soluble, bioactive form.
  • Limitations of E. coli as an expression system include its inability to perform many of the post-translational modifications found in eukaryotic proteins, the lack of a secretion mechanism for the efficient release of protein into the culture medium and the limited ability of the bacterium to facilitate extensive disulfide bond formation.
  • the present invention has been made from a consideration of these problems and describes for the first time the recombinant expression of soluble PCPs.
  • the present invention is concerned with novel isolated nucleic acid sequences encoding PCPs and the recombinant and soluble expression of the novel PCP apo-proteins in high yield.
  • Gene sequences were isolated from a Symbiodinium species (zooxanthellae) that lives inside the soft coral species Sinularia flexibilis, and which has been termed 'Symbiodinium Sinularia flexibilis' or 'S. S. flexibilis'.
  • a protein selected from: a protein having at least 95% sequence identity with the amino acid sequence of SEQ ID NO:1;
  • the protein can be expressed recombinantly as soluble, highly active
  • the protein is a PCP apoprotein ('apo-PCP') with chlorophyll and carotenoid-binding properties.
  • the apo-PCP can be readily reconstituted with chlorophylls and carotenoids, without the need for refolding, to yield highly fluorescent protein complexes (PCP holoproteins; 'holo-PCP') with near-infrared fluorescent properties. Excitation and emission have been measured at 475 nm and 675 nm, respectively.
  • the amino acid sequence of SEQ ID NO:1 corresponds to the full-length PCP monomer ('fPCP').
  • the inventors have surprisingly observed that a C-terminal portion of the PCP is sufficient for PCP activity.
  • the C-terminal domain of the PCP has chlorophyll and carotenoid-binding properties and the holo-protein results in near-infrared fluorescence.
  • the amino acid sequence of SEQ ID NO:2 corresponds to a C-terminal fragment of the fPCP, also termed herein 'C-terminal PCP' or 'cPCP'. Sequencing of the cPCP revealed a single E147A amino acid substitution in the recombinant cPCP sequence.
  • the fPCP and cPCP have approximate molecular masses of 36 kDa and 20 kDa respectively, as determined by SDS-PAGE.
  • Even after reconstitution with xanthophyll and chlorophyll, cPCP has been found to have a mass of only 22.8 kDa, making this protein one of the smallest infrared proteins known. In general, the smaller the protein, the more sterically accessible it will be as a fusion protein in binding to its biological targets. In comparison to eqFP670, which is a dimer of mass 52 kDa, the present invention provides a holo-cPCP with a mass of approximately 23 kDa. Thus, the recombinant PCPs of the present invention are ideal labels for designing near-infrared fusion constructs as biotherapeutics.
  • a further aspect of the present invention provides a recombinant soluble PCP, or a fragment or derivative thereof.
  • the native-like fluorescent proteins according to the present invention are particularly useful as fluorescent labels.
  • the PCPs are also useful for the creation of fusion proteins in combination with other recombinant proteins. Site-specific attachment and conjugation are difficult to achieve with naturally-derived PCPs using conventional chemical protein conjugation methods and the production of true fusion proteins is essentially impossible by those means.
  • An advantage of the PCPs as described herein is their amenability to recombinant production in high yields. Soluble, active protein is readily obtainable by expression in a host such as E. coli, which removes the need for the protein refolding required in the prior art and greatly simplifies production. The high yield of recombinant PCPs was unexpected and unusual, and substantially more PCP protein is produced than by any prior art purification method.
  • The novel PCPs according to the present invention have excellent stability. It is highly desirable to produce protein-based reagents or biotherapeutics that have high thermostability, as this ensures the activity of the product during shipment, storage and usage (e.g. in bioassays or in vivo administration).
  • the protein structure may remain substantially intact (i.e. not substantially unfolded or denatured) at temperatures up to 60°C, 70°C, 80°C or 90°C. It has been observed that even at 95°C, only a small part of the protein was denatured.
  • the PCPs of the present invention are monomeric.
  • Monomeric fluorescent proteins are advantageous in cell-based and in vitro assays as they generally exhibit reduced aggregation. Fluorescent aggregates can cause interference in assays or increases in background fluorescence signals. Further, an important regulatory concern with biotherapeutics is the need to demonstrate homogeneous, monomeric proteins with minimal aggregated products. Aggregated products may cause adverse immunological responses and loss of efficacy of the drug. Structural analysis has confirmed that the recombinant PCPs of the present invention are tightly folded monomers.
  • the ability to produce highly soluble recombinant PCPs constitutes a major technological breakthrough in the manufacture and applications of PCPs and represents a significant step forward in the state of the art.
  • the recombinant PCPs according to the present invention provide a number of advantages over naturally-derived PCPs.
  • Recombinant DNA and fermentation technology produces high yields of protein - at least ten-fold greater than the equivalent biomass of naturally-derived PCPs extracted from living organisms.
  • Even if naturally-derived PCPs could be purified in sufficient quantities, their use as fluorescent labels is technically difficult because of the limitations of conventional chemical protein conjugation methods.
  • Such methods suffer from the general disadvantage of protein heterogeneity due to the random manner in which proteins within a population couple to one another - an undesirable natural phenomenon which affects their active sites and results in a reduction of their biological activity. This is also true for chemical immobilisation techniques based on attaching proteins to functionalized surfaces. Fusion proteins on the other hand may be engineered at the genetic level with enzymatic tags that can catalyse the bioconjugation of proteins to surfaces.
  • fusion proteins which may be useful, for example, as diagnostic reagents, biotherapeutics and imaging reagents, such as recombinant antibody fusions.
  • the recombinant and soluble PCPs as described in the present application are apo-proteins that have the ability to bind different carotenoids and chlorophylls. This is advantageous as their fluorescence and absorbance properties can be modulated by reconstitution with a wide range of carotenoids, chlorophylls and synthetically-produced analogues.
  • naturally-derived PCPs that are extracted from living organisms have fixed amounts of carotenoids and chlorophylls that are pre-bound.
  • Figure 1 shows a schematic summary which contrasts the prior art and the present invention: panel A shows the chemical conjugation of natural isolates of PCP to proteins via crosslinkers and heterobifunctional linkers (patent numbers US 4,876,190 and US 6,133,429); panel B illustrates the published method of Miller et al., which produces recombinant but aggregated PCPs as inclusion bodies and requires at least twelve steps to refold the protein; and panel C illustrates the present invention according to which a novel, highly soluble, high yielding, recombinant PCP can be produced in a single-step process without the need for refolding, thus opening up new applications for recombinant PCPs as protein/peptide fusions.
  • embodiments of the present invention provide for the production of recombinant PCPs with a single processing step of centrifugation prior to chromatography.
  • this single step is the centrifugation of lysed cells (e.g. at 10,000 rpm) to produce supernatant containing the soluble recombinant PCP that can be directly loaded and purified onto chromatography columns (the pellet containing cellular debris can be discarded).
  • the method of Miller et al. produces 10-20 mg of soluble protein after the refolding process from a fermentation of 5 litres of E. coli culture. This equates to 2-4 mg of soluble protein per litre of culture.
  • the present invention produces substantially more protein, yields of 136 mg of soluble protein per litre having been produced in one embodiment - at least 34 times higher than the yields reported by Miller et al.
  • the protocol of Miller et al. admits to losses in protein yield due to problems of refolding which were "high and varied considerably". This loss of protein yield is a major problem with methods based on the refolding of inclusion bodies.
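
The yield comparison above reduces to simple arithmetic. The following Python sketch reproduces it from the figures quoted in the text. The variable and function names are illustrative only and do not come from the application.

```python
# Figures quoted in the text above.
MILLER_TOTAL_MG = (10.0, 20.0)    # soluble protein recovered after refolding (range)
MILLER_CULTURE_L = 5.0            # fermentation volume in litres
PRESENT_YIELD_MG_PER_L = 136.0    # soluble yield reported for one embodiment


def yield_per_litre(total_mg: float, volume_l: float) -> float:
    """Convert a total protein mass into a per-litre yield."""
    return total_mg / volume_l


# Per-litre range for the refolding method: 2-4 mg/l.
miller_low, miller_high = (
    yield_per_litre(mass, MILLER_CULTURE_L) for mass in MILLER_TOTAL_MG
)

# Fold improvement against the best and worst ends of that range.
fold_vs_best = PRESENT_YIELD_MG_PER_L / miller_high
fold_vs_worst = PRESENT_YIELD_MG_PER_L / miller_low

print(f"Refolding method: {miller_low:.0f}-{miller_high:.0f} mg/l")
print(f"Fold improvement: {fold_vs_best:.0f}x to {fold_vs_worst:.0f}x")
```

The "at least 34 times" figure corresponds to the comparison against the upper end of the refolding range (4 mg per litre).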
  • the present invention encompasses the PCP proteins associated with ligands, in particular chlorophylls and carotenoids.
  • a further aspect of the present invention therefore provides a protein complex including a PCP protein as described herein or a fragment or derivative thereof.
  • the absorbance and fluorescence properties of the PCPs and their derivatives can be varied by the reconstitution of apo-PCPs with different carotenoids and chlorophylls and their derivatives and analogues, including synthetic analogues.
  • the binding sites in the PCPs can bind different carotenoids and chlorophylls and structurally related molecules.
  • the inventors have observed that the simultaneous binding of xanthophyll with chlorophyll a or chlorophyll b to cPCP results in the visual appearance of red fluorescence.
  • the inventors have further found that the binding of xanthophyll with chlorophyll a produces more intense fluorescence compared to chlorophyll b.
  • any natural or synthetic chlorophyll or chlorophyll analogue, and derivatives thereof, may be used in aspects of the present invention, either alone or in any combination.
  • Non-limiting examples include chlorophyll a, chlorophyll b, chlorophyll c, chlorophyll d, chlorophyll e, chlorophyll f, bacteriochlorophylls, 2-(1-hexyloxyethyl)-2-devinyl pyropheophorbide-a (HPPH), porphyrin structures, synthetic derivatives of porphyrins, corrins, chlorins (2,3-dihydroporphyrins), corphins and heme, and analogues and derivatives thereof.
  • carotenoids and analogues and derivatives thereof may be used in aspects of the present invention, either alone or in any combination.
  • Peridinin and its analogues, and derivatives thereof may be employed.
  • Other non-limiting examples include α-carotene, β-carotene, γ-carotene, δ-carotene, ε-carotene, ζ-carotene, lycopene, neurosporene, phytoene, phytofluene, antheraxanthin, astaxanthin, canthaxanthin, citranaxanthin, cryptoxanthin, diadinoxanthin, diatoxanthin, dinoxanthin, flavoxanthin, fucoxanthin, lutein, neoxanthin, rhodoxanthin, rubixanthin, violaxanthin, zeaxanthin, abscisic acid, apocarotenal, bixin, crocet
  • Naturally occurring carotenoids include the following examples: hydrocarbons including lycopersene (7,8,11,12,15,7',8',11',12',15'-decahydro-γ,γ-carotene), phytofluene, hexahydrolycopene (15-cis-7,8,11,12,7',8'-hexahydro-γ,γ-carotene), torulene (3',4'-didehydro-β,γ-carotene) and α-zeacarotene (7',8'-dihydro-ε,γ-carotene); alcohols including alloxanthin, cynthiaxanthin, pectenoxanthin, cryptomonaxanthin ((3R,3'R)-7,8,7',8'-tetradehydro-β,β-carotene-3,3'-diol), crustaxanthin (β,-carotene-3,4,3'
  • the term 'amino acid' as used herein includes naturally-occurring amino acids, naturally-occurring amino acid structural variants, and synthetic non-naturally occurring analogues that are capable of participating in peptide bonds.
  • the term 'protein' as used herein explicitly permits post-translational and post-synthetic modifications, such as glycosylation.
  • the protein may comprise a full-length protein or polypeptide, or any functional component or fragment thereof.
  • a full-length protein comprises the complete structure of a transcribed gene which may consist of single or multiple domains of discrete secondary structure.
  • a polypeptide comprises two or more amino acids that are linked together to form peptidic bonds and form part of the primary sequence of the protein.
  • Functional components comprise secondary structural elements which are involved in the functional regions of a protein, for example binding sites.
  • proteins described herein may be synthesised or purified using any suitable techniques.
  • Proteins of the present invention may comprise or consist of the amino acid sequence of SEQ ID NO: 1 or SEQ ID NO: 2, or a fragment or derivative thereof.
  • Proteins of the present invention may comprise or consist of a variant of the amino acid sequence according to a reference sequence as used herein (for example, SEQ ID NO:1 or SEQ ID NO:2), or fragment or derivative of said variant.
  • 'Variants' of the amino acid sequence include insertions, deletions and substitutions, either conservative or non-conservative.
  • conservative substitution refers to the substitution of an amino acid within the same general class (e.g. an acidic amino acid, a basic amino acid, a non-polar amino acid, a polar amino acid or an aromatic amino acid) by another amino acid within the same class.
  • the meaning of a conservative amino acid substitution and non-conservative amino acid substitution is well known in the art. Conservative substitutions are preferred.
  • the variant may, for example, have an amino acid sequence which has at least 96% identity with the amino acid sequence according to a reference sequence (for example, SEQ ID NO: 1 or SEQ ID NO:2) or a fragment thereof, for example at least 97%, at least 98% or at least 99% identity.
  • the percent sequence identity between two polypeptides may be determined using a suitable computer program, for example the CLUSTAL 2.1 multiple sequence alignment program (Larkin et al. (2007) Bioinformatics 23(21): 2947-2948).
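
As a rough illustration of what a percent-identity figure means, the Python sketch below scores two sequences that have already been aligned (gaps written as '-'). It uses one common convention: identical positions divided by the number of aligned columns in which both sequences carry a residue. This convention is an assumption; alignment programs such as CLUSTAL may use a different denominator, and the two short sequences are invented placeholders, not SEQ ID NO:1 or SEQ ID NO:2.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where both aligned sequences have a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no residue-residue columns to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment: 10 columns, one mismatch.
reference = "MKTAYIAKQR"
variant = "MKTAYIAKQK"

identity = percent_identity(reference, variant)
print(f"{identity:.1f}% identity")
print("meets a 96% threshold:", identity >= 96.0)
```

For a reference of roughly 300 residues, a 96% threshold under this convention tolerates about a dozen mismatched positions.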
  • Protein variants in which less than 50, less than 40, less than 30, less than 20, less than 10, or 5-50, 5-25, 5-10, 1-5, or 1-2 amino acids are substituted, deleted, or added in any combination are preferred.
  • the present invention includes functional variants of the protein which exhibit PCP ligand-binding activity (i.e. binding of carotenoids and chlorophylls and their derivatives or analogues).
  • Ligand binding activity can be determined by measuring near-infrared fluorescence following reconstitution of the protein with chlorophyll and carotenoid (see Example 6 and Figure 6).
  • PCP ligand-binding activity may also be determined using prior art methods, for example the method of Miller et al. (Photosynthesis Research (2005), 86, 229-240).
  • In the chlorophyll and carotenoid reconstitution method of Miller et al., PCP apoprotein (typically 200-400 µg protein) in 50 mM Tris-HCl, pH 8.0, was made to 25 mM Tricine, 10 mM KCl, pH 7.6 and mixed with a stoichiometric amount of PCP pigments dissolved in ethanol, which resulted in a final ethanol concentration of 15% in a volume of 1-1.5 ml.
  • Pigment protein stoichiometry was based on 1 mg/ml native PCP having an A476 nm of 21.8 and assuming peridinin in organic solution has the same specific extinction as in PCP. Chlorophylls were added as 1 A at the Qy band to 4 A at peridinin maximum.
  • the initial volumes were increased with proportionally scaled-up amounts of pigment and PCP apoprotein.
  • the samples were held at 4°C for 72 h and then diluted to give an A 670 nm ⁇ 0.05 for fluorescence measurements and A670 nm ⁇ 0.25 for absorbance measurements.
  • Reconstituted samples were equilibrated to 5 mM Tricine 2 mM KCl pH 7.6 by passage through a PD10 column and bound to a column of DEAE Tris-acryl (Sigma). After washing, the reconstituted PCP was removed with 5 mM Tricine, 2 mM KCl pH 7.6 containing 0.1 M NaCl for N-domain PCP and 0.06 M NaCl for full-length PCP.
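
The quantities in the reconstitution protocol above can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: it assumes that all ethanol in the mixture comes from the pigment stock, that absorbances are read in a 1 cm path, and that the constants are exactly those quoted in the text. The function names are hypothetical.

```python
A476_PER_MG_PER_ML = 21.8        # A476 of 1 mg/ml native PCP, as quoted above
ETHANOL_FRACTION = 0.15          # final ethanol concentration (v/v)
CHL_PER_PERIDININ_ABS = 1.0 / 4  # 1 A at the chlorophyll Qy band per 4 A at the peridinin maximum


def ethanol_volume_ml(final_volume_ml: float) -> float:
    """Volume of ethanolic pigment solution giving 15% ethanol in the final mix."""
    return ETHANOL_FRACTION * final_volume_ml


def native_pcp_mg_per_ml(a476: float) -> float:
    """Native-PCP-equivalent concentration implied by an A476 reading."""
    return a476 / A476_PER_MG_PER_ML


def chlorophyll_qy_absorbance(peridinin_absorbance: float) -> float:
    """Chlorophyll Qy absorbance to add for a given peridinin absorbance."""
    return CHL_PER_PERIDININ_ABS * peridinin_absorbance


for volume in (1.0, 1.5):
    print(f"{volume:.1f} ml mix -> {ethanol_volume_ml(volume):.3f} ml ethanolic pigment")
print(f"A476 of 21.8 -> {native_pcp_mg_per_ml(21.8):.1f} mg/ml native PCP equivalent")
print(f"4 A peridinin -> {chlorophyll_qy_absorbance(4.0):.1f} A chlorophyll (Qy)")
```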
  • fPCP (SEQ ID NO:1)
  • the inventors have highlighted the following amino acids as important in binding either chlorophylls or carotenoids: tyrosine 191, histidine 229, histidine 230, isoleucine 254, tyrosine 270, phenylalanine 301 and tyrosine 302.
  • one or more of these amino acids is conserved (which includes conservative substitution), such as two or more, three or more, four or more, five or more, six or more, or all seven amino acids.
  • fPCP (SEQ ID NO:1)
  • the inventors have highlighted the following amino acids as important in the N-terminal binding pocket: phenylalanine 28, histidine 66, histidine 67, asparagine 89, tyrosine 108 and tyrosine 136.
  • one or more of these amino acids is conserved (which includes conservative substitution), such as two or more, three or more, four or more, five or more, or all six amino acids.
  • cPCP SEQ ID NO:2
  • the inventors have highlighted the following amino acids as important in binding either chlorophylls or carotenoids: tyrosine 47, histidine 85, histidine 86, isoleucine 110, tyrosine 126, phenylalanine 157 and tyrosine 158.
  • one or more of these amino acids is conserved (which includes conservative substitution), such as two or more, three or more, four or more, five or more, six or more, or all seven amino acids.
  • polyhistidine-tagged derivatives of the novel proteins may comprise or consist of a protein with the amino acid sequence of SEQ ID NO:3 or SEQ ID NO:4, or a fragment or derivative thereof.
  • polyhistidine-tagged fPCP SEQ ID NO:3
  • the inventors have highlighted the following amino acids as important in binding either chlorophylls or carotenoids: tyrosine 206, histidine 244, histidine 245, isoleucine 269, tyrosine 285, phenylalanine 316 and tyrosine 317.
  • one or more of these amino acids is conserved (which includes conservative substitution), such as two or more, three or more, four or more, five or more, six or more, or all seven amino acids.
  • polyhistidine-tagged fPCP SEQ ID NO:3
  • the inventors have highlighted the following amino acids as important in the N-terminal binding pocket: phenylalanine 43, histidine 81, histidine 82, asparagine 104, tyrosine 123 and tyrosine 151.
  • one or more of these amino acids is conserved (which includes conservative substitution), such as two or more, three or more, four or more, five or more, or all six amino acids.
  • polyhistidine-tagged cPCP SEQ ID NO:4
  • the inventors have highlighted the following amino acids as important in binding either chlorophylls or carotenoids: tyrosine 62, histidine 100, histidine 101, isoleucine 125, tyrosine 141, phenylalanine 172 and tyrosine 173.
  • one or more of these amino acids is conserved (which includes conservative substitution), such as two or more, three or more, four or more, five or more, six or more, or all seven amino acids.
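
The residue numbers listed above for the four constructs differ by constant offsets, which can be read off the listed positions (for example, tyrosine 191 of SEQ ID NO:1 appears as tyrosine 47 in SEQ ID NO:2 and as tyrosine 206 in SEQ ID NO:3). The Python sketch below makes that bookkeeping explicit. The offsets of 144 and 15 are inferred from the numbers in the text rather than stated in the application, and the function names are hypothetical.

```python
# Offsets inferred from the listed positions (assumptions, not stated values).
CPCP_OFFSET = 144     # fPCP position minus cPCP position (191 -> 47)
HIS_TAG_OFFSET = 15   # tagged position minus untagged position (191 -> 206)

# Binding residues of fPCP (SEQ ID NO:1) as listed in the text.
FPCP_C_TERMINAL_SITES = [191, 229, 230, 254, 270, 301, 302]
FPCP_N_TERMINAL_SITES = [28, 66, 67, 89, 108, 136]


def fpcp_to_cpcp(position: int) -> int:
    """Map an fPCP residue number onto cPCP numbering."""
    if position <= CPCP_OFFSET:
        raise ValueError("residue lies outside the C-terminal fragment")
    return position - CPCP_OFFSET


def with_his_tag(position: int) -> int:
    """Map an untagged residue number onto the polyhistidine-tagged construct."""
    return position + HIS_TAG_OFFSET


cpcp_sites = [fpcp_to_cpcp(p) for p in FPCP_C_TERMINAL_SITES]
tagged_fpcp_sites = [with_his_tag(p) for p in FPCP_C_TERMINAL_SITES]
tagged_cpcp_sites = [with_his_tag(p) for p in cpcp_sites]

print("cPCP (SEQ ID NO:2):       ", cpcp_sites)
print("tagged fPCP (SEQ ID NO:3):", tagged_fpcp_sites)
print("tagged cPCP (SEQ ID NO:4):", tagged_cpcp_sites)
```

The same offsets are consistent with the stated sequence lengths: 313 residues for fPCP against 169 for cPCP, and 328 and 184 residues for the two tagged constructs.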
  • With reference to fPCP (SEQ ID NO:1), the following are unique amino acids: isoleucine 95, leucine 137, glutamic acid 147, aspartic acid 202, isoleucine 254, alanine 264, valine 295 and glycine 296.
  • the amino acids of SEQ ID NO: 1 are numbered from 1 to 313 starting from aspartic acid 1 and ending at arginine 313.
  • a further aspect of the present invention comprises proteins, and fragments and derivatives thereof, having at least 80% sequence identity, for example at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity, with the amino acid sequence of SEQ ID NO:1 and having one or more of: isoleucine 95, leucine 137, glutamic acid 147, aspartic acid 202, isoleucine 254, alanine 264, valine 295 and glycine 296.
  • cPCP SEQ ID NO:2
  • amino acids of SEQ ID NO: 2 are numbered from 1 to 169 starting from valine 1 and ending at arginine 169.
  • a further aspect of the present invention comprises proteins, and fragments and derivatives thereof, having at least 80% sequence identity, for example at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity, with the amino acid sequence of SEQ ID NO:2 and having one or more of: aspartic acid 58, isoleucine 110, alanine 120, valine 151 and glycine 152.
  • With reference to polyhistidine-tagged fPCP (SEQ ID NO:3), the following are unique amino acids: isoleucine 110, leucine 152, glutamic acid 162, aspartic acid 217, isoleucine 269, alanine 279, valine 310 and glycine 311.
  • the amino acids of SEQ ID NO:3 are numbered from 1 to 328 starting from methionine 1 and ending at arginine 328.
  • a further aspect of the present invention comprises proteins, and fragments and derivatives thereof, having at least 80% sequence identity, for example at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity, with the amino acid sequence of SEQ ID NO:3 and having one or more of: isoleucine 110, leucine 152, glutamic acid 162, aspartic acid 217, isoleucine 269, alanine 279, valine 310 and glycine 311.
  • polyhistidine-tagged cPCP SEQ ID NO:4
  • amino acids of SEQ ID NO: 4 are numbered from 1 to 184 starting from methionine 1 and ending at arginine 184.
  • a further aspect of the present invention comprises proteins, and fragments and derivatives thereof, having at least 80% sequence identity, for example at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity, with the amino acid sequence of SEQ ID NO:4 and having one or more of: aspartic acid 73, isoleucine 125, alanine 135, valine 166 and glycine 167.
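
The "having one or more of" condition in the passages above can be pictured as a position lookup. The Python sketch below checks which of the unique fPCP residues a candidate sequence retains. It assumes the candidate is already numbered like SEQ ID NO:1 (no insertions or deletions), and it uses a placeholder sequence of 'X' characters because the actual sequence listing is not reproduced here. All names are hypothetical.

```python
# Unique residues of fPCP (SEQ ID NO:1) as listed in the text: position -> one-letter code.
FPCP_UNIQUE = {95: "I", 137: "L", 147: "E", 202: "D", 254: "I", 264: "A", 295: "V", 296: "G"}


def retained_positions(sequence: str, expected: dict) -> list:
    """Return the 1-based positions at which the sequence carries the expected residue."""
    return sorted(
        pos for pos, residue in expected.items()
        if pos <= len(sequence) and sequence[pos - 1] == residue
    )


def placeholder_sequence(length: int, residues: dict) -> str:
    """Build a dummy sequence of 'X' with the given residues at 1-based positions."""
    chars = ["X"] * length
    for pos, residue in residues.items():
        chars[pos - 1] = residue
    return "".join(chars)


toy = placeholder_sequence(313, FPCP_UNIQUE)   # stands in for a 313-residue fPCP
mutated = toy[:146] + "A" + toy[147:]          # glutamic acid 147 replaced by alanine

print(retained_positions(toy, FPCP_UNIQUE))
print(retained_positions(mutated, FPCP_UNIQUE))
print("has one or more unique residues:", len(retained_positions(mutated, FPCP_UNIQUE)) >= 1)
```

A real check would first align the candidate to the reference and translate positions through the alignment, then combine this test with the identity threshold.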
  • Suitable fragments or derivatives of PCP proteins according to the present invention may comprise, for example, at least 50 amino acid residues, for instance at least 75 amino acid residues, or at least 100 amino acid residues, or at least 120 amino acid residues. References to proteins and fragments of proteins also encompass derivatives of such proteins or fragments, except for where the context requires otherwise.
  • the present invention includes, in particular, polyhistidine-tagged derivatives of the PCP proteins of the present invention, or fragments thereof.
  • Such derivatives have a terminal polyhistidine tag of at least five histidine residues, typically six histidine residues, preferably at the N-terminus of the protein.
  • the polyhistidine tag is useful in isolation, purification, binding and immobilisation of the tagged proteins, as is known to those skilled in the art.
  • recombinant polyhistidine-tagged PCP proteins of the present invention may conveniently be isolated following expression in a host cell by exposing the cell lysate to a Ni²⁺-NTA resin.
  • the polyhistidine tag may also be associated with a suitable amino acid sequence that facilitates removal of the tag from the protein using an endopeptidase, such as enterokinase.
  • the polyhistidine-tagged protein may comprise or consist of a protein with the amino acid sequence of SEQ ID NO:3 or SEQ ID NO:4, or a fragment or derivative thereof.
  • a protein selected from:
  • Variants of the polyhistidine-tagged protein may have an amino acid sequence which has at least 96% identity with the amino acid sequence according to SEQ ID NO:3 or a fragment thereof, for example at least 97%, at least 98% or at least 99% identity.
  • Variants of the polyhistidine-tagged protein may have an amino acid sequence which has at least 98% identity with the amino acid sequence according to SEQ ID NO: 4 or a fragment thereof, for example at least 99% identity.
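  • The identity thresholds above can be illustrated with a minimal sketch. It assumes that identity is counted over the aligned columns in which neither sequence has a gap; the text here does not fix a particular convention, and other conventions give slightly different values. The function names and the toy alignment are hypothetical and are not taken from any SEQ ID NO.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # gapped columns are skipped (an assumed convention)
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the pair reaches the stated identity threshold (in percent)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy alignment, not taken from any SEQ ID NO.
reference = "MKVLA-GT"
candidate = "MKILAAGT"
identity = percent_identity(reference, candidate)
print(f"{identity:.1f}% identity")
print("meets 80% threshold:", meets_threshold(reference, candidate, 80.0))
print("meets 96% threshold:", meets_threshold(reference, candidate, 96.0))
```

  • In practice the alignment itself would come from a dedicated alignment tool; the sketch only shows how a threshold such as 96% or 98% would be applied once an alignment is available.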
  • the PCP proteins of the present invention may be conjugated to other agents, including proteins, biomolecules and polymers.
  • the high thermal stability of the proteins of the present invention facilitates their use as labelling reagents for conjugation to such other agents.
  • a further aspect of the present invention provides a protein conjugate comprising a protein as described herein, or fragment or derivative thereof.
  • the protein may be conjugated to any agent of interest.
  • proteins of the present invention include fusion proteins comprising a PCP protein of the present invention, or fragment or derivative thereof, and a protein or peptide of interest.
  • a further aspect of the present invention provides a fusion protein comprising a protein as described herein, or fragment or derivative thereof.
  • Fusion proteins are proteins created through the joining of two or more genes which originally coded for separate proteins.
  • the genes may be derived from the same organism or more commonly they are derived from different organisms.
  • fusion proteins can be arranged as tandem constructs or placed at the N- or C-terminus of a protein. Translation of this fusion gene results (e.g. when expressed in a host organism, such as E. coli) in a single polypeptide with functional properties derived from each of the original proteins.
  • fusions may be created using a range of different proteins or peptides of interest.
  • the present invention provides for the design and engineering of PCP fusion proteins as tandem genetic conjugates.
  • the recombinant PCPs may be given added functionality.
  • a PCP may be engineered at the N-terminus of a single-chain antibody, resulting in a fusion protein that not only binds a specific antigen but can also be detected by virtue of its PCP fluorescence.
  • Such fusion products are useful as in vivo fluorescent diagnostic markers for disease.
  • any known soluble recombinant protein targets of interest can be labelled at the N- and/or C-terminus with PCPs according to the present invention.
  • the recombinant PCP itself may also be engineered with other proteins at the N- and/or C-terminus. It is possible, therefore, to produce multiple fusions of proteins/peptides with recombinant PCPs.
  • Non-limiting examples include fluorescent proteins, antibodies, such as single chain/fragment antibodies, antigens, enzymes, and epitope tags for affinity purification and immobilisation, for example polyhistidine tag, V5 tag, TAP tag, etc.
  • fusion partners for the PCPs include: flag peptide, anti-flag antibodies, glutathione-S-transferase, Staphylococcal protein A, Streptococcal protein G, calmodulin organic ligands, thioredoxin, β-galactosidase, ubiquitin, chloramphenicol acetyltransferase, S-peptide (RNase A, residues 1-20), S-protein (RNase A, residues 21-124), myosin heavy chain, DsbA, biotin subunit, avidin, streptavidin, Strep-tag streptavidin, c-myc, dihydrofolate reductase, CKSc, polyarginine, polycysteine, polyphenylalanine, lac repressor, T4 gp55, growth hormone N terminus, maltose-binding protein, galactose-binding protein, cyclomaltodextrin
  • coli 1 ell protein, TrpE or TrpLE, protein kinases, (AlaTrpTrpPro)n, HAId epitope, BTag (VP7 protein region of bluetongue virus), anti-BTag antibodies, green fluorescent protein.
  • nucleic acid encoding a protein, or fragment or derivative thereof, as herein described.
  • nucleic acid sequences encoding proteins of the present invention may be varied or changed without substantially affecting the sequence of the product encoded thereby, to provide a functional variant thereof.
  • sequences of possible nucleic acids that may be used to encode proteins defined by the amino acid sequences of SEQ ID NO: 1, SEQ ID NO:2, SEQ ID NO:3 and SEQ ID NO:4 will be readily apparent to the skilled person, and the skilled person will be able to make reference to the examples provided as SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7 and SEQ ID NO:8 respectively.
  • the nucleic acid may comprise or consist of a nucleotide sequence selected from SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7 and SEQ ID NO:8, or a fragment thereof.
  • the nucleic acids may contain alterations in the coding regions, non-coding regions, or both.
  • polynucleotide variants containing alterations which produce silent substitutions, additions, or deletions, but do not alter the properties or activities of the encoded protein or fragment thereof.
  • nucleotide variants produced by silent substitutions due to the degeneracy of the genetic code are encompassed.
  • Polynucleotide variants can be produced for a variety of reasons, for example to optimise codon expression for a particular host.
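  • A minimal sketch of a silent substitution follows. It builds the standard genetic code and shows two different coding sequences that translate to the same peptide. The two DNA sequences are hypothetical toy examples and are not fragments of SEQ ID NO:5 to SEQ ID NO:8.

```python
from itertools import product

# Standard genetic code in the conventional compact layout (base order T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence in frame 1, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy sequences differing only by synonymous codons.
original = "ATGGCTCTGAAA"   # ATG GCT CTG AAA
variant = "ATGGCCTTAAAG"    # ATG GCC TTA AAG
print(translate(original), translate(variant))
print("silent variant:", translate(original) == translate(variant))
```

  • Codon optimisation for a particular host amounts to choosing, for each residue, whichever synonymous codon that host uses most efficiently, while leaving the translated sequence unchanged.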
  • nucleic acids described herein may be synthesised or isolated using any suitable techniques.
  • the proteins of the present invention may conveniently be obtained by expression in a host cell and recovery of the expressed protein.
  • the host cell may be a prokaryotic or eukaryotic cell, such as a bacterial cell, yeast cell, plant cell, mammalian cell or insect cell.
  • a nucleic acid encoding the protein of the present invention may be introduced to the host cell for expression using any of the means known in the art. Typically, the nucleic acid is incorporated into an expression vector for expression and production of the protein of interest.
  • Commonly used protein expression systems include those derived from bacteria, yeast, plants, algae, baculovirus/insect, and mammalian cells. Proteins may also be expressed in cell-free systems in which in vitro transcription and translation is achieved using cell lysates derived from bacteria, yeast, baculovirus/insect or mammalian cells.
  • a further aspect of the invention provides a vector comprising a nucleic acid encoding a protein of the present invention as herein described or a fragment or derivative thereof.
  • cloning vector systems may also be used, such as the gateway vectors (Life Technologies, USA) or pET vectors (Merck Group, Germany).
  • Other vectors may contain selectable markers, such as antibiotic resistance other than ampicillin; e.g. tetracycline, chloramphenicol or kanamycin.
  • Histochemical identification of recombinant clones may also include the amino-terminal part of the lacZ gene which produces the 'alpha' part of beta-galactosidase. To produce functional enzyme, the host must provide the 'omega' part of the protein.
  • the two-part functional enzyme can convert the chromogenic substrate 5-bromo-4-chloro-3-indolyl-beta-D-galactoside (X-gal) into a detectable blue compound. It may also include insertion of cloned DNA into the truncated 5' region of lacZ gene, which can abolish the alpha complementation to produce a white colony in the presence of X-gal. Multiple cloning sites can be used (or polylinkers) in the 5' region of the lacZ gene with a great number of restriction endonuclease sites that are unique to this region of the plasmid to allow for great flexibility in the cloning of DNA fragments.
  • Bacteriophage promoters from the T3, T7 and SP6 bacteriophages can be included on either side of the multiple cloning sites. These allow the directed synthesis of RNA using the inserted DNA as template.
  • the BL21 (DE3) strain of E. coli has proved suitable for use in the present invention, but a number of other strains could also be used, for example strains allowing expression from the T7 promoter with codon bias correction (BL21 codon plus; Rosetta (Novagen, USA)) and the improved disulphide bond formation strain Origami (Novagen, USA).
  • Other E. coli genotypes may include lacIq, DE3, pLysS, lon, ompT, araD/ara-14, dnaJ and gor.
  • a host cell comprising a nucleic acid encoding a protein of the present invention as herein described or a vector comprising a nucleic acid encoding a protein of the present invention as herein described.
  • a method of obtaining a protein of the present invention as herein described, or fragment or derivative thereof comprising culturing a host cell of the present invention as herein described, expressing the protein in the host cell and purifying the protein or fragment or derivative thereof.
  • polyhistidine-tagging can be used to facilitate purification.
  • a recombinant polyhistidine-tagged PCP protein, or fragment or derivative thereof, can be recovered from the host cell by lysis and exposure of the lysate to a suitable binding medium, such as a Ni²⁺-NTA resin. If desired, the polyhistidine-tag can subsequently be removed by proteinase digestion.
  • the proteins of the present invention have wide utility.
  • PCPs of the present invention are highly suited for use as fluorescent labels.
  • the recombinant PCPs of the present invention may be used to create protein conjugates, for example those described in US 4,876,190 and US 6,133,429.
  • the proteins of the present invention may also be used for labelling using organic dyes, such as fluorescein, cyanine dyes or nanoparticles to create novel quantum dots.
  • the PCPs, reconstituted with a suitable carotenoid and chlorophyll, can be used as imaging agents.
  • a significant obstacle in using fluorescent protein probes for the in vivo imaging of tissues, or as in vitro diagnostics in bodily fluids such as blood, is that haemoglobin effectively absorbs the blue, green, red and other wavelengths that are used to excite standard fluorescent proteins and that those proteins emit on fluorescence. This interference results in an increase in either scattering or absorption in the tissue or sample and a resultant decrease in light penetration.
  • the proteins of the present invention can be used to provide near-infrared fluorescent imaging agents for in vivo and whole body imaging. With near-infrared light, both scattering and absorption in biological tissues are generally less severe.
  • the quantum efficiency of the energy transfer pathway of natural isolated PCP from peridinin to chlorophyll a has been calculated to be at least 80%, and this value can be represented as a fraction equal to 0.8.
  • a green fluorescent protein homologue, eqFP670, has a similar emission peak at 670 nm; however, the fluorescence quantum yield of this protein is 0.06.
  • Another reported near-infrared fluorescent protein is a bacteriophytochrome which requires addition of biliverdin to become fluorescent and has a quantum yield of 0.07.
  • the fluorescence quantum yield of recombinant PCP is expected to be at least 0.24, which is at least four-fold greater than that of eqFP670 and at least three-fold greater than that of the bacteriophytochrome.
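  • The fold comparison can be checked with a short calculation using only the quantum yields quoted above. The dictionary keys and the function name are illustrative labels introduced for this sketch.

```python
# Fluorescence quantum yields as quoted in the text.
QUANTUM_YIELDS = {
    "recombinant PCP (expected minimum)": 0.24,
    "eqFP670": 0.06,
    "bacteriophytochrome": 0.07,
}


def fold_increase(quantum_yield: float, reference_yield: float) -> float:
    """Ratio of one quantum yield to a reference quantum yield."""
    return quantum_yield / reference_yield


pcp_yield = QUANTUM_YIELDS["recombinant PCP (expected minimum)"]
for name in ("eqFP670", "bacteriophytochrome"):
    ratio = fold_increase(pcp_yield, QUANTUM_YIELDS[name])
    print(f"PCP vs {name}: {ratio:.2f}-fold")
```

  • The ratios are 0.24/0.06 = 4 and 0.24/0.07 ≈ 3.4, consistent with the "four-fold" and "three-fold" figures given as lower bounds.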
  • the recombinant PCPs have an extremely wide Stokes shift of 240 nm (e.g. excitation at 435 nm and emission at 675 nm). This property allows the recombinant PCPs to be excited using conventional argon-ion lasers or halogen lamps as excitation sources which are used in microscopes or through the use of bright LEDs.
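  • As a worked example, the shift follows directly from the two wavelengths quoted above. The sketch also converts it to wavenumbers, a common alternative unit; that conversion is an addition for illustration and is not a value stated in this document.

```python
def stokes_shift_nm(excitation_nm: float, emission_nm: float) -> float:
    """Shift between excitation and emission maxima, in nanometres."""
    return emission_nm - excitation_nm


def stokes_shift_wavenumber(excitation_nm: float, emission_nm: float) -> float:
    """The same shift expressed in cm^-1 (1 cm = 1e7 nm)."""
    return 1e7 / excitation_nm - 1e7 / emission_nm


# Wavelengths quoted in the text: excitation at 435 nm, emission at 675 nm.
shift_nm = stokes_shift_nm(435.0, 675.0)
shift_cm = stokes_shift_wavenumber(435.0, 675.0)
print(f"Stokes shift: {shift_nm:.0f} nm ({shift_cm:.0f} cm^-1)")
```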
  • the near-infrared recombinant PCPs of the present invention can be used as probes with substantial increases in imaging sensitivity and performance.
  • the holo-PCPs of the present invention allow much deeper penetration of light due to low absorbance and light scattering.
  • Examples of practical applications using proteins of the present invention include whole-body fluorescence imaging, for instance to investigate metastasis and tumour localisation, cell migration, embryogenesis and other studies involving deep-tissue imaging not possible with known imaging agents.
  • the absorbance and fluorescence properties of the PCPs and their derivatives can be varied by the reconstitution of apo-PCPs with different carotenoids and chlorophylls.
  • the proteins of the present invention are useful in diagnostic kits and as research agents.
  • fusion proteins of the recombinant PCPs of the present invention with antibodies may be used in fluorescence based immunoassays, such as kits based on ELISAs and in vitro based assays using the PCPs as acceptor/donor pairs with lanthanide based fluorescence assays.
  • drug discovery assays include time-resolved fluorescence (TRF) and homogeneous time-resolved fluorescence assays (HTRF).
  • lanthanide chelates as donor chromophores that allow fluorescence resonance transfer from the lanthanide (e.g. terbium or europium) to long wavelength acceptors such as cyanine dyes, Cy5 and phycobiliproteins such as R-phycocyanin (RPC) and allophycocyanin (APC). Due to similar emission wavelengths, and the fact that the recombinant PCPs can be fused to other proteins, Cy5 and APC may be replaced by recombinant PCP as the preferred acceptor in TRF and HTRF assays. Previously, acceptor dyes and phycobiliproteins have been produced by chemical conjugation methods.
  • An advantage of the recombinant PCPs as described herein is the ability to engineer protein fusions (e.g. to receptors or ligands) at the gene level and thus assay molecular interactions of interest.
  • an assay kit comprising a protein or protein complex as herein described.
  • PCPs of the present invention are also useful for the isolation of highly pure carotenoids and/or chlorophylls and analogues and derivatives thereof.
  • Ni-NTA: Ni²⁺-nitrilotriacetic acid
  • the production of hexahistidine-tagged apo-PCPs according to the present invention allows for the affinity purification of carotenoids or chlorophylls.
  • commercially available Ni-NTA chromatography resins or beads allow the immobilisation of His6-tagged proteins through chelation of the Ni²⁺ ion (e.g. Ni-NTA agarose from Qiagen, USA).
  • immobilised apo-PCP via the His6-tag-Ni-NTA interaction
  • chromatography media can be used to purify any number of carotenoids and chlorophylls from other organisms (e.g. plant, bacterial, or mammalian) and organic waste material (e.g. shellfish extracts, plant extracts, bacterial fermentation extracts, etc.).
  • a useful visual assay of binding of carotenoids and/or chlorophylls is the infrared fluorescence of the chromatography beads (i.e. Ni-NTA beads bound with apo-PCP) on forming holo-PCP.
  • Immobilised holo-PCP may be eluted, for example, using imidazole, so allowing carotenoids and/or chlorophylls to be extracted by organic solvents.
  • the present invention may be used to isolate and purify a wide range of carotenoids and chlorophylls, including those mentioned herein.
  • the recombinant PCPs of the present invention are also useful in the design of novel dual- or tri-functional biotherapeutics.
  • the fusion of PCP with tumour-specific antibodies can be used for the imaging of tumours and consequent therapeutic intervention.
  • Photosensitizer molecules that are chlorophyll derivatives such as HPPH are used in photodynamic therapy (PDT). Therefore, recombinant apo-PCPs of the present invention, or antibody fusions thereof, may be used to 'load' HPPH thereby providing novel photosensitised drug carriers for use in PDT.
  • a therapeutic agent comprising a protein or protein complex as herein described.
  • the proteins of the present invention are also useful for binding drugs and as drug carriers for drug delivery.
  • the inventors have demonstrated the immobilisation of hexahistidine-tagged recombinant PCP on Ni-NTA beads (i.e. on chromatography resin).
  • Recombinant PCPs may thus be formulated for the slow release of drugs (by pre-loading with drug compounds), which can be immobilised, for example on microbeads, polymer surfaces or on quantum dots.
  • recombinant apo-PCPs may be used as drug carriers for retinoid drugs, such as Acitretin, Alitretinoin, Bexarotene, Etretinate, Fenretinide, Isotretinoin, Tazarotene, and Tretinoin, and synthetic analogues thereof.
  • These retinoid drugs have similar structures to carotenoids.
  • the proteins of the present invention are also useful as biosensors, for example for carotenoids.
  • Fluorescent proteins and quantum dots may be engineered as biosensors.
  • apo-PCPs may be reconstituted with chlorophylls only, thereby rendering the resultant PCP-chlorophyll complex a molecular sensor for a wide range of carotenoids.
  • a corresponding specific fluorescence signal will be detectable which can be used to identify the carotenoid in question.
  • Specific applications include, for example, the identification of toxin-producing dinoflagellates, which cause 'red tides', and dinoflagellates that are responsible for ciguatera and other shellfish poisonings.
  • a protein or protein complex as herein described as a biosensor.
  • the proteins of the present invention are also useful as biosensors for porphyrins, such as chlorophylls and heme.
  • apo-PCPs may be reconstituted with carotenoid only, such as peridinin, thereby rendering the resultant PCP-carotenoid complex a molecular sensor for porphyrins.
  • the PCP-carotenoid complex produces a fluorescence signal that can be used to distinguish the porphyrin in question.
  • the biosensors may be used, for example, to detect heme and heme metabolic products in human samples (e.g. blood or urine) for the diagnosis of porphyria (resulting in either an increase of a fluorescent signal or a fluorescence quenching signal).
  • the PCPs of the present invention may be used as enhancing food supplements, commonly known as 'nutraceuticals'.
  • Carotenoids and chlorophylls have antioxidant activity and thus PCPs preloaded with these pigments make ideal nutraceuticals. It should be noted that both carotenoids and chlorophylls are not soluble in water. Therefore, PCP complexes provide a useful formulation for the body to take up the antioxidants in soluble form (or as a slow-release nutrient supplement).
  • PCPs can be produced as food supplements containing photo-protective carotenoids such as these.
  • a food supplement comprising a protein or protein complex as herein described.
  • the proteins of the present invention are also useful in the design and synthesis of novel artificial light-harvesting complexes (LHC).
  • a light-harvesting complex is a compilation of a number of subunit proteins that may be part of a larger supercomplex of a photosystem, the functional unit in photosynthesis.
  • the ultimate goal of an artificial LHC is to exploit the energy of absorbed light to drive chemical reactions.
  • Light-harvesting complexes such as PCPs are particularly well suited as they contain carotenoids and chlorophylls to funnel absorbed energy to the special pair via resonance energy transfer. Carotenoids serve a secondary function, suppressing damaging photochemical reactions, in particular those including oxygen, which can be induced by bright sunlight.
  • Molecular fusions of recombinant PCPs linked to other photoactive chromophores such as GFP provide for the creation of enhanced LHCs.
  • Such 'enhanced' LHCs may be used to create novel organisms with enhanced photosynthetic yields or even artificial 'in vitro' photosynthesis.
  • a light harvesting complex comprising a protein or protein complex as herein described.
  • Figure 1 shows a schematic representation of methods of the prior art contrasted with a representative method according to the present invention: panel A shows the chemical conjugation of natural isolates of PCP to proteins via crosslinkers and heterobifunctional linkers (patent numbers US 4,876,190 and US 6,133,429); panel B illustrates the published method of Miller et al. which produces recombinant but aggregated PCP as inclusion bodies and requires at least twelve steps to refold the protein; and panel C illustrates an embodiment of the present invention in which a novel, highly soluble, high yielding, recombinant His6-tagged PCP is produced in a single-step process without the need for refolding.
  • Figure 2 shows, in schematic form, a representative cloning vector (pET46) suitable for use in aspects of the present invention.
  • the pET46 vector contains a strong T7-lac promoter and an amino-terminal His6-tag coding sequence immediately followed by an Ek/LIC cloning site designed to allow the generation of fusion proteins with minimal vector-encoded sequence.
  • Figure 3 illustrates the detection of expressed PCP proteins according to the present invention.
  • Clones of fPCP- and cPCP-pET46 were transformed into the BL21 (DE3) strain of E. coli and small-scale cell cultures were lysed and analysed by sodium dodecyl sulphate-polyacrylamide gel electrophoresis (SDS-PAGE).
  • the recombinant fPCP and cPCP had approximate molecular masses of 36 kDa and 20 kDa respectively, as determined by SDS-PAGE.
  • Figure 4 illustrates the detection of expressed recombinant PCP proteins according to the present invention, produced at a larger scale.
  • a 1-L culture was established of the BL21 (DE3) strain of E. coli transformed with the cPCP-pET46 clone and the lysate was analysed by SDS-PAGE. High concentrations of expressed hexahistidine-tagged cPCP of 20 kDa mass were observed in each lane and protein was expressed with high yield (136mg/L).
  • Figure 5 illustrates the visual detection of fluorescence from PCP proteins according to the present invention in association with various chlorophylls and carotenoids.
  • cPCP and fPCP were reconstituted with chlorophyll a ('a') or chlorophyll b ('b') and either xanthophyll ('x') or beta-carotene ('c').
  • the subsequent fluorescence under UV light was recorded with a camera.
  • Panels 1, 2, 5 and 6 are reconstitution reactions containing both protein and pigments.
  • Panels 3 and 7 are negative controls, containing only pigments without the proteins, and Panel 4 is a negative control containing only protein without the pigments.
  • the appearance of red fluorescence was observed with cPCP and fPCP only in the presence of xanthophyll or beta-carotene with either chlorophyll a or chlorophyll b.
  • Figure 6 shows fluorescence spectra (excited at 435nm) for PCP proteins according to the present invention in association with a chlorophyll plus xanthophyll.
  • the following fluorescence spectra are illustrated: chlorophyll a; chlorophyll a plus xanthophyll; fPCP plus chlorophyll a plus xanthophyll and cPCP plus chlorophyll a plus xanthophyll.
  • a large fluorescence intensity increase of fPCP and cPCP was observed in the presence of chlorophyll a and xanthophyll.
  • Figure 7 illustrates the measurement of the molecular weight of PCP protein according to the present invention.
  • cPCP was reconstituted with xanthophyll and chlorophyll a.
  • the reconstituted cPCP was analysed with a size-exclusion column coupled to a multi-angle laser light scattering (MALLS) instrument.
  • the major peak pertaining to the protein fraction at 15mL was calculated to have a molecular weight of 22.8 kDa with an rms radius moments of 5nm as determined by MALLS, which indicates that the holo-cPCP is a well-packed spherical monomeric protein.
  • Figure 8 illustrates the secondary structure analysis of PCP protein according to the present invention.
  • the circular dichroism (CD) spectrum of expressed cPCP is shown in Figure 8A.
  • the CD spectrum shows a characteristic signature for a folded protein with a dominant alpha-helix secondary structure (with a dip at 207 nm).
  • the CD spectra for model protein secondary structures when scanned in the UV range are shown in Figure 8B. It is clear from Figure 8A that the CD spectrum is similar to the X-marked curve in Figure 8B, which corresponds to a protein structure with dominant α-helical secondary structure.
  • Figure 9 illustrates the measurement of the thermal stability of PCP protein according to the present invention.
  • CD signals at 207 nm for expressed cPCP were measured at increasing temperatures up to 95°C. The results show that only a 10 mdeg change is observed. This is only about 10% of the total possible mdeg change (Figure 8 shows a maximum of 110 mdeg units in the folded cPCP structure).
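  • The fraction quoted for the thermal melt can be reproduced from the two CD values given in the text: a 10 mdeg change against a folded-state maximum of 110 mdeg. The sketch assumes, as the text does, that the full 110 mdeg signal represents the total change available on unfolding; the function name is illustrative.

```python
def fraction_of_signal_changed(change_mdeg: float, folded_signal_mdeg: float) -> float:
    """Observed CD change as a fraction of the folded-state signal."""
    if folded_signal_mdeg <= 0:
        raise ValueError("folded-state signal must be positive")
    return change_mdeg / folded_signal_mdeg


# Values quoted in the text: 10 mdeg change, 110 mdeg folded-state maximum.
fraction = fraction_of_signal_changed(10.0, 110.0)
print(f"{fraction:.1%} of the folded-state CD signal changed up to 95 degrees C")
```

  • The result is 10/110 ≈ 9%, which the text rounds to 10%.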
  • Figure 10 illustrates the predicted structure of PCPs according to the present invention.
  • Figure 10A shows a 3-D molecular structural diagram of the predicted crystal structure of fPCP (SEQ ID NO: 1) and
  • Figure 10B shows the 3-D structure for cPCP (SEQ ID NO: 2). This was achieved using the modelling software developed by Guex and Peitsch (Guex, N. and Peitsch, M.C. (1997) Electrophoresis 18, 2714-2723).
  • a detailed schematic of the predicted secondary structure of 313 residues of fPCP is illustrated in Figure 10C.
  • This model shows a helical structure (72%) with an alpha solenoid architecture comprised of 16 helices, highlighting the important and unique amino acid residues.
  • Figure 10A and Figure 10B were modelled using the crystal structure of PCP from the dinoflagellate (Amphidinium carterae) (TaxId: 2961, protein data bank (pdb) file: 1PPR) with mutated residues valine 95 to isoleucine, glutamic acid 137 to leucine, lysine 147 to glutamic acid, glutamine 202 to aspartic acid, leucine 254 to isoleucine, serine 264 to alanine, asparagine 295 to valine and alanine 296 to glycine, to give the exact primary sequences of SEQ ID NO:1 (fPCP) and SEQ ID NO:2 (cPCP) respectively.
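  • The substitutions listed for the modelling template can be cross-checked against the characteristic residues recited for the tagged sequences SEQ ID NO:3 and SEQ ID NO:4. The sketch below uses only the positions and residues quoted in this document. It assumes that the residues pair up in order from the C-terminal end, and it makes no claim about why the numbering is shifted; the variable and function names are illustrative.

```python
# (position, original residue, new residue) as listed for the modelling template.
TEMPLATE_SUBSTITUTIONS = [
    (95, "V", "I"), (137, "E", "L"), (147, "K", "E"), (202, "Q", "D"),
    (254, "L", "I"), (264, "S", "A"), (295, "N", "V"), (296, "A", "G"),
]
# (position, residue) as recited for the polyhistidine-tagged sequences.
SEQ3_RESIDUES = [(110, "I"), (152, "L"), (162, "E"), (217, "D"),
                 (269, "I"), (279, "A"), (310, "V"), (311, "G")]
SEQ4_RESIDUES = [(73, "D"), (125, "I"), (135, "A"), (166, "V"), (167, "G")]


def numbering_offsets(template, tagged):
    """Pair residues from the C-terminal end and return the set of offsets."""
    paired = zip(reversed(template), reversed(tagged))
    offsets = set()
    for (template_pos, _old, new), (tagged_pos, residue) in paired:
        if new != residue:
            raise ValueError(f"residue mismatch at template position {template_pos}")
        offsets.add(tagged_pos - template_pos)
    return offsets


print("SEQ ID NO:3 offsets:", numbering_offsets(TEMPLATE_SUBSTITUTIONS, SEQ3_RESIDUES))
print("SEQ ID NO:4 offsets:", numbering_offsets(TEMPLATE_SUBSTITUTIONS, SEQ4_RESIDUES))
```

  • Each comparison yields a single constant offset (+15 for SEQ ID NO:3 and -129 for SEQ ID NO:4), so the three lists describe the same set of residues under different numbering schemes.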
  • Figure 12 shows the sequence alignment of the sequence of polyhistidine-tagged cPCP (SEQ ID NO:4) in relation to PCP sequences from S. kawagutii, S. sp-RKT/203 and A. carterae. Sequence alignment was carried out as described in relation to Figure 11.
  • Sequencing reactions were carried out using pET-RP (reverse primer), a DNA primer of sequence CTAGTTATTGCTCAGCGG, at 10 ⁇ (GATC Biotech Ltd., London) using an ABI 3730xl DNA sequencer, which uses Sanger technology to determine DNA sequences.
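  • Because a reverse primer anneals to the opposite strand, its binding site on the forward (top) strand of the vector reads as the primer's reverse complement. The following sketch computes that sequence for the primer quoted above; the function name is illustrative, and no vector sequence is assumed.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


PET_RP = "CTAGTTATTGCTCAGCGG"  # reverse primer sequence quoted in the text
site_on_forward_strand = reverse_complement(PET_RP)
print(site_on_forward_strand)
```

  • Searching a vector's forward strand for this reverse-complemented sequence would locate where the primer anneals, and reads obtained with the primer run back from that site towards the insert.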
  • PCR fragments of the fPCP (SEQ ID NO: 5) and cPCP (SEQ ID NO: 6) were cloned into pET46 vector (Figure 2) whereby the expression of the cloned gene was under the control of the T7/Lac promoter, which can be induced with IPTG.
  • the highly active T7 RNA polymerase driving the expression of the cloned gene was produced by the host E. coli strain, BL21 (DE3), which carries the polymerase gene on the lambda DE3 lysogen.
  • the expressed protein was tagged with six histidines which were used to facilitate the downstream isolation of the expressed His-tagged proteins from E. coli lysate with Ni²⁺-NTA resins.
  • the His tag can also be cleaved from the protein by proteinase digestion, enterokinase being used in this instance.
  • the calculated molecular weights of the proteins were 38.46 kDa for the fPCP and 19.59 kDa for cPCP.
  • the clones of S. flexibilis fPCP and cPCP were transformed into the BL21 (DE3) strain of E. coli.
  • the recombinant expressed fPCP and cPCP had approximate molecular masses of 36 kDa and 20 kDa respectively as determined by SDS-PAGE. These are in agreement with the theoretically calculated molecular weights from the protein sequence of 38.46kDa for the fPCP and 19.59kDa for cPCP. It was estimated by subsequent purification that the yield was about 100 mg/L of culture of fPCP and cPCP.
  • a one-litre culture of the cPCP clone according to Example 2 was set up under the same conditions as the small-scale culture. After induction at 37°C for 4 hours, the culture was harvested and cells lysed with a cell breaker. Protein was purified on a 5 ml Ni²⁺-NTA column. After washing the column with washing buffer (50 mM phosphate, pH 8 and 300 mM NaCl, including 60 mM imidazole), the bound protein was eluted with 200 mM imidazole in washing buffer in 1 ml aliquots. The eluted protein was analysed with SDS-PAGE and the results are shown in Figure 4. These results show that hexahistidine-tagged cPCP was expressed reproducibly in high yield of approximately 136 mg/L.
  • Panels 3 and 7 are negative controls, containing only pigments without the proteins, and Panel 4 is a negative control containing only protein without the pigments.
  • 'a' represents chlorophyll a
  • 'b' represents chlorophyll b
  • 'x' represents xanthophyll
  • 'c' represents beta-carotene
  • 'cPCP' represents the C-terminal PCP
  • 'fPCP' represents the full-length PCP.
  • both the cPCP (panel 1) and fPCP (panel 2) of the present invention resulted in red fluorescence in the presence of chlorophyll a or chlorophyll b and xanthophyll.
  • Greater fluorescence intensity of fPCP was observed upon reconstitution with chlorophyll a and xanthophyll (panel 2) compared to reconstitution with chlorophyll b and xanthophyll (panel 6).
  • the pure pigments without any proteins did not result in any fluorescence in the range of 484 to 850 nm.
  • the mixture of chlorophyll a and xanthophyll pigments (without PCP) generated a very weak emission signal at 675 nm.
  • chlorophyll a and xanthophyll with the addition of either cPCP or fPCP had significantly higher fluorescence intensity signals (at least eight-fold) at 675 nm as shown in Figure 6. Both PCPs had equivalent fluorescence intensity in these conditions.
  • the protein bands indicated by the arrow heads in Figure 3 were cut from the gels and analysed by LC-MS using an ESI-TRAP mass spectrometer (Bruker Daltonics, Esquire 3000 plus). Both marked proteins in the fPCP and the cPCP clone were identified as PCP using a protein fragment search with Mascot software.
  • the Mascot search results are shown in Table 1. Mascot is a software search engine that uses mass spectrometry data to identify proteins from primary sequence databases.
  • Figure 7 shows a size-exclusion column chromatogram of absorbance (A280 nm or A690) against fraction elution (ml) after injection of cPCP sample (reconstituted with chlorophyll a and xanthophyll).
  • the line with solid circles is the UV signal at 280nm and the solid line is the laser light scattering signal where a 690nm laser light was used.
  • the arrow indicates where the cPCP was eluted which corresponds to the A280 peak.
  • the molecular weight of this cPCP fraction was analysed with a multi-angle laser light scattering (MALLS) instrument.
  • MALLS is a technique for independently determining the absolute molar mass and the average size of particles in solution by detecting how they scatter light.
  • the measured molecular weight was 22.8 kDa as determined by MALLS. This was very close to the calculated molecular weight from the clone's DNA sequence information (coding for 184 amino acid residues, given a molecular weight of 19.59 kDa).
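  • A minimal sketch of how a molecular weight is calculated from a primary sequence, and of how a measured value compares with a calculated one, follows. It sums standard average residue masses and adds one water molecule for the free termini; it ignores bound pigments and any modifications. The test peptide is hypothetical, and the two masses compared at the end are the values quoted in the text.

```python
# Standard average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528


def average_mass_da(sequence: str) -> float:
    """Average molecular mass of an unmodified linear peptide, in daltons."""
    cleaned = "".join(sequence.split()).upper()
    return sum(RESIDUE_MASS[residue] for residue in cleaned) + WATER_MASS


def relative_difference(measured: float, calculated: float) -> float:
    """Signed difference between measured and calculated, relative to calculated."""
    return (measured - calculated) / calculated


# Hypothetical short peptide, used only to exercise the function.
print(f"MHHHHHH: {average_mass_da('MHHHHHH'):.1f} Da")

# Values quoted in the text: 22.8 kDa by MALLS, 19.59 kDa from the sequence.
print(f"MALLS vs calculated: {relative_difference(22.8, 19.59):+.1%}")
```

  • The MALLS measurement was made on cPCP reconstituted with chlorophyll a and xanthophyll, whereas a sequence-based calculation such as this one counts the polypeptide alone.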
  • the molecule has rms radius moments of 5 nm. This indicates that the cPCP is a well-packed, spherical monomer when expressed in E coli.
  • the purified cPCP block was studied using Circular Dichroism with a JASCO J-810 spectropolarimeter with a quartz cuvette with a path length of 10 mm.
  • the data generated indicate that the cPCP consists of mainly alpha helices (see Figure 8). This is consistent with the published crystal structure of PCP proteins (Schulte, Johanning et al. (2010) European Journal of Cell Biology Molecular Biology of Complex Functions of Botanical Systems 89(12): 990-997).
  • Figure 8A shows the CD spectrum of the purified cPCP.
  • Figure 8B shows typical spectra of protein secondary structures when scanned with a CD spectrometer in the UV range.
  • thermostability of expressed PCP in solution: the thermostability of cPCP was studied by measuring its CD signal at 207 nm with incremental temperature increases from 30 to 95°C.
  • the CD signal is a probe for the secondary helical structure of cPCP.
  • Figure 9: these data show that the cPCP is a remarkably stable protein and does not unfold at 60°C. Even at 95°C only a very small part of the protein was denatured, with only a 10 mdeg change observed. This is only about 10% of the total mdeg change possible.
  • Figure 8A shows 110 mdeg units maximum in the folded cPCP structure.
  • Figure 8B illustrates the expected CD spectra for the alpha-helical, beta-sheets and random coil secondary structures of proteins, indicating that the cPCP is mostly alpha-helical in structure.
  • VVAKNQVTTASAPAVVPSGDKIGVAAKALSDASYPFIKDID WLSDIYLKPLPGKTAPDTLKAIDKMIVMGAKMDGNLLKAA AEAHHKAIGSIDAKGVTSAADYEAVNAAIGRLVASVPKATV MDVYNSMAKVVDSTVTNNMFSKVNPLDAVGAAKGFYTFK DVVEASQR Stop

Abstract

The invention relates to novel peridinin-chlorophyll binding proteins (PCPs) and describes for the first time the recombinant expression of soluble PCPs. The invention also relates to nucleic acid sequences encoding the PCPs and to the production of the PCPs. Novel gene sequences were isolated from the Symbiodinium species (zooxanthellae) in the soft coral Sinularia flexibilis. Expression in E. coli produces high yields of soluble, active PCPs. The recombinant PCPs are highly stable and become near infra-red emitting on binding to chlorophylls and carotenoids. The novel PCPs are useful, for example, in fluorescent labelling.

Description

PERIDININ-CHLOROPHYLL BINDING PROTEINS
The present invention relates to novel peridinin-chlorophyll binding proteins (PCPs) and their uses. The invention also relates to nucleic acid sequences encoding the PCPs and to the production of the PCPs.
A variety of fluorescent proteins have been used to visualise cells and components and molecules in cells, including fluorescent proteins derived from jellyfish and corals. However, there is a significant obstacle in using fluorescent protein probes for in vivo imaging of tissues or as in vitro diagnostics in bodily fluids such as blood: haemoglobin effectively absorbs both the blue, green, red and other wavelengths used to excite standard fluorescent proteins and the various wavelengths that they emit on fluorescence. This interference results in an increase in either scattering or absorption in the tissue or sample and a resultant decrease in light penetration - the so-called attenuation length. With near-infrared light, however, both scattering and absorption in biological tissues are generally less severe and attenuation lengths are correspondingly longer. This property allows near-infrared fluorescent proteins to be used as probes with substantial increases in imaging sensitivity and performance.
PCPs have been used as fluorescent labels. These proteins exist in the thylakoid lumen of algae. A unique class of light-harvesting proteins, PCPs use blue-green carotenoids as their primary light-absorbers, transferring the absorbed energy to bound chlorophylls. PCPs are present in most photosynthetic dinoflagellates and zooxanthellae. Notably, PCPs are almost exclusively obtained from extracts of natural organisms including Zooxanthellae (Tridacna species), Amphidinium carterae (Plymouth 450), Cachonina niei, Gonyaulax polyedra, Glenodinium species, Amphidinium rhynocephaleum and Gymnodinium splendens.
Crystal structures and biophysical analysis of purified PCP isolated from Amphidinium carterae have shown non-covalent binding sites for both peridinin and chlorophyll. In general, naturally derived PCPs form trimers, in which each polypeptide contains an unusual jellyroll fold of the α-helical amino- and carboxyl-terminal domains. These domains constitute a scaffold with pseudo-twofold symmetry surrounding a hydrophobic cavity filled by two lipid, eight peridinin and two chlorophyll-a molecules. Each monomer consists of two peridinin chlorophyll-binding units termed the PCP 'block'. There are two forms of PCP monomer: a short version with a molecular weight of 14-16 kDa and a longer version with a molecular weight of 30-35 kDa. The protein complex has a unique emission spectrum that peaks at 676 nm. Unlike holo-PCP (i.e. a PCP fully bound with peridinin and chlorophyll), which is a naturally fluorescent complex, apo-PCP (i.e. a PCP without the bound peridinin and chlorophyll) is non-fluorescent.
PCPs used in fluorescent labelling have typically been natural protein isolates. For example, US 4876190 and US 6133429 disclose the use of naturally-sourced PCPs from living organisms as fluorescent labels using chemical conjugation methods.
It would be preferable to use recombinant DNA technology to produce PCPs and their derivatives. However, whilst a recombinant PCP derived from the dinoflagellate Amphidinium has been produced in E. coli (Miller et al., Photosynthesis Research (2005), 86, 229-240), the recombinant PCP was expressed as an unfolded protein that produced inclusion bodies. The study did not generate soluble, active protein; subsequent refolding of the inclusion bodies was required to produce native-like PCP.
When recombinant DNA encoding a particular recombinant protein is introduced into a host organism, the protein in question will not necessarily be expressed in a bioactive, soluble form. In the case of E. coli, recombinant proteins may be expressed either in their soluble form or as inclusion bodies. The production of soluble, bioactive protein in the cytosol is highly desirable as subsequent downstream purification is normally quite straightforward. On the other hand, obtaining bioactive protein from inclusion bodies is a far more cumbersome process. Solubilisation and refolding necessitate many operational steps and generally result in very low levels of recovered, refolded protein which, moreover, may have lost much of its biological activity. Accordingly, the formation of inclusion bodies remains a significant barrier to the industrial production of recombinant proteins.
Inclusion bodies are dense particles of aggregated protein found both in the cytoplasmic and periplasmic spaces of E. coli during the expression of high levels of heterologous protein. There is growing evidence to indicate that the formation of inclusion bodies occurs as a result of the intracellular accumulation of partially folded expressed proteins which aggregate through non-covalent hydrophobic or ionic interactions or a combination of the two. The exact mechanisms by which proteins form inclusion bodies have not been fully elucidated and it is not yet possible to predict in advance whether a given protein is likely to be expressed as such.
For example, E. coli was the first host used to express Eli Lilly's human recombinant insulin in 1982, and the protein was in fact produced as inclusion bodies requiring oxidative protein folding steps to achieve natively folded, bioactive insulin. Subsequently, insulin has been recombinantly produced as soluble protein in yeast in which insulin precursors are simultaneously secreted and processed without the need for refolding. Another example of the formation of inclusion bodies is Monsanto's bovine growth hormone product launched in 1994, the manufacturing process for which also required additional, labour-intensive steps of solubilising and refolding when implemented at industrial scale.
Thus, the formation of inclusion bodies reduces the economic feasibility of protein production by increasing manufacturing costs and, for this reason, most biopharmaceutical companies are unwilling to implement processes in which in vitro protein refolding is required. Therefore, in industrial bioprocessing, one of the major challenges in E. coli protein expression is obtaining the desired recombinant product in its soluble, bioactive form.
Furthermore, it is not sufficient merely to obtain soluble active protein. In order for production to be truly economical, it must also be expressed at high levels and in high yields. The choice of expression system for the high-level production of recombinant proteins depends on many factors. These include cell growth characteristics, expression levels, intracellular and extracellular expression, posttranslational modifications and the biological activity of the protein of interest; regulatory considerations must also be taken into account in the production of therapeutic proteins. In addition, cost considerations also influence the selection of a particular expression system in terms of process, design and other economic factors. Bacterial, yeast, insect and mammalian expression systems all have their advantages and disadvantages. The major drawbacks of E. coli as an expression system include its inability to perform many of the posttranslational modifications found in eukaryotic proteins, the lack of a secretion mechanism for the efficient release of protein into the culture medium and the limited ability of the bacterium to facilitate extensive disulfide bond formation.
The present invention has been made from a consideration of these problems and describes for the first time the recombinant expression of soluble PCPs.
The present invention is concerned with novel isolated nucleic acid sequences encoding PCPs and the recombinant and soluble expression of the novel PCP apo-proteins in high yield. Gene sequences were isolated from a Symbiodinium species (zooxanthellae) which lives inside the soft coral species Sinularia flexibilis and which has been termed 'Symbiodinium Sinularia flexibilis' or 'S. S. flexibilis'.
According to a first aspect of the present invention there is provided a protein selected from:
a protein having at least 95% sequence identity with the amino acid sequence of SEQ ID NO:1; and
a protein having at least 97% sequence identity with the amino acid sequence of SEQ ID NO:2;
or a fragment or derivative thereof.
Surprisingly, the protein can be expressed recombinantly as soluble, highly active PCP.
The protein is a PCP apoprotein ('apo-PCP') with chlorophyll and carotenoid-binding properties. The apo-PCP can be readily reconstituted with chlorophylls and carotenoids, without the need for refolding, to yield highly fluorescent protein complexes (PCP holoproteins; 'holo-PCP') with near-infrared fluorescent properties. Excitation and emission have been measured at 475nm and 675nm, respectively.
The amino acid sequence of SEQ ID NO: 1 corresponds to the full-length PCP monomer ('fPCP').
The inventors have surprisingly observed that a C-terminal portion of the PCP is sufficient for PCP activity. In other words, the C-terminal domain of the PCP has chlorophyll and carotenoid-binding properties and the holo-protein results in near-infrared fluorescence. The amino acid sequence of SEQ ID NO:2 corresponds to a C-terminal fragment of the fPCP, also termed herein 'C-terminal PCP' or 'cPCP'. Sequencing of the cPCP revealed a single E147A amino acid substitution in the recombinant cPCP sequence. The fPCP and cPCP have approximate molecular masses of 36 kDa and 20 kDa respectively, as determined by SDS-PAGE.
In one embodiment, even after reconstitution with xanthophyll and chlorophyll, cPCP has been found to have a mass of only 22.8 kDa, making this protein one of the smallest infrared proteins known. In general, the smaller the protein, the more sterically accessible it will be as a fusion protein in binding to its biological targets. In comparison to eqFP670, which is a dimer of mass 52 kDa, the present invention provides a holo-cPCP with a mass of approximately 23 kDa. Thus, the recombinant PCPs of the present invention are ideal labels for designing near-infrared fusion constructs as biotherapeutics.
A further aspect of the present invention provides a recombinant soluble PCP, or a fragment or derivative thereof.
The native-like fluorescent proteins according to the present invention are particularly useful as fluorescent labels. The PCPs are also useful for the creation of fusion proteins in combination with other recombinant proteins. Site-specific attachment and conjugation are difficult to achieve with naturally-derived PCPs using conventional chemical protein conjugation methods and the production of true fusion proteins is essentially impossible by those means.
An important advantage of the novel PCPs as described herein is their susceptibility to recombinant production in high yields. Soluble, active protein is readily obtainable by expression in a host such as E. coli, which removes the need for the protein refolding observed in the prior art and greatly simplifies production. The high yield of recombinant PCPs was unexpected and unusual, and substantially more PCP protein is produced than by any prior art purification method.
A further advantage is that the novel PCPs according to the present invention have excellent stability. It is highly desirable to produce protein-based reagents or biotherapeutics that have high thermostability as this ensures the activity of product during shipment, storage and usage (e.g., in bioassays or in vivo administration). In embodiments of the present invention, the protein structure may remain substantially intact (i.e. not substantially unfolded or denatured) at temperatures up to 60°C, 70°C, 80°C or 90°C. It has been observed that even at 95°C, only a small part of the protein was denatured.
Also advantageously, the PCPs of the present invention are monomeric. Monomeric fluorescent proteins are advantageous in cell-based and in vitro assays as they generally exhibit reduced aggregation. Fluorescent aggregates can cause interference in assays or increases in background fluorescence signals. Further, an important regulatory concern with biotherapeutics is the need to demonstrate homogeneous, monomeric proteins with minimal aggregated products. Aggregated products may cause adverse immunological responses and loss of efficacy of the drug. Structural analysis has confirmed that the recombinant PCPs of the present invention are tightly folded monomers.
The ability to produce highly soluble recombinant PCPs constitutes a major technological breakthrough in the manufacture and applications of PCPs and represents a significant step forward in the state of the art. The recombinant PCPs according to the present invention provide a number of advantages over naturally-derived PCPs.
DNA and fermentation technology produces high yields of protein - at least ten-fold greater than the equivalent biomass of naturally-derived PCPs extracted from living organisms. Moreover, even in the case that naturally-derived PCPs could be purified in sufficient quantities, their use as fluorescent labels is technically difficult because of the limitations of conventional chemical protein conjugation methods. Such methods suffer from the general disadvantage of protein heterogeneity due to the random manner in which proteins within a population couple to one another - an undesirable natural phenomenon which affects their active sites and results in a reduction of their biological activity. This is also true for chemical immobilisation techniques based on attaching proteins to functionalized surfaces. Fusion proteins on the other hand may be engineered at the genetic level with enzymatic tags that can catalyse the bioconjugation of proteins to surfaces. In cases where exposed functional groups such as carboxy- or amino-bearing residues are used for conjugation, another disadvantage of chemical conjugation is the coincidental alteration of protein stability. The surface electrostatic charges conferred by carboxy- or amino-bearing surface residues in proteins are important in maintaining protein folding and segregation of hydrophobic and hydrophilic domains. Thus, conjugation chemistries that result in the loss of these charged groups may have a detrimental effect on protein stability, aggregation and activity. Furthermore, such methods demand the use of purified proteins since the reactivity and non-specificity of these methods will also result in the co-immobilisation of other impurities that may be present in the mixture. In cases where large numbers of protein conjugates are required, a commensurate number of purifications is needed, which is extremely costly in terms of resources and time. 
For example, enzymes are employed in a wide range of industrial processes; however, at large scale the conjugation of enzymes to proteins by conventional chemical methods would be impossible or uneconomical. By contrast, recombinant DNA technology allows the expression of proteins with precise control, i.e. the site-specific attachment of proteins as tandem fusion proteins linked through peptide bonds. This circumvents the aforementioned problems associated with the covalent conjugation of proteins. Protein fusions exhibit increased functionality as site-specific modifications are made at the level of the gene through peptide bonds, as opposed to chemical-covalent labelling that requires coupling through exposed functional residues. The ability to produce recombinant PCPs in accordance with the present invention facilitates the large-scale manufacture of fusion proteins which may be useful, for example, as diagnostic reagents, biotherapeutics and imaging reagents, such as recombinant antibody fusions.
Moreover, the recombinant and soluble PCPs as described in the present application are apo-proteins that have the ability to bind different carotenoids and chlorophylls. This is advantageous as their fluorescence and absorbance properties can be modulated by reconstitution with a wide range of carotenoids, chlorophylls and synthetically-produced analogues. In contrast, naturally-derived PCPs that are extracted from living organisms have fixed amounts of carotenoids and chlorophylls that are pre-bound.
To date there have been no reports of the direct expression of a soluble recombinant PCP and only one publication (Miller et al.) reporting the expression of PCPs as inclusion bodies (inactive aggregates) in E. coli. In Miller et al., the DNA sequences of the species Amphidinium carterae were used to produce full-length and N-terminal domain PCP; both of the apo-proteins were expressed as inclusion bodies. An arduous and labour-intensive refolding regime was used to refold the inclusion bodies into active protein. After the cellular lysis of E. coli, an additional twelve processing steps were employed before the use of column chromatography for the purification of the protein (i.e. ten cycles of centrifugation/resuspension plus the solubilisation step of inclusion bodies and centrifugation). Figure 1 shows a schematic summary which contrasts the prior art and the present invention: panel A shows the chemical conjugation of natural isolates of PCP to proteins via crosslinkers and heterobifunctional linkers (patent numbers US 4,876,190 and US 6,133,429); panel B illustrates the published method of Miller et al., which produces recombinant but aggregated PCPs as inclusion bodies and requires at least twelve steps to refold the protein; and panel C illustrates the present invention according to which a novel, highly soluble, high yielding, recombinant PCP can be produced in a single-step process without the need for refolding, thus opening up new applications for recombinant PCPs as protein/peptide fusions.
Thus, embodiments of the present invention provide for the production of recombinant PCPs with a single processing step of centrifugation prior to chromatography. In effect, the further twelve processing steps in the earlier published method of Miller et al. (i.e. ten cycles of centrifugation/resuspension plus the solubilisation step of inclusion bodies plus centrifugation) have been superseded in embodiments of the present invention by a single-step method. In embodiments of the invention, this single step is the centrifugation of lysed cells (e.g. at 10,000 rpm) to produce supernatant containing the soluble recombinant PCP that can be directly loaded onto, and purified on, chromatography columns (the pellet containing cellular debris can be discarded). Thus, the production of soluble protein renders the industrial scale manufacture of recombinant PCPs economical.
Furthermore, the method of Miller et al. produces 10-20 mg of soluble protein after the refolding process from a fermentation of 5 litres of E. coli culture. This equates to 2-4 mg of soluble protein per litre of culture. In contrast, the present invention produces substantially more protein, yields of 136 mg per litre of soluble protein having been produced in one embodiment - at least 34 times higher than the yields reported by Miller et al. The protocol of Miller et al. admits to losses in protein yield due to problems of refolding which were "high and varied considerably". This loss of protein yield is a major problem with methods based on the refolding of inclusion bodies.
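The yield comparison above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the figures are those quoted in the text, while the function and variable names are hypothetical and do not come from the source.

```python
# Figures quoted in the text; all names here are illustrative only.
MILLER_TOTAL_MG = (10.0, 20.0)      # soluble protein recovered after refolding (mg)
MILLER_CULTURE_L = 5.0              # fermentation volume (litres)
INVENTION_MG_PER_L = 136.0          # yield reported for one embodiment (mg per litre)


def yield_per_litre(total_mg: float, volume_l: float) -> float:
    """Convert a total recovered mass into a per-litre yield."""
    return total_mg / volume_l


def fold_increase(new_mg_per_l: float, old_mg_per_l: float) -> float:
    """Ratio of the new per-litre yield to the old one."""
    return new_mg_per_l / old_mg_per_l


miller_low, miller_high = (yield_per_litre(m, MILLER_CULTURE_L) for m in MILLER_TOTAL_MG)

# The most conservative comparison uses the best prior-art yield (4 mg/L).
min_fold = fold_increase(INVENTION_MG_PER_L, miller_high)
max_fold = fold_increase(INVENTION_MG_PER_L, miller_low)

print(f"Miller et al.: {miller_low:.0f}-{miller_high:.0f} mg/L")
print(f"Fold increase: {min_fold:.0f}x to {max_fold:.0f}x")
```

The lower bound of this range is the "at least 34 times" figure stated above; the upper bound corresponds to the lowest yield reported by Miller et al.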
The present invention encompasses the PCP proteins associated with ligands, in particular chlorophylls and carotenoids.
A further aspect of the present invention therefore provides a protein complex including a PCP protein as described herein or a fragment or derivative thereof.
The absorbance and fluorescence properties of the PCPs and their derivatives can be varied by the reconstitution of apo-PCPs with different carotenoids and chlorophylls and their derivatives and analogues, including synthetic analogues. The binding sites in the PCPs can bind different carotenoids and chlorophylls and structurally related molecules. For example, the inventors have observed that the simultaneous binding of xanthophyll with chlorophyll a or chlorophyll b to cPCP results in the visual appearance of red fluorescence. The inventors have further found that the binding of xanthophyll with chlorophyll a produces more intense fluorescence compared to chlorophyll b.
Any natural or synthetic chlorophyll or chlorophyll analogue, and derivatives thereof, may be used in aspects of the present invention, either alone or in any combination. Non-limiting examples include chlorophyll a, chlorophyll b, chlorophyll c, chlorophyll d, chlorophyll e, chlorophyll f, bacteriochlorophylls, 2-(1-hexyloxyethyl)-2-devinyl pyropheophorbide-a (HPPH), porphyrin structures, synthetic derivatives of porphyrins, corrins, chlorins (2,3-dihydroporphyrins), corphins and heme, and analogues and derivatives thereof.
Particularly good results have been observed using bacteriochlorophylls.
Similarly, a range of different carotenoids and analogues and derivatives thereof may be used in aspects of the present invention, either alone or in any combination. Peridinin and its analogues, and derivatives thereof, may be employed. Other non-limiting examples include α-carotene, β-carotene, γ-carotene, δ-carotene, ε-carotene, ζ-carotene, lycopene, neurosporene, phytoene, phytofluene, antheraxanthin, astaxanthin, canthaxanthin, citranaxanthin, cryptoxanthin, diadinoxanthin, diatoxanthin, dinoxanthin, flavoxanthin, fucoxanthin, lutein, neoxanthin, rhodoxanthin, rubixanthin, violaxanthin, zeaxanthin, abscisic acid, apocarotenal, bixin, crocetin, food orange 7 (E160f), ionones, retinal, retinoic acid and retinol, and analogues and derivatives thereof.
Naturally occurring carotenoids include the following examples: hydrocarbons including lycopersene (7,8,11,12,15,7',8',11',12',15'-decahydro-γ,γ-carotene), phytofluene, hexahydrolycopene (15-cis-7,8,11,12,7',8'-hexahydro-γ,γ-carotene), torulene (3',4'-didehydro-β,γ-carotene) and α-zeacarotene (7',8'-dihydro-ε,γ-carotene); alcohols including alloxanthin, cynthiaxanthin, pectenoxanthin, cryptomonaxanthin ((3R,3'R)-7,8,7',8'-tetradehydro-β,β-carotene-3,3'-diol), crustaxanthin (β,β-carotene-3,4,3',4'-tetrol), gazaniaxanthin ((3R)-5'-cis-β,γ-caroten-3-ol), OH-chlorobactene (1',2'-dihydro-f,γ-caroten-1'-ol), loroxanthin (β,ε-carotene-3,19,3'-triol), lycoxanthin (γ,γ-caroten-16-ol), rhodopin (1,2-dihydro-γ,γ-caroten-1-ol), rhodopinol (warmingol; 13-cis-1,2-dihydro-γ,γ-carotene-1,20-diol), saproxanthin (3',4'-didehydro-1',2'-dihydro-β,γ-carotene-3,1'-diol), zeaxanthin, glycosides, oscillaxanthin (2,2'-bis(β-L-rhamnopyranosyloxy)-3,4,3',4'-tetradehydro-1,2,1',2'-tetrahydro-γ,γ-carotene-1,1'-diol) and phleixanthophyll (1'-(β-D-glucopyranosyloxy)-3',4'-didehydro-1',2'-dihydro-β,γ-caroten-2'-ol); ethers including rhodovibrin (1'-methoxy-3',4'-didehydro-1,2,1',2'-tetrahydro-γ,γ-caroten-1-ol), spheroidene (1-methoxy-3,4-didehydro-1,2,7',8'-tetrahydro-γ,γ-carotene), epoxides, diadinoxanthin (5,6-epoxy-7',8'-didehydro-5,6-dihydro-β,β-carotene-3,3'-diol), luteoxanthin (5,6:5',8'-diepoxy-5,6,5',8'-tetrahydro-β,β-carotene-3,3'-diol), mutatoxanthin, citroxanthin, zeaxanthin furanoxide (5,8-epoxy-5,8-dihydro-β,β-carotene-3,3'-diol), neochrome (5',8'-epoxy-6,7-didehydro-5,6,5',8'-tetrahydro-β,β-carotene-3,5,3'-triol), foliachrome, trollichrome, vaucheriaxanthin (5',6'-epoxy-6,7-didehydro-5,6,5',6'-tetrahydro-β,β-carotene-3,5,19,3'-tetrol); aldehydes including rhodopinal, warmingone (13-cis-1-hydroxy-1,2-dihydro-γ,γ-caroten-20-al) and torularhodinaldehyde (3',4'-didehydro-β,γ-caroten-16'-al); acids and acid esters including torularhodin (3',4'-didehydro-β,γ-caroten-16'-oic acid) and torularhodin methyl ester (methyl 3',4'-didehydro-β,γ-caroten-16'-oate); ketones including astaxanthin, canthaxanthin (aphanicin; chlorellaxanthin; β,β-carotene-4,4'-dione), capsanthin ((3R,3'R,5'R)-3,3'-dihydroxy-β,κ-caroten-6'-one), capsorubin ((3S,5R,3'S,5'R)-3,3'-dihydroxy-κ,κ-carotene-6,6'-dione), cryptocapsin ((3'R,5'R)-3'-hydroxy-β,κ-caroten-6'-one), 2,2'-diketospirilloxanthin (1,1'-dimethoxy-3,4,3',4'-tetradehydro-1,2,1',2'-tetrahydro-γ,γ-carotene-2,2'-dione), flexixanthin (3,1'-dihydroxy-3',4'-didehydro-1',2'-dihydro-β,γ-caroten-4-one), 3-OH-canthaxanthin (adonirubin; phoenicoxanthin; 3-hydroxy-β,β-carotene-4,4'-dione), hydroxyspheriodenone (1'-hydroxy-1-methoxy-3,4-didehydro-1,2,1',2',7',8'-hexahydro-γ,γ-caroten-2-one), okenone (1'-methoxy-1',2'-dihydro-c,γ-caroten-4'-one), pectenolone (3,3'-dihydroxy-7',8'-didehydro-β,β-caroten-4-one), phoeniconone (dehydroadonirubin; 3-hydroxy-2,3-didehydro-β,β-carotene-4,4'-dione), phoenicopterone (β,ε-caroten-4-one), rubixanthone (3-hydroxy-β,γ-caroten-4'-one), siphonaxanthin (3,19,3'-trihydroxy-7,8-dihydro-β,ε-caroten-8-one); esters of alcohols including astacein (3,3'-bispalmitoyloxy-2,3,2',3'-tetradehydro-β,β-carotene-4,4'-dione or 3,3'-dihydroxy-2,3,2',3'-tetradehydro-β,β-carotene-4,4'-dione dipalmitate), fucoxanthin (3'-acetoxy-5,6-epoxy-3,5'-dihydroxy-6',7'-didehydro-5,6,7,8,5',6'-hexahydro-β,β-caroten-8-one), isofucoxanthin (3'-acetoxy-3,5,5'-trihydroxy-6',7'-didehydro-5,8,5',6'-tetrahydro-β,β-caroten-8-one), physalien, zeaxanthin dipalmitate ((3R,3'R)-3,3'-bispalmitoyloxy-β,β-carotene or (3R,3'R)-β,β-carotene-3,3'-diol dipalmitate), siphonein (3,3'-dihydroxy-19-lauroyloxy-7,8-dihydro-β,ε-caroten-8-one or 3,19,3'-trihydroxy-7,8-dihydro-β,ε-caroten-8-one 19-laurate); apo carotenoids including β-apo-2'-carotenal (3',4'-didehydro-2'-apo-b-caroten-2'-al), apo-2-lycopenal, apo-6'-lycopenal (6'-apo-y-caroten-6'-al), azafrinaldehyde (5,6-dihydroxy-5,6-dihydro-10'-apo-β-caroten-10'-al), bixin (6'-methyl hydrogen 9'-cis-6,6'-diapocarotene-6,6'-dioate), citranaxanthin (5',6'-dihydro-5'-apo-β-caroten-6'-one or 5',6'-dihydro-5'-apo-18'-nor-β-caroten-6'-one or 6'-methyl-6'-apo-β-caroten-6'-one), crocetin (8,8'-diapo-8,8'-carotenedioic acid), crocetinsemialdehyde (8'-oxo-8,8'-diapo-8-carotenoic acid), crocin (digentiobiosyl 8,8'-diapo-8,8'-carotenedioate), hopkinsiaxanthin (3-hydroxy-7,8-didehydro-7',8'-dihydro-7'-apo-b-carotene-4,8'-dione or 3-hydroxy-8'-methyl-7,8-didehydro-8'-apo-b-carotene-4,8'-dione), methyl apo-6'-lycopenoate (methyl 6'-apo-y-caroten-6'-oate), paracentrone (3,5-dihydroxy-6,7-didehydro-5,6,7',8'-tetrahydro-7'-apo-b-caroten-8'-one or 3,5-dihydroxy-8'-methyl-6,7-didehydro-5,6-dihydro-8'-apo-b-caroten-8'-one), sintaxanthin (7',8'-dihydro-7'-apo-b-caroten-8'-one or 8'-methyl-8'-apo-b-caroten-8'-one); nor and seco carotenoids including actinioerythrin (3,3'-bisacyloxy-2,2'-dinor-b,b-carotene-4,4'-dione), β-carotenone (5,6:5',6'-diseco-b,b-carotene-5,6,5',6'-tetrone), peridinin (3'-acetoxy-5,6-epoxy-3,5'-dihydroxy-6',7'-didehydro-5,6,5',6'-tetrahydro-12',13',20'-trinor-b,b-caroten-19,11-olide), pyrrhoxanthininol (5,6-epoxy-3,3'-dihydroxy-7',8'-didehydro-5,6-dihydro-12',13',20'-trinor-b,b-caroten-19,11-olide), semi-α-carotenone (5,6-seco-b,e-carotene-5,6-dione), semi-β-carotenone (5,6-seco-b,b-carotene-5,6-dione or 5',6'-seco-b,b-carotene-5',6'-dione), triphasiaxanthin (3-hydroxysemi-b-carotenone; 3'-hydroxy-5,6-seco-b,b-carotene-5,6-dione or 3-hydroxy-5',6'-seco-b,b-carotene-5',6'-dione); retro carotenoids and retro apo carotenoids including eschscholtzxanthin (4',5'-didehydro-4,5'-retro-b,b-carotene-3,3'-diol), eschscholtzxanthone (3'-hydroxy-4',5'-didehydro-4,5'-retro-b,b-caroten-3-one), rhodoxanthin (4',5'-didehydro-4,5'-retro-b,b-carotene-3,3'-dione) and tangeraxanthin (3-hydroxy-5'-methyl-4,5'-retro-5'-apo-b-caroten-5'-one or 3-hydroxy-4,5'-retro-5'-apo-b-caroten-5'-one); and higher carotenoids including nonaprenoxanthin (2-(4-hydroxy-3-methyl-2-butenyl)-7',8',11',12'-tetrahydro-e,y-carotene), decaprenoxanthin (2,2'-bis(4-hydroxy-3-methyl-2-butenyl)-e,e-carotene), c.p. 450 (2-[4-hydroxy-3-(hydroxymethyl)-2-butenyl]-2'-(3-methyl-2-butenyl)-b,b-carotene), c.p. 473 (2'-(4-hydroxy-3-methyl-2-butenyl)-2-(3-methyl-2-butenyl)-3',4'-didehydro-1',2'-dihydro-b,y-caroten-1'-ol) and bacterioruberin (2,2'-bis(3-hydroxy-3-methylbutyl)-3,4,3',4'-tetradehydro-1,2,1',2'-tetrahydro-y,y-carotene-1,1'-diol).
The term 'amino acid' as used herein includes naturally-occurring amino acids, naturally-occurring amino acid structural variants, and synthetic non-naturally occurring analogues that are capable of participating in peptide bonds.
The term 'protein' as used herein explicitly permits of post-translational and postsynthetic modifications, such as glycosylation. The protein may comprise a full-length protein or polypeptide, or any functional component or fragment thereof. A full-length protein comprises the complete structure of a transcribed gene which may consist of single or multiple domains of discrete secondary structure. A polypeptide comprises two or more amino acids that are linked together to form peptidic bonds and form part of the primary sequence of the protein. Functional components comprise secondary structural elements which are involved in the functional regions of a protein, for example binding sites.
It will be appreciated that the proteins described herein may be synthesised or purified using any suitable techniques.
Proteins of the present invention may comprise or consist of the amino acid sequence of SEQ ID NO: 1 or SEQ ID NO: 2, or a fragment or derivative thereof.
Proteins of the present invention may comprise or consist of a variant of the amino acid sequence according to a reference sequence as used herein (for example, SEQ ID NO: 1 or SEQ ID NO: 2), or fragment or derivative of said variant.
'Variants' of the amino acid sequence include insertions, deletions and substitutions, either conservative or non-conservative. For example, conservative substitution refers to the substitution of an amino acid within the same general class (e.g. an acidic amino acid, a basic amino acid, a non-polar amino acid, a polar amino acid or an aromatic amino acid) by another amino acid within the same class. The meaning of a conservative amino acid substitution and non-conservative amino acid substitution is well known in the art. Conservative substitutions are preferred. The variant may, for example, have an amino acid sequence which has at least 96% identity with the amino acid sequence according to a reference sequence (for example, SEQ ID NO: 1 or SEQ ID NO:2) or a fragment thereof, for example at least 97%, at least 98% or at least 99% identity.
The percent sequence identity between two polypeptides may be determined using a suitable computer program, for example the CLUSTAL 2.1 multiple sequence alignment program (Larkin et al. (2007) Bioinformatics 23(21): 2947-2948).
It will be appreciated that percentage identity is calculated in relation to polypeptides whose sequences have been aligned optimally.
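By way of illustration only, the percentage identity between two optimally aligned sequences may be computed along the lines of the following Python sketch. The function name, the treatment of gapped columns and the toy sequences are assumptions made for this example; they are not taken from the CLUSTAL program or from the sequences of the invention, and alignment programs differ in how gapped columns are counted.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns in which both sequences carry the same residue.

    Assumptions: the two strings are already optimally aligned (equal length,
    gaps written as '-'); columns in which both sequences have a gap are ignored;
    a column with a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment (not SEQ ID NO: 1 to 4): one substitution in ten residues.
reference = "DEIGKAAYHL"
variant = "DEIGKAAFHL"
identity = percent_identity(reference, variant)
```

A variant would then be compared against whichever identity threshold applies to the reference sequence in question.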
Protein variants in which less than 50, less than 40, less than 30, less than 20, less than 10, or 5-50, 5-25, 5-10, 1-5, or 1-2 amino acids are substituted, deleted, or added in any combination are preferred.
In particular, the present invention includes functional variants of the protein which exhibit PCP ligand-binding activity (i.e. binding of carotenoids and chlorophylls and their derivatives or analogues). Ligand-binding activity can be determined by measuring near-infrared fluorescence following reconstitution of the protein with chlorophyll and carotenoid (see Example 6 and Figure 6). PCP ligand-binding activity may also be determined using prior art methods, for example the method of Miller et al. (Photosynthesis Research (2005), 86, 229-240). The chlorophyll and carotenoid reconstitution method of Miller et al. used a small volume of PCP apoprotein (typically 200-400 μg protein) in 50 mM Tris-HCl, pH 8.0 that was made to 25 mM Tricine, 10 mM KCl, pH 7.6 and mixed with a stoichiometric amount of PCP pigments dissolved in ethanol, which resulted in a final ethanol concentration of 15% in a volume of 1-1.5 ml. Pigment protein stoichiometry was based on 1 mg/ml native PCP having an A476 nm of 21.8 and assuming peridinin in organic solution has the same specific extinction as in PCP. Chlorophylls were added as 1 A at the Qy band to 4 A at peridinin maximum. For larger-scale preparations the initial volumes were increased with proportionally scaled-up amounts of pigment and PCP apoprotein. The samples were held at 4°C for 72 h and then diluted to give an A670 nm <0.05 for fluorescence measurements and A670 nm <0.25 for absorbance measurements. Reconstituted samples were equilibrated to 5 mM Tricine, 2 mM KCl, pH 7.6 by passage through a PD10 column and bound to a column of DEAE Tris-acryl (Sigma). After washing, the reconstituted PCP was removed with 5 mM Tricine, 2 mM KCl, pH 7.6 containing 0.1 M NaCl for N-domain PCP and 0.06 M NaCl for full-length PCP. Unincorporated pigment bound to the DEAE Tris-acryl column and could only be removed with organic solvent. For reconstitutions with chlorophyll, 10-30% of added pigments were recovered in purified PCP. Similar yields were obtained for the N-domain PCP reconstituted with different chlorophylls.
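The arithmetic implied by the reconstitution conditions quoted above can be sketched in Python as follows. This is a minimal illustration, not part of the method of Miller et al.: the function names are hypothetical, a 1 cm path length and linear (Beer-Lambert) behaviour are assumed, and the pigment stock is assumed to be in neat ethanol.

```python
A476_PER_MG_ML = 21.8  # from the text: 1 mg/ml native PCP has an A476 nm of 21.8


def pcp_mg_per_ml(a476: float) -> float:
    """Estimate PCP concentration (mg/ml) from absorbance at 476 nm (1 cm path assumed)."""
    return a476 / A476_PER_MG_ML


def ethanol_volume_ml(final_volume_ml: float, ethanol_fraction: float = 0.15) -> float:
    """Volume of ethanolic pigment solution giving the stated final ethanol fraction.

    Assumption: the pigment stock is neat ethanol and volumes are additive.
    """
    return final_volume_ml * ethanol_fraction


def dilution_factor_for(measured_a670: float, limit_a670: float) -> float:
    """Dilution factor that brings A670 down to the limit (linearity assumed).

    The text requires values below the limit, so in practice a slightly
    larger dilution than the returned factor would be used.
    """
    return max(1.0, measured_a670 / limit_a670)


# Illustrative inputs only; none of these readings come from the source.
concentration = pcp_mg_per_ml(10.9)
ethanol_ml = ethanol_volume_ml(1.0)
fluorescence_dilution = dilution_factor_for(1.0, 0.05)
absorbance_dilution = dilution_factor_for(1.0, 0.25)
```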
With reference to fPCP (SEQ ID NO: 1) the inventors have highlighted the following amino acids as important in binding either chlorophylls or carotenoids: tyrosine 191, histidine 229, histidine 230, isoleucine 254, tyrosine 270, phenylalanine 301 and tyrosine 302. In variants of fPCP, it is preferred that one or more of these amino acids is conserved (which includes conservative substitution), such as two or more, three or more, four or more, five or more, six or more, or all seven amino acids.
Also with reference to fPCP (SEQ ID NO: 1), the inventors have highlighted the following amino acids as important in the N-terminal binding pocket: phenylalanine 28, histidine 66, histidine 67, asparagine 89, tyrosine 108 and tyrosine 136. In variants of fPCP, it is preferred that one or more of these amino acids is conserved (which includes conservative substitution), such as two or more, three or more, four or more, five or more, or all six amino acids.
With reference to cPCP (SEQ ID NO:2) the inventors have highlighted the following amino acids as important in binding either chlorophylls or carotenoids: tyrosine 47, histidine 85, histidine 86, isoleucine 110, tyrosine 126, phenylalanine 157 and tyrosine 158. In variants of cPCP, it is preferred that one or more of these amino acids is conserved (which includes conservative substitution), such as two or more, three or more, four or more, five or more, six or more, or all seven amino acids.
As discussed herein, the present invention encompasses polyhistidine-tagged derivatives of the novel proteins. The polyhistidine-tagged protein may comprise or consist of a protein with the amino acid sequence of SEQ ID NO:3 or SEQ ID NO:4, or a fragment or derivative thereof.
With reference to the polyhistidine-tagged fPCP (SEQ ID NO:3) the inventors have highlighted the following amino acids as important in binding either chlorophylls or carotenoids: tyrosine 206, histidine 244, histidine 245, isoleucine 269, tyrosine 285, phenylalanine 316 and tyrosine 317. In variants of the polyhistidine-tagged fPCP, it is preferred that one or more of these amino acids is conserved (which includes conservative substitution), such as two or more, three or more, four or more, five or more, six or more, or all seven amino acids. Also with reference to the polyhistidine-tagged fPCP (SEQ ID NO:3), the inventors have highlighted the following amino acids as important in the N-terminal binding pocket: phenylalanine 43, histidine 81, histidine 82, asparagine 104, tyrosine 123 and tyrosine 151. In variants of the polyhistidine-tagged fPCP, it is preferred that one or more of these amino acids is conserved (which includes conservative substitution), such as two or more, three or more, four or more, five or more, or all six amino acids.
With reference to the polyhistidine-tagged cPCP (SEQ ID NO:4) the inventors have highlighted the following amino acids as important in binding either chlorophylls or carotenoids: tyrosine 62, histidine 100, histidine 101, isoleucine 125, tyrosine 141, phenylalanine 172 and tyrosine 173. In variants of the polyhistidine-tagged cPCP, it is preferred that one or more of these amino acids is conserved (which includes conservative substitution), such as two or more, three or more, four or more, five or more, six or more, or all seven amino acids.
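The residue numbers quoted above for the untagged and polyhistidine-tagged proteins can be cross-checked with a short Python sketch such as the following. The list and function names are hypothetical. The constant offsets it reports are an observation drawn from the quoted numbers; reading them as an N-terminal extension carrying the tag, or as cPCP corresponding to a C-terminal region of fPCP, is an inference and is not stated as such in the text.

```python
# Residue positions quoted in the text (1-based numbering).
FPCP_BINDING = [191, 229, 230, 254, 270, 301, 302]         # SEQ ID NO: 1
CPCP_BINDING = [47, 85, 86, 110, 126, 157, 158]            # SEQ ID NO: 2
TAGGED_FPCP_BINDING = [206, 244, 245, 269, 285, 316, 317]  # SEQ ID NO: 3
TAGGED_CPCP_BINDING = [62, 100, 101, 125, 141, 172, 173]   # SEQ ID NO: 4
FPCP_N_POCKET = [28, 66, 67, 89, 108, 136]                 # SEQ ID NO: 1
TAGGED_FPCP_N_POCKET = [43, 81, 82, 104, 123, 151]         # SEQ ID NO: 3


def constant_offset(positions_a, positions_b):
    """Return the single offset (b - a) shared by all paired positions, or None."""
    if len(positions_a) != len(positions_b):
        return None
    offsets = {b - a for a, b in zip(positions_a, positions_b)}
    return offsets.pop() if len(offsets) == 1 else None


# Offset between untagged and tagged numbering for each quoted residue set.
tag_offset_fpcp = constant_offset(FPCP_BINDING, TAGGED_FPCP_BINDING)
tag_offset_cpcp = constant_offset(CPCP_BINDING, TAGGED_CPCP_BINDING)
tag_offset_pocket = constant_offset(FPCP_N_POCKET, TAGGED_FPCP_N_POCKET)

# Offset between cPCP numbering and fPCP numbering for the pigment-binding residues.
cpcp_to_fpcp_offset = constant_offset(CPCP_BINDING, FPCP_BINDING)
```

A return value of None from `constant_offset` would flag a transcription error in one of the quoted positions.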
Save for where the context requires otherwise, all references to proteins in accordance with the invention should also be taken to encompass fragments or derivatives of such proteins, wherein such fragments or derivatives are characterised in that they comprise substitutions that differentiate them from fragments or derivatives derivable from the known PCP amino acid sequences.
With reference to fPCP (SEQ ID NO: 1) the following are unique amino acids: isoleucine 95, leucine 137, glutamic acid 147, aspartic acid 202, isoleucine 254, alanine 264, valine 295 and glycine 296. The amino acids of SEQ ID NO: 1 are numbered from 1 to 313 starting from aspartic acid 1 and ending at arginine 313.
Thus, a further aspect of the present invention comprises proteins, and fragments and derivatives thereof, having at least 80% sequence identity, for example at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity, with the amino acid sequence of SEQ ID NO: 1 and having one or more of: isoleucine 95, leucine 137, glutamic acid 147, aspartic acid 202, isoleucine 254, alanine 264, valine 295 and glycine 296.
Correspondingly, with reference to cPCP (SEQ ID NO:2) the following are unique amino acids: aspartic acid 58, isoleucine 110, alanine 120, valine 151 and glycine 152. The amino acids of SEQ ID NO: 2 are numbered from 1 to 169 starting from valine 1 and ending at arginine 169. Thus, a further aspect of the present invention comprises proteins, and fragments and derivatives thereof, having at least 80% sequence identity, for example at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity, with the amino acid sequence of SEQ ID NO: 2 and having one or more of: aspartic acid 58, isoleucine 110, alanine 120, valine 151 and glycine 152.
With reference to the polyhistidine-tagged fPCP (SEQ ID NO:3) the following are unique amino acids: isoleucine 110, leucine 152, glutamic acid 162, aspartic acid 217, isoleucine 269, alanine 279, valine 310 and glycine 311. The amino acids of SEQ ID NO:3 are numbered from 1 to 328 starting from methionine 1 and ending at arginine 328.
Thus, a further aspect of the present invention comprises proteins, and fragments and derivatives thereof, having at least 80% sequence identity, for example at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity, with the amino acid sequence of SEQ ID NO:3 and having one or more of: isoleucine 110, leucine 152, glutamic acid 162, aspartic acid 217, isoleucine 269, alanine 279, valine 310 and glycine 311.
Correspondingly, with reference to the polyhistidine-tagged cPCP (SEQ ID NO:4) the following are unique amino acids: aspartic acid 73, isoleucine 125, alanine 135, valine 166 and glycine 167. The amino acids of SEQ ID NO: 4 are numbered from 1 to 184 starting from methionine 1 and ending at arginine 184.
Thus, a further aspect of the present invention comprises proteins, and fragments and derivatives thereof, having at least 80% sequence identity, for example at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity, with the amino acid sequence of SEQ ID NO:4 and having one or more of: aspartic acid 73, isoleucine 125, alanine 135, valine 166 and glycine 167.
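The combined criterion described in the preceding paragraphs (a minimum sequence identity together with retention of one or more of the listed unique amino acids) may be illustrated by the following Python sketch. The names are hypothetical, the percentage identity is taken as a value computed separately, the candidate is assumed to have no insertions or deletions relative to SEQ ID NO: 1 so that its numbering is directly comparable, and the toy sequence is a placeholder rather than any sequence of the invention.

```python
# Unique amino acids of fPCP as listed in the text (1-based position -> one-letter code).
FPCP_UNIQUE = {95: "I", 137: "L", 147: "E", 202: "D",
               254: "I", 264: "A", 295: "V", 296: "G"}


def retained_unique_residues(sequence: str, unique: dict) -> list:
    """Return the listed positions at which the sequence carries the listed residue.

    Assumption: the candidate uses the same numbering as the reference (no indels).
    """
    return sorted(pos for pos, residue in unique.items()
                  if pos <= len(sequence) and sequence[pos - 1] == residue)


def meets_criterion(sequence: str, unique: dict,
                    identity_percent: float, threshold: float = 80.0) -> bool:
    """True if identity reaches the threshold and at least one unique residue is retained."""
    return (identity_percent >= threshold
            and len(retained_unique_residues(sequence, unique)) >= 1)


# Hypothetical 313-residue placeholder ('X' = unspecified) retaining two unique residues.
toy = ["X"] * 313
toy[95 - 1] = "I"
toy[296 - 1] = "G"
toy_sequence = "".join(toy)
retained = retained_unique_residues(toy_sequence, FPCP_UNIQUE)
```

The same functions apply to SEQ ID NO: 2 to 4 by substituting the corresponding list of positions.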
Suitable fragments or derivatives of PCP proteins according to the present invention may comprise, for example, at least 50 amino acid residues, for instance at least 75 amino acid residues, or at least 100 amino acid residues, or at least 120 amino acid residues. References to proteins and fragments of proteins also encompass derivatives of such proteins or fragments, except for where the context requires otherwise.
The present invention includes, in particular, polyhistidine-tagged derivatives of the PCP proteins of the present invention, or fragments thereof. Such derivatives have a terminal polyhistidine tag of at least five histidine residues, typically six histidine residues, preferably at the N-terminus of the protein. The polyhistidine tag is useful in isolation, purification, binding and immobilisation of the tagged proteins, as is known to those skilled in the art. For instance, recombinant polyhistidine-tagged PCP proteins of the present invention may conveniently be isolated following expression in a host cell by exposing the cell lysate to a Ni2+-NTA resin.
The polyhistidine tag may also be associated with a suitable amino acid sequence that facilitates removal of the tag from the protein using an endopeptidase, such as enterokinase.
The polyhistidine-tagged protein may comprise or consist of a protein with the amino acid sequence of SEQ ID NO:3 or SEQ ID NO:4, or a fragment or derivative thereof.
According to a further aspect of the present invention there is provided a protein selected from:
a protein having at least 95% sequence identity with the amino acid sequence of SEQ ID NO:3; and
a protein having at least 97% sequence identity with the amino acid sequence of SEQ ID NO:4;
or a fragment or derivative thereof.
Variants of the polyhistidine-tagged protein may have an amino acid sequence which has at least 96% identity with the amino acid sequence according to SEQ ID NO:3 or a fragment thereof, for example at least 97%, at least 98% or at least 99% identity.
Variants of the polyhistidine-tagged protein may have an amino acid sequence which has at least 98% identity with the amino acid sequence according to SEQ ID NO: 4 or a fragment thereof, for example at least 99% identity.
The PCP proteins of the present invention may be conjugated to other agents, including proteins, biomolecules and polymers. Advantageously, the high thermal stability of the proteins of the present invention facilitates their use as labelling reagents for conjugation to such other agents. A further aspect of the present invention provides a protein conjugate comprising a protein as described herein, or fragment or derivative thereof. The protein may be conjugated to any agent of interest.
Further derivatives of the proteins of the present invention include fusion proteins comprising a PCP protein of the present invention, or fragment or derivative thereof, and a protein or peptide of interest.
A further aspect of the present invention provides a fusion protein comprising a protein as described herein, or fragment or derivative thereof.
Fusion proteins, or chimeric proteins, are proteins created through the joining of two or more genes which originally coded for separate proteins. The genes may be derived from the same organism or more commonly they are derived from different organisms. Usually, fusion proteins can be arranged as tandem constructs or placed at the N- or C-terminus of a protein. Translation of this fusion gene results (e.g. when expressed in a host organism, such as E. coli) in a single polypeptide with functional properties derived from each of the original proteins.
It will be appreciated that a wide variety of fusions may be created using a range of different proteins or peptides of interest. The present invention provides for the design and engineering of PCP fusion proteins as tandem genetic conjugates. Thus, the recombinant PCPs may be given added functionality. For example, a PCP may be engineered at the N-terminus of a single-chain antibody, resulting in a fusion protein that not only binds a specific antigen but can also be detected by virtue of its PCP fluorescence. Such fusion products are useful as in vivo fluorescent diagnostic markers for disease. It must be noted that any known soluble recombinant protein targets of interest can be labelled at the N- and/or C-terminus with PCPs according to the present invention. Indeed, the recombinant PCP itself may also be engineered with other proteins at the N- and/or C-terminus. It is possible, therefore, to produce multiple fusions of proteins/peptides with recombinant PCPs. Non-limiting examples include fluorescent proteins, antibodies, such as single chain/fragment antibodies, antigens, enzymes, and epitope tags for affinity purification and immobilisation, for example polyhistidine tag, V5 tag, TAP tag, etc.
Other non-limiting examples of fusion partners for the PCPs include: flag peptide, anti-flag antibodies, glutathione-S-transferase, Staphylococcal protein A, Streptococcal protein G, calmodulin organic ligands, thioredoxin, β-galactosidase, ubiquitin, chloramphenicol acetyltransferase, S-peptide (RNase A, residues 1-20), S-protein (RNase A, residues 21-124), myosin heavy chain, DsbA, biotin subunit, avidin, streptavidin, Strep-tag streptavidin, c-myc, dihydrofolate reductase, CKSc, polyarginine, polycysteine, polyphenylalanine, lac repressor, T4 gp55, growth hormone N terminus, maltose-binding protein, galactose-binding protein, cyclomaltodextrin glucanotransferase, cellulose-binding domain, hemolysin A, E. coli, λ cII protein, TrpE or TrpLE, protein kinases, (AlaTrpTrpPro)n, HAId epitope, BTag (VP7 protein region of bluetongue virus), anti-BTag antibodies, green fluorescent protein.
According to a further aspect of the present invention there is provided a nucleic acid encoding a protein, or fragment or derivative thereof, as herein described.
Due to the degeneracy of the genetic code, it is clear that nucleic acid sequences encoding proteins of the present invention may be varied or changed without substantially affecting the sequence of the product encoded thereby, to provide a functional variant thereof. The sequences of possible nucleic acids that may be used to encode proteins defined by the amino acid sequences of SEQ ID NO: 1, SEQ ID NO:2, SEQ ID NO:3 and SEQ ID NO:4 will be readily apparent to the skilled person, and the skilled person will be able to make reference to the examples provided as SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7 and SEQ ID NO:8 respectively.
Thus, the nucleic acid may comprise or consist of a nucleotide sequence selected from SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7 and SEQ ID NO:8, or a fragment thereof.
Variants of these nucleotide sequences are also encompassed by the present invention.
The nucleic acids may contain alterations in the coding regions, non-coding regions, or both. Especially preferred are polynucleotide variants containing alterations which produce silent substitutions, additions, or deletions, but do not alter the properties or activities of the encoded protein or fragment thereof. As noted above, nucleotide variants produced by silent substitutions due to the degeneracy of the genetic code are encompassed. Polynucleotide variants can be produced for a variety of reasons, for example to optimise codon expression for a particular host.
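The notion of a silent substitution can be illustrated with a minimal Python sketch. The two short DNA sequences below are hypothetical and are not fragments of SEQ ID NO:5 to SEQ ID NO:8; the codon table is deliberately partial and contains only the standard genetic code assignments needed for this example.

```python
# Partial standard genetic code: only the codons used in this example.
CODON_TABLE = {
    "ATG": "M",
    "GAT": "D", "GAC": "D",
    "TTA": "L", "CTG": "L",
    "GGT": "G", "GGC": "G",
    "AGA": "R", "CGT": "R",
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string using the partial codon table above."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two different hypothetical coding sequences that differ only by synonymous codons.
original = "ATGGATTTAGGTAGA"
recoded = "ATGGACCTGGGCCGT"

same_protein = translate(original) == translate(recoded)
```

Recoding of this kind is what codon optimisation for a particular host relies on: the nucleotide sequence changes while the encoded amino acid sequence does not.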
It will be appreciated that the nucleic acids described herein may be synthesised or isolated using any suitable techniques.
The proteins of the present invention, or fragments or derivatives thereof, may conveniently be obtained by expression in a host cell and recovery of the expressed protein. The host cell may be a prokaryotic or eukaryotic cell, such as a bacterial cell, yeast cell, plant cell, mammalian cell or insect cell. A nucleic acid encoding the protein of the present invention may be introduced to the host cell for expression using any of the means known in the art. Typically, the nucleic acid is incorporated into an expression vector for expression and production of the protein of interest. Commonly used protein expression systems include those derived from bacteria, yeast, plants, algae, baculovirus/insect, and mammalian cells. Proteins may also be expressed in cell-free systems in which in vitro transcription and translation is achieved using cell lysates derived from bacteria, yeast, baculovirus/insect or mammalian cells.
Thus, a further aspect of the invention provides a vector comprising a nucleic acid encoding a protein of the present invention as herein described or a fragment or derivative thereof.
Any suitable vector-host cell combination may be employed in the present invention and the skilled person will be aware of many suitable examples.
Particularly good results have been achieved using the pET46 vector (see Figure 2), whereby the expression of the cloned PCP gene is under the control of the T7/lac promoter, and E. coli host cells.
However, other cloning vector systems may also be used, such as the Gateway vectors (Life Technologies, USA) or pET vectors (Merck Group, Germany). Other vectors may contain selectable markers, such as antibiotic resistance other than ampicillin; e.g. tetracycline, chloramphenicol or kanamycin. Histochemical identification of recombinant clones may also include the amino-terminal part of the lacZ gene which produces the 'alpha' part of beta-galactosidase. To produce functional enzyme, the host must provide the 'omega' part of the protein. Together, the two-part functional enzyme can convert the chromogenic substrate 5-bromo-4-chloro-3-indolyl-beta-D-galactoside (X-gal) into a detectable blue compound. It may also include insertion of cloned DNA into the truncated 5' region of the lacZ gene, which can abolish the alpha complementation to produce a white colony in the presence of X-gal. Multiple cloning sites (or polylinkers) can be used in the 5' region of the lacZ gene with a great number of restriction endonuclease sites that are unique to this region of the plasmid to allow for great flexibility in the cloning of DNA fragments. In addition, the presence of the origin of replication of single-stranded phage allows for the possibility of ssDNA production. Bacteriophage promoters from the T3, T7 and SP6 bacteriophages can be included on either side of the multiple cloning sites. These allow the directed synthesis of RNA using the inserted DNA as template. In particular, the BL21 (DE3) strain of E. coli has proved suitable for use in the present invention, but a number of other strains could also be used, for example strains for expression from the T7 promoter with codon bias correction (BL21 codon plus; Rosetta (Novagen, USA)) and the improved disulphide bond formation strain Origami (Novagen, USA). Other E. coli genotypes may include lacIq, DE3, pLysS, lon, ompT, araD/ara-14, dnaJ, gor.
According to a further aspect of the present invention there is provided a host cell comprising a nucleic acid encoding a protein of the present invention as herein described or a vector comprising a nucleic acid encoding a protein of the present invention as herein described.
According to a still further aspect of the present invention there is provided a method of obtaining a protein of the present invention as herein described, or fragment or derivative thereof, comprising culturing a host cell of the present invention as herein described, expressing the protein in the host cell and purifying the protein or fragment or derivative thereof.
As noted above, polyhistidine-tagging can be used to facilitate purification. A recombinant polyhistidine-tagged PCP protein, or fragment or derivative thereof, can be recovered from the host cell by lysis and exposure of the lysate to a suitable binding medium, such as a Ni2+-NTA resin. If desired, the polyhistidine-tag can subsequently be removed by proteinase digestion.
The proteins of the present invention have wide utility.
The PCPs of the present invention are highly suited for use as fluorescent labels.
According to a further aspect of the present invention there is provided the use of a protein or protein complex as herein described as a labelling agent.
The recombinant PCPs of the present invention may be used to create protein conjugates, for example those described in US 4,876,190 and US 6,133,429.
The proteins of the present invention may also be used for labelling using organic dyes, such as fluorescein, cyanine dyes or nanoparticles to create novel quantum dots.
The PCPs, reconstituted with suitable carotenoid and chlorophyll, can be used as imaging agents.
According to a further aspect of the present invention there is provided the use of a protein or protein complex as herein described as an imaging agent.
A significant obstacle in using fluorescent protein probes for the in vivo imaging of tissues or as in vitro diagnostics in bodily fluids such as blood is that haemoglobin effectively absorbs both the blue, green, red and other wavelengths used to excite standard fluorescent proteins and the various wavelengths that they emit on fluorescence. This interference results in an increase in either scattering or absorption in the tissue or sample and a resultant decrease in light penetration. However, the proteins of the present invention can be used to provide near-infrared fluorescent imaging agents for in vivo and whole body imaging. With near-infrared light, both scattering and absorption in biological tissues are generally less severe.
The quantum efficiency of the energy transfer pathway of natural isolated PCP from peridinin to chlorophyll a has been calculated to be at least 80% and this value can be represented as a fraction equal to 0.8. There are very few near-infrared emitting proteins known to date. Certain non-PCP-like proteins have similar fluorescence emission spectra, but these proteins are far less bright than the reconstituted PCPs according to the present invention. For example, a green fluorescent protein homologue eqFP670 has a similar emission peak at 670 nm, however the fluorescence quantum yield of this protein is 0.06. Another reported near-infrared fluorescent protein is a bacteriophytochrome which requires addition of biliverdin to become fluorescent and has a quantum yield of 0.07. By contrast, the fluorescence quantum yield of recombinant PCP is expected to be at least 0.24, which is at least four-fold greater than eqFP670 and three-fold greater than the bacteriophytochrome. In addition, the recombinant PCPs have an extremely wide Stokes shift of 240 nm (e.g. excitation at 435 nm and emission at 675 nm). This property allows the recombinant PCPs to be excited using conventional argon-ion lasers or halogen lamps as excitation sources which are used in microscopes or through the use of bright LEDs.
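The comparisons in the preceding paragraph follow from simple arithmetic on the quoted figures, as the following Python sketch shows. The constant and function names are hypothetical; the numerical inputs are those stated in the text, and the value for recombinant PCP is the expected lower bound rather than a measured result.

```python
# Fluorescence quantum yields quoted in the text.
PCP_QY_EXPECTED_MIN = 0.24      # expected lower bound for recombinant PCP
EQFP670_QY = 0.06               # GFP homologue eqFP670
BACTERIOPHYTOCHROME_QY = 0.07   # biliverdin-dependent bacteriophytochrome


def fold_difference(quantum_yield: float, reference_yield: float) -> float:
    """Ratio of two quantum yields (how many times brighter per absorbed photon)."""
    return quantum_yield / reference_yield


def stokes_shift_nm(excitation_nm: float, emission_nm: float) -> float:
    """Stokes shift expressed as the wavelength difference in nanometres."""
    return emission_nm - excitation_nm


fold_vs_eqfp670 = fold_difference(PCP_QY_EXPECTED_MIN, EQFP670_QY)
fold_vs_bacteriophytochrome = fold_difference(PCP_QY_EXPECTED_MIN, BACTERIOPHYTOCHROME_QY)
shift = stokes_shift_nm(435, 675)
```

The first ratio corresponds to the four-fold figure and the second, a little under three and a half, is consistent with the stated three-fold figure; the wavelength difference reproduces the quoted Stokes shift.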
The near-infrared recombinant PCPs of the present invention can be used as probes with substantial increases in imaging sensitivity and performance.
According to a further aspect of the present invention there is provided the use of a protein or protein complex as herein described as a probe.
The holo-PCPs of the present invention allow much deeper penetration of light due to low absorbance and light scattering. Examples of practical applications using proteins of the present invention include whole-body fluorescence imaging, for instance to investigate metastasis and tumour localisation, cell migration, embryogenesis and other studies involving deep-tissue imaging not possible with known imaging agents.
As noted above, the absorbance and fluorescence properties of the PCPs and their derivatives can be varied by the reconstitution of apo-PCPs with different carotenoids and chlorophylls. The proteins of the present invention are useful in diagnostic kits and as research agents. For example, fusion proteins of the recombinant PCPs of the present invention with antibodies may be used in fluorescence-based immunoassays, such as kits based on ELISAs and in vitro based assays using the PCPs as acceptor/donor pairs with lanthanide-based fluorescence assays. For example, drug discovery assays include time-resolved fluorescence (TRF) and homogeneous time-resolved fluorescence (HTRF) assays. These assays utilise lanthanide chelates as donor chromophores that allow fluorescence resonance transfer from the lanthanide (e.g. terbium or europium) to long wavelength acceptors such as cyanine dyes, Cy5 and phycobiliproteins such as R-phycocyanin (RPC) and allophycocyanin (APC). Due to similar emission wavelengths, and the fact that the recombinant PCPs can be fused to other proteins, Cy5 and APC may be replaced by recombinant PCP as the preferred acceptor in TRF and HTRF assays. Previously, acceptor dyes and phycobiliproteins have been produced by chemical conjugation methods. An advantage of the recombinant PCPs as described herein is the ability to engineer protein fusions (e.g. to receptors or ligands) at the gene level and thus assay molecular interactions of interest.
According to a further aspect of the present invention there is provided an assay kit comprising a protein or protein complex as herein described.
The PCPs of the present invention are also useful for the isolation of highly pure carotenoids and/or chlorophylls and analogues and derivatives thereof.
According to a further aspect of the present invention there is provided the use of a protein or protein complex as herein described in the purification of a carotenoid and/or a chlorophyll.
The purification of hexahistidine-tagged proteins via Ni2+-nitrilotriacetic acid (Ni-NTA) chromatography resins is a well-established technique (Khan et al. (2006) Anal Chem. 1;78(9): 3072-9). The production of hexahistidine-tagged apo-PCPs according to the present invention allows for the affinity purification of carotenoids or chlorophylls. For example, commercially available Ni-NTA chromatography resins or beads allow the immobilisation of His6-tagged proteins through chelation of the Ni2+ ion (e.g. Ni-NTA agarose from Qiagen, USA). Therefore, immobilised apo-PCP (via the His6-tag-Ni-NTA interaction) on chromatography media can be used to purify any number of carotenoids and chlorophylls from other organisms (e.g. plant, bacterial, or mammalian) and organic waste material (e.g. shellfish extracts, plant extracts, bacterial fermentation extracts, etc.). A useful visual assay of binding of carotenoids and/or chlorophylls is the infrared fluorescence of the chromatography beads (i.e. Ni-NTA beads bound with apo-PCP) on forming holo-PCP. Immobilised holo-PCP may be eluted, for example, using imidazole, so allowing carotenoids and/or chlorophylls to be extracted by organic solvents. The present invention may be used to isolate and purify a wide range of carotenoids and chlorophylls, including those mentioned herein.
The recombinant PCPs of the present invention are also useful in the design of novel dual- or tri-functional biotherapeutics. For example, the fusion of PCP with tumour-specific antibodies can be used for the imaging of tumours and consequent therapeutic intervention. Photosensitizer molecules that are chlorophyll derivatives such as HPPH are used in photodynamic therapy (PDT). Therefore, recombinant apo-PCPs of the present invention, or antibody fusions thereof, may be used to 'load' HPPH thereby providing novel photosensitised drug carriers for use in PDT.
According to a further aspect of the present invention there is provided a therapeutic agent comprising a protein or protein complex as herein described.
The proteins of the present invention are also useful for binding drugs and as drug carriers for drug delivery. The inventors have demonstrated the immobilisation of hexahistidine-tagged recombinant PCP on Ni-NTA beads (i.e. on chromatography resin). Recombinant PCPs may thus be formulated for the slow release of drugs (by pre-loading with drug compounds), which can be immobilised, for example on microbeads, polymer surfaces or on quantum dots. By way of example, recombinant apo-PCPs may be used as drug carriers for retinoid drugs, such as Acitretin, Alitretinoin, Bexarotene, Etretinate, Fenretinide, Isotretinoin, Tazarotene, and Tretinoin, and synthetic analogues thereof. These retinoid drugs have similar structures to carotenoids.
According to a further aspect of the present invention there is provided the use of a protein or protein complex as herein described as a carrier for a pharmaceutical agent.
The proteins of the present invention are also useful as biosensors, for example for carotenoids. Fluorescent proteins and quantum dots may be engineered as biosensors. For example, apo-PCPs may be reconstituted with chlorophylls only, thereby rendering the resultant PCP-chlorophyll complex a molecular sensor for a wide range of carotenoids. On binding a specific carotenoid, a corresponding specific fluorescence signal will be detectable which can be used to identify the carotenoid in question. Specific applications include, for example, the identification of toxin-producing dinoflagellates, which cause 'red tides', and dinoflagellates that are responsible for ciguatera and other shellfish poisonings. According to a further aspect of the present invention there is provided the use of a protein or protein complex as herein described as a biosensor.
The proteins of the present invention are also useful as biosensors for porphyrins, such as chlorophylls and heme. For example, apo-PCPs may be reconstituted with carotenoid only, such as peridinin, thereby rendering the resultant PCP-carotenoid complex a molecular sensor for porphyrins. On binding porphyrin, the PCP-carotenoid complex produces a fluorescence signal that can be used to distinguish the porphyrin in question. The biosensors may be used, for example, to detect heme and heme metabolic products in human samples (e.g. blood or urine) for the diagnosis of porphyria (resulting in either an increase of a fluorescent signal or a fluorescence quenching signal).
The PCPs of the present invention may be used as enhancing food supplements, commonly known as 'nutraceuticals'. Carotenoids and chlorophylls have antioxidant activity and thus PCPs preloaded with these pigments make ideal nutraceuticals. It should be noted that neither carotenoids nor chlorophylls are soluble in water. Therefore, PCP complexes provide a useful formulation for the body to take up the antioxidants in soluble form (or as a slow-release nutrient supplement). An important recent finding by Connolly et al. has shown that a supplement containing the three macular carotenoids meso-zeaxanthin, lutein and zeaxanthin can uniquely enrich an individual's protective macular pigment (Connolly et al. (2011) Investigative Ophthalmology and Visual Science October 6th edition). Thus, according to the present invention PCPs can be produced as food supplements containing photo-protective carotenoids such as these.
According to a further aspect of the present invention there is provided a food supplement comprising a protein or protein complex as herein described.
The proteins of the present invention are also useful in the design and synthesis of novel artificial light-harvesting complexes (LHC). A light-harvesting complex is a compilation of a number of subunit proteins that may be part of a larger supercomplex of a photosystem, the functional unit in photosynthesis. The ultimate goal of an artificial LHC is to exploit the energy of absorbed light to drive chemical reactions. Light-harvesting complexes such as PCPs are particularly well suited as they contain carotenoids and chlorophylls to funnel absorbed energy to the special pair via resonance energy transfer. Carotenoids serve a secondary function, suppressing damaging photochemical reactions, in particular those including oxygen, which can be induced by bright sunlight. Molecular fusions of recombinant PCPs linked to other photoactive chromophores such as GFP provide for the creation of enhanced LHCs. Such 'enhanced' LHCs may be used to create novel organisms with enhanced photosynthetic yields or even artificial 'in vitro' photosynthesis.
According to a further aspect of the present invention there is provided a light harvesting complex comprising a protein or protein complex as herein described.
The invention will now be further described with reference to the following non-limiting examples and accompanying figures in which:
Figure 1 shows a schematic representation of methods of the prior art contrasted with a representative method according to the present invention: panel A shows the chemical conjugation of natural isolates of PCP to proteins via crosslinkers and heterobifunctional linkers (patent numbers US 4,876,190 and US 6,133,429); panel B illustrates the published method of Miller et al. which produces recombinant but aggregated PCP as inclusion bodies and requires at least twelve steps to refold the protein; and panel C illustrates an embodiment of the present invention in which a novel, highly soluble, high yielding, recombinant His6-tagged PCP is produced in a single-step process without the need for refolding.
Figure 2 shows, in schematic form, a representative cloning vector (pET46) suitable for use in aspects of the present invention. The pET46 vector contains a strong T7-lac promoter and an amino-terminal His6-tag coding sequence immediately followed by an Ek/LIC cloning site designed to allow the generation of fusion proteins with minimal vector-encoded sequence.
Figure 3 illustrates the detection of expressed PCP proteins according to the present invention. Clones of fPCP- and cPCP-pET46 were transformed into the BL21 (DE3) strain of E. coli and small-scale cell cultures were lysed and analysed by sodium dodecyl sulphate-polyacrylamide gel electrophoresis (SDS-PAGE). The recombinant fPCP and cPCP had approximate molecular masses of 36 kDa and 20 kDa respectively, as determined by SDS-PAGE.
Figure 4 illustrates the detection of expressed recombinant PCP proteins according to the present invention, produced at a larger scale. A 1-L culture was established of the BL21 (DE3) strain of E. coli transformed with the cPCP-pET46 clone and the lysate was analysed by SDS-PAGE. High concentrations of expressed hexahistidine-tagged cPCP of 20 kDa mass were observed in each lane and protein was expressed with high yield (136 mg/L).
Figure 5 illustrates the visual detection of fluorescence from PCP proteins according to the present invention in association with various chlorophylls and carotenoids. Specifically, recombinant cPCP and fPCP were reconstituted with chlorophyll a ('a') or chlorophyll b ('b') and either xanthophyll ('x') or beta-carotene ('c'). The subsequent fluorescence under UV light was recorded with a camera. Panels 1, 2, 5 and 6 are reconstitution reactions containing both protein and pigments. Panels 3 and 7 are negative controls, containing only pigments without the proteins, and Panel 4 is a negative control containing only protein without the pigments. The appearance of red fluorescence was observed with cPCP and fPCP only in the presence of xanthophyll or beta-carotene with either chlorophyll a or chlorophyll b.
Figure 6 shows fluorescence spectra (excited at 435nm) for PCP proteins according to the present invention in association with a chlorophyll plus xanthophyll. The following fluorescence spectra are illustrated: chlorophyll a; chlorophyll a plus xanthophyll; fPCP plus chlorophyll a plus xanthophyll and cPCP plus chlorophyll a plus xanthophyll. A large fluorescence intensity increase of fPCP and cPCP was observed in the presence of chlorophyll a and xanthophyll.
Figure 7 illustrates the measurement of the molecular weight of PCP protein according to the present invention. cPCP was reconstituted with xanthophyll and chlorophyll a. The reconstituted cPCP was analysed with a size-exclusion column coupled to a multi-angle laser light scattering (MALLS) instrument. The results show a chromatogram of absorbance versus fraction volume, whereby the MALLS detector measured the absorbance of the cPCP at 280 nm and the light scatter at 680 nm of the eluted fractions. The major peak pertaining to the protein fraction at 15 mL (marked with an arrow) was calculated to have a molecular weight of 22.8 kDa with an rms radius moments of 5 nm as determined by MALLS, which indicates that the holo-cPCP is a well-packed spherical monomeric protein.
Figure 8 illustrates the secondary structure analysis of PCP protein according to the present invention. The circular dichroism (CD) spectrum of expressed cPCP is shown in Figure 8A. The CD spectrum shows a characteristic signature for a folded protein with a dominant alpha-helix secondary structure (with a dip at 207 nm). For reference, the CD spectra for model protein secondary structures when scanned in the UV range are shown in Figure 8B. It is clear from Figure 8A that the CD spectrum is similar to the X-marked curve in Figure 8B which corresponds to a protein structure with dominant α-helical secondary structure.
Figure 9 illustrates the measurement of the thermal stability of PCP protein according to the present invention. CD signals at 207 nm for expressed cPCP were measured at increasing temperatures up to 95°C. The results show that only a 10 mdeg change is observed. This is only a 10% fraction of the total possible mdeg change (Figure 8 shows a 110 mdeg units maximum in the folded cPCP structure).
Figure 10 illustrates the predicted structure of PCPs according to the present invention. Figure 10A shows a 3-D molecular structural diagram of the predicted crystal structure of fPCP (SEQ ID NO:1) and Figure 10B shows the 3-D structure for cPCP (SEQ ID NO:2). This was achieved using the modelling software developed by Guex and Peitsch (Guex, N. and Peitsch, M.C. (1997) Electrophoresis 18, 2714-2723). A detailed schematic of the predicted secondary structure of 313 residues of fPCP is illustrated in Figure 10C. This model shows a helical structure (72%) with an alpha solenoid architecture comprised of 16 helices, highlighting the important and unique amino acid residues. Both Figure 10A and Figure 10B were modelled using the crystal structure of PCP from the dinoflagellate Amphidinium carterae (TaxId: 2961, protein data bank (pdb) file: 1PPR) with mutated residues valine 95 to isoleucine, glutamic acid 137 to leucine, lysine 147 to glutamic acid, glutamine 202 to aspartic acid, leucine 254 to isoleucine, serine 264 to alanine, asparagine 295 to valine and alanine 296 to glycine, to give the exact primary sequences of SEQ ID NO:1 (fPCP) and SEQ ID NO:2 (cPCP) respectively.
Figure 11 shows the sequence alignment of the sequence of polyhistidine-tagged fPCP (SEQ ID NO:3) in relation to PCP sequences from S. kawagutii, S. sp-RKT/203 and A. carterae. Multiple sequence alignment for fPCP protein was achieved using the CLUSTAL 2.1 multiple sequence alignment program (www.ebi.ac.uk/Tools/msa/clustalw2). The residues highlighted in bold are unique to fPCP. Note the symbols '*' = identical amino acids, ':' = very similar amino acids and '.' = similar amino acids.
Figure 12 shows the sequence alignment of the sequence of polyhistidine-tagged cPCP (SEQ ID NO:4) in relation to PCP sequences from S. kawagutii, S. sp-RKT/203 and A. carterae. Sequence alignment was carried out as described in relation to Figure 11.
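The residue numbering used in the description of Figure 10 can be checked directly against the sequence listing. The following is a minimal Python sketch, not part of the disclosed method: it assumes 1-based residue numbering, takes the fPCP sequence (SEQ ID NO:1) and the 15-residue His-tag leader from the sequence listing of this document, and uses a hypothetical helper named `residues_match`.

```python
# fPCP (SEQ ID NO:1), copied from the sequence listing of this document.
FPCP = (
    "DEIGDAAKKLGDASYSFAKEVDWNNGIFLQAPGKFQPLKAL"
    "KAIDKMIEMGAAADPKLLKEAAEAHHKAIGSISGPNGVTSR"
    "ADWDAVNAAIGRIVASVPKAKVMAVYNSVKDITDPKVPAY"
    "MKSLVNGPDAEKAYLGFLEFKDVVEKNQVTTASAPAVVPS"
    "GDKIGVAAKALSDASYPFIKDIDWLSDIYLKPLPGKTAPDTL"
    "KAIDKMIVMGAKMDGNLLKAAAEAHHKAIGSIDAKGVTSA"
    "ADYEAVNAAIGRLVASVPKATVMDVYNSMAKVVDSTVTNN"
    "MFSKVNPLDAVGAAKGFYTFKDVVEASQR"
)

# N-terminal leader shared by the His-tagged sequences (SEQ ID NO:3 and NO:4).
HIS_TAG = "MAHHHHHHVDDDDKI"

# Residues named in the description of Figure 10, keyed by 1-based position
# in SEQ ID NO:1.
EXPECTED = {95: "I", 137: "L", 147: "E", 202: "D",
            254: "I", 264: "A", 295: "V", 296: "G"}


def residues_match(sequence, expected, offset=0):
    """Hypothetical helper: True if each 1-based position, shifted by
    `offset`, carries the expected one-letter residue code."""
    return all(sequence[pos + offset - 1] == aa for pos, aa in expected.items())


print(len(FPCP))                         # number of residues in SEQ ID NO:1
print(residues_match(FPCP, EXPECTED))    # numbering of the untagged protein
# With the tag attached, every position shifts by the tag length.
print(residues_match(HIS_TAG + FPCP, EXPECTED, offset=len(HIS_TAG)))
```

The `offset` argument reflects why the positions recited for the tagged sequences are 15 higher than those recited for the untagged sequences.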
Examples
1. DNA sequencing of S. S. flexibilis PCPs
Sequencing reactions were performed using the pET-RP reverse primer (sequence CTAGTTATTGCTCAGCGG) at 10 μM (GATC Biotech Ltd., London) on an ABI 3730xl DNA sequencer, which uses Sanger technology to determine DNA sequences.
2. DNA constructs for the expression of S. S. flexibilis PCPs
The PCR fragments of the fPCP (SEQ ID NO: 5) and cPCP (SEQ ID NO: 6) were cloned into the pET46 vector (Figure 2), whereby the expression of the cloned gene was under the control of the T7/lac promoter, which can be induced with IPTG. The highly active T7 RNA polymerase driving the expression of the cloned gene was produced by the host E. coli, BL21 (DE3), which carries the polymerase gene on the lambda DE3 prophage. The expressed protein was tagged with six histidines, which were used to facilitate the downstream isolation of the expressed His-tagged proteins from E. coli lysate with Ni2+-NTA resins. Optionally, the His tag can also be cleaved from the protein by proteinase digestion, enterokinase being used in this instance. The calculated molecular weights of the proteins were 38.46 kDa for the fPCP and 19.59 kDa for cPCP.
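A calculated molecular weight of this kind can be approximated by summing average residue masses and adding one water molecule for the free termini. The sketch below is illustrative only: the document does not state which tool or mass table the inventors used, so the table here (standard average residue masses, in daltons) is an assumption and the result may differ slightly from the reported 19.59 kDa. The sequence is the 184-residue His-tagged cPCP (SEQ ID NO:4) from the sequence listing.

```python
# Average residue masses in daltons (amino acid mass minus one water).
# Assumption: the document does not say which mass table was used.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the N- and C-termini


def molecular_weight(sequence):
    """Average molecular weight (Da) of an unmodified linear polypeptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


# His-tagged cPCP (SEQ ID NO:4), copied from the sequence listing.
CPCP_HIS = (
    "MAHHHHHHVDDDDKIVVAKNQVTTASAPAVVPSGDKIGVA"
    "AKALSDASYPFIKDIDWLSDIYLKPLPGKTAPDTLKAIDKMI"
    "VMGAKMDGNLLKAAAEAHHKAIGSIDAKGVTSAADYEAVN"
    "AAIGRLVASVPKATVMDVYNSMAKVVDSTVTNNMFSKVNP"
    "LDAVGAAKGFYTFKDVVEASQR"
)

mw = molecular_weight(CPCP_HIS)
print(f"{len(CPCP_HIS)} residues, approx. {mw / 1000:.2f} kDa")
```

Bound pigments, the tag cleavage state and post-translational modifications are ignored in this estimate.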
3. Production of soluble S. S. flexibilis PCPs
The clones of S. S. flexibilis fPCP and cPCP were transformed into the BL21 (DE3) strain of E. coli. When the E. coli culture was grown to a density of OD600 = 0.4, the cultures were induced with 0.1 mM IPTG. After a further 4 hours of incubation, the cultures were harvested by centrifugation at 6,000 rpm and the cells were lysed using a cell breaker. The lysed cells were further centrifuged at 10,000 rpm and the supernatant and pellets were analysed using SDS-PAGE, the results of which are shown in Figure 3. These data show that the fPCP and the cPCP were expressed in high yields and in soluble form from the clones. The recombinant expressed fPCP and cPCP had approximate molecular masses of 36 kDa and 20 kDa respectively as determined by SDS-PAGE. These are in agreement with the theoretically calculated molecular weights from the protein sequence of 38.46 kDa for the fPCP and 19.59 kDa for cPCP. It was estimated by subsequent purification that the yield was about 100 mg/L of culture of fPCP and cPCP.
4. Larger-scale production of soluble S. S. flexibilis PCP
A one-litre culture of the cPCP clone according to Example 2 was set up under the same conditions as the small-scale culture. After induction at 37°C for 4 hours, the culture was harvested and cells lysed with a cell breaker. Protein was purified on a 5 ml Ni2+-NTA column. After washing the column with washing buffer (50 mM phosphate, pH 8 and 300 mM NaCl, including 60 mM imidazole), the bound protein was eluted with 200 mM imidazole in washing buffer in 1 ml aliquots. The eluted protein was analysed with SDS-PAGE and the results are shown in Figure 4. These results show that hexahistidine-tagged cPCP was expressed reproducibly in high yield of approximately 136 mg/L.
5. Reconstitution of S. S. flexibilis PCPs with chlorophylls and carotenoids
PCP reconstitution was carried out according to the method of Miller et al. A small volume of PCP apoprotein (typically 200-400 μg protein) in 25 mM Tris-HCl, pH 7.5, 10 mM KCl was mixed with a stoichiometric amount of PCP pigments dissolved in ethanol, resulting in a final ethanol concentration of 15% in a total volume of 1 ml. The sample was then left at 4°C for 72 hours before examining under blue light and the fluorescent signals recorded with a camera. The results are shown in Figure 5, in which Panels 1, 2, 5 and 6 are reconstitution reactions containing both protein and pigments. Panels 3 and 7 are negative controls, containing only pigments without the proteins, and Panel 4 is a negative control containing only protein without the pigments. In Figure 5, 'a' represents chlorophyll a; 'b' represents chlorophyll b; 'x' represents xanthophyll; 'c' represents beta-carotene; 'cPCP' represents the C-terminal PCP and 'fPCP' represents the full-length PCP.
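The quantities in such a reconstitution can be worked out with simple arithmetic. The sketch below is a hedged illustration, not the protocol of Miller et al.: it assumes the 15% ethanol figure is by volume, and the `pigments_per_protein` ratio is a hypothetical placeholder because the text only says "stoichiometric amount". The function name `reconstitution_plan` is likewise invented for this example.

```python
def reconstitution_plan(protein_ug, protein_mw_da, final_volume_ml=1.0,
                        ethanol_fraction=0.15, pigments_per_protein=1.0):
    """Return amounts for a reconstitution mix.

    protein_ug            mass of apoprotein in micrograms
    protein_mw_da         protein molecular weight in Da (g/mol)
    ethanol_fraction      final ethanol fraction, assumed to be v/v
    pigments_per_protein  hypothetical molar ratio (not given in the text)
    """
    # micrograms / (g/mol) gives micromoles; multiply by 1000 for nanomoles.
    protein_nmol = protein_ug / protein_mw_da * 1000.0
    ethanol_ul = final_volume_ml * 1000.0 * ethanol_fraction
    return {
        "protein_nmol": protein_nmol,
        "pigment_nmol": protein_nmol * pigments_per_protein,
        "ethanol_ul": ethanol_ul,
        "aqueous_ul": final_volume_ml * 1000.0 - ethanol_ul,
    }


# Lower end of the typical range, using the calculated cPCP weight of 19.59 kDa.
plan = reconstitution_plan(protein_ug=200.0, protein_mw_da=19590.0)
for name, value in plan.items():
    print(f"{name}: {value:.2f}")
```

In practice the pigment stock concentration would determine how much of the ethanol volume is pigment solution and how much is neat ethanol.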
As shown in Figure 5, both the cPCP (panel 1) and fPCP (panel 2) of the present invention resulted in red fluorescence in the presence of chlorophyll a or chlorophyll b and xanthophyll. Greater fluorescence intensity of fPCP was observed upon reconstitution with chlorophyll a and xanthophyll (panel 2) compared to reconstitution with chlorophyll b and xanthophyll (panel 6). The protein solution alone (Figure 5, panel 4, cPCP and fPCP), a single pigment (chlorophyll a, chlorophyll b, xanthophyll, beta-carotene) or a mixture of these pigments did not produce observable fluorescence (Figure 5, a, ax, ac, b, bx and bc, in panels 3 and 7). In addition, both the cPCP and fPCP can bind chlorophyll a or chlorophyll b as their ligands (panels 1, 2, 5 and 6). Binding of the carotenoids is extremely important for the fluorescent state of the bound chlorophylls. Without the binding of carotenoids no fluorescence was observed (Figure 5, a or b in all of the panels). Fluorescence emission was only produced when both chlorophyll and carotenoids were present when reconstituted with either cPCP or fPCP (Figure 5, ax, ac, bx and bc).
6. Fluorescence spectrometry with the reconstituted PCPs
The fluorescence spectra (excited at 435 nm) of the individual pigments chlorophyll a and xanthophyll, mixtures of chlorophyll a and xanthophyll, and the reconstituted protein-pigment complex, scanned with a Cary Eclipse Fluorescence Spectrophotometer (Agilent, USA), are shown in Figure 6. The pure pigments without any proteins did not result in any fluorescence in the range of 484 to 850 nm. The mixture of chlorophyll a and xanthophyll (without PCP) generated a very weak emission signal at 675 nm. However, chlorophyll a and xanthophyll with the addition of either cPCP or fPCP had significantly higher fluorescence intensity signals (at least eight-fold) at 675 nm as shown in Figure 6. Both PCPs had equivalent fluorescence intensity in these conditions.
7. LC-MASS spectrometry of the expressed PCPs
The protein bands indicated by the arrow heads in Figure 3 were cut from the gels and analysed by LC-MS using an ESI-TRAP mass spectrometer (Bruker Daltonics, Esquire 3000plus). Both marked proteins in the fPCP and the cPCP clone were identified as PCP using a protein fragment search with Mascot software. The Mascot search results are shown in Table 1. Mascot is a software search engine that uses mass spectrometry data to identify proteins from primary sequence databases.
Table 1
Protein source | Full-length band | C-terminal band
Match to | PCP SYMSP | PCP SYMSP
Organism | Symbiodinium sp. | Symbiodinium sp.
Score | 418 | 186
Protein | Peridinin-chlorophyll a-binding protein, chloroplastic | Peridinin-chlorophyll a-binding protein, chloroplastic
Nominal mass (Mr) | 37.27 kDa | 37.88 kDa
Calculated pI value | 6.61 | 6.61
8. Molecular weight and secondary structure of expressed PCP
Figure 7 shows a size-exclusion column chromatogram of absorbance (A280 or A690) against fraction elution (ml) after injection of cPCP sample (reconstituted with chlorophyll a and xanthophyll). The line with solid circles is the UV signal at 280 nm and the solid line is the laser light scattering signal, for which a 690 nm laser light was used. The arrow indicates where the cPCP was eluted, which corresponds to the A280 peak. The molecular weight of this cPCP fraction was analysed with a multi-angle laser light scattering (MALLS) instrument. MALLS is a technique for independently determining the absolute molar mass and the average size of particles in solution by detecting how they scatter light. The measured molecular weight was 22.8 kDa as determined by MALLS. This was very close to the calculated molecular weight from the clone's DNA sequence information (coding for 184 amino acid residues, giving a molecular weight of 19.59 kDa). The molecule has rms radius moments of 5 nm. This indicates that the cPCP is a well-packed, spherical monomer when expressed in E. coli.
9. Secondary structure study of expressed PCP by CD spectrometry
The purified cPCP block was studied using Circular Dichroism with a JASCO J-810 spectropolarimeter with a quartz cuvette with a path length of 10 mm. The data generated indicate that the cPCP consists of mainly alpha helices (see Figure 8). This is consistent with the published crystal structure of PCP proteins (Schulte, Johanning et al. (2010) European Journal of Cell Biology Molecular Biology of Complex Functions of Botanical Systems 89(12): 990-997). Figure 8A shows the CD spectrum of the purified cPCP. Figure 8B shows typical spectra of protein secondary structures when scanned with a CD spectrometer in the UV range.
10. Stability of expressed PCP in solution
The thermostability of cPCP was studied by measuring its CD signals at 207 nm with incremental temperature increase from 30-95°C. The CD signal is a probe for the secondary helical structure of cPCP. The results are shown in Figure 9. These data show that the cPCP is a remarkably stable protein and does not unfold at 60°C. Even at 95°C only a very small part of the protein was denatured, with only a 10 mdeg change observed. This is only a 10% fraction of the total mdeg change possible. Figure 8A shows 110 mdeg units maximum in the folded cPCP structure. Figure 8B illustrates the expected CD spectra for the alpha-helical, beta-sheets and random coil secondary structures of proteins, indicating that the cPCP is mostly alpha-helical in structure.
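The "10% fraction" quoted above follows from the two CD values given in the text. As a small worked check in Python (the function name is illustrative, and treating the 110 mdeg folded signal as the total possible change is the document's own simplification):

```python
def fraction_of_signal_lost(change_mdeg, folded_signal_mdeg):
    """Fraction of the folded-state CD signal lost on heating."""
    return change_mdeg / folded_signal_mdeg


# Values from the text: 10 mdeg change at 95 degrees C, 110 mdeg folded signal.
loss = fraction_of_signal_lost(10.0, 110.0)
print(f"{loss:.1%} lost, {1 - loss:.1%} retained")  # 9.1% lost, 90.9% retained
```

The computed value of about 9% is consistent with the rounded figure of 10% stated in the text.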
11. Sequence homology of S. S. flexibilis PCP with known PCPs
Sequence identity with known PCPs with respect to fPCP (SEQ ID NO:1) was determined using the CLUSTAL 2.1 multiple sequence alignment program. Percent identity was calculated in relation to polypeptides whose sequences were aligned optimally. The results are shown in Table 2 below:
Table 2
[Table 2 appears as an image (imgf000033_0001) in the original publication.]
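Percent identity between two optimally aligned sequences can be sketched as follows. This is a minimal illustration and not the CLUSTAL 2.1 implementation: tools differ in the denominator they use, and this version counts only columns where neither sequence has a gap, which is an assumption. The toy input is the region where cPCP begins, taken from the sequence listing (fPCP "DVVEKNQVTT" against cPCP "VVAKNQVTT").

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the aligned, gap-free columns of two sequences.

    Both inputs must already be aligned (equal length, gaps as `gap`).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# fPCP residues 144-153 aligned against the first nine residues of cPCP.
print(f"{percent_identity('DVVEKNQVTT', '-VVAKNQVTT'):.1f}%")
```

For full-length comparisons, the aligned strings would come from the output of an alignment program rather than being typed by hand.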
Sequence alignment of the fPCP (SEQ ID NO:1) and cPCP (SEQ ID NO:2) of the present invention with known PCPs from other organisms is shown in Figures 11 and 12, respectively.
Sequence information
Amino acid sequence of full-length PCP of S. S. flexibilis (fPCP; SEQ ID NO:1)
DEIGDAAKKLGDASYSFAKEVDWNNGIFLQAPGKFQPLKAL KAIDKMIEMGAAADPKLLKEAAEAHHKAIGSISGPNGVTSR ADWDAVNAAIGRIVASVPKAKVMAVYNSVKDITDPKVPAY MKSLVNGPDAEKAYLGFLEFKDVVEKNQVTTASAPAVVPS GDKIGVAAKALSDASYPFIKDIDWLSDIYLKPLPGKTAPDTL KAIDKMIVMGAKMDGNLLKAAAEAHHKAIGSIDAKGVTSA ADYEAVNAAIGRLVASVPKATVMDVYNSMAKVVDSTVTNN MFSKVNPLDAVGAAKGFYTFKDVVEASQR Stop
Amino acid sequence of C-terminal PCP of S. S. flexibilis (cPCP; SEQ ID NO:2)
VVAKNQVTTASAPAVVPSGDKIGVAAKALSDASYPFIKDID WLSDIYLKPLPGKTAPDTLKAIDKMIVMGAKMDGNLLKAA AEAHHKAIGSIDAKGVTSAADYEAVNAAIGRLVASVPKATV MDVYNSMAKVVDSTVTNNMFSKVNPLDAVGAAKGFYTFK DVVEASQR Stop
Amino acid sequence of His-tagged full-length PCP of S. S. flexibilis (fPCP; SEQ ID NO:3)
MAHHHHHHVDDDDKI DEIGDAAKKLGDASYSFAKEVDWNNGIFLQAPGKFQPLKAL KAIDKMIEMGAAADPKLLKEAAEAHHKAIGSISGPNGVTSR ADWDAVNAAIGRIVASVPKAKVMAVYNSVKDITDPKVPAY MKSLVNGPDAEKAYLGFLEFKDVVEKNQVTTASAPAVVPS GDKIGVAAKALSDASYPFIKDIDWLSDIYLKPLPGKTAPDTL KAIDKMIVMGAKMDGNLLKAAAEAHHKAIGSIDAKGVTSA ADYEAVNAAIGRLVASVPKATVMDVYNSMAKVVDSTVTNN MFSKVNPLDAVGAAKGFYTFKDVVEASQR Stop
Amino acid sequence of His-tagged C-terminal PCP of S. S. flexibilis (cPCP; SEQ ID NO:4)
MAHHHHHHVDDDDKIVVAKNQVTTASAPAVVPSGDKIGVA AKALSDASYPFIKDIDWLSDIYLKPLPGKTAPDTLKAIDKMI VMGAKMDGNLLKAAAEAHHKAIGSIDAKGVTSAADYEAVN AAIGRLVASVPKATVMDVYNSMAKVVDSTVTNNMFSKVNP LDAVGAAKGFYTFKDVVEASQR Stop
DNA encoding full-length PCP of S. S. flexibilis (SEQ ID NO: 5)
GATGAGATTGGCGATGCTGCAAAGAAACTTGGGGATGCCTCCTACTCTTTTGCCA
AGGAGGTGGACTGGAACAATGGAATTTTCCTCCAGGCCCCTGGCAAGTTTCAGC
CCTTGAAAGCTTTGAAAGCCATTGACAAGATGATCGAAATGGGGGCAGCTGCAG ATCCC AAGCTTCTGAAAGAGGCTGCAGAAGC ACATC ACAAGGCCATTGGAAGCA TCAGTGGGCCAAATGGTGTGACTTCGCGTGCTGACTGGGATGCCGTGAATGCAG CCATTGGTCGTATAGTCGCTTCGGTCCCCAAAGCAAAGGTCATGGCCGTTTACAA TTCAGTGAAAGACATCACGGATCCCAAAGTGCCAGCTTACATGAAGTCCTTGGTG AACGGGCCCGATGCCGAAAAGGCCTACCTAGGATTCCTGGAATTCAAGGATGTT GTTGAAAAGAACCAGGTGACCACCGCCAGTGCTCCTGCAGTTGTGCCTTCTGGGG ACAAGATTGGTGTGGCTGCAAAAGCGTTGTCCGATGCATCCTATCCTTTCATCAA GGACATCGATTGGCTGTCAGACATTTACCTGAAGCCGCTGCCCGGCAAGACTGCC CCAGACACCCTGAAAGCCATTGACAAGATGATTGTGATGGGCGCCAAGATGGAT GGCAACCTCTTGAAGGCAGCAGCAGAGGCACACCACAAGGCCATTGGCAGCATT GATGCCAAGGGTGTGACATCCGCGGCCGACTATGAAGCTGTGAATGCAGCCATT GGGCGCTTGGTGGCATCCGTGCCCAAGGCCACCGTGATGGATGTGTACAATTCCA TGGCCAAGGTCGTTGATTCCACCGTCACCAACAACATGTTCTCCAAGGTGAATCC ATTGGATGCAGTGGGTGCCGCCAAGGGTTTCTACACCTTCAAAGATGTTGTGGAG GCTTCCCAGCGCTGA
DNA encoding C-terminal PCP of S. S. flexibilis (SEQ ID NO: 6)
GTTGTTGCTAAGAACCAGGTGACCACCGCCAGTGCTCCTGCAGTTGTGCCTTCTG GGGACAAGATTGGTGTGGCTGCAAAAGCGTTGTCCGATGCATCCTATCCTTTCAT CAAGGACATCGATTGGCTGTCAGACATTTACCTGAAGCCGCTGCCCGGCAAGAC TGCCCCAGACACCCTGAAAGCCATTGACAAGATGATTGTGATGGGCGCCAAGAT GGATGGCAACCTCTTGAAGGCAGCAGCAGAGGCACACCACAAGGCCATTGGCAG CATTGATGCCAAGGGTGTGACATCCGCGGCCGACTATGAAGCTGTGAATGCAGC CATTGGGCGCTTGGTGGCATCCGTGCCCAAGGCCACCGTGATGGATGTGTACAAT TCCATGGCCAAGGTCGTTGATTCCACCGTCACCAACAACATGTTCTCCAAGGTGA ATCCATTGGATGCAGTGGGTGCCGCCAAGGGTTTCTACACCTTCAAAGATGTTGT GGAGGCTTCCCAGCGCTGA
DNA encoding His-tagged full-length PCP of S. S. flexibilis (SEQ ID NO: 7)
ATGGCACATCACCACCACCATCACGTGGATGACGACGAC
AAGATTGATGAGATTGGCGATGCTGCAAAGAAACTTGGGGATGCCTCCTACTCTT TTGCCAAGGAGGTGGACTGGAACAATGGAATTTTCCTCCAGGCCCCTGGCAAGTT TCAGCCCTTGAAAGCTTTGAAAGCCATTGACAAGATGATCGAAATGGGGGCAGC TGCAGATCCCAAGCTTCTGAAAGAGGCTGCAGAAGCACATCACAAGGCCATTGG AAGCATCAGTGGGCCAAATGGTGTGACTTCGCGTGCTGACTGGGATGCCGTGAA TGCAGCCATTGGTCGTATAGTCGCTTCGGTCCCCAAAGCAAAGGTCATGGCCGTT TACAATTCAGTGAAAGACATCACGGATCCCAAAGTGCCAGCTTACATGAAGTCC TTGGTGAACGGGCCCGATGCCGAAAAGGCCTACCTAGGATTCCTGGAATTCAAG GATGTTGTTGAAAAGAACCAGGTGACCACCGCCAGTGCTCCTGCAGTTGTGCCTT CTGGGGACAAGATTGGTGTGGCTGCAAAAGCGTTGTCCGATGCATCCTATCCTTT CATCAAGGACATCGATTGGCTGTCAGACATTTACCTGAAGCCGCTGCCCGGCAA GACTGCCCCAGACACCCTGAAAGCCATTGACAAGATGATTGTGATGGGCGCCAA GATGGATGGCAACCTCTTGAAGGCAGCAGCAGAGGCACACCACAAGGCCATTGG CAGCATTGATGCCAAGGGTGTGACATCCGCGGCCGACTATGAAGCTGTGAATGC AGCCATTGGGCGCTTGGTGGCATCCGTGCCCAAGGCCACCGTGATGGATGTGTAC AATTCCATGGCCAAGGTCGTTGATTCCACCGTCACCAACAACATGTTCTCCAAGG TGAATCCATTGGATGCAGTGGGTGCCGCCAAGGGTTTCTACACCTTCAAAGATGT TGTGGAGGCTTCCCAGCGCTGA
DNA encoding His-tagged C-terminal PCP of S. S. flexibilis (SEQ ID NO: 8)
ATGGCACATCACCACCACCATCACGTGGATGACGACGACAAGATTGTTGTTGCTA AGAACCAGGTGACCACCGCCAGTGCTCCTGCAGTTGTGCCTTCTGGGGACAAGA TTGGTGTGGCTGCAAAAGCGTTGTCCGATGCATCCTATCCTTTCATCAAGGACAT CGATTGGCTGTCAGACATTTACCTGAAGCCGCTGCCCGGCAAGACTGCCCCAGAC ACCCTGAAAGCCATTGACAAGATGATTGTGATGGGCGCCAAGATGGATGGCAAC CTCTTGAAGGCAGCAGCAGAGGCACACCACAAGGCCATTGGCAGCATTGATGCC AAGGGTGTGACATCCGCGGCCGACTATGAAGCTGTGAATGCAGCCATTGGGCGC TTGGTGGC ATCCGTGCCCAAGGCCACCGTGATGGATGTGTAC AATTCCATGGCCA AGGTCGTTGATTCCACCGTCACCAACAACATGTTCTCCAAGGTGAATCCATTGGA TGCAGTGGGTGCCGCCAAGGGTTTCTACACCTTCAAAGATGTTGTGGAGGCTTCC CAGCGCTGA
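The correspondence between the DNA and amino acid listings above can be checked by translating the coding sequences with the standard genetic code. The following Python sketch is illustrative only; `translate` is a hypothetical helper, and the codon table is built from the conventional TCAG ordering of the standard code. The example inputs are short stretches copied from the listings: the His-tag leader at the start of SEQ ID NO:7 and NO:8, and the final codons shared by all four DNA sequences.

```python
# Standard genetic code, amino acids listed in TCAG codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA coding sequence in frame 1 up to the first stop codon.

    Whitespace (such as the line-wrap spaces in the listings) is removed first.
    """
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":  # stop codon ends translation
            break
        protein.append(residue)
    return "".join(protein)


# His-tag leader from the start of SEQ ID NO:7 and SEQ ID NO:8.
print(translate("ATGGCACATCACCACCACCATCACGTGGATGACGACGAC AAGATT"))
# Final codons of the coding sequences, ending in the TGA stop codon.
print(translate("GAGGCTTCCCAGCGCTGA"))
```

Passing a complete DNA listing to `translate` should reproduce the corresponding amino acid listing, with the trailing "Stop" represented by the point at which translation ends.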

Claims

1. A protein selected from:
a protein having at least 95% sequence identity with the amino acid sequence of SEQ ID NO:1; and
a protein having at least 97% sequence identity with the amino acid sequence of SEQ ID NO:2;
or a fragment or derivative thereof.
2. A protein, or fragment or derivative thereof, according to claim 1, having at least 96%, at least 97%, at least 98% or at least 99% sequence identity with the amino acid sequence of SEQ ID NO:1.
3. A protein having at least 80% sequence identity with the amino acid sequence of SEQ ID NO:1 and having one or more of: isoleucine 95, leucine 137, glutamic acid 147, aspartic acid 202, isoleucine 254, alanine 264, valine 295 and glycine 296, or a fragment or derivative thereof.
4. A protein, or fragment or derivative thereof, according to claim 3, having at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity with the amino acid sequence of SEQ ID NO:1.
5. A protein, or fragment or derivative thereof, according to any one of the preceding claims, having one or more of: tyrosine 191, histidine 229, histidine 230, isoleucine 254, tyrosine 270, phenylalanine 301 and tyrosine 302.
6. A protein, or fragment or derivative thereof, according to any one of the preceding claims, having one or more of: phenylalanine 28, histidine 66, histidine 67, asparagine 89, tyrosine 108 and tyrosine 136.
7. A protein according to any one of the preceding claims, wherein the protein comprises the sequence of SEQ ID NO: 1.
8. A protein according to any one of the preceding claims, wherein the protein consists of the sequence of SEQ ID NO: 1.
9. A protein, or fragment or derivative thereof, according to claim 1, having at least 98% or at least 99% sequence identity with the amino acid sequence of SEQ ID NO:2.
10. A protein having at least 80% sequence identity with the amino acid sequence of SEQ ID NO:2 and having one or more of: aspartic acid 58, isoleucine 110, alanine 120, valine 151 and glycine 152, or a fragment or derivative thereof.
11. A protein, or fragment or derivative thereof, according to claim 10, having at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity with the amino acid sequence of SEQ ID NO:2.
12. A protein, or fragment or derivative thereof, according to any one of claims 9 to 11, having one or more of: tyrosine 47, histidine 85, histidine 86, isoleucine 110, tyrosine 126, phenylalanine 157 and tyrosine 158.
13. A protein according to claim 1 or any one of claims 9 to 12, wherein the protein comprises the sequence of SEQ ID NO:2.
14. A protein according to claim 1 or any one of claims 9 to 13, wherein the protein consists of the sequence of SEQ ID NO:2.
15. A protein comprising a recombinant soluble peridinin-chlorophyll binding protein (PCP), or a fragment or derivative thereof.
16. A protein, or fragment or derivative thereof, according to any one of the preceding claims, comprising one or more polyhistidine tags.
17. A protein according to claim 16, wherein the protein is selected from a protein comprising the amino acid sequence of SEQ ID NO:3 and a protein comprising the amino acid sequence of SEQ ID NO:4, or a fragment or derivative thereof.
18. A protein selected from:
a protein having at least 95% sequence identity with the amino acid sequence of SEQ ID NO:3; and
a protein having at least 97% sequence identity with the amino acid sequence of SEQ ID NO:4;
or a fragment or derivative thereof.
19. A protein, or fragment or derivative thereof, according to claim 18, having at least 96%, at least 97%, at least 98% or at least 99% sequence identity with the amino acid sequence of SEQ ID NO:3.
20. A protein having at least 80% sequence identity with the amino acid sequence of SEQ ID NO:3 and having one or more of: isoleucine 110, leucine 152, glutamic acid 162, aspartic acid 217, isoleucine 269, alanine 279, valine 310 and glycine 311, or a fragment or derivative thereof.
21. A protein, or fragment or derivative thereof, according to claim 20, having at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity with the amino acid sequence of SEQ ID NO:3.
22. A protein, or fragment or derivative thereof, according to any one of claims 18 to 21, having one or more of: tyrosine 206, histidine 244, histidine 245, isoleucine 269, tyrosine 285, phenylalanine 316 and tyrosine 317.
23. A protein, or fragment or derivative thereof, according to any one of claims 18 to 22, having one or more of: phenylalanine 43, histidine 81, histidine 82, asparagine 104, tyrosine 123 and tyrosine 151.
24. A protein according to any one of claims 18 to 23, wherein the protein comprises the sequence of SEQ ID NO:3.
25. A protein according to any one of claims 18 to 24, wherein the protein consists of the sequence of SEQ ID NO:3.
26. A protein, or fragment or derivative thereof, according to claim 18, having at least 98% or at least 99% sequence identity with the amino acid sequence of SEQ ID NO:4.
27. A protein having at least 80% sequence identity with the amino acid sequence of SEQ ID NO:4 and having one or more of: aspartic acid 73, isoleucine 125, alanine 135, valine 166 and glycine 167, or a fragment or derivative thereof.
28. A protein, or fragment or derivative thereof, according to claim 27, having at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity with the amino acid sequence of SEQ ID NO:4.
29. A protein, or fragment or derivative thereof, according to claim 18 or any one of claims 26 to 28, having one or more of: tyrosine 62, histidine 100, histidine 101, isoleucine 125, tyrosine 141, phenylalanine 172 and tyrosine 173.
30. A protein according to claim 18 or any one of claims 26 to 29, wherein the protein comprises the sequence of SEQ ID NO:4.
31. A protein according to claim 18 or any one of claims 26 to 30, wherein the protein consists of the sequence of SEQ ID NO:4.
32. A protein comprising a protein conjugate including a protein according to any one of the preceding claims, or a fragment or derivative thereof.
33. A protein comprising a fusion protein including a protein according to any one of claims 30 to 32, or a fragment or derivative thereof.
34. A protein according to claim 33, wherein the fusion protein comprises one or more of the following: antibodies, antigens, enzymes, and epitope tags.
35. A protein according to claim 33, wherein the fusion protein comprises one or more of the following: polyhistidine tag, V5 tag, TAP tag, flag peptide, anti-flag antibodies, glutathione-S-transferase, Staphylococcal protein A, Streptococcal protein G, calmodulin organic ligands, thioredoxin, β-galactosidase, ubiquitin, chloramphenicol acetyltransferase, S-peptide, S-protein, myosin heavy chain, DsbA, biotin subunit, avidin, streptavidin, Strep-tag streptavidin, c-myc, dihydrofolate reductase, CKSc, polyarginine, polycysteine, polyphenylalanine, lac repressor, T4 gp55, growth hormone N terminus, maltose-binding protein, galactose-binding protein, cyclomaltodextrin glucanotransferase, cellulose-binding domain, hemolysin A, E. coli, 1 ell protein, TrpE or TrpLE, protein kinases, (AlaTrpTrpPro)n, HAId epitope, BTag, anti-BTag antibodies and green fluorescent protein.
36. A protein complex comprising a protein according to any preceding claim.
37. A protein complex according to claim 36, wherein the protein complex comprises one or more of: chlorophyll a, chlorophyll b, chlorophyll c, chlorophyll d, chlorophyll e, chlorophyll f, bacteriochlorophylls, 2-(1-hexyloxyethyl)-2-devinyl pyropheophorbide-a (HPPH), porphyrin structures, synthetic derivatives of porphyrins, corrins, chlorins (2,3-dihydroporphyrins), corphins and heme, or analogues or derivatives thereof.
38. A protein complex according to claim 37, wherein the protein complex comprises a bacteriochlorophyll.
39. A protein complex according to any one of claims 36 to 38, wherein the protein complex comprises one or more of: α-carotene, β-carotene, γ-carotene, δ-carotene, ε-carotene, ζ-carotene, lycopene, neurosporene, phytoene, phytofluene, antheraxanthin, astaxanthin, canthaxanthin, citranaxanthin, cryptoxanthin, diadinoxanthin, diatoxanthin, dinoxanthin, flavoxanthin, fucoxanthin, lutein, neoxanthin, rhodoxanthin, rubixanthin, violaxanthin, zeaxanthin, abscisic acid, apocarotenal, bixin, crocetin, food orange 7 (E160f), ionones, retinal, retinoic acid and retinol, or analogues or derivatives thereof.
40. A protein complex according to any one of claims 36 to 38, wherein the protein complex comprises one or more of: lycopersene (7,8,11,12,15,7',8',11',12',15'-decahydro-γ,γ-carotene), phytofluene, hexahydrolycopene (15-cis-7,8,11,12,7',8'-hexahydro-γ,γ-carotene), torulene (3',4'-didehydro-β,γ-carotene) and α-zeacarotene (7',8'-dihydro-ε,γ-carotene); alcohols including alloxanthin, cynthiaxanthin, pectenoxanthin, cryptomonaxanthin ((3R,3'R)-7,8,7',8'-tetradehydro-β,β-carotene-3,3'-diol), crustaxanthin (β,β-carotene-3,4,3',4'-tetrol), gazaniaxanthin ((3R)-5'-cis-β,γ-caroten-3-ol), OH-chlorobactene (1',2'-dihydro-f,γ-caroten-1'-ol), loroxanthin (β,ε-carotene-3,19,3'-triol), lycoxanthin (γ,γ-caroten-16-ol), rhodopin (1,2-dihydro-γ,γ-caroten-1-ol), rhodopinol (warmingol; 13-cis-1,2-dihydro-γ,γ-carotene-1,20-diol), saproxanthin (3',4'-didehydro-1',2'-dihydro-β,γ-carotene-3,1'-diol), zeaxanthin, glycosides, oscillaxanthin (2,2'-bis(β-L-rhamnopyranosyloxy)-3,4,3',4'-tetradehydro-1,2,1',2'-tetrahydro-γ,γ-carotene-1,1'-diol) and phleixanthophyll (1'-(β-D-glucopyranosyloxy)-3',4'-didehydro-1',2'-dihydro-β,γ-caroten-2'-ol); ethers including rhodovibrin (1'-methoxy-3',4'-didehydro-1,2,1',2'-tetrahydro-γ,γ-caroten-1-ol), spheroidene (1-methoxy-3,4-didehydro-1,2,7',8'-tetrahydro-γ,γ-carotene), epoxides, diadinoxanthin (5,6-epoxy-7',8'-didehydro-5,6-dihydro-β,β-carotene-3,3'-diol), luteoxanthin (5,6:5',8'-diepoxy-5,6,5',8'-tetrahydro-β,β-carotene-3,3'-diol), mutatoxanthin, citroxanthin, zeaxanthin, furanoxide (5,8-epoxy-5,8-dihydro-β,β-carotene-3,3'-diol), neochrome (5',8'-epoxy-6,7-didehydro-5,6,5',8'-tetrahydro-β,β-carotene-3,5,3'-triol), foliachrome, trollichrome, vaucheriaxanthin (5',6'-epoxy-6,7-didehydro-5,6,5',6'-tetrahydro-β,β-carotene-3,5,19,3'-tetrol); aldehydes including rhodopinal, warmingone (13-cis-1-hydroxy-1,2-dihydro-γ,γ-caroten-20-al) and torularhodinaldehyde (3',4'-didehydro-β,γ-caroten-16'-al); acids and acid esters including torularhodin (3',4'-didehydro-β,γ-caroten-16'-oic acid) and torularhodin methyl ester (methyl 3',4'-didehydro-β,γ-caroten-16'-oate); ketones including astaxanthin, canthaxanthin (aphanicin; chlorellaxanthin; β,β-carotene-4,4'-dione), capsanthin ((3R,3'R,5'R)-3,3'-dihydroxy-β,κ-caroten-6'-one), capsorubin ((3S,5R,3'S,5'R)-3,3'-dihydroxy-κ,κ-carotene-6,6'-dione), cryptocapsin ((3'R,5'R)-3'-hydroxy-β,κ-caroten-6'-one), 2,2'-diketospirilloxanthin (1,1'-dimethoxy-3,4,3',4'-tetradehydro-1,2,1',2'-tetrahydro-γ,γ-carotene-2,2'-dione), flexixanthin (3,1'-dihydroxy-3',4'-didehydro-1',2'-dihydro-β,γ-caroten-4-one), 3-OH-canthaxanthin (adonirubin; phoenicoxanthin; 3-hydroxy-β,β-carotene-4,4'-dione), hydroxyspheriodenone (1'-hydroxy-1-methoxy-3,4-didehydro-1,2,1',2',7',8'-hexahydro-γ,γ-caroten-2-one), okenone (1'-methoxy-1',2'-dihydro-c,γ-caroten-4'-one), pectenolone (3,3'-dihydroxy-7',8'-didehydro-β,β-caroten-4-one), phoeniconone (dehydroadonirubin; 3-hydroxy-2,3-didehydro-β,β-carotene-4,4'-dione), phoenicopterone (β,ε-caroten-4-one), rubixanthone (3-hydroxy-β,γ-caroten-4'-one), siphonaxanthin (3,19,3'-trihydroxy-7,8-dihydro-β,ε-caroten-8-one); esters of alcohols including astacein (3,3'-bispalmitoyloxy-2,3,2',3'-tetradehydro-β,β-carotene-4,4'-dione or 3,3'-dihydroxy-2,3,2',3'-tetradehydro-β,β-carotene-4,4'-dione dipalmitate), fucoxanthin (3'-acetoxy-5,6-epoxy-3,5'-dihydroxy-6',7'-didehydro-5,6,7,8,5',6'-hexahydro-β,β-caroten-8-one), isofucoxanthin (3'-acetoxy-3,5,5'-trihydroxy-6',7'-didehydro-5,8,5',6'-tetrahydro-β,β-caroten-8-one), physalien, zeaxanthin dipalmitate ((3R,3'R)-3,3'-bispalmitoyloxy-β,β-carotene or (3R,3'R)-β,β-carotene-3,3'-diol dipalmitate), siphonein (3,3'-dihydroxy-19-lauroyloxy-7,8-dihydro-β,ε-caroten-8-one or 3,19,3'-trihydroxy-7,8-dihydro-β,ε-caroten-8-one 19-laurate); apo carotenoids including β-apo-2'-carotenal (3',4'-didehydro-2'-apo-b-caroten-2'-al), apo-2-lycopenal, apo-6'-lycopenal (6'-apo-y-caroten-6'-al), azafrinaldehyde (5,6-dihydroxy-5,6-dihydro-10'-apo-β-caroten-10'-al), bixin (6'-methyl hydrogen 9'-cis-6,6'-diapocarotene-6,6'-dioate), citranaxanthin (5',6'-dihydro-5'-apo-β-caroten-6'-one or 5',6'-dihydro-5'-apo-18'-nor-β-caroten-6'-one or 6'-methyl-6'-apo-β-caroten-6'-one), crocetin (8,8'-diapo-8,8'-carotenedioic acid), crocetinsemialdehyde (8'-oxo-8,8'-diapo-8-carotenoic acid), crocin (digentiobiosyl 8,8'-diapo-8,8'-carotenedioate), hopkinsiaxanthin (3-hydroxy-7,8-didehydro-7',8'-dihydro-7'-apo-b-carotene-4,8'-dione or 3-hydroxy-8'-methyl-7,8-didehydro-8'-apo-b-carotene-4,8'-dione), methyl apo-6'-lycopenoate (methyl 6'-apo-y-caroten-6'-oate), paracentrone (3,5-dihydroxy-6,7-didehydro-5,6,7',8'-tetrahydro-7'-apo-b-caroten-8'-one or 3,5-dihydroxy-8'-methyl-6,7-didehydro-5,6-dihydro-8'-apo-b-caroten-8'-one), sintaxanthin (7',8'-dihydro-7'-apo-b-caroten-8'-one or 8'-methyl-8'-apo-b-caroten-8'-one); nor and seco carotenoids including actinioerythrin (3,3'-bisacyloxy-2,2'-dinor-b,b-carotene-4,4'-dione), β-carotenone (5,6:5',6'-diseco-b,b-carotene-5,6,5',6'-tetrone); peridinin (3'-acetoxy-5,6-epoxy-3,5'-dihydroxy-6',7'-didehydro-5,6,5',6'-tetrahydro-12',13',20'-trinor-b,b-caroten-19,11-olide), pyrrhoxanthininol (5,6-epoxy-3,3'-dihydroxy-7',8'-didehydro-5,6-dihydro-12',13',20'-trinor-b,b-caroten-19,11-olide), semi-α-carotenone (5,6-seco-b,e-carotene-5,6-dione), semi-β-carotenone (5,6-seco-b,b-carotene-5,6-dione or 5',6'-seco-b,b-carotene-5',6'-dione), triphasiaxanthin (3-hydroxysemi-b-carotenone; 3'-hydroxy-5,6-seco-b,b-carotene-5,6-dione or 3-hydroxy-5',6'-seco-b,b-carotene-5',6'-dione); retro carotenoids and retro apo carotenoids including eschscholtzxanthin (4',5'-didehydro-4,5'-retro-b,b-carotene-3,3'-diol), eschscholtzxanthone (3'-hydroxy-4',5'-didehydro-4,5'-retro-b,b-caroten-3-one), rhodoxanthin (4',5'-didehydro-4,5'-retro-b,b-carotene-3,3'-dione), tangeraxanthin (3-hydroxy-5'-methyl-4,5'-retro-5'-apo-b-caroten-5'-one or 3-hydroxy-4,5'-retro-5'-apo-b-caroten-5'-one); and higher carotenoids including nonaprenoxanthin (2-(4-hydroxy-3-methyl-2-butenyl)-7',8',11',12'-tetrahydro-e,y-carotene), decaprenoxanthin (2,2'-bis(4-hydroxy-3-methyl-2-butenyl)-e,e-carotene), c.p. 450 (2-[4-hydroxy-3-(hydroxymethyl)-2-butenyl]-2'-(3-methyl-2-butenyl)-b,b-carotene), c.p. 473 (2'-(4-hydroxy-3-methyl-2-butenyl)-2-(3-methyl-2-butenyl)-3',4'-didehydro-1',2'-dihydro-b,y-caroten-1'-ol) and bacterioruberin (2,2'-bis(3-hydroxy-3-methylbutyl)-3,4,3',4'-tetradehydro-1,2,1',2'-tetrahydro-y,y-carotene-1,1'-diol).
41. A nucleic acid encoding a protein, or fragment or derivative thereof, according to any one of claims 1 to 35.
42. A nucleic acid comprising or consisting of a nucleotide sequence selected from SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7 and SEQ ID NO:8, or a fragment thereof.
43. A vector comprising a nucleic acid encoding a protein, or fragment or derivative thereof, according to any one of claims 1 to 35.
44. A host cell comprising a nucleic acid encoding a protein, or fragment or derivative thereof, according to any one of claims 1 to 35 or a vector comprising a nucleic acid encoding a protein, or fragment or derivative thereof, according to any one of claims 1 to 35.
45. A method of obtaining a protein, or fragment or derivative thereof, according to any one of claims 1 to 35, comprising culturing a host cell of claim 44, expressing the protein, or fragment or derivative thereof, in the host cell and purifying the protein or fragment or derivative thereof.
46. An assay kit comprising a protein according to any one of claims 1 to 35 or a protein complex according to any one of claims 36 to 40.
47. A therapeutic agent comprising a protein according to any one of claims 1 to 35 or a protein complex according to any one of claims 36 to 40.
48. A food supplement comprising a protein according to any one of claims 1 to 35 or a protein complex according to any one of claims 36 to 40.
49. A light harvesting complex comprising a protein according to any one of claims 1 to 35 or a protein complex according to any one of claims 36 to 40.
50. The use of a protein according to any one of claims 1 to 35 or a protein complex according to any one of claims 36 to 40 as a labelling agent.
51. The use of a protein according to any one of claims 1 to 35 or a protein complex according to any one of claims 36 to 40 as an imaging agent.
52. The use of a protein according to any one of claims 1 to 35 or a protein complex according to any one of claims 36 to 40 as a probe.
53. The use of a protein according to any one of claims 1 to 35 or a protein complex according to any one of claims 36 to 40 in the purification of a carotenoid and/or a chlorophyll.
54. The use of a protein according to any one of claims 1 to 35 or a protein complex according to any one of claims 36 to 40 as a carrier for a pharmaceutical agent.
55. The use of a protein according to any one of claims 1 to 35 or a protein complex according to any one of claims 36 to 40 as a biosensor.
PCT/GB2013/050754 2012-03-26 2013-03-22 Peridinin-chlorophyll binding proteins WO2013144595A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB1418608.4A GB2515698A (en) 2012-03-26 2013-03-22 Peridinin-chlorophyll binding proteins

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GBGB1205288.2A GB201205288D0 (en) 2012-03-26 2012-03-26 Peridinin-chlorophyll binding proteins
GB1205288.2 2012-03-26

Publications (1)

Publication Number Publication Date
WO2013144595A1 true WO2013144595A1 (en) 2013-10-03

Family

ID=46087129

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2013/050754 WO2013144595A1 (en) 2012-03-26 2013-03-22 Peridinin-chlorophyll binding proteins

Country Status (2)

Country Link
GB (2) GB201205288D0 (en)
WO (1) WO2013144595A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4876190A (en) 1987-10-21 1989-10-24 Becton Dickinson & Company Peridinin-chlorophyll complex as fluorescent label
US6133429A (en) 1997-10-03 2000-10-17 Becton Dickinson And Company Chromophores useful for the preparation of novel tandem conjugates

Non-Patent Citations (11)

* Cited by examiner, † Cited by third party
Title
CONNOLLY ET AL.: "Investigative Ophthalmology and Visual Science", October 2011
DATABASE UniProt [online] 1 October 1996 (1996-10-01), "RecName: Full=Peridinin-chlorophyll a-binding protein 1, chloroplastic; Short=PCP; Flags: Precursor;", XP002697496, retrieved from EBI accession no. UNIPROT:P80484 Database accession no. P80484 *
DATABASE UniProt [online] 22 February 2012 (2012-02-22), "SubName: Full=Chloroplast soluble peridinin-chlorophyll a-binding protein; Flags: Precursor;", XP002697495, retrieved from EBI accession no. UNIPROT:G9I8T0 Database accession no. G9I8T0 *
DAVID J MILLER ET AL: "Reconstitution of the Peridinin-chlorophyll a Protein (PCP): Evidence for Functional Flexibility in Chlorophyll Binding", PHOTOSYNTHESIS RESEARCH ; OFFICIAL JOURNAL OF THE INTERNATIONAL SOCIETY OF PHOTOSYNTHESIS RESEARCH, SPRINGER, BERLIN, DE, vol. 86, no. 1-2, 1 November 2005 (2005-11-01), pages 229 - 240, XP019264114, ISSN: 1573-5079, DOI: 10.1007/S11120-005-2067-1 *
E. HOFMANN ET AL: "Structural Basis of Light Harvesting by Carotenoids: Peridinin-Chlorophyll-Protein from Amphidinium carterae", SCIENCE, vol. 272, no. 5269, 21 June 1996 (1996-06-21), pages 1788 - 1791, XP055063409, ISSN: 0036-8075, DOI: 10.1126/science.272.5269.1788 *
GUEX, N.; PEITSCH, M.C., ELECTROPHORESIS, vol. 18, 1997, pages 2714 - 2723
KHAN ET AL., ANAL CHEM. 1, vol. 78, no. 9, 2006, pages 3072 - 9
LARKIN ET AL., BIOINFORMATICS, vol. 23, no. 21, 2007, pages 2947 - 2948
MILLER ET AL., PHOTOSYNTHESIS RESEARCH, vol. 86, 2005, pages 229 - 240
MILLER, PHOTOSYNTHESIS RESEARCH, vol. 86, 2005, pages 229 - 240
SCHULTE; JOHANNING ET AL., EUROPEAN JOURNAL OF CELL BIOLOGY MOLECULAR BIOLOGY OF COMPLEX FUNCTIONS OF BOTANICAL SYSTEMS, vol. 89, no. 12, 2010, pages 990 - 997

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017040813A3 (en) * 2015-09-02 2017-05-04 University Of Massachusetts Detection of gene loci with crispr arrayed repeats and/or polychromatic single guide ribonucleic acids
US11390908B2 (en) 2015-09-02 2022-07-19 University Of Massachusetts Detection of gene loci with CRISPR arrayed repeats and/or polychromatic single guide ribonucleic acids
US11124499B2 (en) 2018-09-13 2021-09-21 National Guard Health Affairs Pharmaceutical composition derived from Tecoma plant and a method for treating cancer

Also Published As

Publication number Publication date
GB201418608D0 (en) 2014-12-03
GB2515698A (en) 2014-12-31
GB201205288D0 (en) 2012-05-09

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13712898

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 1418608

Country of ref document: GB

Kind code of ref document: A

Free format text: PCT FILING DATE = 20130322

122 Ep: pct application non-entry in european phase

Ref document number: 13712898

Country of ref document: EP

Kind code of ref document: A1