WO2020242379A1 - Method of modifying a polypeptide - Google Patents

Method of modifying a polypeptide

Publication number
WO2020242379A1
Authority
WO
WIPO (PCT)
Prior art keywords
polypeptide
enzyme
rss
amino acid
sequence
Application number
PCT/SG2020/050303
Other languages
French (fr)
Inventor
Brandon Isamu MORINAKA
Thi Quynh Ngoc NGUYEN
Yi Wei TOOH
Original Assignee
National University Of Singapore
Application filed by National University Of Singapore
Publication of WO2020242379A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/52 Genes encoding for enzymes or proenzymes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/48 Hydrolases (3) acting on peptide bonds (3.4)
    • C12N9/50 Proteinases, e.g. Endopeptidases (3.4.21-3.4.25)
    • C12N9/52 Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from bacteria or Archaea
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P21/00 Preparation of peptides or proteins
    • C12P21/02 Preparation of peptides or proteins having a known sequence of two or more amino acids, e.g. glutathione
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/25 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving enzymes not classifiable in groups C12Q1/26 - C12Q1/66
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B30/00 Methods of screening libraries
    • C40B30/04 Methods of screening libraries by measuring the ability to specifically bind a target molecule, e.g. antibody-antigen binding, receptor-ligand binding
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00 Libraries per se, e.g. arrays, mixtures
    • C40B40/04 Libraries containing only organic compounds
    • C40B40/10 Libraries containing peptides or polypeptides, or derivatives thereof
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00 Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06 Biochemical methods, e.g. using enzymes or whole viable microorganisms
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/20 Fusion polypeptide containing a tag with affinity for a non-protein ligand
    • C07K2319/21 Fusion polypeptide containing a tag with affinity for a non-protein ligand containing a His-tag
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/35 Fusion polypeptide containing a fusion for enhanced stability/folding during expression, e.g. fusions with chaperones or thioredoxin

Definitions

  • the invention relates generally to the field of biotechnology.
  • the invention teaches a method of modifying a polypeptide with an rSAM/SPASM (rSS) enzyme.
  • rSS rSAM/SPASM
  • Polypeptides have a wide range of biomedical and industrial applications. Many polypeptides act as hormones, enzyme inhibitors, substrates, and neurotransmitters in the body, which has led to their increasing use as therapeutic agents for the treatment of diseases. Polypeptides are also often used as diagnostic tools or in drug delivery applications. The advantage of using polypeptides is that they are generally cheap and straightforward to produce by techniques such as chemical synthesis or recombinant expression. However, one major disadvantage of the use of polypeptides is their susceptibility to proteolytic degradation. For example, therapeutic polypeptides, which are often relatively unstructured, can be rapidly degraded in vivo, often with half-lives on the order of minutes.
  • the present disclosure teaches a method of modifying a polypeptide.
  • a method of modifying a polypeptide comprising the steps of: a) providing a polypeptide comprising a three residue motif X 1 -X 2 -X 3 , wherein X 1 is an aromatic amino acid and wherein X 2 and X 3 are each independently any amino acid; and b) contacting the polypeptide with a rSAM/SPASM (rSS) enzyme for a sufficient time and under conditions to modify the polypeptide to form a cyclophane group connecting X 1 and X 3 .
  • kits for modifying a polypeptide comprising a) an expression construct comprising a nucleic acid encoding a polypeptide comprising a three residue motif X 1 -X 2 -X 3 , wherein X 1 is an aromatic amino acid and wherein X 2 and X 3 are each independently any amino acid and b) an expression construct comprising a nucleic acid that encodes an rSAM/SPASM (rSS) enzyme.
  • a method of preparing a modified polypeptide library comprising: a) providing a polypeptide library comprising polypeptides having a three residue motif X 1 -X 2 -X 3 , wherein X 1 is an aromatic amino acid and wherein X 2 and X 3 are each independently any amino acid; and b) contacting the polypeptide library with a rSAM/SPASM (rSS) enzyme for a sufficient time and under conditions to form a cyclophane group connecting X 1 and X 3 so as to generate a modified polypeptide library.
  • a method of selecting a modified peptide capable of binding to a ligand comprising: a) providing a polypeptide library comprising polypeptides having a three residue motif X 1 -X 2 -X 3 , wherein X 1 is an aromatic amino acid and wherein X 2 and X 3 are each independently any amino acid; b) contacting the polypeptide library with a rSAM/SPASM (rSS) enzyme for a sufficient time and under conditions to form a cyclophane group connecting X 1 and X 3 so as to generate a modified polypeptide library; c) contacting the modified polypeptide library with the ligand; and d) selecting the modified polypeptides that bind to the ligand.
  • Fig 1. Maturase systems, biosynthetic gene clusters and strains of origin. Representative gene clusters and core peptide sequences (to the right of clusters) from: a, GenProp1090 Xye maturase system (xnc, ykc, and etc).
  • the A and B genes represent precursor peptide (XyeA, TIGR04495) and SPASM protein (XyeB, TIGR04496), respectively.
  • Core peptides are assigned the sequence C-terminal to GG motifs; b, GenProp1037 Gly-rich repeat (Grr) maturase system (osc, lsc, and gsc).
  • the A and B genes represent precursor peptide (GrrA, TIGR04260) and rSAM enzyme (GrrM, TIGR04261), respectively. Core peptides are defined at the C-terminus where the Gly-rich region commences; c, GenProp1068 Fxs maturase system (msc).
  • the A and B genes represent precursor peptide (FxsA, TIGR04268) and rSAM enzyme (FxsB, TIGR04269), respectively. The start of the core peptides is unknown, selected residues at the C-terminus are shown.
  • Modifications were detected by coexpression of precursor peptides (A) with cognate rSAM proteins (B), followed by Ni-affinity chromatography, digestion with trypsin, and analysis by LC-MS/MS, and are indicated by blue connectors. For all gene clusters shown, blue connectors indicate motifs where -2 Da modifications have been detected from tryptic digest fragments. Modifications were detected by coexpression of precursor peptide (A) with cognate SPASM protein (B), followed by Ni-affinity chromatography, digestion with trypsin, and analysis by LC-MS/MS.
  • the asterisk marks the location of the mutation. c, Key 2D NMR correlations for residues -1 to +4 of fragment 3 (top) and residues +4 to +11 of fragment 3 (bottom). d, Conformational analysis and NOE correlations for WIN (left), FGN (center), and WER (right) cyclophanes. Coupling constants are indicated for Asn3, Asn7, and Arg10.
  • Fig 3. Detection of activity and characterization of modifications by OscB.
  • a, In vivo coexpression of NHis6-OscA2 + OscB followed by Ni-affinity purification, trypsin digest, and LC-MS to detect fragments 4-7.
  • b, Extracted ion chromatogram (EIC, left) and corresponding mass spectra (right) to detect unmodified 4 (GGGGSWGNGGSWR (SEQ ID NO: 1)) and modified 6 for the coexpression shown in a.
  • c, EIC (left) and corresponding mass spectra (right) showing unmodified 5 (FINSR) and modified 7 for the coexpression shown in a.
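The -2 Da shift used to detect modified tryptic fragments can be checked numerically. The sketch below computes the monoisotopic mass of the unmodified fragment FINSR and the mass expected after one crosslink. It assumes the -2 Da modification corresponds to the loss of two hydrogen atoms, which the text does not state explicitly; the function names are hypothetical and the residue masses are standard monoisotopic values.

```python
# Standard monoisotopic residue masses (Da) for the 20 proteinogenic amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565     # H2O added for the free N- and C-termini
PROTON = 1.007276     # mass of a proton, used for [M+nH]n+ ions
HYDROGEN = 1.007825   # mass of a hydrogen atom


def peptide_mass(seq: str) -> float:
    """Neutral monoisotopic mass of a linear, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def crosslinked_mass(seq: str, n_crosslinks: int = 1) -> float:
    """Mass after n crosslinks, ASSUMING each one removes two hydrogen atoms."""
    return peptide_mass(seq) - 2 * HYDROGEN * n_crosslinks


def mz(mass: float, charge: int = 1) -> float:
    """m/z of the protonated ion at the given positive charge state."""
    return (mass + charge * PROTON) / charge


if __name__ == "__main__":
    unmodified = peptide_mass("FINSR")
    modified = crosslinked_mass("FINSR")
    print(f"unmodified [M+H]+ = {mz(unmodified):.4f}")
    print(f"modified   [M+H]+ = {mz(modified):.4f}")
    print(f"shift             = {modified - unmodified:.4f} Da")
```

For a doubly charged ion the same shift appears as roughly -1 in m/z, which is worth keeping in mind when comparing extracted ion chromatograms.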
  • Fig 4. Protein sequence similarity network for selected SPASM protein families
  • Thioether bond formation is represented by sporulation killing factor (Skf) maturases (IPR030915, TIGR04403, SkfB is a characterized member), quinohemoprotein (Qhp) maturases (IPR023886, TIGR03906, QhpD is a characterized member), and six cysteines in forty-five residues (SCIFF) maturases (IPR024025, TIGR03974, CteB and Tte1186 are characterized members); Tyramine excision is represented by spliceases (N113), one subfamily annotated as Nif11-class peptide radical SAM maturase 3 (IPR026482, TIGR04103, PlpX is a characterized member); Tyrosine decarboxylation by mycofactocin (Myc) maturases
  • the predominant form of 24 after acid hydrolysis is the hydrochloride salt. c, Detection of activity for NHis6-MscA + MscB-375.
  • EICs are presented for trypsin digests of NHis6-MscA (top) and NHis6-MscA + MscB-375 (bottom).
  • Corresponding mass spectra for 10 and 11 are shown on the right. d, Reaction scheme for the synthesis of standards 17, 18, 19, 22, and 23.
  • e, HPLC chromatograms for standards (17-19) and comparison to degradation fragment 13.
  • f, 1H NMR spectra for synthetic standards (22 and 23) and comparison to degradation fragment 24.
  • the method as disclosed herein may comprise a) providing a polypeptide comprising one or more three-residue motifs X 1 -X 2 -X 3 , wherein X 1 is an aromatic amino acid and wherein X 2 and X 3 are each independently any amino acid.
  • the method may also comprise b) contacting the polypeptide with one or more rSAM/SPASM (rSS) enzyme(s) for a sufficient time and under conditions to modify the polypeptide to form one or more cyclophane groups, each connecting X 1 and X 3 within a three-residue motif.
  • the rSAM/SPASM (rSS) enzyme can recognise a 3-residue motif and form a cyclophane group that leads to restricted rotation of the aromatic ring and induces planar chirality in the asymmetric indole bridge.
  • the 3-residue motif can have variants that are diverse in sequence and that are recognised by the rSS enzyme.
  • the rSS enzyme can be used to catalyze formation of multiple cyclophane groups within a polypeptide containing 3-residue motifs.
  • Each cyclophane group is conformationally rigid and compact which may increase the proteolytic stability of the polypeptide.
  • Each cyclophane group also forms a unique three-dimensional scaffold that can have wide-ranging applications, such as for binding to new drug targets.
  • the method may also be used to alter the binding properties of a therapeutic polypeptide to a target.
  • the method as defined herein may also be used to increase the in vitro stability of a polypeptide or enzyme for applications outside an animal body, such as for use as pesticide, food preservative or research tool kit.
  • the polypeptide as referred to herein may comprise or further comprise a polypeptide of interest.
  • the polypeptide of interest may, for example, be insulin, growth hormones, clotting factors such as factor VIII and factor IX, thrombin, hemopoietic growth factor, viral antigens, erythropoietin, enzyme inhibitors, substrates, or neurotransmitters.
  • the methods as referred to herein may be used to reduce the susceptibility of the polypeptide to protease degradation.
  • the method comprises engineering an X 1 -X 2 -X 3 motif within a polypeptide of interest. In one embodiment, the method comprises engineering an X 1 -X 2 -X 3 motif at the amino and/or carboxy terminus of a polypeptide of interest which prevents degradation by exopeptidase enzymes.
  • a method of increasing the in vitro or in vivo stability of a polypeptide comprising the steps of: a) providing a polypeptide comprising a three residue motif X 1 -X 2 -X 3 , wherein X 1 is an aromatic amino acid and wherein X 2 and X 3 are each independently any amino acid; and b) contacting the polypeptide with a rSAM/SPASM (rSS) enzyme for a sufficient time and under conditions to modify the polypeptide to form a cyclophane group connecting X 1 and X 3 .
  • Statine-like isosteres, hydroxyethylene isosteres, reduced amide bond isosteres, thioamide isosteres, urea isosteres, carbamate isosteres, thioether isosteres, vinyl isosteres and other amide bond isosteres known to the art are also included.
  • Amino acid substitutions falling within the scope of the invention are, in general, accomplished by selecting substitutions that do not differ significantly in their effect on maintaining (a) the structure of the peptide backbone in the area of the substitution, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of the side chain. After the substitutions are introduced, the variants are screened for biological activity.
  • the modification may, for example, result in crosslinking of the three residue motif, which includes W, F, Y or H, to form indole- or phenyl-bridged cyclophanes.
  • the modified polypeptide may, for example, display restricted rotation of the aromatic ring and induce planar chirality in the asymmetric indole bridge.
  • X 1 may be an aromatic amino acid.
  • the term “aromatic amino acid” may refer to an amino acid with an aromatic ring.
  • the aromatic amino acid may be a naturally-occurring or non-natural aromatic amino acid. It can also be an analogue or a mimetic of a naturally-occurring or non-natural aromatic amino acid.
  • X 1 is W, F, Y or H.
  • X 2 and X 3 may each independently be any amino acid.
  • X 2 is I, G, E, Y, V, L, A, D, S, T, N or Q.
  • X 3 may be a non-aromatic amino acid.
  • X 3 is an amino acid that is not W, F, Y or H.
  • X 3 is N, R, S, D or K.
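The residue sets listed above lend themselves to a simple sequence scan. The following is a minimal sketch, not part of the disclosed method, that locates candidate X1-X2-X3 motifs in a peptide using either the broad definition (X1 is W, F, Y or H; X2 and X3 are any residue) or the narrower listed sets. The function and constant names are hypothetical.

```python
import re

# Broad definition: aromatic X1 followed by any two residues.
# The lookahead reports overlapping candidates as well.
BROAD_MOTIF = re.compile(r"(?=([WFYH]..))")

# Narrower embodiment: X2 in {I,G,E,Y,V,L,A,D,S,T,N,Q}, X3 in {N,R,S,D,K}.
NARROW_MOTIF = re.compile(r"(?=([WFYH][IGEYVLADSTNQ][NRSDK]))")


def find_motifs(seq: str, pattern: re.Pattern = NARROW_MOTIF):
    """Return (0-based start index, motif) for every candidate motif in seq."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    # Core sequence taken from SEQ ID NO: 15.
    for start, motif in find_motifs("WINAFGNWER"):
        print(start, motif)
```

Run on WINAFGNWER, the narrow pattern reports WIN, FGN and WER, the three motifs named in the cyclophane conformational analysis. A sequence hit only indicates a candidate site; whether a given rSS enzyme modifies it has to be established experimentally.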
  • the method may be performed under anaerobic or oxygen-free conditions.
  • the rSS enzyme may be an rSS enzyme of the Xenorhabdus, Yersinia and Erwinia (XYE) maturase system (XyeB, TIGR04496, IPR030989), Glycine-rich repeat (Grr) maturase system (GrrM, TIGR04261, IPR026357) or the Fxs maturase system (FxsB, TIGR04269, IPR026335).
  • the rSS enzyme may also be an enzymatically active fragment of an rSS enzyme of the Xenorhabdus, Yersinia and Erwinia (XYE) maturase system (XyeB, TIGR04496, IPR030989), Glycine-rich repeat (Grr) maturase system (GrrM, TIGR04261, IPR026357) or the Fxs maturase system (FxsB, TIGR04269, IPR026335).
  • XYE Xenorhabdus, Yersinia and Erwinia
  • Grr Glycine-rich repeat maturase system
  • FxsB FxsB, TIGR04269, IPR026335
  • the rSS enzyme is a C-terminal truncated
  • the rSS enzyme or enzymatically active fragment has two Cys-rich domains that are critical or essential for activity.
  • the two Cys-rich domains may include the rSAM binding domain in the N-terminus (CXXXCXXC (SEQ ID NO: 10)) and the SPASM domain in the C-terminus (CXXXCXXXXC (SEQ ID NO: 11) or CXXCXXXXC (SEQ ID NO: 12), where X may be any amino acid).
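As an illustration of how the two Cys-rich signatures could be located in a candidate enzyme sequence, the sketch below translates SEQ ID NOs: 10 to 12 into regular expressions. The function name and the toy sequence are hypothetical, and a pattern hit is only a first-pass indicator, not evidence of a functional domain.

```python
import re

# CXXXCXXC (SEQ ID NO: 10): rSAM binding motif.
RSAM_MOTIF = re.compile(r"(?=(C.{3}C.{2}C))")
# CXXXCXXXXC (SEQ ID NO: 11) or CXXCXXXXC (SEQ ID NO: 12): SPASM motif.
SPASM_MOTIF = re.compile(r"(?=(C.{3}C.{4}C|C.{2}C.{4}C))")


def cys_rich_motifs(seq: str) -> dict:
    """Map each motif class to the 0-based start positions of its hits."""
    return {
        "rSAM": [m.start() for m in RSAM_MOTIF.finditer(seq)],
        "SPASM": [m.start() for m in SPASM_MOTIF.finditer(seq)],
    }


if __name__ == "__main__":
    # Hypothetical toy sequence, NOT a real enzyme: one rSAM-type motif
    # near the N-terminus and one SPASM-type motif near the C-terminus.
    toy = "MACAAACAACGGGGGCAAACAAAACK"
    print(cys_rich_motifs(toy))
```

Because truncated enzymes are contemplated, a check of this kind could be used to confirm that a fragment still contains both motifs before it is tested for activity.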
  • domain refers to a part of a molecule or structure that shares common physicochemical features, such as, but not limited to, hydrophobic, polar, globular and helical domains or properties such as ligand-binding, membrane fusion, signal transduction, cell penetration and the like. Often, a domain has a folded protein structure which has the ability to retain its tertiary structure independently of the rest of the protein. Generally, domains are responsible for discrete functional properties of proteins, and in many cases may be added, removed or transferred to other proteins without loss of function of the remainder of the protein and/or of the domain. Domains may be co-extensive with regions or portions thereof; domains may also include distinct, non-contiguous regions of a molecule.
  • recombinant when used with reference to, e.g., polypeptide, enzyme, nucleic acid or cell refers to a material, or a material corresponding to the natural or native form of the material, that has been modified in a manner that would not otherwise exist in nature, or is identical thereto but produced or derived from synthetic materials and/or by manipulation using recombinant techniques.
  • Non-limiting examples include, among others, recombinant cells expressing genes that are not found within the native (non-recombinant) form of the cell or express native genes that are otherwise expressed at a different level.
  • isolated polypeptide refers to a polypeptide which is substantially separated from other contaminants that naturally accompany it, e.g., protein, lipids, and polynucleotides.
  • the term embraces polypeptides which have been removed or purified from their naturally-occurring environment or expression system (e.g., host cell or in vitro synthesis).
  • the improved ketoreductase enzymes may be present within a cell, present in the cellular medium, or prepared in various forms, such as lysates or isolated preparations. As such, in some embodiments, the improved ketoreductase enzyme can be an isolated polypeptide.
  • the method may comprise co-expressing the polypeptide and the rSS enzyme in a host cell such that the polypeptide contacts the rSS enzyme for a sufficient time and under conditions to modify the polypeptide in the host cell.
  • the terms “host”, “host cell”, “host cell line” and “host cell culture” are used interchangeably and refer to cells into which exogenous nucleic acid has been introduced, including the progeny of such cells.
  • Host cells include “transformants” and “transformed cells”, which include the primary transformed cell and progeny derived therefrom without regard to the number of passages. Progeny may not be completely identical in nucleic acid content to a parent cell, but may contain mutations.
  • a host cell is any type of cellular system that can be used to produce a modified polypeptide of the present invention.
  • Host cells include cultured cells, e.g., mammalian cultured cells, such as CHO cells, BHK cells, NS0 cells, SP2/0 cells, YO myeloma cells, P3X63 mouse myeloma cells, PER cells, PER.C6 cells or hybridoma cells, yeast cells, insect cells, and plant cells, to name only a few, but also cells comprised within a transgenic animal, transgenic plant or cultured plant or animal tissue.
  • the polypeptide comprises WX 4 X 5 X 6 X 7 X 8 X 9 X 10 X 11 X 12 (SEQ ID NO: 44), wherein X 4 is I, D, L or V, wherein X 5 is N, K or R, wherein X 6 is A, F or V, wherein X 7 is F or Y, wherein X 8 is A, G, L, S or V, wherein X 9 is N, K or R, wherein X 10 is W or F, wherein X 11 is D, E, G, N, P, S or T and wherein X 12 is K or R.
  • the polypeptide comprises WIX 4 AFX 5 NWX 6 X 7 (SEQ ID NO: 13), wherein X 4 is N or K, wherein X 5 is G or A, wherein X 6 is E, S or T and wherein X 7 is R or K.
  • the polypeptide may comprise a sequence having at least 80% identity to a sequence of: ELVDSLLDTVSX 13 GWINAFGNWERAFH (SEQ ID NO: 14), wherein X 13 is G or K.
  • the enzyme is an enzyme from the XYE maturase system.
  • the enzyme may be an XyeB SPASM protein (e.g.
  • the polypeptide may be a polypeptide having at least 80% identity to an XyeA precursor peptide (e.g. xncA, ykcA and etcA), including an XyeA precursor peptide that is listed in Table 3.
  • the polypeptide comprises WIX 4 AFX 5 NWX 6 X 7 (SEQ ID NO: 13), wherein X 4 is N or K, wherein X 5 is G or A, wherein X 6 is E, S or T and wherein X 7 is R or K.
  • the polypeptide may comprise WINAFGNWER (SEQ ID NO: 15), WIKAFGNWSR (SEQ ID NO: 16), WINAFANWTK (SEQ ID NO: 17), WINAFGNWERAFH (SEQ ID NO: 18), AGWIKAFGNWSRSF (SEQ ID NO: 19) or WINAFANWTKRI (SEQ ID NO: 20).
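The degenerate consensus of SEQ ID NO: 44 described in this section can be written as a regular expression, which makes it easy to check whether a given core sequence falls within it. This is a minimal sketch with hypothetical names; it tests sequence membership only and says nothing about enzymatic recognition.

```python
import re

# W X4 X5 X6 X7 X8 X9 X10 X11 X12, with the allowed residues at each position
# taken from the SEQ ID NO: 44 definition.
SEQ_ID_44 = re.compile(
    r"W"           # fixed tryptophan
    r"[IDLV]"      # X4
    r"[NKR]"       # X5
    r"[AFV]"       # X6
    r"[FY]"        # X7
    r"[AGLSV]"     # X8
    r"[NKR]"       # X9
    r"[WF]"        # X10
    r"[DEGNPST]"   # X11
    r"[KR]"        # X12
)


def contains_consensus(peptide: str) -> bool:
    """True if the peptide contains a stretch matching SEQ ID NO: 44."""
    return SEQ_ID_44.search(peptide) is not None


if __name__ == "__main__":
    for core in ("WINAFGNWER", "WIKAFGNWSR", "WINAFANWTK", "IPAAKFSSFI"):
        print(core, contains_consensus(core))
```

The three XYE-type cores (SEQ ID NOs: 15 to 17) match the consensus, while the Fxs-type sequence IPAAKFSSFI (SEQ ID NO: 24) does not, as expected for a peptide from a different maturase system.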
  • the enzyme is an enzyme from the GRR maturase system.
  • the enzyme may be a GrrM SPASM protein (e.g. oscB, lscB or gscB) or an enzymatically active fragment of the enzyme.
  • the enzyme may, for example, act on a peptide having at least 80% identity to a GrrA precursor peptide (e.g. oscA, lscA and gscA), including a GrrA precursor peptide that is listed in Table 4.
  • the polypeptide may comprise
  • the enzyme is an enzyme from the FXS maturase system.
  • the enzyme may be an FxsB SPASM protein (e.g. mscB) or an enzymatically active fragment of the enzyme.
  • the enzyme may, for example, act on a peptide having at least 80% identity to an FxsA precursor peptide (e.g. mscA), including an FxsA precursor peptide that is listed in Table 5.
  • the polypeptide may comprise IPAAKFSSFI (SEQ ID NO: 24).
  • “Percentage of sequence identity” and “percentage identity” are used interchangeably herein to refer to comparisons among polynucleotides and polypeptides, and are determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide or polypeptide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences.
  • the percentage may be calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
  • the percentage may be calculated by determining the number of positions at which either the identical nucleic acid base or amino acid residue occurs in both sequences or a nucleic acid base or amino acid residue is aligned with a gap to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
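Both ways of counting matched positions described above can be expressed in a few lines. The sketch below assumes the two sequences have already been aligned to equal length, with "-" marking gaps; the alignment step itself is out of scope, and the function name and flag are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str,
                     count_gaps_as_matches: bool = False) -> float:
    """Percentage identity over an alignment window.

    Matched positions are those with the identical residue in both sequences.
    If count_gaps_as_matches is True, positions where a residue is aligned
    with a gap are also counted, as in the second calculation described.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")

    matched = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == y and x != "-":
            matched += 1                      # identical residue in both
        elif count_gaps_as_matches and (x == "-") != (y == "-"):
            matched += 1                      # residue aligned with a gap
    return 100.0 * matched / len(aligned_a)


if __name__ == "__main__":
    # SEQ ID NO: 15 versus SEQ ID NO: 16, already the same length.
    print(percent_identity("WINAFGNWER", "WIKAFGNWSR"))
```

For the two 10-residue cores shown, eight positions are identical, so the result sits exactly at the 80% threshold recited for sequence variants.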
  • Those of skill in the art appreciate that there are many established algorithms available to align two sequences.
  • Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith and Waterman, 1981, Adv. Appl. Math. 2:482, by the homology alignment algorithm of Needleman and Wunsch, 1970, J. Mol. Biol. 48:443, by the search for similarity method of Pearson and Lipman, 1988, Proc. Natl. Acad. Sci. USA 85:2444, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the GCG Wisconsin Software Package), or by visual inspection (see generally, Current Protocols in Molecular Biology, F. M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc.
  • HSPs high scoring sequence pairs
  • T is referred to as the neighborhood word score threshold (Altschul et al, supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score.
  • Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached.
  • the BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment.
  • the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff, 1989, Proc Natl Acad Sci USA 89:10915).
  • Exemplary determination of sequence alignment and % sequence identity can employ the BESTFIT or GAP programs in the GCG Wisconsin Software package (Accelrys, Madison Wis.), using default parameters provided.
  • the polypeptide may comprise a peptide sequence comprising at least 1 or more, 2 or more, 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 11 or more, 12 or more, 13 or more, 14 or more, 15 or more, 16 or more, 17 or more, 18 or more, 19 or more, 20 or more amino acids positioned upstream of the X 1 -X 2 -X 3 motif, wherein X 1 is an aromatic amino acid and wherein X 2 and X 3 are each independently any amino acid.
  • the amino acids may be amino acids.
  • the polypeptide may comprise a peptide sequence of ELVDSLLDTVSGG (SEQ ID NO: 42) positioned at the N terminus of the X 1 -X 2 -X 3 motif.
  • the polypeptide may comprise a leader sequence having at least 80% identity to a sequence of: MSKLQREIAANKAQLSHEDKKKTQHK (SEQ ID NO: 25).
  • the polypeptide may comprise a leader sequence having at least 80% sequence identity to a sequence of:
  • the polypeptide may comprise a leader sequence having at least 80% sequence identity to a sequence of:
  • the polypeptide may comprise an affinity tag (such as a hexa-histidine sequence) and/or a solubility tag (such as SUMO).
  • the methods of the present invention can be used to modify ligands of receptors including, for example, TNF superfamily members, cytokine superfamily members, growth factors, chemokine superfamily members, pro-angiogenic factors, pro-apoptotic factors, integrins, hormones and other soluble factors, among others, including RANK-L, Lymphotoxin (LT)-α, LT-β, LT-α1β2, zLIGHT, BTLA.
  • Nucleic Acids Res 33 Database Issue:D169-173) or any mimetic or analog thereof.
  • the methods of the invention can be used to stabilise enzymes such as, for example, angiotensin converting enzymes (ACE), matrix metalloproteases, ADAM metalloproteases with thrombospondin type I motif (ADAMTS1, 4, 5, 13), aminopeptidases, beta-site APP-cleaving enzymes (BACE-1 and -2), chymase, kallikreins, reelin, serpins, or any mimetic or analog thereof.
  • the methods of the invention can be used to stabilize chemotherapeutic agents, and toxins such as small molecule toxins or enzymatically active toxins of bacterial, fungal, plant or animal origin, or fragments thereof to increase potency of targeted compounds for therapeutic purposes, such as for example calicheamicin, pseudomonas exotoxin, diphtheria toxin, ricin, saporin, apoptosis-inducing peptides or any analog thereof.
  • the methods of the invention can be used to stabilise antigens for cancer vaccines such as, for example, the colorectal cancer antigen A33, α-fetoprotein, mucin 1 (MUC1), CDCP1, carcinoembryonic antigen cell adhesion molecules, Her-2, 3 and 4, mesothelin, CDCP1, NETO-1, NETO-2, syndecans, LewisY, CA-125, melanoma associated antigen (MAGE), tyrosinase, epithelial tumor antigen (ETA), among others, as well as for fusing viral envelope antigens or fungal antigens for treatment of infectious diseases.
  • compositions comprising a modified polypeptide as disclosed herein.
  • a pharmaceutical composition comprising a modified polypeptide as defined herein.
  • the pharmaceutical composition may comprise a pharmaceutically acceptable carrier.
  • pharmaceutically acceptable carrier is meant a pharmaceutical vehicle comprised of a material that is not biologically or otherwise undesirable, i.e., the material may be administered to a subject along with the selected active agent without causing any or a substantial adverse reaction.
  • Carriers may include excipients and other additives such as diluents, detergents, coloring agents, wetting or emulsifying agents, pH buffering agents, preservatives, and the like.
  • Representative pharmaceutically acceptable carriers include any and all solvents, dispersion media, coatings, surfactants, antioxidants, preservatives (e.g., antibacterial agents, antifungal agents), isotonic agents, absorption delaying agents, salts, preservatives, drugs, drug stabilizers, gels, binders, excipients, disintegration agents, lubricants, sweetening agents, flavoring agents, dyes, such like materials and combinations thereof, as would be known to one of ordinary skill in the art (see, for example, Remington's Pharmaceutical Sciences, 18th Ed. Mack Printing Company, 1990, pp. 1289-1329, incorporated herein by reference).
  • Except insofar as any conventional carrier is incompatible with the active ingredient(s), its use in the pharmaceutical compositions is contemplated.
  • in one embodiment, there is provided a method of treating a disease in a subject, comprising administering a modified polypeptide as defined herein to the subject.
  • a modified polypeptide as defined herein for use in treating a disease.
  • the use of the modified polypeptide in the manufacture of a medicament for the treatment of a disease in a subject. The disease may, for example, be cancer, diabetes or an infectious disease.
  • treating may refer to (1) preventing or delaying the appearance of one or more symptoms of the disorder; (2) inhibiting the development of the disorder or one or more symptoms of the disorder; (3) relieving the disorder, i.e., causing regression of the disorder or at least one or more symptoms of the disorder; and/or (4) causing a decrease in the severity of one or more symptoms of the disorder.
  • subject as used throughout the specification is to be understood to mean a human or may be a domestic or companion animal.
  • while the methods of the invention are for treatment of humans, they are also applicable to veterinary treatments, including treatment of companion animals such as dogs and cats, and domestic animals such as horses, cattle and sheep, or zoo animals such as primates, felids, canids, bovids, and ungulates.
  • The “subject” may include a person, a patient or individual, and may be of any age or gender.
  • the term “administering” refers to contacting, applying, injecting, transfusing or providing a composition of the present invention to a subject.
  • kits for modifying a polypeptide comprising a) an expression construct comprising a nucleic acid encoding a polypeptide comprising a three residue motif X1-X2-X3, wherein X1 is an aromatic amino acid and wherein X2 and X3 are each independently any amino acid and b) an expression construct comprising a nucleic acid that encodes an rSAM/SPASM (rSS) enzyme.
  • the term “encode” refers to the capacity of a nucleic acid to provide for another nucleic acid or a polypeptide.
  • a nucleic acid sequence is said to “encode” a polypeptide if it can be transcribed and/or translated to produce the polypeptide or if it can be processed into a form that can be transcribed and/or translated to produce the polypeptide.
  • Such a nucleic acid sequence may include a coding sequence or both a coding sequence and a non-coding sequence.
  • the terms “encode”, “encoding” and the like include an RNA product resulting from transcription of a DNA molecule, a protein resulting from translation of an RNA molecule, a protein resulting from transcription of a DNA molecule to form an RNA product and the subsequent translation of the RNA product, or a protein resulting from transcription of a DNA molecule to provide an RNA product, processing of the RNA product to provide a processed RNA product (e.g., mRNA) and the subsequent translation of the processed RNA product.
  • the term “construct” refers to a recombinant genetic molecule including one or more isolated nucleic acid sequences from different sources.
  • constructs are chimeric molecules in which two or more nucleic acid sequences of different origin are assembled into a single nucleic acid molecule and include any construct that contains (1) nucleic acid sequences, including regulatory and coding sequences that are not found together in nature (i.e., at least one of the nucleotide sequences is heterologous with respect to at least one of its other nucleotide sequences), or (2) sequences encoding parts of functional RNA molecules or proteins not naturally adjoined, or (3) parts of promoters that are not naturally adjoined.
  • constructs include any recombinant nucleic acid molecule such as a plasmid, cosmid, virus, autonomously replicating polynucleotide molecule, phage, or linear or circular single stranded or double stranded DNA or RNA nucleic acid molecule, derived from any source, capable of genomic integration or autonomous replication, comprising a nucleic acid molecule where one or more nucleic acid molecules have been operably linked.
  • Constructs of the present invention will generally include the necessary elements to direct expression of a nucleic acid sequence of interest that is also contained in the construct, such as, for example, a target nucleic acid sequence or a modulator nucleic acid sequence.
  • Such elements may include control elements such as a promoter that is operably linked to (so as to direct transcription of) the nucleic acid sequence of interest, and often include a polyadenylation sequence as well.
  • the construct may be contained within a vector.
  • the vector may include, for example, one or more selectable markers, one or more origins of replication, such as prokaryotic and eukaryotic origins, at least one multiple cloning site, and/or elements to facilitate stable integration of the construct into the genome of a host cell.
  • Two or more constructs can be contained within a single nucleic acid molecule, such as a single vector, or can be contained within two or more separate nucleic acid molecules, such as two or more separate vectors.
  • An “expression construct” generally includes at least a control sequence operably linked to a nucleotide sequence of interest.
  • promoters in operable connection with the nucleotide sequences to be expressed are provided in expression constructs for expression in an organism or part thereof including a host cell.
  • conventional compositions and methods for preparing and using constructs and host cells are well known to one skilled in the art, see for example, Molecular Cloning: A Laboratory Manual, 3rd edition Volumes 1, 2, and 3. J. F. Sambrook, D.
  • by “control element” or “control sequence” is meant nucleic acid sequences (e.g., DNA) necessary for expression of an operably linked coding sequence in a particular host cell.
  • control sequences that are suitable for prokaryotic cells, for example, include a promoter, and optionally a cis-acting sequence such as an operator sequence and a ribosome binding site.
  • Control sequences that are suitable for eukaryotic cells include transcriptional control sequences such as promoters, polyadenylation signals, transcriptional enhancers, translational control sequences such as translational enhancers and internal ribosome binding sites (IRES), nucleic acid sequences that modulate mRNA stability, as well as targeting sequences that target a product encoded by a transcribed polynucleotide to an intracellular compartment within a cell or to the extracellular environment.
  • a method of preparing a modified polypeptide library comprising: a) providing a polypeptide library comprising polypeptides having a three residue motif X1-X2-X3, wherein X1 is an aromatic amino acid and wherein X2 and X3 are each independently any amino acid; b) contacting the polypeptide library with a rSAM/SPASM (rSS) enzyme for a sufficient time and under conditions to form a cyclophane group connecting X1 and X3 so as to generate a modified polypeptide library.
  • the method may comprise expressing a polypeptide library in a host cell.
  • the method may comprise co-expressing an rSS enzyme with the polypeptide library in the host cell.
  • the polypeptides may be displayed in a polypeptide display system (such as phage or yeast display system).
  • the method may comprise contacting the displayed polypeptides with an rSS enzyme.
  • a method of selecting a modified peptide capable of binding to a ligand comprising: a) providing a polypeptide library comprising polypeptides having a three residue motif X1-X2-X3, wherein X1 is an aromatic amino acid and wherein X2 and X3 are each independently any amino acid; b) contacting the polypeptide library with a rSAM/SPASM (rSS) enzyme for a sufficient time and under conditions to form a cyclophane group connecting X1 and X3 so as to generate a modified polypeptide library; c) contacting the modified polypeptide library with the ligand; and d) selecting the modified polypeptides that bind to the ligand.
  • coli, pACYCDuet-1 and pRSFDuet-1 were purchased from Life Technologies (USA). Antibiotics (chloramphenicol for pACYCDuet-1 and kanamycin for pRSFDuet-1) were used at a concentration of 25 µg/mL in solid (LB agar) and liquid medium (LB and TB medium). Electroporation was carried out using mode Ec2 (2.5 kV) on a Bio-Rad (USA) MicroPulser Electroporator. Escherichia coli BL21(DE3) purchased from NEB (USA) were used for protein expression. Ultra Yield™ flasks (500 mL or 2.5 L) from Thomson Instrument Company (USA) were used for protein expression.
  • HisPur Ni-NTA resin was purchased from Thermo Scientific (USA). For desalting of proteins, GE Healthcare (USA) PD Minitrap G-25 columns were used. For tryptic digests a Phenomenex (USA), Kinetex XB-C18, 2.6 µm, 150 x 4.6 mm column was used. For semi-preparative HPLC a Phenomenex (USA), Kinetex XB-C18, 5 µm, 250 x 10 mm column was used. E. coli cells were lysed using a Fisherbrand Model 505 Sonic Dismembrator fitted with a FB4420 1/4” Microtip, FB4418 1/8” Microtip, or FB4219 1/2” solid probe.
  • LC-MS experiments were performed on a Waters Acquity UPLC System coupled to a Waters Micromass Q-Tof Premier Mass Spectrometer (USA).
  • HPLC grade solvents water + 0.1% formic acid, water + 0.5% formic acid, acetonitrile +0.1% formic acid, or 1:1 acetonitrile/isopropanol + 0.5% formic acid
  • NMR spectra were acquired using a Bruker (USA) 400 MHz Avance III or 600 MHz Avance III with a cryoprobe operating at 298 K. NMR solvents were purchased from Cambridge Isotope Labs (USA). Insert gene sequences.
  • coli BL21(DE3) with plasmids.
  • the plasmids were dissolved in MilliQ grade water to a final concentration of 25 ng/µL.
  • 1 µL plasmid DNA from precursor (NHis-SUMO-XncA or NHis-SUMO-XncA-G(-2)K) was added to 70 µL E. coli BL21(DE3) electrocompetent cells.
  • the E. coli BL21(DE3) cells were transformed in 2 mm electroporation cuvettes using the settings described above.
  • plasmid DNA from precursor and 1 µL plasmid DNA from rSAM enzyme were added to 70 µL E. coli BL21(DE3) electrocompetent cells.
  • the transformed cells were then grown overnight at 37 °C on lysogeny broth (LB) agar supplemented with appropriate antibiotics (chloramphenicol for cells harboring the precursor plasmids only and kanamycin plus chloramphenicol for cells harboring both the precursor and rSAM enzyme plasmids) at a final concentration of 25 µg/mL for each antibiotic added. Protein expression and purification.
  • a 50 ml falcon tube containing 10 ml LB medium supplemented with appropriate antibiotics was inoculated with a colony from the transformation above.
  • the 10 ml culture was grown overnight at 37 °C at 200 rpm.
  • the overnight culture was used to inoculate either 200 mL TB medium in a 500 mL Ultra Yield™ flask or 1 L TB medium in a 2.5 L Ultra Yield™ flask in ratio 1:100 (v:v) containing appropriate antibiotics.
  • the cells were then grown at 37 °C, 200 rpm until OD600 reached 1.6-2.5. The culture was placed on ice for 30 min then induced by addition of IPTG at a concentration of 0.8 mM.
  • Elution fractions were desalted into 50 mM Tris buffer pH 8.0 using PD Minitrap G-10 column, and then digested with trypsin (1:100, precursor/trypsin w:w) for 16 h.
  • Semi-preparative HPLC. The trypsin digested peptides were freeze dried and then resuspended in DI water and subjected to reversed phase semi-preparative HPLC using a flow rate of 5 ml/min, column temperature of 60 °C, and a linear gradient from 100% water + 0.1% trifluoroacetic acid to 27.5% isopropanol/acetonitrile (1:1) + 0.1% trifluoroacetic acid/water + 0.1% trifluoroacetic acid over 17 minutes.
  • Example 1 The invention is based upon an efficient protocol for interrogating rSAM sequence-function space in the TIGRFAM database for suitable targets.
  • the inventors were interested by a number of uncharacterized rSS proteins annotated as putative maturases within the TIGRFAM database.
  • Several of these proteins have been assigned as part of a maturase system that is minimally composed of a substrate precursor (A) and rSS protein (B). They initially focused on a single maturase system based on the following criteria.
  • the precursor sequences do not contain known motifs for previously characterized rSS maturases, and the predicted core peptides are void of Cys residues, which are characteristic of sactipeptides. Based on the diversity of reactions catalyzed by rSS proteins, it is not possible to predict the transformation, and the inventors turned toward functional studies in a heterologous host, Escherichia coli. Identification of rSAM maturases encoding novel posttranslational modifications. Spliceases and cyclophane forming enzymes from the rSAM superfamily belong to a subfamily known as SPASM domain containing proteins (referred to as SPASM proteins).
  • rSAM binding domain responsible for homolytic cleavage of S-adenosylmethionine, they contain either one or two C-terminal Cys-rich domain(s) (PF13186, TIGR04085) which bind auxiliary [4Fe-4S] cluster(s).
  • SPASM proteins were assigned as part of maturase systems (biosynthetic gene cluster) in the TIGRFAM database. These gene clusters encode a substrate precursor peptide (designated as A) which is proposed to be modified by a SPASM protein (designated as B). Maturase systems may also contain proteases and transport proteins to cleave and export final products outside of the cell.
  • the predicted core peptide sequences did not contain motifs modified by previously characterized maturase or SPASM proteins and are void of Cys residues which are prevalent in thioether bridged sactipeptides. Based on the diversity of reactions catalyzed by SPASM proteins it was not possible to predict the transformation and the inventors turned toward functional studies in Escherichia coli.
  • the XYE maturase system occurs in bacterial strains of the genus Xenorhabdus, Yersinia, and Erwinia (Fig.1a).
  • the substrate precursors are collectively referred to as XyeA (TIGR04495, putative rSAM-modified RiPP) and the SPASM protein as XyeB (TIGR04496, radical SAM/SPASM domain peptide maturase).
  • the length of the XyeA precursor peptides is ~50 amino acids with a corresponding GG motif predicted to separate the posttranslational modifying enzyme recognition sequence (N-terminal leader peptide) from the target sequence (C-terminal core peptide).
  • XncB as a 3-residue cyclophane forming enzyme (3-CyFE).
  • the general motif is defined as X1-X2-X3 where X1 is an aromatic amino acid (Trp or Phe for XyeA precursor peptides).
  • Xnc product (3) appears similar to the end product because there is a single identifiable modifying enzyme (XncB) and the presence of proteases and a Gly-Gly motif suggests cleavage at this site.
  • the stereochemistry at the Cα positions is consistent with either all L- or all D-configuration.
  • the former was chosen as a more plausible scenario based on precedent for cyclophane formation to occur by abstraction of hydrogen from Cβ and retention of configuration at Cα positions.
  • the newly formed stereocenters were assigned the S-configuration based on large vicinal coupling to Hα and NOE correlations which supported an approximate anti orientation of Hα and Hβ. Further, the observed NOE correlations within the WIN and WER macrocycles shown in Fig. 2d strongly suggest the cyclophanes have planar chirality and adopt 6Sp configuration.
  • GrrA TIGR04260, rSAM-associated Gly-rich repeat protein
  • GrrM TIGR04261, radical SAM/SPASM domain protein
  • a Ser to Ala variant (NHis6-MscA-S3A) as a substrate for MscB-375 was tested.
  • the resulting cyclized product would not contain a stereocenter at the Cβ-position and upon hydrolysis would possess C2-symmetry (L,L or D,D) or would be an optically inactive meso-form (L,D or D,L which are indistinguishable).
  • NHis6-MscA-S3A was coexpressed with MscB-375 and cyclization was detected and localized by LC-MS/MS.
  • the optically enriched (L, L)-isomer was prepared from known N-phthaloyl-L-Ala-aminoquinoline (14) 48 and protected L-4-iodo-Phe (15).
  • the key C-C bond was formed by the β-C–H monoarylation procedure using palladium acetate in the presence of silver tetrafluoroborate to give 16.
  • Global deprotection (6N HCl, 110°C) and derivatization with L-FDVA gave the (L, L)-configured standard (17).
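As a rough aid for the expression protocol described above, the 1:100 (v:v) inoculation ratio and the 0.8 mM IPTG induction can be converted into pipetting volumes with simple dilution arithmetic (C1 x V1 = C2 x V2). The sketch below is illustrative only: the function names are hypothetical, and the 1 M IPTG stock concentration is an assumption, since the protocol does not state which stock was used.

```python
def inoculum_volume_ml(culture_volume_ml, ratio=100):
    """Volume of overnight culture needed for a 1:ratio (v:v) inoculation."""
    return culture_volume_ml / ratio


def stock_volume_ml(culture_volume_ml, final_mM, stock_mM):
    """Volume of stock giving final_mM in the culture (C1*V1 = C2*V2).

    The small volume contributed by the stock itself is ignored.
    """
    return culture_volume_ml * final_mM / stock_mM


# Assumption: a 1 M (1000 mM) IPTG stock; not specified in the source.
ASSUMED_IPTG_STOCK_MM = 1000.0

# The two culture volumes named in the protocol: 200 mL and 1 L of TB medium.
for volume_ml in (200.0, 1000.0):
    inoculum = inoculum_volume_ml(volume_ml)
    iptg = stock_volume_ml(volume_ml, 0.8, ASSUMED_IPTG_STOCK_MM)
    print(f"{volume_ml:.0f} mL culture: {inoculum:.1f} mL inoculum, {iptg:.2f} mL IPTG stock")
```

With a different stock concentration only the constant changes; the 1:100 ratio and the 0.8 mM final concentration are the values taken from the protocol.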

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Wood Science & Technology (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • Zoology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Biomedical Technology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Immunology (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Analytical Chemistry (AREA)
  • Plant Pathology (AREA)
  • Enzymes And Modification Thereof (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)

Abstract

The invention relates generally to the field of biotechnology. In particular, the invention teaches a method of modifying a polypeptide, the method comprising the steps of: a) providing a polypeptide comprising a three residue motif X1-X2-X3, wherein X1 is an aromatic amino acid and wherein X2 and X3 are each independently any amino acid; and b) contacting the polypeptide with a rSAM/SPASM (rSS) enzyme for a sufficient time and under conditions to modify the polypeptide to form a cyclophane group connecting X1 and X3.

Description

METHOD OF MODIFYING A POLYPEPTIDE
Field of Invention
The invention relates generally to the field of biotechnology. In particular, the invention teaches a method of modifying a polypeptide with an rSAM/SPASM (rSS) enzyme.
Background
Polypeptides have a wide range of biomedical and industrial applications. Many polypeptides act as hormones, enzyme inhibitors, substrates, and neurotransmitters in the body, which has led to their increasing use as therapeutic agents for the treatment of diseases. Polypeptides are also often used as diagnostic tools or drug delivery applications. The advantage of using polypeptides is that they are generally cheap and straightforward to produce by techniques such as chemical synthesis or recombinant expression. However, one major disadvantage of the use of polypeptides is their susceptibility towards proteolytic degradation. For example, therapeutic polypeptides, which are often relatively unstructured, can be rapidly degraded in vivo , often with half-lives in the order of minutes.
Accordingly, it is generally desirable to overcome or ameliorate one or more of the above mentioned difficulties.
Summary
The present disclosure teaches a method of modifying a polypeptide. Disclosed herein is a method of modifying a polypeptide, the method comprising the steps of: a) providing a polypeptide comprising a three residue motif X1-X2-X3, wherein X1 is an aromatic amino acid and wherein X2 and X3 are each independently any amino acid; and b) contacting the polypeptide with a rSAM/SPASM (rSS) enzyme for a sufficient time and under conditions to modify the polypeptide to form a cyclophane group connecting X1 and X3.
In one embodiment is a modified polypeptide obtained according to a method as defined herein. Also disclosed herein is a kit for modifying a polypeptide, the kit comprising a) an expression construct comprising a nucleic acid encoding a polypeptide comprising a three residue motif X1-X2-X3, wherein X1 is an aromatic amino acid and wherein X2 and X3 are each independently any amino acid and b) an expression construct comprising a nucleic acid that encodes an rSAM/SPASM (rSS) enzyme.
Disclosed herein is a method of preparing a modified polypeptide library, the method comprising: a) providing a polypeptide library comprising polypeptides having a three residue motif X1-X2-X3, wherein X1 is an aromatic amino acid and wherein X2 and X3 are each independently any amino acid; and b) contacting the polypeptide library with a rSAM/SPASM (rSS) enzyme for a sufficient time and under conditions to form a cyclophane group connecting X1 and X3 so as to generate a modified polypeptide library.
Disclosed herein is a method of selecting a modified peptide capable of binding to a ligand, the method comprising: a) providing a polypeptide library comprising polypeptides having a three residue motif X1-X2-X3, wherein X1 is an aromatic amino acid and wherein X2 and X3 are each independently any amino acid; b) contacting the polypeptide library with a rSAM/SPASM (rSS) enzyme for a sufficient time and under conditions to form a cyclophane group connecting X1 and X3 so as to generate a modified polypeptide library; c) contacting the modified polypeptide library with the ligand; and d) selecting the modified polypeptides that bind to the ligand.
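To give a sense of the sequence space such a library could cover, the following minimal sketch enumerates every possible X1-X2-X3 motif. It rests on two assumptions that are not fixed by the text above: X1 is limited to the aromatic residues W, F, Y and H, and X2 and X3 are drawn from the 20 canonical amino acids. The function name is hypothetical, and the sketch says nothing about which motifs a given rSS enzyme would actually accept.

```python
from itertools import product

# Assumption: the 20 canonical amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumption: aromatic residues allowed at X1.
AROMATIC = "FHWY"


def enumerate_motifs(x1=AROMATIC, x2=AMINO_ACIDS, x3=AMINO_ACIDS):
    """Return all three-residue motifs X1-X2-X3 as one-letter strings."""
    return ["".join(residues) for residues in product(x1, x2, x3)]


motifs = enumerate_motifs()
print(len(motifs))      # 4 * 20 * 20 combinations
print("WIN" in motifs)  # a motif of this form is among the enumerated set
```

Restricting any position (for example fixing X3) is a matter of passing a shorter string for that argument.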
Brief Description of Drawings
Embodiments of the present invention are hereafter described, by way of non-limiting example only, with reference to the accompanying drawings in which:
Fig 1. Maturase systems, biosynthetic gene clusters and strains of origin. Representative gene clusters and core peptide sequences (to the right of clusters) from: a, GenProp1090 Xye maturase system (xnc, ykc, and etc). The A and B genes represent precursor peptide (XyeA, TIGR04495) and SPASM protein (XyeB, TIGR04496), respectively. Core peptides are assigned the sequence C-terminal to GG motifs; b, GenProp1037 Gly-rich repeat (Grr) maturase system (osc, Isc, and gsc). The A and B genes represent precursor peptide (GrrA, TIGR04260) and rSAM enzyme (GrrM, TIGR04261), respectively. Core peptides are defined at the C-terminus where the Gly-rich region commences; c, GenProp1068 Fxs maturase system (msc). The A and B genes represent precursor peptide (FxsA, TIGR04268) and rSAM enzyme (FxsB, TIGR04269), respectively. The start of the core peptides is unknown; selected residues at the C-terminus are shown. For all gene clusters shown, blue connectors indicate motifs where -2 Da modifications have been detected from tryptic digest fragments. Modifications were detected from coexpression of precursor peptide (A) with cognate SPASM protein (B) followed by Ni-affinity chromatography, digestion with trypsin and analysis by LC-MS/MS.
Fig 2. Detection and characterization of the Xnc product. a, LC-MS chromatograms for trypsin digests of NHis6-SUMO-XncA (top) and NHis6-SUMO-XncA + XncB (bottom). Inlays are mass spectra for tR = 9 - 12 minutes. The m/z values given are for the monoisotopic peaks. b, Gly(-2) to Lys construct of NHis6-SUMO-XncA was coexpressed with XncB followed by protein purification, digestion with trypsin, and purification by semi-preparative HPLC to yield fragment 3. The asterisk marks the location of the mutation. c, Key 2D NMR correlations for residues -1 to +4 of fragment 3 (top) and residues +4 to +11 of fragment 3 (bottom). d, Conformational analysis and NOE correlations for WIN (left), FGN (center), and WER (right) cyclophanes. Coupling constants are indicated for Asn3, Asn7, and Arg10.
Fig 3. Detection of activity and characterization of modifications by OscB. a, In vivo coexpression of NHis6-OscA2 + OscB followed by Ni-affinity purification, trypsin digest, and LC-MS to detect fragments 4 - 7. b, Extracted ion chromatogram (EIC, left) and corresponding mass spectra (right) to detect unmodified 4 (GGGGSWGNGGSWR (SEQ ID NO: 1)) and modified 6 for coexpression shown in a. c, EIC (left) and corresponding mass spectra (right) showing unmodified 5 (FINSR) and modified 7 for coexpression shown in a. d, In vivo coexpression of NHis6-OscA2-(16-28-W27G)-(43-47) + OscB followed by Ni-affinity purification, digestion with trypsin, and LC-MS to detect unmodified (5 and 8) and modified fragments (7 and 9). e, EIC (left) and corresponding mass spectra (right) to detect unmodified 8 (GGGGSWGNGGSGR (SEQ ID NO: 2)) and modified 9 for coexpression shown in d. f, EIC (left) and corresponding mass spectra (right) to detect unmodified 5 (FINSR) and modified 7 for coexpression shown in d. g, Products 7 and 9 purified from trypsin digest of the expression shown in d. h, 1H NMR of product 7 (500 MHz in D2O) showing an expansion of the aromatic protons. i, Modified fragments detected when an engineered sequence of OscA2 containing additional Lys residues was coexpressed with OscB, purified by Ni-affinity purification, digested and analysed by LC-MS. Asterisks indicate mutated positions. Blue connectors indicate modifications (-2 Da) detected and localized by LC-MS/MS.
Fig 4. Protein sequence similarity network for selected SPASM protein families. a, Protein sequence similarity network constructed for TIGRFAM SPASM maturase systems with characterized members. Thioether bond formation is represented by sporulation killing factor (Skf) maturases (IPR030915, TIGR04403, SkfB is a characterized member), quinohemoprotein (Qhp) maturases (IPR023886, TIGR03906, QhpD is a characterized member), and six-residue in forty-five (SCIFF) maturases (IPR024025, TIGR03974, CteB and Tte1186 are characterized members); Tyramine excision is represented by spliceases (N113), one subfamily annotated as Nif11-class peptide radical SAM maturase 3 (IPR026482, TIGR04103, PlpX is a characterized member); Tyrosine decarboxylation by mycofactocin (Myc) maturases (IPR023913, TIGR03962, MftC is a characterized member); Epimerization by Yyd maturases (IPR023904, TIGR04078, YydG is a characterized member); Glu to Tyr five-residue cyclophane formation by pyrroloquinoline quinone (Pqq) maturases (IPR011843, TIGR02109, pyrroloquinolinequinone); Lys to Trp five-residue cyclophane formation by KxxxW maturases (IPR024017, TIGR04080, StrB is a characterized member); Three-residue cyclophane formation is represented by Xenorhabdus Yersinia Erwinia (Xye) maturases (IPR030989, TIGR04496, XncB is a characterized member) and Gly-rich repeat (Grr) maturases (IPR026357, TIGR04261, OscB is a characterized member). The InterPro family (IPR) numbers were used as the input into EFI-EST (option B) using UniRef90, an E-value of 5, alignment score threshold of 20, the representative node network (RepNode = 50%), and visualized in Cytoscape 3.5.1. b, Protein sequence similarity network created using the same method as in a with the addition of FxsB maturases (IPR026335, TIGR04269, MscB-375 was characterized in this work).
Fig. 5. Detection of activity and characterization of MscB product. a, InterPro44 analysis of MscB to define the rSAM/SPASM domain and HEXXH domains. b, Coexpression of NHis6-MscA or NHis6-MscA-S3A with MscB-375, Ni-purification, digest, and HPLC yielded 11 or 13, respectively. Subsequent acid hydrolysis, derivatization with L-FDVA, and detection by LC-MS/UV to detect 12, 13, and the amino acids L-Ser, L-Phe, and L-Ile. The predominant form of 24 after acid hydrolysis is the hydrochloride salt. c, Detection of activity for NHis6-MscA + MscB-375. EICs are presented for trypsin digests of NHis6-MscA (top) and NHis6-MscA + MscB-375 (bottom). Corresponding mass spectra for 10 and 11 are shown on the right. d, Reaction scheme for the synthesis of standards 17, 18, 19, 22, and 23. e, HPLC chromatograms for standards (17-19) and comparison to degradation fragment 13. f, 1H NMR spectra for synthetic standards (22 and 23) and comparison to degradation fragment 24.
Fig. 6. Production of xenorceptide derived from the xnc gene cluster. Two different strategies were used to detect the natural product. a, Promoter exchange in which an L-arabinose inducible PBAD promoter was inserted upstream of the xnc gene cluster. b, Constructs used for heterologous expression in Escherichia coli in which partial (top) or full (bottom) xnc gene clusters were cloned into pET-28b(+) vector with an NHis6-tag at the N-terminus of the xncA precursor gene. c, In vivo coexpression of NHis6-SUMO-XncA(G-1K) variant + XncB followed by Ni-affinity purification, digestion with trypsin, and LC-MS analysis to detect modified fragment 27. d, EIC chromatograms from a, b, and c. e, Mass spectra of peaks 25-27 shown in d. Peaks labelled in the spectra are monoisotopic peaks. The detected products 25-27 represent the natural product xenorceptide.
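The figure descriptions above refer repeatedly to -2 Da modifications detected in tryptic fragments. Such a shift is what would be expected if two hydrogen atoms are lost when a new carbon-carbon bond forms. The sketch below is a hedged illustration of that arithmetic, not the inventors' analysis pipeline: it computes the monoisotopic mass of an unmodified peptide from standard residue masses and subtracts two hydrogen masses per crosslink. All function names are hypothetical, and FINSR (fragment 5 in Fig 3) is used only as an example input.

```python
# Standard monoisotopic residue masses (Da) for the 20 canonical amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.010565     # added once for the free N- and C-termini
HYDROGEN_MASS = 1.007825   # monoisotopic mass of a hydrogen atom
PROTON_MASS = 1.007276     # charge carrier in positive-mode electrospray


def monoisotopic_mass(sequence):
    """Neutral monoisotopic mass of a linear, unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER_MASS


def crosslinked_mass(sequence, n_crosslinks=1):
    """Mass after n_crosslinks C-C bonds, each assumed to cost two H atoms."""
    return monoisotopic_mass(sequence) - n_crosslinks * 2 * HYDROGEN_MASS


def mz(mass, charge=1):
    """m/z of the [M + zH]z+ ion for a neutral mass."""
    return (mass + charge * PROTON_MASS) / charge


unmodified = monoisotopic_mass("FINSR")
modified = crosslinked_mass("FINSR")
print(f"unmodified [M+H]+ {mz(unmodified):.4f}")
print(f"modified   [M+H]+ {mz(modified):.4f}")
print(f"shift {modified - unmodified:.4f} Da")
```

Comparing the two computed m/z values against extracted ion chromatograms is one simple way to look for a candidate modification; localizing it to a particular residue still requires MS/MS fragmentation data.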
Detailed Description
The present disclosure teaches a method of modifying a polypeptide. Disclosed herein is a method of modifying a polypeptide, the method comprising the steps of: a) providing a polypeptide comprising a three residue motif X1-X2-X3, wherein X1 is an aromatic amino acid and wherein X2 and X3 are each independently any amino acid; and b) contacting the polypeptide with a rSAM/SPASM (rSS) enzyme for a sufficient time and under conditions to modify the polypeptide to form a cyclophane group connecting X1 and X3.
The method as disclosed herein may comprise a) providing a polypeptide comprising one or more three residue motifs X1-X2-X3, wherein X1 is an aromatic amino acid and wherein X2 and X3 are each independently any amino acid. The method may also comprise b) contacting the polypeptide with one or more rSAM/SPASM (rSS) enzyme(s) for a sufficient time and under conditions to modify the polypeptide to form one or more cyclophane groups connecting X1 and X3 within each three residue motif.
Without being bound by theory, the inventors have found that the rSAM/SPASM (rSS) enzyme can recognise a 3-residue motif and form a cyclophane group that leads to restricted rotation of the aromatic ring and induces planar chirality in the asymmetric indole bridge. The 3-residue motif can have variants that are diverse in sequence and that are recognised by the rSS enzyme. The rSS enzyme can be used to catalyze formation of multiple cyclophane groups within a polypeptide containing 3-residue motifs. Each cyclophane group is conformationally rigid and compact, which may increase the proteolytic stability of the polypeptide. Each cyclophane group also forms a unique three-dimensional scaffold that can have wide-ranging applications, such as for binding to new drug targets.
The contacting of the polypeptide with a rSAM/SPASM (rSS) enzyme may allow the formation of a cyclophane group connecting X1 and X3. In one embodiment, there is provided a method of modifying a polypeptide, the method comprising the steps of: a) providing a polypeptide comprising a three residue motif X1-X2-X3, wherein X1 is an aromatic amino acid and wherein X2 and X3 are each independently any amino acid; and b) contacting the polypeptide with a rSAM/SPASM (rSS) enzyme for a sufficient time and under conditions to modify the polypeptide to form a cyclophane group by connecting X1 and X3.
The term “cyclophane group” or “cyclophane” may be used interchangeably to refer to a macrocycle or ring consisting of an aromatic unit (typically a benzene ring) and an optionally substituted aliphatic chain that forms a bridge between two non-adjacent positions of the aromatic ring. For example, the “cyclophane group” or “cyclophane” can refer to a macrocycle or ring formed when an aromatic unit in an aromatic amino acid X1 (such as W, F, Y or H) in a peptide comprising a 3 residue motif X1-X2-X3 is joined to a Cβ in X3 via a carbon to carbon bond (see for example Figure 2(c)).
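Because the motif is defined purely by position (an aromatic X1 followed by any X2 and X3), candidate sites in a sequence can be listed with a simple sliding-window scan. The following is a minimal sketch under stated assumptions: the aromatic set is taken as W, F, Y and H, the function name is hypothetical, and a hit only marks a sequence position that fits the definition. It does not predict whether a particular rSS enzyme would modify that site.

```python
# Assumption: aromatic residues permitted at X1 (one-letter codes).
AROMATIC = frozenset("WFYH")


def find_candidate_motifs(sequence):
    """Return (1-based start position, motif) for every X1-X2-X3 window.

    A window qualifies when its first residue is aromatic and two more
    residues follow it; overlapping windows are all reported.
    """
    sequence = sequence.upper()
    return [
        (start + 1, sequence[start:start + 3])
        for start in range(len(sequence) - 2)
        if sequence[start] in AROMATIC
    ]


# Example inputs: the tryptic fragments named in the description of Fig 3.
for peptide in ("FINSR", "GGGGSWGNGGSWR"):
    print(peptide, find_candidate_motifs(peptide))
```

In the second peptide the Trp near the C-terminus is not reported because only one residue follows it, so no complete three-residue window exists there.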
In one embodiment, the method of modifying the polypeptide reduces the susceptibility of the polypeptide to protease degradation (such as to trypsin degradation). In one embodiment, the method of modifying the polypeptide increases the in vitro or in vivo stability of the polypeptide by reducing its susceptibility to proteolytic degradation. In one embodiment, the method of modifying the polypeptide increases the in vivo half-life of the polypeptide in an animal body. In one embodiment, the method of modifying the polypeptide increases the in vitro half-life of the polypeptide. The method as defined herein may, for example, be used to increase the in vivo stability of a therapeutic polypeptide or vaccine. The method may also be used to alter the binding properties of a therapeutic polypeptide to a target. The method as defined herein may also be used to increase the in vitro stability of a polypeptide or enzyme for applications outside an animal body, such as for use as a pesticide, food preservative or research tool kit. The polypeptide as referred to herein may comprise or further comprise a polypeptide of interest. The polypeptide of interest may, for example, be insulin, growth hormones, clotting factors such as factor VIII and factor IX, thrombin, hemopoietic growth factor, viral antigens, erythropoietin, enzyme inhibitors, substrates, or neurotransmitters. The methods as referred to herein may be used to reduce the susceptibility of the polypeptide to protease degradation. In one embodiment, the method comprises engineering an X1-X2-X3 motif within a polypeptide of interest. In one embodiment, the method comprises engineering an X1-X2-X3 motif at the amino and/or carboxy terminus of a polypeptide of interest, which prevents degradation by exopeptidase enzymes.
In one embodiment, there is provided a method of increasing the in vitro or in vivo stability of a polypeptide, the method comprising the steps of: a) providing a polypeptide comprising a three residue motif X1-X2-X3, wherein X1 is an aromatic amino acid and wherein X2 and X3 are each independently any amino acid; and b) contacting the polypeptide with a rSAM/SPASM (rSS) enzyme for a sufficient time and under conditions to modify the polypeptide to form a cyclophane group connecting X1 and X3.
The terms "polypeptide", “peptides” and "protein" are used interchangeably and include any polymer of amino acids (dipeptide or greater) linked through peptide bonds or modified peptide bonds, whether produced naturally or synthetically. The polypeptides of the invention may comprise non-peptidic components, such as carbohydrate or fatty acid groups.
The term "amino acid" refers to naturally occurring and non-natural amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids. Naturally encoded amino acids are the 20 common amino acids (alanine, arginine, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, and valine) and pyrrolysine and selenocysteine. Amino acid analogs refer to compounds that have the same basic chemical structure as a naturally occurring amino acid, by way of example, an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group. Such analogs may have modified R groups (by way of example, norleucine) or may have modified peptide backbones, while still retaining the same basic chemical structure as a naturally occurring amino acid. Non-limiting examples of amino acid analogs include homoserine, norleucine, methionine sulfoxide and methionine methyl sulfonium. The amino acid as referred to herein may be a D or L amino acid. The amino acid may also be a β-amino acid. The term "amino acid" can include D-amino acids, α,α-disubstituted amino acids, N-alkyl amino acids, homo-amino acids, dehydroamino acids, aromatic amino acids (other than phenylalanine, tyrosine and tryptophan), and ortho-, meta- or para-aminobenzoic acid, non-conventional amino acids such as compounds which have an amine and carboxyl functional group separated in a 1,3 or larger substitution pattern, such as β-alanine, γ-aminobutyric acid, Freidinger lactam, the bicyclic dipeptide (BTD), aminomethyl benzoic acid and others well known in the art. Statine-like isosteres, hydroxyethylene isosteres, reduced amide bond isosteres, thioamide isosteres, urea isosteres, carbamate isosteres, thioether isosteres, vinyl isosteres and other amide bond isosteres known to the art are also included.
A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art, which can be generally sub-classified as follows:
Table 1: Amino Acid Subclassification
Conservative amino acid substitution also includes groupings based on side chains. For example, a group of amino acids having aliphatic side chains is glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains is serine and threonine; a group of amino acids having amide-containing side chains is asparagine and glutamine; a group of amino acids having aromatic side chains is phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains is lysine, arginine, and histidine; and a group of amino acids having sulfur-containing side chains is cysteine and methionine. For example, it is reasonable to expect that replacement of a leucine with an isoleucine or valine, an aspartate with a glutamate, a threonine with a serine, or a similar replacement of an amino acid with a structurally related amino acid will not have a major effect on the properties of the resulting variant polypeptide. Whether an amino acid change results in a functional polypeptide can readily be determined by assaying its activity. Conservative substitutions are shown in Table 2 under the heading of exemplary and preferred substitutions. Amino acid substitutions falling within the scope of the invention are, in general, accomplished by selecting substitutions that do not differ significantly in their effect on maintaining (a) the structure of the peptide backbone in the area of the substitution, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of the side chain. After the substitutions are introduced, the variants are screened for biological activity.
Table 2: Exemplary and Preferred Amino Acid Substitutions
In one embodiment, the polypeptide is modified to form a cyclophane group within the three residue motif with a carbon to carbon bond between X1 and X3.
The modification may, for example, result in the crosslinking of the three residue motif which includes W, F, Y or H to form indole- or phenyl-bridged cyclophanes. The modified polypeptide may, for example, display restricted rotation of the aromatic ring and induce planar chirality in the asymmetric indole bridge. X1 may be an aromatic amino acid. The term “aromatic amino acid” may refer to an amino acid with an aromatic ring. The aromatic amino acid may be a naturally-occurring or non-natural aromatic amino acid. It can also be an analogue or a mimetic of a naturally-occurring or non-natural aromatic amino acid. In one embodiment, X1 is W, F, Y or H. X2 and X3 may each independently be any amino acid. In one embodiment, X2 is I, G, E, Y, V, L, A, D, S, T, N or Q. X3 may be a non-aromatic amino acid. In one embodiment, X3 is an amino acid that is not W, F, Y or H. In one embodiment, X3 is N, R, S, D or K.
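As an illustrative aid only, the residue preferences listed above can be expressed as a simple sequence scan. The following Python sketch assumes that the X1, X2 and X3 residue sets of the separate embodiments above are applied together, which is an assumption made for illustration. The function name find_candidate_motifs is hypothetical. A match indicates only that a tripeptide fits these residue sets, not that an rSS enzyme will modify it.

```python
import re

# Residue sets taken from the embodiments above; applying all three
# together is an assumption made for this sketch.
X1_RESIDUES = "WFYH"          # aromatic X1 options
X2_RESIDUES = "IGEYVLADSTNQ"  # X2 options
X3_RESIDUES = "NRSDK"         # X3 options

# A lookahead is used so that overlapping tripeptides are all reported.
_MOTIF = re.compile(
    rf"(?=([{X1_RESIDUES}][{X2_RESIDUES}][{X3_RESIDUES}]))"
)


def find_candidate_motifs(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start index, tripeptide) for each candidate X1-X2-X3 motif."""
    return [(m.start(1), m.group(1)) for m in _MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # WINAFGNWER is SEQ ID NO: 15 of this document.
    print(find_candidate_motifs("WINAFGNWER"))
```

Applied to WINAFGNWER (SEQ ID NO: 15), the scan reports the tripeptides WIN, FGN and WER.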
The method may be performed under anaerobic or oxygen-free conditions.
In one embodiment, the polypeptide is a linear polypeptide.
The term “rSAM” refers to radical S-adenosylmethionine.
The rSS enzyme may be an rSS enzyme of the Xenorhabdus, Yersinia and Erwinia (XYE) maturase system (XyeB, TIGR04496, IPR030989), Glycine-rich repeat (Grr) maturase system (GrrM, TIGR04261, IPR026357) or the Fxs maturase system (FxsB, TIGR04269, IPR026335).
The rSS enzyme may also be an enzymatically active fragment of an rSS enzyme of the Xenorhabdus, Yersinia and Erwinia (XYE) maturase system (XyeB, TIGR04496, IPR030989), Glycine-rich repeat (Grr) maturase system (GrrM, TIGR04261, IPR026357) or the Fxs maturase system (FxsB, TIGR04269, IPR026335).
The rSS enzyme may have an amino acid sequence that is at least 70% (or 75%, 80%, 85%, 90% or 95%) identical to the following sequences:
XncB:
MTTSKSEKIKHLEIILKISERCNINCSYCYVFNMGNSLATDSPPVISLDNVLALRGFFER SAAENEIEVIQVDFHGGEPLMMKKDRFDQMCDILRQGDYSGSRLELALQTNGILIDDE WISLFEKHKVHASISIDGPKHINDRYRLDRKGKSTYEGTIHGLRMLQNAWKQGRLPG EPGILSVANPTANGAEIYHHFANVLKCQHFDFLIPDAHHDDDIDGIGIGRFMNEALDA WFADGRSEIFVRIFNTYLGTMLSNQFYRVIGMSANVESAYAFTVTADGLLRIDDTLRS TSDEIFNAIGHLSELSLSGVLNSPNVKEYLSLNSELPSDCADCVWNKICHGGRLVNRF SRANRFNNKTVFCSSMRLFLSRAASHLITAGIDEETIMKNIQK (SEQ ID NO: 3)
YkcB:
MEVITGSEGRVMLNLLIEKNIRHLEIILKISERCNINCDYCYVFNKGNSAADDSPARLS NKNIHHLVCFLQRACQEYKIGTVQIDFHGGEPLLMKKENFTDMCIQLISGNYCGSNIR LALQTNATLIDNEWIAIFEKYSVNVSISIDGPKHINDRHRLDTKGRSTYESTVRGLRIL QNAYQQGRLPSDPGILCVTNAQANGAEIYRHFVDELGVYSFDFLIPDDSYKDAHPDA VGIGRFLNEALDEWVKDNNAKIFVRLFQTHIASLLGQKNSGVLGHTPNITGVYALTV SSDGFVRVDDTLRSTSDRMFNPIGHLSEVNLSNVFASPQFQEYSSIGQSLPTECEGCIW ENICAGGRIVNRFSTEDRFKHKSIYCYSMRTFLSRSSAHLLNMGIKEERIMAAIRA
(SEQ ID NO: 4)
EtcB:
MTQLKGEKIKHLEIILKISERCNINCTYCYVFNMGNTLATDSTPVISLDNVYALRGFFE RSAAENDIEVIQVDFHGGEPLMMKKDRFDRMCQILLQGNYRSSKFELALQTNGILIDD EWIALFEKHQVHASISVDGPKHINDRHRLDRKGKSTYEGTITGLRLLQNAWQQGRLP GEPGILSVANANANGAEIYRHFADTLQCQRFDFLIPDDHHDDSPDGEGVGRFLNEAL DAWFADGRPEIFIRIFNTYLGTMLNSQFNRVLGMSANVESAYAFTVTADGMLRIDDT LRSTSDEIFNAVGHVSELSLARVLETSCVKEYLALSSNLPTVCAECVWNNICHGGRLV NRFSRTNRFNNKTVFCKSMRLFLSRAASHLMASGVDEKEIMKNIQK (SEQ ID NO: 5)
MscB:
MAPGPARAALTEFVLKVHARCDLACDHCYVYEHADQSWRRRPVRMTPEVLRTAAG RIAEHAAAHDLPDVTVILHGGEPLLLGAERLGEVLADLRRVIDPVTRLRLGMQTNGV LLSERLCDLLAEHDVAVGVSLDGDRAANDRHRRFRSGAGSYDQVLRAIGLLRRPAY RRIYSGLLCTVDVRNDPIAVYESLLTQEPPRIDFLLPHATWDDPPWRPAGGGTAYAG WLRAVYDRWLADGRPVSVRLFDSLLSTAAGGPSGTEWLGLDPVDLAVVETDGEWE QADSLKTAYDGAPATGMTVFSHAADDVAASPLLARRRSGRAGLSDECRRCPVVDQC GGGLFAHRYGAGHFDHPSVYCADLKELIVHVNENPPAPVRLDAGLPDDFIDRLAALT GDRVAIGRLVEAQIAIVRALLAEVADRLPAGGAGADGWEALTALDRSAPESVARIAA HPYVRAWAVDCLAGSGTGARQGPDYLSALAVAAALDAGTPVRLDVPVRSGRLHLP TVGTVLLPEVGDGAARVETGPGSLRVAAGDVTVAIRPGTPGDAPRWWPTRVLAAPD VSVLLEDGDPHRDCHRLPAGDRLDDAGAARWAETFAAAWQVIRDEVPGHAEELRA GLRAVVPLRRSGAGVSEASTARQAFGGVAATETDAGSLAVLLVHEFQHSKMNALLD ICDLVDGTRPIDITVGWRPDPRPAEAVLHGIYAHAAVADIWRIRADRQVDGAQAVYR RYRDWTAEAIGALQRADALTPAGSRLVRQVARSMSGWPS (SEQ ID NO: 6)
OscB:
MINPTLLNPEKIDISKFGPINLVVIQATSFCNLNCDYCYLPNRDLKNTLSLDLIEPIFKNI FNSPFVGDEFTICWHAGEPLAVPISFYESAFQLIQAADQKYNQKQAKIWHSVQTNAT YINQKWCDFIQEHNICVGVSLDGPEFIHDAHRQTRKGTGSHAQTMRGISFLQKNNIPF YVISVVTQDSLNYADEIFNFFRENGIYDVGFNLEEIEGVNQSSTLEAVGTSEKYRAFM QRFWELTSEVQGEFNLREFEAICGLIYSNTRLTQTDMNNPFVLINIDYQGNFSTFDPEL LSVNIKPYGNFILGNVLTDSFESVCDTEKFQKIYTDMQEGIKLCRETCEYFGVCGGGA GSNKYWENGTFACSETMACRYRIKVVTDIILDKLENSLGLVENC (SEQ ID NO: 7)
LscB:
MTISKMNLPVQTDNFRASSTLDLSAFGPINLVVIQSTSFCNLNCDYCYLRDRQSKNRL SLDLIEPILKTVLTSPFVGCDFTILWHAGEPLAMPISFYDSATALIREAERQYKTQPIQIF QSIQTNATLINQAWCDCFRRNEIYVGVSLDGPAFLHDAHRQTYKGTGTHAATMRGIS LLQKNEIPFNVICVLTQDSLDYPDEIFNFFRSNRITEVGFNMEEAEGVHQHSTLDQQGT EERYRAFMQRFWDLTVQAKGEFKLREFETICTLAYTGDRLGYTDMNQPFVIVNFDH QGNFSTFDPELLSFKIKEYGDFVLGNVLHNTLESVCQTEKFQKIYQDMAAGVVQCRQ SCEYFGLCGGGAGSNKYWENGTFNCTETKACRYRIKVIADIVLEGLENSLELANSIS
(SEQ ID NO: 8)
GscB:
MSIVTSKPVINFKNTANFGPISLIIIQPNSFCNLDCDYCYLPDRHLQNKLSLDLIDPIFKSI FTSPFLGCDFGVCWHAGEPLTMPVSFYKSAFQLIEEANTKYNKSEYSFYHSYQTNGT LINQGWCDLWQEYPVHVGVSIDGPAFLHDVHRKNRKGGNSHDLTMRGIRYLQKNNI PYNTISVITEESLNYPDEMFNFFAENEIYDLAFNMEETEGVNELTSLNGIEIEHKYSQFI KRFWQLVTESKLPFIVREFEILISLIYSGNRLTNTDMNKPFVIVNFDYQGNFSTFDPELL SVKTDKYGDFIFGNVLKDSLESICETEKFKTIYKDINDGVKLCSDNCSYFGICGGGAG SNKYWENGTFASMETQACRYRIKILTDVLVSTIENSLGL (SEQ ID NO: 9)
In one embodiment, the rSS enzyme is a C-terminal truncated MscB enzyme with the following sequence:
MAPGPARAALTEFVLKVHARCDLACDHCYVYEHADQSWRRRPVRMTPEVLRTAAG RIAEHAAAHDLPDVTVILHGGEPLLLGAERLGEVLADLRRVIDPVTRLRLGMQTNGV LLSERLCDLLAEHDVAVGVSLDGDRAANDRHRRFRSGAGSYDQVLRAIGLLRRPAY RRIYSGLLCTVDVRNDPIAVYESLLTQEPPRIDFLLPHATWDDPPWRPAGGGTAYAG WLRAVYDRWLADGRPVSVRLFDSLLSTAAGGPSGTEWLGLDPVDLAVVETDGEWE QADSLKTAYDGAPATGMTVFSHAADDVAASPLLARRRSGRAGLSDECRRCPVVDQC GGGLFAHRYGAGHFDHPSVYCADLKELIVHVNENPPAPV (SEQ ID NO: 41).
The enzymes as referred to herein may comprise one or more conservative amino acid substitutions. In one embodiment, the rSS enzyme is an enzymatically active fragment of any one of the above sequences. In one embodiment, the enzymatically active fragment is one that contains the rSAM and SPASM domains (such as CNINCSYC (SEQ ID NO: 42) and CADCVWNKIC (SEQ ID NO: 43) in XncB). The rSS enzyme may be a XyeB, GrrM or FxsB rSS enzyme from a bacterial genus listed in Tables 3-5.
Table 3. Precursor (XyeA, IPR030990) and rSS (XyeB, IPR030989) paired sequences from the UniProt database.
Table 4. Precursor (GrrA, IPR026356) and rSS (GrrM, IPR026357) paired sequences from the UniProt database.
Table 5. Precursor (FxsA, IPR026334) and rSS (FxsB, IPR026335) paired sequences from the UniProt database.
In one embodiment, the rSS enzyme or enzymatically active fragment has two Cys-rich domains that are critical or essential for activity. The two Cys-rich domains may include the rSAM binding domain in the N-terminus (CXXXCXXC (SEQ ID NO: 10)) and the SPASM domain in the C-terminus (CXXXCXXXXXC (SEQ ID NO: 11) or CXXCXXXXXC (SEQ ID NO: 12)), where X may be any amino acid.
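By way of illustration only, the two Cys-rich patterns recited above can be written as regular expressions and used to check a candidate sequence. The following Python sketch is a minimal example under stated assumptions: the function name has_both_cys_motifs is hypothetical, and requiring the SPASM-type pattern to occur after the rSAM-type pattern is a simplification of the N-terminal and C-terminal placement described above. A pattern match does not establish enzymatic activity.

```python
import re

# CXXXCXXC (SEQ ID NO: 10), where X is any amino acid
RSAM_MOTIF = re.compile(r"C.{3}C.{2}C")
# CXXXCXXXXXC (SEQ ID NO: 11) or CXXCXXXXXC (SEQ ID NO: 12)
SPASM_MOTIF = re.compile(r"C.{3}C.{5}C|C.{2}C.{5}C")


def has_both_cys_motifs(sequence: str) -> bool:
    """True if an rSAM-type pattern is followed, further along, by a SPASM-type pattern."""
    seq = sequence.upper()
    rsam = RSAM_MOTIF.search(seq)
    if rsam is None:
        return False
    # Search for the SPASM-type pattern only downstream of the rSAM match.
    return SPASM_MOTIF.search(seq, rsam.end()) is not None


if __name__ == "__main__":
    # CNINCSYC and CADCVWNKIC are the XncB fragments cited in this document;
    # the poly-alanine spacer is an artificial placeholder for this example.
    toy = "CNINCSYC" + "A" * 20 + "CADCVWNKIC"
    print(has_both_cys_motifs(toy))
```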
The term “domain”, as used herein, refers to a part of a molecule or structure that shares common physicochemical features, such as, but not limited to, hydrophobic, polar, globular and helical domains or properties such as ligand-binding, membrane fusion, signal transduction, cell penetration and the like. Often, a domain has a folded protein structure which has the ability to retain its tertiary structure independently of the rest of the protein. Generally, domains are responsible for discrete functional properties of proteins, and in many cases may be added, removed or transferred to other proteins without loss of function of the remainder of the protein and/or of the domain. Domains may be co-extensive with regions or portions thereof; domains may also include distinct, non-contiguous regions of a molecule.
The rSS enzyme may be a recombinant enzyme or may be isolated from bacteria.
The term “recombinant” when used with reference to, e.g., polypeptide, enzyme, nucleic acid or cell refers to a material, or a material corresponding to the natural or native form of the material, that has been modified in a manner that would not otherwise exist in nature, or is identical thereto but produced or derived from synthetic materials and/or by manipulation using recombinant techniques. Non-limiting examples include, among others, recombinant cells expressing genes that are not found within the native (non-recombinant) form of the cell or express native genes that are otherwise expressed at a different level.
“Isolated polypeptide” refers to a polypeptide which is substantially separated from other contaminants that naturally accompany it, e.g., protein, lipids, and polynucleotides. The term embraces polypeptides which have been removed or purified from their naturally-occurring environment or expression system (e.g., host cell or in vitro synthesis). The improved ketoreductase enzymes may be present within a cell, present in the cellular medium, or prepared in various forms, such as lysates or isolated preparations. As such, in some embodiments, the improved ketoreductase enzyme can be an isolated polypeptide.
The method may comprise co-expressing the polypeptide and the rSS enzyme in a host cell such that the polypeptide contacts the rSS enzyme for a sufficient time and under conditions to modify the polypeptide in the host cell.
The terms “host”, “host cell”, “host cell line” and “host cell culture” are used interchangeably and refer to cells into which exogenous nucleic acid has been introduced, including the progeny of such cells. Host cells include “transformants” and “transformed cells”, which include the primary transformed cell and progeny derived therefrom without regard to the number of passages. Progeny may not be completely identical in nucleic acid content to a parent cell, but may contain mutations. Mutant progeny that have the same function or biological activity as screened or selected for in the originally transformed cell are included herein. A host cell is any type of cellular system that can be used to generate a modified polypeptide of the present invention. Host cells include cultured cells, e.g., mammalian cultured cells, such as CHO cells, BHK cells, NS0 cells, SP2/0 cells, YO myeloma cells, P3X63 mouse myeloma cells, PER cells, PER.C6 cells or hybridoma cells, yeast cells, insect cells, and plant cells, to name only a few, but also cells comprised within a transgenic animal, transgenic plant or cultured plant or animal tissue.
In one embodiment, the polypeptide comprises WX4X5X6X7X8X9X10X11X12 (SEQ ID NO: 44), wherein X4 is I, D, L or V, wherein X5 is N, K or R, wherein X6 is A, F or V, wherein X7 is F or Y, wherein X8 is A, G, L, S or V, wherein X9 is N, K or R, wherein X10 is W or F, wherein X11 is D, E, G, N, P, S or T and wherein X12 is K or R. In one embodiment, the polypeptide comprises WIX4AFX5NWX6X7 (SEQ ID NO: 13), wherein X4 is N or K, wherein X5 is G or A, wherein X6 is E, S or T and wherein X7 is R or K. The polypeptide may comprise a sequence having at least 80% identity to a sequence of: ELVDSLLDTVSX13GWINAFGNWERAFH (SEQ ID NO: 14), wherein X13 is G or K.
In one embodiment, the enzyme is an enzyme from the XYE maturase system. The enzyme may be an XyeB SPASM protein (e.g. xncB, ykcB or etcB) or an enzymatically active fragment of the enzyme. The polypeptide may be a polypeptide having at least 80% identity to an XyeA precursor peptide (e.g. xncA, ykcA and etcA), including an XyeA precursor peptide that is listed in Table 3. In one embodiment, the polypeptide comprises WIX4AFX5NWX6X7 (SEQ ID NO: 13), wherein X4 is N or K, wherein X5 is G or A, wherein X6 is E, S or T and wherein X7 is R or K. The polypeptide may comprise WINAFGNWER (SEQ ID NO: 15), WIKAFGNWSR (SEQ ID NO: 16), WINAFANWTK (SEQ ID NO: 17), WINAFGNWERAFH (SEQ ID NO: 18), AGWIKAFGNWSRSF (SEQ ID NO: 19) or WINAFANWTKRI (SEQ ID NO: 20).
In one embodiment, the enzyme is an enzyme from the GRR maturase system. The enzyme may be a GrrM SPASM protein (e.g. oscB, lscB or gscB) or an enzymatically active fragment of the enzyme. The enzyme may, for example, act on a peptide having at least 80% identity to a GrrA precursor peptide (e.g. oscA, lscA and gscA), including a GrrA precursor peptide that is listed in Table 4. The polypeptide may comprise
(a) GAWGNGGGRGGWINRGGGGSWGNGGSWRNGGGWRNGWGDGGRFINSR (SEQ ID NO: 21);
(b) GGGFTQGGRRGVATGPRGGNFYNAHPNYGRVGGPVGVGRGAAWADGGGF YNGTYQDGGSFVNGSDGGAAFKNGTYGAGGFVNGSQGGAGFRNW (SEQ ID NO: 22); or
(c) GFANGGGGFANRVGPGGFLNDNGGGGFLNNRGWGDGGGGFLNRR (SEQ ID NO: 23).
In one embodiment, the enzyme is an enzyme from the FXS maturase system. The enzyme may be an FxsB SPASM protein (e.g. mscB) or an enzymatically active fragment of the enzyme. The enzyme may, for example, act on a peptide having at least 80% identity to an FxsA precursor peptide (e.g. mscA), including an FxsA precursor peptide that is listed in Table 5. The polypeptide may comprise IPAAKFSSFI (SEQ ID NO: 24).
The terms “percentage of sequence identity” and “percentage identity” are used interchangeably herein to refer to comparisons among polynucleotides and polypeptides, and are determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide or polypeptide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage may be calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity. Alternatively, the percentage may be calculated by determining the number of positions at which either the identical nucleic acid base or amino acid residue occurs in both sequences or a nucleic acid base or amino acid residue is aligned with a gap to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity. Those of skill in the art appreciate that there are many established algorithms available to align two sequences.
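The first calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for an alignment program: it assumes the two sequences have already been optimally aligned and padded with a gap character to equal length, it treats the whole alignment as the comparison window, and the function name percent_identity is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage identity over a pre-computed pairwise alignment.

    Matched positions are those where both sequences carry the same residue
    (gap-to-gap columns are not counted); the divisor is the total number of
    positions in the comparison window, here the full alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    matched = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matched / len(aligned_a)


if __name__ == "__main__":
    # SEQ ID NO: 15 and SEQ ID NO: 16 of this document, aligned without gaps.
    print(percent_identity("WINAFGNWER", "WIKAFGNWSR"))
```

For WINAFGNWER (SEQ ID NO: 15) against WIKAFGNWSR (SEQ ID NO: 16), eight of the ten aligned positions are identical, giving 80%.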
Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith and Waterman, 1981, Adv. Appl. Math. 2:482, by the homology alignment algorithm of Needleman and Wunsch, 1970, J. Mol. Biol. 48:443, by the search for similarity method of Pearson and Lipman, 1988, Proc. Natl. Acad. Sci. USA 85:2444, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the GCG Wisconsin Software Package), or by visual inspection (see generally, Current Protocols in Molecular Biology, F. M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (1995 Supplement) (Ausubel)). Examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al., 1990, J. Mol. Biol. 215:403-410 and Altschul et al., 1997, Nucleic Acids Res. 25:3389-3402, respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information website. This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0).
For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff, 1992, Proc Natl Acad Sci USA 89:10915). Exemplary determination of sequence alignment and % sequence identity can employ the BESTFIT or GAP programs in the GCG Wisconsin Software package (Accelrys, Madison Wis.), using default parameters provided.
The polypeptide may comprise a peptide sequence comprising at least 1 or more, 2 or more, 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 11 or more, 12 or more, 13 or more, 14 or more, 15 or more, 16 or more, 17 or more, 18 or more, 19 or more, 20 or more amino acids positioned upstream of the X1-X2-X3 motif, wherein X1 is an aromatic amino acid and wherein X2 and X3 are each independently any amino acid. The amino acids may be amino acids. For example, the polypeptide may comprise a peptide sequence of ELVDSLLDTVSGG (SEQ ID NO: 42) positioned at the N terminus of the X1-X2-X3 motif. The polypeptide may comprise a leader sequence having at least 80% identity to a sequence of: MSKLQREIAANKAQLSHEDKKKTQHK (SEQ ID NO: 25). The polypeptide may comprise a leader sequence having at least 80% sequence identity to a sequence of:
LEISTKIGLVGFFLALSALNIPAANATIKTPESTTIESRLSRITETIKERENQLQVKPEVPQ PGEQIAR (SEQ ID NO: 26). The polypeptide may comprise a leader sequence having at least 80% sequence identity to a sequence of:
MTTAPLRQDEPPVSLGRVRAEPLDRLSAADISPIVRRVMAAHLDQTRIPAAK (SEQ ID NO: 27).
The polypeptide may comprise an affinity tag (such as a hexa-histidine sequence) and/or a solubility tag (such as SUMO).
The methods of the present invention may be used to stabilise therapeutic polypeptides. The methods of the present invention can be used to modify ligands of receptors including, for example, TNF superfamily members, cytokine superfamily members, growth factors, chemokine superfamily members, pro-angiogenic factors, pro-apoptotic factors, integrins, hormones and other soluble factors, among others, including RANK-L, Lymphotoxin (LT)-α, LT-β, LT-α1β2, zLIGHT, BTLA, TL1A, FasL, TWEAK, CD30L, 4-1BB-L (CD137L), CD27L, Ox40L (CD134L), GITRL, CD40L (CD154), APRIL (CD256), BAFF, EDA1, IL-1α, IL-1β, IL-1RA, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14, IL-15, IL-16, IL-17A, IL-17F, IL-17A/F, IL-18, IL-19, IL-20, IL-21, IL-22, IL-23, IL-24, IL-25, IL-26, IL-27, IL-28, IL-29, IL-30, IL-31, IL-32, IL-33, IL-34, IL-35, IFN-gamma, IFN-alpha, IFN-beta, TNF-α, TNF-β, G-CMF, GM-CSF, TGF-β1, 2 and 3, TGF-α, cardiotrophin-1, leukemia inhibitory factor (LIF), betacellulin, amphiregulin, thymic stromal lymphopoietin (TSLP), flt-3, CXCL1-16, CCL1-3, CCL3L1, CCL4-CCL8, CCL9/10, CCL11-28, XCL1, XCL2, CX3CL1, HMG-B1, heat shock proteins, chemerin, defensins, macrophage migration inhibitory factor (MIF), oncostatin M, limitin, vascular endothelial growth factors VEGF A-D and PIGF, lens epithelium derived growth factor, erythropoietin, thrombopoietin, platelet derived growth factor, epidermal growth factor, fibroblast growth factors FGF1-14 and 16-23, hepatoma-derived growth factor, hepassocin, hepatocyte growth factor, platelet-derived endothelial growth factor (PD-ECGF), insulin-like growth factors IGF1 and IGF2, IGF binding proteins (IGFBP 1-6), GASPS (growth and differentiation-factor-associated serum proteins), connective tissue growth factor, epigen,
epiregulin, developmental arteries and neural crest epidermal growth factor (DANCE), glial maturation factor-β, insulin, growth hormone, angiogenin, angiopoietin 1-4, angiopoietin-like proteins 1-4, integrins αVβ3, αVβ5 and α5β1, erythropoietin, thrombopoietin, prolactin releasing hormone, corticotropin-releasing hormone (CRH), gonadotropin releasing hormone, thyrotropin releasing hormone, somatostatin, vasopressin, oxytocin, demoxytocin, carbetocin, luteinizing hormone (LH) and chorionic gonadotropins, thyroid-stimulating hormone, ANP, BNP, CNP, calcitonin, CCK a, CCK B, vasoactive intestinal peptides 1 and 2, encephalin, dynorphin, β-endorphin, morphine, 4-PPBP, [1] SA 4503, Ditolylguanidine, siramesine, angiotensin, kallidin, bradykinin, tachykinins, substance P, calcitonin, galanin, neurotensin, neuropeptides Y1-5, neuropeptide S, neuropeptide FF, neuropeptide B/W, brain-derived neurotrophic factors BDNF, NT-3, NT-4/5, activin A, AB, B and C, inhibin, Mullerian inhibiting hormone (MIH), bone morphogenetic proteins BMP2, BMP3, BMP4, BMP5, BMP6, BMP7, BMP8a, BMP8b, BMP10, BMP15, growth differentiating factors GDF1, GDF2, GDF3, GDF5, GDF6, GDF7, Myostatin/GDF8, GDF9, GDF10, GDF11, GDF15, nerve growth factor (NGF), brain-derived neurotrophic factor (BDNF), neurotrophin-3 (NT-3), and neurotrophin-4 (NT-4), artemin, persephin, neurturin, GDNF, agrin, ephrin ligands EFNA1, EFNA2, EFNA3, EFNA4, EFNA5, EFNB1, EFNB2, EFNB3, adiponectin, α2-macroglobulin, aggrecan, agouti-related protein (AgRP), α-melanocyte stimulating hormone, albumin, ameloblastin, plasminogen, angiostatin, apolipoproteins A1, AII, B, B100, E, amyloid, autophagin, TGF-beta induced protein (Ig H3), biglycan, leukocyte cell-derived chemotaxin LECT2, C-reactive protein, complement components, chordin, chordin-like proteins, collectins, clusterin-like protein 1, cortisol, von Willebrand factor, cytostatins, endostatin, endorepellin, ephrin ligands, fetuins, ficolins, glucagon, granulysin, gremlin, HGF
activator inhibitors HAI-1 and 2, kallikreins, laminins, leptins, lipocalins, mannan binding lectins (MBL), meteorin, MFG-E8, macrophage galactose N-acetyl-galactosamine-specific lectin (MGL), midkine, myocilin, nestin, osteoblast-specific factor 2, osteopontin, osteocrin, osteoadherin, pentraxin, persephin, placenta growth factor, relaxins, resistin and resistin-like molecules, stem cell factor, stanniocalcins, VE-statin, substance P, tenascins, vitronectin, tissue factor, tissue factor pathway inhibitors, as well as any other of the >7000 proteins identified in the human secretome as listed in the secreted protein database (Chen Y et al., 2005. Nucleic Acids Res 33 Database Issue:D169-173), or any mimetic or analog thereof.
Additionally, the methods of the invention can be used to stabilise enzymes such as, for example, angiotensin converting enzymes (ACE), matrix metalloproteases, ADAM metalloproteases with thrombospondin type I motif (ADAMTS1, 4, 5, 13), aminopeptidases, beta-site APP-cleaving enzymes (BACE-1 and -2), chymase, kallikreins, reelin, serpins, or any mimetic or analog thereof.
The methods of the invention can be used to stabilise chemotherapeutic agents, and toxins such as small molecule toxins or enzymatically active toxins of bacterial, fungal, plant or animal origin, or fragments thereof, to increase potency of targeted compounds for therapeutic purposes, such as, for example, calicheamicin, Pseudomonas exotoxin, diphtheria toxin, ricin, saporin, apoptosis-inducing peptides or any analog thereof.
In other embodiments, the methods of the invention can be used to stabilise antigens for cancer vaccines such as, for example, the colorectal cancer antigen A33, α-fetoprotein, mucin 1 (MUC1), CDCP1, carcinoembryonic antigen cell adhesion molecules, Her-2, 3 and 4, mesothelin, CDCP1, NETO-1, NETO-2, syndecans, LewisY, CA-125, melanoma associated antigen (MAGE), tyrosinase, epithelial tumor antigen (ETA), among others, as well as for fusing viral envelope antigens or fungal antigens for treatment of infectious diseases.
In one embodiment, there is provided a modified polypeptide obtained according to a method as defined herein. Provided herein are also compositions comprising a modified polypeptide as disclosed herein. In one embodiment, there is provided a pharmaceutical composition comprising a modified polypeptide as defined herein. The pharmaceutical composition may comprise a pharmaceutically acceptable carrier. By “pharmaceutically acceptable carrier” is meant a pharmaceutical vehicle comprised of a material that is not biologically or otherwise undesirable, i.e., the material may be administered to a subject along with the selected active agent without causing any or a substantial adverse reaction. Carriers may include excipients and other additives such as diluents, detergents, coloring agents, wetting or emulsifying agents, pH buffering agents, preservatives, and the like. Representative pharmaceutically acceptable carriers include any and all solvents, dispersion media, coatings, surfactants, antioxidants, preservatives (e.g., antibacterial agents, antifungal agents), isotonic agents, absorption delaying agents, salts, preservatives, drugs, drug stabilizers, gels, binders, excipients, disintegration agents, lubricants, sweetening agents, flavoring agents, dyes, such like materials and combinations thereof, as would be known to one of ordinary skill in the art (see, for example, Remington's Pharmaceutical Sciences, 18th Ed.
Mack Printing Company, 1990, pp. 1289-1329, incorporated herein by reference). Except insofar as any conventional carrier is incompatible with the active ingredient(s), its use in the pharmaceutical compositions is contemplated. Provided herein are uses and methods of treating a disease. In one embodiment, there is provided a method of treating a disease in a subject, comprising administering a modified polypeptide as defined herein to the subject. Provided herein is also a modified polypeptide as defined herein for use in treating a disease. Also provided herein is the use of the modified polypeptide in the manufacture of a medicament for the treatment of a disease in a subject. The disease may, for example, be cancer, diabetes or an infectious disease. The term “treating” as used herein may refer to (1) preventing or delaying the appearance of one or more symptoms of the disorder; (2) inhibiting the development of the disorder or one or more symptoms of the disorder; (3) relieving the disorder, i.e., causing regression of the disorder or at least one or more symptoms of the disorder; and/or (4) causing a decrease in the severity of one or more symptoms of the disorder. The term “subject” as used throughout the specification is to be understood to mean a human or a domestic or companion animal. While it is particularly contemplated that the methods of the invention are for treatment of humans, they are also applicable to veterinary treatments, including treatment of companion animals such as dogs and cats, and domestic animals such as horses, cattle and sheep, or zoo animals such as primates, felids, canids, bovids, and ungulates. The “subject” may include a person, a patient or individual, and may be of any age or gender. The term “administering” refers to contacting, applying, injecting, transfusing or providing a composition of the present invention to a subject.
Also disclosed herein is a kit for modifying a polypeptide, the kit comprising a) an expression construct comprising a nucleic acid encoding a polypeptide comprising a three residue motif X1-X2-X3, wherein X1 is an aromatic amino acid and wherein X2 and X3 are each independently any amino acid and b) an expression construct comprising a nucleic acid that encodes an rSAM/SPASM (rSS) enzyme. Also disclosed herein is a kit for modifying a polypeptide, the kit comprising an expression construct comprising a nucleic acid encoding a polypeptide comprising a three residue motif X1-X2-X3, wherein X1 is an aromatic amino acid and wherein X2 and X3 are each any amino acid and a nucleic acid that encodes an rSAM/SPASM (rSS) enzyme. The term “nucleic acid” includes a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise limited, encompasses known analogues of natural nucleotides that hybridize to nucleic acids in a manner similar to naturally occurring nucleotides. The terms “nucleic acid”, “nucleic acid molecule”, “nucleic acid sequence”, “polynucleotide” etc. are used interchangeably herein unless the context indicates otherwise. As used herein, the terms “encode”, “encoding” and the like refer to the capacity of a nucleic acid to provide for another nucleic acid or a polypeptide. For example, a nucleic acid sequence is said to “encode” a polypeptide if it can be transcribed and/or translated to produce the polypeptide or if it can be processed into a form that can be transcribed and/or translated to produce the polypeptide. Such a nucleic acid sequence may include a coding sequence or both a coding sequence and a non-coding sequence.
Thus, the terms “encode”, “encoding” and the like include an RNA product resulting from transcription of a DNA molecule, a protein resulting from translation of an RNA molecule, a protein resulting from transcription of a DNA molecule to form an RNA product and the subsequent translation of the RNA product, or a protein resulting from transcription of a DNA molecule to provide an RNA product, processing of the RNA product to provide a processed RNA product (e.g., mRNA) and the subsequent translation of the processed RNA product. The term “construct” refers to a recombinant genetic molecule including one or more isolated nucleic acid sequences from different sources. Thus, constructs are chimeric molecules in which two or more nucleic acid sequences of different origin are assembled into a single nucleic acid molecule and include any construct that contains (1) nucleic acid sequences, including regulatory and coding sequences that are not found together in nature (i.e., at least one of the nucleotide sequences is heterologous with respect to at least one of its other nucleotide sequences), or (2) sequences encoding parts of functional RNA molecules or proteins not naturally adjoined, or (3) parts of promoters that are not naturally adjoined. Representative constructs include any recombinant nucleic acid molecule such as a plasmid, cosmid, virus, autonomously replicating polynucleotide molecule, phage, or linear or circular single stranded or double stranded DNA or RNA nucleic acid molecule, derived from any source, capable of genomic integration or autonomous replication, comprising a nucleic acid molecule where one or more nucleic acid molecules have been operably linked. Constructs of the present invention will generally include the necessary elements to direct expression of a nucleic acid sequence of interest that is also contained in the construct, such as, for example, a target nucleic acid sequence or a modulator nucleic acid sequence.
Such elements may include control elements such as a promoter that is operably linked to (so as to direct transcription of) the nucleic acid sequence of interest, and often include a polyadenylation sequence as well. Within certain embodiments of the invention, the construct may be contained within a vector. In addition to the components of the construct, the vector may include, for example, one or more selectable markers, one or more origins of replication, such as prokaryotic and eukaryotic origins, at least one multiple cloning site, and/or elements to facilitate stable integration of the construct into the genome of a host cell. Two or more constructs can be contained within a single nucleic acid molecule, such as a single vector, or can be contained within two or more separate nucleic acid molecules, such as two or more separate vectors. An “expression construct” generally includes at least a control sequence operably linked to a nucleotide sequence of interest. In this manner, for example, promoters in operable connection with the nucleotide sequences to be expressed are provided in expression constructs for expression in an organism or part thereof including a host cell. For the practice of the present invention, conventional compositions and methods for preparing and using constructs and host cells are well known to one skilled in the art, see for example, Molecular Cloning: A Laboratory Manual, 3rd edition Volumes 1, 2, and 3. J. F. Sambrook, D. W. Russell, and N. Irwin, Cold Spring Harbor Laboratory Press, 2000. By “control element” or “control sequence” is meant nucleic acid sequences (e.g., DNA) necessary for expression of an operably linked coding sequence in a particular host cell. The control sequences that are suitable for prokaryotic cells, for example, include a promoter, and optionally a cis-acting sequence such as an operator sequence and a ribosome binding site.
Control sequences that are suitable for eukaryotic cells include transcriptional control sequences such as promoters, polyadenylation signals, transcriptional enhancers, translational control sequences such as translational enhancers and internal ribosome binding sites (IRES), nucleic acid sequences that modulate mRNA stability, as well as targeting sequences that target a product encoded by a transcribed polynucleotide to an intracellular compartment within a cell or to the extracellular environment. Disclosed herein is a method of preparing a modified polypeptide library, the method comprising: a) providing a polypeptide library comprising polypeptides having a three residue motif X1-X2-X3, wherein X1 is an aromatic amino acid and wherein X2 and X3 are each independently any amino acid; b) contacting the polypeptide library with a rSAM/SPASM (rSS) enzyme for a sufficient time and under conditions to form a cyclophane group connecting X1 and X3 so as to generate a modified polypeptide library. The method may comprise expressing a polypeptide library in a host cell. The method may comprise co-expressing an rSS enzyme with the polypeptide library in the host cell. The polypeptides may be displayed in a polypeptide display system (such as phage or yeast display system). The method may comprise contacting the displayed polypeptides with an rSS enzyme. 
Disclosed herein is a method of selecting a modified peptide capable of binding to a ligand, the method comprising: a) providing a polypeptide library comprising polypeptides having a three residue motif X1-X2-X3, wherein X1 is an aromatic amino acid and wherein X2 and X3 are each independently any amino acid; b) contacting the polypeptide library with a rSAM/SPASM (rSS) enzyme for a sufficient time and under conditions to form a cyclophane group connecting X1 and X3 so as to generate a modified polypeptide library; c) contacting the modified polypeptide library with the ligand; and d) selecting the modified polypeptides that bind to the ligand. Throughout this specification and the statements which follow, unless the context requires otherwise, the word “comprise”, and variations such as “comprises” and “comprising”, will be understood to imply the inclusion of a stated integer or step or group of integers or steps but not the exclusion of any other integer or step or group of integers or steps. The reference in this specification to any prior publication (or information derived from it), or to any matter which is known, is not, and should not be taken as an acknowledgment or admission or any form of suggestion that that prior publication (or information derived from it) or known matter forms part of the common general knowledge in the field of endeavour to which this specification relates. Those skilled in the art will appreciate that the invention described herein is susceptible to variations and modifications other than those specifically described. It is to be understood that the invention includes all such variations and modifications which fall within the spirit and scope. The invention also includes all of the steps, features, compositions and compounds referred to or indicated in this specification, individually or collectively, and any and all combinations of any two or more of said steps or features.
Certain embodiments of the invention will now be described with reference to the following examples which are intended for the purpose of illustration only and are not intended to limit the scope of the generality hereinbefore described.

EXAMPLES

Materials and Methods

General experimental procedures. All chemicals and reagents were purchased from either Sigma (USA) or Bio Basic (Canada). Trypsin protease was purchased from Sigma (USA). Synthetic genes inserted into expression vectors were obtained from Gene Universal (USA) or Twist Bioscience (USA). Vectors for protein expression in E. coli, pACYCDuet-1 and pRSFDuet-1, were purchased from Life Technologies (USA). Antibiotics (chloramphenicol for pACYCDuet-1 and kanamycin for pRSFDuet-1) were used at a concentration of 25 µg/mL in solid (LB agar) and liquid medium (LB and TB medium). Electroporation was carried out using mode Ec2 (2.5 kV) on a Bio-Rad (USA) MicroPulser Electroporator. Escherichia coli BL21(DE3) purchased from NEB (USA) were used for protein expression. Ultra Yield™ flasks (500 mL or 2.5 L) from Thomson Instrument Company (USA) were used for protein expression. HisPur Ni-NTA resin was purchased from Thermo Scientific (USA). For desalting of proteins, GE Healthcare (USA) PD Minitrap G-25 columns were used. For tryptic digests, a Phenomenex (USA) Kinetex XB-C18, 2.6 µm, 150 x 4.6 mm column was used. For semi-preparative HPLC, a Phenomenex (USA) Kinetex XB-C18, 5 µm, 250 x 10 mm column was used. E. coli cells were lysed using a Fisherbrand Model 505 Sonic Dismembrator fitted with a FB4420 1/4” Microtip, FB4418 1/8” Microtip, or FB4219 1/2” solid probe. LC-MS experiments were performed on a Waters Acquity UPLC System coupled to a Waters Micromass Q-Tof Premier Mass Spectrometer (USA). HPLC grade solvents (water + 0.1% formic acid, water + 0.5% formic acid, acetonitrile + 0.1% formic acid, or 1:1 acetonitrile/isopropanol + 0.5% formic acid) were used for LC-MS.
NMR spectra were acquired using a Bruker (USA) 400 MHz Avance III or 600 MHz Avance III with a cryoprobe operating at 298 K. NMR solvents were purchased from Cambridge Isotope Labs (USA). Insert gene sequences. NHis-SUMO-XncA into pACYCDuet-1 at restriction sites NcoI/EcoRI: GCAGCAGCCATCATCACCACCATCACGGGTCCCTGCAGGACTCAGAAGTCAATCA AGAAGCTAAGCCAGAGGTCAAGCCAGAAGTCAAGCCTGAGACTCACATCAATTT AAAGGTGTCCGATGGATCTTCAGAGATCTTCTTCAAGATCAAAAAGACCACTCCT TTAAGAAGGCTGATGGAAGCGTTCGCTAAAAGACAGGGTAAGGAAATGGACTCC TTAACGTTCTTGTACGACGGTATTGAAATTCAAGCTGATCAGACCCCTGAAGATT TGGACATGGAGGATAACGATATTATTGAGGCTCACCGCGAACAGATTGGAGGTA TGAGCAAATTACAGCGTGAAATTGCAGCAAACAAAGCTCAACTGAGCCATGAAG ACAAGAAGAAAACGCAGCACAAAGAGCTTGTTGACAGCCTGCTGGATACTGTCT CTGGTGGTTGGATAAACGCTTTTGGAAACTGGGAGAGAGCCTTTCATTAA (SEQ ID NO: 28) NHis-SUMO-XncA-G(-2)K into pACYCDuet-1 at restriction sites NcoI/EcoRI: GCAGCAGCCATCATCACCACCATCACGGGTCCCTGCAGGACTCAGAAGTCAATCA AGAAGCTAAGCCAGAGGTCAAGCCAGAAGTCAAGCCTGAGACTCACATCAATTT AAAGGTGTCCGATGGATCTTCAGAGATCTTCTTCAAGATCAAAAAGACCACTCCT TTAAGAAGGCTGATGGAAGCGTTCGCTAAAAGACAGGGTAAGGAAATGGACTCC TTAACGTTCTTGTACGACGGTATTGAAATTCAAGCTGATCAGACCCCTGAAGATT TGGACATGGAGGATAACGATATTATTGAGGCTCACCGCGAACAGATTGGAGGTA TGAGCAAATTACAGCGTGAAATTGCAGCAAACAAAGCTCAACTGAGCCATGAAG ACAAGAAGAAAACGCAGCACAAAGAGCTTGTTGACAGCCTGCTGGATACTGTCT CTAAAGGTTGGATAAACGCTTTTGGAAACTGGGAGAGAGCCTTTCATTAA (SEQ ID NO: 29) XncB into pRSFDuet-1 at restriction sites NdeI/XhoI: ACGACATCAAAGAGTGAGAAGATCAAACATCTTGAGATCATTCTCAAAATTAGTG AACGATGCAATATCAATTGCTCCTATTGCTATGTATTCAATATGGGTAACTCACTG GCTACCGATAGTCCTCCGGTCATATCGCTTGATAACGTGCTGGCGTTGAGGGGAT TCTTTGAGCGCTCCGCAGCAGAAAACGAGATTGAAGTTATCCAAGTCGATTTTCA CGGTGGTGAACCACTGATGATGAAAAAAGACCGTTTCGATCAAATGTGTGACATT CTTCGGCAGGGTGACTATAGCGGTTCCCGGCTTGAATTAGCATTACAGACTAACG GTATTCTGATTGATGATGAATGGATTTCACTGTTTGAAAAACATAAAGTCCATGC CAGCATATCAATCGATGGACCAAAACATATCAATGACCGCTATCGGTTGGACCGA AAAGGAAAAAGCACTTACGAAGGAACAATTCACGGCTTGCGCATGCTCCAGAAT GCGTGGAAGCAAGGGCGACTCCCGGGAGAGCCCGGCATTCTCTCTGTGGCAAAC 
CCCACAGCGAATGGTGCAGAGATTTATCACCACTTTGCAAACGTCCTCAAATGTC AGCACTTCGATTTCCTCATACCCGACGCTCACCATGATGATGATATTGATGGCAT AGGTATTGGCAGATTCATGAATGAAGCGCTTGACGCATGGTTTGCTGACGGTCGG TCAGAGATTTTTGTTCGAATCTTTAACACATACCTTGGCACGATGCTAAGTAACCA GTTTTACCGGGTTATTGGCATGAGCGCGAATGTAGAATCTGCTTATGCTTTCACGG TAACTGCCGACGGCCTGCTCCGTATTGATGATACTTTGCGTTCCACCTCTGATGAA ATATTCAATGCCATTGGGCATCTCAGTGAATTGTCACTCTCCGGCGTACTCAATTC ACCTAATGTCAAAGAATATCTTTCACTAAATAGTGAACTGCCAAGTGATTGTGCA GATTGTGTGTGGAACAAAATCTGTCACGGTGGCCGCTTGGTCAATCGCTTTTCAC GGGCAAACCGTTTCAATAATAAAACCGTGTTCTGTTCATCAATGAGGCTTTTCCTT AGTCGCGCGGCTTCACACCTGATTACGGCTGGTATTGATGAAGAAACAATAATGA AAAATATTCAGAAATAG (SEQ ID NO: 30)

Transformation of E. coli BL21(DE3) with plasmids. The plasmids were dissolved in MilliQ grade water to a final concentration of 25 ng/µL. For precursor only expression, 1 µL of plasmid DNA from precursor (NHis-SUMO-XncA or NHis-SUMO-XncA-G(-2)K) was added to 70 µL of E. coli BL21(DE3) electrocompetent cells. The E. coli BL21(DE3) cells were transformed in 2 mm electroporation cuvettes using the settings described above. For coexpression of precursor + rSAM enzyme, 1 µL of plasmid DNA from precursor and 1 µL of plasmid DNA from rSAM enzyme (XncB) were added to 70 µL of E. coli BL21(DE3) electrocompetent cells. The transformed cells were then grown overnight at 37 °C on lysogeny broth (LB) agar supplemented with appropriate antibiotics (chloramphenicol for cells harboring the precursor plasmids only and kanamycin plus chloramphenicol for cells harboring both the precursor and rSAM enzyme plasmids) at a final concentration of 25 µg/mL for each antibiotic added.

Protein expression and purification. A 50 mL Falcon tube containing 10 mL of LB medium supplemented with appropriate antibiotics was inoculated with a colony from the transformation above. The 10 mL culture was grown overnight at 37 °C at 200 rpm.
The overnight culture was used to inoculate either 200 mL of TB medium in a 500 mL Ultra Yield™ flask or 1 L of TB medium in a 2.5 L Ultra Yield™ flask at a ratio of 1:100 (v:v), each containing the appropriate antibiotics. The cells were then grown at 37 °C and 200 rpm until the OD600 reached 1.6-2.5. The culture was placed on ice for 30 min and then induced by addition of IPTG to a concentration of 0.8 mM. After induction, the culture was shaken at 16 °C and 200 rpm for 16 h. The cells were collected by centrifugation (6000 rcf, 20 min). Denaturing lysis buffer was added to the cell pellets at a ratio of 3:1 (v:w). The cell pellets were resuspended and lysed by sonication using a ¾” solid probe (10 seconds on and 10 seconds off for 30 cycles at 25% amplitude). After sonication, the cell debris was removed by centrifugation (16,000 rcf, 15 min). HisPur Ni-NTA resin (0.7 mL) was added to ~15-20 mL of cleared lysate in a 50 mL Falcon tube and gently shaken for 30 min to allow binding of the precursor peptide to the Ni-NTA resin. The peptide-bound Ni-NTA resin was then washed with denaturing wash buffer (2 mL for 0.7 mL resin) and NPI-20 (5 x 1 mL for 0.7 mL resin), and eluted with NPI-250 (2.5 mL for 0.7 mL resin). Elution fractions were desalted into 50 mM Tris buffer pH 8.0 using a PD Minitrap G-10 column, and then digested with trypsin (1:100, precursor/trypsin w:w) for 16 h.

Semi-preparative HPLC. The trypsin digested peptides were freeze dried, resuspended in DI water, and subjected to reversed phase semi-preparative HPLC using a flow rate of 5 mL/min, a column temperature of 60 °C, and a linear gradient from 100% water + 0.1% trifluoroacetic acid to 27.5% isopropanol/acetonitrile (1:1) + 0.1% trifluoroacetic acid in water + 0.1% trifluoroacetic acid over 17 minutes. Fractions containing fragment 3, as judged by LC-MS, were collected and freeze dried. Dried samples were dissolved in DMSO-d6 for NMR analysis.
Example 1

The invention is based upon an efficient protocol for interrogating rSAM sequence-function space in the TIGRFAM database for suitable targets. The inventors were intrigued by a number of uncharacterized rSS proteins annotated as putative maturases within the TIGRFAM database. Several of these proteins have been assigned as part of a maturase system that is minimally composed of a substrate precursor (A) and an rSS protein (B). They initially focused on a single maturase system based on the following criteria: the precursor sequences do not contain known motifs for previously characterized rSS maturases, and the predicted core peptides are void of Cys residues, which are characteristic of sactipeptides. Based on the diversity of reactions catalyzed by rSS proteins, it is not possible to predict the transformation, and the inventors turned toward functional studies in a heterologous host, Escherichia coli.

Identification of rSAM maturases encoding novel posttranslational modifications

Spliceases and cyclophane forming enzymes from the rSAM superfamily belong to a subfamily known as SPASM domain containing proteins (referred to as SPASM proteins). In addition to the canonical rSAM binding domain responsible for homolytic cleavage of S-adenosylmethionine, they contain either one or two C-terminal Cys-rich domain(s) (PF13186, TIGR04085) which bind auxiliary [4Fe-4S] cluster(s). Intriguingly, several uncharacterized SPASM proteins were assigned as part of maturase systems (biosynthetic gene clusters) in the TIGRFAM database. These gene clusters encode a substrate precursor peptide (designated as A) which is proposed to be modified by a SPASM protein (designated as B). Maturase systems may also contain proteases and transport proteins to cleave and export final products outside of the cell.
The investigations were initiated on the radical SAM/SPASM maturase system Xye (GenProp1090) that encodes a precursor with features which led to the belief that novel posttranslational modifications may occur (Fig 1a). The predicted core peptide sequences did not contain motifs modified by previously characterized maturase or SPASM proteins and are void of Cys residues, which are prevalent in thioether bridged sactipeptides. Based on the diversity of reactions catalyzed by SPASM proteins, it was not possible to predict the transformation, and the inventors turned toward functional studies in Escherichia coli.

Example 2

The XYE maturase system encodes a cyclophane forming enzyme.

The XYE maturase system (GenProp1090) occurs in bacterial strains of the genera Xenorhabdus, Yersinia, and Erwinia (Fig. 1a). The substrate precursors are collectively referred to as XyeA (TIGR04495, putative rSAM-modified RiPP) and the SPASM protein as XyeB (TIGR04496, radical SAM/SPASM domain peptide maturase). The XyeA precursor peptides are ~50 amino acids in length, with a GG motif predicted to separate the posttranslational modifying enzyme recognition sequence (N-terminal leader peptide) from the target sequence (C-terminal core peptide). The core peptides are composed of 12-14 amino acids, which have been assigned positive numbers starting from the first residue of the predicted core sequence (following the GG motif). Corresponding three-letter names were given for specific gene clusters within the Xye maturase system. To initiate functional studies, the xnc cluster from Xenorhabdus nematophila F1 was selected. The xncA precursor gene was encoded with an N-terminal 6 x Histidine (NHis6)/SUMO (affinity/solubility) tag and coexpressed with xncB in E. coli BL21(DE3). Proteins were purified using nickel-affinity chromatography and the resulting eluates desalted and digested with trypsin.
NHis6-SUMO-XncA only samples were analyzed by comparative liquid chromatography-mass spectrometry (LC-MS) in parallel with NHis6-SUMO-XncA + XncB samples to identify differences resulting from posttranslational modifications. The disappearance of the C-terminal fragment (1, residues -13 to +10, Fig. 2a) when compared to precursor only expression was observed. Although the appearance of a new chromatographic peak was not obvious, extracted ion chromatograms of a three-minute time window (tR = 9-12 min) from the LC-MS data revealed a new mass peak ([M+4H]4+, m/z 976.44) corresponding to fragment 2 (residues -13 to +13) with a concomitant mass loss of 6 Da (Fig. 2a). Subsequent tandem mass spectrometry (MS/MS) of this modified fragment showed -2 Da mass shifts localized to the WIN, FGN, and WER motifs within the predicted core peptide. Fragmentation within each of the three-residue motifs was not observed, indicating that cyclization within these motifs may have occurred. A similar analysis of coexpression from NHis6-SUMO-YkcA + YkcB and NHis6-SUMO-EtcA + EtcB proteins revealed mass losses of 2 Da at WIK/FGN/WSR and WIN/FAN/WTK, respectively. The inventors next turned toward structure determination of the NHis6-SUMO-XncA + XncB product. To obtain a smaller peptide fragment more suitable for NMR analysis, a Gly(-2) to Lys mutant (Fig 2b) was used. A similar protein expression and purification on larger scale followed by semi-preparative high-performance liquid chromatography (HPLC) gave fragment 3, which showed the same -2 Da modifications as the nonmutated precursor. Analysis of 2D NMR data from fragment 3 (COSY, HSQC, HMBC and NOESY) in DMSO-d6 revealed key correlations that allowed assignment of the newly formed bonds (Fig 2c). The β-positions of the third residue within each of the motifs (Asn3-Cβ, Asn7-Cβ, and Arg10-Cβ) were now substituted methines.
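By way of illustration only, the mass arithmetic behind the observed shifts can be sketched as follows. The sketch assumes that each crosslink corresponds to the net loss of two hydrogen atoms (the -2 Da shift per motif described above) and uses standard monoisotopic masses; the function names are hypothetical and do not form part of the methods described herein.

```python
# Minimal sketch (assumption: each C-C crosslink removes two hydrogen atoms).
H_ATOM = 1.007825   # monoisotopic mass of a hydrogen atom, Da
PROTON = 1.007276   # mass of a proton, Da


def crosslink_mass_shift(n_crosslinks: int) -> float:
    """Change in neutral monoisotopic mass after n crosslinks (negative = loss)."""
    return -2.0 * H_ATOM * n_crosslinks


def mz_shift(n_crosslinks: int, charge: int) -> float:
    """Change in m/z seen for an ion of the given positive charge state."""
    return crosslink_mass_shift(n_crosslinks) / charge


def neutral_mass_from_mz(mz: float, charge: int) -> float:
    """Neutral mass M of an [M+zH]z+ ion observed at the given m/z."""
    return charge * (mz - PROTON)


if __name__ == "__main__":
    # Three modified motifs (WIN, FGN, WER) in one fragment.
    print(f"neutral mass change: {crosslink_mass_shift(3):.3f} Da")
    print(f"m/z change at 4+:    {mz_shift(3, 4):.3f}")
    print(f"neutral mass of the 4+ ion at m/z 976.44: "
          f"{neutral_mass_from_mz(976.44, 4):.2f} Da")
```

Under these assumptions, three crosslinks give a neutral mass loss of about 6 Da, which at a 4+ charge state appears as a shift of about 1.5 m/z units.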
Key correlations were observed to show that new carbon-carbon bonds had formed between Trp1-C6 and Asn3-Cβ, Phe5-C4 and Asn7-Cβ, and Trp8-C6 and Arg10-Cβ. The assignment of the carbon-carbon bonds forming three-residue macrocycles establishes XncB as a 3-residue cyclophane forming enzyme (3-CyFE). The general motif is defined as X1-X2-X3 where X1 is an aromatic amino acid (Trp or Phe for XyeA precursor peptides). The Xnc product (3) appears similar to the end product because there is a single identifiable modifying enzyme (XncB), and the presence of proteases and a Gly-Gly motif suggests cleavage at this site. Additional paired coexpressions from Yersinia kristensenii (NHis6-SUMO-YkcA + YkcB) and Erwinia toletana (NHis6-SUMO-EtcA + EtcB) revealed mass losses of 2 Da at similar three-residue motifs, WIK/FGN/WSR and WIN/FAN/WTK, respectively.

Three-residue indole containing macrocycles have planar chirality

There are several features of the cyclophane forming reaction that deserve further comment. Single product formation by HPLC (analytical and semi-preparative) and NMR was observed, which suggests that the three newly formed C-C bonds are installed with stereochemical fidelity at Cβ. During the structure determination of the Xnc product (3) it was apparent that the strained nature of the p-cyclophane macrocycle leads to restricted rotation of the phenyl ring, as distinct chemical shifts could be observed for the four aromatic protons. This observation is consistent with other natural and synthetic constrained p-cyclophane macrocycles. With the restricted rotation of the phenyl ring in mind, it is hypothesized that the asymmetric nature of the indole cyclophane bridges coupled with restricted rotation would induce a distinct type of planar chirality where the chiral plane is the substituted indole. To probe this hypothesis, the NOESY correlations within the macrocycle were analyzed to assign the conformation and relative orientation of the indole ring.
The stereochemistry at the Cα-positions is consistent with either all L- or all D-configuration. The former was chosen as the more plausible scenario based on precedent for cyclophane formation to occur by abstraction of hydrogen from Cβ and retention of configuration at the Cα positions. The newly formed stereocenters were assigned the S-configuration based on large vicinal coupling to Hα and NOE correlations which supported an approximate anti orientation of Hα and Hβ. Further, the observed NOE correlations within the WIN and WER macrocycles shown in Fig. 2d strongly suggest that the cyclophanes have planar chirality and adopt the 6Sp configuration. Three-residue cyclophane formation represents a new scaffolding type in peptide cyclization chemistry and defines a new family of RiPP natural products, which are given the name triceptides (three-residue in cyclophane peptides).

Example 3

X1-X2-X3 motif guided genome mining yields cyclophane products from cyanobacteria

With the characterization of selected Xye maturase proteins in hand, precursor peptides encoded with similar three residue motifs were next searched for in the genomes of other prokaryotes. Similar motifs were found in precursor peptides from the Glycine-rich repeat (Grr) maturase system (GenProp1037, Fig 1b). In this system, precursors and corresponding SPASM proteins are generally referred to as GrrA (TIGR04260, rSAM-associated Gly-rich repeat protein) and GrrM (TIGR04261, radical SAM/SPASM domain protein), respectively. The GrrA precursors are characterized by a Gly-rich C-terminal sequence that marks the start of the putative core peptide. It was noticed that embedded in the glycine-rich region were three-residue X1-X2-X3 motifs, some of which were identical to those modified in XyeA precursors (i.e. WIN, FAN, and FGN). In contrast to XyeA type precursors, GrrAs contain particularly long core sequences (up to ~100 amino acids) with greater spatial distances between three-residue motifs.
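For illustration only, a motif-guided search of this kind can be sketched as a simple scan of a peptide sequence for three-residue windows that begin with an aromatic residue. The choice of Phe, Trp and Tyr as the default aromatic set, the function name, and the example core sequence (chosen to contain the WIN, FGN and WER motifs discussed above) are assumptions made for this sketch and are not the search procedure actually used by the inventors.

```python
import re


def find_x1x2x3_motifs(seq: str, aromatic: str = "FWY"):
    """Return (1-based position of X1, motif) for every candidate X1-X2-X3 window.

    X1 must be one of the residues in `aromatic`; X2 and X3 may be any residue.
    A lookahead is used so that overlapping candidate windows are all reported.
    """
    pattern = re.compile(rf"(?=([{aromatic}]..))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    # Hypothetical 13-residue core peptide used only as an example input.
    core = "WINAFGNWERAFH"
    for position, motif in find_x1x2x3_motifs(core):
        print(position, motif)
```

The scan only lists candidate sites; whether a given window is actually modified depends on the enzyme and must be established experimentally, for example by the LC-MS/MS analysis described herein.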
Gene clusters from Oscillatoriales cyanobacterium CG2_30_40_61 (osc), Lyngbya sp. PCC 8106 (lsc), and Geminocystis sp. NIES-3709 (gsc) were selected for functional studies. Four precursors were selected and encoded with NHis6-affinity tags (NHis6-OscA2, NHis6-LscA, NHis6-GscA1 and NHis6-GscA2), and only when they were coexpressed with their cognate SPASM proteins (OscB, LscB, and GscB) were fragments detected from trypsin digests of purified precursors (Fig. 3a-c). Detailed analysis of trypsin digests from the coexpression samples was carried out and -2 Da modifications to X1-X2-X3 motifs were localised. The NHis6-OscA1 precursor was selected for further characterization since both Phe and Trp residues were modified. Although modification at the WGN (21-23) and FIN (43-45) motifs could be clearly detected, a complex mixture of additional products that were likely formed at the other X1-X2-X3 motifs was observed. To obtain a cleaner reaction mixture, a precursor was engineered by fusion of the peptide fragments (residues 16-28 and 43-47) where modification was detected. In addition, Trp27 was mutated to Gly in the engineered sequence as this was also a potential site of modification (Fig 3d). The coexpression of the engineered precursor NHis6-OscA1-(16-28 W27G)-(43-47) with OscB led to better conversion to products 7 and 9 and a cleaner product mixture, and was more suitable for obtaining product fragments for NMR (Fig 3e-f). Larger scale protein expression and purification of the tryptic digest by semi-preparative HPLC yielded fragments 7 and 9, for which 2D NMR data sets were obtained. Analysis of the NMR data confirmed the formation of cyclophanes within the two fragments. While the Xnc product (3) was substituted at C6, the C-C bond in the Osc product (9) occurs between Trp21-C7 and Asn23-Cβ. Conformational analysis using NOESY correlations showed that 7 and 9 also display restricted rotation of the aromatic rings.
The indole in product 9 is oriented with N1 on the same face as Trp-Hα and Asn-Hα, and enrichment of the 7Sp-isomer was assigned. The C-terminal product 7 was assigned similarly to the FGN motif in 3 as the paracyclophane. Since fragment 5 contained a single aromatic residue (Phe), the signals in the aromatic region of the 1H NMR spectrum acquired in D2O were clearly observed (Fig 3h). Several X1-X2-X3 motifs appear in the OscA2 sequence; however, fragments containing residues 1-15 and 29-42 were not easily detected. It was hypothesized that if WRN motifs were modified, this would prevent cleavage by trypsin and result in larger fragments that may elude detection. To facilitate identification of modifications at WRN motifs, a precursor with additional trypsin cleavage sites (G18K, G25K, and G31K) was engineered. Coexpression of this construct with OscB showed that modified fragments could be detected for the remaining X1-X2-X3 motifs and that OscB does not show selectivity for certain motifs (Fig 3i). To further support that all motifs on the core peptide of GrrAs are modified, modification was detected on all motifs of NHis6-GscA2 by GscB and on most motifs for an engineered version of NHis6-LscA + LscB. Only one modified motif could be detected for GscA1, which is attributed to low expression of this construct. It is of note that three-residue cyclophanes with Arg or Lys at the X2, X3, and X4 positions were protected from cleavage by trypsin. Collectively, these results demonstrate that Grr maturases (TIGR04261) also encode 3-CyFEs and represent additional members of the triceptide family. The Grr gene clusters often encode a protein annotated as an ABC solute binding protein, presumably used for transport. In contrast to XyeAs, which contain a Gly-Gly motif predicted to be the cleavage site, GrrAs contain an abundance of Gly residues, making it difficult to predict where or if the precursor is cleaved.
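For illustration only, the effect of trypsin protection on the fragments expected from a digest can be sketched with a simplified in-silico digest. The sketch assumes the common rule that trypsin cleaves C-terminal to Lys or Arg except when the next residue is Pro, and models protection simply as a set of residue positions at which cleavage is blocked; the function name, the protection model, and the example sequence are hypothetical.

```python
def tryptic_fragments(seq: str, protected=frozenset()):
    """Split `seq` after each K or R (not before P), skipping protected sites.

    `protected` holds 1-based positions of K/R residues at which cleavage is
    assumed to be blocked, e.g. residues inside a three-residue cyclophane.
    """
    fragments = []
    start = 0
    for i, residue in enumerate(seq):
        position = i + 1
        is_last = position == len(seq)
        next_is_pro = (not is_last) and seq[i + 1] == "P"
        if residue in "KR" and not is_last and not next_is_pro \
                and position not in protected:
            fragments.append(seq[start:position])
            start = position
    fragments.append(seq[start:])
    return fragments


if __name__ == "__main__":
    core = "WINAFGNWERAFH"  # hypothetical example sequence
    # Unmodified peptide: cleavage after Arg10.
    print(tryptic_fragments(core))
    # Arg10 lies within a modified WER motif and is treated as protected.
    print(tryptic_fragments(core, protected={10}))
```

In this simplified model a protected Lys or Arg yields a longer fragment, which is consistent with the rationale given above for introducing additional cleavage sites (G18K, G25K, and G31K) outside the motifs.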
Example 4

Analysis of sequence-function space expands the 3-CyFE family to actinobacteria

Functional validation and characterization of products from the XyeB and GrrM maturase families allowed an uncharacterized region of SPASM protein sequence-function space to be defined. The inventors were then in a suitable position to leverage this knowledge to expand the 3-CyFE family. These efforts were initiated by generating a protein sequence similarity network using the EFI-EST tool with selected InterPro families that correspond to TIGRFAMs with characterized members. The network was visualized in Cytoscape and showed that the XyeB and GrrM protein families inhabit a distinct region in SPASM protein sequence-function space. Although this collection represents a small percentage of SPASM protein families, it was generally observed that each protein family grouped together, with the exception of the six cysteines in forty-five residues (SCIFF) and quinohemoprotein (Qhp) maturases. This initial sequence similarity network demonstrated the potential for use in mapping additional SPASM proteins of unknown function to expand the 3-CyFE family. To test this hypothesis, a third maturase system (GenProp1068, Fxs maturase system) composed of precursors (TIGR04268, FXSXX-COOH protein) that contain a characteristic FXSXX motif at the C-terminus was explored. The maturase proteins (TIGR04269, radical SAM/SPASM domain protein, FxsB family) contain an N-terminal rSAM/SPASM domain and an additional C-terminal HEXXH domain (TIGR04267, HEXXH motif domain), the latter of which is suggestive of metalloprotease activity. When FxsB protein sequences (IPR026335) were combined into the protein sequence similarity network (Fig 4b), nodes for FxsB proteins were found to cluster adjacent to the GrrM family, while nodes for XyeB were either at the border of or within the FxsB family cluster. It was also evident that QhpD proteins, which catalyze thioether bond formation, clustered adjacent to FxsB proteins.
This prompted functional studies to be carried out. One of the FxsB seed sequences from the TIGRFAM database located in a gene cluster (msc) from Micromonospora sp. L5 was selected. The protein sequence for MscB was analyzed using InterPro to generate a truncated variant composed of the N-terminal 375 residues (mscB-375) to avoid unpredictable transformations by the C-terminal HExxH domain (Fig. 5a). Comparative analysis of tryptic digests of NHis6-MscA expressed alone and NHis6-MscA + MscB-375 showed that the C-terminal fragment (10) was converted to a major product (11), which showed a -2 Da mass loss localized to the FSS motif (Fig 5b). Preparative purification and NMR analysis of the major product (11) confirmed that the rSAM/SPASM domain of MscB catalyzes p-cyclophane formation analogous to XncB and OscB. The FxsB maturase system extends the 3-CyFE family to a third subfamily and represents both an entry point to study a potential bifunctional rSAM enzyme and an example of sequence-function space expansion.

Configurational assignment of the para-cyclophane 11 from MscB-375
It is proposed that formation of the new C(sp2)-C(sp3) bond by 3-CyFEs occurs without epimerization at the Cα positions. To confirm this hypothesis as well as to assign the configuration at the newly formed stereocenter at Cβ, a degradative method for product 11 was pursued. These efforts were initiated by acid hydrolysis and Marfey’s type derivatization with L-FDVA, which confirmed the L-configuration of the amino acids not involved in the newly formed bond (L-Ser, L-Phe, and L-Ile). In addition, the fragment (12) containing the crosslink between Phe1 and Ser3 could be detected. To obviate the need for synthesis of the 8 possible stereoisomers of 12, a Ser to Ala variant (NHis6-MscA-S3A) was tested as a substrate for MscB-375. The resulting cyclized product would not contain a stereocenter at the Cβ-position and upon hydrolysis would possess C2-symmetry (L,L or D,D) or would be an optically inactive meso-form (L,D or D,L which are indistinguishable). NHis6-MscA-S3A was coexpressed with MscB-375 and cyclization was detected and localized by LC-MSMS. Following purification, hydrolysis, and Marfey’s type analysis, the derivatized fragment (13) was detected. The inventors then proceeded with synthesis of the three possible diastereomers. The optically enriched (L, L)-isomer was prepared from known N-phthaloyl-L-Ala-aminoquinoline (14)48 and protected L-4-iodo-Phe (15). The key C-C bond was formed by the β-C–H monoarylation procedure using palladium acetate in the presence of silver tetrafluoroborate to give 16. Global deprotection (6N HCl, 110°C) and derivatization with L-FDVA gave the (L, L)-configured standard (17). An analogous sequence using protected N-Boc-4-iodo-D-Phe-CO2tBu provided the corresponding diastereomer 18. Hydrolysis of (L, L)-16 and derivatization with D-FDVA gave compound 19, which served as the equivalent (D, D)-isomer by reversed phase HPLC.
Comparison of the three standards with 13 showed a match in retention time to 17, thus confirming the L-configuration at the Cα positions. Next, acetoxylation of 16 was carried out to obtain major and minor diastereomers (20 and 21, respectively) in a ratio of 5:1. The two isomers were separated and subjected to acid hydrolysis to give (S)-22 and (R)-23, which were compared to 24 by NMR spectroscopy. While the 1H NMR spectrum for (R)-23 showed significant differences, the spectrum for (S)-22 was superimposable with 24, securing the assignment of (S)-configuration at the newly formed stereocenter (Fig 6f).

Production of a triceptide natural product

Initial objectives were focused on revealing the function of the Xye, Grr, and Fxs maturases, which collectively represent a new RiPP family. These studies led to the question of whether an end product could be identified to demonstrate the authenticity of one of these pathways. A gene cluster was selected for which 1) a fully modified core sequence could be obtained and 2) the penultimate cleavage by a protease could be predicted. The xnc cluster matched these criteria because clean formation of a 3x modified product on existing X1-X2-X3 motifs was observed, and the precursor contains a Gly-Gly motif predicted as the site of cleavage. The xnc gene cluster encodes a characteristic protease/transporter (xncE) predicted to cleave at the Gly-Gly motif and export the product out of the cell. Two strategies were explored in parallel for detection of the natural product: full cluster expression and exchange to an inducible promoter to control production in the host strain. The full cluster (xncABCDE) was cloned with the addition of an NHis6-tag on the precursor for ease of detecting the leader peptide. This construct was transformed into E. coli BL21(DE3) and expressed in LB medium.
Ni-affinity purification as before showed that the cleaved leader (up to the Gly-Gly motif) could be detected from the xncABCDE construct but not from expressions of xncABC or xncAB. Extraction of the supernatant by polymeric solid phase extraction followed by LC-MS allowed the detection of the 3x modified and cleaved product (25) for the induced full cluster (xncABCDE) expression, but this product could not be detected from induced expression of xncABC or xncAB (Fig. 7c). In a second strategy, an L-arabinose inducible promoter was installed in the wild type strain using the pCEP vector. The PBAD promoter was integrated into the genome of Xenorhabdus nematophila DSM3370. The 3x modified and cleaved product (26) was detected only in the supernatant of an induced culture of the strain modified to contain the PBAD promoter just upstream of the xnc cluster (Fig. 7c). This represents production of the first natural product from a pathway containing a 3-CyFE, which is named xenorceptide. The two products (25 and 26) showed identical retention time and MSMS data to the 3x modified product (27) resulting from coexpression of the NHis6-SUMO-XncA-(G-1K) variant + XncB (cleaved with trypsin) (Fig. 7c-d). The structure of xenorceptide is proposed to be that of Xnc product 3 without the N-terminal Gly residue. The experiments above outline three strategies by which additional triceptide natural products may be obtained.

Definition, prevalence and distribution of triceptide gene clusters

The triceptide natural product family is tentatively defined as originating from gene clusters containing a SPASM protein that catalyzes three-residue cyclophane formation (3-CyFE) through a single sp2-sp3 C-C bond on a precursor containing a C-terminal core peptide containing at least one X1-X2-X3 motif (X1 = any aromatic amino acid). An overview of bacterial genomes that encode triceptide gene clusters can be obtained from the precursor sequences in the InterPro database.
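The tentative definition above can be turned into a simple sequence scan. The following is a minimal sketch, not the inventors' bioinformatic workflow: it lists every three-residue window whose first residue is aromatic (taken here as W, F, Y or H, as recited in claim 3), optionally requires a non-aromatic X3 (claim 5), and estimates the mass change for a given number of crosslinks by assuming that each cyclophane costs two hydrogen atoms, consistent with the -2 Da loss reported for the FSS motif. The helper names and the example call are hypothetical.

```python
from typing import List, Tuple

AROMATIC = set("WFYH")   # X1 options recited in claim 3 (assumed definition of "aromatic")
H_MONO = 1.00782503      # monoisotopic mass of a hydrogen atom, in Da


def find_motifs(seq: str, x3_non_aromatic: bool = True) -> List[Tuple[int, str]]:
    """Return (1-based start position, motif) for each candidate X1-X2-X3 window."""
    hits = []
    for i in range(len(seq) - 2):
        x1, x3 = seq[i], seq[i + 2]
        if x1 not in AROMATIC:
            continue
        if x3_non_aromatic and x3 in AROMATIC:
            continue
        hits.append((i + 1, seq[i:i + 3]))
    return hits


def crosslink_mass_shift(n_crosslinks: int) -> float:
    """Expected mass change in Da, assuming each crosslink removes two H atoms."""
    return -2.0 * H_MONO * n_crosslinks


# Example: SEQ ID NO: 14 from claim 15 with X13 = G.
motifs = find_motifs("ELVDSLLDTVSGGWINAFGNWERAFH")
print(motifs)                              # [(14, 'WIN'), (18, 'FGN'), (21, 'WER')]
print(crosslink_mass_shift(len(motifs)))   # about -6.05 Da if all three are modified
```

A scan of this kind only flags candidate motifs; whether a given motif is actually modified depends on the enzyme and precursor, which has to be established experimentally as described above.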
The Xye maturase system is the smallest group (n = 28 precursors) and is prevalent in gram-negative bacteria of the order Enterobacteriales (Table 3). The majority have been identified in Yersinia enterocolitica and Yersinia kristensenii, which are known pathogens found in livestock and humans. The Grr maturase systems are the second most abundant (IPR026356, n = 200 proteins matched) and are almost exclusively found in cyanobacteria (n = 179, Table 4). These gene clusters are broadly distributed but are particularly prevalent in isolates of the genera Synechococcus (n = 37) and Microcystis (n = 30). The third and largest is the Fxs maturase system (IPR026334, n = 200 with FxS motif), which is broadly distributed among actinobacteria (Table 5). These clusters are most prevalent in the genus Streptomyces (n = 259) and can be found in many model strains that have been intensively studied for their secondary metabolites. These analyses demonstrate that triceptide gene clusters are widespread in bacterial genomes and likely confer an as yet unknown but important function.

Example 5

The present invention may be used for the de novo creation of therapeutics by display methods. This includes the use of yeast, phage, or mRNA display to display a library of minimal polypeptide sequences (for example, the C-terminal half of a precursor sequence). The library of minimal polypeptide sequences may be generated by standard molecular biology techniques such as error prone PCR of a DNA scaffold sequence followed by expression of the polypeptide library. The minimal polypeptide sequences may be modified with an rSS enzyme of the present invention to obtain novel polypeptide sequences, which are then screened for a desired activity such as binding to a therapeutic target. The present invention may also be used for rational drug design to develop new therapeutic polypeptides based on a similar scaffold. Finally, existing bioactive peptides or molecules may be modified.
This may include the addition of the cyclophane ring to antimicrobial peptides for instance.
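As a rough indication of the motif diversity such a library could sample, the following sketch enumerates every X1-X2-X3 combination drawn from the residue sets recited in claims 3, 4 and 6. This is a purely in-silico enumeration offered as an assumption-laden illustration; it is not the error prone PCR approach mentioned above, and the variable names are hypothetical.

```python
from itertools import product

X1_OPTIONS = "WFYH"           # claim 3
X2_OPTIONS = "IGEYVLADSTNQ"   # claim 4
X3_OPTIONS = "NRSDK"          # claim 6

# Every three-residue motif that can be built from the recited residue sets.
motif_library = ["".join(combo) for combo in product(X1_OPTIONS, X2_OPTIONS, X3_OPTIONS)]

print(len(motif_library))     # 4 * 12 * 5 = 240 motifs
print(motif_library[:5])
```

The motifs discussed in the description (for example WIN, FGN, WER and FSS) all fall within this enumerated set.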

Claims

1. A method of modifying a polypeptide, the method comprising the steps of:
a) providing a polypeptide comprising a three residue motif X1-X2-X3, wherein X1 is an aromatic amino acid and wherein X2 and X3 are each independently any amino acid; and
b) contacting the polypeptide with a rSAM/SPASM (rSS) enzyme for a sufficient time and under conditions to modify the polypeptide to form a cyclophane group connecting X1 and X3.
2. The method of claim 1, wherein the polypeptide is modified to form a cyclophane group within the three residue motif with a carbon to carbon bond between X1 and X3.
3. The method of claim 1, wherein X1 is W, F, Y or H.
4. The method of claim 1, wherein X2 is I, G, E, Y, V, L, A, D, S, T, N or Q.
5. The method of claim 1, wherein X3 is a non-aromatic amino acid.
6. The method of claim 5, wherein X3 is N, R, S, D or K.
7. The method of claim 1, wherein the method is performed under anaerobic or oxygen-free conditions.
8. The method of claim 1, wherein the polypeptide is a linear polypeptide.
9. The method of claim 1, wherein the rSS enzyme is an rSS enzyme of the Xenorhabdus, Yersinia and Erwinia (XYE) maturase system (XyeB), Glycine-rich repeat (Grr) maturase system (GrrM) or the Fxs maturase system (FxsB).
10. The method of claim 1, wherein the rSS enzyme is a XyeB, GrrM or FxsB rSS enzyme from a bacterial genus listed in Tables 3-5.
11. The method of claim 1, wherein the rSS enzyme is a recombinant enzyme or is isolated from bacteria.
12. The method of claim 1, wherein the method comprises co-expressing the polypeptide and the rSS enzyme in a host cell such that the polypeptide contacts the rSS enzyme for a sufficient time and under conditions to modify the polypeptide in the host cell.
13. The method of claim 1, wherein the polypeptide comprises WX4X5X6X7X8X9X10X11X12 (SEQ ID NO: 44), wherein X4 is I, D, L or V, wherein X5 is N, K or R, wherein X6 is A, F or V, wherein X7 is F or Y, wherein X8 is A, G, L, S or V, wherein X9 is N, K or R, wherein X10 is W or F, wherein X11 is D, E, G, N, P, S or T and wherein X12 is K or R.
14. The method of claim 13, wherein the polypeptide comprises WIX5AFX8NWX11X12, (SEQ ID NO: 13) wherein X5 is N or K, wherein X8 is G or A, wherein X11 is E, S or T and wherein X12 is R or K.
15. The method of claim 1, wherein the polypeptide comprises a sequence having at least 80% identity to a sequence of:
ELVDSLLDTVSX13GWINAFGNWERAFH, wherein X13 is G or K (SEQ ID NO: 14).
16. The method of claim 1, wherein the polypeptide comprises a leader sequence having at least 80% identity to a sequence of:
MSKLQREIAANKAQLSHEDKKKTQHK (SEQ ID NO: 25).
17. The method of claim 1, wherein the polypeptide comprises an affinity tag (such as a hexa-histidine sequence) and/or a solubility tag (such as SUMO).
18. A kit for modifying a polypeptide, the kit comprising a) an expression construct comprising a nucleic acid encoding a polypeptide comprising a three residue motif X1-X2-X3, wherein X1 is an aromatic amino acid and wherein X2 and X3 are each independently any amino acid and b) an expression construct comprising a nucleic acid that encodes an rSAM/SPASM (rSS) enzyme.
19. A method of preparing a modified polypeptide library, the method comprising:
a) providing a polypeptide library comprising polypeptides having a three residue motif X1-X2-X3, wherein X1 is an aromatic amino acid and wherein X2 and X3 are each independently any amino acid; and
b) contacting the polypeptide library with a rSAM/SPASM (rSS) enzyme for a sufficient time and under conditions to form a cyclophane group connecting X1 and X3 so as to generate a modified polypeptide library.
20. The method of claim 19, wherein the method comprises expressing a polypeptide library in a host cell.
21. The method of claim 20, wherein the method comprises co-expressing an rSS enzyme with the polypeptide library in the host cell.
22. The method of claim 19, wherein the polypeptides are displayed in a polypeptide display system.
23. The method of claim 22, wherein the method comprises contacting the displayed polypeptides with an rSS enzyme.
24. A method of selecting a modified peptide capable of binding to a ligand, the method comprising:
a) providing a polypeptide library comprising polypeptides having a three residue motif X1-X2-X3, wherein X1 is an aromatic amino acid and wherein X2 and X3 are each independently any amino acid;
b) contacting the polypeptide library with a rSAM/SPASM (rSS) enzyme for a sufficient time and under conditions to form a cyclophane group connecting X1 and X3 so as to generate a modified polypeptide library;
c) contacting the modified polypeptide library with the ligand; and
d) selecting the modified polypeptides that bind to the ligand.
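The degenerate sequences recited in claims 13 and 14 can be read as position-by-position residue sets. The sketch below writes them as regular expressions so that a candidate polypeptide can be checked computationally. It is an illustrative aid only and has no bearing on the scope of the claims; the pattern names are hypothetical, and the test sequence is SEQ ID NO: 14 from claim 15 with X13 = G.

```python
import re

# Claim 13 (SEQ ID NO: 44): W X4 X5 X6 X7 X8 X9 X10 X11 X12
SEQ44_PATTERN = re.compile(r"W[IDLV][NKR][AFV][FY][AGLSV][NKR][WF][DEGNPST][KR]")

# Claim 14 (SEQ ID NO: 13): W I X5 A F X8 N W X11 X12
SEQ13_PATTERN = re.compile(r"WI[NK]AF[GA]NW[EST][RK]")

candidate = "ELVDSLLDTVSGGWINAFGNWERAFH"

for label, pattern in (("SEQ ID NO: 44", SEQ44_PATTERN), ("SEQ ID NO: 13", SEQ13_PATTERN)):
    match = pattern.search(candidate)
    if match:
        # Report the 1-based start position of the matching stretch.
        print(label, "matches", match.group(0), "at position", match.start() + 1)
    else:
        print(label, "no match")
```

Both patterns match the stretch WINAFGNWER in this candidate. Claims that recite a percentage identity (claims 15 and 16) would need a sequence alignment rather than a pattern match.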
PCT/SG2020/050303 2019-05-24 2020-05-22 Method of modifying a polypeptide WO2020242379A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SG10201904686V 2019-05-24

Publications (1)

Publication Number Publication Date
WO2020242379A1 true WO2020242379A1 (en) 2020-12-03

Family

ID=73552991

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SG2020/050303 WO2020242379A1 (en) 2019-05-24 2020-05-22 Method of modifying a polypeptide

Country Status (1)

Country Link
WO (1) WO2020242379A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024025474A1 (en) * 2022-07-27 2024-02-01 National University Of Singapore Peptides with antimicrobial properties

Non-Patent Citations (9)

* Cited by examiner, † Cited by third party
Title
BARR IAN, LATHAM JOHN A., IAVARONE ANTHONY T., CHANTAROJSIRI TEERA, HWANG JENNIFER D., KLINMAN JUDITH P.: "Demonstration That the Radical S-Adenosylmethionine (SAM) Enzyme PqqE Catalyzes de Novo Carbon-Carbon Cross-linking Within a Peptide Substrate PqqA in the Presence of the Peptide Chaperone PqqD", J BIOL CHEM, vol. 291, no. 17, 8 March 2016 (2016-03-08), pages 8877 - 8884, XP055764407, DOI: 10.1074/JBC.C115.699918 *
BENJDIA ALHOSNA, DECAMPS LAURE, GUILLOT ALAIN, KUBIAK XAVIER, RUFFIÉ PAULINE, SANDSTRÖM CORINE, BERTEAU OLIVIER: "Insights Into the Catalysis of a Lysine-Tryptophan Bond in Bacterial Peptides by a SPASM Domain Radical S-adenosylmethionine (SAM) Peptide Cyclase", J BIOL CHEM, vol. 292, no. 26, 5 May 2017 (2017-05-05), pages 10835 - 10844, XP055764400, DOI: 10.1074/JBC.M117.783464 *
BENJDIA ALHOSNA, GUILLOT ALAIN, LEFRANC BENJAMIN, VAUDRY HUBERT, LEPRINCE JÉRÔME, BERTEAU OLIVIER: "Thioether Bond Formation by SPASM Domain Radical SAM Enzymes: Calpha H-atom Abstraction in Subtilosin A Biosynthesis", CHEM COMMUN (CAMB), vol. 52, no. 37, 18 April 2016 (2016-04-18), pages 6249 - 6252, XP055764424, DOI: 10.1039/C6CC01317A *
BUSHIN LEAH B., CLARK KENZIE A., PELCZER ISTVÁN, SEYEDSAYAMDOST MOHAMMAD R.: "Charting an Unexplored Streptococcal Biosynthetic Landscape Reveals a Unique Peptide Cyclization Motif", J AM CHEM SOC, vol. 140, no. 50, 6 November 2018 (2018-11-06), pages 17674 - 17684, XP055764398, DOI: 10.1021/JACS.8B10266 *
CARUSO ALESSIO, MARTINIE RYAN J., BUSHIN LEAH B., SEYEDSAYAMDOST MOHAMMAD R.: "Macrocyclization via an Arginine-Tyrosine Crosslink Broadens the Reaction Scope of Radical S-Adenosylmethionine Enzymes", J AM CHEM SOC, vol. 141, no. 42, 14 October 2019 (2019-10-14), pages 16610 - 16614, XP055764496, DOI: 10.1021/JACS.9B09210 *
DAVIS KATHERINE M., SCHRAMMA KELSEY R., HANSEN WILLIAM A., BACIK JOHN P., KHARE SAGAR D., SEYEDSAYAMDOST MOHAMMAD R., ANDO NOZOMI: "Structures of the Peptide-Modifying Radical SAM Enzyme SuiB Elucidate the Basis of Substrate Recognition", PROC NATL ACAD SCI USA, vol. 114, no. 39, 26 September 2017 (2017-09-26), pages 10420 - 10425, XP055764410, DOI: 10.1073/PNAS.1703663114 *
LATHAM JOHN A., IAVARONE ANTHONY T., BARR IAN, JUTHANI PRERAK V., KLINMAN JUDITH P.: "PqqD is a Novel Peptide Chaperone That Forms a Ternary Complex With the Radical S-adenosylmethionine Protein PqqE in the Pyrroloquinoline Quinone Biosynthetic Pathway", J BIOL CHEM, vol. 290, no. 20, 27 March 2015 (2015-03-27), pages 12908 - 12918, XP055764416, DOI: 10.1074/JBC.M115.646521 *
SCHRAMMA KELSEY R., BUSHIN LEAH B., SEYEDSAYAMDOST MOHAMMAD R.: "Structure and Biosynthesis of a Macrocyclic Peptide Containing an Unprecedented Lysine-To-Tryptophan Crosslink", NAT CHEM, vol. 7, no. 5, 20 April 2015 (2015-04-20), pages 431 - 437, XP055764401, DOI: 10.1038/NCHEM.2237 *
YOKOYAMA KENICHI, LILLA EDWARD A.: "C-C Bond Forming Radical SAM Enzymes Involved in the Construction of Carbon Skeletons of Cofactors and Natural Products", NAT PROD REP, vol. 35, no. 7, 18 July 2018 (2018-07-18), pages 660 - 694, XP055764495, DOI: 10.1039/C8NP00006A *

Similar Documents

Publication Publication Date Title
EP2157099B1 (en) Novel immunoglobulin-binding proteins with improved specificity
US20010034050A1 (en) Fusion peptides isolatable by phase transition
MacBeath et al. UGA read-through artifacts—when popular gene expression systems need a pATCH
WO2003091429A1 (en) Antimicrobial polypeptide and utizliation thereof
CN112877387A (en) Biological magnetic microsphere and preparation method and application thereof
CN108473979B (en) Peptide tag and protein having tag comprising the same
KR101360375B1 (en) Recombinant E. coli producing soluble BMP-2 and method for producing soluble BMP-2 using the same
WO2013091661A2 (en) Proteolytic resistant protein affinity tag
EP1220933B1 (en) Purification of recombinant proteins fused to multiple epitopes
US20210372948A1 (en) Versatile display scaffolds for proteins
WO2020242379A1 (en) Method of modifying a polypeptide
CN111132996A (en) Fusion tag for recombinant protein expression
JP2018514231A (en) Separation of growth and protein production
JPWO2019077887A1 (en) Modification of D and T arm of tRNA that enhances uptake of D-amino acid and β-amino acid
CN106755042B (en) Preparation method of bioactive small peptide based on combined self-shearing and protein scaffold
EP3741774A1 (en) N-terminal fusion partner for producing recombinant polypeptide, and method for producing recombinant polypeptide using same
US20090239262A1 (en) Affinity Polypeptide for Purification of Recombinant Proteins
US20220119792A1 (en) Gene expression cassette for expressing n-terminal methionine-truncated protein of interest and method for producing n-terminal methionine-truncated protein of interest by using same
JPWO2019239751A1 (en) Composition for cell transplantation and method of cell transplantation
JP2011239686A (en) Method of producing peptide and peptide derivative
KR20160077750A (en) Mass production method of recombinant trans glutaminase
WO2002036762A1 (en) Process for producing peptide
CN111349623B (en) 9 ℃ N DNA polymerase mutant
EP4079845A1 (en) Method for enhancing water solubility of target protein by whep domain fusion
US9856483B2 (en) Expression system for producing protein having a N-terminal pyroglutamate residue

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20812930

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20812930

Country of ref document: EP

Kind code of ref document: A1