WO2014046731A1 - Preparation of cyclotides - Google Patents

Preparation of cyclotides

Info

Publication number
WO2014046731A1
Authority
WO
WIPO (PCT)
Prior art keywords
cyclotide
mcoti
polypeptide
amino acid
cyclotides
Application number
PCT/US2013/031741
Other languages
French (fr)
Inventor
Julio A. Camarero
Krishnappa JAGADISH
Original Assignee
University Of Southern California
Application filed by University Of Southern California
Publication of WO2014046731A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/415 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from plants
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P31/00 Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
    • A61P31/04 Antibacterial agents
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P31/00 Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
    • A61P31/12 Antivirals
    • A61P31/14 Antivirals for RNA viruses
    • A61P31/18 Antivirals for RNA viruses for HIV
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide

Definitions

  • Cyclotides are spectacular natural plant micro-proteins, 28 to 37 amino acid residues long, that exhibit various biological activities, including anti-microbial, insecticidal, cytotoxic, antiviral (against HIV), protease inhibitory, and hormone-like activities. They share a unique head-to-tail circular knotted topology of three disulfide bridges; one disulfide penetrates through a macrocycle formed by the other two disulfides, thereby inter-connecting the peptide backbone to form what is called a cystine knot topology (FIG. 1).
  • This cyclic cystine knot (CCK) framework gives the cyclotides exceptional rigidity, resistance to thermal and chemical denaturation, and enzymatic stability against degradation.
  • cyclotides have been shown to be orally bioavailable.
  • the first cyclotide to be discovered was found to be an orally effective uterotonic, and other cyclotides have been shown to cross the cell membrane through macropinocytosis. All of these features make cyclotides ideal tools for drug development.
  • Cyclotides have been isolated from plants in the Rubiaceae, Violaceae, Cucurbitaceae, and, most recently, Fabaceae families. Around 200 different cyclotide sequences have been reported in the literature, although it has been estimated that approximately 50,000 cyclotides may exist. Despite sequence diversity, all cyclotides share the same CCK motif (FIG. 1A). Hence, these micro-proteins can be considered natural combinatorial peptide libraries that are structurally constrained by the cystine-knot scaffold and head-to-tail cyclization and in which, with the exception of the strictly conserved cysteines comprising the cystine knot, hypermutation of essentially all residues is permitted.
  • Cyclotides can be chemically synthesized, thereby permitting the introduction of specific chemical modifications or biophysical probes. More importantly, cyclotides can now be biosynthesized in bacterial cells using a biomimetic approach that involves the use of modified protein splicing units. These characteristics make them ideal substrates for the production of genetically-encoded libraries based on the cyclotide framework. These cell-based libraries allow in-cell molecular evolution strategies to enable the generation and high throughput selection of compounds with optimal binding and inhibitory characteristics. In contrast to chemically generated libraries, genetically-encoded libraries enable the facile generation and screening of very large combinatorial libraries of molecules.
  • One embodiment of the present disclosure provides an isolated or recombinant polypeptide comprising a linear cyclotide fused to a C-terminal fragment and an N-terminal fragment of a split intein, at the N-terminus and C-terminus of the cyclotide, respectively.
  • the split intein comprises a DnaE split intein.
  • the DnaE split intein comprises a Nostoc punctiforme PCC73102 DnaE split intein.
  • the C-terminal fragment comprises an amino acid sequence of SEQ ID NO: 2.
  • the N-terminal fragment comprises an amino acid sequence of SEQ ID NO: 3.
  • the cyclotide comprises an amino acid sequence selected from Table 1 or an amino acid sequence that has at least about 90% sequence identity thereto.
  • the cyclotide comprises at least one unnatural amino acid residue but retains six cysteine residues that form three disulfide bonds in a cyclized cyclotide.
  • the unnatural amino acid comprises one or more selected from p-methoxyphenylalanine, p-azidophenylalanine or L-(7-hydroxycoumarin-4-yl)ethylglycine.
  • the cyclotide comprises an amino acid sequence of SEQ ID NO: 1.
  • an isolated polynucleotide comprising a nucleic acid sequence encoding a polypeptide of any of the above embodiments or a biological equivalent thereof or a polynucleotide that hybridizes under conditions of high stringency to the polynucleotide or its complement.
  • the present disclosure, in one embodiment, provides a method for preparing a cyclic peptide comprising expressing a linear cyclotide in a cell and cyclizing the linear cyclotide, wherein the cyclotide comprises at least one unnatural amino acid but retains six cysteine residues to form three disulfide bonds.
  • Another embodiment provides a cyclized cyclotide obtainable by such a method.
  • the unnatural amino acid is modified with an agent comprising a detectable label.
  • Vectors and host cells comprising polynucleotides and compositions containing any of the polynucleotides or polypeptides are also provided.
  • FIG. 1A shows the tertiary structure of the cyclotide MCoTI-II (PDB code: 1IB9) and primary structures of cyclotides used in Example 1.
  • the backbone cyclized peptide (connecting bond shown in light gray) is stabilized by the three disulfide bonds (shown in dark gray).
  • FIG. 1B illustrates intein precursors used for the expression of cyclotides produced in Example 1.
  • the unnatural amino acid (Uaa) was introduced at position 14, which is in the middle of loop 2 and is marked with an "X”.
  • FIG. 2A-D show in-cell expression of MCoTI-I based cyclotides in E. coli cells using Npu DnaE intein-mediated protein trans-splicing (PTS).
  • A SDS-PAGE analysis of the recombinant expression of cyclotide precursors 2a and 2b in Origami2 (DE3) cells for in-cell production of the cyclotides MCoTI-I and MCoTI-OmeF, respectively.
  • B Analytical HPLC trace (left panel) of the soluble cell extract of bacterial cells expressing precursor 2a (MCoTI-I) after purification by affinity chromatography on a trypsin-sepharose column.
  • FIG. 3 presents a scheme for in-cell expression of native folded cyclotides using intein-mediated protein trans-splicing.
  • FIG. 4A-C show synthesis of DBCO-AMCA.
  • A A synthetic scheme for the production of DBCO-AMCA from 6-((7-amino-4-methylcoumarin-3-acetyl)-amino)-hexanoic acid succinimidyl ester (AMCA-X) and 5,6-dihydro-11,12-didehydrodibenzo-[b,f]-azocino-3-oxopropyl-4-amine (DBCO-NH2).
  • B C18 RP-HPLC trace of purified DBCO-AMCA. Linear gradient from 0% B to 100% B over 30 min. Detection was performed at 360 nm.
  • C ES-MS spectrum of purified DBCO-AMCA.
  • FIG. 5A-C show in vitro production of cyclotides MCoTI-OmeF and MCoTI-aziF by expressed protein ligation (EPL).
  • EPL expressed protein ligation
  • Cyclotides MCoTI-OmeF and MCoTI-aziF are marked with an arrow.
  • the mass observed for MCoTI-aziF corresponds to the photodegradation product (p-amino-phenylalanine derivative).
  • FIG. 6 presents the 1H{1H}-NOESY spectrum of cyclotide MCoTI-I produced in-cell by PTS.
  • FIG. 7 shows ES-MS spectra of purified MCoTI-OmeF (left) and MCoTI-AziF (right).
  • FIG. 8A-B show NMR characterization of cyclotide MCoTI-OmeF.
  • A 1H{1H}-TOCSY shows the amino acid assignments of MCoTI-OmeF.
  • B Aromatic region of 1H{1H}-NOESY shows the assignments of the p-MeO-Phe14 side chain of MCoTI-OmeF.
  • FIG. 9 shows in vitro labeling of MCoTI-aziF with DBCO-AMCA through copper-free click chemistry.
  • the ligation reactions were analyzed by LC-MS/MS (right panel).
  • Product was characterized by ES-MS (left panel).
  • FIG. 10 shows in vivo labeling of MCoTI-AziF with DBCO-AMCA.
  • Cyclotides MCoTI-I and MCoTI-aziF were expressed in Origami(DE3) cells transformed with plasmid pERAzi as described above. In both cases the cells were incubated with DBCO-AMCA (0.5 ⁇ ) in PBS for 4 h. The cells were then washed with PBS until no AMCA was detected in the washes, and analyzed by fluorescence microscopy. Bar size corresponds to 10 ⁇ .
  • FIG. 11 shows direct binding of AMCA-labeled MCoTI-AziF to trypsin, measured by fluorescence polarization anisotropy with excitation of AMCA at 360 nm and detection at 450 nm. Binding experiments were performed in phosphate buffer at pH 7.4 at room temperature by titrating AMCA-labeled MCoTI-AziF (5 nM) with increasing amounts of trypsin.
  • FIG. 12A presents data on direct binding of AMCA-labeled MCoTI-AziF to trypsin-S195A-GFP, measured by the increase in the FRET signal at 415 nm when exciting at 360 nm.
  • FIG. 12B shows results from binding experiments performed in phosphate buffer at pH 7.4 at room temperature by titrating AMCA-labeled MCoTI-AziF (170 nM) with increasing amounts of trypsin-S195A-GFP (0.25 nM - 300 nM).
  • B Fluorescence spectra of MCoTI-AziF (170 nM) and trypsin-S195A-GFP at different concentrations (0.25 nM - 300 nM).
  • a cell includes a plurality of cells, including mixtures thereof.
  • compositions and methods include the recited elements, but not excluding others.
  • Consisting essentially of when used to define compositions and methods shall mean excluding other elements of any essential significance to the combination for the stated purpose.
  • a composition consisting essentially of the elements as defined herein would not exclude trace contaminants from the isolation and purification method and pharmaceutically acceptable carriers, such as phosphate buffered saline, preservatives and the like.
  • Consisting of shall mean excluding more than trace elements of other ingredients and substantial method steps for administering the compositions of this invention or process steps to produce a composition or achieve an intended result.
  • isolated refers to molecules separated from other DNAs or RNAs, respectively, that are present in the natural source of the macromolecule.
  • isolated as used herein also refers to a nucleic acid or peptide that is substantially free of cellular material, viral material, or culture medium when produced by recombinant DNA techniques, or chemical precursors or other chemicals when chemically synthesized.
  • an isolated nucleic acid is meant to include nucleic acid fragments which are not naturally occurring as fragments and would not be found in the natural state.
  • isolated is also used herein to refer to cells or polypeptides which are isolated from other cellular proteins or tissues. Isolated polypeptides is meant to encompass both purified and recombinant polypeptides.
  • the term "recombinant" as it pertains to polypeptides or polynucleotides intends a form of the polypeptide or polynucleotide that does not exist naturally, a non-limiting example of which can be created by combining polynucleotides or polypeptides that would not normally occur together.
  • Cells "host cells” or “recombinant host cells” are terms used interchangeably herein. It is understood that such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.
  • a "vector” is a vehicle for transferring genetic material into a cell. Examples of such include, but are not limited to plasmids and viral vectors.
  • a viral vector is a virus that has been modified to transduce genetic material into a cell.
  • a plasmid vector is made by splicing a DNA construct into a plasmid.
  • the appropriate regulatory elements are included in the vectors to guide replication and/or expression of the genetic material in the selected host cell.
  • Homology refers to sequence similarity between two peptides or between two nucleic acid molecules. Homology can be determined by comparing a position in each sequence which may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same base or amino acid, then the molecules are homologous at that position. A degree of homology between sequences is a function of the number of matching or homologous positions shared by the sequences. An "unrelated" or “nonhomologous” sequence shares less than 40% identity, though preferably less than 25% identity, with one of the sequences of the present invention.
  • a polynucleotide or polynucleotide region (or a polypeptide or polypeptide region) has a certain percentage (for example, 60 %, 65 %, 70 %, 75 %, 80 %, 85 %, 90 %, 95 %, 98 % or 99 %) of "sequence identity" to another sequence means that, when aligned, that percentage of bases (or amino acids) are the same in comparing the two sequences.
  • This alignment and the percent homology or sequence identity can be determined using software programs known in the art, for example those described in Ausubel et al. eds. (2007) Current Protocols in Molecular Biology. Preferably, default parameters are used for alignment.
  • One alignment program is BLAST, using default parameters.
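The definition of percent sequence identity above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (equal length, with "-" marking gaps); it is not BLAST and performs no alignment itself. The function name `percent_identity` is hypothetical, and the two example sequences are the circulin C and cycloviolacin_03 entries as they appear in Table 1.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns that hold the same residue in both sequences.

    Assumption: the inputs are pre-aligned (equal length, '-' for gaps).
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


circulin_c = "GIPCGESCVFIPCITSVAGCSCKSKVCYRN"
cycloviolacin_03 = "GIPCGESCVWIPCLTSAIGCSCKSKVCYRN"
identity = percent_identity(circulin_c, cycloviolacin_03)
print(f"{identity:.1f}% identity")
```

These two 30-residue sequences differ at four positions, so 26 of 30 columns match. Dividing by the number of aligned columns is one possible convention; alignment programs may use a different denominator.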
  • Biologically equivalent polynucleotides are those having the above-noted specified percent homology and encoding a polypeptide having the same or similar biological activity.
  • an equivalent nucleic acid or polynucleotide refers to a nucleic acid having a nucleotide sequence having a certain degree of homology with the nucleotide sequence of the nucleic acid or complement thereof.
  • a homolog of a double stranded nucleic acid is intended to include nucleic acids having a nucleotide sequence which has a certain degree of homology with the nucleic acid or with the complement thereof.
  • homologs of nucleic acids are capable of hybridizing to the nucleic acid or complement thereof.
  • Hybridization reactions can be performed under conditions of different "stringency". In general, a low stringency hybridization reaction is carried out at about 40°C in about 10 x SSC or a solution of equivalent ionic strength/temperature. A moderate stringency hybridization is typically performed at about 50°C in about 6 x SSC, and a high stringency hybridization reaction is generally performed at about 60°C in about 1 x SSC. Hybridization reactions can also be performed under "physiological conditions" which is well known to one of skill in the art. A non-limiting example of a physiological condition is the temperature, ionic strength, pH and
  • oligonucleotide refers to polynucleotides such as deoxyribonucleic acid (DNA), and, where appropriate, ribonucleic acid (RNA).
  • DNA deoxyribonucleic acid
  • RNA ribonucleic acid
  • Deoxyribonucleotides include deoxyadenosine, deoxycytidine, deoxyguanosine, and deoxythymidine.
  • when referring herein to a nucleotide of a nucleic acid, which can be DNA or RNA, the terms adenosine, cytidine, guanosine, and thymidine are used.
  • if the nucleic acid is RNA, a nucleotide having a uracil base is uridine.
  • polynucleotide and “oligonucleotide” are used interchangeably and refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides or analogs thereof. Polynucleotides can have any three-dimensional structure and may perform any function, known or unknown.
  • polynucleotides a gene or gene fragment (for example, a probe, primer, EST or SAGE tag), exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, ribozymes, cDNA, dsRNA, siRNA, miRNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes and primers.
  • a polynucleotide can comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs.
  • modifications to the nucleotide structure can be imparted before or after assembly of the polynucleotide.
  • the sequence of nucleotides can be interrupted by non-nucleotide components.
  • a polynucleotide can be further modified after polymerization, such as by conjugation with a labeling component.
  • the term also refers to both double- and single-stranded molecules. Unless otherwise specified or required, any embodiment of this invention that is a polynucleotide encompasses both the double-stranded form and each of two complementary single-stranded forms known or predicted to make up the double-stranded form.
  • a polynucleotide is composed of a specific sequence of four nucleotide bases: adenine (A); cytosine (C); guanine (G); thymine (T); and uracil (U) for thymine when the polynucleotide is RNA.
  • A adenine
  • C cytosine
  • G guanine
  • T thymine
  • U uracil
  • polynucleotide sequence is the alphabetical representation of a polynucleotide molecule. This alphabetical representation can be input into databases in a computer having a central processing unit and used for bioinformatics applications such as functional genomics and homology searching.
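As a minimal sketch of how such an alphabetical representation might be handled in software, the snippet below validates a DNA string, writes the corresponding RNA string (uracil in place of thymine), and derives the complementary strand. The function names are hypothetical and only the four unambiguous bases are accepted.

```python
DNA_BASES = set("ACGT")
# Watson-Crick pairing used to derive the complementary strand.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def validate_dna(seq: str) -> str:
    """Return the upper-cased sequence, or raise if it contains non-ACGT letters."""
    seq = seq.upper()
    unexpected = set(seq) - DNA_BASES
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq


def to_rna(seq: str) -> str:
    """Write the same strand as RNA: uracil (U) replaces thymine (T)."""
    return validate_dna(seq).replace("T", "U")


def reverse_complement(seq: str) -> str:
    """Complementary strand, read 5' to 3'."""
    return validate_dna(seq).translate(COMPLEMENT)[::-1]


print(to_rna("ATGC"))
print(reverse_complement("ATGC"))
```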
  • polymorphism refers to the coexistence of more than one form of a gene or portion thereof.
  • a polymorphic region can be a single nucleotide, the identity of which differs in different alleles.
  • encode refers to a polynucleotide which is said to "encode” a polypeptide if, in its native state or when manipulated by methods well known to those skilled in the art, it can be transcribed and/or translated to produce the mRNA for the polypeptide and/or a fragment thereof.
  • the antisense strand is the complement of such a nucleic acid, and the encoding sequence can be deduced therefrom.
  • the term "detectable label” intends a directly or indirectly detectable compound or composition that is conjugated directly or indirectly to the composition to be detected, e.g., polynucleotide or protein such as an antibody so as to generate a "labeled" composition.
  • the term also includes sequences conjugated to the polynucleotide that will provide a signal upon expression of the inserted sequences, such as green fluorescent protein (GFP) and the like.
  • the label may be detectable by itself (e.g. radioisotope labels or fluorescent labels) or, in the case of an enzymatic label, may catalyze chemical alteration of a substrate compound or composition which is detectable.
  • the labels can be suitable for small scale detection or more suitable for high-throughput screening.
  • suitable labels include, but are not limited to radioisotopes, fluorochromes, chemiluminescent compounds, dyes, and proteins, including enzymes.
  • the label may be simply detected or it may be quantified.
  • a response that is simply detected generally comprises a response whose existence merely is confirmed, whereas a response that is quantified generally comprises a response having a quantifiable (e.g., numerically reportable) value such as an intensity, polarization, and/or other property.
  • the detectable response may be generated directly using a luminophore or fluorophore associated with an assay component actually involved in binding, or indirectly using a luminophore or fluorophore associated with another (e.g., reporter or indicator) component.
  • luminescent labels that produce signals include, but are not limited to bioluminescence and chemiluminescence.
  • Detectable luminescence response generally comprises a change in, or an occurrence of, a luminescence signal.
  • Suitable methods and luminophores for luminescently labeling assay components are known in the art and described for example in Haugland, Richard P. (1996) Handbook of Fluorescent Probes and Research Chemicals (6th ed.).
  • luminescent probes include, but are not limited to, aequorin and luciferases.
  • fluorescent labels include, but are not limited to, fluorescein, rhodamine, tetramethylrhodamine, eosin, erythrosin, coumarin, methyl-coumarins, pyrene, Malachite green, stilbene, Lucifer Yellow, Cascade Blue™, and Texas Red.
  • suitable optical dyes are described in Haugland, Richard P. (1996) Handbook of Fluorescent Probes and Research Chemicals (6th ed.).
  • the fluorescent label is functionalized to facilitate covalent attachment to a cellular component present in or on the surface of the cell or tissue such as a cell surface marker.
  • Suitable functional groups include, but are not limited to, isothiocyanate groups, amino groups, haloacetyl groups, maleimides, succinimidyl esters, and sulfonyl halides, all of which may be used to attach the fluorescent label to a second molecule.
  • the choice of the functional group of the fluorescent label will depend on the site of attachment to either a linker, the agent, the marker, or the second labeling agent.
  • Attachment of the fluorescent label may be either directly to the cellular component or compound or, alternatively, can be via a linker.
  • Suitable binding pairs for use in indirectly linking the fluorescent label to the intermediate include, but are not limited to, antigens/antibodies, e.g., rhodamine/anti-rhodamine, biotin/avidin and biotin/streptavidin.
  • the term "carrier” encompasses any of the standard carriers, such as a phosphate buffered saline solution, buffers, water, and emulsions, such as an oil/water or water/oil emulsion, and various types of wetting agents.
  • the compositions also can include stabilizers and preservatives.
  • For stabilizers and adjuvants, see Sambrook and Russell (2001), supra. Those skilled in the art will know many other suitable carriers for binding polynucleotides, or will be able to ascertain the same by use of routine experimentation.
  • the carrier is a buffered solution such as, but not limited to, a PCR buffer solution.
  • a "pharmaceutical composition” is intended to include the combination of an active agent with a carrier, inert or active, making the composition suitable for diagnostic or therapeutic use in vitro, in vivo or ex vivo.
  • the term "pharmaceutically acceptable carrier” encompasses any of the standard pharmaceutical carriers, such as a phosphate buffered saline solution, water, and emulsions, such as an oil/water or water/oil emulsion, and various types of wetting agents.
  • the compositions also can include stabilizers and preservatives.
  • For stabilizers and adjuvants, see Martin (1975) Remington's Pharm. Sci., 15th Ed. (Mack Publ. Co., Easton).
  • Examples in this disclosure show in-cell production of natively folded cyclotides.
  • the folded cyclotides can contain unnatural amino acids (Uaas) by employing in vivo Uaa incorporation in combination with protein splicing to mediate intracellular backbone cyclization.
  • Uaas unnatural amino acids
  • This is the first time that a natively folded cyclotide containing Uaas has been produced inside living cells.
  • This approach opens the possibility for in-cell generation of cyclotides containing Uaas with new or enhanced biological functions.
  • the introduction of fluorescent amino acids or Uaas able to site-specifically incorporate fluorescent probes should facilitate the in-cell production of fluorescent-labeled cyclotides for screening or probing molecular interactions using cell-based optical screening approaches.
  • One embodiment of the present disclosure provides a method for preparing a cyclotide.
  • the method entails generation of a linear peptide that contains the desired cyclotide in a linear form, flanked by two peptide fragments that have affinity to each other so as to be capable of bringing two ends of the linear cyclotide together, facilitating cyclization.
  • the two peptide fragments are the C-terminus and N-terminus domains of a split intein.
  • Cyclotides are small disulfide rich peptides isolated from plants. Typically containing 28-37 amino acids, cyclotides are characterized by their head-to-tail cyclized peptide backbone and the interlocking arrangement of their three disulfide bonds. These combined features have been termed the cyclic cystine knot (CCK) motif (FIG. 1A). To date, over 100 cyclotides have been isolated and characterized from species of the Rubiaceae, Violaceae, Cucurbitaceae and Fabaceae families. Table 1 below lists non-limiting examples of known cyclotides.
  • cyclic backbone includes a molecule comprising a sequence of amino acid residues or analogues thereof without free amino and carboxy termini.
  • the cyclic backbone of the disclosure comprises sufficient disulfide bonds, or chemical equivalents thereof, to confer a knotted topology on the three-dimensional structure of the cyclic backbone.
  • cyclotide refers to a peptide comprising a cyclic cystine knot motif defined by a cyclic backbone, at least two but preferably at least three disulfide bonds and associated beta strands in a particular knotted topology.
  • the knotted topology involves an embedded ring formed by at least two backbone disulfide bonds and their connecting backbone segments being threaded by a third disulfide bond.
  • a disulfide bond may be replaced or substituted by another form of bonding such as a covalent bond.
  • Hypa A: GIPCAESCVYIPCTITALLGCSCKNKVCYN (SEQ ID NO: 43)
  • circulin B: GVIPCGESCVFIPCISTLLGCSC NKVCYRN (SEQ ID NO: 44)
  • circulin C: GIPCGESCVFIPCITSVAGCSCKSKVCYRN (SEQ ID NO: 45)
  • circulin D: KIPCGESCVWIPCVTSIFNCKCENKVCYHD (SEQ ID NO: 46)
  • circulin E: KIPCGESCVWIPCLTSVFNCKCENKVCYHD (SEQ ID NO: 47)
  • circulin F: AIPCGESCVWIPCISAAIGCSCKNKVCYR (SEQ ID NO: 48)
  • cycloviolacin 04: GIPCGESCVWIPCISSAIGCSC N VCYRN (SEQ ID NO: 49)
  • cycloviolacin_03: GIPCGESCVWIPCLTSAIGCSCKSKVCYRN (SEQ ID NO: 50)
  • cycloviolacin 05: GTPCGESCVWIPCISSAVGCSCK KVCYKN (SEQ ID NO: 51)
  • cycloviolacin_06: GTLPCGESCVWIPCISAAVGCSC S VCYKN (SEQ ID NO: 52)
  • cycloviolacin 07: SIPCGESCVWIPCTITALAGCKCKSKVCYN (SEQ ID NO: 53)
  • cycloviolacin 010: GIPC
  • Oantr_protein: GV S SETTLMFLKEMQLKLP (SEQ ID NO: 71)
  • vhl-2: GLPVCGETCFTGTCYTNGCTCDPWPVCTRN (SEQ ID NO: 72)
  • Hyfl_A: SISCGESCVYIPCTVTALVGCTCKDKVCYLN (SEQ ID NO: 75)
  • Hyfl_F: SISCGETCTTFNCWIPNC CNHHDKVCYWN (SEQ ID NO: 80)
  • Hyep_B_(partial): CGETCIYIPCFTEAVGCKCKDKVCY N (SEQ ID NO: 100)
  • tricyclon B: GGTIFDCGESCFLGTCYTKGCSCGEWKLCYGEN (SEQ ID NO: 101)
  • kalata_B8: GSVLNCGETCLLGTCYTTGCTCNKYRVCTKD (SEQ ID NO: 102)
  • cycloviolacin H4: GIPCAESCVWIPCTVTALLGCSCSNNVCYN (SEQ ID NO: 103)
  • cycloviolacin 013: GIPCGESCVWIPCISAAIGCSC S VCYRN (SEQ ID NO: 104)
  • violacin A: SAISCGETCFKFKCYTPRCSCSYPVC (SEQ ID NO: 105)
  • cycloviolacin 014: GSIPACGESCFKGKCYTPGCSCSKYPLCAKN (SEQ ID NO: 106)
  • cycloviolacin O 15: GLVPCGETCFTGKCYTPGCSCSYPICKKN (SEQ ID NO: 107)
  • cycloviolacin 016: GLPCGETCFTGKCYTPGCSCSYPICKXIN (SEQ ID NO: 108)
  • cycloviolacin 017: GIPCGESCVWIPCISAAIGCSCKNK
  • GLPVCGETCVGGTCNTPGCACSWPVCTRN (SEQ ID NO: 169)
  • mram 1: GSIPCGESCVYIPCIS SLLGCSCKSKVCYKN (SEQ ID NO: 170)
  • mram 2: GIPCAESCVYIPCLTSAIGCSC S VCYRN (SEQ ID NO: 171)
  • mram_3: GIPCGESCVYLPCFTTIIGC CQGKVCYH (SEQ ID NO: 172)
  • mram 4: GSIPCGESCVFIPCISSWGCSCKNKVCYKN (SEQ ID NO: 173)
  • mram 5: GTIPCGESCVFIPCLTSAIGCSC S VCYKN (SEQ ID NO: 174)
  • mram_6: GSIPCGESCVYIPCISSLLGCSCESKVCY N (SEQ ID NO: 175)
  • mram 7: GSIPCGESCVFIPCISSIVGCSC S VCYKN (SEQ ID NO: 176)
  • mram_8: GIPCGESCVFIPCLTSAIGCSCKSKVCYRN (SEQ ID NO: 177)
  • mram 9: GVPCGESCVWIPCLTSrVGCSCKNNVCTLN (SEQ ID NO: 178)
  • mram 10: GVIPCGESCVFIPCISSVLGCSCI NKVCYRN
  • the present technology can be used to prepare any cyclotide, including those known cyclotides as listed in Table 1.
  • New cyclotides can also be prepared.
  • a known cyclotide can be modified to substitute, insert and/or delete one or more amino acids.
  • the modified cyclotide is at least about 80%, 85%, 90%, or 95% identical to a reference cyclotide.
  • the modified cyclotide retains six cysteine residues that form three disulfide bonds in a cyclized cyclotide.
  • the cyclotide incorporates one or more unnatural amino acids.
  • Unnatural amino acids are amino acids not in the standard 20-amino acid list but can be incorporated into a protein sequence.
  • Non-limiting examples of unnatural amino acids include p-methoxyphenylalanine, p-azidophenylalanine, L-(7-hydroxycoumarin-4-yl)ethylglycine, acetyl-2-naphthyl alanine, 2-naphthyl alanine, 3-pyridyl alanine, 4-chloro phenyl alanine,
  • the unnatural amino acid is located in loop 2 of the cyclotide. In alternative embodiments, the unnatural amino acid is located in loop 1, 3, 4, 5 or 6. In some embodiments, the cyclotide contains two, three, four or more unnatural amino acids.
  • the cyclotide comprises a molecular framework comprising a sequence of amino acids forming a cyclic backbone wherein the cyclic backbone comprises sufficient disulfide bonds to confer knotted topology on the molecular framework or part thereof.
  • the cyclic backbone comprises the structure:
  • the amino acid residues corresponding to [X ⁇ i ...X f ] in the cyclotide comprise the unnatural amino acid.
  • the unnatural amino acid is located in loop 1 of the cyclotide.
  • the amino acid residues corresponding to [X ⁇ -X a ] in the cyclotide comprise the unnatural amino acid.
  • the unnatural amino acid is located in loop 2 of the cyclotide.
  • the amino acid residues corresponding to [X ..X b ] in the cyclotide comprise the unnatural amino acid.
  • the unnatural amino acid is located in loop 3 of the cyclotide.
  • the amino acid residues corresponding to [ ⁇ ⁇ ⁇ ...X c ] in the cyclotide comprise the unnatural amino acid.
  • the unnatural amino acid is located in loop 4 of the cyclotide. In this embodiment, the unnatural amino acid residues corresponding to
  • [X ⁇ i ...Xc] in the cyclotide comprise the unnatural amino acid.
  • the unnatural amino acid is located in loop 5 of the cyclotide.
  • the amino acid residues corresponding to [X V i ...X e ] in the cyclotide comprise the unnatural amino acid.
  • the cyclotide comprises an amino acid sequence of
  • GGVCPKILQRCRRXSDCPGACICRGNGYCGSGSD (SEQ ID NO: 1) where X indicates an unnatural amino acid.
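The structural requirements stated for SEQ ID NO: 1 (six cysteines, an unnatural amino acid X at position 14 in loop 2) can be checked with a short sketch. It assumes the common loop numbering in which loop n lies between the n-th and (n+1)-th cysteines of the sequence as written, and residues outside the first and last cysteines belong to loop 6, which is closed by backbone cyclization. The function names are hypothetical.

```python
SEQ_ID_NO_1 = "GGVCPKILQRCRRXSDCPGACICRGNGYCGSGSD"  # X marks the unnatural amino acid


def cysteine_positions(seq: str) -> list[int]:
    """1-based positions of every cysteine in the sequence."""
    return [i for i, aa in enumerate(seq, start=1) if aa == "C"]


def loop_of_position(seq: str, position: int) -> int:
    """Loop number (1-6) containing a 1-based, non-cysteine position.

    Assumption: loop n spans the residues between the n-th and (n+1)-th
    cysteines; residues before the first or after the last cysteine form
    loop 6, which is joined head-to-tail on cyclization.
    """
    if not 1 <= position <= len(seq):
        raise ValueError("position is outside the sequence")
    cys = cysteine_positions(seq)
    if len(cys) != 6:
        raise ValueError(f"expected six cysteines, found {len(cys)}")
    if position in cys:
        raise ValueError("position is a cysteine, not a loop residue")
    for loop_number in range(1, 6):
        if cys[loop_number - 1] < position < cys[loop_number]:
            return loop_number
    return 6


uaa_position = SEQ_ID_NO_1.index("X") + 1
print(cysteine_positions(SEQ_ID_NO_1))
print(uaa_position, loop_of_position(SEQ_ID_NO_1, uaa_position))
```

Under this numbering, X sits at position 14 between the second and third cysteines (positions 11 and 17), consistent with the statement that the unnatural amino acid is in the middle of loop 2.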
  • the present disclosure provides a polypeptide precursor for generating a cyclotide.
  • the polypeptide comprises a linear cyclotide fused to a C-terminal fragment and an N-terminal fragment of a split intein, at the N-terminus and C-terminus of the cyclotide, respectively.
  • a structure of the polypeptide is illustrated in FIG. 1B, lower panel.
  • a "split intein" is an intein of a precursor protein that comes from two separate genes.
  • For example, DnaE, the catalytic subunit α of DNA polymerase III, is encoded by two separate genes, dnaE-n and dnaE-c.
  • the dnaE-n product consists of an N-extein sequence followed by a 123-AA intein sequence.
  • the dnaE-c product consists of a 36-AA intein sequence followed by a C-extein sequence.
  • the split intein comprises a DnaE split intein.
  • the DnaE split intein comprises a Nostoc punctiforme PCC73102 DnaE split intein.
  • the C-terminal fragment comprises an amino acid sequence of SEQ ID NO: 2 (MIKIATRKYLGKQNVYDIGVERDHNFALKNGFIASN).
  • the N-terminal fragment comprises an amino acid sequence of SEQ ID NO: 3
  • cyclotides prepared from the cyclotide precursor as provided above are also provided.
  • polynucleotides encoding these polypeptides are also provided.
  • the polynucleotide uses a stop codon to code for an unnatural amino acid. Suitable conditions for translating the stop codon into an unnatural amino acid are known in the art and detailed in Example 1.
  • Methods for preparing a cyclotide are also provided.
  • the method entails incubating a polypeptide of the disclosure under conditions for the linear cyclotide to cyclize.
  • preparing a cyclic cyclotide does not require the split intein described above. As shown in Example 1, other protein domains can also be used to bring both ends of a linear cyclotide together. Accordingly, one embodiment of the present disclosure provides a method for preparing a cyclic peptide comprising expressing a linear cyclotide in a cell and cyclizing the linear cyclotide, wherein the cyclotide comprises at least an unnatural amino acid but retains six cysteine residues to form three disulfide bonds. Cyclized cyclotides from the methods are provided.
  • cyclic cyclotides prepared by methods of the present disclosure can be further modified.
  • an unnatural amino acid incorporated into the cyclotide can be modified with an agent comprising a detectable label.
  • a detectable label can be useful for detection and screening of the cyclotides.
  • kits and libraries are also provided for screening cyclotide libraries for potential pharmaceutical agents.
  • Various mutations can be made to wild-type, known, or proposed cyclotides to generate a large library of cyclotides.
  • such a library can include cyclotides of diverse structures, further enhancing the value of the library.
  • Analytical HPLC was performed on a HP 1100 series instrument with 220 and 280 nm detection using a Vydac C18 column (5 micron, 4.6 x 1 0 mm) at a flow rate of 1 mL/min.
  • Semipreparative HPLC was performed on a Waters Delta Prep system fitted with a Waters 2487 Ultraviolet-visible (UV/Vis) detector using a Vydac C18 column (15-20 µm, 10 x 250 mm) at a flow rate of 5 mL/min. All runs used linear gradients of 0.1% aqueous trifluoroacetic acid (TFA, solvent A) vs.
  • Flow cytometry analysis was performed on an LSR II instrument (BD). Protein samples were run on 4-20% Tris-Glycine Gels (Lonza). The gels were then stained with Pierce Gelcode Blue (Pierce), photographed/digitized using a Kodak EDAS 290, and quantified using NIH ImageJ software (http://rsb.info.nih.gov/ij/). DNA sequencing was performed by Retrogen DNA facility (San Diego, CA), and the sequence data were analyzed with DNAStar Lasergene v5.5.2. All chemicals were obtained from Aldrich (Milwaukee, WI) or Novabiochem (San Diego, CA) unless otherwise indicated. Restriction enzymes were purchased from New England Biolabs. Primers were ordered from IDT (Integrated DNA Technologies).
  • DBCO-AMCA: 6-((7-amino-4-methylcoumarin-3-acetyl)-amino)-hexanoic acid succinimidyl ester (AMCA-X, SE, Anaspec) (10 mg, 22.5 µmol) was reacted with 5,6-dihydro-11,12-didehydrodibenzo-[b,f]-azocino-3-oxoprop-yl-4-amine (DBCO-amine, Click Chemistry Tools, Bioconjugate Technology Company) (7 mg, 25.3 µmol) in DMF (100 µL) containing 5% di-isopropylethyl amine (DIEA) for 30 min at room temperature.
  • the OmeRS gene was amplified by polymerase chain reaction (PCR) using plasmid pBK-JY16 as template.
  • the 5'-primer (5'-CT ATG ACT AGT GAC GAA TTT GAA ATG ATA AAG-3', SEQ ID NO: 290) encoded a Spe I restriction site.
  • the 3'-primer (5'-GTG ATG AGA TCT TTA TAA TCT CTT TCT AAT TGG CTC-3', SEQ ID NO: 291) encoded a Bgl II restriction site.
  • the PCR product was purified, digested with Spe I and Bgl II, and ligated into a Spe I, Bgl II-treated pLeitRNA Opt-STAT3 plasmid to give the plasmid pVLOmeRS.
  • B2J066 was amplified by PCR using plasmid pYYl-Npu-IN as template and the following primers: 5'-primer (5'-AA AAA CAT ATG AAA CGG AAA TAT TGA C-3', SEQ ID NO: 292) and 3'-primer (5'-T TTT AAG CTT AAT TCG GCA AAT TAT CAA CCC-3', SEQ ID NO: 293), which introduced Nde I and Hind III restriction sites, respectively. The resulting DNA fragment was purified and digested with Sal I and Not I.
  • 5'-Phosphorylated oligonucleotides coding for the DnaE IC Npu (residues 1-36, UniProtKB: B2J821) (Table 2) were synthesized and PAGE purified by IDT DNA. Complementary strands were annealed in 20 mM sodium phosphate, 0.3 M NaCl buffer at pH 7.4 and the resulting double-stranded DNA (dsDNA) was purified using Qiagen's PCR Purification Kit. 5'-Phosphorylated oligonucleotides coding for MCoTI-I (Table 2) were synthesized and PAGE purified by IDT DNA.
  • the PCR product was purified, digested with Nco I and Hind III, and ligated into a Nco I, Hind III-treated pET28a plasmid (Novagen) to give the plasmid pET28-TS-MCoTI-I.
  • the gene fragment encoding EGFP was amplified by PCR using plasmid pEGFP-Nl (Clontech) as a template.
  • the 5'-primer (5'-TCT AGA GGT GGT TCT GGT GGT TCT TCT GGT GGT GTC GAC AGC AAG GGC GAG GAG CTG TTC ACC GGG G-3', SEQ ID NO: 296) introduced a Nco I restriction site and the flexible linker (Gly-Gly-Ser)3 in frame with EGFP.
  • the 3'-primer (5'-A AGC TTA TTA GTG GTG ATG ATG GTG ATG AGA ACC ACC CTT GTA CAG CTC GTC CAT GCC GAG AGT G-3', SEQ ID NO: 297) introduced a Hind III restriction site, a poly-His tag in frame with EGFP and a stop codon.
  • the resulting DNA fragment was purified, double digested with Nco I and Hind III, and ligated into a Nco I, Hind III-treated pET25b plasmid (Novagen) to give pET25-EGFP.
  • Mature anionic rat trypsin gene was mutated at positions 16 (I16V) and 195 (S195A) using the plasmid pPicZalphaWTTg (generous gift from Dr. Teaster Baird Jr, SFSU) with a site-directed mutagenesis kit (Agilent biosystems) as per the manufacturer's protocol using the mutagenic primers (5'-GGA GAT ATA CAT ATG gtc GTT GGA GGA TAC ACC-3', SEQ ID NO: 298, and 5'-CAC AGG GCC ACC ggc GTC ACC CTG GCA GC-3', SEQ ID NO: 299, respectively). The mutations were confirmed by DNA sequencing.
  • Inactive mature anionic rat trypsin was amplified by PCR using the mutated plasmid pPicZalphaWTTg as template and the following primers: 5'-primer (5'-CAT ATG ATC GTT GGA GGA TAC ACC TGC C-3', SEQ ID NO: 300) and 3'-primer (5'-CC A TG GCG TTG GCA GCA ATT GTG TCC TG-3', SEQ ID NO: 301), which introduced Nde I and Nco I restriction sites, respectively.
  • the DNA fragment encoding inactive anionic rat trypsin fused to the N-terminus of EGFP was amplified by PCR using pET25-Tryp-EGFP as template.
  • the 5'-primer (5'-AAA CAT ATG GTC GTT GGA GGA TAC ACC TGC C-3', SEQ ID NO: 302) and 3'-primer (5'-TTT TGG TAC CAT TAG TGG TGA TGA TGA TGA TGA GAA CCA CCC-3', SEQ ID NO: 303) introduced Nde-I and Kpn-I restriction sites, respectively.
  • the DNA fragment was purified, double digested with Nde-I and Kpn-I and ligated into a Nde-I/Kpn-I-treated pRSF-duet (Novagen) to give plasmid pRSF-Tryp-EGFP.
  • E. coli BL21(DE3) or Origami(DE3) cells were transformed with plasmid pTXB1-MCoTI.[5, 27, 28] Expression was carried out in 2XYT medium (1 L) containing ampicillin (100 µg/L) at 30 °C for BL21(DE3) or room temperature for Origami(DE3) cells. Briefly, 5 mL of an overnight starter culture derived from a single clone was used to inoculate 1 L of 2XYT media. Cells were grown to an OD at 600 nm of ~0.6 at 37 °C.
  • Protein expression was induced by addition of isopropyl-β-D-thiogalactopyranoside (IPTG) to a final concentration of 0.3 mM at 30 °C for 4 h in BL21(DE3) cells and at room temperature overnight in Origami(DE3) cells. The cells were then harvested by centrifugation. For fusion protein purification, the cells were resuspended in 30 mL of lysis buffer (0.1 mM EDTA, 1 mM PMSF, 50 mM sodium phosphate, 250 mM NaCl buffer at pH 7.2 containing 5% glycerol) in the presence or absence of 20 mM ICH2CONH2 and then lysed by sonication.
  • the lysate was clarified by centrifugation at 15,000 rpm in a Sorvall SS-34 rotor for 30 min.
  • the clarified supernatant was incubated with chitin-beads (1-3 mL beads/L cells, New England Biolabs), previously equilibrated with column buffer (0.1 mM EDTA, 50 mM sodium phosphate, 250 mM NaCl buffer at pH 7.2) at 4 °C for 1 hour with gentle rocking.
  • the beads were extensively washed with 50 bed-volumes of column buffer (50 mM sodium phosphate, 0.1 mM EDTA, 250 mM NaCl buffer at pH 7.2) containing 0.1% Triton X100 and then rinsed and equilibrated with 50 bed-volumes of column buffer. Quantification of the precursor intein was carried out spectrophotometrically using an extinction coefficient per chain at 280 nm of 38,150 M⁻¹ cm⁻¹. The expression level for intein precursor 1a was ~40 mg/L.
  • E. coli BL21(DE3) or Origami(DE3) cells were co-transformed with plasmids pTXB1-MCoTI-stop and pVLOmeRS. Expression was carried out in 2XYT medium (1 L) containing ampicillin (100 µg/L) and chloramphenicol (35 µg/L) at 30 °C for BL21(DE3) or room temperature for Origami(DE3) cells. Cells were grown to an OD at 600 nm of ~0.2 at 37 °C, at which point 2 mM p-methoxy-phenylalanine (ChemPep Inc.) was added.
  • Protein expression was induced with IPTG when the OD at 600 nm was 0.6 as described above.
  • the cells were harvested and the intein precursor purified as described for intein construct 1a.
  • the expression level for intein precursor 1b was ~3 mg/L.
  • E. coli BL21(DE3) or Origami(DE3) cells were co-transformed with plasmids pTXB1-MCoTI-stop and pERAzi. Expression was carried out in 2XYT medium (1 L) containing ampicillin (100 µg/L) and chloramphenicol (35 µg/L) at 30 °C for BL21(DE3) or room temperature for Origami(DE3) cells. Cells were grown to an OD at 600 nm of ~0.2 at 37 °C, at which point 1 mM p-azido-phenylalanine (Chem-Impex International Inc.) was added.
  • Arabinose (0.02%) was added when the OD at 600 nm reached a value of 0.4 and 0.6, respectively, then protein expression was induced with IPTG as described for precursor 1a.
  • Cells were harvested and the intein precursor purified as described for precursor 1a.
  • the expression level for intein precursor 1c was ~7 mg/L. (Note: all steps involving proteins containing p-azido-phenylalanine need to be carried out in complete darkness to avoid photodecomposition of the aryl-azido group).
  • intein-fusion proteins 1a, 1b and 1c adsorbed on chitin beads (1 mL) were cleaved in freshly degassed column buffer containing 100 mM GSH (total volume 1.5 mL). The cleavage/cyclization reaction was kept for 20 h at 25 °C with gentle rocking. Once the cleavage was complete, the beads were filtered and analyzed by analytical HPLC. Folded cyclotide MCoTI-I was purified by semipreparative HPLC using a linear gradient of 10-30% solvent B over 30 min.
  • Trypsin-immobilized agarose beads were prepared as previously described. Briefly, NHS-activated Sepharose was washed with 15 volumes of ice-cold 1 mM HCl. Each volume of beads was incubated with an equal volume of coupling buffer (200 mM sodium phosphate, 250 mM NaCl buffer at pH 6) containing 2-4 mg of porcine pancreatic trypsin type IX-S (14,000 units/mg)/mL for 3 h with gentle rocking at room temperature. The beads were then rinsed with 10 volumes of coupling buffer, and incubated with excess coupling buffer containing 100 mM ethanolamine (Eastman Kodak) for 3 hours with gentle rocking at room temperature.
  • the sepharose-trypsin beads are stable for a month under these conditions.
  • Affinity purification of MCoTI-cyclotides was carried out as follows: 30 mL of clarified lysate was incubated with 500 µL of trypsin-sepharose for one hour at room temperature with gentle rocking, and centrifuged at 3000 rpm for 1 minute.
  • the beads were washed with 50 volumes of column buffer containing 0.1% Tween 20 and then rinsed with 50 volumes of column buffer without detergent.
  • the sepharose beads were treated with 3 x 0.5 mL of 8 M GdmCl at room temperature for 15 min and then eluted by gravity. The eluted fractions were analyzed by HPLC and ES-MS.
  • Origami (DE3) cells (Novagen) were transformed with pET28-TS-MCoTI.
  • Precursor intein 2a was expressed as previously described for 1a but in the presence of kanamycin (25 µg/L) instead of ampicillin. Cells were harvested and lysed as described above. MCoTI-I was purified from the cell lysate using sepharose-trypsin beads as described earlier.
  • Origami (DE3) cells (Novagen) were transformed with pET28-TS-MCoTI-stop and pVLOmeRS.
  • Precursor intein 2b was expressed as previously described for 1b but in the presence of kanamycin (25 µg/L) and chloramphenicol (35 µg/L).
  • Cells were harvested and lysed as described above.
  • MCoTI-OmeF was purified from the cell lysate using sepharose-trypsin beads as described earlier and characterized by LC-MS.
  • Origami (DE3) cells (Novagen) were transformed with pET28-TS-MCoTI-stop and pERAzi.
  • Precursor intein 2c was expressed as previously described for 1c but in the presence of kanamycin (25 µg/L) and chloramphenicol (35 µg/L).
  • Cells were harvested and lysed as described above.
  • MCoTI-aziF was purified from the cell lysate using sepharose-trypsin beads as described earlier and characterized by LC-MS.
  • AMCA-labeled MCoTI-aziF was eluted with 8 M GdmCl and analyzed by LC-MS and ES-MS [AMCA-labeled MCoTI-aziF; expected averaged mass 4159.11 Da, found mass 4159.3 ± 0.25 Da] (FIG. 9).
  • Origami(DE3) cells (Novagen) were transformed with plasmid pET25-Tryp-EGFP. Cells were grown in LB media containing ampicillin (100 µg/L) to an OD at 600 nm of ~0.62 at 37 °C. Protein expression was induced with 0.3 mM IPTG for 6 h at 30 °C. The cells were harvested by centrifugation, resuspended in 30 mL of lysis buffer (0.1 mM PMSF, 10 mM imidazole, 25 mM sodium phosphate, 150 mM NaCl buffer at pH 8.0 containing 5% glycerol) and lysed by sonication.
  • the lysate was clarified by centrifugation at 15,000 rpm in a Sorvall SS-34 rotor for 30 minutes.
  • the clarified supernatant was incubated with 1 mL of Ni-NTA agarose beads (Qiagen) previously equilibrated with column buffer (20 mM imidazole, 50 mM sodium phosphate, 300 mM NaCl buffer at pH 8.0) at 4°C for 1 hour with gentle rocking.
  • the Ni-NTA agarose beads were washed sequentially with column buffer containing (100 mL) followed by column buffer containing 20 mM imidazole (100 mL).
  • the fusion protein was eluted with 2 mL of column buffer containing 100 mM EDTA.
  • the protein was characterized as the desired product by ES-MS (FIG. 9). Quantification of Trypsin-S195A-EGFP was carried out spectrophotometrically using an extinction coefficient per chain at 484 nm of 56000 M⁻¹ cm⁻¹. Expression level of soluble protein was estimated at ~160 µg/L.
  • Origami(DE3) cells (Novagen) were co-transformed with plasmids pASK-TS-MCoTI-stop, pRSF-Tryp-EGFP and pERAzi. Expression was carried out in 2XYT medium (1 L) containing ampicillin (100 µg/L), chloramphenicol (35 µg/L) and kanamycin (25 µg/L) at room temperature for Origami(DE3) cells. Cells were grown to an OD at 600 nm of ~0.2 at 37 °C, at which point 1 mM p-azido-phenylalanine was added.
  • the dissociation constant between trypsin and AMCA-labeled MCoTI-aziF was measured by fluorescence polarization anisotropy at 25 °C using a Jobin Yvon/Spex Fluorolog 3 spectrofluorometer with the excitation bandwidth set at 5 nm and emission at 5 nm.
  • the excitation wavelength for coumarin was set at 360 nm and emission was monitored at 450 nm.
  • the equilibrium dissociation constant (Kd) for the interaction was obtained by titrating a fixed concentration of AMCA-labeled MCoTI-aziF (5 nM) with increasing concentrations of trypsin in 0.5 mM EDTA, 50 mM sodium phosphate, 150 mM NaCl at pH 7 by assuming formation of a 1:1 complex.
  • the calculated Kd value was 4.5 ± 0.7 nM (FIG. 11).
  • NMR samples were prepared by dissolving cyclotides into 80 mM potassium phosphate pH 6.0 in 90% H2O/10% 2H2O (v/v) to a concentration of approximately 0.5 mM for MCoTI-I and 0.1 mM for MCoTI-OmeF. All 1H NMR data were recorded on either Bruker Avance III 500 MHz or Bruker Avance II 700 MHz spectrometers equipped with TCI cryoprobes. Data were acquired at 298 K, and 2,2-dimethyl-2-silapentane-5-sulfonate, DSS, was used as an internal reference.
  • the carrier frequency was centered on the water signal, and the solvent was suppressed by using WATERGATE pulse sequence.
  • 1H,1H-TOCSY (spin lock time 80 ms) and 1H,1H-NOESY (mixing time 150 ms) spectra were collected using 4096 t2 points and 256 t1 of 64 transients. Spectra were processed using Topspin 1.3 (Bruker). Each 2D-data set was apodized by 90°-shifted sinebell-squared in all dimensions, and zero filled to 4096 x 512 points prior to Fourier transformation. Assignments for Hα and H' protons of folded MCoTI-cyclotides (Table 2) were obtained using standard procedures.
  • AAA ATC CTG CAG CGT TGC CGT CGT GAC TCT GAC TGC CCG GGT GCT TGC ATC TGC CGT GGT AAC GGT TAC TGT TTA TCA - 3 ' (SEQ ID NO: 288) p3 5 '-TA TGA TAA ACA GTA ACC GTT ACC ACG GCA GAT GCA AGC ACC CGG GCA GTC AGA GTC ACG ACG GCA ACG CTG CAG GAT TTT CGG GCA AAC ACC ACC GTC AGA ACC AGA ACC GCA GTT -3 ' (SEQ ID NO: 289)
  • This example tested the feasibility of introducing Uaas into folded cyclotides in living cells. The example used the cyclotide MCoTI-I (FIG. 1A).
  • This cyclotide is a powerful trypsin inhibitor (Ki ≈ 20 pM) that has been recently isolated from dormant seeds of Momordica cochinchinensis, a plant member of the Cucurbitaceae family. Trypsin inhibitor cyclotides are interesting candidates for drug design because their specificity for inhibition can be altered and their structures can be used as natural scaffolds to generate novel binding activities.
  • Intein-precursors were expressed in 2XYT medium at 30 °C for 4 h in the presence of 1 mM AziF or 2 mM OmeF. These conditions were optimized for the expression of the wild-type MCoTI-intein precursor in BL21(DE3) cells. In both cases, the expression level of the intein precursors containing Uaas (1b and 1c) was similar (FIG. 5). The suppression efficiency was estimated to be ~10% (MCoTI-OmeF precursor, 1b) and ~20% (MCoTI-aziF precursor, 1c) compared to the expression of the wild-type MCoTI-I intein precursor 1a (~40 mg/L).
  • this example tested the ability of the different intein-MCoTI precursors to produce the corresponding folded cyclotide by treatment with reduced glutathione (GSH) at pH 7.2 following the conditions optimized for MCoTI-cyclotides.
  • the in vitro reaction was clean and efficient in providing major products as analyzed by analytical HPLC (FIG. 5).
  • ES-MS electrospray mass spectrometry
  • the cyclotide MCoTI-OmeF was also characterized by homonuclear NMR spectroscopy to confirm the native cyclotide scaffold was intact (FIG. 7).
  • the final yield after purification was 4 µg/L (MCoTI-OmeF) and 14 µg/L (MCoTI-aziF).
  • the expression yield for the wild-type MCoTI-I using these expression and cyclization conditions was ~48 µg/L after purification.
  • this example explored the expression of the MCoTI-OmeF and MCoTI-aziF cyclotides inside bacterial cells using EPL-mediated cyclization. This was accomplished by expressing the corresponding intein-precursor in Origami2(DE3) cells. These cells have mutations in the thioredoxin and glutathione reductase genes, which facilitate the formation of disulfide bonds in the bacterial cytosol. Wild-type MCoTI-I was expressed in-cell, reaching intracellular concentrations of ~1 µM. When we tried this approach with the cyclotides MCoTI-OmeF and MCoTI-aziF, however, the amount of folded cyclotides was below the detection limit.
  • Protein trans-splicing is a post-translational modification similar to protein splicing with the difference that the intein self-processing domain is split into N- (IN) and C-intein (IC) fragments.
  • split-intein fragments are not active individually, however, they can bind to each other with high specificity under appropriate conditions to form an active protein splicing or intein domain in trans.[40]
  • PTS-mediated backbone cyclization can be accomplished by rearranging the order of the intein fragments. By fusing the IN and IC fragments to the C- and N-termini of the polypeptide for cyclization, the trans-splicing reaction yields a backbone-cyclized polypeptide (FIG. 3).
  • this example used the Nostoc punctiforme PCC73102 (Npu) DnaE split-intein.
  • This DnaE intein has the highest reported rate of protein trans-splicing (t1/2 ≈ 60 s) and has a high splicing yield.
  • this example explored the ability of the Npu DnaE split-intein to produce folded wild-type MCoTI-I cyclotide inside living E. coli cells.
  • the example designed the split-intein construct 2a (FIG. 1B).
  • the MCoTI-I linear precursor was fused in-frame at the C- and N-termini directly to the Npu DnaE IN and IC polypeptides.
  • the precursor was expressed at very high levels (~70 mg/L) and was almost completely cleaved (~95% in vivo cleavage, FIG. 2A). Reducing the induction time for the expression of the precursor 2a did not significantly decrease the level of in vivo cleavage, indicating the inherent ability of the construct to undergo protein trans-splicing. The high reactivity of this precursor prevented us from performing a full characterization of the precursor protein including kinetic studies of the trans-splicing induced reaction in vitro.
  • this example tried to isolate the natively folded MCoTI-I generated in-cell by incubating the soluble fraction of a fresh cell lysate with trypsin-immobilized sepharose beads.
  • Correctly folded MCoTI-cyclotides are able to bind trypsin with high affinity (Ki ≈ 20-30 pM). Therefore, this step can be used for affinity purification and to test the biological activity of the recombinant cyclotides. After extensive washing, the adsorbed products were eluted with a solution containing 8 M guanidinium chloride (GdmCl) and analyzed by HPLC.
  • the HPLC analysis revealed the presence of a major peak that had the expected mass of the native MCoTI-I fold (FIG. 2B and 6).
  • Recombinant MCoTI-I produced by PTS-mediated cyclization was also characterized by NMR spectroscopy (FIG. 6 and Table 3) and was shown to adopt the native MCoTI-I fold.
  • the in-cell expression level of folded MCoTI-I produced by PTS-mediated cyclization was estimated to be ~70 µg/L of bacterial culture, which corresponds to an intracellular concentration of ~7 µM.
  • the trans-splicing reaction is also extremely fast (t1/2 ≈ 60 s for the Npu DnaE intein).
  • EPL-mediated cyclization follows a slightly more complex mechanism that relies on the formation of the C-terminal thioester at the N-extein junction and the removal of the N-terminal leading sequence (a Met residue in this case) to provide an N-terminal Cys. These two groups then react to form a peptide bond between the N- and C-termini of the polypeptide.
  • the Npu ortholog used in this work tolerates different sequences at both junctions as demonstrated by the efficient trans-splicing of precursor 2a (FIG. 2A).
  • the tetrapeptide sequences at both intein-extein junctions in construct 2a have only 20% sequence homology with the native sequences of both Npu DnaE exteins.
  • Constructs 2b and 2c are similar to 2a but were designed to incorporate Uaas into residue Asp 14 in MCoTI-I (FIG. 1B).
  • In-cell trans-splicing for 2b and 2c was also similar (~90%, FIG. 2A) to that of the wild-type PTS construct 2a.
  • Cyclotides MCoTI-OmeF and MCoTI-aziF were purified by affinity chromatography using trypsin sepharose beads from fresh soluble cell lysates, and the trypsin-bound fractions were analyzed by LC-MS/MS and ES-MS (Figs. 2C and S3). Cyclotide MCoTI-OmeF generated in-cell by PTS was also characterized by NMR, confirming the adoption of a native cyclotide fold (FIG. 2E and 8).
  • the in-cell expression levels for cyclotides MCoTI-OmeF and MCoTI-aziF were estimated to be ~1 µg/L and ~2 µg/L, corresponding to intracellular concentrations of ~0.1 µM and 0.17 µM, respectively.
  • AMCA-labeled MCoTI-aziF was also able to bind trypsin efficiently (Kd of 4.5 ± 0.7 nM, FIG. 11).
  • trypsin was fused to the N-terminus of green fluorescent protein (GFP).
  • AMCA and GFP show a good overlap between the emission band of the donor (AMCA) and the absorption band of the acceptor (GFP), which should allow the visualization of the molecular interaction by fluorescence resonance energy transfer (FRET).
  • MCoTI-aziF and trypsin-S195A-GFP were encoded in inducible plasmids under the control of the tetracycline (pASK) and T7 (pRSF) promoters, respectively, to facilitate the co-expression of both proteins.
  • Co-transformed Origami(DE3) cells were first induced with 0.02% arabinose and 200 ng/L anhydrotetracycline in the presence of aziF to produce MCoTI-aziF.
  • the amount of MCoTI-aziF produced under these conditions was similar to the value obtained when expressed under the control of a T7 promoter.
  • the cells were incubated with DBCO-AMCA in PBS for 4h at 37 °C.
  • In-cell AMCA-labeling of MCoTI-aziF was monitored through LC-MS indicating that under these conditions all the MCoTI-aziF produced inside the cells reacted with DBCO-AMCA.
  • the cells were washed again with PBS to remove unreacted DBCO-AMCA, resuspended in M9 and induced with isopropyl β-D-1-thiogalactopyranoside (IPTG) for 18 h at room temperature to induce the expression of trypsin-S195A-GFP.
  • the intracellular concentration of trypsin-S195A-GFP was practically not affected by the expression and labeling of MCoTI-aziF and was estimated to be ~1 µM.
  • the in-cell interaction between AMCA-labeled MCoTI-aziF and trypsin-S195A-EGFP was first analyzed by fluorescence spectroscopy.
  • the fluorescence spectrum of the live cells revealed the presence of a strong FRET emission signal at 520 nm upon excitation of the AMCA fluorophore at 360 nm indicating the intracellular formation of the trypsin-MCoTI complex.
  • the presence of complex was also confirmed by flow cytometry.
  • the intracellular FRET efficiency (~0.6), calculated as the ratio between the fluorescence signal of the acceptor fluorophore excited at 360 nm and 415 nm, was also consistent with the dissociation constant for this molecular complex and the intracellular concentrations of trypsin and MCoTI-I. More importantly, these results show that in-cell produced fluorescently labeled cyclotides can be used for monitoring and/or screening intracellular biomolecular interactions using fluorescence-based readout platforms.
  • this example shows that the biosynthesis of cyclotides containing Uaas can be achieved by using different intein-based methods.
  • EPL-backbone cyclization can provide Uaa-containing cyclotides when the cyclization is carried out in vitro by GSH-induced cyclization and folding of the corresponding precursor. In-cell production, however, is less efficient using this method.
  • PTS-mediated backbone cyclization using the highly efficient Npu DnaE split-intein can be employed for the efficient production of cyclotides inside live E. coli cells.
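
The fluorescence anisotropy titration described above fixes AMCA-labeled MCoTI-aziF at 5 nM and assumes a 1:1 complex with trypsin (Kd of about 4.5 nM). As a rough illustration only, and not the fitting procedure actually used in the example, the following Python sketch computes the fraction of labeled cyclotide bound at a given total trypsin concentration from the exact quadratic solution of the 1:1 mass-action equation. The function name and the titration points are hypothetical.

```python
import math

def fraction_bound(probe_nM: float, partner_nM: float, kd_nM: float) -> float:
    """Fraction of probe in complex for a 1:1 binding equilibrium.

    Solves c^2 - (P + T + Kd) * c + P * T = 0 for the complex
    concentration c (smaller root), where P and T are the total
    probe and partner concentrations.
    """
    if probe_nM <= 0 or partner_nM < 0 or kd_nM <= 0:
        raise ValueError("concentrations must be positive (partner may be 0)")
    total = probe_nM + partner_nM + kd_nM
    complex_nM = (total - math.sqrt(total ** 2 - 4.0 * probe_nM * partner_nM)) / 2.0
    return complex_nM / probe_nM

KD_NM = 4.5      # dissociation constant reported in the text
PROBE_NM = 5.0   # fixed AMCA-labeled MCoTI-aziF concentration from the text

if __name__ == "__main__":
    # Hypothetical trypsin concentrations, chosen only to span the curve.
    for trypsin_nM in (1.0, 5.0, 50.0, 500.0):
        bound = fraction_bound(PROBE_NM, trypsin_nM, KD_NM)
        print(f"trypsin {trypsin_nM:6.1f} nM -> fraction bound {bound:.3f}")
```

The quadratic form is used instead of the simpler hyperbolic isotherm because the probe concentration (5 nM) is comparable to Kd, so depletion of free trypsin cannot be neglected.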

Abstract

Provided are compositions and methods for preparing a cyclotide, in particular a cyclotide with one or more unnatural amino acids. Also provided is a polypeptide comprising a linear cyclotide fused to a C-terminal fragment and an N-terminal fragment of a split intein, at the N-terminus and C-terminus of the cyclotide, respectively. Methods of establishing screening libraries of cyclotides and use of the library for identifying pharmaceutical, therapeutic or cosmetic agents are also provided.

Description

PREPARATION OF CYCLOTIDES
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit under 35 U.S.C. § 119(e) of U.S. Provisional Application No. 61/703,108, filed September 19, 2012, the contents of which are incorporated herein by reference in their entirety.
STATEMENT OF GOVERNMENT SUPPORT
[0002] This invention was made with government support under Grant No. R01 GM090323 awarded by the National Institutes of Health and Grant No. W81XWH-10-1-0151 awarded by the Department of Defense, Congressionally Directed Medical Research Program, Prostate Cancer Research Program. The government has certain rights in the invention.
BACKGROUND
[0003] Cyclotides are fascinating natural plant micro-proteins that range from 28 to 37 amino acid residues in length and exhibit various biological actions including anti-microbial, insecticidal, cytotoxic, antiviral (against HIV), protease inhibitory, and hormone-like activities. They share a unique head-to-tail circular knotted topology of three disulfide bridges; one disulfide penetrates through a macrocycle formed by the other two disulfides, thereby inter-connecting the peptide backbone to form what is called a cystine knot topology (FIG. 1). This cyclic cystine knot (CCK) framework gives the cyclotides exceptional rigidity, resistance to thermal and chemical denaturation, and enzymatic stability against degradation. In fact, some cyclotides have been shown to be orally bioavailable. For example, the first cyclotide to be discovered, kalata B1, was found to be an orally effective uterotonic, and other cyclotides have been shown to cross the cell membrane through macropinocytosis. All of these features make cyclotides ideal tools for drug development.
[0004] Cyclotides have been isolated from plants in the Rubiaceae, Violaceae, Cucurbitaceae, and, most recently, Fabaceae families. Around 200 different cyclotide sequences have been reported in the literature, although it has been estimated that ~50,000 cyclotides may exist. Despite sequence diversity, all cyclotides share the same CCK motif (FIG. 1A). Hence, these micro-proteins can be considered natural combinatorial peptide libraries that are structurally constrained by the cystine-knot scaffold and head-to-tail cyclization and in which, with the exception of the strictly conserved cysteines comprising the cystine knot, hypermutation of essentially all residues is permitted. Cyclotides can be chemically synthesized, thereby permitting the introduction of specific chemical modifications or biophysical probes. More importantly, cyclotides can now be biosynthesized in bacterial cells using a biomimetic approach that involves the use of modified protein splicing units. These characteristics make them ideal substrates for the production of genetically-encoded libraries based on the cyclotide framework. These cell-based libraries allow in-cell molecular evolution strategies to enable the generation and high throughput selection of compounds with optimal binding and inhibitory characteristics. In contrast to chemically generated libraries, genetically-encoded libraries enable the facile generation and screening of very large combinatorial libraries of molecules.
SUMMARY
[0005] One embodiment of the present disclosure provides an isolated or recombinant polypeptide comprising a linear cyclotide fused to a C-terminal fragment and an N-terminal fragment of a split intein, at the N-terminus and C-terminus of the cyclotide, respectively.
[0006] In one aspect, the split intein comprises a DnaE split intein. In one aspect, the DnaE split intein comprises a Nostoc punctiforme PCC73102 DnaE split intein.
[0007] In one aspect, the C-terminal fragment comprises an amino acid sequence of SEQ ID NO: 2. In one aspect, the N-terminal fragment comprises an amino acid sequence of SEQ ID NO: 3.
[0008] In one aspect, the cyclotide comprises an amino acid sequence selected from Table 1 or an amino acid sequence that has at least about 90% sequence identity thereto.
[0009] In some aspects, the cyclotide comprises at least an unnatural amino acid residue but retains six cysteine residues that form three disulfide bonds in a cyclized cyclotide. In one aspect, the unnatural amino acid comprises one or more selected from p-methoxyphenylalanine, p-azidophenylalanine or L-(7-hydroxycoumarin-4-yl)ethylglycine.
[0010] In one aspect, the cyclotide comprises an amino acid sequence of SEQ ID NO: 1.
[0011] Also provided is a method for preparing a cyclic peptide, comprising incubating a polypeptide of any of the above embodiments under conditions for the linear cyclotide to cyclize.
[0012] Still also provided is an isolated polynucleotide comprising a nucleic acid sequence encoding a polypeptide of any of the above embodiments or a biological equivalent thereof or a polynucleotide that hybridizes under conditions of high stringency to the polynucleotide or its complement.
[0013] The present disclosure, in one embodiment, provides a method for preparing a cyclic peptide comprising expressing a linear cyclotide in a cell and cyclizing the linear cyclotide, wherein the cyclotide comprises at least an unnatural amino acid but retains six cysteine residues to form three disulfide bonds. Another embodiment provides a cyclized cyclotide obtainable by such a method. In one aspect, the unnatural amino acid is modified with an agent comprising a detectable label.
[0014] Vectors and host cells comprising polynucleotides and compositions containing any of the polynucleotides or polypeptides are also provided.
BRIEF DESCRIPTION OF THE FIGURES
[0015] FIG. 1A shows the tertiary structure of the cyclotide MCoTI-II (PDB code: 1IB9) and primary structures of cyclotides used in Example 1. The backbone cyclized peptide (connecting bond shown in light gray) is stabilized by the three disulfide bonds (shown in dark gray).
[0016] FIG. 1B illustrates intein precursors used for the expression of cyclotides produced in Example 1. The unnatural amino acid (Uaa) was introduced at position 14, which is in the middle of loop 2 and is marked with an "X".
[0017] FIG. 2A-D show in-cell expression of MCoTI-I based cyclotides in E. coli cells using Npu DnaE intein-mediated protein trans-splicing (PTS). A. SDS-PAGE analysis of the recombinant expression of cyclotide precursors 2a and 2b in Origami2 (DE3) cells for in-cell production of the cyclotides MCoTI-I and MCoTI-OmeF, respectively. B. Analytical HPLC trace (left panel) of the soluble cell extract of bacterial cells expressing precursor 2a (MCoTI-I) after purification by affinity chromatography on a trypsin-sepharose column. Folded MCoTI-I is marked with an arrow. Endogenous bacterial proteins that bind trypsin are marked with an asterisk. Mass spectrum (right panel) of affinity purified MCoTI-I. The expected average molecular weight is shown in parentheses. C. Analytical HPLC-MS/MS trace of the soluble cell extract of bacterial cells expressing precursor 2b (MCoTI-OmeF) and 2c (MCoTI-AziF). D. Summary of the 1H NMR assignments for the backbone protons of MCoTI-OmeF produced in-cell by PTS: Δδ 1H are the deviations in the chemical shifts of the main chain protons between the values obtained for MCoTI-OmeF and MCoTI-I (Table 3). Assignments for residue 14 were not included in the graph.
[0018] FIG. 3 presents a scheme for in-cell expression of native folded cyclotides using intein-mediated protein trans-splicing.
[0019] FIG. 4A-C show synthesis of DBCO-AMCA. (A) A synthetic scheme for the production of DBCO-AMCA from 6-((7-amino-4-methylcoumarin-3-acetyl)-amino)-hexanoic acid succinimidyl ester (AMCA-X) and 5,6-dihydro-11,12-didehydrodibenzo-[b,f]-azocino-3-oxopropyl-4-amine (DBCO-NH2). (B) C18 RP-HPLC trace of purified DBCO-AMCA. Linear gradient from 0% B to 100% B over 30 min. Detection was performed at 360 nm. (C) ES-MS spectrum of purified DBCO-AMCA.
[0020] FIG. 5A-C show in vitro production of cyclotides MCoTI-OmeF and MCoTI-aziF by expressed protein ligation (EPL). A. SDS-PAGE analysis of EPL-intein precursors 1b and 1c expressed in BL21(DE3) cells. B. Analytical HPLC traces of the GSH-induced cyclization/folding crudes for precursors 1b and 1c. Cyclotides MCoTI-OmeF and MCoTI-aziF are marked with an arrow. C. ES-MS spectra for purified cyclotides MCoTI-OmeF and MCoTI-aziF. The mass observed for MCoTI-aziF corresponds to the photodegradation product (p-amino-phenylalanine derivative).
[0021] FIG. 6 presents the 1H{1H}-NOESY spectrum of cyclotide MCoTI-I produced in-cell by PTS.
[0022] FIG. 7 shows ES-MS spectra of purified MCoTI-OmeF (left) and MCoTI-AziF (right).
[0023] FIG. 8A-B show NMR characterization of cyclotide MCoTI-OmeF. (A) 1H{1H}-TOCSY shows the amino acid assignments of MCoTI-OmeF. (B) Aromatic region of 1H{1H}-NOESY shows the assignments of the p-MeO-Phe14 side chain of MCoTI-OmeF.
[0024] FIG. 9 shows in vitro labeling of MCoTI-aziF with DBCO-AMCA through copper-free click chemistry. The ligation reactions were analyzed by LC-MS/MS (right panel). Product was characterized by ES-MS (left panel).
[0025] FIG. 10 shows in vivo labeling of MCoTI-AziF with DBCO-AMCA. Cyclotides MCoTI-I and MCoTI-aziF were expressed in Origami(DE3) cells transformed with plasmid pERAzi as described above. In both cases the cells were incubated with DBCO-AMCA (0.5 μM) in PBS for 4 h. The cells were then washed with PBS until no AMCA was detected in the washes, and analyzed by fluorescence microscopy. Bar size corresponds to 10 μm.
[0026] FIG. 11 shows direct binding of AMCA-labeled MCoTI-AziF to trypsin, measured by fluorescence polarization anisotropy by exciting AMCA at 360 nm and reading the fluorescence polarization anisotropy at 450 nm. Binding experiments were performed in phosphate buffer at pH 7.4 at room temperature by titrating AMCA-labeled MCoTI-AziF (5 nM) with increasing amounts of trypsin.
[0027] FIG. 12A presents data from direct binding of AMCA-labeled MCoTI-AziF to trypsin-S195A-GFP, measured by the increment in the FRET signal at 415 nm when exciting at 360 nm.
[0028] FIG. 12B shows results from binding experiments performed in phosphate buffer at pH 7.4 at room temperature by titrating AMCA-labeled MCoTI-AziF (170 nM) with increasing amounts of trypsin-S195A-GFP (0.25 nM - 300 nM). B. Fluorescence spectra of MCoTI-AziF (170 nM) and trypsin-S195A-GFP at different concentrations (0.25 nM - 300 nM). Background fluorescence was subtracted in all cases.
DETAILED DESCRIPTION OF THE EMBODIMENTS
[0029] Before the compositions and methods are described, it is to be understood that the invention is not limited to the particular methodologies, protocols, cell lines, assays, and reagents described, as these may vary. It is also to be understood that the terminology used herein is intended to describe particular embodiments of the present invention, and is in no way intended to limit the scope of the present invention as set forth in the appended claims.
[0030] The practice of the present invention will employ, unless otherwise indicated, conventional techniques of tissue culture, immunology, molecular biology, microbiology, cell biology and recombinant DNA, which are within the skill of the art. See, e.g., Sambrook and Russell eds. (2001) Molecular Cloning: A Laboratory Manual, 3rd edition; the series Ausubel et al. eds. (2007) Current Protocols in Molecular Biology; the series Methods in Enzymology (Academic Press, Inc., N.Y.); MacPherson et al. (1991) PCR 1: A Practical Approach (IRL Press at Oxford University Press); MacPherson et al. (1995) PCR 2: A Practical Approach; Harlow and Lane eds. (1999) Antibodies, A Laboratory Manual; Freshney (2005) Culture of Animal Cells: A Manual of Basic Technique, 5th edition; Gait ed. (1984) Oligonucleotide Synthesis; U.S. Patent No. 4,683,195; Hames and Higgins eds. (1984) Nucleic Acid Hybridization; Anderson (1999) Nucleic Acid Hybridization; Hames and Higgins eds. (1984) Transcription and Translation; Immobilized Cells and Enzymes (IRL Press (1986)); Perbal (1984) A Practical Guide to Molecular Cloning; Miller and Calos eds. (1987) Gene Transfer Vectors for Mammalian Cells (Cold Spring Harbor Laboratory); Makrides ed. (2003) Gene Transfer and Expression in Mammalian Cells; Mayer and Walker eds. (1987) Immunochemical Methods in Cell and Molecular Biology (Academic Press, London); Herzenberg et al. eds. (1996) Weir's Handbook of Experimental Immunology; Manipulating the Mouse Embryo: A Laboratory Manual, 3rd edition (Cold Spring Harbor Laboratory Press (2002)); Current Protocols In Molecular Biology (F. M. Ausubel, et al. eds., (1987)); the series Methods in Enzymology (Academic Press, Inc.); PCR 2: A Practical Approach (M.J. MacPherson, B.D. Hames and G.R. Taylor eds. (1995)); Harlow and Lane, eds. (1988) Antibodies, A Laboratory Manual; Harlow and Lane, eds. (1999) Using Antibodies, A Laboratory Manual; Animal Cell Culture (R.I. Freshney, ed. (1987)); Zigova, Sanberg and Sanchez-Ramos, eds. (2002) Neural Stem Cells.
[0031] All numerical designations, e.g., pH, temperature, time, concentration, and molecular weight, including ranges, are approximations which are varied (+) or (-) by increments of 0.1 or 1 where appropriate. It is to be understood, although not always explicitly stated, that all numerical designations are preceded by the term "about". The term "about" also includes the exact value "X" in addition to minor increments of "X" such as "X + 0.1 or 1" or "X - 0.1 or 1," where appropriate. It also is to be understood, although not always explicitly stated, that the reagents described herein are merely exemplary and that equivalents of such are known in the art.
[0032] As will be understood by one skilled in the art, for any and all purposes, particularly in terms of providing a written description, all ranges disclosed herein also encompass any and all possible subranges and combinations of subranges thereof. Any listed range can be easily recognized as sufficiently describing and enabling the same range being broken down into at least equal halves, thirds, quarters, fifths, tenths, etc. As a non-limiting example, each range discussed herein can be readily broken down into a lower third, middle third and upper third, etc. As will also be understood by one skilled in the art all language such as "up to," "at least," "greater than," "less than," and the like include the number recited and refer to ranges which can be subsequently broken down into subranges as discussed above.
[0033] Throughout and within this disclosure the patent and technical literature is identified by a bibliographic citation or by an Arabic numeral. The bibliographic citations for these references are found in this disclosure immediately preceding the claims. All references disclosed here are incorporated by reference to more fully describe the state of the art to which this invention pertains.
Definitions
[0034] As used in the specification and claims, the singular forms "a", "an" and "the" include plural references unless the context clearly dictates otherwise. For example, the term "a cell" includes a plurality of cells, including mixtures thereof.
[0035] As used herein, the term "comprising" is intended to mean that the compositions and methods include the recited elements, but do not exclude others. "Consisting essentially of" when used to define compositions and methods, shall mean excluding other elements of any essential significance to the combination for the stated purpose. Thus, a composition consisting essentially of the elements as defined herein would not exclude trace contaminants from the isolation and purification method and pharmaceutically acceptable carriers, such as phosphate buffered saline, preservatives and the like. "Consisting of" shall mean excluding more than trace elements of other ingredients and substantial method steps for administering the compositions of this invention or process steps to produce a composition or achieve an intended result. Embodiments defined by each of these transition terms are within the scope of this invention.
[0036] The term "isolated" as used herein with respect to cells, nucleic acids, such as DNA or RNA, refers to molecules separated from other DNAs or RNAs, respectively, that are present in the natural source of the macromolecule. The term "isolated" as used herein also refers to a nucleic acid or peptide that is substantially free of cellular material, viral material, or culture medium when produced by recombinant DNA techniques, or chemical precursors or other chemicals when chemically synthesized. Moreover, an "isolated nucleic acid" is meant to include nucleic acid fragments which are not naturally occurring as fragments and would not be found in the natural state. The term "isolated" is also used herein to refer to cells or polypeptides which are isolated from other cellular proteins or tissues. Isolated polypeptides are meant to encompass both purified and recombinant polypeptides.
[0037] As used herein, the term "recombinant" as it pertains to polypeptides or polynucleotides intends a form of the polypeptide or polynucleotide that does not exist naturally, a non-limiting example of which can be created by combining polynucleotides or polypeptides that would not normally occur together.
[0038] "Cells," "host cells" or "recombinant host cells" are terms used interchangeably herein. It is understood that such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.
[0039] As used herein, a "vector" is a vehicle for transferring genetic material into a cell. Examples of such include, but are not limited to, plasmids and viral vectors. A viral vector is a virus that has been modified to transduce genetic material into a cell. A plasmid vector is made by splicing a DNA construct into a plasmid. As is apparent to those of skill in the art, the appropriate regulatory elements are included in the vectors to guide replication and/or expression of the genetic material in the selected host cell.
[0040] "Homology" or "identity" or "similarity" refers to sequence similarity between two peptides or between two nucleic acid molecules. Homology can be determined by comparing a position in each sequence which may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same base or amino acid, then the molecules are homologous at that position. A degree of homology between sequences is a function of the number of matching or homologous positions shared by the sequences. An "unrelated" or "nonhomologous" sequence shares less than 40% identity, though preferably less than 25% identity, with one of the sequences of the present invention.
[0041] A polynucleotide or polynucleotide region (or a polypeptide or polypeptide region) having a certain percentage (for example, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%) of "sequence identity" to another sequence means that, when aligned, that percentage of bases (or amino acids) are the same in comparing the two sequences. This alignment and the percent homology or sequence identity can be determined using software programs known in the art, for example those described in Ausubel et al. eds. (2007) Current Protocols in Molecular Biology. Preferably, default parameters are used for alignment. One alignment program is BLAST, using default parameters. In particular, programs are BLASTN and BLASTP, using the following default parameters: Genetic code = standard; filter = none; strand = both; cutoff = 60; expect = 10; Matrix = BLOSUM62; Descriptions = 50 sequences; sort by = HIGH SCORE; Databases = non-redundant, GenBank + EMBL + DDBJ + PDB + GenBank CDS translations + SwissProtein + SPupdate + PIR. Details of these programs can be found at the following Internet address: http://www.ncbi.nlm.nih.gov/blast/Blast.cgi, last accessed on May 21, 2008. Biologically equivalent polynucleotides are those having the above-noted specified percent homology and encoding a polypeptide having the same or similar biological activity.
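As an illustration of the "sequence identity" definition above, the following is a minimal Python sketch that counts matching positions in two sequences that have already been aligned. The function name `percent_identity`, the gap character, and the choice of denominator (all aligned columns) are assumptions made for this example; alignment programs such as BLAST perform the alignment itself and may report identity over a different length. The two example sequences are the Table 1 entries with SEQ ID NOs: 42 and 38.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of aligned columns in which both sequences carry the same residue.

    Both inputs must already be aligned (equal length, gaps written as `gap`).
    The denominator is the number of columns that are not gap-vs-gap; this is
    an assumption of the sketch, not a convention taken from the disclosure.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no aligned positions to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Table 1 entries of equal length, so no gaps are needed for this comparison.
SEQ_42 = "GLPVCGETCVGGTCNTPGCTCSWPVCTRN"
SEQ_38 = "GLPVCGETCVGGTCNTPGCSCSWPVCTRN"

identity = percent_identity(SEQ_42, SEQ_38)
print(f"{identity:.2f}% identity")
```

The two sequences differ at a single position out of 29, so this pair would satisfy an "at least about 90% sequence identity" criterion under the assumptions of the sketch.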
[0042] The term "an equivalent nucleic acid or polynucleotide" refers to a nucleic acid having a nucleotide sequence having a certain degree of homology with the nucleotide sequence of the nucleic acid or complement thereof. A homolog of a double stranded nucleic acid is intended to include nucleic acids having a nucleotide sequence which has a certain degree of homology with the nucleic acid or with the complement thereof. In one aspect, homologs of nucleic acids are capable of hybridizing to the nucleic acid or complement thereof.
[0043] Hybridization reactions can be performed under conditions of different "stringency". In general, a low stringency hybridization reaction is carried out at about 40°C in about 10 x SSC or a solution of equivalent ionic strength/temperature. A moderate stringency hybridization is typically performed at about 50°C in about 6 x SSC, and a high stringency hybridization reaction is generally performed at about 60°C in about 1 x SSC. Hybridization reactions can also be performed under "physiological conditions", which are well known to one of skill in the art. A non-limiting example of a physiological condition is the temperature, ionic strength, pH and concentration of Mg2+ normally found in a cell.
[0044] As used herein, the term "oligonucleotide" refers to polynucleotides such as deoxyribonucleic acid (DNA), and, where appropriate, ribonucleic acid (RNA). The term should also be understood to include, as equivalents, derivatives, variants and analogs of either RNA or DNA made from nucleotide analogs, and, as applicable to the embodiment being described, single (sense or antisense) and double-stranded polynucleotides. Deoxyribonucleotides include deoxyadenosine, deoxycytidine, deoxyguanosine, and deoxythymidine. For purposes of clarity, when referring herein to a nucleotide of a nucleic acid, which can be DNA or an RNA, the terms "adenosine", "cytidine", "guanosine", and "thymidine" are used. It is understood that if the nucleic acid is RNA, a nucleotide having a uracil base is uridine.
[0045] The terms "polynucleotide" and "oligonucleotide" are used interchangeably and refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides or analogs thereof. Polynucleotides can have any three-dimensional structure and may perform any function, known or unknown. The following are non-limiting examples of polynucleotides: a gene or gene fragment (for example, a probe, primer, EST or SAGE tag), exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, ribozymes, cDNA, dsRNA, siRNA, miRNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes and primers. A polynucleotide can comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs. If present, modifications to the nucleotide structure can be imparted before or after assembly of the polynucleotide. The sequence of nucleotides can be interrupted by non-nucleotide components. A polynucleotide can be further modified after polymerization, such as by conjugation with a labeling component. The term also refers to both double- and single-stranded molecules. Unless otherwise specified or required, any embodiment of this invention that is a polynucleotide encompasses both the double-stranded form and each of two complementary single-stranded forms known or predicted to make up the double-stranded form.
[0046] A polynucleotide is composed of a specific sequence of four nucleotide bases: adenine (A); cytosine (C); guanine (G); thymine (T); and uracil (U) for thymine when the polynucleotide is RNA. Thus, the term "polynucleotide sequence" is the alphabetical representation of a polynucleotide molecule. This alphabetical representation can be input into databases in a computer having a central processing unit and used for bioinformatics applications such as functional genomics and homology searching. The term "polymorphism" refers to the coexistence of more than one form of a gene or portion thereof. A portion of a gene of which there are at least two different forms, i.e., two different nucleotide sequences, is referred to as a "polymorphic region of a gene". A polymorphic region can be a single nucleotide, the identity of which differs in different alleles.
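The alphabetical representation described above lends itself to simple string handling. The following Python sketch, offered only as an illustration, converts a DNA-alphabet string to the corresponding RNA-alphabet string by writing uracil (U) for thymine (T). The function name `to_rna` and the strict rejection of any letter other than A, C, G and T are assumptions of this example.

```python
DNA_BASES = frozenset("ACGT")


def to_rna(dna: str) -> str:
    """Return the RNA-alphabet form of a DNA-alphabet sequence (U written for T)."""
    seq = dna.upper()
    unknown = set(seq) - DNA_BASES
    if unknown:
        # Ambiguity codes and modified bases are outside the scope of this sketch.
        raise ValueError(f"unexpected symbols: {sorted(unknown)}")
    return seq.replace("T", "U")


print(to_rna("ATGGCTTGT"))
```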
[0047] The term "encode" as it is applied to polynucleotides refers to a polynucleotide which is said to "encode" a polypeptide if, in its native state or when manipulated by methods well known to those skilled in the art, it can be transcribed and/or translated to produce the mRNA for the polypeptide and/or a fragment thereof. The antisense strand is the complement of such a nucleic acid, and the encoding sequence can be deduced therefrom.
[0048] As used herein, the term "detectable label" intends a directly or indirectly detectable compound or composition that is conjugated directly or indirectly to the composition to be detected, e.g., polynucleotide or protein such as an antibody so as to generate a "labeled" composition. The term also includes sequences conjugated to the polynucleotide that will provide a signal upon expression of the inserted sequences, such as green fluorescent protein (GFP) and the like. The label may be detectable by itself (e.g. radioisotope labels or fluorescent labels) or, in the case of an enzymatic label, may catalyze chemical alteration of a substrate compound or composition which is detectable. The labels can be suitable for small scale detection or more suitable for high-throughput screening. As such, suitable labels include, but are not limited to radioisotopes, fluorochromes, chemiluminescent compounds, dyes, and proteins, including enzymes. The label may be simply detected or it may be quantified. A response that is simply detected generally comprises a response whose existence merely is confirmed, whereas a response that is quantified generally comprises a response having a quantifiable (e.g., numerically reportable) value such as an intensity, polarization, and/or other property. In luminescence or fluorescence assays, the detectable response may be generated directly using a luminophore or fluorophore associated with an assay component actually involved in binding, or indirectly using a luminophore or fluorophore associated with another (e.g., reporter or indicator) component.
[0049] Examples of luminescent labels that produce signals include, but are not limited to bioluminescence and chemiluminescence. Detectable luminescence response generally comprises a change in, or an occurrence of, a luminescence signal. Suitable methods and luminophores for luminescently labeling assay components are known in the art and described for example in Haugland, Richard P. (1996) Handbook of Fluorescent Probes and Research Chemicals (6th ed.). Examples of luminescent probes include, but are not limited to, aequorin and luciferases.
[0050] Examples of suitable fluorescent labels include, but are not limited to, fluorescein, rhodamine, tetramethylrhodamine, eosin, erythrosin, coumarin, methyl-coumarins, pyrene, Malachite green, stilbene, Lucifer Yellow, Cascade Blue™, and Texas Red. Other suitable optical dyes are described in Haugland, Richard P. (1996) Handbook of Fluorescent Probes and Research Chemicals (6th ed.).
[0051] In another aspect, the fluorescent label is functionalized to facilitate covalent attachment to a cellular component present in or on the surface of the cell or tissue such as a cell surface marker. Suitable functional groups include, but are not limited to, isothiocyanate groups, amino groups, haloacetyl groups, maleimides, succinimidyl esters, and sulfonyl halides, all of which may be used to attach the fluorescent label to a second molecule. The choice of the functional group of the fluorescent label will depend on the site of attachment to either a linker, the agent, the marker, or the second labeling agent.
[0052] Attachment of the fluorescent label may be either directly to the cellular component or compound or, alternatively, can be via a linker. Suitable binding pairs for use in indirectly linking the fluorescent label to the intermediate include, but are not limited to, antigens/antibodies, e.g., rhodamine/anti-rhodamine, biotin/avidin and biotin/streptavidin.
[0053] As used herein, the term "carrier" encompasses any of the standard carriers, such as a phosphate buffered saline solution, buffers, water, and emulsions, such as an oil/water or water/oil emulsion, and various types of wetting agents. The compositions also can include stabilizers and preservatives. For examples of carriers, stabilizers and adjuvants, see Sambrook and Russell (2001), supra. Those skilled in the art will know many other suitable carriers for binding polynucleotides, or will be able to ascertain the same by use of routine experimentation. In one aspect of the invention, the carrier is a buffered solution such as, but not limited to, a PCR buffer solution.
[0054] A "pharmaceutical composition" is intended to include the combination of an active agent with a carrier, inert or active, making the composition suitable for diagnostic or therapeutic use in vitro, in vivo or ex vivo.
[0055] As used herein, the term "pharmaceutically acceptable carrier" encompasses any of the standard pharmaceutical carriers, such as a phosphate buffered saline solution, water, and emulsions, such as an oil/water or water/oil emulsion, and various types of wetting agents. The compositions also can include stabilizers and preservatives. For examples of carriers, stabilizers and adjuvants, see Martin (1975) Remington's Pharm. Sci., 15th Ed. (Mack Publ. Co., Easton).
Descriptive Embodiments
[0056] Examples in this disclosure show in-cell production of natively folded cyclotides. The folded cyclotides can contain unnatural amino acids (Uaas) by employing in vivo Uaa incorporation in combination with protein splicing to mediate intracellular backbone cyclization. To the inventors' knowledge, this is the first time that a natively folded cyclotide containing Uaas has been produced inside living cells. This approach opens the possibility for in-cell generation of cyclotides containing Uaas with new or enhanced biological functions. For example, the introduction of fluorescent amino acids or Uaas able to site-specifically incorporate fluorescent probes should facilitate the in-cell production of fluorescent-labeled cyclotides for screening or probing molecular interactions using cell-based optical screening approaches.
[0057] One embodiment of the present disclosure provides a method for preparing a cyclotide. The method entails generation of a linear peptide that contains the desired cyclotide in a linear form, flanked by two peptide fragments that have affinity to each other so as to be capable of bringing two ends of the linear cyclotide together, facilitating cyclization. In one aspect, the two peptide fragments are the C-terminus and N-terminus domains of a split intein.
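The layout of such a linear precursor can be sketched in Python as follows. This is only an illustration: the order of the parts (intein C-terminal fragment, then the linear cyclotide, then the intein N-terminal fragment) follows the Summary, while the class and function names and the two intein strings are hypothetical placeholders and are not the sequences of SEQ ID NOs: 2 and 3. The six-cysteine check is a simple sanity check assumed for this example, and the point at which a circular sequence is opened to give the linear form is not considered here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitInteinPrecursor:
    """Linear precursor: intein C-fragment, linear cyclotide, intein N-fragment."""

    intein_c: str
    cyclotide: str
    intein_n: str

    def sequence(self) -> str:
        # Written from N-terminus to C-terminus.
        return self.intein_c + self.cyclotide + self.intein_n


def build_precursor(intein_c: str, cyclotide: str, intein_n: str) -> SplitInteinPrecursor:
    """Assemble a precursor after checking that the cyclotide has six cysteines."""
    n_cys = cyclotide.count("C")
    if n_cys != 6:
        raise ValueError(f"expected six cysteines in the cyclotide, found {n_cys}")
    return SplitInteinPrecursor(intein_c, cyclotide, intein_n)


# Hypothetical placeholders, NOT real intein fragment sequences.
INTEIN_C = "<IntC>"
INTEIN_N = "<IntN>"
# Table 1 entry with SEQ ID NO: 42, used here purely as an example.
LINEAR_CYCLOTIDE = "GLPVCGETCVGGTCNTPGCTCSWPVCTRN"

precursor = build_precursor(INTEIN_C, LINEAR_CYCLOTIDE, INTEIN_N)
print(precursor.sequence())
```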
[0058] "Cyclotides" are small disulfide-rich peptides isolated from plants. Typically containing 28-37 amino acids, cyclotides are characterized by their head-to-tail cyclized peptide backbone and the interlocking arrangement of their three disulfide bonds. These combined features have been termed the cyclic cystine knot (CCK) motif (FIG. 1A). To date, over 100 cyclotides have been isolated and characterized from species of the Rubiaceae, Violaceae, Cucurbitaceae and Fabaceae families. Table 1 below lists non-limiting examples of known cyclotides.
[0059] Reference herein to a "cyclic backbone" includes a molecule comprising a sequence of amino acid residues or analogues thereof without free amino and carboxy termini. The cyclic backbone of the disclosure comprises sufficient disulfide bonds, or chemical equivalents thereof, to confer a knotted topology on the three-dimensional structure of the cyclic backbone. The term "cyclotide" as used herein refers to a peptide comprising a cyclic cystine knot motif defined by a cyclic backbone, at least two but preferably at least three disulfide bonds and associated beta strands in a particular knotted topology. The knotted topology involves an embedded ring formed by at least two backbone disulfide bonds and their connecting backbone segments being threaded by a third disulfide bond. However, a disulfide bond may be replaced or substituted by another form of bonding such as a covalent bond.
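The six cysteines and the cyclic backbone described above can be located in a sequence with a short Python sketch. The pairing used here (first with fourth, second with fifth, third with sixth cysteine, in sequence order) is the connectivity generally reported for cystine knots and is an assumption of this example rather than something stated in the text; the function names are likewise hypothetical. The example sequence is the Table 1 entry with SEQ ID NO: 42.

```python
def cystine_knot_pairs(seq: str) -> list[tuple[int, int]]:
    """Return 1-based Cys positions paired I-IV, II-V, III-VI (assumed connectivity)."""
    cys = [i + 1 for i, aa in enumerate(seq) if aa == "C"]
    if len(cys) != 6:
        raise ValueError(f"expected six cysteines, found {len(cys)}")
    return [(cys[0], cys[3]), (cys[1], cys[4]), (cys[2], cys[5])]


def backbone_loops(seq: str) -> list[str]:
    """Return the six inter-cysteine segments of the head-to-tail cyclized backbone.

    The last segment wraps from the final cysteine, through the junction of the
    C- and N-termini, back to the first cysteine.
    """
    cys = [i for i, aa in enumerate(seq) if aa == "C"]
    if len(cys) != 6:
        raise ValueError(f"expected six cysteines, found {len(cys)}")
    n = len(seq)
    loops = []
    for k, start in enumerate(cys):
        end = cys[(k + 1) % len(cys)]
        length = (end - start - 1) % n  # modulo handles the wrap-around segment
        loops.append("".join(seq[(start + 1 + j) % n] for j in range(length)))
    return loops


SEQ_42 = "GLPVCGETCVGGTCNTPGCTCSWPVCTRN"

print(cystine_knot_pairs(SEQ_42))
print(backbone_loops(SEQ_42))
```

Treating the sequence as circular is what distinguishes this from a linear peptide: residues on either side of the ligation junction end up in the same backbone segment.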
Table 1. Known cyclotides
Figure imgf000015_0001
Cyclotide | Protein Sequence | SEQ ID NO.
kalata_B4 | GLPVCGETCVGGTCNTPGCTCSWPVCTRD | 31
vodo M | GAPICGESCFTGKCYTVQCSCSWPVCTRN | 32
cyclopsychotride A | SIPCGESCVFIPCTVTALLGCSCKSKVCYKN | 33
cycloviolacin H1 | GIPCGESCVYIPCLTSAIGCSCKS VCYRN | 34
cycloviolacin O9 | GIPCGESCVWIPCLTSAVGCSCKSKVCYRN | 35
vico A | GSIPCAESCVYIPCFTGIAGCSCK KVCYYN | 36
vitri A | GIPCGESCVWIPCITSAIGCSCKSKVCYRN | 37
kalata S | GLPVCGETCVGGTCNTPGCSCSWPVCTRN | 38
cycloviolacin O12 | GLPICGETCVGGTCNTPGCSCSWPVCTRN | 39
vodo_N | GLPVCGETCTLGKCYTAGCSCSWPVCYRN | 40
vico B | GSIPCAESCVYIPCITGIAGCSCKNKVCYY | 41
kalata B1 IIa | GLPVCGETCVGGTCNTPGCTCSWPVCTRN | 42
Hypa A | GIPCAESCVYIPCTITALLGCSCKNKVCYN | 43
circulin B | GVIPCGESCVFIPCISTLLGCSC NKVCYRN | 44
circulin C | GIPCGESCVFIPCITSVAGCSCKSKVCYRN | 45
circulin D | KIPCGESCVWIPCVTSIFNCKCENKVCYHD | 46
circulin E | KIPCGESCVWIPCLTSVFNCKCENKVCYHD | 47
circulin F | AIPCGESCVWIPCISAAIGCSCKNKVCYR | 48
cycloviolacin O4 | GIPCGESCVWIPCISSAIGCSC N VCYRN | 49
cycloviolacin_O3 | GIPCGESCVWIPCLTSAIGCSCKSKVCYRN | 50
cycloviolacin O5 | GTPCGESCVWIPCISSAVGCSCK KVCYKN | 51
cycloviolacin_O6 | GTLPCGESCVWIPCISAAVGCSC S VCYKN | 52
cycloviolacin O7 | SIPCGESCVWIPCTITALAGCKCKSKVCYN | 53
cycloviolacin O10 | GIPCGESCVYIPCLTSAVGCSC SKVCYRN | 54
kalata_B5 | GTPCGESCVYIPCISGVIGCSCTDKVCYLN | 55
varv_peptide_B | GLPVCGETCFGGTCNTPGCSCDPWPMCSRN | 56
varv_peptide_C | GVPICGETCVGGTCNTPGCSCSWPVCTRN | 57
varv_peptide_D | GLPICGETCVGGSCNTPGCSCSWPVCTRN | 58
varv_peptide_F | GVPICGETCTLGTCYTAGCSCSWPVCTRN | 59
varv_peptide_G | GVPVCGETCFGGTCNTPGCSCDPWPVCSRN | 60
varv_peptide_H | GLPVCGETCFGGTCNTPGCSCETWPVCSRN | 61
cycloviolin A | GVIPCGESCVFIPCISAAIGCSCK KVCYRN | 62
cycloviolin B | GTACGESCYVLPCFTVGCTCTSSQCFKN | 63
cycloviolin C | GIPCGESCVFIPCLTTVAGCSCKN VCYRN | 64
cycloviolin D | GFPCGESCVFIPCISAAIGCSC N VCYRN | 65
violapeptide 1 | GLPVCGETCVGGTCNTPGCSCSRPVCTXN | 66
vhl-1 | SISCGESCAMISFCFTEVIGCSCKNKVCYLN | 67
Vontr Protein | ALETQKPNHLEEALVAFAKKGNLGGLP | 68
hcf-1 | GIPCGESCHYIPCVTSAIGCSCRNRSCMRN | 69
htf-1 | GIPCGDSCHYIPCVTSTIGCSCTNGSCMRN | 70
Oantr_protein | GV S SETTLMFLKEMQLKLP | 71
vhl-2 | GLPVCGETCFTGTCYTNGCTCDPWPVCTRN | 72
cycloviolacin H3 | GLPVCGETCFGGTCNTPGCICDPWPVCTRN | 73
cycloviolacin_H2 | SAIACGESCVYIPCFIPGCSCRNRVCYLN | 74
Hyfl_A | SISCGESCVYIPCTVTALVGCTCKDKVCYLN | 75
Hyfl B | GSPIQCAETCFIGKCYTEELGCTCTAFLCMKN | 76
Hyfl C | GSPRQCAETCFIGKCYTEELGCTCTAFLCMKN | 77
Hyfl D | GSVPCGESCVYIPCFTGIAGCSCKSKVCYYN | 78
Hyfl_E | GEIPCGESCVYLPCFLPNCYCRNHVCYLN | 79
Hyfl_F | SISCGETCTTFNCWIPNC CNHHDKVCYWN | 80
Hyfl_G_(partial) | CAETCVVLPCFIVPGCSCKSSVCYFN | 81
Hyfl_H_(partial) | CAETCIYIPCFTEAVGCKCKDKVCY N | 82
Hyfl I | GIPCGESCVFIPCISGVIGCSCKSKVCYRN | 83
Hyfl_J | GIACGESCAYFGCWIPGCSCRN VCYFN | 84
Hyfl_K | GTPCGESCVYIPCFTAVVGCTCB DKVCYLN | 85
Hyfl L | GTPCAESCVYLPCFTGVIGCTCKD VCYLN | 86
Hyfl_M | GNIPCGESCIFFPCFNPGCSCKDNLCYYN | 87
Hyfl_N_(partial) | CGETCVILPCISAALGCSCKDTVCY N | 88
Hyfl_O_(partial) | CGETCVIFPCISAAFGCSCKDTVCYKN | 89
Hyfl P | GSVPCGESCVWIPCISGIAGCSCK KVCYLN | 90
Hymo A (partial) | CGETCLFIPCIFSWGCSCSSKVCYRN | 91
Hymo B (partial) | CGETCVTGTCYTPGCACDWPVCKRD | 92
Hyst_A_(partial) | CGETCIWGRCYSENIGCHCGFGICTLN | 93
Hyve_A_(partial) | CGETCLFIPCLTSVFGCSCKNRGCYKI | 94
Hyca A (partial) | CGETCWDTRCYTKKCSCAWPVCMRN | 95
Hyde A (partial) | CVWIPCISAAIGCSCKSKVCYRN | 96
Hyen_A_(partial) | CGESCVYIPCTVTALLGCSCKDKVCYKN | 97
Hyen B (partial) | CGETCKVTB RCSGQGCSCLKGRSCYD | 98
Hyep_A_(partial) | CGETCVVLPCFIVPGCSCKSSVCYFN | 99
Hyep_B_(partial) | CGETCIYIPCFTEAVGCKCKDKVCY N | 100
tricyclon B | GGTIFDCGESCFLGTCYTKGCSCGEWKLCYGEN | 101
kalata_B8 | GSVLNCGETCLLGTCYTTGCTCNKYRVCTKD | 102
cycloviolacin H4 | GIPCAESCVWIPCTVTALLGCSCSNNVCYN | 103
cycloviolacin O13 | GIPCGESCVWIPCISAAIGCSC S VCYRN | 104
violacin A | SAISCGETCFKFKCYTPRCSCSYPVC | 105
cycloviolacin O14 | GSIPACGESCFKGKCYTPGCSCSKYPLCAKN | 106
cycloviolacin O15 | GLVPCGETCFTGKCYTPGCSCSYPICKKN | 107
cycloviolacin O16 | GLPCGETCFTGKCYTPGCSCSYPICKXIN | 108
cycloviolacin O17 | GIPCGESCVWIPCISAAIGCSCKNKVCYR | 109
cycloviolacin O18 | GIPCGESCVYIPCTVTALAGCKCKS VCYN | 110
cycloviolacin O19 | GTLPCGESCVWIPCISSWGCSCKSKVCYKD | 111
cycloviolacin O20 | GIPCGESCVWIPCLTSAIGCSCKSKVCYRD | 112
cycloviolacin_O21 | GLPVCGETCVTGSCYTPGCTCSWPVCTRN | 113
cycloviolacin O22 | GLPICGETCVGGTCNTPGCTCSWPVCTRN | 114
Figure imgf000018_0001
vibi E GIPCAESCVWIPCTVTALIGCGCSNKVCYN 156
vibi_F GTIPCGESCVFIPCLTSALGCSCKSKVCY N 157
vibi G GTFPCGESCVFIPCLTSAIGCSCKSKVCY N 158
vibi_H GLLPCAESCVYIPCLTTVIGCSCKSKVCY N 159
vibi I GIPCGESCVWIPCLTSTVGCSCKSKVCYRN 160
vibi J GTFPCGESCVWIPCISKVIGCACKSKVCYKN 161
vibi_K GIPCGESCVWIPCLTSAVGCPCKSKVCYRN 162
Viba_2 GIPCGESCVYLPCFTAPLGCSCSSKVCYRN 163
Viba_5 GIPCGESCVWIPCLTATIGCSCKSKVCYRN 164
Viba lO GIPCAESCVYLPCVTrv^IGCSCKDKVCYN 165
Viba_12 GIPCAESCVWIPCTVTALLGCSCKDKVCYN 166
Viba_14 GRLCGERCVIERTRAWCRTVGCICSLHTLECVRN 167
Viba_17 GLPVCGETCVGGTCNTPGCGCSWPVCTRN 168
Viba 15 GLPVCGETCVGGTCNTPGCACSWPVCTRN 169 mram l GSIPCGESCVYIPCIS SLLGCSCKSKVCYKN 170 mram 2 GIPCAESCVYIPCLTSAIGCSC S VCYRN 171 mram_3 GIPCGESCVYLPCFTTIIGC CQGKVCYH 172 mram 4 GSIPCGESCVFIPCISSWGCSCKNKVCYKN 173 mram 5 GTIPCGESCVFIPCLTSAIGCSC S VCYKN 174 mram_6 GSIPCGESCVYIPCISSLLGCSCESKVCY N 175 mram 7 GSIPCGESCVFIPCISSIVGCSC S VCYKN 176 mram_8 GIPCGESCVFIPCLTSAIGCSCKSKVCYRN 177 mram 9 GVPCGESCVWIPCLTSrVGCSCKNNVCTLN 178 mram 10 GVIPCGESCVFIPCISSVLGCSCI NKVCYRN 179 mram l 1 GHPTCGETCLLGTCYTPGCTC RPVCYKN 180 mram 12 GSAILCGESCTLGECYTPGCTCSWPICTB N 181 mram_13 GHPICGETCVGNKCYTPGCTCTWPVCYRN 182 mram_14 GSIPCGEGCVFIPCISSIVGCSCKSKVCYKN 183
Viba l GIPCGEGCVYLPCFTAPLGCSCSS VCYRN 184
Viba_3 GIPCGESCVWIPCLTAAIGCSCSSKVCYRN 185
Viba_4 GVPCGESCVWIPCLTSAIGCSCKSSVCYRN 186
Viba_6 GIPCGESCVLIPCISSVIGCSCKSKVCYRN 187
Viba_7 GVIPCGESCVFIPCISSVIGCSCKSKVCYRN 188
Viba_8 GAGCIETCYTFPCISEMI CSCK SRCQKN 189
Viba_9 GIPCGESCVWIPCISSAIGCSCKN VCYRK 190
Viba l l GIPCGESCVWIPCISGAIGCSCKSKVCYRN 191
Viba_13 TIPCAESCVWIPCTVTALLGCSCKD VCYN 192
Viba_16 GLPICGETCTLGTCYTVGCTCSWPICTRN 193
[G1A]kalata_B1 ALPVCGETCVGGTCNTPGCTCSWPVCTRN 194
[L2A]kalata_B1 GAPVCGETCVGGTCNTPGCTCSWPVCTRN 195
[P3A]kalata_B1 GLAVCGETCVGGTCNTPGCTCSWPVCTRN 196
[V4A]kalata_B1 GLPACGETCVGGTCNTPGCTCSWPVCTRN 197
[G6A]kalata_B1 GLPVCAETCVGGTCNTPGCTCSWPVCTRN 198
[E7A]kalata_B1 GLPVCGATCVGGTCNTPGCTCSWPVCTRN 199
[T8A]kalata_B1 GLPVCGEACVGGTCNTPGCTCSWPVCTRN 200
[V10A]kalata_B1 GLPVCGETCAGGTCNTPGCTCSWPVCTRN 201
[G11A]kalata_B1 GLPVCGETCVAGTCNTPGCTCSWPVCTRN 202
[G12A]kalata_B1 GLPVCGETCVGATCNTPGCTCSWPVCTRN 203
[T13A]kalata_B1 GLPVCGETCVGGACNTPGCTCSWPVCTRN 204
[N15A]kalata_B1 GLPVCGETCVGGTCATPGCTCSWPVCTRN 205
[T16A]kalata_B1 GLPVCGETCVGGTCNAPGCTCSWPVCTRN 206
[P17A]kalata_B1 GLPVCGETCVGGTCNTAGCTCSWPVCTRN 207
[G18A]kalata_B1 GLPVCGETCVGGTCNTPACTCSWPVCTRN 208
[T20A]kalata_B1 GLPVCGETCVGGTCNTPGCACSWPVCTRN 209
[S22A]kalata_B1 GLPVCGETCVGGTCNTPGCTCAWPVCTRN 210
[W23A]kalata_B1 GLPVCGETCVGGTCNTPGCTCSAPVCTRN 211
[P24A]kalata_B1 GLPVCGETCVGGTCNTPGCTCSWAVCTRN 212
[V25A]kalata_B1 GLPVCGETCVGGTCNTPGCTCSWPACTRN 213
[T27A]kalata_B1 GLPVCGETCVGGTCNTPGCTCSWPVCARN 214
[R28A]kalata_B1 GLPVCGETCVGGTCNTPGCTCSWPVCTAN 215
[N29A]kalata_B1 GLPVCGETCVGGTCNTPGCTCSWPVCTRA 216
Cter A GVIPCGESCVFIPCISTVIGCSCKN VCYRN 217
Cter B GVPCAESCVWIPCTVTALLGCSCKDKVCYLN 218 hcf-1 variant GIPCGESCHIPCVTSAIGCSCRNRSCMRN 219
Vpl-1 GSQSCGESCVLIPCISGVIGCSCSSMICYFN 220
Vpf-1 GIPCGESCVFIPCLTAAIGCSCRSKVCYRN 221 c031 GLPVCGETCVGGTCNTPGCSCSIPVCTRN 222 c028 GLPVCGETCVGGTCNTPGCSCSWPVCFRD 223 c032 GAPVCGETCFGGTCNTPGCTCDPWPVCTND 224 c033 GLPVCGETCVGGTCNTPYCTCSWPVCTRD 225 c034 GLPVCGETCVGGTCNTEYCTCSWPVCTRD 226 c035 GLPVCGETCVGGTCNTPYCFCSWPVCTRD 227 c029 GIPCGESCVWIPCISGAIGCSC SKVCYKN 228 cO30 GIPCGESCVWIPCISSAIGCSC NKVCFKN 229 c026 GSIPACGESCFRG CYTPGCSCSKYPLCAKD 230 c027 GSIPACGESCFKGWCYTPGCSCSKYPLCAKD 231
Globa F GSFPCGESCVFIPCISAIAGCSC NKVCYKN 232
Globa A GIPCGESCVFIPCITAAIGCSCKTKVCYRN 233
Globa B GVIPCGESCVFIPCISAVLGCSCKSKVCYRN 234
Globa D GIPCGETCVFMPCISGPMGCSCKHMVCYRN 235
Globa G GVIPCGESCVFIPCISSVLGCSCKNKVCYRN 236
Globa E GSAFGCGETCVKGKCNTPGCVCSWPVCKKN 237
Globa C APCGESCVFIPCISAVLGCSC S VCYRN 238
Glopa D GVPCGESCVWVPCTVTALMGCSCVREVCRKD 239
Glopa E GIPCAESCVWIPCTVTKMLGCSCKDKVCYN 240
Glopa A GGSIPCIETCVWTGCFLVPGCSCKSDKKCYLN 241
Glopa B GGSVPCIETCVWTGCFLVPGCSCKSDKKCYLN 242
Glopa C GDIPLCGETCFEGGNCRIPGCTCVWPFCSKN 243
Co36 GLPTCGETCFGGTCNTPGCTCDPFPVCTHD 244
cycloviolacin T1 GIPVCGETCVGGTCNTPGCSCSWPVCTRN 245
cycloviolacin T2 GLPICGETCVGGTCNTPGCSCSWPVCTRN 246
psyle A GIACGESCVFLGCFIPGCSCKSKVCYFN 247
psyle B GIPCGETCVAFGCWIPGCSCKDKLCYYD 248
psyle C KLCGETCFKFKCYTPGCSCSYFPCK 249
psyle D GIPCGESCVFIPCTVTALLGCSCQNKVCYRD 250
psyle E GVIPCGESCVFIPCISSVLGCSCKNKVCYRD 251
psyle F GVIPCGESCVFIPCITAAVGCSCKNKVCYRD 252
vaby A GLPVCGETCAGGTCNTPGCSCSWPICTRN 253
vaby B GLPVCGETCAGGTCNTPGCSCTWPICTRN 254
vaby C GLPVCGETCAGGRCNTPGCSCSWPVCTRN 255
vaby D GLPVCGETCFGGTCNTPGCTCDPWPVCTRN 256
vaby E GLPVCGETCFGGTCNTPGCSCDPWPVCTR 257
Oak6 cyclotide 2 GLPICGETCFGGTCNTPGCICDPWPVCTRD 258
Oak7 cyclotide GSHCGETCFFFGCYKPGCSCDELRQCYKN 259
Oak8 cyclotide GVPCGESCVFIPCLTAWGCSCSNKVCYLN 260
Oak6 cyclotide 1 GLPVCGETCFGGTCNTPGCACDPWPVCTRN 261
Cter C GVPCAESCVWIPCTVTALLGCSCKDKVCYLD 262
Cter D GIPCAESCVWIPCTVTALLGCSCKDKVCYLN 263
Cter E GIPCAESCVWIPCTVTALLGCSCKDKVCYLD 264
Cter F GIPCGESCVFIPCISSVVGCSCKSKVCYLD 265
Cter G GLPCGESCVFIPCITTVVGCSC N VCYNN 266
Cter H GLPCGESCVFIPCITTVVGCSC N VCYND 267
Cter I GTVPCGESCVFIPCITGIAGCSCKN VCYIN 268
Cter J GTVPCGESCVFIPCITGIAGCSCKNKVCYID 269
Cter K HEPCGESCVFIPCITTWGCSCKN VCYN 270
Cter L HEPCGESCVFIPCITTWGCSCKN VCYD 271
Cter M GLPTCGETCTLGTCYVPDCSCSWPICMKN 272
Cter N GSAFCGETCVLGTCYTPDCSCTALVCLKN 273
Cter 0 GIPCGESCVFIPCITGIAGCSCKSKVCYRN 274
Cter P GIPCGESCVFIPCITAAIGCSCKSKVCYRN 275
Cter Q GIPCGESCVFIPCISTVIGCSCKNKVCYRN 276
Cter R GIPCGESCVFIPCTVTALLGCSCKDKVCYKN 277 vitri B GVPICGESCVGGTCNTPGCSCSWPVCTTN 278 vitri C GLPICGETCVGGTCNTPGCFCTWPVCTRN 279 vitri D GLPVCGETCFTGSCYTPGCSCNWPVCNRN 280 vitri E GLPVCGETCVGGTCNTPGCSCSWPVCFRN 281 vitri F GLTPCGESCVWIPCISSWGCAC S VCYKD 282 hedyotide Bl GTRCGETCFVLPCWSA FGCYCQ GFCYRN 283
[0060] In one embodiment, the present technology can be used to prepare any cyclotide, including the known cyclotides listed in Table 1. New cyclotides can also be prepared. For instance, a known cyclotide can be modified by substituting, inserting and/or deleting one or more amino acids. In one aspect, the modified cyclotide is at least about 80%, 85%, 90%, or 95% identical to a reference cyclotide. In a particular aspect, the modified cyclotide retains six cysteine residues that form three disulfide bonds in a cyclized cyclotide.
[0061] In one embodiment, the cyclotide incorporates one or more unnatural amino acids. "Unnatural amino acids" are amino acids that are not in the standard 20-amino acid list but can be incorporated into a protein sequence. Non-limiting examples of unnatural amino acids include p-methoxyphenylalanine, p-azidophenylalanine, L-(7-hydroxycoumarin-4-yl)ethylglycine, acetyl-2-naphthyl alanine, 2-naphthyl alanine, 3-pyridyl alanine, 4-chloro phenyl alanine, alloisoleucine, Z-alloisoleucine dcha salt, allothreonine, 4-iodo-phenylalanine, and L-benzothienyl-D-alanine OH.
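The identity and cysteine criteria of paragraph [0060] can be illustrated with a minimal sketch. It assumes an ungapped, position-by-position comparison of two sequences of equal length, which covers substitutions only; insertions and deletions would require a sequence alignment that is not shown here. The function names are hypothetical, and the two sequences are taken from Table 1 (SEQ ID NO: 42 as the reference and SEQ ID NO: 194 as the modified cyclotide).

```python
def percent_identity(candidate: str, reference: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if len(candidate) != len(reference):
        raise ValueError("this sketch only compares equal-length sequences")
    matches = sum(a == b for a, b in zip(candidate, reference))
    return 100.0 * matches / len(reference)


def retains_six_cysteines(sequence: str) -> bool:
    """True if the sequence keeps the six Cys residues of the cystine knot."""
    return sequence.count("C") == 6


def meets_criteria(candidate: str, reference: str, threshold: float) -> bool:
    """Combine the identity threshold with the six-cysteine requirement."""
    return (percent_identity(candidate, reference) >= threshold
            and retains_six_cysteines(candidate))


# Sequences as listed in Table 1 (linear representation of the cyclic backbone).
REFERENCE = "GLPVCGETCVGGTCNTPGCTCSWPVCTRN"      # SEQ ID NO: 42
G1A_KALATA_B1 = "ALPVCGETCVGGTCNTPGCTCSWPVCTRN"  # SEQ ID NO: 194

if __name__ == "__main__":
    print(round(percent_identity(G1A_KALATA_B1, REFERENCE), 2))
    print(meets_criteria(G1A_KALATA_B1, REFERENCE, 95.0))
```

A single substitution in a 29-residue cyclotide leaves 28 of 29 positions unchanged, so such a variant falls above the 95% threshold as long as no cysteine is replaced.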
[0062] In one embodiment, the unnatural amino acid is located in loop 2 of the cyclotide. In alternative embodiments, the unnatural amino acid is located in loop 1, 3, 4, 5 or 6. In some embodiments, the cyclotide contains two, three, four or more unnatural amino acids.
[0063] The cyclotide comprises a molecular framework comprising a sequence of amino acids forming a cyclic backbone wherein the cyclic backbone comprises sufficient disulfide bonds to confer knotted topology on the molecular framework or part thereof. The cyclic backbone comprises the structure:
$$C[X^{I}_{1} \ldots X_{a}]\;C[X^{II}_{1} \ldots X_{b}]\;C[X^{III}_{1} \ldots X_{c}]\;C[X^{IV}_{1} \ldots X_{d}]\;C[X^{V}_{1} \ldots X_{e}]\;C[X^{VI}_{1} \ldots X_{f}]$$
wherein C is cysteine; and each of $[X^{I}_{1} \ldots X_{a}]$, $[X^{II}_{1} \ldots X_{b}]$, $[X^{III}_{1} \ldots X_{c}]$, $[X^{IV}_{1} \ldots X_{d}]$, $[X^{V}_{1} \ldots X_{e}]$, and $[X^{VI}_{1} \ldots X_{f}]$ represents one or more amino acid residues wherein each one or more amino acid residues within or between the sequence residues may be the same or different; and wherein a, b, c, d, e and f represent the number of amino acid residues in each respective sequence and each of a to f may be the same or different and range from 1 to about 20. When the unnatural amino acid is located in loop 6 of the cyclotide, the amino acid residues corresponding to $[X^{VI}_{1} \ldots X_{f}]$ in the cyclotide comprise the unnatural amino acid. In some embodiments, the unnatural amino acid is located in loop 1 of the cyclotide. In this embodiment, the amino acid residues corresponding to $[X^{I}_{1} \ldots X_{a}]$ in the cyclotide comprise the unnatural amino acid. In another embodiment, the unnatural amino acid is located in loop 2 of the cyclotide. In this embodiment, the amino acid residues corresponding to $[X^{II}_{1} \ldots X_{b}]$ in the cyclotide comprise the unnatural amino acid. In a further embodiment, the unnatural amino acid is located in loop 3 of the cyclotide. In this embodiment, the amino acid residues corresponding to $[X^{III}_{1} \ldots X_{c}]$ in the cyclotide comprise the unnatural amino acid. In yet a further embodiment, the unnatural amino acid is located in loop 4 of the cyclotide. In this embodiment, the amino acid residues corresponding to $[X^{IV}_{1} \ldots X_{d}]$ in the cyclotide comprise the unnatural amino acid. In another embodiment, the unnatural amino acid is located in loop 5 of the cyclotide. In this embodiment, the amino acid residues corresponding to $[X^{V}_{1} \ldots X_{e}]$ in the cyclotide comprise the unnatural amino acid.
[0064] In one aspect, the cyclotide comprises an amino acid sequence of
GGVCPKILQRCRRXSDCPGACICRGNGYCGSGSD (SEQ ID NO: 1) where X indicates an unnatural amino acid.
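The loop nomenclature of paragraph [0063] can be made concrete with a short sketch that splits a linear representation of a cyclotide into its six loops. It assumes that loops 1 to 5 are the residues between consecutive cysteines and that loop 6 is closed across the cyclization point (residues after the sixth cysteine followed by residues before the first cysteine). The function names are hypothetical; the input is SEQ ID NO: 1, with X marking the unnatural amino acid.

```python
def cyclotide_loops(sequence: str) -> dict[int, str]:
    """Split a linear cyclotide sequence into loops 1-6 using its six cysteines."""
    cys = [i for i, residue in enumerate(sequence) if residue == "C"]
    if len(cys) != 6:
        raise ValueError("expected exactly six cysteine residues")
    # Loops 1-5: residues between consecutive cysteines.
    loops = {n + 1: sequence[cys[n] + 1:cys[n + 1]] for n in range(5)}
    # Loop 6 spans the cyclization point of the backbone.
    loops[6] = sequence[cys[5] + 1:] + sequence[:cys[0]]
    return loops


def loop_with_unnatural_residue(sequence: str, marker: str = "X") -> list[int]:
    """Return the loop numbers that contain the marker residue."""
    return [n for n, loop in cyclotide_loops(sequence).items() if marker in loop]


SEQ_ID_NO_1 = "GGVCPKILQRCRRXSDCPGACICRGNGYCGSGSD"

if __name__ == "__main__":
    for number, residues in cyclotide_loops(SEQ_ID_NO_1).items():
        print(number, residues)
    print(loop_with_unnatural_residue(SEQ_ID_NO_1))
```

Under these assumptions the unnatural residue of SEQ ID NO: 1 falls in loop 2, consistent with the embodiment of paragraph [0062].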
[0065] The present disclosure provides a polypeptide precursor for generating a cyclotide. In one embodiment, the polypeptide comprises a linear cyclotide fused to a C-terminal fragment and an N-terminal fragment of a split intein, at the N-terminus and C-terminus of the cyclotide, respectively. A structure of the polypeptide is illustrated in FIG. 1B, lower panel.
[0066] A "split intein" is an intein of a precursor protein that comes from two separate genes. For example, in cyanobacteria, DnaE, the catalytic subunit α of DNA polymerase III, is encoded by two separate genes, dnaE-n and dnaE-c. The dnaE-n product consists of an N-extein sequence followed by a 123-AA intein sequence, whereas the dnaE-c product consists of a 36-AA intein sequence followed by a C-extein sequence.
[0067] In one embodiment, the split intein comprises a DnaE split intein. In one embodiment, the DnaE split intein comprises a Nostoc punctiforme PCC73102 DnaE split intein.
[0068] In one embodiment, the C-terminal fragment comprises an amino acid sequence of SEQ ID NO: 2 (MIKIATRKYLGKQNVYDIGVERDHNFALKNGFIASN). In one embodiment, the N-terminal fragment comprises an amino acid sequence of SEQ ID NO: 3 (CLSYETEILTVEYGLLPIGKIVEKRIECTVYSVDNNGNIYTQPVAQWHDRGEQEVFEYCLEDGSLIRATKDHKFMTVDGQMLPIDEIFERELDLMRVDNLPN).
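The precursor architecture of paragraphs [0065] to [0068] (C-terminal intein fragment, then linear cyclotide, then N-terminal intein fragment) can be sketched as a simple concatenation. This is an illustration only: the helper names are hypothetical, the lowercase "i" in the printed SEQ ID NO: 3 is read as isoleucine, and the position at which the cyclic backbone is opened (here immediately before a cysteine) is an assumption that the text above does not specify.

```python
NPU_DNAE_IC = "MIKIATRKYLGKQNVYDIGVERDHNFALKNGFIASN"  # SEQ ID NO: 2
NPU_DNAE_IN = (
    "CLSYETEILTVEYGLLPIGKIVEKRIECTVYSVDNNGNIYTQPVAQWHDRGEQEVFEYCL"
    "EDGSLIRATKDHKFMTVDGQMLPIDEIFERELDLMRVDNLPN"
)  # SEQ ID NO: 3


def linearize(cyclic_sequence: str, start: int) -> str:
    """Open a cyclic backbone at index `start` (a circular permutation)."""
    return cyclic_sequence[start:] + cyclic_sequence[:start]


def build_precursor(linear_cyclotide: str) -> str:
    """IC fragment at the N-terminus, IN fragment at the C-terminus."""
    return NPU_DNAE_IC + linear_cyclotide + NPU_DNAE_IN


SEQ_ID_NO_1 = "GGVCPKILQRCRRXSDCPGACICRGNGYCGSGSD"

if __name__ == "__main__":
    # Hypothetical choice: open the backbone before the sixth cysteine.
    opened = linearize(SEQ_ID_NO_1, SEQ_ID_NO_1.rindex("C"))
    precursor = build_precursor(opened)
    print(opened)
    print(len(precursor))
```

Any circular permutation of the cyclotide sequence encodes the same cyclic product, so the opening position is a design parameter of the precursor rather than a property of the final cyclotide.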
[0069] Also provided are cyclotides prepared from the cyclotide precursor as provided above. Also provided are polynucleotides encoding these polypeptides. In one aspect, the polynucleotide uses a stop codon to code for an unnatural amino acid. Suitable conditions for translating the stop codon into an unnatural amino acid are known in the art and detailed in Example 1.
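The use of a stop codon to encode an unnatural amino acid can be illustrated with a minimal translation sketch in which the amber codon (TAG in DNA, UAG in mRNA) is read as X instead of terminating translation. The function name and the `suppress_amber` flag are hypothetical. The test fragment is the in-frame portion of the mutagenic primer given in Example 1; the reading frame is an assumption, chosen so that the translation matches the RCRRXSDC segment of SEQ ID NO: 1.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, suppress_amber: bool = True) -> str:
    """Translate DNA; with suppression, TAG is read as X (unnatural residue)."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon == "TAG" and suppress_amber:
            protein.append("X")
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":  # ordinary stop codon: translation ends
            break
        protein.append(residue)
    return "".join(protein)


# In-frame fragment of the Example 1 mutagenic primer (frame is an assumption).
FRAGMENT = "CGTTGCCGTCGTTAGTCTGACTGC"

if __name__ == "__main__":
    print(translate(FRAGMENT))                        # amber codon suppressed
    print(translate(FRAGMENT, suppress_amber=False))  # amber codon terminates
```

Without suppression the same fragment yields a truncated product, which is why expression of the full-length precursor depends on the orthogonal tRNA/synthetase pair and on supplying the unnatural amino acid.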
[0070] Methods for preparing a cyclotide are also provided. In one embodiment, the method entails incubating a polypeptide of the disclosure under conditions for the linear cyclotide to cyclize.
[0071] It is contemplated that preparation of a cyclic cyclotide does not have to use the split intein as described above. As shown in Example 1, other protein domains can also be used to bring both ends of a linear cyclotide together. Accordingly, one embodiment of the present disclosure provides a method for preparing a cyclic peptide comprising expressing a linear cyclotide in a cell and cyclizing the linear cyclotide, wherein the cyclotide comprises at least one unnatural amino acid but retains six cysteine residues to form three disulfide bonds. Cyclized cyclotides from the methods are provided.
[0072] The cyclic cyclotides prepared by methods of the present disclosure can be further modified. For instance, an unnatural amino acid incorporated into the cyclotide can be modified with an agent comprising a detectable label. Such a detectable label can be useful for detection and screening of the cyclotides.
[0073] Methods, kits and libraries are also provided for screening cyclotide libraries for potential pharmaceutical agents. Various mutations can be made to wild-type, known or proposed cyclotides to generate a large library of cyclotides. In particular, when unnatural amino acids are used, such a library can include cyclotides of diverse structures, further enhancing the value of the library.
[0074] Applicants have previously utilized cyclotides for the screening and design of biologically relevant peptides (see WO 2011/005598, incorporated herein by reference).
EXAMPLES
[0075] Having described the general concepts of this invention, the following illustrative examples are provided.
Example 1
Materials and instrumentation
[0076] Analytical HPLC was performed on a HP 1100 series instrument with 220 and 280 nm detection using a Vydac C18 column (5 micron, 4.6 x 1 0 mm) at a flow rate of 1 mL/min. Semipreparative HPLC was performed on a Waters Delta Prep system fitted with a Waters 2487 ultraviolet-visible (UV/Vis) detector using a Vydac C18 column (15-20 μm, 10 x 250 mm) at a flow rate of 5 mL/min. All runs used linear gradients of 0.1% aqueous trifluoroacetic acid (TFA, solvent A) vs. 0.1% TFA, 90% acetonitrile in H2O (solvent B). UV/Vis spectroscopy was carried out on an Agilent 8453 diode array spectrophotometer. Electrospray mass spectrometry (ES-MS) analysis was routinely applied to all cyclized peptides. ES-MS was performed on an Applied Biosystems API 3000 triple quadrupole electrospray mass spectrometer using Analyst 1.4.2. Calculated masses were obtained using Analyst 1.4.2. Fluorescence microscopy was performed on a FSX100 fluorescence microscope (Olympus). Fluorescence spectra were recorded on a Fluorolog-3 spectrophotometer (Horiba Scientific). Flow cytometry analysis was performed on an LSR II instrument (BD). Protein samples were run on 4-20% Tris-Glycine gels (Lonza). The gels were then stained with Pierce Gelcode Blue (Pierce), photographed/digitized using a Kodak EDAS 290, and quantified using NIH ImageJ software (http://rsb.info.nih.gov/ij/). DNA sequencing was performed by Retrogen DNA facility (San Diego, CA), and the sequence data were analyzed with DNAStar Lasergene v5.5.2. Chemicals were obtained from Aldrich (Milwaukee, WI) or Novabiochem (San Diego, CA) unless otherwise indicated. Restriction enzymes were purchased from New England Biolabs. Primers were ordered from IDT (Integrated DNA Technologies).
Synthesis of DBCO-AMCA
[0077] The synthesis of DBCO-AMCA was performed as described in FIG. 4. Briefly, 6-((7-amino-4-methylcoumarin-3-acetyl)-amino)-hexanoic acid succinimidyl ester (AMCA-X, SE, Anaspec) (10 mg, 22.5 μmol) was reacted with 5,6-dihydro-11,12-didehydrodibenzo-[b,f]-azocino-3-oxoprop-yl-4-amine (DBCO-amine, Click Chemistry Tools, Bioconjugate Technology Company) (7 mg, 25.3 μmol) in DMF (100 μL) containing 5% di-isopropylethyl amine (DIEA) for 30 min at room temperature. The reaction was monitored by HPLC and was complete in 30 min. The product (DBCO-AMCA) was purified by reverse phase chromatography using a C18 Sep-Pak cartridge (Waters). The pure product was characterized by ES-MS [DBCO-AMCA: expected averaged mass 604.8 Da, found mass 604.2 ± 0.2 Da] (FIG. 4).
Cloning of E. coli expression plasmids
pERAzi
[0078] Genes for argU, thrU, tyrU, glyT and thrT tRNAs were subcloned from the plasmid pRARE2 (Novagen), digested with Sph I and Nhe I, and ligated into plasmid pEvol-AziRS pre-cut with the same restriction enzymes to afford pERAzi.
pVLOmeRS
[0079] The OmeRS gene was amplified by polymerase chain reaction (PCR) using plasmid pBK-JY16 as template. The 5'-primer (5'-CT ATG ACT AGT GAC GAA TTT GAA ATG ATA AAG-3', SEQ ID NO: 290) encoded a Spe I restriction site. The 3'-primer (5'-GTG ATG AGA TCT TTA TAA TCT CTT TCT AAT TGG CTC-3', SEQ ID NO: 291) encoded a Bgl II restriction site. The PCR product was purified, digested with Spe I and Bgl II, and ligated into a Spe I, Bgl II-treated pLeitRNA Opt-STAT3 plasmid to give the plasmid pVLOmeRS.
pTXB1-MCoTI-I
[0080] The cloning of pTXB1-MCoTI-I has been previously described (Camarero, J.A.; Kimura, R.H.; Woo, Y.H.; Shekhtman, A.; Cantor, J. Biosynthesis of a fully functional cyclotide inside living bacterial cells. Chembiochem, 2007, 8, 1363-1366; Austin, J.; Wang, W.; Puttamadappa, S.; Shekhtman, A.; Camarero, J.A. Biosynthesis and biological screening of a genetically encoded library based on the cyclotide MCoTI-I. Chembiochem, 2009, 10, 2663-2670).
pTXB1-MCoTI-stop
[0081] The codon encoding the residue Asp15 (GAC) in plasmid pTXB1-MCoTI-I was replaced by the stop codon (UAG) by site-directed mutagenesis using the QuickChange Lightning Multi Site-Directed Mutagenesis kit (Agilent Technologies) and the primer 5'-GCG TTG CCG TCG TTA GTC TGA CTG CCC-3', SEQ ID NO: 291. The resulting plasmid was sequenced to confirm the mutation introduced.
pET28-TS-MCoTI-I
[0082] The gene fragment containing the DnaE IN Npu (residues 775-876, UniProtKB: B2J066) was amplified by PCR using plasmid pYY1-Npu-IN as template and the following primers: 5'-primer (5'-AA AAA CAT ATG AAA CGG AAA TAT TGA C-3', SEQ ID NO: 292) and 3'-primer (5'-T TTT AAG CTT AAT TCG GCA AAT TAT CAA CCC-3', SEQ ID NO: 293), which introduced Nde I and Hind III restriction sites, respectively. The resulting DNA fragment was purified and digested with Sal I and Not I. 5'-Phosphorylated oligonucleotides coding for the DnaE IC Npu (residues 1-36, UniProtKB: B2J821) (Table 2) were synthesized and PAGE purified by IDT DNA. Complementary strands were annealed in 20 mM sodium phosphate, 0.3 M NaCl buffer at pH 7.4 and the resulting double stranded DNA (dsDNA) was purified using Qiagen's PCR Purification Kit. 5'-Phosphorylated oligonucleotides coding for MCoTI-I (Table 2) were synthesized and PAGE purified by IDT DNA.
[0083] Complementary strands were annealed and purified as described above. The three DNA fragments were ligated using T4 DNA ligase (New England Biolabs). The ligation product was then amplified by PCR using the following primers: 5'-primer (5'-AAA ACC ATG GGC AGC AGC CAT CAT CAT-3', SEQ ID NO: 294) and 3'-primer (5'-TTT TAA GCT TAA TTC GGC AAA TTA TCA ACC C-3', SEQ ID NO: 295), which introduced Nco I and Hind III restriction sites, respectively. The PCR product was purified, digested with Nco I and Hind III, and ligated into a Nco I, Hind III-treated pET28a plasmid (Novagen) to give the plasmid pET28-TS-MCoTI-I.
pET28-TS-MCoTI-stop
[0084] The codon encoding the residue Asp15 (GAC) in plasmid pET28-TS-MCoTI-I was replaced by the stop codon (UAG) by site-directed mutagenesis using the QuickChange Lightning Multi Site-Directed Mutagenesis kit (Agilent Technologies) as described before.
pASK-TS-MCoTI-stop
[0085] The gene encoding IC-MCoTI-I-stop-IN was digested with Xba-I and Hind III, purified using the Qiagen gel extraction kit as per the protocol, and ligated into pASKIBA35 digested with Xba-I and Hind III to give pASK-TS-MCoTI-stop.
pET25-Tryp-EGFP
[0086] The gene fragment encoding EGFP was amplified by PCR using plasmid pEGFP-N1 (Clontech) as a template. The 5'-primer (5'-TCT AGA GGT GGT TCT GGT GGT TCT TCT GGT GGT GTC GAC AGC AAG GGC GAG GAG CTG TTC ACC GGG G-3', SEQ ID NO: 296) introduced a Nco I restriction site and the flexible linker (Gly-Gly-Ser)3 in frame with EGFP. The 3'-primer (5'-A AGC TTA TTA GTG GTG ATG ATG GTG ATG AGA ACC ACC CTT GTA CAG CTC GTC CAT GCC GAG AGT G-3', SEQ ID NO: 297) introduced a Hind III restriction site, a poly-His tag in frame with EGFP and a stop codon. The resulting DNA fragment was purified, double digested with Nco I and Hind III, and ligated into a Nco I, Hind III-treated pET25b plasmid (Novagen) to give pET25-EGFP. The mature anionic rat trypsin gene was mutated at positions 16 (I16V) and 195 (S195A) using the plasmid pPicZalphaWTTg (generous gift from Dr. Teaster Baird Jr, SFSU) with a site-directed mutagenesis kit (Agilent biosystems) as per the manufacturer's protocol using the mutagenic primers (5'-GGA GAT ATA CAT ATG gtc GTT GGA GGA TAC ACC-3', SEQ ID NO: 298, and 5'-CAC AGG GCC ACC ggc GTC ACC CTG GCA GC-3', SEQ ID NO: 299, respectively). The mutations were confirmed by DNA sequencing. Inactive mature anionic rat trypsin was amplified by PCR using the mutated plasmid pPicZalphaWTTg as template and the following primers: 5'-primer (5'-CAT ATG ATC GTT GGA GGA TAC ACC TGC C-3', SEQ ID NO: 300) and 3'-primer (5'-CC ATG GCG TTG GCA GCA ATT GTG TCC TG-3', SEQ ID NO: 301), which introduced Nde I and Nco I restriction sites, respectively. The resulting DNA fragment was purified, digested with Nde I and Nco I and ligated into Nde I, Nco I-treated pET25-EGFP plasmid to give the plasmid pET25-Tryp-EGFP.
pRSF-Tryp-EGFP
[0087] The DNA fragment encoding inactive anionic rat trypsin fused to the N-terminus of EGFP was amplified by PCR using pET25-Tryp-EGFP as template. The 5'-primer (5'-AAA CAT ATG GTC GTT GGA GGA TAC ACC TGC C-3', SEQ ID NO: 302) and 3'-primer (5'-TTT TGG TAC CAT TAG TGG TGA TGA TGG TGA TGA GAA CCA CCC-3', SEQ ID NO: 303) introduced Nde-I and Kpn-I restriction sites, respectively. The DNA fragment was purified, double digested with Nde-I and Kpn-I and ligated into a Nde-I/Kpn-I-treated pRSF-duet (Novagen) to give plasmid pRSF-Tryp-EGFP.
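The cloning steps above rely on restriction sites carried by the PCR primers. The following sketch, offered only as an illustration, checks which recognition sequences are present in a primer. The function name is hypothetical; the recognition sequences are the standard palindromic sites of the listed enzymes, so a search on the written strand is sufficient. The two primers tested are SEQ ID NO: 302 and SEQ ID NO: 303.

```python
# Standard recognition sequences (5'->3') of enzymes used in Example 1.
RESTRICTION_SITES = {
    "Nde I": "CATATG",
    "Nco I": "CCATGG",
    "Hind III": "AAGCTT",
    "Kpn I": "GGTACC",
    "Spe I": "ACTAGT",
    "Bgl II": "AGATCT",
}


def sites_in_primer(primer: str) -> list[str]:
    """Return the names of the enzymes whose site occurs in the primer."""
    sequence = "".join(primer.split()).upper()  # drop codon spacing
    return [name for name, site in RESTRICTION_SITES.items() if site in sequence]


PRIMER_302 = "AAA CAT ATG GTC GTT GGA GGA TAC ACC TGC C"
PRIMER_303 = "TTT TGG TAC CAT TAG TGG TGA TGA TGG TGA TGA GAA CCA CCC"

if __name__ == "__main__":
    print(sites_in_primer(PRIMER_302))
    print(sites_in_primer(PRIMER_303))
```

A check of this kind is a convenient way to confirm that a primer sequence and the restriction enzymes named for the subsequent digest agree before the fragment is ordered.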
Bacterial expression and purification
Expression and purification of intein precursor 1a
[0088] E. coli BL21(DE3) or Origami(DE3) cells (Novagen) were transformed with plasmid pTXB1-MCoTI [5, 27, 28]. Expression was carried out in 2XYT medium (1 L) containing ampicillin (100 μg/L) at 30 °C for BL21(DE3) or room temperature for Origami(DE3) cells. Briefly, 5 mL of an overnight starter culture derived from a single clone were used to inoculate 1 L of 2XYT media. Cells were grown to an OD at 600 nm of ~0.6 at 37 °C. Protein expression was induced by addition of isopropyl-β-D-thiogalactopyranoside (IPTG) to a final concentration of 0.3 mM at 30 °C for 4 h in BL21(DE3) cells and at room temperature overnight in Origami(DE3) cells. The cells were then harvested by centrifugation. For fusion protein purification, the cells were resuspended in 30 mL of lysis buffer (0.1 mM EDTA, 1 mM PMSF, 50 mM sodium phosphate, 250 mM NaCl buffer at pH 7.2 containing 5% glycerol) in the presence or absence of 20 mM ICH2CONH2 and then lysed by sonication. The lysate was clarified by centrifugation at 15,000 rpm in a Sorvall SS-34 rotor for 30 min. The clarified supernatant was incubated with chitin beads (1-3 mL beads/L cells, New England Biolabs), previously equilibrated with column buffer (0.1 mM EDTA, 50 mM sodium phosphate, 250 mM NaCl buffer at pH 7.2), at 4 °C for 1 hour with gentle rocking. The beads were extensively washed with 50 bed-volumes of column buffer (50 mM sodium phosphate, 0.1 mM EDTA, 250 mM NaCl buffer at pH 7.2) containing 0.1% Triton X100 and then rinsed and equilibrated with 50 bed-volumes of column buffer. Quantification of the precursor intein was carried out spectrophotometrically using an extinction coefficient per chain at 280 nm of 38,150 M⁻¹cm⁻¹. The expression level for intein precursor 1a was ~40 mg/L.
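The spectrophotometric quantification described above follows the Beer-Lambert law, c = A / (ε · l). A minimal sketch is given below. The extinction coefficient of 38,150 M⁻¹cm⁻¹ is taken from the paragraph above; the absorbance reading, the 1 cm path length and the molecular weight used for the mass conversion are hypothetical values chosen only for illustration.

```python
def molar_concentration(absorbance: float, epsilon: float,
                        path_cm: float = 1.0) -> float:
    """Beer-Lambert law: concentration in mol/L from absorbance."""
    return absorbance / (epsilon * path_cm)


def mg_per_litre(molar: float, molecular_weight: float) -> float:
    """Convert mol/L to mg/L given a molecular weight in g/mol."""
    return molar * molecular_weight * 1000.0


EPSILON_PRECURSOR_280 = 38150.0  # M^-1 cm^-1, from the text

if __name__ == "__main__":
    a280 = 0.3815          # hypothetical absorbance reading
    mw = 30000.0           # hypothetical molecular weight, g/mol
    c = molar_concentration(a280, EPSILON_PRECURSOR_280)
    print(f"{c:.2e} M, {mg_per_litre(c, mw):.0f} mg/L")
```

The same calculation applies to the cyclotides and to the EGFP fusion described later, using their respective extinction coefficients and wavelengths.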
Expression and purification of intein precursor 1b
[0089] E. coli BL21(DE3) or Origami(DE3) cells (Novagen) were co-transformed with plasmids pTXB1-MCoTI-stop and pVLOmeRS. Expression was carried out in 2XYT medium (1 L) containing ampicillin (100 μg/L) and chloramphenicol (35 μg/L) at 30 °C for BL21(DE3) or room temperature for Origami(DE3) cells. Cells were grown to an OD at 600 nm of ~0.2 at 37 °C, at which point 2 mM p-methoxy-phenylalanine (ChemPep Inc.) was added. Protein expression was induced with IPTG when the OD at 600 nm was 0.6 as described above. The cells were harvested and the intein precursor purified as described for intein precursor 1a. The expression level for intein precursor 1b was ~3 mg/L.
Expression and purification of intein precursor 1c
[0090] E. coli BL21(DE3) or Origami(DE3) cells (Novagen) were co-transformed with plasmids pTXB1-MCoTI-stop and pERAzi. Expression was carried out in 2XYT medium (1 L) containing ampicillin (100 μg/L) and chloramphenicol (35 μg/L) at 30 °C for BL21(DE3) or room temperature for Origami(DE3) cells. Cells were grown to an OD at 600 nm of ~0.2 at 37 °C, at which point 1 mM p-azido-phenylalanine (Chem-Impex International Inc.) was added. Arabinose (0.02%) was added when the OD at 600 nm reached a value of 0.4 and 0.6, respectively, then protein expression was induced with IPTG as described for precursor 1a. Cells were harvested and the intein precursor purified as described for precursor 1a. The expression level for intein precursor 1c was ~7 mg/L. (Note: all steps involving proteins containing p-azido-phenylalanine need to be carried out in complete darkness to avoid photodecomposition of the aryl-azido group.)
In vitro EPL-mediated cyclization of cyclotides MCoTI-I, MCoTI-OmeF and MCoTI-aziF
[0091] Purified intein-fusion proteins 1a, 1b and 1c adsorbed on chitin beads (1 mL) were cleaved in freshly degassed column buffer containing 100 mM GSH (total volume 1.5 mL). The cleavage/cyclization reaction was kept for 20 h at 25 °C with gentle rocking. Once the cleavage was complete, the beads were filtered off and the filtrate was analyzed by analytical HPLC. Folded cyclotide MCoTI-I was purified by semipreparative HPLC using a linear gradient of 10-30% solvent B over 30 min. Purified products were characterized as the desired product by ES-MS [MCoTI-I: expected averaged mass 3480.9 Da, found mass 3481.0 ± 0.9 Da; MCoTI-OmeF: expected averaged mass 3543.9 Da, found mass 3543.5 ± 0.2 Da; and MCoTI-aziF: expected averaged mass 3553.9 Da, found mass 3528.7 ± 1.3 Da, this mass corresponds to the photodecomposition product] (FIG. 5). Quantification was carried out spectrophotometrically using an extinction coefficient per chain at 280 nm of 2,240 M⁻¹cm⁻¹ (MCoTI-I) and 3,730 M⁻¹cm⁻¹ (MCoTI-OmeF and MCoTI-aziF).
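The mass reported for MCoTI-aziF can be examined with a short calculation. The sketch below assumes, as one plausible reading that the text above does not spell out, that photodecomposition converts the aryl azide into the corresponding aryl amine, a net loss of N2 and gain of two hydrogen atoms. The atomic masses are standard average values; the function and constant names are hypothetical.

```python
# Average atomic masses in Da.
MASS_N = 14.007
MASS_H = 1.008

# Assumed conversion: Ar-N3 -> Ar-NH2 (lose N2, gain 2 H).
AZIDE_TO_AMINE_SHIFT = -2 * MASS_N + 2 * MASS_H


def mass_deviation(expected: float, found: float) -> float:
    """Signed difference between found and expected mass in Da."""
    return found - expected


EXPECTED_AZIF = 3553.9    # Da, from the text
FOUND_AZIF = 3528.7       # Da, from the text
FOUND_AZIF_ERROR = 1.3    # Da, from the text

if __name__ == "__main__":
    observed_shift = mass_deviation(EXPECTED_AZIF, FOUND_AZIF)
    predicted_amine = EXPECTED_AZIF + AZIDE_TO_AMINE_SHIFT
    print(f"observed shift: {observed_shift:.1f} Da")
    print(f"predicted amine mass: {predicted_amine:.1f} Da")
    print(abs(FOUND_AZIF - predicted_amine) <= FOUND_AZIF_ERROR)
```

Under this assumption the predicted mass of the amine product lies within the stated experimental error of the found mass, which is consistent with the note that the observed species is a photodecomposition product.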
Capture of in-cell generated MCoTI-cyclotides using trypsin-immobilized sepharose beads
[0092] Trypsin-immobilized agarose beads were prepared as previously described. Briefly, NHS-activated Sepharose was washed with 15 volumes of ice-cold 1 mM HCl. Each volume of beads was incubated with an equal volume of coupling buffer (200 mM sodium phosphate, 250 mM NaCl buffer at pH 6) containing 2-4 mg of porcine pancreatic trypsin type IX-S (14,000 units/mg)/mL for 3 h with gentle rocking at room temperature. The beads were then rinsed with 10 volumes of coupling buffer, and incubated with excess coupling buffer containing 100 mM ethanolamine (Eastman Kodak) for 3 hours with gentle rocking at room temperature. Finally, the beads were washed with 50 volumes of washing buffer (200 mM sodium acetate, 250 mM NaCl buffer at pH 4.5) and stored at 4 °C until use. The sepharose-trypsin beads are stable for a month under these conditions. Affinity purification of MCoTI-cyclotides was carried out as follows: 30 mL of clarified lysate was incubated with 500 μL of trypsin-sepharose for one hour at room temperature with gentle rocking, and centrifuged at 3000 rpm for 1 minute. The beads were washed with 50 volumes of column buffer containing 0.1% Tween 20 and then rinsed with 50 volumes of column buffer without detergent. The sepharose beads were treated with 3 x 0.5 mL of 8 M GdmCl at room temperature for 15 min and then eluted by gravity. The eluted fractions were analyzed by HPLC and ES-MS.
In-cell expression of cyclotide MCoTI-I using protein trans-splicing
[0093] Origami(DE3) cells (Novagen) were transformed with pET28-TS-MCoTI. Precursor intein 2a was expressed as previously described for 1a but in the presence of kanamycin (25 μg/L) instead of ampicillin. Cells were harvested and lysed as described above. MCoTI-I was purified from the cell lysate using sepharose-trypsin beads as described earlier.
In-cell expression of cyclotide MCoTI-OmeF using protein trans-splicing
[0094] Origami(DE3) cells (Novagen) were co-transformed with pET28-TS-MCoTI-stop and pVLOmeRS. Precursor intein 2b was expressed as previously described for 1b but in the presence of kanamycin (25 μg/L) and chloramphenicol (35 μg/L). Cells were harvested and lysed as described above. MCoTI-OmeF was purified from the cell lysate using sepharose-trypsin beads as described earlier and characterized by LC-MS.
In-cell expression of cyclotide MCoTI-aziF using protein trans-splicing
[0095] Origami(DE3) cells (Novagen) were co-transformed with pET28-TS-MCoTI-stop and pERAzi. Precursor intein 2c was expressed as previously described for 1c but in the presence of kanamycin (25 μg/L) and chloramphenicol (35 μg/L). Cells were harvested and lysed as described above. MCoTI-aziF was purified from the cell lysate using sepharose-trypsin beads as described earlier and characterized by LC-MS.
In vitro labeling of MCoTI-aziF with DBCO-AMCA
[0096] Trypsin-sepharose immobilized MCoTI-aziF (500 μL, ~2 μg, ~0.48 nmol) was reacted with DBCO-AMCA (0.2 mg, 0.33 μmol) in 500 μL PBS buffer (20 mM sodium phosphate, 100 mM NaCl buffer at pH 7.2) at 37 °C for 30 min. Once the labeling reaction was complete, the excess of dye was removed by washing the trypsin-sepharose immobilized AMCA-labeled MCoTI-aziF with PBS (5 x 10 mL). AMCA-labeled MCoTI-aziF was eluted with 8 M GdmCl and analyzed by LC-MS and ES-MS [AMCA-labeled MCoTI-aziF: expected averaged mass 4159.11 Da, found mass 4159.3 ± 0.25 Da] (FIG. 9).
In-cell labeling of MCoTI-aziF with DBCO-AMCA
[0097] In-cell production of MCoTI-aziF was carried out as described earlier. The cells (1 L) were washed with cold PBS (3 x 250 mL), resuspended in 10 mL of PBS and incubated with DBCO-AMCA (500 nM) at 37 °C for 4 h. Unreacted DBCO-AMCA inside cells was washed away with cold PBS (10 x 25 mL). Cells were lysed as described above. The AMCA-labeled MCoTI-aziF was purified on trypsin-sepharose and analyzed by LC-MS.
Expression of trypsin-S195A-EGFP
[0098] Origami(DE3) cells (Novagen) were transformed with plasmid pET25-Tryp-EGFP. Cells were grown in LB media containing ampicillin (100 μg/L) to an OD at 600 nm of ~0.62 at 37 °C. Protein expression was induced with 0.3 mM IPTG for 6 h at 30 °C. The cells were harvested by centrifugation, resuspended in 30 mL of lysis buffer (0.1 mM PMSF, 10 mM imidazole, 25 mM sodium phosphate, 150 mM NaCl buffer at pH 8.0 containing 5% glycerol) and lysed by sonication. The lysate was clarified by centrifugation at 15,000 rpm in a Sorvall SS-34 rotor for 30 minutes. The clarified supernatant was incubated with 1 mL of Ni-NTA agarose beads (Qiagen) previously equilibrated with column buffer (20 mM imidazole, 50 mM sodium phosphate, 300 mM NaCl buffer at pH 8.0) at 4 °C for 1 hour with gentle rocking. The Ni-NTA agarose beads were washed sequentially with column buffer containing (100 mL) followed by column buffer containing 20 mM imidazole (100 mL). The fusion protein was eluted with 2 mL of column buffer containing 100 mM EDTA. The protein was characterized as the desired product by ES-MS (FIG. 9). Quantification of trypsin-S195A-EGFP was carried out spectrophotometrically using an extinction coefficient per chain at 484 nm of 56,000 M⁻¹cm⁻¹. The expression level of soluble protein was estimated at ~160 μg/L.
Sequential co-expression of trypsin-Sl 95A-EGFP and MCoTI-aziF
[0099] Origami(DE3) cells (Novagen) were co-transformed with plasmids pASK-TS-MCoTIstop, pRSF-Tryp-EGFP and pERAzi. Expression was carried out in 2XYT medium (1 L) containing ampicillin (100 μg/L), chloramphenicol (35 μg/L) and kanamycin (25 μg/L) at room temperature for Origami(DE3) cells. Cells were grown to an OD at 600 nm of ~0.2 at 37 °C, at which point 1 mM p-azido-phenylalanine was added. Arabinose (0.02%) was added when the OD at 600 nm reached a value of 0.4 and 0.6, respectively, then the expression of MCoTI-aziF was induced with anhydrotetracycline (200 ng/L) for 18 h. The cells were pelleted and washed with cold PBS (3 x 250 mL), and then incubated with DBCO-AMCA as described earlier. Once the labeling reaction was complete, the cells were washed with cold PBS (3 x 50 mL) to remove any unreacted DBCO-AMCA and resuspended in M9 media at an OD at 600 nm of around 0.5. At this point the expression of trypsin-S195A-EGFP was induced with 0.3 mM IPTG at room temperature for 18 h.
Measurement of affinity constant between trypsin and AMCA-labeled MCoTI-aziF
[0100] The dissociation constant between trypsin and AMCA-labeled MCoTI-aziF was measured by fluorescence polarization anisotropy at 25 °C using a Jobin Yvon/Spex Fluorolog 3 spectrofluorometer with the excitation bandwidth set at 5 nm and emission at 5 nm. The excitation wavelength for coumarin was set at 360 nm and emission was monitored at 450 nm. The equilibrium dissociation constant (KD) for the interaction was obtained by titrating a fixed concentration of AMCA-labeled MCoTI-aziF (5 nM) with increasing concentrations of trypsin in 0.5 mM EDTA, 50 mM sodium phosphate, 150 mM NaCl at pH 7, assuming formation of a 1:1 complex. The calculated KD value was 4.5 ± 0.7 nM (FIG. 11).
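The text does not state the equation used to fit the titration. As a hedged illustration only (an assumption, not the documented procedure), a common model for a 1:1 complex when the labeled species (5 nM) is present at a concentration comparable to KD is the quadratic binding isotherm, which accounts for depletion of free trypsin. Here L_t is the total AMCA-labeled cyclotide, P_t the total trypsin, and r_f and r_b are the anisotropies of free and bound cyclotide; all of these symbols are introduced for this sketch, and the second equation additionally assumes that binding does not change the fluorescence intensity of the dye.

```latex
\[
[PL] \;=\; \frac{(L_t + P_t + K_D) \;-\; \sqrt{(L_t + P_t + K_D)^2 - 4\,L_t\,P_t}}{2}
\]
\[
r_{\mathrm{obs}} \;=\; r_f + (r_b - r_f)\,\frac{[PL]}{L_t}
\]
```

Fitting r_obs against P_t with K_D, r_f and r_b as free parameters would then return a KD estimate of the kind reported above.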
NMR spectroscopy
[0101] NMR samples were prepared by dissolving cyclotides into 80 mM potassium phosphate pH 6.0 in 90% H2O/10% 2H2O (v/v) to a concentration of approximately 0.5 mM for MCoTI-I and 0.1 mM for MCoTI-OmeF. All 1H NMR data were recorded on either Bruker Avance III 500 MHz or Bruker Avance II 700 MHz spectrometers equipped with TCI cryoprobes. Data were acquired at 298 K, and 2,2-dimethyl-2-silapentane-5-sulfonate (DSS) was used as an internal reference. The carrier frequency was centered on the water signal, and the solvent was suppressed by using the WATERGATE pulse sequence. 1H,1H-TOCSY (spin lock time 80 ms) and 1H,1H-NOESY (mixing time 150 ms) spectra were collected using 4096 t2 points and 256 t1 of 64 transients. Spectra were processed using Topspin 1.3 (Bruker). Each 2D data set was apodized by a 90°-shifted sinebell-squared function in all dimensions, and zero filled to 4096 x 512 points prior to Fourier transformation. Assignments for Hα and H' protons of folded MCoTI-cyclotides (Table 3) were obtained using standard procedures.
Table 2. Forward (p5) and reverse (p3) 5'-phosphorylated oligonucleotides used to produce plasmid pET28-TS-MCoTI-I
Figure imgf000033_0001
p3 5'-CGA AGC TAT GAA GCC ATT TTT GAG TGC AAA ATT ATG GTC GCG CTC AAC TCC AAT GTC ATA GAC ATT TTG TTT GCC TAA ATA TTT ACG TGT GGC TAT TTT GAT CAT GCT GCC GCG CGG CAC CAG GCC GCT GCT GTG ATG ATG ATG ATG ATG GCT GCT GCC C-3' (SEQ ID NO: 287)
Npu-Iiy p5 5'-CG AAC TGC GGT TCT GGT TCT GAC GGT GGT GTT TGC CCG AAA ATC CTG CAG CGT TGC CGT CGT GAC TCT GAC TGC CCG GGT GCT TGC ATC TGC CGT GGT AAC GGT TAC TGT TTA TCA-3' (SEQ ID NO: 288)
p3 5'-TA TGA TAA ACA GTA ACC GTT ACC ACG GCA GAT GCA AGC ACC CGG GCA GTC AGA GTC ACG ACG GCA ACG CTG CAG GAT TTT CGG GCA AAC ACC ACC GTC AGA ACC AGA ACC GCA GTT-3' (SEQ ID NO: 289)
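As a quick consistency check on the oligonucleotide pair listed as SEQ ID NO: 288 and 289, the following Python sketch tests whether the two strands are reverse complements of each other and translates the forward strand. It assumes, based only on inspection of the sequences, that each oligonucleotide carries a 2-nt 5' overhang and that the reading frame of the forward strand starts right after its overhang; the function names and the `overhang` parameter are hypothetical helpers, not part of the described method.

```python
# Oligonucleotides copied from Table 2 (SEQ ID NO: 288 and 289); spaces are removed.
P5 = ("CG AAC TGC GGT TCT GGT TCT GAC GGT GGT GTT TGC CCG "
      "AAA ATC CTG CAG CGT TGC CGT CGT GAC TCT GAC TGC CCG GGT GCT TGC "
      "ATC TGC CGT GGT AAC GGT TAC TGT TTA TCA").replace(" ", "")
P3 = ("TA TGA TAA ACA GTA ACC GTT ACC ACG GCA GAT GCA AGC ACC CGG GCA "
      "GTC AGA GTC ACG ACG GCA ACG CTG CAG GAT TTT CGG GCA AAC ACC ACC "
      "GTC AGA ACC AGA ACC GCA GTT").replace(" ", "")

# Standard genetic code, built from the usual TCAG ordering of codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons of seq, starting at its first base."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def duplex_summary(p5: str, p3: str, overhang: int = 2) -> dict:
    """Compare the two strands after removing an assumed 5' overhang from each."""
    core5, core3 = p5[overhang:], p3[overhang:]
    return {
        "complementary_core": core3 == reverse_complement(core5),
        "p5_overhang": p5[:overhang],
        "p3_overhang": p3[:overhang],
        "peptide": translate(core5),
    }


if __name__ == "__main__":
    for key, value in duplex_summary(P5, P3).items():
        print(f"{key}: {value}")
```

The translated frame can be compared by eye against the MCoTI-I sequence; the flanking residues at either end would belong to the neighboring parts of the construct rather than to the cyclotide itself.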
Results
[0102] This example tested the feasibility of introducing Uaas into folded cyclotides in living cells using the cyclotide MCoTI-I (FIG. 1A). This cyclotide is a powerful trypsin inhibitor (Ki ≈ 20 pM) that has been recently isolated from dormant seeds of Momordica cochinchinensis, a plant member of the Cucurbitaceae family.
[0103] Trypsin inhibitor cyclotides are interesting candidates for drug design because their specificity for inhibition can be altered and their structures can be used as natural scaffolds to generate novel binding activities.
[0104] Since MCoTI-cyclotides have been expressed inside Escherichia coli cells using an intramolecular version of expressed protein ligation (EPL), this example tried this approach first for the in-cell generation of MCoTI-based cyclotides containing different Uaas. This method relies on the use of a protein splicing unit in combination with an in-cell intramolecular native chemical ligation reaction to perform the backbone cyclization of the linear cyclotide precursor. The amber stop codon TAG was used to encode the Uaa at the position corresponding to residue Asp14 in MCoTI-I. This residue is located in the middle of loop 2 (FIG. 1A), which has been shown to be tolerant to mutations without affecting the structure and biological activity of the resulting cyclotide. The incorporation of Uaas into the cyclotide framework was tested using the Uaas p-methoxyphenylalanine (OmeF) and p-azidophenylalanine (AziF), which have been successfully incorporated into various recombinant proteins. More importantly, incorporation of AziF into the cyclotide framework should allow the site-specific incorporation of fluorescent probes into this scaffold inside the living cell by using alkyne-containing fluorescent probes and click chemistry.
[0105] First, the example explored the expression level of the corresponding intein precursors (1a and 1c, FIG. 1B) in BL21(DE3) cells. Expression of the intein precursors of MCoTI-OmeF and MCoTI-aziF was performed in cells co-transformed with a plasmid encoding the corresponding MCoTI-intein precursor for EPL-mediated cyclization and the plasmid encoding the orthogonal amber-suppressing tRNACUA/aminoacyl-tRNA synthetase pair specific for OmeF (pVLOmeRS) or AziF (pERAzi), respectively. Intein precursors were expressed in 2XYT medium at 30 °C for 4 h in the presence of 1 mM AziF or 2 mM OmeF. These conditions were optimized for the expression of the wild-type MCoTI-intein precursor in BL21(DE3) cells. In both cases, the expression level of the intein precursors containing Uaas (1b and 1c) was similar (FIG. 5). The suppression efficiency was estimated to be ~10% (MCoTI-OmeF precursor, 1b) and ~20% (MCoTI-aziF precursor, 1c) compared to the expression of the wild-type MCoTI-I intein precursor 1a (~40 mg/L).
[0106] These differences were attributed to the different promoters used to control the expression of the corresponding tRNACUA/synthetase pairs and the efficiencies of the corresponding tRNA synthetases. Under the expression conditions used in these experiments, all intein precursors showed around 60% in vivo cleavage, indicating that the intein was active and unaffected by the incorporation of the Uaa (FIG. 5).
[0107] Next, this example tested the ability of the different intein-MCoTI precursors to produce the corresponding folded cyclotide by treatment with reduced glutathione (GSH) at pH 7.2 following the conditions optimized for MCoTI-cyclotides. In both cases, the in vitro reaction was clean and efficient in providing major products as analyzed by analytical HPLC (FIG. 5). These products corresponded to the properly folded cyclotides MCoTI-OmeF and MCoTI-aziF as shown by electrospray mass spectrometry (ES-MS) analysis (FIG. 5) and retained their ability to bind specifically to trypsin (data not shown). The cyclotide MCoTI-OmeF was also characterized by homonuclear NMR spectroscopy to confirm the native cyclotide scaffold was intact (FIG. 7). The final yield after purification was 4 μg/L (MCoTI-OmeF) and 14 μg/L (MCoTI-aziF). The expression yield for the wild-type MCoTI-I using these expression and cyclization conditions was ~ 48 μg/L after purification.
[0108] Next, this example explored the expression of the MCoTI-OmeF and MCoTI-aziF cyclotides inside bacterial cells using EPL-mediated cyclization. This was accomplished by expressing the corresponding intein precursor in Origami2(DE3) cells. These cells have mutations in the thioredoxin and glutathione reductase genes, which facilitate the formation of disulfide bonds in the bacterial cytosol. Wild-type MCoTI-I was expressed in-cell, reaching intracellular concentrations of ~1 μM. When we tried this approach with the cyclotides MCoTI-OmeF and MCoTI-aziF, however, the amount of folded cyclotides was below the detection limit.
[0109] In order to boost the expression of cyclotides in living cells we explored the use of protein trans-splicing (PTS) to facilitate the in-cell cyclization process and to improve the expression yield of Uaa-containing cyclotides (FIG. 3). Protein trans-splicing is a post-translational modification similar to protein splicing, with the difference that the intein self-processing domain is split into N- (IN) and C-intein (IC) fragments. The split-intein fragments are not active individually; however, they can bind to each other with high specificity under appropriate conditions to form an active protein splicing or intein domain in trans.[40] PTS-mediated backbone cyclization can be accomplished by rearranging the order of the intein fragments. By fusing the IN and IC fragments to the C- and N-termini of the polypeptide for cyclization, the trans-splicing reaction yields a backbone-cyclized polypeptide (FIG. 3). This approach has been recently used for the biosynthesis of small cyclic hexapeptides containing Uaas (Young, T.S.; Young, D.D.; Ahmad, I.; Louis, J.M.; Benkovic, S.J.; Schultz, P.G. Evolution of cyclic peptide protease inhibitors. Proc Natl Acad Sci U S A, 2011, 108, 11052-11056). In that work, in-cell cyclization was performed using the naturally occurring Synechocystis sp. (Ssp) PCC6803 DnaE split intein. However, the Ssp DnaE intein requires specific amino acid residues at both intein-extein junctions for efficient protein splicing.
[0110] To overcome this problem this example used the Nostoc punctiforme PCC73102 (Npu) DnaE split-intein. This DnaE intein has the highest reported rate of protein trans-splicing (τ1/2 ~ 60 s) and has a high splicing yield. First, this example explored the ability of the Npu DnaE split-intein to produce folded wild-type MCoTI-I cyclotide inside living E. coli cells. To accomplish this, the example designed the split-intein construct 2a (FIG. 1B). In this construct, the MCoTI-I linear precursor was fused in-frame at the C- and N-termini directly to the Npu DnaE IN and IC polypeptides. None of the additional native C- or N-extein residues were added in this construct. This example used the native Cys residue located at the beginning of loop 6 of MCoTI-I (FIG. 1) to facilitate backbone cyclization. A His-tag was also added at the N-terminus of the construct to facilitate purification. In-cell expression of wild-type MCoTI-I using PTS-mediated backbone cyclization was achieved by transforming the plasmid encoding the split-precursor 2a into Origami(DE3) cells to facilitate folding. The MCoTI-precursor split-intein was overexpressed for 18 h at room temperature. Using these conditions the precursor was expressed at very high levels (~70 mg/L) and was almost completely cleaved (~95% in vivo cleavage, FIG. 2A). Reducing the induction time for the expression of the precursor 2a did not significantly decrease the level of in vivo cleavage, indicating the inherent ability of the construct to undergo protein trans-splicing. The high reactivity of this precursor prevented us from performing a full characterization of the precursor protein, including kinetic studies of the trans-splicing reaction in vitro.
[0111] Next, this example tried to isolate the natively folded MCoTI-I generated in-cell by incubating the soluble fraction of a fresh cell lysate with trypsin-immobilized sepharose beads. Correctly folded MCoTI-cyclotides are able to bind trypsin with high affinity (Ki ≈ 20-30 pM). Therefore, this step can be used for affinity purification and to test the biological activity of the recombinant cyclotides. After extensive washing, the adsorbed products were eluted with a solution containing 8 M guanidinium chloride (GdmCl) and analyzed by HPLC. The HPLC analysis revealed the presence of a major peak that had the expected mass of the native MCoTI-I fold (FIG. 2B and 6). Recombinant MCoTI-I produced by PTS-mediated cyclization was also characterized by NMR spectroscopy (FIG. 6 and Table 3) and was shown to adopt the native MCoTI-I fold. The in-cell expression level of folded MCoTI-I produced by PTS-mediated cyclization was estimated to be ~70 μg/L of bacterial culture, which corresponds to an intracellular concentration of ~7 μM.
Table 3. 1H backbone chemical shifts of MCoTI-cyclotides at 298 K in 20 mM sodium phosphate buffer at pH 6.0, 90%/10% H2O/D2O
Residue; δ HN (MCoTI-OMeF) / ppm; δ Hα (MCoTI-OMeF) / ppm; δ HN (MCoTI-I) / ppm; δ Hα (MCoTI-I) / ppm; Δδ HN* / ppm; Δδ Hα** / ppm
G1 8.057 3.666 8.069 3.943 -0.012 -0.277
G2 8.046 3.815 8.051 3.828 -0.005 -0.013
V3 8.297 3.844 8.319 3.853 -0.022 -0.009
C4 8.505 4.946 8.525 4.976 -0.02 -0.03
R6 8.02 4.717 8.142 4.131 -0.122 0.586
I7 7.526 4.205 7.564 4.253 -0.038 -0.048
L8 8.495 4.297 8.55 4.324 -0.055 -0.027
Q9 8.612 4.317 8.624 4.312 -0.012 0.005
R10 8.504 4.284 8.542 4.323 -0.038 -0.039
C11 8.277 4.686 8.225 4.753 0.052 -0.067
R12 9.314 4.322 9.271 4.322 0.043 0
R13 7.978 4.369 7.971 4.609 0.007 -0.24
OMeF/Asp14*** 7.533 4.832 (9.056) (3.956) - -
S15 8.069 3.954 8.088 4.15 -0.019 -0.196
D16 7.54 4.415 7.64 4.43 -0.1 -0.015
C17 7.644 5.037 7.766 4.807 -0.122 0.23
G19 8.429 3.599 8.452 3.629 -0.023 -0.03
A20 8.37 4.35 8.394 4.291 -0.024 0.059
C21 7.971 4.879 8.043 4.392 -0.072 0.487
I22 8.693 4.289 8.852 4.274 -0.159 0.015
C23 9.07 4.9 9.275 4.816 -0.205 0.084
R24 8.023 4.138 7.986 4.152 0.037 -0.014
G25 8.89 3.775 8.842 3.766 0.048 0.009
N26 7.626 4.504 7.663 4.543 -0.037 -0.039
G27 8.232 3.826 8.271 3.83 -0.039 -0.004
Y28 7.169 5.109 7.158 5.108 0.011 0.001
C29 8.696 5.311 8.643 5.265 0.053 0.046
G30 9.356 4.311 9.694 4.344 -0.338 -0.033
S31 8.683 4.406 8.66 4.351 0.023 0.055
G32 9.06 4.26 9.049 4.26 0.011 0
S33 8.601 4.288 8.589 4.341 0.012 -0.053
D34 8.295 4.484 8.3 4.48 -0.005 0.004
* Backbone amide proton chemical shift difference between mutant and wild type (HN mutant - HN wild type)
** Chemical shift difference of alpha protons between mutant and wild type (Hα mutant - Hα wild type)
*** The Asp14 residue of wild-type MCoTI-I is mutated to OmeF in the mutant MCoTI-OmeF
[0112] In-cell expression of folded MCoTI-cyclotides by PTS was about 7 times more efficient than intramolecular EPL-mediated backbone cyclization. This improvement may be explained by our choice of the split-intein Npu DnaE. This split-intein is extremely efficient; it exhibits fast kinetics with a good yield of protein trans-splicing. Differences in the cyclization process between the PTS and EPL methods may also contribute to the improvement in the cyclization yield. In PTS, the cyclization is driven by the affinity between the two intein fragments, IN and IC, which in the case of the Npu DnaE intein is very high (KD ≈ 3 nM). Once the intein complex is formed, the trans-splicing reaction is also extremely fast (τ1/2 ≈ 60 s for the Npu DnaE intein). In contrast, EPL-mediated cyclization follows a slightly more complex mechanism that relies on the formation of the C-terminal thioester at the N-extein junction and the removal of the N-terminal leading sequence (a Met residue in this case) to provide an N-terminal Cys. These two groups then react to form a peptide bond between the N- and C-termini of the polypeptide. It is also worth noting that, in contrast with the Ssp DnaE intein, which requires at least 4 native residues at the N- and C-terminal extein-intein junctions to work efficiently, the Npu ortholog used in this work tolerates different sequences at both junctions, as demonstrated by the efficient trans-splicing of precursor 2a (FIG. 2A). The tetrapeptide sequences at both intein-extein junctions in construct 2a have only 20% sequence homology with the native sequences of both Npu DnaE exteins.
[0113] Subsequently, this example conducted in-cell expression of cyclotides MCoTI-OmeF and MCoTI-aziF using PTS. For this purpose precursors 2b and 2c (FIG. 1B) were overexpressed in Origami(DE3) cells by transforming the plasmids pVLOmeRS and pERAzi and growing the bacterial cells in the presence of OmeF or AziF, respectively. Constructs 2b and 2c are similar to 2a but were designed to incorporate Uaas at residue Asp14 in MCoTI-I (FIG. 1B). The expression levels of the intein precursors 2b and 2c were ≈7 mg/L (10% suppression in comparison to wild-type precursor 2a) and ~20 mg/L (25% suppression), respectively. In-cell trans-splicing for 2b and 2c was also similar (~90%, FIG. 2A) to that of the wild-type PTS construct 2a.
[0114] Cyclotides MCoTI-OmeF and MCoTI-aziF were purified by affinity chromatography using trypsin sepharose beads from fresh soluble cell lysates, and the trypsin-bound fractions were analyzed by LC-MS/MS and ES-MS (Figs. 2C and S3). Cyclotide MCoTI-OmeF generated in-cell by PTS was also characterized by NMR, confirming the adoption of a native cyclotide fold (FIG. 2E and 8). The in-cell expression levels for cyclotides MCoTI-OmeF and MCoTI-aziF were estimated to be ~1 μg/L and ~2 μg/L, corresponding to intracellular concentrations of ~0.1 μM and 0.17 μM, respectively.
[0115] Next, this example explored the possibility of using fluorescent-labeled cyclotides to perform in-cell screening of protein-cyclotide interactions. To accomplish this, the example used MCoTI-aziF and anionic rat trypsin as a model system. Preliminary results showed that MCoTI-aziF can be efficiently labeled both in vitro (FIG. 9) and in vivo (FIG. 3 and 10) with a dibenzo-cyclooctyne (DBCO) derivative of the fluorescent dye amino-methyl-coumarin acetate (AMCA) through copper-free click chemistry. As expected, the resulting AMCA-labeled MCoTI-aziF was also able to bind trypsin efficiently (KD of 4.5 ± 0.7 nM, FIG. 11). To facilitate in-cell monitoring of this interaction, trypsin was fused to the N-terminus of green fluorescent protein (GFP). AMCA and GFP show a good overlap between the emission band of the donor (AMCA) and the absorption band of the acceptor (GFP), which should allow the visualization of the molecular interaction by fluorescence resonance energy transfer (FRET). Moreover, structural analysis of a MCoTI-II trypsin complex model[28] reveals that the distance between the C-terminus of trypsin and the Cα of residue 15 in MCoTI-I is ≈ 35 Å. This distance is well within range for the visualization of complex formation by FRET. The catalytic residue Ser195 in trypsin was also mutated to Ala in order to prevent its toxicity when expressed in bacterial cells. The mutation of this catalytic residue did not significantly affect the ability of trypsin-S195A-GFP to bind AMCA-labeled MCoTI-aziF (FIG. 12A-B).
[0116] To test the interaction between AMCA-labeled MCoTI-aziF and trypsin-S195A-GFP inside live bacterial cells, MCoTI-aziF and trypsin-S195A-GFP were encoded in inducible plasmids under the control of the tetracycline (pASK) and T7 (pRSF) promoters, respectively, to facilitate the co-expression of both proteins. These plasmids are completely orthogonal to each other and to the plasmid encoding the tRNACUA/tRNA synthetase pair (pERAzi) required for the incorporation of aziF. Co-transformed Origami(DE3) cells were first induced with 0.02% arabinose and 200 ng/L anhydrotetracycline in the presence of aziF to produce MCoTI-aziF. The amount of MCoTI-aziF produced under these conditions was similar to the value obtained when expressed under the control of a T7 promoter. After washing the cells to remove unused aziF, the cells were incubated with DBCO-AMCA in PBS for 4 h at 37 °C. In-cell AMCA-labeling of MCoTI-aziF was monitored through LC-MS, indicating that under these conditions all the MCoTI-aziF produced inside the cells reacted with DBCO-AMCA. The cells were washed again with PBS to remove unreacted DBCO-AMCA, resuspended in M9 and induced with isopropyl β-D-1-thiogalactopyranoside (IPTG) for 18 h at room temperature to induce the expression of trypsin-S195A-GFP. The intracellular concentration of trypsin-S195A-GFP was practically not affected by the expression and labeling of MCoTI-aziF and was estimated to be ~1 μM. The in-cell interaction between AMCA-labeled MCoTI-aziF and trypsin-S195A-EGFP was first analyzed by fluorescence spectroscopy. The fluorescence spectrum of the live cells revealed the presence of a strong FRET emission signal at 520 nm upon excitation of the AMCA fluorophore at 360 nm, indicating the intracellular formation of the trypsin-MCoTI complex. The presence of the complex was also confirmed by flow cytometry. The intracellular FRET efficiency (~0.6), calculated as the ratio between the fluorescence signal of the acceptor fluorophore excited at 360 nm and 415 nm, was also consistent with the dissociation constant for this molecular complex and the intracellular concentrations of trypsin and MCoTI-I. More importantly, these results show that in-cell produced fluorescent-labeled cyclotides can be used for monitoring and/or screening intracellular biomolecular interactions using fluorescence-based readout platforms.
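To make the consistency argument concrete, the following Python sketch estimates what fraction of the labeled cyclotide would be bound to trypsin at the concentrations quoted in the text. It is an illustration under stated assumptions, not a calculation taken from the source: it assumes a simple 1:1 equilibrium, that the KD of 4.5 nM measured in vitro also applies inside the cell, and that all ~0.17 μM MCoTI-aziF is labeled and available. The function name is hypothetical.

```python
import math


def complex_concentration(ligand_total: float, protein_total: float, kd: float) -> float:
    """Equilibrium concentration of a 1:1 complex (all arguments in the same units).

    Solves [PL]^2 - (Lt + Pt + Kd)[PL] + Lt*Pt = 0 and returns the physical root.
    """
    b = ligand_total + protein_total + kd
    return (b - math.sqrt(b * b - 4.0 * ligand_total * protein_total)) / 2.0


# Values quoted in the text, converted to micromolar.
KD_UM = 0.0045        # KD of 4.5 nM from fluorescence anisotropy (assumed valid in-cell)
TRYPSIN_UM = 1.0      # ~1 uM intracellular trypsin-S195A-GFP
CYCLOTIDE_UM = 0.17   # ~0.17 uM intracellular MCoTI-aziF (assumed fully labeled)

bound_um = complex_concentration(CYCLOTIDE_UM, TRYPSIN_UM, KD_UM)
fraction_cyclotide_bound = bound_um / CYCLOTIDE_UM

if __name__ == "__main__":
    print(f"complex: {bound_um:.4f} uM")
    print(f"fraction of cyclotide bound: {fraction_cyclotide_bound:.3f}")
```

Under these assumptions nearly all of the labeled cyclotide (roughly 99%) would be in complex with trypsin, which is in line with the observation of a strong intracellular FRET signal.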
[0117] In summary, this example shows that the biosynthesis of cyclotides containing Uaas can be achieved by using different intein-based methods. EPL-backbone cyclization can provide Uaa-containing cyclotides when the cyclization is carried out in vitro by GSH-induced cyclization and folding of the corresponding precursor. In-cell production, however, is less efficient using this method. This example also shows that PTS-mediated backbone cyclization using the highly efficient Npu DnaE split-intein can be employed for the efficient production of cyclotides inside live E. coli cells.
[0118] It is estimated that the in-cell production of MCoTI-I was around 7 times more efficient using Npu DnaE PTS than EPL, thereby providing an attractive alternative for the production of these types of polypeptides in bacterial cells. The high efficiency of PTS-mediated cyclization combined with nonsense-suppressing orthogonal tRNA/synthetase technology made the in-cell production of cyclotides containing Uaas possible. Of particular interest is the introduction of azido-containing Uaas, which can react with DBCO-containing fluorescent probes to provide in-cell fluorescent-labeled cyclotides. The classical approach for in-cell production of fluorescent-labeled proteins by fusing a fluorescent protein to one of the termini of the protein to be studied is not applicable to cyclotides due to their backbone-cyclized topology.
[0119] This example has shown that cyclotides containing the Uaa aziF can be expressed in live bacterial cells and easily labeled with DBCO-AMCA to monitor/screen intracellular cyclotide-protein interactions. This interesting finding makes possible in-cell screening of genetically-encoded libraries of cyclotides for the rapid selection of novel cyclotide sequences able to bind a specific bait protein using high throughput cell-based optical screening approaches such as fluorescence activated cell sorting (FACS).
[0120] It is to be understood that while the invention has been described in conjunction with the above embodiments, the foregoing description and examples are intended to illustrate and not limit the scope of the invention. Other aspects, advantages and modifications within the scope of the invention will be apparent to those skilled in the art to which the invention pertains.

Claims

CLAIMS:
1. An isolated or recombinant polypeptide comprising a linear cyclotide fused to a C-terminal fragment and an N-terminal fragment of a split intein, at the N-terminus and C-terminus of the cyclotide, respectively.
2. The polypeptide of claim 1, wherein the split intein comprises a DnaE split intein.
3. The polypeptide of claim 2, wherein the DnaE split intein comprises a Nostoc punctiforme PCC73102 DnaE split intein.
4. The polypeptide of claim 3, wherein the C-terminal fragment comprises an amino acid sequence of SEQ ID NO: 2.
5. The polypeptide of claim 3 or 4, wherein the N-terminal fragment comprises an amino acid sequence of SEQ ID NO: 3.
6. The polypeptide of any preceding claim, wherein the cyclotide comprises an amino acid sequence selected from Table 1 or an amino acid sequence that has at least about 90% sequence identity thereto.
7. The polypeptide of any preceding claim, wherein the cyclotide comprises at least one unnatural amino acid residue but retains six cysteine residues that form three disulfide bonds in a cyclized cyclotide.
8. The polypeptide of claim 7, wherein the unnatural amino acid comprises one or more selected from p-methoxyphenylalanine, p-azidophenylalanine or L-(7-hydroxycoumarin-4-yl)ethylglycine.
9. The polypeptide of claim 7, wherein the cyclotide comprises an amino acid sequence of SEQ ID NO: 1 or a biological equivalent thereof.
10. A method for preparing a cyclic peptide, comprising incubating a polypeptide of any one of claims 1-9 under conditions for the linear cyclotide to cyclize.
11. An isolated polynucleotide comprising a nucleic acid sequence encoding a polypeptide of any one of claims 1-9 or a biological equivalent thereof or a polynucleotide that hybridizes under conditions of high stringency to the polynucleotide or its complement.
12. A method for preparing a cyclic peptide comprising expressing a linear cyclotide in a cell and cyclizing the linear cyclotide, wherein the cyclotide comprises at least an unnatural amino acid but retains six cysteine residues to form three disulfide bonds.
13. A cyclized cyclotide obtainable by the method of claim 12.
14. The cyclotide of claim 13, wherein the unnatural amino acid is modified with an agent comprising a detectable label.
15. A vector or host cell comprising the polynucleotide of claim 11.
16. A composition comprising a carrier and one or more of a polypeptide of claims 1 to 7, a polynucleotide of claim 11, a cyclized cyclotide of claim 13 or 14, or a vector or host cell of claim 15.
PCT/US2013/031741 2012-09-19 2013-03-14 Preparation of cyclotides WO2014046731A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201261703108P 2012-09-19 2012-09-19
US61/703,108 2012-09-19

Publications (1)

Publication Number Publication Date
WO2014046731A1 true WO2014046731A1 (en) 2014-03-27

Family

ID=50341833

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2013/031741 WO2014046731A1 (en) 2012-09-19 2013-03-14 Preparation of cyclotides

Country Status (1)

Country Link
WO (1) WO2014046731A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108289428A (en) * 2015-09-30 2018-07-17 赫希玛有限公司 A kind of method
WO2023215032A3 (en) * 2022-05-03 2024-03-14 University Of Southern California Potent anti-cancer cyclotides

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
GARCIA A. E. ET AL.: "Biological activities of natural and engineered cyclotides, a novel molecular scaffold for peptide-based therapeutics", CURRENT MOLECULAR PHARMACOLOGY, vol. 3, no. 3, 2010, pages 153 - 163 *
JAGADISH K. ET AL.: "Expression of fluorescent cyclotides using protein trans-splicing for easy monitoring of cyclotide-protein interactions", ANGEWANDTE CHEMIE INTERNATIONAL EDITION, vol. 52, no. 11, 2013, pages 3126 - 3131 *
SCOTT, C. P. ET AL.: "Production of cyclic peptides and proteins in vivo", PNAS, vol. 96, no. 24, 1999, pages 13638 - 13643 *
SCOTT, C. P. ET AL.: "Structural requirements for the biosynthesis of backbone cyclic peptide libraries", CHEMISTRY AND BIOLOGY, vol. 8, 2001, pages 801 - 815 *
TAVASSOLI A ET AL.: "Split-intein mediated circular ligation used in the synthesis of cyclic peptide libraries in E. coli", NATURE PROTOCOLS, vol. 2, no. 5, 2007, pages 1126 - 1133 *
YOUNG, T. S. ET AL.: "Evolution of cyclic peptide protease inhibitors", PNAS, vol. 108, no. 27, 2011, pages 11052 - 11056 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13839218

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13839218

Country of ref document: EP

Kind code of ref document: A1