WO2014170853A1 - Methods and materials for encapsulating proteins - Google Patents

Methods and materials for encapsulating proteins

Info

Publication number
WO2014170853A1
Authority
WO
WIPO (PCT)
Prior art keywords
protein
interest
encapsulated
rhs
repeat
Prior art date
Application number
PCT/IB2014/060784
Other languages
French (fr)
Inventor
Jason Nicholas BUSBY
Jeremy Shaun LOTT
Mark Robin Holmes Hurst
Original Assignee
Agresearch Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Agresearch Limited
Priority to US14/785,505 (US10526378B2)
Priority to AU2014255358A (AU2014255358B2)
Priority to EP14786060.5A (EP2986725A4)
Publication of WO2014170853A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C07K14/24 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Enterobacteriaceae (F), e.g. Citrobacter, Serratia, Proteus, Providencia, Morganella, Yersinia
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01N PRESERVATION OF BODIES OF HUMANS OR ANIMALS OR PLANTS OR PARTS THEREOF; BIOCIDES, e.g. AS DISINFECTANTS, AS PESTICIDES OR AS HERBICIDES; PEST REPELLANTS OR ATTRACTANTS; PLANT GROWTH REGULATORS
    • A01N37/00 Biocides, pest repellants or attractants, or plant growth regulators containing organic compounds containing a carbon atom having three bonds to hetero atoms with at the most two bonds to halogen, e.g. carboxylic acids
    • A01N37/18 Biocides, pest repellants or attractants, or plant growth regulators containing organic compounds containing a carbon atom having three bonds to hetero atoms with at the most two bonds to halogen, e.g. carboxylic acids containing the group —CO—N<, e.g. carboxylic acid amides or imides; Thio analogues thereof
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8241 Phenotypically and genetically modified plants via recombinant DNA technology
    • C12N15/8261 Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield
    • C12N15/8271 Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance
    • C12N15/8279 Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance for biotic stress resistance, pathogen resistance, disease resistance
    • C12N15/8286 Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance for biotic stress resistance, pathogen resistance, disease resistance for insect resistance
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/48 Hydrolases (3) acting on peptide bonds (3.4)
    • C12N9/50 Proteinases, e.g. Endopeptidases (3.4.21-3.4.25)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/48 Hydrolases (3) acting on peptide bonds (3.4)
    • C12N9/50 Proteinases, e.g. Endopeptidases (3.4.21-3.4.25)
    • C12N9/52 Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from bacteria or Archaea
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/50 Fusion polypeptide containing protease site
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y304/00 Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
    • C12Y304/23 Aspartic endopeptidases (3.4.23)
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A40/00 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A40/10 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture
    • Y02A40/146 Genetically Modified [GMO] plants, e.g. transgenic plants

Definitions

  • the invention relates to various applications for encapsulated proteins.
  • the protein for example a therapeutic
  • it may be necessary to protect the protein from degradation due to chemical, physical, and biological factors in certain environments, before it is introduced into the target environment
  • a particular biological environment from the protein if the protein is, for example, a targeted toxin.
  • growth factors are widely used in tissue engineering applications to induce and guide blood vessel formation.
  • protective matrices such as hydrogel scaffolds can result in a reduction in activity due to reactions such as cross-linking.
  • the invention provides a method for encapsulating a protein of interest, the method comprising the step of expressing a fusion protein comprising an N-terminal region of a rearrangement hot spot (RHS)-repeat-containing protein fused to the protein of interest.
  • RHS rearrangement hot spot
  • the fusion protein is expressed in a cell.
  • the protein of interest, upon expression and folding of the fusion protein, is cleaved from the N-terminal region of the RHS-repeat-containing protein.
  • cleavage is effected by the action of a protease intrinsic to the N-terminal region of the RHS-repeat-containing protein.
  • the protease is an aspartate protease.
  • the protein of interest is encapsulated by a shell comprising the N-terminal region of the RHS-repeat-containing protein.
  • the shell is hollow.
  • the shell is formed from one long strip of β-sheet, or β-sheets, that wraps around a central cavity.
  • the β-sheet is composed of at least 50 β-strands, preferably at least 60 β-strands, preferably at least 70 β-strands, preferably at least 80 β-strands, preferably at least 90 β-strands, derived from the N-terminal region of the RHS-repeat-containing protein.
  • the shell includes a β-sheet formed from between 60 and 90 β-strands, preferably between 70 and 80, preferably about 76, and most preferably 76 of the β-strands.
  • the central cavity is between 30 and 55, preferably between 35 and 50, preferably between 40 and 44, preferably about 42, preferably 42 Å wide.
  • the central cavity is between 75 and 100, preferably between 80 and 95, preferably between 85 and 90, preferably about 87, preferably 87 Å long.
  • the central cavity has a total enclosed volume of between 45,000 and 75,000, preferably between 50,000 and 70,000, preferably between 55,000 and 65,000, preferably about 59,000, preferably 59,000 Å³.
  • the shell is closed at both ends.
  • carboxy end of the shell is closed by an RHS-repeat-associated core domain.
  • the RHS-repeat-associated core domain also acts as a protease.
  • the overall shape of the shell is reminiscent of a hollow egg.
  • Protein of interest (with any RHS)
  • the protein of interest is one that is normally naturally associated with the N-terminal region of the RHS-repeat-containing protein. That is, the protein of interest is the naturally occurring C-terminal region of the RHS-repeat-containing protein.
  • the protein of interest is one that is not normally naturally associated with the N-terminal region of the RHS-repeat-containing protein.
  • the protein of interest is heterologous to the N-terminal region of the RHS-repeat-containing protein.
  • the protein of interest is small enough to fit inside the shell.
  • the protein of interest has a molecular weight of less than 103kDa.
  • the molecular weight is less than 44kDa. More preferably the molecular weight is less than 36kDa.
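The size limits above can be related to the stated cavity volume with a rough calculation. The following is a minimal sketch, not the applicants' method: it assumes a typical textbook partial specific volume for folded proteins (0.73 cm³/g, a value that does not come from this document) and ignores shape, hydration, and the possibility that the cargo is not compactly folded inside the shell. The function names are hypothetical.

```python
AVOGADRO = 6.02214076e23        # molecules per mole
PARTIAL_SPECIFIC_VOLUME = 0.73  # cm^3 per gram; assumed typical value, not from the document
CAVITY_VOLUME_A3 = 59_000.0     # cubic angstroms; the preferred enclosed volume stated in the text


def protein_volume_a3(mass_kda: float) -> float:
    """Approximate volume (cubic angstroms) of a compact protein of the given mass."""
    grams_per_molecule = mass_kda * 1000.0 / AVOGADRO
    return grams_per_molecule * PARTIAL_SPECIFIC_VOLUME * 1e24  # 1 cm^3 = 1e24 cubic angstroms


def fill_fraction(mass_kda: float, cavity_a3: float = CAVITY_VOLUME_A3) -> float:
    """Fraction of the cavity volume that such a protein would occupy."""
    return protein_volume_a3(mass_kda) / cavity_a3


if __name__ == "__main__":
    # Compare two of the molecular weight limits quoted in the text.
    for mass in (36, 44):
        print(f"{mass} kDa occupies about {fill_fraction(mass):.0%} of the cavity")
```

Under these assumptions a fill fraction below 1 only indicates that the cargo is not obviously too large; it says nothing about whether a particular protein folds or packs correctly inside the shell.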
  • the RHS-repeat-containing protein is selected from a toxin complex C (TcC) component of a bacterial toxin complex, a non-toxin complex RHS-repeat-containing protein, and a YD-repeat-containing protein.
  • TcC toxin complex C
  • the RHS-repeat-containing protein is a toxin complex C (TcC) component of a bacterial toxin complex
  • the RHS-containing protein is a toxin complex C (TcC) component of a bacterial toxin complex.
  • the fusion protein comprising the N-terminal region of the toxin complex C (TcC) component is co-expressed with a toxin complex B (TcB) component of a bacterial toxin complex.
  • TcB toxin complex B
  • a fusion protein comprising an N-terminal region of a toxin complex C (TcC)
  • a) and b) are expressed in a cell.
  • Protein of interest is expressed in a cell.
  • the protein of interest is one that is normally naturally associated with the N-terminal region of a toxin complex C (TcC) component. That is, the protein of interest is the naturally associated C-terminal region of a toxin complex C (TcC) component.
  • the protein of interest is one that is not normally naturally associated with the N-terminal region of the toxin complex C (TcC) component.
  • the protein of interest is heterologous to the N-terminal region of the toxin complex C (TcC) component.
  • the protein of interest is small enough to fit inside the shell.
  • the protein of interest has a molecular weight of less than 40kDa.
  • the molecular weight is less than 35kDa. More preferably the molecular weight is less than 32kDa.
  • the protein of interest is cleaved from the N-terminal region of the TcC component upon formation of a complex between the TcB and the fusion protein.
  • cleavage is effected by the action of a protease intrinsic to the N-terminal region of the TcC component.
  • the protease is an aspartate protease.
  • the protease activity is encoded by an RHS-repeat associated core domain sequence.
  • RHS-repeat associated core domain sequence is as defined herein.
  • the protein of interest is encapsulated by a shell formed by a complex of the TcB component and the N-terminal region of the TcC.
  • the shell is hollow.
  • the shell is formed from one long strip of β-sheet, or β-sheets, that wraps around a central cavity.
  • the β-sheet is composed of at least 50 β-strands, preferably at least 60 β-strands, preferably at least 70 β-strands, preferably at least 80 β-strands, preferably at least 90 β-strands, derived from the TcB component and the N-terminal region of TcC.
  • the shell includes a β-sheet formed from between 60 and 90 β-strands, preferably between 70 and 80, preferably about 76, and most preferably 76 of the β-strands.
  • C-terminus of TcB and the N-terminus of the N-terminal region of TcC are in close proximity.
  • C-terminus of TcB and the N-terminus of the N-terminal region of TcC are within 7.5 Å of each other.
  • the central cavity is between 30 and 55, preferably between 35 and 50, preferably between 40 and 44, preferably about 42, preferably 42 Å wide.
  • the central cavity is between 75 and 100, preferably between 80 and 95, preferably between 85 and 90, preferably about 87, preferably 87 Å long.
  • the central cavity has a total enclosed volume of between 45,000 and 75,000, preferably between 50,000 and 70,000, preferably between 55,000 and 65,000, preferably about 59,000, preferably 59,000 Å³.
  • the shell is closed at both ends.
  • the shell includes a β-propeller domain at the TcB end of the shell.
  • the β-propeller domain is inserted into the loop between the 29th β-strand (β29) and the 51st β-strand (β51).
  • the β-propeller domain closes the TcB end of the shell.
  • the shell includes an RHS-repeat-associated core domain at the TcC end of the shell.
  • the RHS-repeat-associated core domain is a short strip of β-sheet that spirals inwards at the TcC end of the shell.
  • the RHS-repeat-associated core domain is formed by a region extending from the 45th β-strand (β45) to the 49th β-strand (β49).
  • the RHS-repeat-associated core domain forms a plug.
  • the RHS-repeat-associated core domain closes the TcC end of the shell.
  • the plug closes the TcC end of the shell.
  • the overall shape of the shell is reminiscent of a hollow egg.
  • the invention provides an encapsulated protein of interest produced by the method of the invention.
  • the invention provides a protein of interest encapsulated by a shell formed by the N-terminal region of the RHS-repeat-containing protein.
  • the invention provides a protein of interest encapsulated by a shell formed by a complex of the TcB component and the N-terminal region of the TcC component.
  • the invention provides a cell comprising an encapsulated protein according to the invention.
  • the invention provides a composition comprising an encapsulated protein of the invention or produced by a method of the invention.
  • the composition is an insecticidal composition.
  • the composition has insecticidal activity.
  • the composition comprises an agriculturally acceptable carrier.
  • composition is a pharmaceutical composition.
  • composition has pharmaceutical activity.
  • composition comprises a
  • the encapsulated protein is releasable or can be released, from the shell.
  • the encapsulated protein is releasable, can be released, or is released, from the shell in certain conditions.
  • the encapsulated protein is releasable, can be released, or is released, from the shell by lowering the pH of the environment surrounding the encapsulated protein.
  • the encapsulated protein is releasable, can be released, or is released, from the shell by introducing the encapsulated protein into a low pH environment.
  • the encapsulated protein is releasable, can be released, or is released, when the pH is less than 5.5.
  • the pH is less than 5.0. More preferably the pH is less than 4.5.
  • Release method
  • the invention provides a method of controlled release of a protein of interest, the method comprising placing an encapsulated protein of the invention, or produced by the method of the invention, into an appropriate environment.
  • the appropriate environment affects release of the protein of interest.
  • the invention provides a method of controlled release of a protein of interest, the method comprising placing an encapsulated protein of the invention, or produced by the method of the invention, into a low pH environment.
  • the low pH environment has a pH of less than 5.5.
  • the pH is less than 5.0. More preferably the pH is less than 4.5.
  • the protein of interest is released by a conformational change in the shell encapsulating the protein of interest.
  • the pH-induced conformational change is opening of the shell resulting in release of the protein of interest.
  • the conformational change involves separation of the β-propeller blades allowing extrusion of an unfolded protein of interest through the middle of the propeller.
  • the conformational change is induced by a lowering of the pH environment of the encapsulated protein.
  • the invention provides a method of delivering a protein of interest to a low pH environment.
  • the low pH environment has a pH of less than 5.5.
  • the pH is less than 5.0. More preferably the pH is less than 4.5.
  • the low pH environment is the endosome of a cell.
  • the low pH environment affects release of the encapsulated protein from the shell to deliver the protein of interest into the low pH environment.
  • the low pH environment triggers delivery of the protein of interest into the cytosol of the cell.
  • the protein of interest is released by a pH-induced conformational change in the shell encapsulating the protein of interest.
  • the pH-induced conformational change is opening of the shell resulting in release of the protein of interest.
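The nested pH limits quoted above (less than 5.5, preferably less than 5.0, more preferably less than 4.5) can be expressed as a small lookup. This is only an illustrative sketch: the function name and the idea of reporting the tightest limit are assumptions made here, and the code encodes the stated thresholds rather than any measured release behaviour.

```python
from typing import Optional

# Thresholds quoted in the text, ordered from most preferred (lowest) to broadest.
RELEASE_THRESHOLDS = (4.5, 5.0, 5.5)


def tightest_threshold_met(ph: float) -> Optional[float]:
    """Return the lowest quoted threshold that the pH falls below, or None if it meets none."""
    for threshold in RELEASE_THRESHOLDS:
        if ph < threshold:
            return threshold
    return None


if __name__ == "__main__":
    for ph in (4.0, 4.7, 5.2, 7.4):
        print(ph, tightest_threshold_met(ph))
```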
  • the invention provides a method of delivering a protein of interest into a cell, the method comprising contacting the cell with an encapsulated protein of the invention.
  • delivery requires co-expression of a TcA component of a bacterial toxin complex.
  • the invention provides a method of controlling a pest, the method comprising contacting an encapsulated protein of the invention, or produced by a method of the invention, with the pest.
  • the pest is a pest of a plant. In one embodiment the pest is an insect.
  • the protein of interest is a protein that is toxic to the insect.
  • the pest is selected from the Lepidoptera, Coleoptera, Diptera, and Orthoptera.
  • the pest is selected from the following list: Antheraea eucalypti
  • the pest is selected from Odontria, Papuana, Pericoptus, Pyronota, Wiseana, and Costelytra.
  • the pest is a nematode.
  • the nematode is from the genus Heterorhabditis.
  • the encapsulated protein is produced in the plant by expressing in the plant a fusion protein comprising an N-terminal region of a rearrangement hot spot (RHS)-repeat-containing protein fused to the protein of interest.
  • a fusion protein comprising an N-terminal region of a toxin complex C (TcC) component of a bacterial toxin complex fused to the protein of interest.
  • the method requires co-expression of a TcA component of a bacterial toxin complex.
  • the pest is contacted by the encapsulated protein produced in the plant.
  • the pest is contacted when it ingests the encapsulated protein.
  • the invention provides a method for producing an insect-resistant plant, the method comprising expressing in the plant a fusion protein comprising an N-terminal region of a rearrangement hot spot (RHS)-repeat-containing protein fused to the protein of interest.
  • fusion protein comprising an N-terminal region of a toxin complex C (TcC)
  • the method requires co-expression of a TcA component of a bacterial toxin complex.
  • TcB is an example of a protein that contains RHS (rearrangement hot spot) repeats.
  • the applicants' data illustrate a structural architecture that is likely to be conserved across both this widely distributed bacterial RHS-repeat-containing protein family and the eukaryotic YD-repeat-containing protein family (which they show is a subset of the broader class of RHS-repeat-containing proteins).
  • the applicants provide a generic mechanism for protein encapsulation and delivery. This is described in detail in Example 1.
  • the method involves co-expressing the TcB component and a fusion protein consisting of an N-terminal region of the TcC component fused to the protein of interest.
  • Rearrangement hot spot (RHS)-repeat-containing protein
  • "RHS-repeat-containing protein" means a protein with an amino acid sequence that contains RHS repeats.
  • RHS-repeat-containing proteins include the toxin complex C (TcC) component of a bacterial toxin complex, non-toxin-complex RHS-repeat-containing proteins, and YD-repeat-containing proteins.
  • the RHS repeats conform to the profile hidden Markov model (profile-HMM) as described in the file YD_RHS_combined_jackHMMER.hmm (Figure 14).
  • the RHS repeats conform to the profile-HMM as described in the file RHS_repeat_pf05593.hmm (Figure 12).
  • the RHS repeats conform to the profile-HMM as described in the file YD_repeat_TIGR01643.HMM (Figure 13).
  • the proteins of this embodiment are of the YD repeat class of RHS-repeat containing proteins.
  • Profile HMMs turn a multiple sequence alignment into a position-specific scoring system suitable for searching databases for remotely homologous sequences.
  • Profile HMM analyses complement standard pairwise comparison methods for large-scale sequence analysis.
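To illustrate how an alignment becomes a position-specific scoring system, here is a deliberately simplified sketch. It builds a log-odds scoring matrix from an ungapped alignment and slides it along a query; it omits the insert and delete states and the calibrated statistics of a real profile HMM, so it is not a substitute for the HMMER profiles named in the text. The toy alignment, the query sequence, and the function names are all invented for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_scoring_matrix(alignment, pseudocount=1.0):
    """Return one dict per alignment column mapping residue -> log2-odds vs a uniform background."""
    background = 1.0 / len(AMINO_ACIDS)
    matrix = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        matrix.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return matrix


def score_segment(matrix, segment):
    """Sum the per-position scores for a segment as long as the matrix."""
    return sum(column[res] for column, res in zip(matrix, segment))


def best_window(matrix, sequence):
    """Return (score, start) of the best-scoring window in the sequence."""
    width = len(matrix)
    return max(
        (score_segment(matrix, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


if __name__ == "__main__":
    toy_alignment = ["DAYGR", "DSYGN", "DKFGR", "DAWGK"]  # hypothetical aligned motif instances
    matrix = build_scoring_matrix(toy_alignment)
    print(best_window(matrix, "MKTLDAYGRLLE"))
```

Residues seen often in a column score positively and unseen residues score slightly negatively, which is the basic reason such profiles can detect remote homologues that pairwise comparison misses.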
  • RHS-repeat containing proteins typically contain 2 to 60 RHS repeats.
  • the RHS-repeat containing proteins as used in the methods and products of the invention contain at least 5, preferably at least 20, more preferably at least 30 RHS repeats. In a further embodiment the RHS-repeat containing proteins as used in the methods and products of the invention contain between 5 and 60, preferably between 15 and 50, more preferably between 35 and 45 RHS repeats.
  • the RHS-containing protein is selected from a toxin complex C (TcC) component of a bacterial toxin complex, a non-toxin complex RHS containing protein, and a YD-repeat containing protein.
  • RhsA (McNulty et al., 2006)
  • In addition to possessing RHS repeats, RHS repeat containing proteins also typically contain an "RHS repeat-associated core domain".
  • the RHS repeat containing proteins as used in the methods and products of the invention contain an RHS repeat-associated core domain.
  • the RHS repeat-associated core domain is highly conserved, approximately 76 amino acids in length, and has aspartic peptidase activity as described here.
  • the RHS repeat-associated core domain conforms to the profile-HMM in file RHS_associated_core_domain_TIGR03696.HMM (Figure 15). Further description of the RHS repeat-associated core domain can be found at http://www.ebi.ac.uk/interpro/entry/IPR022385.
  • the RHS repeat-associated core domain is defined by performing an alignment between the RHS repeat containing protein sequence and the profile-HMM of the RHS repeat-associated core domain (RHS_associated_core_domain_TIGR03696.HMM, Figure 15), using a program such as hmmalign.
  • the protein of interest will be joined at the C-terminal end of the RHS repeat-associated core domain.
  • the final residues of the conserved RHS-associated core domain are generally "DxxGx", and when expressed, cleavage will occur after the residue following the glycine.
  • the protease activity, intrinsic to the RHS repeat-containing protein, is encoded by the RHS repeat-associated core domain sequence.
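The cleavage rule described above (the core domain generally ends in "DxxGx", and cleavage occurs after the residue following the glycine) can be sketched as a simple search. This is an assumption-laden illustration: it takes the approximate span of the core domain as an input, on the assumption that the span has already been located by a profile-HMM alignment, and it uses a plain pattern match rather than the alignment itself. The sequences and function names are hypothetical.

```python
import re

# Lookahead so that overlapping candidate motifs are all considered.
CORE_END_MOTIF = re.compile(r"(?=D..G.)")


def cleavage_index(sequence: str, domain_start: int, domain_end: int):
    """Index of the first residue after the predicted cleavage site, or None if no motif is found.

    Only the region sequence[domain_start:domain_end] is searched, and the last
    DxxGx match inside that region is used.
    """
    region = sequence[domain_start:domain_end]
    starts = [m.start() for m in CORE_END_MOTIF.finditer(region)]
    if not starts:
        return None
    return domain_start + starts[-1] + 5  # the motif is five residues long


def split_fusion(sequence: str, domain_start: int, domain_end: int):
    """Split a fusion protein into (shell-forming part, released protein of interest)."""
    cut = cleavage_index(sequence, domain_start, domain_end)
    if cut is None:
        return None
    return sequence[:cut], sequence[cut:]


if __name__ == "__main__":
    toy_fusion = "MSSTAVQDPSGL" + "MKTAYI"  # invented shell end plus invented cargo
    print(split_fusion(toy_fusion, 0, 12))
```

Restricting the search to the known domain span avoids matching incidental "DxxGx" patterns elsewhere in the protein, including inside the protein of interest.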
  • the RHS-containing protein is selected from a toxin complex C (TcC) component of a bacterial toxin complex, a non-toxin complex RHS repeat-containing protein, and a YD-repeat containing protein.
  • Bacterial toxin complex refers to the large, multi-subunit toxin complexes (Tc) produced by some bacteria. Bacterial toxin complexes are of interest due to their potent oral insecticidal activity. They are composed of at least three toxin complex proteins, TcA, TcB and TcC, which are considered to be required to assemble together in order to be fully toxic.
  • TcA components are listed in the Table 1 below. The sequences as indicated in the table are provided in the sequence listing. Table 1. TcA component sequences
  • TcB components which can be used in the methods and compositions of the invention, are listed in the Table 2 below.
  • the TcB component is from a species selected from those listed in Table 2.
  • the TcB component has an amino acid sequence with at least 70% identity to a sequence selected from any one of SEQ ID NO: 11 to SEQ ID NO: 19.
  • the TcB component has an amino acid sequence selected from any one of SEQ ID NO:11 to SEQ ID NO:19.
  • the TcB component is encoded by a sequence with at least 70% identity to a sequence selected from any one of SEQ ID NO:81 to SEQ ID NO:89. In a further embodiment the TcB component is encoded by a sequence selected from any one of SEQ ID NO:81 to SEQ ID NO:89.
  • the TcB component is from Yersinia entomophaga.
  • the TcB component has a sequence with at least 70% sequence identity to SEQ ID NO:11.
  • the TcB component has the sequence of SEQ ID NO:11.
  • the TcB component is encoded by a sequence with at least 70% sequence identity to SEQ ID NO:81.
  • the TcB component is encoded by the sequence of SEQ ID NO:81.
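The "at least 70% identity" criterion used above can be made concrete with a short calculation. This section does not state how identity is computed, so the sketch below adopts one common convention as an assumption: the two sequences are supplied already aligned (gaps as "-"), and identity is the number of matching residue columns divided by the number of alignment columns that contain at least one residue. Other tools use different denominators, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length (gaps written as '-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # a column with gaps in both sequences carries no information
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True when the pair reaches the identity threshold (70% by default, as in the text)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    print(percent_identity("ACDEFGHIKL", "ACDEYGHIRL"))
    print(meets_identity_threshold("ACD-FG", "ACDEFG"))
```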
  • TcC components which can be used in the methods and compositions of the invention, are listed in the Table 3 below.
  • the TcC component is from a species selected from those listed in Table 3.
  • the TcC component has an amino acid sequence with at least 70% identity to a sequence selected from any one of SEQ ID NO:20 to 29.
  • the TcC component has an amino acid sequence selected from any one of SEQ ID NO:20 to 29.
  • the TcC component is encoded by a sequence with at least 70% identity to a sequence selected from any one of SEQ ID NO:90 to 99.
  • the TcC component is encoded by a sequence selected from any one of SEQ ID NO:90 to 99.
  • the TcC component is from Yersinia entomophaga.
  • the TcC component has a sequence with at least 70% sequence identity to SEQ ID NO:21.
  • the TcC component has the sequence of SEQ ID NO:21.
  • the TcC component is encoded by a sequence with at least 70% sequence identity to SEQ ID NO:91. In a further embodiment the TcC component is encoded by the sequence of SEQ ID NO:91.
  • the N-terminal region of the TcC component extends from the N-terminus to the amino acid following the final conserved glycine in the RHS-repeat-associated core domain.
  • the variable carboxy-terminal domain of TcC proteins (upstream of the final conserved glycine in the RHS-repeat-associated core domain) is replaced by the protein of interest in the fusion protein.
  • the TcC components which can be used in the methods and compositions of the invention, are listed in the Table 4 below.
  • the N-terminal region of the TcC component is selected from a species listed in Table 4.
  • the N-terminal region of the TcC component has an amino acid sequence with at least 70% identity to a sequence selected from SEQ ID NO: 30 and 31.
  • the N-terminal region of the TcC component has an amino acid sequence selected from SEQ ID NO: 30 and 31. In a further embodiment the N-terminal region of the TcC component is from Yersinia entomophaga.
  • the N-terminal region of the TcC component has a sequence with at least 70% identity to the sequence of SEQ ID NO: 31.
  • the N-terminal region of the TcC component has the sequence of SEQ ID NO: 31.
  • N-terminal region of the TcC component is encoded by the sequence of SEQ ID NO: 101.
  • a single ORF encodes an apparent TcB-TcC fusion protein.
  • An example of this is the tcdB2 ORF of Burkholderia rhizoxinica.
  • TcB-TcC fusion proteins are intended to be encompassed by the term RHS repeat-containing proteins for use in the methods and compositions of the invention.
  • RHS-repeat containing proteins including YD-repeat containing proteins
  • Acidovorax citrulli 33 Bacillus cereus wall-associated protein 34 04
  • the RHS-repeat-containing protein is from a species selected from those listed in Table 3.
  • the RHS-repeat-containing protein has an amino acid sequence with at least 70% identity to a sequence selected from any one of SEQ ID NO: 33 to 70.
  • the RHS-repeat-containing protein has an amino acid sequence selected from any one of SEQ ID NO: 33 to 70.
  • the RHS-repeat-containing protein is encoded by a sequence with at least 70% identity to a sequence selected from any one of SEQ ID NO: 102 to 140.
  • the RHS-repeat-containing protein is encoded by a sequence selected from any one of SEQ ID NO: 102 to 140.
  • fusion protein means a head to tail fusion of two proteins.
  • the C-terminus of a first protein is covalently linked to the N-terminus of a second protein through peptide bonding.
  • the C-terminus of the N-terminal region of the RHS-repeat-containing protein is fused to the N-terminus of the protein of interest.
  • the C-terminus of the N-terminal region of the TcB component is fused to the N-terminus of the protein of interest.
  • Methods for producing fusion proteins are well known to those skilled in the art.
  • a fusion protein is produced by expression of a polynucleotide encoding the fusion protein. In that case, polynucleotide sequences encoding each portion of the fusion protein are themselves fused.
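Fusing the polynucleotide sequences, as described above, mainly requires that the two coding sequences stay in the same reading frame with no stop codon between them. The following is a minimal sketch under stated assumptions: both inputs are plain DNA coding sequences given 5' to 3', no linker is inserted, and a stop codon is appended if the second sequence lacks one. The function name and the toy sequences are hypothetical, and real construct design involves further considerations (codon usage, cloning sites) that are not covered here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_coding_sequences(n_terminal_cds: str, cargo_cds: str) -> str:
    """Join two coding sequences in frame so that they translate as one head-to-tail fusion."""
    first = n_terminal_cds.upper()
    second = cargo_cds.upper()
    for label, cds in (("first", first), ("second", second)):
        if len(cds) == 0 or len(cds) % 3 != 0:
            raise ValueError(f"{label} coding sequence is not a whole number of codons")
    if first[-3:] in STOP_CODONS:
        first = first[:-3]  # drop the stop so translation continues into the cargo
    for i in range(0, len(first), 3):
        if first[i:i + 3] in STOP_CODONS:
            raise ValueError("first coding sequence contains an internal stop codon")
    if second[-3:] not in STOP_CODONS:
        second += "TAA"  # terminate the fusion if the cargo lacks a stop codon
    return first + second


if __name__ == "__main__":
    print(fuse_coding_sequences("ATGGCTTAA", "ATGAAATGA"))
```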
  • protein of interest refers to any protein to be encapsulated in the methods or compositions of the invention.
  • the protein of interest is of a size small enough to fit within the hollow shell.
  • the protein of interest may be usefully selected from a toxin, a pharmaceutical, a biological reagent, and a bioactive polypeptide.
  • the encapsulated protein of the invention may be produced by in vitro
  • Constructs and templates for transcription and translation may be produced by standard molecular biology techniques, such as cloning, or may be synthesised by methods well-known to those skilled in the art.
  • the encapsulated protein of the invention may be produced by expression in a cell.
  • the cell used in the methods of the invention to express and produce the encapsulated protein may be any cell type.
  • the cell is a prokaryosic cell.
  • the cell is a eukaryotic cell.
  • the cell is selected from a bacterial ceil, a yeast cell, a fungal ceil, an insect cell, algal ceil, and a plant cell.
  • the cell is a bacterial cell.
  • the cell is a yeast cell.
  • the yeast cell is a S. cerevisiae cell.
  • the cell is a fungal cell.
  • the cell is an insect cell.
  • the cell is an algal cell.
  • the cell is a plant cell.
  • the cell is a non-plant cell.
  • the non-plant cell is selected from E. coli, P. pastoris, S. cerevisiae, D. salina, and C. reinhardtii.
  • the non-plant cell is selected from P. pastoris, S. cerevisiae, D. salina, and C. reinhardtii.
  • the cell is a yeast cell. In yet another embodiment, the cell is a synthetic cell.
  • the cell is a bacterial cell.
  • the cell is a bacterial cell selected from the genera: Yersinia,
  • Yersinia species include Yersinia pestis, Yersinia pseudotuberculosis, Yersinia enterocolitica, Yersinia mollaretii, Yersinia frederiksenii and Yersinia entomophaga.
  • a preferred Photorhabdus species is Photorhabdus luminescens
  • a preferred Xenorhabdus species is Xenorhabdus nematophilus
  • Serratia species include Serratia entomophila and Serratia proteamaculans.
  • the plant cells, and plants in which the encapsulated proteins are expressed may be from any plant species.
  • the plant cell or plant is from a gymnosperm plant species.
  • the plant cell or plant is from an angiosperm plant species.
  • the plant cell or plant is from a dicotyledonous plant species.
  • Preferred dicotyledonous genera include: Amygdalus, Anacardium, Anemone, Arachis, Brassica, Cajanus, Cannabis, Carthamus, Carya, Ceiba, Cicer, Claytonia, Coriandrum, Coronilla, Corydalis, Crotalaria, Cyclamen, Dentaria, Dicentra, Dolichos, Eranthis, Glycine.
  • Preferred dicotyledonous species include: Amygdalus communis, Anacardium occidentale, Anemone americana, Anemone occidentalis, Arachis hypogaea, Arachis hypogea, Brassica napus Rape, Brassica nigra, Brassica campestris, Cajanus cajan, Cajanus indicus, Cannabis sativa, Carthamus tinctorius, Carya illinoinensis, Ceiba pentandra, Cicer arietinum, Claytonia exigua, Claytonia megarhiza, Coriandrum sativum, Coronilla varia, Corydalis flavula, Corydalis sempervirens, Crotalaria juncea, Cyclamen coum, Dentaria laciniata, Dicentra eximia, Dicentra formosa, Dolichos lablab, Eranthis hyemalis,
  • Glycine max, Glycine ussuriensis, Glycine gracilis, Helianthus annuus, Lupinus angustifolius, Lupinus luteus, Lupinus mutabilis, Lespedeza sericea, Lespedeza striata, Lotus uliginosus, Lathyrus sativus, Lens culinaris, Lespedeza stipulacea, Linum usitatissimum, Lotus corniculatus, Lupinus albus, Medicago arborea, Medicago falcata, Medicago hispida, Medicago officinalis, Medicago sativa (alfalfa), Medicago tribuloides, Macadamia integrifolia, Medicago arabica, Melilotus albus, Mucuna pruriens, Olea europaea, Onobrychis viciifolia, Ornithopus sativus
  • the plant cell or plant is from a monocotyledonous plant species.
  • Preferred monocotyledonous genera include: Agropyron, Allium, Alopecurus, Andropogon, Arrhenatherum, Asparagus, Avena, Bambusa, Bellevalia, Brimeura, Brodiaea, Bulbocodium, Bothriochloa, Bouteloua, Bromus, Calamovilfa, Camassia, Cenchrus, Chionodoxa, Chloris, Colchicum, Crocus, Cymbopogon, Cynodon, Cypripedium, Dactylis, Dichanthium, Digitaria, Elaeis, Eleusine, Eragrostis, Eremurus, Erythronium, Fagopyrum, Festuca, Fritillaria, Galanthus, Helianthus, Hordeum, Hyacinthus, Hyacinthoides, Ipheion, Iris, Leucojum, Liatris, Lolium, Lycoris, Miscanthus, Mis
  • Ornithogalum, Oryza, Panicum, Paspalum, Pennisetum, Phalaris, Phleum, Poa, Puschkinia, Saccharum, Secale, Setaria, Sorghastrum, Sorghum, Thinopyrum, Triticum, Vanilla, X Triticosecale Triticale and Zea.
  • Preferred monocotyledonous species include: Agropyron cristatum, Agropyron desertorum, Agropyron elongatum, Agropyron intermedium, Agropyron smithii, Agropyron spicatum, Agropyron trachycaulum, Agropyron trichophorum, Allium ascalonicum, Allium cepa, Allium chinense, Allium porrum, Allium schoenoprasum, Allium fistulosum, Allium sativum, Alopecurus pratensis, Andropogon gerardi,
  • Hyacinthus orientalis, Hyacinthoides hispanica, Hyacinthoides non-scripta, Ipheion sessile, Iris collettii, Iris danfordiae, Iris reticulata, Leucojum aestivum, Liatris cylindracea, Liatris elegans, Lilium longiflorum, Lolium multiflorum, Lolium perenne, Lolium westerwoldicum, Lolium hybridum, Lycoris radiata, Miscanthus sinensis, Miscanthus x giganteus, Muscari armeniacum, Muscari macrocarpum, Narcissus pseudonarcissus, Ornithogalum montanum, Oryza sativa, Panicum
  • Preferred plants include crop plants, such as cotton, sorghum, maize, wheat, rice, soy and barley.
  • plant is intended to include a whole plant, any part of a plant, a seed, a fruit, propagules and progeny of a plant.
  • 'propagule' means any part of a plant that may be used in reproduction or propagation, either sexual or asexual, including seeds and cuttings.
  • the plants of the invention may be grown and either selfed or crossed with a different plant strain and the resulting progeny, comprising the polynucleotides or constructs of the invention, and/or expressing the polypeptide sequences of the invention, also form a part of the present invention.
  • the plants, plant parts, propagules and progeny comprise a polynucleotide or construct of the invention, and/or express a polypeptide sequence of the invention.
  • agriculturally acceptable carrier covers all liquid and solid carriers known in the art such as water and oils, as well as adjuvants, dispersants, binders, wettants, surfactants, humectants, tackifiers, formulation excipients, and the like that are ordinarily known for use in the preparation of control compositions, including insecticide compositions.
  • insecticidal activity means activity in at least one of: killing, slowing the growth of, preventing reproduction of, and reducing numbers of any given insect.
  • an “insect pest” is an insect that causes damage to a non-insect-resistant plant.
  • pharmaceutical composition includes the encapsulated protein of the invention or produced by the method of the invention.
  • the "pharmaceutical composition" may also include the use of formulation chemistry, including but not limited to methods described in: Pharmaceutical Formulation Development of Peptides and Proteins, Second Edition, published November 14, 2012 by CRC Press, 392 pages.
  • Editor(s): Lars Hovgaard, Novo Nordisk A/S, Malov, Denmark; Sven Frokjaer, University of Copenhagen, Denmark; Marco van de Weert, University of Copenhagen, Denmark.
  • the term "pharmaceutical" is intended to cover veterinary applications as well as human health applications.
  • Animals that may be treated for veterinary applications include agricultural animals such as cows, sheep, goats, pigs, horses, chickens, deer, as well as companion animals such as dogs, cats and rabbits.
  • polynucleotide(s), means a single or double-stranded deoxyribonucleotide or ribonucleotide polymer of any length but preferably at least 15 nucleotides, and include as non-limiting examples, coding and non-coding sequences of a gene, sense and antisense sequences complements, exons, introns, genomic DNA, cDNA, pre-mRNA, mRNA, rRNA, siRNA, miRNA, tRNA, ribozymes, recombinant polypeptides, isolated and purified naturally occurring DNA or RNA sequences, synthetic RNA and DNA sequences, nucleic acid probes, primers and fragments.
  • a "fragment" of a polynucleotide sequence provided herein is a subsequence of contiguous nucleotides.
  • primer refers to a short polynucleotide, usually having a free 3'OH group, that is hybridized to a template and used for priming polymerization of a polynucleotide complementary to the target.
  • probe refers to a short polynucleotide that is used to detect a polynucleotide sequence that is complementary to the probe, in a hybridization-based assay.
  • the probe may consist of a "fragment" of a polynucleotide as
  • polypeptide encompasses amino acid chains of any length but preferably at least 5 amino acids, including full-length proteins, in which amino acid residues are linked by covalent peptide bonds.
  • Polypeptides of the present invention, or used in the methods of the invention, may be purified natural products, or may be produced partially or wholly using recombinant or synthetic techniques.
  • a "fragment" of a polypeptide is a subsequence of the polypeptide that preferably performs a function of and/or provides three dimensional structure of the polypeptide.
  • the term may refer to a polypeptide, an aggregate of a polypeptide such as a dimer or other multimer, a fusion polypeptide, a polypeptide fragment, a polypeptide variant, or derivative thereof capable of performing the above enzymatic activity.
  • isolated as applied to the polynucleotide or polypeptide sequences disclosed herein, is used to refer to sequences that are removed from their natural cellular environment.
  • An isolated molecule may be obtained by any method or combination of methods including biochemical, recombinant, and synthetic techniques.
  • recombinant refers to a polynucleotide sequence that is removed from sequences that surround it in its natural context and/ or is recombined with sequences that are not present in its natural context.
  • a "recombinant" polypeptide sequence is produced by translation from a "recombinant” polynucleotide sequence.
  • variant refers to polynucleotide or polypeptide sequences different from the specifically identified sequences, wherein one or more nucleotides or amino acid residues is deleted, substituted, or added. Variants may be naturally occurring allelic variants, or non-naturally occurring variants. Variants may be from the same or from other species and may encompass homologues, paralogues and orthologues. In certain embodiments, variants of the inventive polynucleotides and polypeptides possess biological activities that are the same or similar to those of the inventive polynucleotides or polypeptides.
  • variant with reference to polynucleotides and polypeptides encompasses all forms of polynucleotides and polypeptides as defined herein.
  • Variant polynucleotide sequences preferably exhibit at least 50%, more preferably at least
  • more preferably at least 55%, more preferably at least 56%, more preferably at least
  • Identity is found over a comparison window of at least 20 nucleotide positions, preferably at least 50 nucleotide positions, more preferably at least 100 nucleotide positions, and most preferably over the entire length of a polynucleotide of the invention.
  • Polynucleotide sequence identity can be determined in the following manner.
  • the subject polynucleotide sequence is compared to a candidate polynucleotide sequence using BLASTN (from the BLAST suite of programs, version 2.2.5 [Nov 2002]) in bl2seq (Tatiana A. Tatusova, Thomas L. Madden (1999), "Blast 2 sequences - a new tool for comparing protein and nucleotide sequences", FEMS Microbiol Lett. 174:247-250), which is publicly available from the NCBI website on the World Wide Web at ftp://ftp.ncbi.nih.gov/blast/.
  • the default parameters of bl2seq are utilized except that filtering of low complexity parts should be turned off.
  • polynucleotide sequences may be examined using the following unix command line parameters: bl2seq -i nucleotideseq1 -j nucleotideseq2 -F F -p blastn
  • the parameter -F F turns off filtering of low complexity sections.
  • the parameter -p selects the appropriate algorithm for the pair of sequences.
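For scripted comparisons, the bl2seq invocation quoted above can be assembled programmatically. The sketch below only builds the argument list and does not run anything; the helper name `bl2seq_command` is hypothetical, the file names are the placeholders used in the text, and the `T` value used when filtering is left on is an assumption about the legacy tool's convention.

```python
import shlex


def bl2seq_command(seq1_path, seq2_path, program="blastn", filter_low_complexity=False):
    """Build the argument list for the legacy bl2seq call quoted in the text.

    -i / -j name the two sequence files, -F F switches low complexity
    filtering off, and -p selects the algorithm (blastn, blastp, tblastx).
    """
    return [
        "bl2seq",
        "-i", seq1_path,
        "-j", seq2_path,
        # "T" for filtering left on is an assumption; the text only uses "F".
        "-F", "T" if filter_low_complexity else "F",
        "-p", program,
    ]


print(shlex.join(bl2seq_command("nucleotideseq1", "nucleotideseq2")))
```

Passing the list to a process launcher, rather than a single shell string, avoids quoting problems when file paths contain spaces.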
  • Polynucleotide sequence identity may also be calculated over the entire length of the overlap between a candidate and subject polynucleotide sequences using global sequence alignment programs (e.g. Needleman, S. B. and Wunsch, C. D. (1970) J. Mol. Biol. 48, 443-453).
  • Needleman-Wunsch global alignment algorithm is found in the needle program in the EMBOSS package (Rice, P., Longden, I. and Bleasby, A. EMBOSS: The European Molecular Biology Open Software Suite, Trends in Genetics June 2000, vol 16, No 6.
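To make the idea of identity over a global alignment concrete, the following is a small, self-contained sketch of Needleman-Wunsch alignment with a percent identity calculation. It is not the EMBOSS needle program: the scoring values (match +1, mismatch -1, linear gap -2) and the choice of the alignment length as the denominator are assumptions made for illustration, and real tools use substitution matrices and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return one optimal global alignment of a and b as two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover the alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            match if a[i - 1] == b[j - 1] else mismatch
        ):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of the alignment length."""
    if not aligned_a:
        return 0.0
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


top, bottom = needleman_wunsch("ACGT", "ACT")
print(top, bottom, percent_identity(top, bottom))
```

Different programs divide by different lengths (alignment length, shorter sequence, or overlap), so a reported percent identity is only comparable when the denominator is stated.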
  • a preferred method for calculating polynucleotide % sequence identity is based on aligning sequences to be compared using Clustal X (Jeanmougin et al., 1998, Trends Biochem. Sci. 23, 403-5.)
  • Polynucleotide variants also encompass those which exhibit a similarity to one or more of the specifically identified sequences that is likely to preserve the functional equivalence of those sequences and which could not reasonably be expected to have occurred by random chance.
  • sequence similarity with respect to polypeptides may be determined using the publicly available bl2seq program from the BLAST suite of programs (version 2.2.5 [Nov 2002]) from the NCBI website on the World Wide Web at
  • the similarity of polynucleotide sequences may be examined using the following unix command line parameters: bl2seq -i nucleotideseq1 -j nucleotideseq2 -F F -p tblastx. The parameter -F F turns off filtering of low complexity sections.
  • the parameter -p selects the appropriate algorithm for the pair of sequences. This program finds regions of similarity between the sequences and for each such region reports an "E value" which is the expected number of times one could expect to see such a match by chance in a database of a fixed reference size containing random sequences. The size of this database is set by default in the bl2seq program. For small E values, much less than one, the E value is approximately the probability of such a random match.
  • Variant polynucleotide sequences preferably exhibit an E value of less than 1 x 10^-6, more preferably less than 1 x 10^-9, more preferably less than 1 x 10^-12, more preferably less than 1 x 10^-15, more preferably less than 1 x 10^-18, more preferably less than 1 x 10^-21, more preferably less than 1 x 10^-30, more preferably less than 1 x 10^-40, more preferably less than 1 x 10^-50, more preferably less than 1 x 10^-60, more preferably less than 1 x 10^-70, more preferably less than 1 x 10^-80, more preferably less than 1 x 10^-90 and most preferably less than 1 x 10^-100 when compared with any one of the specifically identified sequences.
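The statement that a small E value approximates the probability of a chance match can be checked numerically. The sketch below assumes the usual Poisson model of BLAST statistics, under which the probability of at least one chance hit is P = 1 - exp(-E); that relation is not stated in the text and is used here as an assumption, and the function name is illustrative.

```python
import math


def chance_match_probability(e_value):
    """Probability of at least one chance match, assuming Poisson-distributed hits."""
    if e_value < 0:
        raise ValueError("E value must be non-negative")
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E.
    return -math.expm1(-e_value)


for e in (1.0, 0.1, 1e-6):
    print(e, chance_match_probability(e))
```

For E = 1 the probability is about 0.63, clearly different from E, whereas for E = 1 x 10^-6 the two agree to many decimal places, which is why the approximation is restricted to E values much less than one.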
  • variant polynucleotides of the present invention hybridize to the specified polynucleotide sequences, or complements thereof under stringent conditions.
  • hybridize under stringent conditions refers to the ability of a polynucleotide molecule to hybridize to a target polynucleotide molecule (such as a target polynucleotide molecule immobilized on a DNA or RNA blot, such as a Southern blot or Northern blot) under defined conditions of temperature and salt concentration.
  • the ability to hybridize under stringent hybridization conditions can be determined by initially hybridizing under less stringent conditions then increasing the stringency to the desired stringency.
  • Tm melting temperature
  • Typical stringent conditions for a polynucleotide of greater than 100 bases in length would be hybridization conditions such as prewashing in a solution of 6X SSC, 0.2% SDS; hybridizing at 65°C, 6X SSC, 0.2% SDS overnight; followed by two washes of 30 minutes each in 1X SSC, 0.1% SDS at 65°C and two washes of 30 minutes each in 0.2X SSC, 0.1% SDS at 65°C.
  • exemplary stringent hybridization conditions are 5 to 10°C below Tm.
  • Tm of a polynucleotide molecule of length less than 100 bp is reduced by approximately (500/oligonucleotide length)°C.
  • Tm values are higher than those for DNA-DNA or DNA-RNA hybrids, and can be calculated using the formula described in Giesen et al., Nucleic Acids Res. 1998 Nov 1;26(21):5004-6.
  • Exemplary stringent hybridization conditions for a DNA-PNA hybrid having a length less than 100 bases are 5 to 10°C below the Tm.
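The two numerical rules given above for short polynucleotides (a Tm reduction of roughly 500/length degrees, and hybridization 5 to 10°C below Tm) can be combined in a short calculation. The sketch takes the Tm of the corresponding long duplex as an input, because the formula for that value is not reproduced in this section; both function names are hypothetical.

```python
def adjusted_tm(long_duplex_tm_c, length_nt):
    """Apply the approximate (500 / length) degree reduction for molecules under 100 nt."""
    if length_nt <= 0:
        raise ValueError("length must be positive")
    if length_nt < 100:
        return long_duplex_tm_c - 500.0 / length_nt
    return long_duplex_tm_c


def stringent_window(tm_c):
    """Return the (low, high) hybridization temperatures 10 and 5 degrees below Tm."""
    return (tm_c - 10.0, tm_c - 5.0)


tm = adjusted_tm(80.0, 25)      # hypothetical 25-mer with a long-duplex Tm of 80 degrees C
print(tm, stringent_window(tm))
```

The input of 80°C and the length of 25 nucleotides are arbitrary illustration values, not figures from the specification.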
  • Variant polynucleotides used in the methods of the invention also encompass polynucleotides that differ from the sequences of the invention but that, as a consequence of the degeneracy of the genetic code, encode a polypeptide having similar activity to a polypeptide encoded by a polynucleotide of the present invention.
  • a sequence alteration that does not change the amino acid sequence of the polypeptide is a "silent variation". Except for ATG (methionine) and TGG (tryptophan), other codons for the same amino acid may be changed by art recognized techniques, e.g., to optimize codon expression in a particular host organism.
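The degeneracy argument can be verified directly from the standard genetic code. The sketch below builds the standard codon table, confirms that methionine and tryptophan are the only amino acids with a single codon, and tests whether a nucleotide change is a silent variation. The helper names are illustrative, and only the standard code is considered.

```python
from collections import defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

CODONS_BY_AMINO_ACID = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    CODONS_BY_AMINO_ACID[amino_acid].append(codon)


def translate(cds):
    """Translate complete codons of a coding sequence, ignoring any trailing bases."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def is_silent_variation(original, variant):
    """True when both sequences have equal length and encode the same protein."""
    return len(original) == len(variant) and translate(original) == translate(variant)


single_codon = {aa: c[0] for aa, c in CODONS_BY_AMINO_ACID.items() if len(c) == 1}
print(single_codon)
print(is_silent_variation("ATGGCT", "ATGGCC"))
```

Every other amino acid has between two and six synonymous codons, which is the room that codon optimization for a particular host exploits.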
  • Polynucleotide sequence alterations resulting in conservative substitutions of one or several amino acids in the encoded polypeptide sequence without significantly altering its biological activity are also included in the invention.
  • a skilled artisan will be aware of methods for making phenotypically silent amino acid substitutions (see, e.g., Bowie et al., 1990, Science 247, 306).
  • Variant polynucleotides due to silent variations and conservative substitutions in the encoded polypeptide sequence may be determined using the publicly available bl2seq program from the BLAST suite of programs (version 2.2.5 [Nov 2002]) from the NCBI website on the World Wide Web at ftp://ftp.ncbi.nih.gov/blast/ via the tblastx algorithm as previously described.
  • variant polypeptide sequences preferably exhibit at least 50%, more preferably at least 51%, more preferably at least 52%, more preferably at least 53%, more preferably at least 54%, more preferably at least 55%, more preferably at least 56%, more preferably at least 57%, more preferably at least 58%, more preferably at least 59%, more preferably at least 60%, more preferably at least 61%, more preferably at least 62%, more preferably at least 63%, more preferably at least 64%, more preferably at least 65%, more preferably at least 66%, more preferably at least 67%, more preferably at least 68%, more preferably at least 69%, more preferably at least 70%, more preferably at least 71%, more preferably at least 72%, more preferably at least 73%, more preferably at least 74%, more preferably at least 75%, more preferably at
  • Polypeptide sequence identity can be determined in the following manner.
  • the subject polypeptide sequence is compared to a candidate polypeptide sequence using BLASTP (from the BLAST suite of programs, version 2.2.5 [Nov 2002]) in bl2seq, which is publicly available from the NCBI website on the World Wide Web at ftp://ftp.ncbi.nih.gov/blast/.
  • the default parameters of bl2seq are utilized except that filtering of low complexity regions should be turned off.
  • Polypeptide sequence identity may also be calculated over the entire length of the overlap between a candidate and subject polypeptide sequences using global sequence alignment programs.
  • EMBOSS-needle available at http://www.ebi.ac.uk/emboss/align/
  • GAP (Huang, X. (1994) On Global Sequence Alignment. Computer Applications in the Biosciences 10, 227-235).
  • a preferred method for calculating polypeptide % sequence identity is based on aligning sequences to be compared using Clustal X (Jeanmougin et al., 1998, Trends Biochem. Sci. 23, 403-5.)
  • Polypeptide variants used in the methods of the invention also encompass those which exhibit a similarity to one or more of the specifically identified sequences that is likely to preserve the functional equivalence of those sequences and which could not reasonably be expected to have occurred by random chance.
  • sequence similarity with respect to polypeptides may be determined using the publicly available bl2seq program from the BLAST suite of programs (version 2.2.5 [Nov 2002]) from the NCBI website on the World Wide Web at ftp://ftp.ncbi.nih.gov/blast/.
  • the similarity of polypeptide sequences may be examined using the following unix command line parameters: bl2seq -i peptideseq1 -j peptideseq2 -F F -p blastp
  • Variant polypeptide sequences preferably exhibit an E value of less than 1 x 10^-6, more preferably less than 1 x 10^-9, more preferably less than 1 x 10^-12, more preferably less than 1 x 10^-15, more preferably less than 1 x 10^-18, more preferably less than 1 x 10^-21, more preferably less than 1 x 10^-30, more preferably less than 1 x 10^-40, more preferably less than 1 x 10^-50, more preferably less than 1 x 10^-60, more preferably less than 1 x 10^-70, more preferably less than 1 x 10^-80, more preferably less than 1 x 10^-90 and most preferably 1 x 10^-100 when compared with any one of the specifically identified sequences.
  • the parameter -F F turns off filtering of low complexity sections.
  • the parameter -p selects the appropriate algorithm for the pair of sequences. This program finds regions of similarity between the sequences and for each such region reports an "E value" which is the expected number of times one could expect to see such a match by chance in a database of a fixed reference size containing random sequences. For small E values, much less than one, this is approximately the probability of such a random match.
  • the term "genetic construct” refers to a polynucleotide molecule, usually double-stranded DNA, which may have inserted into it another polynucleotide molecule (the insert polynucleotide molecule) such as, but not limited to, a cDNA molecule.
  • a genetic construct may contain the necessary elements that permit transcribing the insert polynucleotide molecule, and, optionally, translating the transcript into a polypeptide.
  • the insert polynucleotide molecule may be derived from the host cell, or may be derived from a different cell or organism and/or may be a recombinant polynucleotide. Once inside the host cell the genetic construct may become integrated in the host chromosomal DNA.
  • the genetic construct may be linked to a vector.
  • vector refers to a polynucleotide molecule, usually double stranded DNA, which is used to transport the genetic construct into a host cell.
  • the vector may be capable of replication in at least one additional host system, such as E. coli.
  • expression construct refers to a genetic construct that includes the necessary elements that permit transcribing the insert polynucleotide molecule, and, optionally, translating the transcript into a polypeptide.
  • An expression construct typically comprises in a 5' to 3' direction:
  • coding region or “open reading frame” (ORF) refers to the sense strand of a genomic DNA sequence or a cDNA sequence that is capable of producing a transcription product and/or a polypeptide under the control of appropriate regulatory sequences. The coding sequence may, in some cases, be identified by the presence of a 5' translation start codon and a 3' translation stop codon.
  • ORF open reading frame
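Identifying a coding sequence by its 5' start codon and 3' stop codon can be illustrated with a simple forward-strand scan. This is a teaching sketch under stated assumptions: it looks only for ATG starts and the three standard stop codons, ignores the reverse strand and introns, and the function name and `min_codons` threshold are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=2):
    """Return (start, end) index pairs of ATG...stop stretches on the forward strand.

    Indices are 0-based and end-exclusive; the stop codon is included in the span.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos + 3 - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


dna = "CCATGAAATAGGG"
for begin, end in find_orfs(dna):
    print(begin, end, dna[begin:end])
```

As the definition notes, start and stop codons identify a coding sequence only "in some cases"; a scan like this will also report stretches that are never expressed.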
  • “Operably-linked” means that the sequence to be expressed is placed under the control of regulatory elements that include promoters, tissue-specific regulatory elements, temporal regulatory elements, enhancers, repressors and terminators.
  • noncoding region refers to untranslated sequences that are upstream of the translational start site and downstream of the translational stop site. These sequences are also referred to respectively as the 5' UTR and the 3' UTR. These regions include elements required for transcription initiation and termination, mRNA stability, and for regulation of translation efficiency. Terminators are sequences, which terminate transcription, and are found in the 3' untranslated ends of genes downstream of the translated sequence. Terminators are important determinants of mRNA stability and in some cases have been found to have spatial regulatory functions. The term “promoter” refers to nontranscribed cis-regulatory elements upstream of the coding region that regulate gene transcription.
  • Promoters comprise cis-initiator elements which specify the transcription initiation site and conserved boxes such as the TATA box, and motifs that are bound by transcription factors. Introns within coding sequences can also regulate transcription and influence post-transcriptional processing (including splicing, capping and polyadenylation).
  • a promoter may be homologous with respect to the polynucleotide to be expressed. This means that the promoter and polynucleotide are found operably linked in nature.
  • the promoter may be heterologous with respect to the polynucleotide to be expressed. This means that the promoter and the polynucleotide are not found operably linked in nature.
  • polynucleotides/polypeptides of the invention may be advantageously expressed under the control of selected promoter sequences as described below.
  • a seed specific promoter is found in US 6,342,657; and US 7,081,565; and US 7,405,345; and US 7,642,346; and US 7,371,928.
  • a preferred seed specific promoter is the napin promoter of Brassica napus (Josefsson et al., 1987, J Biol Chem. 262(25):12196-201; Ellerstrom et al., 1996, Plant Molecular Biology, Volume 32, Issue 6, pp 1019-1027).
  • Non-photosynthetic tissue preferred promoters include those preferentially expressed in non-photosynthetic tissues/organs of the plant.
  • Non-photosynthetic tissue preferred promoters may also include light repressed promoters. Light repressed promoters
  • Photosynthetic tissue preferred promoters include those that are preferentially expressed in photosynthetic tissues of the plants.
  • Photosynthetic tissues of the plant include leaves, stems, shoots and above ground parts of the plant.
  • Photosynthetic tissue preferred promoters include light regulated promoters.
  • Light regulated promoters are known to those skilled in the art and include for example chlorophyll a/b (Cab) binding protein promoters and Rubisco Small Subunit (SSU) promoters.
  • An example of a light regulated promoter is found in US 5,750,385.
  • Light regulated in this context means light inducible or light induced.
  • transgene is a polynucleotide that is taken from one organism and introduced into a different organism by transformation.
  • the transgene may be derived from the same species or from a different species as the species of the organism into which the transgene is introduced.
  • Host cells may be derived from, for example, bacterial, fungal, yeast, insect, mammalian, algal or plant organisms. Host cells may also be synthetic cells. Preferred host cells are eukaryotic cells. A particularly preferred host cell is a plant cell, particularly a plant cell in a vegetative tissue of a plant.
  • a "transgenic plant" refers to a plant which contains new genetic material as a result of genetic manipulation or transformation.
  • the new genetic material may be derived from a plant of the same species as the resulting transgenic plant or from a different species.
  • polypeptides of the invention can be isolated by using a variety of techniques known to those of ordinary skill in the art.
  • such polypeptides can be isolated through use of the polymerase chain reaction (PCR) described in Mullis et al., Eds. 1994 The Polymerase Chain Reaction, Birkhauser, incorporated herein by reference.
  • PCR polymerase chain reaction
  • the polypeptides of the invention can be amplified using primers, as defined herein, derived from the polynucleotide sequences of the invention.
  • hybridization probes include use of all, or portions of, the polypeptides having the sequence set forth herein as hybridization probes.
  • Exemplary hybridization and wash conditions are: hybridization for 20 hours at 65°C in 5.0 X SSC, 0.5% sodium dodecyl sulfate, 1 X Denhardt's solution; washing (three washes of twenty minutes each at 55°C) in 1.
  • An optional further wash (for twenty minutes) can be conducted under conditions of 0.1 X SSC, 1% (w/v) sodium dodecyl sulfate, at 60°C.
  • polynucleotide fragments of the invention may be produced by techniques well-known in the art such as restriction endonuclease digestion, oligonucleotide synthesis and PCR amplification.
  • a partial polynucleotide sequence may be used, in methods well-known in the art, to identify the corresponding full length polynucleotide sequence. Such methods include PCR-based methods, 5'RACE (Frohman MA, 1993, Methods Enzymol. 218: 340-56) and hybridization-based methods, computer/database-based methods. Further, by way of example, inverse PCR permits acquisition of unknown sequences, flanking the polynucleotide sequences disclosed herein, starting with primers based on a known region (Triglia et al., 1998, Nucleic Acids Res 16, 8186, incorporated herein by reference).
  • the method uses several restriction enzymes to generate a suitable fragment in the known region of a gene.
  • the fragment is then circularized by intramolecular ligation and used as a PCR template.
  • Divergent primers are designed from the known region.
  • standard molecular biology approaches can be utilized (Sambrook et al., Molecular Cloning: A
  • Variants including orthologues may be identified by the methods described.
  • Variant polypeptides may be identified using PCR-based methods (Mullis et al., Eds. 1994 The Polymerase Chain Reaction, Birkhauser).
  • the polynucleotide sequence of a primer useful to amplify variants of polynucleotide molecules of the invention by PCR may be based on a sequence encoding a conserved region of the corresponding amino acid sequence.
  • Polypeptide variants may also be identified by physical methods, for example by screening expression libraries using antibodies raised against polypeptides of the invention (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987) or by identifying polypeptides from natural sources with the aid of such antibodies.
  • variant sequences of the invention may also be identified by computer-based methods well-known to those skilled in the art, using public domain sequence alignment algorithms and sequence similarity search tools to search sequence databases (public domain databases include Genbank, EMBL, Swiss-Prot, PIR and others). See, e.g., Nucleic Acids Res. 29: 1-10 and 11-16, 2001 for examples of online resources. Similarity searches retrieve and align target sequences for comparison with a sequence to be analyzed (i.e., a query sequence). Sequence comparison algorithms use scoring matrices to assign an overall score to each of the alignments.
  • An exemplary family of programs useful for identifying variants in sequence databases is the BLAST suite of programs (version 2.2.5 [Nov 2002]) including BLASTN, BLASTP, BLASTX, tBLASTN and tBLASTX, which are publicly available from (ftp://ftp.ncbi.nih.gov/blast/) or from the National Center for Biotechnology Information (NCBI), National Library of Medicine, Building 38A, Room 8N805, Bethesda, MD 20894 USA.
  • The NCBI server also provides the facility to use the programs to screen a number of publicly available sequence databases.
  • BLASTN compares a nucleotide query sequence against a nucleotide sequence database.
  • BLASTP compares an amino acid query sequence against a protein sequence database.
  • BLASTX compares a nucleotide query sequence translated in all reading frames against a protein sequence database.
  • tBLASTN compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames.
  • tBLASTX compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.
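The translated searches described above (BLASTX, tBLASTN, tBLASTX) rely on translating a nucleotide sequence in all six reading frames. A minimal sketch of that step follows; it uses the standard genetic code only, treats incomplete trailing codons as ignorable, and the function names and frame labels are choices made for this example, not BLAST output conventions.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate_frame(seq, offset):
    """Translate complete codons starting at the given offset (0, 1 or 2)."""
    end = len(seq) - (len(seq) - offset) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(offset, end, 3))


def six_frame_translations(seq):
    """Map frame labels +1..+3 and -1..-3 to their translations."""
    forward = seq.upper()
    reverse = reverse_complement(forward)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate_frame(forward, offset)
        frames[f"-{offset + 1}"] = translate_frame(reverse, offset)
    return frames


print(six_frame_translations("ATGGCC"))
```

Comparing at the protein level in this way lets a search detect similarity that synonymous nucleotide changes would hide from a nucleotide-against-nucleotide comparison.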
  • the BLAST programs may be used with default parameters or the parameters may be altered as required to refine the screen.
  • BLAST family of algorithms, including BLASTN, BLASTP, and BLASTX, is described in the publication of Altschul et al., Nucleic Acids Res. 25: 3389-3402, 1997.
  • the "hits" to one or more database sequences by a queried sequence produced by BLASTN, BLASTP, BLASTX, tBLASTN, tBLASTX, or a similar algorithm align and identify similar portions of sequences.
  • the hits are arranged in order of the degree of similarity and the length of sequence overlap. Hits to a database sequence generally represent an overlap over only a fraction of the sequence length of the queried sequence.
  • the BLASTN, BLASTP, BLASTX,, tBLASTN and tBLASTX algorithms also produce "Expect" values for alignments.
  • the Expect value (E) indicates the number of hits one can "expect” to see by chance when searching a database of the same size containing random contiguous sequences.
  • the Expect value is used as a significance threshold for determining whether the hit to a database indicates true similarity. For example, an E value of 0.1 assigned to a polynucleotide hit is interpreted as meaning that in a database of the size of the database screened, one might expect to see 0.1 matches over the aligned portion of the sequence with a similar score simply by chance.
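As a rough illustration of how an Expect value relates to a probability, the sketch below assumes that chance hits follow a Poisson distribution with mean E, which is the usual statistical model for BLAST scores but is not stated in this document. The function name is hypothetical.

```python
import math


def chance_hit_probability(e_value: float) -> float:
    """Probability of at least one chance hit, assuming chance hits are
    Poisson-distributed with mean equal to the Expect value (an assumption)."""
    if e_value < 0:
        raise ValueError("Expect values cannot be negative")
    # 1 - exp(-E), computed with expm1 for accuracy when E is small.
    return -math.expm1(-e_value)


for e in (0.1, 0.01):
    print(f"E = {e}: P(at least one chance hit) = {chance_hit_probability(e):.4f}")
```

Under this assumption the probability and the Expect value are nearly equal when E is small, which is why a low E threshold can be read approximately as a probability.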
  • For an E value of 0.01 or less, the probability of finding a match by chance in that database is 1% or less using the BLASTN, BLASTP, BLASTX, tBLASTN or tBLASTX algorithm. Multiple sequence alignments of a group of related sequences can be carried out with CLUSTALW (Thompson, J.D., Higgins, D.G. and Gibson, T.J., 1994).
  • Pattern recognition software applications are available for finding motifs or signature sequences.
  • MEME Multiple Em for Motif Elicitation
  • MAST Motif Alignment and Search Tool
  • the MAST results are provided as a series of alignments with appropriate statistical data and a visual overview of the motifs found.
  • MEME and MAST were developed at the University of California, San Diego.
  • PROSITE (Bairoch and Bucher, 1994, Nucleic Acids Res. 22, 3583; Hofmann et al., 1999, Nucleic Acids Res. 27, 215) is a method of identifying the functions of uncharacterized proteins translated from genomic or cDNA sequences.
  • the PROSITE database www.expasy.org/prosite
  • Prosearch is a tool that can search SWISS-PROT and EMBL databases with a given sequence pattern or signature.
  • polypeptides used in the methods of the invention may be prepared using peptide synthesis methods well known in the art such as direct peptide synthesis using solid phase techniques (e.g. Stewart et al., 1969, in Solid-Phase Peptide
  • Polypeptides and variant polypeptides used in the methods of the invention may also be purified from natural sources using a variety of techniques that are well known in the art (e.g. Deutscher, 1990, Ed, Methods in Enzymology, Vol. 182, Guide to Protein Purification).
  • Polypeptides and variant polypeptides used in the methods of the invention may be expressed recombinantly in suitable host cells and separated from the cells as discussed below.
  • the genetic constructs expressing the encapsulated proteins according to the invention may be useful for transforming, for example, bacterial, fungal, insect, mammalian or plant organisms.
  • the genetic constructs of the invention are intended to include expression constructs as herein defined.
  • Polynucleotides, expression cassettes, and constructs can also be conveniently synthesized in their entirety using techniques well known and/or available to those skilled in the art.
  • Host cells comprising genetic constructs, such as expression constructs, of the invention are useful in methods well known in the art (e.g. Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987; Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing, 1987) for recombinant production of encapsulated proteins.
  • Such methods may involve the culture of host cells in an appropriate medium in conditions suitable for or conducive to expression of a polypeptide of the invention.
  • The expressed recombinant polypeptide, which may optionally be secreted into the culture, may then be separated from the medium, host cells or culture medium by methods well known in the art (e.g. Deutscher, Ed, 1990, Methods in Enzymology, Vol 182, Guide to Protein Purification).
  • The invention further provides methods for producing plant cells and plants expressing the encapsulated proteins.
  • Strategies may be designed to increase expression of a polynucleotide/polypeptide in a plant cell, organ and/or at a particular developmental stage where/when it is normally expressed or to ectopically express a polynucleotide/polypeptide in a cell, tissue, organ and/or at a particular developmental stage which/when it is not normally expressed.
  • the expressed polynucleotide/polypeptide may be derived from the plant species to be transformed or may be derived from a different plant species.
  • Genetic constructs for expression of genes in transgenic plants typically include promoters for driving the expression of one or more cloned polynucleotide, terminators and selectable marker sequences to detect presence of the genetic construct in the transformed plant.
  • The promoters suitable for use in the constructs of this invention are functional in a cell, tissue or organ of a monocot or dicot plant and include cell-, tissue- and organ-specific promoters, cell cycle specific promoters, temporal promoters, inducible promoters, constitutive promoters that are active in most plant tissues, and recombinant promoters. Choice of promoter will depend upon the temporal and spatial expression desired for the cloned polynucleotide.
  • The promoters may be those normally associated with a transgene of interest, or promoters which are derived from genes of other plants, viruses, and plant pathogenic bacteria and fungi.
  • promoters that are suitable for use in modifying and modulating plant traits using genetic constructs comprising the polynucleotide sequences of the invention.
  • Constitutive plant promoters include the CaMV 35S promoter, the nopaline synthase promoter and the octopine synthase promoter, and the Ubi 1 promoter from maize. Plant promoters which are active in specific tissues, or which respond to internal developmental signals or external abiotic or biotic stresses, are described in the scientific literature. Exemplary promoters are described, e.g., in WO 02/00894 and WO2011/053169, which are herein incorporated by reference.
  • Exemplary terminators that are commonly used in plant transformation genetic constructs include, e.g., the cauliflower mosaic virus (CaMV) 35S terminator, the Agrobacterium tumefaciens nopaline synthase or octopine synthase terminators, the Zea mays zein gene terminator, the Oryza sativa ADP-glucose pyrophosphorylase terminator and the Solanum tuberosum PI-II terminator.
  • NPT II, the neomycin phosphotransferase II gene
  • aadA gene which confers spectinomycin and streptomycin resistance
  • the phosphinothricin acetyl transferase (bar) gene for Ignite (AgrEvo) and Basta (Hoechst) resistance
  • hpt hygromycin phosphotransferase gene
  • Figure 1 shows the structure of the YenB/YenC2-N complex. a, Ribbon diagram of YenB/YenC2-N.
  • YenB is on the left in light grey and YenC is on the right in dark grey.
  • b-c, Orthogonal views of the complex, with the central cavity shown as a translucent surface and the protein as grey ribbons, and with approximate interior and exterior diameters marked.
  • The position of an RGD motif is shown with a circle. d, A schematic topology diagram of the structure with α-helices shown as cylinders and β-sheets as arrows, and the domains labelled.
  • Figure 2 shows structural details of the YenC2-N auto-proteolysis site.
  • the residues immediately upstream of the cleavage point are D686, P687, D688, G689 and M690.
  • the side chains of a selection of residues conserved in the RHS- associated core domain are shown.
  • The distance (6.3 Å) between the two conserved residues that are essential for proteolysis, D686 and D663, is shown in dark grey.
  • The side chain of D663 is within hydrogen-bonding distance of the terminal carboxyl group of the cleaved peptide (distances (2.8 Å and 3.2 Å) shown in light grey).
  • Figure 3 shows RHS repeat structure. a, A section of the shell showing the pattern of RHS repeats, viewed from the inside of the central cavity. A single RHS repeat is highlighted with a grey background box. The ordered loop made by the DxxG motif is shown at the top and the conserved pattern of hydrophobic residues on the face of the β-sheet is shown in dark grey (conserved tyrosines) and light grey (other hydrophobic residues). b, The same face of the β-sheet shown as a solvent accessible surface and coloured by side-chain hydrophobicity (hydrophilic as white, hydrophobic as dark grey, protein backbone as mid grey). The stripe formed by the conserved hydrophobic residues is boxed with a dashed line.
  • Figure 4 shows the position of the YenB/YenC2-N complex in the complete Yen-Tc particle.
  • a-b, The YenB/YenC2-N dimer is shown as a ribbon diagram, fitted to EM class averages of the complete Yen-Tc toxin particle.
  • c-d, Orthogonal views of the complete Yen-Tc complex.
  • The YenB/YenC2-N dimer is shown as a ribbon diagram.
  • The associated chitinases Chi1 and Chi2 (PDB ID: 4DWS) are shown as pale grey ribbon diagrams.
  • The EM map of the YenA/Chi1/Chi2 complex, determined at a resolution of 17 Å by single particle averaging 6, is shown as a light grey surface.
  • Figure 5 shows a topology diagram of YenB/YenC2-N.
  • The YenB/YenC2-N structure is shown in schematic form with α-helices as cylinders and β-sheets as arrows. The start and end points of secondary structure elements are indicated.
  • The domains are composed as follows: β-strands 1-29, YenB SpvB domain; β-strands 30-50, YenB β-propeller domain; β-strands 51-92 (light grey), the remainder of YenB; β-strands 1-44 (dark grey), YenC2-N; β-strands 45-49 (dark grey), YenC2-N RHS hyper-conserved core domain.
  • Figure 6 shows SAXS bead models of YenB/YenC2 and YenB/YenC2-N.
  • a and c, a slice through the ab initio bead models produced from small-angle X-ray scattering of YenB/YenC2 and YenB/YenC2-N respectively.
  • The model of YenB/YenC2-N has a large internal cavity shown in dark grey, absent from the model of YenB/YenC2.
  • b and d, fit of the ab initio bead models to scattering data.
  • Figure 7 shows SAXS data for YenB/YenC2.
  • a, purity of YenB/YenC2 sample for SAXS analysis, shown by size exclusion chromatography trace and SDS-PAGE (inset).
  • b, SAXS data for YenB/YenC2 as a log-log plot.
  • Inset is a Guinier plot of the low-q region.
  • d, scattering of YenB/YenC2 compared to the theoretical scattering of the YenB/YenC2-N crystal structure, highlighting the poor fit.
  • Figure 8 shows SAXS data for YenB/YenC2-N.
  • a, purity of YenB/YenC2-N sample for SAXS analysis, shown by size exclusion chromatography trace and SDS-PAGE (inset).
  • b, SAXS data for YenB/YenC2-N as a log-log plot.
  • Inset is a Guinier plot of the low-q region. c, P(r) plot for YenB/YenC2-N with Dmax = 138 Å.
  • d, scattering of YenB/YenC2-N compared to the theoretical scattering of the YenB/YenC2-N crystal structure.
  • Figure 9 shows the effect of point mutations on YenC2 self-cleavage.
  • WT wild-type YenC2
  • M690 Three point mutations (R650A, D663N, and D686N) were found to abrogate self-cleavage.
  • Figure 10 shows profile-HMM logos of RHS and YD repeats. Profile-HMM logos of an RHS repeat (a) and a YD repeat (b) show that these two repeats have the same consensus sequence.
  • Figure 11 shows the E. coli CNF1 catalytic domain compared with YenB/YenC2-N. The catalytic domain of E. coli CNF1 (dark grey surface representation), which is homologous to YenC1-C, is shown manually placed inside the hollow shell formed by YenB/YenC2-N (cartoon diagram). This shows that the central cavity of YenB/YenC2-N is large enough to accommodate the C-terminal toxin domain of the YenC proteins.
  • Figure 16 shows: a, size exclusion chromatography trace of TcB:TcC-GFP fusion at pH 7.5; b, SDS-PAGE of fractions from the size exclusion trace in panel a (Peak 1, GFP encapsulated in the TcB/TcC shell; Peak 2, non-bound GFP); c, left: microcentrifuge tube contains protein from Peak 1 (no fluorescence) and right: microcentrifuge tube contains released GFP from Peak 2 (fluorescence under UV illumination).
  • Example 1 Elucidation of the structure of the complex formed between the TcB and TcC components of ABC toxin complexes (Tc).
  • The ABC toxin complexes (Tc) produced by some bacteria are of interest due to their potent oral insecticidal activity 1,2 and potential role in human disease 3. They are composed of at least three proteins, TcA, TcB and TcC, which must assemble together in order to be fully toxic 4.
  • the carboxy-terminal section of TcC is the main cytotoxic component 5 , and displays remarkable heterogeneity between different Tcs.
  • a general model of action has been proposed, in which the TcA component first binds to the cell surface, is endocytosed and subsequently forms a pH-triggered channel, allowing the translocation of TcC into the cytoplasm 5 , where it can cause cytoskeletal disruption in both insect and mammalian cells.
  • Tc complexes have been visualised using single particle electron microscopy, but no high-resolution structures of the components are available, and the role of TcB in the mechanism of toxicity remains unknown.
  • The structure of a complex of TcB and the conserved amino-terminal section of TcC was determined to 2.3 Å by X-ray crystallography.
  • These components assemble to form an unprecedented large hollow structure that encapsulates and sequesters the cytotoxic carboxy-terminal portion of TcC like the shell of an egg.
  • The shell is decorated on one end with a β-propeller domain, which mediates attachment of the TcB/TcC dimer to the TcA component of the complex.
  • TcC is the first known protein structure to contain RHS (rearrangement hot spot) repeats 8, and illustrates the structural architecture that is likely to be conserved across this widely distributed bacterial protein family and the related eukaryotic YD-repeat-containing protein family, which includes the teneurins.
  • The entomopathogenic bacterium Yersinia entomophaga contains a related Tc locus where the TcA component is split into two ORFs (YenA1 and YenA2), along with a single TcB gene (YenB) and two TcC genes (YenC1 and YenC2).
  • The TcC proteins of this and other Tcs are similar to the "polymorphic toxins" described by Zhang et al.
  • YenC1-C is homologous to cytotoxic necrotising factor 1 (CNF1) from Escherichia coli.
  • YenC2-C is homologous to the deaminase YwqJ from Bacillus subtilis.
  • As the function of TcB proteins in Tcs is unknown, we prepared the complex of YenB (167 kDa) with the conserved N-terminal 76 kDa portion of YenC2 (YenC2-N) by co-expression and incubation at low pH (Supplementary methods), and solved its structure using X-ray crystallography (Table S1). The structure reveals a remarkable, intimately associated heterodimer formed by YenB and YenC2-N that cooperatively fold into a large, hollow shell (Fig. 1a). An immediately striking feature is the single long β-sheet, comprising 76 β-strands derived from both proteins, that constitutes the majority of the shell structure.
  • The shell is completed by a second β-sheet formed by 14 strands contributed by YenC2-N, bringing to 90 the total number of β-strands that wrap around what is a substantial central cavity (Fig. 1b & c; Fig. 5).
  • The carboxy-terminus of YenB is in close proximity to the amino-terminus of YenC2, suggesting that the two proteins could be produced as a single polypeptide.
  • The central cavity is a solvent-accessible space approximately 42 Å wide and 87 Å long, with a total enclosed volume of approximately 59,000 Å³.
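As a quick plausibility check on these dimensions, the sketch below computes the volume of an idealised ellipsoid whose full axes are 42 Å, 42 Å and 87 Å. Treating the cavity as a smooth ellipsoid is an assumption made here for illustration only; the real cavity is irregular, so the result should be read as a rough upper estimate rather than a reproduction of the reported figure. The function name is hypothetical.

```python
import math


def ellipsoid_volume(width: float, length: float) -> float:
    """Volume of a prolate ellipsoid given its full width and full length
    (same units in, cubic units out)."""
    semi_minor = width / 2.0
    semi_major = length / 2.0
    return (4.0 / 3.0) * math.pi * semi_minor * semi_minor * semi_major


# Dimensions quoted in the text, in angstroms.
idealised = ellipsoid_volume(width=42.0, length=87.0)
reported = 59_000.0  # enclosed volume quoted in the text, cubic angstroms

print(f"idealised ellipsoid: {idealised:,.0f} cubic angstroms")
print(f"reported / idealised: {reported / idealised:.2f}")
```

The idealised value comes out at roughly 8 × 10⁴ Å³, the same order of magnitude as the reported enclosed volume, which is what one would expect if side chains and the inward-spiralling plug occupy part of the idealised space.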
  • The shell is closed at both ends: the YenB end by a β-propeller domain inserted into the loop between strands β29 and β51, and the YenC2 end by another short strip of β-sheet (strands β45-β49) that spirals inwards, forming a plug.
  • The overall shape is reminiscent of a hollow egg.
  • The carboxy-terminal end of YenC2-N (i.e. the cleavage site between the two portions of YenC2) lies inside this shell.
  • The complete YenB/YenC complex encapsulates YenC2-C within the shell of β-sheets created by YenB and YenC2-N.
  • This proposal is supported by small-angle X-ray scattering data, which are consistent with a hollow spheroid for the YenB/YenC2-N complex, but with a solid spheroid for the complete YenB/YenC2 complex (Supplementary Figs. 2-4 and Tables 8-11).
  • YenC2-C remains tightly associated with the complex after auto-proteolysis, in the absence of any covalent linkage between the proteins.
  • This also explains how generally cytotoxic proteins encoded by the C-terminal regions of TcC proteins, such as deaminases, can be safely produced without intoxication of the producing cell.
  • The toxic payload, in this case YenC2-C, remains sequestered until exposure to a change in pH triggers its release 5.
  • RHS repeats themselves are present in many polymorphic toxin complexes that are found across a diverse range of bacterial species. Until now, RHS-repeat-containing proteins have been structurally intractable, making this structure of YenC-N the first of any protein containing RHS repeats. Individual RHS repeat proteins can vary in size, and the overall sequence conservation across the family is low, but a consensus sequence for the repeat has been previously defined: GxxxRYxYDxxGRL(I/T) 15. When this is mapped onto the structure of YenC-N (Fig. 3), it is clear that each RHS repeat corresponds to a single strand-turn-strand motif, multiple copies of which make up the extended β-sheet of the shell.
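The consensus GxxxRYxYDxxGRL(I/T) can be written as a regular expression, with each x matching any residue. The sketch below is only an illustration run on a made-up toy sequence: because sequence conservation across the family is low, a strict pattern like this would be expected to miss most real repeats, and a profile-based method would be the more realistic choice. The names `RHS_CONSENSUS` and `find_rhs_like_repeats` are hypothetical.

```python
import re
from typing import List, Tuple

# Strict transcription of GxxxRYxYDxxGRL(I/T); "." stands for any residue x.
RHS_CONSENSUS = re.compile(r"G...RY.YD..GRL[IT]")


def find_rhs_like_repeats(sequence: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched text) for each exact consensus match."""
    return [
        (match.start(), match.group())
        for match in RHS_CONSENSUS.finditer(sequence.upper())
    ]


# Toy sequence invented for this example; it is not taken from YenB or YenC2.
toy = "AAA" + "GSSSRYSYDSSGRLI" + "CCC" + "GTTTRYTYDTTGRLT"
for start, text in find_rhs_like_repeats(toy):
    print(start, text)
```

Relaxing individual positions (for example allowing E, T or S in place of the aspartic acid) would follow the same approach by widening the corresponding character class.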
  • Although the initial glycine is not especially well conserved, it marks the hairpin facing the YenC2-end of the shell.
  • The central DxxG creates the hairpin facing the YenB-end, with the aspartic acid hydrogen-bonding to the backbone amides of the glycine and adjacent arginine.
  • This glycine is largely conserved, but the aspartic acid can be replaced by a glutamic acid, threonine or serine and typically the interactions formed remain the same.
  • The YxY motif places the two tyrosine sidechains inside the shell (coloured magenta in Fig. 3a) where they sit parallel to each other, and also stack with the post-hairpin arginine from an adjacent strand.
  • The conserved hydrophobic amino acids at the C-terminal end of the repeat (coloured yellow in Fig. 3a) also lie inside the shell, forming a continuous hydrophobic stripe along the face of the β-sheet composed of tyrosines and leucines/isoleucines on alternating strands (Fig. 3b).
  • the RHS structural motif is present in YenB as well as YenC2-N, albeit with less sequence conservation.
  • the YenB sequence contains more insertions and extensions within the RHS repeats than the YenC sequence, which makes identifying the RHS pattern difficult by sequence conservation alone.
  • inspection of the structure reveals many examples of DxxG turns and tyrosine or phenylalanine sidechains arranged in an equivalent fashion.
  • the translocation state of the toxin contains an open pore wide enough to allow the passage of a folded protein.
  • The overall architecture of the YenB/YenC-N shell, with its conserved RHS repeats producing an interior hydrophobic pattern of tyrosine, leucine and isoleucine residues, is reminiscent of the protein chaperone GroEL, perhaps implying that the function of TcB/TcC proteins, and of RHS and YD repeats more generally, is to encapsulate unfolded proteins.
  • The structure of the YenB/YenC-N complex presented here reveals how the cytotoxic TcC components of ABC-type Tc complexes are processed and contained, demonstrates the function of the TcB component within the Tc and provides a framework for further experiments to build a complete mechanistic model of action for this class of toxins. More broadly, it also illuminates the function of the widely distributed RHS and YD repeat families of proteins, which had until now been unknown.
  • The YenB/YenC2 protein complex was produced by co-expression in E. coli and purified using Ni-affinity and size exclusion chromatography.
  • The YenB/YenC2-N protein complex was obtained by dialysing YenB/YenC2 against acetate buffer at pH 4.5, filtration and size exclusion chromatography. Crystallisation was carried out by hanging-drop vapor diffusion with microseeding in drops containing 18% (w/v) PEG 3350, 0.15 M KH2PO4 pH 4.8.
  • X-ray diffraction data were collected to a resolution of 2.26 Å at beamline MX2 at the Australian Synchrotron, integrated using XDS and scaled and merged using Aimless (Tables 6 and 7). Phasing was accomplished by a combination of MAD and SAD using Ta6Br12-soaked and selenomethionine-substituted crystals. Structure refinement and analysis was performed using Phenix, and diagrams were produced using PyMol and Chimera.
  • YenC-C dissociates at low pH
  • YenB and either YenC1 or YenC2 were co-expressed in E. coli. YenC1 and YenC2 auto-proteolysed into two fragments as described previously (ref).
  • The behaviour of YenB + YenC1/YenC2 complexes was tested at a range of pH conditions from 4.5 to 9.5.
  • Scattering data were placed on the absolute scale by measuring the scattering of a water sample.
  • Ab initio bead models were created by running dammif 20 times, superimposing and averaging the resulting models with damaver, and using this as input for a final refinement run of dammin 33.
  • SAXS data were compared with the theoretical scattering of the YenB/YenC2-N crystal structure using crysol 34.
  • Table 6. Data collection and refinement statistics for native YenB/YenC2-N dataset.
  • Table 7 Data collection and refinement statistics for selenomethionine protein crystals.
  • Anomalous resolution defined as the point at which CC_anom drops below 0.3.
  • Rhs elements of Escherichia coli: a family of genetic composites each encoding a large mosaic protein. Mol Microbiol 12, 865-871 (1994).
  • The CCP4 suite: programs for protein crystallography. Acta Crystallogr. D 50, 760-763 (1994).
  • Example 2 Demonstrating activity by expressing toxin proteins in E. coli and feeding to insects
  • the encapsulated proteins of the invention can be expressed using commercially available non-conjugative vectors such as pET in E. coli (GATEWAY® technology, Invitrogen).
  • The transformed E. coli (as a bacterial cell in broth culture) can be used in a standard bioassay against, for example, diamondback moth (DBM) and other insects, to demonstrate insecticidal activity.
  • the kit uses Transform One Shot® Chemically Competent E. coli.
  • The pET vectors carry a bacteriophage T7 promoter, transcription and translation signals.
  • the source of T7 RNA polymerase is provided by the host cells.
  • Diamondback moth, Plutella xylostella (Lepidoptera: Plutellidae), cabbage white butterfly, Pieris rapae (Lepidoptera: Pieridae) and cabbage looper moth, Trichoplusia ni (Lepidoptera: Noctuidae). Diamondback moth larvae can be reared on brassica (cabbage plants), or the strains resistant to Cry1A and Cry1C and a susceptible (G88) strain can be obtained and tested at the New York State Agricultural Experiment Station, College of Agriculture and Life Sciences at Cornell University, located in Geneva, NY, USA. Cabbage white butterfly larvae can be field-collected.
  • Ten 2nd-3rd instar larvae can be used and placed on a 3 cm disc of cabbage leaf either treated with 20 µl of transformed E. coli solution or dipped in the solution.
  • A wetting agent, Silwet L-77 (Momentive Performance Materials, New York, USA) or Triton X-100 (Rohm and Haas Co, Philadelphia, USA), is used at ⁇ 0.05%.
  • Each treatment can be replicated 3-5 times (30-50 larvae per treatment). Treated larvae remain on the cabbage leaf at 23°C, 16hL:8hD (Lincoln) or at 27°C, 16hL:8hD (USA) and are checked daily for dead larvae.
  • Sequences for expressing the encapsulated proteins of the invention can be cloned into suitable constructs and vectors for transformation of plants as is well known by those skilled in the art, and disclosed herein.
  • The plant transformation vector, pHZBar, is derived from pART27 (Gleave 1992, Plant Mol Biol 20: 1203-1207).
  • the pnos-nptII-nos3' selection cassette has been replaced by the CaMV35S-BAR-OCS3' selection cassette with the bar gene (which confers resistance to the herbicide ammonium glufosinate) expressed from the CaMV 35S promoter.
  • Cloning of expression cassettes into this binary vector is facilitated by a unique NotI restriction site and selection of recombinants by blue/white screening for β-galactosidase.
  • The polynucleotide sequences encoding the encapsulated proteins of the invention can be cloned by standard techniques into pART7 downstream of the 35S promoter. A unique NotI fragment can then be shuttled into pART27 (Gleave, 1992, Plant Mol Biol 20: 1203-1207) for transformation of various plant species. This binary vector contains the nptII selection gene for kanamycin resistance under the control of the CaMV 35S promoter. Genetic constructs in pART27 can be transferred into Agrobacterium tumefaciens strain GV3101 or EHA105 as plasmid DNA using the freeze-thaw transformation method (Ditta et al., 1980, Proc. Natl. Acad. Sci. USA 77: 7347-7351). The structure of the constructs maintained in Agrobacterium can be confirmed by restriction digest of plasmid DNA prepared from bacterial culture.
  • Agrobacterium cultures can be prepared in glycerol and transferred to -80°C for long term storage. Genetic constructs maintained in Agrobacterium strain GV3101 can be inoculated into 25 mL of MGL broth containing spectinomycin at a concentration of 100 mg/L. Cultures can be grown overnight (16 hours) on a rotary shaker (200 rpm) at 28°C. Bacterial cultures can be harvested by centrifugation (3000 x g, 10 minutes). The supernatant is removed and the cells resuspended in 5 mL of 10 mM MgSO4.
  • Plants can be transformed to express the encapsulated proteins of the invention by numerous methods well known to those skilled in the art and disclosed herein.
  • Tobacco can be transformed via the leaf disk transformation-regeneration method (Horsch et al., 1985).
  • Leaf disks from sterile wild type W38 tobacco plants are inoculated with an Agrobacterium tumefaciens strain containing the appropriate binary vector, and cultured for 3 days.
  • The leaf disks are then transferred to MS selective medium containing 100 mg/L of kanamycin (or 5 mg/L of glufosinate) and 300 mg/L of cefotaxime.
  • Shoot regeneration occurs over a month, and the leaf explants are placed on hormone free medium containing kanamycin or glufosinate for root formation.
  • constructs described above can also be used to transform Sorghum.
  • a suitable protocol for transforming Sorghum is found in Howe et aL 2006, Plant Cell Reports, Volume 25, No
  • Cotton transformation The constructs described above can also be used to transform cotton.
  • A suitable protocol for transforming cotton is found in US 5,846,797.
  • The constructs described above can also be used to transform maize.
  • A suitable protocol for transforming maize is found in US 8,247,369.
  • Transformation protocols for other plants are well-known to those skilled in the art, and are disclosed herein.
  • ELISA analysis according to the method disclosed in U.S. Pat. No. 5,625,136 can be used for the quantitative determination of the level of the encapsulated proteins in transgenic plants, or parts thereof.
  • Various parts of the transformed plants, or whole plants can be used in standard insect bioassay procedures to test the activity of the encapsulated proteins of the invention, and the resistance of the transformed plants to various insects.
  • TcB/TcC (BC) complex in which the C-terminal region of YenC2 (TcC) has been replaced with green fluorescent protein (GFP).
  • The GFP protein is produced as a fusion of the N-terminal region of YenC2 (YenC2NTR) and GFP.
  • This fusion protein was expected to self-cleave at the boundary between these two proteins, analogous to the cleavage that occurs in the native complex.
  • This cleavage occurs with both the native GFP and GFP+ 6 variants, and YenB, the N-terminal region of YenC2 (YenC2NTR) and GFP co-purify as a protein complex and form a single peak on size exclusion.
  • Yersinia entomophaga YenB and YenC2 were cloned into the pETDuet-1 co-expression vector using standard cloning techniques. Expression was performed in E. coli Rosetta2(DE3) cells using ZYM-5052 auto-induction medium (Studier, 2005). Freshly transformed cells were grown in 5-ml LB cultures overnight and used to inoculate 500-ml ZYM-5052 cultures in 2 litre baffled flasks. These were incubated at 37°C for 4 hours, followed by 18°C for 24 hours. Cultures were harvested by centrifugation at 4,680 RCF for 30 minutes and cell pellets were either frozen at -20 °C or used immediately.
  • Cell pellets were resuspended in his-0 buffer (20 mM HEPES pH 7.5, 150 mM NaCl, 1 mM 2-mercaptoethanol) with the addition of Roche Complete mini EDTA-free protease inhibitor tablets, according to the manufacturer's directions.
  • Cells were lysed by passage through a continuous-flow cell disruptor (Microfluidics microfluidizer M-110P) at a pressure of 18 MPa. Cell lysate was clarified by centrifugation at 27,000 RCF for 30 minutes followed by filtration.
  • The protein complex was purified by IMAC using a 5-ml Talon HiTrap column.
  • This protein was subsequently applied to the same Talon HiTrap column and the flow-through collected. This was then concentrated and applied to a HiLoad 16/60 Superdex 200 size exclusion column (GE) attached to an Akta prime FPLC system.
  • His-0 buffer was pumped over the column at a rate of 1 ml/minute, and fractions were collected and analysed by SDS-PAGE.
  • Analytical size exclusion was performed using a Superdex 200 10/300 analytical column (GE). GFP fluorescence was determined by illumination with blue light and observation under a yellow filter.

Abstract

The invention provides a method for encapsulating a protein of interest, the method comprising the step of expressing a fusion protein comprising an N-terminal region of a rearrangement hot spot (RHS)-repeat-containing protein fused to the protein of interest. The invention further provides applications for the encapsulation, release and delivery of the protein of interest. The invention also encompasses the encapsulated protein of interest and compositions comprising the encapsulated protein of interest. The invention also provides uses of the encapsulated protein of interest, optionally after release from encapsulation, to control pests. The encapsulated protein of interest may for example be produced via expression in a plant to control a pest of the plant, such as an insect pest.

Description

METHODS AND MATERIALS FOR ENCAPSULATING PROTEINS
TECHNICAL FIELD
The invention relates to various applications for encapsulated proteins.
BACKGROUND
The protection, delivery and controlled release of biologically active proteins have a wide variety of applications. Such applications range from use as biological reagents, therapeutics, anti-pest products, in tissue engineering and many others.
However, while various delivery approaches have been established, developing the ability to deliver proteins according to local environmental changes, e.g., pH remains highly challenging.
In practice it may be necessary to protect the protein, for example a therapeutic, from degradation due to chemical, physical, and biological factors in certain environments, before it is introduced into the target environment. Conversely, it may be necessary to protect a particular biological environment from the protein if the protein is, for example, a targeted toxin.
There are various drawbacks with present encapsulation methods. For example, growth factors are widely used in tissue engineering applications to induce and guide blood vessel formation. However, their incorporation into protective matrices such as hydrogel scaffolds can result in a reduction in activity due to reactions such as cross-linking.
There is thus a general need for methods and materials that can protect proteins from certain environments, while simultaneously allowing them to be released and functional in other environments.
It is therefore an object of the invention to provide improved methods and materials useful for encapsulating proteins, advantageously with options for controlled release of the encapsulated protein, and/or at least to provide the public with a useful choice.
SUMMARY OF THE INVENTION
Method
In the first aspect the invention provides a method for encapsulating a protein of interest, the method comprising the step of expressing a fusion protein comprising an N-terminal region of a rearrangement hot spot (RHS)-repeat-containing protein fused to the protein of interest.
In one embodiment the fusion protein is expressed in a cell.
Cleavage of protein of interest
In a further embodiment, upon expression and folding of the fusion protein, the protein of interest is cleaved from the N-terminal region of the RHS-repeat-containing protein.
In a further embodiment cleavage is effected by the action of a protease intrinsic to the N-terminal region of the RHS-repeat-containing protein.
In a further embodiment the protease is an aspartate protease.
Encapsulation of protein of interest
In a further embodiment the protein of interest is encapsulated by a shell comprising the N-terminal region of the RHS-repeat-containing protein.
Shell
In a further embodiment the shell is hollow.
In a further embodiment the shell is formed from one long strip of β-sheet, or β-sheets, that wraps around a central cavity.
In a further embodiment the β-sheet is composed of at least 50 β-strands, preferably at least 60 β-strands, preferably at least 70 β-strands, preferably at least 80 β-strands, preferably at least 90 β-strands, derived from the N-terminal region of the RHS-repeat-containing protein.
In a further embodiment the shell includes a β-sheet formed from between 60 and 90 β-strands, preferably between 70 and 80, preferably about 76, and most preferably 76 of the β-strands.
In a further embodiment the central cavity is: between 30 and 55, preferably between 35 and 50, preferably between 40 and 44, preferably about 42, preferably 42 Å wide.
In a further embodiment the central cavity is: between 75 and 100, preferably between 80 and 95, preferably between 85 and 90, preferably about 87, preferably 87 Å long.
In a further embodiment the central cavity has a total enclosed volume of between 45,000 and 75,000, preferably between 50,000 and 70,000, preferably between 55,000 and 65,000, preferably about 59,000, preferably 59,000 Å³.
In a further embodiment the shell is closed at both ends.
In a further embodiment the carboxy end of the shell is closed by an RHS-repeat-associated core domain.
In a further embodiment the RHS-repeat-associated core domain also acts as a protease.
In a further embodiment the overall shape of the shell is reminiscent of a hollow egg.
Protein of interest (with any RHS)
In one embodiment the protein of interest is one that is normally naturally associated with the N-terminal region of the RHS-repeat-containing protein. That is, the protein of interest is the naturally occurring C-terminal region of the RHS-repeat-containing protein.
In a preferred embodiment the protein of interest is one that is not normally naturally associated with the N-terminal region of the RHS-repeat-containing protein.
Thus in a preferred embodiment the protein of interest is heterologous to the N-terminal region of the RHS-repeat-containing protein.
In a further embodiment the protein of interest is small enough to fit inside the shell.
In a further embodiment the protein of interest has a molecular weight of less than 103kDa. Preferably the molecular weight is less than 44kDa. More preferably the molecular weight is less than 36kDa.
RHS-containing protein
In one embodiment the RHS-repeat-containing protein is selected from a toxin complex C (TcC) component of a bacterial toxin complex, a non-toxin-complex RHS-repeat-containing protein, and a YD-repeat-containing protein.
RHS-repeat-containing protein is a toxin complex C (TcC) component of a bacterial toxin complex
In one embodiment the RHS-containing protein is a toxin complex C (TcC) component of a bacterial toxin complex.
In a further embodiment the fusion protein comprising the N-terminal region of the toxin complex C (TcC) component is co-expressed with a toxin complex B (TcB) component of a bacterial toxin complex. Thus in one embodiment the invention provides a method for encapsulating a protein of interest, the method comprising the step of co-expressing:
a) a toxin complex B (TcB) component of a bacterial toxin complex, and
b) a fusion protein comprising an N-terminal region of a toxin complex C (TcC) component of a bacterial toxin complex fused to the protein of interest.
In one embodiment a) and b) are expressed in a cell.
Protein of interest
In one embodiment the protein of interest is one that is normally naturally associated with the N-terminal region of a toxin complex C (TcC) component. That is, the protein of interest is the naturally associated C-terminal region of a toxin complex C (TcC) component.
In a preferred embodiment the protein of interest is one that is not normally naturally associated with the N-terminal region of the toxin complex C (TcC) component.
Thus in a preferred embodiment the protein of interest is heterologous to the N-terminal region of the toxin complex C (TcC) component.
In a further embodiment the protein of interest is small enough to fit inside the shell.
In a further embodiment the protein of interest has a molecular weight of less than 40kDa. Preferably the molecular weight is less than 35kDa. More preferably the molecular weight is less than 32kDa.
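The size limits stated above can be screened directly from a candidate amino acid sequence. The following is a minimal sketch, assuming average residue masses (standard textbook values rounded to two decimals) and the 40, 35 and 32 kDa limits given above. The function names `molecular_weight_kda` and `size_tier` are hypothetical and are not part of the specification. A mass below a limit is only a first screen; it does not by itself show that the protein fits inside the shell.

```python
# Average residue masses in daltons (amino acid minus one water molecule).
# Assumption: standard textbook values, rounded to two decimals.
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02  # one water is added back for the free N- and C-termini


def molecular_weight_kda(sequence):
    """Estimate the molecular weight (kDa) of an unmodified polypeptide."""
    seq = sequence.upper()
    # Raises KeyError for any letter outside the 20 standard amino acids.
    mass_da = sum(RESIDUE_MASS[aa] for aa in seq) + WATER
    return mass_da / 1000.0


def size_tier(sequence, limits_kda=(40.0, 35.0, 32.0)):
    """Count how many of the stated limits the protein falls under (0 to 3)."""
    mw = molecular_weight_kda(sequence)
    return sum(1 for limit in limits_kda if mw < limit)
```

A result of 3 means the estimate is under the most preferred limit; 0 means it exceeds all three.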
Cleavage of protein of interest
In a further embodiment the protein of interest is cleaved from the N-terminal region of the TcC component upon formation of a complex between the TcB and the fusion protein.
In a further embodiment cleavage is effected by the action of a protease intrinsic to the N-terminal region of the TcC component. In a further embodiment the protease is an aspartate protease.
In one embodiment the protease activity is encoded by an RHS-repeat associated core domain sequence.
In a further embodiment the RHS-repeat associated core domain sequence is as defined herein.
Encapsulation of protein of interest
In a further embodiment the protein of interest is encapsulated by a shell formed by a complex of the TcB component and the N-terminal region of the TcC.
Shell
In a further embodiment the shell is hollow.
In a further embodiment the shell is formed from one long strip of β-sheet, or β-sheets, that wraps around a central cavity.
In a further embodiment the β-sheet is composed of at least 50 β-strands, preferably at least 60 β-strands, preferably at least 70 β-strands, preferably at least 80 β-strands, preferably at least 90 β-strands, derived from the TcB component and the N-terminal region of TcC.
In a further embodiment the shell includes a β-sheet formed from between 60 and 90 β-strands, preferably between 70 and 80, preferably about 76, and most preferably 76 of the β-strands.
In a further embodiment the C-terminus of TcB and the N-terminus of the N-terminal region of TcC are in close proximity.
In a further embodiment the C-terminus of TcB and the N-terminus of the N-terminal region of TcC are within 7.5 Å of each other.
In a further embodiment the central cavity is: between 30 and 55, preferably between 35 and 50, preferably between 40 and 44, preferably about 42, preferably 42 Å wide.
In a further embodiment the central cavity is: between 75 and 100, preferably between 80 and 95, preferably between 85 and 90, preferably about 87, preferably 87 Å long.
In a further embodiment the central cavity has a total enclosed volume of between 45,000 and 75,000, preferably between 50,000 and 70,000, preferably between 55,000 and 65,000, preferably about 59,000, preferably 59,000 Å³.
In a further embodiment the shell is closed at both ends.
In a further embodiment the shell includes a β-propeller domain at the TcB end of the shell.
In a further embodiment the β-propeller domain is inserted into the loop between the 29th β-strand (β29) and the 51st β-strand (β51).
In one embodiment the β-propeller domain closes the TcB end of the shell.
In a further embodiment the shell includes an RHS-repeat-associated core domain at the TcC end of the shell.
In a further embodiment the RHS-repeat-associated core domain is a short strip of β-sheet that spirals inwards at the TcC end of the shell.
In a further embodiment the RHS-repeat-associated core domain is formed by a region extending from the 45th β-strand (β45) to the 49th β-strand (β49).
In a further embodiment the RHS-repeat-associated core domain forms a plug.
In a further embodiment the RHS-repeat-associated core domain closes the TcC end of the shell.
In a further embodiment the plug closes the TcC end of the shell.
In a further embodiment the overall shape of the shell is reminiscent of a hollow egg.
Products
In a further aspect the invention provides an encapsulated protein of interest produced by the method of the invention.
In a further aspect the invention provides a protein of interest encapsulated by a shell formed by the N-terminal region of the RHS-repeat-containing protein.
In a further aspect the invention provides a protein of interest encapsulated by a shell formed by a complex of the TcB component and the N-terminal region of the TcC component.
In a further embodiment the invention provides a cell comprising an encapsulated protein according to the invention.
Compositions
In a further embodiment the invention provides a composition comprising an encapsulated protein of the invention or produced by a method of the invention.
In one embodiment the composition is an insecticidal composition. Preferably the composition has insecticidal activity. Preferably the composition comprises an agriculturally acceptable carrier.
In one embodiment the composition is a pharmaceutical composition. Preferably the composition has pharmaceutical activity. Preferably the composition comprises a pharmaceutically acceptable carrier.
Releasable protein of interest
In a further aspect the encapsulated protein is releasable, or can be released, from the shell.
In a further aspect the encapsulated protein is releasable, can be released, or is released, from the shell in certain conditions.
In a further aspect the encapsulated protein is releasable, can be released, or is released, from the shell by lowering the pH of the environment surrounding the encapsulated protein.
In a further embodiment the encapsulated protein is releasable, can be released, or is released, from the shell by introducing the encapsulated protein into a low pH environment.
In one embodiment the encapsulated protein is releasable, can be released, or is released, when the pH is less than 5.5. Preferably the pH is less than 5.0. More preferably the pH is less than 4.5.
Release method
In a further aspect the invention provides a method of controlled release of a protein of interest, the method comprising placing an encapsulated protein of the invention, or produced by the method of the invention, into an appropriate environment.
Preferably the appropriate environment effects release of the protein of interest.
In a further aspect the invention provides a method of controlled release of a protein of interest, the method comprising placing an encapsulated protein of the invention, or produced by the method of the invention, into a low pH environment.
In one embodiment the low pH environment has a pH of less than 5.5. Preferably the pH is less than 5.0. More preferably the pH is less than 4.5.
In a further embodiment the protein of interest is released by a conformational change in the shell encapsulating the protein of interest.
In one embodiment the pH-induced conformational change is opening of the shell resulting in release of the protein of interest.
In one embodiment the conformational change involves separation of the β-propeller blades allowing extrusion of an unfolded protein of interest through the middle of the propeller.
In one embodiment the conformational change is induced by a lowering of the pH environment of the encapsulated protein.
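The pH limits stated for release (less than 5.5, preferably less than 5.0, more preferably less than 4.5) can be expressed as a simple classification. This is an illustrative sketch only; the function name `release_tier` and its labels are hypothetical, and the code encodes the stated limits rather than any measured release behaviour.

```python
def release_tier(ph, thresholds=(5.5, 5.0, 4.5)):
    """Report the strictest stated pH limit that a given pH falls below."""
    labels = (
        "not below 5.5",
        "below 5.5",
        "below 5.0 (preferred)",
        "below 4.5 (more preferred)",
    )
    # Count how many of the descending limits the pH is strictly under.
    met = sum(1 for limit in thresholds if ph < limit)
    return labels[met]
```

For example, a neutral environment falls in the first category, while a pH of 4.0 falls below all three stated limits.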
Method of delivery of a protein into a low pH environment
In a further aspect the invention provides a method of delivering a protein of interest to a low pH environment. In one embodiment the low pH environment has a pH of less than 5.5. Preferably the pH is less than 5.0. More preferably the pH is less than 4.5.
In a further embodiment the low pH environment is the endosome of a cell.
In a further embodiment the low pH environment effects release of the encapsulated protein from the shell to deliver the protein of interest into the low pH environment.
In a further embodiment the low pH environment triggers delivery of the protein of interest into the cytosol of the cell.
In a further embodiment the protein of interest is released by a pH-induced conformational change in the shell encapsulating the protein of interest.
In a further embodiment the pH-induced conformational change is opening of the shell resulting in release of the protein of interest.
Method of delivering a protein of interest into a cell
In a further aspect the invention provides a method of delivering a protein of interest into a cell, the method comprising contacting the cell with an encapsulated protein of the invention.
In one embodiment delivery requires co-expression of a TcA component of a bacterial toxin complex.
Method of controlling a pest
In a further embodiment the invention provides a method of controlling a pest, the method comprising contacting an encapsulated protein of the invention, or produced by a method of the invention, with the pest.
In one embodiment the pest is a pest of a plant. In one embodiment the pest is an insect.
In this embodiment the protein of interest is a protein that is toxic to the insect.
In a further embodiment the pest is selected from the Lepidoptera, Coleoptera, Diptera, and Orthoptera.
In a further embodiment, the pest is selected from the following list:
1. Antheraea eucalypti Scott, 1864
2. Bombyx mori Linnaeus, 1758
3. Cydia pomonella Linnaeus, 1758
4. Drosophila melanogaster Meigen, 1830
5. Galleria mellonella Linnaeus, 1758
6. Epiphyas postvittana Walker, 1863
7. Helicoverpa armigera Hübner, 1805
8. Heliothis virescens Fabricius, 1777
9. Helicoverpa zea Boddie, 1850
10. Heteronychus arator
11. Lymantria dispar Linnaeus, 1758
12. Mamestra brassicae Linnaeus, 1758
13. Manduca sexta Linnaeus, 1763
14. Pieris brassicae Linnaeus, 1758
15. Pieris rapae Linnaeus, 1758
16. Plutella xylostella Linnaeus, 1758
17. Spodoptera frugiperda Smith, 1797
18. Spodoptera litura Fabricius, 1775
19. Tribolium castaneum Herbst, 1797
20. Trichoplusia ni Hübner, 1803
In a further embodiment the pest is selected from the Odontria, Papuana, Pericoptus, Pyronota, Wiseana, and Costelytra.
In a further embodiment the pest is a nematode.
In a further embodiment the nematode is from the genus Heterorhabditis.
Method involving expression in a plant
In one embodiment the encapsulated protein is produced in the plant by expressing in the plant a fusion protein comprising an N-terminal region of a rearrangement hot spot (RHS)-repeat-containing protein fused to the protein of interest.
In a further embodiment the encapsulated protein is produced in the plant by co-expressing in the plant:
a) a toxin complex B (TcB) component of a bacterial toxin complex, and
b) a fusion protein comprising an N-terminal region of a toxin complex C (TcC) component of a bacterial toxin complex fused to the protein of interest.
In one embodiment the method requires co-expression of a TcA component of a bacterial toxin complex.
In a further embodiment the pest is contacted by the encapsulated protein produced in the plant.
In one embodiment the pest is contacted when it ingests the encapsulated protein.
Producing insect resistant plant
In a further embodiment the invention provides a method for producing an insect-resistant plant, the method comprising expressing in the plant a fusion protein comprising an N-terminal region of a rearrangement hot spot (RHS)-repeat-containing protein fused to the protein of interest.
In a further embodiment the invention provides a method for producing an insect-resistant plant, the method comprising co-expressing in the plant:
a) a toxin complex B (TcB) component of a bacterial toxin complex, and
b) a fusion protein comprising an N-terminal region of a toxin complex C (TcC) component of a bacterial toxin complex fused to the protein of interest.
In one embodiment the method requires co-expression of a TcA component of a bacterial toxin complex.
DETAILED DESCRIPTION OF THE INVENTION
The applicants have elucidated for the first time the 3-dimensional structure produced by interaction of the TcB and TcC components of bacterial toxin complexes. The applicants describe how these components assemble to form a large hollow shell that encapsulates and sequesters the cytotoxic carboxy-terminal section of TcC. Furthermore, the applicants describe how TcC auto-proteolyses when folded in complex with TcB. TcC is an example of a protein that contains RHS (rearrangement hot spot) repeats. The applicants have also provided the first 3-dimensional structure for any RHS-repeat-containing protein. The applicants' data illustrate a structural architecture that is likely to be conserved across both this widely distributed bacterial RHS-repeat-containing protein family and the eukaryotic YD-repeat-containing protein family (which they show is a sub-set of the broader class of RHS-repeat-containing proteins). In addition to indicating the function of these protein families, the applicants provide a generic mechanism for protein encapsulation and delivery. This is described in detail in Example 1.
These discoveries have allowed the applicants to invent the methods and compositions of the invention which relate to encapsulating any protein of interest by expressing fusion proteins comprising the N-terminal region of an RHS-repeat-containing protein fused to a protein of interest.
In one embodiment the method involves co-expressing the TcB component and a fusion protein consisting of an N-terminal region of the TcC component fused to the protein of interest.
There are multiple applications for the technology of the invention. Such applications form further aspects and embodiments of the invention as described herein.
In this specification where reference has been made to patent specifications, other external documents, or other sources of information, this is generally for the purpose of providing a context for discussing the features of the invention. Unless specifically stated otherwise, reference to such external documents is not to be construed as an admission that such documents, or such sources of information, in any jurisdiction, are prior art, or form part of the common general knowledge in the art.
The term "comprising" as used in this specification means "consisting at least in part of". When interpreting each statement in this specification that includes the term "comprising", features other than that or those prefaced by the term may also be present. Related terms such as "comprise" and "comprises" are to be interpreted in the same manner. In some embodiments, the term "comprising" (and related terms such as "comprise" and "comprises") can be replaced by "consisting of" (and related terms "consist" and "consists").
Definitions
Rearrangement hot spot (RHS)-repeat-containing protein
The term "rearrangement hot spot (RHS)-repeat-containing protein" or "RHS-repeat-containing protein" as used herein means a protein with an amino acid sequence that contains RHS repeats.
As used herein, the term "RHS-repeat-containing protein" includes toxin complex C (TcC) components of bacterial toxin complexes, non-toxin-complex RHS-repeat-containing proteins, and YD-repeat-containing proteins.
In one embodiment the RHS repeats conform to the profile-hidden Markov model (profile-HMM) as described in the file YD_RHS_combined_jackHMMER.hmm (Figure 14).
In a further embodiment the RHS repeats conform to the profile-HMM as described in the file RHS_repeat_pf05593.hmm (Figure 12).
In a further embodiment the RHS repeats conform to the profile-HMM as described in the file YD_repeat_TIGR01643.HMM (Figure 13). The proteins of this embodiment are of the YD repeat class of RHS-repeat-containing proteins.
Profile hidden Markov model (profile HMM)
Profile HMMs turn a multiple sequence alignment into a position-specific scoring system suitable for searching databases for remotely homologous sequences. Profile HMM analyses complement standard pairwise comparison methods for large-scale sequence analysis.
In a further embodiment the RHS repeats conform with the consensus GxxxRYxYDxxGRL(I/T) [Wang et al., 1998].
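The consensus above can be written as a regular expression to locate candidate repeats in a sequence. The sketch below is illustrative only: it assumes that "x" denotes any single residue and that "(I/T)" denotes isoleucine or threonine, and it matches the consensus strictly. Natural RHS repeats are degenerate, which is why the profile-HMMs named above are used to define them; an exact-match scan will miss most real repeats. The names `RHS_CONSENSUS` and `find_consensus_repeats` are hypothetical.

```python
import re

# GxxxRYxYDxxGRL(I/T): "." stands for any residue, [IT] for I or T.
RHS_CONSENSUS = re.compile(r"G...RY.YD..GRL[IT]")


def find_consensus_repeats(sequence):
    """Return (0-based start, matched text) for each exact consensus match."""
    seq = sequence.upper()
    return [(m.start(), m.group()) for m in RHS_CONSENSUS.finditer(seq)]
```

Applied to a synthetic sequence such as "MKGAAARYAYDAAGRLIPP", the scan reports a single 15-residue match starting at position 2.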
RHS-repeat-containing proteins typically contain 2-60 RHS repeats.
In one embodiment the RHS-repeat-containing proteins as used in the methods and products of the invention contain at least 5, preferably at least 20, more preferably at least 30 RHS repeats. In a further embodiment the RHS-repeat-containing proteins as used in the methods and products of the invention contain between 5 and 60, preferably between 15 and 50, more preferably between 35 and 45 RHS repeats.
In one embodiment the RHS-containing protein is selected from a toxin complex C (TcC) component of a bacterial toxin complex, a non-toxin-complex RHS-containing protein, and a YD-repeat-containing protein.
Non-toxin-complex RHS-repeat containing proteins
In addition to YD-repeat-containing proteins, there are a number of other non-toxin-complex RHS-repeat-containing proteins, including RhsA (McNulty et al., 2006).
YD-repeat containing proteins
There are a number of other non-toxin-complex YD-repeat-containing proteins, including teneurins (Tucker & Chiquet-Ehrismann, 2006).
RHS repeat associated-core domain
In addition to possessing RHS repeats, RHS repeat containing proteins also typically contain an "RHS repeat-associated core domain".
In one embodiment the RHS repeat containing proteins as used in the methods and products of the invention contain an RHS repeat-associated core domain.
The RHS repeat-associated core domain is highly conserved, ~76 amino acids in length, and has aspartic peptidase activity as described here. The RHS repeat-associated core domain conforms to the profile-HMM in file RHS_associated_core_domain_TIGR03696.HMM (Figure 15). Further description of the RHS repeat-associated core domain can be found at http://www<dot>ebi<dot>ac<dot>uk/interpro/entry/IPR022385.
In a further embodiment the RHS repeat-associated core domain is defined by performing an alignment between the RHS repeat-containing protein sequence and the profile-HMM of the RHS repeat-associated core domain (RHS_associated_core_domain_TIGR03696.HMM, Figure 15), using a program such as hmmalign.
Fusion of the protein of interest to the N-terminal region of the RHS repeat containing protein
The protein of interest will be joined at the C-terminal end of the RHS repeat-associated core domain. The final residues of the conserved RHS-associated core domain are generally "DxxGx", and when expressed, cleavage will occur after the residue following the glycine.
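The junction rule above can be sketched in code. The example below is a hedged illustration, not the applicants' procedure: it assumes the N-terminal region is supplied so that it ends with the "DxxGx" motif (with "x" read as any residue), joins the protein of interest directly after it, and reports the two products expected if cleavage occurs after the residue following the glycine. The function names and the short placeholder sequences in the usage note are hypothetical.

```python
import re

# The N-terminal region is assumed to end with the final residues "DxxGx".
CORE_END = re.compile(r"D..G.$")


def build_fusion(n_terminal_region, protein_of_interest):
    """Join the protein of interest to the C-terminal end of the core domain."""
    if not CORE_END.search(n_terminal_region):
        raise ValueError("N-terminal region does not end with a DxxGx motif")
    return n_terminal_region + protein_of_interest


def predicted_cleavage_products(n_terminal_region, protein_of_interest):
    """Split the fusion after the residue that follows the conserved glycine."""
    fusion = build_fusion(n_terminal_region, protein_of_interest)
    cut = len(n_terminal_region)  # the last motif residue is the "x" after G
    return fusion[:cut], fusion[cut:]
```

With a placeholder region "MSTAAADPRGL" and a placeholder cargo "MKVT", the predicted products are the region itself and the intact cargo.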
In one embodiment the protease activity, intrinsic to the RHS repeat-containing protein, is encoded by the RHS repeat-associated core domain sequence.
In one embodiment the RHS-containing protein is selected from a toxin complex C (TcC) component of a bacterial toxin complex, a non-toxin-complex RHS repeat-containing protein, and a YD-repeat-containing protein.
Bacterial toxin complex
The term "bacterial toxin complex" refers to the large, multi-subunit toxin complexes (Tc) produced by some bacteria. Bacterial toxin complexes are of interest due to their potent oral insecticidal activity. They are composed of at least three toxin complex proteins, TcA, TcB and TcC, which are considered to be required to assemble together in order to be fully toxic.
Toxin complex A (TcA) component of a bacterial toxin complex
Examples of TcA components are listed in Table 1 below. The sequences as indicated in the table are provided in the sequence listing.
Table 1. TcA component sequences
| Species | Reference | Polypeptide SEQ ID NO: | Polynucleotide SEQ ID NO: |
| --- | --- | --- | --- |
| Yersinia entomophaga | YenA1 | 1 | 71 |
| Yersinia entomophaga | YenA2 | | |
| Xenorhabdus bovienii SS-2004 | XptA2 | 3 | 73 |
| Photorhabdus luminescens | TccA | 4 | 74 |
| Burkholderia pseudomallei | | 5 | 75 |
| Burkholderia pseudomallei 406e | TcdA1 | 6 | 76 |
| Yersinia pseudotuberculosis | TcaB | 7 | 77 |
| Yersinia pestis KIM 10+ | | 8 | 78 |
| Serratia entomophila | SepA | 9 | 79 |
| Bacillus thuringiensis IBL 200 | TcaB | 10 | 80 |
Toxin complex B (TcB) component of a bacterial toxin complex
Examples of TcB components, which can be used in the methods and compositions of the invention, are listed in the Table 2 below. The sequences, as indicated in the table, are provided in the sequence listing.
Table 2. TcB component sequences
| Species | Reference | Polypeptide SEQ ID NO: | Polynucleotide SEQ ID NO: |
| --- | --- | --- | --- |
| Yersinia entomophaga | YenB | 11 | 81 |
| Xenorhabdus bovienii SS-2004 | TcaC | 12 | 82 |
| Photorhabdus luminescens | TcaC | 13 | 83 |
| Burkholderia pseudomallei | TcdB1 | 14 | 84 |
| Burkholderia pseudomallei | TcdB2 | 15 | 85 |
| Yersinia pseudotuberculosis | | 16 | 86 |
| Yersinia pestis KIM10+ | TcaC1 | | 87 |
| Serratia entomophila | SepB | 18 | 88 |
| Bacillus thuringiensis IBL 200 | TcaC | 19 | 89 |
In one embodiment the TcB component is from a species selected from those listed in Table 2.
In a further embodiment the TcB component has an amino acid sequence with at least 70% identity to a sequence selected from any one of SEQ ID NO: 11 to SEQ ID NO: 19.
In a further embodiment the TcB component has an amino acid sequence selected from any one of SEQ ID NO:11 to SEQ ID NO:19.
In a further embodiment the TcB component is encoded by a sequence with at least 70% identity to a sequence selected from any one of SEQ ID NO:81 to SEQ ID NO:89. In a further embodiment the TcB component is encoded by a sequence selected from any one of SEQ ID NO:81 to SEQ ID NO:89.
In a further embodiment the TcB component is from Yersinia entomophaga.
In a further embodiment the TcB component has a sequence with at least 70% sequence identity to SEQ ID NO:11.
In a further embodiment the TcB component has the sequence of SEQ ID NO:11.
In a further embodiment the TcB component is encoded by a sequence with at least 70% sequence identity to SEQ ID NO:81.
In a further embodiment the TcB component is encoded by the sequence of SEQ ID NO:81.
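The "at least 70% identity" criterion used above can be illustrated with a short calculation over two sequences that have already been aligned. This is a sketch under stated assumptions: the sequences are supplied pre-aligned with "-" for gaps, and identity is taken as identical residue pairs divided by the number of alignment columns. Other denominators (for example, the shorter sequence length) are also in common use and give different values. The function names are hypothetical, and the specification's own definition of identity should govern.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as identical only if both hold the same residue (no gap).
    identical = sum(
        1 for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * identical / len(aligned_a)


def meets_identity_threshold(aligned_a, aligned_b, threshold=70.0):
    """True if the aligned pair reaches the stated minimum identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Under this definition, two ten-residue sequences differing at one position are 90% identical and satisfy the 70% criterion.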
Toxin complex C (TcC) component of a bacterial toxin complex
Examples of TcC components, which can be used in the methods and compositions of the invention, are listed in Table 3 below. The sequences, as indicated in the table, are provided in the sequence listing.
Table 3. TcC component sequences
| Species | Reference | Polypeptide SEQ ID NO: | Polynucleotide SEQ ID NO: |
| --- | --- | --- | --- |
| Yersinia entomophaga | YenC1 | 20 | 90 |
| Yersinia entomophaga | YenC2 | 21 | 91 |
| Xenorhabdus bovienii SS-2004 | TccC | 22 | 92 |
| Photorhabdus luminescens | TccC | 23 | 93 |
| Burkholderia pseudomallei | TccC4 | 24 | 94 |
| Yersinia pseudotuberculosis | | | |
| Yersinia pestis KIM 10+ | | 26 | 96 |
| Serratia entomophila | SepC | 27 | 97 |
| Bacillus thuringiensis IBL 200 | TccC | 28 | 98 |
| Bacillus thuringiensis IBL 200 | TccC | 29 | 99 |
In one embodiment the TcC component is from a species selected from those listed in Table 3.
In a further embodiment the TcC component has an amino acid sequence with at least 70% identity to a sequence selected from any one of SEQ ID NO:20 to 29.
In a further embodiment the TcC component has an amino acid sequence selected from any one of SEQ ID NO:20 to 29.
In a further embodiment the TcC component is encoded by a sequence with at least 70% identity to a sequence selected from any one of SEQ ID NO:90 to 99.
In a further embodiment the TcC component is encoded by a sequence selected from any one of SEQ ID NO:90 to 99.
In a further embodiment the TcC component is from Yersinia entomophaga.
In a further embodiment the TcC component has a sequence with at least 70% sequence identity to SEQ ID NO:21.
In a further embodiment the TcC component has the sequence of SEQ ID NO:21.
In a further embodiment the TcC component is encoded by a sequence with at least 70% sequence identity to SEQ ID NO:91.
In a further embodiment the TcC component is encoded by the sequence of SEQ ID NO:91.
N-terminal region of a toxin complex C (TcC) component of a bacterial toxin complex
In a further embodiment the N-terminal region of the TcC component extends from the N- terminus to the amino acid following the final conserved glycine in the RHS-repeat-associated core domain.
In a further embodiment the variable carboxy-terminal domain of TcC proteins (upstream of the final conserved glycine in the RHS-repeat-associated core domain) is replaced by the protein of interest in the fusion protein.
Examples of the TcC components, which can be used in the methods and compositions of the invention, are listed in Table 4 below. The sequences, as indicated in the table, are provided in the sequence listing.
Table 4. N-terminal region of TcC component sequences
| Species | Reference | Polypeptide SEQ ID NO: | Polynucleotide SEQ ID NO: |
| --- | --- | --- | --- |
| Yersinia entomophaga | YenC1 - N term | 30 | 100 |
| Yersinia entomophaga | YenC2 - N term | 31 | 101 |
In a further embodiment the N-terminal region of the TcC component is selected from a species listed in Table 4.
In a further embodiment the N-terminal region of the TcC component has an amino acid sequence with at least 70% identity to a sequence selected from SEQ ID NO: 30 and 31.
In a further embodiment the N-terminal region of the TcC component has an amino acid sequence selected from SEQ ID NO: 30 and 31.
In a further embodiment the N-terminal region of the TcC component is from Yersinia entomophaga.
In a further embodiment the N-terminal region of the TcC component has a sequence with at least 70% identity to the sequence of SEQ ID NO: 31.
In a further embodiment the N-terminal region of the TcC component has the sequence of SEQ ID NO: 31.
In a further embodiment the N-terminal region of the TcC component is encoded by a sequence with at least 70% identity to the sequence of SEQ ID NO: 101.
In a further embodiment the N-terminal region of the TcC component is encoded by the sequence of SEQ ID NO: 101.
Naturally occurring TcB-TcC fusion proteins
In some cases, rather than TcB and TcC being encoded separately, a single ORF encodes an apparent TcB-TcC fusion protein. An example of this is the tcdB2 ORF of Burkholderia rhizoxinica.
Such naturally occurring TcB-TcC fusion proteins are intended to be encompassed by the term RHS repeat-containing proteins for use in the methods and compositions of the invention.
Other RHS-repeat-containing proteins including YD-repeat-containing proteins
Table 5. RHS-repeat-containing proteins (including YD-repeat-containing proteins).
| Species | Reference | Polypeptide SEQ ID NO: | Polynucleotide SEQ ID NO: |
| --- | --- | --- | --- |
| Acidovorax avenae | | 32 | 102 |
| Acidovorax citrulli | | 33 | 103 |
| Bacillus cereus | wall-associated protein | 34 | 104 |
| Bacteroides thetaiotaomicron | cell wall-associated protein | 35 | 105 |
| Caenorhabditis elegans | Teneurin-1 | 36 | 106 |
| Cellvibrio japonicus | RHS Repeat-containing protein | | 107 |
| Chloroflexus sp. Y-400-fl | YD repeat-containing protein | 38 | 108 |
| Corynebacterium matruchotii | RHS Repeat-containing protein | 39 | 109 |
| Desulfatibacillum alkenivorans | Dalk_1 87 | 40 | 110 |
| Dickeya zeae | YD repeat-containing protein | 41 | 111 |
| Escherichia coli | Rhs core protein with extension | 42 | 112 |
| Escherichia coli | core protein | 43 | 113 |
| Escherichia coli | rhsA | 44 | 114 |
| Frankia symbiont of Datisca glomerata | RHS Repeat-containing protein | 45 | 115 |
| Frankia sp. EuI1c | FraEuI1c_2129 | 46 | 116 |
| Geobacter metallireducens | RHS repeat protein | 47 | 117 |
| Kitasatospora setae | KSE_13070 | 48 | 118 |
| Methanosarcina acetivorans | MA2045 | 49 | 119 |
| Microbacterium testaceum | Rhs family protein | 50 | 120 |
| Pantoea ananatis | RhsD | 51 | 121 |
| Parachlamydia acanthamoebae | pah_c013o039 | 52 | 122 |
| bacterium Ellin514 | YD repeat-containing protein | 53 | 123 |
| Pelobacter propionicus | YD repeat-containing protein | 54 | 124 |
| Prevotella denticola | RHS repeat protein | 55 | 125 |
| Prevotella denticola F0289 | RHS repeat protein | 56 | 126 |
| Prevotella pallens | HMPREF9144_1 55 | 57 | 127 |
| Prevotella pallens | HMPREF9144_2442 | 58 | 128 |
| Prevotella salivae | conserved hypothetical protein | 59 | 129 |
| Prevotella sp. | conserved hypothetical protein | 60 | 130 |
| Salmonella enterica | RHS Repeat family protein | 61 | 131 |
| Salmonella enterica | RhsG | 62 | 132 |
| Sorangium cellulosum | RhsA | 63 | 133 |
| Sorangium cellulosum | Rhs family protein | 64 | 134 |
| Streptomyces bingchenggensis | Rhs protein | 65 | 135 |
| Streptomyces clavuligerus | YD repeat-containing protein | 66 | 136 |
| Streptomyces griseoflavus | conserved hypothetical protein | 67 | 137 |
| Streptomyces violaceusniger | RHS repeat protein | 68 | 138 |
| Verrucomicrobium spinosum | YD repeat protein | 69 | 139 |
| Homo sapiens | teneurin-1 isoform 1 | 70 | 140 |
In one embodiment the RHS-repeat-containing protein is from a species selected from those listed in Table 3.
In a further embodiment the RHS-repeat-containing protein has an amino acid sequence with at least 70% identity to a sequence selected from any one of SEQ ID NO: 33 to 70.
In a further embodiment the RHS-repeat-containing protein has an amino acid sequence selected from any one of SEQ ID NO: 33 to 70.
In a further embodiment the RHS-repeat-containing protein is encoded by a sequence with at least 70% identity to a sequence selected from any one of SEQ ID NO: 102 to 140.
In a further embodiment the RHS-repeat-containing protein is encoded by a sequence selected from any one of SEQ ID NO: 102 to 140.
Fusion protein
The term "fusion protein" means a head to tail fusion of two proteins. In a fusion protein, the C-terminus of a first protein is covalently linked to the N-terminus of a second protein through peptide bonding.
In a further embodiment of the fusion proteins according to the invention, the C-terminus of the N-terminal region of the RHS-repeat-containing protein is fused to the N-terminus of the protein of interest.
In a further embodiment of the fusion proteins according to the invention, the C-terminus of the N-terminal region of the TcB component is fused to the N-terminus of the protein of interest.
Methods for producing fusion proteins are well known to those skilled in the art. Typically, a fusion protein is produced by expression of a polynucleotide encoding the fusion protein. In that case, polynucleotide sequences encoding each portion of the fusion protein are themselves fused.
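By way of illustration only, the fusing of polynucleotide sequences encoding each portion of a fusion protein can be sketched as a simple in-frame concatenation. The function name and the toy sequences below are hypothetical and are not taken from the sequence listing; the sketch assumes that both inputs are complete open reading frames and that any stop codon on the upstream portion is to be removed so that translation continues into the downstream portion.

```python
# Minimal sketch (assumption: both inputs are in-frame coding sequences).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_coding_sequences(first_cds: str, second_cds: str) -> str:
    """Join two coding sequences head to tail so they encode one fusion protein."""
    first = first_cds.upper()
    second = second_cds.upper()
    for name, seq in (("first", first), ("second", second)):
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} sequence length is not a multiple of 3")
    # Drop the stop codon of the upstream portion, if present, so that the
    # C-terminus of the first protein is linked to the N-terminus of the second.
    if first[-3:] in STOP_CODONS:
        first = first[:-3]
    return first + second


# Toy example with made-up sequences.
fused = fuse_coding_sequences("ATGGCTTAA", "ATGAAATGA")
print(fused)  # ATGGCTATGAAATGA
```

In practice a linker sequence or removal of the downstream start codon may also be desired; those choices are outside this sketch.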
Protein of interest
The term "protein of interest" refers to any protein to be encapsulated in the methods or compositions of the invention.
In a preferred embodiment the protein of interest is of a size small enough to fit within the hollow shell.
The protein of interest may be usefully selected from a toxin, a pharmaceutical, a biological reagent, and a bioactive polypeptide.
Expression
The encapsulated protein of the invention may be produced by in vitro transcription/translation by methods well-known to those skilled in the art.
Constructs and templates for transcription and translation may be produced by standard molecular biology techniques, such as cloning, or may be synthesised by methods well-known to those skilled in the art.
Alternatively the encapsulated protein of the invention may be produced by expression in a cell.
Cells
The cell used in the methods of the invention to express and produce the encapsulated protein may be any cell type. In one embodiment the cell is a prokaryotic cell. In a further embodiment the cell is a eukaryotic cell. In one embodiment the cell is selected from a bacterial cell, a yeast cell, a fungal cell, an insect cell, an algal cell, and a plant cell. In one embodiment the cell is a bacterial cell. In a further embodiment the cell is a yeast cell. In one embodiment the yeast cell is a S. cerevisiae cell. In a further embodiment the cell is a fungal cell. In a further embodiment the cell is an insect cell. In a further embodiment the cell is an algal cell. In a further embodiment the cell is a plant cell.
In one embodiment the cell is a non-plant cell. In one embodiment the non-plant cell is selected from E. coli, P. pastoris, S. cerevisiae, D. salina and C. reinhardtii. In a further embodiment the non-plant cell is selected from P. pastoris, S. cerevisiae, D. salina and C. reinhardtii.
In yet another embodiment, the cell is a yeast cell. In yet another embodiment, the cell is a synthetic cell.
In a preferred embodiment the cell is a bacterial cell.
In a further embodiment the cell is a bacterial cell selected from the genera: Yersinia, Photorhabdus, Xenorhabdus, Serratia and Rhizobium.
Preferred Yersinia species include Yersinia pestis, Yersinia pseudotuberculosis, Yersinia enterocolitica, Yersinia mollaretii, Yersinia frederiksenii and Yersinia entomophaga.
A preferred Photorhabdus species is Photorhabdus luminescens.
A preferred Xenorhabdus species is Xenorhabdus nematophilus.
Preferred Serratia species include Serratia entomophila and Serratia proteamaculans.
A preferred [Figure imgf000030_0001] pseudomonmas.
Plants
The plant cells and plants in which the encapsulated proteins are expressed may be from any plant species.
In one embodiment the plant cell or plant is from a gymnosperm plant species.
In a further embodiment the plant cell or plant is from an angiosperm plant species.
In a further embodiment the plant cell or plant is from a dicotyledonous plant species.
Preferred dicotyledonous genera include: Amygdalus, Anacardium, Anemone, Arachis, Brassica, Cajanus, Cannabis, Carthamus, Carya, Ceiba, Cicer, Claytonia, Coriandrum, Coronilla, Corydalis, Crotalaria, Cyclamen, Dentaria, Dicentra, Dolichos, Eranthis, Glycine, Gossypium, Helianthus, Lathyrus, Lens, Lespedeza, Linum, Lotus, Lupinus, Macadamia, Medicago, Melilotus, Mucuna, Olea, Onobrychis, Ornithopus, Oxalis, Papaver, Phaseolus, Phoenix, Pistacia, Pisum, Prunus, Pueraria, Ribes, Ricinus, Sesamum, Thalictrum, Theobroma, Trifolium, Trigonella, Vicia and Vigna.
Preferred dicotyledonous species include: Amygdalus communis, Anacardium occidentale, Anemone americana, Anemone occidentalis, Arachis hypogaea, Arachis hypogea, Brassica napus Rape, Brassica nigra, Brassica campestris, Cajanus cajan, Cajanus indicus, Cannabis sativa, Carthamus tinctorius, Carya illinoinensis, Ceiba pentandra, Cicer arietinum, Claytonia exigua, Claytonia megarhiza, Coriandrum sativum, Coronilla varia, Corydalis flavula, Corydalis sempervirens, Crotalaria juncea, Cyclamen coum, Dentaria laciniata, Dicentra eximia, Dicentra formosa, Dolichos lablab, Eranthis hyemalis, Gossypium arboreum, Gossypium nanking, Gossypium barbadense, Gossypium herbaceum, Gossypium hirsutum, Glycine max, Glycine ussuriensis, Glycine gracilis, Helianthus annuus, Lupinus angustifolius, Lupinus luteus, Lupinus mutabilis, Lespedeza sericea, Lespedeza striata, Lotus uliginosus, Lathyrus sativus, Lens culinaris, Lespedeza stipulacea, Linum usitatissimum, Lotus corniculatus, Lupinus albus, Medicago arborea, Medicago falcata, Medicago hispida, Medicago officinalis, Medicago sativa (alfalfa), Medicago tribuloides, Macadamia integrifolia, Medicago arabica, Melilotus albus, Mucuna pruriens, Olea europaea, Onobrychis viciifolia, Ornithopus sativus, Oxalis tuberosa, Phaseolus aureus, Prunus cerasifera, Prunus cerasus, Phaseolus coccineus, Prunus domestica, Phaseolus lunatus, Prunus mahaleb, Phaseolus mungo, Prunus persica, Prunus pseudocerasus, Phaseolus vulgaris, Papaver somniferum, Phaseolus acutifolius, Phoenix dactylifera, Pistacia vera, Pisum sativum, Prunus amygdalus, Prunus armeniaca, Pueraria thunbergiana, Ribes nigrum, Ribes rubrum, Ribes grossularia, Ricinus communis, Sesamum indicum, Thalictrum dioicum, Thalictrum flavum, Thalictrum thalictroides, Theobroma cacao, Trifolium augustifolium, Trifolium diffusum, Trifolium hybridum, Trifolium incarnatum, Trifolium nigrescens, Trifolium pratense, Trifolium repens, Trifolium resupinatum, Trifolium subterraneum, Trifolium alexandrinum, Trigonella foenumgraecum, Vicia angustifolia, Vicia atropurpurea, Vicia calcarata, Vicia dasycarpa, Vicia ervilia, Vaccinium oxycoccos, Vicia pannonica, Vigna sesquipedalis, Vigna sinensis, Vicia villosa, Vicia faba, Vicia sativa and Vigna angularis.
In a further embodiment the plant cell or plant is from a monocotyledonous plant species.
Preferred monocotyledonous genera include: Agropyron, Allium, Alopecurus, Andropogon, Arrhenatherum, Asparagus, Avena, Bambusa, Bellevalia, Brimeura, Brodiaea, Bulbocodium, Bothriochloa, Bouteloua, Bromus, Calamovilfa, Camassia, Cenchrus, Chionodoxa, Chloris, Colchicum, Crocus, Cymbopogon, Cynodon, Cypripedium, Dactylis, Dichanthium, Digitaria, Elaeis, Eleusine, Eragrostis, Eremurus, Erythronium, Fagopyrum, Festuca, Fritillaria, Galanthus, Helianthus, Hordeum, Hyacinthus, Hyacinthoides, Ipheion, Iris, Leucojum, Liatris, Lolium, Lycoris, Miscanthus, Miscanthus x giganteus, Muscari, Ornithogalum, Oryza, Panicum, Paspalum, Pennisetum, Phalaris, Phleum, Poa, Puschkinia, Saccharum, Secale, Setaria, Sorghastrum, Sorghum, Thinopyrum, Triticum, Vanilla, X Triticosecale Triticale and Zea.
Preferred monocotyledonous species include: Agropyron cristatum, Agropyron desertorum, Agropyron elongatum, Agropyron intermedium, Agropyron smithii, Agropyron spicatum, Agropyron trachycaulum, Agropyron trichophorum, Allium ascalonicum, Allium cepa, Allium chinense, Allium porrum, Allium schoenoprasum, Allium fistulosum, Allium sativum, Alopecurus pratensis, Andropogon gerardi, Andropogon gerardii, Andropogon scoparius, Arrhenatherum elatius, Asparagus officinalis, Avena nuda, Avena sativa, Bambusa vulgaris, Bellevalia trifoliata, Brimeura amethystina, Brodiaea californica, Brodiaea coronaria, Brodiaea elegans, Bulbocodium versicolor, Bothriochloa barbinodis, Bothriochloa ischaemum, Bothriochloa saccharoides, Bouteloua curtipendula, Bouteloua eriopoda, Bouteloua gracilis, Bromus erectus, Bromus inermis, Bromus riparius, Calamovilfa longifolia, Camassia scilloides, Cenchrus ciliaris, Chionodoxa forbesii, Chloris gayana, Colchicum autumnale, Crocus sativus, Cymbopogon nardus, Cynodon dactylon, Cypripedium acaule, Dactylis glomerata, Dichanthium annulatum, Dichanthium aristatum, Dichanthium sericeum, Digitaria decumbens, Digitaria smutsii, Elaeis guineensis, Elaeis oleifera, Eleusine coracana, Elymus angustus, Elymus junceus, Eragrostis curvula, Eragrostis tef, Eremurus robustus, Erythronium elegans, Erythronium helenae, Fagopyrum esculentum, Fagopyrum tataricum, Festuca arundinacea, Festuca ovina, Festuca pratensis, Festuca rubra, Fritillaria cirrhosa, Galanthus nivalis, Helianthus annuus sunflower, Hordeum distichum, Hordeum vulgare, Hyacinthus orientalis, Hyacinthoides hispanica, Hyacinthoides non-scripta, Ipheion sessile, Iris collettii, Iris danfordiae, Iris reticulata, Leucojum aestivum, Liatris cylindracea, Liatris elegans, Lilium longiflorum, Lolium multiflorum, Lolium perenne, Lolium westerwoldicum, [Figure imgf000033_0001] hybridum, Lycoris radiata, Miscanthus sinensis, Miscanthus x giganteus, Muscari armeniacum, Muscari macrocarpum, Narcissus pseudonarcissus, Ornithogalum montanum, Oryza sativa, Panicum italicium, Panicum maximum, Panicum miliaceum, Panicum purpurascens, Panicum virgatum, Paspalum dilatatum, Paspalum notatum, Pennisetum clandestinum, Pennisetum glaucum, Pennisetum purpureum, Pennisetum spicatum, Phalaris arundinacea, Phleum bertolinii, Phleum pratense, Poa fendleriana, Poa pratensis, Poa nemoralis, Puschkinia scilloides, Saccharum officinarum, Saccharum robustum, Saccharum sinense, Saccharum spontaneum, Scilla autumnalis, Scilla peruviana, Secale cereale, Setaria italica, Setaria sphacelata, Sorghastrum nutans, Sorghum bicolor, Sorghum dochna, Sorghum halepense, Sorghum sudanense, Thinopyrum ponticum, Trillium grandiflorum, Triticum aestivum, Triticum dicoccum, Triticum durum, Triticum monococcum, Tulipa batalinii, Tulipa clusiana, Tulipa dasystemon, Tulipa gesneriana, Tulipa greigii, Tulipa kaufmanniana, Tulipa sylvestris, Tulipa turkestanica, Vanilla fragrans, X Triticosecale and Zea mays.
Preferred plants include crop plants, such as cotton, sorghum, maize, wheat, rice, soy and barley.
Plant parts, propagules and progeny
The term "plant" is intended to include a whole plant, any part of a plant, a seed, a fruit, propagules and progeny of a plant.
The term "propagule" means any part of a plant that may be used in reproduction or propagation, either sexual or asexual, including seeds and cuttings.
The plants of the invention may be grown and either selfed or crossed with a different plant strain, and the resulting progeny, comprising the polynucleotides or constructs of the invention, and/or expressing the polypeptide sequences of the invention, also form a part of the present invention.
Preferably the plants, plant parts, propagules and progeny comprise a polynucleotide or construct of the invention, and/or express a polypeptide sequence of the invention.
The term "agriculturally acceptable carrier" covers all liquid and solid carriers known in the art such as water and oils, as well as adjuvants, dispersants, binders, wettants, surfactants, humectants, tackifiers, formulation excipients, and the like that are ordinarily known for use in the preparation of control compositions, including insecticide compositions.
The phrase "insecticidal activity" means activity in at least one of: killing, slowing the growth of, preventing reproduction of, and reducing numbers of any given insect.
An "insect pest" is an insect that causes damage to a non-insect resistant plant.
The "pharmaceutical composition" includes the encapsulated protein of the invention or produced by the method of the invention.
The "pharmaceutical composition" may also include the use of formulation chemistry, including but not limited to methods described in: Pharmaceutical Formulation Development of Peptides and Proteins, Second Edition, published November 14, 2012 by CRC Press, 392 pages. Editor(s): Lars Hovgaard, Novo Nordisk A/S, Malov, Denmark; Sven Frokjaer, University of Copenhagen, Denmark; Marco van de Weert, University of Copenhagen, Denmark.
The term "pharmaceutical" is intended to cover veterinary applications as well as human health applications. Animals that may be treated for veterinary applications include agricultural animals such as cows, sheep, goats, pigs, horses, chickens and deer, as well as companion animals such as dogs, cats and rabbits.
Polynucleotides and fragments
The term "polynucleotide(s)," as used herein, means a single or double-stranded deoxyribonucleotide or ribonucleotide polymer of any length but preferably at least 15 nucleotides, and includes as non-limiting examples, coding and non-coding sequences of a gene, sense and antisense sequences, complements, exons, introns, genomic DNA, cDNA, pre-mRNA, mRNA, rRNA, siRNA, miRNA, tRNA, ribozymes, recombinant polypeptides, isolated and purified naturally occurring DNA or RNA sequences, synthetic RNA and DNA sequences, nucleic acid probes, primers and fragments. A "fragment" of a polynucleotide sequence provided herein is a subsequence of contiguous nucleotides.
The term "primer" refers to a short polynucleotide, usually having a free 3'OH group, that is hybridized to a template and used for priming polymerization of a polynucleotide complementary to the target.
The term "probe" refers to a short polynucleotide that is used to detect a polynucleotide sequence that is complementary to the probe, in a hybridization-based assay. The probe may consist of a "fragment" of a polynucleotide as defined herein.
Polypeptides and fragments
The term "polypeptide", as used herein, encompasses amino acid chains of any length but preferably at least 5 amino acids, including full-length proteins, in which amino acid residues are linked by covalent peptide bonds. Polypeptides of the present invention, or used in the methods of the invention, may be purified natural products, or may be produced partially or wholly using recombinant or synthetic techniques.
A "fragment" of a polypeptide is a subsequence of the polypeptide that preferably performs a function of and/or provides three dimensional structure of the polypeptide. The term may refer to a polypeptide, an aggregate of a polypeptide such as a dimer or other multimer, a fusion polypeptide, a polypeptide fragment, a polypeptide variant, or derivative thereof capable of performing the above enzymatic activity.
The term "isolated" as applied to the polynucleotide or polypeptide sequences disclosed herein, is used to refer to sequences that are removed from their natural cellular environment. An isolated molecule may be obtained by any method or combination of methods including biochemical, recombinant, and synthetic techniques.
The term "recombinant" refers to a polynucleotide sequence that is removed from sequences that surround it in its natural context and/or is recombined with sequences that are not present in its natural context.
A "recombinant" polypeptide sequence is produced by translation from a "recombinant" polynucleotide sequence. The term "derived from" with respect to polynucleotides or polypeptides being derived from a particular genera or species, means that the polynucleotide or polypeptide has the same sequence as a polynucleotide or polypeptide found naturally in that genera or species. The polynucleotide or polypeptide, derived from a particular genera or species, may therefore be produced synthetically or recombinantly.
Variants
As used herein, the term "variant" refers to polynucleotide or polypeptide sequences different from the specifically identified sequences, wherein one or more nucleotides or amino acid residues is deleted, substituted, or added. Variants may be naturally occurring allelic variants, or non-naturally occurring variants. Variants may be from the same or from other species and may encompass homologues, paralogues and orthologues. In certain embodiments, variants of the inventive polypeptides and polynucleotides possess biological activities that are the same or similar to those of the inventive polypeptides or polynucleotides. The term "variant" with reference to polypeptides and polynucleotides encompasses all forms of polypeptides and polynucleotides as defined herein.
Polynucleotide variants
Variant polynucleotide sequences preferably exhibit at least 50%, more preferably at least 51%, more preferably at least 52%, more preferably at least 53%, more preferably at least 54%, more preferably at least 55%, more preferably at least 56%, more preferably at least 57%, more preferably at least 58%, more preferably at least 59%, more preferably at least 60%, more preferably at least 61%, more preferably at least 62%, more preferably at least 63%, more preferably at least 64%, more preferably at least 65%, more preferably at least 66%, more preferably at least 67%, more preferably at least 68%, more preferably at least 69%, more preferably at least 70%, more preferably at least 71%, more preferably at least 72%, more preferably at least 73%, more preferably at least 74%, more preferably at least 75%, more preferably at least 76%, more preferably at least 77%, more preferably at least 78%, more preferably at least 79%, more preferably at least 80%, more preferably at least 81%, more preferably at least 82%, more preferably at least 83%, more preferably at least 84%, more preferably at least 85%, more preferably at least 86%, more preferably at least 87%, more preferably at least 88%, more preferably at least 89%, more preferably at least 90%, more preferably at least 91%, more preferably at least 92%, more preferably at least 93%, more preferably at least 94%, more preferably at least 95%, more preferably at least 96%, more preferably at least 97%, more preferably at least 98%, and most preferably at least 99% identity to a sequence of the present invention. Identity is found over a comparison window of at least 20 nucleotide positions, preferably at least 50 nucleotide positions, more preferably at least 100 nucleotide positions, and most preferably over the entire length of a polynucleotide of the invention.
Polynucleotide sequence identity can be determined in the following manner. The subject polynucleotide sequence is compared to a candidate polynucleotide sequence using BLASTN (from the BLAST suite of programs, version 2.2.5 [Nov 2002]) in bl2seq (Tatiana A. Tatusova, Thomas L. Madden (1999), "Blast 2 sequences - a new tool for comparing protein and nucleotide sequences", FEMS Microbiol Lett. 174:247-250), which is publicly available from the NCBI website on the World Wide Web at ftp://ftp<dot>ncbi<dot>nih<dot>gov/blast/. The default parameters of bl2seq are utilized except that filtering of low complexity parts should be turned off.
The identity of polynucleotide sequences may be examined using the following unix command line parameters:
bl2seq -i nucleotideseq1 -j nucleotideseq2 -F F -p blastn
The parameter -F F turns off filtering of low complexity sections. The parameter -p selects the appropriate algorithm for the pair of sequences. The bl2seq program reports sequence identity as both the number and percentage of identical nucleotides in a line "Identities = ".
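As a non-limiting illustration, the "Identities = " line of a pairwise BLAST report can be read programmatically. The sketch below assumes the report has already been saved as text and that the line takes the usual "Identities = identical/aligned (percent%)" form; the function name and the sample report are hypothetical and are shown only to make the parsing step concrete.

```python
import re

# Pattern for lines such as " Identities = 95/100 (95%)".
IDENTITIES = re.compile(r"Identities\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")


def parse_identities(report: str):
    """Return (identical, aligned, percent) for each 'Identities =' line found."""
    return [tuple(int(g) for g in m.groups()) for m in IDENTITIES.finditer(report)]


# Made-up excerpt standing in for saved bl2seq output.
sample = (
    " Score = 170 bits (86), Expect = 2e-47\n"
    " Identities = 95/100 (95%)\n"
    " Strand = Plus / Plus\n"
)
print(parse_identities(sample))  # [(95, 100, 95)]
```

Each tuple gives the number of identical positions, the length of the aligned region, and the percentage reported by the program, which can then be compared against the identity thresholds and comparison windows recited above.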
Polynucleotide sequence identity may also be calculated over the entire length of the overlap between a candidate and subject polynucleotide sequences using global sequence alignment programs (e.g. Needleman, S. B. and Wunsch, C. D. (1970) J. Mol. Biol. 48, 443-453). A full implementation of the Needleman-Wunsch global alignment algorithm is found in the needle program in the EMBOSS package (Rice, P., Longden, I. and Bleasby, A. EMBOSS: The European Molecular Biology Open Software Suite, Trends in Genetics June 2000, vol 16, No 6, pp. 276-277) which can be obtained from the world wide web at http://www<dot>hgmp<dot>mrc<dot>ac<dot>uk/Software/EMBOSS/. The European Bioinformatics Institute server also provides the facility to perform EMBOSS-needle global alignments between two sequences on line at http://www<dot>ebi<dot>ac<dot>uk/emboss/align/. Alternatively the GAP program may be used, which computes an optimal global alignment of two sequences without penalizing terminal gaps. GAP is described in the following paper: Huang, X. (1994) On Global Sequence Alignment. Computer Applications in the Biosciences 10, 227-235.
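For illustration, a global alignment of the Needleman-Wunsch type and a percent identity derived from it can be sketched as follows. This is a simplified sketch and not the EMBOSS needle or GAP implementation: the scoring values (match +1, mismatch -1, linear gap -1) are assumptions chosen for brevity, terminal gaps are penalized, and identity is expressed over the full alignment length including gap columns, which is only one of several conventions in use.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Return one optimal global alignment of a and b as two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be non-empty and of equal length")
    identical = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * identical / len(aligned_a)


top, bottom = needleman_wunsch("ACGT", "AGT")
print(top, bottom, percent_identity(top, bottom))  # ACGT A-GT 75.0
```

The same two functions apply unchanged to amino acid sequences, although a substitution matrix would normally replace the simple match and mismatch scores.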
A preferred method for calculating polynucleotide % sequence identity is based on aligning sequences to be compared using Clustal X (Jeanmougin et al., 1998, Trends Biochem. Sci. 23, 403-5.)
Polynucleotide variants also encompass those which exhibit a similarity to one or more of the specifically identified sequences that is likely to preserve the functional equivalence of those sequences and which could not reasonably be expected to have occurred by random chance. Such sequence similarity with respect to polypeptides may be determined using the publicly available bl2seq program from the BLAST suite of programs (version 2.2.5 [Nov 2002]) from the NCBI website on the World Wide Web at ftp://ftp<dot>ncbi<dot>nih<dot>gov/blast/.
The similarity of polynucleotide sequences may be examined using the following unix command line parameters:
bl2seq -i nucleotideseq1 -j nucleotideseq2 -F F -p tblastx
The parameter -F F turns off filtering of low complexity sections. The parameter -p selects the appropriate algorithm for the pair of sequences. This program finds regions of similarity between the sequences and for each such region reports an "E value" which is the expected number of times one could expect to see such a match by chance in a database of a fixed reference size containing random sequences. The size of this database is set by default in the bl2seq program. For small E values, much less than one, the E value is approximately the probability of such a random match.
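The relationship between an E value and the probability of a random match can be illustrated numerically. The sketch below assumes, as is conventional in BLAST statistics, that chance matches follow a Poisson distribution, so that the probability of at least one such match is 1 - exp(-E); the function name is hypothetical.

```python
import math


def probability_from_e_value(e_value: float) -> float:
    """Probability of at least one chance match, assuming Poisson-distributed hits."""
    if e_value < 0:
        raise ValueError("E value must be non-negative")
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E.
    return -math.expm1(-e_value)


for e in (1e-6, 0.01, 1.0, 10.0):
    print(e, probability_from_e_value(e))
```

For E values much less than one the two quantities are nearly equal, while for larger E values the probability approaches one and no longer tracks the E value.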
Variant polynucleotide sequences preferably exhibit an E value of less than 1 x 10^-6, more preferably less than 1 x 10^-9, more preferably less than 1 x 10^-12, more preferably less than 1 x 10^-15, more preferably less than 1 x 10^-18, more preferably less than 1 x 10^-21, more preferably less than 1 x 10^-30, more preferably less than 1 x 10^-40, more preferably less than 1 x 10^-50, more preferably less than 1 x 10^-60, more preferably less than 1 x 10^-70, more preferably less than 1 x 10^-80, more preferably less than 1 x 10^-90 and most preferably less than 1 x 10^-100 when compared with any one of the specifically identified sequences.
Alternatively, variant polynucleotides of the present invention, or used in the methods of the invention, hybridize to the specified polynucleotide sequences, or complements thereof, under stringent conditions.
The term "hybridize under stringent conditions", and grammatical equivalents thereof, refers to the ability of a polynucleotide molecule to hybridize to a target polynucleotide molecule (such as a target polynucleotide molecule immobilized on a DNA or RNA blot, such as a Southern blot or Northern blot) under defined conditions of temperature and salt concentration. The ability to hybridize under stringent hybridization conditions can be determined by initially hybridizing under less stringent conditions then increasing the stringency to the desired stringency.
With respect to polynucleotide molecules greater than about 100 bases in length, typical stringent hybridization conditions are no more than 25 to 30°C (for example, 10°C) below the melting temperature (Tm) of the native duplex (see generally, Sambrook et al., Eds, 1987, Molecular Cloning, A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press; Ausubel et al., 1987, Current Protocols in Molecular Biology, Greene Publishing). Tm for polynucleotide molecules greater than about 100 bases can be calculated by the formula Tm = 81.5 + 0.41% (G + C) - log (Na+) (Sambrook et al., Eds, 1987, Molecular Cloning, A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press; Bolton and McCarthy, 1962, PNAS 84:1390). Typical stringent conditions for a polynucleotide of greater than 100 bases in length would be hybridization conditions such as prewashing in a solution of 6X SSC, 0.2% SDS; hybridizing at 65°C, 6X SSC, 0.2% SDS overnight; followed by two washes of 30 minutes each in 1X SSC, 0.1% SDS at 65°C and two washes of 30 minutes each in 0.2X SSC, 0.1% SDS at 65°C.
With respect to polynucleotide molecules having a length less than 100 bases, exemplary stringent hybridization conditions are 5 to 10°C below Tm. On average, the Tm of a polynucleotide molecule of length less than 100 bp is reduced by approximately (500/oligonucleotide length)°C.
With respect to the DNA mimics known as peptide nucleic acids (PNAs) (Nielsen et al., Science. 1991 Dec 6;254(5037):1497-500) Tm values are higher than those for DNA-DNA or DNA-RNA hybrids, and can be calculated using the formula described in Giesen et al., Nucleic Acids Res. 1998 Nov 1;26(21):5004-6. Exemplary stringent hybridization conditions for a DNA-PNA hybrid having a length less than 100 bases are 5 to 10°C below the Tm.
Variant polynucleotides used in the methods of the invention also encompass polynucleotides that differ from the sequences of the invention but that, as a consequence of the degeneracy of the genetic code, encode a polypeptide having similar activity to a polypeptide encoded by a polynucleotide of the present invention. A sequence alteration that does not change the amino acid sequence of the polypeptide is a "silent variation". Except for ATG (methionine) and TGG (tryptophan), other codons for the same amino acid may be changed by art recognized techniques, e.g., to optimize codon expression in a particular host organism.
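The notion of a silent variation can be illustrated with a short sketch that translates two coding sequences using the standard genetic code and checks whether they encode the same amino acid sequence. The function names and example sequences are hypothetical; the sketch assumes in-frame DNA sequences written with the letters A, C, G and T only.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds: str) -> str:
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def is_silent_variation(original: str, variant: str) -> bool:
    """True if the nucleotide sequences differ but encode the same polypeptide."""
    return original.upper() != variant.upper() and translate(original) == translate(variant)


def synonymous_codons(codon: str):
    """All codons encoding the same amino acid as the given codon."""
    aa = CODON_TABLE[codon.upper()]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)


print(is_silent_variation("ATGGCTTGG", "ATGGCCTGG"))  # True: GCT and GCC both encode alanine
print(synonymous_codons("ATG"), synonymous_codons("TGG"))  # ['ATG'] ['TGG']
```

The last line shows why ATG and TGG are excepted in the passage above: each is the only codon for its amino acid, so no synonymous replacement exists.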
Polynucleotide sequence alterations resulting in conservative substitutions of one or several amino acids in the encoded polypeptide sequence without significantly altering its biological activity are also included in the invention. A skilled artisan will be aware of methods for making phenotypically silent amino acid substitutions (see, e.g., Bowie et al., 1990, Science 247, 306).
Variant polynucleotides due to silent variations and conservative substitutions in the encoded polypeptide sequence may be determined using the publicly available bl2seq program from the BLAST suite of programs (version 2.2.5 [Nov 2002]) from the NCBI website on the World Wide Web at ftp://ftp<dot>ncbi<dot>nih<dot>gov/blast/ via the tblastx algorithm as previously described.
Polypeptide variants
The term "variant" with reference to polypeptides encompasses naturally occurring, recombinantly and synthetically produced polypeptides. Variant polypeptide sequences preferably exhibit at least 50%, more preferably at least 51%, more preferably at least 52%, more preferably at least 53%, more preferably at least 54%, more preferably at least 55%, more preferably at least 56%, more preferably at least 57%, more preferably at least 58%, more preferably at least 59%, more preferably at least 60%, more preferably at least 61%, more preferably at least 62%, more preferably at least 63%, more preferably at least 64%, more preferably at least 65%, more preferably at least 66%, more preferably at least 67%, more preferably at least 68%, more preferably at least 69%, more preferably at least 70%, more preferably at least 71%, more preferably at least 72%, more preferably at least 73%, more preferably at least 74%, more preferably at least 75%, more preferably at least 76%, more preferably at least 77%, more preferably at least 78%, more preferably at least 79%, more preferably at least 80%, more preferably at least 81%, more preferably at least 82%, more preferably at least 83%, more preferably at least 84%, more preferably at least 85%, more preferably at least 86%, more preferably at least 87%, more preferably at least 88%, more preferably at least 89%, more preferably at least 90%, more preferably at least 91%, more preferably at least 92%, more preferably at least 93%, more preferably at least 94%, more preferably at least 95%, more preferably at least 96%, more preferably at least 97%, more preferably at least 98%, and most preferably at least 99% identity to a sequence of the present invention. Identity is found over a comparison window of at least 20 amino acid positions, preferably at least 50 amino acid positions, more preferably at least 100 amino acid positions, and most preferably over the entire length of a polypeptide of the invention.
Polypeptide sequence identity can be determined in the following manner. The subject polypeptide sequence is compared to a candidate polypeptide sequence using BLASTP (from the BLAST suite of programs, version 2.2.5 [Nov 2002]) in bl2seq, which is publicly available from the NCBI website on the World Wide Web at ftp:/ / ftp<dot>ncbi<dot>nih.<dot>gov/ blast/. The default parameters of bl2seq are utilized except that filtering oi lo complexity regions should be turned off.
Polypeptide sequence identity may also be calculated over the entire length of the overlap between a candidate and subject polynucleotide sequences using global sequence alignment programs. EMBOSS-needle (available at http:/www<dot>ebi<dot>ac<dot>uk/ emboss/ align/) and GAP (Huang, X. (1994) On Global Sequence Alignment. Computer Applications in the Biosciences 10, 227 -235.) as discussed above are also suitable global sequence alignment programs for calculating polypeptide sequence identity.
A preferred method for calculating polypeptide % sequence identity is based on aligning sequences to be compared using Clustal X (Jeanmougin et al., 1998, Trends Biochem. Sci. 23, 403-5).
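As a minimal sketch of the identity calculation described above, the following Python functions compute percent identity from two sequences that have already been aligned (for example by bl2seq, EMBOSS-needle or Clustal X) and written with "-" for gaps. The function names, the use of the full alignment length as the denominator, and the default thresholds are illustrative assumptions only; the alignment programs cited above may treat gap columns differently.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over all columns of a pairwise alignment.

    Both inputs are assumed to be pre-aligned strings of equal length,
    with '-' marking gap positions (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as identical only if both residues match and
    # neither position is a gap.
    matches = sum(1 for a, b in zip(aligned_a, aligned_b)
                  if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def meets_variant_threshold(aligned_a, aligned_b,
                            min_identity=50.0, min_window=20):
    """Check identity against a threshold over a minimum comparison window.

    The defaults mirror the broadest values stated in the text
    (at least 50% identity over at least 20 positions).
    """
    if len(aligned_a) < min_window:
        return False
    return percent_identity(aligned_a, aligned_b) >= min_identity
```

For example, `percent_identity("ACDE", "ACDF")` counts three identical columns out of four. Stricter embodiments can be checked by raising `min_identity` and `min_window`.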
Polypeptide variants used in the methods of the invention also encompass those which exhibit a similarity to one or more of the specifically identified sequences that is likely to preserve the functional equivalence of those sequences and which could not reasonably be expected to have occurred by random chance. Such sequence similarity with respect to polypeptides may be determined using the publicly available bl2seq program from the BLAST suite of programs (version 2.2.5 [Nov 2002]) from the NCBI website on the World Wide Web at ftp://ftp.ncbi.nih.gov/blast/. The similarity of polypeptide sequences may be examined using the following unix command line parameters: bl2seq -i peptideseq1 -j peptideseq2 -F F -p blastp
Variant polypeptide sequences preferably exhibit an E value of less than 1 x 10^-6, more preferably less than 1 x 10^-9, more preferably less than 1 x 10^-12, more preferably less than 1 x 10^-15, more preferably less than 1 x 10^-18, more preferably less than 1 x 10^-21, more preferably less than 1 x 10^-30, more preferably less than 1 x 10^-40, more preferably less than 1 x 10^-50, more preferably less than 1 x 10^-60, more preferably less than 1 x 10^-70, more preferably less than 1 x 10^-80, more preferably less than 1 x 10^-90 and most preferably 1 x 10^-100 when compared with any one of the specifically identified sequences.
The parameter -F F turns off filtering of low complexity sections. The parameter -p selects the appropriate algorithm for the pair of sequences. This program finds regions of similarity between the sequences and for each such region reports an "E value", which is the expected number of times one could expect to see such a match by chance in a database of a fixed reference size containing random sequences. For small E values, much less than one, this is approximately the probability of such a random match.
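The relationship between an E value and the probability of a chance match can be illustrated with a short sketch. It assumes the usual Poisson model for chance hits, under which the probability of at least one chance match is 1 - exp(-E); the function name is illustrative and is not part of any BLAST program.

```python
import math


def chance_match_probability(e_value):
    """Probability of at least one chance match for a given E value.

    Assumes chance hits follow a Poisson distribution with mean E,
    so P = 1 - exp(-E). expm1 keeps precision for very small E.
    """
    if e_value < 0:
        raise ValueError("E value must be non-negative")
    return -math.expm1(-e_value)
```

For E values much less than one, the result is very close to E itself, which is why a small E value can be read approximately as a probability. For larger E values the two diverge, since a probability cannot exceed one.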
Conservative substitutions of one or several amino acids of a described polypeptide sequence without significantly altering its biological activity are also included in the invention. A skilled artisan will be aware of methods for making phenotypically silent amino acid substitutions (see, e.g., Bowie et al., 1990, Science 247, 1306).
Constructs, vectors and components thereof
The term "genetic construct" refers to a polynucleotide molecule, usually double-stranded DNA, which may have inserted into it another polynucleotide molecule (the insert polynucleotide molecule) such as, but not limited to, a cDNA molecule. A genetic construct may contain the necessary elements that permit transcribing the insert polynucleotide molecule, and, optionally, translating the transcript into a polypeptide. The insert polynucleotide molecule may be derived from the host cell, or may be derived from a different cell or organism and/or may be a recombinant polynucleotide. Once inside the host cell the genetic construct may become integrated in the host chromosomal DNA. The genetic construct may be linked to a vector.
The term "vector" refers to a polynucleotide molecule, usually double-stranded DNA, which is used to transport the genetic construct into a host cell. The vector may be capable of replication in at least one additional host system, such as E. coli.
The term "expression construct" refers to a genetic construct that includes the necessary elements that permit transcribing the insert polynucleotide molecule, and, optionally, translating the transcript into a polypeptide. An expression construct typically comprises in a 5' to 3' direction:
a) a promoter functional in the host cell into which the construct will be transformed,
b) the polynucleotide to be expressed, and
c) a terminator functional in the host cell into which the construct will be transformed.
The term "coding region" or "open reading frame" (ORF) refers to the sense strand of a genomic DNA sequence or a cDNA sequence that is capable of producing a transcription product and/or a polypeptide under the control of appropriate regulatory sequences. The coding sequence may, in some cases, be identified by the presence of a 5' translation start codon and a 3' translation stop codon. When inserted into a genetic construct, a "coding sequence" is capable of being expressed when it is operably linked to promoter and terminator sequences.
"Operably-linked" means that the sequence to be expressed is placed under the control of regulatory elements that include promoters, tissue-specific regulatory elements, temporal regulatory elements, enhancers, repressors and terminators.
The term "noncoding region" refers to untranslated sequences that are upstream of the translational start site and downstream of the translational stop site. These sequences are also referred to respectively as the 5' UTR and the 3' UTR. These regions include elements required for transcription initiation and termination, mRNA stability, and for regulation of translation efficiency. Terminators are sequences which terminate transcription, and are found in the 3' untranslated ends of genes downstream of the translated sequence. Terminators are important determinants of mRNA stability and in some cases have been found to have spatial regulatory functions.
The term "promoter" refers to nontranscribed cis-regulatory elements upstream of the coding region that regulate gene transcription. Promoters comprise cis-initiator elements which specify the transcription initiation site and conserved boxes such as the TATA box, and motifs that are bound by transcription factors. Introns within coding sequences can also regulate transcription and influence post-transcriptional processing (including splicing, capping and polyadenylation).
A promoter may be homologous with respect to the polynucleotide to be expressed. This means that the promoter and polynucleotide are found operably linked in nature.
In a preferred embodiment the promoter may be heterologous with respect to the polynucleotide to be expressed. This means that the promoter and the polynucleotide are not found operably linked in nature.
In certain embodiments the polynucleotides/polypeptides of the invention may be advantageously expressed under the control of selected promoter sequences as described below.
Vegetative tissue specific promoters
An example of a vegetative specific promoter is found in US 6,229,067; and US 7,629,454; and US 7,153,953; and US 6,228,643.
Pollen specific promoters
An example of a pollen specific promoter is found in US 7,141,424; and US 5,545,546; and US 5,412,085; and US 5,086,169; and US 7,667,097.
Seed specific promoters
An example of a seed specific promoter is found in US 6,342,657; and US 7,081,565; and US 7,405,345; and US 7,642,346; and US 7,371,928. A preferred seed specific promoter is the napin promoter of Brassica napus (Josefsson et al., 1987, J Biol Chem. 262(25):12196-201; Ellerstrom et al., 1996, Plant Molecular Biology, Volume 32, Issue 6, pp 1019-1027).
Fruit specific promoters
An example of a fruit specific promoter is found in US 5,536,653; and US 6,127,179; and US 5,608,150; and US 4,943,674.
Non-photosynthetic tissue preferred promoters
Non-photosynthetic tissue preferred promoters include those preferentially expressed in non-photosynthetic tissues/organs of the plant.
Non-photosynthetic tissue preferred promoters may also include light repressed promoters.
Light repressed promoters
An example of a light repressed promoter is found in US 5,639,952 and in US 5,656,496.
Root specific promoters
An example of a root specific promoter is found in US 5,837,848; and US 2004/0067506 and US 2001/0047525.
Tuber specific promoters
An example of a tuber specific promoter is found in US 6, 84,443.
Bulb specific promoters
An example of a bulb specific promoter is found in Smeets et al., (1997) Plant Physiol. 113:765-771.
Rhizome preferred promoters
An example of a rhizome preferred promoter is found in Seong Jang et al., (2006) Plant Physiol. 142:1148-1159.
Endosperm specific promoters
An example of an endosperm specific promoter is found in US 7,745,697.
Corm promoters
An example of a promoter capable of driving expression in a corm is found in Schenk et al., (2001) Plant Molecular Biology, 47:399-412.
Photosynthetic tissue preferred promoters
Photosynthetic tissue preferred promoters include those that are preferentially expressed in photosynthetic tissues of the plants. Photosynthetic tissues of the plant include leaves, stems, shoots and above ground parts of the plant. Photosynthetic tissue preferred promoters include light regulated promoters.
Light regulated promoters
Numerous light regulated promoters are known to those skilled in the art and include for example chlorophyll a/b (Cab) binding protein promoters and Rubisco Small Subunit (SSU) promoters. An example of a light regulated promoter is found in US 5,750,385. Light regulated in this context means light inducible or light induced.
A "transgene" is a polynucleotide that is taken from one organism and introduced into a different organism by transformation. The transgene may be derived from the same species or from a different species as the species of the organism into which the transgene is introduced.
Host cells
Host cells may be derived from, for example, bacterial, fungal, yeast, insect, mammalian, algal or plant organisms. Host cells may also be synthetic cells. Preferred host cells are eukaryotic cells. A particularly preferred host cell is a plant cell, particularly a plant cell in a vegetative tissue of a plant.
A "transgenic plant" refers to a plant which contains new genetic material as a result of genetic manipulation or transformation. The new genetic material may be derived from a plant of the same species as the resulting transgenic plant or from a different species.
Methods for isolating or producing polynucleotides
The polynucleotide molecules of the invention can be isolated by using a variety of techniques known to those of ordinary skill in the art. By way of example, such polynucleotides can be isolated through use of the polymerase chain reaction (PCR) described in Mullis et al., Eds. 1994 The Polymerase Chain Reaction, Birkhauser, incorporated herein by reference. The polynucleotides of the invention can be amplified using primers, as defined herein, derived from the polynucleotide sequences of the invention.
Further methods for isolating polynucleotides of the invention include use of all, or portions of, the polynucleotides having the sequence set forth herein as hybridization probes. The technique of hybridizing labelled polynucleotide probes to polynucleotides immobilized on solid supports, such as nitrocellulose filters or nylon membranes, can be used to screen genomic or cDNA libraries. Exemplary hybridization and wash conditions are: hybridization for 20 hours at 65°C in 5.0 X SSC, 0.5% sodium dodecyl sulfate, 1 X Denhardt's solution; washing (three washes of twenty minutes each at 55°C) in 1.0 X SSC, 1% (w/v) sodium dodecyl sulfate, and optionally one wash (for twenty minutes) in 0.5 X SSC, 1% (w/v) sodium dodecyl sulfate, at 60°C. An optional further wash (for twenty minutes) can be conducted under conditions of 0.1 X SSC, 1% (w/v) sodium dodecyl sulfate, at 60°C.
The polynucleotide fragments of the invention may be produced by techniques well-known in the art such as restriction endonuclease digestion, oligonucleotide synthesis and PCR amplification.
A partial polynucleotide sequence may be used, in methods well-known in the art, to identify the corresponding full length polynucleotide sequence. Such methods include PCR-based methods, 5'RACE (Frohman MA, 1993, Methods Enzymol. 218: 340-56), hybridization-based methods and computer/database-based methods. Further, by way of example, inverse PCR permits acquisition of unknown sequences, flanking the polynucleotide sequences disclosed herein, starting with primers based on a known region (Triglia et al., 1998, Nucleic Acids Res 16, 8186, incorporated herein by reference). The method uses several restriction enzymes to generate a suitable fragment in the known region of a gene. The fragment is then circularized by intramolecular ligation and used as a PCR template. Divergent primers are designed from the known region. In order to physically assemble full-length clones, standard molecular biology approaches can be utilized (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987).
Variants (including orthologues) may be identified by the methods described.
Methods for identifying variants
Physical methods
Variant polypeptides may be identified using PCR-based methods (Mullis et al., Eds. 1994 The Polymerase Chain Reaction, Birkhauser). Typically, the polynucleotide sequence of a primer, useful to amplify variants of polynucleotide molecules of the invention by PCR, may be based on a sequence encoding a conserved region of the corresponding amino acid sequence.
Alternatively, library screening methods, well known to those skilled in the art, may be employed (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987). When identifying variants of the probe sequence, hybridization and/or wash stringency will typically be reduced relative to when exact sequence matches are sought.
Polypeptide variants may also be identified by physical methods, for example by screening expression libraries using antibodies raised against polypeptides of the invention (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987) or by identifying polypeptides from natural sources with the aid of such antibodies.
Computer based methods
The variant sequences of the invention, including both polynucleotide and polypeptide variants, may also be identified by computer-based methods well-known to those skilled in the art, using public domain sequence alignment algorithms and sequence similarity search tools to search sequence databases (public domain databases include Genbank, EMBL, Swiss-Prot, PIR and others). See, e.g., Nucleic Acids Res. 29: 1-10 and 11-16, 2001 for examples of online resources. Similarity searches retrieve and align target sequences for comparison with a sequence to be analyzed (i.e., a query sequence). Sequence comparison algorithms use scoring matrices to assign an overall score to each of the alignments. An exemplary family of programs useful for identifying variants in sequence databases is the BLAST suite of programs (version 2.2.5 [Nov 2002]) including BLASTN, BLASTP, BLASTX, tBLASTN and tBLASTX, which are publicly available from (ftp://ftp.ncbi.nih.gov/blast/) or from the National Center for Biotechnology Information (NCBI), National Library of Medicine, Building 38A, Room 8N805, Bethesda, MD 20894 USA. The NCBI server also provides the facility to use the programs to screen a number of publicly available sequence databases. BLASTN compares a nucleotide query sequence against a nucleotide sequence database. BLASTP compares an amino acid query sequence against a protein sequence database. BLASTX compares a nucleotide query sequence translated in all reading frames against a protein sequence database. tBLASTN compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames. tBLASTX compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database. The BLAST programs may be used with default parameters or the parameters may be altered as required to refine the screen.
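To illustrate what "translated in all reading frames" means for BLASTX, tBLASTN and tBLASTX, the following is a minimal Python sketch of a six-frame translation. It assumes the standard genetic code and an unambiguous DNA alphabet (A, C, G, T); the function names are illustrative, codons containing other characters are rendered as "X", and the sketch is not a description of how the BLAST programs are implemented internally.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translations(dna):
    """Map frame labels (+1, +2, +3, -1, -2, -3) to protein strings."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames["+%d" % (offset + 1)] = translate(dna[offset:])
        frames["-%d" % (offset + 1)] = translate(rc[offset:])
    return frames
```

Each of the six resulting protein strings can then be compared against a protein sequence, which is conceptually what a translated search does for every query or database entry.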
The use of the BLAST family of algorithms, including BLASTN, BLASTP, and BLASTX, is described in the publication of Altschul et al., Nucleic Acids Res. 25: 3389-3402, 1997.
The "hits" to one or more database sequences by a queried sequence produced by BLASTN, BLASTP, BLASTX, tBLASTN, tBLASTX, or a similar algorithm, align and identify similar portions of sequences. The hits are arranged in order of the degree of similarity and the length of sequence overlap. Hits to a database sequence generally represent an overlap over only a fraction of the sequence length of the queried sequence.
The BLASTN, BLASTP, BLASTX, tBLASTN and tBLASTX algorithms also produce "Expect" values for alignments. The Expect value (E) indicates the number of hits one can "expect" to see by chance when searching a database of the same size containing random contiguous sequences. The Expect value is used as a significance threshold for determining whether the hit to a database indicates true similarity. For example, an E value of 0.1 assigned to a polynucleotide hit is interpreted as meaning that in a database of the size of the database screened, one might expect to see 0.1 matches over the aligned portion of the sequence with a similar score simply by chance. For sequences having an E value of 0.01 or less over aligned and matched portions, the probability of finding a match by chance in that database is 1% or less using the BLASTN, BLASTP, BLASTX, tBLASTN or tBLASTX algorithm.
Multiple sequence alignments of a group of related sequences can be carried out with CLUSTALW (Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTALW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research, 22:4673-4680, http://www-igbmc.u-strasbg.fr/BioInfo/ClustalW/Top.html) or T-COFFEE (Cedric Notredame, Desmond G. Higgins, Jaap Heringa, T-Coffee: A novel method for fast and accurate multiple sequence alignment, J. Mol. Biol. (2000) 302: 205-217) or PILEUP, which uses progressive, pairwise alignments (Feng and Doolittle, 1987, J. Mol. Evol. 25, 351).
Pattern recognition software applications are available for finding motifs or signature sequences. For example, MEME (Multiple Em for Motif Elicitation) finds motifs and signature sequences in a set of sequences, and MAST (Motif Alignment and Search Tool) uses these motifs to identify similar or the same motifs in query sequences. The MAST results are provided as a series of alignments with appropriate statistical data and a visual overview of the motifs found. MEME and MAST were developed at the University of California, San Diego.
PROSITE (Bairoch and Bucher, 1994, Nucleic Acids Res. 22, 3583; Hofmann et al., 1999, Nucleic Acids Res. 27, 215) is a method of identifying the functions of uncharacterized proteins translated from genomic or cDNA sequences. The PROSITE database (www.expasy.org/prosite) contains biologically significant patterns and profiles and is designed so that it can be used with appropriate computational tools to assign a new sequence to a known family of proteins or to determine which known domain(s) are present in the sequence (Falquet et al., 2002, Nucleic Acids Res. 30, 235). Prosearch is a tool that can search SWISS-PROT and EMBL databases with a given sequence pattern or signature.
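As a rough illustration of searching a sequence with a pattern or signature, the sketch below converts a simple PROSITE-style pattern into a Python regular expression. It is a simplified converter written for this example, not a PROSITE or Prosearch tool: it handles only the basic elements (x for any residue, [..] for allowed residues, {..} for excluded residues, (n) or (n,m) for repeats, and < or > as terminal anchors outside brackets). The pattern D-x(2)-G is an assumed PROSITE-style rendering of a DxxG motif, and the test peptide DPDGM is an illustrative five-residue string.

```python
import re


def prosite_to_regex(pattern):
    """Convert a basic PROSITE-style pattern to a Python regex string.

    Simplified sketch: does not handle anchors placed inside brackets
    or other rarely used syntax.
    """
    regex = pattern.rstrip(".")           # drop the optional trailing period
    regex = regex.replace("-", "")        # element separators carry no meaning
    regex = regex.replace("x", ".")       # x = any residue
    regex = regex.replace("{", "[^").replace("}", "]")   # excluded residues
    regex = regex.replace("(", "{").replace(")", "}")    # repeat counts
    regex = regex.replace("<", "^").replace(">", "$")    # terminal anchors
    return regex


def find_motif(pattern, sequence):
    """Return 0-based start positions of non-overlapping pattern matches."""
    compiled = re.compile(prosite_to_regex(pattern))
    return [match.start() for match in compiled.finditer(sequence)]
```

For instance, `find_motif("D-x(2)-G", "DPDGM")` searches the peptide for an aspartate followed by any two residues and a glycine.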
Methods for isolating polypeptides
The polypeptides used in the methods of the invention, including variant polypeptides, may be prepared using peptide synthesis methods well known in the art such as direct peptide synthesis using solid phase techniques (e.g. Stewart et al., 1969, in Solid-Phase Peptide Synthesis, WH Freeman Co, San Francisco California), or automated synthesis, for example using an Applied Biosystems 431A Peptide Synthesizer (Foster City, California). Mutated forms of the polypeptides may also be produced during such syntheses. The polypeptides and variant polypeptides used in the methods of the invention may also be purified from natural sources using a variety of techniques that are well known in the art (e.g. Deutscher, 1990, Ed, Methods in Enzymology, Vol. 182, Guide to Protein Purification).
Alternatively, the polypeptides and variant polypeptides used in the methods of the invention may be expressed recombinantly in suitable host cells and separated from the cells as discussed below.
Methods for producing constructs and vectors
The genetic constructs expressing the encapsulated proteins according to the invention may be useful for transforming, for example, bacterial, fungal, insect, mammalian or plant organisms. The genetic constructs of the invention are intended to include expression constructs as herein defined.
Methods for producing and using genetic constructs and vectors are well known in the art and are described generally in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987; Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing, 1987.
Polynucleotides, expression cassettes, and constructs can also be conveniently synthesized in their entirety using techniques well-known and/or available to those skilled in the art.
Methods for producing host cells comprising polynucleotides, constructs or vectors
Host cells comprising genetic constructs, such as expression constructs, of the invention are useful in methods well known in the art (e.g. Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987; Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing, 1987) for recombinant production of encapsulated proteins. Such methods may involve the culture of host cells in an appropriate medium in conditions suitable for or conducive to expression of a polypeptide of the invention. The expressed recombinant polypeptide, which may optionally be secreted into the culture, may then be separated from the medium, host cells or culture medium by methods well known in the art (e.g. Deutscher, Ed, 1990, Methods in Enzymology, Vol 182, Guide to Protein Purification).
Methods for producing plant cells and plants comprising constructs and vectors
The invention further provides methods for producing plant cells and plants expressing the encapsulated proteins.
Methods for transforming plant cells, plants and portions thereof with polypeptides are described in Draper et al., 1988, Plant Genetic Transformation and Gene Expression. A Laboratory Manual. Blackwell Sci. Pub. Oxford, p. 365; Potrykus and Spangenburg, 1995, Gene Transfer to Plants. Springer-Verlag, Berlin; and Gelvin et al., 1993, Plant Molecular Biol. Manual. Kluwer Acad. Pub. Dordrecht. A review of transgenic plants, including transformation techniques, is provided in Galun and Breiman, 1997, Transgenic Plants. Imperial College Press, London.
Methods for genetic manipulation of plants
A number of plant transformation strategies are available (e.g. Birch, 1997, Ann Rev Plant Phys Plant Mol Biol, 48, 297; Hellens et al., 2000, Plant Mol Biol 42: 819-32; Hellens et al., Plant Meth 1: 13). For example, strategies may be designed to increase expression of a polynucleotide/polypeptide in a plant cell, organ and/or at a particular developmental stage where/when it is normally expressed or to ectopically express a polynucleotide/polypeptide in a cell, tissue, organ and/or at a particular developmental stage which/when it is not normally expressed. The expressed polynucleotide/polypeptide may be derived from the plant species to be transformed or may be derived from a different plant species.
Genetic constructs for expression of genes in transgenic plants typically include promoters for driving the expression of one or more cloned polynucleotides, terminators and selectable marker sequences to detect presence of the genetic construct in the transformed plant.
The promoters suitable for use in the constructs of this invention are functional in a cell, tissue or organ of a monocot or dicot plant and include cell-, tissue- and organ-specific promoters, cell cycle specific promoters, temporal promoters, inducible promoters, constitutive promoters that are active in most plant tissues, and recombinant promoters. Choice of promoter will depend upon the temporal and spatial expression of the cloned polynucleotide, so desired. The promoters may be those normally associated with a transgene of interest, or promoters which are derived from genes of other plants, viruses, and plant pathogenic bacteria and fungi. Those skilled in the art will, without undue experimentation, be able to select promoters that are suitable for use in modifying and modulating plant traits using genetic constructs comprising the polynucleotide sequences of the invention. Examples of constitutive plant promoters include the CaMV 35S promoter, the nopaline synthase promoter and the octopine synthase promoter, and the Ubi 1 promoter from maize. Plant promoters which are active in specific tissues, or which respond to internal developmental signals or external abiotic or biotic stresses, are described in the scientific literature. Exemplary promoters are described, e.g., in WO 02/00894 and WO2011/053169, which are herein incorporated by reference.
Exemplary terminators that are commonly used in plant transformation genetic constructs include, e.g., the cauliflower mosaic virus (CaMV) 35S terminator, the Agrobacterium tumefaciens nopaline synthase or octopine synthase terminators, the Zea mays zein gene terminator, the Oryza sativa ADP-glucose pyrophosphorylase terminator and the Solanum tuberosum PI-II terminator.
Selectable markers commonly used in plant transformation include the neomycin phosphotransferase II gene (NPT II) which confers kanamycin resistance, the aadA gene, which confers spectinomycin and streptomycin resistance, the phosphinothricin acetyl transferase (bar gene) for Ignite (AgrEvo) and Basta (Hoechst) resistance, and the hygromycin phosphotransferase gene (hpt) for hygromycin resistance.
The following are representative publications disclosing genetic transformation protocols that can be used to genetically transform the following plant species: Rice (Alam et al., 1999, Plant Cell Rep. 18, 572); apple (Yao et al., 1995, Plant Cell Reports 14, 407-412); maize (US Patent Serial Nos. 5,177,010 and 5,981,840); wheat (Ortiz et al., 1996, Plant Cell Rep. 15, 1996, 877); tomato (US Patent Serial No. 5,159,135); potato (Kumar et al., 1996 Plant J. 9: 821); cassava (Li et al., 1996 Nat. Biotechnology 14, 736); lettuce (Michelmore et al., 1987, Plant Cell Rep. 6, 439); tobacco (Horsch et al., 1985, Science 227, 1229); cotton (US Patent Serial Nos. 5,846,797 and 5,004,863); grasses (US Patent Nos. 5,187,073 and 6,020,539); peppermint (Niu et al., 1998, Plant Cell Rep. 17, 165); citrus plants (Pena et al., 1995, Plant Sci. 104, 183); caraway (Krens et al., 1997, Plant Cell Rep. 17, 39); banana (US Patent Serial No. 5,792,935); soybean (US Patent Nos. 5,416,011; 5,569,834; 5,824,877; 5,563,04455 and 5,968,830); pineapple (US Patent Serial No. 5,952,543); poplar (US Patent No. 4,795,855); monocots in general (US Patent Nos. 5,591,616 and 6,037,522); brassica (US Patent Nos. 5,188,958; 5,463,174 and 5,750,871); cereals (US Patent No. 6,074,877); pear (Matsuda et al., 2005, Plant Cell Rep. 24(1):45-51); Prunus (Ramesh et al., 2006 Plant Cell Rep. 25(8):821-8; Song and Sink 2005 Plant Cell Rep. 2006; 25(2):117-23; Gonzalez Padilla et al., 2003 Plant Cell Rep. 22(1):38-45); strawberry (Oosumi et al., 2006 Planta. 223(6):1219-30; Folta et al., 2006 Planta Apr 14; PMID: 16614818), rose (Li et al., 2003), Rubus (Graham et al., 1995 Methods Mol Biol. 1995;44:129-33), tomato (Dan et al., 2006, Plant Cell Reports V25:432-441), apple (Yao et al., 1995, Plant Cell Rep. 14, 407-412), Canola (Brassica napus L.) (Cardoza and Stewart, 2006 Methods Mol Biol. 343:257-66), safflower (Orlikowska et al., 1995, Plant Cell Tissue and Organ Culture 40:85-91), ryegrass (Altpeter et al., 2004 Developments in Plant Breeding 110:255-250), rice (Christou et al., 1991 Nature Biotech. 9:957-962), maize (Wang et al., 2009 In: Handbook of Maize pp. 609-639) and Actinidia eriantha (Wang et al., 2006, Plant Cell Rep. 25,5: 425-31). Transformation of other species is also contemplated by the invention.
Suitable methods and protocols are available in the scientific literature.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows the structure of the YenB/YenC2-N complex. a, Ribbon diagram of YenB/YenC2-N. YenB is on the left in light grey and YenC is on the right in dark grey. b-c, Orthogonal views of the complex, with the central cavity shown as a translucent surface and the protein as grey ribbons, and with approximate interior and exterior diameters marked. The position of an RGD motif is shown with a circle. d, A schematic topology diagram of the structure with α-helices shown as cylinders and β-sheets as arrows, and the domains labelled.
Figure 2 shows structural details of the YenC2-N auto-proteolysis site. The site of auto-proteolysis in YenC. The residues immediately upstream of the cleavage point are D686, P687, D688, G689 and M690. The side chains of a selection of residues conserved in the RHS-associated core domain are shown. The distance (6.3 Å) between the two conserved residues that are essential for proteolysis, D686 and D663, is shown in dark grey. The side chain of D663 is within hydrogen-bonding distance of the terminal carboxyl group of the cleaved peptide (distances (2.8 Å and 3.2 Å) shown in light grey).
Figure 3 shows RHS repeat structure. a, A section of the shell showing the pattern of RHS repeats, viewed from the inside of the central cavity. A single RHS repeat is highlighted with a grey background box. The ordered loop made by the DxxG motif is shown at the top and the conserved pattern of hydrophobic residues on the face of the β-sheet is shown in dark grey (conserved tyrosines) and light grey (other hydrophobic residues). b, The same face of the β-sheet shown as a solvent accessible surface and coloured by side-chain hydrophobicity (hydrophilic as white, hydrophobic as dark grey, protein backbone as mid grey). The stripe formed by the conserved hydrophobic residues is boxed with a dashed line.
Figure 4 shows the position of the YenB/YenC2-N complex in the complete Yen-Tc particle. a-b, The YenB/YenC2-N dimer is shown as a ribbon diagram, fitted to EM class averages of the complete Yen-Tc toxin particle. c-d, Orthogonal views of the complete Yen-Tc complex. The YenB/YenC2-N dimer is shown as a ribbon diagram. The associated chitinases Chi1 and Chi2 (PDB ID: 4DWS) are shown as pale grey ribbon diagrams. The EM map of the YenA/Chi1/Chi2 complex determined at a resolution of 17 Å by single particle averaging 6 is shown as a light grey surface.
Figure 5 shows a topology diagram of YenB/YenC2-N. The YenB/YenC2-N structure is shown in schematic form with α-helices as cylinders and β-sheets as arrows. The start and end points of secondary structure elements are indicated. The domains are composed as follows: β-strands 1-29, YenB SpvB domain; β-strands 30-50, YenB β-propeller domain; β-strands 51-92 (light grey), the remainder of YenB; β-strands 1-44 (dark grey), YenC2-N; β-strands 45-49 (dark grey), YenC2-N RHS hyper-conserved core domain.
Figure 6 shows SAXS bead models of YenB/YenC2 and YenB/YenC2-N. a and c, a slice through the ab initio bead models produced from small-angle X-ray scattering of YenB/YenC2 and YenB/YenC2-N respectively. The model of YenB/YenC2-N has a large internal cavity shown in dark grey, absent from the model of YenB/YenC2. b and d, fit of the ab initio bead models to scattering data.
Figure 7 sho ns SAXS data for YenB/YenC2. a, purity of YenB/YenC2 sample for SAXS analysis, shown by size exclusion chromatography trace and SDS-PAGE (inset), b, SAXS data for YenB/ YenC2 as a log-log plot. Inset is a Guinier plot of the low-^ region, c, P(r) plot for YenB/ YenC2 with Dmax = 134 A. d, scattering of YenB/ YenC2 compared to the theoretical scattering of the YenB / YenC2-N crystal structure, highlighting the poor fit.
Figure 8 shows SAXS data for YenB/YenC2-N. a, purity of YenB/YenC2-N sample for SAXS analysis, shown by size exclusion chromatography trace and SDS-PAGE (inset). b, SAXS data for YenB/YenC2-N as a log-log plot. Inset is a Guinier plot of the low-q region. c, P(r) plot for YenB/YenC2-N with Dmax = 138 Å. d, scattering of YenB/YenC2-N compared to the theoretical scattering of the YenB/YenC2-N crystal structure.
Figure 9 shows the effect of point mutations on YenC2 self-cleavage. When co-expressed with YenB, wild-type YenC2 (WT) self-cleaves following M690. Three point mutations (R650A, D663N, and D686N) were found to abrogate self-cleavage.
Figure 10 shows profile-HMM logos of RHS and YD repeats. Profile-HMM logos of an RHS repeat (a) and a YD repeat (b) show that these two repeats have the same consensus sequence.
Figure 11 shows the E. coli CNF1 catalytic domain compared with YenB/YenC2-N. The catalytic domain of E. coli CNF1 (dark grey surface representation), which is homologous to YenC1-C, is shown manually placed inside the hollow shell formed by YenB/YenC2-N (cartoon diagram). This shows that the central cavity of YenB/YenC2-N is large enough to accommodate the C-terminal toxin domain of the YenC proteins.
Figure 12. Profile-HMM that describes the RHS repeat, as defined in the Pfam database (http://pfam<dot>sanger<dot>ac<dot>uk/family/PF05593).
Figure 13. Profile-HMM that describes the YD repeat, as defined in the TIGRfams database (http://www<dot>jcvi<dot>org/cgi-bin/tigrfams/HmmReportPage<dot>cgi?acc=TIGR01643).
Figure 14. Composite profile-HMM that describes both the RHS repeat and the YD repeat, constructed using the program jackhmmer (http://hmmer<dot>janelia<dot>org/search/jackhmmer).
Figure 15. Profile-HMM that describes the RHS repeat-associated core domain, as defined in the TIGRfams database (http://www<dot>jcvi<dot>org/cgi-bin/tigrfams/HmmReportPage<dot>cgi?acc=TIGR03696).
Figure 16. a, Size exclusion chromatography trace of TcB:TcC-GFP fusion at pH 7.5; b, SDS-PAGE of fractions from the size exclusion trace in panel a. Peak 1: GFP encapsulated in the TcB:TcC shell; Peak 2: non-bound GFP; c, the left microcentrifuge tube contains protein from Peak 1 (no fluorescence) and the right microcentrifuge tube contains released GFP from Peak 2 (fluorescence under UV illumination).
EXAMPLES
The invention will now be exemplified with reference to the following non-limiting Examples.
Example 1: Elucidation of the structure of the complex formed between the TcB and TcC components of ABC toxin complexes (Tc).
The ABC toxin complexes (Tc) produced by some bacteria are of interest due to their potent oral insecticidal activity¹,² and potential role in human disease³. They are composed of at least three proteins, TcA, TcB and TcC, which must assemble together in order to be fully toxic⁴. The carboxy-terminal section of TcC is the main cytotoxic component⁵, and displays remarkable heterogeneity between different Tcs. A general model of action has been proposed, in which the TcA component first binds to the cell surface, is endocytosed and subsequently forms a pH-triggered channel, allowing the translocation of TcC into the cytoplasm⁵, where it can cause cytoskeletal disruption in both insect and mammalian cells. Tc complexes have been visualised using single particle electron microscopy⁶,⁷, but no high-resolution structures of the components are available, and the role of TcB in the mechanism of toxicity remains unknown. Here we report the three-dimensional structure of the complex between TcB and the conserved amino-terminal section of TcC determined to 2.3 Å by X-ray crystallography. These components assemble to form an unprecedented large hollow structure that encapsulates and sequesters the cytotoxic carboxy-terminal portion of TcC like the shell of an egg. The shell is decorated on one end with a β-propeller domain, which mediates attachment of the TcB/TcC dimer to the TcA component of the complex. Furthermore, the structure shows how TcC auto-proteolyses when folded in complex with TcB. TcC is the first known protein structure to contain RHS (rearrangement hot spot) repeats⁸, and illustrates the structural architecture that is likely to be conserved across this widely distributed bacterial protein family and the related eukaryotic YD-repeat-containing protein family, which includes the teneurins⁹. In addition to indicating the function of these protein families, the structure suggests a generic mechanism for protein encapsulation and delivery.
ABC toxins were first identified, and have been best characterised, in the bacterium Photorhabdus luminescens. The entomopathogenic bacterium Yersinia entomophaga contains a related Tc locus where the TcA component is split into two ORFs (YenA1 and YenA2), along with a single TcB gene (YenB) and two TcC genes (YenC1 and YenC2)¹⁰. The TcC proteins of this and other Tcs are similar to the "polymorphic toxins" described by Zhang et al.¹¹ as they have a conserved RHS-repeat-containing amino-terminal region and a variable carboxy-terminal region¹⁰. The carboxy-terminal regions of the Y. entomophaga TcC proteins are predicted to have different toxic activities: YenC1-C is homologous to cytotoxic necrotising factor 1 (CNF1) from Escherichia coli¹², while YenC2-C is homologous to the deaminase YwqJ from Bacillus subtilis¹³. When YenB and YenC1 or YenC2 proteins are co-expressed, the YenC proteins are cleaved at the boundary between the conserved amino-terminal domain and the variable carboxy-terminal domain.
As the function of TcB proteins in Tcs is unknown, we prepared the complex of YenB (167 kDa) with the conserved N-terminal 76 kDa portion of YenC2 (YenC2-N) by co-expression and incubation at low pH (Supplementary methods), and solved its structure using X-ray crystallography (Table S1). The structure reveals a remarkable, intimately associated heterodimer formed by YenB and YenC2-N that cooperatively fold into a large, hollow shell (Fig. 1a). An immediately striking feature is the single long β-sheet, comprised of 76 β-strands derived from both proteins, that constitutes the majority of the shell structure. The shell is completed by a second β-sheet formed by 14 strands contributed by YenC2-N, bringing to 90 the total number of β-strands that wrap around what is a substantial central cavity (Fig. 1b & c; Fig. 5). The carboxy-terminus of YenB is in close proximity to the amino-terminus of YenC2, suggesting that the two proteins could be produced as a single polypeptide. Evidence in support of this can be found in the bacterium Burkholderia rhizoxinica, where a single ORF (tcrJB) encodes an apparent TcB-TcC fusion protein.
The central cavity is a solvent-accessible space approximately 42 Å wide and 87 Å long, with a total enclosed volume of approximately 59,000 Å³. The shell is closed at both ends: the YenB end by a β-propeller domain inserted into the loop between strands β29 and β51, and the YenC2 end by another short strip of β-sheet (strands β45-β49) that spirals inwards, forming a plug. The overall shape is reminiscent of a hollow egg.
The carboxy-terminal end of YenC2-N (i.e. the cleavage site between the two portions of YenC2) lies inside this shell. We therefore propose that the complete YenB/YenC complex encapsulates YenC2-C within the shell of β-sheets created by YenB and YenC2-N. This proposal is supported by small-angle X-ray scattering data, which are consistent with a hollow spheroid for the YenB/YenC2-N complex, but with a solid spheroid for the complete YenB/YenC2 complex (Supplementary Figs. 2-4 and Tables 8-11). This explains how YenC2-C remains tightly associated with the complex after auto-proteolysis, in the absence of any covalent linkage between the proteins. In broader terms, this also explains how generally cytotoxic proteins encoded by the C-terminal regions of TcC proteins, such as deaminases, can be safely produced without intoxication of the producing cell. According to the current model, the toxic payload, in this case YenC2-C, remains sequestered until exposure to a change in pH triggers its release⁵.
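The encapsulation proposal can be given a rough numerical sanity check. The sketch below is illustrative only and is not part of the reported analysis: it assumes that the mass of YenC2-C equals the difference between the sequence-derived masses of YenB/YenC2 and YenB/YenC2-N listed in Tables 8 and 10, and it converts that mass to an anhydrous volume using the partial specific volume from the same tables. Hydration and packing inefficiency are ignored, so the estimate is a lower bound on the space the cargo would need.

```python
AVOGADRO = 6.02214076e23          # molecules per mole
PARTIAL_SPECIFIC_VOLUME = 0.7425  # cm^3/g, as listed in Tables 8 and 10
A3_PER_CM3 = 1e24                 # cubic angstroms per cubic centimetre


def protein_volume_A3(mass_da, vbar=PARTIAL_SPECIFIC_VOLUME):
    """Anhydrous volume (in cubic angstroms) of a protein of the given mass (Da)."""
    return mass_da * vbar * A3_PER_CM3 / AVOGADRO


# Sequence-derived masses from Tables 8 and 10 (kDa).
full_complex_kda = 276.3   # YenB/YenC2
shell_kda = 243.3          # YenB/YenC2-N
# Assumption: the difference approximates the mass of the YenC2-C cargo.
cargo_kda = full_complex_kda - shell_kda

cavity_A3 = 59_000         # enclosed cavity volume quoted in the text
cargo_A3 = protein_volume_A3(cargo_kda * 1000)

print(f"cargo mass   ~ {cargo_kda:.1f} kDa")
print(f"cargo volume ~ {cargo_A3:,.0f} cubic angstroms")
print(f"cavity fill  ~ {cargo_A3 / cavity_A3:.0%}")
```

Under these assumptions the cargo volume comes out below the quoted cavity volume, which is in line with the statement that the cavity can hold the C-terminal region.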
The amino acids immediately before the YenC2 cleavage site are clearly visible in the electron density, allowing us to suggest a mechanism of auto-proteolysis. We propose that two aspartate residues (D663 and D686), positioned either side of the last residue prior to cleavage (M690), form the catalytic site for proteolysis (Fig. 2). In our structure, they are too far apart to form the canonical aspartic protease arrangement (6.2 Å between carboxyl oxygens), but as we have visualised the post-cleavage state, it is possible that prior to cleavage these residues may adopt a slightly different conformation. To establish their role, we made point mutations that replaced the conserved aspartates with asparagines (D663N & D686N), and showed that mutation of either residue completely abolished auto-proteolytic activity (Fig. 9). These residues are part of the strongly conserved RHS repeat-associated core domain¹⁴ (TIGRfam TIGR03696, Interpro IPR022385), which is widely distributed across the archaea, bacteria and eukaryota. Our structure therefore shows this domain to be a cryptic aspartic protease.
RHS repeats themselves are present in many polymorphic toxin complexes that are found across a diverse range of bacterial species. Until now, RHS-repeat containing proteins have been structurally intractable, making this structure of YenC-N the first of any protein containing RHS repeats. Individual RHS repeat proteins can vary in size, and the overall sequence conservation across the family is low, but a consensus sequence for the repeat has been previously defined: GxxxRYxYDxxGRL(I/T)¹⁵. When this is mapped onto the structure of YenC-N (Fig. 3), it is clear that each RHS repeat corresponds to a single strand-turn-strand motif, multiple copies of which make up the extended β-sheet of the shell. Although the initial glycine is not especially well conserved, it marks the hairpin facing the YenC2-end of the shell. The central DxxG creates the hairpin facing the YenB-end, with the aspartic acid hydrogen-bonding to the backbone amides of the glycine and adjacent arginine. This glycine is largely conserved, but the aspartic acid can be replaced by a glutamic acid, threonine or serine and typically the interactions formed remain the same. The YxY motif places the two tyrosine sidechains inside the shell (coloured magenta in Fig. 3a) where they sit parallel to each other, and also stack with the post-hairpin arginine from an adjacent strand. The conserved hydrophobic amino acids at the C-terminal end of the repeat (coloured yellow in Fig. 3a) also lie inside the shell, forming a continuous hydrophobic stripe along the face of the β-sheet composed of tyrosines and leucines/isoleucines on alternating strands (Fig. 3b). The RHS structural motif is present in YenB as well as YenC2-N, albeit with less sequence conservation. The YenB sequence contains more insertions and extensions within the RHS repeats than the YenC sequence, which makes identifying the RHS pattern difficult by sequence conservation alone.
However, inspection of the structure reveals many examples of DxxG turns and tyrosine or phenylalanine sidechains arranged in an equivalent fashion. Using this structural conservation as a guide, we were able to produce a refined consensus sequence for the RHS repeat (Fig. 10) and show that the pattern of conservation is identical to that seen in YD repeats (TIGRfam TIGR01643; Figs. 10, 12, 13, 14) that are found in many bacterial and eukaryotic proteins, notably in the extracellular domains of teneurins, which are developmental signalling proteins conserved from flies to mammals and required for synaptic partner matching¹⁶,¹⁷. We propose that RHS and YD repeats represent a conserved structural motif that will always give rise to an extended β-sheet, forming a shell structure similar to that seen here. Support for this proposal can be found in previous low-resolution EM images of the extracellular domains of mouse teneurin, which revealed a globular domain of similar dimensions to the YenB/YenC-N complex¹⁸. We predict that the YD-repeat containing domains of eukaryotic teneurins will encapsulate their C-terminal regions, the teneurin C-terminal associated peptides (TCAPs), which are known to be active extracellular signalling components in mice¹⁹,²⁰.
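As an illustration of the previously defined consensus GxxxRYxYDxxGRL(I/T), the following minimal sketch scans a protein sequence with two regular expressions. It is not the profile-HMM approach of Figures 10 and 12-15, and a fixed pattern will miss the divergent repeats described above. The relaxed pattern, the function name and the toy sequence are assumptions made for this example: the relaxed form drops the poorly conserved initial glycine and allows D, E, T or S at the turn position, following the substitutions mentioned in the text.

```python
import re

# Strict form of the published consensus GxxxRYxYDxxGRL(I/T).
STRICT = re.compile(r"G.{3}RY.YD.{2}GRL[IT]")

# Relaxed form (assumption for illustration): any residue in place of the
# initial glycine, and D/E/T/S allowed at the start of the DxxG turn.
RELAXED = re.compile(r".{4}RY.Y[DETS].{2}GRL[IT]")


def find_repeats(sequence, pattern):
    """Return (start_index, matched_text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in pattern.finditer(sequence)]


# Hypothetical toy sequence containing one strict and one divergent repeat.
TOY_SEQUENCE = "MKT" + "GSEVRYTYDALGRLI" + "PLN" + "ANQTRYQYEPSGRLT" + "AA"

for name, pattern in (("strict", STRICT), ("relaxed", RELAXED)):
    hits = find_repeats(TOY_SEQUENCE, pattern)
    print(name, hits)
```

On the toy sequence the strict pattern finds only the first repeat, while the relaxed pattern finds both, mirroring the observation that sequence conservation alone under-detects RHS repeats.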
Previous visualisations of complete ABC Tcs from Y. entomophaga and P. luminescens have shown that the TcB/TcC complex sits in the vestibule of the channel-forming domain of TcA, positioned at the end of the Tc complex furthest from the membrane. Fitting the YenB/YenC-N structure into the 25 Å EM map of the P. luminescens Tc unambiguously places the five-bladed β-propeller domain of the TcB/TcC as a point of interaction with the TcA pentamer (Fig. 4). In Tcs such as Yen-Tc, where the TcA component is encoded by two separate ORFs, this represents an interaction with YenA2. We are now able to model the complete Yen-Tc complex for the first time (Fig. 4) by docking both the YenB/YenC-N complex and both associated chitinase enzymes, Chi1 and Chi2, onto the previously determined 16 Å EM structure of the Y. entomophaga YenA pentamer⁶.
A general model for the injection mechanism of Tcs has been proposed, based on the EM structure of a membrane-bound form of the P. luminescens Tc, in which the C-terminal domain of TcC is translocated in an unfolded state through a transmembrane pore, 15 Å in diameter, formed by TcA. It remains unclear whether the toxic C-terminal region of YenC is encapsulated in a folded or unfolded state within the YenB/YenC-N shell, but the central cavity is large enough to contain the C-terminal region of YenC2 in a folded state, assuming it adopts the same fold as other deaminases (Fig. 11). As the pore of the TcA channel has not yet been visualised in an active conformation, it remains a possibility that the translocation state of the toxin contains an open pore wide enough to allow the passage of a folded protein. On the other hand, the overall architecture of the YenB/YenC-N shell, with its conserved RHS repeats producing an interior hydrophobic pattern of tyrosine, leucine and isoleucine residues, is reminiscent of the protein chaperone GroEL²¹, perhaps implying that the function of TcB/TcC proteins, and of RHS and YD repeats more generally, is to encapsulate unfolded proteins. There is support for this idea in the observation that many polymorphic toxins have predicted proteases as their toxic components, which would need to be contained in an inactive state to prevent proteolysis of the shell itself.
Release of the encapsulated TcC-C from the TcB/TcC-N shell will require a conformational change, as there are no gaps in the structure large enough for a polypeptide to pass through. Two possibilities exist: the β-propeller blades could separate, allowing extrusion of an unfolded polypeptide through the middle of the propeller, or the propeller domain could swing aside like a bottle-top, hinged on the β29/β30 and β50/β51 loops, which form the only covalent connections between the β-propeller and the main body of the shell. Either mechanism is likely to be dependent on both a pH-driven trigger and mechanical interactions with the TcA component of the toxin.
The structure of the YenB/YenC-N complex presented here reveals how the cytotoxic TcC components of ABC-type Tc complexes are processed and contained, demonstrates the function of the TcB component within the Tc and provides a framework for further experiments to build a complete mechanistic model of action for this class of toxins. More broadly, it also illuminates the function of the widely distributed RHS and YD repeat families of proteins, which had until now been unknown.
Methods Summary
The YenB/YenC2 protein complex was produced by co-expression in E. coli and purified using Ni-affinity and size exclusion chromatography. The YenB/YenC2-N protein complex was obtained by dialysing YenB/YenC2 against acetate buffer at pH 4.5, filtration and size exclusion chromatography. Crystallisation was carried out by hanging-drop vapour diffusion with microseeding in drops containing 18% (w/v) PEG 3350, 0.15 M KH2PO4 pH 4.8. X-ray diffraction data were collected to a resolution of 2.26 Å at beamline MX2 at the Australian Synchrotron²², integrated using XDS²³ and scaled and merged using Aimless²⁴ (Tables 6 and 7). Phasing was accomplished by a combination of MAD and SAD using Ta6Br12-soaked and selenomethionine-substituted crystals²⁵,²⁶. Structure refinement and analysis were performed using Phenix²⁷ and diagrams were produced using PyMOL²⁸ and Chimera²⁹.
Supplementary methods.
YenC-C dissociates at low pH
When YenB and either YenC1 or YenC2 were co-expressed in E. coli, YenC1 and YenC2 auto-proteolysed into two fragments as described previously (ref). In both cases, although all three protein fragments co-eluted as a single complex when purified by size-exclusion chromatography, we were unable to crystallise the purified complexes. As the current model for Tc cell entry involves exposure to low pH in the acidified endosome, we tested the behaviour of YenB + YenC1/YenC2 complexes at a range of pH conditions from 4.5 to 9.5. At low pH (4.5 - 5.0), the complexes began to precipitate, with the C-terminal domains of the YenC proteins showing differential precipitation, allowing purification of complexes containing just YenB and the N-terminal portions of YenC.
SAXS data collection and processing
Small-angle X-ray scattering data were collected at the SAXS/WAXS beamline at the Australian Synchrotron (Figures 6, 7 and 8; Tables 8, 9, 10 and 11). Samples were purified to homogeneity by IMAC and SEC and exhaustively dialysed against sample buffer containing 20 mM HEPES pH 7.5, 150 mM NaCl. The dialysate was used as the solvent blank. Data collection was carried out at 291 K with 1 or 2 second exposures. Sample was flowed across the beam at 4 μl/s to avoid radiation damage. Multiple concentrations were tested for each protein, and images for each concentration were compared, averaged and buffer-subtracted using the ScatterBrain software provided by the Australian Synchrotron. Scattering data were placed on the absolute scale by measuring the scattering of a water sample³⁰. Ab initio bead models were created by running dammif³¹ 20 times, superimposing and averaging the resulting models with damaver³², and using this as input for a final refinement run of dammin³³. SAXS data were compared with the theoretical scattering of the YenB/YenC2-N crystal structure using crysol³⁴.
Table 6. Data collection and refinement statistics for native YenB/YenC2-N dataset.
Wavelength (Å) 0.9537
Resolution range (Å) 49.55 - 2.26 (2.40 - 2.26)
Space group P2₁2₁2₁
Unit cell (Å, °) 133.7 147.6 274.4 90 90 90
Total reflections 3,380,388 (245,257)
Unique reflections 245,036 (33,576)
Completeness (%) 97.3 (83.2)
Multiplicity 13.8 (7.3)
Mean I/σ(I) 9.81 (0.81)
CC1/2 (%) 99.1 (15.7)
Wilson B-factor 37.69
R-meas 0.3129 (2.4219)
R-factor/R-free 0.2075/0.2574
Number of atoms 35,548
  macromolecules 33,027
  ligands 54
  water 2,467
Protein residues 4,240
RMS (bonds) (Å) 0.005
RMS (angles) (°) 0.99
Ramachandran favored (%) 96.00
Ramachandran outliers (%) 0.14
Clashscore 12.16
Average B-factor 48.20
  macromolecules 48.30
  solvent 46.80
Statistics for the highest-resolution shell are shown in parentheses.
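The multiplicity row of Table 6 is, by definition, the ratio of total to unique reflections, so it can be cross-checked from the two rows above it. The short sketch below does this for the overall and highest-resolution-shell values; the function name is an illustrative choice, and the numbers are copied from Table 6.

```python
def multiplicity(total_reflections, unique_reflections):
    """Average number of measurements per unique reflection."""
    return total_reflections / unique_reflections


# Values from Table 6: overall and (in parentheses there) highest-resolution shell.
overall = multiplicity(3_380_388, 245_036)
outer_shell = multiplicity(245_257, 33_576)

print(f"overall multiplicity     = {overall:.1f}")
print(f"outer-shell multiplicity = {outer_shell:.1f}")
```

Both ratios round to the tabulated values, 13.8 and 7.3.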
Table 7. Data collection and refinement statistics for selenomethionine protein crystals.
Dataset SeMet 1 SeMet 2 SeMet 3 SeMet 4 Combined SeMet Ta6Br12 soak
Wavelength
0.979100 0.979100 1.258000
(A)
Figure imgf000068_0001
149.7, 150.5, 150.4, 150.3, 152.9,
Unit cell (A)
276.0, 90, 276.7, 90, 276.6, 90, 276.3, 90, 276.3, 90,
90, 90 90, 90 90, 90 90, 90 90, 90
Total reflections 375,898 (43,908) 907,911 (136,715) 917,907 (140,953) 2,030,940 (227,949) 4,226,680 (41,245) 2,584,590 (112,511)
Unique 238,027 234,490 237, 191 271,549 141 ,963 89,256 reflections 77*i (36,658) (37,874) (40,699) (5,905) (4,841)
Completeness 87.5 99.4 99.7 98.6 90.6
99.2 (84.4)
(%) A (75.8) (96.1) (98.4) (91.3) (94.8)
Multiplicity 1.6 (1.3) 3.9 (3.7) 3.9 (3.7) 7.5 (5.6) 29.8 (7) 29.0 (23.2)
9.12 5.91 6.35 10.67
Mean Ι/σ(Ι) 4.6 (1.1) 13.9 (1.0)
(1.64) (0.71) (0.75) (1. 8)
99.5 97.4 97.9 98.8 99.7
CC1 2 (%) 99.5 (47.2)
(74.8) (20.5) (23.2) (60.1) (38.8)
Anomalous correlation 22 (4) 12 (2) 13 (3) 17 (0) 13.8 (0) 52.5 (9.5)
Figure imgf000068_0002
Anomalous resolution¹ (Å) 11.1 6.7 6.6 5.5 5.0 5.68
Statistics for the highest-resolution shell are shown in parentheses.
¹ Anomalous resolution is defined as the point at which CCanom drops below 0.3.
Table 8. Data collection and scattering derived parameters for YenB/YenC2 SAXS.
Data-collection parameters
Instrument Australian Synchrotron SAXS/WAXS beamline
Beam geometry point
Wavelength (Å) 1.12713
q range (Å⁻¹) 0.009-0.614
Exposure time 2 s
Concentration range (mg/ml) 5-0.15
Figure imgf000069_0001
Molecular mass determination
Partial specific volume (cm³/g) 0.7425
Contrast (Δρ × 10¹⁰ cm⁻²) 2.1
Molecular mass [from I(0)] (kDa) 261.5
Molecular mass [from SAXS-MoW] (kDa) 253.8
Calculated monomeric Mr from sequence (kDa) 276.3
Software employed
Primary data reduction ScatterBrain
Data processing GNOM
Ab initio analysis DAMMIF & DAMMIN
Validation and averaging DAMAVER
Three-dimensional graphics representations PyMOL
data reported for 5 mg/ml concentration.
Table 9. Concentration dependence of SAXS data for YenB/YenC2.
Concentration (mg/ml) | Guinier range | Rg (Å) | Rg standard deviation (%) | I(0)/concentration
5 5-18 41.8 0 0.189
3-17 42.7 0 0.193
1.25 7-17 4.9 n 0 0.176
0.31 7-17 42.9 1 0.161
0.15 16-38 41.8 1 0.147
Table 10. Data collection and scattering-derived parameters for YenB/YenC2-N SAXS.
Data-collection parameters
Instrument Australian Synchrotron SAXS/WAXS beamline
Beam geometry point
Wavelength (Å) 1.12713
q range (Å⁻¹) 0.004-0.255
Exposure time 1 s
Concentration range (mg/ml) 1-0.016
Temperature (K) 291
Structural parameters*
I(0) (cm⁻¹) [from P(r)] 0.20
Rg (Å) [from P(r)] 44.1
I(0) (cm⁻¹) [from Guinier] 0.20
Rg (Å) [from Guinier] 44.2
Dmax (Å) 138
Porod volume estimate (Å³) 365,000
Molecular mass determination*
Partial specific volume (cm³/g) 0.7425
Contrast (Δρ × 10¹⁰ cm⁻²) 2.1
Molecular mass [from I(0)] (kDa) 277.0
Molecular mass [from SAXS-MoW] (kDa) 257.6
Calculated monomeric Mr from sequence (kDa) 243.3
Software employed
Primary data reduction ScatterBrain
Data processing GNOM
Ab initio analysis DAMMIF & DAMMIN
Validation and averaging DAMAVER
Computation of model intensities CRYSOL
Three-dimensional graphics representations PyMOL
* Data reported for 1 mg/ml concentration.
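The Guinier-derived Rg and I(0) values in the SAXS tables rest on the standard Guinier approximation, ln I(q) = ln I(0) - (Rg²/3)·q², fitted as a straight line at low q. The sketch below illustrates that fit on synthetic, noise-free data generated from the Table 10 values (Rg = 44.2 Å, I(0) = 0.20 cm⁻¹). It is a plain least-squares illustration, not the ScatterBrain or GNOM processing used for the reported data, and the q·Rg ≤ 1.3 cut-off is the usual rule of thumb for globular particles rather than a value taken from this document.

```python
import math


def guinier_fit(q_values, intensities):
    """Fit ln I = ln I(0) - (Rg^2 / 3) * q^2 by least squares.

    Returns (Rg, I0). Raises ValueError if the fitted slope is not negative.
    """
    x = [q * q for q in q_values]
    y = [math.log(i) for i in intensities]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("Guinier slope must be negative")
    return math.sqrt(-3.0 * slope), math.exp(intercept)


# Synthetic data (assumption): ideal Guinier behaviour, no noise.
RG_TRUE = 44.2      # angstroms, Guinier Rg from Table 10
I0_TRUE = 0.20      # cm^-1, Guinier I(0) from Table 10
Q_MIN = 0.004       # inverse angstroms, lower limit of the q range in Table 10
Q_MAX = 1.3 / RG_TRUE

q = [Q_MIN + k * (Q_MAX - Q_MIN) / 19 for k in range(20)]
intensity = [I0_TRUE * math.exp(-((qi * RG_TRUE) ** 2) / 3.0) for qi in q]

rg_fit, i0_fit = guinier_fit(q, intensity)
print(f"Rg = {rg_fit:.2f} angstroms, I(0) = {i0_fit:.3f} cm^-1")
```

With real data the fit range would be chosen per concentration, as the Guinier range columns of Tables 9 and 11 indicate.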
Table 11. Concentration dependence of SAXS data for YenB/YenC2-N.
Concentration (mg/ml) | Guinier range | Rg (Å) | Rg standard deviation | I(0)/concentration
1 20-40 44.20 3% 0.200
0.5 24-45 43.87 11% 0.216
0.25 21-45 44.06 10% 0.216
0.125 8-30 46.95 10% 0.208
0.063 18-43 43.95 7% 0.190
0.031 5-41 45.82 9% 0.161
0.016 15-43 44.71 56% 0.125
1. Bowen, D. et al. Insecticidal toxins from the bacterium Photorhabdus luminescens. Science 280, 2129-2132 (1998).
2. ffrench-Constant, R. H. & Waterfield, N. R. Ground control for insect pests. Nat. Biotechnol. 24, 660-661 (2006).
3. Hares, M. C. et al. The Yersinia pseudotuberculosis and Yersinia pestis toxin complex is active against cultured mammalian cells. Microbiology 154, 3503-3517 (2008).
4. Waterfield, N., Hares, M., Yang, G., Dowling, A. & ffrench-Constant, R. Potentiation and cellular phenotypes of the insecticidal Toxin complexes of Photorhabdus bacteria. Cell. Microbiol. 7, 373-382 (2005).
5. Lang, A. E. et al. Photorhabdus luminescens Toxins ADP-Ribosylate Actin and RhoA to Force Actin Clustering. Science 327, 1139-1142 (2010).
6. Landsberg, M. J. et al. 3D structure of the Yersinia entomophaga toxin complex and implications for insecticidal activity. Proc. Natl. Acad. Sci. USA 108, 20544-20549 (2011).
7. Gatsogiannis, C. et al. A syringe-like injection mechanism in Photorhabdus luminescens toxins. Nature 495, 520-523 (2013).
8. Hill, C. W., Sandt, C. H. & Vlazny, D. A. Rhs elements of Escherichia coli: a family of genetic composites each encoding a large mosaic protein. Mol. Microbiol. 12, 865-871 (1994).
9. Minet, A. D., Rubin, B. P., Tucker, R. P., Baumgartner, S. & Chiquet-Ehrismann, R. Teneurin-1, a vertebrate homologue of the Drosophila pair-rule gene ten-m, is a neuronal protein with a novel type of heparin-binding domain. J. Cell Sci. 112, 2019-2032 (1999).
10. Hurst, M. R. H., Jones, S. A., Binglin, T., Harper, L. A. & Glare, T. R. The main virulence determinant of Yersinia entomophaga MH96 is a broad host range insect active, Toxin Complex. J. Bacteriol. (2011).
11. Zhang, D., de Souza, R. F., Anantharaman, V., Iyer, L. M. & Aravind, L. Polymorphic toxin systems: Comprehensive characterization of trafficking modes, processing, mechanisms of action, immunity and ecology using comparative genomics. Biol. Direct 7, 18 (2012).
12. Buetow, L., Flatau, G., Chiu, K., Boquet, P. & Ghosh, P. Structure of the Rho-activating domain of Escherichia coli cytotoxic necrotizing factor 1. Nat. Struct. Biol. 8, 584-588 (2001).
13. Iyer, L. M., Zhang, D., Rogozin, I. B. & Aravind, L. Evolution of the deaminase fold and multiple origins of eukaryotic editing and mutagenic nucleic acid deaminases from bacterial toxin systems. Nucleic Acids Res. 39, 9473-9497 (2011).
14. Jackson, A. P., Thomas, G. H., Parkhill, J. & Thomson, N. R. Evolutionary diversification of an ancient gene family (rhs) through C-terminal displacement. BMC Genomics 10, 584 (2009).
15. Wang, Y. D., Zhao, S. & Hill, C. W. Rhs elements comprise three subfamilies which diverged prior to acquisition by Escherichia coli. J. Bacteriol. 180, 4102-4110 (1998).
16. Mosca, T. J., Hong, W., Dani, V. S., Favaloro, V. & Luo, L. Trans-synaptic Teneurin signalling in neuromuscular synapse organization and target choice. Nature 484, 237-241 (2012).
17. Hong, W., Mosca, T. J. & Luo, L. Teneurins instruct synaptic partner matching in an olfactory map. Nature 484, 201-207 (2012).
18. Feng, K. et al. All four members of the Ten-m/Odz family of transmembrane proteins form dimers. J. Biol. Chem. 277, 26128-26135 (2002).
19. Chand, D. et al. C-terminal processing of the teneurin proteins: Independent actions of a teneurin C-terminal associated peptide in hippocampal cells. Mol. Cell. Neurosci. 52, 38-50 (2013).
20. Tucker, R. P. & Chiquet-Ehrismann, R. Teneurins: a conserved family of transmembrane proteins involved in intercellular signaling during development. Dev. Biol. 290, 237-245 (2006).
21. Xu, Z., Horwich, A. L. & Sigler, P. B. The crystal structure of the asymmetric GroEL-GroES-(ADP)7 chaperonin complex. Nature 388, 741-750 (1997).
22. McPhillips, T. M. et al. Blu-Ice and the Distributed Control System: software for data acquisition and instrument control at macromolecular crystallography beamlines. J. Synchrotron Radiat. 9, 401-406 (2002).
23. Kabsch, W. XDS. Acta Crystallogr. D 66, 125-132 (2010).
24. The CCP4 suite: programs for protein crystallography. Acta Crystallogr. D 50, 760-763 (1994).
25. Panjikar, S., Parthasarathy, V., Lamzin, V. S., Weiss, M. S. & Tucker, P. A. Auto-Rickshaw: an automated crystal structure determination platform as an efficient tool for the validation of an X-ray diffraction experiment. Acta Crystallogr. D 61, 449-457 (2005).
26. Panjikar, S., Parthasarathy, V., Lamzin, V. S., Weiss, M. S. & Tucker, P. A. On the combination of molecular replacement and single-wavelength anomalous diffraction phasing for automated structure determination. Acta Crystallogr. D 65, 1089-1097 (2009).
27. Adams, P. D. et al. PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D 66, 213-221 (2010).
28. Schrödinger, LLC. The PyMOL Molecular Graphics System, Version 1.3r1. (2010).
29. Pettersen, E. F. et al. UCSF Chimera - a visualization system for exploratory research and analysis. J. Comput. Chem. 25, 1605-1612 (2004).
30. Orthaber, D., Bergmann, A. & Glatter, O. SAXS experiments on absolute scale with Kratky systems using water as a secondary standard. J. Appl. Crystallogr. 33, 218-225 (2000).
31. Franke, D. & Svergun, D. I. DAMMIF, a program for rapid ab-initio shape determination in small-angle scattering. J. Appl. Crystallogr. 42, 342-346 (2009).
32. Volkov, V. V. & Svergun, D. I. Uniqueness of ab initio shape determination in small-angle scattering. J. Appl. Crystallogr. 36, 860-864 (2003).
33. Svergun, D. I. Restoring low resolution structure of biological macromolecules from solution scattering using simulated annealing. Biophys. J. 76, 2879-2886 (1999).
34. Svergun, D., Barberato, C. & Koch, M. H. J. CRYSOL - a program to evaluate X-ray solution scattering of biological macromolecules from atomic coordinates. J. Appl. Crystallogr. 28, 768-773 (1995).
35. Busby, J. N. et al. Structural Analysis of Chi1 Chitinase from Yen-Tc: The Multisubunit Insecticidal ABC Toxin Complex of Yersinia entomophaga. J. Mol. Biol. 415, 359-371 (2012).
Example 2: Demonstrating activity by expressing toxin proteins in E. coli and feeding to insects
Cloning and expression
The encapsulated proteins of the invention can be expressed using commercially available non-conjugative vectors such as pET in E. coli (GATEWAY® technology, Invitrogen).
Once the gene expression has been established, by methods well known to those skilled in the art, the transformed E. coli (as a bacterial cell in broth culture) can be used in a standard bioassay against, for example, diamondback moth (DBM) and other insects, to demonstrate insecticidal activity.
The kit uses Transform One Shot® Chemically Competent E. coli. The pET vectors carry a bacteriophage T7 promoter, transcription and translation signals. T7 RNA polymerase is provided by the host cells.
Bioassay methods
Diamondback moth, Plutella xylostella (Lepidoptera: Plutellidae), cabbage white butterfly, Pieris rapae (Lepidoptera: Pieridae) and cabbage looper moth, Trichoplusia ni (Lepidoptera: Noctuidae)
Diamondback moth larvae can be reared on brassica (cabbage plants), or the strains resistant to Cry1A and Cry1C and a susceptible (G88) strain can be obtained and tested at the New York State Agricultural Experiment Station, College of Agriculture and Life Sciences at Cornell University, located in Geneva, NY, USA. Cabbage white butterfly larvae can be field-collected.
The 2nd-3rd instar larvae can be used and placed on a 3 cm disc of cabbage leaf treated with 20 μl of transformed E. coli solution, or dipped in the solution. A wetting agent, Silwet L-77 (Momentive Performance Materials, New York, USA) or Triton X-100 (Rohm and Haas Co, Philadelphia, USA), is used at <0.05%. Each treatment can be replicated 3-5 times (3-50 larvae per treatment). Treated larvae remain on the cabbage leaf at 23°C 16hL:8hD (Lincoln) or at 2TC 16hL:8hD (USA) and are checked daily for dead larvae.
Example 3: Demonstrating activity in transgenic plants and in plant bioassay
Cloning
Genetic constructs used in the transformation protocol
Sequences for expressing the encapsulated proteins of the invention can be cloned into suitable constructs and vectors for transformation of plants as is well known by those skilled in the art, and disclosed herein.
For example, the plant transformation vector pHZBar is derived from pART27 (Gleave, 1992, Plant Mol Biol 20: 1203-1207). The pnos-nptII-nos3' selection cassette has been replaced by the CaMV35S-BAR-OCS3' selection cassette with the bar gene (which confers resistance to the herbicide ammonium glufosinate) expressed from the CaMV 35S promoter. Cloning of expression cassettes into this binary vector is facilitated by a unique NotI restriction site and selection of recombinants by blue/white screening for β-galactosidase.
The polynucleotide sequences encoding the encapsulated proteins of the invention can be cloned by standard techniques into pART7 downstream of the 35S promoter. A unique NotI fragment can then be shuttled into pART27 (Gleave, 1992, Plant Mol Biol 20: 1203-1207) for transformation of various plant species. This binary vector contains the nptII selection gene for kanamycin resistance under the control of the CaMV 35S promoter. Genetic constructs in pART27 can be transferred into Agrobacterium tumefaciens strain GV3101 or EHA105 as plasmid DNA using the freeze-thaw transformation method (Ditta et al., 1980, Proc. Natl. Acad. Sci. USA 77: 7347-7351). The structure of the constructs maintained in Agrobacterium can be confirmed by restriction digest of plasmid DNA prepared from bacterial culture.
Agrobacterium cultures can be prepared in glycerol and transferred to -80°C for long term storage. Genetic constructs maintained in Agrobacterium strain GV3101 can be inoculated into 25 mL of MGL broth containing spectinomycin at a concentration of 100 mg/L. Cultures can be grown overnight (16 hours) on a rotary shaker (200 rpm) at 28°C. Bacterial cultures can be harvested by centrifugation (3000 x g, 10 minutes). The supernatant is removed and the cells resuspended in a 5 mL solution of 10 mM MgSO4.
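The reagent quantities implied by these concentrations can be worked out as in the following Python sketch. It is illustrative only: the helper names are hypothetical, and the molar mass used for magnesium sulfate (about 120.37 g/mol) assumes the anhydrous salt, which the text does not specify.

```python
def mass_for_concentration_mg(conc_mg_per_l, volume_ml):
    """Mass (mg) of solute needed for a mass concentration in a given volume."""
    return conc_mg_per_l * volume_ml / 1000.0


def mass_for_molarity_mg(conc_mm, volume_ml, molar_mass_g_per_mol):
    """Mass (mg) of solute needed for a millimolar concentration.

    mmol/L x L gives mmol; mmol x g/mol gives mg.
    """
    return conc_mm * (volume_ml / 1000.0) * molar_mass_g_per_mol


# Spectinomycin at 100 mg/L in a 25 mL MGL culture.
spectinomycin_mg = mass_for_concentration_mg(100.0, 25.0)

# 10 mM MgSO4 in 5 mL; 120.37 g/mol is the assumed anhydrous molar mass.
mgso4_mg = mass_for_molarity_mg(10.0, 5.0, 120.37)

print(f"spectinomycin: {spectinomycin_mg:.2f} mg, MgSO4: {mgso4_mg:.2f} mg")
```

If a hydrated salt such as the heptahydrate were used, its molar mass would need to be substituted.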
Plant transformation
Plants can be transformed to express the encapsulated proteins of the invention by numerous methods well known to those skilled in the art and disclosed herein.
Tobacco Transformation
Tobacco can be transformed via the leaf disk transformation-regeneration method (Horsch et al., 1985). Leaf disks from sterile wild type W38 tobacco plants are inoculated with an Agrobacterium tumefaciens strain containing the appropriate binary vector, and cultured for 3 days. The leaf disks are then transferred to MS selective medium containing 100 mg/L of kanamycin (or 5 mg/L of glufosinate) and 300 mg/L of cefotaxime. Shoot regeneration occurs over a month, and the leaf explants are placed on hormone free medium containing kanamycin or glufosinate for root formation.
Sorghum transformation
The constructs described above can also be used to transform Sorghum. A suitable protocol for transforming Sorghum is found in Howe et al., 2006, Plant Cell Reports, Volume 25, No. 8, 784-791.
Cotton transformation
The constructs described above can also be used to transform cotton. A suitable protocol for transforming cotton is found in US 5,846,797.
Wheat transformation
The constructs described above can also be used to transform wheat. A suitable protocol for transforming wheat is found in Ortiz et al., 1996, Plant Cell Rep. 15, 877.
Maize transformation
The constructs described above can also be used to transform maize. A suitable protocol for transforming maize is found in US 8,247,369.
Transformation of other plants
Transformation protocols for other plants are well-known to those skilled in the art, and are disclosed herein.
ELISA assay
ELISA analysis according to the method disclosed in U.S. Pat. No. 5,625,136 can be used for the quantitative determination of the level of the encapsulated proteins in transgenic plants, or parts thereof.
Bioassay
Various parts of the transformed plants, or whole plants can be used in standard insect bioassay procedures to test the activity of the encapsulated proteins of the invention, and the resistance of the transformed plants to various insects.
Numerous suitable methods are known to those skilled in the art and are described, for example, in the following US patents: US 8,247,369; US 8,216,806; US 8,173,872; US 8,034,997; US 7,858,849; US 7,803,993; US 7,655,838; and US 7,919,609; all of which are incorporated herein by reference.
Example 4: Demonstrating encapsulation of a foreign protein
The applicants have created versions of the TcB/TcC (BC) complex in which the C-terminal region of YenC2 (TcC) has been replaced with green fluorescent protein (GFP). Two versions of this construct have been created: the native-GFP version, in which GFP has a net negative charge, and a version containing a modified GFP with a net positive charge (GFP+6), mimicking the charge of the normal TcC.
Both constructs have been expressed in E. coli and purified by standard procedures (immobilized metal ion affinity chromatography [IMAC] and size exclusion chromatography [SEC]). The GFP protein is produced as a fusion of the N-terminal region of YenC2 (YenC2NTR) and GFP. This fusion protein was expected to self-cleave at the boundary between these two proteins, analogous to the cleavage that occurs in the native complex. This cleavage occurs with both the native GFP and GFP+6 variants, and the protein complex consisting of YenB, the N-terminal region of YenC2 (YenC2NTR), and GFP co-purifies and forms a single peak on size exclusion. This indicates that GFP is being encapsulated within the BC shell, in a similar manner to the native TcC. This protein complex does not fluoresce, suggesting that GFP is encapsulated in an unfolded state. After storage for several days, fluorescence was observed with the native-GFP-containing complex. When this was again subjected to SEC, a major peak consisting of all three complex proteins (YenB, YenC2NTR, GFP) was observed (Figure 16 A-B), which did not fluoresce (Figure 16 C). A smaller peak was also observed consisting of GFP alone, which did fluoresce (Figure 16 B-C). This indicates that there has been some slow leakage of GFP from the complex, at which point GFP folds and is able to fluoresce. This leakage occurred at a reduced rate in the positively-charged GFP+6 variant.
Materials and Methods
Yersinia entomophaga YenB and YenC2 were cloned into the pETDuet-1 co-expression vector using standard cloning techniques. Expression was performed in E. coli Rosetta2(DE3) cells using ZYM-5052 auto-induction medium (Studier, 2005). Freshly transformed cells were grown in 5-ml LB cultures overnight and used to inoculate 500-ml ZYM-5052 cultures in 2 litre baffled flasks. These were incubated at 37°C for 4 hours, followed by 18°C for 24 hours. Cultures were harvested by centrifugation at 4,680 RCF for 30 minutes and cell pellets were either frozen at -20°C or used immediately. Cell pellets were resuspended in his-0 buffer (20 mM HEPES pH 7.5, 150 mM NaCl, 1 mM 2-mercaptoethanol) with the addition of Roche Complete mini EDTA-free protease inhibitor tablets, according to the manufacturer's directions. Cells were lysed by passage through a continuous-flow cell disruptor (Microfluidics microfluidizer M-110P) at a pressure of 18 MPa. Cell lysate was clarified by centrifugation at 27,000 RCF for 30 minutes followed by filtration. The protein complex was purified by IMAC using a 5-ml Talon HiTrap column. The protein complex was washed with his-0 buffer and eluted with his-150 buffer (identical to his-0 with the addition of 150 mM imidazole). This eluted fraction was concentrated and dialysed against his-0 buffer overnight at 4°C with the addition of TEV protease (Blommel & Fox, 2007) to remove the his-tag. This protein was subsequently applied to the same Talon HiTrap column and the flow-through collected. This was then concentrated and applied to a HiLoad 16/60 Superdex 200 size exclusion column (GE) attached to an Akta prime FPLC system. His-0 buffer was pumped over the column at a rate of 1 ml/minute, and fractions were collected and analysed by SDS-PAGE. Analytical size exclusion was performed using a Superdex 200 10/300 analytical column (GE). GFP fluorescence was determined by illumination with blue light and observation under a yellow filter.
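The relationship between the two buffers (his-150 being his-0 plus 150 mM imidazole) can be written down compactly, as in the following Python sketch. The final concentrations are those given above; the stock concentrations, the one-litre batch size and the names `STOCKS_MM` and `stock_volumes_ml` are assumptions made purely for illustration, and pH adjustment is not modelled.

```python
# Final concentrations in mM, as stated in the Materials and Methods.
HIS_0 = {"HEPES pH 7.5": 20.0, "NaCl": 150.0, "2-mercaptoethanol": 1.0}
HIS_150 = {**HIS_0, "imidazole": 150.0}  # his-0 plus 150 mM imidazole

# Assumed (hypothetical) stock concentrations in mM; not from the source.
STOCKS_MM = {
    "HEPES pH 7.5": 1000.0,
    "NaCl": 5000.0,
    "2-mercaptoethanol": 14300.0,
    "imidazole": 2000.0,
}


def stock_volumes_ml(recipe_mm, final_volume_ml, stocks_mm=STOCKS_MM):
    """Volume of each stock (mL) from C1*V1 = C2*V2, plus water to volume."""
    volumes = {
        name: final_volume_ml * conc / stocks_mm[name]
        for name, conc in recipe_mm.items()
    }
    water = final_volume_ml - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute for this recipe")
    volumes["water"] = water
    return volumes


his_150_volumes = stock_volumes_ml(HIS_150, 1000.0)
for component, volume in his_150_volumes.items():
    print(f"{component}: {volume:.2f} mL")
```

Deriving `HIS_150` from `HIS_0` keeps the two recipes in step, so a change to the base buffer carries over to the elution buffer.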
References
Blommel, P. G., & Fox, B. G. (2007). A combined approach to improving large-scale production of tobacco etch virus protease. Protein Expression and Purification, 55(1), 53-68. doi:10.1016/j.pep.2007.04.013.
Studier, F. W. (2005). Protein production by auto-induction in high density shaking cultures. Protein Expression and Purification, 41(1), 207-234.

Claims

CLAIMS:
1. A method for encapsulating a protein of interest, the method comprising the step of expressing a fusion protein comprising an N-terminal region of a rearrangement hot spot (RHS)-repeat-containing protein fused to the protein of interest.
2. The method of claim 1 wherein the fusion protein is expressed in a cell.
3. The method of claim 1 or 2 wherein upon expression and folding of the fusion protein, the protein of interest is cleaved from the N-terminal region of the RHS-repeat-containing protein.
4. The method of claim 3 wherein cleavage is affected by the action of a protease intrinsic to the N-terminal region of the RHS-repeat containing protein.
5. The method of claim 4 wherein the protease is an aspartate protease.
6. The method of any one of claims 1 to 5 wherein the protein of interest is encapsulated by a shell comprising the N-terminal region of the RHS-repeat-containing protein.
7. The method of claim 6 wherein the shell is formed from one long strip of β-sheet, or β-sheets, that wraps around a central cavity.
8. The method of claim 6 or 7 wherein the β-sheet is composed of at least 50 β-strands derived from the N-terminal region of the RHS-repeat-containing protein.
9. The method of any one of claims 6 to 8 wherein the shell is closed at both ends.
10. The method of any one of claims 1 to 9 wherein the protein of interest is one that is normally naturally associated with the N-terminal region of the RHS-repeat-containing protein.
11. The method of any one of claims 1 to 9 wherein the protein of interest is one that is not normally naturally associated with the N-terminal region of the RHS-repeat-containing protein.
12. The method of any one of claims 1 to 11 wherein the protein of interest has a molecular weight of less than 103kDa.
13. The method of any one of claims 1 to 12 wherein the RHS-repeat-containing protein is selected from a toxin complex C (TcC) component of a bacterial toxin complex, a non-toxin complex RHS-repeat containing protein, and a YD-repeat containing protein.
14. The method of any one of claims 1 to 13 wherein the RHS-containing protein is a toxin complex C (TcC) component of a bacterial toxin complex.
15. The method of any one of claims 1 to 14 wherein a fusion protein comprising the N-terminal region of the toxin complex C (TcC) component is co-expressed with a toxin complex B (TcB) component of a bacterial toxin complex.
16. The method of claims 14 to 15 wherein the protein of interest has a molecular weight of less than 40kDa.
17. The method of any one of claims 14 to 16 wherein the protein of interest is cleaved from the N-terminal region of the TcC component upon formation of a complex between the TcB and the fusion protein.
18. The method of any one of claims 14 to 17 wherein the protein of interest is encapsulated by a shell formed by a complex of the TcB component and the N-terminal region of the TcC.
19. The method of claim 18 wherein the shell is formed from one long strip of β-sheet, or β-sheets, that wraps around a central cavity, wherein the β-sheet is composed of at least 50 β-strands derived from the TcB component and the N-terminal region of TcC.
20. An encapsulated protein of interest produced by the method of any one of claims 1 to 19.
21. A protein of interest encapsulated by a shell formed by the N-terminal region of the RHS-repeat-containing protein.
22. The protein of claim 21 that is encapsulated by a shell formed by a complex of the TcB component and the N-terminal region of the TcC component.
23. A cell comprising the encapsulated protein of any one of claims 20 to 22.
24. A composition comprising an encapsulated protein of any one of claims 20 to 22.
25. The composition of claim 24 that is an insecticidal composition or a pharmaceutical composition.
26. The method, encapsulated protein, cell or composition of any preceding claim, wherein the encapsulated protein is releasable, or can be released, from the shell.
27. The method, encapsulated protein, cell or composition of any preceding claim, wherein the encapsulated protein is releasable, can be released, or is released, from the shell in certain conditions.
28. The method, encapsulated protein, cell or composition of any preceding claim, wherein the encapsulated protein is releasable, can be released, or is released, from the shell by lowering the pH of the environment surrounding the encapsulated protein.
29. The method, encapsulated protein, cell or composition of any preceding claim, wherein protein is releasable, can be released, or is released, from the shell by introducing the encapsulated protein into a low pH environment.
30. A method of controlled release of a protein of interest, the method comprising placing an encapsulated protein of any one of claims 20 to 22 into an appropriate environment that affects release of the protein of interest.
31. The method of claim 30 wherein the appropriate environment is a low pH environment.
32. The method of claim 30 or 31 wherein the protein of interest is released by a conformational change in the shell encapsulating the protein of interest.
33. The method of claim 32 wherein conformational change is opening of the shell resulting in release of the protein of interest.
34. A method of delivering a protein of interest to a low pH environment, the method comprising delivering an encapsulated protein of any one of claims 20 to 22 to the low pH environment, and release of the encapsulated protein into the low pH environment.
35. The method of claim 34 wherein the low pH environment has a pH of less than 5.5.
36. The method of claim 34 wherein the low pH environment is the endosome of a cell.
37. The method of any one of claims 34 to 36 wherein the low pH environment affects release of the encapsulated protein from the shell to deliver the protein of interest into the low pH environment.
38. The method of claim 36 wherein the low pH environment triggers delivery of the protein of interest into the cytosol of the cell.
39. A method of delivering a protein of interest into a cell, the method comprising contacting the cell with an encapsulated protein of any one of claims 20 to 22.
40. A method of controlling a pest, the method comprising contacting an encapsulated protein of any one of claims 20 to 22 with the pest.
41. The method of claim 40 wherein the pest is a pest of a plant.
42. The method of claim 40 or 41 wherein the pest is an insect.
43. The method of claim 42 wherein the protein of interest is a protein that is toxic to the insect.
44. The method of any one of claims 40 to 43 wherein the encapsulated protein is produced in the plant by expressing in the plant a fusion protein comprising an N-terminal region of a rearrangement hot spot (RHS)-repeat-containing protein fused to the protein of interest.
45. The method of claim 44 wherein the encapsulated protein is produced in the plant by co-expressing in the plant:
a) a toxin complex B (TcB) component of a bacterial toxin complex, and
b) a fusion protein comprising an N-terminal region of a toxin complex C (TcC) component of a bacterial toxin complex fused to the protein of interest.
46. The method of claim 45 comprising the additional step of co-expression of a TcA component of a bacterial toxin complex in the plant.
47. The method of any one of claims 44 to 46 wherein the pest is contacted by the encapsulated protein produced in the plant.
48. The method of claim 47 wherein the pest is contacted when it ingests the encapsulated protein.
49. A method for producing an insect resistant plant, the method comprising expressing in the plant a fusion protein comprising an N-terminal region of a rearrangement hot spot (RHS)-repeat-containing protein fused to the protein of interest.
50. The method of claim 49 comprising co-expressing in the plant:
a) a toxin complex B (TcB) component of a bacterial toxin complex, and
b) a fusion protein comprising an N-terminal region of a toxin complex C (TcC) component of a bacterial toxin complex fused to the protein of interest.
51. The method of claim 49 or 50 comprising the additional step of co-expressing a TcA component of a bacterial toxin complex in the plant.
PCT/IB2014/060784 2013-04-19 2014-04-17 Methods and materials for encapsulating proteins WO2014170853A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US14/785,505 US10526378B2 (en) 2013-04-19 2014-04-17 Methods and materials for encapsulating proteins
AU2014255358A AU2014255358B2 (en) 2013-04-19 2014-04-17 Methods and materials for encapsulating proteins
EP14786060.5A EP2986725A4 (en) 2013-04-19 2014-04-17 Methods and materials for encapsulating proteins

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
NZ609662 2013-04-19
NZ60966213 2013-04-19

Publications (1)

Publication Number Publication Date
WO2014170853A1 true WO2014170853A1 (en) 2014-10-23

Family

ID=51730883

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2014/060784 WO2014170853A1 (en) 2013-04-19 2014-04-17 Methods and materials for encapsulating proteins

Country Status (4)

Country Link
US (1) US10526378B2 (en)
EP (1) EP2986725A4 (en)
AU (1) AU2014255358B2 (en)
WO (1) WO2014170853A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109258463B (en) * 2018-09-18 2022-08-02 广西壮族自治区林业科学研究院 Vegetative propagation method of paphiopedilum armeniacum

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060168683A1 (en) * 2004-03-02 2006-07-27 Dow Agrosciences Llc Insecticidal toxin complex fusion proteins
US20060205653A1 (en) * 2005-03-02 2006-09-14 Larrinua Ignacio M Sources for, and types of, insecticidally active proteins, and polynucleotides that encode the proteins

Family Cites Families (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4943674A (en) 1987-05-26 1990-07-24 Calgene, Inc. Fruit specific transcriptional factors
US5753475A (en) 1985-01-17 1998-05-19 Calgene, Inc. Methods and compositions for regulated transcription and expression of heterologous genes
US4795855A (en) 1985-11-14 1989-01-03 Joanne Fillatti Transformation and foreign gene expression with woody species
US5750871A (en) 1986-05-29 1998-05-12 Calgene, Inc. Transformation and foreign gene expression in Brassica species
US5188958A (en) 1986-05-29 1993-02-23 Calgene, Inc. Transformation and foreign gene expression in brassica species
US5177010A (en) 1986-06-30 1993-01-05 University Of Toledo Process for transforming corn and the products thereof
US5187073A (en) 1986-06-30 1993-02-16 The University Of Toledo Process for transforming gramineae and the products thereof
US5004863B2 (en) 1986-12-03 2000-10-17 Agracetus Genetic engineering of cotton plants and lines
US5416011A (en) 1988-07-22 1995-05-16 Monsanto Company Method for soybean transformation and regeneration
US5639952A (en) 1989-01-05 1997-06-17 Mycogen Plant Science, Inc. Dark and light regulated chlorophyll A/B binding protein promoter-regulatory system
US5086169A (en) 1989-04-20 1992-02-04 The Research Foundation Of State University Of New York Isolated pollen-specific promoter of corn
US5837848A (en) 1990-03-16 1998-11-17 Zeneca Limited Root-specific promoter
US5498830A (en) 1990-06-18 1996-03-12 Monsanto Company Decreased oil content in plant seeds
US5641664A (en) 1990-11-23 1997-06-24 Plant Genetic Systems, N.V. Process for transforming monocotyledonous plants
WO1994000977A1 (en) 1992-07-07 1994-01-20 Japan Tobacco Inc. Method of transforming monocotyledon
DK0651814T3 (en) 1992-07-09 1997-06-30 Pioneer Hi Bred Int Maize pollen-specific polygalacturonase gene
HUT70467A (en) 1992-07-27 1995-10-30 Pioneer Hi Bred Int An improved method of agrobactenium-mediated transformation of cultvred soyhean cells
ES2164759T3 (en) 1993-12-09 2002-03-01 Texas A & M Univ Sys TRANSFORMATION OF MUSA SPECIES USING AGROBACTERIUM TUMEFACIENS.
GB9421286D0 (en) 1994-10-21 1994-12-07 Danisco Promoter
US5536653A (en) 1994-11-04 1996-07-16 Monsanto Company Tomato fruit promoters
US5846797A (en) 1995-10-04 1998-12-08 Calgene, Inc. Cotton transformation
GB9606062D0 (en) 1996-03-22 1996-05-22 Zeneca Ltd Promoters
US6127179A (en) 1996-04-17 2000-10-03 Dellapenna; Dean Gene promoter for tomato fruit
DE19644478A1 (en) 1996-10-25 1998-04-30 Basf Ag Leaf-specific expression of genes in transgenic plants
US5981840A (en) 1997-01-24 1999-11-09 Pioneer Hi-Bred International, Inc. Methods for agrobacterium-mediated transformation
US5952543A (en) 1997-02-25 1999-09-14 Dna Plant Technology Corporation Genetically transformed pineapple plants and methods for their production
US6037522A (en) 1998-06-23 2000-03-14 Rhone-Poulenc Agro Agrobacterium-mediated transformation of monocots
US6342657B1 (en) 1999-05-06 2002-01-29 Rhone-Poulenc Agro Seed specific promoters
US7642346B2 (en) 1999-08-27 2010-01-05 Sembiosys Genetics Inc. Flax seed specific promoters
MXPA02007130A (en) 2000-01-21 2002-12-13 Pioneer Hi Bred Int Novel root preferred promoter elements and methods of use.
US7091399B2 (en) 2000-05-18 2006-08-15 Bayer Bioscience N.V. Transgenic plants expressing insecticidal proteins and methods of producing the same
AU2001291656A1 (en) 2000-06-30 2002-01-08 Willem Broekaert Gene silencing vector
US7214788B2 (en) 2000-09-12 2007-05-08 Monsanto Technology Llc Insect inhibitory Bacillus thuringiensis proteins, fusions, and methods of use therefor
CA2430642A1 (en) 2000-12-01 2003-02-20 John B. Ohlrogge Plant seed specific promoters
US20040067506A1 (en) 2000-12-04 2004-04-08 Ben Scheres Novel root specific promoter driving the expression of a novel lrr receptor-like kinase
EP1256629A1 (en) 2001-05-11 2002-11-13 Société des Produits Nestlé S.A. Leaf specifc gene promoter of coffee
US7230167B2 (en) 2001-08-31 2007-06-12 Syngenta Participations Ag Modified Cry3A toxins and nucleic acid sequences coding therefor
US7371928B2 (en) 2002-11-11 2008-05-13 Korea Chungang Educational Foundation Plant seed-specific expression promoter derived from sesame and seed-specific expression vector comprising the promoter
KR100537955B1 (en) 2003-10-29 2005-12-20 학교법인고려중앙학원 A solely pollen-specific promoter
EP1528104A1 (en) 2003-11-03 2005-05-04 Biogemma MEG1 endosperm-specific promoters and genes
WO2005100575A2 (en) 2004-04-14 2005-10-27 Bayer Bioscience N.V. Rice pollen-specific promoters and uses thereof
MY187548A (en) 2005-06-13 2021-09-29 Government Of Malaysia As Represented By The Mini Of Science Tehnology And Innovation Malaysia Leaf-specific chlorophyll a/b binding protein promoter
CN101268094B (en) 2005-08-31 2012-09-05 孟山都技术有限公司 Nucleotide sequences encoding insecticidal proteins
US7449552B2 (en) 2006-04-14 2008-11-11 Pioneer Hi-Bred International, Inc. Bacillus thuringiensis cry gene and protein
MX2009005901A (en) 2006-12-08 2009-06-19 Pioneer Hi Bred Int Novel bacillus thuringiensis crystal polypeptides, polynucleotides, and compositions thereof.
US8158777B2 (en) 2008-02-13 2012-04-17 Eastman Chemical Company Cellulose esters and their production in halogenated ionic liquids
AU2010313865B2 (en) 2009-10-30 2016-05-05 Agresearch Limited Modified oil encapsulating proteins and uses thereof

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
BUSBY, JN. ET AL.: "THE BC COMPONENT OF ABC TOXINS IS AN RHS-REPEAT-CONTAINING PROTEIN ENCAPSULATIONDEVICE", NATURE, vol. 501, 4 August 2013 (2013-08-04), pages 547 - 550, XP055287357 *
GATSOGIANNIS, C. ET AL.: "A syringe-like injection mechanism in Photorhabdus luminescens toxins", NATURE, vol. 495, 20 March 2013 (2013-03-20), pages 520 - 523, XP055287345 *
LANDSBERG, MJ. ET AL.: "3D structure of the Yersinia entomophaga toxin complex and implications for insecticidal activity", PROC. NATL. ACAD. SCI. USA., vol. 108, no. 51, 2011, pages 20544 - 20549, XP055287352 *
See also references of EP2986725A4 *
SHEETS, JJ. ET AL.: "Insecticidal toxin complex proteins from Xenorhabdus nematophilus - structure and pore formation", J. BIOL. CHEM., vol. 286, no. 26, 2011, pages 22742 - 22749, XP055287350 *

Also Published As

Publication number Publication date
AU2014255358A1 (en) 2015-11-05
EP2986725A1 (en) 2016-02-24
AU2014255358B2 (en) 2019-12-05
US10526378B2 (en) 2020-01-07
US20160075743A1 (en) 2016-03-17
EP2986725A4 (en) 2016-11-09

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14786060

Country of ref document: EP

Kind code of ref document: A1

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 14785505

Country of ref document: US

ENP Entry into the national phase

Ref document number: 2014255358

Country of ref document: AU

Date of ref document: 20140417

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 2014786060

Country of ref document: EP