WO2023220628A1

WO2023220628A1 - Resilin-silica binding domain fusion proteins for biomaterial formation

Info

Publication number: WO2023220628A1
Application number: PCT/US2023/066821
Authority: WO
Inventors: Guohong MAO; Phillip James HUNT; Oliver YU
Original assignee: Conagen Inc.
Priority date: 2022-05-11
Filing date: 2023-05-10
Publication date: 2023-11-16

Abstract

The present application relates, at least in part, to fusion proteins comprising (i) a resilin protein and (ii) a silicon oxide-binding domain. Also included are vectors, cell systems, and methods for making the fusion proteins, as well as composite materials comprising same.

Description

RESILIN-SILICA BINDING DOMAIN FUSION PROTEINS FOR BIOMATERIAL FORMATION

RELATED APPLICATIONS

This application claims the benefit under 35 U.S.C. § 119(e) to U.S. Provisional Application No. 63/340,864, entitled “RESILIN-SILICA BINDING DOMAIN FUSION PROTEINS FOR BIOMATERIAL FORMATION”, filed on May 11, 2022, and U.S. Provisional Application No. 63/394,997, entitled “RESILIN-SILICA BINDING DOMAIN FUSION PROTEINS FOR BIOMATERIAL FORMATION”, filed on August 04, 2022; the contents of each of which are incorporated herein by reference in their entirety.

REFERENCE TO AN ELECTRONIC SEQUENCE LISTING

The contents of the electronic sequence listing (C149770090WO00-SEQ-VLJ.xml; Size: 28,420 bytes; and Date of Creation: May 2, 2023) are herein incorporated by reference in their entirety.

BACKGROUND

Resilin is a rubber-like protein which occurs in certain regions of the insect cuticle and is the most efficient elastic material known. The elastic efficiency of the material is purported to be 97%; only 3% of stored energy is lost as heat. It confers long range elasticity to the cuticle and functions as both an energy store and as a damper of vibrations in insect flight systems. It is also used in the jumping mechanisms of fleas and grasshoppers.

The first description of resilin was by Weis-Fogh (1960). This was of elastic ligaments associated with the wings of the locust and elastic tendons in the flight musculature of the dragonfly. Resilin displays extraordinary elasticity (Weis-Fogh, 1960). The elastic tendon from dragonflies can be stretched to over three times its original unstrained length without breaking and it returns immediately to its original length when the strain is released. No lasting deformations are present even after the sample has been kept in the stretched condition for weeks on end (Weis-Fogh, 1961a, 1961b).

Resilin has been found in the jumping mechanism of fleas (Bennet-Clark and Lucey, 1967; Neville and Rothschild, 1967) and in a number of other insect structures and in some crustaceans (Andersen and Weis-Fogh, 1964). It has been found in all insects investigated and also in crustaceans such as crayfish (Astacus fluviatilis) (Andersen and Weis-Fogh, 1964), but appears to be absent from arachnids. Resilin has been found in the sound-producing organs of some insects, including cicadas (Young and Bennet-Clark, 1995) and moths (Skals and Surlykke, 1999). Resilin has also been found in some cuticular structures which are stretchable but possess no long-range elasticity, such as the abdominal wall of physogastric termite queens (Varman, 1980) and some ants (Varman, 1981).

The two most outstanding properties of resilin are its elasticity and its insolubility. It is insoluble in water below 140 °C. In many solvents, resilin swells considerably, especially in protein solvents such as phenol, formamide, formic acid. Resilin also swells without going into solution in concentrated solutions of lithium thiocyanate and cupric ethylenediamine, solvents which are able to dissolve silk fibroins and cellulose. When resilin is placed in methanol, ethanol or acetone, it shrinks to a hard glassy substance as when dried in air. When placed back in water, it swells to its original size with no noticeable change in its elastic properties (Weis-Fogh, 1960).

The elastic properties of resilin are consistent with the requirements of polymer elasticity: the cross-linked molecules must be flexible and conformationally free. There are two theories to explain elastic behavior of materials. The first is the so-called “rubber theory”, which attributes rubber-like properties to a decrease in conformational entropy on deforming a network of kinetically free, random polymer molecules. The second is the theory of Urry and co-workers (Urry, 1988; Urry et al., 1995), which proposes that the elastic mechanism arises from the β-spiral structure. Resilin and abductin behave as entropic elastomers, returning almost all of the energy stored in deformation. However, abductin has low proline content with no predicted β-turns and hence no β-spiral. The amino acid composition of resilin is more like that of elastin, with high proline, glycine and alanine content. Nevertheless, the sequences do not show similarities in alignment and appear to be unrelated on an evolutionary basis.

An important property of resilin is the cross-linked nature of the insoluble resilin. This has been shown to be due to tyrosine cross-linking resulting in the formation of dityrosine moieties (Andersen, 1964; 1966). The precursors of resilin are probably soluble, non-cross-linked peptide chains, which are secreted from the apical surface of the epidermal cells into the subcuticular space, where they are rapidly cross-linked to form a three-dimensional, easily deformable protein network. Elvin et al. (Nature (2005) 437: 999-1002) successfully expressed and polymerized a synthetic, resilin functional fragment gene in E. coli. The synthetic gene consists of the 17 repeats of the native gene. The protein, once expressed, undergoes photochemical crosslinking which casts it into a rubber-like biomaterial. Methods for synthesizing bioelastomers by cross-linking pro-resilin fragments are described in International Application No. WO 2004/104042.

Resilin is found in specialized regions of the cuticle of most insects, providing low stiffness, high strain and efficient energy storage; it is best known for its roles in insect flight and the remarkable jumping ability of fleas and spittle bugs. Previously, the Drosophila melanogaster CG15920 gene was identified as one encoding a resilin-like protein. The first exon (exon-1) of the Drosophila CG15920 gene encodes a soluble protein (SEQ ID NO: 1, hereinafter referred to as “RE”), which can be cast into rubber-like biomaterial by rapid photochemical crosslinking.

SUMMARY

In a first aspect, provided herein is a fusion protein comprising (i) a resilin protein and (ii) a silicon oxide-binding domain. In representative embodiments, the silicon oxide-binding domain comprises an amino acid sequence selected from the group consisting of: SEQ ID NO: 2; SEQ ID NO: 3; SEQ ID NO: 4; SEQ ID NO: 5; SEQ ID NO: 6; and SEQ ID NO: 7. The fusion protein may also comprise a second silicon oxide-binding domain. The second silicon oxide-binding domain may comprise an amino acid sequence selected from the group consisting of: SEQ ID NO: 2; SEQ ID NO: 3; SEQ ID NO: 4; SEQ ID NO: 5; SEQ ID NO: 6; and SEQ ID NO: 7. In one embodiment, the first silicon oxide-binding domain is linked to the amino-terminus of the resilin protein and the second silicon oxide-binding domain is linked to the carboxy-terminus of the resilin protein. The resilin protein may be a full-length or a functional fragment of native resilin. Example native resilins include those from Drosophila sechellia, Acromyrmex echinatior, a dragonfly of the genus Aeshna, Haematobia irritans, Ctenocephalides felis, Bombus terrestris, Tribolium castaneum, Apis mellifera, Nasonia vitripennis, Pediculus humanus corporis, Anopheles gambiae, Glossina morsitans, Atta cephalotes, Anopheles darlingi, Acyrthosiphon pisum, Drosophila virilis, Drosophila erecta, Lutzomyia longipalpis, Rhodnius prolixus, Solenopsis invicta, Culex quinquefasciatus, Bactrocera cucurbitae, and Trichogramma pretiosum. In one exemplary embodiment, the resilin protein comprises an amino acid sequence of SEQ ID NO: 1. In a set of non-limiting embodiments, the fusion protein comprises an amino acid sequence selected from the group consisting of SEQ ID NO: 8; SEQ ID NO: 10; SEQ ID NO: 12; SEQ ID NO: 14; SEQ ID NO: 16; SEQ ID NO: 18; and SEQ ID NO: 20. In one embodiment, the silicon oxide-binding domain is directly linked to the resilin protein, for example via a linker that is not a silicon oxide-binding domain.

In a second aspect, there is provided a vector comprising a polynucleotide encoding a fusion protein according to the aforesaid first aspect. The polynucleotide may be operably linked to a heterologous regulatory element. In representative embodiments, the polynucleotide comprises a nucleic acid sequence selected from the group consisting of: SEQ ID NO: 9; SEQ ID NO: 11; SEQ ID NO: 13; SEQ ID NO: 15; SEQ ID NO: 17; SEQ ID NO: 19, and SEQ ID NO: 21.

In a third aspect, there is provided a host cell transformed with the recombinant vector according to the aforesaid second aspect.

In a fourth aspect, the application provides a method for producing a fusion protein comprising (i) a resilin protein and (ii) a silicon oxide-binding domain. The method comprises culturing the host cell of the aforesaid third aspect in a medium under conditions that result in producing the fusion protein.

In a fifth aspect, provided herein is a composite material comprising the fusion protein according to the aforesaid first aspect and a silicon oxide-containing substance. In a representative embodiment, the silicon oxide-containing substance is silica. The fusion protein may be crosslinked, for example through intermolecular dityrosine bond formations.

Other features and advantages of the present application will become apparent in the following detailed description, taken with reference to the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings are not intended to be drawn to scale. In the drawings, each identical or nearly identical component that is illustrated in various figures is represented by a like numeral. For purposes of clarity, not every component may be labeled in every drawing. In the drawings:

FIG. 1 is a diagram of chimeric resilin-silica binding domain fusion proteins.

FIG. 2 illustrates the binding of chimeric proteins to silica gel.

FIG. 3 illustrates the formation of resilin-silica binding domain fusion proteins with or without silica particles. REH: RE-H only; REH + silica: RE-H protein mix with silica particles; RSB1: RSB1 protein only; RSB1 + silica: RSB1 protein mix with silica particles.

FIG. 4 illustrates hydrogel formation from resilin-silica binding domain fusion proteins and silica particles.

DEFINITIONS

As used herein, the singular forms “a,” “an” and “the” include plural references unless the content clearly dictates otherwise.

To the extent that the term “include,” “have,” or the like is used in the description or the claims, such term is intended to be inclusive in a manner similar to the term “comprise” as “comprise” is interpreted when employed as a transitional word in a claim.

The word “exemplary” is used herein to mean serving as an example, instance, or illustration. Any embodiment described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other embodiments.

“Cellular system” is any cells that provide for the expression of proteins. It includes bacteria, yeast, plant cells and animal cells. It includes both prokaryotic and eukaryotic cells. It also includes the in vitro expression of proteins based on cellular components, such as ribosomes.

“Coding sequence” is to be given its ordinary and customary meaning to a person of ordinary skill in the art and is used without limitation to refer to a DNA sequence that encodes a specific amino acid sequence.

“Growing” or “cultivating” a cellular system includes providing an appropriate medium that would allow cells to multiply and divide. It also includes providing resources so that cells or cellular components can translate and make recombinant proteins.

“Yeasts” are eukaryotic, single-celled microorganisms classified as members of the fungus kingdom. Yeasts are unicellular organisms which evolved from multicellular ancestors; some species useful for the current invention have the ability to develop multicellular characteristics by forming strings of connected budding cells known as pseudohyphae or false hyphae.

The term “complementary” is to be given its ordinary and customary meaning to a person of ordinary skill in the art and is used without limitation to describe the relationship between nucleotide bases that are capable of hybridizing to one another. For example, with respect to DNA, adenine is complementary to thymine and cytosine is complementary to guanine. Accordingly, the subject technology also includes isolated nucleic acid fragments that are complementary to the complete sequences as reported in the accompanying Sequence Listing as well as those substantially similar nucleic acid sequences.
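By way of non-limiting illustration, the base-pairing relationship described above may be sketched in Python as follows. The names `DNA_COMPLEMENT`, `reverse_complement`, and `are_complementary` are assumptions introduced here for illustration only, and the sketch considers only the four unmodified DNA bases.

```python
# Watson-Crick pairing for DNA: A pairs with T, and C pairs with G.
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def are_complementary(strand_a: str, strand_b: str) -> bool:
    """True if strand_b is the exact reverse complement of strand_a."""
    return reverse_complement(strand_a) == strand_b.upper()
```

A fragment such as `ATGC` is complementary, in this sense, to `GCAT` when both are written in the 5' to 3' direction.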

The terms "nucleic acid" and "nucleotide" are to be given their respective ordinary and customary meanings to a person of ordinary skill in the art and are used without limitation to refer to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally-occurring nucleotides. In any one of the embodiments provided herein, a particular nucleic acid sequence can also encompass conservatively modified or degenerate variants thereof (e.g., degenerate codon substitutions) and complementary sequences, as well as the sequence explicitly indicated.

The term "isolated" is to be given its ordinary and customary meaning to a person of ordinary skill in the art, and when used in the context of an isolated nucleic acid or an isolated polypeptide, is used without limitation to refer to a nucleic acid or polypeptide that, by the hand of man, exists apart from its native environment and is therefore not a product of nature. An isolated nucleic acid or polypeptide can exist in a purified form or can exist in a non-native environment such as, for example, in a transgenic host cell.

The terms "incubating" and "incubation" as used herein mean a process of mixing two or more chemical or biological entities (such as a chemical compound and an enzyme) and allowing them to interact under conditions favorable for producing resveratrol.

The term "degenerate variant" refers to a nucleic acid sequence having a residue sequence that differs from a reference nucleic acid sequence by one or more degenerate codon substitutions. Degenerate codon substitutions can be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed base and/or deoxyinosine residues. A nucleic acid sequence and all of its degenerate variants will express the same amino acid or polypeptide.
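As a non-limiting sketch of the statement that a nucleic acid sequence and its degenerate variants express the same polypeptide, the following Python example translates two coding sequences with the standard genetic code and compares the results. The names `CODON_TABLE`, `translate`, and `is_degenerate_variant` are assumptions introduced here for illustration; the sketch covers only synonymous codon substitutions and does not model mixed-base or deoxyinosine residues.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_degenerate_variant(reference: str, candidate: str) -> bool:
    """True if the sequences differ but encode the same amino acid sequence."""
    return (reference.upper() != candidate.upper()
            and translate(reference) == translate(candidate))
```

For instance, the hypothetical sequences `GGTCCGTAT` and `GGCCCATAC` differ at the third position of every codon yet both encode Gly-Pro-Tyr.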

The terms "polypeptide," "protein," and "peptide" are to be given their respective ordinary and customary meanings to a person of ordinary skill in the art; the three terms are sometimes used interchangeably and are used without limitation to refer to a polymer of amino acids, or amino acid analogs, regardless of its size or function. Although "protein" is often used in reference to relatively large polypeptides, and "peptide" is often used in reference to small polypeptides, usage of these terms in the art overlaps and varies. The term "polypeptide" as used herein refers to peptides, polypeptides, and proteins, unless otherwise noted. The terms "protein," "polypeptide," and "peptide" are used interchangeably herein when referring to a polynucleotide product. Thus, exemplary polypeptides include polynucleotide products, naturally occurring proteins, homologs, orthologs, paralogs, fragments and other equivalents, variants, and analogs of the foregoing. The terms "polypeptide fragment" and "fragment," when used in reference to a reference polypeptide, are to be given their ordinary and customary meanings to a person of ordinary skill in the art and are used without limitation to refer to a polypeptide in which amino acid residues are deleted as compared to the reference polypeptide itself, but where the remaining amino acid sequence is usually identical to the corresponding positions in the reference polypeptide. Such deletions can occur at the amino-terminus or carboxy-terminus of the reference polypeptide, or alternatively both.

The term "functional fragment" of a polypeptide or protein refers to a peptide fragment that is a portion of the full-length polypeptide or protein, and has substantially the same biological activity, or carries out substantially the same function as the full-length polypeptide or protein (e.g., carrying out the same enzymatic reaction). In any one embodiment, the Resilin polypeptide may be a functional fragment.

The term “fibrous polypeptide” refers to a polypeptide that includes a plurality of monomer chains arranged in a matrix so as to form fibers or sheets. Fibrous proteins are described in D. Voet & J. G. Voet, “Biochemistry” (2d ed., John Wiley & Sons, New York, 1995, pp. 153-162).

The terms "variant polypeptide," "modified amino acid sequence" or "modified polypeptide," which are used interchangeably, refer to an amino acid sequence that is different from the reference polypeptide by one or more amino acids, e.g., by one or more amino acid substitutions, deletions, and/or additions. In an aspect, a variant is a "functional variant" which retains some or all of the ability of the reference polypeptide. In any one embodiment, the Resilin polypeptide may be a functional variant.

The term "functional variant" further includes conservatively substituted variants.

The term "conservatively substituted variant" refers to a peptide having an amino acid sequence that differs from a reference peptide by one or more conservative amino acid substitutions and maintains some or all of the activity of the reference peptide. A "conservative amino acid substitution" is a substitution of an amino acid residue with a functionally similar residue. Examples of conservative substitutions include the substitution of one non-polar (hydrophobic) residue such as isoleucine, valine, leucine or methionine for another; the substitution of one charged or polar (hydrophilic) residue for another such as between arginine and lysine, between glutamine and asparagine, between threonine and serine; the substitution of one basic residue such as lysine or arginine for another; or the substitution of one acidic residue, such as aspartic acid or glutamic acid for another; or the substitution of one aromatic residue, such as phenylalanine, tyrosine, or tryptophan for another. Such substitutions are expected to have little or no effect on the apparent molecular weight or isoelectric point of the protein or polypeptide. The phrase "conservatively substituted variant" also includes peptides wherein a residue is replaced with a chemically-derivatized residue, provided that the resulting peptide maintains some or all of the activity of the reference peptide as described herein.
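The example substitution classes listed above may be sketched, without limitation, as a simple lookup in Python. The grouping below encodes only the residues named in the preceding paragraph; the names `CONSERVATIVE_GROUPS`, `is_conservative`, and `classify_substitutions` are assumptions introduced here for illustration, and the sketch says nothing about whether activity is actually retained.

```python
# Residue groups taken from the examples in the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    set("IVLM"),  # non-polar (hydrophobic): Ile, Val, Leu, Met
    set("RK"),    # basic: Arg, Lys
    set("QN"),    # polar: Gln, Asn
    set("TS"),    # polar: Thr, Ser
    set("DE"),    # acidic: Asp, Glu
    set("FYW"),   # aromatic: Phe, Tyr, Trp
]


def is_conservative(ref_residue: str, var_residue: str) -> bool:
    """True if the two different residues fall within one listed group."""
    a, b = ref_residue.upper(), var_residue.upper()
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def classify_substitutions(reference: str, variant: str):
    """List (position, ref, var, conservative?) for each differing residue.

    Assumes equal-length sequences, i.e. substitutions only.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        (pos, r, v, is_conservative(r, v))
        for pos, (r, v) in enumerate(zip(reference, variant), start=1)
        if r != v
    ]
```

Under this sketch, a variant would qualify as conservatively substituted with respect to sequence only if every reported substitution is flagged as conservative.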

The term "variant," in connection with the polypeptides of the subject technology, further includes a functionally active polypeptide having an amino acid sequence at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, and even 100% identical to the amino acid sequence of a reference polypeptide. In any one embodiment, the resilin polypeptide may be a variant with any one of the foregoing percentage identities.

The term "homologous" in all its grammatical forms and spelling variations refers to the relationship between polynucleotides or polypeptides that possess a "common evolutionary origin," including polynucleotides or polypeptides from super families and homologous polynucleotides or proteins from different species (Reeck et al., CELL 50:667, 1987). Such polynucleotides or polypeptides have sequence homology, as reflected by their sequence similarity, whether in terms of percent identity or the presence of specific amino acids or motifs at conserved positions. For example, two homologous polypeptides can have amino acid sequences that are at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, and even 100% identical.

"Suitable regulatory sequences" is to be given its ordinary and customary meaning to a person of ordinary skill in the art and is used without limitation to refer to nucleotide sequences located upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include promoters, translation leader sequences, introns, and polyadenylation recognition sequences.

"Promoter" is to be given its ordinary and customary meaning to a person of ordinary skill in the art and is used without limitation to refer to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. In general, a coding sequence is located 3' to a promoter sequence. Promoters may be derived in their entirety from a native gene or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. Promoters, which cause a gene to be expressed in most cell types at most times, are commonly referred to as "constitutive promoters." It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths may have identical promoter activity.

The term "operably linked" refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other. For example, a promoter is operably linked with a coding sequence when it is capable of affecting the expression of that coding sequence (i.e., that the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably linked to regulatory sequences in sense or antisense orientation.

The term "expression" as used herein, is to be given its ordinary and customary meaning to a person of ordinary skill in the art and is used without limitation to refer to the transcription and stable accumulation of sense (mRNA) or antisense RNA derived from the nucleic acid fragment of the subject technology or production of a gene product in transgenic, transformed or recombinant organisms.

"Transformation" is to be given its ordinary and customary meaning to a person of ordinary skill in the art and is used without limitation to refer to the transfer of a polynucleotide into a target cell. The transferred polynucleotide can be incorporated into the genome or chromosomal DNA of a target cell, resulting in genetically stable inheritance, or it can replicate independent of the host chromosomal. Host organisms containing the transformed nucleic acid fragments are referred to as "transgenic" or “transformed” or “recombinant”.

The terms "transformed," "transgenic," and "recombinant," when used herein in connection with host cells, are to be given their respective ordinary and customary meanings to a person of ordinary skill in the art and are used without limitation to refer to a cell of a host organism, such as a plant or microbial cell, into which a heterologous nucleic acid molecule has been introduced. The nucleic acid molecule can be stably integrated into the genome of the host cell, or the nucleic acid molecule can be present as an extrachromosomal molecule. Such an extrachromosomal molecule can be auto-replicating. Transformed cells, tissues, or subjects are understood to encompass not only the end product of a transformation process, but also transgenic progeny thereof.

The terms "recombinant," "heterologous," and "exogenous," when used herein in connection with polynucleotides, are to be given their ordinary and customary meanings to a person of ordinary skill in the art and are used without limitation to refer to a polynucleotide (e.g., a DNA sequence or a gene) that originates from a source foreign to the particular host cell or, if from the same source, is modified from its original form. Thus, a heterologous gene in a host cell includes a gene that is endogenous to the particular host cell but has been modified through, for example, the use of site-directed mutagenesis or other recombinant techniques. The terms also include non-naturally occurring multiple copies of a naturally occurring DNA sequence. Thus, the terms refer to a DNA segment that is foreign or heterologous to the cell, or homologous to the cell but in a position or form within the host cell in which the element is not ordinarily found.

Similarly, the terms "recombinant," "heterologous," and "exogenous," when used herein in connection with a polypeptide or amino acid sequence, mean a polypeptide or amino acid sequence that originates from a source foreign to the particular host cell or, if from the same source, is modified from its original form. Thus, recombinant DNA segments can be expressed in a host cell to produce a recombinant polypeptide.

“Protein Expression” refers to protein production that occurs after gene expression. It consists of the stages after DNA has been transcribed to messenger RNA (mRNA). The mRNA is then translated into polypeptide chains, which are ultimately folded into proteins. DNA is present in the cells through transfection, a process of deliberately introducing nucleic acids into cells. The term is often used for non-viral methods in eukaryotic cells. It may also refer to other methods and cell types, although other terms are preferred: "transformation" is more often used to describe non-viral DNA transfer in bacteria and non-animal eukaryotic cells, including plant cells. In animal cells, transfection is the preferred term as transformation is also used to refer to progression to a cancerous state (carcinogenesis) in these cells. Transduction is often used to describe virus-mediated DNA transfer. Transformation, transduction, and viral infection are included under the definition of transfection for this application.

The terms "plasmid," "vector," and "cassette" are to be given their respective ordinary and customary meanings to a person of ordinary skill in the art and are used without limitation to refer to an extrachromosomal element often carrying genes which are not part of the central metabolism of the cell, and usually in the form of circular double-stranded DNA molecules. Such elements may be autonomously replicating sequences, genome integrating sequences, phage or nucleotide sequences, linear or circular, of a single- or double-stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction which is capable of introducing a promoter fragment and DNA sequence for a selected gene product along with appropriate 3' untranslated sequence into a cell. "Transformation cassette" refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that facilitate transformation of a particular host cell. "Expression cassette" refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that allow for enhanced expression of that gene in a foreign host.

As used herein "sequence identity" refers to the extent to which two optimally aligned polynucleotide or peptide sequences are invariant throughout a window of alignment of components, e.g., nucleotides or amino acids. An "identity fraction" for aligned segments of a test sequence and a reference sequence is the number of identical components which are shared by the two aligned sequences divided by the total number of components in reference sequence segment, i.e., the entire reference sequence or a smaller defined part of the reference sequence.

As used herein, the term "percent sequence identity" or "percent identity" refers to the percentage of identical nucleotides in a linear polynucleotide sequence of a reference ("query") polynucleotide molecule (or its complementary strand) as compared to a test ("subject") polynucleotide molecule (or its complementary strand) when the two sequences are optimally aligned (with appropriate nucleotide insertions, deletions, or gaps totaling less than 20 percent of the reference sequence over the window of comparison). Methods for optimally aligning sequences over a comparison window are well known to those skilled in the art, and alignment may be conducted by tools such as the local homology algorithm of Smith and Waterman, the homology alignment algorithm of Needleman and Wunsch, the search for similarity method of Pearson and Lipman, and preferably by computerized implementations of these algorithms such as GAP, BESTFIT, FASTA, and TFASTA available as part of the GCG® Wisconsin Package® (Accelrys Inc., Burlington, MA). An "identity fraction" for aligned segments of a test sequence and a reference sequence is the number of identical components which are shared by the two aligned sequences divided by the total number of components in the reference sequence segment, i.e., the entire reference sequence or a smaller defined part of the reference sequence. Percent sequence identity is represented as the identity fraction multiplied by 100. The comparison of one or more polynucleotide sequences may be to a full-length polynucleotide sequence or a portion thereof, or to a longer polynucleotide sequence. For purposes of this invention "percent identity" may also be determined using BLASTX version 2.0 for translated nucleotide sequences and BLASTN version 2.0 for polynucleotide sequences.
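For illustration only, the identity fraction and percent identity defined above may be computed from two already-aligned sequences as sketched below in Python. The function names and the use of `-` as the gap symbol are assumptions introduced here; the sketch does not perform the alignment itself and does not enforce the 20 percent gap limit.

```python
def identity_fraction(aligned_ref: str, aligned_test: str, gap: str = "-") -> float:
    """Identical aligned components divided by the reference segment length.

    Both inputs must be the same length, with `gap` marking alignment gaps.
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")
    # Components of the reference segment: every non-gap reference position.
    ref_len = sum(1 for c in aligned_ref if c != gap)
    if ref_len == 0:
        raise ValueError("reference segment is empty")
    matches = sum(
        1
        for r, t in zip(aligned_ref.upper(), aligned_test.upper())
        if r == t and r != gap
    )
    return matches / ref_len


def percent_identity(aligned_ref: str, aligned_test: str, gap: str = "-") -> float:
    """Percent sequence identity: the identity fraction multiplied by 100."""
    return 100.0 * identity_fraction(aligned_ref, aligned_test, gap)
```

As a hypothetical example, the aligned pair `GAPGYSD-K` (reference) and `GAPGFSDQK` (test) shares seven identical positions over an eight-residue reference segment, giving an identity fraction of 7/8.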

The percent of sequence identity is preferably determined using the "Best Fit" or "Gap" program of the Sequence Analysis Software Package™ (Version 10; Genetics Computer Group, Inc., Madison, WI). "Gap" utilizes the algorithm of Needleman and Wunsch (Needleman and Wunsch, JOURNAL OF MOLECULAR BIOLOGY 48:443-453, 1970) to find the alignment of two sequences that maximizes the number of matches and minimizes the number of gaps. "BestFit" performs an optimal alignment of the best segment of similarity between two sequences and inserts gaps to maximize the number of matches using the local homology algorithm of Smith and Waterman (Smith and Waterman, ADVANCES IN APPLIED MATHEMATICS, 2:482-489, 1981; Smith et al., NUCLEIC ACIDS RESEARCH 11:2205-2220, 1983). The percent identity is most preferably determined using the "Best Fit" program.
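A minimal, non-limiting sketch of global alignment in the manner of Needleman and Wunsch is given below in Python. It is not the "Gap" program: the scoring values (match +1, mismatch -1, linear gap penalty -1) and the function name `needleman_wunsch` are assumptions chosen here for illustration, whereas production tools typically use substitution matrices and separate gap-opening and gap-extension penalties.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Globally align a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)

    def pair_score(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,                   # gap in b
                score[i][j - 1] + gap,                   # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]
```

The two aligned strings returned by this sketch have equal length and use `-` for gaps, so they can be passed directly to a percent identity calculation over the aligned columns.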

Useful methods for determining sequence identity are also disclosed in the Basic Local Alignment Search Tool (BLAST) programs which are publicly available from the National Center for Biotechnology Information (NCBI) at the National Library of Medicine, National Institutes of Health, Bethesda, Md. 20894; see BLAST Manual, Altschul et al., NCBI, NLM, NIH; Altschul et al., J. MOL. BIOL. 215:403-410 (1990); version 2.0 or higher of BLAST programs allows the introduction of gaps (deletions and insertions) into alignments; for peptide sequence BLASTX can be used to determine sequence identity; and, for polynucleotide sequence BLASTN can be used to determine sequence identity.

As used herein, the term "substantial percent sequence identity" refers to a percent sequence identity of at least about 70% sequence identity, at least about 80% sequence identity, at least about 85% identity, at least about 90% sequence identity, or even greater sequence identity, such as about 98% or about 99% sequence identity. Thus, one embodiment of the invention is a polynucleotide molecule that has at least about 70% sequence identity, at least about 80% sequence identity, at least about 85% identity, at least about 90% sequence identity, or even greater sequence identity, such as about 98% or about 99% sequence identity with a polynucleotide sequence described herein. Polynucleotide molecules that have the activity genes of the current invention are useful in the production of resveratrol as provided herein and have a substantial percent sequence identity to the polynucleotide sequences provided herein and are encompassed within the scope of this invention.

Identity is the fraction of amino acids that are the same between a pair of sequences after an alignment of the sequences (which can be done using only sequence information or structural information or some other information, but usually it is based on sequence information alone), and similarity is the score assigned based on an alignment using some similarity matrix. The similarity index can be any one of the following BLOSUM62, PAM250, or GONNET, or any matrix used by one skilled in the art for the sequence alignment of proteins.

Identity is the degree of correspondence between two sub-sequences (no gaps between the sequences). An identity of 25% or higher implies similarity of function, while 18-25% implies similarity of structure or function. Keep in mind that two completely unrelated or random sequences (that are greater than 100 residues) can have higher than 20% identity. Similarity is the degree of resemblance between two sequences when they are compared. This is dependent on their identity.

As used herein, the term “disrupted gene” refers to a gene containing one or more mutations (e.g., insertion, full or partial deletion, or full or partial nucleotide substitution, etc.) relative to the wild-type counterpart so as to substantially reduce or completely eliminate the activity of the encoded gene product. The one or more mutations may be located in a non-coding region, for example, a promoter region, a regulatory region that regulates transcription or translation, or an intron region. Alternatively, the one or more mutations may be located in a coding region (e.g., in an exon). In some instances, the disrupted gene does not express or expresses a substantially reduced level of the encoded protein. In other instances, the disrupted gene expresses the encoded protein in a mutated form, which is either not functional or has substantially reduced activity. In some embodiments, a disrupted gene is a gene that does not encode functional protein. In some embodiments, a cell that comprises a disrupted gene does not express a detectable level (e.g., by enzymatic activity) of the protein encoded by the gene. A cell that does not express a detectable level of the protein may be referred to as a knockout cell. For example, a cell having an enzyme gene edit may be considered a knockout cell if enzymatic activity associated with a protein cannot be detected using a substrate specific for the enzyme.

DETAILED DESCRIPTION

Functionalizing resilin with a peptide domain that binds strongly and specifically to a silicon oxide-containing substance would enable direct immobilization of resilin to substrates such as silica and glass without modifying the substrate surface. Unfortunately, there has been no report of such a construct. The present application provides novel fusion proteins featuring a number of domains that, when linked to a full-length or a functional fragment of resilin, yield a fusion protein that binds strongly to silicon oxide-containing substances, thereby accomplishing a new family of composite materials.

Fusion Proteins

In one aspect, a fusion protein according to the present application is capable of binding to a silicon oxide-containing substance. In a non-limiting embodiment, the fusion protein is capable of binding to a silicon oxide-containing substance in aqueous 20 mM Tris-HCl buffer having pH 8.0. As such, according to an exemplary embodiment, there is provided a polypeptide comprising an amino acid sequence of a resilin molecule that is linked to a silicon oxide-binding domain. Non-limiting examples of native resilins include those from Drosophila sechellia, Acromyrmex echinatior, dragonflies of the genus Aeshna, Haematobia irritans, Ctenocephalides felis, Bombus terrestris, Tribolium castaneum, Apis mellifera, Nasonia vitripennis, Pediculus humanus corporis, Anopheles gambiae, Glossina morsitans, Atta cephalotes, Anopheles darlingi, Acyrthosiphon pisum, Drosophila virilis, Drosophila erecta, Lutzomyia longipalpis, Rhodnius prolixus, Solenopsis invicta, Culex quinquefasciatus, Bactrocera cucurbitae, and Trichogramma pretiosum. GenBank Accession Nos. of specific non-limiting examples of resilin are listed in Table 1 below. The fusion protein may be encoded by a gene containing one or more exons of a native resilin. A resilin of the present application also refers to homologs (e.g., polypeptides which are at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% homologous to a native resilin). The homolog may also refer to a deletion, insertion, or substitution variant, including an amino acid substitution, thereof and functional fragments thereof.

TABLE 1. Exemplary native resilin proteins

As mentioned, the chimeric polypeptides of the present application comprise a resilin polypeptide attached to a heterologous silicon oxide-binding domain. As used herein, the qualifier “heterologous” when relating to the heterologous silicon oxide-binding domains of the polypeptides of the present application indicates that the heterologous silicon oxide-binding domain is not naturally found in that resilin to which it is fused. In one non-limiting embodiment, the phrase “silicon oxide-binding domain” refers to an amino acid sequence which binds a silicon oxide-containing material in aqueous 20 mM Tris-HCl buffer having pH 8.0. Typically, the silicon oxide-binding domain comprises at least a functional fragment of a silicon oxide-binding protein.

A number of silicon oxide-binding domains have been reported in the literature, including those listed in Table 2 below:

TABLE 2. Exemplary silicon oxide-binding domains

The aforesaid silicon oxide-binding domains have been found to bind to materials such as silica gel, biosilica, and nanosilica. A resilin protein may be functionalized by linking to one, two, or more silicon oxide-binding domains, the link typically being a covalent bond. In exemplary embodiments, a silicon oxide-binding domain is linked to either the amino-terminus or the carboxy-terminus of the resilin polypeptide via a peptide bond.
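The terminal linkage described above can be pictured as simple concatenation of amino acid sequences. The following Python sketch is illustrative only: the helper name build_fusion and the 18-residue resilin stub (the first residues of SEQ ID NO: 1, used so the example stays short) are assumptions made for this example, whereas the 6xHis and R5 sequences are taken from SEQ ID NOs: 2 and 3.

```python
from typing import Sequence

SIX_HIS = "HHHHHH"               # SEQ ID NO: 2
R5 = "SSKKSGSYSGSKGSKRRIL"       # SEQ ID NO: 3

# Assumption for illustration: a truncated stand-in for the resilin sequence
# (first 18 residues of SEQ ID NO: 1), not the full-length protein.
RESILIN_STUB = "PEPPVNSYLPPSDSYGAP"


def build_fusion(
    resilin: str,
    n_terminal_domains: Sequence[str] = (),
    c_terminal_domains: Sequence[str] = (),
    start_methionine: bool = True,
) -> str:
    """Concatenate binding domains onto the N- and/or C-terminus of resilin.

    Adjacent parts are joined directly, i.e., by a peptide bond with no linker.
    """
    parts = ["M"] if start_methionine else []
    parts.extend(n_terminal_domains)
    parts.append(resilin)
    parts.extend(c_terminal_domains)
    return "".join(parts)


if __name__ == "__main__":
    # One domain at the C-terminus (same layout as RE-H, SEQ ID NO: 8).
    print(build_fusion(RESILIN_STUB, c_terminal_domains=[SIX_HIS]))
    # Two domains, one at each terminus (same layout as RSB1, SEQ ID NO: 12).
    print(build_fusion(RESILIN_STUB, [SIX_HIS], [R5]))
```

Passing the full sequence of SEQ ID NO: 1 in place of the stub would be expected to reproduce the domain order of the corresponding listed constructs; a linker, where used, could be supplied as an additional element of either domain list.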

The polyhistidine motif 6xHis, commonly known as 6xHis-tag, is traditionally used for affinity purification of genetically modified proteins. For example, US Patent No. 7,960,312 to Kuroda et al. (“Kuroda et al.”) proposes that a protein which is strongly bindable to silicon oxide may also include an additional peptide such as a His-tag (see Kuroda et al., paragraph bridging columns 7 and 8). However, it must be noted that in Kuroda et al. the His-tag is linked to the silicon oxide-binding domain of a fusion protein, whereas in the constructs of the present application the His-tag itself is the silicon oxide-binding domain. This structural difference is illustrated in the examples provided below where the His-tag is directly linked to the resilin protein in the absence of another silicon oxide-binding domain interposed therebetween. Also contemplated are embodiments where the silicon oxide-binding domain and the resilin are bound to one another via a linker moiety, where the linker is not characterized by a high silicon oxide-binding activity.

Constructs for Transgenic Polypeptides

In a further aspect, the present application relates to constructs, such as expression vectors, for expressing a transgenic polypeptide.

In an embodiment, the expression vector includes those genetic elements for expression of a recombinant polypeptide described herein (e.g., one or more of the aforesaid fusion proteins) in various cellular systems. The elements for transcription and translation in the host cell can include a promoter, a coding region for the fusion protein, and a transcriptional terminator.

A person of ordinary skill in the art will be aware of the molecular biology techniques available for the preparation of expression vectors. The polynucleotide used for incorporation into the expression vector of the subject technology, as described above, can be prepared by routine techniques such as polymerase chain reaction (PCR). In molecular cloning, a vector is a DNA molecule used as a vehicle to artificially carry foreign genetic material into another cell, where it can be replicated and/or expressed (e.g. plasmid, cosmid, Lambda phages). A vector containing foreign DNA is considered recombinant DNA. The four major types of traditional vectors are plasmids, viral vectors, cosmids, and artificial chromosomes. Of these, the most commonly used vectors are plasmids. Common to all engineered vectors are an origin of replication, a multicloning site, and a selectable marker.

A number of molecular biology techniques have been developed to operably link DNA to vectors via complementary cohesive termini. In one embodiment, complementary homopolymer tracts can be added to the nucleic acid molecule to be inserted into the vector DNA. The vector and nucleic acid molecule are then joined by hydrogen bonding between the complementary homopolymeric tails to form recombinant DNA molecules.

In an alternative embodiment, synthetic linkers containing one or more restriction sites are used to operably link the polynucleotide of the subject technology to the expression vector. In an embodiment, the polynucleotide is generated by restriction endonuclease digestion. In an embodiment, the nucleic acid molecule is treated with bacteriophage T4 DNA polymerase or E. coli DNA polymerase I, enzymes that remove protruding, 3'-single-stranded termini with their 3'-5'-exonucleolytic activities, and fill in recessed 3'-ends with their polymerizing activities, thereby generating blunt-ended DNA segments. The blunt-ended segments are then incubated with a large molar excess of linker molecules in the presence of an enzyme that is able to catalyze the ligation of blunt-ended DNA molecules, such as bacteriophage T4 DNA ligase. Thus, the product of the reaction is a polynucleotide carrying polymeric linker sequences at its ends. These polynucleotides are then cleaved with the appropriate restriction enzyme and ligated to an expression vector that has been cleaved with an enzyme that produces termini compatible with those of the polynucleotide.

Alternatively, a vector having ligation-independent cloning (LIC) sites can be employed. The required PCR-amplified polynucleotide can then be cloned into the LIC vector without restriction digest or ligation (Aslanidis and de Jong, NUCL. ACID. RES. 18, 6069-74 (1990); Haun et al., BIOTECHNIQUES 13, 515-18 (1992), each of which is incorporated herein by reference).

In an embodiment, in order to isolate and/or modify the polynucleotide of interest for insertion into the chosen plasmid, it is suitable to use PCR. Appropriate primers for use in PCR preparation of the sequence can be designed to isolate the required coding region of the nucleic acid molecule, add restriction endonuclease or LIC sites, and place the coding region in the desired reading frame.

In an embodiment, a polynucleotide for incorporation into an expression vector of the subject technology is prepared by PCR using appropriate oligonucleotide primers. The coding region is amplified, while the primers themselves become incorporated into the amplified sequence product. In an embodiment, the amplification primers contain restriction endonuclease recognition sites, which allow the amplified sequence product to be cloned into an appropriate vector.
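As a rough illustration of how amplification primers can carry restriction endonuclease recognition sites, the following Python sketch prepends a recognition site to the 5' end of each primer. The choice of EcoRI (GAATTC) and XhoI (CTCGAG) sites, the two-nucleotide "GC" clamp, the annealing length, and the function names are all assumptions made for this example; the sketch does not check melting temperature, internal restriction sites, or reading frame, each of which would need to be considered in practice.

```python
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def design_primers(
    coding_region: str,
    forward_site: str = "GAATTC",   # EcoRI recognition site (assumed choice)
    reverse_site: str = "CTCGAG",   # XhoI recognition site (assumed choice)
    anneal_length: int = 18,
    clamp: str = "GC",              # short 5' extension (assumed)
) -> Tuple[str, str]:
    """Build forward and reverse primers, each written 5' to 3'.

    Each primer is: clamp + restriction site + template-annealing region.
    """
    template = coding_region.upper()
    if len(template) < anneal_length:
        raise ValueError("coding region is shorter than the annealing length")

    forward = clamp + forward_site + template[:anneal_length]
    # The reverse primer anneals to the sense strand, so it is the reverse
    # complement of the last bases of the coding region.
    reverse = clamp + reverse_site + reverse_complement(template[-anneal_length:])
    return forward, reverse


if __name__ == "__main__":
    toy_template = "ATGAAACCCGGGTTTTGA"  # toy sequence for illustration only
    fwd, rev = design_primers(toy_template, anneal_length=6)
    print(fwd)
    print(rev)
```

Because both primers become part of the amplified product, the product in this sketch would carry the forward site upstream and the reverse site downstream of the coding region.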

The expression vectors can be introduced into host cells by conventional transformation or transfection techniques. Transformation of appropriate cells with an expression vector of the subject technology is accomplished by methods known in the art and typically depends on both the type of vector and cell. Suitable techniques include calcium phosphate or calcium chloride co-precipitation, DEAE-dextran mediated transfection, lipofection, chemoporation or electroporation.

Successfully transformed cells, that is, those cells containing the expression vector, can be identified by techniques well known in the art. For example, cells transfected with an expression vector of the subject technology can be cultured to produce polypeptides described herein. Cells can be examined for the presence of the expression vector DNA by techniques well known in the art.

The host cells can contain a single copy of the expression vector described previously, or alternatively, multiple copies of the expression vector.

In some embodiments, the transformed cell is a plant cell, an algal cell, a fungal cell, or a bacterial cell of the Escherichia genus, e.g., Escherichia coli.

Microbial host cell expression systems and expression vectors containing regulatory sequences that direct high-level expression of foreign proteins are well known to those skilled in the art. Any of these could be used to construct vectors for expression of the recombinant polypeptide of the subject technology in a microbial host cell. These vectors could then be introduced into appropriate microorganisms via transformation to allow for high-level expression of the recombinant polypeptide of the subject technology.

Vectors or cassettes useful for the transformation of suitable microbial host cells are well known in the art. Typically the vector or cassette contains sequences directing transcription and translation of the relevant polynucleotide, a selectable marker, and sequences allowing autonomous replication or chromosomal integration. Suitable vectors comprise a region 5' of the polynucleotide which harbors transcriptional initiation controls and a region 3' of the DNA fragment which controls transcriptional termination. It is preferred for both control regions to be derived from genes homologous to the transformed host cell, although it is to be understood that such control regions need not be derived from the genes native to the specific species chosen as a host. Termination control regions may also be derived from various genes native to the microbial hosts. A termination site optionally may be included for the microbial hosts described herein.

Preferred host cells include those known to have the ability to produce resilin and its derivatives. For example, preferred host cells can include bacteria of the species Escherichia coli.

Transgenic Polypeptides

Depending on the vector and host system used for production, resultant fusion proteins of the present application may remain within the recombinant cell, be secreted into the fermentation medium, be secreted into a space between two cellular membranes, such as the periplasmic space in E. coli, or be retained on the outer surface of a cell or viral membrane.

Following a predetermined time in culture, recovery of the fusion protein from the cellular system is effected. The phrase “recovering the fusion protein” as used herein refers to collecting the whole fermentation medium containing the polypeptide and need not imply additional steps of separation or purification. Thus, polypeptides of the present application can be purified using a variety of standard protein purification techniques, such as, but not limited to, affinity chromatography, ion exchange chromatography, filtration, electrophoresis, hydrophobic interaction chromatography, gel filtration chromatography, reverse phase chromatography, concanavalin A chromatography, chromatofocusing and differential solubilization.

In addition to being synthesizable in host cells, fusion proteins of the present application can also be synthesized using in vitro expression systems. These methods are well known in the art and the components of the system are commercially available.

Following expression and optional purification of the fusion proteins of the present application, the proteins may be polymerized to form an insoluble material from a solution, preferably one with a relatively high concentration of protein. According to one embodiment, the critical concentration of a fusion polypeptide of the present application is about 50 mg/ml. According to one embodiment, the polypeptide is concentrated by ultracentrifugation.

Crosslinking the Fusion Protein

Typically, crosslinking of proteins can be performed using standard crosslinking agents such as glutaraldehyde, di-isocyanate and genipin. According to an exemplary embodiment, the crosslinking is such that dityrosine bonds are formed. These methods are well known to the person skilled in the art and are discussed by Malencik and Anderson (Biochemistry 1996, 35, 4375-4386). In an embodiment, enzyme-mediated cross-linking in the presence of Ru(bpy)3Cl2·6H2O may be employed. Exemplary peroxidases that may be used to crosslink resilin include, but are not limited to, horseradish peroxidase, Arthromyces peroxidase, Duox peroxidase from Caenorhabditis elegans, sea urchin ovoperoxidases and chorion peroxidases.

Following irradiation, a Ru(III) ion is formed, which serves as an electron abstraction agent to produce a carbon radical within the polypeptide, preferentially at a tyrosine residue, and thus allows dityrosine link formation. This method of induction allows quantitative conversion of soluble fusion protein fragments to a very high molecular weight aggregate. Moreover, this method allows for convenient shaping of composite materials by introducing the fusion protein to a silicon oxide-containing material of the desired shape and size and irradiating the fusion protein contained therein.

In another embodiment, UV irradiation is effected in order to crosslink a fusion polypeptide of the present application (Lehrer S, Fasman G D. (1967) Biochemistry. 6(3):757-67; Malencik D A, Anderson S R. (2003) Amino Acids. 25(3-4):233-47), although care must be taken not to damage the protein through exposure to this radiation. UVB radiation cross-linking may also be undertaken in the presence or absence of riboflavin. In the absence of riboflavin, a substantial amount of cross-linking takes place within one hour of exposure. The crosslinking time is substantially reduced if riboflavin is present. Still further, cross-linking may be effected with ultraviolet light in the presence of coumarin or by white light in the presence of fluorescein. An analysis of the dityrosine may be performed using conventional methods such as high performance liquid chromatography measurements in order to ascertain the extent of dityrosine cross-link formation.

Composite Materials

The fusion proteins of the present application may be used as is or they may be combined with silicon oxide-containing substrates in order to generate novel composite materials. Thus, according to one aspect of the present application, there is provided a composite material including one or more of the aforesaid fusion proteins and a silicon oxide-containing substrate. As used herein the term “composite” refers to a substantially solid material, e.g., a collection of particles or a continuous layer, that is composed of two or more discrete materials, one being the fusion protein, the other the silicon oxide-containing substrate, each of which retains its identity, e.g., physical characteristics, while contributing desirable properties to the composite. Exemplary silicon oxide-containing substrates contemplated for the composites of the present application include, but are not limited to, silica gel, biosilica, nanosilica, and silica glass. In order to generate the composites of the present invention, suspensions of monomers of the fusion protein and the silicon oxide-containing material (e.g., silica gel), for example at approximately 2% solid content, are blended together.

Exemplary ratios of the component suspensions include: 100/0, 90/10, 80/20, 70/30, 60/40, 50/50, 40/60, 30/70, 20/80, 10/90, and 0/100.
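For a chosen ratio, the volume of each component suspension needed for a given batch follows from simple proportion. The Python sketch below is illustrative only; it assumes that the ratios are volume ratios and that the first number refers to the fusion protein suspension and the second to the silicon oxide-containing suspension, neither of which is specified above. The function name is likewise hypothetical.

```python
from typing import Tuple


def blend_volumes(ratio: str, total_volume_ml: float) -> Tuple[float, float]:
    """Split a total volume according to a ratio string such as '70/30'.

    Returns (first_component_ml, second_component_ml). Which component is
    listed first is an assumption of this sketch (protein/silica).
    """
    first_part, second_part = (float(x) for x in ratio.split("/"))
    total_parts = first_part + second_part
    if first_part < 0 or second_part < 0 or total_parts == 0:
        raise ValueError("ratio parts must be non-negative and not both zero")
    if total_volume_ml <= 0:
        raise ValueError("total volume must be positive")

    return (
        total_volume_ml * first_part / total_parts,
        total_volume_ml * second_part / total_parts,
    )


if __name__ == "__main__":
    for ratio in ("100/0", "70/30", "50/50", "0/100"):
        protein_ml, silica_ml = blend_volumes(ratio, 10.0)
        print(f"{ratio}: {protein_ml:.1f} mL protein + {silica_ml:.1f} mL silica")
```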

The mixed solutions may be cast onto suitable molds (e.g., teflon or polystyrene) following which appropriate assembly and crosslinking is optionally effected. The crosslinking may be effected in the presence of other fibrous polypeptides to generate the two fibrous polypeptide/silicon oxide composites.

The present application also contemplates coating a silicon oxide-containing substrate to form novel composites. According to one embodiment, the coating includes one or more of the aforesaid fusion proteins which will typically bind to the surface of the substrate through their silicon oxide-binding domains. Following coating, a suitable crosslinking method may be used depending on the actual fusion protein of the coating.

The disclosure will be more fully understood upon consideration of the following non-limiting Examples. It should be understood that these examples, while indicating preferred embodiments of the subject technology, are given by way of illustration only. From the above discussion and these examples, one skilled in the art can ascertain the essential characteristics of the subject technology, and without departing from the spirit and scope thereof, can make various changes and modifications of the subject technology to adapt it to various uses and conditions.

EXAMPLES

Example 1: Design and expression of resilin-silica binding domain fusion proteins (RSBs) in E. coli

In order to combine the features of resilin with those of silica-based materials, the silicon oxide-binding domains of Table 2 were fused with resilin through molecular design, synthesis, and characterization of novel chimeric proteins for subsequent use in the formation of novel composite materials.

As seen below in Table 3, a total of seven example chimeric resilin-silicon oxide-binding domain fusion proteins were produced. The fusion proteins featured one or two silicon oxide-binding domains. Each silicon oxide-binding domain was linked to either the N-terminal or C-terminal amino acid of a resilin (RE) protein from Drosophila melanogaster (SEQ ID NO: 1).

TABLE 3. List of silica binding domains and resilin-silica binding domain fusion proteins

The fusion protein genes were codon optimized for E. coli expression. The synthesized DNA was cloned into a bacterial pET21a(+) expression vector. Each expression construct was transformed into E. coli BL21 (DE3), which was subsequently grown in LB medium supplemented with 20 mM glucose and 50 µg/mL carbenicillin at 37 °C until reaching an OD600 of 0.8-1.0. Protein expression was induced by addition of 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) and the culture was further grown at 37 °C for 6 hr. Cells were harvested by centrifugation (5,000 x g; 10 min; 4 °C). The cell pellets were collected and were either used immediately or stored at -80 °C.
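A codon-optimized coding sequence can be checked against its intended amino acid sequence by in silico translation. The following Python sketch is offered for illustration only and is not a description of how the sequences herein were designed; it builds the standard genetic code table and translates an open reading frame up to the first stop codon. The function name and the short test sequence, which is modeled on the start of an N-terminally His-tagged construct followed by a stop codon, are assumptions made for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence in frame 1, stopping at the first stop codon."""
    seq = "".join(dna.upper().split())  # tolerate spaces and line breaks
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")

    protein = []
    for i in range(0, len(seq), 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    # Illustrative fragment: start codon, six His codons, four further codons, stop.
    fragment = "ATG" + "CAC" * 6 + "CCGGAACCGCCG" + "TGA"
    print(translate(fragment))
```

The same function could be applied to a full-length synthetic DNA sequence and the result compared, residue by residue, with the corresponding amino acid sequence.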

The cell pellets were re-suspended in lysis buffer (50 mM potassium phosphate buffer, pH 8.0, 25 µg/ml lysozyme, 1 µg/ml DNase I, 20 mM imidazole, 500 mM NaCl, 10% glycerol). The cells were disrupted by sonication on ice, and the lysate was clarified by centrifugation (20,000 x g; 20 min). The supernatant was loaded onto a Ni-NTA (Qiagen) affinity column equilibrated with equilibration buffer (50 mM potassium phosphate buffer, pH 8.0, 20 mM imidazole, 500 mM NaCl, 10% glycerol). After loading of the protein sample, the column was washed with equilibration buffer to remove unbound contaminant proteins. The His-tagged chimeric fusion proteins were eluted with equilibration buffer containing 200 mM imidazole. Purified fusion proteins were dialyzed against a suitable buffer (20 mM Tris-HCl buffer, pH 8.0) for further testing.

Example 2: Characterization of silica binding of chimeric proteins

A silica gel binding assay was carried out to verify the silica binding ability of the purified chimeric proteins. Silica gel powder (Silica gel 60, Millipore, USA) was pretreated with 20 mM Tris-HCl (pH 8.0) aqueous buffer to form a silica gel slurry. The same volume of a purified solution of each protein was added to the slurry and mixed on a rotator for 2 hours. The resulting mixture was centrifuged at 5,000 x g for 2 minutes and the supernatant was collected for analysis. The pellet was resuspended in 300 µL of 20 mM Tris-HCl pH 8.0 for washing. This washing step was repeated for a total of three washes, allowing 5 minutes of mixing in between each centrifugation step. After the final wash, 300 µL of 500 mM arginine in 20 mM Tris-HCl (pH 8.0) was added and allowed to mix with the slurry before being pelleted at 5,000 x g for 2 minutes. The resulting supernatant was the eluant containing silica-binding protein. All samples from each step were collected and analyzed by SDS-PAGE. As shown in FIG. 2, all 7 purified chimeric fusion proteins were detected in the arginine eluant (FIG. 2, Lane 4), indicating that all chimeric fusion proteins had a silica binding property.

Example 3: Biomaterial formation from chimeric proteins and nano silica particles

Photochemical cross-linking of protein (150-200 mg/mL) was carried out by irradiation from a 2000 lumen light source in aqueous reaction mixture samples. The mixture contained 20 mM potassium phosphate buffer (pH 8.0), 134 mM NaCl, and 2 mM Ru(bpy)3Cl2. The cross-linking was performed either in the presence or absence of 3 wt% nano silica particles having an average diameter between 10 and 25 nm. Ammonium persulfate was added to a final concentration of 10 mM immediately prior to irradiation of a sample. A hydrogel or film was formed when the experiment was conducted at room temperature.
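The volume of a stock solution needed to reach a stated final concentration follows from the dilution relation C1·V1 = C2·V2. The Python sketch below illustrates the arithmetic for the ammonium persulfate and ruthenium components; the stock concentrations (500 mM and 50 mM) and the 100 µL reaction volume are hypothetical values chosen for this example, and the function name is likewise an assumption.

```python
def stock_volume_ul(stock_mm: float, final_mm: float, final_volume_ul: float) -> float:
    """Volume of stock (µL) giving `final_mm` in `final_volume_ul`, from C1*V1 = C2*V2."""
    if stock_mm <= 0 or final_volume_ul <= 0 or final_mm < 0:
        raise ValueError("concentrations and volume must be positive")
    if final_mm > stock_mm:
        raise ValueError("final concentration cannot exceed the stock concentration")
    return final_mm * final_volume_ul / stock_mm


if __name__ == "__main__":
    reaction_volume_ul = 100.0  # hypothetical reaction volume

    # Ammonium persulfate: 10 mM final (from the text), 500 mM stock (assumed).
    aps_ul = stock_volume_ul(stock_mm=500.0, final_mm=10.0, final_volume_ul=reaction_volume_ul)

    # Ru(bpy)3Cl2: 2 mM final (from the text), 50 mM stock (assumed).
    ru_ul = stock_volume_ul(stock_mm=50.0, final_mm=2.0, final_volume_ul=reaction_volume_ul)

    print(f"APS stock: {aps_ul:.1f} µL, Ru stock: {ru_ul:.1f} µL")
```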

As illustrated in FIG. 3, a film was formed when the mixture was spread on a glass slide. As illustrated in FIG. 4, both RE-H and RSB1 proteins can form hydrogels with silica particles. Films formed from reaction mixtures containing silica particles were characterized by greater transparency than those without the silica particles. The above results indicate that the resilin-silica binding domain fusion proteins are characterized by both resilin- and silica-binding properties that can be put to use in the formation of hydrogel and film materials.

REFERENCES

US Pat. Appl. Publ. No. 2019/0330287. Elastomeric proteins.

US Pat. Appl. Publ. No. 2010/0317588. Compositions comprising fibrous polypeptides and polysaccharides.

US Pat. Appl. Publ. No. 2007/0099231. Bioelastomer.

US Pat. No. 7,960,312. Method and agent for immobilizing protein via protein bound to silicon oxide-containing substance.

PCT International Publication No. WO 2015/068160. Cross-linking resilin-containing materials.

Elvin CM, Carr AG, Huson MG, Maxwell JM, Pearson RD, Vuocolo T, Liyou NE, Wong DC, Merritt DJ, Dixon NE. Synthesis and properties of crosslinked recombinant pro-resilin. Nature. 2005 Oct 13;437(7061):999-1002. doi: 10.1038/nature04085. PMID: 16222249.

Qin G, Lapidot S, Numata K, Hu X, Meirovitch S, Dekel M, Podoler I, Shoseyov O, Kaplan DL. Expression, cross-linking, and characterization of recombinant chitin binding resilin. Biomacromolecules. 2009 Dec 14;10(12):3227-34. doi: 10.1021/bm900735g. PMID: 19928816.

Abdelhamid MA, Ikeda T, Motomura K, Tanaka T, Ishida T, Hirota R, Kuroda A. Application of volcanic ash particles for protein affinity purification with a minimized silica-binding tag. J Biosci Bioeng. 2016 Nov;122(5):633-638. doi: 10.1016/j.jbiosc.2016.04.011. Epub 2016 May 19. PMID: 27212265.

Abdelhamid MA, Motomura K, Ikeda T, Ishida T, Hirota R, Kuroda A. Affinity purification of recombinant proteins using a novel silica-binding peptide as a fusion tag. Appl Microbiol Biotechnol. 2014 Jun;98(12):5677-84. doi: 10.1007/s00253-014-5754-z. Epub 2014 Apr 23. PMID: 24756322.

Zhou S, Huang W, Belton DJ, Simmons LO, Perry CC, Wang X, Kaplan DL. Control of silicification by genetically engineered fusion proteins: silk-silica binding peptides. Acta Biomater. 2015 Mar;15:173-80. doi: 10.1016/j.actbio.2014.10.040. Epub 2014 Nov 4. PMID: 25462851; PMCID: PMC4331239.

Liu C, Steer DL, Song H, He L. Superior Binding of Proteins on a Silica Surface: Physical Insight into the Synergetic Contribution of Polyhistidine and a Silica-Binding Peptide. J Phys Chem Lett. 2022 Feb 17;13(6):1609-1616. doi: 10.1021/acs.jpclett.1c03306. Epub 2022 Feb 10. PMID: 35142521.

NUCLEIC ACID AND AMINO ACID SEQUENCES

SEQ ID NO: 1

Amino acid

Drosophila melanogaster resilin

PEPPVNSYLPPSDSYGAPGQSGPGGRPSDSYGAPGGGNGGRPSDSYGAPGQGQGQG

QGQGGYAGKPSDTYGAPGGGNGNGGRPSSSYGAPGGGNGGRPSDTYGAPGGGNGG

RPSDTYGAPGGGGNGNGGRPSSSYGAPGQGQGNGNGGRSSSSYGAPGGGNGGRPSD

TYGAPGGGNGGRPSDTYGAPGGGNNGGRPSSSYGAPGGGNGGRPSDTYGAPGGGN

GNGSGGRPSSSYGAPGQGQGGFGGRPSDSYGAPGQNQKPSDSYGAPGSGNGNGGRP

SSSYGAPGSGPGGRPSDSYGPPASG

SEQ ID NO: 2

Amino acid

6xHis

HHHHHH

SEQ ID NO: 3

Amino acid

R5

SSKKSGSYSGSKGSKRRIL

SEQ ID NO: 4

Amino acid

A1

SGSKGSKRRIL

SEQ ID NO: 5

Amino acid

A3

MSPHPHPRHHHT

SEQ ID NO: 6

Amino acid

CotB1p

SGRARAQRQSSRGR

SEQ ID NO: 7

Amino acid

SB7

RQSSRGR

SEQ ID NO: 8

Amino acid

RE-H

MPEPPVNSYLPPSDSYGAPGQSGPGGRPSDSYGAPGGGNGGRPSDSYGAPGQGQGQ

GQGQGGYAGKPSDTYGAPGGGNGNGGRPSSSYGAPGGGNGGRPSDTYGAPGGGNG

GRPSDTYGAPGGGGNGNGGRPSSSYGAPGQGQGNGNGGRSSSSYGAPGGGNGGRPS

DTYGAPGGGNGGRPSDTYGAPGGGNNGGRPSSSYGAPGGGNGGRPSDTYGAPGGG

NGNGSGGRPSSSYGAPGQGQGGFGGRPSDSYGAPGQNQKPSDSYGAPGSGNGNGGR

PSSSYGAPGSGPGGRPSDSYGPPASGHHHHHH

SEQ ID NO: 9

Synthetic DNA

RE-H

ATGCCGGAACCGCCGGTTAATAGCTATCTGCCGCCGAGCGATAGTTATGGTGCAC

CGGGCCAGAGTGGCCCTGGTGGTCGTCCTAGTGATAGCTATGGCGCCCCTGGTGG

CGGTAATGGCGGTCGTCCGAGCGATTCATACGGTGCCCCTGGTCAGGGCCAGGG

TCAGGGTCAGGGCCAAGGCGGTTATGCAGGTAAACCGAGCGATACCTATGGCGC

CCCTGGTGGTGGCAATGGTAATGGCGGCCGCCCGAGCAGCAGTTATGGTGCGCC

GGGCGGCGGCAATGGCGGTCGCCCTAGCGATACCTACGGCGCCCCTGGTGGAGG

TAATGGTGGCCGCCCGTCAGATACCTATGGTGCCCCTGGTGGGGGTGGTAATGGC

AATGGTGGCCGTCCGAGTAGTAGTTATGGTGCTCCGGGTCAGGGTCAAGGTAAT

GGCAACGGTGGCCGTAGTAGCAGTAGTTATGGCGCACCGGGTGGTGGCAACGGC

GGCCGTCCTTCAGATACCTACGGTGCACCGGGTGGCGGCAATGGTGGTCGTCCGA

GTGATACCTATGGAGCCCCTGGTGGTGGTAATAATGGTGGTCGCCCGAGCTCAAG

TTATGGTGCCCCTGGTGGCGGCAACGGCGGTAGACCTAGCGATACATACGGTGC

CCCTGGTGGCGGAAATGGTAATGGTAGTGGCGGCCGCCCTAGCAGCAGCTATGG

TGCCCCTGGTCAAGGCCAGGGTGGTTTTGGTGGTCGTCCAAGCGATAGCTATGGT

GCGCCTGGCCAGAATCAGAAACCGAGCGACAGCTATGGTGCACCTGGTAGTGGT

AATGGTAATGGAGGTCGTCCGTCTAGTAGTTATGGAGCACCGGGCAGTGGTCCG

GGCGGTAGACCAAGTGATAGCTACGGCCCGCCGGCAAGCGGCCACCACCACCAC

CACCACTGA

SEQ ID NO: 10

Amino acid

H-RE

MHHHHHHPEPPVNSYLPPSDSYGAPGQSGPGGRPSDSYGAPGGGNGGRPSDSYGAP

GQGQGQGQGQGGYAGKPSDTYGAPGGGNGNGGRPSSSYGAPGGGNGGRPSDTYG

APGGGNGGRPSDTYGAPGGGGNGNGGRPSSSYGAPGQGQGNGNGGRSSSSYGAPG

GGNGGRPSDTYGAPGGGNGGRPSDTYGAPGGGNNGGRPSSSYGAPGGGNGGRPSD

TYGAPGGGNGNGSGGRPSSSYGAPGQGQGGFGGRPSDSYGAPGQNQKPSDSYGAPG

SGNGNGGRPSSSYGAPGSGPGGRPSDSYGPPASG

SEQ ID NO: 11

Synthetic DNA

H-RE

ATGCACCACCACCACCACCACCCGGAACCGCCGGTGAACAGCTACTTACCGCCTT

CGGACAGCTATGGAGCTCCTGGACAGTCCGGTCCCGGGGGTCGGCCATCAGATA

GCTACGGCGCACCGGGCGGAGGCAATGGGGGACGCCCATCAGATTCGTATGGCG

CCCCGGGCCAAGGTCAAGGCCAAGGCCAAGGCCAAGGCGGGTACGCGGGCAAA

CCGAGCGATACTTACGGAGCACCTGGCGGCGGTAATGGGAATGGAGGACGTCCT

AGCAGTTCTTATGGGGCTCCTGGAGGCGGTAATGGGGGTCGCCCGTCGGACACTT

ACGGAGCTCCCGGAGGAGGAAACGGAGGACGTCCGTCTGACACGTACGGAGCTC

CGGGCGGTGGCGGTAATGGTAACGGAGGAAGGCCAAGCAGCTCCTACGGCGCTC

CCGGCCAAGGCCAAGGCAATGGTAACGGCGGACGGAGCAGCAGCAGCTATGGC

GCACCGGGCGGAGGAAATGGTGGCCGCCCGAGCGATACATACGGGGCCCCCGGA

GGGGGAAATGGAGGGCGGCCGAGCGATACGTATGGCGCTCCGGGGGGAGGAAA

CAACGGAGGACGCCCGTCTAGCAGCTATGGCGCCCCTGGCGGGGGGAATGGAGG

CCGTCCATCAGACACCTATGGCGCCCCGGGCGGGGGCAATGGGAACGGGTCGGG

TGGGCGACCGAGCAGCAGCTATGGAGCACCTGGACAAGGCCAAGGCGGATTCGG

AGGTCGCCCGTCCGACTCCTATGGTGCTCCGGGTCAGAATCAGAAACCTAGCGAT

AGCTATGGCGCACCTGGGTCGGGAAATGGGAACGGTGGACGACCAAGCTCTAGC

TATGGGGCTCCCGGAAGTGGCCCGGGAGGACGACCGTCCGATAGTTACGGCCCG

CCCGCGAGCGGCTGA

SEQ ID NO: 12

Amino acid

RSB1

MHHHHHHPEPPVNSYLPPSDSYGAPGQSGPGGRPSDSYGAPGGGNGGRPSDSYGAP

GQGQGQGQGQGGYAGKPSDTYGAPGGGNGNGGRPSSSYGAPGGGNGGRPSDTYG

APGGGNGGRPSDTYGAPGGGGNGNGGRPSSSYGAPGQGQGNGNGGRSSSSYGAPG

GGNGGRPSDTYGAPGGGNGGRPSDTYGAPGGGNNGGRPSSSYGAPGGGNGGRPSD

TYGAPGGGNGNGSGGRPSSSYGAPGQGQGGFGGRPSDSYGAPGQNQKPSDSYGAPG

SGNGNGGRPSSSYGAPGSGPGGRPSDSYGPPASGSSKKSGSYSGSKGSKRRIL
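As a consistency check on the listing, the H-RE (SEQ ID NO: 10) and RSB1 (SEQ ID NO: 12) amino acid sequences can be compared directly. The following Python sketch is illustrative only and is not part of the disclosure. The function name `c_terminal_extension` and the variable names are hypothetical, and only the final listed line of each sequence is used, since the preceding lines of the two sequences are identical as listed.

```python
def c_terminal_extension(base: str, fusion: str) -> str:
    """Return the residues that `fusion` carries beyond `base` at its C-terminus."""
    if not fusion.startswith(base):
        raise ValueError("fusion sequence does not begin with the base sequence")
    return fusion[len(base):]


# Final listed line of SEQ ID NO: 10 (H-RE) and of SEQ ID NO: 12 (RSB1).
H_RE_TAIL = "SGNGNGGRPSSSYGAPGSGPGGRPSDSYGPPASG"
RSB1_TAIL = "SGNGNGGRPSSSYGAPGSGPGGRPSDSYGPPASGSSKKSGSYSGSKGSKRRIL"

extension = c_terminal_extension(H_RE_TAIL, RSB1_TAIL)
print(extension)  # prints SSKKSGSYSGSKGSKRRIL
```

Under these assumptions, the printed residues are the portion of RSB1 appended to the carboxy-terminus of H-RE. The same comparison can be applied to the final lines of RSB2 through RSB5.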

SEQ ID NO: 13

Synthetic DNA

RSB1

ATGCACCACCACCACCACCACCCGGAACCGCCGGTGAACAGCTACTTACCGCCTT

CGGACAGCTATGGAGCTCCTGGACAGTCCGGTCCCGGGGGTCGGCCATCAGATA

GCTACGGCGCACCGGGCGGAGGCAATGGGGGACGCCCATCAGATTCGTATGGCG

CCCCGGGCCAAGGTCAAGGCCAAGGCCAAGGCCAAGGCGGGTACGCGGGCAAA

CCGAGCGATACTTACGGAGCACCTGGCGGCGGTAATGGGAATGGAGGACGTCCT

AGCAGTTCTTATGGGGCTCCTGGAGGCGGTAATGGGGGTCGCCCGTCGGACACTT

ACGGAGCTCCCGGAGGAGGAAACGGAGGACGTCCGTCTGACACGTACGGAGCTC

CGGGCGGTGGCGGTAATGGTAACGGAGGAAGGCCAAGCAGCTCCTACGGCGCTC

CCGGCCAAGGCCAAGGCAATGGTAACGGCGGACGGAGCAGCAGCAGCTATGGC

GCACCGGGCGGAGGAAATGGTGGCCGCCCGAGCGATACATACGGGGCCCCCGGA

GGGGGAAATGGAGGGCGGCCGAGCGATACGTATGGCGCTCCGGGGGGAGGAAA

CAACGGAGGACGCCCGTCTAGCAGCTATGGCGCCCCTGGCGGGGGGAATGGAGG

CCGTCCATCAGACACCTATGGCGCCCCGGGCGGGGGCAATGGGAACGGGTCGGG

TGGGCGACCGAGCAGCAGCTATGGAGCACCTGGACAAGGCCAAGGCGGATTCGG

AGGTCGCCCGTCCGACTCCTATGGTGCTCCGGGTCAGAATCAGAAACCTAGCGAT

AGCTATGGCGCACCTGGGTCGGGAAATGGGAACGGTGGACGACCAAGCTCTAGC

TATGGGGCTCCCGGAAGTGGCCCGGGAGGACGACCGTCCGATAGTTACGGCCCG

CCCGCGAGCGGCAGCAGCAAAAAAAGCGGCAGCTATAGCGGCAGCAAAGGCAG

CAAACGCCGCATTCTGTGA

SEQ ID NO: 14

Amino acid

RSB2

MHHHHHHPEPPVNSYLPPSDSYGAPGQSGPGGRPSDSYGAPGGGNGGRPSDSYGAP

GQGQGQGQGQGGYAGKPSDTYGAPGGGNGNGGRPSSSYGAPGGGNGGRPSDTYG

APGGGNGGRPSDTYGAPGGGGNGNGGRPSSSYGAPGQGQGNGNGGRSSSSYGAPG

GGNGGRPSDTYGAPGGGNGGRPSDTYGAPGGGNNGGRPSSSYGAPGGGNGGRPSD

TYGAPGGGNGNGSGGRPSSSYGAPGQGQGGFGGRPSDSYGAPGQNQKPSDSYGAPG

SGNGNGGRPSSSYGAPGSGPGGRPSDSYGPPASGSGSKGSKRRIL

SEQ ID NO: 15

Synthetic DNA

RSB2

ATGCACCACCACCACCACCACCCGGAACCGCCGGTTAATAGCTATCTGCCGCCGA

GCGATAGTTATGGTGCACCGGGCCAGAGTGGCCCTGGTGGTCGTCCTAGTGATAG

CTATGGCGCCCCTGGTGGCGGTAATGGCGGTCGTCCGAGCGATTCATACGGTGCC

CCTGGTCAGGGCCAGGGTCAGGGTCAGGGCCAAGGCGGTTATGCAGGTAAACCG

AGCGATACCTATGGCGCCCCTGGTGGTGGCAATGGTAATGGCGGCCGCCCGAGC

AGCAGTTATGGTGCGCCGGGCGGCGGCAATGGCGGTCGCCCTAGCGATACCTAC

GGCGCCCCTGGTGGAGGTAATGGTGGCCGCCCGTCAGATACCTATGGTGCCCCTG

GTGGGGGTGGTAATGGCAATGGTGGCCGTCCGAGTAGTAGTTATGGTGCTCCGG

GTCAGGGTCAAGGTAATGGCAACGGTGGCCGTAGTAGCAGTAGTTATGGCGCAC

CGGGTGGTGGCAACGGCGGCCGTCCTTCAGATACCTACGGTGCACCGGGTGGCG

GCAATGGTGGTCGTCCGAGTGATACCTATGGAGCCCCTGGTGGTGGTAATAATGG

TGGTCGCCCGAGCTCAAGTTATGGTGCCCCTGGTGGCGGCAACGGCGGTAGACCT

AGCGATACATACGGTGCCCCTGGTGGCGGAAATGGTAATGGTAGTGGCGGCCGC

CCTAGCAGCAGCTATGGTGCCCCTGGTCAAGGCCAGGGTGGTTTTGGTGGTCGTC

CAAGCGATAGCTATGGTGCGCCTGGCCAGAATCAGAAACCGAGCGACAGCTATG

GTGCACCTGGTAGTGGTAATGGTAATGGAGGTCGTCCGTCTAGTAGTTATGGAGC

ACCGGGCAGTGGTCCGGGCGGTAGACCAAGTGATAGCTACGGCCCGCCGGCAAG

CGGCAGCGGCAGCAAAGGCAGCAAACGCCGCATTCTGTGA
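The correspondence between a synthetic DNA entry and its amino acid entry can be checked by translating the DNA with the standard genetic code. The following Python sketch is illustrative only and is not part of the disclosure. The names `CODON_TABLE` and `translate` are hypothetical helpers, and the standard genetic code is assumed. The fragment used is the in-frame 3' end of SEQ ID NO: 15 as listed; the first nucleotide of the final listed line completes a codon begun on the preceding line and is therefore omitted.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop whitespace left by line wrapping
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


# In-frame 3' end of SEQ ID NO: 15 (RSB2 DNA), including the TGA stop codon.
RSB2_DNA_TAIL = "GGCAGCGGCAGCAAAGGCAGCAAACGCCGCATTCTGTGA"
# Carboxy-terminal residues of SEQ ID NO: 14 (RSB2 amino acid sequence).
RSB2_PROTEIN_TAIL = "GSGSKGSKRRIL"

print(translate(RSB2_DNA_TAIL))  # prints GSGSKGSKRRIL
```

Because `translate` discards whitespace, a full sequence pasted from the listing with its line breaks can be passed in unchanged, provided it starts at the ATG start codon.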

SEQ ID NO: 16

Amino acid

RSB3

MHHHHHHPEPPVNSYLPPSDSYGAPGQSGPGGRPSDSYGAPGGGNGGRPSDSYGAP

GQGQGQGQGQGGYAGKPSDTYGAPGGGNGNGGRPSSSYGAPGGGNGGRPSDTYG

APGGGNGGRPSDTYGAPGGGGNGNGGRPSSSYGAPGQGQGNGNGGRSSSSYGAPG

GGNGGRPSDTYGAPGGGNGGRPSDTYGAPGGGNNGGRPSSSYGAPGGGNGGRPSD

TYGAPGGGNGNGSGGRPSSSYGAPGQGQGGFGGRPSDSYGAPGQNQKPSDSYGAPG

SGNGNGGRPSSSYGAPGSGPGGRPSDSYGPPASGMSPHPHPRHHHT

SEQ ID NO: 17

Synthetic DNA

RSB3

ATGCACCACCACCACCACCACCCGGAACCGCCGGTTAATAGCTATCTGCCGCCGA

GCGATAGTTATGGTGCACCGGGCCAGAGTGGCCCTGGTGGTCGTCCTAGTGATAG

CTATGGCGCCCCTGGTGGCGGTAATGGCGGTCGTCCGAGCGATTCATACGGTGCC

CCTGGTCAGGGCCAGGGTCAGGGTCAGGGCCAAGGCGGTTATGCAGGTAAACCG

AGCGATACCTATGGCGCCCCTGGTGGTGGCAATGGTAATGGCGGCCGCCCGAGC

AGCAGTTATGGTGCGCCGGGCGGCGGCAATGGCGGTCGCCCTAGCGATACCTAC

GGCGCCCCTGGTGGAGGTAATGGTGGCCGCCCGTCAGATACCTATGGTGCCCCTG

GTGGGGGTGGTAATGGCAATGGTGGCCGTCCGAGTAGTAGTTATGGTGCTCCGG

GTCAGGGTCAAGGTAATGGCAACGGTGGCCGTAGTAGCAGTAGTTATGGCGCAC

CGGGTGGTGGCAACGGCGGCCGTCCTTCAGATACCTACGGTGCACCGGGTGGCG

GCAATGGTGGTCGTCCGAGTGATACCTATGGAGCCCCTGGTGGTGGTAATAATGG

TGGTCGCCCGAGCTCAAGTTATGGTGCCCCTGGTGGCGGCAACGGCGGTAGACCT

AGCGATACATACGGTGCCCCTGGTGGCGGAAATGGTAATGGTAGTGGCGGCCGC

CCTAGCAGCAGCTATGGTGCCCCTGGTCAAGGCCAGGGTGGTTTTGGTGGTCGTC

CAAGCGATAGCTATGGTGCGCCTGGCCAGAATCAGAAACCGAGCGACAGCTATG

GTGCACCTGGTAGTGGTAATGGTAATGGAGGTCGTCCGTCTAGTAGTTATGGAGC

ACCGGGCAGTGGTCCGGGCGGTAGACCAAGTGATAGCTACGGCCCGCCGGCAAG

CGGCATGAGCCCGCATCCGCATCCGCGCCATCATCATACCTGA

SEQ ID NO: 18

Amino acid

RSB4

MHHHHHHPEPPVNSYLPPSDSYGAPGQSGPGGRPSDSYGAPGGGNGGRPSDSYGAP

GQGQGQGQGQGGYAGKPSDTYGAPGGGNGNGGRPSSSYGAPGGGNGGRPSDTYG

APGGGNGGRPSDTYGAPGGGGNGNGGRPSSSYGAPGQGQGNGNGGRSSSSYGAPG

GGNGGRPSDTYGAPGGGNGGRPSDTYGAPGGGNNGGRPSSSYGAPGGGNGGRPSD

TYGAPGGGNGNGSGGRPSSSYGAPGQGQGGFGGRPSDSYGAPGQNQKPSDSYGAPG

SGNGNGGRPSSSYGAPGSGPGGRPSDSYGPPASGSGRARAQRQSSRGR

SEQ ID NO: 19

Synthetic DNA

RSB4

ATGCACCACCACCACCACCACCCGGAACCGCCGGTTAATAGCTATCTGCCGCCGA

GCGATAGTTATGGTGCACCGGGCCAGAGTGGCCCTGGTGGTCGTCCTAGTGATAG

CTATGGCGCCCCTGGTGGCGGTAATGGCGGTCGTCCGAGCGATTCATACGGTGCC

CCTGGTCAGGGCCAGGGTCAGGGTCAGGGCCAAGGCGGTTATGCAGGTAAACCG

AGCGATACCTATGGCGCCCCTGGTGGTGGCAATGGTAATGGCGGCCGCCCGAGC

AGCAGTTATGGTGCGCCGGGCGGCGGCAATGGCGGTCGCCCTAGCGATACCTAC

GGCGCCCCTGGTGGAGGTAATGGTGGCCGCCCGTCAGATACCTATGGTGCCCCTG

GTGGGGGTGGTAATGGCAATGGTGGCCGTCCGAGTAGTAGTTATGGTGCTCCGG

GTCAGGGTCAAGGTAATGGCAACGGTGGCCGTAGTAGCAGTAGTTATGGCGCAC

CGGGTGGTGGCAACGGCGGCCGTCCTTCAGATACCTACGGTGCACCGGGTGGCG

GCAATGGTGGTCGTCCGAGTGATACCTATGGAGCCCCTGGTGGTGGTAATAATGG

TGGTCGCCCGAGCTCAAGTTATGGTGCCCCTGGTGGCGGCAACGGCGGTAGACCT

AGCGATACATACGGTGCCCCTGGTGGCGGAAATGGTAATGGTAGTGGCGGCCGC

CCTAGCAGCAGCTATGGTGCCCCTGGTCAAGGCCAGGGTGGTTTTGGTGGTCGTC

CAAGCGATAGCTATGGTGCGCCTGGCCAGAATCAGAAACCGAGCGACAGCTATG

GTGCACCTGGTAGTGGTAATGGTAATGGAGGTCGTCCGTCTAGTAGTTATGGAGC

ACCGGGCAGTGGTCCGGGCGGTAGACCAAGTGATAGCTACGGCCCGCCGGCAAG

CGGCTCGGGTCGTGCTCGTGCCCAGCGTCAGTCAAGCCGTGGTCGTTGA

SEQ ID NO: 20

Amino acid

RSB5

MHHHHHHPEPPVNSYLPPSDSYGAPGQSGPGGRPSDSYGAPGGGNGGRPSDSYGAP

GQGQGQGQGQGGYAGKPSDTYGAPGGGNGNGGRPSSSYGAPGGGNGGRPSDTYG

APGGGNGGRPSDTYGAPGGGGNGNGGRPSSSYGAPGQGQGNGNGGRSSSSYGAPG

GGNGGRPSDTYGAPGGGNGGRPSDTYGAPGGGNNGGRPSSSYGAPGGGNGGRPSD

TYGAPGGGNGNGSGGRPSSSYGAPGQGQGGFGGRPSDSYGAPGQNQKPSDSYGAPG

SGNGNGGRPSSSYGAPGSGPGGRPSDSYGPPASGRQSSRGR

SEQ ID NO: 21

Synthetic DNA

RSB5

ATGCACCACCACCACCACCACCCGGAACCGCCGGTTAATAGCTATCTGCCGCCGA

GCGATAGTTATGGTGCACCGGGCCAGAGTGGCCCTGGTGGTCGTCCTAGTGATAG

CTATGGCGCCCCTGGTGGCGGTAATGGCGGTCGTCCGAGCGATTCATACGGTGCC

CCTGGTCAGGGCCAGGGTCAGGGTCAGGGCCAAGGCGGTTATGCAGGTAAACCG

AGCGATACCTATGGCGCCCCTGGTGGTGGCAATGGTAATGGCGGCCGCCCGAGC

AGCAGTTATGGTGCGCCGGGCGGCGGCAATGGCGGTCGCCCTAGCGATACCTAC

GGCGCCCCTGGTGGAGGTAATGGTGGCCGCCCGTCAGATACCTATGGTGCCCCTG

GTGGGGGTGGTAATGGCAATGGTGGCCGTCCGAGTAGTAGTTATGGTGCTCCGG

GTCAGGGTCAAGGTAATGGCAACGGTGGCCGTAGTAGCAGTAGTTATGGCGCAC

CGGGTGGTGGCAACGGCGGCCGTCCTTCAGATACCTACGGTGCACCGGGTGGCG

GCAATGGTGGTCGTCCGAGTGATACCTATGGAGCCCCTGGTGGTGGTAATAATGG

TGGTCGCCCGAGCTCAAGTTATGGTGCCCCTGGTGGCGGCAACGGCGGTAGACCT

AGCGATACATACGGTGCCCCTGGTGGCGGAAATGGTAATGGTAGTGGCGGCCGC

CCTAGCAGCAGCTATGGTGCCCCTGGTCAAGGCCAGGGTGGTTTTGGTGGTCGTC

CAAGCGATAGCTATGGTGCGCCTGGCCAGAATCAGAAACCGAGCGACAGCTATG

GTGCACCTGGTAGTGGTAATGGTAATGGAGGTCGTCCGTCTAGTAGTTATGGAGC

ACCGGGCAGTGGTCCGGGCGGTAGACCAAGTGATAGCTACGGCCCGCCGGCAAG

CGGCCGTCAGAGCAGCCGTGGTCGTTGA

As is evident from the foregoing description, certain aspects of the present disclosure are not limited by the particular details of the examples provided herein, and it is therefore contemplated that other modifications and applications, or equivalents thereof, will occur to those skilled in the art. It is accordingly intended that the claims shall cover all such modifications and applications that do not depart from the spirit and scope of the present disclosure.

Moreover, unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the disclosure belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present disclosure, the preferred methods and materials are described above.

Claims

1. A fusion protein comprising (i) a resilin protein and (ii) a silicon oxide-binding domain.

2. The fusion protein according to claim 1, wherein the silicon oxide-binding domain comprises an amino acid sequence selected from the group consisting of: SEQ ID NO: 2; SEQ ID NO: 3; SEQ ID NO: 4; SEQ ID NO: 5; SEQ ID NO: 6; and SEQ ID NO: 7.

3. The fusion protein according to claim 1, further comprising a second silicon oxide-binding domain.

4. The fusion protein according to claim 3, wherein the second silicon oxide-binding domain comprises an amino acid sequence selected from the group consisting of: SEQ ID NO: 2; SEQ ID NO: 3; SEQ ID NO: 4; SEQ ID NO: 5; SEQ ID NO: 6; and SEQ ID NO: 7.

5. The fusion protein according to claim 4, wherein the first silicon oxide-binding domain is linked to the amino-terminus of the resilin protein and the second silicon oxide-binding domain is linked to the carboxy-terminus of the resilin protein.

6. The fusion protein according to any one of claims 1 to 5, wherein the resilin protein is a full-length or a functional fragment of native resilin.

7. The fusion protein according to claim 6, wherein the native resilin is from an organism selected from the group consisting of: Drosophila sechellia, Acromyrmex echinatior, a dragonfly of the genus Aeshna, Haematobia irritans, Ctenocephalides felis, Bombus terrestris, Tribolium castaneum, Apis mellifera, Nasonia vitripennis, Pediculus humanus corporis, Anopheles gambiae, Glossina morsitans, Atta cephalotes, Anopheles darlingi, Acyrthosiphon pisum, Drosophila virilis, Drosophila erecta, Lutzomyia longipalpis, Rhodnius prolixus, Solenopsis invicta, Culex quinquefasciatus, Bactrocera cucurbitae, and Trichogramma pretiosum.

8. The fusion protein according to any of claims 1 to 5, wherein the resilin protein comprises an amino acid sequence of SEQ ID NO: 1.

9. The fusion protein of claim 1 comprising an amino acid sequence selected from the group consisting of: SEQ ID NO: 8; SEQ ID NO: 10; SEQ ID NO: 12; SEQ ID NO: 14; SEQ ID NO: 16; SEQ ID NO: 18; and SEQ ID NO: 20.

10. The fusion protein according to any one of claims 1 to 9, wherein the silicon oxide-binding domain is directly linked to the resilin protein.

11. The fusion protein according to any one of claims 1 to 9, wherein the silicon oxide-binding domain is linked to the resilin protein via a linker, wherein the linker is not a silicon oxide-binding domain.

12. A recombinant vector comprising a polynucleotide encoding a fusion protein according to any of claims 1 to 11.

13. The recombinant vector of claim 12, wherein the polynucleotide is operably linked to a heterologous regulatory element.

14. The recombinant vector according to any one of claims 12 and 13, wherein the polynucleotide comprises a nucleic acid sequence selected from the group consisting of: SEQ ID NO: 9; SEQ ID NO: 11; SEQ ID NO: 13; SEQ ID NO: 15; SEQ ID NO: 17; SEQ ID NO: 19; and SEQ ID NO: 21.

15. A host cell transformed with the recombinant vector according to any of claims 12 to 14.

16. A method for producing a fusion protein comprising (i) a resilin protein and (ii) a silicon oxide-binding domain, the method comprising culturing the host cell of claim 15 in a medium under conditions that result in producing the fusion protein.

17. A composite material comprising the fusion protein according to any one of claims 1 to 11 and a silicon oxide-containing substance.

18. The composite material according to claim 17, wherein the silicon oxide-containing substance is silica.

19. The composite material according to any one of claims 17 and 18, wherein the fusion protein is crosslinked.

20. The composite material according to claim 19, wherein the fusion protein is crosslinked through intermolecular dityrosine bond formations.