EP2542573A1 - Novel protein with structural homology to proteins with il-8-like chemokine fold and uses thereof - Google Patents

Novel protein with structural homology to proteins with il-8-like chemokine fold and uses thereof

Info

Publication number
EP2542573A1
EP2542573A1 EP11705893A EP11705893A EP2542573A1 EP 2542573 A1 EP2542573 A1 EP 2542573A1 EP 11705893 A EP11705893 A EP 11705893A EP 11705893 A EP11705893 A EP 11705893A EP 2542573 A1 EP2542573 A1 EP 2542573A1
Authority
EP
European Patent Office
Prior art keywords
protein
seq
acid sequence
amino acid
cells
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP11705893A
Other languages
German (de)
French (fr)
Inventor
Maria Teresa Pisabarro
Aurelie Tomczak
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Technische Universitaet Dresden
Original Assignee
Technische Universitaet Dresden
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/52 Cytokines; Lymphokines; Interferons
    • C07K14/54 Interleukins [IL]
    • C07K14/5421 IL-8

Landscapes

  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Biochemistry (AREA)
  • Biophysics (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Medicinal Chemistry (AREA)
  • Molecular Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Toxicology (AREA)
  • Peptides Or Proteins (AREA)
  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)

Abstract

The present invention relates to a protein having structural homology to proteins with IL-8-like chemokine fold and uses thereof. The invention has various industrial applications in medical, pharmaceutical and biotechnological applications, such as pharmaceuticals, diagnostics, biosensors and bioreactors. The object of the invention is to provide and use novel secreted proteins with a structural homology to proteins with an IL-8-like chemokine fold which are involved in immune-mediated and inflammatory diseases or neoplastic malignancies. The invention relates to an isolated protein having at least 80 % amino acid sequence identity to an amino acid sequence of the protein shown in SEQ ID No. 1. The protein according to the invention has structural similarity to proteins of the IL-8-like chemokine fold. The invention also relates to an isolated protein having at least 80 % amino acid sequence identity to an amino acid sequence of the protein shown in SEQ ID No. 9. The invention further relates to an isolated protein having at least 80 % amino acid sequence identity to an amino acid sequence of the protein shown in SEQ ID No. 16, with cysteine residues at positions 20 and 47 and with at least one disulfide bridge formed between two cysteine residues.

Description

Novel protein with structural homology to proteins with IL-8-like chemokine fold and uses thereof
Field of the invention:
The present invention relates to the identification and uses of a protein having structural homology to proteins with IL-8-like chemokine fold. The invention has various industrial applications in medical, pharmaceutical and biotechnological applications, such as pharmaceuticals, diagnostics, biosensors and bioreactors.
Background of the invention:
Chemokines are a large family of small, structurally similar proteins secreted by cells. Proteins are classified as chemokines according to shared structural characteristics: their small size, with a molecular mass between 8 and 10 kilodaltons (kDa); an IL-8-like chemokine fold; and the presence, in most cases, of four cysteine residues that interact with each other to form intramolecular disulfide bonds essential for their function. The disulfide bonds are typically formed between the first and third and between the second and fourth cysteine residues, producing a characteristic "Greek key" shaped tertiary structure, the so-called IL-8 fold, which consists of an N-terminal loop of approximately ten amino acids, a single-turn helix, three beta-strands and a C-terminal alpha-helix. Known chemokines show a wide range of sequence similarity, from 10 % to more than 90 % sequence identity. Chemokines can be categorized into four groups, depending on the spacing of the first two cysteine residues. CC chemokines possess two adjacent cysteines; in CXC chemokines the first two cysteines are separated by one amino acid; CX3C chemokines have three amino acids between the two cysteines; and XC chemokines possess only two cysteine residues in total.
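The grouping by cysteine spacing described above can be illustrated with a minimal sketch. The function name `classify_chemokine` and the toy sequences in the usage note are assumptions made for illustration only; they are not part of the disclosure, and the rule is a simplification that looks only at the first two cysteines of a mature sequence.

```python
def classify_chemokine(mature_seq: str) -> str:
    """Assign a chemokine group from the spacing of the first two cysteines.

    Simplified rule set (illustrative assumption, not the patent's method):
      gap 0 -> "CC", gap 1 -> "CXC", gap 3 -> "CX3C",
      otherwise "XC" if the sequence holds exactly two cysteines.
    """
    seq = mature_seq.upper()
    n_cys = seq.count("C")
    if n_cys < 2:
        return "unclassified"

    first = seq.index("C")
    second = seq.index("C", first + 1)
    gap = second - first - 1  # residues between the first two cysteines

    if gap == 0:
        return "CC"
    if gap == 1:
        return "CXC"
    if gap == 3:
        return "CX3C"
    if n_cys == 2:
        return "XC"
    return "unclassified"
```

For example, with made-up sequences, `classify_chemokine("AACCAAACAAAC")` yields `"CC"` and `classify_chemokine("AACACAAACAAAC")` yields `"CXC"`.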
Chemokines bind to chemokine receptors on the surface of target cells, predominantly leukocytes. Chemokine receptors are themselves divided into four families according to the type of chemokine they bind. Thus, CC-receptors (CCR) bind CC chemokines, CXC-receptors (CXCR) bind CXC chemokines, CX3C-receptors (CX3CR) bind CX3C chemokines and XC-receptors (XCR) bind XC chemokines.
Chemokines induce directed chemotaxis in nearby responsive cells by binding to their respective receptors on the cells and thus promoting signaling cascades that generate responses like chemotaxis, degranulation, release of superoxide anions or changes in the avidity of cell adhesion molecules. Attracted chemotactic cells thereby follow a signal of increasing chemokine concentration towards the source of the chemokine, which may be for example infected or damaged cells. Some chemokines, like truncation regulated MCP2-4 can inhibit migration instead of inducing it.
Some chemokines are considered pro-inflammatory and can be induced during an immune response to recruit cells of the immune system to a site of infection, while others are considered homeostatic and are involved in controlling the migration of cells during normal processes of tissue maintenance or development.
A member of the CXC-group and the structural representative of the so-called IL-8-like chemokine fold is Interleukin-8 (IL-8), which is produced by macrophages and other cells, for example epithelial cells. It is known to play a role in acute inflammatory responses. These responses are mediated primarily by TNF-alpha, IL-1 and IL-6. Localized effects include increased adherence of circulating white blood cells to vascular endothelial cells and their extravasation into tissue spaces. Both IL-1 and TNF-alpha induce increased expression of cell-adhesion molecules (CAMs) on endothelial cells. These two cytokines also induce production of IL-8 by macrophages and endothelial cells. IL-8 chemotactically attracts neutrophils and promotes their adherence to endothelial cells. Specifically, IL-8 chemoattracts monocytes and dendritic cells. Both cell types play an important role in the initiation of an immune response.
Various types of molecules have been targeted by the pharmaceutical industry to subdue immune related and inflammatory diseases. Drugs that alter cell migration represent a particularly promising class of new anti-inflammatory drugs that appear to show potential in clinical trials and in many animal models of inflammatory diseases. Cell migration inhibitors not only interfere with migration of cells to a tissue, but also can affect other necessary processes such as mediator release and angiogenesis.
In cancer, the balance between angiogenic and angiostatic chemokines can determine tumor survival (Frederick & Clayman 2001). Angiogenic chemokines (CXCL1, 2, 3, 5, 6, 7, 8, 12 and the viral chemokines vMIP-I and vMIP-II (Boshoff 1997)) promote tumor growth by stimulating proliferation and chemotaxis of endothelial cells of the blood vessels that serve the tumor, whereas angiostatic chemokines (CXCL4, 9, 10, 11, 13, 14) promote tumor necrosis by inhibiting endothelial cell proliferation and chemotaxis.
Furthermore chemokines and their receptors were shown to be involved in HIV-1 infections (Boshoff 1997). Two CC chemokines that are encoded in Kaposi sarcoma herpes virus (KSHV), vMIP-I and vMIP-II (viral macrophage inflammatory proteins) act as inhibitors for HIV-1 entry into CD4 T cells.
Viral MIP-I (vMIP-I) has been shown to function as a specific agonist for the CC-chemokine receptor CCR8, a chemokine receptor expressed by Th2 lymphocytes and cultured monocytes. As already discussed, it has also been shown to be highly angiogenic. In primary effusion lymphoma-derived cell lines viral MIP-1-alpha inhibits chemically induced cell death by apoptosis in these cells (Liu 2001). Louahed et al. (Louahed 2003) have reported that viral MIP-1 has anti-apoptotic activities.
Efforts are being undertaken to identify novel secreted or membrane proteins by screening mammalian recombinant DNA libraries to identify the coding sequences. Classical bioinformatics approaches based on sequence similarity are often useful for identifying homologous proteins and thus inferring the function of unknown proteins. Where sequence-based annotation methods fail, the application of structure-based methods, like protein threading (also referred to herein as "fold recognition analysis"), can provide functional clues or confirm tentative functional assignments. These methods have already proven successful for individual proteins. Protein threading is a method of computational protein structure prediction based on statistical analysis of available protein structures. It is used to characterize protein sequences which do not show any statistically significant sequence homology to any known characterized protein.
Interest in the family of chemotactic molecules has increased as it has become apparent that chemokines may contribute to a number of important medical conditions related to immune function and cell division. Given the potential of chemokine related molecules to occupy important roles in the control of immune function, there is an interest in the identification of other members of this family and the receptors that direct the actions of these molecules through particular target cell populations. In this respect, the present invention describes the identification, characterization and use of novel proteins that are structurally similar to proteins with an IL-8-like chemokine fold, and active variants thereof.
Objective of the invention:
Thus, the object of the invention is to identify, characterize and use novel secreted proteins with a structural homology to proteins with an IL-8-like chemokine fold which are involved in immune-mediated and inflammatory diseases or neoplastic malignancies.
Description of the invention:
The object is solved by providing a novel protein, named and further referred to herein as "B42", with structural similarity to proteins of the IL-8-like chemokine fold, in particular to the known chemokines vMIP-I and vMIP-II, which is characterized by its amino acid sequence as presented in SEQ ID No. 1 and its nucleic acid sequence as presented in SEQ ID No. 2 and SEQ ID No. 3.
In one aspect the invention relates to an isolated protein having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % amino acid sequence identity to the amino acid sequence of the protein shown in one of the amino acid sequences presented in SEQ ID No. 1, SEQ ID No. 6 or SEQ ID No. 7, preferably SEQ ID No. 1. Preferably the invention relates to an isolated protein comprising an amino acid sequence that is identical to one of the amino acid sequences presented in SEQ ID No. 1, SEQ ID No. 6 and SEQ ID No. 7. The proteins with SEQ ID No. 6 and SEQ ID No. 7 are preferred examples for proteins with at least 80 % amino acid sequence identity to SEQ ID No. 1.
The B42 protein according to the invention contains at least 2, preferably at least 4, preferably 6, cysteine residues. The amino acid sequence of the protein according to the invention possesses two adjacent cysteine residues; these are preferably the first two cysteine residues in the sequence, preferably located at positions 7 and 8 of the amino acid sequence (as presented in SEQ ID No. 1, SEQ ID No. 6 or SEQ ID No. 7).
The B42 protein according to the invention was identified by fold recognition analysis and shows structural similarity with proteins belonging to the IL-8-like chemokine fold family, in particular with vMIP-I and vMIP-II.
In the study on which the invention is based the inventors extracted uncharacterized polypeptide sequences from the UniProt database containing at least 2 cysteine residues to identify putative novel chemokine candidates. These sequences were analyzed to detect possible signal peptides and transmembrane regions, which were removed before the threading analysis and the structure-based protein function prediction.
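The pre-filtering step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `parse_fasta` and `candidate_sequences` are hypothetical, the input is assumed to be FASTA-formatted text, and the signal peptide lengths are assumed to come from an external prediction tool (the source does not name one), supplied here as a plain dictionary.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def candidate_sequences(fasta_text, signal_lengths=None, min_cys=2):
    """Keep sequences with at least `min_cys` cysteines, then trim signal peptides.

    `signal_lengths` maps a header to the length of its predicted signal
    peptide (hypothetical input; 0 is assumed when a header is absent).
    """
    signal_lengths = signal_lengths or {}
    selected = {}
    for header, seq in parse_fasta(fasta_text):
        if seq.upper().count("C") < min_cys:
            continue  # too few cysteines to be a chemokine candidate
        cut = signal_lengths.get(header, 0)
        selected[header] = seq[cut:]  # mature sequence passed on to threading
    return selected
```

Removal of transmembrane regions would follow the same pattern with a second set of predicted coordinates; it is omitted here to keep the sketch short.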
The inventors used the fold recognition algorithm ProHit for threading the uncharacterized sequences against two fold libraries built up for this purpose; a fold library containing all known structures of the members of the IL-8-like fold family (consisting of 217 entries) and a fold library obtained from filtering the Brookhaven Protein Data Bank (PDB) structures at 95 % sequence identity (consisting of 22009 representative three-dimensional protein structures).
Using this methodology, the protein according to the invention, also referred to herein as "B42", was identified as having an IL-8-like three-dimensional structure, and therefore as a novel member of the chemokine family. Furthermore, B42 exhibits a strong structural similarity to vMIP-I and vMIP-II, members of the IL-8-like chemokine fold family, which are involved in blockade of viral infiltration into host cells. The sequence identity between B42 and vMIP-II is 26.2 %. The sequence identity between B42 and vMIP-I is 22.6 %. B42 shares 11.9 % sequence identity with IL-8.
The identified polypeptide sequence of B42 was not yet characterized as a protein in sequence databases (ENSEMBL) since it was wrongly annotated as being located in an intron structure. Currently (ENSEMBL release 55), it is labelled as "processed transcript". The inventors cloned this gene product and obtained B42 protein by in vitro expression.
Further analyses of the newly identified isolated B42 protein by circular dichroism (CD) and Fourier transform infrared spectroscopy studies, which determine the secondary structure of macromolecules, confirmed the threading results and the structure-based characterization of this protein as a new member of the chemokine family. Circular dichroism patterns of the newly identified protein B42 and vMIP-II were highly similar. The structural similarity to IL-8-like chemokines makes this novel protein, B42, a target for therapeutic and diagnostic applications in immune related disorders.
The terms "polypeptide" or "protein" mean molecules having the sequence of native proteins, that is, proteins produced by naturally-occurring and specifically non-recombinant cells, or genetically-engineered or recombinant cells, and comprise molecules having the amino acid sequence of the native protein, or molecules having deletions from, additions to, and/or substitutions of one or more amino acids of the native sequence.
The term "isolated protein" according to the invention means that a subject protein
a) is preferably free of at least some other proteins with which it would normally be found,
b) is preferably essentially free of other proteins from the same source, e. g. from the same species,
c) is preferably expressed by a cell from a different species,
d) has been separated from at least about 50 percent of polynucleotides, lipids, carbohydrates, or other materials with which it is associated in nature,
e) is preferably not associated (by covalent or noncovalent interaction) with portions of a protein with which the "isolated protein" is associated in nature, or
f) is preferably operably associated (by covalent or noncovalent interaction) with a polypeptide with which it is not associated in nature.
Such an isolated protein can be encoded by genomic DNA, cDNA, mRNA or other RNA, of synthetic origin, or any combination thereof. Preferably, the isolated protein is substantially free from proteins or polypeptides or other contaminants that are found in its natural environment that would interfere with its use (therapeutic, diagnostic, prophylactic, research or otherwise).
The term "amino acid sequence identity" with respect to a specific amino acid sequence refers to the percentage of amino acid residues in a candidate sequence that are identical with the amino acid residues in the specific amino acid sequence.
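A minimal sketch of how such a percentage may be computed is given below. It assumes that the two sequences have already been aligned (gaps written as "-") and counts identical residues over all aligned columns; the text does not fix the alignment method or the denominator, so both are assumptions, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical residues between two pre-aligned sequences.

    Columns where both sequences carry a gap are ignored; a column with a
    gap in only one sequence counts as a mismatch (one possible convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # empty column, carries no information
        compared += 1
        if a == b:
            matches += 1  # identical residue in both sequences

    return 100.0 * matches / compared if compared else 0.0
```

Under this convention a candidate meets the "at least 80 %" threshold when `percent_identity(candidate, reference) >= 80.0`.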
The B42 protein according to the invention has structural homology to the chemokine protein family that share a common 3D structure, the IL-8-like chemokine fold. The B42 protein according to the invention in particular possesses a high structural homology to vMIP-I and vMIP-II.
Structural homology means the fractions of secondary structure content and the degree of 3-dimensional shape similarity between proteins. Structural homology is identified by methods well known in the art, either experimental, such as X-ray crystallography and nuclear magnetic resonance (NMR), or computational, such as threading.
The 3-dimensional structure of the B42 protein according to the invention features an IL-8-like chemokine fold. More specifically, it comprises at least two cysteine residues that form an intramolecular disulfide bond, three β-strands and an α-helix.
The characteristic 3-dimensional structure of the proteins according to the invention, the IL-8-like chemokine fold, is formed by cysteines that interact with each other in pairs by the formation of disulfide bonds to create a characteristic Greek key shape (which is named for its resemblance to the Greek key meander pattern in art). The IL-8-like chemokine fold more specifically comprises an N-terminal loop (N-loop) followed by a single-turn helix or a 3₁₀-helix (3-turn helix), three anti-parallel β-strands and a C-terminal α-helix (4-turn helix). These helices and strands are connected by turns, preferably the so-called 30s, 40s and 50s loops.
Therefore, the amino acid sequence of the B42 protein according to the invention comprises at least two cysteine residues, preferably four cysteine residues, more preferably six cysteine residues. Preferably, four cysteine residues are involved in formation of intramolecular disulfide bonds, which preferably join the first to third, and the second to fourth cysteine residues, numbered as they appear in the protein sequence of the chemokine. More preferably, six cysteine residues are involved in intramolecular disulfide bonds.
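The preferred pairing of the first to third and the second to fourth cysteine residue can be derived from a sequence as sketched below. The function names and the toy sequence are assumptions for illustration; positions are 1-based, and the sketch only enumerates the pairing pattern and does not predict whether the bonds actually form.

```python
def cysteine_positions(seq: str) -> list:
    """Return the 1-based positions of all cysteine residues in `seq`."""
    return [i + 1 for i, aa in enumerate(seq.upper()) if aa == "C"]


def canonical_disulfide_pairs(seq: str) -> list:
    """Pair cysteine 1 with 3 and cysteine 2 with 4, in sequence order."""
    cys = cysteine_positions(seq)
    if len(cys) < 4:
        raise ValueError("at least four cysteine residues are required")
    return [(cys[0], cys[2]), (cys[1], cys[3])]


# Toy sequence (made up): cysteines at positions 3, 4, 8 and 12.
pairs = canonical_disulfide_pairs("AACCAAACAAAC")
# pairs == [(3, 8), (4, 12)]
```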
Experimentally determined crystal structures of chemokines can be found in the following Protein Data Bank (PDB) entries: 1a15, 1b2t, 1b3a, 1b50, 1b53, 1bo0, 1cm9, 1dok, 1dol, 1dom, 1don, 1eig, 1eih, 1el0, 1eot, 1eqt, 1esr, 1f2l, 1f9p, 1f9q, 1f9r, 1f9s, 1g2s, 1g2t, 1g91, 1ha6, 1hfg, 1hfn, 1hhv, 1hrj, 1hum, 1hun, 1icw, 1ikl, 1ikm, 1il8, 1ilp, 1ilq, 1j8i, 1j9o, 1je4, 1lv9, 1m8a, 1mgs, 1mi2, 1ml0, 1msg, 1msh, 1nap, 1ncv, 1nr2, 1nr4, 1o7y, 1o7z, 1o80, 1pfm, 1pfn, 1plf, 1qe6, 1qg7, 1qnk, 1rhp, 1rjt, 1rod, 1rtn, 1rto, 1sdf, 1tvx, 1u4l, 1u4m, 1u4p, 1u4r, 1vmc, 1vmp, 1zxt, 2bdn, 2eot, 2ffk, 2fht, 2fin, 2fj2, 2hcc, 2hci, 2hdl, 2hdm, 2il8, 2j7z, 2jp1, 2jyo, 2k01, 2k03, 2k04, 2k05, 2kec, 2ked, 2kee, 2nwg, 2nyz, 2nz1, 2q8r, 2q8t, 2r3z, 2ra4, 2sdf, 2vxw, 3ifd, 3il8. Preferably, a protein according to the invention features a 3D structure of the aforementioned PDB entries.
Further embodiments of the invention are chimeric molecules which comprise a B42 polypeptide according to the invention fused to a heterologous amino acid sequence. Preferably this heterologous amino acid sequence is an epitope tag sequence or an Fc region of an immunoglobulin.
The invention further comprises proteins that are circular permutations of the B42 protein according to the invention, further referred to as circularly permutated variants. Preferably the invention comprises circularly permutated variants of the B42 protein with the amino acid sequence according to SEQ ID No. 1, SEQ ID No. 6 or SEQ ID No. 7 (or a protein having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % amino acid sequence identity to the amino acid sequence according to SEQ ID No. 1, SEQ ID No. 6 or SEQ ID No. 7).
Circular permutation in the context of the invention means a genetic operation in which part of the C-terminus of a protein is moved to its N-terminus while maintaining the three-dimensional structure and biological function of said protein. A circular permutation is built from one or more sets of structural elements in cyclic order. An engineered circular permutation of a protein can be envisioned as the result of taking a linear protein sequence, joining its ends to create a circle and then cleaving the circle at another site to generate a new linear monomeric protein sequence, altered only at the sites of joining and cleavage, with a different sequential order of the secondary structure elements but the same 3D structure. This approach of engineering a new protein sequence by circular permutation was pioneered by Goldenberg in 1983 (Goldenberg 1983). The newly engineered protein maintains the original function either totally (preferred) or partially. The B42 protein according to the invention preferably contains loops (amino acids counted) at the following amino acid positions: amino acids 8 - 20 (N-loop), amino acids 27 - 33 (30S-loop), amino acids 38 - 42 (40S-loop), amino acids 47 - 57 (50S-loop).
Circular permutated variants are obtained by connecting the C-terminus with the N-Terminus and cutting (genetic operation) within one of said mentioned loops (to generate a new C-terminus and N-Terminus). Preferably, the cutting site is not positioned in the N-loop of the protein.
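The operation of joining the termini and cutting within a loop can be sketched at sequence level as follows. The loop boundaries are the preferred positions stated for the B42 protein; the function names, the optional `linker` argument and the placeholder sequence in the usage note are assumptions for illustration and do not reproduce SEQ ID No. 1.

```python
# Preferred loop positions of B42 (1-based, inclusive), as stated in the text.
LOOPS = {
    "N-loop": (8, 20),
    "30s-loop": (27, 33),
    "40s-loop": (38, 42),
    "50s-loop": (47, 57),
}


def loop_containing(cut_after: int):
    """Return the loop in which a cut between residues k and k+1 falls, or None."""
    for name, (start, end) in LOOPS.items():
        if start <= cut_after < end:  # both k and k+1 lie inside the loop
            return name
    return None


def circular_permute(seq: str, cut_after: int, linker: str = "",
                     allow_n_loop: bool = False) -> str:
    """Join the old termini (optionally via `linker`) and cut after residue `cut_after`.

    The new N-terminus is original residue cut_after + 1; the new
    C-terminus is original residue cut_after.
    """
    if not 1 <= cut_after < len(seq):
        raise ValueError("cut site must lie inside the sequence")
    loop = loop_containing(cut_after)
    if loop is None:
        raise ValueError("cut site is not inside one of the listed loops")
    if loop == "N-loop" and not allow_n_loop:
        raise ValueError("cutting in the N-loop is not preferred")
    return seq[cut_after:] + linker + seq[:cut_after]
```

With a placeholder 70-residue sequence `s`, `circular_permute(s, 30)` cuts in the 30s loop and returns `s[30:] + s[:30]`; the amino acid composition is unchanged and only the order is rearranged.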
Circularly permutated variants of a B42 protein according to the invention preferably comprise the amino acid sequence according to SEQ ID No. 1, SEQ ID No. 6 or SEQ ID No. 7 (or a protein having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % amino acid sequence identity to the amino acid sequence according to SEQ ID No. 1, SEQ ID No. 6 or SEQ ID No. 7) in rearranged order.
By cyclic permutation the following structural variants of the protein according to the invention can be obtained (structural elements are listed according to their order from N-terminus to C-terminus):
a) beta sheet, beta sheet, alpha helix, beta sheet,
b) beta sheet, alpha helix, beta sheet, beta sheet, or
c) alpha helix, beta sheet, beta sheet, beta sheet.
Preferably structural elements of circular permutated variants of the B42 protein according to the invention comprise the following amino acid sequences (each one comprising at least 80 %, preferably 90 %, more preferably 95 %, even more preferably 99 % amino acid sequence identity to the respective amino acid sequence according to SEQ ID No. 1):
a) beta sheet according to amino acids 34 to 37 in SEQ ID No. 1 (S2), beta sheet according to amino acids 43 to 46 in SEQ ID No. 1 (S3), alpha helix according to amino acids 58 to at least 65 in SEQ ID No. 1, beta sheet according to amino acids 23 to 26 in SEQ ID No. 1 (S1),
b) beta sheet according to amino acids 43 to 46 in SEQ ID No. 1 (S3), alpha helix according to amino acids 58 to at least 65 in SEQ ID No. 1, beta sheet according to amino acids 23 to 26 in SEQ ID No. 1 (S1), beta sheet according to amino acids 34 to 37 in SEQ ID No. 1 (S2),
c) alpha helix according to amino acids 58 to at least 65 in SEQ ID No. 1, beta sheet according to amino acids 23 to 26 in SEQ ID No. 1 (S1), beta sheet according to amino acids 34 to 37 in SEQ ID No. 1 (S2), beta sheet according to amino acids 43 to 46 in SEQ ID No. 1 (S3).
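The three element orders listed above follow directly from the position of the cutting site relative to the structural elements. The sketch below derives the order for a given cut; the element boundaries are those stated for SEQ ID No. 1 (with residue 65 taken as the minimum helix end), whereas the function name `element_order_after_cut` is a hypothetical helper introduced only for illustration.

```python
# Structural elements of B42 (1-based, inclusive) in native order.
# The helix is stated to span residues 58 to "at least 65"; 65 is used here.
ELEMENTS = [
    ("S1", 23, 26),
    ("S2", 34, 37),
    ("S3", 43, 46),
    ("helix", 58, 65),
]


def element_order_after_cut(cut_after: int) -> list:
    """Order of elements (N- to C-terminus) after cutting behind residue `cut_after`."""
    for name, start, end in ELEMENTS:
        if start <= cut_after < end:
            raise ValueError(f"cut would split structural element {name}")
    # Elements after the cut come first, followed by those before it.
    tail = [name for name, start, _ in ELEMENTS if start > cut_after]
    head = [name for name, _, end in ELEMENTS if end <= cut_after]
    return tail + head
```

A cut in the 30s loop (for example after residue 30) gives S2, S3, helix, S1, which is variant a); a cut in the 40s loop gives variant b); and a cut in the 50s loop gives variant c).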
Chemokine receptor activation is a two-step process (Baggiolini 2001). First, a chemokine binds the N-terminal region of the receptor by interaction mainly through basic surface residues and second, the receptor signaling is triggered by interaction of the flexible N-terminus of the chemokine with a second site on the receptor. Often the receptor affinity is modulated by proteolytic cleavage of the chemokine N-terminus (Wolf 2008). By this, the chemokine can be either turned into an inhibitor of the receptor or the receptor activation is increased or reduced.
The three dimensional IL8-like structure is preferably maintained by circular permutated variants according to the invention. Thus, the circularly permutated variants according to the invention are able to bind to the receptor (as inhibitors or activators). The biological function of a circularly permutated variant according to the invention is maintained in activating variants with structures according to a) or c).
Preferably, for receptor activation an additional extension comprising an amino acid sequence from the original N-terminus is appended to the N-terminus of a circularly permutated variant according to the invention. Otherwise, only binding ability (inhibition) is maintained, whereas signaling ability is not.
A circularly permutated variant according to a structure described in b) is preferably an inhibitor of the natural receptor. Said variant is able to bind the receptor but does not have the flexible N-terminus at the right position to induce receptor signaling.
In another aspect the present invention relates to an isolated nucleic acid encoding for the B42 polypeptide according to SEQ ID No. 1, SEQ ID No. 6 or SEQ ID No. 7 (also herein referred to as "B42 nucleic acid"). Preferably, the invention relates to an isolated nucleic acid having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % nucleic acid sequence identity to a nucleotide sequence encoding the B42 protein shown in SEQ ID No. 1, SEQ ID No. 6 or SEQ ID No. 7. In a preferred embodiment the isolated nucleic acid has at least 80 %, preferably at least 90 %, more preferably at least 95 %, even more preferably at least 99 % nucleic acid sequence identity to the nucleotide sequence presented in SEQ ID No. 2 or to the nucleotide sequence presented in SEQ ID No. 3.
According to the invention, nucleic acids as referred to herein mean single-stranded or double-stranded nucleic acid polymers of at least 10 nucleotides in length. In certain embodiments, the nucleotides comprising the nucleic acids can be ribonucleotides or deoxyribonucleotides or a modified form of either type of nucleotide. Said modifications include base modifications such as bromouridine, ribose modifications such as arabinoside and 2',3'-dideoxyribose and internucleotide linkage modifications such as phosphorothioate, phosphorodithioate, phosphoroselenoate, phosphorodiselenoate, phosphoroanilothioate, phosphoraniladate and phosphoroamidate. The term "nucleic acid" specifically includes single and double stranded forms of DNA.
The term "nucleic acid sequence identity" with respect to a specific nucleic acid sequence refers to the percentage of nucleic acid residues in a candidate sequence that are identical with the nucleic acid sequence in the specific nucleic acid sequence.
Further aspects of the invention are directed to an isolated nucleic acid having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % nucleic acid sequence identity to a nucleotide sequence comprising the coding sequence of the full length B42 polypeptide cDNA as presented in SEQ ID No. 2 or the full transcript sequence as presented in SEQ ID No. 3. Another aspect of the invention relates to a vector comprising a B42 nucleic acid according to the invention (also herein referred to as "B42 vector"). In one aspect, the vector according to the invention comprises a B42 nucleic acid having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % nucleic acid sequence identity to a nucleotide sequence encoding the B42 protein shown in SEQ ID No. 1, SEQ ID No. 6 or SEQ ID No. 7. In another aspect, the vector according to the invention comprises a nucleic acid having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % nucleic acid sequence identity to the nucleotide sequence shown in SEQ ID No. 2. In yet another aspect, the vector according to the invention comprises a nucleic acid having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % nucleic acid sequence identity to the nucleotide sequence shown in SEQ ID No. 3.
According to the invention the term "vector" as used herein refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. One type of vector is a "plasmid", which refers to a circular double stranded DNA loop into which additional DNA segments may be ligated. Another type of vector is a viral vector, wherein additional DNA segments may be ligated into the viral genome. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e. g. bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e. g. non-episomal mammalian vectors) can be integrated into the genome of a host cell upon introduction into the host cell and thereby are replicated along with the host genome.
In another embodiment of the invention the vector comprising the nucleic acid is operably linked to control sequences recognized by a host cell transformed with the vector.
The term "operably linked" means that the components to which the term is applied are in a relationship that allows them to carry out their inherent functions under suitable conditions. For example, a control sequence "operably linked" to a protein coding sequence, as comprised in the vector according to the invention, is ligated thereto so that expression of the protein coding sequence is achieved under conditions compatible with the transcriptional activity of the control sequences.
In this context, the term "host cell" is used to refer to a cell which has been transformed, or is capable of being transformed with a nucleic acid sequence and then of expressing a selected gene of interest. The term includes the progeny of the parent cell, whether or not the progeny is identical in morphology or in genetic make-up to the original parent, so long as the selected gene is present.
In yet another embodiment the invention refers to a host cell comprising the B42 vector according to the invention. Preferably, the host cells are CHO cells, HEK-293T cells, HEK-293A cells, HeLa cells, E. coli cells, yeast cells or baculovirus infected insect cells.
A multi-cellular organism according to the invention is a naturally occurring or otherwise genetically modified organism, preferably an invertebrate or vertebrate organism, like Drosophila melanogaster, Caenorhabditis elegans, Xenopus laevis, Medaka or Zebrafish or Mus musculus, or an embryo thereof.
The invention also relates to a process for producing a B42 polypeptide as presented in SEQ ID No. 1, SEQ ID No. 6 or SEQ ID No. 7 or a polypeptide with at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % sequence identity to the polypeptide presented in SEQ ID No. 1, SEQ ID No. 6 or SEQ ID No. 7. The process comprises the steps of:
culturing the host cells or multi-cellular organisms according to the invention under conditions suitable for the expression of said B42 polypeptide and
recovering said B42 polypeptide from the cells or the culture supernatant.
Suitable conditions for culturing a prokaryotic or eukaryotic host are well known to a person skilled in the art. In general, the skilled person is also aware that these conditions may have to be adapted to the needs of the host and the requirements of the polypeptide expressed. In case an inducible promoter controls the nucleic acid of the invention in the vector present in the host cell, expression of the polypeptide can be induced by addition of an appropriate inducing agent. Suitable expression protocols and strategies are known to a person skilled in the art.
The proteins according to the invention are recovered from the cells or the culture supernatant by methods well-known in the art. These methods comprise without limitation method steps such as ion exchange chromatography, gel filtration chromatography (size exclusion chromatography), affinity chromatography, high pressure liquid chromatography (HPLC), reversed phase HPLC, disc gel electrophoresis or immunoprecipitation.
The B42 protein according to the invention represents a target protein for therapeutic and diagnostic applications related to immune related disorders.
Therefore, the invention further comprises an antibody, which specifically binds to a B42 protein according to the invention. In accordance with the invention the term "antibody" refers to an intact antibody, a binding fragment thereof or derivatives thereof that are capable of specifically binding to the protein according to the invention. Binding fragments and derivatives comprise the epitope-binding fragments of the intact antibody and include, but are not limited to, F(ab), F(ab'), F(ab')2, Fv, and single-chain antibodies.
In preferred embodiments the antibody is a monoclonal antibody, a polyclonal antibody, a humanized antibody or a single chain antibody.
For therapeutic or diagnostic applications the B42 protein according to the invention or an antagonist or agonist thereof or an antibody according to the invention is formulated in combination with a carrier. Thus, the invention refers to a composition of matter comprising a) a B42 protein according to the invention, b) an agonist of said B42 protein, c) an antagonist of said B42 protein or d) an antibody that specifically binds to said B42 protein in combination with a carrier. Suitable carriers are well known to a person skilled in the art. In preferred embodiments the carrier is a pharmaceutically acceptable carrier. By "pharmaceutically acceptable carrier" is meant a non-toxic solid, semisolid or liquid filler, diluent, encapsulating material or formulation auxiliary of any type. Examples of suitable pharmaceutically acceptable carriers are well known in the art and include phosphate buffered saline solutions, water, emulsions, such as oil/water emulsions, various types of wetting agents, sterile solutions, organic solvents including DMSO etc. Compositions comprising such carriers can be formulated by well known conventional methods.
An agonist as referred to herein includes any molecule that mimics a biological activity of the protein according to the invention.
An antagonist according to the invention includes any molecule that partially or fully blocks, inhibits, or neutralizes a biological activity of the protein according to the invention.
Suitable agonist molecules specifically include agonist antibodies or antibody fragments, fragments or amino acid sequence variants of polypeptides according to the invention, peptides, chemokine mutants or small organic molecules.
Suitable antagonist molecules specifically include antagonist antibodies or antibody fragments, chemokine mutants, fragments of chemokines, fragments of chemokine mutants, antisense oligonucleotides or small organic molecules.
Biological activity as used herein refers to a biological function, which may be either inhibitory or stimulatory, that is caused by a native or naturally occurring protein according to the invention and which is other than the ability to induce the production of an antibody against an antigenic epitope possessed by a native or naturally occurring protein according to the invention.
Antagonists or agonists include small molecules that bind to the active site, the receptor binding site, or growth factor or other relevant binding site of the protein according to the invention. Examples of small molecules include, but are not limited to, small peptides or peptide-like molecules, preferably soluble peptides, and synthetic non-peptidyl organic or inorganic compounds. A small molecule according to the invention has a molecular weight below about 500 Daltons.
In a further embodiment the invention relates to the use of a composition of matter according to the invention that comprises a B42 protein according to the invention, an agonist of said B42 protein, an antagonist of said B42 protein or an antibody that specifically binds to said B42 protein in combination with a carrier to treat an immune related disorder in a mammal or human. In that context a B42 protein according to the invention, an agonist thereof, an antagonist thereof or an antibody that specifically binds to said protein is capable of increasing the proliferation of T-lymphocytes in a mammal, inhibiting the proliferation of T-lymphocytes in a mammal, increasing infiltration of inflammatory cells into a tissue of a mammal, or decreasing the infiltration of inflammatory cells into a tissue of a mammal. In a further embodiment the composition of matter according to the invention comprises a therapeutically effective amount of a) a B42 protein according to the invention, b) an agonist of said B42 protein, c) an antagonist of said B42 protein or d) an antibody that specifically binds to said B42 protein.
The therapeutically effective amount for a given situation or a given immune related disorder will readily be determined by routine experimentation and is within the skills and judgment of the ordinary clinician or physician. Generally, the regimen as a regular administration of the pharmaceutical composition should be in the range of 1 μg to 5 g units per day. However, a preferred dosage might be in the range of 0.01 mg to 100 mg, preferably 0.01 mg to 50 mg and preferably 0.01 mg to 10 mg per day.
In that context the invention further comprises an article of manufacture, comprising a container, a label on said container, and a composition of matter comprising a) a B42 protein according to the invention, b) an agonist of said B42 protein, c) an antagonist of said B42 protein or d) an antibody that specifically binds to said B42 protein, contained within said container, wherein a label on said container indicates that said composition of matter can be used for treating an immune related disease.
In one embodiment the invention further comprises a method of treating an immune related disorder in a mammal or human in need thereof comprising administering to said mammal or human a therapeutically effective amount of a) a B42 protein according to the invention, b) an agonist of said B42 protein, c) an antagonist of said B42 protein or d) an antibody that specifically binds to said B42 protein.
Immune related disorders according to the invention include without limitation systemic lupus erythematosus, rheumatoid arthritis, osteoarthritis, juvenile chronic arthritis, a spondyloarthropathy, systemic sclerosis, an idiopathic inflammatory myopathy, Sjogren's syndrome, systemic vasculitis, sarcoidosis, autoimmune hemolytic anemia, autoimmune thrombocytopenia, thyroiditis, diabetes mellitus type I, immune-mediated renal disease, a demyelinating disease of the central or peripheral nervous system, idiopathic demyelinating polyneuropathy, Guillain-Barre syndrome, a chronic inflammatory demyelinating polyneuropathy, a hepatobiliary disease, infectious or autoimmune chronic active hepatitis, primary biliary cirrhosis, granulomatous hepatitis, sclerosing cholangitis, inflammatory bowel disease, gluten-sensitive enteropathy, Whipple's disease, an autoimmune or immune-mediated skin disease, a bullous skin disease, erythema multiforme, contact dermatitis, psoriasis, an allergic disease, asthma, allergic rhinitis, atopic dermatitis, food hypersensitivity, urticaria, an immunologic disease of the ovaries, an immunologic disease of the lung, eosinophilic pneumonia, idiopathic pulmonary fibrosis, hypersensitivity pneumonitis, a transplantation associated disease, graft rejection, graft-versus-host disease or AIDS. Preferably the immune related disorder is AIDS.
The B42 protein according to the invention is a target for diagnostic applications, e.g. for diagnosis of an immune related disorder in a patient.
In a further aspect the invention refers to a method for determining the presence of a B42 protein according to the invention or a naturally occurring B42 protein with an amino acid sequence according to SEQ ID No. 1, SEQ ID No. 6 or SEQ ID No. 7 in a test sample suspected of containing said B42 protein. Said method according to the invention comprises the steps of a) exposing said sample to an antibody according to the invention that specifically binds to said B42 polypeptide and b) determining binding of said antibody by detecting the formation of a complex between the antibody and the B42 protein according to the invention.
The detection is qualitative or quantitative, and is preferentially performed in comparison with monitoring the complex formation in a control sample. A larger quantity of complexes formed in the test sample indicates the presence or absence of an immune disease in the mammal from which the test tissue cells were obtained. The antibody preferably carries a detectable label. Complex formation can be monitored, for example, by light microscopy, flow cytometry, fluorimetry, ELISA, immunohistochemistry or other techniques known in the art. The detection methods of antibody binding are well known to an ordinary person skilled in the art.
Further aspects of the invention refer to methods for diagnosis of immune related diseases in a mammal or human.
In one aspect the invention is related to a method of diagnosing an immune related disease in a mammal, said method comprising detecting the level of expression of a gene encoding the B42 amino acid sequence shown in SEQ ID No. 1, SEQ ID No. 6 or SEQ ID No. 7, in a test sample of tissue cells obtained from the mammal or human, and in a control sample of known normal tissue cells of the same cell type, wherein a higher or lower level of expression of said gene in the test sample as compared to the control sample is indicative of the presence of an immune related disease in the mammal from which the test tissue cells were obtained.
In another aspect the invention relates to a method of diagnosing an immune related disease in a mammal or human, said method comprising
a) contacting an antibody according to the invention, which specifically binds to a B42 protein according to the invention, with a test sample of tissue cells obtained from said mammal or human and
b) detecting the formation of a complex between the antibody and a naturally occurring B42 polypeptide with an amino acid sequence according to SEQ ID No. 1, SEQ ID No. 6 or SEQ ID No. 7 in the test sample,
wherein formation of said complex is indicative of the presence of an immune related disease in the mammal from which the test tissue cells were obtained.
Further aspects of the invention relate to methods of identifying a compound that modulates the activity of a B42 protein according to the invention, said method comprising contacting cells which normally respond to said protein with said B42 protein and a candidate compound, and determining the lack of responsiveness by said cells to said B42 protein.
By modulation of protein activity in that context is meant a compound that either stimulates or inhibits the biological activity of the said protein.
In further aspects the invention encompasses methods of screening compounds to identify those that mimic the B42 protein of the invention (agonists) or prevent the effect of said B42 protein (antagonists). Methods for identifying agonists or antagonists of a protein according to the invention may comprise contacting said protein with a candidate agonist or antagonist molecule and measuring a detectable change in one or more biological activities normally associated with said protein.
To assay for antagonists, the protein according to the invention is preferably added to a cell along with the compound to be screened for a particular activity and the ability of the compound to inhibit the activity of interest in the presence of said protein indicates that the compound is an antagonist to said protein.
In an alternative screening method antagonists to the protein according to the invention are detected by combining said protein and a potential antagonist with membrane-bound receptors binding to a protein according to the invention or recombinant receptors under appropriate conditions for a competitive inhibition assay. The protein according to the invention can be labeled, such as by radioactivity, such that the number of said protein molecules bound to the receptor can be used to determine the effectiveness of the potential antagonist.
In a further aspect, the invention concerns a method of identifying agonists of or antagonists to a B42 protein according to the invention which comprises contacting said protein with a candidate molecule and monitoring a biological activity mediated by said protein. In a specific aspect, the agonist or antagonist is an anti-B42 antibody.
Assays for identification of antagonists or agonists are designed to identify compounds that bind or complex with the proteins according to the invention, or otherwise modulate the interaction of the encoded proteins with other cellular proteins. Such screening assays will include assays amenable to high-throughput screening of chemical libraries, making them particularly suitable for identifying small molecule candidates for agonists or antagonists. Small molecules contemplated include synthetic organic or inorganic compounds. The assays can be performed in a variety of formats, including protein-protein binding assays, biochemical screening assays, immunoassays and cell based assays, which are well characterized in the art.
Another aspect of the invention refers to a method of identifying a compound that inhibits the expression of a gene encoding the B42 amino acid sequence shown in SEQ ID No. 1, SEQ ID No. 6 or SEQ ID No. 7, said method comprising contacting cells which normally express said B42 protein with a candidate compound, and determining the lack of expression of said gene. The decrease in B42 gene expression can be detected through measurement of the B42 RNA level in the cell, or through measurement of the B42 protein level in the cell. In a preferred embodiment the candidate compound is an antisense nucleic acid.
The object is also solved by providing a novel protein, named and further referred to herein as "N73", with structural similarity to proteins of the IL-8-like chemokine fold, in particular to the known chemokines vMIP-II and vMIP-I, which is characterized by its amino acid sequence as presented in SEQ ID No. 9 or SEQ ID No. 10, and its nucleic acid sequence as presented in SEQ ID No. 11 or SEQ ID No. 12.
The protein according to SEQ ID No. 10 corresponds to the protein according to SEQ ID No. 9 additionally comprising its associated signal peptide. The nucleic acid sequence according to SEQ ID No. 11 corresponds to the protein coding region, whereas the nucleic acid sequence according to SEQ ID No. 12 additionally comprises regions of the full spliced gene transcript.
In one aspect the invention relates to an isolated N73 protein having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % amino acid sequence identity to the amino acid sequence of the N73 protein shown in SEQ ID No. 9, SEQ ID No. 13 or SEQ ID No. 14. Preferably, the invention relates to an isolated N73 protein comprising an amino acid sequence that is identical to one of the amino acid sequences presented in SEQ ID No. 9, SEQ ID No. 13 and SEQ ID No. 14.
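The identity thresholds recited above (80 %, 90 %, 95 %, 99 %) can be illustrated with a minimal sketch. It assumes that the two sequences have already been aligned by some external tool and that identity is calculated as identical columns divided by the total number of aligned columns; this passage does not fix a particular alignment method or denominator, so both are assumptions. The function names and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumption: the denominator is the number of alignment columns,
    skipping columns in which both sequences carry a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no residue in either sequence
        compared += 1
        if a == b:
            matches += 1  # identical residues (a gap never equals a residue)
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_threshold(identity: float, threshold: float = 80.0) -> bool:
    """True if the identity reaches the given threshold (e.g. 80, 90, 95, 99)."""
    return identity >= threshold


if __name__ == "__main__":
    # Toy aligned sequences, purely illustrative.
    identity = percent_identity("MKCC-LAV", "MKCCGLSV")
    print(identity, meets_threshold(identity, 80.0))
```

Other conventions, such as dividing by the length of the shorter sequence, give different percentages for the same alignment, so the convention should be stated whenever such a value is reported.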
The N73 protein according to the invention contains at least 2, preferably at least 6, more preferably 7, 8 or 12 cysteine residues.
The amino acid sequence of the N73 protein according to the invention possesses two adjacent cysteine residues (CC-motif), which are preferably located at position 54 and 55 in the amino acid sequence presented in SEQ ID No. 9.
The N73 protein according to the invention was identified by fold recognition analysis and shows structural similarity with proteins belonging to the IL-8-like chemokine fold family, in particular with vMIP-II and vMIP-I.
Using the methodology of the above described study on which the invention is based, the protein according to the invention, also referred to herein as "N73", was identified to have an IL-8-like three-dimensional structure, and therefore to be a novel member of the chemokine family. Furthermore, N73 exhibits a strong structural similarity to vMIP-II and vMIP-I, members of the IL-8-like chemokine fold family, which are involved in blockade of viral infiltration into host cells. The sequence identity between N73 and vMIP-II is 23.2 %. The sequence identity between N73 and vMIP-I is 18.7 %. N73 shares 17.2 % sequence identity with IL-8.
The amino acid sequence of N73 comprises 208 amino acids, of which 12 are cysteines. The amino acid sequence includes a CC motif. The full N73 amino acid sequence is presented in SEQ ID No. 10 and comprises a signal peptide, which consists of 23 amino acids. The signal peptide comprises 4 cysteine residues. The N73 amino acid sequence without signal peptide is presented in SEQ ID No. 9.
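The figures given above imply a mature protein of 208 - 23 = 185 amino acids containing 12 - 4 = 8 cysteine residues. The following minimal sketch shows how such a check could be carried out on any sequence: it removes a signal peptide of known length, counts cysteines and reports the positions of adjacent cysteine pairs (CC motif). The toy sequence and all function names are hypothetical; the actual sequences of SEQ ID No. 9 and SEQ ID No. 10 are not reproduced here.

```python
def split_signal_peptide(full_seq: str, signal_len: int) -> tuple[str, str]:
    """Split a precursor sequence into (signal peptide, mature protein)."""
    if not 0 <= signal_len <= len(full_seq):
        raise ValueError("signal peptide length out of range")
    return full_seq[:signal_len], full_seq[signal_len:]


def count_cysteines(seq: str) -> int:
    """Number of cysteine residues (one-letter code 'C')."""
    return seq.upper().count("C")


def cc_motif_positions(seq: str) -> list[int]:
    """1-based positions of the first cysteine of every adjacent CC pair."""
    s = seq.upper()
    return [i + 1 for i in range(len(s) - 1) if s[i:i + 2] == "CC"]


if __name__ == "__main__":
    # Arithmetic implied by the figures stated in the description.
    print(208 - 23, 12 - 4)

    # Toy precursor with an assumed 6-residue signal peptide (illustrative only).
    toy_full = "MCLACL" + "KPTCCFSYCRQLC"
    signal, mature = split_signal_peptide(toy_full, 6)
    print(count_cysteines(signal), count_cysteines(mature))
    print(cc_motif_positions(mature))
```

Positions are counted on the mature sequence, in line with the convention of counting amino acids without the signal peptide.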
The nucleic acid sequence of N73 was first identified as a gene related to cancer development and progression (Wan 2004) in a large-scale cDNA transfection screening.
Therefore, the amino acid sequence of the N73 protein according to the invention comprises at least two cysteine residues, preferably 7 cysteine residues, more preferably 8 cysteine residues, more preferably 12 cysteine residues. Preferably, six cysteine residues are involved in formation of intramolecular disulfide bonds.
Further embodiments of the invention are chimeric molecules which comprise a N73 polypeptide according to the invention fused to a heterologous amino acid sequence. Preferably this heterologous amino acid sequence is an epitope tag sequence or an Fc region of an immunoglobulin.
The invention further comprises proteins that are circular permutations of the N73 protein according to the invention, further referred to as circularly permutated variants. Preferably the invention comprises circularly permutated variants of the N73 protein with the amino acid sequence according to SEQ ID No. 9, SEQ ID No. 13 or SEQ ID No. 14 (or a protein having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % amino acid sequence identity to the amino acid sequence according to SEQ ID No. 9, SEQ ID No. 13 or SEQ ID No. 14). By circular permutation in accordance with the invention is meant a genetic operation in which part of the C-terminus of a protein is moved to its N-terminus while maintaining the three-dimensional structure and biological function of said protein. A circular permutation is built from one or more sets of structural elements in cyclic order. An engineered circular permutation of a protein can be envisioned as the result of taking a linear protein sequence, joining its ends to create a circle and then cleaving the circle at another site to generate a new linear monomeric protein sequence, only altered at the sites of joining and cleavage and with a resulting different sequential order of the secondary structure elements but the same 3D structure. This approach of engineering a new protein sequence by circular permutation was pioneered by Goldenberg in 1983 (Goldenberg 1983). The newly engineered protein maintains the original function either totally (preferred) or partially. The N73 protein according to the invention preferably contains loops (amino acids counted without signal peptide) at the following amino acid positions: amino acids 8 - 20 (N-loop), amino acids 74 - 82 (30S-loop), amino acids 86 - 99 (40S-loop), amino acids 103 - 107 (50S-loop).
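The join-and-cleave description of circular permutation given above can be sketched at sequence level as follows. This is only an illustration under stated assumptions: the optional linker between the original termini is a hypothetical parameter that the description does not specify, the function names are hypothetical, and the toy sequence is not part of the sequence listing. The sketch operates on the sequence alone and says nothing about whether a given variant retains its fold or function.

```python
def circular_permutation(seq: str, cleavage_site: int, linker: str = "") -> str:
    """Return a circularly permuted sequence.

    The original C-terminus is joined to the original N-terminus (optionally
    through a linker, which is an assumption of this sketch) and the circle is
    re-opened after `cleavage_site` residues of the original sequence, so the
    new sequence starts at original residue `cleavage_site + 1`.
    """
    if not 0 < cleavage_site < len(seq):
        raise ValueError("cleavage site must lie strictly inside the sequence")
    # The C-terminal part moves to the front, the N-terminal part to the back.
    return seq[cleavage_site:] + linker + seq[:cleavage_site]


def is_circular_permutation(original: str, candidate: str) -> bool:
    """True if `candidate` is a linker-free rotation of `original`."""
    return len(original) == len(candidate) and candidate in (original + original)


if __name__ == "__main__":
    toy = "ABCDEFGH"  # toy sequence, illustrative only
    permuted = circular_permutation(toy, 3)
    print(permuted, is_circular_permutation(toy, permuted))
    print(circular_permutation(toy, 3, linker="GG"))
```

In practice the cleavage site would be chosen inside a loop region rather than inside a secondary structure element, so that the structural elements are reordered without being interrupted.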
Circularly permutated variants of a N73 protein according to the invention preferably comprise the amino acid sequence according to SEQ ID No. 9, SEQ ID No. 13 or SEQ ID No. 14 (or a protein having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % amino acid sequence identity to the amino acid sequence according to SEQ ID No. 9, SEQ ID No. 13 or SEQ ID No. 14) in rearranged order. Preferably structural elements of circular permutated variants of the N73 protein according to the invention comprise the following amino acid sequences (each one comprising at least 80%, preferably 90%, more preferably 95%, even more preferably 99% amino acid sequence identity to the respective amino acid sequence according to SEQ ID No. 9).
a) beta sheet according to amino acids 83 to 85 in SEQ ID No. 9 (S2), beta sheet according to amino acids 100 to 102 in SEQ ID No. 9 (S3), alpha helix according to amino acids 67 to at least 72 in SEQ ID No. 9, beta sheet according to amino acids 66 to 73 in SEQ ID No. 9 (S1);
b) beta sheet according to amino acids 100 to 102 in SEQ ID No. 9 (S3), alpha helix according to amino acids 67 to at least 72 in SEQ ID No. 9, beta sheet according to amino acids 66 to 73 in SEQ ID No. 9 (S1), beta sheet according to amino acids 83 to 85 in SEQ ID No. 9 (S2);
c) alpha helix according to amino acids 67 to at least 72 in SEQ ID No. 9, beta sheet according to amino acids 83 to 85 in SEQ ID No. 9 (S1), beta sheet according to amino acids 49 to 51 in SEQ ID No. 9 (S2), beta sheet according to amino acids 100 to 102 in SEQ ID No. 9 (S3).
Preferably, for receptor activation an additional extension comprising an amino acid sequence from the original N-terminus is appended to the N-terminus of a circularly permutated variant according to the invention. Otherwise, only binding ability (inhibition) is maintained, whereas signaling ability is not.
A circularly permutated variant according to a structure described in b) is preferably an inhibitor of the natural receptor. Said variant is able to bind the receptor but does not have the flexible N-terminus at the right position to induce receptor signaling.
In another aspect the present invention relates to an isolated nucleic acid encoding for a N73 polypeptide according to SEQ ID No. 9 or SEQ ID No. 10 (herein also referred to as "N73 nucleic acid"). Preferably, the invention relates to an isolated N73 nucleic acid having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % nucleic acid sequence identity to a nucleotide sequence encoding the N73 protein shown in SEQ ID No. 9 or SEQ ID No. 10. In a preferred embodiment the isolated nucleic acid has at least 80 %, preferably at least 90 %, more preferably at least 95 %, even more preferably at least 99 % nucleic acid sequence identity to the nucleotide sequence presented in SEQ ID No. 11 or to the nucleotide sequence presented in SEQ ID No. 12.
According to the invention, nucleic acids as referred to herein means single-stranded or double-stranded nucleic acid polymers of at least 10 nucleotides in length. In certain embodiments, the nucleotides comprising the nucleic acids can be ribonucleotides or deoxyribonucleotides or a modified form of either type of nucleotide. Said modifications include base modifications such as bromouridine, ribose modifications such as arabinoside and 2',3'-dideoxyribose and internucleotide linkage modifications such as phosphorothioate, phosphorodithioate, phosphoroselenoate, phosphorodiselenoate, phosphoroanilothioate, phosphoraniladate and phosphoroamidate. The term "nucleic acid" specifically includes single and double stranded forms of DNA. Further aspects of the invention are directed to an isolated N73 nucleic acid having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % nucleic acid sequence identity to a nucleotide sequence comprising the coding sequence of the full length N73 polypeptide cDNA as presented in SEQ ID No. 11 or the full transcript sequence as presented in SEQ ID No. 12.
Another aspect of the invention relates to a vector comprising a nucleic acid according to the invention (herein also referred to as "N73 vector"). In one aspect, the N73 vector according to the invention comprises a nucleic acid having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % nucleic acid sequence identity to a nucleotide sequence encoding the N73 protein shown in SEQ ID No. 9, SEQ ID No. 10, SEQ ID No. 13 or SEQ ID No. 14. In another aspect, the N73 vector according to the invention comprises a nucleic acid having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % nucleic acid sequence identity to the nucleotide sequence shown in SEQ ID No. 11. In yet another aspect, the N73 vector according to the invention comprises a nucleic acid having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % nucleic acid sequence identity to the nucleotide sequence shown in SEQ ID No. 12.
In another embodiment of the invention the N73 vector comprising the nucleic acid is operably linked to control sequences recognized by a host cell transformed with the vector.
In yet another embodiment the invention refers to a host cell comprising the N73 vector according to the invention. Preferably, the host cells are CHO cells, HEK-293T cells, HEK-293A cells, HeLa cells, E. coli cells, yeast cells or baculovirus infected insect cells.
The invention also relates to a process for producing a N73 polypeptide as presented in SEQ ID No. 9, SEQ ID No. 10, SEQ ID No. 13 or SEQ ID No. 14 or a polypeptide with at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % sequence identity to the polypeptide presented in SEQ ID No. 9, SEQ ID No. 10, SEQ ID No. 13 or SEQ ID No. 14. The process comprises the steps of:
culturing the host cells or multi-cellular organisms according to the invention under conditions suitable for the expression of said N73 polypeptide and
recovering said N73 polypeptide from the cells or the culture supernatant.
The N73 protein according to the invention represents a target protein for therapeutic and diagnostic applications related to immune related disorders or cancer.
Therefore, the invention further comprises an antibody, which specifically binds to a N73 protein according to the invention. In preferred embodiments the antibody is a monoclonal antibody, a polyclonal antibody, a humanized antibody or a single chain antibody.
For therapeutic or diagnostic applications the N73 protein according to the invention or an antagonist or agonist thereof or an antibody according to the invention is formulated in combination with a carrier. Thus, the invention refers to a composition of matter comprising a) a N73 protein according to the invention, b) an agonist of said N73 protein, c) an antagonist of said N73 protein or d) an antibody that specifically binds to said N73 protein in combination with a carrier.
In a further embodiment the invention relates to the use of a composition of matter that comprises a) a N73 protein according to the invention, b) an agonist of said N73 protein, c) an antagonist of said N73 protein or d) an antibody that specifically binds to said N73 protein in combination with a carrier to treat a disease in a mammal or human.
Preferably, diseases treated with a composition of matter comprising a) a N73 protein according to the invention, b) an agonist of said N73 protein, c) an antagonist of said N73 protein or d) an antibody that specifically binds to said N73 protein in combination with a carrier are immune related disorders, preferably the aforementioned immune related disorders. Preferably the immune related disorder is AIDS.
In that context a N73 protein according to the invention, an agonist thereof, an antagonist thereof or an antibody that specifically binds to said N73 protein is capable of increasing the proliferation of T-lymphocytes in a mammal, inhibiting the proliferation of T-lymphocytes in a mammal, increasing infiltration of inflammatory cells into a tissue of a mammal, or decreasing the infiltration of inflammatory cells into a tissue of a mammal.
In another embodiment, said diseases treated by a composition of matter comprising a) a N73 protein according to the invention, b) an agonist of said N73 protein, c) an antagonist of said N73 protein or d) an antibody that specifically binds to said N73 protein in combination with a carrier are malignant neoplasms, such as tumor diseases or other types of cancer.
In a further embodiment the composition comprising a) a N73 protein according to the invention, b) an agonist of said N73 protein, c) an antagonist of said N73 protein or d) an antibody that specifically binds to said N73 protein in combination with a carrier comprises a therapeutically effective amount of a) a protein according to the invention, b) an agonist of said protein, c) an antagonist of said protein or d) an antibody that specifically binds to said protein.
The therapeutically effective amount for a given situation or a given disease will readily be determined by routine experimentation and is within the skills and judgment of the ordinary clinician or physician. Generally, the regimen as a regular administration of the pharmaceutical composition should be in the range of 1 μg to 5 g units per day. However, a preferred dosage might be in the range of 0.01 mg to 100 mg, preferably 0.01 mg to 50 mg and preferably 0.01 mg to 10 mg per day.
In that context the invention further comprises an article of manufacture, comprising a container, a label on said container, and a composition of matter comprising a) a N73 protein according to the invention, b) an agonist of said N73 protein, c) an antagonist of said N73 protein or d) an antibody that specifically binds to said N73 protein, contained within said container, wherein a label on said container indicates that said composition of matter can be used for treating a disease. Preferred diseases treated are immune related disorders or malignant neoplasm.
In one embodiment the invention further comprises a method of treating a disease in a mammal or human in need thereof comprising administering to said mammal or human a therapeutically effective amount of a) a N73 protein according to the invention, b) an agonist of said N73 protein, c) an antagonist of said N73 protein or d) an antibody that specifically binds to said N73 protein. Diseases treated with a method according to the invention include the above-mentioned diseases. Preferably, the treated disease is AIDS.
The N73 protein according to the invention is a target for diagnostic applications, e.g. for diagnosis of a disease in a patient, like immune related disorders or cancer.
In a further aspect the invention refers to a method for determining the presence of a N73 protein according to the invention in a test sample suspected of containing said N73 protein. Said method according to the invention comprises the steps of a) exposing said sample to an antibody according to the invention, that specifically binds to said N73 protein and b) determining binding of said anti-N73 antibody by detecting the formation of a complex between the antibody and the N73 protein according to the invention.
The detection is qualitative or quantitative, and is preferentially performed in comparison with monitoring the complex formation in a control sample. A larger quantity of complexes formed in the test sample indicates the presence or absence of a disease, like immune related disorders or cancer, in the mammal from which the test tissue cells were obtained. The anti-N73 antibody preferably carries a detectable label. Complex formation can be monitored, for example, by light microscopy, flow cytometry, fluorimetry, ELISA, immunohistochemistry or other techniques known in the art. The detection methods of antibody binding are well known to an ordinary person skilled in the art.
Further aspects of the invention refer to methods for diagnosis of a disease in a mammal or human. Diseases to be diagnosed with a method according to the invention include all above-mentioned diseases. Preferably, diseases diagnosed are cancer diseases. More preferably, the method of diagnosis according to the invention is used to diagnose immune related disorders, preferably above mentioned immune related disorders, more preferably AIDS.
One aspect of the invention is related to a method of diagnosing a disease, preferably an immune related disorder or cancer as mentioned above, in a mammal, said method comprising detecting the level of expression of a gene encoding the N73 amino acid sequence shown in SEQ ID No. 9, SEQ ID No. 10, SEQ ID No. 13 or SEQ ID No. 14 in a test sample of tissue cells obtained from the mammal or human, and in a control sample of known normal tissue cells of the same cell type. A higher or lower level of expression of said gene in the test sample as compared to the control sample is indicative of the presence of said disease in the mammal from which the test tissue cells were obtained. Another embodiment of the invention relates to a method of diagnosing a disease as mentioned above for N73 in a mammal or human, said method comprising
a) contacting an antibody according to the invention which specifically binds to a N73 protein according to the invention with a test sample of tissue cells obtained from said mammal or human and b) detecting the formation of a complex between the antibody and the N73 protein in the test sample, wherein formation of said complex is indicative of the presence of an immune related disease in the mammal from which the test tissue cells were obtained.
Further aspects of the invention relate to methods of identifying a compound that modulates the activity of a N73 protein according to the invention, said method comprising contacting cells which normally respond to said N73 protein with said N73 protein and a candidate compound, and determining the lack of responsiveness by said cell to a).
In further aspects the invention encompasses methods of screening compounds to identify those that mimic the N73 protein of the invention (agonists) or prevent the effect of said N73 protein (antagonists). Methods for identifying agonists or antagonists of a protein according to the invention may comprise contacting said protein with a candidate agonist or antagonist molecule and measuring a detectable change in one or more biological activities normally associated with said protein.
To assay for antagonists, the protein according to the invention is preferably added to a cell along with the compound to be screened for a particular activity and the ability of the compound to inhibit the activity of interest in the presence of said protein indicates that the compound is an antagonist to said protein.
In an alternative screening method antagonists to the protein according to the invention are detected by combining said protein and a potential antagonist with membrane-bound N73 protein receptors or recombinant receptors under appropriate conditions for a competitive inhibition assay. The protein according to the invention can be labeled, such as by radioactivity, such that the number of said protein molecules bound to the receptor can be used to determine the effectiveness of the potential antagonist.
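The read-out of such a competitive inhibition assay can be sketched as a simple calculation. The following is a minimal illustration only, assuming that labeled-protein binding is measured as a signal (e.g. radioactivity counts) with and without the potential antagonist and that a nonspecific background is subtracted; the function name, parameters and numeric values are hypothetical and are not part of the described method.

```python
def percent_inhibition(bound_with_candidate: float,
                       bound_without_candidate: float,
                       nonspecific: float = 0.0) -> float:
    """Percent reduction of specific labeled-protein binding caused by a candidate.

    All arguments are signal values in the same unit (e.g. counts per minute).
    `nonspecific` is an assumed background signal subtracted from both readings.
    """
    specific_total = bound_without_candidate - nonspecific
    if specific_total <= 0:
        raise ValueError("no specific binding measured without candidate")
    specific_with = bound_with_candidate - nonspecific
    return 100.0 * (1.0 - specific_with / specific_total)


# Hypothetical readings: 1000 counts without candidate, 550 with candidate,
# 100 counts of nonspecific background.
inhibition = percent_inhibition(550.0, 1000.0, nonspecific=100.0)
print(f"{inhibition:.1f} % inhibition")
```

A higher percentage would correspond to a more effective potential antagonist under these assumptions.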
In a further aspect, the invention concerns a method of identifying agonists of or antagonists to a protein according to the invention which comprises contacting said protein with a candidate molecule and monitoring a biological activity mediated by said protein. In a specific aspect, the agonist or antagonist is an anti-N73 antibody.
Another aspect of the invention refers to a method of identifying a compound that inhibits the expression of a gene encoding the N73 amino acid sequence shown in SEQ ID No. 9, SEQ ID No. 10, SEQ ID No. 13 or SEQ ID No. 14, said method comprising contacting cells which normally express said N73 protein with a candidate compound, and determining the lack of expression of said gene. The decrease in N73 gene expression can be detected through measurement of the N73 RNA level in the cell, or through measurement of the N73 protein level in the cell. In a preferred embodiment the candidate compound is an antisense nucleic acid.
The object is also solved by providing a protein, named and further referred to herein as "F10", with structural similarity to proteins of the IL-8 like chemokine fold, in particular to the known chemokines viral MIP-I (vMIP-I) and CXCL10, which is characterized by its amino acid sequence as presented in SEQ ID No. 16, SEQ ID No. 20, SEQ ID No. 23, SEQ ID No. 24, SEQ ID No. 25, SEQ ID No. 26, SEQ ID No. 27, SEQ ID No. 28, SEQ ID No. 29, SEQ ID No. 30, SEQ ID No. 31, SEQ ID No. 32, SEQ ID No. 33, SEQ ID No. 34, SEQ ID No. 35, SEQ ID No. 36 or SEQ ID No. 37.
In one aspect the invention relates to an isolated protein having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % amino acid sequence identity to the amino acid sequence of the protein shown in SEQ ID No. 16, SEQ ID No. 20, SEQ ID No. 23, SEQ ID No. 24, SEQ ID No. 25, SEQ ID No. 26, SEQ ID No. 27, SEQ ID No. 28, SEQ ID No. 29, SEQ ID No. 30, SEQ ID No. 31, SEQ ID No. 32, SEQ ID No. 33, SEQ ID No. 34, SEQ ID No. 35, SEQ ID No. 36 or SEQ ID No. 37, further referred to herein as "F10".
Preferably the invention relates to an isolated F10 protein having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % amino acid sequence identity to the amino acid sequence of the protein shown in SEQ ID No. 16. Preferably, said F10 protein comprises cysteine residues at position 20 and 47, which preferentially form a disulfide bridge. In a preferred embodiment, said F10 protein further comprises a proline residue at position 30 of SEQ ID No. 16.
The F10 protein according to SEQ ID No. 17 corresponds to the F10 protein according to SEQ ID No. 16 additionally comprising its associated signal peptide, which is cleaved during protein release from the cells. The nucleic acid sequence according to SEQ ID No. 18 corresponds to the F10 protein coding region, whereas the nucleic acid sequence according to SEQ ID No. 19 additionally comprises regions of the full spliced gene transcript.
Another aspect of the invention relates to an isolated F10 protein having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % amino acid sequence identity to the amino acid sequence of the protein shown in SEQ ID No. 17.
The F10 protein according to the invention contains at least 2, preferably at least 4, more preferably 5 cysteine residues.
The F10 protein according to the invention was identified by fold recognition analysis and shows structural similarity with proteins belonging to the IL-8-like chemokine fold family, in particular with CXCL10. Using the methodology as described above, the protein according to the invention, referred herein also as "F10" or "F10 protein", was identified to have an IL-8-like three-dimensional structure, and therefore to be a novel member of the chemokine family. Furthermore, F10 exhibits a strong structural similarity to CXCL10, a member of the IL-8-like chemokine fold family, which is an inhibitor of angiogenesis. The sequence identity between F10 and vMIP-I is 20 %. The sequence identity between F10 and CXCL10 is 17.8 %. F10 shares 16.7 % sequence identity with IL-8.
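Percent sequence identity values such as those stated above are derived from a pairwise alignment. The following is a minimal sketch of one common convention (identical residues divided by the number of aligned, ungapped columns); the convention actually used for the stated values is not given here, and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns in which neither sequence has a gap.

    Both inputs must already be aligned (equal length, '-' marks a gap).
    Other conventions (e.g. dividing by the shorter sequence length) exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, for illustration only.
toy_a = "ACD-EFGHIK"
toy_b = "ACDWEF-HLK"
print(percent_identity(toy_a, toy_b))
```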
The amino acid sequence of F10 is also known as C4orf26 and comprises 130 amino acids, of which 5 are cysteines that could potentially be involved in disulfide bond formation. The full F10 amino acid sequence is presented in SEQ ID No. 17 and comprises a signal peptide which consists of 23 amino acids. The signal peptide comprises 2 cysteine residues. The F10 amino acid sequence without signal peptide is presented in SEQ ID No. 16.
Orthologues of the human F10 protein could be found in other mammals comprising the following amino acid sequences: flying fox SEQ ID No. 23, dog SEQ ID No. 24, microbat SEQ ID No. 25, hyrax SEQ ID No. 26, tenrec SEQ ID No. 27, bushbaby SEQ ID No. 28, alpaca SEQ ID No. 29, dolphin SEQ ID No. 30, chimpanzee SEQ ID No. 31, orangutan SEQ ID No. 32, mouse lemur SEQ ID No. 33, tree shrew SEQ ID No. 34, horse SEQ ID No. 35, squirrel SEQ ID No. 36, rabbit SEQ ID No. 37.
The human F10 nucleotide sequence was first found by a large-scale cDNA identification within the Mammalian Gene Collection (MGC) project (Gerhard 2004). Whether the nucleotide sequence was protein coding, and which structural and functional characteristics the gene product would have, was unknown at that point.
Gene localization analyses revealed that the open reading frame is located on chromosome 4q21.1 in very close proximity to the chemokine mini cluster with CXCL9, CXCL10 and CXCL11. These chemokines all bind the same receptors: CXCR3, which is expressed by TH1 cells and NK cells and at which they act as agonists with angiostatic function, and CCR3, at which they act as antagonists/inhibitors. Thus, a putative signaling function of the protein according to the invention is via said receptors, CXCR3 and/or CCR3.
Chemokine binding to CXCR3 induces a variety of cellular responses, such as integrin activation, cytoskeletal changes and chemotactic migration. CXCR3 has been implicated in diseases, such as atherosclerosis, multiple sclerosis, pulmonary fibrosis, type 1 diabetes, autoimmune myasthenia gravis, nephrotoxic nephritis, acute cardiac allograft rejection and Celiac Disease. CCR3 is highly expressed in eosinophils and basophils. It can also be detected in TH1 and TH2 cells, as well as in airway epithelial cells. This receptor may contribute to the accumulation and activation of eosinophils and other inflammatory cells in the allergic airway. It is also known to be an entry co-receptor for HIV-1.
The analysis of the F10 candidate sequence revealed several chemokine-like properties, such as the presence of a signal peptide for secretion and chromosomal proximity to the angiostatic chemokines CXCL9 to CXCL11. In addition, the formation of one disulfide bond is possible (classifying it as a C motif-containing chemokine). Two F10 isoforms are expressed, which have 2 and 3 exons, like most chemokines. The F10 amino acid sequence is conserved across many mammalian species.
Isoform A corresponds to the F10 amino acid sequence as presented in SEQ ID No. 17 (with signal peptide), which is encoded by a nucleic acid as presented in SEQ ID No. 18 (protein coding region only) or SEQ ID No. 19 (full spliced gene transcript region). Isoform B corresponds to the F10 amino acid sequence as presented in SEQ ID No. 20 (with signal peptide), which is encoded by a nucleic acid as presented in SEQ ID No. 21 (protein coding region only) or SEQ ID No. 20 (full spliced gene transcript region).
F10 (C4orf26) was used as one of 51 marker genes to correctly distinguish between high-risk and low-risk lymph-node-negative breast cancer patients (Karlsson 2008). Significantly lower expression of F10 was found in the tumors from deceased patients compared to the tumors from 10-year survivors. Moreover, gene expression of F10 was found to be 17-fold upregulated in response to cyclic mechanical stress in trabecular meshwork cells (Luna 2009). This was accompanied by changes in genes related to cytoskeleton and cell adhesion, as well as in genes involved in protection against stress, such as heat shock proteins and the chemokine CCL7.
In an RNAi screen, Kittler et al. (Kittler 2007) showed a cell cycle arrest / cell division defect phenotype similar to that of known angiostatic chemokines (CXCL10, CXCL13, CXCL14), suggesting a similar role of F10 in cancer. This is further substantiated by the inventors' finding that F10 presents most of the necessary CXCR3 receptor binding residues known from CXCL10, both in sequence and in three-dimensional structure. As F10 has been found to be expressed mainly in placenta, it could be a placenta-specific CXCL10 analog performing similar functional roles by binding to CXCR3, which is also known to be expressed in placenta.
An amino acid sequence with 99% sequence identity to SEQ ID No. 16 has been annotated previously and is included in EP 2107127-A1. There, the amino acid sequence is included in a diagnostic screening assay for somatic and ovarian cancers with a variety of other amino acid sequences. Structural and functional information about the previously known sequence are not provided in EP 2107127-A1.
The F10 protein according to the invention differs from the previously known protein in a proline residue at position 30 of SEQ ID No. 16. The previously known sequence comprises a leucine residue at this position. Thus, the F10 protein according to the invention preferably comprises a proline residue at this position (position 30 of SEQ ID No. 16). More preferably, the F10 protein according to the invention does not comprise a leucine residue at position 30 of SEQ ID No. 16.
The F10 protein according to the invention in particular possesses a high structural homology to vMIP-I and CXCL10.
Therefore, the amino acid sequence of the F10 protein according to the invention comprises at least two cysteine residues, preferably three cysteine residues, more preferably five cysteine residues. In an F10 protein according to the invention preferably two cysteine residues are involved in formation of intramolecular disulfide bonds.
The IL-8-like chemokine fold more specifically comprises an N-terminal loop (N-loop) followed by a single-turn helix or a 3₁₀-helix (3-turn helix), three anti-parallel β-strands and a C-terminal α-helix (4-turn helix). These helices and strands are connected by turns, preferably the so-called 30s, 40s and 50s loops.
Further embodiments of the invention are also chimeric molecules which comprise an F10 protein according to the invention fused to a heterologous amino acid sequence. Preferably this heterologous amino acid sequence is an epitope tag sequence or an Fc region of an immunoglobulin.
The invention further comprises proteins that are circular permutations of the F10 protein according to the invention, further referred to as circularly permutated variants. Preferably the invention comprises circularly permutated variants of the protein with the amino acid sequence according to SEQ ID No. 16, SEQ ID No. 17, SEQ ID No. 20, SEQ ID No. 23, SEQ ID No. 24, SEQ ID No. 25, SEQ ID No. 26, SEQ ID No. 27, SEQ ID No. 28, SEQ ID No. 29, SEQ ID No. 30, SEQ ID No. 31, SEQ ID No. 32, SEQ ID No. 33, SEQ ID No. 34, SEQ ID No. 35, SEQ ID No. 36 or SEQ ID No. 37, preferably to SEQ ID No. 16 (or a protein having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % amino acid sequence identity to the amino acid sequence according to SEQ ID No. 16, SEQ ID No. 17, SEQ ID No. 20, SEQ ID No. 23, SEQ ID No. 24, SEQ ID No. 25, SEQ ID No. 26, SEQ ID No. 27, SEQ ID No. 28, SEQ ID No. 29, SEQ ID No. 30, SEQ ID No. 31, SEQ ID No. 32, SEQ ID No. 33, SEQ ID No. 34, SEQ ID No. 35, SEQ ID No. 36 or SEQ ID No. 37, preferably to SEQ ID No. 16). By circular permutation in the context of the invention is meant a genetic operation in which part of the C-terminus of a protein is moved to its N-terminus while maintaining the three-dimensional structure and biological function of said protein. A circular permutation is built from one or more sets of structural elements in cyclic order. An engineered circular permutation of a protein can be envisioned as the result of taking a linear protein sequence, joining its ends to create a circle and then cleaving the circle at another site to generate a new linear monomeric protein sequence, altered only at the sites of joining and cleavage, with a different sequential order of the secondary structure elements but the same 3D structure. This approach of engineering a new protein sequence by circular permutation was pioneered by Goldenberg in 1983 (Goldenberg 1983). The newly engineered protein maintains the original function either totally (preferred) or partially.
The F10 protein according to the invention preferably contains loops (amino acids counted without signal peptide) at the following amino acid positions: amino acids 20 - 29 (N-loop), amino acids 38 - 48 (30S-loop), amino acids 52 - 58 (40S-loop), amino acids 61 - 66 (50S-loop).
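Because the loop positions above are counted without the signal peptide, they can be converted to the numbering of the full-length sequence by adding the signal peptide length. The following is a minimal sketch of that conversion, assuming the 23-amino-acid signal peptide stated for SEQ ID No. 17 and a simple constant offset; the variable and function names are hypothetical.

```python
# Signal peptide length as stated for the full-length F10 sequence.
SIGNAL_PEPTIDE_LENGTH = 23

# Loop positions counted without signal peptide (1-based, inclusive).
MATURE_LOOPS = {
    "N-loop": (20, 29),
    "30S-loop": (38, 48),
    "40S-loop": (52, 58),
    "50S-loop": (61, 66),
}


def to_full_length_numbering(start: int, end: int,
                             offset: int = SIGNAL_PEPTIDE_LENGTH) -> tuple:
    """Shift a 1-based residue range from mature to full-length numbering."""
    return (start + offset, end + offset)


full_length_loops = {
    name: to_full_length_numbering(*span) for name, span in MATURE_LOOPS.items()
}
for name, (start, end) in full_length_loops.items():
    print(f"{name}: amino acids {start} - {end} (with signal peptide)")
```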
Circularly permutated variants of a F10 protein according to the invention preferably comprise the amino acid sequence according to SEQ ID No. 16, SEQ ID No. 17, SEQ ID No. 20, SEQ ID No. 23, SEQ ID No. 24, SEQ ID No. 25, SEQ ID No. 26, SEQ ID No. 27, SEQ ID No. 28, SEQ ID No. 29, SEQ ID No. 30, SEQ ID No. 31, SEQ ID No. 32, SEQ ID No. 33, SEQ ID No. 34, SEQ ID No. 35, SEQ ID No. 36 or SEQ ID No. 37, preferably to SEQ ID No. 16 (or a protein having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % amino acid sequence identity to the amino acid sequence according to SEQ ID No. 16, SEQ ID No. 17, SEQ ID No. 20, SEQ ID No. 23, SEQ ID No. 24, SEQ ID No. 25, SEQ ID No. 26, SEQ ID No. 27, SEQ ID No. 28, SEQ ID No. 29, SEQ ID No. 30, SEQ ID No. 31, SEQ ID No. 32, SEQ ID No. 33, SEQ ID No. 34, SEQ ID No. 35, SEQ ID No. 36 or SEQ ID No. 37, preferably to SEQ ID No. 16) in rearranged order.
Preferably, structural elements of circularly permutated variants of the F10 protein according to the invention comprise the following amino acid sequences (each one comprising at least 80%, preferably 90%, more preferably 95%, even more preferably 99% amino acid sequence identity to the respective amino acid sequence according to SEQ ID No. 16):
a) beta sheet according to amino acids 49 to 51 in SEQ ID No. 16 (S2), beta sheet according to amino acids 59 to 60 in SEQ ID No. 16 (S3), alpha helix according to amino acids 67 to at least 72 in SEQ ID No. 16, beta sheet according to amino acids 35 to 37 in SEQ ID No. 16 (S1);
b) beta sheet according to amino acids 59 to 60 in SEQ ID No. 16 (S3), alpha helix according to amino acids 67 to at least 72 in SEQ ID No. 16, beta sheet according to amino acids 35 to 37 in SEQ ID No. 16 (S1), beta sheet according to amino acids 49 to 51 in SEQ ID No. 16 (S2);
c) alpha helix according to amino acids 67 to at least 72 in SEQ ID No. 16, beta sheet according to amino acids 35 to 37 in SEQ ID No. 16 (S1), beta sheet according to amino acids 49 to 51 in SEQ ID No. 16 (S2), beta sheet according to amino acids 59 to 60 in SEQ ID No. 16 (S3).
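The element orders a) to c) listed above are cyclic rotations of the native order S1, S2, S3, alpha helix. The following is a minimal sketch of the underlying operation (joining the ends of a linear sequence and reopening it at another site), assuming a plain rotation without any linker at the joining site; the function names and the toy sequence are hypothetical.

```python
def circular_permutation(sequence: str, cut: int) -> str:
    """Join the ends of `sequence` and reopen the circle before index `cut`.

    The C-terminal part sequence[cut:] is moved in front of the
    N-terminal part sequence[:cut]. No linker residues are added.
    """
    if not 0 <= cut < len(sequence):
        raise ValueError("cut must be a valid 0-based index into the sequence")
    return sequence[cut:] + sequence[:cut]


def rotate_elements(elements: list, first: str) -> list:
    """Rotate an ordered list of structural elements so that `first` leads."""
    index = elements.index(first)
    return elements[index:] + elements[:index]


# Native order of the secondary structure elements named above.
native_order = ["S1", "S2", "S3", "helix"]

variant_a = rotate_elements(native_order, "S2")     # order a)
variant_b = rotate_elements(native_order, "S3")     # order b)
variant_c = rotate_elements(native_order, "helix")  # order c)
print(variant_a, variant_b, variant_c)

# Toy sequence (hypothetical): cleave the joined circle before index 2.
print(circular_permutation("ABCDEF", 2))
```

In practice, engineered circular permutations often need a short linker between the original termini; that is deliberately left out of this sketch.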
Preferably, for receptor activation an additional extension comprising an amino acid sequence from the original N-terminus is appended to the N-terminus of a circularly permutated variant of F10 protein according to the invention. Otherwise, only binding ability (inhibition) is maintained, whereas signaling ability is not.
A circularly permutated variant according to a structure described in b) is preferably an inhibitor of the natural receptor. Said variant is able to bind the receptor but does not have the flexible N-terminus at the right position to induce receptor signaling.
In another aspect the present invention relates to an isolated nucleic acid encoding for the F10 protein according to the invention (herein also referred to as "F10 nucleic acid"). Preferably said isolated F10 nucleic acid encodes for a protein according to SEQ ID No. 16, SEQ ID No. 17, SEQ ID No. 20, SEQ ID No. 23, SEQ ID No. 24, SEQ ID No. 25, SEQ ID No. 26, SEQ ID No. 27, SEQ ID No. 28, SEQ ID No. 29, SEQ ID No. 30, SEQ ID No. 31, SEQ ID No. 32, SEQ ID No. 33, SEQ ID No. 34, SEQ ID No. 35, SEQ ID No. 36 or SEQ ID No. 37, preferably to SEQ ID No. 16 or SEQ ID No. 17. Preferably, the invention relates to an isolated F10 nucleic acid having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % nucleic acid sequence identity to a nucleotide sequence encoding the F10 protein shown in SEQ ID No. 18. In a preferred embodiment the isolated F10 nucleic acid has at least 80 %, preferably at least 90 %, more preferably at least 95 %, even more preferably at least 99 % nucleic acid sequence identity to the nucleotide sequence presented in SEQ ID No. 19.
Further aspects of the invention are directed to an isolated F10 nucleic acid having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % nucleic acid sequence identity to a nucleotide sequence comprising the coding sequence of the full length F10 polypeptide cDNA as presented in SEQ ID No. 18 or the full transcript sequence as presented in SEQ ID No. 19.
Another aspect of the invention relates to a vector comprising an F10 nucleic acid according to the invention (herein also referred to as "F10 vector"). In one aspect, the F10 vector according to the invention comprises an F10 nucleic acid having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % nucleic acid sequence identity to a nucleotide sequence encoding the protein shown in SEQ ID No. 16, SEQ ID No. 17, SEQ ID No. 20, SEQ ID No. 23, SEQ ID No. 24, SEQ ID No. 25, SEQ ID No. 26, SEQ ID No. 27, SEQ ID No. 28, SEQ ID No. 29, SEQ ID No. 30, SEQ ID No. 31, SEQ ID No. 32, SEQ ID No. 33, SEQ ID No. 34, SEQ ID No. 35, SEQ ID No. 36 or SEQ ID No. 37, preferably to SEQ ID No. 16 or SEQ ID No. 17. In another aspect, the F10 vector according to the invention comprises a nucleic acid having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % nucleic acid sequence identity to the nucleotide sequence shown in SEQ ID No. 18. In yet another aspect, the F10 vector according to the invention comprises a nucleic acid having at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % nucleic acid sequence identity to the nucleotide sequence shown in SEQ ID No. 19.
In another embodiment of the invention the F10 vector comprising the nucleic acid is operably linked to control sequences recognized by a host cell transformed with the vector.
In yet another embodiment the invention refers to a host cell comprising the F10 vector according to the invention. Preferably, the host cells are CHO cells, HEK-293T cells, HEK-293A cells, HeLa cells, E. coli cells, yeast cells or baculovirus infected insect cells.
The invention also relates to a process for producing a F10 polypeptide as presented in SEQ ID No. 16, SEQ ID No. 17, SEQ ID No. 20, SEQ ID No. 23, SEQ ID No. 24, SEQ ID No. 25, SEQ ID No. 26, SEQ ID No. 27, SEQ ID No. 28, SEQ ID No. 29, SEQ ID No. 30, SEQ ID No. 31, SEQ ID No. 32, SEQ ID No. 33, SEQ ID No. 34, SEQ ID No. 35, SEQ ID No. 36 or SEQ ID No. 37, preferably in SEQ ID No. 16 or a polypeptide with at least 80 %, preferably at least 90 %, preferably at least 95 %, preferably at least 99 % sequence identity to the polypeptide presented in SEQ ID No. 16, SEQ ID No. 17, SEQ ID No. 20, SEQ ID No. 23, SEQ ID No. 24, SEQ ID No. 25, SEQ ID No. 26, SEQ ID No. 27, SEQ ID No. 28, SEQ ID No. 29, SEQ ID No. 30, SEQ ID No. 31, SEQ ID No. 32, SEQ ID No. 33, SEQ ID No. 34, SEQ ID No. 35, SEQ ID No. 36 or SEQ ID No. 37, preferably in SEQ ID No. 16.
The process comprises the steps of: culturing the host cells or multicellular organisms according to the invention under conditions suitable for the expression of said F10 polypeptide and recovering said F10 polypeptide from the cells or the culture supernatant.
The F10 protein according to the invention is a target protein for therapeutic and diagnostic applications related to diseases, preferably immune related disorders or cancer.
Therefore, the invention further comprises an antibody which specifically binds to a F10 protein according to the invention (herein also referred to as "anti-F10 antibody").
In preferred embodiments the anti-F10 antibody is a monoclonal antibody, a polyclonal antibody, a humanized antibody or a single chain antibody.
For therapeutic or diagnostic applications the F10 protein according to the invention, an antagonist or agonist thereof, or an anti-F10 antibody according to the invention is combined with a carrier. Thus, the invention refers to a composition of matter comprising a) a F10 protein according to the invention, b) an agonist of said F10 protein, c) an antagonist of said F10 protein or d) an antibody that specifically binds to said F10 protein in combination with a carrier.
In a further embodiment the invention relates to the use of a composition of matter comprising a) a F10 protein according to the invention, b) an agonist of said F10 protein, c) an antagonist of said F10 protein or d) an antibody that specifically binds to said F10 protein in combination with a carrier to treat a disease in a mammal or human. Preferably, diseases treated with a composition of matter comprising a) a F10 protein according to the invention, b) an agonist of said F10 protein, c) an antagonist of said F10 protein or d) an antibody that specifically binds to said F10 protein in combination with a carrier are immune related disorders, more preferably the above mentioned immune related disorders.
In that context a F10 protein according to the invention, an agonist thereof, an antagonist thereof or an antibody that specifically binds to said F10 protein is capable of increasing the proliferation of T-lymphocytes in a mammal, inhibiting the proliferation of T-lymphocytes in a mammal, increasing infiltration of inflammatory cells into a tissue of a mammal, or decreasing the infiltration of inflammatory cells into a tissue of a mammal.
In another embodiment, said diseases treated by a composition of matter comprising a) a F10 protein according to the invention, b) an agonist of said F10 protein, c) an antagonist of said F10 protein or d) an antibody that specifically binds to said F10 protein in combination with a carrier are malignant neoplasms, such as tumor diseases or other types of cancer.
In a further embodiment the composition of matter according to the invention comprises a therapeutically effective amount of a) a F10 protein according to the invention, b) an agonist of said F10 protein, c) an antagonist of said F10 protein or d) an antibody that specifically binds to said F10 protein. The therapeutically effective amount for a given situation or a given disease will readily be determined by routine experimentation and is within the skills and judgement of the ordinary clinician or physician. Generally, the regimen as a regular administration of the pharmaceutical composition should be in the range of 1 μg to 5 g units per day. However, a preferred dosage might be in the range of 0.01 mg to 100 mg, preferably 0.01 mg to 50 mg and preferably 0.01 mg to 10 mg per day.
In that context the invention further comprises an article of manufacture, comprising a container, a label on said container, and a composition of matter comprising a) a F10 protein according to the invention, b) an agonist of said F10 protein, c) an antagonist of said F10 protein or d) an antibody that specifically binds to said F10 protein, contained within said container, wherein a label on said container indicates that said composition of matter can be used for treating a disease. Preferred diseases treated are immune related disorders or malignant neoplasm.
In one embodiment the invention further comprises a method of treating a disease in a mammal or human in need thereof comprising administering to said mammal or human a therapeutically effective amount of a) a F10 protein according to the invention, b) an agonist of said F10 protein, c) an antagonist of said F10 protein or d) an antibody that specifically binds to said F10 protein. Diseases treated with a method according to the invention include the above-mentioned diseases. Preferably, the treated disease is AIDS.
The F10 protein according to the invention is a target for diagnostic applications, e.g. for diagnosis of a disease in a patient, like immune related disorders or cancer.
In a further aspect the invention refers to a method for determining the presence of a F10 protein according to the invention in a test sample suspected of containing said F10 protein. Said method according to the invention comprises the steps of a) exposing said sample to an antibody according to the invention, that specifically binds to said F10 protein and b) determining binding of said antibody by detecting the formation of a complex between the antibody and the F10 protein according to the invention.
The detection is qualitative or quantitative, and is preferentially performed in comparison with monitoring the complex formation in a control sample. A larger quantity of complexes formed in the test sample indicates the presence or absence of a disease, like immune related disorders or cancer, in the mammal from which the test tissue cells were obtained. The antibody preferably carries a detectable label. Complex formation can be monitored, for example, by light microscopy, flow cytometry, fluorimetry, ELISA, immunohistochemistry or other techniques known in the art. The detection methods of antibody binding are well known to an ordinary person skilled in the art.
Further aspects of the invention refer to methods for diagnosis of a disease in a mammal or human. Diseases to be diagnosed with a method according to the invention include all above-mentioned diseases. Preferably, diseases diagnosed are cancer diseases. More preferably, the method of diagnosis according to the invention is used to diagnose immune related disorders, preferably above mentioned immune related disorders, more preferably AIDS.
One aspect of the invention is related to a method of diagnosing a disease, preferably an immune related disorder or cancer as mentioned above, in a mammal, said method comprising detecting the level of expression of a gene encoding the F10 amino acid sequence shown in SEQ ID No. 16, SEQ ID No. 17, SEQ ID No. 20, SEQ ID No. 23, SEQ ID No. 24, SEQ ID No. 25, SEQ ID No. 26, SEQ ID No. 27, SEQ ID No. 28, SEQ ID No. 29, SEQ ID No. 30, SEQ ID No. 31, SEQ ID No. 32, SEQ ID No. 33, SEQ ID No. 34, SEQ ID No. 35, SEQ ID No. 36 or SEQ ID No. 37, preferably in SEQ ID No. 16 or SEQ ID No. 17, in a test sample of tissue cells obtained from the mammal or human, and in a control sample of known normal tissue cells of the same cell type. A higher or lower level of expression of said gene in the test sample as compared to the control sample is indicative of the presence of said disease in the mammal from which the test tissue cells were obtained.
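The comparison of expression levels between test and control sample can be sketched as a fold-change calculation. The following is a minimal illustration only, assuming normalized expression values on a linear scale and an arbitrary two-fold cut-off; the cut-off, function names and example values are hypothetical, and no diagnostic threshold is specified in this description.

```python
import math


def log2_fold_change(test_level: float, control_level: float) -> float:
    """log2 ratio of test to control expression (both must be positive)."""
    if test_level <= 0 or control_level <= 0:
        raise ValueError("expression levels must be positive")
    return math.log2(test_level / control_level)


def classify_expression(test_level: float, control_level: float,
                        threshold_log2: float = 1.0) -> str:
    """Label the test sample as 'higher', 'lower' or 'unchanged'.

    `threshold_log2` is an assumed cut-off (1.0 corresponds to two-fold).
    """
    change = log2_fold_change(test_level, control_level)
    if change >= threshold_log2:
        return "higher"
    if change <= -threshold_log2:
        return "lower"
    return "unchanged"


# Hypothetical normalized expression values.
print(classify_expression(8.0, 2.0))
print(classify_expression(1.0, 4.0))
print(classify_expression(3.0, 2.5))
```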
Another embodiment of the invention relates to a method of diagnosing a disease as mentioned above in a mammal or human, said method comprising
a) contacting an antibody according to the invention which specifically binds to a F10 protein according to the invention with a test sample of tissue cells obtained from said mammal or human and b) detecting the formation of a complex between the antibody and the F10 protein in the test sample, wherein formation of said complex is indicative of the presence of said disease in the mammal from which the test tissue cells were obtained.
Further aspects of the invention relate to methods of identifying a compound that modulates the activity of a F10 protein according to the invention, said method comprising contacting cells which normally respond to said F10 protein with said protein and a candidate compound, and determining the lack of responsiveness by said cell to a).
In further aspects the invention encompasses methods of screening compounds to identify those that mimic the F10 protein of the invention (agonists) or prevent the effect of said F10 protein (antagonists). Methods for identifying agonists or antagonists of a F10 protein according to the invention may comprise contacting said F10 protein with a candidate agonist or antagonist molecule and measuring a detectable change in one or more biological activities normally associated with said F10 protein.
In an alternative screening method antagonists to the F10 protein according to the invention are detected by combining said protein and a potential antagonist with membrane-bound F10 protein receptors or recombinant receptors under appropriate conditions for a competitive inhibition assay. The F10 protein according to the invention can be labeled, such as by radioactivity, such that the number of said protein molecules bound to the receptor can be used to determine the effectiveness of the potential antagonist. In a further aspect, the invention concerns a method of identifying agonists of or antagonists to a F10 protein according to the invention which comprises contacting said F10 protein with a candidate molecule and monitoring a biological activity mediated by said F10 protein. In a specific aspect, the agonist or antagonist is an anti-F10 antibody.
Another aspect of the invention refers to a method of identifying a compound that inhibits the expression of a gene encoding the amino acid sequence shown in SEQ ID No. 16, SEQ ID No. 17, SEQ ID No. 20, SEQ ID No. 23, SEQ ID No. 24, SEQ ID No. 25, SEQ ID No. 26, SEQ ID No. 27, SEQ ID No. 28, SEQ ID No. 29, SEQ ID No. 30, SEQ ID No. 31, SEQ ID No. 32, SEQ ID No. 33, SEQ ID No. 34, SEQ ID No. 35, SEQ ID No. 36 or SEQ ID No. 37, preferably to SEQ ID No. 16 or SEQ ID No. 17, said method comprising contacting cells which normally express said F10 protein with a candidate compound, and determining the lack of expression of said gene. The decrease in F10 gene expression can be detected through measurement of the F10 RNA level in the cell, or through measurement of the F10 protein level in the cell. In a preferred embodiment the candidate compound is an antisense nucleic acid.
The invention is further illustrated by the following figures and examples without being limited to these.
Description of the drawings:
Fig. 1 shows the 3D-structure of the B42 protein as determined by threading and 3D-modelling and a known structure of vMIP-II, a chemokine from the IL-8-like chemokine fold family. The two 3D-structures in superposition are shown in the middle.
Fig. 2 shows a sequence-to-structure alignment of B42 with the vMIP-I template structure (pdb: 1zxt, SEQ ID No. 8) used for modeling of the 3D structure of B42.
Fig. 3 shows the comparison of stability and energetic properties of B42 (lower graph) with known chemokines (CXCL17, upper graph) as determined by molecular dynamics simulation.
Fig. 4 shows a sequence alignment of orthologous B42 proteins in human (B42_Hum), chimpanzee (B42_chimp, SEQ ID No. 6) and orangutan (B42_orang, SEQ ID No. 7). Conserved regions are highlighted.
Fig. 5 shows ENSEMBL identifiers for B42. 1) ENSEMBL version 54; 2) ENSEMBL version 55/56. The figure shows the chromosomal location of the B42 protein with surrounding genes and exon structure.
Fig. 6 demonstrates splicing of B42. Two possible isoforms exist for B42, which are referred to herein as "isoform A" and "isoform B". Isoform B exhibits a stop codon within the additional longer exon 3 sequence and, if protein coding, would only encode a short peptide of 9 residues. Thus, only isoform A codes for the full length protein.
Fig. 7 shows the in vivo expression of B42 isoforms in human tissues as determined by PCR analyses. Protein coding isoform A is expressed in placenta, heart, lung, liver, pancreas, skeletal muscle and weakly in brain.
Fig. 8 shows the construct used for B42 expression. Cleavage position is indicated by vertical line (I). GA-addition remains at the N-terminus of B42-protein after cleavage with TEV protease.
Fig. 9 shows results of an SDS gel electrophoresis of eluted fractions of B42 protein. Peak fractions from the eluate were pooled (lane 3) and concentrated over a 3 kDa cutoff membrane (lane 2: 2 µl sample; lane 1: 20 µl sample) and analyzed by SDS gel electrophoresis to check protein recovery.
Fig. 10 shows the region of interest of the deconvoluted B42 mass spectrum (scale Dalton). Experimental data are shown. Accordingly, the molecular weight of B42 protein was identified to be 8931 Da. Arrow: monoisotopic mass.
Fig. 11 shows the region of interest of the deconvoluted B42 mass spectrum (scale Dalton). Shown is the theoretical isotopic distribution for B42 with 3 assumed disulfide bonds. Arrow: monoisotopic mass.
Fig. 12 shows the secondary structure composition of the in vitro expressed B42 protein (right side graph) and the known chemokine vMIP-II (left side graph) as determined by circular dichroism spectroscopy.
Fig. 13 shows the Western blot analysis of the N-terminal GFP-tagged B42 transgenic cell pool using antibody to GFP. A band of the expected size 38.8kD (GFP-tag: 31kD + B42 8.8kD) was detected (marked with *).
Fig. 14 shows the localization of the N-terminal GFP-tagged B42 protein in the cytoplasm of HeLa cells with antibody to GFP in A. Localization of antibody to alpha-tubulin is shown in B and localization of DNA is shown in C. The cell cycle phase is interphase.
Fig. 15 shows a sequence-to-structure alignment of N73 with the vMIP-II template structure (pdb: 2fj2, SEQ ID No. 15) used for modeling of the 3D structure of N73. Secondary structure elements of the template structure are indicated as follows: S1-S3 ... beta-sheets, α ... α-helix. Cysteines are highlighted with black background and white font. Disulfide bonds are indicated by lines between cysteines.
Fig. 16 shows the 3D-structure of the N73 protein as determined by threading and 3D-modelling and a known structure of vMIP-II (pdb: 2fj2, SEQ ID No. 15), a chemokine from the IL-8-like chemokine fold family. The two 3D-structures in superposition are shown in the middle.
Fig. 17 shows the comparison of stability and energetic properties of N73 (lower graph) with known chemokines (CXCL17, upper graph) as determined by molecular dynamics simulation.
Fig. 18 shows ENSEMBL identifiers for N73. Chromosomal location of the N73 protein with surrounding genes and exon structure is shown. The open reading frame is located on chromosome 8q23.3 (142,513,827-142,517,232 forward strand) and in close proximity to the N73 gene an orphan G protein coupled receptor (GPR20) can be found.
Fig. 19 shows a sequence alignment of orthologous N73 proteins in human (N73_Hum/1-211, SEQ ID No. 10), gorilla (N73_gorilla/1-211, SEQ ID No. 13) and bushbaby (N73_Bushb/1-211, SEQ ID No. 14). Conserved regions are highlighted.
Fig. 20 shows the in vivo expression of N73 isoforms in human tissues, taken from a GeneNote expression study. N73 is expressed in placenta, heart, lung, liver, pancreas, skeletal muscle and weakly in brain.
Fig. 21 shows the sequence-structure alignment of the F10 amino acid sequence with vMIP-I (pdb: 1zxt, SEQ ID No. 38). Secondary structure elements are indicated as follows: S1-S3 ... beta-sheets, α ... α-helix. Disulfide bond is indicated by (*—*).
Fig. 22 shows the 3D-structure of the F10 polypeptide as determined by threading and 3D-modelling and a known structure of CXCL10 (pdb: 1o7z), a chemokine from the IL-8 fold family. The two 3D-structures in superposition are presented in the middle of Fig. 22.
Fig. 23 shows 3D-structures of F10 (left) and CXCL10 (right) with receptor binding amino acid residues indicated in sticks.
Fig. 24 shows a structure-based sequence alignment of F10 (starting at amino acid #9 of SEQ ID No. 16, sequence is also presented in SEQ ID No. 42) with CXCR3 binding chemokines CXCL9 (SEQ ID No. 40), CXCL10 (SEQ ID No. 39) and CXCL11 (SEQ ID No. 41). CXCR3 binding residues are indicated. The secondary structure of CXCL10 is indicated as follows: S1-S3 ... beta-sheets, α ... α-helix.
Fig. 25 shows the comparison of stability and energetic properties of F10 (lower graph) with known chemokines (CXCL17, upper graph) as determined by molecular dynamics simulation.
Fig. 26 shows genetic localization of F10. The open reading frame is located on chromosome 4q21.1 (4:76,481,273-76,491,081 forward strand), which is in very close proximity to the chemokine mini cluster with CXCL9, CXCL10 and CXCL11. Those chemokines (CXCL9, CXCL10 and CXCL11) all bind the same receptors: 1) CXCR3, as agonists with angiostatic function, which is expressed by TH1 cells and NK cells, and 2) CCR3, as antagonists/inhibitors.
Fig. 27 shows orthologs of F10 identified with OMA Browser (OMA Group 388998: the 16 longest of 20 members shown, all mammals; http://omabrowser.org/cgi-bin/gateway.pl?f=DisplayGroup&pl=388998).
Fig. 28 shows the in vivo expression of F10 isoforms in human tissues, taken from a GeneNote expression study. F10 is widely expressed in all tissues tested namely placenta, bone marrow, brain, heart, kidney, liver, lung, pancreas, prostate, skeletal muscle, spinal cord, spleen and thymus.
Fig. 29 demonstrates splicing of F10. Five possible F10 isoforms were identified in HeLa cells and placenta.
Fig. 30 shows expression of F10 isoforms in human cDNA of placenta. In placenta, of the 3 different isoforms of F10, isoform A is the most abundant form, showing the strongest band in the in vivo expression analysis.
Example 1 : Overview of the workflow and results obtained in the computational chemokine screening
In summary, the computational chemokine screening protocol consists of the following steps:
First, 6933 uncharacterized human protein sequences containing more than one cysteine were extracted from the UniProt Knowledgebase Release 14.9 (03-Mar-2009) consisting of: UniProtKB/Swiss-Prot Release 56.9 and UniProtKB/TrEMBL Release 39.9.
The subsequent sequence-based pre-filtering step includes prediction and removal of signal peptides (SP) and transmembrane (TM) regions as they might produce noise due to variant amino acid composition or a lack of structural information about them in our representative fold libraries. Then, a BLAST search against the Protein Data Bank (PDB) is performed to identify and remove regions that can already be annotated by sequence similarity to a characterized protein of known structure. From the remaining sequences or sequence parts, only those that are longer than 54 amino acids and at the same time contain more than one cysteine were selected. In total 2141 protein sequences passed this pre-filtering step successfully.
A fold library containing all three-dimensional (3D) protein structures of the chemokine fold family (SCOP: d.9.1.1, 107 PDB entries with in total 217 protein chains) from the Protein Data Bank was generated and the fold recognition algorithm ProHit was used to generate sequence-structure alignments of the query sequences with chemokine template structures. The 760 best ranked sequence-structure alignments covering the full chemokine fold were automatically modeled and evaluated for disulfide bond formation possibilities by measuring distances between the cysteines in the chemokine-like 3D models. In 454 sequences the cysteines came close enough together to form at least one disulfide bond in the obtained 3D model.
For the top 50 sequences with potential disulfide bonds a control threading run was performed to assess the quality and ranking of the chemokine structure alignment within the structural templates of the whole PDB. The pdb95 was used as fold library, which comprises all 3D structures known to date with less than 95% sequence identity; furthermore, all known chemokine structures were added to that library. A sequence hit was further analyzed if the chemokine template was ranked high in the pdb95 fold library and if, by manual inspection, no other template was considered superior to the chemokine template.
The first approach model obtained from the threading alignment was refined by building up the amino acid side chains including cysteines forming disulfide bonds. Then, the structure was energy minimized and side chain rotamers were manually adjusted to improve the packing of the core.
Then Molecular Dynamics simulations were performed with the 7 best refined models. After adding solvent to the system, 40ps MD simulation with position-restraints was performed, followed by 10ns MD without restraints both at constant temperature of 300K. Models with stability similar to comparable chemokine models were selected.
For further confirmation of those hits, the corresponding genes were analyzed for conservation across different species as well as additional chemokine features like similar exon organization, chromosomal location and proximity to known chemokines, expression profiles, presence of a PolyA site (ENSEMBL, PolyA), common transcription factor binding sites of chemokines within their promoter region.
Three promising candidates (B42, N73 and F10) were further analyzed experimentally by gene expression studies, protein expression and protein analysis using circular dichroism (CD), Fourier transform infrared spectroscopy (FTIR) and mass spectrometry (MS), and RNAi knockdown experiments.
Example 2: Sequence-based pre-filtering and structure-based chemokine screening of B42
The initial sequence dataset was extracted from the Uniprot knowledgebase (SwissProt and TrEMBL) by filtering all human protein sequences labeled as "unknown", "orf" (open reading frame), "hypothetical", "uncharacterized" or "putative" that contain at least two cysteines.
Signal peptides (SP) and transmembrane (TM) regions were removed in the pre-filtering step using the consensus of different prediction methods (i.e. PrediSi, SigPfam, TMHMM, Memsat, (Hiller 2004, Jones 1994, Sonnhammer 1998, Zhang 2004)). Sequence parts that could be annotated by sequence similarity were detected and removed by a BLAST search using an e-value cutoff of 0.0005 with a minimum alignment length of 50 amino acids and a minimum sequence identity of 30 %. The cysteine content of the remaining sequences was evaluated and only those sequences with two or more cysteines and a minimum sequence length of 55 amino acids were selected.
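The selection criteria of this pre-filtering step can be illustrated by a minimal Python sketch. The function names are hypothetical and are not part of any tool named above; the thresholds are those stated in the text, and the assumption that the cutoffs are inclusive (e.g. an e-value of exactly 0.0005 still counts as a hit) is ours.

```python
def is_annotatable_blast_hit(evalue: float, alignment_length: int,
                             percent_identity: float) -> bool:
    """Return True if a BLAST hit meets the stated thresholds for
    'already annotated by sequence similarity'.
    Assumption: all three cutoffs are treated as inclusive."""
    return (evalue <= 0.0005
            and alignment_length >= 50
            and percent_identity >= 30.0)


def passes_prefilter(sequence: str, min_length: int = 55,
                     min_cysteines: int = 2) -> bool:
    """Return True if a sequence (after removal of SP/TM regions and
    annotatable parts) is long enough and contains enough cysteines."""
    seq = sequence.strip().upper()
    return len(seq) >= min_length and seq.count("C") >= min_cysteines


# Toy sequences, not taken from the screening data set
print(passes_prefilter("CC" + "A" * 53))  # 55 residues, 2 cysteines
print(passes_prefilter("C" + "A" * 60))   # only one cysteine
```

The minimum length of 55 amino acids is equivalent to the "longer than 54 amino acids" criterion given in example 1.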
The fold recognition algorithm ProHit (ProCeryon Biosciences) (Sippl 1992, Sippl 1993) and a fold library containing the three-dimensional (3D) protein structures of all members of the chemokine fold family (SCOP: d.9.1.1, 107 PDB entries with in total 217 protein chains) from the Protein Data Bank were used to generate sequence-structure alignments.
In ProHit different standard scores are calculated. The so-called z-scores are calculated for each sequence-to-structure alignment by the formula $z = \frac{x - \mu}{\sigma}$, where $x$ is a raw score to be standardized, $\mu$ is the mean of the population and $\sigma$ is the standard deviation of the population. The quantity $z$ represents the distance between the raw score and the population mean in units of the standard deviation.
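The standardization described above (raw score minus population mean, divided by the population standard deviation) can be sketched in a few lines of Python. This is an illustration of the formula only, not of the ProHit implementation; the function name and the example scores are hypothetical.

```python
from statistics import mean, pstdev


def z_score(raw_score: float, population: list[float]) -> float:
    """Standardize a raw score against a population of scores.
    Uses the population standard deviation, as stated in the text."""
    mu = mean(population)
    sigma = pstdev(population)
    if sigma == 0:
        raise ValueError("standard deviation is zero; z-score undefined")
    return (raw_score - mu) / sigma


# Toy population with mean 5 and population standard deviation 2
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_score(9, scores))  # 2.0: the raw score lies two standard deviations above the mean
```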
The z-scores calculated for each sequence-structure alignment are:
The z-pairwise score (z-pair), which is the pairwise (residue-residue) potential component of the alignment.
The z-surface score (z-surf), which is the surface (residue-solvent) potential component of the alignment.
The z-combined score (z-comb) is a weighted combination of the pairwise and surface z-scores.
The z-sequence score (z-seq), which is the sequence substitution (sequence similarity) component of the alignment.
The final overall ranking is calculated based on the Threading index (Thd idx), which is a combination of sequence similarity (z-seq), residue-residue and residue-solvent interactions (z-comb) normalized by the query sequence length.
To exclude sequence-structure alignments not covering the full length of a given template fold (false positives: FP), the ratio between the template fold length (fl) and the number of aligned residues in the sequence-structure alignment (path length: pl) is calculated. High confidence (HC) alignments covering more than 75% of the template structure have fl/pl values in the range (Pisabarro 2006): 0.6 < fl/pl < 1.3
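The fl/pl filter can be written as a short check. The following Python sketch only illustrates the stated range 0.6 < fl/pl < 1.3; the function name and the example lengths are hypothetical.

```python
def is_high_confidence(fold_length: int, path_length: int,
                       low: float = 0.6, high: float = 1.3) -> bool:
    """Return True if the ratio of template fold length (fl) to
    alignment path length (pl) lies strictly inside (low, high)."""
    if path_length <= 0:
        raise ValueError("path length must be positive")
    ratio = fold_length / path_length
    return low < ratio < high


# Toy values: a 70-residue template fold aligned over 68 or 40 residues
print(is_high_confidence(70, 68))  # ratio about 1.03, inside the range
print(is_high_confidence(70, 40))  # ratio 1.75, outside the range
```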
Furthermore the percentage of amino acid sequence identity (%Id) and the percentage of domain coverage (%Dcov) are calculated but not used for ranking the results.
As the scores and the threading index may vary depending on the nature of the query protein as well as its length, the quality of an alignment is only given by the overall ranking within all structural templates in the pdb95 fold library (95 Rank and the fl/pl filtered Rank) but not by a certain score alone. A redundancy-reduced summary of the results obtained for the novel protein B42 using the fold library containing all known structures of the members of the IL-8-like chemokine fold family is shown in table 1. The "95 rank" of those B42 alignments with chemokine template structures corresponds to the overall rank in the pdb95 library. The novel protein B42 according to SEQ ID No. 1 shows high structural similarity to the known chemokines vMIP-I (rank #1; ID: 22.6%) and vMIP-II (rank #4; ID: 26.2%), both encoded by Kaposi's sarcoma-associated herpesvirus. CCL14 (pdb: 2Q8R) was found as the best scoring human chemokine template (ID: 18%), followed by the human chemokines CCL5 (ID: 20.6%) and CCL3 (ID: 17.2%).
Abbreviations and nomenclature used in table 1 are as follows: 95 Rank ... Rank of the sequence-structure alignment in the pdb95 control run (22009 entries), Fold rank ... Rank of the structure with the chemokine fold (217 domains in library), fl/pl ... fold length/path length (good between 0.6 and 1.3), thx idx ... threading index, z_comb ... combined pairwise and surface z-score, z_pair ... z-score corresponding to the pairwise (residue-residue) potential component of the alignment, z_surf ... z-score corresponding to the surface (residue-solvent) potential component of the alignment, z_sec ... z-score corresponding to sequence identity, %ID ... percentage of amino acid sequence identity, pl ... alignment path length, fl ... template fold length, Scop ... SCOP family identifier, pdb ... template identifier in the Protein Data Bank, chain ... template chain (author chain label), %D_cov ... percentage of domain coverage (folds with coverage lower than 70% are filtered out), descr ... description of PDB template, Uniprot ... Uniprot identifier, Rec. agonist ... Chemokine receptors for which the template protein is an agonist (activator), Rec. antagonist ... Chemokine receptors for which the template protein is an antagonist (inhibitor).
Example 3: 3D-Modelling and energy refinement of B42
The MODELLER package was used to generate 3D models of the query sequences with the sequence-to-structure alignments calculated by ProHit. In those 3D models, the distances between cysteines were measured and analyzed for possibilities of disulfide bond formation. To refine the candidate's model generated from the threading alignment, the cysteines were connected, the structure was energy minimized and side chain rotamers were manually adjusted using MOE2008.10.
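The distance-based check for possible disulfide bonds can be sketched as follows. This is a minimal illustration, not the procedure actually used: the text does not state which atoms were measured or which distance cutoff was applied, so the use of cysteine sulfur (SG) coordinates and the 3.0 Å default cutoff are assumptions, and the coordinates below are made-up toy values.

```python
from itertools import combinations
from math import dist


def candidate_disulfides(sg_coords: dict[int, tuple[float, float, float]],
                         max_distance: float = 3.0) -> list[tuple[int, int, float]]:
    """List cysteine pairs whose sulfur atoms lie within max_distance.
    sg_coords maps residue number -> (x, y, z) of the SG atom in angstrom.
    The cutoff is a hypothetical default, not a value from the text."""
    pairs = []
    for (res_a, xyz_a), (res_b, xyz_b) in combinations(sorted(sg_coords.items()), 2):
        d = dist(xyz_a, xyz_b)
        if d <= max_distance:
            pairs.append((res_a, res_b, round(d, 2)))
    return pairs


# Toy coordinates for four cysteines (not taken from any real model)
toy_model = {
    10: (0.0, 0.0, 0.0),
    34: (2.0, 0.0, 0.0),
    11: (10.0, 0.0, 0.0),
    50: (10.0, 2.1, 0.0),
}
print(candidate_disulfides(toy_model))
```

In this toy model the pairs 10/34 and 11/50 fall below the cutoff, while all other cysteine pairs are too far apart to be bonded.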
The sequence-to-structure alignment of B42 with its best ranked template structures shows a good matching of the structurally important cysteines involved in disulfide bridges (Fig. 2). Most conserved residues are located in the core of the protein (L16, Y20, W24, V34, L37, L48, V57, W59, I60, M61, A63, A64, L68 and 4 cysteines), which is necessary for the stability of the fold. The 3D model of B42 built on the vMIP-I template structure (pdb: 1zxt) has 2 to 3 disulfide bonds. The first model has disulfide bonds between C7 - C29 and C8 - C45, which would correspond to the CC-motif. Since a third cysteine is present close to the CC-motif, it is also possible that the bonds are formed between C7 - C72, C8 - C29 and C11 - C45. The second model is experimentally supported by the molecular mass (8931 Da) measured by mass spectrometry (see example 9).
A detailed 3D-model of the B42 protein with the amino acid sequence according to SEQ ID No. 1 and the experimental structure of vMIP-II (pdb: 1cm9) are shown in Fig. 1. Superposition of both structures is also shown in Fig. 1. Disulfide bonds in the tertiary structure of the protein with the amino acid sequence according to SEQ ID No. 1 are formed at equivalent positions in 3D.
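The reasoning that a measured mass can discriminate between models with different numbers of disulfide bonds rests on the fact that each disulfide bond removes two hydrogen atoms (about 2.016 Da). The following Python sketch illustrates this with deliberately artificial numbers; the reduced mass of 10000.0 Da is a placeholder and not the mass of B42, and the function names are hypothetical.

```python
H_MONOISOTOPIC = 1.007825  # monoisotopic mass of hydrogen in Da


def mass_with_disulfides(reduced_mass: float, n_bonds: int) -> float:
    """Mass of a protein after forming n_bonds disulfide bonds,
    each of which removes two hydrogen atoms."""
    return reduced_mass - 2 * H_MONOISOTOPIC * n_bonds


def best_disulfide_count(reduced_mass: float, measured_mass: float,
                         max_bonds: int) -> int:
    """Number of disulfide bonds (0..max_bonds) whose theoretical mass
    is closest to the measured mass."""
    return min(range(max_bonds + 1),
               key=lambda n: abs(mass_with_disulfides(reduced_mass, n) - measured_mass))


# Placeholder values for illustration only
print(best_disulfide_count(10000.0, 9993.95, max_bonds=3))
```

The comparison is only meaningful when the mass accuracy of the measurement is clearly better than the 2 Da spacing between the candidate models.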
Example 4: pdb95 ranking control of B42
For sequences with potential disulfide bonds, a control threading run is performed to assess the quality and ranking of the chemokine structure alignment within the structural templates of the whole PDB. The pdb95 is used as fold library (22009 entries), which comprises all 3D structures known to date with less than 95% sequence identity; furthermore, all chemokine and cytokine structures were added to that library. A sequence hit was further analyzed if the chemokine alignment was within the top 10 folds in the pdb95 fold library and if, by manual inspection, none of the better ranked alignments was considered superior to the chemokine template in terms of gap content and disulfide bonding possibilities.
A redundancy-reduced summary of the pdb95 control results of the B42 protein sequence is shown in table 2. Ranking with this fold library confirms that the structure of the novel protein resembles an IL-8-like chemokine fold. The novel protein shows high structural similarity to vMIP-I even among all possible protein folds (rank #1, table 2). Ergotoxin (rank #11), Metallothionein (rank #15), Zinc finger (rank #23) and Leupaxin (rank #33) are considered false positives because their gap content is too high. Ranks #12, #16, #26 and #31 are all alpha-helical proteins, which contradicts the CD spectroscopy results showing that the protein contains both alpha helices and beta sheets (see example 10). The alignments with the structures at ranks #10, #20, #24, #25, #34 and #35 do not cover the full length of the template folds and are therefore also considered false positives.
Abbreviations and nomenclature used in table 2 are as follows: Rank ... Rank of the sequence-structure alignment in the pdb95 control run (22009 entries), Confid. ... Confidence class: HC=High Confidence, FP=false positive, NS=Not secreted, NL=Not Location predicted, fl/pl ... fold length/path length (good between 0.6 and 1.3), avThx ... Average Threading Index calculated from the best 20 structures of the fold, thx idx ... threading index, z_comb ... combined pairwise and surface z-score, z_pair ... z-score corresponding to the pairwise (residue-residue) potential component of the alignment, z_surf ... z-score corresponding to the surface (residue-solvent) potential component of the alignment, z_sec ... z-score corresponding to sequence identity, %ID ... percentage of amino acid sequence identity, pl ... alignment path length, fl ... template fold length, Scop ... SCOP family identifier, pdb ... template identifier in the Protein Data Bank, chain ... template chain (author chain label), %cover. ... percentage of domain coverage (folds with coverage lower than 70% are filtered out), descr ... description of pdb template, Pfam ... Pfam description, Uniprot ... Uniprot identifier.
Table 2: B42 - pdb95 results first per fold only
Example 5: Molecular dynamics simulations of B42
Molecular Dynamics (MD) simulations were performed with the refined models from example 3 and their stability was evaluated by comparison with a known chemokine used as control.
MD simulations were performed using GROMACS version 3.3 (Lindahl 2001, Van Der Spoel 2005) with application of the G53a6 force field (GROMOS96.1 (Van Gunsteren 1996)) for all chemokine-like models. Each model was hydrated using the flexible simple point charge (SPC) water model (Berendsen 1981) in a dodecahedral periodic water box with a distance of 1 nm to the protein periphery. To neutralize the system, counter ions (Na+ or Cl-) were added by replacing water molecules in each system. Then, 1000 steps of steepest descent energy minimization were applied to the system using the GROMOS96 force field G53a6.
During the first 40 ps, the position of all bonds of the protein was restrained to allow the adjustment of the solvent molecules. Afterwards, free molecular dynamics simulations were performed applying time steps of 2 fs and using the Particle-Mesh-Ewald (PME) method (Essmann 1995) to treat the electrostatic interactions. The temperature was kept constant at 300 K by using the Berendsen algorithm (Berendsen 1984) with 0.1 ps coupling time. Pressure was maintained with isotropic pressure coupling, a coupling time of 1 ps and an isothermal compressibility of 4.575 × 10^-5 bar^-1.
For all models the MD simulation without restraints was carried out for 10 ns. Secondary structure elements were analyzed after the simulation and used for least square fit of the protein snapshots written to a .pdb file with a step size of 10 ps as well as for calculation of the RMSD plot using the GROMACS subprograms gmx2pdb and g_rms. PDB results were graphically analyzed using VMD (Humphrey 1996) and MOE2008.10 and the RMSD plots with MS Excel.
The structure of the known chemokine CXCL17, which has low sequence identity to solved chemokine structures, was predicted by threading using the same methodology as previously described. The threading alignment of CXCL17 was generated with the best compatible 3D structure (pdb: 1NR4, seq. ID: 21.4 %) of an IL-8-like chemokine as template for the MODELLER package. The model was refined and the cysteines were connected to form disulfide bonds using MOE2008.10. The MD simulation of this control model was performed using the same conditions as previously described.
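For orientation, the RMSD quantity plotted for each snapshot can be sketched in plain Python. This illustrates only the deviation calculation itself and assumes that the snapshot has already been superimposed on the reference by a least square fit; it is not a re-implementation of the GROMACS tools, and the coordinates are toy values in nm.

```python
from math import sqrt

Coordinate = tuple[float, float, float]


def rmsd(reference: list[Coordinate], snapshot: list[Coordinate]) -> float:
    """Root-mean-square deviation between two equally ordered,
    already superimposed coordinate sets (same units as the input)."""
    if not reference or len(reference) != len(snapshot):
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(reference, snapshot):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return sqrt(total / len(reference))


# Two toy atoms, each displaced by 0.3 nm
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
snap = [(0.0, 0.0, 0.3), (1.0, 0.3, 0.0)]
print(rmsd(ref, snap))  # approximately 0.3
```

With the stated settings (10 ns of free simulation, 2 fs time step, snapshots every 10 ps) such a value would be computed for roughly 1000 snapshots out of 5,000,000 integration steps.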
A putative novel chemokine candidate is evaluated by: 1.) Comparison of the RMSD plot of the control chemokine MD simulation with the candidate's simulation RMSD plot and 2.) by visual inspection of the stability of candidate's 3D model during and after the simulation.
The MD simulation of the control model (CXCL17) shows good stability of the secondary structure elements with low RMSD values of around 0.2-0.3nm throughout the simulation.
Likewise, the MD simulation of the B42 protein with the amino acid sequence according to SEQ ID No. 1 shows a stability of the secondary structure elements as good as that of the control 3D model of CXCL17, with low RMSD values of around 0.2-0.3 nm (Fig. 3) throughout the simulation. Only the RMSD values of the all-backbone curve (solid line) and of the coils are slightly higher because of the relatively long and flexible C-terminus following the helix, which is not present in the control 3D model (but is also known from other CKs like Lymphotactin).
Example 6: Computational gene and protein analysis of B42
The selected candidate genes were manually analyzed for conservation across different species (orthologs), exon organization, chromosomal location and proximity to known chemokines, presence of a PolyA site (using ENSEMBL (Flicek 2008), Polyah.pl (softberry) and Polyadq (Tabaska 1999)), transcription factor binding sites of chemokine regulators (Chong 2000; Genin 2000; Gerber 2004; Roebuck 1999; Ueda 2007; Widmer 1993; Yeruva 2008) and expression profiles (GeneNote expression study (Shmueli 2003)). Furthermore the protein was checked for structural and functional similarity to known chemokines by analyzing glycosylation sites and other eukaryotic linear motifs with the ELM server (Puntervoll 2003), presence of receptor binding residues in sequence and 3D (with MOE2008.10), if known, and leaderless secretion in case no signal peptide is present using SecretomeP (Bendtsen 2004).
The B42 (Q1T7F1, AB091373-6, ENSv55: ZNF578-202) amino acid sequence according to SEQ ID No. 1 was previously identified (Maegawa 2002) as a hypothetical novel Kruppel-like zinc finger protein. Sequence analysis using BLAST and Pfam did not identify any statistically significant sequence homology with any previously characterized protein. No signal peptide or transmembrane region was identified in B42 by standard prediction methods. However, SecretomeP, a tool to predict leaderless secretion, returned a high score (0.8, with 0.5 being the required threshold), suggesting that leaderless secretion of B42 is possible. WoLF PSORT (Horton 2007) also predicts an extracellular location with a high score for B42.
ENSEMBL's automatic annotation pipeline Release 54 (Flicek 2008) identified two uncharacterized orthologous proteins, one in chimpanzee sharing 98% sequence identity (SEQ ID No. 6) and the other in orangutan (SEQ ID No. 7) with 91% sequence identity (Fig. 4, table 3) with B42.
Table 3: Orthologs identified by ENSEMBL annotation pipeline (Release 54)
Species      Gene ID              Peptide ID           Length   Location
Chimpanzee   ENSPTRG00000034188   ENSPTRP00000057761   81 aa    19: 58141094 - 58145639
Orangutan    ENSPPYG00000010339   ENSPPYP00000011581   81 aa    19: 53958640 - 53961962
The B42 gene lies within the same chromosomal region (19q13.41) as two other known chemokines (CXCL17 and CCL25) that can be found at 19q13.2. The gene of B42 was first automatically annotated as zinc finger protein ZNF528 (ENSEMBL Release 54) because its protein coding region was located within the untranslated region (intron) of this Kruppel-like zinc finger gene. Since ENSEMBL Release 55 it has been re-annotated as ZNF578, which, according to the ENSEMBL annotation, does not code for a protein but is a processed transcript. The translated B42 protein does not share sequence identity with any characterized zinc finger protein, although many zinc finger genes are located in close proximity to the B42 gene. The B42 coding transcript has 3 exons like many known chemokines. Another common feature of chemokines is that the exon boundary between exon 1 and 2 has intron phase 1 (the phase being the position within the codon at which the intron lies) and the boundary between exon 2 and 3 has phase 2 (Betts 2001). In B42 the protein coding region starts at the end of exon 2 and ends with exon 3, also having a phase 2 intron boundary between exon 2 and 3 (table 4).
Table 4: Exon organisation of ENST00000396436 (ZNF578-202) (ENSEMBL Release 55)
A poly-A site was predicted at position +657 downstream of the translation end in the human and orangutan genes. Furthermore, the promoter region of B42 was analyzed for transcription factor binding sites known from other chemokines like NF-IL6, NF-kappaB, AP1 (c-Jun), INF-1 and the CK1 motif (Chong 2000; Genin 2000; Gerber 2004; Roebuck 1999; Ueda 2007; Widmer 1993; Yeruva 2008). Different binding sites were identified and the results are shown in table 5.
Table 5: Transcription factor binding sites found in human and orangutan B42 promoter known from other chemokines
Example 7: In vivo expression of the B42 nucleic acid sequence
Expression of the nucleic acid sequence according to SEQ ID No. 2 in human tissues was determined by analysis of pooled human cDNA by PCR analysis.
The mRNA expression of B42 was analyzed by PCR with templates from 8 different tissues (heart, brain, placenta, lung, liver, skeletal muscle, kidney, pancreas) using Human MTC Panel I (Clontech, Catalog No. 636742, LOT No. 8082935A). 43 PCR cycles were done at an annealing temperature of 58°C. The expected size of the PCR product is 272 bp and the following primers were used: forward primer: 5'-CCTAAGGAAGAAGCCTAGAAGAGG-3' (SEQ ID No. 4) and reverse primer: 5'-CAGGCAGTTGTGCACATTAAG-3' (SEQ ID No. 5).
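As a simple plausibility check of the primer pair, length and GC content can be computed with a few lines of Python. The sketch assumes that the space inside the reverse primer in the printed text is a typesetting artifact; the helper name is hypothetical.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer sequence."""
    seq = primer.replace(" ", "").upper()
    if not seq:
        raise ValueError("empty primer sequence")
    return sum(base in "GC" for base in seq) / len(seq)


FORWARD = "CCTAAGGAAGAAGCCTAGAAGAGG"  # SEQ ID No. 4
REVERSE = "CAGGCAGTTGTGCACATTAAG"     # SEQ ID No. 5

for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
    print(name, len(primer), "nt, GC fraction", round(gc_fraction(primer), 3))
```

The forward primer has 24 nucleotides with a GC fraction of 0.5, the reverse primer 21 nucleotides with a GC fraction of about 0.476.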
PCR expression studies and subsequent sequencing of the PCR products identified two isoforms (Fig. 7) of which only isoform A is protein coding and expressed in placenta, heart, lung, liver, pancreas, skeletal muscle and weakly in brain. Isoform B has a stop codon within the additional longer exon 3 sequence and, if protein coding, would only encode a short peptide of 9 residues (Fig. 6). It is expressed mainly in kidney, placenta, pancreas, liver and weakly in lung and muscle. B42 was not expressed in bone marrow (data not shown).
Example 8: In vitro expression of the B42 protein
The gene encoding the mature B42 protein (Uniprot: Q1T7F1; AB091373-6) was synthesized and cloned into the expression plasmid pETMM-60. Expression and purification were carried out as described in Magistrelli et al. (Magistrelli 2005), using a His-tagged NusA fusion partner to keep B42 soluble and Origami B cells (for disulfide bond formation in the cytoplasm because of a glutathione S-transferase mutation).
The B42 protein was successfully expressed in E. coli and was soluble as a fusion protein. Then, the construct was cleaved from NusA with TEV protease, leaving a GA-addition at the N-terminus of the putative chemokine (Fig. 8).
Example 9: Purification of B42 protein
The expressed protein from example 8 was purified and concentrated. The expression of the correct construct was confirmed by SDS-PAGE (Fig. 9) and mass spectrometry (measured mass: 8931 Da) (Fig. 10).
For the purification, cells were lysed, and the supernatant was diluted with an equal volume of 100 mmol/l Tris-Cl pH 8, 600 mmol/l NaCl, 40 mmol/l imidazole and subsequently loaded onto a HisTrap HP column pre-equilibrated with 50 mmol/l Tris-Cl pH 8, 300 mmol/l NaCl, 20 mmol/l imidazole. The column was washed with 50 mmol/l Tris-Cl pH 8, 300 mmol/l NaCl, 20 mmol/l imidazole, subsequently with 50 mmol/l Tris-Cl pH 8, 1305 mmol/l NaCl, 20 mmol/l imidazole, and subsequently with 50 mmol/l Tris-Cl pH 8, 300 mmol/l NaCl, 20 mmol/l imidazole. Elution was carried out using 50 mmol/l Tris-Cl pH 8, 200 mmol/l NaCl, 400 mmol/l imidazole, collecting 1 ml fractions at a flow rate of 2 ml/min. After checking protein content in the fractions with Bradford reagent, fractions 5 - 14 were pooled and desalted on PD10 columns into 50 mmol/l Tris-Cl pH 8.
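The dilution step above uses a buffer at twice the concentration of the column equilibration buffer. The following sketch illustrates the underlying mixing arithmetic (C1V1 + C2V2 over total volume). It assumes, purely for illustration, that the lysate supernatant itself contributes none of the three components; the composition of the lysis buffer is not stated in the text, and the function name is hypothetical.

```python
def mix(vol_a, conc_a, vol_b, conc_b):
    """Return component concentrations after mixing two solutions.

    vol_a, vol_b: volumes in the same unit (e.g. ml).
    conc_a, conc_b: dicts mapping component name -> concentration (mmol/l).
    """
    total = vol_a + vol_b
    components = set(conc_a) | set(conc_b)
    return {
        name: (vol_a * conc_a.get(name, 0.0) + vol_b * conc_b.get(name, 0.0)) / total
        for name in components
    }


if __name__ == "__main__":
    # Assumption: supernatant contains no Tris, NaCl or imidazole.
    supernatant = {}
    diluent = {"Tris-Cl": 100.0, "NaCl": 600.0, "imidazole": 40.0}
    print(mix(1.0, supernatant, 1.0, diluent))
```

Under that assumption an equal-volume mix contributes 50 mmol/l Tris-Cl, 300 mmol/l NaCl and 20 mmol/l imidazole, which matches the buffer used to pre-equilibrate the column.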
The resulting solution contained 12.02 mg/ml NusA-His-B42 (MW 65.9 kDa). 0.5 ml TEV protease (1.6 mg/ml) was added and the solution was incubated overnight at 18 °C. 100 mmol/l Tris-Cl pH 8, 600 mmol/l NaCl, 40 mmol/l imidazole was added to a final volume of 40 ml, and the sample was loaded onto a pre-equilibrated HisTrap HP column and eluted with 50 mmol/l Tris-Cl pH 8, 300 mmol/l NaCl, 20 mmol/l imidazole, collecting 2 ml fractions. Bound NusA was eluted from the column with 50 mmol/l Tris-Cl pH 8, 200 mmol/l NaCl, 400 mmol/l imidazole.
Fractions 5 - 16 were pooled after checking their protein content with Bradford reagent. The resulting solution had a protein concentration of 0.467 mg/ml. The solution was diluted to yield a NaCl concentration < 50 mmol/l. The sample was loaded directly onto a HiTrap Q HP column pre-equilibrated with anion buffer pH 8.5. After washing, the column was eluted with a salt gradient from 0 to 1 mol/l NaCl. The column was then washed with 1.5 mol/l NaCl. SDS-PAGE was run to identify peak fractions. Peak fractions were pooled and concentrated on a 3 kDa cutoff centrifugal ultrafiltration membrane (Millipore) and desalted into PBS. The concentration was checked and revealed a protein concentration of 0.486 mg/ml. SDS-PAGE was run to crosscheck for recovery (Fig. 9).
Fraction 23 was concentrated down to a volume of 0.6 ml and desalted into PBS. The resulting protein B42 (MW of 8.931 kDa) was obtained in an amount of 109 µmol.
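The mass concentrations reported in this example can be converted to molar concentrations with the molecular weights given in the text. The sketch below shows the conversion only; it assumes that the 0.486 mg/ml reading refers to B42 alone and the 12.02 mg/ml reading to the intact NusA-His-B42 fusion, and the function name is illustrative.

```python
def mg_per_ml_to_umol_per_l(conc_mg_per_ml: float, molar_mass_da: float) -> float:
    """Convert a mass concentration (mg/ml) to a molar concentration (µmol/l).

    mg/ml is numerically equal to g/l; dividing by the molar mass in g/mol
    gives mol/l, and multiplying by 1e6 gives µmol/l.
    """
    return conc_mg_per_ml / molar_mass_da * 1e6


if __name__ == "__main__":
    # Values taken from the text; which species each reading refers to is an assumption.
    print(f"B42:          {mg_per_ml_to_umol_per_l(0.486, 8931):.1f} µmol/l")
    print(f"NusA-His-B42: {mg_per_ml_to_umol_per_l(12.02, 65900):.1f} µmol/l")
```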
Example 10: Structural comparison of B42 with vMIP-II by circular dichroism (CD) and Fourier transform infrared spectroscopy (FTIR)
The isolated B42 protein was analyzed by circular dichroism (CD) (Fig. 12) and FTIR spectroscopy. Both methods give information on the overall fractions of alpha-helical and beta-sheet secondary structure elements in the sample protein. The experiment was done with a protein concentration of 28.3 µmol/ml in 1x PBS buffer at 20 °C.
CD measurements were carried out in the far UV-range (190-260 nm) using cuvettes of 0.02 cm path length (0.2 mm). The results were averaged over ten repetitive scans at a scan rate of 1 nm. Ellipticity was normalized to molarity using the mass of 8931 Da as determined by mass spectrometry in example 9. The secondary structure was calculated using CONTINLL and SELCON3, both provided in the CDPro software package.
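The normalization of ellipticity to molarity can be sketched as follows, using the standard relation between observed ellipticity, path length and molar concentration. The observed ellipticity and the concentration in the usage example are hypothetical placeholders, not measured values; the residue count of 83 is taken from table 6, and dividing by the number of peptide bonds (N - 1) is one common convention for the mean residue value, assumed here.

```python
def molar_ellipticity(theta_mdeg, conc_mg_per_ml, path_cm, molar_mass_da):
    """Molar ellipticity in deg cm^2 dmol^-1.

    theta_mdeg: observed ellipticity in millidegrees.
    conc_mg_per_ml: protein concentration (mg/ml, numerically g/l).
    path_cm: cuvette path length in cm.
    molar_mass_da: molar mass in g/mol.
    """
    molar_conc = conc_mg_per_ml / molar_mass_da  # mol/l
    return theta_mdeg / (10.0 * path_cm * molar_conc)


def mean_residue_ellipticity(theta_mdeg, conc_mg_per_ml, path_cm,
                             molar_mass_da, n_residues):
    """Molar ellipticity divided by the number of peptide bonds (assumed N - 1)."""
    return molar_ellipticity(theta_mdeg, conc_mg_per_ml, path_cm,
                             molar_mass_da) / (n_residues - 1)


if __name__ == "__main__":
    # Hypothetical reading of -10 mdeg at 1 mg/ml; path length and mass from the text.
    print(molar_ellipticity(-10.0, 1.0, 0.02, 8931))
    print(mean_residue_ellipticity(-10.0, 1.0, 0.02, 8931, 83))
```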
FTIR spectroscopy analysis was performed and the amide I band of the measured spectrum was decomposed into peaks with Peakfit. The percentage of secondary structure elements was calculated using the factors described in de Jongh et al. (de Jongh 1996).
The secondary structure element fractions calculated from the measured CD spectra and the FTIR spectra of B42 are in agreement with the fractions of the 3D model obtained by the structure prediction. The comparison is shown in table 6.
Table 6: Secondary structure element fractions of B42
Secondary structure elements | Min % predicted by model | % measured by CD (average) | % measured by FTIR | Min AA of 83 in model | AA of 83 by CD (average) | AA of 83 by FTIR
α helical | 11 | 15 | 14 | 9 | 12 | 12
β sheet | 16 | 27 | 45 | 13 | 22 | 38
turn | 13 | 19 | 21 | 11 | 15 | 17
unordered | 14 | 35 | 19 | 12 | 29 | 16
Example 11: Generation of a BAC transgenic HeLa cell line with the B42 gene and tag-based B42 protein localization
The following method describes the generation of a cell line expressing an N-terminal GFP-tagged B42 protein and its tag-based protein localization.
A transgenic HeLa cell line was generated carrying the tagged B42 gene in a BAC (bacterial artificial chromosome) using the methodology described previously (Poser 2008).
The Western blot analysis (Fig. 13) of the N-terminally GFP-tagged B42 transgenic cell pool using an antibody against GFP shows a band of the expected size of 38.8 kD (GFP-tag: 31 kD + B42 8.8 kD), which confirms that the transgene is expressed and translated into a protein in vivo. This demonstrates that B42 is a real protein, in contrast to the Ensembl annotation as merely a "processed transcript".
The GFP-tag was then used for localization experiments, and the results (Fig. 14) show localization in the cytoplasm that does not change during the different cell cycle phases. Although chemokines are almost always secreted proteins, B42 is located in the cytoplasm. As the localization pictures were taken from fixed cells, it is likely that secreted B42 was washed away during fixation and is for this reason not seen in the medium surrounding the cells. Another explanation could be that the GFP-tag affects protein secretion or that an additional stimulus is necessary to induce B42 secretion.
Example 12: Expression of B42 in Yeast
The following method describes recombinant expression of B42 polypeptides in yeast.
First, yeast expression vectors are constructed for intracellular production or secretion of B42 from the ADH2/GAPDH promoter. DNA encoding the B42 polypeptide and the promoter is inserted into suitable restriction enzyme sites in the selected plasmid to direct intracellular expression of the B42 polypeptide. For secretion, DNA encoding B42 can be cloned into the selected plasmid, together with DNA encoding the ADH2/GAPDH promoter, a mammalian signal peptide, or, for example, a yeast alpha-factor or invertase secretory signal/leader sequence, and linker sequences (if needed) for expression of B42.
Yeast cells, such as yeast strain AB110, can then be transformed with the expression plasmids described above and cultured in selected fermentation media. The transformed yeast supernatants can be analyzed by precipitation with 10% trichloroacetic acid and separation by SDS-PAGE, followed by staining of the gels with Coomassie Blue stain.
Recombinant B42 polypeptides can subsequently be isolated and purified by removing the yeast cells from the fermentation medium by centrifugation and then concentrating the medium using selected cartridge filters. The concentrate containing the B42 polypeptide may further be purified using selected column chromatography resins.
Example 13: Expression of B42 in Baculovirus-Infected Insect Cells
The following method describes recombinant expression of B42 polypeptides in Baculovirus-infected insect cells.
The sequence coding for B42 is fused upstream of an epitope tag contained within a Baculovirus expression vector. Such epitope tags include poly-His tags and immunoglobulin tags (like Fc regions of IgG). A variety of plasmids may be employed, including plasmids derived from commercially available plasmids such as pVL1393 (Novagen). Briefly, the sequence encoding B42 or the desired portion of the coding sequence of B42 (such as the sequence encoding the extracellular domain of a transmembrane protein, or the sequence encoding the mature protein if the protein is extracellular) is amplified by PCR with primers complementary to the 5' and 3' regions. The 5' primer may incorporate flanking (selected) restriction enzyme sites. The product is then digested with those selected restriction enzymes and subcloned into the expression vector.
Recombinant baculovirus is generated by co-transfecting the above plasmid and BaculoGold(TM) virus DNA (Pharmingen) into Spodoptera frugiperda ("Sf9") cells (ATCC CRL 1711) using lipofectin (commercially available from GIBCO-BRL). After 4-5 days of incubation at 28 °C, the released viruses are harvested and used for further amplifications. Viral infection and protein expression are performed as described by O'Reilley et al., Baculovirus expression vectors: A Laboratory Manual, Oxford: Oxford University Press (1994).
Expressed poly-His tagged B42 can then be purified, for example, by Ni2+-chelate affinity chromatography as follows. Extracts are prepared from recombinant virus-infected Sf9 cells as described by Rupert et al., Nature, 362: 175-179 (1993). Briefly, Sf9 cells are washed, resuspended in sonication buffer (25 mM Hepes, pH 7.9; 12.5 mM MgCl2; 0.1 mM EDTA; 10% glycerol; 0.1% NP-40; 0.4 M KCl), and sonicated twice for 20 seconds on ice. The sonicates are cleared by centrifugation, and the supernatant is diluted 50-fold in loading buffer (50 mM phosphate, 300 mM NaCl, 10% glycerol, pH 7.8) and filtered through a 0.45 µm filter. A Ni2+-NTA agarose column (commercially available from Qiagen) is prepared with a bed volume of 5 mL, washed with 25 mL of water and equilibrated with 25 mL of loading buffer. The filtered cell extract is loaded onto the column at 0.5 mL per minute. The column is washed to baseline A280 with loading buffer, at which point fraction collection is started. Next, the column is washed with a secondary wash buffer (50 mM phosphate; 300 mM NaCl, 10% glycerol, pH 6.0), which elutes nonspecifically bound protein. After reaching A280 baseline again, the column is developed with a 0 to 500 mM imidazole gradient in the secondary wash buffer. One mL fractions are collected and analyzed by SDS-PAGE and silver staining or Western blot with Ni2+-NTA conjugated to alkaline phosphatase (Qiagen). Fractions containing the eluted His10-tagged B42 are pooled and dialyzed against loading buffer. Alternatively, purification of the IgG tagged (or Fc tagged) B42 can be performed using known chromatography techniques, including for instance, Protein A or Protein G column chromatography.
B42 polypeptides were successfully expressed as described above.
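For the imidazole gradient used in the Ni2+-NTA step, the nominal imidazole concentration in each collected fraction can be estimated from a linear gradient. The sketch below assumes a total gradient volume of 50 mL (ten bed volumes), which is a hypothetical value since the text does not state the gradient volume; the 0 to 500 mM range and the 1 mL fraction size are taken from the text.

```python
def linear_gradient(start_mm, end_mm, gradient_volume_ml, fraction_volume_ml):
    """Return the nominal mid-fraction concentration (mM) for each fraction.

    Assumes an ideal linear gradient with no mixing delay or column dead volume.
    """
    n_fractions = int(round(gradient_volume_ml / fraction_volume_ml))
    slope = (end_mm - start_mm) / gradient_volume_ml  # mM per mL
    return [
        start_mm + slope * (i + 0.5) * fraction_volume_ml
        for i in range(n_fractions)
    ]


if __name__ == "__main__":
    # gradient_volume_ml=50.0 is an assumption, not a value from the protocol.
    fractions = linear_gradient(0.0, 500.0, gradient_volume_ml=50.0,
                                fraction_volume_ml=1.0)
    for number, conc in enumerate(fractions[:5], start=1):
        print(f"fraction {number}: ~{conc:.0f} mM imidazole")
```

In practice the delivered concentration lags behind the nominal one because of mixer and column volumes, so the numbers are a rough guide for locating the elution peak rather than exact values.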
Example 14: Preparation of Antibodies that bind B42
This example illustrates preparation of monoclonal antibodies which can specifically bind B42.
Techniques for producing the monoclonal antibodies are known in the art. Immunogens that may be employed include purified B42 polypeptides, fusion proteins containing B42 polypeptides, and cells expressing recombinant B42 polypeptides on the cell surface. Selection of the immunogen can be made by the skilled artisan without undue experimentation.
Mice, such as BALB/c, are immunized with the B42 immunogen emulsified in complete Freund's adjuvant and injected subcutaneously or intraperitoneally in an amount from 1-100 micrograms. Alternatively, the immunogen is emulsified in MPL-TDM adjuvant (Ribi Immunochemical Research, Hamilton, Mont.) and injected into the animal's hind foot pads. The immunized mice are then boosted 10 to 12 days later with additional immunogen emulsified in the selected adjuvant. Thereafter, for several weeks, the mice may also be boosted with additional immunization injections. Serum samples may be periodically obtained from the mice by retro-orbital bleeding for testing in ELISA assays to detect anti-B42 antibodies.
After a suitable antibody titer has been detected, the animals "positive" for antibodies can be injected with a final intravenous injection of B42. Three to four days later, the mice are sacrificed and the spleen cells are harvested. The spleen cells are then fused (using 35% polyethylene glycol) to a selected murine myeloma cell line such as P3X63AgU.1, available from ATCC, No. CRL 1597. The fusions generate hybridoma cells which can then be plated in 96 well tissue culture plates containing HAT (hypoxanthine, aminopterin, and thymidine) medium to inhibit proliferation of non-fused cells, myeloma hybrids, and spleen cell hybrids.
The hybridoma cells will be screened in an ELISA for reactivity against B42. Determination of "positive" hybridoma cells secreting the desired monoclonal antibodies against B42 is within the skill in the art.
The positive hybridoma cells can be injected intraperitoneally into syngeneic BALB/c mice to produce ascites containing the anti-B42 monoclonal antibodies. Alternatively, the hybridoma cells can be grown in tissue culture flasks or roller bottles. Purification of the monoclonal antibodies produced in the ascites can be accomplished using ammonium sulfate precipitation, followed by gel exclusion chromatography. Alternatively, affinity chromatography based upon binding of antibody to Protein A or Protein G can be employed.
Example 15: Purification of B42 Polypeptides Using Specific Antibodies
Native or recombinant B42 polypeptides may be purified by a variety of standard techniques in the art of protein purification. For example, pro-B42 polypeptide, mature B42 polypeptide, or pre-B42 polypeptide is purified by immunoaffinity chromatography using antibodies specific for the B42 polypeptide of interest. In general, an immunoaffinity column is constructed by covalently coupling the anti-B42 polypeptide antibody to an activated chromatographic resin.
Polyclonal immunoglobulins are prepared from immune sera either by precipitation with ammonium sulfate or by purification on immobilized Protein A (Pharmacia LKB Biotechnology, Piscataway, N.J.). Likewise, monoclonal antibodies are prepared from mouse ascites fluid by ammonium sulfate precipitation or chromatography on immobilized Protein A. Partially purified immunoglobulin is covalently attached to a chromatographic resin such as CnBr-activated SEPHAROSE(TM) (Pharmacia LKB Biotechnology). The antibody is coupled to the resin, the resin is blocked, and the derivative resin is washed according to the manufacturer's instructions.
Such an immunoaffinity column is utilized in the purification of B42 polypeptide by preparing a fraction from cells containing B42 polypeptide in a soluble form. This preparation is derived by solubilization of the whole cell or of a subcellular fraction obtained via differential centrifugation by the addition of detergent or by other methods well known in the art. Alternatively, soluble B42 polypeptide containing a signal sequence may be secreted in useful quantity into the medium in which the cells are grown.
A soluble B42 polypeptide-containing preparation is passed over the immunoaffinity column, and the column is washed under conditions that allow the preferential absorbance of B42 polypeptide (e.g., high ionic strength buffers in the presence of detergent). Then, the column is eluted under conditions that disrupt antibody/B42 polypeptide binding (e.g., a low pH buffer such as approximately pH 2-3, or a high concentration of a chaotrope such as urea or thiocyanate ion), and B42 polypeptide is collected.
Example 16: Sequence-based pre-filtering and structure-based chemokine screening of N73
The pre-filtering and structure-based chemokine screening of N73 was performed by the same methodology as described in example 2 for B42.
A redundancy-reduced summary of the results obtained for the novel protein N73 using the fold library containing all known structures of the members of the IL-8-like chemokine fold family is shown in table 7. The "95 rank" of those N73 alignments with chemokine template structures corresponds to the overall rank in the pdb95 library. The novel protein N73 according to SEQ ID No. 9 shows high structural similarity to the known chemokines vMIP-II (rank #1, ID 23.2%) and vMIP-I (rank #2, ID 18.7%), both encoded by Kaposi's sarcoma-associated herpesvirus. Both can block HIV entry through CCR3 and CCR5 (Allen 2007; Luttichau 2007; Pease 2006) and, in addition, vMIP-II is able to block CXCR4 (Table 8). CCL20 (pdb: 1M8A) was found as the best scoring human chemokine template (ID: 21%), followed by the human chemokines CXCL2 (ID: 18.2%) and XCL1 (ID: 12.8%). CCL20 (pdb: 2Q8R) is the only natural ligand of CCR6 (Hoover 2002), which has important functions in the defense response to bacteria by attracting natural killer and memory T cells to sites of inflammation.
Abbreviations and nomenclature used in table 7 are as follows: 95 Rank ... Rank of the sequence-structure alignment in the pdb95 control run (22009 entries), Fold rank ... Rank of the structure with the chemokine fold (217 domains in library), fl/pl ... fold length/path length (good between 0.6 and 1.3), thx idx ... threading index, z_comb ... combined pairwise and surface z-score, z_pair ... z-score corresponding to the pairwise (residue-residue) potential component of the alignment, z_surf ... z-score corresponding to the surface (residue-solvent) potential component of the alignment, z_sec ... z-score corresponding to sequence identity, %ID ... percentage of amino acid sequence identity, pl ... alignment path length, fl ... template fold length, scopid ... SCOP family identifier, pdb id ... template identifier in the Protein Data Bank, chain ... template chain (author chain label), cover ... percentage of domain coverage (folds with coverage lower than 70% are filtered out), descr ... description of PDB template, Uniprot ... Uniprot identifier, Rec. agonist ... chemokine receptors for which the template protein is an agonist (activator), Rec. antagonist ... chemokine receptors for which the template protein is an antagonist (inhibitor).
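The two numeric quality criteria named in the abbreviations (fl/pl in the range 0.6 to 1.3, and domain coverage of at least 70%) can be expressed as a simple filter. This is a minimal sketch: the `Hit` record, its field names and the example entries are hypothetical and do not reproduce ProHit output or the contents of table 7.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One sequence-structure alignment (illustrative fields only)."""
    template: str
    fold_length: int      # fl: template fold length
    path_length: int      # pl: alignment path length
    coverage_pct: float   # percentage of domain coverage

    @property
    def fl_pl(self) -> float:
        return self.fold_length / self.path_length


def passes_filters(hit: Hit, fl_pl_range=(0.6, 1.3), min_coverage_pct=70.0) -> bool:
    """Keep hits with fl/pl strictly inside the range and coverage >= threshold."""
    low, high = fl_pl_range
    return low < hit.fl_pl < high and hit.coverage_pct >= min_coverage_pct


if __name__ == "__main__":
    # Hypothetical example hits, not taken from the tables.
    hits = [
        Hit("templateA", fold_length=70, path_length=75, coverage_pct=92.0),
        Hit("templateB", fold_length=150, path_length=75, coverage_pct=95.0),
        Hit("templateC", fold_length=70, path_length=75, coverage_pct=55.0),
    ]
    print([h.template for h in hits if passes_filters(h)])
```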
Example 17: 3D-Modelling and energy refinement of N73
The MODELLER package was used to generate 3D models of the query sequences with the sequence-to-structure alignments calculated by ProHit. In those 3D models, the distances between cysteines were measured and analyzed for possibilities of disulfide bond formation. To refine the candidate's model generated from the threading alignment, the cysteines were connected, the structure was energy minimized and side chain rotamers were manually adjusted using MOE 2008.10.
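The distance analysis between cysteines can be illustrated with a small sketch that lists all cysteine pairs closer than a cutoff. The atom choice, the 4.5 Å cutoff, the residue numbers and the coordinates are all illustrative assumptions; the text does not state which atoms or thresholds were used, and this is not the MODELLER or MOE workflow itself.

```python
from itertools import combinations
from math import dist


def candidate_disulfides(cys_coords, max_distance=4.5):
    """Return cysteine pairs whose chosen atoms lie within max_distance (Å).

    cys_coords: dict mapping residue number -> (x, y, z) of one atom per
    cysteine (for example the C-beta atom). The cutoff is an assumed value.
    """
    pairs = []
    for (res_i, xyz_i), (res_j, xyz_j) in combinations(sorted(cys_coords.items()), 2):
        distance = dist(xyz_i, xyz_j)
        if distance <= max_distance:
            pairs.append((res_i, res_j, round(distance, 2)))
    return pairs


if __name__ == "__main__":
    # Made-up coordinates for four cysteines, for demonstration only.
    coords = {
        10: (0.0, 0.0, 0.0),
        11: (10.0, 0.0, 0.0),
        34: (3.0, 0.0, 0.0),
        50: (10.0, 4.0, 0.0),
    }
    print(candidate_disulfides(coords))
```

A cysteine that appears in more than one candidate pair indicates alternative bonding patterns that need manual inspection, as was done for the two- and three-bond variants of the model.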
The sequence-to-structure alignment of N73 with its best ranked template structure vMIP-II (1fj2) shows a good match of the structurally important cysteines involved in disulfide bridges (Fig. 15). Two or three disulfide bonds could be formed: either two bonds between C54-C75 and C55-C102 or, in case three bonds are formed, between C49-C75, C54-C102 and C55-C104 (indicated with black lines between cysteines in Fig. 15). In both cases, C54 and C55 are arranged in a CC-motif known from chemokines. The second model of N73 containing 3 disulfide bonds is shown in Fig. 16.
By visual inspection, one relatively long loop (40S loop) with an insertion of 9 residues can be found in the 1fj2-like 3D model of N73 after the second sheet. The sequence identity is relatively high at 23%, and conservation of important amino acid properties can be observed. The positions of proline and glycine residues in the N2-loop are conserved (P74, P76, G79, G81, G82), as are the hydrophobic properties of residues forming the core of the protein (alignment numbering from Fig. 15: I64, W72, A84, V98, L113, W114, V119).
A detailed 3D-model of the N73 protein with the amino acid sequence according to SEQ ID No. 9 and the experimental structure of vMIP-II are shown in Fig. 16. Superposition of both structures is shown in Fig. 16 (middle). Disulfide bonds in the tertiary structure of the protein with the amino acid sequence according to SEQ ID No. 9 are equivalent in 3D to the bonds found in the template and other known chemokines.
Example 18: pdb95 ranking control of N73
For sequences with potential disulfide bonds, a control threading run is performed to assess the quality and ranking of the chemokine structure alignment within the structural templates of the whole PDB. The pdb95 is used as fold library (22009 entries); it comprises all 3D structures known to date with less than 95% sequence identity, and all chemokine and cytokine structures were additionally added to that library. A sequence hit was further analyzed if the chemokine alignment was within the top 10 folds in the pdb95 fold library and if, by manual inspection, none of the better ranked alignments was considered superior to the chemokine template in terms of gap content, cysteine matching and disulfide bonding possibilities. A redundancy-reduced summary of the pdb95 control results of the N73 protein sequence is shown in table 8. Ranking with this fold library confirms that the structure of the novel protein resembles an IL-8-like chemokine fold.
The presence of a signal peptide implies that the protein's sub-cellular location is extracellular. Thus the 6 folds known to be secreted are highlighted with bold letters in the scop column, and only 5 of those were considered of "high confidence" based on the fold coverage value (0.6 < fl/pl < 1.3, bold and italic letters in the fl/pl column). The best chemokine template structure is also the best ranked high confidence prediction in the pdb95 control run, presenting all secondary structure elements, low gap content as well as a good cysteine arrangement in 3D to form disulfide bonds. The other HC predictions were dismissed as they either do not present a good cysteine arrangement in 3D (rank #11), do not cover all secondary structure elements (rank #6) or present a high gap content (rank #24).
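The automatic part of this control, checking whether the best chemokine alignment falls within the top 10 folds, can be sketched as below. The fold labels and the ranked list are hypothetical; the subsequent manual inspection of gap content and cysteine matching is not modeled.

```python
def best_fold_rank(ranked_folds, target_fold):
    """Return the 1-based rank of the first occurrence of target_fold, or None."""
    for rank, fold in enumerate(ranked_folds, start=1):
        if fold == target_fold:
            return rank
    return None


def within_top(ranked_folds, target_fold, n=10):
    """True if target_fold appears among the first n entries of the ranking."""
    rank = best_fold_rank(ranked_folds, target_fold)
    return rank is not None and rank <= n


if __name__ == "__main__":
    # Hypothetical ranking, one entry per fold, best score first.
    ranking = ["foldX", "IL8-like chemokine", "foldY", "foldZ"]
    print(best_fold_rank(ranking, "IL8-like chemokine"))
    print(within_top(ranking, "IL8-like chemokine"))
```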
Thus, the novel protein shows high structural similarity to the IL-8-like chemokine vMIP-II even among all possible protein folds (rank #2, table 8).
Abbreviations and nomenclature used in table 8 are as follows: Rank ... Rank of the sequence-structure alignment in the pdb95 control run (22009 entries), Confid. ... Confidence class: HC=High Confidence, FP=false positive, NS=Not secreted, NL=Not Location predicted, fl/pl ... fold length/path length (good between 0.6 and 1.3), avThx ... Average Threading Index calculated from the best 20 structures of the fold, thx idx ... threading index, z_comb ... combined pairwise and surface z-score, z_pair ... z-score corresponding to the pairwise (residue-residue) potential component of the alignment, z_surf ... z-score corresponding to the surface (residue-solvent) potential component of the alignment, z_sec ... z-score corresponding to sequence identity, %ID ... percentage of amino acid sequence identity, pl ... alignment path length, fl ... template fold length, scop ... SCOP family identifier, pdb ... template identifier in the Protein Data Bank, chain ... template chain (author chain label), %cover. ... percentage of domain coverage (folds with coverage lower than 70% are filtered out), descr ... description of pdb template, Pfam ... Pfam description, Uniprot ... Uniprot identifier.
Table 8: N73 pdb95 results (first hit per fold only)
Example 19: Molecular dynamics simulation of N73
Molecular Dynamics (MD) simulations were performed as described in example 5 with the refined models from example 17 and their stability was evaluated by comparison with a known chemokine used as control.
The N73 RMSD plot of the 10 ns MD simulation is shown in Fig. 17. The MD simulation of the N73 protein with the amino acid sequence according to SEQ ID No. 9 shows that the backbone RMSD is even lower than expected from the control chemokine model and stable around 0.5 nm throughout the simulation. The RMSD values of the secondary structure elements of the control 3D model of CXCL17 fluctuate around 0.2-0.3 nm (Fig. 17) throughout the simulation. Likewise, the secondary structure elements of N73 are stable and their RMSD is not higher than 0.2 nm. Only the insertion in the 40S loop connecting the 2nd and 3rd sheet is flexible at the beginning of the simulation and packs more tightly to the protein core after refinement. Nevertheless, the N73 model is overall more stable than the control chemokine. After the MD, the resulting model has kept all secondary structure elements and has a "native-like" overall appearance (RMSD between start and end is only 0.5 nm).
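The RMSD values discussed above measure the average displacement of equivalent atoms between two conformations. A minimal sketch of the calculation follows; it assumes that the two coordinate sets contain the same atoms in the same order and have already been superimposed, since no fitting step is included. The coordinates in the usage example are made up.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized coordinate lists.

    Each list holds (x, y, z) tuples in the same atom order; units are
    preserved (nm in, nm out). No superposition is performed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Two hypothetical two-atom frames (nm).
    start = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    end = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0)]
    print(f"{rmsd(start, end):.3f} nm")
```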
Example 20: Computational gene and protein analysis of N73
Computational gene and protein analysis of the N73 protein was performed according to example 6. N73 was first identified as a gene related to cancer development and progression in a large-scale cDNA transfection screening (Wan 2004). The transfection with N73 cDNA resulted in an inhibition of growth in the Hepatoma 7721 cell line (-87%) and stimulation of growth in the fibroblast cell line NIH3T3 (128%).
A signal peptide was identified that is cleaved at sequence position 23, but no transmembrane region was found. WoLF PSORT predicts extracellular location with a score of 25. The ELM server was queried with the N73 protein sequence using the extracellular filter, and the signal peptide prediction from 1-23 was confirmed. Furthermore, low complexity regions (LCR) were predicted from positions 33-53 and 164-182. An N-glycosylation site (pos: 28-33) and a glycosaminoglycan attachment site (pos: 147-150) were identified. An RGD motif found at positions 198-200 is also noteworthy. This is a receptor binding motif recognized by different members of the integrin family. The RGD motif is known to lie on an exposed flexible loop if functional (Assa-Munt 2001), which is the case for N73, as it is located at the C-terminus after an LCR. RGD motifs in proteins of the extracellular matrix are used by integrins to link the intracellular cytoskeleton of cells with the extracellular matrix. Without attachment to the extracellular matrix, cells normally undergo apoptosis ('anoikis') (Frisch 2001). Small soluble peptides containing the RGD motif inhibit cell attachment and consequently induce apoptosis of the integrin presenting cells (Kim 2003). This makes RGD peptides and mimetics potential therapeutic agents against tumor-induced angiogenesis, inflammation and cancer metastasis. Thus, it might be possible that the soluble secreted N73 protein has similar functions through integrin interaction with its RGD motif.
Currently, there is one other chemokine known to interact with integrins, CXCL4, despite lacking an RGD motif. CXCL4 is a potent inhibitor of tumor-induced angiogenesis and it is suggested that the chemokine/integrin crosstalk may contribute to the anti-angiogenic effect of CXCL4 (Aidoudi 2008). Thus, it might be possible that N73 has similar anti-angiogenic functions through integrin interaction with its RGD motif.
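Motif positions such as those reported above can be located with a simple regular-expression scan. The sketch below uses the literal RGD tripeptide and the widely used simplified N-glycosylation sequon N-x-S/T (x not proline); these are not the exact ELM patterns, and the toy sequence is made up because the N73 sequence is not reproduced in this section.

```python
import re

# Simplified patterns (assumptions, not the ELM regular expressions).
MOTIFS = {
    "RGD": "RGD",
    "N-glycosylation sequon": "N[^P][ST]",
}


def find_motifs(sequence: str, pattern: str):
    """Return (start, end, match) tuples with 1-based inclusive positions.

    A lookahead is used so that overlapping matches are also reported.
    """
    regex = re.compile(f"(?=({pattern}))")
    return [
        (m.start(1) + 1, m.end(1), m.group(1))
        for m in regex.finditer(sequence.upper())
    ]


if __name__ == "__main__":
    toy_sequence = "MKNATRGDL"  # hypothetical peptide for demonstration
    for name, pattern in MOTIFS.items():
        print(name, find_motifs(toy_sequence, pattern))
```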
N73 is conserved across different primate species. After cleavage of the signal peptide, the processed protein contains 8 cysteines that could potentially form disulfide bonds. A BLAST search and ENSEMBL (table 9) identified one close ortholog in gorilla (SEQ ID No. 13), which shares 79% sequence identity and has all 8 cysteine residues of the processed protein sequence conserved, and another in bushbaby (SEQ ID No. 14), which shares 41% sequence identity (Fig. 19).
Table 9: N73 orthologs by Ensembl Rel.55 - July 2009
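Percent sequence identity between two orthologs is computed from a pairwise alignment. The sketch below counts identical residues over the columns in which neither sequence has a gap; other denominators (for example the full alignment length or the shorter sequence) are also in use, and the text does not state which convention BLAST or ENSEMBL output was read with. The aligned strings are made up.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns without gaps in either sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap
    ]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical aligned fragments, not the N73 ortholog sequences.
    print(percent_identity("ACDE-F", "ACDQGF"))
```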
The N73 gene (ENSG00000184334) comprises 3 exons with the protein coding region on the first one only. The open reading frame is located on chromosome 8q24.3 (142,513,827-142,517,232 forward strand) and in close proximity to the N73 gene an orphan G protein coupled receptor (GPR20) can be found (Fig. 18). At present, no other chemokine is known to be located on this chromosome.
A poly-A site was predicted using Polyadq (Tabaska 1999) and Polyah.pl (softberry) at position +534 downstream in the human gene. Furthermore, the promoter region of N73 was analyzed for transcription factor binding sites, and several binding sites were identified (table 10) that are known from other chemokines, such as NF-IL6, NF-kappaB and AP1 (c-Jun) (Chong 2000; Genin 2000; Gerber 2004; Roebuck 1999; Ueda 2007; Widmer 1993; Yeruva 2008).
Table 10: transcription factor binding sites found in human N73 promoter known from other chemokines
Example 21: Information available in public resources about in vivo expression of the N73 nucleic acid sequence
GeneNote (Gene Normal Tissue Expression), a survey of complete gene expression profiles in 12 healthy human tissues (Shmueli 2003), found that the N73 transcript is widely expressed in all tissues tested, namely heart, skeletal muscle, bone marrow, brain, kidney, liver, lung, pancreas, prostate, spinal cord, spleen and thymus (Fig. 20).
Example 22: In vitro expression and purification of the N73 protein in E. coli
The following method describes recombinant expression of N73 polypeptides in E. coli.
The gene encoding the mature N73 protein is synthesized and cloned into the expression plasmid pETMM-60. Expression and purification can be performed as described in Magistrelli et al. (Magistrelli 2005), using a His-tagged NusA fusion protein as fusion partner to keep N73 soluble, and Origami B cells (for disulfide bond formation in the cytoplasm because of a glutathione S-transferase mutation).
The N73 protein is expressed in E. coli and is soluble as a fusion protein. The His-tagged fusion protein can be purified using a HisTrap HP column, collecting the HisTrap elution. The construct is then cleaved from NusA with TEV protease, leaving a GA-addition at the N-terminus of the N73 protein and the His-tag on NusA. The N73 protein can be separated from NusA and purified by running another HisTrap HP column and collecting the flow-through, as the His-tagged NusA binds to the column but the N73 protein does not.
Example 23: Expression of N73 in Yeast
The following method describes recombinant expression of N73 polypeptides in yeast.
First, yeast expression vectors are constructed for intracellular production or secretion of N73 from the ADH2/GAPDH promoter. DNA encoding the N73 polypeptide and the promoter is inserted into suitable restriction enzyme sites in the selected plasmid to direct intracellular expression of the N73 polypeptide. For secretion, DNA encoding N73 can be cloned into the selected plasmid, together with DNA encoding the ADH2/GAPDH promoter, native N73 signal peptide or another mammalian signal peptide, or, for example, a yeast alpha-factor or invertase secretory signal/leader sequence, and linker sequences (if needed) for expression of N73.
Yeast cells, such as yeast strain AB110, can then be transformed with the expression plasmids described above and cultured in selected fermentation media. The transformed yeast supernatants can be analyzed by precipitation with 10% trichloroacetic acid and separation by SDS-PAGE, followed by staining of the gels with Coomassie Blue stain. Recombinant N73 polypeptides can subsequently be isolated and purified by removing the yeast cells from the fermentation medium by centrifugation and then concentrating the medium using selected cartridge filters. The concentrate containing the N73 polypeptide may further be purified using selected column chromatography resins.
Example 24: Expression of N73 in Baculovirus-Infected Insect Cells
The following method describes recombinant expression of N73 polypeptides in Baculovirus-infected insect cells.
The sequence coding for N73 is fused upstream of an epitope tag contained within a Baculovirus expression vector. Such epitope tags include poly-His tags and immunoglobulin tags (like Fc regions of IgG). A variety of plasmids may be employed, including plasmids derived from commercially available plasmids such as pVL1393 (Novagen). Briefly, the sequence encoding N73, or the desired portion of the coding sequence of N73 (such as the sequence encoding the extracellular domain of a transmembrane protein, or the sequence encoding the mature protein if the protein is extracellular), is amplified by PCR with primers complementary to the 5' and 3' regions. The 5' primer may incorporate flanking (selected) restriction enzyme sites. The product is then digested with those selected restriction enzymes and subcloned into the expression vector.
Recombinant baculovirus is generated by co-transfecting the above plasmid and BaculoGold(TM) virus DNA (Pharmingen) into Spodoptera frugiperda ("Sf9") cells (ATCC CRL 1711) using lipofectin (commercially available from GIBCO-BRL). After 4-5 days of incubation at 28°C, the released viruses are harvested and used for further amplifications. Viral infection and protein expression are performed as described by O'Reilly et al., Baculovirus expression vectors: A Laboratory Manual, Oxford: Oxford University Press (1994).
Expressed poly-His tagged N73 can then be purified, for example, by Ni2+-chelate affinity chromatography as follows. Extracts are prepared from recombinant virus-infected Sf9 cells as described by Rupert et al., Nature, 362: 175-179 (1993). Briefly, Sf9 cells are washed, resuspended in sonication buffer (25 mM Hepes, pH 7.9; 12.5 mM MgCl2; 0.1 mM EDTA; 10% glycerol; 0.1% NP-40; 0.4 M KCl), and sonicated twice for 20 seconds on ice. The sonicates are cleared by centrifugation, and the supernatant is diluted 50-fold in loading buffer (50 mM phosphate, 300 mM NaCl, 10% glycerol, pH 7.8) and filtered through a 0.45 µm filter. A Ni2+-NTA agarose column (commercially available from Qiagen) is prepared with a bed volume of 5 mL, washed with 25 mL of water and equilibrated with 25 mL of loading buffer. The filtered cell extract is loaded onto the column at 0.5 mL per minute. The column is washed to baseline A280 with loading buffer, at which point fraction collection is started. Next, the column is washed with a secondary wash buffer (50 mM phosphate; 300 mM NaCl, 10% glycerol, pH 6.0), which elutes nonspecifically bound protein. After reaching A280 baseline again, the column is developed with a 0 to 500 mM imidazole gradient in the secondary wash buffer. One mL fractions are collected and analyzed by SDS-PAGE and silver staining or Western blot with Ni2+-NTA conjugated to alkaline phosphatase (Qiagen). Fractions containing the eluted His10-tagged N73 are pooled and dialyzed against loading buffer.
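The imidazole gradient described above can be pictured as a nominal concentration per 1 mL fraction. The following is a minimal sketch, assuming a linear gradient; the total gradient volume (here 50 mL) is not given in the text and is a hypothetical value chosen for illustration, as is the function name.

```python
def imidazole_gradient(start_mm, end_mm, gradient_volume_ml, fraction_volume_ml=1.0):
    """Nominal imidazole concentration (mM) at the midpoint of each collected fraction.

    Assumes a perfectly linear gradient with no mixing delay or dead volume.
    """
    n_fractions = int(gradient_volume_ml / fraction_volume_ml)
    concentrations = []
    for i in range(n_fractions):
        # Volume delivered when the middle of fraction i leaves the column
        midpoint_ml = (i + 0.5) * fraction_volume_ml
        conc = start_mm + (end_mm - start_mm) * midpoint_ml / gradient_volume_ml
        concentrations.append(conc)
    return concentrations


# Hypothetical 50 mL gradient (the text states only the 0 to 500 mM range
# and the 1 mL fraction size).
profile = imidazole_gradient(0.0, 500.0, gradient_volume_ml=50.0)
for number, conc in enumerate(profile[:3], start=1):
    print(f"fraction {number}: about {conc:.0f} mM imidazole")
```

Such a table only helps to relate the fraction in which the tagged protein elutes to an approximate imidazole concentration; real elution profiles depend on the column and pump used.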
Alternatively, purification of the IgG tagged (or Fc tagged) N73 can be performed using known chromatography techniques, including for instance, Protein A or Protein G column chromatography.
N73 polypeptides were successfully expressed as described above.
Example 25: Preparation of Antibodies that bind N73
This example illustrates preparation of monoclonal antibodies which can specifically bind N73.
Techniques for producing the monoclonal antibodies are known in the art. Immunogens that may be employed include purified N73 polypeptides, fusion proteins containing N73 polypeptides, and cells expressing recombinant N73 polypeptides on the cell surface. Selection of the immunogen can be made by the skilled artisan without undue experimentation.
Mice, such as BALB/c, are immunized with the N73 immunogen emulsified in complete Freund's adjuvant and injected subcutaneously or intraperitoneally in an amount from 1-100 micrograms. Alternatively, the immunogen is emulsified in MPL-TDM adjuvant (Ribi Immunochemical Research, Hamilton, Mont.) and injected into the animal's hind foot pads. The immunized mice are then boosted 10 to 12 days later with additional immunogen emulsified in the selected adjuvant. Thereafter, for several weeks, the mice may also be boosted with additional immunization injections. Serum samples may be periodically obtained from the mice by retro-orbital bleeding for testing in ELISA assays to detect anti-N73 antibodies.
After a suitable antibody titer has been detected, the animals "positive" for antibodies can be injected with a final intravenous injection of N73. Three to four days later, the mice are sacrificed and the spleen cells are harvested. The spleen cells are then fused (using 35% polyethylene glycol) to a selected murine myeloma cell line such as P3X63AgU.1, available from ATCC, No. CRL 1597. The fusions generate hybridoma cells which can then be plated in 96 well tissue culture plates containing HAT (hypoxanthine, aminopterin, and thymidine) medium to inhibit proliferation of non-fused cells, myeloma hybrids, and spleen cell hybrids.
The hybridoma cells will be screened in an ELISA for reactivity against N73. Determination of "positive" hybridoma cells secreting the desired monoclonal antibodies against N73 is within the skill in the art.
The positive hybridoma cells can be injected intraperitoneally into syngeneic BALB/c mice to produce ascites containing the anti-N73 monoclonal antibodies. Alternatively, the hybridoma cells can be grown in tissue culture flasks or roller bottles. Purification of the monoclonal antibodies produced in the ascites can be accomplished using ammonium sulfate precipitation, followed by gel exclusion chromatography. Alternatively, affinity chromatography based upon binding of antibody to Protein A or Protein G can be employed.
Example 26: Purification of N73 Polypeptides Using Specific Antibodies
Native or recombinant N73 polypeptides may be purified by a variety of standard techniques in the art of protein purification. For example, pro-N73 polypeptide, mature N73 polypeptide, or pre-N73 polypeptide is purified by immunoaffinity chromatography using antibodies specific for the N73 polypeptide of interest. In general, an immunoaffinity column is constructed by covalently coupling the anti-N73 polypeptide antibody to an activated chromatographic resin.
Polyclonal immunoglobulins are prepared from immune sera either by precipitation with ammonium sulfate or by purification on immobilized Protein A (Pharmacia LKB Biotechnology, Piscataway, N.J.). Likewise, monoclonal antibodies are prepared from mouse ascites fluid by ammonium sulfate precipitation or chromatography on immobilized Protein A. Partially purified immunoglobulin is covalently attached to a chromatographic resin such as CnBr-activated SEPHAROSE(TM) (Pharmacia LKB Biotechnology). The antibody is coupled to the resin, the resin is blocked, and the derivative resin is washed according to the manufacturer's instructions.
Such an immunoaffinity column is utilized in the purification of N73 polypeptide by preparing a fraction from cells containing N73 polypeptide in a soluble form. This preparation is derived by solubilization of the whole cell or of a subcellular fraction obtained via differential centrifugation by the addition of detergent or by other methods well known in the art. Alternatively, soluble N73 polypeptide containing a signal sequence may be secreted in useful quantity into the medium in which the cells are grown.
A soluble N73 polypeptide-containing preparation is passed over the immunoaffinity column, and the column is washed under conditions that allow the preferential adsorption of N73 polypeptide (e.g., high ionic strength buffers in the presence of detergent). Then, the column is eluted under conditions that disrupt antibody/N73 polypeptide binding (e.g., a low pH buffer such as approximately pH 2-3, or a high concentration of a chaotrope such as urea or thiocyanate ion), and N73 polypeptide is collected.
Example 27: Sequence-based pre-filtering and structure-based chemokine screening of F10
The pre-filtering and structure-based chemokine screening of F10 was performed by the same methodology as described in example 2 for B42.
A redundancy-reduced summary of the results obtained for the novel protein F10 using the fold library containing all known structures of the members of the IL-8-like chemokine fold family is shown in table 11. The "95 rank" of those F10 alignments with chemokine template structures corresponds to the overall rank in the pdb95 library. The novel protein F10 according to SEQ ID No. 16 shows high structural similarity to the known chemokines vMIP-I (rank #1, 20.0%, SEQ ID No. 38), which is encoded by Kaposi's sarcoma-associated herpesvirus, and CXCL10 (rank #4, 17.8%, SEQ ID No. 39). The best scoring human chemokine template (ID: 16,7%) was CXCL8 (rank #2).
The best ranked sequence-to-structure alignment of the F10 protein sequence is found with the viral MIP-I (ID: 20%), a member of the CC chemokine family, whereas the second and third ranked alignments are with the CXC chemokines CXCL8/IL8 (ID: 16,7%) and CXCL4/PF4 (ID: 17,7%, Table 11). IL8 is known to promote tumor cell survival and it was reported that depletion of CXCL8 expression by RNAi causes cell cycle arrest and spontaneous apoptosis in two IL8 secreting androgen independent prostate cancer (AIPC) cell lines (PC-3, DU145) (Singh 2009). A cell cycle arrest/cell division defect phenotype was also observed after RNAi knock-down of F10 in HeLa cells, but at the G2/M/8N boundary, whereas the IL8 knockdown caused arrest at the G1/S boundary (see Example 34).
Abbreviations and nomenclature used in table 11 are as follows: 95 Rank ... Rank of the sequence-structure alignment in the pdb95 control run (22009 entries), Fold rank ... Rank of the structure with the chemokine fold (217 domains in library), fl/pl ... fold length/path length (good between 0,6 and 1,3), thx idx ... threading index, z_comb ... combined pairwise and surface z-score, z_pair ... z-score corresponding to the pairwise (residue-residue) potential component of the alignment, z_surf ... z-score corresponding to the surface (residue-solvent) potential component of the alignment, z_sec ... z-score corresponding to sequence identity, %ID ... percentage of amino acid sequence identity, pl ... alignment path length, fl ... template fold length, scopid ... SCOP family identifier, pdb id ... template identifier in the Protein Data Bank, chain ... template chain (author chain label), cover ... percentage of domain coverage (folds with coverage lower than 70% are filtered out), descr ... description of PDB template, Uniprot ... Uniprot identifier, Rec. agonist ... Chemokine receptors for which the template protein is an agonist (activator), Rec. antagonist ... Chemokine receptors for which the template protein is an antagonist (inhibitor).
Table 11: A redundancy-reduced summary of the threading results obtained for the novel protein F10 using the fold library containing all known structures of the members of the IL-8-like chemokine fold family.
Example 28: 3D-Modelling and energy refinement of F10
The MODELLER package was used to generate 3D models of the query sequences with the sequence to structure alignments calculated by ProHit. In those 3D models, the distances between cysteines were measured and analyzed for possibilities of disulfide bond formation. To refine the candidate's model generated from the threading alignment, the cysteines were connected, the structure was energy minimized and side chain rotamers were manually adjusted using MOE2008.10.
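The cysteine distance check described above can be illustrated with a short sketch. It is not the procedure used with MODELLER or MOE; the coordinates, the function name and the 3.0 Å cutoff are hypothetical values chosen only to show the idea of flagging cysteine pairs whose sulfur atoms lie close enough to be considered for a disulfide bond.

```python
import math
from itertools import combinations


def candidate_disulfides(sg_coords, max_distance=3.0):
    """Return cysteine pairs whose SG atoms are within max_distance (in Angstrom).

    sg_coords maps a residue number to the (x, y, z) position of the
    cysteine SG atom. The cutoff is an illustrative assumption; an
    unrefined model may need a more permissive value.
    """
    pairs = []
    for (res_a, xyz_a), (res_b, xyz_b) in combinations(sorted(sg_coords.items()), 2):
        distance = math.dist(xyz_a, xyz_b)
        if distance <= max_distance:
            pairs.append((res_a, res_b, round(distance, 2)))
    return pairs


# Hypothetical SG coordinates for three cysteines of a model (not taken
# from any real structure).
example_sg = {
    20: (0.0, 0.0, 0.0),
    47: (2.0, 0.5, 0.0),
    60: (15.0, 3.0, 8.0),
}
print(candidate_disulfides(example_sg))
```

With these made-up coordinates only the pair 20/47 falls under the cutoff; in practice the distances would be read from the model file.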
The sequence to structure alignment of F10 with its best ranked template structure vMIP-I (pdb: 1zxt, SEQ ID No. 38) shows the presence of the structurally important cysteines involved in disulfide bonds (Fig. 22). One disulfide bridge can be formed between C20 and C47 in a C-motif. Visual inspection of the 1zxt-like 3D model of F10 shows good packing of the protein without any long loops. The sequence identity is not high at 20%, but conservation of important amino acid properties can be observed.
Positions of proline and glycine residues are very well conserved (P27-29, P40, P45, P63) as well as hydrophobic properties of residues forming the core of the protein (pos.: V30, F49, F51, F77, V78, F76). A sequence to structure alignment of F10 with vMIP-I is shown in Fig. 21.
To date, no 3D structure of a chemokine in complex with its receptor is known; however, mutation studies in CXCL10 (SEQ ID No. 39) identified residues that are important for CXCR3 receptor binding and signaling (Booth 2002). Residues important for CXCR3 binding include: V7, R8, Q17, V19, Q34, R38, T44. The chemokine-like 3D model of F10 was compared with the CXCL10 X-ray structure (Fig. 23). Evaluation of the presence of the CXCR3 receptor binding residues in F10 revealed that all binding requirements can be found at equivalent regions in space. Residues V7 and V19 of CXCL10 are represented by A15 and P28 in F10; R38, R8 of CXCL10 by R46, K43 in F10; and Q17 and Q34 of CXCL10 by T26 and T42 in F10. The F10 model does not provide an equivalent residue for T44 of CXCL10 (F52 in F10).
The structure-based sequence alignment of F10 with chemokines known to bind CXCR3 (CXCL9, CXCL10 and CXCL11) is shown in Fig. 24. Conservation is indicated as well as the residues involved in binding to CXCR3. Although the residues important for receptor binding do not show sequence-position conservation in F10, they are recognizable at equivalent functional positions in space (Fig. 23).
Example 29: pdb95 ranking control of F10
For sequences with potential disulfide bonds, a control threading run is performed to assess the quality and ranking of the chemokine structure alignment within the structural templates of the whole PDB. The pdb95 is used as fold library (22009 entries), which comprises all 3D structures known to date with less than 95% sequence identity; furthermore, all chemokine and cytokine structures were added to that library. A sequence hit was further analyzed if the chemokine alignment was within the top 10 folds in the pdb95 fold library and if, by manual inspection, none of the better ranked alignments was considered superior to the chemokine template in terms of gap content, cysteine matching and disulfide bonding possibilities.
A redundancy-reduced summary of the pdb95 control results of the F10 protein sequence is shown in table 12. Ranking with this fold library confirms that the structure of the novel protein resembles an IL-8-like chemokine fold. The presence of a signal peptide implies that the protein's sub-cellular location is extracellular. The folds known to be secreted are highlighted with bold letters in the scop column and only 1 of those (shown in bold and underlined) was considered of "high confidence" based on the fold coverage value (0.6<fl/pl<1.3, green background in the fl/pl column). Thus the best chemokine/IL8-like template structure (vMIP-I; pdb: 1zxt, SEQ ID No. 38) is also the best ranked high confidence prediction in the pdb95 control run, presenting all secondary structure elements, low gap content as well as a good cysteine arrangement in 3D to form disulfide bonds.
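The "high confidence" criteria named above (secreted fold, 0.6 < fl/pl < 1.3, domain coverage of at least 70%) can be expressed as a simple filter. This is a minimal sketch under stated assumptions: the record layout, the field names and the example hits are hypothetical and do not reproduce the ProHit output format or any value from table 12.

```python
from dataclasses import dataclass


@dataclass
class ThreadingHit:
    rank: int                # rank in the pdb95 control run
    template: str            # template label (hypothetical here)
    fold_length: int         # fl
    path_length: int         # pl
    coverage_percent: float  # percentage of domain coverage
    secreted: bool           # whether the template fold is known to be secreted


def is_high_confidence(hit, fl_pl_min=0.6, fl_pl_max=1.3, min_coverage=70.0):
    """Apply the three criteria from the text to one threading hit."""
    ratio = hit.fold_length / hit.path_length
    return (
        hit.secreted
        and fl_pl_min < ratio < fl_pl_max
        and hit.coverage_percent >= min_coverage
    )


# Hypothetical hits for illustration only.
hits = [
    ThreadingHit(1, "template_A", 300, 100, 90.0, False),  # not secreted, ratio too high
    ThreadingHit(2, "template_B", 70, 75, 95.0, True),     # passes all criteria
    ThreadingHit(3, "template_C", 68, 70, 55.0, True),     # coverage too low
]
high_confidence = [h.template for h in hits if is_high_confidence(h)]
print(high_confidence)
```

The manual inspection step (gap content, cysteine matching) is not captured by such a filter and would still follow it.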
When using the GO terms "cell cycle arrest" and "negative regulation of cell cycle" as additional false positive filter, there is only one better ranking fold that could explain the cell cycle phenotype observed in the RNAi cell cycle screen, which is rank #14 (PDB: 1wen). But this template at rank #14 is a nuclear protein and not secreted, whereas CXCL8, the second best ranked member of the chemokine family, is both secreted and annotated with the GO term "cell cycle arrest".
Thus, ranking with this fold library revealed that the putative structure of the novel protein is very likely to be a chemokine from the IL-8-like chemokine fold family. The novel protein shows high structural similarity to vMIP-I even among all possible protein folds (rank #14 but the only high confidence secreted protein, table 12).
Abbreviations and nomenclature used in table 12 are as follows: Rank ... Rank of the sequence-structure alignment in pdb95 control run (22009 entries), Confid. ... Confidence class: HC=High Confidence, FP=false positive, NS=Not secreted, NL=Not Location predicted, fl/pl ... fold length/path length (good between 0,6 and 1,3), avThx ... Average Threading Index calculated from the best 20 structures of the fold, thx idx ... threading index, z_comb ... combined pairwise and surface z-score, z_pair ... z-score corresponding to the pairwise (residue-residue) potential component of the alignment, z_surf ... z-score corresponding to the surface (residue-solvent) potential component of the alignment, z_sec ... z-score corresponding to sequence identity, %ID ... percentage of amino acid sequence identity, pl ... alignment path length, fl ... template fold length, scop ... SCOP family identifier, pdb ... template identifier in the Protein Data Bank, chain ... template chain (author chain label), %cover. ... percentage of domain coverage (folds with coverage lower than 70% are filtered out), descr ... description of pdb template, Pfam ... Pfam description.
Table 12: F10 - pdb95 results first per fold only
Example 30: Molecular dynamics of F10
Molecular Dynamics (MD) simulations were performed as described in example 5 with the refined models from example 28 and their stability was evaluated by comparison with a known chemokine used as control.
Likewise, the MD simulation of the F10 protein with the amino acid sequence according to SEQ ID No. 16 shows as good stability of the secondary structure elements as the control 3D model of CXCL17 with low RMSD values throughout the simulation (Fig. 25: lower graph). The backbone RMSD stays around 0.3-0.4 nm and thus much lower than expected from the control CXCL17 chemokine model with 0.6-0.8 nm (Fig. 25: upper graph). At the same time the RMSD is stable throughout the simulation as well as the secondary structure elements. Their RMSD is not higher than 0.2-0.3 nm and no long flexible loops are present. The resulting model after the MD has "native-like" conformation containing all secondary structure elements and suggesting a good model resolution of about 3-4 Å.
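The RMSD values quoted above compare backbone coordinates of a simulation frame with a reference structure. A minimal sketch of that quantity follows, assuming the frame has already been superimposed on the reference (the fitting step, normally done by the MD analysis software, is deliberately left out); the coordinates and the function name are hypothetical.

```python
import math


def rmsd_nm(reference, frame):
    """Root-mean-square deviation between two equally sized coordinate sets (nm).

    Both arguments are lists of (x, y, z) tuples for the same atoms in the
    same order. No superposition is performed here.
    """
    if len(reference) != len(frame) or not reference:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared_sum = 0.0
    for atom_ref, atom_frame in zip(reference, frame):
        squared_sum += sum((a - b) ** 2 for a, b in zip(atom_ref, atom_frame))
    return math.sqrt(squared_sum / len(reference))


# Hypothetical backbone atoms: every atom displaced by 0.3 nm along x.
reference_atoms = [(0.0, 0.0, 0.0), (0.38, 0.0, 0.0), (0.76, 0.0, 0.0)]
frame_atoms = [(x + 0.3, y, z) for x, y, z in reference_atoms]
print(f"RMSD: {rmsd_nm(reference_atoms, frame_atoms):.2f} nm")
```

Computing this value for every saved frame gives a curve of the kind referred to in Fig. 25.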
Example 31: Computational gene and protein analysis of F10
Computational gene and protein analysis of the F10 protein was performed according to example 6. The OMA browser (Schneider 2007) identified 19 different orthologs in mammals sharing 60-97% sequence identity with the human F10 protein. In almost all of them the three cysteine residues of the processed protein are conserved, and some have additional cysteines (Fig. 27).
Orthologues of the F10 protein comprise the following amino acid sequences: flying fox SEQ ID No. 23, dog SEQ ID No. 24, microbat SEQ ID No. 25, hyrax SEQ ID No. 26, tenrec SEQ ID No. 27, bushbaby SEQ ID No. 28, alpaca SEQ ID No. 29, dolphin SEQ ID No. 30, chimpanzee SEQ ID No. 31, orangutan SEQ ID No. 32, mouse lemur SEQ ID No. 33, tree shrew SEQ ID No. 34, horse SEQ ID No. 35, squirrel SEQ ID No. 36, rabbit SEQ ID No. 37.
The F10 protein sequence, also known as C4orf26, was first identified in the large-scale cDNA identification within the Mammalian Gene Collection (MGC) project (Gerhard 2004). F10 is 130 amino acids long and contains 5 cysteines. No transmembrane region was identified in the F10 protein sequence, but a signal peptide that is cleaved after residue 23 was found. WoLF PSORT predicts extracellular location with a score of 30. The ELM server was queried with the F10 protein sequence using the extracellular filter, and low complexity regions (LCR) are predicted from position 68 to 90. GlobPlot predicts many disordered regions from 24-74, 83-101 and 123-130. In addition, cleavage sites for different enzymes are predicted at positions 77-79 and two overlapping sites from 114-121.
The F10 gene (ENSG00000174792) is an alternatively spliced gene encoding isoforms with 2 and 3 exons (Fig. 29). The open reading frame is located on chromosome 4q21.1 (4:76,481,273-76,491,081 forward strand, Fig. 26), which is in very close proximity to the chemokine mini cluster with CXCL9, CXCL10 and CXCL11. Those chemokines (CXCL9, CXCL10 and CXCL11) all bind the same receptors: 1) CXCR3, as agonists with angiostatic function, which is expressed by TH1 cells and NK cells, and 2) CCR3, as antagonists/inhibitors.
A poly-A site was predicted at position +124 downstream in the human gene. Furthermore, the promoter region of F10 was analyzed for transcription factor binding sites and we identified several binding sites (Table 13) that are known from other chemokines, like NF-IL6, NF-kappaB, AP1 (c-Jun) and the CK1 motif (Chong 2000; Genin 2000; Gerber 2004; Roebuck 1999; Ueda 2007; Widmer 1993; Yeruva 2008).
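The positional figures given above (130 residues, signal peptide cleaved after residue 23, predicted disordered regions 24-74, 83-101 and 123-130) can be combined in a small calculation. The sketch below only restates arithmetic on the numbers from the text; the helper name and the idea of reporting a disordered fraction are illustrative additions.

```python
def residues_in_regions(regions):
    """Return the set of 1-based residue positions covered by inclusive regions."""
    covered = set()
    for start, end in regions:
        covered.update(range(start, end + 1))
    return covered


protein_length = 130       # full-length F10, from the text
signal_peptide_end = 23    # signal peptide is cleaved after this residue
disordered_regions = [(24, 74), (83, 101), (123, 130)]  # GlobPlot prediction, from the text

# Residues of the mature (processed) protein
mature_positions = set(range(signal_peptide_end + 1, protein_length + 1))
mature_length = len(mature_positions)

# Disordered residues that lie within the mature protein
disordered_in_mature = residues_in_regions(disordered_regions) & mature_positions
disordered_fraction = len(disordered_in_mature) / mature_length

print(f"mature length: {mature_length} residues")
print(f"predicted disordered: {len(disordered_in_mature)} residues "
      f"({disordered_fraction:.0%} of the mature protein)")
```

Using sets keeps the count correct even if predicted regions from different tools overlap.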
Table 13: Transcription factor binding sites found in human F10 promoter known from other chemokines
Example 32: Information available in public resources about in vivo expression of the F10 nucleic acid sequence
The GeneNote expression study (Shmueli 2003) found that F10 is widely expressed in all tissues tested namely bone marrow, brain, heart, kidney, liver, lung, pancreas, prostate, skeletal muscle, spinal cord, spleen, thymus and placenta (Fig. 28, http://www.genecards.org/cgi-bin/carddisp.pl?id=Q17RF5).
Example 33: In vivo expression of the F10 nucleic acid sequence in human tissues
Expression of the nucleic acid sequence according to SEQ ID No. 18 and 19 in human tissue was determined by analysis of pooled human cDNA by PCR analysis.
The mRNA expression of F10 was analyzed by PCR with templates from 8 different tissues (heart, brain, placenta, lung, liver, skeletal muscle, kidney, pancreas) using Human MTC Panel I (Clontech, Catalog No. 636742, LOT No. 8082935A). 43 PCR cycles were done at an annealing temperature of 58°C. The expected size of the PCR products of F10 (isoform A/Transcript 1) is 354 bp. The following primers were used for F10: forward primer: 5'-TGGTGGTAACTGTGGCAGAA-3' (SEQ ID No. 43) and reverse primer: 5'-TTCCCTCTCAGCTTTCCTCA-3' (SEQ ID No. 44).
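As a quick plausibility check of the primer pair given above, length, GC content and a rough melting temperature can be computed. The sketch uses the Wallace rule (2 °C per A/T, 4 °C per G/C), which is a common rule of thumb for short oligonucleotides and is an assumption here, not the method used to choose the 58°C annealing temperature.

```python
def primer_summary(sequence):
    """Length, GC content (%) and Wallace-rule Tm estimate for a DNA primer."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at != len(seq):
        raise ValueError("primer contains characters other than A, C, G, T")
    return {
        "length": len(seq),
        "gc_percent": 100.0 * gc / len(seq),
        "wallace_tm_c": 2 * at + 4 * gc,  # rough estimate only
    }


forward_primer = "TGGTGGTAACTGTGGCAGAA"  # SEQ ID No. 43
reverse_primer = "TTCCCTCTCAGCTTTCCTCA"  # SEQ ID No. 44

for name, primer in (("forward", forward_primer), ("reverse", reverse_primer)):
    print(name, primer_summary(primer))
```

Both primers are 20 nucleotides long with 50% GC content and a Wallace estimate of 60 °C, so the two are well matched under this rule of thumb.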
Expression of F10 was additionally tested in HeLa cells using the same methodology.
PCR expression studies and subsequent sequencing of the PCR products identified different isoforms (Fig. 29 top), of which only isoform A codes for the F10 protein; it is predominantly expressed in placenta and more weakly in the HeLa cell line. In the other tissues (heart, lung, liver, pancreas, skeletal muscle and brain), F10 (isoform A) was expressed only very weakly or not at all. Isoform B has an additional exon in the middle of the intron that shifts the reading frame, so that in the translated protein only the signal sequence would be similar to isoform A. However, isoform B was not found in placenta or HeLa (Fig. 29 bottom, Fig. 30). In placenta, 3 different isoforms have been found, but the F10 protein coding isoform A is the most abundant form, showing the strongest band (Fig. 30). In HeLa cells, which were used for the RNAi experiment, isoform A was also found, together with an additional isoform C that is most probably not protein coding, as it has several stop codons in the sequence. The expression in HeLa was investigated to confirm that the observed phenotype in the RNAi experiment was caused by silencing the F10 coding isoform A.
Example 34: RNAi knockdown experiments of F10 transcripts
A cell cycle arrest/cell division defect phenotype was observed after RNAi knockdown of both F10 transcripts. The RNAi knockdown experiments were performed as described in (Kittler 2007).
Results for F10 (c4orf26) in comparison with results of known chemokines are shown in table 14. In the RNAi screen, F10 (c4orf26) shows a cell division defect phenotype with cell cycle arrest in the G2M and 8N cell cycle phase similar to known angiostatic chemokines (CXCL10, 13, 14, underlined) suggesting a functional similarity of F10 to known angiostatic chemokines. Thus, the GO terms "cell cycle arrest" and "negative regulation of cell cycle" were chosen as additional GO filter to remove false positives in the pdb95 result list.
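The comparison above ranks chemokines by how well their z-score profiles correlate with that of F10. The text does not state which correlation measure was used; the sketch below assumes a Pearson correlation over the per-parameter z-scores, and the two example profiles are hypothetical numbers, not values from table 14.

```python
import math


def pearson(profile_a, profile_b):
    """Pearson correlation coefficient between two equally long z-score profiles."""
    if len(profile_a) != len(profile_b) or len(profile_a) < 2:
        raise ValueError("profiles must have the same length (at least 2 values)")
    mean_a = sum(profile_a) / len(profile_a)
    mean_b = sum(profile_b) / len(profile_b)
    covariance = sum((a - mean_a) * (b - mean_b) for a, b in zip(profile_a, profile_b))
    spread_a = math.sqrt(sum((a - mean_a) ** 2 for a in profile_a))
    spread_b = math.sqrt(sum((b - mean_b) ** 2 for b in profile_b))
    if spread_a == 0 or spread_b == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return covariance / (spread_a * spread_b)


# Hypothetical profiles in the order CN, G1, S, G2M, 8N (illustration only).
query_profile = [-1.0, -5.0, -1.5, 4.0, 12.0]
other_profile = [-0.5, -2.5, -0.8, 2.0, 6.5]
print(f"correlation: {pearson(query_profile, other_profile):.2f}")
```

A value close to 1 means that two knockdowns shift the cell cycle parameters in the same direction and in similar proportions, regardless of the absolute size of the effect.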
Expression studies in HeLa cells revealed that only F10 isoform A is expressed in those cells, suggesting that cell cycle regulation is not associated with the other isoforms B-E (Fig. 30).
Table 14: RNAi phenotype of F10 compared to phenotypes of known chemokines ordered by correlation (bold italics: Angiostatic chemokines; bold underlined: z-scores showing a phenotype)
Averaged normalized values (Z-scores) calculated from whole RNAi experiment
Gene names | AVnCN | AVnG1 | AVnS | AVnG2M | AVn8N | Phenotypic class | Correlation with C4orf26 | Receptors
C4orf26 | -1,10 | -5,23 | -1,5[illegible] | 3,93 | 12,36 | Cell division defect | 1,00 | Unknown
CCL16 | 0,82 | [illegible] | -0,99 | 1,10 | 5,86 | | 0,96 | CCR1, 3, 5, 8
CCL1 | -2,39 | -2,97 | -0,55 | 2,70 | 4,71 | | 0,95 | CCR8
CCL20 | 0,75 | -0,70 | -0,30 | 0,48 | 3 | | 0,94 | CCR6
CXCL14 | 0,[illegible]0 | -0,75 | -1,28 | 0,45 | 6,51 | | 0,93 | unknown
CXCL10 | -0,06 | -0,58 | -1,19 | 0,75 | 1,93 | | 0,92 | CXCR3, CCR3
CXCL13 | -0,63 | -0,41 | -0,86 | 0,09 | 3,52 | | 0,92 | CXCR5
CCL25 | -0,59 | -3,03 | 1,37 | 1,87 | 4,13 | | 0,90 | CCR9
Abbreviations used in table 14 are as follows: G1 ... 2N DNA content; S ... 2N...4N DNA content; G2M ... 4N DNA content; 8N ... 8N DNA content; CN ... cell number; AVn ... averaged normalised value of a parameter. The cell number for the DNA content analysis assay for calculating nCN was determined as the sum of cells with G1, S, G2M and 8N DNA content.
Example 35: In vitro expression and purification of the F10 protein in E. coli
The following method describes recombinant expression of F10 polypeptides in E. coli.
The gene encoding the mature F10 protein (without signal peptide) is synthesized and cloned into the expression plasmid pETMM-60. The expression and purification can be done as described in Magistrelli et al. (Magistrelli 2005), using a His-tagged NusA fusion partner to keep F10 soluble and Origami B cells (which allow disulfide bond formation in the cytoplasm because of mutations in the thioredoxin reductase and glutathione reductase genes).
The F10 protein is expressed in E. coli and is soluble as a fusion protein. The His-tagged fusion protein can be purified using a HisTrap HP column, collecting the HisTrap elution. Then, the construct is cleaved from NusA with TEV protease, leaving a GA-addition at the N-terminus of the F10 protein and the His-tag on the NusA. The F10 protein can be separated from the NusA and purified by running another HisTrap HP column and collecting the flow-through, as the His-tagged NusA binds to the column.
Example 36: Expression of F10 in Yeast
The following method describes recombinant expression of F10 polypeptides in yeast.
First, yeast expression vectors are constructed for intracellular production or secretion of F10 from the ADH2/GAPDH promoter. DNA encoding the F10 polypeptide and the promoter is inserted into suitable restriction enzyme sites in the selected plasmid to direct intracellular expression of the F10 polypeptide. For secretion, DNA encoding F10 can be cloned into the selected plasmid, together with DNA encoding the ADH2/GAPDH promoter, native F10 signal peptide or another mammalian signal peptide, or, for example, a yeast alpha-factor or invertase secretory signal/leader sequence, and linker sequences (if needed) for expression of F10.
Yeast cells, such as yeast strain AB110, can then be transformed with the expression plasmids described above and cultured in selected fermentation media. The transformed yeast supernatants can be analyzed by precipitation with 10% trichloroacetic acid and separation by SDS-PAGE, followed by staining of the gels with Coomassie Blue stain. Recombinant F10 polypeptides can subsequently be isolated and purified by removing the yeast cells from the fermentation medium by centrifugation and then concentrating the medium using selected cartridge filters. The concentrate containing the F10 polypeptide may further be purified using selected column chromatography resins.
Example 37: Expression of F10 in Baculovirus-Infected Insect Cells
The following method describes recombinant expression of F10 polypeptides in Baculovirus-infected insect cells.
The sequence coding for F10 is fused upstream of an epitope tag contained within a Baculovirus expression vector. Such epitope tags include poly-His tags and immunoglobulin tags (like Fc regions of IgG). A variety of plasmids may be employed, including plasmids derived from commercially available plasmids such as pVL1393 (Novagen). Briefly, the sequence encoding F10, or the desired portion of the coding sequence of F10 (such as the sequence encoding the extracellular domain of a transmembrane protein, or the sequence encoding the mature protein if the protein is extracellular), is amplified by PCR with primers complementary to the 5' and 3' regions. The 5' primer may incorporate flanking (selected) restriction enzyme sites. The product is then digested with those selected restriction enzymes and subcloned into the expression vector.
Recombinant baculovirus is generated by co-transfecting the above plasmid and BaculoGold(TM) virus DNA (Pharmingen) into Spodoptera frugiperda ("Sf9") cells (ATCC CRL 1711) using lipofectin (commercially available from GIBCO-BRL). After 4-5 days of incubation at 28°C, the released viruses are harvested and used for further amplifications. Viral infection and protein expression are performed as described by O'Reilly et al., Baculovirus expression vectors: A Laboratory Manual, Oxford: Oxford University Press (1994).
Expressed poly-His tagged F10 can then be purified, for example, by Ni2+-chelate affinity chromatography as follows. Extracts are prepared from recombinant virus-infected Sf9 cells as described by Rupert et al., Nature, 362: 175-179 (1993). Briefly, Sf9 cells are washed, resuspended in sonication buffer (25 mM Hepes, pH 7.9; 12.5 mM MgCl2; 0.1 mM EDTA; 10% glycerol; 0.1% NP-40; 0.4 M KCl), and sonicated twice for 20 seconds on ice. The sonicates are cleared by centrifugation, and the supernatant is diluted 50-fold in loading buffer (50 mM phosphate, 300 mM NaCl, 10% glycerol, pH 7.8) and filtered through a 0.45 µm filter. A Ni2+-NTA agarose column (commercially available from Qiagen) is prepared with a bed volume of 5 mL, washed with 25 mL of water and equilibrated with 25 mL of loading buffer. The filtered cell extract is loaded onto the column at 0.5 mL per minute. The column is washed to baseline A280 with loading buffer, at which point fraction collection is started. Next, the column is washed with a secondary wash buffer (50 mM phosphate; 300 mM NaCl, 10% glycerol, pH 6.0), which elutes nonspecifically bound protein. After reaching A280 baseline again, the column is developed with a 0 to 500 mM imidazole gradient in the secondary wash buffer. One mL fractions are collected and analyzed by SDS-PAGE and silver staining or Western blot with Ni2+-NTA conjugated to alkaline phosphatase (Qiagen). Fractions containing the eluted His10-tagged F10 are pooled and dialyzed against loading buffer.
Alternatively, purification of the IgG tagged (or Fc tagged) F10 can be performed using known chromatography techniques, including for instance, Protein A or Protein G column chromatography.
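The 50-fold dilution step above reduces the sonication buffer salts before the extract reaches the column. A minimal sketch of that arithmetic follows; it considers only the carry-over from the sonication buffer and ignores the components that the loading buffer itself contributes. The dictionary layout and function name are illustrative assumptions.

```python
def carry_over(concentrations_mm, dilution_fold):
    """Concentration (mM) of each component after an n-fold dilution."""
    if dilution_fold <= 0:
        raise ValueError("dilution factor must be positive")
    return {name: conc / dilution_fold for name, conc in concentrations_mm.items()}


# Salt components of the sonication buffer as given in the text, in mM
# (0.4 M KCl = 400 mM).
sonication_buffer_mm = {"MgCl2": 12.5, "EDTA": 0.1, "KCl": 400.0}

after_dilution = carry_over(sonication_buffer_mm, dilution_fold=50)
for name, conc in after_dilution.items():
    print(f"{name}: {conc:g} mM after 50-fold dilution")
```

The low residual EDTA concentration is relevant because chelators can strip nickel from the resin at higher concentrations.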
F10 polypeptides were successfully expressed as described above.
Example 38: Preparation of Antibodies that bind F10
This example illustrates preparation of monoclonal antibodies which can specifically bind F10.
Techniques for producing the monoclonal antibodies are known in the art and are described, for instance, in Goding, supra. Immunogens that may be employed include purified F10 polypeptides, fusion proteins containing F10 polypeptides, and cells expressing recombinant F10 polypeptides on the cell surface. Selection of the immunogen can be made by the skilled artisan without undue experimentation.
Mice, such as BALB/c, are immunized with the F10 immunogen emulsified in complete Freund's adjuvant and injected subcutaneously or intraperitoneally in an amount from 1-100 micrograms. Alternatively, the immunogen is emulsified in MPL-TDM adjuvant (Ribi Immunochemical Research, Hamilton, Mont.) and injected into the animal's hind foot pads. The immunized mice are then boosted 10 to 12 days later with additional immunogen emulsified in the selected adjuvant. Thereafter, for several weeks, the mice may also be boosted with additional immunization injections. Serum samples may be periodically obtained from the mice by retro-orbital bleeding for testing in ELISA assays to detect anti-F10 antibodies.
After a suitable antibody titer has been detected, the animals "positive" for antibodies can be injected with a final intravenous injection of F10. Three to four days later, the mice are sacrificed and the spleen cells are harvested. The spleen cells are then fused (using 35% polyethylene glycol) to a selected murine myeloma cell line such as P3X63AgU.1, available from ATCC, No. CRL 1597. The fusions generate hybridoma cells which can then be plated in 96 well tissue culture plates containing HAT (hypoxanthine, aminopterin, and thymidine) medium to inhibit proliferation of non-fused cells, myeloma hybrids, and spleen cell hybrids.
The hybridoma cells will be screened in an ELISA for reactivity against F10. Determination of "positive" hybridoma cells secreting the desired monoclonal antibodies against F10 is within the skill in the art.
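As a purely illustrative sketch (it is not part of the method described here), one common way to call ELISA wells "positive" is to compare each well's optical density (OD) with a cutoff derived from negative-control wells. The function name, the mean-plus-three-standard-deviations rule, and all readings below are assumptions introduced for illustration only.

```python
from statistics import mean, stdev


def positive_wells(od_by_well, negative_controls, n_sd=3.0):
    """Return (wells above cutoff, cutoff).

    The cutoff is mean + n_sd * sample SD of the negative-control ODs.
    This rule is an assumed heuristic, not one stated in the text.
    """
    cutoff = mean(negative_controls) + n_sd * stdev(negative_controls)
    hits = sorted(well for well, od in od_by_well.items() if od > cutoff)
    return hits, cutoff


# Hypothetical OD readings; real values depend on the assay and plate reader.
negatives = [0.08, 0.10, 0.09, 0.11]
readings = {"A1": 0.09, "A2": 1.35, "A3": 0.12, "B1": 0.85}

wells, cutoff = positive_wells(readings, negatives)
print(wells, round(cutoff, 3))
```

In practice the cutoff rule and the number of control wells would be chosen by the skilled artisan for the particular assay.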
The positive hybridoma cells can be injected intraperitoneally into syngeneic BALB/c mice to produce ascites containing the anti-F10 monoclonal antibodies. Alternatively, the hybridoma cells can be grown in tissue culture flasks or roller bottles. Purification of the monoclonal antibodies produced in the ascites can be accomplished using ammonium sulfate precipitation, followed by gel exclusion chromatography. Alternatively, affinity chromatography based upon binding of antibody to Protein A or Protein G can be employed.
Example 39: Purification of F10 Polypeptides Using Specific Antibodies
Native or recombinant F10 polypeptides may be purified by a variety of standard techniques in the art of protein purification. For example, pro-F10 polypeptide, mature F10 polypeptide, or pre-F10 polypeptide is purified by immunoaffinity chromatography using antibodies specific for the F10 polypeptide of interest. In general, an immunoaffinity column is constructed by covalently coupling the anti-F10 polypeptide antibody to an activated chromatographic resin.
Polyclonal immunoglobulins are prepared from immune sera either by precipitation with ammonium sulfate or by purification on immobilized Protein A (Pharmacia LKB Biotechnology, Piscataway, N.J.). Likewise, monoclonal antibodies are prepared from mouse ascites fluid by ammonium sulfate precipitation or chromatography on immobilized Protein A. Partially purified immunoglobulin is covalently attached to a chromatographic resin such as CNBr-activated SEPHAROSE(TM) (Pharmacia LKB Biotechnology). The antibody is coupled to the resin, the resin is blocked, and the derivative resin is washed according to the manufacturer's instructions.
Such an immunoaffinity column is utilized in the purification of F10 polypeptide by preparing a fraction from cells containing F10 polypeptide in a soluble form. This preparation is derived by solubilization of the whole cell or of a subcellular fraction obtained via differential centrifugation by the addition of detergent or by other methods well known in the art. Alternatively, soluble F10 polypeptide containing a signal sequence may be secreted in useful quantity into the medium in which the cells are grown.
A soluble F10 polypeptide-containing preparation is passed over the immunoaffinity column, and the column is washed under conditions that allow the preferential absorbance of F10 polypeptide (e.g., high ionic strength buffers in the presence of detergent). Then, the column is eluted under conditions that disrupt antibody/F10 polypeptide binding (e.g., a low pH buffer such as approximately pH 2-3, or a high concentration of a chaotrope such as urea or thiocyanate ion), and F10 polypeptide is collected.

Claims

1. An isolated protein having at least 80% amino acid sequence identity to an amino acid sequence of the protein shown in SEQ ID No. 1.
2. An isolated protein having at least 80% amino acid sequence identity to an amino acid sequence of the protein shown in SEQ ID No. 9.
3. Isolated protein according to claim 2 with a CC motif.
4. An isolated protein having at least 80% amino acid sequence identity to an amino acid sequence of the protein shown in SEQ ID No. 16, with cysteine residues at positions 20 and 47 and with at least one disulfide bridge formed between two cysteine residues.
5. Isolated protein of claim 4, with a proline residue at position 30.
6. Isolated protein according to one of claims 1 to 5 with an IL-8 like chemokine fold.
7. Isolated protein which is a circularly permuted variant of a protein according to one of claims 1 to 6.
8. Isolated protein that is a chimeric molecule comprising a protein according to one of claims 1 to 7 fused to a heterologous amino acid sequence.
9. Isolated protein of claim 8, wherein said heterologous amino acid sequence is an epitope tag sequence or an Fc region of an immunoglobulin.
10. An isolated nucleic acid having at least 80% nucleic acid sequence identity to a nucleotide sequence encoding a protein of one of claims 1 to 9.
11. A vector comprising a nucleic acid according to claim 10.
12. The vector according to claim 11 operably linked to control sequences recognized by a host cell transformed with the vector.
13. A host cell or host organism comprising a nucleic acid according to claim 10 or a vector of claim 11 or 12.
14. A process for producing a protein according to one of claims 1 to 9 comprising culturing the host cell of claim 13 under conditions suitable for the expression of said protein and recovering said protein from the cells or the culture supernatant.
15. An antibody which specifically binds to a protein according to one of claims 1 to 7.
16. The antibody of claim 15, wherein said antibody is a monoclonal antibody, a polyclonal antibody, a humanized antibody or a single chain antibody.
17. A composition of matter comprising at least one of the following: a) a protein according to one of claims 1 to 9 or a vector according to claim 11 or 12, b) an agonist of said protein, c) an antagonist of said protein, and/or d) an antibody that specifically binds to said protein, in combination with a carrier.
18. The composition of matter of claim 17, wherein said carrier is a pharmaceutically acceptable carrier.
19. The composition of matter of claim 17 or 18 for the treatment of an immune related disease in a mammal.
20. Use of a protein of one of claims 1 to 7 as a pharmaceutical.
21. A method of treating an immune related disorder in a mammal in need thereof comprising administering to said mammal a therapeutically effective amount of a) a protein according to one of claims 1 to 9 or a vector according to claim 11 or 12, b) an agonist of said protein, c) an antagonist of said protein, or d) an antibody that specifically binds to said protein.
22. The method of claim 21, wherein the immune related disorder is systemic lupus erythematosus, rheumatoid arthritis, osteoarthritis, juvenile chronic arthritis, a spondyloarthropathy, systemic sclerosis, an idiopathic inflammatory myopathy, Sjogren's syndrome, systemic vasculitis, sarcoidosis, autoimmune hemolytic anemia, autoimmune thrombocytopenia, thyroiditis, diabetes mellitus type I, immune-mediated renal disease, a demyelinating disease of the central or peripheral nervous system, idiopathic demyelinating polyneuropathy, Guillain-Barre syndrome, a chronic inflammatory demyelinating polyneuropathy, a hepatobiliary disease, infectious or autoimmune chronic active hepatitis, primary biliary cirrhosis, granulomatous hepatitis, sclerosing cholangitis, inflammatory bowel disease, gluten-sensitive enteropathy, Whipple's disease, an autoimmune or immune-mediated skin disease, a bullous skin disease, erythema multiforme, contact dermatitis, psoriasis, an allergic disease, asthma, allergic rhinitis, atopic dermatitis, food hypersensitivity, urticaria, an immunologic disease of the ovaries, an immunologic disease of the lung, eosinophilic pneumonia, idiopathic pulmonary fibrosis, hypersensitivity pneumonitis, a transplantation associated disease, graft rejection, graft-versus-host-disease or AIDS.
23. A method of diagnosing an immune related disease in a mammal, said method comprising detecting the level of expression of a gene encoding the protein according to one of claims 1 to 6 a) in a test sample of tissue cells obtained from the mammal, and b) in a control sample of known normal tissue cells of the same cell type, wherein a higher or lower level of expression of said gene in the test sample as compared to the control sample is indicative of the presence of an immune related disease in the mammal from which the test tissue cells were obtained.
24. A method of diagnosing an immune related disease in a mammal, said method comprising a) contacting an antibody of claim 15 or 16 with a test sample of tissue cells obtained from said mammal and b) detecting the formation of a complex between said antibody and a naturally occurring polypeptide with an amino acid sequence according to SEQ ID No. 1, SEQ ID No. 9 or SEQ ID No. 16 in the test sample, wherein formation of said complex is indicative of the presence of an immune related disease in the mammal from which the test tissue cells were obtained.
25. A method of identifying a compound that modulates the activity of a protein according to one of claims 1 to 7, said method comprising contacting cells which normally respond to said protein with a) said protein and b) a candidate compound, and determining the lack of responsiveness by said cell to a).
26. A method of identifying a compound that inhibits the expression of a gene encoding the protein according to one of claims 1 to 7 and/or the level of expression of a nucleic acid according to claim 10, said method comprising contacting cells which normally express said protein with a candidate compound, and determining the lack of expression of said gene.
27. The method of claim 26, wherein said candidate compound is an antisense nucleic acid.
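Several of the claims above recite a threshold of at least 80% sequence identity. The following is a minimal sketch of how a percent identity could be computed from two sequences that have already been aligned. The function name, the choice of denominator (all alignment columns except those where both sequences have a gap), and the example sequences are assumptions made for illustration. They are not taken from the sequence listing, and other conventions for the denominator exist.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumed convention: identical residues divided by the number of
    alignment columns, skipping columns where both sequences have a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # column carries no information
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / columns


# Hypothetical aligned fragments, not sequences from this document.
seq_a = "MKVLA-TCCL"
seq_b = "MKILAGTC-L"

identity = percent_identity(seq_a, seq_b)
print(identity, identity >= 80.0)
```

Here 7 of 10 columns match, so the pair falls below the 80% threshold under this assumed convention.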
EP11705893A 2010-03-02 2011-03-01 Novel protein with structural homology to proteins with il-8-like chemokine fold and uses thereof Withdrawn EP2542573A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US30944710P 2010-03-02 2010-03-02
US30944010P 2010-03-02 2010-03-02
US30945210P 2010-03-02 2010-03-02
PCT/EP2011/052984 WO2011107456A1 (en) 2010-03-02 2011-03-01 Novel protein with structural homology to proteins with il-8-like chemokine fold and uses thereof

Publications (1)

Publication Number Publication Date
EP2542573A1 (en) 2013-01-09

Family

ID=43920772

Family Applications (1)

Application Number Title Priority Date Filing Date
EP11705893A Withdrawn EP2542573A1 (en) 2010-03-02 2011-03-01 Novel protein with structural homology to proteins with il-8-like chemokine fold and uses thereof

Country Status (2)

Country Link
EP (1) EP2542573A1 (en)
WO (1) WO2011107456A1 (en)

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2107127A1 (en) 2008-03-31 2009-10-07 Université Joseph Fourier In vitro diagnostic method for the diagnosis of somatic and ovarian cancers

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO2011107456A1 *

Also Published As

Publication number Publication date
WO2011107456A1 (en) 2011-09-09

Similar Documents

Publication Publication Date Title
Hemmerich et al. Identification of residues in the monocyte chemotactic protein-1 that contact the MCP-1 receptor, CCR2
Shi et al. A novel cytokine receptor-ligand pair: identification, molecular characterization, and in vivo immunomodulatory activity
EP1525213B1 (en) Three-dimensional structures of tall-1 and its cognate receptors and modified proteins and methods related thereto
JP4980721B2 (en) GAG binding protein
AU2002213053B2 (en) Nectin polypeptides, polynucleotides, methods of making and use thereof
Mayer et al. NMR solution structure and receptor peptide binding of the CC chemokine eotaxin-2
Ziarek et al. Sulfopeptide probes of the CXCR4/CXCL12 interface reveal oligomer-specific contacts and chemokine allostery
Scully et al. Selective hexapeptide agonists and antagonists for human complement C3a receptor
EP3122767B1 (en) Water-soluble trans-membrane proteins and methods for the preparation and use thereof
JPH11502420A (en) Mammalian chemokines CCF8 and chemokine receptor CCKR3
JPH11503610A (en) A novel chemokine expressed in eosinophils
CN102712686B (en) Biomaterial and application thereof
Joshi et al. Elucidating the molecular interactions of chemokine CCL2 orthologs with flavonoid baicalin
Cao et al. The receptor CgIL-17R1 expressed in granulocytes mediates the CgIL-17 induced haemocytes proliferation in Crassostrea gigas
JP2006505243A (en) Novel antagonist of MCP protein
JP2000510690A (en) Mammalian mixed lymphocyte receptor, chemokine receptor (MMLR-CCR)
Costantini et al. Peptides targeting chemokine receptor CXCR4: Structural behavior and biological binding studies
JPH11243960A (en) Human chemokine cc eotaxin 3
JPH10512154A (en) Novel chemokine expressed in pancreas
WO2011107456A1 (en) Novel protein with structural homology to proteins with il-8-like chemokine fold and uses thereof
JP2011501675A (en) Glycosaminoglycan antagonist based on SDF-1 and method of use thereof
EP1284289A1 (en) Method of examining allergic disease
Takahashi et al. Cloning and characterization of guinea pig CXCR1
US20050124043A1 (en) Chemokine-like factors (CKLFs) with chemotactic and hematopoietic stimulating activities
JPH11501817A (en) Hyaluronan receptor expressed in human umbilical vein endothelial cells

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20120928

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

DAX Request for extension of the european patent (deleted)
17Q First examination report despatched

Effective date: 20140923

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20160429