AU779095B2 - Therapeutic and diagnostic agents capable of modulating cellular responsiveness to cytokines - Google Patents

Therapeutic and diagnostic agents capable of modulating cellular responsiveness to cytokines

Info

Publication number
AU779095B2
Authority
AU
Australia
Prior art keywords
amino acid
proviso
sequence
socs
amino acids
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU87234/01A
Other versions
AU8723401A (en)
Inventor
Warren S. Alexander
Douglas J Hilton
Donald Metcalf
Sandra E Nicholson
Rachael T Richardson
Robyn Starr
Elizabeth M Viney
Tracy A. Wilson
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Walter and Eliza Hall Institute of Medical Research
Original Assignee
Walter and Eliza Hall Institute of Medical Research
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AU46943/97A (external priority; patent AU735735B2)
Application filed by Walter and Eliza Hall Institute of Medical Research
Priority to AU87234/01A (patent AU779095B2)
Publication of AU8723401A
Application granted
Publication of AU779095B2
Anticipated expiration
Ceased (current legal status)


Description

Regulation 3.2
AUSTRALIA
Patents Act 1990
DIVISIONAL APPLICATION
Name of Applicant: The Walter and Eliza Hall Institute of Medical Research
Actual Inventor(s): HILTON, Douglas; ALEXANDER, Warren; VINEY, Elizabeth; WILSON, Tracy; RICHARDSON, Rachael; STARR, Robyn; NICHOLSON, Sandra; and METCALF, Donald
Address for Service: DAVIES COLLISON CAVE, Patent Attorneys, Level 3, 303 Coronation Drive, Milton, Queensland, 4064, Australia
Invention Title: "Therapeutic and diagnostic agents capable of modulating cellular responsiveness to cytokines"
Details of Parent Application No: Australian Application No. 46943/97
The following statement is a full description of this invention, including the best method of performing it known to me/us:
Q \Opcr\Vpa\OctobrT\2001\2472 7 45 div WEH1.305 doc 1/11/01
P:\OPERNVPA\VPACOM-I\SOCSDI-I.WPD- 1/11/01
THERAPEUTIC AND DIAGNOSTIC AGENTS CAPABLE OF MODULATING CELLULAR RESPONSIVENESS TO CYTOKINES
FIELD OF THE INVENTION
The present invention relates generally to therapeutic and diagnostic agents. More particularly, the present invention provides therapeutic molecules capable of modulating signal transduction such as but not limited to cytokine-mediated signal transduction. The molecules of the present invention are useful, therefore, in modulating cellular responsiveness to cytokines as well as other mediators of signal transduction such as endogenous or exogenous molecules, antigens, microbes and microbial products, viruses or components thereof, ions, hormones and parasites.
The reference to any prior art in this specification is not, and should not be taken as, an acknowledgement or any form of suggestion that the prior art forms part of the common general knowledge in Australia.
Bibliographic details of the publications referred to in this specification by author are collected at the end of the description. Sequence Identity Numbers (SEQ ID NOs.) for the nucleotide and amino acid sequences referred to in the specification are defined after the bibliography. A summary of the SEQ ID NOs is given in Table 1.
Throughout this specification and the claims which follow, unless the context requires otherwise, the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated integer or group of integers but not the exclusion of any other integer or group of integers.
BACKGROUND OF THE INVENTION
Cells continually monitor their environment in order to modulate physiological and biochemical processes which in turn affect future behaviour. Frequently, a cell's initial interaction with its surroundings occurs via receptors expressed on the plasma membrane. Activation of these receptors, whether through binding endogenous ligands (such as cytokines) or exogenous ligands (such as antigens), triggers a biochemical cascade from the membrane through the cytoplasm to the nucleus.
Of the endogenous ligands, cytokines represent a particularly important and versatile group.
Cytokines are proteins which regulate the survival, proliferation, differentiation and function of a variety of cells within the body [Nicola, 1994]. The haemopoietic cytokines have in common a four-alpha helical bundle structure and the vast majority interact with a structurally related family of cell surface receptors, the type I and type II cytokine receptors [Bazan, 1990; Sprang, 1993]. In all cases, ligand-induced receptor aggregation appears to be a critical event in initiating intracellular signal transduction cascades. Some cytokines, for example growth hormone, erythropoietin (Epo) and granulocyte-colony-stimulating factor (G-CSF), trigger receptor homodimerisation, while for other cytokines, receptor heterodimerisation or heterotrimerisation is crucial. In the latter cases, several cytokines share common receptor subunits and on this basis can be grouped into three subfamilies with similar patterns of intracellular activation and similar biological effects [Hilton, 1994]. Interleukin-3 and granulocyte-macrophage colony-stimulating factor (GM-CSF) use the common β-receptor subunit (βc) and each cytokine stimulates the production and functional activity of granulocytes and macrophages. IL-2, IL-4, IL-7, IL-9, and IL-15 each use the common γ-chain while IL-4 and IL-13 share an alternative γ-chain (γ'c or IL-13 receptor α-chain). Each of these cytokines plays an important role in regulating acquired immunity in the lymphoid system.
Finally, IL-6, IL-11, leukaemia inhibitory factor (LIF), oncostatin-M (OSM), ciliary neurotrophic factor (CNTF) and cardiotrophin (CT) share the receptor subunit gp130. Each of these cytokines appears to be highly pleiotropic, having effects both within and outside the haemopoietic system [Nicola, 1994].
In all of the above cases at least one subunit of each receptor complex contains the conserved sequence elements, termed box1 and box2, in its cytoplasmic tail [Murakami, 1991]. Box1 is a proline-rich motif which is located more proximal to the transmembrane domain than the acidic box2 element. The box1 region serves as the binding site for a class of cytoplasmic tyrosine kinases termed JAKs (Janus kinases). Ligand-induced receptor dimerisation serves to increase the catalytic activity of the associated JAKs through cross-phosphorylation. Activated JAKs then tyrosine phosphorylate several substrates, including the receptors themselves.
Specific phosphotyrosine residues on the receptor then serve as docking sites for SH2-containing proteins, the best characterised of which are the signal transducers and activators of transcription (STATs) and the adaptor protein, shc. The STATs are then phosphorylated on tyrosines, probably by JAKs, dissociate from the receptor and form either homodimers or heterodimers through the interaction of the SH2 domain of one STAT with the phosphotyrosine residue of the other. STAT dimers then translocate to the nucleus where they bind to specific cytokine-responsive promoters and activate transcription [Darnell, 1994; Ihle, 1995; Ihle, 1995].
In a separate pathway, tyrosine phosphorylated shc interacts with another SH2 domain-containing protein, Grb-2, leading ultimately to activation of members of the MAP kinase family and in turn transcription factors such as fos and jun [Sato, 1993; Cutler, 1993]. These pathways are not unique to members of the cytokine receptor family, since cytokines that bind receptor tyrosine kinases are also able to activate STATs and members of the MAP kinase family [David, 1996; Leaman, 1996; Shual, 1993; Sato, 1993; Cutler, 1993].
Four members of the JAK family of cytoplasmic tyrosine kinases have been described, JAK1, JAK2, JAK3 and TYK2, each of which binds to a specific subset of cytokine receptor subunits.
Six STATs have been described (STAT1 through STAT6), and these too are activated by distinct cytokine/receptor complexes. For example, STAT1 appears to be functionally specific to the interferon system, STAT4 appears to be specific to IL-12, while STAT6 appears to be specific for IL-4 and IL-13. Thus, despite common activation mechanisms some degree of cytokine specificity may be achieved through the use of specific JAKs and STATs [Thierfelder, 1996; Kaplan, 1996; Takeda, 1996; Shimoda, 1996; Meraz, 1996; Durbin, 1996].
In addition to those described above, there are clearly other mechanisms of activation of these pathways. For example, the JAK/STAT pathway appears to be able to activate MAP kinases independent of the shc-induced pathway [David, 1995] and the STATs themselves can be activated without binding to the receptor, possibly by direct interaction with JAKs [Gupta, 1996]. Conversely, full activation of STATs may require the action of MAP kinase in addition to that of JAKs [David, 1995; Wen, 1995].
While the activation of these signalling pathways is becoming better understood, little is known of the regulation of these pathways, including employment of negative or positive feedback loops. This is important since once a cell has begun to respond to a stimulus, it is critical that the intensity and duration of the response is regulated and that signal transduction is switched off. It is likewise desirable to increase the intensity of a response systemically or even locally as the situation requires.
In work leading up to the present invention, the inventors sought to isolate negative regulators of signal transduction. The inventors have now identified a new family of proteins which are capable of acting as regulators of signalling. The new family of proteins is defined as the suppressor of cytokine signalling (SOCS) family based on the ability of the initially identified SOCS molecules to suppress cytokine-mediated signalling. It should be noted, however, that not all members of the SOCS family need necessarily share suppressor function nor target solely cytokine mediated signalling. The SOCS family comprises at least three classes of protein molecules based on amino acid sequence motifs located N-terminal of a C-terminal motif called the SOCS box. The identification of this new family of regulatory molecules permits the generation of a range of effector or modulator molecules capable of modulating signal transduction and, hence, cellular responsiveness to a range of molecules including cytokines.
The present invention, therefore, provides therapeutic and diagnostic agents based on SOCS proteins, derivatives, homologues, analogues and mimetics thereof as well as agonists and antagonists of SOCS proteins.
SUMMARY OF THE INVENTION
The present invention provides inter alia nucleic acid molecules encoding members of the SOCS family of proteins as well as the proteins themselves. Reference hereinafter to "SOCS" encompasses any or all members of the SOCS family. Specific SOCS molecules are defined numerically such as, for example, SOCS1, SOCS2 and SOCS3. The species from which the SOCS has been obtained may be indicated by a preface of a single letter abbreviation where "h" is human, "m" is murine and "r" is rat. Accordingly, "mSOCS1" is a specific SOCS from a murine animal. Reference herein to "SOCS" is not to imply that the protein solely suppresses cytokine-mediated signal transduction, as the molecule may modulate other effector-mediated signal transductions such as by hormones or other endogenous or exogenous molecules, antigens, microbes and microbial products, viruses or components thereof, ions, hormones and parasites. The term "modulates" encompasses up-regulation, down-regulation as well as maintenance of particular levels.
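The naming convention described above (an optional single-letter species preface followed by "SOCS" and a member number) can be illustrated with a minimal Python sketch. The function name `parse_socs_name` and the lookup table are hypothetical and are not part of the specification; the mapping of "r" to rat is taken from names such as rSOCS1 used later in this document.

```python
import re

# Species prefaces used in this specification (h = human, m = murine, r = rat).
SPECIES = {"h": "human", "m": "murine", "r": "rat"}

# Optional one-letter species preface, the literal "SOCS", then the member number.
_NAME_RE = re.compile(r"^([hmr])?SOCS(\d+)$")


def parse_socs_name(name: str):
    """Split a name such as 'mSOCS1' into (species, member number).

    The species is None when no preface is given (for example 'SOCS3').
    Raises ValueError for strings that do not follow the convention.
    """
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a numbered SOCS name: {name!r}")
    prefix, number = match.groups()
    return SPECIES.get(prefix), int(number)
```

Under these assumptions, `parse_socs_name("mSOCS1")` yields the murine species label and member number 1, while the bare family name "SOCS" (no number) is rejected because it refers to any or all members.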
One aspect of the present invention provides a nucleic acid molecule comprising a sequence of nucleotides encoding or complementary to a sequence encoding a protein or a derivative, homologue, analogue or mimetic thereof or a nucleotide sequence capable of hybridizing thereto under low stringency conditions at 42°C wherein said protein comprises a SOCS box in its C-terminal region.
Another aspect of the present invention provides a nucleic acid molecule comprising a sequence of nucleotides encoding or complementary to a sequence encoding a protein or a derivative, homologue, analogue or mimetic thereof or a nucleotide sequence capable of hybridizing thereto under low stringency conditions at 42°C wherein said protein comprises a SOCS box in its C-terminal region and a protein:molecule interacting region.
Yet another aspect of the present invention is directed to a nucleic acid molecule comprising a sequence of nucleotides encoding or complementary to a sequence encoding a protein or a derivative, homologue, analogue or mimetic thereof or a nucleotide sequence capable of hybridizing thereto under low stringency conditions at 42°C wherein said protein comprises a C-terminal region and a protein:molecule interacting region located in a region N-terminal of the SOCS box.
Preferably, the protein:molecule interacting region is a protein:DNA or protein:protein binding region.
Still a further aspect of the present invention provides a nucleic acid molecule comprising a sequence of nucleotides encoding or complementary to a sequence encoding a protein or a derivative, homologue, analogue or mimetic thereof or a nucleotide sequence capable of hybridizing thereto under low stringency conditions at 42°C wherein said protein comprises a SOCS box in its C-terminal region and one or more of an SH2 domain, WD-40 repeats or ankyrin repeats N-terminal of the SOCS box.
Even still a further aspect of the present invention is directed to a nucleic acid molecule comprising a sequence of nucleotides encoding or complementary to a sequence encoding a protein or a derivative, homologue, analogue or mimetic thereof or a nucleotide sequence capable of hybridizing thereto under low stringency conditions at 42°C wherein said protein comprises a SOCS box in its C-terminal region wherein the SOCS box comprises the amino acid sequence:
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 [Xi]n X17 X18 X19 X20 X21 X22 X23 [Xj]n X24 X25 X26 X27 X28
wherein:
X1 is L, I, V, M, A or P;
X2 is any amino acid residue;
X3 is P, T or S;
X4 is L, I, V, M, A or P;
X5 is any amino acid;
X6 is any amino acid;
X7 is L, I, V, M, A, F, Y or W;
X8 is C, T or S;
X9 is R, K or H;
X10 is any amino acid;
X11 is any amino acid;
X12 is L, I, V, M, A or P;
X13 is any amino acid;
X14 is any amino acid;
X15 is any amino acid;
X16 is L, I, V, M, A, P, G, C, T or S;
[Xi]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids and wherein the sequence Xi may comprise the same or different amino acids selected from any amino acid residue;
X17 is L, I, V, M, A or P;
X18 is any amino acid;
X19 is any amino acid;
X20 is L, I, V, M, A or P;
X21 is P;
X22 is L, I, V, M, A, P or G;
X23 is P or N;
[Xj]n is a sequence of n amino acids wherein n is from 0 to 50 amino acids and wherein the sequence Xj may comprise the same or different amino acids selected from any amino acid residue;
X24 is L, I, V, M, A or P;
X25 is any amino acid;
X26 is any amino acid;
X27 is Y or F;
X28 is L, I, V, M, A or P;
and a protein:molecule interacting region such as but not limited to one or more of an SH2 domain, WD-40 repeats and/or ankyrin repeats N-terminal of the SOCS box.
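As an illustration only, the SOCS box consensus defined above (positions X1 to X28 with the variable spacers [Xi]n and [Xj]n) can be written as a regular expression and used to scan a protein sequence. The following Python sketch is not part of the specification: the names `SOCS_BOX_RE` and `find_socs_box` are hypothetical, and "any amino acid" is assumed here to mean the 20 standard residues in one-letter code.

```python
import re

# Assumption: "any amino acid" is limited to the 20 standard residues.
AA = "[ACDEFGHIKLMNPQRSTVWY]"
HYD = "[LIVMAP]"  # the recurring "L, I, V, M, A or P" class

SOCS_BOX_PATTERN = "".join([
    HYD,             # X1
    AA,              # X2
    "[PTS]",         # X3
    HYD,             # X4
    AA, AA,          # X5, X6
    "[LIVMAFYW]",    # X7
    "[CTS]",         # X8
    "[RKH]",         # X9
    AA, AA,          # X10, X11
    HYD,             # X12
    AA, AA, AA,      # X13, X14, X15
    "[LIVMAPGCTS]",  # X16
    AA + "{1,50}",   # [Xi]n, n from 1 to 50
    HYD,             # X17
    AA, AA,          # X18, X19
    HYD,             # X20
    "P",             # X21
    "[LIVMAPG]",     # X22
    "[PN]",          # X23
    AA + "{0,50}",   # [Xj]n, n from 0 to 50
    HYD,             # X24
    AA, AA,          # X25, X26
    "[YF]",          # X27
    HYD,             # X28
])
SOCS_BOX_RE = re.compile(SOCS_BOX_PATTERN)


def find_socs_box(protein: str):
    """Return the (start, end) span of the first consensus match, or None."""
    match = SOCS_BOX_RE.search(protein.upper())
    return match.span() if match else None
```

A match only shows that a sequence fits the stated consensus; it does not by itself establish that the region lies in the C-terminal part of the protein or that the protein has SOCS activity.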
Another aspect of the present invention is directed to a nucleic acid molecule comprising a sequence of nucleotides encoding or complementary to a sequence encoding a protein or a derivative, homologue, analogue or mimetic thereof or a nucleotide sequence capable of hybridizing thereto under low stringency conditions at 42°C wherein said protein exhibits the following characteristics:
(i) comprises a SOCS box in its C-terminal region having the amino acid sequence:
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 [Xi]n X17 X18 X19 X20 X21 X22 X23 [Xj]n X24 X25 X26 X27 X28
wherein:
X1 is L, I, V, M, A or P;
X2 is any amino acid residue;
X3 is P, T or S;
X4 is L, I, V, M, A or P;
X5 is any amino acid;
X6 is any amino acid;
X7 is L, I, V, M, A, F, Y or W;
X8 is C, T or S;
X9 is R, K or H;
X10 is any amino acid;
X11 is any amino acid;
X12 is L, I, V, M, A or P;
X13 is any amino acid;
X14 is any amino acid;
X15 is any amino acid;
X16 is L, I, V, M, A, P, G, C, T or S;
[Xi]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids and wherein the sequence Xi may comprise the same or different amino acids selected from any amino acid residue;
X17 is L, I, V, M, A or P;
X18 is any amino acid;
X19 is any amino acid;
X20 is L, I, V, M, A or P;
X21 is P;
X22 is L, I, V, M, A, P or G;
X23 is P or N;
[Xj]n is a sequence of n amino acids wherein n is from 0 to 50 amino acids and wherein the sequence Xj may comprise the same or different amino acids selected from any amino acid residue;
X24 is L, I, V, M, A or P;
X25 is any amino acid;
X26 is any amino acid;
X27 is Y or F;
X28 is L, I, V, M, A or P; and
(ii) comprises at least one of an SH2 domain, WD-40 repeats and/or ankyrin repeats or other protein:molecule interacting domain in a region N-terminal of the SOCS box.
Preferably, the SOCS molecules modulate signal transduction such as from a cytokine or hormone or other endogenous or exogenous molecule, a microbe or microbial product, an antigen or a parasite.
More preferably, the SOCS molecule modulates cytokine mediated signal transduction.
Still another aspect of the present invention comprises a nucleic acid molecule comprising a sequence of nucleotides encoding or complementary to a sequence encoding a protein or a derivative, homologue, analogue or mimetic thereof or comprises a nucleotide sequence capable of hybridizing thereto under low stringency conditions at 42°C wherein said protein exhibits the following characteristics:
(i) is capable of modulating signal transduction;
(ii) comprises a SOCS box in its C-terminal region having the amino acid sequence:
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 [Xi]n X17 X18 X19 X20 X21 X22 X23 [Xj]n X24 X25 X26 X27 X28
wherein:
X1 is L, I, V, M, A or P;
X2 is any amino acid residue;
X3 is P, T or S;
X4 is L, I, V, M, A or P;
X5 is any amino acid;
X6 is any amino acid;
X7 is L, I, V, M, A, F, Y or W;
X8 is C, T or S;
X9 is R, K or H;
X10 is any amino acid;
X11 is any amino acid;
X12 is L, I, V, M, A or P;
X13 is any amino acid;
X14 is any amino acid;
X15 is any amino acid;
X16 is L, I, V, M, A, P, G, C, T or S;
[Xi]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids and wherein the sequence Xi may comprise the same or different amino acids selected from any amino acid residue;
X17 is L, I, V, M, A or P;
X18 is any amino acid;
X19 is any amino acid;
X20 is L, I, V, M, A or P;
X21 is P;
X22 is L, I, V, M, A, P or G;
X23 is P or N;
[Xj]n is a sequence of n amino acids wherein n is from 0 to 50 amino acids and wherein the sequence Xj may comprise the same or different amino acids selected from any amino acid residue;
X24 is L, I, V, M, A or P;
X25 is any amino acid;
X26 is any amino acid;
X27 is Y or F;
X28 is L, I, V, M, A or P; and
(iii) comprises at least one of an SH2 domain, WD-40 repeats and/or ankyrin repeats or other protein:molecule interacting domain in a region N-terminal of the SOCS box.
Preferably, the signal transduction is mediated by a cytokine such as one or more of EPO, TPO, G-CSF, GM-CSF, IL-3, IL-2, IL-4, IL-7, IL-13, IL-6, LIF, IL-12, IFNa, TNFα, IL-1 and/or M-CSF.
Preferably, the signal transduction is mediated by one or more of Interleukin 6 (IL-6), Leukaemia Inhibitory Factor (LIF), Oncostatin M (OSM), Interferon (IFN)-a and/or thrombopoietin.
Preferably, the signal transduction is mediated by IL-6.
Particularly preferred nucleic acid molecules comprise nucleotide sequences substantially set forth in SEQ ID NO:3 (mSOCS1), SEQ ID NO:5 (mSOCS2), SEQ ID NO:7 (mSOCS3), SEQ ID NO:9 (hSOCS1), SEQ ID NO:11 (rSOCS1), SEQ ID NO:13 (mSOCS4), SEQ ID NO:15 and SEQ ID NO:16 (hSOCS4), SEQ ID NO:17 (mSOCS5), SEQ ID NO:19 (hSOCS5), SEQ ID NO:20 (mSOCS6), SEQ ID NO:22 and SEQ ID NO:23 (hSOCS6), SEQ ID NO:24 (mSOCS7), SEQ ID NO:26 and SEQ ID NO:27 (hSOCS7), SEQ ID NO:28 (mSOCS8), SEQ ID (mSOCS9), SEQ ID NO:31 (hSOCS9), SEQ ID NO:32 (mSOCS10), SEQ ID NO:33 and SEQ ID NO:34 (hSOCS10), SEQ ID NO:35 (hSOCS SEQ ID NO:37 (mSOCS12), SEQ ID NO:38 and SEQ ID NO:39 (hSOCS12), SEQ ID NO:40 (mSOCS13), SEQ ID NO:42 (hSOCS13), SEQ ID NO:43 (mSOCS14), SEQ ID NO:45 (mSOCS15) and SEQ ID NO:47 (hSOCS15) or a nucleotide sequence having at least about 15% similarity to all or a region of any of the listed sequences or a nucleic acid molecule capable of hybridizing to any one of the listed sequences under low stringency conditions at 42°C.
Yet a further aspect of the present invention provides an isolated nucleic acid molecule comprising a sequence of nucleotides encoding or complementary to a sequence encoding an SH2 domain of a protein or a derivative, homologue, analogue or mimetic thereof or a nucleotide sequence capable of hybridizing thereto under low stringency conditions at 42°C wherein said protein comprises a SOCS box in its C-terminal region.
Still yet a further aspect of the present invention contemplates an isolated nucleic acid molecule comprising a sequence of nucleotides encoding or complementary to a sequence encoding an SH2 domain of a protein or a derivative, homologue, analogue or mimetic thereof or a nucleotide sequence capable of hybridizing thereto under low stringency conditions at 42°C wherein said protein comprises a SOCS box in its C-terminal region and wherein the SH2 domain comprises the amino acid sequence:
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31 X32 X33 X34 [Xp]n X35 X36 X37 X38 X39 X40 X41 X42 X43 X44 X45 X46 X47 X48 X49 X50 X51 X52 X53 X54 X55 X56 X57 X58 [Xq]n X59 X60 X61 [Xr]n X62 X63 X64 X65 X66
wherein:
X1 is G or P;
X2 is F, W or C;
X3 is Y;
X4 is W;
X5 is G or S;
X6 is any amino acid, preferably P, A, S or V;
X7 is L, M, V or I;
X8 is any amino acid, preferably S, T, D or N;
X9 is any amino acid, preferably V, G, A, R, K or W;
X10 is any amino acid, preferably H, G, N, S, Y, W or E;
X11 is any amino acid, preferably G, E, A or D;
X12 is A;
X13 is any amino acid, preferably H, N, K, R or E;
X14 is any amino acid, preferably E, L, Q, A, G or M;
X15 is any amino acid, preferably R, L, K or H;
X16 is L;
X17 is any amino acid, preferably R, S, K, Q, E or A;
X18 is any amino acid, preferably A, E, K, G, N or S;
X19 is any amino acid, preferably E, A, M, K or V;
X20 is P;
X21 is any amino acid, preferably V, A, E or D;
X22 is G;
X23 is T or S;
X24 is F;
X25 is L;
X26 is V, I or L;
X27 is R;
X28 is D;
X29 is S;
X30 is any amino acid, preferably R, S, T or A;
X31 is any amino acid, preferably Q, D or H;
X32 is any amino acid, preferably R, Q, S, P, E or D;
X33 is any amino acid, preferably N, R, D or S;
X34 is any amino acid, preferably C, H or Y;
[Xp]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xp may comprise the same or different amino acids selected from F, L or I;
X35 is any amino acid, preferably A, T or S;
X36 is L, I, V or C;
X37 is S or D;
X38 is V or F;
X39 is K or R;
X40 is any amino acid, preferably M, T, R or S;
X41 is any amino acid, preferably A, Q, S, T, Y or H;
X42 is any amino acid, preferably S, A, R, N or G;
X43 is any amino acid, preferably G, R, K or I;
X44 is any amino acid, preferably P, T or S;
X45 is any amino acid, preferably T, K, L or H;
X46 is any amino acid, preferably S, N or H;
X47 is any amino acid, preferably I, L, V, A or T;
X48 is R;
X49 is V, I or M;
X50 is any amino acid, preferably H, Q or E;
X51 is any amino acid, preferably F, C, Y, Q or H;
X52 is any amino acid, preferably Q, E, A, W, S or Y;
X53 is any amino acid, preferably A, G, D, N or R;
X54 is G, S or H;
X55 is any amino acid, preferably R, S, K, N or T;
X56 is F;
X57 is any amino acid, preferably H, S or R;
X58 is L or F;
[Xq]n is a sequence of n amino acids wherein n is from 7 to 14 amino acids and wherein the sequence Xq may comprise the same or different amino acids selected from any amino acid residue;
X59 is D, T or E;
X60 is any amino acid, preferably C, S, V, I, L or F;
X61 is L, V, T or I;
[Xr]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xr may comprise the same or different amino acids selected from any amino acid residue;
X62 is L, F or A;
X63 is L, V, I or N;
X64 is any amino acid, preferably E, H, D, Q or M;
X65 is H, Y or G; and
X66 is Y, L or S.
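The SH2 domain consensus above can likewise be sketched as a pattern, purely for illustration. In this hypothetical Python example (the names `SH2_TOKENS`, `SH2_RE` and `has_sh2_consensus` are not from the specification), positions described as "any amino acid, preferably ..." are treated as unconstrained, since the preferred residues are not mandatory, and "any amino acid" is assumed to mean the 20 standard residues.

```python
import re

# Assumption: "any amino acid" is limited to the 20 standard residues.
AA = "[ACDEFGHIKLMNPQRSTVWY]"

# One token per consensus position, in order; spacers carry their length range.
SH2_TOKENS = [
    "[GP]", "[FWC]", "Y", "W", "[GS]", AA, "[LMVI]", AA, AA, AA,   # X1-X10
    AA, "A", AA, AA, AA, "L", AA, AA, AA, "P",                     # X11-X20
    AA, "G", "[TS]", "F", "L", "[VIL]", "R", "D", "S", AA,         # X21-X30
    AA, AA, AA, AA,                                                # X31-X34
    "[FLI]{1,2}",                                                  # [Xp]n
    AA, "[LIVC]", "[SD]", "[VF]", "[KR]", AA,                      # X35-X40
    AA, AA, AA, AA, AA, AA, AA, "R", "[VIM]", AA,                  # X41-X50
    AA, AA, AA, "[GSH]", AA, "F", AA, "[LF]",                      # X51-X58
    AA + "{7,14}",                                                 # [Xq]n
    "[DTE]", AA, "[LVTI]",                                         # X59-X61
    AA + "{1,2}",                                                  # [Xr]n
    "[LFA]", "[LVIN]", AA, "[HYG]", "[YLS]",                       # X62-X66
]
SH2_RE = re.compile("".join(SH2_TOKENS))


def has_sh2_consensus(protein: str) -> bool:
    """True if the sequence contains a stretch fitting the SH2 consensus."""
    return SH2_RE.search(protein.upper()) is not None
```

Keeping one token per position makes it straightforward to check the pattern against the position-by-position definitions, at the cost of a longer listing than a hand-compressed expression.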
Another aspect of the present invention relates to a protein or a derivative, homologue, analogue or mimetic thereof comprising a SOCS box in its C-terminal region.
Yet another aspect of the present invention is directed to a protein or a derivative, homologue, analogue or mimetic thereof comprising a SOCS box in its C-terminal region and a protein:molecule interacting region.
Even yet another aspect of the present invention provides a protein or a derivative, homologue, analogue or mimetic thereof comprising a protein:molecule interacting region located in a region N-terminal of the SOCS box.
Preferably, the protein:molecule interacting region is a protein:DNA or a protein:protein binding region.
Suitably, the protein:molecule interacting region is an SH2 domain.
Another aspect of the present invention contemplates a protein or a derivative, homologue, analogue or mimetic thereof comprising a SOCS box in its C-terminal region and an SH2 domain, WD-40 repeats or ankyrin repeats N-terminal of the SOCS box.
Still yet another aspect of the present invention provides a protein or a derivative, homologue, analogue or mimetic thereof exhibiting the following characteristics:
(i) comprises a SOCS box in its C-terminal region having the amino acid sequence:
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 [Xi]n X17 X18 X19 X20 X21 X22 X23 [Xj]n X24 X25 X26 X27 X28
wherein:
X1 is L, I, V, M, A or P;
X2 is any amino acid residue;
X3 is P, T or S;
X4 is L, I, V, M, A or P;
X5 is any amino acid;
X6 is any amino acid;
X7 is L, I, V, M, A, F, Y or W;
X8 is C, T or S;
X9 is R, K or H;
X10 is any amino acid;
X11 is any amino acid;
X12 is L, I, V, M, A or P;
X13 is any amino acid;
X14 is any amino acid;
X15 is any amino acid;
X16 is L, I, V, M, A, P, G, C, T or S;
[Xi]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids and wherein the sequence Xi may comprise the same or different amino acids selected from any amino acid residue;
X17 is L, I, V, M, A or P;
X18 is any amino acid;
X19 is any amino acid;
X20 is L, I, V, M, A or P;
X21 is P;
X22 is L, I, V, M, A, P or G;
X23 is P or N;
[Xj]n is a sequence of n amino acids wherein n is from 0 to 50 amino acids and wherein the sequence Xj may comprise the same or different amino acids selected from any amino acid residue;
X24 is L, I, V, M, A or P;
X25 is any amino acid;
X26 is any amino acid;
X27 is Y or F;
X28 is L, I, V, M, A or P; and
(ii) comprises at least one of an SH2 domain, WD-40 repeats and/or ankyrin repeats or other protein:molecule interacting domain in a region N-terminal of the SOCS box.
Preferably, the proteins modulate signal transduction such as cytokine-mediated signal transduction.
Preferred cytokines are EPO, TPO, G-CSF, GM-CSF, IL-3, IL-2, IL-4, IL-7, IL-13, IL-6, LIF, IL-12, IFNγ, TNFα, IL-1 and/or M-CSF.
A particularly preferred cytokine is IL-6.
Even yet another aspect of the present invention provides a protein or derivative, homologue, analogue or mimetic thereof exhibiting the following characteristics:
(i) is capable of modulating signal transduction such as cytokine-mediated signal transduction;
(ii) comprises a SOCS box in its C-terminal region having the amino acid sequence:
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 [Xi]n X17 X18 X19 X20 X21 X22 X23 [Xj]n X24 X25 X26 X27 X28
wherein:
X1 is L, I, V, M, A or P;
X2 is any amino acid residue;
X3 is P, T or S;
X4 is L, I, V, M, A or P;
X5 is any amino acid;
X6 is any amino acid;
X7 is L, I, V, M, A, F, Y or W;
X8 is C, T or S;
X9 is R, K or H;
X10 is any amino acid;
X11 is any amino acid;
X12 is L, I, V, M, A or P;
X13 is any amino acid;
X14 is any amino acid;
X15 is any amino acid;
X16 is L, I, V, M, A, P, G, C, T or S;
[Xi]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids and wherein the sequence Xi may comprise the same or different amino acids selected from any amino acid residue;
X17 is L, I, V, M, A or P;
X18 is any amino acid;
X19 is any amino acid;
X20 is L, I, V, M, A or P;
X21 is P;
X22 is L, I, V, M, A, P or G;
X23 is P or N;
[Xj]n is a sequence of n amino acids wherein n is from 0 to 50 amino acids and wherein the sequence Xj may comprise the same or different amino acids selected from any amino acid residue;
X24 is L, I, V, M, A or P;
X25 is any amino acid;
X26 is any amino acid;
X27 is Y or F;
X28 is L, I, V, M, A or P; and
(iii) comprises at least one of an SH2 domain, WD-40 repeats and/or ankyrin repeats or other protein:molecule interacting domain in a region N-terminal of the SOCS box.
Particularly preferred SOCS proteins comprise an amino acid sequence substantially as set forth in SEQ ID NO:4 (mSOCS SEQ ID NO:6 (mSOCS2), SEQ ID NO:8 (mSOCS3), SEQ ID (hSOCS1), SEQ ID NO:12 (rSOCS1), SEQ ID NO:14 (mSOCS4), SEQ ID NO:18 (mSOCS5), SEQ ID NO:21 (mSOCS6), SEQ ID NO:25 (mSOCS7), SEQ ID NO:29 (mSOCS8), SEQ ID NO:36 (hSOCS11), SEQ ID NO:41 (mSOCS13), SEQ ID NO:44 (mSOCS14), SEQ ID NO:46 (mSOCS15) and SEQ ID NO:48 (hSOCS15) or an amino acid sequence having at least 15% similarity to all or a region of any one of the listed sequences.
Still yet another aspect of the present invention provides an isolated polypeptide comprising an SH2 domain of a protein or a derivative, homologue, analogue or mimetic thereof wherein said protein comprises a SOCS box in its C-terminal region.
Even still yet another aspect of the present invention contemplates an isolated polypeptide comprising an SH2 domain of a protein or a derivative, homologue, analogue or mimetic thereof wherein said protein comprises a SOCS box in its C-terminal region wherein the SH2 domain comprises the amino acid sequence:

X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31 X32 X33 X34 [Xp]n X35 X36 X37 X38 X39 X40 X41 X42 X43 X44 X45 X46 X47 X48 X49 X50 X51 X52 X53 X54 X55 X56 X57 X58 [Xq]n X59 X60 X61 [Xr]n X62 X63 X64 X65 X66

wherein:
X1 is G or P;
X2 is F, W or C;
X3 is Y;
X4 is W;
X5 is G or S;
X6 is any amino acid, preferably P, A, S or V;
X7 is L, M, V or I;
X8 is any amino acid, preferably S, T, D or N;
X9 is any amino acid, preferably V, G, A, R, K or W;
X10 is any amino acid, preferably H, G, N, S, Y, W or E;
X11 is any amino acid, preferably G, E, A or D;
X12 is A;
X13 is any amino acid, preferably H, N, K, R or E;
X14 is any amino acid, preferably E, L, Q, A, G or M;
X15 is any amino acid, preferably R, L, K or H;
X16 is L;
X17 is any amino acid, preferably R, S, K, Q, E or A;
X18 is any amino acid, preferably A, E, K, G, N or S;
X19 is any amino acid, preferably E, A, M, K or V;
X20 is P;
X21 is any amino acid, preferably V, A, E or D;
X22 is G;
X23 is T or S;
X24 is F;
X25 is L;
X26 is V, I or L;
X27 is R;
X28 is D;
X29 is S;
X30 is any amino acid, preferably R, S, T or A;
X31 is any amino acid, preferably Q, D or H;
X32 is any amino acid, preferably R, Q, S, P, E or D;
X33 is any amino acid, preferably N, R, D or S;
X34 is any amino acid, preferably C, H or Y;
[Xp]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xp may comprise the same or different amino acids selected from F, L or I;
X35 is any amino acid, preferably A, T or S;
X36 is L, I, V or C;
X37 is S or D;
X38 is V or F;
X39 is K or R;
X40 is any amino acid, preferably M, T, R or S;
X41 is any amino acid, preferably A, Q, S, T, Y or H;
X42 is any amino acid, preferably S, A, R, N or G;
X43 is any amino acid, preferably G, R, K or I;
X44 is any amino acid, preferably P, T or S;
X45 is any amino acid, preferably T, K, L or H;
X46 is any amino acid, preferably S, N or H;
X47 is any amino acid, preferably I, L, V, A or T;
X48 is R;
X49 is V, I or M;
X50 is any amino acid, preferably H, Q or E;
X51 is any amino acid, preferably F, C, Y, Q or H;
X52 is any amino acid, preferably Q, E, A, W, S or Y;
X53 is any amino acid, preferably A, G, D, N or R;
X54 is G, S or H;
X55 is any amino acid, preferably R, S, K, N or T;
X56 is F;
X57 is any amino acid, preferably H, S or R;
X58 is L or F;
[Xq]n is a sequence of n amino acids wherein n is from 7 to 14 amino acids and wherein the sequence Xq may comprise the same or different amino acids selected from any amino acid residue;
X59 is D, T or E;
X60 is any amino acid, preferably C, S, V, I, L or F;
X61 is L, V, T or I;
[Xr]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xr may comprise the same or different amino acids selected from any amino acid residue;
X62 is L, F or A;
X63 is L, V, I or N;
X64 is any amino acid, preferably E, H, D, Q or M;
X65 is H, Y or G; and
X66 is Y, L or S.
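The SH2 domain definition above distinguishes positions restricted to a stated set of residues (for example, "X3 is Y") from positions that accept any amino acid but have preferred residues. As an illustration only, the following minimal sketch encodes positions X1 to X12 and reports which restricted positions a candidate segment violates and how many preferred residues it carries. The names `SH2_N_TERM` and `check_segment` are hypothetical, the count of preferred residues is an illustrative convenience rather than a measure defined in the specification, and the test segments are synthetic.

```python
# (label, required residues or None, preferred residues or None)
# Hypothetical encoding of positions X1..X12 of the SH2 domain definition.
SH2_N_TERM = [
    ("X1", "GP", None),
    ("X2", "FWC", None),
    ("X3", "Y", None),
    ("X4", "W", None),
    ("X5", "GS", None),
    ("X6", None, "PASV"),
    ("X7", "LMVI", None),
    ("X8", None, "STDN"),
    ("X9", None, "VGARKW"),
    ("X10", None, "HGNSYWE"),
    ("X11", None, "GEAD"),
    ("X12", "A", None),
]


def check_segment(segment, positions=SH2_N_TERM):
    """Return (violated_labels, preferred_hits) for an ungapped segment.

    violated_labels lists restricted positions whose residue is not allowed;
    preferred_hits counts "any amino acid" positions holding a preferred residue.
    """
    if len(segment) != len(positions):
        raise ValueError("segment length must equal the number of positions")
    violated = []
    preferred_hits = 0
    for residue, (label, required, preferred) in zip(segment.upper(), positions):
        if required is not None and residue not in required:
            violated.append(label)
        if preferred is not None and residue in preferred:
            preferred_hits += 1
    return violated, preferred_hits


print(check_segment("GFYWGPLSVHGA"))  # satisfies every restricted position
print(check_segment("AFYWGKLSVHGA"))  # violates X1; X6 holds a non-preferred residue
```

Keeping the restricted and preferred sets separate mirrors the wording of the definition, in which a non-preferred residue at an "any amino acid" position does not take a sequence outside the definition.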
Another aspect of the present invention contemplates a method of modulating levels of a SOCS protein in a cell said method comprising contacting a cell containing a SOCS gene with an effective amount of a modulator of SOCS gene expression or SOCS protein activity for a time and under conditions sufficient to modulate levels and/or activity of said SOCS protein.
A related aspect of the present invention provides a method of modulating signal transduction in a cell containing a SOCS gene comprising contacting said cell with an effective amount of a modulator of SOCS gene expression or SOCS protein activity for a time sufficient to modulate signal transduction.
Yet a further related aspect of the present invention is directed to a method of influencing interaction between cells wherein at least one cell carries a SOCS gene, said method comprising contacting the cell carrying the SOCS gene with an effective amount of a modulator of SOCS gene expression or SOCS protein activity for a time sufficient to modulate signal transduction.
Another aspect of the present invention is directed to the use of a polypeptide comprising an SH2 domain of a protein or a derivative, homologue, analogue or mimetic thereof wherein said protein comprises a SOCS box in its C-terminal region in screening for modulators of SOCS protein activity.
Still another aspect of the present invention provides the use of a polypeptide comprising an SH2 domain of a protein or a derivative, homologue, analogue or mimetic thereof wherein said protein comprises a SOCS box in its C-terminal region in screening for modulators of SOCS protein activity wherein the SH2 domain comprises the amino acid sequence:

X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31 X32 X33 X34 [Xp]n X35 X36 X37 X38 X39 X40 X41 X42 X43 X44 X45 X46 X47 X48 X49 X50 X51 X52 X53 X54 X55 X56 X57 X58 [Xq]n X59 X60 X61 [Xr]n X62 X63 X64 X65 X66

wherein:
X1 is G or P;
X2 is F, W or C;
X3 is Y;
X4 is W;
X5 is G or S;
X6 is any amino acid, preferably P, A, S or V;
X7 is L, M, V or I;
X8 is any amino acid, preferably S, T, D or N;
X9 is any amino acid, preferably V, G, A, R, K or W;
X10 is any amino acid, preferably H, G, N, S, Y, W or E;
X11 is any amino acid, preferably G, E, A or D;
X12 is A;
X13 is any amino acid, preferably H, N, K, R or E;
X14 is any amino acid, preferably E, L, Q, A, G or M;
X15 is any amino acid, preferably R, L, K or H;
X16 is L;
X17 is any amino acid, preferably R, S, K, Q, E or A;
X18 is any amino acid, preferably A, E, K, G, N or S;
X19 is any amino acid, preferably E, A, M, K or V;
X20 is P;
X21 is any amino acid, preferably V, A, E or D;
X22 is G;
X23 is T or S;
X24 is F;
X25 is L;
X26 is V, I or L;
X27 is R;
X28 is D;
X29 is S;
X30 is any amino acid, preferably R, S, T or A;
X31 is any amino acid, preferably Q, D or H;
X32 is any amino acid, preferably R, Q, S, P, E or D;
X33 is any amino acid, preferably N, R, D or S;
X34 is any amino acid, preferably C, H or Y;
[Xp]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xp may comprise the same or different amino acids selected from F, L or I;
X35 is any amino acid, preferably A, T or S;
X36 is L, I, V or C;
X37 is S or D;
X38 is V or F;
X39 is K or R;
X40 is any amino acid, preferably M, T, R or S;
X41 is any amino acid, preferably A, Q, S, T, Y or H;
X42 is any amino acid, preferably S, A, R, N or G;
X43 is any amino acid, preferably G, R, K or I;
X44 is any amino acid, preferably P, T or S;
X45 is any amino acid, preferably T, K, L or H;
X46 is any amino acid, preferably S, N or H;
X47 is any amino acid, preferably I, L, V, A or T;
X48 is R;
X49 is V, I or M;
X50 is any amino acid, preferably H, Q or E;
X51 is any amino acid, preferably F, C, Y, Q or H;
X52 is any amino acid, preferably Q, E, A, W, S or Y;
X53 is any amino acid, preferably A, G, D, N or R;
X54 is G, S or H;
X55 is any amino acid, preferably R, S, K, N or T;
X56 is F;
X57 is any amino acid, preferably H, S or R;
X58 is L or F;
[Xq]n is a sequence of n amino acids wherein n is from 7 to 14 amino acids and wherein the sequence Xq may comprise the same or different amino acids selected from any amino acid residue;
X59 is D, T or E;
X60 is any amino acid, preferably C, S, V, I, L or F;
X61 is L, V, T or I;
[Xr]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xr may comprise the same or different amino acids selected from any amino acid residue;
X62 is L, F or A;
X63 is L, V, I or N;
X64 is any amino acid, preferably E, H, D, Q or M;
X65 is H, Y or G; and
X66 is Y, L or S.
Even still another aspect of the present invention contemplates a method of screening for a modulator of SOCS protein activity, said method comprising contacting a preparation containing a polypeptide comprising a SOCS protein SH2 domain or a derivative, homologue, analogue or mimetic thereof and an intracellular ligand or analogue or derivative thereof which interacts with said SH2 domain or derivative, homologue, analogue or mimetic thereof with a test agent and detecting a different level of interaction between said ligand or analogue or derivative thereof and said SH2 domain or derivative, homologue, analogue or mimetic thereof relative to a reference level of said interaction in the absence of said test agent, wherein said different level is indicative of said agent being a modulator of SOCS protein activity.
A further aspect of the present invention provides a method of screening for a modulator of SOCS protein activity, said method comprising contacting a preparation containing a polypeptide comprising a SOCS protein SH2 domain or a derivative, homologue, analogue or mimetic thereof and an intracellular ligand or analogue or derivative thereof which interacts with said SH2 domain or derivative, homologue, analogue or mimetic thereof with a test agent and detecting a reduced level of interaction between said ligand or analogue or derivative thereof and said SH2 domain or derivative, homologue, analogue or mimetic thereof relative to a reference level of said interaction in the absence of said test agent, wherein said reduced level is indicative of said agent being a modulator of SOCS protein activity.
Yet a further aspect of the present invention contemplates a method of screening for a modulator of SOCS protein activity, said method comprising contacting a preparation containing a polypeptide comprising a SOCS protein SH2 domain or a derivative, homologue, analogue or mimetic thereof and an intracellular ligand or analogue or derivative thereof which interacts with said SH2 domain or derivative, homologue, analogue or mimetic thereof with a test agent and detecting an enhanced level of interaction between said ligand or analogue or derivative thereof and said SH2 domain or derivative, homologue, analogue or mimetic thereof relative to a reference level of said interaction in the absence of said test agent, wherein said enhanced level is indicative of said agent being a modulator of SOCS protein activity.
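The comparison underlying the screening methods above, in which the level of interaction in the presence of a test agent is compared with a reference level measured in its absence, may be sketched as follows. This is an illustration under stated assumptions: the function name `classify_agent`, the `tolerance` parameter and its default value are hypothetical, and the specification does not prescribe any particular threshold or unit of measurement.

```python
def classify_agent(reference_level, test_level, tolerance=0.1):
    """Compare an interaction level with and without a test agent.

    reference_level: interaction measured in the absence of the test agent.
    test_level: interaction measured in the presence of the test agent.
    tolerance: hypothetical fractional band treated as "no change"
               (an assumption for illustration, not taken from the text).
    """
    if reference_level <= 0:
        raise ValueError("reference_level must be positive")
    ratio = test_level / reference_level
    if ratio < 1 - tolerance:
        return "reduced"      # candidate modulator that reduces the interaction
    if ratio > 1 + tolerance:
        return "enhanced"     # candidate modulator that enhances the interaction
    return "no change"


# Arbitrary example readings in the same units.
print(classify_agent(100.0, 50.0))
print(classify_agent(100.0, 150.0))
print(classify_agent(100.0, 105.0))
```

Either a "reduced" or an "enhanced" result corresponds to the "different level" recited in the first of the three screening aspects.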
In accordance with the present invention, n in [Xi]n and [Xj]n may, in addition to being 1-50, be from 1-30, 1-20, 1-10 and

A summary of the SEQ ID NOs referred to in the subject specification is given in Table 1.
TABLE 1
SUMMARY OF SEQUENCE IDENTITY NUMBERS

SEQUENCE                                                    SEQ ID NO.
PCR Primer                                                  1
PCR Primer                                                  2
Mouse SOCS1 (nucleotide)                                    3
Mouse SOCS1 (amino acid)                                    4
Mouse SOCS2 (nucleotide)                                    5
Mouse SOCS2 (amino acid)                                    6
Mouse SOCS3 (nucleotide)                                    7
Mouse SOCS3 (amino acid)                                    8
Human SOCS1 (nucleotide)                                    9
Human SOCS1 (amino acid)                                    10
Rat SOCS1 (nucleotide)                                      11
Rat SOCS1 (amino acid)                                      12
nucleotide sequence of murine SOCS4                         13
amino acid sequence of murine SOCS4                         14
nucleotide sequence of SOCS4 cDNA human contig 4.1          15
nucleotide sequence of SOCS4 cDNA human contig 4.2          16
nucleotide sequence of murine SOCS5                         17
amino acid sequence of murine SOCS5                         18
nucleotide sequence of human SOCS5                          19
nucleotide sequence of murine SOCS6                         20
amino acid sequence of murine SOCS6                         21
nucleotide sequence of human SOCS6 contig h6.1              22
nucleotide sequence of human SOCS6 contig h6.2              23
nucleotide sequence of murine SOCS7                         24
amino acid sequence of murine SOCS7                         25
nucleotide sequence of human SOCS7 contig h7.1              26
nucleotide sequence of human SOCS7 contig h7.2              27
nucleotide sequence of murine SOCS8                         28
amino acid sequence of murine SOCS8                         29
nucleotide sequence of murine SOCS9                         30
nucleotide sequence of human SOCS9                          31
nucleotide sequence of murine SOCS10                        32
nucleotide sequence of human SOCS10 contig h10.1            33
nucleotide sequence of human SOCS10 contig h10.2            34
nucleotide sequence of human SOCS11                         35
amino acid sequence of human SOCS11                         36
nucleotide sequence of mouse SOCS12                         37
nucleotide sequence of human SOCS12 contig h12.1            38
nucleotide sequence of human SOCS12 contig h12.2            39
nucleotide sequence of murine SOCS13                        40
amino acid sequence of murine SOCS13                        41
nucleotide sequence of human SOCS13 cDNA contig h13.1       42
nucleotide sequence of murine SOCS14 cDNA                   43
amino acid sequence of murine SOCS14                        44
nucleotide sequence of murine SOCS15 cDNA                   45
amino acid sequence of murine SOCS15                        46
nucleotide sequence of human SOCS15                         47
amino acid sequence of human SOCS15                         48

Single and three letter abbreviations are used to denote amino acid residues and these are summarized in Table 2.

TABLE 2

Amino Acid        Three-letter Abbreviation    One-letter Symbol
Alanine           Ala                          A
Arginine          Arg                          R
Asparagine        Asn                          N
Aspartic acid     Asp                          D
Cysteine          Cys                          C
Glutamine         Gln                          Q
Glutamic acid     Glu                          E
Glycine           Gly                          G
Histidine         His                          H
Isoleucine        Ile                          I
Leucine           Leu                          L
Lysine            Lys                          K
Methionine        Met                          M
Phenylalanine     Phe                          F
Proline           Pro                          P
Serine            Ser                          S
Threonine         Thr                          T
Tryptophan        Trp                          W
Tyrosine          Tyr                          Y
Valine            Val                          V
Any residue       Xaa                          X

BRIEF DESCRIPTION OF THE DRAWINGS

In some of the Figures, abbreviations are used to denote SOCS proteins with certain binding motifs. SOCS proteins which contain WD-40 repeats are referred to as WSB1-WSB4. SOCS proteins with ankyrin repeats are referred to as ASB1-ASB3.
Figure 1 is a diagrammatic representation showing generation of an IL-6-unresponsive M1 clone by retroviral infection. The RUFneo retrovirus, showing the position of landmark restriction endonuclease cleavage sites, the 4A2 cDNA insert and the position of PCR primer sequences.
Figure 2 is a photographic representation of Southern and Northern analysis. (Left and Middle Panels) Southern blot analysis of genomic DNA from clone 4A2 and a control infected M1 clone. DNA was digested with BamH I, to reveal the number of retroviruses carried by each clone, and Sac I, to estimate the size of the retroviral cDNA insert. Left panel; probed with neo. Right panel; probed with the Xho I-digested 4A2 PCR product. (Right Panel) Northern blot analysis of total RNA from clone 4A2 and a control infected M1 clone, probed with the Xho I-digested 4A2 PCR product. The two bands represent unspliced and spliced retroviral transcripts, resulting from splice donor and acceptor sites in the retroviral genome.
Figure 3 is a representation of the nucleotide sequence and structure of the SOCS gene. A. The genomic context of SOCS1 in relation to the protamine gene cluster on murine chromosome 16. The accession number of this locus is MMPRMGNS (direct submission; G. Schlueter, 1995) for the mouse and BTPRMTNP2 for the rat (direct submission; G. Schlueter, 1996). B. The nucleotide sequence of the SOCS1 cDNA and deduced amino acid sequence. Conventional one letter abbreviations are used for the amino acid sequence and the asterisk indicates the stop codon. The polyadenylation signal sequence is underlined. The coding region is shown in upper case and the untranslated region is shown in lower case.
Figure 4 is a graphical representation of cell differentiation in the presence of cytokines. Semi-solid agar cultures of parental M1 cells (M1 and M1.mpl) and M1 cells expressing SOCS1 (4A2 and M1.mpl.SOCS1) were used and the percentage of colonies which differentiated in response to a titration of 1 mg/ml IL-6, 100 ng/ml LIF, 1 mg/ml OSM, 100 ng/ml IFN-γ, 500 ng/ml TPO or 3 x 10^-6 M dexamethasone was determined.
Figure 5 is a photographic representation of cytospins of liquid cultures of parental M1 cells (M1 and M1.mpl) and M1 cells expressing SOCS1 (4A2 and M1.mpl.SOCS1) cultured for 4 days in the presence of 10 ng/ml IL-6 or saline. Unlike parental M1 cells, morphological features consistent with macrophage differentiation are not observed in M1 cells constitutively expressing SOCS1 (4A2 and M1.mpl.SOCS1) when cultured in IL-6.
Figure 6 is a photographic representation showing inhibition of phosphorylation of signalling molecules by SOCS1. Parental M1 cells (M1 and M1.mpl) and M1 cells expressing SOCS1 (4A2 and M1.mpl.SOCS1) were incubated in the absence or presence of 10 ng/ml of IL-6 for 4 minutes at 37°C. Cells were then lysed and extracts were either immunoprecipitated using anti-mouse gp130 antibody prior to SDS-PAGE (two upper panels) or were electrophoresed directly (two lower panels). Gels were blotted and the filters were then probed with anti-phosphotyrosine (upper panel), anti-gp130 antibody (second top panel), anti-phospho-STAT3 (second bottom panel) or anti-STAT3 (lower panel). Blots were visualised using peroxidase-conjugated secondary antibodies and Enhanced Chemiluminescence (ECL) reagents.
Figure 7 is a representation of protein extracts prepared from M1 cells or M1 cells expressing SOCS1 (4A2) and M1.mpl cells or M1.mpl.SOCS1 cells incubated for 10 min at 37°C in 10 ml serum-free DME containing either saline, 100 ng/ml IL-6 or 100 ng/ml IFN-γ. The binding reactions contained 4-6 µg protein (constant within a given experiment), 5 ng 32P-labelled m67 oligonucleotide encoding the high affinity SIF (c-sis-inducible factor) binding site, and 800 ng sonicated salmon sperm DNA. For certain experiments, protein samples were preincubated with an excess of unlabelled m67 oligonucleotide, or antibodies specific for either STAT1 or STAT3.
Figure 8 is a photographic representation of Northern hybridisation. Mice were injected intravenously with 2 µg and after various periods of time, the livers were removed and polyA+ mRNA was purified. M1 cells were stimulated for various lengths of time with 500 ng/ml of IL-6, after which polyA+ mRNA was isolated. mRNA was fractionated by electrophoresis and immobilized on nylon filters. Northern blots were prehybridized, hybridized with random-primed 32P-labelled SOCS1 or GAPDH DNA fragments, washed and exposed to film overnight.
Figure 9 is a representation of a comparison of the amino acid sequences of SOCS1, SOCS2, SOCS3 and CIS. Alignment of the predicted amino acid sequence of mouse, human (hs) and rat (rr) SOCS1, SOCS2, SOCS3 and CIS. Those residues shaded are conserved in three or four mouse SOCS family members. The SH2 domain is boxed in solid lines, while the SOCS box is bounded by double lines.
Figure 10 is a photographic representation showing the phenotype of IL-6 unresponsive M1 cell clone, 4A2. Colonies of parental M1 cells (left panel) and clone 4A2 (right panel) cultured in semi-solid agar for 7 days in saline or 100 ng/ml IL-6.
Figure 11 is a photographic representation showing expression of mRNA for SOCS family members in vitro and in vivo.
Northern analysis of mRNA from a range of mouse organs showing constitutive expression of SOCS family members in a limited number of tissues.
Northern analysis of mRNA from liver and M1 cells showing induction of expression of SOCS family members following exposure to IL-6.
Reverse transcriptase PCR analysis of mRNA from bone marrow showing induction of expression of SOCS family members by a range of cytokines.
Figure 12 is a photographic representation showing SOCS suppresses the phosphorylation and activation of gp130 and STAT-3.
Western blots of extracts from parental M1 cells (M1 and M1.mpl) and M1 cells expressing SOCS1 (4A2 and M1.mpl.SOCS1) stimulated with or without 100 ng/ml IL-6. Top: Extracts immunoprecipitated with anti-gp130 (αgp130) and immunoblotted with anti-phosphotyrosine (αPY-STAT3), or for STAT3 (αSTAT3) to demonstrate equal loading of protein. The molecular weights of the bands are shown on the right.
EMSA of M1.mpl and M1.mpl.SOCS1 cells stimulated with and without 100 ng/ml IL-6 or 100 ng/ml IFN-γ. The DNA-binding complexes SIF A, B, and C are indicated at the left.
Figure 13 is a representation of a comparison of the amino acid sequence of the SOCS proteins. (A) Alignment of N-terminal regions of SOCS proteins. (B) Alignment of the SH2 domains of CIS, SOCS1, 2, 3, 5, 9, 11 and 14. (C) Alignment of the WD-40 repeats of SOCS4, SOCS6, SOCS13 and SOCS15. (D) Alignment of the ankyrin repeats of SOCS7 and SOCS10. (E) Alignment of the regions between SH2, WD-40 and ankyrin repeats and the SOCS box. (F) Alignment of the SOCS box. In each case the conventional one letter abbreviations for amino acids are used, with X denoting residues of uncertain identity and 000 denoting the beginning and the end of contigs. Amino acid sequence obtained from conceptual translation of nucleic acid sequence derived from isolated cDNAs is shown in upper case while amino acid sequence obtained by conceptual translation of ESTs is shown in lower case and is approximate only.
Conserved residues, defined as (LIVMA), (FYW), S, (KRH), (PG) are shaded in the SH2 domain, WD-40 repeats, ankyrin repeats and the SOCS box. For the alignment of SH2 domains, WD-40 repeats and ankyrin repeats a consensus sequence is shown above. In each case this has been derived from examination of a large and diverse set of domains (Neer et al, 1994; Bork, 1993).
Figures 14(A) and (B) are photographic representations showing analysis of mRNA expression of mouse SOCS1 and SOCS5 and SOCS containing a WD-40 repeat (WSB2) and ankyrin repeats (ASB1).
Figure 15 is a representation showing the nucleotide sequence of the mouse SOCS4 cDNA. The nucleotides encoding the mature coding region from the predicted ATG "start" codon to the stop codon are shown in upper case, while the predicted 5' and 3' untranslated regions are shown in lower case. The relationship of mouse cDNA sequence to mouse and human EST contigs is illustrated in Figure 17.
Figure 16 is a representation showing the predicted amino acid sequence of the mouse SOCS4 protein, derived from the nucleotide sequence in Figure 15. The SOCS box, which is also shown in Figure 13, is underlined.
Figure 18 is a representation showing the nucleotide sequence of human SOCS4 cDNA contigs h4.1 and h4.2, derived from analysis of ESTs listed in Table 4.1. The relationship of these contigs to the mouse cDNA sequence is illustrated in Figure 17.
Figure 19 is a diagrammatic representation showing the relationship of mouse SOCS5 genomic (57-2) and cDNA clones to contigs derived from analysis of mouse ESTs (Table 5.1) and human cDNA clone (5-94-2) and ESTs (Table 5.2). The nucleotide sequence of the mouse contig is shown in Figure 20, with the sequence of human SOCS5 contig (h5.1) being shown in Figure 21. The deduced amino acid sequence of mouse SOCS5 is shown in Figure 20B. The structure of the protein is shown schematically, with the SH2 domain and the SOCS box indicated. The putative 5' and 3' untranslated regions are shown by the thin solid line.
Figure 20A is a representation showing the nucleotide sequence of the mouse SOCS5 derived from analysis of genomic and cDNA clones. The nucleotides encoding the mature coding region from the predicted ATG "start" codon to the stop codon are shown in upper case, while the predicted 5' and 3' untranslated regions are shown in lower case. The relationship of mouse cDNA sequence to mouse and human EST contigs is illustrated in Figure 19.
Figure 20B is a representation of the predicted amino acid sequence of mouse SOCS5 protein, derived from the nucleotide sequence in Figure 20A. The SOCS box, which is also shown in Figure 13, is underlined.
Figure 21 is a representation showing the nucleotide sequence of human SOCS5 cDNA contig h5.1, derived from analysis of cDNA clone 5-94-2 and the ESTs listed in Table 5.2. The relationship of these contigs to the mouse cDNA sequence is illustrated in Figure 19.
Figure 22 is a diagrammatic representation showing the relationship of mouse SOCS6 cDNA clones (6-1A, 6-2A, 6-5B, 6-4N, 6-18, 6-29, 6-3N and 6-5N) to contigs derived from analysis of mouse ESTs (Table 6.1) and human ESTs (Table 6.2). The nucleotide sequence of the mouse SOCS6 contig is shown in Figure 23, with the sequence of human SOCS6 contigs (h6.1 and h6.2) being shown in Figure 24. The deduced amino acid sequence of mouse SOCS6 is shown in Figure 23B. The structure of the protein is shown schematically, with the WD-40 repeats and the SOCS box indicated. The putative 5' and 3' untranslated regions are shown by the thin solid line.
Figure 23A is a representation showing the nucleotide sequence of the mouse SOCS6 derived from analysis of cDNA clone 64-10A-11. The nucleotides encoding the part of the predicted coding region, ending in the stop codon are shown in upper case, while the predicted 3' untranslated regions are shown in lower case. The relationship of mouse cDNA sequence to mouse and human EST contigs is illustrated in Figure 22.
Figure 23B is a representation showing the predicted amino acid sequence of mouse SOCS6 protein, derived from the nucleotide sequence in Figure 23A. The SOCS box, which is also shown in Figure 13, is underlined.
Figure 24 is a representation showing the nucleotide sequence of human SOCS6 cDNA contig h6.1, derived from analysis of cDNA clone 5-94-2 and the ESTs listed in Table 6.2. The relationship of these contigs to the mouse cDNA sequence is illustrated in Figure 22.

Figure 25 is a diagrammatic representation showing the relationship of mouse SOCS7 cDNA clone (74-10A-11) to contigs derived from analysis of mouse ESTs (Table 7.1) and human ESTs (Table 7.2). The nucleotide sequence of the mouse SOCS7 contig is shown in Figure 26, with the sequence of human SOCS7 contigs (h7.1 and h7.2) being shown in Figure 27. The deduced amino acid sequence of mouse SOCS7 is shown in Figure 26B. The structure of the protein is shown schematically, with the ankyrin repeats and the SOCS box indicated. The putative 5' and 3' untranslated regions are shown by the thin solid line in the mouse and by the wavy line in h7.2. Based on analysis of clones isolated to date and ESTs, the 3' untranslated regions of mSOCS7 and hSOCS7 share little similarity.
Figure 26A is a representation showing the nucleotide sequence of the mouse SOCS7 derived from analysis of cDNA clone 74-10A-11. The nucleotides encoding the part of the predicted coding region, ending in the stop codon, are shown in upper case, while the predicted 3' untranslated regions are shown in lower case. The relationship of mouse cDNA sequence to mouse and human EST contigs is illustrated in Figure 25.

Figure 26B is a representation showing the predicted amino acid sequence of mouse SOCS7 protein, derived from the nucleotide sequence in Figure 26A. The SOCS box, which is also shown in Figure 13, is underlined.
Figure 27 is a representation showing the nucleotide sequence of human SOCS7 cDNA contigs h7.1 and h7.2 derived from analysis of the ESTs listed in Table 7.2. The relationship of these contigs to the mouse cDNA sequence is illustrated in Figure 25.

Figure 28 is a diagrammatic representation of the relationship of sequence derived from analysis of mouse SOCS8 ESTs (Table 8.1 and Figure 29A) to the predicted protein structure of mouse SOCS8. The deduced partial amino acid sequence of mouse SOCS8 is shown in Figure 29B. The structure of the protein is shown schematically with the SOCS box highlighted. The predicted 3' untranslated region is shown by the thin line.
Figure 29A is a representation showing the partial nucleotide sequence of mouse SOCS8 cDNA (contig 8.1) derived from analysis of ESTs. The nucleotides encoding the part of the predicted coding region, ending in the STOP codon are shown in upper case, while the predicted 3' untranslated regions are shown in lower case.
Figure 29B is a representation showing the partial predicted amino acid sequence of the mouse SOCS8 protein, derived from the nucleotide sequence in Figure 29A. The SOCS box, which is also shown in Figure 13, is underlined.
Figure 30 is a diagrammatic representation showing the relationship of mouse SOCS9 ESTs (Table 9.1) and human SOCS9 ESTs (Table 9.2). The nucleotide sequence of the mouse SOCS9 contig (m9.1) is shown in Figure 31, with the sequence of human SOCS9 contig (h9.1) being shown in Figure 32. The deduced amino acid sequence of human SOCS9 is shown schematically, with the SH2 domain and the SOCS box indicated. The putative 3' untranslated region is shown by the thin solid line.
Figure 31 is a representation showing the partial nucleotide sequence of mouse SOCS9 cDNA (contig m9.1), derived from analysis of the ESTs listed in Table 9.1. The relationship of these contigs to the mouse cDNA sequence is illustrated in Figure 30.

Figure 32 is a representation showing the partial nucleotide sequence of human SOCS9 cDNA (contig h9.1), derived from analysis of the ESTs listed in Table 9.2. Although it is clear that contig h9.1 encodes a protein with an SH2 domain and a SOCS box, the quality of the sequence is not high enough to derive a single unambiguous open reading frame. The relationship of these contigs to the mouse cDNA sequence is illustrated in Figure 30.

Figure 33 is a representation showing the relationship of mouse SOCS10 cDNA clones (10-9, 10-12, 10-23 and 10-24) to contigs derived from analysis of mouse ESTs (Table 10.1) and human ESTs (Table 10.2). The nucleotide sequence of the mouse SOCS10 contig is shown in Figure 34, with the sequence of human SOCS10 contigs (h10.1 and h10.2) being shown in Figure 35. The predicted structure of the protein is shown schematically, with the ankyrin repeats and the SOCS box indicated. The putative 3' untranslated regions are shown by the thin solid line in the mouse and by the wavy line in h10.2. Based on analysis of clones isolated to date and ESTs, the 3' untranslated regions of mSOCS10 and hSOCS10 share little similarity.
Figure 34 is a representation showing the nucleotide sequence of the mouse SOCS10 derived from analysis of cDNA clones 10-9, 10-12, 10-23 and 10-24. The nucleotides encoding the part of the predicted coding region, ending in the stop codon, are shown in upper case, while the predicted 3' untranslated regions are shown in lower case. Although it is clear that contig m10.1 encodes a protein with a series of ankyrin repeats and a SOCS box, the quality of the sequence is not high enough to derive a single unambiguous open reading frame. The relationship of mouse cDNA sequence to mouse and human EST contigs is illustrated in Figure 33.
Figure 35 is a representation showing the nucleotide sequence of human SOCS10 cDNA contigs h10.1 and h10.2 derived from analysis of the ESTs listed in Table 10.2. The relationship of these contigs to the mouse cDNA sequence is illustrated in Figure 33.
Figure 36A is a representation showing the partial nucleotide sequence of the human SOCS11 cDNA derived from analysis of ESTs listed in Table 11.1. The nucleotides encoding the mature coding region from the predicted ATG "start" codon to the stop codon are shown in upper case, while the predicted 5' and 3' untranslated regions are shown in lower case. The relationship of the partial cDNA sequence, derived from ESTs, to the predicted protein is shown in Figure 37.
Figure 36B is a representation showing the partial predicted amino acid sequence of human SOCS11 protein, derived from the nucleotide sequence in Figure 36A. The SOCS box, which is also shown in Figure 13, is underlined.
o* 0 Figure 37 is a diagrammatic representation showing the relationship of sequence derived from analysis of human SOCS-11 ESTs (Table 11.1 and Figure 36A) to the predicted protein structure of human SOCS 11. The deduced partial amino acid sequence of human SOCS 1 is 9 .shown in Figure 36B. The structure of the protein is shown schematically with the SH2 domain shown by and the SOCS box highlighted by The predicted 3' untranslated region is shown by the thin line.
Figure 38 is a diagrammatic representation showing the relationship of mouse SOCS12 cDNA clones (12-1) to contigs derived from analysis of mouse ESTs (Table 12.1) and human ESTs (Table 12.2). The nucleotide sequence of the mouse SOCS12 contig is shown in Figure 12.2, with the sequence of human SOCS12 contigs (h12.1 and h12.2) being shown in Figure 40. The deduced partial amino acid sequence of mouse SOCS12 is shown in Figure 39. The structure of the protein is shown schematically, with the ankyrin repeats and the SOCS box indicated. The putative 3' untranslated region is shown by the thin solid line in the mouse and by the wavy line in h12.2. Based on analysis of clones isolated to date and ESTs, the 3' untranslated regions of mSOCS12 and hSOCS12 share little similarity.
Figure 39 is a representation showing the nucleotide sequence of the mouse SOCS12 derived from analysis of cDNA clone 12-1 and the ESTs listed in Table 12.1. The nucleotides encoding the part of the predicted coding region, including the stop codon, are shown in upper case, while the predicted 3' untranslated region is shown in lower case. Although, by homology with human SOCS12, it is clear that contig m12.1 encodes a protein with a series of ankyrin repeats and a SOCS box, the quality of the sequence is not high enough to derive a single unambiguous open reading frame. The relationship of the mouse cDNA sequence to mouse and human EST contigs is illustrated in Figure 38.
Figure 40 is a representation showing the nucleotide sequence of human SOCS12 cDNA contigs h12.1 and h12.2 derived from analysis of the ESTs listed in Table 12.2. The relationship of these contigs to the mouse cDNA sequence is illustrated in Figure 38.
Figure 41 is a diagrammatic representation showing the relationship of contig m13.1 derived from analysis of mouse SOCS13 cDNA clones (62-1, 62-6-7, 62-14) and mouse ESTs (Table 13.1) to contig h13.1 derived from analysis of human ESTs (Table 13.2). The nucleotide sequence of the mouse SOCS13 contig is shown in Figure 42, with the sequence of human SOCS13 contig (h13.1) being shown in Figure 43. The deduced amino acid sequence of mouse SOCS13 is shown in Figure 42B. The structure of the protein is shown schematically, with the WD-40 repeats and the SOCS box highlighted. The 3' untranslated region is shown by the thin solid line.
Figure 42A is a representation showing the nucleotide sequence of the mouse SOCS13 derived from analysis of cDNA clones 62-1, 62-6-7 and 62-14. The nucleotides encoding part of the predicted coding region, ending in the stop codon, are shown in upper case, while those encoding the predicted 3' untranslated regions are shown in lower case. The relationship of the mouse cDNA sequence to mouse and human EST contigs is illustrated in Figure 41.
Figure 42B is a representation showing the predicted amino acid sequence of mouse SOCS13 protein, derived from the nucleotide sequence in Figure 42A. The SOCS box, which is also shown in Figure 13, is underlined.
Figure 43 is a representation showing the nucleotide sequence of human SOCS13 cDNA contig h13.1 derived from analysis of the ESTs listed in Table 13.2. The relationship of these contigs to the mouse cDNA sequence is illustrated in Figure 41.
Figure 44 is a diagrammatic representation showing the relationship of a partial mouse SOCS14 cDNA clone (14-1) to contigs derived from analysis of mouse ESTs (Table 14.1). The nucleotide sequence of the mouse SOCS14 contig is shown in Figure 45. The deduced partial amino acid sequence of mouse SOCS14 is shown in Figure 45B. The structure of the protein is shown schematically, with the SH2 domain and the SOCS box indicated. The putative 3' untranslated region is shown by the thin line.
Figure 45A is a representation showing the nucleotide sequence of the mouse SOCS14 derived from analysis of genomic and cDNA clones. The nucleotides encoding the mature coding region from the predicted ATG "start" codon to the stop codon are shown in upper case, while the predicted 5' and 3' untranslated regions are shown in lower case. The relationship of the mouse cDNA sequence to mouse and human EST contigs is illustrated in Figure 44.
Figure 45B is a representation showing the predicted amino acid sequence of mouse SOCS14 protein, derived from the nucleotide sequence in Figure 45A. The SOCS box, which is also shown in Figure 13, is underlined.
Figure 46 is a diagrammatic representation showing the relationship of contig m15.1 derived from analysis of mouse BAC and mouse ESTs (Table 15.1) to contig h15.1 derived from analysis of the human BAC and human ESTs (Table 15.2). The nucleotide sequence of the mouse SOCS15 contig is shown in Figure 47, with the sequence of human SOCS15 contig (h15.1) being shown in Figure 48. The deduced amino acid sequence of mouse SOCS15 is shown in Figure 47B. The structure of the protein is shown schematically, with the WD-40 repeats and the SOCS box highlighted. The 5' and 3' untranslated regions are shown by the thin solid line. The introns which interrupt the coding region are also indicated.
Figure 47A is a representation showing the nucleotide sequence covering the mouse SOCS15 gene derived from analysis of the mouse BAC listed in Table 15.1. The nucleotides encoding the predicted coding region, beginning with the ATG and ending in the stop codon, are shown in upper case, while those encoding the predicted 5' untranslated region, the introns and the 3' untranslated region are shown in lower case. The relationship of the mouse BAC to mouse and human EST contigs is illustrated in Figure 46.
Figure 47B is a representation showing the predicted amino acid sequence of mouse SOCS15 protein, derived from the nucleotide sequence in Figure 47A. The SOCS box, which is also shown in Figure 13, is underlined.
Figure 48A is a representation showing the nucleotide sequence covering the human SOCS15 gene derived from analysis of the human BAC listed in Table 15.2. The nucleotides encoding the predicted coding region, beginning with the ATG and ending in the stop codon, are shown in upper case, while those encoding the predicted 5' untranslated region, the introns and the 3' untranslated region are shown in lower case. The relationship of the human BAC to mouse and human EST contigs is illustrated in Figure 46.
Figure 48B is a representation showing the predicted amino acid sequence of human SOCS15 protein, derived from the nucleotide sequence in Figure 48A. The SOCS box, which is also shown in Figure 13, is underlined.
Figure 49 is a photographic representation showing SOCS1 inhibition of JAK2 kinase activity. Upper panel. Cos M6 cells were transiently transfected with either Flag-tagged mJAK2 and mSOCS-1 DNA (SOCS1) or Flag-mJAK2 DNA alone, lysed, JAK2 proteins immunoprecipitated using anti-JAK2 antibody and subjected to an in vitro kinase assay. Lower panel. A portion of the JAK2 immunoprecipitates was Western blotted with anti-JAK2 antibody. Upper panel. Cos M6 cells were transiently transfected with Flag-mJAK2 and Flag-mSOCS-1 DNA or Flag-mJAK2 DNA alone, lysed, JAK2 proteins immunoprecipitated using anti-JAK2 (UBI) and separated by SDS/PAGE gel. Immunoprecipitates were then analysed by Western blot with anti-phosphotyrosine antibody. Lower panel; JAK2 expression. Cos cell lysates were separated by SDS/PAGE gel and analysed by Western blot with anti-FLAG antibody (M2).
Figure 50 is a photographic representation showing interaction between JAK2 and SOCS protein. (A) Cos M6 cells were transiently transfected with Flag-tagged mJAK2 and various Flag-tagged SOCS DNAs (SOCS-1; S1, SOCS-2; S2, SOCS-3; S3, CIS) or Flag-mJAK2 alone, lysed, JAK2 proteins immunoprecipitated using anti-JAK2 (UBI) and separated by SDS/PAGE. Immunoprecipitates were then analysed by Western blot with anti-FLAG antibody. (B) Cos cell lysates described in (A) were separated by SDS/PAGE and expression levels of the various proteins were determined by Western blot with anti-FLAG antibody. (C) JAK2 tyrosine phosphorylation. Cos cell lysates described in (A) were separated by SDS/PAGE and proteins analysed by Western blot with anti-phosphotyrosine antibody.
Figure 51 is a diagrammatic representation of ppgalpAloxneo.
Figure 52 is a diagrammatic representation of ppgalpAloxneoTK.
Figure 53 is a diagrammatic representation of the SOCS1 knockout construct.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
The present invention provides a new family of modulators of signal transduction. As the initial members of this family suppressed cytokine signalling, the family is referred to as the "suppressors of cytokine signalling" family or "SOCS". The SOCS family is defined by the presence of a C-terminal domain referred to as a "SOCS box". Different classes of SOCS molecules are defined by a motif generally but not exclusively located N-terminal to the SOCS box and which is involved in protein:molecule interaction such as protein:DNA or protein:protein interaction. Particularly preferred motifs are selected from an SH2 domain, WD-40 repeats and ankyrin repeats.
WD-40 repeats were originally recognised in the β-subunit of G-proteins. WD-40 repeats appear to form a β-propeller-like structure and may be involved in protein-protein interactions.
Ankyrin repeats were originally recognised in the cytoskeletal protein ankyrin.
Members of the SOCS family may be identified by any number of means. For example, SOCS1 to SOCS3 were identified by their ability to suppress cytokine-mediated signal transduction and, hence, were identified based on activity. SOCS4 to SOCS15 were identified as nucleotide sequences exhibiting similarity at the level of the SOCS box.
The SOCS box is a conserved motif located in the C-terminal region of the SOCS molecule.
In accordance with the present invention, the amino acid sequence of the SOCS box is:
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 [Xi]n X17 X18 X19 X20 X21 X22 X23 [Xj]n X24 X25 X26 X27 X28
wherein:
X1 is L, I, V, M, A or P;
X2 is any amino acid residue;
X3 is P, T or S;
X4 is L, I, V, M, A or P;
X5 is any amino acid;
X6 is any amino acid;
X7 is L, I, V, M, A, F, Y or W;
X8 is C, T or S;
X9 is R, K or H;
X10 is any amino acid;
X11 is any amino acid;
X12 is L, I, V, M, A or P;
X13 is any amino acid;
X14 is any amino acid;
X15 is any amino acid;
X16 is L, I, V, M, A, P, G, C, T or S;
[Xi]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids and wherein the sequence Xi may comprise the same or different amino acids selected from any amino acid residue;
X17 is L, I, V, M, A or P;
X18 is any amino acid;
X19 is any amino acid;
X20 is L, I, V, M, A or P;
X21 is P;
X22 is L, I, V, M, A, P or G;
X23 is P or N;
[Xj]n is a sequence of n amino acids wherein n is from 0 to 50 amino acids and wherein the sequence Xj may comprise the same or different amino acids selected from any amino acid residue;
X24 is L, I, V, M, A or P;
X25 is any amino acid;
X26 is any amino acid;
X27 is Y or F; and
X28 is L, I, V, M, A or P.
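The SOCS box definition above reads naturally as a pattern with fixed residue classes and two variable-length gaps. The following is a minimal sketch of that reading as a Python regular expression. It is an illustration only and not part of the specification: the names `SOCS_BOX`, `find_socs_box` and `TOY_SOCS_BOX` are hypothetical, "any amino acid" is assumed to mean one of the 20 standard one-letter codes, and the test sequence is invented to satisfy the pattern rather than taken from any SOCS protein.

```python
import re

# Assumption: "any amino acid" is restricted to the 20 standard residues.
ANY = "[ACDEFGHIKLMNPQRSTVWY]"
HYD = "[LIVMAP]"  # the recurring class "L, I, V, M, A or P"

SOCS_BOX = re.compile(
    HYD + ANY + "[PTS]" + HYD            # X1-X4
    + ANY * 2                            # X5, X6
    + "[LIVMAFYW]" + "[CTS]" + "[RKH]"   # X7-X9
    + ANY * 2                            # X10, X11
    + HYD                                # X12
    + ANY * 3                            # X13-X15
    + "[LIVMAPGCTS]"                     # X16
    + ANY + "{1,50}?"                    # [Xi]n, n = 1 to 50
    + HYD + ANY * 2 + HYD                # X17-X20
    + "P" + "[LIVMAPG]" + "[PN]"         # X21-X23
    + ANY + "{0,50}?"                    # [Xj]n, n = 0 to 50
    + HYD + ANY * 2 + "[YF]" + HYD       # X24-X28
)


def find_socs_box(protein):
    """Return the (start, end) span of the first SOCS-box-like match, or None."""
    match = SOCS_BOX.search(protein.upper())
    return match.span() if match else None


# Hypothetical toy sequence built position by position to satisfy the pattern:
# X1-X16, then a one-residue [Xi], then X17-X23, an empty [Xj], then X24-X28.
TOY_SOCS_BOX = "LAPLAALCRAALAAAL" + "G" + "LAALPLP" + "LAAYL"

print(find_socs_box(TOY_SOCS_BOX))
print(find_socs_box(TOY_SOCS_BOX.replace("Y", "G")))
```

The gaps are written as lazy quantifiers so that the shortest acceptable spacer is tried first; whether a real search should prefer short or long spacers is a design choice the text does not address.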
As stated above and in accordance with the present invention, SOCS proteins are divided into separate classes based on the presence of a protein:molecule interacting region such as but not limited to an SH2 domain, WD-40 repeats and ankyrin repeats located N-terminal of the SOCS box. The latter three domains are protein:protein interacting domains.
Examples of SH2 containing SOCS proteins include SOCS1, SOCS2, SOCS3, SOCS5, SOCS9, SOCS11 and SOCS14. Examples of SOCS containing WD-40 repeats include SOCS4, SOCS6 and SOCS15. Examples of SOCS containing ankyrin repeats include SOCS7, SOCS10 and SOCS12.
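The classification in the preceding paragraph can be held in a small lookup table. The sketch below is illustrative only: the names `SOCS_CLASSES` and `domain_class` are hypothetical, and the table contains only the members named as examples in that paragraph, so a return value of `None` means "not listed there", not "has no such domain".

```python
# Members listed as examples in the text, grouped by N-terminal motif.
SOCS_CLASSES = {
    "SH2 domain": ("SOCS1", "SOCS2", "SOCS3", "SOCS5", "SOCS9", "SOCS11", "SOCS14"),
    "WD-40 repeats": ("SOCS4", "SOCS6", "SOCS15"),
    "ankyrin repeats": ("SOCS7", "SOCS10", "SOCS12"),
}


def domain_class(member):
    """Return the motif class for a listed SOCS member, or None if not listed."""
    # Normalise spellings such as "SOCS-11" or "socs 11" to "SOCS11".
    key = member.upper().replace("-", "").replace(" ", "")
    for motif, members in SOCS_CLASSES.items():
        if key in members:
            return motif
    return None


print(domain_class("SOCS-11"))
print(domain_class("SOCS6"))
```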
The present invention provides inter alia nucleic acid molecules encoding SOCS proteins, purified naturally occurring SOCS proteins as well as recombinant forms of SOCS proteins and methods of modulating signal transduction by modulating activity of SOCS proteins or expression of SOCS genes. Preferably, signal transduction is mediated by a cytokine, examples of which include EPO, TPO, G-CSF, GM-CSF, IL-3, IL-2, IL-4, IL-7, IL-13, IL-6, LIF, IL-12, IFNγ, TNFα, IL-1 and/or M-CSF. Particularly preferred cytokines include IL-6, LIF, OSM, IFN-γ and/or thrombopoietin.
Accordingly, one aspect of the present invention provides an isolated nucleic acid molecule comprising a sequence of nucleotides encoding or complementary to a sequence encoding a protein or a derivative, homologue, analogue or mimetic thereof or comprises a nucleotide sequence capable of hybridizing thereto under low stringency conditions at 42°C wherein said protein comprises a SOCS box in its C-terminal region and optionally a protein:molecule interacting domain N-terminal of the SOCS box.
Preferably, the protein:molecule interacting domain is a protein:DNA or protein:protein interacting domain. Most preferably, the protein:molecule interacting domain is one of an SH2 domain, WD-40 repeats and/or ankyrin repeats.
As stated above, preferably the subject SOCS modulate cytokine-mediated signal transduction. The present invention extends, however, to SOCS molecules modulating other effector-mediated signal transduction such as that mediated by other endogenous or exogenous molecules, antigens, microbes and microbial products, viruses or components thereof, ions, hormones and parasites. Endogenous molecules in this context are molecules produced within the cell carrying the SOCS molecule. Exogenous molecules are produced by other cells or are introduced to the body.
Preferably, the nucleic acid molecule or SOCS protein is in isolated or purified form. The terms "isolated" and "purified" mean that a molecule has undergone at least one purification step away from other material.
Preferably, the nucleic acid molecule is in isolated form and is DNA such as cDNA or genomic DNA. The DNA may encode the same amino acid sequence as the naturally occurring SOCS or the SOCS may contain one or more amino acid substitutions, deletions and/or additions. The nucleotide sequence may correspond to the genomic coding sequence (including exons and introns) or to the nucleotide sequence in cDNA from mRNA transcribed from the genomic gene or it may carry one or more nucleotide substitutions, deletions and/or additions thereto.
In a preferred embodiment, the nucleic acid molecule comprises a sequence of nucleotides encoding or complementary to a sequence encoding a SOCS protein or a derivative, homologue, analogue or mimetic thereof wherein the amino acid sequence of said SOCS protein is selected from SEQ ID NO:4 (mSOCS1), SEQ ID NO:6 (mSOCS2), SEQ ID NO:8 (mSOCS3), SEQ ID NO:10 (hSOCS1), SEQ ID NO:12 (rSOCS1), SEQ ID NO:14 (mSOCS4), SEQ ID NO:18 (mSOCS5), SEQ ID NO:21 (mSOCS6), SEQ ID NO:25 (mSOCS7), SEQ ID NO:29 (mSOCS8), SEQ ID NO:36 (hSOCS11), SEQ ID NO:41 (mSOCS13), SEQ ID NO:44 (mSOCS14), SEQ ID NO:46 (mSOCS15) and SEQ ID NO:48 (hSOCS15) or encodes an amino acid sequence with a single or multiple amino acid substitution, deletion and/or addition to the listed sequences or is a nucleotide sequence capable of hybridizing to the nucleic acid molecule under low stringency conditions at 42°C.
In an even more preferred embodiment, the present invention provides a nucleic acid molecule comprising a sequence of nucleotides encoding or complementary to a sequence encoding a SOCS protein or a derivative, homologue, analogue or mimetic thereof wherein the nucleotide sequence is selected from a nucleotide sequence substantially set forth in SEQ ID NO:3 (mSOCS1), SEQ ID NO:5 (mSOCS2), SEQ ID NO:7 (mSOCS3), SEQ ID NO:9 (hSOCS1), SEQ ID NO:11 (rSOCS1), SEQ ID NO:13 (mSOCS4), SEQ ID NO:15 and SEQ ID NO:16 (hSOCS4), SEQ ID NO:17 (mSOCS5), SEQ ID NO:19 (hSOCS5), SEQ ID NO:20 (mSOCS6), SEQ ID NO:22 and SEQ ID NO:23 (hSOCS6), SEQ ID NO:24 (mSOCS7), SEQ ID NO:26 and SEQ ID NO:27 (hSOCS7), SEQ ID NO:28 (mSOCS8), SEQ ID NO:30 (mSOCS9), SEQ ID NO:31 (hSOCS9), SEQ ID NO:32 (mSOCS10), SEQ ID NO:33 and SEQ ID NO:34 (hSOCS10), SEQ ID NO:35 (hSOCS11), SEQ ID NO:37 (mSOCS12), SEQ ID NO:38 and SEQ ID NO:39 (hSOCS12), SEQ ID NO:40 (mSOCS13), SEQ ID NO:42 (hSOCS13), SEQ ID NO:43 (mSOCS14), SEQ ID NO:45 (mSOCS15) and SEQ ID NO:47 (hSOCS15) or a nucleotide sequence having at least about 15% similarity to all or a region of any of the listed sequences or a nucleic acid molecule capable of hybridizing to any of the listed sequences under low stringency conditions at 42°C.
Reference herein to a low stringency at 42°C includes and encompasses from at least about 1% v/v to at least about 15% v/v formamide and from at least about 1M to at least about 2M salt for hybridisation, and at least about 1M to at least about 2M salt for washing conditions.
Alternative stringency conditions may be applied where necessary, such as medium stringency, which includes and encompasses from at least about 16% v/v to at least about 30% v/v formamide and from at least about 0.5M to at least about 0.9M salt for hybridisation and at least about 0.5M to at least about 0.9M salt for washing conditions, or high stringency, which includes and encompasses from at least about 31% v/v to at least about 50% v/v formamide and from at least about 0.01M to at least about 0.15M salt for hybridisation, and at least about 0.01M to at least about 0.15M salt for washing conditions.
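The three stringency bands defined above can be tabulated and looked up programmatically. The sketch below is an illustration under stated assumptions: the names `StringencyBand`, `BANDS` and `classify_stringency` are hypothetical, the "at least about" bounds in the text are treated as hard inclusive cut-offs, and conditions falling between or outside the stated bands are reported as `None` rather than assigned to the nearest band.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StringencyBand:
    name: str
    formamide_pct: tuple  # (min, max) formamide, % v/v
    salt_molar: tuple     # (min, max) salt concentration, M


# Ranges copied from the text; treating them as hard limits is an assumption.
BANDS = (
    StringencyBand("low", (1.0, 15.0), (1.0, 2.0)),
    StringencyBand("medium", (16.0, 30.0), (0.5, 0.9)),
    StringencyBand("high", (31.0, 50.0), (0.01, 0.15)),
)


def classify_stringency(formamide_pct, salt_molar):
    """Return 'low', 'medium' or 'high', or None if no band contains the values."""
    for band in BANDS:
        f_lo, f_hi = band.formamide_pct
        s_lo, s_hi = band.salt_molar
        if f_lo <= formamide_pct <= f_hi and s_lo <= salt_molar <= s_hi:
            return band.name
    return None


print(classify_stringency(10, 1.5))
print(classify_stringency(40, 0.1))
```

The same salt ranges are given in the text for hybridisation and for washing, so a single salt value is checked here.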
In another embodiment, the present invention is directed to a SOCS protein or a derivative, homologue, analogue or mimetic thereof wherein said SOCS protein is identified as follows:
human SOCS-4 characterised by EST81149, EST180909, EST182619, ya99H09, ye70co4, yh53c09, yh77gl 1, yh87h05, yi45h07, yj04e06, yql2h06, yq56a06, yq60e02, yq92g03, yq97h06, yr90f0 I, yt69c03, yv30a08, yv55f07, yv57h09, yv87h02, yv98e 1, yw68d 10, yw82a03, yxO8aO7, yx72h06, yx76b09, yy37h08, yy66b02, za8 I f'08, zb I 8f'07, zcO6eO8, zdl14gO6, zd5 Ihl12, zd52b09, ze25gl 11, ze69fO2, zf54f03 zh96e07, zv66h 12, zs83a08 and zs83g08;
mouse SOCS-4 characterised by mc65fD4, mf42e06, mplIOcl10, mr8l1gO9, and mtlI9hl12;
human SOCS-5 characterised by EST1I5B 103, ESTI 5BIO05, EST27530 and zf5OfOl1;
mouse SOCS-5 characterised by mc55aOlI, mh98f09, my26h 12 and ve24c06;
human SOCS-6 characterised by yf6leO8, yf93a09, ygO5fl2, yg4lfD4, yg45c02, yhl MfO, yhl3bO5, zc35a12, zeO2hO8, z109a03, z169e10, zn39d08 and zo39e06;
mouse SOCS-6 characterised by mcO4cO5, md48a03, mf3l1dO3, mh26b07, mh78e 11, mh88h09, mh94h07, mi27h04 and mj29c05, mp66g04, mw75g03, va53b05, vb34h02, vc55d07, vc59e05, vc67d03, vc68dl10, vc97hOl1, vc99c08, vdO7hO3, vdO8cOl1, vdO9bl12, vdlI9bO2, vd29a04 and vd46d06;
human SOCS-7 characterised by STS W130171, EST00939, EST12913, yc29b05, yp49flO, ztlOfIO3 and zx73g04;
mouse SOCS-7 characterised by mj39aOl and vi52h07;
mouse SOCS-8 characterised by mj6eO9 and vj27a029;
human SOCS-9 characterised by CSRL-82f2-u, ESTI 14054, yyO6bO7, yyO6gO6, zr4OcO9, zr72hO I, yx92c08, yx93b08 and hfe0662;
mouse SOCS-9 characterised by me65d05;
human SOCS-10 characterised by aa48hlO, zp35hOl, zp97h12, zqO8hOl, zr34g05, EST73000 and
mouse SOCS-10 characterised by mbl14dl12, mb4OfO06, mg89bl 11, mq89e 12, mpO3gl12 and vh53cl 11;
human SOCS-11 characterised by zt24h06 and zr43b02;
human SOCS-13 characterised by EST59161;
mouse SOCS-13 characterised by ma39a09, me6OcO5, mi78g05, uk IOci 11, mo48g 12, mp94aOl1, vb5 7c07 and vhO7cl 11; and
human SOCS-14 characterised by mi75e03 vd29hl 1 and vd53g07;
or a derivative or homologue of the above ESTs characterised by a nucleic acid molecule being capable of hybridizing to any of the listed ESTs under low stringency conditions at 42°C.
In another embodiment, the nucleotide sequence encodes the following amino acid sequence:
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 [Xi]n X17 X18 X19 X20 X21 X22 X23 [Xj]n X24 X25 X26 X27 X28
wherein:
X1 is L, I, V, M, A or P;
X2 is any amino acid residue;
X3 is P, T or S;
X4 is L, I, V, M, A or P;
X5 is any amino acid;
X6 is any amino acid;
X7 is L, I, V, M, A, F, Y or W;
X8 is C, T or S;
X9 is R, K or H;
X10 is any amino acid;
X11 is any amino acid;
X12 is L, I, V, M, A or P;
X13 is any amino acid;
X14 is any amino acid;
X15 is any amino acid;
X16 is L, I, V, M, A, P, G, C, T or S;
[Xi]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids and wherein the sequence Xi may comprise the same or different amino acids selected from any amino acid residue;
X17 is L, I, V, M, A or P;
X18 is any amino acid;
X19 is any amino acid;
X20 is L, I, V, M, A or P;
X21 is P;
X22 is L, I, V, M, A, P or G;
X23 is P or N;
[Xj]n is a sequence of n amino acids wherein n is from 0 to 50 amino acids and wherein the sequence Xj may comprise the same or different amino acids selected from any amino acid residue;
X24 is L, I, V, M, A or P;
X25 is any amino acid;
X26 is any amino acid;
X27 is Y or F; and
X28 is L, I, V, M, A or P.
The above sequence comparisons are preferably to the whole molecule but may also be to part thereof. Preferably, the comparisons are made to a contiguous series of at least about 21 nucleotides or at least about 5 amino acids. More preferably, the comparisons are made against at least about 21 contiguous nucleotides or at least 7 contiguous amino acids. Comparisons may also only be made to the SOCS box region or a region encompassing the protein:molecule interacting region such as the SH2 domain, WD-40 repeats and/or ankyrin repeats.
Still another embodiment of the present invention contemplates an isolated polypeptide or a derivative, homologue, analogue or mimetic thereof comprising a SOCS box in its C-terminal region.
Preferably the polypeptide further comprises a protein:molecule interacting domain such as a protein:DNA or protein:protein interacting domain. Preferably, this domain is located N-terminal of the SOCS box. It is particularly preferred for the protein:molecule interacting domain to be at least one of an SH2 domain, WD-40 repeats and/or ankyrin repeats.
Preferably, the signal transduction is mediated by a cytokine selected from EPO, TPO, G-CSF, GM-CSF, IL-3, IL-2, IL-4, IL-7, IL-13, IL-6, LIF, IL-12, IFNγ, TNFα, IL-1 and/or M-CSF. Preferred cytokines are IL-6, LIF, OSM, IFN-γ or thrombopoietin.
More preferably, the protein comprises a SOCS box having the amino acid sequence:
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 [Xi]n X17 X18 X19 X20 X21 X22 X23 [Xj]n X24 X25 X26 X27 X28
wherein:
X1 is L, I, V, M, A or P;
X2 is any amino acid residue;
X3 is P, T or S;
X4 is L, I, V, M, A or P;
X5 is any amino acid;
X6 is any amino acid;
X7 is L, I, V, M, A, F, Y or W;
X8 is C, T or S;
X9 is R, K or H;
X10 is any amino acid;
X11 is any amino acid;
X12 is L, I, V, M, A or P;
X13 is any amino acid;
X14 is any amino acid;
X15 is any amino acid;
X16 is L, I, V, M, A, P, G, C, T or S;
[Xi]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids and wherein the sequence Xi may comprise the same or different amino acids selected from any amino acid residue;
X17 is L, I, V, M, A or P;
X18 is any amino acid;
X19 is any amino acid;
X20 is L, I, V, M, A or P;
X21 is P;
X22 is L, I, V, M, A, P or G;
X23 is P or N;
[Xj]n is a sequence of n amino acids wherein n is from 0 to 50 amino acids and wherein the sequence Xj may comprise the same or different amino acids selected from any amino acid residue;
X24 is L, I, V, M, A or P;
X25 is any amino acid;
X26 is any amino acid;
X27 is Y or F; and
X28 is L, I, V, M, A or P.
Still another embodiment provides an isolated polypeptide or a derivative, homologue, analogue or mimetic thereof comprising a sequence of amino acids substantially as set forth in SEQ ID NO:4 (mSOCS1), SEQ ID NO:6 (mSOCS2), SEQ ID NO:8 (mSOCS3), SEQ ID NO:10 (hSOCS1), SEQ ID NO:12 (rSOCS1), SEQ ID NO:14 (mSOCS4), SEQ ID NO:18 (mSOCS5), SEQ ID NO:21 (mSOCS6), SEQ ID NO:25 (mSOCS7), SEQ ID NO:29 (mSOCS8), SEQ ID NO:36 (hSOCS11), SEQ ID NO:41 (mSOCS13), SEQ ID NO:44 (mSOCS14), SEQ ID NO:46 (mSOCS15) and SEQ ID NO:48 (hSOCS15) or an amino acid sequence having at least similarity to all or a part of the listed sequences.
Preferred nucleotide percentage similarities include at least about 20%, at least about 40%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about or above such as 93%, 95%, 98% or 99%.
Preferred amino acid similarities include at least about 20%, at least about 30%, at least about at least about 50%, at least about 60%, at least about 70%, at least about 83%, at least about 90%, at least about 95%, at least about 97% or 98% or above.
As stated above, similarity may be measured against an entire molecule or a region comprising at least 21 nucleotides or at least 7 amino acids. Preferably, similarity is measured in a conserved region such as SH2 domain, WD-40 repeats, ankyrin repeats or other protein:molecule interacting domains or a SOCS box.
The term "similarity" includes exact identity between sequences or, where the sequence differs, different amino acids are related to each other at the structural, functional, biochemical and/or conformational levels.
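The distinction drawn above between identity and the broader notion of similarity can be made concrete with a short calculation over two pre-aligned, equal-length sequences. The sketch below is illustrative only: the function name `identity_and_similarity` and the grouping of related residues in `RELATED_GROUPS` are assumptions introduced here (the text does not specify which residues count as related), and the two ten-residue inputs are invented. The minimum length of 7 amino acids follows the text.

```python
# Assumption: an illustrative grouping of biochemically related residues.
# The specification does not define these groups.
RELATED_GROUPS = ("LIVMA", "FYW", "KRH", "DE", "ST", "NQ")


def _related(a, b):
    """True if both residues fall within one of the illustrative groups."""
    return any(a in group and b in group for group in RELATED_GROUPS)


def identity_and_similarity(seq_a, seq_b, min_length=7):
    """Return (% identity, % similarity) for two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if len(seq_a) < min_length:
        raise ValueError("region is shorter than the minimum comparison length")
    pairs = list(zip(seq_a.upper(), seq_b.upper()))
    identical = sum(a == b for a, b in pairs)
    similar = sum(a == b or _related(a, b) for a, b in pairs)
    return 100.0 * identical / len(pairs), 100.0 * similar / len(pairs)


# Hypothetical ten-residue regions: six identical positions, three related
# substitutions (K/R, E/D, S/T) and one unrelated substitution (Q/A).
print(identity_and_similarity("LIVKDESTNQ", "LIVRDDTTNA"))
```

A real comparison would first align the sequences and would normally use a substitution matrix; this sketch leaves both out to keep the definition visible.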
The nucleic acid molecule may be isolated from any animal such as humans, primates, livestock animals (e.g. horses, cows, sheep, donkeys, pigs), laboratory test animals (e.g. mice, rats, rabbits, hamsters, guinea pigs), companion animals (e.g. dogs, cats) or captive wild animals (e.g. deer, foxes, kangaroos).
Another aspect of the present invention contemplates a protein or a derivative, homologue, analogue or mimetic thereof comprising a protein:molecule interacting domain located N-terminal of the SOCS box. Preferably, the protein:molecule interacting domain is a protein:DNA or protein:protein interacting domain. Most preferably, the protein:molecule interacting domain is one of an SH2 domain, WD-40 repeats and/or ankyrin repeats.
In a preferred embodiment, the present invention provides a polypeptide comprising an SH2 domain of a protein or a derivative, homologue, analogue or mimetic thereof wherein said protein comprises a SOCS box in its C-terminal region.
Preferably, the SH2 domain comprises the amino acid sequence:
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31 X32 X33 X34 [Xp]n X35 X36 X37 X38 X39 X40 X41 X42 X43 X44 X45 X46 X47 X48 X49 X50 X51 X52 X53 X54 X55 X56 X57 X58 [Xq]n X59 X60 X61 [Xr]n X62 X63 X64 X65 X66
wherein:
X1 is G or P;
X2 is F, W or C;
X3 is Y;
X4 is W;
X5 is G or S;
X6 is any amino acid, preferably P, A, S or V;
X7 is L, M, V or I;
X8 is any amino acid, preferably S, T, D or N;
X9 is any amino acid, preferably V, G, A, R, K or W;
X10 is any amino acid, preferably H, G, N, S, Y, W or E;
X11 is any amino acid, preferably G, E, A or D;
X12 is A;
X13 is any amino acid, preferably H, N, K, R or E;
X14 is any amino acid, preferably E, L, Q, A, G or M;
X15 is any amino acid, preferably R, L, K or H;
X16 is L;
X17 is any amino acid, preferably R, S, K, Q, E or A;
X18 is any amino acid, preferably A, E, K, G, N or S;
X19 is any amino acid, preferably E, A, M, K or V;
X20 is P;
X21 is any amino acid, preferably V, A, E or D;
X22 is G;
X23 is T or S;
X24 is F;
X25 is L;
X26 is V, I or L;
X27 is R;
X28 is D;
X29 is S;
X30 is any amino acid, preferably R, S, T or A;
X31 is any amino acid, preferably Q, D or H;
X32 is any amino acid, preferably R, Q, S, P, E or D;
X33 is any amino acid, preferably N, R, D or S;
X34 is any amino acid, preferably C, H or Y;
[Xp]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xp may comprise the same or different amino acids selected from F, L or I;
X35 is any amino acid, preferably A, T or S;
X36 is L, I, V or C;
X37 is S or D;
X38 is V or F;
X39 is K or R;
X40 is any amino acid, preferably M, T, R or S;
X41 is any amino acid, preferably A, Q, S, T, Y or H;
X42 is any amino acid, preferably S, A, R, N or G;
X43 is any amino acid, preferably G, R, K or I;
X44 is any amino acid, preferably P, T or S;
X45 is any amino acid, preferably T, K, L or H;
X46 is any amino acid, preferably S, N or H;
X47 is any amino acid, preferably I, L, V, A or T;
X48 is R;
X49 is V, I or M;
X50 is any amino acid, preferably H, Q or E;
X51 is any amino acid, preferably F, C, Y, Q or H;
X52 is any amino acid, preferably Q, E, A, W, S or Y;
X53 is any amino acid, preferably A, G, D, N or R;
X54 is G, S or H;
X55 is any amino acid, preferably R, S, K, N or T;
X56 is F;
X57 is any amino acid, preferably H, S or R;
X58 is L or F;
[Xq]n is a sequence of n amino acids wherein n is from 7 to 14 amino acids and wherein the sequence Xq may comprise the same or different amino acids selected from any amino acid residue;
X59 is D, T or E;
X60 is any amino acid, preferably C, S, V, I, L or F;
X61 is L, V, T or I;
[Xr]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xr may comprise the same or different amino acids selected from any amino acid residue;
X62 is L, F or A;
X63 is L, V, I or N;
X64 is any amino acid, preferably E, H, D, Q or M;
X65 is H, Y or G; and
X66 is Y, L or S.
In a preferred embodiment, the SH2 domain comprises a sequence selected from:
(i) GFYWGPLSVHGAHERLRAEPVGTFLVRDSRQRNCFFALSVKMASGPTSIRVHFQAGRFHLDGSRETFDCLFELLEHY;
(ii) GFYWGPLSVHGAHERLRAEPVGTFLVRDSRQRNCFFALSVKMASGPTSIRVHFQAGRFHLDGSRESFDCLFELLEHY;
(iii) GFYWGPLSVHGAHERLRSEPVGTFLVRDSRQRNCFFALSVKMASGPTSIRVHFQAGRFHLDGNRETFDCLFELLEHY;
(iv) GWvYWGSMTVNEAKEKLKEAPEGTFLIRDSSHSDYLLTISVKTSAGPTNLRIEYQDGKFRLDSIICVKSKLKQFDSVVHLIDYY;
(v) GFYWSAVTGGEANLLLSAEPAGTFLIRDSSDQRHFFTLSVKTQSGTKNLRIQCEGGSFSLQSDPRSTQPVPRFDCVLKLVHHY;
(vi) PCYWGVMDRYEAEALLEGKPEGTFLLRDSAQEDYFSVSFRRYNRSLHARIEQWNHfNFSFDAHDPCVFHSSTVTGLLEHY;
(vii) GWYYWGPITRWEAEGKLANVPDGSFLVRDSSDDRYLSCDFRSHGKTLHTRIEHSNGRFSFYEQXDVEGHTSIVDLIGAFNQGL;
(viii) GWYWGPMNWEDAEMKLKGKPDGSFLVRDSSDPRYILSLSFRSQGITHHTRMEHYRGTFSLWCHPKFEDRCQSVVEFIKRAIMHS;
(ix) PCYWGVMDKYAAEALLEGKPEGTFLLRDSAQEDYLFSVSFRRYSRSLHARIEQWNHfNFSFDAHDPCVFHSPDITGLLEHY; or
(x) GWYWGSITASEARQHLQKMPEGTFLVRDSTHPSYFTLSVKTTRGPTNVRIEYADSSFRLDSNCLSRPRILAFPDVVSLVQHY.
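The SH2 domain definition given earlier can be checked against a listed sequence in the same way as the SOCS box. The sketch below is illustrative and rests on stated assumptions: the names `SH2_PATTERN`, `matches_sh2` and `FIRST_LISTED_SH2` are hypothetical; positions described as "any amino acid, preferably ..." are treated as any of the 20 standard residues, so the stated preferences are ignored; and the two printed lines of the first listed sequence are joined into one string.

```python
import re

ANY = "[ACDEFGHIKLMNPQRSTVWY]"  # assumption: 20 standard residues

# One entry per position or run of positions, following the definition.
SH2_PARTS = [
    "[GP]", "[FWC]", "Y", "W", "[GS]",           # X1-X5
    ANY, "[LMVI]", ANY * 4,                      # X6-X11
    "A", ANY * 3, "L",                           # X12-X16
    ANY * 3, "P", ANY, "G",                      # X17-X22
    "[TS]", "F", "L", "[VIL]", "R", "D", "S",    # X23-X29
    ANY * 5,                                     # X30-X34
    "[FLI]{1,2}",                                # [Xp]n, n = 1 to 2
    ANY, "[LIVC]", "[SD]", "[VF]", "[KR]",       # X35-X39
    ANY * 8,                                     # X40-X47
    "R", "[VIM]", ANY * 4,                       # X48-X53
    "[GSH]", ANY, "F", ANY, "[LF]",              # X54-X58
    ANY + "{7,14}",                              # [Xq]n, n = 7 to 14
    "[DTE]", ANY, "[LVTI]",                      # X59-X61
    ANY + "{1,2}",                               # [Xr]n, n = 1 to 2
    "[LFA]", "[LVIN]", ANY, "[HYG]", "[YLS]",    # X62-X66
]
SH2_PATTERN = re.compile("".join(SH2_PARTS))


def matches_sh2(sequence):
    """True if the whole sequence fits the SH2 pattern sketched above."""
    return SH2_PATTERN.fullmatch(sequence.upper()) is not None


# The first sequence listed above, with its two printed lines joined.
FIRST_LISTED_SH2 = (
    "GFYWGPLSVHGAHERLRAEPVGTFLVRDSRQRNCFFALSVKM"
    "ASGPTSIRVHFQAGRFHLDGSRETFDCLFELLEHY"
)

print(matches_sh2(FIRST_LISTED_SH2))
# Replacing the invariant Y at X3 breaks the match.
print(matches_sh2(FIRST_LISTED_SH2[:2] + "A" + FIRST_LISTED_SH2[3:]))
```

Using `fullmatch` requires the pattern to account for every residue; scanning for an SH2-like region inside a longer protein would call for `search` instead.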
In another embodiment, the present invention provides a nucleic acid molecule comprising a sequence of nucleotides encoding or complementary to a sequence encoding an SH2 domain as broadly described above or a nucleotide sequence capable of hybridizing thereto under low stringency conditions at 42°C.
The term "derivatives" or its singular form "derivative" whether in relation to a nucleic acid molecule or a protein includes parts, mutants, fragments and analogues as well as hybrid or fusion molecules and glycosylation variants. Particularly useful derivatives comprise single or multiple amino acid substitutions, deletions and/or additions to the SOCS amino acid sequence.
Preferably, the derivatives have functional activity or alternatively act as antagonists or agonists.
The present invention further extends to homologues of SOCS which include the functionally or structurally related molecule from different animal species. The present invention also encompasses analogues and mimetics. Mimetics include a class of molecule generally but not necessarily having a non-amino acid structure and which functionally are capable of acting in an analogous manner to the protein for which it is a mimic, in this case, a SOCS. Mimetics may comprise a carbohydrate, aromatic ring, lipid or other complex chemical structure or may also be proteinaceous in composition. Mimetics as well as agonists and antagonists contemplated herein are conveniently located through systematic searching of environments, such as coral, marine and freshwater river beds, flora and microorganisms. This is sometimes referred to as natural product screening. Alternatively, libraries of synthetic chemical compounds may be screened for potentially useful molecules.
As stated above, the present invention contemplates agonists and antagonists of the SOCS. One example of an antagonist is an antisense oligonucleotide sequence. Useful oligonucleotides are those which have a nucleotide sequence complementary to at least a portion of the protein-coding or "sense" sequence of the nucleotide sequence. These antisense nucleotides can be used to effect the specific inhibition of gene expression. The antisense approach can cause inhibition of gene expression apparently by forming an anti-parallel duplex by complementary base pairing between the antisense construct and the targeted mRNA, presumably resulting in hybridisation arrest of translation. Ribozymes and co-suppression molecules may also be used.
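By way of illustration only, the complementarity described above can be sketched in code. The following minimal Python sketch derives the anti-parallel complement of a sense-strand DNA segment. The function name `antisense_oligo` and the 12-nucleotide example segment are hypothetical and are not taken from any SOCS sequence disclosed herein.

```python
# Illustrative sketch: derive an antisense oligonucleotide (written 5'->3')
# for a sense-strand DNA segment by taking its reverse complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(sense: str) -> str:
    """Return the reverse complement of a sense-strand DNA segment."""
    sense = sense.upper()
    if set(sense) - set("ACGT"):
        raise ValueError("sense segment must contain only A, C, G and T")
    # Complement each base, then reverse so the result pairs anti-parallel.
    return sense.translate(COMPLEMENT)[::-1]


# Hypothetical sense segment chosen only for demonstration.
example_sense = "ATGGCACGTTCA"
example_antisense = antisense_oligo(example_sense)
```

Applying the function twice returns the original segment, which is a convenient check that the pairing is symmetric.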
Antisense and other nucleic acid molecules may first need to be chemically modified to permit penetration of cell membranes and/or to increase their serum half life or otherwise make them more stable for in vivo administration. Antibodies may also act as either antagonists or agonists although they are more useful in diagnostic applications or in the purification of SOCS proteins.
Antagonists and agonists may also be identified following natural product screening or screening of libraries of chemical compounds or may be derivatives or analogues of the SOCS molecules.
Accordingly, the present invention extends to analogues of the SOCS proteins of the present invention. Analogues may be used, for example, in the treatment or prophylaxis of cytokine mediated dysfunction such as autoimmunity, immune suppression or hyperactive immunity or other condition including but not limited to dysfunctions in the haemopoietic, endocrine, hepatic and neural systems. Dysfunctions mediated by other signal transducing elements such as hormones or endogenous or exogenous molecules, antigens, microbes and microbial products, viruses or components thereof, ions, hormones and parasites are also contemplated by the present invention.
Analogues of the proteins contemplated herein include, but are not limited to, modification to side chains, incorporation of unnatural amino acids and/or their derivatives during peptide, polypeptide or protein synthesis and the use of crosslinkers and other methods which impose conformational constraints on the proteinaceous molecule or their analogues.
Examples of side chain modifications contemplated by the present invention include modifications of amino groups such as by reductive alkylation by reaction with an aldehyde followed by reduction with NaBH4; amidination with methylacetimidate; acylation with acetic anhydride; carbamoylation of amino groups with cyanate; trinitrobenzylation of amino groups with 2,4,6-trinitrobenzene sulphonic acid (TNBS); acylation of amino groups with succinic anhydride and tetrahydrophthalic anhydride; and pyridoxylation of lysine with pyridoxal-5-phosphate followed by reduction with NaBH4.
The guanidine group of arginine residues may be modified by the formation of heterocyclic condensation products with reagents such as 2,3-butanedione, phenylglyoxal and glyoxal.
The carboxyl group may be modified by carbodiimide activation via O-acylisourea formation followed by subsequent derivatisation, for example, to a corresponding amide.
Sulphydryl groups may be modified by methods such as carboxymethylation with iodoacetic acid or iodoacetamide; performic acid oxidation to cysteic acid; formation of mixed disulphides with other thiol compounds; reaction with maleimide, maleic anhydride or other substituted maleimide; formation of mercurial derivatives using 4-chloromercuribenzoate, 4-chloromercuriphenylsulphonic acid, phenylmercury chloride, 2-chloromercuri-4-nitrophenol and other mercurials; and carbamoylation with cyanate at alkaline pH.
Tryptophan residues may be modified by, for example, oxidation with N-bromosuccinimide or alkylation of the indole ring with 2-hydroxy-5-nitrobenzyl bromide or sulphenyl halides.
Tyrosine residues, on the other hand, may be altered by nitration with tetranitromethane to form a 3-nitrotyrosine derivative.
Modification of the imidazole ring of a histidine residue may be accomplished by alkylation with iodoacetic acid derivatives or N-carbethoxylation with diethylpyrocarbonate.
Examples of incorporating unnatural amino acids and derivatives during peptide synthesis include, but are not limited to, use of norleucine, 4-amino butyric acid, 4-amino-3-hydroxy-5-phenylpentanoic acid, 6-aminohexanoic acid, t-butylglycine, norvaline, phenylglycine, ornithine, sarcosine, 4-amino-3-hydroxy-6-methylheptanoic acid, 2-thienyl alanine and/or D-isomers of amino acids. A list of unnatural amino acids contemplated herein is shown in Table 3.
TABLE 3

Non-conventional amino acid    Code

α-aminobutyric acid    Abu
α-amino-α-methylbutyrate    Mgabu
aminocyclopropanecarboxylate    Cpro
aminoisobutyric acid    Aib
aminonorbornylcarboxylate    Norb
cyclohexylalanine    Chexa
cyclopentylalanine    Cpen
D-alanine    Dal
D-arginine    Darg
D-aspartic acid    Dasp
D-cysteine    Dcys
D-glutamine    Dgln
D-glutamic acid    Dglu
D-histidine    Dhis
D-isoleucine    Dile
D-leucine    Dleu
D-lysine    Dlys
D-methionine    Dmet
D-ornithine    Dorn
D-phenylalanine    Dphe
D-proline    Dpro
D-serine    Dser
D-threonine    Dthr
D-tryptophan    Dtrp
D-tyrosine    Dtyr
D-valine    Dval
D-α-methylalanine    Dmala
D-α-methylarginine    Dmarg
D-α-methylasparagine    Dmasn
D-α-methylaspartate    Dmasp
D-α-methylcysteine    Dmcys
D-α-methylglutamine    Dmgln
D-α-methylhistidine    Dmhis
D-α-methylisoleucine    Dmile
D-α-methylleucine    Dmleu
D-α-methyllysine    Dmlys
D-α-methylmethionine    Dmmet
D-α-methylornithine    Dmorn
D-α-methylphenylalanine    Dmphe
D-α-methylproline    Dmpro
D-α-methylserine    Dmser
D-α-methylthreonine    Dmthr
D-α-methyltryptophan    Dmtrp
D-α-methyltyrosine    Dmty
D-α-methylvaline    Dmval
D-N-methylalanine    Dnmala
D-N-methylarginine    Dnmarg
D-N-methylasparagine    Dnmasn
D-N-methylaspartate    Dnmasp
D-N-methylcysteine    Dnmcys
D-N-methylglutamine    Dnmgln
D-N-methylglutamate    Dnmglu
D-N-methylhistidine    Dnmhis
D-N-methylisoleucine    Dnmile
D-N-methylleucine    Dnmleu
D-N-methyllysine    Dnmlys
N-methylcyclohexylalanine    Nmchexa
D-N-methylornithine    Dnmorn
N-methylglycine    Nala
N-methylaminoisobutyrate    Nmaib
N-(1-methylpropyl)glycine    Nile
N-(2-methylpropyl)glycine    Nleu
D-N-methyltryptophan    Dnmtrp
D-N-methyltyrosine    Dnmtyr
D-N-methylvaline    Dnmval
γ-aminobutyric acid    Gabu
L-t-butylglycine    Tbug
L-ethylglycine    Etg
L-homophenylalanine    Hphe
L-α-methylarginine    Marg
L-α-methylaspartate    Masp
L-α-methylcysteine    Mcys
L-α-methylglutamine    Mgln
L-α-methylhistidine    Mhis
L-α-methylisoleucine    Mile
L-α-methylleucine    Mleu
L-α-methylmethionine    Mmet
L-α-methylnorvaline    Mnva
L-α-methylphenylalanine    Mphe
L-α-methylserine    Mser
L-α-methyltryptophan    Mtrp
L-α-methylvaline    Mval
N-(N-(2,2-diphenylethyl)carbamylmethyl)glycine    Nnbhm
1-carboxy-1-(2,2-diphenylethylamino)cyclopropane    Nmbc
L-N-methylalanine    Nmala
L-N-methylarginine    Nmarg
L-N-methylasparagine    Nmasn
L-N-methylaspartic acid    Nmasp
L-N-methylcysteine    Nmcys
L-N-methylglutamine    Nmgln
L-N-methylglutamic acid    Nmglu
L-N-methylhistidine    Nmhis
L-N-methylisoleucine    Nmile
L-N-methylleucine    Nmleu
L-N-methyllysine    Nmlys
L-N-methylmethionine    Nmmet
L-N-methylnorleucine    Nmnle
L-N-methylnorvaline    Nmnva
L-N-methylornithine    Nmorn
L-N-methylphenylalanine    Nmphe
L-N-methylproline    Nmpro
L-N-methylserine    Nmser
L-N-methylthreonine    Nmthr
L-N-methyltryptophan    Nmtrp
L-N-methyltyrosine    Nmtyr
L-N-methylvaline    Nmval
L-N-methylethylglycine    Nmetg
L-N-methyl-t-butylglycine    Nmtbug
L-norleucine    Nle
L-norvaline    Nva
α-methyl-aminoisobutyrate    Maib
α-methyl-γ-aminobutyrate    Mgabu
α-methylcyclohexylalanine    Mchexa
α-methylcyclopentylalanine    Mcpen
α-methyl-α-naphthylalanine    Manap
α-methylpenicillamine    Mpen
N-(4-aminobutyl)glycine    Nglu
N-(2-aminoethyl)glycine    Naeg
N-(3-aminopropyl)glycine    Norn
N-amino-α-methylbutyrate    Nmaabu
α-naphthylalanine    Anap
N-benzylglycine    Nphe
N-(2-carbamylethyl)glycine    Ngln
N-(carbamylmethyl)glycine    Nasn
N-(2-carboxyethyl)glycine    Nglu
N-(carboxymethyl)glycine    Nasp
N-cyclobutylglycine    Ncbut
N-cycloheptylglycine    Nchep
N-cyclohexylglycine    Nchex
N-cyclodecylglycine    Ncdec
N-cyclododecylglycine    Ncdod
N-cyclooctylglycine    Ncoct
N-cyclopropylglycine    Ncpro
N-cycloundecylglycine    Ncund
N-(2,2-diphenylethyl)glycine    Nbhm
N-(3,3-diphenylpropyl)glycine    Nbhe
N-(3-guanidinopropyl)glycine    Narg
N-(1-hydroxyethyl)glycine    Nthr
N-(hydroxyethyl)glycine    Nser
N-(imidazolylethyl)glycine    Nhis
N-(3-indolylethyl)glycine    Nhtrp
N-methyl-γ-aminobutyrate    Nmgabu
D-N-methylmethionine    Dnmmet
N-methylcyclopentylalanine    Nmcpen
D-N-methylphenylalanine    Dnmphe
D-N-methylproline    Dnmpro
D-N-methylserine    Dnmser
D-N-methylthreonine    Dnmthr
N-(1-methylethyl)glycine    Nval
N-methyl-α-naphthylalanine    Nmanap
N-methylpenicillamine    Nmpen
N-(p-hydroxyphenyl)glycine    Nhtyr
N-(thiomethyl)glycine    Ncys
penicillamine    Pen
L-α-methylalanine    Mala
L-α-methylasparagine    Masn
L-α-methyl-t-butylglycine    Mtbug
L-methylethylglycine    Metg
L-α-methylglutamate    Mglu
L-α-methylhomophenylalanine    Mhphe
N-(2-methylthioethyl)glycine    Nmet
L-α-methyllysine    Mlys
L-α-methylnorleucine    Mnle
L-α-methylornithine    Morn
L-α-methylproline    Mpro
L-α-methylthreonine    Mthr
L-α-methyltyrosine    Mtyr
L-N-methylhomophenylalanine    Nmhphe
N-(N-(3,3-diphenylpropyl)carbamylmethyl)glycine    Nnbhe

Crosslinkers can be used, for example, to stabilise 3D conformations, using homo-bifunctional crosslinkers such as the bifunctional imido esters having (CH2)n spacer groups with n=1 to n=6, glutaraldehyde, N-hydroxysuccinimide esters and hetero-bifunctional reagents which usually contain an amino-reactive moiety such as N-hydroxysuccinimide and another group specific-reactive moiety such as maleimido or dithio moiety (SH) or carbodiimide (COOH).
In addition, peptides can be conformationally constrained by, for example, incorporation of Cα and Nα-methylamino acids, introduction of double bonds between Cα and Cβ atoms of amino acids and the formation of cyclic peptides or analogues by introducing covalent bonds such as forming an amide bond between the N and C termini, between two side chains or between a side chain and the N or C terminus.
These types of modifications may be important to stabilise the cytokines if administered to an individual or for use as a diagnostic reagent.
Other derivatives contemplated by the present invention include a range of glycosylation variants from a completely unglycosylated molecule to a modified glycosylated molecule. Altered glycosylation patterns may result from expression of recombinant molecules in different host cells.
Another embodiment of the present invention contemplates a method for modulating expression of a SOCS protein in a mammal, said method comprising contacting a gene encoding a SOCS or a factor/element involved in controlling expression of the SOCS gene with an effective amount of a modulator of SOCS expression for a time and under conditions sufficient to up-regulate or down-regulate or otherwise modulate expression of SOCS. An example of a modulator is a cytokine such as IL-6 or other transcription regulators of SOCS expression.
Expression includes transcription or translation or both.
Another aspect of the present invention contemplates a method of modulating activity of SOCS in a human, said method comprising administering to said mammal a modulating effective amount of a molecule for a time and under conditions sufficient to increase or decrease SOCS activity.
The molecule may be a proteinaceous molecule or a chemical entity and may also be a derivative of SOCS or a chemical analogue or truncation mutant of SOCS.
A further aspect of the present invention provides a method of inducing synthesis of a SOCS or transcription/translation of a SOCS comprising contacting a cell containing a SOCS gene with an effective amount of a cytokine capable of inducing said SOCS for a time and under conditions sufficient for said SOCS to be produced. For example, SOCS1 may be induced by IL-6.
Still a further aspect of the present invention contemplates a method of modulating levels of a SOCS protein in a cell, said method comprising contacting a cell containing a SOCS gene with an effective amount of a modulator of SOCS gene expression or SOCS protein activity for a time and under conditions sufficient to modulate levels and/or activity of said SOCS protein.
Yet a further aspect of the present invention contemplates a method of modulating signal transduction in a cell containing a SOCS gene comprising contacting said cell with an effective amount of a modulator of SOCS gene expression or SOCS protein activity for a time sufficient to modulate signal transduction.
Even yet a further aspect of the present invention contemplates a method of influencing interaction between cells wherein at least one cell carries a SOCS gene, said method comprising contacting the cell carrying the SOCS gene with an effective amount of a modulator of SOCS gene expression or SOCS protein activity for a time sufficient to modulate signal transduction.
As stated above, the present invention contemplates a range of mimetics or small molecules capable of acting as agonists or antagonists of the SOCS. Such molecules may be obtained from natural product screening such as from coral, soil, plants or the ocean or antarctic environments.
Alternatively, peptide, polypeptide or protein libraries or chemical libraries may be readily screened. For example, M1 cells expressing a SOCS do not undergo differentiation in the presence of IL-6. This system can be used to screen molecules which permit differentiation in the presence of IL-6 and a SOCS. A range of test cells may be prepared to screen for antagonists and agonists for a range of cytokines. Such molecules are preferably small molecules and may be of amino acid origin or of chemical origin. SOCS molecules interacting with signalling proteins (eg. JAKs) provide molecular screens to detect molecules which interfere with or promote this interaction. One such screening protocol involves natural product screening.
In a preferred embodiment the present invention is directed to the use of a polypeptide comprising an SH2 domain of a protein or a derivative, homologue, analogue or mimetic thereof wherein said protein comprises a SOCS box in its C-terminal region in screening for modulators of SOCS protein activity.
Accordingly the present invention contemplates a method of screening for a modulator of SOCS protein activity, said method comprising contacting a preparation containing a polypeptide comprising a SOCS protein SH2 domain or a derivative, homologue, analogue or mimetic thereof and an intracellular ligand or analogue or derivative thereof which interacts with said SH2 domain or derivative, homologue, analogue or mimetic thereof with a test agent and detecting a different level of interaction between said ligand or analogue or derivative thereof and said SH2 domain or derivative, homologue, analogue or mimetic thereof relative to a reference level of said interaction in the absence of said test agent, wherein said different level is indicative of said agent being a modulator of SOCS protein activity.
In one embodiment, a reduced level of interaction is detected.
In an alternate embodiment, an enhanced level of interaction is detected.
Preferably, the intracellular ligand is a member of the Janus family of protein tyrosine kinases (JAKs).
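The comparison against a reference level described in the screening method can be sketched, for illustration only, as follows. The function name `classify_agent` and the 20% tolerance band are assumptions introduced for this example; the specification does not prescribe any particular threshold or signal measure.

```python
# Illustrative sketch: compare the interaction signal measured in the presence
# of a test agent with the reference signal measured in its absence.
def classify_agent(test_signal: float, reference_signal: float,
                   tolerance: float = 0.2) -> str:
    """Return 'reduced', 'enhanced' or 'no change' for a test agent.

    `tolerance` is a hypothetical fractional band around the reference level
    within which the interaction is treated as unchanged.
    """
    if reference_signal <= 0:
        raise ValueError("reference signal must be positive")
    ratio = test_signal / reference_signal
    if ratio < 1 - tolerance:
        return "reduced"      # candidate inhibitor of the SH2-ligand interaction
    if ratio > 1 + tolerance:
        return "enhanced"     # candidate promoter of the interaction
    return "no change"
```

In practice the band would be chosen from the variability of replicate reference measurements rather than fixed in advance.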
The present invention also contemplates a pharmaceutical composition comprising SOCS or a derivative thereof or a modulator of SOCS expression or SOCS activity and one or more pharmaceutically acceptable carriers and/or diluents. These components are referred to as the "active ingredients". These and other aspects of the present invention apply to any SOCS molecules such as but not limited to SOCS1 to SOCS
The pharmaceutical forms containing active ingredients suitable for injectable use include sterile aqueous solutions (where water soluble) and sterile powders for the extemporaneous preparation of sterile injectable solutions. It must be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi.
The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol and liquid polyethylene glycol, and the like), suitable mixtures thereof, and vegetable oils. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. The prevention of the action of microorganisms can be brought about by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, sorbic acid, thimerosal and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars or sodium chloride. Prolonged absorption of the injectable compositions can be brought about by the use in the compositions of agents delaying absorption, for example, aluminum monostearate and gelatin.
Sterile injectable solutions are prepared by incorporating the active compounds in the required amount in the appropriate solvent with various of the other ingredients enumerated above, as required, followed by filtered sterilization. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and the freeze-drying technique which yield a powder of the active ingredient plus any additional desired ingredient from previously sterile-filtered solution thereof.
When the active ingredients are suitably protected they may be orally administered, for example, with an inert diluent or with an assimilable edible carrier, or it may be enclosed in hard or soft shell gelatin capsule, or it may be compressed into tablets. For oral therapeutic administration, the active compound may be incorporated with excipients and used in the form of ingestible tablets, buccal tablets, troches, capsules, elixirs, suspensions, syrups, wafers and the like. Such compositions and preparations should contain at least 1% by weight of active compound. The percentage of the compositions and preparations may, of course, be varied and may conveniently be between about 5 to about 80% of the weight of the unit. The amount of active compound in such therapeutically useful compositions is such that a suitable dosage will be obtained. Preferred compositions or preparations according to the present invention are prepared so that an oral dosage unit form contains between about 0.1 µg and 2000 mg of active compound.
The tablets, troches, pills, capsules and the like may also contain the components as listed hereafter: a binder such as gum, acacia, corn starch or gelatin; excipients such as dicalcium phosphate; a disintegrating agent such as corn starch, potato starch, alginic acid and the like; a lubricant such as magnesium stearate; and a sweetening agent such as sucrose, lactose or saccharin may be added or a flavouring agent such as peppermint, oil of wintergreen or cherry flavouring.
When the dosage unit form is a capsule, it may contain, in addition to materials of the above type, a liquid carrier. Various other materials may be present as coatings or to otherwise modify the physical form of the dosage unit. For instance, tablets, pills, or capsules may be coated with shellac, sugar or both. A syrup or elixir may contain the active compound, sucrose as a sweetening agent, methyl and propylparabens as preservatives, a dye and flavouring such as cherry or orange flavour. Of course, any material used in preparing any dosage unit form should be pharmaceutically pure and substantially non-toxic in the amounts employed. In addition, the active compound(s) may be incorporated into sustained-release preparations and formulations.
The present invention also extends to forms suitable for topical application such as creams, lotions and gels.
Pharmaceutically acceptable carriers and/or diluents include any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents and the like. The use of such media and agents for pharmaceutical active substances is well known in the art. Except insofar as any conventional media or agent is incompatible with the active ingredient, use thereof in the therapeutic compositions is contemplated. Supplementary active ingredients can also be incorporated into the compositions.
It is especially advantageous to formulate parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the mammalian subjects to be treated; each unit containing a predetermined quantity of active material calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier. The specifications for the novel dosage unit forms of the invention are dictated by and directly dependent on the unique characteristics of the active material and the particular therapeutic effect to be achieved, and the limitations inherent in the art of compounding such an active material for the treatment of disease in living subjects having a diseased condition in which bodily health is impaired as herein disclosed in detail.
The principal active ingredient is compounded for convenient and effective administration in effective amounts with a suitable pharmaceutically acceptable carrier in dosage unit form as hereinbefore disclosed. A unit dosage form can, for example, contain the principal active compound in amounts ranging from 0.5 µg to about 2000 mg. Expressed in proportions, the active compound is generally present in from about 0.5 µg to about 2000 mg/ml of carrier. In the case of compositions containing supplementary active ingredients, the dosages are determined by reference to the usual dose and manner of administration of the said ingredients. The effective amount may also be conveniently expressed in terms of an amount per kg of body weight. For example, from about 0.01 ng to about 10,000 mg/kg body weight may be administered.
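The arithmetic behind an amount expressed per kg of body weight can be illustrated with a short sketch. The function name `total_amount_mg` and the example figures are hypothetical; only the outer bounds (about 0.01 ng to about 10,000 mg per kg) are taken from the passage above, and the sketch is not dosing guidance.

```python
# Illustrative sketch: convert an amount per kg of body weight into a total amount.
NG_PER_MG = 1_000_000  # 1 mg = 1,000,000 ng

# Bounds quoted in the text, both expressed in mg per kg of body weight.
MIN_MG_PER_KG = 0.01 / NG_PER_MG   # about 0.01 ng/kg
MAX_MG_PER_KG = 10_000.0           # about 10,000 mg/kg


def total_amount_mg(amount_mg_per_kg: float, body_weight_kg: float) -> float:
    """Return the total amount in mg for a given per-kg amount and body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if not MIN_MG_PER_KG <= amount_mg_per_kg <= MAX_MG_PER_KG:
        raise ValueError("amount per kg lies outside the range quoted in the text")
    return amount_mg_per_kg * body_weight_kg
```

For instance, a hypothetical 2 mg/kg for a 70 kg subject corresponds to a total of 140 mg.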
The pharmaceutical composition may also comprise genetic molecules such as a vector capable of transfecting target cells where the vector carries a nucleic acid molecule capable of modulating SOCS expression or SOCS activity. The vector may, for example, be a viral vector. In this regard, a range of gene therapies are contemplated by the present invention including isolating certain cells, genetically manipulating and returning the cell to the same subject or to a genetically related or similar subject.
Still another aspect of the present invention is directed to antibodies to SOCS and its derivatives.
Such antibodies may be monoclonal or polyclonal and may be selected from naturally occurring antibodies to SOCS or may be specifically raised to SOCS or derivatives thereof. In the case of the latter, SOCS or its derivatives may first need to be associated with a carrier molecule. The antibodies and/or recombinant SOCS or its derivatives of the present invention are particularly useful as therapeutic or diagnostic agents.
For example, SOCS and its derivatives can be used to screen for naturally occurring antibodies to SOCS. These may occur, for example in some autoimmune diseases. Alternatively, specific antibodies can be used to screen for SOCS. Techniques for such assays are well known in the art and include, for example, sandwich assays and ELISA. Knowledge of SOCS levels may be important for diagnosis of certain cancers or a predisposition to cancers or monitoring cytokine mediated cellular responsiveness or for monitoring certain therapeutic protocols.
Antibodies to SOCS of the present invention may be monoclonal or polyclonal. Alternatively, fragments of antibodies may be used such as Fab fragments. Furthermore, the present invention extends to recombinant and synthetic antibodies and to antibody hybrids. A "synthetic antibody" is considered herein to include fragments and hybrids of antibodies. The antibodies of this aspect of the present invention are particularly useful for immunotherapy and may also be used as a diagnostic tool for assessing apoptosis or monitoring the program of a therapeutic regimen.
For example, specific antibodies can be used to screen for SOCS proteins. The latter would be important, for example, as a means for screening for levels of SOCS in a cell extract or other biological fluid or purifying SOCS made by recombinant means from culture supernatant fluid.
Techniques for the assays contemplated herein are known in the art and include, for example, sandwich assays and ELISA.
It is within the scope of this invention to include any second antibodies (monoclonal, polyclonal or fragments of antibodies or synthetic antibodies) directed to the first mentioned antibodies discussed above. Both the first and second antibodies may be used in detection assays or a first antibody may be used with a commercially available anti-immunoglobulin antibody. An antibody as contemplated herein includes any antibody specific to any region of SOCS.
Both polyclonal and monoclonal antibodies are obtainable by immunization with the enzyme or protein and either type is utilizable for immunoassays. The methods of obtaining both types of sera are well known in the art. Polyclonal sera are less preferred but are relatively easily prepared by injection of a suitable laboratory animal with an effective amount of SOCS, or antigenic parts thereof, collecting serum from the animal, and isolating specific sera by any of the known immunoadsorbent techniques. Although antibodies produced by this method are utilizable in virtually any type of immunoassay, they are generally less favoured because of the potential heterogeneity of the product.
The use of monoclonal antibodies in an immunoassay is particularly preferred because of the ability to produce them in large quantities and the homogeneity of the product. The preparation of hybridoma cell lines for monoclonal antibody production derived by fusing an immortal cell line and lymphocytes sensitized against the immunogenic preparation can be done by techniques which are well known to those who are skilled in the art.
Another aspect of the present invention contemplates a method for detecting SOCS in a biological sample from a subject, said method comprising contacting said biological sample with an antibody specific for SOCS or its derivatives or homologues for a time and under conditions sufficient for an antibody-SOCS complex to form and then detecting said complex.
The presence of SOCS may be determined in a number of ways such as by Western blotting and ELISA procedures. A wide range of immunoassay techniques are available as can be seen by reference to US Patent Nos. 4,016,043, 4,424,279 and 4,018,653. These, of course, include both single-site and two-site or "sandwich" assays of the non-competitive types, as well as the traditional competitive binding assays. These assays also include direct binding of a labelled antibody to a target.
Sandwich assays are among the most useful and commonly used assays and are favoured for use in the present invention. A number of variations of the sandwich assay technique exist, and all are intended to be encompassed by the present invention. Briefly, in a typical forward assay, an unlabelled antibody is immobilized on a solid substrate and the sample to be tested brought into contact with the bound molecule. After a suitable period of incubation, for a period of time sufficient to allow formation of an antibody-antigen complex, a second antibody specific to the antigen, labelled with a reporter molecule capable of producing a detectable signal is then added and incubated, allowing time sufficient for the formation of another complex of antibody-antigen-labelled antibody. Any unreacted material is washed away, and the presence of the antigen is determined by observation of a signal produced by the reporter molecule. The results may either be qualitative, by simple observation of the visible signal, or may be quantitated by comparing with a control sample containing known amounts of hapten. Variations on the forward assay include a simultaneous assay, in which both sample and labelled antibody are added simultaneously to the bound antibody. These techniques are well known to those skilled in the art, including any minor variations as will be readily apparent. In accordance with the present invention the sample is one which might contain SOCS including cell extract, tissue biopsy or possibly serum, saliva, mucosal secretions, lymph, tissue fluid and respiratory fluid. The sample is, therefore, generally a biological sample comprising biological fluid but also extends to fermentation fluid and supernatant fluid such as from a cell culture.
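Quantitation by comparison with control samples containing known amounts, as mentioned for the forward assay, can be illustrated with a simple standard curve. The sketch below uses piecewise-linear interpolation between standards, which is an assumption made for this example; the function name `interpolate_amount` and the standard values are hypothetical, and real assays often use other curve fits.

```python
# Illustrative sketch: estimate the amount of antigen in a sample by linear
# interpolation between control samples containing known amounts.
from bisect import bisect_left


def interpolate_amount(signal: float, standards: list[tuple[float, float]]) -> float:
    """Estimate an amount from a signal.

    `standards` holds (known_amount, measured_signal) pairs; the signal is
    assumed to increase with the amount over the range of the standards.
    """
    points = sorted(standards, key=lambda p: p[1])
    signals = [s for _, s in points]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal lies outside the range of the standards")
    i = bisect_left(signals, signal)
    if signals[i] == signal:
        return points[i][0]
    (a0, s0), (a1, s1) = points[i - 1], points[i]
    # Linear interpolation between the two bracketing standards.
    return a0 + (a1 - a0) * (signal - s0) / (s1 - s0)


# Hypothetical standards: (known amount, reporter signal).
example_standards = [(0.0, 0.05), (10.0, 0.25), (20.0, 0.45)]
```

A signal midway between two standards yields an amount midway between their known amounts; signals outside the standards are rejected rather than extrapolated.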
In the typical forward sandwich assay, a first antibody having specificity for the SOCS or antigenic parts thereof, is either covalently or passively bound to a solid surface. The solid surface is typically glass or a polymer, the most commonly used polymers being cellulose, polyacrylamide, nylon, polystyrene, polyvinyl chloride or polypropylene. The solid supports may be in the form of tubes, beads, discs of microplates, or any other surface suitable for conducting an immunoassay. The binding processes are well-known in the art and generally consist of cross-linking, covalently binding or physically adsorbing; the polymer-antibody complex is washed in preparation for the test sample. An aliquot of the sample to be tested is then added to the solid phase complex and incubated for a period of time sufficient (e.g. 2-40 minutes or overnight if more convenient) and under suitable conditions (e.g. room temperature to 37°C) to allow binding of any subunit present in the antibody. Following the incubation period, the antibody subunit solid phase is washed and dried and incubated with a second antibody specific for a portion of the hapten. The second antibody is linked to a reporter molecule which is used to indicate the binding of the second antibody to the hapten.
An alternative method involves immobilizing the target molecules in the biological sample and then exposing the immobilized target to specific antibody which may or may not be labelled with a reporter molecule. Depending on the amount of target and the strength of the reporter molecule signal, a bound target may be detectable by direct labelling with the antibody. Alternatively, a second labelled antibody, specific to the first antibody, is exposed to the target-first antibody complex to form a target-first antibody-second antibody tertiary complex. The complex is detected by the signal emitted by the reporter molecule.
By "reporter molecule" as used in the present specification, is meant a molecule which, by its chemical nature, provides an analytically identifiable signal which allows the detection of antigen-bound antibody. Detection may be either qualitative or quantitative. The most commonly used reporter molecules in this type of assay are either enzymes, fluorophores or radionuclide containing molecules (i.e. radioisotopes) and chemiluminescent molecules.
In the case of an enzyme immunoassay, an enzyme is conjugated to the second antibody, generally by means of glutaraldehyde or periodate. As will be readily recognized, however, a wide variety of different conjugation techniques exist, which are readily available to the skilled artisan.
Commonly used enzymes include horseradish peroxidase, glucose oxidase, beta-galactosidase and alkaline phosphatase, amongst others. The substrates to be used with the specific enzymes are generally chosen for the production, upon hydrolysis by the corresponding enzyme, of a detectable colour change. Examples of suitable enzymes include alkaline phosphatase and peroxidase. It is also possible to employ fluorogenic substrates, which yield a fluorescent product, rather than the chromogenic substrates noted above. In all cases, the enzyme-labelled antibody is added to the first antibody-hapten complex, allowed to bind, and then the excess reagent is washed away.
A solution containing the appropriate substrate is then added to the complex of antibody-antigen-antibody. The substrate will react with the enzyme linked to the second antibody, giving a qualitative visual signal, which may be further quantitated, usually spectrophotometrically, to give an indication of the amount of hapten which was present in the sample. "Reporter molecule" also extends to use of cell agglutination or inhibition of agglutination such as red blood cells on latex beads, and the like.
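The spectrophotometric quantitation described above is commonly done by reading an unknown against a standard curve built from control samples of known concentration. The following is a minimal sketch of that idea only, assuming simple linear interpolation between neighbouring standards; the function name and the standard values are hypothetical and are not taken from the specification.

```python
from bisect import bisect_left


def interpolate_concentration(absorbance, standards):
    """Estimate a concentration from an absorbance reading.

    standards: (concentration, absorbance) pairs measured on control
    samples with known amounts of analyte (hypothetical values here).
    Linear interpolation between the two bracketing standards is an
    assumption of this sketch, not a method stated in the text.
    """
    pts = sorted(standards, key=lambda p: p[1])  # order by absorbance
    abs_values = [a for _, a in pts]
    if not abs_values[0] <= absorbance <= abs_values[-1]:
        raise ValueError("absorbance outside the range of the standard curve")
    i = bisect_left(abs_values, absorbance)
    if abs_values[i] == absorbance:
        return pts[i][0]
    (c0, a0), (c1, a1) = pts[i - 1], pts[i]
    return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)


# Hypothetical standard curve: (concentration in ng/ml, absorbance)
STANDARDS = [(0.0, 0.05), (10.0, 0.25), (20.0, 0.45), (40.0, 0.85)]
estimate = interpolate_concentration(0.35, STANDARDS)
```

Readings outside the range of the standards are rejected rather than extrapolated, since the response of an enzyme-linked assay is not guaranteed to stay linear beyond the measured controls.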
Alternately, fluorescent compounds, such as fluorescein and rhodamine, may be chemically coupled to antibodies without altering their binding capacity. When activated by illumination with light of a particular wavelength, the fluorochrome-labelled antibody absorbs the light energy, inducing a state of excitability in the molecule, followed by emission of the light at a characteristic colour visually detectable with a light microscope. As in the EIA, the fluorescent labelled antibody is allowed to bind to the first antibody-hapten complex. After washing off the unbound reagent, the remaining tertiary complex is then exposed to the light of the appropriate wavelength; the fluorescence observed indicates the presence of the hapten of interest. Immunofluorescence and EIA techniques are both very well established in the art and are particularly preferred for the present method. However, other reporter molecules, such as radioisotope, chemiluminescent or bioluminescent molecules, may also be employed.
The present invention also contemplates genetic assays such as those involving PCR analysis to detect a SOCS gene or its derivatives. Alternative methods or methods used in conjunction include direct nucleotide sequencing or mutation scanning such as single stranded conformation polymorphism analysis (SSCP) and specific oligonucleotide hybridisation, as well as methods such as direct protein truncation tests.
Since cytokines are involved in transcription of some SOCS molecules, the detection of SOCS provides surrogate markers for cytokines or cytokine activity. This may be useful in assessing subjects with a range of conditions such as those with autoimmune diseases, for example, rheumatoid arthritis, diabetes and stiff man syndrome amongst others.
The nucleic acid molecules of the present invention may be DNA or RNA. When the nucleic acid molecule is in DNA form, it may be genomic DNA or cDNA. RNA forms of the nucleic acid molecules of the present invention are generally mRNA.
Although the nucleic acid molecules of the present invention are generally in isolated form, they may be integrated into or ligated to or otherwise fused or associated with other genetic molecules such as vector molecules and in particular expression vector molecules. Vectors and expression vectors are generally capable of replication and, if applicable, expression in one or both of a prokaryotic cell or a eukaryotic cell. Preferably, prokaryotic cells include E. coli, Bacillus sp and Pseudomonas sp. Preferred eukaryotic cells include yeast, fungal, mammalian and insect cells.
Accordingly, another aspect of the present invention contemplates a genetic construct comprising a vector portion and a mammalian and more particularly a human SOCS gene portion, which SOCS gene portion is capable of encoding a SOCS polypeptide or a functional or immunologically interactive derivative thereof.
Preferably, the SOCS gene portion of the genetic construct is operably linked to a promoter on the vector such that said promoter is capable of directing expression of said SOCS gene portion in an appropriate cell.
In addition, the SOCS gene portion of the genetic construct may comprise all or part of the gene fused to another genetic sequence such as a nucleotide sequence encoding glutathione-S-transferase or part thereof.
The present invention extends to such genetic constructs and to prokaryotic or eukaryotic cells comprising same.
The present invention also extends to any or all derivatives of SOCS including mutants, parts, fragments, portions, homologues and analogues or their encoding genetic sequence including single or multiple nucleotide or amino acid substitutions, additions and/or deletions to the naturally occurring nucleotide or amino acid sequence. The present invention also extends to mimetics and agonists and antagonists of SOCS.
The SOCS and its genetic sequence of the present invention will be useful in the generation of a range of therapeutic and diagnostic reagents and will be especially useful in the detection of a cytokine involved in a particular cellular response or a receptor for that cytokine. For example, cells expressing a SOCS gene, such as M1 cells expressing the SOCS1 gene, will no longer be responsive to a particular cytokine such as, in the case of SOCS1, IL-6. Clearly, the present invention further contemplates cells such as M1 cells expressing any SOCS gene such as from SOCS1 to SOCS15. Furthermore, the present invention provides the use of molecules that regulate or potentiate the ability of therapeutic cytokines. For example, molecules which block some SOCS activity may act to potentiate therapeutic cytokine activity (e.g. G-CSF).
Soluble SOCS polypeptides are also contemplated to be particularly useful in the treatment of disease, injury or abnormality involving cytokine mediated cellular responsiveness such as hyperimmunity, immunosuppression, allergies, hypertension and the like.
A further aspect of the present invention contemplates the use of SOCS or its functional derivatives in the manufacture of a medicament for the treatment of conditions involving cytokine mediated cellular responsiveness.
The present invention further contemplates transgenic mammalian cells expressing a SOCS gene.
Such cells are useful indicator cell lines for assaying for suppression of cytokine function. One example is M1 cells expressing a SOCS gene. Such cell lines may be useful for screening for cytokines or screening molecules such as naturally occurring molecules from plants, coral, microorganisms or bio-organically active soil or water capable of acting as cytokine antagonists or agonists.
The present invention further contemplates hybrids between different SOCS from the same or different animal species. For example, a hybrid may be formed between all or a functional part of mouse SOCS1 and human SOCS1. Alternatively, the hybrid may be between all or part of mouse SOCS1 and mouse SOCS2. All such hybrids are contemplated herein and are particularly useful in developing pleiotropic molecules.
The present invention further contemplates a range of genetic based diagnostic assays screening for individuals with defective SOCS genes. Such mutations may result in cell types not being responsive to a particular cytokine or resulting in over responsiveness leading to a range of conditions. The SOCS genetic sequence can be readily verified using a range of PCR or other techniques to determine whether a mutation is resident in the gene. Appropriate gene therapy or other interventionist therapy may then be adopted.
The present invention is further described by the following non-limiting Examples.
Examples 1-16 relate to SOCS1, SOCS2 and SOCS3 which were identified on the basis of activity. Examples 17-24 relate to various aspects of SOCS4 to SOCS15 which were cloned initially on the basis of sequence similarity. Examples 25-36 relate to specific aspects of SOCS4 to SOCS15, respectively.
EXAMPLE 1
CELL CULTURE AND CYTOKINES
The M1 cell line was derived from a spontaneously arising leukaemia in SL mice [Ichikawa, 1969]. Parental M1 cells used in this study have been in passage at the Walter and Eliza Hall Institute for Medical Research, Melbourne, Victoria, Australia, for approximately 10 years. M1 cells were maintained by weekly passage in Dulbecco's modified Eagle's medium (DME) containing 10% foetal bovine serum (FCS). Recombinant cytokines are generally available from commercial sources or were prepared by published methods. Recombinant murine LIF was produced in Escherichia coli and purified, as previously described [Gearing, 1989]. Purified human oncostatin M was purchased from PeproTech Inc (Rocky Hill, NJ, USA), and purified mouse IFN-γ was obtained from Genzyme Diagnostics (Cambridge, MA, USA). Recombinant murine thrombopoietin was produced as a FLAG™-tagged fusion protein in CHO cells and then purified.
EXAMPLE 2
AGAR COLONY ASSAYS
In order to assay the differentiation of M1 cells in response to cytokines, 300 cells were cultured in 35 mm Petri dishes containing 1 ml of DME supplemented with 20% (v/v) foetal calf serum (FCS), agar and 0.1 ml of serial dilutions of IL-6, LIF, OSM, IFN-γ, tpo or dexamethasone (Sigma Chemical Company, St Louis, MO). After 7 days culture at 37°C in a fully humidified atmosphere containing 10% CO2 in air, colonies of M1 cells were counted and classified as differentiated if they were composed of dispersed cells or had a corona of dispersed cells around a tightly packed centre.
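The colony counts from such an assay are typically summarised as the percentage of colonies scored as differentiated at each dilution of the stimulus. The sketch below illustrates that bookkeeping only; the function name, the dilution labels and the counts are hypothetical and do not come from the specification.

```python
def percent_differentiated(differentiated, undifferentiated):
    """Percentage of counted colonies that were scored as differentiated."""
    total = differentiated + undifferentiated
    if total == 0:
        raise ValueError("no colonies counted")
    return 100.0 * differentiated / total


# Hypothetical counts per dish: dilution label -> (differentiated, tight colonies)
counts = {
    "1:1": (95, 5),
    "1:4": (60, 40),
    "1:16": (10, 90),
}

# Dose-response summary, one percentage per dilution
summary = {label: percent_differentiated(d, u) for label, (d, u) in counts.items()}
```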
EXAMPLE 3
GENERATION OF RETROVIRAL LIBRARY
A cDNA expression library was constructed from the factor-dependent haemopoietic cell line FDC-P1, essentially as described [Rayner, 1994]. Briefly, cDNA was cloned into the retroviral vector pRUFneo and then transfected into an amphotropic packaging cell line (PA317).
Transiently generated virus was harvested from the cell supernatant at 48 hr post-transfection, and used to infect ψ2 ecotropic packaging cells, to generate a high titre virus-producing cell line.
EXAMPLE 4
RETROVIRAL INFECTION OF M1 CELLS
Pools of 10^6 infected ψ2 cells were irradiated (3000 rad) and cocultivated with 10^6 M1 cells in DME supplemented with 10% (v/v) FCS and 4 μg/ml Polybrene, for 2 days at 37°C. To select for IL-6-unresponsive clones, retrovirally-infected M1 cells were washed once in DME, and cultured at approximately 2x10^4 cells/ml in 1 ml agar cultures containing 400 μg/ml geneticin (GibcoBRL, Grand Island, NY) and 100 ng/ml IL-6. The efficiency of infection of M1 cells was estimated by agar plating the infected cells in the presence of geneticin only.
EXAMPLE 5
PCR
Genomic DNA from retrovirally-infected M1 cells was digested with Sac I and 1 μg of phenol/chloroform extracted DNA was then amplified by polymerase chain reaction (PCR).
Primers used for amplification of cDNA inserts from the integrated retrovirus were GAG3 (5' CACGCCGCCCACGTGAAGGC 3') [SEQ ID NO: ], which corresponds to the vector gag sequence approximately 30 bp 5' of the multiple cloning site, and HSVTK (5' TTCGCCAATGACAAGACGCT 3') [SEQ ID NO: ], which corresponds to the pMC1neo sequence approximately 200 bp 3' of the multiple cloning site. The PCR entailed an initial denaturation at 94°C for 5 min, 35 cycles of denaturation at 94°C for 1 min, annealing at 56°C for 2 min, and extension at 72°C for 3 min, followed by a final 10 min extension. PCR products were gel purified and then ligated into the pGEM-T plasmid (Promega, Madison, WI), and sequenced using an ABI PRISM Dye Terminator Cycle Sequencing Kit and a Model 373 Automated DNA Sequencer (Applied Biosystems Inc., Foster City, CA).
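As a quick check on the primers and cycling conditions given above, the following sketch computes the length and GC fraction of each primer and the total programmed hold time of the PCR. The helper names are hypothetical, and the run-time figure ignores ramping between temperatures, which depends on the thermal cycler.

```python
# Primer sequences as printed in the text (5' to 3')
PRIMERS = {
    "GAG3": "CACGCCGCCCACGTGAAGGC",
    "HSVTK": "TTCGCCAATGACAAGACGCT",
}


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def programmed_minutes(initial, cycles, step_minutes, final):
    """Sum of programmed hold times in minutes, excluding ramp times."""
    return initial + cycles * sum(step_minutes) + final


# 5 min initial denaturation; 35 cycles of 1 min, 2 min and 3 min; 10 min final extension
total_minutes = programmed_minutes(5, 35, (1, 2, 3), 10)
primer_stats = {name: (len(seq), gc_fraction(seq)) for name, seq in PRIMERS.items()}
```

Both primers are 20 nucleotides long, and the programme adds up to 225 minutes of hold time before any ramping is counted.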
EXAMPLE 6
CLONING OF cDNAs
Independent cDNA clones encoding mouse SOCS1 were isolated from a murine thymus cDNA library essentially as described (Hilton et al, 1994). The nucleotide and predicted amino acid sequences of mouse SOCS1 cDNA were compared to databases using the BLASTN and TFASTA algorithms (Pearson and Lipman, 1988; Pearson, 1990; Altschul et al, 1990). Oligonucleotides were designed from the ESTs encoding human SOCS and mouse SOC-1 and SOCS3 and used to probe commercially available mouse thymus and spleen cDNA libraries. Sequencing was performed using an ABI automated sequencer according to the manufacturer's instructions.
EXAMPLE 7
SOUTHERN AND NORTHERN BLOT ANALYSES AND RT-PCR
32P-labelled probes were generated using a random decanucleotide labelling kit (Bresatec, Adelaide, South Australia) from a 600 bp Pst I fragment encoding neomycin phosphotransferase from the plasmid pPGKneo, a 1070 bp fragment of the SOCS1 gene obtained by digestion of the 1.4 kbp PCR product with Xho I, SOCS2, SOCS3, CIS and a 1.2 kbp fragment of the chicken glyceraldehyde 3-phosphate dehydrogenase gene [Dugaiczyk, 1983].
Genomic DNA was isolated from cells using a proteinase K-sodium dodecyl sulfate procedure essentially as described. Fifteen micrograms of DNA was digested with either BamH I or Sac I, fractionated on an agarose gel, transferred to GeneScreenPlus membrane (Du Pont NEN, Boston MA), prehybridised, hybridised with random-primed 32P-labelled DNA fragments and washed essentially as described [Sambrook, 1989].
Total RNA was isolated from cells and tissues using Trizol Reagent, as recommended by the manufacturer (GibcoBRL, Grand Island, NY). When required, polyA+ mRNA was purified essentially as described [Alexander, 1995]. Northern blots were prehybridised, hybridised with random-primed 32P-labelled DNA fragments and washed as described [Alexander, 1995].
To assess the induction of SOCS genes by IL-6, mice (C57BL6) were injected intravenously with Aug IL-6 followed by harvest of the liver at the indicated timepoints after injection. M1 cells were cultured in the presence of 20 ng/ml IL-6 and harvested at the indicated times. For RT-PCR analysis, bone marrow cells were harvested as described (Metcalf et al, 1995) and stimulated for 1 hr at 37°C with 100 ng/ml of a range of cytokines. RT-PCR was performed on total RNA as described (Metcalf et al, 1995). PCR products were resolved on an agarose gel and Southern blots were hybridised with probes specific for each SOCS family member. Expression of β-actin was assessed to ensure uniformity of amplification.
EXAMPLE 8
DNA CONSTRUCTS AND TRANSFECTION
A cDNA encoding epitope-tagged SOCS1 was generated by subcloning the entire SOCS1 coding region into the pEF-BOS expression vector [Mizushima, 1990], engineered to encode an in-frame FLAG epitope downstream of an initiation methionine (pF-SOCS1). Using electroporation as described previously [Hilton, 1994], M1 cells expressing the thrombopoietin receptor (M1.mpl) were transfected with 20 μg of Aat II-digested pF-SOCS1 expression plasmid and 2 μg of a Sca I-digested plasmid in which transcription of a cDNA encoding puromycin N-acetyl transferase was driven from the mouse phosphoglycerokinase promoter (pPGKPuropA). After 48 hours in culture, transfected cells were selected with 20 μg/ml puromycin (Sigma Chemical Company, St Louis MO), and screened for expression of SOCS1 by Western blotting, using the M2 anti-FLAG monoclonal antibody according to the manufacturer's instructions (Eastman Kodak, Rochester NY). In other experiments M1 cells were transfected with only the pF-SOCS1 plasmid or a control and selected by their ability to grow in agar in the presence of 100 ng/ml of IL-6.
EXAMPLE 9
IMMUNOPRECIPITATION AND WESTERN BLOTTING
Prior to either immunoprecipitation or Western blotting, 10^7 M1 cells or their derivatives were washed twice, resuspended in 1 ml of DME, and incubated at 37°C for 30 min. The cells were then stimulated for 4 min at 37°C with either saline or 100 ng/ml IL-6, after which sodium vanadate (Sigma Chemical Co., St Louis, MO) was added to a concentration of 1 mM. Cells were placed on ice, washed once with saline containing 1 mM sodium vanadate, and then solubilised for 5 min on ice with 300 μl 1% Triton X-100, 150 mM NaCl, 2 mM EDTA, 50 mM Tris-HCl pH 7.4, containing Complete protease inhibitors (Boehringer Mannheim, Mannheim, Germany) and 1 mM sodium vanadate. Lysates were cleared by centrifugation and quantitated using a Coomassie Protein Assay Reagent (Pierce, Rockford IL).
For immunoprecipitations, equal concentrations of protein extracts (1-2 mg) were incubated for 1 hr or overnight at 4°C with either 4 μg of anti-gp130 antibody (M20; Santa Cruz Biotechnology Inc., Santa Cruz, CA) or 4 μg of anti-phosphotyrosine antibody (4G10; Upstate Biotechnology Inc., Lake Placid NY), and 15 μl packed volume of Protein G Sepharose (Pharmacia, Uppsala, Sweden) [Hilton et al, 1996]. Immunoprecipitates were washed twice in 1% NP40, 150 mM NaCl, 50 mM Tris-HCl pH 8.0, containing Complete protease inhibitors (Boehringer Mannheim, Mannheim, Germany) and 1 mM sodium vanadate. The samples were heated for 5 min at in SDS sample buffer (625 mM Tris-HCl pH 6.8, 0.05% SDS, 0.1% glycerol, bromophenol blue, 0.125% 2-mercaptoethanol), fractionated by SDS-PAGE and immunoblotted as described above.
For Western blotting, 10 μg of protein from a cellular extract or material from an immunoprecipitation reaction was loaded onto 4-15% Ready gels (Bio-Rad Laboratories, Hercules CA), and resolved by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE).
Proteins were transferred to PVDF membrane (Micron Separations Inc., Westborough MA) for 1 hr at 100 V. The membranes were probed with the following primary antibodies: anti-tyrosine phosphorylated STAT3 (1:1000 dilution; New England Biolabs, Beverly, MA); anti-STAT3 (C-20; 1:100 dilution; Santa Cruz Biotechnology Inc., Santa Cruz CA); anti-gp130 (M20, 1:100 dilution; Santa Cruz Biotechnology Inc., Santa Cruz CA); anti-phosphotyrosine (horseradish peroxidase-conjugated RC20, 1:5000 dilution; Transduction Laboratories, Lexington KY); anti-tyrosine phosphorylated MAP kinase and anti-MAP kinase antibodies (1:1000 dilution; New England Biolabs, Beverly, MA). Blots were visualised using peroxidase-conjugated secondary antibodies and Enhanced Chemiluminescence (ECL) reagents according to the manufacturer's instructions (Pierce, Rockford IL).
EXAMPLE 10
ELECTROPHORETIC MOBILITY SHIFT ASSAYS
Assays were performed as described [Novak, 1995], using the high affinity SIF (c-sis-inducible factor) binding site m67 [Wakao, 1994]. Protein extracts were prepared from M1 cells incubated for 4-10 min at 37°C in 10 ml serum-free DME containing either saline, 100 ng/ml IL-6 or 100 ng/ml IFN-γ. The binding reactions contained 4-6 μg protein (constant within a given experiment), 5 ng 32P-labelled m67 oligonucleotide, and 800 ng sonicated salmon sperm DNA.
For certain experiments, protein samples were preincubated with an excess of unlabelled m67 oligonucleotide, or antibodies specific for either STAT1 (Transduction Laboratories, Lexington, KY) or STAT3 (Santa Cruz Biotechnology Inc., Santa Cruz CA), as described [Novak, 1995].
Western blots were performed using anti-tyrosine phosphorylated STAT3 or anti-STAT3 (New England Biolabs, Beverly, MA) or anti-gp130 (Santa Cruz Biotechnology Inc.) as described (Nicola et al, 1996). EMSA were performed using the m67 oligonucleotide probe, as described (Novak et al, 1995).
EXAMPLE 11
EXPRESSION CLONING OF A NOVEL SUPPRESSOR OF CYTOKINE SIGNAL TRANSDUCTION
In order to identify cDNAs capable of suppressing cytokine signal transduction, an expression cloning approach was adopted. This strategy centred on M1 cells, a monocytic leukaemia cell line that differentiates into mature macrophages and ceases proliferation in response to the cytokines IL-6, LIF, OSM and IFN-γ, and the steroid dexamethasone. Parental M1 cells were infected with the RUFneo retrovirus, into which cDNAs from the factor-dependent haemopoietic cell line FDC-P1 had been cloned. In this retrovirus, transcription of both the neomycin resistance gene and the cloned cDNA was driven off the powerful constitutive promoter present in the retroviral LTR (Figure). When cultured in semi-solid agar, parental M1 cells form large tightly packed colonies. Upon stimulation with IL-6, M1 cells undergo rapid differentiation, resulting in the formation in agar of only single macrophages or small dispersed clusters of cells. Retrovirally-infected M1 cells that were unresponsive to IL-6 were selected in semi-solid agar culture by their ability to form large, tightly packed colonies in the presence of IL-6 and geneticin. A single stable IL-6-unresponsive clone, 4A2, was obtained after examining 10^4 infected cells.
A fragment of the neomycin phosphotransferase (neo) gene was used to probe a Southern blot of genomic DNA from clone 4A2 and this revealed that the cell line was infected with a single retrovirus containing a cDNA approximately 1.4 kbp in length (Figure). PCR amplification using primers from the retroviral vector which flanked the cDNA cloning site enabled recovery of a 1.4 kbp cDNA insert, which we have named suppressor of cytokine signalling-1, or SOCS1.
This PCR product was used to probe a similar Southern blot of 4A2 genomic DNA and hybridised to two fragments, one which corresponded to the endogenous SOCS1 gene and the other, which matched the size of the band seen using the neo probe, corresponded to the SOCS1 cDNA cloned into the integrated retrovirus (Figure). The latter was not observed in an M1 cell clone infected with a retrovirus containing an irrelevant cDNA. Similarly, Northern blot analysis revealed that SOCS1 mRNA was abundant in the cell line 4A2, but not in the control infected M1 cell clone (Figure 2).
EXAMPLE 12
SOCS1, SOCS2, SOCS3 AND CIS DEFINE A NEW FAMILY OF SH2-CONTAINING PROTEINS
The SOCS1 PCR product was used as a probe to isolate homologous cDNAs from a mouse thymus cDNA library. The sequence of the cDNAs proved to be identical to the PCR product, suggesting that constitutive or over expression, rather than mutation, of the SOCS1 protein was sufficient for generating an IL-6-unresponsive phenotype. Comparison of the sequence of SOCS1 cDNA with nucleotide sequence databases revealed that it was present on mouse and rat genomic DNA clones containing the protamine gene cluster found on mouse chromosome 16. Closer inspection revealed that the 1.4 kb SOCS1 sequence was not homologous to any of the protamine genes, but rather represented a previously unidentified open reading frame located at the extreme 3' end of these clones (Figure). There were no regions of discontinuity between the sequences of the SOCS1 cDNA and genomic locus, suggesting that SOCS1 is encoded by a single exon. In addition to the genomic clone containing the protamine genes, a series of murine and human expressed sequence tags (ESTs) also revealed large blocks of nucleotide sequence identity to mouse SOCS1. The sequence information provided by the human ESTs allowed the rapid cloning of cDNAs encoding human SOCS1.
The mouse and rat SOCS1 genes encode a 212 amino acid protein whereas the human SOCS1 gene encodes a 211 amino acid protein. Mouse, rat and human SOCS1 proteins share 95-99% amino acid identity (Figure). A search of translated nucleic acid databases with the predicted amino acid sequence of SOCS1 showed that it was most related to a recently cloned cytokine-inducible immediate early gene product, CIS, and two classes of ESTs. Full length cDNAs from the two classes of ESTs were isolated and found to encode proteins of similar length and overall structure to SOCS1 and CIS. These clones were given the names SOCS2 and SOCS3. Each of the four proteins contains a central SH2 domain and a C-terminal region termed the SOCS motif.
The SOCS proteins exhibit an extremely high level of amino acid sequence similarity (95-99% identity) amongst different species. However, the forms of SOCS1, SOCS2, SOCS3 and CIS from the same animal, while clearly defining a new family of SH2-containing proteins, exhibited a lower amino acid identity. SOCS2 and CIS exhibit approximately 38% amino acid identity, while the remaining members of the family share approximately 25% amino acid identity (Figure). The coding regions of the genes for SOCS1 and SOCS3 appear to contain no introns while the coding regions of the genes for SOCS2 and CIS contain one and two introns, respectively.
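Amino acid identity figures of the kind quoted above are derived from pairwise alignments. The sketch below shows one common way of computing percent identity from two sequences that have already been aligned; it assumes that positions containing a gap in either sequence are excluded from the denominator, which is only one of several conventions in use and is not stated in the specification. The function name and the toy sequences are hypothetical and are not SOCS sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy pre-aligned peptides (hypothetical): 7 matches over 8 ungapped columns
identity = percent_identity("MVAHNQ-VA", "MVARNQSVA")
```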
The Genbank Accession Numbers for the sequences referred to herein are mouse SOCS1 cDNA (U88325), human SOCS1 cDNA (U88326), mouse SOCS2 cDNA (U88327), mouse SOCS3 cDNA (U88328).
EXAMPLE 13
CONSTITUTIVE EXPRESSION OF SOCS1 SUPPRESSES THE ACTION OF A RANGE OF CYTOKINES
To formally establish that the phenotype of the 4A2 cell line was directly related to expression of SOCS1 and not to unrelated genetic changes which may have occurred independently in these cells, a cDNA encoding an epitope-tagged version of SOCS1 under the control of the EF1α promoter was transfected into parental M1 cells, and M1 cells expressing the receptor for thrombopoietin, c-mpl (M1.mpl). Transfection of the SOCS1 expression vector into both cell lines resulted in an increase in the frequency of IL-6-unresponsive M1 cells.
Multiple independent clones of M1 cells expressing SOCS1, as detected by Western blot, displayed a cytokine-unresponsive phenotype that was indistinguishable from 4A2. Further, if transfectants were not maintained in puromycin, expression of SOCS1 was lost over time and cells regained their cytokine responsiveness. In the absence of cytokine, colonies derived from 4A2 and other SOCS1 expressing clones characteristically grew to a smaller size than colonies formed by control M1 cells (Figure). The effect of constitutive SOCS1 expression on the response of M1 cells to a range of cytokines was investigated using the 4A2 cell line and a clone of M1.mpl cells expressing SOCS1 (M1.mpl.SOCS1). Unlike parental M1 cells and M1.mpl cells, the two cell lines expressing SOCS1 continued to proliferate and failed to form differentiated colonies in response to either IL-6, LIF, OSM, IFN-γ or, in the case of the M1.mpl.SOCS1 cell line, thrombopoietin (Figure 4).
For both cell lines, however, a normal response to dexamethasone was observed, suggesting that SOCS1 specifically affected cytokine signal transduction rather than differentiation per se.
Consistent with these data, while parental M1 cells and M1.mpl cells became large and vacuolated in response to IL-6, 4A2 and M1.mpl.SOCS1 cells showed no evidence of morphological differentiation in response to IL-6 or other cytokines (Figure).
EXAMPLE 14
SOCS1 INHIBITS A RANGE OF IL-6 SIGNAL TRANSDUCTION PROCESSES, INCLUDING STAT3 PHOSPHORYLATION AND ACTIVATION
Phosphorylation of the cell surface receptor component gp130, the cytoplasmic tyrosine kinase JAK1 and the transcription factor STAT3 is thought to play a central role in IL-6 signal transduction. These events were compared in the parental M1 and M1.mpl cell lines and their SOCS1-expressing counterparts. As expected, gp130 was phosphorylated rapidly in response to IL-6 in both parental lines; however, this was reduced five- to ten-fold in the cell lines expressing SOCS1 (Figure). Likewise, STAT3 phosphorylation was also reduced by approximately ten-fold in response to IL-6 in those cell lines expressing SOCS1 (Figure). Consistent with a reduction in STAT3 phosphorylation, activation of specific STAT DNA binding complexes, as determined by electrophoretic mobility shift assay, was also reduced. Notably, there was a reduction in the formation of SIF-A (containing STAT3), SIF-B (STAT1/STAT3 heterodimer) and SIF-C (containing STAT1), the three STAT complexes induced in M1 cells stimulated with IL-6 (Figure). Similarly, constitutive expression of SOCS1 also inhibited IFN-γ-stimulated formation of p91 homodimers (Figure). STAT phosphorylation and activation were not the only cytoplasmic processes to be affected by SOCS1 expression, as the phosphorylation of other proteins, including shc and MAP kinase, was reduced to a similar extent (Figure 7).
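Fold reductions such as those reported above are often estimated from band intensities on a blot, with each phospho-specific signal normalised to a loading signal from the same lane. The specification does not state how its fold values were obtained, so the following is only an illustrative sketch under that assumption; the function name and the intensity values are hypothetical.

```python
def fold_reduction(parental_signal, parental_loading, test_signal, test_loading):
    """Fold reduction of a normalised signal in a test line relative to a parental line.

    Each signal (for example a phospho-specific band) is divided by a
    loading signal from the same lane (for example the total protein band)
    before the two lines are compared.
    """
    if parental_loading <= 0 or test_loading <= 0:
        raise ValueError("loading signals must be positive")
    if test_signal <= 0:
        raise ValueError("test signal must be positive to compute a ratio")
    parental = parental_signal / parental_loading
    test = test_signal / test_loading
    return parental / test


# Hypothetical densitometry values in arbitrary units
reduction = fold_reduction(800.0, 400.0, 100.0, 400.0)
```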
EXAMPLE 15
TRANSCRIPTION OF THE SOCS1 GENE IS STIMULATED BY IL-6 IN VITRO AND IN VIVO
Although SOCS1 can inhibit cytokine signal transduction when constitutively expressed in M1 cells, this does not necessarily indicate that SOCS1 normally functions to negatively regulate an IL-6 response. In order to investigate this possibility the inventors determined whether transcription of the SOCS1 gene is regulated in the response of M1 cells to IL-6 and, because of the critical role IL-6 plays in regulating the acute phase response to injury and infection, the response of the liver to intravenous injection of 5 mg IL-6. In the absence of IL-6, SOCS1 mRNA was undetectable in either M1 cells or in the liver. However, for both cell types, a 1.4 kb SOCS1 transcript was induced within 20 to 40 minutes by IL-6 (Figure). For M1 cells, where the IL-6 was present throughout the experiment, the level of SOCS1 mRNA remained elevated (Figure 8).
In contrast, IL-6 was administered in vivo by a single intravenous injection and was rapidly cleared from the circulation, resulting in a pulse of IL-6 stimulation to the liver. Consistent with this, transient expression of SOCS mRNA was detectable in the liver, peaking approximately minutes after injection and declining to basal levels within 4 hours (Figure 8).
EXAMPLE 16
REGULATION OF SOCS GENES
Since CIS was cloned as a cytokine-inducible immediate early gene, the inventors examined whether SOCS1, SOCS2 and SOCS3 were similarly regulated. The basal pattern of expression of the four SOCS genes was examined by Northern blot analysis of mRNA from a variety of tissues from male and female C57BL/6 mice (Figure 11A). Constitutive expression of SOCS1 was observed in the thymus and to a lesser extent in the spleen and the lung. SOCS2 expression was restricted primarily to the testis and in some animals the liver and lung; for SOCS3 a low level of expression was observed in the lung, spleen and thymus, while CIS expression was more widespread, including the testis, heart, lung, kidney and, in some animals, the liver.
The inventors sought to determine whether expression of the four SOCS genes was regulated by IL-6. Northern blots of mRNA prepared from the livers of untreated and IL-6-injected mice, or from unstimulated and IL-6-stimulated M1 cells, were hybridised with labelled fragments of SOCS1, SOCS2, SOCS3 and CIS cDNAs (Figure 11B). Expression of all four SOCS genes was increased in the liver following IL-6 injection; however, the kinetics of induction appeared to differ. Expression of SOCS1 and SOCS3 was transient in the liver, with mRNA detectable after minutes of IL-6 injection and declining to basal levels within 4 hours for SOCS1 and 8 hours for SOCS3. Induction of SOCS2 and CIS mRNA in the liver followed similar initial kinetics to that of SOCS1, but was maintained at an elevated level for at least 24 hours. A similar induction of SOCS gene mRNA was observed in other organs, notably the lung and the spleen. In contrast, in M1 cells, while SOCS1 and CIS mRNA were induced by IL-6, no induction of either SOCS2 or SOCS3 expression was detected. This result highlights cell type-specific differences in the expression of the genes of SOCS family members in response to the same cytokine.
In order to examine the spectrum of cytokines capable of inducing transcription of the various members of the SOCS gene family, bone marrow cells were stimulated for an hour with a range of cytokines, after which mRNA was extracted and cDNA was synthesised. PCR was then used to assess the expression of SOCS1, SOCS2, SOCS3 and CIS (Figure 11C). In the absence of stimulation, little or no expression of any of the SOCS genes was detectable in bone marrow by PCR. Stimulation of bone marrow cells with a broad array of cytokines appeared capable of upregulating mRNA for one or more members of the SOCS family. IFNγ, for example, induced expression of all four SOCS genes, while erythropoietin, granulocyte colony-stimulating factor, granulocyte-macrophage colony-stimulating factor and interleukin-3 induced expression of SOCS2, SOCS3 and CIS. Interestingly, tumor necrosis factor alpha, macrophage colony-stimulating factor and interleukin-1, which act through receptors that do not fall into the type I cytokine receptor class, also appeared capable of inducing expression of SOCS3 and CIS, suggesting that SOCS proteins may play a broader role in regulating signal transduction.
As constitutive expression of SOCS1 inhibited the response of M1 cells to a range of cytokines, the inventors examined whether phosphorylation of the cell surface receptor component gp130 and the transcription factor STAT3, which are thought to play a central role in IL-6 signal transduction, were affected. These events were compared in the parental M1 and M1.mpl cell lines and their SOCS1-expressing counterparts. As expected, gp130 was phosphorylated rapidly in response to IL-6 in both parental lines; however, this was reduced in the cell lines expressing SOCS1 (Figure 12A). Likewise, STAT3 phosphorylation was also reduced in response to IL-6 in those cell lines expressing SOCS1 (Figure 12A). Consistent with a reduction in STAT3 phosphorylation, activation of specific STAT/DNA binding complexes, as determined by electrophoretic mobility shift assay, was also reduced. Notably, there was a failure to form SIF-A (containing STAT3) and SIF-B (STAT1/STAT3 heterodimer), the major STAT complexes induced in M1 cells stimulated with IL-6 (Figure 12B). Similarly, constitutive expression of SOCS1 also inhibited IFNγ-stimulated formation of SIF-C (STAT1 homodimer; Figure 12B).
These experiments are consistent with the proposal that SOCS1 inhibits signal transduction upstream of receptor and STAT phosphorylation, potentially at the level of the JAK kinases.
The ability of SOCS1 to inhibit signal transduction and ultimately the biological response to cytokines suggests that, like the SH2-containing phosphatase SHP-1 [Ihle et al, 1994; Yi et al, 1993], the SOCS proteins may play a central role in controlling the intensity and/or duration of a cell's response to a diverse range of extracellular stimuli by suppressing the signal transduction process. The evidence provided here indicates that the SOCS family acts in a classical negative feedback loop for cytokine signal transduction. Like other genes such as OSM, expression of genes encoding the SOCS proteins is induced by cytokines through the activation of STATs.
Once expressed, it is proposed that the SOCS proteins inhibit the activity of JAKs and so reduce the phosphorylation of receptors and STATs, thereby suppressing signal transduction and any ensuing biological response. Importantly, inhibition of STAT activation will, over time, lead to a reduction in SOCS gene expression, allowing cells to regain responsiveness to cytokines.
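The proposed feedback loop can be illustrated with a toy numerical sketch. It is not part of the experimental work described here: the rate constants, the time step, the form of the inhibition term and the names `simulate`, `sustained` and `pulse` are all assumptions chosen only for illustration. STAT activity is taken to drive SOCS expression, and SOCS in turn is taken to damp STAT activity.

```python
def simulate(stimulus, steps=500, dt=0.1,
             k_induce=1.0, k_decay=0.5, k_inhibit=1.0):
    """Euler integration of a two-variable negative feedback loop.

    All parameters are hypothetical and unitless; they are not measured values.
    stimulus: function of time returning the cytokine level.
    Returns (stat_trace, socs_trace), one value per time step.
    """
    socs = 0.0
    stat_trace, socs_trace = [], []
    for i in range(steps):
        # STAT activation is driven by cytokine and damped by SOCS.
        stat = stimulus(i * dt) / (1.0 + socs / k_inhibit)
        # SOCS is induced by active STAT and decays at a constant rate.
        socs += dt * (k_induce * stat - k_decay * socs)
        stat_trace.append(stat)
        socs_trace.append(socs)
    return stat_trace, socs_trace


def sustained(t):
    """Cytokine present throughout, as for cells kept in IL-6."""
    return 1.0


def pulse(t):
    """Cytokine present only briefly, as after a single injection."""
    return 1.0 if t < 1.0 else 0.0


if __name__ == "__main__":
    stat_s, socs_s = simulate(sustained)
    stat_p, socs_p = simulate(pulse)
    print("sustained: final STAT activity", round(stat_s[-1], 3),
          "final SOCS", round(socs_s[-1], 3))
    print("pulse: peak SOCS", round(max(socs_p), 3),
          "final SOCS", round(socs_p[-1], 6))
```

Under these assumptions, sustained stimulation settles at an elevated SOCS level with reduced STAT activity, whereas a pulse gives a transient rise in SOCS that returns to baseline, after which the model cell responds to cytokine as before.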
EXAMPLE 17 DATABASE SEARCHES
The NCBI genetic sequence database (GenBank), which encompasses the major database of expressed sequence tags (ESTs), and the TIGR database of human expressed sequence tags were searched for sequences with similarity to a consensus SOCS box sequence using the TFASTA and MOTIF/PATTERN algorithms [Pearson, 1990; Cockwell and Giles, 1989]. Using the software package SRS [Etzold et al, 1996], ESTs that exhibited similarity to the SOCS box (and their partners derived from sequencing the other end of cDNAs) were retrieved and assembled into contigs using Autoassembler (Applied Biosystems, Foster City, CA). Consensus nucleotide sequences derived from overlapping ESTs were then used to search the various databases using BLASTN [Altschul et al, 1990]. Again, positive ESTs were retrieved and added to the contig.
This process was repeated until no additional ESTs could be recovered. Final consensus nucleotide sequences were then translated using Sequence Navigator (Applied Biosystems, Foster City, CA).
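The search-and-extend cycle described above is a fixed-point iteration: search with the current contig, add any new hits, and stop when a round recovers nothing new. The following is a minimal sketch of that loop only. The `search` callable stands in for a BLASTN search with the current consensus and is a hypothetical placeholder, as are the EST names and the `OVERLAPS` table; no real database or assembler is called.

```python
from typing import Callable, Iterable, Set


def extend_contig(seed_ests: Iterable[str],
                  search: Callable[[Set[str]], Set[str]],
                  max_rounds: int = 50) -> Set[str]:
    """Repeat search-and-add until a round recovers no additional ESTs."""
    contig = set(seed_ests)
    for _ in range(max_rounds):
        hits = search(contig)   # placeholder for a database search
        new = hits - contig
        if not new:             # nothing new recovered: stop
            break
        contig |= new
    return contig


# Hypothetical overlap table used only to exercise the loop.
OVERLAPS = {
    "est_a": {"est_b"},
    "est_b": {"est_a", "est_c"},
    "est_c": {"est_b"},
    "est_x": {"est_y"},
    "est_y": {"est_x"},
}


def toy_search(contig: Set[str]) -> Set[str]:
    """Return every EST that overlaps any member of the contig."""
    hits: Set[str] = set()
    for est in contig:
        hits |= OVERLAPS.get(est, set())
    return hits


if __name__ == "__main__":
    print(sorted(extend_contig({"est_a"}, toy_search)))
```

The `max_rounds` cap is a safety limit added for the sketch; the procedure in the text simply continues until no further ESTs are found.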
The ESTs encoding the new SOCS proteins are as follows:
human SOCS4 (EST81149, EST180909, EST182619, ya99h09, ye70c04, yh53c09, yh77g11, yh87h05, yi45h07, yj04e06, yq12h06, yq56a06, yq60e02, yq92g03, yq97h06, yr90f01, yt69c03, yv30a08, yv55f07, yv57h09, yv87h02, yv98e11, yw68d10, yw82a03, yx08a07, yx72h06, yx76b09, yy37h08, yy66b02, za81f08, zb18f07, zc06e08, zd14g06, zd51h12, zd52b09, ze25g11, ze69f02, zf54f03, zh96e07, zv66h12, zs83a08 and zs83g08).
mouse SOCS-4 (mc65f04, mf42e06, mp10c10, mr81g09 and mt19h12).
human SOCS-5 (EST15B103, EST15B105, EST27530 and zf50f01).
mouse SOCS-5 (mc55a01, mh98f09, my26h12 and ve24e06).
human SOCS-6 (yffileO8, yf93a09, yg05f12, yg41f04, yg45c02, yh11f10, yh13b05, zc35a12, ze02h08, A109a03, z169e10, zn39d08 and zo39e06).
mouse SOCS-6 (=c04c05, md48a03, mf31d03, mh26b07, mh78e11, mh88h09, mh94h07, mi27h04, mj29c05, mp66g04, mw75g03, va53b05, vb34h02, vc55d07, vc59e05, vc67d03, vc68d10, vc97h01, vc99c08, vd07h03, vd08c01, vd09b12, vd19b02, vd29a04 and vd46d06).
human SOCS-7 (STS W13017 1, EST00939, EST12913, yc29b05, yp49f10, zt10f03 and zx73g04).
mouse SOCS-7 (mj39a01 and vi52h07).
mouse SOCS-8 (mj6e09 and vj27a029).
human SOCS-9 (CSRL-82f2-u, EST114054, yy06b07, yy06g06, zr40c09, zr72h01, yx92c08, yx93b08 and hfe0662).
mouse SOCS-9 (me65d05).
human SOCS-10 (aa48h10, zp97h12, zq08h01, zr34g05, EST73000 and HSDHEIOO5).
mouse SOCS-10 (mb14d12, mb40f06, mg89b11, mq89e12, mp03g12 and vh53cl).
human SOCS-11 (zt24h06 and zr43b02).
human SOCS-13 (EST59161).
mouse SOCS-13 (ma39a09, me60c05, mi78g05, mkl~cl 1, mo48g12, mp94a01, vb57c07 and vh07c11).
human SOCS-14 (mi75e03, vd29h11 and vd53g07).
EXAMPLE 18 cDNA CLONING
Based on the consensus sequences derived from overlapping ESTs, oligonucleotides were designed that were specific to various members of the SOCS family. As described above, oligonucleotides were labelled and used to screen commercially available genomic and cDNA libraries cloned with λ bacteriophage. Genomic and/or cDNA clones covering the entire coding region of mouse SOCS4, mouse SOCS5 and mouse SOCS6 were isolated. The entire gene for SOCS15 is on the human 12p13 BAC (GenBank Accession Number HSU47924) and the mouse chromosome 6 BAC (GenBank Accession Number AC002393). Partial cDNAs for mouse SOCS7, SOCS9, SOCS10, SOCS11, SOCS12, SOCS13 and SOCS14 were also isolated.
EXAMPLE 19 NORTHERN BLOTS AND RT-PCR
Northern blots were performed as described above. The sources of hybridisation probes were as follows: (i) the entire coding region of the mouse SOCS1 cDNA, (ii) a 1059 bp PCR product derived from the coding region of SOCS5 upstream of the SH2 domain, (iii) the entire coding region of the mouse SOCS6 cDNA, (iv) a 790 bp PCR product derived from the coding region of a partial SOCS7 cDNA and (v) a 1200 bp Pst I fragment of the chicken glyceraldehyde 3-phosphate dehydrogenase (GAPDH) cDNA.
EXAMPLE 15 ADDITIONAL MEMBERS OF SOCS FAMILY
SOCS1, SOCS2 and SOCS3 are members of the SOCS protein family identified in Examples 1-16. Each contains a central SH2 domain and a conserved motif at the C-terminus, named the SOCS box. In order to isolate further members of this protein family, various DNA databases were searched with the amino acid sequence corresponding to conserved residues of the SOCS box. This search revealed the presence of human and mouse ESTs encoding twelve further members of the SOCS protein family (Figure 13). Using this sequence information, cDNAs encoding SOCS4, SOCS5, SOCS6, SOCS7, SOCS9, SOCS10, SOCS11, SOCS12, SOCS13, SOCS14 and SOCS15 have been isolated. Further analysis of contigs derived from ESTs and cDNAs revealed that the SOCS proteins could be placed into three groups according to their predicted structure N-terminal of the SOCS box. The three groups are those with (i) SH2 domains, (ii) WD-40 repeats and (iii) ankyrin repeats.
EXAMPLE 21 SOCS PROTEIN WITH SH2 DOMAINS
Eight SOCS proteins with SH2 domains have been identified. These include SOCS1, SOCS2 and SOCS3, SOCS5, SOCS9, SOCS11 and SOCS14 (Figure 13). Full length cDNAs were isolated for mouse SOCS5 and SOCS14 and partial clones encoding mouse SOCS9 and SOCS14.
Analysis of primary amino acid sequence and genomic structure suggests that pairs of these proteins (SOCS1 and SOCS3, SOCS2 and CIS, SOCS5 and SOCS14, and SOCS9 and SOCS11) are most closely related (Figure 13). Indeed, the SH2 domains of SOCS5 and SOCS14 are almost identical (Figure 13B), and unlike CIS, SOCS1, SOCS2 and SOCS3, SOCS5 and SOCS14 have an extensive, though less well conserved, N-terminal region preceding their SH2 domains (Figure 13A).
EXAMPLE 22 SOCS PROTEINS WITH WD-40 REPEATS
Four SOCS proteins with WD-40 repeats were identified. As with the SOCS proteins with SH2 domains, pairs of these proteins appeared to be closely related. Full length cDNAs of mouse SOCS4 and SOCS6 were isolated and shown to encode proteins containing eight WD-40 repeats N-terminal of the SOCS box (Figure 13), and SOCS4 and SOCS6 share 65% amino acid similarity. SOCS15 was recognised as an open reading frame upon sequencing BACs from human chromosome 12p13 and the syntenic region of mouse chromosome 6 [Ansari-Lari et al, 1997]. In the human, chimp and mouse, SOCS15 is encoded by a gene with two coding exons that lies within a few hundred base pairs of the 3' end of the triose phosphate isomerase (TPI) gene, but which is encoded on the opposite strand to TPI. In addition to a C-terminal SOCS box, the SOCS15 protein contains four WD-40 repeats. Interestingly, within the EST databases, there are sequences of nematode, insect and fish relatives of SOCS15. SOCS15 appears most closely related to SOCS13.
EXAMPLE 23 SOCS PROTEINS WITH ANKYRIN REPEATS
Three SOCS proteins with ankyrin repeats were identified. Analysis of partial cDNAs of mouse SOCS7, SOCS10 and SOCS12 demonstrated the presence of multiple ankyrin repeats.
EXAMPLE 24 EXPRESSION PATTERN OF SOCS PROTEINS
The expression of mRNA from representative members of each class of SOCS proteins (SOCS1 and SOCS5 from the SH2 domain group, SOCS6 from the WD-40 repeat group and SOCS7 from the ankyrin repeat group) was examined. As shown above, SOCS1 mRNA is found in abundance in the thymus and at lower levels in other adult tissues.
Since transcription of the SOCS1 gene is induced by cytokines, the inventors sought to determine whether levels of SOCS5, SOCS6 and SOCS7 mRNA increased upon cytokine stimulation. In the livers of mice injected with IL-6, SOCS1 mRNA is detectable after 20 min and decreases to background levels within 2 hours. In contrast, the kinetics of SOCS5 mRNA expression are quite different, being only detectable 12 to 24 hours after IL-6 injection. SOCS6 mRNA appears to be expressed constitutively while SOCS7 mRNA was not detected in the liver either before injection of IL-6 or at any time after injection.
Expression of these genes was also examined after cytokine stimulation of the factor-dependent cell line FDCP-1 engineered to express bcl-w. Again, while SOCS6 mRNA was expressed constitutively.
EXAMPLE SOCS4
Mouse and human SOCS4 were recognized through searching EST databases using the SOCS box consensus (Figure 13). Those ESTs derived from mouse and human SOCS4 cDNAs are tabulated below (Tables 4.1 and 4.2). Using sequence information derived from mouse ESTs, several oligonucleotides were designed and used to screen, in the conventional manner, a mouse thymus cDNA library cloned into λ-bacteriophage. Two cDNAs encoding mouse SOCS4 were isolated and sequenced in their entirety (Figure 15) and shown to overlap the mouse ESTs identified in the database (Table 4.1 and Figure 17). These cDNAs include a region of 5' untranslated region, the entire mouse SOCS4 coding region and a region of 3' untranslated region (Figure 17). Analysis of the sequence confirms that the SOCS4 cDNA encodes a SOCS box at its C-terminus and a series of 8 WD-40 repeats before the SOCS box (Figures 17 and 16). The relationship of the two sequence contigs of human SOCS4 (h4.1 and h4.2) to the experimentally determined mouse SOCS4 cDNA sequence is shown in Figure 17. The nucleotide sequence of the two human contigs is listed in Figure 18.
SEQ ID NOs: 13 and 14 represent the nucleotide sequence of murine SOCS4 and the corresponding amino acid sequence. SEQ ID NOs: 15 and 16 are SOCS4 cDNA human contigs h4.1 and h4.2, respectively.
EXAMPLE 26
Mouse and human SOCS5 were recognized through searching EST databases using the SOCS box consensus (Figure 13). Those ESTs derived from mouse and human SOCS5 cDNAs are tabulated below (Tables 5.1 and 5.2). Using sequence information derived from mouse and human ESTs, several oligonucleotides were designed and used to screen, in the conventional manner, a mouse thymus cDNA library, a mouse genomic DNA library and a human thymus cDNA library cloned into λ-bacteriophage. A single genomic DNA clone (57-2) and a cDNA clone encoding mouse SOCS5 were isolated and sequenced in their entirety and shown to overlap with the mouse ESTs identified in the database (Figures 19 and 20A). The entire coding region, in addition to a region of 5' and 3' untranslated regions, of mouse SOCS5 appears to be encoded on a single exon (Figure 19). Analysis of the sequence (Figure 20) confirms that SOCS5 genomic and cDNA clones encode a protein with a SOCS box at its C-terminus in addition to an SH2 domain (Figures 19 and 20B). The relationship of the human SOCS5 contig (h5.1; Figure 21) derived from analysis of cDNA clone 5-94-2 and the human SOCS5 ESTs (Table 5.2) to the mouse DNA sequence is shown in Figure 19. The nucleotide sequence and corresponding amino acid sequence of murine SOCS5 are shown in SEQ ID NOs: 17 and 18, respectively. The human nucleotide sequence is shown in SEQ ID NO: 19.
EXAMPLE 27 SOCS6
Mouse and human SOCS6 were recognized through searching EST databases using the SOCS box consensus (Figure 13). Those ESTs derived from mouse and human SOCS6 cDNAs are tabulated below (Tables 6.1 and 6.2). Using sequence information derived from mouse ESTs, several oligonucleotides were designed and used to screen, in the conventional manner, a mouse thymus cDNA library. Eight cDNA clones (6-1A, 6-2A, 6-5B, 6-4N, 6-18, 6-29, 6-3N, 6-5N) encoding mouse SOCS6 were isolated and sequenced in their entirety and shown to overlap with the mouse ESTs identified in the database (Figures 22 and 23A). Analysis of the sequence (Figure 23) confirms that the mouse SOCS6 cDNA clones encode a protein with a SOCS box at its C-terminus in addition to eight WD-40 repeats (Figures 22 and 23B). The relationship of the human SOCS-6 contigs (h6.1 and h6.2; Figure 24) derived from analysis of human SOCS6 ESTs (Table 6.2) to the mouse SOCS6 DNA sequence is shown in Figure 22. The nucleotide and corresponding amino acid sequences of murine SOCS6 are shown in SEQ ID NOs: 20 and 21, respectively. SOCS6 human contigs h6.1 and h6.2 are shown in SEQ ID NOs: 22 and 23, respectively.
EXAMPLE 28 SOCS7
Mouse and human SOCS7 were recognized through searching EST databases using the SOCS box consensus (Figure 13). Those ESTs derived from mouse and human SOCS-7 cDNAs are tabulated below (Tables 7.1 and 7.2). Using sequence information derived from mouse ESTs, several oligonucleotides were designed and used to screen, in the conventional manner, a mouse thymus cDNA library. One cDNA clone (74-10A-11) encoding mouse SOCS7 was isolated and sequenced in its entirety and shown to overlap with the mouse ESTs identified in the database (Figures 25 and 26A). Analysis of the sequence (Figure 26) suggests that mouse SOCS7 encodes a protein with a SOCS box at its C-terminus, in addition to several ankyrin repeats (Figures 25 and 26B). The relationship of the human SOCS7 contigs (h7.1 and h7.2; Figure 27) derived from analysis of human SOCS7 ESTs (Table 7.2) to the mouse SOCS7 DNA sequence is shown in Figure 25. The nucleotide and corresponding amino acid sequences of murine SOCS7 are shown in SEQ ID NOs: 24 and 25, respectively. The nucleotide sequences of SOCS7 human contigs h7.1 and h7.2 are shown in SEQ ID NOs: 26 and 27, respectively.
EXAMPLE 29 SOCS8
ESTs derived from mouse SOCS8 cDNAs are tabulated below (Table ). As described for other members of the SOCS family, it is possible to isolate cDNAs for mouse SOCS8 using sequence information derived from mouse ESTs. The relationship of the ESTs to the predicted coding region of SOCS8 is shown in Figure 28. The nucleotide sequence obtained from the ESTs is shown in Figure 29A and the partial amino acid sequence of SOCS8 is shown in Figure 29B. The nucleotide sequence and corresponding amino acid sequence for murine SOCS8 are shown in SEQ ID NOs: 28 and 29, respectively.
EXAMPLE SOCS9
Mouse and human SOCS-9 were recognized through searching EST databases using the SOCS box consensus (Figure 13). Those ESTs derived from mouse and human SOCS9 cDNAs are tabulated below (Tables 9.1 and 9.2). The relationship of the mouse SOCS9 contig (m9.1; Figure 9.2) derived from analysis of the mouse SOCS9 EST (Table 9.1) to the human SOCS-9 DNA contig (h9.1; Figure 32) derived from analysis of human SOCS9 ESTs (Table 9.2) is shown in Figure 31. Analysis of the sequence (Figure 32) indicates that the human SOCS9 cDNA encodes a protein with a SOCS box at its C-terminus, in addition to an SH2 domain (Figure 30). The nucleotide sequence of murine SOCS9 cDNA is shown in SEQ ID NO: 30. The nucleotide sequence of human SOCS9 cDNA is shown in SEQ ID NO: 31.
EXAMPLE 31
Mouse and human SOCS10 were recognized through searching EST databases using the SOCS box consensus (Figure 13). Those ESTs derived from mouse and human SOCS10 cDNAs are tabulated below (Tables 10.1 and 10.2). Using sequence information derived from mouse ESTs, several oligonucleotides were designed and used to screen, in the conventional manner, a mouse thymus cDNA library. Four cDNA clones (10-9, 10-12, 10-23 and 10-24) encoding mouse SOCS10 were isolated, sequenced in their entirety and shown to overlap with the mouse and human ESTs identified in the database (Figures 33 and 34). Analysis of the sequence (Figure 34) indicates that the mouse SOCS10 cDNA clone is not full length but that it does encode a protein with a SOCS box at its C-terminus, in addition to several ankyrin repeats (Figure 33). The relationship of the human SOCS10 contigs (h10.1 and h10.2; Figure 35) derived from analysis of human SOCS10 ESTs (Table 10.2) to the mouse SOCS10 DNA sequence is shown in Figure 33. Comparison of mouse cDNA clones and ESTs with human ESTs suggests that the 3' untranslated regions of mouse and human SOCS10 differ significantly. The nucleotide sequence of murine SOCS10 is shown in SEQ ID NO: 32 and the nucleotide sequences of SOCS10 human contigs h10.1 and h10.2 are shown in SEQ ID NOs: 33 and 34, respectively.
EXAMPLE 32 SOCS11
Human SOCS11 was recognized through searching EST databases using the SOCS box consensus (Figure 13). Those ESTs derived from human SOCS11 cDNAs are tabulated below (Tables 11.1 and 11.2). The relationship of the human SOCS11 contig (h11.1; Figure 36A, B), derived from analysis of ESTs (Table 11.2), to the predicted encoded protein is shown in Figure 37.
Analysis of the sequence indicates that the human SOCS11 cDNA encodes a protein with a SOCS box at its C-terminus, in addition to an SH2 domain (Figures 37 and 36B). The nucleotide sequence and corresponding amino acid sequence of human SOCS11 are represented in SEQ ID NOs: 35 and 36, respectively.
EXAMPLE 33 SOCS12
Mouse and human SOCS-12 were recognized through searching EST databases using the SOCS box consensus (Figure 13). Those ESTs derived from mouse and human SOCS12 cDNAs are tabulated below (Tables 12.1 and 12.2). Using sequence information derived from mouse ESTs, several oligonucleotides were designed and used to screen, in the conventional manner, a mouse thymus cDNA library. Four cDNA clones (10-9, 10-12, 10-23 and 10-24) encoding mouse SOCS12 were isolated, sequenced in their entirety and shown to overlap with the mouse and human ESTs identified in the database (Figures 38 and 39). Analysis of the sequence (Figures 39 and 40) indicates that the SOCS12 cDNA clone encodes a protein with a SOCS box at its C-terminus, in addition to several ankyrin repeats (Figure 38). The relationship of the human SOCS12 contigs (h12.1 and h12.2; Figure 40) derived from analysis of human SOCS12 ESTs (Table 12.2) to the mouse SOCS12 DNA sequence is shown in Figure 38. Comparison of mouse cDNA clones and ESTs with human ESTs suggests that the 3' untranslated regions of mouse and human SOCS12 differ significantly. The nucleotide sequence of SOCS12 is shown in SEQ ID NO: 37. The nucleotide sequences of human SOCS12 contigs h12.1 and h12.2 are shown in SEQ ID NOs: 38 and 39, respectively.
EXAMPLE 34 SOCS13
Mouse and human SOCS-13 were recognized through searching EST databases using the SOCS box consensus (Figure 13). Those ESTs derived from mouse and human SOCS13 cDNAs are tabulated below (Tables 13.1 and 13.2). Using sequence information derived from mouse ESTs, several oligonucleotides were designed and used to screen, in the conventional manner, a mouse thymus and a mouse embryo cDNA library. Three cDNA clones (62-1, 62-6-7 and 62-14) encoding mouse SOCS13 were isolated, sequenced in their entirety and shown to overlap with the mouse ESTs identified in the database (Figures 41 and 42A). Analysis of the sequence (Figure 42) indicates that the mouse SOCS13 cDNA encodes a protein with a SOCS box at its C-terminus, in addition to a potential WD-40 repeat (Figures 41 and 42B). The relationship of the human SOCS13 contigs (h13.1 and h13.2; Figure 43) derived from analysis of human SOCS13 ESTs (Table 13.2) to the mouse SOCS13 DNA sequence is shown in Figure 41. The nucleotide sequence and corresponding amino acid sequence of murine SOCS13 are shown in SEQ ID NOs: 40 and 41, respectively. The nucleotide sequence of human SOCS13 contig h13.1 is shown in SEQ ID NO: 42.
EXAMPLE SOCS14
Mouse and human SOCS-14 were recognized through searching EST databases using the SOCS box consensus (Figure 13). Those ESTs derived from mouse and human SOCS14 cDNAs are tabulated below (Tables 14.1 and 14.2). Using sequence information derived from mouse and human ESTs, several oligonucleotides were designed and used to screen, in the conventional manner, a mouse thymus cDNA library, a mouse genomic DNA library and a human thymus cDNA library cloned into λ-bacteriophage. A single genomic DNA clone (57-2) and a cDNA clone (5-3-2) encoding mouse SOCS14 were isolated and sequenced in their entirety and shown to overlap with the mouse ESTs identified in the database (Figures 44 and 45A). The entire coding region, in addition to a region of 5' and 3' untranslated regions, of mouse SOCS14 appears to be encoded on a single exon (Figure 44). Analysis of the sequence (Figure 45) confirms that SOCS14 genomic and cDNA clones encode a protein with a SOCS box at its C-terminus in addition to an SH2 domain (Figures 44 and 45B). The relationship of the human SOCS14 contig (h14.1; Figure 14.3) derived from analysis of cDNA clone 5-94-2 and the human SOCS14 ESTs (Table 14.2) to the mouse SOCS14 DNA sequence is shown in Figure 44.
The nucleotide sequence and corresponding amino acid sequence of murine SOCS14 are shown in SEQ ID NOs: 43 and 44, respectively.
EXAMPLE 36
Mouse and human SOCS15 were recognized through searching DNA databases using the SOCS box consensus (Figure 13). Those ESTs derived from mouse and human SOCS15 cDNAs are tabulated below (Tables 15.1 and 15.2), as are a mouse and human BAC that contain the entire mouse and human SOCS-15 genes. Using sequence information derived from the ESTs and the BACs it is possible to predict the entire amino acid sequence of SOCS15 and, as described for the other SOCS genes, it is feasible to design specific oligonucleotide probes to allow cDNAs to be isolated. The relationship of the BACs to the ESTs is shown in Figure 46 and the nucleotide and predicted amino acid sequences of SOCS-15, derived from the mouse and human BACs, are shown in Figures 47 and 48. The nucleotide sequence and corresponding amino acid sequence of murine SOCS15 are shown in SEQ ID NOs: 46 and 47, respectively. The nucleotide and corresponding amino acid sequences of human SOCS15 are shown in SEQ ID NOs: 48 and 49, respectively.
EXAMPLE 37 SOCS INTERACTION WITH JAK2 KINASE
These Examples show interaction between SOCS and JAK2 kinase. Interaction is mediated via the SH2 domain of SOCS1, 2, 3 and CIS. The interaction resulted in inhibition of JAK2 kinase activity by SOCS1 (Figure 49). General interaction between JAK2 and SOCS1, 2, 3 and CIS is shown in Figure 25. The following methods are employed:
Immunoprecipitation: Cos 6 cells were transiently transfected by electroporation and cultured for 48 hours. Cells were then lysed on ice in lysis buffer (50 mM Tris/HCl, pH 7.5, 150 mM NaCl, 1% v/v Triton X-100, 1 mM EDTA, 1 mM NaF, 1 mM Na3VO4) with the addition of complete protease inhibitors (Boehringer Mannheim), centrifuged at 4°C (14,000 x g, 10 min) and the supernatant retained for immunoprecipitation. JAK2 proteins were immunoprecipitated using 1l anti-JAK2 antibody (UBI). Antigen-antibody complexes were recovered using protein A-Sepharose (30 µl of a 50% slurry).
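For readers preparing the lysis buffer, the stock volume for each component follows from the usual dilution relation, stock volume = final concentration x final volume / stock concentration. The sketch below applies that relation to the final concentrations given above. The stock concentrations in `STOCK_MM`, the use of neat Triton X-100 and the function name `stock_volumes_ml` are assumptions for illustration only and are not taken from this specification.

```python
# Final concentrations from the lysis buffer recipe above (mM).
LYSIS_BUFFER_MM = {
    "Tris/HCl pH 7.5": 50.0,
    "NaCl": 150.0,
    "EDTA": 1.0,
    "NaF": 1.0,
    "Na3VO4": 1.0,
}

# Hypothetical stock concentrations (mM); substitute the stocks actually in use.
STOCK_MM = {
    "Tris/HCl pH 7.5": 1000.0,
    "NaCl": 5000.0,
    "EDTA": 500.0,
    "NaF": 500.0,
    "Na3VO4": 200.0,
}

TRITON_FRACTION = 0.01  # 1% v/v, assuming neat Triton X-100 is added directly


def stock_volumes_ml(final_volume_ml):
    """Return the volume (ml) of each stock needed for the given final volume."""
    volumes = {
        name: final_mm * final_volume_ml / STOCK_MM[name]
        for name, final_mm in LYSIS_BUFFER_MM.items()
    }
    volumes["Triton X-100"] = TRITON_FRACTION * final_volume_ml
    return volumes


if __name__ == "__main__":
    for name, vol in stock_volumes_ml(100.0).items():
        print(f"{name}: {vol:.2f} ml")
```

The remaining volume is made up with water, and the protease inhibitors are added as directed by their supplier.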
Western blotting: Immunoprecipitates were analysed by sodium dodecyl sulphate (SDS) polyacrylamide gel electrophoresis (PAGE) under reducing conditions. Protein was then electrophoretically transferred to nitrocellulose, blocked overnight in 10% w/v skim-milk and washed in PBS/0.1% v/v Tween-20 (Sigma) (wash buffer) prior to incubation with either anti-phosphotyrosine antibody (4G10) (1:5000, UBI), anti-FLAG antibody (1.6 µg/ml) or anti-JAK2 antibody (1:2000, UBI) diluted in wash buffer/1% w/v BSA for 2 hr. Nitrocellulose blots were washed and primary antibody detected with either peroxidase-conjugated sheep anti-rabbit immunoglobulin (1:5000, Silenus) or peroxidase-conjugated sheep anti-mouse immunoglobulin (1:5000, Silenus) diluted in wash buffer/1% w/v BSA. Blots were washed and antibody binding visualised using the enhanced chemiluminescence (ECL) system (Amersham, UK) according to the manufacturer's instructions.
In vitro kinase assay: An in vitro kinase assay was performed to assess intrinsic JAK2 kinase catalytic activity. JAK2 proteins were immunoprecipitated as described, washed twice in kinase assay buffer (50 mM NaCl, 5 mM MgCl2, 5 mM MnCl2, 1 mM NaF, 1 mM Na3VO4, 10 mM HEPES, pH 7.4) and suspended in an equal volume of kinase buffer containing 0.25 µCi/ml (γ-32P)-ATP (30 min, room temperature). Excess (γ-32P)-ATP was removed and the immunoprecipitates analysed by SDS/PAGE under reducing conditions. Gels were subjected to a mild alkaline hydrolysis by treatment with 1 M KOH (55°C, 2 hours) to remove phosphoserine and phosphothreonine. Radioactive bands were visualised with IMAGEQUANT software on a PhosphorImager system (Molecular Dynamics, Sunnyvale, CA, USA).
EXAMPLE 38 MAKING SOCS-1 KNOCKOUT CONSTRUCTS
Diagrams of plasmid constructs and knockout constructs are shown in Figures 51-53. The genomic SOCS-1 clone 95-11-10 was digested with the restriction enzymes BamHI and EcoRI to obtain a 3.6 kb DNA fragment 3' of the coding region (SOCS-1 exon), which was used as the 3' arm in the SOCS-1 knockout vectors. The ends of this fragment were then blunted. This fragment was then ligated into the vectors pBgalpAloxNeo and pBgalpAloxNeoTK, which had been linearized at the unique XhoI site and then blunted. This ligation resulted in the formation of the following vectors: 3'SOCS-1 arm in pBgalpAloxNeo and 3'SOCS-1 arm in pBgalpAloxNeoTK.
The 5' arm of the SOCS-1 knockout vectors was constructed by using PCR to generate a product from the genomic SOCS-1 clone 95-11-10 just 5' of the SOCS-1 coding region (SOCS-1 exon). The oligos used to generate this product were:
5' oligo (sense) (2465) AGCT AGA TCT GGA CCC TAC AAT GGC AGC [SEQ ID NO:49]
3' oligo (antisense) (2466) AGCT AG ATC TGC CAT CCT ACT CGA GGG GCC AGC TGG [SEQ ID ]
The PCR product was then digested with the restriction enzyme BglII to generate BglII ends. This 5' SOCS-1 PCR product with BglII ends was then ligated into the vectors 3'SOCS-1 arm in pBgalpAloxNeo and 3'SOCS-1 arm in pBgalpAloxNeoTK, which had been linearized with the unique restriction enzyme BamHI. This resulted in the following vectors being formed: 5'&3'SOCS-1 arms in pBgalpAloxNeo and 5'&3'SOCS-1 arms in pBgalpAloxNeoTK. These were the final SOCS-1 knockout constructs. Both these constructs lacked the entire SOCS-1 coding region (SOCS-1 exon), which was replaced with portions of the βgal, β-globin polyA, PGK promoter, neomycin and PGK polyA sequences. The 5'&3'SOCS-1 arms in pBgalpAloxNeoTK vector also contained the thymidine kinase gene sequence, between the neomycin and PGK polyA sequences.
The vectors 5'&3'SOCS-1 arms in pBgalpAloxNeo and 5'&3'SOCS-1 arms in pBgalpAloxNeoTK were linearized with the unique restriction enzyme NotI and then transfected into embryonic stem cells by electroporation. Clones which were resistant to neomycin were selected and analysed by Southern blot to determine if they contained the correctly integrated SOCS-1 targeting sequence.
In order to determine if correct integration had occurred, genomic DNA from the neomycin resistant clones was digested with the restriction enzyme EcoRI. The digested DNA was then blotted onto nylon filters and probed with a 1.5 kb EcoRI/HindIII DNA fragment, which was further 5' of the 5' arm sequence used in the knockout constructs. The band sizes expected for correct integration were:
Wild type SOCS-1 allele: 5.4 kb
SOCS-1 knockout allele: 8.2 kb in cells transformed with 5'&3'SOCS-1 arms in pBgalpAloxNeo, or 11 kb in cells transformed with 5'&3'SOCS-1 arms in pBgalpAloxNeoTK.
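The expected band sizes lend themselves to a simple lookup when scoring Southern blots. The sketch below reports which of the expected bands are present in a lane. The expected sizes come from the text above; the 0.3 kb matching tolerance, the result labels and the function names `has_band` and `classify_clone` are assumptions introduced for illustration.

```python
WILD_TYPE_KB = 5.4
KNOCKOUT_KB = {
    "pBgalpAloxNeo": 8.2,
    "pBgalpAloxNeoTK": 11.0,
}


def has_band(bands_kb, expected_kb, tolerance_kb=0.3):
    """True if any observed band lies within the tolerance of the expected size."""
    return any(abs(b - expected_kb) <= tolerance_kb for b in bands_kb)


def classify_clone(bands_kb, construct, tolerance_kb=0.3):
    """Report which expected EcoRI bands are present for a given construct."""
    wild_type = has_band(bands_kb, WILD_TYPE_KB, tolerance_kb)
    knockout = has_band(bands_kb, KNOCKOUT_KB[construct], tolerance_kb)
    if wild_type and knockout:
        return "wild-type and knockout bands"
    if knockout:
        return "knockout band only"
    if wild_type:
        return "wild-type band only"
    return "neither expected band"


if __name__ == "__main__":
    print(classify_clone([5.4, 8.2], "pBgalpAloxNeo"))
    print(classify_clone([5.5], "pBgalpAloxNeo"))
    print(classify_clone([5.4, 11.1], "pBgalpAloxNeoTK"))
```

A lane showing both bands is the pattern expected for a clone carrying one targeted and one wild type allele; a lane with only the wild type band indicates that the probed locus was not targeted.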
Those skilled in the art will appreciate that the invention described herein is susceptible to variations and modifications other than those specifically described. It is to be understood that the invention includes all such variations and modifications. The invention also includes all of the steps, features, compositions and compounds referred to or indicated in this specification, individually or collectively, and any and all combinations of any two or more of said steps or features.
P:\OPER\VPA\VPACQM-I 'SOCSDI-I .WPD- 1/1 1/01 104- Table 4.1 Summary of ESTs derived from mouse SOCS-4 cDNAs SOCS Species EST name SOCS-4 Mouse mc65f04 End EST no 5' EST0549700 mf42e06 5' EST0593477 Library source d 13.5-14.5 mouse embryo d 13.5 -14.5 mouse embryo d 8.5 mouse embryo d 13 embryo spleen Contig m4. 1 m4. 1 m4. 1 m4. 1 m4.1 mplOclO mr8l1gO9 mtl9hl2 EST0747905 EST078308 1 EST0816531 Table 4.2 Summary of ESTs derived fromI SOCS Species EST name SOCS-4 Human 27b5 3 0d2 JO0I59F J3802F EST19523 EST81 149 EST1 80909 EST 182619 human SOCS-4 cDNAs End EST no 5' EST0534081 5' EST0534315 5f EST0461188 EST046 1428 EST0958884 ESTlOl 1015 EST095 1375 Library source retina retina foetal heart foetal heart retina placenta Jurkat Tlymphocyte Jurkat Tlymphocyte Contig h4.2 h4. 2 h4.2 h4.2 h4 .2 h4.2 h4 .2 h41 5' EST0953220 P:\OPER\VPA\VPACOM-l\SOCSDI--I.WPD 1111/01 105 ya99h09 ye7OcO4 yh53c09 yh77g1 II yh87h05 yi45h07 yjO4eO6 yqlI2hO6 yq56a06 yq6OeO2 EST0 103262 ESTO 172673 EST0 197390 EST0197391 EST02034 18 EST02034 19 EST0204888 EST0204773 EST0246604 EST025 8541 EST0258285 EST0309968 EST0346924 EST0347259 EST0347209 EST0355932 EST0355884 EST0357618 EST0357416 EST0372402 EST0338395 EST0338303 EST0458506 EST046539 1 placenta h4.2 foeatl liver/spleen h4.2 placenta h4.2 h4.2 placenta h4.2 h4.1I placenta h4.1 h4. 1 placenta h4.2 placenta h4.1I h4. 1 foetal liver spleen h4.2 foetal liver spleen h4.2 foetal liver spleen h4.2 h4. 
2 foetal liver spleen h4.2 h4.2 foetal liver spleen h4.2 h4.2 foetal liver spleen h4.2 foetal liver spleen h4.2 h4.2 foetal liver spleen h4.2 foetal liver spleen h4.2 yq92g03 yq97h06 yr9OfDl yt69c03 yv3OaO8 yv55fU07 P:AOPER\VPA\VPACOM-I'SOCSDI-I.WPD 1/1 1/01 106yv57h09 yv87h02 yv98el 1 yw68dl10 yw82a03 yxO8aO7 yx72h06 EST046333 1 EST0464336 EST0458765 EST0388085 EST0400679 EST0400680 EST044 1370 EST0463005 EST0433678 EST04070 16 EST0435 158 EST0422871I EST0434011I EST045 1704 EST0505446 EST051 1777 EST0485315 EST0540473 EST0540354 EST0564666 EST0578099 h4.2 foetal liver spleen h4.2 h4. 2 melanocyte h4.2 melanocyte h4.2 h4.2 placenta (8-9 h4.2 wk) placenta (8-9 h4.2 wk) h4.A melanoocyte h4.A melanoocyte h4.2 melanoocyte h4.A melanoocyte h4.2 melanoocyte h4.2 multiple h4.2 sclerosis lesion foetal lung h4.2 foetal lung h4.1 parathyroid h4.1 tumor h4.A foetal heart h4.1I foetal heart h4. 1 yx76b09 yy37h08 yy66b02 za8l.f'08 zblI8fU7 zc06eO8 zdlI4gO6 zd~lhl2 P: OPER\VPA\VPACOM-l\SOCSDI-l WPD 1/11/01 107zd52b09
II
ze69f02 zf5403 zh96e07 EST05 82012 EST058 1958 EST0679543 EST0635563 EST0635472 EST06801 11 EST0616241 EST061 5745 EST 1043265 EST0920072 EST0920016 EST0920121 EST0920 122 zv66h 12 zs83 a08 foetal heart h4.1I h4.1I foetal heart h4. 1 retina h4,2 h4. 1 retina h4.2 foetal liver h4.2 spleen h4.2 8-9w foetus h4.2 germinal centre h4. I B cell h4.1I germinal centre h4. 1 B cell h4. 1 Library source Contig d13.5-14.5 m5.1 mouse embryo placenta m5.1 mixed organs m5.1 heart m5.1 zs83g08 Table 5.1 Summary of ESTs derived from mouse SOCS-5 cDNAs SOCS Species EST name End EST no Mouse mc55aOl 5' EST0541556 mh98flJ9 my26h 12 ve24e06 EST0638237 EST0859939 EST0819106 P:\OPER\VPA\VPACOM-l\SOCSDI-l.WPD. 1/11/01 108-
Table 5.2 Summary of ESTs derived from human SOCS-5 cDNAs SOCS Species EST name End EST no Human EST15BI03 EST0258029 EST15BI05 EST0258028 EST27530 5' EST0965892 5' EST0679820 Table 6.1 Summary of ESTs derived from mouse SOCS-6 cDNAs SOCS Species EST name End EST no SOCS-6 Mouse mco4c05 5' EST0525832 md48a03 5' EST0566730 mf3 1dO3 5' EST0675970 25 mh26b07 5' EST0628752 mh78ell 5' EST0637608 mh88h09 5' EST0644383 mh94h07 5' EST063 8078 mi27h04 5' EST0644252 mj29c05 5' EST0664093 mp66g04 5' EST0757905 mw75g03 5' EST0847938 Library source Contig adipose tissue h5. 1 adipose tissue h5.1I cerebellum h5.1I retina h5.1 Library source Contig d 19.5 embryo m6.1 d 13.5-14.5 embryo m6.1 d 13.5-14.5 embryo m6.1 d 13.5-14.5 placenta m6.1 d 13.5-14.5 placenta m6.1 d 13.5-14.5 placenta m6.1 d 13.5-14.5 placenta m6.1 d 13.5-14.5 embryo m6.1 d 13.5-14.5 embryo m6.1 thymus m6. 1 liver m6.1 P:\OPERkVPA\VPACOM-I\SOCSDI-l.WPD 1/11/01 109va53b05 vb34h02 vc55dO7 vc59e05 vc67d03 vc68dlO vc97hOl vc99c08 vd07hO3 vdO8cOl vd09bl2 vdl 9b02 vd29a04 vd46d06 EST090 1540 EST0930 132 EST 1057735 EST1058201 EST 1057849 EST 1058663 EST 1059343 EST1059410 EST1058 173 EST 1058275 EST1058632 EST1 059723 ?none found ?none found d 12.5 embryo' lymph node 2 cell embryo 2 cell embryo 2 cell embryo 2 cell embryo 2 cell embryo 2 cell embryo 2 cell embryo 2 cell embryo 2 cell embryo 2 cell embryo m6.1 m6.1I m6. 1 m6.1 m6.1 m6.1 m6.1 m6.1 m6.1 m6.1 m6. 1 m6. 1 m6. 1 m6.1 P:\OPER\VPA\VPACOM-I\SOCSDI-I.WPD 1/11l01 110- Table 6.2 Summary of ESTs derived from human SOCS-5 cDNAs SOCS Species EST name End EST no SOCS-6 Human yf6 1e08 yf93a09 2 yg4 1 f04 yg45c02 yhl MfO yhl3bO5 *fl.
EST0 184387 ESTO 186084 EST0 19 1486 EST0 1950 17 EST0 185308 EST0236705 EST0237191 EST0236958 EST05555 18 EST0603826 EST06037 18 EST0773936 EST0773892 EST0683363 EST07 18885 EST0785947 Library source d73 infant brain d73 infant brain d73 infant brain d73 infant brain d73 infant brain d73 infant brain d73 infant brain senescent fibroblasts foetal heart pregnant uterus colon endothelial cell endothelial cell Contig h6.1 h6.1I h6.1I h6.1I h6.1I h6.1I h6. 1 h6.2 h6. 1 h6. 1 h6.2 h6. 1 h6. 1 h6.1 h6.1 h6.1 zc35a12 zeO2hO8 zl09aO3 z169e 10 zn3 9d08 zo39e06 P:\OPER\VPA\VPACOM\-I\SOCSDl--I.WPD. -/11/01 -1III- Table 7.1 Summary of ESTs derived from mouse SOCS-7 cDNAs SOCS Species EST name End EST no SOCS-7 Mouse mj39aOl 5' EST0665627 vi52h07 5' EST1267404 Library source d 13.5/14.5 embryo d7.5 embryo Contig m7. 1 m7.1I a. a.
Table 7.2 Summary
SOCS
SOCS-7 of ESTs derived from human SOCS-5 cDNAs Species EST name End EST no HUMAN STS WI-30171 (G2 1563) EST00939 5' EST0000906 EST12913 3' EST0944382 yc29b05 3' EST0 128727 Library source Chromosome 2 hippocampus uterus liver retina germinal centre B ovarian tumour Contig h7.2 h7.1 h7.2 h7.2 h7.2 chl.2 h7. 1 h7.1 yp49flO0 ztlOflJ3 EST0301914 EST0922932 EST0921231 ESTI 102975 zx73g04 Table 8.1 Summary
SOCS
SOCS-8 of ESTs derived from mouse SOCS-8 cDNAs Species EST name End EST no Library source Mouse mjlI6eO9 r I EST0666240 d 13.5/14.5 embryo vj27a029 ri EST1155973 heart Contig m8. 1 m8.1 P:\OPER\VPA\VPACOM-I\SOCSDI-l WPD 1/11/01 112- Table 9.1 Summary of ESTs derived from mouse SOCS-9 cDNAs SOCS Species EST name End EST no Library source Mouse me65d05 5' EST0585211I d 13.5/14.5 embryo Contig m9.1 Table 9.2 Summary of ESTs derived from human SOCS-5 cDNAs SOCS Species EST name SOCS-9 Human a *aa.
a a CSRL-83f2-u ESTi 114054 yyO6bO7 yyO6gO6 zr4OcO9 zr72hOlI yx92c08 yx93b08 hfe0662 End EST no (B06659) 5' EST0939759 3' EST0434504 5' EST0443783 5' EST0832461 5' EST0892025 3' EST0892026 5' EST0441160 5 EST0441260 5' EST0889611 Library source chromsome I11 h9.1I placenta h9. 1 melanocyte h9. 1 melanocyte h9.1I melanocyte, heart, hRi~us melanocyte, heart, hi~us h9.1 melanocyte h9.1I melanocyte h9. 1 foetal heart h9.1I Contig P:IOPER\VPAVPACOM-I\SOCSDI.WPD- 1/11/01 113- Table 10.1 Summary of ESTs derived from mouse SOCS-10 cDNAs SOCS Species Mouse EST name mb 1 4d 12 mb40fD~6 mg89bl I1 mq89e 12 mpO3gl2 vh53cl 1 End 5, 5' 5, 5, 5, 51 EST no EST0549887 EST05 15064 EST0630631I EST0776015 EST0741991 EST1 154634 Library source Contig d 19.5 embryo m10.1 d 19.5 embryo m10.1 d 13.5-14.5 embryoml10. 1 heart M10.1 heart m10.1 mammary gland ml 0.1 4* a..
Table 10.2 Summary of ESTs derived from human SOCS-5 cDNAs SOCS Species EST name SOCS- 10 Human aa48hl10 zp97h 12 End EST no 3' EST1135220 31 EST0819137 5, EST083 5442 3' EST083121 1 5' EST0835907 5' EST0834251 31 EST0834440 5 EST1004491 EST00 13906 Library source germinal centre B cell muscle muscle muscle Contig hl10.2 hl10.2 h 10.2 h 10.2 hlO.1 zqO8hO I zr34g05 melanocyte, heart, uterW. 0.2 hlO.2 EST73000 HSDHE1005 ovary heart hi 0.2 h 10.2 P:\OPER\VPA\VPACOM-l\SOCSDI-l WPD 1/11(0) 114- Table 11.1 Summary of ESTs derived from human SOCS-5 cDNAs SOCS Species EST name End EST no SOCS- I I Human zt24h06 ri EST0925023 Library source ovarian tumor Contig 11.1 zr43b02 ri EST0873006 melanocy-te, heart, uterus si EST0872954 Table 12.1 Summary of ESTs derived from mouse SOCS-12 cDNAs SOCS Species EST name End EST no Library source S. .5
.5 S S SOCS- 12 Mouse EST03803 mtI8fD2 mz60glO vaO5cl I Table 12.2 Summary of ESTs derived from SOCS Species EST name EST1054 173 EST08 17652 EST0890872 EST0909449 day 7.5 emb ectoplacental cone 3NbMS spleen lymph node lymph node Contig m12.1 m12.1 m 12.1 ml 2.1 human SOCS-5 cDNAs
End EST no
SOCS-12 Human STS-SHGC-13867 EST 177695 EST64550 EST76868 PMY2369 yb38fD4 EST094 8071 EST0997367 EST1007291 EST1 115998 EST0 108807 EST0224407 EST0237226 EST0236992 Library source Chromosome 2 Jurkat cells Jurkat cells pineal body
KG-I
foetal spleen Contig h12.2 h12.1 h12.1 h 12.2 h12.1 h12.1 h 12.2 h12.1 h12.1 h 12.2 yg74e 12 yhl3gO4 d73 brain d73 brain P:\0PER\VPA\VPACOM-.I\SOCSDI-I.WPD 1/01 115yh48b06 yh53a05 yn48h09 yn90a09 yo08fD3 yol 1 O I yo63bl2 yq56g02 zh57c04 zh79h01 zh99aI I zo92h12 zs48cOl zs45h02 yh48b06 EST0 197282 EST0 197486 EST0278258 EST0278259 EST0302557 EST030 1790 EST0302059 ?none found EST0303606 EST0304085 EST0346935 EST05 94201 EST0598945 EST06 18570 EST0803392 EST0803393 EST0925714 EST0925530 EST0932296
*000S0 a 06S5 6 *SOe placenta placenta brain brain brain breast foetal liver spleen foetal liver spleen foetal liver spleen foetal liver spleen ovarian cancer germinal centre B cell germinal centre B cell Library source day 19.5 embryo day 13.5/14.5 embryo day 19.5 embryo day 19.5 embryo h 12.2 hl2.2 h12.2 h 12.2 h 12.2 h 12.2 h 12.2 h 12.2 h 12.2 h 12.2 h 12.2 H12.1 H12.2 h 12.2 h 12.2 H12.1 h 12.2 h12.1 H12.2 h 12.2 Table 13.1 Summary of ESTs derived from mouse SOCS-13 cDNAs SOCS Species EST name End EST no SOCS- 13 Mouse ma39c09 me60cO5 mi78g05 mk iOc 11 5' EST0517875 5' EST0584950 5- EST0653834 5' EST0735158 Contig m13.1 m 13.1 m 13.1 mn13.1 PAkOPERVPA\VPACOM-1'SOCSDI-I .WPD 1/1 1/01 116mo48g 12 mp94a01I vb57c07 vhO7cl I EST07451 11 EST0762827 EST 1028976 EST1 117269 day 10.5 embryo thymus day 11.5 embryo mammary gland m13.1 m13.1 m13.1 m13.1 Table 13.2 Summary of ESTs derived from human SOCS-13 cDNAs SOCS Species EST name End EST no Library source Contig SOCS- 13 Human EST59161 Table 14.1 Summary of ESTs derived from SOCS Species EST name 5' EST0992726 infant brain h 13.1 mouse SOCS-14 cDNAs End EST no SOCS- 14 mouse mi75e03 vd29h1 II vd53g07 EST065 1892 EST 1067080 EST1 119627 Library source d 19.5 embryo 2 cell embryo 2 cell embryo Library source Contig m14.1 m14.1 m14.1 Table 15.1 Summary of ESTs derived from mouse SOCS-15 cDNAs SOCS Species EST name End EST no Mouse mh29b05 mh98h09 m145a02 mu43alO EST0628834 EST063 8243 EST0687171 EST85 1588 placenta placenta testis thymus Contig m15.1 m15.1 ml15.1 m15.1 P:\0PER\VPA\VPACOM- a SOCSDI- I.WPD 1111/101 117my3 8c09 vj37h07 AC002393 EST8 78461 EST1 174791 pooled organs diaphragm Chromosome 6 BAC m15.1 m15.1 m15.1 Table 15.2 Summary of ESTs derived from human SOCS-15 cDNAs
SOCS
SOCS-15 Species Human EST name EST98889 ne48bo5 ybl2hl2 HSU47924 End EST no 5' EST1026568 3' EST1 138057 5' EST0098885 3' EST0098886 Library source thyroid colon tumour placenta Contig h 15.1 h 15.1 h 15.1 h 15.1 h 15.1 Chromosome 12 BAC P:\OPER\VPA\VPACOM- ISOCSDI--.WPD- 1/11/01 -118-
BIBLIOGRAPHY:
Alexander WS, Metcalf D and Dunn AR (1995). Embo Journal 14, 5569-78.
Altschul, S.F. Gish, W. Miller, W. Myers, E.W. Lipman, D.J. (1990) J. Mol. Biol. 215, 403-10.
Ansari-Lari, Shen, Muzny, Lee, W. and Gibbs, R.A. (1997) Genome Res. 7, 268-280.
Bazan JF (1990). [Review]. Immunology Today 11, 350-4.
Bork, P. (1993) Proteins. Struct. Funct. Genet. 17, 363-374.
Cockwell, L.Y. and Giles, I.G. (1989) Comp. Appl. Biosci. 5, 227-232.
Cutler RL, Liu L, Damen JE and Krystal G (1993). Journal of Biological Chemistry 268, 21463-5.
Darnell J Jr., Kerr IM and Stark GR (1994). Science 264, 1415-21.
David M, Petricoin E3, Benjamin C, Pine R, Weber MJ and Larner AC (1995). Science 269, 1721-3.
David M, Wong L, Flavell R, Thompson SA, Wells A, Larner AC and Johnson GR (1996). Journal of Biological Chemistry 271, 9185-8.
Dugaiczyk A, Haron JA, Stone EM, Dennison OE, Rothblum KN and Schwartz RJ (1983). Biochemistry 22, 1605-13.
Durbin JE, Hackenmiller R, Simon MC and Levy DE (1996). Cell 84, 443-50.
Etzold, Ulyanov, A. and Argos, P. (1996) Methods Enzymol. 266, 114-28.
Gearing DP, Nicola NA, Metcalf D, Foote S, Willson TA, Gough NM and Williams L (1989). BioTechnology 7, 1157-1161.
Gupta S, Yan H, Wong LH, Ralph S, Krolewski J and Schindler C (1996). Embo Journal 1075-84.
Hilton DJ (1994). An introduction to cytokine receptors, p8-16 in Guidebook to Cytokines and Their Receptors, Eds: N. A. Nicola. Oxford University Press: Oxford.
Hilton DJ, Hilton AA, Raicevic A, Rakar S, Harrison-Smith M, Gough NM, Begley CG, Metcalf D, Nicola NA and Willson TA (1994). Embo Journal 13, 4765-75.
Hilton DJ, Watowich SS, Katz L and Lodish HF (1996). J. Biol. Chem. 271, 4699-4708.
Ichikawa Y (1969). Journal of Cellular Physiology 74, 223-34.
Ihle JN (1995). Nature 377, 591-4.
Ihle JN, Witthuhn BA, Quelle FW, Yamamoto K and Silvennoinen O (1995). Annual Review of Immunology 13, 369-98.
Kaplan MH, Schindler U, Smiley ST and Grusby MJ (1996a). Immunity 4, 313-9.
Kaplan MH, Sun YL, Hoey T and Grusby MJ (1996b). Nature 382, 174-179.
Levy DE and Stark GR (1996). Molecular Cellular Biology 16, 369-75.
Metcalf D, Wilson TA, Hilton DJ, DiRago L and Mifsud S. (1995) Leukaemia 9, 1556-1564.
Meraz MA, White JM, Sheehan KC, Bach EA, Rodig SJ, Dighe AS, Kaplan DH, Riley JK, Greenlund AC, Campbell D, Carver-Moore K, DuBois RN, Clark R, Aguet M and Schreiber RD (1996). Cell 84, 431-42.
Mizushima S and Nagata S (1990). Nucleic Acids Research 18, 5322.
Murakami M, Narazaki M, Hibi M, Yawata H, Yasukawa K, Hamaguchi M, Taga T and Kishimoto T (1991). Proc. Natl. Acad. Sci. USA 88, 11349-11353.
Neer, Schmidt, Nambudripad, R. and Smith, T.F. (1994) Nature 371, 297-300.
Nicola NA (1994). Guidebook to Cytokines and Their Receptors. Oxford University Press: Oxford.
Nicola NA, Viney E, Hilton DJ, Roberts B and Wilson T. (1996) Growth Factors 13, 141-149.
Novak U, Harpur AG, Paradiso L, Kanagasundaram V, Jaworowski A, Wilks AF and Pearson WR. (1990) Methods Enzymol. 183, 63-98.
Rayner JR and Gonda TJ (1994). Molecular Cellular Biology 14, 880-7.
Sambrook J, Fritsch EF and Maniatis T (1989). Molecular Cloning, A Laboratory Manual. Cold Spring Harbour Laboratory Press, Cold Spring Harbour USA.
Sato N, Sakamaki K, Terada N, Arai K and Miyajima A (1993). Embo Journal 12, 4181-9.
Shimoda K, van Deursen J, Sangster MY, Sarawar SR, Carson RT, Tripp RA, Chu C, Quelle FW, Nosaka T, Vignali DA, Doherty PC, Grosveld G, Paul WE and Ihle JN (1996). Nature 380, 630-3.
Shuai K, Ziemiecki A, Wilks AF, Harpur AG, Sadowski HB, Gilman MZ and Darnell JE (1993). Nature 366, 580-3.
Sprang SR and Bazan JF (1993). Curr. Opin. Structural Bio. 3, 815-827.
Takeda K, Tanaka T, Shi W, Matsumoto M, Minami M, Kashiwamura S, Nakanishi K, Yoshida N, Kishimoto T and Akira S (1996). Nature 380, 627-30.
Thierfelder WE, Vandeursen JM, Yamamoto K, Tripp RA, Sarawar SR, Carson RT, Sangster MY, Vignali DA, Doherty PC, Grosveld GC and Ihle JN (1996). Nature 382, 171-174.
Wakao H, Gouilleux F and Groner B (1994). Embo Journal 13, 2182-91.
Wen Z, Zhong Z and Darnell J Jr. (1995). Cell 82, 241-50.
Yi T, Mui AL, Krystal G and Ihle JN (1993). Molecular Cellular Biology 13, 7577-86.
Yoshimura A, Ohkubo T, Kiguchi T, Jenkins NA, Gilbert DJ, Copeland NG, Hara T and Miyajima A (1995). Embo Journal 14, 2816-26.
SEQUENCE LISTING

GENERAL INFORMATION:
APPLICANT: (Other than US) THE WALTER AND ELIZA HALL INSTITUTE OF MEDICAL RESEARCH (US Only)
(ii) TITLE OF INVENTION: THERAPEUTIC AND DIAGNOSTIC AGENTS
(iii) NUMBER OF SEQUENCES: 49
(iv) CORRESPONDENCE ADDRESS:
ADDRESSEE: DAVIES COLLISON CAVE
STREET: 1 LITTLE COLLINS STREET
CITY: MELBOURNE
STATE: VICTORIA
COUNTRY: AUSTRALIA
ZIP: 3000
COMPUTER READABLE FORM:
MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release Version #1.25
(vi) CURRENT APPLICATION DATA:
APPLICATION NUMBER: PCT INTERNATIONAL
FILING DATE: 31-OCT-1997
(vi) PRIOR APPLICATION DATA:
APPLICATION NUMBER: PO5117
FILING DATE: 14-FEB-1997
(vii) PRIOR APPLICATION DATA:
APPLICATION NUMBER: PO 3384
FILING DATE: 01-NOV-1996
(viii) ATTORNEY/AGENT INFORMATION:
NAME: HUGHES DR, E JOHN L
REFERENCE/DOCKET NUMBER: EJH/EK
(ix) TELECOMMUNICATION INFORMATION:
TELEPHONE: +61 3 9254 2777
TELEFAX: +61 3 9254 2770

INFORMATION FOR SEQ ID NO:1:
SEQUENCE CHARACTERISTICS:
LENGTH: 20 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
CACGCCGCCC ACGTGAAGGC

INFORMATION FOR SEQ ID NO:2:
SEQUENCE CHARACTERISTICS:
LENGTH: 20 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
TTCGCCAATG ACAAGACGCT

INFORMATION FOR SEQ ID NO:3:
SEQUENCE CHARACTERISTICS:
LENGTH: 1236 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 1..636
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
CGAGGCTCAA GCTCCGGGCG GATTCTGCGT GCCGCTCTCG GGCCTGTGCC ACCCGGACGC CCGGCTCACT GCCTCTGTCT GACGCTATGG CCCACCCCTC CAGCTGGCCC CTCGAGTAGG ATG GTA GCA CGC AAC CAG GTG GCA GCC GAC AAT Met Val Ala Arg Asn Gln Val Ala Ala Asp Asn 1 5 10 GCA GAG CCC CGA CGG CGG
TCA GAG CCC TCC TCG CTCCTTGGGG TCTGTTGGCC CCCCCATCAG CGCAGCCCCG GCG ATC TCC CCG GCA Ala Ile Ser Pro Ala TCC TCG TCT TCG TCC -101 -41 P:\OPER\VPA\VPACOM-1\SOCSD1-I .WPD 1/l11/01 124- Ala
TCG
Ser
CCA
Pro
TAC
Tyr
TAT
Tyr
GAG
Giu
TTC
Phe
GTG
Val
TTC
Phe 145
CGC
Arg
GAG
Giu
GCG
Ala
CCC
Glu
CCA
Pro
GCC
Al a
CGG
Arg
TGG
Trp
CCC
Pro
TTC
Phe
CAC
His 130
GAC
Asp
ATG
Met
CTG
Leu
CGC
Arg
TTC
Pro
GCG
Ala
CCT
Pro
CGC
Arg
GGA
Gly
GTG
Val1
GCG
Ala 115
TTC
Phe
TGC
Cys
TTG
Leu
TGT
Cys
ATC
Ile 195
CAG
Arg
GCC
Ala
GGC
Gly
ATC
Ile
CCC
Pro
GGC
Gly 100
CTC
Leu
CAG
Gin
CTT
Leu
GGG
Gly
CGC
Arg 180
CCT
Pro
ATC
Arg
CCC
Pro
GAC
Asp
ACG
Thr
CTG
Leu
ACC
Thr
AGC
Ser
GCC
Ala
TTC
Phe
GCC
Ala 165
CAG
Gin
CTT
Leu
TGA
Arg Ser Glu GTG CGT CCC Val Arg Pro 40 ACT CAC TTC Thr His Phe 55 CGG ACC AGC Arg Thr Ser 70 AGC GTG CAC Ser Val His TTC TTG GTG Phe Leu Val GTG AAG ATG Val Lys Met 120 GGC CGC TTC Gly Arg Phe 135 GAG CTG CTG Glu Leu Leu 150 CCG CTG CGC Pro Leu Arg CGC ATC GTG Arg Ile Val AAC CCG GTA Asn Pro Val 200 CCGGCTG CCGC Pro
CGG
Arg
CGC
Arg
GCG
Ala
GGG
Gly
CGC
Arg 105
GCT
Ala
CAC
His
GAG
Glu
CAG
Gin
GCC
Ala 185
CTC
Ser Ser Ser Ser Ser Ser Ser
CCC
Prc
ACC
Thr
CTC
Leu
GCG
Al a 90
GAC
Asp
TCG
Ser
TTG
Leu
CAC
His
CGC
Arg 170
GCC
Ala
CGT
*TGC
Cys
TTC
Phe
CTG
Leu 75
CAC
His
AGT
Ser
GGC
Gly
GAC
Asp
TAC
Tyr 155
CGC
Arg
GTG
Val1
GAC
CCG
Pro
CGC
Arg
GAC
Asp
GAG
Glu
CGT
Arg
CCC
Pro
GGC
Gly 140
GTG
Val1
GTG
Val1
GGT
Gly
TAC
GCG
Ala
TCC
Ser
GCC
Ala
CGG
Arg
CAA
Gin
ACG
Thr 125
AGC
Ser
GCG
Ala
CGG
Arg
CGC
Arg
CTG
GTC
Val1
CAC
His
TGC
Cys
CTG
Leu
CGG
Arg 110
AGC
Ser
CGC
Arg
GCG
Ala
CCG
Pro
GAG
Giu 190
AGT
CCA
Pro
TCC
Ser
GGC
Gly
CGT
Arg
AAC
Asn
ATC
Ile
GAG
Giu
CCG
Pro
CTG
Leu 175
AAC
Asn
TCC
GCC
Ala
GAT
Asp
TTC
Phe
GCC
Ala
TGC
Cys
CGC
Arg
ACC
Thr
CGC
Arg 160
CAG
Gin
CTG
Leu
TTC
144 192 240 288 336 384 432 480 528 576 624
a Leu Arg Asp Tyr Leu Ser Ser Phe 205 TGTGCC GCAGCATTAA GTGGGGGCGC Pro Phe Gin Ile 210 CTTATTATTT CTTATTATTA ATTATTATTA TTTTTCTGGA GCCTGGGTCG GAGGGAGTGG TTGTGGAGGG TGAGATGCCT TCATCCCACC TCTCAGGGGT GGGGGTGCTC CCCTCCTGGT GGTTGTAGCA GCTTGTGTCT GGGGCCAGGA CCTGAATTCC ACATATTCCC AGTATCTTTG CACAAACCAG GGGTCGGGGA CTGCTGTGCA GAATATCCTA TTTTATATTT TTACAGCCAG ACCACGTGGG AGCCCTCCCC CCCACTTCTG GCTGGAGACC GCTCCCTCCG GGTCCCCCCT ACTCCTACCT CTCCATGTTT GGGTCTCTGG CTTCATTTTT TTTAGGTAAT AAACTTTATT 736 796 856 916 976 1036 P:\OPER\VPA\VPACOM-l'SOCSDI--l.WPD 1/11/01 125- ATGAAAGTTT TTTTTTAAAA GAAAAAAAAA AAAAAAAAA INFORMATION FOR SEQ ID NO:4: SEQUENCE CHARACTERISTICS: LENGTH: 212 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 1075 Met Val Ala Arg 1 Ala Ser Pro Tyr Tyr 0 Glu 0000Phe Va41 Arg 0 0 0 Ala Pro Glu Pro Ala Arg Trp Pro Phe His 130 Asp Met Leu Arg Phe 210 Pro Ala Pro Arg Gly Val1 Ala 115 Phe Cys Leu Cys Ile 195 Gin Arg Ala Gly Ile Pro Gly 100 Leu Gin Leu Gly Arg 180 Pro Ile Asn 5 Arg Pro Asp Thr Leu Thr Ser Ala Phe Ala 165 Gin Arg Val1 Thr Arg 70 Ser Phe Val1 Gly Giu 150 Pro Arg Ser Arg His 55 Thr Val1 Leu Lys Arg 135 Leu Leu Ile Glu Pro 40 Phe Ser His Val1 Met 120 Phe Leu Arg Val1 Val 200 Gin Val Ala Ala Pro 25 Arg Arg Ala Gly Arg 105 Al a His Glu Gin Ala 185 Asp Asn Ala 10 Ser Ser Ser Pro Cys Pro Thr Phe Arg Leu Leu Asp 75 Ala His Glu 90 Asp Ser Arg Ser Gly Pro Leu Asp Gly 140 His Tyr Val 155 Arg Arg Val 170 Ala Val Gly Ile Ser Ala Ser Ala Arg Gin Thr 125 Ser Ala Arg Arg Leu 205 Ser Ser Val1 His Cys Leu Arg 110 Ser Arg Ala Pro Giu 190 Pro Ser Pro Ser Gly Arg Asn Ile Giu Pro Leu 175 Asn Ala Ser Ala Asp Phe Ala Cys Arg Thr Arg 160 Gin Leu Leu Asn Pro Leu Arg Asp Tyr Ser Ser Phe INFORMATION FOR SEQ ID P:\OPER\VPAVPACOM-l\SOCSDI-I.WPD- 1111/01 126- SEQUENCE CHARACTERISTICS: LENGTH: 1121 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (ix) FEATURE: NAME/KEY: CDS LOCATION: 
223..819 (xi) SEQUENCE DESCRIPTION: SEQ ID GCGATCTGTG GGTGACAGTG TCTGCGAGAG ACTTTGCCAC ACCATTCTGC CGGAATTTGG AGAAAAAGAA CCAGCCGCTT CCAGTCCCCT CCCCCTCCGC CACCATTTCG GACACCCTGC ACACTCTCGT TTTGGGGTAC CCTGTGACTT CCAGGCAGCA CGCGAGGTCC ACTGGCCCCA GCTCGGGCGA CCAGCTGTCT GGGACGTGTT GACTCATCTC CC ATG ACC CTG CGG Met Thr Leu Arg 120 180 234
TGC
Cys CTG GAG CCC TCC Leu Glu Pro Ser GGG AAT Gly Asn 10 GGA GCG GAC AGG Gly Ala Asp Arg ACG CGG AGC CAG Thr Arg Ser Gln
TGG
Trp GGG ACC GCG GGG Gly Thr Ala Gly
TTG
Leu CCG GAG GAA CAG Pro Glu Glu Gln
TCC
Ser 30 CCC GAG GCG GCG Pro Glu Ala Ala CGT CTG Arg Leu 330 378 GCG AAA GCC Ala Lys Ala ATG ACT GTT Met Thr Val
CTG
Leu 40 CGC GAG CTC AGT CAA ACA GGA TGG TAC Arg Glu Leu Ser Gln Thr Gly Trp Tyr 45 TGG GGA AGT Trp Gly Ser CCA GAA GGA Pro Glu Gly AAT GAA GCC AAA Asn Glu Ala Lys AAA TTA AAA GAG Lys Leu Lys Glu
GCT
Ala ACT TTC Thr Phe TTG ATT AGA GAT Leu Ile Arg Asp TCG CAT TCA GAC Ser His Ser Asp
TAC
Tyr CTA CTA ACT ATA Leu Leu Thr Ile
TCC
Ser GTT AAG ACG TCA Val Lys Thr Ser
GCT
Ala 90 GGA CCG ACT AAC Gly Pro Thr Asn CGG ATT GAG TAC Arg Ile Glu Tyr
CAA
Gin 100 GAT GGG AAA TTC Asp Gly Lys Phe
AGA
Arg 105 TTG GAT TCT ATC Leu Asp Ser Ile
ATA
Ile 110 TGT GTC AAG TCC Cys Val Lys Ser AAG CTT Lys Leu 115 570 AAA CAG TTT Lys Gln Phe TGC AAG GAT Cys Lys Asp 135
GAC
Asp 120 AGT GTG GTT CAT Ser Val Val His
CTG
Leu 125 ATT GAC TAC TAT Ile Asp Tyr Tyr GTC CAG ATG Val Gln Met 130 GGG ACT GTT Gly Thr Val AAA CGG ACA GGC Lys Arg Thr Gly
CCA
Pro 140 GAA GCC CCA CGG Glu Ala Pro Arg
AAT
Asn 145 P:\OPER\VPA\VPACOM-\SOCSD-l .WPD- 1/11/01 127- CAC CTG TAC CTG ACC AAA CCT CTG TAT ACA TCA His Leu Tyr Leu Thr Lys Pro Leu Tyr Thr Ser 150 155 CAT TTC TGT CGA CTC GCC ATT AAC AAA TGT ACC His Phe Cys Arg Leu Ala Ile Asn Lys Cys Thr 165 170 175 CTG CCT TTA CCA ACA AGA CTA AAA GAT TAC TTG Leu Pro Leu Pro Thr Arg Leu Lys Asp Tyr Leu 185 190 CAG GTA TAAGTATTTC TCTCTCTTTT TCGTTTTTTT TT Gin Val GCCTCATATA GACTATCTCC GAATGCAGCT ATGTGAAAGA GGATAACTGC GCAGAATTCT CTCTTAAGGA CAGTTGGGCT AAGATGTAGC TAGGTATTTT AAAGTTCCCC TTAGGTAGTT TTCCTATGGC TGCTCAAGAT CAAATGGCCC TTTTAAATGA AAAAAAAAAA AAAAA INFORMATION FOR SEQ ID NO:6: SEQUENCE CHARACTERISTICS: LENGTH: 198 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6 Met Thr Leu Arg Cys Leu Glu Pro Ser Gly Asn 1 5 10 Arg Ser Gln Trp Gly Thr Ala Gly Leu Pro Glu 25 Ala Ala Arg Leu Ala Lys Ala Leu Arg Glu Leu 40 Tyr Trp Gly Ser Met Thr Val Asn Glu Ala Lys 55 Ala Pro Glu Gly Thr Phe Leu Ile Arg Asp Ser 70 75 Leu Leu Thr Ile Ser Val Lys Thr Ser Ala Gly 90 Ile Glu Tyr Gin Asp Gly Lys Phe Arg Leu Asp 100 105 Lys Ser Lys Leu Lys Gin Phe Asp Ser Val Val 115 120 GCA CCC ACT CTG CAG Ala Pro Thr Leu Gin 160 GGT ACG ATC TGG GGA Gly Thr Ile Trp Gly 180 GAA GAA TAT AAA TTC Glu Glu Tyr Lys Phe 195 kAAAAAAA AAAAACACAT GAACCCAGAG GCCCTCCTCT CAGTCTAACT TAAAGGTGTG TTAGCTGAAT GATGCTTTCT AACAAAACAA AACAAAACAA 714 762 810 866 926 986 1046 1106 1121 Gly Glu Ser Glu Ser Pro Ser His Ala Gin Gin Lys His Thr Ile Leu 125 Asp Ser Thr Leu Ser Asn Ile 110 Ile Arg Pro Gly Lys Asp Leu Cys Asp Thr Glu Trp Glu Tyr Arg Val Tyr P:\OPERVPAVPACOM-I'SOCSDI-I.WPD 1/11/01 128o r Tyr Val Gin Met Cys Lys Asp Lys Arg Thr Gly Pro Glu Ala Pro Arg 130 135 140 Asn Gly Thr Val His Leu Tyr Leu Thr Lys Pro Leu Tyr Thr Ser Ala 145 150 155 160 Pro Thr Leu Gin His Phe Cys Arg Leu Ala Ile Asn Lys Cys Thr Gly 165 170 175 Thr Ile Trp Gly Leu Pro Leu Pro Thr Arg Leu Lys Asp Tyr Leu Glu 180 185 190 Glu Tyr Lys Phe Gin Val 195 
INFORMATION FOR SEQ ID NO:7: SEQUENCE CHARACTERISTICS: LENGTH: 2187 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (ix) FEATURE: NAME/KEY: CDS LOCATION: 18..695 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: CGCTGGCTCC GTGCGCC ATG GTC ACC CAC AGC AAG TTT CCC GCC GCC GGG Met Val Thr His Ser Lys Phe Pro Ala Ala Gly 1 5 ATG AGC CGC CCC CTG GAC ACC AGC CTG CGC CTC AAG ACC TTC AGC TCC Met Ser Arg Pro Leu Asp Thr Ser Leu Arg Leu Lys Thr Phe Ser Ser 20 AAA AGC GAG TAC CAG CTG GTG GTG AAC GCC GTG CGC AAG CTG CAG GAG Lys Ser Glu Tyr Gin Leu Val Val Asn Ala Val Arg Lys Leu Gin Glu 30 35 AGC GGA TTC TAC TGG AGC GCC GTG ACC GGC GGC GAG GCG AAC CTG CTG Ser Gly Phe Tyr Trp Ser Ala Val Thr Gly Gly Glu Ala Asn Leu Leu 50 CTC AGC GCC GAG CCC GCG GGC ACC TTT CTT ATC CGC GAC AGC TCG GAC Leu Ser Ala Glu Pro Ala Gly Thr Phe Leu Ile Arg Asp Ser Ser Asp 65 70 CAG CGC CAC TTC TTC ACG TTG AGC GTC AAG ACC CAG TCG GGG ACC AAG Gin Arg His Phe Phe Thr Leu Ser Val Lys Thr Gin Ser Gly Thr Lys 85 AAC CTA CGC ATC CAG TGT GAG GGG GGC AGC TTT TCG CTG CAG AGT GAC Asn Leu Arg Ile Gin Cys Glu Gly Gly Ser Phe Ser Leu Gln Ser Asp 100 105 98 146 194 242 290 338 P:\OPER\VPAVPACOM-ISOCSD-l.WPD- 1/11/01 129- CCC CGA AGC Pro Arg Ser 110 GTG CAC CAC Val His His ACG CAG CCA GTT Thr Gin Pro Val CCC CGC Pro Arg 115 CCA GGG Pro Gly TTC GAC TGT GTA CTC AAG CTG 386 Phe Asp Cys Val1 120
TTT
Phe Leu Lys Leu TCT TTG CCA Ser Leu Pro TAC ATG CCG Tyr Met Pro 125 CCC ACG Pro Thr
CCT
Pro 130
GAA
Glu ACC CCC Thr Pro
TCC
Ser 135
CCA
Pro GAA CCC TCG Glu Pro Ser 140
CTC
Leu
TCC
Ser 145
CCC
Pro GTT CCG GAG Val Pro Giu
CAG
Gin 150
TAC
Tyr CCT GCC CAG Pro Ala Gin
GCA
Ala 155 CCC GGG AGT Pro Gly Ser
ACC
Thr 160
CTG
Leu AAG AGA GCT Lys Arg Ala
TAC
Tyr 165
CCT
Pro ATC TAT TCT Ile Tyr Ser GGG GGC Gly Gly 170 530 GAG AAG ATT Glu Lys Ile ACC CTC CAG Thr Leu Gin 190 TAT GAG AAA Tyr Giu Lys 205
CCG
Pro 175
CAT
His GTA CTG AGC Val Leu Ser
CGA
Arg 180
ACT
Thr CTC TCC TCC Leu Ser Ser CTT TGT CGG Leu Cys Arg
AAG
Lys 195
CCT
GTC AAC GGC Vai Asn Gly
CAC
His 200
GAG
Glu AAC GTG GCC Asn Val Ala 185 CTG GAC TCC Leu Asp Ser TTC CTG GAT Phe Leu Asp GTG ACC CAG CTG Val Thr Gin Leu Pro 210 GGA CCC ATT Gly Pro Ile
CGG
Arg 215
C
CAG TAT GAT GCT CCA CTT TAAGGAGCAA AAGGGTCAGA GGGGGGCCTG Gin Tyr Asp Ala Pro Leu 220 225 GGTCGGTCGG TCGCCTCTCC TCCGAGGCAC ATGGCACAAG CACAAAAATC CAGCCCCAAC
GGTCGGTAGC
GCAGAGTAGA
TTCCCCCCTC
ACAATACCTT
AGGGAGGTGG
TCCCGCTGGA
GGATGGAAGA
CACTGCCCAA
ACCTGAAGAG
CAGATCCCTT
GACAGATGAG
ATGAGCCATC
CCTATGTGGG
CACAAGGAGC
TCCCAGTGAG
GCTGGCAGGA
CCCCAGCTCC
TGACAAGCGG
GGACACCTCC
ACTTGTTTGC
GAAAAGGGTG
CCTAGGTGAG
AGCTATACTG
GCACCCCAGA
GCTGGTGAGC
TTGGAGCCCA
GCTAGGAGAC
CAAACACAGC
CCAGGGGCAG
CCTGGAATTC
AGCTTCTTTC
ACTCTCCCCT
AAGTGTTGAA
GCTTTGATTT
TGTGAAGGGT
GAGTGGTGGC
GTGCCAGGCT
ACCCTCCCCG
TGGCCGCCTT
GGTTTCCCCT
TCGCCTTAAA
CAATAGGCAG
ATTGGCTTCT
GTCTGAGGGG
AAGTGGAGCC
CCCCTTCCTC
CTTAGAACTG
GGTTTGATCA
TTTTATGCTG
TCCTGGCTCT
CCTCTCCATG
TTGTGAAGAG
TTCCAACACC
GGAGCAGATG
TGCCCTCTGT
AGAGTTGAGG
TCCTCAGGCC
AGGGGGAGCT
AGCCGGCCTG
CACACCCCCT
CAAGGGGAAT
AGAGCAGGCA
GCCAAAGAAA
GGGGAGAGTG
GGGCAGCTAA
GCAGTAGCAT
GAAGGGAGGC
GAGGGTTCTG
CCCAGGGATG
GATTCACCCA
CTCCACTCCC
GCCACCTGCT
GCCTGGTGGG
CTGCTTCCCA
CTTCAAACTT
CCTGGGGGAA
TAACCACTCC
GCAAGGGGTG
TGAAACCTCG
TTAGAAGGGA
AGATCAACAG
CTTTGTCTCT
GGGATTGGCA
GGTGGCTACA.
782 842 902 962 1022 1082 1142 1202 1262 1322 1382 1442 1502 1562 1622 P:\OPERWVPA\VPACOM-l\SOCSDI--l.WPD 1/1/0 130-
GGCCAGGGGA
ACTGGGAAGA
GCCCTGCACA
CATGCCGCTC
CTGTGTTGGG
TGTTGGGGTG
TTATAAAAAT
GATGCTTGAA
TTATACTCAG
AGTGGCTGCA
CATTGGCCAG
GCCCTCCTTT
ACAGGGGCCT
GAGAAACAAG
GCTTTTTTCT
CCACCTCCAG
AAACTCAACC
AAAAGAAACA
GGGAGAGAC
TCCTAGTCAT
CTCACCTGGG
CACGGGAATG
TTTTCTGAAG
CTCTGTTTTG
CCCGCCCCTC
AAATCCCAGT
TTTCAGTAAT
CCAGTCACTC
CTCTCGGTCA
GGGAGGCAGG
CAGCAGCCAT
TCAGGTATGG
AATAATGTTT
TCCCCACTCA
TCAACTCAGA
TTATAATAAA
CAGGAGACTC
GTAGGTCCGA
AGGTGATGGA
GCAATTACCT
GGCTGGGTGG
ACAATTTGCC
GGCCTTCGAG
CTTTGCACAT
AGAGCACTAT
CTGAGTTAAC
GAGCTTCCAG
GAAGCCTTCC
GGAACTGGTC
GGCAGCTGTG
TCAATCACTT
GCTGTCTGAA
ATATTTATAT
TTTTTAATGA
1682 1742 1802 1862 1922 1982 2042 2102 2162 2187 ~AAAAAA AAAAA A~AA INFORMATION FOR SEQ ID NO:8: Ci) SEQUENCE CHARACTERISTICS: LENGTH: 225 amino acids TYPE: amino acid TOPOLOGY: linear
(i i) (xi)
MOLECULE
SEQUENCE
TYPE: protein DESCRIPTION: SEQ ID NO:8: Met Val Thr His Ser Lys Phe Pro Ala Gly Met Ser Arg Pro Leu Asp Thr Ser Leu Val Val 35 Ser Ala Val Leu Arg Leu Lys Thr Phe 25 Ser Lys Ser Giu Tyr Gin Phe Tyr Trp Asn Ala Val Thr Gly Gly Arg Lys 40 Giu Ala 5 Leu Gin Glu Ser Gly Asn Leu Leu Ala Gly Leu Gin Ser Ala Giu Pro Thr Phe Leu Thr Ile 70 Arg Asp Ser Ser Arg His Phe Phe Leu Ser Val Lys Ser Thr Gin Ser Gly Lys Asn Leu Arg Ile Gin Cys Giu Gly Pro Val Pro 115 Pro Pro Pro 130 Gly 100 Arg Phe Ser Leu Gin 105 Leu Asp Pro Arg Ser Thr Gin 110 Phe Asp Cys Val 120 Lys Leu Val His His Tyr Met 125 Thr Glu Pro Ser Gly Thr Pro Ser Phe 135 Ser Leu Pro Pro 140 Ser Giu Val Pro Glu Gin Pro Pro Ala Gin Ala Leu Pro Gly Ser Thr 150 155 160 P:\OPERWVPA\VPACOM-P'SOCSDI.WPD- 1/11/01 131 Pro Lys Arg Ala Tyr 165 Tyr Ile Tyr Ser Giy 170 Asn Val 185 Gly Giu Lys Ile Pro Leu 175 Val Leu Ser Cys Arg Lys 195 Arg 180 Pro Leu Ser Ser Ala Thr Leu Gin His Leu 190 Thr Vai Asn Giy His Leu Asp Ser Tyr Giu Lys Val Thr 200 205 Gin Leu 210 Pro Giy Pro Ile Arg 215 Giu Phe Leu Asp Gin 220 Tyr Asp Ala Pro INFORMATION FOR SEQ ID NO:9: SEQUENCE CHARACTERISTICS: LENGTH: 1094 base pairs TYPE: nucleic acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: S S
CTCCGGCTGG
TCTCCACAGC
CCGCGGCCCC
CGCACTTCCG
TCCTGGACGC
TGCGCGCCGA
TCGCCCTTAG
GCCGCTTTCA
ACTACGTGGC
CGCTGCAGGA
GCATCCCCCT
CGGCAGCGCC
GCCTGGAACC
GCGAGGCGCC
GGGGTCCTCC
GTATCTGGAG
TATCTTTGCA
AATCCTATTT
TTTTTTAAAA
CCCCTTCTGT
AGCAGAGCCC
CGCGCGCCCG
CACATTCCGT
CTGCGGATTC
GCCCGTGGGC
CGTGAAGATG
CCTGGATGGC
GGCGCCGCGC
GCTGTGCCGC
CAACCCCGTC
CGCCGTGCAC
ATGTGGGTAC
TCCCGCCCTC
CCCTCCTGGT
CCAGGACCTG
CAAACCAGGG
TATATTTTTT
AAAA
AGGATGGTAG
CGACGGCGGC
CGGCCGTGCC
TCGCACGCCG
TACTGGGGGC
ACCTTCCTGG
GCCTCGGGAC
AGCCGCGAGA
CGCATGCTGG
CAGCGCATCG
CTCCGCGACT
GCAGCATTAA
CCTCCCCGGC
GGCTGGAGAC
GCTCCCTCTG
AACTCGCACC
GTTGGGGGAG
AAAGTCAGTT
CACACAACCA
CAGAACCTTC
CCGCGGTCCC
ATTACCGGCG
CCCTGAGCGT
TGCGCGACAG
CCACGAGCAT
GCTTCGACTG
GGGCCCCGCT
TGGCCACCGT
ACCTGAGCTC
CTGGGATGCC
CTGGGTTGGA
GAGGCCGCAG
GGTCCCCCTG
TCCTACCTCT
GGTCTCTGGC
TAGGTAATAA
GGTGGCAGCC
CTCCTCTTCC
GGCCCCGGCC
CATCACGCGC
GCACGGGGCG
CCGCCAGCGG
CCGCGTGCAC
CCTCTTCGAG
GCGCCAGCGC
GGGCCGCGAG
CTTCCCCTTC
GTGTTATTTT
GGGAGCGGAT
ACCCCTTCTC
GTTGTTGTAG
TCATGTTTAC
TTTATTTTTC
ACTTTATTAT
GACAATGCAG
TCCTCCTCGC
CCCGGCGACA
GCCAGCGCGC
CACGAGCGGC
AACTGCTTTT
TTTCAGGCCG
CTGCTGGAGC
CGCGTGCGGC
AACCTGGCTC
CAGATTTGAC
GTTATTACTT
GGGTGTAGGG
ACCTCTTGAG
CAGCTTAACT
ATATACCCAG
TGCTGTGCAG
GAAAGTTTTT
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1094
INFORMATION FOR SEQ ID NO:10: SEQUENCE CHARACTERISTICS: LENGTH: 211 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
Met Val Ala His Asn Gln Val Ala Ala Asp Asn Ala Val Ser Thr Ala 1 Ala Arg Ser Glu Pro Arg Ala Ala Pro Arg Pro Glu Ser Ser Ser Ser Ser Ser Pro Ala Val Pro Ala Pro
Pro Ala Arg Pro Ala Pro Gly Arg 40 Arg Cys His Asp Thr His Arg Arg Phe 55 Ser Thr Phe Arg Ser Ala Ala Asp Tyr Ile Thr Arg 65 Trp Ala 70 Val Ala Leu Leu Cys Gly Phe Tyr Gly Pro Leu Ser Phe His Gly Ala His 90 Arg Leu Arg Ala Glu Pro Val Gly Thr 100 Ser Leu Val Arg Asp Ser 105 Ser Gly Arg Gln Arg Pro Thr Ser 125 Asn Cys Phe 110 Ile Arg Val Phe Ala Leu 115 Val Lys Met Ala 120 His Phe Gln Ala Gly Arg 130 Phe 135 His Leu Asp Gly Ser 140 Arg Glu Ser Phe Asp 145 Cys Leu Phe Glu Leu Leu 150 Glu His Tyr Val Ala Ala Pro 155 Arg Arg 160 Gln Glu 175 Met Leu Gly Ala Pro 165 Leu Arg Gln Arg Arg 170 Val Arg Pro Leu Leu Cys Arg Arg Ile Pro 195 Gln 180 Arg Ile Val Ala Thr 185 Val Gly Arg Glu Asn Leu Ala 190 Ser Phe Pro Leu Asn Pro Val Leu 200 Arg Asp Tyr Leu Ser 205 Phe Gln Ile 210
INFORMATION FOR SEQ ID NO:11: SEQUENCE CHARACTERISTICS: LENGTH: 2807 base pairs TYPE: nucleic acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
GGAAACCGAG
CAAACAGAGA
AACTCACCCG
ACAAACTGAA
CCAAGGTGTG
GCTCTGTAAG
GGGCCTCAGT
AACCCAACCC
CGCCTTAGGC
CAAGGCCAGG
GCCCCACAGG
CGGGTCCTGG
GCTCCAGGCG
GTCAGCCCCA
GCGGGGAGAC
AACCTGTAGA
CCTTCATTCA
TCACGAAACC
TACGGGGCAT
ACAGAGGCTA
TTCTTCGGTT
CGCCCAGTTT
TTCCTTTGAA
CCGAGTGGCC
TCTCCAGGGC
GCAGGAAGGA
CGGTGGAGCT
AGCTAGCATC
CAGGAGGCCT
GGGCAGTGTG
TAAACATCGT
ACAGTGTCCT
GGGAGCCCTT
GGAAGACAAA
GCCCACGTAG
CCGAGGAACT
GCCTCTGCGG
AACGGGAGGG
TGGCTAGCCG
TCCTGGCAGG
CTGACCAGGA
CCACCCGGGG
TGGCCTCAGA
CGTCACTTAG
CAGCTAGGCA
TAAAATAGGT
GTGCAGAGAT
GTTGGGGGCT
GAGTGCAGAG
CGTCCGGGAG
TCAGGCCACC
GCCCGCGCGC
GGCTCCTAGA
GAGGAGTTGC
GAATGCACAC
AGCAGCGATG
GCTTCAGAGT
CTCAGGGAAG
CCTACTCCTG
CTGACCGCCT
GCTTGCAGGA
ACAGCTTCTT
AGTCCAGCCC
CGGGGGCGCC
GCTTCCTGGG
GATTCTGGAG
GCGGAGACTG
TTGGGGGGTG
ACTCGGAGGG
TGGGGCGAAG
CGCGTGGCAG
CTGCACGCGA
GGCTTTCAGG
GAATCCCTGG
GCCTTGAGGG
GTCCTGCCCG
CTGGGGACCC
CCTCCCGCAC
AAGCCCAAGC
GAGGGCGGCG
CCAAGGCCTT
GGGGGGAAAG
GAGGAGGCGT
GTAGCCAGAG
120 180 240 300 360 420 480 540 600 660 720 780 840
CAAAAGAGCA
AAGTCCCATT
GCCAAAGAAA
TGGGCGGGAT
AGCAGAGAGA
GCAGCCCCGG
CTGCGAAGGA
CTCGCAGACT
GCCCAGGCGG
CGGATCGTCC
CAAGCCACCC
GCGCGGGGAG
CTTCGCAGAT
CCTTGGGGTC
CCCATCAGCG
GGCACCAGGT
CAGGGAAGGT
TCTAAAGAGA
CGGTGGGCGG
ACTGCGGCCG
AACCCCCAGC
GCAGGCGGGA
GCATGGCGGG
CCCCTCGCGC
GCCCGGGTTC
AGCGGGGACG
GCAGGGCTCC
GAGCCCACCG
CGCTGGCCGG
CAGCCCCGGA
GGTAGCACGT AACCAGGTGG
GCGGCCAGAG
GCCCTGCCCG
CCACTCTGAT
CTGGGGACCC
CTTCTTGGTG
CCGCGAGACC
CATGTTGGGG
GCGCATCGTG
CCGTGACTAC
ATTAAGTGGG
GTGGGAGCCC
TTCTGGCTGG
GTCCCCCTGG
CCATGTTTAC
TCATTTTTCT
ACTTTATTAT
CCATCCTCGT
GTGGTCCCGG
TACCGGCGCA
CTGAGCGTGC
CGCGACAGTC
ACGAGCATTC
TTCGACTGCC
GCCCCACTGC
GCCGCCGTGG
CTGAGTTCCT
AGCGCCTTAT
TCCCCGCCTA
AGACCTTATC
TTGTAGCAGC
ATGTTCCCAG
GCTGTGCAGA
GAAAGTTTTT
GACACGAAAC
GCGAGGCGAG
ACCCGAAGGA
GGCCTCCCTG
TGGCAGCGGC
CGCGGCGCCC
GGGGATGGGA
GTCGTGGATG
GCGCGGGGCG
CAGTTCCCGG
GCCTGGAGTC
ACCGCCAGTC
AGGCTCAGGC
CCTGTGCCAC
CGCTATGGCC
AAGCCGACAA
CCTCGTCTTC
CCCCGGCTCC
TCACGCGGAC
ATGGGGCGCA
GCCAGCGGAA
GTGTGCACTT
TCTTCGAGCT
GCCAGCGCCG
GTCGCGAGAA
TCCCCTTCCA
TATTTCTTAT
GGTCGGAGGG
CCGCCTCTCG
TTGTGTCTGGC
TATCTTTGCA
ATATTCTATT
LTTTTTAAAG
AGAAGATTCC
AACGAGTTAG
CTTGCCGGAA
GTTTAAGAGC
ACGGCTCCCG
CGCGTCCCGC
GGAAGGGGAG
CTATGCCTCT
CCGTCAGCCC
CGTGGCCAGT
GGGCCCCTCT
TGGAAGGGTT
TCCGGGCGGA
CCGGACGCCC
CACCCCTCCA
TGCGATCTCC
GTCCTCGCCG
GGGCGACACT
CAGCGCTCTC
CGAACGGCTG
CTGCTTCTTC
CCAGGCCGGC
GCTGGAGCAC
CGTGCGGCCG
CCTGGCACGC
GATCTGACCG
TATTAATTAT
%.GTGGGTGTG
3GGGGCCTCC 3GCCAGGACC
'AAACCAGGG
AGAGAAACCG
TTGATGCAGG
GCCCCGGAGC
CGCCAGGTGA
CAGAGCCTGG
GGCGCCCGCC
CTCCTCTCCG
AGGCGGCAAC
CCACGCCCCC
CCACATACAG
TTCTGCGTGT
GGTTCACTGC
GCTGGCCCCT
CCGGCATCAG
GCGGCCCCGG
CACTTCCGCA
CTGGACGCCT
CGTTCCGAAC
GCGCTCAGCG
CGCTTCCACC
TACGTGGCGG
CTGCAGGAGC
ATCCCTCTTA
GCTGCCGCCG
TATTATTTTT
GAGGGTGAGA
CCTCCTGGTG
TGAACTCCAC
GTGGGGGAGG
AAAGCGGCGG
GGCGGGCAGC
ATGCGCGACA
GCCGAGGCAG
CAGGACTATC
CCACCGGCTC
GCCCTGAGCC
CGCGAGGCGG
TTCTCCACGC
GAACGGCCTA
CACCCTCGCT
CTCTGTCTCC
CGAGTAGGAT
AGCCCCGACG
CGCGTCCCCG
CCTTCCGCTC
GCGGCTTCTA
CCGTGGGCAC
TGAAGATGGC
TGGACGGCAA
CGCCGCGCCG
TGTGTCGCCA
ACCCGGTACT
TGCCCGCAGA
CTGGAACCAC
TCCCTCCCAC
.TCCCTCCCG
3CCTACCTCT 3
TCTCTGGCT
TAGATAATAA
GGGTAGAGCC AGAACCCCAG GTGGACCCTC TCCAGGGGCA 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2807 ETATATTTTT ACATCCAGTT kAACAAAGAT TTCTAGA INFORMATION FOR SEQ ID NO:12: SEQUENCE CHARACTERISTICS: LENGTH: 212 amino acids TYPE: amino acid P:\OPER\VPA\VPACOM-l'SOCSDI-I.WPD 1/11101 135- STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12: Met Val Ala Arg Asn Gin Val Glu Ala Asp Asn Ala Ile Ser Glu Pro Arg Pro Ala Ala Arg Arg Pro Glu Pro 25 Arg Ser Ser Ser Ser Pro Ala Ser Ser Ser Val Pro Ala His Ser Asp Ser Pro Ala Arg Pro Cys Pro Pro Ala Pro Val1 Ser Gly Asp Thr Arg Thr Phe Tyr Arg Arg Asp Arg Ile Thr Tyr Arg 70 Ser Ser Ala Leu Leu 75 His Ala Cys Gly Phe Trp Gly Pro Leu Thr Val His Gly Ala 90 Asp Glu Arg Leu Arg Ser Glu Pro Val Phe Phe Ala 115 Val His Phe Gly 100 Leu Phe Leu Val Arg 105 Ala Ser Arg Gin Ser Val Lys Met 120 Phe Ser Gly Pro Thr 125 Asn Arg Asn Cys 110 Ser Ile Arg Arg Glu Thr Gin Ala Gly His Leu Asp 130 Phe Asp 145 Gly 140 Val1 Cys Leu Phe Glu 150 Leu Glu His Tyr 155 Ala Ala Pro Arg Met Leu Gly Ala Pro Leu Arg Gln Arg Arg Val Arg Pro Leu Gln 165 17017 175 P:\OPER\VPA\VPACOM-I\SOCSDI-I .WPD 1/11/01 136- Glu Leu Cys Arg Gin Arg Ile Val Ala Ala Val Gly Arg Glu Asn Leu 180 185 190 Ala Arg Ile Pro Leu Asn Pro Val Leu Arg Asp Tyr Leu Ser Ser Phe 195 200 205 Pro Phe Gin Ile 210 INFORMATION FOR SEQ ID NO:13: SEQUENCE CHARACTERISTICS: LENGTH: 1611 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (ix) FEATURE: NAME/KEY: CDS LOCATION: 263..1529 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: CGAATTCCGG GCGGGCTGTG TGAGTCTGTG AGTGGAAGGC GCGCCGGCTC TTTTGTCTGA GTGTGACCCG GTGGCTTTGT TCCAGGCATT CCGGTGATTT CCTCCGGGCA GTCCGCAGAA 120 GCCGCAGCGG CCGCCCGCGC TCTCTCTGCA GTCTCCACAC CCGGGAGAGC CTGAGCCCGC 180 GTCACGCCCC TCAGCCCCCG CTGAGTCCCT TCTCTGTTGT CGCGTCCGAA TCGAGTTCCC 240 GGAATCAGAC 
GGTGCCCCAT AG ATG GCC AGC TTT CCC CCG AGG GTT AAC GAG 292 Met Ala Ser Phe Pro Pro Arg Val Asn Glu 1 5 AAA GAG ATC GTG AGA TCA CGT ACT ATA GGG GAA CTC TTG GCT CCA GCA Lys Glu Ile Val Arg Ser Arg Thr Ile Gly Glu Leu 20 GCT CCT TTT GAC Ala Pro Phe Asp GCT CCT GAT GGT Ala Pro Asp Gly AAG AAA TGT GGT Lys Lys Cys Gly GGT GAG AAC TGG Gly Glu Asn Trp 35 TGG TCA CAA GGA Trp Ser Gln Gly Leu Ala Pro Ala ACG GTT GCT TTT Thr Val Ala Phe TAT CGC ATA GTG Tyr Arg Ile Val CTT TTG CAT GGT Leu Leu His Gly 388 TCC TAC TTT Ser Tyr Phe
GCG
Ala 50 AAG CTT Lys Leu GTC CCG TGG TCC CAG Val Pro Trp Ser Gln 65 TGC CGT AAG AAC TTT Cys Arg Lys Asn Phe AGC TGT CTA AAA TTG Ser Cys Leu Lys Leu 85 484 AAA AAT GTT ACC AAT Lys Asn Val Thr Asn GCA AGA CAA Ala Arg Gln
AAC
Asn
AGT AAT GGT GGT Ser Asn Gly Gly GGA GAC ATA GTC Gly Asp Ile Val 110
CAG
Gln 95 AAA AAC AAG CCT Lys Asn Lys Pro
CCT
Pro 100 GAG CAC GTT ATA Glu His Val Ile GAC TGT Asp Cys 105 532 580 628 TGG AGT CTT GCT Trp Ser Leu Ala
TTT
Phe 115 GGG TCT TCA GTT CCA GAA AAA Gly Ser Ser Val Pro Glu Lys 120 CAG AGT CGT Gln Ser Arg 125 TGC GTT AAT ATA Cys Val Asn Ile
GAA
Glu 130 TGG CAT CGG TTC Trp His Arg Phe
CGA
Arg 135 TTT GGA CAG Phe Gly Gln 676 GAT CAG Asp Gln 140 CTA CTC CTT GCC ACA GGA TTA AAC AAT Leu Leu Leu Ala Thr Gly Leu Asn Asn 145 GGT CGC ATC AAA ATC Gly Arg Ile Lys Ile 150
TGG
Trp 155 GAT GTA TAT ACA Asp Val Tyr Thr GGA AAA Gly Lys 160 CTC CTC CTT AAT TTG GTA GAC CAC Leu Leu Leu Asn Leu Val Asp His 165
ATT
Ile 170 772 820 GAA ATG GTT AGA GAT TTA ACT Glu Met Val Arg Asp Leu Thr TTT GCT CCA GAT GGG AGC TTA CTC CTT Phe Ala Pro Asp Gly Ser Leu Leu Leu GTA TCA GCT Val Ser Ala GAT GGA AAC Asp Gly Asn 205 TCA AGA Ser Arg 190 GAC AAA ACT Asp Lys Thr CTA AGA GTG TGG GAC Leu Arg Val Trp Asp 195 CGG GCA CAT CAG AAT Arg Ala His Gln Asn 215 CTG AAA GAT Leu Lys Asp 200 TGG GTG TAC Trp Val Tyr 868 ATG GTG AAA GTA Met Val Lys Val
TTG
Leu 210 AGT TGT Ser Cys 220 GCA TTC TCT CCC Ala Phe Ser Pro
GAC
Asp 225 TGT TCT ATG CTG Cys Ser Met Leu TGT TCA GTG GGC GCC Cys Ser Val Gly Ala 230
AGT
Ser 235 AAA GCA GTT TTC Lys Ala Val Phe
CTT
Leu 240 TGG AAT ATG GAT Trp Asn Met Asp
AAA
Lys 245 TAC ACC ATG ATT Tyr Thr Met Ile 1012 AAG CTG GAA GGT Lys Leu Glu Gly
CAT
His 255 CAC CAT GAT GTT His His Asp Val
GTA
Val 260 GCT TGT GAC TTT Ala Cys Asp Phe TCT CCT Ser Pro 265
GAT
Asp GGA GCA TTG Gly Ala Leu 270 CTA GCT ACT GCA Leu Ala Thr Ala
TCC
Ser 275 TAT GAC ACT CGT Tyr Asp Thr Arg GTG TAT GTC Val Tyr Val 280 CAC CTG TTT His Leu Phe 1060 1108 1156 TGG GAT CCA Trp Asp Pro 285 CAC AAT GGA GAC His Asn Gly Asp
CTT
Leu 290 CTG ATG GAG TTT Leu Met Glu Phe
GGG
Gly 295 CCC TCG Pro Ser 300 CCC ACT CCA ATA Pro Thr Pro Ile
TTT
Phe 305 GCT GGA GGA GCA Ala Gly Gly Ala AAT GAC CGA TGG GTG Asn Asp Arg Trp Val 310
AGA
Arg 315
GAT
Asp GCT GTG TCT TTC Ala Val Ser Phe
AGT
Ser 320 CAT GAT GGA CTG His Asp Gly Leu
CAT
His 325 GTT GCC AGC Val Ala Ser CTT GCT Leu Ala 330 CCG GTA Pro Val 345 1204 1252 1300 GAT AAA ATG Asp Lys Met
GTG
Val 335 AGG TTC TGG Arg Phe Trp AGA ATC GAT GAG GAT TGT Arg Ile Asp Glu Asp Cys CAA GTT GCA CCT TTG AGC AAT GGT CTT TGC TGT GCC TTT TCT ACT GAT 1348 Gln Val Ala Pro 350 Leu Ser Asn Gly Leu Cys Cys Ala 355 CAT GAT GGA AGT His Asp Gly Ser Phe Ser Thr Asp 360 GGC AGT Gly Ser
GTT
Val 365 TTA GCT GCT GGG Leu Ala Ala Gly
ACA
Thr 370
GTG
Val 375 TAT TTT TGG Tyr Phe Trp
GCC
Ala
ATC
Ile 395 ACT CCA AGG CAA GTC Thr Pro Arg Gln Val 380 CGA AGA GTG ATG TCC Arg Arg Val Met Ser 400
CCT
Pro 385 AGC CTT CAA CAT Ser Leu Gln His TCC AAA ATA Ser Lys Ile TTG GCG Leu Ala 415
TTT
Phe ACC CAA GAA Thr Gln Glu CTC TCC TAC Leu Ser Tyr GTC CAA Val Gln 405 CGC GGT Arg Gly 420 ATA TGT CGC ATG TCA Ile Cys Arg Met Ser 390 AAA CTG CCT GTT CCT Lys Leu Pro Val Pro 410 TAG A CTGAAGACTG 1396 1444 1492 1539 CCTTTCCTGG TAGGCCTGCC AGACAGAGCG CCCTTTACAA GACACACCTC AAGCTTTACC TCGTGCCGAA TT 1599 1611
INFORMATION FOR SEQ ID NO:14: SEQUENCE CHARACTERISTICS: LENGTH: 422 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
Met Ala Ser Phe Pro Pro Arg Val Asn Glu Lys Glu Ile Val Arg Ser Arg Thr Ile Gly Glu Leu Leu Ala Pro Ala Ala Pro Phe Asp Lys Lys 25 Cys Gly Gly Glu Asn Trp Thr Val Ala Phe Ala Pro Asp Val Gly Ser Tyr Pro Trp Ser Trp Phe Ala Gln Cys Ser Gln Gly Tyr Arg 55 Leu Leu Ile Val Lys Leu Lys Arg Lys Asn Ser Phe 70 Leu His Gly Ser 75 Ser Asn Val Thr Asn Ser Cys Leu Ala Arg Gln Asn Cys Asn Gly Gly Gln Lys Asn Lys Pro Leu Ala Phe 115 Ile Glu Trp Pro 100 Gly His Val Ile Asp 105 Glu Gly Asp Ile Ser Ser Val Pro 120 Phe Lys Gln Ser Arg 125 Leu Val Trp Ser Cys Val Asn Leu Leu Ala His Arg Phe
130 Thr Gly Arg 135 Arg Gly Gin Asp Gin 140 Asp Leu Asn Asn 145 Lys Gly 150 Leu Ile Lys Ile Trp 155 Glu Val Tyr Thr Gly 160 Leu Leu Leu Asn 165 Asp Val Asp His Met Val Arg Asp Leu 175 Thr Phe Ala Lys Thr Leu 195 Val Leu Arg Pro 180 Arg Gly Ser Leu Leu 185 Lys Val Ser Ala Val Trp Asp Leu 200 Trp Asp Asp Gly Asn 205 Ala Ser Arg Asp 190 Met Val Lys Phe Ser Pro Ala His Gin 210 Asp Cys Asn 215 Ser Val Tyr Ser Cys 220 Lys Ser Met Leu Val Gly Ala 225 Ser 235 Ala Val Phe Leu 240 Trp Asn Met Asp Lys Tyr Thr Met Ile Arg Lys Leu Glu Gly His His 245 25025 255 P:\OPER\VPA\VPACOM-I\SOCSDl-I.WPD 1/11/01 141 His Asp Val Val Ala Cys Asp Phe Ser Pro Asp Gly Ala Leu Leu Ala 270 His Asn Gly Thr Ala Ser 275 Tyr Asp Thr Arg Val1 280 Tyr Val Trp Asp Pro 285 Asp Leu 290 Leu Met Glu Phe Gly 295 His Leu Phe Pro Ser 300 Pro Thr Pro Ile Phe 305 Ala Gly Gly Ala Asp Arg Trp Val Arg 315 Ala Val Ser Phe Ser 320 Arg His Asp Gly Leu His 325 Val Ala Ser Leu Asp Asp Lys Met Phe Trp Arg Asn Gly Leu 355 Ile 340 Asp Glu Asp Cys Pro 345 Val Gln Val Ala Pro Leu Ser 350 Leu Ala Ala Cys Cys Ala Phe Ser 360 Thr Asp Gly Ser 9 *99* 9* 99 9 9 9 9* 9 99 9.
Gly Thr 370 His Asp Gly Ser Val 375 Tyr Phe Trp Ala Thr 380 Pro Arg Gln Val Pro 385 Ser Leu Gln His Ile 390 Cys Arg Met Ser Ile 395 Arg Arg Val Met Ser 400 Thr Gln Glu Val Gln Lys Leu Pro Val 405 Pro Ser Lys Ile Leu Ala Phe 410 415 Leu Ser Tyr Arg Gly 420
INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 783 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (xi) SEQUENCE DESCRIPTION: SEQ ID
CTGTCTTCCT CCGCAGCGCG AGGCTGGGTA CAGGGTCTAT TGTCTGTGGT TGACTCCGTA CTTTGGTCTG AGGCCTTCGG CCCCGCCCGT CTCCTCTGTC GTCGCCACTC TCTTCTCTGT CATAGATGGC CAGCTTTCCC TAGGTGAACT TTTAGCTCCT CTGTTGCTTT TGCTCCAGAT AGCTTGTTCC GTGGTCCCAG CCAATTCAAG CAGTTTAAGA CTCGTGACAT ATTATAGACT TCCAGAAAAA CAGAGTCGCT TCAGCTACTT CTTGCTACAG
GAGCTTTCCC
CCTGGGCCCG
TGTTGGGTCC
CCGAGGGTCA
GCAGCTCCTT
GGTTCATACT
TGCCTTCAGA
TTGCCAAGAC
GTGGAGATAT
GTGTAAATAT
GGTTGAACAA
GAGGCAGTTA
GGAGACAAAC
GCATCGTATT
ACGAGAAAGA
TTGACAAGAA
TTGCTTGGTC
ACTTTCTCTT
AAAATAGTGA
AGTCTGGAGT
AGAATGGCAT
GCAGAAGCCG
TTGGCGTCAC
CCCGGAATCA
GATCGTGAGA
ATGTGGTCGT
ACAAGGACAT
GCATGGCACC
TGGTGGTCAG
CTTGCTTTTG
CGCTTCAGAT
CAGCGACCGC
GCCCTCAGCG
GACGGTGCCC
TCACGTACTA
GAAAATTGGA
CGCACAGTAA
AAGAATGTTA
AAAAATAAGC
GGTCATCAGT
TTGGACAAGA
120 180 240 300 360 420 480 540 600 660 720 780 783 TGGGCGTATC AAAATATGGG ATGTATATCA
GGAAACTCCT CCTTAACTTG GTAGATCATA CTGAAGTGGT CAGAGATTTA ACTTTTGCTC
CAG
INFORMATION FOR SEQ ID NO:16: SEQUENCE CHARACTERISTICS: LENGTH: 1122 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear P:\OPERVPA\VPACOM-\SOCSDI-l.WPD. 1/11/01 143 (ii) MOLECULE TYPE: DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: CTCTGTATGT CTGAATGAAG CTATAACATT TGCCTTTTTA TTGCAGGTTT TCCTTTGGAA TATGGATAAA TACACCATGA TGACTTTTCT CCTGATGGAG CTGGGATCCA CATAATGGAG TCCAATATTT GCTGGAGGAG TGGACTGCAT GTTGCAAGCC GGATTATCCA GTGCAAGTTG TGGCAGTGTT TTAGCTGCTG3 *GCAGGTCCCT AGCCTGCAAC AGAAGTTCAG GAGCTGCCGA GAAGATTCTG CCTTCCCTAG CTTTACTGAC TTCAATTATC TCTTGTACTG CATTTTGATC TTTCTGAACA TATCAAATAT GTACATATTT AGATATAAGC AGTTCTGACA TGTATATATT CAAGGAAATT TTAAATTCTG
TACGGAAACT
CATTACTGGC
ACATTCTGAT
CAAATGACCG
TTGCTGATGA
CACCTTTGAG
GGACACATGA
ATTTATGTCG
TTCCTTCCAA
TAGTAGGGAC
TGTTTTTAAA
AGTTGAGCTT
AAATTTTTTT
TGCTATATGT
GCTTCAGTAG
GGACACTGAG
AGAAGGACAT
TACTGCATCT
GGAATTTGGG
GTGGGTACGA
TAAAATGGTG
CAATGGTCTT
CGGAAGTGTG
CATGTCAATC
GCTTTTGGAG
TGACAGAATA
GACGTAGAAG
TTAAAATATT
AAAGATCTAA
TGAATGGACC
AGCCACAATA
TTAGATGGTA
CACCATGATG
TATGATACTC
CACCTGTTTC
TCTGTATCTT
AGGTTCTGGA
TGCTGTGCCT
TATTTTTGGG
CGAAGAGTGA
TTTCTCTCGT
CACTTAACAC
ATTTATTTAA
ATTTATAGAC
CTGTGAAAAC
CTTTTGCTTT
TGTATCTTTG
k.ATACTGACT
TGGTAGCTTG
GACTATATAT
CCCCACCTAC
TTAGCCATGA
GAATTGATGA
TCTCTACTGA
CCACTCCACG
TGCCCACCCA
ATCGTATTTA
AAACCTCAAG
TTTGATATGT
AATAGAAGTA
ATACATACCT
TCTGATTTTT
CTGTAAAGTG
I'ACGAAAGTT
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 GAATTGGGTG AGGCGGGCAA ATCACCTGAG GTCAGCAGTT TGAGACTAGC CGCAC CTGGCAAACA P:OPER\VPA\VPACOM-l'SOCSD-I.WPD- 1/11/01 144- TGATGAAACC CTGTCTCTAC TAAAAATACA AAAAAAAAAA AA INFORMATION FOR SEQ ID NO:l7: SEQUENCE CHARACTERISTICS: LENGTH: 2537 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (ix) FEATURE: NAME/KEY: CDS LOCATION: 422..2029 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17: 1122 o CGGCACGAGC CGGGCTCCGT CCGGAGGAAG CGAGGCTGCG CCGCCGGCCC GGCAGGAGCG GAGGACGGGA GCGCGGGCGG CCTTGCCTGG CCGCAGGTGC CCCCCGCGGT CGCCCGGCGC TTTCCCCCGG CGCAGTCCTC CCCTCGGGCC GGGATGGATC CGCACGGTGC CGCCGCGCGT A ATG GAT AAA GTG GGG Met Asp Lys Val Gly TCGCGCTCGC CCTGTCGCTG ACTGCGCTGC CCCGGCCCAT CCTGGATGAG GCCGCCGCGC GTGTCCCGGC CGCTGAGTGT CTGCCCTCAA GCGGCCGCCT CTCCTTGCCC GGGTCCCCGT CTCCGGTGGG CGCCTCCGCA CCTCGGCGCA GGCGGCACGG CGCCGGGAAG AGGAAGACAA GCCGGGGCGT TGAGCCCCTG AGTGGGAGCT TACTCGCAGT AGGCTCTCGC TCTTCTAATC AAA ATG TGG AAC AAC TTA AAA TAC AGA TGC Lys Met Trp Asn Asn Leu Lys Tyr Arg Cys 120 180 240 300 360 420 466 CAG AAT CTC TTC AGC CAC GAG GGA GGA AGC CGT AAT GAG AAC GTG GAG P:\OPERVPAVPACOM-l\SOCSDl- I.WPD 1/11/01 145 Gin Asn Leu Phe Ser His Glu Gly Giy Ser Arg Asn Giu Asn Vai Giu
ATG
Met AAC CCC AAC Asn Pro Asn AGA TGT Arg Cys CCG TCT GTC Pro Ser Val 40 AAA GAG Lys Glu AAA AGC ATC AGT CTG Lys Ser Ile Ser Leu 562 GGA GAG GCA Gly Glu Ala GCT CCC CAG CAA GAG Ala Pro Gln Gln Glu 55 AGC AGT CCC TTA AGA Ser Ser Pro Leu Arg GAA AAT GTT Glu Asn Val GCC TTA Ala Leu
CAG
Gln CTG GGA CTG AGC Leu Gly Leu Ser 70 CCT TCC AAG ACC Pro Ser Lys Thr
TTT
Phe TCC AGG CGG AAC Ser Arg Arg Asn 658
CAA
Gln AAC TGT GCC GCA Asn Cys Ala Ala
GAG
Glu 85 ATC CCT CAA GTG Ile Pro Gln Val
GTT
Val 90 GAA ATC AGC ATC Glu Ile Ser Ile
GAG
Glu AAA GAC AGT GAC Lys Asp Ser Asp
TCG
Ser 100 GGT GCC ACC CCA Gly Ala Thr Pro
GGA
Gly 105 ACG AGG CTT GCA Thr Arg Leu Ala CGG AGA Arg Arg 110
TCG
Ser 115 CGG CAC GCC CCG Arg His Ala Pro
TGG
Trp 120 GGA GGA AAG AAG AAA CAT TCC Gly Gly Lys Lys Lys His Ser 125
TGT
Cys
AGA
Arg TCC ACA Ser Thr 130 AAG ACC CAG AGT Lys Thr Gin Ser TTG GAT ACC GAG Leu Asp Thr Giu
AAA
Lys 140 AAG TTT GGT Lys Phe Gly
ACT
Thr 145 CGA AGC GGC CTT Arg Ser Gly Leu
CAG
Gin 150 AGG CGA GAG CGG Arg Arg Giu Arg CGC TAT GGA GTC AGC Arg Tyr Gly Val Ser 155
TCC
Ser 160 ATG CAG GAC ATG Met Gin Asp Met
GAC
Asp 165 AGC GTT TCT AGC Ser Val Ser Ser
CGC
Arg 170 GCG GTC GGG AGC Ala Val Gly Ser
CGC
Arg 175 TCC CTG AGG CAG AGG CTC CAG GAC ACG GTG GGT TTG TGT TTT CCC ATG Ser Leu Arg Gin Arg Leu Gin Asp Thr Val Gly Leu Cys Phe Pro Met P:\OPER\VPAVPACOM-l\SOCSDl-I.WPD 1/ 1/01 146- AGA ACT TAC Arg Thr Tyr
AGC
Ser 195 AAG CAG TCA AAG CCA CTC Lys Gin Ser Lys Pro Leu 200 TTT TCC AAT AAA AGA AAA 1042 Phe Ser Asn Lys Arg Lys 205 CCT GCT GGC Pro Ala Gly ATA CAT Ile His
CTT
Leu 210 TCT GAA TTA ATG CTG Ser Glu Leu Met Leu 215 GAG AAA TGC CCT Glu Lys Cys Pro
TTT
Phe 220
TCG
Ser
GTG
Val 240 GAT TTA GCA CAA AAG Asp Leu Ala Gin Lys 225 AGC CCA CAC TCA ACA Ser Pro His Ser Thr 245
TGG
Trp 230 CAT TTG ATT AAA His Leu Ile Lys
CAG
Gin 235 CAT ACC GCC CCT His Thr Ala Pro TTT TTT GAT ACA TTT Phe Phe Asp Thr Phe 250 GAT CCA TCA Asp Pro Ser
CTG
Leu
GTG
Val 255 1090 1138 1186 1234 1282 1330 TCT ACA GAA GAT Ser Thr Glu Asp
GAA
Glu 260 GAA GAT AGG CTT Glu Asp Arg Leu
CGC
Arg 265 GAG AGA AGA CGG CTT AGT Glu Arg Arg Arg Leu Ser 270 c r: r ATC GAA GAA Ile Glu Glu GAA GCT ACT Glu Ala Thr 290
GGG
Gly 275 GTG GAT CCC CCT Val Asp Pro Pro
CCC
Pro 280 AAC GCA CAA ATA Asn Ala Gin Ile CAC ACC TTT His Thr Phe 285 CCA AAG TTA Pro Lys Leu GCA CAG GTC AAC Ala Gin Val Asn
CCA
Pro 295 TTG TAT AAG CTG Leu Tyr Lys Leu
GGA
Gly 300 GCT CCT Ala Pro 305 GGG ATG ACA GAG Gly Met Thr Glu
ATA
Ile 310 AGT GGA GAT GGT Ser Gly Asp Gly
TCT
Ser 315 GCA ATT CCA CAA Ala Ile Pro Gin 1378 1426 GCA Ala 320
GGA
Gly ATT GTG ACT CAG Ile Val Thr Gin
AAG
Lys 325 AGG ATT CAA CCA Arg Ile Gin Pro
CCC
Pro 330 TAT GTC TGC AGT Tyr Val Cys Ser
CAC
His 335 GGC AGA AGC AGC GCC Gly Arg Ser Ser Ala 340 AGG TGT CCG GGG ACA Arg Cys Pro Gly Thr 345 GCC ACG CGC Ala Thr Arg ACG TTA Thr Leu 350 1474 P:\OPER\VPA\VPACOM-I'SOCSDF-I.WPD. 1/11/0 147- GCA GAC AGG GAG CTT GGA AAG TTC ATA CGC AGA TCG ATT ACA TAG ACT 1522 Ala Asp Arg GCC TCG TGC Ala Ser Cys 370 Giu 355
GAG
Gin Leu Giy Lys Phe ATT TGG TTC AGA Ile Gys Phe Arg 375 Arg Arg Ser Ile Thr Tyr Thr 365 GTT ACT GG Val Thr Gly TGA GAG GGA ATC Ser Gin Gly Ile
CCT
Pro 380 1570
GG
Ala
TGA
385 TGG ACC GAT ACG Trp Thr Asp Thr
AGG
Arg 390 CCG AAG CCC TTG TAG Pro Lys Pro Phe 395 AAG GGA Lys Gly AAG GG Asn Arg 1618
AAG
Lys 400 GCA GGT TCT TGC TCA Aia Arg Ser Cys Ser 405 GGG ACT GTG GAG Giy Thr Leu His
AGO
Arg 410 AGO ACT ACC TGT Arg Thr Thr Ser
TCT
Ser 415 GTG TGA GCT TGG Leu Ala Ser
GCG
Al a 420 GGT ACA ACA Ala Thr Thr GOT GTG Giy Leu 425 TGG ACG CCC GGA TGG AGC Gys Thr Pro Gly Ser Ser 430 1666 1714 1762 1810 *9 AGT GGA ACC Ser Giy Thr ACT GGT GGA Thr Pro Pro 450
ACA
Thr 435 ACT TGA GCT TCG Thr Ser Ala Ser
ATG
Met 440 CCC ATG ACC GGT GGG TGT TTG Pro Met Thr Pro Ala Gys Phe 445 CGT GAG GGG GGT Arg His Gly Ala
TCT
Ser 455 CGA ACA CTA TAA Arg Thr Leu
AGA
Arg 460 CCC GAG GTG Pro Gin Leu TTG CAT Leu His 465 GTT TTT TGA ACC Val Phe Thr
GTT
Val1 470 GGT AAG GAT ATC Ala Asn Asp Ile
ACT
Thr 475 GAA TAG AAC TTT Glu Asn Phe 1858 1906 CCC Pro 480
TAG
Tyr TTT GAG GGT GGA Phe Gin Pro Ala
GTA
Val1 485 TAT CTG CCG CGG Tyr Leu Pro Arg
AGT
Ser 490 GAT CTG GAG ATG Asp Leu Gin Met
GAG
His 495 GTA TGA TGG Val Trp TGA GGG GCT CCC Arg Ala Pro ACC GTC GAT GTT Thr Val Asp Val ACA GGA Thr Gly 510 1954 TTT TTT A.AA AGA GTA TGA TTA TAA ACA AAA AGT TAG GGT TGG GTG GTT 2002 P:'OPER\VPA\VPACOM-I\SOCSDI-I.WPD 1111101 148 Phe Phe Lys Arg Val Ser Leu Thr Lys Ser Gly Ser Leu Val 515 520 525 AGA ACG AGA CCA GTC AAA GCA AAG TAACTICCTGST CCCCAAAGGG CACTAACTAA Arg Thr Arg Pro Val Lys Ala Lys 530 535 GTCTGCTCCT CCCGTGCATC GAACTGCACC CATAGGAGGC AGTCAGCTGC TAGGATTTCC CACCCAGAAT GGGAGCTTAG TCATTAGCCT CTGCCCTATG GGGTCCGCTG TTCCTCAGAC AAAGGTGCCT AGGGACAGCA AGATGGCTTG CAGGTGTTCG GTGGGCTGTG ACAACTGAGG GAGGCAACTC TGGGGCATTT GCTATGAAGA ATTCTATTTC TTACCGAAGA ACAAATTATT AATATTGGAT GGGTATTTCA ATAGTGTGAC TAATGTTTGA AATTATTTTT TCTAAGAATT TTTCTATAAC CTTCAGAAAA AGTAGTGATG TTTGTAGTTA CTATAAATCA AGCTTTGAAA GTTCAAAACA AACAAGTTAA ATAAAAGACT ACCTTCCTTT TAGAGAAAAC AAATGCAAGT TTTCCCAGCC ACAGGCATTG TGCACTGTTA ATGTTGCTTG TTATCAGCTC CTTTCTCCTC 2056 2116 2176 2236 2296 2356 2416 2476 2536 25S3 7 a INFORMATION FOR SEQ ID NO:18: SEQUENCE CHARACTERISTICS: LENGTH: 535 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:l18: Met Asp Lys Val Gly Lys Met Trp Asn Asn Leu Lys Tyr Arg Cys Gin 1 5 10 Asn Leu Phe Ser His Glu Gly Gly Ser Arg Asn Glu Asn Val Glu Met P:\OPER\VPA\VPACO\M-1 SOCSDI-I.WPD 1/11/01 149 Asn Pro Asn Arg Cys Pro Ser Val1 40 Lys Giu Lys Ser Ile Ser Leu Gly Glu Ala Ala Pro Gin Gin Ser Ser Pro Leu Arg Giu Asn Val Ala Gin Leu Gly Leu Ser Pro Ser Lys Thr Phe Ser Arg Arg Asn Asn Cys Aia Ala Giu Ile Pro Gin Val Giu Ile Ser Ile Giu Lys Asp Ser Asp Ser Tyr Ser 115 Ser 100 Gly Aia Thr Pro Gly 105 Thr Arg Leu Ala Arg Arg Asp 110 His Ser Cys Arg His Ala Pro Trp 120 Gly Gly Lys Lys Ser Thr 130 Lys Thr Gin Ser Ser 135 Leu Asp Thr Giu Lys Phe Gly Arg Thr 145 Arg Ser Gly Leu Gin 150 Arg Arg Giu Arg Arg 155 Tyr Gly Val Ser Ser 160 Met Gin Asp Met Asp 165 Ser Val Ser Ser Ala Vai Gly Ser Arg Ser 175 Leu Arg Gin Thr Tyr Ser 195 Arg 180 Leu Gin 
Asp Thr Val1 185 Gly Leu Cys Phe Pro-Met Arg 190 Arg Lys Ile Lys Gin Ser Lys Pro 200 Leu Phe Ser Asn Lys 205 His Leu 210 Ser Giu Leu Met Leu 215 Giu Lys Cys Pro Phe 220 Pro Ala Gly Ser Asp 225 Leu Ala Gin Lys Trp 230 His Leu Ile Lys Gin 235 His Thr Ala Pro P:\OPER\VPA\VPACOM- I\SOCSDI-I .WPD. 1/101 150- Ser Pro His Ser Thr 245 Phe Phe Asp Thr Phe 250 Asp Pro Ser Leu Val Ser 255 Thr Glu Asp Glu Giu Gly 275 Glu 260 Giu Asp Arg Leu Giu Arg Arg Arg Leu Ser Ile 270 Thr Phe Giu Val Asp Pro Pro Asn Ala Gin Ile Ala Thr 290 Ala Gin Val Asn Pro 295 Leu Tyr Lys Leu Gly 300 Pro Lys Leu Ala Gly Met Thr Giu Ser Gly Asp Gly Ala Ile Pro Gin Ala 320 Ile Val Thr Gin Arg Ile Gin Pro Tyr Val Cys Ser His Gly 335 Gly Arg Ser Ser 340 Ala Arg Cys Pro Gly 345 Thr Ala Thr Arg Thr Leu Ala 350 Tyr Thr Ala 9.
Asp Arg Giu 355 Leu Gly Lys Phe Ile 360 Arg Arg Ser Ile Thr 365 Ser Cys 370 Gin Ile Cys Phe Arg 375 Ser Gin Gly Ile Pro 380 Val Thr Gly Ala 385 Trp Thr Asp Thr Arg 390 Pro Lys Pro Phe 395 Lys Gly Asn Arg Lys 400 Ala Arg Ser Cys Ser 405 Gly Thr Leu His Arg 410 Arg Thr Thr Ser Ser Leu 415 *Ala Ser Ala 420 Gly Thr Thr Thr 435 Ala Thr Thr Gly Leu 425 Cys Thr Pro Gly Ser Ser Ser 430 Cys Phe Thr Ser Ala Ser Met 440 Pro Met Thr Pro Al a 445 Pro Pro 450 Arg His Gly Ala Ser 455 Arg Thr Leu Arg 460 Pro Gin Leu Leu P:\OPER\VPA\VPACOM-lSOCSDI-I.WPD 1/11/01 151 His 465 Phe Val Phe Thr Val 470 Ala Asn Asp Ile Thr Glu 475 Asp Leu Gin Pro Ala Val 485 Tyr Leu Pro Arg Ser 490 Asn Phe Pro 480 Gin Met His Tyr 495 Val Thr Gly Phe 510 Ser Leu Val Arg 525 Val Trp Phe Lys Arg 515 Thr Arg Pro 530 Asp 500 Val Arg Ala Pro Ala Thr 505 Lys Ser Val Asp Ser Leu Thr 520 Gly Val Lys Ala INFORMATION FOR SEQ ID NO:19: SEQUENCE CHARACTERISTICS: LENGTH: 1221 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (ii) MOLECULE TYPE: DNA e e e r (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19: GATTAAACAG CATACAGCTC CTGTGAGCCC ACATTCAACA TTTTTTGATA TCTTTGGTTT CTACAGAAGA TGAAGAAGAT AGGCTTAGAG AGAGAAGGCG GAAGAAGGGG TTGATCCCCC TCCCAATGCA CAAATACATA CATTTGAAGC GTTAATCCAT TATTAAACTG GGACCAAAAT TAGCTCCTGG AATGACTGAA ACAGTTCTGC AATTCCACAA GCTAATTGTG ACTCGGAAGA GGATACAACC GCAGTCACGG AGGCAGAAGC AGCGTCAGAT ATCTGGAGAC AGCCATACCC
CTTTGATCCA
GCTTAGTATT
TACTGCACAG
ATAAGTGGGG
ACCCTGTGTT
ATGTTAGCAG
120 180 240 300 360
ACAGGGAGCT
GCTTCAAATT
CCTTCTCGAA
CTTCTTCTCT
GAATCACAAC
GGGACTTTTA
TATATCACTA
CAGGTGCACT
TTTAAAAGAG
CAAGGCAAAG
TCAGACAGTA
AAGCTTAGTG
GCCTATTGGA
GCCTATTTCG
GGCATTTGTT
TGGAAAGTCC
ACAGGGAATC
GGGAAACCTG
GTGAGCTTCC
TTTAGTTTCG
GAACATTATA
AATAGGACTT
ACGTATGATG
TATCATTATA
TAAACTCTCC
CACCTATAGC
TTAGTATCTG
ACAATAGCGG
CTAACAGTTT
ATGAAGAAAT
ACACACAGAT
CCTGTTACTG
AAGGCACGTT
GCCGATACAA
ACGCCCATGA
AAGATCCCAG
TCCCTTTTAG
GAATTGATGG
AACAAAAAGT
GGTCCCCAAA
AAGCACACGT
TCAGATGCTA
ATAGAGCTAC
TGATTACATA
GGGAGTGATG
CACTGCTTCG
GACCGTTATG
TGCCTGATTT
AAGCAGAAGC
TTTGCTCAGG GACTCTGCGC AAGAGGACTA
CAGATCCCTG
CCCGTGTGTA
TTCGTGCATG
CCTGCAGTAT
GCTCCCTCTA
TAGAGTTCGC
GGGTGTTAAC
AGCAGTGTTA
CCTGCTGTTA
CATGCCCGAA
TTTCACTCCT
TTTTTTGAAC
ATCTGTCGCG
CCCTCAATGT
TGGTTGGAAC
TAGGTCCGCT
GGCTTTTTCA
CTTATTCAGA
TTGAGCAGTG
CCACTGTAAC
CATTGCTTAC
CGGTAATCTG
TACAGGATTT
GAGAACCAGT
TTCATGTGCA
TACAGTATGT
TAAACATGGT
420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1221 AGGTGTTCAG TAAGACTACA AAAACATTTT GGTTTTTAAT GGCTGTGGTA TTTGAGTGAG GCAACTCTGG INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 2369 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA P:\OPER\VPA\VPACOM-'SOCSDI-I.WPD 1111101 153- (ix) FEATURE: NAME/KEY: CDS LOCATION: 116..1330 (xi) SEQUENCE DESCRIPTION: SEQ ID GGCACGAGGC GGTGGTGGCG GCGGCGGGCG CGGCCGCGGC GGGGCGGGCG CGGAATGAAG GCCCACGGCC CTGGGGGCTG AGGCGCCCGC CGCCTGGGGC GGGCCGCGCG TCCTC ATG Met 1 GAG GCC GGA GAG GAG CCG CTG CTG CTG GCT GAA CTC AAG CCT GGG CGC Glu Ala Gly Glu Glu Pro Leu Leu Leu Ala Giu Leu Lys 10 Pro Gly Arg AGC GTG GCC Ser Val Ala CCC CAC CAG Pro His Gin 20 TTC GAC TGG AAG TCA AGC Phe Asp Trp Lys Ser Ser 25 TGC GAG ACC TGG Cys Glu Thr Trp 214 r r TTC TCG Phe Ser CCA GAC GGT TCC Pro Asp Gly Ser
TGG
Trp 40 TTC GCC TGG TCT Phe Ala Trp Ser
CAA
Gin GGA CAC TGC GTG Gly His Cys Val
GTC
Val AAG CTG GTC CCC Lys Leu Val Pro CCC TTA GAG GAA Pro Leu Glu Glu TTC ATC CCT AAA Phe Ile Pro Lys
GGA
Gly TTC GAA GCC AAG Phe Glu Ala Lys AGT CTG AAG GAG Ser Leu Lys Glu
AGC
Ser CGA AGC AGC AAG Arg Ser Ser Lys
AAT
Asn 75 GAC CCA AAA GGA Asp Pro Lys Gly CGG GGC Arg Gly AAG ACG CTG GAC Lys Thr Leu Asp GGC CAG ATT GTG TGG GGG CTG Gly Gin Ile Vai Trp Gly Leu
GCC
Ala TTC AGC Phe Ser 100 CCG TGG Pro Trp CCC TCT CCA Pro Ser Pro 105 CCC AGC AGG Pro Ser Arg AAA CTC TGG GCA CGT Lys Leu Trp Ala Arg P:AOPER\VPA\VPACOM-I\SOCSDI-l.WPD 1/11/01 154- CAC CAT is His 115 CCC CAG GCG CCT Pro Gin Ala Pro
GAT
Asp 120 GTT TCT TGC CTG Val Ser Cys Leu ATC CTG GCC Ile Leu Ala 125 CAG ACA GGC Gin Thr Gly ACA GGT Thr Gly CTC CTG Leu Leu 145
CTC
Leu 130 AAC GAT GGG CAG Asn Asp Gly Gin
ATC
Ile 135 AAG ATT TGG GAG GTA Lys Ile Trp Giu Val 140 550 CTT CTG AAT CTT Leu Leu Asn Leu
TCT
Ser 150 GGC CAC CAA GAC Gly His Gln Asp GTG AGA GAT Val Arg Asp ACG CCC AGC Thr Pro Ser
GGC
Gly 165 AGT TTG ATT TTG Ser Leu Ile Leu
GTC
Val 170 TCT GCA TCC CGG Ser Ala Ser Arg CTG AGC TTC Leu Ser Phe 160 GAT AAG ACA Asp Lys Thr 175 CAG GTG TTA Gln Val Leu CTT CGA Leu Arg
ATT
Ile 180 TGG GAC CTG AAT Trp Asp Leu Asn AAA CAC GGT AAG CAG Lys His Gly Lys Gln 185 TAC TGC TGC TCC ATC Tyr Cys Cys Ser Ile 205
ATC
Ile 190 TCC GGC Ser Gly 195 CAT CTG CAG TGG His Leu Gln Trp
GTT
Val 200 TCC CCT GAC TGT Ser Pro Asp Cys
AGC
Ser 210 ATG CTG TGC TCT Met Leu Cys Ser
GCA
Ala 215 GCT GGG GAG AAG Ala Gly Glu Lys
TCG
Ser 220 GTC TTT CTG TGG Val Phe Leu Trp
AGC
Ser 225 ATG CGG TCC TAC Met Arg Ser Tyr
ACA
Thr 230 CTA ATC CGG AAA Leu Ile Arg Lys
CTA
Leu 235 GAA GGC CAC CAA Glu Gly His Gln AGC AGT Ser Ser 240 GTT GTC TCC Val Val Ser TCG TAT GAC Ser Tyr Asp 260
TGT
Cys 245 GAT TTC TCT CCT Asp Phe Ser Pro
GAT
Asp 250 TCA GCC TTG CTT Ser Ala Leu Leu GTC ACA GCT Val Thr Ala 255 GGC GCG AGG Gly Ala Arg ACC AGT GTG ATT ATG Thr Ser Val Ile Met 265 TGG GAC CCC Trp Asp Pro TAC ACC Tyr Thr 270 CTG AGG TCA CTT CAT CAC ACA CAA CTT GAA CCC ACC ATG GAT GAC AGT P:\OPER\VPA\VPACOM-l\.SOCSDI- I.WPD 111 1/01 155- Leu Arg 275 Ser Leu His His Thr 280 Gin Leu Glu Pro Thr Met Asp Asp Ser 285
GAC
Asp 290 GTC CAC ATG AGC Val His Met Ser
TCC
Ser 295 CTG AGG TCC GTG Leu Arg Ser Val
TGC
Cys 300 TTC TCA CCT GAA Phe Ser Pro Glu
GC
Gly 305 1030 TTG TAT CTC GCT Leu Tyr Leu Ala
ACGS
Thr 310 GTG GCA GAT GAC Val Ala Asp Asp CTG CTC AGG ATC Leu Leu Arg Ile TGG GCT Trp Ala 320 1078 CTG GAA CTG Leu Giu Leu TGC TGC ACG Cys Cys Thr 340
AAG
Lys 325 GCT CCG GTT GCC Ala Pro Val Ala
TTT
Phe 330 GCT CCG ATG ACC Ala Pro Met Thr AAT GGT CTT Asn Gly Leu 335 GGG ACG AGA Gly Thr Arg 1126 1174 TTC TTC CCA CAC Phe Phe Pro His
GGT
Gly 345 GGA ATT ATT GCC Gly Ile Ile Ala
ACA
Thr 350
GAT
Asp
GGC
Gly 355 CAT GTC CAG TTC TGG His Val Gin Phe Trp 360 ACA GCT CCC CGG Thr Ala Pro Arg
GTC
Val1 365 CTG TCC TCA CTG Leu Ser Ser Leu 1222 1270 AAG Lys 370 CAC TTA TGC AGG His Leu Cys Arg
AAA
Lys 375 GCC CTC CGA AGT Ala Leu Arg Ser
TTC
Phe 380 CTG ACA ACG TAT CAA Leu Thr Thr Tyr Gin 385 GTC CTA GCA CTG Val Leu Ala Leu AGG ACT TTC TAGC Arg Thr Phe
CCA
Pro 390 ATC CCC AAG AAG ATG AAA GAG TTC CTC ACA TAC Ile Pro Lys Lys Met Lys Glu Phe Leu Thr Tyr 395 400 1318 1367 AGTGCC GGCTCCCCCA CCTCCTGCAG CAGCAGCAGT 405 ACAAGGGACT GGCTAGGATG GAGTCAGGCA GCTCACACTG GACCAGTGTG GACCTTCCTT CCTCCCATGG CATGTGCAAG TAGGTCTGCG TGACCCCACT TCTGTGGTGC CGGCCTTACC TCGTCTTCAT CCGTGGTGAG CAGCCTTCGT CAGTCTAGTT GTGTTGAAGC CAAGTGCAGT 1427 1487 1547 P:\OPER\VPA\VPACQM-I\SOCSD-I.WPD- 1/11/01 156-
TGTGGATGTT
AGCCACACTC
AGCCAGCGGC
TTCCCTGCTC
TTGTTTCCTG
ACGAGGAGGC
TATTTTCTCA
GTTCTTAAGT
ACTGGCTTGT
CCTCCAGTTC
AGCAGCTACC
GCTGGGGTAA
CCTTAACTGG
TGCATGGTTT
AGCTGCGCGT
TAAACAGTGG
GTTCCTCGGT
GCACATAGTA
GGCCCAGTTG
GCTGTCTGTC
AACTGCCCAA
TATTCAAGAC
TAAAGGCAAG
GAAGTACCTG
GAAGTTCCTC
GGACTGGGCT
TTGCATGTGT
GCATGACGGT
AGGTACAACT
TGGAGCCAAG
ACATGTGTTT
AACAGACAGC
GCCTCACACA
CGGGCTCCAG AGCCTCTCTG GTGGCGGCCA
CCACGTAGGG
CGTTGTGGTC
GAGCTCCTCA
AGAGAAGTAA
CAGATGGCCA
GTGTTTTCTC
TCTAAGTCGT
GTCTCTGCTG
CCCTTCCAAG
AAATCTGCCT
TCTTTGGCCT
CATTTCTGCT GCCTATTTCC
AGAAGAACTC
CCATACACTA
CAAGCGAGTA
TTTATCAGCA
AATTGTCTCG
GTGGAGTCAG
CTTGACCTCA
CACCGTTCTT
TAGAAAGTTA
TTCTTAATTG
CTCCTCTATT
TGGTGTTTGG
GTGCCGGCTT
TTCAGATCAT
TATTTATTTG
AAAAAACAGA
TGCTGACATC
TGGGATGTAC
TGACAGCGGT
ATATATTTTA
ATGCTTTATG
TGAAGAACAA
1607 1667 1727 1787 1847 1907 1967 2027 2087 2147 2207 2267 2327 2369 a AATTATTTTA AAAGAAACTC AACATCTTAT GAGGCAGTGT TAACATTGTA CAGTGTATGC ATAGAGGAGT TGCAAAATGA GGCTTTCATT GAAGGGAAAA AAAAAA AA INFORMATION FOR SEQ ID NO:21: SEQUENCE CHARACTERISTICS: LENGTH: 404 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:21: Met Glu Ala Gly Glu Glu Pro Leu Leu Leu Ala Glu Leu Lys Pro Gly P:\OPER\VPAWPACOM-l\SOCSDI-l .WPD 1/11/01 157- 1 Arg 5 Pro His Gin Phe Asp Trp Lys Ser 10 Ser Cys Giu Thr Pro Phe Trp Ser Vai Giy His Cys Ile Pro Lys Ala Phe Ser Val Val Lys Asp Gly Ser Trp 40 Pro Ala Trp Ser Gin Phe Leu Val Pro Gly Phe Trp 55 Arg Leu Giu Giu Gin Asp Giu Ala Lys Gly Ser 70 Lys Ser Ser Lys Asn 75 Gly Pro Lys Gly Arg Ser Leu Lys Glu Pro Thr Leu Asp Cys 90 Pro Gin Ile Val Trp Gly Leu Ala Phe Arg His His 115 Giv Leu Asn Ser 100 Pro Trp Pro Ser Pro 105 Val1 Ser Arg Lys Gin Ala Pro Asp 120 Lys Ser Cys Leu Ile 125 Gin Leu Trp Ala 110 Leu Ala Thr Thr Gly Leu Asp Gly Gin Ile Trp Glu 130 Leu Leu Val1 140 Val1 Leu Asn Leu 145 Phe Ser 150 Ser His Gin Asp Vali 155 Ser Arg Asp Leu Ser 160 Thr Pro Ser Gly 165 Trp Leu Ile Leu Ala Ser Arg Asp Lys 175 Thr Leu Arg Leu Ser Gly 195 Ile 180 His Asp Leu Asn Lys 185 Tyr Gly Lys Gin Ile Gin Val 190 Ser Pro Asp Leu Gin Trp Val1 200 Cys Cys Ser Ile 205 Cys Ser Met Leu Cys Ser Ala Ala Gly Giu Lys Ser Val Phe Leu Trp 210 215 220 P:\OPER\VPA\VPACOM-l\SOCSDI- .WPD 1/11/01 158- Ser Met Arg Ser Tyr Thr Leu Ile Arg Lys Leu Glu Gly His Gin Ser 225 230 235 240 Ser Val Val Ser Cys 245 Asp Phe Ser Pro Asp 250 Ser Ala Leu Leu Val Thr 255 Ala Ser Tyr Arg Leu Arg 275 Asp 260 Thr Ser Val Ile Trp Asp Pro Tyr Thr Gly Ala 270 Met Asp Asp Ser Leu His His Thr 280 Gin Leu Glu Pro Thr 285 Ser Asp 290 Val His Met Ser Ser 295 Leu Arg Ser Val Phe Ser Pro Giu Gly 305 Leu Tyr Leu Ala Thr 310 Val Ala Asp Asp Arg 315 Leu Leu Arg Ile Trp 320 Ala Leu Glu Leu Ala Pro Val Ala Ala Pro Met Thr Asn Gly 335 a a.
Leu Cys Cys Arg Asp Gly 355 Phe Phe Pro His Gly 345 Gly Ile Ile Ala Thr Gly Thr 350 Leu Ser Ser His Val Gin Phe Trp 360 Thr Ala Pro Arg Val1 365 Leu Lys 370 His Leu Cys Arg Lys 375 Ala Leu Arg Ser Phe 380 Leu Thr Thr Tyr Gin 385 Val Leu Ala Leu Pro 390 Ile Pro Lys Lys Met 395 Lys Glu Phe Leu Tyr Arg Thr Phe INFORMATION FOR SEQ ID NO:22: SEQUENCE CHARACTERISTICS: LENGTH: 1246 base pairs P:\OPERVPA\VPACOM-I\SOCSD-- I.WPD 1/01 159- TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: GACACTGCAT CGTCAAACTG ATCCCCTGGC CGTTGGAGGA GCAGTTCATC CCTAAAGGGT TTGAAGCCAA AAGCCGAAGT AGACGCTGGA CTGTGGTCAG CAGCAGGAAG CTCTGGGCAC TGCTACGGGA CTCAACGATG TTTGAATCTT TCCGGCCACC .TTTGATTTTG GTCTCCGCGT CGGTAAACAG ATTCAAGTGT CCCAGACTGC AGCATGCTGT GAGGTCCTAC ACGTTAATTC CTTCTCCCCC GACTCTGCCC GGACCCCTAC ACCGGCGAAA GGATGACAGT GACGTCCACA GTACCTTGCC ACGGTGGCAG TCCCATTGCA TTTGCTCCTA GAGTCATTGC CACAGGGACA
AGCAAAAATG
ATTGTCTGGG
GCCACCACCC
GGCAGATCAA
AAGATGTCGT
CACGGGATAA
TATCGGGCCA
GCTCTGCAGC
GGAAGCTAGA
TGCTTGTCAC
GGCTGAGGTC
TTAGCTCACT
ATGACAGACT
TGACCAATGG
AGAGATGGCC
AGACGAAAGG GCGGGGCAGC
GGCTGGCCTT
CCAAGTGCCC
GATCTGGGAG
GAGAGATCTG
GACTCTTCGC
CCTGCAGTGG
TGGAGAGAAG
GGGCCATCAA
GGCTTCTTAC
ACTCCACCAC
GAGATCTGTG
CCTCAGGATC
GCTTTGCTGG
ACGTCCAGTT
CAGCCTGTGC
GATGTCTCTT
GTGCAGACAG
AGCTTCACAC
ATCTGGGACC
GTTTACTGCT
TCGGTCTTTC
AGCAGTGTTG
GATACCAATG
ACCCAGGTTG
TGCTTCTCTC
TGGGCCCTGG
CACATTTTTT
CTGGACAGCT
CCAAAAGAGA
TTTCCCCACC
GCCTGGTTCT
GGCTCCTGCT
CCAGTGGCAG
TGAATAAACA
GTTCCATCTC
TATGGAGCAT
TCTCTTGTGA
TGATTATGTG
ACCCCGCCAT
CAGAAGGCTT
AACTGAAAAC
CCACATGGTG
CCTAGGGTCC
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960
TGTCCTCACT
TCCTAGCACT
CAACACCACA
GGAATAATGG
AGAATGTAGC
GAAGCACTTA
GCCAATCCCC
TCTTGTGCTT
GCCAAACATC
AAAACCAGAT
TGCCGGAAAG
AAGAAAATGA
CTTTGTAGCA
TGGTCTTGCA
TCCAGTGTAC
CCCTTCGAAG TTTCCTAACA ACTTACCAAG AAGAGTTCCT CACATACAGG ACTTTTTAAG GGGTAAATCG TCCTGTCAAA GGGAGTTGCT TTGAAATAGC ATTTCTTTGG GATTGTGAAT TAGTCATGGA TTTTTC 1020 1080 1140 1200 1246 INFORMATION FOR SEQ ID NO:23: SEQUENCE CHARACTERISTICS: LENGTH: 422 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: ACCATGGTTC CAAGTCCTCT CCCCTGTGGT CAAGTTGCCC GAATGTTGGG TTTTCCTCCT TGGGCCTCCC CTTCTGACCT GCAGGACAGT TTTCCGGAGC TGAGGTATTA ATTAGCCTTA ACTAAATTAC AGGGGACTCA GAGGCCGTGC TCCAGACACT ATTTTTTTTT TTTTTTTTTA ACAATGGTGT GCATGTGCAG ATTTGTATGT CAGATTATAC AAGGATGTAT TCTTAAACCG CATGACTATT CTGAGTTATC AGTGGCCATT TATTAGCATC ATATTTATTT GTATTTTCTC AAGGTACAAC TGTGTTTTTC TCGATTATCT AAAAACCATA GTACTTAAAT
AA
CCCAAGTGCC
CCATTTGGTA
TCCTGACCGA
GAAATGACAA
CAGATGGCTA
AACAGATGTT
TGAAAAAAAA
120 180 240 300 360 420 422 INFORMATION FOR SEQ ID NO:24: (i) SEQUENCE CHARACTERISTICS: LENGTH: 2019 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: GGCACGAGGC GGGGTCAGGG CGGAGGCTGA GGACCAAGTA GGCATGGCGG AGGGCGGGAC
CGGCCCCGAT
GGAGCAGTTC
CTATGTAGGG
CATCAATGAG
AGCCACTGCA
CCTGGTGGAT
GAGCACTGAG
CACTCCTGTG
GTATGGGGCA
ACGGCGGCTA
TCAGTGCTTC
TGTCAACACC
GGACGGGCCG
TGTGACCATC
GACCTCCAGA
AAGTCTGTCT
GGCCATGGGA
GTCAAGGGGC
ATCCTTTTGG
TACCATGCCT
GATGTTGATG
ACCTCCTTGG
AGGCTGCTCT
CAGGAGTTCT
GCCCGGGACC
CACTGGAGCA
CCCTCAGGAA
GGTGCTGCGG
ACTGTGTGGA
AGACTGCCCT
AAGCTGGTGC
YTCGTGTGGG
TCAACCATCA
TGGTCTGTCC
TGCAGGCTGG
ACAGGGGATC
CTGTGACGAT
CCTACTGCAA
CTGGCTTCCC
CTTCCTCATA
GTATGTGGCT
TGATCCCAAC
TAGGGACGAC
TCTGAATTCT
TCTATACATC
GGCAAATCCT
CCCTGGGTGT
ACAAGACTCC
GAGGAGAGCT
TGCACACCAC
CGCAAAGGGG
GTAGTGAACG
GGCAGCCGGC
ATCCTGAAGG
GACACCCGGC
AGTGCTGCCT
GACTTCAATT
GTCATGGATG
CGCAGGTCCT AATCTGAAGG AGTGGCTGAG
ATGATGCAGC
ACCGGAGCCG
TGAGGATCGC
CCGAGGTGGA
GGCACTTGGA
ACCACCGCAG
CTCTTATCAG
CCCCTTTTTC
ACCATAACCT
GCAATGGCCC
CTGTCCTGCG
120 180 240 300 360 420 480 540 600 660 720 780
CCATGGCTGT
GGTGAAGTGG
CTTGCAGGTC
GGCTGTGAGA
AGACCCCATA
GTGGAAGCCG
CCTGTGCTGC
ACTGTGATTC
GAGGCACATA
CAATGTGTGC
AGGCTATGCA
CAAAACAGAA
GAAGCAGCCT
GAATCCCTGG
TTTAAAGAGG
AGAGCTCTTG
AAGAAGTTTT
ATCACCTGCA
CAGCTTGATC
TATCTCAGGT
CAAACTTAAT
AGCATGGGCT
TCTATTTATG
AGAGGTTTGT
C
TCGTGAGTCT
GCCCAGAGGC
CCAGAAGTAT
GCAAATACCG
TGCTTTATGA
GTGAAAACTG
CTTGGCTGTC
GCTTGGGCCA
TTTGTTCCTC
GAGCCTGGTG
TTCCTACTTT
TAAGAAAAGA
TGCCTGGGTT
CTATTTTGGG
GGTCCCAGAA
CCCTGGGGAT
TGACTGCCCA
AGAAGATGAC
GTCACCTGGG
A.GTTTTTTGGC
GGTTGCTTCCC
AAGAGGCAGA
TCCCAGGACC
ACTGCATCTG
GTAGCATTCA
ACACAGACTC
AGTGAAGAAA
TCGAACGCTC
TTCAGTCTCT
ATTGCCCTAG
GCAATTTATT
TATAGGGAGA
TGTCTGTCTA
AGTTGTCTTC
CAGTGTTCAT
CTTCAGACAG
CATTCAGATC
%CGGGTGCTC
I'TGTTCCCAG
3TTGACTGTT
'TAAAAKTT
AGAAAGATGG ATCCTGAGGC
TTGCTGAGTT
GTTCCCTCGC
CATGCAGTGC
TGGCATCCTG
AAACGGCTGT
CTTGAGTCAT
CTGTTTTGGA
TGGGGAAGGC
GTTCTTTTAA
AAGGAATTCC
TGCTGCCTGG
CGTCTAAGAT
AGGGCACCAT
TGGTTACCTT
1GGGACCATC
TCTTCAGACA
GCTACAACTT
CCCTCCCCAC
TGTGCCGGGT
TGCCGCTGCC
TGACTGCAAT
GGAACCATGG
GTTCTCTTGG
TGTCAACTGA
TTCTTCCTGG
TTTTTTCTCC
GGCTTGATAT
GGTTCCGTGC
TGCACATCCC
GGCTTCTGGG
CTGCTCTGCC
TAGGAGACCC
TTAATAGTAC
CTCCCATACA
CTTGGTGTTC
TTTCCTTGAA
GTTGGTAGAG TTTGGAGCCA ACCTGAACCT 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2019 ACTTGCTAGC CTGCTTTCCT
TTCTCTTTGC
GTTCTATCTT
AAGGGTTTTC
ACCTGGAACT
TCACTGCCAG
GGAAGTTGGA
CACTAARACC
NCCCAATGCC
TGCCACTGTT
ATTGCACAGA
TGATGTCTTA
AACCATTAAG
TCCTCACAAG
AAATGTCTTG
AGRATATCCT
CNTTTGTKTN
P:\OPER\VPAVPACOM-p\SOCSDj-I .WPD 1/11/01 163 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 350 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (xi) SEQUENCE DESCRIPTION: SEQ ID Ala Arg Gly Gly Val Arg Ala Glu Ala Glu Asp Gln Val Gly Met Ala 1 Glu Gly Ala Gly Gly Thr Asn Leu Lys Pro Asp Gly Arg 25 Glu Gly Pro Gly Pro Glu Trp Leu Gln Phe Cys 35 Glu His Cys Asp Tyr Pro Ala Gly His Pro Leu Val Gly Asp Asp Asp Thr 50 Leu Gln Arg 55 Leu His Asp Ala Ala Ser Thr Leu Arg 65 Ile Asn 70 Val1 Leu Gln Glu Glu 75 Trp Tyr Arg Ser Arg Asn Glu Lys Ser Ala Trp Cys Cys Gly 90 Gly Leu Pro Cys Thr P~fo Leu Arg Ile Ala 100 Gly Thr Ala Gly His 105 Leu Asn Cys Val Asp Phe Leu 110 Gly Gln Thr Ile Arg Lys 115 Ala Glu Val Asp 120 Val Asp Val Lys 125 Ala Leu Tyr Val Ala Val Val Asn Gly His Leu Glu Ser Thr Glu Ile P:\OPER\VPAVPACOM- l\SOCSDI-l WPD 1/11/01 164 135 140 Leu Glu Ala Gly Al a 150 Asp Pro Asn Gly Ser 155 Arg His His Arg Ser 160 Leu Lys 175 Thr Pro Val Tyr His 165 Ala Xaa Arg Val Gly 170 Arg Asp Asp Ile Ala Leu Ile Ser Asp Thr 195 Arg 180 Tyr Gly Ala Asp Asp Val Asn His His Leu Asn 190 Leu Val Val Arg Pro Pro Phe Arg Arg Leu Thr Ser 205 Cys Pro 210 Leu Tyr Ile Ser Ala Tyr His Asn Leu 220 Gln Cys Phe Arg Leu 225 Leu Leu Gln Ala Gly 230 Ala Asn Pro Asp Phe 235 Asn Cys Asn Gly Pro 240
Val Asn Thr Gln Glu 245 Phe Tyr Arg Gly Pro Gly Cys Val Met Asp 255 Ala Val Leu Glu Phe Gly 275 His Gly Cys Glu Ala Phe Val Ser Leu Leu Val 270 Leu Gly Pro Ala Asn Leu Asn Leu 280 Val Lys Trp Glu Ser 285 Glu Ala 290 Arg Gly Arg Arg Lys 295 Met Asp Pro Glu Ala 300 Leu Gln Val 3he Lys 305 Glu Ala Arg Ser Ile 310 Pro Arg Thr Leu Leu 315 Ser Leu Cys Arg Val1 320 Ala Val Arg Arg Ala 325 Leu Gly Lys Tyr Arg 330 Leu His Leu Val Pro Ser 335 Leu Pro Leu Pro 340 Asp Pro Ile Lys Lys 345 Phe Leu Leu Tyr Glu 350 P:AOPERVPAVPACOM-I\SOCSDI-I WPD 1/11/01 165- INFORMATION FOR SEQ ID NO:26: SEQUENCE CHARACTERISTICS: LENGTH: 419 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA r r (xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: GCATCCATGG CGGAGGGCGG CAGCACGACG GGCGGGCAGG GCCGGGCTCC ATCTGAAGGA GTGGCTGAGG GAGCAATTTT GTGATCATCC GCTGGAGCAC CGAGGCTCCA TGATGCAGCT TACGTCGGGG ACCTCCAGAC CCTCAGGAGC AGGAGAGCTA CCGGAGCCGC ATCAACGAGA AGTCTGTCTG GTGCTGTGGC GCACACCGTT GCGAATCGCG GCCACTGCAG GCCATGGGAG CTGTGTGGAC GGAAGGGGGC CGAGGTGGAT CTGGTGGACG TAAAAGGACA GACGGCCCTG TGGTGAACGG GCACCTAGAG AGTACCCAGA TCCTTCTCGA AGCTGGCGCG INFORMATION FOR SEQ ID NO:27: SEQUENCE CHARACTERISTICS: LENGTH: 595 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA
GCAGGTCGTA
TGTGAGGACA
CTATTGCAAG
TGGCTCCCCT
TTCCTCATCC
TATGTGGCTG
GACCCCAAC
120 180 240 300 360 419 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
GAGGAAGAAG
CCAGAACCTT
TCATCTGATT
GACTCCAAGT
CCGAGCCCTG
CTCGTAGACT
CAGCTGAGAG
TCAGTTGCAT
GCCTACTAAT
TTTCTACTTC
AAAAGTGGAC
GCTGTGTCTG
CCTTCGCTGC
GCTGCGGTTG
AGTGCTGTGC
GTCATTGCTC
GCTTATACTA
ATTAATGTAA
TTCCTGTAGG
TCAATTTGAT
CCTGAGGCCT
TGCCGTGTGG
CTCTGCCAGA
ATTCCAGTGA
TGCTGCTGGT
CTCAGGTGCC
AAGTTATTAT
CGGGCCATGG
GAAGACTCCC
GGTTCGATTA
TGCAGGTCTT
CTGTGAGAAG
CCCCATAAAG
GGGAGAAAGT
CTCCTGATGG
TGGGCCGCTG
TGTTTTTCCC
GGTATGTACA
AGCACTTCTG
AAGCCTTCTA
TAAAGAGGCC
AGCTCTTGGC
AAGTTTCTAC
GATCTGCAGG
CTGTTGCTGC
AACAGTCCTT
AAGTTCTCTG
TGTAGGGGCT
GAACTGTGCT
GTATCTCAAT
AGAAGTGTTC
AAAACCGGCT
TCCATGAGTA
GAGGTGGACA
AGAAGATGTC
GGGTCATTGT
TTCTGGATTT
GAGGTTGGAG
TCTCTTTATT
GAAAA
120 180 240 300 360 420 480 540 595 INFORMATION FOR SEQ ID NO:28: SEQUENCE CHARACTERISTICS: LENGTH: 896 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (ix) FEATURE: NAME/KEY: CDS LOCATION: 4..396 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: CTG ATG TCC GCA ATT CTG AAG GTT GGA CAC CAC TGC TGG CTG CCT GTG Met Ser Ala Ile Leu Lys Val Gly His His Cys Trp Leu Pro Val
ACA
Thr TCC GCT GTC Ser Ala Val AAT CCC CAA AGG ATG CTG Asn Pro Gln Arg Met Leu 25 GCC GCT TGC TGC TGT CTG Ala Ala Cys Cys Cys Leu 40 CCA CCA CCA Pro Pro Pro ACC GCT Thr Ala GTT TTC AAC Val Phe Asn
TGT
Cys TGG GGG CAG ATG CTG ATG Trp Gly Gln Met Leu Met AAT ACA TAC Asn Thr Tyr CGT GTA GTT CAG Arg Val Val Gln
CTT
Leu 55 CCT GAG GAG Pro Glu Glu GCC AAG Ala Lys GGC TTG GTG Gly Leu Val CCA CCA Pro Pro GAG ATT CTA CAG Glu Ile Leu Gln
AAG
Lys 70 TAC CAT GGA TTC Tyr His Gly Phe
TAC
Tyr TCT TCC CTC TTT Ser Ser Leu Phe
GCC
Ala TTG GTG AGG CAG Leu Val Arg Gln
CCC
Pro 85 AGG TCG CTG CAG Arg Ser Leu Gln
CAT
His 90 CTC TGC CGT TGT Leu Cys Arg Cys
GCG
Ala CTC CGC AGT CAC Leu Arg Ser His
CTG
Leu 100 GAG GGC TGT CTG Glu Gly Cys Leu
CCC
Pro 105 CAT GCA CTA CCG His Ala Leu Pro CGC CTT Arg Leu 110 GAG GAT Glu Asp CCC CTG CCA Pro Leu Pro
CCG
Pro 115 CGC ATG CTC CGC Arg Met Leu Arg
TTT
Phe 120 CTG CAG CTG GAC Leu Gln Leu Asp
TTT
Phe 125 CTG CTC TAC Leu Leu Tyr 130 TAGGCTTGCT GCCCTGTGAA CAAAGCAGAC CCCACCCCCA CCCCAAGGGC ATCTCTCAGC AATGAATGAT GCAAGGCGGT CTGTCTTCAA GTCAGGAGTG GACGCCTTGA TCCACACTTG AGAGAAGAGG CCAGATCAGC ACCYGGCTGG TAGTGATNGC AGAGGGCACC TGTGCAGATC TGTGTGCGCA CTGGAAATCT CTAGGCTGAA GGCYAGAGCA 493 553
AATGGTGCAR
GAGAGTGCAC
CGGGCGAACT
TCTTTGGACC
TAGCAATACC
GTGTTAGTCC
ATGTCAAGTG
TCTTTAGGGG
CTTCCCGGGT
GGGTGCTTTT
TTGGGANGAG AGACAGANGG TGAGAAAGCA AGACAGAGGT GTAGATTGCC TTAAAAGAAA GCTAAAAAAA GAAAAAGATT TAATGCTGCA GCGTGTTAAA CTGACTGACC AGCGTCCATA GAAAAAGCCC CTTCATCCTC CAGCGCTCCC CAAGGGTGCT CTGCCGCAAA GTGAGTTACC AAA 673 733 793 853 896 INFORMATION FOR SEQ ID NO:29: SEQUENCE CHARACTERISTICS: LENGTH: 130 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein
Met 1 Ser (xi) SEQUENCE Ser Ala Ile Leu 5 Ala Val Asn Pro DESCRIPTION: SEQ ID NO:29: Lys Val Gly His His 10 Arg Cys Trp Leu Pro Val Thr 1s Gin Arg Met 20 Ala Leu 25 Leu Pro Pro Pro Thr Ala Val Leu Met Asn Phe Asn Cys Thr Tyr Arg Ala Cys Cys Trp Gly Gin Met Gly Val Val Gin Pro Glu Leu Tyr Glu Giu Ala Leu Val Pro Ile Leu Gin Leu Lys Arg His Gly Phe Ser Ser Leu Phe Val Arg Gin Ser Leu Gin Leu Cys Arg Cys Ala Arg Ser His Leu Gly Cys Leu Pro Arg er Hs Lu Gl Cy LeuPro Ala Leu Pro Arg Leu Pro P:\OPERVPA\VPACOM-1\SOCSDI-I.WPD- It 1/01 -169- 100 105 110 Leu Pro Pro Arg Met Leu Arg Phe Leu Gin Leu Asp Phe Glu Asp Leu 115 120 125 Leu Tyr 130 INFORMATION FOR SEQ ID i) SEQUENCE CHARACTERISTICS: LENGTH: 436 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA *r a a (xi) SEQUENCE DESCRIPTION: SEQ ID GTGGGGGCGT CATCATGACC TCCTCTAGGG CTCTGCAACA TGACTCCTGT AACAAATTGT TCACTGATGA ATCCACAAGG ATCTCTGGGC CTACAACCAG ACATGACTGT CGTCTTCGGA GAAGGCACCA CTCGCCCCCG GCAGGTACGG CATGGGAGAA GACGTATCCA GGCAGCAGCT GCGCGGCCCT TCAAGAGGGC ATCTAAAGGC ACGGTGTACT GAAGGTAGTC CTGAGACATG AGTCCGATTA GTGTTCCTCC AGGTGGAGGC TCAGGTCCCC GGGTGAGCTG GGGCTGCAGC GCGCGGCTCT GGCTGCAGGT CTCGCAGCTC CCTGGGCTGT AGCTCCCGCA CACACCGTTG ACTGGT
GGTGCAAATC
GTCCTGGTCC
CTGACACCTC
ACATCCCGTC
CTACAGGCAC
GGGACTCAGG
GATCCTTGCG
120 180 240 300 360 420 436 INFORMATION FOR SEQ ID NO:31: SEQUENCE CHARACTERISTICS: LENGTH: 2180 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO:31: TTAATAGTAC CTACATAGTA GAAAATTATA ACTCCACTTT AAAACAATGT TTTCTTTCTA
TTCAAATCAA
TGAAAATTAG
AGATAAAACA
ATGATCATCT
ATTTGTACCC
CTTTACAATT
AAGGCAACAG
TTTAGAAAAA
TGTCAATTCA
TTTAAAACTT
TTGACAATCA
TTTTACCAAA
AATATTTCTT
GAGTTTAATT
TTTTAATGTA
AAAAATTCAA
AGTGTTGTTA
CGATTCTAAA
TTTATAAACA
AGTTCACCCA
AGGATAGGTA
TAATAATAAT
ACAGAAAAGG
AGGCCATTTA
CTTTTCACAA
AAAGATATGT
TAAATCTTTT
TTTTATCACA
TGGCAATTTC
AAATGTGCTT
GGGGGCATTG
TTAATGTTGC
AGAAAATGTT
ACACACAAAA
TCTAGTTCCA
CAACAATTTC
TTAAAATAGA
CCAAAAGAAT
TGCAGATCTC
TAAAGTAAGA
CAGGTATGTT
CATTCTGTTA
TTAATGCATA
AGGAGAATTT
AAGAGAATCC
GACTAAGCTA
AAATGCTATC
TAGGTTTTCA
TAAATTGGTG
CAAACTAGAA
TAGCACAACC
CGTTCCATTA
GATTAAAAAC
TATTCAACAC
AGTTTCATTC
AAAATCACAG
GATAATTCAC
AGTCCATTTA
AAGAAATCAC
ACAGGAAGCT
TGTTATGCCA
GTATACATTT
GATGAAAACG
TTAGAAATAA
CCCAAGATTA
TCATCTTCAG
TGCTTTGGAA
AACCTTTACT
TGGATTAGCC
ATTGTGATTA
120 180 240 300 360 420 480 540 600 660 720 780 840 TGTATATGTA AATTCCGTGG
ATGGACCATT
TAGGGGTTGA
AGCAAAAGGG
TAAAAGGACA
TTACCACATG
ACTGGGCGGG
TTCTGCACAT
AGAACCAAAA
ACTCTTTTAC
GTTTGTAAAC
CAATCTTTCA
TCTGAATTAA
AGCGCACCTA
GCAGCGAGAC
ATTAGATCAA
CATTTGAGTG
AAAGGTAACG
GCTTCCCTTC
TTTTCAGCTC
AAACTCTTGC
TCAGCCACGT
TTATTCCTGC
TCTCCGGGGT
ATCCACGAAG
GAAAGCGCGC
CAGACAATAG
CTGAATGTAG
TGATGAAACA
TAAACCCAAG
ACATAAAACT
TGTATTTCAA
GTAGTGCTTC
GTCTATTCTG
TGAACCCCGG
CTTGAATAAC
CTATGGACGT
CTCAATTCTA
GTCGTCAGAA
TGCCTCCCAA
CTCTGTAAGG
AACCCCAGGA
GGGGCTTTCT
ATTGGAGTAA
TCTGCAACAT
ATCTCTGGGG
CCTCCCACTT
CTCCGTGATC
TCCTGAGGCA
TAATTCACAC
ACACCTTGCT
GAAATAGTTA
TCTCTTGTTC
TCCTGTAAAT
GTATACTGAC
AACACTGGTT
AAAAAGCTCC
ATGTCCTTCC
GTGTGAAGTG
CTGTCCCGAA
CGTGTGATTG
CTTGTCACAA
GCAGAGTTCG
GTCCAGTGAG
CCAATGGTGA
GACTCCCGTG
CGACAACTAG
GAGGAGGAAC
CTTCCAAAGG
TAAGTCCAAT
CTCTAAAACC
GACACTTCCC
TGGCAGCAAA
TTATTCCCAA
AATCCTTCAT
GTATAACAAA
GGCAAGTTCT
CATTTTCAGA
ACATCGGCTG
TTTTACCATG
CAAGAAAAGA
GTCCCCAGTA
CCATGGGACC
GATCAAAATT
TCCACTGAAA
AGATTGGAGG
GTGCCAATCA
GTCCTGGTCT
CGCAGAGACT
PTACATCCCC
AACGACAGGC
TCAAGACTTC
CACCCCTAAA
AGATTTTGAT
AGTGCAAGAT
TTTGTTTGGC
ACGACACAGG
GACGGAAGTG
GTCCCTGATT
TTCATAAAAG
GGAGCGAAAG
ACCATCTGGC
CCATCCTTGC
ACTACTTTGC
CAAATGACAG
GTTCCCCTTT
GACATCCATC
ACAAGCCATT
ACCTGACTCT
TCCATGGGAG
TCATCTAAAG
ACATGTTCAT
CCTTTTTTAA
CAAACTGATG
GGCAATGAAA
GCAGGGTTCT
AAAGGCAGTT
TACTGCAACG
CAGATTCCAG
GAATGCTCCA
CTAAACCTAC
TCACAGCTTA
ACGTTTGCTA
TTTGCAAGTT
ACTGAGTCAT
CGCATAACTT
GGGATTTGGA
GTGAACCCGC
CACCGGACTG
CATCCTCGGG
AAGAGCTGTC
GCACAGTATA
CCAGGTGAAG
900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 ATGCAGGTCT CCATTATGAG AAGCCGAGCT CTTCAGTGAA TTGGCTTGCT CCTGGCACGT GGTCTCAGAC TGGAGGTCGT 2160 2180 INFORMATION FOR SEQ ID NO:32: SEQUENCE CHARACTERISTICS: LENGTH: 2649 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
GGCACGAGGC TGTGTCCAGC ACACAGAGAG GGCCCGGCCA TCTGCTTTGG TTCAGAGCCC
TGTGTCTGTC
ACTGGCTCCA
TCTGCACCTT
CTGTTCCAAG
GCTATGGACC
ATCCAGGATG
GCTGCCTACT
ATTGACCAAC
CTGGATTGCC
TGTCACTTAG
GCATGACTCG
CCAGGTCCCC
GGGTCATGCA
CCGTGCTGAA
GGAAGAATCT
ATGGCCAGCT
GCACACTGCA
TCCTGTCGCT
ACTCTTCCTC
CTTCTCTTAT
TTCGTCTCCC
GAAGTATAGC
GGCCATCAAG
TGCAGAGCCC
GGGCTGCCTG
GGAAGAGACA
GCTCCAGGCG
CCGGCTCGCA
GCAGAGTACT
GAGAACCCAC
AGCAACCTGT
GAAGGGGATG
AACAAGGAGG
AAAGTCCTGC
GCATTATACC
GGGGCAGAGC
GCTCACCCTC
TTGCTCTGTT
CGGCCCGCGC
TCAAGACCTC
AAGAGGCCTT
GCTGGCTGCC
AGCAAGCCTA
TGGCCACATG
CTGACATCTC
CATCCTCCTT
TCACTCTGGC
ACCCCTGGGT
CCAGATGGCG
GAAGATCATG
GCTCCACGAG
CCCAGGGACC
CAGAGAACAC
TAACAAATCC
120 180 240 300 360 420 480 540 600 AGGGAGACTC CACTTTACAA AGCCTGTGAG CGCAAGAACG CGGAGGCGGT GAGGATATTG GTGCGATACA ACGCAGACGC CAACCACCG(
TCTGTCTCCC
GAGGCCAAGA
GAGGCCCTGA
GCATCAGCCC
TCTCAGGGCG
TCCAAGAAGG
GTGCGCCGTA
CTGGAGGCGC
CGCCTCTACG
TACGCCACCG
CTGCTCGTGG
GCCAACATCG
GCCATGAAGT
TGCTTCTCCT
ACGACGCACC
CCCCGGAAGT
ACGTGCAGCT
TCAAGGAGAA
AGGCCATAGG
TCAGATACTT
GCAATGACCI
ATGTCTACAG
GGTTCCTGGC
TCTACGAGGC
CCGATGCTAA
GCAACTATAG
GCGGCATCAG
TGCTGGCCGC
AGGACCGCCG
AGCTGTTGCT
CCATCCGCCA
ACGCCTACAT
GCCTGTCGTT
GCCTGTACGG
CGTGGACGAC
GAGCCGCTGG
GTGCTCCCGG
GGCAGAACCT
AAAATACCGG
GAAATATGAG
GGAGGTCATC
SCATCACCCC1
CAAGCATGGI
CAGCAAGAAI
CAAAGCCAAC
AATAGTGCAG
CCCGCTGCAC
GCGCTTCGAC
CAGTTCTGCG
GCTGGCGGGC
CGGCTGCCTG
CGCCACTCAC
ACTCAAGTTC
CAACGGGCCG
AAGGCACCTA
GCGGGACCCA
CTGAAGGAGC
CCGAGACCTC
ATAAAACTCC
AATACACAGT
TGTAACAGGC
;GAGATCCTAG
TTGTTTGTGG
GCAGACATCA
GAGCATGAAG
AAGGACGGCC
ATGCTGCTGC
CTAGCGGCCG
GTGAACGCAC
CTCTACTTCG
GCGGACCCCA
CGCACCATGC
CCCACCGCCT
CTTATGGACC
CACCACCCGC
GCGTGGTGCA
TCATCGATGT
ACATCGACAG
TGGCTCACCT
TGGACACACT
AACCAGCCTG
GCTGGACCGC
TGAGTGGCGG
CTGCCCAGAG TGGGCAGCTG
ACACGCAGGC
ACGTGGTAGA
TGCTCCCCCT
CTGTGACCAG
AGCGCAACCA
CTCTGGCTCC
CTGTGGTCAA
ACCGCGATGT
AGCTGCTGTT
TTCCAGCCAC
TCGGCTGCGA
CCCGCGACCT
GTTCTGTGAG
CCTCCTGGAC
CTTTGAGGAC
CTGCCGGCTG
GCCGCTTCCC
GAGAGGAGAT
CAGTGACAGT
GTTTCTTCTC
GCATGTTGCC
CCGCACGCGC
CGACGCGGTG
CGAGCGCGCC
CAACAATGTG
CATCAGCCCT
GGACCATGGC
CATCATGTTT
TGGCGAGCCC
GGCCGCTTCC
TTCCTGTCGG
TATGTGGGCA
TGGGCTGTCA
CGGGTTCGGA
GGCAGGCTAA
GTGGCCTTCA
ACTGCACGAG
GGCCAAGGTG
720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920
GACTGTTTCC
GTGACCTTGC
GACGTGGAAC
TAGCCAGAGA
AGGAATCCTG
AGGCTGCGAG
CAGAAAAAAA
CGCTTAAAGC
ACGAGAGCCG
TGTCCGACGT
GGGACGCCCC
TGGTTCTTTG
CTCTGCTTTC
CCTTCAACCT
GCCTTAAGCT
GGAGACGCCC
AAAAAACATG
TACTGGAAAC
ACGTGCTTCA
AATTGACCCC
AGGTGGCCTG
CTGGAGCTTC
ACACTGTCAG
GGGGCCAGGG
GGAGAACTTG
AGCCCAAGTA
GGCGCAGCTT
ATGCGTTCCA
AGGTTGGATT
CATCCAGGAC
ACCCAAAGTG
CCCCTGGGGT
AGAACCTGAT
CGGATCGCAG ACCCGCTCTG
GAGAGCTGGT
TAGGAATCCC
TTTTATTTCC
ATTCCTTAGT
CTATGCTTGA
TTTGGTTGCC
CTGGGCAAGG
TCACTGGACC
GTGACACAAT
AGGGTATTTA
GAATCCCCTT
CCTTTGGCGT
GTTCCGACTA
CAGAACAGGT
GTGGGGAGTG
CTTCTGGCCA
TGGCCCAGGC
CTCAGCTTTC
AACGTTGTAT
CTTGCATGCG
GCACTGGTAA
TCCGCGGGTT
TTGGGGGGCT
1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2649 GTGTTTTGTC ACTTTCGAGT TTTGGTTGTC CCCAAAATTG TGGGTGGTGT GCGGACGCCA CGAGAAGTGG TTCATGGGCG ATAATCATTA CTGGAGAATG TAGAGCGGCG GTTTTACGAA TAAATATTTT TTAAGCCGCC
TTCCCAAAA
INFORMATION FOR SEQ ID NO:33: SEQUENCE CHARACTERISTICS: LENGTH: 495 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO:33: CCTCCTGAGA GTTCGCCGGC CCGGGCCCAA TGGGTTGTTC CAAGGGGTCA TGCAGAAATA CAGCAGCAGC TTGTTCAAGA CCTCCCAGCT GGCGCCTGCG CAAGGATGCG ATGAAGAGGC CTTGAAGACC ATGATCAAGG CCCAACAAGG AGGGCTGGCT GCCGCTGCAC GAGGCCGCAT CTGAAAGTCC TGCAGCGAGC GTACCCAGGG ACCATCGACC ACAGCCGTTT ACTTGGCAAC GTGCAGGGGC CACCTGGACT GCAGGGGCAG AGCGGGACAT CTCCAACAAA TCCCGAGAGA AGCGCAAGAA CGCGGAAGCC GTGAAGATTC TTGGTGCAGC GCTGCAACCG GGCTG INFORMATION FOR SEQ ID NO:34: (i) SEQUENCE CHARACTERISTICS: LENGTH: 709 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA
GACCCCTTGA
AAGGGAAGAA
ACTATGGCCA
AGCGCACCCT
GTCTCCTGTC
ACCGCTCTAC
ACAACGCAGA
TAAAGGCCAT
TCTCGCAGAG
GGTGGGCTGC
GCAGGAGGAA
ACTGCTCCAA
AAAGCCTGTG
CACCAACAAC
120 180 240 300 360 420 480 495 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:34: GTGCAGCTCT GCTCGCGGCT GAAGGAACAC ATCGACAGCT TTGAGGACTG AAGGAGAAGG CAGAACCTCC AAGACCTCTG GCTCACCTTT GCCGACTGCG GCCATTGGGA AATACCGTAT AAAACTCCTA GACACCTTGC CGCTCCCAGG AGATACCTGA AATACGAGAA CACCCAGTAA CTGGGGCCAC GGGGAGAGAG TCAGACTCTT CTTACTAAGT CTCAGGACGT CGGTGTTCCC AACTCCAAGG ACAGACGAGG CTGCAGGCTG CCTCCCTCTC AGCCTGGACA GCTACCAGGA
GGCCGTCATC
GGTTCGAAAG
CAGGCTGATT
GAGTAGCCCC
GGACCTGGTG
TCTCACTGGG
120 180 240 300 360
TCTCAGGGCC
TTGTTTACAA
TCTATTCCTG
TCCCTGTGCC
AAAGACTAAG
GCTTTGTTAA
CAGAGCTTTG
ACTGATGAGC
GGGCCAGGGC
CCTCCACTTG
ATGAAGACGT
GTATTAATAT
GCCAGAGCAG
AGATCCCAGA
AGAGCTTGAG
TTCTGGAAAA
GGCCCAAGGT
AATAAATGTT
AGAACAGAAT
CCTTCTCTAC
GTGTTCTGGG
CTCACCACTT
AGGGGGTAGG
ACACATGTGA
GTGTCAAGGA GAAGAATCAT CTTCAGGAAT GGCAGAAACC GAAGGTGGTG CTCAGAGCCT GACTTCAGAG CTTTCTCTCC GGGAGCCTGG GTCTTGGAGG 420 480 540 600 660 709 INFORMATION FOR SEQ ID NO:35: SEQUENCE CHARACTERISTICS: LENGTH: 848 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (ix) FEATURE: NAME/KEY: CDS LOCATION: 1..624 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:35: TTG GAG AAG TGT GGT TGG TAT TGG GGG CCA ATG AAT TGG GAA GAT GCA Leu Glu Lys Cys Gly Trp Tyr Trp Gly Pro Met Asn Trp Glu Asp Ala 1 5 10 GAG ATG AAG CTG AAA GGG AAA CCA GAT GGT TCT TTC CTG GTA CGA GAC Glu Met Lys Leu Lys Gly Lys Pro Asp Gly Ser Phe Leu Val Arg Asp 25 AGT TCT GAT CCT CGT TAC ATC CTG AGC CTC AGT TTC CGA TCA CAG GGT Ser Ser Asp Pro Arg Tyr Ile Leu Ser Leu Ser Phe Arg Ser Gln Gly ATC ACC Ile Thr CAC CAC ACT AGA His His Thr Arg
ATG
Met 55 GAG CAC TAC AGA Glu His Tyr Arg GGA ACC Gly Thr
TTC
Phe AGC CTG Ser Leu
TGG
Trp TGT CAT CCC AAG Cys His Pro Lys
TTT
Phe 70 GAG GAC CGC TGT Glu Asp Arg Cys
CAA
Gln 75 TCT GTT GTA GAG Ser Val Val Glu
TTT
Phe 240 ATT AAG AGA GCC Ile Lys Arg Ala
ATT
Ile ATG CAC TCC AAG Met His Ser Lys
AAT
Asn 90 GGA AAG TTT CTC Gly Lys Phe Leu TAT TTC Tyr Phe TTA AGA TCC Leu Arg Ser
AGG
Arg 100 GTT CCA GGA CTG Val Pro Gly Leu CCA CCA Pro Pro 105 ACT CCT GTC CAG CTG CTC Thr Pro Val Gln Leu Leu 110 336
TAT CCA GTG Tyr Pro Val 115 TCC CGA TTC AGC AAT Ser Arg Phe Ser Asn 120 GTC AAA TCC CTC CAG CAC CTT TGC Val Lys Ser Leu Gln His Leu Cys 125 AGA TTC Arg Phe 130 CGG ATA CGA CAG CTC Arg Ile Arg Gln Leu 135 GTC AGG ATA GAT Val Arg Ile Asp
CAC
His 140 ATC CCA GAT CTC Ile Pro Asp Leu
CCA
Pro 145 CTG CCT AAA CCT Leu Pro Lys Pro CTG ATC TCT TAT ATC Leu Ile Ser Tyr Ile 150 GTA TAC CTG TCT CTA Val Tyr Leu Ser Leu 170
CGA
Arg 155 AAG TTC TAC TAC Lys Phe Tyr Tyr
TAT
Tyr 160 480 GAT CCT CAG GAA Asp Pro Gln Glu
GAG
Glu 165 AAG GAA GCG CAG Lys Glu Ala Gln CGT CAG Arg Gln 175 TTT CCA AAC AGA Phe Pro Asn Arg 180 AGC AAG Ser Lys AGG TGG AAC Arg Trp Asn 185 CCT CCA CGT AGC GAG GGG CTC Pro Pro Arg Ser Glu Gly Leu 190 CCT GCT GGT CAC CAC CAA GGG CAT TTG GTT GCC AAG CTC CAG CTT TGAAGk.ACCA Pro Ala Gly His His Gln Gly His Leu Val Ala Lys Leu Gln Leu 195 200 205 AATTAAGCTA CCATGAAAAG AAGAGGAAAA GTGAGGGAAC AGGAAGGTTG GGATTCTCTG TGCAGAGACT TTGGTTCCCC ACGCAAGCCC TGGGGCTTGG AAGAAGCACA TGACCGTACT CTGCGTGGGG CTCCACCTCA CACCCACCCC TGGGCATCTT AGGACTGGAG GGGCTCCTTG GAAAACTGGA AGAAGTCTCA ACACTGTTTC TTTTTCA 691 751 811 848 INFORMATION FOR SEQ ID NO:36: SEQUENCE CHARACTERISTICS: LENGTH: 207 amino acids TYPE: amino acid TOPOLOGY: linear
0 005S 0 0005 S S 0000S0 0 0 OS@S@s 0 0000 0 (ii) MOLECULE TYPE: protein Leu 1 Glu (xi) SEQUENCE Glu Lys Cys Gly 5 Met Lys Leu Lys 20 Ser Asp Pro Arq DESCRIPTION: SEQ ID Trp Tyr Trp Gly Pro 10 Gly Lys Pro Asp Gly NO: 3 6: Met Asn Trp Glu Asp Ala Ser Phe Leu Val Arg Asp Ser Gin Gly Ser Tyr Ile 35 Ile Thr His 50 Trp Cys His Thr Arg Met 55 Pro Lys Phe Giu Leu 40 Glu Leu Ser Phe His Tyr Arg Gly Ser Arg Thr Phe Ser Leu His Asp Arg Cys Val Val Glu 6S Ile Lys Arg Ala Ile Met His Ser Lys Asn 90 Gly Lys Phe Leu Tyr P:\OPER\VPA\VPACOM- I\SOCSDI-I.WPD- 1/11/01 179- Leu Arg Ser Tyr Pro Val 115 Arg Phe Arg Arg 100 Ser Val Pro Gly Leu Pro 105 Val Pro Thr Pro Val Arg Phe Ser Asn 120 Val Lys Ser Leu Gin Leu Leu 110 His Leu Cys Pro Asp Leu Ile Arg Gin 130 Pro Leu Leu 135 Ile Arg Ile Asp His 140 Lys Pro Lys Pro 145 Asp Leu 150 Val Ser Tyr Ile Arg 155 Lys Phe Tyr Tyr Tyr 160 Pro Gin Glu Glu 165 Ser Tyr Leu Ser Leu 170 Pro Glu Ala Gin Arg Gin 175 Gly Leu ,:oo oo 0 o*oo oo ooooo *oo o *ooe o*o*o* Phe Pro Asn Pro Ala Gly 195 Arg 180 His Lys Arg Trp Asn 185 Leu Pro Arg Ser Glu 190 Gin Leu His Gin Gly His 200 Val Ala Lys Leu 205 INFORMATION FOR SEQ ID NO:37: SEQUENCE CHARACTERISTICS: LENGTH: 464 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: GTTCCAAGCC TAACCCATCT TTGTCGTTTG GAAATTCGGG CCAGTCTAAA AGCAGAGCAC CTTCACTCTG ACATTTTCAT CCATCAGTTG CCACTTCCCA GAAGTCTGCA GAACTATTTG CTCTATGAAG AGGTTTTAAG AATGAATGAG ATTCTAGAAC CAGCAGCTAA TCAGGATGGA P:\OPER\VPAVPACOM-I\SOCSD I-WPD- 1/11101 180- GAAACCAGCA AGGCCACCTG ACACAGGTCC TGTGTGACTG TTTGGATTTG GTGATCAAAT GTGTCTTTCC CAATATTGTG AACCTTATCC CACTTTGTTG TGTATTATTT GTTTACCTGA GTAATTCTGA AAAAAAAAAA AAAAAAAAAA INFORMATION FOR SEQ ID NO:38: SEQUENCE CHARACTERISTICS LENGTH: 747 base pa TYPE: nucleic acid STRANDEDNESS: singl TOPOLOGY: linear (ii) MOLECULE TYPE: DNA TTTAATTCTG TTTAGTCACA AAAGACGGCT GTCCATGTTT ACAGTTGCTT TTCCCAGTTT ATCTTGCCTT ACTCAGTTTT ATTTCTAGTG CCATTTTCTA 
CTTTATTCTG CTAATAAACT AAAAAAAAAA AAAA 240 300 360 420 464 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:38: GGGGATCGAA AGCGGGGGCT TCTGGGACGC AGCTCTGGAG ACGCGGCCTC TTTCGGTGTA GAAGTGGCAG CACGGCAGAC TGGTCAAACA AATGGATTTT ACGCGGACAC GTGCTCTACA GTTGGACTTG CTGCCAGGGA AGGCAATGTT GGAAACTGCT CAAAAAGGGC CGAAGTGTCG ATGTTGCTGA TAACAGGGGA TTCATGAAGC AGCTTATCAC AACTCTGTAG AATGTTTGCA AATGTTAATT CATCTGAAAA CTACATTAAG ATGAAGACCT TTGAAGGTTT CTGTGCTTTG CAAGTCAAGG ACATTGGAAA ATCGTACAGA TTCTTTTAGA AGCTGGGGCA CAACTACTTT AGAAGAAACG ACACCATTGT TTTTAGCTGT TGAAAATGGA TGTTAAGGCT GTTGCTTCAA CACGGAGCAA ATGTTAATGG ATCCCATTCT
GGACCAGCCA
ACAGAGGCTT
AAAGTCTTAA
TGGATGCCAA
AATGCAGATT
CATCTCGCTG
GATCCTAATG
CAGATAGATG
ATGTGTGGAT
120 180 240 300 360 420 480 540 GGAACTCCTT GCACCAGGCT TCTTTTCAGG AAAATGCTGA GATCATAAAA TTGCTTCTTA GAAAAGGAGC AAACAAGGAA TGCCAGGATG ACTTTGGAAT CACACCTTTA TTTGTGGCTG CTCAGTATGG CCAAGCTAGA AAGCTTTGAA GCATACTTAT TTCATCCGGG TGCAAATGTC AATTGTCAAG CCTTGGACAA AGCTACC 600 660 720 747 INFORMATION FOR SEQ ID NO:39: SEQUENCE CHARACTERISTICS: LENGTH: 1018 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO:39: CACAAATGGG ACCATACAAA AATCTTGGAC TTGTTAATAA CCACTTACTA ACCGGGACCT
GTGACACTGG
GGATTGCCTA
TTTTTGGATT
TTGGAATTGT
ACTGCCTGA.A
TGGGACCATG
ATAAGGAGTG
ATTCTTGGAT
GGAAGACACT
GCTAAACAAA
GAAATATTAC
CAGTTCTCCT
GAACATTCTT
GTACGAGAAG
GAACCATATA
GTTGCCACAT
TGACTCAGTC
TGCACCAGCT
GTAAGTCCCT
TCCGGAATGG
GTGTGCATGG
TTGAAATATG
TTTTCGATAT
TATGAATTTG
CTTCTGGTTG
AGCATTGACA
GTTGAAAGGA
GTTTACTCAG
TCTACAGCCC
CTTTCCAAAA
GAGCCCAGAT
TTCGCTACTT
TAAATCATGC
CTGGATTTGA
CCCTTATCTT
TGCTCTCTGC
CAGTGTTTGG
AGACGCCCAG
GGAGGTGGAG
AAATGAACTT
TTTGAGGAAA
AATTAAAGCA
CCCACTGATT
CACTTTGGAG
TCGTGCCTCA
GGGACATGAA
GCGTGCCTTG
CTGTAGTTCT
CATTTGGCAT
GGTTGCTCAT
CAAGCAAAAT
CTACTGTGCA
TTTACTAATT
AACGCTTGGA
120 180 240 300 360 420 480 540 600
TTCTACAGCA
CGGTCCAGTC
CCCAGAAGCC
GAACTGGCAG
TCTCTGAAAA
CAAAAGATGA
AATGTTTATG
ACATATTGCC
TAAAATCAGA
TACATAATTA
CTATTCAAGA
ATCATCGAGA
TTATTGATTG
TTTACAACTA
CACTGTTCCA
ACGTCTACGG
TTTGCTCTAT
TGGATAAATC
CAAAAGAGCC
TCAGATAGGT
GCCTTCCCAG
TCCCTGACCC
TCTGACAGTT
GAAGACGTTC
AGTGAAACTA
ACAGAGTACA
TAGGTTTTGG
TAAAAAAAAA
ATCTTTGTCG
ATATTAGTCA
TGAGGATGTA
CTTAACACAG
AGTTTTTATG
GGGGCCAGTA
TTTGGAAATT
GCTGCCACTT
TGAAGTTCCA
CTAATTTTTT
ATTTTATAGT
GTTCAGTGAG
AAAAAAAA
660 720 780 840 900 960 1018
INFORMATION FOR SEQ ID NO:40: SEQUENCE CHARACTERISTICS: LENGTH: 1897 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO:40: CGGGGGGCTG GGACCTGGGG CGTAACCGTC TCTACCACGA CGGCAAGAAC AAACATACCC AGCCTTTCTG GAGCCGGACG AGACATTCAT TGTCCCTGAC TGGCCCTGGA CATGRATGAT GGGACCTTAA GTTTCATCGT GGATGGACAG TGGCTTTCCG GGGACTCAAG GGTAAAAAGC TGTATCCTGT AGTGAGTGCC ACTGTGAGAT CCGCATGCGC TACTTGAACG GACTTGATCC TGAGCCCCTG ACCTGTGCCG GCGTTCGGTG CGCCTAGCGC TGGGAAAAGA GCGCCTGGGT
CAGCCAAGTA
TCCTTTTTCG
TACATGGGAG
GTCTGGGGCC
CCACTCATGG
GCCATCCCCG
120 180 240 300 360 CTCTGCCGCT ACCTGCCTCC CTCAAAGCCI
ACCGCCATAC
TGCGCCTGGG
TGCCTCTGCT
GCAGCACCCA
GGCGCCTTCT
CCCACCTGGG
CAGGAATGGC
TGAGACAGAA
GACAGCCATC
ATGGAAGCCC
GGGGGAACCT
GGCCCTTCCC
GCTCTCAGGT
GGGTCCTGGG
CAGGAMCAGG
CCCCAGGTTG
TGCMGGTGCT CTGTGGCCTG
CCACTTAGAA
GGAATGTGTG
TGCACAGATA
CCAAGCACCA
GCTTCTGTTG
AGCTTCGAAT
GGCCTGGAGG
TGGCAAGGCA
CTCAGGCTCA
CCCAGGAGAG
CCCTCTGGCT
AGGAAACCTT
TGCCACACCT
CACGTATGTC
GAGCAGACCC
CCTTGGGACT
TTTGCCCCCG
GCATTAGGAC
GGATTAGAAA
AGACCCCGCC
GAATGCCTGT
GTTTCTCTTG
TGGTGCCAAF
ACCTCAGCCP
ATGCCAACGG
TGAACCAGAT
GGAGTGGGCT
AGGTAAGACT
CCATACAGAT
GAMTTCCCTT
TATTTATTCT
GGTGGGTGGY
GTCCTTGTCC
GTCGTGCATG
CAAGAGAGGC
CATACACCGG
ATTCTTCTGA
ACATGGAATG
GACCAAGAGC
ACACACCCAC
CCTAGCAGCA
CTCTAGAATC
ACCTCCTCTA
TCACTGAGCC
TGGGCAGACG
ACTTCTCCCT
GCAGAGAATA
GCCCCCCACT
AGTAGGAGGT
GAAGCTCAGG
GGGCCAACGA
TTAAACAGTA
TTCCCTCGAT
CAGGCCAGGA
ACCCCTGACT
CCGTGCAAGT
CACACGTGTT
TATTTCCCAT
AGTGGGGTCT
AGGGTGGGGC
TCAAGCCTCA
CGTACATGGA
AACTCCCTAC
CCAGTGATCC
CGTTGGGGTC
TGCCCCCTCA
TCCCAACACT
AACTATGAAA
CTCTGCAGAG
GCCAGGGCTG
ATGTCACATA
GTGCCAGCTT
GCAAAGGCCA
GTGCTTTCCC
CTGTGGCACA
AGTTCCTAAG
CCCCATGTCC
TCAGCCTCTT
TGGCATCCTC
CCAGCCCCTG
GCCATGAAGC
GAAGTGGTGT
GCACCCCACA
ATTGGGAATG
ACATCCCAGG
CGCCGACCCC
TCCTACCGGC
GGCTGAAGCA
ACCTCTCTCA
AGAGGCTACA
ARTCCAAAAG
CCATGGACAM
TAATGTCAGC
TTTATTTATT
CCACCTCCCT
TGAGCTGGTG
TAGCCCTGCA
CCAGGTCCCT
GACTTCCATG
CAAAGCTCTG
GGAAAGCCAC
CTGTATGCCT
GTAGGGCAGC
TGTGCTCCAG
TAGCCATTTG
420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 P:\OPER\VPA\VPACOM-I\SOCSD-l WPD.- 1/11/01 184- GTAGAGGACT TGCCTAGCCT GCAGGAAGCT AAAGCTCAGG AGGCTGAGGC AGGAGGATTG CCTGGGCTAT ATTAAACCTT GTCCTTTAAG CTGAGTTCTG CTCATTTCTG CACAGGTACA AATATATTTA CMTATATATA TATTTGTAAG INFORMATION FOR SEQ ID NO:41: SEQUENCE CHARACTERISTICS LENGTH: 134 amino a TYPE: amino acid STRANDEDNESS: singi TOPOLOGY: linear (ii) MOLECULE TYPE: DNA CACGTTCCAT CCCCTGCACC AAGGAGAATC CTGTCAGTGG TGTACAGAGG TCATGGCCAT AAAAAGAAAA GAAATCAACT TCCATTGAAT ATAGATGACT TKATTTGTTG AAAAATGKTT
AAGCATT
1680 1740 1800 1860 1897
(xi) Gly 1 Gin SEQUENCE DESCRIPTION: SEQ ID NO:41: Gly Trp Asp Leu Gly Arg Asn Arg Leu 5 10 Pro Ser Lys Thr Tyr Pro Ala Phe Leu Tyr His Asp Gly Lys Asn Giu Pro Asp Al a Giu Thr b'he Asp Gly Thr Ile Val Pro Leu Ser Phe Ser Phe Phe Val1 40 Gin Leu Asp Met Xaa Ala Ile Val Asp Leu Lys Gly 55 Tyr Tyr Met Gly Val1 Ala Phe Arg olIy Gly Lys Lys Cys Leu 70 Arg Pro Val Val Ser 75 Val Trp Gly His Leu Giu Ile Arg Met Tyr Leu Asn Gly 90 Leu Asp Pro Giu Pro P:\OPER\VPA\VPACOM-l\SOCSDI-I.WPD 1/11/01 185- Pro Leu Met Asp Leu Cys Arg Arg Ser Val Arg Leu Ala Leu Gly Lys 100 105 110 Glu Arg Leu Gly Ala Ile Pro Ala Leu Pro Leu Pro Ala Ser Leu Lys 115 120 125 Ala Tyr Leu Leu Tyr Gln 130 INFORMATION FOR SEQ ID NO:42: SEQUENCE CHARACTERISTICS: LENGTH: 265 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear ease (ii) MOLECULE TYPE: DNA e (xi) SEQUENCE DESCRIPTION: SEQ ID NO:42: AAGGGTAAAA AACTGTATCC TGTAGTGAGT GCCGTCTGGG GCCACTGTAG ATCCGAATGC e GCTACTTGAA CGGACTCGAT CCCGAGACTG CCGCTCATGG ATTTGTGCCG TCGCTCGGTG 120 .eoee: *CGCCTGGCCC TGGGGAGGGA GCGCCTGGGG GAGAACCACA CCTGCCGCTG CCGGCTTCCC 180 TCAAGGCCTA CCTCCTCTAC CAGTGACGTT CGCCATCATA CCGCCAGCGC GACAGCCACC 240 TGGTGCCAAC TCACTGAGCC GCCTG 265 INFORMATION FOR SEQ ID NO:43: SEQUENCE CHARACTERISTICS: LENGTH: 2438 base pairs P:\OPER\VPA\VPACOM-I\SOCSDI-I.WPD 1/11/01 186- TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO:43: AAGTGGCGGC GGTCCCTGGA GAGCAGGCGG AGGCAGCGGC AAGTCTGACT CTGGGCTGAC
CGTGGAGCCG
CGGGAGTGGC
GGCGGCAGGC
GCCCAGAAGA
TGTAAAGAAT
AACAAGTCGG
GTCTTGGTCC
TGTTGAAATT
CTTAGATCAT
TGCGGTGGGG
ATCTAAAAGA
CTCAGATTTA
CTCAGATGAA
ACGAAGAAAC
GGGCGGGGGC
CGGGCCTCTC
GCTCGGACAG
AACTTCCTCT
TTGTTTAAAA
AGTCGAAGTG
AAAAAGAGTG
CCTCTAAGAA
TCCTGTGGGC
CAGTGTTTTC
AAGATTCATA
GCCTTTAGGT
TGGGTGAGTG
ACAGAAGATG
TTCCGCGCTT
CTCCGCTTGA
TAGAAAAGCT
TGGCTGAAAA
CTGACAGGAA
AGAGTTGTTC
GCCAAGAAAG
ATAGATTTTT
CAATAAAGAA
TCAGTGAACT
GGCATTTTAT
CAGACCTGTC
ACATACCCTG
GAGCGAGCGC
GCTGAGCTCG
GAAAAACACA
CAATAGTAAA
GGATGGTTAT
TGAATCTGAA
GCAGCTTAGC
AGGCCGATCC
TTGTAGTGGC
CATGTTAGAT
TAAACGACAC
TGAGAGGAAA
TTTCTCACAT
TGACAGCCAG GCCTCCGCCT GGCGGGAGCC
CGGGTGATGG
GAGAGATCCG
RTATTTATAA
AATGTAGATG
GTGTGGAGTG
GCCATAGGTA
TGTTCGTCCA
CTTAAACAGA
CGACACTCTC
AAGTGCCCTT
ACTGTTCCTA
CTGAGAGATG
ACCAATGGCC
GCACGAGGAG
CGGTGGTGAT
TCCAGAAAGT
CACTGGAAAT
TACGGCCTAA
GAAAGAAGTT
CTGTTGAGAA
TTGAGTTGGA
AACTGCAAGA
CAGGGCTTCC
TCCCACCTCG
TGAGTCCCAA
CTCAGCTGAA
AGCCTTGTGT
120 180 240 300 360 420 480 540 600 660 720 780 840 900
CATAACTGCC
GGTCACAAAC
GTGCACAAGC
GTTGGAGGCA
CCTCCTTCAG
AGCTCTGCTG
TTATTTATTC
GTGGAATCAT
TACTGGGCTC
GTCCACTCCC
TTGTAATTGT
GTATCTGAAG
GCAGCAGTGA
TATTTTTCTT
CCCTAAGTTT
AAGCAGGTGT
AATTTCAAGG
TTTCCTATGT
GTTACCTATT
ATAATGTAAT
ATGAAACTTA
AACAGTGCT I
AACAGCATAC
TCCAGAAAAP
CCTCCTAAGT
ATCAGTAACA
GAAGGAAAGC
TCTGTTAGTT
AACTTTAGCT
CTGGAACACT
TTAATCCGGA
ACGACTTACG
GAATACCATT
TGCGGAGAGG
ATGCCTCTTT
TAATTCCAGA
TTGGTTTTGT
CATTGCATAT
GTGAAATATT
CTCTTTTCAT
TGACTTTTGT
ATGTATTTTC
CGTGTACAGC
AAGACAGTGP
GGAATAAGCC
TCCACACCCA
ATCCGTGCTA
CAGAGGGCAC
TTAGACGCTA
TTGATGCCCA
ATAAGGACCC
CGTTCCCCTT
ATGGCATCGA
ATAAATCAAA
TTAGAATGTC
GAATTTTTGT
TCAATTTATT
TTTTACCATA
ACATTTGAAT
TTGCTAATCT
CTTGAAGATT
AATTTGCCAA
ATTTTAAATA
TGGTCACATA
CATGGATTCA
CAGGTGGGAA ATGGAAGAGG
*GATCGACTAC
CTGGGGTGTC
CTTTTTACTT
CAGTCGTTCT
TGATCCTTGT
CAGTGCCTGT
TTCCTTGCAG
TGCCCTTCCC
AGTTAGGTTA
GACCTGCATA
ACAAAGGCAG
TTTTTTATGA
TAAATTTACA
ATTCTGTATT
ATGCTATCAG
TTCAGTAAAG
TAGGAGTGTT
TTAACTAAAC(
GTCCACTGCC
ATGGACAAAT
CGAGATTCAG
CTTCATGCTA
GTCTTCCATT
ATGTTCTTTG
CATATTTGCA
ATTCCTTCGC
CTCAGGATTG
CATATTTTCA
TTGAATCAAA
TACACTTGTT
TATGGTCCAG
TTTTAAATAA
TATTCTTGTA
%GTGTTGTAA
%AACAACAAA
7AAGTTTGTT I)
ACTGGTTCTA
GAGGATGAAA
TGATGAACTT
TTATAACGCT
AGATCCTGCA
TTGTTCCAGA
ATGCAGCCGA
CGCAGGAAGA
GAATTGAGCA
CTCCTGATAT
AGCCGCTCTT
GAACGGTTAT
CTATGAAATT
ATGTGCCAGA
TTTAATATTT
TAAAACTGTG
ATATATTTTT
GCATATTTAC
TCTTTTGTTC
rGACCGAATA
LCAATCCATT
kTGATTTAAA
EGTTAGTTAT
960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 P:\OPER\VPANVPACOM- I\SOCSDI-I .WPD 1/1 1/01 188- TCTAGCCAAT AAGAAAAGAG AATGTAGCAT CCTAGAGGTG TATTTGTTCT GCAGTTTGGC AGGACCGTCA GTTAGTCCAA ATAAACATCC CCTCAGCGTG GAGGCGAATG GAACCTGTGC TCCTTTCTTA CGGGAAGCTT TGCAAAGCAA AATAGCAGGG TTACAAGCTT GGAGTTGTTA AGGCAACTAG AGTTTTCTCT ATTAATTTAT AGACTGTTGT TGCACCTACT TAGCTCTTTT TTGGGAACTC TAGTTCCCAG GGGAAAATAC CTCGTGCC INFORMATION FOR SEQ ID NO:44: SEQUENCE CHARACTERISTICS: LENGTH: 542 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA 2220 2280 2340 2400 2438 (xi) Ser 1 Ser SEQUENCE DESCRIPTION: SEQ ID NO:44: Gly Gly Gly Pro Trp Arg Ala Gly Gly 5 10 Gly Leu Thr Val Glu Pro Gly Arg Gly Gly Ser Gly Lys 4 Ser Asp Leu Thr Ala Ser 2S Gly Arg Pro 3ro Ser Leu Pro Pro Gly Gly Arg Leu Ser Arg Thr Arg Ser 40 Met Ser Gly Arg Ala Ala Glu Arg Arg Arg Thr Val 55 Leu Ala Val Val Met Ser Ala Gly I-la Pro Ala Pro Leu Arg Asn Phe Glu 70 Leu Ser Ser Glu Arg Lys Val Gln Lys Arg Leu Glu Lys Leu 90 Asn Thr Xaa Phe 9S P:\OPER\VPAVPACOM-l\SOCSD-j.WPD 1/1 1/01 189- Thr Leu Giu Ile Val Lys Asn Leu Phe Lys Met Ala Giu Asn Asn Ser 100 105 110 Lys Asn Val 115 Asp Val Arg Pro Thr Ser Arg Ser Arg Ser Ala Asp 125 Arg Lys 130 Asp Gly Tyr Vai Trp 135 Ser Gly Lys Lys Leu 140 Ser Trp Ser ',ys Ser Glu Ser Cys Glu Ser Giu Ala Ile 155 Gly Thr Val Giu Asn 160 er Val Giu Ile Pro Leu 165 Arg Ser Gin Glu Gin Leu Ser Cys Ile Glu Leu Ser Leu Lys 195 Asp 180 Leu Asp His Ser Gly His Arg Phe Leu Gly Arg 190 Phe Pro Ile Gin Lys Leu Gin Ala Val Giy Gin Cys 205 Lys Asn 210 Cys Ser Gly Arg Ser Pro Giy Leu Pro 220 Ser Lys Arg Lys Ile 225 His Ile Ser Giu Leu 230 Met Leu Asp Lys Cys 235 Pro Phe Pro Pro Arg 240 )ro Ser Asp Leu Ala Phe 245 Arg Trp His Phe Lys Arg His Thr Met Ser Pro Asn 260 Ser Asp Glu Trp Ser Ala Asp Leu Ser Glu Arg 270 Asp Asp Ile Lys Leu Arg 275 Asp Ala Gin Leu Lys 280 Arg Arg Asn Thr Giu 285 Pro Cys 290 Phe Ser His Thr Asn 
295 Gly Gin Pro Cys Val1 300 Ile Thr Ala Asn Ser 305 Ala Ser Cys Thr Gly 310 Gly His Ile Thr Gly 315 Ser Met Met Asn Leu 320 P:\OPER\VPA\VPACOM-I'.SOCSDI-l.WPD 1/1 1/01 190 Vai Thr Asn Asn Ser 325 Ile Giu Asp Ser Asp Met Asp Ser Giu Asp Giu Ile Ile Thr Leu 340 Cys Thr Ser Ser Arg Lys 345 Arg Asn Lys Pro Arg Trp 350 Giu Met Giu 355 Giu Giu Ile Leu Gin 360 Leu Giu Ala Pro Pro 365 Lys Phe iis Thr Gin 370 Ile Asp Tyr Vai His 375 Cys Leu Vai Pro Asp 380 Leu Leu Gin Ile Ser 385 Asn Asn Pro Cys Tyr 390 Trp Giy Vai Met Asp 395 Lys Tyr Aia Aia Giu 400 55
Aia Leu Leu Giu Giy 405 Lys Pro Giu Giy Thr 410 Phe Leu Leu Arg Asp Ser 415 Aia Gin Giu Ser Leu His 435 Asp 420 Tyr Leu Phe Ser Ser Phe Arg Arg Tyr Ser Arg 430 Ser Phe Asp Aia Arg Ile Giu Gin 440 Trp Asn His Asn Phe 445 Aia His 450 Asp Pro Cys Vai Phe 455 His Ser Pro Asp Ile 460 Thr Giy Leu Leu Giu 465 His Tyr Lys Asp Pro 470 Ser Aia Cys Met Phe 475 Phe Giu Pro Leu Ser Thr Pro Leu Ile 485 Arg Thr Phe Pro Phe 490 Ser Leu Gin His Ile Cys 495 Arg Thr Vai Pro Ile Pro 515 Ile 500 Cys Asn Cys Thr Tyr Asp Giy Ile Asp Aia Leu 510 His Tyr Lys Ser Pro Met Lys Leu 520 Tyr Leu Lys Giu Tyr 525 Ser Lys Vai Arg Leu Leu Arg Ile Asp Vai Pro Giu Gin Gin 530 535 540 P:\OPER\VPA\VPACOM-I \SOCSDI-I.WPD. 1/1 1/01 -191 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 4999 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (xi) SEQUENCE DESCRIPTION: SEQ ID CCCTCTGGGC AAGCCGCCCC CCCCCCACCC ATCTACCACA CACACACACA CACACACACA
CACACATTCA
TGGAAAGTCC
CTGCCACAAA
CTGGCTGTCC
CCTCTGCCTC
TCAATTAAGG
ACACAGCACA
AACTTATGCA
AGAGCAGCCG
GACCTTGGGG
TTACTTCAGG
GGAGTCTTTT
TGGAGCTCAC
CTGAGTGCTG
TTCGTTCCTT
GTTTGTATGC
CATTTGTGAG
TGCTCTTAAC
CAAAAACAAA
AAGGTTGGCA.
TTTTTAATGG
TTTGTAGACC
GGATTAAAGG
TCAGATAACT
CACATTCAGT
CTTCCACTTG
TGCTGAGCCA
GCAAAATAAC AACAACAAAA
GATGAGGAGC
TTTTTCAAGA
AGGCTGGCCT
CGTGCAGCAC
CTAGGTTCTG
TCAGAAGACA
GGAGTGGGAA
AAGGGAACAT
CAGGGTTTCT
CGAACTCAGA
CATGTCCAAC
GGTCAAGCTG
CCCAACCTCC
CCTGAACTGG
ACACTGCCTG
TTTATCAGGA
CTGTATAGCC
AATTCGCCTG
TGGCATTTTC
ACACAAGGCT
CTGGAACTGG
GTCCTCTGCA
AATTAAGTTA
120 180 240 300 360 420 480 540 600 660 720 780 TTTCAGCAGC CTCACATCAG GAAATTAGCCG GGTATGAATC ATACCCTTAG AATCCTAGCA TCTGAAAGCA GAGCTAAGAG AAACAGGGAT TCAAGACCAG CTCTTGGCTA CAGAGCCCGT CCTGTCCTAG GATGGGCTAC AAGAGACTAT TTCAAAGCCA TCCAAACAAC AATAACTACA ACAACAACAA GGTTAAAATT AGGCTGGGCA CAGGGTACAC ACCTTTAATG CCAACACTCA GGAGGCAGAG GCAGGCTGAT CAGTGTGAGT TTGAGTTCAA CGTGGTCTAC ATAGGGAGTT CTAGGCCAGC AGAGGTTACA 840 900
GTCTCTCTCT
CACACACACA
GCCCTGGCAT
TCTGCCTCCC
TTTTTGAACT
GATAATGTAG
AGACAGCGTG
TAAGGCTTCA
GAACACAGGA
GGAATGACTA
ACACACACAC
TTAATTCTAG
AAATAAACTG
TCAAAGAGTG
TTTGCCTGGA
GTCTGGGAGG
CGTCCGCAGC
TATCTGTACT
CTCTCTCTCT
CACACACGGT
AGATTCACTC
AAGTGCTGGG
GTTATCAAGA
CTAATGATCA
CCCACTCGTA
GCAACCTAGG
GCAGGAGTTC
GCTCAGGCAG
ACACACACAC
TCTGGGATGG
CTGAAGAGTC
TGCTCCACAA
TTCTTTGTAG
TGACGATAGG C CCCGCTGGAG C rcCACAGAGG 'I
CTCTCTCTCT
GGCATTATGG
TGTAGACTAG
ATTATAGGTG
GGCTTTCGAG
AACGACACTC
GTTCCATTAC
GAGCCGCAGG
AGGAAGGGTG
AGAGAGCAAA
A.CACACACAC
GGGGGAGGGT
rGACGCCAGG k~GCATGCGCG
'CGGTGGGTT
;TTAATCGTC
CGGAAGCAG
'CTCTGCGAG
CTCTCACACA
GATTTTTTTG
GCTAGCCTTG
TTGCACCACC
GAGGTCAAAC
AAAACTTAAC
TCAGGAGGCT
GGACAGTAGT
TCAAGGCCGC
GGTCTCCAGT
AGAATCCAAG
GGGGCACGCA
GAGTCCTGGG
CTTGTCCACG
CTCAAGGCGG
CACAGAGCCC I rGGCTGGTCA C
"TAGGGGGAC
CACACACACA
GGATAAGGTT
AACTCAGAGA
ACTGCCCAGC
TTCAACAGCA
CCTTAAAGCA
GAAGCAGGAG
CTCAATCCCT
TTACTGATCT
GGAGAAGTCT
GCGATGACGT
GCTGTCAGGT
%.GGGACAAGA
r'CTGGAGTCG
LAAGTGGTGT
GGGGCGGAG
;GGGCGCTTC
~GTGAGGTGC
CACACACACA
TCTCTGTCTA
TCCGCCTGCC
CACTTTGGGA
ACCTCTCCAT
CACATCCACC
GATGAAGGAC
ACATTCTCCT
TAGGGCCTCA
ACACACACAC
CATCAAAGGG
GGCTTTGGAA
GGTTACCCAC
TCACTTATTT
GGCCGCCGTG
CGCGGGCGGG
rAGCCTTCCC 3GGGTAGGGG 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 CCCGGCGTTA GAGCCAGCAA GGGGACGGTT CACGGTAAGG TCTGAGGGAG AGAGAGCTCC 24 2040 TGAGAAACTT GGGGGGCGCG ACACAGATAG GGTGAAAGCA GAGTGATAGA CCTGGGATGG TTAGGGGACC AAGGGAAGAC CAGGCTGGTT GGCATACACC GGTGAACGGA TGGGAGTCCT
AGGGAAAGA I
GGAGCTCAAC
GCTCTCCTCA
GCTCTGTACT
CCTGACCTGG
GATGTCAAGG
GTCCGGGGGA
GAGCAAAGGG
GACCACTATG
GGAAAATTGT
GGTGAGCAGC
CTTGGCTACT
ACCCTCTATC
GGCGAAAGAA
TTGGGCTGGA
ACTTAGATGG
CAGAGCCTTT
GCCTGAGCAG
*GATGCGCCTP
TTTCAAAAGC
*TGGGCCAGAC
CGGACTTCTC
TTGCCCAACG
AAGGGGGTCT
AACGGGGCTA
GCACACACGC
CGGCGCTTTT
ATCATCAGAG
TAGTGGTGCC
CTATTGGGGG
CCTCTGTAAG
GAGGTGAGAT
AACTCATGGT
CCTTGGATCT
GGGTCTCCCT
GATCTCAAGT
LACAGTCCTTT
GAGACGCCCC
GGCCCTGGCA
TCCTCCCGAG
GCACCACGGC
GTGCTTTGAG
TTCGAGAGGT
CGTGGTGGGC
GGGCAGCAAC
TAAGGGCCTC
AGAGAGACTG
CACGTACCTG
TGCTGTTTGG(
ACGGACTAGG
TGGAGCACAG(
AGCTTCACTC(
CAGCTGAGGT C TCAAGGATGC C
CTGTCTCCAC
AGCAAGCCTG
AGGGGCAGCA
GGCTTGGAGG
TGGAACCCCA
CGGCGCCCTG
CTGCACGCCT
GTGGCCACCG
AGCGAGTCCT
GAGGCCCCCC
CTGGTGGTTC
GGACCAGCCT
GGCCAGTGCC
TGTGGGGAGA
3AAGTAGGCT 7CAATCCCTA I GCGGTGGAA I
ACCACTCCAG
TTTTGAGAAG TTCTTCAGCG
GCAGCACCCC
AGCTCCTGTC
AGGATTGCTC
TGGCCCAGAG
GGGAGATCAG
CCCTCGCCCC
GGGGCTGGGA
A~GTATCCAGC
TGGACATGGA
TCCGTGGACT
!GGTCCGCAT
rCACTACTCT
CCTTGTCACT
~TGGATGTGA
TGGAGGAAG
TACCTCGCAG
TGCTCCCCCT
CGAGAACATC
CACTGATGGA
CTGGCCCCTG
GCTGCAGGCT
TATTGGGCGG
TGGACCTCAG
GGAGGGGACT
GAAGGGGAGG
CCGCTACATG
TGGCAATGGT
rTGGCCTGTC
TGCACAAATT
)LAGGAAGGGT
L'GTCTTCCTT
GGGACGATCC
2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 TGGAGTTGC TTACTTACCT CTCTCTCCGC AGTGGAGGAA CCACAATCCC TTCTGCACCT GAGCCGCCTG TGTGTGCGCC 3300 ATGCTCTGGG GGACACCCGG CTGGGTCAAA TATCCACTCT GCCTTTGCCC CCTGCCATGA AGCGCTATCT GCTCTACAAA TGACCCAGTA GTACAGGGTG GACAGGTGGA GAGGCACCCG CTGGCCTAGA GGGGGGCTGG ACCCCTTCAC CTCCCCTTCT
AAACACCATG
CAACCTGTTT
CCATCACTGT
CTTACATGTA
CTATCCCAGG
CTGCATCAGA
AGGCCAAAGT
AATGGAAGGT
CACTAGGAGC
CTGGTCCCAA
CACAAGATGG
TTCTGGGCAA
TGGCATTGAT
CCACGTCAGG
GAACAAGAAG
CAAGGCAGCC
CTCCATAAAT
GCAGCCTGGG
CATTATTGTT
CTTAAGGAAT
GGATGGTTCA
CCTCTTAGGG
CATCCAGTAG
GACACGAAGC
GATTTCACTT
CACCTTGGTG
CCATAATAGG
GGCAGATGAT
CCTAGTCCAT
GATGTCCACA
CTGGCTTGCC
ACAGTTTGGT
TCAGTCTGTC
GATCCGGGTG
ACAAAGAGGT
TTTTGTTTTG
TATGACAACC
CAAACACAAT
TCTCATGTAT
AACTTGGGAG
CCACTTCCTG
GTCAGGGCCC
ACAGTTGATT
GCGGTGGAAA
GTCATCAGAA
GCTATGCAGG
AATTCAGGCT
AGCTCTTTGC
CAGGTCTATG
TTAGCCCATT
CTCTGAGCCAC
CAACTTTAAA
CACAGGAGGA
TTTTGAAGTA
TTTTACACTC
CACAAAGCTC
ACAGGGGCTT
ACCGAATTCA
TGAAGCTAGA
TGCTCCAACC
AAAGGGACAG
CTACCCACTG
CGGCTCAGGA
GCATGTGACC
CAGGTAGAGG
TGAGAGATGC
AGGTTGCTCC
NTCAGAACAC
L'CCGTCTTAG
'CCCATCATT
TGCTGGCACC
AAGCTGGTGPA
AGACATATAG
AAAAATGAGA
CCCCACCCCA
AGGCCCAGGT
TGGCACCGTG
GACCCGAAAG
GCCAAGGCCA
ATGAGTTTCC
TCAGTTCTAC
TAAGTGGTAA
GGGTACAGCG
GGTGGGAGCA
GATGGGCAGT
GCCACCCACA
AGTCACAGAA
TTAAGCCCCA
CTAGAGCCAA
GACATTGGAT
*CTACCGTGGG
AGCTGGGGGG
AAATGATATT
TGTATTGTCA
GGCTAGAGCC
GTTTATTTCC
GGGGAGGGGA
CTCTGAATTT
TCTAAGTGAC
AGCCCAAACC
TCCCTCCCCT
AGGGATTGGC
TGGATTAGGC
GTTACTAAAC
GCTCATTGTT
AGGAAGCCGT
CCTGTACCAG
CCTCTCTGTG
AGCCACTCAC
TTCAGCCATC
3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560
CCCGGAGCTT CTCGTGTACT TCCTGTGCCT AGAAGGAGGA TCCTTCCTAT CTATCATTCA AGGAGTAAAA ACCACTGGTT CAGAAAAGCC CCGGGACCAG AGAGTGGCAA GGCTCCAATC ATTTTTGGCA AAGTCACTCT CCTTGGTGAG TTTGGGGGCC GGATGGGCTC CATAGCTGTG TGAGTCTGTT AAAGCCGGAC GTTACCTGCT GAGGGGTTGC CGTCTTGCCA GTCCCAATGG AGGACCACCT TGCTCCAGTC TTTCACATTA TCTGTGGGGC AGGAGCTGAC CCGCCAAGC
INFORMATION FOR SEQ ID NO:46:
SEQUENCE CHARACTERISTICS:
LENGTH: 264 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
GGCAGAGCTA
CTCACATAGA
CCACCAGGCT
CTCTGTCTCT
AGGCTGAGGA
CCCACACAGG
AGAGAGGAGA
CTAAGTAAGC
GTTGAGTTTC
TGGAATGAAC
AAAGGGGCTT
GCTCTGGGTA
TTCATAGGCC
GTGAGTAGGA
4620 4680 4740 4800 4860 4920 4980 4999
0 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:46: Met Gly Gln Thr Ala Leu Ala Arg Gly Ser Ser Ser 1 5 10 Gln Ala Leu Tyr Ser Asp Phe Ser Pro Pro Glu Gly 25 Leu Ser Ala Pro Pro Pro Asp Leu Val Ala Gln Arg 40 Asn Pro Lys Asp Cys Ser Glu Asn Ile Asp Val Lys 55 Thr Pro Thr Ser Leu Glu Glu ',eu His Gly Trp His Glu Gly Gly -*eu P:QOPER\VPA\VPACON-I\SOCSDI-.WPD 1/11/01 196- Cys Lys Phe Glu Arg Arg Pro Vai Ala Gin Arg Giy Leu His Ser Thr Asp Giy Val Arg Gly Arg Gly Tyr Ser Gly Ala Trp Giu 90 Val Giv Val Ile Ser Trp Pro Leu Glu Gin Ala Pro Leu 115 Giu Ser Trp Arg 100 Gin Thr His Ala Ala Ala Asp His Tyr 120 Gly Ala Leu Leu Giy 125 Tyr Thr Ala Leu 110 Ser Asn Ser His Gin 23er Gly Trp Asp 130 Lys Gly Ile 135 Gin Arg Gly Lys Leu 140 Pro 0 0SSS S@ SO
0 0000 5 0055 Leu Giu Ala 145 Leu Pro 150 Arg Tyr Pro Ala Gly 155 Leu Gin Gly Giu Gin 160 Val Val Pro Giu 165 Ser Leu Leu Val Asp Met Glu Giu Gly 175 Thr Leu Gly Gly Leu Lys 195 Gin Cys Gin Tyr 180 Gly Ile Gly Gly Thr 185 Pro Leu Gly Pro Arg Thr Leu Tyr 200 Tyr Ser Val Ser Ala 205 Arg Ala Phe Arg 190 Val Trp Gly Val Glu Giu Val Arg Ile 210 Pro Gin Arg 215 Leu Met Gly Giu Arg 220 Val1 Ser Leu Leu 225 Gly His 230 Gly Ser Arg Leu Cys 235 Leu Arg His Ala Leu 240 Leu Pro Pro l~a 255 Asp Thr Arg Leu 245 Gin Ile Ser Thr 250 Pro Met Lys Arg Tyr 260 Leu Leu Tyr Lys INFORMATION FOR SEQ ID NO:47: P:AOPER\VPA\VPACOM-M0SCSDI-I .WPD 1/11/01 197- SEQUENCE CHARACTERISTICS: LENGTH: 5615 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO:47: GTACTTTCTT TATATCTCCA TAATTTTATT TACTATTACT ACATGATACA TTATTTTATA
AAAGTCTTTG
TTAAATGCGA
CTACCAACCC
ACAGGGAGGT
CCGGGGAGTC
TGTCGGCACT
CTCTGTGCTA
ACCATCATTG
GGCTCATGCC
GGCCGGACGT
TCACAAGGTC
AAATACAAA.A
TGAGGCAGGA
TAACCTCCTT
ACCAGAAACT
CATGCAATAG
TCAGTAATTT
TGGTCCCACA
CATTAAGCAC
CATGCTTTAC
CTCCTATTTT
TGAAATCCCA
GGTGGCTCAC
AGGAGTTCTA
AATAGCTAGG
GAATCCCCTG
CTTCCAAATG
ATACTAATGT
GCCCAAGGTC
GCTGGCATGT
ATTGACAGCT
CTGGATTATT
ACATAACAGA
GCACTTTGGG
ACCTGTAATC
GACCAGCCTG
CGTGGTGGCA
AACCTGGGAG
TGTTACATCT
GATCTCTGTC
ATACACACAC
TTGCCATTAT
ATGCTTGGTG
TCAACTGCAC
AAACTACAGA
AGACCCTGTC
TCAGCACTTT
GCCAACATGG
GGTGCCTGTA
ATGGAGGTTA
AAGGATTCAC TGCTTAATCT CCAGTGCTTA
ATAACCTCAT
TTACAGAGGA
TGGCCTTCAG
ATTATATTGC
AGTGACTACT
AACAACCCTG
AATCTGGGGC
TCTAAAAAAA
GGGAGGCTAA
CAAAACCCTG
ATCCCAGCTA
CAGAGAGCCG
*GCACAAATCA
TGGATTCTCA
AGAAACAGGC
GTATTCATGC
CTCCTTATAG
ATGTACCCAG
TGAGGTAACT
TGGGCGTAGT
ATTTTTTTTT
GGCAGGCAGA
TGTCTACTAA
CTCAGGAGGC
AGATCGTGCC
120 180 240 300 360 420 480 540 600 660 720 780 840 GCTGCACTCC AGCCTGGGCA ACAAGAGCAA GACTCTGTCT CGAAAAAAAT AAAAATAAAA
ATAAAAATAT
CTTGGGAGGC
TTTTTTAAAA
TGAGGTAGGA
ATTAGCTGGG
GGATCACTTG
TGTGGTAGCA
AGCCCAGGAG
TGATGGCGCC ACTGCACTCT AGCCTTGGTG ACAGCAAGAC
AAGAGAAATC
AACTCGAAGI
GCATCCAGTPA
AAGCAGATTT
ACCAGCCCAG
AAGCCCCTCC
GGCTCAAAGG
ATCAAAGCGT
GGGTTAACTG
GACTGAGACC
CGGCTGCTGC
GGCCTCGGAG
AGGGGCGGGG
CGGTCTGCGC
CTCCAGGGCT
GGGGGATGGT
GATCAGATGG
AGCGGGCCCG
GGGCAACTTC
CTTAATCAGC
TGAAGACTGC
CACAGCCTCA
AAACAGTGAC
TTGACCCCAT
TCTTGACGGT
TAATTCTGAG
CCCCGGGCCT
CGAGGTTAAC
CGGTATAGAG
CTGCACGGCC
CCGCGGGACG
CCTGCGCGCC
GGGAAGTCAA
CCATGCTGAG
GGCGAGGGCA
TTGGCAAACT
CCCAAGATCG
ACACTCTACC
ACCAGCAGGG
GAGGTGGCAC
CCAGAATCAC
GCTCCTTACC
GGAGAACACC
ATGGGCCTGC
TCGCCGTGGG
CGCCCGGGGT
CGGTAACTGC
GTGGGCGGCG
GGCTCGGCCC
TCGGCTTCTT
CCGAGGTTCG
GCCCAAATGG
GATGAAGGGC
TGGGTGAAAG
CGCAGTTAAC
AAATGAGATC
AGAACTATGA
AGGCTGACTC
AGGGAAGTAG
CTCAGGGGCG
ATCCCCAGGG
CCGGGTGCGG
GGCGGGGCCT
GGGCTCCACG
CCAGGAGGGG
ATGAGAGGGT
A~AGGGAGGAG
TCCGCCCGGC
GGGGCAGCGG
3GCGAACTCG 7CAGGAGCTT I 7ATGGGGTAC C
CATGCCTGTA
GTCAAGGCTG
CCTGTCTCAA
TAGTGGCATA
AACGGCTCAG
TGCGTACAGC
ACAACCCGGG
AAATGGGATT
CAGGAGTTAG
ATTCCCGACG
ACTCTGCCGC
CGGGGAGGGT
GGGGCGGGGC
GCGGGGCCCC
TAAGCCCCAG
CTGGGGGCGG
ICCTTCAGAG C GAGGGCTCC C GAGAGTCTC T GGGGCAGCG A TGGGTGACG A
GTCCCAGCTA
CAGTGGGCTG
GCTTCACTCA
TAATGGATTG
CTAGAGCCTG
GCAGAAAGCG
CGGCACAATG
TCGCTCAGGC
CGGTGATGCC
A.GCAAGAGAA
CACAGCCCGG
%TGCTCTCCG
%.CAGGGGCGT
%~GGGCCCTGG
AGCGGCCGG
,CCCGGCGAC
~GGCGAGTAA
GGCGACCTG
~GGAGGGAGG
.GCCCCCGCC
960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160
AGGATTCTGC
GCTCAACTTT
TCTTCACGCC
CAGAAGAGAA
CCTTTTCTCC
AGACGCCCCA
CAGCTCCCTT
GCAAGCCTCT
CCAGGTCAAT
TTCGGGGAGT
CCAAACTGGA
CCTCTAGCTC
2220 2280 2340 CTCACCTCCA TGGGCCAGAC AGCTCTGGCA GGGGGCAGCA GCAGCACCCC CACGCCACAG
GCCCTGTACC
CCTGACCTGG
GAGGTCAAGG
GCCCGGGGTA
GAGCAGAGGG
GACCACTACG
GGGAAGCTGT
GGTGAGCAGC
CTGGGCTACG
ACCCTCTATC
GGCGAAAGGA
GTGGTTTGGG
CCTCAACCTC
ACCCAACAGC
CAGTAAATGG
ATATCAACCC
AAAGAAGGAT
TGTCTCTGTG
CTGACCTCTC
GGGCCCAGCG
AAGGAGGGTT
AGAGGGGCTA
GCACGCATGC
CGGCGCTGCT
ACCATCAGAG
TGGAGGTGCC
CTATTGGGGG
CGGCAGTAAG
GAGGTGAGGC
ATGGAAACTC
TGTTCAGTGC
AATAGAGGTG
TGAACCTTCA
CTGAGCAAGA
GGGCAGAGCC
CCACATACCT
GCGCCACGGI
GTACTTTGAG
TTCAAGGGGC
CGTGGTGGGC
GGGCAGCAAC
CAAGGGGCCC
AGAGAGACTG
CACCTACCTG
CGCTGTCTGG
CTGGGGCAGA
TTCTGACAAG
TGGGAAAGGC
AAACAGGCTT
GAATGGAGGG
GGTGCAAAGC
AGGTACCCAG
ACTTCCTTCC
*TGGAACCCCA
CGGCGGCCCG
CTGCACGCCT
GTGGCCACGG
AGCGAGTCGT
GGAGCCCCCC
CTGGTGGTTC
GGGCCAGCAT
GGCCAGTGCC
CGTGGGGAGA
AGCAGAGGGG
TAGGGGTCTT
GAGAAAGCAA
AGGAACTGCA
GTTAGGTACT
GCTGTATACC
TCAGCCACAC
AAGACTGTTC
TGGCCCAGAG
GGGAGATCAG
CCCTCGCCCC
GGGGCTGGGA
AGTATCCAGC
TGGACATGGA
TCCGCGGACT
AGGTCCGCAT
ACTTTCTGTC
ATGGACCTTC
CACAGCTGTT
CTTTCTCAAG
GGGATGAGAG
GGGTTTGATG
3GATTCCCTG
TCTGGATGG
AGAGAACATC
CACTGATGGG
CTGGCCCCTA
GCTGCAGACT
CATCGGGCGG
GGGAACTCAG
GGAGGGAACT
GAAGGGCAGG
CCGCTACCTG
CCTGGTGGCA
ATCCAGCCTG
ATTTAATTTA
TTCTCTTGGC
AATTCAGGAG
TACAGGTCCA
3GCTCTAACC k.GACACTGGG CTGTCCCGAG GGCTTGGAAG AGCTGCTGTC TGCACCCCCT 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420
GCCCTGGGCP
GGAGGAGCCI
TGGGCTGTAG
GTCAAAGCCT
CTTAGCCAGC
AGAACTTCCG
GCAGAAAGGC
ATACTTACAT
CGCCTGTGTG
TTGCCCCCTG
TGCTGAGGTC
CCAGCTGCTG
ACAGCCCCAG
TCTGCTACGC
TTCTTGTTTT
GATTATCTGG
CCCTCACATG
CACTCCTGCC
CTTCTGCCCT
TCAGCCAAGG
CCAGGGAGGP.
*CCCCAGGAAC
AGTGGAGGGC
*GGGCCAGCTC
TCTTCACCCC
GGTATCTCAA
ATCTGAGTAG
TCCCCTCCTT
TGCGCCACAA
CCATGAAGCG
TTGCCACCAC
AAAGCTGGTG
AGCCACTTGG
AAAACATTTT
TTTTTTTTTC
GCAAGTCCAG
GGTGGTTCAC
TTCTGAGGGG
CACTAGGGCC
AGAAGCCACC
GAGCAGTGGA
TGACTGGGTC
CATCCCTCCT
CACCACTGTC
CAGCTCTGAC
ATTCCCCTTT
GACCCCGTAG
CTCTCTCCCA
CCTGGGGGAT
CTACCTGCTC
CCCTCCCCTT
AGGCTGAGCC
AGGGAGGAAG
TTCAAGTAAA
TTGCACAAAT
TGAAGGCAGA
ATACACAGCA(
ATCTTGGCCTC
TAGGGAACCC I TTGGTGACGT I
CAGGGCTTGG
CACCTCAGCC
AGAGCCACCT
TAGGGATGTG
CAGCCAGGTG
TTTGAGGACA
GCGGAGCCAC
ACCCGGCTCG
TACCAGTGAG
GGGGAGGTGG
CCTACCCCAA
AAAGGGAGCC
AATAGTAAGA
GATCATTTAT
CAAACCACAA
2AGAGGCACG
"ACGGTGTAA
.GGAGCAAAT 'TAGTTCCAA AGCTGCTCTC TGCAGTTGTG
CCAGCTCCCA
TGGCCTGTTG
TGAAATCTTA
GGCACACTCG
TCTGGCTGGT
ACTCCCTTCT
GCCAGGTGTC
CCCTGTGATA
GGAGGCACTG
CCCAAGCTCT
GGCGTTCAAG
GATGTTGTTA
%TAGCTGCCT
3ACCTAGTGC 3GCACCATGG
AAGGGAGAG
CCACCACGC
CATTATAGT
AGCCTCTGGA
TTTAGAGGGC
TCTGGGAGGC
AAGCAGGAAA
GGCTGCACCC
GCACCTGAGC
TGCCCTGCCC
CCACAGACTG
CTGGCCTAGA
GCGGAAATCA
GCTATGACAG
TAGAAACCTG
CAAAAAGGAA
CAGGTTTATT
GAGAGGGCAG
GATGGTTTCT
CTTCCATCTC
k.AGTGGAGAA GGAGGCAGGG CCTTAGGGTG GGGCAGCAGG 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 GGGATTGGCC TGGTCCCAAC CATTACAGGG TGAAGATATA AACAGTAAAG GAAGATACAG 48 4680 P:\OPER\VPA\VPACOM- I \S0CSDI-.WPD 1/11101 201 TTTGGATGAG GCCACAGGAA GGAGCAGATG ACACCATCAG AAGCATATGC AGGGAAAGGG 4740 CAGTTACTGG GCTTCTGGGC ATGGGGCTCA TTGTTTGGCA CCACAAGGAA GCCATCCACA CAGAGCCTGG GAAGGGAGCA CCCCACCTCC ATGTCCGAGG CCAAAGCCAC TCACCTCCAT GGACTTCAGC CATCCTCGGA CTAAGTAAGA CCTTTTCTGC GTGTGGTTTC TGACTGAGTC ATGCCCTTGG ACAAGTCACT TTGCCCCATG TGGGCTGTGT *CTCCTGGGCC CGGTTACCTG GGCTCATAGG CCAGGACGAC AGATGGAGGG AAGGGTGAAC TGCTTAGTCC CTGGCTTGGC AGGAAGGGTA
TTGATGATGT
TCAGGCTGGC
GAACAAGGGC
GCTCAGTCTA
AAATGATACG
GCTTCTCGTG
CTCTCTAAGA
AGAGTACCAG
GTCTCTGGGT
CTGTCCAAAC
TTGGGGTGTT
CTTGCTCCAG
CCACGAATTC
TGGCCAGCTC
TTGGTCAAGA
GTCCTCAGCC
GGTGCTCTGA
TACTTCCTGG
GGAAAAATCA
GGCTCTGATC
TCAAGGTCTC
CTATTGAGGC
GCAGTCTTGC
TCCTTCACGT
GGGCTTGAGG
CTTGCAGGTT
ATGGGATGAG
CACTCCACCT
GCCACCGCAT
GCCTAGAACA
CTGGCACCAG
CAAGCCAGGC
TGTGTCTTTG
AGGCTGGGAT
CAGTACCAAT
TATCTGCAGG
GGGAAGATGG
GAAGCACCAC
GCCCCAGTCA
TCTGCCCCAT
CAGCCGGGAA
CAGAGACGTT
AGAAGCTGGC
TGGACACTTA
CCTGGACTGG
AAATAAGGGG
GAGGGCAGGG
GGCCCACACA
GCAGAGATAC
4800 4860 4920 4980 5040 5100 5160 5220 5280 5340 5400 5460 5520 5580 AAGAAAGAGC TCTCCAGCCA GGTTCTCCGG AGTACGAAGA ACGGTGGCCT ACTGCCCCCT AGTGGACATT GGGGG INFORMATION FOR SEQ ID NO:48: SEQUENCE CHARACTERISTICS: LENGTH: 263 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear 5615 P:\OPERVPA\VPACOM-l'SOCSDI-l WPD 1/11/01 202 (ii) MOLECULE TYPE: DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO:48: Met Gly Gin Thr Ala Leu Ala Gly Gly Ser 1 Gin 5 Pro 10 Pro Ser Ser Thr Pro Thr Pro Giu Gly Leu Giu Glu Leu Ala Leu Tyr Asp Leu Ser Leu Ser Ala Asn Pro Lys Pro Pro Pro Asp Leu 40 Asn Ala Gin Arg Arg Glu His Gly Trp Gly Gly Leu Asp Cys Ser Tyr Phe Giu 55 Val1 Ile Glu Val Lys Asp Giu Arg Arg Lys Pro 70 Arg Ala Gin Ser Thr Trp Gly Ala Arg Arg Gly Tyr Ser Gly Gly Leu His Ala 90 Val1 Glu Ile Ser Trp Pro Leu Glu Gin Ala Pro Leu 115 Giu Ser Trp Arg 100 Gin Thr His Ala Val1 105 Ala Gly Val Ala Thr Asp His Tyr 120 Gly Ala Leu Leu Gly 125 Tyr Thr Ala Leu 110 Ser Asn Ser His Gin Ser Gly Trp Asp Arg Gly Lys 130 Lys Gly Pro Gly Ala 145 Leu Pro 150 Arg Tyr Pro Ala Gly 155 Leu Gin Gly Giu Giu Val Pro Glu 165 Leu Leu Val Val1 170 Asp Met Giu Glu 175 Thr Leu Giy Tyr Ala Ile Gly Gly Thr Tyr Leu Gly Pro Ala Phe Arg 180 185 190 P:\OPER\VPA\VPACOM-ISOCSDI-I .WPD- 1/11/01 -203- Gly Leu Lys Gly Arg Thr Leu Tyr Pro Ala Val Ser Ala Val Trp Gly 195 200 205 Gin Cys Gin Val Arg Ile Arg Tyr Leu Gly Glu Arg Arg Ala Glu Pro 210 215 220 His Ser Leu Leu His Leu Ser Arg Leu Cys Val Arg His Asn Leu Gly 225 230 235 240 Asp Thr Arg Leu Gly Gin Val Ser Ala Leu Pro Leu Pro Pro Ala Met 245 250 255 Lys Arg Tyr Leu Leu Tyr Gin 260 INFORMATION FOR SEQ ID NO:49: SEQUENCE CHARACTERISTICS: LENGTH: 28 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO:49: AGCTAGATCT GGACCCTACA ATGGCAGC 28 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (xi) 
SEQUENCE DESCRIPTION: SEQ ID AGCTAGATCT GCCATCCTAC TCGAGGGGCC AGCTGG
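
The declared LENGTH of a nucleotide entry can be cross-checked against the residues actually printed. The following is a minimal Python sketch and is not part of the specification. The function name `count_residues` is hypothetical, and the sketch assumes that a printed row consists of blocks of letters optionally followed by a running position number. It is applied to the two primer sequences at the end of the listing, copied as printed above.

```python
def count_residues(row: str) -> int:
    """Count sequence letters in a listing row, skipping position numbers."""
    # Blocks made only of letters are sequence; purely numeric blocks are
    # the running position markers printed at the end of a row.
    return sum(len(block) for block in row.split() if block.isalpha())


# Rows copied from the listing; the trailing "28" is the position marker.
seq_id_49 = "AGCTAGATCT GGACCCTACA ATGGCAGC 28"
final_primer = "AGCTAGATCT GCCATCCTAC TCGAGGGGCC AGCTGG"

# SEQ ID NO:49 is declared as 28 base pairs, and 28 letters are printed.
assert count_residues(seq_id_49) == 28

# The final primer has no legible declared length; this only reports
# how many bases are printed.
print(count_residues(final_primer))  # prints 36
```

A check of this kind only confirms that the number of printed letters agrees with the declared length. It does not detect substituted characters or blocks that were reordered during extraction.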

Claims (31)

15-11-04; 9:51 :DAVIES COLLISON CAVE Pat.&Trad :61 7 3368 2262 4/123 P:\Otus\VA\VPA CowvrnB\SOCS DIVSH2 vt z 2 cLBa'1.WPD 15/ 11/04 205 THE CLAIMS DEFINING THE INVENTION ARE AS FOLLOWS: 1. An isolated nucleic acid molecule comprising a sequence of nucleotides encoding or complementary to a sequence encoding an SH2 domain of a protein or a derivative, homologue, analogue or mimetic thereof or a nucleotide sequence capable of hybridizing thereto under low stringency conditions at 42 C wherein the protein comprises a SOCS box in its C-terminal region, which SOCS box comprises the amino acid sequence: X, X 2 X 3 X 4 X 5 X 6 X 7 X 8 X 9 Xo X 12 X 1 3 X XX 4 XI X 6 X 17 X X 20 X 2 1 X 22 X 23 X 2 4 X 25 X 2 6 X 2 7 X 2 8 wherein: X 1 is L, I, V, M, A or P, with the proviso that it is not A ifXz, is A; X 2 is any amino acid residue; X 3 is P, T or S; X4 is L, L V, M, A or P; X 5 is any amino acid; X6 is any amino acid; SX 7 is L, I, V,M, A, F, Y or W; XgisC,TorS; 20 XisR,KorH; Xo is any amino acid; X 1 is any amino acid; X 2 is L, IV, M, A orP; X 13 is any amino acid; X 14 is any amino acid; X 1 is any amino acid; X 16 is L, I, V. M, A, P, G, C, T or S; S* is a sequence of n amino acids wherein n is from I to 50 amino acids and wherein the sequence X, may comprise the same or different amino acids selected from any amino acid residue; COMS ID No: SBMI-00996928 Received by IP Australia: Time 11:32 Date 2004-11-15 15-11-04: 9:51 :DAVIES COLLISON CAVE Pat.&Trad :61 7 3368 2262 5/123 P:\OIw\AM\VPA C wou~rs\SOCS DIV s112 P.vr CIr.WD. 
15/11/04 -206- XtisL,I,V,M, A orP; XS is any amino acid; Xl, is any amino acid; X 2 0 I, V, M, A or P; is P; X,2 is L, I, V, M, A, P or G; XZ is P or N; is a sequence of n amino acids wherein n is from 0 to 50 amino acids and wherein the sequence XN may comprise the same or different amino acids selected from any amino acid residue; X 2 4 is L, I V, M, A or P; X2, is any amino acid, with the proviso that it is not A if X, is A; X,26 is any amino acid; X27 is Y or F; and X 2 8 ,isL,I,V,M, A orP,and **wherein the SH2 domain comprises the amino acid sequence: I X, z X3 X X X X, X9 XIo XII X1 X3 XI4 XIS X16 Xl7 XIS X1X X21 C22 X23 X4 X, X, X2 X X29 X X X33 X,4 [Xp] X 5 X, X X, X, X X X4, X43 X44 X 4 5 X6 X4 X4 X5 X51 X X X, X,5 X X5 X, [Xq, X, W X. Xr], ,12 X 6 4 3 X 65 X4 66 wherein: X, is G or P; X 2 is F, W or C; X3 is Y; SX4 is W; SX 5 is G or S; X is any amino acid; X, is L, M or V; X, is any amino acid; X, is any amino acid, with the proviso that it is not A if Xo is S, X,3 COMS ID No: SBMI-00996928 Received by IP Australia: Time 11:32 Date 2004-11-15 1.5-11-04; 9:51 :DAVIES COLLISON CAVE Fat.&Trad :61 7 3368 2262 6/123 P:\Oril\VPA\VPA CowtC\SOCS DIV SH2 uwrvn 2 .mN.wDo- 15/11/04 -207- is R, X, 4 is Q, X 1 is H, is Q, Xl, is K, is M, X3 is T, X 33 is S, X4, is T, X 4 2 is R, XY is V, X2 is A, X 54 is S and X6 is Q; Xto is any amino acid, with the proviso that it is not S if X is A, X 13 is R, X 1 4 is Q, XI is H, is Q, XI, is K, is M, X 3 0 is T, X 3 3 is S, X4 is T, X 4 2 is R, X47 is V, X,2 is A, X54 is S and X, is Q; X, is any amino acid; X,2 is A; X,3 is any amino acid, with the proviso that it is not R ifX 9 is A, XIo is S, X, 4 is Q, is H, X,7 is Q, X, 8 is K, X9, is M, X3o is T, X33 is S, X4, is T, X42 is R, X4, is V, is A, X. 
is S and X, is Q; X,4 is any amino acid, with the proviso that it is not Q ifX is A, XIo is S, X, is Q, X 5 is H, X7 is Q, X,s is K, Xq is M, X3o is T, X3 is S, X1 is T, X, is R, X, is V, X2 is A, Xs4 is S and X, is Q; X 15 is any amino acid, with the proviso that it is not H ifX, is A, X,o is S, X 1 is Q, X14 is Q, X is Q, X is K, Xo is M, Xo is T. 20 X33 is S, X41 is T, X42 is R, X 47 is V, XS2 is A, X54 is S and X, is Q; Xs is L; is any amino acid, with the proviso that it is not Q if X, is A, Xto is S, X,3 is Q, X, is Q, is H, X1, is K, X, is M, X3o is T, 25 X3 is S, X4, is T, X 4 2 is R, X, is V, X2 is A, X,4 is S and X, is Q; X is any amino acid, with the proviso that it is not K if X, is A, Xo is S, is Q, X14 Q, X is H, X is Q, X is i M, Xso is T, X,3 is S, X 4 is T, X42 is R X4, is V, is A, X,4 is S and X, is Q; is any amino acid, with the proviso that it is not M ifX, is A, Xo COMS ID No: SBMI-00996928 Received by IP Australia: Time 11:32 Date 2004-11-15 15-11-04; 9:51 :DAVIES COLLISON CAVE Fat.&Trad :61 7 3368 2262 7/123 \VPVPA Comi i!n.SOCS DIV SH2 kEViD 2 C.mAN.vwn 15/1 1/04 -208- is S, X 13 is Q, X, 4 is Q, X 1 5 is H, X, 7 is Q, X18 is K, X30 is T, X 3 3 is S, X 4 is T, X 4 2 is R, X 4 7 is V, X 52 is A, X 5 is S and X, is Q; X2o is P; is any amino acid; X is G; X2~ is T or S; X4 is F; X 25 is L; is V, I or L; X 2 7 is 1 is D; X29 is S; X 3 o is any amino acid, with the proviso that it is not T ifX, is A, X 10 is S, X,3 is Q, X,4 is Q, is H, is Q, Xl, is K, is M, X33 is S, X4, is T, X4, is R, X47 is V, X,2 is A, X, is S and X .:is Q; SX 3 is any amino acid; S: X32 is any amino acid; 20 X33 is any amino acid, with the proviso that it is not S if X, is A, X,0 is S, X is Q, X,4 is Q, is H. X7 is Q, X1, is K, X,9 is M, X3o is T. 
X4 is T, X4, is R, X4, is V, Xs2 is A, X54 is S and X4 is Q; X4 is any amino acid; 25 [Xp] is a sequence ofn amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xp may comprise the same or different amino acids selected from F, L or I; X3, is any amino acid; X 3 is L, I or V; X37 is S; X3, is V or F; COMS ID No: SBMI-00996928 Received by IP Australia: Time 11:32 Date 2004-11-15 15-11-04: 9:51 :DAVIES COLLISON CAVE Pat.&Trad :61 7 3368 2262 8/123 P:\OIt\VI*A\VPA COMwmrlSOCS DIV SH12 u-ven2 culA.WD 15/11/04 -209- X 39 is K or R; is any amino acid; X 4 is any amino acid, with the proviso that it is not T if X, is A, X, 0 is S, X 13 is Q, X, is Q, X, 15 is H, X 1 7 is Q, Xg is K, X 1 ,9 is M, X 3 is T,X 33 is S, X42 is R, X 4 7 is V, XZis A, Xs 54 isS andX 64 is Q; X 4 2 is any amino acid, with the proviso that it is not R if X, is A, X, 0 is S, X 1 3 is Q, X 1 4 is Q, X 1 5 is H, X 17 isQ, Xs is K, X, 9 is M, X 30 isT,X 33 is S, X 41 isT,X 47 isV,X 5 2 isA, X 54 isS andX, is Q; X43 is any amino acid, X44 is any amino acid; X 4 is any amino acid; X, is any amino acid; X4, is any amino acid, with the proviso that it is not V if X, is A, X 1 0 is S, X, 3 is Q, X,14 is Q is H, X, 17 is Q, XIg is K, X19 is M, X3, is T, X33 is S, X4, is T, X42 is R, X, 2 is A, X,4 is S and X is Q; X4 is R; X 4 9 is V, I or M; XS is any amino acid; X,51 is any amino acid; X 2 is any amino acid, with the proviso that it is not A ifX, is A, X1o is S, X 13 is Q, X, is Q, X,,1 is H, X,7 is Q, X, is K, is M, X 3 0 is T, X, is S, X 4 is T, X 42 is R, X 4 7 is V, X, 4 is S and X, is Q; X13 is any amino acid; X54 is G, S or H, with the proviso that it is not S if X, is A, X10 is S, X1 is Q, X,4 is Q, X1s is H, is Q, Xi is K,.X19 is M, X30 is S 30 T, X33 is S, X4, is.T, X 42 is R, X 4 7 is V, X 5 2 is A and X, is Q; X5 is any amino acid; 0 9i COMS ID No: SBMI-00996928 Received by IP Australia: Time 11:32 Date 2004-11-15 15-11-04: 9:51 :DAVIES COLLISON CAVE Pat.&Trad :61 7 3368 
2262 9/123 P:AO\-^VA\VFpA ,nzraP\SOCS DIV SH12R vD2CmEA .Wt 15/i1/04 -210- X, is F; X, 7 is any amino acid; X, 8 is L or F; is a sequence of n amino acids wherein n is from 7 to 14 amino acids and wherein the sequence Xq may comprise the same or different amino acids selected from any amino acid residue; X, 9 is D or E; X6o is any amino acid; X 61 is L, V, T or I; is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xr may comprise the same or different amino acids selected from any amino acid residue; X, is L or A; X6 is L, V or I; X6 is any amino acid, with the proviso that it is not Q if Xg is A, Xio is S, X 3 is Q, X 1 4 is Q, X5 is H, X 1 7 is Q, XIB is K, X 1 is M, X 3 o is T, X, 3 is S, X 4 is T, X 42 is R, X 4 ,7 is V, Xz is A and X$ is S; X6, is H or Y; and Xm is Y or S. 2. An isolated nucleic acid molecule according to claim 1 wherein the protein modulates signal transduction. 3. An isolated nucleic acid molecule according to claim 2 wherein the signal transduction is mediated by a cytokine or a hormone, a microbe or a microbial product, a parasite, an antigen or other effector molecule. 4. An isolated nucleic acid molecule according to claim 3 wherein the protein modulates 30 cytokine-mediated signal transduction. Ioe 0 COMS ID No: SBMI-00996928 Received by IP Australia: Time 11:32 Date 2004-11-15 15-11-04; 9:51 :DAVIES COLLISON CAVE Fat.&Trad :61 7 3368 2262 10/123 P :;YV\VPAVPA COWn'vmg OCS DIV SI12 REVrn 2 Cq(U.wPD 15/11/04 -211- An isolated nucleic acid molecule according to claim 4 wherein the signal transduction is mediated by one or more of the cytokines EPO, TPO, G-CSF, GM-CSF, IL-3, IL-2, IL-4, IL-7, L-13, IL-6, LIF, OSM, IL-12, IFNa, TNFa, IL-1 and/or M-CSF. 6. An isolated nucleic acid molecule according to claim 5 wherein the signal transduction is mediated by one or more of IL-6, LIF, OSM, IFNa and/or TPO. 7. 
An isolated nucleic acid molecule according to claim 6 wherein the signal transduction is mediated by IL-6.
8. An isolated nucleic acid molecule according to any one of claims 1 to 7 wherein the protein comprises an amino acid sequence substantially as set forth in SEQ ID NO. 4, SEQ ID NO. 6, SEQ ID NO. 8, SEQ ID NO. 10, SEQ ID NO. 12, SEQ ID NO. 18, SEQ ID NO. 36 or SEQ ID NO. 44.
9. An isolated nucleic acid molecule according to any one of claims 1 to 8 wherein the SH2 domain comprises the amino acid sequence: X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31 X32 X33 X34 [Xp]n X35 X36 X37 X38 X39 X40 X41 X42 X43 X44 X45 X46 X47 X48 X49 X50 X51 X52 X53 X54 X55 X56 X57 X58 [Xq]n X59 X60 X61 [Xr]n X62 X63 X64 X65 X66 wherein: X1 is G or P; X2 is F, W or C; X3 is Y; X4 is W; X5 is G or S; X6 is P, A, S or V; X7 is L, M or V; X8 is S, T, D or N; X9 is V, G, A, R, K or W, with the proviso that it is not A if X10 is S, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X10 is H, G, N, S, Y, W or E, with the proviso that it is not S if X9 is A, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X11 is G, E, A or D; X12 is A; X13 is H, N, K, R or E, with the proviso that it is not R if X9 is A, X10 is S, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X14 is E, L, Q, A, G or M, with the proviso that it is not Q if X9 is A, X10 is S, X13 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X15 is R, L, K or H, with the proviso that it is not H if X9 is A, X10 is S, X13 is Q, X14 is Q, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X16 is L; X17 is R, S, K, Q, E or A, with the proviso that it is not Q if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X18 is A, E, K, G, N or S, with the proviso that it is not K if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X19 is E, A, M, K or V, with the proviso that it is not M if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X20 is P; X21 is V, A, E or D; X22 is G; X23 is T or S; X24 is F; X25 is L; X26 is V, I or L; X27 is R; X28 is D; X29 is S; X30 is R, S, T or A, with the proviso that it is not T if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X31 is Q, D or H; X32 is R, Q, S, P, E or D; X33 is N, R, D or S, with the proviso that it is not S if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X34 is C, H or Y; [Xp]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xp may comprise the same or different amino acids selected from F, L or I; X35 is A, T or S; X36 is L, I or V; X37 is S; X38 is V or F; X39 is K or R; X40 is M, T, R or S; X41 is A, Q, S, T, Y or H, with the proviso that it is not T if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X42 is S, A, R, N or G, with the proviso that it is not R if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X47 is V, X52 is A, X54 is S and X64 is Q; X43 is G, R, K or I; X44 is P, T or S; X45 is T, K, L or H; X46 is S, N or H; X47 is I, L, V, A or T, with the proviso that it is not V if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X52 is A, X54 is S and X64 is Q; X48 is R; X49 is V, I or M; X50 is H, Q or E; X51 is F, C, Y, Q or H; X52 is Q, E, A, W, S or Y, with the proviso that it is not A if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X54 is S and X64 is Q; X53 is A, G, D, N or R; X54 is G, S or H, with the proviso that it is not S if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A and X64 is Q; X55 is R, S, K, N or T; X56 is F; X57 is H, S or R; X58 is L or F; [Xq]n is a sequence of n amino acids wherein n is from 7 to 14 amino acids and wherein the sequence Xq may comprise the same or different amino acids selected from any amino acid residue; X59 is D or E; X60 is C, S, V, I, L or F; X61 is L, V, T or I; [Xr]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xr may comprise the same or different amino acids selected from any amino acid residue; X62 is L or A; X63 is L, V or I; X64 is E, H, D, Q or M, with the proviso that it is not Q if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A and X54 is S; X65 is H or Y; and X66 is Y or S.
10. An isolated nucleic acid molecule according to claim 9 wherein the SH2 domain comprises a sequence selected from: (i) GFYWGPLSVHGAHERLRAEPVGTFLVRDSRQRNCFFALSVKMASGPTSIRVHFQAGRFHLDGSRETFDCLFELLEHY; (ii) GFYWGPLSVHGAHERLRAEPVGTFLVRDSRQRNCFFALSVKMASGPTSIRVHFQAGRFHLDGSRESFDCLFELLEHY; (iii) GFYWGPLSVHGAHERLRSEPVGTFLVRDSRQRNCFFALSVKMASGPTSIRVHFQAGRFHLDGNRETFDCLFELLEHY; (iv) GWYWGSMTVNEAKEKLKEAPEGTFLIRDSSHSDYLLTISVKTSAGPTNLRIEYQDGKFRLDSIICVKSKLKQFDSVVHLIDYY; (v) GFYWSAVTGGEANLLLSAEPAGTFLIRDSSDQRHFFTLSVKTQSGTKNLRIQCEGGSFSLQSDPRSTQPVPRFDCVLKLVHHY; (vi) PCYWGVMDKYAAEALLEGKPEGTFLLRDSAQEDYLFSVSFRRYSRSLHARIEQWNHNFSFDAHDPCVFHSPDITGLLEHY; or (vii) GWYWGPMNWEDAEMKLKGKPDGSFLVRDSSDPRYILSLSFRSQGITHHTRMEHYRGTFSLWCHPKFEDRCQSVVEFIKRAIMHS.
11.
An isolated polypeptide comprising an SH2 domain of a protein or a derivative, homologue, analogue or mimetic thereof wherein the protein comprises a SOCS box in its C-terminal region, wherein the SOCS box comprises the amino acid sequence: X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 [Xi]n X17 X18 X19 X20 X21 X22 X23 [Xj]n X24 X25 X26 X27 X28 wherein: X1 is L, I, V, M, A or P, with the proviso that it is not A if X25 is A; X2 is any amino acid residue; X3 is P, T or S; X4 is L, I, V, M, A or P; X5 is any amino acid; X6 is any amino acid; X7 is L, I, V, M, A, F, Y or W; X8 is C, T or S; X9 is R, K or H; X10 is any amino acid; X11 is any amino acid; X12 is L, I, V, M, A or P; X13 is any amino acid; X14 is any amino acid; X15 is any amino acid; X16 is L, I, V, M, A, P, G, C, T or S; [Xi]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids and wherein the sequence Xi may comprise the same or different amino acids selected from any amino acid residue; X17 is L, I, V, M, A or P; X18 is any amino acid; X19 is any amino acid; X20 is L, I, V, M, A or P; X21 is P; X22 is L, I, V, M, A, P or G; X23 is P or N; [Xj]n is a sequence of n amino acids wherein n is from 0 to 50 amino acids and wherein the sequence Xj may comprise the same or different amino acids selected from any amino acid residue; X24 is L, I, V, M, A or P; X25 is any amino acid, with the proviso that it is not A if X1 is A; X26 is any amino acid; X27 is Y or F; and X28 is L, I, V, M, A or P, and wherein the SH2 domain comprises the amino acid sequence: X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31 X32 X33 X34 [Xp]n X35 X36 X37 X38 X39 X40 X41 X42 X43 X44 X45 X46 X47 X48 X49 X50 X51 X52 X53 X54 X55 X56 X57 X58 [Xq]n X59 X60 X61 [Xr]n X62 X63 X64 X65 X66 wherein: X1 is G or P; X2 is F, W or C; X3 is Y; X4 is W; X5 is G or S; X6 is any amino acid; X7 is L, M or V; X8 is any amino acid; X9 is any amino acid, with the proviso that it is not A if X10 is S, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X10 is any amino acid, with the proviso that it is not S if X9 is A, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X11 is any amino acid; X12 is A; X13 is any amino acid, with the proviso that it is not R if X9 is A, X10 is S, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X14 is any amino acid, with the proviso that it is not Q if X9 is A, X10 is S, X13 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X15 is any amino acid, with the proviso that it is not H if X9 is A, X10 is S, X13 is Q, X14 is Q, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X16 is L; X17 is any amino acid, with the proviso that it is not Q if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X18 is any amino acid, with the proviso that it is not K if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X19 is any amino acid, with the proviso that it is not M if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X20 is P; X21 is any amino acid; X22 is G; X23 is T or S; X24 is F; X25 is L; X26 is V, I or L; X27 is R; X28 is D; X29 is S; X30 is any amino acid, with the proviso that it is not T if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X31 is any amino acid; X32 is any amino acid; X33 is any amino acid, with the proviso that it is not S if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X34 is any amino acid; [Xp]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xp may comprise the same or different amino acids selected from F, L or I; X35 is any amino acid; X36 is L, I or V; X37 is S; X38 is V or F; X39 is K or R; X40 is any amino acid; X41 is any amino acid, with the proviso that it is not T if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X42 is any amino acid, with the proviso that it is not R if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X47 is V, X52 is A, X54 is S and X64 is Q; X43 is any amino acid; X44 is any amino acid; X45 is any amino acid; X46 is any amino acid; X47 is any amino acid, with the proviso that it is not V if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X52 is A, X54 is S and X64 is Q; X48 is R; X49 is V, I or M; X50 is any amino acid; X51 is any amino acid; X52 is any amino acid, with the proviso that it is not A if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X54 is S and X64 is Q; X53 is any amino acid; X54 is G, S or H, with the proviso that it is not S if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A and X64 is Q; X55 is any amino acid; X56 is F; X57 is any amino acid; X58 is L or F; [Xq]n is a sequence of n amino acids wherein n is from 7 to 14 amino acids and wherein the sequence Xq may comprise the same or different amino acids selected from any amino acid residue; X59 is D or E; X60 is any amino acid; X61 is L, V, T or I; [Xr]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xr may comprise the same or different amino acids selected from any amino acid residue; X62 is L or A; X63 is L, V or I; X64 is any amino acid, with the proviso that it is not Q if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A and X54 is S; X65 is H or Y; and X66 is Y or S.
12. An isolated polypeptide according to claim 11 wherein the protein modulates signal transduction.
13. An isolated polypeptide according to claim 12 wherein the signal transduction is mediated by a cytokine or other endogenous molecule, a hormone, a microbe or a microbial product, a parasite, an antigen or other effector molecule.
14. An isolated polypeptide according to claim 13 wherein the protein modulates cytokine-mediated signal transduction.
15. An isolated polypeptide according to claim 14 wherein the signal transduction is mediated by one or more of the cytokines EPO, TPO, G-CSF, GM-CSF, IL-3, IL-2, IL-4, IL-7, IL-13, IL-6, LIF, OSM, IL-12, IFNa, TNFa, IL-1 and/or M-CSF.
16. An isolated polypeptide according to claim 15 wherein the signal transduction is mediated by one or more of IL-6, LIF, OSM, IFNa and/or TPO.
17. An isolated polypeptide according to claim 16 wherein the signal transduction is mediated by IL-6.
18. An isolated polypeptide according to claim 17 wherein the protein comprises an amino acid sequence substantially as set forth in SEQ ID NO. 4, SEQ ID NO. 6, SEQ ID NO. 8, SEQ ID NO. 10, SEQ ID NO. 12, SEQ ID NO. 18, SEQ ID NO. 36 or SEQ ID NO. 44.
19. An isolated polypeptide according to any one of claims 11 to 18 wherein the SH2 domain comprises the amino acid sequence: X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31 X32 X33 X34 [Xp]n X35 X36 X37 X38 X39 X40 X41 X42 X43 X44 X45 X46 X47 X48 X49 X50 X51 X52 X53 X54 X55 X56 X57 X58 [Xq]n X59 X60 X61 [Xr]n X62 X63 X64 X65 X66 wherein: X1 is G or P; X2 is F, W or C; X3 is Y; X4 is W; X5 is G or S; X6 is P, A, S or V; X7 is L, M or V; X8 is S, T, D or N; X9 is V, G, A, R, K or W, with the proviso that it is not A if X10 is S, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X10 is H, G, N, S, Y, W or E, with the proviso that it is not S if X9 is A, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X11 is G, E, A or D; X12 is A; X13 is H, N, K, R or E, with the proviso that it is not R if X9 is A, X10 is S, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X14 is E, L, Q, A, G or M, with the proviso that it is not Q if X9 is A, X10 is S, X13 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X15 is R, L, K or H, with the proviso that it is not H if X9 is A, X10 is S, X13 is Q, X14 is Q, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X16 is L; X17 is R, S, K, Q, E or A, with the proviso that it is not Q if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X18 is A, E, K, G, N or S, with the proviso that it is not K if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X19 is E, A, M, K or V, with the proviso that it is not M if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X20 is P; X21 is V, A, E or D; X22 is G; X23 is T or S; X24 is F; X25 is L; X26 is V, I or L; X27 is R; X28 is D; X29 is S; X30 is R, S, T or A, with the proviso that it is not T if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X31 is Q, D or H; X32 is R, Q, S, P, E or D; X33 is N, R, D or S, with the proviso that it is not S if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X34 is C, H or Y; [Xp]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xp may comprise the same or different amino acids selected from F, L or I; X35 is A, T or S; X36 is L, I or V; X37 is S; X38 is V or F; X39 is K or R; X40 is M, T, R or S; X41 is A, Q, S, T, Y or H, with the proviso that it is not T if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X42 is S, A, R, N or G, with the proviso that it is not R if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X47 is V, X52 is A, X54 is S and X64 is Q; X43 is G, R, K or I; X44 is P, T or S; X45 is T, K, L or H; X46 is S, N or H; X47 is I, L, V, A or T, with the proviso that it is not V if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X52 is A, X54 is S and X64 is Q; X48 is R; X49 is V, I or M; X50 is H, Q or E; X51 is F, C, Y, Q or H; X52 is Q, E, A, W, S or Y, with the proviso that it is not A if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X54 is S and X64 is Q; X53 is A, G, D, N or R; X54 is G, S or H, with the proviso that it is not S if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A and X64 is Q; X55 is R, S, K, N or T; X56 is F; X57 is H, S or R; X58 is L or F; [Xq]n is a sequence of n amino acids wherein n is from 7 to 14 amino acids and wherein the sequence Xq may comprise the same or different amino acids selected from any amino acid residue; X59 is D or E; X60 is C, S, V, I, L or F; X61 is L, V, T or I; [Xr]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xr may comprise the same or different amino acids selected from any amino acid residue; X62 is L or A; X63 is L, V or I; X64 is E, H, D, Q or M, with the proviso that it is not Q if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A and X54 is S; X65 is H or Y; and X66 is Y or S.
20. An isolated polypeptide according to claim 19 wherein the SH2 domain comprises a sequence selected from: (i) GFYWGPLSVHGAHERLRAEPVGTFLVRDSRQRNCFFALSVKMASGPTSIRVHFQAGRFHLDGSRETFDCLFELLEHY; (ii) GFYWGPLSVHGAHERLRAEPVGTFLVRDSRQRNCFFALSVKMASGPTSIRVHFQAGRFHLDGSRESFDCLFELLEHY; (iii) GFYWGPLSVHGAHERLRSEPVGTFLVRDSRQRNCFFALSVKMASGPTSIRVHFQAGRFHLDGNRETFDCLFELLEHY; (iv) GWYWGSMTVNEAKEKLKEAPEGTFLIRDSSHSDYLLTISVKTSAGPTNLRIEYQDGKFRLDSIICVKSKLKQFDSVVHLIDYY; (v) GFYWSAVTGGEANLLLSAEPAGTFLIRDSSDQRHFFTLSVKTQSGTKNLRIQCEGGSFSLQSDPRSTQPVPRFDCVLKLVHHY; (vi) PCYWGVMDKYAAEALLEGKPEGTFLLRDSAQEDYLFSVSFRRYSRSLHARIEQWNHNFSFDAHDPCVFHSPDITGLLEHY; or (vii) GWYWGPMNWEDAEMKLKGKPDGSFLVRDSSDPRYILSLSFRSQGITHHTRMEHYRGTFSLWCHPKFEDRCQSVVEFIKRAIMHS.
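The positional SH2 consensus recited in claim 19 can be read as a pattern over one-letter amino acid codes. The following is a minimal illustrative sketch, not part of the claims, that encodes that consensus as a regular expression and checks the seven sequences of claim 20 against it. The names `matches_sh2_consensus` and `CLAIM_20_SEQUENCES` are hypothetical, the sequences are transcribed on the assumption that the damaged characters in the scanned text have been repaired correctly, and the "with the proviso that" exclusions are deliberately not modelled.

```python
import re

# Allowed residues per fixed position, transcribed from the claim 19 consensus.
# Assumption: the provisos ("it is not A if ...") are ignored in this sketch.
HEAD = ["GP", "FWC", "Y", "W", "GS", "PASV", "LMV", "STDN", "VGARKW",
        "HGNSYWE", "GEAD", "A", "HNKRE", "ELQAGM", "RLKH", "L", "RSKQEA",
        "AEKGNS", "EAMKV", "P", "VAED", "G", "TS", "F", "L", "VIL", "R",
        "D", "S", "RSTA", "QDH", "RQSPED", "NRDS", "CHY"]        # X1..X34
MID = ["ATS", "LIV", "S", "VF", "KR", "MTRS", "AQSTYH", "SARNG", "GRKI",
       "PTS", "TKLH", "SNH", "ILVAT", "R", "VIM", "HQE", "FCYQH",
       "QEAWSY", "AGDNR", "GSH", "RSKNT", "F", "HSR", "LF"]      # X35..X58
TAIL_A = ["DE", "CSVILF", "LVTI"]                                 # X59..X61
TAIL_B = ["LA", "LVI", "EHDQM", "HY", "YS"]                       # X62..X66


def classes(positions):
    """Turn a list of allowed-residue strings into regex character classes."""
    return "".join(f"[{residues}]" for residues in positions)


ANY = "[A-Z]"  # "any amino acid residue", as a one-letter code
SH2_CONSENSUS = re.compile(
    classes(HEAD)
    + "[FLI]{1,2}"        # [Xp]n, n = 1 to 2, residues from F, L or I
    + classes(MID)
    + ANY + "{7,14}"      # [Xq]n, n = 7 to 14, any residue
    + classes(TAIL_A)
    + ANY + "{1,2}"       # [Xr]n, n = 1 to 2, any residue
    + classes(TAIL_B)
)


def matches_sh2_consensus(sequence):
    """Return True if the whole sequence fits the positional consensus."""
    return SH2_CONSENSUS.fullmatch(sequence) is not None


CLAIM_20_SEQUENCES = [
    "GFYWGPLSVHGAHERLRAEPVGTFLVRDSRQRNCFFALSVKMASGPTSIRVHFQAGRFHLDGSRETFDCLFELLEHY",
    "GFYWGPLSVHGAHERLRAEPVGTFLVRDSRQRNCFFALSVKMASGPTSIRVHFQAGRFHLDGSRESFDCLFELLEHY",
    "GFYWGPLSVHGAHERLRSEPVGTFLVRDSRQRNCFFALSVKMASGPTSIRVHFQAGRFHLDGNRETFDCLFELLEHY",
    "GWYWGSMTVNEAKEKLKEAPEGTFLIRDSSHSDYLLTISVKTSAGPTNLRIEYQDGKFRLDSIICVKSKLKQFDSVVHLIDYY",
    "GFYWSAVTGGEANLLLSAEPAGTFLIRDSSDQRHFFTLSVKTQSGTKNLRIQCEGGSFSLQSDPRSTQPVPRFDCVLKLVHHY",
    "PCYWGVMDKYAAEALLEGKPEGTFLLRDSAQEDYLFSVSFRRYSRSLHARIEQWNHNFSFDAHDPCVFHSPDITGLLEHY",
    "GWYWGPMNWEDAEMKLKGKPDGSFLVRDSSDPRYILSLSFRSQGITHHTRMEHYRGTFSLWCHPKFEDRCQSVVEFIKRAIMHS",
]

if __name__ == "__main__":
    for label, seq in zip(["i", "ii", "iii", "iv", "v", "vi", "vii"],
                          CLAIM_20_SEQUENCES):
        print(f"({label})", matches_sh2_consensus(seq))
```

Because the three variable-length stretches are written as bounded repeats, the regular expression engine finds the split between [Xq]n and [Xr]n by backtracking; a sequence that violates any fixed position (for example, a residue other than Y at X3) is rejected.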
21. Use of a polypeptide comprising an SH2 domain of a protein or a derivative, homologue, analogue or mimetic thereof wherein the protein comprises a SOCS box in its C-terminal region for screening of modulators of SOCS protein activity, wherein the SOCS box comprises the amino acid sequence: X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 [Xi]n X17 X18 X19 X20 X21 X22 X23 [Xj]n X24 X25 X26 X27 X28 wherein: X1 is L, I, V, M, A or P, with the proviso that it is not A if X25 is A; X2 is any amino acid residue; X3 is P, T or S; X4 is L, I, V, M, A or P; X5 is any amino acid; X6 is any amino acid; X7 is L, I, V, M, A, F, Y or W; X8 is C, T or S; X9 is R, K or H; X10 is any amino acid; X11 is any amino acid; X12 is L, I, V, M, A or P; X13 is any amino acid; X14 is any amino acid; X15 is any amino acid; X16 is L, I, V, M, A, P, G, C, T or S; [Xi]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids and wherein the sequence Xi may comprise the same or different amino acids selected from any amino acid residue; X17 is L, I, V, M, A or P; X18 is any amino acid; X19 is any amino acid; X20 is L, I, V, M, A or P; X21 is P; X22 is L, I, V, M, A, P or G; X23 is P or N; [Xj]n is a sequence of n amino acids wherein n is from 0 to 50 amino acids and wherein the sequence Xj may comprise the same or different amino acids selected from any amino acid residue; X24 is L, I, V, M, A or P; X25 is any amino acid, with the proviso that it is not A if X1 is A; X26 is any amino acid; X27 is Y or F; and X28 is L, I, V, M, A or P, and wherein the SH2 domain comprises the amino acid sequence: X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31 X32 X33 X34 [Xp]n X35 X36 X37 X38 X39 X40 X41 X42 X43 X44 X45 X46 X47 X48 X49 X50 X51 X52 X53 X54 X55 X56 X57 X58 [Xq]n X59 X60 X61 [Xr]n X62 X63 X64 X65 X66 wherein: X1 is G or P; X2 is F, W or C; X3 is Y; X4 is W; X5 is G or S; X6 is any amino acid; X7 is L, M or V; X8 is any amino acid; X9 is any amino acid, with the proviso that it is not A if X10 is S, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X10 is any amino acid, with the proviso that it is not S if X9 is A, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X11 is any amino acid; X12 is A; X13 is any amino acid, with the proviso that it is not R if X9 is A, X10 is S, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X14 is any amino acid, with the proviso that it is not Q if X9 is A, X10 is S, X13 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X15 is any amino acid, with the proviso that it is not H if X9 is A, X10 is S, X13 is Q, X14 is Q, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X16 is L; X17 is any amino acid, with the proviso that it is not Q if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X18 is any amino acid, with the proviso that it is not K if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X19 is any amino acid, with the proviso that it is not M if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X20 is P; X21 is any amino acid; X22 is G; X23 is T or S; X24 is F; X25 is L; X26 is V, I or L; X27 is R; X28 is D; X29 is S; X30 is any amino acid, with the proviso that it is not T if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X31 is any amino acid; X32 is any amino acid; X33 is any amino acid, with the proviso that it is not S if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X34 is any amino acid; [Xp]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xp may comprise the same or different amino acids selected from F, L or I; X35 is any amino acid; X36 is L, I or V; X37 is S; X38 is V or F; X39 is K or R; X40 is any amino acid; X41 is any amino acid, with the proviso that it is not T if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X42 is any amino acid, with the proviso that it is not R if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X47 is V, X52 is A, X54 is S and X64 is Q; X43 is any amino acid; X44 is any amino acid; X45 is any amino acid; X46 is any amino acid; X47 is any amino acid, with the proviso that it is not V if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X52 is A, X54 is S and X64 is Q; X48 is R; X49 is V, I or M; X50 is any amino acid; X51 is any amino acid; X52 is any amino acid, with the proviso that it is not A if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X54 is S and X64 is Q; X53 is any amino acid; X54 is G, S or H, with the proviso that it is not S if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A and X64 is Q; X55 is any amino acid; X56 is F; X57 is any amino acid; X58 is L or F; [Xq]n is a sequence of n amino acids wherein n is from 7 to 14 amino acids and wherein the sequence Xq may comprise the same or different amino acids selected from any amino acid residue; X59 is D or E; X60 is any amino acid; X61 is L, V, T or I; [Xr]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xr may comprise the same or different amino acids selected from any amino acid residue; X62 is L or A; X63 is L, V or I; X64 is any amino acid, with the proviso that it is not Q if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A and X54 is S; X65 is H or Y; and X66 is Y or S.
22. The use according to claim 21 wherein the protein modulates signal transduction.
23. The use according to claim 22 wherein the signal transduction is mediated by a cytokine or other endogenous molecule, a hormone, a microbe or a microbial product, a parasite, an antigen or other effector molecule.
24. The use according to claim 23 wherein the protein modulates cytokine-mediated signal transduction.
25. The use according to claim 24 wherein the signal transduction is mediated by one or more of the cytokines EPO, TPO, G-CSF, GM-CSF, IL-3, IL-2, IL-4, IL-7, IL-13, IL-6, LIF, OSM, IL-12, IFNa, TNFa, IL-1 and/or M-CSF.
26. The use according to claim 25 wherein the signal transduction is mediated by one or more of IL-6, LIF, OSM, IFNa and/or TPO.
27. The use according to claim 26 wherein the signal transduction is mediated by IL-6.
28. The use according to claim 27 wherein the protein comprises an amino acid sequence substantially as set forth in SEQ ID NO. 4, SEQ ID NO. 6, SEQ ID NO. 8, SEQ ID NO. 10, SEQ ID NO. 12, SEQ ID NO. 18, SEQ ID NO. 36 or SEQ ID NO. 44.
29. The use according to any one of claims 21 to 28 wherein the SH2 domain comprises the amino acid sequence: X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31 X32 X33 X34 [Xp]n X35 X36 X37 X38 X39 X40 X41 X42 X43 X44 X45 X46 X47 X48 X49 X50 X51 X52 X53 X54 X55 X56 X57 X58 [Xq]n X59 X60 X61 [Xr]n X62 X63 X64 X65 X66 wherein: X1 is G or P; X2 is F, W or C; X3 is Y; X4 is W; X5 is G or S; X6 is P, A, S or V; X7 is L, M or V; X8 is S, T, D or N; X9 is V, G, A, R, K or W, with the proviso that it is not A if X10 is S, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X10 is H, G, N, S, Y, W or E, with the proviso that it is not S if X9 is A, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X11 is G, E, A or D; X12 is A; X13 is H, N, K, R or E, with the proviso that it is not R if X9 is A, X10 is S, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X14 is E, L, Q, A, G or M, with the proviso that it is not Q if X9 is A, X10 is S, X13 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X15 is R, L, K or H, with the proviso that it is not H if X9 is A, X10 is S, X13 is Q, X14 is Q, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X16 is L; X17 is R, S, K, Q, E or A, with the proviso that it is not Q if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X18 is A, E, K, G, N or S, with the proviso that it is not K if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X19 is E, A, M, K or V, with the proviso that it is not M if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X20 is P; X21 is V, A, E or D; X22 is G; X23 is T or S; X24 is F; X25 is L; X26 is V, I or L; X27 is R; X28 is D; X29 is S; X30 is R, S, T or A, with the proviso that it is not T if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X31 is Q, D or H; X32 is R, Q, S, P, E or D; X33 is N, R, D or S, with the proviso that it is not S if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q; X34 is C, H or Y; [Xp]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xp may comprise the same or different amino acids selected from F, L or I; X35 is A, T or S; X36 is L, I or V; X37 is S; X38 is V or F; X39 is K or R; X40 is M, T, R or S; X41 is A, Q, S, T, Y or H, with the proviso that it is not T if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X42 is R, X47 is
V, X 52 is A, X, is S and X, is Q; X42 is S, A, R, N or G, with the proviso that it is not R ifX, is A, X 1 0 is S, X,3 is Q, Xa is Q, is H, X, 7 is Q, is K, is M, X 3 0 is T, X 3 3 is S, X4, is T, X 4 7 is V, X, is A, X, 4 is S and X 6 4 is Q; XY is G, R, K or I; X44 is P, T or S; X 4 is T, K, L or H; X, is S, N or H; X4,7 is I, L, V, A or T, with the proviso that it is not V if X is A, X, 1 0 is S, X 1 3 is Q, X 3 4 is Q X 1 5 is H, X 1 7 is Q, XIS is K, X 1 9 is M, is T, X33 is S, X4t, is T, X47 is R, X, is A, X54 is S and X, is Q; X 4 is R; CS ;X 4 9 is V, I or M; isH, Q orE; Xs, is F, C, Y, Q or H; X,2 is Q, E, A, W, S or Y, with the proviso that it is not A if X, is A, @0CC 25 X isS, X, is Q, X 4 is Q, X is H, X 1 7 is Q, is K, is SCCC M, Xo is T, X33 is S, X 4 is T, X, is R, X 47 is V, X, is S and X 64 is Q; is A, G, D, N or R; is G, S or H, with the proviso that it is not S if X, is A, Xo is S, X is Q, X 14 is Q, X 15 is H, is Q, Xs is K, X, is M, Xo is T, X33 is S, X41 is T, X 4 is R, X4, is V, XU is A and X, is Q; COMS ID No: SBMI-00996928 Received by IP Australia: Time 11:32 Date 2004-11-15 15-11-04: 9:51 DAVIES COLLISON CAVE Fat.&Trad :61 7 3368 2262 36/123 P:OPR\VPAIVPA CMuUMMSOCS DIV SHs2 ivrD 2 cUA.wLo- 15/l 1/04 -237- is R, S, K, N or T; X, 56 is F; X57is H, S orR; X, is L or F; [Xq] is a sequence of n amino acids wherein n is from 7 to 14 amino acids and wherein the sequence Xq may comprise the same or different amino acids selected from any amino acid residue; is D or E; Xis C, S, V, I, L or F; X 61 ,isL,V, T orI; is a sequence of n amino acids wherein n is from I to 2 amino acids and wherein the sequence Xr may comprise the same or different amino acids selected from any amino acid residue; X, is L br A; X 63 is L, V or I; X4 is E, H, D, Q or M, with the proviso that it is not Q if X 9 is A, X 1 0 is S, X 13 is Q, X, 14 is Q, X 5 is H, X, 7 is Q, is K, X 1 9 is M, X 3 o is T, X 33 is S, X 4 is T, X42 is R, X 4 7 is V, X 52 is A and X 5 4 is S; X, 
is H or Y; and X, is Y or S.
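The positional definition in claim 29 can be read as a sequence pattern. The following is a minimal sketch, not part of the claims, that writes the allowed residues for X1 to X66 as a regular expression and checks candidate sequences against it. It assumes that "any amino acid residue" means the 20 standard one-letter codes, and it deliberately does not encode the provisos, which exclude one particular combination of residues. The names `ANY`, `SH2_CLAIM_29` and `fits_sh2_pattern` are illustrative only. The two test sequences are the first and fourth sequences listed in claim 30.

```python
import re

# Assumption: "any amino acid residue" is taken to mean the 20 standard residues.
ANY = "[ACDEFGHIKLMNPQRSTVWY]"

# Allowed residues per position, transcribed from claim 29 (provisos NOT encoded).
SH2_CLAIM_29 = "".join([
    "[GP]", "[FWC]", "Y", "W", "[GS]", "[PASV]", "[LMV]", "[STDN]",   # X1-X8
    "[VGARKW]", "[HGNSYWE]", "[GEAD]", "A", "[HNKRE]", "[ELQAGM]",    # X9-X14
    "[RLKH]", "L", "[RSKQEA]", "[AEKGNS]", "[EAMKV]", "P", "[VAED]",  # X15-X21
    "G", "[TS]", "F", "L", "[VIL]", "R", "D", "S",                    # X22-X29
    "[RSTA]", "[QDH]", "[RQSPED]", "[NRDS]", "[CHY]",                 # X30-X34
    "[FLI]{1,2}",                                                     # [Xp]n, n = 1 to 2
    "[ATS]", "[LIV]", "S", "[VF]", "[KR]", "[MTRS]",                  # X35-X40
    "[AQSTYH]", "[SARNG]", "[GRKI]", "[PTS]", "[TKLH]", "[SNH]",      # X41-X46
    "[ILVAT]", "R", "[VIM]", "[HQE]", "[FCYQH]", "[QEAWSY]",          # X47-X52
    "[AGDNR]", "[GSH]", "[RSKNT]", "F", "[HSR]", "[LF]",              # X53-X58
    ANY + "{7,14}",                                                   # [Xq]n, n = 7 to 14
    "[DE]", "[CSVILF]", "[LVTI]",                                     # X59-X61
    ANY + "{1,2}",                                                    # [Xr]n, n = 1 to 2
    "[LA]", "[LVI]", "[EHDQM]", "[HY]", "[YS]",                       # X62-X66
])
SH2_RE = re.compile(SH2_CLAIM_29)


def fits_sh2_pattern(seq: str) -> bool:
    """Return True if the whole sequence fits the positional pattern."""
    cleaned = seq.replace(" ", "").upper()
    return SH2_RE.fullmatch(cleaned) is not None


# Sequences (i) and (iv) of claim 30, with line-wrap spaces removed.
SEQ_I = ("GFYWGPLSVHGAHERLRAEPVGTFLVRDSRQRNCFFALSVKMASG"
         "PTSIRVHFQAGRFHLDGSRRTFDCLFELLEHY")
SEQ_IV = ("GWYWGSMTVNEAKEKLKEAPEGTFLIRDSSHSDYLLTISVKTSAGP"
          "TNLRIEYQDGKFRLDSIICVKSKLKQFDSVVHLIDYY")

print(fits_sh2_pattern(SEQ_I), fits_sh2_pattern(SEQ_IV))
```

A full check of the claim would additionally have to reject the single residue combination named in the provisos; that step is omitted here because it depends on how the proviso list is read.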
30. The use according to claim 29 wherein the SH2 domain comprises a sequence selected from:
(i) GFYWGPLSVHGAHERLRAEPVGTFLVRDSRQRNCFFALSVKMASGPTSIRVHFQAGRFHLDGSRRTFDCLFELLEHY;
(ii) GFYWGPLSVHGAHERLRAEPVGTFLVRDSRQRNCFFALSVKMASGPTSIRVHFQAGRFHLDGSRESFDCLFELLEHY;
(iii) GFYWGPLSVHGAHERLRSEPVGTFLVRDSRQRNCFFALSVKMASGPTSIRVHFQAGRFHLDGNRETFDCLFELLEHY;
(iv) GWYWGSMTVNEAKEKLKEAPEGTFLIRDSSHSDYLLTISVKTSAGPTNLRIEYQDGKFRLDSIICVKSKLKQFDSVVHLIDYY;
(v) GFYWSAVTGGEANLLLSAEPAGTFLIRDSSDQRHFFTLSVKTQSGTKNLRIQCEGGSFSLQSDPRSTQPVPRFDCVLKLVHHY;
(vi) PCYWGVMDKYAAEALLEGKPEGTFLLRDSAQEDYLFSVSFRRYSRSLHARIEQWNHNFSFDAHDPCVFHSPDITGLL Y; or
(vii) GWYWGPMNWEDAEMKLKGKPDGSFLVRDSSDPRYILSLSFRSQGITHHTRMEEYRGTFSLWCHPKFEDRCQSVVEFIKRAVMHS.
31. A method of screening for a modulator of SOCS protein activity, the method comprising contacting a preparation containing a polypeptide comprising a SOCS protein SH2 domain or a derivative, homologue, analogue or mimetic thereof and an intracellular ligand or analogue or derivative thereof which interacts with the SH2 domain or derivative, homologue, analogue or mimetic thereof with a test agent and detecting a different level of interaction between the ligand or analogue or derivative thereof and the SH2 domain or derivative, homologue, analogue or mimetic thereof relative to a reference level of the interaction in the absence of the test agent, wherein the different level is indicative of the agent being a modulator of SOCS protein activity, wherein the SOCS protein comprises a SOCS box that comprises the amino acid sequence:
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 [Xi]n X17 X18 X19 X20 X21 X22 X23 [Xj]n X24 X25 X26 X27 X28
wherein:
X1 is L, I, V, M, A or P, with the proviso that it is not A if X25 is A;
X2 is any amino acid residue;
X3 is P, T or S;
X4 is L, I, V, M, A or P;
X5 is any amino acid;
X6 is any amino acid;
X7 is L, I, V, M, A, F, Y or W;
X8 is C, T or S;
X9 is R, K or H;
X10 is any amino acid;
X11 is any amino acid;
X12 is L, I, V, M, A or P;
X13 is any amino acid;
X14 is any amino acid;
X15 is any amino acid;
X16 is L, I, V, M, A, P, G, C, T or S;
[Xi]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids and wherein the sequence Xi may comprise the same or different amino acids selected from any amino acid residue;
X17 is L, I, V, M, A or P;
X18 is any amino acid;
X19 is any amino acid;
X20 is L, I, V, M, A or P;
X21 is P;
X22 is L, I, V, M, A, P or G;
X23 is P or N;
[Xj]n is a sequence of n amino acids wherein n is from 0 to 50 amino acids and wherein the sequence Xj may comprise the same or different amino acids selected from any amino acid residue;
X24 is L, I, V, M, A or P;
X25 is any amino acid, with the proviso that it is not A if X1 is A;
X26 is any amino acid;
X27 is Y or F; and
X28 is L, I, V, M, A or P,
and wherein the SH2 domain comprises the amino acid sequence:
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31 X32 X33 X34 [Xp]n X35 X36 X37 X38 X39 X40 X41 X42 X43 X44 X45 X46 X47 X48 X49 X50 X51 X52 X53 X54 X55 X56 X57 X58 [Xq]n X59 X60 X61 [Xr]n X62 X63 X64 X65 X66
wherein:
X1 is G or P;
X2 is F, W or C;
X3 is Y;
X4 is W;
X5 is G or S;
X6 is any amino acid;
X7 is L, M or V;
X8 is any amino acid;
X9 is any amino acid, with the proviso that it is not A if X10 is S, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X10 is any amino acid, with the proviso that it is not S if X9 is A, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X11 is any amino acid;
X12 is A;
X13 is any amino acid, with the proviso that it is not R if X9 is A, X10 is S, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X14 is any amino acid, with the proviso that it is not Q if X9 is A, X10 is S, X13 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X15 is any amino acid, with the proviso that it is not H if X9 is A, X10 is S, X13 is Q, X14 is Q, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X16 is L;
X17 is any amino acid, with the proviso that it is not Q if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X18 is any amino acid, with the proviso that it is not K if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X19 is any amino acid, with the proviso that it is not M if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X20 is P;
X21 is any amino acid;
X22 is G;
X23 is T or S;
X24 is F;
X25 is L;
X26 is V, I or L;
X27 is R;
X28 is D;
X29 is S;
X30 is any amino acid, with the proviso that it is not T if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X31 is any amino acid;
X32 is any amino acid;
X33 is any amino acid, with the proviso that it is not S if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X34 is any amino acid;
[Xp]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xp may comprise the same or different amino acids selected from F, L or I;
X35 is any amino acid;
X36 is L, I or V;
X37 is S;
X38 is V or F;
X39 is K or R;
X40 is any amino acid;
X41 is any amino acid, with the proviso that it is not T if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X42 is any amino acid, with the proviso that it is not R if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X47 is V, X52 is A, X54 is S and X64 is Q;
X43 is any amino acid;
X44 is any amino acid;
X45 is any amino acid;
X46 is any amino acid;
X47 is any amino acid, with the proviso that it is not V if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X52 is A, X54 is S and X64 is Q;
X48 is R;
X49 is V, I or M;
X50 is any amino acid;
X51 is any amino acid;
X52 is any amino acid, with the proviso that it is not A if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X54 is S and X64 is Q;
X53 is any amino acid;
X54 is G, S or H, with the proviso that it is not S if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A and X64 is Q;
X55 is any amino acid;
X56 is F;
X57 is any amino acid;
X58 is L or F;
[Xq]n is a sequence of n amino acids wherein n is from 7 to 14 amino acids and wherein the sequence Xq may comprise the same or different amino acids selected from any amino acid residue;
X59 is D or E;
X60 is any amino acid;
X61 is L, V, T or I;
[Xr]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xr may comprise the same or different amino acids selected from any amino acid residue;
X62 is L or A;
X63 is L, V or I;
X64 is any amino acid, with the proviso that it is not Q if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A and X54 is S;
X65 is H or Y; and
X66 is Y or S.
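The SOCS box definition recited in claim 31 can likewise be sketched as a pattern with two variable-length spacers, [Xi]n and [Xj]n. The code below is an illustrative reading only: it assumes "any amino acid" means the 20 standard residues, the names `SOCS_BOX_RE` and `fits_socs_box` are hypothetical, and the example string `SYNTHETIC` is an artificial sequence built to satisfy the pattern, not a sequence from this document. Because X1 is always the first residue and X25 is always the fourth residue from the end, the proviso that X1 and X25 are not both A can be checked directly on the string.

```python
import re

ANY = "[ACDEFGHIKLMNPQRSTVWY]"  # assumption: the 20 standard residues
HYD = "[LIVMAP]"                # the recurring "L, I, V, M, A or P" set

SOCS_BOX_RE = re.compile("".join([
    HYD, ANY, "[PTS]", HYD, ANY, ANY, "[LIVMAFYW]", "[CTS]",  # X1-X8
    "[RKH]", ANY, ANY, HYD, ANY, ANY, ANY, "[LIVMAPGCTS]",    # X9-X16
    ANY + "{1,50}",                                           # [Xi]n, n = 1 to 50
    HYD, ANY, ANY, HYD, "P", "[LIVMAPG]", "[PN]",             # X17-X23
    ANY + "{0,50}",                                           # [Xj]n, n = 0 to 50
    HYD, ANY, ANY, "[YF]", HYD,                               # X24-X28
]))


def fits_socs_box(seq: str) -> bool:
    """Check the positional pattern and the X1/X25 proviso."""
    s = seq.replace(" ", "").upper()
    if SOCS_BOX_RE.fullmatch(s) is None:
        return False
    # Proviso: X1 (first residue) and X25 (fourth from the end) are not both A.
    return not (s[0] == "A" and s[-4] == "A")


# Artificial example built position by position (hypothetical, for illustration).
SYNTHETIC = "LAPLAALCRAALAAAL" + "G" + "LAALPLP" + "LAAYL"

print(fits_socs_box(SYNTHETIC))
```

Encoding the spacers as bounded quantifiers keeps the check to a single `fullmatch` call; the regular expression engine searches over the possible spacer lengths.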
32. A method of screening for a modulator of SOCS protein activity, the method comprising contacting a preparation containing a polypeptide comprising a SOCS protein SH2 domain or a derivative, homologue, analogue or mimetic thereof and an intracellular ligand or analogue or derivative thereof which interacts with the SH2 domain or derivative, homologue, analogue or mimetic thereof with a test agent and detecting a reduced level of interaction between the ligand or analogue or derivative thereof and the SH2 domain or derivative, homologue, analogue or mimetic thereof relative to a reference level of the interaction in the absence of the test agent, wherein the reduced level is indicative of the agent being a modulator of SOCS protein activity, wherein the SOCS protein comprises a SOCS box that comprises the amino acid sequence:
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 [Xi]n X17 X18 X19 X20 X21 X22 X23 [Xj]n X24 X25 X26 X27 X28
wherein:
X1 is L, I, V, M, A or P, with the proviso that it is not A if X25 is A;
X2 is any amino acid residue;
X3 is P, T or S;
X4 is L, I, V, M, A or P;
X5 is any amino acid;
X6 is any amino acid;
X7 is L, I, V, M, A, F, Y or W;
X8 is C, T or S;
X9 is R, K or H;
X10 is any amino acid;
X11 is any amino acid;
X12 is L, I, V, M, A or P;
X13 is any amino acid;
X14 is any amino acid;
X15 is any amino acid;
X16 is L, I, V, M, A, P, G, C, T or S;
[Xi]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids and wherein the sequence Xi may comprise the same or different amino acids selected from any amino acid residue;
X17 is L, I, V, M, A or P;
X18 is any amino acid;
X19 is any amino acid;
X20 is L, I, V, M, A or P;
X21 is P;
X22 is L, I, V, M, A, P or G;
X23 is P or N;
[Xj]n is a sequence of n amino acids wherein n is from 0 to 50 amino acids and wherein the sequence Xj may comprise the same or different amino acids selected from any amino acid residue;
X24 is L, I, V, M, A or P;
X25 is any amino acid, with the proviso that it is not A if X1 is A;
X26 is any amino acid;
X27 is Y or F; and
X28 is L, I, V, M, A or P,
and wherein the SH2 domain comprises the amino acid sequence:
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31 X32 X33 X34 [Xp]n X35 X36 X37 X38 X39 X40 X41 X42 X43 X44 X45 X46 X47 X48 X49 X50 X51 X52 X53 X54 X55 X56 X57 X58 [Xq]n X59 X60 X61 [Xr]n X62 X63 X64 X65 X66
wherein:
X1 is G or P;
X2 is F, W or C;
X3 is Y;
X4 is W;
X5 is G or S;
X6 is any amino acid;
X7 is L, M or V;
X8 is any amino acid;
X9 is any amino acid, with the proviso that it is not A if X10 is S, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X10 is any amino acid, with the proviso that it is not S if X9 is A, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X11 is any amino acid;
X12 is A;
X13 is any amino acid, with the proviso that it is not R if X9 is A, X10 is S, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X14 is any amino acid, with the proviso that it is not Q if X9 is A, X10 is S, X13 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X15 is any amino acid, with the proviso that it is not H if X9 is A, X10 is S, X13 is Q, X14 is Q, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X16 is L;
X17 is any amino acid, with the proviso that it is not Q if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X18 is any amino acid, with the proviso that it is not K if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X19 is any amino acid, with the proviso that it is not M if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X20 is P;
X21 is any amino acid;
X22 is G;
X23 is T or S;
X24 is F;
X25 is L;
X26 is V, I or L;
X27 is R;
X28 is D;
X29 is S;
X30 is any amino acid, with the proviso that it is not T if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X31 is any amino acid;
X32 is any amino acid;
X33 is any amino acid, with the proviso that it is not S if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X34 is any amino acid;
[Xp]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xp may comprise the same or different amino acids selected from F, L or I;
X35 is any amino acid;
X36 is L, I or V;
X37 is S;
X38 is V or F;
X39 is K or R;
X40 is any amino acid;
X41 is any amino acid, with the proviso that it is not T if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X42 is any amino acid, with the proviso that it is not R if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X47 is V, X52 is A, X54 is S and X64 is Q;
X43 is any amino acid;
X44 is any amino acid;
X45 is any amino acid;
X46 is any amino acid;
X47 is any amino acid, with the proviso that it is not V if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X52 is A, X54 is S and X64 is Q;
X48 is R;
X49 is V, I or M;
X50 is any amino acid;
X51 is any amino acid;
X52 is any amino acid, with the proviso that it is not A if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X54 is S and X64 is Q;
X53 is any amino acid;
X54 is G, S or H, with the proviso that it is not S if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A and X64 is Q;
X55 is any amino acid;
X56 is F;
X57 is any amino acid;
X58 is L or F;
[Xq]n is a sequence of n amino acids wherein n is from 7 to 14 amino acids and wherein the sequence Xq may comprise the same or different amino acids selected from any amino acid residue;
X59 is D or E;
X60 is any amino acid;
X61 is L, V, T or I;
[Xr]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xr may comprise the same or different amino acids selected from any amino acid residue;
X62 is L or A;
X63 is L, V or I;
X64 is any amino acid, with the proviso that it is not Q if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A and X54 is S;
X65 is H or Y; and
X66 is Y or S.
33. A method of screening for a modulator of SOCS protein activity, the method comprising contacting a preparation containing a polypeptide comprising a SOCS protein SH2 domain or a derivative, homologue, analogue or mimetic thereof and an intracellular ligand or analogue or derivative thereof which interacts with the SH2 domain or derivative, homologue, analogue or mimetic thereof with a test agent and detecting an enhanced level of interaction between the ligand or analogue or derivative thereof and the SH2 domain or derivative, homologue, analogue or mimetic thereof relative to a reference level of the interaction in the absence of the test agent, wherein the enhanced level is indicative of the agent being a modulator of SOCS protein activity, wherein the SOCS protein comprises a SOCS box that comprises the amino acid sequence:
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 [Xi]n X17 X18 X19 X20 X21 X22 X23 [Xj]n X24 X25 X26 X27 X28
wherein:
X1 is L, I, V, M, A or P, with the proviso that it is not A if X25 is A;
X2 is any amino acid residue;
X3 is P, T or S;
X4 is L, I, V, M, A or P;
X5 is any amino acid;
X6 is any amino acid;
X7 is L, I, V, M, A, F, Y or W;
X8 is C, T or S;
X9 is R, K or H;
X10 is any amino acid;
X11 is any amino acid;
X12 is L, I, V, M, A or P;
X13 is any amino acid;
X14 is any amino acid;
X15 is any amino acid;
X16 is L, I, V, M, A, P, G, C, T or S;
[Xi]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids and wherein the sequence Xi may comprise the same or different amino acids selected from any amino acid residue;
X17 is L, I, V, M, A or P;
X18 is any amino acid;
X19 is any amino acid;
X20 is L, I, V, M, A or P;
X21 is P;
X22 is L, I, V, M, A, P or G;
X23 is P or N;
[Xj]n is a sequence of n amino acids wherein n is from 0 to 50 amino acids and wherein the sequence Xj may comprise the same or different amino acids selected from any amino acid residue;
X24 is L, I, V, M, A or P;
X25 is any amino acid, with the proviso that it is not A if X1 is A;
X26 is any amino acid;
X27 is Y or F; and
X28 is L, I, V, M, A or P,
and wherein the SH2 domain comprises the amino acid sequence:
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31 X32 X33 X34 [Xp]n X35 X36 X37 X38 X39 X40 X41 X42 X43 X44 X45 X46 X47 X48 X49 X50 X51 X52 X53 X54 X55 X56 X57 X58 [Xq]n X59 X60 X61 [Xr]n X62 X63 X64 X65 X66
wherein:
X1 is G or P;
X2 is F, W or C;
X3 is Y;
X4 is W;
X5 is G or S;
X6 is any amino acid;
X7 is L, M or V;
X8 is any amino acid;
X9 is any amino acid, with the proviso that it is not A if X10 is S, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X10 is any amino acid, with the proviso that it is not S if X9 is A, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X11 is any amino acid;
X12 is A;
X13 is any amino acid, with the proviso that it is not R if X9 is A, X10 is S, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X14 is any amino acid, with the proviso that it is not Q if X9 is A, X10 is S, X13 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X15 is any amino acid, with the proviso that it is not H if X9 is A, X10 is S, X13 is Q, X14 is Q, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X16 is L;
X17 is any amino acid, with the proviso that it is not Q if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X18 is any amino acid, with the proviso that it is not K if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X19 is any amino acid, with the proviso that it is not M if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X20 is P;
X21 is any amino acid;
X22 is G;
X23 is T or S;
X24 is F;
X25 is L;
X26 is V, I or L;
X27 is R;
X28 is D;
X29 is S;
X30 is any amino acid, with the proviso that it is not T if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X31 is any amino acid;
X32 is any amino acid;
X33 is any amino acid, with the proviso that it is not S if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X34 is any amino acid;
[Xp]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xp may comprise the same or different amino acids selected from F, L or I;
X35 is any amino acid;
X36 is L, I or V;
X37 is S;
X38 is V or F;
X39 is K or R;
X40 is any amino acid;
X41 is any amino acid, with the proviso that it is not T if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X42 is any amino acid, with the proviso that it is not R if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X47 is V, X52 is A, X54 is S and X64 is Q;
X43 is any amino acid;
X44 is any amino acid;
X45 is any amino acid;
X46 is any amino acid;
X47 is any amino acid, with the proviso that it is not V if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X52 is A, X54 is S and X64 is Q;
X48 is R;
X49 is V, I or M;
X50 is any amino acid;
X51 is any amino acid;
X52 is any amino acid, with the proviso that it is not A if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X54 is S and X64 is Q;
X53 is any amino acid;
X54 is G, S or H, with the proviso that it is not S if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A and X64 is Q;
X55 is any amino acid;
X56 is F;
X57 is any amino acid;
X58 is L or F;
[Xq]n is a sequence of n amino acids wherein n is from 7 to 14 amino acids and wherein the sequence Xq may comprise the same or different amino acids selected from any amino acid residue;
X59 is D or E;
X60 is any amino acid;
X61 is L, V, T or I;
[Xr]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xr may comprise the same or different amino acids selected from any amino acid residue;
X62 is L or A;
X63 is L, V or I;
X64 is any amino acid, with the proviso that it is not Q if X9 is A, X10 is S, X13 is Q, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A and X54 is S;
X65 is H or Y; and
X66 is Y or S.
34. A method according to any one of claims 31 to 33 wherein the protein modulates signal transduction.
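The comparison step shared by claims 31 to 33 (a different, reduced or enhanced level of interaction relative to a reference level measured without the test agent) can be illustrated with a small helper. This is a sketch under stated assumptions: the claims do not specify how interaction is quantified or what size of change counts as different, so the function name `classify_interaction` and the `tolerance` parameter (a fractional band around the reference, default 10%) are hypothetical choices made only for illustration.

```python
def classify_interaction(test_level: float, reference_level: float,
                         tolerance: float = 0.1) -> str:
    """Compare an interaction level measured with a test agent against the
    reference level measured without it.

    Returns "reduced", "enhanced" or "unchanged". The tolerance band is an
    assumption of this sketch, not something stated in the claims.
    """
    if reference_level <= 0:
        raise ValueError("reference_level must be positive")
    ratio = test_level / reference_level
    if ratio < 1 - tolerance:
        return "reduced"    # the outcome recited in claim 32
    if ratio > 1 + tolerance:
        return "enhanced"   # the outcome recited in claim 33
    return "unchanged"


def is_candidate_modulator(test_level: float, reference_level: float,
                           tolerance: float = 0.1) -> bool:
    """Claim 31 reading: any different level marks a candidate modulator."""
    return classify_interaction(test_level, reference_level, tolerance) != "unchanged"
```

In practice the threshold would come from the variability of the assay itself, for example from replicate reference measurements.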
35. A method according to claim 34 wherein the signal transduction is mediated by a cytokine or other endogenous molecule, a hormone, a microbe or a microbial product, a parasite, an antigen or other effector molecule.
36. A method according to claim 35 wherein the protein modulates cytokine-mediated signal transduction.
37. A method according to claim 36 wherein the signal transduction is mediated by one or more of the cytokines EPO, TPO, G-CSF, GM-CSF, IL-3, IL-2, IL-4, IL-7, IL-13, IL-6, LIF, OSM, IL-12, IFNγ, TNFα, IL-1 and/or M-CSF.
38. A method according to claim 37 wherein the signal transduction is mediated by one or more of IL-6, LIF, OSM, IFNγ and/or TPO.
39. A method according to claim 38 wherein the protein comprises an amino acid sequence substantially as set forth in SEQ ID NO. 4, SEQ ID NO. 6, SEQ ID NO. 8, SEQ ID NO. 10, SEQ ID NO. 12, SEQ ID NO. 18, SEQ ID NO. 36 or SEQ ID NO. 44.
40. A method according to any one of claims 31 to 39 wherein the SH2 domain comprises the amino acid sequence: X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31 X32 X33 X34 [Xp]n X35 X36 X37 X38 X39 X40 X41 X42 X43 X44 X45 X46 X47 X48 X49 X50 X51 X52 X53 X54 X55 X56 X57 X58 [Xq]n X59 X60 X61 [Xr]n X62 X63 X64 X65 X66 wherein:
X1 is G or P;
X2 is F, W or C;
X3 is Y;
X4 is W;
X5 is G or S;
X6 is P, A, S or V;
X7 is L, M or V;
X8 is S, T, D or N;
X9 is V, G, A, R, K or W, with the proviso that it is not A if X10 is S, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X10 is H, G, N, S, Y, W or E, with the proviso that it is not S if X9 is A, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X11 is G, E, A or D;
X12 is A;
X13 is H, N, K, R or E, with the proviso that it is not R if X9 is A, X10 is S, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X14 is E, L, Q, A, G or M, with the proviso that it is not Q if X9 is A, X10 is S, X13 is R, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X15 is R, L, K or H, with the proviso that it is not H if X9 is A, X10 is S, X13 is R, X14 is Q, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X16 is L;
X17 is R, S, K, Q, E or A, with the proviso that it is not Q if X9 is A, X10 is S, X13 is R, X14 is Q, X15 is H, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X18 is A, E, K, G, N or S, with the proviso that it is not K if X9 is A, X10 is S, X13 is R, X14 is Q, X15 is H, X17 is Q, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X19 is E, A, M, K or V, with the proviso that it is not M if X9 is A, X10 is S, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X20 is P;
X21 is V, A, E or D;
X22 is G;
X23 is T or S;
X24 is F;
X25 is L;
X26 is V, I or L;
X27 is R;
X28 is D;
X29 is S;
X30 is R, S, T or A, with the proviso that it is not T if X9 is A, X10 is S, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X31 is Q, D or H;
X32 is R, Q, S, P, E or D;
X33 is N, R, D or S, with the proviso that it is not S if X9 is A, X10 is S, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X41 is T, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X34 is C, H or Y;
[Xp]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xp may comprise the same or different amino acids selected from F, L or I;
X35 is A, T or S;
X36 is L, I or V;
X37 is S;
X38 is V or F;
X39 is K or R;
X40 is M, T, R or S;
X41 is A, Q, S, T, Y or H, with the proviso that it is not T if X9 is A, X10 is S, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X42 is R, X47 is V, X52 is A, X54 is S and X64 is Q;
X42 is S, A, R, N or G, with the proviso that it is not R if X9 is A, X10 is S, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X47 is V, X52 is A, X54 is S and X64 is Q;
X43 is G, R, K or I;
X44 is P, T or S;
X45 is T, K, L or H;
X46 is S, N or H;
X47 is I, L, V, A or T, with the proviso that it is not V if X9 is A, X10 is S, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X52 is A, X54 is S and X64 is Q;
X48 is R;
X49 is V, I or M;
X50 is H, Q or E;
X51 is F, C, Y, Q or H;
X52 is Q, E, A, W, S or Y, with the proviso that it is not A if X9 is A, X10 is S, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X54 is S and X64 is Q;
X53 is A, G, D, N or R;
X54 is G, S or H, with the proviso that it is not S if X9 is A, X10 is S, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A and X64 is Q;
X55 is R, S, K, N or T;
X56 is F;
X57 is H, S or R;
X58 is L or F;
[Xq]n is a sequence of n amino acids wherein n is from 7 to 14 amino acids and wherein the sequence Xq may comprise the same or different amino acids selected from any amino acid residue;
X59 is D or E;
X60 is C, S, V, I, L or F;
X61 is L, V, T or I;
[Xr]n is a sequence of n amino acids wherein n is from 1 to 2 amino acids and wherein the sequence Xr may comprise the same or different amino acids selected from any amino acid residue;
X62 is L or A;
X63 is L, V or I;
X64 is E, H, D, Q or M, with the proviso that it is not Q if X9 is A, X10 is S, X13 is R, X14 is Q, X15 is H, X17 is Q, X18 is K, X19 is M, X30 is T, X33 is S, X41 is T, X42 is R, X47 is V, X52 is A and X54 is S;
X65 is H or Y; and
X66 is Y or S.
41. A method according to claim 40 wherein the SH2 domain comprises a sequence selected from:
(i) GFYWGPLSVHGAHERLRAEPVGTFLVRDSRQRNTCFFALSVKVASGPTSTRVHFQAGRFHLDGSRETFDCLFELLEHY;
(ii) GFYWGPLSVHGAHERLRAEPVGTFLVRDSRQRNCFFALSVKMASGPTSIRVHFQAGRFHLDGSRESFDCLFELLEHY;
(iii) GFYWGPLSVHGAHERLRSEPVGTFLVRDSRQRNCFFALSVKMASGPTSIRVHFQAGRFHLDGNRETFDCLFELLEHY;
(iv) GWYWGSMTVNEAKEKLKEAPEGTFLIRDSSHSDYLLTISVKTSAGPTNLRIEYQDGKFRLDSIICVKSKILKQFDSVVHLIDYY;
(v) GFYWSAVTGGEANLLLSAEPAGTFLIRDSSDQRHFFTLSVKTQSGTKNLRIQCEGGSFSLQSDPRSTQPVPRFDCVLKLVHHY;
(vi) PCYWGVMDKYAAEALLEGKPEGTFLLRDSAQEDYLFSVSFRRYSRSLHARIEQWNHNFSFDAHDPCVFHSPDITGLLEHY; or
(vii) GWYWGPMNWEDAEMKLKGKPDGSFLVRDSSDPRYILSLSFRSQGITHHTRMEHYRGTFSLWCHPKFEDRCQSVVEFIKRAVMHS.
42. A method of modulating the activity of an SH2 domain-containing SOCS protein in a cell the method comprising contacting a cell containing a gene encoding the SOCS protein with an effective amount of a modulator of SOCS protein activity for a time and under conditions sufficient to modulate the activity of the SOCS protein, wherein the modulator has been identified, produced, designed or otherwise selected by the use according to any one of claims 21 to
43. A method of modulating the activity of an SH2 domain-containing SOCS protein in a cell the method comprising contacting a cell containing a gene encoding the SOCS protein with an effective amount of a modulator of SOCS protein activity for a time and under conditions sufficient to modulate the activity of the SOCS protein, wherein the modulator has been identified, produced, designed or otherwise selected using a method according to any one of claims 31 to 41.
44. A method of modulating signal transduction in a cell containing a gene encoding an SH2 domain-containing SOCS protein the method comprising contacting the cell with an effective amount of a modulator of SOCS protein activity for a time sufficient to modulate signal transduction, wherein the modulator has been identified, produced, designed or otherwise selected by the use according to any one of claims 21 to
45. A method of modulating signal transduction in a cell containing a gene encoding an SH2 domain-containing SOCS protein the method comprising contacting the cell with an effective amount of a modulator of SOCS protein activity for a time sufficient to modulate signal transduction, wherein the modulator has been identified, produced, designed or otherwise selected using a method according to any one of claims 31 to 41.
46. A method of influencing interaction between cells wherein at least one cell carries a gene encoding an SH2 domain-containing SOCS protein, the method comprising contacting the cell carrying the gene with an effective amount of a modulator of SOCS protein activity for a time sufficient to modulate signal transduction, wherein the modulator has been identified, produced, designed or otherwise selected by the use according to any one of claims 21 to
47. A method of influencing interaction between cells wherein at least one cell carries a gene encoding an SH2 domain-containing SOCS protein, the method comprising contacting the cell carrying the gene with an effective amount of a modulator of SOCS protein activity for a time sufficient to modulate signal transduction, wherein the modulator has been identified, produced, designed or otherwise selected using a method according to any one of claims 31 to 41.
48. A method according to any one of claims 42 to 47 wherein signal transduction is mediated by a cytokine, a hormone, a microbe or a microbial product, a parasite, an antigen or other effector molecule.
49. A method according to claim 48 wherein the cytokine is one or more of EPO, TPO, G-CSF, GM-CSF, IL-3, IL-2, IL-4, IL-7, IL-13, IL-6, LIF, OSM, IL-12, IFNγ, TNFα, IL-1 and/or M-CSF.
50. A method according to claim 49 wherein the cytokine is one or more of IL-6, LIF, OSM, IFNγ and/or TPO.
51. A method according to claim 50 wherein the cytokine is IL-6.
52. A method according to any one of claims 42 to 51 wherein the SOCS gene encodes a protein having a SOCS box comprising the amino acid sequence: X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 [Xi]n X17 X18 X19 X20 X21 X22 X23 [Xj]n X24 X25 X26 X27 X28 wherein:
X1 is L, I, V, M, A or P, with the proviso that it is not A if X25 is A;
X2 is any amino acid residue;
X3 is P, T or S;
X4 is L, I, V, M, A or P;
X5 is any amino acid;
X6 is any amino acid;
X7 is L, I, V, M, A, F, Y or W;
X8 is C, T or S;
X9 is R, K or H;
X10 is any amino acid;
X11 is any amino acid;
X12 is L, I, V, M, A or P;
X13 is any amino acid;
X14 is any amino acid;
X15 is any amino acid;
X16 is L, I, V, M, A, P, G, C, T or S;
[Xi]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids and wherein the sequence Xi may comprise the same or different amino acids selected from any amino acid residue;
X17 is L, I, V, M, A or P;
X18 is any amino acid;
X19 is any amino acid;
X20 is L, I, V, M, A or P;
X21 is P;
X22 is L, I, V, M, A, P or G;
X23 is P or N;
[Xj]n is a sequence of n amino acids wherein n is from 0 to 50 amino acids and wherein the sequence Xj may comprise the same or different amino acids selected from any amino acid residue;
X24 is L, I, V, M, A or P;
X25 is any amino acid, with the proviso that it is not A if X1 is A;
X26 is any amino acid;
X27 is Y or F; and
X28 is L, I, V, M, A or P.
53. A method according to claim 52 wherein the SOCS gene comprises a nucleotide sequence selected from SEQ ID NO. 3, SEQ ID NO. 5, SEQ ID NO. 7, SEQ ID NO. 9, SEQ ID NO. 11, SEQ ID NO. 17, SEQ ID NO. 19, SEQ ID NO. 26, SEQ ID NO. 27, SEQ ID NO. 30, SEQ ID NO. 31, SEQ ID NO. 35 or SEQ ID NO. 43.
54. A method according to claim 53 wherein the SOCS gene encodes a protein comprising an amino acid sequence substantially as set forth in SEQ ID NO. 4, SEQ ID NO. 6, SEQ ID NO. 8, SEQ ID NO. 10, SEQ ID NO. 12, SEQ ID NO. 18, SEQ ID NO. 36 or SEQ ID NO. 44.
DATED this 15th day of November, 2004 The Walter and Eliza Hall Institute of Medical Research By DAVIES COLLISON CAVE Patent Attorneys for the Applicants
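For illustration only, the SOCS box consensus recited in claim 52 can be read as a pattern over single-letter amino acid codes. The Python sketch below is not part of the specification. The names `SOCS_BOX` and `has_socs_box` are hypothetical, "any amino acid" is assumed to mean the 20 standard residues, and the proviso is assumed to mean that X1 and X25 are not both A.

```python
import re

# The 20 standard amino acids in single-letter code (assumption: the claim's
# "any amino acid residue" is restricted here to the standard set).
ANY = "[ACDEFGHIKLMNPQRSTVWY]"
NOT_A = "[CDEFGHIKLMNPQRSTVWY]"
LIVMAP = "[LIVMAP]"


def _consensus(x1: str, x25: str) -> str:
    """Build the regex for X1..X28 with the given classes for X1 and X25."""
    return "".join([
        x1, ANY, "[PTS]", LIVMAP,          # X1-X4
        ANY, ANY, "[LIVMAFYW]", "[CTS]",   # X5-X8
        "[RKH]", ANY, ANY, LIVMAP,         # X9-X12
        ANY, ANY, ANY, "[LIVMAPGCTS]",     # X13-X16
        ANY + "{1,50}",                    # [Xi]n, n = 1..50
        LIVMAP, ANY, ANY, LIVMAP,          # X17-X20
        "P", "[LIVMAPG]", "[PN]",          # X21-X23
        ANY + "{0,50}",                    # [Xj]n, n = 0..50
        LIVMAP, x25, ANY, "[YF]", LIVMAP,  # X24-X28
    ])


# Proviso (as assumed here): X1 and X25 are not both A. Encoded as two
# alternatives: X1 is not A, or X1 is A and X25 is not A.
SOCS_BOX = re.compile(
    "(?:" + _consensus("[LIVMP]", ANY) + ")|(?:" + _consensus("A", NOT_A) + ")"
)


def has_socs_box(sequence: str) -> bool:
    """Return True if the sequence contains a stretch matching the consensus."""
    return SOCS_BOX.search(sequence.upper()) is not None


# Synthetic 29-residue peptide built to satisfy every position; not a real protein.
example = "LAPLAALCRAALAAAL" + "G" + "LAALPLP" + "LGAYL"
print(has_socs_box(example))
```

Writing the proviso as an alternation, rather than checking captured groups after a match, lets the regex engine consider every possible placement of the two variable-length spacers before rejecting a sequence.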
AU87234/01A 1996-11-01 2001-11-01 Therapeutic and diagnostic agents capable of modulating cellular responsiveness to cytokines Ceased AU779095B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU87234/01A AU779095B2 (en) 1996-11-01 2001-11-01 Therapeutic and diagnostic agents capable of modulating cellular responsiveness to cytokines

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
AUPO3384 1996-11-01
AUPO5117 1997-02-14
AU46943/97A AU735735B2 (en) 1996-11-01 1997-10-31 Therapeutic and diagnostic agents capable of modulating cellular responsiveness to cytokines
AU87234/01A AU779095B2 (en) 1996-11-01 2001-11-01 Therapeutic and diagnostic agents capable of modulating cellular responsiveness to cytokines

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
AU46943/97A Division AU735735B2 (en) 1996-11-01 1997-10-31 Therapeutic and diagnostic agents capable of modulating cellular responsiveness to cytokines

Related Child Applications (1)

Application Number Title Priority Date Filing Date
AU2003203775A Division AU2003203775A1 (en) 1996-11-01 2003-04-22 Therapeutic and diagnostic agents capable of modulating cellular responsiveness to cytokines

Publications (2)

Publication Number Publication Date
AU8723401A AU8723401A (en) 2002-01-03
AU779095B2 true AU779095B2 (en) 2005-01-06

Family

ID=34069681

Family Applications (1)

Application Number Title Priority Date Filing Date
AU87234/01A Ceased AU779095B2 (en) 1996-11-01 2001-11-01 Therapeutic and diagnostic agents capable of modulating cellular responsiveness to cytokines

Country Status (1)

Country Link
AU (1) AU779095B2 (en)

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YOSHIMURA, A ET AL EMBO JOURNAL (1995) VOL 14 NO 12 P2816-26 *

Also Published As

Publication number Publication date
AU8723401A (en) 2002-01-03

Similar Documents

Publication Publication Date Title
EP0948522A1 (en) Therapeutic and diagnostic agents capable of modulating cellular responsiveness to cytokines
US20060003377A1 (en) Therapeutic and diagnostic proteins comprising a SOCS box
EP0907730B1 (en) Haemopoietin receptor and genetic sequences encoding same
US20080009444A1 (en) Biologically active complex of NR6 and cardiotrophin-like-cytokine
US6870031B2 (en) Haemopoietin receptor and genetic sequences encoding same
US20060242718A1 (en) SOCS-box containing peptides
US6911530B1 (en) Haemopoietin receptor and genetic sequences encoding same
US20080274973A1 (en) novel haemopoietin receptor and genetic sequences encoding same
US20090077677A1 (en) Mammalian grainyhead transcription factors
CA2449000A1 (en) Bcl-2-modifying factor (bmf) sequences and their use in modulating apoptosis
AU779095B2 (en) Therapeutic and diagnostic agents capable of modulating cellular responsiveness to cytokines
AU735735B2 (en) Therapeutic and diagnostic agents capable of modulating cellular responsiveness to cytokines
US7279557B2 (en) Therapeutic and diagnostic agents
AU2006252108A1 (en) Therapeutic and diagnostic agents capable of modulating cellular responsiveness to cytokines
US7192576B1 (en) Biologically active complex of NR6 and cardiotrophin-like-cytokine
AU774097B2 (en) A novel regulatory molecule and genetic sequences encoding same
WO1999050410A1 (en) Novel molecules expressed during muscle development and genetic sequences encoding the same
AU728863B2 (en) A novel mammalian gene, bcl-w, belongs to the bcl-2 family of apoptosis-controlling genes
WO2000012695A1 (en) Novel therapeutic molecules and uses therefor
AU2003250580A1 (en) Mammalian grainyhead transcription factors
AU2006200097A1 (en) An animal model for studying hormone signalling and method of modulating the signalling
AU1371901A (en) An animal model for studying hormone signalling and method of modulating the signalling
WO2004037857A1 (en) Bfk protein as therapeutic molecules

Legal Events

Date Code Title Description
NB Applications allowed - extensions of time section 223(2)

Free format text: THE TIME IN WHICH TO MAKE A FURTHER APPLICATION FOR A DIVISIONAL PATENT HAS BEEN EXTENDED TO 20011112