AU779662B2

AU779662B2 - Vaccine

Info

Publication number: AU779662B2
Application number: AU41330/00A
Authority: AU
Inventors: Berry Birdsall; James Feeney; Anthony Holder; William Morgan; Shabih Syed; Chairat Uthaipibull
Original assignee: Medical Research Council
Current assignee: Medical Research Council
Priority date: 1999-04-20
Filing date: 2000-04-20
Publication date: 2005-02-03
Anticipated expiration: 2020-04-20
Also published as: EP1180120A2; CN1372568A; WO2000063245A3; BR0009823A; WO2000063245A2; WO2000063245A9; JP2002543774A; MXPA01010701A; AU4133000A

Description

WO 00/63245 PCT/GB00/01558

VACCINE

Field of the Invention

The present invention relates to modified Plasmodium MSP-1 protein variants and their use in producing a vaccine against malaria. It also relates to a method for the rational design of suitable variants.

Background to the Invention

Malaria is a devastating disease that causes widespread morbidity and mortality in areas where it is transmitted by anopheline mosquitoes. In areas of high transmission young children and non-immune visitors are most at risk from this disease, which is caused by protozoa of the genus Plasmodium. In areas of lower or unstable transmission, epidemics of the disease can result and afflict individuals of all ages. The most dangerous form of malaria, responsible for much of the morbidity and most of the mortality, is caused by the species Plasmodium falciparum. It has been estimated that 2 billion people are at risk from malaria, with 200-300 million clinical cases and 1-2 million deaths each year.

The parasite has a complex life cycle in its human and mosquito hosts. In humans the stage of the life cycle which is responsible for the clinical symptoms of the disease occurs in the bloodstream. During this phase the parasite is largely hidden within host red blood cells. Here the parasite grows and multiplies. For example, within a red blood cell each P. falciparum parasite divides several times to produce approximately 20 new ones during a 48 hour cycle. At this point the red blood cell is burst open and the parasites (called merozoites at this stage) are released into the bloodstream. The merozoites must enter new red blood cells in order to survive and for the cycle of replication in the blood to continue.

If the parasites do not manage to enter red blood cells they cannot survive for very long and are rapidly destroyed. Symptoms of malaria such as fever are associated with this cyclic merozoite release and re-invasion of red blood cells.

There is an urgent need for a vaccine against malaria. There is no effective vaccine currently available. In addition, mosquito control by the spraying of residual insecticides is either becoming ineffective or considered to be unacceptable, and there is a very worrying spread of drug resistance within parasites. The rapid spread of drug resistance is worrying because compounds such as the cheap and once-effective chloroquine are no longer useful in many parts of the world, and there are few if any new drugs available that are both cheap and effective. Vaccines against microorganisms can be very cost effective and efficient ways to protect populations against infectious diseases.

Because of the complexity of the parasite's life cycle there are a number of points in its development within humans that could be the target of a protective immune response. It is known that with increasing age and exposure individuals do become immune to malaria, suggesting that protective responses do develop with time. Broadly speaking there are three types of vaccine strategy: to target the pre-erythrocytic stages, the asexual blood stage and the sexual stage. The pre-erythrocytic stages are the sporozoites that are injected by an infected mosquito when it takes a blood meal and the initial development of the parasite in the liver. The asexual blood stage is the infection and release of merozoites from red blood cells that occurs in a cyclic manner, and the stage responsible for the manifestation of the clinical symptoms. The sexual stage takes place in the mosquito's gut after it has ingested gametocytes in a blood meal and this initiates the infection of the insect to complete the cycle; a vaccine against the sexual stages would not protect the individual but could reduce transmission and therefore the incidence of malaria in a given human population.

During the asexual cycle in the blood the parasite is directly exposed to the host's immune system, and in particular to antibodies circulating within the bloodstream, only transiently: when merozoites are released by rupture of one cell and before they penetrate another. If there are specific antibodies that can bind to the surface of the parasite then it is possible that these antibodies will interfere with the ability of the parasite to invade a new red blood cell. In fact it has been shown that several monoclonal antibodies that recognise single epitopes on parasite surface proteins are capable of neutralising the parasite and preventing the cycle of reproduction within red blood cells.

One of the best characterised proteins on the surface of the merozoite is called merozoite surface protein 1 (MSP-1). MSP-1 is a large protein that varies in size and amino acid sequence in different parasite lines. It is synthesised as a precursor molecule of ~200 kDa by the intracellular parasite and located on the parasite's surface. During release of merozoites from red blood cells and the re-invasion of new erythrocytes the protein undergoes at least two proteolytic modifications. In the first modification, as a result of a process called primary processing, the precursor is cleaved to four fragments of ~83, 30, 38 and 42 kDa that remain together as a complex on the merozoite surface. This complex also contains two other proteins of 22 kDa and 36 kDa derived from different genes. The complex is maintained by non-covalent interactions between the different subunits and is held on the merozoite surface by a glycosyl phosphatidyl inositol anchor, attached to the C-terminus of the 42 kDa fragment and inserted into the plasma membrane of the merozoite. At the time of merozoite invasion of an erythrocyte the C-terminal 42 kDa fragment is cleaved by a second proteolytic cleavage in a process called secondary processing. The result of secondary processing is that the entire complex is shed from the surface of the merozoite except for a C-terminal sub-fragment that consists of just under one hundred amino acids and which is carried into the newly invaded erythrocyte on the surface of the merozoite.

Based on sequence similarities, the structure of this small C-terminal fragment (called MSP-119) was suggested to consist of two epidermal growth factor (EGF)-like domains (see sequence in Figure 1) (Blackman et al., 1991). An EGF-like motif consists of a 45-50 amino acid sequence with a characteristic disulphide bonding pattern and such domains occur frequently in extracellular modular proteins of animals. In the MSP-1 C-terminal fragment each of the motifs contains six Cys residues proposed to form three disulphide bonds and each motif has a partial match to the EGF consensus (see Figure 1). However, because the degree of similarity is limited and since the pattern of its disulphide bonding is not known, the designation of the MSP-1 C-terminal fragment as comprised of EGF-like structures has been regarded as tentative. Other relatively divergent potential EGF-like sequences occur in Plasmodium proteins, but previous structure determinations have been confined to those from metazoan organisms (Campbell et al., 1998).

A number of studies have implicated MSP-1 as the target of a protective immune response. Although the goal of this work is to develop a malaria vaccine for use in humans, out of necessity most of this experimental work has been done either in model animal systems or in vitro. These include studies of the effect of specific antibodies on parasite invasion of erythrocytes in vitro, passive immunisation studies in rodent malaria models in laboratory mice and direct immunisation in both rodent and primate malaria models using either native protein (derived from the parasite) or recombinant protein expressed from parts of the MSP-1 gene in heterologous organisms. Sero-epidemiological studies have also shown a correlation between human antibody responses to parts of the MSP-1 molecule and protection against clinical disease. Much, but not all, of the work has focused on the immune response to the C-terminal MSP-119. For example, some monoclonal antibodies that recognise MSP-119 prevent red blood cell invasion in in vitro cultures (Blackman et al., 1990). Interestingly, these antibodies that inhibit invasion also inhibit the secondary processing of the 42 kDa fragment, suggesting the mechanism by which they work is by steric hindrance of the protease responsible for secondary processing (Blackman et al., 1994). Since secondary processing goes to completion during successful invasion, if it cannot occur then invasion is interrupted.

All of the work described above would suggest that MSP-1, and in particular polypeptides based on the C-terminal sequence that forms the 42 kDa or the MSP-119 region, should be very good candidates for malaria vaccine development. However, several studies have shown that the epitopes or binding sites for antibodies on MSP-119 require a correct polypeptide tertiary structure, and that this is destroyed by treatments that reduce the disulphide bonds that are postulated to be present between the cysteine residues present in MSP-119. This limitation appears to have been overcome by the expression of recombinant protein in ways that allow antibodies that recognise the native parasite MSP-1 to bind. Other investigators have suggested that other parts of MSP-1 also have potential for inclusion in a vaccine; however, the MSP-1 C-terminal fragment is currently the lead candidate for development of a vaccine against the blood stages of the malaria parasite (Diggs et al., 1993; Stoute et al., 1998).

As stated above, every ~48 hours P. falciparum merozoites are released from the infected erythrocyte to re-invade new red blood cells and during this time they are exposed to the host's immune system. Therefore, the question arises as to how the parasite has evolved to avoid the potentially lethal effects of, for example, neutralising antibodies. In other infectious micro-organisms it is clear that there is a constant battle between the immune system and the micro-organism, and that sophisticated mechanisms have been evolved by micro-organisms to evade the immune response. For example, antigenic variation and antigenic diversity are two mechanisms that involve presenting the immune system with "a moving target" such that even though an immune response to one variant of the micro-organism may kill that variant, new variants are produced that are at least partially or fully resistant to the immune response. In the case of malaria merozoites and in particular MSP-1 an alternative mechanism has been proposed whereby the binding of some antibodies ("blocking antibodies") can prevent the binding of neutralising antibodies and thereby allow the parasite to successfully invade a red blood cell even in the presence of neutralising antibodies (Guevara Patino et al., 1997). These blocking antibodies may be of two types: those against epitopes that are formed from amino acids that are distant in the linear primary sequence from the epitopes that are the target of neutralising antibodies, and those that are against epitopes that overlap with the epitopes of the neutralising antibodies. This represents a novel mechanism by which a parasite can evade an effective immune response and, unlike mechanisms based on antigenic polymorphism or diversity, it is not dependent upon amino acid sequence diversity.

Some monoclonal antibodies (mAbs) that bind to MSP-119 inhibit the proteolytic cleavage and erythrocyte invasion, suggesting that cleavage is a prerequisite for invasion (Blackman et al., 1994). Other mAbs that bind to the MSP-1 C-terminal fragment do not inhibit processing or invasion but block the binding of the inhibitory neutralising antibodies. Other antibodies that bind to MSP-119 neither inhibit nor block the binding of inhibitory antibodies. In the presence of blocking antibodies, inhibitory antibodies are ineffective and invasion proceeds. The balance between inhibitory and blocking antibodies induced by immunisation may be a critical factor in determining whether or not the immune response is effective in preventing invasion (Guevara Patino et al., 1997).

Summary of the Invention

In one embodiment the present invention provides an effective vaccine against the malaria parasite based on variants of the Plasmodium MSP-1 protein. In designing such a vaccine, the following criteria should be met:

1. The amino acid sequence of the polypeptide to be used in the vaccine should contain epitopes that are the targets of, and can induce, neutralising antibodies.

2. The polypeptide should ideally not include amino acid sequences that only form epitopes for blocking antibodies.

3. If the polypeptide contains epitopes for both neutralising and blocking antibodies then it should be modified to remove the blocking antibody epitopes without affecting the neutralising epitopes.

To assist in the design of candidate vaccine polypeptides fulfilling these three criteria, it is important to determine the three-dimensional structure of the MSP-1 C-terminal fragment since this will help in mapping sites of antibody interactions with this fragment. We have therefore determined the solution structure of the MSP-1 C-terminal fragment, including the pattern of disulphide bonding, using NMR techniques.

We have made amino acid substitutions in the sequence of MSP-119 that prevent the binding of individual blocking monoclonal antibodies, without affecting the binding of neutralising antibodies. By determining the 3-dimensional structure of MSP-119 we have identified where these antibody binding sites are located in the tertiary structure and this has allowed other amino acid substitutions to be made that have similar properties. We have shown that several substitutions, each affecting the binding of one or more blocking antibodies, can be combined into a single molecule, and that these modified molecules continue to bind the neutralising antibodies but fail to bind any of the blocking antibodies.

Such modified molecules are expected to be much more effective than the natural or wild-type protein structure at inducing a protective neutralising antibody response when used to immunise individuals as a malaria vaccine. In addition we have made other modifications in the primary structure of the molecule which do not affect the binding of the neutralising antibodies but which may contribute to increased immunogenicity of the molecule. The modified MSP-119 structures, either alone or coupled to other carriers, which may or may not contain other parts of MSP-1 to enhance the immunogenicity (for example a combination of the remainder of the MSP-1 42 kDa fragment with the modified MSP-119) and provide additional T cell epitopes, would be more effective vaccines than equivalent structures that have not been modified in this way.

Accordingly, the present invention provides a non-naturally occurring variant of a C-terminal fragment of a Plasmodium merozoite surface protein-1 (MSP-1) wherein said variant has (i) a reduced affinity, compared with a naturally occurring Plasmodium MSP-119, for at least one first antibody capable of blocking the binding of a second antibody, which second antibody inhibits the proteolytic cleavage of Plasmodium MSP-142, and (ii) substantially the same affinity for said second antibody compared with said naturally occurring Plasmodium MSP-119.

Preferably, the Plasmodium MSP-119 and MSP-142 are Plasmodium falciparum MSP-119 and MSP-142. The first antibody is preferably selected from mAbs 1E1, 2.2, 7.5, 9C8 and 111.4. The second antibody is preferably selected from mAbs 12.8, 12.10 and 5B1.

The present invention further provides a non-naturally occurring variant of a C-terminal fragment of a Plasmodium merozoite surface protein-1 (MSP-1) comprising an amino acid modification at any one of amino acid residues 14, 15, 27, 31, 34, 43, 48 and 53 of the Plasmodium falciparum MSP-119 amino acid sequence shown as SEQ I.D. No. 1 or their equivalent positions in other Plasmodium MSP-119 polypeptides.

Preferably said modifications are substitutions selected from Gln14→Arg, Gln14→Gly, Glu27→Tyr, Leu31→Arg, Tyr34→Ser, Tyr34→Ile, Glu43→Leu, Thr48→Lys and Asn53→Arg and their equivalents in other Plasmodium MSP-119 polypeptides. More preferably said substitutions are combinations of substitutions selected from [Glu27→Tyr, Leu31→Arg and Glu43→Leu], [Glu27→Tyr, Leu31→Arg, Tyr34→Ser and Glu43→Leu], [Asn15→Arg, Glu27→Tyr, Leu31→Arg and Glu43→Leu] and their equivalents in other Plasmodium MSP-119 polypeptides.

In a preferred embodiment, a variant MSP-1 polypeptide of the invention further comprises a mutation at Cys12 and/or Cys28 of the Plasmodium falciparum MSP-119 amino acid sequence shown as SEQ I.D. No. 1. Preferably such modifications are substitutions selected from Cys12→Ile and Cys28→Trp, and Cys12→Ala and Cys28→Phe.

Most preferably the substitutions are combinations selected from [Cys12→Ile, Asn15→Arg, Glu27→Tyr, Cys28→Trp, Leu31→Arg, Glu43→Leu], [Cys12→Ile, Asn15→Arg, Glu27→Tyr, Cys28→Trp, Leu31→Arg, Glu43→Leu, Asn53→Arg], [Cys12→Ile, Asn15→Arg, Glu27→Tyr, Cys28→Trp, Leu31→Arg, Tyr34→Ser, Glu43→Leu, Asn53→Arg] and their equivalents in other Plasmodium MSP-119 polypeptides.

The present invention also provides a method for producing a Plasmodium MSP-1 variant for use in preparing a vaccine composition, which method comprises modifying one or more amino acid residues of a Plasmodium MSP-1 C-terminal fragment such that the resulting derivative has (i) a reduced affinity, compared with a naturally occurring Plasmodium MSP-119, for at least one first antibody capable of blocking the binding of a second antibody, which second antibody inhibits the proteolytic cleavage of Plasmodium MSP-142, and (ii) substantially the same affinity for said second antibody compared with said naturally occurring Plasmodium MSP-119. In particular the method of the invention preferably comprises, as a preliminary step, selecting a candidate amino acid residue by reference to a three dimensional NMR model structure, preferably as set out in Table 2.

More specifically, the 3D model structure is used to select a surface exposed amino acid residue. Advantageously, a further step is included of computer modelling the three dimensional structure of the variant to exclude polypeptides that do not fold correctly.

The present invention also provides a non-naturally occurring Plasmodium MSP-1 variant obtained by the method of the invention.

In a further aspect, the present invention provides a polynucleotide encoding a variant of the invention operably linked to a regulatory sequence capable of directing the expression of said nucleotide in a host cell. The polynucleotide may comprise a sequence which has been optimised for expression in the host cell. The host cell may be a Pichia pastoris cell.

Also provided is a nucleic acid vector comprising a polynucleotide of the invention, including viral vectors, and a host cell comprising a nucleotide or vector of the invention.

In another aspect, the present invention provides a pharmaceutical composition comprising a variant of the invention, a polynucleotide of the invention or a vector of the invention together with a pharmaceutically acceptable carrier or diluent.

Preferably, the composition further comprises an immunogenic Plasmodium polypeptide or fragment or derivative thereof such as MSP-133 or a fragment or derivative thereof which may be covalently attached to the non-naturally occurring MSP-119. It is preferred not to use wild-type MSP-119 sequences. The further immunogenic peptide may itself be derivatised in an analogous manner as described above for MSP-119. Thus, epitopes present in the peptide may be identified and modified to prevent binding of blocking antibodies, without affecting the binding of neutralising antibodies. These epitopes may be capable of binding to antibodies which have similar properties to the first antibody described above, for example, in binding affinity. The further immunogenic peptide may comprise several such modifications in its amino acid sequence.

The present invention also provides a method for producing anti-MSP-1 antibodies which method comprises administering a polypeptide variant of the invention, or a polynucleotide of the invention or a vector of the invention to a mammal, typically a non-human mammal.

In a preferred embodiment, the present invention provides a method for producing polyclonal anti-MSP-1 antibodies which method comprises administering a polypeptide variant of the invention, or a polynucleotide of the invention or a vector of the invention to a mammal, typically a non-human mammal, and extracting the serum from said mammal.

Also provided is an antibody produced by the said methods.

The polypeptides, nucleotides and vectors of the present invention may be used in methods of treating and/or preventing malaria caused by Plasmodium species, in particular Plasmodium falciparum. Accordingly, the present invention provides a method of inducing immunity against malaria induced by Plasmodium falciparum which comprises administering to a person in need of such immunity an effective amount of a variant, a polynucleotide or a vector of the invention.

Also provided is a method of immunizing a mammal, said method comprising administering an effective amount of a variant, a polynucleotide or a vector of the invention. In particular, said mammal is immunized against malaria. Preferably the mammal is a human.

The present invention also provides a method of treating a malaria infection in a human patient which comprises administering to the patient an effective amount of the pharmaceutical composition of the invention.

We further provide according to the present invention a nucleic acid encoding a Plasmodium MSP-1 polypeptide, in which the nucleic acid is optimised for expression in a heterologous host cell. Preferably, the heterologous host is a Pichia pastoris cell. The MSP-1 polypeptide may be selected from the group comprising an MSP-142 polypeptide comprising a sequence shown in Figures 2C and 2E, an MSP-119 polypeptide comprising a sequence shown in Figure 2C, and an MSP-133 polypeptide comprising a sequence shown in Figure 2E. The optimised nucleic acid may comprise a sequence selected from the sequences of Figure 2A, Figure 2B and Figure 2D. We further provide a vector comprising such a nucleic acid, a host cell comprising such a vector, and a pharmaceutical composition comprising such a nucleic acid or a vector, together with a pharmaceutically acceptable carrier or diluent. The pharmaceutical composition may further comprise an immunogenic Plasmodium polypeptide or fragment or derivative thereof.

Detailed description of the invention

Although in general the techniques mentioned herein are well known in the art, reference may be made in particular to Sambrook et al., Molecular Cloning, A Laboratory Manual (1989) and Ausubel et al., Current Protocols in Molecular Biology (1995), John Wiley & Sons, Inc.

A. MSP-1 variant polypeptides

The variant MSP-1 polypeptides of the present invention will be described with reference to Plasmodium falciparum MSP-1 amino acid sequences. However, it should be appreciated that except where otherwise stated, all references to MSP-1 polypeptides include homologues of MSP-1 found in other Plasmodium species, such as P. vivax, P. malariae and P. ovale, which all infect humans, and P. yoelii, which infects mice.

The variant MSP-1 polypeptides of the present invention are based on C-terminal fragments of the Plasmodium falciparum MSP-142 polypeptide shown as SEQ I.D. Nos. 2 or 3. Such polypeptides will comprise some or all of the MSP-119 region (SEQ I.D. No. 1), preferably at least substantially all of the domain 1 and/or domain 2 EGF-like sequences found in MSP-119 (approximately amino acids 1-47 and amino acids 48-96, respectively, of SEQ I.D. No. 1). It is particularly preferred to use regions that are conserved in most, more preferably all, parasites of a single species to increase the effectiveness of the variant as a vaccine against a wide range of strains.

Variant MSP-1 polypeptides of the present invention comprise modifications to their primary amino acid sequence that reduce the ability of blocking antibodies to bind to the MSP-1 polypeptides. In addition, any modifications made should maintain epitopes recognised by neutralising antibodies such that the affinity of the neutralising antibodies for the MSP-1 variant is substantially the same as for naturally-occurring MSP-1 polypeptides (such as an MSP-142 polypeptide having the sequence shown in SEQ I.D. Nos. 2 or 3). Some reduction in the binding of some neutralising antibodies may be tolerated since the primary objective is to inhibit the binding of blocking antibodies and it is likely that an effective reduction in the binding of blocking antibodies will compensate in terms of overall vaccine efficacy for a small reduction in neutralising antibody binding.

Neutralising antibodies in the context of the present invention are antibodies that inhibit malaria parasite replication. A variety of neutralising antibodies, polyclonal and monoclonal, are known in the art, including mAbs 12.8, 12.10 and 5B1 referred to in the Examples. The activity of neutralising antibodies can be determined in a variety of ways that have been described in the art. For example, a convenient assay method described in Blackman et al., 1994 involves using preparations of merozoites (Blackman et al., 1993; Mrema et al., 1982) to measure cleavage of MSP-142 into MSP-133 and MSP-119. Briefly, freshly isolated merozoites are washed in ice-cold buffer and divided up into aliquots of about 2 × 10⁹ merozoites. A test antibody is added to each aliquot and the sample incubated at 37°C for 1 hour. The samples are then subjected to SDS-PAGE under non-reducing conditions on a 12.5% polyacrylamide gel, Western blotted and the blot probed with antiserum to MSP-133. In the control sample, two main bands are seen, one corresponding to MSP-142 and one lower molecular weight band corresponding to MSP-133. Neutralising antibodies will reduce the amount of the lower molecular weight band as a result of inhibiting secondary proteolytic processing of MSP-142. This method is a particularly preferred method for assessing the efficacy of neutralising antibodies in the presence of antibodies believed to act as blocking antibodies. Where candidate competing blocking antibodies are to be tested, the merozoite sample is preincubated with a blocking antibody for 15 mins on ice prior to incubation with a neutralising antibody at 37°C for 1 hour as described above. Thus blocking antibodies can readily be identified and/or characterised using such an assay method.

Other assay methods include merozoite invasion inhibition tests as described in Blackman et al., 1990.

As discussed above, blocking antibodies are defined in the context of the present invention as antibodies that inhibit the binding of neutralising antibodies to MSP-1 but which do not themselves inhibit invasion of red blood cells by malaria parasites. Thus they "block" the neutralising function of the neutralising antibodies. A variety of blocking antibodies have been characterised in the art, including mAbs 1E1, 2.2, 7.5 and 111.4 referred to in the Examples. As discussed above, blocking antibodies can conveniently be identified and/or characterised using assays that test their effect on neutralising antibody function.

Modifications that may be made to produce MSP-1 variants of the invention include substitutions, deletions and insertions. It is particularly preferred to use substitutions to minimise disruption of the secondary/tertiary structure of the polypeptide. Furthermore, particularly preferred substitutions are those that replace one class of amino acid with another class, such as an aliphatic non-polar residue with a charged polar residue. For example, the twenty naturally occurring amino acids may be divided into four main groups (aliphatic non-polar [G, A, P, I, L and V], polar un-charged [C, S, T, M, N and Q], polar charged [D, E, K and R] and aromatic [H, F, W and Y]) and it is preferred to replace an amino acid from one group with an amino acid from another group.
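By way of illustration only, the preference for replacing a residue from one group with a residue from another group can be sketched in Python. The names `AMINO_ACID_GROUPS`, `group_of` and `changes_group` are hypothetical, and the group membership used is the conventional four-group table, which should be treated as an assumption and checked against the intended classification.

```python
# Conventional four-group classification of the twenty amino acids
# (one-letter codes). The exact membership is an assumption of this sketch.
AMINO_ACID_GROUPS = {
    "aliphatic non-polar": set("GAPILV"),
    "polar uncharged": set("CSTMNQ"),
    "polar charged": set("DEKR"),
    "aromatic": set("HFWY"),
}


def group_of(residue: str) -> str:
    """Return the name of the group containing a one-letter residue code."""
    for name, members in AMINO_ACID_GROUPS.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown amino acid code: {residue!r}")


def changes_group(wild_type: str, replacement: str) -> bool:
    """True when a substitution moves the residue into a different group."""
    return group_of(wild_type) != group_of(replacement)


# Glu -> Tyr moves from polar charged to aromatic, so the group changes.
print(changes_group("E", "Y"))
# Leu -> Ile stays within the aliphatic non-polar group.
print(changes_group("L", "I"))
```

Under this grouping, a candidate substitution would be favoured when `changes_group` returns `True`; the check says nothing about whether the resulting polypeptide folds correctly or retains neutralising epitopes.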

Other possibilities include replacing a positively charged side chain with a negatively charged side chain, replacing an amino acid with a large side chain with an amino acid with a smaller or no side chain (glycine), replacing a polar amino acid with a charged polar amino acid, replacing a large aromatic amino acid with an amino acid with a small side chain, and replacing cysteine residues that are involved in disulphide bonds.

Particularly preferred modifications are an amino acid modification at any one of amino acid residues 14, 15, 27, 31, 34, 43, 48 and 53 of the Plasmodium falciparum MSP-119 amino acid sequence shown as SEQ I.D. No. 1 or their equivalent positions in other Plasmodium MSP-119 polypeptides. These residues are almost all within the EGF-like domain 1. It is known that the epitopes of some antibodies contain amino acid sequences that are within EGF-like domain 2, therefore equivalent modifications may also be made in EGF-like domain 2. Preferred examples of modifications include the following substitutions: Gln14→Arg, Gln14→Gly, Asn15→Arg, Glu27→Tyr, Leu31→Arg, Tyr34→Ser, Tyr34→Ile, Glu43→Leu, Thr48→Lys and/or Asn53→Arg and their equivalents in other Plasmodium MSP-119 polypeptides.
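As a rough illustration of how the listed positions fall across the two EGF-like domains, the following Python sketch assigns each position to a domain. It assumes the approximate boundaries given earlier in this description (amino acids 1-47 for domain 1 and 48-96 for domain 2); the names `egf_domain` and `POSITIONS_BY_DOMAIN` are hypothetical.

```python
# Approximate domain boundaries (1-based, inclusive), taken as an assumption
# from the "approximately amino acids 1-47 and 48-96" wording.
DOMAIN_1 = range(1, 48)
DOMAIN_2 = range(48, 97)

PREFERRED_POSITIONS = [14, 15, 27, 31, 34, 43, 48, 53]


def egf_domain(position: int) -> int:
    """Return 1 or 2 for the EGF-like domain containing a residue position."""
    if position in DOMAIN_1:
        return 1
    if position in DOMAIN_2:
        return 2
    raise ValueError(f"position {position} lies outside residues 1-96")


# Group the preferred positions by domain.
POSITIONS_BY_DOMAIN = {1: [], 2: []}
for position in PREFERRED_POSITIONS:
    POSITIONS_BY_DOMAIN[egf_domain(position)].append(position)

print(POSITIONS_BY_DOMAIN)
```

With these approximate boundaries, positions 48 and 53 are assigned to domain 2 and the remaining six to domain 1; a different choice of boundary would move position 48.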

It is especially preferred to carry out more than one modification, i.e. to use combinations of modifications, such as two or more or three or more. In a preferred embodiment, an MSP-1 variant of the invention comprises a combination of amino acid substitutions selected from [Glu27→Tyr, Leu31→Arg and Glu43→Leu], [Glu27→Tyr, Leu31→Arg, Tyr34→Ser and Glu43→Leu], [Asn15→Arg, Glu27→Tyr, Leu31→Arg and Glu43→Leu] and their equivalents in other Plasmodium MSP-119 polypeptides.

A particularly preferred combination further comprises a modification to Cys12 and/or Cys28 (and/or their equivalent residues in EGF-like domain 2) to disrupt the disulphide bond. Preferably such modifications are substitutions selected from Cys12→Ile and Cys28→Trp, and Cys12→Ala and Cys28→Phe.

Most preferably the substitutions are combinations selected from [Cys12→Ile, Asn15→Arg, Glu27→Tyr, Cys28→Trp, Leu31→Arg, Glu43→Leu], [Cys12→Ile, Asn15→Arg, Glu27→Tyr, Cys28→Trp, Leu31→Arg, Glu43→Leu, Asn53→Arg], [Cys12→Ile, Asn15→Arg, Glu27→Tyr, Cys28→Trp, Leu31→Arg, Tyr34→Ser, Glu43→Leu, Asn53→Arg] and their equivalents in other Plasmodium MSP-119 polypeptides.
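The substitution notation used above (wild-type residue, position, replacement residue) lends itself to a simple mechanical check. The Python sketch below parses such strings and applies a combination to a sequence, refusing to proceed if the residue at the stated position is not the expected wild-type residue. It is a minimal sketch only: the function names are hypothetical, and the sequence used is a toy placeholder built for the demonstration, not SEQ I.D. No. 1.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Accepts either an ASCII arrow ("->") or a Unicode arrow.
_SUBSTITUTION = re.compile(r"^([A-Z][a-z]{2})(\d+)\s*(?:->|→)\s*([A-Z][a-z]{2})$")


def parse_substitution(text: str) -> tuple[str, int, str]:
    """Parse e.g. 'Glu27->Tyr' into ('E', 27, 'Y') with a 1-based position."""
    match = _SUBSTITUTION.match(text.strip())
    if match is None:
        raise ValueError(f"cannot parse substitution: {text!r}")
    wild, position, new = match.groups()
    if wild not in THREE_TO_ONE or new not in THREE_TO_ONE:
        raise ValueError(f"unknown residue name in: {text!r}")
    return THREE_TO_ONE[wild], int(position), THREE_TO_ONE[new]


def apply_substitutions(sequence: str, substitutions: list[str]) -> str:
    """Apply each substitution after checking the wild-type residue matches."""
    residues = list(sequence)
    for text in substitutions:
        wild, position, new = parse_substitution(text)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild:
            raise ValueError(
                f"{text}: expected {wild} at {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


# Toy placeholder sequence (hypothetical, NOT SEQ I.D. No. 1): alanine
# everywhere except the three positions the demonstration needs.
toy = list("A" * 60)
for position, residue in {27: "E", 31: "L", 43: "E"}.items():
    toy[position - 1] = residue
TOY_WILD_TYPE = "".join(toy)

COMBINATION = ["Glu27->Tyr", "Leu31->Arg", "Glu43->Leu"]
TOY_VARIANT = apply_substitutions(TOY_WILD_TYPE, COMBINATION)
print(TOY_VARIANT[26], TOY_VARIANT[30], TOY_VARIANT[42])
```

Checking the wild-type residue before substituting is a deliberate design choice: it catches off-by-one numbering errors and the use of a homologue whose equivalent positions differ.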

Substitutions are not confined to using naturally occurring amino acids; non-naturally occurring amino acid analogues may also be used, in particular where solid phase synthesis is to be used to chemically synthesise the variant, as opposed to recombinant technology.

Modifications to MSP-1 amino acid sequences may be carried out using standard techniques such as site-directed mutagenesis using the polymerase chain reaction.

Alternatively, variants may be obtained by solid phase synthetic techniques.

To determine whether a variant MSP-1 polypeptide produced by modification of its primary amino acid sequence complies with the criteria specified above, the affinity of at least one neutralising antibody and at least one blocking antibody for the variant polypeptide compared with the naturally occurring MSP-1 sequence may be tested. Ideally more than one of each type of antibody should be used, for example two or three.

The ability of antibodies to bind to the variant and wild-type polypeptides may be determined using any one of a variety of methods available in the art for determining antibody-epitope binding. One such method, described in the Examples, involves the use of MSP-1 sequences expressed as fusion proteins with a protein tag such as glutathione-S-transferase (GST). These GST-fusion proteins are typically immobilised to a solid phase such as glutathione sepharose beads or a BIAcore sensor chip. Binding of antibodies, such as monoclonal antibodies, to the fusion proteins may be determined using standard techniques such as Western blotting and/or by labelling the antibodies with a radioactive label such as 125I. The use of BIAcore technology allows easy quantitation of the results.

Preferably, the reduction in binding of at least one of the blocking antibodies tested is at least 50% compared to wild-type MSP-1, more preferably at least 75, 80 or 90%, typically as assessed using recombinantly expressed MSP-1 immobilised to a BIAcore sensor chip.

By contrast, the binding of at least one, for example at least two or three, of the neutralising antibodies tested, more preferably at least half of the neutralising antibodies tested, more preferably substantially all of the neutralising antibodies tested, is reduced by less than 50%, more preferably less than 25%. The number of neutralising antibodies that need be tested to confirm compliance with the test criteria will not typically exceed three to five different antibodies (three antibodies are used in the Examples). In a particularly preferred embodiment the binding of at least one neutralising antibody is increased by at least [...].

The results given in Table 2 in the Examples provide partial guidance to the skilled person as to which residues may be modified to produce a variant MSP-1 of the invention.

However, the provision herein for the first time of the three dimensional solution structure of MSP-119 provides the skilled person with further detailed guidance as to which residues may be altered. In particular, epitopes are expected to be exposed to the aqueous environment on the exterior of the MSP-119 fragment. Consequently, the precise structural information provided, which teaches the position of surface exposed amino acids, allows the skilled person to target those residues for modification. This data is given in Tables A/B and has also been submitted to the Protein Data Bank (PDB Accession no. 1CEJ). It enables the skilled person to identify the precise location of individual amino acids in the three dimensional structure. Typically, the data is loaded into suitable software well known in the art, such as Insight II, MOLSCRIPT, GRASP and RASMOL.
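As an illustration of how coordinate data of this kind can be used to locate individual residues, the following is a minimal Python sketch, not part of the original disclosure. It assumes the standard fixed-column PDB ATOM record layout, reads only the first model of an ensemble, and lists residues whose alpha-carbon lies within a chosen distance of a residue of interest. The function names, the toy coordinates and the 8 Å cutoff are hypothetical; only the residue identities (Cys12, Glu27, Leu31, Glu43) are taken from the text.

```python
import math


def pdb_atom_line(serial, resname, resseq, x, y, z, name=" CA ", chain="A"):
    """Format one fixed-column PDB ATOM record (coordinates in columns 31-54)."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:3s} {chain:1s}{resseq:4d}"
            f"    {x:8.3f}{y:8.3f}{z:8.3f}")


def parse_ca_coordinates(pdb_text):
    """Return {residue number: (residue name, (x, y, z))} for alpha-carbons."""
    coords = {}
    for line in pdb_text.splitlines():
        if line.startswith("ENDMDL"):
            break  # NMR entries hold many models; keep only the first one
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            resseq = int(line[22:26])
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            coords[resseq] = (line[17:20], xyz)
    return coords


def residues_within(coords, centre, cutoff):
    """Residue numbers whose alpha-carbon is within `cutoff` of `centre`."""
    origin = coords[centre][1]
    return sorted(r for r, (_, xyz) in coords.items()
                  if r != centre and math.dist(origin, xyz) <= cutoff)


# Toy records with invented coordinates (NOT taken from Tables A/B or the PDB).
toy_pdb = "\n".join([
    pdb_atom_line(1, "CYS", 12, 20.0, 0.0, 0.0),
    pdb_atom_line(2, "GLU", 27, 0.0, 0.0, 0.0),
    pdb_atom_line(3, "LEU", 31, 3.0, 4.0, 0.0),
    pdb_atom_line(4, "GLU", 43, 0.0, 0.0, 6.0),
])
coords = parse_ca_coordinates(toy_pdb)
print(residues_within(coords, 27, 8.0))  # [31, 43]
```

In practice the same parsing would be applied to the deposited coordinate file, and a measure of surface exposure would be needed in addition to distance; that step is outside this sketch.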

Further, knowing the location of a modification in the 3-dimensional structure which affects the binding of a blocking antibody without affecting the binding of the neutralising antibodies, it is possible to identify other residues that are on the surface and in the vicinity of the original modification and which may be easily modified to further improve the properties of a modified protein. These residues may be in either the first or the second EGF-like motifs or in the sequence between them. Since it is known that an antibody binding site can encompass a volume that corresponds approximately to the range of 5 to 8 amino acids, it is clear that modifications of these adjacent residues may also affect the affinity of the protein for the blocking antibodies. Once an adjacent amino acid has been identified it can be modified according to the principles outlined above and the contribution of the modification to the overall antigenicity and immunogenicity of the protein, either alone or in combination with other modifications, can be assessed. Those changes that contribute to a reduced affinity for the blocking antibodies, without a substantial effect on binding of the neutralising antibodies, can be incorporated into the improved protein. This can be a reiterative process.
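The selection step in this reiterative process can be sketched in Python as follows. This is an illustrative sketch only: it assumes that binding is expressed as a percentage of binding to the wild-type protein, and it applies the basic thresholds stated above (binding of at least one blocking antibody reduced by at least 50%, and binding of at least one neutralising antibody reduced by less than 50%). The variant names, antibody labels and binding values are invented for the example and are not results from the Examples.

```python
def meets_criteria(blocking, neutralising, threshold=50.0):
    """Check one variant against the basic binding criteria.

    `blocking` and `neutralising` map antibody labels to binding expressed
    as a percentage of wild-type binding (wild type = 100).
    """
    # At least one blocking antibody must lose >= threshold % of its binding.
    blocking_reduced = any(100.0 - b >= threshold for b in blocking.values())
    # At least one neutralising antibody must lose < threshold % of its binding.
    neutralising_kept = any(100.0 - n < threshold for n in neutralising.values())
    return blocking_reduced and neutralising_kept


# Hypothetical measurements for three candidate variants.
candidates = {
    "variant_A": {"blocking": {"block_1": 10.0, "block_2": 60.0},
                  "neutralising": {"neut_1": 95.0, "neut_2": 80.0}},
    "variant_B": {"blocking": {"block_1": 90.0, "block_2": 85.0},
                  "neutralising": {"neut_1": 100.0, "neut_2": 97.0}},
    "variant_C": {"blocking": {"block_1": 5.0, "block_2": 8.0},
                  "neutralising": {"neut_1": 20.0, "neut_2": 30.0}},
}

accepted = [name for name, data in candidates.items()
            if meets_criteria(data["blocking"], data["neutralising"])]
print(accepted)  # ['variant_A']
```

Stricter embodiments (for example, requiring that substantially all neutralising antibodies retain binding) would replace `any` with `all` in the second test.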

In addition, the 3D NMR structure will enable the skilled person to carry out preliminary computer modelling studies of MSP-119 variants with specific modifications so that, for example, variants that cannot fold properly may be discarded. This will assist in minimising the number of candidate MSP-119 variants that need be tested.

Thus the present invention also provides a computer readable medium having stored thereon a model of the MSP-119 NMR structure. In a preferred embodiment, said model is built from all or some of the NMR data shown in Tables A and B.

Variants of the present invention may optionally include additional MSP-1 sequences, in particular regions of the MSP-133 region of MSP-142, to confer additional immunogenicity to the variant. Furthermore, additional sequences known to contain and promote T cell responses are advantageously included (T cell epitopes). Other modifications may also be made that increase immunogenicity, such as modifications that alter the pathway of antigen processing and presentation.

Polypeptide variants of the invention are typically made by recombinant means, for example as described below. However they may also be made by synthetic means using techniques well known to skilled persons such as solid phase synthesis. Proteins of the invention may also be produced as fusion proteins, for example to aid in extraction and purification. Examples of fusion protein partners include glutathione-S-transferase (GST), 6xHis, GAL4 (DNA binding and/or transcriptional activation domains) and β-galactosidase. It may also be convenient to include a proteolytic cleavage site between the fusion protein partner and the protein sequence of interest to allow removal of fusion protein sequences. Preferably the fusion protein will not hinder the immunogenicity of the MSP-1 variant.

Polypeptides of the invention may be in a substantially isolated form. It will be understood that the polypeptide may be mixed with carriers or diluents which will not interfere with the intended purpose of the polypeptide and still be regarded as substantially isolated. A polypeptide of the invention may also be in a substantially purified form, in which case it will generally comprise the polypeptide in a preparation in which more than e.g. 95%, 98% or 99% of the polypeptide in the preparation is a polypeptide of the invention.

B. Polynucleotides and vectors

As discussed above, the variants of the present invention may be produced recombinantly using standard techniques. Thus, the present invention also provides a polynucleotide encoding a polypeptide MSP-1 variant of the invention. Polynucleotides of the invention may comprise DNA or RNA. They may also be polynucleotides which include within them synthetic or modified nucleotides. A number of different types of modification to oligonucleotides are known in the art. These include methylphosphonate and phosphorothioate backbones, and addition of acridine or polylysine chains at the 3' and/or 5' ends of the molecule. For the purposes of the present invention, it is to be understood that the polynucleotides described herein may be modified by any method available in the art. Such modifications may be carried out in order to enhance the in vivo activity or life span of polynucleotides of the invention. It will be understood by a skilled person that numerous different polynucleotides can encode the same polypeptide as a result of the degeneracy of the genetic code.

Polynucleotides of the invention can be incorporated into a recombinant replicable vector. The vector may be used to replicate the nucleic acid in a compatible host cell. Thus in a further embodiment, the invention provides a method of making polynucleotides of the invention by introducing a polynucleotide of the invention into a replicable vector, introducing the vector into a compatible host cell, and growing the host cell under conditions which bring about replication of the vector. The vector may be recovered from the host cell. Suitable host cells include bacteria such as E. coli, yeast, mammalian cell lines and other eukaryotic cell lines, for example insect Sf9 cells. The host cell may be a methylotrophic yeast such as Pichia pastoris.

The coding sequence of natural or variant MSP polypeptides (including the polypeptide of the invention) may be modified for optimal expression in a host cell. For example, secondary modification such as N-glycosylation may be prevented by removal of sequences necessary for such modification. The sequence of the polypeptide may alternatively or in addition be modified with respect to codon usage for optimal expression in the host cell. Methods of mutagenising a sequence are known in the art; alternatively, the modified coding sequence may be generated by means of PCR gene assembly using overlapping synthetic oligonucleotides (Stemmer et al., 1995; Withers-Martinez et al., 1999).
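The two kinds of sequence modification mentioned above can be illustrated with a short Python sketch, offered only as an assumption-laden example rather than the method used in the Examples. It locates N-glycosylation sequons using the general rule Asn-X-Ser/Thr (where X is any residue except proline), and back-translates a peptide using one fixed codon per amino acid. The codon choices in `PLACEHOLDER_CODONS` are placeholders picked for the example; they are not the Pichia pastoris preference table of Figure 14, and the function names are hypothetical. The test peptide is the N-terminal stretch YHHHHHHNISQ given in the Examples.

```python
import re

# Asn-X-Ser/Thr with X != Pro; the lookahead allows overlapping sites.
SEQUON = re.compile(r"(?=N[^P][ST])")

# One valid codon per amino acid, chosen arbitrarily for illustration only.
PLACEHOLDER_CODONS = {
    "Y": "TAC", "H": "CAC", "N": "AAC", "I": "ATT", "S": "TCT", "Q": "CAA",
}


def find_sequons(protein):
    """Return 0-based positions of Asn residues in N-X-S/T sequons."""
    return [m.start() for m in SEQUON.finditer(protein)]


def back_translate(protein, codon_table):
    """Encode a peptide using a single preferred codon per residue."""
    return "".join(codon_table[aa] for aa in protein)


n_terminus = "YHHHHHHNISQ"
print(find_sequons(n_terminus))                        # [7]
print(back_translate(n_terminus, PLACEHOLDER_CODONS))  # 33-nucleotide string
```

A real design would draw codons from a host-specific usage table and check the resulting DNA for unwanted restriction sites; those steps are not shown.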

Preferably, a polynucleotide of the invention in a vector is operably linked to a regulatory sequence that is capable of providing for the expression of the coding sequence by the host cell, i.e. the vector is an expression vector. The term "operably linked" refers to a juxtaposition wherein the components described are in a relationship permitting them to function in their intended manner. A regulatory sequence "operably linked" to a coding sequence is ligated in such a way that expression of the coding sequence is achieved under conditions compatible with the control sequences.

Such vectors may be transformed or transfected into a suitable host cell using standard techniques to provide for expression of a polypeptide of the invention. This process may comprise culturing a host cell transformed with an expression vector as described above under conditions to provide for expression by the vector of a coding sequence encoding the polypeptides, and optionally recovering the expressed polypeptides.

The vectors may be, for example, plasmid or virus vectors provided with an origin of replication, optionally a promoter for the expression of the said polynucleotide and optionally a regulator of the promoter. The vectors may contain one or more selectable marker genes, for example an ampicillin resistance gene in the case of a bacterial plasmid or a neomycin resistance gene for a mammalian vector. Vectors may be used in vitro, for example for the production of RNA, or used to transfect or transform a host cell. The vector may also be adapted to be used in vivo, for example in a method of gene therapy.

Promoters/enhancers and other expression regulation signals may be selected to be compatible with the host cell for which the expression vector is designed. For example, prokaryotic promoters may be used, in particular those suitable for use in E. coli strains (such as E. coli HB101 or [...]). When expression of the polypeptides of the invention is carried out in mammalian cells, either in vitro or in vivo, mammalian promoters may be used. Tissue-specific promoters may also be used. Viral promoters may also be used, for example the Moloney murine leukaemia virus long terminal repeat (MMLV LTR), the Rous sarcoma virus (RSV) LTR promoter, the SV40 promoter, the human cytomegalovirus (CMV) IE promoter, herpes simplex virus promoters or adenovirus promoters. All these promoters are readily available in the art.

C. Administration

The variant MSP-1 polypeptides of the present invention and nucleic acid molecules may be used to treat or prevent malaria in animals, specifically humans.

The polypeptides of the invention may be administered by direct injection. Preferably the polypeptides are combined with a pharmaceutically acceptable carrier or diluent to produce a pharmaceutical composition. Suitable carriers and diluents include isotonic saline solutions, for example phosphate-buffered saline. The composition may be formulated for parenteral, intramuscular, intravenous, subcutaneous, intraocular or transdermal administration. Typically, each polypeptide is administered at a dose of from 0.01 to 30 µg/kg body weight, preferably from 0.1 to 10 µg/kg, more preferably from 0.1 to 1 µg/kg body weight. It is also possible to use antibodies prepared using the polypeptides of the invention, as described below, in treating or preventing Plasmodium infection. Neutralising antibodies, or fragments thereof which retain specificity for Plasmodium antigens, can be administered in a similar manner to the polypeptides of the invention.

The polynucleotides of the invention may be administered directly as a naked nucleic acid construct. When the expression cassette is administered as a naked nucleic acid, the amount of nucleic acid administered is typically in the range of from 1 µg to 10 mg, preferably from 100 µg to 1 mg.

Uptake of naked nucleic acid constructs by mammalian cells is enhanced by several known transfection techniques, for example those including the use of transfection agents. Examples of these agents include cationic agents (for example calcium phosphate and DEAE-dextran) and lipofectants (for example lipofectam™ and transfectam™). Typically, nucleic acid constructs are mixed with the transfection agent to produce a composition.

Alternatively, the polynucleotide may be administered as part of a nucleic acid vector, including a plasmid vector or viral vector, such as a vaccinia virus vector. When the polynucleotide of the invention is delivered to cells by a viral vector of the invention, the amount of virus administered is in the range of from 10^3 to 10^10 pfu, preferably from 10^5 to 10^8 pfu, more preferably from 10^6 to 10^7 pfu. When injected, typically 1-10 µl of virus in a pharmaceutically acceptable suitable carrier or diluent is administered.

Preferably the delivery vehicle (naked nucleic acid construct or viral vector comprising the polynucleotide, for example) is combined with a pharmaceutically acceptable carrier or diluent to produce a pharmaceutical composition. Suitable carriers and diluents include isotonic saline solutions, for example phosphate-buffered saline. The composition may be formulated for parenteral, intramuscular, intravenous, subcutaneous, intraocular or transdermal administration.

The routes of administration and dosages described are intended only as a guide since a skilled practitioner will be able to determine readily the optimum route of administration and dosage for any particular patient and condition.

D. Preparation of Vaccines

Vaccines may be prepared from one or more polypeptides of the invention. They may also include one or more immunogenic Plasmodium polypeptides known in the art. Thus a vaccine of the invention may comprise one or more polypeptides of the invention and optionally, one or more polypeptides selected from, for example, the asexual blood stage proteins: apical merozoite antigen-1, erythrocyte binding antigen 175, erythrocyte membrane protein-1; the hepatic stage proteins: liver stage antigens 1 and 3; the sporozoite stage proteins: circumsporozoite protein, thrombospondin related adhesive protein; and the sexual stage proteins Pfs25 and Pfs28 polypeptides and immunogenic fragments thereof. Preferably, the other immunogenic Plasmodium polypeptides known in the art do not contain wild type MSP-119 sequences.

The preparation of vaccines which contain an immunogenic polypeptide(s) as active ingredient(s) is known to one skilled in the art. Typically, such vaccines are prepared as injectables, either as liquid solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid prior to injection may also be prepared. The preparation may also be emulsified, or the protein encapsulated in liposomes. The active immunogenic ingredients are often mixed with excipients which are pharmaceutically acceptable and compatible with the active ingredient. Suitable excipients are, for example, water, saline, dextrose, glycerol, ethanol, or the like and combinations thereof. In addition, if desired, the vaccine may contain minor amounts of auxiliary substances such as wetting or emulsifying agents, pH buffering agents, and/or adjuvants which enhance the effectiveness of the vaccine. Examples of adjuvants which may be effective include but are not limited to: aluminum hydroxide, N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP 11637, referred to as nor-MDP), N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (CGP 19835A, referred to as MTP-PE), and RIBI, which contains three components extracted from bacteria, monophosphoryl lipid A, trehalose dimycolate and cell wall skeleton (MPL+TDM+CWS) in a 2% squalene/Tween 80 emulsion. The effectiveness of an adjuvant may be determined by measuring the amount of antibodies directed against an immunogenic polypeptide containing an MSP-1 antigenic sequence resulting from administration of this polypeptide in vaccines which are also comprised of the various adjuvants.

The vaccines are conventionally administered parenterally, by injection, for example, either subcutaneously or intramuscularly. Additional formulations which are suitable for other modes of administration include suppositories and, in some cases, oral formulations.

For suppositories, traditional binders and carriers may include, for example, polyalkylene glycols or triglycerides; such suppositories may be formed from mixtures containing the active ingredient in the range of 0.5% to 10%, preferably 1% to [...]. Oral formulations include such normally employed excipients as, for example, pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharine, cellulose, magnesium carbonate, and the like. These compositions take the form of solutions, suspensions, tablets, pills, capsules, sustained release formulations or powders and contain 10% to [...] of active ingredient, preferably 25% to 70%. Where the vaccine composition is lyophilised, the lyophilised material may be reconstituted prior to administration, e.g. as a suspension. Reconstitution is preferably effected in buffer.

Capsules, tablets and pills for oral administration to a patient may be provided with an enteric coating comprising, for example, Eudragit [...], Eudragit [...], cellulose acetate, cellulose acetate phthalate or hydroxypropylmethyl cellulose.

The polypeptides of the invention may be formulated into the vaccine as neutral or salt forms. Pharmaceutically acceptable salts include the acid addition salts (formed with free amino groups of the peptide), which are formed with inorganic acids such as, for example, hydrochloric or phosphoric acids, or with such organic acids as acetic, oxalic, tartaric and maleic. Salts formed with the free carboxyl groups may also be derived from inorganic bases such as, for example, sodium, potassium, ammonium, calcium, or ferric hydroxides, and such organic bases as isopropylamine, trimethylamine, 2-ethylamino ethanol, histidine and procaine.

E. Dosage and Administration of Vaccines

The vaccines are administered in a manner compatible with the dosage formulation, and in such amount as will be prophylactically and/or therapeutically effective. The quantity to be administered, which is generally in the range of 5 µg to 250 µg of antigen per dose, depends on the subject to be treated, the capacity of the subject's immune system to synthesise antibodies, and the degree of protection desired. Precise amounts of active ingredient required to be administered may depend on the judgement of the practitioner and may be peculiar to each subject.

The vaccine may be given in a single dose schedule, or preferably in a multiple dose schedule. A multiple dose schedule is one in which a primary course of vaccination may be with 1-10 separate doses, followed by other doses given at subsequent time intervals required to maintain and/or reinforce the immune response, for example, at 1 to 4 months for a second dose, and if needed, a subsequent dose(s) after several months. The dosage regimen will also, at least in part, be determined by the need of the individual and be dependent upon the judgement of the practitioner.

In addition, the vaccine containing the immunogenic MSP-1 antigen(s) may be administered in conjunction with other immunoregulatory agents, for example, immunoglobulins.

F. Preparation of antibodies against the polypeptides of the invention

The variant MSP-1 polypeptides prepared as described above can be used to produce antibodies, both polyclonal and monoclonal. If polyclonal antibodies are desired, a selected mammal (e.g., mouse, rabbit, goat, horse, etc.) is immunised with an immunogenic polypeptide bearing an MSP-1 epitope(s). Serum from the immunised animal is collected and treated according to known procedures. If serum containing polyclonal antibodies to an MSP-1 epitope contains antibodies to other antigens, the polyclonal antibodies can be purified by immunoaffinity chromatography. Techniques for producing and processing polyclonal antisera are known in the art.

Monoclonal antibodies directed against MSP-1 epitopes in the polypeptides of the invention can also be readily produced by one skilled in the art. The general methodology for making monoclonal antibodies by hybridomas is well known. Immortal antibody-producing cell lines can be created by cell fusion, and also by other techniques such as direct transformation of B lymphocytes with oncogenic DNA, or transfection with Epstein-Barr virus. Panels of monoclonal antibodies produced against MSP-1 epitopes can be screened for various properties, for example isotype and epitope affinity.

The polypeptides of the invention can also be used to select for human monoclonal antibodies using the variable regions of immunoglobulin heavy and light chains cloned in the form of a phage display library, preferably from individuals who have been previously exposed to a natural malaria infection.

Antibodies, both monoclonal and polyclonal, which are directed against MSP-1 epitopes are particularly useful in diagnosis, and those which are neutralising are useful in passive immunotherapy. Monoclonal antibodies, in particular, may be used to raise anti-idiotype antibodies. Anti-idiotype antibodies are immunoglobulins which carry an "internal image" of the antigen of the infectious agent against which protection is desired. Techniques for raising anti-idiotype antibodies are known in the art. These anti-idiotype antibodies may also be useful for treatment of Plasmodium infections, as well as for an elucidation of the immunogenic regions of MSP-1 antigens. It is also possible to use fragments of the antibodies described above, for example, F(ab')2, Fab, Facb and scFv fragments.

It should be appreciated that features from various sections, aspects and embodiments of the invention as described above are generally equally applicable to other sections, aspects and embodiments mutatis mutandis.

The invention will now be further described by way of Examples, which are meant to serve to assist one of ordinary skill in the art in carrying out the invention and are not intended in any way to limit the scope of the invention. The Examples refer to the Figures. In the Figures:

Detailed Description of the Figures

Figure 1 MSP-1 sequences aligned according to the EGF-like motif consensus. Top sequence: P. falciparum (SWISS-PROT MSP1_PLAFW). Second sequence: P. vivax Belem strain (PIR A45604). Third sequence: human EGF (PDB 1egf). Fourth sequence: EGF-like domain consensus (Prosite EGF1). Bottom sequence: 14 residue EGF core region used for structure alignment in Figure 6. Black highlighting indicates conserved residues of the EGF-like domain. Dark shading shows hydrophobic residues at the EGF module pair interface in the P. falciparum sequence, and corresponding conserved residues in the P. vivax sequence.

Figure 2 Sample of multidimensional heteronuclear NOESY experiments showing planes containing NOE connections to the MSP-1 C-terminal fragment Lys35 NH proton. Top: 13C (D4) and 1H (D3) plane from the 4D-[13C]-HMQC-NOESY-[15N]-HSQC experiment, taken at the chemical shift values of Lys35 NH in 15N (D2) and 1H (D1). Bottom: strip from the 3D [15N]-NOESY-HSQC at the 1H chemical shift value of NH (vertical axis, D1) taken at the plane of its 15N (D3) value. The horizontal 1H axis is aligned with that of the top spectrum. The weak cross-peaks at 2.72 and 3.01 ppm in the 3D spectrum do not show corresponding cross-peaks in the 4D spectrum because of the lower signal-to-noise ratio in the latter. These peaks have been assigned as the cross-peaks between Lys35 NH and Asn44 Hβ2 (2.72 ppm), and Cys30 Hβ3 and/or Cys41 Hβ2 (3.01 ppm).

Figure 3 Stereo drawing showing the backbone C, N, Cα atoms of the 32 refined structures in the final ensemble. Domain-1 is on the left (red), with domain-2 on the right (green), and both the N- and C-termini are near the bottom.

Figure 4 MOLSCRIPT picture of the most representative model of the ensemble, showing the backbone Cα trace, antiparallel β-sheet elements, and disulphide bridges (Sγ atoms in yellow). Domain-1, red; Domain-2, green.

Figure 5 Alignment of typical EGF-like family members with the fitpdb program, using the 14 amino acid "reduced core" consensus (Bersch et al., 1998) (see Figure 1). The aligned backbone segment in each structure is white. The structures are aligned relative to the most representative structure of the group (factor Xa), with increasing divergence from left to right. Numbers indicate the rmsd value of the aligned C, N, Cα atoms. PDB identification codes: factor Xa (crystal structure), 1hcg; Complement C1r component, 1apq (14th model); human EGF, 1egf (11th model); fibrillin-1, domains-32 and -33, 1emn (minimized average structure); transforming growth factor-α, 2tgf (minimized average structure); MSP-1 domains-1 and -2, this study.

Figure 6 Backbone ribbon view of fibrillin-1 versus MSP-1 EGF module pair arrangements. Fibrillin-1 (1emn): cyan (domain-32) and magenta (domain-33) (Downing et al., 1996); MSP-1: domain-1 (yellow) and domain-2 (green). Structures were aligned as in Figure 5 by the core consensus of the N-terminal domain of each pair. The bound Ca2+ ions in the fibrillin-1 structure are shown as magenta spheres.

Figure 7 Two views, a and b, (rotated 180° about the y-axis) of the electrostatic potential surface of the MSP-1 EGF module pair, calculated with GRASP. Red indicates negative charge, blue indicates positive charge, and white is neutral. The orientation of the views is shown by the adjacent worm diagrams.

Figure 8 CPK model of the MSP-1 C-terminal fragment, showing the location of some mutations that affect binding of monoclonal antibodies. Domain-1 is towards the top and right sides, and domain-2 towards the bottom left.

Figure 9 Examples of the binding of monoclonal antibodies to GST-MSP-119 detected by Western blotting. The binding of each monoclonal antibody to protein based on the wild type sequence and to proteins containing modified sequences is shown. The monoclonal antibodies are shown across the top. On the left are shown the proteins: WT, wild type sequence; 22, Leu22 to Arg; 26, Glu26 to Ile; 15, Asn15 to Arg; 27, Glu27 to Tyr; 31, Leu31 to Arg; 43, Glu43 to Leu; 27+31+43, Glu27 to Tyr and Leu31 to Arg and Glu43 to Leu; 15+27+31+43, Asn15 to Arg and Glu27 to Tyr and Leu31 to Arg and Glu43 to Leu.

Figure 10 The binding of monoclonal antibodies to GST-MSP-119 detected by BIAcore analysis. The binding of each monoclonal antibody is normalised to 100% binding to protein based on the wild type sequence and the binding of proteins containing modified sequences is expressed as a percentage of this. WT, wild type sequence; 15, Asn15->Arg; 26, Glu26->Ile; 27, Glu27->Tyr; 31, Leu31->Arg; 34, Tyr34->Ser; 43, Glu43->Leu.

Figure 11 The binding of monoclonal antibodies to GST-MSP-119 containing multiple modifications detected by BIAcore analysis. The binding of each monoclonal antibody is normalised to 100% binding to protein based on the wild type sequence and the binding of proteins containing modified sequences is expressed as a percentage of this. WT, wild type sequence. The combinations contain 3 mutations [27+31+43], or 4 mutations ([27+31+34+43] and [15+27+31+43]); at each site the changes are those identified in Figure 10.

Figure 12 Identification of blocking antibodies using a competitive binding assay and immobilised wild type GST-MSP-119. The ability of antibodies to compete with the binding of mAbs 12.8 and 12.10 to GST-MSP-119 was measured using BIAcore analysis. Individual antibodies (x-axis) were bound to the antigen and then the amount of either 12.8 or 12.10 (inhibitory mAb) that could subsequently bind was quantified. The amount of binding is presented as a percentage of the total amount of either 12.8 or 12.10 bound in the absence of pre-incubation with another antibody.

Figure 13 Antibodies induced by immunisation with a modified recombinant MSP-119 assayed for their ability to inhibit secondary processing. Washed 3D7 merozoites were either analysed directly without incubation (0 h) or incubated for 1 hour at 37 °C in the presence of no serum (no serum), 1 mM PMSF as a control for complete inhibition, normal rabbit sera (normal serum), or serum from a rabbit immunised with the 15+27+31+43 modified protein (immune serum), all at 1:10 dilution in reaction buffer. The level of MSP-133 released into the supernatant as a result of secondary processing was measured using an ELISA method and is represented by absorbance at 492 nm.

Figure 14. Pichia pastoris codon preference table used for input to the CODOP program.

Figure 15. DNA and protein sequences for the optimized synthetic MSP-142 gene. A: Complete sequence designed for optimum codon usage and expression in P. pastoris. B: Sequence of the synthetic MSP-119 construct in the expression vector pPIC9K-HXa. Uppercase letters: vector sequences, including the His6 tag and factor Xa cleavage site (IEGR). Lowercase letters: synthetic MSP-119 coding sequence. The cloned sequence is located at the SnaBI restriction site of the pPIC9K sequence. C: Expressed protein sequence of the synthetic MSP-119 construct. The sequence shown is produced as a fusion to the pPIC9K α-factor secretion signal, following the kex2/STE13 processing sites. The synthetic MSP-119 is in bold-face type. D: Sequence of the MSP-133 construct. The cloned sequence is located at the SmaI site of the pUC118 vector. E: Predicted protein sequence of the synthetic MSP-133 construct translation product.

Figure 16. Gene assembly PCR reactions for the MSP-133 and MSP-119 sequences. Reaction 1: 10 µL aliquots of the assembly reactions. Reaction 2: 20 µL aliquots of the amplification reactions. The N-terminal and middle fragments were subsequently spliced together to form the MSP-133 synthetic construct. The C-terminal fragment synthesis reactions produced the optimized MSP-119 construct.

Figure 17. Expression of the synthetic MSP-119 protein in P. pastoris. Lanes 1-6: trichloroacetic acid precipitates of secreted recombinant protein from culture supernatants, without further purification (5 µL each). Samples from duplicate cultures of three independent transformants. Lanes 8, 9: purified, deglycosylated MSP-119 produced from the original P. falciparum sequence. Lanes 7, 10: NOVEX molecular weight markers.

Figure 18. A: {¹H/¹⁵N}-HSQC spectrum of the protein (2.5 mM) expressed from the optimized synthetic MSP-1₁₉ gene. B: Control {¹H/¹⁵N}-HSQC of deglycosylated protein (2.2 mM) expressed from the original P. falciparum sequence (Morgan et al., 1999).

Examples

Materials and Methods

Protein expression and stable-isotope labelling for NMR

The coding sequence of the MSP-1 C-terminal fragment was cloned by polymerase chain reaction with Vent polymerase (New England Biolabs) from a plasmid containing the Plasmodium falciparum strain T9/94 fragment (Blackman et al., 1991), using primers that included codons for a 6 residue N-terminal His tag (CACCATCATCATCATCAC), and inserted into the SnaBI restriction site of the pPIC9K vector (Invitrogen). The sequence corresponds to residues 1526-1621 of the SWISS-PROT entry MSP1_PLAFW (accession number P04933). This produced an α-factor fusion protein with the sequence KR/EA/EA/YHHHHHHNISQ....SSSN, where the slashes indicate kex2 and STE13 processing sites. High copy number transformants of the methylotrophic yeast Komagataella (Pichia) pastoris protease-deficient strain SMD1168 (his4 pep4) were isolated by screening for high G418 resistance (Clare et al., 1995).
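As a quick check on the tag codons quoted above, the following minimal Python sketch translates the 18-nucleotide primer segment into its amino acid sequence. The lookup table is deliberately limited to the two histidine codons that occur in the tag, and the function name `translate` is an illustrative choice, not part of the original method.

```python
# Only the two histidine codons present in the tag are needed here;
# this is not a full genetic-code table.
CODON_TO_AA = {"CAC": "H", "CAT": "H"}

# His-tag codons as quoted in the primer description.
HIS_TAG_DNA = "CACCATCATCATCATCAC"


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon using CODON_TO_AA."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    codons = (dna[i:i + 3] for i in range(0, len(dna), 3))
    return "".join(CODON_TO_AA[codon] for codon in codons)


his_tag_protein = translate(HIS_TAG_DNA)
print(his_tag_protein)
```

The result is six histidine residues, which matches the HHHHHH stretch in the fusion protein sequence given in the text.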

A Mut- transformant was grown at 29.4 °C in a shaker-incubator in buffered minimal medium (100 mM potassium phosphate, pH 6.0, yeast nitrogen base (0.34% w/vol) (DIFCO: YNB without amino acids and without (NH₄)₂SO₄), biotin (4x10 w/vol), Sigma antifoam 289 (0.01% vol/vol), and carbon and nitrogen sources as described below).

Unlabelled samples were initially grown in medium containing 1% w/vol (NH₄)₂SO₄ and 1% w/vol glycerol, and induced by transfer to medium containing 0.5% CH₃OH as the carbon source. Labelled samples were initially grown in medium containing 0.2% w/vol [¹⁵N]-(NH₄)₂SO₄ (Isotech), and 0.5% w/vol glucose or [¹³C₆]-glucose (Isotech), and induced by transfer to medium containing as carbon source 0.5% w/vol CH₃OH or ¹³CH₃OH (Isotech). The initial cultures were grown in 150 ml to a density of ~10 OD₆₀₀, then harvested and resuspended in methanol medium at 1 OD₆₀₀ in a volume of 1.5 L.

Methanol-induced cultures were grown for 4 d, with daily addition of 7.5 ml CH₃OH or ¹³CH₃OH, to a final density of ~18 OD₆₀₀. This protocol produced a maximum yield of 24 mg/L of purified, uniformly ¹³C/¹⁵N-labelled protein at the final stage (see below).

The YNB-based medium produced about 3-fold higher yields than the FM22 medium (Laroche et al., 1994), for stable-isotope labeling of the MSP-1 C-terminal fragment.

Cells were removed by low-speed centrifugation, protease inhibitors were added (COMPLETE™ tablets, Boehringer-Mannheim: 1 tablet/500 ml supernatant), and the supernatant was filter-sterilized. The supernatant was concentrated ~20-fold by ultrafiltration in a stirred cell (Amicon, YM3 membrane) at 4 °C. The pH was adjusted to 7.25 with KOH, and the partially N-glycosylated MSP-1 fragment was deglycosylated for 72 h at 37 °C with 5000 U PNGaseF (New England Biolabs). The carbohydrate was completely removed (as shown by electrophoresis and mass spectrometry), with the Asn1 residue presumably converted to Asp in the process. The supernatant was clarified by low-speed centrifugation, 5 M NaCl was added to a final concentration of 0.3 M, and the sample was applied to a 2 ml Ni-NTA affinity column (QIAGEN), washed, and eluted with 250 mM imidazole according to the manufacturer's instructions. The eluate was dialyzed against 50 mM sodium phosphate (pH …), 50 mM NaCl, and then passed through a 1 ml Hi-Trap Q anion exchange resin (Pharmacia) to remove misfolded MSP-1 that bound to the column. The MSP-1 fragment was characterized by Western blotting and electrospray mass spectrometry (data not shown). Two principal species of mass 11607 and 11807 Da were observed, corresponding to the expected fragment and to a fragment with an additional N-terminal Glu-Ala dipeptide resulting from incomplete STE13 processing of the α-factor secretion signal.

Samples for NMR experiments were prepared in either 90% H₂O/10% D₂O with 0.01% w/vol NaN₃, or 100% D₂O, 50 mM sodium phosphate, 100 mM NaCl at pH 6.5 (pH uncorrected for deuterium isotope effects), at a concentration of 2.1 to 2.6 mM in 0.6 ml.

Protein concentration was measured by UV absorbance at 280 nm, using a calculated molar extinction coefficient of 5220 liter mol⁻¹ cm⁻¹. The protein was demonstrated to be monomeric by equilibrium ultracentrifugation of a 0.12 mM sample in the above buffer at 293 K.
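The concentration measurement above follows the Beer-Lambert relation c = A / (ε·l). The Python sketch below illustrates the arithmetic, taking the quoted coefficient of 5220 to be in the usual units of liter mol⁻¹ cm⁻¹. The absorbance reading, the 25-fold dilution and the 1 cm path length are hypothetical values chosen for illustration; they are not reported in the text.

```python
# Calculated molar extinction coefficient quoted in the text,
# assumed here to be in L mol^-1 cm^-1.
EPSILON_280 = 5220.0


def concentration_mM(a280: float, dilution_factor: float = 1.0,
                     path_cm: float = 1.0) -> float:
    """Beer-Lambert: c = A / (epsilon * l), corrected for sample dilution.

    Returns the concentration of the undiluted sample in mM.
    """
    if a280 < 0 or dilution_factor <= 0 or path_cm <= 0:
        raise ValueError("absorbance must be >= 0; dilution and path must be > 0")
    molar = a280 / (EPSILON_280 * path_cm)   # mol/L in the cuvette
    return molar * dilution_factor * 1000.0  # mM in the original sample


# Hypothetical reading: A280 = 0.522 after a 25-fold dilution, 1 cm cuvette.
example_mM = concentration_mM(0.522, dilution_factor=25.0)
print(f"{example_mM:.2f} mM")
```

With these illustrative inputs the result is about 2.5 mM, which lies within the 2.1 to 2.6 mM range stated for the NMR samples. A dilution step is included because an undiluted millimolar sample would give an absorbance far above the linear range of a typical spectrophotometer.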

NMR experiments and data processing

Most of the experiments were performed at 298 K, using Varian Unity and Unity-Plus spectrometers operating at 600 MHz and 500 MHz respectively. Details of the multidimensional experiments (Clore & Gronenborn, 1998) and acquisition parameters used for resonance assignments and structure determination are given in Tables A/B and have been submitted to the Protein Data Bank database (PDB Accession No 1CEJ).

All spectra were processed using Felix 95.0 or 97.0 (Biosym/MSI) using a 90 degree- or 72 degree-shifted sinebell-squared window function. Dimensions, zero-filling, and linear prediction details are summarized in Tables A/B and in the submission to the BioMagResBank. Four dimensional and interleaved spectra were processed in Felix using macros written in-house.

Signal assignments: Sequential assignments were made based on connectivities established primarily by CBCA(CO)NH and CBCANH experiments on uniformly ¹³C/¹⁵N labelled protein. Side chain spin system assignments were made on the basis of data from the ¹³C/¹H HCCH-TOCSY experiment correlated with information from ¹⁵N/¹H TOCSY-HSQC and ¹⁵N/¹H NOESY-HSQC, and HNHA and HNHB experiments. Assignments were obtained for ¹⁵N and aliphatic ¹³C signals for 98% of side-chains and 96% of backbone amide groups. The list of assignments is given in Tables A/B and in the submission to the Protein Data Bank database (PDB Accession No 1CEJ). The heteronuclear NOE experiment was carried out as described previously (Kay et al., 1989; Polshakov et al., 1997).

Distance Restraints: NOE- and ROE-derived distance restraints between backbone and side chain amide protons were obtained primarily from the 3D ¹⁵N-NOESY-HSQC, ¹⁵N-ROESY-HSQC, and 4D ¹³C-HMQC-NOESY-¹⁵N-HSQC experiments. Aliphatic to aliphatic proton distance restraints were obtained from a 4D ¹³C-HMQC-NOESY-¹³C-HSQC experiment. A 3D ¹³C-HMQC-NOESY experiment in D₂O was used to identify aliphatic to aromatic proton NOEs and 2D NOESY experiments were used to measure aromatic to aromatic proton NOEs. Crosspeaks were quantified by volume integration in Felix for 2D and 3D experiments and for the 4D ¹³C-HMQC-NOESY-¹⁵N-HSQC experiment, and from peak height measurements in the 4D ¹³C-HMQC-NOESY-¹³C-HMQC spectra. Crosspeaks were classified as strong, medium and weak and these were assigned to distance restraints of 0-2.8, 0-3.6, and 0-5.5 Å. Restraints from backbone amide signals were initially treated in this manner, and then recalibrated more precisely using 3D ¹⁵N-ROESY-HSQC data into four classes involving maximum distances of 2.6, 3.1, 3.6, and 4.1 Å. Restraints to groups of equivalent or non-stereoassigned protons were treated by r⁻⁶ summation. Most intraresidue distances (HN-Hβ and Hα-Hβ) were converted to χ1 angle restraints as described below and these distance restraints were not included in the final list.
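The mapping from crosspeak class to distance bounds, and the r⁻⁶ summation used for groups of equivalent or non-stereoassigned protons, can be sketched in Python as follows. The upper bounds are the ones quoted above for the strong, medium and weak classes. The function names are illustrative assumptions, and the sketch does not reproduce how X-PLOR implements these operations internally.

```python
# Upper distance bounds (in Angstrom) for the three crosspeak classes
# given in the text; the lower bound is 0 in every class.
UPPER_BOUND = {"strong": 2.8, "medium": 3.6, "weak": 5.5}


def distance_restraint(intensity_class: str) -> tuple[float, float]:
    """Return the (lower, upper) distance bounds for a crosspeak class."""
    return (0.0, UPPER_BOUND[intensity_class])


def r6_sum_distance(distances: list[float]) -> float:
    """Effective distance to a group of protons by r^-6 summation.

    r_eff = (sum_i r_i^-6)^(-1/6); the result is never larger than the
    shortest individual distance, because every proton adds intensity.
    """
    if not distances:
        raise ValueError("at least one distance is required")
    return sum(r ** -6 for r in distances) ** (-1.0 / 6.0)


# Hypothetical example: an amide proton 3.0 and 4.0 Angstrom away from
# the two protons of a non-stereoassigned methylene group.
effective = r6_sum_distance([3.0, 4.0])
print(distance_restraint("medium"), round(effective, 2))
```

Because the sum is dominated by the nearest proton, the effective distance stays close to, but slightly below, the shortest individual distance. This is why a single restraint to the group can be applied without a stereospecific assignment.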

Dihedral Angle Restraints: χ1 angles and stereospecific assignments of β-methylene protons were obtained using the grid-search program AngleSearch, with coupling constant and intraresidue ROE distance information (Polshakov et al., 1995). The coupling constant information was provided by HNHB and HN(CO)HB spectral intensities for ³J(HN-Hβ) and ³J(CO-Hβ), and intraresidue distances (HN-Hβ, Hα-Hβ) were obtained from 3D ¹⁵N-ROESY-HSQC and 2D ROESY (D₂O) experiments. ³J(HN-Hα) coupling constants were obtained from the HNHA experiment. Residues with positive φ angles (ca. 60 degrees) were identified by large intraresidue Hα crosspeak intensities in the HN(CO)HB experiment, and ψ angles near -60 degrees from strong Hα(i-1) crosspeaks in the HNHB experiment. Ile and Leu χ2 angles and Leu δ stereoassignments were derived from the LRCH experiment. Minimum ranges of 40 degrees (…, χ2) and 50 degrees (ψ) were used to account for errors and local dynamic effects on the coupling constants.

Disulphide Bonding Pattern. An initial set of 20 structures was calculated by simulated annealing using approximately 550 unambiguous NOE-derived distance restraints and 36 χ1 and φ dihedral angle restraints but with no hydrogen bonding or disulphide bond constraints. The Cys-Cys Sγ distances in these structures were examined in order to establish the probable bonding pattern. Prior to the calculations, the formation of disulphide bridges for 4 Cys residues (Cys12-Cys28, Cys78-Cys92) was already established with high probability by the observation of Hβ-Hα NOEs between these pairs of Cys residues. Examination of the initial structures confirmed these disulphide bridges and also indicated a disulphide bridge between residues Cys30 and Cys41. The third disulphide bridge in domain-1 (Cys7-Cys18) could thus be assigned by default, although the structure of the N-terminus was not well-defined by the NMR data. The best six structures in terms of total X-PLOR energy and violations indicated that the average Cys-Cys Sγ distance was lowest for the disulphide bonding pattern [1-3, 2-4, 5-6] in each domain, and only this combination allowed all Cys residues to form contacts with a partner <3.5 Å away. Thus, this disulphide bonding pattern was most consistent with the experimental data for both domains, and was imposed (initially as NOE-style distance restraints) in subsequent calculations. The [1-3, 2-4, 5-6] pattern is that expected for an EGF-like domain.
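The comparison of candidate bonding patterns described above can be illustrated with a short Python sketch that enumerates every way of pairing the six cysteines of one domain and ranks the pairings by mean Sγ-Sγ distance. The residue numbers are those of domain-1 as given in the text. All distances are hypothetical placeholders rather than measured values, and the function names are illustrative; the original analysis was done by inspecting the X-PLOR structures.

```python
from itertools import combinations

# Domain-1 cysteine residue numbers as given in the text.
CYS_DOMAIN1 = [7, 12, 18, 28, 30, 41]


def perfect_matchings(items):
    """Yield every way to split an even-length list into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for matching in perfect_matchings(remaining):
            yield [(first, partner)] + matching


def best_pairing(cysteines, distance):
    """Return the pairing with the lowest mean S-gamma to S-gamma distance."""
    def mean_distance(pairing):
        return sum(distance[frozenset(p)] for p in pairing) / len(pairing)
    return min(perfect_matchings(cysteines), key=mean_distance)


# HYPOTHETICAL averaged distances (Angstrom): every pair is set to 8.0,
# then three pairs are made short purely for illustration.
distance = {frozenset(p): 8.0 for p in combinations(CYS_DOMAIN1, 2)}
distance[frozenset((7, 18))] = 3.1
distance[frozenset((12, 28))] = 2.9
distance[frozenset((30, 41))] = 3.3

print(best_pairing(CYS_DOMAIN1, distance))
```

Six cysteines can be paired in 15 distinct ways, so exhaustive enumeration is trivial. With the placeholder distances above, the lowest-scoring pairing is Cys7-Cys18, Cys12-Cys28, Cys30-Cys41, the same pattern reported in the text for domain-1.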

Hydrogen Bonds: Non-exchanging amide groups involved in stable hydrogen bonds were identified in spectra of samples examined in 100% D₂O. The corresponding hydrogen bond acceptors were determined by examining the initial structural ensemble, using the Insight II and HBPlus (McDonald et al., 1994) programs, and hydrogen bond distance restraints were included in subsequent calculations. Further hydrogen bonds were identified in a similar manner in iterative calculations. Only 10 backbone hydrogen bonds in the antiparallel β-sheets were used as restraints. Two distance restraints were used for each hydrogen bond, 1.7-2.3 Å from proton to acceptor, and 3.0-3.6 Å from donor nitrogen atom to acceptor.

Structure Calculations

All the structure calculations were performed following standard protocols for ab initio simulated annealing from an extended chain using X-PLOR version 3.843 on a Silicon Graphics Origin 200 computer. The initial calculations used an initial temperature of 1000 K, and 9000 steps of 5 fs in the restrained molecular dynamics stage. A soft-square potential was used for distance restraints. The SHAKE (Ryckaert et al., 1977) algorithm was employed during molecular dynamics to maintain correct bond lengths. Refinement used a square well potential for restraints, and a final slow cooling of 30000 steps of 4 fs each from 2000 K. A modified "parallhdg.pro" force-field parameter set was used, with modifications to parameters for Arg and Pro residues, and for hydrogen bonds (Polshakov et al., 1997). Force constants were 50 kcal mol⁻¹ Å⁻² for all distance restraints including hydrogen bonds, and 200 kcal mol⁻¹ rad⁻² for dihedral restraints. The N-terminal sequence including the vector-encoded residues and (His)₆ tag was excluded from the structure calculations. All peptide bonds were constrained to be trans. NOE data for all 5 Pro residues showed strong Hα(i-1)-ProHδ crosspeaks, consistent with the trans peptide conformation.

Initial structures were calculated as described above to determine the disulphide bonding pattern. Then the calculation was repeated with identical NOE-derived distance and dihedral angle restraints, with the addition of 6 distance restraints (1.92-3.12 Å) representing the disulphide bridges. A new set of 50 structures was obtained, from which the best 20 structures were selected. The criteria used for selection were that the structures were below the median value of both total X-PLOR energy and rms NOE difference, and had no dihedral angle violations. The resulting structures had good geometry and between zero and two NOE violations > 0.5 Å. These structures were used to assign previously ambiguous NOEs and to determine the hydrogen bonds as described above.
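The selection rule stated above (below the median for both total energy and rms NOE difference, and no dihedral angle violations) can be expressed as a simple filter. The Python sketch below is an illustration only: the `Model` record, its field names and the four example models are hypothetical, and in practice the values would be read from the X-PLOR output for each calculated structure.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Model:
    """Summary statistics for one calculated structure (hypothetical record)."""
    name: str
    total_energy: float       # total X-PLOR energy, kcal/mol
    rms_noe: float            # rms NOE difference, Angstrom
    dihedral_violations: int  # number of dihedral angle violations


def select_models(models: list[Model]) -> list[Model]:
    """Keep models below the median in both energy and rms NOE difference
    that also have no dihedral angle violations."""
    energy_median = median([m.total_energy for m in models])
    noe_median = median([m.rms_noe for m in models])
    return [
        m for m in models
        if m.total_energy < energy_median
        and m.rms_noe < noe_median
        and m.dihedral_violations == 0
    ]


# HYPOTHETICAL ensemble of four models, for illustration only.
ensemble = [
    Model("a", 100.0, 0.02, 0),
    Model("b", 120.0, 0.03, 1),
    Model("c", 140.0, 0.02, 0),
    Model("d", 160.0, 0.05, 0),
]
accepted = select_models(ensemble)
print([m.name for m in accepted])
```

In this toy ensemble only model "a" passes all three criteria: "b" has a dihedral violation, while "c" and "d" lie above the median energy. Requiring both quantities to be below their medians means that at most half of the calculated structures can be retained, which is in line with the 20 structures selected from 50 in the text.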

The final structure calculation and refinement used an expanded restraint list including hydrogen bonds, additional dihedral restraints, stereoassignments of β-methylene and Leu δ signals, and more precisely calibrated ROE data (see Table …). A set of 100 structures was obtained using this list, and 38 structures with 0-2 NOE violations > 0.5 Å and no dihedral angle violations > 5° were accepted. These 38 structures were refined by the slow-cooling procedure described above, producing a final ensemble of 32 accepted structures with no NOE violations > 0.5 Å and no dihedral angle violations > 5°. These selection criteria produced an ensemble of structures that extend to the end of the continuum of total potential energies in order to include structures having large scale correlated motions (Abseher et al., 1998). Statistics for the final ensemble are given in Table 1. Coordinates for the 32 refined structures have been deposited in the Brookhaven Protein Data Bank (coordinates ID code 1cej; NMR restraints ID code r1cejmr).

Structures were analyzed during the calculation process using X-PLOR 3.8 (Nilges et al., 1991), PROCHECK-NMR/AQUA (Laskowski et al., 1996), and Insight II for quality of agreement with experimental data, precision, geometry, and energy. Models were aligned with Insight II and fitpdb, and displayed with Insight II, MOLSCRIPT (Kraulis, 1991), and GRASP (Nicholls et al., 1991).

Table 1.

A: RESTRAINTS SUMMARY

Number of conformers calculated: 100
Number of conformers accepted: 32
Acceptance criteria:
  No distance violation > 0.5 Å
  No dihedral angle violation > 5°
NOE/ROE distance restraints:
  Intraresidue: 73
  Sequential: 222
  Medium range (2-4): 90
  Long range (>4): 185
  Total: 570
Dihedral angle restraints:
  phi: 25
  psi: 33
  chi-1: 22
  chi-2: 5
  Total: 85
Hydrogen bonds: 10
Disulphide bonds: 6

B: STRUCTURE QUALITY (average ± s.d.)

Total X-PLOR energy (kcal mol⁻¹): 168
NOE X-PLOR energy (kcal mol⁻¹): 21 ± 8
rmsd NOE: 0.026 ± 0.005
rmsd dihedral angle: 0.236 ± 0.095
rmsd bond length: 0.0029 ± 0.0002
rmsd bond angle: 0.357 ± 0.023
rmsd improper: 0.266 ± 0.018
Backbone rmsd of structured region (69 residues):
  Overall: 1.05 ± 0.28
  Domain-1: 0.81 ± 0.32
  Domain-2: 0.83 ± 0.35
Ramachandran plot quality (phi/psi angles):
  Most favoured: 49.5%
  Additional allowed: 42.1%
  Generously allowed: 5.6%
  Disallowed: 2.7%

Monoclonal antibodies (mAbs)

Anti-MSP-1₁₉ monoclonal antibodies used in this study were mouse IgG mAbs 1E1, 1E8, 2F10, 111.2, 111.4, 2.2, 5.2, 7.5, 9C8, 12.8, 12.10, 12D11, 117.2, 8A12 (Holder et al., 1985; McBride & Heidrich, 1987; Blackman et al., 1987; Guevara Patiño et al., 1997); and mouse IgM mAb 5B1 (Pirson & Perkins, 1985). Of these, mAbs 12.8, 12.10 and 5B1 are neutralising, inhibitory antibodies and 1E1, 2.2, 7.5, 9C8 and 111.4 are blocking antibodies. Some antibodies such as 111.2 are neither inhibitory nor blocking.

Construction of modified MSP-1₁₉ clones

The DNA coding the wild type MSP-1₁₉ domain of Plasmodium falciparum (T9-94/Wellcome strain) MSP-1 has been cloned in expression vector pGEX-3X to produce MSP-1₁₉ fused to the carboxy-terminus of the Schistosoma japonicum glutathione S-transferase (GST) in Escherichia coli (Burghaus & Holder, 1994). Site-directed mutagenesis of the MSP-1₁₉ DNA sequence was done in either of two ways.

The first method was a modification of the method of Perrin & Gilliland (1990) to carry out polymerase chain reaction (PCR)-mediated site specific mutagenesis. DNA was amplified using the plasmid as a template together with one oligonucleotide to introduce the point mutation and a 5' primer from outside of the MSP-1₁₉ sequence. The amplified product was purified after electrophoresis on an agarose gel and used in a second amplification step together with a 3' primer from outside of the other end of the MSP-1₁₉ sequence and the plasmid as template. This second PCR product was digested with the restriction enzymes EcoRI and BamHI and the product consisting of the modified MSP-1₁₉ coding sequence was inserted back into pGEX-3X and the products were used to transform DH5α cells.

The second method used the QuikChange™ Site-directed mutagenesis kit from Stratagene. Briefly, using the plasmid pGEX-MSP-1₁₉ as a template, two complementary synthetic oligonucleotide primers containing the desired point mutation were designed and were extended on the template by temperature cycling with the enzyme Pfu DNA Polymerase. This incorporation of the oligonucleotide primers results in the generation of a mutated plasmid containing staggered nicks in the DNA sequence. Following the temperature cycling, the product was treated with DpnI endonuclease, which digests the methylated parental DNA template and leaves the mutation-containing newly synthesised DNA intact. The DNA incorporating the desired mutation was then transformed into E. coli strain DH5α (Life Technologies) competent cells, where the nicks are repaired.

Clones were screened by analysis of restriction enzyme digests and by PCR screening of the insert gene. The DNA sequence of the selected mutant clones was confirmed using a PerkinElmer Applied Biosystems ABI 377 automatic sequencer according to the manufacturer's instructions.

Expression of the GST-MSP-1 fusion proteins

Expression of GST-MSP-1₁₉ was induced with 1 mM isopropyl-β-D-thiogalactopyranoside (IPTG; Melford Laboratories) for 1 hour in the E. coli strain TOPP 1 (Stratagene). The cells were then harvested by centrifugation and the cell pellet was resuspended in cell lysis buffer (50 mM Tris-HCl/1 mM EDTA pH 8.0 containing 0.2% Nonidet P40 (NP40; BDH)). Phenylmethylsulphonyl fluoride (PMSF; Sigma) in isopropanol was added to a final concentration of 1 mM. The cell suspension was sonicated, on ice, using a VibraCell sonicator (Sonics & Materials) at 50% duty cycle for 3 min (six 30 sec pulses with 30 sec in between). The cell lysate was centrifuged at 65000 x g for 1 hour at 4 °C. Supernatant containing soluble GST-fusion protein was applied to a glutathione-agarose column (Sigma) and the GST-fusion protein was eluted with 5 mM reduced glutathione. The eluted GST-fusion protein was dialysed extensively against phosphate buffered saline (PBS) at 4 °C.

SDS-PAGE and Western Blotting

Proteins were analysed by polyacrylamide gel electrophoresis in the presence of sodium dodecyl sulphate (SDS-PAGE). Samples were solubilised in SDS-PAGE buffer without reducing agents, then fractionated on a homogeneous 12.5% polyacrylamide gel. The prestained low range molecular mass markers (24-102 kDa) from Bio-Rad were used as markers. When required, SDS-PAGE-fractionated polypeptides were either stained with Coomassie Brilliant Blue R-250 (CBB; Sigma) or electrophoretically transferred to Optitran BA-S 83 reinforced nitrocellulose (Schleicher & Schuell, 0.2 µm pore size) for analysis by western blotting. Blots were blocked with 5% BSA, 0.5% Tween 20 in PBS (PBS-T) for 1 h at room temperature, then washed in PBS-T. Blots were probed with first antibodies for 2 h at room temperature, washed 3 times in PBS-T, and then incubated in 1/1000 dilution of horse radish peroxidase (HRP)-conjugated sheep anti-mouse IgG (H+L) (ICN Immunobiologicals) or goat anti-mouse IgM (µ chain) (Sigma) for 1 h at room temperature. Blots were then washed 3 times in PBS-T and developed using Super Signal Substrate (Pierce) as HRP substrate for 1 min. Blots were then placed in plastic wrap and exposed to X-ray film (XB-200, X-ograph Imaging Systems). The films were processed with an Agfa Gevamatic60 film processor (Agfa).

Analysis of antibody-antigen interaction using a BIAcore machine

GST-MSP-1₁₉ containing either the wild type or various modified sequences was used to coat a carboxymethyl dextran hydrogel sensor chip by the following methodology. The binding of the GST-MSP-1₁₉ was via amino groups using EDC/NHS chemistry. Immobilisation was done with the amine coupling kit (Pharmacia BIAcore). The CM dextran surface was activated with 50 µl of 200 mM 1-ethyl-3-(3-dimethylaminopropyl) carbodiimide (EDC) and 5 mM N-hydroxysuccinimide (NHS) for 10 min. GST-MSP-1₁₉ was then coupled to the BIAcore sensor surface using 50 µl of a solution at 100 µg ml⁻¹ in coating buffer (0.01 M sodium acetate buffer, pH 3.5) for 10 min. Unreacted carboxyl groups were blocked by adding 50 µl of 1 M ethanolamine, pH 8.5, for 10 min. The cells were washed with two pulses of 20 µl 10 mM glycine-HCl, pH 2.8, for 8 min in total to remove any non-covalently bound protein. The immobilisation procedure was carried out at a flow rate of 5 µl min⁻¹. Measurements were performed on the BIAcore 2000 instrument.

RESULTS

Example 1

Resonance assignments, NMR restraints and structure determination

The assignments and restraints were obtained as described in Materials and Methods using a range of multidimensional heteronuclear experiments with ¹³C/¹⁵N uniformly labelled protein. Sample spectra from 3D and 4D experiments showing NOE connections to the Lys35 backbone amide NH proton, resolved and unambiguously assigned using the ¹³C chemical shift information, are shown in Figure 2. The distance, dihedral angle and hydrogen bond restraints used in the final set of structure calculations are summarized in Table 1. A total of 570 unambiguously assigned distance restraints, 85 dihedral angle restraints, and 10 hydrogen bonds were used in the final set. The assignments and restraint list shown in Table A have been submitted to the BioMagResBank database. Three disulphide bonds, with the [1-3, 2-4, 5-6] pattern for each domain, were experimentally determined from the NMR data in preliminary calculations as described in Materials and Methods, and these were also included in the final refinement. A final set of 32 models was calculated and refined using these restraints and these structures are shown in Figure 3 superimposed on the backbone of the representative structure Srep. Table 1 shows that all 32 models have good geometry and are in good agreement with the experimental data, with no NOE violations > 0.5 Å and no dihedral angle violations > 5°. The atomic rmsd value for the backbone atoms of the well-structured region (residues 15-64, 74-92) is 1.05 Å (see Table 1). The local backbone rmsd is highest at the N-terminus (up to Cys12), in the loop Glu65-Lys73, and following Cys92 at the C-terminus. The Ramachandran plot quality is typical of that found for other EGF structures (Doreleijers et al., 1998).

Description of the structure

EGF-domains

Analysis of the final ensemble by PROCHECK-NMR indicated that each domain contains a major stretch of antiparallel β-sheet containing the third and fourth Cys residues of each domain, as expected for an EGF-like fold, as well as an additional minor antiparallel β-sheet at the C-terminal end of domain-1, similar to some (but not all) EGF family members. These secondary structure features, together with the disulphide bonding patterns, can be seen in Figure 4. There is also a well-defined type II tight turn in domain-1, with a hydrogen bond from the Tyr34 NH proton to the Leu31 carbonyl oxygen. The normally conserved EGF consensus Gly residue in the tight turn is replaced in domain-1 by a residue with a positive φ angle (Asn33), while the conserved aromatic residue is present (Tyr34). There is a probable hydrogen bond from the Leu31 NH proton to the Asn15 carbonyl oxygen. Domain-2 contains two turns preceding the major β-sheet (Asn53-Cys56, Asp57-Ala60), and a final bend from Leu86 to Phe91 with a probable hydrogen bond from the Asp57 NH proton to the carbonyl oxygen of Ile90 or Gly89. A surface-exposed loop from Pro81 to Pro85 replaces the tight turn, while the aromatic residue is not conserved.

The large loop at the end of the major β-sheet (Glu65-Lys73) is relatively disordered, and high mobility for the segment Gly68-Gly71 was confirmed by backbone amide ¹⁵N-{¹H} heteronuclear NOE measurements (Barbato et al., 1992). The heteronuclear NOE values are dramatically reduced for residues in this region. At the N-terminus, the low NOE intensities correspond to increased mobility compared with the rest of the protein. The interdomain linker region from Pro45 to Pro47 is distinct from other EGF-like module pairs. The conformations of the disulphide bridges between Cys30-Cys41 in domain-1, and the three Cys-Cys bonds in domain-2, are all left handed spirals (Richardson, 1981).

Bridges between Cys30-Cys41, Cys56-Cys76, and Cys78-Cys92 are particularly close to their equivalents in the blood coagulation factor Xa structure (1hcg). The conformations of the first two disulphide bridges in the relatively disordered N-terminal segment of domain-1 were not determined.

Figure 5 shows the backbone C, N, Cα atom alignments of the two MSP-1 C-terminal fragment domains made with typical examples of EGF-like domains from several proteins, using the fitpdb program. Pairwise alignments showed that the two domains from MSP-1 are more similar to the factor Xa structure and its close relative from C1r, than to each other or to the other structures tested. The rmsd values for MSP-1 domains compared to factor Xa are comparable to those of the more distantly related structures fibrillin-1 and transforming growth factor-α.

The overall fold of each MSP-1 domain is thus similar to typical EGF family members, with the turns following the fifth Cys residue roughly equivalent, in spite of the divergence from the EGF consensus (C(5)xxGac) where a is a Phe or Tyr residue.

Although some of the external loops are disordered, the scaffold is quite stable, as indicated by the non-exchangeable backbone amides (see above and in Protein Data Bank/BioMagResBank submission for details).

Unlike many EGF-like domains such as fibrillin-1, the MSP-1 C-terminal fragment lacks the conserved EGF Ca²⁺-binding sequence and there was no evidence of Ca²⁺ binding to the MSP-1 C-terminal fragment. The 2D ¹H-NOESY spectra were virtually identical in the absence or presence of 20 mM CaCl₂, indicating that any binding that might occur has, at most, only a small effect on the overall structure.

Domain interface and surface

The most striking feature of the MSP-1 C-terminal fragment structure is the interface between the domains, which consists of several nonpolar amino acids (Phe19, Leu31, Leu32, Leu86, Phe87, Ile90 and Phe91) involved in hydrophobic interactions. These residues join the base of the major β-sheet and the tight turn in domain-1 with the final bend from residue 86 to 91 in domain-2. The domain interactions result in the domains forming a U-shaped structure which contrasts with structures observed for other pairs of EGF domains (Downing et al., 1996; Brandstetter et al., 1995). For example, in fibrillin-1, the interface between EGF domains 32 and 33 is largely formed by a shared Ca²⁺ ligation site (Downing et al., 1996), and the overall structure resembles a rigid rod, with distant N- and C-termini. This contrasts with MSP-1 where the EGF-like domains are folded against each other so that their termini are relatively close together. A comparison of fibrillin-1 and MSP-1 EGF module pairs is shown in Figure 6. Although both termini of the MSP-1 C-terminal fragment are somewhat disordered, NOE contacts were observed between nuclei in the two ends. The proximity of the C- and N-terminal positions may be significant, since it suggests that the proteolytic processing site that produces the C-terminal 96 amino acid fragment may be very close to the GPI membrane attachment site at or near residue 96. This proximity is consistent with the idea that a membrane-bound Plasmodium proteinase is responsible for secondary processing.

The electrostatic potential surface of the MSP-1 C-terminal fragment is shown in two views in Figure 7. The surface in Figure 7a is highly charged, especially in the protruding loop regions 23-27, 35-40 and 64-66. The surface in Figure 7b contains more neutral hydrophilic residues as well as a small hydrophobic patch from Pro85 to Phe87 near the center of the surface. In the future, such information could assist in understanding how these different surfaces may be involved in interactions with the rest of the MSP-1 precursor, the processing proteinase, other proteins on the merozoite surface, or unknown targets on the erythrocyte or parasite vacuolar membrane surfaces.

Primary sequence conservation

The residues involved in the hydrophobic domain interface in P. falciparum are also shown in Figure 1, together with corresponding residues in MSP-1 of the less virulent human malaria parasite, P. vivax (Del Portillo et al., 1991; Gibson et al., 1992).

Extensive conservation of the interface residues (with conservative substitutions) suggests that P. vivax, and perhaps other Plasmodium species as well, may have a similar U-shaped EGF module pair arrangement. Another feature of the P. vivax sequence, also seen in other Plasmodium species, is the single disulphide bond deficiency in the first EGF-like domain resulting from the absence of cysteine residues equivalent to the P. falciparum Cys12 and Cys28.

P. falciparum dimorphic sites

Five dimorphic sites have been observed in the P. falciparum MSP-1₁₉ C-terminal fragment from different isolates (Qari et al., 1998). Several observations can be made about the position of these sites on the MSP-1 structure. Two sites, Gln14/Glu14 and Lys61/Thr61, involve residues in relatively well-structured backbone regions, with surface-exposed hydrophilic or charged side-chains. A pair of adjacent sites, with the sequence variants Asn70-Gly71/Ser70-Arg71, occurs in the disordered loop of domain-2, within a segment (residues 68-71) that has been shown to be highly mobile. The region from Glu65 to Lys73 also appears to be the most variable region among different Plasmodium species (Daly et al., 1992; Holder et al., 1992). Finally, the fifth site has a substitution between hydrophobic residues (Leu86/Phe86). This partially-exposed sidechain is located at the hydrophobic domain interface, and the conservative substitution is consistent with a role in this interaction.

Example 2

Mutation and Monoclonal Antibody Binding studies

As a step towards understanding antibody interactions with the MSP-1 C-terminal fragment, the effect of engineered point mutations (within domain-1) on antibody binding has been studied. Amino acid substitutions were made that consisted of radical changes: for example, replacing an aliphatic residue with a charged polar residue, replacing a positively charged side chain with a negatively charged side chain, replacing an amino acid with a large side chain with an amino acid with a smaller or no side chain (glycine), replacing a polar amino acid with a charged polar amino acid, replacing a polar amino acid with an aromatic amino acid, replacing a large aromatic amino acid with an amino acid with a small side chain, and replacing cysteine residues that are involved in disulphide bonds.

Four individual amino acid substitutions, shown in Figure 8, each completely abolish binding of one or more mAbs to the mutant fragment, as detected by Western blotting.

The Glu26 mutation, shown in cyan, is closest to the N-terminal proteolytic processing site (magenta) at Asn1, and is the only one of this group of mutations that affects binding of a processing-inhibitory antibody, i.e. one that is capable of preventing both proteolytic processing of the MSP-1 precursor and erythrocyte invasion in vitro. The other three mutations abolish binding of blocking antibodies that bind to the native C-terminal fragment and interfere with the binding of processing-inhibitory antibodies.

Additional mutations were made based on the immunochemical analyses and the tertiary structure of the molecule, and the binding of the mAbs was assessed by Western blotting and BIAcore analysis. The results are summarised in Table 2. The binding of selected mAbs to the modified proteins as detected by Western blotting is shown in Figure 9, and by BIAcore analysis in Figure 10. Some individual amino acid changes have no effect on the binding of any of the mAbs tested (for example Leu22 to Arg). Other substitutions affect the binding of one or more mAbs.

Of particular interest are those changes that prevent the binding of blocking antibodies but have no effect on the binding of the inhibitory antibodies. For example, replacement of Asn15 by Arg prevents the binding of mAb 7.5, replacement of Glu27 by Tyr prevents the binding of mAb 2.2, replacement of Leu31 by Arg prevents the binding of mAb 1E1, replacement of Tyr34 by Ser prevents the binding of mAb 7.5, and replacement of Glu43 by Leu prevents the binding of mAb 111.4.

Several combinations of substitutions that prevent the binding of blocking antibodies but do not affect the binding of inhibitory antibodies were made in single proteins (Table 2 and Figure 11). In the first, Glu27->Tyr, Leu31->Arg and Glu43->Leu were combined; in the second, Glu27->Tyr, Leu31->Arg, Tyr34->Ser and Glu43->Leu were combined; and in the third, Asn15->Arg, Glu27->Tyr, Leu31->Arg and Glu43->Leu were combined. None of these modified proteins bound any of the blocking antibodies, but all continued to bind the inhibitory antibodies. We propose that the mutant proteins will induce a polyclonal response that is more inhibitory than that induced by the wild type protein.
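The three combined variants can be written out explicitly. The sketch below is illustrative only: it applies each set of substitutions to the fragment sequence listed in Table A and checks that the wild-type residue at each position matches the one named in the text. The function name `apply_substitutions` and the labels "first", "second" and "third" are assumptions made for this example.

```python
# One-letter sequence of the C-terminal fragment, copied from Table A of this document.
MSP1_C_TERMINAL = (
    "NISQHQCVKKQCPQNSGCFR"
    "HLDEREECKCLLNYKQEGDK"
    "CVENPNPTCNENNGGCDADA"
    "KCTEEDSGSNGKKITCECTK"
    "PDSYPLFDGIFCSSSN"
)

# Three-letter to one-letter codes for the residues involved.
THREE_TO_ONE = {"Asn": "N", "Glu": "E", "Leu": "L", "Tyr": "Y", "Arg": "R", "Ser": "S"}

# (wild-type residue, 1-based position, replacement residue) for each combination in the text.
COMBINATIONS = {
    "first": [("Glu", 27, "Tyr"), ("Leu", 31, "Arg"), ("Glu", 43, "Leu")],
    "second": [("Glu", 27, "Tyr"), ("Leu", 31, "Arg"), ("Tyr", 34, "Ser"), ("Glu", 43, "Leu")],
    "third": [("Asn", 15, "Arg"), ("Glu", 27, "Tyr"), ("Leu", 31, "Arg"), ("Glu", 43, "Leu")],
}


def apply_substitutions(sequence, substitutions):
    """Return a copy of the sequence with every listed substitution applied."""
    residues = list(sequence)
    for wild_type, position, replacement in substitutions:
        expected = THREE_TO_ONE[wild_type]
        if residues[position - 1] != expected:
            raise ValueError(f"position {position} is {residues[position - 1]}, not {wild_type}")
        residues[position - 1] = THREE_TO_ONE[replacement]
    return "".join(residues)


if __name__ == "__main__":
    for name, substitutions in COMBINATIONS.items():
        print(name, apply_substitutions(MSP1_C_TERMINAL, substitutions))
```

Checking the wild-type residue before each replacement guards against off-by-one errors in the position numbering.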

The modified recombinant proteins will also be used to affinity select antibodies from pooled serum from individuals exposed to malaria. We hypothesise that the modified proteins will select less blocking antibody than the wild type protein and that therefore these selected antibodies will be more effective in inhibiting parasite invasion in vitro and secondary processing.

In the first EGF-like domain of MSP-1 from the rodent, primate and P. vivax malaria parasites, cysteines 2 and 4 are not present. We have replaced this cysteine pair (Cys12 and Cys28) in the P. falciparum protein. This does not appear to have any effect on the binding of any of the inhibitory antibodies, but does abolish the binding of the blocking antibody mAb 2.2. We propose that one reason why the proteins from these other malaria parasites are more immunogenic is that T cell recognition is more effective or that processing by antigen processing cells proceeds by a different degradation pathway that drives the fine specificity of the antibody response in a more productive direction (see for example Egan et al., 1997). Removal of the cysteine pair may improve the immunogenicity of the modified protein, and this will be assessed by comparing the level of antibodies induced by the P. falciparum protein without the two cysteines with the level of antibodies induced by the wild type protein.
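The identification of "cysteines 2 and 4" with Cys12 and Cys28 can be checked against the fragment sequence listed in Table A. The sketch below assumes that the cysteines are counted in sequence order from the N-terminus of the fragment; the function name `cysteine_positions` is illustrative.

```python
# One-letter sequence of the C-terminal fragment, copied from Table A of this document.
MSP1_C_TERMINAL = (
    "NISQHQCVKKQCPQNSGCFR"
    "HLDEREECKCLLNYKQEGDK"
    "CVENPNPTCNENNGGCDADA"
    "KCTEEDSGSNGKKITCECTK"
    "PDSYPLFDGIFCSSSN"
)


def cysteine_positions(sequence):
    """Return the 1-based positions of all cysteine residues, in sequence order."""
    return [index for index, residue in enumerate(sequence, start=1) if residue == "C"]


positions = cysteine_positions(MSP1_C_TERMINAL)
# Assumption: "cysteines 2 and 4" are the second and fourth cysteines in sequence order.
second_and_fourth = (positions[1], positions[3])

if __name__ == "__main__":
    print(positions)
    print(second_and_fourth)
```

Under that counting assumption the second and fourth cysteines fall at positions 12 and 28, in agreement with the pair named in the text.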


Table 2 The location of amino acid sequence changes and their effect on the binding of monoclonal antibodies

Position | Amino acid (Wild type, Mutant) | Antibody: 12.8, 12.10, 5B1, 1E1, 2.2, 7.5, 111.4, 111.2, 9C8, 2F10, 12D11, 117.2, 5.2, 1E8, 8A12

6 Gin Ile i+ 4-4 14 Gin Gly 4- 44 44 14 GIn 4-4R 4-4 4-4 44 4+ 4-4 Is Asn Ar 44 44 44 ++4 Ag Oiu 44 44 22 Leu Ar 44 44 44 44 44 44 44 44 24 (flu LY 4-4 44 44 44 Ar 44I 44 44 4+ 44 44 44 26 Gu Ile 4+ 4+ 44 44 44 +4 4- 27 Giu Tyr 44 4+ 44 44 44 4 4-4 29 Ls Ser 4-4 44 44 44 44 4-4 +4 +4- 31 LiU u Ar 44 44 44 44 44 4-4 32 Leu Ar +4 44 +4 44 44 44 44 44 4-4 +4 33 Asn Ilie 44 34 Tr Ser 44 44 44 ++4 34 T Ilie 44 44 44 slie 44 44 +4 44 4-4 4+ 36 Gin Gl +4- 37 Gu Ilie 44 44 44 44 44 4-4 44 39 As Thr 44 44 44 44 +4 +4 +4- L Ile 44 4- 4-4 4-4 4- 4- 4- 4-4 ++4 43 Giu Leu 44 44 44 4-4 4-4 4-4- Thr Lys 44 44 4 44 4-4 4-4 4- 53 Asn Ar 44 44 44 44 44 44 4-4 4-4 4-f so Lys Ile 44 44 44 4-4 44 4-4 +4 Wild typ I I +I 4-4 4-4

++ = strong binding, + = binding, - = no binding

Table 2 (cont.)

Position (combinations) | Amino acid (Wild type, Mutant) | Monoclonal antibody binding: 12.8, 12.10, 5B1, 1E1, 2.2, 7.5, 111.4, 111.2, 9C8, 2F10, 12D11, 117.2, 5.2, 1E8, 8A12

12+28 Cys le 12+28 Cys Ala 44 44 +4 -H +4 Phe 14+18 Gin Gly Cy Ty 14+18 Gin Arg 34439 Tyr Ser 44 4 43+48 Glu Ieu 44 44 1' Thr Ilu 1 1 43+48 Glu Leu 4- 44+ +4 44 +4 44 4-4 +4 Thr Asn 47+48 Pro Ser Lys 27+31+43 Glu Tyr 44 4+ 4+ +4 44 LUu Arg Glu LCu 27+31+34+43 Glu Tyr 44 Leu Arg Tyr Set Glu Leu 15+27+31+43 Asn AFg 44 441 II 4- -4 4- OGu Tyr Lcu Arg Glu Lcu 12+15+27+31 Cys lie 443 Asn Arg Glu Tyr Lau Arg _lu Leu

++ = strong binding, + = binding, - = no binding

TABLE A

13-10-98
# merozoite surface protein-1 (MSP-1) Plasmodium falciparum (C-terminal fragment)
# Reference: 1H DSS=0.000 dioxane=3.755 (internal) 15N: indirect 13C: indirect
# 25C pH 6.5 *jOmv NaPO4 100mM NaCl 90%H2O/10%D2O

FORMAT--

BioMagResBank

The original sequence entered was:
NISQHQCVKKQCPQNSGCFRHLDEREECKCLLNYKQEGDKCVENPNPTCNENNGGCDADAKCTEEDSGSNGKKITCECTKPDSYPLFDGIFCSSSN

Expressed in NMR-STAR, this sequence is: _Mol_residue_sequence

NISQHQCVKKQCPQNSGCFR

HLDEREECKCLLNYKQEGDK

CVENPNPTCNENNGGCDADA

KCTEEDSGSNGKKITCECTK

PDSYPLFDGIFCSSSN

loop_
_Residue_seq_code
_Residue_author_seq_code
_Residue_label

1 @ ASN 2 @ ILE 3 @ SER 4 @ GLN 5 @ HIS
6 @ GLN 7 @ CYS 8 @ VAL 9 @ LYS 10 @ LYS
11 @ GLN 12 @ CYS 13 @ PRO 14 @ GLN 15 @ ASN
16 @ SER 17 @ GLY 18 @ CYS 19 @ PHE 20 @ ARG
21 @ HIS 22 @ LEU 23 @ ASP 24 @ GLU 25 @ ARG
26 @ GLU 27 @ GLU 28 @ CYS 29 @ LYS 30 @ CYS
31 @ LEU 32 @ LEU 33 @ ASN 34 @ TYR 35 @ LYS
36 @ GLN 37 @ GLU 38 @ GLY 39 @ ASP 40 @ LYS
41 @ CYS 42 @ VAL 43 @ GLU 44 @ ASN 45 @ PRO
46 @ ASN 47 @ PRO 48 @ THR 49 @ CYS 50 @ ASN
51 @ GLU 52 @ ASN 53 @ ASN 54 @ GLY 55 @ GLY
56 @ CYS 57 @ ASP 58 @ ALA 59 @ ASP 60 @ ALA
61 @ LYS 62 @ CYS 63 @ THR 64 @ GLU 65 @ GLU
66 @ ASP 67 @ SER 68 @ GLY 69 @ SER 70 @ ASN
71 @ GLY 72 @ LYS 73 @ LYS 74 @ ILE 75 @ THR
76 @ CYS 77 @ GLU 78 @ CYS 79 @ THR 80 @ LYS
81 @ PRO 82 @ ASP 83 @ SER 84 @ TYR 85 @ PRO
86 @ LEU 87 @ PHE 88 @ ASP 89 @ GLY 90 @ ILE
91 @ PHE 92 @ CYS 93 @ SER 94 @ SER 95 @ SER
96 @ ASN
stop_

Chemical Shift Ambiguity Code Definitions

Codes Definition
1 Unique
2 Ambiguity of geminal atoms or geminal methyl proton groups
3 Aromatic atoms on opposite sides of the ring (e.g. Tyr HE1 and HE2 protons)
4 Intraresidue ambiguities (e.g. Lys HG and HD protons)
5 Interresidue ambiguities (Lys 12 vs. Lys 27)
9 Ambiguous, specific ambiguity not defined

INSTRUCTIONS

1) Replace the @-signs with appropriate values.

2) Text comments concerning the assignments can be supplied in the full deposition.

3) Feel free to add or delete rows to the table as needed.
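Each row of the atom table that follows carries up to eight whitespace-separated fields: assignment ID, residue sequence number, residue label, atom name, atom type, shift (ppm), error (ppm) and ambiguity code. Rows for unassigned atoms stop after the atom type, and some rows lack an ambiguity code. The Python sketch below shows one way such rows could be read; the names `ShiftRow` and `parse_shift_row` are hypothetical and are not part of the BMRB format or of the original disclosure.

```python
from typing import NamedTuple, Optional


class ShiftRow(NamedTuple):
    assign_id: int
    seq_code: int
    residue: str
    atom_name: str
    atom_type: str
    shift_ppm: Optional[float]      # None when the atom is unassigned
    error_ppm: Optional[float]
    ambiguity_code: Optional[int]   # None when no code is given in the row


def parse_shift_row(line: str) -> ShiftRow:
    """Parse one whitespace-separated row of the chemical shift table."""
    fields = line.split()
    if not 5 <= len(fields) <= 8:
        raise ValueError(f"unexpected number of fields: {line!r}")
    values = fields[5:]  # shift, error and ambiguity code, any of which may be absent
    shift = float(values[0]) if len(values) >= 1 else None
    error = float(values[1]) if len(values) >= 2 else None
    code = int(values[2]) if len(values) >= 3 else None
    return ShiftRow(int(fields[0]), int(fields[1]), fields[2], fields[3], fields[4],
                    shift, error, code)


if __name__ == "__main__":
    # Example rows taken from the table in this document.
    for row in ("1 1 ASN H H 8.29 0.02 1", "7 1 ASN C C", "74 6 GLN HE21 H 7.59 0.02"):
        print(parse_shift_row(row))
```

Keeping missing values as `None` rather than zero preserves the distinction between an unassigned atom and a measured shift.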

The row numbers (_Atom_shift_assign_ID values) will be re-assigned to sequential values by BMRB.

The atom table chosen for this sequence is:

loop_
_Atom_shift_assign_ID
_Residue_seq_code
_Residue_label
_Atom_name
_Atom_type
_Chem_shift_value
_Chem_shift_value_error
_Chem_shift_ambiguity_code

Table A1. Supplementary: 1H, 13C and 15N chemical shift assignments of MSP-1 C-terminal fragment

Atom shift assign | Residue Seq no. | Residue Name | Atom Name | Atom Type | Shift/ppm | Error/ppm | Ambiguity Code

1 1 ASN H H 8.29 0.02 1
2 1 ASN HA H 4.60 0.02 1
3 1 ASN HB2 H 2.86 0.02 2
4 1 ASN HB3 H 2.75 0.02 2
5 1 ASN HD21 H
6 1 ASN HD22 H
7 1 ASN C C
8 1 ASN CA C 55.5 0.6 1
9 1 ASN CB C 40.9 0.6 1
11 1 ASN N N 125.8 0.3 1
12 1 ASN ND2 N
13 2 ILE H H 8.29 0.02 1
14 2 ILE HA H 4.25 0.02 1
15 2 ILE HB H 1.97 0.02 1
16 2 ILE HG12 H 1.39 0.02 2
17 2 ILE HG13 H 1.19 0.02 2
18 2 ILE HG2 H 0.92 0.02 1
19 2 ILE HD1 H 0.81 0.02 1
20 2 ILE C C 173.8 0.6 1
21 2 ILE CA C 62.2 0.6 1
22 2 ILE CB C 38.7 0.6 1
23 2 ILE CG1 C 27.5 0.6 1
24 2 ILE CG2 C 18.2 0.6 1
25 2 ILE CD1 C 13.7 0.6 1
26 2 ILE N N 121.1 0.3 1
27 3 SER H H 8.47 0.02 1
28 3 SER HA H 4.20 0.02 1
29 3 SER HB2 H 3.90 0.02 1
30 3 SER HB3 H 3.90 0.02 1
32 3 SER C C
33 3 SER CA C 60.9 0.6 1
34 3 SER CB C 63.3 0.6 1
35 3 SER N N 119.3 0.3 1
36 4 GLN H H 8.32 0.02 1
37 4 GLN HA H 4.02 0.02 1
38 4 GLN HB2 H 1.88 0.02 1
39 4 GLN HB3 H 1.88 0.02 1
40 4 GLN HG2 H 1.75 0.02 1
41 4 GLN HG3 H 1.75 0.02 1
42 4 GLN HE21 H
43 4 GLN HE22 H
44 4 GLN C C
45 4 GLN CA C 57.7 0.6 1
46 4 GLN CB C 27.9 0.6 1
47 4 GLN CG C 32.7 0.6 1
49 4 GLN N N 121.6 0.3 1
50 4 GLN NE2 N
51 5 HIS H H 7.76 0.02 9
52 5 HIS HA H 5.09 0.02 1
53 5 HIS HB2 H 2.70 0.02 1
54 5 HIS HB3 H 2.70 0.02 1
56 5 HIS HD2 H 6.87 0.02 1
57 5 HIS HE1 H 7.92 0.02 1
59 5 HIS C C 175.7 0.6 1
60 5 HIS CA C 54.8 0.6 1
61 5 HIS CB C 29.2 0.6 1
65 5 HIS N N 113.6 0.3 9
68 6 GLN H H 7.42 0.02 9
69 6 GLN HA H 4.43 0.02 1
70 6 GLN HB2 H 2.05 0.02 1
71 6 GLN HB3 H 2.05 0.02 1
72 6 GLN HG2 H 2.42 0.02 1
73 6 GLN HG3 H 2.42 0.02 1
74 6 GLN HE21 H 7.59 0.02
75 6 GLN HE22 H 6.92 0.02
76 6 GLN C C 175.7 0.6 1
77 6 GLN CA C 55.1 0.6 1
78 6 GLN CB C 28.8 0.6 1
79 6 GLN CG C 33.8 0.6 1
81 6 GLN N N 122.5 0.3 9

GLN

CYS

CYS

CYS

CYS

CYS

VAL

LYS

LYS

LYS

GLN

NE2

H

HA

HB2 HB3

C

CA

CB

N

H

HA

HB

HG1 HG2

C

CA

CB

CG1 CG2

N

H

HA

HB2 HB3 HG2 HG3 HD2 HD3 HE2 HE3

C

CA

CB

CG

CD

CE

N

H

HA

HB2 HB3 HG2 HG3 HD2 HD3 HE2 HE3

C

CA

CB

CG

CD

CE

N

H

HA

HB2 HB3 112.6 9.18 4.09 3.31 3.11 174.4 56.6 42.3 124.5 10.42 4.33 2.15 0.84 0.82 176.6 62.5 34.4 21.5 19.7 119.1 9.42 4.51 1.81 1.81 1.41 1.41 3.33 57.7 124.1 8.94 4.06 1.86 1.86 1.29 1.29 1.70 1.59 3.04 3.04 57.1 34.3 25.6 29.6 42.4 122.4 4.47 2.03 1.89 0.3 0.02 0.02 0.02 0.02 0.6 0.6 0.6 0.3 0.02 0.02 0.02 0.02 0.02 0.6 0.6 0.6 0.6 0.6 0.3 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.6 0.03 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.

0.6 0.6 0.6 0.6 0.3 0.02 0.02 0.02

145 11 GLN HG2 H 2.28 0.02 1
146 11 GLN HG3 H 2.28 0.02 1
147 11 GLN HE21 H 7.45 0.02 2
148 11 GLN HE22 H 6.84 0.02 2
149 11 GLN C C
150 11 GLN CA C 54.4 0.6 1
151 11 GLN CB C 28.7 0.6 1
152 11 GLN CG C 33.8 0.6 1
154 11 GLN N N
155 11 GLN NE2 N 112.9 0.3 1
156 12 CYS H H
157 12 CYS HA H 5.09 0.02 1
158 12 CYS HB2 H 3.49 0.02 2
159 12 CYS HB3 H 2.34 0.02 2
161 12 CYS C C
162 12 CYS CA C 52.4 0.6 1
163 12 CYS CB C 37.2 0.6 1
164 12 CYS N N
165 13 PRO HA H 4.55 0.02 1
166 13 PRO HB2 H 2.45 0.02 1
167 13 PRO HB3 H 1.94 0.02 1
168 13 PRO HG2 H 1.73 0.02 2
169 13 PRO HG3 H 2.04 0.02 2
170 13 PRO HD2 H 3.43 0.02 2
171 13 PRO HD3 H 3.80 0.02 2
172 13 PRO C C 176.4 0.6 1
173 13 PRO CA C 62.6 0.6 1
174 13 PRO CB C 33.0 0.6 1
175 13 PRO CG C 27.5 0.6 1
176 13 PRO CD C 50.6 0.6 1
178 14 GLN H H 8.48 0.02 1
179 14 GLN HA H 4.01 0.02 1
180 14 GLN HB2 H 1.94 0.02 1
181 14 GLN HB3 H 1.94 0.02 1
182 14 GLN HG2 H 2.42 0.02 1
183 14 GLN HG3 H 2.42 0.02 1
184 14 GLN HE21 H 7.59 0.02
185 14 GLN HE22 H 6.92 0.02
186 14 GLN C C 176.6 0.6 1
187 14 GLN CA C 57.4 0.6 1
188 14 GLN CB C 28.6 0.6 1
189 14 GLN CG C 33.8 0.6 1
191 14 GLN N N 120.5 0.3 1
192 14 GLN NE2 N 112.6 0.3
193 15 ASN H H 8.93 0.02 1
194 15 ASN HA H 3.77 0.02 1
195 15 ASN HB2 H 2.58 0.02 1
196 15 ASN HB3 H 1.09 0.02 1
197 15 ASN HD21 H 6.97 0.02 1
198 15 ASN HD22 H 7.12 0.02 1
199 15 ASN C C 171.9 0.6 1
200 15 ASN CA C 54.6 0.6 1
201 15 ASN CB C 36.3 0.6 1
203 15 ASN N N 115.8 0.3 1
204 15 ASN ND2 N 115.4 0.3 1
205 16 SER H H 7.34 0.02 1
206 16 SER HA H 4.93 0.02 1
207 16 SER HB2 H 3.62 0.02 2

208 210 211 212 213 214 215 216 217 218 219 220 221 222 223 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 267 268 269 270 271 272 273 274

SER

GLY

CYS

CYS

PHE

PHE

PHE

PHE

ARG

ARG

ARG

ARG

HIS

HIS

HB3

C

CA

CB

N

H

HA2 HA3

C

CA

N

H

HA

HB2 HB3

C

CA

CB

N

H

HA

HB2 HB3 HD1 HD2

HE1

HE2

HZ

C

CA

CB

N

H

HA

HB2 HB3 HG2 HG3 HD2 HD3

HE

HH11 HH12 HH21 HH22

C

CA

CB

CG

CD

N

NE

NH1 NH2

H

HA

HB2 HB3

3.52 173.5 57.5 67.9 109.9 8.92 3.83 2.06 42.6 108.5 7.01 5.63 3.01 3.01 172.5 55.8 43.2 120.5 9.12 4.43 1.70 0.62 6.12 6.12 6.30 6.30 6.37 172.3 57.0 41.7 132.0 7.76 4.83 1.26 0.99 1.59 1.42 3.35 3.10 7.18 6.23 6.23 6.23 6.23 174.7 54.3 32.3 28.0 44.1 129.4 85.0 70.1 70.1 9.30 4.50 3.52 3.44

0.02 0.6 0.6 0.6 0.3 0.02 0.02 0.02 0.6 0.3 0.02 0.02 0.02 0.02 0.6 0.6 0.6 0.3 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.6 0.6 0.6 0.6 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.6 0.6 0.6 0.6 0.6 0.3 0.3 0.3 0.3 0.02 0.02 0.02 0.02

276 21 HIS HD2 H 7.02 0.02 1
277 21 HIS HE1 H 8.44 0.02 1
279 21 HIS C C 177.6 0.6
280 21 HIS CA C 56.4 0.6
281 21 HIS CB C 32.2 0.6 1
285 21 HIS N N 125.9 0.3 1
288 22 LEU H H 9.30 0.02 1
289 22 LEU HA H 4.11 0.02 1
290 22 LEU HB2 H 1.87 0.02 1
291 22 LEU HB3 H 1.65 0.02 1
292 22 LEU HG H 1.87 0.02 1
293 22 LEU HD1 H 0.77 0.02 2
294 22 LEU HD2 H 0.98 0.02 2
295 22 LEU C C 177.7 0.6 1
296 22 LEU CA C 57.7 0.6 1
297 22 LEU CB C 40.8 0.6 1
298 22 LEU CG C 27.6 0.6 1
299 22 LEU CD1 C 22.6 0.6 2
300 22 LEU CD2 C 25.3 0.6 2
301 22 LEU N N 122.0 0.3 1
302 23 ASP H H 7.86 0.02 1
303 23 ASP HA H 4.52 0.02 1
304 23 ASP HB2 H 3.10 0.02 1
305 23 ASP HB3 H 2.53 0.02 1
306 23 ASP C C 176.8 0.6 1
307 23 ASP CA C 53.6 0.6 1
308 23 ASP CB C 39.7 0.6 1
310 23 ASP N N 116.9 0.3 1
311 24 GLU H H 8.01 0.02 1
312 24 GLU HA H 3.63 0.02 1
313 24 GLU HB2 H 2.57 0.02 2
314 24 GLU HB3 H 2.17 0.02 2
315 24 GLU HG2 H 2.14 0.02 1
316 24 GLU HG3 H 2.14 0.02 1
317 24 GLU C C 176.1 0.6 1
318 24 GLU CA C 59.4 0.6 1
319 24 GLU CB C 27.5 0.6 1
320 24 GLU CG C 37.0 0.6 1
322 24 GLU N N 110.3 0.3 1
323 25 ARG H H 8.04 0.02 1
324 25 ARG HA H 4.24 0.02 1
325 25 ARG HB2 H 1.84 0.02 2
326 25 ARG HB3 H 1.75 0.02 2
327 25 ARG HG2 H 1.56 0.02 1
328 25 ARG HG3 H 1.56 0.02 1
329 25 ARG HD2 H 3.16 0.02 1
330 25 ARG HD3 H 3.16 0.02 1
331 25 ARG HE H 8.01 0.02 1
332 25 ARG HH11 H 6.71 0.02
333 25 ARG HH12 H 6.71 0.02
334 25 ARG HH21 H 6.71 0.02
335 25 ARG HH22 H 6.71 0.02
336 25 ARG C C 175.6 0.6 1
337 25 ARG CA C 57.8 0.6
338 25 ARG CB C 31.0 0.6
339 25 ARG CG C 28.3 0.6 1
340 25 ARG CD C 43.5 0.6 1
342 25 ARG N N 121.9 0.3
343 25 ARG NE N 85.8 0.3 1
344 25 ARG NH1 N 70.3 0.3
345 25 ARG NH2 N 70.3 0.3
346 26 GLU H H 8.69 0.02 1
347 26 GLU HA H 5.40 0.02 1
348 26 GLU HB2 H 1.90 0.02 1
349 26 GLU HB3 H 1.90 0.02 1
350 26 GLU HG2 H 2.63 0.02 2
351 26 GLU HG3 H 2.09 0.02 2
352 26 GLU C C 173.6 0.6 1
353 26 GLU CA C 55.6 0.6 1
354 26 GLU CB C 37.9 0.6 1
355 26 GLU CG C 31.5 0.6 1
357 26 GLU N N 124.3 0.3 1
358 27 GLU H H 9.07 0.02 1
359 27 GLU HA H 4.84 0.02 1
360 27 GLU HB2 H 2.15 0.02 2
361 27 GLU HB3 H 2.27 0.02 2
362 27 GLU HG2 H 2.48 0.02 2
363 27 GLU HG3 H 2.32 0.02 2
364 27 GLU C C 174.5 0.6 1
365 27 GLU CA C 54.6 0.6 1
366 27 GLU CB C 34.8 0.6 1
367 27 GLU CG C 36.5 0.6 1
369 27 GLU N N 123.9 0.3 1
370 28 CYS H H 8.88 0.02 1
371 28 CYS HA H 5.65 0.02 1
372 28 CYS HB2 H 3.00 0.02 2
373 28 CYS HB3 H 2.81 0.02 2
375 28 CYS C C 175.5 0.6 1
376 28 CYS CA C 53.0 0.6 1
377 28 CYS CB C 40.9 0.6 1
378 28 CYS N N 122.0 0.3 1
379 29 LYS H H 8.76 0.02 1
380 29 LYS HA H 4.56 0.02 1
381 29 LYS HB2 H 1.40 0.02 1
382 29 LYS HB3 H 1.40 0.02 1
383 29 LYS HG2 H 1.38 0.02 1
384 29 LYS HG3 H 1.38 0.02 1
385 29 LYS HD2 H 1.13 0.02 2
386 29 LYS HD3 H 0.92 0.02 2
387 29 LYS HE2 H 2.86 0.02 1
388 29 LYS HE3 H 2.86 0.02 1
389 29 LYS HZ H 7.23 0.02
390 29 LYS C C 174.6 0.6 1
391 29 LYS CA C 56.4 0.6 1
392 29 LYS CB C 38.4 0.6 1
393 29 LYS CG C 25.8 0.6 1
394 29 LYS CD C 29.6 0.6 1
395 29 LYS CE C 42.4 0.6
396 29 LYS N N 124.0 0.3 1
397 29 LYS NZ N 33.0 0.3
398 30 CYS H H 8.78 0.02 1
399 30 CYS HA H 4.65 0.02 1
400 30 CYS HB2 H 2.49 0.02 1
401 30 CYS HB3 H 3.01 0.02 1
403 30 CYS C C 173.3 0.6 1
404 30 CYS CA C 54.6 0.6 1
405 30 CYS CB C 35.9 0.6
406 30 CYS N N 121.7 0.3
407 31 LEU H H 7.81 0.02
408 31 LEU HA H 4.23 0.02
409 31 LEU HB2 H 1.39 0.02
410 31 LEU HB3 H 1.73 0.02
411 31 LEU HG H 0.94 0.02 1
412 31 LEU HD1 H 0.68 0.02
413 31 LEU HD2 H 0.77 0.02 1
414 31 LEU C C 176.2 0.6
415 31 LEU CA C 54.8 0.6
416 31 LEU CB C 42.5 0.6
417 31 LEU CG C 27.1 0.6
418 31 LEU CD1 C 26.0 0.6 1
419 31 LEU CD2 C 21.9 0.6 1
420 31 LEU N N 119.2 0.3 1
421 32 LEU H H 8.91 0.02
422 32 LEU HA H 4.26 0.02 1
423 32 LEU HB2 H 1.75 0.02 1
424 32 LEU HB3 H 1.32 0.02 1
425 32 LEU HG H 1.77 0.02 1
426 32 LEU HD1 H 0.96 0.02 1
427 32 LEU HD2 H 0.77 0.02 1
428 32 LEU C C 179.0 0.6 1
429 32 LEU CA C 56.4 0.6 1
430 32 LEU CB C 42.1 0.6 1
431 32 LEU CG C 26.9 0.6 1
432 32 LEU CD1 C 26.0 0.6 1
433 32 LEU CD2 C 23.4 0.6 1
434 32 LEU N N 118.4 0.3 1
435 33 ASN H H 8.84 0.02 1
436 33 ASN HA H 3.90 0.02
437 33 ASN HB2 H 3.26 0.02 1
438 33 ASN HB3 H 2.88 0.02 1
439 33 ASN HD21 H 6.86 0.02 1
440 33 ASN HD22 H 7.25 0.02 1
441 33 ASN C C 173.8 0.6 1
442 33 ASN CA C 56.9 0.6 1
443 33 ASN CB C 37.3 0.6 1
445 33 ASN N N 110.8 0.3 1
446 33 ASN ND2 N 112.0 0.3 1
447 34 TYR H H 8.61 0.02 1
448 34 TYR HA H 5.02 0.02 1
449 34 TYR HB2 H 3.35 0.02 1
450 34 TYR HB3 H 2.52 0.02 1
451 34 TYR HD1 H 6.74 0.02 1
452 34 TYR HD2 H 6.74 0.02 1
453 34 TYR HE1 H 6.69 0.02 1
454 34 TYR HE2 H 6.69 0.02 1
456 34 TYR C C 174.7 0.6
457 34 TYR CA C 57.6 0.6 1
458 34 TYR CB C 41.0 0.6 1
465 34 TYR N N 118.6 0.3 1
466 35 LYS H H 9.99 0.02 1
467 35 LYS HA H 4.84 0.02 1
468 35 LYS HB2 H 1.82 0.02 1
469 35 LYS HB3 H 1.59 0.02 1
470 35 LYS HG2 H 1.18 0.02 1
471 35 LYS HG3 H 1.18 0.02 1
472 35 LYS HD2 H 1.48 0.02 1
473 35 LYS HD3 H 1.48 0.02 1
474 35 LYS HE2 H 2.89 0.02 1
475 35 LYS HE3 H 2.89 0.02 1
477 35 LYS C C 174.1 0.6 1
478 35 LYS CA C 54.2 0.6 1
479 35 LYS CB C 36.3 0.6 1
480 35 LYS CG C 24.1 0.6 1
481 35 LYS CD C 29.6 0.6 1
482 35 LYS CE C 41.6 0.6 1
483 35 LYS N N 119.7 0.3 1
485 36 GLN H H 8.77 0.02 1
486 36 GLN HA H 4.67 0.02 1
487 36 GLN HB2 H 2.06 0.02 1
488 36 GLN HB3 H 2.06 0.02 1
489 36 GLN HG2 H 2.35 0.02 1
490 36 GLN HG3 H 2.35 0.02 1
491 36 GLN HE21 H 7.50 0.02 2
492 36 GLN HE22 H 6.55 0.02 2
493 36 GLN C C 176.3 0.6 1
494 36 GLN CA C 56.7 0.6 1
495 36 GLN CB C 28.6 0.6 1
496 36 GLN CG C 33.7 0.6 1
498 36 GLN N N 124.9 0.3 1
499 36 GLN NE2 N 110.9 0.3 1
500 37 GLU H H 8.96 0.02 1
501 37 GLU HA H 4.51 0.02 1
502 37 GLU HB2 H 1.96 0.02 2
503 37 GLU HB3 H 1.80 0.02 2
504 37 GLU HG2 H 2.11 0.02 1
505 37 GLU HG3 H 2.11 0.02 1
506 37 GLU C C 176.2 0.6 1
507 37 GLU CA C 56.0 0.6 1
508 37 GLU CB C 31.9 0.6 1
509 37 GLU CG C 36.6 0.6 1
511 37 GLU N N 130.2 0.3 1
512 38 GLY H H 9.20 0.02 1
513 38 GLY HA2 H 3.69 0.02 1
514 38 GLY HA3 H 4.05 0.02 1
515 38 GLY C C 174.7 0.6 1
516 38 GLY CA C 47.4 0.6 1
517 38 GLY N N 118.4 0.3 1
518 39 ASP H H 8.81 0.02 1
519 39 ASP HA H 4.69 0.02 1
520 39 ASP HB2 H 2.78 0.02 1
521 39 ASP HB3 H 2.78 0.02 1
522 39 ASP C C 175.2 0.6 1
523 39 ASP CA C 54.2 0.6 1
524 39 ASP CB C 41.0 0.6 1
526 39 ASP N N 127.1 0.3 1
527 40 LYS H H 7.86 0.02 1
528 40 LYS HA H 4.71 0.02 1
529 40 LYS HB2 H 1.86 0.02 1
530 40 LYS HB3 H 1.86 0.02 1
531 40 LYS HG2 H 1.52 0.02 1
532 40 LYS HG3 H 1.52 0.02 1
533 40 LYS HD2 H 1.69 0.02 1
534 40 LYS HD3 H 1.69 0.02
535 40 LYS HE2 H 3.02 0.02 1
536 40 LYS HE3 H 3.02 0.02 1
538 40 LYS C C 175.4 0.6 1
539 40 LYS CA C 54.8 0.6 1
540 40 LYS CB C 36.3 0.6
541 40 LYS CG C 24.9 0.6 1
542 40 LYS CD C 29.0 0.6
543 40 LYS CE C 42.0 0.6
544 40 LYS N N 118.0 0.3 1
546 41 CYS H H 8.95 0.02 1
547 41 CYS HA H 5.29 0.02 1
548 41 CYS HB2 H 3.00 0.02 1
549 41 CYS HB3 H 2.58 0.02 1
551 41 CYS C C 174.4 0.6 1
552 41 CYS CA C 55.0 0.6 1
553 41 CYS CB C 41.9 0.6 1
554 41 CYS N N 119.8 0.3 1
555 42 VAL H H 9.35 0.02 1
556 42 VAL HA H 4.88 0.02 1
557 42 VAL HB H 2.32 0.02 1
558 42 VAL HG1 H 1.00 0.02 1
559 42 VAL HG2 H 0.88 0.02 1
560 42 VAL C C 175.6 0.6 1
561 42 VAL CA C 59.2 0.6 1
562 42 VAL CB C 35.2 0.6 1
563 42 VAL CG1 C 21.6 0.6 1
564 42 VAL CG2 C 19.3 0.6 1
565 42 VAL N N 118.7 0.3 1
566 43 GLU H H 9.18 0.02 1
567 43 GLU HA H 3.34 0.02 1
568 43 GLU HB2 H 1.84 0.02 2
569 43 GLU HB3 H 1.75 0.02 2
570 43 GLU HG2 H 2.15 0.02 2
571 43 GLU HG3 H 2.07 0.02 2
572 43 GLU C C 175.4 0.6 1
573 43 GLU CA C 59.0 0.6 1
574 43 GLU CB C 29.8 0.6 1
575 43 GLU CG C 36.8 0.6 1
577 43 GLU N N 123.4 0.3 1
578 44 ASN H H 8.25 0.02 1
579 44 ASN HA H 4.95 0.02 1
580 44 ASN HB2 H 2.72 0.02 2
581 44 ASN HB3 H 1.97 0.02 2
582 44 ASN HD21 H 8.07 0.02 2
583 44 ASN HD22 H 7.24 0.02 2
584 44 ASN C C
585 44 ASN CA C 48.5 0.6 1
586 44 ASN CB C 39.2 0.6 1
588 44 ASN N N 120.2 0.3 1
589 44 ASN ND2 N 112.1 0.3 1
590 45 PRO HA H 4.40 0.02
591 45 PRO HB2 H 2.31 0.02 1
592 45 PRO HB3 H 1.95 0.02 1
593 45 PRO HG2 H 1.97 0.02 1
594 45 PRO HG3 H 1.97 0.02 1
595 45 PRO HD2 H 3.86 0.02 2
596 45 PRO HD3 H 3.81 0.02 2
597 45 PRO C C 176.2 0.6
598 45 PRO CA C 63.7 0.6 1
599 45 PRO CB C 32.6 0.6 1
600 45 PRO CG C 26.8 0.6 1
601 45 PRO CD C 50.9 0.6 1
603 46 ASN H H 7.47 0.02 1
604 46 ASN HA H 5.09 0.02 1
605 46 ASN HB2 H 2.72 0.02 2
606 46 ASN HB3 H 2.43 0.02 2
607 46 ASN HD21 H 7.57 0.02 1
608 46 ASN HD22 H 6.91 0.02 1
609 46 ASN C C
610 46 ASN CA C 51.6 0.6 1
611 46 ASN CB C 40.1 0.6 1
613 46 ASN N N 114.0 0.3 1
614 46 ASN ND2 N 112.6 0.3 1
615 47 PRO HA H 4.34 0.02 1
616 47 PRO HB2 H 1.97 0.02 1
617 47 PRO HB3 H 1.77 0.02 1
618 47 PRO HG2 H 1.96 0.02 1
619 47 PRO HG3 H 1.96 0.02 1
620 47 PRO HD2 H 3.52 0.02 2
621 47 PRO HD3 H 3.39 0.02 2
622 47 PRO C C 175.5 0.6 1
623 47 PRO CA C 63.7 0.6 1
624 47 PRO CB C 32.4 0.6 1
625 47 PRO CG C 27.5 0.6 1
626 47 PRO CD C 49.8 0.6 1
628 48 THR H H 8.28 0.02 1
629 48 THR HA H 4.68 0.02 1
630 48 THR HB H 4.32 0.02 1
632 48 THR HG2 H 1.13 0.02 1
633 48 THR C C 174.5 0.6 1
634 48 THR CA C 59.1 0.6 1
635 48 THR CB C 69.9 0.6 1
636 48 THR CG2 C 19.3 0.6 1
637 48 THR N N 112.4 0.3 1
638 49 CYS H H 9.35 0.02 1
639 49 CYS HA H 4.44 0.02 1
640 49 CYS HB2 H 2.64 0.02 1
641 49 CYS HB3 H 3.14 0.02 1
643 49 CYS C C 175.8 0.6 1
644 49 CYS CA C 55.5 0.6 1
645 49 CYS CB C 37.7 0.6 1
646 49 CYS N N 125.3 0.3 1
647 50 ASN H H 8.36 0.02 1
648 50 ASN HA H 4.60 0.02 1
649 50 ASN HB2 H 2.88 0.02 1
650 50 ASN HB3 H 2.70 0.02 1
651 50 ASN HD21 H 7.60 0.02 2
652 50 ASN HD22 H 6.97 0.02 2
653 50 ASN C C 174.4 0.6 1
654 50 ASN CA C 54.8 0.6 1
655 50 ASN CB C 38.9 0.6 1
657 50 ASN N N 116.4 0.3 1
658 50 ASN ND2 N 113.6 0.3 1
659 51 GLU H H 7.43 0.02 1
660 51 GLU HA H 4.64 0.02 1
661 51 GLU HB2 H 1.96 0.02 2
662 51 GLU HB3 H 1.81 0.02 2
663 51 GLU HG2 H 2.12 0.02 1
664 51 GLU HG3 H 2.12 0.02 1
665 51 GLU C C 176.1 0.6 1
666 51 GLU CA C 55.3 0.6 1
667 51 GLU CB C 31.5 0.6 1
668 51 GLU CG C 36.1 0.6 1
670 51 GLU N N 119.2 0.3 1
671 52 ASN H H 9.57 0.02 1
672 52 ASN HA H 4.45 0.02 1
673 52 ASN HB2 H 3.14 0.02 1
674 52 ASN HB3 H 2.64 0.02 1
675 52 ASN HD21 H 6.99 0.02 1
676 52 ASN HD22 H 7.71 0.02 1
677 52 ASN C C 176.4 0.6 1
678 52 ASN CA C 54.2 0.6 1
679 52 ASN CB C 37.8 0.6 1
681 52 ASN N N 125.4 0.3 1
682 52 ASN ND2 N 112.8 0.3 1
683 53 ASN H H 9.31 0.02 1
684 53 ASN HA H 4.64 0.02 1
685 53 ASN HB2 H 3.25 0.02 1
686 53 ASN HB3 H 2.31 0.02 1
687 53 ASN HD21 H 6.95 0.02 1
688 53 ASN HD22 H 6.26 0.02 1
689 53 ASN C C 176.2 0.6 1
690 53 ASN CA C 54.1 0.6 1
691 53 ASN CB C 39.2 0.6 1
693 53 ASN N N 119.3 0.3 1
694 53 ASN ND2 N 111.6 0.3 1
695 54 GLY H H 7.92 0.02 1
696 54 GLY HA2 H 4.28 0.02 1
697 54 GLY HA3 H 3.67 0.02 1
698 54 GLY C C 173.0 0.6 1
699 54 GLY CA C 46.3 0.6 1
700 54 GLY N N 106.3 0.3 1
701 55 GLY H H 8.06 0.02 1
702 55 GLY HA2 H 4.39 0.02 1
703 55 GLY HA3 H 3.47 0.02 1
704 55 GLY C C 175.1 0.6 1
705 55 GLY CA C 44.4 0.6 1
706 55 GLY N N 105.3 0.3 1
707 56 CYS H H 7.77 0.02 1
708 56 CYS HA H 4.39 0.02 1
709 56 CYS HB2 H 3.12 0.02 1
710 56 CYS HB3 H 2.81 0.02 1
712 56 CYS C C 175.2 0.6 1
713 56 CYS CA C 53.1 0.6 1
714 56 CYS CB C 36.6 0.6 1
715 56 CYS N N 117.6 0.3 1
716 57 ASP H H 8.37 0.02 1
717 57 ASP HA H 3.98 0.02 1
718 57 ASP HB2 H 2.24 0.02 2
719 57 ASP HB3 H 2.03 0.02 2
720 57 ASP C C 176.0 0.6 1
721 57 ASP CA C 56.0 0.6 1
722 57 ASP CB C 45.0 0.6 1
724 57 ASP N N 122.3 0.3
725 58 ALA H H 8.48 0.02
726 58 ALA HA H 3.98 0.02
727 58 ALA HB H 1.41 0.02
728 58 ALA C C 178.9 0.6
729 58 ALA CA C 55.6 0.6
730 58 ALA CB C 19.1 0.6 1
731 58 ALA N N 127.2 0.3 1
732 59 ASP H H 9.11 0.02 1
733 59 ASP HA H 4.90 0.02
734 59 ASP HB2 H 2.44 0.02
735 59 ASP HB3 H 2.82 0.02
736 59 ASP C C 174.6 0.6
737 59 ASP CA C 53.9 0.6 1
738 59 ASP CB C 40.7 0.6
740 59 ASP N N 116.7 0.3 1
741 60 ALA H H 7.94 0.02
742 60 ALA HA H 5.13 0.02 1
743 60 ALA HB H 1.27 0.02 1
744 60 ALA C C 176.3 0.6 1
745 60 ALA CA C 50.5 0.6 1
746 60 ALA CB C 21.3 0.6 1
747 60 ALA N N 121.9 0.3
748 61 LYS H H 9.06 0.02
749 61 LYS HA H 4.54 0.02
750 61 LYS HB2 H 1.77 0.02 1
751 61 LYS HB3 H 1.77 0.02 1
752 61 LYS HG2 H 1.31 0.02 2
753 61 LYS HG3 H 1.39 0.02 2
754 61 LYS HD2 H 1.72 0.02 2
755 61 LYS HD3 H 1.63 0.02 2
756 61 LYS HE2 H 2.92 0.02 2
757 61 LYS HE3 H 2.92 0.02 2
759 61 LYS C C 175.7 0.6 1
760 61 LYS CA C 55.1 0.6
761 61 LYS CB C 34.1 0.6 1
762 61 LYS CG C 24.6 0.6 1
763 61 LYS CD C 29.1 0.6 1
764 61 LYS CE C 42.1 0.6 1
765 61 LYS N N 122.5 0.3 1
767 62 CYS H H 9.19 0.02 1
768 62 CYS HA H 5.32 0.02 1
769 62 CYS HB2 H 2.44 0.02 1
770 62 CYS HB3 H 2.80 0.02
772 62 CYS C C 174.5 0.6 1
773 62 CYS CA C 56.1 0.6 1
774 62 CYS CB C 37.6 0.6 1
775 62 CYS N N 131.5 0.3
776 63 THR H H 9.23 0.02 1
777 63 THR HA H 4.51 0.02
778 63 THR HB H 4.01 0.02
780 63 THR HG2 H 1.16 0.02
781 63 THR C C 172.2 0.6
782 63 THR CA C 62.8 0.6 1
783 63 THR CB C 71.5 0.6
784 63 THR CG2 C 22.0 0.6
785 63 THR N N 125.9 0.3
786 64 GLU H H 8.61 0.02
787 64 GLU HA H 5.12 0.02 1
788 64 GLU HB2 H 1.90 0.02 1
789 64 GLU HB3 H 1.90 0.02 1
790 64 GLU HG2 H 2.30 0.02 1
791 64 GLU HG3 H 2.30 0.02 1
792 64 GLU C C
793 64 GLU CA C 54.4 0.6 1
794 64 GLU CB C 32.0 0.6 1
795 64 GLU CG C 36.8 0.6 1
797 64 GLU N N 123.0 0.3 1
798 65 GLU H H 8.76 0.02 1
799 65 GLU HA H 4.60 0.02 1
800 65 GLU HB2 H 2.01 0.02 2
801 65 GLU HB3 H 1.86 0.02 2
802 65 GLU HG2 H 2.15 0.02 1
803 65 GLU HG3 H 2.15 0.02 1
804 65 GLU C C
805 65 GLU CA C 55.1 0.6 1
806 65 GLU CB C 32.9 0.6 1
807 65 GLU CG C 36.1 0.6 1
809 65 GLU N N 123.0 0.3 1
810 66 ASP H H 8.79 0.02 1
811 66 ASP HA H 4.80 0.02 1
812 66 ASP HB2 H 2.80 0.02 2
813 66 ASP HB3 H 2.58 0.02 2
814 66 ASP C C 176.4 0.6 1
815 66 ASP CA C 54.8 0.6 1
816 66 ASP CB C 41.2 0.6 1
818 66 ASP N N 123.9 0.3 1
819 67 SER H H 8.38 0.02 1
820 67 SER HA H 4.55 0.02 1
821 67 SER HB2 H 3.83 0.02 2
822 67 SER HB3 H 3.70 0.02 2
824 67 SER C C 175.4 0.6 1
825 67 SER CA C 58.0 0.6 1
826 67 SER CB C 64.6 0.6 1
827 67 SER N N 119.1 0.3 1
828 68 GLY H H 8.65 0.02 1
829 68 GLY HA2 H 4.13 0.02 2
830 68 GLY HA3 H 3.85 0.02 2
831 68 GLY C C 175.0 0.6 1
832 68 GLY CA C 46.0 0.6 1
833 68 GLY N N 112.2 0.3 1
834 69 SER H H 8.58 0.02 1
835 69 SER HA H 4.42 0.02 1
836 69 SER HB2 H 3.87 0.02 1
837 69 SER HB3 H 3.87 0.02 1
839 69 SER C C 174.8 0.6 1
840 69 SER CA C 59.1 0.6 1
841 69 SER CB C 63.4 0.6 1
842 69 SER N N 117.7 0.3 1
843 70 ASN H H 8.39 0.02 1
844 70 ASN HA H 4.74 0.02 1
845 70 ASN HB2 H 2.94 0.02 2
846 70 ASN HB3 H 2.77 0.02 2
847 70 ASN HD21 H
848 70 ASN HD22 H
849 70 ASN C C

ASN

GLY

LYS

ILE

ILE

CA

CB

N

ND2

H

HA2 HA3

C

CA

N

H

HA

HB2 HB3 HG2 HG3 HD2 HD3 HE2 HE3

C

CA

CB

CG

CD

CE

N

H

HA

HB2 HB3 HG2 HG3 HD2 HD3 HE2 HE3

C

CA

CB

CG

CD

CE

N

H

HA

HB HG12 HG13 HG2

HD1

C

CA

CB

CG1 CG2

CD1

N

53.8 38.9 118.4 7.88 4.07 4.07 173.7 45.0 108.1 8.39 4.84 1.75 1.59 1.46 1.38 1.66 1.66 2.93 2.93 173.7 55.4 35.0 24.9 28.9 42.2 120.8 8.81 4.64 1.71 1.71 1.36 1.25 1.63 1.63 2.89 2.89 175.3 55.0 35.4 24.6 122.6 8.54 4.83 1.91 1.50 1.50 1.00 0.57 176.2 58.4 38.4 27.1 19.0 10.3 126.8

0.6 0.6 0.3 0.02 0.02 0.02 0.6 0.6 0.3 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.6 0.6 0.6 0.6 0.6 0.6 0.3 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.6 0.6 0.6 0.6 0.3 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.6 0.6 0.6 0.6 0.6 0.6 0.3

913 75 THR H H 8.88 0.02
914 75 THR HA H 4.51 0.02
915 75 THR HB H 4.01 0.02
917 75 THR HG2 H 1.15 0.02 1
918 75 THR C C 172.4 0.6
919 75 THR CA C 61.2 0.6 1
920 75 THR CB C 71.5 0.6 1
921 75 THR CG2 C 22.0 0.6 1
922 75 THR N N 120.3 0.3 1
923 76 CYS H H 8.82 0.02 1
924 76 CYS HA H 5.53 0.02 1
925 76 CYS HB2 H 3.28 0.02 1
926 76 CYS HB3 H 2.69 0.02 1
928 76 CYS C C 174.7 0.6 1
929 76 CYS CA C 52.2 0.6 1
930 76 CYS CB C 40.6 0.6 1
931 76 CYS N N 121.5 0.3 1
932 77 GLU H H 8.64 0.02 1
933 77 GLU HA H 4.74 0.02 1
934 77 GLU HB2 H 1.94 0.02 1
935 77 GLU HB3 H 1.94 0.02 1
936 77 GLU HG2 H 1.83 0.02 1
937 77 GLU HG3 H 1.83 0.02 1
938 77 GLU C C 175.7 0.6 1
939 77 GLU CA C 54.7 0.6 1
940 77 GLU CB C 33.1 0.6 1
941 77 GLU CG C 36.1 0.6 1
943 77 GLU N N 123.0 0.3 1
944 78 CYS H H 9.71 0.02 1
945 78 CYS HA H 4.48 0.02 1
946 78 CYS HB2 H 2.63 0.02 1
947 78 CYS HB3 H 3.30 0.02 1
949 78 CYS C C 175.8 0.6 1
950 78 CYS CA C 57.4 0.6 1
951 78 CYS CB C 39.2 0.6 1
952 78 CYS N N 129.1 0.3 1
953 79 THR H H 8.20 0.02 1
954 79 THR HA H 4.25 0.02 1
955 79 THR HB H 4.25 0.02 1
957 79 THR HG2 H 1.25 0.02 1
958 79 THR C C 176.2 0.6 1
959 79 THR CA C 63.4 0.6 1
960 79 THR CB C 70.1 0.6 1
961 79 THR CG2 C 22.2 0.6 1
962 79 THR N N 114.7 0.3 1
963 80 LYS H H 8.56 0.02 1
964 80 LYS HA H 4.50 0.02 1
965 80 LYS HB2 H 1.79 0.02 1
966 80 LYS HB3 H 1.79 0.02 1
967 80 LYS HG2 H 1.47 0.02 2
968 80 LYS HG3 H 1.64 0.02 2
969 80 LYS HD2 H 1.70 0.02 1
970 80 LYS HD3 H 1.70 0.02 1
971 80 LYS HE2 H 2.94 0.02 1
972 80 LYS HE3 H 2.94 0.02 1
974 80 LYS C C
975 80 LYS CA C 55.4 0.6 1
976 80 LYS CB C 30.9 0.6 1
977 80 LYS CG C 25.6 0.6
978 80 LYS CD C 29.1 0.6
979 80 LYS CE C 42.1 0.6
980 80 LYS N N 124.8 0.3
982 81 PRO HA H 4.26 0.02
983 81 PRO HB2 H 2.27 0.02
984 81 PRO HB3 H 1.87 0.02
985 81 PRO HG2 H 2.10 0.02 2
986 81 PRO HG3 H 2.04 0.02 2
987 81 PRO HD2 H 3.90 0.02 2
988 81 PRO HD3 H 3.63 0.02 2
989 81 PRO C C 176.9 0.6 1
990 81 PRO CA C 64.2 0.6 1
991 81 PRO CB C 31.9 0.6 1
992 81 PRO CG C 27.8 0.6 1
993 81 PRO CD C 50.8 0.6 1
995 82 ASP H H 8.78 0.02 1
996 82 ASP HA H 4.25 0.02 1
997 82 ASP HB2 H 2.88 0.02 2
998 82 ASP HB3 H 2.77 0.02 2
999 82 ASP C C
1000 82 ASP CA C 55.1 0.6 1
1001 82 ASP CB C 40.0 0.6 1
1003 82 ASP N N 117.9 0.3 1
1004 83 SER H H 7.39 0.02 1
1005 83 SER HA H 4.47 0.02 1
1006 83 SER HB2 H 3.55 0.02 2
1007 83 SER HB3 H 3.50 0.02 2
1009 83 SER C C 175.4 0.6 1
1010 83 SER CA C 57.4 0.6 1
1011 83 SER CB C 65.9 0.6 1
1012 83 SER N N 112.9 0.3 1
1013 84 TYR H H 8.72 0.02 1
1014 84 TYR HA H 5.01 0.02 1
1015 84 TYR HB2 H 2.93 0.02 2
1016 84 TYR HB3 H 2.75 0.02 2
1017 84 TYR HD1 H 6.92 0.02 1
1018 84 TYR HD2 H 6.92 0.02 1
1019 84 TYR HE1 H 6.69 0.02 1
1020 84 TYR HE2 H 6.69 0.02 1
1022 84 TYR C C
1023 84 TYR CA C 54.6 0.6 1
1024 84 TYR CB C 39.7 0.6 1
1031 84 TYR N N 122.2 0.3 1
1032 85 PRO HA H 5.09 0.02 1
1033 85 PRO HB2 H 2.25 0.02 2
1034 85 PRO HB3 H 1.72 0.02 2
1035 85 PRO HG2 H 2.22 0.02 1
1036 85 PRO HG3 H 2.22 0.02 1
1037 85 PRO HD2 H 3.85 0.02 1
1038 85 PRO HD3 H 3.85 0.02 1
1039 85 PRO C C 176.9 0.6 1
1040 85 PRO CA C 62.9 0.6 1
1041 85 PRO CB C 32.8 0.6 1
1042 85 PRO CG C 27.3 0.6 1
1043 85 PRO CD C 50.6 0.6 1
1045 86 LEU H H 8.51 0.02 1
1046 86 LEU HA H 4.77 0.02 1

1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1077 1078 1079 1080 1081 1082 1083 1084 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111

LEU

PHE

ASP

GLY

ILE

PHE

HB2 HB3

HG

HDI

HD2

C

CA

CB

CG

CDI

CD2

N

H

HA

HB2 HB3 HD1 HD2

HE

HE2

HZ

C

CA

CB

N

H

HA

HB2 HB3

C

CA

CB

N

H

HA2 HA3

C

CA

N

H

HA

HB

HG 12 HG 13 HG2 HD1

C

CA

CB

CGI

CG2

CDI

N

H

HA

HB2 HB3

HDI

1.76 1.76 1.85 1.00 1.00 177.9 54.5 44.8 29.0 25.7 25.7 120.2 9.29 4.16 3.26 3.19 7.31 7.31 7.70 7.70 7.67 176.4 59.3 36.9 122.6 8.93 4.33 3.11 3.02 175.2 56.3 39.9 111.3 7.87 3.48 4.08 174.7 46.0 102.0 7.17 4.39 1.52 0.65 0.65 1.05 0.51 175.1 64.2 35.9 25.5 16.9 14.9 113.3 7.43 5.15 2.40 2.40 7.00 0.02 0.02 0.02 0.02 0.02 0.6 0.6 0.6 0.6 0.6 0.6 0.3 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.6 0.6 0.6 0.3 0.02 0.02 0.02 0.02 0.6 0.6 0.6 0.3 0.02 0.02 0.02 0.6 0.6 0.3 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.6 0.6 0.6 0.6 0.6 0.6 0.3 0.02 0.02 0.02 0.02 0.02 WO 00/63245 PCT/GBOO/01558 -67- 1112 91 PHE HD2 H 7.00 0.02 1 1113 91 PHE HEI H 6.95 0.02 1 1114 91 PHE HE2 H 6.95 0.02 1115 91 PHE HZ H 7.06 0.02 1 1116 91 PHE C C 175.1 0.6 1 1117 91 PHE CA C 56.5 0.6 I 1118 91 PHE CB C 44.3 0.6 1 1125 91 PHE N N 114.0 0.3 1 1126 92 CYS H H 7.86 0.02 1 1127 92 CYS HA H 5.18 0.02 1 1128 92 CYS HB2 H 2.57 0.02 1 1129 92 CYS HB3 H 3.08 0.02 1 1131 92 CYS C C 174.1 0.6 1 1132 92 CYS CA C 54.1 0.6 1 1133 92 CYS CB C 44.7 0.6 1 1134 92 CYS N N 119.0 0.3 1 1135 93 SER H H 9.26 0.02 1 1136 93 SER HA H 4.23 0.02 1 1137 93 SER HB2 H 3.90 0.02 1 1138 93 SER HB3 H 3.90 0.02 1 1140 93 SER C C 1141 93 SER CA C 60.4 0.6 1 1142 93 SER CB C 63.7 0.6 1 1143 93 SER N N 118.3 0.3 1 1144 94 SER H H 8.04 0.02 1 1145 94 SER HA H 4.59 0.02 1 1146 94 SER HB2 H 3.91 0.02 1 1147 94 SER HB3 H 3.91 0.02 1 1149 94 SER C C 1150 94 SER CA C 58.7 0.6 1 1151 94 SER CB C 64.0 0.6 1 1152 94 SER N N 114.2 0.3 1 1153 95 SER H H 1154 95 SER HA H 4.71 0.02 1 1155 95 SER HB2 H 4.02 0.02 2 1156 95 SER HB3 H 3.94 0.02 2 1158 95 SER C C 1159 95 SER CA C 58.4 0.6 1 1160 95 SER CB C 64.5 0.6 1 1161 95 SER N N 1162 96 ASN H H 1163 96 ASN HA H 4.61 0.02 1 1164 96 ASN HB2 H 2.75 0.02 2 1165 96 ASN HB3 H 2.60 0.02 2 1166 96 ASN HD21 H 1167 96 ASN HD22 H 1168 96 ASN C C 1169 96 ASN CA C 54.3 0.6 1 1170 96 ASN CB C 41.8 0.6 1 1172 96 ASN N N 1173 96 ASN ND2 N stop_ The following loop is used to define sets of Atom-shift 
assignment IDs that represent related ambiguous assignments taken from the above list of assigned chemical shifts. Each element in the set should be separated by a comma, as shown in the example below, and is the assignment ID for a chemical shift assignment that has been given an ambiguity code of 4 or 5. Each set indicates that the observed chemical shifts are related to the defined atoms, but have not been assigned uniquely to a specific atom in the set.
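As an illustration only, the set convention described above can be sketched in Python. This is a minimal sketch and not part of the deposited data: the function names `parse_ambiguity_set` and `check_set` and the sample values are hypothetical, and it assumes each set is given as a single comma-separated line of assignment IDs.

```python
from typing import Dict, List


def parse_ambiguity_set(line: str) -> List[int]:
    """Split one comma-separated set of assignment IDs into integers.

    Empty elements (e.g. from a leading comma) are skipped.
    """
    return [int(tok) for tok in line.split(",") if tok.strip()]


def check_set(ids: List[int], ambiguity_codes: Dict[int, int]) -> List[int]:
    """Return the IDs in the set whose ambiguity code is not 4 or 5.

    `ambiguity_codes` maps an assignment ID to the ambiguity code
    listed for it in the chemical shift table.
    """
    return [i for i in ids if ambiguity_codes.get(i) not in (4, 5)]


if __name__ == "__main__":
    # Hypothetical codes for illustration; not taken from the table above.
    codes = {5: 4, 4: 4, 7: 1}
    ids = parse_ambiguity_set("5,4,7")
    print(ids)
    print(check_set(ids, codes))
```

An empty result from `check_set` would indicate that every ID in the set carries an ambiguity code of 4 or 5, as the description requires.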

loop_
_Atom_shift_assign_ID_ambiguity
# Sets of Atom-shift Assignment Ambiguities
# Example: ,4,7
stop_

# REMARKS
# PROTECTED BACKBONE AMIDE GROUPS (SLOWLY EXCHANGING IN D2O) FOR RESIDUES:
# GLY 17, PHE 19, GLU 27, LYS 29, LEU 31, TYR 34, LYS 35, VAL 42, CYS 56, ASP 57, ALA 60, LYS 61, THR 63, THR 75, GLU 77,
# LEU 86, GLY 89, ILE 90, PHE 91
# BROAD HN SIGNALS IN [15N]-HSQC OBSERVED FOR RESIDUES:
# VAL 8, LYS 9, LYS 10, CYS 18, ARG
# TWO BACKBONE HN CROSSPEAKS OBSERVED FOR RESIDUES:
# HIS 5: 7.78,113.8 / 7.74,113.5
# GLN 6: 7.44,122.6 / 7.40,122.4
# TWO AVERAGED SIGNALS OBSERVED FOR RESIDUES:
# ARG 20, ARG 25 NOT SPECIFICALLY ASSIGNED
# TO INDIVIDUAL ARGININES
# LYS 29 NZ/HZ* SIGNAL: TENTATIVELY ASSIGNED TO
# LYS 29 (BURIED LYSINE SIDE CHAIN)
# BASED ON GREATER PROTECTION FROM H2O EXCHANGE THAN OTHER LYSINE NZ/HZ* SIGNALS
# ASPARAGINE SIDE CHAIN AMIDE SIGNALS:
# PROBABLE OVERLAPPING CROSSPEAKS ~112 PPM
# FOR ASN 1, ASN 70, ASN 96

Table A2. Supplementary: NMR experimental details

Columns: Experiment; Dimension; Nucleus; Complex points [after LP] (points); Spectral width (Hz); Acquisition time (ms); Carrier frequency (ppm); Instrument 1H frequency (MHz); Solvent; Temperature (°C); Final data size (points); Digital resolution (Hz/point); Mixing time (ms); Total time (hr)

2D NOESY tI W 2D NOESY ti 2D ROI3SY 2D ROSY t 3D 3 N]-NOESY-1-SQ ti 12 03 3D 5 N]-ROESY-[iSQC 400 2048 360 2048 260 2048 360 2048 36 1641 180 512 32 180 512 160 96 384 8000 50 8000 256 8000 45 8000 256 6000 43 6000 341 7000 54 7000 293 2500 7000 7000 2500 7000 7000 7200 10000 7200 4.74 4.74 4.74 4.74 4.74 4.74 4.74 4.74 121.5 4.74 4.74 121.5 4.74 4.74 4.74 41.0 4.74 40.2 3.00 40.2 3.00

D

1 0 25 1024 7.8 2048 3.9 1120 25 1024 7.8 2048 3.9 D,0 25 1024 5.9 2048 2.9 1120 25 1024 6.8 2048 3.4 (Ins) (hr) 75-150 22 75 23 60 16 60 62 125 641 60 87 125 89 H1 2 0 25 128 512 512 iH 2 0 25 128 512 512 03 'H 3D IMQC-NOESY t'H Qt13 B3 'H 41.) [13C1]-IIMQC-NOI3SY-[l3C]-IISQC 11 1"C D,0 25 512 14.1 256 39.1 1024 181241 3360 111 74 3600 "3c 18 1241 3360 111 256 4500 D0O 25 64 256 64 256 125 105 Experiment Dimension Nucleus Complex (Reference) Spectral Acquisition Carrier Instrument width Time Frequency 'H--frequency Solvent Temper- Final Digital Mixing TUotal ature data size Res. time time Points [after LPJ 4D I' 3 C]-IIMQC- NOESY-[' 3

N]-HSQC

11 DH Q3 1H (points) (11z)

C,,

2D DQF-COSY 3D IINI IA

CA,

3D IINIIB 3D HN(CO)HBI 0%I 4 111 14 [241 56 14 [241 256 1000 2048 80 512 24 [48] 512 28 [48] 128 512 80 1312 3360 3600 1800 7400 6000 6000 (ins) 4.2 15.6 7.8 34.6 167 341 40.2 3.00 118.9 4.74 4.74 4.74 2500 14 3500 23 7000 73 2500 9.6 8000 11.3 8000 64 1800 8000 8000 (ppm) (Mllz) (OC) (points) (1 hf PointXni1s) (hr) 121.5 500 4.74 4.74 119.1 600 4.74 4.74 118.9 600 4.74 4.74 118.9 600 4.74 118.9 600 4.74 64 52.5 125 128 28.1 64 28.1 256 29.0 4096 1.5 8192 0.7 128 19.5 256 13.7 512 13.7 128 19.5 256 31.3 1024 7.8 128 14.1 512 15.6 512 15.6 256 7.0 2048 3.9 256 7.0 2048 3.9 256 11.8 256 18.8 1024 3.9 96 31 41 108 13 26 2D [15N]-['1 3 CYJ Spin-echo HSQC It I5N 12 11- 2D 1 Cyj Spin-echo HSQC ti 5 N 78 12 'Hl 1216 3D LRCI 1800 44.4 8000 149 1800 43.3 8000 152 3017 4800 4000 Experiment Dimension Nucleus Complex Spectral Acquisition Carrier Instrument Solvent (Reference) 2D TOCSY Points width Time [after LPJ Frequency 'H-frequency Temper- Final Digital Mixing Total ature data size Resolution time time (OC) (points) (lIz] point) (ins) (h~r) ri) 12 S 2D J-IISQC ti S 2D ["CJ-I-ISQC ti 3D OCSY-tISQ ti 12 13 3D I ICCH-TOCSY 3D CBCA(CO)NH ti Q2 tG 3D CBCANII*

'H

15N 1 3 c

'H

1"N

'H

Ili

'H

(points) (Hz) 360 8000 2048 8000 360 4400 1216 8000 (ins) 400 12000 33.3 1216 8000 152 38[(64] 2500 180 8000 512 8000 134 5500 128 8049 416 5500 4.74 4.74 119.6 4.74 41.3 4.74 119.0 4.74 4.74 4.74 41.9 4.74 41.3 118.9 4.74 41.3 118.9 4.74 118.9 176.0 4.74 600 600 600 600 500 600 600 600

H2O H2O H2O H2O D2O H2O H2O H2O (ppm) (MHz) 25 1024 7.8 2048 3.9 25 2048 2.1 4096 25 1024 11.7 4096 25 128 19.5 512 15.6 512 15.6 25 512 10.7 512 15.7 512 10.7 25 128 78.1 128 14.1 512 15.6 25 512 19.5 128 14.1 256 31.3 25 128 14.1 256 7.1 512 15.6 66 18 14 2.7 56 43 38[641 10000 3.8 26[361 1800 14.4 512 8000 64 63[1281 10000 6.3 261521 1800 14.4 512 8000 64 17 21 3D IINCO 32[481 1800 64 1811 512 8000

TABLE B

merozoite surface protein MSP-1 X-PLOR format 09-11-98 noes roes 370rcximace-- long-range medium-range 0 sequential 622 intraresidue 3 hydrogen bonds 10 (20 restraints) falciparum C-terminal fragment pseudoatom

AVERAGING:

class rt6s class nsam class sing class hbnd types: arom meta_ dgnm_ nsam_ si ng_ corrections not used for R-6 averaging/summation

SUM

aromatic cair .ethyl degenerate methylene on-stereospecifically-assigned methylene single proton long-range medium-range sequential i.ntraresidue 1.ydrogen_bonds (j-i 4) -2-4) 1) hbnd <residue-atom 1> <type> <residue-atom 2> <dist-minus-olus> class rt6s assign assign assign assign assign assign assign assign assian assign assign assign assign assign assign assign assign assign assign assign assign assign assian assign assign assign assicn assign (resid (resid (res id (resid (resid (resid (resid (resid (resid (resid (resid (resid (res id (resid (resid (resid (resid :resid (resid (resid ;resid (resid (resid (resid (resid ,resid (resid (resid and and and and and and and and and and and and and and and and and and and and and and and and and and and and name name name name name name name name name name name name name name name name name name name name name name name name name name name name hb#) ha) ha) ha#) ha#) hd#) hd#) hd#) hdw) hd#) hd#) hd#) hd#) he#) he#) he4) he hn) hg) hd i hd2#) hn) hn) ha) hd (res d (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (zesid 'resid (resid ~res id (res id (re s i d (resid (regid (resid (resid (res id and and and and and and and and and and and and and and and and and and and and and and and and and and and and name hda) name he#) name hd#) name he#) name hd#) name he#) name hn) name hn) name hd2) name hd#) name ha) name hb#) name hd#) name he#) name hd2) name hd#) name hg) name hb#) name hd#) name hd#) name hd#) name hd4) name hei) name hd#) name hd#) name hd#) name hd#) name hn) 3.6 3.6 3.6 3.6 5.5 5.5 5.5 0.5 3.6 3.6 3.6 .3.6 5.5 3.6 5.5 3.6 5.5 3. 
6 5.5 .3.6 3.6 3.6 3.6 5.5 5.5 5.5 5.5 3.6 3.6 5.5 3.6 3.6 0.5 5.5 3.6 5.5 3.6 5.5 3.6 5.5 5.5 5 2.8 5.5 3.6 3.6 3.6 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 arom 2 !arom2 !arom 2 arom2 !arom aroml !arom s !arom s !arom m !arom m .a-romI larom-1 !arom I arom2 larom n !arom m arom2 arcm !arcm2 arom r, arom m arom .aron arom L arom m !aroa arom arom S WO 00/63245 PCT/GB00/01558 assign assicn assign assign ass ian ass in ass n assi n assion assigi ass gn ass ;Gn assicn assian assign assign assign assign assign assign assign assign assian assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign ass ign assign asslgn assign assian as s ia assign assign assan assign assign (resid resin resid res.

resid resid resid resid resid resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid -resid resid tresid resid resid (resid resiid resid (resid res id resid resid ana and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name hd nd) hd#) hd#) ha)) ne;) he#) he r~t) hd#) hd#)) he#) he#) hn) hn) hd#) hb#) hd#) ha) hd*) hd#) hd#) hd#) hd#) hd#) he#) hn) hd#) hd# hb) ha) hat) hg) hd*) he#) hn) ha) ha) ha) hbn) hg) hn) ha) hbl) hg#) hg2 hg2 hg2#) hn) hn) ha)g2 hg2#) hg2#) hg2#) hg2#) hg2#) hg2#) hg2*) hn) ha) hn) ha:.*f ha#) hr.) hn) ha) ha) hn) (resid (resid (resid (resia res (resid (resid (resid (res id (resid (res id (resid (resiad (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid iresid (res d res resi:.

!resid iresid (resid (resid reszd i (resid and and ar.nd and =r.ci and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name hn) ha) hn) hn) ha) hb#) hgn) hd#) hd#) hd#) he#) hd#) he#) hd#) hb) hg2#) hd#) hb#) hd#) he#) hd#) he#) hd#) hn) hd#) hd#) hd#) hd#) hd#) hd#) hd#) hd#) hd#) hg2#) hd#) hg2#) hg2#) hg2#) he#) hg2#) hd#) hd#) hg2#) hd#) hd#) hn) hn) hbl) hn) ha) hb#) hg#) hd#) hd#) hd#) hd#) hg2#) hd#) hb#) hg2#) hb#) hg2#) hb#) hb4) 5.5 2.8 5.5 3.5 3.6 5.5 5.5 3.6 3.6 3.6 3.6 3.6 5.5 3.6 2.8 3.6 3.3 3.3 5.5 5.5 3.6 2.5 5.5 5.5 5.5 5.5 5.5 5.5 5.5 5.5 5.5 r.c 5.5 3.5 5.5 5.5 5.5 3.6 5.5 5.56 5.5 5.5 5.5 5.51 5.6 3.6 5.5 3.6 5.5 5.5 5.5 5.5 5.5 5.3 5.5 3.6 3.6 3.6 3.6 5.5 3.3 3.6 3.6 2.8 5.56 5.5 3.6 3.68 5.5 5.5 3.6 3.6 2.8 3.6 5.5 5.5 5.56 5.5 5.5 5.5 4.1 5.5 5.5 5.5 5.5 5.5 5.5 5.5 5.5 5.5 5.5 5.5 3.

5.5 3.5 5.5 5.56 5.5 5.56 5.5 5.56 5.5 3.1 3.6 3.6 5.5 5.5 5.5 5.5 5.5 5.5 5.5 5.3 5.5 3.3 3.6 3.6 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 aron_1 larm I arom I !arm 1 arom3n.

arom i arom I 'arzmi !arcm s arom 1 arcm s !arom s arom s aroms arom i (arom m arom m arom m arom m arom m aromm arom m arom m arom i arom s !meth 1 !meth 1 !meth 1 !meth 1 meth 1 !meth 1 !meth 1 !meth 1 !meth 1 !meth 1 !meth 1 meth 1 meth 1 meth I meth !meth s !meth 1 !meth i !meth i !meth 1 !meth_1 !meths !methm meth_m !meth m !meth m !meth m !meth m meth_1 meth_1 !meth_1 Imeth_2.

!meth 1 e th I meth I !methn m !meth_ !meth m !methm !meth m !meth m WO 00/63245 PCT/GBOO/01558 assign (resid assign (resid assian Iresid assign fresid assign (resid assian (resid assign (resid assion (resid assian (resid assign (resid assian (resid assian (resid assign (resid assian (resid assign (resic assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resi assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (r-si assign (resid assign (resid assign (resid assign kres assign (res a assign (resid assign (resid assign (resid assign (resid assizn (res c assian ,resiz and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and name ame name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name 'nn) ha) hnB) hb) hg2#) ha) ha#) hb#) ha) hn) h b hn) ha2) ha) g2 ha) hn) hg#) ha) hbft) hn#) hn) ha) ha) hg2#) hn) ha) ha) hg2#) hd#) ha#) ha) hn) hn) ha) hg24) hd#) ha) he#) he#) hn) hb#) hin) ha) hg) ha) hn) ha) hb#) hn) hn) hgA) ha) hz) hn~) h) hb) ha) :res id Iresii (resid (resid resid (resid 'resid resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid Iresid 'resid (resid 
(resid (res id (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid Iresid (resid ;resid (res id (resid Cresid (resid (res -d anc and and and and and and and and and and and and and and and and and ana and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and ana ana and anc and ana and and ana and name name name n a.-Ile name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name .ame name name name name name name name b n) hn) hn) hb.1) ha#) hn) hg2'r) hd#) hd#) hd#) he#) hd hdn) hg2#) hg2#) hd#) ha) hn) hg2#) hn) hd#) hn) hb#) ha) hd#) hg2#) hdw4) hg2 ha2#) ha2 hn) hn) hg#) hn) hn) hg#) hn) hg#) hg-') nn.) nd#) hb#) hn) hg.4) hg*) hn) hn) ha) hb4) hgF) hgr) hn) hr.) .6 3.6 3.6 3.6 5.5 5.5 5.5 5.5 4.1 3.6 3.6 5.5 5.5 5.5 5.5 3.6 5.5 5.5 3.6 3.6 3.6 5.5 3.6 3.6 3.6 5.5 5.5 5.5 5.5 5.5 5.5 5.5 5.5 3.6 3.6 5.5 5.5 2.8 3.6 3.6 3.6 4.1 3.6 2.8 5.5 5.5 2.8 3.6 3.6 5.5 3.6 3.6 5.5 3.6 3.6 3.6 2.8 3.6 3.6 3.6 3.5 5.5 5.5 4.1 3.6 3.6 5.5 5.5 5.5 5.5 3.6 3.6 3.6 3.6 5.5 3.6 3.6 3.6 5.5 5.5 5.5 5.5 5.5 5.5 5.5 5.5 3.6 5.5 5.5 2.8 3.6 3.6 3.6 4.1 3.6 2.8 5.5 5.5 2.8 3.6 3.6 3.6 3.6 3.6 3.z 3.6 3.6 2.8 r; .j 0.3 0.0 0.0 0.3 0.0 3.0 0.0 0.0 0.0 0.0 0.0 0.0 0.3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 I.

0. 0 0.0C 0.0C .0 f Imeth m methn m .netfl !mech s !meth s mneth 1 .methi meth 1 mech 1 .merh 1 !meth 1 !meth 1 !meth 1 !meth s !meth 1 meth 1 .meth 1 !meth 1 !meth i !meth m !meth s !meth i !meth i .meth i meth s meth s !meth i !meth s !meth s .meth s .meth 1 !meth 1 meth m !meth i !meth s !meth i !meh s dgnmnl dgnm 1 !dgrun 'dgnm_i !danm s !dgnm i dgnmi darn4 dgnm i !dgnm !danm 1 dgnmi car= s !dgnm_1 !damra 1 agnml1 !dgnm i !dgnm-i dgnm WO 00/63245 PCT/GBOO/01558 assign assian assign assian assian assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign -r2.sd (resid Iresid (resid (resid (resid (resid (resid (resi (resid (resid (resid (resid (resici (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (res id (resid (resid (resid (resid (resid anc ane anc anc anc anc anc anc anc anc anc anc anc anc anc anc anc and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and i name name i name -name name i name name r'ame i name name name name name I name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name i name name i name 2 name name name 2 name hc) ha) h at hn( hn) ha) ha,") hg hn) hn) hn) hn) hgr,) hn) hbW) hn) hn) hn) ha) hb#) hn) hb O) ha#) hn) hn) hb#) hn) hb*) hb#) ha) ha) hg#) hb#) hb#) ha) hg#) hg#) ha) hb hn) ha) hn) ha#) hz) hd#) ha) ha) hd2#) hd2#) hd2w) hd27) ha) hd2 hd2 hd2 hd2 hd2#) hd2#) id2#) id2#) idl# (resid (resid (resid (resid (resia (resid (resid (resid (resid ,resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid 
(resid (resid (resid (resid (resid (resid (resid (resid (resid resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name 2 name name name name name name hn hn) hgd) hn) hn) hgi) hb4) hg#) hg hd22) hb#) hn) hb#) hg*) hgw) hg#) ha) hg*) hn) hn) hd#) hb#) hn) hb#) hd#4) hn) hb#) hg#) hd#) hn) hn) hb#) 3.6 3.6 3.6 3.5 3.6 5.E 5.5 3.6 2.8 3.6 2.8 2.6 2.8 3.6 3.6 3.6 2.8 2.6 3.6 2.8 3.6 2.8 2.8 3.6 2.9 3.6 2.8 5.5 3.6 3.6 5.5 3.6 3.6 .6 3.6 3.6 2.5 3.6 5.8 2.8 3.6 5.5 2.8 3.6 2.8 2.6 2.8 3.6 3.6 3.6 2.8 2.8 3.6 2.8 3.6 2.8 2.8 3.6 2.8 3.6 2.8 5.5 3.6 3.6 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 .danmm .dgnn_2 .danmS .dgnm_.

.danm s *dgnms !.dnm .dgnmS .dgnm 1 .dgnm_1 dgnm_s *dgnm-i dgnms *dgnm i dgnmi .dgnm i 1 dgnmi .dgnm_l dgnmi *danm s *dgnmi .dgnmi .dgnmJ .dgnm i .dgnmi *dgnms dgnrn_m .dgnm_ .dgnm_m .dgnm_s .dgnm_m .dgnms !dgnml .metn h2.

.meth 2.

.meth_ .meth I !meth 1 !meth 1 !methI !methI .meth_1 !meth m !meth s !meth s !.-eth I !meth 1 meth Imeth 2 !metn I meth 2.

!meth s !meth_2.

(meth 2.

methI meth_ !meth I met h meth_ meth class nsam assian assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign -ass ian assign assign assign ass ion assign (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (res id (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid ha#) hd#) he) hdl#) hdln) hdl#) hd#) hd2#) hd2#) hd#) hn) hd#) hd2 hb#) hz) hg24) hid24) hn) ha) 'ib) ig2*) ,za) ig2 ibz)! 3.6 5.5 3.6 5.5 5.5 5.5 4.1 5.5 3.6 5.5 5.5 3.6 5.5 3.6 5.5 3.6 3.6 3.6 5.5 3.5 3.6 0.0 5.3 0.0 5.5 0.0 3.6 0.0 5.5 0.0 5.5 0.0 5.5 0.0 4.1 0.0 5.5 0.0 3.6 0.0 5.5 0.0 5.5 0.0 3.6 0.0 5.5 0.0 3.6 0.0 5.5 0.0 5.5 0.0 3.6 0.0 3.6 0.0 5.5 0.0 3.5 0.0 1.3 0.0 5 .0 5.5 0.0 5.5 0.0 3.6 0.0 3.5 0.0 WO 00/63245 PCT/GBOO/01558 assian assi=n ass ign ass. u =ssi~n.

assign ass -4--n ass 4nn assian assign assign assicn assign assign assign assian assign assign assign assi=n assin assign assianassign assi-an assign assign assign assign assign assign assign assign assign assign assign assign assign assinn assign assign assign assign assign assign assign assign assign assign assian assign assign assign assign assian assinn assign ass :Cn assign assion assian assign ass irn resid res id resia res id 'resid res id resid ,resid Iresid :resid (resid 'resid .resid (resid 'resid resid (resid (resid (resid (resid fresid 'resid Cresiid (resid (resid fresid (resid (resid ,resid (resid (resid (resid (resid (resid (resid (resid 'resid resid (resid (resid Iresid resid (resid (resid (resid (resid Iresid ,resid (resid (resid (resid (resid (resid (resid !res id (resid (resid ,resid resic resid 'resid (resid (resia resid ~res id and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name hn) nb2) ha#) h) hn) ha) hb#) hb#) hg-) hg#) hn) hg#) ha) ha) hb#) hgl) hg2#) ha) ha) ha) ha) ha) hb#) hb*) hb2) hg#) hd#) hb2) ha) hb#) hb#) ha#) ha) ha) hb#) hg#) hg#) hb#) hb#) hb2) hb#) hn) hg#) hn) ha) ha) hb#) hg4) hn) hg#) hd#) hn) hbl) hb#) nb2) hb#) hn) hn) hbl) ha, ha) hb2) ha) hb 1 hb#) 'resid resin res id 'resid (resid (res :resid resid resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid !resid (resid (resid (resid (resid (resid Iresid (resid resid (resid (resid (resid (resid (resid (resid (resid (resid resid 
(resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid (resid resid ;resid 'resid r=esid resi :resr.

resrn re sn regiz resid resin re s~ ann and and ann and ann and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and ar.

and an and and ann and ann name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name hg2 ha2"W hg2#.

hg2') hgl#) hg2#) hal) hg2 hgli) hgl') hgl hg2) hg#) hn) hn) hb#) hb#) hd#) hg#) hd#) hd#) hb#) hn) hb#) hb#) hb2) hb#) ha) hn) hz) hg#) hgu) hd#) hn) ha) hn) hn) hn) hn) ng#) hn) hg#) hg#) hg#) hn) hn) hb#u) hn) hb) hb#) hn) hn) hb2) ha) hbl) hb2) hn) hb#) hbl) hn) hb#) hn) hd#) 5.5 5.5 5.5 5.5 5.5 5.5 5.5 5.5 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 2.8 2.8 2.8 2.8 2.8 2.9 2.8 3.6 5.5 3.6 3.6 3.0 3.6 2.6 3.6 '.6 3.6 4.1 2.8 5.5 5 =.5 5.3 5. 5 3.6 3.6 3.6 3.6 2 3.6 3.6 2.2 3.6 3.6 3.6 3.6 3.6 3.6 3.8 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 3.6 5.5 2.8 2.0 2.5 2.8 3.6 3.6 3.6 5.5 3.8 3.6 3.8 3.6 3.6 3.6 5.5 3.6 3.65 3.6 3.61 3.63 0.0 0.0 0.0 0.0

O.C

0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 .etl !meth metn -et' 1 meth_1 ieth 1 .metn n !meth l mech -a meth s *meths .meth s meth s .nsamI .nsam_1 .nsamI .nsam i .nsam s .nsam 1 .nsam m .nsam_1 *sing 1 .nsam 1 nsam 1 .nsam_1 .nsam_1 .nsam i .nsam 1 .nsam i nsam s .nsam m .nsam m !sings .nsam s .nsam i .nsam i .nsam i .nsam_i .nsam i .nsam s *nsam 1 .nsam i nsam rn singm .sam_1 .nsam i .S4no-m .sinn s Insam 1 !sinn 1 .nsam_1 :sing nsam WO 00/63245 PCT/GBOO/01558 assion tresid assign jesic assian eS assign ~e~ assign 'resid assign !<asic assign resiQ assian (resid assian "resid assign !r.sid assign (resid assign :resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign lresid assign (resid assign (resid assign (resid assign (resid assign (resi assign (resid assign (resid assign (resid assign 'resid assign Cresid assign (resid assign 'resid assign fresid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid assign (resid I assign (resid i assign (resid assign (resid I assign (resid assign (resid assign (resid assign (resid assign 'resid assign kresid assign resid assian %res 1assign (resicl7 assian (resid as s resid assign lresid E assign (resid E assign kresid anc anc: anc anc an: anc anC and anc and and anc anc and anc and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and ana and and ano and and and and i name name L nanie name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name 
name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name b 2, no 7 vta) en) ha) no2) hn) hn) ha) hb2) hb2) hn) En) ha) ha) hb4) in) hd22) hd2 1) ha#) ha#) hbl) hb2) hb2) hb#) h.o#b nb# hEn) hb hb#) ha) hn) En) n) hb#) hb#) ha#) hn) hn) ha) hn) hbl) hbl) hb#)' hb#) nn) hn) hn) i-a) !rsit rsic ;resin, '-as'd .res ,n .rsic :resin (resid (resid (resid (resid (resid (resid (resid (residJ (resid (resid (resid (resid (resid (resid (esid resid (res id (resid (resid (resid (resid (res id (resid (resid (resid (resid (resid iresid 'resid (resid (rsid (resid (resid (res id (resid (resid (resid (resid (resid (resid (res id (resid (resid (resid (resid (resid (resia (resid :resrn (resid [resid ;resid (resicd (resid anc a nd a:c anc ianc a no and a nd and and and and and and and a nd an d and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and and arna and and and a nd and and an and aC.n an n name *nam~e name n nane name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name name hn) hb ha) En) hn( hn) hn) hd#) hd#) hn) hb#) hd#) hd#) hn) hgl#) hgl#) hai) hgl#) ha#) hbl) hn) hn) ha) hn) hn) hn) En) hn) hn) hn) ha) hn) hn) hn) ha) hn) hb#) hg#) hn) hb#) hg#) hb#) hb#) hn) hb#) hn) hg#) ngl*) hal# hb#) En) hat) hn) hb# hn) hb:) hg#) hd#) hb#f) hq 3.6 2.8 2.3 3.6 31.6 5.5 2.8 3.6 3.6 3.6 3.6 4.1 3.6 3.6 3.6 3.6 3.6 5.5 5.5 3.6 3.6 3.6 4.1 5.5 3.6 3.6 4.1 4.1 2.8 5.5 3.6 3.6 3.6 3.6 2.8 2.8 3.5 3.6 3.6 3.6 3.6 3.6 2.8 3.6 3.6 3.6 3.6 5.5 3.6 3. 6 3.

3.6 5. 5 3. 6 3. 6 2.8 2.8 3.6 3.6 5.5 2.3 3.6 3.6 3.6 3.6 4.1 3.6 3.6 4.1 5.5 3.6 5.5 3.6 3.6 3.6 3.6 3.6 4.I 5.5 3.6 3.6 4.1 4.1 2.8 5 5.5 3.6 3.6 3.6 3.6 2.8 2.8 3.6 3.6 3.6 3.6 3.6 3.6 2.3 5.5 3.6 3.6 3.6 3.6 3.6 3.6 0.0 3.0 0.0 0.0 0.0 0.3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 c 0 .0j 0.0 0.0 0.0 0.0 *nsam i .nsam_1 .sam m .nsam s .nsam m I.sam s sings .nsam i .nsam 2.

[NOE distance restraints, X-PLOR format. The entries preceding residue 51 are illegible in the source text.]

assign (resid 51 and name ha) (resid 52 and name hn) 3.1 3.1 0.0 !sing_s
assign (resid 52 and name hn) (resid 53 and name hn) 3.6 3.6 0.0 !sing_s
assign (resid 52 and name ha) (resid 53 and name hn) 3.6 3.6 0.0 !sing_s
assign (resid 53 and name hn) (resid 54 and name hn) 4.1 4.1 0.0 !sing_s
assign (resid 53 and name ha) (resid 54 and name hn) 3.6 3.6 0.0 !sing
assign (resid 33 and name ha) (resid 56 and name hn) 2.6 2.8 0.0 !sing_m
assign (resid 54 and name h) (resid and name hn) 3.6 3.6 0.0 !sing_s
assign (resid 55 and name hn) (resid 56 and name hn) 3.1 3.1 0.0 !sing_s
assign (resid 56 and name n) (resid 57 and name hn) 5.5 5.5 0.0 !sing_s
assign (resid 56 and name ha) (resid 57 and name hn) 3.6 3.6 0.0 !sing_s
assign (resid 57 and name hn) (resid 59 and name hn) 5.5 5.5 0.0 !sing_m
assign (resid 57 and name hn) (resid 60 and name hn) 5.5 5.5 0.0 !sing_m
assign (resid 57 and name hn) (resid 90 and name hn) 5.5 5.5 0.0 !sing_l
assign (resid 37 and name hn) (resid 90 and name ha) 2.6 2.6 0.0 !sing
assign (resid 57 and name hn) (resid 91 and name ha) 5.5 5.5 0.0 !sing
assign (resid 57 and name hn) (resid 91 and name hn) 3.5 5.5 0.0 !sing_l
assign (resid 57 and name ha) (resid 91 and name ha) 3.6 3.6 0.0 !sing_l
assign (resid 58 and name hn) (resid 59 and name hn) 3.6 3.6 0.0 !sing_s
assign (resid 58 and name hn) (resid 60 and name hn) 5.5 5.5 0.0 !sing_m
assign (resid 58 and name ha) (resid 59 and name hn) 3.6 3.6 0.0 !sing_s
assign (resid 59 and name hn) (resid 60 and name hn) 3.6 3.6 0.0 !sing_s
assign (resid 59 and name ha) (resid 60 and name hn) 3.6 3.6 0.0 !sing_s
assign (resid 60 and name hn) (resid 61 and name hn) 5.5 5.5 0.0 !sing_s
assign (resid 60 and name hn) (resid 78 and name ha) 5.5 5.5 0.0 !sing_s
assign (resid 60 and name ha) (resid 61 and name hn) 2.6 2.6 0.0 !sing_s
assign (resid 60 and name ha) (resid 77 and name hn) 5.5 5.5 0.0 !sing_l
assign (resid 60 and name ha) (resid 78 and name ha) 2.8 2.8 0.0 !sing_l
assign (resid 60 and name ha) (resid 79 and name hn) 3.6 3.6 0.0 !sing_l
assign (resid 61 and name hn) (resid 77 and name hn) 3.6 3.6 0.0 !sing_l
assign (resid 61 and name hn) (resid 78 and name ha) 3.6 3.6 0.0 !sing_l
assign (resid 61 and name hn) (resid 79 and name hn) 5.5 5.5 0.0 !sing_l
assign (resid 61 and name ha) (resid 62 and name hn) 2.6 2.6 0.0 !sing_s
assign (resid 62 and name ha) (resid 63 and name hn) 2.6 2.6 0.0 !sing_s
assign (resid 62 and name ha) (resid 76 and name ha) 2.8 2.8 0.0 !sing_l
assign (resid 63 and name hn) (resid 63 and name hb) 3.6 3.6 0.0 !sing_i
assign (resid 63 and name hn) (resid 64 and name hn) 5.5 5.5 0.0 !sing_s
assign (resid 63 and name hn) (resid 75 and name hn) 3.6 3.6 0.0 !sing_l
assign (resid 63 and name hn) (resid 76 and name ha) 5.5 5.3 0.0 !sing_l
assign (resid 63 and name ha) (resid 64 and name hn) 2.6 2.6 0.0 !sing_s
assign (resid 63 and name hb) (resid 64 and name hn) 3.1 3.1 0.0 !sing_s
assign (resid 64 and name ha) (resid 74 and name ha) 2.8 2.2 0.0 !sing_l
assign (resid 65 and name ha) (resid 66 and name hn) 2.6 2.6 0.0 !sing_s
assign (resid 66 and name hn) (resid 67 and name hn) 5.5 5.5 0.0 !sing_s
assign (resid 67 and name hn) (resid 68 and name hn) 5.5 5.5 0.0 !sing_s
assign (resid 67 and name ha) (resid 68 and name hn) 3.6 3.6 0.0 !sing_s
assign (resid 70 and name hn) (resid 71 and name hn) 3.1 3.5 0.0 !sing_s
assign (resid 70 and name ha) (resid 71 and name hn) 3.6 3.6 0.0 !sing_s
assign (resid 71 and name hn) (resid 72 and name hn) 3.6 3.6 0.0 !sing_s
assign (resid 72 and name hn) (resid 73 and name hn) 5.5 3.5 0.0 !sing_s
assign (resid 72 and name ha) (resid 73 and name hn) 2.6 2.6 0.0 !sing_s
assign (resid 73 and name hn) (resid 74 and name hn) 5.5 5.5 0.0 !sing_s
assign (resid 73 and name ha) (resid 74 and name hn) 2.6 2.6 0.0 !sing_s
assign (resid 74 and name hn) (resid 74 and name hb) 3.1 3.1 0.0 !sing_i
assign (resid 74 and name ha) (resid 75 and name hn) 2.6 2.6 0.0 !sing_s
assign (resid 75 and name hn) (resid 75 and name hb) 3.1 3.1 0.0 !sing_i
assign (resid 75 and name ha) (resid 76 and name hn) 2.6 2.6 0.0 !sing_s
assign (resid 75 and name hb) (resid 76 and name hn) 3.6 3.6 0.0 !sing_s
assign (resid 76 and name ha) (resid 77 and name hn) 2.6 2.6 0.0 !sing_s
assign (resid 77 and name hn) (resid 73 and name hn) 5.5 5.5 0.0 !sing_s
assign (resid 77 and name ha) (resid 73 and name hn) 2.8 2.3 0.0 !sing_s
assign (resid 78 and name hn) (resid 79 and name hn) 5.5 5.5 0.0 !sing_s
assign (resid 78 and name ha) (resid 79 and name hn) 2.6 2.6 0.0 !sing_s
assign (resid 79 and name hn) (resid 79 and name hb) 3.6 3.6 0.0 !sing_i
assign (resid 78 and name hn) (resid 50 and name hn) 3.6 3.6 0.0 !sing_s
assign (resid 79 and name hb) (resid 70 and name hn) 5.5 5.5 0.0 !sing

[The remaining distance restraints and the hydrogen-bond restraints (annotated !hbnd) are illegible in the source text.]

dihedral angle restraints X-PLOR format
chi-1 phi psi chi-2 restraints
!<ENERGY> <ANGLE> <RANGE> <EXPONENT>

chi-1 restraints
assign (resid 15 and name n) (resid 15 and name ca) (resid 15 and name cb) (resid 15 and name cg) 1.0 -60.3 40.0 2
[The remaining chi-1 restraints are illegible in the source text.]

phi restraints
[The phi restraints, annotated with HNHA and HN(CO)HB coupling data, are illegible in the source text.]

psi restraints
[The psi restraints are illegible in the source text.]
chi-2 restraints
[The five chi-2 restraints, each annotated !LRCH, are illegible in the source text.]

Example 3: Identification of blocking antibodies using a competitive binding assay and immobilised wild type GST-MSP-119

In previous studies, antibodies that blocked the action of the neutralising antibodies 12.8 and 12.10 had been defined either directly in the MSP-142 processing assay (Blackman et al., 1994), in a coupled erythrocyte invasion-MSP-142 processing assay (Guevara et al., 1997), or in a competitive radioimmunoassay with merozoite protein as the antigen (Guevara et al., 1997). These studies have been extended using recombinant MSP-1 and BIAcore analysis.

A recombinant fusion protein comprising wild type MSP-119 fused to GST was coupled to the sensor chip, and competitor antibody was first allowed to bind to the antigen. Then a solution of either mAb 12.8 or 12.10 was passed over the chip and the amount of binding of this second antibody was quantified. If the first antibody interferes with the binding of the second antibody, this is reflected in a reduction in the amount of second antibody bound.
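The reduction in second-antibody binding described above can be expressed as a percentage of the binding seen without a competitor. The following is a minimal sketch of that arithmetic only; the function name and the response values are hypothetical and are not taken from the experiments reported here.

```python
def percent_inhibition(bound_alone: float, bound_after_competitor: float) -> float:
    """Percent reduction in second-antibody binding caused by a competitor.

    bound_alone: response for mAb 12.8 or 12.10 with no competitor bound first.
    bound_after_competitor: response for the same mAb after the competitor.
    Both values are assumed to be in the same arbitrary response units.
    """
    if bound_alone <= 0:
        raise ValueError("reference binding must be positive")
    return 100.0 * (bound_alone - bound_after_competitor) / bound_alone


# Hypothetical response values, for illustration only.
reference = 800.0
with_competitor = 200.0
print(percent_inhibition(reference, with_competitor))  # 75.0
```

A value near 0 would indicate no interference, and a value near 100 would indicate that the competitor almost completely blocks the second antibody.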

Methods

The wild type GST-MSP-119 was coupled to a CM5 sensor chip. The binding assays were performed with a constant flow rate of 5 µl min-1 at 25 °C. For binding, purified mAbs 1E1, 8A12 and 2F10 at 100 µg ml-1 in HBS-EP buffer (10 mM HEPES pH 7.4 containing 150 mM NaCl, 3 mM EDTA and 0.005% v/v polysorbate 20); mAbs 1E8, 9C8, 12D11, 111.2 and 111.4 in cell culture medium supernatant; mAbs 2.2, 7.5 and 89.1 at 1:10 dilution of ascitic fluid in HBS-EP buffer; and mouse α-GST antibody at 1:10 dilution of serum in HBS-EP buffer were allowed to interact with immobilised wild type GST-MSP-119 for 10 min. After allowing 5 min for dissociation of low affinity interactions, either mAb 12.8 or 12.10 at 100 µg ml-1 in HBS-EP buffer was added and allowed to bind for min. After washing the chip for 5 min, the binding of 12.8 or 12.10 was measured. The chip was regenerated by washing off bound antibody with 10 mM glycine-HCl, pH 2.4, or when required with 100 mM glycine-HCl, pH 1.8, for 3 min.

Results

The results are shown in Figure 12. All the competitor antibodies bind to the GST-MSP-119 antigen, with the exception of mAb 89.1, which is a negative control. As expected, mAbs 12.8 and 12.10 competed with each other (Guevara et al., 1997). The other antibodies which do not inhibit processing could, to a greater or lesser extent, interfere with the binding of 12.8 and 12.10. As expected from previous studies, mAbs 1E1, and blocked both 12.8 and 12.10, whereas 2.2 and 111.4 blocked 12.8. Another particularly effective blocking antibody identified in this study was mAb 9C8.

Example 4: Immunization of small animals with modified GST-MSP-119 and analysis of the antibodies induced

To determine whether or not the modified proteins were immunogenic, recombinant GST-MSP-119 fusion proteins were used to raise antibodies by immunisation.

Methods

Two modified proteins containing either 3 [27+31+43] or 4 [15+27+31+43] amino acid substitutions, respectively, were used to immunise rabbits and mice. The rabbits were immunised subcutaneously with MSP-119 protein in Freund's complete adjuvant and then boosted on three occasions with 200 µg of the protein in Freund's incomplete adjuvant 21, 42 and 63 days later, and serum samples were collected.

The presence and level of antibodies binding to the native MSP-1 protein in the parasite was assessed by indirect immunofluorescence using acetone-fixed smears of parasite-infected erythrocytes. The sera were diluted serially in phosphate buffered saline (PBS) and incubated on the slide for 30 min at room temperature. After washing, the slides were incubated with FITC-conjugated goat anti-rabbit or anti-mouse IgG, washed, and then examined by fluorescence microscopy.

The sera were also analysed in an MSP-1 secondary processing assay. Analysis and quantitation of secondary processing of MSP-1 in merozoite preparations was by a modification of an assay described previously (Blackman et al., 1994). Washed P. falciparum 3D7 merozoites were resuspended in ice-cold 50 mM Tris-HCl pH containing 10 mM CaCl2 and 2 mM MgCl2 (reaction buffer). Aliquots of about 1 x 10⁹ merozoites were dispensed into 1.5 ml centrifuge tubes on ice, and the parasites pelleted in a microfuge at 13,000 x g for 2 minutes at 4 °C. The supernatant was removed, and individual merozoite pellets were then resuspended on ice in 25 µl of reaction buffer further supplemented with protease inhibitor or antibodies as appropriate. Merozoites were maintained on ice for 20 min to allow antibody binding, then transferred to a 37 °C water bath for one hour to allow processing to proceed. Assays always included the following controls: a "positive processing" control sample of merozoites resuspended in reaction buffer only; a "negative processing" control sample of merozoites resuspended in reaction buffer plus 1 mM PMSF; and a zero time (0 h) control, in which processing was stopped before the 37 °C incubation step.

The processing was assayed using the western blot-based method and by a modified processing assay. Supernatants from the assays were obtained after centrifugation for 30 min at 4 °C, 13,000 x g to remove the insoluble material. The amount of MSP-133 in the supernatants was measured using an ELISA method. Fifty microlitres of diluted sample supernatants were added to the wells of an ELISA plate (NUNC F96 Cert. Maxisorp) that had been coated with 100 µl/well of 4 µg ml-1 human mAb X509 in PBS. Plates were incubated for 4 hours at 37 °C and then washed 3 times with 0.01% PBS-Tween (PBS-T). Bound MSP-133 protein was detected by addition of 100 µl of 1:4000 dilution of mouse mAb G13 for 1 hour at 37 °C, followed by washing and the addition of 100 µl of 1:1000 dilution of sheep anti-mouse IgG (H+L) HRP-conjugated antibody. After incubation for 1 h at 37 °C, the plates were washed again and HRP was detected by the addition of 100 µl of freshly prepared substrate solution (400 mg l-1 o-phenylenediamine dihydrochloride in 0.05 M phosphate buffer, 0.024 M citric acid and 0.012% H2O2) at room temperature for 20 min. The reaction was stopped by adding 10 µl of 1 M sulphuric acid and the absorbance of each sample was measured at 492 nm.
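One way the ELISA absorbances could be turned into a measure of processing inhibition is to normalise each sample to the "positive processing" (buffer only) and "negative processing" (PMSF) controls. The text does not state the formula that was used, so the sketch below is an assumption, and the function name and absorbance values are hypothetical.

```python
def percent_processing_inhibition(a_sample: float,
                                  a_positive: float,
                                  a_negative: float) -> float:
    """Inhibition of MSP-1 processing relative to the assay controls.

    a_positive: A492 for merozoites in reaction buffer only (full processing).
    a_negative: A492 for merozoites in buffer plus PMSF (no processing).
    a_sample:   A492 for merozoites incubated with the test serum or antibody.
    """
    window = a_positive - a_negative
    if window <= 0:
        raise ValueError("positive control must exceed negative control")
    return 100.0 * (a_positive - a_sample) / window


# Hypothetical absorbance readings, for illustration only.
print(percent_processing_inhibition(a_sample=0.75, a_positive=1.25, a_negative=0.25))  # 50.0
```

Under this assumed normalisation, a sample that matches the buffer-only control scores 0 and one that matches the PMSF control scores 100.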

Results

The results are shown in Figure 13. The two modified proteins produced antibodies that reacted with MSP-1 in the parasite-infected erythrocyte, with a serum titre of 1:10,000, which was an identical titre to that of a serum produced in the same way by immunisation with a recombinant protein containing the wildtype MSP-1 sequence. This indicates that the modified proteins can produce antibodies that react with the native protein. The antibodies induced by immunization were able to partially inhibit processing at the concentration used in a preliminary experiment, whereas in the control serum no antibodies that inhibited processing were present.

Example 5: Design and Synthesis of a Plasmodium Falciparum Merozoite Surface Protein-1 Gene Fragment Optimized for Pichia Pastoris Heterologous Expression

The coding sequence of the Plasmodium falciparum merozoite surface protein-1 (MSP1) 41.1 kDa processed fragment (MSP-142) has been redesigned for optimal heterologous expression in the yeast Pichia pastoris. The optimized DNA sequence was synthesized by PCR gene assembly, in the form of two fragments, MSP-133 and MSP-119. P. pastoris was transformed with an expression vector containing the optimized MSP-119 construct. Recombinant strains were shown to express high levels of non-glycosylated, properly folded MSP-119 protein.

Proteins encoded by the AT-rich genome of the human malaria parasite Plasmodium falciparum are generally poorly expressed in heterologous systems (Withers-Martinez et al., 1999). The methylotrophic yeast Pichia (Komagataella) pastoris is an appropriate system for expression of disulphide-bridged proteins such as the C-terminal fragment of the P. falciparum merozoite surface protein-1 (MSP1) (White et al., 1994; Morgan et al., 1999). In the P. pastoris expression system, it is important to avoid premature transcription termination due to AT-rich stretches (Romanos et al., 1991). Codon preferences for highly expressed genes in P. pastoris have been identified (Sreekrishna et al., EP 0 586 892 A1). Therefore, a synthetic MSP-142 gene fragment with codon usage optimized for P. pastoris expression was designed, using novel computer software (Withers-Martinez et al., 1999). It has previously been shown that the MSP-119 fragment is partially glycosylated when expressed in P. pastoris, and the carbohydrate must be enzymatically removed during purification (Morgan et al., 1999). Therefore, two specific point mutations were introduced into the synthetic MSP-142 protein sequence in order to prevent N-linked glycosylation at NxS/T sites (one potential site within the MSP-133 sequence, and a known site within the MSP-119 sequence at Asn 1).
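Potential N-linked glycosylation sites of the NxS/T type can be located by a simple pattern scan. The sketch below is illustrative only: the peptide is a made-up string rather than the MSP-142 sequence, the function name is hypothetical, and the exclusion of proline at the middle position follows the usual definition of the sequon rather than anything stated in this document.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr. The lookahead lets
# overlapping motifs be reported as well.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position of the Asn, tripeptide) for each NxS/T motif."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein.upper())]


# Hypothetical peptide containing two sequons and one Pro-blocked motif.
wild_type = "MKNYTAANISQNPS"
print(find_sequons(wild_type))   # [(3, 'NYT'), (8, 'NIS')]

# Replacing the Asn of the first motif and the Ser of the second removes both.
modified = "MKQYTAANIAQNPS"
print(find_sequons(modified))    # []
```

The second call shows that a single substitution in either the first or the third position of a motif is enough to remove it.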

The optimized MSP-142 sequence was synthesized by gene assembly polymerase chain reaction (Stemmer et al., 1995; Withers-Martinez et al., 1999), in the form of separate MSP-133 and MSP-119 fragments. The optimized MSP-119 fragment was subcloned into a novel modified Pichia expression vector, transformed into the P. pastoris host strain SMD1168, and several independent transformants were isolated. These transformants were shown to efficiently express non-glycosylated, properly folded MSP-119. Strong expression of the optimized gene was observed in low copy number transformants. A multiple copy transformant with intermediate level G418 resistance gave expression of purified MSP-119 at a level equal to the high-expressing strain previously described (Morgan et al., 1999), which contains the original P. falciparum DNA. Thus, it should be possible to obtain even higher yields from high level G418-resistant transformants of the synthetic optimized gene.

Methods: Gene Assembly The P. falciparum MSP-142 (41.1 kDa) fragment protein sequence (SWISS-PROT accession number P04933: positions 1264-1621) was first altered to eliminate N-linked glycosylation signals by 2 amino acid substitutions. The sequences NYT (in the N-terminal portion; position 1445) and NIS (at the beginning of the C-terminal fragment; position 1526) were changed to QYT and NIA respectively. The protein sequence was then reverse-translated with DNA-STAR using the S. cerevisiae codon preferences. This sequence was used as input for the CODOP program (Withers-Martinez et al., 1999). Ten random sequences were generated with this program, using a codon weighting table (Figure 14) derived from codon usage in highly expressed P. pastoris genes (Sreekrishna et al., EP 0 586 892 A1). Thus, the codon table should reflect usage in highly expressed genes, rather than average usage. The random sequence that contained the minimum number of unfavourable codons was selected, and these codons were changed manually to more preferred alternatives. The sequence was then analysed with DNA-STAR to check for AT-rich sequences that may cause transcription termination, and for direct and inverted repeats. A set of 50 overlapping oligonucleotides coding for the final sequence was then generated. This consisted of 49 oligonucleotides of length 42 nt, and one of length 48 nt. Each oligonucleotide had a 21 bp overlap with its neighbours, with no gaps. Estimated Tm values were in the range of 60 °C to 77 °C. Oligonucleotides were synthesised by Oswel (Southampton, UK) at 40 nmol scale, and supplied in deionised water without purification. Outside primers of various lengths for the amplification step were also synthesised, to give a Tm of 62 °C to 64 °C, and contained a 5'-terminal phosphate group for ligation following the amplification step. The reverse primers also included a translation termination codon (UAA in the complementary strand). All oligonucleotides were diluted to 10 µM in ddH2O before use.
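The design steps described above (checking that the N-glycosylation signals have been removed, weighted random back-translation with selection of the candidate carrying the fewest unfavourable codons, and tiling into overlapping oligonucleotides) can be sketched in Python as follows. This is an illustrative sketch only and is not the CODOP program: the codon weights are placeholder values rather than those of Figure 14, the 0.15 cut-off used to call a codon "unfavourable" is an assumption, and all function names are hypothetical.

```python
import random

# Placeholder codon weights for a few amino acids (NOT the Figure 14 values).
CODON_WEIGHTS = {
    "A": {"GCT": 0.6, "GCC": 0.3, "GCA": 0.1},
    "I": {"ATT": 0.6, "ATC": 0.4},
    "N": {"AAC": 0.8, "AAT": 0.2},
    "Q": {"CAA": 0.7, "CAG": 0.3},
    "S": {"TCT": 0.5, "TCC": 0.4, "AGT": 0.1},
    "T": {"ACT": 0.6, "ACC": 0.3, "ACA": 0.1},
    "Y": {"TAC": 0.8, "TAT": 0.2},
}
UNFAVOURABLE_BELOW = 0.15  # assumed cut-off, not taken from the specification


def find_sequons(protein):
    """Return 1-based positions of N in N-X-S/T motifs (X is not proline)."""
    hits = []
    for i in range(len(protein) - 2):
        if protein[i] == "N" and protein[i + 1] != "P" and protein[i + 2] in "ST":
            hits.append(i + 1)
    return hits


def sample_codons(protein, rng):
    """Back-translate by drawing one codon per residue according to its weight."""
    codons = []
    for aa in protein:
        table = CODON_WEIGHTS[aa]
        codons.append(rng.choices(list(table), weights=list(table.values()))[0])
    return codons


def count_unfavourable(protein, codons):
    """Count codons whose weight falls below the assumed cut-off."""
    return sum(CODON_WEIGHTS[aa][c] < UNFAVOURABLE_BELOW for aa, c in zip(protein, codons))


def best_of(protein, n_candidates=10, seed=0):
    """Generate n random candidates and keep the one with fewest unfavourable codons."""
    rng = random.Random(seed)
    candidates = [sample_codons(protein, rng) for _ in range(n_candidates)]
    return min(candidates, key=lambda codons: count_unfavourable(protein, codons))


def assembled_length(oligo_lengths, overlap):
    """Duplex length tiled by a chain of oligos that each overlap the next one."""
    return sum(oligo_lengths) - overlap * (len(oligo_lengths) - 1)


if __name__ == "__main__":
    print(find_sequons("ANYTQNISA"))   # toy peptide containing NYT and NIS
    print(find_sequons("AQYTQNIAA"))   # same peptide after N->Q and S->A changes
    print("".join(best_of("NIAQYT")))
    print(assembled_length([42] * 49 + [48], 21))
```

With the oligonucleotide set described in the text, `assembled_length([42] * 49 + [48], 21)` gives 1077 nt, which agrees with 358 codons plus one termination codon (1074 + 3 nt).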

The PCR-mediated gene assembly and amplification were carried out as described (Stemmer et al., 1995; Withers-Martinez et al., 1999), using a Biometra cycler, in thin-walled 200 µL tubes, under the following conditions.

Gene assembly reactions (Reaction 1):

[...] µL volume
2 units Vent polymerase (New England Biolabs)
0.4 mM dNTPs
1 x Vent polymerase buffer
Oligonucleotide mix containing each oligonucleotide at 200 nM

Cycles: 32 cycles (2 h 33 min)
denaturation 94 °C 30 s
annealing 52 °C 30 s
extension 72 °C 3 min

Three fragments of the MSP-142 (41.1 kDa) region were synthesised separately with different outside primers and subsets of the 50 oligonucleotide set:

N-terminal fragment (bp 1-423) 21 oligonucleotides
middle fragment (bp 337-786) 22 oligonucleotides
C-terminal fragment (bp 787-1074) 14 oligonucleotides

The C-terminal fragment produces a 10.6 kDa fragment (MSP-119). The N-terminal and middle fragments, which overlap between positions 337 and 423, were subsequently spliced together at the BglII site (371-376) to give a 786 bp fragment that encodes the 30.5 kDa MSP-133 protein.
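The fragment coordinates given above can be cross-checked with a few lines of arithmetic. The following Python sketch uses only the base-pair positions stated in the text; the helper names are hypothetical and the check says nothing about the actual sequences.

```python
# Fragment coordinates (1-based, inclusive) as stated in the text.
FRAGMENTS = {
    "N-terminal": (1, 423),
    "middle": (337, 786),
    "C-terminal": (787, 1074),
}
BGLII_SITE = (371, 376)


def span_length(span):
    """Number of base pairs covered by an inclusive (start, end) span."""
    start, end = span
    return end - start + 1


def overlap(a, b):
    """Return the shared (start, end) of two spans, or None if they are disjoint."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


def contains(outer, inner):
    """True if the inner span lies entirely within the outer span."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


shared = overlap(FRAGMENTS["N-terminal"], FRAGMENTS["middle"])
msp133 = (FRAGMENTS["N-terminal"][0], FRAGMENTS["middle"][1])
msp133_codons = span_length(msp133) // 3
msp119_codons = span_length(FRAGMENTS["C-terminal"]) // 3

print("N-terminal/middle overlap:", shared, span_length(shared), "bp")
print("BglII site lies inside the overlap:", contains(shared, BGLII_SITE))
print("MSP-133:", span_length(msp133), "bp,", msp133_codons, "codons")
print("MSP-119:", span_length(FRAGMENTS["C-terminal"]), "bp,", msp119_codons, "codons")
print("Total codons:", msp133_codons + msp119_codons)
```

The overlap is 87 bp and contains the BglII site, so either cloned fragment can supply that site for splicing. The codon counts (262 and 96, 358 in total) agree with the 358 residues of positions 1264-1621 and with the lengths given for the MSP-133 and MSP-119 proteins in the sequence listing.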

Amplification reactions (Reaction 2):

100 µL volume
[...] µL aliquot of the gene assembly reaction
4 units Vent polymerase
0.4 mM dNTPs
1 x Vent polymerase buffer
1 µM outside primers

Cycles: 32 cycles (2 h 55 min)
denaturation 94 °C 45 s
annealing 52 °C 45 s
extension 72 °C 3 min
final extension 72 °C 5 min

The PCR products were then purified by filtration with Centricon-100 units (Amicon), and cloned directly into the vectors by blunt-end ligation overnight at 16 °C with T4 DNA ligase.
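The cycling programmes of Reactions 1 and 2 can be compared with the total run times quoted for them. The sketch below sums only the programmed hold times; attributing the remainder to temperature ramping between steps is an assumption, since the specification does not break the run time down. The function name is hypothetical.

```python
def hold_minutes(cycles, step_seconds, final_extension_s=0):
    """Programmed hold time in minutes, ignoring temperature ramping."""
    return (cycles * sum(step_seconds) + final_extension_s) / 60


# Step durations in seconds: denaturation, annealing, extension.
reaction1 = hold_minutes(32, [30, 30, 180])
reaction2 = hold_minutes(32, [45, 45, 180], final_extension_s=300)

stated1 = 2 * 60 + 33  # run time quoted for Reaction 1 (2 h 33 min)
stated2 = 2 * 60 + 55  # run time quoted for Reaction 2 (2 h 55 min)

for name, hold, stated in [("Reaction 1", reaction1, stated1),
                           ("Reaction 2", reaction2, stated2)]:
    extra = stated - hold
    print(f"{name}: hold time {hold:.0f} min, quoted {stated} min, "
          f"unaccounted {extra:.0f} min")
```

The hold times come to 128 min and 149 min, leaving 25 min and 26 min respectively. The two programmes are therefore mutually consistent if roughly the same overhead per cycle is assumed for both.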

The synthetic MSP-119 gene was cloned directly into a P. pastoris expression vector. The modified pPIC9K-HXa vector, containing a His6 tag and factor Xa cleavage site (see Figure 15) inserted in the pPIC9K SnaBI site, had been digested with PmlI and treated with calf alkaline phosphatase. The HXa vector had been previously created by insertion of a 36 bp synthetic oligonucleotide, containing the His6 tag, factor Xa cleavage site, and PmlI restriction site, into the SnaBI site of the pPIC9K vector.

The N-terminal and middle fragment PCR products were cloned into the SmaI site of the dephosphorylated pUC118 vector. Plasmid clones containing inserts were sequenced.

Clones with the correct synthetic sequence were then digested and the two fragments were gel-purified. The N-terminal fragment clones were digested with EcoRI and BglII, and the middle fragment clones were digested with HindIII and BglII. The recombinant fragments were purified on an agarose gel, and eluted with a QIAGEN extraction kit. The purified N-terminal and middle fragments were then spliced together by ligation into a pUC118 vector that had been digested with HindIII and EcoRI and treated with calf alkaline phosphatase. This created the complete synthetic MSP-133 coding sequence.

Methods: Expression and Purification The methylotrophic yeast Pichia (Komagataella) pastoris strain SMD1168 was transformed by electroporation as described previously (Morgan et al., 1999). In addition, some G418-resistant clones were isolated using Hybond-N+ membranes (Fairlie et al., 1999).

Expression screening of transformants was performed by growing 10 ml cultures in buffered minimal glucose medium. Cells were harvested and resuspended in 10 ml buffered minimal methanol medium at an OD600 of 1.0 and grown overnight to a final OD600 of [...] to 3.0. Cells were removed by centrifugation, and 1.2 ml of the supernatant medium was precipitated for 30 min on ice with 15% trichloroacetic acid. The samples were centrifuged for 30 min at 14000 rpm at 4 °C in a microfuge, and the protein pellets were washed twice with cold acetone. Samples were resuspended in 12 µl ddH2O, and 5 µl was electrophoresed, after reduction with DTT, on NOVEX pre-poured acrylamide gels according to manufacturer's instructions. NOVEX 4-12% acrylamide gradient, or 10% acrylamide, Bis/Tris gels in MES buffer were used. Protein gels were stained with Coomassie colloidal Brilliant Blue stain (Sigma).
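The volumes used in the precipitation step fix how much culture supernatant each gel lane represents. The following sketch works this out from the figures in the text (1.2 ml precipitated, 12 µl resuspension, 5 µl loaded). The final function is a rough upper estimate only: it assumes complete recovery during precipitation and takes a titre in mg/L as a hypothetical input.

```python
SUPERNATANT_ML = 1.2     # volume precipitated with trichloroacetic acid
RESUSPENSION_UL = 12.0   # volume of water used to redissolve the pellet
LOADED_UL = 5.0          # volume loaded per gel lane

# Fold concentration achieved by precipitation and resuspension.
concentration_factor = SUPERNATANT_ML * 1000 / RESUSPENSION_UL

# Volume of original supernatant represented by one lane.
supernatant_per_lane_ml = LOADED_UL / RESUSPENSION_UL * SUPERNATANT_ML


def protein_per_lane_ug(titre_mg_per_l):
    """Upper estimate of protein per lane, assuming no loss on precipitation.

    1 mg/L equals 1 microgram per ml, so micrograms = ml x mg/L.
    """
    return supernatant_per_lane_ml * titre_mg_per_l


print(f"Concentration factor: about {concentration_factor:.0f}-fold")
print(f"Supernatant per lane: about {supernatant_per_lane_ml:.2f} ml")
print(f"Protein per lane at 16 mg/L: about {protein_per_lane_ug(16):.0f} micrograms")
```

Each lane therefore corresponds to about 0.5 ml of culture supernatant, concentrated roughly 100-fold.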

Homogeneously purified MSP-119 was obtained as described previously (Morgan et al., 1999), except that enzymatic deglycosylation was omitted for the synthetic gene products.

Methods: NMR One-dimensional and 2-dimensional {1H/15N}-HSQC spectra were acquired as described previously (Morgan et al., 1999), at 25 °C, at sample concentrations of 1.1-2.5 mM.

Results The sequences of the synthetic DNA fragments, and the resulting predicted protein products, are shown in Figure 15. A summary of the resulting improvements to the sequence is given in Table 3.

                                       Total    P. pastoris        Unfavourable   AT content
                                       codons   preferred codons   codons         (%)
P. falciparum MSP1 41.1 kDa fragment   358      140                28             74
Synthetic 41.1 kDa fragment            358      276                0              58

Table 3. Codon usage

PCR-gene assembly reactions for the MSP-133 (two sections) and MSP-119 synthetic fragments are shown on agarose gels in Figure 16. This demonstrated that a single, correct size major product was observed in each case. The PCR products were subcloned, screened, and sequenced as described in the Methods section.
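The codon counts in Table 3 are easier to compare as proportions of the 358 codons. The sketch below converts the tabulated counts to percentages and also shows one way of computing the AT content of a nucleotide sequence. The counts are copied from Table 3; the function names are hypothetical, and the short test sequence is simply the first ten bases of the synthetic gene as given in the sequence listing.

```python
# Codon counts copied from Table 3.
TABLE_3 = {
    "P. falciparum MSP1 41.1 kDa fragment": {"total": 358, "preferred": 140, "unfavourable": 28},
    "Synthetic 41.1 kDa fragment": {"total": 358, "preferred": 276, "unfavourable": 0},
}


def percent(part, total):
    """Percentage of part in total, rounded to one decimal place."""
    return round(100 * part / total, 1)


def at_content(dna):
    """Percentage of A and T (or U) bases in a nucleotide sequence."""
    dna = dna.upper()
    return round(100 * sum(dna.count(base) for base in "ATU") / len(dna), 1)


for name, row in TABLE_3.items():
    print(f"{name}: {percent(row['preferred'], row['total'])}% preferred, "
          f"{percent(row['unfavourable'], row['total'])}% unfavourable")

print(at_content("gctgttactc"))
```

On these figures the proportion of preferred codons rises from 39.1% in the native sequence to 77.1% in the synthetic gene, while unfavourable codons fall from 7.8% to none.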

P. pastoris was transformed with the synthetic MSP-119 construct in the modified pPIC9K expression vector (pPIC9K-HXa; Figure 15). Expression of the synthetic MSP-119 product in three independent transformants is shown on a protein gel in Figure 17.

The protein samples were prepared by trichloroacetic acid precipitation from culture supernatants as described in the Methods section. This demonstrated that a single, major product was present in each sample, corresponding to the expected migration of the synthetic MSP-119 protein. This migrated slightly more slowly than the control sample, which as described previously (Morgan et al., 1999) has a shorter N-terminal tag sequence. There was no trace of heterogeneous, slowly migrating recombinant protein that would result from glycosylation. Therefore, non-glycosylated, synthetic MSP-119 is efficiently expressed by the transformed yeast. The yield (measured by UV absorbance) of purified MSP-119 was 16 mg/L for low copy number transformants (resistant to 0.25 mg/ml G418), and increased to 24 mg/L for intermediate G418 resistance (resistant to [...] mg/ml G418). This can be compared with yields of 1-2 mg/L for low copy number transformants of P. pastoris with the original Plasmodium falciparum coding sequence, before isolation of a highly G418-resistant strain (Morgan et al., 1999). This indicated that the synthetic MSP-119 construct is advantageous for recombinant protein expression, and that further improvement would result from isolation of higher copy number transformants.
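The yields quoted above can be expressed as fold improvements. The following sketch uses only the figures given in the text (16 and 24 mg/L for the synthetic gene, 1-2 mg/L for the original coding sequence); the function name is hypothetical.

```python
def fold_range(new_yield, reference_low, reference_high):
    """Fold improvement of new_yield over a reference range, as (min, max)."""
    return new_yield / reference_high, new_yield / reference_low


LOW_COPY = 16.0       # mg/L, synthetic gene, low copy number transformants
INTERMEDIATE = 24.0   # mg/L, synthetic gene, intermediate G418 resistance

low_copy_gain = fold_range(LOW_COPY, 1.0, 2.0)
copy_number_gain = INTERMEDIATE / LOW_COPY

print("Synthetic versus original gene at low copy number:", low_copy_gain)
print("Intermediate versus low G418 resistance:", copy_number_gain)
```

At low copy number the synthetic gene thus gives an 8- to 16-fold higher yield than the original coding sequence, and the intermediate-resistance transformant adds a further 1.5-fold.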

One-dimensional proton NMR experiments demonstrated that the synthetic MSP-119 protein spectrum was very similar to that of the previously studied protein (Morgan et al., 1999), and represented a correctly folded protein (data not shown). This was further confirmed by a {1H/15N}-HSQC spectrum (Figure 18), which also shows that the structure of the synthetic product is identical to the previously studied protein, except for slight differences at the N-terminus which are consistent with the presence of a distinct N-terminal tag sequence, and the S3->A mutation at the glycosylation site. Backbone NH proton and 15N chemical shifts for the original P. falciparum sequence product have been previously presented (Morgan et al., 1999). The similarity between the two spectra, outside of the N-terminal region, is strong evidence that both protein forms are in a structurally similar, correctly folded state.

Throughout this specification and the claims which follow, unless the context requires otherwise, the word "comprise", and variations such as "comprises" and "comprising", will be understood to imply the inclusion of a stated integer or step or group of integers or steps but not the exclusion of any other integer or step or group of integers or steps.

The reference to any prior art in this specification is not, and should not be taken as, an acknowledgment or any form of suggestion that that prior art forms part of the common general knowledge in Australia.

References

Abseher, Horstink, Hilbers, C. W. Nilges, M. (1998). Essential spaces defined by NMR structure ensembles and molecular dynamics simulation show significant overlap. Proteins: Structure, Function and Genetics, 31, 370-382.

Barbato, Ikura, Kay, L. Pastor, R. W. Bax, A. (1992). Backbone dynamics of calmodulin studied by 15N relaxation using inverse detected 2-dimensional NMR spectroscopy: the central helix is flexible. Biochemistry, 31, 5269-5278.

Bersch, Hernandez, Marion, D. Arland, G. A. (1998). Solution structure of the epidermal growth factor (EGF)-like module of human complement C1r, an atypical member of the EGF family. Biochemistry, 37, 1204-1214.

Blackman, M. J. Holder, A. A. (1992). Secondary processing of the Plasmodium falciparum merozoite surface protein-1 (MSP-1) by a calcium-dependent membrane-bound serine protease: shedding of MSP-133 as a noncovalently associated complex with other fragments of the MSP-1. Mol. Biochem. Parasitol. 50, 307-316.

Blackman, M. Heidrich, Donachie, McBride, J. S. Holder, A. A. (1990). A single fragment of a malaria merozoite surface protein remains on the parasite surface during red cell invasion and is the target of invasion-inhibiting antibodies. J. Exp. Med. 172, 379-382.

Blackman, M. J., Chappel, J. A., Shai, S. and Holder, A. A. (1993). A conserved parasite serine protease processes the Plasmodium falciparum merozoite surface protein-1 (MSP-1). Mol. Biochem. Parasitol. 62, 103-114.

Blackman, M. Scott-Finnigan, T. Shai, S. Holder, A. A. (1994). Antibodies inhibit the protease-mediated processing of a malaria merozoite surface protein. J. Exp. Med. 180, 389-393.

Blackman, Ling, I. Nicholls, S.C. Holder, A.A. (1991). Proteolytic processing of the Plasmodium falciparum merozoite surface protein-1 produces a membrane-bound fragment containing two epidermal growth factor-like domains. Mol. Biochem. Parasitol. 49, 29-34.

Brandstetter, Bauer, Huber, Lollar, P. Bode, W. (1995). X-ray structure of clotting factor IXa: active site and module structure related to Xase activity and hemophilia B. Proc. Nat. Acad. Sci. USA, 92, 9796-9800.

Burghaus, P. A. Holder, A. A. (1994). Expression of the 19-kilodalton carboxy-terminal fragment of the Plasmodium falciparum merozoite surface protein-1 in Escherichia coli as a correctly folded protein. Mol. Biochem. Parasitol. 64, 165-169.

Campbell, I.D. Downing, A.K. (1998). NMR of modular proteins. Nat. Struct. Biol. Suppl 496-499.

Clare, J. J. Romanos, M. A. (1995). Expression of Cloned Genes in the Yeasts Saccharomyces cerevisiae and Pichia pastoris. Methods in Molec. Cell Biol. 5, 319-329.

Clore, G. M. Gronenborn, A. M. (1998). Determining the structures of large proteins and protein complexes by NMR. Trends in Biotechnology, 16, 22-34.

Daly, T. Burns, J. M. Long, C. A. (1992). Comparison of the carboxyl terminal, cysteine-rich domain of the merozoite surface protein-1 from several strains of Plasmodium yoelii. Mol. Biochem. Parasitol. 52, 279-282.

Del Portillo, H. Longacre, Khouri, E. David, P. H. (1991). Primary structure of the merozoite surface antigen-1 of Plasmodium vivax reveals sequences conserved between different Plasmodium species. Proc. Natl. Acad. Sci. USA 88, 4030-4034.

Diggs, Ballou, W.R. Miller, L.H. (1993). The major merozoite surface protein as a malaria vaccine target. Parasitol Today, 9, 300-302.

Doreleijers, J. Rullman, J. A. C. Kaptein, R. (1998). Quality assessment of NMR structures: a statistical approach. J. Mol. Biol. 281, 149-164.

Downing, A. Knott, Werner, J. Cardy, C. Campbell, I. D. Handford, P. A. (1996). Solution structure of a pair of calcium binding epidermal growth factor-like domains: implications for the Marfan syndrome and other genetic disorders. Cell, 86, 597-605.

Egan, Waterfall, Pinder, Holder, A. Riley, E. (1997). Characterization of human T- and B-cell epitopes in the C-terminus of Plasmodium falciparum merozoite surface protein 1: evidence for poor T-cell recognition of polypeptides with numerous disulfide bonds. Infect. Immun. 65, 3024-3031.

Fairlie, Russell, Zhang, and Breit, S.N. (1999). Screening Procedure for Pichia pastoris Clones Containing Multiple Copy Gene Inserts. BioTechniques 26, 1042-1044.

Gibson, Tucker, J. Kaslow, Krettli, Collins, W. Kiefer, M. C., Bathurst, I. C. Barr, P. J. (1992). Structure and expression of the gene for Pv200, a major blood-stage surface antigen of Plasmodium vivax. Mol. Biochem. Parasitol. 50, 325-334.

Guevara Patiño, J. Holder, A. McBride, J. S. Blackman, M. J. (1997). Antibodies that inhibit malaria merozoite surface protein-1 processing and erythrocyte invasion are blocked by naturally acquired human antibodies. J. Exp. Med. 186, 1689-1699.

Holder, A. Blackman, M. Burghaus, P. Chappel, J. Ling, I. McCallum-Deighton, N. Shai, S. (1992). A malaria merozoite surface protein (MSP-1). Structure, processing and function. Mem. Inst. Oswaldo Cruz, 87, Suppl III, 37-42.

Holder, A. Lockyer, M. Odink, K. Sandhu, J. Riveros-Moreno, Nicholls, S. Hillman, Davey, L. Tizard, M. L. Schwarz, R. T. Freeman, R. R. (1985). Primary structure of the precursor to the three major surface antigens of Plasmodium falciparum merozoites. Nature 317, 270-273.

Kay, L. Torchia, D. A. Bax, A. (1989). Backbone dynamics of proteins as studied by 15N inverse detected heteronuclear NMR spectroscopy. Application to staphylococcal nuclease. Biochemistry, 28, 8972-8979.

Kraulis, P. J. (1991). Molscript: a program to produce both detailed and schematic plots of protein structures. J. Appl. Cryst. 24, 946-950.

Laroche, Storme, de Meutter, Messens, J. Lauwereys, M. (1994). High-level secretion and very efficient isotopic labeling of tick anticoagulant peptide (TAP) expressed in the methylotrophic yeast, Pichia pastoris. Bio/Technology, 12, 1119-1124.

Laskowski, R. Rullmann, J. A. MacArthur, M. Kaptein, R. Thornton, J. M. (1996). AQUA and PROCHECK-NMR: Programs for checking the quality of protein structures solved by NMR. J. Biomol. NMR. 8, 477-486.

McBride, J. S. Heidrich, (1987). Fragments of the polymorphic Mr 185,000 glycoprotein from the surface of isolated Plasmodium falciparum merozoites form an antigenic complex. Mol. Biochem. Parasitol. 23, 71-84.

McDonald, I. K. Thornton, J. M. (1994). Satisfying hydrogen bonding potential in proteins. J. Mol. Biol. 238, 777-793.

Morgan, Birdsall, Frenkiel, Gradwell, Burghaus, Syed, Uthaipibull, Holder, and Feeney, J. (1999). Solution structure of an EGF module pair from the Plasmodium falciparum Merozoite Surface Protein 1. J. Mol. Biol. 289, 113-122.

Mrema, J. E. S. G. Langreth, R. C. Jost, K. H. Rieckmann and H. Heidrich (1982). Plasmodium falciparum: isolation and purification of spontaneously released merozoites by nylon sieve membranes. Exp. Parasitol. 54, 285.

Nicholls, Sharp, K. A. Honig, B. (1991). Protein folding and association: insights from the interfacial and thermodynamic properties of hydrocarbons. Proteins, 11, 281-296.

Nilges, Kuszewski, J. Brünger, A. T. (1991). Sampling properties of simulated annealing and distance geometry, in Computational Aspects of the Study of Biological Macromolecules by NMR (C. Hoch, ed.), NY, Plenum Press, 451-455.

Perrin, S. Gilliland, G. (1990). Site specific mutagenesis using asymmetric polymerase chain reaction and a single mutant primer. Nucl. Acids Res. 18, 7433-7438.

Pirson, P. J. Perkins, M. E. (1985). Characterization with monoclonal antibodies of a surface antigen of Plasmodium falciparum merozoites. J. Immunol. 134, 1946-1951.

Polshakov, V. Frenkiel, T. Birdsall, Soteriou, A. Feeney, J. (1995). Determination of stereospecific assignments, torsion-angle constraints and rotamer populations in proteins using the program AngleSearch. J. Magn. Reson. Series B, 108, 31-43.

Polshakov, V. Williams, Gargaro, Frenkiel, T. Westley, B. Chadwick, M. May, F. E. B. Feeney, J. (1997). High resolution solution structure of the human breast cancer oestrogen-inducible pNR-2/pS2: a single trefoil domain. J. Mol. Biol. 267, 418-432.

Qari, Shi, Y. Goldman, I. Nahlen, B. Tibayrenc, Lal, A. A. (1998). Predicted and observed alleles of Plasmodium falciparum merozoite surface protein 1 (MSP-1), a potential malaria vaccine antigen. Mol. Biochem. Parasitol. 92(2), 241-252.

Richardson, J. S. (1981). The Anatomy and Taxonomy of Protein structure. Adv. Prot. Chem. 34, 167-339.

Romanos, Makoff, Fairweather, Beesley, Slater, Rayment, Payne, and Clare, J.J. (1991). Expression of Tetanus Toxin Fragment-C in Yeast: Gene Synthesis is Required to Eliminate Fortuitous Polyadenylation Sites in AT-Rich DNA. Nucleic Acids Res. 19, 1461-1467.

Ryckaert, Ciccotti, G. Berendsen, H. J. C. (1977). Numerical integration of Cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanes. J. Comput. Phys. 23, 327-351.

Stemmer, Crameri, Ha, Brennan, and Heyneker, H.L. (1995) Single-step assembly of a gene and entire plasmid from large numbers of oligodeoxyribonucleotides. Gene, 164: 49-53.

Stoute, J. A. Ballou, W.R. (1998). The current status of malaria vaccines. BIODRUGS 10, 123-136.

White, Kempi, and Komives, (1994). Expression of highly disulfide-bonded proteins in Pichia pastoris. Structure, 2, 1003-1005.

Withers-Martinez, Carpenter, Hackett, Ely, Sajid, Grainger, and Blackman, M.J. (1999). PCR-based gene synthesis as an efficient approach for expression of the A+T-rich malaria genome. Protein Engineering, 12, 1113-1120.

EDITORIAL NOTE FOR 41330/00 The following sequence listing is part of the description The claims follow on page 103

SEQ

<110> Medical Research Counc <120> Vaccine <130> P006720wo <160> <170> PatentIn version <210> 1 <211> 96 <212> PRT <213> Plasmodium falciparum <400> 1 Asn Ile Ser Gin His Gin Cys 1 5 Gly Cys Phe Arg His Leu Asp Asn Tyr Lys Gin Glu Gly Asp Cys Asn Glu Asn Asn Gly Gly 55 Glu Asp Ser Gly Ser Asn Gly 70 Pro Asp Ser Tyr Pro Leu Phe <210> 2 <211> 376 <212> PRT <213> Plasmodium falciparum <400> 2 Ala Val Thr Pro Ser Val Ile 1 5 Glu Tyr Glu Val Leu Tyr Leu Leu Lys Lys Gin Leu Glu Asn Lys Asp Ile Leu Asn Ser Arg 55 Val Leu Glu Ser Asp Leu Ile 70 UENCE LISTING il Val Glu Lys 40 Cys Lys Asp Asp Lys Asn 40 Phe Pro Lys Lys Gin Cys 10 Arg Glu Glu Cys 25 Cys Val Glu Asn Asp Ala Asp Ala Lys Ile Thr Cys 75 Gly Ile Phe Cys 90 Pro Lys Pro Lys Glu Ser Gin Cys Asn Cys Cys Ser Ile Tyr Val Phe Ser Asn Ser Leu Leu Pro Thr Thr Glu Thr Lys Ser Asn Asn Pro 25 Val Asn Tyr Ile Leu Ser Lys 10 Leu Ala Gly Val Met Thr Phe Asn Lys Arg Glu Asn Lys Asp Leu Thr 75 Glu Arg Asn Lys Ser Asn Ser Val Asn Asn Tyr Val Val Lys Phe Leu Ilie Asn Phe 115 Lys Tyr Lys 130 Gin Gly Giu 145 Leu Tyr Lys Giu Ala Lys Lys Ile Lys 195 Asp Phe Lys 210 Asp Tyr Asn 225 Val Phe Giu Asn Leu Gin Gin Cys Pro 275 Giu Cys Lys 290 Giu Asn Pro 305 Asp Ala Lys Thr Cys Glu Phe Cys Ser 355 Met Leu Ile 370 <210> 3 <211> 394 Lys Ser 100 Ala Ser Asn Thr Val1 180 Gi u Lys His Asn Giy 260 Gin Cys Asn Cys Cys 340 Ser Le u Asp Ser Asn Asp Giu Val 165 Leu Leu Asn As n Leu 245 Met Asn Leu Pro Thr 325 Thr Ser Tyr Pro Tyr Lys Tyr Asn Tyr Asp Val Leu 120 Leu Asp Ser 135 Lys Tyr Leu 150 Asn Asp Lys Asn Tyr Thr Asn Tyr Leu 200 Asn Asn Phe 215 Asn Leu Leu 230 Ala Lys Thr Leu Asn Ile Ser Gly Cys 280 Leu Asn Tyr 295 Thr Cys Asn 310 Glu Giu Asp Lys Pro Asp Asn Phe Leu 360 Ser Phe Ile 375 Phe Leu 90 Ile Lys 105 Gly Tyr Ile Lys Pro Phe Ile Asp 170 Tyr Giu 185 Lys Thr Val Giy Thr Lys Val Leu 250 Ser Gin 265 Phe Arg Lys Gin Giu Asn Ser Giy 330 Ser Tyr 345 Gly Ile Asn Asp Tyr Lys Leu 155 Leu Lys Ile Ile Phe 235 Ser His His Giu Asn 315 Ser Pro Ser Lys Ser 
Lys Tyr 140 Asn Phe Ser Gin Ala 220 Leu Asn Gin Le u Gly 300 Gly Asn Le u Phe Giu Ile Ile 125 Ile Asn Val1 Asn Asp 205 Asp Ser Leu Cys Asp 285 Asp Gly Gi y Phe Leu 365 Lys Asp 110 Leu Asn Ile Ile Val1 190 Lys Leu Thr Leu Val1 270 Giu Lys Cys Lys Asp 350 Leu Arg Thr Ser Asp Giu His 175 Giu Leu Ser Gly Asp 255 Lys Arg Cys Asp Lys 335 Gly Ile Asp Asp Glu Lys Thr 160 Leu Val1 Ala Thr Met 240 Gly Lys Giu Val1 Ala 320 Ile Ile Leu fee* .0~ .0 00 S0S S0 :*,so 00.40S 000.

000 <212> PRT <213> Plasmodium falciparum <400> 3 Ala Ile Ser Val Thr Met Asp Asn Ile Leu Ser Gly Phe Glu Asn Glu 1 5 10 Tyr Asp Val Ile Tyr Leu Lys Pro Leu Ala Gly Val Tyr Arg Ser Leu 25 Lys Lys Gin Ile Glu Lys Asn Ile Phe Thr Phe Asn Leu Asn Leu Asn 40 Asp Ile Leu Asn Ser Arg Leu Lys Lys Arg Lys Tyr Phe Leu Asp Val 55 Leu Glu Ser Asp Leu Met Gin Phe Lys His Ile Ser Ser Asn Glu Tyr 70 75 Ile Ile Glu Asp Ser Phe Lys Leu Leu Asn Ser Glu Gin Lys Asn Thr 90 Leu Leu Lys Ser Tyr Lys Tyr Ile Lys Glu Ser Val Glu Asn Asp Ile 100 105 110 Lys Phe Ala Gin Glu Gly Ile Ser Tyr Tyr Glu Lys Val Leu Ala Lys 115 120 125 Tyr Lys Asp Asp Leu Glu Ser Ile Lys Lys Val Ile Lys Glu Glu Lys 130 135 140 Glu Lys Phe Pro Ser Ser Pro Pro Thr Thr Pro Pro Ser Pro Ala Lys 145 150 155 160 Thr Asp Glu Gin Lys Lys Glu Ser Lys Phe Leu Pro Phe Leu Thr Asn 165 170 175 Ile Glu Thr Leu Tyr Asn Asn Leu Val Asn Lys Ile Asp Asp Tyr Leu 180 185 190 Ile Asn Leu Lys Ala Lys Ile Asn Asp Cys Asn Val Glu Lys Asp Glu 195 200 205 Ala His Val Lys Ile Thr Lys Leu Ser Asp Leu Lys Ala Ile Asp Asp 210 215 220 Lys Ile Asp Leu Phe Lys Asn Pro Tyr Asp Phe Glu Ala Ile Lys Lys 225 230 235 240 Leu Ile Asn Asp Asp Thr Lys Lys Asp Met Leu Gly Lys Leu Leu Ser 245 250 255 Thr Gly Leu Val Gin Asn Phe Pro Asn Thr Ile Ile Ser Lys Leu Ile 260 265 270 Glu Gly Lys Phe Gin Asp Met Leu Asn Ile Ser Gin His Gin Cys Val 275 280 285 Lys Lys Gin Cys Pro Glu Asn Ser Gly Cys Phe Arg His Leu Asp Glu 290 295 300 Arg Glu Glu Cys Lys Cys Leu Leu Asn Tyr Lys Gin Glu Gly Asp Lys 305 310 315 320 Cys Val Glu Asn Pro Asn Pro Thr Cys Asn Glu Asn Asn Gly Gly Cys 325 330 335 Asp Ala Asp Ala Thr Cys Thr Glu Glu Asp Ser Gly Ser Ser Arg Lys 340 345 350 Lys Ile Thr Cys Glu Cys Thr Lys Pro Asp Ser Tyr Pro Leu Phe Asp 355 360 365 Gly Ile Phe Cys Ser Ser Ser Asn Phe Leu Gly Ile Ser Phe Leu Leu 370 375 380 Ile Leu Met Leu Ile Leu Tyr Ser Phe Ile 385 390 <210> 4 <211> 18 <212> DNA <213> Artificial <220> <223> Primer <400> 4 caccatcatc atcatcac 18 S.<210> <211> 47 
<212> PRT *<213> Plasmodium falciparum <400> Asn Ile Ser Gin His Gin Cys Val Lys Lys Gin Cys Pro Gin Asn Ser 1 5 10 SGly Cys Phe Arg His Leu Asp Glu Arg Glu Glu Cys Lys Cys Leu Leu 20 25 Asn Tyr Lys Gin Glu Gly Asp Lys Cys Val Glu Asn Pro Asn Pro 35 40 <210> 6 <211> 49 <212> PRT <213> Plasmodium falciparum <400> 6 Thr Cys Asn Glu Asn Asn Gly Gly Cys Asp Ala Asp Ala Lys Cys Thr 1 5 10 Glu Glu Asp Ser Gly Ser Asn Gly Lys Lys Ile Thr Cys Glu Cys Thr 25 Lys Pro Asp Ser Tyr Pro Leu Phe Asp Gly Ile Phe Cys Ser Ser Ser 40 Asn <210> 7 <211> 48 <212> PRT <213> Plasmodium vivax <400> 7 Thr Met Ser Ser Glu His Thr Cys Ile Asp Thr Asn Val Pro Asp Asn 1 5 10 Ala Ala Cys Tyr Arg Tyr Leu Asp Gly Thr Glu Glu Trp Arg Cys Leu 25 Leu Thr Phe Lys Glu Glu Gly Gly Lys Cys Val Pro Ala Ser Asn Val 40 <210> 8 <211> <212> PRT <213> Plasmodium vivax <400> 8 Thr Cys Lys Asp Asn Asn Gly Gly Cys Ala Pro Glu Ala Glu Cys Lys 1 5 10 *Met Thr Asp Ser Asn Lys Ile Val Cys Lys Cys Thr Lys Glu Gly Ser 20 25 Glu Pro Leu Phe Glu Gly Val Phe Cys Ser Ser Ser Ser 40 <210> 9 <211> 53 <212> PRT <213> Artificial <220> <223> Legf human epidermal <400> 9 Asn Ser Tyr Pro Gly Cys Pro Ser Ser Tyr Asp Gly Tyr Cys Leu Asn 1 5 10 Gly Gly Val Cys Met His Ile Glu Ser Leu Asp Ser Tyr Thr Cys Asn 25 Cys Val Ile Gly Tyr Ser Gly Asp Arg Cys Gin Thr Arg Asp Leu Arg 40 Trp Trp Glu Leu Arg <210> <211> 1077 <212> DNA <213> Artificial <220> <223> MSP-142 gene <400> qctgttactc catctgttat cgataacatt ctgtctaaga ttgagaacga atacgaggtc ttgtacttga gtcatgactt aact tcaaga tacgttgtta tcttacaact ggttactaca atcaatgata ttgtacaaga ttgcaataca aagaccattc gatttgtcca gttttcqaga atgttgaaca ttcagacatc gataagtgtg gacgctaagt actaagccag <210> 11 <211> 333 <212> DNA agcctctggc tcaacqttaa a cg ttct g ga aggacccata acattaagga agatcttgtc agcaaggaga ctgttaacga cttacgaaaa aggataagct ctgattacaa acttggctaa ttgcccaaca tggacgagag ttgagaaccc gcaccgaaga cggtgtctac cgtcaaggac gtctgacttg caagttcctg ctccattgat tgagaagtac gaatgaaaag taagatcgat gtctaacgtc ggctqatttc 
ccacaacaat gactgtcctg ccaatgcgtt agaagaatgt aaaccctacc agactctggt agatccctga attttgaact attccataca aacaaggaga actgatatca aagtctgact tacctgccat ctgt togtca gaggtcaaga aagaagaaca ctgttgacta tccaacctgt aagaagcaat aagtgtctgt t g ta acg a ga tctaacggaa agaagcaact ccagattcaa aggatttgac agagagacaa.

acttcgctaa tggattccat tcctgaataa ttcatttgga tcaaggaatt acaacttcgt agttcctgtc tggatggtaa gtccacaaaa tgaactacaa acaacggtgg agaagattac ggaaaacaac caagagagaa ttcttctaac gttcttgtcc cgacgtcctg caagaagtac cat tgaaact agccaaggtc gaactacctc tggtatcgct taccggtatg cttgcagggt ctccggatgt gcaggaaggt atgcgacgct ttgcgaatgt 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1077 actcttaccc tttgttcgat ggaatcttct gttcttcctc taactaa <213> <220> <223> <400> Artificial MSP-119 11 taccaccatc atcatcatca cattgaaggt agacacaaca ttgcccaaca ccaatgcgtt aaqaagcaat gtccacaaaa ctccggatgt ttcagacatc tggacqagag agaagaatgt aagtgtctgt tgaactacaa gcaggaaggt gataagtqtg ttgagaaccc aaaccctacc tgtaacgaga acaacggtqg atgcgacgct gacgctaagt gcaccgaaga agactctggt tctaacggaa agaagattac ttgcgaatgt actaagccag actcttaccc tttgttcqat ggaatcttct gttcttcctc taactaagtg gta 120 180 240 300 333 <210> 12 <211> 108 <212> PRT <213> Artificial <220> <223> MSP-119 <400> 12 Tyr His His His H 1 5 His Gin Cys Val L His Leu Asp Glu A 35 Glu Gly Asp Lys C 50 Asn Giy Gly Cys A~ Ser Asn Gly Lys L is ys His His Ile Glu Gi y 10 Gin Arg His Asn Ile Lys Gin Cys Pro 25 Lys Asn Ser Gly Cys Tyr Ala Gin Phe Arg Lys Gin rq Giu Giu ys Val Giu 55 sp Ala Asp Cys 40 Asn Cys Leu Leu Asn Cys Pro Asn Pro Thr Glu Asn Glu Asn Ala Lys Cys 70 ys Ile Thr 75 Glu Asp Ser Giy Tyr Thr Cys Giu Gly Cys Thr Lys 90 Ser Ser Asn Pro Asp Ser Pro Leu Phe Asp 100 Ile Phe Cys Ser 105 <210> 13 <211> 786 <212> DNA <213> Artificial <220> <223> MSP-133 <400> 13 gctgttactc catctgttat cgataacatt ctgtctaaga ttgagaacga atacqaggtc ttgtacttga agcctctgqc cqqtgtctac agatccctga agaagcaact ggaaaacaac gtcatgactt tcaacqttaa cgtcaaqqac attttqaact ccagattcaa caagagagaa 120 180 aacttcaaga tacgttgtta tct tacaact ggttactaca atcaatgata ttgtacaaga ttgcaataca aagaccattc gatttgtcca gttttcgaga atgttg acgttctgga aggacccata acattaagga agatcttgtc agcaaggaga ctgttaacqa cttacgaaaa aggataagct ctgattacaa act tggctaa gtctgacttg caagttcctg ctccattgat tqagaagtac gaatgaaaag taagatcgat 
gtctaacgtc ggctgatttc ccacaacaat gactgtcctg attccataca aggatttgac aacaaqgaga agagagacaa actgatatca acttcgctaa aaqtctgact tggattccat tacctgccat tcctgaataa ctgttcqtca ttcatttgga gaggtcaaqa tcaaggaatt aagaagaaca acaacttcgt ctgttqacta aqttcctgtc tccaacctgt tggatggtaa ttcttctaac gttcttgtcc cgacgtcctg caagaagtac cat tgaaact agccaaggtc gaactacctc tggtatcgct taccggtatg cttgcagggt 240 300 360 420 480 540 600 660 720 780 786 <210> 14 <211> 262 <212> PRT <213> Artificial <220> <223> MSP-133 <400> 14 Ala Val Thr Pro Ser Val 1 5 Giu Tyr Glu Val Leu Tyr Leu Lys Lys Gin Leu Giu Lys Asp Ile Leu Asn Ser 50 Val Leu Glu Ser Asp Leu 65 70 Tyr Val Val Lys Asp Pro 85 Lys Phe Leu Ser Ser Tyr 100 Ile Asn Phe Ala Asn Asp 115 Ile Asp Asn Ile Leu Ser Lys Ile Glu Asn Leu Lys Pro 25 Asn Asn Val Leu Ala Gly Met Thr Phe Val Tyr Arg Ser Asn Val Asn Val Asn Phe Lys Asn 40 Arg Phe Asn Lys Arg 55 Ile Glu Le u Pro Tyr Lys Asp 75 Asn Thr Ser Ser Asn Tyr Lys Phe Asn Tyr Ile 105 Val Leu Gly 120 Leu 90 Lys Tyr Lys Giu Lys Arg Asp Asp Ser Ile Asp Thr Asp 110 Tyr Lys Ile Leu Ser Glu 125 Lys Tyr Lys Ser Asp Leu Asp 130 135 Ser Ile Lys Lys Tyr 140 Ile Asn Asp Lys Gin Gly Glu Asn Glu Lys Tyr Leu Pro Phe Leu Asn Asn Ile Glu Thr 145 150 155 160 Leu Tyr Lys Thr Val Asn Asp Lys Ile Asp Leu Phe Val lie His Leu 165 170 175 Glu Ala Lys Val Leu Gin Tyr Thr Tyr Glu Lys Ser Asn Val Glu Val 180 185 190 Lys Ile Lys Glu Leu Asn Tyr Leu Lys Thr Ile Gin Asp Lys Leu Ala 195 200 205 Asp Phe Lys Lys Asn Asn Asn Phe Val Gly Ile Ala Asp Leu Ser Thr 210 215 220 Asp Tyr Asn His Asn Asn Leu Leu Thr Lys Phe Leu Ser Thr Gly Met 225 230 235 240 Val Phe Glu Asn Leu Ala Lys Thr Val Leu Ser Asn Leu Leu Asp Gly 245 250 255 Asn Leu Gin Gly Met Leu 260 <210> <211> 21 <212> PRT <213> Artificial <220> <223> Alpha factor fusion protein <400> Lys Arg Glu Ala Glu Ala Tyr His His His His His His Asn Ile Ser 1 5 10 *o Gin Ser Ser Ser Asn o good *go• *oo•

Claims

26-11-'04 10:19 FROM- T-957 P006/018 F-924 -103- THE CLAIMS DEFINING THE INVENTION ARIE AS FOLLOWS: 1. A non-naturally occurring variant of a C-terminal fragment of a Plasmodium merozoite surface protein-I (MSP-1) comprising an amino acid modification at any one of amino acid residues 14, 15, 27, 31, 34, 43, 48 and 53 of the Plasmodiumfalciparun MSP- 119 amino acid sequence shown as SEQ 10D NO: 1 or their equivalent positions in other Plasmodium MSP- 1 1 9 polypeptides. 2. A variant to Claim I wherein said modifications are selected from Glnl4 4) Aig, Glnl4 AsniS Arg, Glu27 Tyr, Leu3l Mrg, Tyr34 Ser, Tyr34 -4 le, Glu43 -*Lcu, Thr48 Lys and Asn53 Arg and their equivalents in other Plasmodium MSP-1 1 9 polypeptides. A3. A variant according to Claim 2 wherein said substitutions are selected from [Glu27 -+Tyr, Leu3l Arg and Glu43 Leu], [Glu 27 Tyr, Leu3l Mg, Tyr34 Ser and Ghl43 Leu], [AsniS Arg, Glu27 Tyr, Leu31 Arg and Glu43 Leul and their equivalents in other Plasmodium MSP- 1 19 polypeptides. 4. A non-naturally occurring variant of a C-terminal fragment of Plasmodiumn merozoite surface protein-I1 (M4SP- 1) comprising a mutation at Cys 12 and/or Cys2 8 of the Plasmodiumtfalciparum MSP- 1 19 amino acid sequence showni as SEQ ID) NO: 1. A variant acdording to Claim 4 in which the mutation is a substitution selected ::from Cysl 2 le and Cys,2S Trp, and Cysl12 Ala and Cys28 Phe. A variant according to Claim 4 in which the mutation is the deletion of Cys12 and/or Cys28 of the Plasmodium falciparum MSP- 119 amino acid sequence shown as SEQ ID NO: 1. COMS ID No: SBMI-01014321 Received by IP Australia: Time 10:25 Date 2004-11-26 26-11-'04 10:20 FROM- T-957 P007/018 F-924
104- 7. A variant according to Claim 5, in which the substitutions are selected from [Cys 12 -+Ile, Asnl5 Arg, Glu27 Tyr, Cys28 Trp, Leu3l Arg, Glu43 Leu], [CysI12 le, AsnI 5 -Arg, Glu27 Tyr, Cys28 Trp, Leu3 1 Arg, Glu43 -+Leu, Asn53 [Cys12 Ile, AsniS Arg, Glu27 Tyr, Cys28 Trp, Leu3l Arg, Tyr34 -~Ser, Glu43 Leu, Asn53 Arg] and their equivalents in other Plasmodium MSP-1 19 polypeptides. 8. A method for producing a Plasmodium MSP-1 variant for use in preparing a vaccine composition which method comprises: selecting one or more candidate amino acid residues that are exposed at the surface of a C-terminal fragment of a Plasmodium merozoite surface protein- I (MSP- 1); modifying the one or more candidate amino acids identified in step and identifying a Plasmodium MSP-I variant that: reduces the binding of at least one blocking antibody by at least 50% as compared to wild-type MSP- 1; and (ii) reduces the binding of at least one neutralising antibody which inhibits the cleavage Of MSP-1 42 by less than 50% as compared to wild-type MSP-1. A method according to Claim 8, wherein the blocking antibody is selected from mAbs, 1I12.2, 7.5, 9C8 and 111.4. A method according to Claim 8 or Claim 9, wherein the neutralising antibody is selected from m.Abs 12.8, 12. 10 and 5BI1. 11. A non-naturally occurring Plasmodium MSP-1 variant obtained by the method of any one of Claims 8 to COMS ID No: SBMI-0101 4321 Received by IP Australia: Time 10:25 Date 2004-11-26 26-11-'04 10:20 FROM- T-957 P008/018 F-924 R d MP O7l7ut.h.e d M7 aded dailDwt-2ldI In -105- 12. A polynucleotide encoding a variant according to any one of Claims 1 to 7 or 11 operably linked to a regulatory sequence capable of directing the expression of said nucleotide in a host cell. 13. A polynucleotide according to Claim 12 having a sequence optimised for expression in said host cell. 14. A polynucleotide according to Claim 12 or 13, in which the host cell is a Pischia pastoris cell. 
15. A nucleic acid vector comprising a polynucleotide according to Claim 12, 13 or 14.

16. A host cell comprising a vector according to Claim 15.

17. A pharmaceutical composition comprising a variant according to any one of Claims 1 to 7 or 11, a polynucleotide according to any one of Claims 12 to 14 or a vector according to Claim 15 together with a pharmaceutically acceptable carrier or diluent.

18. A composition according to Claim 17 further comprising an immunogenic Plasmodium polypeptide or fragment or derivative thereof.

19. A method for producing anti-MSP-1 antibodies which method comprises administering a polypeptide according to any one of Claims 1 to 7 or 11, a polynucleotide according to any one of Claims 12 to 14 or a vector according to Claim 15 to a mammal.

20. A method for producing polyclonal anti-MSP-1 antibodies which method comprises administering a polypeptide according to any one of Claims 1 to 7 or 11, a polynucleotide according to any one of Claims 12 to 14 or a vector according to Claim 15 to a mammal and extracting the serum from said mammal.

21. An antibody produced by the method of Claim 19 or 20.

22. A method of inducing immunity against malaria induced by Plasmodium falciparum which comprises administering to a person in need of such immunity an effective amount of the variant of Claim 1, the polynucleotide of Claim 12 or the vector of Claim 15.

23. A method of immunizing a mammal, said method comprising administering an effective amount of the polypeptide of Claim 1, the polynucleotide of Claim 12 or the vector of Claim 15.

24. The method of Claim 23, wherein said mammal is immunized against malaria.

25. A method of treating a malaria infection in a human patient which comprises administering to the patient an effective amount of the pharmaceutical composition of Claim 17 or 18.

26. A variant according to any one of Claims 1 to 7 or 11, a polynucleotide according to any one of Claims 12 to 14 or a vector according to Claim 15 for use in therapy.

27. A nucleic acid encoding a Plasmodium MSP-1 polypeptide variant according to any one of Claims 1 to 7 or 11, in which the codon usage of the nucleic acid is optimised for expression in a heterologous host cell.

28. A nucleic acid according to Claim 27, in which the heterologous host is a Pichia pastoris cell.

29. A nucleic acid according to Claim 27 or 28, in which the polypeptide is selected from an MSP-1₄₂ polypeptide comprising a sequence shown in Figures 15C and 15E, an MSP-1₁₉ polypeptide comprising a sequence shown in Figure 15C, and an MSP-1₃₃ polypeptide comprising a sequence shown in Figure [illegible].

30. A nucleic acid according to any of Claims 27 to 29, in which the nucleic acid comprises a sequence selected from the sequences of Figure 15A, Figure 15B and Figure [illegible].

31. A nucleic acid vector comprising a nucleic acid according to any one of Claims 27 to 30.

32. A host cell comprising a vector according to Claim 31.

33. A pharmaceutical composition comprising a nucleic acid according to any of Claims 27 to 29 or a vector according to Claim 31 together with a pharmaceutically acceptable carrier or diluent.

34. A composition according to Claim 33 further comprising an immunogenic Plasmodium polypeptide or fragment or derivative thereof.

35. A variant according to any one of Claims 1 to 7 and 11 substantially as described hereinbefore with reference to the Examples and/or Figures.

36. A method according to any one of Claims 8 to 10, 19, 20 and 22 to 25 substantially as described hereinbefore with reference to the Examples and/or Figures.

37. A polynucleotide according to any one of Claims 12 to 14 substantially as described hereinbefore with reference to the Examples and/or Figures.

38. A nucleic acid vector according to Claim 15 or Claim 31 substantially as described hereinbefore with reference to the Examples and/or Figures.

39. A host cell according to Claim 16 or Claim 31 substantially as described hereinbefore with reference to the Examples and/or Figures.

40. A nucleic acid according to any one of Claims 27 to 30 substantially as described hereinbefore with reference to the Examples and/or Figures.

41. A composition according to Claims 17, 18, 33 and 34 substantially as described hereinbefore with reference to the Examples and/or Figures.

DATED this 26th day of November, 2004
Medical Research Council
By DAVIES COLLISON CAVE
Patent Attorneys for the Applicant

COMS ID No: SBMI-01014321 Received by IP Australia: Time 10:25 Date 2004-11-26