PT98302A

PT98302A - Process for production of heterologous proteins

Info

Publication number: PT98302A
Application number: PT9830291A
Authority: PT
Inventors: Mary Ellen Brawner; James Allan Fornwald; James Arthos
Original assignee: Smithkline Beecham Corp
Priority date: 1990-07-11
Filing date: 1991-07-11
Publication date: 1992-05-29

Abstract

The present invention relates to a process for the production, in Streptomyces, of a heterologous protein having a homogeneous amino terminus after processing to remove the signal peptide formed on the protein product, characterized by comprising: (a) introducing into a Streptomyces host cell a DNA vector having a nucleic acid sequence coding for the signal sequence of the S. longisporus tyrosine inhibitor gene, joined to a propeptide sequence consisting essentially of an oligonucleotide encoding from one to about six amino acids, which is operably linked to a coding sequence for the heterologous protein; and (b) culturing said host cell in a suitable culture medium.

Description

72 859 SBC Case 14497-1

SPECIFICATION

Field of the Invention

The present invention relates to the production of heterologous proteins in microorganisms. More particularly, it relates to nucleic acid sequences and expression vectors for the expression of heterologous proteins, particularly soluble CD4 proteins, in Streptomyces.

Background of the Invention

The principal route of HIV infection is mediated by binding of the virus to the CD4 protein. CD4 (also referred to as T4) is a T-cell surface glycoprotein that associates with class II major histocompatibility complex molecules. However, CD4 is also the target of the HIV-1 envelope glycoprotein, gp120. When HIV-1 binds to CD4 on the surface of the target cell, the viral RNA is introduced into the target cell by direct fusion of the viral and plasma membranes (Maddon et al., Cell 54:865-874 (1988)). The DNA sequence of CD4 was disclosed by Maddon et al. (Cell 42:93-104 (1985)). From it, the mature CD4 protein was shown to consist of an extracellular region containing four immunoglobulin-like domains (V1-V4), a transmembrane domain, and a charged intracellular region of approximately 40 amino acids (Maddon et al., supra).

Soluble recombinant derivatives of CD4, such as sT4, include proteins in which the transmembrane and cytoplasmic domains have been deleted while the ability to bind gp120 is retained. sT4 has also been shown to inhibit viral infectivity and HIV-mediated syncytium formation (Deen et al., Nature 331:82-84 (1988)). This inhibition results from the association of the sT4 protein with the HIV envelope protein gp120, whereby sT4 competes with native cell-surface CD4. The mechanism by which CD4 and soluble CD4 inhibit viral infectivity and syncytium formation is not completely understood. It has been shown in vitro that the addition of soluble CD4 to HIV-1, or to HIV-1-infected cells, appears to induce the release of gp120 without a concomitant increase in other viral proteins. This implicates CD4 as an active, rather than passive, inhibitor of HIV-1.

However, in vivo, the sT4 protein and similar soluble CD4 proteins are rapidly cleared from the serum. This rapid clearance severely limits the clinical application of soluble CD4 proteins as inhibitors of HIV function.

To prolong the serum clearance time, several soluble CD4 chimeras have been constructed. Traunecker et al. (Nature 331:84-86 (1988)) disclosed CD4 chimeras (V1V2 and V1V2V3V4) in which the carboxyl-terminal portion of the protein consisted of a constant region of the murine immunoglobulin light chain. Traunecker et al. further disclosed that these proteins were expressed in myeloma cells.

Traunecker et al. (Nature 339:68-70 (1989)) subsequently disclosed CD4 (V1V2) chimeras fused to constant regions of the murine IgM or IgG2 heavy chain. These constructs were also disclosed to form pentamers.

Seed (EP-A-325,262, published July 26, 1989) and Capon et al. (WO89/02922, published April 6, 1989) disclosed CD4 chimeras fused to constant regions of the light and heavy chains of human IgG. The proteins disclosed by Seed and by Capon et al. were all expressed in mammalian systems, i.e., COS, BHK (baby hamster kidney) and CHO cells. As a result, the disclosed CD4 chimeras are expensive to produce and have a limited production capacity relative to alternative expression systems.

It is thus an object of the present invention to produce CD4 chimeras in a bacterial host system in order to reduce the cost of production, and to produce CD4 chimeras that have an increased serum half-life and/or potency against HIV infection relative to soluble CD4.

It is common for proteins produced by recombinant DNA techniques to be fused to a portion of a protein produced by the host cell, or to another protein. Typically these extra portions add nothing to the function of the desired protein, but their presence may impede its folding or processing. It is therefore desirable to produce proteins that are as close as possible to the native protein, so as to avoid or reduce these effects.

Accordingly, it is also an object of this invention to produce derivatives of CD4 chimeras and other CD4 proteins in bacterial host systems that contain no amino acids derived from proteins of the host system, or that contain only a minimal number of amino acids added to the authentic protein sequence. It is a further object of the invention to provide vectors for the production of CD4 proteins and chimeras, and of other proteins, that yield products containing no amino acids derived from proteins of the host system, or containing only a minimal number of amino acids added to the authentic protein sequence.

Summary of the Invention

The present invention provides nucleic acid sequences and DNA vectors useful in the production of chimeric CD4 proteins, as well as other heterologous proteins, in Streptomyces. The nucleic acid sequences of the invention comprise the sequence encoding the signal peptide of the Streptomyces longisporus tyrosine inhibitor (LTI) gene, operably linked to a propeptide sequence consisting essentially of a sequence encoding 1 to 6 amino acids, the sequence of said amino acids being selected to result in the formation, in Streptomyces, of a protein product having a homogeneous amino terminus after processing to remove said signal peptide, which is formed on the protein product during its synthesis. In an alternative embodiment of the invention, the propeptide is omitted and the LTI signal peptide is operably linked to a nucleic acid sequence encoding a heterologous protein that has been modified to encode the sequence lys-ala- at the 3' end.

In other aspects, the present invention provides Streptomyces cells transformed with a nucleic acid sequence or vector of the invention, and methods of using the nucleic acid sequences of the invention to produce heterologous proteins in Streptomyces.

This invention is set forth with greater particularity in the appended claims and is described in its preferred embodiments in the following description.

Detailed Description of the Invention

The present invention provides nucleic acid sequences and DNA vectors useful for the production of chimeric CD4 proteins, as well as other heterologous proteins, in Streptomyces. The nucleic acid sequences of the invention comprise the coding sequence for the signal peptide of the Streptomyces longisporus tyrosine inhibitor (LTI) gene, operably linked to a propeptide sequence consisting essentially of a sequence encoding from one to about 6 amino acids, the sequence of said amino acids being selected to result in the formation, in Streptomyces, of a protein product having a homogeneous amino terminus after processing to remove the signal peptide, which is formed on the protein product during its synthesis. Preferably, the amino acid sequence of the propeptide is selected to cause processing of the protein product at a position between the end of the sequence encoding the signal peptide and the start of the propeptide sequence.

In one embodiment of the invention, the amino acid sequence of the propeptide is thr-, thr-pro-ala-ala- (SEQ ID NO:1), or thr-pro-ala-ala-ala- (SEQ ID NO:2). The propeptide sequence thr- is most preferred because it consists of only one amino acid and provides a good yield of the protein product. In these embodiments of the invention, the nucleic acid sequence of the propeptide preferably contains the sequence ACC (encoding thr); ACC CCG GCC GCT (SEQ ID NO:3) (encoding thr-pro-ala-ala, SEQ ID NO:1); or ACC CCG GCC GCT GCT (SEQ ID NO:4) (encoding thr-pro-ala-ala-ala, SEQ ID NO:2). The nucleic acid sequences of the invention are preferably operably linked to other DNA sequences to form an expression vector, which can then be introduced into Streptomyces for the production of heterologous proteins. The heterologous proteins so formed will have a homogeneous N-terminus that contains the propeptide sequence.
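The correspondence between the propeptide nucleotide sequences and the amino acid sequences recited above can be checked with a short sketch. It is illustrative only: the function names `translate_propeptide` and `within_claimed_length` are hypothetical, and the codon table is limited to the four codons that appear in the text (their meanings follow the standard genetic code).

```python
# Subset of the standard genetic code: only the codons named in the text.
CODON_TO_RESIDUE = {
    "ACC": "thr",
    "CCG": "pro",
    "GCC": "ala",
    "GCT": "ala",
}


def translate_propeptide(dna: str) -> str:
    """Translate a space-separated run of codons into a hyphenated peptide."""
    return "-".join(CODON_TO_RESIDUE[codon] for codon in dna.split())


def within_claimed_length(dna: str, low: int = 1, high: int = 6) -> bool:
    """Check that the propeptide encodes between `low` and `high` amino acids."""
    return low <= len(dna.split()) <= high


# Nucleotide sequences as recited in the description.
PROPEPTIDES = {
    "thr": "ACC",
    "SEQ ID NO:3": "ACC CCG GCC GCT",
    "SEQ ID NO:4": "ACC CCG GCC GCT GCT",
}

for name, dna in PROPEPTIDES.items():
    print(name, translate_propeptide(dna), within_claimed_length(dna))
```

Each of the three recited propeptides falls inside the one-to-six residue range, and the translations agree with SEQ ID NO:1 and SEQ ID NO:2 as stated.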

In an alternative embodiment of the invention, the propeptide is omitted and the LTI signal peptide is operably linked to a nucleic acid sequence encoding a heterologous protein that has been modified to encode the sequence lys-ala- at the 3' end. The nucleic acid sequence of the invention is preferably operably linked to other DNA sequences to form an expression vector that can be introduced into Streptomyces for the production of heterologous proteins. In this embodiment of the invention, the signal sequence is cleaved from the heterologous protein at the end of the signal sequence, so that the heterologous protein is formed with an N-terminus lys-ala-X, where X is the remainder of the heterologous protein. This embodiment of the invention is particularly advantageous for the production of CD4 derivatives and chimeric proteins.

Surprisingly and unexpectedly, it was found that by altering only one amino acid, at position 2 near the N-terminus of CD4 (in the V1 region), a heterologous protein can be formed that is secreted efficiently and processed correctly so as to remove the entire LTI signal sequence, yet still retains substantial gp120 binding capacity. By altering the nucleic acid sequence encoding the CD4 derivative or chimera so that it encodes lys-ala- at the amino terminus, the encoded product contains an N-terminal amino acid sequence similar to that of the sheep L3T4 receptor (CD4 analogue), which has a lys-ala- N-terminus.

When it is desired to use this embodiment of the invention to prepare heterologous proteins other than CD4 derivatives or chimeras, the nucleic acid sequence encoding the heterologous protein can be modified to encode lys-ala- at the N-terminus by deleting the bases that encode the two N-terminal amino acids and substituting a lys-ala- coding sequence for the deleted bases. Alternatively, a lys-ala- coding sequence can simply be added to the 3' end of the LTI signal peptide coding sequence. In this way, heterologous proteins having a modified N-terminus containing the amino acid sequence lys-ala- can be formed in Streptomyces.
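The two routes to a lys-ala- N-terminus described above can be sketched at the protein level. This is a simplification under stated assumptions: the description operates on the coding DNA, whereas the sketch manipulates one-letter amino acid strings; the function names are hypothetical; and the example sequences are toy placeholders, with only the leading lys-lys (KK) of CD4 taken from the text.

```python
LYS_ALA = "KA"  # one-letter codes for lys-ala


def replace_first_two(mature: str) -> str:
    """Route 1: delete the two N-terminal residues and substitute lys-ala."""
    if len(mature) < 2:
        raise ValueError("mature sequence must have at least two residues")
    return LYS_ALA + mature[2:]


def prepend_lys_ala(mature: str) -> str:
    """Route 2: add lys-ala directly after the signal peptide, keeping all residues."""
    return LYS_ALA + mature


# Toy sequences: "VVLG" and "MSTQ" are placeholders, not data from the patent.
cd4_like = "KK" + "VVLG"
other_protein = "MSTQ"

print(replace_first_two(cd4_like))     # lys-lys start becomes lys-ala
print(prepend_lys_ala(other_protein))  # original residues preserved after lys-ala
```

Route 1 keeps the length of the protein unchanged and, for a protein that already starts with lys, alters a single residue; route 2 lengthens the product by two residues but leaves the native sequence intact.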

Because of the possible deleterious effects of additional amino acids on the function of heterologous proteins, it is desirable to produce heterologous proteins that have the sequence of the native protein (or portions of it) with few or no additional amino acids derived from proteins of the host system, which is not the case when heterologous proteins are formed as fusion proteins. Surprisingly, it was found that the propeptide of the LTI gene can be deleted or modified, and that the resulting heterologous proteins, which have near-authentic N-termini, are still secreted efficiently from the Streptomyces host cell.

It has become evident that the amino acid sequence surrounding the signal peptide cleavage site can have dramatic effects on the processing of the signal peptide. There are at least two parameters that can influence processing and, consequently, the levels of production and secretion: the physicochemical nature of the amino acids and the net charge in the region surrounding the cleavage site.

A statistical analysis of signal peptides from eukaryotic and prokaryotic species showed a clear preference for specific classes of amino acids at specific positions in the region surrounding the signal peptide cleavage site (Von Heijne, FEBS Letters 244:439-446 (1989)). For example, the amino acid residues at positions -3 and -1 are generally small and uncharged. Alanine is the most frequent amino acid at these positions. Positions -3 and -1 also show a clear bias against certain classes of amino acids; for example, aromatic, charged, hydrophobic and proline residues were not found at position -1 in 78 eukaryotic signal peptides. At the amino end of the mature protein, proline and glycine were rarely found at position 1.

Experimental data and statistical analysis indicate a clear preference for a neutral or negative net charge in the region surrounding the cleavage site of bacterial signal peptides (Li et al., Proc. Natl. Acad. Sci. USA 85:7685-7689 (1988); Yamane and Mizushima, 263:19690-19696 (1988); Summers et al., J. Biol. Chem. 264:20082-20088 (1989); Von Heijne, J. Mol. Biol. 192:287-290 (1986)). Li et al., supra, showed that when the amino terminus of E. coli alkaline phosphatase was mutagenized so that the net charge increased from 0 to +2, total production of alkaline phosphatase fell 50-fold. In addition, the time required to process the precursor into the mature product rose from undetectable, for the wild-type protein, to 30 minutes for the mutant with a net charge of +2. The severity of this defect was somewhat lessened by reducing the net charge to +1. Summers et al., supra, described a variety of hybrid proteins between the β-lactamase signal peptide and chicken muscle triosephosphate isomerase. Native triosephosphate isomerase is not secreted into the E. coli periplasm. However, this protein could be exported to the periplasm when the arginine residue at position 3 of triosephosphate isomerase was changed to a proline or serine residue. These observations lend support to the statistical analysis of Von Heijne, which showed a clear preference for a negative net charge in the region of the cleavage site.
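The net-charge argument above can be made concrete with a rough sketch. The assumptions are loud ones: lys and arg are counted as +1, asp and glu as -1, and every other residue (including his) and the free amino group are ignored; the three-residue window after the cleavage site is an arbitrary illustrative choice, not a value from the cited studies; the names `net_charge` and `charge_after_cleavage` are hypothetical; and "VVLG" is a toy placeholder tail.

```python
POSITIVE = frozenset("KR")  # lys, arg counted as +1
NEGATIVE = frozenset("DE")  # asp, glu counted as -1


def net_charge(residues: str) -> int:
    """Sum the simplified side-chain charges of a one-letter peptide string."""
    charge = 0
    for aa in residues.upper():
        if aa in POSITIVE:
            charge += 1
        elif aa in NEGATIVE:
            charge -= 1
    return charge


def charge_after_cleavage(mature: str, window: int = 3) -> int:
    """Net charge of the first `window` residues following the cleavage site."""
    return net_charge(mature[:window])


TOY_TAIL = "VVLG"  # placeholder residues, not taken from the patent

candidates = {
    "authentic lys-lys start": "KK" + TOY_TAIL,
    "thr-pro-ala-ala propeptide": "TPAA" + "KK" + TOY_TAIL,
    "lys-ala start": "KA" + TOY_TAIL,
}

for label, sequence in candidates.items():
    print(f"{label}: {charge_after_cleavage(sequence):+d}")
```

Under these assumptions the authentic lys-lys start carries +2 next to the cleavage site, inserting the four-residue propeptide brings the window to 0, and the lys-ala start gives +1, which mirrors the direction of the charge effects reported for alkaline phosphatase.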

It has been found that the first two amino acid residues of CD4 (in the V1 region), lys-lys-, are absolutely necessary for gp120 binding activity. In attempts to provide CD4 derivatives and chimeras containing authentic N-termini (for example, KK-V1V2), it was found that such CD4 derivatives and chimeras cannot be produced.

When the V1V2 regions of CD4 are fused to the LTI signal sequence, the resulting product is processed into at least two forms. Surprisingly and unexpectedly, it was found that moving the positively charged lysine residues away from the signal peptide cleavage site, by inserting up to five amino acids, yielded a heterologous CD4 protein that had a homogeneous N-terminus, was efficiently transported into the culture medium, and still retained the ability to bind gp120. Even more surprisingly, it was found that replacing the lysine at position 2 of the CD4 molecule with alanine made it possible to produce a homogeneous CD4 protein with a near-authentic N-terminus, despite the presence of lysine at the N-terminus. In addition, this heterologous protein was also secreted efficiently into the medium and retained the ability to bind gp120.

The heterologous proteins produced using the nucleic acid sequences and vectors of the invention have homogeneous N-termini, that is, the N-terminus of the product has the same amino acid sequence throughout, which gives them several advantages over heterologous proteins produced as a mixture of forms. Production costs are lower, since fewer separation steps are needed to obtain the desired product in purified form. A single product, rather than a mixture of heterologous proteins, is preferable for regulatory purposes. In addition, the natural folding and function of the protein product are favored by the absence of amino acids derived from proteins of the host system.

In other embodiments, the invention is directed to DNA vectors for expression of heterologous proteins in Streptomyces, comprising a coding sequence for the heterologous protein operably linked to a promoter and to a nucleotide sequence encoding the signal sequence of the Streptomyces longisporus trypsin inhibitor gene, which is operably linked to a propeptide sequence consisting essentially of an oligonucleotide encoding from one to six amino acids. The sequence of said amino acids is selected so that the heterologous protein formed in Streptomyces has a homogeneous amino terminus after processing to remove the signal peptide formed on the heterologous protein during its synthesis. Preferably, the coding sequence for the heterologous protein includes a sequence encoding an HIV gp120 binding region.
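The arrangement of this construct (signal sequence, then a short propeptide of one to six amino acids, then the heterologous protein) can be illustrated at the protein level with a minimal Python sketch. The function names and the example sequences are hypothetical placeholders chosen only for illustration; they are not the LTI signal sequence, a disclosed propeptide, or the CD4 sequence.

```python
def build_precursor(signal_peptide: str, propeptide: str, heterologous: str) -> str:
    """Join signal peptide, propeptide and heterologous protein (one-letter codes).

    The propeptide must contain one to six amino acids, as in the embodiment above.
    """
    if not 1 <= len(propeptide) <= 6:
        raise ValueError("propeptide must contain 1 to 6 amino acids")
    return signal_peptide + propeptide + heterologous


def remove_signal_peptide(precursor: str, signal_peptide: str) -> str:
    """Model processing: strip the signal peptide from the start of the precursor."""
    if not precursor.startswith(signal_peptide):
        raise ValueError("precursor does not start with the given signal peptide")
    return precursor[len(signal_peptide):]


# Placeholder sequences (assumptions, not real sequences).
SIGNAL = "MAAA"
PROPEPTIDE = "GS"
PROTEIN = "KKVV"

precursor = build_precursor(SIGNAL, PROPEPTIDE, PROTEIN)
secreted = remove_signal_peptide(precursor, SIGNAL)
```

In this simplified model the secreted product begins with the propeptide residues followed by the heterologous protein; the sketch does not attempt to predict where a signal peptidase would actually cleave.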

In another embodiment, the invention is directed to DNA vectors for expression of heterologous proteins in Streptomyces, comprising the coding sequence for the heterologous protein operably linked to a promoter and to the signal sequence of the Streptomyces longisporus trypsin inhibitor gene, wherein the coding sequence of the heterologous protein is modified at its 5' end either by addition of bases encoding the amino acid sequence Lys-Ala, or by deletion of the bases encoding the two amino acids at the 5' end and replacement of the deleted sequence by a sequence encoding the amino acids Lys-Ala. Preferably, the sequence encoding the heterologous protein includes a sequence encoding an HIV gp120 binding region.
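The two alternative modifications described in this embodiment can be sketched at the protein level as follows. This is only an illustration using one-letter amino acid codes (K for lysine, A for alanine); the function names and the example sequence are hypothetical, and the real modification is made to the 5' end of the DNA coding sequence.

```python
LYS_ALA = "KA"  # Lys-Ala in one-letter code


def add_lys_ala(protein: str) -> str:
    """Alternative 1: add Lys-Ala in front of the unmodified protein."""
    return LYS_ALA + protein


def replace_first_two_with_lys_ala(protein: str) -> str:
    """Alternative 2: delete the first two residues and put Lys-Ala in their place."""
    if len(protein) < 2:
        raise ValueError("protein must contain at least two amino acids")
    return LYS_ALA + protein[2:]


# Placeholder protein starting with Lys-Lys; "XXXX" stands for the rest of the sequence.
example = "KKXXXX"
added = add_lys_ala(example)
replaced = replace_first_two_with_lys_ala(example)
```

For a protein whose first two residues are Lys-Lys, the second alternative amounts to replacing the lysine at position 2 with alanine.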

Additional embodiments of the invention are directed to methods of using the DNA vectors of the invention, comprising the steps of introducing a vector of the invention into a Streptomyces host cell and culturing the host cell in an appropriate culture medium.

Other embodiments of the invention are directed to Streptomyces cells transfected with nucleic acid sequences or DNA sequences of the invention.

The T lymphocyte CD4 receptor protein, referred to herein as "CD4", is a surface glycoprotein that interacts with the HIV envelope protein gp120. This high-affinity interaction occurs between gp120 and the extracellular domain of CD4. The extracellular domain of CD4 comprises four regions of limited homology to immunoglobulin variable (V) regions. The approximate domain boundaries for the immunoglobulin-like variable regions (V1-V4) of CD4 are, respectively, amino acids 100-109, amino acids 175-184, amino acids 289-298, and amino acids 360-369. Thus, as used herein, V1 refers to amino acids 1-183 (approximately), and so on. Removal of the transmembrane and cytoplasmic domains of CD4 results in a soluble receptor protein. These soluble CD4 proteins, consisting of part or all of the external region of the human CD4 receptor, have been shown to inhibit HIV-1 infection and virus-induced cell fusion (see, for example, Deen et al., Nature 331:82-84 (1988)).

As used herein, the term "HIV gp120 binding domain" refers to any soluble human CD4 molecule (i.e., one lacking the transmembrane and cytoplasmic domains) that binds HIV gp120 with essentially the same affinity as, or greater affinity than, the full-length CD4 receptor protein. The binding affinity of CD4 for gp120 can be assayed as described by Arthos et al. (Cell 57:469-481 (1989)).

The CD4 molecule of the present invention comprises the minimal HIV gp120 binding region found in the V1 domain (amino acids 41-55). Preferably, it includes the V1V2 domains of CD4, or substantially the same sequence (i.e., a sequence that differs by no more than 15 amino acids). Thus, one embodiment of the present invention is V1V2 linked to a constant region of a human immunoglobulin. The nucleotide sequence encoding CD4 is disclosed by Maddon et al. (Cell 42:93-104 (1985)) and is incorporated herein by reference, with the exception that the mature amino terminus of human CD4 is Lys-Lys and not Gln-Gly-Lys-Lys as reported by Maddon et al. (see Littman et al., Cell 55:541 (1988)). DNA encoding CD4 is available from a variety of sources, for example the plasmid pT4B (Maddon et al., supra).

Alternatively, cDNA molecules encoding the CD4 sequence may be derived from the mRNA of T lymphocytes that express CD4, using known techniques (for example, the polymerase chain reaction), or may be synthesized by standard DNA synthesis techniques.

In addition to the proteins exemplified below, one skilled in the art can readily construct DNA molecules encoding other CD4 derivatives. For example, it is within the skill of the art to construct amino acid additions, substitutions, deletions and rearrangements (for example, V1V3, V1V4) and chemical modifications thereof. These derivatives, however, should retain the ability to bind HIV gp120.

An inherent advantage of the CD4-Ig chimeras of the present invention is their potential for increased potency in inhibiting or inactivating an HIV infection, relative to soluble CD4. This can be achieved through the acquisition of Fc effector functions, as described in more detail below.

Effector functions, which reside primarily in the Fc portion of the immunoglobulin, include: a prolonged serum clearance time (Nakamura et al., J. Immun. 100:376-383 (1968)); protein A binding (Deisenhofer et al., Biochem. 20:2361-2370 (1981)); Fc receptor binding (and antibody-dependent cellular cytotoxicity (ADCC), mediated by the binding of antibodies to Fc receptors on cytotoxic T cells); complement fixation (for example, binding to C1q); dimerization (Davies et al., Annu. Rev. Immunol. 1:87 (1983)); and placental transfer (Morgan et al., Adv. Immunol. 40:61-134 (1987)).

However, when only portions of the immunoglobulin heavy chains are expressed, some of the effector functions will be diminished or eliminated. An advantage of the present invention, therefore, is that desired effector functions can be conferred on the molecules of the invention by adding the amino acids associated with those functions. For example, crystallographic studies of human IgG Fc fragments indicate that dimerization occurs in the hinge region and the CH3 domain (see, for example, Deisenhofer et al., J. Biochem. 20:2361-2370 (1981)). Thus, by adding amino acids from the hinge region or the CH3 domain, IgG fragments that have the ability to dimerize can be produced.

In addition, the above-mentioned Fc effector functions may be derived from any appropriate immunoglobulin isotype or constant region, for example IgG1, IgG2, IgG3, IgG4, IgM, IgA or IgE, but preferably IgG1. Human IgG1 is preferred because this immunoglobulin subclass has been shown to be the most efficient in mediating cell killing by complement and ADCC (Bruggeman et al., J. Exp. Med. 166:1351-1361 (1987)). The IgG1 constant region consists of several domains: CH1, hinge (h), CH2 and CH3, as disclosed by Ellison et al. (Nucleic Acids Res. 10:4071-79 (1981)), incorporated herein by reference. The immunoglobulin constant region of the present invention may include all or only a portion of the human immunoglobulin heavy chain in which the variable region has been deleted and replaced by CD4 or an HIV gp120 binding fragment thereof.

In addition, the gp120 binding domain may be linked to the immunoglobulin constant region either directly (i.e., synthesized or expressed as a single polypeptide) or indirectly (i.e., coupled after synthesis).

One embodiment is a human IgG immunoglobulin constant region lacking most (i.e., at least 51%) or all of the CH3 domain. Preferably, the immunoglobulin constant region comprises most or all of the CH2 domain, i.e., the CH2 domain with or without the hinge region (h). Preferred embodiments include -CH1hCH2, -hCH2 and -CH2. However, the present invention need not be limited to a single constant region domain.

Another embodiment of the present invention is a non-glycosylated human immunoglobulin constant region comprising at least one domain. Preferred embodiments include, but are not limited to, -CH1hCH2CH3, -hCH2CH3, -CH2CH3, -hCH2, -CH2 and -CH3. More preferred embodiments include, but are not limited to, -hCH2 and -CH2. Preferably, the protein of this particular embodiment is produced in a bacterial host system. More preferably, the host cell is of the genus Streptomyces.

Methods for preparing DNA encoding the constant region of immunoglobulin light or heavy chains are described, for example, by Robinson et al., PCT Application Publication No. WO87/02671, published May 7, 1987. However, one skilled in the art can prepare a DNA molecule encoding the human IgG sequence from any cell line that expresses IgG1, for example ARH-77 (from the ATCC), using standard techniques (for example, the polymerase chain reaction). Alternatively, the DNA molecule can be synthesized by standard DNA synthesis techniques.

In addition to the proteins specifically exemplified herein, one skilled in the art can readily construct DNA molecules encoding other Ig derivatives, for example amino acid additions, substitutions, deletions and rearrangements, and chemical modifications thereof. Other examples include the substitution or deletion of cysteine residues, mutagenesis to increase binding to the Fc receptor and/or to C1q, or the introduction of Fc regions from other immunoglobulin molecules. These derivatives possess some or all of the Fc effector functions, as described below.

Depending on which regions of IgG are expressed, the resulting chimeric protein can bind to antibodies against human IgG, to protein A, to complement (specifically C1q), and to Fc receptors on appropriate cells of the immune system, for example macrophages. Moreover, complement binding (and other effector functions) appears to depend on factors other than the mere presence of the relevant heavy chain constant region sequences, since different human IgG subclasses that contain a complement binding sequence differ in their ability to bind complement. Thus, other structural features may be involved.

The recombinant DNA molecule and the nucleic acid sequences of this invention may comprise additional DNA sequences, for example a vector comprising a regulatory element, one or more selectable markers, and sequences encoding maintenance and replication functions. Typically, the regulatory region contains a promoter, located upstream of the coding sequence of this invention, which functions in RNA polymerase binding and in the initiation of transcription. In other words, the regulatory region or element is operably linked to the recombinant DNA molecule of this invention. One skilled in the art will recognize that the selection of regulatory elements will depend on the host cell used.

The selectable marker system may be any of a number of known marker systems, such that the marker gene confers a new selectable phenotype on the transformed cells. Examples include Streptomyces drug resistance genes such as the ribosomal methylase conferring thiostrepton resistance (Thompson et al., Gene 20:51 (1982)) and neomycin phosphotransferase (Thompson et al., supra).

DNA sequences encoding replication and maintenance functions include, for example, those derived from Streptomyces, such as pIJ101 derivatives (see, for example, Keiser et al., Mol. Gen. Genet. 185:223 (1982)) or Streptomyces SLP1-derived vectors (see, for example, Bibb et al., Mol. Gen. Genet. 184:230 (1981)). The vector of this invention may also contain a marker that permits gene amplification. Markers that serve to amplify gene copy number in Streptomyces include the spectinomycin resistance gene (Hornemann et al., J. Bacteriol. 169:2360 (1987)) and arginine auxotrophy (Altenbuchner et al., Mol. Gen. Genet. 195:134 (1984)).

The present invention also relates to a host cell transformed with the recombinant DNA molecule of this invention. Such a host cell is capable of growing in an appropriate culture medium and of expressing the protein encoded by the recombinant DNA molecule of this invention. Such a host cell is prepared by the process of this invention, i.e., by transforming a desired host cell with the plasmid of the invention. This transformation is carried out using conventional transformation techniques. The preferred host cells belong to the genus Streptomyces. For example, these hosts include, but are not limited to, S. lividans, S. coelicolor, S. albus and S. longisporus. Preferred embodiments include S. lividans and S. longisporus. Other host cells that may be used include, but are not limited to, mammalian cells, insect cells, yeast and other bacterial cells (for example, E. coli, Salmonella, Bacillus). Thus, this invention and its products need not be limited to any specific host cells.

Several promoters are available for the expression of heterologous proteins in Streptomyces. Examples include the galactose-inducible promoter of the Streptomyces galactose operon (Fornwald et al., Proc. Natl. Acad. Sci. USA 84:2130 (1987)), the constitutive promoter of the S. lividans β-galactosidase gene (Eckhardt et al., J. Bacteriol. 169:4249 (1987), or Brawner et al., U.S. Patent 4,717,666), the S. longisporus trypsin inhibitor gene (see EP-A-264,175, published April 20, 1988), or a temporally regulated promoter such as that reported in M. echinospora (Baum et al., J. Bacteriol. 170:71 (1988)). Transcription termination regions in Streptomyces are derived from the 3' end of certain Streptomyces genes, for example the termination signal at the end of the Streptomyces galactose operon or that found at the end of the S. fradiae neomycin phosphotransferase gene (Thompson et al., Proc. Natl. Acad. Sci. USA 80:5190 (1983)). Sequences for protein export in Streptomyces include those isolated from the S. lividans LEP-10 gene and the S. longisporus trypsin inhibitor (LTI) gene (see EP-A-264,175, published April 20, 1988). Preferably, the export or signal sequence is derived from the LTI gene, although it is not limited thereto. More preferably, the signal sequence is modified (i.e., by additions, substitutions, deletions and/or rearrangements) as described in the Examples section.

For the initial expression studies, the CD4-Ig chimeras, as described in more detail in the Examples, were fused to the LTI pre-pro sequence. The Streptomyces replication functions for this expression vector were provided by plasmid pIJ351, which is a derivative of pIJ101 (Keiser et al., Mol. Gen. Genet. 185:223 (1982)). The host strain used was the wild-type strain S. lividans 1326 (Bibb et al., Mol. Gen. Genet. 184:230 (1981)).

The present invention also relates to a process for producing the protein encoded by the recombinant DNA molecule of this invention, which comprises culturing the host of the invention in an appropriate medium and isolating the protein. By "appropriate culture medium" is meant a culture medium that allows the host to express the coding sequence of the invention in a recoverable quantity. One skilled in the art will recognize that the appropriate culture medium depends on the host cell employed. The protein so produced is preferably isolated from the culture medium of the host, i.e., the protein of the invention is preferably exported into the culture medium.

The protein(s) of the present invention may be isolated and purified according to standard techniques, for example extraction, selective precipitation, column chromatography, affinity chromatography or electrophoresis. For example, chimeric IgG proteins can be purified by passing a solution containing the chimeric protein through a column containing immobilized protein A or protein G, which selectively binds the Fc portion of the fusion protein. See, for example, Reis et al., J. Immunol. 132:3098-3102 (1984). The chimeric protein can then be eluted by treatment with a chaotropic salt or by a change in pH (for example, 0.3 M acetic acid). Alternatively, the protein of the invention can be purified on anti-CD4 antibody columns or on anti-immunoglobulin antibody columns.

The protein and protein product of the present invention may be used for the treatment of HIV viral infections. As a prophylactic, the CD4-IgG chimeras are administered to individuals at high risk for AIDS or to individuals who show evidence of exposure to HIV through the presence of antibodies to the virus. Administration of an effective amount of the chimeric protein at an early stage of the disease or before its onset would act to inhibit infection of CD4+ lymphocytes. As a therapy, administration of CD4-IgG chimeras to HIV-infected individuals may inhibit the extracellular spread of the virus.

The dosage ranges for administration of the chimeric protein of the invention are those that produce the desired effect, whereby the symptoms of HIV or of HIV infection are ameliorated. The amount administered is selected so as to maintain a level that suppresses or inhibits secondary infection by syncytium formation or by circulating virus throughout the period in which HIV infection is evidenced by the presence of anti-HIV antibodies, the presence of culturable virus and the presence of p24 antigen in the patient's sera. The presence of anti-HIV antibodies can be determined by the usual ELISA or Western blot assays, for example for anti-gp120, anti-gp41, anti-tat, anti-p55 and anti-p17 antibodies. The dosage will generally vary with age, extent of infection and contraindications, if any, for example immunotolerance. The dosage can range from 0.01 mg/kg/day to 50 mg/kg/day, but preferably from 0.01 to 1.0 mg/kg/day. The chimeric molecule can be administered intravenously, intraperitoneally, intramuscularly or subcutaneously. If administered parenterally, it can be given by a single injection (for example, a bolus) or by gradual infusion over time.
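The per-kilogram ranges above translate into absolute daily amounts by simple multiplication. The following sketch illustrates only that arithmetic; the function name and the 70 kg body weight are assumptions introduced for the example and do not come from the text.

```python
# Arithmetic sketch only: converts the per-kilogram dosage ranges quoted in
# the text into absolute daily amounts. The function name and the example
# body weight are hypothetical.

def daily_dose_range_mg(weight_kg, low_mg_per_kg=0.01, high_mg_per_kg=50.0):
    """Return (minimum, maximum) daily amount in mg for a body weight in kg."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("low_mg_per_kg must not exceed high_mg_per_kg")
    return weight_kg * low_mg_per_kg, weight_kg * high_mg_per_kg


if __name__ == "__main__":
    weight = 70.0  # assumed example weight in kg
    broad = daily_dose_range_mg(weight)                # 0.01 to 50 mg/kg/day
    preferred = daily_dose_range_mg(weight, 0.01, 1.0)  # 0.01 to 1.0 mg/kg/day
    print(f"broad range:     {broad[0]:.2f} to {broad[1]:.2f} mg/day")
    print(f"preferred range: {preferred[0]:.2f} to {preferred[1]:.2f} mg/day")
```

The defaults correspond to the broad range stated in the text; passing 0.01 and 1.0 gives the preferred range.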

The proteins of the invention may be used in combination with other agents, for example with agents directed against other HIV proteins, such as reverse transcriptase, protease or tat. An effective therapeutic agent against HIV should prevent virus-mediated as well as cell-to-cell transmission of the infection. The proteins may also be used in combination with other antiviral agents, for example azidothymidine (AZT).


The proteins of this invention may also be used as reagents to identify natural, synthetic or recombinant molecules that act as therapeutic agents or as inhibitors of the CD4+ cell interaction. For example, the proteins may be used in screening assays, such as protein interaction assays measured by ELISA-based methods, to screen for competitors of the CD4 receptor surface domain.

Based on in vitro data showing that soluble CD4 proteins bind to cells expressing HIV env proteins, the proteins of the present invention may also serve as selective targeting molecules for HIV-infected cells in vivo. As a target-specific carrier protein, the CD4-IgG proteins may serve, for example, as a carrier for the delivery of cytotoxic agents to infected cells, including the delivery of liposomal formulations.

The following examples are illustrative and are not to be construed as limiting the present invention.

EXAMPLES

The enzymes used in the genetic manipulations were obtained from commercial sources and were used substantially in accordance with the suppliers' instructions. Unless otherwise indicated, the procedures were carried out substantially as described in Maniatis et al., Molecular Cloning, Cold Spring Harbor Laboratory, 1989.

Example 1 Preparation of CD4-IgG chimeras

Using recombinant DNA manipulations, the coding sequence for the V1V2 domains of CD4 is followed by the human Hinge-CH2 and Hinge-CH2-CH3 regions in several plasmids, which function in various host cells.

The V1V2 region contains amino acids 1-183 (see Maddon et al., Cell 42:93-104 (1985) and Littman et al., Cell 55:541 (1988)). The Hinge-CH2 and Hinge-CH2-CH3 regions include amino acids 97-228 and 97-330 of IgG1, respectively (see Ellison et al., NAR 10:4071-79 (1982)).

A) V1V2-hCH2DHFR and V1V2-hCH2CH3DHFR (CHO cells)

The expression vector sT4184.DHFR, disclosed in EP-A-331 356 (published 6 September 1989), was modified to create plasmid V1V2183DHFR as follows. Plasmid sT4184.DHFR was cut with EcoRI and NheI, and a 682 base pair fragment containing nucleotides 1-682 of the CD4 DNA was isolated. This fragment was ligated to a synthetic linker that encodes amino acids 177-183 of CD4 followed by a TAA termination codon. In addition, the synthetic linker created a HindIII restriction site by changing nucleotides 693 (G to A) and 696 (C to T) of the CD4 sequence. These nucleotide changes do not alter the amino acid sequence. The resulting fragment, which is flanked by EcoRI and XbaI ends, was ligated into the EcoRI and XbaI sites of another sT4184.DHFR plasmid. The sequence of the synthetic linker is substantially as follows:

5' CTAGCTTTCCAGAAAGCTTCCTAAT 3'
3'     GAAAGGTCTTTCGAAGGATTAGATC 5'

The resulting plasmid, V1V2183DHFR, contains the mouse β-globin promoter, the DHFR gene, the SV40 poly A region and the CD4 sequence encoding amino acids 1-183.

V1V2183DHFR was subsequently linearized with HindIII, which cuts after the sequence encoding V1V2. Next, a BanII-AvaI fragment encoding the Hinge-CH2 region (nucleotides 299-677, Ellison et al., supra) was isolated from human IgG1 cDNA. (The HindIII restriction site was introduced by PCR mutagenesis.) Each fragment was ligated into the HindIII restriction site of a different V1V2183DHFR plasmid with the following synthetic linkers:


i) a HindIII-BanII linker, used to join the 3' end of the V1V2 gene to the 5' end of the Hinge-CH2 or Hinge-CH2-CH3 regions:

5' AGCTTCCAAGGTGGAGCC 3'
3'     AGGTTCCACC     5'

ii) an AvaI-HindIII linker, used to introduce a stop codon after the sequence encoding CH2 and to join the 3' end of the Hinge-CH2 fragment to the 5' end of the V1V2 sequence:

5' CCGAGAGTAGTGACTGCAGA 3'
3'     CTCATCACTGACGTCTTCGA 5'
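The three synthetic linkers quoted above can be sanity-checked in a few lines. The sketch below tests that the two printed strands of each linker are complementary over their paired region, reports the single-stranded ends, and translates the upper strand with the standard genetic code. The strand sequences are copied from the text; the helper names, the assumption that each lower strand starts pairing four bases into the upper strand, and the choice of reading frames are illustrative and do not come from the source.

```python
# Sketch: sanity checks on the synthetic linker duplexes quoted in the text.
# Strand sequences are copied from the text; all helper names are illustrative.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def check_duplex(top, bottom_3to5, offset):
    """Return (paired_ok, left_end, right_end) for a linker duplex.

    top          upper strand, written 5'->3'
    bottom_3to5  lower strand as printed beneath it, written 3'->5'
    offset       number of unpaired bases at the 5' end of the upper strand
    Both single-stranded ends are returned written 5'->3'.
    """
    paired = min(len(top) - offset, len(bottom_3to5))
    top_part = top[offset:offset + paired]
    paired_ok = top_part.translate(COMPLEMENT) == bottom_3to5[:paired]
    left_end = top[:offset]
    if len(bottom_3to5) > paired:
        # Lower strand extends past the upper strand: a 5' overhang.
        right_end = bottom_3to5[paired:][::-1]
    else:
        # Upper strand extends past the lower strand: a 3' overhang (or blunt).
        right_end = top[offset + paired:]
    return paired_ok, left_end, right_end


def translate(seq, frame=0):
    """Translate seq (5'->3') from the given frame; '*' marks a stop codon."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# (upper strand 5'->3', lower strand 3'->5', assumed pairing offset)
LINKERS = {
    "CD4 177-183 + stop": ("CTAGCTTTCCAGAAAGCTTCCTAAT", "GAAAGGTCTTTCGAAGGATTAGATC", 4),
    "linker i": ("AGCTTCCAAGGTGGAGCC", "AGGTTCCACC", 4),
    "linker ii": ("CCGAGAGTAGTGACTGCAGA", "CTCATCACTGACGTCTTCGA", 4),
}

if __name__ == "__main__":
    for name, (top, bottom, offset) in LINKERS.items():
        ok, left, right = check_duplex(top, bottom, offset)
        print(f"{name}: paired={ok} left_end={left} right_end={right}")
        print("  HindIII site (AAGCTT) in upper strand:", "AAGCTT" in top)
        for frame in range(3):
            print(f"  frame {frame}: {translate(top, frame)}")
```

Under these assumptions the first linker reads Leu-Ala-Phe-Gln-Lys-Ala-Ser followed by a stop in the frame that starts at its first base, which agrees with the description of a seven-residue CD4 segment followed by TAA, and linker ii carries two consecutive stop codons in one of its three frames.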

These constructs maintain the correct reading frame without adding amino acid residues foreign to CD4 or IgG1 at the HindIII junction, and are referred to herein as V1V2-hCH2DHFR and V1V2-hCH2CH3DHFR, respectively.

B) V1V2-hCH2COS and V1V2-hCH2CH3COS (COS cells)

The vectors V1V2-hCH2DHFR and V1V2-hCH2CH3DHFR were digested with NheI. Fragments of 1817 and 2147 bp, respectively, were isolated; they comprise the carboxy terminus of V2, the complete coding regions for Hinge-CH2 or Hinge-CH2CH3, the bovine poly A region, the mouse β-globin promoter and part of the mouse DHFR coding region. The vector V1V2183COS2 (described below) contains the Rous sarcoma virus LTR, the SV40 early promoter, V1V2, the SV40 early poly A region and the mouse β-globin promoter operably linked to the mouse DHFR coding region. Digestion of V1V2183COS2 with NheI removed a 1723 bp fragment extending from the carboxy terminus of V2 to the mouse DHFR coding region. The NheI fragments of V1V2-hCH2DHFR and V1V2-hCH2CH3DHFR were then ligated into the NheI sites of the V1V2183COS2 vector, restoring, in the correct reading frame, the carboxy terminus of V2 and the DHFR gene, respectively. The resulting vectors, referred to herein as V1V2-hCH2COS and V1V2-hCH2CH3COS, were used to transfect COS cells.

Construction of V1V2183COS2: to create plasmid V1V2183COS2, plasmid V1V2183DHFR was digested with EcoRI and XbaI. The EcoRI site was filled in, and a 707 bp fragment consisting of the V1V2 coding region was isolated. This fragment was ligated into plasmid Rst4COS2 (described below), which had been cut with SmaI and XbaI to delete the sT4 coding region. The resulting plasmid was V1V2183COS2.

Construction of Rst4COS2: to create plasmid Rst4COS2, plasmid sT4DHFR (Maddon et al., WO 88/01304, published 25 February 1988) was digested with SmaI and EcoRI. The EcoRI site was filled in, and a 338 bp fragment containing the SV40 early promoter was isolated. This fragment was ligated into Rst4DHFR (see below), which had been cut with BamHI and whose cohesive ends had been filled in. The resulting plasmids were screened for orientation, and the one in which the SV40 early promoter was in the opposite orientation to the RSV LTR was selected as plasmid Rst4COS2.

Construction of Rst4DHFR: plasmid TND (Connors et al., DNA 2:651-661 (1988)) was digested with BglII and HindIII. A 600 bp fragment containing the Rous sarcoma virus (RSV) LTR was isolated. Using a commercially available linker (New England Biolabs, Beverly, MA) consisting of a HindIII cohesive end, a SmaI site and an EcoRI cohesive end, the RSV LTR fragment was ligated into plasmid sT4DHFR (Maddon et al., supra), which had been digested with BglII and EcoRI to delete the SV40 early promoter.

C) OmpAV1V2-hCH2 and OmpAV1V2-hCH2CH3 (E. coli)

The vectors V1V2-hCH2DHFR and V1V2-hCH2CH3DHFR were digested with AflII and XbaI. Fragments of 746 and 1041 bp, respectively, were isolated; they contain part of V1 (approximately 73 amino acids) and the complete coding regions for Hinge-CH2 and Hinge-CH2CH3. These AflII-XbaI fragments replaced the AflII-XbaI fragment of plasmid OmpAV1V2, disclosed in EP-A-331 356 (published 6 September 1989), to create vectors containing the lambda PL promoter and the OmpA signal sequence fused to V1V2-hCH2 or V1V2-hCH2CH3. These vectors, which function in E. coli, are referred to herein as OmpAV1V2-hCH2 and OmpAV1V2-hCH2CH3.

D) V1V2-hCH2-TKK, V1V2-hCH2-KA, V1V2-hCH2Strept and V1V2-hCH2CH3Strept (Streptomyces)

The vectors V1V2-hCH2DHFR and V1V2-hCH2CH3DHFR were digested with AflII and XbaI. Fragments of 746 and 1041 bp, respectively, were isolated; they contain part of V1 and the complete coding regions for Hinge-CH2 and Hinge-CH2CH3. These AflII-XbaI fragments replaced the AflII-XbaI fragment of plasmid 12B1. Plasmid 12B1 contains V1V2 operably linked to the promoter and signal sequence of the Streptomyces longisporus trypsin inhibitor (LTI) (see EP-A-264 175, published 20 April 1988), together with the Streptomyces replication functions found on plasmid pIJ351 (Kieser et al., Mol. Gen. Genet. 185:223 (1982)). The resulting vectors, referred to herein as V1V2-hCH2Strept and V1V2-hCH2CH3Strept, comprise the LTI signal sequence fused to V1V2-Hinge-CH2 or V1V2-Hinge-CH2CH3. These vectors are designed to function in Streptomyces and to export V1V2-hCH2 or V1V2-hCH2CH3 into the culture medium.

However, the naturally occurring LTI protein is expressed as a prepro-protein; that is, the signal sequence is cleaved upon secretion and the pro-sequence is cleaved extracellularly. As a result, the isolated LTI protein may have heterogeneous amino termini. Expression of V1V2-hCH2 or V1V2-hCH2CH3 from V1V2-hCH2Strept and V1V2-hCH2CH3Strept, respectively, reproduces this anomaly; that is, it yields proteins with heterogeneous amino termini. The ratio of pro-protein to mature protein (that is, protein lacking all amino acid residues of the signal sequence) can be altered by varying the culture conditions, for example pH, dissolved O2, etc.

Alternatively, the wild-type prepro-LTI sequence (that is, the signal sequence plus the pro-peptide) was modified to produce proteins with homogeneous amino termini. The wild-type prepro sequence comprises the following amino acids:

Met-Arg-Asn-Thr-Ala-Arg-Trp-Ala-Ala-Thr-Leu-Ala-Leu-Thr-Ala-Thr-Ala-Val-Cys-Gly-Pro-Leu-Thr-Gly-Ala-Ala-Leu-Ala-↓-Thr-Pro-Ala-Ala-Ala-Pro-Ala-Ser;

where ↓ marks the cleavage site between the signal sequence and the pro-protein. In one construct, the pro-protein sequence was deleted and replaced by Thr (T) (the amino terminus of the mature CD4 protein is KK). The amino terminus of this chimeric CD4-immunoglobulin protein is therefore Thr-Lys-Lys (TKK). In another construct, the pro-peptide sequence and the first two amino acids of the mature chimeric protein were deleted and replaced by Lys-Ala (KA). The vectors encoding these V1V2-hCH2 proteins are referred to herein as V1V2-hCH2-TKK and V1V2-hCH2-KA, respectively; they encode V1V2-hCH2 proteins with homogeneous amino termini of Thr-Lys-Lys and Lys-Ala, respectively. The complete nucleotide sequence of the V1V2-hCH2-KA vector and the corresponding amino acid sequence of the V1V2-hCH2-KA chimeric protein are disclosed as follows:


[Sequence listing for the V1V2-hCH2-KA vector, nucleotides numbered from 1. The nucleotide rows, including the 5' and 3' flanking regions, are illegible in this copy; the amino acid translation printed beneath the coding region is reproduced below. The first column is the position of the first nucleotide of each row and the second column is the number of the first residue of that row.]

 648    1  Met Arg Asn Thr
 660    5  Ala Arg Trp Ala Ala Thr Leu Ala Leu Thr Ala Thr Ala Val Cys Gly Pro
 711   22  Leu Thr Gly Ala Ala Leu Ala Lys Ala Val Val Leu Gly Lys Lys Gly Asp
 762   39  Thr Val Glu Leu Thr Cys Thr Ala Ser Gln Lys Lys Ser Ile Gln Phe His
 813   56  Trp Lys Asn Ser Asn Gln Ile Lys Ile Leu Gly Asn Gln Gly Ser Phe Leu
 864   73  Thr Lys Gly Pro Ser Lys Leu Asn Asp Arg Ala Asp Ser Arg Arg Ser Leu
 915   90  Trp Asp Gln Gly Asn Phe Pro Leu Ile Ile Lys Asn Leu Lys Ile Glu Asp
 966  107  Ser Asp Thr Tyr Ile Cys Glu Val Glu Asp Gln Lys Glu Glu Val Gln Leu
1017  124  Leu Val Phe Gly Leu Thr Ala Asn Ser Asp Thr His Leu Leu Gln Gly Gln
1068  141  Ser Leu Thr Leu Thr Leu Glu Ser Pro Pro Gly Ser Ser Pro Ser Val Gln
1119  158  Cys Arg Ser Pro Arg Gly Lys Asn Ile Gln Gly Gly Lys Thr Leu Ser Val
1170  175  Ser Gln Leu Glu Leu Gln Asp Ser Gly Thr Trp Thr Cys Thr Val Leu Gln
1221  192  Asn Gln Lys Lys Val Glu Phe Lys Ile Asp Ile Val Val Leu Ala Phe Gln
1272  209  Lys Ala Ser Lys Val Glu Pro Lys Ser Cys Asp Lys Thr His Thr Cys Pro
1323  226  Pro Cys Pro Ala Pro Glu Leu Leu Gly Gly Pro Ser Val Phe Leu Phe Pro
1374  243  Pro Lys Pro Lys Asp Thr Leu Met Ile Ser Arg Thr Pro Glu Val Thr Cys
1425  260  Val Val Val Asp Val Ser His Glu Asp Pro Glu Val Lys Phe Asn Trp Tyr
1476  277  Val Asp Gly Val Glu Val His Asn Ala Lys Thr Lys Pro Arg Glu Glu Gln
1527  294  Tyr Asn Ser Thr Tyr Arg Val Val Ser Val Leu Thr Val Leu His Gln Asp
1578  311  Trp Leu Asn Gly Lys Glu Tyr Lys Cys Lys Val Ser Asn Lys Ala Leu Pro
1629  328  Ala Pro Ile Glu Lys Thr Ile Ser Lys Ala Lys Gly Gln Pro Arg Glu
3120 0GATOG03IG GKTOK30QGG TOOCBGOOCT TGATCTGOOC CfiCQGTGfiCT TCGGTCEOGC 3180 GGMOVIGX GAXEAXOS AlCOQCTCIC G3ATCOOCTC GOQGIOSQOG GCOOGGTQCC 3240 OGTOCTTQGC C0Q30SXCX GOCCmGC CGOOCGIGftl QGTOGCIGG TRGGOQCCCG 3300 GCXXPOGHGG GCTOTOOGQC GICnCQGGG TOOOCTOSC GQOGTJOCKTG A3GT0030GA 3360 GOOGGTOOGT GIOOXAFOG CGGQG03TGA AGGIGAXA3 GQOGOGGIC OGOOG3GGCT 3420 TGKEOOOC GBOQCGQX GCGGIGA1CT CCTOGQOOOG CTEGKXOQG AIOGTO3CGG 3480 CGQGfiGOGG GCfiGRGOCAG AXCQOOOGC m33QG GOOQGGAX ÃCGGfiCGTIC 3540 OQQOOGOOGT CIQGGOGRCG ADCRCGOOQG A3GCBGQGIC CAICAGGGX OGGOQXfiGC 3600 OCTroCfiCOC GGOGTOOOOG CXATOOQX AC&GOGDOCG QOGQOGGCTG TRO0GG30GG 72 859 SBC Case 14497-12220 AAGTQGQOOCACGGGGGMUTGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGCGGCGCGCGCGCGCGCGCGGCGCGGCGCGCGCGCGCGCGGCGCGCGCGCGCGCGGCGGCGCGCGCGCGCGGCGCGCGCGCGCGGCGCGCGCGCGCGGCCGCGCGCGCGGCGCGCGCGCGGCCGCGCGCGCGGCCGCGCGCGGCCGCGCGCGCGGCCGCGCGGCGGCCGCGCGCGCGGCCGCGCGCGCGGCCGCGCGCGCGGCCGCGCGCGCGGCCGCGCGCGCGGCCGCGCGCGCGGCCGCGCGCGCGGCCGCGCGCGCGGCCGCGCGCGCGGCCGCGCGCGCGGCCGCGCGCGCGGCCGCGCGCGCGGCCGCGCGCGCGGCCGCGCGCGCGGCCGCGCGCGCGGCCGCGCGCGCGGCCGCGCGCGGGCCGCGCGCGGCCGCGCGCGGCCGCGCGCGGCCGCGGCGCGGCGCGCGCGGCCGCGCGCGGCCGCGCG 2400 OQCQCTGr GTOGG00G0G TOGGOQGX OQGGAXTC ΟΟΧΑΧΧΤ TCGOCQGGG 2460 OOGOOGGATC CTOCTTOCCG TOCTTOOCIT OQQOGGOOOG GGTOGOCTOS AXXGAXG 2520 GGOGGCGQGT GMOGOGTGC CKTOOGIXrr OQGICfiCGGC GfiCOOOGGOC CGQGCTOCC 2580 OGOOGTOGGC GTOGQOGGOC AXAXA3AT GGBGGTOGIC G30CTOGGX TCGOOGOOGT 2640 CGAQGCOQG CATCTQOOGC AXQGCQX TOQTTOGAT GGCOCGGOGT OCCCGGGITG 2700 OOCÊCTOGTA. CTOGTQOOG OGCGfiGRGGT TCQCTOQG CGRGOOGAX CCG30GG0GT 2760 CGICCTCGGT CKTG00QC0G GXaQGTOOC CQOCCGIOC © GGAGTXG AACGGQQCGA. 2820 OCTrOOOGX GGroGOOGTC TròfiGGTQGG CGCGGG3QG TTOQGGGX GGOGCCTICC 2880 OCTOCTGGGr CTTQ30GMDG 1SCTCQQ0GA GGT0GTTQ3C GTOGCQCTOG GTCTOQGCC 2940 GCTTGAAGTC GfiCQOOGrGC OQGTOG? 
ICGG QOGfIGMGGC GQGGTIGAX TTQOGQGQG 3000 OGGOQGTCCA CA33QCOX CfiGIGOOOCr QXZOOGIC GBOOGOGGOG OCGGICQGCT 3060 (XftAXTQX GSCGMCTQC TTCGOQGPCC GCTOCOOCTC GGICOGGOOG OCGftOCAXA. 3120 0GATOG03IG GKTOK30QGG TOOCBGOOCT TGATCTGOOC CfiCQGTGfiCT TCGGTCEOGC 3180 GGMOVIGX GAXEAXOS AlCOQCTCIC G3ATCOOCTC GOQGIOSQOG GCOOGGTQCC 3240 OGTOCTTQGC C0Q30SXCX GOCCmGC CGOOCGIGftl QGTOGCIGG TRGGOQCCCG 3300 GCXXPOGHGG GCTOTOOGQC GICnCQGGG TOOOCTOSC GQOGTJOCKTG A3GT0030GA 3360 GOOGGTOOGT GIOOXAFOG CGGQG03TGA AGGIGAXA3 GQOGOGGIC OGOOG3GGCT 3420 TGKEOOOC GBOQCGQX GCGGIGA1CT CCTOGQOOOG CTEGKXOQG AIOGTO3CGG 3480 CGQGfiGOGG GCfiGRGOCAG AXCQOOOGC m33QG GOOQGGAX ÃCGGfiCGTIC 3540 OQQOOGOOGT CIQGGOGRCG ADCRCGOOQG A3GCBGQGIC CAICAGGGX OGGOQXfiGC 3600 OCTroCfiCOC GGOGTOOOOG CXATOOQX AC & GOGDOCG QOGQOGGCTG TRO0GG30GG 72 859 SBC Case 14497-1

/Λ -- -28/ Λ -28

3660 CmCCGCTG TCGGGCPGCC TOGGTCCGOG flGGTGCTTOC TPCTTOCCPC PGGCIGTCGC 3720 CTCTOGOacr CTOuCCATOC ACCOOGTOOG GPGPAMOGC AGGTTOGGAGG GG1QCX33jAA 3780 ÍCTCTGTIGT TlCrnOCCA AGCTCTTOX mTGOCTOG QQOQXATCT OGOSICPCAC 3840 gogcgrtogc cogcttcgct qccatocggc ígcegtcpga g&gfpgxta ogoggoogtt 3900 TGOCCGGTGT GTQGGCftAIT GOQCTOOOGC AjTOXfiGOG QQQGOOGGOG GGCOGATCTG 3960 QCAATCCCTC GGCMCQCTC OGTRCIOQG QCBOGfiGCAA OSITOCTGTC 1CG000GGCT 4020 AÍGGQGOGOG AGOOGGfiG OGGfiCGQSIC 0303100®. AGTOOGG0OC GITXTCTTT 4080 GGTCTGGTQS GAA1CCTGGC AOCAKTOG33 COGAGOTIC CCTCCQCCPC TCOOGPCQX 4140 CCITQQQQCr GGIGIOACiT GGRG3300GA AGftGfiGOOOC G00GQ3TA1C OQGOQQ3GCT 4200 TI3G3IG0G GTOGIGOCT GTGTO330GA QOGftTQQOCA OGfiGGOCCTG GAAGOOGSGC 4260 Q3ICOGGOGA A3ICGG00CA GICGCAWXG GXTCAGOGC ÍOIGGGCGGA OCftOOCftXG 4320 CCGraQQQSr CCIOG&CCftG GITCftOGGIC COCICQGTCA. GGCGTOCGTC GWOICQGTC 4380 MX3GICQGTC TOCIGGTGGG TQQQQQCQ3G QOGOCSGCfiC GRAGIGXGG CGCCCCGOGG 4440 QGOTrGGTOG GGTCBGGOGC CGAADCQ33G GGOQGOGGCG G0GM3GGC OGTCGGOGGC 4500 G00CKFGG0G CGSrCQOGGT OGGTGGTGPO GGOGGTGOGG TCGG0Q3CGG 0CCW3TCCTG 4560 CIOCRCGCA ftOGTCERCGT QGfiOOCM Q30SOCGA TOGPflGTOQG GGPGTOOGTC 4620 GCTOOGGOQG G00GCTQQ0G aJTOCftOCGG C30GA10C TOQOOCATIG AGAAG000CG 4680 TCfiGGCfiCOG COOGBCGGGG CTICITCPCG TOCRGflCGfiC GTOGITrOOG QQGGTIQOCC 4740 A30QQGTIGA GQOGITGCflC OOCGGTTGfiC OOOOOGCAAC CQQSICTGPC CTQ0SW7IT 4800 GC7W3GTTX GPiXGTTIOC AGG&GGG03C OCrRCQOGTG 033303CGPG GAflGXATTT 4860 TTGMCTRGC TTOOGGflGCC CTICTOGGCC TOQGOCTTOG 0QC&CG2EG GGCOGTCITC 4920 GGBGOGfiCOG COGTCfiCCTC GITGftTQOQG OGSEftGGGGft. 
OCTCCMOOG GAOOOOCICC 4980 GOGMCMOG GG0G3GCTC σΓΙΓΤ^ΑΑΙΤ: TOGTCCPGCr OOGOGPGGPG CTrGRTOOGC 5040 TGCTGCCCCA AOQGCnCflG OGOOGOCICr Q0CICQG00C GGMCICGOC 03GIGIUITT 5100 TGOGTCAT® AGTCRTOCTG AXGBCIGIG TOGICIGOG CAACTAGTTC A3GTO0GTIT 5160 TrrOCOGOC AfiCITPOOCT AOGTCMX3A GQOGGOOOQC GBG0QGG00G CGCQGCCOGG 5220 ooamm: goooocgct cctctcttog ooooGcrooG qoocggccgc ogaoogqox3660 CmCCGCTG TCGGGCPGCC TOGGTCCGOG flGGTGCTTOC TPCTTOCCPC PGGCIGTCGC 3720 CTCTOGOacr CTOuCCATOC ACCOOGTOOG GPGPAMOGC AGGTTOGGAGG GG1QCX33jAA 3780 ÍCTCTGTIGT TlCrnOCCA AGCTCTTOX mTGOCTOG QQOQXATCT OGOSICPCAC 3840 gogcgrtogc cogcttcgct qccatocggc ígcegtcpga g & gfpgxta ogoggoogtt 3900 TGOCCGGTGT GTQGGCftAIT GOQCTOOOGC AjTOXfiGOG QQQGOOGGOG GGCOGATCTG 3960 QCAATCCCTC GGCMCQCTC OGTRCIOQG QCBOGfiGCAA OSITOCTGTC 1CG000GGCT 4020 AÍGGQGOGOG AGOOGGfiG OGGfiCGQSIC 0303100® . AGTOOGG0OC GITXTCTTT 4080 GGTCTGGTQS GAA1CCTGGC AOCAKTOG33 COGAGOTIC CCTCCQCCPC TCOOGPCQX 4140 CCITQQQQCr GGIGIOACiT GGRG3300GA AGftGfiGOOOC G00GQ3TA1C OQGOQQ3GCT 4200 TI3G3IG0G GTOGIGOCT GTGTO330GA QOGftTQQOCA OGfiGGOCCTG GAAGOOGSGC 4260 Q3ICOGGOGA A3ICGG00CA GICGCAWXG GXTCAGOGC ÍOIGGGCGGA OCftOOCftXG 4320 CCGraQQQSr CCIOG & CCftG GITCftOGGIC COCICQGTCA. GGCGTOCGTC GWOICQGTC 4380 MX3GICQGTC TOCIGGTGGG TQQQQQCQ3G QOGOCSGCfiC GRAGIGXGG CGCCCCGOGG 4440 QGOTrGGTOG GGTCBGGOGC CGAADCQ33G GGOQGOGGCG G0GM3GGC OGTCGGOGGC 4500 G00CKFGG0G CGSrCQOGGT OGGTGGTGPO GGOGGTGOGG TCGG0Q3CGG 0CCW3TCCTG 4560 CIOCRCGCA ftOGTCERCGT QGfiOOCM Q30SOCGA TOGPflGTOQG GGPGTOOGTC 4620 GCTOOGGOQG G00GCTQQ0G aJTOCftOCGG C30GA10C TOQOOCATIG AGAAG000CG 4680 TCfiGGCfiCOG COOGBCGGGG CTICITCPCG TOCRGflCGfiC GTOGITrOOG QQGGTIQOCC 4740 A30QQGTIGA GQOGITGCflC OOCGGTTGfiC OOOOOGCAAC CQQSICTGPC CTQ0SW7IT 4800 GC7W3GTTX GPiXGTTIOC AGG & GGG03C OCrRCQOGTG 033303CGPG GAflGXATTT 4860 TTGMCTRGC TTOOGGflGCC CTICTOGGCC TOQGOCTTOG 0QC & CG2EG GGCOGTCITC 4920 GGBGOGfiCOG COGTCfiCCTC GITGftTQOQG OGSEftGGGGft. 
OCTCCMOOG GAOOOOCICC 4980 GOGMCMOG GG0G3GCTC σΓΙΓΤ ^ ΑΑΙΤ: TOGTCCPGCr OOGOGPGGPG CTrGRTOOGC 5040 TGCTGCCCCA AOQGCnCflG OGOOGOCICr Q0CICQG00C GGMCICGOC 03GIGIUITT 5100 TGOGTCAT® AGTCRTOCTG AXGBCIGIG TOGICIGOG CAACTAGTTC A3GTO0GTIT 5160 TrrOCOGOC AfiCITPOOCT AOGTCMX3A GQOGGOOOQC GBG0QGG00G CGCQGCCOGG 5220 ooamm: goooocgct cctctcttog ooooGcrooG qoocggccgc ogaoogqox

5280 GOGCBCaCGA OGGGGGOGGC ATCGGIQGQC GGTECfiCSDG GOGCCTGCIC GGOGGBCTGC 5340 GGGCMOGOC GTTGIGCTOG 003CM3GGC MOGGGGfiGG Q3IQ33GGSC TOQOGQQQCT 5400 ÍCG0GG00GT CEGBGOQOCT GTOGOCTOC OGGBGOGOCG TfiOOOOOGOC CTOQOQGTCC 5460 TGBGOCGOGT GW3Q0GBCX TCfiGOOOOGT (JjTOGGTTGC TOG3GPGCPC CTOCTQOOGC 5520 GftlGPOGIGG CQ330GTOGA GCTQGICPGC CGIGOGGCTC OGTOGTGGOC GGTCKTOOGG 5580 CDGOOOGMC CTQGUQQGCA AGMGOOGGC GGAAC0Q30G GfiOCIOGSCC GCGfiCQGCEA 5640 T00QG0Q3QC 0Q03QG0GQG CCTOOGIPGA QQQOGfiGQGC GGGCGCCATC CCGPOCGXA 5700 OQQGGGGOCA CP0300CRQG A3CTO003C3V OGMCfiGOGT C003C0GPCC A3G&3CSGTC 5760 OGGOCRGC3C TQ30GCEMC QOCTOCTOCT GGKCCGGIC CIGGTGGTOC ArCfiGICCIC 5820 ooajixaiCA. paxnwrpc ticaiatocg ggsvtogacc goocgogitc oggmqggga5280 GOGCBCaCGA OGGGGGOGGC ATCGGIQGQC GGTECfiCSDG GOGCCTGCIC GGOGGBCTGC 5340 GGGCMOGOC GTTGIGCTOG 003CM3GGC MOGGGGfiGG Q3IQ33GGSC TOQOGQQQCT 5400 ÍCG0GG00GT CEGBGOQOCT GTOGOCTOC OGGBGOGOCG TfiOOOOOGOC CTOQOQGTCC 5460 TGBGOCGOGT GW3Q0GBCX TCfiGOOOOGT (JjTOGGTTGC TOG3GPGCPC CTOCTQOOGC 5520 GftlGPOGIGG CQ330GTOGA GCTQGICPGC CGIGOGGCTC OGTOGTGGOC GGTCKTOOGG 5580 CDGOOOGMC CTQGUQQGCA AGMGOOGGC GGAAC0Q30G GfiOCIOGSCC GCGfiCQGCEA 5640 T00QG0Q3QC 0Q03QG0GQG CCTOOGIPGA QQQOGfiGQGC GGGCGCCATC CCGPOCGXA 5700 OQQGGGGOCA CP0300CRQG A3CTO003C3V OGMCfiGOGT C003C0GPCC A3G & 3CSGTC 5760 OGGOCRGC3C TQ30GCEMC QOCTOCTOCT GGKCCGGIC CIGGTGGTOC ArCfiGICCIC 5820 ooajixaiCA paxnwrpc ticaiatocg ggsvtogacc goocgogitc oggmqggga

5880 A3G0GGQGA. GCmQOOG ÍGW30OCGA. CTDXCCTTG OGTTOGTGAT TGOOQGTCftG 5940 QXPGXA1C CGOCMCGTC QOGEBGGGTC TOOCOOCA GGAATCGCGT OOGWOC 6ooo AsamsõT aggpogpcca tsosgit asooam: ggaaaiccgt ccgmooosc 6060 GGTGQVGOGG A1CMXXMG TCMCftMOC GICGCGATOC Μ&&ΚΚΆ CMCGTIGAT 6120 OGBGGACGiEC GM3GQrrav T0C2OGCAT OGOGGCOGGG GTOGRCTICA TCGPQGTCia. -29- 72 859 SBC Case 14497-1 /"· ....... ' ei5880 A3G0GGQGA. GCmQOOG IWW30OCGA. CTDXCCTTG OGTTOGTGAT TGOOQGTCftG 5940 QXPGXA1C CGOCMCGTC QOGEBGGGTC TOOCOOCA GGAATCGCGT OOGWOC 6ooo AsamsõT aggpogpcca tsosgit asooam: ggaaaiccgt ccgmooosc 6060 GGTGQVGOGG A1CMXXMG TCMCftMOC GICGCGATOC Μ & & ΚΚΆ CMCGTIGAT 6120 OGBGGACGiEC GM3GQrrav T0C2OGCAT OGOGGCOGGG GTOGRCTICA TCGPQGTCia. -29- 72 859 SBC Case 14497-1 / "

eieo (SXPGOOC WXPGTOCTT TTCCMCKA GTTQCTQGKT CIUIOOGC33C GGCPGAACAT 6240 ACCGGTOOGC CTCftTOGflCT OCTOCaTCCT CMCCRGITC TICAAGQQQG ÍGOQGMGQC 6300 OtfGPCMTC GQCMOGCOC G0GT00CT0G OOOQGOOG3 TTOQGOGftlA TCGCGfiGCCG 6360 OOGTQQGGRC GTOGTCGTTC TCGfiCGGGGT GRfiGAICGTC GGGWOICG GCQCGAZPGT 6420 ADGCfiCGKE CTOGOGCTOG GW303TOQQG GATCATOC1G GlOGfiCftGIG ÍOICACCAG 6400 carooooac cgqogtctoc aaagoqocsg oogsggitac gtcitctocc ttcccgtcgteieo (SXPGOOC WXPGTOCTT TTCCMCKA GTTQCTQGKT CIUIOOGC33C GGCPGAACAT 6240 ACCGGTOOGC CTCftTOGflCT OCTOCaTCCT CMCCRGITC TICAAGQQQG ÍGOQGMGQC 6300 OtfGPCMTC GQCMOGCOC G0GT00CT0G OOOQGOOG3 TTOQGOGftlA TCGCGfiGCCG 6360 OOGTQQGGRC GTOGTCGTTC TCGfiCGGGGT GRfiGAICGTC GGGWOICG GCQCGAZPGT 6420 ADGCfiCGKE CTOGOGCTOG GW303TOQQG GATCATOC1G GlOGfiCftGIG ÍOICACCAG 6400 carooooac cgqogtctoc aaagoqocsg oogsggitac gtcitctocc ttcccgtcgt

6540 TCTCrCCQGr CGOGSGGfiGG OCATOOCCIT CMTCQGGfiC ÍGOXTMTOC A3CIGA1GAC 6600 GCICftfiGQOG GRTOXGaCA TTICOGKAA GGKOCX3QG GftCAATOCQG ATCGGCIQGC 6660 crrocronc gqcbgogaaa AGGGTQoaoc ttxgkxtg ttcgaggmg ogtctioogc6540 TCTCrCCQGr CGOGSGGfiGG OCATOOCCIT CMTCQGGfiC IGOXTMTOC A3CIGA1GAC 6600 GCICftfiGQOG GRTOXGaCA TTICOGKAA GGKOCX3QG GftCAATOCQG ATCGGCIQGC 6660 CrcGcGgGaAa AGGGTQOAOC ttxgkxtg ttcgaggmg ogtctioogc

6720 CrOQGTTIOC ATCCOCMGA IGfiGOCfGPC CGftGTCTCFC AftXTITCOG TTTCOCTOGG 6780 AATCGCGCTG CPOSOGGA TOSOGGAA TCICGCQ30C AWOGMMG OGOCTCTGTT 6840 OCTOXPOX TCQCTKCIC GPCCICGA1T OGTCfiGIGftT GftlCPCCCOG íCfiGDGGATC 6900 AfiGGGGTTTG CGGGTOOCGG TCGGCGCCG5 GOQQQGGftGG OfíaúCOSC OGftCOCTQX 6960 TCIGGGftOGG GOOSjftOGGC A3GG3GPOCG GOQQCCGQGC GW3CIGCAGG CATGCAA3CT 7020 T333O0QC CGIOCTITEA. CAfiCGICGIG ACTOGGAAAA COCTGGCGIT ÍCCCMOTA 7080 HECGOCTTGC A33OIC0C (XTITO3XA GÇIGQOGEAA TPflOSftAGPG QCCOGCPCCG 7140 ATOQOOCnx: RSSKTIFNFM ROCWOGIT G03CW30CIG AKTGQOGAAT GGCQOCIGAT 7200 GCOSERTlTr CTOCTTMGC ATCTGIGOQG TMTTCPO£ GXMMGGT GCACTCICAG 7260 TftCPAlCIQC TCIGftlGCCG CftmGTUJC OCBGOOOOGA 0033333A CPCCOGCTGA 7320 OQOQOCCIGA CGGGCnGTC TGCTOOOQQC A1CCGCTTPC JOCftfiGCTG TGfiCOGTCTC 7380 OGGGPfiCTOC AICTGTCÃGft. GSTmCMC GUCKOCOG AAflCGOQOGA GACGAAAGGG 7440 CCTCGTGftIA CGGCIMTIT TAIM3TTAA TGDCMDGMA M&ATGGTTT CrCOCCTC 7500 A33TQQ3CT TT10GG3GAA MGTQOQGGG AftCOCCMT TOITiMTPP TCTAAfiTRCA 7560 TiaWVEATC TMCCQCTCA TGRGfiCftMA íCOCTGMftA. MCCTTCAftT MIATTGAAA 7620 MGGAftGZCT MGfiGEATTC MCKETEOOG TOTOGOOCTT ATTOCCTTTT TTGOGQCMT 7680 TK30CIT0CT GrrmQCIC PCC&&MC OCIQjTGftAA. GE&AAfiGftTC CTGft?GAlCA. 7740 GITOQCTXft. CGRGTQ3GIT aCRTOGMCT OGATCICMC KXXXZNGk TCCTTGfGRG 7800 TmCGOOOC GAfCRMCTT TTCCAA1GAT GK£fiCTTIT AftK3TICIGC TAIGIGXGC 7860 QCTEAmroC CGTMTGftCG COGGXAAGA GCBftCICSSr OQOOGCRXftC ATEAITCICA 7920 GftATCftCrTG GTTGRGQCr OCCRGICRC AGAAAfiGCAX CmCGGfilG QCA1QOGT 7980 MGK3U!Em TQCPCTGCTG OCMMCCftT GfiGIGMJiíC ACKXX3Q0CA KTEfiCITCT 8040 QOK22A1C GGPGGRCCGft. ÍGGSGCIMC CQCTTTPriG OCMCMO3 GGGftTOMGT 8100 AídOOOCrr gatositcgs amxggroct GAAIGMGGC AIWXAAAX AaaGCGIQV 8160 CfCC3raro CCTCTCCAA TOGCAft^SC GTTGCGCAAA CEATEWOG GCGWOSCr 8220 TPCICIMCr T03JGQÇRÃC ΆΚΠΜΰΧΆ. CT33VDQGAG G03GKEAAÍG TTGCAG3CC 8280 ACnCIGOa TGGGOOCTTC (Π3ΏΠΟΠΧ3 GmMTOCT GATftAATCIG GSfíOOuGTCA 8340 GCGTOGGTCT CGCQGIfiTCA. TTOCSGCSCT GGGXCf^T GGCtfCCCCT CCGGTKTOGT 8400 MTDfflCEflC A33^G33GA. 
GICSí33CAÃC TftTOGAlGRA OGAAAIRGfiC MKTOQCIGA 8460 GAIRGGTOCC TCfiCTCATTA. ΑΟΖΑΓΙΠ11Ά PCTGTCN3C CAftjITEACT CAXA1AXACT 8520 TDOOTCKr TTftAAPCTTC ΜΤΠΤΡΛΤΤ TftAAfiQGSTC 13í3C3PatfGA TCCITITIIA asso TftKDCTCftTG AXAAAfiTOC CITWmGA UllTlUbTlC OOSfiGOGT CRÍSOOOOGT 8640 íGftAAAuMC AAftGSKTCTT CrEGWaiOC ΊΤΙΤΓΤΊΓΠΞ OOCGUAICT GCTOCnOÇA. -30- 72 859 SBC Case 14497-1 - --'j6720 CrOQGTTIOC ATCCOCMGA IGfiGOCfGPC CGftGTCTCFC AftXTITCOG TTTCOCTOGG 6780 AATCGCGCTG CPOSOGGA TOSOGGAA TCICGCQ30C AWOGMMG OGOCTCTGTT 6840 OCTOXPOX TCQCTKCIC GPCCICGA1T OGTCfiGIGftT GftlCPCCCOG íCfiGDGGATC 6900 AfiGGGGTTTG CGGGTOOCGG TCGGCGCCG5 GOQQQGGftGG OfíaúCOSC OGftCOCTQX 6960 TCIGGGftOGG GOOSjftOGGC A3GG3GPOCG GOQQCCGQGC GW3CIGCAGG CATGCAA3CT 7020 T333O0QC CGIOCTITEA. CAfiCGICGIG ACTOGGAAAA COCTGGCGIT ÍCCCMOTA 7080 HECGOCTTGC A33OIC0C (XTITO3XA GÇIGQOGEAA TPflOSftAGPG QCCOGCPCCG 7140 ATOQOOCnx:. RSSKTIFNFM ROCWOGIT G03CW30CIG AKTGQOGAAT GGCQOCIGAT 7200 GCOSERTlTr CTOCTTMGC ATCTGIGOQG TMTTCPO £ GXMMGGT GCACTCICAG 7260 TftCPAlCIQC TCIGftlGCCG CftmGTUJC OCBGOOOOGA 0033333A CPCCOGCTGA 7320 OQOQOCCIGA CGGGCnGTC TGCTOOOQQC A1CCGCTTPC JOCftfiGCTG TGfiCOGTCTC 7380 OGGGPfiCTOC AICTGTCÃGft GSTmCMC GUCKOCOG AAflCGOQOGA GACGAAAGGG 7440 CCTCGTGftIA CGGCIMTIT TAIM3TTAA TGDCMDGMA M &. ATGGTTT CrCOCCTC 7500 A33TQQ3CT TT10GG3GAA MGTQOQGGG AftCOCCMT TOITiMTPP TCTAAfiTRCA 7560 TiaWVEATC TMCCQCTCA TGRGfiCftMA íCOCTGMftA MCCTTCAftT MIATTGAAA 7620 MGGAftGZCT MGfiGEATTC MCKETEOOG TOTOGOOCTT ATTOCCTTTT TTGOGQCMT 7680 TK30CIT0CT GrrmQCIC CCP & &. MC OCIQjTGftAA GE & AAfiGftTC CTGft Gâlcă 7740 GITOQCTXft?. CGRGTQ3GIT aCRTOGMCT OGATCICMC KXXXZNGk TCCTTGfGRG 7800 TmCGOOOC GAfCRMCTT TTCCAA1GAT GK £ fiCTTIT AftK3TICIGC TAIGIGXGC 7860 QCTEAmroC CGTMTGftCG COGGXAAGA GC BftCICSSr OQOOGCRXftC ATEAITCICA 7920 GftATCftCrTG GTTGRGQCr OCCRGICRC AGAAAfiGCAX CmCGGfilG QCA1QOGT 7980 MGK3U! In TQCPCTGCTG OCMMCCftT GfiGIGMJiíC ACKXX3Q0CA KTEfiCITCT 8040 QOK22A1C GGPGGRCCGft. 
CQCTTTGGGGGGGGGCGGGGGGGGGGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGGGGGCGCGCGGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCAAACGCGCGCGCAAA CEATEWOG GCGWOSCr 8220 TPCICIMCr T03JGQRR. CT33VDQGAG G03GKEAAÍG TTGCAG3CC 8280 ACnCIGOa TGGGOOCTTC (Π3ΏΠΟΠΧ3 GmMTOCT GATftAATCIG GSfíOOuGTCA 8340 GCGTOGGTCT CGCQGIfiTCA. TTOCSGCSCT GGGXCf ^ T GGCtfCCCCT CCGGTKTOGT 8400 MTDfflCEflC A33 ^ G33GA. GICSí33CAÃC TftTOGAlGRA OGAAAIRGfiC MKTOQCIGA 8460 GAIRGGTOCC TCfiCTCATTA. ΑΟΖΑΓΙΠ11Ά PCTGTCN3C CAftjITEACT CAXA1AXACT 8520 TDOOTCKr TTftAAPCTTC ΜΤΠΤΡΛΤΤ TftAAfiQGSTC 13í3C3PatfGA TCCITITIIA bake TftKDCTCftTG AXAAAfiTOC CITWmGA UllTlUbTlC OOSfiGOGT CrISOOOOGT 8640 iGftAAAuMC AAftGSKTCTT CrEGWaiOC ΊΤΙΤΓΤΊΓΠΞ OOCGUAICT GCTOCNOZA -30- 72 859 SBC Case 14497-1 -

8700 AK3AAAAAA CCPCGOCTC (3GCGGTO3T TTGnTQCOG GftTCftASftGC TAOCMCICT 8760 TITICCGRfiG GTWCIQGCT TCRGCfiGfta: GOGMMCA AA3303TCC TICEaGIGTA 8820 GOOSTASm QQCCAOCSCT TCMGWCIC TGEPCCftOOG OCEPCAEMC TCQCTCTGCT 8880 AKTOCrarTA CCSGTOQCTG CIGOCfiCTQG CGAlAflGTCG TSTCTEACOG QGTIQSOC 8940 AK2CGAIK3 TOOCXSVm «30033005 GTOGGGCIGA AOGGQQGGIT OG1GC?OCA 9000 Q00CSCCTTC GM30GRK3GA CCTTOOCGA. ;OGAGM3£ CPCSOCGIC A3CIMGRGA 9060 AAQ0GCCW3G CTT000GAM3 QGfiGAAftQX 03000131 OCGGTAAGCG GCA3GGTCGG 9120 «OOGRGRG OGCfiOGfiGGG «3CTTCCfl03 G33WO3X ΤΌΟΙΜΌΠΤ AIPOTOCIGT 9i8o oGooiTior cpocicigpc ttgaxctcg attttioiga toctooktg gggqooggag 9240 0CEKT33AA AfiCGCCfiGCA ÍCGCOGCCIT TTOCQGTTC CIOQOCmT GCT330CITT 93oo TOCicaoiG rrcmccio oomicca: tgritctoig gm3aocgea toocgccit8700 AK3AAAAAA CCPCGOCTC (3GCGGTO3T TTGnTQCOG GftTCftASftGC TAOCMCICT 8760 TITICCGRfiG GTWCIQGCT TCRGCfiGfta:. GOGMMCA AA3303TCC TICEaGIGTA 8820 GOOSTASm QQCCAOCSCT TCMGWCIC TGEPCCftOOG OCEPCAEMC TCQCTCTGCT 8880 AKTOCrarTA CCSGTOQCTG CIGOCfiCTQG CGAlAflGTCG TSTCTEACOG QGTIQSOC 8940 AK2CGAIK3 TOOCXSVm '30033005 GTOGGGCIGA AOGGQQGGIT OG1GC OCA 9000 Q00CSCCTTC GM30GRK3GA CCTTOOCGA; OGAGM3 £ CPCSOCGIC A3CIMGRGA 03000131 9060 AAQ0GCCW3G CTT000GAM3 QGfiGAAftQX OCGGTAAGCG GCA3GGTCGG 9120 'OOGRGRG OGCfiOGfiGGG' 3CTTCCfl03 G33WO3X ΤΌΟΙΜΌΠΤ AIPOTOCIGT 9i8o oGooiTior cpocicigpc ttgaxctcg attttioiga toctooktg gggqooggag 9240 0CEKT33AA AfiCGCCfiGCA ÍCGCOGCCIT TTOCQGTTC CIOQOCmT GCT330CITT 93oo TOCicaoiG rrcmccio oomicca: tgritctoig gm3aocgea toocgccit

9360 TGfiGTCfiGCT GM3CC0CTC G0CQC3G00G «VXfGCGSG 03»ΧΕΚ?Γ CSGIGPiGOGA 9420 GGMGOGGAA A em que a sequência de sinal começa no nucleótido 648 e termina no nucleótido 731; VI começa no nucleótido 732 e termina no nucleótido 1286, a região de Charneira começa no nucleótido 1287 e termina no nucleótido 1331, CH2 começa no nucleótido 1332 e continua até o fim da sequência de codificação.Wherein the signal sequence begins at nucleotide 648 and terminates at nucleotide 731; and wherein the signal sequence begins at nucleotide 648 and terminates at nucleotide 731; VI begins at nucleotide 732 and terminates at nucleotide 1286, the Hinge region begins at nucleotide 1287 and terminates at nucleotide 1331, CH2 begins at nucleotide 1332 and continues to the end of the coding sequence.
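The region boundaries above can be checked for reading-frame consistency. The following is a minimal sketch, not part of the original disclosure: it uses only the coordinates stated in the text, treats them as 1-based and inclusive (an assumption), and takes the listing's pairing of nucleotide 1272 with residue 209 as its reference point. The function names are illustrative.

```python
# Region boundaries as stated in the text (assumed 1-based, inclusive).
REGIONS = {
    "signal sequence": (648, 731),
    "V1": (732, 1286),
    "hinge": (1287, 1331),
}
CH2_START = 1332  # CH2 runs to the end of the coding sequence; no end is given.

# Reference point taken from the sequence listing: nucleotide 1272 <-> residue 209.
REF_NUCLEOTIDE = 1272
REF_RESIDUE = 209


def codon_count(start: int, end: int) -> int:
    """Number of whole codons in an inclusive nucleotide range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"range {start}-{end} is not a whole number of codons")
    return length // 3


def residue_number(position: int) -> int:
    """Residue whose codon starts at `position`, relative to the reference point."""
    offset = position - REF_NUCLEOTIDE
    if offset % 3 != 0:
        raise ValueError(f"position {position} is not in frame with {REF_NUCLEOTIDE}")
    return REF_RESIDUE + offset // 3


if __name__ == "__main__":
    for name, (start, end) in REGIONS.items():
        print(f"{name}: {codon_count(start, end)} codons, "
              f"first residue {residue_number(start)}")
    print(f"CH2: first residue {residue_number(CH2_START)}")
```

Under these assumptions every stated region is a whole number of codons and in frame with the listing, and the first CH2 residue falls on the Ala that follows Pro-Cys-Pro in the row starting at residue 226.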

Example 2 - Expression of the CD4-IgG chimeras

A) V1V2-hCH2DHFR and V1V2-hCH2CH3DHFR in CHO cells:

V1V2-hCH2 was transfected into CHO cells by electroporation. 15 μg of V1V2-hCH2 DNA were resuspended in 15 μl of phosphate-buffered sucrose (PBSucrose: 272 mM sucrose, 7 mM sodium phosphate pH 7.4, 1 mM MgCl2, sterile filtered), mixed with 1.0 x 10^7 CHO cells (0.8 ml) and incubated on ice for 15 minutes in a Gene Pulser cuvette (Bio-Rad, Richmond, Calif.). The cuvette was then placed in the electrode chamber of the Gene Pulser (Bio-Rad) and a pulse of 700 volts/μFd (time constant = 0.7) was applied to the cells. The cells were placed on ice for 10 minutes and then diluted to 10 ml with growth medium (containing 1x ITS [insulin, transferrin, selenium; Collaborative Research, Bedford, MA] and 1x lipids [Gibco, Grand Island, NY]). The cells were then plated in 96-well plates at 3 x 10^3 cells/well, for a total of 5 plates. After 48 hours the medium was replaced with selection medium (i.e., nucleoside-free growth medium) for DHFR selection.
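The plating step above implies some simple cell-count arithmetic. The sketch below is illustrative only: it uses the figures stated in the text (1.0 x 10^7 cells diluted to 10 ml, 3 x 10^3 cells/well, five 96-well plates) and assumes, purely for the calculation, that no cells are lost during electroporation. The variable and function names are hypothetical.

```python
# Figures taken from the text.
CELLS_ELECTROPORATED = 10_000_000   # 1.0 x 10^7 CHO cells
DILUTED_VOLUME_ML = 10              # cells diluted to 10 ml after the pulse
CELLS_PER_WELL = 3_000              # 3 x 10^3 cells/well
WELLS_PER_PLATE = 96
PLATES = 5


def cells_needed() -> int:
    """Total cells required to seed every well of every plate."""
    return CELLS_PER_WELL * WELLS_PER_PLATE * PLATES


def fraction_plated() -> float:
    """Share of the electroporated cells that ends up plated.

    Assumes no cell loss during electroporation, which overstates the
    number of viable cells actually available.
    """
    return cells_needed() / CELLS_ELECTROPORATED


def suspension_volume_per_well_ul() -> float:
    """Volume (in microlitres) of the 10 ml suspension holding one well's cells."""
    cells_per_ul = CELLS_ELECTROPORATED / (DILUTED_VOLUME_ML * 1000)
    return CELLS_PER_WELL / cells_per_ul


if __name__ == "__main__":
    print(f"cells needed: {cells_needed()}")
    print(f"fraction of electroporated cells plated: {fraction_plated():.3f}")
    print(f"suspension per well: {suspension_volume_per_well_ul():.1f} ul")
```

On these assumptions only a minority of the electroporated cells is needed to seed the five plates, which leaves headroom for cell death during the pulse.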

The cells were maintained in selective medium for approximately 4 weeks. Western blot analyses were performed using SDS-PAGE on a 15% gel and anti-sT4 polyclonal antibodies. The V1V2-hCH2 gene product was identified as a single positive band at a molecular mass of approximately 40,000 daltons. Cells from positive wells were transferred to 24-well microplates and maintained in selective medium until ready for amplification. Two clones (3F8 and 2F6) were chosen for amplification. Clone 3F8 was amplified in selection medium containing 50 nM methotrexate. The cells were re-plated in 96-well microplates at 3 x 10^3 cells/well. Amplification began at 50 nM methotrexate, and the concentration was increased at each successive amplification step, according to standard protocols.

Similarly, the V1V2-hCH2CH3 vector was transfected into CHO cells by electroporation. Positive clones were selected by culturing the cells in nucleoside-free medium, and supernatants from healthy wells were screened by Western blot analysis using anti-sT4 antibodies. The resulting positive clones were then amplified using increasing amounts of methotrexate.

B) V1V2-hCH2COS and V1V2-hCH2CH3COS in COS cells:

COS cells were obtained from the ATCC (Rockville, MD) and cultured according to the protocols recommended by the ATCC. The cells were then transfected with 10 μg of plasmid DNA in 2.5% Nu serum (Collaborative Research, Cambridge, MA) in DMEM (Dulbecco's minimal essential medium; Gibco, Grand Island, NY) with 400 μg/ml DEAE-dextran (Pharmacia) and 100 μM chloroquine (Seed et al., Proc. Natl. Acad. Sci. 84:3365-3369 (1987)). After 4 hours of incubation at 37°C, the medium was removed and the cells were treated with 2 ml of 10% dimethylsulfoxide in PBS (phosphate-buffered saline) for 3 minutes at 25°C. The medium was removed and replaced with DMEM.

Both V1V2-hCH2 and V1V2-hCH2CH3 were expressed and exported into the culture medium. In a typical transfection of a 6 mm dish (about 10^6 cells), 20 μg of V1V2-hCH2 had accumulated in the medium 70 hours after transfection. Approximately 0.4 μg of the material was present in the nucleus-free cell lysate. For comparison, the corresponding values for sT4, a soluble CD4 protein that has been expressed at high levels in several mammalian cell types, were 20 μg in the supernatant and 1 μg in the cell lysate. The V1V2-hCH2 protein from the medium migrated as two closely spaced fragments. Under non-reducing conditions, V1V2-hCH2 migrated mainly as a monomer, although approximately 10% of the sample appeared to migrate as a dimer. The V1V2-hCH2 protein from the medium bound to gp120 coupled to a Sepharose resin.

V1V2-hCH2CH3 was analyzed in the same way. This protein, also well expressed as a secreted product, accumulated to 20-25 μg in the culture medium and about 1-2 μg in the cell lysate. The protein in the medium migrated as a single fragment of approximately 50 kD on SDS-PAGE under reducing conditions, whereas the cell lysate sample migrated as two equivalent bands of approximately 40 and 50 kD. Under non-reducing conditions, V1V2-hCH2CH3 migrated mainly as a single fragment with a molecular mass consistent with dimer formation. This protein bound to gp120 coupled to Sepharose.

C) OmpAV1V2-hCH2 and OmpAV1V2-hCH2CH3 in E. coli

The lambda-lysogenic E. coli strains AR58 (Debouck et al., EP-A-0 216 747, published April 1, 1987) and AR120 (Mott et al., Proc. Natl. Acad. Sci. USA 82:88-92 (1985)) were transformed with OmpAV1V2-hCH2 and OmpAV1V2-hCH2CH3, respectively, using standard procedures. Expression of V1V2-hCH2 and V1V2-hCH2CH3 in strain AR58 was achieved by raising the temperature of the culture medium from 32°C to 42°C (see, for example, Rosenberg et al., Meth. Enzymology 101:123 (1983)). Expression of V1V2-hCH2 and V1V2-hCH2CH3 in strain AR120 was achieved by adding nalidixic acid (Nal) to the culture medium (see, for example, Mott et al., Proc. Natl. Acad. Sci. USA 82:88-92 (1985)), as follows. A culture of AR120 was grown at 37°C to an optical density (at 650 nm) of 0.4 absorbance units, after which nalidixic acid was added to a final concentration of 50 μg/ml. The culture was maintained at 37°C in a shaking incubator for approximately 5 hours, after which the cells were centrifuged and subsequently chilled to stop the induction.

For both the heat and the Nal inductions, the cell pellets and the clarified (i.e., centrifuged) culture media were tested for expression of V1V2-hCH2 and V1V2-hCH2CH3. Both chimeric proteins were well expressed, although only a small percentage of V1V2-hCH2CH3 was detected in the culture medium (i.e., was exported).

D) V1V2-hCH2-TKK, V1V2-hCH2-KA, V1V2-hCH2strept and V1V2-hCH2CH3strept in Streptomyces

Plasmids V1V2-hCH2-TKK, V1V2-hCH2strept and V1V2-hCH2CH3strept were used to transform S. lividans strain 1326 (Bibb et al., Mol. Gen. Genet. 184:230 (1981)) using standard procedures (see Hopwood et al., Genetic Manipulation of Streptomyces - A Laboratory Manual, F. Crowe & Sons, Ltd., Norwich, England (1985)). Transformants were selected by overlaying the transformation plates with agar (0.4%) containing 100 μg/ml thiostrepton. Colonies expressing the protein(s) of interest were then grown in trypticase soy broth supplemented with 5 μg/ml thiostrepton. All of the chimeric proteins were secreted into the culture medium, although the V1V2-hCH2 constructs were expressed at a higher level than the V1V2-hCH2CH3 construct. V1V2-hCH2-KA was expressed at approximately 14 mg/l.
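The COS-cell yields reported in part B of this example can be put on a common footing by expressing each as the fraction of total recovered protein found in the medium. The following is a rough sketch under the assumption that medium and lysate together account for all of the protein; the function name and the way the reported ranges are bracketed are illustrative choices, not part of the original disclosure.

```python
def secreted_fraction(medium_ug: float, lysate_ug: float) -> float:
    """Fraction of recovered protein found in the culture medium.

    Assumes medium plus lysate accounts for all of the expressed protein.
    """
    total = medium_ug + lysate_ug
    if total <= 0:
        raise ValueError("total protein must be positive")
    return medium_ug / total


# Values reported in part B (micrograms per transfection).
v1v2_hch2 = secreted_fraction(20.0, 0.4)
st4 = secreted_fraction(20.0, 1.0)

# V1V2-hCH2CH3 is reported as ranges (20-25 ug medium, 1-2 ug lysate);
# bracket the fraction with the least and most favourable pairings.
v1v2_hch2ch3_low = secreted_fraction(20.0, 2.0)
v1v2_hch2ch3_high = secreted_fraction(25.0, 1.0)

if __name__ == "__main__":
    print(f"V1V2-hCH2:    {v1v2_hch2:.1%}")
    print(f"sT4:          {st4:.1%}")
    print(f"V1V2-hCH2CH3: {v1v2_hch2ch3_low:.1%} to {v1v2_hch2ch3_high:.1%}")
```

On this measure all three proteins are recovered predominantly from the medium, consistent with the statement that both chimeras were expressed and exported.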


Example 3 - Characterization of the CD4-IgG chimeras

A) Subunit structure: The native molecular forms of secreted V1V2-hCH2-KA and V1V2-hCH2CH3 were analyzed by 15% SDS-polyacrylamide gel electrophoresis under non-reducing conditions. Protein bands were identified by Western blot analysis, and the results are presented in Table 1.

B) Antibody recognition: All of the chimeric proteins obtained by the present invention are recognized by anti-CD4 antibodies (Deen et al., Nature 331:82-84 (1988)) and by anti-human IgG Fc antibodies (Cappel, Malvern, PA). See Table 1.

C) Binding to gp120: The chimeric proteins synthesized in S. lividans, COS cells and CHO cells all exhibited binding to gp120 (see Table 1), either by the immunoprecipitation assay described by Arthos et al. (Cell 57:469 (1989)) or by binding to gp120 immobilized on Sepharose, as follows. Samples were diluted to a total volume of 300 µl with precipitation (ppt) buffer (phosphate-buffered saline containing 0.5% powdered milk and 0.1% NP40). A 50% slurry of gp120-coupled Sepharose beads (100 µl, prepared as described below) was added to the diluted sample and incubated at 4°C for 60 min. The sample was then centrifuged (30 sec) in a microcentrifuge to pellet the chimeric protein/gp120-Sepharose complex. The complex was washed 5 times with 400 µl of ppt buffer and once with ice-cold phosphate-buffered saline. The complex was centrifuged again, resuspended in 60 µl of loading buffer (Laemmli, Nature 227:680 (1970)), boiled for 5 minutes and applied to a 15% polyacrylamide gel, followed by Western blot analysis. The chimeric proteins expressed in E. coli were not tested for binding to gp120. The gp120 was obtained in our laboratories. It is also commercially available, for example from American BioTechnology, Cambridge, MA.

The Sepharose beads were prepared as follows. One liter of Sepharose CL-6B (Pharmacia Fine Chemicals, Piscataway, NJ) was reacted with 52.5 ml of epibromohydrin in a basic solution of 0.5 N NaOH/30% tetrahydrofuran (THF) at 40°C for 4 hours. The activated Sepharose was collected by filtration on a sintered glass funnel and washed extensively with 30% tetrahydrofuran to remove unreacted epibromohydrin. The product was further washed with water until the wash was neutral. This gel was then resuspended with ethylenediamine (50 ml) overnight at room temperature. The gel was filtered, and unreacted ethylenediamine was removed by washing the gel with 0.1 M acetic acid followed by water. The gel was resuspended in one liter of water and reacted with succinic anhydride (25 g) at pH 6.0. The gel was further washed with one liter of sodium carbonate solution (0.2 M), followed by water until the wash was neutral. Finally, the gel was washed with isopropyl alcohol and stored at 4°C as a moist powder for later use.

D) Binding to protein A and protein G: The V1V2-hCH2 chimeric proteins expressed in S. lividans and COS cells do not bind protein A or protein G. In contrast, the V1V2-hCH2CH3 chimeric protein expressed in S. lividans and in COS cells binds both protein A and protein G, although with different affinities. V1V2-hCH2CH3 produced in Streptomyces has a lower affinity for proteins A and G than V1V2-hCH2CH3 produced in COS cells. The results are summarized in Table 1.

TABLE 1. Characterization of the CD4/IgG Chimeric Proteins

Protein (host)               Subunit structure    Anti-CD4      Anti-Fc       Binding    Binding to
                                                  recognition   recognition   to gp120   proteins A and G
V1V2-hCH2 (S. lividans)      monomer              +             +             +          -
V1V2-hCH2 (COS)              monomer and dimer    +             +             +          -
V1V2-hCH2CH3 (S. lividans)   monomer              +             +             +          + (weak)
V1V2-hCH2CH3 (COS)           dimer                +             +             +          +

Example 4 - Construction of the TKK-V1V2, TPAA-V1V2, TPAAA-V1V2 and KA-V1V2 fusion plasmids

The TKK-V1V2, TPAA-V1V2, TPAAA-V1V2 and KA-V1V2 fusion plasmids were constructed from plasmid 12B1 as follows.

Construction of plasmid 12B1: Plasmid 12B1 contains V1V2 operatively linked to the promoter and signal sequence of the Streptomyces longisporus trypsin inhibitor (LTI) (see EP-A-264 175, published April 20, 1988), together with Streptomyces replication functions as found in plasmid pIJ351 (Kieser et al., Mol. Gen. Genet. 185:223 (1982)). The parent plasmid 12B1 was constructed as follows. The BbvI cleavage site in the coding sequence for the CD4 signal peptide (between nucleotides 148 and 149 of the CD4 DNA sequence; Maddon et al., Cell 42:93-104 (1985)) was moved by site-directed mutagenesis (Kunkel, Proc. Natl. Acad. Sci. USA 82:488-492) so that the BbvI cleavage site lay between nucleotides 150 and 151. This mutation (designated 1478) was introduced into an sCD4 minigene containing the coding sequence for amino acid residues 1-129. An EcoRI-HindIII fragment containing the 1478 mutation was transferred from M13mp18 to pUC18 to give pUCV1pV2(1478). After digestion with BbvI, pUCV1pV2(1478) was treated with the Klenow fragment of DNA polymerase I to fill in the single-stranded 5' sequence and was then digested with HindIII. The blunt-ended HindIII fragment resulting from these manipulations was cloned into pLTI450 that had been digested with AccI, treated with the Klenow fragment of DNA polymerase I and digested with HindIII. The resulting plasmid, 12B1/1477, contains an sCD4 minigene (amino acid residues 1-129) fused to the coding sequence for the LTI signal sequence, such that the expressed V1V2 protein carries at its amino terminus the 6 amino acids of the LTI pro-peptide plus residues 1 and 2 of the mature LTI protein.

A Streptomyces replicon and a selectable marker were cloned into 12B1/1477 by inserting pIJ351 (Kieser et al., Mol. Gen. Genet. 185:223-238 (1982)) at the unique PstI site present in both plasmids. To create a complete V1V2 minigene (amino acid residues 1-183) in plasmid 12B1/1477, an AflII-XbaI fragment of DHFR V1V2 183#7 from Example 1 was inserted into 12B1/1477 that had been digested with AflII and XbaI. The resulting plasmid was 12B1.

Construction of plasmid pLTI450: A 0.9 kb SacI-KpnI fragment containing the LTI gene was inserted into pUC18 that had previously been digested with SacI and KpnI. This plasmid, pLTI520, was partially digested with EagI and completely digested with SalI, and was then ligated to a synthetic (double-stranded) linker with EagI and SalI ends:

5'-GGCCGCCGCCCCCGCG (SEQ ID NO:5)
CGGCGGCGGGGGCGCAGCT-5' (SEQ ID NO:6)

The plasmids obtained from this ligation were screened for insertion of the synthetic linker at the EagI site located approximately 0.5 kb from the SacI site. This EagI site is located at base pair 86 relative to the 5' end of the LTI gene. The resulting plasmid contains the LTI promoter and the sequence coding for the signal peptide and the signal peptide cleavage site.

To create the TKK-V1V2, TPAA-V1V2, TPAAA-V1V2 and KA-V1V2 fusion plasmids, oligonucleotide-mediated mutagenesis was used to delete the coding sequence for the LTI pro-peptide and to create the chosen amino acid coding sequence after the signal peptide cleavage site. The single-stranded M13 DNA used for this mutagenesis was mp18V1V2/12B1, which was created by inserting a 1.1 kb EcoRI-XbaI fragment of 12B1 into M13mp18 digested with EcoRI and XbaI. The oligonucleotides used for the site-directed mutagenesis are summarized in Table 3.

Table 3 - Oligonucleotides used to generate V1V2 derivatives

Derivative    Oligonucleotide   Oligonucleotide sequence
TKK-V1V2      2214              5'-GCCCAGCACCACTTTCTTGGTGGCGAGCGCGGCTCC-3'
TPAA-V1V2     2253              5'-GCCCAGCACCACTTTCTTAGCGGCCGGGGTGGC-3'
TPAAA-V1V2    2254              5'-GCCCAGCACCACTTTCTTAGCAGCGGCCGGGG-3'
KA-V1V2       2216              5'-GCCCAGCACCACGGCCTTGGCGAGCGCGGCTCC-3'

In Table 3, nucleotide sequence 2214 is SEQ ID NO:7; nucleotide sequence 2253 is SEQ ID NO:8; nucleotide sequence 2254 is SEQ ID NO:9; and nucleotide sequence 2216 is SEQ ID NO:10.
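One way to read Table 3 is to reverse-complement each oligonucleotide and translate it, which shows the amino acids that each oligonucleotide places at the junction between the LTI signal peptide and V1V2. The sketch below is illustrative only. It assumes that the oligonucleotides are written as the antisense strand and that the reading frame is fixed by their 5' end (the 3' end of the reverse complement); neither point is stated in the text. The names `OLIGOS`, `reverse_complement` and `decode` are hypothetical helpers.

```python
# Sketch: decode the Table 3 mutagenic oligonucleotides.
# Assumptions (not stated in the source): the oligonucleotides are antisense,
# and the coding frame is anchored at the 3' end of the reverse complement.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, with codons enumerated in TCAG order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Sequences copied from Table 3 (5' to 3').
OLIGOS = {
    "2214": "GCCCAGCACCACTTTCTTGGTGGCGAGCGCGGCTCC",
    "2253": "GCCCAGCACCACTTTCTTAGCGGCCGGGGTGGC",
    "2254": "GCCCAGCACCACTTTCTTAGCAGCGGCCGGGG",
    "2216": "GCCCAGCACCACGGCCTTGGCGAGCGCGGCTCC",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def decode(oligo: str) -> str:
    """Translate the sense strand, dropping any partial codon at its 5' end."""
    sense = reverse_complement(oligo)
    start = len(sense) % 3
    return "".join(CODON_TABLE[sense[i:i + 3]] for i in range(start, len(sense), 3))


if __name__ == "__main__":
    for number, oligo in OLIGOS.items():
        print(number, decode(oligo))
```

Under these assumptions every oligonucleotide decodes to a peptide ending in VVLG, the stretch shared by all four derivatives, preceded by the residues that give each derivative its name (T-K-K, T-P-A-A-K-K, P-A-A-A-K-K and K-A).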

A 0.7 kb EcoRI-AflII fragment isolated from the RF form of the mutagenized mp18V1V2 was exchanged with the 0.75 kb EcoRI-AflII fragment of 12B1, generating plasmids pV1V2-2214 (or pTKK-V1V2), pV1V2-2253 (or pTPAA-V1V2), pV1V2-2254 (or pTPAAA-V1V2) and pV1V2-2216 (or pKA-V1V2).

pV1V2-2214, also called pTKK-V1V2, was created using oligonucleotide 2214, as shown in Table 3, and contains the LTI signal sequence linked to the threonine of the modified pro-peptide, which in turn is linked to V1V2. pV1V2-2253, also called pTPAA-V1V2, was constructed using oligonucleotide 2253, as shown in Table 3, and contains the LTI signal peptide sequence linked to the thr-pro-ala-ala sequence of the modified pro-peptide. pV1V2-2254, also called pTPAAA-V1V2, was constructed using oligonucleotide 2254, as shown in Table 3, and contains the LTI signal peptide sequence linked to the thr-pro-ala-ala-ala sequence of the modified pro-peptide, which in turn is linked to V1V2. pV1V2-2216, also called pKA-V1V2, was constructed using oligonucleotide 2216, as shown in Table 3, and contains the LTI signal sequence, which in turn is linked to V1V2.

Example 5 - Construction of the βgal*KK-V1V2 fusion plasmid

βgal*KK-V1V2 was constructed by mutagenesis from plasmid pβgalsT4/7. pβgalsT4/7 was constructed as described below.

Construction of pβgalsT4/7: Plasmid pβgalsT4/7 was constructed from plasmids pUCsT4, pIJ702 and p3SSXMCP. p3SSXMCP is a derivative of pUC9 (Vieira and Messing, Gene 19:259 (1982)) that contains the β-galactosidase promoter and its signal sequence. Twenty-six base pairs downstream of the sequence coding for the signal peptide cleavage site there is an XmnI site (Eckhardt et al., J. Bacteriol. 169:4249 (1987)). Insertion at this site of a synthetic linker containing a BamHI recognition sequence (New England Biolabs, Beverly, Massachusetts) generated plasmid p3SSX10. This vector was treated with BamHI and reverse transcriptase, followed by digestion with XhoI. A fragment carrying one NcoI end, treated with reverse transcriptase to generate a blunt end, and one SalI end was ligated to this vector. (Ligation of the filled-in BamHI site to the filled-in NcoI site recreated both the BamHI and NcoI sites.) The resulting plasmid was p3SSXMCP. pUCsT4 was treated with XbaI and reverse transcriptase, followed by digestion with NcoI. This 1.1 kb NcoI-XbaI (RT) fragment was then inserted into p3SSXMCP that had previously been treated with SacI and T4 DNA polymerase I, followed by digestion with NcoI. The Streptomyces replication functions were supplied by pIJ702, which was inserted into p3SSXsT4 at the unique BglII site present in both plasmids. The resulting plasmid was pβgalsT4/7.

A 1.6 kb EcoRI-AflII fragment of pβgalsT4/7 was exchanged with the 0.75 kb EcoRI-AflII fragment of mp18V1V2-2214 to generate mp18βgalV1V2. Site-directed mutagenesis was used to delete the coding sequence for amino acid residues -9 to -1 of the CD4 signal peptide, to replace the arginine residues at positions -3 and -4 of the βgal signal peptide with aspartate and glutamate, respectively, and to delete the coding sequence for residues 1 to 8 of the mature β-galactosidase protein. These mutations were made using oligonucleotide 2256, whose sequence is: 5'-GCCCAGCACCACTTTCTTCGCCGCGTCCTCTACGGCGCCTG-3' (SEQ ID NO:11).

A 1.6 kb EcoRI-AflII fragment from the RF form of mp18βgalV1V2-2256 was exchanged with the 0.75 kb EcoRI-AflII fragment of plasmid 12B1. The resulting plasmid, pV1V2βgal-2256 (βgal*KK-V1V2), has the following features: (1) expression of V1V2 is directed by the βgal promoter; (2) the coding sequence for the βgal signal peptide has been altered so that amino acid residue -4 is a glutamate and residue -3 is an aspartate; and (3) V1V2 is fused to the βgal signal sequence so that the predicted N-terminal amino acid sequence of V1V2 is Lys-Lys.
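The residues that oligonucleotide 2256 places at positions -4 and -3 can be checked by reverse-complementing and translating it. This is a minimal sketch under the same unstated assumptions as any such reading: the oligonucleotide is taken to be antisense, the frame is anchored at the 3' end of its reverse complement, and the first Lys-Lys in the translation is taken to be the start of V1V2. The variable names are illustrative.

```python
# Sketch: locate residues -4 and -3 in the junction encoded by oligonucleotide 2256.
# Assumptions (not stated in the source): antisense oligonucleotide, frame anchored
# at the 3' end of the reverse complement, V1V2 starting at the first "KK".
from itertools import product

AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODONS = dict(zip(("".join(c) for c in product("TCAG", repeat=3)), AMINO))

OLIGO_2256 = "GCCCAGCACCACTTTCTTCGCCGCGTCCTCTACGGCGCCTG"

# Reverse complement gives the presumed sense strand.
sense = OLIGO_2256[::-1].translate(str.maketrans("ACGT", "TGCA"))
frame = len(sense) % 3  # skip the partial codon at the 5' end
peptide = "".join(CODONS[sense[i:i + 3]] for i in range(frame, len(sense), 3))

junction = peptide.index("KK")      # first residue of V1V2 (position +1)
signal_tail = peptide[:junction]    # end of the altered signal peptide
residue_minus4 = signal_tail[-4]
residue_minus3 = signal_tail[-3]

if __name__ == "__main__":
    print(peptide)
    print("-4:", residue_minus4, "-3:", residue_minus3)
```

Read this way, the translation places E (glutamate) at -4 and D (aspartate) at -3, followed by two alanines and then the Lys-Lys start of V1V2, which agrees with features (2) and (3) above.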

Example 6 - Expression of V1V2 in S. lividans

Plasmids pV1V2-2214, pV1V2-2253, pV1V2-2254, pV1V2-2216 and pV1V2βgal-2256 of Examples 4 and 5 were transformed into S. lividans 1326 (Bibb et al., Mol. Gen. Genet. 184:230-240 (1981)) using standard procedures (Hopwood et al., Genetic Manipulation of Streptomyces - A Laboratory Manual, F. Crowe & Sons Ltd., Norwich, England (1985)). Transformants were selected by overlaying the R2YE transformation plates with 3 ml of 0.4% agar + 100 µg/ml thiostrepton. Transformants expressing the protein of interest were grown in trypticase soy broth or in ME1 + 5% HycaseSF (40 g/liter glucose, 50 g/liter HycaseSF, 50 g/liter Hysoy, 1 g/liter yeast extract, 1 g/liter CaC04 and 0.001 g/liter CoCl2) supplemented with 5 µg/ml thiostrepton. The cell-free supernatants used for the gp120 binding assay and for gp120 affinity chromatography were harvested from cultures grown in ME1 + 5% Hycase.

The V1V2 derivatives expressed in S. lividans were initially characterized by immunoblotting.

Immunoblotting procedures - Cell-free supernatants were separated by electrophoresis on a 15% polyacrylamide (30:0.8 acrylamide:bis) - sodium dodecyl sulfate gel (Laemmli (1970) Nature 227:680-685) and then transferred to nitrocellulose (Towbin et al. (1979) Proc. Natl. Acad. Sci. USA 76:4350-4354). The nitrocellulose filter was processed to detect V1V2 (Brawner et al. (1985) Gene 40:191-201) using rabbit antiserum raised against denatured sCD4 protein. Bound antibody was detected with 125I-protein A. Immunoreactive proteins were visualized by autoradiography.

The TKK-V1V2, TPAA-V1V2, TPAAA-V1V2, KA-V1V2 and βgal*KK derivatives were each produced as a single immunoreactive band. A doublet was produced by the KK-V1V2 derivative. One of the KK-V1V2 proteins comigrated with V1V2 produced in CHO cells. The other band, which migrated with lower mobility than the reference V1V2 protein, may result from incomplete processing of the LTI signal peptide or from processing at an alternative cleavage site within the LTI signal peptide. Because of the heterogeneous nature of the product expressed by the KK-V1V2 derivative, no further studies were carried out on it.

The biological activity of these V1V2 derivatives was then determined by binding to gp120. As judged by quantitative immunoprecipitation of gp120 (Arthos et al., Cell 57:469-481 (1989)), Streptomyces produced fully active V1V2 proteins. The results of the immunoprecipitation assay are shown in Table 2. All of the V1V2 derivatives tested gave gp120 binding greater than 90%.

V1V2 production by the TKK-V1V2, TPAA-V1V2, TPAAA-V1V2 and βgal*KK derivatives was compared with that obtained with plasmid 12B1, which produces V1V2 at levels of 100 mg/l or greater. The V1V2 expression levels of the TPAA-V1V2 and TPAAA-V1V2 derivatives were comparable to those obtained with 12B1; expression of KA-V1V2 was 70% of that obtained with 12B1; expression of TKK-V1V2 was 30% of that obtained with 12B1; and expression of βgal*KK was approximately 1 mg/l.
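The relative expression figures can be converted into approximate titres. The sketch below is a simple arithmetic illustration: because 12B1 is reported only as "100 mg/l or greater", the values it prints are lower bounds, not measured titres. The names `REFERENCE_MG_PER_L`, `RELATIVE_PERCENT` and `lower_bound_mg_per_l` are hypothetical.

```python
# Sketch: lower-bound titres implied by the expression levels relative to 12B1.
# Assumption: 12B1 is taken at its stated minimum of 100 mg/l.
REFERENCE_MG_PER_L = 100.0

# Percent of the 12B1 level, as stated in the text.
RELATIVE_PERCENT = {
    "KA-V1V2": 70,
    "TKK-V1V2": 30,
}


def lower_bound_mg_per_l(percent: int, reference: float = REFERENCE_MG_PER_L) -> float:
    """Lower bound on the titre (mg/l) for a derivative at `percent` of the reference."""
    return percent * reference / 100


if __name__ == "__main__":
    for name, percent in RELATIVE_PERCENT.items():
        print(f"{name}: at least {lower_bound_mg_per_l(percent):.0f} mg/l")
```

On this reading KA-V1V2 and TKK-V1V2 were produced at no less than about 70 mg/l and 30 mg/l, respectively, whereas the βgal*KK level of approximately 1 mg/l corresponds to at most about 1% of the 12B1 level.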

Example 7 - Analysis of the N-terminal amino acids of the heterologous proteins produced by plasmids pV1V2-2214, pV1V2-2253, pV1V2-2254, pV1V2-2216 and pV1V2βgal-2256

A. Purification of V1V2 from Streptomyces media for N-terminal sequencing using a gp120-Sepharose affinity column

1. Preparation of gp120-Sepharose

Purified gp120 was immobilized on Sepharose CL-6B (Pharmacia) through its amino groups using active ester chemistry. Approximately 0.25 mg of gp120 was coupled per 1 ml of resin, and the binding capacity for sT4 was determined to be 20 μg/ml.

2. Preparation of the sample from the media

The fermentation media containing the V1V2 constructs were diluted 5-fold with column equilibration buffer (see below), centrifuged for 15 minutes at 20,000 x g and filtered through a 0.2 μm low-protein-binding Acrodisc filter (Gelman).

3. gp120 affinity chromatography

The gp120 Sepharose column (1.6 cm x 6 cm) was equilibrated with 50 mM Hepes, pH 7.5, 150 mM NaCl. The prepared media sample was applied to the column, which was washed with equilibration buffer until the absorbance returned to baseline and then washed again with 50 mM Hepes, pH 7.5, 500 mM NaCl to remove any non-specifically bound impurities. V1V2 was eluted with 0.1 M acetic acid.

4. Removal of peptide impurities from affinity-purified V1V2 using reverse-phase HPLC

The V1V2 eluted from the affinity column was brought to 0.05% TFA and applied to a C3 RP-HPLC column (DuPont Pro 10/300, 4.6 cm x 250 cm) equilibrated with 0.05% TFA. The column was eluted at 1 ml/min for 60 minutes with a linear gradient of 0-60% acetonitrile in 0.05% TFA. The V1V2 product eluted as a single peak at 50 minutes. The pooled fractions were concentrated in a Centricon 3 (Amicon) and used for N-terminal sequencing and amino acid analysis.

B. N-terminal sequence

The N-terminal amino acid sequence was identified to determine whether processing of the signal peptides had occurred at the predicted cleavage site. The V1V2 proteins were purified from the culture supernatant using gp120 Sepharose affinity chromatography and reverse-phase HPLC. The partially purified V1V2 preparations were further purified by SDS-polyacrylamide gel electrophoresis followed by electroelution onto a polyvinylidene difluoride (PVDF) membrane, in preparation for N-terminal amino acid sequencing (Matsudaira, J. Biol. Chem. 262:10035-10038 (1987)). The N-terminal amino acid sequence was determined using a gas-phase protein sequencer. This analysis showed that the V1V2 proteins of TKK-V1V2, TPAAKK-V1V2, TPAAAKK-V1V2 and KA-V1V2, fused to the carboxyl terminus of the LTI signal peptide, were correctly processed at the LTI signal peptide cleavage site. The N-terminal amino acid sequences are summarized in Table 2. The N-terminal end of the V1V2 peptide is Lys-Lys.

The βgal*KK-V1V2 derivative, however, was not processed at the natural cleavage site of the signal sequence. In this derivative, the amino acids preceding Lys-Lys are derived from the coding sequence at the 3' end of the β-gal* signal peptide. Cleavage of the signal sequence for the R(-4)E/R(-3)D signal peptide mutation occurred within the βgal signal sequence between positions -8 and -7.
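The affinity and reverse-phase steps of Example 7 involve two small calculations that can be sketched in code: the nominal sT4 capacity of the gp120 Sepharose column, and the nominal acetonitrile concentration at the reported elution time. This is only a sketch under stated assumptions: that "1.6 cm x 6 cm" means inner diameter x bed height, that the packed resin has the 20 μg/ml capacity measured in step 1, and that gradient delay (dwell volume) is ignored. The function names are hypothetical.

```python
import math


def bed_volume_ml(diameter_cm: float, height_cm: float) -> float:
    """Volume of a cylindrical column bed; 1 cm^3 equals 1 ml."""
    radius_cm = diameter_cm / 2.0
    return math.pi * radius_cm ** 2 * height_cm


def column_capacity_ug(diameter_cm: float, height_cm: float,
                       capacity_ug_per_ml: float) -> float:
    """Nominal binding capacity of the whole bed, in micrograms."""
    return bed_volume_ml(diameter_cm, height_cm) * capacity_ug_per_ml


def gradient_percent(t_min: float, start_pct: float = 0.0,
                     end_pct: float = 60.0,
                     duration_min: float = 60.0) -> float:
    """Nominal solvent percentage of a linear gradient at time t_min.

    ASSUMPTION: no dwell-volume delay; time is clamped to the gradient window.
    """
    t = min(max(t_min, 0.0), duration_min)
    return start_pct + (end_pct - start_pct) * t / duration_min


if __name__ == "__main__":
    volume = bed_volume_ml(1.6, 6.0)
    capacity = column_capacity_ug(1.6, 6.0, 20.0)
    print(f"Bed volume: {volume:.1f} ml")
    print(f"Nominal sT4 capacity: {capacity:.0f} ug")
    print(f"Acetonitrile at 50 min: {gradient_percent(50.0):.0f}%")
```

Under these assumptions the bed holds about 12 ml of resin, which corresponds to a nominal capacity of roughly 240 μg of sT4, and the single V1V2 peak at 50 minutes corresponds to a nominal 50% acetonitrile.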

Table 2 - Characteristics of the V1V2 derivatives

Derivative      N-terminal amino acid sequence            Binding to gp120
TKK-V1V2        Thr-LYS-LYS---                            >90%
TPAAKK-V1V2     Thr-Pro-Ala-Ala-LYS-LYS---                >90%
TPAAAKK-V1V2    Thr-Pro-Ala-Ala-Ala-LYS-LYS---            >90%
TPAAAKK-V1V2    LYS-LYS                                   ND (1)
KA-V1V2         LYS-Ala---                                >90%
βgal*KK-V1V2    Ala-Ala-Val-Glu-Asp-Ala-Ala-LYS-LYS       >90%

(1) Not determined

The above description and examples fully disclose the present invention, including its embodiments. However, it should be understood that the invention is not limited to the particular embodiments described above. Modifications of the processes described above that are obvious to those skilled in the art are intended to be within the scope of the appended claims.
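The N-terminal sequences reported in Table 2 can be examined programmatically to see which residues precede the Lys-Lys start of V1V2 in each derivative. The following is a minimal sketch: the rows are transcribed from the table (the row reported only as LYS-LYS with binding "ND" is omitted), and the helper name `residues_before_lys_lys` is hypothetical.

```python
# Rows transcribed from Table 2 as (derivative, N-terminal residues).
TABLE_2 = [
    ("TKK-V1V2", ["Thr", "Lys", "Lys"]),
    ("TPAAKK-V1V2", ["Thr", "Pro", "Ala", "Ala", "Lys", "Lys"]),
    ("TPAAAKK-V1V2", ["Thr", "Pro", "Ala", "Ala", "Ala", "Lys", "Lys"]),
    ("KA-V1V2", ["Lys", "Ala"]),
    ("βgal*KK-V1V2",
     ["Ala", "Ala", "Val", "Glu", "Asp", "Ala", "Ala", "Lys", "Lys"]),
]


def residues_before_lys_lys(residues):
    """Return the residues preceding the first Lys-Lys pair, or None if absent."""
    for i in range(len(residues) - 1):
        if residues[i] == "Lys" and residues[i + 1] == "Lys":
            return residues[:i]
    return None


if __name__ == "__main__":
    for name, residues in TABLE_2:
        prefix = residues_before_lys_lys(residues)
        if prefix is None:
            print(f"{name}: no Lys-Lys pair in the reported sequence")
        else:
            label = "-".join(prefix) if prefix else "(none)"
            print(f"{name}: {len(prefix)} residue(s) before Lys-Lys: {label}")
```

For the TKK, TPAAKK and TPAAAKK derivatives the residues found this way are the propeptide residues; for βgal*KK-V1V2 they are the seven residues that the text attributes to the 3' end of the β-gal* signal peptide.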

Claims

1. A method of producing a heterologous protein in Streptomyces having a homogeneous amino terminus after processing to remove the signal peptide from the protein product formed, characterized in that it comprises: (a) introducing into a Streptomyces host cell a DNA vector having a nucleic acid sequence encoding the signal sequence of the S. longisporus tyrosine inhibitor gene, linked to a propeptide sequence consisting essentially of an oligonucleotide encoding one to about six amino acids, which is operably linked to a coding sequence of the heterologous protein; and (b) culturing said host cell in a suitable culture medium.

2. A method according to claim 1, wherein said propeptide sequence encodes the amino acid threonine.

3. A method according to claim 1, wherein said propeptide sequence encodes the amino acid sequence Thr-Pro-Ala-Ala (SEQ ID NO: 1).

4. A method according to claim 1, wherein said propeptide sequence encodes the amino acid sequence Thr-Pro-Ala-Ala-Ala (SEQ ID NO: 2).

5. A method according to claim 1, characterized in that said nucleic acid sequence comprises: TGC GGA AGG ATG CAC ACA ATG CGG AAC ACC GCG CGC TGG GCA GCC ACC CTC GCC CTC ACG GCC ACC GCC GTC TGC GGA CCC CTC ACC GGA GCC GCG CTC GCC, or a derivative thereof capable of acting as a signal peptide in Streptomyces.

6. A method according to claim 5, wherein said propeptide sequence comprises the nucleic acid sequence ACC.

7. A method according to claim 5, wherein said propeptide sequence comprises the nucleic acid sequence ACC CCG GCC GCT (SEQ ID NO: 3).

8. A method according to claim 5, wherein said propeptide sequence comprises the nucleic acid sequence ACC CCG GCC GCT GCT (SEQ ID NO: 4).

9. A method of producing a heterologous protein in Streptomyces having a homogeneous amino terminus after processing to remove the signal peptide from the protein product formed, comprising: (a) introducing into a Streptomyces host cell a DNA vector having a nucleic acid sequence encoding the signal sequence of the S. longisporus tyrosine inhibitor gene, modified to encode Lys-Ala at the 3' end of said signal sequence, which is operably linked to a coding sequence of the heterologous protein; and (b) culturing said host cell in a suitable culture medium.

10. A method according to claim 9, wherein said heterologous protein comprises an HIV gp120 binding region.

11. A method according to claim 9, wherein said heterologous protein comprises an HIV gp120 binding region linked to a portion of a human immunoglobulin constant region encoding a polypeptide which lacks all or most of the CH3 domain.

By SMITHKLINE BEECHAM CORPORATION
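The nucleic acid sequences recited in claims 5 to 8 can be translated with the standard genetic code to check that the propeptide codons encode the amino acids named in claims 2 to 4. The following is a minimal sketch: it assumes translation begins at the first ATG of the claim 5 sequence (the claims do not say so; the three codons before it are simply skipped), and the names `translate` and `from_first_atg` are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Sequence as recited in claim 5.
CLAIM_5_SEQUENCE = (
    "TGC GGA AGG ATG CAC ACA ATG CGG AAC ACC GCG CGC TGG GCA GCC ACC CTC "
    "GCC CTC ACG GCC ACC GCC GTC TGC GGA CCC CTC ACC GGA GCC GCG CTC GCC"
)

# Propeptide coding sequences as recited in claims 6, 7 and 8.
PROPEPTIDES = {
    "claim 6": "ACC",
    "claim 7": "ACC CCG GCC GCT",
    "claim 8": "ACC CCG GCC GCT GCT",
}


def translate(dna: str) -> str:
    """Translate DNA (spaces ignored) codon by codon into one-letter codes."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def from_first_atg(dna: str) -> str:
    """Return the sequence starting at its first ATG (ASSUMED start codon)."""
    seq = "".join(dna.split()).upper()
    start = seq.find("ATG")
    if start < 0:
        raise ValueError("no ATG found")
    return seq[start:]


if __name__ == "__main__":
    print("Signal peptide:", translate(from_first_atg(CLAIM_5_SEQUENCE)))
    for claim, dna in PROPEPTIDES.items():
        print(f"Propeptide ({claim}):", translate(dna))
```

Read from the first ATG, the claim 5 sequence translates to a 31-residue peptide ending in Ala-Leu-Ala, and the propeptide sequences translate to Thr, Thr-Pro-Ala-Ala and Thr-Pro-Ala-Ala-Ala, matching claims 2 to 4.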