CHIMERIC ANTI-CEA ANTIBODY
This invention was made with government support under Grant No. CA 43904 awarded by the National Institutes of Health. The government has certain rights in the invention.
FIELD OF THE INVENTION
This invention relates to a chimeric mouse-human antibody to carcinoembryonic antigen (CEA) designated T84.12.
BACKGROUND OF THE INVENTION
CEA is a widespread tumor marker. Its expression can be detected in more than 95% of all human colon cancers. It is a member of the immunoglobulin superfamily and is closely related to NCA and BGP.
Of the various available CEA specific monoclonal antibodies, murine T84.66 antibody shows the highest specificity and affinity for CEA (Wagener, et al., J. Immunology 130:2308-2315 (1985)). It has been used successfully for in vivo tumor imaging in mice and humans. It is well suited for the immunodetection and immunotherapy of human colon cancers.
The in vivo human use of T84.66 is limited by its murine origin, which results in an immune response against the heterologous immunoglobulin. Chimeric T84.66 was created by use of recombinant gene technology to lessen the immunogenicity in man. See Neumaier, et al., Cancer Research 50:2128-2134 (1990) and United States Patent 5,081,235. The cloned antibody genes including the immunoglobulin promoter were transfected into SP2/0 myeloma cells by electroporation or CHO cells using lipofection. The expressed chimeric mabs were characterized in different enzyme immunoassays and a western blot.
The sequences of the V-regions of the heavy and light chain genes were determined using the well known Sanger chain termination method.
SUMMARY OF THE INVENTION
Murine T84.12 is another well characterized CEA specific monoclonal antibody of the murine IgG2a isotype. It recognizes the same epitope on CEA as T84.66 but with an affinity constant which is lower by a factor of approximately ten (10). For that reason, T84.12 was selected, pursuant to this invention, to generate mouse-human chimeric antibodies for therapeutic purposes in man. cDNA clones were humanized (chimerized) by shuffling the human IgG1 heavy or light chain constant domain exons, including the 5'-UT and leader peptide, to the variable regions of the heavy and light chain genes of murine T84.12.
The resulting hybridoma produces significant quantities of chimeric T84.12 anti-CEA antibodies useful for, among other things, human therapeutic purposes.
DETAILED DESCRIPTION OF THE INVENTION
Production of the chimeric anti-CEA antibodies of this invention entails a series of steps including, among others, identification of the amino terminal protein sequences of murine T84.12, determination of the cDNA sequence of mouse light chain and heavy chain clones of T84.12 and of the corresponding amino acid sequences and the chimerization of murine T84.12 cDNA clones. One aspect of the invention entails in vitro mutagenesis of a mouse T84.12 light chain clone.
Aminoterminal Sequences of Murine T84.12
Murine T84.12 specific light (L) chain clones L1-L4 and T84.12 heavy chain clones H1-H4 were prepared and sequenced in known manner. All four heavy chain clones showed 100% homology in their V-regions, and therefore clone H4 was selected for the sequencing of the IgG2a heavy chain constant regions. The variable domains of light chain clones L2, L3 and L4 were identical. Clone L1 was totally different, apparently representing the endogenous transcript. For the complete characterization of the constant kappa light chain domains and the 3'-untranslated region the light chain clones L1, L4 and the heavy chain clone H4 were selected.
Table I sets forth the amino terminal sequences of the T84.12 light and T84.12 heavy chains. The reported sequences were determined using reduced (DTT) and alkylated (iodoacetic acid) purified monoclonal antibody. The heavy and light chains were separated under reducing conditions on a Sephadex G100 column using 1 M acetic acid as a running buffer. The isolated chains were subjected to amino acid sequencing.
[Table I: amino terminal sequences of the T84.12 light and heavy chains, residues 1-15]
cDNA Sequence of Mouse Light Chain Clone T84.12 L4
The sequence of the full size cDNA T84.12 clone was determined (1020 bp) in known manner. This clone contained a very short 5'-UT region of 10 bp which was followed by the ATG start codon. The presence of the entire leader peptide, V-region and the Ckappa constant domain could be demonstrated. At the end of the Ckappa constant domain a TAG stop codon was present. The 3'-untranslated region (280 bp) contained a polyadenylation signal (AATAAA) and a poly(A) tail. The entire full size cDNA clone was flanked by the destroyed SmaI restriction cloning site (GGG-CCC). The translation of the obtained nucleotide sequence into the amino acid sequence yielded an open reading frame (bp 34-741 = 708 bp) resulting in 236 amino acids. In addition the Ckappa constant domain showed a 99.7% homology to other published Ckappa constant domain sequences (Kabat). There was only a C to T exchange (see Kabat, et al., "Sequences of Proteins of Immunological Interest", Fourth Ed. U.S. Dept. of Health and Human Services PHS NIH (1987)) in the T84.12 light chain sequence at bp 711 (CATTGT). This base pair difference resulted in a silent mutation. The leader peptide and V-region were different from the T84.66 clones.
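The open reading frame arithmetic for clone L4 can be checked with a short sketch. The helper names are hypothetical; the inputs are only the coordinates quoted in the text (ATG at bp 34, last base of the TAG stop codon at bp 741, exchange at bp 711). It is assumed, from the CATTGT context in SEQ ID NO. 1, that the affected codon reads ATT in the clone and ATC in the reference, both of which encode isoleucine in the standard genetic code. Note that a codon count obtained this way includes the stop codon.

```python
def orf_length(start, end):
    """Return (length in bp, number of codons) for a 1-based inclusive ORF."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length, length // 3


def codon_position(orf_start, bp):
    """Return (1-based codon number, 1-based position within that codon)."""
    offset = bp - orf_start
    return offset // 3 + 1, offset % 3 + 1


# Coordinates quoted in the text for clone T84.12 L4.
length, codons = orf_length(34, 741)
codon_number, base_in_codon = codon_position(34, 711)

# ATT, ATC and ATA all encode isoleucine in the standard genetic code,
# so a C to T exchange in the third base of ATC is silent.
ISOLEUCINE_CODONS = {"ATT", "ATC", "ATA"}
is_silent = {"ATC", "ATT"} <= ISOLEUCINE_CODONS

print(length, codons)               # ORF length in bp, codons (stop codon included)
print(codon_number, base_in_codon)  # codon containing bp 711, position in that codon
print(is_silent)
```

Under these assumptions bp 711 falls on the third base of its codon, which is the position where most synonymous exchanges occur.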
In SEQ ID NO. 1, the light chain cDNA sequence of murine T84.12, the following regions are underlined (from the top to the bottom): ATG start codon, start of variable region, start of C-kappa constant domain, TAG stop codon and polyadenylation signal.
Amino Acid Sequence of T84.12 L4 (Frame 1 = 34-741)
In SEQ ID NO. 2, the light chain amino acid sequence of T84.12, the following regions are underlined (from the top to the bottom): ATG start codon, start of variable region, start of C-kappa constant domain and TAG stop codon.
cDNA Sequence of Mouse Heavy Chain Clone T84.12 H4
The complete sequence of the full size cDNA clone T84.12 was determined in known manner (1645 bp). This clone contained a 10 bp longer 5'-UT region than the light chain clone L4 which was also followed by the ATG start codon. The presence of the entire leader peptide, V-region and all three constant domains of IgG2a could be demonstrated. At the end of the CH3 constant domain of IgG2a a TGA stop codon was present. The 3'-untranslated region (120 bp) contained the polyadenylation signal AATAAA. The entire full size cDNA clone was flanked by the destroyed SmaI restriction cloning site GGG-CCC. The translation of the obtained nucleotide sequence into the amino acid sequence yielded an open reading frame (52-1485 = 1434 bp) resulting in 478 amino acids. In addition, the IgG2a constant domain showed a 98.7% homology to other published IgG2a constant domain sequences (Kabat). The hinge region showed a 100% homology to the Kabat sequence too. Two different codons in the CH1 domain were identical to IgG3 and three different codons in the CH3 domain identical to MOPC21.
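The translation step can be illustrated with a minimal sketch that applies the standard genetic code to the first 100 bases of SEQ ID NO. 3, starting at the ATG at bp 52. The function and variable names are hypothetical and this is not the software used by the inventors; it only shows how the reading frame stated in the text maps onto the start of the heavy chain leader peptide.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, start):
    """Translate from a 1-based start position until a stop codon or the end."""
    protein = []
    for i in range(start - 1, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 100 bases of SEQ ID NO. 3 (heavy chain clone H4), spaces removed.
SEQ3_HEAD = (
    "TTACGAATTCGAGCTCGGTACCCCCTGGATTTGAGTTCCTCACATTCAGT"
    "CATGAGCACTGAACACAGACACCTCACCATGAACTTCGGGTTCAGCCTGA"
)

print(SEQ3_HEAD[51:54])          # codon at bp 52-54
print(translate(SEQ3_HEAD, 52))  # first 16 residues of the open reading frame
```

Applying the same function to the complete sequence from bp 52 stops at the TGA codon that ends at bp 1485.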
In SEQ ID NO. 3, the heavy chain cDNA sequence of T84.12, the following regions are underlined (from the top to the bottom): ATG start codon, start of variable region, start of CH1 constant domain, start of hinge region, start of CH2 constant domain, start of CH3 constant domain, TGA stop codon and polyadenylation signal.
Amino Acid Sequence of T84.12 H4 (Frame 2 = 52-1485)
In SEQ ID NO. 4, the heavy chain amino acid sequence of T84.12 H4, the following regions are underlined (from the top to the bottom): ATG start codon, start of variable region, start of CH1 constant domain, start of hinge region, start of CH2 constant domain, start of CH3 constant domain and TGA stop codon.
Chimeric T84.12
The obtained and characterized full size cDNA murine T84.12 L4 and H4 clones were chimerized using the constant domains of human kappa light chain cDNAs and the constant domains of human IgG1 heavy chain cDNAs, respectively. The human heavy and kappa chain constant region sequences were derived from plasmids obtained from Dr. Jeffrey Schlom, National Institutes of Health. The plasmids contained chimeric B72.3 cDNA clones, cloned from cells expressing the chimeric B72.3 antibody (see Hutzell, et al., Cancer Research 51:181-189 (1991)). Dr. Schlom's group obtained the human gamma and kappa chain genomic expression vectors from Dr. Sherie Morrison, UCLA (Oi, V.T., et al., Biotechniques 4:214 (1986)), in order to make those constructs. Using specific primers, the variable domains of T84.12 (mouse cDNA) were, in known manner, fused in frame to the human constant domain(s) of chimeric B72.3 using splice overlap extension PCR. See Ho, et al., Gene 77:51-59 (1988) and Horton, et al., Gene 77:61-68 (1989). These full size cDNAs were named chiT84.12 L3, L6, L8, H2 and H3.
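The principle of splice overlap extension can be sketched in silico: two PCR products that carry an identical stretch at their junction are merged into one fused sequence. The sketch below uses bases 391-450 of SEQ ID NO. 7 and places the junction at bp 417/418, where SEQ ID NO. 1 and SEQ ID NO. 7 diverge. The tail lengths (12 and 13 bases) and all names are hypothetical, since the primers actually used are not given in the text.

```python
def soe_join(upstream, downstream, min_overlap=15):
    """Join two fragments that share an identical overlap at their junction."""
    max_len = min(len(upstream), len(downstream))
    for size in range(max_len, min_overlap - 1, -1):
        if upstream[-size:] == downstream[:size]:
            return upstream + downstream[size:]
    raise ValueError("no overlap of the required length")


# Bases 391-417 (end of the mouse V-region) and 418-450 (start of the
# human Ckappa domain) of SEQ ID NO. 7.
MOUSE_V_END = "GGCTCGGGGACAACGTTGGAAATAAAA"
HUMAN_CK_START = "ACTGTGGCTGCACCATCTGTCTTCATCTTCCCG"

# Hypothetical first-round products: each carries a primer-encoded tail
# copied from the other fragment.
v_product = MOUSE_V_END + HUMAN_CK_START[:12]
c_product = MOUSE_V_END[-13:] + HUMAN_CK_START

fused = soe_join(v_product, c_product)
print(fused)
print(fused == MOUSE_V_END + HUMAN_CK_START)
```

In the laboratory procedure the overlapping ends anneal and are extended by the polymerase; the string comparison here only stands in for that annealing step.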
The chimeric clones were used for the production of Fab, F(ab')2-fragments, Fv-fragments and of single chain antibodies linked by a synthetic peptide.
cDNA Sequence of T84.12 L6
The entire sequence of the full size cDNA clone chiT84.12 L6 was determined in known manner (956 bp). The clone chiT84.12 L6 showed the correct sequence for a mouse-human chimeric T84.12 light chain. The clone chiT84.12 L6 was used for further subcloning into the pHβ-Apr-neo vector (see Gunning, et al., Proc. Natl. Acad. Sci. 84:4831-4835 (1987)) to transfect SP2/0 myeloma cells.
The clone chiT84.12 L6 contained a short 5'-UT region of 9 bp which was followed by the ATG start codon. The presence of the entire leader peptide, V-region and the human Ckappa constant domain could be confirmed. At the end of the human Ckappa constant domain a TAG stop codon was present. The 3'-untranslated region (218 bp) contained a polyadenylation signal (AATAAA). The translation of the obtained nucleotide sequence into the amino acid sequence yielded an open reading frame (bp 34-738 = 705 bp) resulting in 235 amino acids. In addition the human Ckappa constant domain showed a 100% homology to other published Ckappa constant domain sequences (Kabat).
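Locating the polyadenylation signal in the 3'-untranslated region amounts to a motif search. The sketch below scans bases 901-957 of SEQ ID NO. 7 for AATAAA, allowing overlapping matches; the function name is hypothetical and the approach is only one simple way to do this.

```python
import re


def find_motif(seq, motif, offset=1):
    """Return 1-based start positions of all (possibly overlapping) matches."""
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [match.start() + offset for match in pattern.finditer(seq)]


# Bases 901-957 of SEQ ID NO. 7, spaces removed.
TAIL_START = 901
TAIL = "GAGGAGAATGAATAAATAAAGTGAATCTTTGCAAAAAGCTTGGCACTGGCCGTCGTT"

positions = find_motif(TAIL, "AATAAA", offset=TAIL_START)
print(positions)
```

The lookahead pattern is used because the hexamer occurs twice in overlapping register in this stretch, which a plain non-overlapping search would report only once.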
In SEQ ID NO. 7, the light chain cDNA sequence of chiT84.12 L6, the following regions are underlined (from the top to the bottom): ATG start codon, start of mouse variable region, start of human C-kappa constant domain, TAG stop codon and polyadenylation signal.
Coding Sequence of chiT84.12 L6 (bp = 34-738)
In SEQ ID NO. 8, the light chain amino acid sequence of chiT84.12 L6, the following regions are underlined (from the top to the bottom): ATG start codon, start of mouse variable region, start of human C-kappa constant domain and TAG stop codon.
cDNA Sequence of Chimeric T84.12 H3
The complete sequence of the full size cDNA clone chiT84.12 H3 was determined in known manner (1641 bp). The clone chiT84.12 H3 showed the correct sequence for a mouse-human chimeric T84.12 heavy chain and had one mutation at the beginning of the CH2 domain (GTG to GCG at position 484 = valine against alanine) and one at the end of the 3'-UT (AAATAAA to GAATAAA). However, this did not affect the polyadenylation signal. The clone chiT84.12 H3 was used for further subcloning into the pHβ-Apr-gpt vector to transfect SP2/0 myeloma cells which are expressing chiT84.12 kappa light chains.
This clone contained a 41 bp long 5'-UT region which was followed by the ATG start codon. The presence of the entire leader peptide, mouse V-region and all three human constant domains of IgG1 could be demonstrated. At the end of the CH3 constant domain of IgG1 a TGA stop codon was present. The 3'-untranslated region (153 bp) contained the polyadenylation signal AATAAA. The translation of the obtained nucleotide sequence into the amino acid sequence yielded an open reading frame (bp 52-1485 = 1434 bp) resulting in 478 amino acids. In addition, the human IgG1 constant domain showed a 100% homology to other published IgG1 constant domain sequences (Kabat). The hinge region showed a 100% homology to the Kabat sequence too.
In SEQ ID NO. 9, the heavy chain cDNA sequence of chiT84.12 H3, the following regions are underlined (from the top to the bottom): ATG start codon, start of mouse variable region, start of human CH1 constant domain, start of hinge region, start of CH2 constant domain, start of CH3 constant domain, TGA stop codon and polyadenylation signal.
In SEQ ID NO. 10, the heavy chain amino acid sequence of chiT84.12 H3, the following regions are underlined (from the top to the bottom): ATG start codon, start of mouse variable region, start of human CH1 constant domain, start of hinge region, start of CH2 constant domain, start of CH3 constant domain and TGA stop codon.
In Vitro Mutagenesis of Mouse T84.12 L4 cDNA
With some exceptions, two cysteine residues are typically present in an immunoglobulin domain. The CDR3 (L3) of T84.12 light chain clone L4 contained an additional third cysteine residue in the mouse variable kappa light chain domain. The presence of the third cysteine is apparently related to the loss of binding activity by murine T84.12 after dissociation of both chains and chemical crosslinking using homobifunctional crosslinking agents. Therefore the cysteine (TGT) in position 364-366, see SEQ ID NO. 1, (amino acid residue 91) was changed to a serine (TCT) by site directed mutagenesis.
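The codon exchange can be expressed as a short in silico sketch on bases 351-400 of SEQ ID NO. 1. The function name is hypothetical. The leader length of 20 residues is an assumption inferred from the numbers in the text: the codon at bp 364-366 is the 111th codon of the reading frame that starts at bp 34, and the text calls it residue 91.

```python
def substitute_codon(seq, start, expected, replacement):
    """Replace the codon at a 1-based start position after checking its identity."""
    i = start - 1
    if seq[i:i + 3] != expected:
        raise ValueError(f"expected {expected} at {start}, found {seq[i:i + 3]}")
    return seq[:i] + replacement + seq[i + 3:]


# Bases 351-400 of SEQ ID NO. 1, spaces removed.
SEGMENT_START = 351
SEGMENT = "TTTCTGTCAGCAATGTAACAGCTATCCTCTATTCACGTTCGGCTCGGGGA"

# Position of bp 364 within the segment (1-based).
local_start = 364 - SEGMENT_START + 1
mutant = substitute_codon(SEGMENT, local_start, "TGT", "TCT")

# Residue numbering relative to the ATG at bp 34.
ORF_START = 34
LEADER_LENGTH = 20  # assumption inferred from the text, see lead-in
codon_number = (364 - ORF_START) // 3 + 1
mature_residue = codon_number - LEADER_LENGTH

print(SEGMENT[13:16], "->", mutant[13:16])
print(codon_number, mature_residue)
```

Checking the expected codon before substituting guards against an off-by-one error in the coordinates, which would otherwise silently mutate the wrong triplet.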
Overview of MUTA-GENE Phagemid In Vitro Mutagenesis
The mutagenesis was carried out using the MUTA-GENE phagemid in vitro mutagenesis kit from BioRad. The original procedure was simplified and reduced to the following eleven steps:
1. Subcloning of the coding cDNA strand in pTZ18U or pTZ19U phagemids (depending on the orientation of cloned cDNA in pUC18) .
2. Electrotransformation of E. coli CJ236 with pTZ18U or 19U containing the cDNA to be mutagenized (plate on LB-amp + 30 μg/ml chloramphenicol).
3. Miniprep DNA isolation from single recombinant CJ236 colonies. This E. coli strain incorporates uracil residues into the phagemid DNA.
4. Growth of uracil containing phagemids in 2xYT media containing ampicillin (50 μg/ml) and chloramphenicol (30 μg/ml). Start out with a single colony from the plate that has been marked and analyzed by miniprep DNA. Add the helper phage M13K07 in order to obtain single stranded phagemid DNA.
5. PEG extraction and purification (PCI) of single stranded phagemid DNA.
6. Phosphorylation of the mutagenesis primer (represents the minus strand and binds to the single stranded plus strand phagemid DNA) .
7. Synthesis of the mutagenic strand by annealing of the phosphorylated mutagenesis primer to the purified single stranded phagemid DNA. The complementary minus strand is created by the T4 DNA polymerase and gaps are sealed with the T4 DNA ligase.
8. Electrotransformation of E. coli MV1190 with double-stranded mutagenized cDNA. This strain removes uracil residues.
9. Isolation of miniprep DNA from single growing recombinant MV1190 colonies. The insert size can be determined by restriction enzyme digest and compared to the wild type.
10. Sequence miniprep DNA from several mutants and compare each sequence with the wild type sequence.
11. Select clones with the correct mutations and grow a larger culture (100 ml) . Purify the mutagenized cDNA using Qiagen columns and confirm the entire sequence of the mutated cDNA clone.
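The comparison of mutant and wild type sequences in steps 10 and 11 can be sketched as a position-by-position check. The function name is hypothetical. The wild type string is bases 358-372 of SEQ ID NO. 1; the mutant string is constructed here from the described TGT to TCT exchange and is not read from a sequencing run.

```python
def find_differences(wild_type, mutant, offset=1):
    """List (position, wild type base, mutant base) for every mismatch."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must have equal length")
    return [
        (offset + i, a, b)
        for i, (a, b) in enumerate(zip(wild_type, mutant))
        if a != b
    ]


# Bases 358-372 of SEQ ID NO. 1 and the same stretch carrying TCT at 364-366.
WILD_TYPE = "CAGCAATGTAACAGC"
MUTANT = "CAGCAATCTAACAGC"

differences = find_differences(WILD_TYPE, MUTANT, offset=358)
print(differences)
```

A clone would be accepted only if the list contains the intended exchange and nothing else, which is the purpose of confirming the entire sequence in step 11.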
One such clone, named T84.12 L4-12-1, was selected for exemplification of the invention.
cDNA Sequence of T84.12 L4-12-1
The entire sequence of the full size cDNA clone T84.12 L4-12-1 was determined in known manner (1999 bp). The clone showed the correct sequence for a mouse T84.12 light chain and the introduced cysteine to serine mutation. It was used for further subcloning into the pHβ-Apr-neo vector (see Gunning, et al., Proc. Natl. Acad. Sci. 84:4831-4835 (1987)) to transfect SP2/0 myeloma cells.
This T84.12 L4-12-1 clone contained a very short 5'-UT region of 10 bp which was followed by the ATG start codon. The presence of the entire leader peptide, V-region and the Ckappa constant domain could be demonstrated. At the end of the Ckappa constant domain a TAG stop codon was present. The 3'-untranslated region (280 bp) contained a polyadenylation signal (AATAAA). The entire full size cDNA clone was flanked by the destroyed SmaI restriction cloning site (GGG-CCC). The translation of the obtained nucleotide sequence into the amino acid sequence yielded an open reading frame (bp 34-741 = 708 bp) resulting in 236 amino acids. In addition, the Ckappa constant domain showed a 99.7% homology to other published Ckappa constant domain sequences (Kabat). There was only a C (Kabat) to T exchange in the T84.12 light chain sequence at bp 711 (CATTGT). This base pair difference resulted in a silent mutation.
In SEQ ID NO. 5, the light chain cDNA sequence of T84.12 L4-12-1, the following regions are underlined (from the top to the bottom): ATG start codon, start of mouse variable region, start of human C-kappa constant domain, TAG stop codon and polyadenylation signal. The mutagenized TGT (cys) to TCT (ser) is underlined and in italics.
Amino Acid Sequence of T84.12 L4-12-1 (Frame 1 = 34-741)
In SEQ ID NO. 6, the light chain amino acid sequence of T84.12, the following regions are underlined (from the top to the bottom): ATG start codon, start of variable region, start of C-kappa constant domain and TAG stop codon. The mutagenized TGT (cys) to TCT (ser) is underlined and in italics. All other cysteine residues are underlined.
Expression of Mutagenized Mouse T84.12 cDNAs
The mutated light chain (T84.12 L4-12-1) cDNA and the normal heavy chain (T84.12 H4) cDNA were transferred into a β-actin cDNA expression vector (Gunning, et al., supra) and cotransformed into Sp2/0 myeloma cells by electroporation. The vectors include the human β-actin promoter, intervening sequence, cloning site, and a polyadenylation signal. Since the vectors contain the neomycin-resistance gene, transfectants were selected in the presence of the drug G418. Clones were expanded and evaluated for antibody production (kappa or gamma chain) and CEA-binding activity by ELISAs. Although levels of expression were low, there was a correlation between antibody and anti-CEA activity in culture supernatants.
Binding Activity of T84.12 Cys → Ser Mutant
(1) GENERAL INFORMATION:
(i) APPLICANT: John E. Shively
Rainer Fischer
Anna Wu
Ray Paxton
Y.H. Joy Yang
(ii) TITLE OF INVENTION: Chimeric Anti-CEA Antibody
(iii) NUMBER OF SEQUENCES: 10
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: City of Hope
(B) STREET: 1500 East Duarte Road
(C) CITY: Duarte
(D) STATE: California
(E) COUNTRY: United States of America
(F) ZIP: 91010-0269
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: 3M Double Density 5 1/4" diskette
(B) COMPUTER: Wang PC
(C) OPERATING SYSTEM: MS-DOS (R) Version 3.30
(D) SOFTWARE: Microsoft (R)
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: 07/904,074
(B) FILING DATE: 15 June 1992
(C) CLASSIFICATION: Unknown
(vii) PRIOR APPLICATION DATA: None
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Irons, Edward S.
(B) REGISTRATION NUMBER: 16,541
(C) REFERENCE/DOCKET NUMBER: None
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (202) 785-6938
(B) TELEFAX: (202) 785-5351
(C) TELEX: 440087 LM WSH
(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1041
(B) TYPE: Nucleic Acid
(C) STRANDEDNESS: Single Stranded
(D) TOPOLOGY: Unknown
(ii) MOLECULE TYPE: Nucleic Acid
(iii) HYPOTHETICAL: Not Applicable
(iv) ANTI-SENSE: Not Applicable
(v) FRAGMENT TYPE: Not Applicable
(vi) ORIGINAL SOURCE: Synthetically Prepared
(vii) IMMEDIATE SOURCE: Synthetically Prepared
(viii) POSITION IN GENOME: None
(ix) FEATURE: None
(x) PUBLICATION INFORMATION: None
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
TTACGAATTC GAGCTCGGTA CCCGGGCATC AAGATGGAGT CACAGACTCA 50
GGTCTTTGTA TACATGTTGC TGTGGTTGTC TGGTGTTGAT GGAGACATTG 100
TGCTGACCCA GTCTCAAAAA TTCATGTCCA CATCAGTTGG AGGCACGGTC 150
AGCGTCACCT GCAAGGCCAG TCAAAATGTG CATACTAATG TTGCCTGGTA 200
TCAACAGAAA CCAGGACAAT CTCCTAAAGC ACTGATTTAC TCGGCATCCT 250
ACCGTTACAG TGGAGTCCCT GATCGCTTCA CAGGCAGTGG ATCTGGGACA 300
GATTTCACTC TCACCATCAG CAATGTGCAG TCTGAAGACT TGGCAGAATA 350
TTTCTGTCAG CAATGTAACA GCTATCCTCT ATTCACGTTC GGCTCGGGGA 400
CAACGTTGGA AATAAAACGG GCTGATGCTG CACCAACTGT ATCCATCTTC 450
CCACCATCCA GTGAGCAGTT AACATCTGGA GGTGCCTCAG TCGTGTGCTT 500
CTTGAACAAC TTCTACCCCA AAGACATCAA TGTCAAGTGG AAGATTGATG 550
GCAGTGAACG ACAAAATGGC GTCCTGAACA GTTGGACTGA TCAGGACAGC 600
AAAGACAGCA CCTACAGCAT GAGCAGCACC CTCACGTTGA CCAAGGACGA 650
GTATGAACGA CATAACAGCT ATACCTGTGA GGCCACTCAC AAGACATCAA 700
CTTCACCCAT TGTCAAGAGC TTCAACAGGA ATGAGTGTTA GAGACAAAGG 750
TCCTGAGACG CCACCACCAG CTCCCCAGCT CCATCCTATC TTCCCTTCTA 800
AGGTCTTGGA GGCTTCCCCA CAAGCGACCT ACCACTGTTG CGGTGCTCCA 850
AACCTCCTCC CCACCTCCTT CTCCTCCTCC TCCCTTTCCT TGGCTTTTAT 900
CATGCTAATA TTTGCAGAAA ATATTCAATA AAGTGAGTCT TTGCACTTGA 950
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1000
AAAAAAAAAA AAGGGGATCC TCTAGAGTCG ACCTGCAGGC A 1041
(2) INFORMATION FOR SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 235
(B) TYPE: Amino Acid
(C) STRANDEDNESS: Single Stranded
(D) TOPOLOGY: Unknown
(ii) MOLECULE TYPE: Amino Acid
(iii) HYPOTHETICAL: Not Applicable
(iv) ANTI-SENSE: Not Applicable
(v) FRAGMENT TYPE: Not Applicable
(vi) ORIGINAL SOURCE: Synthetically Prepared
(vii) IMMEDIATE SOURCE: Synthetically Prepared
(viii) POSITION IN GENOME: None
(ix) FEATURE: None
(x) PUBLICATION INFORMATION: None
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
(2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1645
(B) TYPE: Nucleic Acid
(C) STRANDEDNESS: Single Stranded
(D) TOPOLOGY: Unknown
(ii) MOLECULE TYPE: Nucleic Acid
(iii) HYPOTHETICAL: Not Applicable
(iv) ANTI-SENSE: Not Applicable
(v) FRAGMENT TYPE: Not Applicable
(vi) ORIGINAL SOURCE: Synthetically Prepared
(vii) IMMEDIATE SOURCE: Synthetically Prepared
(viii) POSITION IN GENOME: None
(ix) FEATURE: None
(x) PUBLICATION INFORMATION: None
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
TTACGAATTC GAGCTCGGTA CCCCCTGGAT TTGAGTTCCT CACATTCAGT 50
CATGAGCACT GAACACAGAC ACCTCACCAT GAACTTCGGG TTCAGCCTGA 100
TTTTCCTTGT CCTTGTTTTA AAAGGTGTCC AGTGTGAAGT GAAGCTGGTG 150
GAGTCTGGGG GAGGCTTTGT GAAGCCTGGA GGGTCCCTGA AACTCTCCTG 200
TGCAGCCTCC GGATTCACTT TCAGTAGTTA TGCCATGTCT TGGGTTCGCC 250
AGACTCCAGA GAAGAGGCTG GAGTGGGTCG CATCCATTAG TAGTGATGGT 300
ATCACCTTCT ATGTAGACAG TGTGAAGGGC CGATTCACCG TCTCCAGAGA 350
CAATGCCAGG AACATCCTGT ACCTGCAAAT GAGCAGTCTG AGGTCTGAGG 400
ACACGGCCAT GTATTACTGT GCAAGAATCG ACTACTACGG AGGAGGGGGA 450
TTTGGTTACT GGGGCCAAGG GACTCTGGCC ACTGTCTCTG CAGCCAAAAC 500
AACAGCCCCA TCGGTCTATC CACTGGCCCC TGTGTGTGGA GATACAACTG 550
GCTCCTCGGT GACTCTAGGA TGCCTGGTCA AGGGTTATTT CCCTGAGCCA 600
GTGACCTTGA CCTGGAACTC TGGATCCCTG TCCAGTGGTG TGCACACCTT 650
CCCAGCTGTC CTGCAGTCTG ACCTCTACAC CCTCAGCAGC TCAGTGACTG 700
TAACCTCGAG CACCTGGCCC AGCCAGTCCA TCACCTGCAA TGTGGCCCAC 750
CCGGCAAGCA GCACCAAGGT GGACAAGAAA ATTGAGCCCA GAGGGCCCAC 800
AATCAAGCCC TGTCCTCCAT GCAAATGCCC AGCACCTAAC CTCTTGGGTG 850
GACCATCCGT CTTCATCTTC CCTCCAAAGA TCAAGGATGT ACTCATGATC 900
TCCCTGAGCC CCATAGTCAC ATGTGTGGTG GTGGATGTGA GCGAGGATGA 950
CCCAGATGTC CAGATCAGCT GGTTTGTGAA CAACGTGGAA GTACACACAG 1000
CTCAGACACA AACCCATAGA GAGGATTACA ACAGTACTCT CCGGGTGGTC 1050
AGTGCCCTCC CCATCCAGCA CCAGGACTGG ATGAGTGGCA AGGAGTTCAA 1100
ATGCAAGGTC AACAACAAAG ACCTCCCAGC GCCCATCGAG AGAACCATCT 1150
CAAAACCCAA AGGGTCAGTA AGAGCTCCAC AGGTATATGT CTTGCCTCCA 1200
CCAGAAGAAG AGATGACTAA GAAACAGGTC ACTCTGACCT GCATGGTCAC 1250
AGACTTCATG CCTGAAGACA TTTACGTGGA GTGGACCAAC AACGGGAAAA 1300
CAGAGCTAAA CTACAAGAAC ACTGAACCAG TCCTGGACTC TGATGGTTCT 1350
TACTTCATGT ACAGCAAGCT GAGAGTGGAA AAGAAGAACT GGGTGGAAAG 1400
AAATAGCTAC TCCTGTTCAG TGGTCCACGA GGGTCTGCAC AATTACCACA 1450
CGACTAAGAG CTTCTCCCGG ACTCCGGGTA AATGAGCTCA GCACCCACAA 1500
AACTCTCAGG TCCAAAGAGA CACCCACACT CATCTCCATG CTTCCCTTGT 1550
ATAAATAAAG CACCCAGCAA TGCCTGGGAC CATGTAAAAA AAAAAAAAAA 1600
AAAAAAAAAA AAAAAAGGGG ATCCTCTAGA GTCGACCTGC AGGCA 1645
(2) INFORMATION FOR SEQ ID NO: 4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 477
(B) TYPE: Amino Acid
(C) STRANDEDNESS: Single Stranded
(D) TOPOLOGY: Unknown
(ii) MOLECULE TYPE: Amino Acid
(iii) HYPOTHETICAL: Not Applicable
(iv) ANTI-SENSE: Not Applicable
(v) FRAGMENT TYPE: Not Applicable
(vi) ORIGINAL SOURCE: Synthetically Prepared
(vii) IMMEDIATE SOURCE: Synthetically Prepared
(viii) POSITION IN GENOME: None
(ix) FEATURE: None
(x) PUBLICATION INFORMATION: None
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
MSTEHRHLTM NFGFSLIFLV LVLKGVQCEV KLVESGGGFV KPGGSLKLSC 50
AASGFTFSSY AMSWVRQTPE KRLEWVASIS SDGITFYVDS VKGRFTVSRD 100
NARNILYLQM SSLRSEDTAM YYCARIDYYG GGGFGYWGQG TLATVSAAKT 150
TAPSVYPLAP VCGDTTGSSV TLGCLVKGYF PEPVTLTWNS GSLSSGVHTF 200
PAVLQSDLYT LSSSVTVTSS TWPSQSITCN VAHPASSTKV DKKIEPRGPT 250
IKPCPPCKCP APNLLGGPSV FIFPPKIKDV LMISLSPIVT CVVVDVSEDD 300
PDVQISWFVN NVEVHTAQTQ THREDYNSTL RVVSALPIQH QDWMSGKEFK 350
CKVNNKDLPA PIERTISKPK GSVRAPQVYV LPPPEEEMTK KQVTLTCMVT 400
DFMPEDIYVE WTNNGKTELN YKNTEPVLDS DGSYFMYSKL RVEKKNWVER 450
NSYSCSVVHE GLHNYHTTKS FSRTPGK 477
(2) INFORMATION FOR SEQ ID NO: 5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1041
(B) TYPE: Nucleic Acid
(C) STRANDEDNESS: Single Stranded
(D) TOPOLOGY: Unknown
(ii) MOLECULE TYPE: Nucleic Acid
(iii) HYPOTHETICAL: Not Applicable
(iv) ANTI-SENSE: Not Applicable
(v) FRAGMENT TYPE: Not Applicable
(vi) ORIGINAL SOURCE: Synthetically Prepared
(vii) IMMEDIATE SOURCE: Synthetically Prepared
(viii) POSITION IN GENOME: None
(ix) FEATURE: None
(x) PUBLICATION INFORMATION: None
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
TTACGAATTC GAGCTCGGTA CCCGGGCATC AAGATGGAGT CACAGACTCA 50
GGTCTTTGTA TACATGTTGC TGTGGTTGTC TGGTGTTGAT GGAGACATTG 100
TGCTGACCCA GTCTCAAAAA TTCATGTCCA CATCAGTTGG AGGCACGGTC 150
AGCGTCACCT GCAAGGCCAG TCAAAATGTG CATACTAATG TTGCCTGGTA 200
TCAACAGAAA CCAGGACAAT CTCCTAAAGC ACTGATTTAC TCGGCATCCT 250
ACCGTTACAG TGGAGTCCCT GATCGCTTCA CAGGCAGTGG ATCTGGGACA 300
GATTTCACTC TCACCATCAG CAATGTGCAG TCTGAAGACT TGGCAGAATA 350
TTTCTGTCAG CAATGTAACA GCTATCCTCT ATTCACGTTC GGCTCGGGGA 400
CAACGTTGGA AATAAAACGG GCTGATGCTG CACCAACTGT ATCCATCTTC 450
CCACCATCCA GTGAGCAGTT AACATCTGGA GGTGCCTCAG TCGTGTGCTT 500
CTTGAACAAC TTCTACCCCA AAGACATCAA TGTCAAGTGG AAGATTGATG 550
GCAGTGAACG ACAAAATGGC GTCCTGAACA GTTGGACTGA TCAGGACAGC 600
AAAGACAGCA CCTACAGCAT GAGCAGCACC CTCACGTTGA CCAAGGACGA 650
GTATGAACGA CATAACAGCT ATACCTGTGA GGCCACTCAC AAGACATCAA 700
CTTCACCCAT TGTCAAGAGC TTCAACAGGA ATGAGTGTTA GAGACAAAGG 750
TCCTGAGACG CCACCACCAG CTCCCCAGCT CCATCCTATC TTCCCTTCTA 800
AGGTCTTGGA GGCTTCCCCA CAAGCGACCT ACCACTGTTG CGGTGCTCCA 850
AACCTCCTCC CCACCTCCTT CTCCTCCTCC TCCCTTTCCT TGGCTTTTAT 900
CATGCTAATA TTTGCAGAAA ATATTCAATA AAGTGAGTCT TTGCACTTGA 950
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1000
AAAAAAAAAA AAGGGGATCC TCTAGAGTCG ACCTGCAGGC A 1041
(2) INFORMATION FOR SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 235
(B) TYPE: Amino Acid
(C) STRANDEDNESS: Single Stranded
(D) TOPOLOGY: Unknown
(ii) MOLECULE TYPE: Amino Acid
(iii) HYPOTHETICAL: Not Applicable
(iv) ANTI-SENSE: Not Applicable
(v) FRAGMENT TYPE: Not Applicable
(vi) ORIGINAL SOURCE: Synthetically Prepared
(vii) IMMEDIATE SOURCE: Synthetically Prepared
(viii) POSITION IN GENOME: None
(ix) FEATURE: None
(x) PUBLICATION INFORMATION: None
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
MESQTQVFVY MLLWLSGVDG DIVLTQSQKF MSTSVGGTVS VTCKASQNVH 50
TNVAWYQQKP GQSPKALIYS ASYRYSGVPD RFTGSGSGTD FTLTISNVQS 100
EDLAEYFCQQ SNSYPLFTFG SGTTLEIKRA DAAPTVSIFP PSSEQLTSGG 150
ASVVCFLNNF YPKDINVKWK IDGSERQNGV LNSWTDQDSK DSTYSMSSTL 200
TLTKDEYERH NSYTCEATHK TSTSPIVKSF NRNEC 235
(2) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 957
(B) TYPE: Nucleic Acid
(C) STRANDEDNESS: Single Stranded
(D) TOPOLOGY: Unknown
(ii) MOLECULE TYPE: Nucleic Acid
(iii) HYPOTHETICAL: Not Applicable
(iv) ANTI-SENSE: Not Applicable
(v) FRAGMENT TYPE: Not Applicable
(vi) ORIGINAL SOURCE: Synthetically Prepared
(vii) IMMEDIATE SOURCE: Synthetically Prepared
(viii) POSITION IN GENOME: None
(ix) FEATURE: None
(x) PUBLICATION INFORMATION: None
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
TTACGAATTC GAGCTCGGTA CCCGGGCATC AAGATGGAGT CACAGACTCA 50
GGTCTTTGTA TACATGTTGC TGTGGTTGTC TGGTGTTGAT GGAGACATTG 100
TGCTGACCCA GTCTCAAAAA TTCATGTCCA CATCAGTTGG AGGCACGGTC 150
AGCGTCACCT GCAAGGCCAG TCAAAATGTG CATACTAATG TTGCCTGGTA 200
TCAACAGAAA CCAGGACAAT CTCCTAAAGC ACTGATTTAC TCGGCATCCT 250
ACCGTTACAG TGGAGTCCCT GATCGCTTCA CAGGCAGTGG ATCTGGGACA 300
GATTTCACTC TCACCATCAG CAATGTGCAG TCTGAAGACT TGGCAGAATA 350
TTTCTGTCAG CAATGTAACA GCTATCCTCT ATTCACGTTC GGCTCGGGGA 400
CAACGTTGGA AATAAAAACT GTGGCTGCAC CATCTGTCTT CATCTTCCCG 450
CCATCTGATG AGCAGTTGAA ATCTGGAACT GCCTCTGTTG TGTGCCTGCT 500
GAATAACTTC TATCCCAGAG AGGCCAAAGT ACAGTGGAAG GTGGATAACG 550
CCCTCCAATC GGGTAACTCC CAGGAGAGTG TCACAGAGCA GGACAGCAAG 600
GACAGCACCT ACAGCCTCAG CAGCACCCTG ACGCTGAGCA AAGCAGACTA 650
CGAGAAACAC AAAGTCTACG CCTGCGAAGT CACCCATCAG GGCCTGAGCT 700
CGCCCGTCAC AAAGAGCTTC AACAGGGGAG AGTGTTAGAG GGAGAAGTGC 750
CCCCACCTGC TCCTCAGTTC CAGCCTGACC CCCTCCCATC CTTTGGCCTC 800
TGACCCTTTT TCCACAGGGG ACCTACCCCT ATTGCGGTCC TCCAGCTCAT 850
CTTTCACCTC ACCCCCCTCC TCCTCCTTGG CTTTAATTAT GCTAATGTTG 900
GAGGAGAATG AATAAATAAA GTGAATCTTT GCAAAAAGCT TGGCACTGGC 950
CGTCGTT 957
(2) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 234
(B) TYPE: Amino Acid
(C) STRANDEDNESS: Single Stranded
(D) TOPOLOGY: Unknown
(ii) MOLECULE TYPE: Amino Acid
(iii) HYPOTHETICAL: Not Applicable
(iv) ANTI-SENSE: Not Applicable
(v) FRAGMENT TYPE: Not Applicable
(vi) ORIGINAL SOURCE: Synthetically Prepared
(vii) IMMEDIATE SOURCE: Synthetically Prepared
(viii) POSITION IN GENOME: None
(ix) FEATURE: None
(x) PUBLICATION INFORMATION: None
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
MESQTQVFVY MLLWLSGVDG DIVLTQSQKF MSTSVGGTVS VTCKASQNVH 50
TNVAWYQQKP GQSPKALIYS ASYRYSGVPD RFTGSGSGTD FTLTISNVQS 100
EDLAEYFCQQ CNSYPLFTFG SGTTLEIKTV AAPSVFIFPP SDEQLKSGTA 150
SVVCLLNNFY PREAKVQWKV DNALQSGNSQ ESVTEQDSKD STYSLSSTLT 200
LSKADYEKHK VYACEVTHQG LSSPVTKSFN RGEC 234
(2) INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1641
(B) TYPE: Nucleic Acid
(C) STRANDEDNESS: Single Stranded
(D) TOPOLOGY: Unknown
(ii) MOLECULE TYPE: Nucleic Acid
(iii) HYPOTHETICAL: Not Applicable
(iv) ANTI-SENSE: Not Applicable
(v) FRAGMENT TYPE: Not Applicable
(vi) ORIGINAL SOURCE: Synthetically Prepared
(vii) IMMEDIATE SOURCE: Synthetically Prepared
(viii) POSITION IN GENOME: None
(ix) FEATURE: None
(x) PUBLICATION INFORMATION: None
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
TTACGAATTC GAGCTCGGTA CCCCCTGGAT TTGAGTTCCT CACATTCAGT 50
GATGAGCACT GAACACAGAC ACCTCACCAT GAACTTCGGG TTCAGCCTGA 100
TTTTCCTTGT CCTTGTTTTA AAAGGTGTCC AGTGTGAAGT GAAGCTGGTC 150
GAGTCTGGGG GAGGCTTTGT GAAGCCTGGA GGGTCCCTGA AACTCTCCTG 200
TGCAGCCTCC GGATTCACTT TCAGTAGTTA TGCCATGTCT TGGGTTCGCC 250
AGACTCCAGA GAAGAGGCTG GAGTGGGTCG CATCCATTAG TAGTGATGGT 300
ATCACCTTCT ATGTAGACAG TGTGAAGGGC CGATTCACCG TCTCCAGAGA 350
CAATGCCAGG AACATCCTGT ACCTGCAAAT GAGCAGTCTG AGGTCTGAGG 400
ACACGGCCAT GTATTACTGT GCAAGAATCG ACTACTACGG AGGAGGGGGA 450
TTTGGTTACT GGGGCCAAGG GACTCTGGCC ACTGTCTCTG CAGCCTCCAC 500
CAAGGGCCCA TCGGTCTTCC CCCTGGCACC CTCCTCCAAG AGCACCTCTG 550
GGGGCACAGC GGCCCTGGGC TGCCTGGTCA AGGACTACTT CCCCGAACCG 600
GTGACGGTGT CGTGGAACTC AGGCGCCCTG ACCAGCGGCG TGCACACCTT 650
CCCGGCTGTC CTACAGTCCT CAGGACTCTA CTCCCTCAGC AGCGTGGTGA 700
CCGTGCCCTC CAGCAGCTTG GGCACCCAGA CCTACATCTG CAACGTGAAT 750
CACAAGCCCA GCAACACCAA GGTGGACAAG AAAGTTGAGC CCAAATCTTG 800
TGACAAAACT CACACATGCC CACCGTGCCC AGCACCTGAA CTCCTGGGGG 850
GACCGTCAGT CTTCCTCTTC CCCCCAAAAC CCAAGGACAC CCTCATGATC 900
TCCCGGACCC CTGAGGTCAC ATGCGTGGTG GTGGACGCGA GCCACGAAGA 950
CCCTGAGGTC AAGTTCAACT GGTACGTGGA CGGCGTGGAG GTGCATAATG 1000
CCAAGACAAA GCCGCGGGAG GAGCAGTACA ACAGCACGTA CCGTGTGGTC 1050
AGCGTCCTCA CCGTCCTGCA CCAGGACTGG CTGAATGGCA AGGAGTACAA 1100
GTGCAAGGTC TCCAACAAAG CCCTCCCAGC CCCCATCGAG AAAACCATCT 1150
CCAAAGCCAA AGGGCAGCCC CGAGAACCAC AGGTGTACAC CCTGCCCCCA 1200
TCCCGGGATG AGCTGACCAA GAACCAGGTC AGCCTGACCT GCCTGGTCAA 1250
AGGCTTCTAT CCCAGCGACA TCGCCGTGGA GTGGGAGAGC AATGGGCAGC 1300
CGGAGAACAA CTACAAGACC ACGCCTCCCG TGCTGGACTC CGACGGCTCC 1350
TTCTTCCTCT ACAGCAAGCT CACCGTGGAC AAGAGCAGGT GGCAGCAGGG 1400
GAACGTCTTC TCATGCTCCG TGATGCATGA GGCTCTGCAC AACCACTACA 1450
CGCAGAAGAG CCTCTCCCTG TCTCCGGGTA AATGAGTGCG ACGGCCGGCA 1500
AGCCCCCGCT CCCCGGGCTC TCGCGGTCGC ACGAGGATGC TTGGCACGTA 1550
CCCCCTGTAC ATACTTCCCG GGCGCCCAGC ATGGGAATAA AGCACCCAGC 1600
GCTGCCCTGG GCCCCTGCAA GGATCCAAGC TTGGCACTGG C 1641
(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 477
(B) TYPE: Amino Acid
(C) STRANDEDNESS: Single Stranded
(D) TOPOLOGY: Unknown
(ii) MOLECULE TYPE: Amino Acid
(iii) HYPOTHETICAL: Not Applicable
(iv) ANTI-SENSE: Not Applicable
(v) FRAGMENT TYPE: Not Applicable
(vi) ORIGINAL SOURCE: Synthetically Prepared
(vii) IMMEDIATE SOURCE: Synthetically Prepared
(viii) POSITION IN GENOME: None
(ix) FEATURE: None
(x) PUBLICATION INFORMATION: None
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: