CA2338185A1 - Thermostable in vitro complex with polymerase activity - Google Patents

Thermostable in vitro complex with polymerase activity

Info

Publication number
CA2338185A1
CA2338185A1 CA002338185A CA2338185A CA2338185A1 CA 2338185 A1 CA2338185 A1 CA 2338185A1 CA 002338185 A CA002338185 A CA 002338185A CA 2338185 A CA2338185 A CA 2338185A CA 2338185 A1 CA2338185 A1 CA 2338185A1
Authority
CA
Canada
Prior art keywords
glu
leu
lys
ile
val
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002338185A
Other languages
French (fr)
Inventor
Christian Kilger
Ingo Kober
Hartmut Voss
Gerd Moeckel
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sygnis Pharma AG
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from DE19840771A external-priority patent/DE19840771A1/en
Application filed by Individual filed Critical Individual
Publication of CA2338185A1 publication Critical patent/CA2338185A1/en
Abandoned legal-status Critical Current

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P19/00 Preparation of compounds containing saccharide radicals
    • C12P19/26 Preparation of nitrogen-containing carbohydrates
    • C12P19/28 N-glycosides
    • C12P19/30 Nucleotides
    • C12P19/34 Polynucleotides, e.g. nucleic acids, oligoribonucleotides
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/12 Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
    • C12N9/1241 Nucleotidyltransferases (2.7.7)
    • C12N9/1252 DNA-directed DNA polymerase (2.7.7.7), i.e. DNA replicase
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/12 Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
    • C12N9/1241 Nucleotidyltransferases (2.7.7)
    • C12N9/1276 RNA-directed DNA polymerase (2.7.7.49), i.e. reverse transcriptase or telomerase
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844 Nucleic acid amplification reactions
    • C12Q1/686 Polymerase chain reaction [PCR]
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Engineering & Computer Science (AREA)
  • Wood Science & Technology (AREA)
  • Molecular Biology (AREA)
  • Genetics & Genomics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • Microbiology (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Medicinal Chemistry (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Gastroenterology & Hepatology (AREA)
  • General Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Immunology (AREA)
  • Enzymes And Modification Thereof (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)

Abstract

The inventive thermostable in vitro complex for template-dependent elongation of nucleic acids comprises a thermostable staple protein and a thermostable elongation protein.

Description

The invention concerns a thermostable in vitro complex for the template-dependent elongation of nucleic acids, a thermostable in vitro complex and DNA sequences and vectors coding therefor. The invention additionally concerns the use of the inventive complexes in methods for the template-dependent elongation of nucleic acids such as PCR reactions, reverse transcription, DNA labelling or DNA sequencing in which an in vitro template-dependent DNA strand synthesis occurs. Finally the invention also concerns kits or reagent kits for carrying out the inventive methods.
DNA polymerases belong to a group of enzymes which use single-stranded DNA as a template for the synthesis of a complementary DNA strand. These enzymes play a major role in nucleic acid metabolism including the processes of DNA replication, repair and recombination. DNA polymerases have been identified in all cellular organisms from bacterial to human cells, in many viruses as well as in bacteriophages (Kornberg, A. & Baker, T.A. (1991) DNA Replication, WH Freeman, New York, NY). The archaebacteria and eubacteria are usually combined to form the prokaryote group, i.e. organisms without a true cell nucleus, in contrast to the eukaryotes, which are organisms with a true cell nucleus. Polymerases from diverse organisms often share similarities in amino acid sequence and in structure (Wang, J., Sattar, A.K.M.A., Wang, C.C., Karam, J.D., Konigsberg, W.H. & Steitz, T.A. (1997) Crystal structure of a pol α family replication DNA polymerase from bacteriophage RB69. Cell 89, 1087-1099).
Organisms such as humans have numerous DNA-dependent polymerases, not all of which are responsible for DNA replication; some also carry out DNA repair. Replicative DNA polymerases are usually composed in vivo of protein complexes with several units which replicate the chromosomes of the cellular organisms and viruses. A general property of these replicating polymerases is a high processivity, which means their ability to polymerise thousands of nucleotides without dissociating from the DNA template (Kornberg, A. & Baker, T.A. (1991) DNA Replication, WH Freeman, New York, NY).
Highly processive replication mechanisms are known in the prior art which are on the one hand cellular mechanisms and, on the other hand, the replication mechanisms occurring in the bacteriophages T4 and T7.
The replication apparatus comprises numerous components. These include among others, a) proteins having polymerase activity, b) proteins which are involved in the formation of a clamp structure, one of the functions of the clamp structure being to bind a polymerase activity to its template, to stabilize the binding and thus to correspondingly change the dissociation constant, c) proteins which load the clamp onto the template, d) proteins which stabilize the template and optionally e) proteins which guide the polymerase onto the template.
Proteins mentioned under b) form structures which are either open or closed, for example circular or semi-circular structures. Such structures can be formed by one or several species of proteins. One of the said protein species may have a polymerase activity.
The proteins responsible for the formation of these structures are referred to in the following as "sliding clamp proteins" or "clamp proteins" provided they have no polymerase activity.
Reference is made to the replication apparatus of the bacteriophages T4 or T7 as an example of a processive replication apparatus which does not have a closed circular shape.
Reference is made to the replication apparatus of the bacterium E. coli as an example of a processive replication apparatus which has a closed circular shape (Stukenberg, P.T., Studwell-Vaughan, P.S. & O'Donnell, M. (1991) Mechanisms of the β clamp of DNA polymerase III holoenzyme. J. Biol. Chem. 266, 11328-11334; Kuriyan, J. & O'Donnell, M. (1993) Sliding clamps of DNA polymerases. J. Mol. Biol. 234, 915-925).
It is known that the replication apparatus in archaebacteria is similar to the eukaryotic replication apparatus although the genome organisation in eukaryotes and archaebacteria is completely different and the cellular structure of the eubacteria is similar to that of the archaea (Edgell, D.R. and Doolittle, W.F. (1997) Archaea and the origin(s) of DNA replication proteins. Cell 89, 995-998).
The sliding clamp is frequently bound to an elongation protein via one or several other proteins, in other words it is coupled to the elongation protein. Such a coupling protein is referred to herein in the following as a coupling protein or coupling subunit in which the coupling may take place via a plurality of coupling proteins.
An elongation protein should be understood herein as a protein or complex having polymerase activity which has one or several of the following properties: use of RNA as a template to synthesize DNA and/or RNA, use of DNA as a template to synthesize DNA and/or RNA, synthesis of RNA, synthesis of DNA, synthesis of nucleic acids from DNA and RNA, exonuclease activity in the 5'-3' direction and exonuclease activity in the 3'-5' direction, strand displacement activity, thermostability and processivity or non-processivity.
The three-dimensional structure of various sliding clamp proteins has already been determined:
- that of the eukaryotic proliferating cell nuclear antigen (PCNA) (Krishna, T.S.R., Kong, X.-P., Gary, S., Burgers, P.M. & Kuriyan, J. (1994) Crystal structure of the eukaryotic DNA polymerase processivity factor PCNA. Cell 79, 1233-1243; Gulbis, J.M., Kelman, Z., Hurwitz, J., O'Donnell, M. & Kuriyan, J. (1996) Structure of the C-terminal region of p21 WAF1/CIP1 complexed with human PCNA. Cell 87, 297-306),
- that of the β subunit of the polymerase III of the eubacterium Escherichia coli (Kong, X.-P., Onrust, R., O'Donnell, M. & Kuriyan, J. (1992) Three dimensional structure of the β subunit of Escherichia coli DNA polymerase III holoenzyme; a sliding DNA clamp. Cell 69, 425-437)
- and that of the gene 45 protein of the bacteriophage T4 (Kelman, Z., Hurwitz, J. & O'Donnell, M. (1998) Structure 6, 121-125).
The overall structure of these sliding clamps is very similar; the pictures of the circular total protein structure of PCNA, of the β subunit and gp45 rings are superimposable when laid on top of one another (Kelman, Z. & O'Donnell, M. (1995) Structural and functional similarities of prokaryotic and eukaryotic sliding clamps. Nucleic Acids Res. 23, 3613-3620). Each ring has comparable dimensions and a central opening which is large enough to encircle duplex DNA, i.e. a DNA double strand composed of the two complementary DNA strands.
The sliding clamp cannot position itself around the DNA in vivo but must be clamped around the DNA by a further protein complex. In prokaryotes and eukaryotes this protein complex is composed of numerous subunits and is referred to as the γ complex in the eubacterium Escherichia coli and as the replication factor C (RF-C) in humans (Kelman, Z. & O'Donnell, M. (1994) DNA replication - enzymology and mechanisms. Curr. Opin. Genet. Dev. 4, 185-195). The protein complex recognises the 3'-end of the primer in the primer-template duplex and positions the sliding clamp around the DNA in the presence of ATP.
In the case of the bacteriophage T7 the same object, i.e. a processive DNA synthesis, is achieved by means of a protein complex with a different structure. The phage expresses its own catalytic polymerase, T7 polymerase, the gene product of gene 5, which binds to a protein from the host Escherichia coli, i.e. thioredoxin, and thus enables a highly processive DNA replication as a replicase (Proc. Natl. Acad. Sci. USA 1992 Oct. 15; 89(20):9774-9778 Genetic analysis of the interaction between bacteriophage T7 DNA polymerase and Escherichia coli thioredoxin. Himawan JS, Richardson CC). In this case there is also clamp formation but this clamp does not have the same structure as for example in the case of the eukaryotic PCNA.
It is often necessary, such as in the case of the human polymerase δ, that coupling proteins have to create a connection between the catalytically active part of the polymerase and the processivity factor (sliding clamp). In the case of humans this is the small subunit of the δ polymerase (Zhang, S.-J., Zeng, X.-R., Zhang, P., Toomey, N.L., Chuang, R.Y., Chang, L.-S., and Lee, M.Y.W.T. (1994). A conserved region in the amino terminus of DNA polymerase δ is involved in proliferating cell nuclear antigen binding. J. Biol. Chem. 270, 7988-7992).
However, in the case of T7 polymerase the processivity factor binds directly to the catalytic unit of the polymerase.
DNA polymerases are characterized, among other things, by two properties: their elongation rate, i.e. the number of nucleotides which they can incorporate per second into a growing DNA strand, and their dissociation constant. If the polymerase dissociates from the strand after each step of incorporating one nucleotide into the growing chain (i.e. one elongation step occurs per binding event), then the processivity has the value 1 and the polymerase is not processive. This synthesis is referred to as distributive. If the polymerase remains bound to the strand for repeated nucleotide incorporations, then the replication mode is referred to as processive and the processivity can reach a value of several thousand (see also: Methods in Enzymology Volume 262, DNA replication, edited by J.L. Campbell, Academic Press 1995, pp. 270-280).
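The definition of processivity given above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the polymerase dissociates with a constant probability `p_dissociate` after each incorporated nucleotide; under that assumption the number of nucleotides added per binding event follows a geometric distribution with mean `1 / p_dissociate`. The function names and the constant-probability model are hypothetical and are not taken from the description.

```python
import random


def expected_processivity(p_dissociate: float) -> float:
    """Mean number of nucleotides incorporated per binding event.

    Assumes a constant dissociation probability after every
    incorporation step (geometric model, an illustrative assumption).
    """
    if not 0.0 < p_dissociate <= 1.0:
        raise ValueError("p_dissociate must be in (0, 1]")
    return 1.0 / p_dissociate


def simulate_binding_event(p_dissociate: float, rng: random.Random) -> int:
    """Count incorporations in one binding event until dissociation."""
    incorporated = 1  # one elongation step always occurs per binding event
    while rng.random() >= p_dissociate:
        incorporated += 1
    return incorporated
```

In this model a dissociation probability of 1 gives a processivity of 1 (distributive synthesis), while a probability of 0.02 corresponds to a processivity of about 50 nucleotides.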
Processivity is a desirable property for most in vitro applications such as PCR or sequencing processes but the thermostable enzymes that have been used up to now in these reactions only possess processivity to a slight extent whereas the temperature-sensitive T7 polymerase associated with thioredoxin has a processivity of several thousand nucleotides. In comparison the thermostable DNA polymerases from Thermus thermophilus or Thermus aquaticus only have a processivity of about 50 nucleotides (Biochim. Biophys. Acta 1995 Nov. 7; 1264(2):243-248 Inactivation of the 5'-3' exonuclease of Thermus aquaticus DNA polymerase. Merkens LS, Bryan SK, Moses RE).
The US patents 4,683,195, 4,800,195 and 4,683,202 describe the application of such thermostable DNA polymerases in the polymerase chain reaction (PCR). In PCR, DNA is newly synthesized using primers, templates (also referred to as matrices), nucleotides, a DNA polymerase, an appropriate buffer and suitable reaction conditions. The starting point is usually a double-stranded DNA sequence of which a certain target region is to be amplified. Two primers are used for this which are complementary to flanking regions of the target sequence, each being on one strand of the DNA double strand. In order to hybridize the primers, the DNA double strands are first denatured, in particular by thermal melting. After hybridization, the primers are elongated by means of the polymerase. Denaturation is then carried out again in order to separate the newly formed DNA strands from the template strands, whereupon, in addition to the original template strands, the nucleic acid strands formed in the first step are also available as templates for a further elongation cycle; each of these is again hybridized with primers and a new elongation takes place. This procedure is carried out in cycles, each with thermal denaturation as an intermediate step. A thermostable polymerase which survives the cyclic thermal melting of the DNA strands is preferably used for the PCR. Thus Taq DNA polymerase is often used (US patent 4,965,188). However, the processivity of Taq DNA polymerase is relatively low compared to T7 polymerase as described above.
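Because the strands formed in each cycle serve as templates in the next, the amount of target DNA grows roughly exponentially with the number of cycles. The following is a minimal sketch of this idealized relationship; the per-cycle `efficiency` parameter (1.0 meaning perfect doubling) and the function name are assumptions introduced for illustration and do not appear in the description.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized number of target copies after a given number of PCR cycles.

    efficiency is the assumed fraction of templates that are successfully
    copied in each cycle (0.0 = no amplification, 1.0 = perfect doubling).
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be in [0, 1]")
    return initial_copies * (1.0 + efficiency) ** cycles


# Example: one template molecule, 30 cycles, perfect doubling per cycle.
copies_after_30 = pcr_copies(1, 30)
```

Real reactions fall short of this ideal, for example when the polymerase dissociates before an elongation is complete, which is one reason why processivity matters in this setting.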
DNA polymerases are also used in DNA sequence determination (Sanger et al., Proc. Natl. Acad. Sci. USA 74:5463-5467 (1977)). A T7 DNA polymerase is frequently used for sequencing according to Sanger (Tabor, S. and Richardson, C.C., Proc. Natl. Acad. Sci. USA 86:4076-4080 (1989)). Subsequently the cycle sequencing method was developed (Murray, V. (1989) Nucleic Acids Res. 17, 8889), which does not require a single-stranded template and allows initiation of the sequence reaction with relatively small amounts of template. The polymerases that can be used for this are for example the Taq polymerase mentioned above (US patent 5,075,216) or the polymerase from Thermotoga neapolitana (WO 96/10640) or other thermostable polymerases. Recent methods couple the exponential amplification and sequencing of a DNA fragment in one step so that it is possible to directly sequence genomic DNA. One of the methods, the so-called DEXAS method (Nucleic Acids Res. 1997 May 15; 25(10):2032-2034 Direct DNA sequence determination from total genomic DNA. Kilger C, Pääbo S; Biol. Chem. 1997 Feb; 378(2):99-105 Direct exponential amplification and sequencing (DEXAS) of genomic DNA. Kilger C, Pääbo S; and DE 19653439.9 and DE 19653494.1), uses a polymerase with a reduced ability to discriminate against dideoxynucleotides (ddNTPs) compared to deoxynucleotides (dNTPs), as well as a reaction buffer, two primers which are preferably not present in equimolar amounts and the above-mentioned nucleotides, in order to obtain, in several cycles, a complete, sequence-specific DNA ladder of a fragment which is flanked by the primers. A further development of this method comprises the use of a polymerase mixture in which one of the two polymerases discriminates between ddNTPs and dNTPs whereas the second has a reduced discrimination ability (Nucleic Acids Res. 1997 May 15; 25(10):2032-2034 Direct DNA sequence determination from total genomic DNA. Kilger C, Pääbo S).
DNA polymerases are also used for the reverse transcription of RNA into DNA. In this case RNA serves as a template and the polymerase synthesizes a complementary DNA strand. The thermostable DNA polymerase from the organism Thermus thermophilus (Tth) (US patent 5,322,770) is for example used in this case.
The polymerase may also have a proofreading activity, i.e. a 3'-5' exonuclease activity. This property is particularly desirable when the product to be synthesized should be produced with a low rate of nucleotide incorporation errors. The polymerases from the organism Pyrococcus woesei are an example of this.
Most of the above-mentioned enzymes that are usually used in PCR reactions are not actually replication enzymes in vivo but are mostly enzymes which are assumed to be involved in DNA repair which is why their processivity is relatively small.
Hence an object of the present invention was to combine several of the aforementioned properties of polymerases, in particular that of high processivity and thermostability, for use in in vitro reactions.
This object was achieved according to the invention by providing a thermostable in vitro complex comprising a thermostable sliding clamp protein and a thermostable elongation protein having polymerase activity. The inventive complex can thus be used for the template-dependent elongation of nucleic acid(s).
This complex can be used in in vitro reactions such as in PCR reactions and has a high processivity in these reactions. An additional advantage is that the complex has a low error rate in the nucleotide incorporation i.e. an increased accuracy of base incorporation. Hence this complex can be used advantageously for the elongation, amplification, reverse transcription, DNA labelling and sequencing of nucleic acids.

The said applications each represent particularly preferred embodiments of the invention.
If the inventive complex is used to amplify nucleic acids it was surprisingly found that the amplification product produced in this process has a particularly low rate of erroneous base incorporation.
The use of such a complex, for example in standard PCR reactions, ensures a simple handling and a high processivity as shown for example in Fig. 26.
In one embodiment it is intended that the sliding clamp protein is linked to the elongation protein in the in vitro complex according to the invention.
In a preferred embodiment of the inventive thermostable in vitro complex the sliding clamp protein and the elongation protein are linked by a coupling protein.
The coupling between the sliding clamp protein and the elongation protein having polymerase activity can be achieved by covalent and also by non-covalent binding. In a preferred alternative embodiment a direct coupling between the sliding clamp protein and elongation protein is envisaged.
In this case the sliding clamp protein and/or the elongation protein can be derived from archaebacteria.
In a preferred embodiment of the inventive complex it is a prokaryotic in vitro complex.
In a preferred embodiment the inventive prokaryotic complex can be an archaebacterial in vitro complex.
In a preferred alternative embodiment the inventive prokaryotic complex can be a eubacterial in vitro complex.

In an alternative embodiment of the complex according to the invention it is a eukaryotic in vitro complex.
In this connection a prokaryotic in vitro complex is one in which the sliding clamp protein is of prokaryotic origin irrespective of the origin of the elongation protein.
Correspondingly a eubacterial complex is one in which the sliding clamp protein is of eubacterial origin irrespective of the origin of the elongation protein.
Correspondingly an archaebacterial complex is one in which the sliding clamp protein is of archaebacterial origin irrespective of the origin of the elongation protein. Furthermore a eukaryotic complex is correspondingly one in which the sliding clamp protein is of eukaryotic origin irrespective of the origin of the elongation protein.
The present invention also concerns those thermostable in vitro complexes in which the proteins of which the complexes are composed are partly derived from archaebacteria, eukaryotes and eubacteria. In this respect any permutations of the in vitro complexes with regard to their protein components or their respective origin are a subject matter of the present invention.
In this connection origin in the above sense is intended to denote any source in which the gene, the gene information or the protein is based.
This is independent of the actual manner in which the sliding clamp protein or elongation protein is obtained such as by chemical synthesis, genetic engineering methods or isolation from natural sources.
The invention in particular concerns a thermostable prokaryotic in vitro complex for the elongation and especially for the template-dependent elongation of nucleic acids which comprises a thermostable sliding clamp protein (fig. 20) which wholly or partially encircles the complementary nucleic acid strands, and a thermostable protein having polymerase activity (fig. 21 and fig. 22), this protein or this protein complex being coupled to or associated with the sliding clamp protein.

In the scope of the present invention the term elongation protein having polymerase activity also encompasses protein complexes having polymerase activity or subunits of such complexes which carry the polymerase activity.
Thermostable in the sense of the present invention means that the in vitro complex incorporates nucleotides into growing nucleic acid strands with high processivity at the low as well as at the high temperatures which occur in PCR or other reactions, e.g. DNA sequencing.
PCR usually comprises, for example, the steps of denaturation (70°C to 98°C), annealing (40°C to 78°C) and DNA strand synthesis (60°C to 76°C). Hence this complex must be functional at least between ca. 60°C and ca. 70°C, preferably between 60°C and 76°C and particularly preferably between 40°C and 98°C. There should be no signs of irreversible denaturation of the complex or of individual components during the entire reaction which could prevent or inhibit the elongation reaction.
The sliding clamp
The following details are intended to illustrate the function and possible forms of the sliding clamp and of the sliding clamp protein.
The function of the sliding clamp protein is to bind the elongation protein to the DNA. As already mentioned above, the sliding clamp protein surrounds the single-stranded or double-stranded nucleic acid wholly or partially, either by itself or by association with the protein having polymerase activity (or, as the case may be, with the protein complex having polymerase activity or a subunit thereof), and thus forms a clamp around the nucleic acid. In any case this clamp formation increases the processivity at least one and a half fold (example 5 and fig. 23).
This means that the processivity of the inventive in vitro complex is at least one and a half times that of the elongation protein alone, or of the protein complex having polymerase activity or a subunit thereof, without a sliding clamp (example 5 and fig. 23).

According to the invention, homologues or functional analogues of the proliferating cell nuclear antigen protein complex coded in the human genome, or homologues of the likewise circular β-clamp protein complex from E. coli, can be used for example as sliding clamps. These may be derived from thermostable organisms, may themselves be thermostable or, if they are not thermostable, may subsequently be made thermostable by modifying the amino acid sequence (Eijsink VG, van der Zee JR, van den Burg B, Vriend G, Venema G, FEBS Lett 1991 Apr 22; 282(1):13-16, Improving the thermostability of the neutral protease of Bacillus stearothermophilus by replacing a buried asparagine by leucine; Bertus Van den Burg, Gert Vriend, Oene R. Veltman, Gerard Venema and Vincent G.H. Eijsink, Engineering an enzyme to resist boiling, PNAS 1998 95:2056-2060). Homologous sequences are understood herein as sequences which are similar to one or several other sequences to such an extent that the similarity cannot be assumed to be coincidental. The degree of sequence similarity is expressed in percent and is referred to as homology. Sometimes the term sequence identity is also used. A homologue is a nucleic acid or amino acid sequence which is a homologous sequence to a reference sequence.
The sliding clamp can be composed of several components. The sliding clamp identified in the human genome is composed of three PCNA-protein components (SEQ ID NO:11) (homotrimer) and the sliding clamp identified in the E. coli genome is composed of two components (SEQ ID NO:35) (homodimer).
A sliding clamp in the sense of the present invention is understood in particular as any protein that has the functional property of increasing polymerase processivity (example 5 and fig. 23) and/or which reduces the error rate. For this purpose the sliding clamp can have a circular three-dimensional structure or can form a circular three-dimensional structure by coupling to another protein by which means it is able to wholly or partially encircle single and double-stranded nucleic acids.

A sliding clamp in the sense of the present invention is in particular understood as a protein which
1. has a sequence identity of at least 20 %, preferably of at least 25 % and more preferably of at least 30 % to the human PCNA amino acid sequence (eukaryotes) (SEQ ID NO:11) over a length of at least 100 amino acids in a sequence alignment or which
2. has a sequence identity of at least 20 %, preferably of at least 25 % and more preferably of at least 30 % to the bacterial β-clamp sequence from E. coli (eubacteria) (SEQ ID NO:35) over a length of at least 100 amino acids in a sequence alignment or which
3. has a sequence identity of at least 20 %, preferably of at least 25 % and more preferably of at least 30 % to the amino acid sequence of the PCNA homologue from Archaeoglobus fulgidus (archaebacteria) (SEQ ID NO:12) over a length of at least 100 amino acids in a sequence alignment.
All sequence alignments disclosed herein were generated using the BLAST algorithm according to Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J., J. Mol. Biol. 215, 403-410 (1990).
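The identity criterion above can be sketched as a small check on a pairwise alignment that has already been computed, for example by BLAST. The sketch assumes that identity is counted as identical columns divided by the total number of alignment columns, gaps included, and that the required length of 100 amino acids refers to the number of alignment columns; both conventions, as well as the function names, are assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two aligned sequences ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    # A column counts as identical only if both residues match and are not gaps.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_identity_criterion(
    aligned_a: str,
    aligned_b: str,
    min_identity: float = 20.0,
    min_length: int = 100,
) -> bool:
    """True if the alignment is long enough and similar enough."""
    return (
        len(aligned_a) >= min_length
        and percent_identity(aligned_a, aligned_b) >= min_identity
    )
```

Passing `min_identity=25.0` or `min_identity=30.0` corresponds to the preferred and more preferred thresholds named in the text.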
The sliding clamp according to the invention can have one or several of the aforementioned features.
In the sense of the present invention sliding clamps or sliding clamp proteins are also to be understood as proteins which contain one or both of the following consensus sequences (of region 1 and region 2) and deviate at not more than four positions from region 1 (SEQ ID NO:39) or at not more than four positions from region 2 (SEQ ID NO:40) (fig. 4):
Region 1 (SEQ ID NO:39):
[GAVLIMPFW]-D-X-X-X-[GAVLIMPFW]-X-X-[GAVLIMPFW]-X-[GAVLIMPFW]-X-[GAVLIMPFW]-X-X-X-X-F-X-X-Y-X-X-D
and/or Region 2 (SEQ ID NO:40):
[GAVLIMPFW]-X(3)-L-A-P-[KRHDE]-[GAVLIMPFW]-E
The amino acids are denoted according to the standard IUPAC single-letter nomenclature and shown in accordance with the Prosite pattern description standard.
The following amino acid groups are frequently pooled together:
G, A, V, L, I, M, P, F or W (amino acids with non-polar side chains)
S, T, N, Q, Y or C (amino acids with uncharged polar side chains)
K, R, H, D or E (amino acids with charged and polar side chains)
In addition X denotes any desired amino acid or an insertion or deletion in the sequences or sequence protocols.
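A search for the two consensus regions with a limited number of deviations can be sketched as follows. This is a simplified illustration: it treats every X as exactly one arbitrary residue and does not model X as an insertion or deletion, and it simply counts substituted positions in a sliding window. The helper names `parse_pattern` and `find_matches` are hypothetical.

```python
import re

REGION_1 = ("[GAVLIMPFW]-D-X-X-X-[GAVLIMPFW]-X-X-[GAVLIMPFW]-X-[GAVLIMPFW]-X-"
            "[GAVLIMPFW]-X-X-X-X-F-X-X-Y-X-X-D")
REGION_2 = "[GAVLIMPFW]-X(3)-L-A-P-[KRHDE]-[GAVLIMPFW]-E"


def parse_pattern(pattern: str) -> list:
    """Turn a Prosite-style pattern into one allowed-residue set per position.

    A position that accepts any residue (X) is represented by None.
    """
    positions = []
    for element in pattern.split("-"):
        m = re.fullmatch(r"(\[[A-Z]+\]|[A-Z])(?:\((\d+)\))?", element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        token, repeat = m.group(1), int(m.group(2) or 1)
        allowed = None if token == "X" else set(token.strip("[]"))
        positions.extend([allowed] * repeat)
    return positions


def find_matches(sequence: str, pattern: str, max_deviations: int = 0) -> list:
    """Return (start, deviations) for every window within the deviation limit."""
    positions = parse_pattern(pattern)
    hits = []
    for start in range(len(sequence) - len(positions) + 1):
        window = sequence[start:start + len(positions)]
        deviations = sum(
            1 for residue, allowed in zip(window, positions)
            if allowed is not None and residue not in allowed
        )
        if deviations <= max_deviations:
            hits.append((start, deviations))
    return hits
```

Calling `find_matches(protein, REGION_2, max_deviations=4)` applies the limit of four deviating positions stated above; for a pattern as short as region 2 such a tolerant search will return many windows, so in practice the hits would be judged together with the other criteria.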
Furthermore a hidden Markov model was generated from the multiple alignment of human PCNA homologues shown in fig. 12. Hence a sliding clamp in the sense of the present invention is also especially to be understood as any protein which has a score of more than 20, preferably 25 and most preferably 30, with the hidden Markov model (referred to as HMM in the following) generated in this manner (fig. 12), whereby a score is the output value of a HMM analysis. The hidden Markov model and the corresponding scores were calculated using the hmmfs programme (version 1.8.4, July 1997) from the HMMER package (HMMER: protein and DNA hidden Markov models (version 1.8) by Sean Eddy, Dept. of Genetics, Washington University School of Medicine, St. Louis, USA).
Markov models with a hidden profile (profile HMMs) can also be referred to in a short form as hidden Markov models and are abbreviated herein as HMM are statistical models of the consensus of the primary structure of a sequence family. The profiles use position-specific scores for amino acids (or nucleotides) and position-specific scores for the opening or extension of an insertion or deletion.
Methods for setting up profiles from multiple alignments were introduced by Taylor ( 1986), Gribskov et al. ( I 987), Barton ( 1990) and Heinikof~ ( 1996).
HMMs provide a completely probabilistic description of profiles, i.e. Bayesian theory determines how all the probability (scoring) parameters should be set (cf. Krogh et al. 1994, Eddy 1996 and Eddy 1998). The pivotal idea is that a HMM is a finite model which describes a probability distribution over an unlimited number of possible sequences. The HMM is composed of a number of states which correspond to the columns of a multiple alignment as it is usually shown. Each state emits symbols (residues) according to the (state-specific) symbol emission probabilities and the states are linked together by state transition probabilities. A series of states is generated starting from an initial state by transition from one state to the other according to the state transition probabilities until an end state is reached. Each state then emits symbols according to the emission probability distribution of this state, which generates an observable sequence of symbols.
The attribute hidden is derived from the fact that the underlying state sequence cannot be observed; it is the symbol sequence that is observed. An estimation of the transition and emission probabilities (the training of the model) is achieved by dynamic programming algorithms which are implemented in the HMMER package.
With an existing HMM and a given sequence it is possible to calculate the probability that the HMM can generate the sequence in question. The HMMER package provides a numerical quantity (the score or output value) which is proportional to this probability i.e. the information content of the sequence is stated in bits and measured according to the HMM.
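The calculation described above can be sketched with the forward algorithm on a toy model. This is a minimal illustration, not the hmmfs implementation: the two-state model, its probabilities and the uniform null model are all hypothetical, and the bit score is assumed here to be a base-2 log-odds ratio against that null model.

```python
import math

def forward_probability(sequence, start, transition, emission):
    """Forward algorithm: P(sequence | HMM), summed over all hidden state paths.

    Requires a non-empty sequence; no explicit end state is modelled.
    """
    states = list(start)
    # alpha[s]: probability of the symbols seen so far with the path ending in s
    alpha = {s: start[s] * emission[s].get(sequence[0], 0.0) for s in states}
    for symbol in sequence[1:]:
        alpha = {
            s: emission[s].get(symbol, 0.0)
            * sum(alpha[p] * transition[p][s] for p in states)
            for s in states
        }
    return sum(alpha.values())

def bit_score(sequence, start, transition, emission, alphabet_size):
    """Log-odds score in bits against a uniform null model (an assumption)."""
    p_model = forward_probability(sequence, start, transition, emission)
    if p_model == 0.0:
        return float("-inf")
    p_null = (1.0 / alphabet_size) ** len(sequence)
    return math.log2(p_model / p_null)

# Hypothetical toy model over the two-letter alphabet {A, G}.
START = {"a_rich": 0.5, "g_rich": 0.5}
TRANSITION = {
    "a_rich": {"a_rich": 0.9, "g_rich": 0.1},
    "g_rich": {"a_rich": 0.1, "g_rich": 0.9},
}
EMISSION = {
    "a_rich": {"A": 0.9, "G": 0.1},
    "g_rich": {"A": 0.1, "G": 0.9},
}
```

With this toy model a sequence that fits one state well, such as "AA", receives a positive bit score, whereas "AG" scores below zero because the null model explains it better.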
Reference is made to the following literature references in connection with HMM:
Barton, G.J. (1990): Protein multiple alignment and flexible pattern matching. Methods Enzymol. 183: 403-427
Eddy, S.R. (1996): Hidden Markov models. Curr. Opin. Struct. Biol. 6: 361-365
Eddy, S.R. (1998): Profile hidden Markov models. Bioinformatics 14: 755-763
Gribskov, M., McLachlan, A.D. and Eisenberg, D. (1987): Profile analysis: Detection of distantly related proteins. Proc. Natl. Acad. Sci. USA 84: 4355-4358
Henikoff, S. (1996): Scores for sequence searches and alignment. Curr. Opin. Struct. Biol. 6: 353-360
Krogh, A., Brown, M., Mian, I.S., Sjolander, K. and Haussler, D. (1994): Hidden Markov models in computational biology: Applications to protein modelling. J. Mol. Biol. 235: 1501-1531
Taylor, W.R. (1986): Identification of protein sequence homology by consensus template alignment. J. Mol. Biol. 188: 233-258
A HMM was generated from the multiple alignment of E. coli β-clamp homologues shown in fig. 13. Hence a sliding clamp in the sense of the present invention is also in particular to be understood as any protein which has a score of more than 25, preferably 30 and most preferably 35 in the HMM generated in this manner (fig. 13).
The sliding clamps can be composed of several components which are bound firmly together by a characteristic binding such that a stable circular molecular complex is formed which cannot be readily dissociated from the nucleic acid. This enables a firm but non-covalent binding to the nucleic acid which does not hinder free movement on the nucleic acid. Moreover, the sliding clamp proteins that increase processivity have characteristic local molecular properties in the region of interaction with the DNA which facilitate free movability and which can be facilitated by water molecules intercalated in this region.
A further preferred embodiment of the present invention is in particular a thermostable prokaryotic in vitro complex in which the sliding clamp protein is one of the following: AF 0335 from Archaeoglobus fulgidus (SEQ ID NO:12) (fig. 24), MJ0247 from Methanococcus jannaschii (SEQ ID NO:13), PHLA008 from Pyrococcus horikoshii (SEQ ID NO:14), MTH1312 from Methanobacterium thermoautotrophicum (SEQ ID NO:15) as well as AE000761 7 from Aquifex aeolicus (SEQ ID NO:36).
In particular those thermostable in vitro complexes are a subject matter of this application in which the sliding clamp protein contains an amino acid sequence which is selected from the group comprising SEQ ID NO: 11, 12, 13, 14, 15 and 36.
The sliding clamp loader
A further preferred embodiment is one in which the inventive complex includes a sliding clamp loader. A sliding clamp loader is understood herein as a protein, protein complex or subunit of a protein which comprises a homologue of the replication factor C protein complex identified in humans.
In humans this protein complex is composed of five subunits which are coded by five separate genes. The four small subunits, each coded by one gene (referred to herein as sliding clamp loader 1), form a protein complex. The large subunit is coded by one gene (referred to herein as sliding clamp loader 2). The sequences of the four small subunits are shown as SEQ ID NO:1, 32, 33, 34; the sequence of the large subunit is shown as SEQ ID NO:6.
According to the invention homologues or functional analogues of any of the above-mentioned sequences SEQ ID NO:1, 32, 33, 34 can be used singly or in any combination as the sliding clamp loader 1. In this connection the homologues can be of prokaryotic as well as eubacterial or archaebacterial or eukaryotic origin.

According to the invention homologues or functional analogues of the above-mentioned sequence SEQ ID NO:6 can be used singly or in any combination as the sliding clamp loader 2. In this connection the homologues can be of prokaryotic as well as eubacterial or archaebacterial or eukaryotic origin.
A protein can also be understood as a sliding clamp loader 1 in the sense of the present invention which has an at least 20 %, preferably at least 25 % and even more preferably at least 30 % sequence identity to the human (eukaryotic) amino acid sequence (SEQ ID NO:1, 32, 33, 34) over a length of at least 100 amino acids in a sequence alignment.
A protein can also be understood as a sliding clamp loader 2 in the sense of the present invention which has an at least 20 %, preferably at least 25 % and even more preferably at least 30 % sequence identity to the human (eukaryotic) amino acid sequence (SEQ ID NO:6) over a length of at least 150 amino acids in a sequence alignment.
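The identity criteria above combine a minimum percentage with a minimum aligned length. The following sketch shows one possible reading, assuming that identity is counted over aligned residue pairs only (gap columns are skipped) and that the alignment itself has already been produced by some other tool; the text does not fix either point, and both function names are hypothetical.

```python
def identity_over_alignment(aligned_a, aligned_b):
    """Return (percent identity, number of aligned residue pairs).

    Both inputs are rows of one pairwise alignment, with '-' for gaps.
    Assumption: columns containing a gap are not counted.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
    percent = 100.0 * identical / compared if compared else 0.0
    return percent, compared

def meets_identity_criterion(aligned_a, aligned_b, min_identity, min_length):
    """True if identity >= min_identity (%) over at least min_length residue pairs."""
    percent, compared = identity_over_alignment(aligned_a, aligned_b)
    return compared >= min_length and percent >= min_identity
```

For the sliding clamp loader 1 definition one would call `meets_identity_criterion(row_a, row_b, 20, 100)`, and for the sliding clamp loader 2 definition `meets_identity_criterion(row_a, row_b, 20, 150)`.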
Sliding clamp loader homologues in the sense of the above definition are for example the genes from archaebacteria listed in fig. 1. These genes correspond to the sequences SEQ ID NO: 2, 3, 4 and 5 for the sliding clamp loader 1 and to the sequences SEQ ID NO: 7, 8, 9 and 10 for the sliding clamp loader 2.
A protein can also be understood as a sliding clamp loader 1 in the sense of the present invention which contains a sequence in accordance with the following consensus sequence and which differs at no more than four positions from this sequence (see also fig. 6 for the alignment):
SEQ ID NO:41:
C-N-Y-X-S-[KRHDE]-I-I-X-[GAVLIMPFW]-[GAVLIMPFW]-Q-S-R-C-X-X-F-R-F-X-P-[GAVLIMPFW]
A protein can also be understood as a sliding clamp loader 2 in the sense of the present invention which contains a sequence in accordance with the following consensus sequence and which differs at no more than four positions from this sequence (see also fig. 7 for the alignment):
SEQ ID NO:42:
K-X-X-L-L-X-G-P-P-G-X-G-K-T-[STNQYC]-X-[GAVLIMPFW]-X-X-[GAVLIMPFW]
In addition a HMM was generated from the multiple alignment of sequences of the sliding clamp loader 1 shown in fig. 14 comprising the human sequence and some homologous sequences thereto from archaebacteria. Consequently a sliding clamp loader 1 is also understood in the sense of the present invention as a protein which has a score of more than 25, preferably more than 30 and most preferably more than 35 in the HMM generated in this manner (see also fig. 14 for the alignment).
In addition a HMM was generated from the multiple alignment of sequences of the sliding clamp loader 2 shown in fig. 15 comprising the human sequence and some homologous sequences thereto from archaebacteria. Consequently a sliding clamp loader 2 is also understood in the sense of the present invention as a protein which has a score of more than 15, preferably more than 20 and most preferably more than 25 in the HMM generated in this manner (see also fig. 15 for the alignment).
The inventive in vitro complex may also contain a protein homologous to the eubacterium Escherichia coli γ-complex or parts thereof as the sliding clamp loader 1 or sliding clamp loader 2.
Consequently a sliding clamp loader in the sense of the present invention can be a sliding clamp loader 1 alone, a sliding clamp loader 2 alone or a combination of one or several sliding clamp loaders 1 or sliding clamp loaders 2 each as defined above.
Furthermore in one embodiment the inventive thermostable in vitro complex for the elongation of nucleic acids additionally contains a compound which releases energy when cleaved such as ATP, GTP, CTP, TTP, dATP, dGTP, dCTP or dTTP.

Without wanting to be limited thereto in the following, a sliding clamp loader appears to assemble the components of the sliding clamp around the uninterrupted DNA strand and to remove these again when the reaction is completed. In this connection the sliding clamp can have a ring-shaped three-dimensional structure or can form a ring-shaped three-dimensional structure by coupling to another protein by which means it is able to wholly or partially encircle single-stranded or double-stranded nucleic acids. The sliding clamp loader can be advantageous when the template nucleic acid molecule is present in a closed ring shape.
In particular those thermostable in vitro complexes are a subject matter of this application in which the sliding clamp loader 1 comprises an amino acid sequence which is selected from the group comprising SEQ ID NO: 2, 3, 4 and 5.
In particular those thermostable in vitro complexes are a subject matter of this application in which the sliding clamp loader 2 comprises an amino acid sequence which is selected from the group comprising SEQ ID NO: 7, 8, 9 and 10.
The coupling protein
The function of the coupling protein is to connect an elongation protein with one or several sliding clamp proteins or with a sliding clamp protein complex. Hence a coupling protein in the sense of the present invention is to be understood in particular as any protein which has the function described above. Those coupling proteins are preferred which are of archaebacterial origin.
Coupling proteins in the sense of the present invention are for example homologues or functional analogues, used singly or in any combination, of the sequences listed in fig. 1 which are homologous to the human sequence of the small subunit of polymerase δ, referred to herein as the coupling subunit (sequence name: DPD2_HUMAN, shown in the sequence protocol as SEQ ID NO:16). In the sense of the present invention a coupling protein can be a protein which comprises a sequence that is selected from the group comprising the sequences SEQ ID NO: 17, 18, 19, 20 and 21.

A coupling protein in the sense of the present invention is in particular to be understood as a protein which has a sequence identity to the human (eukaryotic) amino acid sequence (SEQ ID NO:16) of at least 18 %, preferably of at least 22 % and even more preferably of at least 26 % over a length of at least 150 amino acids in a sequence alignment.
A coupling protein in the sense of the present invention is in particular to be understood as a protein which has a sequence identity to the amino acid sequence (SEQ ID NO:19) from Pyrococcus horikoshii of at least 20 %, preferably a sequence identity of at least 25 % and even more preferably a sequence identity of more than 30 % over a length of at least 150 amino acids in a sequence alignment.
A coupling protein in the sense of the present invention is also to be understood as any protein which contains the following consensus sequence and deviates from this sequence at not more than four positions. The derivation of the consensus sequence, which is disclosed herein as SEQ ID NO:43, is shown in fig. 5.
SEQ ID NO:43:
[FL]-[GAVLIMPFW]-X-X-[GAVLIMPFW]-X-G-X(13)-[GAVLIMPFW]-X-[YR]-[GAVLIMPFW]-X-[GAVLIMPFW]-A-G-[DN]-[GAVLIMPFW]-[GAVLIMPFW]-[DS]
In addition a HMM was generated from the multiple alignment of homologues to the human coupling subunit or coupling protein shown in fig. 16. Hence a coupling subunit in the sense of the present invention is to be understood as any protein which has a score of more than 10, preferably of more than 15 and most preferably of more than 20 in the HMM generated in this manner.
In particular those thermostable in vitro complexes are a subject matter of this application in which the coupling protein comprises an amino acid sequence selected from the group comprising SEQ ID NO: 17, 18, 19, 20 and 21.

Elongation protein
An elongation protein can be used within the scope of the present invention which has the features defined above. In addition it is possible to use forms of elongation proteins which are described in the following and of which at least some are already known in the prior art.
It is known that some elongation proteins require the presence of a coupling protein in order to have any polymerase activity at all. It is also known that some elongation proteins can bind directly to sliding clamp proteins whereas other elongation proteins require the presence of a coupling protein in order to bind to the sliding clamp proteins. Furthermore elongation proteins can combine both of the above-mentioned properties i.e. bind to a sliding clamp protein via a coupling protein or directly i.e. without a coupling protein.
Preferred elongation proteins for the inventive in vitro complex can be selected from the group which comprises the organisms Carboxydothermus hydrogenoformans, Thermus aquaticus, Thermus caldophilus, Thermus chliarophilus, Thermus filiformis, Thermus flavus, Thermus oshimai, Thermus ruber, Thermus scotoductus, Thermus silvanus, Thermus species Z05, Thermus species sp. 17, Thermus thermophilus, Thermotoga maritima, Thermotoga neapolitana, Thermosipho africanus, Anaerocellum thermophilum, Bacillus caldotenax or Bacillus stearothermophilus.
The elongation proteins listed in fig. 1 from Archaebacteria (SEQ ID NO: 23, 24, 25, 26) which are homologous or functionally analogous to the human elongation protein (SEQ ID NO:22) are for example also suitable as elongation proteins.
In particular those thermostable in vitro complexes are a subject matter of the invention in which the elongation protein comprises an amino acid sequence which is selected from the group comprising SEQ ID NO: 23, 24, 25, 26, 27, 28, 29, 30 and 31.

An elongation protein in the sense of the present invention is in particular to be understood as a protein which has a sequence identity of at least 20 %, preferably of at least 25 % and most preferably of at least 30 % to the human (eukaryotic) amino acid sequence (SEQ ID NO:22) over a length of at least 200 amino acids in a sequence alignment.
An elongation protein in the sense of the present invention is also in particular understood as a protein which contains the following consensus sequence (SEQ ID NO:44) and does not differ at more than four positions from this sequence.
Fig. 8 shows the alignment which forms the basis for the consensus sequence.
SEQ ID NO:44:
D-[GAVLIMPFW]-[GAVLIMPFW]-X-X-Y-N-X-X-X-F-D-X-P-Y-[GAVLIMPFW]-X-X-R-A
In addition a HMM was generated from the multiple alignment of homologues to the human elongation protein (SEQ ID NO:22) shown in fig. 17. Hence an elongation protein in the sense of the present invention is to be understood as any protein which has a score of more than 20, preferably of more than 25 and most preferably of more than 30 in the HMM generated in this manner.
Hence an elongation protein in the sense of the present invention is in particular to be understood as a protein which has a sequence identity of at least 25 %, preferably of at least 30 % and most preferably of at least 35 %, to the archaebacterial amino acid sequence (SEQ ID NO:27) over a length of at least 400 amino acids in a sequence alignment. For example the proteins derived from Archaebacteria having SEQ ID NO:27, 28, 29, 30 or 31 are suitable.
Hence an elongation protein in the sense of the present invention is also in particular understood as any protein which contains the following consensus sequence referred to herein as SEQ ID NO:45 and which differs from this sequence at not more than four positions. (Fig. 9 shows the alignment which forms the basis for the consensus sequence).

SEQ ID NO:45:
A-[GAVLIMPFW]-R-T-A-[GAVLIMPFW]-A-[GAVLIMPFW]-[GAVLIMPFW]-T-E-G-[GAVLIMPFW]-V-X-A-P-[GAVLIMPFW]-E-G-I-A-X-V-[KRHDE]
In addition a HMM was generated from the multiple alignment of homologues to the archaebacterial elongation protein (SEQ ID NO:27) shown in fig. 18. Hence an elongation protein in the sense of the present invention is to be understood as any protein which has a score of more than 35, preferably of more than 40 and even more preferably of more than 45 in the HMM generated in this manner (fig. 18).
The elongation protein can also be of eubacterial origin.
Hence an elongation protein in the sense of the present invention is in particular also to be understood as a protein which has a sequence identity of at least 25 %, preferably of at least 30 % and most preferably of at least 35 % to the eubacterial amino acid sequence (SEQ ID NO:37) over a length of at least 300 amino acids in a sequence alignment.
Hence an elongation protein in the sense of the present invention is also in particular understood as any protein which contains the following consensus sequence and which differs from this sequence at not more than eight positions (fig. 10).
SEQ ID NO:46:
[GAVLIMPFW]-P-V-G-[GAVLIMPFW]-G-R-G-S-X-[GAVLIMPFW]-G-S-[GAVLIMPFW]-V-A-X-A-[GAVLIMPFW]-X-I-T-D-[GAVLIMPFW]-D-P-[GAVLIMPFW]-X-X-X-[GAVLIMPFW]-L-F-E-R-F-L-N-P-E-R-[GAVLIMPFW]-S-M-P-D
In addition a HMM was generated from the multiple alignment of homologues to the eubacterial elongation protein (SEQ ID NO:37) shown in fig. 19. Hence an elongation protein in the sense of the present invention is to be understood as any protein which has a score of more than 20, preferably of more than 25 and most preferably of more than 30 in the HMM generated in this manner.
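The HMM score limits stated for the individual protein families can be collected in one lookup table. The sketch below is illustrative only: the dictionary keys and the function name `classify_score` are hypothetical labels, the numbers are the thresholds given in the text for figs. 12 to 19, and every limit is read as a strict "more than" comparison.

```python
# (minimum, preferred, most preferred) score limits per HMM, as stated in the text.
HMM_THRESHOLDS = {
    "sliding_clamp_pcna_fig12": (20, 25, 30),
    "sliding_clamp_beta_fig13": (25, 30, 35),
    "clamp_loader_1_fig14": (25, 30, 35),
    "clamp_loader_2_fig15": (15, 20, 25),
    "coupling_protein_fig16": (10, 15, 20),
    "elongation_human_fig17": (20, 25, 30),
    "elongation_archaeal_fig18": (35, 40, 45),
    "elongation_eubacterial_fig19": (20, 25, 30),
}

def classify_score(model, score):
    """Map a HMM score to the tier it reaches for the given model."""
    minimum, preferred, most_preferred = HMM_THRESHOLDS[model]
    if score > most_preferred:
        return "most preferred"
    if score > preferred:
        return "preferred"
    if score > minimum:
        return "matched"
    return "not matched"
```

Because the limits are strict, a score equal to a threshold falls into the tier below it; for example a score of exactly 35 against the fig. 18 model does not yet count as a match.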

In particular those thermostable in vitro complexes are a subject matter of this application in which the elongation protein comprises an amino acid sequence which is selected from the group comprising SEQ ID NO:38.
Use of the in vitro complex according to the invention:
Previously DNA polymerases such as e.g. DNA polymerase I from Pyrococcus furiosus (US No. 5,545,552) or Pyrococcus species (EP-A-0 547 359) were used as elongation proteins for standard PCR reactions without a coupling protein and without a sliding clamp. A characteristic of these enzymes is that they are thermostable and frequently possess a 3'-5' exonuclease activity (proof reading activity). Only recently was a heterodimer with polymerase activity discovered in Pyrococcus furiosus (Uemori, T., Sato, Y., Kato, I., Doi, H., and Ishino, Y. (1997). A novel DNA polymerase in the hyperthermophilic archaeon, Pyrococcus furiosus: gene cloning, expression and characterization. Genes to Cells 2, 499-512).
It is also possible to optimize the properties of these proteins by deletions or mutations or by attaching amino acids. These modified proteins are also a subject matter of the present invention provided they form the inventive in vitro complex for the elongation of nucleic acids and fulfil the functions which are specified above in more detail.
Fig. 3 illustrates by way of example an embodiment of the inventive thermostable in vitro complex in which the sliding clamp binds to the elongation protein by means of a coupling protein.
Furthermore the inventive in vitro complex or the inventive reaction mixture containing this complex can contain a nucleic acid which is for example the nucleic acid to be elongated, to be sequenced, to be amplified or to be reversely transcribed and/or a primer in which case this primer is preferably hybridized to a nucleic acid.
Primers are usually oligonucleotides that are complementary to a target sequence, which enables them to bind to it. Bound in opposite orientation with their 3'-ends facing one another, they enclose the nucleic acid section to be elongated, to be sequenced, to be amplified or to be reversely transcribed. They serve as the starting point of enzyme activity and usually provide a free 3'-OH end at which the polymerase incorporates a nucleotide.
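The geometry of a primer pair can be illustrated with a small sketch. It assumes a DNA template given as the 5'-3' sequence of one strand and simply takes the two ends of the chosen section; the function names are hypothetical, and real primer design would also consider melting temperature, secondary structure and specificity, none of which is modelled here.

```python
# Translation table for Watson-Crick complements of DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def flanking_primers(template, start, end, primer_length):
    """Return (forward, reverse) primers enclosing template[start:end].

    The forward primer copies the first bases of the section; the reverse
    primer is the reverse complement of its last bases, so that the 3'-ends
    of the two primers face one another across the enclosed section.
    """
    if start < 0 or end > len(template) or end - start < 2 * primer_length:
        raise ValueError("section too short for two non-overlapping primers")
    forward = template[start:start + primer_length]
    reverse = reverse_complement(template[end - primer_length:end])
    return forward, reverse
```

For the hypothetical template "AAACCCGGGTTTACGTACGT", `flanking_primers(template, 0, 20, 5)` yields the forward primer "AAACC" and the reverse primer "ACGTA", the latter binding to the last five bases of the given strand.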
During the use of the inventive thermostable in vitro complex for the amplification, elongation, reverse transcription and/or sequencing, the inventive complex may be present in a reaction mixture preferably in a suitable buffer. Suitable buffers are those that are used for PCR, sequencing, nucleic acid labelling and other in vitro nucleic acid elongation reactions by means of polymerase. Suitable buffers are described for example in Methods in Molecular Biology, vol. 15, Humana Press, Totowa, New Jersey, 1993, ed. Bruce A. White.
When the inventive thermostable in vitro complex is used for elongation, amplification, reverse transcription and/or sequencing, a nucleotide or a mixture of nucleotides can be present or used or included in addition to the inventive complex.
Deoxynucleotides can be selected from dGTP, dATP, dTTP and dCTP but are not limited to these. In addition derivatives of deoxynucleotides can also be used which are defined as those deoxynucleotides which are able to be incorporated by a thermostable polymerase into growing nucleic acid molecules. Such derivatives include the thionucleotides 7-deaza-2'-dGTP, 7-deaza-2'-dATP, digoxigenin-dUTP
(Boehringer Mannheim)-dATP as well as deoxyinosine triphosphate which can also be used as a substitute deoxynucleotide for dATP, dGTP, dTTP or dCTP but are not limited to these. It is also possible to use labelled deoxynucleotides. It is also possible to use pyrenes and pyrene derivatives. In this connection all known and/or suitable labels for the purpose of the invention can be present or used.
Dideoxynucleotides can be selected from ddGTP, ddATP, ddTTP and ddCTP, but are not limited to these. Alternatively it is also possible to use derivatives of dideoxynucleotides which are defined as those dideoxynucleotides that are able to be incorporated by a polymerase into growing nucleic acid molecules that are synthesized in the reaction. Such derivatives can be radioactive dideoxynucleotides (ddATP, ddGTP, ddTTP and ddCTP) or dideoxynucleotides (ddATP, ddGTP, ddTTP and ddCTP) that are labelled with e.g. FITC, Cy5, Cy5.5, Cy7, Texas-red or other dyes but are not limited to these. As part of a sequencing using the inventive thermostable in vitro complex it is also possible to use labelled deoxynucleotides together with unlabelled dideoxynucleotides.
Ribonucleotides can be selected from GTP, ATP, TTP and CTP but are not limited to these. In addition derivatives of ribonucleotides can also be used according to the invention which are defined as those ribonucleotides which are able to be incorporated by a polymerase into growing nucleic acid molecules which are synthesized in the reaction. Such derivatives can be radioactive ribonucleotides (ATP, GTP, TTP and CTP) or ribonucleotides (ATP, GTP, TTP and CTP) that are labelled with e.g. FITC, Cy5, Cy5.5, Cy7 and Texas-red or others but are not limited to these.
During the use of the protein complex for amplification, elongation or sequencing it may prove to be advantageous when a pyrophosphatase is present in the reaction or in the reaction mixture.
Reaction mixture containing the thermostable in vitro complex according to the invention:
A further subject matter of the present invention is a reaction mixture which contains the in vitro complex according to the invention. Provision can also be made for the reaction mixture to additionally contain one or several elongation proteins which have at least one or several of the above-mentioned properties or activities.
Such an additional elongation protein is advantageously a thermostable polymerase.
Such a reaction mixture allows an increased processivity compared with using the known thermostable polymerases.
Identification of the genes, cloning the genes, expression of these and purification of the proteins of inventive in vitro complexes:
The complexes, which are preferably completely or partially composed of recombinant proteins, can usually be prepared by the following steps:
Preparation of the nucleic acid fragment which codes for the desired protein, ligation into an expression vector, transformation into a host, expression and purification of the protein (fig. 25). In accordance with the present invention the genes and especially those from the Archaebacteria may contain inteins which can firstly be removed (Proc. Natl. Acad. Sci. USA 1992 Jun. 15, 89(12):5577-5581, Intervening sequences in an Archaea DNA polymerase gene, Perler FB, Comb DG, Jack WE, Moran LS, Qiang B, Kucera RB, Benner J, Slatko BE, Nwankwo DO, Hempstead SK, et al).
Further genes and/or proteins which are suitable for the inventive complex can for example be identified by homology searches in databases that contain genomes from prokaryotes. Suitable programs for this include for example the programs BLAST, BLASTP and FASTA but are not limited to these (Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller and David J. Lipman (1997), Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Res. 25:3389-3402; W.R. Pearson & D.J. Lipman, PNAS (1988) 85:2444-2448).
They can also be identified by using DNA probes in order to screen for suitable genes in for example total genomic banks from prokaryotes or eukaryotes. The experimental methods required for this can be found in Maniatis et al. (Molecular Cloning (2nd edition, 3 volume set): A Laboratory Manual, Cold Spring Harbor Laboratory Press, N.Y. (1989)) or in Ausubel et al. (Current Protocols in Molecular Biology, John Wiley and Sons (1988)).
The purified nucleic acid of the genes for the proteins of which the inventive complexes are composed can for example be provided by isolating it from a genomic bank of the relevant organism or by synthetic DNA preparation each being combined if desired with an amplification by means of PCR with the aid of primers that are specific for the desired gene section. Standard methods are described in Maniatis et al. (Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, N.Y. (1989)).

The genes of the proteins of the inventive in vitro complexes can be cloned using numerous methods and thus made available for protein expression in the host organism by means of an expression vector. Standard methods are described in Maniatis et al. (Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, N.Y. (1989)). In this connection the genes of the complexes can for example be firstly cloned into a high copy vector e.g. pUC18, past or pBR322 and only afterwards be recloned into a prokaryotic expression vector e.g. pTrc99, pQE30, pQE31 or pQE32, or alternatively directly cloned into a prokaryotic or viral expression vector. Vectors in this connection are understood as nucleic acids which are able to transport another nucleic acid molecule into or between different organisms or genetic backgrounds. As a rule they are able to autonomously replicate and/or express (expression vectors) the operatively linked nucleic acid molecule.
Operatively linked as used herein means that the transported nucleic acid molecule is linked with the vector in such a manner that it is under the transcriptional and translational control of expression control sequences of the vector and can be expressed in a host cell. Bacterial and viral expression systems, their preferred applications and a selection of vector systems are described for example in Gene Expression Technology, (Meth. Enzymol. vol 185, Goeddel Ed., Academic Press, N.Y. (1990)). Suitable vectors for the present invention should enable the proteins to be expressed at different strengths due to the fact that they have one or all of the following properties: (1) promoters or transcription initiation sites either directly adjacent to the start of the protein or as a fusion protein, (2) operators can be used to switch the gene expression on or off, (3) ribosomal binding sites for an improved translation and (4) termination sites for the transcription or translation which lead to an improved stability of the transcript.
Expression vectors that are compatible with eukaryotic cells and preferably with vertebrate cells can also be used. Some known vectors are pSVL and pKSV-10 (Pharmacia), pBPV-1/pML2d (International Biotechnologies, Inc.), pAcHLT-ABC (Pharmingen) and pTDT1 (ATCC 31255). It is also possible to use a retroviral expression vector.
Hence further subject matters of the present invention are DNA sequences which code for the inventive thermostable in vitro complexes and appropriate vectors preferably expression vectors.

The vectors according to the invention contain at least one gene for the sliding clamp protein and at least one gene for the coupling protein or at least one gene for a sliding clamp loader 1 and/or 2 or at least one gene for an elongation protein. In one embodiment the vectors simultaneously contain several of the various genes mentioned above, for example the genes for an elongation protein and a sliding clamp protein.
Within the scope of the invention it is preferred that the vector contains additional suitable restriction cleavage sites and optionally polylinkers for the insertion of further DNA sequences in addition to the DNA sequences that are already contained therein. It is particularly preferred that the spatial arrangement of the DNA sequence already present and the additional insertion site leads to the formation of a fusion protein after expression.
It is additionally preferred that the inventive vector contains promoter and/or operator regions and it is particularly preferred that such promoter and/or operator regions are inducible or repressible. This considerably simplifies the control of expression in host cells and can be made to be particularly efficient.
Such promoter/operator regions can also occur several times in one expression vector which may enable several DNA sequences to be expressed independently using only one expression vector.
A further subject matter of the present invention is a host cell containing one or several inventive vectors wherein expression to form proteins can take place in this host cell under suitable conditions. Suitable conditions include for example the presence of a repressor, inducer or a derepressor.
There are standard protocols for transformation, phage infection and cell culture in Maniatis et al. (Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, N.Y.). Among the numerous available E. coli strains that are suitable for transformation, the following are preferred: JM101 (ATCC No. 33876), XL1 (Stratagene), RR1 (ATCC No. 31343), M15[pREP4] (QIAGEN) and BL21 (Pharmacia). Protein expression can for example also occur with the aid of the E. coli strain INVαF' (Invitrogen). The transformants are cultured under suitable growth conditions for the host strain. Thus most E. coli strains are for example cultured in LB medium at 30°C to 42°C until the logarithmic or stationary growth phase is reached. The proteins can be purified from a transformed culture, and this can either be from a cell pellet after centrifugation or from the culture liquid. If the proteins are purified from the cell pellet, the cells are resuspended in a suitable buffer and disrupted by means of ultrasonic treatment, enzymatic treatment or freezing and thawing. If they are purified from the culture suspension either with or without a fusion protein, the supernatant is separated from the cells by means of known procedures such as centrifugation.
The proteins of the inventive complexes can be separated and purified either from the supernatant of the culture solution or from the cell extract by known separation or purification procedures. These methods are for example those which are based on solubilities such as salt precipitations and solvent precipitations, methods which utilize the differences in molecular weights such as dialysis, ultrafiltration, gel filtration and SDS polyacrylamide gel electrophoresis, methods which utilize differences in charge such as ion exchange chromatography, methods which utilize differences in hydrophobicity such as reverse phase HPLC (High Performance Liquid Chromatography), methods which utilize particular affinities such as affinity chromatography (example 6, 7 fig. 24 and fig. 25) and methods which utilize differences in the isoelectric point such as isoelectric focussing. It is also conceivable that cell extracts can be made either from the organism which carries the gene of the accessory complex which fulfils the inventive object or from the recombinant host organism e.g. E. coli. If these extracts are used it is possible to omit other purification steps.
The methods described above can be used in many combinations in order to prepare proteins of the in vitro complex.
Elongation of nucleic acids
The inventive thermostable in vitro complexes can be used to elongate nucleic acids e.g. for the polymerase chain reaction (example 3, 4 and fig. 21, fig. 22), DNA sequencing, to label nucleic acids and for other reactions which comprise the in vitro synthesis of nucleic acids.
Hence a further subject matter of the present invention is a method for the template-dependent elongation of nucleic acids in which the nucleic acid is denatured if necessary, is provided with at least one primer under hybridization conditions, the primer being sufficiently complementary to a flanking region of a desired nucleic acid sequence of the template strand and a primer elongation is carried out in the presence of nucleotides with the aid of a polymerase, the inventive thermostable in vitro complex being used as the polymerase.
Methods are known to a person skilled in the art for the template-dependent elongation of nucleic acids in which the elongation is initiated with a primer that has been hybridized to the template nucleic acid and provides a free 3'-OH end for the elongation. In particular a polymerase chain reaction (PCR) is carried out for the amplification.
Reverse transcription
The thermostable in vitro complex according to the invention can also be used for reverse transcription in which either the inventive complex itself has reverse transcriptase activity or a suitable enzyme is additionally added which has reverse transcriptase activity irrespective of whether the thermostable in vitro complex itself has a reverse transcriptase activity.
An inventive thermostable in vitro complex whose elongation protein itself has a reverse transcriptase activity is also used for the reverse transcription of RNA into DNA, which is preferred according to the invention. This reverse transcriptase activity may be the only polymerase activity of the elongation protein but it may also be present in addition to an existing 5'-3'-DNA polymerase activity. An embodiment of the in vitro complex that is preferred according to the invention contains the elongation protein from the organism Carboxydothermus hydrogenoformans as disclosed in EP-A 0 834 569.

Sequencing
A further preferred use of the inventive in vitro complex is to sequence nucleic acids according to the method of Sanger. Starting with at least one primer which is sufficiently complementary to a part of the nucleic acid to be sequenced, a template-dependent elongation is carried out. When sequencing RNA it is necessary to carry out a reverse transcription. In the scope of this preferred embodiment the respective derivatives described above are also regarded to be suitable as deoxynucleotides or dideoxynucleotides. In particular it is preferable for the inventive method of nucleic acid elongation that the generated nucleic acids are labelled. For this it is possible to use labelled primers and/or labelled deoxynucleotides and/or labelled dideoxynucleotides and/or labelled ribonucleotides or appropriate derivatives of each of these like those that have for example already been described above.
Labelling of nucleic acids
A further subject matter of the present invention is a method for labelling nucleic acids e.g. by inserting individual breaks in the phosphodiester bonds of the nucleic acid chain and replacing a nucleotide at the sites of the breaks by a labelled nucleotide with the aid of a polymerase, in which an inventive thermostable in vitro complex is used as the polymerase.
A preferred method is the method that is generally referred to as nick translation which enables a simple labelling of nucleic acids. All labelled ribonucleotides or deoxyribonucleotides or derivatives thereof that have already been described above are suitable for this provided the polymerase accepts them as a substrate.
Labelling in the above sense is also a labelling which occurs as part of a PCR reaction whereby in this case labelled nucleotides or derivatives thereof are incorporated into the nucleic acid sequence.

The present invention additionally concerns a kit for the elongation and/or amplification and/or reverse transcription and/or sequencing of nucleic acids wherein this kit can contain one or several containers.
The kit itself comprises a) an inventive thermostable in vitro complex or b) a thermostable in vitro complex and optionally, separately therefrom, an elongation protein having polymerase activity and optionally one or several components which are selected from the group comprising primers, buffer substances, nucleotides, ATP, other cofactors and pyrophosphatase.
In this case it is possible that the thermostable in vitro complex according to the invention is present in the said kit in the form of its individual components, i.e. as a thermostable sliding clamp protein and a thermostable elongation protein that are separated or combined in one container, so that the complex as such does not form until a later time.
In particular a subject matter of the present invention is a kit for the elongation, amplification, reverse transcription, labelling and sequencing of nucleic acids additionally containing deoxynucleotides or derivatives thereof.
The kit according to the invention can optionally also contain ribonucleotides or derivatives thereof especially when an elongation protein is used which accepts ribonucleotides as a substrate.
A preferred embodiment of the inventive kit is a kit for sequencing nucleic acids containing dideoxynucleotides or derivatives thereof for chain termination in addition to deoxynucleotides and derivatives thereof.
Furthermore a subject matter of the present invention is in particular a kit for the reverse transcription of nucleic acids in which either the inventive complex itself has reverse transcriptase activity or a suitable enzyme is additionally present which has reverse transcriptase activity and deoxynucleotides or derivatives thereof may be present in the reaction mixture.

In a further particularly preferred embodiment the kit contains primers and/or deoxynucleotides and/or dideoxynucleotides and/or ribonucleotides and/or their respective derivatives in a labelled form.
Especially when sequencing nucleic acids it is necessary to insert a label.
Suitable labels have already been described above in the form of examples. Suitable labelling agents are included in a preferred embodiment of the inventive kit.
Within the scope of the present invention it is additionally possible that the kit (also referred to as reagent kit herein) is used to label nucleic acids. In this case the reagent kit contains the components a) or b) and labelled nucleotides wherein buffer substances, ATP or other cofactors and/or pyrophosphatase may also be present.
A kit is especially preferred which contains a suitable buffer as described above. It is also preferred that the kit according to the invention additionally contains a pyrophosphatase, ATP and/or other cofactors.
A further subject matter of the present invention is the use of a thermostable sliding clamp protein in in vitro methods for the elongation, amplification, labelling and sequencing or reverse transcription of nucleic acids.
The thermostable in vitro complex according to the invention can be used for the purposes of sequencing, amplification, reverse transcription and such like using short as well as long nucleic acid fragments to achieve an overall reduction in the error rate (incorporation of incorrect nucleotides) compared to the use of thermostable elongation proteins alone.
In a preferred alternative the thermostable in vitro complex is used for the sequencing, amplification and reverse transcription of long nucleic acid fragments.
In a further preferred alternative the thermostable in vitro complex is used for the sequencing, amplification and reverse transcription of those nucleic acid fragments which form a secondary structure.

The following examples in conjunction with the figures are intended to further elucidate the invention and illustrate the advantages:
Figures:
Sequence names are often used in the following which refer to the protein or nucleic acid sequences in the gene bank and the EMBL database. They show the following:
Fig. 1 tabulation of protein sequence names of the inventive in vitro complex and values from paired alignments and multiple alignments between Archaebacteria and between Archaebacteria and the corresponding human genes;
Fig. 2 a table similar to that of fig. 1 but limited to the sliding clamp protein and elongation protein of E. coli and A. aeolicus;
Fig. 3 a schematic representation of the inventive thermostable in vitro complex;
Fig. 4 alignments of two conserved regions of the sliding clamp protein;
Fig. 5 an alignment of a conserved region of the coupling protein;
Fig. 6 an alignment of a conserved region of the sliding clamp loader;
Fig. 7 an alignment of a conserved region of the sliding clamp loader 2;
Fig. 8 an alignment of a conserved region of the elongation protein 1;
Fig. 9 an alignment of a conserved region of the elongation protein 2;
Fig. 10 an alignment of a conserved region of the elongation protein from eubacteria;

Fig. 11 an alignment of a conserved region of the sliding clamp from eubacteria;
Fig. 12 to 19 multiple alignments of various protein sequences for the generation of Hidden Markov models (HMM);
Fig. 20 a chromatographic analysis of a recombinant sliding clamp (Archaeoglobus fulgidus PCNA (AF 0335));
Fig. 21 and 22 the result of a PCR using an elongation protein as a component of a thermostable in vitro complex;
Fig. 23 comparison of the activity of an elongation protein with and without a sliding clamp protein;
Fig. 24 the result of the purification of a sliding clamp protein;
Fig. 25 the expression and purification of Archaeoglobus fulgidus DNA polymerase;
Fig. 26 the results of using an inventive in vitro complex in the PCR;
Fig. 27 the results of a Y2H experiment;
Fig. 28 the PCR amplification result of the human collagen gene using the inventive thermostable in vitro complex;
Fig. 29 a tabulated overview of those genes that correspond to the protein sequence numbers stated in fig. 1 and which can be obtained from the respective databases;
and Fig. 30 a tabulation of the background information for the various databases in the English language which can be used to obtain the nucleic acid sequence data and amino acid sequence data that are stated herein.
Detailed description of the figures
Fig. 1:
Figure 1 is a tabulation of protein sequence names of the inventive thermostable in vitro complex and values from paired alignments and multiple alignments between Archaebacteria and between Archaebacteria and the corresponding human genes.
In this connection the annotation "1" denotes the percentage identity (%) to the corresponding human gene calculated from the paired alignment (see figures) using BLAST 2.0.4 [Feb-24-1998], the annotation "2" denotes the percentage identity to the corresponding gene of Archaeoglobus fulgidus calculated from the paired alignment (see figures) using BLASTP 2.0.4 [Feb-24-1998], and the annotation "3" denotes the percentage identity to the corresponding human gene calculated from the paired alignment using FASTA 3.1t02 [March, 1998]. The methods are described in more detail by: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402 and W.R. Pearson & D.J. Lipman, Proc. Natl. Acad. Sci. (1988) 85:2444-2448.
Fig. 1 shows the sequence names from the databases and also the SEQ ID numbers for the sliding clamp loader 1, the sliding clamp loader 2, the sliding clamp, the coupling protein and the elongation protein 1, the values in parentheses each representing the percentage identity per number of amino acids. In the case of the elongation protein 2 the values refer to the percentage sequence identity relative to the Archaeoglobus fulgidus sequence. In this case the sequence names from the databases and their SEQ ID NO are also shown.
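The percentage identity "per number of amino acids" quoted from the paired alignments can be illustrated with a minimal sketch. It assumes the BLAST-style convention in which identical positions are divided by the total number of alignment columns, gap columns included; the function name and the two toy sequences are hypothetical and are not taken from fig. 1.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> tuple[int, int, float]:
    """Return (identical positions, alignment length, percentage identity).

    Both inputs are rows of one pairwise alignment, with '-' marking gaps.
    Assumption: identities are divided by the full alignment length,
    gap columns included, as in a BLAST "Identities = x/y" line.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as identical only if both rows carry the same residue.
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    length = len(aligned_a)
    return identical, length, 100.0 * identical / length


# Hypothetical toy alignment (not a sequence from the patent).
matches, columns, pct = percent_identity("MKV-LE", "MRVALE")
print(f"{matches}/{columns} identical positions ({pct:.1f} %)")
```

A value written as a percentage together with the number of aligned amino acids corresponds to the last two elements of the returned tuple.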
Fig. 2:
Fig. 2 shows protein sequences, paired alignments and multiple alignments from eubacteria of the replication apparatus. The annotation denotes the percentage identity to the corresponding gene from E. coli calculated from the paired alignment (see attachment) using BLASTP 2.0.4 [Feb-24-1998]. The method is described in more detail in: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.
Fig. 3:
Fig. 3 shows a sketch of a possible form of the inventive thermostable in vitro complex in which the sliding clamp binds to the elongation protein by means of a coupling protein.
Fig. 4:
Fig. 4 shows alignments of two conserved regions of the sliding clamp protein as well as the consensus sequences derived therefrom. The following genes are shown:
human PCNA (from SEQ ID NO:11) and the corresponding sequences from Archaeoglobus fulgidus (from SEQ ID NO:12), from Methanococcus jannaschii (from SEQ ID NO:13), from Pyrococcus horikoshii (from SEQ ID NO:14) and from Methanococcus thermoautotrophicus (from SEQ ID NO:15).
Fig. 5:
Fig. 5 shows an alignment of a conserved region of the coupling protein and the consensus sequences derived therefrom. The following genes are shown: PfuORF2, DPD2_HUMAN, AF1790 and MJ0702. The SEQ ID numbers can be taken from fig. 1.
Fig. 6:
Fig. 6 shows an alignment of a conserved region of the sliding clamp loader 1 and the consensus sequences derived therefrom. The following genes are shown: AC11_HUMAN, AF2060, MTH0241, PHBN012 and MJ1422. The SEQ ID numbers can be taken from fig. 1.
Fig. 7:
Fig. 7 shows an alignment of a conserved region of the sliding clamp loader 2 and the consensus sequences derived therefrom. The following genes are shown: AC15_HUMAN, MJ0884, AF1195, MTH0240 and MTH0240. The SEQ ID numbers can be taken from fig. 1.
Fig. 8:
Fig. 8 shows an alignment of a conserved region of the elongation protein 1 and the consensus sequences derived therefrom. The following genes are shown:
DPOD_HUMAN, MJ0885, MTH1208, PHBT047 and DPOL_ARCFU. The SEQ ID numbers can be taken from fig. 1.
Fig. 9:
Fig. 9 shows an alignment of a conserved region of the elongation protein 2 and the consensus sequences derived therefrom. The following genes are shown: AF1722, MJ1630, PfuORF3, MTH1536 and PHBN021. The SEQ ID numbers can be taken from fig. 1.
Fig. 10:
Fig. 10 shows an alignment of a conserved region of the elongation protein from eubacteria and the consensus sequences derived therefrom. The following genes are shown: DP3A_ECOLI: DNA Pol III, alpha subunit, Escherichia coli; BB0579: DNA Pol III, alpha subunit, Borrelia burgdorferi; DP3A_HELPY: DNA Pol III, alpha subunit, Helicobacter pylori; AA50: Aquifex aeolicus, section 50; and DP3A_SALTY: DNA Pol III, alpha subunit, Salmonella typhimurium.

Fig. 11:
Fig. 11 shows an alignment of a conserved region of the sliding clamp from eubacteria and the consensus sequences derived therefrom. The following genes are shown: AAPOL3B, DP3B_ECOLI, S.TYPHIM, DP3B_PROMI, DP3B_PSEPU and DP3B_STRCO (AAPOL3B: Aquifex aeolicus, section 93; DP3B_ECOLI: DNA Pol III, beta chain, Escherichia coli; S.TYPHIM: DNA Pol III, beta chain, Salmonella typhimurium; DP3B_PROMI: DNA Pol III, beta chain, Proteus mirabilis; DP3B_PSEPU: DNA Pol III, beta chain, Pseudomonas putida; DP3B_STRCO: DNA Pol III, beta chain, Streptomyces coelicolor).
Fig. 12:
Fig. 12 shows a multiple alignment of the sliding clamp protein sequences for generating the HMM.
Fig. 13:
Fig. 13 shows a multiple alignment of the eubacterial sliding clamp protein sequences to generate the HMM (AAPOL3B: Aquifex aeolicus, section 93; DP3B_ECOLI: DNA Pol III, beta chain, Escherichia coli; S.TYPHIM: DNA Pol III, beta chain, Salmonella typhimurium; DP3B_PROMI: DNA Pol III, beta chain, Proteus mirabilis; DP3B_PSEPU: DNA Pol III, beta chain, Pseudomonas putida; DP3B_STRCO: DNA Pol III, beta chain, Streptomyces coelicolor).
Fig. 14:
Fig. 14 shows a multiple alignment of the sliding clamp loader 1 protein sequences for generating the HMM.

Fig. 15:
Fig. 15 shows a multiple alignment of the sliding clamp loader 2 protein sequences for generating the HMM.
Fig. 16:
Fig. 16 shows a multiple alignment of the protein sequences of the coupling proteins for generating the HMM.
Fig. 17:
Fig. 17 shows a multiple alignment of the sequences of the elongation proteins 1 for generating the hidden Markov model.
Fig. 18:
Fig. 18 shows a multiple alignment of the sequences of the elongation proteins 2 for generating the hidden Markov model.
Fig. 19:
Fig. 19 shows a multiple alignment of the sequences of the eubacterial elongation proteins for generating the HMM. The following sequences are shown: DP3A_ECOLI: DNA Pol III, alpha subunit, Escherichia coli; BB0579: DNA Pol III, alpha subunit, Borrelia burgdorferi; DP3A_HELPY: DNA Pol III, alpha subunit, Helicobacter pylori; AA50: Aquifex aeolicus, section 50; and DP3A_SALTY: DNA Pol III, alpha subunit, Salmonella typhimurium.

Fig. 20: (example 2)
Fig. 20 shows a chromatographic analysis of recombinant Archaeoglobus fulgidus PCNA (AF 0335):
Recombinant Archaeoglobus fulgidus PCNA (AF 0335) is present as a trimer under native conditions. Fig. 20A shows proteins with a His tag (histidine tag) in fractions 15 (lane 1), 17 (lane 2), 19 (lane 3), 21 (lane 4), 23 (lane 5), 25 (lane 6) of the chromatography carried out without urea. Fig. 20B shows proteins with a His tag in fractions 10 (lane 1), 11 (lane 2), 12 (lane 3), 13 (lane 4), 14 (lane 5), 15 (lane 6), 16 (lane 7), 17 (lane 8) of the denaturing chromatography carried out in the presence of urea.
Fig. 21: (example 3)
Fig. 21 shows the result of a PCR using an elongation protein as a component of an inventive thermostable in vitro complex:
1 µl (lane 4), 2.5 µl (lane 5) and 5 µl (lane 6) Pyrococcus horikoshii DNA polymerase (PH1947; crude extract, see fig. 25) were each used individually in standard PCR reactions for an activity comparison with 1 unit Taq polymerase (lane 2) and 1 µl Archaeoglobus fulgidus DNA polymerase (AF 0497) crude extract (see fig. 25) (lane 3); lane 1 shows a DNA size marker (New England Biolabs; mixture of 1 kb DNA ladder and 100 bp DNA ladder).
Fig. 22: (example 4)
Fig. 22 shows the result of a PCR using an elongation protein as a component of an inventive thermostable in vitro complex.
Samples of the PCR using Archaeoglobus fulgidus DNA polymerase AF 0497 were taken after various numbers of cycles (Z) and separated on a 1 % agarose gel in 1 x TAE buffer (40 mM Tris acetate; 20 mM sodium acetate; 10 mM EDTA; pH 7.2) at v/cm. Lane 2: 16Z; lane 3: 21Z; lane 4: 26Z; lane 5: 28Z; lane 6: 30Z; lane 7: 32Z; lane 8: 34Z; lane 9: 36Z; lane 10: 38Z; lane 11: 40Z; lane 1 shows a DNA size marker (New England Biolabs; mixture of 1 kb DNA ladder and 100 bp DNA ladder). The upper section shows reaction products of Taq polymerase; the lower section shows reaction products of Archaeoglobus fulgidus DNA polymerase AF 0497.
Fig. 23: (example 5)
Fig. 23 shows a comparison of the activity of an elongation protein with and without a sliding clamp protein. Sample 1 represents an enzyme-free mixture. Samples 2- additionally contained 3 µl each of a 1:1000 dilution of a fraction of recombinant Archaeoglobus fulgidus DNA polymerase (initial concentration 7.5 µg/µl). Samples 3-7 and 8-12 additionally contained 0.5; 1; 2; 4 and 8 µl of a fraction of recombinant sliding clamp protein from Archaeoglobus fulgidus (PCNA). The intensities were evaluated using AIDA: intensity of background lane 1 = 46.4; lane 2 = 258.5; lane 3 = 164.4; lane 4 = 122.8; lane 5 = 162.1; lane 6 = 297.4; lane 7 = 359.5.
Fig. 24: (example 6)
Fig. 24 shows the result of purifying a sliding clamp protein: lane 1 shows a molecular weight standard (BIO RAD cat. No. 161-0317). For lane 2, 500 µl bacteria were sedimented directly before induction; they were treated and applied according to the manufacturer's instructions for operating NuPage gels (NOVEX; fig. 25). Lane 3 shows the same amount of bacteria 16 hours after the elution. Lanes 4 and 5 each show 8 µl of the two eluates of the Ni-NTA agarose column after dialysis. Highly purified fractions of the sliding clamp protein of the organism Archaeoglobus fulgidus were obtained by purification over Ni-NTA agarose (Qiagen) (see lanes 4 and 5).
Fig. 25: (example 7)
Fig. 25 shows the expression and purification of Archaeoglobus fulgidus DNA polymerase (see also example 7):
Lane 1 shows a molecular weight standard (BIO RAD cat. No. 161-0317). For lane 2, 500 µl bacteria were sedimented directly before induction; they were treated and applied according to the manufacturer's instructions for operating NuPage gels. Lane 3 shows the same amount of bacteria 16 hours after the elution. Lanes 4 and 5 each show 8 µl of the two eluates of the Ni-NTA agarose column after dialysis. Lane 6 shows 8 ml of a dialysed crude extract. Lanes 4 and 5 show highly purified fractions of the Archaeoglobus fulgidus DNA polymerase which are obtained by purification over Ni-NTA agarose.
Fig. 26: (example 8)
Fig. 26 shows the results of using an inventive in vitro complex in the PCR.
Lane 1 shows a PCR reaction without using a sliding clamp whereas lane 2 shows the result of a PCR reaction using a sliding clamp protein.
Fig. 27: (example 9)
Fig. 27 shows the results of a yeast two-hybrid experiment, referred to herein as the Y2H experiment. Row A contains cells that carry the empty pGAD424 vector (Clontech, Palo Alto, USA) such that the transcription activation domain is expressed; row B contains cells which carry the pGAD424 vector from which the Saccharomyces cerevisiae gene CDC48 is expressed as a fusion protein with the transcription activation domain; row C contains cells which carry the pGAD424 vector from which the sliding clamp gene from Archaeoglobus fulgidus is expressed as a fusion protein with the transcription activation domain; row D contains no cells; and row E contains cells which carry the pGAD424 vector from which the elongation protein gene from Archaeoglobus fulgidus is expressed as a fusion protein with the transcription activation domain.
Column 1 is provided with cells which carry the empty pGBT9 vector (Clontech, Palo Alto, USA) such that the DNA binding domain is expressed; column 2 is provided with cells which carry the pGBT9 vector from which the Saccharomyces cerevisiae gene UFD3 is expressed as a fusion protein with the DNA binding domain; column 3 is provided with cells which carry the pGBT9 vector from which the sliding clamp protein from Archaeoglobus fulgidus is expressed as a fusion protein with the DNA binding domain; column 4 is provided with cells which carry the pGBT9 vector from which the coupling protein from Archaeoglobus fulgidus is expressed as a fusion protein with a DNA binding domain; and column 5 is provided with cells which carry the pGBT9 vector from which the elongation protein from Archaeoglobus fulgidus is expressed as a fusion protein with the DNA binding domain.
Fig. 28: (example 10)
Fig. 28 shows the PCR amplification result of the human collagen gene using the inventive thermostable in vitro complex. The expected amplificate has a size of about 1 kb in both cases. Lane 1 shows a molecular weight marker, lane 2 shows the result of the amplification using an inventive elongation protein without a sliding clamp and lane 3 shows the result of the amplification using the inventive thermostable in vitro complex.
Examples:
The invention is described in more detail by the following examples but is not limited thereto.
Example 1:
DNA is purified from the organism Archaeoglobus fulgidus (DSM No. 4304) by known methods. The organisms were cultured by the DSM ("Deutsche Sammlung für Mikroorganismen"). In order to clone the appropriate genes (sliding clamp loader 1 and 2, sliding clamp, elongation proteins 1 and 2, coupling subunit) into the expression vectors pTrc99 and pQE30, primers were developed for each gene which span the complete open reading frame including the stop codon. Only the primers for cloning into pTrc99 additionally contain the start codon. The corresponding primers for cloning into pQE30 contain the nucleotides which immediately follow the start codon as the first gene-specific sequences. Restriction ends are added to the primers which facilitate the directed cloning into the expression vector. PCR reactions (about 35 cycles) are carried out using about ng total genomic DNA at the appropriate annealing temperatures and the resulting products are purified. After purification the products are treated with restriction enzymes and purified over an agarose gel in order to prepare them for ligation. The expression vector is linearized by means of restriction enzymes, purified and diluted in such a manner that it is ready for ligation with the amplificates of the genes of the inventive in vitro complex from the above PCR. The ligation is set up and after incubation an aliquot is transformed into one of the E. coli strains INValphaF' (Invitrogen), XL1 blue or M15[pREP4]. At least 3 positive colonies are picked for each gene, plasmid DNA is prepared and the inserts are checked for completeness and correctness by means of DNA sequencing. Correct clones are selected and again placed on agar plates for isolation (ampicillin; ampicillin and kanamycin). Colonies are picked and overnight cultures are prepared. An aliquot (500 µl) of the overnight culture is added to a one to five litre culture of LB (ampicillin: 80 mg/l, or additionally 25 µg/ml kanamycin for M15[pREP4] strains). The cultures grow up to an OD of 0.8 at 37°C. IPTG (125 mg/l) is added for induction. These cultures now grow for a further 4-21 hours. The cultures are centrifuged and the recombinant proteins expressed from the vector pQE30 are extracted and purified according to protocols 8 and 11 from the QIAexpressionist (third edition, QIAGEN). In alternative purification protocols, the pellets are taken up in a buffer (buffer A: 50 mM Tris-HCl pH 7.9, 50 mM dextrose, 1 mM EDTA). After centrifugation the cells are taken up again but buffer A now additionally contains lysozyme (4 mg/ml). After incubation (15 min) the same volume of buffer B is added (B: 10 mM Tris-HCl (pH 7.9), 50 mM KCl, 1 mM EDTA, 1 mM PMSF, 0.5 % Tween 20, 0.5 % Nonidet P40) and the cells are lysed by incubating at 75°C for one hour. After centrifugation the supernatant is removed and the overexpressed proteins are precipitated by means of (NH4)2SO4. The pellets are pooled after centrifugation and the proteins are resuspended in buffer A. The resuspended proteins are dialysed against storage buffer (50 mM Tris-HCl (pH 7.9), 50 mM KCl, 0.1 mM EDTA, 1 mM DTT, 0.5 mM PMSF, 50 % glycerol) and subsequently stored at +4°C to -70°C.
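The primer design rule at the start of this example (full open reading frame including the stop codon, start codon kept only for pTrc99, restriction ends added for directed cloning) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the 12-nucleotide gene-specific length, the "GC" 5' clamp, the choice of BamHI (GGATCC) and HindIII (AAGCTT) recognition sites and the toy ORF are all hypothetical and are not taken from the patent; reading-frame adjustment relative to the vector is not handled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(orf: str, fwd_site: str, rev_site: str,
                   include_start: bool, gene_len: int = 12,
                   clamp: str = "GC") -> tuple[str, str]:
    """Build a forward and a reverse cloning primer for a complete ORF.

    include_start=True  -> forward primer begins at the start codon
                           (the pTrc99 case described in the text).
    include_start=False -> forward primer begins at the nucleotides that
                           immediately follow the start codon (pQE30 case).
    The reverse primer always spans the stop codon.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0 or orf[-3:] not in STOP_CODONS:
        raise ValueError("expected a complete ORF ending in a stop codon")
    offset = 0 if include_start else 3
    forward = clamp + fwd_site + orf[offset:offset + gene_len]
    reverse = clamp + rev_site + reverse_complement(orf[-gene_len:])
    return forward, reverse


# Hypothetical 12-codon toy ORF, not a gene from Archaeoglobus fulgidus.
toy_orf = "ATGGCTAAAGGTCTGGAAGTTCGTATCCTGAAATAA"
fwd_trc, rev_trc = design_primers(toy_orf, "GGATCC", "AAGCTT", include_start=True)
fwd_qe, rev_qe = design_primers(toy_orf, "GGATCC", "AAGCTT", include_start=False)
print(fwd_trc, rev_trc, fwd_qe, sep="\n")
```

In practice the gene-specific part would be chosen longer and matched to the annealing temperature, and the restriction sites would be checked for absence within the gene.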
Reactions are set up as follows in order to test the activity of the proteins.
Aliquots of the proteins are combined in different configurations and molarities;
sliding clamp loader 1, 2 with the sliding clamp, coupling subunit and elongation protein 1, or sliding clamp loader 1, 2 with the sliding clamp with and without the coupling subunit and elongation proteins 2 or sliding clamp and elongation protein 1 or 2 and finally only elongation protein 1 or 2; the above-mentioned storage buffer served as the buffer. DNA polymerization activity is measured by the incorporation of (methyl 3H) TTP into trichloroacetic acid-insoluble material or by incorporation of digoxigenin dUTP into unlabelled DNA double-strand regions with free internal 3' ends (Ishino, Y., Iwasaki, H., Fukui, H., Mineno, J., Kato, I., & Shinagawa, H. (1992) Biochimie 74, 131-136). In order to determine the processivity the above-mentioned protein mixtures are used in primer elongation experiments. An M13 single-stranded template is added to 10 mM Tris-HCl (pH 9.4) and heated (92°C) and cooled (room temperature) together with a universal primer (New England Biolabs) (5'-FITC labelled). Serial dilutions of the thus generated template primer mixtures are added to a reaction composed of nucleotides (about 200 µM to 1 mM), reaction buffer (final concentration: 50 mM KCl, 10 mM Tris-HCl (pH 8.3), 1.5-5 mM MgCl2), ATP (0 mM - 200 mM) and protein-stabilizing agents and incubated for 10 minutes at 37°C, 52°C, 62°C, 68°C, 74°C and 78°C. An aliquot is loaded onto an automated sequencer for analysis (e.g. ALF, Pharmacia Biotech). Alternatively the increase of processivity can be analysed qualitatively according to Maga G., Jónsson Z.O., Stucki M., Spadari S. and Hübscher U. (J. Mol. Biol. 1999; 285: 259-267, Dual Mode of Interaction of DNA Polymerase and Proliferating Cell Nuclear Antigen in Primer Binding and DNA Synthesis) by detecting a stimulation of the incorporation of nucleotides into double-stranded regions with free internal 3' ends under suitable reaction conditions (see fig. 23). The above-mentioned protein mixtures also serve to measure fidelity and exonuclease activity using the methods described in Kohler et al. (Proc. Natl. Acad. Sci. USA 88:7958-7962 (1991)) or Chase et al. (J. Biol. Chem., 249:4545-4552 (1972)). The protein mixtures are also used in the PCR (fig. 26, Methods in Molecular Biology, vol. 15, Humana Press, Totowa, New Jersey, 1993, edited by Bruce A. White).
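The stimulation of nucleotide incorporation referred to above (see fig. 23) can be expressed numerically from lane intensities. The following is a minimal sketch using the seven intensity values listed in the description of fig. 23. It assumes that lane 1 (enzyme-free) serves as background and lane 2 (polymerase without added sliding clamp) as the reference; this normalisation and the function name are illustrative choices, not a procedure stated in the text.

```python
def relative_stimulation(intensities: dict[int, float],
                         background_lane: int = 1,
                         reference_lane: int = 2) -> dict[int, float]:
    """Background-subtracted signal of each lane relative to the reference lane.

    A value above 1 means more incorporation than with the reference
    (assumed here to be the polymerase alone); below 1 means less.
    """
    background = intensities[background_lane]
    reference = intensities[reference_lane] - background
    if reference <= 0:
        raise ValueError("reference lane does not exceed background")
    return {
        lane: (value - background) / reference
        for lane, value in intensities.items()
        if lane != background_lane
    }


# Intensities as listed for fig. 23 (lanes 1 to 7).
fig23 = {1: 46.4, 2: 258.5, 3: 164.4, 4: 122.8, 5: 162.1, 6: 297.4, 7: 359.5}
for lane, ratio in sorted(relative_stimulation(fig23).items()):
    print(f"lane {lane}: {ratio:.2f}")
```

Under these assumptions only the lanes with the two largest amounts of sliding clamp protein (lanes 6 and 7) give a ratio above 1.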
Example 2: Trimerization of PCNA
In the following experiments it is shown that recombinant A~chaeoglobus_fulgidus PCNA protein (AF 0335) is present as a trimer under native conditions and can thus adopt a suitable structure for clamp formation.
For the experiment shown in fig. 20A, 350 µl Ni-NTA agarose eluate of PCNA (see fig. 24) and 150 µl of a crude DNA polymerase fraction (see fig. 25) were made up to 1 ml with storage buffer without glycerol (see fig. 25), and the proteins were separated according to their molecular weight in the same buffer on a Superdex HR (Pharmacia) FPLC column according to the manufacturer's instructions. 1 ml fractions were collected during the entire run. For the experiment shown in fig. 20B, the same amounts of the same protein fractions were made up to 1 ml in storage buffer without glycerol containing 8 M urea and denatured for 10 minutes at 95°C; the proteins were subsequently separated according to molecular weight in the same buffer as described for fig. 20A. 8 µl of each fraction was subsequently separated on a NuPage Bis-Tris gel (NOVEX; see fig. 25) and blotted by means of a blot module (NOVEX) onto a nitrocellulose membrane according to the manufacturer's instructions. Proteins with a His tag were detected according to the manufacturer's instructions using the RGS His antibody (QIAGEN) and the DIG luminescent detection kit for nucleic acids (Boehringer Mannheim). Fig. 20A shows proteins with a His tag in the fractions 15 (lane 1), 17 (lane 2), 19 (lane 3), 21 (lane 4), 23 (lane 5) and 25 (lane 6) of the chromatography which was carried out without urea. Fig. 20B shows proteins with a His tag in the fractions 10 (lane 1), 11 (lane 2), 12 (lane 3), 13 (lane 4), 14 (lane 5), 15 (lane 6), 16 (lane 7) and 17 (lane 8) of the chromatography which was carried out in the presence of urea. Archaeoglobus fulgidus DNA polymerase (AF 0497) has a calculated molecular weight of Mr = 90 kDa. Archaeoglobus fulgidus PCNA (AF 0335) has a calculated molecular weight of Mr = 27 kDa. If Archaeoglobus fulgidus PCNA (AF 0335) is present as a homotrimer like the homologous protein from eukaryotes, the native factor therefore has a theoretical molecular weight of Mr = 81 kDa. The results shown in fig. 20A confirm this assumption, according to which, for the recombinant protein, native PCNA has a molecular weight similar to that of the DNA polymerase: both proteins elute under native conditions in the same fractions from the gel filtration column (fig. 20A, lanes 1-3).
Most of the PCNA elutes somewhat later than the DNA polymerase peak, which correlates with the somewhat smaller size of the postulated trimer (81 kDa compared to 90 kDa). The data shown in fig. 20B prove that this observation is based on an oligomerization of PCNA, since under denaturing conditions which do not allow protein/protein interactions PCNA elutes considerably later from the column than the DNA polymerase (fig. 20B, lanes 1-7), which corresponds to the lower molecular weight of the monomer.
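The size argument of Example 2 can be retraced with a few lines of arithmetic. The following Python sketch is only an illustration and not part of the disclosed method; it uses the masses quoted in the text (27 kDa for the PCNA monomer, 90 kDa for the DNA polymerase), and the function name `oligomer_mass` is a hypothetical helper introduced here.

```python
# Masses in kDa as quoted in Example 2 (calculated values, not measurements).
PCNA_MONOMER_KDA = 27.0
POLYMERASE_KDA = 90.0


def oligomer_mass(monomer_kda: float, subunits: int) -> float:
    """Theoretical mass of a homo-oligomer built from identical subunits."""
    return monomer_kda * subunits


trimer_kda = oligomer_mass(PCNA_MONOMER_KDA, 3)

# Gel filtration separates by size: larger species elute earlier.
# A trimer close to the polymerase mass should co-elute with it (native run),
# whereas the monomer should elute clearly later (urea run).
print(f"PCNA trimer:  {trimer_kda:.0f} kDa")
print(f"Polymerase:   {POLYMERASE_KDA:.0f} kDa")
print(f"PCNA monomer: {PCNA_MONOMER_KDA:.0f} kDa")
print(f"Trimer/polymerase mass ratio:  {trimer_kda / POLYMERASE_KDA:.2f}")
print(f"Monomer/polymerase mass ratio: {PCNA_MONOMER_KDA / POLYMERASE_KDA:.2f}")
```

The ratios make the qualitative reading of fig. 20 explicit: the trimer is only slightly smaller than the polymerase, while the monomer is less than a third of its size.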

Example 3: Isolation, preparation and use of an elongation protein (Pyrococcus horikoshii DNA polymerase (PH 1947)) to form the inventive in vitro complex
The elongation protein from Pyrococcus horikoshii (Pyrococcus horikoshii DNA polymerase (PH 1947)) was used in PCR reactions and led to an efficient amplification of a specific DNA product.
1 (lane 4), 2.5 (lane 5) and 5 µl (lane 6) Pyrococcus horikoshii DNA polymerase (PH 1947; crude extract see fig. 25) were used individually in standard PCR reactions to compare the activity with 1 unit Taq polymerase (lane 2) and 1 µl Archaeoglobus fulgidus DNA polymerase (AF 0497) crude extract (see fig. 25) (lane 3); lane 1 shows a DNA size marker (New England Biolabs; mixture of 1 kb DNA molecular weight size marker and 100 bp DNA molecular weight size marker). In addition to the enzyme, each reaction contained 5 ng template DNA (421 bp Rsa I fragment with adapters cloned into pCR 2.1; Kaiser C., v. Stein O., Laux G., Hoffmann M., Electrophoresis 1999; 20: 261-268, Functional Genomics in Cancer Research: Identification of target genes of the Epstein-Barr virus nuclear antigen 2 by subtractive cDNA cloning and high throughput differential screening using high-density agarose gel); 0.2 mM each of dATP, dTTP, dCTP and dGTP; 1.5 µM each of the specific adapter primers (Kaiser et al., (1999)); 50 mM KCl; 2 mM MgCl2; 10 mM Tris-HCl (pH 8.3 for Taq reactions; pH 7.5 for Archaeoglobus fulgidus and Pyrococcus horikoshii polymerase reactions) in a total volume of 50 µl per reaction.
The samples were subjected to 40 cycles comprising 30 s at 95°C; 30 s at 55°C and 120 s at 68°C. Subsequently 5 µl of the reactions was removed and separated at 10 V/cm on a 1 % agarose gel in 1 x TAE buffer (40 mM Tris acetate; 20 mM sodium acetate; 10 mM EDTA; pH 7.2).
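As a rough planning aid, the cycling programme of Example 3 can be written down as data and its hold time summed. This is a minimal Python sketch, not the authors' procedure; the names `Step` and `program_seconds` are hypothetical, and the result ignores temperature ramping, which depends on the instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One temperature hold within a PCR cycle."""
    temp_c: int
    seconds: int


# Cycle of Example 3: denaturation, annealing, extension.
EXAMPLE3_STEPS = (Step(95, 30), Step(55, 30), Step(68, 120))
EXAMPLE3_CYCLES = 40


def program_seconds(steps, cycles: int) -> int:
    """Total hold time in seconds (ramp times between steps are not included)."""
    return cycles * sum(step.seconds for step in steps)


total_s = program_seconds(EXAMPLE3_STEPS, EXAMPLE3_CYCLES)
print(f"Hold time: {total_s} s = {total_s / 60:.0f} min")
```

The same helper can be applied to the other examples by swapping in their step lists, for instance the 59°C annealing step and 60 s extension of Example 4.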
Example 4: Use of an elongation protein
The Archaeoglobus fulgidus DNA polymerase AF 0497 can generate PCR products as efficiently as Taq polymerase. 1 unit Taq polymerase and 1 µl Archaeoglobus fulgidus DNA polymerase AF 0497 crude extract (see fig. 25) were used individually in standard PCR reactions to compare the activity. In addition to the enzyme, each reaction contained 20 ng M13mp18 ssDNA; 0.2 mM each of dATP, dTTP, dCTP and dGTP; 1.5 µM of each of the primers with the following nucleotide sequences:
GGATTGACCGTAATGGGATAGGTTACGTT (SEQ ID NO: 47) or AGCGGATAACAATTTCACACAGGAAACAG (SEQ ID NO: 48) in 50 mM KCl; 2 mM MgCl2; 10 mM Tris-HCl (pH 8.3 for Taq reactions, pH 7.5 for Archaeoglobus fulgidus polymerase reactions) in a total volume of 50 µl per reaction. The samples ran through various numbers of cycles comprising 30 s at 95°C; 30 s at 59°C and 60 s at 68°C. After different numbers of cycles (Z), 5 µl of the mixtures was removed and separated at 10 V/cm on a 1 % agarose gel in 2 x TAE buffer (40 mM Tris acetate; 20 mM sodium acetate; 10 mM EDTA; pH 7.2) (see fig. 22).
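Basic properties of the two primers of Example 4 (length, GC content, reverse complement) can be checked with a short script. The sketch below is illustrative only; the sequences are copied from SEQ ID NO: 47 and 48 as printed above, and the helper names `gc_fraction` and `reverse_complement` are assumptions introduced for this illustration.

```python
# Primer sequences as printed in Example 4.
PRIMERS = {
    "SEQ ID NO: 47": "GGATTGACCGTAATGGGATAGGTTACGTT",
    "SEQ ID NO: 48": "AGCGGATAACAATTTCACACAGGAAACAG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in an upper-case DNA sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


for name, seq in PRIMERS.items():
    print(f"{name}: length {len(seq)} nt, GC {gc_fraction(seq):.1%}")
    print(f"  reverse complement: {reverse_complement(seq)}")
```

Both primers have the same length and a similar GC content, which is consistent with the use of a single annealing temperature for the pair.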
Example 5: Preparation and use of an inventive thermostable in vitro complex
The following components of an inventive in vitro complex were present, among others, in a final volume of 50 µl: 10 mM Tris-HCl, pH 7.5; 50 mM KCl; 2 mM MgCl2; 10 µg BSA (can also be omitted); 0.5 mM digoxigenin dUTP (DIG-dUTP, Boehringer Mannheim); 40 µM; 0.5 µg poly dA/40 nM oligo dT (20mer) hybrid.
Sample 1 contained no elongation protein. Samples 2-12 each additionally contained 3 µl of a 1:1000 dilution of a fraction of recombinant Archaeoglobus fulgidus DNA polymerase (elongation protein). Samples 3-7 as well as 8-12 additionally contained 0.5; 1; 2; 4 and 8 µl of a fraction of recombinant sliding clamp protein from Archaeoglobus fulgidus.
The samples were incubated for 30 minutes at 68°C and nucleic acids were subsequently precipitated with 3 parts by volume ethanol / 3 M sodium acetate pH 5.2 (30/1). The precipitate was resuspended in 20 µl 100 mM Tris-HCl (pH 7.9), 1 µl aliquots were added dropwise to individual wells of a 96-well silent screen plate containing nylon 66 Biodyne B 0.45 µm pore (Nalge Nunc), and the nucleic acids were fixed to the membranes for 10 minutes at 70°C. The incorporated digoxigenin dUTP (Boehringer Mannheim) was detected by means of the DIG luminescent detection kit for nucleic acids (Boehringer Mannheim). In order to detect the chemiluminescence, an X-ray film was subsequently placed on the membrane for 30 s. PCNA considerably stimulated the incorporation of DIG-dUTP by the DNA polymerase (compare lanes 3-7 with lane 2). The PCNA fraction used has no endogenous polymerase activity (lanes 8-12).
Example 6: Amplification, cloning, expression and purification of a sliding clamp protein from Archaeoglobus fulgidus
After amplification of the inventive sliding clamp protein gene from Archaeoglobus fulgidus, the gene was cloned into the expression vectors pTrc99 and pQE30.
The expression, purification and gel-electrophoretic separation of the Archaeoglobus fulgidus sliding clamp protein (PCNA (AF 0335)) was carried out as shown for the elongation protein (the DNA polymerase AF 0497) in fig. 25.
Example 7: Preparation of an elongation protein
Expression and purification of the Archaeoglobus fulgidus DNA polymerase:
pQE30 plasmid DNA (QIAGEN) with a gene inserted in the correct reading frame for the elongation protein, the Archaeoglobus fulgidus DNA polymerase AF 0497, was transformed into competent E. coli M15 [pREP4] according to the instructions in the QIAexpressionist (third edition; QIAGEN). Transfer to 1 l culture medium and induction of protein expression were also carried out according to the schemes given in these instructions. After 16 hours induction time, the bacteria were sedimented for 10 minutes at 5000 g. The QIAexpressionist procedure (third edition; QIAGEN; protocol 8, protocol 11) was used to obtain highly purified fractions, except that the bound proteins were eluted with 2 x 2 ml elution buffer and not, as described in these instructions, with 4 x 0.5 ml elution buffer. In order to obtain crude extracts of recombinant proteins, the bacterial sediments were alternatively washed with 100 ml buffer A (50 mM Tris-HCl, pH 7.9; 50 mM glucose; 1 mM EDTA) and, after centrifuging again, resuspended in 50 ml buffer A containing 4 mg/ml lysozyme. After 15 minutes at room temperature, 50 ml lysis buffer (10 mM Tris-HCl pH 7.9; 50 mM KCl; 1 mM EDTA; 0.5 % Tween 20; 0.5 % IGEPAL) was added and E. coli proteins were denatured by incubating for 60 min at 75°C in a water bath. After subsequent centrifugation for 15 minutes at 27,000 g, the supernatant was precipitated by slowly adding 40 mg crystalline ammonium sulphate per ml extract with constant stirring. Precipitated proteins were sedimented for 10 minutes at 27,000 g and the sediment was resuspended in 20 ml buffer A.
These crude extracts as well as the elution fractions from the Ni-NTA agarose were each dialysed twice for 8 hours against at least 50 volumes of storage buffer (10 mM Tris-HCl pH 7.9; 50 mM KCl; 1 mM EDTA; 50 % glycerol; 1 mM dithiothreitol; 0.5 mM phenylmethylsulfonyl fluoride) and stored at -20°C. In order to analyse the protein composition of the fractions obtained, aliquots thereof were separated electrophoretically on NuPage 10 % Bis-Tris gels (NOVEX) according to the manufacturer's instructions and the proteins obtained were stained with Coomassie brilliant blue. Lane 1 shows a molecular weight standard (BIO RAD cat. No. 161-0317). For lane 2, 500 µl bacteria were sedimented directly before induction and treated and applied according to the instructions for running NuPage gels. Lane 3 shows the same amount of bacteria 16 hours after induction.
Lanes 4 and 5 each show 8 µl of the two eluates of the Ni-NTA agarose column after dialysis. Lane 6 shows 8 µl of a dialysed crude extract. Highly purified fractions of the Archaeoglobus fulgidus DNA polymerase are obtained by purification over Ni-NTA agarose (lanes 4 and 5). However, this enzyme is already the dominant polypeptide in the crude extract.
Example 8: Use of an inventive in vitro complex in the PCR
The use of Archaeoglobus fulgidus PCNA (AF 0335) in PCR reactions containing Archaeoglobus fulgidus DNA polymerase (AF 0497) led to a more efficient amplification of a specific DNA product compared to a PCR reaction without PCNA. Evaluation of the amplified amounts of DNA with AIDA (Advanced Image Data Analyzer, software version AIDA 2.1, Raytest Company) shows a background of 94, a value of 104.9 for lane 1 and a value of 228.4 for lane 2.
If the background is subtracted from the values for lanes 1 and 2, it follows that lane 2 shows the result of a 12.3-fold processive stimulation by PCNA in the reaction.
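The 12.3-fold figure follows directly from the three AIDA readings given above. The following Python sketch reproduces the arithmetic; the function name `fold_stimulation` is a hypothetical helper, and the calculation assumes that the same background value applies to both lanes, as the text implies.

```python
def fold_stimulation(with_pcna: float, without_pcna: float, background: float) -> float:
    """Ratio of background-corrected signals (lane with PCNA / lane without PCNA)."""
    corrected_with = with_pcna - background
    corrected_without = without_pcna - background
    if corrected_without <= 0:
        raise ValueError("reference lane does not exceed the background")
    return corrected_with / corrected_without


# AIDA values quoted in Example 8.
BACKGROUND = 94.0
LANE_1 = 104.9  # reaction without PCNA
LANE_2 = 228.4  # reaction with PCNA

print(f"{fold_stimulation(LANE_2, LANE_1, BACKGROUND):.1f}-fold")
```

Because the lane 1 signal is only about 11 units above background, the ratio is sensitive to the background estimate; a small change in the background value shifts the fold figure noticeably.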
0.3 µl Archaeoglobus fulgidus DNA polymerase (7.5 µg/µl protein concentration of AF 0497; crude extract see fig. 25) was used individually in PCR reactions to analyse the stimulation of the PCR activity by 0 µl and 0.01 µl Archaeoglobus fulgidus PCNA (Ni-NTA eluate; fig. 24). In addition to the enzyme and 0.05 µl (2.8 µg) of the sliding clamp protein from Archaeoglobus fulgidus (homologous to proliferating cell nuclear antigen PCNA), each reaction contained a non-purified PCNA gene fragment amplified by means of PCR (PCR reaction for the specific amplification of the PCNA fragment: 0.5 µl Archaeoglobus fulgidus polymerase; 0.2 mM each of dATP, dTTP, dCTP and dGTP; 1.5 µM of each of the specific primers (5'-ACG CGC GGA TCC ATA GAC GTC ATA ATG ACC GG-3' (SEQ ID NO: 49); 5'-TAC GGG GTA CCC GAG CCA AAA TTG GGT AAA G-3' (SEQ ID NO: 50)); 50 mM KCl; 2 mM MgCl2; 10 mM Tris-HCl (pH 7.5) as well as 50 ng of a pQE30 plasmid which carries the coding sequences of Archaeoglobus fulgidus PCNA inserted into the BamHI and Kpn I restriction sites in a total volume of 50 µl per reaction with a cycle number of 40 comprising 30 s at 95°C; 30 s at 61°C; 240 s at 68°C); 0.2 mM each of dATP, dTTP, dCTP and dGTP; 1.5 µM of each of the specific primers (5'-ACG CGC GGA TCC ATA GAC GTC ATA ATG ACC GG-3' (SEQ ID NO: 51) and 5'-TAC GGG GTA CCC GAG CCA AAA TTG GGT AAA G-3' (SEQ ID NO: 52)); 50 mM KCl; 2 mM MgCl2; 10 mM Tris-HCl pH 7.5 in a total volume of 50 µl per reaction. The samples ran through 40 cycles comprising 30 s at 95°C; 120 s at 61°C; 240 s at 68°C. Subsequently 5 µl of the mixtures was removed and separated at 10 V/cm on a 1 % agarose gel in 1 x TAE buffer (40 mM Tris acetate; 20 mM sodium acetate; 10 mM EDTA; pH 7.2).
Example 9: Y2H experiments
Interactions of proteins of the inventive complex from Archaeoglobus fulgidus were demonstrated with a yeast two hybrid (Y2H) system. The coding regions of genes from Archaeoglobus fulgidus whose gene products were used in the inventive thermostable in vitro complex were amplified by means of PCR, cloned into the vectors pGBT9 (vertical columns of fig. 27) and pGAD424 (horizontal rows of fig. 27) and expressed as hybrid proteins by gap repair in yeast PJ69-4a (for pGAD424) and PJ69-4alpha (for pGBT9). A positive control was also amplified by means of PCR, cloned into the vectors pGBT9 (see also vertical columns of fig. 27) and pGAD424 (see also horizontal rows of fig. 27) and expressed as hybrid proteins by gap repair in the yeast strain PJ69-4a (for pGAD424) and PJ69-4alpha (for pGBT9). Diploid cells which contained both vectors were generated by mating according to the grid shown in fig. 27. Expression of three independent reporters (HIS3, ADE2 and MEL1) was measured. Fig. 27 shows the growth on medium without histidine and adenine. The cells which grow in this experiment are those which carry both vectors and in which, additionally, the expression products of both these vectors bind to one another. Transcription of the reporter genes is initiated as a result of the binding of the expression products, such that histidine and adenine auxotrophy is abolished and these cells are able to grow in the said medium.
All the clones that were positive here were also positive with respect to the expression of the MEL1 gene. The Y2H system was used according to the instructions of the Clontech Company (yeast protocols handbook, PT3024-1). Fig. 27 shows binding between the proteins of the positive control, between the elongation protein and the sliding clamp protein, between the sliding clamp protein and the sliding clamp protein, and between the coupling protein and the sliding clamp protein.
Example 10: Use of an inventive thermostable in vitro complex
The inventive thermostable in vitro complex is well suited to the amplification of genomic DNA fragments. An efficient amplification occurs even when using small amounts of a template or of an elongation protein. In example 10 a section of the human collagen gene was amplified using the inventive thermostable in vitro complex and also using an elongation protein alone. The following were used in a total volume of 50 µl: 0.5 µl nucleotide mix (25 mM initial solution comprising each nucleotide A, C, G and T), 0.2 µl of each primer (100 pmol/µl initial solution of "collagen forward" 5'-TAA AGG GTC ACC GTG GCT TC-3' (SEQ ID NO: 53), 100 pmol/µl initial solution of "collagen reverse" 5'-CGA ACC ACA TTG GCA TCA TC-3' (SEQ ID NO: 54)), 0.8 µl DNA (200 ng/µl human genomic DNA from Boehringer Mannheim), 5 µl 10 x PCR buffer (pH 7.5) (100 mM Tris-HCl, pH 7.5, 500 mM KCl, 15 mM MgCl2), 1 µl elongation protein (AF1722, 7.5 µg/µl protein concentration) and 8 µl sliding clamp protein (AF 0335, protein concentration 0.3 µg/ml). The mixture was subjected to the following cycle in a PCR machine: initially 5 min at 95°C, then 30 times 30 s at 95°C, 30 s at 59°C and finally 6 min at 68°C. Fig. 28 clearly shows the advantage of using the inventive thermostable in vitro complex.
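The final concentrations in the 50 µl reaction of Example 10 follow from the stock concentrations and pipetted volumes by simple dilution. The Python sketch below is an illustration under stated assumptions, not part of the disclosure: volumes are read as microlitres, the nucleotide stock is taken to contain 25 mM of each nucleotide, 100 pmol/µl is treated as 100 µM, and the helper name `final_conc` is hypothetical.

```python
TOTAL_UL = 50.0  # total reaction volume in microlitres


def final_conc(stock_conc: float, stock_volume_ul: float,
               total_volume_ul: float = TOTAL_UL) -> float:
    """Dilution rule C1*V1 = C2*V2 solved for C2 (same unit as stock_conc)."""
    return stock_conc * stock_volume_ul / total_volume_ul


# Nucleotide mix: 0.5 µl of a stock assumed to hold 25 mM of each nucleotide.
dntp_mm = final_conc(25.0, 0.5)

# Primers: 0.2 µl of a 100 pmol/µl stock (100 pmol/µl is the same as 100 µM).
primer_um = final_conc(100.0, 0.2)

# Template: 0.8 µl of 200 ng/µl genomic DNA gives the absolute amount in ng.
template_ng = 200.0 * 0.8

# 10 x buffer: 5 µl in 50 µl gives the 1 x working concentrations in mM.
BUFFER_10X_MM = {"Tris-HCl": 100.0, "KCl": 500.0, "MgCl2": 15.0}
buffer_1x_mm = {name: final_conc(conc, 5.0) for name, conc in BUFFER_10X_MM.items()}

print(f"each dNTP:   {dntp_mm:.2f} mM")
print(f"each primer: {primer_um:.2f} µM")
print(f"template:    {template_ng:.0f} ng")
for name, conc in buffer_1x_mm.items():
    print(f"{name}: {conc:.1f} mM")
```

Under these assumptions the working buffer matches the 10 mM Tris-HCl and 50 mM KCl used in the earlier examples, with a somewhat lower MgCl2 concentration than the 2 mM stated there.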
The features of the invention disclosed in the aforementioned description and in the claims can be important individually as well as in any combination for the realization of the invention in its various embodiments.

- SEQUENCE PROTOCOL
(1) GENERAL INFORMATION:
(i) APPLICANT:
(A) NAME: LION bioscience AG
(B) STREET: Im Neuenheimer Feld 517
(C) CITY: Heidelberg
(E) COUNTRY: DE
(F) POSTCODE: 69120
(ii) TITLE OF THE INVENTION: Accessory complexes with polymerase activity
(iii) NUMBER OF SEQUENCES: 54
(iv) COMPUTER-READABLE FORM:
(A) DATA CARRIER: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)
(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 340 amino acids
(B) TYPE: amino acid
(C) STRAND FORM: single strand
(D) TOPOLOGY: both
(ii) TYPE OF MOLECULE: protein
(vi) INITIAL ORIGIN:
(A) ORGANISM: Homo sapiens
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Met Glu Thr Ser Ala Leu Lys Gln Gln Glu Gln Pro Ala Ala Thr Lys Ile Arg Asn Leu Pro Trp Val Glu Lys Tyr Arg Pro Gln Thr Leu Asn Asp Leu Ile Ser His Gln Asp Ile Leu Ser Thr Ile Gln Lys Phe Ile Asn Glu Asp Arg Leu Pro His Leu Leu Leu Tyr Gly Pro Pro Gly Thr Gly Lys Thr Ser Thr Ile Leu Ala Cys Ala Lys Gln Leu Tyr Lys Asp Lys Glu Phe Gly Ser Met Val Leu Glu Leu Asn Ala Ser Asp Asp Arg Gly Ile Asp Ile Ile Arg Gly Pro Ile Leu Ser Phe Ala Ser Thr Arg Thr Ile Phe Lys Lys Gly Phe Lys Leu Val Ile Leu Asp Glu Ala Asp Ala Met Thr Gln Asp Ala Gln Asn Ala Leu Arg Arg Val Ile Glu Lys Phe Thr Glu Asn Thr Arg Phe Cys Leu Ile Cys Asn Tyr Leu Ser Lys Ile Ile Pro Ala Leu Gln Ser Arg Cys Thr Arg Phe Arg Phe Gly Pro Leu Thr Pro Glu Leu Met Val Pro Arg Leu Glu His Val Val Glu Glu Glu Lys Val Asp Ile Ser Glu Asp Gly Met Lys Ala Leu Val Thr Leu Ser Ser Gly Asp Met Arg Arg Ala Leu Asn Ile Leu Gln Ser Thr Asn Met Ala Phe Gly Lys Val Thr Glu Glu Thr Val Tyr Thr Cys Thr Gly His Pro Leu Lys Ser Asp Ile Ala Asn Ile Leu Asp Trp Met Leu Asn Gln Asp Phe Thr Thr Ala Tyr Arg Asn Ile Thr Glu Leu Lys Thr Leu Lys Gly Leu Ala Leu His Asp Ile Leu Thr Glu Ile His Leu Phe Val His Arg Val Asp Phe Pro Ser Ser Val Arg Ile His Leu Leu Thr Lys Met Ala Asp Ile Glu Tyr Arg Leu Ser Val Gly Thr Asn Glu Lys Ile Gln Leu Ser Ser Leu Ile Ala Ala Phe Gln Val Thr Arg Asp Leu Ile Val Ala Glu Ala
(2) INFORMATION FOR SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 319 amino acids
(B) TYPE: amino acid
(C) STRAND FORM: single strand
(D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(vi) INITIAL ORIGIN:
(A) ORGANISM: Archaeoglobus fulgidus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

a Met Glu Asn Phe Glu Ile Trp Val Glu Lys Tyr Arg Pro Arg Thr Leu Asp Glu Val Val Gly Gln Asp Glu Val Ile Gln Arg Leu Lys Gly Tyr Val Glu Arg Lys Asn Ile Pro His Leu Leu Phe Ser Gly Pro Pro Gly Thr Gly Lys Thr Ala Thr Ala Ile Ala Leu Ala Arg Asp Leu Phe Gly Glu Asn Trp Arg Asp Asn Phe Ile Glu Met Asn Ala Ser Asp Glu Arg Gly Ile Asp Val Val Arg His Lys Ile Lys Glu Phe Ala Arg Thr Ala Pro Ile Gly Gly Ala Pro Phe Lys Ile Ile Phe Leu Asp Glu Ala Asp Ala Leu Thr Ala Asp Ala Gln Ala Ala Leu Arg Arg Thr Met Glu Met Tyr Ser Lys Ser Cys Arg Phe Ile Leu Ser Cys Asn Tyr Val Ser Arg Ile Ile Glu Pro Ile Gln Ser Arg Cys Ala Val Phe Arg Phe Lys Pro Val Pro Lys Glu Ala Met Lys Lys Arg Leu Leu Glu Ile Cys Glu Lys Glu Gly Val Lys Ile Thr Glu Asp Gly Leu Glu Ala Leu Ile Tyr Ile Ser Gly Gly Asp Phe Arg Lys Ala Ile Asn Ala Leu Gln Gly Ala Ala Ala Ile Gly Glu Val Val Asp Ala Asp Thr Ile Tyr Gln Ile Thr Ala Thr Ala Arg Pro Glu Glu Met Thr Glu Leu Ile Gln Thr Ala Leu Lys Gly Asn Phe Met Glu Ala Arg Glu Leu Leu Asp Arg Leu Met Val Glu Tyr Gly Met Ser Gly Glu Asp Ile Val Ala Gln Leu Phe Arg Glu Ile Ile Ser Met Pro Ile Lys Asp Ser Leu Lys Val Gln Leu Ile Asp Lys Leu Gly Glu Val Asp Phe Arg Leu Thr Glu Gly Ala Asn Glu Arg Ile Gln Leu Asp Ala Tyr Leu Ala Tyr Leu Ser Thr Leu Ala Lys Lys (2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1847 amino acids
(B) TYPE: amino acid
(C) STRAND FORM: single strand
(D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(vi) INITIAL ORIGIN:
(A) ORGANISM: Methanococcus jannaschii
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
Met Val Ile Ile Met Glu Lys Pro Trp Val Glu Lys Tyr Arg Pro Lys Thr Leu Asp Asp Ile Val Gly Gln Asp Glu Ile Val. Lys Arg Leu Lys Lys Tyr Val Glu Lys Lys Ser Met Pro His Leu Leu Phe Ser Gly Pro Pro Gly Val Gly Lys Cys Leu Thr Gly Asp Thr Lys Val Ile Val Asn Gly Glu Ile Arg Glu Ile Gly Glu Val Ile Glu Glu. Ile Ser Asn Gly Lys Phe Gly Val Thr Leu Thr Asn Asn Leu Lys Val Leu Gly Ile Asp GluAspGly LysIleArg GluPheAsp ValGlnTyr ValTyrLys Asp LysThrAsn ThrLeuIle LysIleLys ThrLysMet GlyArgGlu Leu LysValThr ThrTyrHis ProLeuLeu IleAsnHis LysAsnGly Glu IleLysTrp GluLysAla GluAsnLeu LysValGly AspLysLeu Ala Thr Pro Arg Tyr Ile Leu Phe Asn Glu Ser Asp Tyr Asn Glu Glu Leu Ala Glu Trp Leu Gly Tyr Phe Ile Gly Asp Gly His Ala Asp Lys Glu Ser Asn Lys Ile Thr Phe Thr Asn Gly Asp Glu Lys Leu Arg Lys Arg Phe Ala Glu Leu Thr Glu Lys Leu Phe Lys Asp Ala Lys Ile Lys Glu Arg Ile His Lys Asp Arg Thr Pro Asp Ile Tyr Val Asn Ser Lys Glu Ala ValGlu PheIleAsp LysLeuGly LeuArgGly LysLysA1a Asp Lys ValArg IleProLys GluIleMet ArgSerAsp AlaLeuArg Ala Phe LeuArg AlaTyrPhe AspCysAsp GlyGlyIle GluLysHis Ser Ile ValLeu SerThrAla SerLysGlu MetAlaGlu AspLeuVal Tyr Ala LeuLeu ArgPheGly IleIleAla LysLeuArg GluLysVal Asn Lys AsnAsn AsnLysVal TyrTyrHis IleValIle SerAsnSer Ser Asn LeuArg ThrPheLeu AspAsnIle GlyPheSer GlnGluArg Lys Leu Lys Lys Leu Leu Glu Ile Ile Lys Asp Glu Asn Pro Asn Leu Asp ValIleThr IleAspLys GluLysIle ArgTyrIle ArgAspArg Leu 370 375 38C!

LysValLys LeuThrArg AspIleGlu LysAspAsn TrpSerTyr Asn LysCysArg LysIleThr GlnGluLeu LeuLysGlu IleTyrTyr Arg LeuGluGlu LeuLysGlu IleGluLys AlaLeuGlu.GluAsnIle Leu IleAspTrp AspGluVal AlaGluArg RrgLysGlu IleAlaGlu Lys ThrGlyIle ArgSerAsp ArgIleLeu GluTyrIle RrgGlyLys Arg Lys Pro Ser Leu Lys Asn Tyr Ile Lys Ile Ala Asn Thr Leu Gly Lys Asn Ile Glu Lys Ile Ile Asp Ala Met Arg Ile Phe Ala Lys Lys Tyr Ser Ser Tyr Ala Glu Ile Gly Lys Met Leu Asn Met Trp Asn Ser Ser Ile Lys Ile Tyr Leu Glu Ser Rsn Thr Gln Glu Ile Glu Lys Leu Glu Glu Ile Arg Lys Thr Glu Leu Lys Leu Val Lys Glu Ile Leu Asn Asp Glu Lys Leu Ile Asp Ser Ile Gly Tyr Val Leu Phe Leu Ala Ser Asn Glu Ile Tyr Trp Asp Glu Ile Val Glu Ile Glu Gln Leu Asn Gly Glu Phe Thr Ile Tyr Asp Leu His Val Pro Arg Tyr His Asn Phe Ile Gly Gly Asn Leu Pro Thr Ile Leu His Asn Thr Thr Ala Ala Leu Cys Leu Ala Arg Asp Leu Phe Gly Glu Asn Trp Arg Asp Asn Phe Leu Glu Leu Asn Ala Ser Val Ser Lys Asp Thr Pro Ile Leu Val Lys Ile Asp Gly Lys Val Lys Arg Thr Thr Phe Glu Glu Leu Asp Lys Ile Tyr Phe Glu Thr Asn Asp Glu Asn Glu Met Tyr Lys Lys Val Asp Asn Leu Glu Val Leu Thr Val Asp Glu Asn Phe Arg Val Arg Trp Arg Lys Val Ser Thr Ile Ile Arg His Lys Val Asp Lys Ile Leu Arg Ile Lys Phe Glu Gly Gly Tyr Ile Glu Leu Thr Gly Asn His Ser Ile Met Met Leu Asp Glu Asn Gly Leu Val Ala Lys Lys Ala Ser Asp Ile Lys Val Gly Asp Cys Phe Leu Ser Phe Val Ala Asn Ile Glu Gly Glu Lys Rsp Arg Leu Asp Leu Lys Glu Phe Glu Pro Lys Asp Ile Thr Ser Arg Val Lys Ile Ile Asn Asp Phe Asp Ile Asp Glu Asp Thr Ala Trp Met Leu Gly Leu Tyr Val Ala Glu Gly Ala Val Gly Phe Lys Gly Lys Thr Ser Gly Gln Val Ile Tyr Thr Leu Gly Ser His Glu His Asp Leu Ile Asn Lys Leu Asn Asp Ile Val Asp Lys Lys Gly Phe Ser Lys Tyr Glu Asn Phe Thr Gly Ser Gly Phe Asp Arg Lys Arg Leu Ser Ala Lys Gln Ile Arg Ile Leu Asn Thr Gln Leu Ala Arg Phe Val Glu Glu Asn Phe Tyr Asp Gly Asn Gly Arg Arg Ala Arg Asn Lys Arg Ile Pro Asp Ile Ile Phe Glu Leu Lys Glu Asn Leu Arg Val Glu Phe Leu Lys Gly Leu Ala Asp Gly Asp 
Ser Ser Gly Asn Trp Arg Glu Val Val Arg Ile Ser Ser Lys Ser Asp _ 7 _ Asn Leu LeuIle AspThr ValTrpLeuAla ArgIle SerGlyIleGlu Ser Ser IlePhe GluAsn GluAlaArgLeu IleTrp LysGlyGlyMet Lys Trp LysLys SerAsn LeuLeuProAla GluPro IleIleLysMet Ile Lys LysLeu GluAsn LysIleAsnGly AsnTrp ArgTyrIleLeu Arg His GlnLeu TyrGlu GlyLysLysArg ValSer LysAspLysIle Lys Gln IleLeu GluMet ValAsnValGlu LysLeu SerAspLysGlu Lys Glu ValTyr AspLeu LeuLysLysLeu SerLys ThrGluLeuTyr Ala Leu ValVal LysGlu IleGluIleIle AspTyr AsnAspPheVal Tyr Asp ValSer ValPro AsnAsnGluMet PhePhe AlaGlyAsnVal Pro Ile Leu Leu His Asn Ser Asp Glu Arg Gly Ile Asp Val Ile Arg ThrLysVal LysAspPheAla Arg Lys ProIleGly ValPro Thr Asp PheLysIle IlePheLeuAsp Glu Asp AlaLeuThr AspAla Ser Ala GlnAsnAla LeuArgArgThr Met Lys TyrSerAsp CysArg Glu Val PheIleLeu SerCysLeuThr Gly Ala LysIleThr ProAsp Asp Leu Glu Arg Glu Ile Lys Ile Glu Asp Phe Ile Lys Met Phe Glu Glu Arg Lys Leu Lys His Val Leu Asn Arg Asn Gly Glu Asp Leu Val Leu Ala Gly Val Lys Phe Asn Ser Lys Ile Val Asn His Lys Val Tyr Arg Leu 11?0 1175 1180 Val Leu Glu Ser Gly Arg Glu Ile Glu Ala Thr Gly Asp His Lys Phe Leu Thr Arg Asp Gly Trp Lys Glu Val Tyr Glu Leu Lys Glu Asp Asp Glu Val Leu Val Tyr Pro Ala Leu Glu Gly Val Gly Phe Glu Val Asp Glu Arg Arg Ile Ile Gly Leu Asn Glu Phe Tyr Glu Phe Leu Thr Asn _ Tyr GluIleLys LeuGlyTyr LysPro LeuGlyLys AlaLysSer Tyr Lys GluLeuIle ThrArgAsp LysGlu LysIleLeu SerArgVal Leu Glu LeuSerAsp LysTyrSer LysSer GluIleArg ArgLysIle Glu Glu GluPheGly IleLysIle SerLeu ThrThrIle LysAsnLeu Ile Asn GlyLysIle AspGlyPhe AlaLeu LysTyrVal ArgLysIle Lys Glu LeuGlyTrp AspGluIle ThrTyr AspAspGlu LysAlaGly Ile Phe AlaArgLeu LeuGlyPhe IleIle GlyAspGly HisLeuSer Lys Ser LysGluGly ArgIleLeu IleThr AlaThrIle AsnGluLeu Glu Gly IleLysLys AspLeuGlu LysLeu GlyIleLys AlaSerAsn Ile Ile GluLysAsp IleGluHis LysLeu AspGlyArg GluIleLys Gly Lys ThrSerPhe IleTyrIle AsnAsn LysAlaPhe TyrLeuLeu Leu Asn PheTrpG1y ValGluIle GlyAsn LysThrIle AsnGlyTyr Asn Ile ProLysTrp 
IleLysTyr GlyAsn LysPheVal LysArgGlu Phe Leu ArgGlyLeu PheGlyAla AspGly ThrLysPro TyrIleLys Lys Tyr AsnIleAsn GlyIleLys LeuGly IleArgVal GluAsnIle Ser Lys AspLysThr LeuGluPhe PheGlu GluValLys LysMetLeu Glu Glu PheGluVal GluSerTyr IleLys ValSerLys IleAspAsn Lys Asn Leu Thr Glu Leu Ile Val Lys Ala Asn Asn Lys Asn Tyr Leu Lys Tyr Ser Arg Ile Tyr TyrGlu Asp Phe Ala Leu Ser Ala Lys Asn Arg Leu Gly Glu Tyr Arg LysGlu Tyr Asp Ile Val Leu Ile Ala Lys Ile Leu Glu Ile Ala Asn LeuLys Ala Gly Glu Lys Glu Ala Glu Asp Lys g _ _ Ser Leu Arg Glu Leu Ala Arg Lys Tyr Asn Val Pro Val Asp Phe Ile Ile Rsn Gln Leu Lys Gly Lys Asp Ile Gly Leu Pro Arg Asn Phe Met Thr Phe Glu Glu Phe Leu Lys Glu Lys Val Val Asp Gly Lys Tyr Val Ser Glu Arg Ile Ile Lys Lys Glu Cys Ile Gly Tyr Arg Asp Val Tyr Asp Ile Thr Cys His Lys Asp Pro Ser Phe Ile Ala Asn Gly Phe Val Ser His Asn Cys Asn Tyr Pro Ser Lys Ile Ile Pro Pro Ile Gln Ser Arg Cys Ala Val Phe Arg Phe Ser Pro Leu Lys Lys Glu Asp Ile Ala Lys Lys Leu Lys Glu Ile Ala Glu Lys Glu Gly Leu Asn Leu Thr Glu Ser Gly Leu Glu Ala Ile Ile Tyr Val Ser Glu Gly Asp Met Arg Lys Ala Ile Asn Val Leu Gln Thr Ala Ala Ala Leu Ser Asp Val Ile Asp Asp Glu Ile Val Tyr Lys Val Ser Ser Arg Ala Arg Pro Glu Glu Val Lys Lys Met Met Glu Leu Ala Leu Asp Gly Lys Phe Met Glu Ala Arg Asp Leu Leu Tyr Lys Leu Met Val Glu Trp Gly Met Ser Gly Glu Asp Ile Leu Asn Gln Met Phe Arg Glu Ile Asn Ser Leu Asp Ile Asp Glu Arg Lys Lys Val Glu Leu Ala Asp Ala Ile Gly Glu Thr Asp Phe Arg Ile Val Glu Gly Ala Asn Glu Arg Ile Gln Leu Ser Ala Leu Leu Ala Lys Met Ala Leu Met Gly Arg (2) INFORMATION FOR SEQ ID NO: 4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 855 amino acids
(B) TYPE: amino acid
(C) STRAND FORM: single strand
(D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(vi) INITIAL ORIGIN:
(A) ORGANISM: Pyrococcus horikoshii
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
Met His Asn Met Glu Glu Val Arg Glu Val Lys Val Leu Glu Lys Pro Trp Val Glu Lys Tyr Arg Pro Gln Arg Leu Asp Glu Ile Val Gly Gln Glu His Ile Val Lys Arg Leu Lys His Tyr Val Lys Thr Gly Ser Met Pro His Leu Leu Phe Ala Gly Pro Pro Gly Val Gl:y Lys Cys Leu Thr Gly Asp Thr Lys Val Ile Ala Asn Gly Gln Leu Phe Glu Leu Arg Glu Leu Val Glu Lys Ile Ser Gly Gly Lys Phe Gly Pro Thr Pro Val Lys Gly Leu Lys Val Ile Gly Ile Asp Glu Asp Gly Lys Leu Arg Glu Phe Glu Val Gln Tyr Val Tyr Lys Asp Lys Thr Glu Arq_ Leu Ile Arg Ile Arg Thr Arg Leu Gly Arg Glu Leu Lys Val Thr Pro Tyr His Pro Leu Leu Val Asn Arg Arg Asn Gly Glu Ile Lys Trp Val. Lys Ala Glu Glu Leu Lys Pro Gly Asp Lys Leu Ala Val Pro Arg Phe Leu Pro Ile Val Thr Gly Glu Asp Pro Leu Ala Glu Trp Leu Gly Tyr Phe Leu Gly Gly Gly Tyr Ala Asp Ser Lys Glu Asn Leu Ile Met Phe Thr Asn Glu Asp Pro Leu Leu Arg Gln Arg Phe Met Glu Leu Thr Glu Lys Leu Phe Ser Asp Ala Arg Ile Arg Glu Ile Thr His Glu Asn Gly Thr Ser Lys Val Tyr Val Asn Ser Lys Lys Ala Leu Lys Leu Val Asn Ser Leu Gly Asn Ala His Ile Pro Lys Glu Cys Trp Arg Gly Ile Arg Ser Phe Leu Arg Ala Tyr Phe Asp Cys Asn Gly Gly Val Lys Gly Asn Ala Ile Val Leu Ala Thr Ala Ser Lys Glu Met Ser Gln Glu Ile Ala Tyr Ala Leu Ala - Gly Phe Gly Ile Ile Ser Arg Ile Gln Glu Tyr Arg Val Ile Ile Ser Gly Ser Asp Asn Val Lys Lys Phe Leu Asn Glu Ile Gly Phe Ile Asn Arg Asn Lys Leu Glu Lys Ala Leu Lys Leu Val Lys Lys Asp Asp Pro Gly His Asp Gly Leu Glu Ile Asn Tyr Glu Leu Ile Ser Tyr Val Lys Asp Arg Leu Arg Leu Ser Phe Phe Asn Asp Lys Arg Ser Trp Ser Tyr Arg Glu Ala Lys Glu Ile Ser Trp Glu Leu Met Lys Glu Ile Tyr Tyr Arg Leu Asp Glu Leu Glu Lys Leu Lys Glu Ser Leu Ser Arg Gly Ile Leu Ile Asp Trp Asn Glu Val Ala Lys Arg Ile Glu Glu Val Ala Glu Glu Thr Gly Ile Arg Ala Asp Glu Leu Leu Glu Tyr Ile Glu Gly Lys Arg Lys Leu Ser Phe Lys Asp Tyr Ile Lys Ile Ala Lys Val Leu Gly Ile Asp Val Glu His Thr Ile Glu Ala Met Arg Val Phe Ala Arg Lys Tyr Ser Ser Tyr Ala Glu Ile Gly Arg Arg Leu Gly Thr Trp Asn Ser Ser Val 
Lys Thr Ile Leu Glu Ser Asn Ala Val Asn Val Glu Ile Leu Glu Arg Ile Arg Lys Ile Glu Leu Glu Leu Ile Glu Glu Ile Leu Ser Asp Glu Lys Leu Lys Glu Gly Ile Ala Tyr Leu Ile Phe Leu Ser Gln Asn Glu Leu Tyr Trp Asp Glu Ile Thr Lys Val Glu Glu Leu Arg Gly Glu Phe Ile Ile Tyr Asp Leu His Val Pro Gly Tyr His Asn Phe Ile Ala Gly Asn Met Pro Thr Val Val His Asn Thr Thr Ala Ala Leu Ala Leu Ser Arg Glu Leu Phe Gly Glu Asn Trp Arg His Asn Phe Leu Glu Leu Asn Ala Ser Asp Glu Arg Gly Ile Asn Val Ile Arg Glu Lys Val Lys Glu Phe Ala Arg Thr Lys Pro Ile Gly Gly Ala Ser Phe Lys Ile Ile Phe Leu Asp Glu Ala Asp Ala Leu Thr Gln Asp Ala Gln Gln Ala Leu Arg Arg Thr Met Glu Met Phe Ser Ser Asn Val Arg Phe Ile Leu Ser Cys Asn Tyr Ser Ser Lys Ile Ile Glu Pro Ile Gln Ser Arg Cys Ala Ile Phe Arg Phe Arg Pro Leu Arg Asp Glu Asp Ile Ala Lys Arg Leu Arg Tyr Ile Ala Glu Asn Glu Gly Leu Glu Leu Thr Glu Glu Gly Leu Gln Ala Ile Leu Tyr Ile Ala Glu Gly Asp Met Arg Arg Ala Ile Asn Ile Leu Gln Ala Ala Ala Ala Leu Asp Lys Lys Ile Thr Asp Glu Asn Val Phe Met Val Ala Ser Arg Ala Arg Pro Glu Asp Ile Arg Glu Met Met Leu Leu Ala Leu Lys Gly Asn Phe Leu Lys Ala Arg Glu Lys Leu Arg Glu Ile Leu Leu Lys Gln Gly Leu Ser Gly Glu Asp Val Leu Ile Gln Met His Lys Glu Val Phe Asn Leu Pro Ile Asp Glu Pro Thr Lys Val Tyr Leu Ala Asp Lys Ile Gly Glu Tyr Asn Phe Arg Leu Val Glu Gly Ala Asn Glu Met Ile Gln Leu Glu Ala Leu Leu Ala Gln Phe Thr Leu Val Gly Lys Lys Lys (2) INFORMATION FOR SEQ ID NO: 5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 321 amino acids
(B) TYPE: amino acid
(C) STRAND FORM: single strand
(D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(vi) INITIAL ORIGIN:
(A) ORGANISM: Methanobacterium thermoautotrophicum
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
Met Ile Ile Met Asn Gly Pro Trp Val Glu Lys Tyr Arg Pro Gln Lys Leu Asp Asp Ile Val Gly Gln Glu His Ile Ile Pro Arg Leu Lys Arg Tyr Val Glu Glu Lys Ser Met Pro Asn Leu Met Phe Thr Gly Pro Ala Gly Val Gly Lys Thr Thr Ala Ala Leu Ala Leu Ala Arg Glu Ile Leu Gly Glu Tyr Trp Arg Gln Asn Phe Leu Glu Leu Asn Ala Ser Asp Ala Arg Gly Ile Asp Thr Val Arg Thr Ser Ile Lys Asn Phe Cys Arg Leu Lys Pro Val Gly Ala Pro Phe Arg Ile Ile Phe Leu Asp Glu Val Asp Asn Met Thr Lys Asp Ala Gln His Ala Leu Arg Arg Glu Met Glu Met Tyr Thr Lys Thr Ser Ser Phe Ile Leu Ser Cys Asn Tyr Ser Ser Lys Ile Ile Asp Pro Ile Gln Ser Arg Cys Ala Ile Phe Arg Phe Leu Pro Leu Lys Gly His Gln Ile Ile Lys Arg Leu Glu Tyr Ile Ala Glu Lys Glu Asn Leu Glu Tyr Glu Ala His Ala Leu Glu Thr Ile Val Tyr Phe Ala Glu Gly Asp Leu Arg Lys Ala Ile Asn Leu Leu Gln Ser Ala Ala Ser Leu Gly Glu Lys Ile Thr Glu Ser Ser Ile Tyr Asp Val Val Ser Arg Ala Arg Pro Lys Asp Val Arg Lys Met Ile Lys Thr Ile Leu Asp Gly Lys Phe Met Glu Ala Arg Asp Met Leu Arg Glu Ile Met Val Leu Gln Gly Ile Ser Gly Glu Asp Met Val Thr Gln Ile Tyr Gln Glu Leu Ser Arg Leu Ala Met Glu Gly Glu Val Asp Gly Asp Arg Tyr Val Gly Leu Ile Asp Ala Ile Gly Glu Tyr Asp Phe Arg Ile Arg Glu Gly Ala Asn Pro Arg Ile Gln Leu Glu Ala Leu Leu Ala Arg Phe Leu Glu His Ala
(2) INFORMATION FOR SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1148 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: Homo sapiens (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
Met Asp Ile Arg Lys Phe Phe Gly Val Ile Pro Ser Gly Lys Lys Leu Val Ser Glu Thr Val Lys Lys Asn Glu Lys Thr Lys Ser Asp Glu Glu Thr Leu Lys Ala Lys Lys Gly Ile Lys Glu Ile Lys Val Asn Ser Ser Arg Lys Glu Asp Asp Phe Lys Gln Lys Gln Pro Ser Lys Lys Lys Arg Ile Ile Tyr Asp Ser Asp Ser Glu Ser Glu Glu Thr Leu Gln Val Lys Asn Ala Lys Lys Pro Pro Glu Lys Leu Pro Val Ser Ser Lys Pro Gly Lys Ile Ser Arg Gln Asp Pro Val Thr Tyr Ile Ser Glu Thr Asp Glu Glu Asp Asp Phe Met Cys Lys Lys Ala Ala Ser Lys Ser Lys Glu Asn Gly Arg Ser Thr Asn Ser His Leu Gly Thr Ser Asn Met Lys Lys Asn Glu Glu Asn Thr Lys Thr Lys Asn Lys Pro Leu Ser Pro Ile Lys Leu Thr Pro Thr Ser Val Leu Asp Tyr Phe Gly Thr Gly Ser Val Gln Arg Ser Asn Lys Lys Met Val Ala Ser Lys Arg Lys Glu Leu Ser Gln Asn Thr Asp Glu Ser Gly Leu Asn Asp Glu Ala Ile Ala Lys Gln Leu Gln Leu Asp Glu Asp Ala Glu Leu Glu Arg Gln Leu His Glu Asp Glu Glu Phe Ala Arg Thr Leu Ala Met Leu Asp Glu Glu Pro Lys Thr Lys Lys Ala Arg Lys Asp Thr Glu Ala Gly Glu Thr Phe Ser Ser Val Gln Ala Asn Leu Ser Lys Ala Glu Lys His Lys Tyr Pro His Lys Val Lys Thr Ala Gln Val Ser Asp Glu Rrg Lys Ser Tyr Ser Pro Arg Lys Gln Ser Lys Tyr Glu Ser Ser Glu Ser Gln Gln His Se.r Lys Ser Lys Ser Ala Asp Lys Ile Gly Glu Ser Ser Pro Lys Ala Ser Ser Lys Val Leu Ala Ile Met Lys Arg Lys Glu Ser Ser Tyr Lys Glu Ile Glu Lys Pro Val Ala Ser Lys Arg Lys Asn Ala Ile Lys Leu Lys Gly Glu Glu Thr Lys Thr Pro Lys Lys Thr Ser Ser Pro Ala Lys Lys Glu Ser Lys Val Ser Pro Glu Asp Ser Glu Lys Arg Thr Asn Tyr Gln Ala Tyr Lys Arg Ser Tyr Leu Rsn Arg Glu Pro Lys Ala Leu Gly Ser Lys Glu Gly Ile Pro Lys Gly A1a Glu Asn Leu Glu Gly Leu Ile Phe Val Ile Cys Thr Gly Val Leu Glu Ser Ile Arg Asp Glu Ala Lys Ser Leu Ile Glu Glu Arg Tyr Gly Gly Lys Val Gly Asn Val Ser Lys Lys Thr Asn Thr Tyr Leu Val Met Gly Arg Asp Gly Gln Ser Lys Ser Asp Lys Ala Ser Ala Ala Leu Gly Thr Lys Ile Asp Glu Asp Gly Leu Leu Asn Leu Ile Ile Arg Thr Met Pro Gly Lys Ser Lys Tyr Glu Ile Ala Val Glu Lys Thr Glu Met Lys Lys 
Glu Ser Leu Glu Arg Thr Pro Gln Lys Asn Lys Val Gln Gly Lys Arg Pro Ser Lys Lys Glu Ser Glu Ser Lys Ile Ser Lys Lys Ser Arg Pro Thr Ser Lys Arg Asp Ser Leu Ala Lys Thr Ile Lys Lys Glu Thr Asp Val Phe Trp Lys Ser Leu Asp Phe Lys Glu Gln Val Ala Glu Glu Thr Ser Gly Asp Ser Lys Ala Arg Asn Leu Ala Asp Asp Ser Ser Glu Asn Lys Val Glu Asn Leu Leu Trp Val Asp Lys Tyr Lys Pro Thr Ser Leu Lys Thr Ile Ile Gly Gln Gln Gly Asp Gln Ser Cys Ala 595 ~ 600 605 Asn Lys Leu Leu Arg Trp Leu Arg Asn Trp Gln Lys Ser Ser Ser Glu Asp Lys Lys His Ala Ala Lys Phe Gly Lys Phe Ser Gly Lys Asp Asp Gly Ser Ser Phe Lys Ala Ala Leu Leu Ser Gly Pro Pro Gly Val Gly Lys Thr Thr Thr Ala Ser Leu Va.l Cys Gln Glu Leu Gly Tyr Ser Tyr Val Glu Leu Asn Ala Ser Asp Thr Arg Ser Lys Ser Ser Leu Lys Ala Ile Val Ala Glu Ser Leu Asn Asn Thr Ser Ile Lys Gly Phe Tyr Ser Asn Gly Ala Ala Ser Ser Val Ser Thr Lys His Ala Leu Ile Met Asp Glu Val Asp Gly Met Ala Gly Asn Glu Asp Arg Gly Gly Ile Gln Glu Leu Ile Gly Leu Ile Lys His Thr Lys Ile Pro Ile Ile Cys Met Cys Asn Asp Arg Asn His Pro Lys Ile Arg Ser Leu Val His Tyr Cys Phe Asp Leu Arg Phe Gln Arg Pro Arg Val Glu Gln Ile Lys Gly Ala Met Met Ser Ile Ala Phe Lys Glu Gly Leu Lys Ile Pro Pro Pro Ala Met Asn Glu Ile Ile Leu Gly Ala Asn Gln Asp Ile Arg Gln Val Leu His Asn Leu Ser Met Trp Cys Ala Arg Ser Lys Ala Leu Thr Tyr Asp Gln Ala Lys Ala Asp Ser His Arg Ala Lys Lys Asp Ile Lys Met Gly Pro Phe Asp Val Ala Arg Lys Val Phe Ala Ala Gly Glu Glu Thr Ala His Met Ser Leu Val Asp Lys Ser Asp Leu Phe Phe His Asp Tyr Ser Ile Ala Pro Leu Phe Val Gln Glu Asn Tyr Ile His Val Lys Pro Val Ala Ala Gly Gly Asp Met Lys Lys His Leu Met Leu Leu Ser Arg Ala Ala Asp Ser Ile Cys Asp Gly Asp Leu Val Asp Ser Gln Ile Arg Ser Lys Gln Asn Trp Ser Leu Leu Pro Ala Gln Ala Ile Tyr Ala Ser Val Leu Pro Gly Glu Leu Met Arg Gly Tyr Met Thr Gln Phe Pro Thr Phe Pro Ser Trp Leu Gly Lys His Ser Ser Thr Gly Lys His Asp Arg Ile Val Gln Asp Leu Ala Leu His Met Ser Leu Arg Thr Tyr Ser Ser Lys Arg Thr Val Asn 
Met Asp Tyr Leu Ser Leu Leu Arg Asp Ala Leu Val Gln Pro Leu Thr Ser Gln Gly Val Asp Gly Val Gln Asp Val Val Ala Leu Met Asp Thr Tyr Tyr Leu Met Lys Glu Asp Phe Glu Asn Ile Met Glu Ile Ser Ser Trp Gly Gly Lys Pro Ser Pro Phe Ser Lys Leu Asp Pro Lys Val Lys Ala Ala Phe Thr Arg Ala Tyr Asn Lys Glu Ala His Leu Thr Pro Tyr Ser Leu Gln Ala Ile Lys Ala Ser Arg His Ser Thr Ser Pro Ser Leu Asp Ser Glu Tyr Asn Glu Glu Leu Asn Glu Asp Asp Ser Gln Ser Asp Glu Lys Asp Gln Asp Ala Ile Glu Thr Asp Ala Met Ile Lys Lys Lys Thr Lys Ser Ser Lys Pro Ser Lys Pro Glu Lys Asp Lys Glu Pro Arg Lys Gly Lys Gly Lys Ser Ser Lys Lys (2) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 479 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: Archaeoglobus fulgidus (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Leu Trp Val Glu Lys Tyr Arg Pro Lys Thr Leu Glu Glu Val Val Ala Asp Lys Ser Ile Ile Thr Arg Val Ile Lys Trp Ala Lys Ser Trp Lys Arg Gly Ser Lys Pro Leu Leu Leu Ala Gly Pro Pro Gly Val Gly Lys Thr Ser Leu Ala Leu Ala Leu Ala Asn Thr Met Gly Trp Glu Ala Val Glu Leu Asn Ala Ser Asp Gln Arg Ser Trp Arg Val Ile Glu Arg Ile Val Gly Glu Gly Ala Phe Asn Glu Thr Ile Ser Asp Glu Gly Glu Phe Leu Ser Ser Arg Ile Gly Lys Leu Lys Leu Ile Ile Leu Asp Glu Val Asp Asn Ile His Lys Lys Glu Asp Val Gly Gly Glu Ala Ala Leu Ile Arg Leu Ile Lys Arg Lys Pro Ala Gln Pro Leu Ile Leu Ile Ala Asn Asp Pro Tyr Lys Leu Ser Pro Glu Leu Arg Asn Leu Cys Glu Met Ile Asn Phe Lys Arg Leu Thr Lys Gln Gln Val Ala Arg Val Leu Glu Arg Ile Ala Leu Lys Glu Gly Ile Lys Val Asp Lys Ser Val Leu Leu Lys Ile Ala Glu Asn Ala Gly Gly Asp Leu Arg Ala Ala Ile Asn Asp Phe Gln Ala Leu Ala Glu Gly Lys Glu Glu Leu Lys Pro Glu Asp Val Phe Leu Thr Lys Arg Thr Gln Glu Lys Asp Ile Phe Arg Val Met Gln Met Ile Phe Lys Thr Lys Asn Pro Ala Val Tyr Asn Glu Ala Met Leu Leu Asp Glu Ser Pro Glu Asp Val Ile His Trp Val Asp Glu Asn Leu Pro Leu Glu Tyr Ser Gly Val Glu Leu Val Asn Ala Tyr Glu Ala Leu Ser Arg Ala Asp Ile Phe Leu Gly Arg Val Arg Arg Arg Gln Phe Tyr Arg Leu Trp Lys Tyr Ala Ser Tyr Leu Met Thr Val Gly Val Gln Gln Met Lys Glu Glu Pro Lys Lys Gly Phe Thr Arg Tyr Arg Arg Pro Ala Val Trp Gln Met Leu Phe Gln Leu Arg Gln Lys Arg Glu Met Thr Arg Lys Ile Leu Glu Lys Ile Gly Lys Tyr Ser His Leu Ser Met Arg Lys Ala Arg Thr Glu Met Phe Pro Val Ile Lys Leu Leu Leu Lys Glu Leu Asp Val Asp Lys Ala Ala Thr Ile Ala Ala Phe Tyr Glu Phe Thr Lys Glu Glu Leu Glu Phe Leu Val Gly Glu Lys Gly Asp Glu Ile Trp Lys Tyr Val Glu Lys His Gly Met His Arg Ile Glu Asp Glu Thr Phe Leu Glu Ser Phe Val Lys Ala Glu Lys Glu Glu Lys Glu Glu Ser Val Glu Glu Val Ala Glu Glu Lys Pro Glu Glu Glu Arg Glu Glu Pro Arg Ala Arg Lys Lys Ala Gly Lys Asn Leu Thr Leu Asp Ser Phe Phe Ser (2) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 516 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: Methanococcus jannaschii (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
Met Leu Ser Trp Val Glu Lys Tyr Arg Pro Lys Ser Leu Lys Asp Val Ala Gly His Glu Lys Val Lys Glu Lys Leu Lys Thr Trp Ile Glu Ser Tyr Leu Lys Gly Glu Thr Pro Lys Pro Ile Leu Leu Val Gly Pro Pro Gly Cys Gly Lys Thr Thr Leu Ala Tyr Ala Leu Ala Asn Asp Tyr Gly Phe Glu Val Ile Glu Leu Asn Ala Ser Asp Lys Arg Asn Ser Ser Ala Ile Lys Lys Val Val Gly His Ala Ala Thr Ser Ser Ser Ile Phe Gly Lys Lys Phe Leu Ile Val Leu Asp Glu Val Asp Gly Ile Ser Gly Lys Glu Asp Ala Gly Gly Val Ser Glu Leu Ile Lys Val Ile Lys Lys Ala Lys Asn Pro Ile Ile Leu Thr Ala Asn Asp Ala Tyr Ala Pro Ser Ile Arg Ser Leu Leu Pro Tyr Val Glu Val Ile Gln Leu Asn Pro Val His Thr Asn Ser Val Tyr Lys Val Leu Lys Lys Ile Ala Glu Lys Glu Gly Leu Asp Val Asp Asp Lys Thr Leu Lys Met Ile Ala Gln His Ser Ala Gly Asp Leu Arg Ser Ala Ile Asn Asp Leu Glu Ala Leu Ala Leu Ser Gly Leu Ser Tyr Glu Ala Ala Gln Lys Leu Pro Asp Arg Lys Arg Asp Glu Ala Asn Ile Phe Asp Ala Leu Arg Val Ile Leu Lys Thr Thr His Tyr Gly Ile Ala Thr Thr Ala Leu Met Asn Val Asp Glu Thr Pro Asp Val Val Ile Glu Trp Ile Ala Glu Asn Val Pro Lys Glu Tyr Glu Lys Pro Glu Glu Val Ala Arg Ala Phe Glu Tyr Leu Ser Lys Ala Asp Arg Tyr Leu Gly Arg Val Met Arg Arg Gln Asn Tyr Ser Phe Trp Lys Tyr Ala Thr Thr Leu Met Thr Ala Gly Val Ala Leu Ser Lys Asp Glu Lys Tyr Arg Lys Trp Thr Pro Tyr Ser Tyr Pro Lys Ile Phe Arg Leu Leu Thr Lys Thr Lys Ala Glu Arg Glu Ile Leu Asn Lys Ile Leu Lys Lys Ile Gly Glu Lys Thr His Thr Ser Ser Lys Arg Ala Arg Phe Asp Leu Gln Met Leu Lys Leu Leu Ala Lys Glu Asn Pro Ser Val Ala Ala Asp Leu Val Asp Tyr Phe Glu Ile Lys Glu Asp Glu Leu Lys Val Leu Val Gly Asp Lys Leu Ala Ser Glu Ile Leu Lys Ile Leu Lys Glu Lys Lys Lys Leu Glu Arg Lys Lys Lys Lys Glu Lys Glu Lys Leu Glu Lys Glu Lys Lys Lys Glu Glu Lys Ala Lys Glu Lys Gln Ser Asn Leu Ile Ile Gln Pro Lys Glu Ile Lys Glu Glu Val Lys Ala Glu Val Glu Lys Lys Glu Glu Val Lys Glu Lys Ile Val Glu Lys Pro Lys Ala Glu Glu Val Lys Glu Lys Ser Lys Thr Glu Glu Lys Glu Thr Lys Lys Asp Lys Lys Lys Gly Lys Lys Lys Lys Glu Asp Lys Gly Lys Gln Leu Thr Leu Asp Ala Phe Phe Lys (2) INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 468 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: Pyrococcus horikoshii (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
Met Pro Asp Val Pro Trp Ile Glu Lys Tyr Arg Pro Arg Lys Leu Ser Glu Ile Val Asn Gln Glu Gln Ala Leu Glu Lys Val Arg Ala Trp Ile Glu Ser Trp Leu His Gly Asn Pro Pro Lys Lys Lys Ala Leu Leu Leu Ala Gly Pro Pro Gly Ser Gly Lys Thr Thr Thr Val Tyr Ala Leu Ala His Glu Tyr Asn Phe Glu Val Ile Glu Leu Asn Ala Ser Asp Glu Arg Thr Tyr Asn Lys Ile Ala Arg Tyr Val Gln Ala Ala Tyr Thr Met Asp Ile Met Gly Lys Arg Arg Lys Ile Ile Phe Leu Asp Glu Ala Asp Asn Ile Glu Pro Ser Gly Ala Pro Glu Ile Ala Lys Leu Ile Asp Lys Ala Arg Asn Pro Ile Ile Met Ala Ala Asn His Tyr Trp Glu Val Pro Lys Glu Ile Arg Asp Arg Ala Glu Leu Val Glu Tyr Lys Arg Leu Asn Gln Arg Asp Val Ile Ser Ala Leu Val Arg Ile Leu Lys Arg Glu Gly Ile Thr Val Pro Lys Glu Ile Leu Thr Glu Ile Ala Lys Arg Ser Ser Gly Asp Leu Arg Ala Ala Ile Asn Asp Leu Gln Thr Ile Val Ala Gly Gly Tyr Glu Asp Ala Lys Tyr Val Leu Ala Tyr Arg Asp Val Glu Lys Thr Val Phe Gln Ser Leu Gly Met Val Phe Ser Ser Asp Asn Ala Lys Arg Ala Lys Leu Ala Leu Met Asn Leu Asp Met Ser Pro Asp Glu Phe Leu Leu Trp Val Asp Glu Asn Ile Pro His Met Tyr Leu Lys Pro Glu Glu Met Ala Arg Ala Tyr Glu Ala Ile Ser Arg Ala Asp Ile Tyr Leu Gly Arg Ala Gln Arg Thr Gly Asn Tyr Ser Leu Trp Lys Tyr Ala Ile Asp Met Met Thr Ala Gly Val Ala Val Ala Gly Thr Lys Lys Lys Gly Phe Ala Lys Phe Tyr Pro Pro Asn Thr Leu Lys Met Leu Ala Glu Ser Lys Glu Glu Arg Ser Ile Arg Asp Ser Ile Ile Lys Lys Ile Met Lys Glu Met His Met Ser Lys Leu Glu Ala Leu Glu Thr Met Lys Ile Leu Arg Thr Ile Phe Glu Asn Asn Leu Asp Leu Ala Ala His Phe Thr Val Phe Leu Glu Leu Thr Glu Lys Glu Val Glu Phe Leu Ala Gly Lys Glu Lys Ala Gly Thr Ile Trp Gly Lys Thr Leu Ser Ile Arg Arg Arg Ile Lys Glu Thr Glu Lys Ile Glu Glu Lys Ala Val Glu Glu Lys Val Glu Glu Glu Glu Ala Glu Glu Glu Glu Glu Glu Glu Arg Lys Glu Glu Glu Lys Pro Lys Ala Glu Lys Lys Lys Gly Lys Gln Val Thr Leu Phe Asp Phe Ile Lys Lys Asn (2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 479 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: Methanobacterium thermoautotrophicum (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
Met Ser Trp Thr Glu Lys Tyr Arg Pro Gly Ser Phe Asp Glu Val Val Gly Asn Gln Lys Val Ile Ala Glu Ile Lys Glu Trp Ile Lys Ala Trp Lys Ala Gly Lys Pro Gln Lys Pro Leu Leu Leu Val Gly Pro Pro Gly Thr Gly Lys Thr Thr Leu Ala His Ile Ile Gly Lys Glu Phe Ser Asp Thr Leu Glu Leu Asn Ala Ser Asp Arg Arg Ser Gln Asp Ala Leu Met Arg Ser Ala Gly Glu Ala Ser Ala Thr Arg Ser Leu Phe Asn His Asp Leu Lys Leu Ile Ile Leu Asp Glu Val Asp Gly Ile His Gly Asn Glu Asp Arg Gly Gly Val Gln Ala Ile Asn Arg Ile Ile Lys Glu Ser Arg His Pro Met Val Leu Thr Ala Asn Asp Pro Tyr Ser Lys Arg Leu Gln Ser Ile Lys Pro Arg Cys Arg Val Leu Asn Leu Arg Lys Val His Thr Ser Ser Ile Ala Ala Ala Leu Arg Arg Ile Cys Arg Ala Glu Gly Ile Glu Cys Pro Asp Asp Val Leu Arg Glu Leu Ala Lys Arg Ser Arg Gly Asp Leu Arg Ser Ala Ile Asn Asp Leu Glu Ala Met Ala Glu Gly Glu Glu Arg Ile Gly Glu Glu Leu Leu Lys Met Gly Glu Lys Asp Ala Thr Ser Asn Leu Phe Asp Ala Val Arg Ala Val Leu Lys Ser Arg Asp Val Ser Lys Val Arg Glu Ala Met Arg Val Asp Asp Asp Pro Thr Leu Val Leu Glu Phe Ile Ala Glu Asn Val Pro Arg Glu Tyr Glu Lys Pro Asn Glu Ile Ser Arg Ala Tyr Asp Met Leu Ser Arg Ala Asp Ile Phe Phe Gly Arg Ala Val Arg Thr Arg Asn Tyr Thr Tyr Trp Arg Tyr Ala Ser Glu Leu Met Gly Pro Gly Val Ala Leu Ala Lys Asp Lys Thr Tyr Arg Lys Phe Val Arg Tyr Thr Gly Ser Ser Ser Phe Arg Ile Leu Gly Lys Thr Arg Lys Gln Arg Ser Leu Arg Asp Ser Val Ala Ala Lys Met Ala Gly Lys Met His Ile Ser Pro Lys Val Ala Ile Ser Met Phe Pro Tyr Met Glu Ile Leu Phe Glu Asn Asp Glu Met Ala Tyr Asp Ile Ser Glu Phe Leu Glu Leu Arg Asp Glu Glu Ile Lys Leu Phe Arg Lys Arg Lys Ile Lys Ala Pro Lys Arg Lys Lys Thr Pro Arg Lys Ala Glu Ile Lys Val Gly Pro Leu Tyr Ser Gln Lys Lys Asp Lys Gly Ala Asp Lys Ser Ile Asn Asp Lys Ala Thr Asp Lys Ser Ala Lys Thr Pro Ile Lys Ser Ser Lys Lys Asp Asp Arg Pro Arg Asp Glu Ser Ser Ser Ser Ser Asp Asp Lys Lys Pro Lys Glu Lys Gln Thr Ser Leu Phe Gln Phe Ser (2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 261 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: Homo sapiens (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
Met Phe Glu Ala Arg Leu Val Gln Gly Ser Ile Leu Lys Lys Val Leu Glu Ala Leu Lys Asp Leu Ile Asn Glu Ala Cys Trp Asp Ile Ser Ser Ser Gly Val Asn Leu Gln Ser Met Asp Ser Ser His Val Ser Leu Val Gln Leu Thr Leu Arg Ser Glu Gly Phe Asp Thr Tyr Arg Cys Asp Arg Asn Leu Ala Met Gly Val Asn Leu Thr Ser Met Ser Lys Ile Leu Lys Cys Ala Gly Asn Glu Asp Ile Ile Thr Leu Arg Ala Glu Asp Asn Ala Asp Thr Leu Ala Leu Val Phe Glu Ala Pro Asn Gln Glu Lys Val Ser Asp Tyr Glu Met Lys Leu Met Asp Leu Asp Val Glu Gln Leu Gly Ile Pro Glu Gln Glu Tyr Ser Cys Val Val Lys Met Pro Ser Gly Glu Phe Ala Arg Ile Cys Arg Asp Leu Ser His Ile Gly Asp Ala Val Val Ile Ser Cys Ala Lys Asp Gly Val Lys Phe Ser Ala Ser Gly Glu Leu Gly Asn Gly Asn Ile Lys Leu Ser Gln Thr Ser Asn Val Asp Lys Glu Glu Glu Ala Val Thr Ile Glu Met Asn Glu Pro Val Gln Leu Thr Phe Ala Leu Arg Tyr Leu Asn Phe Phe Thr Lys Ala Thr Pro Leu Ser Ser Thr Val Thr Leu Ser Met Ser Ala Asp Val Pro Leu Val Val Glu Tyr Lys Ile Ala Asp Met Gly His Leu Lys Tyr Tyr Leu Ala Pro Lys Ile Glu Asp Glu Glu Gly Ser (2) INFORMATION FOR SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 245 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: Archaeoglobus fulgidus (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
Met Ile Asp Val Ile Met Thr Gly Glu Leu Leu Lys Thr Val Thr Arg Ala Ile Val Ala Leu Val Ser Glu Ala Arg Ile His Phe Leu Glu Lys Gly Leu His Ser Arg Ala Val Asp Pro Ala Asn Val Ala Met Val Ile Val Asp Ile Pro Lys Asp Ser Phe Glu Val Tyr Asn Ile Asp Glu Glu Lys Thr Ile Gly Val Asp Met Asp Arg Ile Phe Asp Ile Ser Lys Ser Ile Ser Thr Lys Asp Leu Val Glu Leu Ile Val Glu Asp Glu Ser Thr Leu Lys Val Lys Phe Gly Ser Val Glu Tyr Lys Val Ala Leu Ile Asp Pro Ser Ala Ile Arg Lys Glu Pro Arg Ile Pro Glu Leu Glu Leu Pro Ala Lys Ile Val Met Asp Ala Gly Glu Phe Lys Lys Ala Ile Ala Ala Ala Asp Lys Ile Ser Asp Gln Val Ile Phe Arg Ser Asp Lys Glu Gly Phe Arg Ile Glu Ala Lys Gly Asp Val Asp Ser Ile Val Phe His Met Thr Glu Thr Glu Leu Ile Glu Phe Asn Gly Gly Glu Ala Arg Ser Met Phe Ser Val Asp Tyr Leu Lys Glu Phe Cys Lys Val Ala Gly Ser Gly Asp Leu Leu Thr Ile His Leu Gly Thr Asn Tyr Pro Val Arg Leu Val Phe Glu Leu Val Gly Gly Arg Ala Lys Val Glu Tyr Ile Leu Ala Pro Arg Ile Glu Ser Glu (2) INFORMATION FOR SEQ ID NO: 13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 247 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: Methanococcus jannaschii (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
Met Phe Arg Gly Val Met Glu Ser Ala Lys Glu Phe Lys Lys Val Val Asp Thr Ile Ser Thr Leu Leu Asp Glu Ile Cys Phe Glu Val Asp Glu Glu Gly Ile Lys Ala Ser Ala Met Asp Pro Ser His Val Ala Leu Val Ser Leu Glu Ile Pro Arg Leu Ala Phe Glu Glu Tyr Glu Ala Asp Ser His Asp Ile Gly Ile Asp Leu Glu Ala Phe Lys Lys Val Met Asn Arg Ala Lys Ala Lys Asp Arg Leu Ile Leu Glu Leu Asp Glu Glu Lys Asn Lys Leu Asn Val Ile Phe Glu Asn Thr Gly Lys Arg Lys Phe Ser Leu Ala Leu Leu Asp Ile Ser Ala Ser Ser Val Lys Val Pro Glu Ile Glu Tyr Pro Asn Val Ile Met Ile Lys Gly Asp Ala Phe Lys Glu Ala Leu Lys Asp Ala Asp Leu Phe Ser Asp Tyr Val Ile Leu Lys Ala Asp Glu Asp Lys Phe Val Ile His Ala Lys Gly Asp Leu Asn Glu Asn Glu Ala Ile Phe Glu Lys Asp Ser Ser Ala Ile Ile Ser Leu Glu Val Lys Glu Glu Ala Lys Ser Ala Phe Asn Leu Asp Tyr Leu Met Asp Met Val Lys Gly Val Ser Ser Gly Asp Ile Ile Lys Ile Tyr Leu Gly Asn Asp Met Pro Leu Lys Leu Glu Tyr Ser Ile Ala Gly Val Asn Leu Thr Phe Leu Leu Ala Pro Arg Ile Glu Gly (2) INFORMATION FOR SEQ ID NO: 14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 299 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: Pyrococcus horikoshii (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
Met Pro Phe Glu Ile Val Phe Glu Gly Ala Lys Glu Phe Ala Gln Leu Ile Glu Thr Ala Ser Arg Leu Ile Asp Glu Ala Ala Phe Lys Val Thr Glu Glu Gly Ile Ser Met Arg Ala Met Asp Pro Ser Arg Val Val Leu Ile Asp Leu Asn Leu Pro Ser Ser Ile Phe Ser Lys Tyr Glu Val Asp Gly Glu Glu Thr Ile Gly Val Asn Met Asp His Leu Lys Lys Val Leu Lys Arg Gly Lys Ala Lys Asp Thr Leu Ile Leu Arg Lys Gly Glu Glu Asn Phe Leu Glu Ile Ser Leu Gln Gly Thr Ala Thr Arg Thr Phe Arg Leu Pro Leu Ile Asp Val Glu Glu Ile Glu Val Glu Leu Pro Asp Leu Pro Tyr Thr Ala Lys Val Val Val Leu Gly Glu Val Leu Lys Glu Ala Val Lys Asp Ala Ser Leu Val Ser Asp Ser Ile Lys Phe Met Ala Lys Glu Asn Glu Phe Ile Met Arg Ala Glu Gly Glu Thr Gln Glu Val Glu Val Lys Leu Thr Leu Glu Asp Glu Gly Leu Leu Asp Ile Glu Val Gln Glu Glu Thr Lys Ser Ala Tyr Gly Val Ser Tyr Leu Ala Asp Met Val Lys Gly Ile Gly Lys Ala Asp Glu Val Thr Met Arg Phe Gly Asn Glu Met Pro Met Gln Met Glu Tyr Tyr Ile Arg Asp Glu Gly Arg Leu Thr Phe Leu Leu Ala Pro Arg Val Glu Glu (2) INFORMATION FOR SEQ ID NO: 15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 244 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: Methanobacterium thermoautotrophicum (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
Met Phe Lys Ala Glu Leu Asn Asp Pro Asn Ile Leu Arg Thr Ser Phe Asp Ala Ile Ser Ser Ile Val Asp Glu Val Gln Ile Gln Leu Ser Ala Glu Gly Leu Arg Leu Asp Ala Leu Asp Arg Ser His Ile Thr Tyr Val His Leu Glu Leu Lys Ala Glu Leu Phe Asp Glu Tyr Val Cys Asp Glu Pro Glu Arg Ile Asn Val Asp Thr Glu Glu Leu Met Lys Val Leu Lys Arg Ala Lys Ala Asn Asp Arg Val Ile Leu Ser Thr Asp Glu Gly Asn Leu Ile Ile Gln Phe Glu Gly Glu Ala Val Arg Thr Phe Lys Ile Arg Leu Ile Asp Ile Glu Tyr Glu Thr Pro Ser Pro Pro Glu Ile Glu Tyr Glu Asn Glu Phe Glu Val Pro Phe Gln Leu Leu Lys Asp Ser Ile Ala Asp Ile Asp Ile Phe Ser Asp Lys Ile Thr Phe Arg Val Asp Glu Asp Arg Phe Ile Ala Ser Ala Glu Gly Glu Phe Gly Asp Ala Gln Ile Glu Tyr Leu His Gly Glu Arg Ile Asp Lys Pro Ala Arg Ser Ile Tyr Ser Leu Asp Lys Ile Lys Glu Met Leu Lys Ala Asp Lys Phe Ser Glu Thr Ala Ile Ile Asn Leu Gly Asp Asp Met Pro Leu Lys Leu Thr Leu Lys Met Ala Ser Lys Glu Gly Glu Leu Ser Phe Leu Leu Ala Pro Arg Ile Glu Ala Glu Glu (2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 469 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: Homo sapiens (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
Met Phe Ser Glu Gln Ala Ala Gln Arg Ala His Thr Leu Leu Ser Pro Pro Ser Ala Asn Asn Ala Thr Phe Ala Arg Val Pro Val Ala Thr Tyr Thr Asn Ser Ser Gln Pro Phe Arg Leu Gly Glu Arg Ser Phe Ser Arg Gln Tyr Ala His Ile Tyr Ala Thr Arg Leu Ile Gln Met Arg Pro Phe Leu Glu Asn Arg Ala Gln Gln His Trp Gly Ser Gly Val Gly Val Lys Lys Leu Cys Glu Leu Gln Pro Glu Glu Lys Cys Cys Val Val Gly Thr Leu Phe Lys Ala Met Pro Leu Gln Pro Ser Ile Leu Arg Glu Val Ser Glu Glu His Asn Leu Leu Pro Gln Pro Pro Arg Ser Lys Tyr Ile His Pro Asp Asp Glu Leu Val Leu Glu Asp Glu Leu Gln Arg Ile Lys Leu Lys Gly Thr Ile Asp Val Ser Lys Leu Val Thr Gly Thr Val Leu Ala Val Phe Gly Ser Val Arg Asp Asp Gly Lys Phe Leu Val Glu Asp Tyr Cys Phe Ala Asp Leu Ala Pro Gln Lys Pro Ala Pro Pro Leu Asp Thr Asp Arg Phe Val Leu Leu Val Ser Gly Leu Gly Leu Gly Gly Gly Gly Gly Glu Ser Leu Leu Gly Thr Gln Leu Leu Val Asp Val Val Thr Gly Gln Leu Gly Asp Glu Gly Glu Gln Cys Ser Ala Ala His Val Ser Arg Val Ile Leu Ala Gly Asn Leu Leu Ser His Ser Thr Gln Ser Arg Asp Ser Ile Asn Lys Ala Lys Tyr Leu Thr Lys Lys Thr Gln Ala Ala Ser Val Glu Ala Val Lys Met Leu Asp Glu Ile Leu Leu Gln Leu Ser Ala Ser Val Pro Val Asp Val Met Pro Gly Glu Phe Asp Pro Thr Asn Tyr Thr Leu Pro Gln Gln Pro Leu His Pro Cys Met Phe Pro Leu Ala Thr Ala Tyr Ser Thr Leu Gln Leu Val Thr Asn Pro Tyr Gln Ala Thr Ile Asp Gly Val Arg Phe Leu Gly Thr Ser Gly Gln Asn Val Ser Asp Ile Phe Arg Tyr Ser Ser Met Glu Asp His Leu Glu Ile Leu Glu Trp Thr Leu Arg Val Arg His Ile Ser Pro Thr Ala Pro Asp Thr Leu Gly Cys Tyr Pro Phe Tyr Lys Thr Asp Pro Phe Ile Phe Pro Glu Cys Pro His Val Tyr Phe Cys Gly Asn Thr Pro Ser Phe Gly Ser Lys Ile Ile Arg Gly Pro Glu Asp Gln Thr Val Leu Leu Val Thr Val Pro Asp Phe Ser Ala Thr Gln Thr Ala Cys Leu Val Asn Leu Arg Ser Leu Ala Cys Gln Pro Ile Ser Phe Ser Gly Phe Gly Ala Glu Asp Asp Asp Leu Gly Gly Leu Gly Leu Gly Pro (2) INFORMATION FOR SEQ ID NO: 17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 488 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: Archaeoglobus fulgidus (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
Met Val Ile Lys Asn Ile Asp Ala Ala Thr Val Ala Lys Lys Phe Leu Val Arg Gly Tyr Asn Ile Asp Pro Lys Ala Ala Glu Leu Ile Cys Lys Ser Gly Leu Phe Ser Asp Glu Leu Val Asp Lys Ile Cys Arg Ile Ala Asn Gly Gly Phe Ile Ile Glu Lys Ser Val Val Glu Glu Phe Leu Arg Asn Leu Ser Asn Leu Lys Pro Ala Thr Leu Thr Pro Arg Pro Glu Glu Arg Lys Val Glu Glu Val Lys Ala Ser Cys Ile Ala Leu Lys Val Ile Lys Asp Ile Thr Gly Lys Ser Ser Cys Gln Gly Asn Val Glu Asp Phe Leu Met Tyr Phe Asn Ser Arg Leu Glu Lys Leu Ser Arg Ile Ile Arg Ser Arg Val Asn Thr Thr Pro Ile Ala His Ala Gly Lys Val Arg Gly Asn Val Ser Val Val Gly Met Val Asn Glu Val Tyr Glu Arg Gly Asp Lys Cys Tyr Ile Arg Leu Glu Asp Thr Thr Gly Thr Ile Thr Cys Val Ala Thr Gly Lys Asn Ala Glu Val Ala Arg Glu Leu Leu Gly Asp Glu Val Ile Gly Val Thr Gly Leu Leu Lys Gly Ser Ser Leu Tyr Ala Asn Arg Ile Val Phe Pro Asp Val Pro Ile Asn Gly Asn Gly Glu Lys Lys Arg Asp Phe Tyr Ile Val Phe Leu Ser Asp Thr His Phe Gly Ser Lys Glu Phe Leu Glu Lys Glu Trp Glu Met Phe Val Arg Trp Leu Lys Gly Glu Val Gly Gly Lys Lys Ser Gln Asn Leu Ala Glu Lys Val Lys Tyr Ile Val Ile Ala Gly Asp Ile Val Asp Gly Ile Gly Val Tyr Pro Gly Gln Glu Asp Asp Leu Ala Ile Ser Asp Ile Tyr Gly Gln Tyr Glu Phe Ala Ala Ser His Leu Asp Glu Ile Pro Lys Glu Ile Lys Ile Ile Val Ser Pro Gly Asn His Asp Ala Val Arg Gln Ala Glu Pro Gln Pro Ala Phe Glu Gly Glu Ile Arg Ser Leu Phe Pro Lys Asn Val Glu His Val Gly Asn Pro Ala Tyr Val Asp Ile Glu Gly Val Lys Val Leu Ile Tyr His Gly Arg Ser Ile Asp Asp Ile Ile Ser Lys Ile Pro Arg Leu Ser Tyr Asp Glu Pro Gln Lys Val Met Glu Glu Leu Leu Lys Arg Arg His Leu Ser Pro Ile Tyr Gly Gly Arg Thr Pro Leu Ala Pro Glu Arg Glu Asp Tyr Leu Val Ile Glu Asp Val Pro Asp Ile Leu His Cys Gly His Ile His Thr Tyr Gly Thr Gly Phe Tyr Arg Gly Val Phe Met Val Asn Ser Ser Thr Trp Gln Ala Gln Thr Glu Phe Gln Lys Lys Val Asn Leu Asn Pro Met Pro Gly Asn Val Ala Val Tyr Arg Pro Gly Gly Glu Val Ile Arg Leu Arg Phe Tyr Gly Glu (2) INFORMATION FOR SEQ ID NO: 18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 594 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: Methanococcus jannaschii (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
Met Glu Ile Ile Asn Lys Phe Leu Asp Leu Glu Ala Leu Leu Ser Pro Thr Val Tyr Glu Lys Leu Lys Asn Phe Asp Glu Glu Lys Leu Lys Arg Leu Ile Gln Lys Ile Arg Glu Phe Lys Lys Tyr Asn Asn Ala Phe Ile Leu Leu Asp Glu Lys Phe Leu Asp Ile Phe Leu Gln Lys Asp Leu Asp Glu Ile Ile Asn Glu Tyr Lys Asp Phe Asp Phe Ile Phe Tyr Tyr Thr Gly Glu Glu Glu Lys Glu Lys Pro Lys Glu Val Lys Lys Glu Ile Lys Lys Glu Thr Glu Glu Lys Ile Glu Lys Glu Lys Ile Glu Phe Val Lys Lys Glu Glu Lys Glu Gln Phe Ile Lys Lys Ser Asp Glu Asp Val Glu Glu Lys Leu Lys Gln Leu Ile Ser Lys Glu Glu Lys Lys Glu Asp Phe Asp Ala Glu Arg Ala Lys Arg Tyr Glu His Ile Thr Lys Ile Lys Glu Ser Val Asn Ser Arg Ile Lys Trp Ile Ala Lys Asp Ile Asp Ala Val Ile Glu Ile Tyr Glu Asp Ser Asp Val Ser Gly Lys Ser Thr Cys Thr Gly Thr Ile Glu Asp Phe Val Lys Tyr Phe Arg Asp Arg Phe Glu Arg Leu Lys Val Phe Ile Glu Arg Lys Ala Gln Arg Lys Gly Tyr Pro Leu Lys Asp Ile Lys Lys Met Lys Gly Gln Lys Asp Ile Phe Val Val Gly Ile Val Ser Asp Val Asp Ser Thr Arg Asn Gly Asn Leu Ile Val Arg Ile Glu Asp Thr Glu Asp Glu Ala Thr Leu Ile Leu Pro Lys Glu Lys Ile Glu Ala Gly Lys Ile Pro Asp Asp Ile Leu Leu Asp Glu Val Ile Gly Ala Ile Gly Thr Val Ser Lys Ser Gly Ser Ser Ile Tyr Val Asp Glu Ile Ile Arg Pro Ala Leu Pro Pro Lys Glu Pro Lys Arg Ile Asp Glu Glu Ile Tyr Met Ala Phe Leu Ser Asp Ile His Val Gly Ser Lys Glu Phe Leu His Lys Glu Phe Glu Lys Phe Ile Arg Phe Leu Asn Gly Asp Val Asp Asn Glu Leu Glu Glu Lys Val Val Ser Arg Leu Lys Tyr Ile Cys Ile Ala Gly Asp Leu Val Asp Gly Val Gly Val Tyr Pro Gly Gln Glu Glu Asp Leu Tyr Glu Val Asp Ile Ile Glu Gln Tyr Arg Glu Ile Ala Met Tyr Leu Asp Gln Ile Pro Glu His Ile Ser Ile Ile Ile Ser Pro Gly Asn His Asp Ala Val Arg Pro Ala Glu Pro Gln Pro Lys Leu Pro Glu Lys Ile Thr Lys Leu Phe Asn Arg Asp Asn Ile Tyr Phe Val Gly Asn Pro Cys Thr Leu Asn Ile His Gly Phe Asp Thr Leu Leu Tyr His Gly Arg Ser Phe Asp Asp Leu Val Gly Gln Ile Arg Ala Ala Ser Tyr Glu Asn Pro Val Thr Ile Met Lys Glu Leu Ile Lys Arg Arg Leu Leu Cys Pro Thr Tyr Gly Gly Arg Cys Pro Ile Ala Pro Glu His Lys Asp Tyr Leu Val Ile Asp Arg Asp Ile Asp Ile Leu His Thr Gly His Ile His Ile Asn Gly Tyr Gly Ile Tyr Arg Gly Val Val Met Val
Asn Ser Gly Thr Phe Gln Glu Gln Thr Asp Phe Gln Lys Arg Met Gly Ile Ser Pro Thr Pro Ala Ile Val Pro Ile Ile Asn Met Ala Lys Val Gly Glu Lys Gly His Tyr Leu Glu Trp Asp Arg Gly Val Leu Glu Val Arg Tyr (2) INFORMATION FOR SEQ ID NO: 19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 622 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: Pyrococcus horikoshii (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
Met Asp Glu Phe Val Lys Gly Leu Met Lys Asn Gly Tyr Leu Ile Thr Pro Ser Ala Tyr Tyr Leu Leu Val Gly His Phe Asn Glu Gly Lys Phe Ser Leu Ile Glu Leu Ile Lys Phe Ala Lys Ser Arg Glu Thr Phe Ile Ile Asp Asp Glu Ile Ala Asn Glu Phe Leu Lys Ser Ile Gly Ala Glu Val Glu Leu Pro Gln Glu Ile Lys Glu Gly Tyr Ile Ser Thr Gly Glu Gly Ser Gln Lys Val Pro Asp His Glu Glu Leu Glu Lys Ile Thr Asn Glu Ser Ser Val Glu Ser Ser Ile Ser Thr Gly Glu Thr Pro Lys Thr Glu Glu Leu Gln Pro Thr Leu Asp Ile Leu Glu Glu Glu Ile Gly Asp Ile Glu Gly Gly Glu Ser Ser Ile Ser Thr Gly Asp Glu Val Pro Glu Val Glu Asn Asn Asn Gly Gly Thr Val Val Val Phe Asp Lys Tyr Gly Tyr Pro Phe Thr Tyr Val Pro Glu Glu Ile Glu Glu Glu Leu Glu Glu Tyr Pro Lys Tyr Glu Asp Val Thr Ile Glu Ile Asn Pro Asn Leu Glu Val Val Pro Ile Glu Lys Asp Tyr Glu Ile Lys Phe Asp Val Arg Arg Val Lys Leu Lys Pro Pro Lys Val Lys Ser Gly Ser Gly Lys Glu Gly Glu Ile Ile Val Glu Ala Tyr Ala Ser Leu Phe Arg Ser Arg Leu Arg Lys Leu Arg Arg Ile Leu Arg Glu Asn Pro Glu Val Ser Asn Val Ile Asp Ile Lys Lys Leu Lys Tyr Val Lys Gly Asp Glu Glu Val Thr Ile Ile Gly Leu Val Asn Ser Lys Lys Glu Thr Ser Lys Gly Leu Ile Phe Glu Val Glu Asp Gln Thr Asp Arg Val Lys Val Phe Leu Pro Lys Asp Ser Glu Asp Tyr Arg Glu Ala Leu Lys Val Leu Pro Asp Ala Val Val Ala Phe Lys Gly Val Tyr Ser Lys Arg Gly Ile Phe Phe Ala Asn Arg Phe Tyr Leu Pro Asp Val Pro Leu Tyr Arg Lys Gln Lys Pro Pro Leu Glu Glu Lys Val Tyr Ala Val Leu Thr Ser Asp Ile His Val Gly Ser Lys Glu Phe Cys Glu Lys Ala Phe Ile Lys Phe Leu Glu Trp Leu Asn Gly Tyr Val Glu Ser Lys Glu Glu Glu Glu Ile Val Ser Arg Ile Arg Tyr Leu Ile Ile Ala Gly Asp Val Val Asp Gly Ills Gly Ile Tyr Pro Gly Gln Tyr Ser Asp Leu Ile Ile Pro Asp Ile Phe Asp Gln Tyr Glu Ala Leu Ala Asn Leu Leu Ser Asn Val Pro Lys Hip; Ile Thr Ile Phe Ile Gly Pro Gly Asn His Asp Ala Ala Arg Pro Ala Ile Pro Gln Pro Glu Phe Tyr Glu Glu Tyr Ala Lys Pro Leu Tyr Lys Leu Lys Asn Thr Val Ile Ile Ser Asn Pro Ala Val Ile Arg Leu His Gly Arg Asp Phe Leu Ile Ala 
His Gly Arg Gly Ile Glu Asp Val Val Ser Phe Val Pro Gly Leu Thr His His Lys Pro Gly Leu Pro Met Val Glu Leu Leu Lys Met Arg His Leu Ala Pro Thr Phe Gly Gly Lys Val Pro Ile Ala Pro Asp Pro Glu Asp Leu Leu Val Ile Glu Glu Val Pro Asp Leu Val Gln Met Gly His Val His Val Tyr Asp Thr Ala Val Tyr Arg Gly Val Gln Leu Val Asn Ser Ala Thr Trp Gln Ala Gln Thr Glu Phe Gln Lys Met Val Asn Ile Val Pro Thr Pro Gly Leu Val Pro Ile Val Asp Val Glu Ser Ala Arg Val Ile Lys Val Leu Asp Phe Ser Arg Trp Cys (2) INFORMATION FOR SEQ ID NO: 20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 482 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: Methanobacterium thermoautotrophicum (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
Met Asn Glu Ile Ile Gly Lys Phe Ala Arg Glu Gly Ile Leu Ile Glu Asp Asn Ala Tyr Phe Arg Leu Arg Glu Met Asp Asp Pro Ala Ser Val Ser Ser Glu Leu Ile Val Lys Ile Lys Ser Asn Gly Gly Lys Phe Thr Val Leu Thr Ser Glu Met Leu Asp Glu Phe Phe Glu Ile Asp Asn Pro Ala Glu Ile Lys Ala Arg Gly Pro Leu Met Val Pro Ala Glu Arg Asp Phe Asp Phe Glu Val Ile Ser Asp Thr Ser Asn Arg Ser Tyr Thr Ser Gly Glu Ile Gly Asp Met Ile Ala Tyr Phe Asn Ser Arg Tyr Ser Ser Leu Lys Asn Leu Leu Ser Lys Arg Pro Glu Leu Lys Gly His Ile Pro Ile Ala Asp Leu Arg Gly Gly Glu Asp Val Val Ser Ile Ile Gly Met Val Asn Asp Val Arg Asn Thr Lys Asn Asn His Arg Ile Ile Glu Leu Glu Asp Asp Thr Gly Glu Ile Ser Val Val Val His Asn Glu Asn His Lys Leu Phe Glu Lys Ser Glu Lys Ile Val Arg Asp Glu Val Val Gly Val His Gly Thr Lys Lys Gly Arg Phe Val Val Ala Ser Glu Ile Phe His Pro Gly Val Pro Arg Ile Gln Glu Lys Glu Met Asp Phe Ser Val Ala Phe Ile Ser Asp Val His Ile Gly Ser Gln Thr Phe Leu Glu Asp Ala Phe Met Lys Phe Val Lys Trp Ile Asn Gly Asp Phe Gly Ser Glu Glu Gln Arg Ser Leu Ala Ala Asp Val Lys Tyr Leu Val Val Ala Gly Asp Ile Val Asp Gly Ile Gly Ile Tyr Pro Gly Gln Glu Lys Glu Leu Leu Ile Arg Asp Ile His Glu Gln Tyr Glu Glu Ala Ala Arg Leu Phe Gly Asp Ile Arg Ser Asp Ile Lys Ile Val Met Ile Pro Gly Asn His Asp Ser Ser Arg Ile Ala Glu Pro Gln Pro Ala Ile Pro Glu Glu Tyr Ala Lys Ser Leu Tyr Ser Ile Arg Asn Ile Glu Phe Leu Ser Asn Pro Ser Leu Val Ser Leu Asp Gly Arg Thr Leu Ile Tyr His Gly Val Arg Ser Phe Asp Asp Met Ala Met Ser Val Asn Gly Leu Ser His Glu Arg Ser Asp Leu Ile Met Glu Glu Leu Leu Glu Lys Arg His Leu Ala Pro Ile Tyr Gly Glu Arg Thr Pro Leu Ala Ser Glu Ile Glu Asp His Leu Val Ile Asp Glu Val Pro His Val Leu His Thr Gly His Val His Ile Asn Ala Tyr Lys Lys Tyr Lys Gly Val His Leu Ile Asn Ser Gly Thr Phe Gln Ser Gln Thr Glu Phe Gln Lys Ile Tyr Asn Ile Val Pro Thr Cys Gly Gln Val Pro Val Leu Asn Arg Gly Val Met Lys Leu Leu Glu Phe Ser (2) INFORMATION FOR SEQ ID NO: 21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 613 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: Pyrococcus furiosus (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
Met Asp Glu Phe Val Lys Ser Leu Leu Lys Ala Asn Tyr Leu Ile Thr Pro Ser Ala Tyr Tyr Leu Leu Arg Glu Tyr Tyr Glu Lys Gly Glu Phe Ser Ile Val Glu Leu Val Lys Phe Ala Arg Ser Arg Glu Ser Tyr Ile Ile Thr Asp Ala Leu Ala Thr Glu Phe Leu Lys Val Lys Gly Leu Glu Pro Ile Leu Pro Val Glu Thr Lys Gly Gly Phe Val Ser Thr Gly Glu Ser Gln Lys Glu Gln Ser Tyr Glu Glu Ser Phe Gly Thr Lys Glu Glu Ile Ser Gln Glu Ile Lys Glu Gly Glu Ser Phe Ile Ser Thr Gly Ser Glu Pro Leu Glu Glu Glu Leu Asn Ser Ile Gly Ile Glu Glu Ile Gly Ala Asn Glu Glu Leu Val Ser Asn Gly Asn Asp Asn Gly Gly Glu Ala Ile Val Phe Asp Lys Tyr Gly Tyr Pro Met Val Ty:r Ala Pro Glu Glu Ile Glu Val Glu Glu Lys Glu Tyr Ser Lys Tyr Glu Asp Leu Thr Ile Pro Met Asn Pro Asp Phe Asn Tyr Val Glu Ile Lys Glu Asp Tyr Asp Val Val Phe Asp Val Arg Rsn Val Lys Leu Lys Pro Pro Lys Val Lys Asn Gly Asn Gly Lys Glu Gly Glu Ile Ile Val Glu Ala Tyr Ala Ser Leu Phe Arg Ser Arg Leu Lys Lys Leu Arg Lys Ile Leu Arg Glu Asn Pro Glu Leu Asp Asn Val Val Asp Ile Gly Lys Leu Lys Tyr Val Lys Glu Asp Glu Thr Val Thr Ile Ile Gly Leu Val Asn Ser Lys Arg Glu Val Asn Lys Gly Leu Ile Phe Glu Ile Glu Asp Leu Thr Gly Lys Val Lys Val Phe Leu Pro Lys Asp Ser Glu Asp Tyr Arg Glu Ala Phe Lys Val Leu Pro Asp Ala Val Val Ala Phe Lys Gly Val Tyr Ser Lys Arg Gly Ile Leu Tyr Ala Asn Lys Phe Tyr Leu Pro Asp Val Pro Leu Tyr Arg Arg Gln Lys Pro Pro Leu Glu Glu Lys Val Tyr Ala Ile Leu Ile Ser Asp Ile His Val Gly Ser Lys Glu Phe Cys Glu Asn Ala Phe Ile Lys Phe Leu Glu Trp Leu Asn Gly Asn Val Glu Thr Lys Glu Glu Glu Glu Ile Val Ser Arg Val Lys Tyr Leu Ile Ile Ala Gly Asp Val Val Asp Gly Val Gly Val Tyr Pro Gly Gln Tyr Ala Asp Leu Thr Ile Pro Asp Ile Phe Asp Gln Tyr Glu Ala Leu Ala Asn Leu Leu Ser His Val ProLysHisIle ThrMetPhe IleAla ProGly HisAspAla Ala Asn ArgGlnAlaIle ProGlnPro GluPhe TyrLysGlu TyrAlaLys Pro IleTyrLysLeu LysAsnAla ValIle IleSerAsn ProAlaVal Ile ArgLeuHisGly ArgAspPhe LeuIle AlaHisGly ArgGlyIle Glu AspValValGly SerValPro GlyLeu ThrHisHis LysProGly Leu 
Pro Met Val Glu Leu Leu Lys Met Arg His Val Ala Pro Met Phe Gly Gly Lys Val Pro Ile Ala Pro Asp Pro Glu Asp Leu Leu Val Ile Glu Glu Val Pro Asp Val Val His Met Gly His Val His Val Tyr Asp Ala Val Val Tyr Arg Gly Val Gln Leu Val Asn Ser Ala Thr Trp Gln Ala Gln Thr Glu Phe Gln Lys Met Val Asn Ile Val Pro Thr Pro Ala Lys Val Pro Val Val Asp Ile Asp Thr Ala Lys Val Val Lys Val Leu Asp Phe Ser Gly Trp Cys (2) INFORMATION FOR SEQ ID NO: 22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1107 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: Homo Sapiens (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
MetAsp LysArgArg Pro Pro Gly Gly Val Pro Pro Gly Gly Pro Lys ArgAla GlyGlyLeu Trp Asp Asp Ala Pro Trp Pro Arg Asp Asp Ser GlnPhe GluAspLeu Ala Met Glu Met Glu Ala Glu Glu Leu Glu His Arg Leu Gln Glu Gln Glu Glu Glu Glu Leu Gln Ser Val Leu Glu Gly Val Ala Asp Gly Gln Val Pro Pro Ser Ala Ile Asp Pro Arg Trp Leu Arg Pro Thr Pro Pro Ala Leu Asp Pro Gln Thr Glu Pro Leu Ile Phe Gln Gln Leu Glu Ile Asp His Tyr Val Gly Pro Ala Gln Pro Val Pro Gly Gly Pro Pro Pro Ser Arg Gly Ser Val Pro Val Leu Arg Ala Phe Gly Val Thr Asp Glu Gly Phe Ser Val Cys Cys Hip; Ile His Gly Phe Ala Pro Tyr Phe Tyr Thr Pro Ala Pro Pro Gly Phe Gly Pro Glu His Met Gly Asp Leu Gln Arg Glu Leu Asn Leu Ala Ile Ser Arg Asp Ser Arg Gly Gly Arg Glu Leu Thr Gly Pro Ala Val Leu Ala Val Glu Leu Cys Ser Arg Glu Ser Met Phe Gly Tyr His Gly His Gly Pro Ser Pro Phe Leu Arg Ile Thr Val Ala Leu Pro Arg Leu Val Ala Pro Ala Arg Arg Leu Leu Glu Gln Gly Ile Arg Val Ala Gly Leu Gly Thr Pro Ser Phe Ala Pro Tyr Glu Ala Asn Val Asp Phe Glu Ile Arg Phe Met Val Asp Thr Asp Ile Val Gly Cys Asn Trp Leu Glu Leu Pro Ala Gly Lys Tyr Ala Leu Arg Leu Lys Glu Lys Ala Thr Gln Cys Gln Leu Glu Ala Asp Val Leu Trp Ser Asp Val Val Ser His Pro Pro Glu Gly Pro Trp Gln Arg Ile Ala Pro Leu Arg Val Leu Ser Phe Asp Ile Glu Cys Ala Gly Arg Lys Gly Ile Phe Pro Glu Pro Glu Arg Asp Pro Val Ile Gln Ile Cys Ser Leu Gly Leu Arg Trp Gly Glu Pro Glu Pro Phe Leu Arg Leu Ala Leu Thr Leu Arg Pro Cys Ala Pro Ile Leu Gly Ala Lys Val Gln Ser Tyr Glu Lys Glu Glu Asp Leu Leu Gln Ala Trp Ser Thr Phe Ile Arg Ile Met Asp Pro Asp Val Ile Thr Gly Tyr Asn Ile Gln Asn Phe Asp Leu Pro Tyr Leu Ile Ser Arg Ala Gln Thr_ Leu Lys Val Gln Thr Phe Pro Phe Leu Gly Arg Val Ala Gly Leu Cys Ser Asn Ile Arg Asp Ser Ser Phe Gln Ser Lys Gln Thr Gly Arg Arg_ Asp Thr Lys Val Val Ser Met Val Gly Arg Val Gln Met Asp Met Leu Gln Val Leu Leu Arg Glu Tyr Lys Leu Arg Ser His Thr Leu Asn Rla. 
Val Ser Phe His 465 470 4?5 480 Phe Leu Gly Glu Gln Lys Glu Asp Val Gln His Ser Ile Ile Thr Asp Leu Gln Asn Gly Asn Asp Gln Thr Arg Arg Arg Leu Ala Val Tyr Cys Leu Lys Asp Ala Tyr Leu Pro Leu Arg Leu Leu Glu Arg Leu Met Val Leu Val Asn Ala Val Glu Met Ala Arg Val Thr Gly Val Pro Leu Ser Tyr Leu Leu Ser Arg Gly Gln Gln Val Lys Val Val Ser Gln Leu Leu Arg Gln Ala Met His Glu Gly Leu Leu Met Pro Val Val Lys Ser Glu Gly Gly Glu Asp Tyr Thr Gly Ala Thr Val Ile Glu Pro Leu Lys Gly Tyr Tyr Asp Val Pro Ile Ala Thr Leu Asp Phe Ser Ser Leu Tyr Pro Ser Ile Met Met Ala His Asn Leu Cys Tyr Thr Thr Leu Leu Arg Pro Gly Thr Ala Gln Lys Leu Gly Leu Thr Glu Asp Gln Phe Ile Arg Thr Pro Thr Gly Asp Glu Phe Val Lys Thr Ser Val Arg Lys Gly Leu Leu Pro Gln Ile Leu Glu Asn Leu Leu Ser Ala Arg Lys Arg Ala Lys Ala Glu Leu Ala Lys Glu Thr Asp Pro Leu Arg Arg Gln Val Leu Asp Gly Arg Gln Leu Ala Leu Lys Val Ser Ala Asn Ser Val Tyr Gly Phe Thr Gly Ala Gln Val Gly Lys Leu Pro Cys Leu Glu Ile Ser Gln Ser Val _ 44 _ Thr Gly Phe Gly Arg Gln Met Ile Glu Lys Thr Lys Gln Leu Val Glu Ser Lys Tyr Thr Val Glu Asn Gly Tyr Ser Thr Ser Ala Lys Val Val Tyr Gly Asp Thr Asp Ser Val Met Cys Arg Phe Gl.y Val Ser Ser Val Ala Glu Ala Met Ala Leu Gly Arg Glu Ala Ala Asp Trp Val Ser Gly His Phe Pro Ser Pro Ile Arg Leu Glu Phe Glu Lys Val Tyr Phe Pro Tyr Leu Leu Ile Ser Lys Lys Arg Tyr Ala Gly Leu Leu Phe Ser Ser Arg Pro Asp Ala His Asp Arg Met Asp Cys Lys Gly Leu Glu Ala Val Arg Arg Asp Asn Cys Pro Leu Val Ala Asn Leu Val Thr Ala Ser Leu Arg Arg Leu Leu Ile Asp Arg Asp Pro Glu Gly Ala Val Ala His Ala Gln Asp Val Ile Ser Asp Leu Leu Cys Asn Arg IlE= Asp Ile Ser Gln Leu Val Ile Thr Lys Glu Leu Thr Arg Ala Ala Ser_ Asp Tyr Ala Gly Lys Gln Ala His Val Glu Leu Ala Glu Arg Met Ar<_~ Lys Arg Asp Pro Gly Ser Ala Pro Ser Leu Gly Asp Arg Val Pro Tyt: Val Ile Ile Ser Ala Ala Lys Gly Val Ala Ala Tyr Met Lys Ser Glu Asp Pro Leu Phe Val Leu Glu His Ser Leu Pro Ile Asp Thr Gln Tyr Tyr Leu Glu Gln Gln Leu Ala Lys Pro Leu Leu Arg 
Ile Phe Glu Pro Ile Leu Gly Glu Gly Arg Ala Glu Ala Val Leu Leu Arg Gly Asp His Thr Arg Cys Lys Thr Val Leu Thr Gly Lys Val Gly Gly Leu Leu Ala Phe Ala Lys Arg Arg Asn Cys Cys Ile Gly Cys Arg Thr Val Leu Ser His Gln Gly Ala Val Cys Glu Phe Cys Gln Pro Arg Glu Ser Glu Leu Tyr Gln Lys Glu Val Ser His Leu Asn Ala Leu Glu Glu Arg Phe Ser Arg Leu Trp Thr Gln Cys Gln Arg Cys Gln Gly Ser Leu His Glu Asp Val Ile Cys Thr Ser Arg Asp Cys Pro Ile Phe Tyr Met Arg Lys Lys Val Arg Lys Asp Leu Glu Asp Gln Glu Gln Leu Leu Arg Arg Phe Gly Pro Pro Gly Pro Glu Ala Trp (2) INFORMATION FOR SEQ ID NO: 23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 781 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: Archaeoglobus fulgidus (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
Met Glu Arg Val Glu Gly Trp Leu Ile Asp Ala Asp Tyr Glu Thr Ile Gly Gly Lys Ala Val Val Arg Leu Trp Cys Lys Asp Asp Gln Gly Ile Phe Val Ala Tyr Asp Tyr Asn Phe Asp Pro Tyr Phe Tyr Val Ile Gly Val Asp Glu Asp Ile Leu Lys Asn Ala Ala Thr Ser Thr Arg Arg Glu Val Ile Lys Leu Lys Ser Phe Glu Lys Ala Gln Leu Lys Thr Leu Gly Arg Glu Val Glu Gly Tyr Ile Val Tyr Ala His His Pro Gln His Val Pro Lys Leu Arg Asp Tyr Leu Ser Gln Phe Gly Asp Val Arg Glu Ala Asp Ile Pro Phe Ala Tyr Arg Tyr Leu Ile Asp Lys Asp Leu Ala Cys Met Asp Gly Ile Ala Ile Glu Gly Glu Lys Gln Gly Gly Val Ile Arg Ser Tyr Lys Ile Glu Lys Val Glu Arg Ile Pro Arg Met Glu Phe Pro Glu Leu Lys Met Leu Val Phe Asp Cys Glu Met Leu Ser Ser Phe Gly Met Pro Glu Pro Glu Lys Asp Pro Ile Ile Val Ile Ser Val Lys Thr Asn Asp Asp Asp Glu Ile Ile Leu Thr Gly Asp Glu Arg Lys Ile Ile Ser Asp Phe Val Lys Leu Ile Lys Ser Tyr Asp Pro Asp Ile Ile Val Gly Tyr Asn Gln Asp Ala Phe Asp Trp Pro Tyr Leu Arg Lys Arg Ala Glu Arg Trp Asn Ile Pro Leu Asp Val Gly Arg Asp Gly Ser Asn Val Val Phe Arg Gly Gly Arg Pro Lys Ile Thr Gly Arg_ Leu Asn Val Asp Leu Tyr Asp Ile Ala Met Arg Ile Ser Asp Ile Lys Ile Lys Lys Leu Glu Asn Val Ala Glu Phe Leu Gly Thr Lys Ile Glu Ile Ala Asp Ile Glu Ala Lys Asp Ile Tyr Arg Tyr Trp Ser Arg Gly Glu Lys Glu Lys Val Leu Asn Tyr Ala Arg Gln Asp Ala Ile Asn Thr Tyr Leu Ile Ala Lys Glu Leu Leu Pro Met His Tyr Glu Leu Ser Lys Met Ile Arg Leu Pro Val Asp Asp Val Thr Arg Met Gly Arg Gly Lys Gln Val Asp Trp Leu Leu Leu Ser Glu Ala Lys Lys Ile Gly Glu Ile Ala Pro Asn Pro Pro Glu His Ala Glu Ser Tyr Glu Gly Ala Phe Val Leu Glu Pro Glu Arg Gly Leu His Glu Asn Val Ala Cys Leu Asp Phe Ala Ser Met Tyr Pro Ser Ile Met Ile Ala Phe Asn Ile Ser Pro Asp Thr Tyr Gly Cys Arg Asp Asp Cys Tyr Glu Ala Pro Glu Val Gly His Lys Phe Arg Lys Ser Pro Asp Gly Phe Phe Lys Arg Ile Leu Arg Met Leu Ile Glu Lys Arg Arg Glu Leu Lys Val Glu Leu Lys Asn Leu Ser Pro Glu Ser Ser Glu Tyr Lys Leu Leu Asp Ile Lys Gln Gln Thr Leu Lys Val Leu Thr Asn Ser Phe 
Tyr Gly Tyr Met Gly Trp Asn Leu Ala Arg Trp Tyr Cys His Pro Cys Ala Glu Ala Thr Thr Ala Trp Gly Arg His Phe Ile Arg Thr Ser Ala Lys Ile Ala Glu Ser Met Gly Phe Lys Val Leu Tyr Gly Asp Thr Asp Ser Ile Phe Val Thr Lys Ala Gly Met Thr Lys Glu Asp Val Asp Arg Leu Ile Asp Lys Leu His Glu Glu Leu Pro Ile Gln Ile Glu Val Asp Glu Tyr Tyr Ser Ala Ile Phe Phe Val Glu Lys Lys Arg Tyr Ala Gly Leu Thr Glu Asp Gly Arg Leu Val Val Lys Gly Leu Glu Val Arg Arg Gly Asp Trp Cys Glu Leu Ala Lys Lys Val Gln Arg Glu Val Ile Glu Val Ile Leu Lys Glu Lys Asn Pro Glu Lys Ala Leu Ser Leu Val Lys Asp Val Ile Leu Arg Ile Lys Glu Gly Lys Val Ser Leu Glu Glu Val Val Ile Tyr Lys Gly Leu Thr Lys Lys Pro Ser Lys Tyr Glu Ser Met Gln Ala His Val Lys Ala Ala Leu Lys Ala Arg Glu Met Gly Ile Ile Tyr Pro Val Ser Ser Lys Ile Gly Tyr Val Ile Val Lys Gly Ser Gly Asn Ile Gly Asp Arg Ala Tyr Pro Ile Asp Leu Ile Glu Asp Phe Asp Gly Glu Asn Leu Arg Ile Lys Thr Lys Ser Gly Ile Glu Ile Lys Lys Leu Asp Lys Asp Tyr Tyr Ile Asp Asn Gln Ile Ile Pro Ser Val Leu Arg Ile Leu Glu Arg Phe Gly Tyr Thr Glu Ala Ser Leu Lys Gly Ser Ser Gln Met Ser Leu Asp Ser Phe Phe Ser (2) INFORMATION FOR SEQ ID NO: 24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1634 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: Methanococcus jannaschii (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
Met LysIle Ile Asn Gly Asp Asp Met Ala Ser Leu Met Gly Lys Ile Thr AlaVal Ile : Ile Tyr Tyi Leu Lys Tyr Thr Leu Ile Glu Asp Lys Asn Lys Asp Arg PheLys Pro Phe ValGlu Ser Asp Tyr Tyr Ile Leu Leu GluLys Val Glu GluAsp Ile LysIle LysGlu His Asn Glu.
Lys PheLeu LysAsn Asp Leu LysPhe ValGluAsnIle GluVal Leu Leu ValLysLys IleIle Leu Arg GluLys GluValIleLys IleIle Lys AlaThrHis ProGln Lys Val LysLeu ArgLysIleLys GluCys Pro GluIleVal LysGlu Ile Tyr HisAsp IleProPheAla LysArg Glu TyrLeuIle AspAsn Glu Ile ProMet ThrTyrTrpAsp PheGlu Ile AsnLysLys ProVal Ser Ile IlePro LysLeuLysSer ValAla Glu PheAspMet GluVal Tyr Asn AspThr GluProAsnPro GluArg Arg AspProIle LeuMet Ala Ser TrpAsp GluRsnGlyGly LysVal Phe IleThrTyr LysGlu Phe Asn ProAsn IleGluValVal LysAsn His GluLysGlu LeuIle Lys Lys IleGlu ThrLeuLysGlu TyrAsp Ile ValIleTyr ThrTyr Asn Gly Phe AspPheProTyr LeuLys Asp Asn AlaArgAla LysIle Tyr Gly Ile LeuGlyLys AspGly Ile Asp Asn GluGluLeu LysIle Lys Arg GluTyrArgSer TyrIle Gly Gly Met ProGlyArg ValHis Ile Asp IleSerArgArg Leu Leu Tyr . Leu Pro LysLeuThr Lys Thr Leu Tyr Leu Phe Tyr Glu Asp Asn Gly Val Val - Ile Glu Lys Leu Lys Ile His Thr Lys Ile Val Pro Asp Tyr Trp Ala Asn Asn Asp Lys Thr Leu Glu Tyr Ser Leu Gln Ile Asp Ala Lys Tyr Thr Tyr Lys Ile Gly Lys Phe Phe Pro Leu Glu Tyr Val Met Phe Ser Arg Ile Val Asn Gln Thr Phe Glu Ile Thr Arg Pro Met Ser Ser Gly Gln Met Val Glu Tyr Leu Met Lys Arg Ala Phe~ Asn Leu Lys Glu Met Ile Val Pro Asn Lys Pro Glu Glu Glu Tyr Arg Val Asp Arg Arg Leu Thr Thr Tyr Glu Gly Gly Val Lys Glu Pro Glu Met Tyr Lys Gly Phe Glu Asp Ile Ile Ser Met Phe Arg Cys His Pro Thr Asp Lys Gly Lys Val Val Val Lys Gly Lys Ile Val Asn Ile Glu Lys Gly Asp Val Glu 435 440 q45 Gly Asn Tyr Val Leu Gly Asp Gly Trp Gln Lys Lys Ile Val Lys Val Trp Lys Tyr Glu Tyr Glu Glu Leu Ile Asn Val Leu Gly Asn Gly Lys Cys Thr Pro Asn His Lys Pro Leu Arg Tyr Lys His Ile Ile Lys Lys Lys Ile Asn Lys Asn Asp Leu Val Arg Asp Ile Lys Tyr Tyr Ala Ser Leu Leu Thr Lys Phe Lys Glu Gly Lys Leu Ile Lys Gly Leu Cys Asp Phe Glu Thr Ile Gly Asn Glu Lys Tyr Ile Asn Asp Tyr Asp Met Glu Asp Phe Ile Leu Lys Ser Leu Ile Gly Ile Leu Glu Glu Leu Ala Gly His Leu Leu Gly Arg Lys Arg Asp Ile Glu Tyr Phe Asp Ser Ser Arg Lys Arg Ile Glu Ser Asp His Gln Tyr 
Arg Val Glu Ile Thr Val Asn Glu Lys Asp Leu Phe Ile Glu Phe Lys Ile Lys Tyr Ile Phe Lys Lys Asn Tyr Glu Ile Leu Tyr Val Thr Arg Arg Lys Lys Gly Thr Lys Ala Leu Gly Cys Ala Lys Lys Asp Ile Tyr Leu Lys Ile Glu Glu Ile Leu Lys Asn Lys Glu Lys Tyr Leu Pro Asn Ala Ile Leu Arg Gly Phe Phe Glu Gly Asp Gly Tyr Val Asn Thr Val Arg Arg Ala Val Val Val Asn Gln Gly Thr Asn Asn Tyr Asp Lys Ile Lys Phe I1~~ Ala Ser Leu Leu Asp Arg Leu Gly Ile Lys Tyr Ser Phe Tyr Thr Ty.r Ser Tyr Glu Glu 690 695 7p0 Arg Gly Lys Lys Leu Lys Arg Tyr Val Ile Glu IlE= Phe Ser Lys Gly Asp Leu Ile Lys Phe Ser Ile Leu Ile Ser Phe Ile Ser Arg Arg Lys Asn Asn Leu Leu Asn Glu Ile Ile Arg Gln Lys Thr Leu Tyr Lys Ile Gly Asp Tyr Gly Phe Tyr Asp Leu Asp Asp Val Cys Val Ser Leu Glu Ser Tyr Lys Gly Glu Val Tyr Asp Leu Thr Leu Glu Gly Arg Pro Tyr Tyr Phe Ala Asn Gly Ile Leu Thr His Asn Ser Leu Tyr Pro Ser Ile Ile Ile Ser Tyr Asn Ile Ser Pro Asp Thr Leu Rsp Cys Glu Cys Cys Lys Asp Val Ser Glu Lys Ile Leu Gly His Trp Phe Cys Lys Lys Lys Glu Gly Leu Ile Pro Lys Thr Leu Arg Asn Leu Ile Glu Arg Arg Ile Asn Ile Lys Arg Arg Met Lys Lys Met Ala Glu Ile Gly Glu Ile Asn Glu Glu Tyr Asn Leu Leu Asp Tyr Glu Gln Lys Ser Leu Lys Ile Leu Ala Asn Ser Ile Leu Pro Asp Glu Tyr Leu Thr Ile Ile Glu Glu Asp Gly Ile Lys Val Val Lys Ile Gly Glu Tyr Ile Asp Asp Leu Met Arg Lys His Lys Asp Lys Ile Lys Phe Ser Gly Ile Ser Glu Ile Leu Glu Thr Lys Asn Leu Lys Thr Phe Ser Phe Asp Lys Ile Thr Lys Lys Cys Glu Ile Lys Lys Val Lys Ala Leu Ile Arg His Pro Tyr Phe Gly Lys Ala Tyr Lys Ile Lys Leu Arg Ser Gly Arg Thr Ile Lys Val Thr Arg Gly His Ser Leu Phe Lys Tyr Glu Asn Gly Lys I1e Val Glu Val Lys Gly Asp Asp Val Arg Phe Gly Asp Leu Ile Val Val Pro Lys Lys Leu Thr Cys Val Asp Lys Glu Val Val Ile Asn Ile Pro Lys Arg Leu Ile Asn Ala Asp Glu Glu Glu Ile Lys Asp Leu Val Ile Thr Lys His Lys Asp Lys Ala Phe Phe Val Lys Leu Lys Lys Thr Leu Glu Asp Ile Glu Asn Asn Lys Leu Lys Val Ile Phe Asp Asp Cys Ile Leu Tyr Leu Lys Glu Leu Gly Leu Ile Asp Tyr Asn Ile 
Ile Lys Lys Ile Asn Lys Val Asp Ile Lys Ile Leu Asp Glu Glu Lys Phe Lys A1<~ Tyr Lys Lys Tyr Phe Asp Thr Val Ile Glu His Gly Asn Phe Lys Lys Gly Arg Cys Asn Ile Gln Tyr Ile Lys Ile Lys Asp Tyr Ile Ala Asn Ile Pro Asp Lys Glu Phe Glu Asp Cys Glu Ile Gly Ala Tyr Ser Gly Lys Ile Asn Ala Leu Leu Lys Leu Asp Glu Lys Leu Ala Lys Phe Leu Gly Phe Phe Val Thr Arg Gly Arg Leu Lys Lys Gln Lys Leu Lys Gly Glu Thr Val Tyr Glu Ile Ser Val Tyr Lys Ser Leu Pro Glu Tyr Gln Lys Glu Ile Ala Glu Thr Phe Lys Glu Val Phe Gly Ala Gly Ser Met Val Lys Asp Lys Val Thr Met Asp Asn Lys Ile Val Tyr Leu Val Leu Lys Tyr Ile Phe Lys Cys Gly Asp Lys Asp Lys Lys His Ile Pro Glu Glu Leu Phe Leu Ala Ser Glu Ser Val Ile Lys Ser Phe Leu Asp Gly Phe Leu Lys Ala Lys Lys Asn Ser His Lys Gly Thr Ser Thr Phe Met Ala Lys Asp Glu Lys Tyr Leu Asn Gln Leu Met Ile Leu Phe Asn Leu Val Gly Ile Pro Thr Arg Phe Thr Pro Val Lys Asn Lys Gly Tyr Lys :Leu Thr Leu Asn Pro Lys Tyr Gly Thr Val Lys Asp Leu Met Leu Asp Glu Val Lys Glu Ile Glu Ala Phe Glu Tyr Ser Gly Tyr Val Tyr Asp Leu Ser Val Glu Asp Asn Glu Asn Phe Leu Val Asn Asn Ile Tyr Ala His Asn Ser Val Tyr Gly Tyr Leu Ala Phe Pro Arg Ala Arg Phe Tyr Ser Arg Glu Cys Ala Glu Ile Val Thr Tyr Leu Gly Arg Lys Tyr Ile Leu Glu Thr Val Lys Glu Ala Glu Lys Phe Gly Phe Lys Val Leu Tyr Ile Asp Thr Asp Gly Phe Tyr Ala Ile Trp Lys Glu Lys Ile Ser Lys Glu Glu Leu Ile Lys Lys Ala Met Glu Phe Val Glu Tyr Ile Asn Ser Lys Leu Pro Gly Thr Met Glu Leu Glu Phe Glu Gly Tyr Phe Lys Arg Gly Ile Phe Val Thr Lys Lys Arg Tyr Ala Leu Ile Asp Glu Asn Gly Arg Val Thr Val Lys Gly Leu Glu Phe Val Arg Arg Asp Trp Ser Asn Ile Ala Lys Ile Thr Gln Arg Arg Val Leu Glu Ala Leu Leu Val Glu Gly Ser Ile Glu Lys Ala Lys Lys Ile Ile Gln Asp Val Ile Lys Asp Leu Arg Glu Lys Lys Ile Lys Lys Glu Asp Leu Ile Ile Tyr Thr Gln Leu Thr Lys Asp Pro Lys Glu Tyr Lys Thr Thr Ala Pro His Val Glu Ile Ala Lys Lys Leu Met Arg Glu Gly Lys Arg Ile Lys Val Gly Asp Ile Ile Gly Tyr Ile Ile Val Lys Gly Thr Lys Ser Ile Ser Glu Arg 
Ala Lys Leu Pro Glu Glu Val Asp Ile Asp Asp Ile Asp Val Asn Tyr Tyr Ile Asp Asn Gln Ile Leu Pro Pro Val Leu Arg Ile Met Glu Ala Val Gly Val Ser Lys Asn Glu Leu Lys Lys Glu Gly Ala Gln Leu Thr Leu Asp Lys Phe Phe Lys (2) INFORMATION FOR SEQ ID NO: 25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1235 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: Pyrococcus horikoshii (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
Met Ile Leu Asp Ala Asp Tyr Ile Thr Glu Asp Gly Lys Pro Ile Ile Arg Ile Phe Lys Lys Glu Asn Gly Glu Phe Lys Val Glu Tyr Asp Arg Asn Phe Arg Pro Tyr Ile Tyr Ala Leu Leu Arg Asp Asp Ser Ala Ile Asp Glu Ile Lys Lys Ile Thr Ala Gln Arg His Gly Lys Val Val Arg Ile Val Glu Thr Glu Lys Ile Gln Arg Lys Phe Leu Gly Arg Pro Ile Glu Val Trp Lys Leu Tyr Leu Glu His Pro Gln Asp Val Pro Ala Ile Arg Asp Lys Ile Arg Glu His Pro Ala Val Val Asp Ile Phe Glu Tyr Asp Ile Pro Phe Ala Lys Arg Tyr Leu Ile Asp Lys Gly Leu Thr Pro Met Glu Gly Asn Glu Lys Leu Thr Phe Leu Ala Val Asp Ile Glu Thr Leu Tyr His Glu Gly Glu Glu Phe Gly Lys Gly Pro Val Ile Met Ile Ser Tyr Ala Asp Glu Glu Gly Ala Lys Val Ile Thr Trp Lys Lys Ile Asp Leu Pro Tyr Val Glu Val Val Ser Ser Glu Arg Glu Met Ile Lys Arg Leu Ile Arg Val Ile Lys Glu Lys Asp Pro Asp Val Ile Ile Thr Tyr Asn Gly Asp Asn Phe Asp Phe Pro Tyr Leu Leu Lys Arg Ala Glu Lys Leu Gly Ile Lys Leu Leu Leu Gly Arg Asp Asn Ser Glu Pro Lys MetGln LysMet Gly LeuAla GluIle:Lys Gly Asp Val Arg Ser Ile HisPhe AspLeu PheProVal IleArgArg ThrIleAsn Leu Thr Pro TyrThr LeuGlu AlaValTyr GluAlaIle PheGlyLys ProLysGlu LysVal TyrAla AspGluIle AlaLysAla TrpGluThr GlyGluGly LeuGlu ArgVal AlaLysTyr SerMetGlu AspAlaLys ValThrTyr GluLeu GlyArg GluPhePhe ProMetGlu AlaGlnLeu AlaArgLeu ValGly GlnPro ValTrpAsp ValSerArg SerSerThr GlyAsnLeu ValGlu TrpPhe LeuLeuArg LysAlaTyr GluArgAsn GluLeuAla ProAsn LysPro AspGluLys GluTyrGlu ArgArgLeu ArgGluSer TyrGlu GlyGly TyrValLys GluProGlu LysGlyLeu TrpGluGly IleVal SerLeu AspPheArg SerLeuTyr ProSerIle IleIleThr HisAsn ValSer ProAspThr LeuRsnArg GluGlyCys GluGluTyr AspVal AlaPro LysValGly HisArgPhe CysLysAsp PheProGly PheIle ProSer LeuLeuGly GlnLeuLeu GluGluArg GlnLysIle LysLys ArgMet LysGluSer LysAspPro ValGluLys LysLeuLeu 465 470 475 4g0 AspTyr ArgGln ArgAlaIle LysIleLeu AlaAsnSer IleLeuPro AspGlu TrpLeu ProIleVal GluAsnGlu LysValArg PheValLys IleGly AspPhe IleAspArg GluIleGlu GluAsnAla GluArgVal LysArg Gly GluThrGlu IleLeuGlu Lys 
LeuLysAla Asp Val Asp LeuSer PheAsn GluThr LysLysSer LeuLys LysValLys Arg Glu AlaLeu Ile Tyr Lys Ser IleLysLeu Arg Ser Val His Gly Tyr Arg Lys Ser Gly Arg Arg Ile Lys Ile Thr Ser Gly His Ser Leu Phe Ser Val Lys Asn Gly Lys Leu Val Lys Val Arg Gly Asp Glu Leu Lys Pro Gly Asp Leu Val Val Val Pro Gly Arg Leu Lys Leu Pro Glu Ser Lys Gln Val Leu Asn Leu Val Glu Leu Leu Leu Lys Leu Pro Glu Glu Glu Thr Ser Asn Ile Val Met Met Ile Pro Val Lys Gly Arg Lys Asn Phe Phe Lys Gly Met Leu Lys Thr Leu Tyr Trp Ile Phe Gly Glu Gly Glu Arg Pro Arg Thr Ala Gly Arg Tyr Leu Lys His Leu Glu Arg Leu Gly Tyr Val Lys Leu Lys Arg Arg Gly Cys Glu Val Leu Asp Trp Glu Ser Leu Lys Arg Tyr Arg Lys Leu Tyr Glu Thr Leu Ile Lys Asn Leu Lys Tyr Asn Gly Asn Ser Arg Ala Tyr Met Val Glu Phe Asn Ser Leu Arg Asp Val Val Ser Leu Met Pro Ile Glu Glu Leu Lys Glu Trp Ile Ile Gly Glu Pro Arg Gly Pro Lys Ile Gly Thr Phe Ile Asp Val Asp Asp Ser Phe Ala Lys Leu Leu Gly Tyr Tyr Ile Ser Ser Gly Asp Val Glu Lys Asp Arg Val Lys Phe His Ser Lys Asp Gln Asn Val Leu Glu Asp Ile Ala Lys Leu Ala Glu Lys Leu Phe Gly Lys Val Arg Arg Gly Arg Gly Tyr Ile Glu Val Ser Gly Lys Ile Ser His Ala Ile Phe Arg Val Leu Ala Glu Gly Lys Arg Ile Pro Glu Phe Ile Phe Thr Ser Pro Met Asp Ile Lys Val Ala Phe Leu Lys Gly Leu Asn Gly Asn Ala Glu Glu Leu Thr Phe Ser Thr Lys Ser Glu Leu Leu Val Asn Gln Leu Ile Leu Leu Leu Asn Ser Ile Gly Val Ser Asp Ile Lys Ile Glu His Glu Lys Gly Val Tyr Arg Val Tyr Ile Asn Lys Lys Glu Ser Ser Asn Gly Asp Ile Val Leu Asp Ser Val Glu Ser Ile Glu Val Glu Lys Tyr Glu Gly Tyr Val TyrAsp LeuSerVal GluAspAsn GluASIlPheLeuValGly 930 935 9qp Phe Gly LeuLeu TyrAlaHis AsnSerTyr TyrGly TyrTyrGlyTyr Ala Lys AlaArg TrpTyrCys LysGluCys AlaGlu SerValThrAla Trp Gly ArgGln TyrIleAsp LeuValArg ArgGlu LeuGluAlaArg Gly Phe LysVal LeuTyrIle AspThrAsp GlyLeu TyrAlaThrIle Pro Gly ValLys AspTrpGlu GluValLys ArgArg AlaLeuGluPhe Val Asp TyrIle AsnSerLys LeuProGly ValLeu GluLeuGluTyr Glu Gly PheTyr AlaArgGly PhePheVal ThrLys LysLysTyrAla 
Leu Ile Asp Glu Glu Gly Lys Ile Val Thr Arg Gly Leu Glu Ile Val Arg Arg Asp Trp Ser Glu Ile Ala Lys Glu Thr Gln Ala Arg Val Leu Glu Ala Ile Leu Lys His Gly Asn Val Glu Glu Ala Val Lys Ile Val Lys Asp Val Thr Glu Lys Leu Thr Asn Tyr Glu Val Pro Pro Glu Lys Leu Val Ile Tyr Glu Gln Ile Thr Arg Pro Ile Asn Glu Tyr Lys Ala Ile Gly Pro His Val Ala Val Ala Lys Arg Leu Met Ala Arg Gly Ile Lys Val Lys Pro Gly Met Val Ile Gly Tyr Ile Val Leu Arg Gly Asp Gly Pro Ile Ser Lys Arg Ala Ile Ser Ile Glu Glu Phe Asp Pro Arg Lys His Lys Tyr Asp Ala Glu Tyr Tyr Ile Glu Asn Val Leu Pro Gln Ala Val Glu Arg Ile Leu Lys Phe Gly Lys Glu Leu Ala Tyr Arg Asp Arg Trp Lys Lys Gly Leu Ile Val Gln Thr Gln Gly Lys Val Ala Trp Lys Lys Ser (2) INFORMATION FOR SEQ ID NO: 26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 586 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: Methanobacterium thermoautotrophicum (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
Met Glu Asp Tyr Arg Met Val Leu Leu Asp Ile Asp Tyr Val Thr Val Asp Glu Val Pro Val Ile Arg Leu Phe Gly Lys Asp Lys Ser Gly Gly Asn Glu Pro Ile Ile Ala His Asp Arg Ser Phe Arg Pro Tyr Ile Tyr Ala Ile Pro Thr Asp Leu Asp Glu Cys Leu Arg Glu Leu Glu Glu Leu Glu Leu Glu Lys Leu Glu Val Lys Glu Met Arg Asp Leu Gly Arg Pro Thr Glu Val Ile Arg Ile Glu Phe Arg His Pro Gln Asp Val Pro Lys Ile Arg Asp Arg Ile Arg Asp Leu Glu Ser Val Arg Asp Ile Arg Glu His Asp Ile Pro Phe Tyr Arg Arg Tyr Leu Ile Asp Lys Ser Ile Val Pro Met Glu Glu Leu Glu Phe Gln Gly Val Glu Val Asp Ser Ala Pro Ser Val Thr Thr Asp Val Arg Thr Val Glu Val Thr Gly Arg Val Gln Ser Thr Gly Ser Gly Ala His Gly Leu Asp Ile Leu Ser Phe Asp Ile Glu Val Arg Asn Pro His Gly Met Pro Asp Pro Glu Lys Asp Glu Ile Val Met Ile Gly Val Ala Gly Asn Met Gly Tyr Glu Ser Val Ile Ser Thr Ala Gly Asp His Leu Asp Phe Val Glu Val Val Glu Asp Glu Arg Glu Leu Leu Glu Arg Phe Ala Glu Ile Val Ile Asp Lys Lys Pro Asp Ile Leu Val Gly Tyr Asn Ser Asp Asn Phe Asp Phe Pro Tyr Ile Thr Arg Arg Ala Ala Ile Leu Gly Ala Glu Leu Asp Leu Gly Trp Asp Gly Ser Lys Ile Arg Thr Met Arg Arg Gly Phe Ala Asn Ala Thr Ala Ile Lys Gly Thr Val His Val Asp Leu Tyr Pro Val Met
Arg Arg Tyr Met Asn Leu Asp Arg Tyr Thr Leu Glu Arg Val Tyr Gln Glu Leu Phe Gly Glu Glu Lys Ile Asp Leu Pro Gly Asp Arg Leu Trp Glu Tyr Trp Asp Arg Asp Glu Leu Arg Asp Glu Leu Phe Arg Tyr Ser Leu Asp Asp Val Val Ala Thr His Arg Ile Ala Glu Lys Ile Leu Pro Leu Asn Leu Glu Leu Thr Arg Leu Val Gly Gln Pro Leu Phe Asp Ile Ser Arg Met Ala Thr Gly Gln Gln Ala Glu Trp Phe Leu Val Arg Lys Ala Tyr Gln Tyr Gly Glu Leu Val Pro Asn Lys Pro Ser Gln Ser Asp Phe Ser Ser Arg Arg Gly Arg Arg Ala Val Gly Gly Tyr Val Lys Glu Pro Glu Lys Gly Leu His Glu Asn Ile Val Gln Phe Asp Phe Arg Ser Leu Tyr Pro Ser Ile Ile Ile Ser Lys Asn Ile Ser Pro Asp Thr Leu Thr Asp Asp Glu Glu Ser Glu Cys Tyr Val Ala Pro Glu Tyr Gly Tyr Arg Phe Arg Lys Ser Pro Arg Gly Phe Val Pro Ser Val Ile Gly Glu Ile Leu Ser Glu Arg Val Arg Ile Lys Glu Glu Met Lys Gly Ser Asp Asp Pro Met Glu Arg Lys Ile Leu Asn Val Gln Gln Glu Ala Leu Lys Arg Leu Ala Asn Thr Met Tyr Gly Val Tyr Gly Tyr Ser Arg Phe Arg Trp Tyr Ser Met Glu Cys Ala Glu Ala Ile Thr Ala Trp Gly Arg Asp Tyr Ile Lys Lys Thr Ile Lys Thr Ala Glu Glu Phe Gly Phe His Thr Val Tyr Ala Asp Thr Asp Gly Phe Tyr Ala Thr Tyr Arg Gly (2) INFORMATION FOR SEQ ID NO: 27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1143 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: Archaeoglobus fulgidus (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
Met Asp Ala Thr Leu Asp Arg Phe Phe Pro Leu Phe Glu Ser Glu Ser Asn Glu Asp Phe Trp Arg Ile Glu Glu Ile Arg Arg~ Tyr His Glu Ser Leu Met Val Glu Leu Asp Arg Ile Tyr Arg Ile Ala Glu Ala Ala Arg Lys Lys Gly Leu Asp Pro Glu Leu Ser Val Glu Ile Pro Ile Ala Lys Asn Met Ala Glu Arg Val Glu Lys Leu Met Asn Leu Gln Gly Leu Ala Lys Arg Ile Met Glu Leu Glu Glu Gly Gly Leu Ser Arg Glu Leu Ile Cys Phe Lys Val Ala Asp Glu Ile Val Glu Gly Lys Phe Gly Glu Met Pro Lys Glu Glu Ala Ile Asp Lys Ala Val Arg Thr Ala Val Ala Ile Met Thr Glu Gly Val Val Ala Ala Pro Ile Glu Gly Ile Ala Arg Val Arg Ile Asp Arg Glu Asn Phe Leu Arg Val Tyr Tyr Ala Gly Pro Ile Arg Ser Ala Gly Gly Thr Ala Gln Val Ile Ser Val Leu Val Ala Asp Tyr Val Arg Arg Lys Ala Glu Ile Gly Arg Tyr Val Pro Thr Glu Glu Glu Ile Leu Rrg Tyr Cys Glu Glu Ile Pro Leu Tyr Lys Lys Val Ala Asn Leu Gln Tyr Leu Pro Ser Asp Glu Glu Ile Arg Leu Ile Val Ser Asn Cys Pro Ile Cys Ile Asp Gly Glu Pro Thr Glu Ser Ala Glu Val Ser Gly Tyr Arg Asn Leu Pro Arg Val Glu Thr Asn Arg Val Arg Gly Gly Met Ala Leu Val Ile Ala Glu Gly Ile Ala Leu Lys Ala Pro Lys Leu Lys Lys Met Val Asp Glu Val Gly Ile Glu Gly Trp Glu Trp Leu Asp Ala Leu Ile Lys Gly Gly Gly Asp Ser Gly Set: Glu Glu Glu Lys Ala Val Ile Lys Pro Lys Asp Lys Tyr Leu Ser Asp Ile Val Ala Gly Arg Pro Val Leu Ser His Pro Ser Arg Lys Gly Gly Phe Arg Leu Arg Tyr Gly Arg Ala Arg Asn Ser Gly Phe Ala Thr Val Gly Val Asn Pro Ala Thr Met Tyr Leu Leu Glu Phe Val Ala Val Gly Thr Gln Leu Lys Val Glu Arg Pro Gly Lys Ala Gly Gly Val Val Pro Val Ser Thr Ile Glu Gly Pro Thr Val Arg Leu Lys Asn Gly Asp Val Val Lys Ile Asn Thr Leu Ser Glu Ala Lys Ala Leu Lys Gly Glu Val Ala Ala Ile Leu Asp Leu Gly Glu Ile Leu Ile Asn Tyr Gly Asp Phe Leu Glu Asn Asn His Pro Leu Ile Pro Ala Ser Tyr Thr Tyr Glu Trp Trp Ile Gln Glu Ala Glu Lys Ala Gly Leu Arg Gly Asp Tyr Arg Lys Ile Ser Glu Glu Glu Ala Leu Lys Leu Cys Asp Glu Phe His Val Pro Leu His Pro Asp Tyr Thr Tyr Leu Trp His Asp Ile Ser Val Glu Asp Tyr Arg Tyr Leu Arg Asn Phe 
Val Ser Asp Asn Gly Lys Ile Glu Gly Lys His Gly Lys Ser Val Leu Leu Leu Pro Tyr Asp Ser Arg Val Lys Glu Ile Leu Glu Ala Leu Leu Leu Glu His Lys Val Arg Glu Ser Phe Ile Val Ile Glu Thr Trp Arg Ala Phe Ile Arg Cys Leu Gly Leu Asp Glu Lys Leu Ser Lys Val Ser Glu Val Ser Gly Lys Asp Val Leu Glu Ile Val Asn Gly Ile Ser Gly Ile Lys Val Arg Pro Lys Ala Leu Ser Arg Ile Gly Ala Arg Met Gly Arg Pro Glu Lys Ala Lys Glu Arg Lys Met Ser Pro Pro Pro His Ile Leu Phe Pro Val Gly Met Ala Gly Gly Asn Thr Arg Asp Ile Lys Asn Ala Ile Asn Tyr Thr Lys Ser Tyr Asn Ala Lys Lys Gly Glu Ile Glu Val Glu Ile Ala Ile Arg Lys Cys Pro Gln Cys Gly Lys Glu Thr Phe Trp Leu Lys Cys Asp Val Cys Gly Glu. Leu Thr Glu Gln Leu Tyr Tyr Cys Pro Ser Cys Arg Met Lys Asn Thr Ser Ser Val Cys Glu Ser Cys Gly Arg Glu Cys Glu Gly Tyr Met Lys Arg Lys Val Asp Leu Arg Glu Leu Tyr Glu Glu Ala Ile Ala Asn Leu Gly Glu Tyr Asp Ser Phe Asp Thr Ile Lys Gly Val Lys Gly Met Thr Ser Lys Thr Lys Ile Pro Glu Arg Leu Glu Lys Gly Ile Leu Arg Val Lys His Gly Val Phe Val Phe Lys Asp Gly Thr Ala Arg Phe Asp Ala Thr Asp Leu Pro Ile Thr His Phe Lys Pro Ala Glu Ile Gly Val Ser Val Glu Lys Leu Arg Glu Leu Gly Tyr Glu Arg Asp Tyr Lys Gly Ala Glu Leu Lys Asn Glu Asn Gln Ile Val Glu Leu Lys Pro Gln Asp Val Ile Leu Pro Lys Ser Gly Ala Glu Tyr Leu Leu Arg Val Ala Asn Phe Ile Asp Asp Leu Leu Val Lys Phe Tyr Lys Met Glu Pro Phe Tyr Asn Ala Lys Ser Val Glu Asp Leu Ile Gly His Leu Val Ile Gly Leu Ala Pro His Thr Ser Ala Gly Val Leu Gly Arg Ile Ile Gly Phe Ser Asp Val Leu Ala Gly Tyr Ala His Pro Tyr Phe His Rla Ala Lys Arg Arg .Asn Cys Asp Gly Asp Glu Asp Cys Phe Met Leu Leu Leu Asp Gly Leu Leu Asn Phe Ser Arg Lys Phe Leu Pro Asp Lys Arg Gly Gly Gln Met. 
Asp Ala Pro Leu Val Leu Thr Ala Ile Val Asp Pro Arg Glu Val Asp Lys Glu Val His Asn Met Asp Ile Val Glu Arg Tyr Pro Leu Glu Phe Tyr Glu Ala Thr Met Arg Phe Ala Ser Pro Lys Glu Met Glu Asp Tyr Val Glu Lys Val Lys Asp Arg Leu Lys Asp Glu Ser Arg Phe Cys Gly Leu Phe Phe Thr His Asp Thr Glu Asn Ile Ala Ala Gly Val Lys Glu Ser Ala Tyr Lys Ser Leu Lys Thr Met Gln Asp Lys Val Tyr Arg Gln Met Glu Leu Ala Arg Met Ile Val Ala Val Asp Glu His Asp Val Ala Glu Arg Val Ile Asn Val His Phe Leu Pro Asp Ile Ile Gly Asn Leu Arg Ala Phe Ser Arg Gln Glu Phe Arg Cys Thr Arg Cys Asn Thr Lys Tyr Arg Arg Ile Pro Leu Val Gly Lys Cys Leu Lys Cys Gly Asn Lys Leu Thr Leu Thr Val His Ser Ser Ser Ile Met Lys Tyr Leu Glu Leu Ser Lys Phe Leu Cys Glu Asn Phe Asn Val Ser Ser Tyr Thr Lys Gln Arg Leu Met Leu Leu Glu Gln Glu Ile Lys Ser Met Phe Glu Asn Gly Thr Glu Lys Gln Val Ser Ile Ser Asp Phe Val (2) INFORMATION FOR SEQ ID NO: 28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1139 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: Methanococcus jannaschii (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
Met Ile Val Met Val His Val Ala Cys Ser Glu Asn Met Lys Lys Tyr Phe Glu Asn Ile Val Asp Glu Val Lys Lys Ile Tyr Rrg Ile Ala Glu Glu Cys Arg Lys Lys Gly Phe Asp Pro Thr Asp Gl.u Val Glu Ile Pro Leu Ala Ala Asp Met Ala Asp Arg Val Glu Gly Leu Val Gly Pro Lys Gly Val Ala Glu Arg Ile Arg Glu Leu Val Lys Glu Leu Gly Lys Glu Pro Ala Ala Leu Glu Ile Ala Lys Glu Ile Val Glu Gly Lys Phe Gly Asn Phe Asp Lys Glu Lys Lys Ala Glu Gln Ala Val Arg Thr Ala Leu Ala Val Leu Thr Glu Gly Ile Val Ala Ala Pro Leu Glu Gly Ile Ala Asp Val Lys Ile Lys Lys Asn Pro Asp Gly Thr Glu Tyr Leu Ala Ile Tyr Tyr Ala Gly Pro Ile Arg Ser Ala Gly Gly Thr Ala Gln Ala Leu Ser Val Leu Val Gly Asp Phe Val Arg Lys Ala Met Gly Leu Asp Arg Tyr Lys Pro Thr Glu Asp Glu Ile Glu Arg Tyr Val Glu Glu Val Glu Leu Tyr Gln Ser Glu Val Gly Ser Phe Gln Tyr Asn Pro Thr Ala Asp Glu Ile Arg Thr Ala Ile Arg Asn Ile Pro Ile Glu Ile Thr Gly Glu Ala Thr Asp Asp Val Glu Val Ser Gly His Arg Asp Leu Pro Arg Val Glu Thr Asn Gln Leu Arg Gly Gly Ala Leu Leu Val Leu Val Glu Gly Val Leu Leu Lys Ala Pro Lys Ile Leu Arg His Val .asp Lys Leu Gly Ile Glu Gly Trp Asp Trp Leu Lys Asp Leu Met Ser Lys Lys Glu Glu Lys Glu Glu Glu Lys Asp Glu Lys Val Asp Asp Glu Glu Ile Asp Glu Glu Glu Glu Glu Ile Ser Gly Tyr Trp Arg Asp Val hys Ile Glu Ala ° Asn Lys Lys Phe Ile Ser Glu Val Ile Ala Gly Arg Pro Val Phe Ala His Pro Ser Lys Val Gly Gly Phe Arg Leu Arg Tyr Gly Arg Ser Arg Asn Thr Gly Phe Ala Thr Gln Gly Phe His Pro Ala Leu Met Tyr Leu Val Asp Glu Phe Met Ala Val Gly Thr Gln Leu Lys Thr Glu Arg Pro Gly Lys Ala Thr Cys Val Val Pro Val Asp Ser Ile Glu Pro Pro Ile Val Lys Leu Lys Asn Gly Asp Val Ile Arg Val Asp Thr Ile Glu Lys Ala Met Asp Val Arg Asn Arg Val Glu Glu Ile Leu Phe Leu Gly Asp Val Leu Val Asn Tyr Gly Asp Phe Leu Glu Asn Asn His Pro Leu Leu Pro Ser Cys Trp Cys Glu Glu Trp Tyr Glu Lys Ile Leu Ile Ala Asn Asn Ile Glu Tyr Asp Lys Asp Phe Ile Lys Asn Pro Lys Pro Glu Glu Ala Val Lys Phe Ala Leu Glu Thr Lys Thr Pro Leu His Pro Arg Phe Thr Tyr His 
Trp His Asp Val Ser Lys Glu Asp Ile Ile Leu Leu Arg Asn Trp Leu Leu Lys Gly Lys Glu Asp Ser Leu Glu Gly Lys Lys Val Trp Ile Val Asp Leu Glu Ile Glu Glu Asp Lys Lys Ala Lys Arg Ile Leu Glu Leu Ile Gly Cys Cys His Leu Val Arg Asn Lys Lys Val Ile Ile Glu Glu Tyr Tyr Pro Leu Leu Tyr Ser Leu Gly Phe Asp Val Glu Asn Lys Lys Asp Leu Val Glu Asn Ile Glu Lys Ile Leu Glu Ser Ala Lys Asn Ser Met His Leu Ile Asn Leu Leu Ala Pro Phe Glu Val Arg Arg Asn Thr Tyr Val Tyr Val Gly Ala Arg Met Gly .Arg Pro Glu Lys Ala Ala Pro Arg Lys Met Lys Pro Pro Val Asn Gly Leu Phe Pro Ile Gly Asn Ala Gly Gly Gln Val Arg Leu Ile Asn Lys ;41a Val Glu Glu Asn Asn Thr Asp Asp Val Asp Val Ser Tyr Thr Arg Cys Pro Asn Cys Gly Lys Ile Ser Leu Tyr Arg Val Cys Pro Phe Cys Gly Thr Lys Val Glu Leu Asp Asn Phe Gly Arg Ile Lys Ala Pro Leu Lys Asp Tyr Trp Tyr Ala Ala Leu Lys Arg Leu Gly Ile Asn Lys Pro Gly Asp Val Lys Cys Ile Lys Gly Met Thr Ser Lys Gln Lys Ile Val Glu Pro Leu Glu Lys Ala Ile Leu Arg Ala Ile Asn Glu Val Tyr Val Phe Lys Asp Gly Thr Thr Arg Phe Asp Cys Thr Asp Val Pro Val Thr His Phe Lys Pro Asn Glu Ile Asn Val Thr Val Glu Lys Leu Arg Glu Leu Gly Tyr Asp Lys Asp Ile Tyr Gly Asn Glu Leu Val Asp Gly Glu Gln Val Val Glu Leu Lys Pro Gln Asp Val Ile Ile Pro Glu Ser Cys Ala Glu Tyr Phe Val Lys Val Ala Asn Phe Ile Asp Asp Leu Leu Glu Lys Phe Tyr Lys Val Glu Arg Phe Tyr Asn Val Lys Lys Lys Glu Asp Leu Ile Gly His Leu Val Ile Gly Met Ala Pro His Thr Ser Ala Gly Met Val Gly Arg Ile Ile Gly Tyr Thr Lys Ala Asn Val Gly Tyr Ala His Pro Tyr Phe His Ala Ala Lys Arg Arg Asn Cys Asp Gly Asp Glu Asp Ser Phe Phe Leu Leu Leu Asp Ala Phe Leu Asn Phe Ser Lys Lys Phe Leu Pro Asp Lys Arg Gly Gly Gln Met Asp Ala Pro Leu Val Leu Thr Thr Ile Leu Asp Pro Lys Glu Val Asp Gly Glu Val His Asn Met Asp Thr Met Trp Ser Tyr Pro Leu Glu Phe Tyr Glu Lys Thr Leu Glu Met Pro Ser Pro Lys Glu Val Lys Glu Phe Met Glu Thr Val Glu Asp Arg Leu Gly Lys Pro Glu Gln Tyr Glu Gly Ile Gly Tyr Thr His Glu 'Phr Ser Arg Ile AspLeuGly ProLysVal 
Cys Ala Tyr Lys Thr Leu Gly Ser Met Leu Glu Lys Thr Thr Ser Gln Leu Ser Val Ala Lys Lys Ile Arg Ala Thr Asp Glu Arg Asp Val Ala Glu Lys Val Ile Gln Ser His Phe Ile Pro Asp Leu Ile Gly Asn Leu Arg Ala Phe Ser Arg Gln Ala Val Arg Cys Lys Cys Gly Ala Lys Tyr Arg Arg Ile Pro Leu Lys Gly Lys Cys Pro Lys Cys Gly Ser Asn Leu Ile Leu Thr Val Ser Lys Gly Ala Val Glu Lys Tyr Met Asp Val Ala Glu Lys Met Ala Glu Glu Tyr Asn Val Asn Asp Tyr Ile Lys Gln Arg Leu Lys Ile Ile Lys Glu Gly Ile Asn Ser Ile Phe Glu Asn Glu Lys Ser Arg Gln Val Lys Leu Ser Asp Phe Phe Lys Ile Gly
(2) INFORMATION FOR SEQ ID NO: 29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1434 amino acids
(B) TYPE: amino acid
(C) STRAND FORM: single strand
(D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(vi) INITIAL ORIGIN:
(A) ORGANISM: Pyrococcus horikoshii
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
MetValLeuMet GluLeuPro LysGlu MetGluGlu TyrPheSer Met LeuGlnArgGlu IleAspLys AlaTyr GluIleAla LysLysAla Arg AlaGlnGlyLys AspProSer LeuAsp ValGluIle ProGlnAla Ser AspMetAlaGly ArgValGlu SerLeu ValGlyPro :ProGlyVal Ala GluArgIleArg GluLeuVal LysGlu TyrGlyLys GluIleAla Ala Leu Lys Ile Val Asp Glu Ile Ile Asp Gly Lys Phe Gly Asp Leu Gly Ser Lys Glu Lys Tyr Ala Glu Gln Ala Val Arg Thr Ala Leu Ala Ile Leu Thr Glu Gly Val Val Ser Ala Pro Ile Glu Gly Ile Ala Ser Val Lys Ile Lys Arg Asn Thr Trp Ser Asp Asn Ser Glu Tyr Leu Ala Leu Tyr Tyr Ala Gly Pro Ile Arg Ser Ser Gly Gly Thr Ala Gln Ala Leu Ser Val Leu Val Gly Asp Tyr Val Arg Arg Lys Leu Gly Leu Asp Arg Phe Lys Pro Ser Glu Lys His Ile Glu Arg Met Val Glu Glu Val Asp Leu Tyr His Arg Thr Val Ser Arg Leu Gln Tyr His Pro Ser Pro Glu Glu Val Arg Leu Ala Met Arg Asn Ile Pro Ile Glu Ile Thr Gly Glu Ala Thr Asp Glu Val Glu Val Ser His Arg Asp Ile Pro Gly Val Glu Thr Asn Gln Leu Arg Gly Gly Ala Ile Leu Val Leu Ala Glu Gly Val Leu Gln Lys Ala Lys Lys Leu Val Lys Tyr Ile Asp Lys Met Gly Ile Glu Gly Trp Glu Trp Leu Lys Glu Phe Val Glu Ala Lys Glu Lys Gly Glu Glu Ile Glu Glu Glu Gly Ser Ala Glu Ser Thr Val Glu Glu Thr Lys Val Glu Val Asp Met Gly Phe Tyr Tyr Ser Leu Tyr Gln Lys Phe Lys Ser Glu Ile Ala Pro Asn Asp Lys Tyr Ala Lys Glu Ile Ile Gly Gly Arg Pro Leu Phe Ser Asp Pro Ser Arg Asn Gly Gly Phe Arg Leu Arg Tyr Gly Arg Ser Arg Val Ser Gly Phe Ala Thr 'rrp Gly Ile Asn Pro Ala Thr Met Ile Leu Val Asp Glu Phe Leu Ala :Cle Gly Thr Gln Leu Lys Thr Glu Arg Pro Gly Lys Gly Ala Val Val Thr Pro Val Thr Thr Ile Glu Gly Pro Ile Val Lys Leu Lys Asp Gly Ser Val Val Lys Val Asp Asp Tyr Lys Leu Ala Leu Lys Ile Arg Asp Glu Val Glu Glu Ile Leu Tyr Leu Gly Asp Ala Val Ile Ala Phe Gly Asp Phe Val Glu Asn Asn Gln Thr Leu Leu Pro Ala Asn Tyr Cys Glu Glu Trp Trp Ile Leu Glu Phe Thr Lys Ala Leu Asn Glu Ile Tyr Glu Val Glu Leu Lys Pro Phe Glu Val Asn Ser Ser Glu Asp Leu Glu Glu Ala Ala Asp Tyr Leu Glu Val Asp Ile Glu Phe Leu Lys Glu Leu Leu Lys Asp Pro 
Leu Arg Thr Lys Pro Pro Val Glu Leu Ala Ile His Phe Ser Glu Ile Leu Gly Ile Pro Leu His Pro Tyr Tyr Thr Leu Tyr Trp Asn Ser Val Lys Pro Glu Gln Val Glu Lys Leu Trp Arg Val Leu Lys Glu His Ala His Ile Asp Trp Asp Asn Phe Arg Gly Ile Lys Phe Ala Arg Arg Ile Val Ile Pro Leu Glu Lys Leu Arg Asp Ser Lys Arg Ala Leu Glu Leu Leu Gly Leu Pro His Lys Val Glu Gly Lys Asn Val Ile Val Asp Tyr Pro Trp Ala Ala Ala Leu Leu Thr Pro Leu Gly Asn Leu Glu Trp Glu Phe Arg Ala Lys Pro Leu His Thr Thr Ile Asp Ile Ile Asn Glu Asn Asn Glu Ile Lys Leu Arg Asp Arg Gly Ile Ser Trp Ile Gly Ala Arg Met Gly Arg Pro Glu Lys Ala Lys Glu Arg Lys Met Lys Pro Pro Val Gln Val Leu Phe Pro Ile Gly Leu Ala Gly Gly Ser Ser Arg Asp Ile Lys Lys Ala Ala Glu Glu Gly Lys Val Ala Glu Val Glu Ile Ala Leu Phe Lys Cys Pro Lys Cys Gly His Val Gly Pro Glu His Ile Cys Pro Asn Cys Gly Thr Arg Lys Glu Leu Ile Trp Val Cys Pro Arg Cys Asn Ala Glu Tyr Pro Glu Ser Gln Ala Ser Gly Tyr Asn Tyr Thr Cys Pro Lys Cys Asn Val Lys Leu Lys Pro Tyr Ala Lys Arg Lys Ile Lys Pro Ser Glu Leu Leu Lys Arg Ala Met Asp Asn Val Lys Val Tyr Gly Ile Asp Lys Leu Lys Gly Val Met Gly Met Thr Ser Gly Trp Lys Met Pro Glu Pro Leu Glu Lys Gly Leu Leu Arg Ala Lys Asn Asp Val Tyr Val Phe Lys Asp Gly Thr Ile Arg Phe Asp Ala Thr Asp Ala Pro Ile Thr His Phe Arg Pro Arg Glu Ile Gly Val Ser Val Glu Lys Leu Arg Glu Leu Gly Tyr Thr His Asp Phe Glu Gly Asn Pro Leu Val Ser Glu Asp Gln Ile Val Glu Leu Lys Pro Gln Asp Ile Ile Leu Sei: Lys Glu Ala Gly Lys Tyr Leu Leu Lys Val Ala Lys Phe Val Asp Asp Leu Leu Glu Lys Phe Tyr Gly Leu Pro Arg Phe Tyr Asn Ala Glu Lys Met Glu Asp Leu Ile Gly His Leu Val Ile Gly Leu Ala Pro His Thr Ser Ala Gly Ile Val Gly Arg Ile Ile Gly Phe Val Asp Ala Leu Val Gly Tyr Ala His Pro Tyr Phe His Ala Ala Lys Arg Arg Asn Cys Phe Pro Gly Asp Thr Arg Ile Leu Val Gln Ile Asn Gly Thr Pro Gln Arg Val Thr Leu Lys Glu Leu Tyr Glu Leu Phe Asp Glu Glu His Tyr Glu Ser Met Val Tyr Val Arg Lys Lys Pro Lys Val Asp Ile Lys Val Tyr Ser Phe Asn Pro Glu Glu 
Gly Lys Val Val Leu Thr Asp Ile Glu Glu Val Ile Lys Ala Pro Ala Thr Asp His Leu Ile Arg Phe Glu Leu Glu Leu Gly Ser Ser Phe Glu Thr Thr Val Asp His Pro Val Leu Val Tyr Glu Asn Gly Lys Phe Val Glu Lys Arg Ala Phe Glu Val Arg Glu Gly Asn Ile Ile Ile Ile Ile Asp Glu Ser Thr Leu Glu Pro Leu Lys Val Ala Val Lys Lys _ _ 70 _ Ile Glu Phe Ile Glu Pro Pro Glu Asp Phe Val Phe Ser Leu Asn Ala 17.00 Lys Lys Tyr His Thr Val Ile Ile Asn Glu Asn Il.e Val Thr His Gln Cys Asp Gly Asp Glu Asp Ala Val Met Leu Leu Leu Asp Ala Leu Leu Asn Phe Ser Arg Tyr Tyr Leu Pro Glu Lys Arg Gly Gly Lys Met Asp Ala Pro Leu Val Ile Thr Thr Arg Leu Asp Pro Arg Glu Val Asp Ser 1155 11.60 1165 Glu Val His Asn Met Asp Ile Val Arg Tyr Tyr Pro Leu Glu Phe Tyr Glu Ala Thr Tyr Glu Leu Lys Ser Pro Lys Glu Leu Val Gly Val Ile Glu Arg Val Glu Asp Arg Leu Gly Lys Pro Glu Met Tyr Tyr Gly Leu Lys Phe Thr His Asp Thr Asp Asp Ile Ala Leu Gly Pro Lys Met Ser Leu Tyr Lys Gln Leu Gly Asp Met Glu Glu Lys Val. Lys Arg Gln Leu Asp Val Ala Arg Arg Ile Arg Ala Val Asp Glu His Lys Val Ala Glu 12E~0 Thr Ile Leu Asn Ser His Leu Ile Pro Asp Leu Arg Gly Asn Leu Arg Ser Phe Thr Arg Gln Glu Phe Arg Cys Val Lys Cys Asn Thr Lys Phe Arg Arg Pro Pro Leu Asp Gly Lys Cys Pro Ile Cys Gly Gly Lys Ile Val Leu Thr Val Ser Lys Gly Ala Ile Glu Lys Tyr Leu Gly Thr Ala Lys Met Leu Val Thr Glu Tyr Lys Val Lys Asn Tyr Thr Arg Gln Rrg Ile Cys Leu Thr Glu Arg Asp Ile Asp Ser Leu Phe Glu Thr Val Phe Pro Glu Thr Gln Leu Thr Leu Leu Val Asn Pro Asn Asp Ile Cys Gln Arg Ile Ile Met Glu Arg Thr Gly Gly Ser Lys Lys Ser Gly Leu Leu Glu Asn Phe Ala Asn Gly Tyr Asn Lys Gly Lys Lys Glu Glu Met Pro Lys Lys Gln Arg Lys Lys Glu Gln Glu Lys Ser Lys Lys Arg Lys Val _ 71 a- Ile Ser Leu Asp Asp Phe Phe Ser Arg Lys (2) INFORMATION FOR SEQ ID N0: 30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1092 amino acids
(B) TYPE: amino acid
(C) STRAND FORM: single strand
(D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(vi) INITIAL ORIGIN:
(A) ORGANISM: Methanobacterium thermoautotrophicum
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
Met Met Asp Tyr Phe Asn Glu Leu Glu Arg Glu Thr Glu Arg Leu Tyr Glu Ile Ala Rrg Lys Ala Arg Ala Arg Gly Leu Asp Val Ser Thr Thr Pro Glu Ile Pro Leu Ala Lys Asp Leu Ala Glu Arg Val Glu Gly Leu Val Gly Pro Glu Gly Ile Ala Arg Arg Ile Lys Glu Leu Glu Gly Asp Arg Gly Arg Glu Glu Val Ala Phe Gln Ile Ala Ala Glu Ile Ala Ser Gln Ala Val Pro Asp Asp Asp Pro Glu Glu Arg Glu Lys Leu Ala Asp Gln Ala Leu Arg Thr Ala Leu Ala Ile Leu Thr Glu Gly Val Val Ala Ala Pro Leu Glu Gly Ile Ala Arg Val Arg Ile Lys Glu Asn Phe Asp Lys Ser Arg Tyr Leu Ala Val Tyr Phe Ala Gly Pro Ile Arg Ser Ala Gly Gly Thr Ala Ala Ala Leu Ser Val Leu Ile Ala Asp Tyr Ile Arg Leu Ala Val Gly Leu Asp Arg Tyr Lys Pro Val Glu Arg Glu Ile Glu Arg Tyr Val Glu Glu Val Glu Leu Tyr Glu Ser Glu Val Thr Asn Leu Gln Tyr Ser Pro Lys Pro Asp Glu Val Arg Leu Ala Ala Ser Lys Ile Pro Val Glu Val Thr Gly Glu Pro Thr Asp Lys Val Glu Val Ser His Arg Asp Leu Glu Arg Val Glu Thr Asn Asn Ile Arg_ Gly Gly Ala Leu Leu Ala Met Val Glu Gly Val Ile Gln Lys Ala Pro Lys Val Leu Lys Tyr Ala Lys Gln Leu Lys Leu Glu Gly Trp Asp Trp Leu Glu Lys Phe Ser Lys Ala Pro Lys Lys Gly Glu Gly Glu Glu Lys, Val Val Val Lys Ala Asp Ser Lys Tyr Val Glu Asp Ile Ile Gly Gly Arg Pro Val Leu Ala Tyr Pro Ser Glu Lys Gly Ala Phe Arg Leu Arg Tyr Gly Arg Ala Arg Asn Thr Gly Leu Ala Ala Met Gly Val His Pro Ala Thr Met Glu Leu Leu Gln Phe Leu Ala Val Gly Thr Gln Met Lys Ile Glu Arg Pro Gly Lys Gly Asn Cys Val Val Pro Val Asp Thr Ile Asp Gly Pro Val Val Lys Leu Arg Asn Gly Asp Val Ile Arg Ile Glu Asp Ala Glu Thr Ala Ser Arg Val Arg Ser Glu Val Glu Glu Ile Leu Phe Leu Gly Asp Met Leu Val Ala Phe Gly Glu Phe Leu Arg Asn Asn His Val Leu Met Pro Ala Gly Trp Cys Glu Glu Trp Trp Ile Gln Thr Ile Leu Ser Ser Pro Lys Tyr Pro Gly Asp Asp Pro Leu Asn Leu Ser Tyr Tyr Arg Thr Arg Trp Asn Glu Leu Glu Val Ser Ala Gly Asp Ala Phe Arg Ile Ser Glu Glu Tyr Asp Val Pro Leu His Pro Arg Tyr Thr Tyr Phe Tyr His Asp Val Thr Val Arg Glu Leu Asn Met Leu Arg Glu Trp Leu Asn Thr Ser Gln Leu 
Glu Asp Glu Leu Val Leu Glu Leu Arg Pro Glu Lys Arg Ile Leu Glu Ile Leu Gly Val Pro His Arg Val Lys Asp Ser Arg Val Val Ile Gly His Asp Asp Ala His Ala Leu Ile Lys Thr Leu Arg Lys Pro Leu Glu Asp Ser Ser Asp Thr Val Glu Ala Leu Asn Arg Val Ser Pro Val Arg Ile Met Lys Lys Ala Pro Thr Tyr Ile Gly Thr Arg Val Gly Arg Pro Glu Lys Thr Lys Glu Arg Lys Met Arg Pro Ala Pro His Val Leu Phe Pro Ile Gly Lys Tyr Gly Gly Ser Arc_~ Arg Asn Ile Pro Asp Ala Ala Lys Lys Gly Ser Ile Thr Val Glu Ile Gly Arg Ala Thr Cys Pro Ser Cys Arg Val Ser Ser Met Gln Ser Ile Cys Pro Ser Cys Gly Ser Arg Thr Val Ile Gly Glu Pro Gly Lys Arg Asn Ile Asn Leu Ala Ala Leu Leu Lys Arg Ala Ala Glu Asn Val Ser Val Arg Lys Leu Asp Glu Ile Lys Gly Val Glu Gly Met Ile Ser Ala Glu Lys Phe Pro Glu Pro Leu Glu Lys Gly Ile Leu Arg Ala Lys Asn Asp Val Tyr Thr Phe Lys Asp Ala Thr Ile Arg His Asp Ser Thr Asp Leu Pro Leu Thr His Phe Thr Pro Arg Glu Val Gly Val Ser Val Glu Arg Leu Arg Glu Leu Gly Tyr Thr Arg Asp Cys Tyr Gly Asp Glu Leu Glu Asp Glu Asp Gln Ile Leu Glu Leu Arg Val Gln Asp Val Val Ile Ser Glu Asp Cys Ala Asp Tyr Leu Val Arg Val Ala Asn Phe Val Asp Asp Leu Leu Glu Rrg Phe Tyr Asp Leu Glu Arg Phe Tyr Asn Val Lys Thr Arg Glu Asp Leu Val Gly His Leu Ile Ala Gly Leu Ala Pro His Thr Ser Ala Ala Val Leu Gly Arg Ile Ile Gly Phe Thr Gly Ala Ser Ala Cys Tyr Ala His Pro Tyr Phe His Ser Ala Lys Arg Arg Asn Cys Asp Ser Asp Glu Asp Ser Val Met Leu Leu Leu Asp Ala Leu Leu Asn Phe Ser Lys Ser Tyr Leu Pro Ser Ser Arg Gly Gly Ser Met Asp Ala Pro Leu Val Leu Ser Thr Arg Ile Asp Pro Glu Glu Ile Asp Asp Glu Ser His Asn Ile _ 79 _ Asp Thr Met Asp Met Ile Pro Leu Glu Val Tyr Glu Arg Ser Phe Asp His Pro Arg Pro Ser Glu Val Leu Asp Val Ile Asp Asn Val Glu Lys Arg Leu Gly Lys Pro Glu Gln Tyr Thr Gly Leu Met Phe Ser His Asn Thr Ser Arg Ile Asp Glu Gly Pro Lys Val Cys Leu Tyr Lys Leu Leu Pro Thr Met Lys Glu Lys Val Glu Ser Gln Ile Thr Leu Ala Glu Lys Ile Arg Ala Val Asp Gln Arg Ser Val Val Glu Gly Val Leu Met Ser His Phe Leu Pro 
Asp Met Met Gly Asn Ile Arg Ala Phe Ser Arg Gln Lys Val Arg Cys Thr Lys Cys Asn Arg Lys Tyr Arg Arg Ile Pro Leu Ser Gly Glu Cys Arg Cys Gly Gly Asn Leu Val Leu Thr Val Ser Lys Gly Ser Val Ile Lys Tyr Leu Glu Ile Ser Lys Glu Leu Ala Ser Arg Tyr Pro Ile Asp Pro Tyr Leu Met Gln Arg Ile Glu Ile Leu Glu Tyr Gly Val Asn Ser Leu Phe Glu Ser Asp Arg Ser Lys Gln Ser Ser Leu Asp Val Phe Leu
(2) INFORMATION FOR SEQ ID NO: 31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1263 amino acids
(B) TYPE: amino acid
(C) STRAND FORM: single strand
(D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(vi) INITIAL ORIGIN:
(A) ORGANISM: Pyrococcus furiosus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
Met Glu Leu Pro Lys Glu Ile Glu Glu Tyr Phe Glu Met Leu Gln Arg Glu Ile Asp Lys Ala Tyr Glu Ile Ala Lys Lys Ala Arg Ser Gln Gly Lys Pro Thr Asp GluIle Thr Asp Ser Val Pro Asp Gln Met Ala Ala Gly ValGluSer Leu GlyProProGly AlaGln ArgIle Arg Val Val ArgGlu LeuLeuLys GluTyr AspLysGluIle ValAlaLeu Ile Lys ValAsp GluIleIle GluGly LysPheGlyAsp PheGlySer LysGlu LysTyr AlaGluGln AlaVal ArgThrAlaLeu AlaIleLeu ThrGlu GlyIle ValSerAla ProLeu GluGlyIleAla AspValLys IleLys ArgAsn ThrTrpAla AspAsn SerGluTyrLeu AlaLeuTyr TyrAla GlyPro IleArgSer SerGly GlyThrAlaGln AlaLeuSer ValLeu ValGly AspTyrVal RrgArg LysLeuGlyLeu AspArgPhe LysPro SerGly LysHisIle GluArg MetValGluGlu ValAspLeu TyrHis ArgAla ValSerArg LeuGln TyrHisProSer ProAspGlu ValArg LeuAla MetArgAsn IlePro IleGluIleThr GlyGluAla ThrAsp AspVal GluValSer HisArg AspValGluGly ValGluThr AsnGln LeuArg GlyGlyAla IleLeu ValLeuAlaGlu GlyValLeu GlnLys AlaLys LysLeuVal LysTyr IleAspLysMet GlyIleAsp GlyTrp GluTrp LeuLysGlu PheVal GluAlaLysGlu LysGlyGlu GluIle GluGlu SerGluSer LysAla GluGluSerLys ValGluThr ArgVal GluVal GluLysGly PheTyr LysLeuTyr GluLysPhe ArgAla Tyr GluIle ProSer GluLys LysGlu IleIleGly Gly Ala Tyr Arg Ala ProLeu Phe ProSer Phe Leu Arg Ala Glu Arg Tyr Gly Asn Gly Gly Gly Ser Pro Arg :Ile Ala Ser Asn Arg Val Ser Gly Phe Ala Thr Trp 355 360 :365 Thr Met Val Leu Val Asp Glu Phe Leu Ala Ile Gly Thr Gln Met Lys Thr Glu Arg Pro Gly Lys Gly Ala Val Val Thr Pro Ala Thr Thr Ala Glu Gly Pro Ile Val Lys Leu Lys Asp Gly Ser Val Val Arg Val Asp Asp Tyr Asn Leu Ala Leu Lys Ile Arg Asp Glu Val Glu Glu Ile Leu Tyr Leu Gly Asp Ala Ile Ile Ala Phe Gly Asp Phe Val Glu Asn Asn Gln Thr Leu Leu Pro Ala Asn Tyr Val Glu Glu Trp Trp Ile Gln Glu Phe Val Lys Ala Val Asn Glu Ala Tyr Glu Val Glu Leu Arg Pro Phe Glu Glu Asn Pro Arg Glu Ser Val Glu Glu Ala Ala Glu Tyr Leu Glu Val Asp Pro Glu Phe Leu Ala Lys Met Leu Tyr Asp Pro Leu Arg Val Lys Pro Pro Val Glu Leu Ala Ile His Phe Ser Glu Ile Leu Glu Ile Pro Leu His Pro Tyr Tyr Thr Leu Tyr Trp Asn Thr 
Val Asn Pro Lys Asp Val Glu Arg Leu Trp Gly Val Leu Lys Asp Lys Ala Thr Ile Glu Trp Gly Thr Phe Arg Gly Ile Lys Phe Ala Lys Lys Ile Glu Ile Ser Leu Asp Asp Leu Gly Ser Leu Lys Arg Thr Leu Glu Leu Leu Gly Leu Pro His Thr Val Arg Glu Gly Ile Val Val Val Asp Tyr Pro Trp Ser Ala Ala Leu Leu Thr Pro Leu Gly Asn Leu Glu Trp Glu Phe Lys Ala Lys Pro Phe Tyr Thr Val Ile Asp Ile Ile Asn Glu Asn Asn Gln Ile Lys Leu Arg Asp Arg Gly Ile Ser Trp Ile Gly Ala Arg Met Gly Arg Pro Glu Lys Ala Lys Glu Arg Lys Met Lys Pro Pro Val Gln Val Leu Phe Pro Ile Gly Leu Ala Gly Gly Ser Ser Arg Asp Ile Lys Lys Ala Ala Glu Glu Gly Lys Ile Ala Glu Val Glu Ile Ala Phe Phe Lys Cys _ 77 _ ProLys CysGly HisValGly ProGluThr Pro GluCysGly Leu Cy~s IleArg LysGlu LeuIleTrp ThrCysPro LysCy~oGly GluTyr Ala ThrAsn SerGln AlaGluGly TyrSerTyr SerCysPro LysCysAsn ValLys LeuLys ProPheThr LysArgLys IleLysPro SerGluLeu LeuAsn ArgAla MetGluAsn ValLysVal TyrGlyVal AspLysLeu LysGly ValMet GlyMetThr SerGlyTrp LysIleAla GluProLeu GluLys GlyLeu LeuArgAla LysAsnGlu ValTyrVal PheLysAsp GlyThr IleArg PheAspAla ThrAspAla ProIleThr HisPheArg ProArg GluIle GlyValSer ValGluLys LeuArgGlu LeuGlyTyr ThrHis AspPhe GluGlyLys ProLeuVal SerGluAsp GlnIleVal GluLeu LysPro GlnAspVal IleLeuSer LysGluAla GlyLysTyr LeuLeu ArgVal AlaArgPhe ValAspAsp LeuLeuGlu LysPheTyr GlyLeu ProArg PheTyrAsn AlaGluLys MetGluAsp LeuIleGly HisLeu ValIle GlyLeuAla ProHisThr SerAlaGly IleValGly ArgIle IleGly PheValAsp AlaLeuVal GlyTyrAla HisProTyr 930 935 94p PheHis AlaAla LysArgArg AsnCysAsp GlyAspGlu AspSerVal MetLeu LeuLeu AspAlaLeu LeuAsnPhe SerArgTyr TyrLeuPro GluLys ArgGly GlyLysMet AspAlaPro LeuValIle ThrThrArg LeuAsp ProArg GluValAsp SerGluVal HisAsnMet AspValVal ArgTyr TyrPro LeuGluPhe GluAla ThrTyrGlu LeuLysSer Tyr ProLys GluLeu Val IleGluGly ValGlu LeuGly Val Asp Arg Arg _ 78 Lys Pro Glu Met Tyr Tyr Gly Ile Lys Phe Thr His Asp Thr Asp Asp Ile Ala Leu Gly Pro Lys Met Ser Leu Tyr Lys Gln Leu Gly Asp Met Glu Glu Lys Val Lys Arg Gln Leu Thr Leu Ala Glu 
Arg Ile Arg Ala Val Asp Gln His Tyr Val Ala Glu Thr Ile Leu Asn Ser His Leu Ile Pro Asp Leu Arg Gly Asn Leu Arg Ser Phe Thr Arg Gln Glu Phe Arg Cys Val Lys Cys Asn Thr Lys Tyr Arg Arg Pro Pro Leu Asp Gly Lys Cys Pro Val Cys Gly Gly Lys Ile Val Leu Thr Val Ser Lys Gly Ala Ile Glu Lys Tyr Leu Gly Thr Ala Lys Met Leu Val Ala Asn Tyr Asn Val Lys Pro Tyr Thr Arg Gln Arg Ile Cys Leu Thr Glu Lys Asp Ile Asp Ser Leu Phe Glu Tyr Leu Phe Pro Glu Ala Gln Leu Thr Leu Ile Val Asp Pro Asn Asp Ile Cys Met Lys Met Ile Lys Glu Arg Thr Gly Glu Thr Val Gln Gly Gly Leu Leu Glu Asn Phe Asn Ser Ser Gly Asn Asn Gly Lys Lys Ile Glu Lys Lys Glu Lys Lys Ala Lys Glu Lys Pro Lys Lys Lys Lys Val Ile Ser Leu Asp Asp Phe Phe Ser Lys Arg
(2) INFORMATION FOR SEQ ID NO: 32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 363 amino acids
(B) TYPE: amino acid
(C) STRAND FORM: single strand
(D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(vi) INITIAL ORIGIN:
(A) ORGANISM: Homo sapiens
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
Met Gln Ala Phe Leu Lys Gly Thr Ser Ile Ser Thr Lys Pro Pro Leu _ 79 _ Thr Lys Asp Arg Gly Val Ala Ala Ser Ala Gly Ser Ser Gly Glu Asn Lys Lys AlaLys ValPro ValGlu Arg Cys Pro Trp Lys Pro Tyr Lys Val Asp GluValAla PheGln GluValVal AlaValLeu Lys Glu Lys Ser Leu GluGly AspLeu AsnLeuLeu PheTyrGly Pro Ala Pro Pro Gly Thr GlyLysThr SerThrIle LeuAlaAla AlaArgGlu LeuPhe Gly Pro GluLeuPhe ArgLeuArg ValLeuGlu LeuAsnAla SerAsp Glu Arg GlyIleGln ValValArg GluLysVal LysAsnPhe AlaGln Leu Thr ValSerGly SerArgSer AspGlyLys ProCysPro ProPhe Lys Ile ValIleLeu AspGluAla AspSerMet ThrSerAla AlaGln Ala Ala LeuArgArg ThrMetGlu LysGluSer LysThrThr ArgPhe Cys Leu IleCysAsn TyrValSer ArgIleIle GluProLeu ThrSer Arg Cys SerLysPhe ArgPheLys ProLeuSer AspLysIle GlnGln Gln Arg LeuLeuAsp IleAlaLys LysGluAsn ValLysIle SerAsp Glu Gly IleAlaTyr LeuValLys ValSerGlu GlyAspLeu ArgLys Ala Ile ThrPheLeu GlnSerAla ThrArgLeu ThrGlyGly LysGlu Ile Thr GluLysVal IleThrAsp IleAlaGly ValIlePro AlaGlu Lys Ile AspGlyVal PheAlaAla CysGlnSer GlySerPhe AspLys Leu Glu AlaValVal LysAspLeu IleAspGlu GlyHisAla AlaThr Gln Leu Val Gln LeuHisAsp ValValVal GluAsnAsn LeuSer Asn .

Asp Lys Gln Ser IleIleThr GluLysLeu AlaGluVal AspLys Lys Cys Leu Gly AspGlu LeuGln LeuIleSer LeuCys Ala Asp Ala His _ 80 _ Ala Thr Val Met Gln Gln Leu Ser Gln Asn Cys (2) INFORMATION FOR SEQ ID NO: 33:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 329 amino acids
(B) TYPE: amino acid
(C) STRAND FORM: single strand
(D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(vi) INITIAL ORIGIN:
(A) ORGANISM: Homo sapiens
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
Asn Leu Val Gln Cys Gly Asp Phe Pro His Leu Leu Val Tyr Gly Pro Ser Gly Ala Gly Lys Lys Thr Arg Ile Met Cys Ile Leu Arg Glu Leu Tyr Gly Val Gly Val Glu Lys Leu Arg Ile Glu His Gln Thr Ile Thr Thr Pro Ser Lys Lys Lys Ile Glu Ile Ser Thr Ile Ala Ser Asn Tyr His Leu Glu Val Asn Pro Ser Asp Ala Gly Asn Ser Asp Arg Val Val Ile Gln Glu Met Leu Lys Thr Val Ala Gln Ser Gln Gln Leu Glu Thr Asn Ser Gln Arg Asp Phe Lys Val Val Leu Leu Thr Glu Val Asp Lys Leu Thr Lys Asp Ala Gln His Ala Leu Arg Arg Thr Met Glu Lys Tyr Met Ser Thr Cys Arg Leu Ile Leu Cys Cys Asn Ser Thr Ser Lys Val Ile Pro Pro Ile Arg Ser Arg Cys Leu Ala Val Arg Val Pro Ala Pro Ser Ile Glu Asp Ile Cys His Val Leu Ser Thr Val Cys Lys Lys Glu Gly Leu Asn Leu Pro Ser Gln Leu Ala His Arg Leu Ala Glu Lys Ser Cys Arg Asn Leu Arg Lys Ala Leu Leu Met Cys Glu Ala Cys Arg Val Gln Gln Tyr Pro Phe Thr Ala Asp Gln Glu Ile Pro Glu Thr Asp Trp Glu Val Tyr Leu Arg Glu Thr Ala Asn Ala Ile Val Ser Gln Gln Thr Pro Gln Arg Leu Leu Glu Val Arg Gly Arg Leu Tyr Glu Leu Leu Thr His Cys Ile Pro Pro Glu Ile Ile Met Lys Gly Leu Leu Ser Glu Leu Leu His Asn Cys Asp Gly Gln Leu Lys Gly Glu Val Ala Gln Met Ala Ala Tyr Tyr Glu His Arg Leu Gln Leu Gly Ser Lys Ala Ile Tyr His Leu Glu Ala Phe Val Ala Lys Phe Met Ala Leu Tyr Lys Lys Phe Ile Gln Asp Gly Leu Glu Gly Met Met Phe
(2) INFORMATION FOR SEQ ID NO: 34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 354 amino acids
(B) TYPE: amino acid
(C) STRAND FORM: single strand
(D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(vi) INITIAL ORIGIN:
(A) ORGANISM: Homo sapiens
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
Met Glu Val Glu Ala Val Cys Gly Gly Ala Gly Glu Val Glu Ala Gln Asp Ser Asp Pro Ala Pro Ala Phe Ser Lys Ala Pro Gly Ser Ala Gly His Tyr Glu Leu Pro Trp Val Glu Lys Tyr Arg Pro Val Lys Leu Asn Glu Ile Val Gly Asn Glu Asp Thr Val Ser Arg Leu Glu Val Phe Ala Arg Glu Gly Asn Val Pro Asn Ile Ile Ile Ala Gly Pro Pro Gly Thr Gly Lys Thr Thr Ser Ile Leu Cys Leu Ala Arg Ala Leu Leu Gly Pro Ala Leu Lys Asp Ala Met Leu Glu Leu Asn Ala Ser Asn Asp Arg Gly Ile Asp Val Val Arg Asn Lys Ile Lys Met Phe Ala Gln Gln Lys Val Thr Leu Pro Lys Gly Arg His Lys Ile Ile Ile Leu Asp Glu Ala Asp Ser Met Thr Asp Gly Ala Gln Gln Ala Leu Arg Arg Thr Met Glu Ile Tyr Ser Lys Thr Thr Arg Phe Ala Leu Ala Cys Asn Ala Ser Asp Lys Ile Ile Glu Pro Ile Gln Ser Arg Cys Ala Val Leu Arg Tyr Thr Lys Leu Thr Asp Ala Gln Ile Leu Thr Arg Leu Met Asn Val Ile Glu Lys Glu Arg Val Pro Tyr Thr Asp Asp Gly Leu Glu Ala Ile Ile Phe Thr Ala Gln Gly Asp Met Arg Gln Ala Leu Asn Asn Leu Gln Ser Thr Phe Ser Gly Phe Gly Phe Ile Asn Ser Glu Asn Val Phe Lys Val Cys Asp Glu Pro His Pro Leu Leu Val Lys Glu Met Ile Gln His Cys Val Asn Ala Asn Ile Asp Glu Ala Tyr Lys Ile Leu Ala His Leu Trp His Leu Gly Tyr Ser Pro Glu Asp Ile Ile Gly Asn Ile Phe Arg Val Cys Lys Thr Phe Gln Met Ala Glu Tyr Leu Lys Leu Glu Phe Ile Lys Glu Ile Gly Tyr Thr His Met Lys Ile Ala Glu Gly Val Asn Ser Leu Leu Gln Met Ala Gly Leu Leu Ala Arg Leu Cys Gln Lys Thr Met Ala Pro Val Ala Ser
(2) INFORMATION FOR SEQ ID NO: 35:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 366 amino acids
(B) TYPE: amino acid
(C) STRAND FORM: single strand
(D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(vi) INITIAL ORIGIN:
(A) ORGANISM: Escherichia coli
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

Met Lys Phe Thr Val Glu Arg Glu His Leu Leu Lys Pro Leu Gln Gln Val Ser Gly Pro Leu Gly Gly Arg Pro Thr Leu Pro Ile Leu Gly Asn Leu Leu Leu Gln Val Ala Asp Gly Thr Leu Ser Leu Thr Gly Thr Asp Leu Glu Met Glu Met Val Ala Arg Val Ala Leu Val Gln Pro His Glu Pro Gly Ala Thr Thr Val Pro Ala Arg Lys Phe Phe Asp Ile Cys Arg Gly Leu Pro Glu Gly Ala Glu Ile Ala Val Gln Leu Glu Gly Glu Arg Met Leu Val Arg Ser Gly Arg Ser Arg Phe Ser Leu Ser Thr Leu Pro Ala Ala Asp Phe Pro Asn Leu Asp Asp Trp Gln Ser Glu Val Glu Phe Thr Leu Pro Gln Ala Thr Met Lys Arg Leu Ile Glu Ala Thr Gln Phe Ser Met Ala His Gln Asp Val Arg Tyr Tyr Leu Asn Gly Met Leu Phe Glu Thr Glu Gly Glu Glu Leu Arg Thr Val Ala Thr Asp Gly His Arg Leu Ala Val Cys Ser Met Pro Ile Gly Gln Ser Leu Pro Ser His Ser Val Ile Val Pro Arg Lys Gly Val Ile Glu Leu Met Arg Met Leu Asp Gly Gly Asp Asn Pro Leu Arg Val Gln Ile Gly Ser Asn Asn Ile Arg Ala His Val Gly Asp Phe Ile Phe Thr Ser Lys Leu Val Asp Gly Arg Phe Pro Asp Tyr Arg Arg Val Leu Pro Lys Asn Pro Asp Lys His Leu Glu Ala Gly Cys Asp Leu Leu Lys Gln Ala Phe Ala Arg Ala Ala Ile Leu Ser Asn Glu Lys Phe Arg Gly Val Arg Leu Tyr Val Ser Glu Asn Gln Leu Lys Ile Thr Ala Asn Asn Pro Glu Gln Glu Glu Ala Glu Glu Ile Leu Asp Val Thr Tyr Ser Gly Ala Glu Met Glu Ile Gly Phe Asn Val Ser Tyr Val Leu Asp Val Leu Asn Ala Leu Lys Cys Glu Asn Val Arg Met Met Leu Thr Asp Ser Val Ser Ser Val Gln Ile Glu Asp Ala Ala Ser Gln Ser Ala Ala Tyr Val Val Met Pro Met Arg Leu
(2) INFORMATION FOR SEQ ID NO: 36:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 363 amino acids
(B) TYPE: amino acid
(C) STRAND FORM: single strand
(D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(vi) INITIAL ORIGIN:
(A) ORGANISM: Aquifex aeolicus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:
Met Arg Val Lys Val Asp Arg Glu Glu Leu Glu Glu Val Leu Lys Lys Ala Arg Glu Ser Thr Glu Lys Lys Ala Ala Leu Pro Ile Leu Ala Asn Phe Leu Leu Ser Ala Lys Glu Glu Asn Leu Ile Val Arg Ala Thr Asp Leu Glu Asn Tyr Leu Val Val Ser Val Lys Gly Glu Val Glu Glu Glu Gly Glu Val Cys Val His Ser Gln Lys Leu Tyr Asp Ile Val Lys Asn Leu Asn Ser Ala Tyr Val Tyr Leu His Thr Glu Gly Glu Lys Leu Val Ile Thr Gly Gly Lys Ser Thr Tyr Lys Leu Pro Thr Ala Pro Ala Glu Asp Phe Pro Glu Phe Pro Glu Ile Val Glu Gly Gly Glu Thr Leu Ser Gly Asn Leu Leu Val Asn Gly Ile Glu Lys Val Glu Tyr Ala Ile Ala Lys Glu Glu Ala Asn Ile Ala Leu Gln Gly Met Tyr Leu Arg Gly Tyr Glu Asp Arg Ile His Phe Val Gly Ser Asp Gly His Arg Leu Ala Leu Tyr Glu Pro Leu Gly Glu Phe Ser Lys Glu Leu Leu Ile Pro Arg Lys Ser Leu Lys Val Leu Lys Lys Leu Ile Thr Gly Ile Glu Asp Val Asn Ile Glu Lys Ser Glu Asp Glu Ser Phe Ala Tyr Phe Ser Thr Pro Glu Trp Lys Leu Ala Val Arg Leu Leu Glu Gly Glu Phe Pro Asp Tyr Met Ser Val Ile Pro Glu Glu Phe Ser Ala Glu Val Leu Phe Glu Thr Glu Glu Val Leu Lys Val Leu Lys Arg Leu Lys Ala Leu Ser Glu Gly Lys Val Phe Pro Val Lys Ile Thr Leu Ser Glu Asn Leu Ala Ile Phe Glu Phe Ala Asp Pro Glu Phe Gly Glu Ala Arg Glu Glu Ile Glu Val Glu Tyr Thr Gly Glu Pro Phe Glu Ile Gly Phe Asn Gly Lys Tyr Leu Met Glu Ala Leu Asp Ala Tyr Asp Ser Glu Arg Val Trp Phe Lys Phe Thr Thr Pro Asp Thr Ala Thr Leu Leu Glu Ala Glu Asp Tyr Glu Lys Glu Pro Tyr Lys Cys Ile Ile Met Pro Met Arg Val
(2) INFORMATION FOR SEQ ID NO: 37:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1160 amino acids
(B) TYPE: amino acid
(C) STRAND FORM: single strand
(D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(vi) INITIAL ORIGIN:
(A) ORGANISM: Escherichia coli
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
Met Ser Glu Pro Arg Phe Val His Leu Arg Val His Ser Asp Tyr Ser Met Ile Asp Gly Leu Ala Lys Thr Ala Pro Leu Val Lys Lys Ala Ala Ala Leu Gly Met Pro Ala Leu Ala Ile Thr Asp Phe Thr Asn Leu Cys Gly Leu Val Lys Phe Tyr Gly Ala Gly His Gly Ala Gly Ile Lys Pro Ile Val Gly Ala Asp Phe Asn Val Gln Cys Asp Leu Leu Gly Asp Glu Leu Thr His Leu Thr Val Leu Ala Ala Asn Asn Thr Gly Tyr Gln Asn Leu Thr Leu Leu Ile Ser Lys Ala Tyr Gln Arg Gly Tyr Gly Ala Ala Gly Pro Ile Ile Asp Arg Asp Trp Leu Ile Glu Leu Asn Glu Gly Leu Ile Leu Leu Ser Gly Gly Arg Met Gly Asp Val Gly Arg Ser Leu Leu Arg Gly Asn Ser Ala Leu Val Asp Glu Cys Val Ala Phe Tyr Glu Glu His Phe Pro Asp Arg Tyr Phe Leu Glu Leu Ile Arg Thr Gly Arg Pro Asp Glu Glu Ser Tyr Leu His Ala Ala Val Glu Leu Ala Glu Ala Arg Gly Leu Pro Val Val Ala Thr Asn Asp Val Arg Phe Ile Asp Ser Ser Asp Phe Asp Ala His Glu Ile Arg Val Ala Ile His Asp Gly Phe Thr Leu Asp Asp Pro Lys Arg Pro Arg Asn Tyr Ser Pro Gln Gln Tyr Met Arg Ser Glu Glu Glu Met Cys Glu Leu Phe Ala Asp Ile Pro Glu Ala Leu Ala Asn Thr Val Glu Ile Ala Lys Arg Cys Asn Val Thr Val Arg Leu Gly Glu Tyr Phe Leu Pro Gln Phe Pro Thr Gly Asp Met Ser Thr Glu Asp Tyr Leu Val Lys Arg Ala Lys Glu Gly Leu Glu Glu Arg Leu Ala Phe Leu Phe Pro Asp Glu Glu Glu Arg Leu Lys Arg Arg Pro Glu Tyr Asp Glu Arg Leu Glu Thr Glu Leu Gln Val Ile Asn Gln Met Gly Phe Pro Gly Tyr Phe Leu Ile Val Met Glu Phe Ile Gln Trp Ser Lys Asp Asn Gly Val Pro Val Gly Pro Gly Arg Gly Ser Gly Ala Gly Ser Leu Val Ala Tyr Ala Leu Lys Ile Thr Asp Leu Asp Pro Leu Glu Phe Asp Leu Leu Phe Glu Arg Phe Leu Asn Pro Glu Arg 'Val Ser Met Pro Asp Phe Asp Val Asp Phe Cys Met Glu Lys Arg Asp Gln Val Ile Glu _ 87 _ His Val Ala Asp Met Tyr Gly Arg Asp Ala Val Ser Gln Ile Ile Thr Phe Gly Thr Met Ala Ala Lys Ala Val Ile Arg Asp Val Gly Arg Val Leu Gly His Pro Tyr Gly Phe Val Asp Arg Ile Ser Lys Leu Ile Pro Pro Asp Pro Gly Met Thr Leu Ala Lys Ala Phe Glu Ala Glu Pro Gln Leu Pro Glu Ile Tyr Glu Ala Asp Glu Glu Val Lys Ala Leu Ile Asp Met Ala 
Arg Lys Leu Glu Gly Val Thr Arg Asn Ala Gly Lys His Ala Gly Gly Val Val Ile Ala Pro Thr Lys Ile Thr Asp Phe Ala Pro Leu Tyr Cys Asp Glu Glu Gly Lys His Pro Val Thr Gln Phe Asp Lys Ser Asp Val Glu Tyr Ala Gly Leu Val Lys Phe Asp Phe Leu Gly Leu Arg Thr Leu Thr Ile Ile Asn Trp Ala Leu Glu Met Ile Asn Lys Arg Arg Ala Lys Asn Gly Glu Pro Pro Leu Asp Ile Ala Ala Ile Pro Leu Asp Asp Lys Lys Ser Phe Asp Met Leu Gln Arg Ser Glu Thr Thr Ala Val Phe Gln Leu Glu Ser Arg Gly Met Lys Asp Leu Ile Lys Arg Leu Gln Pro Asp Cys Phe Glu Asp Met Ile Ala Leu Val Ala Leu Phe Arg Pro Gly Pro Leu Gln Ser Gly Met Val Asp Asn Phe Ile Asp Arg Lys His Gly Arg Glu Glu Ile Ser Tyr Pro Asp Val Gln Trp Gln His Glu Ser Leu Lys Pro Val Leu Glu Pro Thr Tyr Gly Ile Ile Leu Tyr Gln Glu Gln Val Met Gln Ile Ala Gln Val Leu Ser Gly Tyr Thr Leu Gly Gly Ala Asp Met Leu Arg Arg Ala Met Gly Lys Lys Lys Pro Glu Glu Met Ala Lys Gln Arg Ser Val Phe Ala Glu Gly Ala Glu Lys Asn Gly Ile Asn Ala Glu Leu Ala Met Lys Ile Phe Asp Leu Val ~~lu Lys Phe Ala _ 88 _ Gly Tyr Gly Phe Asn Lys Ser His Ser Ala Ala Tyr Ala Leu Val Ser Tyr Gln Thr Leu Trp Leu Lys Ala His Tyr Pro Ala Glu Phe Met Ala Ala Val Met Thr Ala Asp Met Asp Asn Thr Glu Lys Val Val Gly Leu Val Asp Glu Cys Trp Arg Met Gly Leu Lys Ile Leu Pro Pro Asp Ile Asn Ser Gly Leu Tyr His Phe His Val Asn Asp Asp Gly Glu Ile Val Tyr Gly Ile Gly Ala Ile Lys Gly Val Gly Glu Gly Pro Ile Glu Ala Ile Ile Glu Ala Arg Asn Lys Gly Gly Tyr Phe Arg Glu Leu Phe Asp Leu Cys Ala Arg Thr Asp Thr Lys Lys Leu Asn Arg Arg Val Leu Glu 865 870 875 g80 Lys Leu Ile Met Ser Gly Ala Phe Asp Arg Leu Gly Pro His Arg Ala Ala Leu Met Asn Ser Leu Gly Asp Ala Leu Lys Ala Ala Asp Gln His Ala Lys Ala Glu Ala Ile Gly Gln Ala Asp Met Phe Gly Val Leu Ala Glu Glu Pro Glu Gln Ile Glu Gln Ser Tyr Ala Ser Cys Gln Pro Trp Pro Glu Gln Val Val Leu Asp Gly Glu Arg Glu Thr Leu Gly Leu Tyr Leu Thr Gly His Pro Ile Asn Gln Tyr Leu Lys Glu Ile Glu Arg Tyr Val Gly Gly Val Arg Leu Lys Asp Met His Pro Thr Glu Arg Gly Lys 
Val Ile Thr Ala Ala Gly Leu Val Val Ala Ala Arg Val Met Val Thr Lys Arg Gly Asn Arg Ile Gly Ile Cys Thr Leu Asp Asp Arg Ser Gly Arg Leu Glu Val Met Leu Phe Thr Asp Ala Leu Asp Lys Tyr Gln Gln Leu Leu Glu Lys Asp Arg Ile Leu Ile Val Ser Gly Gln Val Ser Phe Asp Asp Phe Ser Gly Gly Leu Lys Met Thr Ala Arg Glu Val Met Asp Ile Asp Glu Ala Arg Glu Lys Tyr Ala Arg Gly Leu Ala Ile Ser Leu Thr Asp Arg Gln Ile Asp Asp Gln Leu Leu Asn Arg Leu Arg Gln Ser Leu Glu Pro His Arg Ser Gly Thr Ile Pro Val His Leu Tyr Tyr Gln Arg Ala Asp Ala Arg Ala Arg Leu Arg Phe Gly Ala Thr Trp Arg Val Ser Pro Ser Asp Arg Leu Leu Asn Asp Leu Arg Gly Leu Ile Gly Ser Glu Gln Val Glu Leu Glu Phe Asp
(2) INFORMATION FOR SEQ ID NO: 38:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1161 amino acids
(B) TYPE: amino acid
(C) STRAND FORM: single strand
(D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(vi) INITIAL ORIGIN:
(A) ORGANISM: Aquifex aeolicus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
Met Ser Lys Asp Phe Val His Leu His Leu His Thr Gln Phe Ser Leu Leu Asp Gly Ala Ile Lys Ile Asp Glu Leu Val Lys Lys Ala Lys Glu Tyr Gly Tyr Lys Ala Val Gly Met Ser Asp His Gly Asn Leu Phe Gly Ser Tyr Lys Phe Tyr Lys Ala Leu Lys Ala Glu Gly Ile Lys Pro Ile Ile Gly Met Glu Ala Tyr Phe Thr Thr Gly Ser Arg Phe Asp Arg Lys Thr Lys Thr Ser Glu Asp Asn Ile Thr Asp Lys Tyr Asn His His Leu Ile Leu Ile Ala Lys Asp Asp Lys Gly Leu Lys Asn Leu Met Lys Leu Ser Thr Leu Ala Tyr Lys Glu Gly Phe Tyr Tyr Lys Pro Arg Ile Asp Tyr Glu Leu Leu Glu Lys Tyr Gly Glu Gly Leu Ile Ala Leu Thr Ala Cys Leu Lys Gly Val Pro Thr Tyr Tyr Ala Ser Ile Asn Glu Val Lys Lys Ala Glu Glu Trp Val Lys Lys Phe Lys Asp Ile Phe Gly Asp Asp Leu Tyr Leu Glu Leu Gln Ala Asn Asn Ile Pro Glu Gln Glu Val Ala Asn Arg Asn Leu Ile Glu Ile Ala Lys Lys Tyr Asp Val Lys Leu Ile Ala Thr Gln Asp Ala His Tyr Leu Asn Pro Glu Asp Arg Tyr Ala His Thr Val Leu Met Ala Leu Gln Met Lys Lys Thr Ile His Glu Leu Ser Ser Gly Asn Phe Lys Cys Ser Asn Glu Asp Leu His Phe Ala Pro Pro Glu Tyr Met Trp Lys Lys Phe Glu Gly Lys Phe Glu Gly Trp Glu Lys Ala Leu Leu Asn Thr Leu Glu Val Met Glu Lys Thr Ala Asp Ser Phe Glu Ile Phe Glu Asn Ser Thr Tyr Leu Leu Pro Lys Tyr Asp Val Pro Pro Asp Lys Thr Leu Glu Glu Tyr Leu Arg Glu Leu Ala Tyr Lys Gly Leu Arg Gln Arg Ile Glu Arg Gly Gln Ala Lys Asp Thr Lys Glu Tyr Trp Glu Arg Leu Glu Tyr Glu Leu Glu Val Ile Asn Lys Met Gly Phe Ala Gly Tyr Phe Leu Ile Val Gln Asp Phe Ile Asn Trp Ala Lys Lys Asn Asp Ile Pro Val Gly Pro Gly Arg Gly Ser Ala Gly Gly Ser Leu Val Ala Tyr Ala Ile Gly Ile Thr Asp Val Asp Pro Ile Lys His Gly Phe Leu Phe Glu Arg Phe Leu Asn Pro Glu Arg Val Ser Met Pro Asp Ile Asp Val Asp Phe Cys Gln Asp Asn Arg Glu Lys 'Val Ile Glu Tyr Val Arg Asn Lys Tyr Gly His Asp Asn Val Ala Gln :Lle Ile Thr Tyr 435 440 .~45 Asn Val Met Lys Ala Lys Gln Thr Leu Arg Asp Val Ala Arg Ala Met Gly Leu Pro Tyr Ser Thr Ala Asp Lys Leu Ala Lys heu Ile Pro Gln Gly Asp Val Gln Gly Thr Trp Leu Ser Leu Glu Glu Met Tyr Lys Thr 
Pro Val Glu Glu Leu Leu Gln Lys Tyr Gly Glu His Arg Thr Asp Ile Glu Asp Asn Val Lys Lys Phe Arg Gln Ile Cys Glu Glu Ser Pro Glu Ile Lys Gln Leu Val Glu Thr Ala Leu Lys Leu Glu Gly Leu Thr Arg His Thr Ser Leu His Ala Ala Gly Val Val Ile Ala Pro Lys Pro Leu Ser Glu Leu Val Pro Leu Tyr Tyr Asp Lys Glu Gly Glu Val Ala Thr Gln Tyr Asp Met Val Gln Leu Glu Glu Leu Gly Leu Leu Lys Met Asp Phe Leu Gly Leu Lys Thr Leu Thr Glu Leu Lys Leu Met Lys Glu Leu Ile Lys Glu Arg His Gly Val Asp Ile Asn Phe Leu Glu Leu Pro Leu Asp Asp Pro Lys Val Tyr Lys Leu Leu Gln Glu Gly Lys Thr Thr Gly Val Phe Gln Leu Glu Ser Arg Gly Met Lys Glu Leu Leu Lys Lys Leu Lys Pro Asp Ser Phe Asp Asp Ile Val Ala Val Leu Ala Leu Tyr Arg Pro Gly Pro Leu Lys Ser Gly Leu Val Asp Thr Tyr Ile Lys Arg Lys His Gly Lys Glu Pro Val Glu Tyr Pro Phe Pro Glu Leu Glu Pro Val Leu Lys Glu Thr Tyr Gly Val Ile Val Tyr Gln Glu Gln Val Met Lys Met Ser Gln Ile Leu Ser Gly Phe Thr Pro Gly Glu Rla Asp Thr Leu Arg Lys Ala Ile Gly Lys Lys Lys Ala Asp Leu Met Ala Gln Met Lys Asp Lys Phe Ile Gln Gly Ala Val Glu Arg Gly Tyr Pro Glu Glu Lys Ile Arg Lys Leu Trp Glu Asp Ile Glu Lys Phe Ala Ser Tyr Ser Phe Asn Lys Ser His Ser Val Ala Tyr Gly Tyr Ile Ser Tyr Trp Thr Ala Tyr Val Lys Ala His Tyr Pro Ala Glu Phe Phe Ala Val Lys Leu Thr Thr Glu Lys Asn Asp Asn Lys Phe Leu Asn Leu Ile Lys Asp Ala Lys Leu Phe Gly Phe Glu Ile Leu Pro Pro Asp Ile Asn Lys Ser Asp Val Gly Phe Thr Ile Glu Gly Glu Asn Arg Ile Arg Phe Gly Leu Ala Arg Ile Lys Gly Val Gly Glu Glu Thr Ala Lys Ile Ile Val Glu Ala Arg Lys Lys Tyr Lys Gln Phe Lys Gly Leu Ala Asp Phe Ile Asn Lys Thr Lys Asn Arg Lys Ile Asn Lys Lys Val Val Glu Ala Leu Val Lys Ala Gly Ala Phe Asp Phe Thr Lys Lys Lys Arg Lys Glu Leu Leu Ala Lys Val Ala Asn Ser Glu Lys Ala Leu Met Ala Thr Gln Asn Ser Leu Phe Gly Ala Pro Lys Glu Glu Val Glu Glu Leu Asp Pro Leu Lys Leu Glu Lys Glu Val Leu Gly Phe Tyr Ile Ser Gly His Pro Leu Asp Asn Tyr Glu Lys Leu Leu Lys Asn Arg Tyr Thr Pro Ile Glu Asp Leu Glu Glu Trp Asp Lys Glu 
Ser Glu Ala Val Leu Thr Gly Val Ile Thr Glu Leu Lys Val Lys Lys Thr Lys Asn Gly Asp Tyr Met Ala Val Phe Asn Leu Val Asp Lys Thr Gly Leu Ile Glu Cys Val Val Phe Pro Gly Val Tyr Glu Glu Ala Lys Glu Leu Ile Glu Glu Asp Arg Val Val Val Val Lys Gly Phe Leu Asp Glu Asp Leu Glu Thr Glu Asn Val Lys Phe Val Val Lys Glu Val Phe Ser Pro Glu Glu Phe Ala Lys Glu Met Arg Asn Thr Leu Tyr Ile Phe Leu Lys Arg Glu Gln Ala Leu Asn Gly Val Ala Glu Lys Leu Lys Gly Ile Ile Glu Asn Asn Arg Thr Glu Asp Gly Tyr Asn Leu Val Leu Thr Val Asp Leu Gly Asp Tyr Phe Val Asp Leu Ala Leu Pro Gln Asp Met Lys Leu Lys Ala Asp Arg Lys Val Val Glu Glu Ile Glu Lys Leu Gly Val Lys Val Ile Ile
(2) INFORMATION FOR SEQ ID NO: 39:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 64 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: both (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: synthetic (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
[GAVLIMPFW]-D-X-X-X-[GAVLIMPFW]-X-X-[GAVLIMPFW]-X-[GAVLIMPFW]-X-[GAVLIMPFW]-X-X-X-X-F-X-X-Y-X-X-D 64
(2) INFORMATION FOR SEQ ID NO: 40:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: both (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: synthetic (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:
[GAVLIMPFW]-X(3)-L-A-P-[KRHDE]-[GAVLIMPFW]-E 28
(2) INFORMATION FOR SEQ ID NO: 41:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 51 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: both (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: synthetic (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
C-N-Y-X-S-[KRHDE]-I-I-X-[GAVLIMPFW]-[GAVLIMPFW]-Q-S-R-C-X-X-F-R-F-X-P-[GAVLIMPFW] 51
(2) INFORMATION FOR SEQ ID NO: 42:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: both (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: synthetic (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:
K-X-X-L-L-X-G-P-P-G-X-G-K-T-[STNQYC]-X-[GAVLIMPFW]-X-X-[GAVLIMPFW] 41
(2) INFORMATION FOR SEQ ID NO: 43:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 80 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: both (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: synthetic (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:
[FL]-[GAVLIMPFW]-X-X-[GAVLIMPFW]-X-G-X(13)-[GAVLIMPFW]-X-[YR]-[GAVLIMPFW]-X-[GAVLIMPFW]-A-G-[DN]-[GAVLIMPFW]-[GAVLIMPFW]-[DS] 80
(2) INFORMATION FOR SEQ ID NO: 44:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 44 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: both (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: synthetic (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:
D-[GAVLIMPFW]-[GAVLIMPFW]-X-X-Y-N-X-X-X-F-D-X-P-Y-[GAVLIMPFW]-X-X-R-A 44
(2) INFORMATION FOR SEQ ID NO: 45:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 78 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: both (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: synthetic (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:
A-[GAVLIMPFW]-R-T-A-[GAVLIMPFW]-A-[GAVLIMPFW]-[GAVLIMPFW]-T-E-G-[GAVLIMPFW]-V-X-A-P-[GAVLIMPFW]-E-G-I-A-X-V-[KRHDE]-I 78
(2) INFORMATION FOR SEQ ID NO: 46:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 118 amino acids (B) TYPE: amino acid (C) STRAND FORM: single strand (D) TOPOLOGY: both (ii) TYPE OF MOLECULE: protein (vi) INITIAL ORIGIN:
(A) ORGANISM: synthetic (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
[GAVLIMPFW]-P-V-G-[GAVLIMPFW]-G-R-G-S-X-[GAVLIMPFW]-G-S-[GAVLIMPFW]-V-A-X-A-[GAVLIMPFW]-X-I-T-D-[GAVLIMPFW]-D-P-[GAVLIMPFW]-X-X-X-[GAVLIMPFW]-L-F-E-R-F-L-N-P-E-R-[GAVLIMPFW]-S-M-P-D 118
(2) INFORMATION FOR SEQ ID NO: 47:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs (B) TYPE: nucleic acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: DNA
(vi) INITIAL ORIGIN:
(A) ORGANISM: M13 MP18 ss DNA (phage) (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

(2) INFORMATION FOR SEQ ID NO: 48:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs (B) TYPE: nucleic acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: DNA
(vi) INITIAL ORIGIN:
(A) ORGANISM: M13 MP18 ss DNA (phage) (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

(2) INFORMATION FOR SEQ ID NO: 49:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs (B) TYPE: nucleic acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: DNA
(vi) INITIAL ORIGIN:
(A) ORGANISM: Archaeoglobus fulgidus (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

(2) INFORMATION FOR SEQ ID NO: 50:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs (B) TYPE: nucleic acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: DNA
(vi) INITIAL ORIGIN:
(A) ORGANISM: Archaeoglobus fulgidus (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

(2) INFORMATION FOR SEQ ID NO: 51:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs (B) TYPE: nucleic acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: DNA
(vi) INITIAL ORIGIN:
(A) ORGANISM: Archaeoglobus fulgidus (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

(2) INFORMATION FOR SEQ ID NO: 52:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs (B) TYPE: nucleic acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: DNA
(vi) INITIAL ORIGIN:
(A) ORGANISM: Archaeoglobus fulgidus (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

(2) INFORMATION FOR SEQ ID NO: 53:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs (B) TYPE: nucleic acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: DNA
(vi) INITIAL ORIGIN:

(A) ORGANISM: Human Collagen Forward (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

(2) INFORMATION FOR SEQ ID NO: 54:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRAND FORM: single strand (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: DNA
(vi) INITIAL ORIGIN:
(A) ORGANISM: Human Collagen Reverse (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

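The consensus sequences in SEQ ID NO: 39 to 46 are written in a PROSITE-like notation: a bracketed group lists the residues allowed at one position, X stands for any residue, and X(3) stands for three such positions. The following Python sketch shows one way such a pattern could be expanded and compared against a protein sequence while counting deviating positions. It is an illustration only; the function names and the two test sequences are hypothetical and do not come from the listing.

```python
import re

# SEQ ID NO: 40 as given in the sequence listing.
SEQ_ID_NO_40 = "[GAVLIMPFW]-X(3)-L-A-P-[KRHDE]-[GAVLIMPFW]-E"


def parse_consensus(pattern: str) -> list:
    """Expand a pattern into one entry per position.

    None means any residue (X); a frozenset holds the allowed residues.
    """
    positions = []
    for element in pattern.strip(". ").split("-"):
        m = re.fullmatch(r"(\[[A-Z]+\]|[A-Z])(?:\((\d+)\))?", element)
        if m is None:
            raise ValueError(f"unrecognised pattern element: {element!r}")
        token, repeat = m.group(1), int(m.group(2) or 1)
        allowed = None if token == "X" else frozenset(token.strip("[]"))
        positions.extend([allowed] * repeat)
    return positions


def best_window(pattern: str, sequence: str) -> tuple:
    """Return (mismatches, start) for the least-deviating window of sequence."""
    positions = parse_consensus(pattern)
    width = len(positions)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the pattern")
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        mismatches = sum(
            1
            for allowed, residue in zip(positions, window)
            if allowed is not None and residue not in allowed
        )
        if best is None or mismatches < best[0]:
            best = (mismatches, start)
    return best


# Hypothetical one-letter test sequences, invented for this example.
print(best_window(SEQ_ID_NO_40, "MKGSTQLAPKVERR"))  # (0, 2): exact match
print(best_window(SEQ_ID_NO_40, "MKGSTQLAPKVDRR"))  # (1, 2): one deviation
```

A result with zero mismatches corresponds to a sequence that contains the consensus sequence; a small non-zero count could be compared against a tolerance such as the "no more than four positions" wording used in some of the claims.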
Claims (46)
1. Thermostable in vitro complex for the template-dependent elongation of nucleic acids comprising a thermostable sliding clamp protein and a thermostable elongation protein.
2. Thermostable in vitro complex as claimed in claim 1, characterized in that the sliding clamp protein is linked to an elongation protein.
3. Thermostable in vitro complex as claimed in claim 1, characterized in that a sliding clamp protein is directly linked to an elongation protein.
4. Thermostable in vitro complex as claimed in claim 1 or 2, characterized in that the sliding clamp protein and the elongation protein are linked by means of a coupling protein.
5. Thermostable in vitro complex as claimed in one of the claims 1 to 4, characterized in that the sliding clamp protein and/or the elongation protein are derived from Archaebacteria.
6. Thermostable in vitro complex as claimed in one of the claims 1 to 5, characterized in that the sliding clamp protein has a ring-like structure which wholly or partially encircles the template nucleic acid strands.
7. Thermostable in vitro complex as claimed in one of the claims 1 to 6, characterized in that the sliding clamp protein contains one or both of the following amino acid consensus sequences:

SEQ ID NO:39 [GAVLIMPFW]-D-X-X-X-[GAVLIMPFW]-X-X-[GAVLIMPFW]-X-[GAVLIMPFW]-X-[GAVLIMPFW]-X-X-X-X-F-X-X-Y-X-X-D
and/or SEQ ID NO:40 [GAVLIMPFW]-X(3)-L-A-P-[KRHDE]-[GAVLIMPFW]-E.
8. Thermostable in vitro complex as claimed in one of the claims 1 to 7, characterized in that the sliding clamp protein has a sequence identity of at least 20 % to the human (eukaryotic) PCNA amino acid sequence (SEQ ID NO:11) over a length of at least 100 amino acids in a sequence alignment and/or the sliding clamp protein has a sequence identity of at least 20 % to the bacterial β-clamp sequence from E. coli (eubacteria) (SEQ ID NO:35) over a length of at least 100 amino acids in a sequence alignment and/or the sliding clamp protein has a sequence identity of at least 20 % to the amino acid sequence of the PCNA homologue from Archaeoglobus fulgidus (SEQ ID NO:12) over a length of at least 100 amino acids in a sequence alignment.
9. Thermostable complex as claimed in one of the previous claims, characterized in that the sliding clamp protein results in a score of at least 20 in a hidden Markov model generated from the alignment from fig. 12 and/or the sliding clamp protein results in a score of at least 25 in the hidden Markov model generated from the alignment from fig. 13.
10. Thermostable in vitro complex as claimed in one of the previous claims, characterized in that the sliding clamp protein is a sliding clamp protein which is derived from an organism that is selected from the group comprising Archaeoglobus fulgidus, Methanococcus jannaschii, Pyrococcus horikoshii, Methanobacterium thermoautotrophicus, Aquifex aeolicus and Carboxydothermus hydrogenoformans.
11. Thermostable in vitro complex as claimed in one of the previous claims, characterized in that the sliding clamp protein is selected from the group comprising AF0335 from Archaeoglobus fulgidus, MJ0247 from Methanococcus jannaschii, PHLA008 from Pyrococcus horikoshii, MTH1312 from Methanobacterium thermoautotrophicus and AE000761_7 from Aquifex aeolicus.
12. Thermostable in vitro complex as claimed in one of the previous claims, characterized in that the elongation protein has a 5'-3' polymerase activity and/or a reverse transcriptase activity.
13. Thermostable in vitro complex as claimed in claim 12, characterized in that the elongation protein contains at least one of the following consensus sequences and deviates from this sequence at no more than four positions:
SEQ ID NO:44 D-[GAVLIMPFW]-[GAVLIMPFW]-X-X-Y-N-X-X-X-F-D-X-P-Y-[GAVLIMPFW]-X-X-R-A
SEQ ID NO:45 A-[GAVLIMPFW]-R-T-A-[GAVLIMPFW]-A-[GAVLIMPFW]-[GAVLIMPFW]-T-E-G-[GAVLIMPFW]-V-X-A-P-[GAVLIMPFW]-E-G-I-A-X-V-[KRHDE]-I
SEQ ID NO:46 [GAVLIMPFW]-P-V-G-[GAVLIMPFW]-G-R-G-S-X-[GAVLIMPFW]-G-S-[GAVLIMPFW]-V-A-X-A-[GAVLIMPFW]-X-I-T-D-[GAVLIMPFW]-D-P-[GAVLIMPFW]-X-X-X-[GAVLIMPFW]-L-F-E-R-F-L-N-P-E-R-[GAVLIMPFW]-S-M-P-D.
14. Thermostable in vitro complex as claimed in claim 12, characterized in that the elongation protein has a sequence identity of at least 20 % to the human (eukaryotic) amino acid sequence (SEQ ID NO:22) over a length of at least 200 amino acids in a sequence alignment and/or has a sequence identity of at least 25 % to the archaebacterial amino acid sequence (SEQ ID NO:27) over a length of at least 400 amino acids in a sequence alignment and/or has a sequence identity of at least 25 % to the eubacterial amino acid sequence (SEQ ID NO:37) over a length of at least 300 amino acids in a sequence alignment.
15. Thermostable in vitro complex as claimed in claim 12, characterized in that the elongation protein results in a score of at least 20 in a hidden Markov model generated from an alignment from fig. 17 and/or the elongation protein results in a score of at least 35 in a hidden Markov model generated from an alignment from fig. 18 and/or the elongation protein results in a score of at least 20 in a hidden Markov model generated from an alignment from fig. 19.
16. Thermostable in vitro complex as claimed in one of the previous claims, characterized in that the elongation protein is an elongation protein derived from an organism which is selected from the group comprising Archaeoglobus fulgidus, Methanococcus jannaschii, Pyrococcus horikoshii, Methanobacterium thermoautotrophicus, Pyrococcus furiosus and Carboxydothermus hydrogenoformans.
17. Thermostable in vitro complex as claimed in one of the previous claims, characterized in that the elongation protein is selected from the group comprising AF0497 or AF1722 from Archaeoglobus fulgidus, MJ0885 or MJ1630 from Methanococcus jannaschii, PHBT047 or PHBN021 from Pyrococcus horikoshii, MTH1208 or MTH1536 from Methanobacterium thermoautotrophicus and PFUORF3 from Pyrococcus furiosus.
18. Thermostable in vitro complex as claimed in one of the previous claims, characterized in that the coupling protein contains the following consensus sequence and differs from this sequence at no more than four positions:

SEQ ID NO:43 [FL]-[GAVLIMPFW]-X-X-[GAVLIMPFW]-X-G-X(13)-[GAVLIMPFW]-X-[YR]-[GAVLIMPFW]-X-[GAVLIMPFW]-A-G-[DN]-[GAVLIMPFW]-[GAVLIMPFW]-[DS].
19. Thermostable in vitro complex as claimed in one of the previous claims, characterized in that the coupling protein has a sequence identity of at least 18 % to the human (eukaryotic) amino acid sequence (SEQ ID NO:16) over a length of at least 150 amino acids in a sequence alignment.
20. Thermostable in vitro complex as claimed in one of the previous claims, characterized in that the coupling protein results in a score of at least 10 in a hidden Markov model generated from an alignment from figure 16.
21. Thermostable in vitro complex as claimed in one of the previous claims, characterized in that the coupling protein is a coupling protein which is derived from an organism selected from the group comprising Archaeoglobus fulgidus, Methanococcus jannaschii, Pyrococcus horikoshii, Methanobacterium thermoautotrophicus, Pyrococcus furiosus and Carboxydothermus hydrogenoformans.
22. Thermostable in vitro complex as claimed in one of the previous claims, characterized in that the coupling protein is selected from the group comprising AF1790 from Archaeoglobus fulgidus, MJ0702 from Methanococcus jannaschii, PHBN023 from Pyrococcus horikoshii, MTH1405 from Methanobacterium thermoautotrophicus and PFUORF2 from Pyrococcus.
23. Thermostable in vitro complex as claimed in one of the previous claims, characterized in that the complex is associated with a protein which acts as a sliding clamp loader.
24. Thermostable complex as claimed in one of the previous claims, characterized in that the complex is associated with ATP or another cofactor.
25. Recombinant DNA sequence, characterized in that it codes for a thermostable in vitro complex as claimed in one of the claims 1 to 24.
26. Vector, characterized in that it contains a recombinant DNA sequence coding for a sliding clamp protein and a coupling protein and/or an elongation protein.
27. Vector as claimed in claim 26, characterized in that it additionally contains at least one additional DNA sequence having at least one suitable restriction cleavage site for the insertion of additional DNA sequences in an arrangement which results in a fusion protein composed of the sliding clamp protein and the expression product of the additional DNA sequences.
28. Vector as claimed in claim 26 or 27, characterized in that it contains promoter and/or operator regions that are suitable for controlling the expression of the DNA sequence(s).
29. Vector as claimed in claim 28, characterized in that it contains several promoter and/or operator regions for the separate expression of several DNA sequences.
30. Vector as claimed in one of the claims 26 to 29, characterized in that it contains repressible and/or inducible promoter and/or operator regions.
31. Vector as claimed in one of the claims 26 to 30, characterized in that it contains a DNA sequence as claimed in claim 25.
32. Host cell, characterized in that it is transformed with one or several vectors as claimed in one of the claims 26 to 31.
33. Method for the preparation of a thermostable in vitro complex as claimed in one of the claims 1 to 24, characterized in that an appropriate recombinant DNA sequence as claimed in claim 25 or one or several of the vectors as claimed in one of the claims 26 to 31 are introduced into a host cell, the proteins are expressed and isolated from the culture medium or after cell lysis and are optionally additionally coupled to other components of the complex.
34. Method for the template-dependent elongation of nucleic acids in which the nucleic acid is denatured if necessary, and provided with at least one primer under hybridization conditions, the primer being sufficiently complementary to a flanking region of a desired nucleic acid sequence of the template strand, and primer elongation is carried out with the aid of a polymerase in the presence of nucleotides, characterized in that a thermostable in vitro complex as claimed in one of the claims 1 to 24 is used as the polymerase.
35. Method as claimed in claim 34, characterized in that two primers flanking the desired nucleic acid sequence and deoxynucleotides and/or derivatives thereof and/or ribonucleotides and/or derivatives thereof are used to amplify DNA sequences.
36. Method as claimed in claim 35, characterized in that a polymerase chain reaction is carried out.
37. Method as claimed in claim 34, characterized in that a thermostable in vitro complex as claimed in one of the claims 1 to 24 whose elongation protein has reverse transcriptase activity is used for the reverse transcription of RNA into DNA.
38. Method as claimed in one of the claims 34 to 37, characterized in that a template-dependent elongation or reverse transcription is carried out starting with a primer that is complementary to a region adjacent to the nucleic acids to be sequenced using deoxynucleotides and dideoxynucleotides or their respective derivatives in order to sequence nucleic acids according to the method of Sanger.
39. Method as claimed in one of the claims 34 to 38, characterized in that labels are inserted during the elongation of the nucleic acids.
40. Method as claimed in claim 39, characterized in that labelled primers and/or labelled deoxynucleotides and/or derivatives thereof and/or labelled dideoxynucleotides and/or derivatives thereof and/or labelled ribonucleotides and/or derivatives thereof are used.
41. Method for labelling nucleic acids by generating individual breaks in the phosphodiester bonds of the nucleic acid chain and replacing a nucleotide at the sites of the breaks by a labelled nucleotide with the aid of a polymerase, characterized in that a thermostable in vitro complex as claimed in one of the claims 1 to 24 is used as the polymerase.
42. Reagent kit for the elongation and/or amplification and/or reverse transcription and/or sequencing and/or labelling of nucleic acids containing in one or several separate containers a) a thermostable in vitro complex as claimed in one of the claims 1 to 24 or b) a thermostable in vitro complex as claimed in one of the claims 1 to 24 and separately therefrom an elongation protein having polymerase activity, and optionally primers, buffer substances, nucleotides, ATP, one or several other cofactors and/or pyrophosphatase.
43. Kit as claimed in claim 42, characterized in that in addition to the components a) or b) which have 5'-3' polymerase activity, it contains deoxynucleotides and/or derivatives thereof to amplify nucleic acids.
44. Kit as claimed in claim 42, characterized in that it contains components a) or b) which have reverse transcriptase activity as well as deoxynucleotides and/or derivatives thereof for reverse transcription.
45. Kit as claimed in one of the claims 42 to 44, characterized in that in addition to deoxynucleotides or ribonucleotides and/or derivatives thereof, it contains dideoxynucleotides and/or derivatives thereof for sequencing.
46. Kit as claimed in one of the claims 42 to 45, characterized in that it contains primers and/or deoxynucleotides and/or dideoxynucleotides in a labelled form.
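Several claims set a threshold of the form "a sequence identity of at least 20 % over a length of at least 100 amino acids in a sequence alignment". The claims do not state how the alignment is produced or how gap columns are counted, so the following Python sketch rests on assumptions: the two sequences are already aligned (equal length, "-" for gaps), and identity is the number of identical non-gap columns divided by the number of columns in the stretch. The function name and the short toy alignment are hypothetical.

```python
def best_identity(aligned_a: str, aligned_b: str, min_length: int) -> float:
    """Highest identity fraction over any stretch of >= min_length columns.

    Assumption: gap columns count in the denominator but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    n = len(aligned_a)
    if n < min_length:
        return 0.0
    # prefix[i] = identical, non-gap columns among the first i columns
    prefix = [0]
    for a, b in zip(aligned_a, aligned_b):
        prefix.append(prefix[-1] + (1 if a == b and a != "-" else 0))
    best = 0.0
    for start in range(n - min_length + 1):
        for end in range(start + min_length, n + 1):
            identity = (prefix[end] - prefix[start]) / (end - start)
            best = max(best, identity)
    return best


# Toy alignment invented for this example (not taken from the patent).
a = "ACDE-FGH"
b = "ACDQKFGA"
print(best_identity(a, b, 4))  # 0.75
print(best_identity(a, b, 8))  # 0.625
```

Under these assumptions, a criterion such as the one in claim 8 would be checked as `best_identity(a, b, 100) >= 0.20` on a real pairwise alignment. All stretches of at least the minimum length are examined, because a longer stretch can satisfy the threshold even when no stretch of exactly the minimum length does.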
CA002338185A 1998-08-06 1999-08-06 Thermostable in vitro complex with polymerase activity Abandoned CA2338185A1 (en)

Applications Claiming Priority (7)

Application Number Priority Date Filing Date Title
DE19835653.6 1998-08-06
DE19835653 1998-08-06
DE19840771A DE19840771A1 (en) 1998-08-06 1998-09-07 A thermostable in vitro polymerase complex for template-dependent elongation of nucleic acids in amplification or reverse transcription methods
DE19840771.8 1998-09-07
EP99111795 1999-06-18
EP99111795.3 1999-06-18
PCT/DE1999/002480 WO2000008164A2 (en) 1998-08-06 1999-08-06 Thermostable in vitro complex with polymerase activity

Publications (1)

Publication Number Publication Date
CA2338185A1 (en) 2000-02-17

Family

ID=56289931

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002338185A Abandoned CA2338185A1 (en) 1998-08-06 1999-08-06 Thermostable in vitro complex with polymerase activity

Country Status (5)

Country Link
EP (1) EP1100923A2 (en)
JP (1) JP2002522042A (en)
AU (1) AU5617199A (en)
CA (1) CA2338185A1 (en)
WO (1) WO2000008164A2 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2792651B1 (en) * 1999-04-21 2005-03-18 Centre Nat Rech Scient GENOMIC SEQUENCE AND POLYPEPTIDES OF PYROCOCCUS ABYSSI, THEIR FRAGMENTS AND USES THEREOF
DE19937230A1 (en) * 1999-08-06 2001-02-08 Lion Bioscience Ag Chimeric proteins
US6627424B1 (en) * 2000-05-26 2003-09-30 Mj Bioworks, Inc. Nucleic acid modifying enzymes
CA2456571A1 (en) * 2001-08-10 2003-02-20 Genset Sa Human secreted proteins, their encoding polynucleotides, and uses thereof
JP6968536B2 (en) * 2014-11-28 2021-11-17 東洋紡株式会社 Nucleic acid amplification reagent
WO2016084880A1 (en) * 2014-11-28 2016-06-02 東洋紡株式会社 Pcna monomer
US20190055527A1 (en) * 2015-11-27 2019-02-21 Kyushu University, National University Corporation Dna polymerase variant

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1993015115A1 (en) * 1992-01-24 1993-08-05 Cornell Research Foundation, Inc. E. coli dna polymerase iii holoenzyme and subunits
US5583026A (en) * 1994-08-31 1996-12-10 Cornell Research Foundation, Inc. Process for reconstituting the polymerase III* and other subassemblies of E. coli DNA polymerase III holoenzyme from peptide subunits
US5633159A (en) * 1995-03-10 1997-05-27 Becton, Dickinson And Company DNA polymerase III β-subunit from mycobacteriophage DS6A
AU7683998A (en) * 1997-04-08 1998-10-30 Rockfeller University, The Enzyme derived from thermophilic organisms that functions as a chromosomal replicase, and preparation and uses thereof
US6238905B1 (en) * 1997-09-12 2001-05-29 University Technology Corporation Thermophilic polymerase III holoenzyme
AU7105798A (en) * 1998-04-09 1999-11-01 Rockfeller University, The Enzyme derived from thermophilic organisms that functions as a chromosomal replicase, and preparation and uses thereof

Also Published As

Publication number Publication date
WO2000008164A2 (en) 2000-02-17
JP2002522042A (en) 2002-07-23
EP1100923A2 (en) 2001-05-23
WO2000008164A3 (en) 2000-05-11
AU5617199A (en) 2000-02-28

Similar Documents

Publication Publication Date Title
CA2186021C (en) Purified dna polymerase from bacillus stearothermophilus
CA2090614C (en) 5' to 3' exonuclease mutations of thermostable dna polymerases
US5939292A (en) Thermostable DNA polymerases having reduced discrimination against ribo-NTPs
CA1340628C (en) T7 dna polymerase
Kunkel et al. Exonucleolytic proofreading by calf thymus DNA polymerase delta.
AU705179B2 (en) Thermophilic DNA polymerases from thermotoga neapolitana
US5405774A (en) DNA encoding a mutated thermostable nucleic acid polymerase enzyme from thermus species sps17
US5624833A (en) Purified thermostable nucleic acid polymerase enzyme from Thermotoga maritima
EP0902035B1 (en) Altered thermostable DNA polymerases for sequencing
US7488816B2 (en) Methods for obtaining thermostable enzymes, DNA polymerase I variants from Thermus aquaticus having new catalytic activities, methods for obtaining the same, and applications of the same
CA2280001A1 (en) Polymerases for analyzing or typing polymorphic nucleic acid fragments and uses thereof
CN108779442A (en) Composition, system and the method for a variety of ligases
CA2283635A1 (en) Polymerase enhancing factor (pef) extracts, pef protein complexes, isolated pef protein, and methods for purifying and identifying
JP3112148B2 (en) Nucleic acid amplification method and reagent therefor
AU2008203222A1 (en) B-12 dependent dehydratases with improved reaction kinetics
CN111819188A (en) Fusion single-stranded DNA polymerase Bst, nucleic acid molecule for coding fusion DNA polymerase NeqSSB-Bst, preparation method and application thereof
CA2401727C (en) Thermophilic dna polymerases from thermoactinomyces vulgaris
CA2185362C (en) Purified nucleic acid encoding a thermostable pyrophosphatase
CA2338185A1 (en) Thermostable in vitro complex with polymerase activity
JP2002541861A (en) Pharmacological targeting of mRNA cap formation
EP0875576A2 (en) Bacillus stearothermophilus DNA polymerase I (klenow) clones including those with reduced 3'-to-5' exonuclease activity
CN114174502A (en) Phi29DNA polymerase mutant with improved primer recognition
CA2132452C (en) Casein kinase i-like protein kinase
Sanjanwala et al. DNA polymerase III gene of Bacillus subtilis.
CA2318574A1 (en) Dna replication proteins of gram positive bacteria and their use to screen for chemical inhibitors

Legal Events

Date Code Title Description
FZDE Dead