EP1368375A2 - Secretory molecules - Google Patents

Secretory molecules

Info

Publication number
EP1368375A2
EP1368375A2 EP01966516A EP01966516A EP1368375A2 EP 1368375 A2 EP1368375 A2 EP 1368375A2 EP 01966516 A EP01966516 A EP 01966516A EP 01966516 A EP01966516 A EP 01966516A EP 1368375 A2 EP1368375 A2 EP 1368375A2
Authority
EP
European Patent Office
Prior art keywords
2000sep08
polynucleotide
polypeptide
nout
seq
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP01966516A
Other languages
German (de)
French (fr)
Inventor
Stuart E. Jackson
Stephen E. Lincoln
Christina M. Altus
Gerard E. Dufour
Michael S. Chalup
Jennifer L. Jackson
Anissa Lee Jones
Jimmy Y. Yu
Rachel J. Wright
Darryl Gietzen
Tommy F. Liu
Pierre E. Yap
Christopher R. Dahl
Monika G. Momiyama
Diana L. Bradley
Sameer D. Rohatgi
Bernard Harris
Ann M. Roseberry
Edward H.Jr. Gerstin
Careyna H. Peralta
Marie H. David
Scott R. Panzer
Vincent Flores
Abel Daffo
Rakesh Marwaha
Alice J. Chen
Simon C. Chang
Alan P. Au
Rebekah R. Inman
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Incyte Corp
Original Assignee
Incyte Genomics Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Incyte Genomics Inc
Publication of EP1368375A2

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K ANIMAL HUSBANDRY; CARE OF BIRDS, FISHES, INSECTS; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K2217/00 Genetically modified animals
    • A01K2217/05 Animals comprising random inserted nucleic acids (transgenic)
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y02A50/30 Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change

Definitions

  • the present invention relates to secretory molecules and to the use of these sequences in the diagnosis, study, prevention, and treatment of diseases associated with, as well as effects of exogenous compounds on, the expression of secretory molecules.
  • Protein transport and secretion are essential for cellular function. Protein transport is mediated by a signal peptide located at the amino terminus of the protein to be transported or secreted.
  • the signal peptide is comprised of about ten to twenty hydrophobic amino acids which target the nascent protein from the ribosome to a particular membrane bound compartment such as the endoplasmic reticulum (ER). Proteins targeted to the ER may either proceed through the secretory pathway or remain in any of the secretory organelles such as the ER, Golgi apparatus, or lysosomes. Proteins that transit through the secretory pathway are either secreted into the extracellular space or retained in the plasma membrane.
  • Proteins that are retained in the plasma membrane contain one or more transmembrane domains, each comprised of about 20 hydrophobic amino acid residues.
  • Proteins that are secreted from the cell are generally synthesized as inactive precursors that are activated by post-translational processing events during transit through the secretory pathway. Such events include glycosylation, proteolysis, and removal of the signal peptide by a signal peptidase. Other events that may occur during protein transport include chaperone-dependent unfolding and folding of the nascent protein and interaction of the protein with a receptor or pore complex. Examples of secretory proteins with amino terminal signal peptides are discussed below and include proteins with important roles in cell-to-cell signaling.
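The hydrophobic stretches described above (signal peptides of about ten to twenty residues, transmembrane domains of about 20 residues) can be illustrated with a simple hydropathy scan. The sketch below is only an illustration and is not a method taken from this document: it assumes the published Kyte-Doolittle hydropathy scale, and the function name `hydrophobic_windows`, the window length, and the threshold of 1.6 are all hypothetical choices.

```python
# Kyte-Doolittle hydropathy values (a published scale; an assumption here,
# not something specified in the source text).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_windows(protein, window=20, threshold=1.6):
    """Return 0-based start positions of windows whose mean hydropathy
    is at least `threshold`. Unknown residue codes score 0.0."""
    seq = protein.upper()
    hits = []
    # range() is empty when the sequence is shorter than the window.
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        mean = sum(KD.get(aa, 0.0) for aa in segment) / window
        if mean >= threshold:
            hits.append(start)
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence: a 20-residue leucine stretch flanked by
    # charged residues.
    toy = "MKT" + "L" * 20 + "DEDEDKRKR"
    print(hydrophobic_windows(toy))
```

A window of 10 to 20 residues near the amino terminus would correspond to a candidate signal peptide, and a window of about 20 residues elsewhere to a candidate transmembrane domain. Dedicated prediction tools use considerably more information than mean hydropathy.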
  • GPCRs G-protein coupled receptors
  • GPCRs include receptors for biogenic amines such as dopamine, epinephrine, histamine, glutamate (metabotropic-type), acetylcholine (muscarinic-type), and serotonin; for lipid mediators of inflammation such as prostaglandins, platelet activating factor, and leukotrienes; for peptide hormones such as calcitonin, C5a anaphylatoxin, follicle stimulating hormone, gonadotropin releasing hormone, neurokinin, oxytocin, and thrombin; and for sensory signal mediators such as retinal photopigments and olfactory stimulatory molecules.
  • the structure of these highly conserved receptors consists of seven hydrophobic transmembrane regions, cysteine disulfide bridges between the second and third extracellular loops, an extracellular N-terminus, and a cytoplasmic C-terminus.
  • the N-terminus interacts with ligands, the disulfide bridges interact with agonists and antagonists, and the large third intracellular loop interacts with G proteins to activate second messengers such as cyclic AMP, phospholipase C, inositol triphosphate, or ion channels.
  • Other types of receptors include cell surface antigens identified on leukocytic cells of the immune system.
  • antigens have been identified using systematic, monoclonal antibody (mAb)-based "shot gun" techniques. These techniques have resulted in the production of hundreds of mAbs directed against unknown cell surface leukocytic antigens. These antigens have been grouped into "clusters of differentiation" based on common immunocytochemical localization patterns in various differentiated and undifferentiated leukocytic cell types. Antigens in a given cluster are presumed to identify a single cell surface protein and are assigned a "cluster of differentiation" or "CD" designation. Some of the genes encoding proteins identified by CD antigens have been cloned and verified by standard molecular biology techniques.
  • CD antigens have been characterized as both transmembrane proteins and cell surface proteins anchored to the plasma membrane via covalent attachment to fatty acid-containing glycolipids such as glycosylphosphatidylinositol (GPI).
  • GPI glycosylphosphatidylinositol
  • MPs Matrix proteins
  • the expression and balance of MPs may be perturbed by biochemical changes that result from congenital, epigenetic, or infectious diseases.
  • MPs affect leukocyte migration, proliferation, differentiation, and activation in the immune response.
  • MPs are frequently characterized by the presence of one or more domains which may include collagen-like domains, EGF-like domains, immunoglobulin-like domains, and fibronectin-like domains.
  • MPs may be heavily glycosylated and may contain an Arginine-Glycine-Aspartate (RGD) tripeptide motif which may play a role in adhesive interactions.
  • MPs include extracellular proteins such as fibronectin, collagen, galectin, vitronectin and its proteolytic derivative somatomedin B; and cell adhesion receptors such as cell adhesion molecules (CAMs), cadherins, and integrins.
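The Arginine-Glycine-Aspartate (RGD) tripeptide mentioned above is a short linear motif, so locating it in a protein sequence reduces to a substring search. The following is a minimal sketch under that assumption; the function name `find_motif` and the example sequence are hypothetical and do not come from the Sequence Listing.

```python
def find_motif(protein, motif="RGD"):
    """Return 0-based start positions of every occurrence of `motif`
    in `protein`, including overlapping occurrences."""
    seq = protein.upper()
    motif = motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        # Advance by one so overlapping matches are not skipped.
        start = seq.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    # Hypothetical toy sequence containing two RGD tripeptides.
    print(find_motif("AARGDKKRGDS"))
```

Finding the tripeptide in a sequence does not by itself establish a role in adhesion; the text above says only that the motif "may play a role in adhesive interactions."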
  • Cytokines are secreted by hematopoietic cells in response to injury or infection. Interleukins, neurotrophins, growth factors, interferons, and chemokines all define cytokine families that work in conjunction with cellular receptors to regulate cell proliferation and differentiation. In addition, cytokines affect activities such as leukocyte migration and function, hematopoietic cell proliferation, temperature regulation, acute response to infection, tissue remodeling, and apoptosis.
  • Chemokines are small chemoattractant cytokines involved in inflammation, leukocyte proliferation and migration, angiogenesis and angiostasis, regulation of hematopoiesis, HIV infectivity, and stimulation of cytokine secretion.
  • Chemokines generally contain 70-100 amino acids and are subdivided into four subfamilies based on the presence of conserved cysteine-based motifs. (Callard, R. and Gearing, A. (1994) The Cytokine Facts Book. Academic Press, New York NY, pp. 181-190, 210-213, 223-227.)
  • Growth and differentiation factors are secreted proteins which function in intercellular communication. Some factors require oligomerization or association with MPs for activity. Complex interactions among these factors and their receptors trigger intracellular signal transduction pathways that stimulate or inhibit cell division, cell differentiation, cell signaling, and cell motility. Most growth and differentiation factors act on cells in their local environment (paracrine signaling).
  • the first class includes the large polypeptide growth factors such as epidermal growth factor, fibroblast growth factor, transforming growth factor, insulin-like growth factor, and platelet-derived growth factor.
  • the second class includes the hematopoietic growth factors such as the colony stimulating factors (CSFs).
  • CSFs colony stimulating factors
  • Hematopoietic growth factors stimulate the proliferation and differentiation of blood cells such as B-lymphocytes, T-lymphocytes, erythrocytes, platelets, eosinophils, basophils, neutrophils, macrophages, and their stem cell precursors.
  • the third class includes small peptide factors such as bombesin, vasopressin, oxytocin, endothelin, transferrin, angiotensin II, vasoactive intestinal peptide, and bradykinin which function as hormones to regulate cellular functions other than proliferation.
  • Growth and differentiation factors play critical roles in neoplastic transformation of cells in vitro and in tumor progression in vivo.
  • Inappropriate expression of growth factors by tumor cells may contribute to vascularization and metastasis of tumors.
  • growth factor misregulation can result in anemias, leukemias, and lymphomas.
  • Certain growth factors such as interferon are cytotoxic to tumor cells both in vivo and in vitro.
  • some growth factors and growth factor receptors are related both structurally and functionally to oncoproteins.
  • growth factors affect transcriptional regulation of both proto-oncogenes and oncosuppressor genes. (Reviewed in Pimentel, E. (1994) Handbook of Growth Factors. CRC Press, Ann Arbor MI, pp.
  • Proteolytic enzymes or proteases either activate or deactivate proteins by hydrolyzing peptide bonds.
  • Proteases are found in the cytosol, in membrane-bound compartments, and in the extracellular space. The major families are the zinc, serine, cysteine, thiol, and carboxyl proteases.
  • Ion channels, ion pumps, and transport proteins mediate the transport of molecules across cellular membranes.
  • Transport can occur by a passive, concentration-dependent mechanism or can be linked to an energy source such as ATP hydrolysis.
  • Symporters and antiporters transport ions and small molecules such as amino acids, glucose, and drugs.
  • Symporters transport molecules and ions unidirectionally, and antiporters transport molecules and ions bidirectionally.
  • Transporter superfamilies include facilitative transporters and active ATP-binding cassette transporters which are involved in multiple-drug resistance and the targeting of antigenic peptides to MHC Class I molecules.
  • Ion channels are formed by transmembrane proteins which create a lined passageway across the membrane through which water and ions, such as Na+, K+, Ca2+, and Cl-, enter and exit the cell.
  • chloride channels are involved in the regulation of the membrane electric potential as well as absorption and secretion of ions across the membrane. Chloride channels also regulate the internal pH of membrane-bound organelles.
  • Ion pumps are ATPases which actively maintain membrane gradients. Ion pumps are classified as P, V, or F according to their structure and function. All have one or more binding sites for ATP in their cytosolic domains.
  • the P-class ion pumps include Ca2+ ATPase and Na+/K+ ATPase and function in transporting H+, Na+, K+, and Ca2+ ions.
  • P-class pumps consist of two α and two β transmembrane subunits.
  • the V- and F-class ion pumps have similar structures but transport only H+.
  • F-class H+ pumps mediate transport across the membranes of mitochondria and chloroplasts, while V-class H+ pumps regulate acidity inside lysosomes, endosomes, and plant vacuoles.
  • a family of structurally related intrinsic membrane proteins known as facilitative glucose transporters catalyze the movement of glucose and other selected sugars across the plasma membrane.
  • the proteins in this family contain a highly conserved, large transmembrane domain comprised of 12 α-helices, and several weakly conserved, cytoplasmic and exoplasmic domains. (Pessin, J.E. and Bell, G.I. (1992) Annu. Rev. Physiol. 54:911-930.)
  • Amino acid transport is mediated by Na+-dependent amino acid transporters. These transporters are involved in gastrointestinal and renal uptake of dietary and cellular amino acids and in neuronal reuptake of neurotransmitters. Transport of cationic amino acids is mediated by the system y+ family and the cationic amino acid transporter (CAT) family. Members of the CAT family share a high degree of sequence homology, and each contains 12-14 putative transmembrane domains. (Ito, K. and Groudine, M. (1997) J. Biol. Chem. 272:26780-26786.)
  • Hormones are secreted molecules that travel through the circulation and bind to specific receptors on the surface of, or within, target cells. Although they have diverse biochemical compositions and mechanisms of action, hormones can be grouped into two categories.
  • One category includes small lipophilic hormones that diffuse through the plasma membrane of target cells, bind to cytosolic or nuclear receptors, and form a complex that alters gene expression. Examples of these molecules include retinoic acid, thyroxine, and the cholesterol-derived steroid hormones such as progesterone, estrogen, testosterone, cortisol, and aldosterone.
  • the second category includes hydrophilic hormones that function by binding to cell surface receptors that transduce signals across the plasma membrane.
  • hormones include amino acid derivatives such as catecholamines and peptide hormones such as glucagon, insulin, gastrin, secretin, cholecystokinin, adrenocorticotropic hormone, follicle stimulating hormone, luteinizing hormone, thyroid stimulating hormone, and vasopressin.
  • catecholamines amino acid derivatives
  • Neuropeptides and vasomediators comprise a large family of endogenous signaling molecules. Included in this family are neuropeptides and neuropeptide hormones such as bombesin, neuropeptide Y, neurotensin, neuromedin N, melanocortins, opioids, galanin, somatostatin, tachykinins, urotensin II and related peptides involved in smooth muscle stimulation, vasopressin, vasoactive intestinal peptide, and circulatory system-borne signaling molecules such as angiotensin, complement, calcitonin, endothelins, formyl-methionyl peptides, glucagon, cholecystokinin and gastrin.
  • NP/VMs can transduce signals directly, modulate the activity or release of other neurotransmitters and hormones, and act as catalytic enzymes in cascades.
  • the effects of NP/VMs range from extremely brief to long-lasting. (Reviewed in Martin, C.R. et al. (1985) Endocrine Physiology. Oxford University Press, New York, NY, pp. 57-62.)
  • the present invention relates to nucleic acid sequences comprising human polynucleotides encoding secretory polypeptides that contain signal peptides and/or transmembrane domains.
  • human polynucleotides as presented in the Sequence Listing uniquely identify partial or full length genes encoding structural, functional, and regulatory polypeptides involved in cell signaling.
  • the invention provides an isolated polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d).
  • the polynucleotide comprises a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184.
  • the polynucleotide comprises at least 30 contiguous nucleotides of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; b) a polynucleotide comprising a naturally occurring polynucleotide comprising a polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d).
  • the polynucleotide comprises at least 60 contiguous nucleotides of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; b) a polynucleotide comprising a naturally occurring polynucleotide comprising a polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d).
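The relationships named in items a) through e) above (a sequence at least 90% identical, a complementary polynucleotide, and an RNA equivalent) can be sketched on plain strings. This is a simplified illustration, not the definition used in the patent: it assumes the two sequences have already been aligned to equal length, it counts identity per aligned column, and it treats "complementary" as the reverse complement. All function names and example sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A<->T, C<->G)."""
    return dna.translate(_COMPLEMENT)[::-1]


def rna_equivalent(dna):
    """Return the RNA equivalent of a DNA string (T replaced by U)."""
    return dna.replace("T", "U").replace("t", "u")


def percent_identity(aligned_a, aligned_b):
    """Percentage of aligned columns holding the same residue.

    Both inputs must already be aligned to the same length; '-' marks a
    gap and never counts as a match. This is one simple convention among
    several, chosen here as an assumption."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not aligned_a:
        return 0.0
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from the Sequence Listing.
    print(reverse_complement("ATGC"))
    print(rna_equivalent("ATGC"))
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA") >= 90.0)
```

In practice percent identity depends on the alignment algorithm and its parameters, so a threshold such as 90% is only meaningful once those are fixed.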
  • the invention further provides a composition for the detection of expression of secretory polynucleotides comprising at least one isolated polynucleotide comprising a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d); and a detectable label.
  • the invention also provides a method for detecting a target polynucleotide in a sample, said target polynucleotide having a polynucleotide sequence of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence of a polynucleotide selected from the group consisting of SEQ ID NO:1-184; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d).
  • the method comprises a) amplifying said target polynucleotide or fragment thereof using polymerase chain reaction amplification, and b) detecting the presence or absence of said amplified target polynucleotide or fragment thereof, and, optionally, if present, the amount thereof.
  • the invention also provides a method for detecting a target polynucleotide in a sample, said target polynucleotide having a polynucleotide sequence of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d).
  • the method comprises a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides comprising a sequence complementary to said target polynucleotide in the sample, and which probe specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization complex is formed between said probe and said target polynucleotide, and b) detecting the presence or absence of said hybridization complex, and, optionally, if present, the amount thereof.
  • the invention provides a composition comprising a target polynucleotide of the method, wherein said probe comprises at least 30 contiguous nucleotides.
  • the invention provides a composition comprising a target polynucleotide of the method, wherein said probe comprises at least 60 contiguous nucleotides.
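The probe requirement described above (at least 20 contiguous nucleotides complementary to the target) can be sketched as an exact-match search for the probe's reverse complement in the target strand. This is a deliberate simplification and an assumption of this example: real hybridization depends on stringency conditions and can tolerate mismatches, which exact string matching ignores. The function names and toy sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-cased DNA string."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def probe_binding_sites(target, probe, min_length=20):
    """Return 0-based positions in `target` where `probe` is perfectly
    complementary over its full length.

    Raises ValueError if the probe is shorter than `min_length`."""
    if len(probe) < min_length:
        raise ValueError(
            "probe must contain at least %d nucleotides" % min_length
        )
    site = reverse_complement(probe)
    target = target.upper()
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)
    return positions


if __name__ == "__main__":
    # Hypothetical toy target with a 20-nucleotide site at position 3.
    target = "GGG" + "ATGC" * 5 + "TTT"
    probe = reverse_complement("ATGC" * 5)
    print(probe_binding_sites(target, probe))
```

Raising `min_length` to 30 or 60 models the longer probes mentioned in the two preceding items.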
  • the invention further provides a recombinant polynucleotide comprising a promoter sequence operably linked to an isolated polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d).
  • the invention provides a cell transformed with the recombinant polynucleotide.
  • the invention provides a transgenic organism comprising the recombinant polynucleotide.
  • the invention also provides a method for producing a secretory polypeptide, the method comprising a) culturing a cell under conditions suitable for expression of the secretory polypeptide, wherein said cell is transformed with a recombinant polynucleotide, said recombinant polynucleotide comprising an isolated polynucleotide selected from the group consisting of i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:
  • the invention also provides an isolated secretory polypeptide (SPTM) encoded by at least one polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184.
  • SPTM secretory polypeptide
  • the invention further provides a method of screening for a test compound that specifically binds to the polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369. The method comprises a) combining the polypeptide having an amino acid sequence.
  • the invention further provides a microarray wherein at least one element of the microarray is an isolated polynucleotide comprising at least 30 contiguous nucleotides of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d).
  • the invention also provides a method for generating a transcript image of a sample which contains polynucleotides.
  • the method comprises a) labeling the polynucleotides of the sample, b) contacting the elements of the microarray with the labeled polynucleotides of the sample under conditions suitable for the formation of a hybridization complex, and c) quantifying the expression of the polynucleotides in the sample.
  • the invention provides a method for screening a compound for effectiveness in altering expression of a target polynucleotide, wherein said target polynucleotide comprises a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d).
  • the method comprises a) exposing a sample comprising the target polynucleotide to a compound, b) detecting altered expression of the target polynucleotide, and c) comparing the expression of the target polynucleotide in the presence of varying amounts of the compound and in the absence of the compound.
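Step c) above compares expression of the target polynucleotide at varying amounts of the compound against expression without the compound. One common way to express such a comparison is a fold change relative to the untreated control. The sketch below assumes that representation; the function names, the two-fold cutoff, and the numbers are hypothetical illustrations and are not data or criteria from the document.

```python
def fold_changes(untreated, treated_by_dose):
    """Map each dose to expression relative to the untreated control.

    `untreated` is the expression level without compound and
    `treated_by_dose` maps a dose to the level measured at that dose."""
    if untreated <= 0:
        raise ValueError("untreated expression must be positive")
    return {dose: level / untreated for dose, level in treated_by_dose.items()}


def altered_doses(untreated, treated_by_dose, min_fold=2.0):
    """Return the sorted doses at which expression rises or falls by at
    least `min_fold` relative to the untreated control."""
    folds = fold_changes(untreated, treated_by_dose)
    return sorted(
        dose
        for dose, fold in folds.items()
        if fold >= min_fold or fold <= 1.0 / min_fold
    )


if __name__ == "__main__":
    # Hypothetical expression levels in arbitrary units, keyed by dose.
    measurements = {0.1: 110.0, 1.0: 250.0, 10.0: 40.0}
    print(altered_doses(100.0, measurements))
```

A real screen would also need replicates and a statistical test before calling expression "altered"; the fixed cutoff here only illustrates the comparison.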
  • the invention further provides a method for assessing toxicity of a test compound, said method comprising a) treating a biological sample containing nucleic acids with the test compound; b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20 contiguous nucleotides of a polynucleotide selected from the group consisting of i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; iii) a polynucleotide complementary to the polynucleotide of i); iv) a polynucleotide complementary to the polynucleotide of ii); and v) an RNA equivalent of i) through iv).
  • Hybridization occurs under conditions whereby a specific hybridization complex is formed between said probe and a target polynucleotide in the biological sample, said target polynucleotide comprising a polynucleotide sequence of a polynucleotide selected from the group consisting of i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; iii) a polynucleotide complementary to the polynucleotide of i); iv) a polynucleotide complementary to the polynucleotide of ii); and v) an RNA equivalent of i) through iv), and alternatively, the target polynu
  • the invention further provides an isolated polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369.
  • the invention provides an isolated polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:185-369.
  • the invention further provides an isolated polynucleotide encoding a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-
  • polynucleotide encodes a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 185-369.
  • polynucleotide comprises a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184.
  • the invention provides an isolated antibody which specifically binds to a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369.
  • the invention further provides a composition comprising a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, and a pharmaceutically acceptable excipient.
  • the composition comprises a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369.
  • the invention additionally provides a method of treating a disease or condition associated with decreased expression of functional SPTM, comprising administering to a patient in need of such treatment the composition.
  • the invention also provides a method for screening a compound for effectiveness as an agonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369.
  • the method comprises a) exposing a sample comprising the polypeptide to a compound, and b) detecting agonist activity in the sample.
  • the invention provides a composition comprising an agonist compound identified by the method and a pharmaceutically acceptable excipient.
  • the invention provides a method of treating a disease or condition associated with decreased expression of functional SPTM, comprising administering to a patient in need of such treatment the composition.
  • the invention provides a method for screening a compound for effectiveness as an antagonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 185-369, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO: 185-369, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 185-369, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 185-369.
  • the method comprises a) exposing a sample comprising the polypeptide to a compound, and b) detecting antagonist activity in the sample.
  • the invention provides a composition comprising an antagonist compound identified by the method and a pharmaceutically acceptable excipient.
  • the invention provides a method of treating a disease or condition associated with overexpression of functional SPTM, comprising administering to a patient in need of such treatment the composition.
  • the invention further provides a method of screening for a compound that modulates the activity of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 185-369, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO: 185-369, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 185-369, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 185-369.
  • the method comprises a) combining the polypeptide with at least one test compound under conditions permissive for the activity of the polypeptide, b) assessing the activity of the polypeptide in the presence of the test compound, and c) comparing the activity of the polypeptide in the presence of the test compound with the activity of the polypeptide in the absence of the test compound, wherein a change in the activity of the polypeptide in the presence of the test compound is indicative of a compound that modulates the activity of the polypeptide.
  • Table 1 shows the sequence identification numbers (SEQ ID NO:s) and template identification numbers (template IDs) corresponding to the polynucleotides of the present invention, along with the sequence identification numbers (SEQ ID NO:s) and open reading frame identification numbers (ORF IDs) corresponding to polypeptides encoded by the template ID.
  • Table 2 shows the sequence identification numbers (SEQ ID NO:s) and template identification numbers (template IDs) corresponding to the polynucleotides of the present invention, along with polynucleotide segments of each template sequence as defined by the indicated “start” and “stop” nucleotide positions.
  • the reading frames of the polynucleotide segments are shown, and the polypeptides encoded by the polynucleotide segments constitute either signal peptide (SP) or transmembrane (TM) domains, as indicated.
  • SP signal peptide
  • TM transmembrane
  • the membrane topology of the encoded polypeptide sequence is indicated, the N-terminus (N) listed as being oriented to either the cytosolic (N in) or non-cytosolic (N out) side of the cell membrane or organelle.
  • Table 3 shows the sequence identification numbers (SEQ ID NO:s) and template identification numbers (template IDs) corresponding to the polynucleotides of the present invention, along with component sequence identification numbers (component IDs) corresponding to each template.
  • the component sequences, which were used to assemble the template sequences, are defined by the indicated “start” and “stop” nucleotide positions along each template.
  • Table 4 shows the tissue distribution profiles for the templates of the invention.
  • Table 5 shows the sequence identification numbers (SEQ ID NO:s) corresponding to the polypeptides of the present invention, along with the reading frames used to obtain the polypeptide segments, the lengths of the polypeptide segments, the “start” and “stop” nucleotide positions of the polynucleotide sequences used to define the encoded polypeptide segments, the GenBank hits (GI Numbers), probability scores, and functional annotations corresponding to the GenBank hits.
  • Table 6 summarizes the bioinformatics tools which are useful for analysis of the polynucleotides of the present invention.
  • the first column of Table 6 lists analytical tools, programs, and algorithms, the second column provides brief descriptions thereof, the third column presents appropriate references, all of which are incorporated by reference herein in their entirety, and the fourth column presents, where applicable, the scores, probability values, and other parameters used to evaluate the strength of a match between two sequences (the higher the score, the greater the homology between two sequences).
  • sptm refers to a nucleic acid sequence
  • SPTM amino acid sequence encoded by sptm
  • a “full-length” sptm refers to a nucleic acid sequence containing the entire coding region of a gene endogenously expressed in human tissue.
  • Adjuvants are materials such as Freund's adjuvant, mineral gels (aluminum hydroxide), and surface active substances (lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanin, and dinitrophenol) which may be administered to increase a host's immunological response.
  • mineral gels aluminum hydroxide
  • surface active substances lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanin, and dinitrophenol
  • Allele refers to an alternative form of a nucleic acid sequence. Alleles result from a “mutation,” a change or an alternative reading of the genetic code. Any given gene may have none, one, or many allelic forms. Mutations which give rise to alleles include deletions, additions, or substitutions of nucleotides. Each of these changes may occur alone, or in combination with the others, one or more times in a given nucleic acid sequence.
  • the present invention encompasses allelic sptm.
  • amino acid sequence refers to a peptide, a polypeptide, or a protein of either natural or synthetic origin.
  • the amino acid sequence is not limited to the complete, endogenous amino acid sequence and may be a fragment, epitope, variant, or derivative of a protein expressed by a nucleic acid sequence.
  • Amplification refers to the production of additional copies of a sequence and is carried out using polymerase chain reaction (PCR) technologies well known in the art.
  • PCR polymerase chain reaction
  • Antibody refers to intact molecules as well as to fragments thereof, such as Fab, F(ab')2, and Fv fragments, which are capable of binding the epitopic determinant.
  • Antibodies that bind SPTM polypeptides can be prepared using intact polypeptides or using fragments containing small peptides of interest as the immunizing antigen.
  • the polypeptide or peptide used to immunize an animal (e.g., a mouse, a rat, or a rabbit)
  • Commonly used carriers that are chemically coupled to peptides include bovine serum albumin, thyroglobulin, and keyhole limpet hemocyanin (KLH). The coupled peptide is then used to immunize the animal.
  • Antisense sequence refers to a sequence capable of specifically hybridizing to a target sequence.
  • the antisense sequence may include DNA, RNA, or any nucleic acid mimic or analog such as peptide nucleic acid (PNA); oligonucleotides having modified backbone linkages such as phosphorothioates, methylphosphonates, or benzylphosphonates; oligonucleotides having modified sugar groups such as 2'-methoxyethyl sugars or 2'-methoxyethoxy sugars; or oligonucleotides having modified bases such as 5-methyl cytosine, 2'-deoxyuracil, or 7-deaza-2'-deoxyguanosine.
  • PNA peptide nucleic acid
  • the antisense sequence can be DNA, RNA, or any nucleic acid mimic or analog.
  • Antisense technology refers to any technology which relies on the specific hybridization of an antisense sequence to a target sequence.
  • a “bin” is a portion of computer memory space used by a computer program for storage of data, and bounded in such a manner that data stored in a bin may be retrieved by the program.
  • “Biologically active” refers to an amino acid sequence having a structural, regulatory, or biochemical function of a naturally occurring amino acid sequence.
  • “Clone joining” is a process for combining gene bins based upon the bins' containing sequence information from the same clone.
  • the sequences may assemble into a primary gene transcript as well as one or more splice variants.
  • “Complementary” describes the relationship between two single-stranded nucleic acid sequences that anneal by base-pairing (5'-A-G-T-3' pairs with its complement 3'-T-C-A-5').
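The base-pairing relationship in the definition above can be illustrated with a minimal Python sketch. The function names are hypothetical and chosen only for this example; the sketch assumes plain DNA strings over A, C, G, and T, with both strands written 5' to 3' when testing complementarity.

```python
# Watson-Crick pairing for DNA bases (assumption: unambiguous A/C/G/T only).
_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Return the base-by-base complement, read in the same left-to-right order."""
    return "".join(_PAIR[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Return the complementary strand written in the conventional 5'->3' direction."""
    return complement(seq)[::-1]


def are_complementary(strand_a: str, strand_b: str) -> bool:
    """True if two strands, both written 5'->3', anneal over their full length."""
    return reverse_complement(strand_a) == strand_b.upper()
```

With the example from the text, `complement("AGT")` gives `"TCA"`, which is the partner strand read 3' to 5'; written 5' to 3' the same strand is `"ACT"`.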
  • a “component sequence” is a nucleic acid sequence selected by a computer program such as PHRED and used to assemble a consensus or template sequence from one or more component sequences.
  • a “consensus sequence” or “template sequence” is a nucleic acid sequence which has been assembled from overlapping sequences, using a computer program for fragment assembly such as the GELVIEW fragment assembly system (Genetics Computer Group (GCG), Madison WI) or using a relational database management system (RDMS).
  • GCG Genetics Computer Group
  • RDMS relational database management system
  • Conservative amino acid substitutions are those substitutions that, when made, least interfere with the properties of the original protein, i.e., the structure and especially the function of the protein is conserved and not significantly changed by such substitutions.
  • the table below shows amino acids which may be substituted for an original amino acid in a protein and which are regarded as conservative substitutions.
  • Conservative substitutions generally maintain (a) the structure of the polypeptide backbone in the area of the substitution, for example, as a beta sheet or alpha helical conformation, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of the side chain.
  • “Deletion” refers to a change in either a nucleic or amino acid sequence in which at least one nucleotide or amino acid residue, respectively, is absent.
  • “Derivative” refers to the chemical modification of a nucleic acid sequence, such as by replacement of hydrogen by an alkyl, acyl, amino, hydroxyl, or other group.
  • “Differential expression” refers to increased or upregulated; or decreased, downregulated, or absent gene or protein expression, determined by comparing at least two different samples. Such comparisons may be carried out between, for example, a treated and an untreated sample, or a diseased and a normal sample.
  • array element refers to a polynucleotide, polypeptide, or other chemical compound having a unique and defined position on a microarray.
  • E-value refers to the statistical probability that a match between two sequences occurred by chance.
  • Exon shuffling refers to the recombination of different coding regions (exons). Since an exon may represent a structural or functional domain of the encoded protein, new proteins may be assembled through the novel reassortment of stable substructures, thus allowing acceleration of the evolution of new protein functions.
  • a “fragment” is a unique portion of sptm or SPTM which is identical in sequence to but shorter in length than the parent sequence.
  • a fragment may comprise up to the entire length of the defined sequence, minus one nucleotide/amino acid residue.
  • a fragment may comprise from 10 to 1000 contiguous amino acid residues or nucleotides.
  • a fragment used as a probe, primer, antigen, therapeutic molecule, or for other purposes may be at least 5, 10, 15, 16, 20, 25, 30, 40, 50, 60, 75, 100, 150, 250 or at least 500 contiguous amino acid residues or nucleotides in length. Fragments may be preferentially selected from certain regions of a molecule.
  • a polypeptide fragment may comprise a certain length of contiguous amino acids selected from the first 250 or 500 amino acids (or first 25% or 50%) of a polypeptide as shown in a certain defined sequence.
  • these lengths are exemplary, and any length that is supported by the specification, including the Sequence Listing and the figures, may be encompassed by the present embodiments.
  • a fragment of sptm comprises a region of unique polynucleotide sequence that specifically identifies sptm, for example, as distinct from any other sequence in the same genome.
  • a fragment of sptm is useful, for example, in hybridization and amplification technologies and in analogous methods that distinguish sptm from related polynucleotide sequences.
  • the precise length of a fragment of sptm and the region of sptm to which the fragment corresponds are routinely determinable by one of ordinary skill in the art based on the intended purpose for the fragment.
  • a fragment of SPTM is encoded by a fragment of sptm.
  • a fragment of SPTM comprises a region of unique amino acid sequence that specifically identifies SPTM.
  • a fragment of SPTM is useful as an immunogenic peptide for the development of antibodies that specifically recognize SPTM.
  • the precise length of a fragment of SPTM and the region of SPTM to which the fragment corresponds are routinely determinable by one of ordinary skill in the art based on the intended purpose for the fragment.
  • a “full length” nucleotide sequence is one containing at least a start site for translation to a protein sequence, followed by an open reading frame and a stop site, and encoding a "full length” polypeptide.
  • “Hit” refers to a sequence whose annotation will be used to describe a given template. Criteria for selecting the top hit are as follows: if the template has one or more exact nucleic acid matches, the top hit is the exact match with highest percent identity. If the template has no exact matches but has significant protein hits, the top hit is the protein hit with the lowest E-value. If the template has no significant protein hits, but does have significant non-exact nucleotide hits, the top hit is the nucleotide hit with the lowest E-value.
  • “Homology” refers to sequence similarity either between a reference nucleic acid sequence and at least a fragment of an sptm or between a reference amino acid sequence and a fragment of an SPTM.
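The three-tier rule for choosing a top hit can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Hit` record, its field names, the `kind` labels, and the `e_cutoff` significance threshold are all hypothetical, since the text does not say how "significant" is defined or how hits are stored.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Hit:
    accession: str                 # hypothetical identifier for the matched sequence
    kind: str                      # "exact_nucleotide", "protein", or "nucleotide"
    percent_identity: float = 0.0
    e_value: float = 1.0


def select_top_hit(hits: Sequence[Hit], e_cutoff: float = 1e-8) -> Optional[Hit]:
    """Apply the tiered criteria; e_cutoff is an assumed significance threshold."""
    # Tier 1: exact nucleic acid matches, ranked by highest percent identity.
    exact = [h for h in hits if h.kind == "exact_nucleotide"]
    if exact:
        return max(exact, key=lambda h: h.percent_identity)
    # Tier 2: significant protein hits, ranked by lowest E-value.
    protein = [h for h in hits if h.kind == "protein" and h.e_value <= e_cutoff]
    if protein:
        return min(protein, key=lambda h: h.e_value)
    # Tier 3: significant non-exact nucleotide hits, ranked by lowest E-value.
    nucleotide = [h for h in hits if h.kind == "nucleotide" and h.e_value <= e_cutoff]
    if nucleotide:
        return min(nucleotide, key=lambda h: h.e_value)
    return None  # no hit qualifies, so the template stays unannotated
```

The tiers are evaluated strictly in order, so a protein hit with a very low E-value never displaces an exact nucleic acid match.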
  • Hybridization refers to the process by which a strand of nucleotides anneals with a complementary strand through base pairing. Specific hybridization is an indication that two nucleic acid sequences share a high degree of identity. Specific hybridization complexes form under defined annealing conditions, and remain hybridized after the "washing" step.
  • the defined hybridization conditions include the annealing conditions and the washing step(s), the latter of which is particularly important in determining the stringency of the hybridization process, with more stringent conditions allowing less non-specific binding, i.e., binding between pairs of nucleic acid probes that are not perfectly matched.
  • Permissive conditions for annealing of nucleic acid sequences are routinely determinable and may be consistent among hybridization experiments, whereas wash conditions may be varied among experiments to achieve the desired stringency.
  • stringency of hybridization is expressed with reference to the temperature under which the wash step is carried out.
  • wash temperatures are selected to be about 5°C to 20°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH.
  • Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe.
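The wash-temperature window described above is simple arithmetic on the melting point. A minimal Python sketch, assuming the melting point has already been measured or estimated elsewhere (the function name and the example value of 80°C are illustrative, not from the text):

```python
from typing import Tuple


def wash_temperature_range(tm_celsius: float,
                           min_offset: float = 5.0,
                           max_offset: float = 20.0) -> Tuple[float, float]:
    """Return (lowest, highest) wash temperature, i.e. 20 to 5 degrees C below Tm."""
    if min_offset > max_offset:
        raise ValueError("min_offset must not exceed max_offset")
    return (tm_celsius - max_offset, tm_celsius - min_offset)


# Illustrative input only: a probe/target pair with an assumed Tm of 80 degrees C.
low, high = wash_temperature_range(80.0)
```

Washing nearer the top of the window (closer to the melting point) is the more stringent choice, which matches the statement that stringency is expressed through the wash temperature.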
  • High stringency conditions for hybridization between polynucleotides of the present invention include wash conditions of 68°C in the presence of about 0.2 x SSC and about 0.1% SDS, for 1 hour. Alternatively, temperatures of about 65°C, 60°C, or 55°C may be used. SSC concentration may be varied from about 0.2 to 2 x SSC, with SDS being present at about 0.1%.
  • blocking reagents are used to block non-specific hybridization. Such blocking reagents include, for instance, denatured salmon sperm DNA at about 100-200 µg/ml. Useful variations on these conditions will be readily apparent to those skilled in the art.
  • Hybridization, particularly under high stringency conditions may be suggestive of evolutionary similarity between the nucleotides. Such similarity is strongly indicative of a similar role for the nucleotides and their resultant proteins.
  • RNA:DNA hybridizations. Appropriate hybridization conditions are routinely determinable by one of ordinary skill in the art.
  • Immunologically active or “immunogenic” describes the potential for a natural, recombinant, or synthetic peptide, epitope, polypeptide, or protein to induce antibody production in appropriate animals, cells, or cell lines.
  • “Insertion” or “addition” refers to a change in either a nucleic or amino acid sequence in which at least one nucleotide or residue, respectively, is added to the sequence.
  • Labeling refers to the covalent or noncovalent joining of a polynucleotide, polypeptide, or antibody with a reporter molecule capable of producing a detectable or measurable signal.
  • “Microarray” is any arrangement of nucleic acids, amino acids, antibodies, etc., on a substrate.
  • the substrate may be a solid support such as beads, glass, paper, nitrocellulose, nylon, or an appropriate membrane.
  • Linkers are short stretches of nucleotide sequence which may be added to a vector or an sptm to create restriction endonuclease sites to facilitate cloning.
  • Polylinkers are engineered to incorporate multiple restriction enzyme sites and to provide for the use of enzymes which leave 5' or 3' overhangs (e.g., BamHI, EcoRI, and HindIII) and those which provide blunt ends (e.g., EcoRV, SnaBI, and StuI).
  • Naturally occurring refers to an endogenous polynucleotide or polypeptide that may be isolated from viruses or prokaryotic or eukaryotic cells.
  • Nucleic acid sequence refers to the specific order of nucleotides joined by phosphodiester bonds in a linear, polymeric arrangement. Depending on the number of nucleotides, the nucleic acid sequence can be considered an oligomer, oligonucleotide, or polynucleotide.
  • the nucleic acid can be DNA, RNA, or any nucleic acid analog, such as PNA, may be of genomic or synthetic origin, may be either double-stranded or single-stranded, and can represent either the sense or antisense (complementary) strand.
  • Oligomer refers to a nucleic acid sequence of at least about 6 nucleotides and as many as about 60 nucleotides, preferably about 15 to 40 nucleotides, and most preferably between about 20 and 30 nucleotides, that may be used in hybridization or amplification technologies. Oligomers may be used as, e.g., primers for PCR, and are usually chemically synthesized.
  • operably linked refers to the situation in which a first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence.
  • a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression of the coding sequence.
  • operably linked DNA sequences may be in close proximity or contiguous and, where necessary to join two protein coding regions, in the same reading frame.
  • PNA peptide nucleic acid
  • PNA refers to a DNA mimic in which nucleotide bases are attached to a pseudopeptide backbone to increase stability. PNAs, also designated antigene agents, can prevent gene expression by targeting complementary messenger RNA.
  • Percent identity and “% identity”, as applied to polynucleotide sequences, refer to the percentage of residue matches between at least two polynucleotide sequences aligned using a standardized algorithm. Such an algorithm may insert, in a standardized and reproducible way, gaps in the sequences being compared in order to optimize alignment between two sequences, and therefore achieve a more meaningful comparison of the two sequences. Percent identity between polynucleotide sequences may be determined using the default parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e sequence alignment program. This program is part of the LASERGENE software package, a suite of molecular biological analysis programs (DNASTAR, Madison WI).
  • CLUSTAL V is described in Higgins, D.G. and Sharp, P.M. (1989) CABIOS 5:151-153 and in Higgins, D.G. et al. (1992) CABIOS 8:189-191.
  • the "weighted” residue weight table is selected as the default. Percent identity is reported by CLUSTAL V as the "percent similarity" between aligned polynucleotide sequence pairs.
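As a rough illustration of "percentage of residue matches" between aligned sequences, the following Python sketch counts identical columns in two sequences that have already been aligned and padded with gap characters. It is not the CLUSTAL V or MEGALIGN calculation: the alignment step is assumed to have been done elsewhere, and the choice of denominator (every column in which at least one sequence has a residue) is an assumption made for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned columns with identical residues (gap-only columns skipped)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # a column with no residue in either sequence carries no information
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0
```

Different tools use different denominators (shorter sequence length, alignment length, or ungapped columns only), so values from this sketch should not be expected to reproduce a particular program's report.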
  • NCBI National Center for Biotechnology Information
  • BLAST Basic Local Alignment Search Tool
  • the BLAST software suite includes various sequence analysis programs including “blastn,” that is used to determine alignment between a known polynucleotide sequence and other sequences on a variety of databases.
  • BLAST 2 Sequences are commonly used with gap and other parameters set to default settings. For example, to compare two nucleotide sequences, one may use blastn with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999) set at default parameters. Such default parameters may be, for example: Matrix: BLOSUM62; Reward for match: 1; Penalty for mismatch: -2
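To show how the match reward and mismatch penalty listed above combine, here is a minimal Python sketch that computes a raw score for an ungapped nucleotide alignment. It is an illustration only, not the BLAST implementation: gap costs, extension heuristics, and statistical scoring are left out, and the function name is hypothetical.

```python
def ungapped_raw_score(aligned_a: str, aligned_b: str,
                       reward: int = 1, penalty: int = -2) -> int:
    """Sum +reward for each matching column and +penalty for each mismatch."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be the same length for an ungapped alignment")
    score = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        score += reward if x == y else penalty
    return score
```

For two six-base sequences that differ at one position, the raw score is 5 x 1 + 1 x (-2) = 3.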
  • Percent identity may be measured over the length of an entire defined sequence, for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for example, over the length of a fragment taken from a larger, defined sequence, for instance, a fragment of at least 20, at least 30, at least 40, at least 50, at least 70, at least 100, or at least 200 contiguous nucleotides.
  • Such lengths are exemplary only, and it is understood that any fragment length supported by the sequences shown herein, in figures or Sequence Listings, may be used to describe a length over which percentage identity may be measured.
  • Nucleic acid sequences that do not show a high degree of identity may nevertheless encode similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes in nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid sequences that all encode substantially the same protein.
  • Percent identity and “% identity”, as applied to polypeptide sequences, refer to the percentage of residue matches between at least two polypeptide sequences aligned using a standardized algorithm. Methods of polypeptide sequence alignment are well-known. Some alignment methods take into account conservative amino acid substitutions. Such conservative substitutions, explained in more detail above, generally preserve the hydrophobicity and acidity of the substituted residue, thus preserving the structure (and therefore function) of the folded polypeptide. Percent identity between polypeptide sequences may be determined using the default parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e sequence alignment program (described and referenced above).
  • NCBI BLAST software suite may be used.
  • BLAST 2 Sequences Version 2.0.9 (May-07-1999) with blastp set at default parameters.
  • Such default parameters may be, for example: Matrix: BLOSUM62
  • Percent identity may be measured over the length of an entire defined polypeptide sequence, for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for example, over the length of a fragment taken from a larger, defined polypeptide sequence, for instance, a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least 150 contiguous residues.
  • Such lengths are exemplary only, and it is understood that any fragment length supported by the sequences shown herein, in figures or Sequence Listings, may be used to describe a length over which percentage identity may be measured.
  • Post-translational modification of an SPTM may involve lipidation, glycosylation, phosphorylation, acetylation, racemization, proteolytic cleavage, and other modifications known in the art. These processes may occur synthetically or biochemically. Biochemical modifications will vary by cell type depending on the enzymatic milieu and the SPTM.
  • Probe refers to sptm or fragments thereof, which are used to detect identical, allelic or related nucleic acid sequences.
  • Probes are isolated oligonucleotides or polynucleotides attached to a detectable label or reporter molecule. Typical labels include radioactive isotopes, ligands, chemiluminescent agents, and enzymes.
  • Primers are short nucleic acids, usually DNA oligonucleotides, which may be annealed to a target polynucleotide by complementary base-pairing. The primer may then be extended along the target DNA strand by a DNA polymerase enzyme. Primer pairs can be used for amplification (and identification) of a nucleic acid sequence, e.g., by the polymerase chain reaction (PCR).
  • Probes and primers as used in the present invention typically comprise at least 15 contiguous nucleotides of a known sequence. In order to enhance specificity, longer probes and primers may also be employed, such as probes and primers that comprise at least 20, 30, 40, 50, 60, 70, 80, 90, 100, or at least 150 consecutive nucleotides of the disclosed nucleic acid sequences. Probes and primers may be considerably longer than these examples, and it is understood that any length supported by the specification, including the figures and Sequence Listing, may be used.
  • PCR primer pairs can be derived from a known sequence, for example, by using computer programs intended for that purpose such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge MA).
  • Oligonucleotides for use as primers are selected using software known in the art for such purpose. For example, OLIGO 4.06 software is useful for the selection of PCR primer pairs of up to 100 nucleotides each, and for the analysis of oligonucleotides and larger polynucleotides of up to 5,000 nucleotides from an input polynucleotide sequence of up to 32 kilobases. Similar primer selection programs have incorporated additional features for expanded capabilities. For example, the PrimOU primer selection program (available to the public from the Genome Center at University of Texas South West Medical Center, Dallas TX) is capable of choosing specific primers from megabase sequences and is thus useful for designing primers on a genome-wide scope.
  • the Primer3 primer selection program (available to the public from the Whitehead Institute/MIT Center for Genome Research, Cambridge MA) allows the user to input a "mispriming library," in which sequences to avoid as primer binding sites are user-specified. Primer3 is useful, in particular, for the selection of oligonucleotides for microarrays. (The source code for the latter two primer selection programs may also be obtained from their respective sources and modified to meet the user's specific needs.)
  • the PrimeGen program (available to the public from the UK Human Genome Mapping Project Resource Centre, Cambridge UK) designs primers based on multiple sequence alignments, thereby allowing selection of primers that hybridize to either the most conserved or least conserved regions of aligned nucleic acid sequences.
  • this program is useful for identification of both unique and conserved oligonucleotides and polynucleotide fragments.
  • the oligonucleotides and polynucleotide fragments identified by any of the above selection methods are useful in hybridization technologies, for example, as PCR or sequencing primers, microarray elements, or specific probes to identify fully or partially complementary polynucleotides in a sample of nucleic acids. Methods of oligonucleotide selection are not limited to those described above.
  • “Purified” refers to molecules, either polynucleotides or polypeptides that are isolated or separated from their natural environment and are at least 60% free, preferably at least 75% free, and most preferably at least 90% free from other compounds with which they are naturally associated.
  • a "recombinant nucleic acid” is a sequence that is not naturally occurring or has a sequence that is made by an artificial combination of two or more otherwise separated segments of sequence. This artificial combination is often accomplished by chemical synthesis or, more commonly, by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques such as those described in Sambrook, supra.
  • the term recombinant includes nucleic acids that have been altered solely by addition, substitution, or deletion of a portion of the nucleic acid.
  • a recombinant nucleic acid may include a nucleic acid sequence operably linked to a promoter sequence. Such a recombinant nucleic acid may be part of a vector that is used, for example, to transform a cell.
  • such recombinant nucleic acids may be part of a viral vector, e.g., based on a vaccinia virus, that could be used to vaccinate a mammal wherein the recombinant nucleic acid is expressed, inducing a protective immunological response in the mammal.
  • Regulatory element refers to a nucleic acid sequence from nontranslated regions of a gene, and includes enhancers, promoters, introns, and 3' untranslated regions, which interact with host proteins to carry out or regulate transcription or translation.
  • Reporter molecules are chemical or biochemical moieties used for labeling a nucleic acid, an amino acid, or an antibody. They include radionuclides; enzymes; fluorescent, chemiluminescent, or chromogenic agents; substrates; cofactors; inhibitors; magnetic particles; and other moieties known in the art.
  • RNA equivalent in reference to a DNA sequence, is composed of the same linear sequence of nucleotides as the reference DNA sequence with the exception that all occurrences of the nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose instead of deoxyribose.
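  • The base substitution in this definition can be sketched in Python as follows. This is only an illustration of the thymine-to-uracil replacement at the level of sequence text; the function name `rna_equivalent` is hypothetical, and the change of sugar backbone has no counterpart in a plain string.

```python
def rna_equivalent(dna: str) -> str:
    """Return the RNA equivalent of a DNA sequence string.

    Every thymine (T/t) is replaced with uracil (U/u); all other
    characters, including unidentified bases such as N, are kept.
    """
    return dna.translate(str.maketrans("Tt", "Uu"))


print(rna_equivalent("ATGCTTAG"))  # AUGCUUAG
```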
  • Samples may contain nucleic or amino acids, antibodies, or other materials, and may be derived from any source (e.g., bodily fluids including, but not limited to, saliva, blood, and urine; chromosome(s), organelles, or membranes isolated from a cell; genomic DNA, RNA, or cDNA in solution or bound to a substrate; and cleared cells or tissues or blots or imprints from such cells or tissues).
  • Specific binding or “specifically binding” refers to the interaction between a protein or peptide and its agonist, antibody, antagonist, or other binding partner. The interaction is dependent upon the presence of a particular structure of the protein, e.g., the antigenic determinant or epitope, recognized by the binding molecule. For example, if an antibody is specific for epitope "A,” the presence of a polypeptide containing epitope A, or the presence of free unlabeled A, in a reaction containing free labeled A and the antibody will reduce the amount of labeled A that binds to the antibody.
  • substitution refers to the replacement of at least one nucleotide or amino acid by a different nucleotide or amino acid.
  • Substrate refers to any suitable rigid or semi-rigid support including, e.g., membranes, filters, chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers, microparticles or capillaries.
  • the substrate can have a variety of surface forms, such as wells, trenches, pins, channels and pores, to which polynucleotides or polypeptides are bound.
  • a “transcript image” refers to the collective pattern of gene expression by a particular tissue or cell type under given conditions at a given time.
  • Transformation refers to a process by which exogenous DNA enters a recipient cell. Transformation may occur under natural or artificial conditions using various methods well known in the art. Transformation may rely on any known method for the insertion of foreign nucleic acid sequences into a prokaryotic or eukaryotic host cell. The method is selected based on the host cell being transformed.
  • Transformants include stably transformed cells in which the inserted DNA is capable of replication either as an autonomously replicating plasmid or as part of the host chromosome, as well as cells which transiently express inserted DNA or RNA.
  • a "transgenic organism,” as used herein, is any organism, including but not limited to animals and plants, in which one or more of the cells of the organism contains heterologous nucleic acid introduced by way of human intervention, such as by transgenic techniques well known in the art.
  • the nucleic acid is introduced into the cell, directly or indirectly by introduction into a precursor of the cell, by way of deliberate genetic manipulation, such as by microinjection or by infection with a recombinant virus.
  • the term genetic manipulation does not include classical cross-breeding, or in vitro fertilization, but rather is directed to the introduction of a recombinant DNA molecule.
  • the transgenic organisms contemplated in accordance with the present invention include bacteria, cyanobacteria, fungi, and plants and animals.
  • the isolated DNA of the present invention can be introduced into the host by methods known in the art, for example infection, transfection, transformation or transconjugation. Techniques for transferring the DNA of the present invention into such organisms are widely known and provided in references such as Sambrook et al. (1989), supra.
  • a "variant" of a particular nucleic acid sequence is defined as a nucleic acid sequence having at least 25% sequence identity to the particular nucleic acid sequence over a certain length of one of the nucleic acid sequences using blastn with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999) set at default parameters.
  • Such a pair of nucleic acids may show, for example, at least 30%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence identity over a certain defined length.
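  • As a rough illustration of the percent-identity criterion, the sketch below computes identity over two sequences that have already been aligned to equal length (gaps written as "-"). It is not blastn and does not reproduce the "BLAST 2 Sequences" tool; the function names, the gap convention, and the choice to divide by the full alignment length are assumptions made for the example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"  # a gap never counts as a match
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the pair reaches the given percent identity over the aligned length."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

  • The threshold is passed explicitly so that the same helper can be applied with whichever cutoff a given definition states.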
  • the variant may result in "conservative" amino acid changes which do not affect structural and/or chemical properties.
  • a variant may be described as, for example, an “allelic” (as defined above), “splice,” “species,” or “polymorphic” variant.
  • a splice variant may have significant identity to a reference molecule, but will generally have a greater or lesser number of polynucleotides due to alternate splicing of exons during mRNA processing.
  • the corresponding polypeptide may possess additional functional domains or lack domains that are present in the reference molecule.
  • Species variants are polynucleotide sequences that vary from one species to another. The resulting polypeptides generally will have significant amino acid identity relative to each other.
  • a polymorphic variant is a variation in the polynucleotide sequence of a particular gene between individuals of a given species.
  • Polymorphic variants also may encompass "single nucleotide polymorphisms" (SNPs) in which the polynucleotide sequence varies by one base. The presence of SNPs may be indicative of, for example, a certain population, a disease state, or a propensity for a disease state.
  • variants of the polynucleotides of the present invention may be generated through recombinant methods.
  • One possible method is a DNA shuffling technique such as MOLECULARBREEDING (Maxygen Inc., Santa Clara CA; described in U.S.
  • DNA shuffling is a process by which a library of gene variants is produced using PCR-mediated recombination of gene fragments. The library is then subjected to selection or screening procedures that identify those gene variants with the desired properties.
  • a “variant" of a particular polypeptide sequence is defined as a polypeptide sequence having at least 40% sequence identity to the particular polypeptide sequence over a certain length of one of the polypeptide sequences using blastp with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999) set at default parameters.
  • Such a pair of polypeptides may show, for example, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence identity over a certain defined length of one of the polypeptides.
  • cDNA sequences derived from human tissues and cell lines were aligned based on nucleotide sequence identity and assembled into "consensus" or "template" sequences which are designated by the template identification numbers (template IDs) in column 2 of Table 2.
  • the sequence identification numbers (SEQ ID NO:s) corresponding to the template IDs are shown in column 1. Segments of the template sequences are defined by the "start" and "stop" nucleotide positions listed in columns 3 and 4. These segments, when translated in the reading frames indicated in column 5, have similarity to signal peptide (SP) or transmembrane (TM) domain consensus sequences, as indicated in column 6.
  • SP signal peptide
  • TM transmembrane domain consensus sequences
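  • The translation of a template segment defined by start, stop, and reading frame can be sketched as follows. The layout of Table 2 is not reproduced here, so several conventions are assumptions: positions are taken as 1-based and inclusive, the frame is taken as forward frame 1, 2, or 3 counted from the first base of the template, reverse-strand frames are not handled, and the names `CODON_TABLE` and `translate_segment` are hypothetical. The standard genetic code is used.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_segment(template: str, start: int, stop: int, frame: int) -> str:
    """Translate template[start..stop] (1-based, inclusive) in forward frame 1-3.

    Translation begins at the first codon boundary of the given frame that
    lies at or after `start`; codons containing unidentified bases give "X".
    """
    if frame not in (1, 2, 3):
        raise ValueError("only forward frames 1, 2, 3 are handled in this sketch")
    offset = (frame - 1 - (start - 1)) % 3  # bases to skip to reach the frame
    seq = template[start - 1 + offset:stop].upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )
```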
  • sequences of the present invention are used to develop a transcript image for a particular cell or tissue.
  • cDNA was isolated from libraries constructed using RNA derived from normal and diseased human tissues and cell lines.
  • the human tissues and cell lines used for cDNA library construction were selected from a broad range of sources to provide a diverse population of cDNAs representative of gene transcription throughout the human body. Descriptions of the human tissues and cell lines used for cDNA library construction are provided in the LIFESEQ database (Incyte Genomics, Inc. (Incyte), Palo Alto CA).
  • Human tissues were broadly selected from, for example, cardiovascular, dermatologic, endocrine, gastrointestinal, hematopoietic/immune system, musculoskeletal, neural, reproductive, and urologic sources.
  • Cell lines used for cDNA library construction were derived from, for example, leukemic cells, teratocarcinomas, neuroepitheliomas, cervical carcinoma, lung fibroblasts, and endothelial cells. Such cell lines include, for example, THP-1, Jurkat, HUVEC, hNT2, WI38, HeLa, and other cell lines commonly used and available from public depositories (American Type Culture Collection, Manassas).
  • Chain termination reaction products may be electrophoresed on urea-polyacrylamide gels and detected either by autoradiography (for radioisotope-labeled nucleotides) or by fluorescence (for fluorophore-labeled nucleotides).
  • Automated methods for mechanized reaction preparation, sequencing, and analysis using fluorescence detection methods have been developed.
  • Machines used to prepare cDNAs for sequencing can include the MICROLAB 2200 liquid transfer system (Hamilton Company (Hamilton), Reno NV), Peltier thermal cycler (PTC200; MJ Research, Inc.
  • Sequencing can be carried out using, for example, the ABI 373 or 377 (Applied Biosystems) or MEGABACE 1000 (Molecular Dynamics, Inc. (Molecular Dynamics), Sunnyvale CA) DNA sequencing systems, or other automated and manual sequencing systems well known in the art.
  • nucleotide sequences of the Sequence Listing have been prepared by current, state-of-the-art, automated methods and, as such, may contain occasional sequencing errors or unidentified nucleotides. Such unidentified nucleotides are designated by an N. These infrequent unidentified bases do not represent a hindrance to practicing the invention for those skilled in the art.
  • Several methods employing standard recombinant techniques may be used to correct errors and complete the missing sequence information. (See, e.g., those described in Ausubel, F.M. et al. (1997) Short Protocols in Molecular Biology. John Wiley & Sons, New York NY; and Sambrook, J. et al. (1989) Molecular Cloning. A Laboratory Manual. Cold Spring Harbor Press, Plainview NY.)
  • Human polynucleotide sequences may be assembled using programs or algorithms well known in the art. Sequences to be assembled are related, wholly or in part, and may be derived from a single or many different transcripts. Assembly of the sequences can be performed using such programs as PHRAP (Phil's Revised Assembly Program) and the GELVIEW fragment assembly system (GCG), or other methods known in the art.
  • PHRAP Phil's Revised Assembly Program
  • GCG GELVIEW fragment assembly system
  • cDNA sequences are used as "component" sequences that are assembled into "template" or "consensus" sequences as follows. Sequence chromatograms are processed, verified, and quality scores are obtained using PHRED. Raw sequences are edited using an editing pathway known as Block 1 (See, e.g., the LIFESEQ Assembled User Guide, Incyte Genomics, Palo Alto, CA). A series of BLAST comparisons is performed and low-information segments and repetitive elements (e.g., dinucleotide repeats, Alu repeats, etc.) are replaced by "n's", or masked, to prevent spurious matches. Mitochondrial and ribosomal RNA sequences are also removed.
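  • The masking step can be illustrated with a short Python sketch that replaces dinucleotide repeats by runs of "n". This is not the Block 1 pathway, which is not described in detail here; the regular expression, the function name `mask_dinucleotide_repeats`, and the minimum of six repeated units are assumptions chosen for the example, and Alu or other repeat families would need a separate comparison against a repeat library.

```python
import re


def mask_dinucleotide_repeats(seq: str, min_units: int = 6) -> str:
    """Replace runs of at least `min_units` copies of any dinucleotide with 'n'.

    The masked sequence keeps its original length, so downstream
    coordinates remain valid.
    """
    if min_units < 2:
        raise ValueError("min_units must be at least 2")
    pattern = re.compile(r"([ACGT]{2})\1{%d,}" % (min_units - 1), re.IGNORECASE)
    return pattern.sub(lambda m: "n" * len(m.group(0)), seq)


print(mask_dinucleotide_repeats("ACGT" + "CA" * 8 + "GGTC"))
# ACGTnnnnnnnnnnnnnnnnGGTC
```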
  • the processed sequences are then loaded into a relational database management system (RDMS) which assigns edited sequences to existing templates, if available.
  • RDMS relational database management system
  • a process is initiated which modifies existing templates or creates new templates from works in progress (i.e., nonfinal assembled sequences) containing queued sequences or the sequences themselves.
  • the templates can be merged into bins. If multiple templates exist in one bin, the bin can be split and the templates reannotated.
  • bins are "clone joined" based upon clone information. Clone joining occurs when the 5' sequence of one clone is present in one bin and the 3' sequence from the same clone is present in a different bin, indicating that the two bins should be merged into a single bin. Only bins which share at least two different clones are merged.
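  • The clone-joining rule can be sketched as a small graph-merging routine. The data layout (a mapping from bin identifier to a list of clone-end reads), the function name `clone_join`, and the simplification that each clone end sits in exactly one bin are assumptions made for illustration; the actual database process is not specified at this level of detail.

```python
from collections import defaultdict


def clone_join(bins, min_shared_clones=2):
    """Merge bins bridged by the 5' and 3' reads of enough different clones.

    bins: dict mapping bin id -> iterable of (clone_id, end), end "5'" or "3'".
    Returns a sorted list of merged groups, each a sorted list of bin ids.
    """
    # Record which bin holds each end of every clone.
    ends_by_clone = defaultdict(dict)
    for bin_id, reads in bins.items():
        for clone_id, end in reads:
            ends_by_clone[clone_id][end] = bin_id

    # Collect the clones whose two ends fall in two different bins.
    support = defaultdict(set)
    for clone_id, ends in ends_by_clone.items():
        five, three = ends.get("5'"), ends.get("3'")
        if five is not None and three is not None and five != three:
            support[frozenset((five, three))].add(clone_id)

    # Union-find over bins; join only pairs with enough bridging clones.
    parent = {b: b for b in bins}

    def find(b):
        while parent[b] != b:
            parent[b] = parent[parent[b]]
            b = parent[b]
        return b

    for pair, clones in support.items():
        if len(clones) >= min_shared_clones:
            root_a, root_b = (find(b) for b in pair)
            parent[root_b] = root_a

    groups = defaultdict(set)
    for b in bins:
        groups[find(b)].add(b)
    return sorted(sorted(g) for g in groups.values())
```

  • In this sketch a single bridging clone is ignored, which mirrors the requirement that at least two different clones be shared before two bins are merged.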
  • a resultant template sequence may contain either a partial or a full length open reading frame, or all or part of a genetic regulatory element. This variation is due in part to the fact that the full length cDNAs of many genes are several hundred, and sometimes several thousand, bases in length. With current technology, cDNAs comprising the coding regions of large genes cannot be cloned because of vector limitations, incomplete reverse transcription of the mRNA, or incomplete "second strand" synthesis. Template sequences may be extended to include additional contiguous sequences derived from the parent RNA transcript using a variety of methods known to those of skill in the art. Extension may thus be used to achieve the full length coding sequence of a gene.
  • cDNA sequences are analyzed using a variety of programs and algorithms which are well known in the art. (See, e.g., Ausubel, 1997, supra. Chapter 7.7; Meyers, R.A. (Ed.) (1995) Molecular Biology and Biotechnology. Wiley VCH, New York NY, pp. 856-853; and Table 6.) These analyses comprise both reading frame determinations, e.g., based on triplet codon periodicity for particular organisms (Fickett, J.W. (1982) Nucleic Acids Res. 10:5303-5318); analyses of potential start and stop codons; and homology searches.
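  • The analysis of potential start and stop codons can be illustrated with a simple open reading frame scan over the three forward frames. This sketch does not implement the codon-periodicity statistic of Fickett; it only looks for an ATG followed in frame by a stop codon. The function name `find_orfs`, the 1-based inclusive coordinates, and the `min_codons` cutoff are assumptions for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2):
    """Return (frame, start, stop) for ATG...stop ORFs in forward frames 1-3.

    Coordinates are 1-based and inclusive; `stop` is the last base of the
    stop codon. `min_codons` counts codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame + 1, start + 1, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


print(find_orfs("GGATGAAATTTTAGCC"))  # [(3, 3, 14)]
```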
  • BLAST Basic Local Alignment Search Tool
  • BLAST is especially useful in determining exact matches and comparing two sequence fragments of arbitrary but equal lengths, whose alignment is locally maximal and for which the alignment score meets or exceeds a threshold or cutoff score set by the user.
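  • The notion of a locally maximal, equal-length segment pair whose score meets a cutoff can be sketched without gaps as follows. This is not the BLAST heuristic (there is no word seeding and no statistical evaluation); it is an exhaustive scan of every diagonal using a running-sum maximum. The function name and the +1/-3 match/mismatch scores are assumptions for the example.

```python
def best_ungapped_segment(a: str, b: str, match: int = 1, mismatch: int = -3,
                          cutoff: int = 1):
    """Find the highest-scoring ungapped segment pair between a and b.

    Returns (score, start_in_a, start_in_b, length) with 0-based starts,
    or None if the best score is below `cutoff`.
    """
    best = (0, 0, 0, 0)
    for diag in range(-(len(b) - 1), len(a)):
        # Starting cell of this diagonal.
        i, j = (diag, 0) if diag >= 0 else (0, -diag)
        run_score, run_start, k = 0, 0, 0
        while i + k < len(a) and j + k < len(b):
            if run_score <= 0:
                # A non-positive prefix can never help; restart the segment here.
                run_score, run_start = 0, k
            run_score += match if a[i + k] == b[j + k] else mismatch
            if run_score > best[0]:
                best = (run_score, i + run_start, j + run_start, k - run_start + 1)
            k += 1
    return best if best[0] >= cutoff else None


print(best_ungapped_segment("TTACGTACGG", "CCACGTACAA"))  # (6, 2, 2, 6)
```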
  • Protein hierarchies can be assigned to the putative encoded polypeptide based on, e.g., motif, BLAST, or biological analysis. Methods for assigning these hierarchies are described, for example, in "Database System Employing Protein Function Hierarchies for Viewing Biomolecular Sequence Data," U.S.S.N. 08/812,290, filed March 6, 1997, incorporated herein by reference.
  • the sptm of the present invention may be used for a variety of diagnostic and therapeutic purposes.
  • an sptm may be used to diagnose a particular condition, disease, or disorder associated with cell signaling.
  • Such conditions, diseases, and disorders include, but are not limited to, a cell proliferative disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, a cancer of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix
  • the sptm can be used to detect the presence of, or to quantify the amount of, an sptm-related polynucleotide in a sample. This information is then compared to information obtained from appropriate reference samples, and a diagnosis is established.
  • a polynucleotide complementary to a given sptm can inhibit or inactivate a therapeutically relevant gene related to the sptm.
  • the expression of sptm may be routinely assessed by hybridization-based methods to determine, for example, the tissue-specificity, disease-specificity, or developmental stage-specificity of sptm expression.
  • the level of expression of sptm may be compared among different cell types or tissues, among diseased and normal cell types or tissues, among cell types or tissues at different developmental stages, or among cell types or tissues undergoing various treatments.
  • This type of analysis is useful, for example, to assess the relative levels of sptm expression in fully or partially differentiated cells or tissues, to determine if changes in sptm expression levels are correlated with the development or progression of specific disease states, and to assess the response of a cell or tissue to a specific therapy, for example, in pharmacological or toxicological studies.
  • Methods for the analysis of sptm expression are based on hybridization and amplification technologies and include membrane-based procedures such as northern blot analysis, high-throughput procedures that utilize, for example, microarrays, and PCR-based procedures.
  • the sptm, their fragments, or complementary sequences may be used to identify the presence of and/or to determine the degree of similarity between two (or more) nucleic acid sequences.
  • the sptm may be hybridized to naturally occurring or recombinant nucleic acid sequences under appropriately selected temperatures and salt concentrations. Hybridization with a probe based on the nucleic acid sequence of at least one of the sptm allows for the detection of nucleic acid sequences, including genomic sequences, which are identical or related to the sptm of the Sequence Listing.
  • Probes may be selected from non-conserved or unique regions of at least one of the polynucleotides of SEQ ID NO:1-184 and tested for their ability to identify or amplify the target nucleic acid sequence using standard protocols.
  • Polynucleotide sequences that are capable of hybridizing, in particular, to those shown in SEQ ID NO:1-184 and fragments thereof, can be identified using various conditions of stringency. (See, e.g., Wahl, G.M. and S.L. Berger (1987) Methods Enzymol. 152:399-407; Kimmel, A.R. (1987) Methods Enzymol. 152:507-511.) Hybridization conditions are discussed in "Definitions."
  • a probe for use in Southern or northern hybridization may be derived from a fragment of an sptm sequence, or its complement, that is up to several hundred nucleotides in length and is either single-stranded or double-stranded. Such probes may be hybridized in solution to biological materials such as plasmids, bacterial, yeast, or human artificial chromosomes, cleared or sectioned tissues, or to artificial substrates containing sptm. Microarrays are particularly suitable for identifying the presence of and detecting the level of expression for multiple genes of interest by examining gene expression correlated with, e.g., various stages of development, treatment with a drug or compound, or disease progression.
  • An array analogous to a dot or slot blot may be used to arrange and link polynucleotides to the surface of a substrate using one or more of the following: mechanical (vacuum), chemical, thermal, or UV bonding procedures.
  • Such an array may contain any number of sptm and may be produced by hand or by using available devices, materials, and machines.
  • Microarrays may be prepared, used, and analyzed using methods known in the art. (See, e.g., Brennan, T.M. et al. (1995) U.S. Patent No. 5,474,796; Schena, M. et al. (1996) Proc. Natl. Acad. Sci. USA 93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/251116; Shalon, D. et al. (1995) PCT application WO95/35505; Heller, R.A. et al. (1997) Proc. Natl. Acad. Sci. USA 94:2150-2155; and Heller, M.J. et al. (1997) U.S. Patent No. 5,605,662.)
  • Probes may be labeled by either PCR or enzymatic techniques using a variety of commercially available reporter molecules.
  • commercial kits are available for radioactive and chemiluminescent labeling (Amersham Pharmacia Biotech) and for alkaline phosphatase labeling (Life Technologies).
  • sptm may be cloned into commercially available vectors for the production of RNA probes.
  • Such probes may be transcribed in the presence of at least one labeled nucleotide (e.g., 32P-ATP, Amersham Pharmacia Biotech).
  • polynucleotides of SEQ ID NO:1-184 or suitable fragments thereof can be used to isolate full length cDNA sequences utilizing hybridization and/or amplification procedures well known in the art, e.g., cDNA library screening, PCR amplification, etc.
  • the molecular cloning of such full length cDNA sequences may employ the method of cDNA library screening with probes using the hybridization, stringency, washing, and probing strategies described above and in Ausubel, supra. Chapters 3, 5, and 6.
  • These procedures may also be employed with genomic libraries to isolate genomic sequences of sptm in order to analyze, e.g., regulatory elements.
  • Gene identification and mapping are important in the investigation and treatment of almost all conditions, diseases, and disorders. Cancer, cardiovascular disease, Alzheimer's disease, arthritis, diabetes, and mental illnesses are of particular interest. Each of these conditions is more complex than the single gene defects of sickle cell anemia or cystic fibrosis, with select groups of genes being predictive of predisposition for a particular condition, disease, or disorder. For example, cardiovascular disease may result from malfunctioning receptor molecules that fail to clear cholesterol from the bloodstream, and diabetes may result when a particular individual's immune system is activated by an infection and attacks the insulin-producing cells of the pancreas. In some studies, Alzheimer's disease has been linked to a gene on chromosome 21; other studies predict a different gene and location. Mapping of disease genes is a complex and reiterative process and generally proceeds from genetic linkage analysis to physical mapping.
  • a genetic linkage map traces parts of chromosomes that are inherited in the same pattern as the condition.
  • Statistics link the inheritance of particular conditions to particular regions of chromosomes, as defined by RFLP or other markers.
  • RFLP restriction fragment length polymorphism
  • markers and their locations are known from previous studies. More often, however, the markers are simply stretches of DNA that differ among individuals. Examples of genetic linkage maps can be found in various scientific journals or at the Online Mendelian Inheritance in Man (OMIM) World Wide Web site.
  • sptm sequences may be used to generate hybridization probes useful in chromosomal mapping of naturally occurring genomic sequences. Either coding or noncoding sequences of sptm may be used, and in some instances, noncoding sequences may be preferable over coding sequences. For example, conservation of an sptm coding sequence among members of a multi-gene family may potentially cause undesired cross hybridization during chromosomal mapping.
  • sequences may be mapped to a particular chromosome, to a specific region of a chromosome, or to artificial chromosome constructions, e.g., human artificial chromosomes (HACs), yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1 constructions, or single chromosome cDNA libraries.
  • HACs human artificial chromosomes
  • YACs yeast artificial chromosomes
  • BACs bacterial artificial chromosomes
  • Fluorescent in situ hybridization (FISH) may be correlated with other physical chromosome mapping techniques and genetic map data.
  • FISH Fluorescent in situ hybridization
  • Correlation between the location of sptm on a physical chromosomal map and a specific disorder, or a predisposition to a specific disorder, may help define the region of DNA associated with that disorder.
  • the sptm sequences may also be used to detect polymorphisms that are genetically linked to the inheritance of a particular condition, disease, or disorder.
  • In situ hybridization of chromosomal preparations and genetic mapping techniques may be used for extending existing genetic maps. Often the placement of a gene on the chromosome of another mammalian species, such as mouse, may reveal associated markers even if the number or arm of the corresponding human chromosome is not known. These new marker sequences can be mapped to human chromosomes and may provide valuable information to investigators searching for disease genes using positional cloning or other gene discovery techniques.
  • any sequences mapping to that area may represent associated or regulatory genes for further investigation.
  • the nucleotide sequences of the subject invention may also be used to detect differences in chromosomal architecture due to translocation, inversion, etc., among normal, carrier, or affected individuals.
  • Once a disease-associated gene is mapped to a chromosomal region, the gene must be cloned in order to identify mutations or other alterations (e.g., translocations or inversions) that may be correlated with disease.
  • This process requires a physical map of the chromosomal region containing the disease-gene of interest along with associated markers. A physical map is necessary for determining the nucleotide sequence of and order of marker genes on a particular chromosomal region. Physical mapping techniques are well known in the art and require the generation of overlapping sets of cloned DNA fragments from a particular organelle, chromosome, or genome. These clones are analyzed to reconstruct and catalog their order. Once the position of a marker is determined, the DNA from that region is obtained by consulting the catalog and selecting clones from that region. The gene of interest is located through positional cloning techniques using hybridization or similar methods.
  • the sptm of the present invention may be used to design probes useful in diagnostic assays. Such assays, well known to those skilled in the art, may be used to detect or confirm conditions, disorders, or diseases associated with abnormal levels of sptm expression. Labeled probes developed from sptm sequences are added to a sample under hybridizing conditions of desired stringency. In some instances, sptm, or fragments or oligonucleotides derived from sptm, may be used as primers in amplification steps prior to hybridization. The amount of hybridization complex formed is quantified and compared with standards for that cell or tissue. If sptm expression varies significantly from the standard, the assay indicates the presence of the condition, disorder, or disease.
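  • The comparison of a quantified hybridization signal with standards can be illustrated by a simple z-score test. No particular statistic is prescribed here, so the use of a z-score, the cutoff of 2.0, and the function name `deviates_from_standard` are all assumptions; the signal values in the usage line are invented for the example and are not measurements.

```python
from statistics import mean, stdev


def deviates_from_standard(sample_signal: float, standard_signals, z_cutoff: float = 2.0):
    """Compare one sample's signal with replicate standard measurements.

    Returns (flag, z), where z is the number of standard deviations the
    sample lies from the standard mean and flag is True when |z| >= z_cutoff.
    """
    mu = mean(standard_signals)
    sigma = stdev(standard_signals)  # needs at least two standard values
    if sigma == 0:
        raise ValueError("standard measurements show no variation")
    z = (sample_signal - mu) / sigma
    return abs(z) >= z_cutoff, z


# Hypothetical signal intensities in arbitrary units.
flag, z = deviates_from_standard(130.0, [100.0, 110.0, 90.0, 105.0, 95.0])
print(flag, round(z, 2))  # True 3.79
```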
  • Qualitative or quantitative diagnostic methods may include northern, dot blot, or other membrane or dip-stick based technologies or multiple-sample format technologies such as PCR, enzyme-linked immunosorbent assay (ELISA)-like, pin, or chip-based assays.
  • ELISA enzyme-linked immunosorbent assay
  • the probes described above may also be used to monitor the progress of conditions, disorders, or diseases associated with abnormal levels of sptm expression, or to evaluate the efficacy of a particular therapeutic treatment.
  • the candidate probe may be identified from the sptm that are specific to a given human tissue and have not been observed in GenBank or other genome databases. Such a probe may be used in animal studies, preclinical tests, clinical trials, or in monitoring the treatment of an individual patient. In a typical process, standard expression is established by methods well known in the art for use as a basis of comparison, samples from patients affected by the disorder or disease are combined with the probe to evaluate any deviation from the standard profile, and a therapeutic agent is administered and effects are monitored to generate a treatment profile.
  • Efficacy is evaluated by determining whether the expression progresses toward or returns to the standard normal pattern. Treatment profiles may be generated over a period of several days or several months. Statistical methods well known to those skilled in the art may be used to determine the significance of such therapeutic agents.
  • the polynucleotides are also useful for identifying individuals from minute biological samples, for example, by matching the RFLP pattern of a sample's DNA to that of an individual's DNA.
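  • The idea of matching RFLP patterns can be sketched in silico by digesting two sequences at a restriction site and comparing the resulting fragment lengths. In the laboratory the pattern is read from bands on a gel after probe hybridization, so this is only a simplified model; the choice of the EcoRI site (G^AATTC), the treatment of the DNA as linear, and the function names are assumptions for the example.

```python
def fragment_lengths(seq: str, site: str = "GAATTC", cut_offset: int = 1):
    """Sorted fragment lengths after cutting linear DNA at every `site`.

    `cut_offset` is the number of site bases left of the cut
    (1 for EcoRI, which cuts between G and AATTC).
    """
    seq = seq.upper()
    cuts = []
    i = seq.find(site)
    while i != -1:
        cuts.append(i + cut_offset)
        i = seq.find(site, i + 1)
    bounds = [0] + cuts + [len(seq)]
    return sorted(end - start for start, end in zip(bounds, bounds[1:]))


def same_rflp_pattern(seq_a: str, seq_b: str, site: str = "GAATTC",
                      cut_offset: int = 1) -> bool:
    """True if both sequences give the same set of fragment lengths."""
    return (fragment_lengths(seq_a, site, cut_offset)
            == fragment_lengths(seq_b, site, cut_offset))
```

  • A single base change that destroys a restriction site merges two fragments into one, which is what makes the pattern informative for distinguishing individuals.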
  • the polynucleotides of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. These sequences can be used to prepare PCR primers for amplifying and isolating such selected DNA, which can then be sequenced. Using this technique, an individual can be identified through a unique set of DNA sequences. Once a unique ID database is established for an individual, positive identification of that individual can be made from extremely small tissue samples.
  • oligonucleotide primers derived from the sptm of the invention may be used to detect single nucleotide polymorphisms (SNPs).
  • SNPs are substitutions, insertions and deletions that are a frequent cause of inherited or acquired genetic disease in humans.
  • Methods of SNP detection include, but are not limited to, single-stranded conformation polymorphism (SSCP) and fluorescent SSCP (fSSCP) methods.
  • SSCP single-stranded conformation polymorphism
  • fSSCP fluorescent SSCP
  • oligonucleotide primers derived from sptm are used to amplify DNA using the polymerase chain reaction (PCR).
  • the DNA may be derived, for example, from diseased or normal tissue, biopsy samples, bodily fluids, and the like.
  • SNPs in the DNA cause differences in the secondary and tertiary structures of PCR products in single-stranded form, and these differences are detectable using gel electrophoresis in non-denaturing gels.
  • the oligonucleotide primers are fluorescently labeled, which allows detection of the amplimers in high-throughput equipment such as DNA sequencing machines.
  • sequence database analysis methods termed in silico SNP (isSNP) are capable of identifying polymorphisms by comparing the sequences of individual overlapping DNA fragments which assemble into a common consensus sequence.
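  • The comparison of overlapping fragments against a common consensus can be sketched as a column-by-column count. The sketch assumes the fragments have already been placed on the same consensus coordinates and padded with "-" where they provide no coverage; the function name `candidate_snps` and the requirement that each allele be seen in at least two fragments are assumptions, and base quality scores, which a practical method would likely weigh, are ignored.

```python
from collections import Counter


def candidate_snps(aligned_fragments, min_support: int = 2):
    """Report consensus positions where two or more bases are well supported.

    aligned_fragments: equal-length strings on consensus coordinates,
    with '-' (or any non-ACGT character) where a fragment has no base.
    Returns a list of (1-based position, sorted list of supported bases).
    """
    if not aligned_fragments:
        return []
    width = len(aligned_fragments[0])
    if any(len(f) != width for f in aligned_fragments):
        raise ValueError("all fragments must span the same coordinates")
    snps = []
    for pos in range(width):
        counts = Counter(
            f[pos].upper() for f in aligned_fragments if f[pos].upper() in "ACGT"
        )
        supported = sorted(b for b, n in counts.items() if n >= min_support)
        if len(supported) >= 2:
            snps.append((pos + 1, supported))
    return snps


fragments = ["ACGTAC", "ACGTAC", "ACCTAC", "ACCTA-", "-CGTAC"]
print(candidate_snps(fragments))  # [(3, ['C', 'G'])]
```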
  • DNA-based identification techniques are critical in forensic technology. DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, semen, etc., can be amplified using, e.g., PCR, to identify individuals. (See, e.g., Erlich, H. (1992) PCR Technology. Freeman and Co., New York, NY). Similarly, polynucleotides of the present invention can be used as polymorphic markers.
  • reagents capable of identifying the source of a particular tissue.
  • Appropriate reagents can comprise, for example, DNA probes or primers prepared from the sequences of the present invention that are specific for particular tissues. Panels of such reagents can identify tissue by species and/or by organ type. In a similar fashion, these reagents can be used to screen tissue cultures for contamination.
  • polynucleotides of the present invention can also be used as molecular weight markers on nucleic acid gels or Southern blots, as diagnostic probes for the presence of a specific mRNA in a particular cell type, in the creation of subtracted cDNA libraries which aid in the discovery of novel polynucleotides, in selection and synthesis of oligomers for attachment to an array or other support, and as an antigen to elicit an immune response.
  • the polynucleotides encoding SPTM or their mammalian homologs may be "knocked out" in an animal model system using homologous recombination in embryonic stem (ES) cells.
  • Such techniques are well known in the art and are useful for the generation of animal models of human disease. (See, e.g., U.S. Patent Number 5,175,383 and U.S. Patent Number 5,767,337.)
  • mouse ES cells such as the mouse 129/SvJ cell line, are derived from the early mouse embryo and grown in culture.
  • the ES cells are transformed with a vector containing the gene of interest disrupted by a marker gene, e.g., the neomycin phosphotransferase gene (neo; Capecchi, M.R. (1989) Science 244:1288-1292).
  • the vector integrates into the corresponding region of the host genome by homologous recombination.
  • homologous recombination takes place using the Cre-loxP system to knock out a gene of interest in a tissue- or developmental stage-specific manner (Marth, J.D. (1996) Clin. Invest. 97:1999-2002; Wagner, K.U. et al. (1997) Nucleic Acids Res. 25:4323-4330).
  • Transformed ES cells are identified and microinjected into mouse cell blastocysts such as those from the C57BL/6 mouse strain.
  • the blastocysts are surgically transferred to pseudopregnant dams, and the resulting chimeric progeny are genotyped and bred to produce heterozygous or homozygous strains.
  • Transgenic animals thus generated may be tested with potential therapeutic or toxic agents.
  • the polynucleotides encoding SPTM may also be manipulated in vitro in ES cells derived from human blastocysts. Human ES cells have the potential to differentiate into at least eight separate cell lineages including endoderm, mesoderm, and ectodermal cell types.
  • the polynucleotides encoding SPTM of the invention can also be used to create "knockin" humanized animals (pigs) or transgenic animals (mice or rats) to model human disease. With knockin technology, a region of sptm is injected into animal ES cells, and the injected sequence integrates into the animal cell genome. Transformed cells are injected into blastulae, and the blastulae are implanted as described above.
  • Transgenic progeny or inbred lines are studied and treated with potential pharmaceutical agents to obtain information on treatment of a human disease.
  • a mammal inbred to overexpress sptm resulting, e.g., in the secretion of SPTM in its milk, may also serve as a convenient source of that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev. 4:55-74).
  • Screening Assays
  • SPTM encoded by polynucleotides of the present invention may be used to screen for molecules that bind to or are bound by the encoded polypeptides.
  • the binding of the polypeptide and the molecule may activate (agonist), increase, inhibit (antagonist), or decrease activity of the polypeptide or the bound molecule.
  • Examples of such molecules include antibodies, oligonucleotides, proteins (e.g., receptors), or small molecules.
  • the molecule is closely related to the natural ligand of the polypeptide, e.g., a ligand or fragment thereof, a natural substrate, or a structural or functional mimetic.
  • the molecule can be closely related to the natural receptor to which the polypeptide binds, or to at least a fragment of the receptor, e.g., the active site. In either case, the molecule can be rationally designed using known techniques.
  • the screening for these molecules involves producing appropriate cells which express the polypeptide, either as a secreted protein or on the cell membrane.
  • Preferred cells include cells from mammals, yeast, Drosophila, or E. coli. Cells expressing the polypeptide or cell membrane fractions which contain the expressed polypeptide are then contacted with a test compound and binding, stimulation, or inhibition of activity of either the polypeptide or the molecule is analyzed.
  • An assay may simply test binding of a candidate compound to the polypeptide, wherein binding is detected by a fluorophore, radioisotope, enzyme conjugate, or other detectable label. Alternatively, the assay may assess binding in the presence of a labeled competitor. Additionally, the assay can be carried out using cell-free preparations, polypeptide/molecule affixed to a solid support, chemical libraries, or natural product mixtures. The assay may also simply comprise the steps of mixing a candidate compound with a solution containing a polypeptide, measuring polypeptide/molecule activity or binding, and comparing the polypeptide/molecule activity or binding to a standard.
  • an ELISA assay using, e.g., a monoclonal or polyclonal antibody can measure polypeptide level in a sample.
  • the antibody can measure polypeptide level by either binding, directly or indirectly, to the polypeptide or by competing with the polypeptide for a substrate.
  • All of the above assays can be used in a diagnostic or prognostic context.
  • the molecules discovered using these assays can be used to treat disease or to bring about a particular result in a patient (e.g., blood vessel growth) by activating or inhibiting the polypeptide/molecule.
  • the assays can discover agents which may inhibit or enhance the production of the polypeptide from suitably manipulated cells or tissues.
  • Transcript Imaging and Toxicological Testing
  • Another embodiment relates to the use of sptm to develop a transcript image of a tissue or cell type.
  • a transcript image represents the global pattern of gene expression by a particular tissue or cell type. Global gene expression patterns are analyzed by quantifying the number of expressed genes and their relative abundance under given conditions and at a given time. (See Seilhamer et al., "Comparative Gene Transcript Analysis," U.S. Patent Number 5,840,484, expressly incorporated by reference herein.)
  • a transcript image may be generated by hybridizing the polynucleotides of the present invention or their complements to the totality of transcripts or reverse transcripts of a particular tissue or cell type.
  • the hybridization takes place in high-throughput format, wherein the polynucleotides of the present invention or their complements comprise a subset of a plurality of elements on a microarray.
  • the resultant transcript image would provide a profile of gene activity pertaining to cell signaling.
  • Transcript images which profile sptm expression may be generated using transcripts isolated from tissues, cell lines, biopsies, or other biological samples.
  • the transcript image may thus reflect sptm expression in vivo, as in the case of a tissue or biopsy sample, or in vitro, as in the case of a cell line.
  • Transcript images which profile sptm expression may also be used in conjunction with in vitro model systems and preclinical evaluation of pharmaceuticals, as well as toxicological testing of industrial and naturally-occurring environmental compounds. All compounds induce characteristic gene expression patterns, frequently termed molecular fingerprints or toxicant signatures, which are indicative of mechanisms of action and toxicity (Nuwaysir, E. F. et al. (1999) Mol. Carcinog. 24:153-159; Steiner, S. and Anderson, N. L. (2000) Toxicol. Lett. 112-113:467-71, expressly incorporated by reference herein). If a test compound has a signature similar to that of a compound with known toxicity, it is likely to share those toxic properties.
  • fingerprints or signatures are most useful and refined when they contain expression information from a large number of genes and gene families. Ideally, a genome-wide measurement of expression provides the highest quality signature. Even genes whose expression is not altered by any tested compounds are important as well, as the levels of expression of these genes are used to normalize the rest of the expression data. The normalization procedure is useful for comparison of expression data after treatment with different compounds. While the assignment of gene function to elements of a toxicant signature aids in interpretation of toxicity mechanisms, knowledge of gene function is not necessary for the statistical matching of signatures which leads to prediction of toxicity.
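  • The statistical matching of signatures can be sketched in Python as follows. This is only one possible approach: it assumes each signature is a list of normalized expression changes over the same ordered set of genes, and it uses Pearson correlation as the similarity measure, which the source does not prescribe. The compound names and values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long expression vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("signatures must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        raise ValueError("a signature with no variation cannot be correlated")
    return sum(a * b for a, b in zip(dx, dy)) / denom

def closest_signature(test_signature, known_signatures):
    """Return (name, correlation) of the known signature most similar to the test one."""
    scored = [(name, pearson(test_signature, sig))
              for name, sig in known_signatures.items()]
    return max(scored, key=lambda item: item[1])

# Hypothetical normalized expression changes for four genes.
known = {
    "toxicant_A": [2.0, -1.0, 0.5, 3.0],
    "toxicant_B": [-2.0, 1.0, -0.5, -3.0],
}
test_compound = [1.9, -0.8, 0.4, 2.7]
best_name, best_r = closest_signature(test_compound, known)
print(best_name, round(best_r, 3))
```

  Note that no gene annotation is used anywhere in the sketch, which mirrors the point that knowledge of gene function is not required for matching.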
  • the toxicity of a test compound is assessed by treating a biological sample containing nucleic acids with the test compound.
  • Nucleic acids that are expressed in the treated biological sample are hybridized with one or more probes specific to the polynucleotides of the present invention, so that transcript levels corresponding to the polynucleotides of the present invention may be quantified.
  • the transcript levels in the treated biological sample are compared with levels in an untreated biological sample. Differences in the transcript levels between the two samples are indicative of a toxic response caused by the test compound in the treated sample.
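  • The comparison of treated and untreated transcript levels can be illustrated with a short Python sketch. The log2 fold-change measure, the pseudocount, the threshold of 1.0, and the transcript names are assumptions made for illustration; the source only states that differences between the two samples are indicative of a toxic response.

```python
import math

def differential_transcripts(treated, untreated, min_abs_log2_fc=1.0, pseudocount=1.0):
    """Flag transcripts whose level differs between treated and untreated samples.

    treated, untreated: dicts mapping transcript name to a quantified level.
    Returns a dict of transcript name to log2 fold change (treated / untreated)
    for transcripts whose absolute change meets the threshold.
    """
    flagged = {}
    for name in treated.keys() & untreated.keys():
        # The pseudocount avoids division by zero for undetected transcripts.
        ratio = (treated[name] + pseudocount) / (untreated[name] + pseudocount)
        log2_fc = math.log2(ratio)
        if abs(log2_fc) >= min_abs_log2_fc:
            flagged[name] = log2_fc
    return flagged

# Hypothetical hybridization signal intensities.
treated_levels = {"sptm_1": 399.0, "sptm_2": 101.0, "sptm_3": 24.0}
untreated_levels = {"sptm_1": 99.0, "sptm_2": 99.0, "sptm_3": 99.0}
flagged = differential_transcripts(treated_levels, untreated_levels)
print(sorted(flagged.items()))
```

  In practice the threshold would be chosen against replicate variability rather than fixed in advance.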
  • proteome refers to the global pattern of protein expression in a particular tissue or cell type.
  • proteome expression patterns, or profiles, are analyzed by quantifying the number of expressed proteins and their relative abundance under given conditions and at a given time.
  • a profile of a cell's proteome may thus be generated by separating and analyzing the polypeptides of a particular tissue or cell type.
  • the separation is achieved using two-dimensional gel electrophoresis, in which proteins from a sample are separated by isoelectric focusing in the first dimension, and then according to molecular weight by sodium dodecyl sulfate slab gel electrophoresis in the second dimension (Steiner and Anderson, supra).
  • the proteins are visualized in the gel as discrete and uniquely positioned spots, typically by staining the gel with an agent such as Coomassie Blue or silver or fluorescent stains.
  • the optical density of each protein spot is generally proportional to the level of the protein in the sample.
  • the optical densities of equivalently positioned protein spots from different samples are compared to identify any changes in protein spot density related to the treatment.
  • the proteins in the spots are partially sequenced using, for example, standard methods employing chemical or enzymatic cleavage followed by mass spectrometry.
  • the identity of the protein in a spot may be determined by comparing its partial sequence, preferably of at least 5 contiguous amino acid residues, to the polypeptide sequences of the present invention. In some cases, further sequence data may be obtained for definitive protein identification.
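  • Matching a partial sequence of at least 5 contiguous residues against reference polypeptides can be sketched in Python as below. The function name and the reference sequences are hypothetical placeholders, not sequences from the Sequence Listing, and exact substring matching is an assumed simplification.

```python
def identify_spot(partial_sequence, reference_polypeptides, min_residues=5):
    """Return names of reference polypeptides sharing a contiguous stretch
    of at least min_residues residues with the partial sequence."""
    # Every window of min_residues contiguous residues from the partial sequence.
    windows = {
        partial_sequence[i:i + min_residues]
        for i in range(len(partial_sequence) - min_residues + 1)
    }
    return sorted(
        name
        for name, sequence in reference_polypeptides.items()
        if any(window in sequence for window in windows)
    )

# Hypothetical reference sequences in one-letter amino acid code.
references = {
    "SPTM_A": "MKTLLVAGLLSAQWERTYPLK",
    "SPTM_B": "MSSPAGTKLLNDEFGHIKLMN",
}
print(identify_spot("QWERTYX", references))
print(identify_spot("GTKL", references))
```

  A partial sequence shorter than `min_residues` yields no windows and therefore no match, which is why further sequence data may be needed for a definitive identification.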
  • a proteomic profile may also be generated using antibodies specific for SPTM to quantify the levels of SPTM expression.
  • the antibodies are used as elements on a microarray, and protein expression levels are quantified by exposing the microarray to the sample and detecting the levels of protein bound to each array element (Lueking, A. et al. (1999) Anal. Biochem. 270:103-11; Mendoze, L. G. et al. (1999) Biotechniques 27:778-88). Detection may be performed by a variety of methods known in the art, for example, by reacting the proteins in the sample with a thiol- or amino-reactive fluorescent compound and detecting the amount of fluorescence bound at each array element.
  • Toxicant signatures at the proteome level are also useful for toxicological screening, and should be analyzed in parallel with toxicant signatures at the transcript level.
  • There is a poor correlation between transcript and protein abundances for some proteins in some tissues (Anderson, N. L. and Seilhamer, J. (1997) Electrophoresis 18:533-537), so proteome toxicant signatures may be useful in the analysis of compounds which do not significantly affect the transcript image, but which alter the proteomic profile.
  • the analysis of transcripts in body fluids is difficult, due to rapid degradation of mRNA, so proteomic profiling may be more reliable and informative in such cases.
  • the toxicity of a test compound is assessed by treating a biological sample containing proteins with the test compound. Proteins that are expressed in the treated biological sample are separated so that the amount of each protein can be quantified. The amount of each protein is compared to the amount of the corresponding protein in an untreated biological sample. A difference in the amount of protein between the two samples is indicative of a toxic response to the test compound in the treated sample. Individual proteins are identified by sequencing the amino acid residues of the individual proteins and comparing these partial sequences to the SPTM encoded by polynucleotides of the present invention.
  • the toxicity of a test compound is assessed by treating a biological sample containing proteins with the test compound. Proteins from the biological sample are incubated with antibodies specific to the SPTM encoded by polynucleotides of the present invention. The amount of protein recognized by the antibodies is quantified. The amount of protein in the treated biological sample is compared with the amount in an untreated biological sample. A difference in the amount of protein between the two samples is indicative of a toxic response to the test compound in the treated sample.
  • Transcript images may be used to profile sptm expression in distinct tissue types. This process can be used to determine cell signaling activity in a particular tissue type relative to this activity in a different tissue type. Transcript images may be used to generate a profile of sptm expression characteristic of diseased tissue. Transcript images of tissues before and after treatment may be used for diagnostic purposes, to monitor the progression of disease, and to monitor the efficacy of drug treatments for diseases which affect cell signaling activity.
  • Transcript images of cell lines can be used to assess cell signaling activity and/or to identify cell lines that lack or misregulate this activity. Such cell lines may then be treated with pharmaceutical agents, and a transcript image following treatment may indicate the efficacy of these agents in restoring desired levels of this activity.
  • a similar approach may be used to assess the toxicity of pharmaceutical agents as reflected by undesirable changes in cell signaling activity.
  • Candidate pharmaceutical agents may be evaluated by comparing their associated transcript images with those of pharmaceutical agents of known effectiveness.
  • the polynucleotides of the present invention are useful in antisense technology.
  • Antisense technology or therapy relies on the modulation of expression of a target protein through the specific binding of an antisense sequence to a target sequence encoding the target protein or directing its expression.
  • Agrawal, S., ed. (1996) Antisense Therapeutics, Humana Press Inc., Totowa NJ; Alama, A. et al. (1997) Pharmacol. Res. 36(3):171-178; Crooke, S.T. (1997) Adv. Pharmacol. 40:1-49; Sharma, H.W. and R. Narayanan (1995) Bioessays 17(12):1055-1063; and Lavrosky, Y.
  • An antisense sequence is a polynucleotide sequence capable of specifically hybridizing to at least a portion of the target sequence. Antisense sequences bind to cellular mRNA and/or genomic DNA, affecting translation and/or transcription. Antisense sequences can be DNA, RNA, or nucleic acid mimics and analogs. (See, e.g., Rossi, J.J. et al. (1991) Antisense Res. Dev. 1(3):285-288; Lee, R. et al. (1998) Biochemistry 37(3):900-1010; Pardridge, W.M. et al. (1995) Proc. Natl. Acad.
  • Antisense sequences can also bind to DNA duplexes through specific interactions in the major groove of the double helix.
  • the polynucleotides of the present invention and fragments thereof can be used as antisense sequences to modify the expression of the polypeptide encoded by sptm.
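  • Because an antisense sequence must hybridize to a portion of its target, the simplest design step is taking the reverse complement of a chosen target region. The Python sketch below illustrates only that step; the function name, the example target, and the choice of region are hypothetical, and real antisense design involves further considerations (accessibility, specificity, chemistry) not modeled here.

```python
# Complement table producing a DNA oligonucleotide; U in an RNA target pairs with A.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")

def antisense_oligo(target, start, length):
    """Return the DNA antisense sequence (5' to 3') for a region of the target.

    target: target sequence written 5' to 3' as DNA or RNA letters.
    start, length: zero-based start and length of the region to be bound.
    """
    region = target[start:start + length].upper()
    if len(region) != length:
        raise ValueError("region extends beyond the target sequence")
    if set(region) - set("ACGTU"):
        raise ValueError("target contains characters other than A, C, G, T, U")
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return region.translate(_COMPLEMENT)[::-1]

# Hypothetical target mRNA fragment.
example_target = "AUGGCCAAGUUC"
print(antisense_oligo(example_target, 0, 6))
```

  The same function accepts a cDNA-style target written with T in place of U and returns the same oligonucleotide.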
  • the antisense sequences can be produced ex vivo, such as by using any of the ABI nucleic acid synthesizer series (Applied Biosystems) or other automated systems known in the art. Antisense sequences can also be produced biologically, such as by transforming an appropriate host cell with an expression vector containing the sequence of interest. (See, e.g., Agrawal, supra.)
  • any gene delivery system suitable for introduction of the antisense sequences into appropriate target cells can be used.
  • Antisense sequences can be delivered intracellularly in the form of an expression plasmid which, upon transcription, produces a sequence complementary to at least a portion of the cellular sequence encoding the target protein.
  • Antisense sequences can also be introduced intracellularly through the use of viral vectors, such as retrovirus and adeno-associated virus vectors.
  • Other gene delivery mechanisms include liposome-derived systems, artificial viral envelopes, and other systems known in the art.
  • the nucleotide sequences encoding SPTM or fragments thereof may be inserted into an appropriate expression vector, i.e., a vector which contains the necessary elements for transcriptional and translational control of the inserted coding sequence in a suitable host.
  • a variety of expression vector/host systems may be utilized to contain and express sequences encoding SPTM.
  • microorganisms such as bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell systems infected with viral expression vectors (e.g., baculovirus); plant cell systems transformed with viral expression vectors (e.g., cauliflower mosaic virus, CaMV, or tobacco mosaic virus, TMV) or with bacterial expression vectors (e.g., Ti or pBR322 plasmids); or animal (mammalian) cell systems.
  • Expression vectors derived from retroviruses, adenoviruses, or herpes or vaccinia viruses, or from various bacterial plasmids may be used for delivery of nucleotide sequences to the targeted organ, tissue, or cell population.
  • the invention is not limited by the host cell employed.
  • sequences encoding SPTM can be transformed into cell lines using expression vectors which may contain viral origins of replication and/or endogenous expression elements and a selectable marker gene on the same or on a separate vector. Any number of selection systems may be used to recover transformed cell lines.
  • the polynucleotides encoding SPTM of the invention may be used for somatic or germline gene therapy.
  • Gene therapy may be performed to (i) correct a genetic deficiency (e.g., in the cases of severe combined immunodeficiency (SCID)-X1 disease characterized by X-linked inheritance (Cavazzana-Calvo, M. et al. (2000) Science 288:669-672), severe combined immunodeficiency syndrome associated with an inherited adenosine deaminase (ADA) deficiency (Blaese, R.M. et al. (1995) Science 270:475-480; Bordignon, C. et al.
  • hepatitis B or C virus (HBV, HCV)
  • fungal parasites such as Candida albicans and Paracoccidioides brasiliensis
  • protozoan parasites such as Plasmodium falciparum and Trypanosoma cruzi.
  • the expression of sptm from an appropriate population of transduced cells may alleviate the clinical manifestations caused by the genetic deficiency.
  • diseases or disorders caused by deficiencies in sptm are treated by constructing mammalian expression vectors comprising sptm and introducing these vectors by mechanical means into sptm-deficient cells.
  • Mechanical transfer technologies for use with cells in vivo or ex vitro include (i) direct DNA microinjection into individual cells, (ii) ballistic gold particle delivery, (iii) liposome-mediated transfection, (iv) receptor-mediated gene transfer, and (v) the use of DNA transposons (Morgan, R.A. and Anderson, W.F. (1993) Annu. Rev. Biochem. 62:191-217; Ivics, Z. (1997) Cell 91:501-510; Boulay, J-L. and Recipon, H. (1998) Curr. Opin. Biotechnol. 9:445-450).
  • Expression vectors that may be effective for the expression of sptm include, but are not limited to, the PCDNA 3.1, EPiTAG, PRCCMV2, PREP, PVAX vectors (Invitrogen, Carlsbad CA), PCMV-SCRIPT, PCMV-TAG, PEGSH/PERV (Stratagene, La Jolla CA), and PTET-OFF,
  • the sptm of the invention may be expressed using (i) a constitutively active promoter, (e.g., from cytomegalovirus (CMV), Rous sarcoma virus (RSV), SV40 virus, thymidine kinase (TK), or β-actin genes), (ii) an inducible promoter (e.g., the tetracycline-regulated promoter (Gossen, M. and Bujard, H. (1992) Proc. Natl. Acad. Sci. U.S.A.
  • liposome transformation kits (e.g., the PERFECT LIPID TRANSFECTION KIT, available from Invitrogen)
  • transformation is performed using the calcium phosphate method (Graham, F.L. and Eb, A.J. (1973) Virology 52:456-467), or by electroporation (Neumann, E. et al. (1982) EMBO J. 1:841-845).
  • the introduction of DNA to primary cells requires modification of these standardized mammalian transfection protocols.
  • diseases or disorders caused by genetic defects with respect to sptm expression are treated by constructing a retrovirus vector consisting of (i) sptm under the control of an independent promoter or the retrovirus long terminal repeat (LTR) promoter, (ii) appropriate RNA packaging signals, and (iii) a Rev-responsive element (RRE) along with additional retrovirus cis-acting RNA sequences and coding sequences required for efficient vector propagation.
  • Retrovirus vectors (e.g., PFB and PFBNEO) are commercially available (Stratagene) and are based on published data (Riviere, I. et al. (1995) Proc. Natl. Acad. Sci. U.S.A.
  • the vector is propagated in an appropriate vector producing cell line (VPCL) that expresses an envelope gene with a tropism for receptors on the target cells or a promiscuous envelope protein such as VSVg (Armentano, D. et al. (1987) J. Virol. 61:1647-1650; Bender, M.A. et al. (1987) J. Virol. 61:1639-1646; Adam, M.A. and Miller, A.D. (1988) J. Virol. 62:3802-3806; Dull, T. et al. (1998) J. Virol. 72:8463-8471; Zufferey, R.
  • U.S. Patent Number 5,910,434 to Rigg discloses a method for obtaining retrovirus packaging cell lines and is hereby incorporated by reference. Propagation of retrovirus vectors, transduction of a population of cells (e.g., CD4+ T-cells), and the return of transduced cells to a patient are procedures well known to persons skilled in the art of gene therapy and have been well documented (Ranga, U. et al. (1997) J. Virol.
  • an adenovirus-based gene therapy delivery system is used to deliver sptm to cells which have one or more genetic abnormalities with respect to the expression of sptm.
  • the construction and packaging of adenovirus-based vectors are well known to those with ordinary skill in the art.
  • Replication defective adenovirus vectors have proven to be versatile for importing genes encoding immunoregulatory proteins into intact islets in the pancreas (Csete, M.E. et al. (1995) Transplantation 27:263-268).
  • Potentially useful adenoviral vectors are described in U.S. Patent Number 5,707,618 to Armentano ("Adenovirus vectors for gene therapy"), hereby incorporated by reference.
  • For adenoviral vectors, see also Antinozzi, P.A. et al. (1999) Annu. Rev. Nutr. 19:511-544 and Verma, I.M. and Somia, N.
  • a herpes-based gene therapy delivery system is used to deliver sptm to target cells which have one or more genetic abnormalities with respect to the expression of sptm.
  • HSV: herpes simplex virus
  • U.S. Patent Number 5,804,413 teaches the use of recombinant HSV d92 which consists of a genome containing at least one exogenous gene to be transferred to a cell under the control of the appropriate promoter for purposes including human gene therapy. Also taught by this patent are the construction and use of recombinant HSV strains deleted for ICP4, ICP27 and ICP22. For HSV vectors, see also Goins, W. F. et al. (1999) J. Virol. 73:519-532 and Xu, H. et al. (1994) Dev. Biol. 163:152-161, hereby incorporated by reference.
  • an alphavirus (positive, single-stranded RNA virus) vector is used to deliver sptm to target cells.
  • SFV: Semliki Forest Virus
  • This subgenomic RNA replicates to higher levels than the full-length genomic RNA, resulting in the overproduction of capsid proteins relative to the viral proteins with enzymatic activity (e.g., protease and polymerase).
  • inserting sptm into the alphavirus genome in place of the capsid-coding region results in the production of a large number of sptm RNAs and the synthesis of high levels of SPTM in vector transduced cells.
  • alphavirus infection is typically associated with cell lysis within a few days
  • the ability to establish a persistent infection in hamster normal kidney cells (BHK-21) with a variant of Sindbis virus (SIN) indicates that the lytic replication of alphaviruses can be altered to suit the needs of the gene therapy application (Dryga, S.A. et al. (1997) Virology 228:74-83).
  • alphaviruses will allow the introduction of sptm into a variety of cell types.
  • the specific transduction of a subset of cells in a population may require the sorting of cells prior to transduction.
  • the methods of manipulating infectious cDNA clones of alphaviruses, performing alphavirus cDNA and RNA transfections, and performing alphavirus infections, are well known to those with ordinary skill in the art.
  • Anti-SPTM antibodies may be used to analyze protein expression levels.
  • Such antibodies include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, and Fab fragments.
  • For descriptions of and protocols of antibody technologies, see, e.g., Pound J.D. (1998) Immunochemical Protocols, Humana Press, Totowa, NJ.
  • amino acid sequence encoded by the sptm of the Sequence Listing may be analyzed by appropriate software (e.g., LASERGENE NAVIGATOR software, DNASTAR) to determine regions of high immunogenicity.
  • the optimal sequences for immunization are selected from the C-terminus, the N-terminus, and those intervening, hydrophilic regions of the polypeptide which are likely to be exposed to the external environment when the polypeptide is in its natural conformation. Analysis used to select appropriate epitopes is also described by Ausubel (1997, supra, Chapter 11.7). Peptides used for antibody induction do not need to have biological activity; however, they must be antigenic.
  • Peptides used to induce specific antibodies may have an amino acid sequence consisting of at least five amino acids, preferably at least 10 amino acids, and most preferably at least 15 amino acids.
  • a peptide which mimics an antigenic fragment of the natural polypeptide may be fused with another protein such as keyhole limpet hemocyanin (KLH; Sigma, St. Louis MO) for antibody production.
  • a peptide encompassing an antigenic region may be expressed from an sptm, synthesized as described above, or purified from human cells.
  • mice, goats, and rabbits may be immunized by injection with a peptide.
  • various adjuvants may be used to increase immunological response.
  • peptides about 15 residues in length may be synthesized using an ABI 431A peptide synthesizer (Applied Biosystems) using fmoc-chemistry and coupled to KLH (Sigma) by reaction with M-maleimidobenzoyl-N-hydroxysuccinimide ester (Ausubel, 1995, supra).
  • Rabbits are immunized with the peptide-KLH complex in complete Freund's adjuvant.
  • the resulting antisera are tested for antipeptide activity by binding the peptide to plastic, blocking with 1% bovine serum albumin (BSA), reacting with rabbit antisera, washing, and reacting with radioiodinated goat anti-rabbit IgG.
  • Antisera with antipeptide activity are tested for anti-SPTM activity using protocols well known in the art, including ELISA, radioimmunoassay (RIA), and immunoblotting.
  • isolated and purified peptide may be used to immunize mice (about 100 µg of peptide) or rabbits (about 1 mg of peptide). Subsequently, the peptide is radioiodinated and used to screen the immunized animals' B-lymphocytes for production of antipeptide antibodies. Positive cells are then used to produce hybridomas using standard techniques. About 20 mg of peptide is sufficient for labeling and screening several thousand clones. Hybridomas of interest are detected by screening with radioiodinated peptide to identify those fusions producing peptide-specific monoclonal antibody.
  • wells of a multi-well plate (FAST, Becton-Dickinson, Palo Alto, CA) are coated with affinity-purified, specific rabbit-anti-mouse (or suitable anti-species IgG) antibodies at 10 mg/ml.
  • the coated wells are blocked with 1% BSA and washed and exposed to supernatants from hybridomas. After incubation, the wells are exposed to radiolabeled peptide at 1 mg/ml.
  • Clones producing antibodies bind a quantity of labeled peptide that is detectable above background. Such clones are expanded and subjected to 2 cycles of cloning. Cloned hybridomas are injected into pristane-treated mice to produce ascites, and monoclonal antibody is purified from the ascitic fluid by affinity chromatography on protein A (Amersham Pharmacia Biotech). Several procedures for the production of monoclonal antibodies, including in vitro production, are described in Pound (supra). Monoclonal antibodies with antipeptide activity are tested for anti-SPTM activity using protocols well known in the art, including ELISA, RIA, and immunoblotting.
  • Antibody fragments containing specific binding sites for an epitope may also be generated.
  • such fragments include, but are not limited to, the F(ab')2 fragments produced by pepsin digestion of the antibody molecule, and the Fab fragments generated by reducing the disulfide bridges of the F(ab')2 fragments.
  • construction of Fab expression libraries in filamentous bacteriophage allows rapid and easy identification of monoclonal fragments with desired specificity (Pound, supra, Chaps. 45-47).
  • Antibodies generated against polypeptide encoded by sptm can be used to purify and characterize full-length SPTM protein and its activity, binding partners, etc.
  • Anti-SPTM antibodies may be used in assays to quantify the amount of SPTM found in a particular human cell. Such assays include methods utilizing the antibody and a label to detect expression level under normal or disease conditions.
  • the peptides and antibodies of the invention may be used with or without modification or labeled by joining them, either covalently or noncovalently, with a reporter molecule.
  • Protocols for detecting and measuring protein expression using either polyclonal or monoclonal antibodies are well known in the art. Examples include ELISA, RIA, and fluorescent activated cell sorting (FACS). Such immunoassays typicaUy involve the formation of complexes between the SPTM and its specific antibody and the measurement of such complexes. These and other assays are described in Pound (supra).
  • RNA was purchased from CLONTECH Laboratories, Inc. (Palo Alto CA) or isolated from various tissues. Some tissues were homogenized and lysed in guanidinium isothiocyanate, while others were homogenized and lysed in phenol or in a suitable mixture of denaturants, such as TRIZOL (Life Technologies), a monophasic solution of phenol and guanidine isothiocyanate. The resulting lysates were centrifuged over CsCl cushions or extracted with chloroform. RNA was precipitated with either isopropanol or sodium acetate and ethanol, or by other routine methods.
  • Stratagene was provided with RNA and constructed the corresponding cDNA libraries. Otherwise, cDNA was synthesized and cDNA libraries were constructed with the UNIZAP vector system (Stratagene Cloning Systems, Inc. (Stratagene), La Jolla CA) or SUPERSCRIPT plasmid system (Life Technologies), using the recommended procedures or similar methods known in the art. (See, e.g., Ausubel, 1997, supra, Chapters 5.1 through 6.6.) Reverse transcription was initiated using oligo d(T) or random primers. Synthetic oligonucleotide adapters were ligated to double stranded cDNA, and the cDNA was digested with the appropriate restriction enzyme or enzymes. For most libraries, the cDNA was size-selected (300-1000 bp) using SEPHACRYL S1000, SEPHAROSE CL2B, or SEPHAROSE CL4B column chromatography (Amersham Pharmacia Biotech) or preparative agarose gel electrophoresis.
  • cDNAs were ligated into compatible restriction enzyme sites of the polylinker of a suitable plasmid, e.g., PBLUESCRIPT plasmid (Stratagene), PSPORT1 plasmid (Life Technologies), PCDNA2.1 plasmid (Invitrogen, Carlsbad CA), PBK-CMV plasmid (Stratagene), or pINCY (Incyte Genomics, Palo Alto CA), or derivatives thereof.
  • Recombinant plasmids were transformed into competent E. coli cells including XL1-Blue, XL1-BlueMRF, or SOLR from Stratagene or DH5α, DH10B, or ElectroMAX DH10B from Life Technologies.
  • II. Isolation of cDNA Clones
  • Plasmids were recovered from host cells by in vivo excision using the UNIZAP vector system (Stratagene) or by cell lysis. Plasmids were purified using at least one of the following: the Magic or WIZARD Minipreps DNA purification system (Promega); the AGTC Miniprep purification kit (Edge BioSystems, Gaithersburg MD); and the QIAWELL 8, QIAWELL 8 Plus, and QIAWELL 8 Ultra plasmid purification systems or the R.E.A.L. PREP 96 plasmid purification kit (QIAGEN). Following precipitation, plasmids were resuspended in 0.1 ml of distilled water and stored, with or without lyophilization, at 4°C.
  • Plasmid DNA was amplified from host cell lysates using direct link PCR in a high-throughput format.
  • Host cell lysis and thermal cycling steps were carried out in a single reaction mixture. Samples were processed and stored in 384-well plates, and the concentration of amplified plasmid DNA was quantified fluorometrically using PICOGREEN dye (Molecular Probes, Inc. (Molecular Probes), Eugene OR) and a FLUOROSKAN II fluorescence scanner (Labsystems Oy, Helsinki, Finland).
  • cDNA sequencing reactions were processed using standard methods or high-throughput instrumentation such as the ABI CATALYST 800 thermal cycler (Applied Biosystems) or the PTC-200 thermal cycler (MJ Research) in conjunction with the HYDRA microdispenser (Robbins Scientific Corp., Sunnyvale CA) or the MICROLAB 2200 liquid transfer system (Hamilton).
  • cDNA sequencing reactions were prepared using reagents provided by Amersham Pharmacia Biotech or supplied in ABI sequencing kits such as the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems).
  • Electrophoretic separation of cDNA sequencing reactions and detection of labeled polynucleotides were carried out using the MEGABACE 1000 DNA sequencing system (Molecular Dynamics); the ABI PRISM 373 or 377 sequencing system (Applied Biosystems) in conjunction with standard ABI protocols and base calling software; or other sequence analysis systems known in the art. Reading frames within the cDNA sequences were identified using standard methods (reviewed in Ausubel, 1997, supra, Chapter 7.7). Some of the cDNA sequences were selected for extension using the techniques disclosed in Example VDX
  • Component sequences from chromatograms were subject to PHRED analysis and assigned a quality score.
  • The sequences having at least a required quality score were subject to various pre-processing editing pathways to eliminate, e.g., low quality 3' ends, vector and linker sequences, polyA tails, Alu repeats, mitochondrial and ribosomal sequences, bacterial contamination sequences, and sequences smaller than 50 base pairs.
  • low-information sequences and repetitive elements e.g., dinucleotide repeats, Alu repeats, etc.
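Two of the pre-processing steps named above (removal of polyA tails and rejection of sequences smaller than 50 base pairs) can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the function name `preprocess` and the rule that a polyA tail is a run of ten or more A residues at the 3' end are assumptions made for the example. Vector, linker, repeat, and contamination screening are not modeled.

```python
import re

MIN_LENGTH = 50  # from the text: sequences smaller than 50 base pairs are eliminated

# Assumption for illustration only: a polyA tail is a 3'-terminal run of >= 10 A's.
POLYA_TAIL = re.compile(r"A{10,}$")


def preprocess(seq):
    """Trim a 3' polyA tail, then return the sequence or None if it is too short."""
    trimmed = POLYA_TAIL.sub("", seq.upper())
    return trimmed if len(trimmed) >= MIN_LENGTH else None


if __name__ == "__main__":
    read = "ACGT" * 15 + "A" * 20
    print(preprocess(read))  # the 60-base body is kept, the tail is removed
```

A read whose body falls below 50 bases after trimming is dropped rather than passed on to assembly.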
  • Sequences were then subject to assembly procedures in which the sequences were assigned to gene bins (bins). Each sequence could only belong to one bin. Sequences in each gene bin were assembled to produce consensus sequences (templates). Subsequent new sequences were added to existing bins using BLASTn (v.1.4 WashU) and CROSSMATCH. Candidate pairs were identified as all BLAST hits having a quality score greater than or equal to 150. Alignments of at least 82% local identity were accepted into the bin. The component sequences from each bin were assembled using a version of PHRAP. Bins with several overlapping component sequences were assembled using DEEP PHRAP.
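The acceptance criteria for adding a new sequence to an existing bin (BLAST quality score of at least 150, local identity of at least 82%, and membership in only one bin) might be expressed as in the sketch below. The `Hit` record and the `choose_bin` function are hypothetical names. The text does not say how a sequence with several qualifying hits is resolved, so taking the highest-scoring qualifying hit is an assumption.

```python
from dataclasses import dataclass

MIN_BLAST_SCORE = 150       # from the text: candidate pairs need a score >= 150
MIN_LOCAL_IDENTITY = 82.0   # from the text: alignments need >= 82% local identity


@dataclass
class Hit:
    """Hypothetical record for one BLAST hit of a new sequence against a bin."""
    bin_id: str
    blast_score: float
    local_identity: float  # percent


def choose_bin(hits):
    """Return the single bin a sequence joins, or None if no hit qualifies.

    Assumption: among qualifying hits, the highest BLAST score wins, so that
    each sequence ends up in exactly one bin.
    """
    accepted = [
        h for h in hits
        if h.blast_score >= MIN_BLAST_SCORE and h.local_identity >= MIN_LOCAL_IDENTITY
    ]
    if not accepted:
        return None
    return max(accepted, key=lambda h: h.blast_score).bin_id


if __name__ == "__main__":
    hits = [Hit("bin_A", 200, 90.0), Hit("bin_B", 300, 80.0), Hit("bin_C", 160, 85.0)]
    print(choose_bin(hits))  # bin_B fails the identity cutoff, so bin_A is chosen
```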
  • each assembled template was determined based on the number and orientation of its component sequences. Template sequences as disclosed in the sequence listing correspond to sense strand sequences (the "forward" reading frames), to the best determination. The complementary (antisense) strands are inherently disclosed herein.
  • The component sequences which were used to assemble each template consensus sequence are listed in Table 3 along with their positions along the template nucleotide sequences.
  • Bins were compared against each other and those having local similarity of at least 82% were combined and reassembled. Reassembled bins having templates of insufficient overlap (less than 95% local identity) were re-split. Assembled templates were also subject to analysis by STITCHER/EXON MAPPER algorithms which analyze the probabilities of the presence of splice variants, alternatively spliced exons, splice junctions, differential expression of alternative spliced genes across tissue types or disease states, etc. These resulting bins were subject to several rounds of the above assembly procedures.
  • Bins were clone joined based upon clone information. If the 5' sequence of one clone was present in one bin and the 3' sequence from the same clone was present in a different bin, it was likely that the two bins actually belonged together in a single bin. The resulting combined bins underwent assembly procedures to regenerate the consensus sequences.
  • The template sequences were further analyzed by translating each template in all three forward reading frames and searching each translation against the Pfam database of hidden Markov model-based protein families and domains using the HMMER software package (available to the public from Washington University School of Medicine, St. Louis MO). (See also World Wide Web site http://pfam.wustl.edu/ for detailed descriptions of Pfam protein domains and families.) Additionally, the template sequences were translated in all three forward reading frames, and each translation was searched against hidden Markov models for signal peptides using the HMMER software package.
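Translation of a template in all three forward reading frames, the step that precedes the Pfam and signal peptide searches, can be illustrated with the standard genetic code. The sketch below is self-contained and does not call HMMER; the function names are chosen for this example, and writing unknown codons as `X` is an assumption.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq, frame):
    """Translate one forward frame (0, 1, or 2); '*' marks a stop codon."""
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    # Assumption: codons containing ambiguous bases are written as 'X'.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame_translations(seq):
    """Return the translations of the three forward reading frames."""
    return [translate(seq, frame) for frame in range(3)]


if __name__ == "__main__":
    print(three_frame_translations("ATGGCCTAA"))
```

Each of the three resulting protein strings would then be searched separately against the profile models.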
  • Template sequences are further analyzed using the bioinformatics tools listed in Table 6, or using sequence analysis software known in the art such as MACDNASIS PRO software (Hitachi Software Engineering, South San Francisco CA) and LASERGENE software (DNASTAR). Template sequences may be further queried against public databases such as the GenBank rodent, mammalian, vertebrate, prokaryote, and eukaryote databases.
  • polypeptide sequences were translated to derive the corresponding longest open reading frame as presented by the polypeptide sequences as reported in Table 1.
  • a polypeptide of the invention may begin at any of the methionine residues within the full length translated polypeptide.
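The two statements above (taking the longest open reading frame of a translation, and allowing a polypeptide to begin at any methionine within it) can be sketched on a translated protein string in which `*` marks a stop codon. The function names are hypothetical, and treating the longest stop-free stretch of a single frame as the longest open reading frame is a simplifying assumption.

```python
def longest_open_segment(protein):
    """Return the longest stop-free stretch of a translation ('*' = stop)."""
    return max(protein.split("*"), key=len)


def methionine_starts(segment):
    """List every candidate polypeptide that begins at a methionine (M)."""
    return [segment[i:] for i, residue in enumerate(segment) if residue == "M"]


if __name__ == "__main__":
    translation = "AK*GMTLMQ*P"
    orf = longest_open_segment(translation)
    print(orf)
    print(methionine_starts(orf))
```

Every string returned by `methionine_starts` shares the same C-terminus and differs only in its starting methionine.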
  • Polypeptide sequences were subsequently analyzed by querying against the GenBank protein database (GENPEPT, (GenBank version 124)). Full length polynucleotide sequences are also analyzed using MACDNASIS PRO software (Hitachi Software Engineering, South San Francisco CA) and LASERGENE software (DNASTAR).
  • Polynucleotide and polypeptide sequence alignments are generated using default parameters specified by the CLUSTAL algorithm as incorporated into the MEGALIGN multisequence alignment program (DNASTAR), which also calculates the percent identity between aligned sequences.
  • Table 5 shows sequences with homology to the polypeptides of the invention as identified by BLAST analysis against the GenBank protein (GENPEPT) database.
  • Column 1 shows the polypeptide sequence identification number (SEQ ID NO:) for the polypeptide segments of the invention.
  • Column 2 shows the reading frame used in the translation of the polynucleotide sequences encoding the polypeptide segments.
  • Column 3 shows the length of the translated polypeptide segments.
  • Columns 4 and 5 show the start and stop nucleotide positions of the polynucleotide sequences encoding the polypeptide segments.
  • Column 6 shows the GenBank identification number (GI Number) of the nearest GenBank homolog.
  • Column 7 shows the probability score for the match between each polypeptide and its GenBank homolog.
  • Column 8 shows the annotation of the GenBank homolog.
  • Northern analysis is a laboratory technique used to detect the presence of a transcript of a gene and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs from a particular cell type or tissue have been bound.
  • BLAST a sequence of nucleotide sequences
  • This analysis is much faster than multiple membrane-based hybridizations.
  • the sensitivity of the computer search can be modified to determine whether any particular match is categorized as exact or similar.
  • the basis of the search is the product score, which is defined as:
  • the product score takes into account both the degree of similarity between two sequences and the length of the sequence match.
  • The product score is a normalized value between 0 and 100, and is calculated as follows: the BLAST score is multiplied by the percent nucleotide identity and the product is divided by (5 times the length of the shorter of the two sequences).
  • The BLAST score is calculated by assigning a score of +5 for every base that matches in a high-scoring segment pair (HSP), and -4 for every mismatch. Two sequences may share more than one HSP (separated by gaps). If there is more than one HSP, then the pair with the highest BLAST score is used to calculate the product score.
  • The product score represents a balance between fractional overlap and quality in a BLAST alignment. For example, a product score of 100 is produced only for 100% identity over the
  • a product score of 70 is produced either by 100% identity and 70% overlap at one end, or by 88% identity and 100% overlap at the other.
  • a product score of 50 is produced either by 100% identity and 50% overlap at one end, or 79% identity and 100% overlap.
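The product score definition above can be checked numerically. The sketch below applies the stated scoring (+5 per matching base, -4 per mismatch, product divided by 5 times the length of the shorter sequence) to a single high-scoring segment pair. The function names are chosen for this example, and gaps are not modeled.

```python
def blast_score(matches, mismatches):
    """Score one HSP: +5 for every matching base, -4 for every mismatch."""
    return 5 * matches - 4 * mismatches


def product_score(matches, mismatches, shorter_len):
    """BLAST score x percent identity / (5 x length of the shorter sequence)."""
    aligned = matches + mismatches
    if aligned == 0 or shorter_len <= 0:
        return 0.0
    percent_identity = 100.0 * matches / aligned
    return blast_score(matches, mismatches) * percent_identity / (5 * shorter_len)


if __name__ == "__main__":
    # The shorter sequence is 100 bases long in every case.
    print(product_score(100, 0, 100))   # 100% identity, full overlap
    print(product_score(70, 0, 100))    # 100% identity, 70% overlap
    print(product_score(88, 12, 100))   # 88% identity, full overlap
    print(product_score(79, 21, 100))   # 79% identity, full overlap
```

Under these assumptions the four cases evaluate to 100, 70, about 69, and about 49, which is consistent with the rounded examples of 100, 70, and 50 given in the text.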
  • polynucleotide sequences encoding SPTM are analyzed with respect to the
  • Each cDNA sequence is derived from a cDNA library constructed from a human tissue.
  • Each human tissue is classified into one of the following organ tissue categories: cardiovascular system; connective tissue; digestive system; embryonic structures; endocrine system; exocrine glands; genitalia, female; genitalia, male;
  • each human tissue is classified into one of the following disease/condition
  • A tissue distribution profile is determined for each template by compiling the cDNA library tissue classifications of its component cDNA sequences.
  • Each component sequence is derived from a cDNA library constructed from a human tissue.
  • Each human tissue is classified into one of the following categories: cardiovascular system; connective tissue; digestive system; embryonic structures; endocrine system; exocrine glands; genitalia, female; genitalia, male; germ cells; hemic and immune system; liver; musculoskeletal system; nervous system; pancreas; respiratory system; sense organs; skin; stomatognathic system; unclassified/mixed; or urinary tract.
  • Template sequences, component sequences, and cDNA library/tissue information are found in the LIFESEQ GOLD database (Incyte Genomics, Palo Alto CA). Table 4 shows the tissue distribution profile for the templates of the invention. For each template, the three most frequently observed tissue categories are shown in column 2, along with the percentage of component sequences belonging to each category. Only tissue categories with percentage values of >10% are shown. A tissue distribution of "widely distributed" in column 2 indicates percentage values of ≤10% in all tissue categories.
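The rule for reporting a tissue distribution profile (the three most frequent categories, shown only when above 10%, otherwise "widely distributed") can be sketched as below. The function names and the output formatting are assumptions; the input is simply the list of tissue categories of a template's component sequences.

```python
from collections import Counter


def tissue_profile(categories, top_n=3, threshold=10.0):
    """Return up to top_n (category, percent) pairs with percent > threshold."""
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return []
    ranked = [(tissue, 100.0 * n / total) for tissue, n in counts.most_common()]
    return [(tissue, pct) for tissue, pct in ranked if pct > threshold][:top_n]


def describe(profile):
    """Format a profile; an empty profile is reported as 'widely distributed'."""
    if not profile:
        return "widely distributed"
    return "; ".join(f"{tissue} ({pct:.0f}%)" for tissue, pct in profile)


if __name__ == "__main__":
    components = ["nervous system"] * 5 + ["liver"] * 3 + ["skin"] + ["pancreas"]
    print(describe(tissue_profile(components)))
```

In this example skin and pancreas each account for exactly 10% of the component sequences and are therefore not shown, since only values above 10% are reported.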
  • Transcript images are generated as described in Seilhamer et al., "Comparative Gene Transcript Analysis," U.S. Patent Number 5,840,484, incorporated herein by reference.
  • Oligonucleotide primers designed using an sptm of the Sequence Listing are used to extend the nucleic acid sequence.
  • One primer is synthesized to initiate 5' extension of the template, and the other primer, to initiate 3' extension of the template.
  • The initial primers may be designed using OLIGO 4.06 software (National Biosciences, Inc. (National Biosciences), Plymouth MN), or another appropriate program, to be about 22 to 30 nucleotides in length, to have a GC content of about 50% or more, and to anneal to the target sequence at temperatures of about 68°C to about 72°C. Any stretch of nucleotides which would result in hairpin structures and primer-primer dimerizations is avoided.
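Two of the primer criteria above, length (about 22 to 30 nucleotides) and GC content (about 50% or more), are simple to screen programmatically. The sketch below is not the OLIGO 4.06 algorithm; the function names are hypothetical, and annealing temperature, hairpin formation, and primer-primer dimerization are deliberately left out because the text does not specify how they are computed.

```python
def gc_content(primer):
    """Return the percentage of G and C bases in a primer."""
    if not primer:
        return 0.0
    upper = primer.upper()
    return 100.0 * (upper.count("G") + upper.count("C")) / len(upper)


def passes_basic_primer_checks(primer, min_len=22, max_len=30, min_gc=50.0):
    """Check only the length and GC criteria named in the text."""
    return min_len <= len(primer) <= max_len and gc_content(primer) >= min_gc


if __name__ == "__main__":
    candidate = "GCGCATATGCGCATATGCGCATAT"  # 24 nt, 50% GC
    print(len(candidate), gc_content(candidate), passes_basic_primer_checks(candidate))
```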
  • Selected human cDNA libraries are used to extend the sequence. If more than one extension is necessary or desired, additional or nested sets of primers are designed.
  • The concentration of DNA in each well is determined by dispensing 100 µl PICOGREEN quantitation reagent (0.25% (v/v); Molecular Probes) dissolved in 1X Tris-EDTA (TE) and 0.5 µl of undiluted PCR product into each well of an opaque fluorimeter plate (Corning Incorporated (Corning), Corning NY), allowing the DNA to bind to the reagent.
  • The plate is scanned in a FLUOROSKAN II (Labsystems Oy) to measure the fluorescence of the sample and to quantify the concentration of DNA.
  • A 5 µl to 10 µl aliquot of the reaction mixture is analyzed by electrophoresis on a 1% agarose mini-gel to determine which reactions are successful in extending the sequence.
  • The extended nucleotides are desalted and concentrated, transferred to 384-well plates, digested with CviJI cholera virus endonuclease (Molecular Biology Research, Madison WI), and sonicated or sheared prior to religation into pUC 18 vector (Amersham Pharmacia Biotech).
  • the digested nucleotides are separated on low concentration (0.6 to 0.8%) agarose gels, fragments are excised, and agar digested with AGAR ACE (Promega).
  • Extended clones are religated using T4 ligase (New England Biolabs, Inc., Beverly MA) into pUC 18 vector (Amersham Pharmacia Biotech), treated with Pfu DNA polymerase (Stratagene) to fill-in restriction site overhangs, and transfected into competent E. coli cells. Transformed cells are selected on antibiotic-containing media, individual colonies are picked and cultured overnight at 37°C in 384-well plates in LB/2x carbenicillin liquid media.
  • The cells are lysed, and DNA is amplified by PCR using Taq DNA polymerase (Amersham Pharmacia Biotech) and Pfu DNA polymerase (Stratagene) with the following parameters: Step 1:
  • Step 7: storage at 4°C. DNA is quantified by PICOGREEN reagent (Molecular Probes) as described above. Samples with low DNA recoveries are reamplified using the same conditions as described above. Samples are diluted with 20% dimethylsulfoxide (1:2, v/v), and sequenced using DYENAMIC energy transfer sequencing primers and the DYENAMIC DIRECT kit (Amersham Pharmacia Biotech) or the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems). In like manner, the sptm is used to obtain regulatory sequences (promoters, introns, and enhancers) using the procedure above, oligonucleotides designed for such extension, and an appropriate genomic library.
  • Hybridization probes derived from the sptm of the Sequence Listing are employed for screening cDNAs, mRNAs, or genomic DNA.
  • The labeling of probe nucleotides between 100 and 1000 nucleotides in length is specifically described, but essentially the same procedure may be used with larger cDNA fragments.
  • Probe sequences are labeled at room temperature for 30 minutes using a T4 polynucleotide kinase, γ-32P-ATP, and 0.5X One-Phor-All Plus (Amersham Pharmacia Biotech) buffer and purified using a ProbeQuant G-50 Microcolumn (Amersham Pharmacia Biotech).
  • The probe mixture is diluted to 10^7 dpm/µg/ml hybridization buffer and used in a typical membrane-based hybridization analysis.
  • the DNA is digested with a restriction endonuclease such as Eco RV and is electrophoresed through a 0.7% agarose gel.
  • The DNA fragments are transferred from the agarose to nylon membrane (NYTRAN Plus, Schleicher & Schuell, Inc., Keene NH) using procedures specified by the manufacturer of the membrane.
  • Prehybridization is carried out for three or more hours at 68°C, and hybridization is carried out overnight at 68°C.
  • Blots are sequentially washed at room temperature under increasingly stringent conditions, up to 0.1x saline sodium citrate (SSC) and 0.5% sodium dodecyl sulfate. After the blots are placed in a PHOSPHORIMAGER cassette (Molecular Dynamics) or are exposed to autoradiography film, hybridization patterns of standard and experimental lanes are compared. Essentially the same procedure is employed when screening RNA.
  • Inclusion of a mapped sequence in a cluster will result in the assignment of all sequences of that cluster, including its particular SEQ ID NO:, to that map location.
  • the genetic map locations of SEQ BD NOX-184 are described as ranges, or intervals, of human chromosomes.
  • the map position of an interval, in centiMorgans, is measured relative to the terminus of the chromosome's p-arm.
  • centiMorgan (cM) is a unit of measurement based on recombination frequencies between chromosomal markers.
  • 1 cM is roughly equivalent to 1 megabase (Mb) of DNA in humans, although this can vary widely due to hot and cold spots of recombination.
  • the cM distances are based on genetic markers mapped by Genethon which provide boundaries for radiation hybrid markers whose sequences were included in each of the clusters.
  • RNA is isolated from tissue samples using the guanidinium thiocyanate method and polyA+ RNA is purified using the oligo (dT) cellulose method.
  • Each polyA+ RNA sample is reverse transcribed using MMLV reverse-transcriptase, 0.05 pg/µl oligo-dT primer (21mer), 1X first strand buffer, 0.03 units/µl RNase inhibitor, 500 µM dATP, 500 µM dGTP, 500 µM dTTP, 40 µM dCTP, 40 µM dCTP-Cy3 (BDS) or dCTP-Cy5 (Amersham Pharmacia Biotech).
  • The reverse transcription reaction is performed in a 25 ml volume containing 200 ng polyA+ RNA with GEMBRIGHT kits (Incyte).
  • Specific control polyA+ RNAs are synthesized by in vitro transcription from non-coding yeast genomic DNA (W. Lei, unpublished).
  • The control mRNAs at 0.002 ng, 0.02 ng, 0.2 ng, and 2 ng are diluted into reverse transcription reaction at ratios of 1:100,000, 1:10,000, 1:1000, 1:100 (w/w) to sample mRNA respectively.
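The stated dilution ratios follow directly from the control masses and the 200 ng of polyA+ RNA in each reverse transcription reaction. The short check below reproduces the arithmetic; the constant and function names are chosen for this example.

```python
SAMPLE_NG = 200.0                       # polyA+ RNA per reaction, from the text
CONTROL_NG = [0.002, 0.02, 0.2, 2.0]    # control mRNA amounts, from the text


def ratio_denominator(control_ng, sample_ng=SAMPLE_NG):
    """Return N such that control:sample is 1:N by weight."""
    return round(sample_ng / control_ng)


if __name__ == "__main__":
    for control in CONTROL_NG:
        print(f"{control} ng control -> 1:{ratio_denominator(control):,}")
```

The four controls therefore span three orders of magnitude of relative abundance within a single reaction.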
  • control mRNAs are diluted into reverse transcription reaction at ratios of 1:3, 3:1, 1:10, 10:1, 1:25, 25:1 (w/w) to sample mRNA differential expression patterns. After incubation at 37°C for 2 hr, each reaction sample (one with Cy3 and another with Cy5 labeling) is treated with 2.5 ml of 0.5M sodium hydroxide and incubated for 20 minutes at 85°C to stop the reaction and degrade the RNA. Probes are purified using two successive CHROMA SPIN 30 gel filtration spin columns (CLONTECH Laboratories, Inc.
  • reaction samples are ethanol precipitated using 1 ml of glycogen (1 mg/ml), 60 ml sodium acetate, and 300 ml of 100% ethanol.
  • The probe is then dried to completion using a SpeedVAC (Savant Instruments Inc., Holbrook NY) and resuspended in 14 µl 5X SSC/0.2% SDS.
  • Sequences of the present invention are used to generate array elements.
  • Each array element is amplified from bacterial cells containing vectors with cloned cDNA inserts.
  • PCR amplification uses primers complementary to the vector sequences flanking the cDNA insert.
  • Array elements are amplified in thirty cycles of PCR from an initial quantity of 1-2 ng to a final quantity greater than 5 µg.
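As a rough worked example, the quantities above imply an overall amplification of roughly 2,500-fold to 5,000-fold, far below the 2^30 expected from perfect doubling in every cycle. The sketch below computes the fold amplification and the mean per-cycle factor it implies. The function names are hypothetical, and treating the yield as a constant per-cycle factor is a simplifying assumption, since real reactions plateau in late cycles.

```python
def fold_amplification(start_ng, final_ng):
    """Overall fold increase in DNA mass."""
    return final_ng / start_ng


def mean_per_cycle_factor(start_ng, final_ng, cycles=30):
    """Geometric-mean amplification per cycle implied by the overall yield."""
    return fold_amplification(start_ng, final_ng) ** (1.0 / cycles)


if __name__ == "__main__":
    final_ng = 5000.0  # 5 micrograms expressed in nanograms
    for start_ng in (1.0, 2.0):
        print(start_ng, fold_amplification(start_ng, final_ng),
              round(mean_per_cycle_factor(start_ng, final_ng), 3))
```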
  • Amplified array elements are then purified using SEPHACRYL-400 (Amersham Pharmacia Biotech). Purified array elements are immobilized on polymer-coated glass slides.
  • Glass microscope slides (Corning) are cleaned by ultrasound in 0.1% SDS and acetone, with extensive distilled water washes between and after treatments.
  • Glass slides are etched in 4% hydrofluoric acid (VWR Scientific Products Corporation (VWR), West Chester, PA), washed extensively in distilled water, and coated with 0.05% aminopropyl silane (Sigma) in 95% ethanol.
  • Coated slides are cured in a 110°C oven. Array elements are applied to the coated glass substrate using a procedure described in US Patent No. 5,807,522, incorporated herein by reference.
  • 1 µl of the array element DNA is loaded into the open capillary printing element by a high-speed robotic apparatus.
  • The apparatus then deposits about 5 nl of array element sample per slide.
  • Microarrays are UV-crosslinked using a STRATALINKER UV-crosslinker (Stratagene). Microarrays are washed at room temperature once in 0.2% SDS and three times in distilled water. Non-specific binding sites are blocked by incubation of microarrays in 0.2% casein in phosphate buffered saline (PBS) (Tropix, Inc., Bedford, MA) for 30 minutes at 60°C followed by washes in 0.2% SDS and distilled water as before.
  • Hybridization reactions contain 9 µl of probe mixture consisting of 0.2 µg each of Cy3 and Cy5 labeled cDNA synthesis products in 5X SSC, 0.2% SDS hybridization buffer.
  • The probe mixture is heated to 65°C for 5 minutes and is aliquoted onto the microarray surface and covered with a 1.8 cm² coverslip.
  • The arrays are transferred to a waterproof chamber having a cavity just slightly larger than a microscope slide.
  • The chamber is kept at 100% humidity internally by the addition of 140 µl of 5x SSC in a corner of the chamber.
  • The chamber containing the arrays is incubated for about 6.5 hours at 60°C.
  • The arrays are washed for 10 min at 45°C in a first wash buffer (1X SSC, 0.1% SDS), three times for 10 minutes each at 45°C in a second wash buffer (0.1X SSC), and dried. Detect
  • Reporter-labeled hybridization complexes are detected with a microscope equipped with an Innova 70 mixed gas 10 W laser (Coherent, Inc., Santa Clara CA) capable of generating spectral lines at 488 nm for excitation of Cy3 and at 632 nm for excitation of Cy5.
  • The excitation laser light is focused on the array using a 20X microscope objective (Nikon, Inc., Melville NY).
  • The slide containing the array is placed on a computer-controlled X-Y stage on the microscope and raster-scanned past the objective.
  • the 1.8 cm x 1.8 cm array used in the present example is scanned with a resolution of 20 micrometers.
  • A mixed gas multiline laser excites the two fluorophores sequentially. Emitted light is split, based on wavelength, into two photomultiplier tube detectors (PMT R1477,
  • a specific location on the array contains a complementary DNA sequence, allowing the intensity of the signal at that location to be correlated with a weight ratio of hybridizing species of 1 : 100,000.
  • The calibration is done by labeling samples of the calibrating cDNA with the two fluorophores and adding identical amounts of each to the hybridization mixture.
  • The output of the photomultiplier tube is digitized using a 12-bit RTI-835H analog-to-digital (A/D) conversion board (Analog Devices, Inc., Norwood, MA) installed in an IBM-compatible PC computer.
  • The digitized data are displayed as an image where the signal intensity is mapped using a linear 20-color transformation to a pseudocolor scale ranging from blue (low signal) to red (high signal).
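The scanning and display figures above imply some simple arithmetic: a 1.8 cm side scanned at 20 micrometers gives 900 samples per side, and a 12-bit converter yields values from 0 to 4095 that a linear 20-color transformation divides into 20 equal bands. The sketch below illustrates this. The names are hypothetical, and both the treatment of the 20 micrometer resolution as a pixel pitch and the equal-width banding are assumptions about how the mapping was done.

```python
ADC_BITS = 12
ADC_LEVELS = 2 ** ADC_BITS   # 4096 possible digitized values (0..4095)
N_COLORS = 20                # linear 20-color pseudocolor scale
ARRAY_SIDE_UM = 18_000       # 1.8 cm expressed in micrometers
PIXEL_UM = 20                # assumption: scan resolution used as pixel pitch


def pixels_per_side():
    """Number of scan positions along one side of the array."""
    return ARRAY_SIDE_UM // PIXEL_UM


def color_index(value):
    """Map a 12-bit intensity to a band from 0 (blue, low) to 19 (red, high)."""
    if not 0 <= value < ADC_LEVELS:
        raise ValueError("value must be a 12-bit intensity (0..4095)")
    return value * N_COLORS // ADC_LEVELS


if __name__ == "__main__":
    print(pixels_per_side())
    print([color_index(v) for v in (0, 2048, 4095)])
```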
  • The data are also analyzed quantitatively. Where two different fluorophores are excited and measured simultaneously, the data are first corrected for optical crosstalk (due to overlapping emission spectra) between the fluorophores using each fluorophore's emission spectrum.
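One common way to correct two-channel crosstalk is linear unmixing, sketched below. This is an illustration of the general idea and is not stated to be the method used here. It assumes each detector records its own fluorophore plus a fixed fraction of the other, and the leak fractions in the example (0.10 and 0.05) are made-up values that would in practice be derived from the emission spectra.

```python
def unmix(measured_1, measured_2, leak_2_into_1, leak_1_into_2):
    """Recover true signals s1, s2 from crosstalk-contaminated measurements.

    Assumed model (hypothetical):
        measured_1 = s1 + leak_2_into_1 * s2
        measured_2 = leak_1_into_2 * s1 + s2
    """
    det = 1.0 - leak_2_into_1 * leak_1_into_2
    if det == 0:
        raise ValueError("channels are not separable with these leak fractions")
    s1 = (measured_1 - leak_2_into_1 * measured_2) / det
    s2 = (measured_2 - leak_1_into_2 * measured_1) / det
    return s1, s2


if __name__ == "__main__":
    # True signals 1000 and 500 with 10% and 5% leakage give 1050 and 550.
    print(unmix(1050.0, 550.0, leak_2_into_1=0.10, leak_1_into_2=0.05))
```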
  • a grid is superimposed over the fluorescence signal image such that the signal from each spot is centered in each element of the grid.
  • the fluorescence signal within each element is then integrated to obtain a numerical value corresponding to the average intensity of the signal.
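The grid-based quantification described above can be sketched as follows: the image is divided into square grid elements and the signal in each element is summed and divided by the element area to give an average intensity. This is not the GEMTOOLS implementation. The function name, the nested-list image representation, and the fixed square cell size are assumptions made for the example.

```python
def spot_means(image, cell):
    """Return the mean intensity of each cell x cell grid element.

    image is a list of equal-length rows; partial cells at the edges are ignored.
    """
    rows, cols = len(image), len(image[0])
    means = []
    for r0 in range(0, rows - cell + 1, cell):
        row_means = []
        for c0 in range(0, cols - cell + 1, cell):
            total = sum(
                image[r][c]
                for r in range(r0, r0 + cell)
                for c in range(c0, c0 + cell)
            )
            row_means.append(total / (cell * cell))
        means.append(row_means)
    return means


if __name__ == "__main__":
    image = [
        [1, 1, 2, 2],
        [1, 1, 2, 2],
        [3, 3, 4, 4],
        [3, 3, 4, 4],
    ]
    print(spot_means(image, 2))
```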
  • the software used for signal analysis is the GEMTOOLS gene expression analysis program (Incyte).
  • Sequences complementary to the sptm are used to detect, decrease, or inhibit expression of the naturally occurring nucleotide.
  • The use of oligonucleotides comprising from about 15 to 30 base pairs is typical in the art. However, smaller or larger sequence fragments can also be used.
  • Appropriate oligonucleotides are designed from the sptm using OLIGO 4.06 software (National Biosciences) or other appropriate programs and are synthesized using methods standard in the art or ordered from a commercial supplier.
  • a complementary oMgonucleotide is designed from the most unique 5' sequence and used to prevent transcription factor binding to the promoter sequence.
  • To inhibit translation, a complementary oMgonucleotide is designed to prevent ribosomal binding and processing of the transcript.
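A complementary oligonucleotide for a chosen target region is the reverse complement of that region. The sketch below derives one for a window of 15 to 30 bases, the typical size range mentioned above. The function names and the example target sequence are hypothetical, and choosing which region to target (the most unique 5' sequence, or a ribosome binding region) is left to the user.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_oligo(target, start, length=20):
    """Return an oligo complementary to target[start:start + length]."""
    if not 15 <= length <= 30:
        raise ValueError("length outside the typical 15-30 base range")
    region = target[start:start + length]
    if len(region) < length:
        raise ValueError("target region runs past the end of the sequence")
    return reverse_complement(region)


if __name__ == "__main__":
    target = "ATGGCCATTGTAATGGGCCGC"  # made-up example sequence
    print(antisense_oligo(target, start=0, length=15))
```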
  • Expression and purification of SPTM is accomplished using bacterial or virus-based expression systems.
  • cDNA is subcloned into an appropriate vector containing an antibiotic resistance gene and an inducible promoter that directs high levels of cDNA transcription.
  • Promoters include, but are not limited to, the trp-lac (tac) hybrid promoter and the T5 or T7 bacteriophage promoter in conjunction with the lac operator regulatory element.
  • Recombinant vectors are transformed into suitable bacterial hosts, e.g., BL21(DE3). Antibiotic resistant bacteria express SPTM upon induction with isopropyl beta-D-thiogalactopyranoside (IPTG). Expression of SPTM in eukaryotic cells is achieved by infecting insect or mammalian cell lines with recombinant Autographa californica nuclear polyhedrosis virus (AcMNPV), commonly known as baculovirus. The nonessential polyhedrin gene of baculovirus is replaced with cDNA encoding SPTM by either homologous recombination or bacterial-mediated transposition involving transfer plasmid intermediates. Viral infectivity is maintained and the strong polyhedrin promoter drives high levels of cDNA transcription.
  • Recombinant baculovirus is used to infect Spodoptera frugiperda (Sf9) insect cells in most cases, or human hepatocytes, in some cases. Infection of the latter requires additional genetic modifications to baculovirus. (See e.g., Engelhard, supra; and Sandig, supra.)
  • SPTM is synthesized as a fusion protein with, e.g., glutathione S-transferase (GST) or a peptide epitope tag, such as FLAG or 6-His, permitting rapid, single-step, affinity-based purification of recombinant fusion protein from crude cell lysates.
  • GST, a 26-kilodalton enzyme from Schistosoma japonicum, enables the purification of fusion proteins on immobilized glutathione under conditions that maintain protein activity and antigenicity (Amersham Pharmacia Biotech).
  • The GST moiety can be proteolytically cleaved from SPTM at specifically engineered sites.
  • FLAG an 8-amino acid peptide
  • 6-His a stretch of six consecutive histidine residues, enables purification on metal-chelate resins (QIAGEN). Methods for protein expression and purification are discussed in Ausubel (1995, supra, Chapters 10 and 16). Purified SPTM obtained by these methods can be used directly in the following activity assay.
  • An assay for SPTM activity measures the expression of SPTM on the cell surface.
  • cDNA encoding SPTM is subcloned into an appropriate mammalian expression vector suitable for high levels of cDNA expression.
  • The resulting construct is transfected into a nonhuman cell line such as NIH3T3.
  • Cell surface proteins are labeled with biotin using methods known in the art.
  • Immunoprecipitations are performed using SPTM-specific antibodies, and immunoprecipitated samples are analyzed using SDS-PAGE and immunoblotting techniques. The ratio of labeled immunoprecipitant to unlabeled immunoprecipitant is proportional to the amount of SPTM expressed on the cell surface.
  • An assay for SPTM activity measures the amount of SPTM in secretory, membrane-bound organelles. Transfected cells as described above are harvested and lysed. The lysate is fractionated using methods known to those of skill in the art, for example, sucrose gradient ultracentrifugation.
  • Such methods allow the isolation of subcellular components such as the Golgi apparatus, ER, small membrane-bound vesicles, and other secretory organelles.
  • Immunoprecipitations from fractionated and total cell lysates are performed using SPTM-specific antibodies, and immunoprecipitated samples are analyzed using SDS-PAGE and immunoblotting techniques.
  • concentration of SPTM in secretory organelles relative to SPTM in total cell lysate is proportional to the amount of SPTM in transit through the secretory pathway.
  • SPTM function is assessed by expressing sptm at physiologically elevated levels in mammalian cell culture systems.
  • cDNA is subcloned into a mammalian expression vector containing a strong promoter that drives high levels of cDNA expression.
  • Vectors of choice include pCMV SPORT (Life Technologies) and pCR3.1 (Invitrogen Corporation, Carlsbad CA), both of which contain the cytomegalovirus promoter.
  • 5-10 µg of recombinant vector are transiently transfected into a human cell line, preferably of endothelial or hematopoietic origin, using either liposome formulations or electroporation.
  • 1-2 µg of an additional plasmid containing sequences encoding a marker protein are co-transfected.
  • marker protein provides a means to distinguish transfected cells from nontransfected cells and is a reliable predictor of cDNA expression from the recombinant vector.
  • Marker proteins of choice include, e.g., Green Fluorescent Protein (GFP; CLONTECH), CD64, or a CD64-GFP fusion protein. Flow cytometry (FCM), an automated laser optics-based technique, is used to identify transfected cells expressing GFP or CD64-GFP and to evaluate the apoptotic state of the cells and other cellular properties.
  • FCM detects and quantifies the uptake of fluorescent molecules that diagnose events preceding or coincident with cell death. These events include changes in nuclear DNA content as measured by staining of DNA with propidium iodide; changes in cell size and granularity as measured by forward light scatter and 90 degree side light scatter; down-regulation of DNA synthesis as measured by decrease in bromodeoxyuridine uptake; alterations in expression of cell surface and intracellular proteins as measured by reactivity with specific antibodies; and alterations in plasma membrane composition as measured by the binding of fluorescein-conjugated Annexin V protein to the cell surface. Methods in flow cytometry are discussed in Ormerod, M. G. (1994) Flow Cytometry, Oxford, New York NY.
  • CD64 and CD64-GFP are expressed on the surface of transfected cells and bind to conserved regions of human immunoglobulin G (IgG). Transfected cells are efficiently separated from nontransfected cells using magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Inc., Lake
  • mRNA can be purified from the cells using methods well known by those of skill in the art. Expression of mRNA encoding SPTM and other genes of interest can be analyzed by northern analysis or microarray techniques.
  • The SPTM amino acid sequence is analyzed using LASERGENE software (DNASTAR) to determine regions of high immunogenicity, and a corresponding peptide is synthesized and used to raise antibodies by means known to those of skill in the art.
  • Methods for selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, are well described in the art. (See, e.g., Ausubel, 1995, supra, Chapter 11.)
  • peptides 15 residues in length are synthesized using an ABI 431A peptide synthesizer (Applied Biosystems) using Fmoc chemistry and coupled to KLH (Sigma) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS) to increase immunogenicity.
  • Rabbits are immunized with the peptide-KLH complex in complete Freund's adjuvant.
  • Resulting antisera are tested for antipeptide activity by, for example, binding the peptide to plastic, blocking with 1% BSA, reacting with rabbit antisera, washing, and reacting with radioiodinated goat anti-rabbit IgG.
  • Antisera with antipeptide activity are tested for anti-SPTM activity using protocols well known in the art, including ELISA, RIA, and immunoblotting.
  • Naturally occurring or recombinant SPTM is substantially purified by immunoaffinity chromatography using antibodies specific for SPTM.
  • An immunoaffinity column is constructed by covalently coupling anti-SPTM antibody to an activated chromatographic resin, such as CNBr-activated SEPHAROSE (Amersham Pharmacia Biotech). After the coupling, the resin is blocked and washed according to the manufacturer's instructions.
  • Media containing SPTM are passed over the immunoaffinity column, and the column is washed under conditions that allow the preferential absorbance of SPTM (e.g., high ionic strength buffers in the presence of detergent).
  • the column is eluted under conditions that disrupt antibody/SPTM binding (e.g., a buffer of pH 2 to pH 3, or a high concentration of a chaotrope, such as urea or thiocyanate ion), and SPTM is collected.
  • SPTM, or biologically active fragments thereof, are labeled with 125I Bolton-Hunter reagent.
  • See, e.g., Bolton, A.E. and W.M. Hunter (1973) Biochem. J. 133:529-539.
  • Candidate molecules previously arrayed in the wells of a multi-well plate are incubated with the labeled SPTM, washed, and any wells with labeled SPTM complex are assayed. Data obtained using different concentrations of SPTM are used to calculate values for the number, affinity, and association of SPTM with the candidate molecules.
  • molecules interacting with SPTM are analyzed using the yeast two-hybrid system as described in Fields, S. and O. Song (1989) Nature 340:245-246, or using commercially available kits based on the two-hybrid system, such as the MATCHMAKER system (CLONTECH).
  • SPTM may also be used in the PATHCALLING process (CuraGen Corp., New Haven CT) which employs the yeast two-hybrid system in a high-throughput manner to determine all interactions between the proteins encoded by two large libraries of genes (Nandabalan, K. et al. (2000) U.S. Patent No. 6,057,101).
  • LG:980494.1:2000SEP08 304 357 forward 1 TM Nin
    5 LG:980494.1:2000SEP08 1267 1335 forward 1 TM Nin
    5 LG:980494.1:2000SEP08 1339 1410 forward 1 TM Nin
    5 LG:980494.1:2000SEP08 1453 1524 forward 1 TM Nin
    5 LG:980494.1:2000SEP08 1564 1611 forward 1 TM Nin
    5 LG:980494.1:2000SEP08 563 628 forward 2 TM Nout
    5 LG:980494.1:2000SEP08 1364 1450 forward 2 TM Nout
    5 LG:980494.1:2000SEP08 15 101 forward 3 TM Nin
    5 LG:980494.1:2000SEP08 855 941 forward 3 TM Nin
    5 LG:980494.1:2000SEP08 1251 1337 forward 3 TM Nin
    5 LG:980494.1:2000SEP08 1374 1424 forward 3 TM Nin
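As a rough illustration of how rows like the transmembrane (TM) entries above might be read, the following Python sketch parses a few of them and derives the segment length in residues. The column meanings (template ID, start, stop, strand, frame, domain type, topology) are an assumption inferred from the values, since the excerpt carries no header row, and `parse_tm_row`, `residue_length`, and `frame_of_start` are hypothetical helper names.

```python
from typing import NamedTuple


class TMSegment(NamedTuple):
    template_id: str
    start: int       # assumed: 1-based nucleotide start
    stop: int        # assumed: 1-based nucleotide stop, inclusive
    strand: str      # e.g. "forward"
    frame: int       # assumed: reading frame 1, 2, or 3
    domain: str      # e.g. "TM"
    topology: str    # e.g. "Nin" or "Nout"


def parse_tm_row(row: str) -> TMSegment:
    """Parse one whitespace-separated row; the column layout is an assumption."""
    fields = row.split()
    # Some rows carry a leading numeric field before the template ID; skip it.
    if not fields[0].startswith("LG:"):
        fields = fields[1:]
    template_id, start, stop, strand, frame, domain, topology = fields
    return TMSegment(template_id, int(start), int(stop), strand,
                     int(frame), domain, topology)


def residue_length(seg: TMSegment) -> int:
    """Number of codons spanned, if start/stop are nucleotide coordinates."""
    span = seg.stop - seg.start + 1
    if span % 3:
        raise ValueError(f"span of {span} nt is not a whole number of codons")
    return span // 3


def frame_of_start(seg: TMSegment) -> int:
    """Reading frame (1-3) implied by a 1-based start coordinate."""
    return (seg.start - 1) % 3 + 1


rows = [
    "LG:980494.1:2000SEP08 304 357 forward 1 TM Nin",
    "5 LG:980494.1:2000SEP08 563 628 forward 2 TM Nout",
    "5 LG:980494.1:2000SEP08 15 101 forward 3 TM Nin",
]
segments = [parse_tm_row(r) for r in rows]
for seg in segments:
    print(seg.start, seg.stop, seg.topology,
          residue_length(seg), frame_of_start(seg) == seg.frame)
```

Under this reading, each span is a whole number of codons and the frame implied by the start coordinate agrees with the listed frame, which is consistent with (but does not prove) the assumed column layout.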

Abstract

The present invention provides purified secretory polynucleotides (sptm) and the polypeptides (SPTM) encoded by sptm. The invention also provides for the use of sptm, or complements, oligonucleotides, or fragments thereof in diagnostic assays. The invention further provides for vectors and host cells containing sptm for the expression of SPTM. The invention additionally provides for the use of isolated and purified SPTM to induce antibodies and to screen libraries of compounds and the use of anti-SPTM antibodies in diagnostic assays. Also provided are microarrays containing sptm and methods of use.

Description

SECRETORY MOLECULES
TECHNICAL FIELD
The present invention relates to secretory molecules and to the use of these sequences in the diagnosis, study, prevention, and treatment of diseases associated with, as well as effects of exogenous compounds on, the expression of secretory molecules.
BACKGROUND OF THE INVENTION
Protein transport and secretion are essential for cellular function. Protein transport is mediated by a signal peptide located at the amino terminus of the protein to be transported or secreted. The signal peptide is comprised of about ten to twenty hydrophobic amino acids which target the nascent protein from the ribosome to a particular membrane-bound compartment such as the endoplasmic reticulum (ER). Proteins targeted to the ER may either proceed through the secretory pathway or remain in any of the secretory organelles such as the ER, Golgi apparatus, or lysosomes. Proteins that transit through the secretory pathway are either secreted into the extracellular space or retained in the plasma membrane. Proteins that are retained in the plasma membrane contain one or more transmembrane domains, each comprised of about 20 hydrophobic amino acid residues. Proteins that are secreted from the cell are generally synthesized as inactive precursors that are activated by post-translational processing events during transit through the secretory pathway. Such events include glycosylation, proteolysis, and removal of the signal peptide by a signal peptidase. Other events that may occur during protein transport include chaperone-dependent unfolding and folding of the nascent protein and interaction of the protein with a receptor or pore complex. Examples of secretory proteins with amino terminal signal peptides are discussed below and include proteins with important roles in cell-to-cell signaling. Such proteins include transmembrane receptors and cell surface markers, extracellular matrix molecules, cytokines, hormones, growth and differentiation factors, neuropeptides, vasomediators, ion channels, transporters/pumps, and proteases. (Reviewed in Alberts, B. et al. (1994) Molecular Biology of The Cell. Garland Publishing, New York NY, pp. 557-560, 582-592.)
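The description of signal peptides and transmembrane domains as runs of hydrophobic residues can be made concrete with a sliding-window hydropathy scan. The Python sketch below is only an illustration and is not the method used in this document: it assumes the Kyte-Doolittle hydropathy scale, a 19-residue window, and a mean-score cutoff of 1.6, and `window_scores` and `hydrophobic_windows` are hypothetical helper names.

```python
from typing import List

# Kyte-Doolittle hydropathy values (assumed scale; not taken from this document).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_scores(seq: str, window: int = 19) -> List[float]:
    """Mean hydropathy of every window of the given length along the sequence."""
    seq = seq.upper()
    return [
        sum(KD[aa] for aa in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def hydrophobic_windows(seq: str, window: int = 19,
                        threshold: float = 1.6) -> List[int]:
    """0-based start positions of windows whose mean hydropathy meets the cutoff."""
    return [i for i, score in enumerate(window_scores(seq, window))
            if score >= threshold]


# Synthetic example: a 20-residue leucine run flanked by charged residues.
example = "DEKR" * 3 + "L" * 20 + "DEKR" * 3
print(hydrophobic_windows(example))
```

A sequence shorter than the window simply yields no windows. Real signal-peptide and transmembrane predictors use more elaborate models, so this scan should be read as a first-pass filter at most.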
G-protein coupled receptors (GPCRs) comprise a superfamily of integral membrane proteins which transduce extracellular signals. Not all GPCRs contain N-terminal signal peptides. GPCRs include receptors for biogenic amines such as dopamine, epinephrine, histamine, glutamate (metabotropic-type), acetylcholine (muscarinic-type), and serotonin; for lipid mediators of inflammation such as prostaglandins, platelet activating factor, and leukotrienes; for peptide hormones such as calcitonin, C5a anaphylatoxin, follicle stimulating hormone, gonadotropin releasing hormone, neurokinin, oxytocin, and thrombin; and for sensory signal mediators such as retinal photopigments and olfactory stimulatory molecules. The structure of these highly conserved receptors consists of seven hydrophobic transmembrane regions, cysteine disulfide bridges between the second and third extracellular loops, an extracellular N-terminus, and a cytoplasmic C-terminus. The N-terminus interacts with ligands, the disulfide bridges interact with agonists and antagonists, and the large third intracellular loop interacts with G proteins to activate second messengers such as cyclic AMP, phospholipase C, inositol triphosphate, or ion channels. (Reviewed in Watson, S. and Arkinstall, S. (1994) The G-protein Linked Receptor Facts Book. Academic Press, San Diego CA, pp. 2-6; and Bolander, F.F. (1994) Molecular Endocrinology. Academic Press, San Diego CA, pp. 162-176.)
Other types of receptors include cell surface antigens identified on leukocytic cells of the immune system. These antigens have been identified using systematic, monoclonal antibody (mAb)-based "shot gun" techniques. These techniques have resulted in the production of hundreds of mAbs directed against unknown cell surface leukocytic antigens. These antigens have been grouped into "clusters of differentiation" based on common immunocytochemical localization patterns in various differentiated and undifferentiated leukocytic cell types. Antigens in a given cluster are presumed to identify a single cell surface protein and are assigned a "cluster of differentiation" or "CD" designation. Some of the genes encoding proteins identified by CD antigens have been cloned and verified by standard molecular biology techniques. CD antigens have been characterized as both transmembrane proteins and cell surface proteins anchored to the plasma membrane via covalent attachment to fatty acid-containing glycolipids such as glycosylphosphatidylinositol (GPI). (Reviewed in Barclay, A.N. et al. (1995) The Leucocyte Antigen Facts Book. Academic Press, San Diego CA, pp. 17-20.)
Matrix proteins (MPs) are transmembrane and extracellular proteins which function in formation, growth, remodeling, and maintenance of tissues and as important mediators and regulators of the inflammatory response. The expression and balance of MPs may be perturbed by biochemical changes that result from congenital, epigenetic, or infectious diseases. In addition, MPs affect leukocyte migration, proliferation, differentiation, and activation in the immune response. MPs are frequently characterized by the presence of one or more domains which may include collagen-like domains, EGF-like domains, immunoglobulin-like domains, and fibronectin-like domains. In addition, MPs may be heavily glycosylated and may contain an Arginine-Glycine-Aspartate (RGD) tripeptide motif which may play a role in adhesive interactions. MPs include extracellular proteins such as fibronectin, collagen, galectin, vitronectin and its proteolytic derivative somatomedin B; and cell adhesion receptors such as cell adhesion molecules (CAMs), cadherins, and integrins. (Reviewed in Ayad, S. et al. (1994) The Extracellular Matrix Facts Book. Academic Press, San Diego CA, pp. 2-16; Ruoslahti, E. (1997) Kidney Int. 51:1413-1417; Sjaastad, M.D. and Nelson, W.J. (1997) BioEssays 19:47-55.)
Cytokines are secreted by hematopoietic cells in response to injury or infection. Interleukins, neurotrophins, growth factors, interferons, and chemokines all define cytokine families that work in conjunction with cellular receptors to regulate cell proliferation and differentiation. In addition, cytokines affect activities such as leukocyte migration and function, hematopoietic cell proliferation, temperature regulation, acute response to infection, tissue remodeling, and apoptosis. Chemokines, in particular, are small chemoattractant cytokines involved in inflammation, leukocyte proliferation and migration, angiogenesis and angiostasis, regulation of hematopoiesis, HIV infectivity, and stimulation of cytokine secretion. Chemokines generally contain 70-100 amino acids and are subdivided into four subfamilies based on the presence of conserved cysteine-based motifs. (Callard, R. and Gearing, A. (1994) The Cytokine Facts Book. Academic Press, New York NY, pp. 181-190, 210-213, 223-227.)
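The grouping of chemokines by cysteine motif can be sketched as a toy classifier. This Python example is a deliberately simplified illustration: the subfamily labels (C, CC, CXC, CX3C) are the conventional names and are not spelled out in the text above, the rule looks only at the spacing after the first cysteine of a mature sequence, and `chemokine_subfamily` is a hypothetical helper. Real classification relies on full sequence alignment.

```python
from typing import Optional


def chemokine_subfamily(mature_seq: str) -> Optional[str]:
    """Guess a subfamily label from the spacing after the first cysteine.

    Returns None when the sequence contains no cysteine at all.
    """
    seq = mature_seq.upper()
    first = seq.find("C")
    if first == -1:
        return None
    tail = seq[first + 1:first + 5]  # up to four residues after the first Cys
    if tail[:1] == "C":
        return "CC"      # two adjacent cysteines
    if tail[1:2] == "C":
        return "CXC"     # one residue between the cysteines
    if tail[3:4] == "C":
        return "CX3C"    # three residues between the cysteines
    return "C"           # no paired cysteine close to the first one


# Synthetic test strings, not real chemokine sequences.
for s in ["AAACCAAA", "AAACACAAA", "AAACAAACAA", "AAACAAAAAACAA"]:
    print(s, chemokine_subfamily(s))
```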
Growth and differentiation factors are secreted proteins which function in intercellular communication. Some factors require oligomerization or association with MPs for activity. Complex interactions among these factors and their receptors trigger intracellular signal transduction pathways that stimulate or inhibit cell division, cell differentiation, cell signaling, and cell motility. Most growth and differentiation factors act on cells in their local environment (paracrine signaling). There are three broad classes of growth and differentiation factors. The first class includes the large polypeptide growth factors such as epidermal growth factor, fibroblast growth factor, transforming growth factor, insulin-like growth factor, and platelet-derived growth factor. The second class includes the hematopoietic growth factors such as the colony stimulating factors (CSFs). Hematopoietic growth factors stimulate the proliferation and differentiation of blood cells such as B-lymphocytes, T-lymphocytes, erythrocytes, platelets, eosinophils, basophils, neutrophils, macrophages, and their stem cell precursors. The third class includes small peptide factors such as bombesin, vasopressin, oxytocin, endothelin, transferrin, angiotensin II, vasoactive intestinal peptide, and bradykinin which function as hormones to regulate cellular functions other than proliferation.
Growth and differentiation factors play critical roles in neoplastic transformation of cells in vitro and in tumor progression in vivo. Inappropriate expression of growth factors by tumor cells may contribute to vascularization and metastasis of tumors. During hematopoiesis, growth factor misregulation can result in anemias, leukemias, and lymphomas. Certain growth factors such as interferon are cytotoxic to tumor cells both in vivo and in vitro. Moreover, some growth factors and growth factor receptors are related both structurally and functionally to oncoproteins. In addition, growth factors affect transcriptional regulation of both proto-oncogenes and oncosuppressor genes. (Reviewed in Pimentel, E. (1994) Handbook of Growth Factors. CRC Press, Ann Arbor MI, pp. 1-9.)
Proteolytic enzymes or proteases either activate or deactivate proteins by hydrolyzing peptide bonds. Proteases are found in the cytosol, in membrane-bound compartments, and in the extracellular space. The major families are the zinc, serine, cysteine, thiol, and carboxyl proteases.
Ion channels, ion pumps, and transport proteins mediate the transport of molecules across cellular membranes. Transport can occur by a passive, concentration-dependent mechanism or can be linked to an energy source such as ATP hydrolysis. Symporters and antiporters transport ions and small molecules such as amino acids, glucose, and drugs. Symporters transport molecules and ions unidirectionally, and antiporters transport molecules and ions bidirectionally. Transporter superfamilies include facilitative transporters and active ATP-binding cassette transporters which are involved in multiple-drug resistance and the targeting of antigenic peptides to MHC Class I molecules. These transporters bind to a specific ion or other molecule and undergo a conformational change in order to transfer the ion or molecule across the membrane. (Reviewed in Alberts, B. et al. (1994) Molecular Biology of The Cell. Garland Publishing, New York NY, pp. 523-546.)
Ion channels are formed by transmembrane proteins which create a lined passageway across the membrane through which water and ions, such as Na+, K+, Ca2+, and Cl-, enter and exit the cell. For example, chloride channels are involved in the regulation of the membrane electric potential as well as absorption and secretion of ions across the membrane. Chloride channels also regulate the internal pH of membrane-bound organelles.
Ion pumps are ATPases which actively maintain membrane gradients. Ion pumps are classified as P, V, or F according to their structure and function. All have one or more binding sites for ATP in their cytosolic domains. The P-class ion pumps include Ca2+ ATPase and Na+/K+ ATPase and function in transporting H+, Na+, K+, and Ca2+ ions. P-class pumps consist of two α and two β transmembrane subunits. The V- and F-class ion pumps have similar structures but transport only H+. F-class H+ pumps mediate transport across the membranes of mitochondria and chloroplasts, while V-class H+ pumps regulate acidity inside lysosomes, endosomes, and plant vacuoles.
A family of structurally related intrinsic membrane proteins known as facilitative glucose transporters catalyze the movement of glucose and other selected sugars across the plasma membrane. The proteins in this family contain a highly conserved, large transmembrane domain comprised of 12 α-helices, and several weakly conserved, cytoplasmic and exoplasmic domains. (Pessin, J.E. and Bell, G.I. (1992) Annu. Rev. Physiol. 54:911-930.)
Amino acid transport is mediated by Na+ dependent amino acid transporters. These transporters are involved in gastrointestinal and renal uptake of dietary and cellular amino acids and in neuronal reuptake of neurotransmitters. Transport of cationic amino acids is mediated by the system y+ family and the cationic amino acid transporter (CAT) family. Members of the CAT family share a high degree of sequence homology, and each contains 12-14 putative transmembrane domains. (Ito, K. and Groudine, M. (1997) J. Biol. Chem. 272:26780-26786.)
Hormones are secreted molecules that travel through the circulation and bind to specific receptors on the surface of, or within, target cells. Although they have diverse biochemical compositions and mechanisms of action, hormones can be grouped into two categories. One category includes small lipophilic hormones that diffuse through the plasma membrane of target cells, bind to cytosolic or nuclear receptors, and form a complex that alters gene expression. Examples of these molecules include retinoic acid, thyroxine, and the cholesterol-derived steroid hormones such as progesterone, estrogen, testosterone, cortisol, and aldosterone. The second category includes hydrophilic hormones that function by binding to cell surface receptors that transduce signals across the plasma membrane. Examples of such hormones include amino acid derivatives such as catecholamines and peptide hormones such as glucagon, insulin, gastrin, secretin, cholecystokinin, adrenocorticotropic hormone, follicle stimulating hormone, luteinizing hormone, thyroid stimulating hormone, and vasopressin. (See, for example, Lodish et al. (1995) Molecular Cell Biology. Scientific American Books Inc., New York NY, pp. 856-864.)
Neuropeptides and vasomediators (NP/VM) comprise a large family of endogenous signaling molecules. Included in this family are neuropeptides and neuropeptide hormones such as bombesin, neuropeptide Y, neurotensin, neuromedin N, melanocortins, opioids, galanin, somatostatin, tachykinins, urotensin II and related peptides involved in smooth muscle stimulation, vasopressin, vasoactive intestinal peptide, and circulatory system-borne signaling molecules such as angiotensin, complement, calcitonin, endothelins, formyl-methionyl peptides, glucagon, cholecystokinin and gastrin. NP/VMs can transduce signals directly, modulate the activity or release of other neurotransmitters and hormones, and act as catalytic enzymes in cascades. The effects of NP/VMs range from extremely brief to long-lasting. (Reviewed in Martin, C.R. et al. (1985) Endocrine Physiology. Oxford University Press, New York, NY, pp. 57-62.)
The discovery of new secretory molecules satisfies a need in the art by providing new compositions which are useful in the diagnosis, study, prevention, and treatment of diseases associated with, as well as effects of exogenous compounds on, cell signaling and the expression of secretory molecules.
SUMMARY OF THE INVENTION
The present invention relates to nucleic acid sequences comprising human polynucleotides encoding secretory polypeptides that contain signal peptides and/or transmembrane domains. These human polynucleotides (sptm) as presented in the Sequence Listing uniquely identify partial or full length genes encoding structural, functional, and regulatory polypeptides involved in cell signaling.
The invention provides an isolated polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d). In one alternative, the polynucleotide comprises a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184. In another alternative, the polynucleotide comprises at least 30 contiguous nucleotides of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; b) a polynucleotide comprising a naturally occurring polynucleotide comprising a polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d). In another alternative, the polynucleotide comprises at least 60 contiguous nucleotides of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; b) a polynucleotide comprising a naturally occurring polynucleotide comprising a polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d).
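The relationships recited in items a) through e) above (a complementary strand, an RNA equivalent, a percent-identity threshold, and contiguous fragments of a minimum length) can be illustrated with a short Python sketch. It is a simplified illustration, not the definition used in this document: `percent_identity` assumes two sequences that are already aligned to equal length, whereas identity is normally computed by an alignment program, and all function names are hypothetical.

```python
from typing import List

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement_strand(dna: str) -> str:
    """Reverse complement, i.e. the complementary strand read 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def rna_equivalent(dna: str) -> str:
    """Same base sequence with uracil in place of thymine."""
    return dna.replace("T", "U").replace("t", "u")


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns with identical bases; gaps ('-') never match."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and aligned to equal length")
    matches = sum(
        x == y and x != "-"
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
    )
    return 100.0 * matches / len(aligned_a)


def contiguous_fragments(seq: str, n: int = 30) -> List[str]:
    """Every run of n contiguous nucleotides in seq (empty if seq is shorter)."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


print(complement_strand("ATGC"))
print(rna_equivalent("ATGC"))
print(percent_identity("ACGTACGTAC", "ACGTACGTAA") >= 90.0)
```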
The invention further provides a composition for the detection of expression of secretory polynucleotides comprising at least one isolated polynucleotide comprising a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d); and a detectable label.
The invention also provides a method for detecting a target polynucleotide in a sample, said target polynucleotide having a polynucleotide sequence of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence of a polynucleotide selected from the group consisting of SEQ ID NO:1-184; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d). The method comprises a) amplifying said target polynucleotide or fragment thereof using polymerase chain reaction amplification, and b) detecting the presence or absence of said amplified target polynucleotide or fragment thereof, and, optionally, if present, the amount thereof.
The invention also provides a method for detecting a target polynucleotide in a sample, said target polynucleotide having a polynucleotide sequence of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d). The method comprises a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides comprising a sequence complementary to said target polynucleotide in the sample, and which probe specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization complex is formed between said probe and said target polynucleotide, and b) detecting the presence or absence of said hybridization complex, and, optionally, if present, the amount thereof. In one alternative, the invention provides a composition comprising a target polynucleotide of the method, wherein said probe comprises at least 30 contiguous nucleotides. In one alternative, the invention provides a composition comprising a target polynucleotide of the method, wherein said probe comprises at least 60 contiguous nucleotides.
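The probe requirement in the hybridization method above (at least 20 contiguous nucleotides complementary to the target) can be sketched in Python as follows. This is an idealized illustration that treats hybridization as perfect base-pair complementarity; actual hybridization depends on stringency conditions and tolerates mismatches. The names `design_probe` and `probe_matches` and the example target sequence are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Complementary strand of a DNA sequence, read 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def design_probe(target: str, start: int, length: int = 20) -> str:
    """Probe complementary to target[start:start + length] (0-based start)."""
    window = target[start:start + length]
    if len(window) < length:
        raise ValueError("requested window runs past the end of the target")
    return reverse_complement(window)


def probe_matches(probe: str, target: str, min_length: int = 20) -> bool:
    """True if the probe is long enough and perfectly complementary to
    some contiguous stretch of the target (idealized hybridization)."""
    return (len(probe) >= min_length
            and reverse_complement(probe) in target.upper())


# Made-up 40-nucleotide target, for illustration only.
target = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAGA"
probe = design_probe(target, start=5, length=20)
print(probe, probe_matches(probe, target))
```

Raising `min_length` to 30 or 60 gives the longer-probe alternatives mentioned in the text.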
The invention further provides a recombinant polynucleotide comprising a promoter sequence operably linked to an isolated polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d). In one alternative, the invention provides a cell transformed with the recombinant polynucleotide. In another alternative, the invention provides a transgenic organism comprising the recombinant polynucleotide.
The invention also provides a method for producing a secretory polypeptide, the method comprising a) culturing a cell under conditions suitable for expression of the secretory polypeptide, wherein said cell is transformed with a recombinant polynucleotide, said recombinant polynucleotide comprising an isolated polynucleotide selected from the group consisting of i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; iii) a polynucleotide complementary to the polynucleotide of i); iv) a polynucleotide complementary to the polynucleotide of ii); and v) an RNA equivalent of i) through iv), and b) recovering the secretory polypeptide so expressed. The invention additionally provides a method wherein the polypeptide has an amino acid sequence selected from the group consisting of SEQ ID NO:185-369.
The invention also provides an isolated secretory polypeptide (SPTM) encoded by at least one polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184. The invention further provides a method of screening for a test compound that specifically binds to the polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369. The method comprises a) combining the polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369 with at least one test compound under suitable conditions, and b) detecting binding of the polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369 to the test compound, thereby identifying a compound that specifically binds to the polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369.
The invention further provides a microarray wherein at least one element of the microarray is an isolated polynucleotide comprising at least 30 contiguous nucleotides of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d). The invention also provides a method for generating a transcript image of a sample which contains polynucleotides. The method comprises a) labeling the polynucleotides of the sample, b) contacting the elements of the microarray with the labeled polynucleotides of the sample under conditions suitable for the formation of a hybridization complex, and c) quantifying the expression of the polynucleotides in the sample.
Additionally, the invention provides a method for screening a compound for effectiveness in altering expression of a target polynucleotide, wherein said target polynucleotide comprises a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d). The method comprises a) exposing a sample comprising the target polynucleotide to a compound, b) detecting altered expression of the target polynucleotide, and c) comparing the expression of the target polynucleotide in the presence of varying amounts of the compound and in the absence of the compound.
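The comparison step of the screening method, expression at varying amounts of compound versus expression without it, can be sketched as a fold-change calculation in Python. This is an illustrative sketch only: the two-fold cutoff, the arbitrary expression units, and the names `fold_changes` and `alters_expression` are assumptions, not values or terms taken from this document.

```python
from typing import Dict


def fold_changes(expression_by_dose: Dict[float, float],
                 untreated: float) -> Dict[float, float]:
    """Expression at each compound dose divided by the untreated level."""
    if untreated <= 0:
        raise ValueError("untreated expression level must be positive")
    return {dose: level / untreated
            for dose, level in sorted(expression_by_dose.items())}


def alters_expression(folds: Dict[float, float],
                      min_fold: float = 2.0) -> bool:
    """True if any dose raises or lowers expression by at least min_fold
    (the two-fold default is an assumed, arbitrary cutoff)."""
    return any(fc >= min_fold or fc <= 1.0 / min_fold
               for fc in folds.values())


# Made-up signal intensities: dose (arbitrary units) -> expression level.
measurements = {1.0: 110.0, 10.0: 250.0}
folds = fold_changes(measurements, untreated=100.0)
print(folds, alters_expression(folds))
```

The same comparison applies whether expression is read out by hybridization signal or by another quantitative assay.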
The invention further provides a method for assessing toxicity of a test compound, said method comprising a) treating a biological sample containing nucleic acids with the test compound; b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20 contiguous nucleotides of a polynucleotide selected from the group consisting of i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; iii) a polynucleotide complementary to the polynucleotide of i); iv) a polynucleotide complementary to the polynucleotide of ii); and v) an RNA equivalent of i) through iv). Hybridization occurs under conditions whereby a specific hybridization complex is formed between said probe and a target polynucleotide in the biological sample, said target polynucleotide comprising a polynucleotide sequence of a polynucleotide selected from the group consisting of i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184; iii) a polynucleotide complementary to the polynucleotide of i); iv) a polynucleotide complementary to the polynucleotide of ii); and v) an RNA equivalent of i) through iv), and alternatively, the target polynucleotide comprises a polynucleotide sequence of a fragment of a polynucleotide selected from the group consisting of i-v above; c) quantifying the amount of hybridization complex; and d) comparing the amount of hybridization complex in the treated biological sample with the amount of hybridization complex in an untreated biological sample, wherein a difference in the amount of hybridization complex in the treated biological sample is indicative of toxicity of the test compound.
The invention further provides an isolated polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369. In one alternative, the invention provides an isolated polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:185-369.
The invention further provides an isolated polynucleotide encoding a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369. In one alternative, the polynucleotide encodes a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:185-369. In another alternative, the polynucleotide comprises a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184.
Additionally, the invention provides an isolated antibody which specifically binds to a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369.
The invention further provides a composition comprising a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, and a pharmaceutically acceptable excipient. In one embodiment, the composition comprises a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369. The invention additionally provides a method of treating a disease or condition associated with decreased expression of functional SPTM, comprising administering to a patient in need of such treatment the composition.
The invention also provides a method for screening a compound for effectiveness as an agonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369. The method comprises a) exposing a sample comprising the polypeptide to a compound, and b) detecting agonist activity in the sample. In one alternative, the invention provides a composition comprising an agonist compound identified by the method and a pharmaceutically acceptable excipient. In another alternative, the invention provides a method of treating a disease or condition associated with decreased expression of functional SPTM, comprising administering to a patient in need of such treatment the composition.
Additionally, the invention provides a method for screening a compound for effectiveness as an antagonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369. The method comprises a) exposing a sample comprising the polypeptide to a compound, and b) detecting antagonist activity in the sample. In one alternative, the invention provides a composition comprising an antagonist compound identified by the method and a pharmaceutically acceptable excipient. In another alternative, the invention provides a method of treating a disease or condition associated with overexpression of functional SPTM, comprising administering to a patient in need of such treatment the composition.
The invention further provides a method of screening for a compound that modulates the activity of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369. The method comprises a) combining the polypeptide with at least one test compound under conditions permissive for the activity of the polypeptide, b) assessing the activity of the polypeptide in the presence of the test compound, and c) comparing the activity of the polypeptide in the presence of the test compound with the activity of the polypeptide in the absence of the test compound, wherein a change in the activity of the polypeptide in the presence of the test compound is indicative of a compound that modulates the activity of the polypeptide.
DESCRIPTION OF THE TABLES
Table 1 shows the sequence identification numbers (SEQ ID NO:s) and template identification numbers (template IDs) corresponding to the polynucleotides of the present invention, along with the sequence identification numbers (SEQ ID NO:s) and open reading frame identification numbers (ORF IDs) corresponding to polypeptides encoded by the template ID.
Table 2 shows the sequence identification numbers (SEQ ID NO:s) and template identification numbers (template IDs) corresponding to the polynucleotides of the present invention, along with polynucleotide segments of each template sequence as defined by the indicated "start" and "stop" nucleotide positions. The reading frames of the polynucleotide segments are shown, and the polypeptides encoded by the polynucleotide segments constitute either signal peptide (SP) or transmembrane (TM) domains, as indicated. The membrane topology of the encoded polypeptide sequence is indicated, the N-terminus (N) listed as being oriented to either the cytosolic (N in) or non-cytosolic (N out) side of the cell membrane or organelle.
Table 3 shows the sequence identification numbers (SEQ ID NO:s) and template identification numbers (template IDs) corresponding to the polynucleotides of the present invention, along with component sequence identification numbers (component IDs) corresponding to each template. The component sequences, which were used to assemble the template sequences, are defined by the indicated "start" and "stop" nucleotide positions along each template.
Table 4 shows the tissue distribution profiles for the templates of the invention.
Table 5 shows the sequence identification numbers (SEQ ID NO:s) corresponding to the polypeptides of the present invention, along with the reading frames used to obtain the polypeptide segments, the lengths of the polypeptide segments, the "start" and "stop" nucleotide positions of the polynucleotide sequences used to define the encoded polypeptide segments, the GenBank hits (GI Numbers), probability scores, and functional annotations corresponding to the GenBank hits.
Table 6 summarizes the bioinformatics tools which are useful for analysis of the polynucleotides of the present invention. The first column of Table 6 lists analytical tools, programs, and algorithms, the second column provides brief descriptions thereof, the third column presents appropriate references, all of which are incorporated by reference herein in their entirety, and the fourth column presents, where applicable, the scores, probability values, and other parameters used to evaluate the strength of a match between two sequences (the higher the score, the greater the homology between two sequences).
DETAILED DESCRIPTION OF THE INVENTION
Before the nucleic acid sequences and methods are presented, it is to be understood that this invention is not limited to the particular machines, methods, and materials described. Although particular embodiments are described, machines, methods, and materials similar or equivalent to these embodiments may be used to practice the invention. The preferred machines, methods, and materials set forth are not intended to limit the scope of the invention which is limited only by the appended claims.
The singular forms "a", "an", and "the" include plural reference unless the context clearly dictates otherwise. All technical and scientific terms have the meanings commonly understood by one of ordinary skill in the art. All publications are incorporated by reference for the purpose of describing and disclosing the cell lines, vectors, and methodologies which are presented and which might be used in connection with the invention. Nothing in the specification is to be construed as an admission that the invention is not entitled to antedate such disclosure by virtue of prior invention.
Definitions
As used herein, the lower case "sptm" refers to a nucleic acid sequence, while the upper case "SPTM" refers to an amino acid sequence encoded by sptm. A "full-length" sptm refers to a nucleic acid sequence containing the entire coding region of a gene endogenously expressed in human tissue.
"Adjuvants" are materials such as Freund's adjuvant, mineral gels (aluminum hydroxide), and surface active substances (lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanin, and dinitrophenol) which may be administered to increase a host's immunological response.
"Allele" refers to an alternative form of a nucleic acid sequence. Alleles result from a "mutation," a change or an alternative reading of the genetic code. Any given gene may have none, one, or many allelic forms. Mutations which give rise to alleles include deletions, additions, or substitutions of nucleotides. Each of these changes may occur alone, or in combination with the others, one or more times in a given nucleic acid sequence. The present invention encompasses allelic sptm.
"Amino acid sequence" refers to a peptide, a polypeptide, or a protein of either natural or synthetic origin. The amino acid sequence is not limited to the complete, endogenous amino acid sequence and may be a fragment, epitope, variant, or derivative of a protein expressed by a nucleic acid sequence.
"Amplification" refers to the production of additional copies of a sequence and is carried out using polymerase chain reaction (PCR) technologies well known in the art.
"Antibody" refers to intact molecules as well as to fragments thereof, such as Fab, F(ab')2> and Fv fragments, which are capable of binding the epitopic determinant. Antibodies that bind SPTM polypeptides can be prepared using intact polypeptides or using fragments containing small peptides of interest as the immunizing antigen. The polypeptide or peptide used to immunize an animal (e.g., a mouse, a rat, or a rabbit) can be derived from the translation of RNA, or synthesized chemically, and can be conjugated to a carrier protein if desired. Commonly used carriers that are chemically coupled to peptides include bovine serum albumin, thyroglobulin, and keyhole limpet hemocyanin (KLH). The coupled peptide is then used to immunize the animal.
"Antisense sequence" refers to a sequence capable of specifically hybridizing to a target sequence. The antisense sequence may include DNA, RNA, or any nucleic acid mimic or analog such as peptide nucleic acid (PNA); oligonucleotides having modified backbone linkages such as phosphorothioates, methylphosphonates, or benzylphosphonates; oligonucleotides having modified sugar groups such as 2'-methoxyethyl sugars or 2'-methoxyethoxy sugars; or oligonucleotides having modified bases such as 5-methyl cytosine, 2'-deoxyuracil, or 7-deaza-2'-deoxyguanosine.
"Antisense sequence" refers to a sequence capable of specifically hybridizing to a target sequence. The antisense sequence can be DNA, RNA, or any nucleic acid mimic or analog. I "Antisense technology" refers to any technology which relies on the specific hybridization of an antisense sequence to a target sequence.
A "bin" is a portion of computer memory space used by a computer program for storage of data, and bounded in such a manner that data stored in a bin may be retrieved by the program.
"Biologically active" refers to an amino acid sequence having a structural, regulatory, or biochemical function of a naturally occurring amino acid sequence.
"Clone joining" is a process for combining gene bins based upon the bins' containing sequence information from the same clone. The sequences may assemble into a primary gene transcript as well as one or more splice variants. "Complementary" describes the relationship between two single-stranded nucleic acid sequences that anneal by base-pairing (5'-A-G-T-3' pairs with its complement 3'-T-C-A-5').
A "component sequence" is a nucleic acid sequence selected by a computer program such as PHRED and used to assemble a consensus or template sequence from one or more component sequences.
A "consensus sequence" or "template sequence" is a nucleic acid sequence which has been assembled from overlapping sequences, using a computer program for fragment assembly such as the GELVIEW fragment assembly system (Genetics Computer Group (GCG), Madison WI) or using a relational database management system (RDMS).
"Conservative amino acid substitutions" are those substitutions that, when made, least interfere with the properties of the original protein, i.e., the stracture and especially the function of the protein is conserved and not significantly changed by such substitutions. The table below shows amino acids which may be substituted for an original amino acid in a protein and which are regarded as conservative substitutions.
Original Residue    Conservative Substitution
Ala                 Gly, Ser
Arg                 His, Lys
Asn                 Asp, Gln, His
Asp                 Asn, Glu
Cys                 Ala, Ser
Gln                 Asn, Glu, His
Glu                 Asp, Gln, His
Gly                 Ala
His                 Asn, Arg, Gln, Glu
Ile                 Leu, Val
Leu                 Ile, Val
Lys                 Arg, Gln, Glu
Met                 Leu, Ile
Phe                 His, Met, Leu, Trp, Tyr
Ser                 Cys, Thr
Thr                 Ser, Val
Trp                 Phe, Tyr
Tyr                 His, Phe, Trp
Val                 Ile, Leu, Thr
Conservative substitutions generally maintain (a) the structure of the polypeptide backbone in the area of the substitution, for example, as a beta sheet or alpha helical conformation, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of the side chain.
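The substitution table can be expressed as a directional lookup, as in the following sketch. It uses the standard three-letter amino acid codes; the dictionary and function names are hypothetical. The table is not symmetric (for example, Gly is listed as a substitute for Ala, but Ser is not listed as a substitute for Gly), so the lookup is made from the original residue to the replacement only.

```python
# Conservative substitutions, keyed by the original residue (three-letter codes).
CONSERVATIVE = {
    "Ala": {"Gly", "Ser"},
    "Arg": {"His", "Lys"},
    "Asn": {"Asp", "Gln", "His"},
    "Asp": {"Asn", "Glu"},
    "Cys": {"Ala", "Ser"},
    "Gln": {"Asn", "Glu", "His"},
    "Glu": {"Asp", "Gln", "His"},
    "Gly": {"Ala"},
    "His": {"Asn", "Arg", "Gln", "Glu"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"His", "Met", "Leu", "Trp", "Tyr"},
    "Ser": {"Cys", "Thr"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Phe", "Tyr"},
    "Tyr": {"His", "Phe", "Trp"},
    "Val": {"Ile", "Leu", "Thr"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if `replacement` is listed as a conservative substitute for `original`."""
    return replacement in CONSERVATIVE.get(original, set())


print(is_conservative("Ala", "Gly"))  # listed in the table
print(is_conservative("Gly", "Ser"))  # not listed for Gly
```

Residues absent from the table (such as Pro) simply have no conservative substitutes in this sketch.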
"Deletion" refers to a change in either a nucleic or amino acid sequence in which at least one nucleotide or amino acid residue, respectively, is absent. 5 "Derivative" refers to the chemical modification of a nucleic acid sequence, such as by replacement of hydrogen by an alkyl, acyl, amino, hydroxyl, or other group.
"Differential expression" refers to increased or upregulated; or decreased, downregulated, or absent gene or protein expression, determined by comparing at least two different samples. Such comparisons may be carried out between, for example, a treated and an untreated sample, or a 0 diseased and a normal sample.
The terms "element" and "array element" refer to a polynucleotide, polypeptide, or other chemical compound having a unique and defined position on a microarray.
"E-value" refers to the statistical probability that a match between two sequences occurred by chance. 5 "Exon shuffling" refers to the recombination of different coding regions (exons). Since an exon may represent a structural or functional domain of the encoded protein, new proteins may be assembled through the novel reassortinent of stable substructures, thus allowing acceleration of the evolution of new protein functions.
A "fragment" is a unique portion of sptm or SPTM which is identical in sequence to but o shorter in length than the parent sequence. A fragment may comprise up to the entire length of the defined sequence, minus one nucleotide/amino acid residue. For example, a fragment may comprise from 10 to 1000 contiguous amino acid residues or nucleotides. A fragment used as a probe, primer, antigen, therapeutic molecule, or for other pmposes, may be at least 5, 10, 15, 16, 20, 25, 30, 40, 50, 60, 75, 100, 150, 250 or at least 500 contiguous amino acid residues or nucleotides in length. Fragments 5 may be preferentially selected from certain regions of a molecule. For example, a polypeptide fragment may comprise a certain length of contiguous amino acids selected from the first 250 or 500 amino acids (or first 25% or 50%) of a polypeptide as shown in a certain defined sequence. Clearly these lengths are exemplary, and any length that is supported by the specification, including the Sequence Listing and the figures, may be encompassed by the present embodiments.
A fragment of sptm comprises a region of unique polynucleotide sequence that specifically 5 identifies sptm, for example, as distinct from any other sequence in the same genome. A fragment of sptm is useful, for example, in hybridization and amplification technologies and in analogous methods that distinguish sptm from related polynucleotide sequences. The precise length of a fragment of sptm and the region of sptm to which the fragment corresponds are routinely determinable by one of ordinary skill in the ait based on the intended pmpose for the fragment. o A fragment of SPTM is encoded by a fragment of sptm. A fragment of SPTM comprises a region of unique amino acid sequence that specifically identifies SPTM. For example, a fragment of SPTM is useful as an immunogenic peptide for the development of antibodies that specifically recognize SPTM. The precise length of a fragment of SPTM and the region of SPTM to which the fragment corresponds are routinely determinable by one of ordinary skill in the art based on the 5 intended pmpose for the fragment.
A "full length" nucleotide sequence is one containing at least a start site for translation to a protein sequence, followed by an open reading frame and a stop site, and encoding a "full length" polypeptide.
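As a rough illustration of the "full length" definition, the sketch below scans a nucleotide sequence for a start site followed in frame by an open reading frame and a stop site. It assumes ATG as the start codon, the standard stop codons TAA, TAG, and TGA, and the forward strand only; none of these choices is specified in the definition, and the function name and `min_codons` parameter are hypothetical.

```python
from typing import Optional, Tuple

# Standard stop codons (an assumption; the definition does not name them).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orf(seq: str, min_codons: int = 2) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the first ATG...stop reading frame, or None.

    `end` is exclusive and includes the stop codon. `min_codons` counts the
    coding codons (start codon included) required before the stop.
    """
    seq = seq.upper()
    start = seq.find("ATG")
    while start != -1:
        # Walk codon by codon, in frame with this ATG.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    return start, i + 3
                break  # the first in-frame stop closes this frame
        start = seq.find("ATG", start + 1)
    return None


orf = find_orf("CCATGGCTTAAGG")
print(orf)  # position of the ATG through the end of the stop codon
```

A sequence for which `find_orf` returns `None` lacks either a start site or an in-frame stop under these assumptions, and so would not qualify as "full length" in this simplified sense.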
"Hit" refers to a sequence whose annotation will be used to describe a given template. o Criteria for selecting the top hit are as follows: if the template has one or more exact nucleic acid matches, the top hit is the exact match with highest percent identity. If the template has no exact matches but has significant protein hits, the top hit is the protein hit with the lowest E- value. If the template has no significant protein hits, but does have significant non-exact nucleotide hits, the top hit is the nucleotide hit with the lowest E-value. 5 "Homology" refers to sequence similarity either between a reference nucleic acid sequence and at least a fragment of an sptm or between a reference amino acid sequence and a fragment of an SPTM.
"Hybridization" refers to the process by which a stiand of nucleotides anneals with a complementary strand through base pairing. Specific hybridization is an indication that two nucleic o acid sequences share a high degree of identity. Specific hybridization complexes form under defined annealing conditions, and remain hybridized after the "washing" step. The defined hybridization conditions include the annealing conditions and the washing step(s), the latter of which is particularly important in determining the stringency of the hybridization process, with more stringent conditions allowing less non-specific binding, i.e., binding between pairs of nucleic acid probes that are not perfectly matched. Permissive conditions for annealing of nucleic acid sequences are routinely determinable and may be consistent among hybridization experiments, whereas wash conditions may be varied among experiments to achieve the desired stringency. 5 Generally, stringency of hybridization is expressed with reference to the temperature under which the wash step is carried out. Generally, such wash temperatures are selected to be about 5°C to 20°C lower than the thermal melting point (T^ for the specific sequence at a defined ionic strength and pH. The Tmis the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. An equation for calculating Tm and conditions for 0 nucleic acid hybridization is well known and can be found in Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual. 2nd ed., vol. 1-3, Cold Spring Harbor Press, Plainview NY; specifically see volume 2, chapter 9.
High stringency conditions for hybridization between polynucleotides of the present invention include wash conditions of 68°C in the presence of about 0.2 x SSC and about 0.1% SDS, for 1 hour. Alternatively, temperatures of about 65°C, 60°C, or 55°C may be used. SSC concentration may be varied from about 0.2 to 2 x SSC, with SDS being present at about 0.1%. Typically, blocking reagents are used to block non-specific hybridization. Such blocking reagents include, for instance, denatured salmon sperm DNA at about 100-200 μg/ml. Useful variations on these conditions will be readily apparent to those skilled in the art. Hybridization, particularly under high stringency conditions, may be suggestive of evolutionary similarity between the nucleotides. Such similarity is strongly indicative of a similar role for the nucleotides and their resultant proteins.
Other parameters, such as temperature, salt concentration, and detergent concentration may be varied to achieve the desired stringency. Denaturants, such as formamide at a concentration of about 35-50% v/v, may also be used under particular circumstances, such as RNA:DNA hybridizations. Appropriate hybridization conditions are routinely determinable by one of ordinary skill in the art.
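The relationship between Tm and wash temperature described in the hybridization discussion (a wash about 5°C to 20°C below Tm) amounts to simple arithmetic, sketched below. The Tm value is taken as an input, since the equation for calculating it is given only by reference to Sambrook et al.; the function name and parameter names are hypothetical.

```python
from typing import Tuple


def wash_temperature_range(
    tm_celsius: float,
    min_offset: float = 5.0,
    max_offset: float = 20.0,
) -> Tuple[float, float]:
    """Return (lowest, highest) wash temperature in degrees C for a given Tm.

    The default offsets reflect the "about 5 to 20 degrees below Tm" guideline;
    Tm itself must be determined separately for the sequence, ionic strength,
    and pH in question.
    """
    return tm_celsius - max_offset, tm_celsius - min_offset


low, high = wash_temperature_range(88.0)
print(low, high)
```

A wash nearer the upper end of the range is more stringent; for a hypothetical Tm of 88°C, the 68°C wash named among the high stringency conditions falls at the lower bound of the range.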
"Immunologically active" or "immunogenic" describes the potential for a natural, recombinant, or synthetic peptide, epitope, polypeptide, or protein to induce antibody production in appropriate animals, cells, or cell lines. o "Insertion" or "addition" refers to a change in either a nucleic or amino acid sequence in which at least one nucleotide or residue, respectively, is added to the sequence.
"Labeling" refers to the covalent or noncovalent joining of a polynucleotide, polypeptide, or antibody with a reporter molecule capable of producing a detectable or measurable signal. "Microarray" is any arrangement of nucleic acids, amino acids, antibodies, etc., on a substrate. The substrate may be a solid support such as beads, glass, paper, nitrocellulose, nylon, or an appropriate membrane.
"Linkers" are short stretches of nucleotide sequence which may be added to a vector or an sptm to create restriction endonuclease sites to facilitate cloning. "Polylinkers" are engineered to incoφorate multiple restriction enzyme sites and to provide for the use of enzymes which leave 5' or 3' overhangs (e.g., BamHI, EcoRI, and HindlH) and those which provide blunt ends (e.g., EcoRV, SnaBI, and Stul).
"Naturally occurring" refers to an endogenous polynucleotide or polypeptide that may be isolated from viruses or prokaryotic or eukaryotic. cells.
"Nucleic acid sequence" refers to the specific order of nucleotides joined by phosphodiester bonds in a linear, polymeric arrangement. Depending on the number of nucleotides, the nucleic acid sequence can be considered an oligomer, oligonucleotide, or polynucleotide. The nucleic acid can be DNA, RNA, or any nucleic acid analog, such as PNA, may be of genomic or synthetic origin, may be either double-stranded or single-stranded, and can represent either the sense or antisense (complementary) strand.
"Oligomer" refers to a nucleic acid sequence of at least about 6 nucleotides and as many as about 60 nucleotides, preferably about 15 to 40 nucleotides, and most preferably between about 20 and 30 nucleotides, that may be used in hybridization or amplification technologies. Oligomers may be used as, e.g., primers for PCR, and are usually chemically synthesized.
"Operably linked" refers to the situation in which a first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression of the coding sequence. Generally, operably linked DNA sequences may be in close proximity or contiguous and, where necessary to join two protein coding regions, in the same reading frame.
"Peptide nucleic acid" (PNA) refers to a DNA mimic in which nucleotide bases are attached to a pseudopeptide backbone to increase stability. PNAs, also designated antigene agents, can prevent gene expression by targeting complementary messenger RNA.
The phrases "percent identity" and "% identity", as applied to polynucleotide sequences, refer to the percentage of residue matches between at least two polynucleotide sequences aligned using a standardized algorithm. Such an algorithm may insert, in a standardized and reproducible way, gaps in the sequences being compared in order to optimize alignment between two sequences, and therefore achieve a more meaningful comparison of the two sequences. Percent identity between polynucleotide sequences may be determined using the default parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e sequence alignment program. This program is part of the LASERGENE software package, a suite of molecular biological analysis programs (DNASTAR, Madison WI). CLUSTAL V is described in Higgins, D.G. and Sharp, P.M. (1989) CABIOS 5:151-153 and in Higgins, D.G. et al. (1992) CABIOS 8:189-191. For pairwise alignments of polynucleotide sequences, the default parameters are set as follows: Ktuple=2, gap penalty=5, window=4, and "diagonals saved"=4. The "weighted" residue weight table is selected as the default. Percent identity is reported by CLUSTAL V as the "percent similarity" between aligned polynucleotide sequence pairs.
Alternatively, a suite of commonly used and freely available sequence comparison algorithms is provided by the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST) (Altschul, S.F. et al. (1990) J. Mol. Biol. 215:403-410), which is available from several sources, including the NCBI, Bethesda, MD, and on the Internet at http://www.ncbi.nlm.nih.gov/BLAST/. The BLAST software suite includes various sequence analysis programs including "blastn," that is used to determine alignment between a known polynucleotide sequence and other sequences on a variety of databases. Also available is a tool called "BLAST 2 Sequences" that is used for direct pairwise comparison of two nucleotide sequences.
"BLAST 2 Sequences" can be accessed and used interactively at http://www.ncbi.nlm.nih.gov/gorf/bl2/. The "BLAST 2 Sequences" tool can be used for both blastn and blastp (discussed below). BLAST o programs are commonly used with gap and other parameters set to default settings. For example, to compare two nucleotide sequences, one may use blastn with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999) set at default parameters. Such default parameters may be, for example: Matrix: BLOSUM62 Reward for match: I 5 Penalty for mismatch: -2
Open Gap: 5 and Extension Gap: 2 penalties Gap x drop-off: 50 Expect: 10 Word Size: 11 0 Filter: on
Percent identity may be measured over the length of an entire defined sequence, for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for example, over the length of a fragment taken from a larger, defined sequence, for instance, a fragment of at least 20, at least 30, at least 40, at least 50, at least 70, at least 100, or at least 200 contiguous nucleotides. Such lengths are exemplary only, and it is understood that any fragment length supported by the sequences shown herein, in figures or Sequence Listings, may be used to describe a length over which percentage identity may be measured. Nucleic acid sequences that do not show a high degree of identity may nevertheless encode similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes in nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid sequences that all encode substantially the same protein.
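Once two sequences have been aligned, percent identity is a count of matching residues, as the following minimal sketch shows. It takes two already-aligned strings of equal length with "-" marking gaps, and it divides by the number of columns in which neither sequence has a gap. That denominator is an assumption of this sketch; CLUSTAL V and BLAST each apply their own conventions, so values may differ from what those programs report. The function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of matching residues over the gap-free columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only the columns where both sequences contribute a residue.
    columns = [
        (x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap
    ]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Identity over a whole alignment, then over a shorter window of it.
full = percent_identity("ACGTACGT", "ACGTACGA")
window = percent_identity("ACGTACGT"[:4], "ACGTACGA"[:4])
print(full, window)
```

Slicing the aligned strings before the call, as in the second example, corresponds to measuring identity over a fragment rather than over the entire defined sequence.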
The phrases "percent identity" and "% identity", as applied to polypeptide sequences, refer to the percentage of residue matches between at least two polypeptide sequences aligned using a standardized algorithm. Methods of polypeptide sequence alignment are well-known. Some alignment methods take into account conservative amino acid substitutions. Such conservative substitutions, explained in more detail above, generally preserve the hydrophobicity and acidity of the substituted residue, thus preserving the structure (and therefore function) of the folded polypeptide. Percent identity between polypeptide sequences may be determined using the default parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e sequence alignment program (described and referenced above). For pairwise alignments of polypeptide sequences using CLUSTAL V, the default parameters are set as follows: Ktuple=1, gap penalty=3, window=5, and "diagonals saved"=5. The PAM250 matrix is selected as the default residue weight table. As with polynucleotide alignments, the percent identity is reported by CLUSTAL V as the "percent similarity" between aligned polypeptide sequence pairs.
Alternatively the NCBI BLAST software suite may be used. For example, for a pairwise comparison of two polypeptide sequences, one may use the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999) with blastp set at default parameters. Such default parameters may be, for example:
Matrix: BLOSUM62
Open Gap: 11 and Extension Gap: 1 penalty
Gap x drop-off: 50
Expect: 10
Word Size: 3
Filter: on
Percent identity may be measured over the length of an entire defined polypeptide sequence, for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for example, over the length of a fragment taken from a larger, defined polypeptide sequence, for instance, a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least 150 contiguous residues. Such lengths are exemplary only, and it is understood that any fragment length supported by the sequences shown herein, in figures or Sequence Listings, may be used to describe a length over which percentage identity may be measured.
"Post-translational modification" of an SPTM may involve lipidation, glycosylation, phosphorylation, acetylation, racemization, proteolytic cleavage, and other modifications known in the art. These processes may occur synthetically or biochemically. Biochemical modifications will vary by cell type depending on the enzymatic milieu and the SPTM.
"Probe" refers to sptm or fragments thereof, which are used to detect identical, allelic or related nucleic acid sequences. Probes are isolated oligonucleotides or polynucleotides attached to a detectable label or reporter molecule. Typical labels include radioactive isotopes, ligands, chemiluminescent agents, and enzymes. "Primers" are short nucleic acids, usually DNA oligonucleotides, which may be annealed to a target polynucleotide by complementary base-pairing. The primer may then be extended along the target DNA strand by a DNA polymerase enzyme. Primer pairs can be used for amplification (and identification) of a nucleic acid sequence, e.g., by the polymerase chain reaction (PCR).
Probes and primers as used in the present invention typically comprise at least 15 contiguous nucleotides of a known sequence. In order to enhance specificity, longer probes and primers may also be employed, such as probes and primers that comprise at least 20, 30, 40, 50, 60, 70, 80, 90, 100, or at least 150 consecutive nucleotides of the disclosed nucleic acid sequences. Probes and primers may be considerably longer than these examples, and it is understood that any length supported by the specification, including the figures and Sequence Listing, may be used.
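The length criterion above can be checked mechanically. The following is a minimal sketch, assuming plain string sequences on the same strand; the function name, the default threshold argument, and the example sequence are hypothetical and serve only to illustrate the "at least 15 contiguous nucleotides" test.

```python
def shares_contiguous_run(probe, target, min_len=15):
    """Return True if probe and target share an identical run of at least
    min_len contiguous nucleotides (same strand, exact match only)."""
    probe, target = probe.upper(), target.upper()
    if len(probe) < min_len:
        return False
    # Slide a window of min_len over the probe and look it up in the target.
    for i in range(len(probe) - min_len + 1):
        if probe[i:i + min_len] in target:
            return True
    return False


# Hypothetical 35-nucleotide "known sequence" used only for illustration.
known = "ACGTTGCAAGCTTGGATCCGATCGTAGCTAGGCTA"
print(shares_contiguous_run(known[5:25], known))   # 20-mer taken from the sequence
print(shares_contiguous_run(known[:10], known))    # 10-mer: shorter than the threshold
```

A longer threshold (for example, 20 or 30) can be passed as min_len when greater specificity is wanted. The sketch does not consider the reverse complement or mismatches, which a practical primer design tool would handle.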
Methods for preparing and using probes and primers are described in the references, for example Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual. 2nd ed., vol. 1-3, Cold Spring Harbor Press, Plainview NY; Ausubel et al., 1987, Current Protocols in Molecular Biology. Greene Publ. Assoc. & Wiley-Intersciences, New York NY; Innis et al., 1990, PCR Protocols. A Guide to Methods and Applications. Academic Press, San Diego CA. PCR primer pairs can be derived from a known sequence, for example, by using computer programs intended for that purpose such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge MA).
Oligonucleotides for use as primers are selected using software known in the art for such purpose. For example, OLIGO 4.06 software is useful for the selection of PCR primer pairs of up to 100 nucleotides each, and for the analysis of oligonucleotides and larger polynucleotides of up to 5,000 nucleotides from an input polynucleotide sequence of up to 32 kilobases. Similar primer selection programs have incorporated additional features for expanded capabilities. For example, the PrimOU primer selection program (available to the public from the Genome Center at University of Texas South West Medical Center, Dallas TX) is capable of choosing specific primers from megabase sequences and is thus useful for designing primers on a genome-wide scope. The Primer3 primer selection program (available to the public from the Whitehead Institute/MIT Center for Genome Research, Cambridge MA) allows the user to input a "mispriming library," in which sequences to avoid as primer binding sites are user-specified. Primer3 is useful, in particular, for the selection of oligonucleotides for microarrays. (The source code for the latter two primer selection programs may also be obtained from their respective sources and modified to meet the user's specific needs.)
The PrimeGen program (available to the public from the UK Human Genome Mapping Project Resource Centre, Cambridge UK) designs primers based on multiple sequence alignments, thereby allowing selection of primers that hybridize to either the most conserved or least conserved regions of aligned nucleic acid sequences. Hence, this program is useful for identification of both unique and conserved oligonucleotides and polynucleotide fragments. The oligonucleotides and polynucleotide fragments identified by any of the above selection methods are useful in hybridization technologies, for example, as PCR or sequencing primers, microarray elements, or specific probes to identify fully or partially complementary polynucleotides in a sample of nucleic acids. Methods of oligonucleotide selection are not limited to those described above.
"Purified" refers to molecules, either polynucleotides or polypeptides that are isolated or separated from their natural environment and are at least 60% free, preferably at least 75% free, and most preferably at least 90% free from other compounds with which they are naturally associated.
A "recombinant nucleic acid" is a sequence that is not naturally occurring or has a sequence that is made by an artificial combination of two or more otherwise separated segments of sequence. This artificial combination is often accomplished by chemical synthesis or, more commonly, by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques such as those described in Sambrook, supra. The term recombinant includes nucleic acids that have been altered solely by addition, substitution, or deletion of a portion of the nucleic acid. Frequently, a recombinant nucleic acid may include a nucleic acid sequence operably linked to a promoter sequence. Such a recombinant nucleic acid may be part of a vector that is used, for example, to transform a cell. Alternatively, such recombinant nucleic acids may be part of a viral vector, e.g., based on a vaccinia virus, that could be used to vaccinate a mammal wherein the recombinant nucleic acid is expressed, inducing a protective immunological response in the mammal.
"Regulatory element" refers to a nucleic acid sequence from nontranslated regions of a gene, and includes enhancers, promoters, introns, and 3' untranslated regions, which interact with host proteins to carry out or regulate transcription or translation.
"Reporter" molecules are chemical or biochemical moieties used for labeling a nucleic acid, an amino acid, or an antibody. They include radionuclides; enzymes; fluorescent, chemiluminescent, or chromogenic agents; substrates; cofactors; inhibitors; magnetic particles; and other moieties known in the art.
An "RNA equivalent," in reference to a DNA sequence, is composed of the same linear sequence of nucleotides as the reference DNA sequence with the exception that all occurrences of the nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose instead of deoxyribose.
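At the level of sequence text, the RNA equivalent defined above amounts to a single character substitution. The sketch below assumes the DNA sequence is held as a plain string; the function name is hypothetical, and the change of sugar backbone from deoxyribose to ribose is a chemical property that a string cannot represent.

```python
def rna_equivalent(dna):
    """Return the RNA equivalent of a DNA string: every thymine (T) is
    replaced with uracil (U); all other characters are left unchanged."""
    # Case is preserved so that lowercase bases stay lowercase.
    return dna.replace("T", "U").replace("t", "u")


print(rna_equivalent("ATGCTTN"))
```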
"Sample" is used in its broadest sense. Samples may contain nucleic or amino acids, antibodies, or other materials, and may be derived from any source (e.g., bodily fluids including, but not limited to, saliva, blood, and urine; chromosome(s), organelles, or membranes isolated from a cell; genomic DNA, RNA, or cDNA in solution or bound to a substrate; and cleared cells or tissues or blots or imprints from such cells or tissues).
"Specific binding" or "specifically binding" refers to the interaction between a protein or peptide and its agonist, antibody, antagonist, or other binding partner. The interaction is dependent upon the presence of a particular structure of the protein, e.g., the antigenic determinant or epitope, recognized by the binding molecule. For example, if an antibody is specific for epitope "A," the presence of a polypeptide containing epitope A, or the presence of free unlabeled A, in a reaction containing free labeled A and the antibody will reduce the amount of labeled A that binds to the antibody.
"Substitution" refers to the replacement of at least one nucleotide or amino acid by a different nucleotide or amino acid.
"Substrate" refers to any suitable rigid or semi-rigid support including, e.g., membranes, filters, chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers, microparticles or capillaries. The substrate can have a variety of surface forms, such as wells, trenches, pins, channels and pores, to which polynucleotides or polypeptides are bound. A "transcript image" refers to the collective pattern of gene expression by a particular tissue or cell type under given conditions at a given time.
"Transformation" refers to a process by which exogenous DNA enters a recipient cell. Transformation may occur under natural or artificial conditions using various methods well known in the art. Transformation may rely on any known method for the insertion of foreign nucleic acid sequences into a prokaryotic or eukaryotic host cell. The method is selected based on the host cell being transformed.
"Transformants" include stably transformed cells in which the inserted DNA is capable of replication either as an autonomously replicating plasmid or as part of the host chromosome, as well as cells which transiently express inserted DNA or RNA.
A "transgenic organism," as used herein, is any organism, including but not limited to animals and plants, in which one or more of the cells of the organism contains heterologous nucleic acid introduced by way of human intervention, such as by transgenic techniques well known in the art. The nucleic acid is introduced into the cell, directly or indirectly by introduction into a precursor of the cell, by way of deliberate genetic manipulation, such as by microinjection or by infection with a recombinant virus. The term genetic manipulation does not include classical cross-breeding, or in vitro fertilization, but rather is directed to the introduction of a recombinant DNA molecule. The transgenic organisms contemplated in accordance with the present invention include bacteria, cyanobacteria, fungi, and plants and animals. The isolated DNA of the present invention can be introduced into the host by methods known in the art, for example infection, transfection, transformation or transconjugation. Techniques for transferring the DNA of the present invention into such organisms are widely known and provided in references such as Sambrook et al. (1989), supra.
A "variant" of a particular nucleic acid sequence is defined as a nucleic acid sequence having at least 25% sequence identity to the particular nucleic acid sequence over a certain length of one of the nucleic acid sequences using blastn with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999) set at default parameters. Such a pair of nucleic acids may show, for example, at least 30%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence identity over a certain defined length. The variant may result in "conservative" amino acid changes which do not affect structural and/or chemical properties. A variant may be described as, for example, an "allelic" (as defined above), "splice," "species," or "polymorphic" variant. A splice variant may have significant identity to a reference molecule, but will generally have a greater or lesser number of polynucleotides due to alternate splicing of exons during mRNA processing. The corresponding polypeptide may possess additional functional domains or lack domains that are present in the reference molecule. Species variants are polynucleotide sequences that vary from one species to another. The resulting polypeptides generally will have significant amino acid identity relative to each other. A polymorphic variant is a variation in the polynucleotide sequence of a particular gene between individuals of a given species. Polymorphic variants also may encompass "single nucleotide polymorphisms" (SNPs) in which the polynucleotide sequence varies by one base. The presence of SNPs may be indicative of, for example, a certain population, a disease state, or a propensity for a disease state. In an alternative, variants of the polynucleotides of the present invention may be generated through recombinant methods.
One possible method is a DNA shuffling technique such as MOLECULARBREEDING (Maxygen Inc., Santa Clara CA; described in U.S. Patent Number 5,837,458; Chang, C.-C. et al. (1999) Nat. Biotechnol. 17:793-797; Christians, F.C. et al. (1999) Nat. Biotechnol. 17:259-264; and Crameri, A. et al. (1996) Nat. Biotechnol. 14:315-319) to alter or improve the biological properties of SPTM, such as its biological or enzymatic activity or its ability to bind to other molecules or compounds. DNA shuffling is a process by which a library of gene variants is produced using PCR-mediated recombination of gene fragments. The library is then subjected to selection or screening procedures that identify those gene variants with the desired properties. These preferred variants may then be pooled and further subjected to recursive rounds of DNA shuffling and selection/screening. Thus, genetic diversity is created through "artificial" breeding and rapid molecular evolution. For example, fragments of a single gene containing random point mutations may be recombined, screened, and then reshuffled until the desired properties are optimized. Alternatively, fragments of a given gene may be recombined with fragments of homologous genes in the same gene family, either from the same or different species, thereby maximizing the genetic diversity of multiple naturally occurring genes in a directed and controllable manner.
A "variant" of a particular polypeptide sequence is defined as a polypeptide sequence having at least 40% sequence identity to the particular polypeptide sequence over a certain length of one of the polypeptide sequences using blastp with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999) set at default parameters. Such a pair of polypeptides may show, for example, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence identity over a certain defined length of one of the polypeptides.
THE INVENTION
In a particular embodiment, cDNA sequences derived from human tissues and cell lines were aligned based on nucleotide sequence identity and assembled into "consensus" or "template" sequences which are designated by the template identification numbers (template IDs) in column 2 of Table 2. The sequence identification numbers (SEQ ID NO:s) corresponding to the template IDs are shown in column 1. Segments of the template sequences are defined by the "start" and "stop" nucleotide positions listed in columns 3 and 4. These segments, when translated in the reading frames indicated in column 5, have similarity to signal peptide (SP) or transmembrane (TM) domain consensus sequences, as indicated in column 6.
The invention incorporates the nucleic acid sequences of these templates as disclosed in the Sequence Listing and the use of these sequences in the diagnosis and treatment of disease states characterized by defects in cell signaling. The invention further utilizes these sequences in hybridization and amplification technologies, and in particular, in technologies which assess gene expression patterns correlated with specific cells or tissues and their responses in vivo or in vitro to pharmaceutical agents, toxins, and other treatments. In this manner, the sequences of the present invention are used to develop a transcript image for a particular cell or tissue.
Derivation of Nucleic Acid Sequences
cDNA was isolated from libraries constructed using RNA derived from normal and diseased human tissues and cell lines. The human tissues and cell lines used for cDNA library construction were selected from a broad range of sources to provide a diverse population of cDNAs representative of gene transcription throughout the human body. Descriptions of the human tissues and cell lines used for cDNA library construction are provided in the LIFESEQ database (Incyte Genomics, Inc. (Incyte), Palo Alto CA). Human tissues were broadly selected from, for example, cardiovascular, dermatologic, endocrine, gastrointestinal, hematopoietic/immune system, musculoskeletal, neural, reproductive, and urologic sources.
Cell lines used for cDNA library construction were derived from, for example, leukemic cells, teratocarcinomas, neuroepitheliomas, cervical carcinoma, lung fibroblasts, and endothelial cells. Such cell lines include, for example, THP-1, Jurkat, HUVEC, hNT2, WI38, HeLa, and other cell lines commonly used and available from public depositories (American Type Culture Collection, Manassas VA). Prior to mRNA isolation, cell lines were untreated, treated with a pharmaceutical agent such as 5'-aza-2'-deoxycytidine, treated with an activating agent such as lipopolysaccharide in the case of leukocytic cell lines, or, in the case of endothelial cell lines, subjected to shear stress.
Sequencing of the cDNAs
Methods for DNA sequencing are well known in the art. Conventional enzymatic methods employ the Klenow fragment of DNA polymerase I, SEQUENASE DNA polymerase (U.S. Biochemical Corporation, Cleveland OH), Taq polymerase (Applied Biosystems, Foster City CA), thermostable T7 polymerase (Amersham Pharmacia Biotech, Inc. (Amersham Pharmacia Biotech), Piscataway NJ), or combinations of polymerases and proofreading exonucleases such as those found in the ELONGASE amplification system (Life Technologies Inc. (Life Technologies), Gaithersburg MD), to extend the nucleic acid sequence from an oligonucleotide primer annealed to the DNA template of interest. Methods have been developed for the use of both single-stranded and double-stranded templates. Chain termination reaction products may be electrophoresed on urea-polyacrylamide gels and detected either by autoradiography (for radioisotope-labeled nucleotides) or by fluorescence (for fluorophore-labeled nucleotides). Automated methods for mechanized reaction preparation, sequencing, and analysis using fluorescence detection methods have been developed. Machines used to prepare cDNAs for sequencing can include the MICROLAB 2200 liquid transfer system (Hamilton Company (Hamilton), Reno NV), Peltier thermal cycler (PTC200; MJ Research, Inc. (MJ Research), Watertown MA), and ABI CATALYST 800 thermal cycler (Applied Biosystems). Sequencing can be carried out using, for example, the ABI 373 or 377 (Applied Biosystems) or MEGABACE 1000 (Molecular Dynamics, Inc. (Molecular Dynamics), Sunnyvale CA) DNA sequencing systems, or other automated and manual sequencing systems well known in the art.
The nucleotide sequences of the Sequence Listing have been prepared by current, state-of-the-art, automated methods and, as such, may contain occasional sequencing errors or unidentified nucleotides. Such unidentified nucleotides are designated by an N. These infrequent unidentified bases do not represent a hindrance to practicing the invention for those skilled in the art. Several methods employing standard recombinant techniques may be used to correct errors and complete the missing sequence information. (See, e.g., those described in Ausubel, F.M. et al. (1997) Short Protocols in Molecular Biology. John Wiley & Sons, New York NY; and Sambrook, J. et al. (1989) Molecular Cloning. A Laboratory Manual. Cold Spring Harbor Press, Plainview NY.)
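Unidentified nucleotides of this kind are straightforward to locate in a sequence held as text. The following is a minimal sketch; the function names and the example sequence are hypothetical, and positions are reported as 0-based, end-exclusive intervals purely as a convention of this illustration.

```python
import re


def unidentified_fraction(seq):
    """Return (count, fraction) of positions designated N in seq."""
    if not seq:
        raise ValueError("empty sequence")
    count = sum(1 for base in seq if base in "Nn")
    return count, count / len(seq)


def unidentified_runs(seq):
    """Return a list of (start, end) intervals covering each run of N."""
    return [(m.start(), m.end()) for m in re.finditer(r"[Nn]+", seq)]


# Hypothetical read with three unidentified bases.
read = "ACGTNNACGNA"
print(unidentified_fraction(read))
print(unidentified_runs(read))
```

The intervals returned by unidentified_runs indicate where resequencing or another corrective technique would be directed.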
Assembly of cDNA Sequences
Human polynucleotide sequences may be assembled using programs or algorithms well known in the art. Sequences to be assembled are related, wholly or in part, and may be derived from a single or many different transcripts. Assembly of the sequences can be performed using such programs as PHRAP (Phil's Revised Assembly Program) and the GELVIEW fragment assembly system (GCG), or other methods known in the art.
Alternatively, cDNA sequences are used as "component" sequences that are assembled into "template" or "consensus" sequences as follows. Sequence chromatograms are processed, verified, and quality scores are obtained using PHRED. Raw sequences are edited using an editing pathway known as Block 1 (See, e.g., the LIFESEQ Assembled User Guide, Incyte Genomics, Palo Alto, CA). A series of BLAST comparisons is performed and low-information segments and repetitive elements (e.g., dinucleotide repeats, Alu repeats, etc.) are replaced by "n's", or masked, to prevent spurious matches. Mitochondrial and ribosomal RNA sequences are also removed. The processed sequences are then loaded into a relational database management system (RDMS) which assigns edited sequences to existing templates, if available. When additional sequences are added into the RDMS, a process is initiated which modifies existing templates or creates new templates from works in progress (i.e., nonfinal assembled sequences) containing queued sequences or the sequences themselves. After the new sequences have been assigned to templates, the templates can be merged into bins. If multiple templates exist in one bin, the bin can be split and the templates reannotated.
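The masking step described above can be sketched for the simplest case, dinucleotide repeats. This is an illustrative stand-in only: the editing pathway identifies low-information segments through BLAST comparisons, whereas the sketch below uses a regular expression, and the minimum of five repeat units is a hypothetical threshold that does not come from the specification.

```python
import re

# A two-base unit followed by at least four more copies of itself
# (five or more units in total). The threshold is an assumption.
# Note that the pattern also matches single-base runs such as AAAAAAAAAA.
DINUC_REPEAT = re.compile(r"([ACGT]{2})\1{4,}", re.IGNORECASE)


def mask_dinucleotide_repeats(seq):
    """Replace each dinucleotide repeat with a run of 'n' of equal length,
    so that sequence coordinates are preserved."""
    return DINUC_REPEAT.sub(lambda m: "n" * len(m.group(0)), seq)


# Hypothetical read containing six copies of the unit CA.
read = "ACGT" + "CA" * 6 + "GGTC"
print(mask_dinucleotide_repeats(read))
```

Replacing the repeat with the same number of "n" characters, rather than deleting it, keeps downstream start and stop positions valid for the masked sequence.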
Once gene bins have been generated based upon sequence alignments, bins are "clone joined" based upon clone information. Clone joining occurs when the 5' sequence of one clone is present in one bin and the 3' sequence from the same clone is present in a different bin, indicating that the two bins should be merged into a single bin. Only bins which share at least two different clones are merged.
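The clone joining rule can be expressed as a short procedure. The sketch below is one possible reading of the rule under stated assumptions: each bin is represented as a set of (clone identifier, end) pairs, each clone end occurs in only one bin, and merges are applied transitively. The data layout, the function name, and the bin and clone identifiers are hypothetical.

```python
from collections import defaultdict


def clone_join(bins, min_shared_clones=2):
    """Merge bins linked by clones whose 5' and 3' reads lie in different bins.

    bins maps a bin id to a set of (clone_id, end) pairs, end being "5" or "3".
    Two bins are merged only if at least min_shared_clones different clones
    link them. Returns a sorted list of merged groups of bin ids.
    """
    # Record which bin holds each end of each clone.
    location = defaultdict(dict)
    for bin_id, reads in bins.items():
        for clone_id, end in reads:
            location[clone_id][end] = bin_id

    # Collect the clones that link each pair of distinct bins.
    links = defaultdict(set)
    for clone_id, ends in location.items():
        if "5" in ends and "3" in ends and ends["5"] != ends["3"]:
            links[frozenset((ends["5"], ends["3"]))].add(clone_id)

    # Union-find over bin ids.
    parent = {bin_id: bin_id for bin_id in bins}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for pair, clones in links.items():
        if len(clones) >= min_shared_clones:
            a, b = tuple(pair)
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for bin_id in bins:
        groups[find(bin_id)].add(bin_id)
    return sorted(sorted(group) for group in groups.values())


example_bins = {
    "bin1": {("c1", "5"), ("c2", "5"), ("c3", "5")},
    "bin2": {("c1", "3"), ("c2", "3")},
    "bin3": {("c3", "3")},
}
print(clone_join(example_bins))
```

In this example bin1 and bin2 are linked by two clones and are merged, while bin3 is linked to bin1 by a single clone and remains separate.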
A resultant template sequence may contain either a partial or a full length open reading frame, or all or part of a genetic regulatory element. This variation is due in part to the fact that the full length cDNAs of many genes are several hundred, and sometimes several thousand, bases in length. With current technology, cDNAs comprising the coding regions of large genes cannot be cloned because of vector limitations, incomplete reverse transcription of the mRNA, or incomplete "second strand" synthesis. Template sequences may be extended to include additional contiguous sequences derived from the parent RNA transcript using a variety of methods known to those of skill in the art. Extension may thus be used to achieve the full length coding sequence of a gene.
Analysis of the cDNA Sequences
The cDNA sequences are analyzed using a variety of programs and algorithms which are well known in the art. (See, e.g., Ausubel, 1997, supra. Chapter 7.7; Meyers, R.A. (Ed.) (1995) Molecular Biology and Biotechnology. Wiley VCH, New York NY, pp. 856-853; and Table 6.) These analyses comprise both reading frame determinations, e.g., based on triplet codon periodicity for particular organisms (Fickett, J.W. (1982) Nucleic Acids Res. 10:5303-5318); analyses of potential start and stop codons; and homology searches. Computer programs known to those of skill in the art for performing computer-assisted searches for amino acid and nucleic acid sequence similarity, include, for example, Basic Local Alignment Search Tool (BLAST; Altschul, S.F. (1993) J. Mol. Evol. 36:290-300; Altschul, S.F. et al. (1990) J. Mol. Biol. 215:403-410). BLAST is especially useful in determining exact matches and comparing two sequence fragments of arbitrary but equal lengths, whose alignment is locally maximal and for which the alignment score meets or exceeds a threshold or cutoff score set by the user (Karlin, S. et al. (1988) Proc. Natl. Acad. Sci. USA 85:841-845). Using an appropriate search tool (e.g., BLAST or HMM), GenBank, SwissProt, BLOCKS, PFAM and other databases may be searched for sequences containing regions of homology to a query sptm or SPTM of the present invention.
Other approaches to the identification, assembly, storage, and display of nucleotide and polypeptide sequences are provided in "Relational Database for Storing Biomolecule Information," U.S.S.N. 08/947,845, filed October 9, 1997; "Project-Based Full-Length Biomolecular Sequence Database," U.S.S.N. 08/811,758, filed March 6, 1997; and "Relational Database and System for Storing Information Relating to Biomolecular Sequences," U.S.S.N. 09/034,807, filed March 4, 1998, all of which are incorporated by reference herein in their entirety.
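The analysis of potential start and stop codons mentioned above can be illustrated with a simple forward-strand scan. This sketch is not the codon periodicity method cited above, and it is not a description of any particular program; the function name, the minimum length parameter, and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(seq, min_codons=2):
    """Scan the three forward reading frames for ATG...stop stretches.

    Returns (frame, start, end) tuples, where start is the 0-based position
    of the ATG and end is the position just past the stop codon. min_codons
    is the minimum number of codons before the stop, counting the ATG.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i          # first in-frame start codon
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None       # resume the search after the stop
    return orfs


# Hypothetical 14-nucleotide sequence with one short open reading frame.
example = "CCATGGCTTGATTT"
print(find_orfs(example))
```

A fuller analysis would also scan the reverse complement and report partial open reading frames that run off either end of a template, which the text notes may occur.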
Protein hierarchies can be assigned to the putative encoded polypeptide based on, e.g., motif, BLAST, or biological analysis. Methods for assigning these hierarchies are described, for example, in "Database System Employing Protein Function Hierarchies for Viewing Biomolecular Sequence Data," U.S.S.N. 08/812,290, filed March 6, 1997, incorporated herein by reference.
Human Secretory Sequences
The sptm of the present invention may be used for a variety of diagnostic and therapeutic purposes. For example, an sptm may be used to diagnose a particular condition, disease, or disorder associated with cell signaling. Such conditions, diseases, and disorders include, but are not limited to, a cell proliferative disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, a cancer of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus; an immune system disorder such as inflammation, actinic keratosis, acquired immunodeficiency syndrome (AIDS), Addison's disease, adult respiratory distress syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, arteriosclerosis, asthma, atherosclerosis, autoimmune hemolytic anemia, autoimmune thyroiditis, bronchitis, bursitis, cholecystitis, cirrhosis, contact dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema, erythroblastosis fetalis, erythema nodosum, atrophic gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis, paroxysmal nocturnal hemoglobinuria, hepatitis, hypereosinophilia, irritable bowel syndrome, episodic lymphopenia with lymphocytotoxins, mixed connective tissue disease (MCTD), multiple sclerosis, myasthenia gravis, myocardial or pericardial inflammation, myelofibrosis, osteoarthritis, osteoporosis, pancreatitis, polycythemia vera, polymyositis, psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjogren's syndrome, systemic anaphylaxis, systemic lupus erythematosus, systemic sclerosis, primary thrombocythemia, thrombocytopenic purpura, ulcerative colitis, uveitis, Werner syndrome, complications of cancer, hemodialysis, and extracorporeal circulation, trauma, and hematopoietic cancer including lymphoma, leukemia, and myeloma; and a neurological disorder such as epilepsy, ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's disease, Pick's disease, Huntington's disease, dementia, Parkinson's disease and other extrapyramidal disorders, amyotrophic lateral sclerosis and other motor neuron disorders, progressive neural muscular atrophy, retinitis pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating diseases, bacterial and viral meningitis, brain abscess, subdural empyema, epidural abscess, suppurative intracranial thrombophlebitis, myelitis and radiculitis, viral central nervous system disease, prion diseases including kuru, Creutzfeldt-Jakob disease, and Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia, nutritional and metabolic diseases of the nervous system, neurofibromatosis, tuberous sclerosis, cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental retardation and other developmental disorder of the central nervous system, cerebral palsy, a neuroskeletal disorder, an autonomic nervous system disorder, a cranial nerve disorder, a spinal cord disease, muscular dystrophy and other neuromuscular disorder, a peripheral nervous system disorder, dermatomyositis and polymyositis, inherited, metabolic, endocrine, and toxic myopathy, myasthenia gravis, periodic paralysis, a mental disorder including mood, anxiety, and schizophrenic disorder, seasonal affective disorder (SAD), akathesia, amnesia, catatonia, diabetic neuropathy, tardive dyskinesia, dystonias, paranoid psychoses, postherpetic neuralgia, and Tourette's disorder.
The sptm can be used to detect the presence of, or to quantify the amount of, an sptm-related polynucleotide in a sample. This information is then compared to information obtained from appropriate reference samples, and a diagnosis is established. Alternatively, a polynucleotide complementary to a given sptm can inhibit or inactivate a therapeutically relevant gene related to the sptm.
Analysis of sptm Expression Patterns
The expression of sptm may be routinely assessed by hybridization-based methods to determine, for example, the tissue-specificity, disease-specificity, or developmental stage-specificity of sptm expression. For example, the level of expression of sptm may be compared among different cell types or tissues, among diseased and normal cell types or tissues, among cell types or tissues at different developmental stages, or among cell types or tissues undergoing various treatments. This type of analysis is useful, for example, to assess the relative levels of sptm expression in fully or partially differentiated cells or tissues, to determine if changes in sptm expression levels are correlated with the development or progression of specific disease states, and to assess the response of a cell or tissue to a specific therapy, for example, in pharmacological or toxicological studies. Methods for the analysis of sptm expression are based on hybridization and amplification technologies and include membrane-based procedures such as northern blot analysis, high-throughput procedures that utilize, for example, microarrays, and PCR-based procedures.
Hybridization and Genetic Analysis
The sptm, their fragments, or complementary sequences, may be used to identify the presence of and/or to determine the degree of similarity between two (or more) nucleic acid sequences. The sptm may be hybridized to naturally occurring or recombinant nucleic acid sequences under appropriately selected temperatures and salt concentrations. Hybridization with a probe based on the nucleic acid sequence of at least one of the sptm allows for the detection of nucleic acid sequences, including genomic sequences, which are identical or related to the sptm of the Sequence Listing. Probes may be selected from non-conserved or unique regions of at least one of the polynucleotides of SEQ ID NO:1-184 and tested for their ability to identify or amplify the target nucleic acid sequence using standard protocols.
Polynucleotide sequences that are capable of hybridizing, in particular, to those shown in SEQ DD NO: 1-184 and fragments thereof, can be identified using various conditions of stringency. (See, e.g., Wahl, G.M. and S.L. Berger (1987) Methods Enzymol. 152:399-407; Kimmel, A.R. (1987) Methods Enzymol. 152:507-511.) Hybridization conditions are discussed in "Definitions."
A probe for use in Southern or northern hybridization may be derived from a fragment of an sptm sequence, or its complement, that is up to several hundred nucleotides in length and is either single-stranded or double-stranded. Such probes may be hybridized in solution to biological materials such as plasmids, bacterial, yeast, or human artificial chromosomes, cleared or sectioned tissues, or to artificial substrates containing sptm. Microarrays are particularly suitable for identifying the presence of and detecting the level of expression for multiple genes of interest by examining gene expression correlated with, e.g., various stages of development, treatment with a drug or compound, or disease progression. An array analogous to a dot or slot blot may be used to arrange and link polynucleotides to the surface of a substrate using one or more of the following: mechanical (vacuum), chemical, thermal, or UV bonding procedures. Such an array may contain any number of sptm and may be produced by hand or by using available devices, materials, and machines.
Microarrays may be prepared, used, and analyzed using methods known in the art. (See, e.g., Brennan, T.M. et al. (1995) U.S. Patent No. 5,474,796; Schena, M. et al. (1996) Proc. Natl. Acad. Sci. USA 93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/251116; Shalon, D. et al. (1995) PCT application WO95/35505; Heller, R.A. et al. (1997) Proc. Natl. Acad. Sci. USA 94:2150-2155; and Heller, M.J. et al. (1997) U.S. Patent No. 5,605,662.)
Probes may be labeled by either PCR or enzymatic techniques using a variety of commercially available reporter molecules. For example, commercial kits are available for radioactive and chemiluminescent labeling (Amersham Pharmacia Biotech) and for alkaline phosphatase labeling (Life Technologies). Alternatively, sptm may be cloned into commercially available vectors for the production of RNA probes. Such probes may be transcribed in the presence of at least one labeled nucleotide (e.g., 32P-ATP, Amersham Pharmacia Biotech).
Additionally, the polynucleotides of SEQ ID NO:1-184 or suitable fragments thereof can be used to isolate full length cDNA sequences utilizing hybridization and/or amplification procedures well known in the art, e.g., cDNA library screening, PCR amplification, etc. The molecular cloning of such full length cDNA sequences may employ the method of cDNA library screening with probes using the hybridization, stringency, washing, and probing strategies described above and in Ausubel, supra, Chapters 3, 5, and 6. These procedures may also be employed with genomic libraries to isolate genomic sequences of sptm in order to analyze, e.g., regulatory elements.
Genetic Mapping
Gene identification and mapping are important in the investigation and treatment of almost all conditions, diseases, and disorders. Cancer, cardiovascular disease, Alzheimer's disease, arthritis, diabetes, and mental illnesses are of particular interest. Each of these conditions is more complex than the single gene defects of sickle cell anemia or cystic fibrosis, with select groups of genes being predictive of predisposition for a particular condition, disease, or disorder. For example, cardiovascular disease may result from malfunctioning receptor molecules that fail to clear cholesterol from the bloodstream, and diabetes may result when a particular individual's immune system is activated by an infection and attacks the insulin-producing cells of the pancreas. In some studies, Alzheimer's disease has been linked to a gene on chromosome 21; other studies predict a different gene and location. Mapping of disease genes is a complex and reiterative process and generally proceeds from genetic linkage analysis to physical mapping.
As a condition is noted among members of a family, a genetic linkage map traces parts of chromosomes that are inherited in the same pattern as the condition. Statistics link the inheritance of particular conditions to particular regions of chromosomes, as defined by RFLP or other markers. (See, for example, Lander, E. S. and Botstein, D. (1986) Proc. Natl. Acad. Sci. USA 83:7353-7357.) Occasionally, genetic markers and their locations are known from previous studies. More often, however, the markers are simply stretches of DNA that differ among individuals. Examples of genetic linkage maps can be found in various scientific journals or at the Online Mendelian Inheritance in Man (OMIM) World Wide Web site.
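As a rough illustration of the statistics involved in linkage analysis, a LOD (logarithm of odds) score compares the likelihood of an observed inheritance pattern at a given recombination fraction with its likelihood under free recombination. The sketch below assumes phase-known meioses that can each be scored as recombinant or non-recombinant. The function name and the example counts are hypothetical and are not taken from this disclosure.

```python
import math

def lod_score(recombinants, meioses, theta):
    """LOD score for linkage between a marker and a condition.

    Assumes every meiosis is phase-known and scored independently
    (an illustrative simplification).
    """
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be greater than 0 and at most 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    non_recombinants = meioses - recombinants
    # log10 likelihood of the data at recombination fraction theta
    linked = (recombinants * math.log10(theta)
              + non_recombinants * math.log10(1 - theta))
    # log10 likelihood of the data with no linkage (theta = 0.5)
    unlinked = meioses * math.log10(0.5)
    return linked - unlinked

# Hypothetical family data: 1 recombinant among 10 informative meioses.
print(round(lod_score(1, 10, 0.1), 3))
```

A positive score favors linkage at the tested recombination fraction, and a score of zero is obtained when the tested fraction is 0.5.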
In another embodiment of the invention, sptm sequences may be used to generate hybridization probes useful in chromosomal mapping of naturally occurring genomic sequences. Either coding or noncoding sequences of sptm may be used, and in some instances, noncoding sequences may be preferable over coding sequences. For example, conservation of an sptm coding sequence among members of a multi-gene family may potentially cause undesired cross hybridization during chromosomal mapping. The sequences may be mapped to a particular chromosome, to a specific region of a chromosome, or to artificial chromosome constructions, e.g., human artificial chromosomes (HACs), yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1 constructions, or single chromosome cDNA libraries. (See, e.g., Harrington, J.J. et al. (1997) Nat. Genet. 15:345-355; Price, C.M. (1993) Blood Rev. 7:127-134; and Trask, B.J. (1991) Trends Genet. 7:149-154.) Fluorescent in situ hybridization (FISH) may be correlated with other physical chromosome mapping techniques and genetic map data. (See, e.g., Meyers, supra, pp. 965-968.) Correlation between the location of sptm on a physical chromosomal map and a specific disorder, or a predisposition to a specific disorder, may help define the region of DNA associated with that disorder. The sptm sequences may also be used to detect polymorphisms that are genetically linked to the inheritance of a particular condition, disease, or disorder.
In situ hybridization of chromosomal preparations and genetic mapping techniques, such as linkage analysis using established chromosomal markers, may be used for extending existing genetic maps. Often the placement of a gene on the chromosome of another mammalian species, such as mouse, may reveal associated markers even if the number or arm of the corresponding human chromosome is not known. These new marker sequences can be mapped to human chromosomes and may provide valuable information to investigators searching for disease genes using positional cloning or other gene discovery techniques. Once a disease or syndrome has been crudely correlated by genetic linkage with a particular genomic region, e.g., ataxia-telangiectasia to 11q22-23, any sequences mapping to that area may represent associated or regulatory genes for further investigation. (See, e.g., Gatti, R.A. et al. (1988) Nature 336:577-580.) The nucleotide sequences of the subject invention may also be used to detect differences in chromosomal architecture due to translocation, inversion, etc., among normal, carrier, or affected individuals. Once a disease-associated gene is mapped to a chromosomal region, the gene must be cloned in order to identify mutations or other alterations (e.g., translocations or inversions) that may be correlated with disease. This process requires a physical map of the chromosomal region containing the disease-gene of interest along with associated markers. A physical map is necessary for determining the nucleotide sequence of and order of marker genes on a particular chromosomal region. Physical mapping techniques are well known in the art and require the generation of overlapping sets of cloned DNA fragments from a particular organelle, chromosome, or genome. These clones are analyzed to reconstruct and catalog their order.
Once the position of a marker is determined, the DNA from that region is obtained by consulting the catalog and selecting clones from that region. The gene of interest is located through positional cloning techniques using hybridization or similar methods.
Diagnostic Uses
The sptm of the present invention may be used to design probes useful in diagnostic assays. Such assays, well known to those skilled in the art, may be used to detect or confirm conditions, disorders, or diseases associated with abnormal levels of sptm expression. Labeled probes developed from sptm sequences are added to a sample under hybridizing conditions of desired stringency. In some instances, sptm, or fragments or oligonucleotides derived from sptm, may be used as primers in amplification steps prior to hybridization. The amount of hybridization complex formed is quantified and compared with standards for that cell or tissue. If sptm expression varies significantly from the standard, the assay indicates the presence of the condition, disorder, or disease. Qualitative or quantitative diagnostic methods may include northern, dot blot, or other membrane or dip-stick based technologies or multiple-sample format technologies such as PCR, enzyme-linked immunosorbent assay (ELISA)-like, pin, or chip-based assays.
The probes described above may also be used to monitor the progress of conditions, disorders, or diseases associated with abnormal levels of sptm expression, or to evaluate the efficacy of a particular therapeutic treatment. The candidate probe may be identified from the sptm that are specific to a given human tissue and have not been observed in GenBank or other genome databases. Such a probe may be used in animal studies, preclinical tests, clinical trials, or in monitoring the treatment of an individual patient. In a typical process, standard expression is established by methods well known in the art for use as a basis of comparison, samples from patients affected by the disorder or disease are combined with the probe to evaluate any deviation from the standard profile, and a therapeutic agent is administered and effects are monitored to generate a treatment profile. Efficacy is evaluated by determining whether the expression progresses toward or returns to the standard normal pattern. Treatment profiles may be generated over a period of several days or several months. Statistical methods well known to those skilled in the art may be used to determine the significance of such therapeutic agents.
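The efficacy evaluation described above, in which a treatment profile is checked for movement toward the standard pattern, can be sketched as follows. This is a minimal illustration only: Euclidean distance is an assumed measure of deviation, and the function names, gene labels, and expression values are hypothetical.

```python
import math

def distance_to_standard(profile, standard):
    """Euclidean distance between a sample profile and the standard profile.

    Both arguments map a gene label to an expression level.
    """
    return math.sqrt(sum((profile[g] - standard[g]) ** 2 for g in standard))

def progresses_toward_standard(treatment_profiles, standard):
    """True if successive profiles never move away from the standard
    and the last profile is closer to it than the first."""
    d = [distance_to_standard(p, standard) for p in treatment_profiles]
    never_worse = all(later <= earlier for earlier, later in zip(d, d[1:]))
    return never_worse and d[-1] < d[0]

# Hypothetical standard and three successive samples from one patient.
standard = {"gene_a": 1.0, "gene_b": 2.0}
samples = [
    {"gene_a": 3.0, "gene_b": 2.0},
    {"gene_a": 2.0, "gene_b": 2.0},
    {"gene_a": 1.2, "gene_b": 2.0},
]
print(progresses_toward_standard(samples, standard))
```

In practice the comparison would be accompanied by a test of statistical significance, which this sketch does not attempt.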
The polynucleotides are also useful for identifying individuals from minute biological samples, for example, by matching the RFLP pattern of a sample's DNA to that of an individual's DNA. The polynucleotides of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. These sequences can be used to prepare PCR primers for amplifying and isolating such selected DNA, which can then be sequenced. Using this technique, an individual can be identified through a unique set of DNA sequences. Once a unique ID database is established for an individual, positive identification of that individual can be made from extremely small tissue samples.
In a particular aspect, oligonucleotide primers derived from the sptm of the invention may be used to detect single nucleotide polymorphisms (SNPs). SNPs are substitutions, insertions and deletions that are a frequent cause of inherited or acquired genetic disease in humans. Methods of SNP detection include, but are not limited to, single-stranded conformation polymorphism (SSCP) and fluorescent SSCP (fSSCP) methods. In SSCP, oligonucleotide primers derived from sptm are used to amplify DNA using the polymerase chain reaction (PCR). The DNA may be derived, for example, from diseased or normal tissue, biopsy samples, bodily fluids, and the like. SNPs in the DNA cause differences in the secondary and tertiary structures of PCR products in single-stranded form, and these differences are detectable using gel electrophoresis in non-denaturing gels. In fSSCP, the oligonucleotide primers are fluorescently labeled, which allows detection of the amplimers in high-throughput equipment such as DNA sequencing machines. Additionally, sequence database analysis methods, termed in silico SNP (isSNP), are capable of identifying polymorphisms by comparing the sequences of individual overlapping DNA fragments which assemble into a common consensus sequence. These computer-based methods filter out sequence variations due to laboratory preparation of DNA and sequencing errors using statistical models and automated analyses of DNA sequence chromatograms. In the alternative, SNPs may be detected and characterized by mass spectrometry using, for example, the high throughput MASSARRAY system (Sequenom, Inc., San Diego CA). DNA-based identification techniques are critical in forensic technology. DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, semen, etc., can be amplified using, e.g., PCR, to identify individuals. (See, e.g., Erlich, H. (1992) PCR Technology. Freeman and Co., New York, NY).
Similarly, polynucleotides of the present invention can be used as polymorphic markers.
There is also a need for reagents capable of identifying the source of a particular tissue. Appropriate reagents can comprise, for example, DNA probes or primers prepared from the sequences of the present invention that are specific for particular tissues. Panels of such reagents can identify tissue by species and/or by organ type. In a similar fashion, these reagents can be used to screen tissue cultures for contamination.
The polynucleotides of the present invention can also be used as molecular weight markers on nucleic acid gels or Southern blots, as diagnostic probes for the presence of a specific mRNA in a particular cell type, in the creation of subtracted cDNA libraries which aid in the discovery of novel polynucleotides, in selection and synthesis of oligomers for attachment to an array or other support, and as an antigen to elicit an immune response.
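The in silico SNP comparison described above can be illustrated with a simplified sketch. It assumes that the overlapping fragments have already been aligned to a common coordinate system, and it substitutes a plain read-count threshold for the statistical models and chromatogram analysis mentioned in the text. The function name, the threshold, and the example reads are hypothetical.

```python
from collections import Counter

def candidate_snps(aligned_reads, min_count=2):
    """Return {position: {base: count}} for aligned columns in which at
    least two different bases are each seen in min_count or more reads.

    Bases seen fewer than min_count times are treated as likely
    sequencing errors and ignored (an illustrative filter only).
    """
    snps = {}
    length = max(len(read) for read in aligned_reads)
    for pos in range(length):
        column = Counter(
            read[pos] for read in aligned_reads
            if pos < len(read) and read[pos] != "-"  # "-" marks an alignment gap
        )
        supported = {base: n for base, n in column.items() if n >= min_count}
        if len(supported) >= 2:
            snps[pos] = supported
    return snps

# Hypothetical aligned fragments; position 2 varies in several reads,
# while the single C at position 4 is treated as an error.
reads = ["ACGTA", "ACGTA", "ACTTA", "ACTTA", "ACGTC"]
print(candidate_snps(reads))
```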
Disease Model Systems Using sptm
The polynucleotides encoding SPTM or their mammalian homologs may be "knocked out" in an animal model system using homologous recombination in embryonic stem (ES) cells. Such techniques are well known in the art and are useful for the generation of animal models of human disease. (See, e.g., U.S. Patent Number 5,175,383 and U.S. Patent Number 5,767,337.) For example, mouse ES cells, such as the mouse 129/SvJ cell line, are derived from the early mouse embryo and grown in culture. The ES cells are transformed with a vector containing the gene of interest disrupted by a marker gene, e.g., the neomycin phosphotransferase gene (neo; Capecchi, M.R. (1989) Science 244:1288-1292). The vector integrates into the corresponding region of the host genome by homologous recombination. Alternatively, homologous recombination takes place using the Cre-loxP system to knockout a gene of interest in a tissue- or developmental stage-specific manner (Marth, J.D. (1996) Clin. Invest. 97:1999-2002; Wagner, K.U. et al. (1997) Nucleic Acids Res. 25:4323-4330). Transformed ES cells are identified and microinjected into mouse cell blastocysts such as those from the C57BL/6 mouse strain. The blastocysts are surgically transferred to pseudopregnant dams, and the resulting chimeric progeny are genotyped and bred to produce heterozygous or homozygous strains. Transgenic animals thus generated may be tested with potential therapeutic or toxic agents. The polynucleotides encoding SPTM may also be manipulated in vitro in ES cells derived from human blastocysts. Human ES cells have the potential to differentiate into at least eight separate cell lineages including endoderm, mesoderm, and ectodermal cell types. These cell lineages differentiate into, for example, neural cells, hematopoietic lineages, and cardiomyocytes (Thomson, J.A. et al. (1998) Science 282:1145-1147).
The polynucleotides encoding SPTM of the invention can also be used to create "knockin" humanized animals (pigs) or transgenic animals (mice or rats) to model human disease. With knockin technology, a region of sptm is injected into animal ES cells, and the injected sequence integrates into the animal cell genome. Transformed cells are injected into blastulae, and the blastulae are implanted as described above. Transgenic progeny or inbred lines are studied and treated with potential pharmaceutical agents to obtain information on treatment of a human disease. Alternatively, a mammal inbred to overexpress sptm, resulting, e.g., in the secretion of SPTM in its milk, may also serve as a convenient source of that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev. 4:55-74).
Screening Assays
SPTM encoded by polynucleotides of the present invention may be used to screen for molecules that bind to or are bound by the encoded polypeptides. The binding of the polypeptide and the molecule may activate (agonist), increase, inhibit (antagonist), or decrease activity of the polypeptide or the bound molecule. Examples of such molecules include antibodies, oligonucleotides, proteins (e.g., receptors), or small molecules. Preferably, the molecule is closely related to the natural ligand of the polypeptide, e.g., a ligand or fragment thereof, a natural substrate, or a structural or functional mimetic. (See, Coligan et al., (1991) Current Protocols in Immunology 1(2): Chapter 5.) Similarly, the molecule can be closely related to the natural receptor to which the polypeptide binds, or to at least a fragment of the receptor, e.g., the active site. In either case, the molecule can be rationally designed using known techniques.
Preferably, the screening for these molecules involves producing appropriate cells which express the polypeptide, either as a secreted protein or on the cell membrane. Preferred cells include cells from mammals, yeast, Drosophila, or E. coli. Cells expressing the polypeptide or cell membrane fractions which contain the expressed polypeptide are then contacted with a test compound and binding, stimulation, or inhibition of activity of either the polypeptide or the molecule is analyzed.
An assay may simply test binding of a candidate compound to the polypeptide, wherein binding is detected by a fluorophore, radioisotope, enzyme conjugate, or other detectable label. Alternatively, the assay may assess binding in the presence of a labeled competitor. Additionally, the assay can be carried out using cell-free preparations, polypeptide/molecule affixed to a solid support, chemical libraries, or natural product mixtures. The assay may also simply comprise the steps of mixing a candidate compound with a solution containing a polypeptide, measuring polypeptide/molecule activity or binding, and comparing the polypeptide/molecule activity or binding to a standard. Preferably, an ELISA assay using, e.g., a monoclonal or polyclonal antibody, can measure polypeptide level in a sample. The antibody can measure polypeptide level by either binding, directly or indirectly, to the polypeptide or by competing with the polypeptide for a substrate.
All of the above assays can be used in a diagnostic or prognostic context. The molecules discovered using these assays can be used to treat disease or to bring about a particular result in a patient (e.g., blood vessel growth) by activating or inhibiting the polypeptide/molecule. Moreover, the assays can discover agents which may inhibit or enhance the production of the polypeptide from suitably manipulated cells or tissues.
Transcript Imaging and Toxicological Testing
Another embodiment relates to the use of sptm to develop a transcript image of a tissue or cell type. A transcript image represents the global pattern of gene expression by a particular tissue or cell type. Global gene expression patterns are analyzed by quantifying the number of expressed genes and their relative abundance under given conditions and at a given time. (See Seilhamer et al., "Comparative Gene Transcript Analysis," U.S. Patent Number 5,840,484, expressly incorporated by reference herein.) Thus a transcript image may be generated by hybridizing the polynucleotides of the present invention or their complements to the totality of transcripts or reverse transcripts of a particular tissue or cell type. In one embodiment, the hybridization takes place in high-throughput format, wherein the polynucleotides of the present invention or their complements comprise a subset of a plurality of elements on a microarray. The resultant transcript image would provide a profile of gene activity pertaining to cell signaling.
Transcript images which profile sptm expression may be generated using transcripts isolated from tissues, cell lines, biopsies, or other biological samples. The transcript image may thus reflect sptm expression in vivo, as in the case of a tissue or biopsy sample, or in vitro, as in the case of a cell line.
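A transcript image of the kind described above can be reduced to a count of expressed genes together with their relative abundances. The following is a minimal sketch, assuming that a per-gene transcript count or signal has already been measured for the sample. The function name, gene labels, and counts are invented for illustration.

```python
def transcript_image(counts):
    """Summarize per-gene transcript counts for one sample.

    Returns the number of expressed genes and each expressed gene's
    share of the total transcript count.
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("no transcripts detected in sample")
    expressed = {gene: n for gene, n in counts.items() if n > 0}
    return {
        "expressed_genes": len(expressed),
        "relative_abundance": {gene: n / total for gene, n in expressed.items()},
    }

# Hypothetical counts for three array elements in one tissue sample.
image = transcript_image({"sptm_a": 6, "sptm_b": 2, "sptm_c": 0})
print(image["expressed_genes"], image["relative_abundance"])
```

Images computed this way for different tissues or conditions can then be compared element by element.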
Transcript images which profile sptm expression may also be used in conjunction with in vitro model systems and preclinical evaluation of pharmaceuticals, as well as toxicological testing of industrial and naturally-occurring environmental compounds. All compounds induce characteristic gene expression patterns, frequently termed molecular fingerprints or toxicant signatures, which are indicative of mechanisms of action and toxicity (Nuwaysir, E. F. et al. (1999) Mol. Carcinog. 24:153-159; Steiner, S. and Anderson, N. L. (2000) Toxicol. Lett. 112-113:467-71, expressly incorporated by reference herein). If a test compound has a signature similar to that of a compound with known toxicity, it is likely to share those toxic properties. These fingerprints or signatures are most useful and refined when they contain expression information from a large number of genes and gene families. Ideally, a genome-wide measurement of expression provides the highest quality signature. Even genes whose expression is not altered by any tested compounds are important as well, as the levels of expression of these genes are used to normalize the rest of the expression data. The normalization procedure is useful for comparison of expression data after treatment with different compounds. While the assignment of gene function to elements of a toxicant signature aids in interpretation of toxicity mechanisms, knowledge of gene function is not necessary for the statistical matching of signatures which leads to prediction of toxicity. (See, for example, Press Release 00-02 from the National Institute of Environmental Health Sciences, released February 29, 2000, available at http://www.niehs.nih.gov/oc/news/toxchip.htm.) Therefore, it is important and desirable in toxicological screening using toxicant signatures to include all expressed gene sequences.
In one embodiment, the toxicity of a test compound is assessed by treating a biological sample containing nucleic acids with the test compound. Nucleic acids that are expressed in the treated biological sample are hybridized with one or more probes specific to the polynucleotides of the present invention, so that transcript levels corresponding to the polynucleotides of the present invention may be quantified. The transcript levels in the treated biological sample are compared with levels in an untreated biological sample. Differences in the transcript levels between the two samples are indicative of a toxic response caused by the test compound in the treated sample.
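The normalization and signature-matching steps described above can be sketched as follows. This is an illustration under stated assumptions: expression data are scaled by the mean level of genes presumed unaffected by treatment, and Pearson correlation stands in for whatever statistical matching a practitioner would actually choose. All names, gene labels, and values are hypothetical.

```python
import math

def normalize(profile, reference_genes):
    """Scale every expression level by the mean level of the reference
    genes, i.e. genes assumed not to respond to any tested compound."""
    scale = sum(profile[g] for g in reference_genes) / len(reference_genes)
    return {gene: level / scale for gene, level in profile.items()}

def pearson(sig_a, sig_b):
    """Pearson correlation of two signatures over their shared genes."""
    genes = sorted(set(sig_a) & set(sig_b))
    xs = [sig_a[g] for g in genes]
    ys = [sig_b[g] for g in genes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a flat signature carries no matching information
    return cov / (sx * sy)

def best_match(test_signature, known_signatures):
    """Return (name, correlation) of the most similar known signature."""
    scored = [(pearson(test_signature, sig), name)
              for name, sig in known_signatures.items()]
    r, name = max(scored)
    return name, r

# Hypothetical library of signatures and a test compound's signature.
known = {
    "known_toxicant": {"g1": 2.0, "g2": 0.0, "g3": -1.0},
    "inert_compound": {"g1": 0.0, "g2": 1.0, "g3": 0.0},
}
print(best_match({"g1": 1.8, "g2": 0.1, "g3": -0.9}, known))
```

Note that the matching step uses only the expression values, which is consistent with the observation that knowledge of gene function is not required.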
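The comparison of treated and untreated transcript levels in this embodiment can be sketched as a log ratio per transcript. The two-fold cutoff used here is an assumption chosen for illustration, not a value given in this disclosure, and the function name and data are hypothetical.

```python
import math

def changed_transcripts(treated, untreated, min_fold=2.0):
    """Return {gene: log2(treated / untreated)} for transcripts whose
    level differs by at least min_fold in either direction."""
    cutoff = math.log2(min_fold)
    changed = {}
    for gene in treated.keys() & untreated.keys():
        if treated[gene] <= 0 or untreated[gene] <= 0:
            continue  # a ratio is undefined without signal in both samples
        log_ratio = math.log2(treated[gene] / untreated[gene])
        if abs(log_ratio) >= cutoff:
            changed[gene] = log_ratio
    return changed

# Hypothetical quantified transcript levels.
treated = {"sptm_a": 8.0, "sptm_b": 5.0, "sptm_c": 1.0}
untreated = {"sptm_a": 2.0, "sptm_b": 5.0, "sptm_c": 4.0}
print(changed_transcripts(treated, untreated))
```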
Another particular embodiment relates to the use of SPTM encoded by polynucleotides of the present invention to analyze the proteome of a tissue or cell type. The term proteome refers to the global pattern of protein expression in a particular tissue or cell type. Each protein component of a proteome can be subjected individually to further analysis. Proteome expression patterns, or profiles, are analyzed by quantifying the number of expressed proteins and their relative abundance under given conditions and at a given time. A profile of a cell's proteome may thus be generated by separating and analyzing the polypeptides of a particular tissue or cell type. In one embodiment, the separation is achieved using two-dimensional gel electrophoresis, in which proteins from a sample are separated by isoelectric focusing in the first dimension, and then according to molecular weight by sodium dodecyl sulfate slab gel electrophoresis in the second dimension (Steiner and Anderson, supra). The proteins are visualized in the gel as discrete and uniquely positioned spots, typically by staining the gel with an agent such as Coomassie Blue or silver or fluorescent stains. The optical density of each protein spot is generally proportional to the level of the protein in the sample. The optical densities of equivalently positioned protein spots from different samples, for example, from biological samples either treated or untreated with a test compound or therapeutic agent, are compared to identify any changes in protein spot density related to the treatment. The proteins in the spots are partially sequenced using, for example, standard methods employing chemical or enzymatic cleavage followed by mass spectrometry. The identity of the protein in a spot may be determined by comparing its partial sequence, preferably of at least 5 contiguous amino acid residues, to the polypeptide sequences of the present invention.
In some cases, further sequence data may be obtained for definitive protein identification.
A proteomic profile may also be generated using antibodies specific for SPTM to quantify the levels of SPTM expression. In one embodiment, the antibodies are used as elements on a microarray, and protein expression levels are quantified by exposing the microarray to the sample and detecting the levels of protein bound to each array element (Lueking, A. et al. (1999) Anal. Biochem. 270:103-11; Mendoze, L. G. et al. (1999) Biotechniques 27:778-88). Detection may be performed by a variety of methods known in the art, for example, by reacting the proteins in the sample with a thiol- or amino-reactive fluorescent compound and detecting the amount of fluorescence bound at each array element.
Toxicant signatures at the proteome level are also useful for toxicological screening, and should be analyzed in parallel with toxicant signatures at the transcript level. There is a poor correlation between transcript and protein abundances for some proteins in some tissues (Anderson, N. L. and Seilhamer, J. (1997) Electrophoresis 18:533-537), so proteome toxicant signatures may be useful in the analysis of compounds which do not significantly affect the transcript image, but which alter the proteomic profile. In addition, the analysis of transcripts in body fluids is difficult, due to rapid degradation of mRNA, so proteomic profiling may be more reliable and informative in such cases.
In another embodiment, the toxicity of a test compound is assessed by treating a biological sample containing proteins with the test compound. Proteins that are expressed in the treated biological sample are separated so that the amount of each protein can be quantified. The amount of each protein is compared to the amount of the corresponding protein in an untreated biological sample. A difference in the amount of protein between the two samples is indicative of a toxic response to the test compound in the treated sample. Individual proteins are identified by sequencing the amino acid residues of the individual proteins and comparing these partial sequences to the SPTM encoded by polynucleotides of the present invention.
In another embodiment, the toxicity of a test compound is assessed by treating a biological sample containing proteins with the test compound. Proteins from the biological sample are incubated with antibodies specific to the SPTM encoded by polynucleotides of the present invention. The amount of protein recognized by the antibodies is quantified. The amount of protein in the treated biological sample is compared with the amount in an untreated biological sample. A difference in the amount of protein between the two samples is indicative of a toxic response to the test compound in the treated sample.
Transcript images may be used to profile sptm expression in distinct tissue types. This process can be used to determine cell signaling activity in a particular tissue type relative to this activity in a different tissue type. Transcript images may be used to generate a profile of sptm expression characteristic of diseased tissue. Transcript images of tissues before and after treatment may be used for diagnostic purposes, to monitor the progression of disease, and to monitor the efficacy of drug treatments for diseases which affect cell signaling activity.
Transcript images of cell lines can be used to assess cell signaling activity and/or to identify cell lines that lack or misregulate this activity. Such cell lines may then be treated with pharmaceutical agents, and a transcript image following treatment may indicate the efficacy of these agents in restoring desired levels of this activity. A similar approach may be used to assess the toxicity of pharmaceutical agents as reflected by undesirable changes in cell signaling activity. Candidate pharmaceutical agents may be evaluated by comparing their associated transcript images with those of pharmaceutical agents of known effectiveness.
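The spot identification step described above, in which a partial sequence of at least 5 contiguous residues is compared with known polypeptide sequences, can be sketched as an exact substring search. The polypeptide identifiers and sequences below are invented placeholders and are not sequences from the Sequence Listing. A real comparison would likely tolerate sequencing ambiguity, which this sketch does not.

```python
def identify_spot(partial_sequence, polypeptides, min_length=5):
    """Return the sorted identifiers of polypeptides that contain the
    partial sequence as a contiguous run of residues."""
    fragment = partial_sequence.upper()
    if len(fragment) < min_length:
        raise ValueError(f"need at least {min_length} contiguous residues")
    return sorted(
        name for name, sequence in polypeptides.items()
        if fragment in sequence.upper()
    )

# Hypothetical one-letter amino acid sequences.
library = {
    "polypeptide_1": "MKTLLVAGAVLS",
    "polypeptide_2": "MSPQRLLVAGT",
}
print(identify_spot("LLVAG", library))    # short fragment matches both entries
print(identify_spot("TLLVAGA", library))  # a longer fragment matches only one
```

The second call shows why further sequence data may be needed: a short fragment can match more than one candidate.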
Antisense Molecules
The polynucleotides of the present invention are useful in antisense technology. Antisense technology or therapy relies on the modulation of expression of a target protein through the specific binding of an antisense sequence to a target sequence encoding the target protein or directing its expression. (See, e.g., Agrawal, S., ed. (1996) Antisense Therapeutics, Humana Press Inc., Totowa NJ; Alama, A. et al. (1997) Pharmacol. Res. 36(3):171-178; Crooke, S.T. (1997) Adv. Pharmacol. 40:1-49; Sharma, H.W. and R. Narayanan (1995) Bioessays 17(12):1055-1063; and Lavrosky, Y. et al. (1997) Biochem. Mol. Med. 62(1):11-22.) An antisense sequence is a polynucleotide sequence capable of specifically hybridizing to at least a portion of the target sequence. Antisense sequences bind to cellular mRNA and/or genomic DNA, affecting translation and/or transcription. Antisense sequences can be DNA, RNA, or nucleic acid mimics and analogs. (See, e.g., Rossi, J.J. et al. (1991) Antisense Res. Dev. 1(3):285-288; Lee, R. et al. (1998) Biochemistry 37(3):900-1010; Pardridge, W.M. et al. (1995) Proc. Natl. Acad. Sci. USA 92(12):5592-5596; and Nielsen, P. E. and Haaima, G. (1997) Chem. Soc. Rev. 96:73-78.) Typically, the binding which results in modulation of expression occurs through hybridization or binding of complementary base pairs. Antisense sequences can also bind to DNA duplexes through specific interactions in the major groove of the double helix. The polynucleotides of the present invention and fragments thereof can be used as antisense sequences to modify the expression of the polypeptide encoded by sptm. The antisense sequences can be produced ex vivo, such as by using any of the ABI nucleic acid synthesizer series (Applied Biosystems) or other automated systems known in the art. Antisense sequences can also be produced biologically, such as by transforming an appropriate host cell with an expression vector containing the sequence of interest. (See, e.g., Agrawal, supra.)
In therapeutic use, any gene delivery system suitable for introduction of the antisense sequences into appropriate target cells can be used. Antisense sequences can be delivered intracellularly in the form of an expression plasmid which, upon transcription, produces a sequence complementary to at least a portion of the cellular sequence encoding the target protein. (See, e.g., Slater, J.E., et al. (1998) J. Allergy Clin. Immunol. 102(3):469-475; and Scanlon, K.J., et al. (1995) 9(13):1288-1296.) Antisense sequences can also be introduced intracellularly through the use of viral vectors, such as retrovirus and adeno-associated virus vectors. (See, e.g., Miller, A.D. (1990) Blood 76:271; Ausubel, F.M. et al. (1995) Current Protocols in Molecular Biology, John Wiley & Sons, New York NY; Uckert, W. and W. Walther (1994) Pharmacol. Ther. 63(3):323-347.) Other gene delivery mechanisms include liposome-derived systems, artificial viral envelopes, and other systems known in the art. (See, e.g., Rossi, J.J. (1995) Br. Med. Bull. 51(1):217-225; Boado, R.J. et al. (1998) J. Pharm. Sci. 87(11):1308-1315; and Morris, M.C. et al. (1997) Nucleic Acids Res. 25(14):2730-2736.)
Expression
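Because an antisense sequence hybridizes to its target through complementary base pairing, a candidate antisense RNA for a stretch of mRNA is simply the reverse complement of that stretch. The sketch below illustrates this for standard bases only. The function name and the example sequence are hypothetical, and no attempt is made to model analogs, secondary structure, or target accessibility.

```python
# Watson-Crick pairing for RNA: A-U and G-C.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")

def antisense_rna(target):
    """Return the RNA reverse complement of a target sequence, written
    5' to 3'. DNA input is accepted and read as its RNA equivalent."""
    seq = target.upper().replace("T", "U")
    unexpected = set(seq) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq.translate(_COMPLEMENT)[::-1]

# Hypothetical six-base target region.
print(antisense_rna("AUGGCC"))
```

Applying the function twice returns the original target, which is a convenient sanity check on any antisense design.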
In order to express a biologically active SPTM, the nucleotide sequences encoding SPTM or fragments thereof may be inserted into an appropriate expression vector, i.e., a vector which contains the necessary elements for transcriptional and translational control of the inserted coding sequence in a suitable host. Methods which are well known to those skilled in the art may be used to construct expression vectors containing sequences encoding SPTM and appropriate transcriptional and translational control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic recombination. (See, e.g., Sambrook, supra, Chapters 4, 8, 16, and 17; and Ausubel, supra, Chapters 9, 10, 13, and 16.) A variety of expression vector/host systems may be utilized to contain and express sequences encoding SPTM. These include, but are not limited to, microorganisms such as bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell systems infected with viral expression vectors (e.g., baculovirus); plant cell systems transformed with viral expression vectors (e.g., cauliflower mosaic virus, CaMV, or tobacco mosaic virus, TMV) or with bacterial expression vectors (e.g., Ti or pBR322 plasmids); or animal (mammalian) cell systems. (See, e.g., Sambrook, supra; Ausubel, 1995, supra; Van Heeke, G. and S.M. Schuster (1989) J. Biol. Chem. 264:5503-5509; Bitter, G.A. et al. (1987) Methods Enzymol. 153:516-544; Scorer, C.A. et al. (1994) Bio/Technology 12:181-184; Engelhard, E.K. et al. (1994) Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, Y. et al. (1996) Hum. Gene Ther. 7:1937-1945; Takamatsu, N. (1987) EMBO J. 6:307-311; Corazzi, G. et al. (1984) EMBO J. 3:1671-1680; Broglie, R. et al. (1984) Science 224:838-843; Winter, J. et al. (1991) Results Probl. Cell Differ. 17:85-105; The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New York NY, pp. 191-196; Logan, J. and T. Shenk (1984) Proc. Natl. Acad. Sci. USA 81:3655-3659; and Harrington, J.J. et al. (1997) Nat. Genet. 15:345-355.)
Expression vectors derived from retroviruses, adenoviruses, or herpes or vaccinia viruses, or from various bacterial plasmids, may be used for delivery of nucleotide sequences to the targeted organ, tissue, or cell population. (See, e.g., Di Nicola, M. et al. (1998) Cancer Gen. Ther. 5(6):350-356; Yu, M. et al. (1993) Proc. Natl. Acad. Sci. USA 90(13):6340-6344; Buller, R.M. et al. (1985) Nature 317(6040):813-815; McGregor, D.P. et al. (1994) Mol. Immunol. 31(3):219-226; and Verma, I.M. and N. Somia (1997) Nature 389:239-242.) The invention is not limited by the host cell employed.
For long term production of recombinant proteins in mammalian systems, stable expression of SPTM in cell lines is preferred. For example, sequences encoding SPTM can be transformed into cell lines using expression vectors which may contain viral origins of replication and/or endogenous expression elements and a selectable marker gene on the same or on a separate vector. Any number of selection systems may be used to recover transformed cell lines. (See, e.g., Wigler, M. et al. (1977) Cell 11:223-232; Lowy, I. et al. (1980) Cell 22:817-823; Wigler, M. et al. (1980) Proc. Natl. Acad. Sci. USA 77:3567-3570; Colbere-Garapin, F. et al. (1981) J. Mol. Biol. 150:1-14; Hartman, S.C. and R.C. Mulligan (1988) Proc. Natl. Acad. Sci. USA 85:8047-8051; Rhodes, C.A. (1995) Methods Mol. Biol. 55:121-131.)
Therapeutic Uses of sptm
The polynucleotides encoding SPTM of the invention may be used for somatic or germline gene therapy. Gene therapy may be performed to (i) correct a genetic deficiency (e.g., in the cases of severe combined immunodeficiency (SCID)-X1 disease characterized by X-linked inheritance (Cavazzana-Calvo, M. et al. (2000) Science 288:669-672), severe combined immunodeficiency syndrome associated with an inherited adenosine deaminase (ADA) deficiency (Blaese, R.M. et al. (1995) Science 270:475-480; Bordignon, C. et al. (1995) Science 270:470-475), cystic fibrosis (Zabner, J. et al. (1993) Cell 75:207-216; Crystal, R.G. et al. (1995) Hum. Gene Therapy 6:643-666; Crystal, R.G. et al. (1995) Hum. Gene Therapy 6:667-703), thalassemias, familial hypercholesterolemia, and hemophilia resulting from Factor VIII or Factor IX deficiencies (Crystal, R.G. (1995) Science 270:404-410; Verma, I.M. and Somia, N. (1997) Nature 389:239-242)), (ii) express a conditionally lethal gene product (e.g., in the case of cancers which result from unregulated cell proliferation), or (iii) express a protein which affords protection against intracellular parasites (e.g., against human retroviruses, such as human immunodeficiency virus (HIV) (Baltimore, D. (1988) Nature 335:395-396; Poeschla, E. et al. (1996) Proc. Natl. Acad. Sci. USA 93:11395-11399), hepatitis B or C virus (HBV, HCV); fungal parasites, such as Candida albicans and Paracoccidioides brasiliensis; and protozoan parasites such as Plasmodium falciparum and Trypanosoma cruzi). In the case where a genetic deficiency in sptm expression or regulation causes disease, the expression of sptm from an appropriate population of transduced cells may alleviate the clinical manifestations caused by the genetic deficiency.
In a further embodiment of the invention, diseases or disorders caused by deficiencies in sptm are treated by constructing mammalian expression vectors comprising sptm and introducing these vectors by mechanical means into sptm-deficient cells. Mechanical transfer technologies for use with cells in vivo or ex vitro include (i) direct DNA microinjection into individual cells, (ii) ballistic gold particle delivery, (iii) liposome-mediated transfection, (iv) receptor-mediated gene transfer, and (v) the use of DNA transposons (Morgan, R.A. and Anderson, W.F. (1993) Annu. Rev. Biochem. 62:191-217; Ivics, Z. (1997) Cell 91:501-510; Boulay, J-L. and Recipon, H. (1998) Curr. Opin. Biotechnol. 9:445-450).
Expression vectors that may be effective for the expression of sptm include, but are not limited to, the PCDNA 3.1, EPITAG, PRCCMV2, PREP, PVAX vectors (Invitrogen, Carlsbad CA), PCMV-SCRIPT, PCMV-TAG, PEGSH/PERV (Stratagene, La Jolla CA), and PTET-OFF, PTET-ON, PTRE2, PTRE2-LUC, PTK-HYG (Clontech, Palo Alto CA). The sptm of the invention may be expressed using (i) a constitutively active promoter (e.g., from cytomegalovirus (CMV), Rous sarcoma virus (RSV), SV40 virus, thymidine kinase (TK), or β-actin genes), (ii) an inducible promoter (e.g., the tetracycline-regulated promoter (Gossen, M. and Bujard, H. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:5547-5551; Gossen, M. et al. (1995) Science 268:1766-1769; Rossi, F.M.V. and Blau, H.M. (1998) Curr. Opin. Biotechnol. 9:451-456), commercially available in the T-REX plasmid (Invitrogen); the ecdysone-inducible promoter (available in the plasmids PVGRXR and PIND; Invitrogen); the FK506/rapamycin inducible promoter; or the RU486/mifepristone inducible promoter (Rossi, F.M.V. and Blau, H.M., supra)), or (iii) a tissue-specific promoter or the native promoter of the endogenous gene encoding SPTM from a normal individual.
Commercially available liposome transformation kits (e.g., the PERFECT LIPID TRANSFECTION KIT, available from Invitrogen) allow one with ordinary skill in the art to deliver polynucleotides to target cells in culture and require minimal effort to optimize experimental parameters. In the alternative, transformation is performed using the calcium phosphate method (Graham, F.L. and Eb, A.J. (1973) Virology 52:456-467), or by electroporation (Neumann, E. et al. (1982) EMBO J. 1:841-845). The introduction of DNA to primary cells requires modification of these standardized mammalian transfection protocols.
In another embodiment of the invention, diseases or disorders caused by genetic defects with respect to sptm expression are treated by constructing a retrovirus vector consisting of (i) sptm under the control of an independent promoter or the retrovirus long terminal repeat (LTR) promoter, (ii) appropriate RNA packaging signals, and (iii) a Rev-responsive element (RRE) along with additional retrovirus cis-acting RNA sequences and coding sequences required for efficient vector propagation. Retrovirus vectors (e.g., PFB and PFBNEO) are commercially available (Stratagene) and are based on published data (Riviere, I. et al. (1995) Proc. Natl. Acad. Sci. U.S.A. 92:6733-6737), incorporated by reference herein. The vector is propagated in an appropriate vector producing cell line (VPCL) that expresses an envelope gene with a tropism for receptors on the target cells or a promiscuous envelope protein such as VSVg (Armentano, D. et al. (1987) J. Virol. 61:1647-1650; Bender, M.A. et al. (1987) J. Virol. 61:1639-1646; Adam, M.A. and Miller, A.D. (1988) J. Virol. 62:3802-3806; Dull, T. et al. (1998) J. Virol. 72:8463-8471; Zufferey, R. et al. (1998) J. Virol. 72:9873-9880). U.S. Patent Number 5,910,434 to Rigg ("Method for obtaining retrovirus packaging cell lines producing high transducing efficiency retroviral supernatant") discloses a method for obtaining retrovirus packaging cell lines and is hereby incorporated by reference. Propagation of retrovirus vectors, transduction of a population of cells (e.g., CD4+ T-cells), and the return of transduced cells to a patient are procedures well known to persons skilled in the art of gene therapy and have been well documented (Ranga, U. et al. (1997) J. Virol. 71:7020-7029; Bauer, G. et al. (1997) Blood 89:2259-2267; Bonyhadi, M.L. (1997) J. Virol. 71:4707-4716; Ranga, U. et al. (1998) Proc. Natl. Acad. Sci. U.S.A. 95:1201-1206; Su, L. (1997) Blood 89:2283-2290).
In the alternative, an adenovirus-based gene therapy delivery system is used to deliver sptm to cells which have one or more genetic abnormalities with respect to the expression of sptm. The construction and packaging of adenovirus-based vectors are well known to those with ordinary skill in the art. Replication defective adenovirus vectors have proven to be versatile for importing genes encoding immunoregulatory proteins into intact islets in the pancreas (Csete, M.E. et al. (1995) Transplantation 27:263-268). Potentially useful adenoviral vectors are described in U.S. Patent Number 5,707,618 to Armentano ("Adenovirus vectors for gene therapy"), hereby incorporated by reference. For adenoviral vectors, see also Antinozzi, P.A. et al. (1999) Annu. Rev. Nutr. 19:511-544 and Verma, I.M. and Somia, N. (1997) Nature 389:239-242, both incorporated by reference herein.
In another alternative, a herpes-based gene therapy delivery system is used to deliver sptm to target cells which have one or more genetic abnormalities with respect to the expression of sptm. The use of herpes simplex virus (HSV)-based vectors may be especially valuable for introducing sptm to cells of the central nervous system, for which HSV has a tropism. The construction and packaging of herpes-based vectors are well known to those with ordinary skill in the art. A replication-competent herpes simplex virus (HSV) type 1-based vector has been used to deliver a reporter gene to the eyes of primates (Liu, X. et al. (1999) Exp. Eye Res. 169:385-395). The construction of a HSV-1 virus vector has also been disclosed in detail in U.S. Patent Number 5,804,413 to DeLuca ("Herpes simplex virus strains for gene transfer"), which is hereby incorporated by reference. U.S. Patent Number 5,804,413 teaches the use of recombinant HSV d92 which consists of a genome containing at least one exogenous gene to be transferred to a cell under the control of the appropriate promoter for purposes including human gene therapy.
Also taught by this patent are the construction and use of recombinant HSV strains deleted for ICP4, ICP27 and ICP22. For HSV vectors, see also Goins, W.F. et al. (1999) J. Virol. 73:519-532 and Xu, H. et al. (1994) Dev. Biol. 163:152-161, hereby incorporated by reference. The manipulation of cloned herpesvirus sequences, the generation of recombinant virus following the transfection of multiple plasmids containing different segments of the large herpesvirus genomes, the growth and propagation of herpesvirus, and the infection of cells with herpesvirus are techniques well known to those of ordinary skill in the art.
In another alternative, an alphavirus (positive, single-stranded RNA virus) vector is used to deliver sptm to target cells. The biology of the prototypic alphavirus, Semliki Forest Virus (SFV), has been studied extensively and gene transfer vectors have been based on the SFV genome (Garoff, H. and Li, K-J. (1998) Curr. Opin. Biotech. 9:464-469). During alphavirus RNA replication, a subgenomic RNA is generated that normally encodes the viral capsid proteins. This subgenomic RNA replicates to higher levels than the full-length genomic RNA, resulting in the overproduction of capsid proteins relative to the viral proteins with enzymatic activity (e.g., protease and polymerase). Similarly, inserting sptm into the alphavirus genome in place of the capsid-coding region results in the production of a large number of sptm RNAs and the synthesis of high levels of SPTM in vector transduced cells. While alphavirus infection is typically associated with cell lysis within a few days, the ability to establish a persistent infection in hamster normal kidney cells (BHK-21) with a variant of Sindbis virus (SIN) indicates that the lytic replication of alphaviruses can be altered to suit the needs of the gene therapy application (Dryga, S.A. et al. (1997) Virology 228:74-83). The wide host range of alphaviruses will allow the introduction of sptm into a variety of cell types. The specific transduction of a subset of cells in a population may require the sorting of cells prior to transduction. The methods of manipulating infectious cDNA clones of alphaviruses, performing alphavirus cDNA and RNA transfections, and performing alphavirus infections, are well known to those with ordinary skill in the art.
Antibodies
Anti-SPTM antibodies may be used to analyze protein expression levels. Such antibodies include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, and Fab fragments. For descriptions of antibody technologies and associated protocols, see, e.g., Pound, J.D. (1998) Immunochemical Protocols, Humana Press, Totowa NJ.
The amino acid sequence encoded by the sptm of the Sequence Listing may be analyzed by appropriate software (e.g., LASERGENE NAVIGATOR software, DNASTAR) to determine regions of high immunogenicity. The optimal sequences for immunization are selected from the C-terminus, the N-terminus, and those intervening, hydrophilic regions of the polypeptide which are likely to be exposed to the external environment when the polypeptide is in its natural conformation. Analysis used to select appropriate epitopes is also described by Ausubel (1997, supra, Chapter 11.7). Peptides used for antibody induction do not need to have biological activity; however, they must be antigenic. Peptides used to induce specific antibodies may have an amino acid sequence consisting of at least five amino acids, preferably at least 10 amino acids, and most preferably at least 15 amino acids. A peptide which mimics an antigenic fragment of the natural polypeptide may be fused with another protein such as keyhole limpet hemocyanin (KLH; Sigma, St. Louis MO) for antibody production. A peptide encompassing an antigenic region may be expressed from an sptm, synthesized as described above, or purified from human cells.
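One simple way to look for hydrophilic candidate regions of the kind described above is to scan the polypeptide with a sliding window and average a per-residue hydropathy value. The sketch below uses the Kyte-Doolittle scale (lower values are more hydrophilic) and a default 15-residue window to match the preferred peptide length; it is an illustration of the general idea and is not the algorithm used by the LASERGENE software. The function name and return format are assumptions.

```python
# Illustrative sketch: find the most hydrophilic window in a polypeptide
# using the Kyte-Doolittle hydropathy scale (lower = more hydrophilic).
# This is NOT the LASERGENE method; names and defaults are hypothetical.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def most_hydrophilic_window(protein: str, width: int = 15):
    """Return (start, peptide, mean_hydropathy) for the lowest-scoring window."""
    protein = protein.upper()
    if len(protein) < width:
        raise ValueError("protein is shorter than the window width")
    best_start, best_score = 0, None
    for start in range(len(protein) - width + 1):
        window = protein[start:start + width]
        score = sum(KYTE_DOOLITTLE[aa] for aa in window) / width
        if best_score is None or score < best_score:
            best_start, best_score = start, score
    return best_start, protein[best_start:best_start + width], best_score
```

A low mean hydropathy only suggests surface exposure; it does not establish antigenicity, which still has to be tested experimentally as described in the text.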
Procedures well known in the art may be used for the production of antibodies. Various hosts, including mice, goats, and rabbits, may be immunized by injection with a peptide. Depending on the host species, various adjuvants may be used to increase immunological response.
In one procedure, peptides about 15 residues in length may be synthesized using an ABI 431A peptide synthesizer (Applied Biosystems) using Fmoc chemistry and coupled to KLH (Sigma) by reaction with m-maleimidobenzoyl-N-hydroxysuccinimide ester (Ausubel, 1995, supra). Rabbits are immunized with the peptide-KLH complex in complete Freund's adjuvant. The resulting antisera are tested for antipeptide activity by binding the peptide to plastic, blocking with 1% bovine serum albumin (BSA), reacting with rabbit antisera, washing, and reacting with radioiodinated goat anti-rabbit IgG. Antisera with antipeptide activity are tested for anti-SPTM activity using protocols well known in the art, including ELISA, radioimmunoassay (RIA), and immunoblotting.
In another procedure, isolated and purified peptide may be used to immunize mice (about 100 μg of peptide) or rabbits (about 1 mg of peptide). Subsequently, the peptide is radioiodinated and used to screen the immunized animals' B-lymphocytes for production of antipeptide antibodies. Positive cells are then used to produce hybridomas using standard techniques. About 20 mg of peptide is sufficient for labeling and screening several thousand clones. Hybridomas of interest are detected by screening with radioiodinated peptide to identify those fusions producing peptide-specific monoclonal antibody. In a typical protocol, wells of a multi-well plate (FAST, Becton-Dickinson, Palo Alto, CA) are coated with affinity-purified, specific rabbit-anti-mouse (or suitable anti-species IgG) antibodies at 10 mg/ml. The coated wells are blocked with 1% BSA and washed and exposed to supernatants from hybridomas. After incubation, the wells are exposed to radiolabeled peptide at 1 mg/ml.
Clones producing antibodies bind a quantity of labeled peptide that is detectable above background. Such clones are expanded and subjected to 2 cycles of cloning. Cloned hybridomas are injected into pristane-treated mice to produce ascites, and monoclonal antibody is purified from the ascitic fluid by affinity chromatography on protein A (Amersham Pharmacia Biotech). Several procedures for the production of monoclonal antibodies, including in vitro production, are described in Pound (supra). Monoclonal antibodies with antipeptide activity are tested for anti-SPTM activity using protocols well known in the art, including ELISA, RIA, and immunoblotting.
Antibody fragments containing specific binding sites for an epitope may also be generated. For example, such fragments include, but are not limited to, the F(ab')2 fragments produced by pepsin digestion of the antibody molecule, and the Fab fragments generated by reducing the disulfide bridges of the F(ab')2 fragments. Alternatively, construction of Fab expression libraries in filamentous bacteriophage allows rapid and easy identification of monoclonal fragments with desired specificity (Pound, supra, Chaps. 45-47). Antibodies generated against polypeptide encoded by sptm can be used to purify and characterize full-length SPTM protein and its activity, binding partners, etc.
Assays Using Antibodies
Anti-SPTM antibodies may be used in assays to quantify the amount of SPTM found in a particular human cell. Such assays include methods utilizing the antibody and a label to detect expression level under normal or disease conditions. The peptides and antibodies of the invention may be used with or without modification or labeled by joining them, either covalently or noncovalently, with a reporter molecule.
Protocols for detecting and measuring protein expression using either polyclonal or monoclonal antibodies are well known in the art. Examples include ELISA, RIA, and fluorescence-activated cell sorting (FACS). Such immunoassays typically involve the formation of complexes between the SPTM and its specific antibody and the measurement of such complexes. These and other assays are described in Pound (supra).
Without further elaboration, it is believed that one skilled in the art can, using the preceding description, utilize the present invention to its fullest extent. The following preferred specific embodiments are, therefore, to be construed as merely illustrative, and not limitative of the remainder of the disclosure in any way whatsoever.
The disclosures of all patents, applications, and publications mentioned above and below, including U.S. Ser. No. 60/230,517, U.S. Ser. No. 60/230,599, U.S. Ser. No. 60/230,514, U.S. Ser. No. 60/230,988, U.S. Ser. No. 60/230,518, U.S. Ser. No. 60/230,515, U.S. Ser. No. 60/229,751, U.S. Ser. No. 60/230,016, U.S. Ser. No. 60/230,610, U.S. Ser. No. 60/229,749, U.S. Ser. No. 60/229,750, U.S. Ser. No. 60/230,597, U.S. Ser. No. 60/230,505, U.S. Ser. No. 60/231,163, U.S. Ser. No. 60/229,747, U.S. Ser. No. 60/229,748, U.S. Ser. No. 60/230,583, U.S. Ser. No. 60/230,519, U.S. Ser. No. 60/230,595, U.S. Ser. No. 60/230,896, U.S. Ser. No. 60/230,990, U.S. Ser. No. 60/230,865, U.S. Ser. No. 60/230,989, U.S. Ser. No. 60/230,897, U.S. Ser. No. 60/231,832, U.S. Ser. No. 60/230,596, U.S. Ser. No. 60/230,864, and U.S. Ser. No. 60/230,951 are hereby expressly incorporated by reference.
EXAMPLES
I. Construction of cDNA Libraries
RNA was purchased from CLONTECH Laboratories, Inc. (Palo Alto CA) or isolated from various tissues. Some tissues were homogenized and lysed in guanidinium isothiocyanate, while others were homogenized and lysed in phenol or in a suitable mixture of denaturants, such as TRIZOL (Life Technologies), a monophasic solution of phenol and guanidine isothiocyanate. The resulting lysates were centrifuged over CsCl cushions or extracted with chloroform. RNA was precipitated with either isopropanol or sodium acetate and ethanol, or by other routine methods.
Phenol extraction and precipitation of RNA were repeated as necessary to increase RNA purity. In most cases, RNA was treated with DNase. For most libraries, poly(A+) RNA was isolated using oligo d(T)-coupled paramagnetic particles (Promega Corporation (Promega), Madison WI), OLIGOTEX latex particles (QIAGEN, Inc. (QIAGEN), Valencia CA), or an OLIGOTEX mRNA purification kit (QIAGEN). Alternatively, RNA was isolated directly from tissue lysates using other RNA isolation kits, e.g., the POLY(A)PURE mRNA purification kit (Ambion, Inc., Austin TX).
In some cases, Stratagene was provided with RNA and constructed the corresponding cDNA libraries. Otherwise, cDNA was synthesized and cDNA libraries were constructed with the UNIZAP vector system (Stratagene Cloning Systems, Inc. (Stratagene), La Jolla CA) or SUPERSCRIPT plasmid system (Life Technologies), using the recommended procedures or similar methods known in the art. (See, e.g., Ausubel, 1997, supra, Chapters 5.1 through 6.6.) Reverse transcription was initiated using oligo d(T) or random primers. Synthetic oligonucleotide adapters were ligated to double stranded cDNA, and the cDNA was digested with the appropriate restriction enzyme or enzymes. For most libraries, the cDNA was size-selected (300-1000 bp) using SEPHACRYL S1000, SEPHAROSE CL2B, or SEPHAROSE CL4B column chromatography (Amersham Pharmacia Biotech) or preparative agarose gel electrophoresis. cDNAs were ligated into compatible restriction enzyme sites of the polylinker of a suitable plasmid, e.g., PBLUESCRIPT plasmid (Stratagene), PSPORT1 plasmid (Life Technologies), PCDNA2.1 plasmid (Invitrogen, Carlsbad CA), PBK-CMV plasmid (Stratagene), or pINCY (Incyte Genomics, Palo Alto CA), or derivatives thereof. Recombinant plasmids were transformed into competent E. coli cells including XL1-Blue, XL1-BlueMRF, or SOLR from Stratagene or DH5α, DH10B, or ElectroMAX DH10B from Life Technologies.
II. Isolation of cDNA Clones
Plasmids were recovered from host cells by in vivo excision using the UNIZAP vector system (Stratagene) or by cell lysis. Plasmids were purified using at least one of the following: the Magic or WIZARD Minipreps DNA purification system (Promega); the AGTC Miniprep purification kit (Edge BioSystems, Gaithersburg MD); and the QIAWELL 8, QIAWELL 8 Plus, and QIAWELL 8 Ultra plasmid purification systems or the R.E.A.L. PREP 96 plasmid purification kit (QIAGEN). Following precipitation, plasmids were resuspended in 0.1 ml of distilled water and stored, with or without lyophilization, at 4°C.
Alternatively, plasmid DNA was amplified from host cell lysates using direct link PCR in a high-throughput format. (Rao, V.B. (1994) Anal. Biochem. 216:1-14.) Host cell lysis and thermal cycling steps were carried out in a single reaction mixture. Samples were processed and stored in 384-well plates, and the concentration of amplified plasmid DNA was quantified fluorometrically using PICOGREEN dye (Molecular Probes, Inc. (Molecular Probes), Eugene OR) and a FLUOROSKAN II fluorescence scanner (Labsystems Oy, Helsinki, Finland).
III. Sequencing and Analysis
cDNA sequencing reactions were processed using standard methods or high-throughput instrumentation such as the ABI CATALYST 800 thermal cycler (Applied Biosystems) or the PTC-200 thermal cycler (MJ Research) in conjunction with the HYDRA microdispenser (Robbins Scientific Corp., Sunnyvale CA) or the MICROLAB 2200 liquid transfer system (Hamilton). cDNA sequencing reactions were prepared using reagents provided by Amersham Pharmacia Biotech or supplied in ABI sequencing kits such as the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems). Electrophoretic separation of cDNA sequencing reactions and detection of labeled polynucleotides were carried out using the MEGABACE 1000 DNA sequencing system (Molecular Dynamics); the ABI PRISM 373 or 377 sequencing system (Applied Biosystems) in conjunction with standard ABI protocols and base calling software; or other sequence analysis systems known in the art. Reading frames within the cDNA sequences were identified using standard methods (reviewed in Ausubel, 1997, supra, Chapter 7.7). Some of the cDNA sequences were selected for extension using the techniques disclosed in Example VDX
IV. Assembly and Analysis of Sequences
Component sequences from chromatograms were subject to PHRED analysis and assigned a quality score. The sequences having at least a required quality score were subject to various preprocessing editing pathways to eliminate, e.g., low quality 3' ends, vector and linker sequences, polyA tails, Alu repeats, mitochondrial and ribosomal sequences, bacterial contamination sequences, and sequences smaller than 50 base pairs. In particular, low-information sequences and repetitive elements (e.g., dinucleotide repeats, Alu repeats, etc.) were replaced by "n's", or masked, to prevent spurious matches.
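A few of the preprocessing steps named above (trimming a low quality 3' end, removing a polyA tail, discarding sequences smaller than 50 base pairs, and masking dinucleotide repeats with "n") can be sketched as follows. Only the 50 base pair cutoff comes from the text; the quality threshold, the polyA definition, and the repeat length are assumptions chosen for illustration, and vector, Alu, mitochondrial, ribosomal, and contaminant screening are not shown.

```python
import re

MIN_LENGTH = 50    # from the text: sequences smaller than 50 bp are eliminated
MIN_QUALITY = 20   # ASSUMED per-base quality threshold (not given in the text)
POLY_A_TAIL = re.compile(r"A{10,}$")              # ASSUMED: 10 or more trailing A's
DINUC_REPEAT = re.compile(r"([ACGT]{2})\1{4,}")   # ASSUMED: same 2-mer 5+ times in a row


def trim_low_quality_3prime(seq, quals, min_quality=MIN_QUALITY):
    """Drop bases from the 3' end while their quality is below the threshold."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_quality:
        end -= 1
    return seq[:end]


def mask_repeats(seq):
    """Replace each dinucleotide-repeat run with the same number of 'n' characters."""
    return DINUC_REPEAT.sub(lambda m: "n" * len(m.group(0)), seq)


def preprocess(seq, quals):
    """Return the cleaned, masked sequence, or None if it is too short to keep."""
    seq = seq.upper()
    if len(seq) != len(quals):
        raise ValueError("sequence and quality list must have the same length")
    seq = trim_low_quality_3prime(seq, quals)
    seq = POLY_A_TAIL.sub("", seq)
    if len(seq) < MIN_LENGTH:
        return None
    return mask_repeats(seq)
```

Masking with lowercase "n" keeps the sequence length and coordinates unchanged, so positions reported later still refer to the original read.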
Processed sequences were then subject to assembly procedures in which the sequences were assigned to gene bins (bins). Each sequence could only belong to one bin. Sequences in each gene bin were assembled to produce consensus sequences (templates). Subsequent new sequences were added to existing bins using BLASTn (v.1.4 WashU) and CROSSMATCH. Candidate pairs were identified as all BLAST hits having a quality score greater than or equal to 150. Alignments of at least 82% local identity were accepted into the bin. The component sequences from each bin were assembled using a version of PHRAP. Bins with several overlapping component sequences were assembled using DEEP PHRAP. The orientation (sense or antisense) of each assembled template was determined based on the number and orientation of its component sequences. Template sequences as disclosed in the sequence listing correspond to sense strand sequences (the "forward" reading frames), to the best determination. The complementary (antisense) strands are inherently disclosed herein. The component sequences which were used to assemble each template consensus sequence are listed in Table 3 along with their positions along the template nucleotide sequences.
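The two acceptance thresholds stated above (a BLAST quality score of at least 150 to become a candidate pair, then at least 82% local identity to be accepted into a bin) can be expressed as a small filter. This is a sketch only: the `Hit` record, the `assign_bin` name, and the rule of choosing the highest-scoring accepted hit when several bins qualify are assumptions, since the text says only that each sequence belongs to one bin.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

MIN_SCORE = 150        # from the text: candidate pairs need a score >= 150
MIN_IDENTITY = 82.0    # from the text: alignments need >= 82% local identity


@dataclass(frozen=True)
class Hit:
    """Hypothetical summary of one alignment between a new sequence and a bin."""
    bin_id: str
    score: float
    identity: float  # percent local identity


def assign_bin(hits: Iterable[Hit]) -> Optional[str]:
    """Return the bin a new sequence joins, or None if no hit qualifies."""
    candidates = [h for h in hits if h.score >= MIN_SCORE]
    accepted = [h for h in candidates if h.identity >= MIN_IDENTITY]
    if not accepted:
        return None
    # ASSUMPTION: when several bins qualify, prefer the best-scoring hit so
    # that the sequence still ends up in exactly one bin.
    best = max(accepted, key=lambda h: (h.score, h.identity))
    return best.bin_id
```

A sequence for which `assign_bin` returns `None` would presumably seed a new bin, although the text does not spell out that case.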
Bins were compared against each other and those having local similarity of at least 82% were combined and reassembled. Reassembled bins having templates of insufficient overlap (less than 95% local identity) were re-split. Assembled templates were also subject to analysis by STITCHER/EXON MAPPER algorithms which analyze the probabilities of the presence of splice variants, alternatively spliced exons, splice junctions, differential expression of alternative spliced genes across tissue types or disease states, etc. These resulting bins were subject to several rounds of the above assembly procedures.
Once gene bins were generated based upon sequence alignments, bins were clone joined based upon clone information. If the 5' sequence of one clone was present in one bin and the 3' sequence from the same clone was present in a different bin, it was likely that the two bins actually belonged together in a single bin. The resulting combined bins underwent assembly procedures to regenerate the consensus sequences.
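The clone joining step can be pictured as merging groups: whenever the 5' and 3' reads of one clone sit in different bins, those bins are combined, and merges can chain across several clones. The sketch below uses a union-find (disjoint set) structure for this; the input format and the function name are hypothetical, and the subsequent reassembly of each combined bin is not shown.

```python
def clone_join(read_bins):
    """Merge bins linked by the 5' and 3' reads of the same clone.

    read_bins: iterable of (clone_id, end, bin_id), with end either "5'" or "3'".
    Returns a dict mapping every original bin_id to its combined bin_id.
    All names and the input layout are illustrative assumptions.
    """
    parent = {}

    def find(x):
        # Locate the representative bin, shortening the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller identifier as representative for determinism.
            parent[max(ra, rb)] = min(ra, rb)

    ends_by_clone = {}
    for clone_id, end, bin_id in read_bins:
        find(bin_id)  # register the bin even if it is never merged
        ends_by_clone.setdefault(clone_id, {})[end] = bin_id

    for by_end in ends_by_clone.values():
        if "5'" in by_end and "3'" in by_end:
            union(by_end["5'"], by_end["3'"])

    return {bin_id: find(bin_id) for bin_id in list(parent)}
```

With clone c1 spanning bin1 and bin2 and clone c2 spanning bin2 and bin3, all three bins map to the same combined bin, while an unrelated bin keeps its own identifier.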
The final assembled templates were subsequently annotated using the following procedure. Template sequences were analyzed using BLASTn (v2.0, NCBI) versus gbpri (GenBank version 124). "Hits" were defined as an exact match having from 95% local identity over 200 base pairs through 100% local identity over 100 base pairs, or a homolog match having an E-value, i.e. a probability score, of < 1 x 10^-8. The hits were subject to frameshift FASTx versus GENPEPT (GenBank version 124). (See Table 6.) In this analysis, a homolog match was defined as having an E-value of < 1 x 10^-8. The assembly method used above was described in "System and Methods for Analyzing Biomolecular Sequences," U.S.S.N. 09/276,534, filed March 25, 1999, and the LIFESEQ Gold user manual (Incyte), both incorporated by reference herein.
Following assembly, template sequences were subjected to motif, BLAST, and functional analyses, and categorized in protein hierarchies using methods described in, e.g., "Database System Employing Protein Function Hierarchies for Viewing Biomolecular Sequence Data," U.S.S.N. 08/812,290, filed March 6, 1997; "Relational Database for Storing Biomolecule Information," U.S.S.N. 08/947,845, filed October 9, 1997; "Project-Based Full-Length Biomolecular Sequence Database," U.S.S.N. 08/811,758, filed March 6, 1997; and "Relational Database and System for Storing Information Relating to Biomolecular Sequences," U.S.S.N. 09/034,807, filed March 4, 1998, all of which are incorporated by reference herein. The template sequences were further analyzed by translating each template in all three forward reading frames and searching each translation against the Pfam database of hidden Markov model-based protein families and domains using the HMMER software package (available to the public from Washington University School of Medicine, St. Louis MO). (See also World Wide Web site http://pfam.wustl.edu/ for detailed descriptions of Pfam protein domains and families.) Additionally, the template sequences were translated in all three forward reading frames, and each translation was searched against hidden Markov models for signal peptides using the HMMER software package. Construction of hidden Markov models and their usage in sequence analysis has been described. (See, for example, Eddy, S.R. (1996) Curr. Opin. Str. Biol. 6:361-365.) Only those signal peptide hits with a cutoff score of 11 bits or greater are reported. A cutoff score of 11 bits or greater corresponds to at least about 91-94% true-positives in signal peptide prediction.
Template sequences were also translated in all three forward reading frames, and each translation was searched against TMAP, a program that uses weight matrices to delineate transmembrane segments on protein sequences and determine orientation with respect to the cell cytosol (Persson, B. and Argos, P. (1994) J. Mol. Biol. 237:182-192, and Persson, B. and Argos, P. (1996) Protein Sci. 5:363-371). Regions of templates which, when translated, contain similarity to signal peptide or transmembrane consensus sequences are reported in Table 2.
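Translation in all three forward reading frames, as used in the Pfam, signal peptide, and TMAP searches above, amounts to reading codons starting at offsets 0, 1, and 2 of the template. The following sketch uses the standard genetic code; the function names are hypothetical, and codons containing masked or ambiguous bases are rendered as "X" by assumption.

```python
# Illustrative sketch of three-frame forward translation with the standard
# genetic code. Function names are hypothetical.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq: str, frame: int = 0) -> str:
    """Translate seq starting at offset frame (0, 1, or 2); '*' marks a stop."""
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1, or 2")
    seq = seq.upper().replace("U", "T")
    residues = []
    for i in range(frame, len(seq) - 2, 3):
        # ASSUMPTION: codons with masked or ambiguous bases become 'X'.
        residues.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(residues)


def three_forward_frames(seq: str) -> list:
    """Return the translations of the three forward reading frames."""
    return [translate(seq, frame) for frame in range(3)]
```

Each of the three resulting protein strings would then be passed separately to the downstream search program.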
Template sequences are further analyzed using the bioinformatics tools listed in Table 6, or using sequence analysis software known in the art such as MACDNASIS PRO software (Hitachi Software Engineering, South San Francisco CA) and LASERGENE software (DNASTAR). Template sequences may be further queried against public databases such as the GenBank rodent, mammalian, vertebrate, prokaryote, and eukaryote databases.
The template sequences were translated to derive the corresponding longest open reading frame as presented by the polypeptide sequences as reported in Table 1. Alternatively, a polypeptide of the invention may begin at any of the methionine residues within the full length translated polypeptide. Polypeptide sequences were subsequently analyzed by querying against the GenBank protein database (GENPEPT, (GenBank version 124)). Full length polynucleotide sequences are also analyzed using MACDNASIS PRO software (Hitachi Software Engineering, South San Francisco CA) and LASERGENE software (DNASTAR). Polynucleotide and polypeptide sequence alignments are generated using default parameters specified by the CLUSTAL algorithm as incorporated into the MEGALIGN multisequence alignment program (DNASTAR), which also calculates the percent identity between aligned sequences.
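The three-frame translation and longest open reading frame selection described above can be illustrated with a minimal sketch. It assumes the standard genetic code and treats an open reading frame as the longest stretch of codons uninterrupted by a stop codon; the function names `translate` and `longest_orf` are hypothetical and are not taken from the software named in this specification.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption:
# the standard code applies; "*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame):
    """Translate seq in one forward frame (0, 1, or 2); unknown codons become X."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def longest_orf(seq):
    """Return (frame number 1-3, peptide) for the longest stop-free stretch."""
    best = (0, "")
    for frame in range(3):
        for segment in translate(seq, frame).split("*"):
            if len(segment) > len(best[1]):
                best = (frame + 1, segment)
    return best


if __name__ == "__main__":
    print(longest_orf("ATGGCCTAAATGAAACCCGGGTAG"))
```

Because a polypeptide may alternatively begin at any methionine within the translated sequence, a caller could trim the returned peptide to start at a chosen "M"; that step is left out of the sketch.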
Table 5 shows sequences with homology to the polypeptides of the invention as identified by BLAST analysis against the GenBank protein (GENPEPT) database. Column 1 shows the polypeptide sequence identification number (SEQ ID NO:) for the polypeptide segments of the invention. Column 2 shows the reading frame used in the translation of the polynucleotide sequences encoding the polypeptide segments. Column 3 shows the length of the translated polypeptide segments. Columns 4 and 5 show the start and stop nucleotide positions of the polynucleotide sequences encoding the polypeptide segments. Column 6 shows the GenBank identification number (GI Number) of the nearest GenBank homolog. Column 7 shows the probability score for the match between each polypeptide and its GenBank homolog. Column 8 shows the annotation of the GenBank homolog.
V. Analysis of Polynucleotide Expression
Northern analysis is a laboratory technique used to detect the presence of a transcript of a gene and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs from a particular cell type or tissue have been bound. (See, e.g., Sambrook, supra, ch. 7; Ausubel, 1995, supra, ch. 4 and 16.)
Analogous computer techniques applying BLAST were used to search for identical or related molecules in cDNA databases such as GenBank or LIFESEQ (Incyte Genomics). This analysis is much faster than multiple membrane-based hybridizations. In addition, the sensitivity of the computer search can be modified to determine whether any particular match is categorized as exact or similar. The basis of the search is the product score, which is defined as:
\[ \frac{\text{BLAST Score} \times \text{Percent Identity}}{5 \times \min\{\text{length(Seq. 1)},\ \text{length(Seq. 2)}\}} \]
The product score takes into account both the degree of similarity between two sequences and the length of the sequence match. The product score is a normalized value between 0 and 100, and is calculated as follows: the BLAST score is multiplied by the percent nucleotide identity and the product is divided by (5 times the length of the shorter of the two sequences). The BLAST score is calculated by assigning a score of +5 for every base that matches in a high-scoring segment pair (HSP), and -4 for every mismatch. Two sequences may share more than one HSP (separated by gaps). If there is more than one HSP, then the pair with the highest BLAST score is used to calculate the product score. The product score represents a balance between fractional overlap and quality in a BLAST alignment. For example, a product score of 100 is produced only for 100% identity over the entire length of the shorter of the two sequences being compared. A product score of 70 is produced either by 100% identity and 70% overlap at one end, or by 88% identity and 100% overlap at the other. A product score of 50 is produced either by 100% identity and 50% overlap at one end, or 79% identity and 100% overlap.
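The product score arithmetic can be checked with a short sketch. It assumes a single ungapped HSP described only by its counts of matching and mismatching bases; the function names `blast_score` and `product_score` are hypothetical and are not part of BLAST itself.

```python
def blast_score(matches, mismatches):
    """Score one HSP as described: +5 per matching base, -4 per mismatch."""
    return 5 * matches - 4 * mismatches


def product_score(matches, mismatches, len_seq1, len_seq2):
    """BLAST score x percent identity / (5 x length of the shorter sequence)."""
    aligned = matches + mismatches
    if aligned == 0:
        raise ValueError("the HSP must contain at least one aligned base")
    percent_identity = 100.0 * matches / aligned
    shorter = min(len_seq1, len_seq2)
    return blast_score(matches, mismatches) * percent_identity / (5 * shorter)


if __name__ == "__main__":
    # 100% identity over the full shorter sequence, then 70% and 50% overlap.
    print(product_score(100, 0, 100, 250))
    print(product_score(70, 0, 100, 250))
    print(product_score(50, 0, 100, 250))
    # 88% and 79% identity over the full shorter sequence.
    print(product_score(88, 12, 100, 250))
    print(product_score(79, 21, 100, 250))
```

Under these assumptions the 88% and 79% identity cases evaluate to roughly 69 and 49, consistent with the approximate product scores of 70 and 50 quoted in the text.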
Alternatively, polynucleotide sequences encoding SPTM are analyzed with respect to the tissue sources from which they were derived. Polynucleotide sequences encoding SPTM were assembled, at least in part, with overlapping Incyte cDNA sequences. Each cDNA sequence is derived from a cDNA library constructed from a human tissue. Each human tissue is classified into one of the following organ tissue categories: cardiovascular system; connective tissue; digestive system; embryonic structures; endocrine system; exocrine glands; genitalia, female; genitalia, male; germ cells; hemic and immune system; liver; musculoskeletal system; nervous system; pancreas; respiratory system; sense organs; skin; stomatognathic system; unclassified/mixed; or urinary tract. The number of libraries in each category for each polynucleotide sequence encoding SPTM is counted and divided by the total number of libraries across all categories for each polynucleotide sequence encoding SPTM. Similarly, each human tissue is classified into one of the following disease/condition categories: cancer, cell line, developmental, inflammation, neurological, trauma, cardiovascular, pooled, and other, and the number of libraries in each category for each polynucleotide sequence encoding SPTM is counted and divided by the total number of libraries across all categories for each polynucleotide sequence encoding SPTM. The resulting percentages reflect the tissue-specific and disease-specific expression of cDNA encoding SPTM. Percentage values of tissue-specific expression are reported in Table 4. cDNA sequences and cDNA library/tissue information are found in the LIFESEQ GOLD database (Incyte Genomics, Palo Alto CA).
VI. Tissue Distribution Profiling
A tissue distribution profile is determined for each template by compiling the cDNA library tissue classifications of its component cDNA sequences. Each component sequence is derived from a cDNA library constructed from a human tissue. Each human tissue is classified into one of the following categories: cardiovascular system; connective tissue; digestive system; embryonic structures; endocrine system; exocrine glands; genitalia, female; genitalia, male; germ cells; hemic and immune system; liver; musculoskeletal system; nervous system; pancreas; respiratory system; sense organs; skin; stomatognathic system; unclassified/mixed; or urinary tract. Template sequences, component sequences, and cDNA library/tissue information are found in the LIFESEQ GOLD database (Incyte Genomics, Palo Alto CA).
Table 4 shows the tissue distribution profile for the templates of the invention. For each template, the three most frequently observed tissue categories are shown in column 2, along with the percentage of component sequences belonging to each category. Only tissue categories with percentage values of >10% are shown. A tissue distribution of "widely distributed" in column 2 indicates percentage values of <10% in all tissue categories.
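The profiling rule can be sketched as follows, assuming the input is simply a list holding one tissue category label per component sequence. The function name `tissue_profile` and its parameters are hypothetical, and the handling of a category at exactly 10% (treated here as not shown) is an assumption, since the text specifies only >10% and <10%.

```python
from collections import Counter


def tissue_profile(categories, threshold=10.0, top_n=3):
    """Return up to top_n (category, percent) pairs above threshold,
    or the string "widely distributed" when no category qualifies."""
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one component sequence is required")
    shown = [
        (category, 100.0 * n / total)
        for category, n in counts.most_common()  # most frequent first
        if 100.0 * n / total > threshold
    ]
    return shown[:top_n] or "widely distributed"


if __name__ == "__main__":
    components = ["nervous system"] * 6 + ["liver"] * 3 + ["skin"]
    print(tissue_profile(components))
```

The same counting and division applies to the per-library percentages described for the organ tissue and disease/condition categories, with library labels in place of component sequence labels.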
VII. Transcript Image Analysis
Transcript images are generated as described in Seilhamer et al., "Comparative Gene Transcript Analysis," U.S. Patent Number 5,840,484, incorporated herein by reference.
VIII. Extension of Polynucleotide Sequences and Isolation of a Full-length cDNA
Oligonucleotide primers designed using an sptm of the Sequence Listing are used to extend the nucleic acid sequence. One primer is synthesized to initiate 5' extension of the template, and the other primer, to initiate 3' extension of the template. The initial primers may be designed using OLIGO 4.06 software (National Biosciences, Inc. (National Biosciences), Plymouth MN), or another appropriate program, to be about 22 to 30 nucleotides in length, to have a GC content of about 50% or more, and to anneal to the target sequence at temperatures of about 68°C to about 72°C. Any stretch of nucleotides which would result in hairpin structures or primer-primer dimerizations is avoided. Selected human cDNA libraries are used to extend the sequence. If more than one extension is necessary or desired, additional or nested sets of primers are designed.
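Two of the stated primer criteria, length and GC content, lend themselves to a simple check, sketched below. This is not the OLIGO 4.06 algorithm; the function names are hypothetical, and the annealing temperature, hairpin, and primer-dimer criteria are deliberately not modeled.

```python
def gc_content(primer):
    """Percentage of G and C bases in the primer."""
    primer = primer.upper()
    if not primer:
        raise ValueError("primer must not be empty")
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def meets_length_and_gc(primer, min_len=22, max_len=30, min_gc=50.0):
    """True when the primer is about 22 to 30 nt long with GC content >= 50%."""
    return min_len <= len(primer) <= max_len and gc_content(primer) >= min_gc


if __name__ == "__main__":
    candidate = "GCGCGCATATGCGCGCATATGC"  # illustrative sequence only
    print(len(candidate), round(gc_content(candidate), 1),
          meets_length_and_gc(candidate))
```

A candidate passing this filter would still need to be screened for annealing temperature and secondary structure with an appropriate program.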
High fidelity amplification is obtained by PCR using methods well known in the art. PCR is performed in 96-well plates using the PTC-200 thermal cycler (MJ Research). The reaction mix contains DNA template, 200 nmol of each primer, reaction buffer containing Mg2+, (NH4)2SO4, and β-mercaptoethanol, Taq DNA polymerase (Amersham Pharmacia Biotech), ELONGASE enzyme (Life Technologies), and Pfu DNA polymerase (Stratagene), with the following parameters for primer pair PCI A and PCI B: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 68°C, 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68°C, 5 min; Step 7: storage at 4°C. In the alternative, the parameters for primer pair T7 and SK+ are as follows: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 57°C, 1 min; Step 4: 68°C, 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68°C, 5 min; Step 7: storage at 4°C.
The concentration of DNA in each well is determined by dispensing 100 μl PICOGREEN quantitation reagent (0.25% (v/v); Molecular Probes) dissolved in 1X Tris-EDTA (TE) and 0.5 μl of undiluted PCR product into each well of an opaque fluorimeter plate (Corning Incorporated (Corning), Corning NY), allowing the DNA to bind to the reagent. The plate is scanned in a FLUOROSKAN II (Labsystems Oy) to measure the fluorescence of the sample and to quantify the concentration of DNA. A 5 μl to 10 μl aliquot of the reaction mixture is analyzed by electrophoresis on a 1% agarose mini-gel to determine which reactions are successful in extending the sequence.
The extended nucleotides are desalted and concentrated, transferred to 384-well plates, digested with CviJI cholera virus endonuclease (Molecular Biology Research, Madison WI), and sonicated or sheared prior to religation into pUC 18 vector (Amersham Pharmacia Biotech). For shotgun sequencing, the digested nucleotides are separated on low concentration (0.6 to 0.8%) agarose gels, fragments are excised, and agar digested with AGAR ACE (Promega). Extended clones are religated using T4 ligase (New England Biolabs, Inc., Beverly MA) into pUC 18 vector (Amersham Pharmacia Biotech), treated with Pfu DNA polymerase (Stratagene) to fill in restriction site overhangs, and transfected into competent E. coli cells. Transformed cells are selected on antibiotic-containing media, and individual colonies are picked and cultured overnight at 37°C in 384-well plates in LB/2x carbenicillin liquid media.
The cells are lysed, and DNA is amplified by PCR using Taq DNA polymerase (Amersham Pharmacia Biotech) and Pfu DNA polymerase (Stratagene) with the following parameters: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 72°C, 2 min; Step 5: steps 2, 3, and 4 repeated 29 times; Step 6: 72°C, 5 min; Step 7: storage at 4°C. DNA is quantified by PICOGREEN reagent (Molecular Probes) as described above. Samples with low DNA recoveries are reamplified using the same conditions as described above. Samples are diluted with 20% dimethylsulfoxide (1:2, v/v), and sequenced using DYENAMIC energy transfer sequencing primers and the DYENAMIC DIRECT kit (Amersham Pharmacia Biotech) or the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems). In like manner, the sptm is used to obtain regulatory sequences (promoters, introns, and enhancers) using the procedure above, oligonucleotides designed for such extension, and an appropriate genomic library.
IX. Labeling of Probes and Southern Hybridization Analyses
Hybridization probes derived from the sptm of the Sequence Listing are employed for screening cDNAs, mRNAs, or genomic DNA. The labeling of probe nucleotides between 100 and 1000 nucleotides in length is specifically described, but essentially the same procedure may be used with larger cDNA fragments. Probe sequences are labeled at room temperature for 30 minutes using a T4 polynucleotide kinase, γ32P-ATP, and 0.5X One-Phor-All Plus (Amersham Pharmacia Biotech) buffer and purified using a ProbeQuant G-50 Microcolumn (Amersham Pharmacia Biotech). The probe mixture is diluted to 10^7 dpm/μg/ml hybridization buffer and used in a typical membrane-based hybridization analysis.
The DNA is digested with a restriction endonuclease such as EcoRV and is electrophoresed through a 0.7% agarose gel. The DNA fragments are transferred from the agarose to nylon membrane (NYTRAN Plus, Schleicher & Schuell, Inc., Keene NH) using procedures specified by the manufacturer of the membrane. Prehybridization is carried out for three or more hours at 68°C, and hybridization is carried out overnight at 68°C. To remove non-specific signals, blots are sequentially washed at room temperature under increasingly stringent conditions, up to 0.1x saline sodium citrate (SSC) and 0.5% sodium dodecyl sulfate. After the blots are placed in a PHOSPHORIMAGER cassette (Molecular Dynamics) or are exposed to autoradiography film, hybridization patterns of standard and experimental lanes are compared. Essentially the same procedure is employed when screening RNA.
X. Chromosome Mapping of sptm
The cDNA sequences which were used to assemble SEQ ID NO:1-184 are compared with sequences from the Incyte LIFESEQ database and public domain databases using BLAST and other implementations of the Smith-Waterman algorithm. Sequences from these databases that match SEQ ID NO:1-184 are assembled into clusters of contiguous and overlapping sequences using assembly algorithms such as PHRAP (Table 6). Radiation hybrid and genetic mapping data available from public resources such as the Stanford Human Genome Center (SHGC), Whitehead Institute for Genome Research (WIGR), and Genethon are used to determine if any of the clustered sequences have been previously mapped. Inclusion of a mapped sequence in a cluster will result in the assignment of all sequences of that cluster, including its particular SEQ ID NO:, to that map location. The genetic map locations of SEQ ID NO:1-184 are described as ranges, or intervals, of human chromosomes. The map position of an interval, in centiMorgans, is measured relative to the terminus of the chromosome's p-arm. (The centiMorgan (cM) is a unit of measurement based on recombination frequencies between chromosomal markers. On average, 1 cM is roughly equivalent to 1 megabase (Mb) of DNA in humans, although this can vary widely due to hot and cold spots of recombination.) The cM distances are based on genetic markers mapped by Genethon which provide boundaries for radiation hybrid markers whose sequences were included in each of the clusters.
XI. Microarray Analysis
Probe Preparation from Tissue or Cell Samples
Total RNA is isolated from tissue samples using the guanidinium thiocyanate method and polyA+ RNA is purified using the oligo (dT) cellulose method. Each polyA+ RNA sample is reverse transcribed using MMLV reverse-transcriptase, 0.05 pg/μl oligo-dT primer (21mer), 1X first strand buffer, 0.03 units/μl RNase inhibitor, 500 μM dATP, 500 μM dGTP, 500 μM dTTP, 40 μM dCTP, 40 μM dCTP-Cy3 (BDS) or dCTP-Cy5 (Amersham Pharmacia Biotech). The reverse transcription reaction is performed in a 25 ml volume containing 200 ng polyA+ RNA with GEMBRIGHT kits (Incyte). Specific control polyA+ RNAs are synthesized by in vitro transcription from non-coding yeast genomic DNA (W. Lei, unpublished). As quantitative controls, the control mRNAs at 0.002 ng, 0.02 ng, 0.2 ng, and 2 ng are diluted into reverse transcription reaction at ratios of 1:100,000, 1:10,000, 1:1000, 1:100 (w/w) to sample mRNA respectively. The control mRNAs are diluted into reverse transcription reaction at ratios of 1:3, 3:1, 1:10, 10:1, 1:25, 25:1 (w/w) to sample mRNA differential expression patterns. After incubation at 37°C for 2 hr, each reaction sample (one with Cy3 and another with Cy5 labeling) is treated with 2.5 ml of 0.5M sodium hydroxide and incubated for 20 minutes at 85°C to stop the reaction and degrade the RNA. Probes are purified using two successive CHROMA SPIN 30 gel filtration spin columns (CLONTECH Laboratories, Inc. (CLONTECH), Palo Alto CA) and after combining, both reaction samples are ethanol precipitated using 1 ml of glycogen (1 mg/ml), 60 ml sodium acetate, and 300 ml of 100% ethanol. The probe is then dried to completion using a SpeedVAC (Savant Instruments Inc., Holbrook NY) and resuspended in 14 μl 5X SSC/0.2% SDS.
Microarray Preparation
Sequences of the present invention are used to generate array elements. Each array element is amplified from bacterial cells containing vectors with cloned cDNA inserts. PCR amplification uses primers complementary to the vector sequences flanking the cDNA insert. Array elements are amplified in thirty cycles of PCR from an initial quantity of 1-2 ng to a final quantity greater than 5 μg. Amplified array elements are then purified using SEPHACRYL-400 (Amersham Pharmacia Biotech). Purified array elements are immobilized on polymer-coated glass slides. Glass microscope slides (Corning) are cleaned by ultrasound in 0.1% SDS and acetone, with extensive distilled water washes between and after treatments. Glass slides are etched in 4% hydrofluoric acid (VWR Scientific Products Corporation (VWR), West Chester, PA), washed extensively in distilled water, and coated with 0.05% aminopropyl silane (Sigma) in 95% ethanol. Coated slides are cured in a 110°C oven.
Array elements are applied to the coated glass substrate using a procedure described in US Patent No. 5,807,522, incorporated herein by reference. 1 μl of the array element DNA, at an average concentration of 100 ng/μl, is loaded into the open capillary printing element by a high-speed robotic apparatus. The apparatus then deposits about 5 nl of array element sample per slide.
Microarrays are UV-crosslinked using a STRATALINKER UV-crosslinker (Stratagene). Microarrays are washed at room temperature once in 0.2% SDS and three times in distilled water. Non-specific binding sites are blocked by incubation of microarrays in 0.2% casein in phosphate buffered saline (PBS) (Tropix, Inc., Bedford, MA) for 30 minutes at 60°C followed by washes in 0.2% SDS and distilled water as before.
Hybridization
Hybridization reactions contain 9 μl of probe mixture consisting of 0.2 μg each of Cy3 and Cy5 labeled cDNA synthesis products in 5X SSC, 0.2% SDS hybridization buffer. The probe mixture is heated to 65°C for 5 minutes and is aliquoted onto the microarray surface and covered with a 1.8 cm2 coverslip. The arrays are transferred to a waterproof chamber having a cavity just slightly larger than a microscope slide. The chamber is kept at 100% humidity internally by the addition of 140 μl of 5x SSC in a corner of the chamber. The chamber containing the arrays is incubated for about 6.5 hours at 60°C. The arrays are washed for 10 min at 45°C in a first wash buffer (1X SSC, 0.1% SDS), three times for 10 minutes each at 45°C in a second wash buffer (0.1X SSC), and dried.
Detection
Reporter-labeled hybridization complexes are detected with a microscope equipped with an Innova 70 mixed gas 10 W laser (Coherent, Inc., Santa Clara CA) capable of generating spectral lines at 488 nm for excitation of Cy3 and at 632 nm for excitation of Cy5. The excitation laser light is focused on the array using a 20X microscope objective (Nikon, Inc., Melville NY). The slide containing the array is placed on a computer-controlled X-Y stage on the microscope and raster-scanned past the objective. The 1.8 cm x 1.8 cm array used in the present example is scanned with a resolution of 20 micrometers.
In two separate scans, a mixed gas multiline laser excites the two fluorophores sequentially. Emitted light is split, based on wavelength, into two photomultiplier tube detectors (PMT R1477, Hamamatsu Photonics Systems, Bridgewater NJ) corresponding to the two fluorophores. Appropriate filters positioned between the array and the photomultiplier tubes are used to filter the signals. The emission maxima of the fluorophores used are 565 nm for Cy3 and 650 nm for Cy5. Each array is typically scanned twice, one scan per fluorophore using the appropriate filters at the laser source, although the apparatus is capable of recording the spectra from both fluorophores simultaneously. The sensitivity of the scans is typically calibrated using the signal intensity generated by a cDNA control species added to the probe mix at a known concentration. A specific location on the array contains a complementary DNA sequence, allowing the intensity of the signal at that location to be correlated with a weight ratio of hybridizing species of 1:100,000. When two probes from different sources (e.g., representing test and control cells), each labeled with a different fluorophore, are hybridized to a single array for the purpose of identifying genes that are differentially expressed, the calibration is done by labeling samples of the calibrating cDNA with the two fluorophores and adding identical amounts of each to the hybridization mixture.
The output of the photomultiplier tube is digitized using a 12-bit RTI-835H analog-to-digital (A/D) conversion board (Analog Devices, Inc., Norwood, MA) installed in an IBM-compatible PC computer. The digitized data are displayed as an image where the signal intensity is mapped using a linear 20-color transformation to a pseudocolor scale ranging from blue (low signal) to red (high signal). The data are also analyzed quantitatively. Where two different fluorophores are excited and measured simultaneously, the data are first corrected for optical crosstalk (due to overlapping emission spectra) between the fluorophores using each fluorophore's emission spectrum.
A grid is superimposed over the fluorescence signal image such that the signal from each spot is centered in each element of the grid. The fluorescence signal within each element is then integrated to obtain a numerical value corresponding to the average intensity of the signal. The software used for signal analysis is the GEMTOOLS gene expression analysis program (Incyte).
XII. Complementary Nucleic Acids
Sequences complementary to the sptm are used to detect, decrease, or inhibit expression of the naturally occurring nucleotide. The use of oligonucleotides comprising from about 15 to 30 base pairs is typical in the art. However, smaller or larger sequence fragments can also be used. Appropriate oligonucleotides are designed from the sptm using OLIGO 4.06 software (National Biosciences) or other appropriate programs and are synthesized using methods standard in the art or ordered from a commercial supplier. To inhibit transcription, a complementary oligonucleotide is designed from the most unique 5' sequence and used to prevent transcription factor binding to the promoter sequence. To inhibit translation, a complementary oligonucleotide is designed to prevent ribosomal binding and processing of the transcript.
XIII. Expression of SPTM
Expression and purification of SPTM is accomplished using bacterial or virus-based expression systems. For expression of SPTM in bacteria, cDNA is subcloned into an appropriate vector containing an antibiotic resistance gene and an inducible promoter that directs high levels of cDNA transcription. Examples of such promoters include, but are not limited to, the trp-lac (tac) hybrid promoter and the T5 or T7 bacteriophage promoter in conjunction with the lac operator regulatory element. Recombinant vectors are transformed into suitable bacterial hosts, e.g., BL21(DE3). Antibiotic resistant bacteria express SPTM upon induction with isopropyl beta-D-thiogalactopyranoside (IPTG). Expression of SPTM in eukaryotic cells is achieved by infecting insect or mammalian cell lines with recombinant Autographa californica nuclear polyhedrosis virus (AcMNPV), commonly known as baculovirus. The nonessential polyhedrin gene of baculovirus is replaced with cDNA encoding SPTM by either homologous recombination or bacterial-mediated transposition involving transfer plasmid intermediates. Viral infectivity is maintained and the strong polyhedrin promoter drives high levels of cDNA transcription. Recombinant baculovirus is used to infect Spodoptera frugiperda (Sf9) insect cells in most cases, or human hepatocytes, in some cases. Infection of the latter requires additional genetic modifications to baculovirus. (See e.g., Engelhard, supra; and Sandig, supra.)
In most expression systems, SPTM is synthesized as a fusion protein with, e.g., glutathione S-transferase (GST) or a peptide epitope tag, such as FLAG or 6-His, permitting rapid, single-step, affinity-based purification of recombinant fusion protein from crude cell lysates. GST, a 26-kilodalton enzyme from Schistosoma japonicum, enables the purification of fusion proteins on immobilized glutathione under conditions that maintain protein activity and antigenicity (Amersham Pharmacia Biotech). Following purification, the GST moiety can be proteolytically cleaved from SPTM at specifically engineered sites. FLAG, an 8-amino acid peptide, enables immunoaffinity purification using commercially available monoclonal and polyclonal anti-FLAG antibodies (Eastman Kodak Company, Rochester NY). 6-His, a stretch of six consecutive histidine residues, enables purification on metal-chelate resins (QIAGEN). Methods for protein expression and purification are discussed in Ausubel (1995, supra, Chapters 10 and 16). Purified SPTM obtained by these methods can be used directly in the following activity assay.
XIV. Demonstration of SPTM Activity
An assay for SPTM activity measures the expression of SPTM on the cell surface. cDNA encoding SPTM is subcloned into an appropriate mammalian expression vector suitable for high levels of cDNA expression. The resulting construct is transfected into a nonhuman cell line such as NIH3T3. Cell surface proteins are labeled with biotin using methods known in the art. Immunoprecipitations are performed using SPTM-specific antibodies, and immunoprecipitated samples are analyzed using SDS-PAGE and immunoblotting techniques. The ratio of labeled immunoprecipitant to unlabeled immunoprecipitant is proportional to the amount of SPTM expressed on the cell surface. Alternatively, an assay for SPTM activity measures the amount of SPTM in secretory, membrane-bound organelles. Transfected cells as described above are harvested and lysed. The lysate is fractionated using methods known to those of skill in the art, for example, sucrose gradient ultracentrifugation. Such methods allow the isolation of subcellular components such as the Golgi apparatus, ER, small membrane-bound vesicles, and other secretory organelles. Immunoprecipitations from fractionated and total cell lysates are performed using SPTM-specific antibodies, and immunoprecipitated samples are analyzed using SDS-PAGE and immunoblotting techniques. The concentration of SPTM in secretory organelles relative to SPTM in total cell lysate is proportional to the amount of SPTM in transit through the secretory pathway.
XV. Functional Assays
SPTM function is assessed by expressing sptm at physiologically elevated levels in mammalian cell culture systems. cDNA is subcloned into a mammalian expression vector containing a strong promoter that drives high levels of cDNA expression. Vectors of choice include pCMV SPORT (Life Technologies) and pCR3.1 (Invitrogen Corporation, Carlsbad CA), both of which contain the cytomegalovirus promoter. 5-10 μg of recombinant vector are transiently transfected into a human cell line, preferably of endothelial or hematopoietic origin, using either liposome formulations or electroporation. 1-2 μg of an additional plasmid containing sequences encoding a marker protein are co-transfected.
Expression of a marker protein provides a means to distinguish transfected cells from nontransfected cells and is a reliable predictor of cDNA expression from the recombinant vector. Marker proteins of choice include, e.g., Green Fluorescent Protein (GFP; CLONTECH), CD64, or a CD64-GFP fusion protein. Flow cytometry (FCM), an automated laser optics-based technique, is used to identify transfected cells expressing GFP or CD64-GFP and to evaluate the apoptotic state of the cells and other cellular properties.
FCM detects and quantifies the uptake of fluorescent molecules that diagnose events preceding or coincident with cell death. These events include changes in nuclear DNA content as measured by staining of DNA with propidium iodide; changes in cell size and granularity as measured by forward light scatter and 90 degree side light scatter; down-regulation of DNA synthesis as measured by decrease in bromodeoxyuridine uptake; alterations in expression of cell surface and intracellular proteins as measured by reactivity with specific antibodies; and alterations in plasma membrane composition as measured by the binding of fluorescein-conjugated Annexin V protein to the cell surface. Methods in flow cytometry are discussed in Ormerod, M. G. (1994) Flow Cytometry, Oxford, New York NY.
The influence of SPTM on gene expression can be assessed using highly purified populations of cells transfected with sequences encoding SPTM and either CD64 or CD64-GFP. CD64 and CD64-GFP are expressed on the surface of transfected cells and bind to conserved regions of human immunoglobulin G (IgG). Transfected cells are efficiently separated from nontransfected cells using magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Inc., Lake Success NY). mRNA can be purified from the cells using methods well known by those of skill in the art. Expression of mRNA encoding SPTM and other genes of interest can be analyzed by northern analysis or microarray techniques.
XVI. Production of Antibodies
SPTM substantially purified using polyacrylamide gel electrophoresis (PAGE; see, e.g., Harrington, M.G. (1990) Methods Enzymol. 182:488-495), or other purification techniques, is used to immunize rabbits and to produce antibodies using standard protocols. Alternatively, the SPTM amino acid sequence is analyzed using LASERGENE software (DNASTAR) to determine regions of high immunogenicity, and a corresponding peptide is synthesized and used to raise antibodies by means known to those of skill in the art. Methods for selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, are well described in the art. (See, e.g., Ausubel, 1995, supra, Chapter 11.)
Typically, peptides 15 residues in length are synthesized using an ABI 431A peptide synthesizer (Applied Biosystems) using fmoc chemistry and coupled to KLH (Sigma) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS) to increase immunogenicity. (See, e.g., Ausubel, supra.) Rabbits are immunized with the peptide-KLH complex in complete Freund's adjuvant. Resulting antisera are tested for antipeptide activity by, for example, binding the peptide to plastic, blocking with 1% BSA, reacting with rabbit antisera, washing, and reacting with radioiodinated goat anti-rabbit IgG. Antisera with antipeptide activity are tested for anti-SPTM activity using protocols well known in the art, including ELISA, RIA, and immunoblotting.
XVII. Purification of Naturally Occurring SPTM Using Specific Antibodies
Naturally occurring or recombinant SPTM is substantially purified by immunoaffinity chromatography using antibodies specific for SPTM. An immunoaffinity column is constructed by covalently coupling anti-SPTM antibody to an activated chromatographic resin, such as CNBr-activated SEPHAROSE (Amersham Pharmacia Biotech). After the coupling, the resin is blocked and washed according to the manufacturer's instructions.
Media containing SPTM are passed over the immunoaffinity column, and the column is washed under conditions that allow the preferential absorbance of SPTM (e.g., high ionic strength buffers in the presence of detergent). The column is eluted under conditions that disrupt antibody/SPTM binding (e.g., a buffer of pH 2 to pH 3, or a high concentration of a chaotrope, such as urea or thiocyanate ion), and SPTM is collected.
XVIII. Identification of Molecules Which Interact with SPTM
SPTM, or biologically active fragments thereof, are labeled with 125I Bolton-Hunter reagent. (See, e.g., Bolton, A.E. and W.M. Hunter (1973) Biochem. J. 133:529-539.) Candidate molecules previously arrayed in the wells of a multi-well plate are incubated with the labeled SPTM, washed, and any wells with labeled SPTM complex are assayed. Data obtained using different concentrations of SPTM are used to calculate values for the number, affinity, and association of SPTM with the candidate molecules. Alternatively, molecules interacting with SPTM are analyzed using the yeast two-hybrid system as described in Fields, S. and O. Song (1989) Nature 340:245-246, or using commercially available kits based on the two-hybrid system, such as the MATCHMAKER system (CLONTECH).
SPTM may also be used in the PATHCALLING process (CuraGen Corp., New Haven CT) which employs the yeast two-hybrid system in a high-throughput manner to determine aU interactions between the proteins encoded by two large Mbraries of genes (Nandabalan, K. et al. (2000) U.S. Patent No. 6,057,101).
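The calculation of binding-site number and affinity from data taken at different concentrations of labeled SPTM can be illustrated with a short sketch. The specification does not name an analysis method; the example below assumes a single class of non-interacting sites and uses a Scatchard linearization (bound/free plotted against bound), which is only one common choice. The function name `scatchard_fit` and the data values are hypothetical and serve only to show the arithmetic.

```python
from typing import Sequence, Tuple


def scatchard_fit(bound: Sequence[float], free: Sequence[float]) -> Tuple[float, float]:
    """Estimate (Kd, Bmax) by a least-squares line through a Scatchard plot.

    Assumes one class of independent binding sites, so that
    bound/free = Bmax/Kd - bound/Kd (slope = -1/Kd, x-intercept = Bmax).
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired bound/free measurements")

    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]  # bound/free ratio per well

    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("bound values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    kd = -1.0 / slope           # dissociation constant (affinity)
    bmax = -intercept / slope   # number of binding sites (x-intercept)
    return kd, bmax


if __name__ == "__main__":
    # Synthetic, illustrative data only (not from the specification):
    # generated from bound = Bmax * free / (Kd + free) with Kd = 2, Bmax = 10.
    free_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
    bound_conc = [10.0 * f / (2.0 + f) for f in free_conc]
    kd, bmax = scatchard_fit(bound_conc, free_conc)
    print(f"Kd = {kd:.3f}, Bmax = {bmax:.3f}")
```

With real assay data, a nonlinear fit of the binding isotherm is generally less sensitive to measurement noise than the linearized form; the linear version is shown here only because it makes the relation between slope, intercept, affinity, and site number explicit.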
AU pubMcations and patents mentioned in the above specification are herein incoφorated by reference. Various modifications and variations of the described method and system of the invention will be apparent to those skiUed in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly Mmited to such specific embodiments. Indeed, various modifications of the above-described modes for carrying out the invention which are obvious to those skiUed in the field of molecular biology or related fields are intended to be within the scope of the foUowing claims.
TABLE 1
SEQ ID NO: Template ID SEQ ID NO: ORF ID
1 LG:983076.3:2000SEP08 185 LG:983076.3.orf3:2000SEP08
2 LG:1382987.7:2000SEP08 186 LG:1382987.7.orf2:2000SEP08
3 LG:235557.15:2000SEP08 187 LG:235557.15.orf2:2000SEP08
4 LG:018494.1:2000SEP08 188 LG:018494.1.orf2:2000SEP08
5 LG:980494.1:2000SEP08 189 LG:980494.1.orf3:2000SEP08
6 LG:984457.2:2000SEP08 190 LG:984457.2.orf1:2000SEP08
7 LG:406758.1:2000SEP08 191 LG:406758.1.orf1:2000SEP08
8 LG:902957.17:2000SEP08 192 LG:902957.17.orf3:2000SEP08
9 LG:333179.1:2000SEP08 193 LG:333179.1.orf3:2000SEP08
10 LG:406568.1:2000SEP08 194 LG:406568.1.orf2:2000SEP08
11 LG:353203.1:2000SEP08 195 LG:353203.1.orf1:2000SEP08
12 LG:061277.1:2000SEP08 196 LG:061277.1.orf3:2000SEP08
13 LG:170666.1:2000SEP08 197 LG:170666.1.orf2:2000SEP08
14 LG:311197.1:2000SEP08 198 LG:311197.1.orf2:2000SEP08
15 LG:220655.4:2000SEP08 199 LG:220655.4.orf2:2000SEP08
16 LG:1001893.1:2000SEP08 200 LG:1001893.1.orf2:2000SEP08
17 LG:004335.1:2000SEP08 201 LG:004335.1.orf1:2000SEP08
18 LG:213092.6:2000SEP08 202 LG:213092.6.orf2:2000SEP08
19 LG:407570.5:2000SEP08 203 LG:407570.5.orf3:2000SEP08
20 LG:337835.8:2000SEP08 204 LG:337835.8.orf1:2000SEP08
21 LG:1099283.1:2000SEP08 205 LG:1099283.1.orf2:2000SEP08
22 LG:401274.2:2000SEP08 206 LG:401274.2.orf2:2000SEP08
23 LG:222880.1:2000SEP08 207 LG:222880.1.orf3:2000SEP08
24 LG:406389.1:2000SEP08 208 LG:406389.1.orf1:2000SEP08
25 LG:055461.1:2000SEP08 209 LG:055461.1.orf2:2000SEP08
26 LG:979059.5:2000SEP08 210 LG:979059.5.orf1:2000SEP08
27 LG:399238.1:2000SEP08 211 LG:399238.1.orf1:2000SEP08
28 LG:1382945.7:2000SEP08 212 LG:1382945.7.orf1:2000SEP08
29 LG:1383610.3:2000SEP08 213 LG:1383610.3.orf2:2000SEP08
30 LG:1384030.1:2000SEP08 214 LG:1384030.1.orf3:2000SEP08
31 LG:390475.1:2000SEP08 215 LG:390475.1.orf1:2000SEP08
32 LG:229105.3:2000SEP08 216 LG:229105.3.orf2:2000SEP08
33 LG:232578.3:2000SEP08 217 LG:232578.3.orf1:2000SEP08
34 LG:1166387.9:2000SEP08 218 LG:1166387.9.orf1:2000SEP08
35 LG:351357.1:2000SEP08 219 LG:351357.1.orf1:2000SEP08
36 LG:465592.1:2000SEP08 220 LG:465592.1.orf1:2000SEP08
37 LG:006848.5:2000SEP08 221 LG:006848.5.orf1:2000SEP08
38 LG:198450.2:2000SEP08 222 LG:198450.2.orf2:2000SEP08
39 LG:1008175.1:2000SEP08 223 LG:1008175.1.orf1:2000SEP08
40 LG:437981.11:2000SEP08 224 LG:437981.11.orf2:2000SEP08
41 LG:1025549.1:2000SEP08 225 LG:1025549.1.orf3:2000SEP08
42 LG:327226.16:2000SEP08 226 LG:327226.16.orf1:2000SEP08
43 LG:1387394.5:2000SEP08 227 LG:1387394.5.orf3:2000SEP08
44 LG:445188.3:2000SEP08 228 LG:445188.3.orf3:2000SEP08
45 LG:898864.11:2000SEP08 229 LG:898864.11.orf3:2000SEP08
46 LG:018739.2:2000SEP08 230 LG:018739.2.orf1:2000SEP08
47 LG:302915.6:2000SEP08 231 LG:302915.6.orf3:2000SEP08
48 LG:404418.3:2000SEP08 232 LG:404418.3.orf1:2000SEP08
49 LG:374853.2:2000SEP08 233 LG:374853.2.orf1:2000SEP08
50 LG:228930.1:2000SEP08 234 LG:228930.1.orf2:2000SEP08
51 LG:273593.6:2000SEP08 235 LG:273593.6.orf1:2000SEP08
52 LG:008215.1:2000SEP08 236 LG:008215.1.orf3:2000SEP08
53 LG:337160.1:2000SEP08 237 LG:337160.1.orf1:2000SEP08
54 LG:395063.1:2000SEP08 238 LG:395063.1.orf3:2000SEP08
55 LG:979069.4:2000SEP08 239 LG:979069.4.orf1:2000SEP08
56 LG:346663.5:2000SEP08 240 LG:346663.5.orf3:2000SEP08
57 LG:347615.1:2000SEP08 241 LG:347615.1.orf3:2000SEP08
58 LG:1397067.1:2000SEP08 242 LG:1397067.1.orf1:2000SEP08
59 LG:120675.1:2000SEP08 243 LG:120675.1.orf1:2000SEP08
60 LG:420050.18:2000SEP08 244 LG:420050.18.orf3:2000SEP08
61 LG:220495.3:2000SEP08 245 LG:220495.3.orf1:2000SEP08
62 LG:274551.1:2000SEP08 246 LG:274551.1.orf1:2000SEP08
63 LG:429658.27:2000SEP08 247 LG:429658.27.orf2:2000SEP08
64 LG:246194.18:2000SEP08 248 LG:246194.18.orf1:2000SEP08
65 LG:000874.1:2000SEP08 249 LG:000874.1.orf1:2000SEP08
66 LG:239967.7:2000SEP08 250 LG:239967.7.orf1:2000SEP08
67 LG:238388.1:2000SEP08 251 LG:238388.1.orf3:2000SEP08
68 LG:233674.4:2000SEP08 252 LG:233674.4.orf1:2000SEP08
69 LG:411327.2:2000SEP08 253 LG:411327.2.orf3:2000SEP08
70 LG:1327310.1:2000SEP08 254 LG:1327310.1.orf1:2000SEP08
71 LG:242019.13:2000SEP08 255 LG:242019.13.orf2:2000SEP08
72 LG:012432.12:2000SEP08 256 LG:012432.12.orf1:2000SEP08
73 LG:257088.9:2000SEP08 257 LG:257088.9.orf3:2000SEP08
74 LG:997505.5:2000SEP08 258 LG:997505.5.orf2:2000SEP08
75 LG:481436.2:2000SEP08 259 LG:481436.2.orf3:2000SEP08
76 LG:247776.14:2000SEP08 260 LG:247776.14.orf1:2000SEP08
77 LG:008606.14:2000SEP08 261 LG:008606.14.orf2:2000SEP08
78 LG:985092.3:2000SEP08 262 LG:985092.3.orf2:2000SEP08
79 LG:236649.7:2000SEP08 263 LG:236649.7.orf2:2000SEP08
80 LG:245014.2:2000SEP08 264 LG:245014.2.orf1:2000SEP08
81 LG:170754.4:2000SEP08 265 LG:170754.4.orf2:2000SEP08
82 LG:988028.1:2000SEP08 266 LG:988028.1.orf1:2000SEP08
83 LG:427997.6:2000SEP08 267 LG:427997.6.orf3:2000SEP08
84 LG:464206.1:2000SEP08 268 LG:464206.1.orf2:2000SEP08
85 LG:1400108.1:2000SEP08 269 LG:1400108.1.orf3:2000SEP08
86 LG:254531.1:2000SEP08 270 LG:254531.1.orf1:2000SEP08
87 LG:1101317.1:2000SEP08 271 LG:1101317.1.orf1:2000SEP08
88 LG:1074728.6:2000SEP08 272 LG:1074728.6.orf1:2000SEP08
89 LG:1081684.1:2000SEP08 273 LG:1081684.1.orf3:2000SEP08
90 LG:1076520.1:2000SEP08 274 LG:1076520.1.orf1:2000SEP08
91 LG:1079477.1:2000SEP08 275 LG:1079477.1.orf2:2000SEP08
92 LG:1076269.1:2000SEP08 276 LG:1076269.1.orf2:2000SEP08
93 LG:1087195.1:2000SEP08 277 LG:1087195.1.orf3:2000SEP08
94 LG:002588.7:2000SEP08 278 LG:002588.7.orf2:2000SEP08
95 LG:1079470.6:2000SEP08 279 LG:1079470.6.orf1:2000SEP08
96 LG:345705.3:2000SEP08 280 LG:345705.3.orf1:2000SEP08
97 LG:1083654.1:2000SEP08 281 LG:1083654.1.orf1:2000SEP08
98 LG:198782.3:2000SEP08 282 LG:198782.3.orf1:2000SEP08
99 LG:981076.2:2000SEP08 283 LG:981076.2.orf1:2000SEP08
100 LG:212023.3:2000SEP08 284 LG:212023.3.orf3:2000SEP08
101 LG:977929.3:2000SEP08 285 LG:977929.3.orf3:2000SEP08
102 LG:201936.6:2000SEP08 286 LG:201936.6.orf2:2000SEP08
103 LG:205642.1:2000SEP08 287 LG:205642.1.orf3:2000SEP08
104 LG:339653.6:2000SEP08 288 LG:339653.6.orf3:2000SEP08
105 LG:978587.4:2000SEP08 289 LG:978587.4.orf1:2000SEP08
106 LG:216848.17:2000SEP08 290 LG:216848.17.orf1:2000SEP08
107 LG:219502.1:2000SEP08 291 LG:219502.1.orf2:2000SEP08
108 LI:334211.1:2000SEP08 292 LI:334211.1.orf2:2000SEP08
109 LI:231024.2:2000SEP08 293 LI:231024.2.orf3:2000SEP08
110 LI:228425.5:2000SEP08 294 LI:228425.5.orf2:2000SEP08
111 LI:034493.1:2000SEP08 295 LI:034493.1.orf3:2000SEP08
112 LI:336218.1:2000SEP08 296 LI:336218.1.orf2:2000SEP08
113 LI:235891.3:2000SEP08 297 LI:235891.3.orf1:2000SEP08
114 LI:344094.1:2000SEP08 298 LI:344094.1.orf2:2000SEP08
115 LI:399945.2:2000SEP08 299 LI:399945.2.orf3:2000SEP08
116 LI:051849.1:2000SEP08 300 LI:051849.1.orf1:2000SEP08
117 LI:238379.3:2000SEP08 301 LI:238379.3.orf2:2000SEP08
118 LI:352190.8:2000SEP08 302 LI:352190.8.orf3:2000SEP08
119 LI:432120.1:2000SEP08 303 LI:432120.1.orf2:2000SEP08
120 LI:055461.1:2000SEP08 304 LI:055461.1.orf2:2000SEP08
121 LI:197433.5:2000SEP08 305 LI:197433.5.orf1:2000SEP08
122 LI:170604.1:2000SEP08 306 LI:170604.1.orf2:2000SEP08
123 LI:205057.3:2000SEP08 307 LI:205057.3.orf2:2000SEP08
124 LI:233795.1:2000SEP08 308 LI:233795.1.orf1:2000SEP08
125 LI:311197.1:2000SEP08 309 LI:311197.1.orf2:2000SEP08
126 LI:441364.1:2000SEP08 310 LI:441364.1.orf3:2000SEP08
127 LI:210367.6:2000SEP08 311 LI:210367.6.orf3:2000SEP08
128 LI:238194.5:2000SEP08 312 LI:238194.5.orf3:2000SEP08
129 LI:039258.5:2000SEP08 313 LI:039258.5.orf1:2000SEP08
130 LI:1071842.1:2000SEP08 314 LI:1071842.1.orf2:2000SEP08
131 LI:481356.3:2000SEP08 315 LI:481356.3.orf2:2000SEP08
132 LI:103474.1:2000SEP08 316 LI:103474.1.orf3:2000SEP08
133 LI:1073020.10:2000SEP08 317 LI:1073020.10.orf2:2000SEP08
134 LI:000874.1:2000SEP08 318 LI:000874.1.orf1:2000SEP08
135 LI:037298.2:2000SEP08 319 LI:037298.2.orf1:2000SEP08
136 LI:422901.1:2000SEP08 320 LI:422901.1.orf1:2000SEP08
137 LI:345815.1:2000SEP08 321 LI:345815.1.orf2:2000SEP08
138 LI:1072014.2:2000SEP08 322 LI:1072014.2.orf3:2000SEP08
139 LI:333138.3:2000SEP08 323 LI:333138.3.orf3:2000SEP08
140 LI:414253.1:2000SEP08 324 LI:414253.1.orf3:2000SEP08
141 LI:406389.1:2000SEP08 325 LI:406389.1.orf1:2000SEP08
142 LI:1086171.1:2000SEP08 326 LI:1086171.1.orf1:2000SEP08
143 LI:198782.4:2000SEP08 327 LI:198782.4.orf3:2000SEP08
144 LI:2030279.1:2000SEP08 328 LI:2030279.1.orf2:2000SEP08
145 LI:1018424.3:2000SEP08 329 LI:1018424.3.orf2:2000SEP08
146 LI:130969.1:2000SEP08 330 LI:130969.1.orf3:2000SEP08
147 LI:286246.2:2000SEP08 331 LI:286246.2.orf1:2000SEP08
148 LI:001527.1:2000SEP08 332 LI:001527.1.orf2:2000SEP08
149 LI:395063.1:2000SEP08 333 LI:395063.1.orf1:2000SEP08
150 LI:1064460.1:2000SEP08 334 LI:1064460.1.orf1:2000SEP08
151 LI:344690.2:2000SEP08 335 LI:344690.2.orf1:2000SEP08
152 LI:061585.4:2000SEP08 336 LI:061585.4.orf3:2000SEP08
153 LI:378428.1:2000SEP08 337 LI:378428.1.orf1:2000SEP08
154 LI:474108.2:2000SEP08 338 LI:474108.2.orf3:2000SEP08
155 LI:230711.2:2000SEP08 339 LI:230711.2.orf1:2000SEP08
156 LI:008942.1:2000SEP08 340 LI:008942.1.orf1:2000SEP08
157 LI:732479.1:2000SEP08 341 LI:732479.1.orf3:2000SEP08
158 LI:1190250.1:2000SEP08 342 LI:1190250.1.orf3:2000SEP08
159 LI:1013717.1:2000SEP08 343 LI:1013717.1.orf1:2000SEP08
160 LI:2049125.2:2000SEP08 344 LI:2049125.2.orf1:2000SEP08
161 LI:1092360.1:2000SEP08 345 LI:1092360.1.orf3:2000SEP08
162 LI:791524.1:2000SEP08 346 LI:791524.1.orf2:2000SEP08
163 LI:1084555.3:2000SEP08 347 LI:1084555.3.orf1:2000SEP08
164 LI:815418.2:2000SEP08 348 LI:815418.2.orf1:2000SEP08
165 LI:416766.1:2000SEP08 349 LI:416766.1.orf1:2000SEP08
166 LI:1171008.2:2000SEP08 350 LI:1171008.2.orf1:2000SEP08
167 LI:1169888.3:2000SEP08 351 LI:1169888.3.orf2:2000SEP08
168 LI:412592.1:2000SEP08 352 LI:412592.1.orf3:2000SEP08
169 LI:349808.1:2000SEP08 353 LI:349808.1.orf3:2000SEP08
170 LI:349164.2:2000SEP08 354 LI:349164.2.orf3:2000SEP08
171 LI:205413.1:2000SEP08 355 LI:205413.1.orf2:2000SEP08
172 LI:2051508.2:2000SEP08 356 LI:2051508.2.orf1:2000SEP08
173 LI:346242.2:2000SEP08 357 LI:346242.2.orf2:2000SEP08
174 LI:2052717.1:2000SEP08 358 LI:2052717.1.orf2:2000SEP08
175 LI:406668.2:2000SEP08 359 LI:406668.2.orf2:2000SEP08
175 LI:406668.2:2000SEP08 360 LI:406668.2.orf3:2000SEP08
176 LI:1178352.1:2000SEP08 361 LI:1178352.1.orf1:2000SEP08
177 LI:814014.7:2000SEP08 362 LI:814014.7.orf1:2000SEP08
178 LI:1170624.1:2000SEP08 363 LI:1170624.1.orf2:2000SEP08
179 LI:1183171.1:2000SEP08 364 LI:1183171.1.orf3:2000SEP08
180 LI:1093491.1:2000SEP08 365 LI:1093491.1.orf1:2000SEP08
181 LI:046515.5:2000SEP08 366 LI:046515.5.orf2:2000SEP08
182 LI:400171.2:2000SEP08 367 LI:400171.2.orf3:2000SEP08
183 LI:330919.6:2000SEP08 368 LI:330919.6.orf3:2000SEP08
184 LI:219502.1:2000SEP08 369 LI:219502.1.orf3:2000SEP08
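The identifiers in TABLE 1 appear to follow a regular pattern: a two-letter prefix, a numeric identifier with a numeric suffix, and a date stamp, with ORF identifiers carrying an additional `.orfN` field. The sketch below shows one way such identifiers could be parsed and a template matched to its ORF. The field names (`prefix`, `number`, `version`, `frame`, `date`) and their interpretation are assumptions made for illustration; the specification does not define them.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: PREFIX:number.version[.orfN]:date, e.g. LG:983076.3.orf3:2000SEP08
ID_PATTERN = re.compile(
    r"^(?P<prefix>[A-Z]{2}):(?P<number>\d+)\.(?P<version>\d+)"
    r"(?:\.orf(?P<frame>[1-3]))?:(?P<date>\d{4}[A-Z]{3}\d{2})$"
)


class SequenceId(NamedTuple):
    """Parsed identifier; field names are hypothetical labels."""
    prefix: str
    number: str            # kept as text to preserve leading zeros
    version: int
    frame: Optional[int]   # None for a template ID, 1-3 for an ORF ID
    date: str


def parse_id(text: str) -> SequenceId:
    """Parse a template or ORF identifier, raising ValueError if malformed."""
    match = ID_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized identifier: {text!r}")
    frame = match.group("frame")
    return SequenceId(
        prefix=match.group("prefix"),
        number=match.group("number"),
        version=int(match.group("version")),
        frame=int(frame) if frame is not None else None,
        date=match.group("date"),
    )


def same_template(template_id: str, orf_id: str) -> bool:
    """True if orf_id names an ORF of the template given by template_id."""
    t = parse_id(template_id)
    o = parse_id(orf_id)
    return (
        t.frame is None
        and o.frame is not None
        and (t.prefix, t.number, t.version, t.date)
        == (o.prefix, o.number, o.version, o.date)
    )


if __name__ == "__main__":
    print(parse_id("LG:983076.3.orf3:2000SEP08"))
    print(same_template("LG:983076.3:2000SEP08", "LG:983076.3.orf3:2000SEP08"))
```

A strict pattern of this kind also serves as a consistency check on a transcribed table, since any row whose identifier fails to parse, or whose ORF does not match its template, is flagged rather than silently accepted.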
TABLE 2
SEQ ID NO: Template ID Start Stop Frame Domain Type Topology
1 LG:983076.3:2000SEP08 775 861 forward 1 TM Nout
1 LG:983076.3:2000SEP08 479 565 forward 2 TM Nin
1 LG:983076.3:2000SEP08 707 793 forward 2 TM Nin
1 LG:983076.3:2000SEP08 114 200 forward 3 TM Nout
1 LG:983076.3:2000SEP08 261 317 forward 3 TM Nout
1 LG:983076.3:2000SEP08 501 587 forward 3 TM Nout
1 LG:983076.3:2000SEP08 738 794 forward 3 TM Nout
2 LG:1382987.7:2000SEP08 28 102 forward 1 TM Nout
3 LG:235557.15:2000SEP08 55 141 forward 1 TM Nin
3 LG:235557.15:2000SEP08 56 118 forward 2 TM Nin
3 LG:235557.15:2000SEP08 149 211 forward 2 TM Nin
3 LG:235557.15:2000SEP08 33 95 forward 3 TM Nin
3 LG:235557.15:2000SEP08 135 197 forward 3 TM Nin
4 LG:018494.1:2000SEP08 175 243 forward 1 TM Nin
4 LG:018494.1:2000SEP08 298 360 forward 1 TM Nin
4 LG:018494.1:2000SEP08 370 432 forward 1 TM Nin
4 LG:018494.1:2000SEP08 460 546 forward 1 TM Nin
4 LG:018494.1:2000SEP08 29 115 forward 2 TM Nout
4 LG:018494.1:2000SEP08 188 274 forward 2 TM Nout
4 LG:018494.1:2000SEP08 347 409 forward 2 TM Nout
4 LG:018494.1:2000SEP08 446 508 forward 2 TM Nout
4 LG:018494.1:2000SEP08 24 101 forward 3 TM Nin
4 LG:018494.1:2000SEP08 345 431 forward 3 TM Nin
4 LG:018494.1:2000SEP08 453 509 forward 3 TM Nin
4 LG:018494.1:2000SEP08 567 653 forward 3 TM Nin
5 LG:980494.1:2000SEP08 304 357 forward 1 TM Nin
5 LG:980494.1:2000SEP08 1267 1335 forward 1 TM Nin
5 LG:980494.1:2000SEP08 1339 1410 forward 1 TM Nin
5 LG:980494.1:2000SEP08 1453 1524 forward 1 TM Nin
5 LG:980494.1:2000SEP08 1564 1611 forward 1 TM Nin
5 LG:980494.1:2000SEP08 563 628 forward 2 TM Nout
5 LG:980494.1:2000SEP08 1364 1450 forward 2 TM Nout
5 LG:980494.1:2000SEP08 15 101 forward 3 TM Nin
5 LG:980494.1:2000SEP08 855 941 forward 3 TM Nin
5 LG:980494.1:2000SEP08 1251 1337 forward 3 TM Nin
5 LG:980494.1:2000SEP08 1374 1424 forward 3 TM Nin
5 LG:980494.1:2000SEP08 1470 1523 forward 3 TM Nin
6 LG:984457.2:2000SEP08 418 504 forward 1 TM Nout
6 LG:984457.2:2000SEP08 658 738 forward 1 TM Nout
6 LG:984457.2:2000SEP08 889 975 forward 1 TM Nout
6 LG:984457.2:2000SEP08 1126 1188 forward 1 TM Nout
6 LG:984457.2:2000SEP08 765 836 forward 3 TM Nin
7 LG:406758.1:2000SEP08 1072 1146 forward 1 TM Nout
7 LG:406758.1:2000SEP08 1228 1314 forward 1 TM Nout
7 LG:406758.1:2000SEP08 767 847 forward 2 TM Nin
7 LG:406758.1:2000SEP08 66 119 forward 3 TM Nout
7 LG:406758.1:2000SEP08 753 824 forward 3 TM Nout
7 LG:406758.1:2000SEP08 84[?] 911 forward 3 TM Nout
7 LG:406758.1:2000SEP08 1059 1121 forward 3 TM Nout
8 LG:902957.17:2000SEP08 145 216 forward 1 TM Nout
8 LG:902957.17:2000SEP08 161 232 forward 2 TM Nout
8 LG:902957.17:2000SEP08 18 77 forward 3 TM Nout
8 LG:902957.17:2000SEP08 132 218 forward 3 TM Nout
9 LG:333179.1:2000SEP08 109 171 forward 1 TM Nout
9 LG:333179.1:2000SEP08 211 273 forward 1 TM Nout
9 LG:333179.1:2000SEP08 334 381 forward 1 TM Nout
9 LG:333179.1:2000SEP08 600 647 forward 3 TM Nout
10 LG:406568.1:2000SEP08 1934 2017 forward 2 TM Nout
10 LG:406568.1:2000SEP08 2093 2173 forward 2 TM Nout
10 LG:406568.1:2000SEP08 12 59 forward 3 TM Nout
10 LG:406568.1:2000SEP08 498 557 forward 3 TM Nout
10 LG:406568.1:2000SEP08 1740 1826 forward 3 TM Nout
10 LG:406568.1:2000SEP08 1833 1907 forward 3 TM Nout
10 LG:406568.1:2000SEP08 1926 2012 forward 3 TM Nout
11 LG:353203.1:2000SEP08 95 151 forward 2 TM Nin
12 LG:061277.1:2000SEP08 217 303 forward 1 TM Nout
13 LG:170666.1:2000SEP08 84 134 forward 3 TM Nout
14 LG:311197.1:2000SEP08 241 315 forward 1 TM Nin
14 LG:311197.1:2000SEP08 527 613 forward 2 TM Nout
15 LG:220655.4:2000SEP08 200 268 forward 2 TM Nin
15 LG:220655.4:2000SEP08 341 403 forward 2 TM Nin
15 LG:220655.4:2000SEP08 198 284 forward 3 TM Nout
15 LG:220655.4:2000SEP08 675 722 forward 3 TM Nout
16 LG:1001893.1:2000SEP08 370 447 forward 1 TM Nout
16 LG:1001893.1:2000SEP08 463 534 forward 1 TM Nout
16 LG:1001893.1:2000SEP08 405 491 forward 3 TM Nin
16 LG:1001893.1:2000SEP08 666 752 forward 3 TM Nin
17 LG:004335.1:2000SEP08 607 660 forward 1 TM Nin
17 LG:004335.1:2000SEP08 877 963 forward 1 TM Nin
17 LG:004335.1:2000SEP08 1402 1485 forward 1 TM Nin
17 LG:004335.1:2000SEP08 1492 1578 forward 1 TM Nin
17 LG:004335.1:2000SEP08 1783 1839 forward 1 TM Nin
17 LG:004335.1:2000SEP08 2167 2253 forward 1 TM Nin
17 LG:004335.1:2000SEP08 1298 1384 forward 2 TM Nin
17 LG:004335.1:2000SEP08 1415 1486 forward 2 TM Nin
17 LG:004335.1:2000SEP08 2300 2386 forward 2 TM Nin
17 LG:004335.1:2000SEP08 1185 1241 forward 3 TM Nout
17 LG:004335.1:2000SEP08 1416 1502 forward 3 TM Nout
17 LG:004335.1:2000SEP08 1620 1685 forward 3 TM Nout
17 LG:004335.1:2000SEP08 1842 1925 forward 3 TM Nout
17 LG:004335.1:2000SEP08 2316 2393 forward 3 TM Nout
18 LG:213092.6:2000SEP08 217 303 forward 1 TM Nout
19 LG:407570.5:2000SEP08 433 519 forward 1 TM Nin
19 LG:407570.5:2000SEP08 11 85 forward 2 TM Nout
19 LG:407570.5:2000SEP08 440 493 forward 2 TM Nout
19 LG:407570.5:2000SEP08 569 625 forward 2 TM Nout
19 LG:407570.5:2000SEP08 414 500 forward 3 TM Nout
20 LG:337835.8:2000SEP08 70 141 forward 1 TM Nout
21 LG:1099283.1:2000SEP08 56 133 forward 2 TM Nin
21 LG:1099283.1:2000SEP08 173 259 forward 2 TM Nin
22 LG:401274.2:2000SEP08 292 348 forward 1 TM Nin
23 LG:222880.1:2000SEP08 211 297 forward 1 TM
23 LG:222880.1:2000SEP08 1714 1788 forward 1 TM
23 LG:222880.1:2000SEP08 1813 1899 forward 1 TM
23 LG:222880.1:2000SEP08 1930 2013 forward 1 TM
23 LG:222880.1:2000SEP08 2068 2133 forward 1 TM
23 LG:222880.1:2000SEP08 1667 1753 forward 2 TM Nout
23 LG:222880.1:2000SEP08 1805 1867 forward 2 TM Nout
23 LG:222880.1:2000SEP08 1910 1972 forward 2 TM Nout
23 LG:222880.1:2000SEP08 2273 2320 forward 2 TM Nout
23 LG:222880.1:2000SEP08 423 485 forward 3 TM Nin
23 LG:222880.1:2000SEP08 504 566 forward 3 TM Nin
23 LG:222880.1:2000SEP08 609 689 forward 3 TM Nin
23 LG:222880.1:2000SEP08 696 782 forward 3 TM Nin
23 LG:222880.1:2000SEP08 873 959 forward 3 TM Nin
23 LG:222880.1:2000SEP08 1014 1076 forward 3 TM Nin
23 LG:222880.1:2000SEP08 1104 1166 forward 3 TM Nin
23 LG:222880.1:2000SEP08 1542 1628 forward 3 TM Nin
23 LG:222880.1:2000SEP08 1689 1772 forward 3 TM Nin
23 LG:222880.1:2000SEP08 1938 2024 forward 3 TM Nin
23 LG:222880.1:2000SEP08 2313 2393 forward 3 TM Nin
24 LG:406389.1:2000SEP08 97 171 forward 1 TM Nin
24 LG:406389.1:2000SEP08 481 567 forward 1 TM Nin
24 LG:406389.1:2000SEP08 574 660 forward 1 TM Nin
24 LG:406389.1:2000SEP08 1517 1585 forward 2 TM Nout
24 LG:406389.1:2000SEP08 690 758 forward 3 TM Nin
24 LG:406389.1:2000SEP08 1485 1565 forward 3 TM Nin
25 LG:055461.1:2000SEP08 322 408 forward 1 TM Nin
25 LG:055461.1:2000SEP08 131 193 forward 2 TM Nout
25 LG:055461.1:2000SEP08 227 289 forward 2 TM Nout
25 LG:055461.1:2000SEP08 650 709 forward 2 TM Nout
25 LG:055461.1:2000SEP08 1400 1462 forward 2 TM Nout
25 LG:055461.1:2000SEP08 117 200 forward 3 TM Nout
25 LG:055461.1:2000SEP08 324 383 forward 3 TM Nout
25 LG:055461.1:2000SEP08 1326 1397 forward 3 TM Nout
26 LG:979059.5:2000SEP08 448 522 forward 1 TM Nout
27 LG:399238.1:2000SEP08 280 342 forward 1 TM Nin
27 LG:399238.1:2000SEP08 370 432 forward 1 TM Nin
27 LG:399238.1:2000SEP08 469 531 forward 1 TM Nin
27 LG:399238.1:2000SEP08 550 612 forward 1 TM Nin
27 LG:399238.1:2000SEP08 658 744 forward 1 TM Nin
27 LG:399238.1:2000SEP08 1559 1645 forward 2 TM Nout
27 LG:399238.1:2000SEP08 177 230 forward 3 TM
27 LG:399238.1:2000SEP08 1551 1637 forward 3 TM
28 LG:1382945.7:2000SEP08 73 159 forward 1 TM Nin
28 LG:1382945.7:2000SEP08 229 303 forward 1 TM Nin
28 LG:1382945.7:2000SEP08 77 139 forward 2 TM Nout
28 LG:1382945.7:2000SEP08 152 214 forward 2 TM Nout
28 LG:1382945.7:2000SEP08 251 319 forward 2 TM Nout
28 LG:1382945.7:2000SEP08 66 152 forward 3 TM Nout
28 LG:1382945.7:2000SEP08 237 296 forward 3 TM Nout
29 LG:1383610.3:2000SEP08 565 624 forward 1 TM Nout
29 LG:1383610.3:2000SEP08 916 993 forward 1 TM Nout
29 LG:1383610.3:2000SEP08 1126 1197 forward 1 TM Nout
29 LG:1383610.3:2000SEP08 1471 1557 forward 1 TM Nout
29 LG:1383610.3:2000SEP08 1861 1923 forward 1 TM Nout
29 LG:1383610.3:2000SEP08 1969 2031 forward 1 TM Nout
29 LG:1383610.3:2000SEP08 2053 2121 forward 1 TM Nout
29 LG:1383610.3:2000SEP08 2185 2247 forward 1 TM Nout
29 LG:1383610.3:2000SEP08 2275 2337 forward 1 TM Nout
29 LG:1383610.3:2000SEP08 2365 2427 forward 1 TM Nout
29 LG:1383610.3:2000SEP08 1460 1546 forward 2 TM
29 LG:1383610.3:2000SEP08 1658 1744 forward 2 TM
29 LG:1383610.3:2000SEP08 1832 1918 forward 2 TM
29 LG:1383610.3:2000SEP08 1934 1996 forward 2 TM
29 LG:1383610.3:2000SEP08 2021 2083 forward 2 TM
29 LG:1383610.3:2000SEP08 2153 2239 forward 2 TM
29 LG:1383610.3:2000SEP08 2309 2371 forward 2 TM
29 LG:1383610.3:2000SEP08 2399 2461 forward 2 TM
29 LG:1383610.3:2000SEP08 192 278 forward 3 TM Nout
29 LG:1383610.3:2000SEP08 1491 1577 forward 3 TM Nout
29 LG:1383610.3:2000SEP08 1824 1889 forward 3 TM Nout
29 LG:1383610.3:2000SEP08 1926 2012 forward 3 TM Nout
29 LG:1383610.3:2000SEP08 2067 2153 forward 3 TM Nout
29 LG:1383610.3:2000SEP08 2187 2273 forward 3 TM Nout
29 LG:1383610.3:2000SEP08 2319 2405 forward 3 TM Nout
30 LG:1384030.1:2000SEP08 370 420 forward 1 TM Nout
30 LG:1384030.1:2000SEP08 820 870 forward 1 TM Nout
30 LG:1384030.1:2000SEP08 1069 1155 forward 1 TM Nout
30 LG:1384030.1:2000SEP08 1252 1308 forward 1 TM Nout
30 LG:1384030.1:2000SEP08 1396 1470 forward 1 TM Nout
30 LG:1384030.1:2000SEP08 1561 1632 forward 1 TM Nout
30 LG:1384030.1:2000SEP08 1810 1896 forward 1 TM Nout
30 LG:1384030.1:2000SEP08 2203 2286 forward 1 TM Nout
30 LG:1384030.1:2000SEP08 83 160 forward 2 TM Nin
30 LG:1384030.1:2000SEP08 1046 1120 forward 2 TM Nin
30 LG:1384030.1:2000SEP08 1145 1219 forward 2 TM Nin
30 LG:1384030.1:2000SEP08 1376 1462 forward 2 TM Nin
30 LG:1384030.1:2000SEP08 1796 1882 forward 2 TM Nin
30 LG:1384030.1:2000SEP08 2003 2089 forward 2 TM Nin
30 LG:1384030.1:2000SEP08 924 1007 forward 3 TM
30 LG:1384030.1:2000SEP08 1083 [illegible] forward 3 TM
30 LG:1384030.1:2000SEP08 1185 1271 forward 3 TM
30 LG:1384030.1:2000SEP08 1419 1505 forward 3 TM
30 LG:1384030.1:2000SEP08 1572 1658 forward 3 TM
30 LG:1384030.1:2000SEP08 1878 1964 forward 3 TM
30 LG:1384030.1:2000SEP08 2079 2144 forward 3 TM
31 LG:390475.1:2000SEP08 559 621 forward 1 TM Nin
31 LG:390475.1:2000SEP08 1648 1704 forward 1 TM Nin
31 LG:390475.1:2000SEP08 542 592 forward 2 TM Nout
31 LG:390475.1:2000SEP08 935 1021 forward 2 TM Nout
31 LG:390475.1:2000SEP08 1217 1291 forward 2 TM Nout
31 LG:390475.1:2000SEP08 600 686 forward 3 TM
31 LG:390475.1:2000SEP08 1053 1103 forward 3 TM
31 LG:390475.1:2000SEP08 1182 1268 forward 3 TM
32 LG:229105.3:2000SEP08 592 642 forward 1 TM Nout
32 LG:229105.3:2000SEP08 578 664 forward 2 TM Nout
33 LG:232578.3:2000SEP08 163 228 forward 1 TM Nin
33 LG:232578.3:2000SEP08 226 276 forward 1 TM Nin
33 LG:232578.3:2000SEP08 295 375 forward 1 TM Nin
33 LG:232578.3:2000SEP08 1042 1119 forward 1 TM Nin
33 LG:232578.3:2000SEP08 155 241 forward 2 TM Nin
33 LG:232578.3:2000SEP08 308 376 forward 2 TM Nin
33 LG:232578.3:2000SEP08 599 685 forward 2 TM Nin
33 LG:232578.3:2000SEP08 704 781 forward 2 TM Nin
33 LG:232578.3:2000SEP08 998 1084 forward 2 TM Nin
33 LG:232578.3:2000SEP08 1154 1240 forward 2 TM Nin
33 LG:232578.3:2000SEP08 165 239 forward 3 TM Nin
33 LG:232578.3:2000SEP08 327 398 forward 3 TM Nin
33 LG:232578.3:2000SEP08 597 647 forward 3 TM Nin
33 LG:232578.3:2000SEP08 720 770 forward 3 TM Nin
33 LG:232578.3:2000SEP08 1008 1094 forward 3 TM Nin
33 LG:232578.3:2000SEP08 1128 1214 forward 3 TM Nin
34 LG:1166387.9:2000SEP08 565 615 forward 1 TM Nout
34 LG:1166387.9:2000SEP08 129 188 forward 3 TM
35 LG:351357.1:2000SEP08 310 396 forward 1 TM
35 LG:351357.1:2000SEP08 467 529 forward 2 TM Nin
35 LG:351357.1:2000SEP08 557 619 forward 2 TM Nin
35 LG:351357.1:2000SEP08 674 736 forward 2 TM Nin
35 LG:351357.1:2000SEP08 761 823 forward 2 TM Nin
35 LG:351357.1:2000SEP08 941 1021 forward 2 TM Nin
35 LG:351357.1:2000SEP08 186 257 forward 3 TM Nin
35 LG:351357.1:2000SEP08 1119 1199 forward 3 TM Nin
36 LG:465592.1:2000SEP08 163 225 forward 1 TM Nin
36 LG:465592.1:2000SEP08 247 309 forward 1 TM Nin
36 LG:465592.1:2000SEP08 409 480 forward 1 TM Nin
36 LG:465592.1:2000SEP08 44 94 forward 2 TM Nin
36 LG:465592.1:2000SEP08 221 301 forward 2 TM Nin
36 LG:465592.1:2000SEP08 36 104 forward 3 TM Nin
37 LG:006848.5:2000SEP08 628 690 forward 1 TM Nout
38 LG:198450.2:2000SEP08 245 331 forward 2 TM Nout
38 LG:198450.2:2000SEP08 416 469 forward 2 TM Nout
38 LG:198450.2:2000SEP08 803 889 forward 2 TM Nout
38 LG:198450.2:2000SEP08 750 818 forward 3 TM Nin
39 LG:1008175.1:2000SEP08 25 87 forward 1 TM Nout
39 LG:1008175.1:2000SEP08 115 177 forward 1 TM Nout
39 LG:1008175.1:2000SEP08 220 297 forward 1 TM Nout
39 LG:1008175.1:2000SEP08 352 438 forward 1 TM Nout
39 LG:1008175.1:2000SEP08 478 564 forward 1 TM Nout
39 LG:1008175.1:2000SEP08 649 705 forward 1 TM Nout
39 LG:1008175.1:2000SEP08 844 900 forward 1 TM Nout
39 LG:1008175.1:2000SEP08 407 475 forward 2 TM Nin
39 LG:1008175.1:2000SEP08 114 200 forward 3 TM Nout
39 LG:1008175.1:2000SEP08 309 392 forward 3 TM Nout
40 LG:437981.11:2000SEP08 17 79 forward 2 TM Nout
40 LG:437981.11:2000SEP08 95 157 forward 2 TM Nout
40 LG:437981.11:2000SEP08 381 467 forward 3 TM
41 LG:1025549.1:2000SEP08 161 241 forward 2 TM Nout
41 LG:1025549.1:2000SEP08 468 530 forward 3 TM Nout
41 LG:1025549.1:2000SEP08 540 602 forward 3 TM Nout
42 LG:327226.16:2000SEP08 178 246 forward 1 TM Nin
42 LG:327226.16:2000SEP08 256 312 forward 1 TM Nin
42 LG:327226.16:2000SEP08 325 387 forward 1 TM Nin
42 LG:327226.16:2000SEP08 403 465 forward 1 TM Nin
42 LG:327226.16:2000SEP08 224 301 forward 2 TM Nin
42 LG:327226.16:2000SEP08 353 439 forward 2 TM Nin
42 LG:327226.16:2000SEP08 252 311 forward 3 TM Nin
43 LG:1387394.5:2000SEP08 475 561 forward 1 TM Nin
43 LG:1387394.5:2000SEP08 619 705 forward 1 TM Nin
43 LG:1387394.5:2000SEP08 591 653 forward 3 TM Nin
43 LG:1387394.5:2000SEP08 669 731 forward 3 TM Nin
44 LG:445188.3:2000SEP08 124 177 forward 1 TM Nout
44 LG:445188.3:2000SEP08 196 258 forward 1 TM Nout
44 LG:445188.3:2000SEP08 271 333 forward 1 TM Nout
44 LG:445188.3:2000SEP08 346 408 forward 1 TM Nout
44 LG:445188.3:2000SEP08 116 187 forward 2 TM Nin
44 LG:445188.3:2000SEP08 467 526 forward 2 TM Nin
44 LG:445188.3:2000SEP08 102 188 forward 3 TM Nin
44 LG:445188.3:2000SEP08 195 281 forward 3 TM Nin
44 LG:445188.3:2000SEP08 285 350 forward 3 TM Nin
44 LG:445188.3:2000SEP08 390 476 forward 3 TM Nin
45 LG:898864.11:2000SEP08 106 192 forward 1 TM Nout
45 LG:898864.11:2000SEP08 50 133 forward 2 TM Nout
46 LG:018739.2:2000SEP08 124 189 forward 1 TM Nin
46 LG:018739.2:2000SEP08 454 516 forward 1 TM Nin
46 LG:018739.2:2000SEP08 553 615 forward 1 TM Nin
46 LG:018739.2:2000SEP08 128 208 forward 2 TM
46 LG:018739.2:2000SEP08 536 598 forward 2 TM
46 LG:018739.2:2000SEP08 623 685 forward 2 TM
46 LG:018739.2:2000SEP08 234 305 forward 3 TM Nout
46 LG:018739.2:2000SEP08 402 488 forward 3 TM Nout
46 LG:018739.2:2000SEP08 555 641 forward 3 TM Nout
47 LG:302915.6:2000SEP08 127 207 forward 1 TM Nin
47 LG:302915.6:2000SEP08 330 404 forward 3 TM Nout
47 LG:302915.6:2000SEP08 501 575 forward 3 TM Nout
47 LG:302915.6:2000SEP08 630 716 forward 3 TM Nout
48 LG:404418.3:2000SEP08 616 702 forward 1 TM Nout
48 LG:404418.3:2000SEP08 479 565 forward 2 TM Nout
48 LG:404418.3:2000SEP08 626 682 forward 2 TM Nout
49 LG:374853.2:2000SEP08 235 312 forward 1 TM Nout
49 LG:374853.2:2000SEP08 913 999 forward 1 TM Nout
49 LG:374853.2:2000SEP08 1276 1362 forward 1 TM Nout
49 LG:374853.2:2000SEP08 1384 1437 forward 1 TM Nout
49 LG:374853.2:2000SEP08 1606 1692 forward 1 TM Nout
49 LG:374853.2:2000SEP08 122 208 forward 2 TM Nin
49 LG:374853.2:2000SEP08 371 439 forward 2 TM Nin
49 LG:374853.2:2000SEP08 1121 1183 forward 2 TM Nin
49 LG:374853.2:2000SEP08 1208 1270 forward 2 TM Nin
49 LG:374853.2:2000SEP08 198 284 forward 3 TM Nin
49 LG:374853.2:2000SEP08 381 467 forward 3 TM Nin
49 LG:374853.2:2000SEP08 1311 1397 forward 3 TM Nin
50 LG:228930.1:2000SEP08 697 750 forward 1 TM Nout
50 LG:228930.1:2000SEP08 111 170 forward 3 TM Nin
50 LG:228930.1:2000SEP08 564 650 forward 3 TM Nin
50 LG:228930.1:2000SEP08 858 941 forward 3 TM Nin
51 LG:273593.6:2000SEP08 343 429 forward 1 TM
51 LG:273593.6:2000SEP08 707 763 forward 2 TM Nout
52 LG:008215.1:2000SEP08 181 267 forward 1 TM Nout
52 LG:008215.1:2000SEP08 583 663 forward 1 TM Nout
52 LG:008215.1:2000SEP08 1123 1209 forward 1 TM Nout
52 LG:008215.1:2000SEP08 1404 1466 forward 3 TM Nout
53 LG:337160.1:2000SEP08 844 930 forward 1 TM Nout
53 LG:337160.1:2000SEP08 928 990 forward 1 TM Nout
53 LG:337160.1:2000SEP08 1075 1143 forward 1 TM Nout
53 LG:337160.1:2000SEP08 1354 1437 forward 1 TM Nout
53 LG:337160.1:2000SEP08 119 193 forward 2 TM Nin
53 LG:337160.1:2000SEP08 275 346 forward 2 TM Nin
53 LG:337160.1:2000SEP08 491 577 forward 2 TM Nin
53 LG:337160.1:2000SEP08 1037 1120 forward 2 TM Nin
53 LG:337160.1:2000SEP08 1014 1100 forward 3 TM Nout
54 LG:395063.1:2000SEP08 331 387 forward 1 TM Nin
54 LG:395063.1:2000SEP08 517 603 forward 1 TM Nin
54 LG:395063.1:2000SEP08 748 834 forward 1 TM Nin
54 LG:395063.1:2000SEP08 850 912 forward 1 TM Nin
54 LG:395063.1:2000SEP08 982 1068 forward 1 TM Nin
54 LG:395063.1:2000SEP08 [illegible] [illegible] forward 1 TM Nin
[Portion of TABLE 2 illegible in source]
59 LG:120675.1 :2000SEP08 1316 1372 forward 2 TM
60 LG:420050.18:2000SEP08 1 12 198 forward 1 TM N out
60 LG:420050.18:2000SEP08 205 291 forward 1 TM N out
60 LG:420050.18:2000SEP08 155 241 forward 2 TM N out
60 LG:420050.18:2000SEP08 413 472 forward 2 TM N out
60 LG:420050.18:2000SEP08 210 290 forward 3 TM N out
61 LG:220495.3:2000SEP08 208 294 forward 1 TM N out
61 LG:220495.3:2000SEP08 502 582 forward 1 TM N out
61 LG:220495.3:2000SEP08 631 717 forward 1 TM N out
61 LG:220495.3:2000SEP08 790 846 forward 1 TM N out
61 LG:220495.3:2000SEP08 865 930 forward 1 TM N out
61 LG:220495.3:2000SEP08 1432 1518 forward 1 TM N out
61 LG:220495.3:2000SEP08 1663 1749 forward 1 TM N out
61 LG:220495.3:2000SEP08 1801 1866 forward 1 TM N out
61 LG:220495.3:2000SEP08 1918 1977 forward 1 TM N out
61 LG:220495,3:2000SEP08 956 1030 forward 2 TM N out
61 LG:220495.3:2000SEP08 1439 1501 forward 2 TM N out
61 LG:220495.3;2000SEP08 1526 1588 forward 2 TM N out
61 LG:220495.3:2000SEP08 1631 1699 forward 2 TM N out
61 LG:220495.3:2000SEP08 1760 1843 forward 2 TM N out
61 LG:220495.3:2000SEP08 1862 1945 forward 2 TM N out
61 LG:220495.3:2000SEP08 486 536 forward 3 TM
61 LG:220495,3:2000SEP08 618 677 forward 3 TM
61 LG:220495.3:2000SEP08 702 779 forward 3 TM
61 LG:220495.3:2000SEP08 843 926 forward 3 TM
61 LG:220495,3:2000SEP08 984 1070 forward 3 TM
61 LG:220495.3:2000SEP08 1449 1499 forward 3 TM
61 LG:220495,3:2000SEP08 1545 1619 forward 3 TM
61 LG:220495.3:2000SEP08 1866 1928 forward 3 TM
62 LG:274551.1 :2000SEP08 81 152 forward 3 TM N out
62 LG:274551 ,1 :2000SEP08 216 269 forward 3 TM N out
63 LG:429658.27:2000SEP08 373 432 forward 1 TM N out
63 LG:429658.27:2000SEP08 197 259 forward 2 TM N out
63 LG:429658.27:2000SEP08 368 430 forward 2 TM out
64 LG:246194.18:2000SEP08 718 804 forward 1 TM N out
64 LG:246194.18:2000SEP08 1816 1890 forward 1 TM N out
64 LG:246194.18:2000SEP08 716 763 forward 2 TM N in
64 LG:246194.18:2000SEP08 1793 1879 forward 2 TM N in
64 LG:246194.18:2000SEP08 726 782 forward 3 TM N in
64 LG:246194.18:2000SEP08 1830 1883 forward 3 TM N in
65 LG:000874.1:2000SEP08 151 237 forward 1 TM N in
65 LG:000874.1:2000SEP08 1738 1824 forward 1 TM N in
65 LG:000874.1:2000SEP08 1849 1935 forward 1 TM N in
65 LG:000874.1:2000SEP08 170 256 forward 2 TM N out
65 LG:000874.1:2000SEP08 1991 2047 forward 2 TM N out
65 LG:000874.1:2000SEP08 2078 2155 forward 2 TM N out
65 LG:000874.1:2000SEP08 168 233 forward 3 TM N in
65 LG:000874.1:2000SEP08 627 713 forward 3 TM N in
65 LG:000874.1:2000SEP08 1383 1469 forward 3 TM Nin
65 LG:000874.1:2000SEP08 2109 2162 forward 3 TM Nin
66 LG:239967.7:2000SEP08 136 189 forward 1 TM Nout
67 LG:238388.1:2000SEP08 544 618 forward 1 TM Nout
67 LG:238388.1:2000SEP08 1477 1536 forward 1 TM Nout
67 LG:238388.1:2000SEP08 1975 2028 forward 1 TM Nout
67 LG:238388.1:2000SEP08 2029 2100 forward 1 TM Nout
67 LG:238388.1:2000SEP08 2275 2346 forward 1 TM Nout
67 LG:238388.1:2000SEP08 2377 2463 forward 1 TM Nout
67 LG:238388.1:2000SEP08 2734 2790 forward 1 TM Nout
67 LG:238388.1:2000SEP08 2803 2889 forward 1 TM Nout
67 LG:238388.1:2000SEP08 2998 3084 forward 1 TM Nout
67 LG:238388.1:2000SEP08 3118 3204 forward 1 TM Nout
67 LG:238388.1:2000SEP08 3265 3351 forward 1 TM Nout
67 LG:238388.1:2000SEP08 3415 3489 forward 1 TM Nout
67 LG:238388.1:2000SEP08 3550 3612 forward 1 TM Nout
67 LG:238388.1:2000SEP08 3664 3726 forward 1 TM Nout
67 LG:238388.1:2000SEP08 1337 1399 forward 2 TM Nout
67 LG:238388.1:2000SEP08 1727 1804 forward 2 TM Nout
67 LG:238388.1:2000SEP08 1883 1969 forward 2 TM Nout
67 LG:238388.1:2000SEP08 2003 2065 forward 2 TM Nout
67 LG:238388.1:2000SEP08 2084 2146 forward 2 TM Nout
67 LG:238388.1:2000SEP08 2459 2545 forward 2 TM Nout
67 LG:238388.1:2000SEP08 2600 2686 forward 2 TM Nout
67 LG:238388.1:2000SEP08 2714 2785 forward 2 TM Nout
67 LG:238388.1:2000SEP08 2837 2923 forward 2 TM Nout
67 LG:238388.1:2000SEP08 2945 3031 forward 2 TM Nout
67 LG:238388.1:2000SEP08 3176 3256 forward 2 TM Nout
67 LG:238388.1:2000SEP08 3308 3394 forward 2 TM Nout
67 LG:238388.1:2000SEP08 3545 3622 forward 2 TM Nout
67 LG:238388.1:2000SEP08 1473 1553 forward 3 TM
67 LG:238388.1:2000SEP08 1755 1805 forward 3 TM
67 LG:238388.1:2000SEP08 1860 1928 forward 3 TM
67 LG:238388.1:2000SEP08 2232 2318 forward 3 TM
67 LG:238388.1:2000SEP08 2634 2720 forward 3 TM
67 LG:238388.1:2000SEP08 2766 2834 forward 3 TM
67 LG:238388.1:2000SEP08 2877 2951 forward 3 TM
67 LG:238388.1:2000SEP08 3021 3092 forward 3 TM
67 LG:238388.1:2000SEP08 3369 3452 forward 3 TM
67 LG:238388.1:2000SEP08 3636 3719 forward 3 TM
68 LG:233674.4:2000SEP08 3175 3261 forward 1 TM Nin
68 LG:233674.4:2000SEP08 3526 3612 forward 1 TM Nin
68 LG:233674.4:2000SEP08 1826 1885 forward 2 TM Nout
68 LG:233674.4:2000SEP08 3285 3365 forward 3 TM Nout
68 LG:233674.4:2000SEP08 3474 3560 forward 3 TM Nout
69 LG:411327.2:2000SEP08 226 312 forward 1 TM Nout
69 LG:411327.2:2000SEP08 1327 1407 forward 1 TM Nout
69 LG:411327.2:2000SEP08 239 322 forward 2 TM Nout
69 LG:411327.2:2000SEP08 794 865 forward 2 TM N out
69 LG:411327.2:2000SEP08 588 650 forward 3 TM N in
69 LG:411327.2:2000SEP08 666 728 forward 3 TM N in
69 LG:411327.2:2000SEP08 744 806 forward 3 TM N in
69 LG:411327.2:2000SEP08 828 914 forward 3 TM N in
69 LG:411327.2:2000SEP08 1557 1643 forward 3 TM N in
69 LG:411327.2:2000SEP08 1749 1829 forward 3 TM N in
69 LG:411327.2:2000SEP08 1842 1925 forward 3 TM N in
70 LG:1327310.1:2000SEP08 193 258 forward 1 TM N in
71 LG:242019.13:2000SEP08 367 453 forward 1 TM N out
71 LG:242019.13:2000SEP08 402 452 forward 3 TM N out
72 LG:012432.12:2000SEP08 319 405 forward 1 TM N in
72 LG:012432.12:2000SEP08 526 612 forward 1 TM N in
72 LG:012432.12:2000SEP08 290 376 forward 2 TM N in
72 LG:012432.12:2000SEP08 506 565 forward 2 TM N in
72 LG:012432.12:2000SEP08 677 757 forward 2 TM N in
72 LG:012432.12:2000SEP08 642 695 forward 3 TM N out
73 LG:257088.9:2000SEP08 1231 1302 forward 1 TM N out
73 LG:257088.9:2000SEP08 1339 1425 forward 1 TM N out
73 LG:257088.9:2000SEP08 152 229 forward 2 TM
73 LG:257088.9:2000SEP08 656 739 forward 2 TM
73 LG:257088.9:2000SEP08 72 143 forward 3 TM
73 LG:257088.9:2000SEP08 630 692 forward 3 TM
73 LG:257088.9:2000SEP08 939 1010 forward 3 TM
74 LG:997505.5:2000SEP08 1036 1089 forward 1 TM N out
74 LG:997505.5:2000SEP08 1092 1142 forward 3 TM N in
74 LG:997505.5:2000SEP08 1224 1271 forward 3 TM N in
75 LG:481436.2:2000SEP08 10 60 forward 1 TM N out
75 LG:481436.2:2000SEP08 256 324 forward 1 TM N out
75 LG:481436.2:2000SEP08 430 516 forward 1 TM N out
75 LG:481436.2:2000SEP08 562 648 forward 1 TM N out
75 LG:481436.2:2000SEP08 775 834 forward 1 TM N out
75 LG:481436.2:2000SEP08 1060 1122 forward 1 TM N out
75 LG:481436.2:2000SEP08 1147 1209 forward 1 TM N out
75 LG:481436.2:2000SEP08 1414 1500 forward 1 TM N out
75 LG:481436.2:2000SEP08 1522 1608 forward 1 TM N out
75 LG:481436.2:2000SEP08 230 316 forward 2 TM N in
75 LG:481436.2:2000SEP08 359 436 forward 2 TM N in
75 LG:481436.2:2000SEP08 536 607 forward 2 TM N in
75 LG:481436.2:2000SEP08 635 697 forward 2 TM N in
75 LG:481436.2:2000SEP08 710 772 forward 2 TM N in
75 LG:481436.2:2000SEP08 785 847 forward 2 TM N in
75 LG:481436.2:2000SEP08 1385 1471 forward 2 TM N in
75 LG:481436.2:2000SEP08 1502 1570 forward 2 TM N in
75 LG:481436.2:2000SEP08 276 353 forward 3 TM
75 LG:481436.2:2000SEP08 522 608 forward 3 TM
75 LG:481436.2:2000SEP08 840 926 forward 3 TM
75 LG:481436.2:2000SEP08 1383 1469 forward 3 TM
75 LG:481436.2:2000SEP08 1500 1586 forward 3 TM
76 LG:247776.14:2000SEP08 932 985 forward 2 TM Nout
77 LG:008606.14:2000SEP08 414 500 forward 3 TM Nin
77 LG:008606.14:2000SEP08 540 587 forward 3 TM Nin
77 LG:008606.14:2000SEP08 1146 1199 forward 3 TM Nin
78 LG:985092.3:2000SEP08 24 110 forward 3 TM Nout
79 LG:236649.7:2000SEP08 370 456 forward 1 TM Nin
79 LG:236649.7:2000SEP08 377 463 forward 2 TM Nout
79 LG:236649.7:2000SEP08 366 440 forward 3 TM Nin
80 LG:245014.2:2000SEP08 1345 1407 forward 1 TM Nin
80 LG:245014.2:2000SEP08 1426 1488 forward 1 TM Nin
80 LG:245014.2:2000SEP08 464 541 forward 2 TM Nin
80 LG:245014.2:2000SEP08 599 655 forward 2 TM Nin
80 LG:245014.2:2000SEP08 731 817 forward 2 TM Nin
80 LG:245014.2:2000SEP08 1325 1408 forward 2 TM Nin
80 LG:245014.2:2000SEP08 252 338 forward 3 TM Nout
80 LG:245014.2:2000SEP08 1335 1421 forward 3 TM Nout
81 LG:170754.4:2000SEP08 244 330 forward 1 TM Nout
81 LG:170754.4:2000SEP08 71 157 forward 2 TM Nin
81 LG:170754.4:2000SEP08 206 292 forward 2 TM Nin
81 LG:170754.4:2000SEP08 585 635 forward 3 TM Nin
82 LG:988028.1:2000SEP08 268 354 forward 1 TM Nout
83 LG:427997.6:2000SEP08 148 222 forward 1 TM Nout
83 LG:427997.6:2000SEP08 739 822 forward 1 TM Nout
83 LG:427997.6:2000SEP08 859 939 forward 1 TM Nout
83 LG:427997.6:2000SEP08 1309 1395 forward 1 TM Nout
83 LG:427997.6:2000SEP08 1609 1695 forward 1 TM Nout
83 LG:427997.6:2000SEP08 1726 1812 forward 1 TM Nout
83 LG:427997.6:2000SEP08 2107 2187 forward 1 TM Nout
83 LG:427997.6:2000SEP08 134 220 forward 2 TM Nin
83 LG:427997.6:2000SEP08 752 829 forward 2 TM Nin
83 LG:427997.6:2000SEP08 1229 1309 forward 2 TM Nin
83 LG:427997.6:2000SEP08 1487 1540 forward 2 TM Nin
83 LG:427997.6:2000SEP08 1619 1687 forward 2 TM Nin
83 LG:427997.6:2000SEP08 1715 1789 forward 2 TM Nin
83 LG:427997.6:2000SEP08 150 236 forward 3 TM Nout
83 LG:427997.6:2000SEP08 681 743 forward 3 TM Nout
83 LG:427997.6:2000SEP08 759 821 forward 3 TM Nout
83 LG:427997.6:2000SEP08 1074 1142 forward 3 TM Nout
83 LG:427997.6:2000SEP08 1185 1256 forward 3 TM Nout
83 LG:427997.6:2000SEP08 1449 1520 forward 3 TM Nout
83 LG:427997.6:2000SEP08 1740 1826 forward 3 TM Nout
84 LG:464206.1:2000SEP08 4 81 forward 1 TM Nout
84 LG:464206.1:2000SEP08 11 82 forward 2 TM Nout
84 LG:464206.1:2000SEP08 105 170 forward 3 TM Nout
85 LG:1400108.1:2000SEP08 283 351 forward 1 TM Nout
85 LG:1400108.1:2000SEP08 373 447 forward 1 TM Nout
85 LG:1400108.1:2000SEP08 598 684 forward 1 TM Nout
85 LG:1400108.1:2000SEP08 724 786 forward 1 TM Nout
85 LG:1400108.1:2000SEP08 802 864 forward 1 TM Nout
85 LG:1400108.1:2000SEP08 871 948 forward 1 TM Nout
85 LG:1400108.1:2000SEP08 92 175 forward 2 TM Nout
85 LG:1400108.1:2000SEP08 293 355 forward 2 TM Nout
85 LG:1400108.1:2000SEP08 368 430 forward 2 TM Nout
85 LG:1400108.1:2000SEP08 587 649 forward 2 TM Nout
85 LG:1400108.1:2000SEP08 662 724 forward 2 TM Nout
85 LG:1400108.1:2000SEP08 785 862 forward 2 TM Nout
85 LG:1400108.1:2000SEP08 884 970 forward 2 TM Nout
85 LG:1400108.1:2000SEP08 300 371 forward 3 TM Nout
85 LG:1400108.1:2000SEP08 393 446 forward 3 TM Nout
85 LG:1400108.1:2000SEP08 651 737 forward 3 TM Nout
85 LG:1400108.1:2000SEP08 765 827 forward 3 TM Nout
85 LG:1400108.1:2000SEP08 852 914 forward 3 TM Nout
86 LG:254531.1:2000SEP08 31 111 forward 1 TM Nin
86 LG:254531.1:2000SEP08 277 327 forward 1 TM Nin
86 LG:254531.1:2000SEP08 53 133 forward 2 TM Nout
86 LG:254531.1:2000SEP08 548 634 forward 2 TM Nout
86 LG:254531.1:2000SEP08 60 110 forward 3 TM Nin
86 LG:254531.1:2000SEP08 231 305 forward 3 TM Nin
86 LG:254531.1:2000SEP08 495 581 forward 3 TM Nin
86 LG:254531.1:2000SEP08 687 761 forward 3 TM Nin
87 LG:1101317.1:2000SEP08 607 678 forward 1 TM Nout
87 LG:1101317.1:2000SEP08 1186 1236 forward 1 TM Nout
87 LG:1101317.1:2000SEP08 2482 2568 forward 1 TM Nout
87 LG:1101317.1:2000SEP08 2825 2899 forward 2 TM Nout
87 LG:1101317.1:2000SEP08 1023 1091 forward 3 TM Nout
87 LG:1101317.1:2000SEP08 1194 1271 forward 3 TM Nout
88 LG:1074728.6:2000SEP08 164 250 forward 2 TM Nin
88 LG:1074728.6:2000SEP08 437 523 forward 2 TM Nin
89 LG:1081684.1:2000SEP08 124 210 forward 1 TM
89 LG:1081684.1:2000SEP08 256 330 forward 1 TM
89 LG:1081684.1:2000SEP08 361 423 forward 1 TM
89 LG:1081684.1:2000SEP08 460 522 forward 1 TM
89 LG:1081684.1:2000SEP08 179 265 forward 2 TM Nin
89 LG:1081684.1:2000SEP08 626 673 forward 2 TM Nin
89 LG:1081684.1:2000SEP08 147 233 forward 3 TM Nout
89 LG:1081684.1:2000SEP08 273 341 forward 3 TM Nout
89 LG:1081684.1:2000SEP08 438 509 forward 3 TM Nout
90 LG:1076520.1:2000SEP08 70 132 forward 1 TM Nout
90 LG:1076520.1:2000SEP08 151 213 forward 1 TM Nout
90 LG:1076520.1:2000SEP08 232 294 forward 1 TM Nout
90 LG:1076520.1:2000SEP08 310 378 forward 1 TM Nout
90 LG:1076520.1:2000SEP08 550 636 forward 1 TM Nout
90 LG:1076520.1:2000SEP08 760 828 forward 1 TM Nout
91 LG:1079477.1:2000SEP08 10 75 forward 1 TM Nout
91 LG:1079477.1:2000SEP08 115 189 forward 1 TM Nout
91 LG:1079477.1:2000SEP08 26 88 forward 2 TM Nout
91 LG:1079477.1:2000SEP08 122 184 forward 2 TM Nout
91 LG:1079477.1:2000SEP08 27 92 forward 3 TM Nin
92 LG:1076269.1:2000SEP08 299 358 forward 2 TM Nout
92 LG:1076269.1:2000SEP08 443 520 forward 2 TM Nout
93 LG:1087195.1:2000SEP08 974 1045 forward 2 TM Nin
93 LG:1087195.1:2000SEP08 945 1031 forward 3 TM Nout
93 LG:1087195.1:2000SEP08 1185 1271 forward 3 TM Nout
94 LG:002588.7:2000SEP08 28 114 forward 1 TM Nout
94 LG:002588.7:2000SEP08 130 186 forward 1 TM Nout
94 LG:002588.7:2000SEP08 349 432 forward 1 TM Nout
94 LG:002588.7:2000SEP08 436 522 forward 1 TM Nout
94 LG:002588.7:2000SEP08 613 690 forward 1 TM Nout
94 LG:002588.7:2000SEP08 38 100 forward 2 TM Nout
94 LG:002588.7:2000SEP08 137 199 forward 2 TM Nout
94 LG:002588.7:2000SEP08 374 436 forward 2 TM Nout
94 LG:002588.7:2000SEP08 458 520 forward 2 TM Nout
94 LG:002588.7:2000SEP08 542 604 forward 2 TM Nout
94 LG:002588.7:2000SEP08 623 709 forward 2 TM Nout
94 LG:002588.7:2000SEP08 30 116 forward 3 TM Nout
94 LG:002588.7:2000SEP08 150 224 forward 3 TM Nout
94 LG:002588.7:2000SEP08 423 494 forward 3 TM Nout
94 LG:002588.7:2000SEP08 546 632 forward 3 TM Nout
94 LG:002588.7:2000SEP08 759 842 forward 3 TM Nout
95 LG:1079470.6:2000SEP08 25 111 forward 1 TM Nin
95 LG:1079470.6:2000SEP08 208 294 forward 1 TM Nin
95 LG:1079470.6:2000SEP08 448 531 forward 1 TM Nin
95 LG:1079470.6:2000SEP08 1021 1107 forward 1 TM Nin
95 LG:1079470.6:2000SEP08 56 142 forward 2 TM Nout
95 LG:1079470.6:2000SEP08 209 295 forward 2 TM Nout
95 LG:1079470.6:2000SEP08 389 442 forward 2 TM Nout
95 LG:1079470.6:2000SEP08 617 691 forward 2 TM Nout
95 LG:1079470.6:2000SEP08 863 946 forward 2 TM Nout
95 LG:1079470.6:2000SEP08 1016 1078 forward 2 TM Nout
95 LG:1079470.6:2000SEP08 1097 1159 forward 2 TM Nout
95 LG:1079470.6:2000SEP08 378 440 forward 3 TM Nout
95 LG:1079470.6:2000SEP08 474 536 forward 3 TM Nout
95 LG:1079470.6:2000SEP08 744 824 forward 3 TM Nout
96 LG:345705.3:2000SEP08 187 270 forward 1 TM Nin
97 LG:1083654.1:2000SEP08 1066 1152 forward 1 TM Nout
97 LG:1083654.1:2000SEP08 2836 2898 forward 1 TM Nout
97 LG:1083654.1:2000SEP08 2917 2979 forward 1 TM Nout
97 LG:1083654.1:2000SEP08 3337 3423 forward 1 TM Nout
97 LG:1083654.1:2000SEP08 197 283 forward 2 TM Nout
97 LG:1083654.1:2000SEP08 650 736 forward 2 TM Nout
97 LG:1083654.1:2000SEP08 1049 1135 forward 2 TM Nout
97 LG:1083654.1:2000SEP08 2681 2767 forward 2 TM Nout
97 LG:1083654.1:2000SEP08 2975 3061 forward 2 TM Nout
97 LG:1083654.1:2000SEP08 3161 3247 forward 2 TM Nout
97 LG:1083654.1:2000SEP08 765 827 forward 3 TM Nin
97 LG:1083654.1:2000SEP08 1047 1133 forward 3 TM Nin
97 LG:1083654.1:2000SEP08 2592 2666 forward 3 TM Nin
97 LG:1083654.1:2000SEP08 2733 2819 forward 3 TM Nin
97 LG:1083654.1:2000SEP08 2925 3011 forward 3 TM Nin
97 LG:1083654.1:2000SEP08 3129 3215 forward 3 TM Nin
98 LG:198782.3:2000SEP08 814 900 forward 1 TM
98 LG:198782.3:2000SEP08 1339 1392 forward 1 TM
98 LG:198782.3:2000SEP08 1678 1764 forward 1 TM
98 LG:198782.3:2000SEP08 233 283 forward 2 TM Nin
98 LG:198782.3:2000SEP08 830 892 forward 2 TM Nin
98 LG:198782.3:2000SEP08 1655 1741 forward 2 TM Nin
98 LG:198782.3:2000SEP08 2075 2137 forward 2 TM Nin
98 LG:198782.3:2000SEP08 2639 2695 forward 2 TM Nin
98 LG:198782.3:2000SEP08 792 878 forward 3 TM Nin
98 LG:198782.3:2000SEP08 2040 2111 forward 3 TM Nin
98 LG:198782.3:2000SEP08 2577 2651 forward 3 TM Nin
99 LG:981076.2:2000SEP08 19 81 forward 1 TM Nout
99 LG:981076.2:2000SEP08 409 495 forward 1 TM Nout
99 LG:981076.2:2000SEP08 538 603 forward 1 TM Nout
99 LG:981076.2:2000SEP08 437 523 forward 2 TM Nin
99 LG:981076.2:2000SEP08 387 449 forward 3 TM Nin
100 LG:212023.3:2000SEP08 265 342 forward 1 TM Nout
100 LG:212023.3:2000SEP08 1129 1215 forward 1 TM Nout
100 LG:212023.3:2000SEP08 497 583 forward 2 TM Nin
100 LG:212023.3:2000SEP08 1181 1231 forward 2 TM Nin
100 LG:212023.3:2000SEP08 501 587 forward 3 TM Nin
100 LG:212023.3:2000SEP08 627 698 forward 3 TM Nin
100 LG:212023.3:2000SEP08 732 794 forward 3 TM Nin
100 LG:212023.3:2000SEP08 822 884 forward 3 TM Nin
100 LG:212023.3:2000SEP08 1011 1064 forward 3 TM Nin
101 LG:977929.3:2000SEP08 115 177 forward 1 TM N out
101 LG:977929.3:2000SEP08 196 258 forward 1 TM N out
102 LG:201936.6:2000SEP08 445 519 forward 1 TM Nout
102 LG:201936.6:2000SEP08 92 178 forward 2 TM Nout
102 LG:201936.6:2000SEP08 449 511 forward 2 TM Nout
102 LG:201936.6:2000SEP08 521 583 forward 2 TM Nout
102 LG:201936.6:2000SEP08 641 709 forward 2 TM Nout
102 LG:201936.6:2000SEP08 719 802 forward 2 TM Nout
102 LG:201936.6:2000SEP08 279 326 forward 3 TM Nout
103 LG:205642.1:2000SEP08 172 237 forward 1 TM Nin
103 LG:205642.1:2000SEP08 674 751 forward 2 TM Nout
103 LG:205642.1:2000SEP08 752 832 forward 2 TM Nout
103 LG:205642.1:2000SEP08 935 1021 forward 2 TM Nout
103 LG:205642.1:2000SEP08 117 197 forward 3 TM Nin
103 LG:205642.1:2000SEP08 717 797 forward 3 TM Nin
104 LG:339653.6:2000SEP08 28 105 forward 1 TM Nin
TABLE 3
SEQ ID NO: Template ID Component ID Start Stop
5 LG:980494.1:2000SEP08 2214403F6 627 1059
5 LG:980494.1:2000SEP08 2212765H1 627 881
5 LG:980494.1:2000SEP08 5515783H1 703 959
5 LG:980494.1:2000SEP08 4290144H1 738 996
5 LG:980494.1:2000SEP08 6725205H1 1 614
5 LG:980494.1:2000SEP08 5964643T9 122 634
5 LG:980494.1:2000SEP08 g2783200 342 848
5 LG:980494.1:2000SEP08 1618504H1 447 671
5 LG:980494.1:2000SEP08 1618756F6 447 854
5 LG:980494.1:2000SEP08 1618749H1 447 649
5 LG:980494.1:2000SEP08 1686804H1 450 670
5 LG:980494.1:2000SEP08 1785019H1 450 698
5 LG:980494.1:2000SEP08 g2835145 474 737
6 LG:984457.2:2000SEP08 6784190H1 435 952
6 LG:984457.2:2000SEP08 6715771H1 454 976
6 LG:984457.2:2000SEP08 3368817F6 534 1083
6 LG:984457.2:2000SEP08 3368817H1 534 807
6 LG:984457.2:2000SEP08 7353082H1 560 1130
6 LG:984457.2:2000SEP08 5754149H1 750 1273
6 LG:984457.2:2000SEP08 2490569T6 1 546
6 LG:984457.2:2000SEP08 3799467T6 25 358
6 LG:984457.2:2000SEP08 3413777H1 321 565
6 LG:984457.2:2000SEP08 7674559J1 396 946
7 LG:406758.1:2000SEP08 g711128 930 1174
7 LG:406758.1:2000SEP08 2270260H1 803 1065
7 LG:406758.1:2000SEP08 3203437F6 518 1024
7 LG:406758.1:2000SEP08 5768790H1 439 1013
7 LG:406758.1:2000SEP08 2573541H1 555 809
7 LG:406758.1:2000SEP08 3203437H1 520 770
7 LG:406758.1:2000SEP08 6777565H1 1 634
7 LG:406758.1:2000SEP08 7690831J1 229 618
7 LG:406758.1:2000SEP08 5047721R6 37 350
7 LG:406758.1:2000SEP08 7338496H1 867 1358
7 LG:406758.1:2000SEP08 3249541H1 1040 1354
7 LG:406758.1:2000SEP08 5690615H1 1042 1310
7 LG:406758.1:2000SEP08 g696761 930 1277
7 LG:406758.1:2000SEP08 g734852 930 1228
7 LG:406758.1:2000SEP08 4099355T9 1518 1663
7 LG:406758.1:2000SEP08 g718527 1446 1656
7 LG:406758.1:2000SEP08 g734770 1392 1647
7 LG:406758.1:2000SEP08 g696594 1443 1647
7 LG:406758.1:2000SEP08 3203437T6 1124 1645
7 LG:406758.1:2000SEP08 4243208T6 1340 1645
7 LG:406758.1:2000SEP08 g4451158 1390 1645
7 LG:406758.1:2000SEP08 g4564394 1365 1645
7 LG:406758.1:2000SEP08 g3679317 1324 1645
7 LG:406758.1:2000SEP08 g2838157 1304 1645
7 LG:406758.1:2000SEP08 1334378T6 1315 1645
7 LG:406758.1:2000SEP08 7747262H1 1271 1645
7 LG:406758.1:2000SEP08 3343574T6 1206 1645
7 LG:406758.1:2000SEP08 1334378F6 1156 1517
7 LG:406758.1:2000SEP08 1334378H1 1156 1410
8 LG:902957.17:2000SEP08 5742915H1 1 294
9 LG:333179.1:2000SEP08 g703579 183 475
9 LG:333179.1:2000SEP08 g713006 183 464
9 LG:333179.1:2000SEP08 6026950H1 1 275
9 LG:333179.1:2000SEP08 5800088H1 1 625
9 LG:333179.1:2000SEP08 1746609H1 5 261
9 LG:333179.1:2000SEP08 1746609F6 5 555
9 LG:333179.1:2000SEP08 2006872H1 6 148
9 LG:333179.1:2000SEP08 2473167F6 6 469
9 LG:333179.1:2000SEP08 2473167H1 6 239
9 LG:333179.1:2000SEP08 033784H1 27 298
9 LG:333179.1:2000SEP08 5282372H2 27 207
9 LG:333179.1:2000SEP08 4997689H1 33 287
9 LG:333179.1:2000SEP08 g2034987 39 323
9 LG:333179.1:2000SEP08 030695H1 39 241
9 LG:333179.1:2000SEP08 g2154206 61 161
9 LG:333179.1:2000SEP08 4995956H1 79 340
9 LG:333179.1:2000SEP08 6246345H1 1 526
9 LG:333179.1:2000SEP08 g1924106 1 369
9 LG:333179.1:2000SEP08 171289H1 1 230
9 LG:333179.1:2000SEP08 007672H1 195 463
9 LG:333179.1:2000SEP08 4400180H1 203 308
9 LG:333179.1:2000SEP08 4198119H1 221 473
9 LG:333179.1:2000SEP08 g6697567 328 786
9 LG:333179.1:2000SEP08 g4735257 332 784
9 LG:333179.1:2000SEP08 g6661884 338 782
9 LG:333179.1:2000SEP08 g4739770 351 786
9 LG:333179.1:2000SEP08 g3400569 360 792
9 LG:333179.1:2000SEP08 1975405H1 370 556
9 LG:333179.1:2000SEP08 g3144451 371 782
9 LG:333179.1:2000SEP08 g4125056 375 791
9 LG:333179.1:2000SEP08 g5804179 381 786
9 LG:333179.1:2000SEP08 g4741035 384 785
9 LG:333179.1:2000SEP08 g2806047 387 787
9 LG:333179.1:2000SEP08 g6462769 387 785
9 LG:333179.1:2000SEP08 g5765850 414 790
9 LG:333179.1:2000SEP08 g2821034 422 787
9 LG:333179.1:2000SEP08 g5631640 453 783
9 LG:333179.1:2000SEP08 g3737203 454 608
9 LG:333179.1:2000SEP08 g5393676 478 785
9 LG:333179.1:2000SEP08 g3118453 485 785
9 LG:333179.1:2000SEP08 g3162475 487 785
9 LG:333179.1:2000SEP08 g6567793 503 787
9 LG:333179.1:2000SEP08 2010119H1 534 642
9 LG:333179.1:2000SEP08 2473167T6 613 739
9 LG:333179.1:2000SEP08 g703479 648 933
9 LG:333179.1:2000SEP08 g723722 674 921
10 LG:406568.1:2000SEP08 2324266H1 1336 1578
10 LG:406568.1:2000SEP08 3011865F6 1339 1760
10 LG:406568.1:2000SEP08 3576519H1 1338 1494
10 LG:406568.1:2000SEP08 5176829H1 1347 1599
10 LG:406568.1:2000SEP08 3873606H1 1350 1660
10 LG:406568.1:2000SEP08 4852614H1 1362 1586
10 LG:406568.1:2000SEP08 7753572J1 1376 1936
10 LG:406568.1:2000SEP08 3958442H1 1376 1473
10 LG:406568.1:2000SEP08 3016113F6 1375 1646
10 LG:406568.1:2000SEP08 058529H1 1375 1521
10 LG:406568.1:2000SEP08 5276870H1 1375 1534
10 LG:406568.1:2000SEP08 3011865H1 1375 1458
10 LG:406568.1:2000SEP08 3578346H1 1381 1633
10 LG:406568.1:2000SEP08 3016113H1 1382 1668
10 LG:406568.1:2000SEP08 6337413H1 1389 1506
10 LG:406568.1:2000SEP08 6338013H1 1389 1880
10 LG:406568.1:2000SEP08 6335720H1 1389 1882
10 LG:406568.1:2000SEP08 3688195H1 1397 1694
10 LG:406568.1:2000SEP08 4151882H1 1400 1640
10 LG:406568.1:2000SEP08 3874704H1 1438 1713
10 LG:406568.1:2000SEP08 5169119H1 1475 1609
10 LG:406568.1:2000SEP08 5278359H1 1474 1704
10 LG:406568.1:2000SEP08 3685458H1 1485 1782
10 LG:406568.1:2000SEP08 6332978H1 1486 1981
10 LG:406568.1:2000SEP08 5531678H1 1494 1623
10 LG:406568.1:2000SEP08 g5395484 1512 1853
10 LG:406568.1:2000SEP08 3693626H1 1511 1803
10 LG:406568.1:2000SEP08 4152946H1 1519 1791
10 LG:406568.1:2000SEP08 3890704H1 1547 1841
10 LG:406568.1:2000SEP08 3875504H1 1548 1850
10 LG:406568.1:2000SEP08 4466716H1 1581 1838
10 LG:406568.1:2000SEP08 g395449 1590 1922
10 LG:406568.1:2000SEP08 5167337H1 1591 1799
10 LG:406568.1:2000SEP08 g6836605 1592 1891
10 LG:406568.1:2000SEP08 920392H1 1610 1922
10 LG:406568.1:2000SEP08 3011865T6 1617 1971
10 LG:406568.1:2000SEP08 3016113T6 1673 2154
10 LG:406568.1:2000SEP08 5277911H1 1696 1934
10 LG:406568.1:2000SEP08 3874759H1 1 295
10 LG:406568.1:2000SEP08 3685395H1 3 270
10 LG:406568.1:2000SEP08 g3116688 4 442
10 LG:406568.1:2000SEP08 3692640H1 4 269
10 LG:406568.1:2000SEP08 3692640F6 4 341
10 LG:406568.1:2000SEP08 3445829H2 6 263
10 LG:406568.1:2000SEP08 306745H1 13 385
10 LG:406568.1:2000SEP08 945423H1 11 292
10 LG:406568.1:2000SEP08 7752844H1 21 658
10 LG:406568.1:2000SEP08 7369625H1 40 556
10 LG:406568.1:2000SEP08 988248R1 178 382
10 LG:406568.1:2000SEP08 988248H1 178 290
10 LG:406568.1:2000SEP08 4454887H1 249 518
10 LG:406568.1:2000SEP08 7753572H1 272 808
10 LG:406568.1:2000SEP08 3877090H1 349 618
10 LG:406568.1:2000SEP08 6901148H1 367 824
10 LG:406568.1:2000SEP08 3028782F6 504 840
10 LG:406568.1:2000SEP08 3028782H1 504 806
10 LG:406568.1:2000SEP08 7752949J1 532 1080
10 LG:406568.1:2000SEP08 7416011T1 548 952
10 LG:406568.1:2000SEP08 7754049H1 569 1211
10 LG:406568.1:2000SEP08 1005592H1 598 888
10 LG:406568.1:2000SEP08 7754049J1 639 1265
10 LG:406568.1:2000SEP08 7752949H1 683 1175
10 LG:406568.1:2000SEP08 7753382J1 726 1118
10 LG:406568.1:2000SEP08 7752844J1 800 1346
10 LG:406568.1:2000SEP08 4646393H1 800 1060
10 LG:406568.1:2000SEP08 5278286H1 911 1124
10 LG:406568.1:2000SEP08 2521969F6 949 1353
10 LG:406568.1:2000SEP08 983403H1 1062 1334
10 LG:406568.1:2000SEP08 983403R6 1063 1314
10 LG:406568.1:2000SEP08 188949H1 1105 1305
10 LG:406568.1:2000SEP08 983403T6 1122 1346
10 LG:406568.1:2000SEP08 988925H1 1125 1348
10 LG:406568.1:2000SEP08 2521969H1 1128 1353
10 LG:406568.1:2000SEP08 2639257H1 1176 1338
10 LG:406568.1:2000SEP08 983273H1 1240 1543
10 LG:406568.1:2000SEP08 3692640T6 1246 1831
10 LG:406568.1:2000SEP08 4013213H1 1307 1597
10 LG:406568.1:2000SEP08 4010336H1 1327 1610
10 LG:406568.1:2000SEP08 3877453H1 1330 1613
10 LG:406568.1:2000SEP08 3013979H1 1876 2168
10 LG:406568.1:2000SEP08 921802H1 1887 2213
10 LG:406568.1:2000SEP08 920885H1 1887 2203
10 LG:406568.1:2000SEP08 5278931H1 1995 2213
10 LG:406568.1:2000SEP08 058768H1 1997 2200
10 LG:406568.1:2000SEP08 3045908H1 2012 2280
10 LG:406568.1:2000SEP08 g2787073 2047 2206
10 LG:406568.1:2000SEP08 979691H1 1835 2119
10 LG:406568.1:2000SEP08 g434193 1851 2025
10 LG:406568.1:2000SEP08 5803204H1 1825 2023
10 LG:406568.1:2000SEP08 2636153H1 1699 1928
10 LG:406568.1:2000SEP08 1567405H1 1705 1888
10 LG:406568.1:2000SEP08 462513R6 1705 2043
10 LG:406568.1:2000SEP08 462513H1 1705 1930
14 LG:311197.1:2000SEP08 5743674R7 183
15 LG:220655.4:2000SEP08 6054260H1 534
15 LG:220655.4:2000SEP08 6053961H1 609
15 LG:220655.4:2000SEP08 7397741H1 417 1004
16 LG:1001893.1:2000SEP08 6339672F8 624
16 LG:1001893.1:2000SEP08 6339672H1 590
16 LG:1001893.1:2000SEP08 6339672T8 483 868
17 LG:004335.1:2000SEP08 824343H1 1931 2213
17 LG:004335.1:2000SEP08 824343R6 1931 2448
17 LG:004335.1:2000SEP08 2616337T6 1936 2505
17 LG:004335.1:2000SEP08 7451517T1 1961 2473
17 LG:004335.1:2000SEP08 5899954T6 1988 2499
17 LG:004335.1:2000SEP08 6866173H1 2014 2149
17 LG:004335.1:2000SEP08 6866273H1 2014 2381
17 LG:004335.1:2000SEP08 g1163587 1907 2156
17 LG:004335.1:2000SEP08 5895088H1 1910 2161
17 LG:004335.1:2000SEP08 5697385T8 1925 2345
17 LG:004335.1:2000SEP08 5899954F6 1404 1958
17 LG:004335.1:2000SEP08 5897061H1 1404 1695
17 LG:004335.1:2000SEP08 5899922H1 1404 1668
17 LG:004335.1:2000SEP08 4821288H1 1485 1757
17 LG:004335.1:2000SEP08 5602937H1 1497 1759
17 LG:004335.1:2000SEP08 g4194776 1578 2040
17 LG:004335.1:2000SEP08 5697385F9 1627 1777
17 LG:004335.1:2000SEP08 7608724J1 1738 2301
17 LG:004335.1:2000SEP08 g4573716 1771 2190
17 LG:004335.1:2000SEP08 g4573707 1771 2190
17 LG:004335.1:2000SEP08 6566503H1 1793 2347
17 LG:004335.1:2000SEP08 6211870H1 1797 2076
17 LG:004335.1:2000SEP08 7158924H1 1801 2071
17 LG:004335.1:2000SEP08 7037342H1 1807 1980
17 LG:004335.1:2000SEP08 5697385T9 1889 2470
17 LG:004335.1:2000SEP08 6431671H1 1907 2436
17 LG:004335.1:2000SEP08 3936275H1 61 329
17 LG:004335.1:2000SEP08 6392874H1 99 369
17 LG:004335.1:2000SEP08 4726741H1 187 433
17 LG:004335.1:2000SEP08 4726741F6 187 741
17 LG:004335.1:2000SEP08 6951945H1 334 909
17 LG:004335.1:2000SEP08 5678094H1 477 738
17 LG:004335.1:2000SEP08 7082061H1 483 991
17 LG:004335.1:2000SEP08 6923277H1 512 1037
17 LG:004335.1:2000SEP08 4064256H1 708 944
17 LG:004335.1:2000SEP08 4064256F6 708 969
17 LG:004335.1:2000SEP08 7608724H1 743 1204
17 LG:004335.1:2000SEP08 6426967H1 793 1368
17 LG:004335.1:2000SEP08 6886093J1 821 1401
17 LG:004335.1:2000SEP08 2616337F6 846 1256
17 LG:004335.1:2000SEP08 2616337H1 846 1099
17 LG:004335.1:2000SEP08 7088351H1 927 1460
17 LG:004335.1:2000SEP08 6340018H1 931 1404
17 LG:004335.1:2000SEP08 4900977H1 976 1182
17 LG:004335.1:2000SEP08 4900977R9 991 1508
17 LG:004335.1:2000SEP08 6536713H1 1019 1506
17 LG:004335.1:2000SEP08 5899231H1 1404 1686
17 LG:004335.1:2000SEP08 6872178H1 1 274
17 LG:004335.1:2000SEP08 g2784563 24 396
17 LG:004335.1:2000SEP08 6886093H1 501
17 LG:004335.1:2000SEP08 g2810592 176
17 LG:004335.1:2000SEP08 g2896627 324
17 LG:004335.1:2000SEP08 g3924468 78
17 LG:004335.1:2000SEP08 g4852361 260
17 LG:004335.1:2000SEP08 g5233752 140
17 LG:004335.1:2000SEP08 2127851H1 2041 2307
17 LG:004335.1:2000SEP08 6436591H1 2054 2436
17 LG:004335.1:2000SEP08 5678094T6 2073 2500
17 LG:004335.1:2000SEP08 g5369879 2085 2545
17 LG:004335.1:2000SEP08 g4684665 2087 2515
17 LG:004335.1:2000SEP08 g3307962 2100 2544
17 LG:004335.1:2000SEP08 824343T6 2104 2511
17 LG:004335.1:2000SEP08 g3841316 2116 2545
17 LG:004335.1:2000SEP08 7338633H1 2134 2538
17 LG:004335.1:2000SEP08 g3162182 2132 2545
17 LG:004335.1:2000SEP08 g5530735 2142 2545
17 LG:004335.1:2000SEP08 4900977T9 2175 2422
17 LG:004335.1:2000SEP08 g6838724 2183 2545
17 LG:004335.1:2000SEP08 g4983820 2188 2547
17 LG:004335.1:2000SEP08 g4971326 2227 2543
17 LG:004335.1:2000SEP08 4726741T6 2254 2512
17 LG:004335.1:2000SEP08 1799046H1 2259 2489
17 LG:004335.1:2000SEP08 g5437793 2376 2546
17 LG:004335.1:2000SEP08 5878314H1 2390 2544
18 LG:213092.6:2000SEP08 7394750H1 1 393
19 LG:407570.5:2000SEP08 2929935H1 1 288
19 LG:407570.5:2000SEP08 2929935F6 1 243
19 LG:407570.5:2000SEP08 3002522F6 19 306
19 LG:407570.5:2000SEP08 3002522H1 20 329
19 LG:407570.5:2000SEP08 1951349T6 137 634
19 LG:407570.5:2000SEP08 1658838T6 148 531
19 LG:407570.5:2000SEP08 1658838F6 148 676
19 LG:407570.5:2000SEP08 1658838H1 148 376
20 LG:337835.8:2000SEP08 g2079164 1 421
20 LG:337835.8:2000SEP08 4576482H1 1 119
20 LG:337835.8:2000SEP08 2817432H1 6 276
20 LG:337835.8:2000SEP08 5567451H1 10 227
20 LG:337835.8:2000SEP08 4631941H1 10 271
20 LG:337835.8:2000SEP08 2444616F6 22 442
20 LG:337835.8:2000SEP08 2444616H1 22 245
20 LG:337835.8:2000SEP08 3535844H1 26 307
20 LG:337835.8:2000SEP08 3737060H1 27 193
20 LG:337835.8:2000SEP08 5293950H2 26 155
20 LG:337835.8:2000SEP08 6855696H1 50 654
20 LG:337835.8:2000SEP08 6955392H1 69 598
20 LG:337835.8:2000SEP08 g1984989 78 309
20 LG:337835.8:2000SEP08 4323489H1 111 339
20 LG:337835.8:2000SEP08 517888H1 119 345
20 LG:337835.8:2000SEP08 6317764H1 160 443
20 LG:337835.8:2000SEP08 6317732H1 160 441
20 LG:337835.8:2000SEP08 6317796H1 166 443
20 LG:337835.8:2000SEP08 g5231782 507 975
20 LG:337835.8:2000SEP08 g5664130 522 976
20 LG:337835.8:2000SEP08 1439187F6 614 988
20 LG:337835.8:2000SEP08 1439187H1 614 804
20 LG:337835.8:2000SEP08 g4664365 700 969
20 LG:337835.8:2000SEP08 323242H1 783 984
21 LG:1099283.1:2000SEP08 7415654T2 1 470
21 LG:1099283.1:2000SEP08 3076753H1 4 262
21 LG:1099283.1:2000SEP08 4018474H1 16 233
21 LG:1099283.1:2000SEP08 4023162H1 14 292
21 LG:1099283.1:2000SEP08 4023196H1 15 291
21 LG:1099283.1:2000SEP08 3825381H1 120 401
21 LG:1099283.1:2000SEP08 4023549H1 133 442
21 LG:1099283.1:2000SEP08 5107901H1 237 468
21 LG:1099283.1:2000SEP08 3073301H1 267 356
21 LG:1099283.1:2000SEP08 2346304F6 1 320
21 LG:1099283.1:2000SEP08 591880H1 47 228
21 LG:1099283.1:2000SEP08 4017491H1 67 327
21 LG:1099283.1:2000SEP08 4020384H1 74 357
21 LG:1099283.1:2000SEP08 593233H1 47 223
21 LG:1099283.1:2000SEP08 5374682H1 70 247
22 LG:401274.2:2000SEP08 7004550H1 1 548
22 LG:401274.2:2000SEP08 g661242 10 108
23 LG:222880.1:2000SEP08 7081513H1 1 525
23 LG:222880.1:2000SEP08 7757159H1 188 788
23 LG:222880.1:2000SEP08 2509720H1 252 480
23 LG:222880.1:2000SEP08 7363062H1 332 912
23 LG:222880.1:2000SEP08 3242083H1 367 617
23 LG:222880.1:2000SEP08 3644604H1 371 666
23 LG:222880.1:2000SEP08 3644604F6 369 627
23 LG:222880.1:2000SEP08 5741818H1 407 602
23 LG:222880.1:2000SEP08 7082116H1 438 981
23 LG:222880.1:2000SEP08 957645R6 1748 1887
23 LG:222880.1:2000SEP08 957645T6 1748 1844
23 LG:222880.1:2000SEP08 g2930525 1757 2092
23 LG:222880.1:2000SEP08 7764378H1 582 1187
23 LG:222880.1:2000SEP08 2060770H1 626 870
23 LG:222880.1:2000SEP08 5113293H1 709 974
23 LG:222880.1:2000SEP08 g1891059 714 1018
23 LG:222880.1:2000SEP08 6479360H1 822 1303
23 LG:222880.1:2000SEP08 5784355H1 851 1113
23 LG:222880.1:2000SEP08 5794459H1 851 1146
23 LG:222880.1:2000SEP08 5784455H1 851 1086
23 LG:222880.1:2000SEP08 5784355F6 856 1391
23 LG:222880.1:2000SEP08 3734122H1 874 1160
23 LG:222880.1:2000SEP08 6832370H1 874 1477
23 LG:222880.1:2000SEP08 6836219H1 946 1257
23 LG:222880.1:2000SEP08 5104159H1 960 1199
23 LG:222880.1:2000SEP08 7757159J1 1009 1639
23 LG:222880.1:2000SEP08 4512707H1 1005 1260
23 LG:222880.1:2000SEP08 1836548H1 1055 1111
23 LG:222880.1:2000SEP08 1006724H1 1061 1308
23 LG:222880.1:2000SEP08 523339H1 1125 1370
23 LG:222880.1:2000SEP08 630148H1 1155 1437
23 LG:222880.1:2000SEP08 5897974H1 1197 1466
23 LG:222880.1:2000SEP08 459407H1 1220 1503
23 LG:222880.1:2000SEP08 1930289F6 1290 1795
23 LG:222880.1:2000SEP08 1930289H1 1290 1560
23 LG:222880.1:2000SEP08 1930289T6 1294 1818
23 LG:222880.1:2000SEP08 5784355T6 1319 1729
23 LG:222880.1:2000SEP08 3659618T6 1333 1821
23 LG:222880.1:2000SEP08 2762471H1 1354 1597
23 LG:222880.1:2000SEP08 3400826H1 1374 1601
23 LG:222880.1:2000SEP08 g3647867 1386 1793
23 LG:222880.1:2000SEP08 g4302533 1431 1885
23 LG:222880.1:2000SEP08 954028H1 1437 1705
23 LG:222880.1:2000SEP08 g6398856 1471 1887
23 LG:222880.1:2000SEP08 g5368595 1475 1929
23 LG:222880.1:2000SEP08 g5662921 1485 1927
23 LG:222880.1:2000SEP08 g2279246 1492 1887
23 LG:222880.1:2000SEP08 g3432804 1496 1929
23 LG:222880.1:2000SEP08 g3927268 1500 1891
23 LG:222880.1:2000SEP08 g3751389 1533 1886
23 LG:222880.1:2000SEP08 3028225H1 1537 1818
23 LG:222880.1:2000SEP08 g3244749 1547 1892
23 LG:222880.1:2000SEP08 1696241T6 1548 1868
23 LG:222880.1:2000SEP08 1696241H1 1555 1798
23 LG:222880.1:2000SEP08 1696241F6 1555 1908
23 LG:222880.1:2000SEP08 g6471216 1558 1865
23 LG:222880.1:2000SEP08 g5110284 1566 2015
23 LG:222880.1:2000SEP08 g4373194 1573 1929
23 LG:222880.1:2000SEP08 g5664143 1578 1927
23 LG:222880.1:2000SEP08 2293651H1 1577 1807
23 LG:222880.1:2000SEP08 g3869595 1586 1928
23 LG:222880.1:2000SEP08 725947H1 1586 1829
23 LG:222880.1:2000SEP08 g2401363 1590 1887
23 LG:222880.1:2000SEP08 6773275J1 1593 2143
23 LG:222880.1:2000SEP08 g1156118 1607 1887
23 LG:222880.1:2000SEP08 7341844H1 1648 1888
23 LG:222880.1:2000SEP08 g1892265 1660 1926
23 LG:222880.1:2000SEP08 g4242906 1678 1861
23 LG:222880.1:2000SEP08 5399680H1 1685 1882
23 LG:222880.1:2000SEP08 g764351 1721 1940
23 LG:222880.1:2000SEP08 g2716497 1722 2099
23 LG:222880.1:2000SEP08 g2804847 1744 2085
23 LG:222880.1:2000SEP08 957645H1 1748 1853
23 LG:222880.1:2000SEP08 957645T1 1748 1844
23 LG:222880.1:2000SEP08 825718H1 1893 2188
23 LG:222880.1:2000SEP08 3169276F6 1906 2316
23 LG:222880.1:2000SEP08 3169276H1 1907 2175
23 LG:222880.1:2000SEP08 4880539H1 2030 2263
23 LG:222880.1:2000SEP08 551186H1 2089 2335
23 LG:222880.1:2000SEP08 5223525H1 2144 2295
23 LG:222880.1:2000SEP08 7734877H1 2168 2574
23 LG:222880.1:2000SEP08 g1959837 2259 2562
23 LG:222880.1:2000SEP08 3314476F6 2281 2592
23 LG:222880.1:2000SEP08 3314476H1 2281 2531
23 LG:222880.1:2000SEP08 6773275H1 2282 2747
23 LG:222880.1:2000SEP08 3169276T6 2501 2592
23 LG:222880.1:2000SEP08 2134443H1 2501 2592
24 LG:406389.1:2000SEP08 g4264177 1 472
24 LG:406389.1:2000SEP08 6768921J1 1 570
24 LG:406389.1:2000SEP08 6767668J1 1 462
24 LG:406389.1:2000SEP08 3787648H1 382 660
24 LG:406389.1:2000SEP08 g709143 395 708
24 LG:406389.1:2000SEP08 g570324 411 731
24 LG:406389.1:2000SEP08 g694218 507 733
24 LG:406389.1:2000SEP08 3574983H1 642 930
24 LG:406389.1:2000SEP08 g1809833 642 903
24 LG:406389.1:2000SEP08 4766304H1 690 961
24 LG:406389.1:2000SEP08 4766304F6 690 1062
24 LG:406389.1:2000SEP08 921701H1 920 1242
24 LG:406389.1:2000SEP08 7091595H1 922 1457
24 LG:406389.1:2000SEP08 4766304T6 1123 1508
24 LG:406389.1:2000SEP08 g3308722 1139 1550
24 LG:406389.1:2000SEP08 6870105H1 1141 1576
24 LG:406389.1:2000SEP08 6870968H1 1145 1661
24 LG:406389.1:2000SEP08 g3805640 1178 1554
24 LG:406389.1:2000SEP08 g561297 1217 1550
24 LG:406389.1:2000SEP08 g795551 1242 1560
24 LG:406389.1:2000SEP08 g683370 1266 1550
24 LG:406389.1:2000SEP08 g2141803 1393 1554
24 LG:406389.1:2000SEP08 4788627T6 1442 1509
25 LG:055461.1:2000SEP08 4592590H1 1 154
25 LG:055461.1:2000SEP08 4592590F8 1 566
25 LG:055461.1:2000SEP08 2457736F6 14 468
25 LG:055461.1:2000SEP08 4194633H1 83 395
25 LG:055461.1:2000SEP08 4194633F7 87 672
25 LG:055461.1:2000SEP08 2073295H1 116 371
25 LG:055461.1:2000SEP08 2073295F6 116 472
25 LG:055461.1:2000SEP08 2895870H1 171 456
25 LG:055461.1:2000SEP08 491179H1 219 482
25 LG:055461.1:2000SEP08 1400739H1 280 523
25 LG:055461.1:2000SEP08 2286216R6 408 803
25 LG:055461.1:2000SEP08 2286216H1 408 591
25 LG:055461.1:2000SEP08 4670223H1 422 675
25 LG:055461.1:2000SEP08 7091663H1 710 1060
25 LG:055461.1:2000SEP08 4172005H1 739 1009
25 LG:055461.1:2000SEP08 4172005F6 740 1250
25 LG:055461.1:2000SEP08 4054692H1 740 1019
25 LG:055461.1:2000SEP08 2407240T6 1014 1498
25 LG:055461.1:2000SEP08 go198775 1113 1514
25 LG:055461.1:2000SEP08 g3118198 1149 1514
25 LG:055461.1:2000SEP08 1600869H1 1268 1516
25 LG:055461.1:2000SEP08 g4329308 1292 1518
25 LG:055461.1:2000SEP08 2073295T6 1409 1613
26 LG:979059.5:2000SEP08 7700332J1 1 658
26 LG:979059.5:2000SEP08 1964843H1 105 373
26 LG:979059.5:2000SEP08 1966204R6 112 354
26 LG:979059.5:2000SEP08 1966137H1 112 344
26 LG:979059.5:2000SEP08 1966137R6 112 570
26 LG:979059.5:2000SEP08 g953428 112 303
26 LG:979059.5:2000SEP08 2542469H1 158 386
26 LG:979059.5:2000SEP08 g766126 205 288
27 LG:399238.1:2000SEP08 g4850653 1552 1974
27 LG:399238.1:2000SEP08 g6475403 1800 1970
27 LG:399238.1:2000SEP08 g3330309 1536 1967
27 LG:399238.1:2000SEP08 g770433 1715 1963
27 LG:399238.1:2000SEP08 g3888972 1717 1960
27 LG:399238.1:2000SEP08 g4393908 1681 1961
27 LG:399238.1:2000SEP08 g3843053 1681 1960
27 LG:399238.1:2000SEP08 1926452H1 1899 1959
27 LG:399238.1:2000SEP08 g3077521 1498 1959
27 LG:399238.1:2000SEP08 g5877902 1486 1959
27 LG:399238.1:2000SEP08 g3069392 1602 1956
27 LG:399238.1:2000SEP08 g3109645 1544 1956
27 LG:399238.1:2000SEP08 g2411261 1794 1956
27 LG:399238.1:2000SEP08 g4083704 1511 1956
27 LG:399238.1:2000SEP08 g3214844 1516 1956
27 LG:399238.1:2000SEP08 g3077207 1505 1956
30 LG:1384030.1:2000SEP08 g813592 521 841
30 LG:1384030.1:2000SEP08 1241458H1 623 859
30 LG:1384030.1:2000SEP08 3232547H1 636 856
30 LG:1384030.1:2000SEP08 7740771J1 661 1204
30 LG:1384030.1:2000SEP08 4226494H1 1925 2192
30 LG:1384030.1:2000SEP08 2792817T6 1610 2152
30 LG:1384030.1:2000SEP08 g1156926 1945 2192
30 LG:1384030.1:2000SEP08 2284623T6 1975 2440
30 LG:1384030.1:2000SEP08 3084794H1 2008 2284
30 LG:1384030.1:2000SEP08 g3149789 2021 2489
30 LG:1384030.1:2000SEP08 2685272H1 2048 2192
30 LG:1384030.1:2000SEP08 g2955029 1752 2063
30 LG:1384030.1:2000SEP08 g3962206 1755 2198
30 LG:1384030.1:2000SEP08 766404H1 1763 2008
30 LG:1384030.1:2000SEP08 g2786198 1767 2198
30 LG:1384030.1:2000SEP08 6365044H1 1794 2063
30 LG:1384030.1:2000SEP08 g2215615 1823 2192
30 LG:1384030.1:2000SEP08 g1948592 1370 1648
30 LG:1384030.1:2000SEP08 g1948812 1399 1675
30 LG:1384030.1:2000SEP08 6442338H1 1434 1968
30 LG:1384030.1:2000SEP08 7447536T2 1486 2085
30 LG:1384030.1:2000SEP08 6777364H1 80 303
30 LG:1384030.1:2000SEP08 3855585F6 98 610
30 LG:1384030.1:2000SEP08 741974H1 153 300
30 LG:1384030.1.2000SEP08 6777364J1 80 303
30 LG:1384030.1:2000SEP08 3341227H1 155 405
30 LG: 1384030, 1 :2000SEP08 3341227F6 156 777
30 LG:1384030.V.2000SEP08 6906377H1 357 857
30 LG:1384030.1:2000SEP08 6778372H1 365 946
30 LG:1384030,1:2000SEP08 4228325H1 1925 2192
30 LG:1384030.V.2000SEP08 7449254T2 1501 2113
30 LG:1384030, 1:2000SEP08 3822349H1 1563 1841
30 LG:1384030.1:2000SEP08 6159726H1 1600 1799
30 LG:1384030.1:2000SEP08 3341227T6 1608 2057
30 LG:1384030.1:2000SEP08 5173885H1 1097 1347
30 LG:1384030.1:2000SEP08 5926007H1 1098 1398
30 LG:1384030.1:2000SEP08 6839205H1 1103 1632
30 LG:1384030.1:2000SEP08 4953785H1 1250 1497
30 LG:1384030.1:2000SEP08 7607682J1 1279 1820
30 LG:1384030.1:2000SEP08 6051418J1 1328 1823
30 LG:1384030.1:2000SEP08 6051418H1 1328 1795
30 LG:1384030.1:2000SEP08 5266960H1 1363 1625
30 LG:1384030.1:2000SEP08 g2022434 667 903
30 LG:1384030.1:2000SEP08 g892844 685 1082
30 LG:1384030.1:2000SEP08 g4265266 703 1132
30 LG:1384030.1:2000SEP08 g2838388 753 1199
30 LG:1384030.1:2000SEP08 2792817F6 794 1342
30 LG:1384030.1:2000SEP08 g2215668 819 1227
30 LG:1384030.1:2000SEP08 5480213H1 23 206
30 LG:1384030.1:2000SEP08 5478431H1 23 275
31 LG:390475.1:2000SEP08 373158H1 1425 1612
31 LG:390475.1:2000SEP08 802683H1 1432 1636
31 LG:390475.1:2000SEP08 467131H1 1461 1661
31 LG:390475.1:2000SEP08 374250H1 1424 1672
31 LG:390475.1:2000SEP08 468011H1 1455 1688
31 LG:390475.1:2000SEP08 5680194H1 1523 1715
31 LG:390475.1:2000SEP08 2005982H1 1465 1658
31 LG:390475.1:2000SEP08 g1203616 1490 1690
31 LG:390475.1:2000SEP08 1942223H1 1461 1697
31 LG:390475.1:2000SEP08 1875616H1 1509 1682
31 LG:390475.1:2000SEP08 1562493H1 1468 1694
31 LG:390475.1:2000SEP08 3321585H2 1490 1572
31 LG:390475.1:2000SEP08 430766H1 1475 1679
31 LG:390475.1:2000SEP08 467859H1 1532 1630
31 LG:390475.1:2000SEP08 3916709H1 1638 1931
31 LG:390475.1:2000SEP08 g4457738 527 669
31 LG:390475.1:2000SEP08 g1384862 537 684
31 LG:390475.1:2000SEP08 g1242860 382 665
31 LG:390475.1:2000SEP08 g766144 383 634
31 LG:390475.1:2000SEP08 067238H1 392 551
31 LG:390475.1:2000SEP08 633651H1 399 653
31 LG:390475.1:2000SEP08 1403384H1 399 604
31 LG:390475.1:2000SEP08 2209992H1 402 510
31 LG:390475.1:2000SEP08 g787869 412 660
31 LG:390475.1:2000SEP08 g2835653 418 663
31 LG:390475.1:2000SEP08 043054H1 424 709
31 LG:390475.1:2000SEP08 g1274002 428 962
31 LG:390475.1:2000SEP08 g3677757 443 671
31 LG:390475.1:2000SEP08 g3743491 250 438
31 LG:390475.1:2000SEP08 2923096H1 251 474
31 LG:390475.1:2000SEP08 3249704H1 291 431
31 LG:390475.1:2000SEP08 1338497H1 309 441
31 LG:390475.1:2000SEP08 3494313H1 325 467
31 LG:390475.1:2000SEP08 g2820741 335 701
31 LG:390475.1:2000SEP08 4714287H1 365 607
31 LG:390475.1:2000SEP08 511973H1 372 566
31 LG:390475.1:2000SEP08 3955111H1 1182 1456
31 LG:390475.1:2000SEP08 275438H1 1239 1434
31 LG:390475.1:2000SEP08 4539568H1 1190 1442
31 LG:390475.1:2000SEP08 3919091H1 1195 1434
31 LG:390475.1:2000SEP08 2840455H1 1203 1482
31 LG:390475.1:2000SEP08 276868H1 1238 1467
31 LG:390475.1:2000SEP08 778666H1 1155 1385
31 LG:390475.1:2000SEP08 024765H1 1156 1346
31 LG:390475.1:2000SEP08 g723508 1169 1314
31 LG:390475.1:2000SEP08 805277H1 1169 1391
31 LG:390475.1:2000SEP08 g844491 1173 1495
31 LG:390475.1:2000SEP08 g735070 1173 1416
31 LG:390475.1:2000SEP08 2290953H1 1173 1395
31 LG:390475.1:2000SEP08 4295636H1 1126 1241
31 LG:390475.1:2000SEP08 6964476H1 1124 1272
31 LG:390475.1:2000SEP08 4295715H1 1125 1373
31 LG:390475.1:2000SEP08 6519762H1 1150 1246
31 LG:390475.1:2000SEP08 g1357998 1143 1722
31 LG:390475.1:2000SEP08 2263032H1 1155 1393
31 LG:390475.1:2000SEP08 g617641 168 414
31 LG:390475.1:2000SEP08 2044802H1 185 354
31 LG:390475.1:2000SEP08 g1958504 199 404
31 LG:390475.1:2000SEP08 540922H1 223 442
31 LG:390475.1:2000SEP08 2770791H1 248 479
31 LG:390475.1:2000SEP08 g1873997 1 177
31 LG:390475.1:2000SEP08 4084288H1 8 189
31 LG:390475.1:2000SEP08 2174238H1 8 159
31 LG:390475.1:2000SEP08 1446053H1 9 264
31 LG:390475.1:2000SEP08 5298475H1 9 246
31 LG:390475.1:2000SEP08 4341605H1 17 234
31 LG:390475.1:2000SEP08 4145764H1 18 214
31 LG:390475.1:2000SEP08 4713341H1 19 141
31 LG:390475.1:2000SEP08 6441361H1 28 169
31 LG:390475.1:2000SEP08 1001468H1 40 276
31 LG:390475.1:2000SEP08 2723346H1 42 281
31 LG:390475.1:2000SEP08 1450049H1 58 259
31 LG:390475.1:2000SEP08 1839414H1 59 293
31 LG:390475.1:2000SEP08 g1364559 73 403
31 LG:390475.1:2000SEP08 907971H1 66 283
31 LG:390475.1:2000SEP08 g3182476 86 319
31 LG:390475.1:2000SEP08 2475265H1 86 306
31 LG:390475.1:2000SEP08 5965283H1 109 264
31 LG:390475.1:2000SEP08 3580918H1 120 403
31 LG:390475.1:2000SEP08 3296172H1 176
31 LG:390475.1:2000SEP08 2130474H1 246
31 LG:390475.1:2000SEP08 3369963H1 237
31 LG:390475.1:2000SEP08 2494290H1 258
31 LG:390475.1:2000SEP08 3470236H1 211
31 LG:390475.1:2000SEP08 1839015H1 117
31 LG:390475.1:2000SEP08 630829H1 262
31 LG:390475.1:2000SEP08 3051970H1 242
31 LG:390475.1:2000SEP08 3961074H2 224
31 LG:390475.1:2000SEP08 2947052H2 215
31 LG:390475.1:2000SEP08 1322474H1 194
31 LG:390475.1:2000SEP08 1466377H1 157
31 LG:390475.1:2000SEP08 1561545H1 162
31 LG:390475.1:2000SEP08 3685341H1 246
31 LG:390475.1:2000SEP08 g1336967 811 1059
31 LG:390475.1:2000SEP08 g1303305 812 929
31 LG:390475.1:2000SEP08 795097H1 764 930
31 LG:390475.1:2000SEP08 499861H1 553 671
31 LG:390475.1:2000SEP08 2857203H1 556 642
31 LG:390475.1:2000SEP08 g1146853 562 700
31 LG:390475.1:2000SEP08 3466132H1 578 692
31 LG:390475.1:2000SEP08 452754H1 1095 1326
31 LG:390475.1:2000SEP08 6962745H1 1091 1286
31 LG:390475.1:2000SEP08 g1983447 1272 1512
31 LG:390475.1:2000SEP08 302583H1 1272 1511
31 LG:390475.1:2000SEP08 2279888H1 1271 1517
31 LG:390475.1:2000SEP08 4354187H1 1267 1504
31 LG:390475.1:2000SEP08 1225050H1 1269 1507
31 LG:390475.1:2000SEP08 869885H1 1241 1478
31 LG:390475.1:2000SEP08 g918854 1262 1679
31 LG:390475.1:2000SEP08 2007247H1 1256 1422
31 LG:390475.1:2000SEP08 g1025808 1263 1493
31 LG:390475.1:2000SEP08 5218231H1 1267 1502
31 LG:390475.1:2000SEP08 674150H1 1240 1497
31 LG:390475.1:2000SEP08 3877381H1 1307 1508
31 LG:390475.1:2000SEP08 7650404H2 1309 1910
31 LG:390475.1:2000SEP08 2761640H1 1319 1541
31 LG:390475.1:2000SEP08 1216133H1 1349 1507
31 LG:390475.1:2000SEP08 3208733H1 1336 1591
31 LG:390475.1:2000SEP08 2489552H1 1339 1546
31 LG:390475.1:2000SEP08 2591840H1 1353 1590
31 LG:390475.1:2000SEP08 146677H1 1286 1414
31 LG:390475.1:2000SEP08 1960680H1 1288 1487
31 LG:390475.1:2000SEP08 2298738H1 1297 1548
31 LG:390475.1:2000SEP08 4976830H1 1294 1451
31 LG:390475.1:2000SEP08 1289679H1 1292 1536
31 LG:390475.1:2000SEP08 2480383H1 1 168
31 LG:390475.1:2000SEP08 3697730H1 2 228
31 LG:390475.1:2000SEP08 2442101H1 164
31 LG:390475.1:2000SEP08 3267911H1 190
31 LG:390475.1:2000SEP08 g1801609 258
31 LG:390475.1:2000SEP08 2304464H1 213
31 LG:390475.1:2000SEP08 3348740H1 201
31 LG:390475.1:2000SEP08 g1395505 1281 1442
31 LG:390475.1:2000SEP08 552037H1 1283 1510
31 LG:390475.1:2000SEP08 321116H1 1281 1518
31 LG:390475.1:2000SEP08 4652332H1 1273 1489
31 LG:390475.1:2000SEP08 g1576768 1274 1437
31 LG:390475.1:2000SEP08 2532160H1 1279 1530
31 LG:390475.1:2000SEP08 1413859H1 1276 1527
31 LG:390475.1:2000SEP08 3992247H1 826 1100
31 LG:390475.1:2000SEP08 6443848H1 880 1408
31 LG:390475.1:2000SEP08 g1640782 823 970
31 LG:390475.1:2000SEP08 7254474H1 926 1362
31 LG:390475.1:2000SEP08 6967258H1 926 1060
31 LG:390475.1:2000SEP08 6962316H1 926 1321
31 LG:390475.1:2000SEP08 6961986H1 926 1392
31 LG:390475.1:2000SEP08 431342H1 971 1206
31 LG:390475.1:2000SEP08 3534814H1 1018 1188
31 LG:390475.1:2000SEP08 2288879H1 1041 1278
31 LG:390475.1:2000SEP08 2410736H1 1070 1303
31 LG:390475.1:2000SEP08 g395465 1081 1409
31 LG:390475.1:2000SEP08 2727033H1 1086 1317
31 LG:390475.1:2000SEP08 g1295586 1414 1667
31 LG:390475.1:2000SEP08 g1312999 1415 1672
31 LG:390475.1:2000SEP08 530963H1 1406 1624
31 LG:390475.1:2000SEP08 855339H1 1396 1634
31 LG:390475.1:2000SEP08 5962214H1 1363 1526
31 LG:390475.1:2000SEP08 452967H1 1366 1583
31 LG:390475.1:2000SEP08 1421903H1 1372 1545
32 LG:229105.3:2000SEP08 7435272H1 1 596
32 LG:229105.3:2000SEP08 5835361H1 323 592
32 LG:229105.3:2000SEP08 7435218H1 1 596
32 LG:229105.3:2000SEP08 5835361F6 323 712
33 LG:232578.3:2000SEP08 g6663427 1120 1412
33 LG:232578.3:2000SEP08 g3279586 1080 1417
33 LG:232578.3:2000SEP08 g1384308 1026 1417
33 LG:232578.3:2000SEP08 g1224514 1068 1416
33 LG:232578.3:2000SEP08 g1365326 905 1417
33 LG:232578.3:2000SEP08 g3674482 965 1417
33 LG:232578.3:2000SEP08 g2810626 1034 1416
33 LG:232578.3:2000SEP08 g2901324 957 1416
33 LG:232578.3:2000SEP08 g6073460 1272 1417
33 LG:232578.3:2000SEP08 g1516131 1167 1416
33 LG:232578.3:2000SEP08 g1390751 1093 1418
33 LG:232578.3:2000SEP08 g2583395 970 1416
33 LG:232578.3:2000SEP08 g1712889 1123 1419
33 LG:232578.3:2000SEP08 g3182153 1076 1419
33 LG:232578.3:2000SEP08 g1388644 1126 1418
33 LG:232578.3:2000SEP08 g3331087 1162 1418
33 LG:232578.3:2000SEP08 g6041357 1120 1418
33 LG:232578.3:2000SEP08 g4457829 1310 1419
33 LG:232578.3:2000SEP08 g3331089 1058 1418
33 LG:232578.3:2000SEP08 g4080845 1013 1424
33 LG:232578.3:2000SEP08 g5450749 1028 1423
33 LG:232578.3:2000SEP08 g3842825 1201 1420
33 LG:232578.3:2000SEP08 g5527989 1022 1419
33 LG:232578.3:2000SEP08 g5858042 1024 1420
33 LG:232578.3:2000SEP08 g2540076 1086 1420
33 LG:232578.3:2000SEP08 3838381H1 827 1089
33 LG:232578.3:2000SEP08 4640194H1 845 1077
33 LG:232578.3:2000SEP08 2342771H1 849 1063
33 LG:232578.3:2000SEP08 3869126H1 814 1030
33 LG:232578.3:2000SEP08 551056R6 615 1006
33 LG:232578.3:2000SEP08 551056H1 580 788
33 LG:232578.3:2000SEP08 g1978109 458 764
33 LG:232578.3:2000SEP08 3475224H1 319 650
33 LG:232578.3:2000SEP08 5523539H1 569 643
33 LG:232578.3:2000SEP08 g2818319 132 489
33 LG:232578.3:2000SEP08 7252946H1 1 381
33 LG:232578.3:2000SEP08 7252946J1 1 381
33 LG:232578.3:2000SEP08 1446460H1 46 288
33 LG:232578.3:2000SEP08 g897273 1167 1400
33 LG:232578.3:2000SEP08 1819757T6 860 1399
33 LG:232578.3:2000SEP08 g3178643 912 1383
33 LG:232578.3:2000SEP08 2793428T6 856 1378
33 LG:232578.3:2000SEP08 g2432683 960 1371
33 LG:232578.3:2000SEP08 3728864H1 1062 1368
33 LG:232578.3:2000SEP08 1915346H1 1106 1367
33 LG:232578.3:2000SEP08 5422383T8 834 1176
33 LG:232578.3:2000SEP08 7447553T2 690 1334
33 LG:232578.3:2000SEP08 278007H1 1004 1334
33 LG:232578.3:2000SEP08 5980881H1 1003 1317
33 LG:232578.3:2000SEP08 2048520H1 1045 1301
33 LG:232578.3:2000SEP08 2621273H1 1022 1268
33 LG:232578.3:2000SEP08 275474H1 1006 1224
33 LG:232578.3:2000SEP08 2863295T6 880 1140
33 LG:232578.3:2000SEP08 g1425453 1093 1409
33 LG:232578.3:2000SEP08 549352F1 903 1412
33 LG:232578.3:2000SEP08 g3037624 1243 1409
33 LG:232578.3:2000SEP08 g5392654 1303 1403
33 LG:232578.3:2000SEP08 g6076333 1122 1416
33 LG:232578.3:2000SEP08 g3754744 1083 1416
33 LG:232578.3:2000SEP08 g1729139 1047 1415
33 LG:232578.3:2000SEP08 g4565068 1341 1416
33 LG:232578.3:2000SEP08 g6451170 1159 1416
33 LG:232578.3:2000SEP08 g1425344 1127 1416
33 LG:232578.3:2000SEP08 g4074741 1010 1415
33 LG:232578.3:2000SEP08 2847751H1 1218 1402
33 LG:232578.3:2000SEP08 g5813374 1000 1087
33 LG:232578.3:2000SEP08 891702H1 962 1223
33 LG:232578.3:2000SEP08 g6228941 951 1106
33 LG:232578.3:2000SEP08 6323671T8 885 1311
34 LG:1166387.9:2000SEP08 7684091H1 490
34 LG:1166387.9:2000SEP08 7682011H2 655
34 LG:1166387.9:2000SEP08 7207248H1 608
34 LG:1166387.9:2000SEP08 6963062H1 441
34 LG:1166387.9:2000SEP08 6961001H1 312
34 LG:1166387.9:2000SEP08 7210351H1 626
38 LG:198450.2:2000SEP08 2119916T6 467 1024
39 LG:1008175.1:2000SEP08 6798057F8 1 565
39 LG:1008175.1:2000SEP08 6798057H1 1 529
39 LG:1008175.1:2000SEP08 6798057T8 490 1065
40 LG:437981.11:2000SEP08 7401501H1 1 569
40 LG:437981.11:2000SEP08 3254083H1 37 260
41 LG:1025549.1:2000SEP08 6796976H1 1 253
41 LG:1025549.1:2000SEP08 6796976F8 45 582
41 LG:1025549.1:2000SEP08 6796976T8 217 776
42 LG:327226.16:2000SEP08 551971H1 1 121
42 LG:327226.16:2000SEP08 3019718H1 1 281
42 LG:327226.16:2000SEP08 3479389H1 26 211
42 LG:327226.16:2000SEP08 1696849T6 46 535
42 LG:327226.16:2000SEP08 3228982T6 273 546
42 LG:327226.16:2000SEP08 2948548H1 478 588
43 LG:1387394.5:2000SEP08 6825507J1 1 647
43 LG:1387394.5:2000SEP08 673356H1 80 226
43 LG:1387394.5:2000SEP08 7612276J1 151 694
43 LG:1387394.5:2000SEP08 6825507H1 309 818
44 LG:445188.3:2000SEP08 5434427H1 1 210
44 LG:445188.3:2000SEP08 5434427F9 1 162
44 LG:445188.3:2000SEP08 7234208H1 40 323
44 LG:445188.3:2000SEP08 7689303J1 67 612
44 LG:445188.3:2000SEP08 2588419H1 106 336
44 LG:445188.3:2000SEP08 2588419F6 106 323
44 LG:445188.3:2000SEP08 1465708H1 21 235
44 LG:445188.3:2000SEP08 7234146H1 39 323
44 LG:445188.3:2000SEP08 2331464R6 20 608
44 LG:445188.3:2000SEP08 2331464H1 20 237
44 LG:445188.3:2000SEP08 7658611H1 15 324
44 LG:445188.3:2000SEP08 5802042H1 7 325
44 LG:445188.3:2000SEP08 7940863H1 1 323
45 LG:898864.11:2000SEP08 4179195F6 1 506
45 LG:898864.11:2000SEP08 7734109H2 167 570
45 LG:898864.11:2000SEP08 g3401261 321 750
45 LG:898864.11:2000SEP08 3323281H1 379 632
45 LG:898864.11:2000SEP08 3790088H1 386 675
45 LG:898864.11:2000SEP08 3213979H1 527 673
45 LG:898864.11:2000SEP08 4179195H1 1 267
46 LG:018739.2:2000SEP08 1436493H1 1163 1424
46 LG:018739.2:2000SEP08 159249H1 1444 1643
46 LG:018739.2:2000SEP08 7931111H1 1 484
46 LG:018739.2:2000SEP08 3522778H1 341 576
46 LG:018739.2:2000SEP08 g2883522 840 1191
46 LG:018739.2:2000SEP08 4333260F6 710 1141
46 LG:018739.2:2000SEP08 5203041F6 407 877
46 LG:018739.2:2000SEP08 g1963365 1077 1572
46 LG:018739.2:2000SEP08 5042285H1 979 1214
53 LG:337160.1:2000SEP08 5167368H1 563 803
53 LG:337160.1:2000SEP08 g3593694 1073 1447
53 LG:337160.1:2000SEP08 g752492 1196 1539
53 LG:337160.1:2000SEP08 g5633598 1272 1538
53 LG:337160.1:2000SEP08 g4153397 1272 1538
53 LG:337160.1:2000SEP08 3457353T6 1110 1489
54 LG:395063.1:2000SEP08 g3849737 2713 2998
54 LG:395063.1:2000SEP08 4073057H1 2689 2978
54 LG:395063.1:2000SEP08 4073057F6 2688 2756
54 LG:395063.1:2000SEP08 4668484H1 2671 2919
54 LG:395063.1:2000SEP08 3535582F6 2636 3182
54 LG:395063.1:2000SEP08 3535582H1 2636 2911
54 LG:395063.1:2000SEP08 3090546T6 1915 2460
54 LG:395063.1:2000SEP08 3484210H1 2476 2809
54 LG:395063.1:2000SEP08 7093324F8 1378 1771
54 LG:395063.1:2000SEP08 6785022H1 36 340
54 LG:395063.1:2000SEP08 3699565H1 1 285
54 LG:395063.1:2000SEP08 g5054669 2128 2493
54 LG:395063.1:2000SEP08 g3181305 2249 2493
54 LG:395063.1:2000SEP08 2866136H1 2004 2248
54 LG:395063.1:2000SEP08 7093763F8 1397 1771
54 LG:395063.1:2000SEP08 2674222H1 1886 2108
54 LG:395063.1:2000SEP08 3486280H1 1566 1661
54 LG:395063.1:2000SEP08 7401964H1 1654 2194
54 LG:395063.1:2000SEP08 6828352J1 67 594
54 LG:395063.1:2000SEP08 4180711H1 54 201
54 LG:395063.1:2000SEP08 6788461H1 40 474
54 LG:395063.1:2000SEP08 7700096H1 254 855
54 LG:395063.1:2000SEP08 031180H1 206 365
54 LG:395063.1:2000SEP08 030545H1 141 365
54 LG:395063.1:2000SEP08 7610103H1 75 603
54 LG:395063.1:2000SEP08 3696546F6 810 1323
54 LG:395063.1:2000SEP08 6828352H1 619 1165
54 LG:395063.1:2000SEP08 7086817H1 402 950
54 LG:395063.1:2000SEP08 1426382H1 1240 1511
54 LG:395063.1:2000SEP08 7093763H1 1378 1771
54 LG:395063.1:2000SEP08 5732526H1 1047 1322
54 LG:395063.1:2000SEP08 3696546H1 812 1096
54 LG:395063.1:2000SEP08 4668004H1 2426 2525
55 LG:979069.4:2000SEP08 3988928H1 1168 1356
55 LG:979069.4:2000SEP08 3988928R6 1168 1654
55 LG:979069.4:2000SEP08 4320977H1 1238 1523
55 LG:979069.4:2000SEP08 3436719H1 1300 1538
55 LG:979069.4:2000SEP08 3988928T6 1414 1924
55 LG:979069.4:2000SEP08 4822576H1 1868 2130
55 LG:979069.4:2000SEP08 7664878H1 925 1423
55 LG:979069.4:2000SEP08 g839595 1022 1258
55 LG:979069.4:2000SEP08 5390613H1 1 259
55 LG:979069.4:2000SEP08 5390613F8 1 555
55 LG:979069.4:2000SEP08 7664878J1 364 998
55 LG:979069.4:2000SEP08 4248779R6 543 993
55 LG:979069.4:2000SEP08 1681947H1 600 814
55 LG:979069.4:2000SEP08 g2022820 722 1061
56 LG:346663.5:2000SEP08 3421067H1 127 327
56 LG:346663.5:2000SEP08 5088677H1 85 337
56 LG:346663.5:2000SEP08 2118284R6 1 370
56 LG:346663.5:2000SEP08 4616708H1 168 331
56 LG:346663.5:2000SEP08 g2241506 172 611
56 LG:346663.5:2000SEP08 g2054215 202 587
56 LG:346663.5:2000SEP08 g2882171 226 616
56 LG:346663.5:2000SEP08 g6086765 236 615
56 LG:346663.5:2000SEP08 g5547339 241 615
56 LG:346663.5:2000SEP08 g3753350 243 615
56 LG:346663.5:2000SEP08 g2054056 253 616
56 LG:346663.5:2000SEP08 g6477039 271 618
56 LG:346663.5:2000SEP08 g2806480 332 612
56 LG:346663.5:2000SEP08 053836H1 381 606
56 LG:346663.5:2000SEP08 261802H1 88 258
56 LG:346663.5:2000SEP08 2118284H1 1 266
56 LG:346663.5:2000SEP08 857738H1 54 354
56 LG:346663.5:2000SEP08 6844557H1 89 619
57 LG:347615.1:2000SEP08 3532722H1 1 184
57 LG:347615.1:2000SEP08 3532722F6 1 259
57 LG:347615.1:2000SEP08 3532722T6 40 560
58 LG:1397067.1:2000SEP08 g6568429 39 440
58 LG:1397067.1:2000SEP08 7606974H1 1 417
58 LG:1397067.1:2000SEP08 4112486H1 547 812
58 LG:1397067.1:2000SEP08 2207794F6 436 998
58 LG:1397067.1:2000SEP08 2207794H1 436 705
58 LG:1397067.1:2000SEP08 2432882H1 60 289
58 LG:1397067.1:2000SEP08 4790250H1 243 506
58 LG:1397067.1:2000SEP08 5095324H1 794 1055
58 LG:1397067.1:2000SEP08 7002940H1 754 961
59 LG:120675.1:2000SEP08 3991393R6 20 532
59 LG:120675.1:2000SEP08 5356838H1 1 207
59 LG:120675.1:2000SEP08 3191083H1 390 715
59 LG:120675.1:2000SEP08 3191083R6 390 966
59 LG:120675.1:2000SEP08 3343664H1 324 587
59 LG:120675.1:2000SEP08 166952H1 111 413
59 LG:120675.1:2000SEP08 6917760H1 137 619
59 LG:120675.1:2000SEP08 7143843H1 286 867
59 LG:120675.1:2000SEP08 3274716H1 571 817
59 LG:120675.1:2000SEP08 g1963587 434 837
59 LG:120675.1:2000SEP08 g2105985 623 1074
59 LG:120675.1:2000SEP08 7040182H1 623 1145
59 LG:120675.1:2000SEP08 7674849H2 605 1197
59 LG:120675.1:2000SEP08 2866547F6 1100 1490
59 LG:120675.1:2000SEP08 2908672H1 804 1071
59 LG:120675.1:2000SEP08 1257063H1 941 1174
59 LG:120675.1:2000SEP08 4947527H1 943 1198
59 LG:120675.1:2000SEP08 6781420H1 980 1559
59 LG:120675.1:2000SEP08 2813478H1 2 291
59 LG:120675.1:2000SEP08 2908672F6 804 1112
59 LG:120675.1:2000SEP08 3991393T6 639 1134
59 LG:120675.1:2000SEP08 1553725H1 756 953
59 LG:120675.1:2000SEP08 1348958H1 1224 1502
59 LG:120675.1:2000SEP08 6421838H1 1307 1575
59 LG:120675.1:2000SEP08 6292026H1 1136 1370
59 LG:120675.1:2000SEP08 2866547H1 1100 1415
59 LG:120675.1:2000SEP08 6295036H1 1136 1438
59 LG:120675.1:2000SEP08 g4686492 1334 1581
59 LG:120675.1:2000SEP08 7750150H1 49 328
59 LG:120675.1:2000SEP08 3991393H1 20 316
59 LG:120675.1:2000SEP08 g1765138 29 93
60 LG:420050.18:2000SEP08 6974290H1 1 479
60 LG:420050.18:2000SEP08 4246487H1 1 63
60 LG:420050.18:2000SEP08 2720177H1 1 173
60 LG:420050.18:2000SEP08 2649093T6 356 479
60 LG:420050.18:2000SEP08 2661286T6 226 479
60 LG:420050.18:2000SEP08 2649093F6 114 531
60 LG:420050.18:2000SEP08 2720177F6 1 376
60 LG:420050.18:2000SEP08 5623343H1 10 330
60 LG:420050.18:2000SEP08 2649093H1 114 315
60 LG:420050.18:2000SEP08 3454347H1 114 215
61 LG:220495.3:2000SEP08 5638207H1 1516 1793
61 LG:220495.3:2000SEP08 g827423 1530 1731
61 LG:220495.3:2000SEP08 6073452T6 1547 1822
61 LG:220495.3:2000SEP08 1299193F6 1547 1860
61 LG:220495.3:2000SEP08 1299193H1 1547 1737
61 LG:220495.3:2000SEP08 1299193T6 1547 1817
61 LG:220495.3:2000SEP08 2107254H1 1554 1826
61 LG:220495.3:2000SEP08 g560355 1646 1860
61 LG:220495.3:2000SEP08 g1784513 1688 1808
61 LG:220495.3:2000SEP08 2223448H1 1720 1856
61 LG:220495.3:2000SEP08 2660260H1 1745 1991
61 LG:220495.3:2000SEP08 4857157H1 1776 1981
61 LG:220495.3:2000SEP08 g1398375 1108 1516
61 LG:220495.3:2000SEP08 6917411H1 1118 1695
61 LG:220495.3:2000SEP08 7452024T1 1202 1717
61 LG:220495.3:2000SEP08 7449786T2 1205 1750
61 LG:220495.3:2000SEP08 3142603H1 1252 1408
61 LG:220495.3:2000SEP08 3142603F6 1253 1663
61 LG:220495.3:2000SEP08 7143143T8 1275 1687
61 LG:220495.3:2000SEP08 g4985474 1288 1713
61 LG:220495.3:2000SEP08 5099033H1 1310 1579
61 LG:220495.3:2000SEP08 4147460H1 1316 1507
61 LG:220495.3:2000SEP08 3142603T6 1365 1823
61 LG:220495.3:2000SEP08 3689942H1 1402 1679
61 LG:220495.3:2000SEP08 g4394682 1427 1869
61 LG:220495.3:2000SEP08 g3418148 1429 1868
61 LG:220495.3:2000SEP08 g1398292 1439 1845
61 LG:220495.3:2000SEP08 6197152H1 1448 1883
61 LG:220495.3:2000SEP08 2630960T6 1458 1827
61 LG:220495.3:2000SEP08 g3424422 1492 1866
61 LG:220495.3:2000SEP08 g1139137 836 1180
61 LG:220495.3:2000SEP08 2555307H1 956 1207
61 LG:220495.3:2000SEP08 2554584H1 956 1205
61 LG:220495.3:2000SEP08 g781616 1019 1303
61 LG:220495.3:2000SEP08 6937534H1 1045 1646
61 LG:220495.3:2000SEP08 6092295H1 1104 1361
61 LG:220495.3:2000SEP08 g4435582 1 380
61 LG:220495.3:2000SEP08 6073452H1 72 231
61 LG:220495.3:2000SEP08 7204038H1 148 663
61 LG:220495.3:2000SEP08 6981431H1 294 824
61 LG:220495.3:2000SEP08 2630960H1 730 964
61 LG:220495.3:2000SEP08 2630960F6 730 1089
61 LG:220495.3:2000SEP08 609082H1 801 1064
61 LG:220495.3:2000SEP08 g564302 804 1002
61 LG:220495.3:2000SEP08 6572570J1 1 436
61 LG:220495.3:2000SEP08 7447785T2 1788 2239
61 LG:220495.3:2000SEP08 3242659H1 1893 2113
61 LG:220495.3:2000SEP08 g823764 2038 2374
62 LG:274551.1:2000SEP08 g4325750 1 103
62 LG:274551.1:2000SEP08 4290049H1 1 124
62 LG:274551.1:2000SEP08 4290049F6 1 353
62 LG:274551.1:2000SEP08 5493752H1 172 444
63 LG:429658.27:2000SEP08 g5110045 203 646
63 LG:429658.27:2000SEP08 g5665175 196 645
63 LG:429658.27:2000SEP08 g3700373 299 642
63 LG:429658.27:2000SEP08 g3001514 414 640
63 LG:429658.27:2000SEP08 g3835002 361 639
63 LG:429658.27:2000SEP08 g1367049 263 638
63 LG:429658.27:2000SEP08 g2110578 472 639
63 LG:429658.27:2000SEP08 g1367140 292 625
63 LG:429658.27:2000SEP08 5825096H1 202 557
63 LG:429658.27:2000SEP08 943894R1 1 539
63 LG:429658.27:2000SEP08 5825196H1 202 521
63 LG:429658.27:2000SEP08 g1502057 260 460
63 LG:429658.27:2000SEP08 g3118005 249 457
63 LG:429658.27:2000SEP08 g3041363 195 457
63 LG:429658.27:2000SEP08 g3048788 190 457
63 LG:429658.27:2000SEP08 943894H1 1 302
64 LG:246194.18:2000SEP08 6944445H1 702 1175
64 LG:246194.18:2000SEP08 5028768H1 723 995
64 LG:246194.18:2000SEP08 1339502H1 750 1023
64 LG:246194.18:2000SEP08 4317877H1 751 956
64 LG:246194.18:2000SEP08 1671388H1 703 782
64 LG:246194.18:2000SEP08 1671264H1 703 910
64 LG:246194.18:2000SEP08 5028776H1 723 991
64 LG:246194.18:2000SEP08 2365229H1 767 1007
64 LG:246194.18:2000SEP08 030557H1 768 896
64 LG:246194.18:2000SEP08 031790H1 774 1007
64 LG:246194.18:2000SEP08 031792H1 774 1022
64 LG:246194.18:2000SEP08 031946H1 774 966
64 LG:246194.18:2000SEP08 031427H1 774 974
64 LG:246194.18:2000SEP08 032531H1 774 1025
64 LG:246194.18:2000SEP08 031608H1 774 995
64 LG:246194.18:2000SEP08 030659H1 774 950
64 LG:246194.18:2000SEP08 g1270439 780 1174
64 LG:246194.18:2000SEP08 g1241966 798 1082
64 LG:246194.18:2000SEP08 g1990977 798 982
64 LG:246194.18:2000SEP08 g981948 812 1087
64 LG:246194.18:2000SEP08 4700087H1 820 1076
64 LG:246194.18:2000SEP08 g749487 838 904
64 LG:246194.18:2000SEP08 6744795H1 840 1422
64 LG:246194.18:2000SEP08 g711338 888 1160
64 LG:246194.18:2000SEP08 g716153 898 1195
64 LG:246194.18:2000SEP08 2720633H1 931 1164
64 LG:246194.18:2000SEP08 1676222H1 969 1178
64 LG:246194.18:2000SEP08 5810107H1 969 1253
64 LG:246194.18:2000SEP08 1675139H1 969 1175
64 LG:246194.18:2000SEP08 6169319H1 969 1268
64 LG:246194.18:2000SEP08 g828043 1033 1288
64 LG:246194.18:2000SEP08 5818587H1 1057 1163
64 LG:246194.18:2000SEP08 5818729H1 1057 1375
64 LG:246194.18:2000SEP08 5814910H1 1057 1366
64 LG:246194.18:2000SEP08 5813775H1 1057 1359
64 LG:246194.18:2000SEP08 5188360H1 584 856
64 LG:246194.18:2000SEP08 g1281006 674 1140
64 LG:246194.18:2000SEP08 g869772 695 1019
64 LG:246194.18:2000SEP08 2888196H1 1 278
64 LG:246194.18:2000SEP08 2888196F6 1 562
64 LG:246194.18:2000SEP08 1739976R6 2 71
64 LG:246194.18:2000SEP08 1739976H1 3 229
64 LG:246194.18:2000SEP08 1315633H1 87 280
64 LG:246194.18:2000SEP08 079560H1 694 852
64 LG:246194.18:2000SEP08 4321295H1 699 964
64 LG:246194.18:2000SEP08 4144711H1 96 451
64 LG:246194.18:2000SEP08 5546736H1 210 419
64 LG:246194.18:2000SEP08 6450480H1 237 805 CO rn ©
TABLE 3
SEQ ID NO: Template ID Component ID Start Stop
64 LG:246194.18:2000SEP08 4553620H1 1115 1383
64 LG:246194.18:2000SEP08 5819710H1 1121 1311
64 LG:246194.18:2000SEP08 g734631 1153 1269
64 LG:246194.18:2000SEP08 g2185031 1168 1304
64 LG:246194.18:2000SEP08 5592178H1 1180 1436
64 LG:246194.18:2000SEP08 1949202H1 1179 1436
64 LG:246194.18:2000SEP08 4861510H1 1207 1476
64 LG:246194.18:2000SEP08 6213288H1 1328 1480
64 LG:246194.18:2000SEP08 3614166T6 1348 2000
65 LG:000874.1:2000SEP08 4178064T6 1870 2249
65 LG:000874.1:2000SEP08 g2836242 1892 2253
65 LG:000874.1:2000SEP08 g4147256 1894 2282
65 LG:000874.1:2000SEP08 g3678872 1895 2285
65 LG:000874.1:2000SEP08 4515055H1 1904 2152
65 LG:000874.1:2000SEP08 676441R6 1983 2282
65 LG:000874.1:2000SEP08 676441H1 1983 2256
65 LG:000874.1:2000SEP08 676441T6 1983 2241
65 LG:000874.1:2000SEP08 672386H1 1983 2241
65 LG:000874.1:2000SEP08 g6043865 2104 2282
65 LG:000874.1:2000SEP08 g5663173 2145 2274
65 LG:000874.1:2000SEP08 4178064F6 520 1026
65 LG:000874.1:2000SEP08 6000160H1 968 1466
65 LG:000874.1:2000SEP08 6308558H1 967 1528
65 LG:000874.1:2000SEP08 6558026H1 1399 1961
65 LG:000874.1:2000SEP08 4311435H1 1417 1572
65 LG:000874.1:2000SEP08 6560545H1 1429 1940
65 LG:000874.1:2000SEP08 6445251T8 1627 2176
65 LG:000874.1:2000SEP08 3468564T6 1858 2234
65 LG:000874.1:2000SEP08 3468564H1 1865 2137
65 LG:000874.1:2000SEP08 3468564F6 1865 2218
65 LG:000874.1:2000SEP08 7073540H1 1 329
65 LG:000874.1:2000SEP08 7287283H1 82 396
65 LG:000874.1:2000SEP08 4711204H1 92 226
65 LG:000874.1:2000SEP08 6444020H1 95 611
65 LG:000874.1:2000SEP08 6445251F8 339 900
65 LG:000874.1:2000SEP08 6445251H1 339 460
65 LG:000874.1:2000SEP08 4178064H1 520 770
66 LG:239967.7:2000SEP08 4137638H1 67 340
66 LG:239967.7:2000SEP08 1002437H1 1 226
66 LG:239967.7:2000SEP08 6452421H2 4 321
66 LG:239967.7:2000SEP08 5699293H1 5 196
67 LG:238388.1:2000SEP08 g4114141 3607 3994
67 LG:238388.1:2000SEP08 2996081T6 3611 3957
67 LG:238388.1:2000SEP08 5550244H1 3628 3874
67 LG:238388.1:2000SEP08 4787718H1 3645 3890
67 LG:238388.1:2000SEP08 1712082H1 3647 3849
67 LG:238388.1:2000SEP08 5205979H1 3673 3911
67 LG:238388.1:2000SEP08 865219T1 3741 3961
67 LG:238388.1:2000SEP08 865219H1 3740 3908
67 LG:238388.1:2000SEP08 2819356H1 3749 3994
67 LG:238388.1:2000SEP08 816218R1 3809 3994
67 LG:238388.1:2000SEP08 816218H1 3809 3994
67 LG:238388.1:2000SEP08 1406892T6 3811 3956
67 LG:238388.1:2000SEP08 1406892F6 3818 3994
67 LG:238388.1:2000SEP08 1406892H1 3818 3997
67 LG:238388.1:2000SEP08 g2017794 3852 3994
67 LG:238388.1:2000SEP08 816218T1 3856 3953
67 LG:238388.1:2000SEP08 g1210982 3872 4000
67 LG:238388.1:2000SEP08 523420H1 3936 3994
67 LG:238388.1:2000SEP08 2354510H1 3938 3994
67 LG:238388.1:2000SEP08 5769009H1 749 1071
67 LG:238388.1:2000SEP08 6772048H1 875 1470
67 LG:238388.1:2000SEP08 g1959595 922 1317
67 LG:238388.1:2000SEP08 6916395H1 1002 1294
67 LG:238388.1:2000SEP08 3563552H1 1004 1310
67 LG:238388.1:2000SEP08 6916017H1 1005 1294
67 LG:238388.1:2000SEP08 7661482H1 1098 1671
67 LG:238388.1:2000SEP08 4777952H1 1260 1531
67 LG:238388.1:2000SEP08 5431473H1 1426 1691
67 LG:238388.1:2000SEP08 5431473F6 1426 1731
67 LG:238388.1:2000SEP08 7371680H1 1453 1844
67 LG:238388.1:2000SEP08 g2825556 1497 2004
67 LG:238388.1:2000SEP08 479240H1 1549 1758
67 LG:238388.1:2000SEP08 g5809860 1609 1891
67 LG:238388.1:2000SEP08 5431473T6 1640 2165
67 LG:238388.1:2000SEP08 2886276H1 1656 1928
67 LG:238388.1:2000SEP08 3788121H1 1797 2072
67 LG:238388.1:2000SEP08 4129226H2 1857 2111
67 LG:238388.1:2000SEP08 7661482J1 1860 2408
67 LG:238388.1:2000SEP08 129490H1 1858 2075
67 LG:238388.1:2000SEP08 5835769H1 1934 2196
67 LG:238388.1:2000SEP08 7677342J1 2028 2587
67 LG:238388.1:2000SEP08 5727827H1 2057 2563
67 LG:238388.1:2000SEP08 5975633H1 2082 2535
67 LG:238388.1:2000SEP08 7006547H1 2095 2661
67 LG:238388.1:2000SEP08 g3146134 2211 2487
67 LG:238388.1:2000SEP08 6757328H1 2388 2954
67 LG:238388.1:2000SEP08 7737258H1 2481 3065
67 LG:238388.1:2000SEP08 2922871H1 2556 2836
67 LG:238388.1:2000SEP08 5412320H1 2576 2857
67 LG:238388.1:2000SEP08 3821375H1 2611 2853
67 LG:238388.1:2000SEP08 3821375F6 2612 3062
67 LG:238388.1:2000SEP08 4337484H1 2617 2942
67 LG:238388.1:2000SEP08 1436846T6 2629 2852
67 LG:238388.1:2000SEP08 4219215H1 2689 2971
67 LG:238388.1:2000SEP08 7757483J1 1 659
67 LG:238388.1:2000SEP08 2878474H1 3358 3638
67 LG:238388.1:2000SEP08 3794206H1 3412 3709
67 LG:238388.1:2000SEP08 2285158H1 3418 3580
67 LG:238388.1:2000SEP08 3041557H1 3441 3722
67 LG:238388.1:2000SEP08 2133969H1 3456 3710
67 LG:238388.1:2000SEP08 3821375T6 3471 3958
67 LG:238388.1:2000SEP08 2020109T6 3484 3952
67 LG:238388.1:2000SEP08 5139405H1 3487 3752
67 LG:238388.1:2000SEP08 886271H1 3488 3788
67 LG:238388.1:2000SEP08 1969228H1 3496 3771
67 LG:238388.1:2000SEP08 1969228R6 3496 3962
67 LG:238388.1:2000SEP08 1969228T6 3496 3941
67 LG:238388.1:2000SEP08 4207304H1 3528 3747
67 LG:238388.1:2000SEP08 g5110635 3531 3995
67 LG:238388.1:2000SEP08 g6656819 3533 3994
67 LG:238388.1:2000SEP08 4200482H1 3534 3837
67 LG:238388.1:2000SEP08 g3889867 3566 3984
67 LG:238388.1:2000SEP08 1808041T6 3565 3962
67 LG:238388.1:2000SEP08 g4850515 3579 3994
67 LG:238388.1:2000SEP08 1481976T6 3593 3932
68 LG:233674.4:2000SEP08 g3048238 2946 3414
68 LG:233674.4:2000SEP08 g2820354 2909 3415
68 LG:233674.4:2000SEP08 1954924T6 2910 3380
68 LG:233674.4:2000SEP08 7335679H1 2918 3430
68 LG:233674.4:2000SEP08 626535H1 2922 3164
68 LG:233674.4:2000SEP08 626535R6 2922 3413
68 LG:233674.4:2000SEP08 6599347H1 2932 3414
68 LG:233674.4:2000SEP08 g5593063 2933 3420
68 LG:233674.4:2000SEP08 347940H1 2932 3059
68 LG:233674.4:2000SEP08 g666613 2948 3412
68 LG:233674.4:2000SEP08 g6702571 2942 3414
68 LG:233674.4:2000SEP08 6343858H1 2943 3214
68 LG:233674.4:2000SEP08 6316286H1 2943 3121
68 LG:233674.4:2000SEP08 2598676H1 1647 1909
68 LG:233674.4:2000SEP08 2598676F6 1647 2146
68 LG:233674.4:2000SEP08 7711995J1 1667 2262
68 LG:233674.4:2000SEP08 g564225 1730 1960
68 LG:233674.4:2000SEP08 4346248H1 1731 2018
68 LG:233674.4:2000SEP08 666846H1 1757 2030
68 LG:233674.4:2000SEP08 2749327H1 1769 2023
68 LG:233674.4:2000SEP08 6798844H1 1889 2448
68 LG:233674.4:2000SEP08 469885H1 1890 2050
68 LG:233674.4:2000SEP08 g5837868 2954 3425
68 LG:233674.4:2000SEP08 g6986716 2960 3414
68 LG:233674.4:2000SEP08 g4311811 2960 3414
68 LG:233674.4:2000SEP08 g5177329 2961 3392
68 LG:233674.4:2000SEP08 g6661879 2962 3418
68 LG:233674.4:2000SEP08 5329124H1 2696 2943
68 LG:233674.4:2000SEP08 g866548 2708 3015
68 LG:233674.4:2000SEP08 2935630H1 2708 2976
68 LG:233674.4:2000SEP08 g855143 2708 2983
68 LG:233674.4:2000SEP08 3854205H1 2725 3021
68 LG:233674.4:2000SEP08 2755008H1 2727 2906
68 LG:233674.4:2000SEP08 2232604H1 2729 2884
68 LG:233674.4:2000SEP08 g1200259 2735 3017
68 LG:233674.4:2000SEP08 3970611H1 2740 2998
68 LG:233674.4:2000SEP08 3967950H1 2753 3000
68 LG:233674.4:2000SEP08 4542927H1 2757 3019
68 LG:233674.4:2000SEP08 3967941H1 2762 3000
68 LG:233674.4:2000SEP08 5274070T9 2781 3319
68 LG:233674.4:2000SEP08 2831682H1 2818 3077
68 LG:233674.4:2000SEP08 2118604T6 2837 3385
68 LG:233674.4:2000SEP08 g667071 2841 3129
68 LG:233674.4:2000SEP08 4379465H1 2855 3133
68 LG:233674.4:2000SEP08 642755F1 2882 3418
68 LG:233674.4:2000SEP08 5703055F6 463 677
68 LG:233674.4:2000SEP08 6806348F8 472 1133
68 LG:233674.4:2000SEP08 642752H1 83 301
68 LG:233674.4:2000SEP08 6920944H1 110 500
68 LG:233674.4:2000SEP08 5379362H1 452 714
68 LG:233674.4:2000SEP08 5703055H1 463 592
68 LG:233674.4:2000SEP08 6806348H1 472 986
68 LG:233674.4:2000SEP08 4969531H1 495 699
68 LG:233674.4:2000SEP08 5972826H1 550 1068
68 LG:233674.4:2000SEP08 g1976170 551 936
68 LG:233674.4:2000SEP08 5972826F8 553 1140
68 LG:233674.4:2000SEP08 6829707J1 578 1096
68 LG:233674.4:2000SEP08 g2254998 3146 3414
68 LG:233674.4:2000SEP08 g2113016 3158 3414
68 LG:233674.4:2000SEP08 g4326422 3163 3415
68 LG:233674.4:2000SEP08 g4086677 3165 3415
68 LG:233674.4:2000SEP08 626535T6 3135 3375
68 LG:233674.4:2000SEP08 g1940748 3135 3414
68 LG:233674.4:2000SEP08 3154259H1 3146 3410
68 LG:233674.4:2000SEP08 550108H1 3117 3374
68 LG:233674.4:2000SEP08 6569734H1 3118 3663
68 LG:233674.4:2000SEP08 334525H1 3123 3358
68 LG:233674.4:2000SEP08 6372771H1 3132 3405
68 LG:233674.4:2000SEP08 g2058783 3135 3404
68 LG:233674.4:2000SEP08 4858527H1 2690 2894
68 LG:233674.4:2000SEP08 5489310H1 2696 2980
68 LG:233674.4:2000SEP08 g2154333 2681 3090
68 LG:233674.4:2000SEP08 6325343H1 2648 2877
68 LG:233674.4:2000SEP08 1383313H1 2682 2936
68 LG:233674.4:2000SEP08 6356947H1 2584 2794
68 LG:233674.4:2000SEP08 1954924H1 2486 2708
68 LG:233674.4:2000SEP08 g4734601 3289 3406
68 LG:233674.4:2000SEP08 4545520H1 3295 3407
68 LG:233674.4:2000SEP08 g2788765 3300 3418
68 LG:233674.4:2000SEP08 g1887492 3304 3404
68 LG:233674.4:2000SEP08 3349435H1 3329 3406
68 LG:233674.4:2000SEP08 g6836622 3410 3869
68 LG:233674.4:2000SEP08 g4990931 3412 3786
68 LG:233674.4:2000SEP08 g3151731 3412 3801
68 LG:233674.4:2000SEP08 2735154H1 3595 3844
68 LG:233674.4:2000SEP08 2673809H1 3957 4045
68 LG:233674.4:2000SEP08 2673828H1 3963 4045
68 LG:233674.4:2000SEP08 6881871H1 3978 4045
68 LG:233674.4:2000SEP08 4772282F7 1908 2513
68 LG:233674.4:2000SEP08 4772282H1 1908 2165
68 LG:233674.4:2000SEP08 4982350H1 1971 2242
68 LG:233674.4:2000SEP08 3073533H1 1979 2244
68 LG:233674.4:2000SEP08 7057649H1 2082 2531
68 LG:233674.4:2000SEP08 6798844J1 2084 2726
68 LG:233674.4:2000SEP08 g2191282 2108 2416
68 LG:233674.4:2000SEP08 7239623H1 2203 2496
68 LG:233674.4:2000SEP08 3285729H1 1309 1566
68 LG:233674.4:2000SEP08 3285729F6 1309 1891
68 LG:233674.4:2000SEP08 3599653H1 1353 1652
68 LG:233674.4:2000SEP08 5617347H1 1437 1651
68 LG:233674.4:2000SEP08 1893894H1 1418 1651
68 LG:233674.4:2000SEP08 5800444H1 711 1191
68 LG:233674.4:2000SEP08 4051463H1 1432 1607
68 LG:233674.4:2000SEP08 5274070F9 1443 2084
68 LG:233674.4:2000SEP08 7711995H1 984 1604
68 LG:233674.4:2000SEP08 774497Ul 992 1543
68 LG:233674.4:2000SEP08 1483611H1 1106 1351
68 LG:233674.4:2000SEP08 6197176H1 1302 1815
68 LG:233674.4:2000SEP08 5274070H1 1443 1685
68 LG:233674.4:2000SEP08 5274070F8 1443 1989
68 LG:233674.4:2000SEP08 4051463F8 1444 2021
68 LG:233674.4:2000SEP08 6534830H1 1525 2050
68 LG:233674.4:2000SEP08 7764777J1 1570 2170
68 LG:233674.4:2000SEP08 6806348J1 1553 2175
68 LG:233674.4:2000SEP08 7218885H1 1628 2143
68 LG:233674.4:2000SEP08 6757432F8 1 345
68 LG:233674.4:2000SEP08 6757432H1 18 570
68 LG:233674.4:2000SEP08 3486566H1 49 267
68 LG:233674.4:2000SEP08 642755R1 83 642
69 LG:411327.2:2000SEP08 5608336H1 1160 1409
69 LG:411327.2:2000SEP08 5608336F6 1160 1676
69 LG:411327.2:2000SEP08 3521640H1 1285 1457
69 LG:411327.2:2000SEP08 3521640R6 1303 1469
69 LG:411327.2:2000SEP08 6841761H1 1402 1855
69 LG:411327.2:2000SEP08 2417869H1 1505 1727
69 LG:411327.2:2000SEP08 2417869F6 1505 1772
69 LG:411327.2:2000SEP08 6758624J1 1540 2134
69 LG:411327.2:2000SEP08 1342359F6 1580 2113
69 LG:411327.2:2000SEP08 1342359H1 1580 1800
69 LG:411327.2:2000SEP08 g5848294 1651 2110
69 LG:411327.2:2000SEP08 g864162 1772 2090
69 LG:411327.2:2000SEP08 g795684 1834 2099
69 LG:411327.2:2000SEP08 g3134525 1842 2142
69 LG:411327.2:2000SEP08 2829095H1 415 664
69 LG:411327.2:2000SEP08 800329H1 440 686
69 LG:411327.2:2000SEP08 2240031H1 447 699
69 LG:411327.2:2000SEP08 7462852H1 470 769
69 LG:411327.2:2000SEP08 7453938H1 606 1167
69 LG:411327.2:2000SEP08 752915H1 617 894
69 LG:411327.2:2000SEP08 5083742H1 700 924
69 LG:411327.2:2000SEP08 6758624H1 815 1390
69 LG:411327.2:2000SEP08 3000571H1 1021 1312
69 LG:411327.2:2000SEP08 7115241H1 1136 1685
69 LG:411327.2:2000SEP08 3336908H1 73 304
69 LG:411327.2:2000SEP08 642608R6 110 368
69 LG:411327.2:2000SEP08 642608H1 110 360
69 LG:411327.2:2000SEP08 7763655H1 120 746
69 LG:411327.2:2000SEP08 803330H1 400 642
69 LG:411327.2:2000SEP08 6805668 l 1 581
69 LG:411327.2:2000SEP08 g1928792 1 296
69 LG:411327.2:2000SEP08 3760715H1 65 384
69 LG:411327.2:2000SEP08 2106453H1 53 344
69 LG:411327.2:2000SEP08 g1481768 66 374
69 LG:411327.2:2000SEP08 2106453R6 54 475
69 LG:411327.2:2000SEP08 g1163559 61 449
69 LG:411327.2:2000SEP08 3473472H1 66 397
70 LG:1327310.1:2000SEP08 6795864H1 1 240
70 LG:1327310.1:2000SEP08 6795864F8 1 589
70 LG:1327310.1:2000SEP08 6791830H1 2 403
70 LG:1327310.1:2000SEP08 6795864T8 74 563
71 LG:242019.13:2000SEP08 4651672H1 1 244
71 LG:242019.13:2000SEP08 7713679J2 24 634
72 LG:012432.12:2000SEP08 222784R1 558 861
72 LG:012432.12:2000SEP08 223917H1 558 733
72 LG:012432.12:2000SEP08 223917R6 558 863
72 LG:012432.12:2000SEP08 g2620512 582 893
72 LG:012432.12:2000SEP08 7640335H2 182 783
72 LG:012432.12:2000SEP08 3271885H1 1 233
72 LG:012432.12:2000SEP08 5638620H1 1 184
72 LG:012432.12:2000SEP08 g2111772 1 436
72 LG:012432.12:2000SEP08 7640335J2 439 841
72 LG:012432.12:2000SEP08 5536760H1 9 267
72 LG:012432.12:2000SEP08 1484585F6 11 506
72 LG:012432.12:2000SEP08 1484585H1 11 271
72 LG:012432.12:2000SEP08 5957017H1 128 642
72 LG:012432.12:2000SEP08 222784H1 558 739
73 LG:257088.9:2000SEP08 g5542884 1351 1518
73 LG:257088.9:2000SEP08 g2840861 1039 1520
73 LG:257088.9:2000SEP08 2657966T6 567 1061
73 LG:257088.9:2000SEP08 g1069716 680 1035
73 LG:257088.9:2000SEP08 4580838H1 772 985
73 LG:257088.9:2000SEP08 5526482H1 717 964
73 LG:257088.9:2000SEP08 1705927H1 677 898
73 LG:257088.9:2000SEP08 g3070773 546 837
73 LG:257088.9:2000SEP08 2657906H1 531 742
73 LG:257088.9:2000SEP08 6267885H1 181 688
73 LG:257088.9:2000SEP08 6134084H1 754 964
73 LG:257088.9:2000SEP08 g3177730 1071 1514
73 LG:257088.9:2000SEP08 g2782497 1077 1441
73 LG:257088.9:2000SEP08 g5395430 1052 1432
73 LG:257088.9:2000SEP08 g1295210 786 1223
73 LG:257088.9:2000SEP08 6336121H1 563 1076
73 LG:257088.9:2000SEP08 g5768628 1043 1409
73 LG:257088.9:2000SEP08 g5767051 1008 1388
73 LG:257088.9:2000SEP08 g5638685 1008 1284
73 LG:257088.9:2000SEP08 7763642J1 628 1260
73 LG:257088.9:2000SEP08 7375784H1 1 270
73 LG:257088.9:2000SEP08 2657906F6 531 1025
73 LG:257088.9:2000SEP08 6354333H1 677 1001
73 LG:257088.9:2000SEP08 6317132H1 677 956
73 LG:257088.9:2000SEP08 g3094268 1093 1518
74 LG:997505.5:2000SEP08 1317853H1 1320 1483
74 LG:997505.5:2000SEP08 5853247H1 1291 1560
74 LG:997505.5:2000SEP08 4998506H1 1 173
74 LG:997505.5:2000SEP08 4998506F9 1 550
74 LG:997505.5:2000SEP08 4998506F8 24 521
74 LG:997505.5:2000SEP08 5528150H1 488 767
74 LG:997505.5:2000SEP08 3400869H1 723 974
74 LG:997505.5:2000SEP08 4740562H2 828 1094
74 LG:997505.5:2000SEP08 6778477J1 899 1521
74 LG:997505.5:2000SEP08 g878200 1067 1329
74 LG:997505.5:2000SEP08 g888053 1070 1382
74 LG:997505.5:2000SEP08 1391708H1 1153 1370
74 LG:997505.5:2000SEP08 4534154F8 1175 1701
74 LG:997505.5:2000SEP08 4534310H1 1176 1421
74 LG:997505.5:2000SEP08 4534310F8 1175 1682
74 LG:997505.5:2000SEP08 4534154H1 1175 1429
74 LG:997505.5:2000SEP08 5853215H1 1291 1561
75 LG:481436.2:2000SEP08 3324046H1 274 573
75 LG:481436.2:2000SEP08 3635943H1 16 312
75 LG:481436.2:2000SEP08 g4875574 1324 1756
75 LG:481436.2:2000SEP08 g5391715 1300 1756
75 LG:481436.2:2000SEP08 942286H1 1607 1756
75 LG:481436.2:2000SEP08 2878889T6 1418 1852
75 LG:481436.2:2000SEP08 g6030295 1398 1756
75 LG:481436.2:2000SEP08 g5885853 1305 1756
75 LG:481436.2:2000SEP08 2536867T6 1383 1852
75 LG:481436.2:2000SEP08 gδ12334 1579 1755
75 LG:481436.2:2000SEP08 g2016599 1404 1750
75 LG:481436.2:2000SEP08 1684433T6 1491 1853
75 LG:481436.2:2000SEP08 4030547T8 1501 1765
75 LG:481436.2:2000SEP08 g2809664 1359 1749
75 LG:481436.2:2000SEP08 g4703900 1299 1705
75 LG:481436.2:2000SEP08 2551976T6 1367 1701
75 LG:481436.2:2000SEP08 6844365H1 1723 1892
75 LG:481436.2:2000SEP08 056667H1 1682 1905
75 LG:481436.2:2000SEP08 g2881641 1492 1901
75 LG:481436.2:2000SEP08 g856884 1575 1898
75 LG:481436.2:2000SEP08 g792144 1572 1892
75 LG:481436.2:2000SEP08 g1897691 1528 1892
75 LG:481436.2:2000SEP08 g4875377 1469 1892
75 LG:481436.2:2000SEP08 g2876435 1480 1892
75 LG:481436.2:2000SEP08 g5232555 1430 1892
75 LG:481436.2:2000SEP08 g2342028 1543 1892
75 LG:481436.2:2000SEP08 1391226H1 1673 1892
75 LG:481436.2:2000SEP08 3635943T6 1397 1856
76 LG:247776.14:2000SEP08 5022713H1 318 598
76 LG:247776.14:2000SEP08 6298110H1 321 394
76 LG:247776.14:2000SEP08 6298002H1 322 603
76 LG:247776.14:2000SEP08 5265311H2 353 598
76 LG:247776.14:2000SEP08 2583722F6 425 876
76 LG:247776.14:2000SEP08 5390936H1 195 410
76 LG:247776.14:2000SEP08 4493019H1 316 854
76 LG:247776.14:2000SEP08 2583722H1 425 651
76 LG:247776.14:2000SEP08 2396674H1 549 776
76 LG:247776.14:2000SEP08 5082245H1 553 802
76 LG:247776.14:2000SEP08 7063969H1 748 984
76 LG:247776.14:2000SEP08 3349303H1 1 287
76 LG:247776.14:2000SEP08 3469636H1 1 228
76 LG:247776.14:2000SEP08 4606239F6 39 248
76 LG:247776.14:2000SEP08 6545052H1 118 558
76 LG:247776.14:2000SEP08 3967662H1 757 1029
77 LG:008606.14:2000SEP08 7593063H1 1 586
77 LG:008606.14:2000SEP08 3559041H1 99 379
77 LG:008606.14:2000SEP08 533589H1 590 833
77 LG:008606.14:2000SEP08 2343384F6 649 1101
77 LG:008606.14:2000SEP08 2343384H1 649 922
77 LG:008606.14:2000SEP08 7460317H1 817 1045
77 LG:008606.14:2000SEP08 g1005379 907 1237
77 LG:008606.14:2000SEP08 6994923H1 909 1417
77 LG:008606.14:2000SEP08 3714147H1 910 1191
77 LG:008606.14:2000SEP08 2531438H1 915 1194
77 LG:008606.14:2000SEP08 4969732H1 943 1224
77 LG:008606.14:2000SEP08 2171139H1 969 1167
77 LG:008606.14:2000SEP08 676459H1 974 1250
77 LG:008606.14:2000SEP08 5803841H1 1062 1365
77 LG:008606.14:2000SEP08 5803934H1 1062 1353
77 LG:008606.14:2000SEP08 1574221H1 1063 1293
77 LG:008606.14:2000SEP08 1574221F6 1063 1598
77 LG:008606.14:2000SEP08 5803958H1 1068 1346
77 LG:008606.14:2000SEP08 5803950H1 1068 1352
77 LG:008606.14:2000SEP08 5804133H1 1068 1378
77 LG:008606.14:2000SEP08 2836642H1 410 653
77 LG:008606.14:2000SEP08 5538169H1 461 626
77 LG:008606.14:2000SEP08 3416607H1 495 748
77 LG:008606.14:2000SEP08 2607480H1 515 799
77 LG:008606.14:2000SEP08 5609774H1 533 669
77 LG:008606.14:2000SEP08 3097711H1 238 387
77 LG:008606.14:2000SEP08 5575416F6 240 801
77 LG:008606.14:2000SEP08 5575947F6 240 786
77 LG:008606.14:2000SEP08 5575416H1 240 505
77 LG:008606.14:2000SEP08 7192474H2 243 846
77 LG:008606.14:2000SEP08 7192601H2 246 844
77 LG:008606.14:2000SEP08 7368491H1 243 656
77 LG:008606.14:2000SEP08 6784752H2 248 354
77 LG:008606.14:2000SEP08 2489661H1 295 548
77 LG:008606.14:2000SEP08 5996265H1 236 456
77 LG:008606.14:2000SEP08 7734608J1 392 994
77 LG:008606.14:2000SEP08 2836642F6 410 844
77 LG:008606.14:2000SEP08 3559025H1 101 383
77 LG:008606.14:2000SEP08 6380356H1 223 536
77 LG:008606.14:2000SEP08 6381595H1 223 387
77 LG:008606.14:2000SEP08 3727284H1 228 511
77 LG:008606.14:2000SEP08 2497273H1 230 470
77 LG:008606.14:2000SEP08 3365462H1 230 394
77 LG:008606.14:2000SEP08 g5553168 233 658
77 LG:008606.14:2000SEP08 g6301979 233 661
77 LG:008606.14:2000SEP08 5676084H1 235 516
77 LG:008606.14:2000SEP08 5803849H1 1214 1368
77 LG:008606.14:2000SEP08 453194H1 1302 1418
77 LG:008606.14:2000SEP08 1574221T6 1130 1611
77 LG:008606.14:2000SEP08 g4223830 1200 1650
77 LG:008606.14:2000SEP08 5804149H1 1068 1373
77 LG:008606.14:2000SEP08 5803951H1 1069 1350
77 LG:008606.14:2000SEP08 2549778H1 1071 1318
77 LG:008606.14:2000SEP08 4361441H1 1083 1350
80 LG:245014.2:2000SEP08 2910268H1 1210 1468
80 LG:245014.2:2000SEP08 4372565H1 1212 1510
80 LG:245014.2:2000SEP08 133005H1 1217 1398
80 LG:245014.2:2000SEP08 133005R6 1217 1651
80 LG:245014.2:2000SEP08 6345408H1 1222 1495
81 LG:170754.4:2000SEP08 6450115H1 177 241
81 LG:170754.4:2000SEP08 7621674H1 293 761
81 LG:170754.4:2000SEP08 1525016H1 316 525
81 LG:170754.4:2000SEP08 3385664T6 656 1080
81 LG:170754.4:2000SEP08 g1291209 716 1120
81 LG:170754.4:2000SEP08 6578227H1 857 1088
81 LG:170754.4:2000SEP08 7586739H1 6 433
81 LG:170754.4:2000SEP08 7373510H1 1 476
81 LG:170754.4:2000SEP08 3288727H1 1 251
81 LG:170754.4:2000SEP08 7621674J1 1 672
82 LG:988028.1:2000SEP08 5770980F8 1085 1641
82 LG:988028.1:2000SEP08 5770980H1 1073 1594
82 LG:988028.1:2000SEP08 6267543F8 1 657
82 LG:988028.1:2000SEP08 6568067F8 1 665
82 LG:988028.1:2000SEP08 6267543H1 1 536
82 LG:988028.1:2000SEP08 6267543T8 158 699
82 LG:988028.1:2000SEP08 7765486J1 501 1150
82 LG:988028.1:2000SEP08 7765486H1 1353 1941
82 LG:988028.1:2000SEP08 g1303480 1368 1608
82 LG:988028.1:2000SEP08 6355546F7 1473 1797
82 LG:988028.1:2000SEP08 6355546H1 1473 1677
82 LG:988028.1:2000SEP08 6568067H1 1 62
82 LG:988028.1:2000SEP08 7330071H1 870 1307
82 LG:988028.1:2000SEP08 5625452H1 511 641
82 LG:988028.1:2000SEP08 7388455H1 649 1075
83 LG:427997.6:2000SEP08 3751266F6 1 358
83 LG:427997.6:2000SEP08 3751266H1 1 296
83 LG:427997.6:2000SEP08 5376913H1 683 945
83 LG:427997.6:2000SEP08 2681128F6 695 1170
83 LG:427997.6:2000SEP08 2681128H1 696 978
83 LG:427997.6:2000SEP08 1915736R6 731 1130
83 LG:427997.6:2000SEP08 g850466 493 823
83 LG:427997.6:2000SEP08 2557738H1 650 890
83 LG:427997.6:2000SEP08 2560627H1 650 908
83 LG:427997.6:2000SEP08 3321818H1 104 350
83 LG:427997.6:2000SEP08 6351045H2 329 660
83 LG:427997.6:2000SEP08 3321818F6 103 497
83 LG:427997.6:2000SEP08 1915736H1 731 964
83 LG:427997.6:2000SEP08 5296714H1 764 1029
83 LG:427997.6:2000SEP08 3934538H1 892 1190
83 LG:427997.6:2000SEP08 3934758H1 892 1185
83 LG:427997.6:2000SEP08 3934357H1 892 1190
83 LG:427997.6:2000SEP08 3934758F6 892 1393
TABLE 3
SEQ ID NO: Template ID Component ID Start Stop
85 LG:1400108.1:2000SEP08 7040568H1 306 711
85 LG:1400108.1:2000SEP08 4456536H1 587 788
85 LG:1400108.1:2000SEP08 4303918H1 299 475
86 LG:254531.1:2000SEP08 2116745R6 1 429
86 LG:254531.1:2000SEP08 4213937T6 179 561
86 LG:254531.1:2000SEP08 3983524T6 288 650
86 LG:254531.1:2000SEP08 g5513295 318 743
86 LG:254531.1:2000SEP08 2116745H1 1 238
86 LG:254531.1:2000SEP08 4771743H1 613 753
86 LG:254531.1:2000SEP08 4771743T6 833 899
86 LG:254531.1:2000SEP08 4771743F6 613 894
86 LG:254531.1:2000SEP08 g5678401 420 739
86 LG:254531.1:2000SEP08 g4082588 420 740
86 LG:254531.1:2000SEP08 g4148835 409 739
86 LG:254531.1:2000SEP08 g5369564 327 752
87 LG:1101317.1:2000SEP08 4111645H1 2107 2376
87 LG:1101317.1:2000SEP08 1719261H1 2130 2340
87 LG:1101317.1:2000SEP08 217174H1 2135 2326
87 LG:1101317.1:2000SEP08 2344908T6 2149 2693
87 LG:1101317.1:2000SEP08 7356032H1 2156 2648
87 LG:1101317.1:2000SEP08 1861240H1 1990 2250
87 LG:1101317.1:2000SEP08 3094796H1 2026 2320
87 LG:1101317.1:2000SEP08 7216644H1 2079 2650
87 LG:1101317.1:2000SEP08 g1961970 2087 2358
87 LG:1101317.1:2000SEP08 5888913H1 2089 2335
87 LG:1101317.1:2000SEP08 5884544H1 2089 2350
87 LG:1101317.1:2000SEP08 802026H1 2092 2315
87 LG:1101317.1:2000SEP08 7243661H1 2103 2462
87 LG:1101317.1:2000SEP08 g6402048 481 941
87 LG:1101317.1:2000SEP08 g2064228 489 942
87 LG:1101317.1:2000SEP08 7401413H1 568 1059
87 LG:1101317.1:2000SEP08 7114194H2 712 1212
87 LG:1101317.1:2000SEP08 143461T6 738 1212
87 LG:1101317.1:2000SEP08 4763709H1 754 1022
87 LG:1101317.1:2000SEP08 223371 l 1820 2310
87 LG:1101317.1:2000SEP08 1861240F6 1990 2391
87 LG:1101317.1:2000SEP08 229654H1 1820 2032
87 LG:1101317.1:2000SEP08 223371H1 1820 2049
87 LG:1101317.1:2000SEP08 5604126H1 1845 2063
87 LG:1101317.1:2000SEP08 1598553H1 1851 2059
87 LG:1101317.1:2000SEP08 3387636H1 1865 1980
87 LG:1101317.1:2000SEP08 907585H1 1866 2020
87 LG:1101317.1:2000SEP08 907585R2 1866 2433
87 LG:1101317.1:2000SEP08 6874748H1 1915 2386
87 LG:1101317.1:2000SEP08 g1955731 1921 2228
87 LG:1101317.1:2000SEP08 3696080H1 1977 2259
87 LG:1101317.1:2000SEP08 g1963935 1 474
87 LG:1101317.1:2000SEP08 687012H1 1 361
TABLE 3
SEQ ID NO: Template ID Component ID Start Stop
87 LG:1101317.1:2000SEP08 1643051F6 896 1251
87 LG:1101317.1:2000SEP08 4352568H1 933 1149
87 LG:1101317.1:2000SEP08 6955572H1 1013 1577
87 LG:1101317.1:2000SEP08 6954828H1 1100 1529
87 LG:1101317.1:2000SEP08 7056589H1 1120 1604
88 LG:1074728.6:2000SEP08 3115529H1 18 316
88 LG:1074728.6:2000SEP08 g1847994 177 565
88 LG:1074728.6:2000SEP08 2740256H1 1 194
88 LG:1074728.6:2000SEP08 g697772 7 228
88 LG:1074728.6:2000SEP08 g763738 1 95
88 LG:1074728.6:2000SEP08 3276680H1 1 200
88 LG:1074728.6:2000SEP08 6289128H2 70 327
88 LG:1074728.6:2000SEP08 g763739 368 564
88 LG:1074728.6:2000SEP08 1351083H1 303 559
88 LG:1074728.6:2000SEP08 1634580T6 186 561
88 LG:1074728.6:2000SEP08 2622314H1 228 415
88 LG:1074728.6:2000SEP08 g2245965 455 565
88 LG:1074728.6:2000SEP08 g1846405 446 564
89 LG:1081684.1:2000SEP08 g3675256 556 967
89 LG:1081684.1:2000SEP08 6772905J1 356 958
89 LG:1081684.1:2000SEP08 2189954F6 594 950
89 LG:1081684.1:2000SEP08 6198434H1 485 940
89 LG:1081684.1:2000SEP08 3215765F6 107 722
89 LG:1081684.1:2000SEP08 6772905H1 1 577
89 LG:1081684.1:2000SEP08 3324630H1 29 289
90 LG:1076520.1:2000SEP08 g5100723 625 705
90 LG:1076520.1:2000SEP08 6753645H1 488 960
90 LG:1076520.1:2000SEP08 7602495H1 634 881
90 LG:1076520.1:2000SEP08 6866862H1 296 824
90 LG:1076520.1:2000SEP08 6752544J1 1 467
90 LG:1076520.1:2000SEP08 g1476958 1 288
90 LG:1076520.1:2000SEP08 7390806H1 1 255
90 LG:1076520.1:2000SEP08 g4147917 604 710
90 LG:1076520.1:2000SEP08 g5448756 604 667
90 LG:1076520.1:2000SEP08 g5837902 604 710
90 LG:1076520.1:2000SEP08 g5592223 604 710
90 LG:1076520.1:2000SEP08 g5886003 604 710
90 LG:1076520.1:2000SEP08 g1479703 1 398
90 LG:1076520.1:2000SEP08 g6703597 604 710
90 LG:1076520.1:2000SEP08 g1425839 96 380
90 LG:1076520.1:2000SEP08 g4453645 604 700
90 LG:1076520.1:2000SEP08 g3053258 604 700
90 LG:1076520.1:2000SEP08 g5590354 604 710
90 LG:1076520.1:2000SEP08 g4078684 604 710
90 LG:1076520.1:2000SEP08 g1481928 1 324
90 LG:1076520.1:2000SEP08 g4189996 604 710
90 LG:1076520.1:2000SEP08 g5811183 604 710
90 LG:1076520.1:2000SEP08 g4395305 604 710
TABLE 3
SEQ ID NO: Template ID Component ID Start Stop
95 LG:1079470.6:2000SEP08 3946142F8 658 1195
95 LG:1079470.6:2000SEP08 638942R6 662 1217
95 LG:1079470.6:2000SEP08 2550648F6 662 1027
95 LG:1079470.6:2000SEP08 638942H1 662 917
95 LG:1079470.6:2000SEP08 2550648H1 662 904
95 LG:1079470.6:2000SEP08 5219834H2 687 921
95 LG:1079470.6:2000SEP08 543748R6 480 1034
95 LG:1079470.6:2000SEP08 5546359H1 484 677
95 LG:1079470.6:2000SEP08 5219834F7 690 1243
95 LG:1079470.6:2000SEP08 2318794H1 715 860
95 LG:1079470.6:2000SEP08 5202357H1 763 985
95 LG:1079470.6:2000SEP08 3466746H1 764 1006
95 LG:1079470.6:2000SEP08 g1302678 799 1071
95 LG:1079470.6:2000SEP08 2050882H1 825 1077
95 LG:1079470.6:2000SEP08 2733580H1 871 1094
95 LG:1079470.6:2000SEP08 4919329H1 893 1119
95 LG:1079470.6:2000SEP08 6167338H1 947 1025
95 LG:1079470.6:2000SEP08 6060464F8 411 1027
95 LG:1079470.6:2000SEP08 5996080F8 411 937
95 LG:1079470.6:2000SEP08 1451495F6 448 1072
95 LG:1079470.6:2000SEP08 1451495F1 448 917
95 LG:1079470.6:2000SEP08 1451495H1 448 695
95 LG:1079470.6:2000SEP08 543748H1 480 708
95 LG:1079470.6:2000SEP08 543748R1 480 865
95 LG:1079470.6:2000SEP08 5761741H1 299 577
95 LG:1079470.6:2000SEP08 4717311H1 302 535
95 LG:1079470.6:2000SEP08 2991334H1 80 419
95 LG:1079470.6:2000SEP08 g6197278 293 750
95 LG:1079470.6:2000SEP08 5757130F8 296 883
95 LG:1079470.6:2000SEP08 5757130H1 296 576
95 LG:1079470.6:2000SEP08 5996080H1 411 684
95 LG:1079470.6:2000SEP08 6541925H1 325 851
95 LG:1079470.6:2000SEP08 g965206 353 699
95 LG:1079470.6:2000SEP08 4717311F8 325 823
95 LG:1079470.6:2000SEP08 6779294H1 372 944
95 LG:1079470.6:2000SEP08 2266214H1 382 648
95 LG:1079470.6:2000SEP08 5996080F7 411 919
95 LG:1079470.6:2000SEP08 5992238H1 411 700
96 LG:345705.3:2000SEP08 6866333H1 1 508
96 LG:345705.3:2000SEP08 6743970H1 78 400
96 LG:345705.3:2000SEP08 5903130H1 1 270
96 LG:345705.3:2000SEP08 3134791H1 209 481
97 LG:1083654.1:2000SEP08 2961373T6 2740 2962
97 LG:1083654.1:2000SEP08 243195H1 2742 2862
97 LG:1083654.1:2000SEP08 2961373F6 2747 2997
97 LG:1083654.1:2000SEP08 2961373H1 2749 2877
97 LG:1083654.1:2000SEP08 5308958H1 2786 3006
97 LG:1083654.1:2000SEP08 1251913H1 2867 3008
97 LG:1083654.1:2000SEP08 1251913F6 2873 2951
97 LG:1083654.1:2000SEP08 2853591T6 2884 3294
97 LG:1083654.1:2000SEP08 1712778T6 2965 3407
97 LG:1083654.1:2000SEP08 g897183 3094 3370
97 LG:1083654.1:2000SEP08 7932016H1 3113 3732
97 LG:1083654.1:2000SEP08 g2057056 2249 2605
97 LG:1083654.1:2000SEP08 6558865T8 2319 2892
97 LG:1083654.1:2000SEP08 1712778H1 2320 2534
97 LG:1083654.1:2000SEP08 1712778F6 2320 2857
97 LG:1083654.1:2000SEP08 4787383H1 2530 2781
97 LG:1083654.1:2000SEP08 g4629623 2555 3002
97 LG:1083654.1:2000SEP08 g5512397 2563 3003
97 LG:1083654.1:2000SEP08 g5661867 2578 3033
97 LG:1083654.1:2000SEP08 5445604H1 2591 2832
97 LG:1083654.1:2000SEP08 g5511864 2593 3003
97 LG:1083654.1:2000SEP08 g2070577 2641 3010
97 LG:1083654.1:2000SEP08 g4334177 2644 3011
97 LG:1083654.1:2000SEP08 g2715986 2678 3002
97 LG:1083654.1:2000SEP08 6513330H1 2735 3027
97 LG:1083654.1:2000SEP08 7664249H1 250 808
97 LG:1083654.1:2000SEP08 7011821H1 430 863
97 LG:1083654.1:2000SEP08 g6438877 427 839
97 LG:1083654.1:2000SEP08 g3679253 430 886
97 LG:1083654.1:2000SEP08 g5231632 430 849
97 LG:1083654.1:2000SEP08 g3665440 430 880
97 LG:1083654.1:2000SEP08 g5554573 430 894
97 LG:1083654.1:2000SEP08 g6040151 433 870
97 LG:1083654.1:2000SEP08 g1242884 435 770
97 LG:1083654.1:2000SEP08 g4190333 436 814
97 LG:1083654.1:2000SEP08 5694841T6 481 910
97 LG:1083654.1:2000SEP08 g2240560 494 973
97 LG:1083654.1:2000SEP08 g1367813 558 1328
97 LG:1083654.1:2000SEP08 5695241T8 635 1078
97 LG:1083654.1:2000SEP08 g1367787 750 1356
97 LG:1083654.1:2000SEP08 4031856H1 991 1223
97 LG:1083654.1:2000SEP08 g6986275 1015 1569
97 LG:1083654.1:2000SEP08 6558865F8 1261 1866
97 LG:1083654.1:2000SEP08 6558865H1 1261 1764
97 LG:1083654.1:2000SEP08 7664249J1 1363 1917
97 LG:1083654.1:2000SEP08 5695241F9 1382 1960
97 LG:1083654.1:2000SEP08 4249967H1 1401 1654
97 LG:1083654.1:2000SEP08 4798922H1 1552 1795
97 LG:1083654.1:2000SEP08 2853591F6 1617 1985
97 LG:1083654.1:2000SEP08 2853591H1 1617 1703
97 LG:1083654.1:2000SEP08 5694841F6 1665 1956
97 LG:1083654.1:2000SEP08 1962583H1 1668 1933
97 LG:1083654.1:2000SEP08 6843753F8 1691 2211
97 LG:1083654.1:2000SEP08 6843753H1 1691 2271
TABLE 3
SEQ ID NO: Template ID Component ID Start Stop
112 LI:336218.1:2000SEP08 2840687F6 14 474
112 LI:336218.1:2000SEP08 2840687H1 14 274
112 LI:336218.1:2000SEP08 7229614H1 1 266
112 LI:336218.1:2000SEP08 5270428H1 1 185
112 LI:336218.1:2000SEP08 3233583H1 4 282
112 LI:336218.1:2000SEP08 7591929H1 4 539
112 LI:336218.1:2000SEP08 3471708H1 407 595
113 LI:235891.3:2000SEP08 4893031H2 1 282
113 LI:235891.3:2000SEP08 3116914H1 22 323
113 LI:235891.3:2000SEP08 60268406D1 157 747
114 LI:344094.1:2000SEP08 g667576 1 344
114 LI:344094.1:2000SEP08 g697637 68 384
114 LI:344094.1:2000SEP08 g1275867 105 547
114 LI:344094.1:2000SEP08 g655556 207 516
114 LI:344094.1:2000SEP08 g813917 237 594
114 LI:344094.1:2000SEP08 g900307 237 547
114 LI:344094.1:2000SEP08 g878170 237 635
114 LI:344094.1:2000SEP08 2922762F6 351 821
114 LI:344094.1:2000SEP08 2922762H1 351 425
114 LI:344094.1:2000SEP08 g672909 359 546
114 LI:344094.1:2000SEP08 g677456 359 543
114 LI:344094.1:2000SEP08 5492156H1 421 654
114 LI:344094.1:2000SEP08 g1068390 487 743
114 LI:344094.1:2000SEP08 g1043520 548 910
115 LI:399945.2:2000SEP08 g5527625 356 685
115 LI:399945.2:2000SEP08 g5236017 121 574
115 LI:399945.2:2000SEP08 g4735160 144 574
115 LI:399945.2:2000SEP08 g5742168 131 574
115 LI:399945.2:2000SEP08 g3932316 149 573
115 LI:399945.2:2000SEP08 g3108874 211 571
115 LI:399945.2:2000SEP08 g5594172 110 571
115 LI:399945.2:2000SEP08 g5878951 129 571
115 LI:399945.2:2000SEP08 g5054566 115 571
115 LI:399945.2:2000SEP08 g4687041 128 571
115 LI:399945.2:2000SEP08 g4393431 142 570
115 LI:399945.2:2000SEP08 g4650676 221 569
115 LI:399945.2:2000SEP08 g5769276 115 568
115 LI:399945.2:2000SEP08 g6710022 113 568
115 LI:399945.2:2000SEP08 g5235149 102 568
115 LI:399945.2:2000SEP08 g3869794 95 568
115 LI:399945.2:2000SEP08 g7317560 283 568
115 LI:399945.2:2000SEP08 g2616056 271 560
115 LI:399945.2:2000SEP08 g4988831 417 521
115 LI:399945.2:2000SEP08 g5339565 1 474
115 LI:399945.2:2000SEP08 4207047H1 146 419
116 LI:051849.1:2000SEP08 1468364F6 1 273
116 LI:051849.1:2000SEP08 1468364H1 1 140
116 LI:051849.1:2000SEP08 1468364TO 76 594
117 LI:238379.3:2000SEP08 60220769D1 423 716
117 LI:238379.3:2000SEP08 4140076H1 187 484
117 LI:238379.3:2000SEP08 7203996H1 1 531
118 LI:352190.8:2000SEP08 7462355H1 1 561
118 LI:352190.8:2000SEP08 2728751H1 99 359
118 LI:352190.8:2000SEP08 2728751F6 99 610
118 LI:352190.8:2000SEP08 70608138V1 476 732
118 LI:352190.8:2000SEP08 70605012V1 473 1021
118 LI:352190.8:2000SEP08 71485659V1 599 1205
119 LI:432120.1:2000SEP08 g7375879 385 847
119 LI:432120.1:2000SEP08 g6476525 387 846
119 LI:432120.1:2000SEP08 g7281497 577 844
119 LI:432120.1:2000SEP08 g4078434 408 842
119 LI:432120.1:2000SEP08 g5755093 389 839
119 LI:432120.1:2000SEP08 6244543H1 1 560
119 LI:432120.1:2000SEP08 6248154H1 1 540
119 LI:432120.1:2000SEP08 6243676H1 1 506
120 LI:055461.1:2000SEP08 4172005F6 752 1261
120 LI:055461.1:2000SEP08 4054692H1 752 1031
120 LI:055461.1:2000SEP08 70571894V1 755 1327
120 LI:055461.1:2000SEP08 70570215V1 812 1372
120 LI:055461.1:2000SEP08 70571158V1 837 1341
120 LI:055461.1:2000SEP08 70571350V1 861 1452
120 LI:055461.1:2000SEP08 70569978V1 880 1517
120 LI:055461.1:2000SEP08 70568295V1 879 1469
120 LI:055461.1:2000SEP08 70564777V1 881 1289
120 LI:055461.1:2000SEP08 70564128V1 886 1255
120 LI:055461.1:2000SEP08 70563524V1 887 1255
120 LI:055461.1:2000SEP08 70567973V1 906 1369
120 LI:055461.1:2000SEP08 70568827V1 909 1318
120 LI:055461.1:2000SEP08 70571280V1 921 1479
120 LI:055461.1:2000SEP08 70563885V1 914 1064
120 LI:055461.1:2000SEP08 70572315V1 921 1478
120 LI:055461.1:2000SEP08 70567979V1 947 1516
120 LI:055461.1:2000SEP08 2407240T6 1026 1509
120 LI:055461.1:2000SEP08 70568955V1 1024 1627
120 LI:055461.1:2000SEP08 70566789V1 1058 1324
120 LI:055461.1:2000SEP08 70568622V1 1065 1624
120 LI:055461.1:2000SEP08 70570905V1 1090 1686
120 LI:055461.1:2000SEP08 70570764V1 1089 1638
120 LI:055461.1:2000SEP08 g6198775 1125 1525
120 LI:055461.1:2000SEP08 70568160V1 1145 1627
120 LI:055461.1:2000SEP08 g3118198 1161 1525
120 LI:055461.1:2000SEP08 70568672V1 1183 1643
120 LI:055461.1:2000SEP08 70568087V1 1205 1624
120 LI:055461.1:2000SEP08 70568736V1 1214 1628
120 LI:055461.1:2000SEP08 70571754V1 1233 1627
120 LI:055461.1:2000SEP08 70569000V1 1272 1648
120 LI:055461.1:2000SEP08 1600869H1 1279 1527
120 LI:055461.1:2000SEP08 7974268H1 1 478
120 LI:055461.1:2000SEP08 4592590F8 13 578
120 LI:055461.1:2000SEP08 2457736F6 26 480
120 LI:055461.1:2000SEP08 2457736H1 26 154
120 LI:055461.1:2000SEP08 4194633H1 95 407
120 LI:055461.1:2000SEP08 4194633F7 99 684
120 LI:055461.1:2000SEP08 2073295H1 128 383
120 LI:055461.1:2000SEP08 2073295F6 128 484
120 LI:055461.1:2000SEP08 2895870H1 183 468
120 LI:055461.1:2000SEP08 491179H1 231 494
120 LI:055461.1:2000SEP08 1400739H1 292 535
120 LI:055461.1:2000SEP08 2286216H1 420 603
120 LI:055461.1:2000SEP08 2286216R6 420 815
120 LI:055461.1:2000SEP08 70563652V1 428 619
120 LI:055461.1:2000SEP08 4670223H1 434 687
120 LI:055461.1:2000SEP08 70571953V1 446 911
120 LI:055461.1:2000SEP08 70571173V1 523 1020
120 LI:055461.1:2000SEP08 70570848V1 530 1126
120 LI:055461.1:2000SEP08 70570874V1 568 1142
120 LI:055461.1:2000SEP08 70568559V1 571 1162
120 LI:055461.1:2000SEP08 70568000V1 583 1137
120 LI:055461.1:2000SEP08 70570860V1 601 911
120 LI:055461.1:2000SEP08 70571070V1 657 1178
120 LI:055461.1:2000SEP08 70569626V1 649 1262
120 LI:055461.1:2000SEP08 70570457V1 658 1315
120 LI:055461.1:2000SEP08 7091663H1 722 1072
120 LI:055461.1:2000SEP08 4172005H1 751 1021
120 LI:055461.1:2000SEP08 g4329308 1303 1529
120 LI:055461.1:2000SEP08 70569270V1 1315 1624
120 LI:055461.1:2000SEP08 2073295T6 1420 1624
120 LI:055461.1:2000SEP08 70562650V1 1442 1644
120 LI:055461.1:2000SEP08 70567951V1 1457 1627
120 LI:055461.1:2000SEP08 70568561V1 1511 1641
121 LI:197433.5:2000SEP08 6245010H1 1 85
121 LI:197433.5:2000SEP08 4111192H1 2 292
121 LI:197433.5:2000SEP08 70752461V1 9 511
121 LI:197433.5:2000SEP08 2923165F6 9 504
121 LI:197433.5:2000SEP08 70756635V1 9 607
121 LI:197433.5:2000SEP08 70751919V1 9 514
121 LI:197433.5:2000SEP08 70752173V1 9 481
121 LI:197433.5:2000SEP08 70755802V1 9 582
121 LI:197433.5:2000SEP08 2923165H1 9 296
121 LI:197433.5:2000SEP08 534459H1 14 244
121 LI:197433.5:2000SEP08 70756060V1 236 788
121 LI:197433.5:2000SEP08 70747907V1 252 632
121 LI:197433.5:2000SEP08 70754739V1 254 788
121 LI:197433.5:2000SEP08 3715708H1 258 367
121 LI:197433.5:2000SEP08 70772671V1 267 688
121 LI:197433.5:2000SEP08 70756945V1 267 813
121 LI:197433.5:2000SEP08 70756464V1 266 788
121 LI:197433.5:2000SEP08 70756300V1 266 788
121 LI:197433.5:2000SEP08 70754898V1 287 645
121 LI:197433.5:2000SEP08 g6087492 320 782
121 LI:197433.5:2000SEP08 g3756770 361 783
121 LI:197433.5:2000SEP08 g3214623 364 784
121 LI:197433.5:2000SEP08 1454194T6 369 746
121 LI:197433.5:2000SEP08 g5176421 370 779
121 LI:197433.5:2000SEP08 70752177V1 371 788
121 LI:197433.5:2000SEP08 1406510T6 429 737
121 LI:197433.5:2000SEP08 g3152247 429 792
121 LI:197433.5:2000SEP08 70757065V1 505 647
121 LI:197433.5:2000SEP08 70753582V1 550 776
122 LI:170604.1:2000SEP08 g1276258 1 257
122 LI:170604.1:2000SEP08 g4371534 1 264
122 LI:170604.1:2000SEP08 6319666H1 1 269
122 LI:170604.1:2000SEP08 4817706F6 6 579
122 LI:170604.1:2000SEP08 4817706H1 6 281
123 LI:205057.3:2000SEP08 g1382854 173 486
123 LI:205057.3:2000SEP08 g4196450 289 485
123 LI:205057.3:2000SEP08 g3134110 98 485
123 LI:205057.3:2000SEP08 g3161929 183 485
123 LI:205057.3:2000SEP08 g5855412 138 481
123 LI:205057.3:2000SEP08 g3299207 66 480
123 LI:205057.3:2000SEP08 g1269670 162 480
123 LI:205057.3:2000SEP08 4367930H1 269 478
123 LI:205057.3:2000SEP08 g3744884 171 477
123 LI:205057.3:2000SEP08 g5445282 156 475
123 LI:205057.3:2000SEP08 7078181H1 119 466
123 LI:205057.3:2000SEP08 g2787551 101 453
123 LI:205057.3:2000SEP08 5445943H1 142 403
123 LI:205057.3:2000SEP08 6882994H1 102 653
123 LI:205057.3:2000SEP08 71500229V1 38 633
123 LI:205057.3:2000SEP08 71500709V1 15 599
123 LI:205057.3:2000SEP08 71497289V1 50 628
123 LI:205057.3:2000SEP08 71496610V1 61 626
123 LI:205057.3:2000SEP08 2890214F6 140 626
123 LI:205057.3:2000SEP08 71496889V1 121 626
123 LI:205057.3:2000SEP08 71497655V1 152 626
123 LI:205057.3:2000SEP08 71495949V1 1 626
123 LI:205057.3:2000SEP08 71685242V1 485 626
123 LI:205057.3:2000SEP08 g3230634 91 491
123 LI:205057.3:2000SEP08 4974269H1 269 486
124 LI:233795.1:2000SEP08 3082634H1 1 285
125 LI:311197.1:2000SEP08 6803514H1 503 984
125 LI:311197.1:2000SEP08 6803514J1 242 823
[Table 3 entries on this page are illegible in the source scan.]
TABLE 3
SEQ ID NO: Template ID Component ID Start Stop
129 LI:039258.5:2000SEP08 7374628H1 1 497
129 LI:039258.5:2000SEP08 7260053H1 97 570
130 LI:1071842.1:2000SEP08 3295254H1 1 172
130 LI:1071842.1:2000SEP08 6355170H1 1 313
130 LI:1071842.1:2000SEP08 6352664H1 1 237
130 LI:1071842.1:2000SEP08 4969508H1 5 150
130 LI:1071842.1:2000SEP08 8037001H1 17 463
130 LI:1071842.1:2000SEP08 6883378H1 23 424
130 LI:1071842.1:2000SEP08 g2046377 47 436
130 LI:1071842.1:2000SEP08 6856114H1 254 726
130 LI:1071842.1:2000SEP08 6856147H1 254 786
130 LI:1071842.1:2000SEP08 g2816492 351 798
130 LI:1071842.1:2000SEP08 5805030H1 380 646
130 LI:1071842.1:2000SEP08 4910380H1 394 709
131 LI:481356.3:2000SEP08 5675468H1 127 392
131 LI:481356.3:2000SEP08 1646823H1 208 409
131 LI:481356.3:2000SEP08 1699236H1 280 491
131 LI:481356.3:2000SEP08 3633532T9 462 944
131 LI:481356.3:2000SEP08 6534988H1 27 414
131 LI:481356.3:2000SEP08 7271920H1 1 568
132 LI:103474.1:2000SEP08 6243379F8 1 524
132 LI:103474.1:2000SEP08 6243379H1 1 583
132 LI:103474.1:2000SEP08 6243379T8 1 481
132 LI:103474.1:2000SEP08 6247985T8 1 481
132 LI:103474.1:2000SEP08 6246762T8 1 467
132 LI:103474.1:2000SEP08 6247985H1 1 558
132 LI:103474.1:2000SEP08 6246762H1 1 583
132 LI:103474.1:2000SEP08 6246762F8 1 526
132 LI:103474.1:2000SEP08 6243872H1 1 598
132 LI:103474.1:2000SEP08 6247985F8 9 520
132 LI:103474.1:2000SEP08 2895966H1 266 560
133 LI:1073020.10:2000SEP08 7925042H1 1 638
133 LI:1073020.10:2000SEP08 7432730H1 505 950
133 LI:1073020.10:2000SEP08 70043051D1 912 1415
133 LI:1073020.10:2000SEP08 4599916H1 935 1206
134 LI:000874.1:2000SEP08 676441H1 1983 2257
134 LI:000874.1:2000SEP08 7444379T1 1567 2180
134 LI:000874.1:2000SEP08 6445251T8 1627 2177
134 LI:000874.1:2000SEP08 7444430T1 1648 2182
134 LI:000874.1:2000SEP08 3468564T6 1858 2235
134 LI:000874.1:2000SEP08 3468564F6 1865 2219
134 LI:000874.1:2000SEP08 3468564H1 1865 2138
134 LI:000874.1:2000SEP08 4178064T6 1870 2250
134 LI:000874.1:2000SEP08 g2836242 1892 2254
134 LI:000874.1:2000SEP08 g4147256 1894 2283
134 LI:000874.1:2000SEP08 g3678872 1895 2286
134 LI:000874.1:2000SEP08 4515055H1 1904 2153
134 LI:000874.1:2000SEP08 672386H1 1983 2242
134 LI:000874.1:2000SEP08 676441T6 1983 2242
134 LI:000874.1:2000SEP08 6445251H1 339 460
134 LI:000874.1:2000SEP08 6445251F8 339 900
134 LI:000874.1:2000SEP08 71215968V1 520 667
134 LI:000874.1:2000SEP08 4178064H1 520 770
134 LI:000874.1:2000SEP08 4178064F6 520 1026
134 LI:000874.1:2000SEP08 6000160H1 968 1466
134 LI:000874.1:2000SEP08 6308558H1 967 1528
134 LI:000874.1:2000SEP08 6558026H1 1399 1961
134 LI:000874.1:2000SEP08 4311435H1 1417 1572
134 LI:000874.1:2000SEP08 6560545H1 1429 1940
134 LI:000874.1:2000SEP08 7073540H1 1 329
134 LI:000874.1:2000SEP08 7287283H1 82 396
134 LI:000874.1:2000SEP08 4711204H1 92 226
134 LI:000874.1:2000SEP08 6444020H1 95 611
134 LI:000874.1:2000SEP08 676441R6 1983 2283
134 LI:000874.1:2000SEP08 g6043865 2105 2283
134 LI:000874.1:2000SEP08 g5663173 2146 2275
135 LI:037298.2:2000SEP08 g4310397 2252 2724
135 LI:037298.2:2000SEP08 g4740199 2272 2727
135 LI:037298.2:2000SEP08 g3778159 2286 2735
135 LI:037298.2:2000SEP08 g4991293 2298 2741
135 LI:037298.2:2000SEP08 g1289893 2301 2733
135 LI:037298.2:2000SEP08 g3015750 2350 2731
135 LI:037298.2:2000SEP08 g3181863 2351 2729
135 LI:037298.2:2000SEP08 g3755500 2356 2727
135 LI:037298.2:2000SEP08 5656848H1 2359 2609
135 LI:037298.2:2000SEP08 g7237178 2367 2721
135 LI:037298.2:2000SEP08 g5636414 2385 2721
135 LI:037298.2:2000SEP08 g819400 2388 2739
135 LI:037298.2:2000SEP08 g4310951 1 452
135 LI:037298.2:2000SEP08 g4684224 1 480
135 LI:037298.2:2000SEP08 g4175610 1 485
135 LI:037298.2:2000SEP08 g6700878 2 277
135 LI:037298.2:2000SEP08 g6699894 2 439
135 LI:037298.2:2000SEP08 g6701663 2 529
135 LI:037298.2:2000SEP08 g5362582 10 485
135 LI:037298.2:2000SEP08 g4175608 19 485
135 LI:037298.2:2000SEP08 g5632072 32 487
135 LI:037298.2:2000SEP08 g3095847 147 231
135 LI:037298.2:2000SEP08 7177795H1 189 736
135 LI:037298.2:2000SEP08 70850933V1 386 869
135 LI:037298.2:2000SEP08 70849483V1 386 946
135 LI:037298.2:2000SEP08 70853177V1 386 798
135 LI:037298.2:2000SEP08 4293646F6 386 787
135 LI:037298.2:2000SEP08 70853110V1 386 636
135 LI:037298.2:2000SEP08 4293646H1 386 631
135 LI:037298.2:2000SEP08 70851283V1 502 855
135 LI:037298.2:2000SEP08 g3250513 507 729
135 LI:037298.2:2000SEP08 70854055V1 562 923
135 LI:037298.2:2000SEP08 70853228V1 615 1182
135 LI:037298.2:2000SEP08 70852672V1 642 1270
135 LI:037298.2:2000SEP08 70853222V1 719 1053
135 LI:037298.2:2000SEP08 70849522V1 776 1282
135 LI:037298.2:2000SEP08 70803032V1 840 1048
135 LI:037298.2:2000SEP08 70851817V1 854 1441
135 LI:037298.2:2000SEP08 70853296V1 854 1426
135 LI:037298.2:2000SEP08 70852005V1 914 1532
135 LI:037298.2:2000SEP08 70852305V1 953 1588
135 LI:037298.2:2000SEP08 70853794V1 974 1492
135 LI:037298.2:2000SEP08 6342011H1 999 1579
135 LI:037298.2:2000SEP08 g676942 1016 1270
135 LI:037298.2:2000SEP08 g574848 1016 1270
135 LI:037298.2:2000SEP08 g870484 1017 1330
135 LI:037298.2:2000SEP08 g824450 1017 1295
135 LI:037298.2:2000SEP08 g774466 1018 1327
135 LI:037298.2:2000SEP08 70848993V1 1023 1528
135 LI:037298.2:2000SEP08 70853573V1 1031 1559
135 LI:037298.2:2000SEP08 g774465 1045 1370
135 LI:037298.2:2000SEP08 70851133V1 1049 1727
135 LI:037298.2:2000SEP08 70854034V1 1109 1484
135 LI:037298.2:2000SEP08 g3958586 1134 1556
135 LI:037298.2:2000SEP08 70849240V1 1134 1774
135 LI:037298.2:2000SEP08 70850404V1 1152 1437
135 LI:037298.2:2000SEP08 70851556V1 1256 1728
135 LI:037298.2:2000SEP08 70850987V1 1291 1739
135 LI:037298.2:2000SEP08 70853756V1 1343 1915
135 LI:037298.2:2000SEP08 70849160V1 1385 1894
135 LI:037298.2:2000SEP08 70853632V1 1398 1913
135 LI:037298.2:2000SEP08 70854612V1 1448 1913
135 LI:037298.2:2000SEP08 4293646R6 1483 1913
135 LI:037298.2:2000SEP08 3973584F8 1881 2385
135 LI:037298.2:2000SEP08 3973584H1 1881 2151
135 LI:037298.2:2000SEP08 3973584F6 1881 2241
135 LI:037298.2:2000SEP08 g1321459 1981 2410
135 LI:037298.2:2000SEP08 4323149H1 1992 2162
135 LI:037298.2:2000SEP08 g1200548 1997 2284
135 LI:037298.2:2000SEP08 2270683R6 2019 2399
135 LI:037298.2:2000SEP08 2270683H1 2019 2269
135 LI:037298.2:2000SEP08 g1188713 2034 2178
135 LI:037298.2:2000SEP08 6129689H1 2064 2620
135 LI:037298.2:2000SEP08 6125735H1 2065 2629
135 LI:037298.2:2000SEP08 3973584T8 2088 2712
135 LI:037298.2:2000SEP08 1003739H1 2168 2404
135 LI:037298.2:2000SEP08 g870485 2391 2738
135 LI:037298.2:2000SEP08 g824261 2424 2738
135 LI:037298.2:2000SEP08 g824262 2466 2738
135 LI:037298.2:2000SEP08 g3755677 2470 2727
135 LI:037298.2:2000SEP08 g1188714 2483 2740
135 LI:037298.2:2000SEP08 g567521 2486 2721
135 LI:037298.2:2000SEP08 g671199 2510 2721
135 LI:037298.2:2000SEP08 g1148845 2615 2721
135 LI:037298.2:2000SEP08 g2842176 2624 2731
136 LI:422901.1:2000SEP08 7669538H1 1 487
136 LI:422901.1:2000SEP08 8009135H1 2 595
136 LI:422901.1:2000SEP08 8007884H2 1 564
136 LI:422901.1:2000SEP08 8011553H1 1 559
136 LI:422901.1:2000SEP08 8010625H1 1 592
136 LI:422901.1:2000SEP08 8008669H1 1 471
136 LI:422901.1:2000SEP08 8009653H1 2 489
136 LI:422901.1:2000SEP08 4586653H1 9 173
136 LI:422901.1:2000SEP08 70481183V1 10 562
136 LI:422901.1:2000SEP08 4659525H1 14 200
136 LI:422901.1:2000SEP08 70466352V1 198 686
136 LI:422901.1:2000SEP08 70471460V1 226 486
136 LI:422901.1:2000SEP08 70465222V1 223 700
136 LI:422901.1:2000SEP08 70479783V1 241 359
136 LI:422901.1:2000SEP08 70495442V1 286 390
136 LI:422901.1:2000SEP08 70464419V1 298 874
136 LI:422901.1:2000SEP08 70466486V1 314 931
136 LI:422901.1:2000SEP08 70481279V1 353 889
136 LI:422901.1:2000SEP08 70481193V1 359 921
136 LI:422901.1:2000SEP08 70467356V1 361 898
136 LI:422901.1:2000SEP08 70472724V1 399 747
136 LI:422901.1:2000SEP08 70469921V1 418 846
136 LI:422901.1:2000SEP08 70496406V1 461 600
136 LI:422901.1:2000SEP08 70468552V1 843 1268
136 LI:422901.1:2000SEP08 70469892V1 887 1128
136 LI:422901.1:2000SEP08 70469095V1 887 1126
136 LI:422901.1:2000SEP08 70465027V1 905 1183
136 LI:422901.1:2000SEP08 70467528V1 922 1530
136 LI:422901.1:2000SEP08 70466219V1 462 659
136 LI:422901.1:2000SEP08 70481675V1 469 1071
136 LI:422901.1:2000SEP08 70480075V1 479 632
136 LI:422901.1:2000SEP08 70466776V1 508 1009
136 LI:422901.1:2000SEP08 70481912V1 507 1103
136 LI:422901.1:2000SEP08 70465242V1 509 1091
136 LI:422901.1:2000SEP08 70468287V1 547 1127
136 LI:422901.1:2000SEP08 70467133V1 559 1182
136 LI:422901.1:2000SEP08 70479655V1 559 1116
136 LI:422901.1:2000SEP08 70466676V1 559 1114
136 LI:422901.1:2000SEP08 70479011V1 569 797
136 LI:422901.1:2000SEP08 70467578V1 584 1141
136 LI:422901.1:2000SEP08 70465558V1 583 798
136 LI:422901.1:2000SEP08 70481928V1 591 1121
136 LI:422901.1:2000SEP08 70467547V1 604 1003
136 LI:422901.1:2000SEP08 70465569V1 611 1077
136 LI:422901.1:2000SEP08 70467459V1 638 1003
136 LI:422901.1:2000SEP08 70466607V1 631 891
136 LI:422901.1:2000SEP08 70469051V1 621 1183
136 LI:422901.1:2000SEP08 5808423T8 648 1127
136 LI:422901.1:2000SEP08 70479176V1 660 1111
136 LI:422901.1:2000SEP08 70464261V1 694 1129
136 LI:422901.1:2000SEP08 70480212V1 692 861
136 LI:422901.1:2000SEP08 70481058V1 692 862
136 LI:422901.1:2000SEP08 70480415V1 691 859
136 LI:422901.1:2000SEP08 70479889V1 706 912
136 LI:422901.1:2000SEP08 70464582V1 708 967
136 LI:422901.1:2000SEP08 70466381V1 737 1254
136 LI:422901.1:2000SEP08 70464506V1 780 1299
136 LI:422901.1:2000SEP08 70466025V1 777 1317
136 LI:422901.1:2000SEP08 70467618V1 791 1399
136 LI:422901.1:2000SEP08 4659870H1 799 1051
136 LI:422901.1:2000SEP08 70477354V1 807 1308
136 LI:422901.1:2000SEP08 70479682V1 816 1005
136 LI:422901.1:2000SEP08 70468490V1 846 1157
136 LI:422901.1:2000SEP08 70465557V1 922 1357
136 LI:422901.1:2000SEP08 70481594V1 939 1104
136 LI:422901.1:2000SEP08 70478404V1 944 1104
136 LI:422901.1:2000SEP08 70479902V1 979 1275
136 LI:422901.1:2000SEP08 70465042V1 987 1576
136 LI:422901.1:2000SEP08 70466835V1 998 1336
136 LI:422901.1:2000SEP08 70467624V1 1052 1571
136 LI:422901.1:2000SEP08 70477595V1 1075 1587
136 LI:422901.1:2000SEP08 70468737V1 1080 1586
136 LI:422901.1:2000SEP08 70481023V1 1089 1186
136 LI:422901.1:2000SEP08 4532750H1 1297 1487
137 LI:345815.1:2000SEP08 1428801F6 473 911
137 LI:345815.1:2000SEP08 2768967H1 772 1020
137 LI:345815.1:2000SEP08 7586434H1 859 1345
137 LI:345815.1:2000SEP08 1428801T6 963 1574
137 LI:345815.1:2000SEP08 2174927H1 1272 1516
137 LI:345815.1:2000SEP08 55013966J1 1396 1867
137 LI:345815.1:2000SEP08 g2329596 1396 1762
137 LI:345815.1:2000SEP08 55013966H1 1507 2097
137 LI:345815.1:2000SEP08 7974188H1 1523 2001
137 LI:345815.1:2000SEP08 901446R6 1622 2104
137 LI:345815.1:2000SEP08 901446R1 1622 2163
137 LI:345815.1:2000SEP08 901446H1 1622 1911
137 LI:345815.1:2000SEP08 g1164340 1653 2030
137 LI:345815.1:2000SEP08 1719224H1 1806 2015
137 LI:345815.1:2000SEP08 808303H1 1865 2104
[Table 3 entries on this page are illegible in the source scan.]
144 LI:2030279.1:2000SEP08 g6989913 1648 1887
144 LI:2030279.1:2000SEP08 g3752364 1454 1886
145 LI:1018424.3:2000SEP08 71223752V1 1083 1652
145 LI:1018424.3:2000SEP08 5781891F6 685 1177
145 LI:1018424.3:2000SEP08 5781891H1 685 934
145 LI:1018424.3:2000SEP08 71222903V1 685 1288
145 LI:1018424.3:2000SEP08 3719168H1 933 1218
145 LI:1018424.3:2000SEP08 3719168F6 933 1403
145 LI:1018424.3:2000SEP08 71223803V1 1083 1500
145 LI:1018424.3:2000SEP08 71223901V1 676 1359
145 LI:1018424.3:2000SEP08 7694804J1 276 646
145 LI:1018424.3:2000SEP08 7639804H2 391 989
145 LI:1018424.3:2000SEP08 7711179H1 474 935
145 LI:1018424.3:2000SEP08 5946909H1 480 802
145 LI:1018424.3:2000SEP08 6857771H1 537 1017
145 LI:1018424.3:2000SEP08 7383248H1 64 468
145 LI:1018424.3:2000SEP08 7639804J2 69 684
145 LI:1018424.3:2000SEP08 7226703H1 156 662
145 LI:1018424.3:2000SEP08 7222639H1 1 482
145 LI:1018424.3:2000SEP08 70842868V1 1118 1795
146 LI:130969.1:2000SEP08 4344164H1 1895 1989
146 LI:130969.1:2000SEP08 6470643H1 1967 2601
146 LI:130969.1:2000SEP08 7720888J1 2002 2357
146 LI:130969.1:2000SEP08 7579533H1 2179 2762
146 LI:130969.1:2000SEP08 4874203H1 2227 2497
146 LI:130969.1:2000SEP08 1506715H1 2251 2520
146 LI:130969.1:2000SEP08 5334709F8 2355 2918
146 LI:130969.1:2000SEP08 5334709F6 2355 2643
146 LI:130969.1:2000SEP08 524905H1 467 716
146 LI:130969.1:2000SEP08 1871804H1 196 450
146 LI:130969.1:2000SEP08 3661732F8 334 916
146 LI:130969.1:2000SEP08 7725180J1 303 866
146 LI:130969.1:2000SEP08 526216H1 466 714
146 LI:130969.1:2000SEP08 3661732T8 1 487
146 LI:130969.1:2000SEP08 g5544854 153 315
146 LI:130969.1:2000SEP08 7725592H1 720 1367
146 LI:130969.1:2000SEP08 7318481H1 683 1244
146 LI:130969.1:2000SEP08 6357121H1 659 800
146 LI:130969.1:2000SEP08 7660892J1 625 1053
146 LI:130969.1:2000SEP08 3661732H1 633 916
146 LI:130969.1:2000SEP08 1657839H1 583 694
146 LI:130969.1:2000SEP08 7000424F8 620 1198
146 LI:130969.1:2000SEP08 7000424R8 620 1198
146 LI:130969.1:2000SEP08 3788096H1 921 1040
146 LI:130969.1:2000SEP08 7720888H1 1092 1679
146 LI:130969.1:2000SEP08 7000424H1 1137 1198
146 LI:130969.1:2000SEP08 8034483J1 1261 1800
146 LI:130969.1:2000SEP08 7718653J1 1555 2210
[Table 3 entries on this page are illegible in the source scan.]
147 LI:286246.2:2000SEP08 70560120V1 1459 2096
147 LI:286246.2:2000SEP08 6274047H1 1495 2090
147 LI:286246.2:2000SEP08 71243756V1 1484 2083
147 LI:286246.2:2000SEP08 70561769V1 1533 2141
147 LI:286246.2:2000SEP08 71242738V1 1454 2082
147 LI:286246.2:2000SEP08 71046929V1 1546 2082
147 LI:286246.2:2000SEP08 71243123V1 1443 2078
147 LI:286246.2:2000SEP08 71046075V1 1371 2078
147 LI:286246.2:2000SEP08 71054743V1 1868 2071
147 LI:286246.2:2000SEP08 71540043V1 1860 2061
147 LI:286246.2:2000SEP08 71540268V1 1327 2060
147 LI:286246.2:2000SEP08 70560467V1 1377 2056
147 LI:286246.2:2000SEP08 71245530V1 2527 2710
147 LI:286246.2:2000SEP08 71539404V1 2136 2589
147 LI:286246.2:2000SEP08 71539423V1 2109 2598
147 LI:286246.2:2000SEP08 5002902T9 1992 2587
147 LI:286246.2:2000SEP08 71045716V1 2154 2520
147 LI:286246.2:2000SEP08 71540558V1 860 1321
147 LI:286246.2:2000SEP08 71047312V1 2085 2710
147 LI:286246.2:2000SEP08 6206414H1 2071 2710
147 LI:286246.2:2000SEP08 71553195V1 1028 1389
147 LI:286246.2:2000SEP08 71538629V1 1373 1761
147 LI:286246.2:2000SEP08 71542943V1 1319 1759
147 LI:286246.2:2000SEP08 71540253V1 792 1532
147 LI:286246.2:2000SEP08 71045637V1 1129 1490
147 LI:286246.2:2000SEP08 6271029H2 965 1523
147 LI:286246.2:2000SEP08 70555132V1 922 1520
147 LI:286246.2:2000SEP08 71542566V1 937 1475
147 LI:286246.2:2000SEP08 71242763V1 744 1465
147 LI:286246.2:2000SEP08 71244412V1 1129 1454
147 LI:286246.2:2000SEP08 71544889V1 978 1451
147 LI:286246.2:2000SEP08 71538305V1 1111 1452
147 LI:286246.2:2000SEP08 71540285V1 1147 1592
147 LI:286246.2:2000SEP08 71540983V1 1083 1593
147 LI:286246.2:2000SEP08 71542737V1 1036 1565
147 LI:286246.2:2000SEP08 6743680H1 1016 1567
147 LI:286246.2:2000SEP08 70559871V1 1258 1644
147 LI:286246.2:2000SEP08 7123394F8 219 510
147 LI:286246.2:2000SEP08 7585860H2 1 633
147 LI:286246.2:2000SEP08 71244615V1 1676 1872
147 LI:286246.2:2000SEP08 71540204V1 1382 1632
147 LI:286246.2:2000SEP08 4122893H1 1970 2202
147 LI:286246.2:2000SEP08 71046145V1 1579 2193
147 LI:286246.2:2000SEP08 71052465V1 1753 1913
147 LI:286246.2:2000SEP08 71541476V1 1379 1920
147 LI:286246.2:2000SEP08 71040490V1 1135 1220
147 LI:286246.2:2000SEP08 5680372F6 744 1136
147 LI:286246.2:2000SEP08 71545686V1 1545 1883
147 LI:286246.2:2000SEP08 71244339V1 1203 1881
147 U:286246.2:2000SEP08 1941561To 2455 2671
147 U:286246.2:2000SEP08 70843645V1 1609 2162
147 LI:286246.2:2000SEP08 71047111V1 1484 2145
147 LI:286246.2:2000SEP08 71244003V1 1235 1796
147 U:286246.2:2000SEP08 g672529 1426 1778
147 LI:286246.2:2000SEP08 70559901VI 1225 1773
147 LI:286246.2:2000SEP08 71044921VI 1336 1761
147 LI:286246.2:2000SEP08 71047834V1 2048 2689
147 LI:286246.2:2000SEP08 71045886V1 1259 1848
147 LI:286246.2:2000SEP08 71048727V1 1512 1832
147 LI:28624ό.2:2000SEP08 71543038V1 1341 1839
147 LI:286246.2:2000SEP08 71048663V1 1283 1821
147 LI:286246.2:2000SEP08 70560339V1 1148 1819
147 U:286246.2:2000SEP08 70561925V1 1988 2388
147 LI:286246.2:2000SEP08 71047478V1 1343 1864
147 U:286246.2:2000SEP08 71539466V1 1138 1866
147 LI:286246.2:2000SEP08 71545826V1 1471 1687
147 LI:286246.2:2000SEP08 71540331VI 931 1667
147 LI:286246.2:2000SEP08 71044926V1 1191 1666
147 LI:286246.2:2000SEP08 71540192V1 1155 1657
147 LI:286246.2:2000SEP08 71242808V1 2120 2727
147 U:286246.2:2000SEP08 71047724V1 2122 2729
147 U:286246.2:2000SEP08 71047489V1 2234 2712
147 LI:286246.2:2000SEP08 71047868V1 2067 2727
147 U:286246.2:2000SEP08 70449628V1 782 1120
147 * U:286246.2:2000SEP08 5002902T8 1961 2586
147 LI:286246.2:2000SEP08 g2410095 2232 2356
147 LI:286246.2:2000SEP08 70555842V1 1640 2314
147 LI:286246.2:2000SEP08 71243779V1 1729 2294
147 LI:286246.2:2000SEP08 71047027V1 1654 2270
147 LI:286246,2:2000SEP08 71048396V1 1229 1882
147 LI:286246.2:2000SEP08 71054331VI 1388 1889
147 LI:286246.2:2000SEP08 71542747V1 807 1290
147 LI:286246.22000SEP08 71542294V1 564 1304
147 U:286246.2:2000SEP08 71542526V1 535 1289
147 LI:286246.2:2000SEP08 71538262V1 973 1289
147 LI:286246.2:2000SEP08 g2880816 2281 2711
147 LI:286246.2:2000SEP08 71244177V1 2001 2710
147 LI:286246.2:2000SEP08 71243630V1 2085 2710
147 LI:286246.2:2000SEP08 7192911H2 2144 2715
147 LI:286246.2:2000SEP08 5957753H1 927 1541
147 LI:286246.2:2000SEP08 71539504V1 1039 1564
147 LI:286246.2:2000SEP08 71045343V1 1326 1723
147 LI:286246.2:2000SEP08 71044936V1 1104 1722
147 LI:286246.2:2000SEP08 71537853V1 1830 2259
147 LI:286246.2:2000SEP08 70560372V1 1957 2257
147 LI:286246.2:2000SEP08 71042212V1 1897 2252
149 LI:395063.1:2000SEP08 5732526F8 1047 1701
149 LI:395063.1:2000SEP08 70674516V1 1460 2065
149 LI:395063.1:2000SEP08 70679043V1 1630 2200
149 LI:395063.1:2000SEP08 7401964H1 1662 2218
149 LI:395063.1:2000SEP08 70674896V1 1687 2089
149 LI:395063.1:2000SEP08 70671138V1 1742 2263
149 LI:395063.1:2000SEP08 4445453H1 599 868
149 LI:395063.1:2000SEP08 3486280H1 1573 1669
149 LI:395063.1:2000SEP08 70672954V1 1587 1842
149 LI:395063.1:2000SEP08 3859309H1 318 405
149 LI:395063.1:2000SEP08 3859388H1 379 454
149 LI:395063.1:2000SEP08 70671032V1 1283 1608
149 LI:395063.1:2000SEP08 70678302V1 1909 2233
149 LI:395063.1:2000SEP08 2674222H1 1896 2131
149 LI:395063.1:2000SEP08 3696546F6 810 1323
149 LI:395063.1:2000SEP08 3696546H1 812 1096
149 LI:395063.1:2000SEP08 147070R6 384 585
149 LI:395063.1:2000SEP08 g5361477 373 759
149 LI:395063.1:2000SEP08 147070H1 354 534
149 LI:395063.1:2000SEP08 7700096H1 254 855
149 LI:395063.1:2000SEP08 030545H1 141 365
149 LI:395063.1:2000SEP08 4180711H1 54 201
149 LI:395063.1:2000SEP08 6828352J1 67 594
149 LI:395063.1:2000SEP08 7610103H1 75 603
149 LI:395063.1:2000SEP08 6788461H1 40 474
149 LI:395063.1:2000SEP08 7086817H1 402 950
149 LI:395063.1:2000SEP08 g3596902 494 933
149 LI:395063.1:2000SEP08 g6702152 1545 1738
149 LI:395063.1:2000SEP08 70675746V1 1569 2106
149 LI:395063.1:2000SEP08 70676091V1 1569 2170
149 LI:395063.1:2000SEP08 70674954V1 1545 2135
149 LI:395063.1:2000SEP08 6828352H1 619 1165
149 LI:395063.1:2000SEP08 4445453F8 624 1198
149 LI:395063.1:2000SEP08 3696546T6 1926 2487
149 LI:395063.1:2000SEP08 70677175V1 2017 2405
149 LI:395063.1:2000SEP08 g4898324 445 712
149 LI:395063.1:2000SEP08 g3424526 540 768
149 LI:395063.1:2000SEP08 g867186 558 830
149 LI:395063.1:2000SEP08 3699565H1 1 285
149 LI:395063.1:2000SEP08 6785022H1 36 340
149 LI:395063.1:2000SEP08 031180H1 206 365
149 LI:395063.1:2000SEP08 5862502H1 273 559
149 LI:395063.1:2000SEP08 g3181305 2275 2520
149 LI:395063.1:2000SEP08 4668004H1 2453 2552
149 LI:395063.1:2000SEP08 2866136H1 2024 2272
149 LI:395063.1:2000SEP08 g5054669 2151 2520
149 LI:395063.1:2000SEP08 3535582F6 2663 3209
149 LI:395063.1:2000SEP08 3535582H1 2663 2938
156 LI:008942.1:2000SEP08 7997486H1 260 901
156 LI:008942.1:2000SEP08 6376692H1 1237 1428
156 LI:008942.1:2000SEP08 70332069D1 1169 1428
156 LI:008942.1:2000SEP08 70330927D1 1283 1776
156 LI:008942.1:2000SEP08 70309464D1 1233 1434
156 LI:008942.1:2000SEP08 70311281D1 1283 1798
156 LI:008942.1:2000SEP08 70331388D1 1157 1428
156 LI:008942.1:2000SEP08 70728082V1 1160 1425
156 LI:008942.1:2000SEP08 70310348D1 1358 1803
156 LI:008942.1:2000SEP08 7966746H1 1531 2090
156 LI:008942.1:2000SEP08 2369660H1 1175 1414
156 LI:008942.1:2000SEP08 70330650D1 1231 1426
156 LI:008942.1:2000SEP08 70727843V1 1175 1427
156 LI:008942.1:2000SEP08 2372577H1 1175 1423
156 LI:008942.1:2000SEP08 70729702V1 1175 1414
156 LI:008942.1:2000SEP08 70731634V1 1175 1427
156 LI:008942.1:2000SEP08 70801702V1 1175 1424
156 LI:008942.1:2000SEP08 70729964V1 1175 1427
156 LI:008942.1:2000SEP08 2369660F6 1175 1428
156 LI:008942.1:2000SEP08 2372577F6 1175 1428
156 LI:008942.1:2000SEP08 7700714H1 1820 2410
156 LI:008942.1:2000SEP08 70311066D1 1158 1427
156 LI:008942.1:2000SEP08 70730649V1 1168 1428
156 LI:008942.1:2000SEP08 70310381D1 1169 1349
156 LI:008942.1:2000SEP08 70310610D1 1251 1427
156 LI:008942.1:2000SEP08 2827340H2 68 362
156 LI:008942.1:2000SEP08 6918520H1 696 1194
156 LI:008942.1:2000SEP08 1423821T6 940 1409
156 LI:008942.1:2000SEP08 7700714J1 1108 1637
156 LI:008942.1:2000SEP08 7987293H1 1 572
157 LI:732479.1:2000SEP08 6987581H1 1420 1724
157 LI:732479.1:2000SEP08 7006122H1 1551 1748
157 LI:732479.1:2000SEP08 1314671T6 1 360
157 LI:732479.1:2000SEP08 1314671F6 8 398
157 LI:732479.1:2000SEP08 1314671H1 8 235
157 LI:732479.1:2000SEP08 g1966425 56 531
157 LI:732479.1:2000SEP08 g6993126 125 538
157 LI:732479.1:2000SEP08 g2063478 170 406
157 LI:732479.1:2000SEP08 5453575H1 276 518
157 LI:732479.1:2000SEP08 7015239H1 397 976
157 LI:732479.1:2000SEP08 6167048F8 798 1309
157 LI:732479.1:2000SEP08 3727038H1 530 818
157 LI:732479.1:2000SEP08 8006617H1 498 1153
157 LI:732479.1:2000SEP08 3727038F9 530 1069
157 LI:732479.1:2000SEP08 6167048H1 980 1310
157 LI:732479.1:2000SEP08 7077449H1 1214 1706
157 LI:732479.1:2000SEP08 1434378H1 1291 1439
157 LI:732479.1:2000SEP08 6940364H1 869 1396
157 LI:732479.1:2000SEP08 8112243H1 1 314
157 LI:732479.1:2000SEP08 7094636H1 789 1287
158 LI:1190250.1:2000SEP08 6779840H1 9 551
158 LI:1190250.1:2000SEP08 3386816H1 1 185
158 LI:1190250.1:2000SEP08 3888124H1 4 270
158 LI:1190250.1:2000SEP08 70817678V1 294 831
158 LI:1190250.1:2000SEP08 1951349T6 298 805
158 LI:1190250.1:2000SEP08 6947424H1 304 817
158 LI:1190250.1:2000SEP08 g5054158 306 793
158 LI:1190250.1:2000SEP08 1658838H1 309 541
158 LI:1190250.1:2000SEP08 70819393V1 180 693
158 LI:1190250.1:2000SEP08 3002522F6 180 469
158 LI:1190250.1:2000SEP08 71218340V1 181 693
158 LI:1190250.1:2000SEP08 3002522H1 181 492
158 LI:1190250.1:2000SEP08 71218644V1 202 417
158 LI:1190250.1:2000SEP08 7203618R8 247 938
158 LI:1190250.1:2000SEP08 g895347 1 84
158 LI:1190250.1:2000SEP08 4005442F6 1 328
158 LI:1190250.1:2000SEP08 3386816F6 1 625
158 LI:1190250.1:2000SEP08 4005442H1 2 287
158 LI:1190250.1:2000SEP08 7946358H1 14 762
158 LI:1190250.1:2000SEP08 6250084H1 67 596
158 LI:1190250.1:2000SEP08 2809974H1 67 326
158 LI:1190250.1:2000SEP08 7607592H1 68 666
158 LI:1190250.1:2000SEP08 3159561H1 58 222
158 LI:1190250.1:2000SEP08 412478H1 375 603
158 LI:1190250.1:2000SEP08 2862291H1 377 656
158 LI:1190250.1:2000SEP08 g3934153 378 847
158 LI:1190250.1:2000SEP08 g3900180 406 846
158 LI:1190250.1:2000SEP08 391612F1 434 1065
158 LI:1190250.1:2000SEP08 8040158H1 442 1055
158 LI:1190250.1:2000SEP08 509350H1 107 309
158 LI:1190250.1:2000SEP08 7946358J2 99 722
158 LI:1190250.1:2000SEP08 70821132V1 180 735
158 LI:1190250.1:2000SEP08 70822288V1 180 737
158 LI:1190250.1:2000SEP08 70818910V1 180 735
158 LI:1190250.1:2000SEP08 2929935H1 184 470
158 LI:1190250.1:2000SEP08 2929935F6 184 426
158 LI:1190250.1:2000SEP08 70832705V1 719 828
158 LI:1190250.1:2000SEP08 412478T6 841 1021
158 LI:1190250.1:2000SEP08 6779840J1 900 1328
158 LI:1190250.1:2000SEP08 g2631213 993 1063
158 LI:1190250.1:2000SEP08 5501816F6 1067 1306
158 LI:1190250.1:2000SEP08 5501816H1 1122 1225
158 LI:1190250.1:2000SEP08 55024037H1 471 1049
158 LI:1190250.1:2000SEP08 4005442T6 489 1016
158 LI:1190250.1:2000SEP08 70818940V1 508 1082
158 LI:1190250.1:2000SEP08 55024037J1 552 1217
158 LI:1190250.1:2000SEP08 g5339250 562 1047
158 LI:1190250.1:2000SEP08 059074H1 604 769
158 LI:1190250.1:2000SEP08 70821167V1 631 1080
158 LI:1190250.1:2000SEP08 70822737V1 647 955
158 LI:1190250.1:2000SEP08 8040158J1 693 1219
158 LI:1190250.1:2000SEP08 3672749H1 81 366
158 LI:1190250.1:2000SEP08 653157H1 82 312
158 LI:1190250.1:2000SEP08 g3000968 312 608
158 LI:1190250.1:2000SEP08 3386816T6 339 821
158 LI:1190250.1:2000SEP08 391612R1 355 904
158 LI:1190250.1:2000SEP08 412478R6 375 703
159 LI:1013717.1:2000SEP08 6798057F8 1 566
159 LI:1013717.1:2000SEP08 6798057H1 1 529
159 LI:1013717.1:2000SEP08 6798057T8 490 1066
160 LI:2049125.2:2000SEP08 5291818H1 1 259
160 LI:2049125.2:2000SEP08 70803952V1 1 661
160 LI:2049125.2:2000SEP08 5542143H1 81 171
160 LI:2049125.2:2000SEP08 70801203V1 73 267
160 LI:2049125.2:2000SEP08 6969152U1 45 453
161 LI:1092360.1:2000SEP08 6752529J1 258 791
161 LI:1092360.1:2000SEP08 6752529R8 213 791
161 LI:1092360.1:2000SEP08 6881093F8 1 605
161 LI:1092360.1:2000SEP08 g3597108 494 810
161 LI:1092360.1:2000SEP08 g5364506 349 799
161 LI:1092360.1:2000SEP08 g5590217 452 908
161 LI:1092360.1:2000SEP08 6881093H1 125 605
162 LI:791524.1:2000SEP08 2998325H1 148 431
162 LI:791524.1:2000SEP08 8050221H1 68 671
162 LI:791524.1:2000SEP08 2998325T6 141 606
162 LI:791524.1:2000SEP08 7752481H1 1 576
162 LI:791524.1:2000SEP08 2998325F6 148 570
162 LI:791524.1:2000SEP08 g5664125 340 652
163 LI:1084555.3:2000SEP08 g2004742 235 515
163 LI:1084555.3:2000SEP08 6457442H1 1 511
163 LI:1084555.3:2000SEP08 6452042H1 1 566
163 LI:1084555.3:2000SEP08 4053081H1 103 373
164 LI:815418.2:2000SEP08 1939605H1 1643 1921
164 LI:815418.2:2000SEP08 6079908H1 1858 2307
164 LI:815418.2:2000SEP08 8097467H1 9 637
164 LI:815418.2:2000SEP08 7019587H1 2 481
164 LI:815418.2:2000SEP08 7696123J1 5 371
164 LI:815418.2:2000SEP08 71592951V1 1443 2126
164 LI:815418.2:2000SEP08 7720648J1 594 1173
164 LI:815418.2:2000SEP08 3524549H1 1247 1373
164 LI:815418.2:2000SEP08 1404445H1 1256 1526
164 LI:815418.2:2000SEP08 7700830H1 377 932
164 LI:815418.2:2000SEP08 7646833H1 381 1075
164 LI:815418.2:2000SEP08 7704184H1 379 1038
—1
Φ
3
TJ
Ω φ" σ
4N Oi en cπ cn 00 VI to en 4N O Cπ Cπ . 4N Co CO Co C -C -vjθ NT rθ NT NT M NT NT NT IO NT NT NT NO NO NO NO vl sJ JN O sj O NT fc. ^ CO JN CO 4N 4N 4N o o CO fO — ' SJ 0 4N NT — ' NT O O O O O O OO sl O OO sJ sI sI sJ sl sj sj J Co Cn O OO 4N o Oi — ' l O j oo 4N _ NO OO O O OO C0 _ C0 O O sl O sl cn NT NT O O C0 C0 O O O O O O — ' sl Co 00 Co NO O NO O Ω v 3-
NT IO M ,- - _ _ _ _ 4 _ _ — J O o —oo ' —cn * —0 ' —4 —cn ' —Cji ' —cn ' —oi ' —C3 ' —C ' —4 ' o NT NroT —o ' —co ' MN M*J M-' n, ΛN, —0l ' —0 ' —0 ' C
O f -4 _ c . O -l N_ OJ M . O - .. C> --' . -_4 -' S- i**su N_ C — ' r IsJl Sv O C> Ui O en o 4 en _ _ - 4N O 4N O NT sl sj 00 sj r —_ ' \v orJ-> U o-- e U- l — ' U cnl o CJ U coJ I tNoJ t IVo — — " eCJnI — ' 0W -0 OU NIT CCoU 4+N4 O O**J O*CI CUoJ OU CWT3 UCjθJ ^\ Jfl,; θSI ft*S, ftn θ' ιI +4Ni sSj .O«
CJl l O -4 n O 4N 00 O C O O "^ — ' O NT 4N — ' O O O NJ O vJ vO vO — ' NT NT 4N 00 NT CO CO — ' ffl OO O OS N * w C t> u OJ C» Cll )
CO 00 CT0 CΛ Cj0 00 C0 00 CO CO 00 Cj0 C0 00 C 00 03 03 00 Cj0 ∞ o oi oi oi Oi Oi Oi Oi oi oi oi oi oi oi Oi oi Oi Oi oi Oi oi oi Oi cn en cn en cji oi Oi cji cji cjT Ci Cji cn ci c^
■fs _ _ _ 'fc» 4N 4N 'fc. _ _ _ _ _ 'fc. 'fc. _ _ _ _ _ _
C0 C0 C0 CT3 CD 00 00 CΛ 00 CT0 CT3 Cj3 <T3 <_O <_O CJ3 O3 00 00 ∞
"to "NO "NO No fo io "to "to io "NO fo io fO fo Fo fo io fo fO M fo fo FO fo Fo fo fo fό Fύ Fύ fύ fύ Fύ Fύ fύ fύ tύ fύ iύ fύ tύ fύ fύ fύ NO fύ Fύ tύ 'NO Fύ iύ fύ fύ tύ iύ fύ fύ fύ Fύ Fύ iύ fύ fύ NO fύ iύ iύ fύ iύ fύ fύ fύ iύ iύ Fύ Ω
O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O Tt O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O 1'1-' ic On On OCD cOn Ocn Oco Ocjo Oco Oco Oco Ocn Oco Oco Oco Oco Ocn Oco Oco Oco Oco Oco Oco Ocn Oco Ocn Oco Oco Ocn Oco Ocn Ocn Ocn O O O O O O O O O O O O O O O O Fi im [Ti 'τι rrι m m m m 'τι m rπ m rn rn 'τι πi 'τi ffl [τι m m ffl m m m i ι ι ιιι m πι m m rπ πι m |-σ -u -D T -α -α -D -o τj ττ τι τj -σ -α -σ τj -α -O -O T3 TJ T -α τj -σ τj -α -D -α 'τj -^
IO O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O CΛ CO CD θ3 ∞ CTO CX Cθ CO CO CO Cj3 03 00 03 00 00 00 C Cj3 00 00 00 <-0 ∞
O _ _ _ _ _ _ _ CO
NT 4N O Ol n (J1 ^ C> ft O θ α K l M S Ol W - Co NjT S 00 n O) J_, ( Oι cn o -fc. fc. 4N O O O O O Ol 4N TJ ^- vj O O C rO.i — ' i NtsT-, — ' m ∞ Ov S f ,i rO-. oO Os N^l J fc rv. fc rs^ V^*° *fc rs. r|^ Sθ sιl rCTrι VJ' i_JJ .vjθ-, _ , rvs, moo c^n CO O o •IN CO ~ ' Ol _ **-4 O O O O _ -^ _ — ' _ — ' rOτ cO-, oO - § Ω _ O — ' — ' O Co W M l N M C0 ^ N C W N w a O M CJ ^ m 0. M s) C 0, sj co lO o o cn co cD co en — ' fc.
e ' NT NO NT ,„ NT NO _ — ' — ' NT — tsT NO — ' — ' NT — ' NT NT
4-.. CJ- -rn — v P CJ-> - —Ό * S * O NT —' — _ _ _ _ _ _ _ _ _ _ _ _ Λ fc -4 M INJ O fvl _ — ' iISJ _ — c CAJ C CAo» — ' N ISo. c CJnI c CoJ - —^ * fc 4.. p ^ r. cτ3 ivo pj c uπι c unι * —r-' * θ u* c UoJ oj — _ SI Ol O O O s
M ∞ ^ CJl B M Cθ δ Ol C» C Cθ O. Oi αι C ω ϊ O M C> -' C CO _ J sJ C0 C0 SJ O C0 sJ O r*
CO sj O O Cn Co — ' O O OO SJ O O NO O O
— ' — ' ^ CO O - ' Ol w CO CO M - ' O OO — ■ ' 4N Oι O vj θ O --l C-o- -NO- O — 4 -N O — 0 —0 C - - _ NT I CO O CO NJ 00 0l O O 03 O O vJ 00 VJ jsι TJ
CΛ IH
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ *_
J^ 4oNo4No4N 4oN fco.ofc. 4oN foc.o4No4No4No4No4Nofc. fco.o4Nocf.ofc.o4No4No4NoJNoJNo4No4No4Nofc.ofc. o 4No4No4No4No4No4No4N fco.o4N 4oNo4N fco.ofc. 4oN 4oNo4No4N 4oN fcoi.π
<T3 00 03 00 00 C0 00 C0 00 Cj3 CT0 CD 00 00 00 <T0 C <T3 00 cn cn en cπ cπ cπ cπ cπ cn cn cn cn cn cn cπ cn cn cn cn
•fc. •IN 4N 4N 4N 4N JN 4N fc. 4N 4N 4N •fc. fc. 4N 4N fc. 4N
CO CO 00 00 00 CD 00 00 00 oo co CO 00 00 oo 00 00 00 00
NT NO NO NT to o o NT NT NT NO NO NO NO to NT NT NO NT
NT NT NT NT NT NT NT NT NT NT NT NT NT NT NT NT NT NT NT
O CT ΓJ e ; c >
CO ( ) < ) < >
CΛ CO CΛ CΛ (Λ CΛ CΛ CΛ CΛ CΛ CΛ CΛ C/J C/J CΛ CΛ CΛ CΛ
III πι IH m m m rii m m in m III III ITI rπ m m
TJ TJ J TJ TJ TJ TJ TJ TJ TJ TJ TJ TJ TJ TI J TI TJ TI
O CT C ) ΓT o C CT
00 O co 00 CO oo oo CD CD CO CO 03 CO CO oo CD 00 00 O
Cπ θι Oι 00 00 JN 4N 00 NT r σro, rcvπ, rOni ros ros *on oi mcD rCnπ
O Ui NO -sJ CO Cn .fc. O NO
NT -4 NT NT NT —' —' NT
O 00 O O O Ol CO cπ ft g cn o co fc. en fci. o -' ^ — • -- js> ^ _ _ n _ o o co ) _, _ *vo ( co vj NT CΛ
NT — ' Co O ft O CO
TABLE 3
SEQ ID NO: Template ID Component ID Start Stop
173 LI:346242.2:2000SEP08 8017662J1 24 630
173 LI:346242.2:2000SEP08 6780866H1 1 8 473
173 LI:346242.2:2000SEP08 7724708J1 336 473
173 LI:346242.2:2000SEP08 6764336J1 339 472
173 LI:346242.2:2000SEP08 8006864H1 451 1094
173 LI:346242.2:2000SEP08 6915577H1 615 697
173 LI:346242.2:2000SEP08 6797389H1 691 1241
173 LI:346242.2:2000SEP08 4052477H1 702 784
173 LI:346242.2:2000SEP08 3679845F6 1018 1097
173 LI:346242.2:2000SEP08 2720988F6 1265 1373
173 LI:346242.2:2000SEP08 71220924V1 1275 1503
173 LI:346242.2:2000SEP08 2720988T6 1301 1372
174 LI:2052717.1:2000SEP08 5067091T9 828 1475
174 LI:2052717.1:2000SEP08 4894635F6 1126 1550
174 LI:2052717.1:2000SEP08 4894635H1 1126 1402
174 LI:2052717.1:2000SEP08 2583002T6 1173 1451
174 LI:2052717.1:2000SEP08 g5540945 1206 1551
174 LI:2052717.1:2000SEP08 5696575H1 1275 1532
174 LI:2052717.1:2000SEP08 g273554 624 810
174 LI:2052717.1:2000SEP08 6595153J1 1490 1788
174 LI:2052717.1:2000SEP08 300256117 1603 1729
174 LI:2052717.1:2000SEP08 3002561H1 1603 1763
174 LI:2052717.1:2000SEP08 6148958H1 222 789
174 LI:2052717.1:2000SEP08 7714444H1 233 841
174 LI:2052717.1:2000SEP08 5067091H1 520 763
174 LI:2052717.1:2000SEP08 5067091F9 523 1007
174 LI:2052717.1:2000SEP08 5063731F8 543 1081
174 LI:2052717.1:2000SEP08 3002561F6 1603 1763
174 LI:2052717.1:2000SEP08 g1147397 1616 1713
174 LI:2052717.1:2000SEP08 g3593657 1697 1808
174 LI:2052717.1:2000SEP08 g7041600 1703 1808
174 LI:2052717.1:2000SEP08 2583002F6 552 1057
174 LI:2052717.1:2000SEP08 2583002H1 552 819
174 LI:2052717.1:2000SEP08 6619463H1 1 571
175 LI:406668.2:2000SEP08 5672240H1 1258 1537
175 LI:406668.2:2000SEP08 7002774H1 1601 2268
175 LI:406668.2:2000SEP08 7626312J1 1261 1948
175 LI:406668.2:2000SEP08 g1718534 1278 1350
175 LI:406668.2:2000SEP08 1390315H1 1186 1449
175 LI:406668.2:2000SEP08 3744324F6 1018 1356
175 LI:406668.2:2000SEP08 g1779107 1337 1662
175 LI:406668.2:2000SEP08 5336594H1 1334 1569
175 LI:406668.2:2000SEP08 4933068H1 1336 1623
175 LI:406668.2:2000SEP08 g6703684 1600 2038
175 LI:406668.2:2000SEP08 7024005H1 1359 1809
175 LI:406668.2:2000SEP08 1390315F6 1185 1604
175 LI:406668.2:2000SEP08 g5397998 2262 2362
175 LI:406668.2:2000SEP08 410500H1 2284 2393
175 LI:406668.2:2000SEP08 5434671H1 2167 2384
175 LI:406668.2:2000SEP08 8042801H1 1347 1638
175 LI:406668.2:2000SEP08 3336387H1 69 296
175 LI:406668.2:2000SEP08 6788039H1 195 340
175 LI:406668.2:2000SEP08 3336387F6 69 413
175 LI:406668.2:2000SEP08 7674468J1 306 806
175 LI:406668.2:2000SEP08 8042709J1 736 1233
175 LI:406668.2:2000SEP08 804280Ul 749 1313
175 LI:406668.2:2000SEP08 4126078H1 906 1084
175 LI:406668.2:2000SEP08 4212553H1 1295 1516
175 LI:406668.2:2000SEP08 g1718535 2092 2395
175 LI:406668.2:2000SEP08 3723230H1 2092 2392
175 LI:406668.2:2000SEP08 g1780142 2101 2392
175 LI:406668.2:2000SEP08 2441933H1 2118 2310
175 LI:406668.2:2000SEP08 3744324H1 1018 1251
175 LI:406668.2:2000SEP08 6454883H1 11 481
175 LI:406668.2:2000SEP08 6569407H1 34 466
175 LI:406668.2:2000SEP08 7723868J1 1 461
175 LI:406668.2:2000SEP08 7985745H2 1 606
175 LI:406668.2:2000SEP08 3473241H1 8 259
175 LI:406668.2:2000SEP08 g2347904 1895 2391
175 LI:406668.2:2000SEP08 g2269419 1907 2391
175 LI:406668.2:2000SEP08 g3149881 1297 1602
175 LI:406668.2:2000SEP08 g5660211 1936 2356
175 LI:406668.2:2000SEP08 g5053468 1908 2391
175 LI:406668.2:2000SEP08 g4988156 1912 2385
175 LI:406668.2:2000SEP08 g5440700 1928 2392
175 LI:406668.2:2000SEP08 4933068T6 2001 2340
175 LI:406668.2:2000SEP08 5022603F9 1636 2065
175 LI:406668.2:2000SEP08 6576919H1 1705 2168
175 LI:406668.2:2000SEP08 3744324T6 1725 2318
175 LI:406668.2:2000SEP08 7626312H1 1734 2340
175 LI:406668.2:2000SEP08 4614958T6 1766 1952
175 LI:406668.2:2000SEP08 6870730H1 1832 2347
175 LI:406668.2:2000SEP08 5022603T8 1845 2219
175 LI:406668.2:2000SEP08 7369624H1 1894 2387
175 LI:406668.2:2000SEP08 5022603H1 1603 1732
175 LI:406668.2:2000SEP08 4614958H1 1003 1244
175 LI:406668.2:2000SEP08 4614958F6 1003 1573
175 LI:406668.2:2000SEP08 875654H1 921 1167
175 LI:406668.2:2000SEP08 8042709H1 977 1649
176 LI:1178352.1:2000SEP08 70867102V1 1509 1808
176 LI:1178352.1:2000SEP08 71229547V1 241 896
176 LI:1178352.1:2000SEP08 70868356V1 1129 1765
176 LI:1178352.1:2000SEP08 71229578V1 236 908
176 LI:1178352.1:2000SEP08 70867476V1 1267 1808
176 LI:1178352.1:2000SEP08 71300356V1 2101 2550
176 LI:1178352.1:2000SEP08 71228758V1 1070 1548
180 LI:1093491.1:2000SEP08 6349295H2 76 388
180 LI:1093491.1:2000SEP08 4439555H1 1 280
180 LI:1093491.1:2000SEP08 4439555F8 1 606
180 LI:1093491.1:2000SEP08 70090163V1 10 645
180 LI:1093491.1:2000SEP08 71059365V1 140 777
180 LI:1093491.1:2000SEP08 1739070F6 10 275
180 LI:1093491.1:2000SEP08 7293047H1 151 422
180 LI:1093491.1:2000SEP08 71057653V1 204 870
180 LI:1093491.1:2000SEP08 70089451V1 203 695
180 LI:1093491.1:2000SEP08 70090909V1 10 509
180 LI:1093491.1:2000SEP08 71606561V1 10 661
180 LI:1093491.1:2000SEP08 1739070H1 10 247
180 LI:1093491.1:2000SEP08 70093679V1 10 436
180 LI:1093491.1:2000SEP08 70093745V1 10 485
180 LI:1093491.1:2000SEP08 70090843V1 10 434
180 LI:1093491.1:2000SEP08 70090268V1 10 464
180 LI:1093491.1:2000SEP08 70091238V1 218 710
180 LI:1093491.1:2000SEP08 71607589V1 10 663
180 LI:1093491.1:2000SEP08 70088285V1 208 682
180 LI:1093491.1:2000SEP08 g5362171 24 129
180 LI:1093491.1:2000SEP08 71058989V1 24 553
180 LI:1093491.1:2000SEP08 70092668V1 24 512
180 LI:1093491.1:2000SEP08 71060482V1 32 345
180 LI:1093491.1:2000SEP08 4161044F8 50 622
180 LI:1093491.1:2000SEP08 70091458V1 10 446
180 LI:1093491.1:2000SEP08 70089238V1 417 689
180 LI:1093491.1:2000SEP08 70092828V1 10 416
180 LI:1093491.1:2000SEP08 70093647V1 10 342
180 LI:1093491.1:2000SEP08 71060092V1 35 424
180 LI:1093491.1:2000SEP08 71060687V1 378 796
180 LI:1093491.1:2000SEP08 71059386V1 588 796
180 LI:1093491.1:2000SEP08 71060760V1 124 645
180 LI:1093491.1:2000SEP08 70092525V1 10 363
180 LI:1093491.1:2000SEP08 70090309V1 10 418
180 LI:1093491.1:2000SEP08 70088745V1 10 311
181 LI:046515.5:2000SEP08 5546289F8 78 199
181 LI:046515.5:2000SEP08 6855767T8 1 618
181 LI:046515.5:2000SEP08 5546289H1 78 256
182 LI:400171.2:2000SEP08 7309340H1 634 931
182 LI:400171.2:2000SEP08 2669002H1 684 939
182 LI:400171.2:2000SEP08 7066680H1 942 1053
182 LI:400171.2:2000SEP08 4706031F6 1 507
182 LI:400171.2:2000SEP08 4706031H1 1 137
182 LI:400171.2:2000SEP08 3371804H1 30 153
182 LI:400171.2:2000SEP08 2923902H1 30 133
182 LI:400171.2:2000SEP08 4671104H1 30 137
182 LI:400171.2:2000SEP08 1494879H1 30 132
182 LI:400171.2:2000SEP08 6756208H1 60 669
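As an illustration only, each row of Table 3 can be read as five whitespace-separated fields: SEQ ID NO, Template ID, Component ID, Start, and Stop. The following Python sketch parses rows of that form and counts how many template positions are covered by at least one component. It assumes, without confirmation from the table itself, that Start and Stop are 1-based inclusive positions on the template. The names `ComponentRow`, `parse_row`, and `covered_length` are hypothetical and are introduced here only for the example.

```python
import re
from typing import Iterable, NamedTuple


class ComponentRow(NamedTuple):
    """One Table 3 row (hypothetical container for this sketch)."""
    seq_id: int
    template_id: str
    component_id: str
    start: int
    stop: int


# Five whitespace-separated fields: SEQ ID NO, Template ID, Component ID, Start, Stop.
ROW_RE = re.compile(r"^(\d+)\s+(\S+)\s+(\S+)\s+(\d+)\s+(\d+)$")


def parse_row(line: str) -> ComponentRow:
    """Parse one row; raise ValueError if the line does not have the expected shape."""
    m = ROW_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not a Table 3 row: {line!r}")
    seq_id, template_id, component_id, start, stop = m.groups()
    return ComponentRow(int(seq_id), template_id, component_id, int(start), int(stop))


def covered_length(rows: Iterable[ComponentRow]) -> int:
    """Count positions covered by at least one component.

    Assumption: Start and Stop are 1-based and inclusive.
    """
    total = 0
    cur_start = cur_stop = None
    for start, stop in sorted((r.start, r.stop) for r in rows):
        if cur_stop is None or start > cur_stop + 1:
            # Close the previous merged interval and open a new one.
            if cur_stop is not None:
                total += cur_stop - cur_start + 1
            cur_start, cur_stop = start, stop
        else:
            # Overlapping or adjacent interval: extend the current one.
            cur_stop = max(cur_stop, stop)
    if cur_stop is not None:
        total += cur_stop - cur_start + 1
    return total


if __name__ == "__main__":
    lines = [
        "181 LI:046515.5:2000SEP08 5546289F8 78 199",
        "181 LI:046515.5:2000SEP08 6855767T8 1 618",
        "181 LI:046515.5:2000SEP08 5546289H1 78 256",
    ]
    rows = [parse_row(s) for s in lines]
    print(covered_length(rows))  # prints 618 under the inclusive-coordinate assumption
```

Rows whose fields were damaged in reproduction (for example, a row with an extra numeric token) are rejected by `parse_row` rather than silently repaired.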
TABLE 4
SEQ ID NO: Template ID Tissue Distribution
1 LG:983076.3:2000SEP08 Urinary Tract - 50%, Endocrine System - 50%
2 LG:1382987.7:2000SEP08 Urinary Tract - 36%, Endocrine System - 36%, Female Genitalia - 18%
3 LG:235557.15:2000SEP08 Urinary Tract - 31%, Digestive System - 31%, Endocrine System - 31%
4 LG:018494.1:2000SEP08 Endocrine System - 42%, Urinary Tract - 35%, Respiratory System - 16%
5 LG:980494.1:2000SEP08 Germ Cells - 23%, Unclassified/Mixed - 22%, Nervous System - 16%
6 LG:984457.2:2000SEP08 Connective Tissue - 28%, Respiratory System - 20%, Cardiovascular System - 16%, Endocrine System - 16%
7 LG:406758.1:2000SEP08 Female Genitalia - 17%, Embryonic Structures - 17%, Hemic and Immune System - 13%, Male Genitalia - 13%
8 LG:902957.17:2000SEP08 Respiratory System - 100%
9 LG:333179.1:2000SEP08 Germ Cells - 62%, Unclassified/Mixed - 17%
10 LG:406568.1:2000SEP08 Stomatognathic System - 52%, Musculoskeletal System - 22%, Cardiovascular System - 19%
11 LG:353203.1:2000SEP08 Connective Tissue - 88%, Nervous System - 13%
12 LG:061277.1:2000SEP08 Male Genitalia - 67%, Hemic and Immune System - 33%
13 LG:170666.1:2000SEP08 Germ Cells - 70%, Respiratory System - 27%
14 LG:311197.1:2000SEP08 Germ Cells - 45%, Digestive System - 14%, Male Genitalia - 12%
15 LG:220655.4:2000SEP08 Urinary Tract - 67%, Nervous System - 33%
16 LG:1001893.1:2000SEP08 Nervous System - 100%
17 LG:004335.1:2000SEP08 Germ Cells - 24%, Unclassified/Mixed - 12%
18 LG:213092.6:2000SEP08 Nervous System - 100%
19 LG:407570.5:2000SEP08 Urinary Tract - 36%, Endocrine System - 36%, Hemic and Immune System - 27%
20 LG:337835.8:2000SEP08 Urinary Tract - 21%, Respiratory System - 15%, Hemic and Immune System - 14%, Embryonic Structures - 14%, Pancreas - 14%
21 LG:1099283.1:2000SEP08 Musculoskeletal System - 45%, Nervous System - 41%, Male Genitalia - 14%
22 LG:401274.2:2000SEP08 Female Genitalia - 50%, Digestive System - 50%
23 LG:222880.1:2000SEP08 Unclassified/Mixed - 18%, Connective Tissue - 10%, Skin - 10%
24 LG:406389.1:2000SEP08 Embryonic Structures - 41%, Nervous System - 30%, Respiratory System - 11%
25 LG:055461.1:2000SEP08 Urinary Tract - 26%, Unclassified/Mixed - 19%, Nervous System - 14%
26 LG:979059.5:2000SEP08 Exocrine Glands - 50%, Musculoskeletal System - 25%, Urinary Tract - 17%
27 LG:399238.1:2000SEP08 Germ Cells - 22%, Urinary Tract - 18%, Unclassified/Mixed - 11%
28 LG:1382945.7:2000SEP08 Musculoskeletal System - 75%, Male Genitalia - 25%
29 LG:1383610.3:2000SEP08 Liver - 17%, Sense Organs - 12%
30 LG:1384030.1:2000SEP08 Skin - 10%
31 LG:390475.1:2000SEP08 Skin - 15%, Embryonic Structures - 10%, Cardiovascular System - 10%
32 LG:229105.3:2000SEP08 Pancreas - 95%
33 LG:232578.3:2000SEP08 Respiratory System - 16%, Cardiovascular System - 13%
34 LG:1166387.9:2000SEP08 Connective Tissue - 46%, Skin - 39%
35 LG:351357.1:2000SEP08 Liver - 28%, Endocrine System - 28%, Male Genitalia - 22%
36 LG:465592.1:2000SEP08 Unclassified/Mixed - 33%, Female Genitalia - 21%, Urinary Tract - 17%, Endocrine System - 17%
37 LG:006848.5:2000SEP08 Unclassified/Mixed - 32%, Respiratory System - 20%, Exocrine Glands - 16%
38 LG:198450.2:2000SEP08 Exocrine Glands - 57%, Cardiovascular System - 29%, Female Genitalia - 14%
39 LG:1008175.1:2000SEP08 Liver - 100%
40 LG:437981.11:2000SEP08 Female Genitalia - 50%, Digestive System - 50%
41 LG:1025549.1:2000SEP08 Liver - 100%
42 LG:327226.16:2000SEP08 Urinary Tract - 31%, Digestive System - 31%, Female Genitalia - 15%, Male Genitalia - 15%
43 LG:1387394.5:2000SEP08 Urinary Tract - 57%, Digestive System - 29%, Nervous System - 14%
44 LG:445188.3:2000SEP08 Pancreas - 31%, Musculoskeletal System - 21%, Digestive System - 14%
45 LG:898864.11:2000SEP08 Urinary Tract - 29%, Digestive System - 29%, Endocrine System - 29%
46 LG:018739.2:2000SEP08 Digestive System - 41%, Pancreas - 31%, Urinary Tract - 14%, Hemic and Immune System - 14%
47 LG:302915.6:2000SEP08 Embryonic Structures - 45%, Musculoskeletal System - 30%, Endocrine System - 20%
48 LG:404418.3:2000SEP08 Liver - 33%, Unclassified/Mixed - 30%, Musculoskeletal System - 22%
49 LG:374853.2:2000SEP08 Cardiovascular System - 26%, Male Genitalia - 23%, Respiratory System - 16%
50 LG:228930.1:2000SEP08 Urinary Tract - 37%, Nervous System - 32%, Respiratory System - 16%, Hemic and Immune System - 16%
51 LG:273593.6:2000SEP08 Skin - 52%, Connective Tissue - 26%, Urinary Tract - 15%
52 LG:008215.1:2000SEP08 Embryonic Structures - 26%, Female Genitalia - 21%, Musculoskeletal System - 18%
53 LG:337160.1:2000SEP08 Unclassified/Mixed - 31%, Embryonic Structures - 18%, Urinary Tract - 17%
54 LG:395063.1:2000SEP08 Urinary Tract - 58%, Digestive System - 36%
55 LG:979069.4:2000SEP08 Embryonic Structures - 31%, Male Genitalia - 31%, Urinary Tract - 14%
56 LG:346663.5:2000SEP08 Unclassified/Mixed - 42%, Male Genitalia - 15%, Exocrine Glands - 14%
57 LG:347615.1:2000SEP08 Urinary Tract - 100%
58 LG:1397067.1:2000SEP08 Digestive System - 30%, Urinary Tract - 20%, Exocrine Glands - 20%
59 LG:120675.1:2000SEP08 Embryonic Structures - 21%, Liver - 20%, Urinary Tract - 12%
60 LG:420050.18:2000SEP08 Urinary Tract - 25%, Endocrine System - 25%, Respiratory System - 19%, Hemic and Immune System - 19%
61 LG:220495.3:2000SEP08 Unclassified/Mixed - 13%, Skin - 11%
62 LG:274551.1:2000SEP08 Nervous System - 67%, Hemic and Immune System - 33%
63 LG:429658.27:2000SEP08 Unclassified/Mixed - 64%
64 LG:246194.18:2000SEP08 Unclassified/Mixed - 17%, Sense Organs - 11%
65 LG:000874.1:2000SEP08 Nervous System - 64%, Exocrine Glands - 14%, Urinary Tract - 14%
66 LG:239967.7:2000SEP08 Exocrine Glands - 67%, Digestive System - 33%
67 LG:238388.1:2000SEP08 Sense Organs - 13%, Unclassified/Mixed - 11%, Exocrine Glands - 10%
68 LG:233674.4:2000SEP08 Germ Cells - 24%
69 LG:411327.2:2000SEP08 Sense Organs - 32%, Embryonic Structures - 17%
70 LG:1327310.1:2000SEP08 Liver - 100%
71 LG:242019.13:2000SEP08 Connective Tissue - 78%, Male Genitalia - 22%
72 LG:012432.12:2000SEP08 Pancreas - 44%, Embryonic Structures - 22%, Urinary Tract - 17%
73 LG:257088.9:2000SEP08 Sense Organs - 36%, Germ Cells - 18%, Unclassified/Mixed - 17%
74 LG:997505.5:2000SEP08 Connective Tissue - 36%, Female Genitalia - 23%, Urinary Tract - 18%
75 LG:481436.2:2000SEP08 Embryonic Structures - 16%, Connective Tissue - 12%, Endocrine System - 11%
76 LG:247776.14:2000SEP08 Female Genitalia - 21%, Connective Tissue - 21%, Urinary Tract - 12%, Hemic and Immune System - 12%, Male Genitalia - 12%, Exocrine Glands - 12%, Nervous System - 12%
77 LG:008606.14:2000SEP08 Musculoskeletal System - 36%, Liver - 10%
78 LG:985092.3:2000SEP08 Urinary Tract - 80%, Nervous System - 20%
79 LG:236649.7:2000SEP08 Embryonic Structures - 56%, Connective Tissue - 44%
80 LG:245014.2:2000SEP08 Sense Organs - 25%, Liver - 22%
81 LG:170754.4:2000SEP08 Endocrine System - 36%, Musculoskeletal System - 24%, Cardiovascular System - 16%
82 LG:988028.1:2000SEP08 Embryonic Structures - 33%, Urinary Tract - 26%, Respiratory System - 19%
83 LG:427997.6:2000SEP08 Male Genitalia - 15%, Liver - 14%, Embryonic Structures - 11%
84 LG:464206.1:2000SEP08 Endocrine System - 43%, Urinary Tract - 33%, Cardiovascular System - 19%
85 LG:1400108.1:2000SEP08 Embryonic Structures - 26%, Female Genitalia - 15%, Nervous System - 12%, Urinary Tract - 12%, Cardiovascular System - 12%, Exocrine Glands - 12%
86 LG:254531.1:2000SEP08 Unclassified/Mixed - 35%, Female Genitalia - 22%, Exocrine Glands - 17%
87 LG:1101317.1:2000SEP08 Embryonic Structures - 13%
88 LG:1074728.6:2000SEP08 Skin - 36%, Male Genitalia - 23%, Cardiovascular System - 10%, Exocrine Glands - 10%
89 LG:1081684.1:2000SEP08 Endocrine System - 56%, Male Genitalia - 25%, Female Genitalia - 13%
90 LG:1076520.1:2000SEP08 Sense Organs - 70%
91 LG:1079477.1:2000SEP08 Cardiovascular System - 21%, Urinary Tract - 21%, Male Genitalia - 21%
92 LG:1076269.1:2000SEP08 Nervous System - 100%
93 LG:1087195.1:2000SEP08 Unclassified/Mixed - 26%, Female Genitalia - 22%, Skin - 11%
94 LG:002588.7:2000SEP08 Female Genitalia - 40%, Nervous System - 40%, Hemic and Immune System - 20%
95 LG:1079470.6:2000SEP08 Female Genitalia - 22%, Nervous System - 14%
96 LG:345705.3:2000SEP08 Cardiovascular System - 57%, Nervous System - 43%
97 LG:1083654.1:2000SEP08 Unclassified/Mixed - 19%, Exocrine Glands - 15%
98 LG:198782.3:2000SEP08 Sense Organs - 31%
99 LG:981076.2:2000SEP08 Nervous System - 29%, Endocrine System - 24%, Respiratory System - 18%, Hemic and Immune System - 18%
100 LG:212023.3:2000SEP08 Digestive System - 28%, Pancreas - 25%, Musculoskeletal System - 18%
101 LG:977929.3:2000SEP08 Male Genitalia - 44%, Nervous System - 33%, Female Genitalia - 22%
102 LG:201936.6:2000SEP08 Embryonic Structures - 30%, Unclassified/Mixed - 27%, Female Genitalia - 17%
103 LG:205642.1:2000SEP08 Musculoskeletal System - 40%, Nervous System - 33%, Exocrine Glands - 27%
104 LG:339653.6:2000SEP08 Endocrine System - 90%, Nervous System - 10%
105 LG:978587.4:2000SEP08 Embryonic Structures - 36%, Unclassified/Mixed - 32%, Respiratory System - 12%
106 LG:216848.17:2000SEP08 Cardiovascular System - 67%, Male Genitalia - 33%
107 LG:219502.1:2000SEP08 Unclassified/Mixed - 54%, Exocrine Glands - 26%
108 LI:334211.1:2000SEP08 Nervous System - 36%, Connective Tissue - 27%, Respiratory System - 18%
110 LI:228425.5:2000SEP08 Cardiovascular System - 32%, Female Genitalia - 27%, Liver - 25%
111 LI:034493.1:2000SEP08 Urinary Tract - 49%, Female Genitalia - 33%, Pancreas - 10%
112 LI:336218.1:2000SEP08 Liver - 53%, Nervous System - 11%
113 LI:235891.3:2000SEP08 Liver - 80%, Respiratory System - 20%
114 LI:344094.1:2000SEP08 Sense Organs - 65%, Embryonic Structures - 14%
115 LI:399945.2:2000SEP08 Urinary Tract - 83%
116 LI:051849.1:2000SEP08 Pancreas - 100%
117 LI:238379.3:2000SEP08 Exocrine Glands - 63%, Respiratory System - 38%
118 LI:352190.8:2000SEP08 Liver - 80%, Female Genitalia - 20%
119 LI:432120.1:2000SEP08 Germ Cells - 86%
120 LI:055461.1:2000SEP08 Digestive System - 28%, Urinary Tract - 18%, Nervous System - 18%
121 LI:197433.5:2000SEP08 Unclassified/Mixed - 33%, Male Genitalia - 25%, Cardiovascular System - 18%, Urinary Tract - 18%
122 LI:170604.1:2000SEP08 Respiratory System - 46%, Urinary Tract - 31%, Female Genitalia - 23%
123 LI:205057.3:2000SEP08 Embryonic Structures - 23%, Germ Cells - 23%, Nervous System - 11%
124 LI:233795.1:2000SEP08 Nervous System - 100%
125 LI:3ni97.1:2000SEP08 Germ Cells - 32%, Digestive System - 30%, Male Genitalia - 10%
126 LI:441364.1:2000SEP08 Unclassified/Mixed - 51%, Digestive System - 23%, Urinary Tract - 12%
127 LI:210367.6:2000SEP08 Musculoskeletal System - 35%, Male Genitalia - 29%
128 LI:238194.5:2000SEP08 Endocrine System - 67%, Nervous System - 33%
129 LI:039258.5:2000SEP08 Digestive System - 75%, Nervous System - 25%
130 LI:1071842.1:2000SEP08 Musculoskeletal System - 64%, Nervous System - 16%
131 LI:481356.3:2000SEP08 Liver - 41%, Urinary Tract - 31%, Female Genitalia - 17%
132 LI:103474.1:2000SEP08 Male Genitalia - 73%, Urinary Tract - 27%
133 LI:1073020.10:2000SEP08 Digestive System - 62%, Pancreas - 38%
134 LI:000874.1:2000SEP08 Nervous System - 75%, Exocrine Glands - 10%
135 LI:037298.2:2000SEP08 Unclassified/Mixed - 47%, Germ Cells - 21%
136 LI:422901.1:2000SEP08 Respiratory System - 100%
137 LI:345815.1:2000SEP08 Digestive System - 32%, Embryonic Structures - 16%, Urinary Tract - 12%
138 LI:1072014.2:2000SEP08 Female Genitalia - 63%, Digestive System - 38%
139 LI:333138.3:2000SEP08 Nervous System - 100%
140 LI:414253.1:2000SEP08 Nervous System - 80%
141 LI:406389.1:2000SEP08 Nervous System - 47%, Embryonic Structures - 32%
142 LI:1086171.1:2000SEP08 Liver - 100%
143 LI:198782.4:2000SEP08 Cardiovascular System - 28%, Musculoskeletal System - 13%
144 LI:2030279.1:2000SEP08 Unclassified/Mixed - 52%, Exocrine Glands - 14%, Embryonic Structures - 11%
145 LI:1018424.3:2000SEP08 Nervous System - 26%, Liver - 24%, Embryonic Structures - 24%
146 LI:130969.1:2000SEP08 Cardiovascular System - 32%, Embryonic Structures - 24%, Female Genitalia - 16%
147 LI:286246.2:2000SEP08 Unclassified/Mixed - 41%, Nervous System - 26%, Endocrine System - 22%
148 LI:001527.1:2000SEP08 Stomatognathic System - 43%, Skin - 37%
149 LI:395063.1:2000SEP08 Digestive System - 30%, Unclassified/Mixed - 26%, Urinary Tract - 21%
150 LI:1064460.1:2000SEP08 Nervous System - 67%, Digestive System - 33%
151 LI:344690.2:2000SEP08 Embryonic Structures - 89%
152 LI:061585.4:2000SEP08 Digestive System - 68%, Exocrine Glands - 23%
153 LI:378428.1:2000SEP08 Endocrine System - 26%, Digestive System - 24%, Female Genitalia - 19%
154 LI:474108.2:2000SEP08 Male Genitalia - 18%, Nervous System - 14%, Urinary Tract - 13%
155 LI:230711.2:2000SEP08 Urinary Tract - 30%, Female Genitalia - 29%, Digestive System - 29%
156 LI:008942.1:2000SEP08 Hemic and Immune System - 31%, Embryonic Structures - 21%, Female Genitalia - 13%
157 LI:732479.1:2000SEP08 Unclassified/Mixed - 20%, Male Genitalia - 19%, Urinary Tract - 18%
158 LI:1190250.1:2000SEP08 Female Genitalia - 34%, Respiratory System - 24%, Embryonic Structures - 12%
159 LI:1013717.1:2000SEP08 Liver - 100%
160 LI:2049125.2:2000SEP08 Urinary Tract - 57%, Digestive System - 43%
161 LI:1092360.1:2000SEP08 Digestive System - 43%, Unclassified/Mixed - 23%, Nervous System - 20%
162 LI:791524.1:2000SEP08 Cardiovascular System - 61%, Respiratory System - 33%
163 LI:1084555.3:2000SEP08 Digestive System - 80%, Hemic and Immune System - 20%
164 LI:815418.2:2000SEP08 Respiratory System - 13%, Cardiovascular System - 13%, Stomatognathic System - 12%, Pancreas - 12%
165 LI:416766.1:2000SEP08 Urinary Tract - 28%, Endocrine System - 18%, Exocrine Glands - 10%
167 LI:1169888.3:2000SEP08 Unclassified/Mixed - 62%
168 LI:412592.1:2000SEP08 Endocrine System - 36%, Urinary Tract - 12%, Nervous System - 11%
169 LI:349808.1:2000SEP08 Female Genitalia - 43%, Endocrine System - 16%, Hemic and Immune System - 14%
170 LI:349164.2:2000SEP08 Musculoskeletal System - 58%, Endocrine System - 19%, Cardiovascular System - 18%
171 LI:205413.1:2000SEP08 Sense Organs - 32%, Nervous System - 20%, Embryonic Structures - 14%
172 LI:2051508.2:2000SEP08 Urinary Tract - 69%, Embryonic Structures - 15%
173 LI:346242.2:2000SEP08 Stomatognathic System - 33%, Musculoskeletal System - 12%, Liver - 11%
174 LI:2052717.1:2000SEP08 Germ Cells - 20%, Endocrine System - 19%, Digestive System - 16%
175 LI:406668.2:2000SEP08 Liver - 21%, Unclassified/Mixed - 21%, Digestive System - 20%
176 LI:1178352.1:2000SEP08 Female Genitalia - 28%, Embryonic Structures - 22%, Unclassified/Mixed - 21%
177 LI:814014.7:2000SEP08 Male Genitalia - 31%, Female Genitalia - 23%, Unclassified/Mixed - 23%
178 LI:1170624.1:2000SEP08 Nervous System - 100%
180 LI:1093491.1:2000SEP08 Unclassified/Mixed - 62%, Digestive System - 11%
181 LI:046515.5:2000SEP08 Nervous System - 67%, Male Genitalia - 33%
182 LI:400171.2:2000SEP08 Respiratory System - 39%, Digestive System - 25%
183 LI:330919.6:2000SEP08 Digestive System - 100%
184 LI:219502.1:2000SEP08 Unclassified/Mixed - 49%, Exocrine Glands - 29%
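Each Table 4 row follows the pattern "SEQ ID NO, template ID, then comma-separated tissue and percentage pairs". As a minimal sketch of how such a row could be read programmatically, assuming the cleaned row layout shown above (the function name and regular expressions are illustrative and not part of the patent):

```python
import re

# Hypothetical helper: splits one cleaned Table 4 row into its three fields.
ROW_RE = re.compile(r"^(\d+)\s+(\S+)\s+(.*)$")
# One "Tissue Name - NN%" pair; the tissue name is everything before the last " - NN%".
PAIR_RE = re.compile(r"^\s*(.+?)\s*-\s*(\d+)\s*%\s*$")


def parse_table4_row(line):
    """Return (seq_id, template_id, {tissue: percent}) for one Table 4 row."""
    match = ROW_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a Table 4 row: {line!r}")
    seq_id, template_id, rest = match.groups()
    tissues = {}
    for part in rest.split(","):
        pair = PAIR_RE.match(part)
        if pair:  # silently skip fragments that do not look like "name - NN%"
            tissues[pair.group(1)] = int(pair.group(2))
    return int(seq_id), template_id, tissues


row = "78 LG:985092.3:2000SEP08 Urinary Tract - 80%, Nervous System - 20%"
print(parse_table4_row(row))
```

Rows still carrying extraction damage (for example a comma in place of a period inside the template ID) would need cleaning before a parser like this is applied.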
TABLE 5
SEQ ID NO: Frame Length Start Stop GI Number Probability Score Annotation
193 3 188 3 566 g12838374 1.00E-93 putative
193 3 188 3 566 g3719323 2.00E-30 D7-like protein
193 3 188 3 566 g64651 5.00E-25 D7 protein (AA 1-278)
194 2 531 2 1594 g12656196 6.00E-94 cardiac leiomodin
194 2 531 2 1594 g28969 3.00E-58 64 Kd autoantigen
194 2 531 2 1594 g1628561 1.00E-49 E-tropomodulin
195 1 138 1 414 g9797761 8.00E-11 Contains similarity to PIR7A protein from Oryza sativa gb|Z34271 and contains an alpha/beta hydrolase fold PF|00561.
195 1 138 1 414 g4406772 4.00E-10 putative nitrilase-associated protein
195 1 138 1 414 g2765837 4.00E-10 NAP16kDa protein
196 3 167 3 503 g4678973 3.00E-92 dJ207H1.1 (axonemal Dynein Heavy Chain protein DNAH)
196 3 167 3 503 g14335444 3.00E-92 axonemal dynein heavy chain 8
196 3 167 3 503 g14335468 2.00E-91 axonemal dynein heavy chain 8 short form 2
197 2 155 5 469 g1817526 1.00E-26 intermediate chain 1
197 2 155 5 469 g7580490 7.00E-24 NM23-H8
197 2 155 5 469 g7292727 2.00E-12 CG18130 gene product
198 2 272 167 982 g13811938 1.00E-119 dJ1056H1.2.1 (novel protein similar to mitogen inducible protein MIG-2 (isoform 1))
198 2 272 167 982 g505033 7.00E-62 mitogen inducible gene mig-2
198 2 272 167 982 g7294024 8.00E-25 CG7729 gene product
200 2 119 95 451 g8452874 5.00E-21 bromodomain-containing protein
200 2 119 95 451 g6966969 5.00E-21 bromodomain containing protein
200 2 119 95 451 g6626179 6.00E-21 bromodomain protein CELTIX1
201 259 310 1086 g10435124 1.00E-129 unnamed protein product
201 259 310 1086 g3342738 1.00E-100 R26660J, partial CDS
201 259 310 1086 g7297640 2.00E-85 CG4946 gene product
204 122 274 639 g4200330 3.00E-22 dJ821D11.1 (PUTATIVE protein)
204 122 274 639 g12855287 4.00E-16 putative
205 2 106 2 319 g13185199 4.00E-44 unnamed protein product
205 2 106 2 319 g12964604 4.00E-32 vacuolar proton-ATPase subunit M9.2
206 2 158 74 547 g7297790 8.00E-13 CG6093 gene product
206 2 158 74 547 g6779215 6.00E-11 unnamed protein product
206 2 158 74 547 g4454332 6.00E-11 tDET1 protein
207 3 447 99 1439 g10436952 1.00E-157 unnamed protein product
207 3 447 99 1439 g13785928 1.00E-105 unknown
207 3 447 99 1439 g7293558 5.00E-81 CG14194 gene product
208 1 357 1 1071 g12405521 1.00E-61 unnamed protein product
208 1 357 1 1071 g12839367 7.00E-37 putative
208 1 357 1 1071 g4150939 8.00E-34 GSG1
209 2 352 179 1234 g13878075 2.00E-17 unknown protein
209 2 352 179 1234 g7076778 4.00E-17 putative protein
209 2 352 179 1234 g7269877 9.00E-14 hypothetical protein
210 219 657 g14495695 4.00E-53 (BC009459) Unknown (protein for MGC:15935)
210 219 657 g13874543 4.00E-53 hypothetical protein
210 219 657 g13623407 4.00E-53 Similar to RIKEN cDNA 2810468K17 gene
211 282 846 g12841694 1.00E-102 putative
211 282 846 g7301806 3.00E-22 CG7582 gene product
211 282 846 g9658122 2.00E-11 conserved hypothetical protein
212 85 175 429 g14272514 7.00E-34 unnamed protein product
212 85 175 429 g12848539 4.00E-31 putative
212 85 175 429 g12845046 4.00E-31 putative
214 3 314 159 1100 g14388454 1.00E-163 hypothetical protein
214 3 314 159 1100 g13874465 1.00E-163 hypothetical protein
214 3 314 159 1100 g14388519 1.00E-162 hypothetical protein
215 1 130 1 390 g9963773 7.00E-66 microtubule-associated proteins 1A/1B light chain 3
215 1 130 1 390 g10799102 7.00E-66 microtubule-associated proteins 1A/1B light chain 3
215 1 130 1 390 g10438107 7.00E-66 unnamed protein product
219 1 261 433 1215 g10439289 1.00E-112 unnamed protein product
219 1 261 433 1215 g4240325 9.00E-66 KIAA0918 protein
219 1 261 433 1215 g14017925 2.00E-65 KIAA1854 protein
222 2 192 263 838 g11991486 1.00E-73 Glycosyltransferase
222 2 192 263 838 g14041907 1.00E-72 unnamed protein product
222 2 192 263 838 g12834837 1.00E-65 putative
223 1 200 1 600 g12832198 6.00E-77 putative
223 1 200 1 600 g12052866 5.00E-75 hypothetical protein
223 1 200 1 600 g14550502 7.00E-53 (BC009501) CGI-78 protein
224 2 123 2 370 g1399966 2.00E-42 hypothetical protein 384D8_7
224 2 123 2 370 g12804169 2.00E-42 Unknown (protein for IMAGE:3954961)
224 2 123 2 370 g746576 4.00E-07 K11G12.6 gene product
228 3 182 66 611 g4240130 1.00E-21 KIAA0822 protein
228 3 182 66 611 g10176866 1.00E-09 contains similarity to ABC transporter~gene_id:MAC9.4
228 3 182 66 611 g6598351 7.00E-08 putative ABC transporter
229 3 118 318 671 g11139242 3.00E-37 meiotic recombination protein REC14
229 3 118 318 671 g10437122 3.00E-37 unnamed protein product
229 3 118 318 671 g12850275 4.00E-35 putative
230 1 206 1009 1626 g1262852 4.00E-38 M17 protein
230 1 206 1009 1626 g13874586 8.00E-09 hypothetical protein
234 2 209 17 643 g9368450 1.00E-112 phospholipase C-beta-1b
234 2 209 17 643 g9368448 1.00E-112 phospholipase C-beta-1a
234 2 209 17 643 g206218 1.00E-111 phospholipase C-l
235 1 151 115 567 g 10441 58 1.00E-33 unknown
235 1 151 115 567 g10438597 1.00E-33 unnamed protein product
235 1 151 115 567 g10438555 1.00E-33 unnamed protein product
236 3 498 3 1496 g14597912 0 (AX172874) human CLASP-3
236 3 498 3 1496 g14598037 1.00E-162 (AX173175) human CLASP-7
236 3 498 3 1496 g7243171 1.00E-162 KIAA1395 protein
237 1 238 121 834 g12858399 1.00E-119 putative
237 1 238 121 834 g7303474 7.00E-33 CG8493 gene product
237 1 238 121 834 g8885570 2.00E-25 gene_id:F15L12.8~unknown protein
238 3 179 1665 2201 g3702269 9.00E-12 sodium iodide symporter
238 3 179 1665 2201 g2887405 9.00E-12 sodium iodide symporter
238 3 179 1665 2201 g1628579 9.00E-12 sodium iodide symporter
239 1 278 889 1722 g13365897 1.00E-110 hypothetical protein
239 1 278 889 1722 g13603727 5.00E-23 glucose transporter
239 1 278 889 1722 g13560065 5.00E-23 dJ28H20.1 (solute carrier family 2 (facilitated glucose transporter), member 10)
241 3 85 3 257 g9886740 4.00E-12 WNT-2B Isoform 1
241 3 85 3 257 g1524105 4.00E-12 Wnt-13
242 1 168 70 573 g10047243 5.00E-45 KIAA1584 protein
242 1 168 70 573 g7019973 6.00E-22 unnamed protein product
242 1 168 70 573 g6468312 6.00E-22 hypothetical protein
243 1 203 193 801 g13785614 2.00E-83 sideroflexin 2
243 1 203 193 801 g10433651 2.00E-54 unnamed protein product
243 1 203 193 801 g13785612 2.00E-54 sideroflexin 1
244 3 50 258 407 g13876944 4.00E-07 NEFA-interacting nuclear protein NIP30
244 3 50 258 407 g13182773 4.00E-07 CDA10
244 3 50 258 407 g12006227 4.00E-07 CDA018
245 1 242 1 726 g13936285 8.00E-71 TRH4
245 1 242 1 726 g12845540 1.00E-70 putative
245 1 242 1 726 g13185173 3.00E-43 unnamed protein product
247 2 69 2 208 g13325269 1.00E-18 Similar to RIKEN cDNA 2400006N03 gene
247 2 69 2 208 g12845621 8.00E-11 putative
248 1 161 1 483 g10281737 2.00E-30 similar to GenBank Accession Number AC021163
248 1 161 1 483 g4263748 5.00E-25 similar to KIAA0766; similar to PID:g3882253
248 1 161 1 483 g12328812 4.00E-24 gtf2ird2
249 304 580 1491 g9652074 1.00E-132 arginine N-methyltransferase
249 304 580 1491 g7453577 1.00E-115 protein arginine N-methyltransferase 1-variant 1
249 304 580 1491 g7453576 1.00E-115 protein arginine N-methyltransferase 1-variant 3
250 107 1 321 g14585861 4.00E-11 (AY037151) hypothetical protein SB 139
250 107 1 321 g14042596 4.00E-11 unnamed protein product
250 107 1 321 g13182739 4.00E-11 GL008
251 3 534 3 1604 g7303142 6.00E-29 CG10155 gene product
251 3 534 3 1604 g8979789 2.00E-08 sprouty (Drosophila) homolog 3
251 3 534 3 1604 g1644461 3.00E-08 neural variant mena+++ protein
252 1 597 142 1932 g14017889 1.00E-176 KIAA1836 protein
252 1 597 142 1932 g7020201 3.00E-47 unnamed protein product
252 1 597 142 1932 g7297820 2.00E-44 CG4713 gene product
253 3 171 522 1034 g8926691 3.00E-45 hypothetical protein
253 3 171 522 1034 g8132351 3.00E-31 putative seven pass transmembrane protein
253 3 171 522 1034 g13096836 3.00E-31 Similar to transmembrane 7 superfamily member 1 (upregulated in
254 1 162 7 492 g12834045 2.00E-44 putative
254 1 162 7 492 g5531805 6.00E-36 16.7Kd protein
254 1 162 7 492 g13111780 6.00E-36 16.7Kd protein
259 3 90 3 272 g10436707 2.00E-08 unnamed protein product
259 3 90 3 272 g10435706 2.00E-08 unnamed protein product
261 2 433 197 1495 g12851542 0 putative
261 2 433 197 1495 g12845721 0 putative
261 2 433 197 1495 g10435667 0 unnamed protein product
263 2 82 191 436 g12859941 3.00E-07 putative
264 1 472 277 1692 g10439926 0 unnamed protein product
264 1 472 277 1692 g4406632 0 Unknown
264 1 472 277 1692 g13477255 1.00E-169 hypothetical protein FLJ23293 similar to ARL-6 interacting protein-2
265 2 182 56 601 g10434977 5.00E-72 unnamed protein product
265 2 182 56 601 g14035806 4.00E-58 unnamed protein product
265 2 182 56 601 g13435476 2.00E-53 Unknown (protein for MGC:6708)
266 1 337 628 1638 g11345539 0 dJ620E11.1.1 (novel Helicase C-terminal domain and SNF2 N-terminal domains containing protein, similar to KIAA0308 (isoform 1))
266 1 337 628 1638 g10438729 0 unnamed protein product
266 1 337 628 1638 g7243213 1.00E-134 KIAA1416 protein
267 3 282 3 848 g6996442 3.00E-53 CTL1 protein
267 3 282 3 848 g6996589 7.00E-52 CTL1 protein
267 3 282 3 848 g6996587 1.00E-42 CTL1 protein
270 69 49 255 g4586224 4.00E-08 40S ribosomal protein S24
270 69 49 255 g65054 5.00E-08 ribosomal protein S19
270 69 49 255 g57858 5.00E-08 ribosomal protein S24
271 240 91 810 g10435124 7.00E-97 unnamed protein product
271 240 91 810 g3342738 2.00E-93 R26660J, partial CDS
271 240 91 810 g7297640 1.00E-82 CG4946 gene product
275 2 175 98 622 g12804721 1.00E-12 Unknown (protein for MGC:2663)
275 2 175 98 622 g10047251 3.00E-12 KIAA1588 protein
275 2 175 98 622 g1049301 1.00E-11 KRAB zinc finger protein; Method: conceptual translation supplied by
276 2 77 266 496 g2587027 6.00E-07 HERV-E envelope glycoprotein
276 2 77 266 496 g2587024 6.00E-07 HERV-E envelope glycoprotein
276 2 77 266 496 g1049232 6.00E-07 HERV-E envelope protein
277 3 103 1029 1337 g1492075 1.00E-12 MC132L
280 162 22 507 g12857957 4.00E-23 putative
280 162 22 507 g12854979 4.00E-23 putative
281 580 1126 2865 g13625162 0 jerky
281 580 1126 2865 g2314829 1.00E-45 jerky gene product homolog
281 580 1126 2865 g10140857 4.00E-43 jerky
283 1 160 172 651 g12053267 6.00E-81 hypothetical protein
AA823760 (NID:g2893628), AA215791 (NID:g1815572), AI095488 (NID:g3434464), and AA969095 (NID:g3144275)
283 1 160 172 651 g5640105 8.00E-54 homeobox protein LSX
284 3 275 210 1034 g4468307 1.00E-141 dJ413H6.1.1 (hamster Androgen-dependent Expressed Protein LIKE PUTATIVE protein) (isoform 1)
284 3 275 210 1034 g13937819 1.00E-117 Unknown (protein for MGC:12335)
284 3 275 210 1034 g191315 1.00E-78 androgen-dependent expressed protein
285 3 146 57 494 g13938187 6.00E-42 hypothetical protein FLJ22419
285 3 146 57 494 g10438804 6.00E-42 unnamed protein product
285 3 146 57 494 g10436785 7.00E-26 unnamed protein product
286 2 256 77 844 g14035998 9.00E-75 unnamed protein product
286 2 256 77 844 g12849991 2.00E-73 putative
286 2 256 77 844 g7303811 3.00E-42 CG18445 gene product
288 3 100 18 317 g14250565 1.00E-09 Unknown (protein for IMAGE:3162511)
288 3 100 18 317 g14017961 1.00E-09 KIAA1872 protein
288 3 100 18 317 g11611579 1.00E-09 hypothetical protein
289 122 1 366 g7230612 1.00E-14 small rec
289 122 1 366 g6682873 1.00E-14 reduced expression in cancer
289 122 1 366 g12856270 4.00E-11 putative
290 95 10 294 g14161726 2.00E-13 putative 1-aminocyclopropane-1-carboxylate synthase
290 95 10 294 g12848260 8.00E-11 putative
291 2 158 722 1195 g10047319 2.00E-69 KIAA1621 protein
291 2 158 722 1195 g9965296 3.00E-62 protocadherin-3x
291 2 158 722 1195 g8926619 3.00E-62 protocadherin 3X
295 3 203 27 635 g12849991 3.00E-24 putative
295 3 203 27 635 g7303811 3.00E-17 CG18445 gene product
295 3 203 27 635 g3874149 2.00E-12 (Z73103) predicted using Genefinder
296 2 175 2 526 g13477379 5.00E-95 Unknown (protein for IMAGE:3543080)
296 2 175 2 526 g12857733 4.00E-28 putative
296 2 175 2 526 g9944332 7.00E-28 Ttyh1
299 3 132 3 398 g6624727 2.00E-19 putative V-ATPase G subunit
299 3 132 3 398 g4809331 4.00E-19 NG38
299 3 132 3 398 g3805814 4.00E-19 V-ATPase G-subunit like protein
300 1 129 205 591 g10174998 8.00E-10 BH2378~unknown conserved protein
300 1 129 205 591 g2634068 1.00E-07 similar to hypothetical proteins
300 1 129 205 591 g4982442 2.00E-06 conserved hypothetical protein
301 2 238 2 715 g14042009 1.00E-85 unnamed protein product
301 2 238 2 715 g14272546 2.00E-85 unnamed protein product
301 2 238 2 715 g13279116 9.00E-54 Unknown (protein for MGC:10848)
303 2 269 2 808 g12840021 1.00E-109 putative
303 2 269 2 808 g12838582 1.00E-109 putative
303 2 269 2 808 g12838265 1.00E-109 putative
304 2 375 191 1315 g13878075 1.00E-19 unknown protein
304 2 375 191 1315 g7076778 5.00E-19 putative protein
304 2 375 191 1315 g7269877 2.00E-15 hypothetical protein
305 1 219 1 657 g12052866 3.00E-95 hypothetical protein
305 1 219 1 657 g12832198 3.00E-73 putative
305 1 219 1 657 g14550502 3.00E-43 (BC009501) CGI-78 protein
306 2 138 164 577 g7022185 2.00E-16 unnamed protein product
306 2 138 164 577 g3983150 3.00E-13 schlafen2
306 2 138 164 577 g3983164 3.00E-12 schlafen4
307 2 105 80 394 g13278450 4.00E-51 RIKEN cDNA 2310004124 gene
307 2 105 80 394 g7269324 8.00E-13 hypothetical protein
307 2 105 80 394 g4220517 8.00E-13 hypothetical protein
308 1 71 55 267 g14041978 2.00E-18 unnamed protein product
308 1 71 55 267 g13182761 2.00E-18 CDA02
308 1 71 55 267 g7296500 3.00E-09 CG7414 gene product
309 2 272 167 982 g13811938 1.00E-121 dJ1056H1.2.1 (novel protein similar to mitogen inducible protein MIG-2 (isoform 1))
309 2 272 167 982 g505033 2.00E-62 mitogen inducible gene mig-2
309 2 272 167 982 g7292434 2.00E-25 CG14991 gene product (alt 1)
312 3 127 141 521 g12052874 6.00E-34 hypothetical protein
312 3 127 141 521 g10438752 6.00E-34 unnamed protein product
312 3 127 141 521 g13905234 1.00E-31 Similar to hypothetical protein FLJ22386
313 1 119 1 357 g14250337 2.00E-40 (BC008598) Unknown (protein for MGC:7807)
313 1 119 1 357 g13542915 2.00E-40 Unknown (protein for IMAGE:3708981)
313 1 119 1 357 g12053337 2.00E-40 hypothetical protein
314 2 237 2 712 g14198207 5.00E-74 Similar to CG4452 gene product
314 2 237 2 712 g4200238 4.00E-68 hypothetical protein
314 2 237 2 712 g4200234 4.00E-68 hypothetical protein
315 2 163 113 601 g5102832 2.00E-79 bK150C2.3 (PUTATIVE novel protein similar to APOBEC1 (Apolipoprotein B mRNA editing protein) and Phorbolin)
315 2 163 113 601 g9294747 2.00E-70 phorbolin 1 protein
315 2 163 113 601 g5102834 7.00E-57 bK150C2.10 (PUTATIVE novel Phorbolin 1 LIKE protein)
316 3 147 3 443 g12839245 2.00E-31 putative
316 3 147 3 443 g14485581 3.00E-30 testis-specific transporter TST1
316 3 147 3 443 g12855499 7.00E-20 putative
317 2 132 623 1018 g12005896 8.00E-68 AD035
317 2 132 623 1018 g10047285 8.00E-68 KIAA1605 protein
317 2 132 623 1018 g14042883 4.00E-67 unnamed protein product
318 304 580 1491 g9652074 1.00E-132 arginine N-methyltransferase
318 304 580 1491 g7453577 1.00E-115 protein arginine N-methyltransferase 1-variant 1
318 304 580 1491 g7453576 1.00E-115 protein arginine N-methyltransferase 1-variant 3
319 434 157 1458 g7670342 9.00E-88 unnamed protein product
319 434 157 1458 g7300652 5.00E-16 CG15688 gene product
319 434 157 1458 g7300655 3.00E-10 CG15689 gene product
320 1 442 1 1326 g11877275 1.00E-168 dJ726C3.4 (ortholog of potential ligand-binding protein RYA3 (Rat))
320 1 442 1 1326 g57734 1.00E-130 potential ligand-binding protein
320 1 442 1 1326 g57732 5.00E-43 potential ligand-binding protein
321 2 195 611 1195 g14272530 2.00E-77 unnamed protein product
321 2 195 611 1195 g12846554 3.00E-77 putative
321 2 195 611 1195 g1799824 2.00E-09 similar to (SwissProt Accession Number P39836) -start codon is not identified yet
322 3 139 492 908 g12848204 2.00E-36 putative
322 3 139 492 908 g7378713 2.00E-10 dJ134019.3 (zinc finger protein 151 (pHZ-67))
322 3 139 492 908 g2230871 2.00E-10 Miz-1 protein
324 3 650 396 2345 g3043634 0 KIAA0555 protein
324 3 650 396 2345 g12857795 1.00E-131 putative
324 3 650 396 2345 g7291892 1.00E-16 zip gene product
325 275 825 g12405521 4.00E-44 unnamed protein product
325 275 825 g4150939 9.00E-31 GSG1
325 275 825 g12839367 5.00E-21 putative
326 147 441 g12858991 1.00E-64 putative
326 147 441 g12843356 1.00E-64 putative
326 147 441 g12834013 1.00E-64 putative
329 2 598 2 1795 g10438323 0 unnamed protein product
329 2 598 2 1795 g10439967 1.00E-134 unnamed protein product
329 2 598 2 1795 g6572165 2.00E-54 dJ1119A7.5 (novel protein (isoform 2))
333 136 1567 1974 g3702269 2.00E-05 sodium iodide symporter
333 136 1567 1974 g1628579 2.00E-05 sodium iodide symporter
333 136 1567 1974 g2887405 4.00E-05 sodium iodide symporter
334 247 79 819 g9757150 4.00E-18 extremely cysteine/valine rich protein
334 247 79 819 g10434098 6.00E-17 unnamed protein product
334 247 79 819 g854065 3.00E-15 U88
337 1 318 61 1014 g7264026 1.00E-108 dJ876B10.2 (novel protein (ortholog of rat EX084))
337 1 318 61 1014 g2827164 1.00E-100 exo84
337 1 318 61 1014 g7301432 3.00E-19 CG6095 gene product
338 3 142 510 935 g4519558 2.00E-56 Kilon
338 3 142 510 935 g5019445 2.00E-45 neurotractin-L
338 3 142 510 935 g9887387 3.00E-23 CEPU-Se alpha 1 isoform
339 1 306 439 1356 g1002425 3.00E-16 YSPL-1 form 2
339 1 306 439 1356 g1002424 3.00E-16 YSPL-1 form 1
340 1 244 1 732 g14424514 3.00E-08 Similar to solute carrier family 7 (cationic amino acid transporter, y+ system), member 5
340 1 244 1 732 g11995021 4.00E-05 hLAT1-3TM
340 1 244 1 732 g5926732 5.00E-05 L-type amino acid transporter 1
341 3 294 111 992 g12846470 1.00E-120 putative
341 3 294 111 992 g10728401 1.00E-46 EG:39E1.1 gene product (alt 2)
341 3 294 111 992 g7290235 7.00E-44 EG:39E1.1 gene product (alt 3)
342 3 179 3 539 g11907923 2.00E-29 enhancer of polycomb
342 3 179 3 539 g12857328 2.00E-22 putative
342 3 179 3 539 g7303589 1.00E-18 E(Pc) gene product
343 1 216 1 648 g12832198 9.00E-86 putative
343 1 216 1 648 g12052866 6.00E-82 hypothetical protein
343 1 216 1 648 g14550502 3.00E-55 (BC009501) CGI-78 protein
344 1 82 409 654 g14042078 2.00E-16 unnamed protein product
344 1 82 409 654 g14042074 2.00E-16 unnamed protein product
346 2 191 98 670 g10047251 3.00E-15 KIAA1588 protein
346 2 191 98 670 g12804721 5.00E-15 Unknown (protein for MGC:2663)
346 2 191 98 670 g1049301 2.00E-14 KRAB zinc finger protein; Method: conceptual translation supplied by
347 1 87 304 564 g4220590 2.00E-14 nuclear protein np95
347 1 87 304 564 g14190525 2.00E-14 nuclear zinc finger protein Np95
347 1 87 304 564 g6815251 1.00E-13 transcription factor ICBP90
348 1 137 1423 1833 g2104691 1.00E-07 alpha glucosidase II, beta subunit
348 1 137 1423 1833 g7672979 6.00E-07 glucosidase II beta subunit
348 1 137 1423 1833 g182855 6.00E-07 80K-H protein
349 437 1576 2886 g13625162 0 jerky
349 437 1576 2886 g2314829 6.00E-45 jerky gene product homolog
349 437 1576 2886 g12044467 2.00E-35 hypothetical protein
353 3 325 645 1619 g10047169 1.00E-130 KIAA1552 protein
353 3 325 645 1619 g12805039 8.00E-69 Unknown (protein for IMAGE:3453830)
353 3 325 645 1619 g13276673 8.00E-52 hypothetical protein
354 3 132 768 1163 g13624461 7.00E-16 dJ259A10.1 (ssDNA binding protein (SEB4D))
354 3 132 768 1163 g8895698 5.00E-12 RRM-containing protein SEB-4
354 3 132 768 1163 g8920429 3.00E-10 hypothetical protein
356 1 337 628 1638 g11345539 0 dJ620E11.1.1 (novel Helicase C-terminal domain and SNF2 N-terminal domains containing protein, similar to KIAA0308 (isoform 1))
356 1 337 628 1638 g10438729 0 unnamed protein product
356 1 337 628 1638 g7243213 1.00E-134 KIAA1416 protein
358 2 205 2 616 g13445910 4.00E-68 radial spoke protein 3
358 2 205 2 616 g12841878 9.00E-51 putative
358 2 205 2 616 g12838997 9.00E-51 putative
362 1 154 604 1065 g7328107 2.00E-27 hypothetical protein
362 1 154 604 1065 g7021037 2.00E-27 unnamed protein product
362 1 154 604 1065 g13435999 2.00E-27 Similar to hypothetical protein
363 2 77 266 496 g2587027 6.00E-07 HERV-E envelope glycoprotein
363 2 77 266 496 g2587024 6.00E-07 HERV-E envelope glycoprotein
363 2 77 266 496 g1049232 6.00E-07 HERV-E envelope protein
364 3 51 120 272 g950113 5.00E-15 ribosomal protein S3a
364 3 51 120 272 g854179 5.00E-15 ribosomal protein S3a
364 3 51 120 272 g5441551 5.00E-15 Ribosomal protein
366 2 124 2 373 g14042876 1.00E-18 unnamed protein product
366 2 124 2 373 g12004990 1.00E-18 RanBP17
366 2 124 2 373 g12855399 2.00E-17 putative
368 3 62 237 422 g7959315 6.00E-12 KIAA1524 protein
369 3 158 849 1322 g10047319 2.00E-69 KIAA1621 protein
369 3 158 849 1322 g9965296 3.00E-62 protocadherin-3x
369 3 158 849 1322 g8926619 3.00E-62 protocadherin 3X
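In the intact Table 5 rows, the Frame, Length, Start and Stop columns appear to be linked: Stop equals Start + 3 × Length - 1, and Frame equals ((Start - 1) mod 3) + 1. This relation is inferred from the rows themselves (for example 193: 3, 188, 3, 566) and is not stated in the source. Under that assumption, a small check like the following sketch can flag rows whose coordinates were damaged in extraction; the function names are hypothetical.

```python
def expected_stop(start, length_aa):
    """Last nucleotide of an ORF that begins at `start` and encodes `length_aa` residues."""
    return start + 3 * length_aa - 1


def expected_frame(start):
    """Forward reading frame (1, 2 or 3) implied by a 1-based start position."""
    return (start - 1) % 3 + 1


def row_is_consistent(frame, length_aa, start, stop):
    """True when a row's four coordinate columns agree with the inferred relation."""
    return stop == expected_stop(start, length_aa) and frame == expected_frame(start)


# Rows taken from Table 5: (Frame, Length, Start, Stop)
print(row_is_consistent(3, 188, 3, 566))     # SEQ ID NO:193
print(row_is_consistent(2, 272, 167, 982))   # SEQ ID NO:198
print(row_is_consistent(1, 597, 12, 1932))   # a damaged start value fails the check
```

A failed check only signals that a row deserves a second look against the original publication; it does not say which column is wrong.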
Table 6
Program Description Reference Parameter Threshold
ABI FACTURA
Description: A program that removes vector sequences and masks ambiguous bases in nucleic acid sequences.
Reference: Applied Biosystems, Foster City, CA.

ABI/PARACEL FDF
Description: A Fast Data Finder useful in comparing and annotating amino acid or nucleic acid sequences.
Reference: Applied Biosystems, Foster City, CA; Paracel Inc., Pasadena, CA.
Parameter Threshold: Mismatch <50%

ABI AutoAssembler
Description: A program that assembles nucleic acid sequences.
Reference: Applied Biosystems, Foster City, CA.

BLAST
Description: A Basic Local Alignment Search Tool useful in sequence similarity search for amino acid and nucleic acid sequences. BLAST includes five functions: blastp, blastn, blastx, tblastn, and tblastx.
Reference: Altschul, S.F. et al. (1990) J. Mol. Biol. 215:403-410; Altschul, S.F. et al. (1997) Nucleic Acids Res. 25:3389-3402.
Parameter Threshold: ESTs: Probability value= 1.0E-8 or less. Full Length sequences: Probability value= 1.0E-10 or less

FASTA
Description: A Pearson and Lipman algorithm that searches for similarity between a query sequence and a group of sequences of the same type. FASTA comprises at least five functions: fasta, tfasta, fastx, tfastx, and ssearch.
Reference: Pearson, W.R. and D.J. Lipman (1988) Proc. Natl. Acad. Sci. USA 85:2444-2448; Pearson, W.R. (1990) Methods Enzymol. 183:63-98; and Smith, T.F. and M.S. Waterman (1981) Adv. Appl. Math. 2:482-489.
Parameter Threshold: ESTs: fasta E value=1.06E-6. Assembled ESTs: fasta Identity= 95% or greater and Match length=200 bases or greater; fastx E value=1.0E-8 or less. Full Length sequences: fastx score=100 or greater

BLIMPS
Description: A BLocks IMProved Searcher that matches a sequence against those in BLOCKS, PRINTS, DOMO, PRODOM, and PFAM databases to search for gene families, sequence homology, and structural fingerprint regions.
Reference: Henikoff, S. and J.G. Henikoff (1991) Nucleic Acids Res. 19:6565-6572; Henikoff, J.G. and S. Henikoff (1996) Methods Enzymol. 266:88-105; and Attwood, T.K. et al. (1997) J. Chem. Inf. Comput. Sci. 37:417-424.
Parameter Threshold: Probability value= 1.0E-3 or less

HMMER
Description: An algorithm for searching a query sequence against hidden Markov model (HMM)-based databases of protein family consensus sequences, such as PFAM.
Reference: Krogh, A. et al. (1994) J. Mol. Biol. 235:1501-1531; Sonnhammer, E.L.L. et al. (1988) Nucleic Acids Res. 26:320-322; Durbin, R. et al. (1998) Our World View, in a Nutshell, Cambridge Univ. Press, pp. 1-350.
Parameter Threshold: PFAM hits: Probability value= 1.0E-3 or less. Signal peptide hits: Score= 0 or greater
ProfileScan | Description: An algorithm that searches for structural and sequence motifs in protein sequences that match sequence patterns defined in Prosite. | Reference: Gribskov, M. et al. (1988) CABIOS 4:61-66; Gribskov, M. et al. (1989) Methods Enzymol. 183:146-159; Bairoch, A. et al. (1997) Nucleic Acids Res. 25:217-221. | Parameter Threshold: Normalized quality score ≥ GCG-specified "HIGH" value for that particular Prosite motif. Generally, score = 1.4-2.1.
Phred | Description: A base-calling algorithm that examines automated sequencer traces with high sensitivity and probability. | Reference: Ewing, B. et al. (1998) Genome Res. 8:175-185; Ewing, B. and P. Green (1998) Genome Res. 8:186-194.
Phrap | Description: A Phils Revised Assembly Program including SWAT and CrossMatch, programs based on efficient implementation of the Smith-Waterman algorithm, useful in searching sequence homology and assembling DNA sequences. | Reference: Smith, T.F. and M.S. Waterman (1981) Adv. Appl. Math. 2:482-489; Smith, T.F. and M.S. Waterman (1981) J. Mol. Biol. 147:195-197; and Green, P., University of Washington, Seattle, WA. | Parameter Threshold: Score = 120 or greater; Match length = 56 or greater
Consed | Description: A graphical tool for viewing and editing Phrap assemblies. | Reference: Gordon, D. et al. (1998) Genome Res. 8:195-202.
SPScan | Description: A weight matrix analysis program that scans protein sequences for the presence of secretory signal peptides. | Reference: Nielson, H. et al. (1997) Protein Engineering 10:1-6; Claverie, J.M. and S. Audic (1997) CABIOS 12:431-439. | Parameter Threshold: Score = 3.5 or greater
TMAP | Description: A program that uses weight matrices to delineate transmembrane segments on protein sequences and determine orientation. | Reference: Persson, B. and P. Argos (1994) J. Mol. Biol. 237:182-192; Persson, B. and P. Argos (1996) Protein Sci. 5:363-371.
TMHMMER | Description: A program that uses a hidden Markov model (HMM) to delineate transmembrane segments on protein sequences and determine orientation. | Reference: Sonnhammer, E.L. et al. (1998) Proc. Sixth Intl. Conf. on Intelligent Systems for Mol. Biol., Glasgow et al., eds., The Am. Assoc. for Artificial Intelligence Press, Menlo Park, CA, pp. 175-182.
Motifs | Description: A program that searches amino acid sequences for patterns that matched those defined in Prosite. | Reference: Bairoch, A. et al. (1997) Nucleic Acids Res. 25:217-221; Wisconsin Package Program Manual, version 9, page M51-59, Genetics Computer Group, Madison, WI.
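
As an illustration only, the parameter thresholds listed in Table 6 could be applied to program output along the following lines. This is a minimal sketch, not the applicants' software: the numeric cutoffs are taken from the table, but the dictionary layout, the hit record fields (`probability`, `score`, `match_length`), the function names, and the reading that a Phrap match must satisfy both of its listed criteria are all assumptions made for the example.

```python
import operator

# Cutoffs transcribed from Table 6. The keys, the field names, and the
# (field, comparison, cutoff) layout are hypothetical choices for this sketch.
THRESHOLDS = {
    "BLIMPS": ("probability", operator.le, 1.0e-3),
    "HMMER_PFAM": ("probability", operator.le, 1.0e-3),
    "HMMER_SIGNAL_PEPTIDE": ("score", operator.ge, 0.0),
    "SPScan": ("score", operator.ge, 3.5),
}


def passes_threshold(program, hit):
    """Return True if a hit record meets the Table 6 cutoff for the program."""
    field, compare, cutoff = THRESHOLDS[program]
    return compare(hit[field], cutoff)


def passes_phrap(hit):
    """Phrap lists two criteria; this sketch assumes both must be met."""
    return hit["score"] >= 120 and hit["match_length"] >= 56


if __name__ == "__main__":
    print(passes_threshold("BLIMPS", {"probability": 5.0e-4}))
    print(passes_phrap({"score": 150, "match_length": 40}))
```

Programs for which Table 6 gives no threshold (Phred, Consed, TMAP, TMHMMER, Motifs) are deliberately left out of the sketch.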

Claims

CLAIMS What is claimed is:
1. An isolated polynucleotide selected from the group consisting of: a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-184, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d).
2. An isolated polynucleotide of claim 1, selected from the group consisting of SEQ ID NO:1-184.
3. An isolated polynucleotide comprising at least 30 contiguous nucleotides of a polynucleotide of claim 1.
4. An isolated polynucleotide comprising at least 60 contiguous nucleotides of a polynucleotide of claim 1.
5. A composition for the detection of expression of secretory polynucleotides comprising at least one of the polynucleotides of claim 1 and a detectable label.
6. A method for detecting a target polynucleotide in a sample, said target polynucleotide having a sequence of a polynucleotide of claim 1, the method comprising: a) amplifying said target polynucleotide or fragment thereof using polymerase chain reaction amplification, and b) detecting the presence or absence of said amplified target polynucleotide or fragment thereof, and, optionally, if present, the amount thereof.
7. A method for detecting a target polynucleotide in a sample, said target polynucleotide having a sequence of a polynucleotide of claim 1, the method comprising: a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides comprising a sequence complementary to said target polynucleotide in the sample, and which probe specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization complex is formed between said probe and said target polynucleotide or fragments thereof, and b) detecting the presence or absence of said hybridization complex, and, optionally, if present, the amount thereof.
8. A method of claim 7, wherein the probe comprises at least 30 contiguous nucleotides.
9. A method of claim 7, wherein the probe comprises at least 60 contiguous nucleotides.
10. A recombinant polynucleotide comprising a promoter sequence operably linked to a polynucleotide of claim 1.
11. A cell transformed with a recombinant polynucleotide of claim 10.
12. A transgenic organism comprising a recombinant polynucleotide of claim 10.
13. A method for producing a secretory polypeptide encoded by a polynucleotide of claim 1, the method comprising: a) culturing a cell under conditions suitable for expression of the secretory polypeptide, wherein said cell is transformed with a recombinant polynucleotide comprising a promoter sequence operably linked to a polynucleotide of claim 1, and b) recovering the secretory polypeptide so expressed.
14. A method of claim 13, wherein the polypeptide has an amino acid sequence selected from the group consisting of SEQ ID NO:185-369.
15. An isolated secretory polypeptide (SPTM) encoded by at least one of the polynucleotides of claim 2.
16. A method of screening for a test compound that specifically binds to the polypeptide of claim 15, the method comprising: a) combining the polypeptide of claim 15 with at least one test compound under suitable conditions, and b) detecting binding of the polypeptide of claim 15 to the test compound, thereby identifying a compound that specifically binds to the polypeptide of claim 15.
17. A microarray wherein at least one element of the microarray is a polynucleotide of claim
18. A method for generating a transcript image of a sample which contains polynucleotides, the method comprising: a) labeling the polynucleotides of the sample, b) contacting the elements of the microarray of claim 17 with the labeled polynucleotides of the sample under conditions suitable for the formation of a hybridization complex, and c) quantifying the expression of the polynucleotides in the sample.
19. A method for screening a compound for effectiveness in altering expression of a target polynucleotide, wherein said target polynucleotide comprises a polynucleotide sequence of a polynucleotide of claim 1, the method comprising: a) exposing a sample comprising the target polynucleotide to a compound, under conditions suitable for the expression of the target polynucleotide, b) detecting altered expression of the target polynucleotide, and c) comparing the expression of the target polynucleotide in the presence of varying amounts of the compound and in the absence of the compound.
20. A method for assessing toxicity of a test compound, said method comprising: a) treating a biological sample containing nucleic acids with the test compound, b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20 contiguous nucleotides of a polynucleotide of claim 1 under conditions whereby a specific hybridization complex is formed between said probe and a target polynucleotide in the biological sample, said target polynucleotide comprising a polynucleotide sequence of a polynucleotide of claim 1 or fragment thereof, c) quantifying the amount of hybridization complex, and d) comparing the amount of hybridization complex in the treated biological sample with the amount of hybridization complex in an untreated biological sample, wherein a difference in the amount of hybridization complex in the treated biological sample is indicative of toxicity of the test compound.
21. An array comprising different nucleotide molecules affixed in distinct physical locations on a solid substrate, wherein at least one of said nucleotide molecules comprises a first oligonucleotide or polynucleotide sequence specifically hybridizable with at least 30 contiguous nucleotides of a target polynucleotide, and wherein said target polynucleotide is a polynucleotide of claim 1.
22. An array of claim 21, wherein said first oligonucleotide or polynucleotide sequence is completely complementary to at least 30 contiguous nucleotides of said target polynucleotide.
23. An array of claim 21, wherein said first oligonucleotide or polynucleotide sequence is completely complementary to at least 60 contiguous nucleotides of said target polynucleotide.
24. An array of claim 21, wherein said first oligonucleotide or polynucleotide sequence is completely complementary to said target polynucleotide.
25. An array of claim 21, which is a microarray.
26. An array of claim 21, further comprising said target polynucleotide hybridized to a nucleotide molecule comprising said first oligonucleotide or polynucleotide sequence.
27. An array of claim 21, wherein a linker joins at least one of said nucleotide molecules to said solid substrate.
28. An array of claim 21, wherein each distinct physical location on the substrate contains multiple nucleotide molecules, and the multiple nucleotide molecules at any single distinct physical location have the same sequence, and each distinct physical location on the substrate contains nucleotide molecules having a sequence which differs from the sequence of nucleotide molecules at another distinct physical location on the substrate.
29. An isolated polypeptide selected from the group consisting of: a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369.
30. An isolated polypeptide of claim 29, having a sequence selected from the group consisting of SEQ ID NO:185-369.
31. An isolated polynucleotide encoding a polypeptide of claim 29.
32. An isolated polynucleotide encoding a polypeptide of claim 30.
33. An isolated polynucleotide of claim 32, having a sequence selected from the group consisting of SEQ ID NO:1-184.
34. An isolated antibody which specifically binds to a secretory polypeptide of claim 29.
35. A diagnostic test for a condition or disease associated with the expression of SPTM in a biological sample, the method comprising: a) combining the biological sample with an antibody of claim 34, under conditions suitable for the antibody to bind the polypeptide and form an antibody:polypeptide complex, and b) detecting the complex, wherein the presence of the complex correlates with the presence of the polypeptide in the biological sample.
36. The antibody of claim 34, wherein the antibody is: a) a chimeric antibody, b) a single chain antibody, c) a Fab fragment, d) a F(ab')2 fragment, or e) a humanized antibody.
37. A composition comprising an antibody of claim 34 and an acceptable excipient.
38. A method of diagnosing a condition or disease associated with the expression of SPTM in a subject, comprising administering to said subject an effective amount of the composition of claim 37.
39. A composition of claim 37, wherein the antibody is labeled.
40. A method of diagnosing a condition or disease associated with the expression of SPTM in a subject, comprising administering to said subject an effective amount of the composition of claim 39.
41. A method of preparing a polyclonal antibody with the specificity of the antibody of claim 34, the method comprising: a) immunizing an animal with a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, or an immunogenic fragment thereof, under conditions to elicit an antibody response, b) isolating antibodies from said animal, and c) screening the isolated antibodies with the polypeptide, thereby identifying a polyclonal antibody which binds specifically to a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369.
42. An antibody produced by a method of claim 41.
43. A composition comprising the antibody of claim 42 and a suitable carrier.
44. A method of making a monoclonal antibody with the specificity of the antibody of claim 34, the method comprising: a) immunizing an animal with a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369, or an immunogenic fragment thereof, under conditions to elicit an antibody response, b) isolating antibody producing cells from the animal, c) fusing the antibody producing cells with immortalized cells to form monoclonal antibody-producing hybridoma cells, d) culturing the hybridoma cells, and e) isolating from the culture monoclonal antibody which binds specifically to a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369.
45. A monoclonal antibody produced by a method of claim 44.
46. A composition comprising the antibody of claim 45 and a suitable carrier.
47. The antibody of claim 34, wherein the antibody is produced by screening a Fab expression library.
48. The antibody of claim 34, wherein the antibody is produced by screening a recombinant immunoglobulin library.
49. A method of detecting a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369 in a sample, the method comprising: a) incubating the antibody of claim 34 with a sample under conditions to allow specific binding of the antibody and the polypeptide, and b) detecting specific binding, wherein specific binding indicates the presence of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369 in the sample.
50. A method of purifying a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369 from a sample, the method comprising: a) incubating the antibody of claim 34 with a sample under conditions to allow specific binding of the antibody and the polypeptide, and b) separating the antibody from the sample and obtaining the purified polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:185-369.
51. A composition comprising a polypeptide of claim 29 and a pharmaceutically acceptable excipient.
52. A composition of claim 51, wherein the polypeptide has an amino acid sequence of SEQ ID NO:185-369.
53. A method for treating a disease or condition associated with decreased expression of functional SPTM, comprising administering to a patient in need of such treatment the composition of claim 51.
54. A method for screening a compound for effectiveness as an agonist of a polypeptide of claim 29, the method comprising: a) exposing a sample comprising a polypeptide of claim 29 to a compound, and b) detecting agonist activity in the sample.
55. A composition comprising an agonist compound identified by a method of claim 54 and a pharmaceutically acceptable excipient.
56. A method for treating a disease or condition associated with decreased expression of functional SPTM, comprising administering to a patient in need of such treatment a composition of claim 55.
57. A method for screening a compound for effectiveness as an antagonist of a polypeptide of claim 29, the method comprising: a) exposing a sample comprising a polypeptide of claim 29 to a compound, and b) detecting antagonist activity in the sample.
58. A composition comprising an antagonist compound identified by a method of claim 57 and a pharmaceutically acceptable excipient.
59. A method for treating a disease or condition associated with overexpression of functional SPTM, comprising administering to a patient in need of such treatment a composition of claim 58.
60. A method of screening for a compound that modulates the activity of the polypeptide of claim 29, said method comprising: a) combining the polypeptide of claim 29 with at least one test compound under conditions permissive for the activity of the polypeptide of claim 29, b) assessing the activity of the polypeptide of claim 29 in the presence of the test compound, and c) comparing the activity of the polypeptide of claim 29 in the presence of the test compound with the activity of the polypeptide of claim 29 in the absence of the test compound, wherein a change in the activity of the polypeptide of claim 29 in the presence of the test compound is indicative of a compound that modulates the activity of the polypeptide of claim 29.
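
Several of the claims above rest on simple sequence relationships: complementarity (claim 1 c-d), RNA equivalents (claim 1 e), percent identity (claims 1 b and 29 b), and probes containing a minimum run of contiguous complementary nucleotides (claims 7-9 and 21-23). The following is a minimal sketch of those relationships, offered only as an illustration. All function names are hypothetical, the identity calculation is a simplified column count over an alignment supplied by the caller rather than the alignment method defined in the specification, and exact string matching stands in for hybridization, which in practice depends on experimental conditions.

```python
# Complement table for unambiguous DNA bases only (an assumption of this sketch).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the complementary strand, written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def rna_equivalent(dna):
    """Return the same base sequence with uracil in place of thymine."""
    return dna.upper().replace("T", "U")


def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns with identical residues.

    Both inputs must already be aligned to equal length, with '-' for gaps.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def probe_covers_target(probe, target, min_len=20):
    """True if the probe has at least min_len contiguous nucleotides that are
    exactly complementary to some stretch of the target."""
    rc = reverse_complement(probe)
    target = target.upper()
    return any(rc[i:i + min_len] in target
               for i in range(len(rc) - min_len + 1))


if __name__ == "__main__":
    target = "ACGT" * 10
    probe = reverse_complement(target[5:30])
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
    print(probe_covers_target(probe, target, min_len=20))
```

Passing `min_len=30` or `min_len=60` corresponds to the longer probe lengths recited in claims 8, 9, 22, and 23.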
EP01966516A 2000-09-05 2001-08-30 Secretory molecules Withdrawn EP1368375A2 (en)

Applications Claiming Priority (57)

Application Number Priority Date Filing Date Title
US22975100P 2000-09-05 2000-09-05
US23058300P 2000-09-05 2000-09-05
US22974800P 2000-09-05 2000-09-05
US22975000P 2000-09-05 2000-09-05
US22974900P 2000-09-05 2000-09-05
US23001600P 2000-09-05 2000-09-05
US22974700P 2000-09-05 2000-09-05
US230583P 2000-09-05
US229750P 2000-09-05
US230016P 2000-09-05
US229749P 2000-09-05
US23061000P 2000-09-06 2000-09-06
US23051500P 2000-09-06 2000-09-06
US23099000P 2000-09-06 2000-09-06
US23059600P 2000-09-06 2000-09-06
US23098800P 2000-09-06 2000-09-06
US23051400P 2000-09-06 2000-09-06
US23051800P 2000-09-06 2000-09-06
US23086500P 2000-09-06 2000-09-06
US23059500P 2000-09-06 2000-09-06
US23086400P 2000-09-06 2000-09-06
US23051900P 2000-09-06 2000-09-06
US23051700P 2000-09-06 2000-09-06
US23050500P 2000-09-06 2000-09-06
US23059900P 2000-09-06 2000-09-06
US23098900P 2000-09-06 2000-09-06
US23059700P 2000-09-06 2000-09-06
US230989P 2000-09-06
US230990P 2000-09-06
US230599P 2000-09-06
US230864P 2000-09-06
US230988P 2000-09-06
US230865P 2000-09-06
US230596P 2000-09-06
US230518P 2000-09-06
US230505P 2000-09-06
US230597P 2000-09-06
US230514P 2000-09-06
US230610P 2000-09-06
US230517P 2000-09-06
US230595P 2000-09-06
US230515P 2000-09-06
US230519P 2000-09-06
US23183200P 2000-09-07 2000-09-07
US23116300P 2000-09-07 2000-09-07
US23095100P 2000-09-07 2000-09-07
US23089600P 2000-09-07 2000-09-07
US23089700P 2000-09-07 2000-09-07
US231163P 2000-09-07
US230951P 2000-09-07
US231832P 2000-09-07
US230897P 2000-09-07
US230896P 2000-09-07
PCT/US2001/027297 WO2002020756A2 (en) 2000-09-05 2001-08-30 Secretory molecules
US229747P 2009-07-30
US229748P 2009-07-30
US229751P 2009-07-30

Publications (1)

Publication Number Publication Date
EP1368375A2 (en) 2003-12-10

Family

ID=27586833

Family Applications (1)

Application Number Title Priority Date Filing Date
EP01966516A Withdrawn EP1368375A2 (en) 2000-09-05 2001-08-30 Secretory molecules

Country Status (4)

Country Link
EP (1) EP1368375A2 (en)
AU (1) AU2001287022A1 (en)
CA (1) CA2419943A1 (en)
WO (1) WO2002020756A2 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DK1854809T3 (en) 2000-08-22 2013-01-21 Agensys Inc Nucleic acid and corresponding protein designated 158P1D7 useful in the treatment and detection of bladder cancer and other cancers
US7358353B2 (en) 2000-08-22 2008-04-15 Agensys, Inc. Nucleic acid and corresponding protein named 158P1D7 useful in the treatment and detection of bladder and other cancers
GB0027905D0 (en) * 2000-11-15 2000-12-27 Glaxo Group Ltd New protein
AU2003215369A1 (en) * 2002-02-22 2003-09-09 Incyte Corporation Intracellular signaling molecules
US7198899B2 (en) 2002-06-03 2007-04-03 Chiron Corporation Use of NRG4, or inhibitors thereof, in the treatment of colon and pancreatic cancers
JP2004267015A (en) * 2003-03-05 2004-09-30 National Institute Of Advanced Industrial & Technology Nucleic acid and method for testing canceration using the same nucleic acid
EA201990839A1 (en) 2012-08-23 2019-08-30 Эдженсис, Инк. ANTIBODY MEDICINE (ADC) CONJUGATES THAT CONTACT PROTEINS 158P1D7
CN110850088B (en) * 2019-12-06 2021-08-20 四川大学华西医院 Application of GTF2IRD2 autoantibody detection reagent in preparation of lung cancer screening kit

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0974058A2 (en) * 1997-04-08 2000-01-26 Human Genome Sciences, Inc. 20 human secreted proteins
AU2001236631A1 (en) * 2000-02-24 2001-09-03 Incyte Genomics, Inc. Secretory molecules

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO0220756A2 *

Also Published As

Publication number Publication date
WO2002020756A3 (en) 2003-08-28
WO2002020756A2 (en) 2002-03-14
CA2419943A1 (en) 2002-03-14
AU2001287022A1 (en) 2002-03-22

Similar Documents

Publication Publication Date Title
WO2002010387A2 (en) G-protein coupled receptors
CA2447212A1 (en) Secretory molecules
EP1328630A2 (en) Secreted proteins
EP1368375A2 (en) Secretory molecules
WO2002040715A2 (en) Molecules for disease detection and treatment
WO2002024895A2 (en) Transcription factors and zinc finger proteins
US20050095587A1 (en) Molecules for disease detection and treatment
WO2003062385A2 (en) Secretory molecules
WO2001062918A2 (en) Secretory polypeptides and corresponding polynucleotides
JP2003533975A (en) Secreted protein
EP1349935A2 (en) Molecules for disease detection and treatment
EP1472285A2 (en) Secretory molecules
WO2001011032A1 (en) Secretory molecules
EP1181357A2 (en) Molecules for disease detection and treatment
US20040142331A1 (en) Molecules for disease detection and treatment
WO2001023558A2 (en) Human secretory molecules
CA2402747A1 (en) G-protein associated molecules
WO2002031151A2 (en) Lipocalins
EP1265918A1 (en) Human immune response proteins
US20030208040A1 (en) G-protein associated molecules
WO2002012339A2 (en) Sequences for integrin alpha-8
EP1222258A2 (en) Molecules for disease detection and treatment
EP1303609A2 (en) G-protein coupled receptors
JP2004535159A (en) Disease detection and therapeutic molecules

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20030401

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR

AX Request for extension of the european patent

Extension state: AL LT LV MK RO SI

17Q First examination report despatched

Effective date: 20040526

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20041008

RIN1 Information on inventor provided before grant (corrected)

Inventor name: INMAN, REBEKAH R.

Inventor name: AU, ALAN P.

Inventor name: CHANG, SIMON C.

Inventor name: CHEN, ALICE J.

Inventor name: MARWAHA, RAKESH

Inventor name: DAFFO, ABEL

Inventor name: FLORES, VINCENT

Inventor name: PANZER, SCOTT R.

Inventor name: DAVID, MARIE H.

Inventor name: PERALTA, CAREYNA H.

Inventor name: GERSTIN, EDWARD H.JR.

Inventor name: ROSEBERRY, ANN M.

Inventor name: HARRIS, BERNARD

Inventor name: ROHATGI, SAMEER D.

Inventor name: BRADLEY, DIANA L.

Inventor name: MOMIYAMA, MONIKA G.

Inventor name: DAHL, CHRISTOPHER R.

Inventor name: YAP, PIERRE E.

Inventor name: LIU, TOMMY F.

Inventor name: GIETZEN, DARRYL

Inventor name: WRIGHT, RACHEL J.

Inventor name: YU, JIMMY Y.

Inventor name: JONES, ANISSA LEE

Inventor name: JACKSON, JENNIFER L.

Inventor name: CHALUP, MICHAEL S.

Inventor name: DUFOUR, GERARD E.

Inventor name: ALTUS, CHRISTINA M.

Inventor name: LINCOLN, STEPHEN E.

Inventor name: JACKSON, STUART E.