WO2004094651A2 - Novel human polypeptides encoded by polynucleotides - Google Patents

Publication number
WO2004094651A2
Authority
WO
WIPO (PCT)
Prior art keywords
cells
nucleic acid
cell
polypeptide
acid molecule
Prior art date
Application number
PCT/US2004/012049
Other languages
French (fr)
Other versions
WO2004094651A3 (en)
Inventor
Ernestine Lee
Kevin Hestir
Keting Chu
Mamoru Kamiya
Yoshihide Hayashizaki
Lorianne Masuoka
Lewis Thomas Williams
Original Assignee
Five Prime Therapeutics, Inc.
Application filed by Five Prime Therapeutics, Inc.
Publication of WO2004094651A2
Publication of WO2004094651A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals

Definitions

  • the present invention is related generally to novel polynucleotides and novel polypeptides encoded thereby, their compositions, antibodies directed thereto, and other agonists or antagonists thereto.
  • the polynucleotides and polypeptides are useful in diagnostic, prophylactic, and therapeutic applications for a variety of diseases, disorders, syndromes, and conditions, as well as in discovering new diagnostics, prophylactics, and therapeutics for such diseases, disorders, syndromes, and conditions (hereinafter disorders).
  • This application also relates to the field of polypeptides that are associated with regulating cell growth and differentiation, that are over-expressed in cancer, and/or that can be associated with proliferation or inhibition of cancer growth, including hematopoietic cancers such as leukemias, lymphomas, and solid cancers such as pancreatic cancer, tracheal cancer, and lung cancer, for example, adenocarcinomas and/or squamous cell carcinomas.
  • These polypeptides may also be associated with other conditions, such as inflammatory, immune, and metabolic disorders such as type II diabetes, as well as bone disorders, central nervous system (CNS) disorders, and microbial infections, including viral, bacterial, fungal, and parasitic disorders.
  • This application further relates to modulators of biological activity that can specifically bind to these polynucleotides or polypeptides, or otherwise specifically modulate their activity. For example, they can directly or indirectly induce antibody-dependent cellular cytotoxicity (ADCC), complement-dependent cytotoxicity (CDC), endocytosis, apoptosis, or recruitment of other cells to effect cell activation, cell inactivation, cell growth or differentiation or inhibition thereof, and cell killing.
  • This application yet further relates to compositions and methods for diagnosis and treatment of proliferative, inflammatory, immune, metabolic, bone, CNS, and microbial disorders.
  • the molecules of the invention encompass a variety of different types of nucleic acids and polypeptides with different structures and functions. They can encode or comprise polypeptides belonging to different protein families (Pfam).
  • the "Pfam" system is an organization of protein sequence classification and analysis, based on conserved protein domains; it can be publicly accessed in a number of ways, for example, at http://pfam.wustl.edu.
  • Protein domains are portions of proteins that have a tertiary structure and sometimes have enzymatic or binding activities; multiple domains can be connected by flexible polypeptide regions within a protein.
  • Pfam domains can comprise the N-terminus or the C-terminus of a protein, or can be situated at any point in between.
  • the Pfam system identifies protein families based on these domains and provides an annotated, searchable database that classifies proteins into families (Bateman et al., 2002).
  • Molecules of the invention can encode or be comprised of one, or more than one, Pfam.
  • Molecules encompassed by the invention include the polypeptides and polynucleotides shown in the Sequence Listing and corresponding molecular sequences found at all developmental stages of an organism.
  • Molecules of the invention can comprise genes or gene segments designated by the Sequence Listing, and their gene products, i.e., RNA and polypeptides.
  • variants of those set forth in the Sequence Listing that are present in the normal physiological state, e.g., variant alleles such as SNPs and splice variants, as well as variants that are affected in pathological states, such as disease-related mutations or sequences with alterations that lead to pathology, and variants with conservative amino acid changes.
  • Molecules of the invention are categorized below; any given one may belong to one or more than one category.
  • Secreted proteins, also referred to as secreted factors or secreted polypeptides, as used herein, include polypeptides, or active portions thereof, that are produced by cells and exported extracellularly; extracellular fragments of transmembrane proteins that are proteolytically cleaved; and extracellular fragments of cell surface receptors, which fragments may be soluble.
  • HG1009657P1 is a 551 amino acid-residue polypeptide comprising a cytochrome p450 Pfam domain and is 94% homologous over the length of the polypeptide (i.e., 94% of the 551 amino acid residues are identical) to human cytochrome p450, family 26, subfamily C, polypeptide 1.
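The identity figure quoted for HG1009657P1 can be read as simple arithmetic: 94% of 551 residues is roughly 518 identical positions. The following is a minimal Python sketch of that calculation, assuming the two sequences have already been aligned to equal length. The function name `percent_identity` and the toy sequences are hypothetical illustrations and do not come from the application or its Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned positions holding the same residue.

    Assumes both sequences are pre-aligned to equal length; gap
    characters ('-') never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to equal length")
    if not aligned_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Toy example (hypothetical sequences): 9 of 10 aligned residues match.
toy_identity = percent_identity("MKTLLVAGAS", "MKTLLVAGAT")

# Residue count implied by 94% identity over 551 residues.
identical_residues = round(0.94 * 551)
```

Real comparisons would first need an alignment step, which this sketch deliberately leaves out.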
  • the secreted proteins of the present invention include those in the Sequence Listing with a Tree Vote of 0.5 or higher; the Tree Vote is an internal designation, as described infra.
  • compositions of the present invention will have in common the ability to act as ligands for binding to receptors on cell surfaces in ligand receptor interactions; to bind to ligands, soluble or otherwise; to inhibit ligand receptor interactions; to trigger certain intracellular responses, such as inducing signal transduction to activate cells or inhibit cellular activity; to induce cellular growth, proliferation, or differentiation; to induce the production of other factors that, in turn, mediate such activities; or to inhibit cell activation or signaling.
  • the cell types having cell surface receptors responsive to secreted proteins are many and various, including any cell type of any tissue origin or developmental state, for example, stem cells and progenitor cells; precursor and mature cells of the hematopoietic, hepatic, neural, lung, heart, thymic, splenic, epithelial, endothelial, pancreatic, adipose, gastrointestinal, colonic, renal, optic, olfactory, bone, cartilaginous, and musculoskeletal lineages.
  • the hematopoietic cells can be precursor or mature red blood cells or white blood cells, including cells of the B lymphocytic (B-cell), T lymphocytic (T cell), monocytic, dendritic, megakaryocytic, natural killer (NK), macrophagic, eosinophilic, and basophilic lineages.
  • the cell types responsive to secreted proteins also include normal cells and cells implicated in pathological conditions or other disorders.
  • Certain of the secreted proteins can stimulate T or B cell growth or differentiation by interacting with precursor T or B cells or hematopoietic progenitor cells, or bone marrow stem cells.
  • certain secreted proteins of the present invention can maintain stem cells, progenitor cells or precursor cells in an undifferentiated state.
  • certain secreted proteins of the present invention can regulate bone growth by stimulation or inhibition thereof; secretion of insulin; glucose metabolism; cell proliferation; response to microbial infection; and regeneration of tissues including neural, muscular, and epithelial.
  • certain secreted proteins of the present invention can induce apoptosis, such as in cancer cells or inflammatory cells.
  • Cytochrome P450 domains are involved in the oxidative degradation of various compounds, including environmental toxins and mutagens (Degtyarenko and Archakov 1993).
  • This polypeptide possesses the functional domain and properties of a lipid binding serum glycoprotein.
  • These families of proteins comprise lipopolysaccharide binding proteins, bactericidal permeability-increasing proteins, cholesteryl ester transfer proteins, and phospholipid transfer proteins. Proteins in these families share biochemical and structural similarities and serve a wide range of physiological functions (Yamashita et al., 2000).
  • Transmembrane Proteins and Related Polypeptides
  • Transmembrane proteins extend into or through the cell membrane's lipid bilayer; they can span the membrane once, or more than once. Transmembrane proteins that span the membrane once are “single transmembrane proteins” (STM), and transmembrane proteins that span the membrane more than once are “multiple transmembrane proteins” (MTM). Examples of transmembrane proteins include the receptors, e.g., insulin receptors; adenylate cyclases; and ion exchangers.
  • a single transmembrane protein typically has one transmembrane (TM) domain spanning a series of consecutive amino acid residues, and numbered on the basis of distance from the N-terminus, with the first amino acid residue at the N-terminus as number 1.
  • a multi-transmembrane protein typically has more than one TM domain, each spanning a series of consecutive amino acid residues, numbered in the same way as the STM protein.
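The numbering convention described above (residue 1 at the N-terminus, each TM domain covering a run of consecutive residues) can be sketched as a small data structure. This is an illustrative Python sketch only: the `TMDomain` class, the helper names, and the example protein are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TMDomain:
    start: int  # first residue of the domain, 1-based from the N-terminus
    end: int    # last residue of the domain, inclusive


def classify(domains: List[TMDomain]) -> str:
    """Return 'STM' for one TM domain, 'MTM' for more than one."""
    if not domains:
        return "not transmembrane"
    return "STM" if len(domains) == 1 else "MTM"


def domain_sequence(protein: str, domain: TMDomain) -> str:
    """Slice out a TM domain using 1-based, inclusive residue numbers."""
    if not (1 <= domain.start <= domain.end <= len(protein)):
        raise ValueError("domain lies outside the protein")
    return protein[domain.start - 1 : domain.end]


# Hypothetical 17-residue protein with one TM domain at residues 5-14.
example_protein = "MAAAGLLLLLVVVVKKK"
example_domain = TMDomain(start=5, end=14)
example_class = classify([example_domain])
example_tm = domain_sequence(example_protein, example_domain)
```

Storing 1-based inclusive coordinates keeps the records in line with the residue numbering used in the text, and the conversion to 0-based slicing happens in one place.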
  • Transmembrane proteins, having part of their molecules on either side of the bilayer, have many and widely variant biological functions. They can transport molecules, e.g., ions or proteins, across membranes, transduce signals across membranes, act as receptors, and function as antigens. Transmembrane proteins are often involved in cell signaling events; they can comprise signaling molecules, or can interact with signaling molecules.
  • tyrosine kinases can be transmembrane receptor proteins. Abnormalities of receptor tyrosine kinases are associated with human cancers; tumor cells are known to use receptor tyrosine kinases in transduction pathways to achieve tumor growth, angiogenesis and metastasis.
  • transmembrane proteins or polypeptides like the secreted polypeptides, also have many different functional domains, and belong to a wide variety of Pfam families.
  • Because of these roles in tumor growth, angiogenesis, and metastasis, receptor tyrosine kinases represent pivotal targets in cancer therapy.
  • a number of small molecule receptor tyrosine kinase inhibitors have been synthesized, are in clinical trials, are being analyzed in animal models, or have been marketed.
  • Inhibitory mechanisms include ligand-dependent down regulation, e.g., by the adaptor Cbl (Brunelleschi et al., 2002).
  • Some transmembrane proteins of the present invention have a p450 Pfam domain, which was described supra. For example, HG1009657P1, also described supra as a secreted protein, is also a transmembrane protein. These designations are consistent with a transmembrane protein having an extracellular fragment that is cleaved.
  • Adh_short domains are found in a large family of proteins that includes short-chain dehydrogenases and reductase enzymes; most family members function as NAD- or NADP-dependent oxidoreductases (Jornvall et al. 1995).
  • This polypeptide possesses the functional domain and properties of a YT521-B-like protein, which is a tyrosine-phosphorylated nuclear protein that functions in a signal transduction pathway to influence splice site selection.
  • YT521-B interacts with the nuclear transcriptosomal component scaffold attachment factor B, and a 68-kDa src substrate which is associated during mitosis (Hartmann et al., 1999).
  • Transmembrane proteins that are differentially expressed on the surface of cancer cells are desirable targets for production of antibodies, e.g., diagnostic antibodies or therapeutic antibodies, such as antibodies that mediate ADCC or CDC to effect tumor cell killing.
  • Transmembrane proteins with extracellular fragments that can be cleaved can be useful as secreted proteins to effect ligand/receptor binding so as to mediate intracellular responses, such as signal transduction.
  • Transmembrane proteins that act as receptors, and possess a ligand binding extracellular portion exposed on a cell surface and an intracellular portion that interacts with other cellular components upon activation, can also be useful as transmembrane proteins to mediate intracellular responses, such as signal transduction.
  • Other Proteins and Related Polypeptides
  • the invention also encompasses proteins and related polypeptides that are neither secreted proteins, nor transmembrane proteins. These polypeptides possess the functional domains and properties of their Pfam domains.
  • DAGKcat diacylglycerol kinase catalytic domain
  • integrase DNA binding domain
  • rnaseH Pfam domain
  • rve integrase core
  • the present invention features an isolated polynucleotide that encodes a polypeptide.
  • the polypeptide has at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 97%, at least about 98%, or at least about 99% amino acid sequence identity with an amino acid sequence derived from a polynucleotide sequence chosen from at least one nucleotide sequence according to SEQ ID NO: 1 - 123.
  • the polypeptide has an amino acid sequence chosen from at least one amino acid sequence according to SEQ ID NO: 124 - 246.
  • the polypeptide has at least one activity associated with the naturally occurring encoded polypeptide.
  • the polypeptide includes a signal peptide.
  • the polypeptide comprises a mature form of a protein, from which the signal peptide has been cleaved.
  • the polypeptide is a signal peptide.
  • the invention provides fragments of a polypeptide chosen from at least one amino acid sequence according to SEQ ID NO: 124 - 246, where each fragment is an extracellular fragment of the polypeptide, or an extracellular fragment of the polypeptide minus the signal peptide.
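The relationship between a precursor, its signal peptide, its mature form, and an extracellular fragment minus the signal peptide can be illustrated with plain string slicing. The Python sketch below assumes a single-pass protein whose N-terminal region is extracellular and whose signal peptide length and first TM residue are already known; that topology, the function names, and the example sequence are all hypothetical assumptions made for illustration.

```python
def mature_form(precursor: str, signal_len: int) -> str:
    """Precursor with its N-terminal signal peptide removed."""
    if not (0 <= signal_len < len(precursor)):
        raise ValueError("signal peptide length out of range")
    return precursor[signal_len:]


def extracellular_fragment(
    precursor: str, signal_len: int, tm_start: int, keep_signal: bool = False
) -> str:
    """Residues N-terminal to the TM domain.

    tm_start is the 1-based number of the first TM residue. Assumes the
    N-terminal side of the membrane is extracellular (an assumption of
    this sketch, not a statement about any listed sequence).
    """
    if not (signal_len < tm_start <= len(precursor)):
        raise ValueError("TM domain must start after the signal peptide")
    begin = 0 if keep_signal else signal_len
    return precursor[begin : tm_start - 1]


# Hypothetical precursor: 7-residue signal peptide, TM domain from residue 14.
precursor = "MKLLAAAQDEFGHLLLLLLLLKR"
mature = mature_form(precursor, signal_len=7)
ecd = extracellular_fragment(precursor, signal_len=7, tm_start=14)
ecd_with_signal = extracellular_fragment(
    precursor, signal_len=7, tm_start=14, keep_signal=True
)
```

In practice the signal peptide boundary and TM start would come from prediction tools or experimental data; here they are simply supplied as inputs.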
  • the invention provides an N-terminal fragment containing a Pfam domain and a C-terminal fragment containing a Pfam domain and either or both may be biologically active.
  • polypeptides function as secreted proteins. In some further embodiments, the polypeptides function as single-transmembrane proteins. In other embodiments, the polypeptides function as multiple-transmembrane proteins.
  • the polypeptides function as kinases, receptors, phosphatases, proteases, phosphodiesterases, immunoglobulins, growth factors, antigens, complement proteins, adhesion proteins, GTPase activating proteins, binding proteins, ribosylation factors, reverse transcriptases, integrases, ribosomal proteins, signaling proteins, transport proteins, phospholipid binding proteins, RNAseH, nucleotide hydrolases, transposases, transporters, RNA recognition motifs, proprotein convertases, matrix proteins, and membrane transport proteins.
  • the polypeptides function in pathological states. In some embodiments, the polypeptides function as one or more of these.
  • the present invention features an isolated polynucleotide that hybridizes under stringent hybridization conditions to a coding region of at least one nucleotide sequence shown in SEQ ID NO: 1 - 123, or a complement thereof.
  • the present invention features an isolated polynucleotide that shares at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 97%, at least about 98%, or at least about 99% nucleotide sequence identity with a nucleotide sequence of the coding region of at least one sequence shown in SEQ ID NO: 1 - 123, or a complement thereof.
  • a subject polynucleotide has the nucleotide sequence shown in at least one of SEQ ID NO: 1 - 123, or a coding region thereof.
  • the present invention also features a vector, e.g., a recombinant vector, that includes a subject polynucleotide, and a promoter that drives its expression.
  • This vector can transform a host cell, and the present invention further features such host cells, e.g., isolated in vitro host cells, and in vivo host cells, that comprise a polynucleotide of the invention, or a recombinant vector of the invention.
  • the present invention further features a library of polynucleotides, wherein at least one of the polynucleotides comprises the sequence information of a polynucleotide of the invention.
  • the library is provided on a nucleic acid array.
  • the library is provided in computer-readable format.
  • the present invention features a pair of isolated nucleic acid molecules, each from about 10 to about 200 nucleotides in length.
  • the first nucleic acid molecule of the pair comprises a sequence of at least 10 contiguous nucleotides having 100% sequence identity to at least one nucleic acid sequence shown in SEQ ID NO: 1 - 123.
  • the second nucleic acid molecule of the pair comprises a sequence of at least 10 contiguous nucleotides having 100% sequence identity to the reverse complement of at least one nucleic acid sequence shown in SEQ ID NO: 1 - 123.
  • the sequence of said second nucleic acid molecule is located 3' of the nucleic acid sequence of the first nucleic acid molecule shown in SEQ ID NO: 1 - 123.
  • the pair of isolated nucleic acid molecules are useful in a polymerase chain reaction or in any other method known in the art to amplify a nucleic acid that has sequence identity to the sequences shown in SEQ ID NO: 1 - 123, particularly when cDNA is used as a template.
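The primer-pair geometry described above (a first molecule identical to a stretch of the sequence, and a second molecule identical to the reverse complement of a stretch located 3' of the first) can be sketched in Python. The template below is a made-up sequence rather than any SEQ ID NO, and the function names and coordinate conventions are assumptions of this sketch.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of an upper- or lower-case DNA string."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def primer_pair(template: str, fwd_start: int, rev_end: int, length: int = 10):
    """Return (forward, reverse) primers flanking template[fwd_start:rev_end].

    Coordinates are 0-based with an exclusive end (a convention chosen
    for this sketch). The reverse primer is the reverse complement of
    the last `length` bases of the region, i.e. it lies 3' of the
    forward primer.
    """
    if length < 10:
        raise ValueError("the text calls for at least 10 contiguous nucleotides")
    if fwd_start < 0 or rev_end > len(template):
        raise ValueError("region falls outside the template")
    if fwd_start + length > rev_end - length:
        raise ValueError("primers would overlap")
    forward = template[fwd_start : fwd_start + length]
    reverse = reverse_complement(template[rev_end - length : rev_end])
    return forward, reverse


def amplicon(template: str, forward: str, reverse: str):
    """In-silico product bounded by the two primers, or None if no match."""
    start = template.find(forward)
    if start == -1:
        return None
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return template[start : end + len(site)]


# Hypothetical 38-nucleotide template, not taken from the Sequence Listing.
template = "ATGGCCAAGCTTGACCTGAGTCCGATTACGGATCCTAA"
fwd, rev = primer_pair(template, fwd_start=0, rev_end=len(template))
product = amplicon(template, fwd, rev)
```

This only checks exact matches; real primer design would also weigh melting temperature, specificity, and secondary structure, none of which is modeled here.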
  • the invention features a method of determining the presence of a polynucleotide substantially identical to a polynucleotide sequence shown in the Sequence Listing, or a complement of such a nucleotide by providing its complement, allowing the polynucleotides to interact, and determining whether such interaction has occurred.
  • the invention further features methods of regulating the expression of the subject polynucleotides and encoded polypeptides.
  • the invention provides a method of inhibiting transcription or translation of a first polynucleotide encoding a first polypeptide of the invention by providing a second polynucleotide that hybridizes to the first polynucleotide, and allowing the first polynucleotide to contact and bind to the second polynucleotide.
  • the second polynucleotide can be chosen from an antisense molecule, a ribozyme, and an interfering RNA (RNAi) molecule.
  • the present invention further features an isolated polypeptide, e.g., an isolated polypeptide encoded by a polynucleotide, and biologically active fragments of such polypeptide.
  • the polypeptide is a fusion protein.
  • the polypeptide has one or more amino acid substitutions, and/or insertions and/or deletions, compared with at least one sequence shown in SEQ ID NO: 124 - 246.
  • the polypeptide has an amino acid sequence derived from at least one nucleotide sequence shown in SEQ ID NO: 1 - 123.
  • the polypeptide has an amino acid sequence substantially identical to at least one sequence shown in SEQ ID NO: 124 - 246.
  • the invention also provides a method of making a polypeptide of the invention by providing a nucleic acid molecule that comprises a polynucleotide sequence encoding a polypeptide of the invention, introducing the nucleic acid molecule into an expression system, and allowing the polypeptide to be produced.
  • the method involves in vitro cell-free transcription and/or translation.
  • the expression system can comprise a cell-free expression system, such as an E. coli system, a wheat germ extract system, a rabbit reticulocyte system, or a frog oocyte system.
  • the expression system can comprise a prokaryotic or eukaryotic cell, for example, a bacterial cell expression system, a fungal cell expression system, such as yeast or Aspergillus, a plant cell expression system, e.g., a cereal plant, a tobacco plant, a tomato plant, or other edible plant, an insect cell expression system, such as Sf9 or High Five cells, an amphibian cell expression system, a reptile cell expression system, a crustacean cell expression system, an avian cell expression system, a fish cell expression system, or a mammalian cell expression system, such as one using Chinese Hamster Ovary (CHO) cells.
  • the method involves culturing a subject host cell under conditions such that the subject polypeptide is produced by the host cells; and recovering the subject polypeptide from the culture, e.g., from within the host cells, or from the culture medium.
  • the polypeptide can be produced in vivo in a multicellular animal or plant, comprising a polynucleotide encoding the subject polypeptide.
  • the present invention further features a non-human animal injected with at least one polynucleotide comprising at least one nucleotide sequence chosen from SEQ ID NO: 1 - 123, and/or at least one polypeptide comprising at least one amino acid sequence chosen from SEQ ID NO: 124 - 246.
  • the present invention further features an antibody that specifically recognizes, binds to, interferes with, or modulates the biological activity of a subject polypeptide or a fragment thereof.
  • the polypeptide can be a secreted protein, single-transmembrane protein, multiple-transmembrane protein, cytoplasmic protein or extracellular protein, or fragment thereof.
  • the fragment can be an extracellular fragment of a subject polypeptide, or an extracellular fragment of a subject polypeptide minus the signal peptide.
  • the present invention further features an antibody that specifically inhibits binding of a polypeptide to its ligand or substrate. It also features an antibody that specifically inhibits binding of a polypeptide as a substrate to another molecule.
  • Another aspect of the present invention features a library of antibodies or fragments thereof, wherein at least one antibody or fragment thereof specifically binds to at least a portion of a polypeptide comprising an amino acid sequence according to SEQ ID NO: 124 - 246, and/or wherein at least one antibody or fragment thereof interferes with at least one activity of such polypeptide or fragment thereof.
  • the antibody library comprises at least one antibody or fragment thereof that specifically inhibits binding of a subject polypeptide to its ligand or substrate, or that specifically inhibits binding of a subject polypeptide as a substrate to another molecule.
  • the present invention also features corresponding polynucleotide libraries comprising at least one polynucleotide sequence that encodes an antibody or antibody fragment of the invention.
  • the library is provided on a nucleic acid array or in computer-readable format.
  • An antibody of the present invention may comprise a monoclonal antibody, polyclonal antibody, single chain antibody, intrabody, and active fragments of any of these.
  • the active fragments include variable regions from either heavy chains or light chains.
  • the antibody can comprise the backbone of a molecule with an immunoglobulin domain, e.g., a fibronectin backbone, a T-cell receptor backbone, or a CTLA4 backbone.
  • the present invention further features a targeting antibody, a neutralizing antibody, a stabilizing antibody, an enhancing antibody, an antibody agonist, an antibody antagonist, an antibody that promotes cellular endocytosis of a target antigen, a cytotoxic antibody, and an antibody that mediates antibody dependent cellular cytotoxicity (ADCC).
  • the antibody that mediates ADCC can have a cytotoxic component, e.g., a radioisotope, a radioactive molecule, a microbial toxin, a plant toxin, a chemotherapeutic agent, or a chemical substance, such as doxorubicin or cisplatin.
  • the invention also features an inhibitory antibody, functioning to specifically inhibit the binding of a cognate polypeptide to its ligand or its substrate, or to specifically inhibit the binding of a cognate peptide as the substrate of another molecule.
  • the antibodies of the present invention also encompass a human antibody, a non-human primate antibody, a monkey antibody, a non-primate animal antibody, e.g., a rodent antibody, a rat antibody, a mouse antibody, a hamster antibody, a guinea pig antibody, a chicken antibody, a cattle antibody, a sheep antibody, a goat antibody, a horse antibody, a porcine antibody, a cow antibody, a rabbit antibody, a cat antibody, or a dog antibody. It also features a humanized antibody, a primatized antibody, and a chimeric antibody.
  • the antibodies of the invention can be produced in vitro or in vivo.
  • the present invention features an antibody produced in a cell-free expression system, a prokaryote expression system or a eukaryote expression system, as described herein.
  • the invention further provides a host cell that can produce an antibody of the invention or a fragment thereof.
  • the antibody may also be secreted by the cell.
  • the host cell can be a hybridoma, or a prokaryotic or eukaryotic cell.
  • the invention also provides a bacteriophage or other virus particle comprising an antibody of the invention, or a fragment thereof.
  • the bacteriophage or other virus particle may display the antibody or fragment thereof on its surface, and the bacteriophage itself may exist within a bacterial cell.
  • the antibody may also comprise a fusion protein with a viral or bacteriophage protein.
  • the invention further provides transgenic multicellular organisms, e.g., plants or non-human animals, as well as tissues or organs, comprising a polynucleotide sequence encoding a subject antibody or fragment thereof.
  • the organism, tissues, or organs will generally comprise cells producing an antibody of the invention, or a fragment thereof.
  • the present invention features a method of making an antibody by immunizing a host animal.
  • a polypeptide or a fragment thereof, a polynucleotide encoding a polypeptide, or a polynucleotide encoding a fragment thereof is introduced into an animal in a sufficient amount to elicit the generation of antibodies specific to the polypeptide or fragment thereof, and the resulting antibodies are recovered from the animal.
  • the polypeptide can be encoded by a nucleic acid molecule comprising a nucleotide sequence chosen from at least one polynucleotide sequence according to SEQ ID NO: 1 - 123.
  • the polypeptide may comprise at least one amino acid sequence chosen from SEQ ID NO: 124 - 246.
  • the invention thus also provides a non-human animal comprising an antibody of the invention.
  • the animal can be a non-human primate (e.g., a monkey), a rodent (e.g., a rat, a mouse, a hamster, a guinea pig), a chicken, cattle (e.g., a sheep, a goat, a horse, a pig, a cow), a rabbit, a cat, or a dog.
  • the present invention also features a method of making an antibody by isolating a spleen from an animal injected with a polypeptide or a fragment thereof, a polynucleotide encoding a polypeptide, or a polynucleotide encoding a fragment thereof, and recovering antibodies from the spleen cells.
  • Hybridomas can be made from the spleen cells, and hybridomas secreting specific antibodies can be selected.
  • the present invention further features a method of making a polynucleotide library from spleen cells, and selecting a cDNA clone that produces specific antibodies, or fragments thereof.
  • the cDNA clone or a fragment thereof can be expressed in an expression system that allows production of the antibody or a fragment thereof, as provided herein.
  • the present invention also features a pharmaceutical composition comprising a polynucleotide, polypeptide, or modulator of the invention and a carrier.
  • the carrier can be a pharmaceutically acceptable carrier.
  • the modulator can be obtainable by any of the methods of the invention; for example, the modulator can be an antibody or a fragment thereof.
  • oral formulations, preparations for injection, aerosol formulations, and suppositories can be prepared, each comprising the polynucleotide, polypeptide, or modulator composition.
  • nucleic acid compositions comprising polynucleotide sequences encoding the subject antibodies, or fragments thereof, can be prepared for administration to a subject.
  • the invention also features a non-human animal injected with the polynucleotide, polypeptide, or modulator composition, for example the antibody composition.
  • the animal can be a non-human primate (e.g., a monkey), a rodent (e.g., a rat, a mouse, a hamster, a guinea pig), a chicken, cattle (e.g., a sheep, a goat, a horse, a pig, a cow), a rabbit, a cat, or a dog.
  • the invention provides novel polynucleotides, related novel polypeptides such as secreted polypeptides, transmembrane polypeptides, and other polypeptides, i.e., cytoplasmic and luminal polypeptides, and active fragments thereof, as well as novel nucleic acid compositions encoding these polypeptides, and compositions comprising the related polypeptides.
  • the present invention also provides for vectors, host cells, and methods for producing the polynucleotides and polypeptides of the invention in these vectors and host cells, and in cell-free systems.
  • the present invention further provides for antisense molecules that are capable of regulating the expression of the polynucleotides or polypeptides herein.
  • modulators, including antibodies, that bind specifically to the polypeptides or modulate the activity of the polypeptides, are also provided.
  • the present polynucleotides, polypeptides, and modulators find use in therapeutic agent screening/discovery applications, such as screening for receptors or competitive ligands, for use, for example, as small molecule therapeutic drugs. Also provided are methods of modulating a biological activity of a polypeptide and methods of treating associated disease conditions, particularly by administering modulators of the present polypeptides, such as small molecule modulators, antisense molecules, and specific antibodies.
  • polypeptides, polynucleotides, and modulators find use in a number of diagnostic, prophylactic, and therapeutic applications.
  • the polynucleotides and polypeptides of the invention are useful in diagnosis, and can be used in diagnostic kits.
  • polynucleotides and polypeptides of the invention are also useful for treating a variety of disorders, including proliferative disorders such as cancer, inflammatory disorders such as ulcerative colitis, immune disorders such as autoimmune diseases, e.g., multiple sclerosis, diseases caused by infectious and parasitic microorganisms including, for example, bacteria, fungi, prions, or viruses, metabolic disorders such as diabetes and obesity, central nervous system disorders such as Alzheimer's and Parkinson's, and bone and cartilage disorders such as osteoporosis and achondroplasia (Braunwald et al., 2001).
  • polynucleotides and polypeptides of the invention, and related compositions will inhibit or modulate the replication, differentiation, signaling, or other function of a pathologically important cell of the system involved in the disorder to be treated.
  • a polynucleotide or polypeptide composition of the invention can treat an immune disorder by modulating the replication, differentiation, signaling, or other function of a pathologically important cell of the immune system, such as a B-lymphocyte, T-lymphocyte, mast cell, dendritic cell, macrophage, neutrophil, basophil, or eosinophil.
  • therapeutic vaccines in the form of nucleic acid or polypeptide vaccines, such as cancer vaccines, where the vaccines can be administered alone, such as naked DNA, or can be facilitated, such as via viral vectors, microsomes, or liposomes.
  • Therapeutic antibodies include those that are administered alone or in combination with cytotoxic agents, such as radioactive or chemotherapeutic agents.
  • Each sequence shown in Tables 1-4 is identified by a Five Prime Therapeutics, Inc. (FP) identification number (FP ID).
  • Table 1 correlates the Five Prime Therapeutics, Inc. identification number (FP ID) of each nucleotide and polypeptide with the Sequence Listing.
  • Each FP ID corresponds to two SEQ ID NOS.
  • the first, SEQ ID NO. (N1), corresponds to the nucleotide coding sequence that encodes the polypeptide of the invention.
  • the second, SEQ ID NO. (P1), corresponds to the amino acid sequence of the polypeptide of the invention.
  • SEQ ID NOS. 1-123 correspond to the N1 coding sequences and SEQ ID NOS. 124-246 correspond to the P1 polypeptide sequences.
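  • The numbering scheme above can be sketched in a few lines of Python. This is an illustration only: the two ranges come from the text, but the helper names are hypothetical, and the pairing rule in `paired_p1` (the k-th N1 entry pairs with the k-th P1 entry) is an assumption; Table 1 remains the authoritative mapping from FP ID to SEQ ID NOS.

```python
# SEQ ID NO ranges as stated in the text.
N1_RANGE = range(1, 124)    # SEQ ID NOS. 1-123: nucleotide coding sequences
P1_RANGE = range(124, 247)  # SEQ ID NOS. 124-246: polypeptide sequences


def seq_id_kind(seq_id: int) -> str:
    """Classify a SEQ ID NO as an N1 (nucleotide) or P1 (polypeptide) entry."""
    if seq_id in N1_RANGE:
        return "N1"
    if seq_id in P1_RANGE:
        return "P1"
    raise ValueError(f"SEQ ID NO {seq_id} is outside 1-246")


def paired_p1(n1_seq_id: int) -> int:
    """Return the P1 SEQ ID NO paired with an N1 SEQ ID NO.

    ASSUMPTION (not stated in the text): entries are paired in order,
    so N1 number k corresponds to P1 number k + 123.
    """
    if n1_seq_id not in N1_RANGE:
        raise ValueError(f"SEQ ID NO {n1_seq_id} is not an N1 entry")
    return n1_seq_id + len(N1_RANGE)
```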
  • Table 2 specifies the result of the algorithm described above that predicts whether the claimed FP sequence is secreted (Tree Vote).
  • the signal peptide coordinates (Signal Peptide Coords.) are listed in terms of the amino acid residues beginning with "1" at the N-terminus of the polypeptide.
  • the Mature Protein Coords refer to the coordinates of the amino acid residues of the mature polypeptide after cleavage of the signal peptide.
  • Table 2 also specifies the coordinates of an alternative form of the mature protein (Alternate Mature Protein Coords.).
  • transmembrane coordinates refer to the transmembrane segments of the molecule and are listed in terms of the amino acid residues beginning with "1" at the N-terminus of the polypeptide.
  • non-transmembrane coordinates refer to the amino acids that are not located in the membrane; these can include extracellular, cytoplasmic, and luminal sequences, and are listed in terms of the amino acid residues beginning with "1" at the N-terminus of the polypeptide.
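  • Because all of these coordinates count residues from "1" at the N-terminus, extracting a segment in software requires converting to zero-based slicing. The following minimal sketch shows that conversion; the function names are hypothetical, and it assumes that each coordinate pair is inclusive at both ends and that the mature protein begins immediately after the last signal peptide residue.

```python
def subsequence(protein: str, start: int, end: int) -> str:
    """Return residues start..end, counted from 1 at the N-terminus, inclusive."""
    if not 1 <= start <= end <= len(protein):
        raise ValueError(f"coordinates {start}-{end} do not fit a {len(protein)}-residue protein")
    return protein[start - 1:end]


def split_signal_peptide(protein: str, signal_end: int) -> tuple[str, str]:
    """Split a precursor into (signal peptide, mature protein).

    ASSUMPTION: the signal peptide spans residues 1..signal_end and the
    mature protein is everything after it.
    """
    signal = subsequence(protein, 1, signal_end)
    mature = subsequence(protein, signal_end + 1, len(protein))
    return signal, mature
```

  • For a made-up 10-residue string "ABCDEFGHIJ" with signal peptide coordinates 1-3, `split_signal_peptide("ABCDEFGHIJ", 3)` separates "ABC" from "DEFGHIJ".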
  • Table 3 specifies the predicted number of amino acid residues in each FP protein of the invention (Predicted Protein Length). Table 3 describes the characteristics of the protein in the public National Center for Biotechnology Information (NCBI) database displaying the greatest degree of similarity to each claimed sequence. The corresponding NCBI protein is described by its NCBI accession number (Top Hit Accession ID) and by the NCBI's annotation of that sequence (Top Hit Annotation). The percent identity of the Five Prime protein with the corresponding NCBI protein is listed (Top Hit %ID (relative to prediction)). The number of amino acids in this NCBI protein is specified (Length of Top Hit).
  • Table 3 also describes the characteristics of the human protein in the NCBI database with the greatest degree of similarity to the claimed sequences.
  • the corresponding NCBI protein is described by its NCBI accession number (Top Human Hit Accession ID), and by the NCBI's annotation of that sequence (Top Human Hit Annotation).
  • the percent identity of the Five Prime protein with the NCBI protein is listed (Top Human Hit %ID (relative to prediction)).
  • the number of amino acids in the NCBI protein is specified (Length of Top Human Hit).
  • Table 4 lists the protein family (Pfam) of certain of the polypeptides of the invention. It also lists the coordinates at which each Pfam sequence can be found (Coordinates).
Definitions
  • Related sequences include nucleotide and amino acid sequences that are involved in the function of their referent.
  • receptor-related sequences include all sequences that are involved in receptor function. This includes, but is not limited to, sequences that are involved in receptor synthesis, receptor regulation, receptor effector function, and receptor degradation.
  • Related sequences also encompass complementary nucleic acid sequences, and biologically active fragments of nucleic acid and amino acid sequences.
  • polynucleotide refers to polymeric forms of nucleotides of any length.
  • the polynucleotides can contain deoxyribonucleotides, ribonucleotides, and/or their analogs or derivatives.
  • nucleic acids can be naturally occurring DNA or RNA, or can be synthetic analogs, as known in the art.
  • the terms also encompass genomic DNA, genes, gene fragments, exons, introns, regulatory sequences or regulatory elements (such as promoters, enhancers, initiation and termination regions, other control regions, expression regulatory factors, and expression controls), DNA comprising one or more single-nucleotide polymorphisms (SNPs), allelic variants, isolated DNA of any sequence, and cDNA.
  • the terms also encompass mRNA, tRNA, rRNA, ribozymes, splice variants, antisense RNA, antisense conjugates, RNAi, and isolated RNA of any sequence.
  • the terms also encompass recombinant polynucleotides, heterologous polynucleotides, branched polynucleotides, labeled polynucleotides, hybrid DNA/RNA, polynucleotide constructs, vectors comprising the subject nucleic acids, nucleic acid probes, primers, and primer pairs.
  • the polynucleotides can comprise modified nucleic acid molecules, with alterations in the backbone, sugars, or heterocyclic bases, such as methylated nucleic acid molecules, peptide nucleic acids, and nucleic acid molecule analogs, which may be suitable as, for example, probes if they demonstrate superior stability and/or binding affinity under assay conditions.
  • Analogs of purines and pyrimidines, including radiolabeled and fluorescent analogs, are known in the art.
  • the polynucleotides can have any three-dimensional structure, and can perform any function, known or as yet unknown.
  • the terms also encompass single-stranded, double-stranded and triple helical molecules that are either DNA, RNA, or hybrid DNA/RNA and that may encode a full-length gene or a biologically active fragment thereof.
  • Biologically active fragments of polynucleotides can encode the polypeptides herein, as well as anti-sense and RNAi molecules.
  • the full length polynucleotides herein may be treated with enzymes, such as Dicer, to generate a library of short RNAi fragments which are within the scope of the present invention.
  • novel polynucleotides herein include those shown in the Tables, SEQ ID NO: 1 - 123, as well as those that encode the polypeptides of SEQ ID NO: 124 - 246, and biologically active fragments thereof.
  • the polynucleotides also include modified, labeled, and degenerate variants of the nucleic acid sequences, as well as nucleic acid sequences that are substantially similar or homologous to nucleic acids encoding the subject proteins.
  • a “biologically active” entity, or an entity having “biological activity,” is one having structural, regulatory, or biochemical functions of a naturally occurring molecule or any function related to or associated with a metabolic or physiological process.
  • Biologically active polynucleotide fragments are those exhibiting activity similar, but not necessarily identical, to an activity of a polynucleotide of the present invention.
  • the biological activity can include an improved desired activity, or a decreased undesirable activity.
  • an entity demonstrates biological activity when it participates in a molecular interaction with another molecule, or when it has therapeutic value in alleviating a disease condition, or when it has prophylactic value in inducing an immune response to the molecule, or when it has diagnostic value in determining the presence of the molecule, such as a biologically active fragment of a polynucleotide that can be detected as unique for the polynucleotide molecule, or that can be used as a primer in PCR.
  • nucleic acid sequence refers to all nucleic acid sequences that can be directly translated, according to the standard genetic code, to provide an amino acid sequence identical to that translated from a reference nucleic acid sequence.
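  • The idea that different nucleic acid sequences can translate to an identical amino acid sequence can be illustrated with a short Python sketch. The codon table below is the standard genetic code; the function names and the example sequences are hypothetical and are not taken from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate a coding sequence (DNA or RNA) codon by codon; '*' marks a stop."""
    dna = coding_sequence.upper().replace("U", "T")
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encode_same_polypeptide(seq_a: str, seq_b: str) -> bool:
    """True if both sequences translate to the same amino acid sequence."""
    return translate(seq_a) == translate(seq_b)
```

  • For example, the made-up sequences ATGGCTTAA and ATGGCCTAG differ at two positions but both translate to Met-Ala followed by a stop codon, so `encode_same_polypeptide` returns True for that pair.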
  • “gene” or “genomic sequence” as used herein is an open reading frame encoding specific proteins and polypeptides, for example, an mRNA, cDNA, or genomic DNA, and also may or may not include intervening introns, or adjacent 5' and 3' non-coding nucleotide sequences involved in the regulation of expression up to about 20 kb beyond the coding region, and possibly further in either direction.
  • a gene can be introduced into an appropriate vector for extrachromosomal maintenance or for integration into a host genome.
  • transgene as used herein is a nucleic acid sequence that is incorporated into a transgenic organism.
  • a “transgene” can contain one or more transcriptional regulatory sequences, and other sequences, such as introns, that may be useful for expressing or secreting the nucleic acid or fusion protein it encodes.
  • cDNA as used herein is intended to include all nucleic acids that share the sequence elements of mature mRNA species, where sequence elements are exons and 3' and 5' non-coding regions. Generally, mRNA species have contiguous exons, the intervening introns having been removed by nuclear RNA splicing to create a continuous open reading frame encoding a protein.
  • splice variant refers to all types of RNAs transcribed from a given gene that when processed collectively encode plural protein isoforms.
  • alternative splicing and related terms refer to all types of RNA processing that lead to expression of plural protein isoforms from a single gene. Some genes are first transcribed as long mRNA precursors that are then shortened by a series of processing steps to produce the mature mRNA molecule. One of these steps is RNA splicing, in which the intron sequences are removed from the mRNA precursor. A cell can splice the primary transcript in different ways, making different "splice variants," and thereby making different polypeptide chains from the same gene, or from the same mRNA molecule. Splice variants can include, for example, exon insertions, exon extensions, exon truncations, exon deletions, alternatives in the 5' untranslated region and alternatives in the 3' untranslated region.
  • Oligonucleotide may generally refer to polynucleotides of between about 5 and about 100 nucleotides of single- or double-stranded nucleic acids. For the purposes of this disclosure, there is no upper limit to the length of an oligonucleotide. Oligonucleotides are also known as oligomers or oligos and can be isolated from genes, or chemically synthesized by methods known in the art.
  • Nucleic acid composition is a composition comprising a nucleic acid sequence, including one having an open reading frame that encodes a polypeptide and is capable, under appropriate conditions, of being expressed as a polypeptide.
  • the term includes, for example, vectors, including plasmids, cosmids, viral vectors (e.g., retrovirus vectors such as lentivirus, adenovirus, and the like), human, yeast, bacterial, P1-derived artificial chromosomes (HAC's, YAC's, BAC's, PAC's, etc.), and mini-chromosomes, in vitro host cells, in vivo host cells, tissues, organs, allogenic or congenic grafts or transplants, multicellular organisms, and chimeric, genetically modified, or transgenic animals comprising a subject nucleic acid sequence.
  • an “isolated,” “purified,” or “substantially isolated” polynucleotide, or a polynucleotide in “substantially pure form,” in “substantially purified form,” in “substantial purity,” or as an “isolate,” is one that is substantially free of the sequences with which it is associated in nature, or other nucleic acid sequences that do not include a sequence or fragment of the subject polynucleotides.
  • By “substantially free” is meant that less than about 90%, less than about 80%, less than about 70%, less than about 60%, or less than about 50% of the composition is made up of materials other than the isolated polynucleotide.
  • the isolated polynucleotide is at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 97%, or at least about 99% free of the materials with which it is associated in nature.
  • an isolated polynucleotide may be present in a composition wherein at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 97%, or at least about 99% of the total macromolecules (for example, polypeptides, fragments thereof, polynucleotides, fragments thereof, lipids, polysaccharides, and oligosaccharides) in the composition is the isolated polynucleotide. Where at least about 99% of the total macromolecules is the isolated polynucleotide, the polynucleotide is at least about 99% pure, and the composition comprises less than about 1% contaminant.
  • an “isolated,” “purified,” or “substantially isolated” polynucleotide, or a polynucleotide in “substantially pure form,” in “substantially purified form,” in “substantial purity,” or as an “isolate,” also refers to recombinant polynucleotides, modified, degenerate and homologous polynucleotides, and chemically synthesized polynucleotides, which, by virtue of origin or manipulation, are not associated with all or a portion of a polynucleotide with which they are associated in nature, are linked to a polynucleotide other than that to which they are linked in nature, or do not occur in nature.
  • the subject polynucleotides are generally provided as other than on an intact chromosome, and recombinant embodiments are typically flanked by one or more nucleotides not normally associated with the subject polynucleotide on a naturally-occurring chromosome.
  • polypeptide refers to a polymeric form of amino acids of any length, which can include naturally-occurring amino acids, coded and non-coded amino acids, chemically or biochemically modified, derivatized, or designer amino acids, amino acid analogs, peptidomimetics, and depsipeptides, and polypeptides having modified, cyclic, bicyclic, depsicyclic, or depsibicyclic peptide backbones.
  • the term includes single chain protein as well as multimers.
  • the term also includes conjugated proteins, fusion proteins, including, but not limited to, GST fusion proteins, fusion proteins with a heterologous amino acid sequence, fusion proteins with heterologous and homologous leader sequences, fusion proteins with or without N-terminal methionine residues, pegylated proteins, and immunologically tagged proteins. Also included in this term are variations of naturally occurring proteins, where such variations are homologous or substantially similar to the naturally occurring protein, as well as corresponding homologs from different species. Variants of polypeptide sequences include insertions, additions, deletions, or substitutions compared with the subject polypeptides. The term also includes peptide aptamers.
  • novel polypeptides herein include amino acid sequences encoded by an open reading frame (ORF) as shown in SEQ ID NO: 124 - 246, described in greater detail below, including the full length protein and fragments thereof, particularly biologically active fragments and/or fragments corresponding to functional domains, e.g., a signal peptide or leader sequence, an enzyme active site, including a cleavage site and an enzyme catalytic site, a domain for interaction with other protein(s), a domain for binding DNA, a regulatory domain, a consensus domain that is shared with other members of the same protein family, such as a kinase family or an immunoglobulin family; an extracellular domain that may act as a target for antibody production or that may be cleaved to become a soluble receptor or a ligand for a receptor; an intracellular fragment of a transmembrane protein that participates in signal transduction; a transmembrane domain of a transmembrane protein that may facilitate water or
  • a “biologically active” entity is one having structural, regulatory, or biochemical functions of a naturally occurring molecule or any function related to or associated with a metabolic or physiological process.
  • Biologically active polypeptide fragments are those exhibiting activity similar, but not necessarily identical, to an activity of a polypeptide of the present invention.
  • the biological activity can include an improved desired activity, or a decreased undesirable activity.
  • an entity demonstrates biological activity when it participates in a molecular interaction with another molecule, or when it has therapeutic value in alleviating a disease condition, or when it has prophylactic value in inducing an immune response to the molecule, or when it has diagnostic value in determining the presence of the molecule.
  • a biologically active polypeptide or fragment thereof includes one that can participate in a biological reaction, for example, as a transcription factor that combines with other transcription factors for initiation of transcription, or that can serve as an epitope or immunogen to stimulate an immune response, such as production of antibodies, or that can transport molecules into or out of cells, or that can perform a catalytic activity, for example polymerization or nuclease activity, or that can participate in signal transduction by binding to receptors, proteins, or nucleic acids, activating enzymes or substrates.
  • a “signal peptide,” or a “leader sequence,” comprises a sequence of amino acid residues, typically at the N terminus of a polypeptide, which directs the intracellular trafficking of the polypeptide.
  • Polypeptides that contain a signal peptide or leader sequence typically also contain a signal peptide or leader sequence cleavage site. Such polypeptides, after cleavage at the cleavage sites, generate mature polypeptides, for example, after extracellular secretion or after being directed to the appropriate intracellular compartment.
  • Depsipeptides are compounds containing a sequence of at least two alpha-amino acids and at least one alpha-hydroxy carboxylic acid, which are bound through at least one normal peptide link and ester links, derived from the hydroxy carboxylic acids.
  • Linear depsipeptides can comprise rings formed through S-S bridges, or through an hydroxy or a mercapto group of an hydroxy-, or mercapto- amino acid and the carboxyl group of another amino- or hydroxy-acid but do not comprise rings formed only through peptide or ester links derived from hydroxy carboxylic acids.
  • Cyclic depsipeptides are peptides containing at least one ring formed only through peptide or ester links, derived from hydroxy carboxylic acids.
  • an “isolated,” “purified,” or “substantially isolated” polypeptide, or a polypeptide in “substantially pure form,” in “substantially purified form,” in “substantial purity,” or as an “isolate,” is one that is substantially free of the materials with which it is associated in nature or other polypeptide sequences that do not include a sequence or fragment of the subject polypeptides.
  • By “substantially free” is meant that less than about 90%, less than about 80%, less than about 70%, less than about 60%, or less than about 50% of the composition is made up of materials other than the isolated polypeptide.
  • the isolated polypeptide is at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 97%, or at least about 99% free of the materials with which it is associated in nature.
  • an isolated polypeptide may be present in a composition wherein at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 97%, or at least about 99% of the total macromolecules (for example, polypeptides, fragments thereof, polynucleotides, fragments thereof, lipids, polysaccharides, and oligosaccharides) in the composition is the isolated polypeptide. Where at least about 99% of the total macromolecules is the isolated polypeptide, the polypeptide is at least about 99% pure, and the composition comprises less than about 1% contaminant.
  • an “isolated,” “purified,” or “substantially isolated” polypeptide, or a polypeptide in “substantially pure form,” in “substantially purified form,” in “substantial purity,” or as an “isolate,” also refers to recombinant polypeptides, modified, tagged and fusion polypeptides, and chemically synthesized polypeptides, which by virtue of origin or manipulation, are not associated with all or a portion of the materials with which they are associated in nature, are linked to molecules other than those to which they are linked in nature, or do not occur in nature.
  • a hydrophobic polypeptide is a polypeptide having one or more hydrophobic domains. Hydrophobic polypeptides do not interact effectively with water; they are, in general, poorly soluble or insoluble in water. Hydrophobic domains comprise one or more amino acids that have aliphatic side chains, which are insoluble or only slightly soluble in water. Examples of hydrophobic amino acids are alanine, valine, leucine, isoleucine, and methionine, which are nonpolar, and phenylalanine, tyrosine, and tryptophan, which have large, bulky aromatic side groups.
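  • As a rough illustration of how a hydrophobic domain might be flagged computationally, the sketch below counts the eight residues named above (A, V, L, I, M, F, Y, W) in a sliding window. The function names, the window length, and the threshold are hypothetical choices made for this example only; the text does not prescribe any particular method or cutoff.

```python
# One-letter codes for the hydrophobic amino acids named in the text.
HYDROPHOBIC = set("AVLIMFYW")


def hydrophobic_fraction(peptide: str) -> float:
    """Fraction of residues in the peptide that are in the hydrophobic set."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    return sum(residue in HYDROPHOBIC for residue in peptide.upper()) / len(peptide)


def hydrophobic_windows(protein: str, window: int = 10, threshold: float = 0.7) -> list[int]:
    """Return 1-based start positions of windows at or above the threshold.

    ASSUMPTION: window=10 and threshold=0.7 are arbitrary illustrative defaults.
    """
    starts = []
    for i in range(len(protein) - window + 1):
        if hydrophobic_fraction(protein[i:i + window]) >= threshold:
            starts.append(i + 1)
    return starts
```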
  • bicyclic refers to a peptide with two ring closures formed by covalent linkages between amino acids.
  • a covalent linkage between two nonadjacent amino acids constitutes a ring closure, as does a second covalent linkage between a pair of adjacent amino acids which are already linked by a covalent peptide linkage.
  • the covalent linkages forming the ring closures can be amide linkages, i.e., the linkage formed between a free amino on one amino acid and a free carboxyl of a second amino acid, or linkages formed between the side chains or "R" groups of amino acids in the peptides.
  • bicyclic peptides can be “true” bicyclic peptides, i.e., peptides cyclized by the formation of a peptide bond between the N-terminus and the C-terminus of the peptide, or they can be “depsi-bicyclic” peptides, i.e., peptides in which the terminal amino acids are covalently linked through their side chain moieties.
  • Detection methods of the invention can be qualitative or quantitative.
  • the terms “detection,” “identification,” “determination,” and the like refer to both qualitative and quantitative determinations, and include “measuring.”
  • detection methods include methods for detecting the presence and/or level of polynucleotide or polypeptide in a biological sample, and methods for detecting the presence and/or level of biological activity of polynucleotide or polypeptide in a sample.
  • “array” or “microarray” may be used interchangeably and refers to a collection of plural biological molecules such as nucleic acids, polypeptides, or antibodies, having locatable addresses that may be separately detectable.
  • microarray encompasses use of sub-microgram quantities of biological molecules.
  • the biological molecules may be affixed to a substrate or may be in solution or suspension.
  • the substrate can be porous or solid, planar or non-planar, unitary or distributed, such as a glass slide, a 96 well plate, with or without the use of microbeads or nanobeads.
  • microarray includes all of the devices referred to as microarrays in Schena, 1999; Bassett et al., 1999; Bowtell, 1999; Brown and Botstein, 1999; Chakravarti, 1999; Cheung et al., 1999; Cole et al., 1999; Collins, 1999; Debouck and Goodfellow, 1999; Duggan et al., 1999; Hacia, 1999; Lander, 1999; Lipshutz et al., 1999; Southern et al., 1999; Schena, 2000; Brenner et al., 2000; Lander, 2001; Steinhaur et al., 2002; and Espejo et al., 2002.
  • Nucleic acid microarrays include both oligonucleotide arrays (DNA chips) containing expressed sequence tags ("ESTs") and arrays of larger DNA sequences representing a plurality of genes bound to the substrate, either one of which can be used for hybridization studies.
  • Protein and antibody microarrays include arrays of polypeptides or proteins, including but not limited to, polypeptides or proteins obtained by purification, fusion proteins, and antibodies, and can be used for specific binding studies (Zhu and Snyder, 2003; Houseman et al., 2002; Schaeferling et al., 2002; Weng et al., 2002; Winssinger et al., 2002; Zhu et al., 2001; Zhu et al. 2001; and MacBeath and Schreiber, 2000).
  • a “nucleic acid hybridization reaction” is one in which single strands of DNA or RNA randomly collide with one another, and bind to each other only when their nucleotide sequences have some degree of complementarity.
  • the solvent and temperature conditions can be varied in the reactions to modulate the extent to which the molecules can bind to one another.
  • Hybridization reactions can be performed under different conditions of “stringency.”
  • the “stringency” of a hybridization reaction as used herein refers to the conditions (e.g., solvent and temperature conditions) under which two nucleic acid strands will either pair or fail to pair to form a “hybrid” helix.
  • Tm is the temperature in degrees Celsius at which 50% of a polynucleotide duplex made of complementary strands of nucleic acids that are hydrogen bonded in an anti-parallel direction by Watson-Crick base pairing dissociates into single strands under conditions of the hybridization reaction.
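  • The text defines the melting temperature but gives no formula for it. Purely as an illustration, the sketch below computes the anti-parallel Watson-Crick partner of a strand and applies the Wallace rule of thumb, 2 °C per A or T plus 4 °C per G or C. That rule is a common rough estimate for short oligonucleotides only; it is an assumption brought in for this example, not a method described in this disclosure, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the strand that pairs anti-parallel with the given DNA strand."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def are_complementary(strand_a: str, strand_b: str) -> bool:
    """True if the two strands can form a fully base-paired anti-parallel duplex."""
    return strand_a.upper() == reverse_complement(strand_b)


def wallace_tm(oligo: str) -> int:
    """Rough melting temperature in degrees Celsius by the Wallace rule.

    ASSUMPTION: valid only as a coarse estimate for short oligonucleotides;
    solvent and salt conditions are ignored entirely.
    """
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc
```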
  • a “buffer” is a system that tends to resist change in pH when a given increment of hydrogen ion or hydroxide ion is added. Buffered solutions contain conjugate acid-base pairs. Any conventional buffer can be used with the inventions herein including, but not limited to, for example, Tris, phosphate, imidazole, and bicarbonate.
  • A “crystal” is a solid of regular shape that forms when an element or compound solidifies slowly enough that the individual molecules occupy regular positions with respect to one another.
  • a crystal structure is the configuration in which the atoms of a crystal are arranged.
  • the crystal structure of a protein can affect its physical properties. Protein crystals can, in some instances, display biological activity, indicating that the proteins have crystallized in their biologically active configuration. For example, enzyme crystals may display catalytic activity towards a substrate.
  • a "library" of polynucleotides comprises a collection of sequence information of a plurality of polynucleotide sequences, which information is provided in either biochemical form (e.g., as a collection of polynucleotide molecules), or in electronic form (e.g., as a collection of polynucleotide sequences stored in a computer-readable form, as in a computer-based system, a computer data file, and/or as part of a computer program).
  • a "library" of polypeptides comprises a collection of sequence information of a plurality of polypeptide sequences, which information is provided in, e.g., a collection of polypeptide sequences stored in a computer-readable form, as in a computer-based system, a computer data file, and/or as part of a computer program.
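  • A library held in electronic form can be as simple as a mapping from an identifier to a sequence string. The sketch below reads FASTA-formatted text into such a mapping. FASTA is chosen here only as a familiar example of a computer-readable form; the text does not name a file format, and the function name and record names are hypothetical.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Parse FASTA-formatted text into {identifier: sequence}.

    The identifier is the first whitespace-delimited word after '>'.
    Sequence lines that follow a header are concatenated and upper-cased.
    """
    library: dict[str, str] = {}
    name = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            words = line[1:].split()
            if not words:
                raise ValueError("FASTA header has no identifier")
            name = words[0]
            library[name] = ""
        elif name is None:
            raise ValueError("sequence data found before the first header")
        else:
            library[name] += line.upper()
    return library
```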
  • Media refers to a manufacture, other than an isolated nucleic acid molecule, that contains the sequence information of the present invention. Such a manufacture provides the genome sequence or a subset thereof in a form that can be examined by means not directly applicable to the sequence as it exists in a nucleic acid, e.g., with computer-readable media comprising data storage structures.
  • Such media include, but are not limited to: magnetic storage media, such as a floppy disc, a hard disc storage medium, and a magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media.
  • Recorded refers to a process for storing information on computer readable media, using any such methods as known in the art.
  • a computer-based system refers to the hardware means, software means, and data storage means used to analyze the nucleotide sequence information of the present invention.
  • the minimum hardware of the computer-based systems of the present invention comprises a central processing unit (CPU), input means, output means, and data storage means.
  • the data storage means can comprise any manufacture comprising a recording of the present sequence information as described above, or a memory access means that can access such a manufacture.
  • Search means refers to one or more programs implemented on the computer-based system, to compare a target sequence or target structural motif, or expression levels of a polynucleotide in a sample, with the stored sequence information.
  • a variety of algorithms are publicly known and commercially available, e.g., MacPattern (EMBL), BLAST, BLASTN and BLASTX (NCBI), gapped BLAST, BLAZE, the Wise package, FASTX, ClustalW, FASTA, FASTA3, Align0, T-Coffee, BestFit, FastDB, and TeraBLAST (TimeLogic, Crystal Bay, Nevada).
  • Search means can be used to identify fragments or regions of the genome that match a particular target sequence or target motif based on sequence similarity, for example, to identify open reading frames (ORFs) within the genome that contain homology to ORFs from other organisms.
  • ORFs open reading frames
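The idea of scanning stored sequence for open reading frames can be illustrated with a minimal sketch. It is not one of the search programs named above. The function name `find_orfs`, the `min_codons` parameter, and the choice to scan only the three forward reading frames with ATG as the start codon are all assumptions made for illustration.

```python
# Hypothetical helper for illustration only; real search means (e.g., BLAST)
# do far more than this simple forward-strand scan.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 2) -> list[tuple[int, int, str]]:
    """Return (start, end, sequence) for each ATG...stop ORF on the forward strand.

    start is 0-based and inclusive; end is exclusive and includes the stop codon.
    min_codons counts the codons before the stop codon (start codon included).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):  # three forward reading frames
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None  # close the ORF and keep scanning this frame
    return orfs
```

For the toy input `"CCATGAAATAGGG"`, the only ORF lies in the third frame and spans `ATGAAATAG`.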
  • sequence identity means the exact match-up of two or more nucleotide sequences or two or more amino acid sequences, where the nucleotides or amino acids being compared are the same.
  • similarity means the exact match-up of two or more nucleotide sequences or two or more amino acid sequences, where the nucleotides or amino acids being compared are either the same or possess similar chemical and/or physical properties.
  • sequences can be aligned in a number of different ways and sequence similarity can be determined in a number of different ways.
  • the bases or amino acid residues of one sequence can be aligned to a gap in the other sequence, or they can be aligned only to another base or amino acid residue in the other sequence.
  • a gap can range anywhere from one nucleotide, base, or amino acid residue to multiple exons in length, up to any number of nucleotides or amino acid residues.
  • sequences can be aligned such that nucleotides (or bases) align with nucleotides, nucleotides align with amino acid residues, or amino acid residues align with amino acid residues.
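The distinction drawn above, between compared residues that are the same and residues that are either the same or chemically similar, can be sketched for two already-aligned amino acid sequences. This is only one of the many ways of scoring an alignment. The residue groupings in `SIMILAR_GROUPS`, the function name, and the decision to leave gap columns out of the denominator are illustrative assumptions; real tools typically use a scoring matrix instead.

```python
# Illustrative groups of amino acids with similar properties (an assumption,
# not taken from the source).
SIMILAR_GROUPS = [set("ILVM"), set("FWY"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def percent_identity_and_similarity(a: str, b: str) -> tuple[float, float]:
    """Score two aligned sequences of equal length; '-' marks a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    aligned = identical = similar = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # gap columns are not counted (one possible convention)
        aligned += 1
        if x == y:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            similar += 1
    if aligned == 0:
        return 0.0, 0.0
    return 100 * identical / aligned, 100 * similar / aligned
```

With `"MKV-LD"` aligned to `"MRVALE"`, three of the five ungapped columns are identical and the other two (K/R and D/E) fall into the same illustrative group.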
  • a "target sequence” can be any polynucleotide or amino acid sequence of six or more contiguous nucleotides or two or more amino acids, for example, from about 5 or from about 10 to about 100 amino acids, or from about 15 or from about 30 to about 300 nucleotides.
  • a variety of comparing means can be used to accomplish comparison of sequence information from a sample (e.g., to analyze target sequences, target motifs, or relative expression levels) with the data storage means.
  • any one of the publicly available homology search programs can be used as the search means for the computer-based systems of the present invention to accomplish comparison of target sequences and motifs.
  • Computer programs to analyze expression levels in a sample and in controls are also known in the art.
  • a "target sequence” includes an "antibody target sequence,” which refers to an amino acid sequence that can be used as an immunogen for injection into animals for production of antibodies or for screening against a phage display or antibody library for identification of binding partners.
  • a "target structural motif,” or “target motif,” refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration that is formed upon the folding of the target motif, or on consensus sequences of regulatory or active sites.
  • target motifs include, but are not limited to, enzyme active sites and signal sequences.
  • Nucleic acid target motifs include, but are not limited to, hairpin structures, promoter sequences, and other expression elements such as binding sites for transcription factors.
  • a "matrix” is a geometric network of antibody molecules and their antigens, as found in immunoprecipitation and flocculation reactions.
  • An antibody matrix can exist in solution or on a solid phase support.
  • Antibody binding to such epitope on a polypeptide can be stronger than binding of the same antibody to any other epitopes, particularly other epitopes that can be present in molecules in association with, or in the same sample as the polypeptide of interest.
  • adjusting the binding conditions can result in antibody binding almost exclusively to the specific epitope and not to any other epitopes on the same polypeptide, and not to any other polypeptide, which does not comprise the epitope.
  • Antibodies that bind specifically to a subject polypeptide may be capable of binding other polypeptides at a weak, yet detectable, level (e.g., 10% or less of the binding shown to the polypeptide of interest). Such weak binding, or background binding, is readily discernible from the specific antibody binding to a subject polypeptide, e.g., by use of appropriate controls.
  • antibodies of the invention bind to a specific polypeptide with a binding affinity of 10⁻⁷ M or greater (e.g., 10⁻⁸ M, 10⁻⁹ M, 10⁻¹⁰, 10⁻¹¹, etc.).
  • host cell includes an individual cell, cell line, cell culture, or in vivo cell, which can be or has been a recipient of any polynucleotides or polypeptides of the invention, for example, a recombinant vector, an isolated polynucleotide, antibody or fusion protein.
  • Host cells include progeny of a single host cell, and the progeny may not necessarily be completely identical (in morphology, physiology, or in total DNA, RNA, or polypeptide complement) to the original parent cell due to natural, accidental, or deliberate mutation and/or change.
  • Host cells can be prokaryotic or eukaryotic, including mammalian, insect, amphibian, reptile, crustacean, avian, fish, plant and fungal cells.
  • a host cell includes cells transformed, transfected, transduced, or infected in vivo or in vitro with a polynucleotide of the invention, for example, a recombinant vector.
  • a host cell which comprises a recombinant vector of the invention may be called a "recombinant host cell.”
  • Biological sample encompasses a variety of sample types obtained from an individual, including biological fluids such as blood, serum, plasma, urine, cerebrospinal fluid, tears, saliva, lymph, dialysis fluid, lavage fluid, semen, and other liquid samples or tissues of biological origin. It includes tissue samples and tissue cultures or cells derived therefrom and the progeny thereof, including cells in culture, cell supernatants, and cell lysates. It includes organ or tissue culture derived fluids, tissue biopsy samples, tumor biopsy samples, stool samples, and fluids extracted from physiological tissues. Cells dissociated from solid tissues, tissue sections, and cell lysates are included.
  • the definition also includes samples that have been manipulated in any way after their procurement, such as by treatment with reagents, solubilization, or enrichment for certain components, such as polynucleotides or polypeptides. Also included in the term are derivatives and fractions of biological samples.
  • a biological sample can be used in a diagnostic, monitoring, or screening assay.
  • mammals or “mammalian,” are used broadly to describe organisms which are within the class Mammalia, including the orders Carnivora (e.g., dogs and cats), Rodentia (e.g., mice, guinea pigs, and rats), and other mammals, including cattle, goats, sheep, cows, horses, rabbits, and pigs, and primates (e.g., humans, chimpanzees, and monkeys).
  • agent refers to a substance that binds to or modulates a level or activity of a subject polypeptide or a level of mRNA encoding a subject protein or nucleic acid, or that modulates the activity of a cell containing the subject protein or nucleic acid.
  • where an agent modulates a level of mRNA encoding a subject protein, agents include ribozymes, antisense, and RNAi molecules.
  • agents include antibodies specific for the subject polypeptide, peptide aptamers, small molecules, agents that bind a ligand-binding site in a subject polypeptide, and the like.
  • Antibody agents include antibodies that specifically bind a subject polypeptide and activate the polypeptide, such as receptor-ligand binding that initiates signal transduction; antibodies that specifically bind a subject polypeptide and inhibit binding of another molecule to the polypeptide, thus preventing activation of a signal transduction pathway; antibodies that bind a subject polypeptide to modulate transcription; antibodies that bind a subject polypeptide to modulate translation; as well as antibodies that bind a subject polypeptide on the surface of a cell to initiate antibody-dependent cytotoxicity ("ADCC”) or to initiate cell killing or cell growth.
  • Small molecule agents include those that bind the polypeptide to modulate activity of the polypeptide or cell containing the polypeptide in a similar fashion.
  • agent also refers to substances that modulate a condition or disorder associated with a subject polynucleotide or polypeptide. Such agents include subject polynucleotides themselves, subject polypeptides themselves, and the like. Agents may be chosen from amongst candidate agents, as defined below.
  • Candidate agents can be small organic compounds having a molecular weight of more than about 50 and less than about 2,500 daltons.
  • Candidate agents can comprise functional groups necessary for structural interaction with proteins, particularly hydrogen bonding, and can include at least an amine, carbonyl, hydroxyl or carboxyl group, and can contain at least two of the functional chemical groups.
  • the candidate agents can comprise cyclical carbon or heterocyclic structures and/or aromatic or polyaromatic structures substituted with one or more of the above functional groups.
  • Candidate agents are also found among biomolecules, including oligonucleotides, polynucleotides, and fragments thereof, depsipeptides, polypeptides and fragments thereof, oligosaccharides, polysaccharides and fragments thereof, lipids, fatty acids, steroids, purines, pyrimidines, derivatives thereof, structural analogs, modified nucleic acids, modified, derivatized or designer amino acids, or combinations thereof.
  • an "agent which modulates a biological activity of a subject polypeptide” describes any substance, synthetic, semi-synthetic, or natural, organic or inorganic, small molecule or macromolecular, pharmaceutical or protein, with the capability of altering a biological activity of a subject polypeptide or of a fragment thereof, as described herein.
  • a plurality of assay mixtures is run in parallel with different agent concentrations to obtain a differential response to the various concentrations.
  • one of these concentrations serves as a negative control, i.e., at zero concentration or below the level of detection.
  • the biological activity can be measured using any assay known in the art.
  • An agent which modulates a biological activity of a subject polypeptide increases or decreases the activity at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 50%, at least about 100%, or at least about 2-fold, at least about 5-fold, or at least about 10-fold or more when compared to a suitable control.
  • agonist refers to a substance that mimics the function of an active molecule.
  • Agonists include, but are not limited to, drugs, hormones, antibodies, and neurotransmitters, as well as analogues and fragments thereof.
  • Antagonist refers to a molecule that competes for the binding sites of an agonist, but does not induce an active response. Antagonists include, but are not limited to, drugs, hormones, antibodies, and neurotransmitters, as well as analogues and fragments thereof.
  • receptor refers to a polypeptide that binds to a specific extracellular molecule and may initiate a cellular response.
  • ligand refers to any molecule that binds to a specific site on another molecule.
  • modulate encompasses an increase or a decrease, a stimulation, inhibition, or blockage in the measured activity when compared to a suitable control.
  • Modulation of expression levels includes increasing the level and decreasing the level of an mRNA or polypeptide encoded by a polynucleotide of the invention when compared to a control lacking the agent being tested.
  • agents of particular interest are those which inhibit a biological activity of a subject polypeptide, and/or which reduce a level of a subject polypeptide in a cell, and/or which reduce a level of a subject mRNA in a cell and/or which reduce the release of a subject polypeptide from a eukaryotic cell.
  • agents of interest are those that increase a biological activity of a subject polypeptide, and/or which increase a level of a subject polypeptide in a cell, and/or which increase a level of a subject mRNA in a cell and/or which increase the release of a subject polypeptide from a eukaryotic cell.
  • An agent that "modulates the level of expression of a nucleic acid" in a cell is one that brings about an increase or decrease of at least about 1.25-fold, at least about 1.5-fold, at least about 2-fold, at least about 5-fold, at least about 10-fold, or more in the level (i.e., an amount) of mRNA and/or polypeptide following cell contact with a candidate agent compared to a control lacking the agent.
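The fold-change thresholds above reduce to a short calculation. The sketch below is a hedged illustration, not a prescribed assay analysis: the function names are hypothetical, the inputs are assumed to be positive mRNA or polypeptide levels in arbitrary units, and a decrease of N-fold is read here as the control level being N times the treated level.

```python
def fold_change(treated: float, control: float) -> float:
    """Magnitude of change as a fold value >= 1, regardless of direction."""
    if treated <= 0 or control <= 0:
        raise ValueError("levels must be positive")
    ratio = treated / control
    # An increase gives ratio >= 1; a decrease is reported as its reciprocal.
    return ratio if ratio >= 1 else 1 / ratio


def modulates_expression(treated: float, control: float, threshold: float = 1.25) -> bool:
    """True if the level rose or fell by at least `threshold`-fold versus control."""
    return fold_change(treated, control) >= threshold
```

Under these assumptions, a level of 150 against a control of 100 is a 1.5-fold increase and passes the 1.25-fold threshold, while 110 against 100 does not.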
  • Modulating a level of active subject polypeptide includes increasing or decreasing activity of a subject polypeptide; increasing or decreasing a level of active polypeptide protein; increasing or decreasing a level of mRNA encoding active subject polypeptide, and increasing or decreasing the release of subject polypeptide from a eukaryotic cell.
  • an agent is a subject polypeptide, where the subject polypeptide itself is administered to an individual.
  • an agent is an antibody specific for a subject polypeptide.
  • an agent is a chemical compound such as a small molecule that may be useful as an orally available drug.
  • Such modulation includes the recruitment of other molecules that directly effect the modulation.
  • an antibody that modulates the activity of a subject polypeptide that is a receptor on a cell surface may bind to the receptor and fix complement, activating the complement cascade and resulting in lysis of the cell.
  • over-expressed refers to a state wherein there exists any measurable increase over normal or baseline levels.
  • a molecule that is over-expressed in a disorder is one that is manifest in a measurably higher level compared to levels in the absence of the disorder.
  • a "stem cell” is a pluripotent or multipotent cell with the ability to self-renew, to remain undifferentiated, and to become differentiated. Stem cells can divide without limit, at least for the lifetime of the animal in which they naturally reside. Stem cells are not terminally differentiated, i.e., they are not at the end of a pathway of differentiation. When a stem cell divides, each daughter cell can either remain a stem cell or it can embark on a course that leads to terminal differentiation.
  • An “embryonic stem cell” is a stem cell that is present in or isolated from an embryo.
  • An “adult stem cell” is a stem cell that is present in or isolated from an adult. Either can be pluripotent, having the capacity to differentiate into each and every cell present in the organism, or multipotent, with the ability to differentiate into more than one cell type. Embryonic stem cells derived from the inner cell mass of the embryo can act as pluripotent cells when placed into host blastocysts. Adult stem cells are more frequently multipotent than pluripotent; examples of multipotent adult stem cells include hematopoietic stem cells, peripheral nervous system stem cells, central nervous system stem cells, myogenic stem cells, and mesenchymal stem cells.
  • a "mesenchymal stem cell” is an adult pluripotent stem cell progenitor of multiple mesenchymal lineages, including bone, cartilage, muscle, fat tissue, marrow stroma, and astrocytes.
  • Mesenchyme is embryonic tissue of mesodermal origin, i.e., tissue that derives from the middle of three germ layers. The mesenchyme is populated by mesenchymal cells, which are typically stellate or fusiform in shape. The embryonic mesoderm gives rise to the musculoskeletal, blood, vascular, and urogenital systems, as well as connective tissue, i.e., the dermis.
  • a "hematopoietic” cell is a cell involved in the process of hematopoiesis, i.e., the process of forming mature red and white blood cells from precursor cells.
  • hematopoiesis takes place in the bone marrow.
  • hematopoiesis takes place at different sites during different stages of development; primitive blood cells arise in the yolk sac, and later, blood cells are formed in the liver, spleen, and bone marrow.
  • Hematopoiesis undergoes complex regulation, including regulation by hormones, e.g., erythropoietin; growth factors, e.g., colony stimulating factors; and cytokines, e.g., interleukins. While the B-lymphocytic component of white blood cells matures in the bone marrow, the T-lymphocytic component of white blood cells matures in the thymus.
  • “Differentiation” is a progressive developmental change to a more specialized form or function.
  • Cell differentiation is the process a cell undergoes as it matures to become an overtly specialized cell type. Differentiated cells have distinct characteristics, perform specific functions, and are less likely to divide than their less differentiated counterparts.
  • An "undifferentiated” cell e.g., an immature, embryonic, or primitive cell, typically has a non-specific appearance, may perform multiple, nonspecific activities, and may perform poorly, if at all, in functions typically performed by differentiated cells.
  • Dedifferentiation is a process by which a mature cell returns to a less mature state.
  • a “dedifferentiated cell” is one that has fewer characteristics of differentiation than it possessed at an earlier point in time.
  • a “dedifferentiated state” is one in which a mature cell has returned or is returning to a less differentiated state, e.g., as in some cancers.
  • a “differentiation factor” is a factor that induces a cell to undergo a change in the direction of an overtly specialized cell type.
  • An “anti-differentiation factor” is a factor that prevents or inhibits a cell from undergoing a change in the direction toward an overtly specialized cell type.
  • a "co-factor” is a molecule that acts in concert with another substance to bring about certain effects.
  • a "lymphokine” is a cytokine produced by a leukocyte, which acts upon another cell. Examples include interleukins, interferon-alpha, tumor necrosis factor-alpha, and granulocyte/monocyte colony-stimulating factor.
  • an "anti-inflammatory molecule” is a molecule that can diminish, eliminate, or prevent a response to injury or infection.
  • an antihistamine can counteract the effect of the inflammatory mediator histamine.
  • an "anti-cancer molecule” is a molecule that can diminish, eliminate, or prevent the effects of cancer. It includes pharmaceuticals and antibodies.
  • An "apoptotic molecule” is a molecule that induces a cell to move towards apoptosis, or programmed cell death. Normally functioning cells undergo apoptosis when their age or their state of health so dictates. Apoptosis is an active process requiring metabolic activity by the dying cell, often characterized by cleavage of the DNA into fragments. Cells that die by apoptosis do not generally elicit the inflammatory response associated with necrosis. Cancer cells do not typically undergo normal apoptosis.
  • First and second therapeutic molecules working in "conjunction" means they work in association with one another to achieve a therapeutic effect.
  • First and second heterologous nucleic acid sequences that "interact" with one another means they have an effect on one another such that one of the sequences influences the other. Either may act upon the other, or both may act upon each other.
  • a "promoter” is a region of DNA that binds RNA polymerase before initiating the transcription of DNA into RNA.
  • the nucleotide at which transcription begins is designated +1; nucleotides are numbered from this reference point. Negative numbers indicate upstream nucleotides and positive numbers indicate downstream nucleotides.
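The numbering scheme just described can be written as a small coordinate conversion. This sketch assumes the common convention that there is no position 0, so the nucleotide immediately upstream of +1 is -1; the source does not state this explicitly. The function name and the use of 0-based string indices are also assumptions.

```python
def relative_position(index: int, tss_index: int) -> int:
    """Convert a 0-based index along the strand into a position relative to
    the transcription start site (TSS), which is designated +1.

    Assumes the convention that position 0 does not exist.
    """
    offset = index - tss_index
    # At or downstream of the TSS: shift by one so the TSS itself is +1.
    # Upstream of the TSS: the raw negative offset is already correct.
    return offset + 1 if offset >= 0 else offset
```

For a transcription start site at index 10, index 10 maps to +1, index 11 to +2, and index 9 to -1.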
  • the promoter directs the RNA polymerase to bind to DNA, to open the DNA helix, and to begin RNA synthesis. Some promoters are "constitutive,” and direct transcription in the absence of regulatory influences. Some promoters are “tissue specific,” and initiate transcription exclusively or selectively in one or a few tissue types. Some promoters are "inducible,” and effect gene transcription under the influence of an inducer. Induction can occur, e.g., as the result of a physiologic response, a response to outside signals, or as the result of artificial manipulation.
  • a "knockout" mouse is a mouse in which a normal functional gene has been replaced by a non-functional form of the gene, and the function of that particular gene is eliminated. They are typically produced by transplanting embryonic stem cells heterozygous for a knockout mutation in a gene of interest and homozygous for a marker gene, e.g., black coat color, into the blastocoel cavity of embryos that are homozygous for an alternate marker, e.g., white coat color. The early embryos then are implanted into a pseudopregnant female. Some of the resulting progeny are chimeras, indicated by the phenotype produced by the marker, e.g., a black and white coat.
  • Chimeric mice are backcrossed to mice with the alternate marker.
  • Progeny from this mating that display the marker present in the mice with the gene of interest e.g., black coat
  • Intercrossing these heterozygous mice produces mice homozygous for the disrupted allele, i.e., "knockout” mice (Capecchi, 1989).
  • Gene "knockout” produces model systems for studying inherited human diseases, investigating the nature of genetic diseases and the efficacy of different types of treatment, and for developing effective gene therapies to cure these diseases.
  • a "knockout" line of mutant mice homozygous for a null allele of the cystic fibrosis transmembrane regulator gene demonstrates symptoms similar to those of humans with cystic fibrosis. These mice provide a model system for studying this genetic disease and developing effective therapies.
  • a "transgenic mouse” is a mouse that has stably incorporated one or more genes from another cell or organism and can pass them on to successive generations.
  • Transgenic mice with an exogenous DNA sequence of interest integrated into their DNA are typically produced by injecting DNA containing a gene of interest into one of the two pronuclei (the male and female haploid nuclei contributed by the parents) of a fertilized mouse egg before they fuse.
  • the injected DNA is randomly integrated into the chromosomes of the diploid zygote. Injected eggs then are transferred to foster mothers in which normal cell growth and differentiation occurs.
  • Some of the progeny will contain the exogenous DNA, and breeding and backcrossing can produce pure transgenic strains homozygous for the transgene (Brinster et al., 1981).
  • Transgenic mice are useful for studying various aspects of normal mammalian biology, and also provide a model system for studying disease processes. For example, many forms of cancer are promoted by normal cellular myc genes acting in a dominant fashion owing to their misregulated activity. Transgenic mice carrying the myc gene develop normally, and form tumors at a high frequency in a subset of cells that express the transgene.
  • a "therapeutic factor" encoded by a first heterologous nucleic acid sequence of a modified mesenchymal cell is a factor, excluding a cell survival factor (Mangi et al., 2003; WO 03/073998), that is preventative, palliative, curative, or otherwise useful in treating or ameliorating, or preventing the recurrence of a disease, disorder, syndrome or condition, and is not an anti-cancer agent.
  • Telomerase is a DNA polymerase enzyme that selectively elongates DNA from the telomere, i.e., the end of a chromosome.
  • Telomeric DNA contains multiple, e.g., hundreds, of tandem repeats of a hexanucleotide sequence.
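Tandem repetition of a hexanucleotide can be made concrete with a short sketch that counts the longest uninterrupted run of a repeat unit in a DNA string. The default unit `TTAGGG` is the human telomeric repeat; it is supplied here as an assumption, since the passage does not name the sequence. The function name is hypothetical.

```python
import re


def longest_tandem_run(dna: str, unit: str = "TTAGGG") -> int:
    """Return the number of copies of `unit` in the longest uninterrupted
    tandem run found in `dna` (0 if the unit does not occur)."""
    unit = unit.upper()
    # One or more back-to-back copies of the unit, matched greedily.
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(dna.upper())]
    return max(runs, default=0)
```

A string holding three consecutive copies, a two-base interruption, and one further copy yields a longest run of three.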
  • One strand of telomeric DNA is G-rich at the 3' end, and slightly longer than the other strand.
  • Telomeric DNA can form large duplex loops, wherein the single-stranded region at the very end of the structure loops back to form a DNA duplex with another part of the repeated sequence, displacing a part of the original telomeric duplex. This looplike structure is formed and stabilized by specific telomere-binding proteins. These structures protect and mask the end of the chromosome.
  • the telomeric looplike structures are generated by telomerase.
  • the telomerase enzyme contains an RNA molecule that serves as the template for elongating the G-rich strand of telomeric DNA. Thus, the enzyme carries the information necessary to generate the telomere sequences.
  • Telomerases also have a protein component, which is related to reverse transcriptases. Telomerases can influence cell aging, and play a role in cellular cancer biology.
  • TNF Tumor necrosis factor
  • TNF encompasses a family of receptor ligands that display pleiotropic effects on normal and malignant cells. Natural induction of TNF is protective, but its overproduction may be detrimental and even lethal to the host. TNF elicits a variety of responses in different cell types. TNF was originally characterized as an antitumor agent and a cytotoxic factor for malignant cells. It subverts the electron transport system of mitochondria to produce oxygen radicals, which can kill malignant cells lacking protective enzymes. TNF also plays a role in the defense against viral, bacterial, and parasitic infections, and in mediating autoimmune responses (Fiers, 1991). TNF inhibitors have been used to treat psoriasis (Weinberg and Saini, 2003).
  • Treatment refers to obtaining a desired pharmacologic and/or physiologic effect, covering any treatment of a pathological condition or disorder in a mammal, including a human.
  • the effect may be prophylactic in terms of completely or partially preventing a disorder or symptom thereof and/or may be therapeutic in terms of a partial or complete cure for a disorder and/or adverse effect attributable to the disorder.
  • treatment includes (1) preventing the disorder from occurring or recurring in a subject who may be predisposed to the disorder but has not yet been diagnosed as having it, (2) inhibiting the disorder, such as arresting its development, (3) stopping or terminating the disorder or at least symptoms associated therewith, so that the host no longer suffers from the disorder or its symptoms, such as causing regression of the disorder or its symptoms, for example, by restoring or repairing a lost, missing or defective function, or stimulating an inefficient process, or (4) relieving, alleviating, or ameliorating the disorder, or symptoms associated therewith, where ameliorating is used in a broad sense to refer to at least a reduction in the magnitude of a parameter, such as inflammation, pain, and/or tumor size.
  • a pharmaceutically acceptable carrier is non-toxic to recipients at the dosages and concentrations employed and is compatible with other ingredients of the formulation.
  • the carrier for a formulation containing polypeptides would not normally include oxidizing agents and other compounds that are known to be deleterious to polypeptides.
  • Suitable carriers include, but are not limited to, water, dextrose, glycerol, saline, ethanol, and combinations thereof.
  • the carrier can contain additional agents such as wetting or emulsifying agents, pH buffering agents, or adjuvants which enhance the effectiveness of the formulation.
  • Adjuvants of the invention include, but are not limited to, Freund's, Montanide ISA Adjuvants (Seppic, Paris, France), Ribi's Adjuvants (Ribi ImmunoChem Research, Inc., Hamilton, MT), Hunter's TiterMax (CytRx Corp., Norcross, GA), Aluminum Salt Adjuvants (Alhydrogel - Superfos of Denmark/Accurate Chemical and Scientific Co., Westbury, NY), Nitrocellulose-Adsorbed Protein, Encapsulated Antigens, and Gerbu Adjuvant (Gerbu Biotechnik GmbH, Gaiberg, Germany/C-C Biotech, Poway, CA).
  • Topical carriers include liquid petroleum, isopropyl palmitate, polyethylene glycol, ethanol (95%), polyoxyethylene monolaurate (5%) in water, or sodium lauryl sulfate (5%) in water.
  • Other materials such as anti-oxidants, humectants, viscosity stabilizers, and similar agents can be added as necessary.
  • Percutaneous penetration enhancers such as Azone can also be included.
  • “Pharmaceutically acceptable salts” include the acid addition salts (formed with the free amino groups of the polypeptide), which are formed with inorganic acids such as, for example, hydrochloric or phosphoric acids, or such organic acids as acetic, mandelic, oxalic, and tartaric. Salts formed with the free carboxyl groups can also be derived from inorganic bases such as, for example, sodium, potassium, ammonium, calcium, or ferric hydroxides, and such organic bases as isopropylamine, trimethylamine, 2-ethylamino ethanol, and histidine.
  • compositions for oral administration can form solutions, suspensions, tablets, pills, capsules, sustained release formulations, oral rinses, or powders.
  • unit dosage form refers to physically discrete units suitable as unitary dosages for human and animal subjects, each unit containing a predetermined quantity of compounds of the present invention calculated in an "effective amount,” that is, a dosage sufficient to produce the desired result or effect in association with a pharmaceutically acceptable carrier.
  • the specifications for the novel unit dosage forms of the present invention depend on the particular compound employed, the host, and the effect to be achieved, as well as the pharmacodynamics associated with each compound in the host.
  • the present invention provides novel isolated polynucleotides encoding polypeptides and fragments thereof.
  • the present invention also provides novel isolated polypeptides, fragments thereof, and compositions comprising same.
  • the present invention further provides polynucleotide compositions that can be used to identify the polypeptides.
  • the present invention provides recombinant vectors and host cells for use in gene expression, primer pairs for use in hybridizations, computer-based embodiments for use in bioinformatics, and transgenic animals and embryonic stem cell lines for use in mutating and regulating gene expression.
  • This invention provides genes encoding proteins, the encoded proteins, and fragments and homologs thereof. It provides human polynucleotide sequences and the corresponding mouse polynucleotide sequences.
  • the nucleic acids of the subject invention can encode all or a part of the subject proteins. Double or single stranded fragments can be obtained from the DNA sequence by chemically synthesizing oligonucleotides in accordance with conventional methods, by restriction enzyme digestion, or by polymerase chain reaction (PCR) amplification.
  • the use of the polymerase chain reaction has been described (Saiki et al., 1985) and current techniques have been reviewed (Sambrook et al., 1989; McPherson et al., 2000; Dieffenbach and Dveksler, 1995).
  • DNA fragments will be of at least about 5 nucleotides, at least about 8 nucleotides, at least about 10 nucleotides, at least about 15 nucleotides, at least about 18 nucleotides, at least about 20 nucleotides, at least about 25 nucleotides, at least about 30 nucleotides, or at least about 50 nucleotides, at least about 75 nucleotides, or at least about 100 nucleotides.
  • Nucleic acid compositions that encode at least six contiguous amino acids (i.e., fragments of 18 nucleotides or more), for example, nucleic acid compositions encoding at least 8 contiguous amino acids, are useful in directing the expression or the synthesis of peptides that can be used as immunogens (Lerner, 1982; Shinnick et al., 1983; Sutcliffe et al., 1983).
  • a polynucleotide of the invention comprises a nucleotide sequence of at least about 5, at least about 8, at least about 10, at least about 15, at least about 18, at least about 20, at least about 25, at least about 30, at least about 50, at least about 75, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, at least about 500, at least about 550, at least about 600, at least about 650, at least about 700, at least about 750, at least about 800, at least about 850, at least about 900, at least about 950, at least about 1000, at least about 1100, at least about 1200, at least about 1300, at least about 1400, at least about 1500, at least about 1600, at least about 1700, at least about 1800, at least about 1900, at least about 2000, at least about 2100, at least about 2200, at least about 2300, at least about 2400, at least about
  • a polynucleotide of the invention has at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% nucleotide sequence identity with a nucleotide sequence, or a fragment thereof, of the coding region of any one of the sequences shown in SEQ ID NO: 1 - 123, or a complement thereof.
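The identity thresholds above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length (with '-' marking gaps) and counts identity over aligned columns; the function names and the default 85% threshold argument are illustrative choices, not a method prescribed by the text, and other identity definitions (for example, those used by specific alignment programs) may give different values.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length ('-' = gap)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Drop columns in which both sequences carry a gap.
    columns = [(x, y) for x, y in zip(a.upper(), b.upper())
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no aligned positions to compare")
    # A column counts as a match only if both residues are identical non-gaps.
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(a: str, b: str, threshold: float = 85.0) -> bool:
    """True if the aligned pair reaches the given percent identity (hypothetical helper)."""
    return percent_identity(a, b) >= threshold


if __name__ == "__main__":
    query = "ATGCATGCAT"
    subject = "ATGCATGCAA"  # one mismatch in ten aligned positions
    print(percent_identity(query, subject))
    print(meets_identity_threshold(query, subject, 85.0))
```

In this sketch gaps opposite a residue count as mismatches, which is one common but not universal convention.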
  • sequence variants include naturally-occurring variants (e.g., SNPs, allelic variants, and homologs from other species), degenerate variants, variants associated with disease or pathological states, and variants resulting from random or directed mutagenesis, as well as from chemical or other modification.
  • a polynucleotide of the invention comprises a nucleotide sequence that encodes a polypeptide comprising an amino acid sequence of at least about 5, at least about 8, at least about 10, at least about 15, at least about 18, at least about 20, at least about 25, at least about 30, at least about 50, at least about 75, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, at least about 500, at least about 550, at least about 600, at least about 650, at least about 700, at least about 750, at least about 800, at least about 850, at least about 900, at least about 950, or at least about 1000 contiguous amino acids of at least one of the sequences shown in SEQ ID NO: 124-246 (e.g., a polypeptide encoded by at least one of the nucleotide sequences shown in SEQ ID NO: 1 - 123), up to and
  • the present invention includes the present polynucleotide selected from SEQ ID NO: 1 - 124, which contain 300 bp of 5' terminus of a protein encoding polynucleotide sequence. Such a polynucleotide is useful for the purposes of clustering gene sequences to determine gene family.
  • a polynucleotide of the invention hybridizes under stringent hybridization conditions to a polynucleotide having the coding region of any one of the sequences shown in SEQ ID NO: 1 - 124, or a complement thereof.
  • the polynucleotides of the invention include those that encode variants of the polypeptide sequences encoded by the polynucleotides of the Sequence Listing. In some embodiments, these polynucleotides encode variant polypeptides that include insertions, additions, deletions, or substitutions compared with the polypeptides encoded by the nucleotide sequences shown in SEQ ID NO: 1 - 124. Conservative amino acid substitutions include serine/threonine, valine/leucine/isoleucine, asparagine/histidine/glutamine, glutamic acid/aspartic acid, etc. (Gonnet et al., 1992).
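As a rough illustration of the conservative substitution groups named above, the following sketch classifies each difference between a reference and a variant polypeptide of equal length. Only the four groups listed in the text are encoded; the text says "etc.", so this set is deliberately incomplete, and the function names are hypothetical.

```python
# Conservative groups taken from the text (one-letter amino acid codes):
# Ser/Thr, Val/Leu/Ile, Asn/His/Gln, Glu/Asp. The list is not exhaustive.
CONSERVATIVE_GROUPS = [set("ST"), set("VLI"), set("NHQ"), set("ED")]


def is_conservative(ref: str, var: str) -> bool:
    """True if two different residues fall within the same listed group."""
    ref, var = ref.upper(), var.upper()
    return ref != var and any(ref in g and var in g for g in CONSERVATIVE_GROUPS)


def classify_substitutions(reference: str, variant: str):
    """Return (position, ref, var, label) for every substituted position (1-based)."""
    if len(reference) != len(variant):
        raise ValueError("this sketch handles substitutions only, not indels")
    result = []
    for pos, (r, v) in enumerate(zip(reference.upper(), variant.upper()), start=1):
        if r != v:
            label = "conservative" if is_conservative(r, v) else "non-conservative"
            result.append((pos, r, v, label))
    return result


if __name__ == "__main__":
    for row in classify_substitutions("MSVNE", "MTLKD"):
        print(row)
```

Insertions, additions, and deletions would require an alignment step first, which is outside the scope of this sketch.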
  • nucleic acids of the invention include degenerate variants that can be translated, according to the standard genetic code, to provide an amino acid sequence identical to that translated from the nucleic acid sequences herein.
  • synonymous codons include GGG, GGA, GGC, and GGU, each encoding glycine.
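The notion of a degenerate variant can be sketched as follows: two different nucleotide sequences that translate to the same amino acid sequence under the standard genetic code. The codon table is built from the conventional U, C, A, G ordering of the standard code; the helper names are illustrative assumptions, and the example sequences are invented for demonstration only.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(seq: str) -> str:
    """Translate an RNA or DNA coding sequence (frame 1); '*' marks a stop codon."""
    rna = seq.upper().replace("T", "U")
    usable = len(rna) - len(rna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(amino_acid: str):
    """All codons encoding the given one-letter amino acid, sorted alphabetically."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid.upper())


def is_degenerate_variant(seq_a: str, seq_b: str) -> bool:
    """True if the sequences differ but encode the identical amino acid sequence."""
    a = seq_a.upper().replace("T", "U")
    b = seq_b.upper().replace("T", "U")
    return a != b and translate(a) == translate(b)


if __name__ == "__main__":
    print(synonymous_codons("G"))
    print(is_degenerate_variant("AUGGGGUAA", "AUGGGAUAA"))
```

Calling `synonymous_codons("G")` recovers the four glycine codons named in the text.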
  • the nucleic acids of the invention include single nucleotide polymorphisms (SNPs), which occur frequently in eukaryotic genomes (Lander et al., 2001).
  • the nucleotide sequence determined from one individual of a species can differ from other allelic forms present within the population.
  • the nucleic acids of the invention include homologs of the polynucleotides.
  • the source of homologous genes can be any species, e.g., primate species, particularly human; rodents, such as rats, hamsters, guinea pigs, and mice; rabbits, canines, felines; cattle, such as bovines, goats, pigs, sheep, equines; crustaceans, birds, chickens, reptiles, amphibians, fish, insects, plants, fungi, yeast, nematodes, etc.
  • Among mammalian species, e.g., human and mouse, homologs have substantial sequence similarity, e.g., at least about 60% sequence identity, at least about 75% sequence identity, or at least about 80% sequence identity among nucleotide sequences. In many embodiments of interest, homology will be at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, where in certain embodiments of interest homology will be as high as about 99%.
  • Modifications in the native structure of nucleic acids, including alterations in the backbone, sugars or heterocyclic bases, have been shown to increase intracellular stability and binding affinity.
  • Among useful changes in the backbone chemistry are phosphorothioates; phosphorodithioates, where both of the non-bridging oxygens are substituted with sulfur; phosphoroamidites; alkyl phosphotriesters and boranophosphates.
  • Achiral phosphate derivatives include 3'-O-5'-S-phosphorothioate, 3'-S-5'-O-phosphorothioate, 3'-CH2-5'-O-phosphonate and 3'-NH-5'-O-phosphoroamidate.
  • Peptide nucleic acids replace the entire ribose phosphodiester backbone with a peptide linkage.
  • Sugar modifications are also used to enhance stability and affinity.
  • the α-anomer of deoxyribose can be used, where the base is inverted with respect to the natural β-anomer.
  • the 2'-OH of the ribose sugar can be altered to form 2'-O-methyl or 2'-O-allyl sugars, which provides resistance to degradation without compromising affinity.
  • Modification of the heterocyclic bases must maintain proper base pairing.
  • Some useful substitutions include deoxyuridine for deoxythymidine; 5-methyl-2'-deoxycytidine and 5-bromo-2'-deoxycytidine for deoxycytidine.
  • 5-propynyl-2'-deoxyuridine and 5-propynyl-2'-deoxycytidine have been shown to increase affinity and biological activity when substituted for deoxythymidine and deoxycytidine, respectively.
  • a genomic sequence of interest comprises the nucleic acid present between the initiation codon and the stop codon, as defined in the listed sequences, including all of the introns that are normally present in a native chromosome. It can further include the 3' and 5' untranslated regions found in the mature mRNA. It can further include specific transcriptional and translational regulatory sequences, such as promoters, enhancers, etc., including about 1 kb, about 2 kb, and possibly more, of flanking genomic DNA at either the 5' or 3' end of the transcribed region.
  • the genomic DNA can be isolated as a fragment of 100 kbp or smaller; and substantially free of flanking chromosomal sequence.
  • the genomic DNA flanking the coding region, either 3' or 5', or internal regulatory sequences as sometimes found in introns, contains sequences required for proper tissue and stage specific expression.
  • Nucleic acid molecules of the invention can comprise heterologous nucleic acid molecules, i.e., nucleic acid molecules other than the subject nucleic acid molecules, of any length.
  • the subject nucleic acid molecules can be flanked on the 5' and/or 3' ends by heterologous nucleic acid molecules of from about 1 nucleotide to about 10 nucleotides, from about 10 nucleotides to about 20 nucleotides, from about 20 nucleotides to about 50 nucleotides, from about 50 nucleotides to about 100 nucleotides, from about 100 nucleotides to about 250 nucleotides, from about 250 nucleotides to about 500 nucleotides, or from about 500 nucleotides to about 1000 nucleotides, or more in length.
  • the subject polynucleotides include those that encode fusion proteins comprising the subject polypeptides fused to "fusion partners."
  • the present soluble receptor or ligand can be fused to an immunoglobulin fragment, such as an Fc fragment for stability in circulation or to fix complement.
  • Other polypeptide fragments that have equivalent capabilities as the Fc fragments can also be used herein.
  • the isolated nucleic acids of the invention can be used as probes to detect and characterize gross alteration in a genomic locus, such as deletions, insertions, translocations, and duplications, e.g., applying fluorescence in situ hybridization (FISH) techniques to examine chromosome spreads (Andreeff et al., 1999).
  • the nucleic acids are also useful for detecting smaller genomic alterations, such as deletions, insertions, additions, translocations, and substitutions (e.g., SNPs).
  • When used as probes to detect nucleic acid molecules capable of hybridizing with nucleic acids described in the Sequence Listing, the nucleic acid molecules can be flanked by heterologous sequences of any length.
  • a subject nucleic acid can include nucleotide analogs that incorporate labels that are directly detectable, such as radiolabels or fluorophores, or nucleotide analogs that incorporate labels that can be visualized in a subsequent reaction, such as biotin or various haptens. Haptens that are commonly conjugated to nucleotides for subsequent labeling include biotin, digoxigenin, and dinitrophenyl.
  • Suitable fluorescent labels include fluorochromes, e.g., fluorescein and its derivatives, e.g., fluorescein isothiocyanate (FITC), 6-carboxyfluorescein (6-FAM), 2',7'-dimethoxy-4',5'-dichloro-6-carboxyfluorescein (JOE), 6-carboxy-2',4',7',4,7-hexachlorofluorescein (HEX), 5-carboxyfluorescein (5-FAM); coumarin and its derivatives, e.g., 7-amino-4-methylcoumarin, aminocoumarin; bodipy dyes, such as Bodipy FL; cascade blue; Oregon green; rhodamine dyes, e.g., rhodamine, 6-carboxy-X-rhodamine (ROX), Texas red, phycoerythrin, and tetramethylrhodamine; eosin
  • Fluorescent labels also include a green fluorescent protein (GFP), i.e., a "humanized" version of a GFP, e.g., wherein codons of the naturally-occurring nucleotide sequence are changed to more closely match human codon bias; a GFP derived from Aequorea victoria or a derivative thereof, e.g., a "humanized" derivative such as Enhanced GFP, which are available commercially, e.g., from Clontech, Inc.; other fluorescent mutants of a GFP from Aequorea victoria, e.g., as described in U.S. Patent No.
  • Probes can also contain fluorescent analogs, including commercially available fluorescent nucleotide analogs that can readily be incorporated into a subject nucleic acid. These include deoxyribonucleotide and/or ribonucleotide analogs labeled with Cy3, Cy5, Texas Red, Alexa Fluor dyes, rhodamine, cascade blue, or BODIPY, and the like.
  • Suitable radioactive labels include, e.g., ³²P, ³⁵S, or ³H.
  • probes can contain radiolabeled analogs, including those commonly labeled with ³²P or ³⁵S, such as α-³²P-dATP, -dTTP, -dCTP, and -dGTP; γ-³⁵S-GTP and α-³⁵S-dATP, and the like.
  • Nucleic acids of the invention can also be bound to a substrate.
  • Subject nucleic acids can be attached covalently, attached to a surface of the support or applied to a derivatized surface in a chaotropic agent that facilitates denaturation and adherence, e.g., by noncovalent interactions, or some combination thereof.
  • the nucleic acids can be bound to a substrate to which a plurality of other nucleic acids are concurrently bound, hybridization to each of the plurality of the bound nucleic acids being separately detectable.
  • the substrate can be porous or solid, planar or non-planar, unitary or distributed; and the bond between the nucleic acid and the substrate can be covalent or non-covalent.
  • the substrate can be in the form of microbeads or nanobeads.
  • Substrates include, but are not limited to, a membrane, such as nitrocellulose, nylon, positively-charged derivatized nylon; a solid substrate such as glass, amorphous silicon, crystalline silicon, plastics (including e.g., polymethylacrylic, polyethylene, polypropylene, polyacrylate, polymethylmethacrylate, polyvinylchloride, polytetrafluoroethylene, polystyrene, polycarbonate, polyacetal, polysulfone, cellulose acetate, or mixtures thereof).
  • the subject nucleic acids include antisense RNA, ribozymes, and RNAi. Further, the nucleic acids of the invention can be used for antisense or RNAi inhibition of transcription or translation using methods known in the art (Phillips, 1999a; Phillips, 1999b; Hartmann et al., 1999; Stein et al., 1998; Agrawal et al., 1998).
  • the instant invention further provides host cells, e.g., recombinant host cells, that comprise a subject nucleic acid, host cells that comprise a recombinant vector, and host cells that secrete antibodies of the invention.
  • Subject host cells can be cultured in vitro, or can be part of a multicellular organism. Host cells are described in more detail below.
  • the instant invention further provides transgenic plants and non-human animals, as described in more detail below.
  • the subject nucleic acids find use in the preparation of all or a portion of the polypeptides of the subject invention, as described above, using an expression system.
  • an expression vector can be employed.
  • the expression vector will provide a transcriptional and translational initiation region, which may be inducible, conditionally-active, constitutive, or tissue-specific, where the coding region is operably linked under the transcriptional control of the transcriptional initiation region, and a transcriptional and translational termination region.
  • These control regions can be native to a gene encoding the subject peptides, or can be derived from heterologous or exogenous sources.
  • the subject nucleic acids can also be provided as part of a vector (e.g., a polynucleotide construct comprising an expression cassette), a wide variety of which are known in the art.
  • Vectors include, but are not limited to, plasmids; cosmids; viral vectors; human, yeast, bacterial, and P1-derived artificial chromosomes (HACs, YACs, BACs, PACs, etc.); mini-chromosomes; and the like.
  • Vectors are amply described in numerous publications well known to those in the art (Ausubel, et al.; Jones et al., 1998a; Jones et al., 1998b).
  • Vectors can provide for nucleic acid expression, for nucleic acid propagation, or both.
  • a recombinant vector or construct that includes a nucleic acid of the invention is useful for propagating a nucleic acid in a host cell; such vectors are known as "cloning vectors."
  • Vectors can transfer nucleic acid between host cells derived from disparate organisms; these are known in the art as “shuttle vectors.”
  • Vectors can also insert a subject nucleic acid into a host cell's chromosome; these are known in the art as “insertion vectors.”
  • Vectors can express either sense or antisense RNA transcripts of the invention in vitro (e.g., in a cell-free system or within an in vitro cultured host cell) or in vivo (e.g., in a multicellular plant or animal); these are known in the art as "expression vectors," which can be part of an expression system. Expression vectors can also produce a subject antibody.
  • Vectors typically include at least one origin of replication, at least one site for insertion of heterologous nucleic acid (e.g., in the form of a polylinker with multiple, tightly clustered, single cutting restriction endonuclease recognition sites), and at least one selectable marker, although some integrative vectors will lack an origin that is functional in the host to be chromosomally modified, and some vectors will lack selectable markers.
  • Vectors can be transiently or stably maintained in the cells, usually for a period of at least about one day, at least about several days, to at least about several weeks.
  • Promoters of the invention can be naturally contiguous or not naturally contiguous to the expressed nucleic acid molecule.
  • the promoters can be inducible, conditionally active (such as the cre-lox promoter), constitutive, and/or tissue specific.
  • Prior to vector insertion, the DNA of interest will be obtained substantially free of other nucleic acid sequences.
  • the DNA can be "recombinant," and flanked by one or more nucleotides with which it is not normally associated on a naturally occurring chromosome.
  • Expression vectors generally have convenient restriction sites located near the promoter sequence to provide for the insertion of nucleic acid sequences encoding heterologous protein or RNA molecules.
  • a selectable marker operative in the expression system or host can be present.
  • Expression vectors can be used for the production of fusion proteins, where the fusion peptide provides additional functionality, i.e., increased protein synthesis, a leader sequence for secretion, stability, reactivity with defined antisera, or an enzyme marker, e.g., β-galactosidase.
  • Expression vectors can be prepared comprising a transcription cassette comprising a transcription initiation region, the gene or fragment thereof, and a transcriptional termination region.
  • DNA sequences that allow for the expression of functional epitopes or domains, at least about 5, at least about 8, at least about 10, at least about 15, at least about 18, at least about 20, at least about 25, at least about 30, at least about 50, at least about 75, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, at.
  • the cells containing the vector construct can be selected by means of a selectable marker, and the selected cells expanded and used as expression-competent host cells.
  • Host cells can comprise prokaryotes or eukaryotes that express proteins and polypeptides in accordance with conventional methods, the method depending on the purpose for expression.
  • a unicellular organism such as E. coli, B. subtilis, S. cerevisiae, insect cells in combination with baculovirus vectors, or cells of a higher organism such as vertebrates, particularly mammals, e.g., COS 7 cells, can be used as the expression host cells.
  • Specific expression systems of interest include plants, bacteria, yeast, insect cells, and mammalian cell-derived expression systems.
  • Expression systems in plants include those described in U.S. Patent No. 6,096,546 and U.S. Patent No. 6,127,145.
  • Expression systems in bacteria include those described by Chang et al., 1978; Goeddel et al., 1979; Goeddel et al., 1980; EP 0 036,776; U.S. Patent No. 4,551,433; DeBoer et al., 1983; and Siebenlist et al., 1980.
  • Expression systems in yeast include those described by Hinnen et al., 1978; Ito et al., 1983; Kurtz et al., 1986; Kunze et al., 1985; Gleeson et al., 1986; Roggenkamp et al., 1986; Das et al., 1984; De Louvencourt et al., 1983; Van den Berg et al., 1990; Kunze et al., 1985; Cregg et al., 1985; U.S. Patent Nos.
  • the insect cell expression system is useful not only for production of heterologous proteins intracellularly, but can be used for expression of transmembrane proteins on the insect cell surfaces. Such insect cells can be used as immunogen for production of antibodies, for example, by injection of the insect cells into mice or rabbits or other suitable animals, for production of antibodies.
  • Mammalian expression systems include those described in Dijkema et al., 1985; Gorman et al., 1982; Boshart et al., 1985; and U.S. Patent No. 4,399,216. Additional features of mammalian expression are facilitated as described in Ham and Wallace, 1979; Barnes and Sato, 1980; U.S. Patent Nos.
  • Mammalian cell expression systems can also be used for production of antibodies.
  • Cell-free systems can also be used to express the polypeptides of the invention.
  • Cell-free systems of fractionated cell homogenates comprising the protein synthetic machinery, including ribosomes, transfer RNA and enzymatic components of the machinery, are routinely used by those of skill in the art to express polypeptides of interest.
  • Isolated mRNA and DNA, e.g., a gene cloned into a plasmid vector or a PCR-generated DNA template, are examples of nucleic acids suitable as templates for expressing polypeptides in cell-free systems.
  • the polypeptides can be expressed in bacterial systems, e.g., E. coli lysate, or in a rabbit reticulocyte lysate system, wheat germ extract system, frog oocyte lysate system, and the like, as is conventional in the art. See, for example, WO 00/68412, WO 01/27260, WO 02/24939, WO 02/38790, WO 91/02076, and WO 91/02075.
  • Wheat embryo and wheat germ (which is dried wheat embryo), and reticulo-lysate extract cell-free systems are eukaryotic, and, as such, are suitable for expressing eukaryotic proteins, as described in WO 00/68412, WO 01/27260, WO 02/08443, WO 02/095377, WO 02/18586, and WO 02/24939. They have the advantages of low cost, easy availability in large amounts, and the capacity to synthesize high-molecular weight polypeptides (Madin et al., 2000).
  • the wheat embryo can be treated to substantially eliminate endogenous protein synthesis inhibitors, improving the synthetic capacity of the system (Madin et al., 2000; WO 00/68412).
  • the robustness of the wheat germ cell-free system can be enhanced by the addition of an energy regenerating system, including an energy source (Madin et al., 2000; WO 00/68412).
  • Recombinant genes encoding hydrophobic proteins of interest can be expressed in cell-free systems, e.g., wheat germ, E. coli, or rabbit reticulocyte lysates.
  • Cell-free lysates can be prepared from wheat germ or wheat embryos by the methods of Doi et al., 2003; Miyamoto-Sato et al., 2003; Morita et al., 2003; WO 02/38790; WO 02/24939; Sawasaki et al., 2002a; Sawasaki et al., 2002b; WO 01/27260; WO 00/68412; Madin et al., 2000; Sawasaki et al., 2000; and/or Erickson and Blobel, 1983.
  • Cell-free lysates can be prepared from E. coli by the methods of Chang et al., 1978; Goeddel et al., 1979; EP 0 036,776; U.S. Patent No. 4,551,433; DeBoer et al., 1983; and Siebenlist et al., 1980.
  • Cell-free lysates can be prepared from rabbit reticulocytes by the methods of Dijkema et al., 1985; Gorman et al., 1982; Boshart et al., 1985; and U.S. Patent No. 4,399,216. Additional features of mammalian expression are facilitated as described in Ham and McKeehan, 1979; Barnes and Sato, 1980; U.S. Patent Nos. 4,767,704, 4,657,866, 4,927,762, 4,560,655; WO 90/103430; WO 87/00195; and U.S. RE 30,985.
  • the translation efficiency of a cell-free expression system is dependent in part on the translation efficiency of the mRNA.
  • the efficiency of the mRNA employed in a wheat germ cell-free expression system was shown to be increased by constructing a plasmid with a template DNA for transcribing a protein translational template mRNA with high translation efficiency.
  • the plasmid includes non-translated template DNA at the 5' and 3' ends of the mRNA template, improving the efficiency of translation of the mRNA sequence of interest (WO 01/27260; Madin et al., 2000).
  • the translation efficiency of a cell-free expression system is also dependent in part on the availability of a continuous energy source.
  • the robustness of a conventional wheat germ cell-free expression system was improved by supplying substrate molecules and energy sources, i.e., ATP or GTP to the reaction by allowing their free diffusion into the expression system, and by removing the by-products of the reaction from the expression system (WO 02/24939; Madin et al., 2000).
  • Membrane proteins are commonly amphipathic molecules that have hydrophilic as well as hydrophobic, e.g., membrane spanning, regions. They may pose difficulties in studying with traditional methods because of their hydrophobic domains. Transmembrane and membrane-associated proteins are cellular targets for the majority of therapeutics in use today, including both small molecule drugs and protein pharmaceutical drugs and vaccines. Active therapeutic agents are also often amphipathic, and for the same reasons have also been difficult to study directly at the molecular level.
  • Nanodiscs™ (Nanodisc, Inc., Urbana-Champaign, IL) provide a means to generate soluble lipid bilayer membranes that incorporate membrane proteins on a nanometer scale.
  • the structure of the Nanodisc™ is a discoidal lipid bilayer surrounded at its edges by amphipathic α-helical proteins (WO 02/40501).
  • the Nanodiscs™ are stabilized by a synthetically engineered class of amphipathic membrane scaffold proteins that were optimized to promote the self-assembly of discoidal bilayers (Bayburt et al., 2002). These alpha-helical scaffold proteins surround the edge of the Nanodisc™, providing stability to the bilayer. They can be engineered with tags or chemically reactive groups so that they can additionally serve as tools for observation, physical manipulation, or attachment to various matrices.
  • Nanodiscs™ are self-assembling discoidal nanoparticles. The assembly process begins with a mixture of saturated or unsaturated phospholipid molecules and membrane scaffold proteins in the presence of a detergent. The detergent is removed, forming particles that preserve the phospholipid bilayer architecture and incorporate the target hydrophobic protein of interest.
  • Nanodiscs™ can be produced from a variety of phospholipids, for example, dipalmitoyl phosphatidylcholine and dimyristoyl phosphatidylcholine (Shaw et al., 2004).
  • a synthetic bilayer made from a single type of phospholipid changes from a liquid state to a gel state at a characteristic freezing point. This change of state is called a phase transition.
  • the breadth of the phase transitions of the phospholipids in the bilayer depends on the phospholipid composition.
  • Comparison of the phase transitions of the same phospholipids incorporated into Nanodiscs™ and into phospholipid vesicles showed that the transitions were broader for the lipids in the Nanodiscs™ than for the lipids in the vesicles. Also, the transition midpoint was shifted 3-4°C higher for lipids incorporated into Nanodiscs™. These characteristics of the Nanodisc™ lipid bilayer mimic the characteristics of cellular membranes better than the vesicles, making Nanodiscs™ a more native-like lipid environment in which to study membrane-associated proteins (Shaw et al., 2004).
  • Nanodiscs™ can also be produced from microsomal membranes, e.g., those prepared from baculovirus-infected Spodoptera frugiperda (Sf9) insect cells.
  • Civjan et al. overexpressed an N-terminally anchored cytochrome P450 monooxygenase, and found that it was effectively dispersed, not aggregated, in bilayers containing biochemically defined lipid components.
  • the cytochrome P450 monooxygenase target protein was suitable for sensitive high-throughput substrate binding analysis (Civjan et al., 2003).
  • the oligomeric state of the target protein can be controlled during Nanodisc™ assembly.
  • the seven transmembrane receptor protein bacteriorhodopsin assumes a native trimeric state.
  • the oligomeric form of bacteriorhodopsin could be controlled and determined by observation, e.g., spectroscopically (Bayburt and Sligar, 2003).
  • Nanodiscs™ can also be controlled by the assembly conditions (Bayburt and Sligar, 2003). By providing membrane scaffold proteins and phospholipids in excess, Nanodiscs™ can be induced to self-assemble with controlled stoichiometry such that there is one target hydrophobic molecule per Nanodisc™.
  • Nanodiscs constructed with bacteriorhodopsin comprised approximately two membrane scaffold protein molecules and approximately 163 dimyristoylphosphatidylcholine molecules per bacteriorhodopsin molecule.
  • Nanodiscs™ provide integral membrane proteins in a functional, soluble, and monodisperse state in a native-like environment that maintains a spectrum of in vivo activities. Integral membrane proteins such as receptors, enzymes, and other macromolecular assemblies that represent important drug targets can be incorporated into Nanodiscs™ and retain their physiologic activities.
  • 93% of the bacteriopsin molecules incorporated into Nanodiscs™ were shown to be functional with respect to cofactor binding, and to have a dissociation constant for all-trans-retinal that was very close to the value of the dissociation constant in the native state (Bayburt and Sligar, 2003).
  • Hydrophobic proteins expressed on NanodiscsTM are suitable for use in biochemical studies, crystallographic studies, and high throughput screening.
  • NanodiscsTM are water soluble; they can be handled and manipulated by techniques commonly used to work with proteins. NanodiscsTM have the advantage over liposomes that they lack a lumen, overcoming the orientation problem ofthe embedded membrane protein, because both the extracellular and cytoplasmic portions ofthe target molecules are accessible. NanodiscsTM provide access to both sides of the bilayer structure, while liposomes permit access only to the outer surface. Thus, NanodiscsTM are useful for studying transmembrane protein function in solution. For example, they can be used to study transmembrane signaling.
  • the invention provides a method of producing at least one hydrophobic polypeptide by providing a cell-free expression system, a first nucleic acid molecule encoding a first hydrophobic polypeptide and reagents for producing a NanodiscTM, combining the first nucleic acid molecule, the cell-free expression system, and the reagents for producing a NanodiscTM, and allowing the first hydrophobic polypeptide to be produced in or introduced into a NanodiscTM.
  • This cell-free expression system allows for replication of the nucleic acid molecule. It can be a bacterial system, e.g., an E. coli lysate; a plant system, e.g., a wheat germ lysate; or a eukaryotic system, e.g., a rabbit reticulocyte lysate.
  • This expression system can produce membrane proteins.
  • This method can produce two or more hydrophobic polypeptides by providing a second, third, or fourth nucleic acid molecule encoding a second, third, or fourth hydrophobic polypeptide; combining the second, third, or fourth nucleic acid molecule with the first, second, or third nucleic acid molecule, the cell-free expression system, and the reagents for producing a NanodiscTM; and allowing the resulting hydrophobic polypeptides to be produced in or introduced into the NanodiscTM.
  • This method can produce a NanodiscTM comprising the two or more hydrophobic polypeptides. It can produce two or more hydrophobic proteins that are part of a multi-protein complex, or that exist in the same NanodiscTM but are not part of a multi-protein complex.
  • This method can produce first, second, third, or fourth, etc., nucleic acid molecules that are present in an equal molar ratio.
  • This method can also produce first, second, third, or fourth, etc., nucleic acid molecules that are present in different molar ratios. These proteins in the NanodiscTM can assume their native conformation and perform their native physiologic functions, both in isolation and as a part of protein complexes.
  • the invention provides an apparatus for producing a plurality of hydrophobic polypeptides in a high throughput manner comprising means for providing a cell-free expression system for one or more components of a hydrophobic protein, means for introducing one or more nucleic acid molecules that encode one or more components of a hydrophobic protein into each cell-free expression system, means for introducing a NanodiscTM into each cell-free expression system, and means for incubating the cell-free expression system, the one or more nucleic acid molecules, and the NanodiscTM for each hydrophobic protein.
  • This apparatus can further comprise means for separating the NanodiscTM containing the hydrophobic protein from the cell-free expression system.
  • the invention provides a method of synthesizing a plurality of NanodiscsTM simultaneously and for synthesizing a series of a plurality of simultaneously-synthesized NanodiscsTM sequentially utilizing a dynamic system by providing the apparatus described above, operating the apparatus so as to produce a plurality of hydrophobic polypeptides in a cell-free expression system on a NanodiscTM, operating the apparatus so as to separate the NanodiscTM containing the hydrophobic protein from the cell-free expression system, and operating the apparatus so as to reposition the apparatus such that the means for providing a cell-free expression system, the means for introducing one or more nucleic acid molecules, the means for introducing a NanodiscTM into each cell-free expression system, and the means for incubating the cell-free expression system are in a position with respect to one another so that at least a second plurality of hydrophobic proteins can be produced in a cell-free system on a NanodiscTM.
  • the invention provides a hydrophobic protein made by any of the above methods.
  • the hydrophobic protein can be a membrane protein, e.g., a transmembrane protein with one or more hydrophobic transmembrane domains.
  • the invention also provides a composition comprising a plurality of crystallized hydrophobic proteins.
  • Protein crystallization requires large amounts of purified protein, and previous efforts to produce crystals of membrane proteins have been hampered by the difficulty of expressing hydrophobic proteins in their native state.
  • the crystallized hydrophobic protein composition of the invention can comprise hydrophobic proteins produced by a cell-free expression system in a NanodiscTM and crystallized by any of the methods that are known to those skilled in the art (McRee, 1999).
  • the composition of crystallized proteins can comprise a binding partner bound to the hydrophobic protein, e.g., a heavy metal or other binding partner used by those skilled in the art. Crystallized proteins can provide information about the three-dimensional structure of the proteins.
  • the invention also provides a method of preparing a hydrophobic protein for determination of crystal structure by providing a composition of hydrophobic proteins made by any of the methods described above and allowing the composition to crystallize.
  • the invention further provides a method for using a hydrophobic protein of the invention to determine its crystal structure.
  • the invention provides a method of immunizing a non-human animal by injecting it with a hydrophobic protein made by any of the methods described above.
  • the invention provides a method of screening for modulators of hydrophobic protein activity by providing a hydrophobic protein made by any of the methods described above, contacting the hydrophobic protein with a candidate modulator, and determining the ability of the candidate modulator to affect hydrophobic protein activity or to bind to the hydrophobic protein.
  • This method provides a screen for modulators of membrane proteins.
  • the modulators can be agonists, antagonists, antibodies, small molecule drugs, soluble receptors, peptide aptamers, and/or natural ligands.
  • the resulting replicated nucleic acid, RNA, expressed protein or polypeptide is within the scope of the invention as a product of the host cell or organism.
  • Once the gene corresponding to a selected polynucleotide is identified, its expression can be regulated in the gene's native cell types.
  • an endogenous gene of a cell can be regulated by an exogenous regulatory sequence inserted into the genome of the cell at a location that will enhance or reduce expression of the gene corresponding to the subject polypeptide.
  • the regulatory sequence can be designed to integrate into the genome via homologous recombination, as disclosed in U.S. Patent Nos. 5,641,670 and 5,124,761, the disclosures of which are herein incorporated by reference.
  • the invention provides isolated nucleic acids that, when used as primers in a polymerase chain reaction, amplify a subject polynucleotide, or a polynucleotide containing a subject polynucleotide.
  • the amplified polynucleotide is from about 20 to about 50, from about 50 to about 75, from about 75 to about 100, from about 100 to about 125, from about 125 to about 150, from about 150 to about 175, from about 175 to about 200, from about 200 to about 250, from about 250 to about 300, from about 300 to about 350, from about 350 to about 400, from about 400 to about 500, from about 500 to about 600, from about 600 to about 700, from about 700 to about 800, from about 800 to about 900, from about 900 to about 1000, from about 1000 to about 2000, from about 2000 to about 3000, from about 3000 to about 4000, from about 4000 to about 5000, or from about 5000 to about 6000 nucleotides or more in length.
  • the isolated nucleic acids themselves are from about 10 to about 20, from about 20 to about 30, from about 30 to about 40, from about 40 to about 50, from about 50 to about 100, or from about 100 to about 200 nucleotides in length.
  • the nucleic acids are used in pairs in a polymerase chain reaction, where they are referred to as "forward" and "reverse" primers.
  • the invention provides a pair of isolated nucleic acid molecules, each from about 10 to about 200 nucleotides in length, the first nucleic acid molecule of the pair comprising a sequence of at least 10 contiguous nucleotides having 100% sequence identity to a nucleic acid sequence as shown in SEQ ID NO: 1 - 123 and the second nucleic acid molecule of the pair comprising a sequence of at least 10 contiguous nucleotides having 100% sequence identity to the reverse complement of the nucleic acid sequence shown in SEQ ID NO: 1 - 123, wherein the sequence of the second nucleic acid molecule is located 3' of the nucleic acid sequence of the first nucleic acid molecule shown in SEQ ID NO: 1 - 123.
  • the primer nucleic acids are prepared using any known method, e.g., automated synthesis, and can be chosen to specifically amplify a cDNA copy of an mRNA encoding a subject polypeptide.
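  • The forward/reverse primer relationship described above can be illustrated with a minimal Python sketch. It assumes exact (100%) matching, a single-stranded template written 5' to 3', and a made-up template sequence; the function names `reverse_complement` and `primer_pair_product` are hypothetical and are not part of the disclosure.

```python
# Illustrative sketch only: the template below is an invented sequence,
# not one of the sequences in the Sequence Listing.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_pair_product(template: str, forward: str, reverse: str):
    """Return the region a primer pair would bracket, or None.

    The forward primer must match the template exactly, and the reverse
    primer must match the reverse complement of a site located 3' of the
    forward primer site.
    """
    template = template.upper()
    fwd = forward.upper()
    # The reverse primer anneals to the opposite strand, so its binding
    # site on the given strand reads as its reverse complement.
    rev_site = reverse_complement(reverse.upper())
    start = template.find(fwd)
    if start == -1:
        return None
    end_site = template.find(rev_site, start + len(fwd))
    if end_site == -1:
        return None
    return template[start:end_site + len(rev_site)]


template = "ATGCGTACCGGATTCAGCTAGGCTTAACGGTACCATGA"  # hypothetical
forward = template[:10]
reverse = reverse_complement(template[-10:])
product = primer_pair_product(template, forward, reverse)
```

  • In this sketch the two primers are each 10 nucleotides long, the shortest length contemplated above, and the bracketed product is the whole hypothetical template.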
  • the first and/or the second nucleic acid molecules comprise a detectable label.
  • the label can be a radioactive molecule, fluorescent molecule or another molecule, e.g., hapten, as described in detail above.
  • the label can be a two stage system, where the amplified DNA is conjugated to another molecule, i.e., biotin, digoxin, or a hapten, that has a high affinity binding partner, i.e., avidin, antidigoxin, or a specific antibody, respectively, and the binding partner conjugated to a detectable label.
  • the label can be conjugated to one or both of the primers.
  • the pool of nucleotides used in the amplification is labeled, so as to incorporate the label into the amplification product.
  • high stringency conditions include hybridization in 50% formamide, 5X SSC, 0.2 µg/µl poly(dA), 0.2 µg/µl human Cot-1 DNA, and 0.5% SDS, in a humid oven at 42°C overnight, followed by successive washes in 1X SSC, 0.2% SDS at 55°C for 5 minutes, followed by washing at 0.1X SSC, 0.2% SDS at 55°C for 20 minutes.
  • high stringency conditions include hybridization at 50°C and 0.1x SSC (15 mM sodium chloride/1.5 mM sodium citrate); overnight incubation at 42°C in a solution containing 50% formamide, 1x SSC (150 mM NaCl, 15 mM sodium citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran sulfate, and 20 µg/ml denatured, sheared salmon sperm DNA, followed by washing the filters in 0.1x SSC at about 65°C.
  • High stringency conditions also include aqueous hybridization (e.g., free of formamide) in 6X SSC (where 20X SSC contains 3.0 M NaCl and 0.3 M sodium citrate), 1% sodium dodecyl sulfate (SDS) at 65°C for about 8 hours (or more), followed by one or more washes in 0.2 X SSC, 0.1% SDS at 65°C.
  • Highly stringent hybridization conditions are hybridization conditions that are at least as stringent as any one of the above representative conditions.
  • Other stringent hybridization conditions are known in the art and can also be employed to identify nucleic acids of this particular embodiment of the invention.
  • Conditions of reduced stringency suitable for hybridization to molecules encoding structurally and functionally related proteins, or otherwise serving related or associated functions, are the same as those for high stringency conditions but with a reduction in temperature for hybridization and washing to lower temperatures (e.g., room temperature or about 22°C to 25°C).
  • moderate stringency conditions include aqueous hybridization (e.g., free of formamide) in 6X SSC, 1% SDS at 65°C for about 8 hours (or more), followed by one or more washes in 2X SSC, 0.1% SDS at room temperature.
  • Low stringency conditions include, for example, aqueous hybridization at 50°C and 6x SSC (0.9 M sodium chloride/0.09 M sodium citrate) and washing at 25°C in 1x SSC (0.15 M sodium chloride/0.015 M sodium citrate).
  • the specificity of a hybridization reaction allows any single-stranded sequence of nucleotides to be labeled with a radioisotope or chemical and used as a probe to find a complementary strand, even in a cell or cell extract that contains millions of different DNA and RNA sequences. Probes of this type are widely used to detect the nucleic acids corresponding to specific genes, both to facilitate the purification and characterization of the genes after cell lysis and to localize them in cells, tissues, and organisms.
  • a probe prepared from one gene can be used to find homologous evolutionary relatives - both in the same organism, where the relatives form part of a gene family, and in other organisms, where the evolutionary history of the nucleotide sequence can be traced.
  • a person skilled in the art would recognize how to modify the conditions to achieve the requisite degree of stringency for a particular hybridization.
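  • The SSC strengths quoted in the stringency conditions above all follow from the stated 20X composition (3.0 M NaCl and 0.3 M sodium citrate). A minimal Python sketch of that arithmetic is given below; the helper names `ssc_molarity` and `stock_volume_ml` are hypothetical and are introduced only for illustration.

```python
# Composition of 20X SSC as stated in the text.
NACL_20X_M = 3.0      # mol/L sodium chloride
CITRATE_20X_M = 0.3   # mol/L sodium citrate


def ssc_molarity(strength: float):
    """Return (NaCl, sodium citrate) molarity for an SSC solution of the
    given strength, e.g. 0.1 for 0.1X or 6 for 6X."""
    factor = strength / 20.0
    return NACL_20X_M * factor, CITRATE_20X_M * factor


def stock_volume_ml(strength: float, final_volume_ml: float) -> float:
    """Volume of 20X stock to dilute to final_volume_ml at the given strength."""
    return final_volume_ml * strength / 20.0


for strength in (0.1, 1, 6):
    nacl, citrate = ssc_molarity(strength)
    print(f"{strength}X SSC: {nacl:.3f} M NaCl, {citrate:.4f} M sodium citrate")
```

  • The computed values agree with those given above: 0.1X SSC is 15 mM NaCl/1.5 mM sodium citrate, 1X SSC is 0.15 M/0.015 M, and 6X SSC is 0.9 M/0.09 M.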
  • the polynucleotide libraries of the invention generally comprise a collection of sequence information of a plurality of polynucleotide sequences, where at least one of the polynucleotides has a sequence shown in SEQ ID NO: 1 - 123.
  • By plurality is meant at least 2, at least 3, or all of the sequences in the Sequence Listing.
  • the information may be provided in either biochemical form (e.g., as a collection of polynucleotide molecules), or in electronic form (e.g., as a collection of polynucleotide sequences stored in a computer-readable form, as in a computer-based system, a computer data file, and/or as a part of a computer program).
  • the length and number of polynucleotides in the library will vary with the nature of the library, e.g., if the library is an oligonucleotide array, a cDNA array, or a computer database of the sequence information.
  • sequence information contained in either a biochemical or an electronic library of polynucleotides can be used in a variety of ways, e.g., as a resource for gene discovery, as a representation of sequences expressed in a selected cell type (e.g., cell type markers), or as markers of a given disorder or disease state.
  • a disease marker is a representation of a gene product that is present in all cells affected by disease either at an increased or decreased level relative to a normal cell (e.g., a cell ofthe same or similar type that is not substantially affected by disease).
  • a polynucleotide sequence in a library can be a polynucleotide that represents an mRNA, polypeptide, or other gene product encoded by the polynucleotide, that is either over-expressed or under-expressed in one cell compared to another (e.g., a first cell type compared to a second cell type; a normal cell compared to a diseased cell; a cell not exposed to a signal or stimulus compared to a cell exposed to that signal or stimulus; and the like).
  • the nucleotide sequence information of the library can be embodied in any suitable form, e.g., electronic or biochemical forms.
  • a library of sequence information embodied in electronic form comprises an accessible computer data file that may contain the representative nucleotide sequences of genes that are differentially expressed (e.g., over-expressed or under-expressed) as between, e.g., a first cell type compared to a second cell type (e.g., expression in a brain cell compared to expression in a kidney cell); a normal cell compared to a diseased cell (e.g., a non- cancerous cell compared to a cancerous cell); a cell not exposed to an internal or external signal or stimulus compared to a cell exposed to that signal or stimulus (e.g., a cell contacted with a ligand compared to a control cell not contacted with the ligand); and the like.
  • Biochemical embodiments of the library include a collection of nucleic acid molecules that have the sequences of the genes in the library, where the nucleic acids can correspond to the entire gene in the library or to a fragment thereof, as described in greater detail below.
  • the nucleic acid sequence information can be present in a variety of media.
  • the nucleic acid sequences of any of the polynucleotides shown in SEQ ID NO: 1 - 123 can be recorded on computer readable media of a computer-based system, e.g., any medium that can be read and accessed directly by a computer.
  • Any of the presently known computer readable media can be used to create a manufacture comprising a recording of the present sequence information. Any convenient data storage structure can be chosen, based on the means used to access the stored information.
  • a variety of data processor programs and formats can be used for storage, e.g., word processing text file, database format, etc.
  • electronic versions of the libraries of the invention can be provided in conjunction or connection with other computer-readable information and/or other types of computer-based files (e.g., searchable files, executable files, etc., including, but not limited to, for example, search program software, etc.).
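  • As one possible illustration of storing a sequence library in computer-readable form, the sketch below writes and re-reads records as a FASTA-style text file. The choice of FASTA, the helper names, and the placeholder record names and sequences are all assumptions made for the example; the text does not prescribe a particular file format.

```python
import io


def write_fasta(records: dict, handle, width: int = 60) -> None:
    """Write {name: sequence} records as FASTA text, wrapping lines at width."""
    for name, seq in records.items():
        handle.write(f">{name}\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")


def read_fasta(handle) -> dict:
    """Parse FASTA text back into a {name: sequence} dictionary."""
    parts = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {n: "".join(chunks) for n, chunks in parts.items()}


# Placeholder entries; real libraries would hold the listed sequences.
library = {"example_seq_1": "ATGCGTACCGGATTCAGC" * 5, "example_seq_2": "ATGAAACCC"}
buffer = io.StringIO()
write_fasta(library, buffer)
buffer.seek(0)
restored = read_fasta(buffer)
```

  • An in-memory buffer stands in for a data file here; the same two functions work unchanged with a file opened on disk.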
  • By providing the nucleotide sequence in computer readable form in a computer-based system, the information can be accessed for a variety of purposes.
  • Computer software to access sequence information is publicly available.
  • Conventional bioinformatics tools can be utilized to analyze sequences to determine sequence identity, sequence similarity, and gap information.
  • the gapped BLAST (Altschul et al., 1990; Altschul et al., 1997) and BLAZE (Brutlag et al., 1993) search algorithms on a Sybase system, or the TeraBLAST (TimeLogic, Crystal Bay, Nevada) program optionally running on a specialized computer platform available from TimeLogic, can be used to identify open reading frames (ORFs) within the genome that contain homology to ORFs from other organisms.
  • Homology between sequences of interest can be determined using the local homology algorithm of Smith and Waterman, 1981, as well as the BestFit program (Rechid et al., 1989), and the FastDB algorithm (FastDB, 1988; described in Current Methods in Sequence Comparison and Analysis, Macromolecule Sequencing and Synthesis, Selected Methods and Applications, pp. 127-149, 1988, Alan R. Liss, Inc).
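  • The local homology algorithm of Smith and Waterman mentioned above can be sketched in a few lines of Python. This is a simplified illustration that returns only the best local alignment score and uses a linear gap penalty; the scoring values (match 2, mismatch -1, gap -2) and the function name are assumptions chosen for the example, not parameters taken from the text.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (linear gap penalty)."""
    cols = len(b) + 1
    prev = [0] * cols  # previous row of the dynamic-programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, which is what makes the
            # alignment local rather than global.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


score = smith_waterman_score("TTACGTTT", "GGACGGG")
```

  • Only two rows of the matrix are kept in memory because the sketch reports a score rather than tracing back the aligned region.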
  • Alignment programs that permit gaps in the sequence include ClustalW (Thompson et al., 1994), FASTA3 (Pearson, 2000), Align0 (Myers and Miller, 1988), and TCoffee (Notredame et al., 2000).
  • Other methods for comparing and aligning nucleotide and protein sequences include, for example, BLASTX (NCBI), the Wise package (Birney and Durbin, 2000), and FASTX (Pearson, 2000). These algorithms determine sequence homology between nucleotide and protein sequences without translating the nucleotide sequences into protein sequences.
  • Sequence similarity is calculated based on a reference sequence, which may be a subset of a larger sequence, such as a conserved motif, coding region, flanking region, etc.
  • the reference sequence is usually at least about 18 nt long, at least about 30 nt long, or may extend to the complete sequence that is being compared.
  • One parameter for determining percent sequence identity is the percentage of the alignment in the region of strongest alignment between a target and a query sequence. Methods for determining this percentage involve, for example, counting the number of aligned bases of a query sequence in the region of strongest alignment and dividing this number by the total number of bases in the region. For example, 10 matches divided by 11 total residues gives a percent sequence identity of approximately 90.9%.
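  • The percent identity calculation described above can be written as a short Python function. The sketch assumes the two inputs are already aligned over the region of strongest alignment and are of equal length, with "-" marking a gap; the function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_query: str, aligned_target: str) -> float:
    """Percent of positions in the aligned region that are identical.

    Both strings must cover the same aligned region; '-' marks a gap and
    never counts as a match.
    """
    if len(aligned_query) != len(aligned_target):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_query:
        raise ValueError("aligned region is empty")
    matches = sum(
        q == t and q != "-" for q, t in zip(aligned_query, aligned_target)
    )
    return 100.0 * matches / len(aligned_query)


# 10 matching positions out of 11, as in the worked example in the text.
identity = percent_identity("ACGTACGTACG", "ACGTACGTACC")
```

  • Here `identity` evaluates to 10/11 expressed as a percentage, which rounds to the 90.9% figure given above.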
  • the length of the aligned region is typically at least about 55%, at least about 58%, or at least about 60% of the total sequence length, and can be as great as about 62%, as great as about 64%, and even as great as about 66% of the total sequence length.
  • the present invention includes human and mouse polynucleotide and polypeptide sequences that are at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% homologous to the sequences in the Sequence Listing, based on using the method of determining sequence identity with the insertion of gaps to detect the maximum degree of sequence identity. In other embodiments of interest, homology will be at least about 80%, at least about 85%, or as high as about 90%.
  • a variety of structural formats for the input and output means can be used to input and output the information in the computer-based systems of the present invention.
  • One format for an output means ranks the relative expression levels of different polynucleotides. Such presentation provides a skilled artisan with a ranking of relative expression levels to determine a gene expression profile.
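  • A ranked output of the kind described above might look like the following Python sketch. The identifiers and expression values are invented placeholders, and `rank_expression` is a hypothetical helper name; no measured data from the disclosure are used.

```python
def rank_expression(levels: dict) -> list:
    """Return (polynucleotide, level) pairs sorted from highest to lowest level."""
    return sorted(levels.items(), key=lambda item: item[1], reverse=True)


# Placeholder relative expression levels (arbitrary units).
levels = {"polynucleotide_A": 1.0, "polynucleotide_B": 5.0, "polynucleotide_C": 3.0}

for rank, (name, level) in enumerate(rank_expression(levels), start=1):
    print(f"{rank}. {name}: {level}")
```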
  • the library of the invention also encompasses biochemical libraries of the polynucleotides shown in SEQ ID NO: 1 - 123, e.g., collections of nucleic acids representing the provided polynucleotides.
  • the biochemical libraries can take a variety of forms, e.g., a solution of cDNAs, a pattern of probe nucleic acids stably associated with a surface of a solid support (i.e., an array) and the like.
  • nucleic acid arrays in which one or more of the polynucleotide sequences shown in SEQ ID NO: 1 - 123 is represented on the array.
  • a variety of different array formats have been developed and are known to those of skill in the art.
  • arrays of the subject invention find use in a variety of applications, including gene expression analysis, drug screening, mutation analysis, and the like, as disclosed in the herein-listed exemplary patent documents.
  • analogous libraries of polypeptides are also provided, where the polypeptides of the library will represent at least a portion of the polypeptides encoded by a gene corresponding to one or more of the sequences shown in SEQ ID NO: 1 - 123.
  • analogous libraries of antibodies are also provided, where the libraries comprise antibodies or fragments thereof that specifically bind to at least a portion of at least one of the subject polypeptides.
  • antibody libraries may comprise antibodies or fragments thereof that specifically inhibit binding of a subject polypeptide to its ligand or substrate, or that specifically inhibit binding of a subject polypeptide as a substrate to another molecule.
  • corresponding nucleic acid libraries are also provided, comprising polynucleotide sequences that encode the antibodies or antibody fragments described above.
  • novel polypeptides, and related polypeptide compositions encompass proteins with amino acid sequences as shown in SEQ ID NO: 124 - 246, or encoded by the nucleic acids having nucleotide sequences shown in SEQ ID NO: 1 - 123.
  • the subject polypeptides are human polypeptides, fragments thereof, variants (such as splice variants), homologs from other species, and derivatives thereof.
  • a polypeptide of the invention has an amino acid sequence substantially identical to the sequence of any polypeptide encoded by a polynucleotide sequence shown in SEQ ID NO: 1 - 123.
  • polypeptides may reside within the cell, or extracellularly. They may be secreted from the cell, reside in the cytoplasm, in the membranes, or in any of the intracellular organelles, including the nucleus, mitochondria, ribosomes, or storage granules. They may function as secreted proteins, single-transmembrane proteins, multiple-transmembrane proteins, cytoplasmic proteins, and/or extracellular proteins.
  • the present novel polypeptide modulates the cells or tissues of animals, particularly humans, such as, for example, by stimulating, enhancing or inhibiting T or B cell function or the function of other hematopoietic cells or bone marrow cells; modulates adult or embryonic stem cell or precursor cell growth or differentiation; modulates cell function or activity of neuronal cells or other cells of the CNS, heart cells, liver cells, kidney cells, lung cells, pancreatic cells, gastrointestinal cells, spleen cells, breast cells, prostate cells, ovarian cells, and the like.
  • a subject polypeptide is present as a multimer.
  • Multimers include homodimers, homotrimers, homotetramers, and multimers that include more than four monomeric units.
  • Multimers also include heteromultimers, e.g., heterodimers, heterotrimers, heterotetramers, etc. where the subject polypeptide is present in a complex with proteins other than the subject polypeptide.
  • Where the multimer is a heteromultimer, the subject polypeptide can be present in a 1:1 ratio, a 1:2 ratio, a 2:1 ratio, or other ratio, with the other protein(s).
  • polypeptides from other species are also provided, including mammals, such as: primates, rodents, e.g., mice, rats, hamsters, guinea pigs; domestic animals, e.g., sheep, pig, horse, cow, goat, rabbit, dog, cat; and humans, as well as non-mammalian species, e.g., avian, reptile and amphibian, insect, crustacean, fish, plant, fungus, and protozoa.
  • By homolog is meant a protein having at least about 35%, at least about 40%, at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, or at least about 95%, or higher, amino acid sequence identity to the reference polypeptide, as measured with the "GAP" program (part of the Wisconsin Sequence Analysis Package available through the Genetics Computer Group, Inc. (Madison WI)), where the parameters are: Gap weight: 12; length weight: 4.
  • GAP Global Analysis Program
  • homology will be at least about 75%, at least about 80%, or at least 85%, where in certain embodiments of interest, homology will be as high as about 90%.
  • polypeptides that are substantially identical to the at least one amino acid sequence shown in the Sequence Listing, or a fragment thereof, whereby substantially identical is meant that the protein has an amino acid sequence identity to the reference sequence of at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%.
  • proteins of the subject invention (e.g., polypeptides encoded by the nucleotide sequences shown in SEQ ID NO: 1 - 123, and polypeptide sequences shown in SEQ ID NO: 124 - 246) have been separated from their naturally occurring environment and are present in a non-naturally occurring environment.
  • the proteins are present in a composition where they are more concentrated than in their naturally occurring environment.
  • purified polypeptides are provided.
  • Fusion proteins can comprise a subject polypeptide, or fragment thereof, and a polypeptide other than a subject polypeptide ("the fusion partner") fused in-frame at the N-terminus and/or C-terminus of the subject polypeptide, or internally to the subject polypeptide.
  • Fusion partners can also be those that are able to stabilize the present polypeptide, such as polyethylene glycol ("PEG") and a fragment of an immunoglobulin, such as the Fc fragment of IgG, IgE, IgA, IgM, and/or IgD.
  • Detection methods are chosen based on the detectable fusion partner.
  • Where the fusion partner provides an immunologically recognizable epitope, an epitope-specific antibody can be used to quantitatively detect the level of polypeptide.
  • Where the fusion partner provides a detectable signal, the detection method is chosen based on the type of signal generated by the fusion partner. For example, where the fusion partner is a fluorescent protein, fluorescence is measured.
  • Where the fusion partner is an enzyme that yields a detectable product, the product can be detected using an appropriate means.
  • β-galactosidase can, depending on the substrate, yield a colored product that can be detected with a spectrophotometer, and luciferase can yield a luminescent product detectable with a luminometer.
  • a polypeptide of the invention comprises at least about 5, at least about 8, at least about 10, at least about 15, at least about 18, at least about 20, at least about 25, at least about 30, at least about 50, at least about 75, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, at least about 500, at least about 550, at least about 600, at least about 650, at least about 700, at least about 750, at least about 800, at least about 850, at least about 900, at least about 950, or at least about 1000 contiguous amino acid residues of at least one of the sequences according to SEQ ID NO: 124 - 246, up to and including the entire amino acid sequence.
  • Fragments of the subject polypeptides, as well as polypeptides comprising such fragments, are also provided. Fragments of polypeptides of interest will typically be at least about 5, at least about 8, at least about 10, at least about 15, at least about 18, at least about 20, at least about 25, at least about 30, at least about 50, at least about 75, at least about 100, at least about 150, at least about 200, at least about 250, or at least 300 aa in length or longer, where the fragment will have a stretch of amino acids that is identical to the subject protein of at least about 5, at least about 8, at least about 10, at least about 15, at least about 18, at least about 20, at least about 25, at least about 30, or at least about 50 aa in length.
  • fragments exhibit one or more activities associated with a corresponding naturally occurring polypeptide. Fragments find utility in generating antibodies to the full-length polypeptide; and in methods of screening for candidate agents that bind to and/or modulate polypeptide activity. Specific fragments of interest include those with enzymatic activity, those with biological activity including the ability to serve as an epitope or immunogen, and fragments that bind to other proteins or to nucleic acids.
  • the invention provides polypeptides comprising such fragments, including, e.g., fusion polypeptides comprising a subject polypeptide fragment fused in frame (directly or indirectly) to another protein (the "fusion partner"), such as the signal peptide of one protein being fused to the mature polypeptide of another protein.
  • fusion proteins are typically made by linking the encoding polynucleotides together in a vector or cassette.
  • immunologically detectable proteins
  • Polypeptides of the invention can be obtained from naturally occurring sources or produced synthetically.
  • the sources of naturally occurring polypeptides will generally depend on the species from which the protein is to be derived, i.e., the proteins will be derived from biological sources that express the proteins.
  • the subject proteins can also be derived from synthetic means, e.g., by expressing a recombinant gene encoding a protein of interest in a suitable system or host or enhancing endogenous expression, as described in more detail above. Further, small peptides can be synthesized in the laboratory by techniques well known in the art.
  • In all cases, the product can be recovered by any appropriate means known in the art.
  • a lysate can be prepared from the original source (e.g., a cell expressing endogenous polypeptide, or a cell comprising the expression vector expressing the polypeptide(s)), and purified using HPLC, exclusion chromatography, gel electrophoresis, or affinity chromatography, and the like.
  • the invention thus also provides methods of producing polypeptides.
  • the methods generally involve introducing a nucleic acid construct into a host cell in vitro and culturing the host cell under conditions suitable for expression, then harvesting the polypeptide, either from the culture medium or from the host cell, (e.g., by disrupting the host cell), or both, as described in detail above.
  • the invention also provides methods of producing a polypeptide using cell-free in vitro transcription/translation methods, which are well known in the art, also as provided above.
  • polypeptides, including polypeptide fragments, as targets for therapeutic intervention, including use in screening assays, for identifying agents that modulate polypeptide level and/or activity, and as targets for antibody and small molecule therapeutics, for example, in the treatment of disorders.
  • FP ID SEQ.ID.NO. (N1) SEQ.ID.NO. (P1)
  • HG1009896 SEQ.ID.NO.86 SEQ.ID.NO.209
  • HG1011790 SEQ.ID.NO.87 SEQ.ID.NO.210
  • HG1011827 SEQ.ID.NO.88 SEQ.ID.NO.211
  • HG1011828 SEQ.ID.NO.89 SEQ.ID.NO.212
  • HG1011830 SEQ.ID.NO.90 SEQ.ID.NO.213
  • HG1011833 SEQ.ID.NO.91 SEQ.ID.NO.214
  • HG1011834 SEQ.ID.NO.92 SEQ.ID.NO.215
  • HG1009552P1 0.02 (1-677) (26-677) (1-25) 2 (50-72)(131-150) (1-49)(73-130)(151-677)
  • HG1009568P1 0 35-561) (1-561) 1 (465-487) (1-464)(488-561)
  • HG1009607P1 0 (1-438) 1 (346-368) (1-345)(369-438)
  • HG1009798P1 0.01 (1-300) (47-300) (14-46) 1 (45-67) (1-44)(68-300)
  • HG1009800P1 0 (1-424) 1 (33-55) (1-32)(56-424)
  • HG1009834P1 0 (1-691) 1 (42-61) (1-41)(62-691)
  • HG1011836P1 0 (1-135) 1 (115-134) (1-114)(135-135)
  • HG1009530P1 258gi
  • DGK-alpha Diglyceride kinase
  • DAG kinase alpha 80 (DGK-alpha)
  • DAG . kDa diacylglycerol kinase alpha 80 kDa kinase diacylglycerol kinase
  • HG1009618P1 297gi
  • HG1009632P1 349 gi
  • HG1009657P1 551 gi
  • HG1009662P1 206 gi
  • HG1009679P1 185 gi
  • HG1009699P1 455 gi
  • HG1009765P1 1126 gi
  • HG1009776P1 833gi
  • HG1009847P1 1134gi
  • HG1009888P1 171 gi
  • HG1011828P1 72gi
  • HG1011859P1 79gi
  • HG1011868P1 101 gi
  • HG1011870P1 110 gi
  • HG1011872P1 463gi
  • HG1009659P1 ABC_tran (1812-1993)
  • HG1009834P1 Ribosomal_L7Ae (547-641)
  • HG1011868P1 Ribosomal_L35Ae (5-101)
  • Protein synthesis is performed using the PROTEIOS™ wheat germ cell-free protein synthesis system described by Madin et al., 2000. This system layers a "buffer mix" layer containing lower molecular weight substances such as amino acids and energy sources, e.g., inter alia, ATP and GTP, over a "reaction mix" layer containing wheat germ extract, mRNA, creatine kinase, and creatine phosphate.
  • One or more of the amino acids can be labeled for detection, e.g., with a radioactive label.
  • the buffer mix layer and the reaction mix layer mix together gradually, providing a continuously replenishing source of energy and amino acids over approximately 1-24 hours, during which protein synthesis proceeds.
  • plasmid DNA is prepared from a target gene open reading frame operably linked to a promoter.
  • the plasmid vector is one suitable for expression in plants, e.g., the pEU vector (PROTEIOS™ Plasmid Set).
  • DNA is transcribed into mRNA, which is in turn translated to polypeptides in a wheat germ extract with an energy replenishing system.
  • a reaction mix is prepared by mixing 33.5 µl mRNA at a concentration of about 0.3 to about 0.4 µg/µl, and 1.8 µl distilled water with 10.0 µl wheat germ extract, 1.0 µl RNase inhibitor (40 U/µl), 1.7 µl creatine kinase (10 mg/ml), and 2.0 µl Buffer #2 as provided by the PROTEIOS™ wheat germ cell-free protein synthesis core kit, Invitrotech Co., Ltd. (Kyoto, Japan).
  • the reaction mix is placed under 250 µl of "buffer mix" as provided by the PROTEIOS™ kit in a flat-bottom reaction well. The wells are sealed to prevent vaporization, and the reaction proceeds at 23-26°C for 16 hours.
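The bilayer volumes above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original protocol: the dictionary keys and function names are illustrative, and the component volumes are copied from the text. It confirms that the listed components add up to a 50 µl reaction mix and gives the mRNA mass implied by the stated concentration range.

```python
# Component volumes (µl) of the reaction mix, copied from the text above.
# Names of variables and functions are illustrative assumptions.
REACTION_MIX_UL = {
    "mRNA (0.3-0.4 µg/µl)": 33.5,
    "distilled water": 1.8,
    "wheat germ extract": 10.0,
    "RNase inhibitor (40 U/µl)": 1.0,
    "creatine kinase (10 mg/ml)": 1.7,
    "Buffer #2": 2.0,
}
BUFFER_MIX_UL = 250.0  # overlay volume stated in the text


def total_volume_ul(components):
    """Sum the component volumes, rounded to 0.1 µl."""
    return round(sum(components.values()), 1)


def mrna_mass_range_ug(volume_ul=33.5, low_ug_per_ul=0.3, high_ug_per_ul=0.4):
    """Mass of mRNA (µg) implied by the stated concentration range."""
    return (round(volume_ul * low_ug_per_ul, 2),
            round(volume_ul * high_ug_per_ul, 2))


if __name__ == "__main__":
    total = total_volume_ul(REACTION_MIX_UL)
    print("reaction mix volume (µl):", total)
    print("buffer-to-reaction volume ratio:", BUFFER_MIX_UL / total)
    print("mRNA mass range (µg):", mrna_mass_range_ug())
```

Under these figures the buffer mix is five times the volume of the reaction mix, which is consistent with the overlay acting as a reservoir of amino acids and energy sources.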
  • Cell-free extract is prepared from wheat embryos by the method of Madin et al., 2000. This extract (300 µl) is added to the following components to a final volume of 500 µl reaction mixture: Hepes/KOH, pH 7.8 (24 mM), ATP (1.2 mM), GTP (0.25 mM), creatine phosphate (16 mM), creatine kinase (0.45 mg/ml), DTT (2 mM), spermidine (0.4 mM), each of the 20 amino acids including 2 µCi/ml [14C] leucine (0.3 mM), magnesium acetate (2.5 mM), potassium acetate (100 mM), 50 µg/ml deacylated tRNA prepared from wheat embryos, Nonidet P-40 (0.05%), E-64 proteinase inhibitor (1 µM), NaN3 (0.005%), and mRNA corresponding to the polypeptide of interest (0.02 nmol).
  • a dialysis bag with this reaction mixture is added to 5 ml of dialysate solution containing all of the ingredients of the reaction mixture with the exception of creatine kinase. Dialysis is conducted at 23°C. At 24 hour intervals the reaction mixture is supplemented with 0.05 nmol of mRNA corresponding to the polypeptide of interest and 50 µg creatine kinase. The dialysate solution is replaced every 24 hours. This method has been previously demonstrated to produce protein at a linear rate for a minimum of 72 hours and to be suitable for large-scale preparation of proteins (Madin et al., 2000).
  • transmembrane polypeptides are expressed on Nanodiscs™ directly upon their production by the cell-free translation system described below in the absence of detergent.
  • the transmembrane polypeptides are expressed on Nanodiscs™ directly upon their production by the cell-free translation system described below in the presence of detergent.
  • the types of detergent and their concentrations beneficial for promoting transmembrane protein synthesis may need to be determined for each individual protein.
  • the detergents NP-40 (0.05%), Tween 20 (0.1%), Tween 80 (0.1%), Octylglucoside (0.2%), and Chaps (0.1%) were compatible with the production of proteins in the cell-free expression system described below.
  • Octylglucoside at a concentration of 0.5% inhibited protein synthesis, and 0.05% NP-40 was inhibitory in some experiments.
  • a 38 nucleotide primer is designed and synthesized which contains the following nineteen nucleotides "5'CCACCCACCACCACCAATG 3'" followed by nucleotides predicted to encode the amino terminus of the transmembrane polypeptide of interest.
  • a second reverse primer is designed to a region of the plasmid (containing the cDNA encoding the protein to be expressed) approximately 400 nucleotides downstream from the coding sequence of the gene to be expressed.
  • the second primer is designed as the reverse complement of the vector sequence in this region such that this primer will be useful for doing PCR amplification of the coding sequence of the open reading frame to be expressed.
  • the second primer is typically 17-23 nucleotides in length with a Tm of approximately 55-65°C.
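The two-primer design described above can be sketched in code. This is an illustrative sketch only: it assumes that the ATG at the end of the nineteen-nucleotide sequence serves as the start codon, so that the remaining nineteen nucleotides are taken from the open reading frame immediately after its own ATG. The function names and the example sequences in the usage lines are hypothetical.

```python
# 19-nt 5' sequence quoted in the text; it ends in ATG.
FIVE_PRIME_TAG = "CCACCCACCACCACCAATG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def forward_primer(orf, total_len=38):
    """Build the 38-nt forward primer: 19-nt tag plus ORF bases after the ATG.

    Assumption: the tag's terminal ATG replaces the ORF's own start codon.
    """
    orf = orf.upper()
    if not orf.startswith("ATG"):
        raise ValueError("ORF is expected to begin with ATG")
    n_gene = total_len - len(FIVE_PRIME_TAG)
    gene_part = orf[3:3 + n_gene]
    if len(gene_part) < n_gene:
        raise ValueError("ORF too short for the requested primer length")
    return FIVE_PRIME_TAG + gene_part


def reverse_primer(vector_region):
    """Reverse primer: reverse complement of a 17-23 nt vector region."""
    if not 17 <= len(vector_region) <= 23:
        raise ValueError("the text calls for a 17-23 nt primer")
    return reverse_complement(vector_region)


if __name__ == "__main__":
    hypothetical_orf = "ATG" + "GCT" * 10          # made-up coding sequence
    hypothetical_vector = "GATCCGAATTCGAGCTCCGT"   # made-up 20-nt vector region
    print(forward_primer(hypothetical_orf))
    print(reverse_primer(hypothetical_vector))
```

The melting temperature of the resulting reverse primer would still need to be checked against the 55-65°C window stated in the text.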
  • a purified plasmid containing the cDNA to be expressed or E. coli cells containing the plasmid that contains the cDNA to be expressed is then added as template to a standard PCR reaction that includes the two primers described above, standard PCR reagents, and a DNA polymerase that has proof-reading activity, and subjected to 15-30 cycles of PCR amplification.
  • the product of this PCR reaction is called "PCR1 coding template.”
  • a separate PCR reaction is used to prepare a "GST-Mega primer" for a GST-fusion expression template.
  • a PCR reaction is prepared using the primer 5'GGTGACACTATAGAACTCACCTATCTCCCCAACA 3' and the primer 5'GGGCCCCTGGAACAGAACTTC 3' and amplified in a standard PCR reaction that includes the two primers described above, standard PCR reagents, and a DNA polymerase that has proof-reading activity, then subjected to 15-30 cycles of PCR amplification.
  • the PCR product is subjected to exonuclease I treatment for 30 min at 37°C, then heat-inactivated at 80°C for 30 min, and the PCR product purified by agarose gel electrophoresis and extracted using a gel purification kit (Amersham) to produce the "GST-Mega primer."
  • the "GST-Mega primer" is then used to create a GST-fusion expression template by combining it with the product of the first PCR reaction (PCR1 coding template) containing the coding region of the cDNA to be expressed.
  • An aliquot of the PCR1 coding template (0.5 µl) is mixed with an aliquot of the GST-Mega primer (1 µl) and a primer 5'GCGTAGCATTTAGGTGACACT 3' that encodes part of the SP6 promoter sequence and anneals to the 5' end of the GST-Mega primer, and a second primer that is designed to a region of the plasmid approximately 300-350 nucleotides downstream from the coding sequence of the gene to be expressed.
  • This second primer is designed as the reverse complement of the vector sequence in this region such that this primer will be useful for doing PCR amplification from the PCR1 coding template.
  • This second primer is typically 17-23 nucleotides in length with a Tm of approximately 55-65 °C.
  • the "GST-fusion expression template” is then generated by doing a standard PCR reaction using standard PCR reagents, and a DNA polymerase that has proof-reading activity and subjected to 15-30 cycles of PCR amplification.
  • An in vitro transcription reaction (50 µl) is then prepared using 5 µl of the GST-fusion expression template in the following buffer: 80 mM Hepes/KOH pH 7.8, 16 mM Mg(OAc)2, 2 mM spermidine, 10 mM DTT, containing 1 unit of SP6 (Promega) and 1 unit of RNasin (Promega), and incubated for 3 hours at 37°C.
  • the mRNA is then subjected to ethanol precipitation by the addition of 200 µl of RNase-free water, 37.5 µl of 5 M ammonium acetate, and 862 µl of 99% ethanol, mixed by vortexing and then pelleted by centrifugation at 15,000 x g for 10 min at 4°C.
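The final concentrations in the precipitation step can be estimated from the stated volumes. The sketch below assumes that the entire 50 µl transcription reaction is precipitated and that volumes are simply additive (ethanol and water mixtures contract slightly in practice); both are assumptions, and the function name is illustrative.

```python
def precipitation_summary(reaction_ul=50.0, water_ul=200.0,
                          acetate_ul=37.5, acetate_stock_molar=5.0,
                          ethanol_ul=862.0, ethanol_stock_fraction=0.99):
    """Estimate final volume and concentrations, assuming additive volumes."""
    total_ul = reaction_ul + water_ul + acetate_ul + ethanol_ul
    return {
        "total_ul": total_ul,
        "ethanol_percent": 100.0 * ethanol_ul * ethanol_stock_fraction / total_ul,
        "ammonium_acetate_molar": acetate_ul * acetate_stock_molar / total_ul,
    }


if __name__ == "__main__":
    for name, value in precipitation_summary().items():
        print(f"{name}: {value:.3f}")
```

With the volumes from the text this gives a total of 1149.5 µl, roughly 74% ethanol, and about 0.16 M ammonium acetate.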
  • the mRNA pellet is then washed in 70% ethanol and again pelleted by centrifugation at 15,000 x g for 5 min at 4°C.
  • the in vitro translation reaction is performed with a stock of 2x Dialysis Buffer of 20 mM Hepes buffer pH 7.8 (KOH), 200 mM KOAc, 5.4 mM Mg(OAc)2, 0.8 mM spermidine, 100 µM DTT, 2.4 mM ATP, 0.5 mM GTP, 32 mM creatine phosphate, 0.02% NaN3, and 0.6 mM Amino Acid Mix minus ASP, TRP, GLU, ILE, LEU, PHE, and TYR.
  • amino acids ASP, TRP, GLU, ILE, LEU, PHE, and TYR are prepared separately as an 80 mM stock in 1 N HCl and after complete dissolution are added to a final concentration of 0.6 mM. After addition of all ingredients, the 2x Dialysis Buffer stock is adjusted to pH 7.6 using 5 N KOH, filter sterilized, and stored frozen in aliquots at -80°C.
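The dilution arithmetic behind the buffer stock can be illustrated as follows. This is a sketch under stated assumptions: the 10 ml batch size in the usage lines is hypothetical, the dictionary labels are illustrative, and "1x" is taken to mean a two-fold dilution of the 2x stock.

```python
# Concentrations of the 2x Dialysis Buffer stock, copied from the text.
TWO_X_STOCK = {
    "Hepes-KOH pH 7.8 (mM)": 20.0,
    "KOAc (mM)": 200.0,
    "Mg(OAc)2 (mM)": 5.4,
    "spermidine (mM)": 0.8,
    "DTT (µM)": 100.0,
    "ATP (mM)": 2.4,
    "GTP (mM)": 0.5,
    "creatine phosphate (mM)": 32.0,
    "NaN3 (%)": 0.02,
    "each amino acid (mM)": 0.6,
}


def dilute(concentrations, factor=2.0):
    """Concentrations after diluting a stock by the given factor."""
    return {name: value / factor for name, value in concentrations.items()}


def stock_volume_ul(stock_mm, final_mm, final_volume_ul):
    """Volume of stock needed, from C1 * V1 = C2 * V2."""
    return final_mm * final_volume_ul / stock_mm


if __name__ == "__main__":
    one_x = dilute(TWO_X_STOCK)
    print("1x ATP (mM):", one_x["ATP (mM)"])
    # Hypothetical 10 ml batch of 2x stock: volume of the 80 mM amino acid
    # stock needed to reach 0.6 mM.
    print("80 mM stock per 10 ml (µl):", stock_volume_ul(80.0, 0.6, 10_000.0))
```

The halved values for ATP (1.2 mM), GTP (0.25 mM), creatine phosphate (16 mM), and potassium acetate (100 mM) agree with the concentrations given above for the dialysis-mode reaction mixture.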
  • a 50 µl translation mixture is prepared that includes Wheat Germ Reagent at a final OD260nm of 60 plus the volume of 1x Dialysis Buffer (to which 2 mM DTT has been added) that brings the final volume to 50 µl (Wheat Germ Reagent comprises a concentration of 1x Dialysis Buffer).
  • the complete translation mixture containing the resuspended mRNA is then layered under 250 µl of 1x Dialysis Buffer present in one well of a 96-well round-bottom microtiter plate to set up the Bilayer Reaction.
  • the plate is then sealed manually with a plate seal and incubated for 20 hours at 26°C.
  • the translation mixture is transferred to a tube and 10 ⁇ l of glutathione-Sepharose is added and incubated, with mixing, for 1 hour at 4°C.
  • the Sepharose beads containing the bound GST fusion protein are then washed three times in phosphate buffered-saline containing 0.25 M sucrose and 2 mM DTT.
  • a fourth wash is then performed with protease cleavage buffer containing 50 mM Tris pH 7.4, 150 mM NaCl, 1 mM EDTA, 2 mM DTT, and 0.25 M sucrose.
  • the melting point of the first 20 to 24 bases of the primer can be calculated by counting total A and T residues, then multiplying by 2.
  • the melting point of the first 20 to 24 bases of the reverse complement with the sequences written from 5-prime to 3-prime can be calculated by counting the total G and C residues, then multiplying by 4. Both start and stop codons can be present in the final amplified clone.
  • the length of the primers is such to obtain melting temperatures within 63 degrees C to 68 degrees C. Adding the bases "CACC" to the forward primer renders it compatible for cloning the PCR product with the TOPO pENTR/D (Invitrogen, CA).
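Read together, the two counting rules above resemble the common estimate of 2°C per A or T plus 4°C per G or C. The sketch below assumes that the two contributions are summed over the first 20 to 24 bases; that reading is an assumption, as are the function names. The example primers are the two 21-nucleotide sequences quoted elsewhere on this page.

```python
def estimated_tm_c(primer, window=24):
    """Estimate Tm (°C) as 2*(A+T) + 4*(G+C) over the first `window` bases."""
    bases = primer.upper().replace(" ", "")[:window]
    at = bases.count("A") + bases.count("T")
    gc = bases.count("G") + bases.count("C")
    if at + gc != len(bases):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


def in_target_range(tm_c, low=63, high=68):
    """True if the estimate falls in the 63-68 °C window given in the text."""
    return low <= tm_c <= high


if __name__ == "__main__":
    for primer in ("GCGTAGCATTTAGGTGACACT", "GGGCCCCTGGAACAGAACTTC"):
        tm = estimated_tm_c(primer)
        print(primer, tm, in_target_range(tm))
```

By this estimate the first primer comes to 62°C (11 A/T, 10 G/C) and the second to 68°C (8 A/T, 13 G/C). Those primers belong to a different protocol with its own 55-65°C target, so the range check here is purely illustrative.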
  • cDNA can be prepared by the following method. Between 200 ng and 1.0 µg mRNA is added to 2 µl DMSO and the volume adjusted to 11 µl with DEPC-treated water. One µl Oligo dT is added to the tube, and the mixture is heated at 70° C for 5 min., quickly chilled on ice for 2 min., and the mixture is collected at the bottom of the tube by brief centrifugation.
  • 1st strand components are then added to the mRNA mixture: 2 µl 10X Stratascript (Stratagene, CA) 1st strand buffer, 1 µl 0.1 M DTT, 1 µl 10 mM dNTP mix (10 mM each of dG, dA, dT and dCTP), 1 µl RNAse inhibitor, 3 µl Stratascript RT (50 U/µl).
  • the contents are gently mixed and the mixture collected by brief centrifugation.
  • the mixture is incubated in a 42° C water bath for 1 hour, then placed in a 70° C water bath for 15 min.
  • RNAse H is then added to the tube, the contents are mixed well, incubated at 37° C in a water bath for 20 min., and centrifuged briefly in a microfuge to collect the reaction product at the bottom of the reaction vessel.
  • the reaction mixture can proceed directly to PCR or be stored at -20° C.
  • Full length PCR can be achieved by placing the products of the reaction described in Example 7, with primers diluted to 5 µM in water, into a reaction vessel and adding a reaction mixture composed of 1x Taq buffer, 25 mM dNTP, 10 ng cDNA pool, TaqPlus (Stratagene, CA) (5 U/µl), PfuTurbo (Stratagene, CA) (2.5 U/µl), and water.
  • the contents of the reaction vessel are then mixed gently by inversion 5-6 times, placed into a reservoir where 2 µl Fi/Ri primers are added, the plate sealed and placed in the thermocycler.
  • the PCR reaction comprises the following eight steps.
  • Step 1: 95° C for 3 min.
  • Step 2: 94° C for 45 sec.
  • Step 3: 0.5° C/sec to 56-60° C.
  • Step 4: 56-60° C for 50 sec.
  • Step 5: 72° C for 5 min.
  • Step 6: Go to step 2, perform 35-40 cycles.
  • Step 7: 72° C for 20 min.
  • Step 8: 4° C.
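The eight-step program above can be written down as data to estimate how long a run takes. This is a rough sketch: it counts only the programmed hold times plus, separately, the one ramp whose rate is given (step 3). All other instrument ramp times are ignored, and the function names are illustrative.

```python
INITIAL_DENATURE_S = 3 * 60    # step 1: 95 °C for 3 min
DENATURE_S = 45                # step 2: 94 °C for 45 sec
ANNEAL_S = 50                  # step 4: 56-60 °C for 50 sec
EXTEND_S = 5 * 60              # step 5: 72 °C for 5 min
FINAL_EXTEND_S = 20 * 60       # step 7: 72 °C for 20 min


def hold_seconds(cycles):
    """Total programmed hold time for 35-40 cycles (ramps excluded)."""
    if not 35 <= cycles <= 40:
        raise ValueError("the text specifies 35-40 cycles")
    per_cycle = DENATURE_S + ANNEAL_S + EXTEND_S
    return INITIAL_DENATURE_S + cycles * per_cycle + FINAL_EXTEND_S


def anneal_ramp_seconds(anneal_c, denature_c=94.0, rate_c_per_s=0.5):
    """Duration of the step 3 ramp from the denaturing to the annealing temperature."""
    if not 56 <= anneal_c <= 60:
        raise ValueError("the text specifies annealing at 56-60 °C")
    return (denature_c - anneal_c) / rate_c_per_s


if __name__ == "__main__":
    for n in (35, 40):
        total = hold_seconds(n) + n * anneal_ramp_seconds(58)
        print(f"{n} cycles: about {total / 3600:.1f} h")
```

Because the 5 minute extension dominates each cycle, the hold times alone come to a little over four hours for 35 cycles.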
  • the products can then be separated on a standard 0.8 to 1.0% agarose gel at 40 to 80 V, the bands of interest excised by cutting from the gel, and stored at -20° C until extraction.
  • the material in the bands of interest can be purified with QIAquick 96 PCR Purification Kit (Qiagen, CA) according to the manufacturer's instructions. Cloning can be performed with the pENTR/D-TOPO vector (Invitrogen, CA) according to the manufacturer's instructions.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • Biophysics (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Medicinal Chemistry (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Toxicology (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Peptides Or Proteins (AREA)
  • Pharmaceuticals Containing Other Organic And Inorganic Compounds (AREA)
  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)

Abstract

The invention provides novel polynucleotides, related polypeptides, related nucleic acid and polypeptide compositions, and related modulators, such as antibodies and small molecule modulators. The compositions of the invention are useful for diagnosis, prophylaxis, and treatment of proliferative, inflammatory, immune, infectious, metabolic, central nervous system, and bone cartilage disorders.

Description

NOVEL HUMAN POLYPEPTIDES ENCODED BY POLYNUCLEOTIDES
PRIORITY CLAIM
[001] This application is related to the following provisional applications filed in the United States Patent and Trademark Office, the disclosures of which are hereby incorporated by reference:
Figure imgf000002_0001
Figure imgf000003_0001
TECHNICAL FIELD
[002] The present invention is related generally to novel polynucleotides and novel polypeptides encoded thereby, their compositions, antibodies directed thereto, and other agonists or antagonists thereto. The polynucleotides and polypeptides are useful in diagnostic, prophylactic, and therapeutic applications for a variety of diseases, disorders, syndromes, and conditions, as well as in discovering new diagnostics, prophylactics, and therapeutics for such diseases, disorders, syndromes, and conditions (hereinafter disorders).
[003] This application also relates to the field of polypeptides that are associated with regulating cell growth and differentiation, that are over-expressed in cancer, and/or that can be associated with proliferation or inhibition of cancer growth, including hematopoietic cancers such as leukemias, lymphomas, and solid cancers such as pancreatic cancer, tracheal cancer, and lung cancer, for example, adenocarcinomas and/or squamous cell carcinomas. These polypeptides may also be associated with other conditions, such as inflammatory, immune, and metabolic disorders such as type II diabetes, as well as bone disorders, central nervous system (CNS) disorders, and microbial infections, including viral, bacterial, fungal, and parasitic disorders.
[004] This application further relates to modulators of biological activity that can specifically bind to these polynucleotides or polypeptides, or otherwise specifically modulate their activity. For example, they can directly or indirectly induce antibody-dependent cellular cytotoxicity (ADCC), complement-dependent cytotoxicity (CDC), endocytosis, apoptosis, or recruitment of other cells to effect cell activation, cell inactivation, cell growth or differentiation or inhibition thereof, and cell killing. This application yet further relates to compositions and methods for diagnosis and treatment of proliferative, inflammatory, immune, metabolic, bone, CNS, and microbial disorders.
DISCLOSURE OF INVENTION
[005] The molecules of the invention encompass a variety of different types of nucleic acids and polypeptides with different structures and functions. They can encode or comprise polypeptides belonging to different protein families (Pfam). The "Pfam" system is an organization of protein sequence classification and analysis, based on conserved protein domains; it can be publicly accessed in a number of ways, for example, at http://pfam.wustl.edu. Protein domains are portions of proteins that have a tertiary structure and sometimes have enzymatic or binding activities; multiple domains can be connected by flexible polypeptide regions within a protein. Pfam domains can comprise the N-terminus or the C-terminus of a protein, or can be situated at any point in between. The Pfam system identifies protein families based on these domains and provides an annotated, searchable database that classifies proteins into families (Bateman et al., 2002).
[006] Molecules of the invention can encode or be comprised of one, or more than one, Pfam. Molecules encompassed by the invention include the polypeptides and polynucleotides shown in the Sequence Listing and corresponding molecular sequences found at all developmental stages of an organism. Molecules of the invention can comprise genes or gene segments designated by the Sequence Listing, and their gene products, i.e., RNA and polypeptides. They also include variants of those set forth in the Sequence Listing that are present in the normal physiological state, e.g., variant alleles such as SNPs and splice variants, as well as variants that are affected in pathological states, such as disease-related mutations or sequences with alterations that lead to pathology, and variants with conservative amino acid changes. Molecules of the invention are categorized below; any given one can belong to one or more than one category.
Secreted Proteins and Related Polypeptides
[007] Secreted proteins, also referred to as secreted factors or secreted polypeptides, as used herein, include polypeptides, or active portions thereof, that are produced by cells and exported extracellularly; extracellular fragments of transmembrane proteins that are proteolytically cleaved; and extracellular fragments of cell surface receptors, which fragments may be soluble. An example of a secreted protein is HG1009657P1 herein, which is a 551 amino acid-residue polypeptide comprising a cytochrome p450 Pfam domain and is 94% homologous over the length of the polypeptide (i.e., 94% of the 551 amino acid residues are identical) to human cytochrome p450, family 26, subfamily C, polypeptide 1. The secreted proteins of the present invention include those in the Sequence Listing with a Tree Vote of 0.5 or higher; the Tree Vote is an internal designation, as described infra.
[008] Many and widely variant biological functions are mediated by a wide variety of different types of secreted proteins. Yet, despite the sequencing of the human genome, relatively few pharmaceutically useful secreted proteins have been identified and brought to the clinic or to the market. It would be advantageous to discover novel secreted proteins or polypeptides, and their corresponding polynucleotides, which have medical utility.
[009] Pharmaceutically useful secreted proteins of the present invention will have in common the ability to act as ligands for binding to receptors on cell surfaces in ligand-receptor interactions; to bind to ligands, soluble or otherwise; to inhibit ligand-receptor interactions; to trigger certain intracellular responses, such as inducing signal transduction to activate cells or inhibit cellular activity; to induce cellular growth, proliferation, or differentiation; to induce the production of other factors that, in turn, mediate such activities; or to inhibit cell activation or signaling.
[010] The cell types having cell surface receptors responsive to secreted proteins are many and various, including any cell type of any tissue origin or developmental state, for example, stem cells and progenitor cells; precursor and mature cells of the hematopoietic, hepatic, neural, lung, heart, thymic, splenic, epithelial, endothelial, pancreatic, adipose, gastrointestinal, colonic, renal, optic, olfactory, bone, cartilaginous, and musculoskeletal lineages. The hematopoietic cells can be precursor or mature red blood cells or white blood cells, including cells of the B lymphocytic (B cell), T lymphocytic (T cell), monocytic, dendritic, megakaryocytic, natural killer (NK), macrophagic, eosinophilic, and basophilic lineages. The cell types responsive to secreted proteins also include normal cells and cells implicated in pathological conditions or other disorders.
[011] Certain of the secreted proteins can stimulate T or B cell growth or differentiation by interacting with precursor T or B cells or hematopoietic progenitor cells, or bone marrow stem cells. As another example, certain secreted proteins of the present invention can maintain stem cells, progenitor cells or precursor cells in an undifferentiated state. As a further example, certain secreted proteins of the present invention can regulate bone growth by stimulation or inhibition thereof; secretion of insulin; glucose metabolism; cell proliferation; response to microbial infection; and regeneration of tissues including neural, muscular, and epithelial. Moreover, certain secreted proteins of the present invention can induce apoptosis, such as in cancer cells or inflammatory cells.
[012] Certain of the secreted proteins of the present invention have a cytochrome p450 (p450) Pfam domain (http://pfam.wustl.edu/cgi-bin/getdesc?name=p450), for example, HG1009657P1, which is a 551 amino acid-residue polypeptide that is 94% homologous over the length of the polypeptide with human cytochrome p450, family 26, subfamily C, polypeptide 1; cytochrome p450 CYP26C1, has two such domains. This polypeptide possesses the functional domain and properties of cytochrome p450. Cytochrome P450 domains are involved in the oxidative degradation of various compounds, including environmental toxins and mutagens (Degtyarenko and Archakov 1993).
[013] Certain of the secreted proteins of the present invention have a LBP/BPI/CETP family, N-terminal domain (http://pfam.wustl.edu/cgi-bin/getdesc?name=LBP_BPI_CETP), and/or a LBP/BPI/CETP family, C-terminal domain (http://pfam.wustl.edu/cgi-bin/getdesc?name=LBP_BPI_CETP_C), for example, HG1009765P1, which is a 1126 amino acid residue polypeptide that is 50% homologous over the length of the polypeptide with human antimicrobial peptide RY2G5; likely ortholog of rat probable ligand-binding protein RY2G5; long palate, lung, and nasal epithelium carcinoma associated 4 protein has two of each of such domains. This polypeptide possesses the functional domain and properties of a lipid binding serum glycoprotein. These families of proteins comprise lipopolysaccharide binding proteins, bactericidal permeability-increasing proteins, cholesteryl ester transfer proteins, and phospholipid transfer proteins. Proteins in these families share biochemical and structural similarities and serve a wide range of physiological functions (Yamashita et al., 2000).
Transmembrane Proteins and Related Polypeptides
[014] Transmembrane proteins extend into or through the cell membrane's lipid bilayer; they can span the membrane once, or more than once. Transmembrane proteins that span the membrane once are "single transmembrane proteins" (STM), and transmembrane proteins that span the membrane more than once are "multiple transmembrane proteins" (MTM). Examples of transmembrane proteins include the receptors, e.g., insulin receptors; adenylate cyclases; and ion exchangers.
[015] A single transmembrane protein typically has one transmembrane (TM) domain spanning a series of consecutive amino acid residues, and numbered on the basis of distance from the N-terminus, with the first amino acid residue at the N-terminus as number 1. A multi-transmembrane protein typically has more than one TM domain, each spanning a series of consecutive amino acid residues, numbered in the same way as the STM protein.
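Given this residue numbering, the non-transmembrane stretches of a polypeptide are simply the gaps between its TM domains. The sketch below is illustrative and uses a hypothetical function name; the example coordinates are those listed for HG1009552P1 (677 residues, two TM domains) and HG1011836P1 (135 residues, one TM domain) in the coordinate table on this page.

```python
def non_tm_segments(length, tm_domains):
    """Return 1-based, inclusive (start, end) ranges not covered by TM domains.

    `length` is the polypeptide length; `tm_domains` is a list of
    (start, end) pairs numbered from the N-terminus, residue 1 first.
    """
    segments = []
    next_free = 1
    for start, end in sorted(tm_domains):
        if start < 1 or end > length or start > end:
            raise ValueError(f"invalid TM domain: ({start}, {end})")
        if start > next_free:
            segments.append((next_free, start - 1))
        next_free = max(next_free, end + 1)
    if next_free <= length:
        segments.append((next_free, length))
    return segments


if __name__ == "__main__":
    print(non_tm_segments(677, [(50, 72), (131, 150)]))
    print(non_tm_segments(135, [(115, 134)]))
```

For the two examples this reproduces the non-TM coordinates given in the table: (1-49)(73-130)(151-677) and (1-114)(135-135).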
[016] Transmembrane proteins, having part of their molecules on either side of the bilayer, have many and widely variant biological functions. They can transport molecules, e.g., ions or proteins, across membranes, transduce signals across membranes, act as receptors, and function as antigens. Transmembrane proteins are often involved in cell signaling events; they can comprise signaling molecules, or can interact with signaling molecules. For example, tyrosine kinases can be transmembrane receptor proteins. Abnormalities of receptor tyrosine kinases are associated with human cancers; tumor cells are known to use receptor tyrosine kinases in transduction pathways to achieve tumor growth, angiogenesis and metastasis. Therefore, receptor tyrosine kinases represent pivotal targets in cancer therapy. It would be similarly advantageous to discover novel transmembrane proteins or polypeptides, and their corresponding polynucleotides that have additional medical utility. The transmembrane polypeptides of the invention, like the secreted polypeptides, also have many different functional domains, and belong to a wide variety of Pfam families.
[017] Over-expression and/or structural alteration of kinases, for example, receptor tyrosine kinase family members, is often associated with human cancers. For example, tumor cells are known to use receptor tyrosine kinases in transduction pathways to achieve tumor growth, angiogenesis and metastasis. Therefore, receptor tyrosine kinases represent pivotal targets in cancer therapy. A number of small molecule receptor tyrosine kinase inhibitors have been synthesized, are in clinical trials, are being analyzed in animal models, or have been marketed. Inhibitory mechanisms include ligand-dependent down regulation, e.g., by the adaptor Cbl (Brunelleschi et al., 2002).
[018] Certain of the transmembrane proteins of the present invention have a p450 Pfam domain, which was described supra; for example, HG1009657P1, described supra as a secreted protein, is also a transmembrane protein. These designations are consistent with a transmembrane protein having an extracellular fragment that is cleaved.
[019] Certain of the transmembrane proteins of the present invention have an adh_short Pfam domain (http://pfam.wustl.edu/cgi-bin/getdesc?name=adh_short), for example HG1009662P1, which is a 206 amino acid-residue polypeptide that is 78% homologous over the length of the polypeptide with human DKFZP566O084 protein. This polypeptide possesses the functional domain and properties of a short-chain dehydrogenase. Adh_short domains are found in a large family of proteins that includes short-chain dehydrogenases and reductase enzymes; most family members function as NAD- or NADP-dependent oxidoreductases (Jornvall et al. 1995).
[020] Certain of the transmembrane proteins of the present invention have a YT521-B-like family (YTH) Pfam domain (http://pfam.wustl.edu/cgi-bin/getdesc?name=YTH), for example HG1009699P1, which is a 455 amino acid-residue polypeptide that is 92% homologous over the length of the polypeptide with a human protein similar to HGRG8 protein. This polypeptide possesses the functional domain and properties of a YT521-B-like protein, which is a tyrosine-phosphorylated nuclear protein that functions in a signal transduction pathway to influence splice site selection. YT521-B interacts with the nuclear transcriptosomal component scaffold attachment factor B, and a 68-kDa src substrate which is associated during mitosis (Hartmann et al., 1999).
[021] Transmembrane proteins that are differentially expressed on the surface of cancer cells, particularly those that are differentially expressed on the surface of cancer cells but not on the surface of normal tissues, such as heart and lung, are desirable targets for production of antibodies, e.g., diagnostic antibodies or therapeutic antibodies, such as antibodies that mediate ADCC or CDC to effect tumor cell killing.
[022] Transmembrane proteins with extracellular fragments that can be cleaved can be useful as secreted proteins to effect ligand/receptor binding so as to mediate intracellular responses, such as signal transduction. Transmembrane proteins that act as receptors, and possess a ligand binding extracellular portion exposed on a cell surface and an intracellular portion that interacts with other cellular components upon activation, can also be useful as transmembrane proteins to mediate intracellular responses, such as signal transduction.
Other Proteins and Related Polypeptides
[023] The invention also encompasses proteins and related polypeptides that are neither secreted proteins, nor transmembrane proteins. These polypeptides possess the functional domains and properties of their Pfam domains.
[024] HG1009953P1, HG1009618P1, HG1009632P1, and HG1009888P1 comprise a reverse transcriptase (rvt) Pfam domain (http://pfam.wustl.edu/cgi-bin/getdesc?name=rvt). HG1009552P1 comprises a phorbol esters/diacylglycerol binding domain (C1 domain) (DAG_PE-bind) Pfam domain (http://pfam.wustl.edu/cgi-bin/getdesc?name=DAG_PE-bind), an efhand Pfam domain (http://pfam.wustl.edu/cgi-bin/getdesc?name=efhand), and a diacylglycerol kinase catalytic domain (DAGK_cat) Pfam domain (http://pfam.wustl.edu/cgi-bin/getdesc?name=DAGK_cat).
[025] HG1011868P1 comprises a Ribosomal_L35Ae Pfam domain (http://pfam.wustl.edu/cgi-bin/getdesc?name=Ribosomal_L35Ae). HG1011872P1 comprises an integrase DNA binding domain (integrase) Pfam domain (http://pfam.wustl.edu/cgi-bin/getdesc?name=integrase), an rnaseH Pfam domain (http://pfam.wustl.edu/cgi-bin/getdesc?name=rnaseH), an integrase core (rve) Pfam domain (http://pfam.wustl.edu/cgi-bin/getdesc?name=rve), and an integrase zinc-binding (Integrase_Zn) domain (http://pfam.wustl.edu/cgi-bin/getdesc?name=Integrase_Zn).
Invention Features
[026] The present invention features an isolated polynucleotide that encodes a polypeptide. In some embodiments, the polypeptide has at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 97%, at least about 98%, or at least about 99% amino acid sequence identity with an amino acid sequence derived from a polynucleotide sequence chosen from at least one nucleotide sequence according to SEQ ID NO: 1 - 123. In some embodiments, the polypeptide has an amino acid sequence chosen from at least one amino acid sequence according to SEQ ID NO: 124 - 246. In many embodiments, the polypeptide has at least one activity associated with the naturally occurring encoded polypeptide.
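The identity thresholds recited above can be illustrated with a simple calculation. The sketch below is only one possible convention and is not the method defined by the specification: it assumes two sequences that have already been aligned to equal length (gaps written as "-") and divides the number of identical, non-gap columns by the alignment length. Other conventions use a different denominator. The sequences in the usage lines are made up.

```python
# Thresholds recited in the text, in percent.
THRESHOLDS = (70, 75, 80, 85, 90, 95, 97, 98, 99)


def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both sequences."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to the same non-zero length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def highest_threshold_met(identity):
    """Largest recited threshold that the identity value satisfies, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


if __name__ == "__main__":
    pid = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")  # made-up sequences
    print(pid, highest_threshold_met(pid))
    # Identical residues implied by 94% over a 551-residue polypeptide.
    print(round(0.94 * 551))
```

For the 551-residue example described elsewhere on this page, 94% identity corresponds to roughly 518 identical residues.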
[027] In some embodiments, the polypeptide includes a signal peptide. In alternative embodiments, the polypeptide comprises a mature form of a protein, from which the signal peptide has been cleaved. In some embodiments, the polypeptide is a signal peptide. In a further aspect, the invention provides fragments of a polypeptide chosen from at least one amino acid sequence according to SEQ ID NO: 124 - 246, where each fragment is an extracellular fragment of the polypeptide, or an extracellular fragment of the polypeptide minus the signal peptide. The invention provides an N-terminal fragment containing a Pfam domain and a C-terminal fragment containing a Pfam domain, and either or both may be biologically active.
[028] In some other embodiments, the polypeptides function as secreted proteins. In some further embodiments, the polypeptides function as single-transmembrane proteins. In other embodiments, the polypeptides function as multiple-transmembrane proteins. In some embodiments, the polypeptides function as kinases, receptors, phosphatases, proteases, phosphodiesterases, immunoglobulin, growth factors, antigens, complement proteins, adhesion proteins, GTPase activating proteins, binding proteins, ribosylation factors, reverse transcriptases, integrases, ribosomal proteins, signaling proteins, transport proteins, phospholipid binding proteins, RNAseH, nucleotide hydrolases, transposases, transporters, RNA recognition motifs, proprotein convertases, matrix proteins, and membrane transport proteins. In some embodiments, the polypeptides function in pathological states. In some embodiments, the polypeptides function as one or more of these.
[029] The present invention features an isolated polynucleotide that hybridizes under stringent hybridization conditions to a coding region of at least one nucleotide sequence shown in SEQ ID NO: 1 - 123, or a complement thereof.
[030] The present invention features an isolated polynucleotide that shares at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 97%, at least about 98%, or at least about 99% nucleotide sequence identity with a nucleotide sequence of the coding region of at least one sequence shown in SEQ ID NO: 1 - 123, or a complement thereof. In some embodiments, a subject polynucleotide has the nucleotide sequence shown in at least one of SEQ ID NO: 1 - 123, or a coding region thereof.
[031] The present invention also features a vector, e.g., a recombinant vector, that includes a subject polynucleotide, and a promoter that drives its expression. This vector can transform a host cell, and the present invention further features such host cells, e.g., isolated in vitro host cells, and in vivo host cells, that comprise a polynucleotide of the invention, or a recombinant vector of the invention.
[032] The present invention further features a library of polynucleotides, wherein at least one of the polynucleotides comprises the sequence information of a polynucleotide of the invention. In specific embodiments, the library is provided on a nucleic acid array. In some embodiments, the library is provided in computer-readable format.
[033] The present invention features a pair of isolated nucleic acid molecules, each from about 10 to about 200 nucleotides in length. The first nucleic acid molecule of the pair comprises a sequence of at least 10 contiguous nucleotides having 100% sequence identity to at least one nucleic acid sequence shown in SEQ ID NO: 1 - 123. The second nucleic acid molecule of the pair comprises a sequence of at least 10 contiguous nucleotides having 100% sequence identity to the reverse complement of at least one nucleic acid sequence shown in SEQ ID NO: 1 - 123. The sequence of said second nucleic acid molecule is located 3' of the nucleic acid sequence of the first nucleic acid molecule shown in SEQ ID NO: 1 - 123. The pair of isolated nucleic acid molecules are useful in a polymerase chain reaction or in any other method known in the art to amplify a nucleic acid that has sequence identity to the sequences shown in SEQ ID NO: 1 - 123, particularly when cDNA is used as a template.
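The geometry of such a primer pair can be illustrated with a short sketch. It assumes a plain DNA string as the template; the template, function names, and parameters are hypothetical and are not taken from the Sequence Listing. The first primer is identical to a stretch of the template, and the second is the reverse complement of a stretch lying 3' of the first.

```python
# Complement lookup for unambiguous DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_pair(template: str, fwd_start: int, rev_end: int, length: int = 10):
    """Build a (forward, reverse) primer pair from a template.

    fwd_start: 0-based start of the forward primer on the template.
    rev_end:   0-based exclusive end of the region the reverse primer binds.
    length:    primer length; at least 10, matching the minimum recited above.
    """
    if length < 10:
        raise ValueError("primers must be at least 10 nucleotides long")
    rev_start = rev_end - length
    if fwd_start < 0 or rev_end > len(template) or rev_start < fwd_start + length:
        raise ValueError("reverse primer site must lie 3' of the forward primer")
    forward = template[fwd_start:fwd_start + length]
    reverse = reverse_complement(template[rev_start:rev_end])
    return forward, reverse


def amplicon(template: str, forward: str, reverse: str):
    """Return the template region spanned by the pair, or None if no site is found."""
    start = template.find(forward)
    if start == -1:
        return None
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(site)]
```

The sketch checks only exact matching and relative position. Practical primer design would also weigh melting temperature, self-complementarity, and specificity, none of which is modeled here.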
[034] The invention features a method of determining the presence of a polynucleotide substantially identical to a polynucleotide sequence shown in the Sequence Listing, or a complement of such a nucleotide by providing its complement, allowing the polynucleotides to interact, and determining whether such interaction has occurred.
[035] The invention further features methods of regulating the expression of the subject polynucleotides and encoded polypeptides. The invention provides a method of inhibiting transcription or translation of a first polynucleotide encoding a first polypeptide of the invention by providing a second polynucleotide that hybridizes to the first polynucleotide, and allowing the first polynucleotide to contact and bind to the second polynucleotide. The second polynucleotide can be chosen from an antisense molecule, a ribozyme, and an interfering RNA (RNAi) molecule.
[036] The present invention further features an isolated polypeptide, e.g., an isolated polypeptide encoded by a polynucleotide, and biologically active fragments of such polypeptide. In some embodiments, the polypeptide is a fusion protein. In some embodiments, the polypeptide has one or more amino acid substitutions, and/or insertions and/or deletions, compared with at least one sequence shown in SEQ ID NO: 124 - 246. In some embodiments, the polypeptide has an amino acid sequence derived from at least one nucleotide sequence shown in SEQ ID NO: 1 - 123. In some embodiments, the polypeptide has an amino acid sequence substantially identical to at least one sequence shown in SEQ ID NO: 124 - 246.
[037] The invention also provides a method of making a polypeptide of the invention by providing a nucleic acid molecule that comprises a polynucleotide sequence encoding a polypeptide of the invention, introducing the nucleic acid molecule into an expression system, and allowing the polypeptide to be produced.
[038] In some embodiments, the method involves in vitro cell-free transcription and/or translation. For example, the expression system can comprise a cell-free expression system, such as an E. coli system, a wheat germ extract system, a rabbit reticulocyte system, or a frog oocyte system.
[039] In certain other embodiments, the expression system can comprise a prokaryotic or eukaryotic cell, for example, a bacterial cell expression system, a fungal cell expression system, such as yeast or Aspergillus, a plant cell expression system, e.g., a cereal plant, a tobacco plant, a tomato plant, or other edible plant, an insect cell expression system, such as Sf9 or High Five cells, an amphibian cell expression system, a reptile cell expression system, a crustacean cell expression system, an avian cell expression system, a fish cell expression system, or a mammalian cell expression system, such as one using Chinese Hamster Ovary (CHO) cells. In some embodiments, the method involves culturing a subject host cell under conditions such that the subject polypeptide is produced by the host cells; and recovering the subject polypeptide from the culture, e.g., from within the host cells, or from the culture medium. In further embodiments, the polypeptide can be produced in vivo in a multicellular animal or plant, comprising a polynucleotide encoding the subject polypeptide.
[040] The present invention further features a non-human animal injected with at least one polynucleotide comprising at least one nucleotide sequence chosen from SEQ ID NO: 1 - 123, and/or at least one polypeptide comprising at least one amino acid sequence chosen from SEQ ID NO: 124 - 246.
[041] The present invention further features an antibody that specifically recognizes, binds to, interferes with, or modulates the biological activity of a subject polypeptide or a fragment thereof. The polypeptide can be a secreted protein, single-transmembrane protein, multiple-transmembrane protein, cytoplasmic protein or extracellular protein, or fragment thereof. The fragment can be an extracellular fragment of a subject polypeptide, or an extracellular fragment of a subject polypeptide minus the signal peptide.
[042] The present invention further features an antibody that specifically inhibits binding of a polypeptide to its ligand or substrate. It also features an antibody that specifically inhibits binding of a polypeptide as a substrate to another molecule.
[043] Another aspect of the present invention features a library of antibodies or fragments thereof, wherein at least one antibody or fragment thereof specifically binds to at least a portion of a polypeptide comprising an amino acid sequence according to SEQ ID NO: 124 - 246, and/or wherein at least one antibody or fragment thereof interferes with at least one activity of such polypeptide or fragment thereof. In certain embodiments, the antibody library comprises at least one antibody or fragment thereof that specifically inhibits binding of a subject polypeptide to its ligand or substrate, or that specifically inhibits binding of a subject polypeptide as a substrate to another molecule. The present invention also features corresponding polynucleotide libraries comprising at least one polynucleotide sequence that encodes an antibody or antibody fragment of the invention. In specific embodiments, the library is provided on a nucleic acid array or in computer-readable format.
[044] An antibody of the present invention may comprise a monoclonal antibody, polyclonal antibody, single chain antibody, intrabody, and active fragments of any of these. The active fragments include variable regions from either heavy chains or light chains. The antibody can comprise the backbone of a molecule with an immunoglobulin domain, e.g., a fibronectin backbone, a T-cell receptor backbone, or a CTLA4 backbone.
[045] The present invention further features a targeting antibody, a neutralizing antibody, a stabilizing antibody, an enhancing antibody, an antibody agonist, an antibody antagonist, an antibody that promotes cellular endocytosis of a target antigen, a cytotoxic antibody, and an antibody that mediates antibody dependent cellular cytotoxicity (ADCC). The antibody that mediates ADCC can have a cytotoxic component, e.g., a radioisotope, a radioactive molecule, a microbial toxin, a plant toxin, a chemotherapeutic agent, or a chemical substance, such as doxorubicin or cisplatin. The invention also features an inhibitory antibody, functioning to specifically inhibit the binding of a cognate polypeptide to its ligand or its substrate, or to specifically inhibit the binding of a cognate peptide as the substrate of another molecule.
[046] The antibodies of the present invention also encompass a human antibody, a non-human primate antibody, a monkey antibody, a non-primate animal antibody, e.g., a rodent antibody, rat antibody, a mouse antibody, a hamster antibody, a guinea pig antibody, a chicken antibody, a cattle antibody, a sheep antibody, a goat antibody, a horse antibody, porcine antibody, a cow antibody, a rabbit antibody, a cat antibody, or a dog antibody. It also features a humanized antibody, a primatized antibody, and a chimeric antibody.
[047] The antibodies of the invention can be produced in vitro or in vivo. For example, the present invention features an antibody produced in a cell-free expression system, a prokaryote expression system or a eukaryote expression system, as described herein.
[048] The invention further provides a host cell that can produce an antibody of the invention or a fragment thereof. The antibody may also be secreted by the cell. The host cell can be a hybridoma, or a prokaryotic or eukaryotic cell. The invention also provides a bacteriophage or other virus particle comprising an antibody of the invention, or a fragment thereof. The bacteriophage or other virus particle may display the antibody or fragment thereof on its surface, and the bacteriophage itself may exist within a bacterial cell. The antibody may also comprise a fusion protein with a viral or bacteriophage protein.
[049] The invention further provides transgenic multicellular organisms, e.g., plants or non-human animals, as well as tissues or organs, comprising a polynucleotide sequence encoding a subject antibody or fragment thereof. The organism, tissues, or organs will generally comprise cells producing an antibody of the invention, or a fragment thereof.
[050] In another aspect, the present invention features a method of making an antibody by immunizing a host animal. In this method, a polypeptide or a fragment thereof, a polynucleotide encoding a polypeptide, or a polynucleotide encoding a fragment thereof, is introduced into an animal in a sufficient amount to elicit the generation of antibodies specific to the polypeptide or fragment thereof, and the resulting antibodies are recovered from the animal. The polypeptide can be encoded by a nucleic acid molecule comprising a nucleotide sequence chosen from at least one polynucleotide sequence according to SEQ ID NO: 1 - 123. For example, the polypeptide may comprise at least one amino acid sequence chosen from SEQ ID NO: 124 - 246.
[051] The invention thus also provides a non-human animal comprising an antibody of the invention. The animal can be a non-human primate (e.g., a monkey), a rodent (e.g., a rat, a mouse, a hamster, a guinea pig), a chicken, cattle (e.g., a sheep, a goat, a horse, a pig, a cow), a rabbit, a cat, or a dog.
[052] The present invention also features a method of making an antibody by isolating a spleen from an animal injected with a polypeptide or a fragment thereof, a polynucleotide encoding a polypeptide, or a polynucleotide encoding a fragment thereof, and recovering antibodies from the spleen cells. Hybridomas can be made from the spleen cells, and hybridomas secreting specific antibodies can be selected.
[053] The present invention further features a method of making a polynucleotide library from spleen cells, and selecting a cDNA clone that produces specific antibodies, or fragments thereof. The cDNA clone or a fragment thereof can be expressed in an expression system that allows production of the antibody or a fragment thereof, as provided herein.
[054] The present invention also features a pharmaceutical composition comprising a polynucleotide, polypeptide, or modulator of the invention and a carrier. The carrier can be a pharmaceutically acceptable carrier. The modulator can be obtainable by any methods of the invention, for example, the modulator can be an antibody or a fragment thereof. Further, oral formulations, preparations for injection, aerosol formulations, and suppositories can be prepared, each comprising the polynucleotide, polypeptide, or modulator composition. Further, nucleic acid compositions comprising polynucleotide sequences encoding the subject antibodies, or fragments thereof, can be prepared for administration to a subject.
[055] The invention also features a non-human animal injected with the polynucleotide, polypeptide, or modulator composition, for example the antibody composition. Again, the animal can be a non-human primate (e.g., a monkey), a rodent (e.g., a rat, a mouse, a hamster, a guinea pig), a chicken, cattle (e.g., a sheep, a goat, a horse, a pig, a cow), a rabbit, a cat, or a dog.
INDUSTRIAL APPLICABILITY
[056] The invention provides novel polynucleotides, related novel polypeptides such as secreted polypeptides, transmembrane polypeptides, and other polypeptides, i.e., cytoplasmic and luminal polypeptides, and active fragments thereof, as well as novel nucleic acid compositions encoding these polypeptides, and compositions comprising the related polypeptides.
[057] The present invention also provides for vectors, host cells, and methods for producing the polynucleotides and polypeptides of the invention in these vectors and host cells, and in cell-free systems. The present invention further provides for antisense molecules that are capable of regulating the expression of the polynucleotides or polypeptides herein. In addition, modulators, including antibodies, that bind specifically to the polypeptides or modulate the activity of the polypeptides, are also provided.
[058] The present polynucleotides, polypeptides, and modulators find use in therapeutic agent screening/discovery applications, such as screening for receptors or competitive ligands, for use, for example, as small molecule therapeutic drugs. Also provided are methods of modulating a biological activity of a polypeptide and methods of treating associated disease conditions, particularly by administering modulators of the present polypeptides, such as small molecule modulators, antisense molecules, and specific antibodies.
[059] The present polypeptides, polynucleotides, and modulators find use in a number of diagnostic, prophylactic, and therapeutic applications. The polynucleotides and polypeptides of the invention are useful in diagnosis, and can be used in diagnostic kits. The polynucleotides and polypeptides of the invention are also useful for treating a variety of disorders, including proliferative disorders such as cancer, inflammatory disorders such as ulcerative colitis, immune disorders such as autoimmune diseases, e.g., multiple sclerosis, diseases caused by infectious and parasitic microorganisms including, for example, bacteria, fungi, prions, or viruses, metabolic disorders such as diabetes and obesity, central nervous system disorders such as Alzheimer's and Parkinson's, and bone and cartilage disorders such as osteoporosis and achondroplasia (Braunwald et al., 2001).
[060] The polynucleotides and polypeptides of the invention, and related compositions will inhibit or modulate the replication, differentiation, signaling, or other function of a pathologically important cell of the system involved in the disorder to be treated. For example, a polynucleotide or polypeptide composition of the invention can treat an immune disorder by modulating the replication, differentiation, signaling, or other function of a pathologically important cell of the immune system, such as a B-lymphocyte, T-lymphocyte, mast cell, dendritic cell, macrophage, neutrophil, basophil, or eosinophil. Subjects who suffer from a deficiency, or a lack of a particular protein, or are otherwise in need of such protein to repair or enhance a desirable function, benefit from the administration of a protein or an active fragment thereof by any conventional routes of administration. These include therapeutic vaccines in the form of nucleic acid or polypeptide vaccines, such as cancer vaccines, where the vaccines can be administered alone, such as naked DNA, or can be facilitated, such as via viral vectors, microsomes, or liposomes. Therapeutic antibodies include those that are administered alone or in combination with cytotoxic agents, such as radioactive or chemotherapeutic agents.
MODES FOR CARRYING OUT THE INVENTION
Brief Description of the Tables
[061] Each sequence shown in Tables 1-4 is identified by a Five Prime Therapeutics, Inc. (FP) identification number (FP ID). Table 1 correlates the FP ID of each nucleotide and polypeptide with the Sequence Listing. Each FP ID corresponds to two SEQ ID NOS. The first, SEQ ID NO. (N1), corresponds to the nucleotide coding sequence that encodes the polypeptide of the invention. The second, SEQ ID NO. (P1), corresponds to the amino acid sequence of the polypeptide of the invention. SEQ ID NOS. 1-123 correspond to the N1 coding sequences and SEQ ID NOS. 124-246 correspond to the P1 polypeptide sequences.
[062] Table 2 specifies the result of the algorithm described above that predicts whether the claimed FP sequence is secreted (Tree Vote). The signal peptide coordinates (Signal Peptide Coords.) are listed in terms of the amino acid residues beginning with "1" at the N-terminus of the polypeptide. The Mature Protein Coords. refer to the coordinates of the amino acid residues of the mature polypeptide after cleavage of the signal peptide. Table 2 also specifies the coordinates of an alternative form of the mature protein (Alternate Mature Protein Coords.). In instances where the mature protein start residue overlaps the signal peptide end residue, some of the amino acid residues may be cleaved off such that the mature protein does not start at the next amino acid residue from the signal peptide, resulting in the alternative mature protein coordinates. Table 2 also specifies the number of transmembrane segments. Finally, Table 2 provides the coordinates of the transmembrane and non-transmembrane sequences of the polypeptides. The transmembrane coordinates (TM Coords.) refer to the transmembrane segments of the molecule and are listed in terms of the amino acid residues beginning with "1" at the N-terminus of the polypeptide. The non-transmembrane coordinates (non-TM Coords.) refer to the amino acids that are not located in the membrane; these can include extracellular, cytoplasmic, and luminal sequences, and are listed in terms of the amino acid residues beginning with "1" at the N-terminus of the polypeptide.
[063] Table 3 specifies the predicted number of amino acid residues in each FP protein of the invention (Predicted Protein Length). Table 3 describes the characteristics of the protein in the public National Center for Biotechnology Information (NCBI) database displaying the greatest degree of similarity to each claimed sequence. The corresponding NCBI protein is described by its NCBI accession number (Top Hit Accession ID) and by the NCBI's annotation of that sequence (Top Hit Annotation). The percent identity of the Five Prime protein with the corresponding NCBI protein is listed (Top Hit %ID (relative to prediction)). The number of amino acids in this NCBI protein is specified (Length of Top Hit).
[064] Table 3 also describes the characteristics of the human protein in the NCBI database with the greatest degree of similarity to the claimed sequences. The corresponding NCBI protein is described by its NCBI accession number (Top Human Hit Accession ID), and by the NCBI's annotation of that sequence (Top Human Hit Annotation). The percent identity of the Five Prime protein with the NCBI protein is listed (Top Human Hit %ID (relative to prediction)). Finally, the number of amino acids in the NCBI protein is specified (Length of Top Human Hit).
[065] Table 4 lists the protein family (Pfam) of certain of the polypeptides of the invention. It also lists the coordinates at which each Pfam sequence can be found (Coordinates).
Definitions
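The coordinate convention described for Table 2, with residues numbered from "1" at the N-terminus, can be sketched as follows. This is an illustrative sketch only: it assumes coordinates are inclusive (start, end) pairs, assumes the mature protein begins at the residue immediately after the signal peptide (which the alternative mature form need not do), and uses hypothetical function names and example values that do not come from the Tables.

```python
def slice_coords(protein: str, start: int, end: int) -> str:
    """Return residues start..end using 1-based, inclusive coordinates."""
    if not (1 <= start <= end <= len(protein)):
        raise ValueError("coordinates fall outside the polypeptide")
    return protein[start - 1:end]


def mature_protein(protein: str, signal_end: int) -> str:
    """Residues remaining after removing a signal peptide spanning 1..signal_end.

    Assumes cleavage immediately after the signal peptide.
    """
    if not (0 <= signal_end <= len(protein)):
        raise ValueError("signal peptide end falls outside the polypeptide")
    return protein[signal_end:]


def non_tm_segments(length: int, tm_coords):
    """Derive 1-based inclusive non-transmembrane ranges from TM ranges."""
    segments = []
    cursor = 1  # first residue not yet assigned to a segment
    for start, end in sorted(tm_coords):
        if start > cursor:
            segments.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if cursor <= length:
        segments.append((cursor, length))
    return segments
```

For a hypothetical 100-residue polypeptide with transmembrane segments at 30-50 and 70-90, `non_tm_segments(100, [(30, 50), (70, 90)])` yields the three flanking ranges; whether each range is extracellular, cytoplasmic, or luminal is not something the coordinates alone determine.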
[066] "Related sequences" include nucleotide and amino acid sequences that are involved in the function of their referent. For example, "receptor-related sequences" include all sequences that are involved in receptor function. This includes, but is not limited to, sequences that are involved in receptor synthesis, receptor regulation, receptor effector function, and receptor degradation. "Related sequences" also encompass complementary nucleic acid sequences, and biologically active fragments of nucleic acid and amino acid sequences.
[067] The terms "polynucleotide," "nucleotide," "nucleic acid," "polynucleic molecule," "nucleotide molecule," "nucleic acid molecule," "nucleic acid sequence," "polynucleotide sequence," and "nucleotide sequence" are used interchangeably herein to refer to polymeric forms of nucleotides of any length. The polynucleotides can contain deoxyribonucleotides, ribonucleotides, and/or their analogs or derivatives. For example, nucleic acids can be naturally occurring DNA or RNA, or can be synthetic analogs, as known in the art. The terms also encompass genomic DNA, genes, gene fragments, exons, introns, regulatory sequences or regulatory elements (such as promoters, enhancers, initiation and termination regions, other control regions, expression regulatory factors, and expression controls), DNA comprising one or more single-nucleotide polymorphisms (SNPs), allelic variants, isolated DNA of any sequence, and cDNA. The terms also encompass mRNA, tRNA, rRNA, ribozymes, splice variants, antisense RNA, antisense conjugates, RNAi, and isolated RNA of any sequence. The terms also encompass recombinant polynucleotides, heterologous polynucleotides, branched polynucleotides, labeled polynucleotides, hybrid DNA/RNA, polynucleotide constructs, vectors comprising the subject nucleic acids, nucleic acid probes, primers, and primer pairs. The polynucleotides can comprise modified nucleic acid molecules, with alterations in the backbone, sugars, or heterocyclic bases, such as methylated nucleic acid molecules, peptide nucleic acids, and nucleic acid molecule analogs, which may be suitable as, for example, probes if they demonstrate superior stability and/or binding affinity under assay conditions. Analogs of purines and pyrimidines, including radiolabeled and fluorescent analogs, are known in the art. The polynucleotides can have any three-dimensional structure, and can perform any function, known or as yet unknown.
The terms also encompass single-stranded, double-stranded and triple helical molecules that are either DNA, RNA, or hybrid DNA/RNA and that may encode a full-length gene or a biologically active fragment thereof. Biologically active fragments of polynucleotides can encode the polypeptides herein, as well as anti-sense and RNAi molecules. Thus, the full length polynucleotides herein may be treated with enzymes, such as Dicer, to generate a library of short RNAi fragments which are within the scope of the present invention.
[068] The novel polynucleotides herein include those shown in the Tables, SEQ ID NO: 1 - 123, as well as those that encode the polypeptides of SEQ ID NO: 124 - 246, and biologically active fragments thereof. The polynucleotides also include modified, labeled, and degenerate variants of the nucleic acid sequences, as well as nucleic acid sequences that are substantially similar or homologous to nucleic acids encoding the subject proteins.
[069] A "biologically active" entity, or an entity having "biological activity," is one having structural, regulatory, or biochemical functions of a naturally occurring molecule or any function related to or associated with a metabolic or physiological process. Biologically active polynucleotide fragments are those exhibiting activity similar, but not necessarily identical, to an activity of a polynucleotide of the present invention. The biological activity can include an improved desired activity, or a decreased undesirable activity. For example, an entity demonstrates biological activity when it participates in a molecular interaction with another molecule, or when it has therapeutic value in alleviating a disease condition, or when it has prophylactic value in inducing an immune response to the molecule, or when it has diagnostic value in determining the presence of the molecule, such as a biologically active fragment of a polynucleotide that can be detected as unique for the polynucleotide molecule, or that can be used as a primer in PCR.
[070] The term "degenerate variant" of a nucleic acid sequence refers to all nucleic acid sequences that can be directly translated, according to the standard genetic code, to provide an amino acid sequence identical to that translated from a reference nucleic acid sequence.
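The definition of a degenerate variant can be made concrete with a short check. The sketch below builds the standard genetic code table, translates two coding sequences, and tests whether they yield identical amino acid sequences. It assumes in-frame sequences whose length is a multiple of three and containing only unambiguous bases; the function names are illustrative and not part of the specification.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(seq: str) -> str:
    """Translate an in-frame DNA or RNA coding sequence with the standard code."""
    seq = seq.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    try:
        return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))
    except KeyError as err:
        raise ValueError(f"unrecognized codon: {err.args[0]}") from None


def is_degenerate_variant(candidate: str, reference: str) -> bool:
    """True if both sequences translate to the identical amino acid sequence."""
    return translate(candidate) == translate(reference)
```

For example, `ATGGCTTAA` and `ATGGCCTAA` differ at the third position of the alanine codon yet both translate to `MA*`, so each is a degenerate variant of the other under this check.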
[071] The term "gene" or "genomic sequence" as used herein is an open reading frame encoding specific proteins and polypeptides, for example, an mRNA, cDNA, or genomic DNA, and also may or may not include intervening introns, or adjacent 5' and 3' non-coding nucleotide sequences involved in the regulation of expression up to about 20 kb beyond the coding region, and possibly further in either direction. A gene can be introduced into an appropriate vector for extrachromosomal maintenance or for integration into a host genome.
[072] The term "transgene" as used herein is a nucleic acid sequence that is incorporated into a transgenic organism. A "transgene" can contain one or more transcriptional regulatory sequences, and other sequences, such as introns, that may be useful for expressing or secreting the nucleic acid or fusion protein it encodes.
[073] The term "cDNA" as used herein is intended to include all nucleic acids that share the sequence elements of mature mRNA species, where sequence elements are exons and 3' and 5' non-coding regions. Generally, mRNA species have contiguous exons, the intervening introns having been removed by nuclear RNA splicing to create a continuous open reading frame encoding a protein.
[074] The term "splice variant" refers to all types of RNAs transcribed from a given gene that when processed collectively encode plural protein isoforms. The term "alternative splicing" and related terms refer to all types of RNA processing that lead to expression of plural protein isoforms from a single gene. Some genes are first transcribed as long mRNA precursors that are then shortened by a series of processing steps to produce the mature mRNA molecule. One of these steps is RNA splicing, in which the intron sequences are removed from the mRNA precursor. A cell can splice the primary transcript in different ways, making different "splice variants," and thereby making different polypeptide chains from the same gene, or from the same mRNA molecule. Splice variants can include, for example, exon insertions, exon extensions, exon truncations, exon deletions, alternatives in the 5' untranslated region and alternatives in the 3' untranslated region.
[075] "Oligonucleotide" may generally refer to polynucleotides of between about 5 and about 100 nucleotides of single- or double-stranded nucleic acids. For the purposes of this disclosure, there is no upper limit to the length of an oligonucleotide. Oligonucleotides are also known as oligomers or oligos and can be isolated from genes, or chemically synthesized by methods known in the art.
[076] "Nucleic acid composition" as used herein is a composition comprising a nucleic acid sequence, including one having an open reading frame that encodes a polypeptide and is capable, under appropriate conditions, of being expressed as a polypeptide. The term includes, for example, vectors, including plasmids, cosmids, viral vectors (e.g., retrovirus vectors such as lentivirus, adenovirus, and the like), human, yeast, bacterial, P1-derived artificial chromosomes (HAC's, YAC's, BAC's, PAC's, etc), and mini-chromosomes, in vitro host cells, in vivo host cells, tissues, organs, allogenic or congenic grafts or transplants, multicellular organisms, and chimeric, genetically modified, or transgenic animals comprising a subject nucleic acid sequence.
[077] An "isolated," "purified," or "substantially isolated" polynucleotide, or a polynucleotide in "substantially pure form," in "substantially purified form," in "substantial purity," or as an "isolate," is one that is substantially free of the sequences with which it is associated in nature, or other nucleic acid sequences that do not include a sequence or fragment of the subject polynucleotides. By substantially free is meant that less than about 90%, less than about 80%, less than about 70%, less than about 60%, or less than about 50% of the composition is made up of materials other than the isolated polynucleotide. For example, the isolated polynucleotide is at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 97%, or at least about 99% free of the materials with which it is associated in nature. For example, an isolated polynucleotide may be present in a composition wherein at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 97%, or at least about 99% of the total macromolecules (for example, polypeptides, fragments thereof, polynucleotides, fragments thereof, lipids, polysaccharides, and oligosaccharides) in the composition is the isolated polynucleotide. Where at least about 99% of the total macromolecules is the isolated polynucleotide, the polynucleotide is at least about 99% pure, and the composition comprises less than about 1% contaminant.
As used herein, an "isolated," "purified" or "substantially isolated" polynucleotide, or a polynucleotide in "substantially pure form," in "substantially purified form," in "substantial purity," or as an "isolate," also refers to recombinant polynucleotides, modified, degenerate and homologous polynucleotides, and chemically synthesized polynucleotides, which, by virtue of origin or manipulation, are not associated with all or a portion of a polynucleotide with which it is associated in nature, are linked to a polynucleotide other than that to which it is linked in nature, or do not occur in nature. For example, the subject polynucleotides are generally provided as other than on an intact chromosome, and recombinant embodiments are typically flanked by one or more nucleotides not normally associated with the subject polynucleotide on a naturally-occurring chromosome.
[078] The terms "polypeptide," "peptide," and "protein," used interchangeably herein, refer to a polymeric form of amino acids of any length, which can include naturally-occurring amino acids, coded and non-coded amino acids, chemically or biochemically modified, derivatized, or designer amino acids, amino acid analogs, peptidomimetics, and depsipeptides, and polypeptides having modified, cyclic, bicyclic, depsicyclic, or depsibicyclic peptide backbones. The term includes single chain proteins as well as multimers. The term also includes conjugated proteins, fusion proteins, including, but not limited to, GST fusion proteins, fusion proteins with a heterologous amino acid sequence, fusion proteins with heterologous and homologous leader sequences, fusion proteins with or without N-terminal methionine residues, pegylated proteins, and immunologically tagged proteins. Also included in this term are variations of naturally occurring proteins, where such variations are homologous or substantially similar to the naturally occurring protein, as well as corresponding homologs from different species. Variants of polypeptide sequences include insertions, additions, deletions, or substitutions compared with the subject polypeptides. The term also includes peptide aptamers.
[079] The novel polypeptides herein include amino acid sequences encoded by an open reading frame (ORF) as shown in SEQ ID NO: 124 - 246, described in greater detail below, including the full length protein and fragments thereof, particularly biologically active fragments and/or fragments corresponding to functional domains, e.g., a signal peptide or leader sequence, an enzyme active site, including a cleavage site and an enzyme catalytic site, a domain for interaction with other protein(s), a domain for binding DNA, a regulatory domain, a consensus domain that is shared with other members of the same protein family, such as a kinase family or an immunoglobulin family; an extracellular domain that may act as a target for antibody production or that may be cleaved to become a soluble receptor or a ligand for a receptor; an intracellular fragment of a transmembrane protein that participates in signal transduction; a transmembrane domain of a transmembrane protein that may facilitate water or ion transport; a sequence associated with cell survival and/or cell proliferation; a sequence associated with cell cycle arrest, DNA repair and/or apoptosis; a sequence associated with a disease or disease prognosis, including types of cancer, degenerative disease, inflammatory disease, immunological disease, genetic disease, metabolic disease, and/or bacterial or viral infection; and including fusions of the subject polypeptides to other proteins or parts thereof; modifications of the subject polypeptide, e.g., comprising modified, derivatized, or designer amino acids, modified peptide backbones, and/or immunological tags; as well as intra- and inter-species homologs of the subject polypeptides.
[080] As noted above, a "biologically active" entity, or an entity having "biological activity," is one having structural, regulatory, or biochemical functions of a naturally occurring molecule or any function related to or associated with a metabolic or physiological process. Biologically active polypeptide fragments are those exhibiting activity similar, but not necessarily identical, to an activity of a polypeptide of the present invention. The biological activity can include an improved desired activity, or a decreased undesirable activity. For example, an entity demonstrates biological activity when it participates in a molecular interaction with another molecule, or when it has therapeutic value in alleviating a disease condition, or when it has prophylactic value in inducing an immune response to the molecule, or when it has diagnostic value in determining the presence of the molecule. A biologically active polypeptide or fragment thereof includes one that can participate in a biological reaction, for example, as a transcription factor that combines with other transcription factors for initiation of transcription, or that can serve as an epitope or immunogen to stimulate an immune response, such as production of antibodies, or that can transport molecules into or out of cells, or that can perform a catalytic activity, for example polymerization or nuclease activity, or that can participate in signal transduction by binding to receptors, proteins, or nucleic acids, activating enzymes or substrates.
[081] A "signal peptide," or a "leader sequence," comprises a sequence of amino acid residues, typically at the N terminus of a polypeptide, which directs the intracellular trafficking of the polypeptide. Polypeptides that contain a signal peptide or leader sequence typically also contain a signal peptide or leader sequence cleavage site. Such polypeptides, after cleavage at the cleavage sites, generate mature polypeptides, for example, after extracellular secretion or after being directed to the appropriate intracellular compartment.
[082] "Depsipeptides" are compounds containing a sequence of at least two alpha-amino acids and at least one alpha-hydroxy carboxylic acid, which are bound through at least one normal peptide link and ester links, derived from the hydroxy carboxylic acids. "Linear depsipeptides" can comprise rings formed through S-S bridges, or through an hydroxy or a mercapto group of an hydroxy-, or mercapto- amino acid and the carboxyl group of another amino- or hydroxy-acid but do not comprise rings formed only through peptide or ester links derived from hydroxy carboxylic acids. "Cyclic depsipeptides" are peptides containing at least one ring formed only through peptide or ester links, derived from hydroxy carboxylic acids.
[083] An "isolated," "purified," or "substantially isolated" polypeptide, or a polypeptide in "substantially pure form," in "substantially purified form," in "substantial purity," or as an "isolate," is one that is substantially free of the materials with which it is associated in nature or other polypeptide sequences that do not include a sequence or fragment of the subject polypeptides. By substantially free is meant that less than about 90%, less than about 80%, less than about 70%, less than about 60%, or less than about 50% of the composition is made up of materials other than the isolated polypeptide. For example, the isolated polypeptide is at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 97%, or at least about 99% free of the materials with which it is associated in nature. For example, an isolated polypeptide may be present in a composition wherein at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 97%, or at least about 99% of the total macromolecules (for example, polypeptides, fragments thereof, polynucleotides, fragments thereof, lipids, polysaccharides, and oligosaccharides) in the composition is the isolated polypeptide. Where at least about 99% of the total macromolecules is the isolated polypeptide, the polypeptide is at least about 99% pure, and the composition comprises less than about 1% contaminant.
As used herein, an "isolated," "purified," or "substantially isolated" polypeptide, or a polypeptide in "substantially pure form," in "substantially purified form," in "substantial purity," or as an "isolate," also refers to recombinant polypeptides, modified, tagged and fusion polypeptides, and chemically synthesized polypeptides, which by virtue of origin or manipulation, are not associated with all or a portion of the materials with which they are associated in nature, are linked to molecules other than that to which they are linked in nature, or do not occur in nature.
[084] A hydrophobic polypeptide is a polypeptide having one or more hydrophobic domains. Hydrophobic polypeptides do not interact effectively with water; they are, in general, poorly soluble or insoluble in water. Hydrophobic domains comprise one or more amino acids that have aliphatic side chains, which are insoluble or only slightly soluble in water. Examples of hydrophobic amino acids are alanine, valine, leucine, isoleucine, and methionine, which are nonpolar, and phenylalanine, tyrosine, and tryptophan, which have large, bulky aromatic side groups.
[085] The term "bicyclic" refers to a peptide with two ring closures formed by covalent linkages between amino acids. A covalent linkage between two nonadjacent amino acids constitutes a ring closure, as does a second covalent linkage between a pair of adjacent amino acids which are already linked by a covalent peptide linkage. The covalent linkages forming the ring closures can be amide linkages, i.e., the linkage formed between a free amino on one amino acid and a free carboxyl of a second amino acid, or linkages formed between the side chains or "R" groups of amino acids in the peptides. Thus, bicyclic peptides can be "true" bicyclic peptides, i.e., peptides cyclized by the formation of a peptide bond between the N-terminus and the C-terminus of the peptide, or they can be "depsi-bicyclic" peptides, i.e., peptides in which the terminal amino acids are covalently linked through their side chain moieties.
[086] Detection methods of the invention can be qualitative or quantitative. Thus, as used herein, the terms "detection," "identification," "determination," and the like, refer to both qualitative and quantitative determinations, and include "measuring." For example, detection methods include methods for detecting the presence and/or level of polynucleotide or polypeptide in a biological sample, and methods for detecting the presence and/or level of biological activity of polynucleotide or polypeptide in a sample.
[087] As used herein, the term "array" or "microarray" may be used interchangeably and refers to a collection of plural biological molecules such as nucleic acids, polypeptides, or antibodies, having locatable addresses that may be separately detectable. Generally, "microarray" encompasses use of sub-microgram quantities of biological molecules. The biological molecules may be affixed to a substrate or may be in solution or suspension. The substrate can be porous or solid, planar or non-planar, unitary or distributed, such as a glass slide, a 96 well plate, with or without the use of microbeads or nanobeads. As such, the term "microarray" includes all of the devices referred to as microarrays in Schena, 1999; Bassett et al., 1999; Bowtell, 1999; Brown and Botstein, 1999; Chakravarti, 1999; Cheung et al., 1999; Cole et al., 1999; Collins, 1999; Debouck and Goodfellow, 1999; Duggan et al., 1999; Hacia, 1999; Lander, 1999; Lipshutz et al., 1999; Southern et al., 1999; Schena, 2000; Brenner et al., 2000; Lander, 2001; Steinhaur et al., 2002; and Espejo et al., 2002. Nucleic acid microarrays include both oligonucleotide arrays (DNA chips) containing expressed sequence tags ("ESTs") and arrays of larger DNA sequences representing a plurality of genes bound to the substrate, either one of which can be used for hybridization studies. Protein and antibody microarrays include arrays of polypeptides or proteins, including but not limited to, polypeptides or proteins obtained by purification, fusion proteins, and antibodies, and can be used for specific binding studies (Zhu and Snyder, 2003; Houseman et al., 2002; Schaeferling et al., 2002; Weng et al., 2002; Winssinger et al., 2002; Zhu et al., 2001; Zhu et al., 2001; and MacBeath and Schreiber, 2000).
[088] A "nucleic acid hybridization reaction" is one in which single strands of DNA or RNA randomly collide with one another, and bind to each other only when their nucleotide sequences have some degree of complementarity. The solvent and temperature conditions can be varied in the reactions to modulate the extent to which the molecules can bind to one another. Hybridization reactions can be performed under different conditions of "stringency." The "stringency" of a hybridization reaction as used herein refers to the conditions (e.g., solvent and temperature conditions) under which two nucleic acid strands will either pair or fail to pair to form a "hybrid" helix.
[089] "Tm" is the temperature in degrees Celsius at which 50% of a polynucleotide duplex made of complementary strands of nucleic acids that are hydrogen bonded in an anti-parallel direction by Watson-Crick base pairing dissociate into single strands under conditions of the hybridization reaction. Tm can be predicted according to a standard formula, such as: Tm = 81.5 + 16.6 log[X+] + 0.41 (%G/C) - 0.61 (%F) - 600/L, where [X+] is the cation concentration (usually sodium ion, Na+) in mol/L; (%G/C) is the number of G and C residues as a percentage of total residues in the duplex; (%F) is the percent formamide in solution (wt/vol); and L is the number of nucleotides in each strand of the paired nucleic acids.
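By way of illustration only, the Tm formula above can be evaluated with a short Python sketch. The function names `gc_percent` and `melting_temperature` are hypothetical and do not appear in the specification, and the logarithm is assumed to be base 10, as is conventional for this formula.

```python
import math


def gc_percent(sequence: str) -> float:
    """Return G and C residues as a percentage of all residues (hypothetical helper)."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    seq = sequence.upper()
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def melting_temperature(cation_molar: float, gc_pct: float,
                        formamide_pct: float, length: int) -> float:
    """Tm = 81.5 + 16.6 log10[X+] + 0.41(%G/C) - 0.61(%F) - 600/L, in degrees Celsius.

    cation_molar  : cation concentration [X+] in mol/L (usually Na+)
    gc_pct        : G and C residues as a percentage of total residues
    formamide_pct : percent formamide in solution (wt/vol)
    length        : number of nucleotides L in each strand of the duplex
    """
    if cation_molar <= 0 or length <= 0:
        raise ValueError("cation concentration and length must be positive")
    return (81.5
            + 16.6 * math.log10(cation_molar)   # base-10 log is an assumption
            + 0.41 * gc_pct
            - 0.61 * formamide_pct
            - 600.0 / length)


if __name__ == "__main__":
    # Example inputs are arbitrary and chosen only to exercise the formula.
    print(round(melting_temperature(1.0, 50.0, 0.0, 600), 2))
```

Lowering the cation concentration or adding formamide lowers the predicted Tm, which is how solvent conditions adjust the stringency of a hybridization reaction.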
[090] A "buffer" is a system that tends to resist change in pH when a given increment of hydrogen ion or hydroxide ion is added. Buffered solutions contain conjugate acid-base pairs. Any conventional buffer can be used with the inventions herein including but not limited to, for example, Tris, phosphate, imidazole, and bicarbonate.
[091] A "crystal" is a solid of regular shape that forms when an element or compound forms slowly enough that the individual molecules occupy regular positions with respect to one another. A crystal structure is the configuration in which the atoms of a crystal are arranged. The crystal structure of a protein can affect its physical properties. Protein crystals can, in some instances, display biological activity, indicating that the proteins have crystallized in their biologically active configuration. For example, enzyme crystals may display catalytic activity towards a substrate.
[092] A "library" of polynucleotides comprises a collection of sequence information of a plurality of polynucleotide sequences, which information is provided in either biochemical form (e.g., as a collection of polynucleotide molecules), or in electronic form (e.g., as a collection of polynucleotide sequences stored in a computer-readable form, as in a computer-based system, a computer data file, and/or as part of a computer program).
[093] A "library" of polypeptides comprises a collection of sequence information of a plurality of polypeptide sequences, which information is provided in, e.g., a collection of polypeptide sequences stored in a computer-readable form, as in a computer-based system, a computer data file, and/or as part of a computer program.
[094] "Media" refers to a manufacture, other than an isolated nucleic acid molecule, that contains the sequence information of the present invention. Such a manufacture provides the genome sequence or a subset thereof in a form that can be examined by means not directly applicable to the sequence as it exists in a nucleic acid, e.g., with computer-readable media comprising data storage structures. Such media include, but are not limited to: magnetic storage media, such as a floppy disc, a hard disc storage medium, and a magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media.
[095] "Recorded" refers to a process for storing information on computer readable media, using any such methods as known in the art.
[096] As used herein, "a computer-based system" refers to the hardware means, software means, and data storage means used to analyze the nucleotide sequence information of the present invention. The minimum hardware of the computer-based systems of the present invention comprises a central processing unit (CPU), input means, output means, and data storage means. A skilled artisan can readily appreciate that any one of the currently available computer-based systems is suitable for use in the present invention. The data storage means can comprise any manufacture comprising a recording of the present sequence information as described above, or a memory access means that can access such a manufacture.
[097] "Search means" refers to one or more programs implemented on the computer-based system, to compare a target sequence or target structural motif, or expression levels of a polynucleotide in a sample, with the stored sequence information. A variety of algorithms are publicly known and commercially available, e.g., MacPattern (EMBL), BLAST, BLASTN and BLASTX (NCBI), gapped BLAST, BLAZE, the Wise package, FASTX, ClustalW, FASTA, FASTA3, AlignO, TCoffee, BestFit, FastDB, and TeraBLAST (TimeLogic, Crystal Bay, Nevada). Search means can be used to identify fragments or regions of the genome that match a particular target sequence or target motif, for example, based on sequence similarity, for example, to identify open reading frames (ORFs) within the genome that contain homology to ORFs from other organisms.
[098] "Sequence similarity," "sequence homology," "homology," "sequence identity," and "percent sequence identity," used interchangeably herein, describe the degree of relatedness between two polynucleotide or polypeptide sequences. In general, "identity" means the exact match-up of two or more nucleotide sequences or two or more amino acid sequences, where the nucleotides or amino acids being compared are the same. Also, in general, "similarity" or "homology" means the exact match-up of two or more nucleotide sequences or two or more amino acid sequences, where the nucleotides or amino acids being compared are either the same or possess similar chemical and/or physical properties. The terms also refer to the percentage of the "aligned" bases (for the polynucleotides) or amino acid residues (for the polypeptides) that are identical when the sequences are aligned. Sequences can be aligned in a number of different ways and sequence similarity can be determined in a number of different ways. For example, the bases or amino acid residues of one sequence can be aligned to a gap in the other sequence, or they can be aligned only to another base or amino acid residue in the other sequence. A gap can range anywhere from one nucleotide, base, or amino acid residue to multiple exons in length, up to any number of nucleotides or amino acid residues. Further, sequences can be aligned such that nucleotides (or bases) align with nucleotides, nucleotides align with amino acid residues, or amino acid residues align with amino acid residues.
[099] A "target sequence" can be any polynucleotide or amino acid sequence of six or more contiguous nucleotides or two or more amino acids, for example, from about 5 or from about 10 to about 100 amino acids, or from about 15 or from about 30 to about 300 nucleotides. A variety of comparing means can be used to accomplish comparison of sequence information from a sample (e.g., to analyze target sequences, target motifs, or relative expression levels) with the data storage means. A skilled artisan can readily recognize that any one of the publicly available homology search programs can be used as the search means for the computer based systems of the present invention to accomplish comparison of target sequences and motifs. Computer programs to analyze expression levels in a sample and in controls are also known in the art. A "target sequence" includes an "antibody target sequence," which refers to an amino acid sequence that can be used as an immunogen for injection into animals for production of antibodies or for screening against a phage display or antibody library for identification of binding partners.
[0100] A "target structural motif," or "target motif," refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration that is formed upon the folding of the target motif, or on consensus sequences of regulatory or active sites. There are a variety of target motifs known in the art. Protein target motifs include, but are not limited to, enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited to, hairpin structures, promoter sequences, and other expression elements such as binding sites for transcription factors.
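As a minimal sketch only, percent sequence identity over two already-aligned sequences might be computed as follows. The function name `percent_identity`, the use of "-" as the gap symbol, and the choice to exclude gapped positions from the count are assumptions made for illustration; other conventions for treating gaps are equally possible.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of aligned (non-gap) positions at which the residues are identical.

    Both inputs must already be aligned and therefore of equal length.
    Positions where either sequence has a gap are skipped (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # a residue aligned to a gap is not counted
        compared += 1
        if res_a.upper() == res_b.upper():
            matches += 1

    if compared == 0:
        raise ValueError("no residue-to-residue positions to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment; the sequences are arbitrary examples.
    print(percent_identity("ACGT-A", "ACCTTA"))
```

A similarity score, as opposed to identity, would additionally count positions where the two residues differ but share chemical or physical properties, which requires a substitution table not shown here.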
[0101] A "matrix" is a geometric network of antibody molecules and their antigens, as found in immunoprecipitation and flocculation reactions. An antibody matrix can exist in solution or on a solid phase support.
[0102] The term "binds specifically," in the context of antibody binding, refers to high avidity and/or high affinity binding of an antibody to a specific polypeptide, or more accurately, to an epitope of a specific polypeptide. Antibody binding to such epitope on a polypeptide can be stronger than binding of the same antibody to any other epitopes, particularly other epitopes that can be present in molecules in association with, or in the same sample as the polypeptide of interest. For example, when an antibody binds more strongly to one epitope than to another, adjusting the binding conditions can result in antibody binding almost exclusively to the specific epitope and not to any other epitopes on the same polypeptide, and not to any other polypeptide, which does not comprise the epitope. Antibodies that bind specifically to a subject polypeptide may be capable of binding other polypeptides at a weak, yet detectable, level (e.g., 10% or less of the binding shown to the polypeptide of interest). Such weak binding, or background binding, is readily discernible from the specific antibody binding to a subject polypeptide, e.g., by use of appropriate controls. In general, antibodies of the invention bind to a specific polypeptide with a binding affinity of 10⁻⁷ M or greater (e.g., 10⁻⁸ M, 10⁻⁹ M, 10⁻¹⁰ M, 10⁻¹¹ M, etc.).
[0103] The term "host cell" includes an individual cell, cell line, cell culture, or in vivo cell, which can be or has been a recipient of any polynucleotides or polypeptides of the invention, for example, a recombinant vector, an isolated polynucleotide, antibody or fusion protein. Host cells include progeny of a single host cell, and the progeny may not necessarily be completely identical (in morphology, physiology, or in total DNA, RNA, or polypeptide complement) to the original parent cell due to natural, accidental, or deliberate mutation and/or change. Host cells can be prokaryotic or eukaryotic, including mammalian, insect, amphibian, reptile, crustacean, avian, fish, plant and fungal cells. A host cell includes cells transformed, transfected, transduced, or infected in vivo or in vitro with a polynucleotide of the invention, for example, a recombinant vector. A host cell which comprises a recombinant vector of the invention may be called a "recombinant host cell."
[0104] "Biological sample," "patient sample," "clinical sample," "sample," or "biological specimen," used interchangeably herein, encompasses a variety of sample types obtained from an individual, including biological fluids such as blood, serum, plasma, urine, cerebrospinal fluid, tears, saliva, lymph, dialysis fluid, lavage fluid, semen, and other liquid samples or tissues of biological origin. It includes tissue samples and tissue cultures or cells derived therefrom and the progeny thereof, including cells in culture, cell supernatants, and cell lysates. It includes organ or tissue culture derived fluids, tissue biopsy samples, tumor biopsy samples, stool samples, and fluids extracted from physiological tissues. Cells dissociated from solid tissues, tissue sections, and cell lysates are included. The definition also includes samples that have been manipulated in any way after their procurement, such as by treatment with reagents, solubilization, or enrichment for certain components, such as polynucleotides or polypeptides. Also included in the term are derivatives and fractions of biological samples. A biological sample can be used in a diagnostic, monitoring, or screening assay.
[0105] The terms "individual," "host," "patient," and "subject," used interchangeably herein, refer to a mammal, including, but not limited to, murines, simians, humans, felines, canines, equines, bovines, porcines, ovines, caprines, mammalian farm animals, mammalian sport animals, and mammalian pets. "Mammals" or "mammalian," are used broadly to describe organisms which are within the class mammalia, including the orders carnivore (e.g., dogs and cats), rodentia (e.g., mice, guinea pigs, and rats), and other mammals, including cattle, goats, sheep, cows, horses, rabbits, and pigs, and primates (e.g., humans, chimpanzees, and monkeys).
[0106] The terms "agent," "substance," "modulator," and "compound" are used interchangeably herein. These terms refer to a substance that binds to or modulates a level or activity of a subject polypeptide or a level of mRNA encoding a subject protein or nucleic acid, or that modulates the activity of a cell containing the subject protein or nucleic acid. Where the agent modulates a level of mRNA encoding a subject protein, agents include ribozymes, antisense, and RNAi molecules. Where the agent is a substance that modulates a level of activity of a subject polypeptide, agents include antibodies specific for the subject polypeptide, peptide aptamers, small molecules, agents that bind a ligand-binding site in a subject polypeptide, and the like. Antibody agents include antibodies that specifically bind a subject polypeptide and activate the polypeptide, such as receptor-ligand binding that initiates signal transduction; antibodies that specifically bind a subject polypeptide and inhibit binding of another molecule to the polypeptide, thus preventing activation of a signal transduction pathway; antibodies that bind a subject polypeptide to modulate transcription; antibodies that bind a subject polypeptide to modulate translation; as well as antibodies that bind a subject polypeptide on the surface of a cell to initiate antibody-dependent cytotoxicity ("ADCC") or to initiate cell killing or cell growth. Small molecule agents include those that bind the polypeptide to modulate activity of the polypeptide or cell containing the polypeptide in a similar fashion. The term "agent" also refers to substances that modulate a condition or disorder associated with a subject polynucleotide or polypeptide. Such agents include subject polynucleotides themselves, subject polypeptides themselves, and the like. Agents may be chosen from amongst candidate agents, as defined below.
[0107] The terms "candidate agent," "subject agent," or "test agent," used interchangeably herein, encompass numerous chemical classes, typically synthetic, semi-synthetic, or naturally occurring inorganic or organic molecules, small molecules, or macromolecular complexes. Candidate agents can be small organic compounds having a molecular weight of more than about 50 and less than about 2,500 daltons. Candidate agents can comprise functional groups necessary for structural interaction with proteins, particularly hydrogen bonding, and can include at least an amine, carbonyl, hydroxyl or carboxyl group, and can contain at least two of the functional chemical groups. The candidate agents can comprise cyclical carbon or heterocyclic structures and/or aromatic or polyaromatic structures substituted with one or more of the above functional groups. Candidate agents are also found among biomolecules, including oligonucleotides, polynucleotides, and fragments thereof, depsipeptides, polypeptides and fragments thereof, oligosaccharides, polysaccharides and fragments thereof, lipids, fatty acids, steroids, purines, pyrimidines, derivatives thereof, structural analogs, modified nucleic acids, modified, derivatized or designer amino acids, or combinations thereof.
[0108] An "agent which modulates a biological activity of a subject polypeptide," as used herein, describes any substance, synthetic, semi-synthetic, or natural, organic or inorganic, small molecule or macromolecular, pharmaceutical or protein, with the capability of altering a biological activity of a subject polypeptide or of a fragment thereof, as described herein. Generally, a plurality of assay mixtures is run in parallel with different agent concentrations to obtain a differential response to the various concentrations. Typically, one of these concentrations serves as a negative control, i.e., at zero concentration or below the level of detection. The biological activity can be measured using any assay known in the art.
[0109] An agent which modulates a biological activity of a subject polypeptide increases or decreases the activity at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 50%, at least about 100%, or at least about 2-fold, at least about 5-fold, or at least about 10-fold or more when compared to a suitable control.
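For illustration, the comparison against a suitable control described above can be sketched in Python. The names `percent_change` and `modulates`, and the default threshold of 10%, are hypothetical choices for this example; any of the recited thresholds could be substituted.

```python
def percent_change(treated: float, control: float) -> float:
    """Signed change in measured activity relative to the control, as a percentage."""
    if control == 0:
        raise ValueError("control activity must be non-zero")
    return 100.0 * (treated - control) / control


def modulates(treated: float, control: float,
              threshold_percent: float = 10.0) -> bool:
    """True if activity increases or decreases by at least the threshold.

    The 10% default mirrors the lowest threshold recited in the text; it is
    an illustrative choice, not a required value.
    """
    return abs(percent_change(treated, control)) >= threshold_percent


if __name__ == "__main__":
    # Arbitrary example readings: agent-treated sample versus control.
    print(percent_change(120.0, 100.0), modulates(120.0, 100.0))
```

A 2-fold increase corresponds to a percent change of +100% under this definition, so a fold-based threshold can be expressed by passing the equivalent percentage.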
[0110] The term "agonist" refers to a substance that mimics the function of an active molecule. Agonists include, but are not limited to, drugs, hormones, antibodies, and neurotransmitters, as well as analogues and fragments thereof.
[0111] The term "antagonist" refers to a molecule that competes for the binding sites of an agonist, but does not induce an active response. Antagonists include, but are not limited to, drugs, hormones, antibodies, and neurotransmitters, as well as analogues and fragments thereof.
[0112] The term "receptor" refers to a polypeptide that binds to a specific extracellular molecule and may initiate a cellular response.
[0113] The term "ligand" refers to any molecule that binds to a specific site on another molecule.
[0114] The term "modulate" encompasses an increase or a decrease, a stimulation, inhibition, or blockage in the measured activity when compared to a suitable control. "Modulation" of expression levels includes increasing the level and decreasing the level of an mRNA or polypeptide encoded by a polynucleotide ofthe invention when compared to a control lacking the agent being tested. In some embodiments, agents of particular interest are those which inhibit a biological activity of a subject polypeptide, and/or which reduce a level of a subject polypeptide in a cell, and/or which reduce a level of a subject mRNA in a cell and/or which reduce the release of a subject polypeptide from a eukaryotic cell. In other embodiments, agents of interest are those that increase a biological activity of a subject polypeptide, and/or which increase a level of a subject polypeptide in a cell, and/or which increase a level of a subject mRNA in a cell and/or which increase the release of a subject polypeptide from a eukaryotic cell.
[0115] An agent that "modulates the level of expression of a nucleic acid" in a cell is one that brings about an increase or decrease of at least about 1.25-fold, at least about 1.5-fold, at least about 2-fold, at least about 5-fold, at least about 10-fold, or more in the level (i.e., an amount) of mRNA and/or polypeptide following cell contact with a candidate agent compared to a control lacking the agent. [0116] "Modulating a level of active subject polypeptide" includes increasing or decreasing activity of a subject polypeptide; increasing or decreasing a level of active polypeptide protein; increasing or decreasing a level of mRNA encoding active subject polypeptide, and increasing or decreasing the release of subject polypeptide for a eukaryotic cell. In some embodiments, an agent is a subject polypeptide, where the subject polypeptide itself is administered to an individual. In some embodiments, an agent is an antibody specific for a subject polypeptide. In some embodiments, an agent is a chemical compound such as a small molecule that may be useful as an orally available drug. Such modulation includes the recruitment of other molecules that directly effect the modulation. For example, an antibody that modulates the activity of a subject polypeptide that is a receptor on a cell surface may bind to the receptor and fix complement, activating the complement cascade and resulting in lysis ofthe cell.
[0117] The term "over-expressed" refers to a state wherein there exists any measurable increase over normal or baseline levels. For example, a molecule that is over-expressed in a disorder is one that is manifest in a measurably higher level compared to levels in the absence of the disorder.
[0118] A "stem cell" is a pluripotent or multipotent cell with the ability to self-renew, to remain undifferentiated, and to become differentiated. Stem cells can divide without limit, at least for the lifetime of the animal in which they naturally reside. Stem cells are not terminally differentiated, i.e., they are not at the end of a pathway of differentiation. When a stem cell divides, each daughter cell can either remain a stem cell or it can embark on a course that leads to terminal differentiation.
[0119] An "embryonic stem cell" is a stem cell that is present in or isolated from an embryo. An "adult stem cell" is a stem cell that is present in or isolated from an adult. Either can be pluripotent, having the capacity to differentiate into each and every cell present in the organism, or multipotent, with the ability to differentiate into more than one cell type. Embryonic stem cells derived from the inner cell mass of the embryo can act as pluripotent cells when placed into host blastocysts. Adult stem cells are more frequently multipotent than pluripotent; examples of multipotent adult stem cells include hematopoietic stem cells, peripheral nervous system stem cells, central nervous system stem cells, myogenic stem cells, and mesenchymal stem cells.
[0120] A "mesenchymal stem cell" (MSC) is an adult pluripotent stem cell progenitor of multiple mesenchymal lineages, including bone, cartilage, muscle, fat tissue, marrow stroma, and astrocytes. Mesenchyme is embryonic tissue of mesodermal origin, i.e., tissue that derives from the middle of three germ layers. The mesenchyme is populated by mesenchymal cells, which are typically stellate or fusiform in shape. The embryonic mesoderm gives rise to the musculoskeletal, blood, vascular, and urogenital systems, as well as connective tissue, i.e., the dermis.
[0121] A "hematopoietic" cell is a cell involved in the process of hematopoiesis, i.e., the process of forming mature red and white blood cells from precursor cells. In the adult, hematopoiesis takes place in the bone marrow. Earlier in development, hematopoiesis takes place at different sites during different stages of development; primitive blood cells arise in the yolk sac, and later, blood cells are formed in the liver, spleen, and bone marrow. Hematopoiesis undergoes complex regulation, including regulation by hormones, e.g., erythropoietin; growth factors, e.g., colony stimulating factors; and cytokines, e.g., interleukins. While the B-lymphocytic component of white blood cells matures in the bone marrow, the T-lymphocytic component of white blood cells matures in the thymus.
[0122] "Differentiation" is a progressive developmental change to a more specialized form or function. Cell differentiation is the process a cell undergoes as it matures to become an overtly specialized cell type. Differentiated cells have distinct characteristics, perform specific functions, and are less likely to divide than their less differentiated counterparts. An "undifferentiated" cell, e.g., an immature, embryonic, or primitive cell, typically has a non-specific appearance, may perform multiple, nonspecific activities, and may perform poorly, if at all, in functions typically performed by differentiated cells.
[0123] "Dedifferentiation" is a process by which a mature cell returns to a less mature state. A "dedifferentiated cell" is one that has fewer characteristics of differentiation than it possessed at an earlier point in time. A "dedifferentiated state" is one in which a mature cell has returned or is returning to a less differentiated state, e.g., as in some cancers.
[0124] A "differentiation factor" is a factor that induces a cell to undergo a change in the direction of an overtly specialized cell type. An "anti-differentiation factor" is a factor that prevents or inhibits a cell from undergoing a change in the direction toward an overtly specialized cell type.
[0125] A "co-factor" is a molecule that acts in concert with another substance to bring about certain effects.
[0126] A "lymphokine" is a cytokine produced by a leukocyte, which acts upon another cell. Examples include interleukins, interferon-alpha, tumor necrosis factor-alpha, and granulocyte/monocyte colony-stimulating factor.
[0127] An "anti-inflammatory molecule" is a molecule that can diminish, eliminate, or prevent a response to injury or infection. For example, an antihistamine can counteract the effect of the inflammatory mediator histamine.
[0128] An "anti-cancer molecule" is a molecule that can diminish, eliminate, or prevent the effects of cancer. It includes pharmaceuticals and antibodies.
[0129] An "apoptotic molecule" is a molecule that induces a cell to move towards apoptosis, or programmed cell death. Normally functioning cells undergo apoptosis when their age or their state of health so dictates. Apoptosis is an active process requiring metabolic activity by the dying cell, often characterized by cleavage of the DNA into fragments. Cells that die by apoptosis do not generally elicit the inflammatory response associated with necrosis. Cancer cells do not typically undergo normal apoptosis.
[0130] First and second therapeutic molecules working in "conjunction" means they work in association with one another to achieve a therapeutic effect.
[0131] First and second heterologous nucleic acid sequences that "interact" with one another means they have an effect on one another such that one of the sequences influences the other. Either may act upon the other, or both may act upon each other.
[0132] A "promoter" is a region of DNA that binds RNA polymerase before initiating the transcription of DNA into RNA. The nucleotide at which transcription begins is designated +1; nucleotides are numbered from this reference point. Negative numbers indicate upstream nucleotides and positive numbers indicate downstream nucleotides. The promoter directs the RNA polymerase to bind to DNA, to open the DNA helix, and to begin RNA synthesis. Some promoters are "constitutive," and direct transcription in the absence of regulatory influences. Some promoters are "tissue specific," and initiate transcription exclusively or selectively in one or a few tissue types. Some promoters are "inducible," and effect gene transcription under the influence of an inducer. Induction can occur, e.g., as the result of a physiologic response, a response to outside signals, or as the result of artificial manipulation.
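The numbering scheme described in paragraph [0132] can be sketched in code. The Python example below is illustrative only and its function name is hypothetical; it also assumes the common convention, not stated in the text above, that there is no position 0, so that the nucleotide immediately upstream of +1 is numbered -1.

```python
def position_relative_to_tss(index: int, tss_index: int) -> int:
    """Convert a 0-based sequence index into promoter-style numbering.

    The transcription start site (at tss_index) is +1. Downstream
    nucleotides count up (+2, +3, ...); upstream nucleotides count down
    starting at -1. Skipping 0 is an assumed convention.
    """
    offset = index - tss_index
    return offset + 1 if offset >= 0 else offset


def label(position: int) -> str:
    """Format a position with an explicit sign, e.g. '+1' or '-35'."""
    return f"{position:+d}"
```

With the start site at index 10 of a sequence, index 10 is labeled +1, index 9 is labeled -1, and index 12 is labeled +3.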
[0133] A "knockout" mouse is a mouse in which a normal functional gene has been replaced by a non-functional form of the gene, so that the function of that particular gene is eliminated. Knockout mice are typically produced by transplanting embryonic stem cells heterozygous for a knockout mutation in a gene of interest and homozygous for a marker gene, e.g., black coat color, into the blastocoel cavity of embryos that are homozygous for an alternate marker, e.g., white coat color. The early embryos then are implanted into a pseudopregnant female. Some of the resulting progeny are chimeras, indicated by the phenotype produced by the marker, e.g., a black and white coat. Chimeric mice are backcrossed to mice with the alternate marker. Progeny from this mating that display the marker present in the mice with the gene of interest (e.g., black coat) have embryonic stem-derived cells in their germ line; DNA analysis of these mice can identify the mice heterozygous for the null allele of the gene of interest, i.e., the "knockout" allele. Intercrossing these heterozygous mice produces mice homozygous for the disrupted allele, i.e., "knockout" mice (Capecchi, 1989).
[0134] Gene "knockout" produces model systems for studying inherited human diseases, investigating the nature of genetic diseases and the efficacy of different types of treatment, and for developing effective gene therapies to cure these diseases. For example, a "knockout" line of mutant mice homozygous for a null allele of the cystic fibrosis transmembrane regulator gene demonstrates symptoms similar to those of humans with cystic fibrosis. These mice provide a model system for studying this genetic disease and developing effective therapies.
[0135] A "transgenic mouse" is a mouse that has stably incorporated one or more genes from another cell or organism and can pass them on to successive generations. Transgenic mice with an exogenous DNA sequence of interest integrated into their DNA are typically produced by injecting DNA containing a gene of interest into one of the two pronuclei (the male and female haploid nuclei contributed by the parents) of a fertilized mouse egg before they fuse. The injected DNA is randomly integrated into the chromosomes of the diploid zygote. Injected eggs then are transferred to foster mothers in which normal cell growth and differentiation occur. Some of the progeny will contain the exogenous DNA, and breeding and backcrossing can produce pure transgenic strains homozygous for the transgene (Brinster et al., 1981).
[0136] Transgenic mice are useful for studying various aspects of normal mammalian biology, and also provide a model system for studying disease processes. For example, many forms of cancer are promoted by normal cellular myc genes acting in a dominant fashion owing to their misregulated activity. Transgenic mice carrying the myc gene develop normally, and form tumors at a high frequency in a subset of cells that express the transgene.
[0137] A "therapeutic factor" encoded by a first heterologous nucleic acid sequence of a modified mesenchymal cell is a factor, excluding a cell survival factor (Mangi et al., 2003; WO 03/073998), that is preventative, palliative, curative, or otherwise useful in treating or ameliorating, or preventing the recurrence of a disease, disorder, syndrome or condition, and is not an anti-cancer agent.
[0138] "Telomerase" is a DNA polymerase enzyme that selectively elongates DNA from the telomere, i.e., the end of a chromosome. Telomeric DNA contains multiple, e.g., hundreds of, tandem repeats of a hexanucleotide sequence. One strand of telomeric DNA is G-rich at the 3' end, and slightly longer than the other strand. Telomeric DNA can form large duplex loops, wherein the single-stranded region at the very end of the structure loops back to form a DNA duplex with another part of the repeated sequence, displacing a part of the original telomeric duplex. This looplike structure is formed and stabilized by specific telomere-binding proteins. These structures protect and mask the end of the chromosome.
[0139] The telomeric looplike structures are generated by telomerase. The telomerase enzyme contains an RNA molecule that serves as the template for elongating the G-rich strand of telomeric DNA. Thus, the enzyme carries the information necessary to generate the telomere sequences. Telomerases also have a protein component, which is related to reverse transcriptases. Telomerases can influence cell aging, and play a role in cellular cancer biology.
[0140] "Tumor necrosis factor" (TNF) encompasses a family of receptor ligands that display pleiotropic effects on normal and malignant cells. Natural induction of TNF is protective, but its overproduction may be detrimental and even lethal to the host. TNF elicits a variety of responses in different cell types. TNF was originally characterized as an antitumor agent and a cytotoxic factor for malignant cells. It subverts the electron transport system of mitochondria to produce oxygen radicals, which can kill malignant cells lacking protective enzymes. TNF also plays a role in the defense against viral, bacterial, and parasitic infections, and in mediating autoimmune responses (Fiers, 1991). TNF inhibitors have been used to treat psoriasis (Weinberg and Saini, 2003).
[0141] "Treatment," "treating," and the like, as used herein, refer to obtaining a desired pharmacologic and/or physiologic effect, covering any treatment of a pathological condition or disorder in a mammal, including a human. The effect may be prophylactic in terms of completely or partially preventing a disorder or symptom thereof and/or may be therapeutic in terms of a partial or complete cure for a disorder and/or adverse effect attributable to the disorder. That is, "treatment" includes (1) preventing the disorder from occurring or recurring in a subject who may be predisposed to the disorder but has not yet been diagnosed as having it, (2) inhibiting the disorder, such as arresting its development, (3) stopping or terminating the disorder or at least symptoms associated therewith, so that the host no longer suffers from the disorder or its symptoms, such as causing regression of the disorder or its symptoms, for example, by restoring or repairing a lost, missing or defective function, or stimulating an inefficient process, or (4) relieving, alleviating, or ameliorating the disorder, or symptoms associated therewith, where ameliorating is used in a broad sense to refer to at least a reduction in the magnitude of a parameter, such as inflammation, pain, and/or tumor size.
[0142] A "pharmaceutically acceptable carrier," "pharmaceutically acceptable diluent," "pharmaceutically acceptable excipient," or "pharmaceutically acceptable vehicle," used interchangeably herein, refers to a non-toxic solid, semisolid or liquid filler, diluent, encapsulating material or formulation auxiliary of any conventional type. A pharmaceutically acceptable carrier is non-toxic to recipients at the dosages and concentrations employed and is compatible with other ingredients of the formulation. For example, the carrier for a formulation containing polypeptides would not normally include oxidizing agents and other compounds that are known to be deleterious to polypeptides. Suitable carriers include, but are not limited to, water, dextrose, glycerol, saline, ethanol, and combinations thereof. The carrier can contain additional agents such as wetting or emulsifying agents, pH buffering agents, or adjuvants which enhance the effectiveness of the formulation. Adjuvants of the invention include, but are not limited to, Freund's, Montanide ISA Adjuvants (Seppic, Paris, France), Ribi's Adjuvants (Ribi ImmunoChem Research, Inc., Hamilton, MT), Hunter's TiterMax (CytRx Corp., Norcross, GA), Aluminum Salt Adjuvants (Alhydrogel - Superfos of Denmark/Accurate Chemical and Scientific Co., Westbury, NY), Nitrocellulose-Adsorbed Protein, Encapsulated Antigens, and Gerbu Adjuvant (Gerbu Biotechnik GmbH, Gaiberg, Germany/C-C Biotech, Poway, CA). Topical carriers include liquid petroleum, isopropyl palmitate, polyethylene glycol, ethanol (95%), polyoxyethylene monolaurate (5%) in water, or sodium lauryl sulfate (5%) in water. Other materials such as anti-oxidants, humectants, viscosity stabilizers, and similar agents can be added as necessary. Percutaneous penetration enhancers such as Azone can also be included.
[0143] "Pharmaceutically acceptable salts" include the acid addition salts (formed with the free amino groups of the polypeptide), which are formed with inorganic acids such as, for example, hydrochloric or phosphoric acids, or such organic acids as acetic, mandelic, oxalic, and tartaric. Salts formed with the free carboxyl groups can also be derived from inorganic bases such as, for example, sodium, potassium, ammonium, calcium, or ferric hydroxides, and such organic bases as isopropylamine, trimethylamine, 2-ethylamino ethanol, and histidine.
[0144] Compositions for oral administration can form solutions, suspensions, tablets, pills, capsules, sustained release formulations, oral rinses, or powders.
[0145] The term "unit dosage form," as used herein, refers to physically discrete units suitable as unitary dosages for human and animal subjects, each unit containing a predetermined quantity of compounds of the present invention calculated in an "effective amount," that is, a dosage sufficient to produce the desired result or effect in association with a pharmaceutically acceptable carrier. The specifications for the novel unit dosage forms of the present invention depend on the particular compound employed, the host, and the effect to be achieved, as well as the pharmacodynamics associated with each compound in the host.
Compositions
[0146] The present invention provides novel isolated polynucleotides encoding polypeptides and fragments thereof. The present invention also provides novel isolated polypeptides, fragments thereof, and compositions comprising same. The present invention further provides polynucleotide compositions that can be used to identify the polypeptides.
[0147] The present invention provides recombinant vectors and host cells for use in gene expression, primer pairs for use in hybridizations, computer-based embodiments for use in bioinformatics, and transgenic animals and embryonic stem cell lines for use in mutating and regulating gene expression.
Nucleic Acids
[0148] This invention provides genes encoding proteins, the encoded proteins, and fragments and homologs thereof. It provides human polynucleotide sequences and the corresponding mouse polynucleotide sequences.
[0149] The nucleic acids of the subject invention can encode all or a part of the subject proteins. Double or single stranded fragments can be obtained from the DNA sequence by chemically synthesizing oligonucleotides in accordance with conventional methods, by restriction enzyme digestion, or by polymerase chain reaction (PCR) amplification. The use of the polymerase chain reaction has been described (Saiki et al., 1985) and current techniques have been reviewed (Sambrook et al., 1989; McPherson et al., 2000; Dieffenbach and Dveksler, 1995). For the most part, DNA fragments will be of at least about 5 nucleotides, at least about 8 nucleotides, at least about 10 nucleotides, at least about 15 nucleotides, at least about 18 nucleotides, at least about 20 nucleotides, at least about 25 nucleotides, at least about 30 nucleotides, at least about 50 nucleotides, at least about 75 nucleotides, or at least about 100 nucleotides. Nucleic acid compositions that encode at least six contiguous amino acids (i.e., fragments of 18 nucleotides or more), for example, nucleic acid compositions encoding at least 8 contiguous amino acids (i.e., fragments of 24 nucleotides or more), are useful in directing the expression or the synthesis of peptides that can be used as immunogens (Lerner, 1982; Shinnick et al., 1983; Sutcliffe et al., 1983).
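The relationship between fragment length and encoded peptide length stated in paragraph [0149] follows from the three-nucleotide codon. The Python sketch below is illustrative only; the function names are hypothetical, and it assumes the fragment is read in frame from its first nucleotide.

```python
CODON_LENGTH = 3  # nucleotides per amino acid in the triplet code


def min_fragment_length(n_amino_acids: int) -> int:
    """Minimum number of nucleotides needed to encode n contiguous amino acids."""
    if n_amino_acids < 0:
        raise ValueError("number of amino acids cannot be negative")
    return CODON_LENGTH * n_amino_acids


def encodable_residues(fragment_length: int) -> int:
    """Number of complete codons in a fragment read in frame from its first base."""
    if fragment_length < 0:
        raise ValueError("fragment length cannot be negative")
    return fragment_length // CODON_LENGTH
```

Six contiguous amino acids therefore require 18 nucleotides and eight require 24, matching the figures given in the paragraph; a 20-nucleotide fragment still carries only six complete codons.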
[0150] In some embodiments, a polynucleotide of the invention comprises a nucleotide sequence of at least about 5, at least about 8, at least about 10, at least about 15, at least about 18, at least about 20, at least about 25, at least about 30, at least about 50, at least about 75, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, at least about 500, at least about 550, at least about 600, at least about 650, at least about 700, at least about 750, at least about 800, at least about 850, at least about 900, at least about 950, at least about 1000, at least about 1100, at least about 1200, at least about 1300, at least about 1400, at least about 1500, at least about 1600, at least about 1700, at least about 1800, at least about 1900, at least about 2000, at least about 2100, at least about 2200, at least about 2300, at least about 2400, at least about 2500, at least about 3000, at least about 4000, or at least about 5000 contiguous nucleotides of any one of the sequences shown in SEQ ID NO: 1-123, and/or a complement thereof.
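Whether a query sequence contains a given number of contiguous nucleotides of a reference sequence, or of its complement, can be checked computationally. The Python sketch below is illustrative only: the function names are hypothetical, the example sequences are invented rather than taken from the Sequence Listing, and "complement" is assumed to mean the reverse complement read 5' to 3'.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def longest_shared_run(a: str, b: str) -> int:
    """Length of the longest contiguous stretch present in both a and b."""
    a, b = a.upper(), b.upper()
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous pair of bases.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def shares_contiguous(query: str, reference: str, n: int) -> bool:
    """True if query contains >= n contiguous bases of reference or its reverse complement."""
    forward = longest_shared_run(query, reference)
    reverse = longest_shared_run(query, reverse_complement(reference))
    return max(forward, reverse) >= n
```

This dynamic-programming approach is quadratic in sequence length, which is adequate for a demonstration; large-scale comparisons would normally use an indexed search tool instead.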
[0151] In other embodiments, a polynucleotide of the invention has at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% nucleotide sequence identity with a nucleotide sequence, or a fragment thereof, of the coding region of any one of the sequences shown in SEQ ID NO: 1-123, or a complement thereof. These sequence variants include naturally-occurring variants (e.g., SNPs, allelic variants, and homologs from other species), degenerate variants, variants associated with disease or pathological states, and variants resulting from random or directed mutagenesis, as well as from chemical or other modification.
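Percent sequence identity, as used in paragraph [0151], can be computed once two sequences have been aligned. The Python sketch below is illustrative only; it assumes a pairwise alignment has already been produced by some other method, with gaps written as "-", and it divides by the full alignment length, which is one of several conventions in use. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same base.

    Inputs must be pre-aligned and of equal length; '-' marks a gap.
    Gap columns count toward the denominator but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    matches = sum(1 for x, y in pairs if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_identity(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the percent identity is at or above the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Two aligned 10-nucleotide sequences that differ at one position are 90% identical, which satisfies an 85% threshold but not a 95% threshold.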
[0152] In some embodiments, a polynucleotide of the invention comprises a nucleotide sequence that encodes a polypeptide comprising an amino acid sequence of at least about 5, at least about 8, at least about 10, at least about 15, at least about 18, at least about 20, at least about 25, at least about 30, at least about 50, at least about 75, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, at least about 500, at least about 550, at least about 600, at least about 650, at least about 700, at least about 750, at least about 800, at least about 850, at least about 900, at least about 950, or at least about 1000 contiguous amino acids of at least one of the sequences shown in SEQ ID NO: 124-246 (e.g., a polypeptide encoded by at least one of the nucleotide sequences shown in SEQ ID NO: 1-123), up to and including an entire amino acid sequence as shown in SEQ ID NO: 124-246 (or as encoded by at least one of the nucleotide sequences shown in SEQ ID NO: 1-123).
[0153] In some embodiments, the present invention includes a polynucleotide selected from SEQ ID NO: 1-124 that contains 300 bp of the 5' terminus of a protein-encoding polynucleotide sequence. Such a polynucleotide is useful for the purposes of clustering gene sequences to determine gene family.
[0154] In further embodiments, a polynucleotide of the invention hybridizes under stringent hybridization conditions to a polynucleotide having the coding region of any one of the sequences shown in SEQ ID NO: 1-124, or a complement thereof.
[0155] The polynucleotides of the invention include those that encode variants of the polypeptide sequences encoded by the polynucleotides of the Sequence Listing. In some embodiments, these polynucleotides encode variant polypeptides that include insertions, additions, deletions, or substitutions compared with the polypeptides encoded by the nucleotide sequences shown in SEQ ID NO: 1-124. Conservative amino acid substitutions include serine/threonine, valine/leucine/isoleucine, asparagine/histidine/glutamine, glutamic acid/aspartic acid, etc. (Gonnet et al., 1992).
[0156] The nucleic acids of the invention include degenerate variants that can be translated, according to the standard genetic code, to provide an amino acid sequence identical to that translated from the nucleic acid sequences herein. For example, synonymous codons include GGG, GGA, GGC, and GGU, each encoding glycine.
[0157] The nucleic acids of the invention include single nucleotide polymorphisms (SNPs), which occur frequently in eukaryotic genomes (Lander et al., 2001). The nucleotide sequence determined from one individual of a species can differ from other allelic forms present within the population.
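The degeneracy described in paragraph [0156] can be demonstrated by translating synonymous sequences with the standard genetic code. The Python sketch below is illustrative only; the function names and example sequences are hypothetical, and the code table is built from the conventional U, C, A, G ordering of the standard code.

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order at each
# position; '*' marks a stop codon.
_BASES = "UCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate an RNA or DNA coding sequence, in frame from the first base."""
    rna = seq.upper().replace("T", "U")
    usable = len(rna) - len(rna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(amino_acid: str) -> set:
    """Return every codon that encodes the given one-letter amino acid."""
    return {codon for codon, aa in CODON_TABLE.items() if aa == amino_acid}
```

Sequences that differ only in which glycine codon they use, such as AUGGGGUAA and AUGGGAUAA, translate to the same product, which is the sense in which they are degenerate variants of one another.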
[0158] The nucleic acids of the invention include homologs of the polynucleotides. The source of homologous genes can be any species, e.g., primate species, particularly human; rodents, such as rats, hamsters, guinea pigs, and mice; rabbits, canines, felines; cattle, such as bovines, goats, pigs, sheep, equines, crustaceans, birds, chickens, reptiles, amphibians, fish, insects, plants, fungi, yeast, nematodes, etc. Among mammalian species, e.g., human and mouse, homologs have substantial sequence similarity, e.g., at least about 60% sequence identity, at least about 75% sequence identity, or at least about 80% sequence identity among nucleotide sequences. In many embodiments of interest, homology will be at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, where in certain embodiments of interest homology will be as high as about 99%.
[0159] Modifications in the native structure of nucleic acids, including alterations in the backbone, sugars or heterocyclic bases, have been shown to increase intracellular stability and binding affinity. Among useful changes in the backbone chemistry are phosphorothioates; phosphorodithioates, where both of the non-bridging oxygens are substituted with sulfur; phosphoroamidites; alkyl phosphotriesters and boranophosphates. Achiral phosphate derivatives include 3'-O-5'-S-phosphorothioate, 3'-S-5'-O-phosphorothioate, 3'-CH2-5'-O-phosphonate and 3'-NH-5'-O-phosphoroamidate. Peptide nucleic acids replace the entire ribose phosphodiester backbone with a peptide linkage.
[0160] Sugar modifications are also used to enhance stability and affinity. The α-anomer of deoxyribose can be used, where the base is inverted with respect to the natural β-anomer. The 2'-OH of the ribose sugar can be altered to form 2'-O-methyl or 2'-O-allyl sugars, which provide resistance to degradation without compromising affinity.
[0161] Modification of the heterocyclic bases must maintain proper base pairing. Some useful substitutions include deoxyuridine for deoxythymidine; 5-methyl-2'-deoxycytidine and 5-bromo-2'-deoxycytidine for deoxycytidine. 5-propynyl-2'-deoxyuridine and 5-propynyl-2'-deoxycytidine have been shown to increase affinity and biological activity when substituted for deoxythymidine and deoxycytidine, respectively.
[0162] A genomic sequence of interest comprises the nucleic acid present between the initiation codon and the stop codon, as defined in the listed sequences, including all of the introns that are normally present in a native chromosome. It can further include the 3' and 5' untranslated regions found in the mature mRNA. It can further include specific transcriptional and translational regulatory sequences, such as promoters, enhancers, etc., including about 1 kb, about 2 kb, and possibly more, of flanking genomic DNA at either the 5' or 3' end of the transcribed region. The genomic DNA can be isolated as a fragment of 100 kbp or smaller, and substantially free of flanking chromosomal sequence. The genomic DNA flanking the coding region, either 3' or 5', or internal regulatory sequences as sometimes found in introns, contains sequences required for proper tissue and stage specific expression.
[0163] Nucleic acid molecules of the invention can comprise heterologous nucleic acid molecules, i.e., nucleic acid molecules other than the subject nucleic acid molecules, of any length. For example, the subject nucleic acid molecules can be flanked on the 5' and/or 3' ends by heterologous nucleic acid molecules of from about 1 nucleotide to about 10 nucleotides, from about 10 nucleotides to about 20 nucleotides, from about 20 nucleotides to about 50 nucleotides, from about 50 nucleotides to about 100 nucleotides, from about 100 nucleotides to about 250 nucleotides, from about 250 nucleotides to about 500 nucleotides, or from about 500 nucleotides to about 1000 nucleotides, or more in length.
[0164] The subject polynucleotides include those that encode fusion proteins comprising the subject polypeptides fused to "fusion partners." For example, the present soluble receptor or ligand can be fused to an immunoglobulin fragment, such as an Fc fragment for stability in circulation or to fix complement. Other polypeptide fragments that have equivalent capabilities as the Fc fragments can also be used herein.
[0165] The isolated nucleic acids of the invention can be used as probes to detect and characterize gross alteration in a genomic locus, such as deletions, insertions, translocations, and duplications, e.g., applying fluorescence in situ hybridization (FISH) techniques to examine chromosome spreads (Andreeff et al., 1999). The nucleic acids are also useful for detecting smaller genomic alterations, such as deletions, insertions, additions, translocations, and substitutions (e.g., SNPs).
[0166] When used as probes to detect nucleic acid molecules capable of hybridizing with nucleic acids described in the Sequence Listing, the nucleic acid molecules can be flanked by heterologous sequences of any length. When used as probes, a subject nucleic acid can include nucleotide analogs that incorporate labels that are directly detectable, such as radiolabels or fluorophores, or nucleotide analogs that incorporate labels that can be visualized in a subsequent reaction, such as biotin or various haptens. Haptens that are commonly conjugated to nucleotides for subsequent labeling include biotin, digoxigenin, and dinitrophenyl.
[0167] Suitable fluorescent labels include fluorochromes, e.g., fluorescein and its derivatives, e.g., fluorescein isothiocyanate (FITC), 6-carboxyfluorescein (6-FAM), 2',7'-dimethoxy-4',5'-dichloro-6-carboxyfluorescein (JOE), 6-carboxy-2',4',7',4,7-hexachlorofluorescein (HEX), 5-carboxyfluorescein (5-FAM); coumarin and its derivatives, e.g., 7-amino-4-methylcoumarin, aminocoumarin; bodipy dyes, such as Bodipy FL; cascade blue; Oregon green; rhodamine dyes, e.g., rhodamine, 6-carboxy-X-rhodamine (ROX), Texas red, phycoerythrin, and tetramethylrhodamine; eosins and erythrosins; cyanine dyes, e.g., allophycocyanin, Cy3 and Cy5 or N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA); macrocyclic chelates of lanthanide ions, e.g., quantum dye, etc.; and chemiluminescent molecules, e.g., luciferases.
[0168] Fluorescent labels also include a green fluorescent protein (GFP), i.e., a "humanized" version of a GFP, e.g., wherein codons of the naturally-occurring nucleotide sequence are changed to more closely match human codon bias; a GFP derived from Aequorea victoria or a derivative thereof, e.g., a "humanized" derivative such as Enhanced GFP, which are available commercially, e.g., from Clontech, Inc.; other fluorescent mutants of a GFP from Aequorea victoria, e.g., as described in U.S. Patent No. 6,066,476; 6,020,192; 5,985,577; 5,976,796; 5,968,750; 5,968,738; 5,958,713; 5,919,445; 5,874,304; a GFP from another species such as Renilla reniformis, Renilla mulleri, or Ptilosarcus guernyi, as previously described (WO 99/49019; Peelle et al., 2001); "humanized" recombinant GFP (hrGFP) (Stratagene®); and any of a variety of fluorescent and colored proteins from Anthozoan species (e.g., Matz et al., 1999).
[0169] Probes can also contain fluorescent analogs, including commercially available fluorescent nucleotide analogs that can readily be incorporated into a subject nucleic acid. These include deoxyribonucleotide and/or ribonucleotide analogs labeled with Cy3, Cy5, Texas Red, Alexa Fluor dyes, rhodamine, cascade blue, or BODIPY, and the like.
[0170] Suitable radioactive labels include, e.g., 32P, 35S, or 3H. For example, probes can contain radiolabeled analogs, including those commonly labeled with 32P or 35S, such as α-32P-dATP, -dTTP, -dCTP, and -dGTP; γ-35S-GTP and α-35S-dATP, and the like.
[0171] Nucleic acids of the invention can also be bound to a substrate. Subject nucleic acids can be attached covalently, attached to a surface of the support or applied to a derivatized surface in a chaotropic agent that facilitates denaturation and adherence, e.g., by noncovalent interactions, or some combination thereof. The nucleic acids can be bound to a substrate to which a plurality of other nucleic acids are concurrently bound, hybridization to each of the plurality of the bound nucleic acids being separately detectable.
[0172] The substrate can be porous or solid, planar or non-planar, unitary or distributed; and the bond between the nucleic acid and the substrate can be covalent or non-covalent. The substrate can be in the form of microbeads or nanobeads. Substrates include, but are not limited to, a membrane, such as nitrocellulose, nylon, positively-charged derivatized nylon; a solid substrate such as glass, amorphous silicon, crystalline silicon, plastics (including e.g., polymethylacrylic, polyethylene, polypropylene, polyacrylate, polymethylmethacrylate, polyvinylchloride, polytetrafluoroethylene, polystyrene, polycarbonate, polyacetal, polysulfone, cellulose acetate, or mixtures thereof).
[0173] The subject nucleic acids include antisense RNA, ribozymes, and RNAi. Further, the nucleic acids of the invention can be used for antisense or RNAi inhibition of transcription or translation using methods known in the art (Phillips, 1999a; Phillips, 1999b; Hartmann et al., 1999; Stein et al., 1998; Agrawal et al., 1998).
Expression Vectors
[0174] The instant invention further provides host cells, e.g., recombinant host cells, that comprise a subject nucleic acid, host cells that comprise a recombinant vector, and host cells that secrete antibodies of the invention. Subject host cells can be cultured in vitro, or can be part of a multicellular organism. Host cells are described in more detail below. The instant invention further provides transgenic plants and non-human animals, as described in more detail below.
[0175] In addition to the plurality of uses described in greater detail in following sections, the subject nucleic acids find use in the preparation of all or a portion of the polypeptides of the subject invention, as described above, using an expression system. For expression, an expression vector can be employed. The expression vector will provide a transcriptional and translational initiation region, which may be inducible, conditionally active, constitutive, or tissue-specific, where the coding region is operably linked under the transcriptional control of the transcriptional initiation region, and a transcriptional and translational termination region. These control regions can be native to a gene encoding the subject peptides, or can be derived from heterologous or exogenous sources.
[0176] The subject nucleic acids can also be provided as part of a vector (e.g., a polynucleotide construct comprising an expression cassette), a wide variety of which are known in the art. Vectors include, but are not limited to, plasmids; cosmids; viral vectors; human, yeast, bacterial, and P1-derived artificial chromosomes (HAC's, YAC's, BAC's, PAC's, etc.); mini-chromosomes; and the like. Vectors are amply described in numerous publications well known to those in the art (Ausubel, et al.; Jones et al., 1998a; Jones et al., 1998b). Vectors can provide for nucleic acid expression, for nucleic acid propagation, or both.
[0177] A recombinant vector or construct that includes a nucleic acid of the invention is useful for propagating a nucleic acid in a host cell; such vectors are known as "cloning vectors." Vectors can transfer nucleic acid between host cells derived from disparate organisms; these are known in the art as "shuttle vectors." Vectors can also insert a subject nucleic acid into a host cell's chromosome; these are known in the art as "insertion vectors." Vectors can express either sense or antisense RNA transcripts of the invention in vitro (e.g., in a cell-free system or within an in vitro cultured host cell) or in vivo (e.g., in a multicellular plant or animal); these are known in the art as "expression vectors," which can be part of an expression system. Expression vectors can also produce a subject antibody.
[0178] Vectors typically include at least one origin of replication, at least one site for insertion of heterologous nucleic acid (e.g., in the form of a polylinker with multiple, tightly clustered, single cutting restriction endonuclease recognition sites), and at least one selectable marker, although some integrative vectors will lack an origin that is functional in the host to be chromosomally modified, and some vectors will lack selectable markers. Vectors can be transiently or stably maintained in the cells, usually for a period of at least about one day, or from at least about several days to at least about several weeks.
[0179] Promoters of the invention can be naturally contiguous or not naturally contiguous to the expressed nucleic acid molecule. The promoters can be inducible, conditionally active (such as the cre-lox promoter), constitutive, and/or tissue specific.
[0180] Prior to vector insertion, the DNA of interest will be obtained substantially free of other nucleic acid sequences. The DNA can be "recombinant," and flanked by one or more nucleotides with which it is not normally associated on a naturally occurring chromosome.
[0181] Expression vectors generally have convenient restriction sites located near the promoter sequence to provide for the insertion of nucleic acid sequences encoding heterologous protein or RNA molecules. A selectable marker operative in the expression system or host can be present. Expression vectors can be used for the production of fusion proteins, where the fusion peptide provides additional functionality, e.g., increased protein synthesis, a leader sequence for secretion, stability, reactivity with defined antisera, or an enzyme marker, e.g., β-galactosidase.
[0182] Expression vectors can be prepared comprising a transcription cassette comprising a transcription initiation region, the gene or fragment thereof, and a transcriptional termination region. Of particular interest is the use of DNA sequences that allow for the expression of functional epitopes or domains, at least about 5, at least about 8, at least about 10, at least about 15, at least about 18, at least about 20, at least about 25, at least about 30, at least about 50, at least about 75, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, at least about 500, at least about 550, at least about 600, at least about 650, at least about 700, at least about 750, at least about 800, at least about 850, at least about 900, at least about 950, or at least about 1000 amino acids in length, or any of the above-described fragments, up to and including the complete open reading frame of the gene. After introduction of these DNA sequences, the cells containing the vector construct can be selected by means of a selectable marker, and the selected cells expanded and used as expression-competent host cells.
[0183] Host cells can comprise prokaryotes or eukaryotes that express proteins and polypeptides in accordance with conventional methods, the method depending on the purpose for expression. For large scale production of the protein, a unicellular organism, such as E. coli, B. subtilis, S. cerevisiae, insect cells in combination with baculovirus vectors, or cells of a higher organism such as vertebrates, particularly mammals, e.g., COS 7 cells, can be used as the expression host cells. In some situations, it is desirable to express eukaryotic genes in eukaryotic cells, where the encoded protein will benefit from native folding and posttranslational modifications.
Expression Systems
Cell-Based Expression Systems
[0184] Specific expression systems of interest include plants, bacteria, yeast, insect cells, and mammalian cell-derived expression systems. Expression systems in plants include those described in U.S. Patent No. 6,096,546 and U.S. Patent No. 6,127,145. Expression systems in bacteria include those described by Chang et al., 1978; Goeddel et al., 1979; Goeddel et al., 1980; EP 0 036,776; U.S. Patent No. 4,551,433; DeBoer et al., 1983; and Siebenlist et al., 1980. Expression systems in yeast include those described by Hinnen et al., 1978; Ito et al., 1983; Kurtz et al., 1986; Kunze et al., 1985; Gleeson et al., 1986; Roggenkamp et al., 1986; Das et al., 1984; De Louvencourt et al., 1983; Van den Berg et al., 1990; Cregg et al., 1985; U.S. Patent Nos. 4,837,148 and 4,929,555; Beach and Nurse, 1981; Davidow et al., 1985; Gaillardin et al., 1985; Ballance et al., 1983; Tilburn et al., 1983; Yelton et al., 1984; Kelly and Hynes, 1985; EP 0 244,234; WO 91/00357; and U.S. Patent No. 6,080,559. Expression systems for heterologous genes in insects include those described in U.S. Patent No. 4,745,051; Friesen et al., 1986; EP 0 127,839; EP 0 155,476; Vlak et al., 1988; Miller et al., 1988; Carbonell et al., 1988; Maeda et al., 1985; Lebacq-Verheyden et al., 1988; Smith et al., 1985; Miyajima et al., 1987; and Martin et al., 1988. Numerous baculoviral strains and variants and corresponding permissive insect host cells are described in Luckow et al., 1988, Miller et al., 1986, and Maeda et al., 1985. The insect cell expression system is useful not only for production of heterologous proteins intracellularly, but can also be used for expression of transmembrane proteins on the insect cell surfaces. Such insect cells can be used as an immunogen for production of antibodies, for example, by injection of the insect cells into mice, rabbits, or other suitable animals.
[0185] Mammalian expression systems include those described in Dijkema et al., 1985; Gorman et al., 1982; Boshart et al., 1985; and U.S. Patent No. 4,399,216. Additional features of mammalian expression are facilitated as described in Ham and Wallace, 1979; Barnes and Sato, 1980; U.S. Patent Nos. 4,767,704, 4,657,866, 4,927,762, 4,560,655; WO 90/103430; WO 87/00195; and U.S. RE 30,985. Mammalian cell expression systems can also be used for production of antibodies.
Cell-Free Expression Systems
[0186] Cell-free systems can also be used to express the polypeptides of the invention. Cell-free systems of fractionated cell homogenates comprising the protein synthetic machinery, including ribosomes, transfer RNA and enzymatic components of the machinery, are routinely used by those of skill in the art to express polypeptides of interest. Isolated mRNA and DNA, e.g., a gene cloned into a plasmid vector, or a PCR-generated DNA template, are examples of nucleic acids suitable as templates for expressing polypeptides in cell-free systems. The polypeptides can be expressed in a bacterial system, e.g., an E. coli lysate, or in a rabbit reticulocyte lysate system, wheat germ extract system, frog oocyte lysate system, and the like, as is conventional in the art. See, for example, WO 00/68412, WO 01/27260, WO 02/24939, WO 02/38790, WO 91/02076, and WO 91/02075.
[0187] Wheat embryo and wheat germ (which is dried wheat embryo), and reticulocyte lysate extract cell-free systems are eukaryotic, and, as such, are suitable for expressing eukaryotic proteins, as described in WO 00/68412, WO 01/27260, WO 02/08443, WO 02/095377, WO 02/18586, and WO 02/24939. They have the advantages of low cost, easy availability in large amounts, and the capacity to synthesize high-molecular weight polypeptides (Madin et al., 2000). The wheat embryo can be treated to substantially eliminate endogenous protein synthesis inhibitors, improving the synthetic capacity of the system (Madin et al., 2000; WO 00/68412). The robustness of the wheat germ cell-free system can be enhanced by the addition of an energy regenerating system, including an energy source (Madin et al., 2000; WO 00/68412).
[0188] Recombinant genes encoding hydrophobic proteins of interest can be expressed in cell-free systems, e.g., wheat germ, E. coli, or rabbit reticulocyte lysates. Cell-free lysates can be prepared from wheat germ or wheat embryos by the methods of Doi et al., 2003; Miyamoto-Sato et al., 2003; Morita et al., 2003; WO 02/38790; WO 02/24939; Sawasaki et al., 2002a; Sawasaki et al., 2002b; WO 01/27260; WO 00/68412; Madin et al., 2000; Sawasaki et al., 2000; and/or Erickson and Blobel, 1983. Cell-free lysates can be prepared from E. coli by the methods of Chang et al., 1978; Goeddel et al., 1979; EP 0 036,776; U.S. Patent No. 4,551,433; DeBoer et al., 1983; and Siebenlist et al., 1980. Cell-free lysates can be prepared from rabbit reticulocytes by the methods of Dijkema et al., 1985; Gorman et al., 1982; Boshart et al., 1985; and U.S. Patent No. 4,399,216. Additional features of mammalian expression are facilitated as described in Ham and McKeehan, 1979; Barnes and Sato, 1980; U.S. Patent Nos. 4,767,704, 4,657,866, 4,927,762, 4,560,655; WO 90/103430; WO 87/00195; and U.S. RE 30,985.
[0189] Cell-free expression in a wheat germ system has been demonstrated to proceed efficiently when a reaction mixture containing a wheat germ extract is brought into contact with a substrate and an energy source by continuously supplying the substrate and the energy source to the reaction mixture and discontinuously removing the byproducts of the reaction to increase the efficiency of protein synthesis (WO 02/24939; Madin et al., 2000). The efficiency of the process is also increased by processing the wheat embryos to remove contaminants that inhibit protein synthesis, i.e., by milling the plant seeds to eliminate the albumen, sieving the milled seeds to recover fractions passing through 1.00 to 0.45 mm or 710-850 μm mesh, selecting the wheat embryos by flotation in an organic solvent, eliminating the periderm, damaged embryos and contaminants, recovering the suspension in an aqueous solution, and washing the wheat embryos by sonication (WO 02/38790; Madin et al., 2000).
[0190] The translation efficiency of a cell-free expression system is dependent in part on the translation efficiency of the mRNA. The efficiency of the mRNA employed in a wheat germ cell-free expression system was shown to be increased by constructing a plasmid with a template DNA for transcribing a protein translational template mRNA with high translation efficiency. The plasmid includes non-translated template DNA at the 5' and 3' ends of the mRNA template, improving the efficiency of translation of the mRNA sequence of interest (WO 01/27260; Madin et al., 2000).
[0191] The translation efficiency of a cell-free expression system is also dependent in part on the availability of a continuous energy source. The robustness of a conventional wheat germ cell-free expression system was improved by supplying substrate molecules and energy sources, e.g., ATP or GTP, to the reaction by allowing their free diffusion into the expression system, and by removing the by-products of the reaction from the expression system (WO 02/24939; Madin et al., 2000).
Cell-Free Expression of Hydrophobic Proteins on Nanodiscs
[0192] Membrane proteins are commonly amphipathic molecules that have hydrophilic as well as hydrophobic, e.g., membrane spanning, regions. Their hydrophobic domains can make them difficult to study with traditional methods. Transmembrane and membrane-associated proteins are cellular targets for the majority of therapeutics in use today, including both small molecule drugs and protein pharmaceutical drugs and vaccines. Active therapeutic agents are also often amphipathic, and for the same reasons have also been difficult to study directly at the molecular level.
[0193] Nanodiscs™ (Nanodisc, Inc., Urbana-Champaign, IL) provide a means to generate soluble lipid bilayer membranes that incorporate membrane proteins on a nanometer scale. The structure of the Nanodisc™ is a discoidal lipid bilayer surrounded at its edges by amphipathic α-helical proteins (WO 02/40501). The Nanodiscs™ are stabilized by a synthetically engineered class of amphipathic membrane scaffold proteins that were optimized to promote the self-assembly of discoidal bilayers (Bayburt et al., 2002). These alpha-helical scaffold proteins surround the edge of the Nanodisc™, providing stability to the bilayer. They can be engineered with tags or chemically reactive groups so that they can additionally serve as tools for observation, physical manipulation, or attachment to various matrices.
[0194] Nanodiscs™ are self-assembling discoidal nanoparticles. The assembly process begins with a mixture of saturated or unsaturated phospholipid molecules and membrane scaffold proteins in the presence of a detergent. The detergent is removed, forming particles that preserve the phospholipid bilayer architecture and incorporate the target hydrophobic protein of interest.
[0195] Nanodiscs™ can be produced from a variety of phospholipids, for example, dipalmitoyl phosphatidylcholine and dimyristoyl phosphatidylcholine (Shaw et al., 2004). A synthetic bilayer made from a single type of phospholipid changes from a liquid state to a gel state at a characteristic freezing point. This change of state is called a phase transition. The breadth of the phase transitions of the phospholipids in the bilayer depends on the phospholipid composition. A comparison of the phase transitions of the same phospholipids incorporated into Nanodiscs™ and into phospholipid vesicles showed that the transitions were broader for the lipids in the Nanodiscs™ than for the lipids in the vesicles. Also, the transition midpoint was shifted 3-4°C higher for lipids incorporated into Nanodiscs™. These characteristics of the Nanodisc™ lipid bilayer mimic the characteristics of cellular membranes better than the vesicles, making Nanodiscs™ a more native-like lipid environment in which to study membrane-associated proteins (Shaw et al., 2004).
[0196] Nanodiscs™ can also be produced from microsomal membranes, e.g., those prepared from baculovirus-infected Spodoptera frugiperda (Sf9) insect cells. Civjan et al. overexpressed an N-terminally anchored cytochrome P450 monooxygenase, and found that it was effectively dispersed, not aggregated, in bilayers containing biochemically defined lipid components. The cytochrome P450 monooxygenase target protein was suitable for sensitive high-throughput substrate binding analysis (Civjan et al., 2003).
[0197] The oligomeric state of the target protein can be controlled during Nanodisc™ assembly. For example, the seven transmembrane receptor protein bacteriorhodopsin assumes a native trimeric state. When assembled into Nanodiscs™, the oligomeric form of bacteriorhodopsin could be controlled and determined by observation, e.g., spectroscopically (Bayburt and Sligar, 2003).
[0198] The stoichiometry of Nanodiscs™ can also be controlled by the assembly conditions (Bayburt and Sligar, 2003). By providing membrane scaffold proteins and phospholipids in excess, Nanodiscs™ can be induced to self-assemble with controlled stoichiometry such that there is one target hydrophobic molecule per Nanodisc™. Nanodiscs™ constructed with bacteriorhodopsin comprised approximately two membrane scaffold protein molecules and approximately 163 dimyristoylphosphatidylcholine molecules per bacteriorhodopsin molecule.
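The composition reported above for bacteriorhodopsin-containing Nanodiscs™ can be turned into simple bookkeeping. The following is a minimal sketch only: the function name and its parameters are hypothetical, the default ratios are the approximate figures quoted in the preceding paragraph, and the result describes what ends up in the assembled discs, not the excess of scaffold protein and phospholipid supplied during assembly.

```python
def nanodisc_composition(target_nmol: float,
                         scaffold_per_target: float = 2.0,
                         lipid_per_target: float = 163.0) -> dict:
    """Estimate nmol of each component incorporated into assembled discs.

    Assumes one target molecule per disc and the approximate per-target
    ratios quoted in the text (about 2 scaffold proteins, about 163 lipids).
    """
    if target_nmol < 0:
        raise ValueError("target_nmol must be non-negative")
    return {
        "target_nmol": target_nmol,
        "scaffold_nmol": target_nmol * scaffold_per_target,
        "lipid_nmol": target_nmol * lipid_per_target,
    }


# Illustrative amount only: 10 nmol of incorporated target protein.
print(nanodisc_composition(10.0))
```

Other target proteins or phospholipids would be expected to give different ratios, so the defaults should be replaced with measured values where available.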
[0199] One of the biggest challenges in the fields of proteomics and pharmaceutical research is obtaining functional membrane proteins that are solubilized and dispersed in the lipid bilayer of a physiologically relevant environment, rather than aggregated. Nanodiscs™ provide integral membrane proteins in a functional, soluble, and monodisperse state in a native-like environment that maintains a spectrum of in vivo activities. Integral membrane proteins such as receptors, enzymes, and other macromolecular assemblies that represent important drug targets can be incorporated into Nanodiscs™ and retain their physiologic activities. For example, 93% of the bacteriopsin molecules incorporated into Nanodiscs™ were shown to be functional with respect to cofactor binding, and to have a dissociation constant for all-trans-retinal that was very close to the value of the dissociation constant in the native state (Bayburt and Sligar, 2003). In addition to maintaining functional integrity and stability in a solubilized phospholipid bilayer environment, it is possible to direct these nanobilayer structures to assemble for high-throughput screening or biophysical investigations (Bayburt and Sligar, 2003). Hydrophobic proteins expressed on Nanodiscs™ are suitable for use in biochemical studies, crystallographic studies, and high throughput screening.
[0200] Nanodiscs™ are water soluble; they can be handled and manipulated by techniques commonly used to work with proteins. Nanodiscs™ have the advantage over liposomes that they lack a lumen, overcoming the orientation problem of the embedded membrane protein, because both the extracellular and cytoplasmic portions of the target molecules are accessible. Nanodiscs™ provide access to both sides of the bilayer structure, while liposomes permit access only to the outer surface. Thus, Nanodiscs™ are useful for studying transmembrane protein function in solution. For example, they can be used to study transmembrane signaling.
[0201] In one aspect, the invention provides a method of producing at least one hydrophobic polypeptide by providing a cell-free expression system, a first nucleic acid molecule encoding a first hydrophobic polypeptide and reagents for producing a Nanodisc™, combining the first nucleic acid molecule, the cell-free expression system, and the reagents for producing a Nanodisc™, and allowing the first hydrophobic polypeptide to be produced in or introduced into a Nanodisc™. This cell-free expression system allows for replication of the nucleic acid molecule. It can be a bacterial system, e.g., an E. coli lysate; a plant system, e.g., a wheat germ lysate; or a eukaryotic system, e.g., a rabbit reticulocyte lysate. This expression system can produce membrane proteins.
[0202] This method can produce two or more hydrophobic polypeptides by providing a second, third, or fourth nucleic acid molecule encoding a second, third, or fourth hydrophobic polypeptide; combining the second, third, or fourth nucleic acid molecule with the first, second, or third nucleic acid molecule, the cell-free expression system, and the reagents for producing a Nanodisc™; and allowing the resulting hydrophobic polypeptides to be produced in or introduced into the Nanodisc™. This cell-free expression system allows for replication of the nucleic acid molecule. It can be a bacterial system, e.g., an E. coli lysate; a plant system, e.g., a wheat germ lysate; a eukaryotic system, e.g., a rabbit reticulocyte lysate; or a combination of these systems. This expression system can produce membrane proteins. This method can produce a Nanodisc™ comprising the two or more hydrophobic polypeptides. It can produce two or more hydrophobic proteins that are part of a multi-protein complex, or that exist in the same Nanodisc™ but are not part of a multiprotein complex. This method can produce first, second, third, or fourth, etc., nucleic acid molecules that are present in an equal molar ratio. It can also produce first, second, third, or fourth, etc., nucleic acid molecules that are present in different molar ratios. These proteins in the Nanodisc™ can assume their native conformation and perform their native physiologic functions, both in isolation and as a part of protein complexes.
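Where two or more nucleic acid molecules are to be present in an equal or a different molar ratio, the mass of each double-stranded DNA template to add can be estimated from its length. The sketch below is illustrative only: the function names are hypothetical, and the figure of about 650 g/mol per base pair is a commonly used approximation for double-stranded DNA that is assumed here, not a value taken from this specification.

```python
from typing import List, Sequence

# Assumed approximate average molar mass of one base pair of dsDNA (g/mol).
AVG_BP_MASS_G_PER_MOL = 650.0


def template_mass_ng(length_bp: int, amount_pmol: float) -> float:
    """Mass in ng of a dsDNA template of given length and molar amount.

    g/mol is numerically equal to pg/pmol, so pmol * (pg/pmol) gives pg;
    dividing by 1000 converts pg to ng.
    """
    return amount_pmol * length_bp * AVG_BP_MASS_G_PER_MOL / 1000.0


def masses_for_molar_ratio(lengths_bp: Sequence[int],
                           ratio: Sequence[float],
                           base_pmol: float = 1.0) -> List[float]:
    """Mass in ng of each template so the templates are present in `ratio`."""
    if len(lengths_bp) != len(ratio):
        raise ValueError("lengths_bp and ratio must have the same length")
    return [template_mass_ng(n, base_pmol * r)
            for n, r in zip(lengths_bp, ratio)]


# Hypothetical templates of 1000 bp and 2000 bp at an equal molar ratio.
print(masses_for_molar_ratio([1000, 2000], [1, 1]))
```

A longer template needs proportionally more mass to reach the same molar amount, which is why equal masses of different templates do not give an equal molar ratio.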
[0203] In another aspect, the invention provides an apparatus for producing a plurality of hydrophobic polypeptides in a high throughput manner comprising means for providing a cell-free expression system for one or more components of a hydrophobic protein, means for introducing one or more nucleic acid molecules that encode one or more components of a hydrophobic protein into each cell-free expression system, means for introducing a Nanodisc™ into each cell-free expression system, and means for incubating the cell-free expression system, the one or more nucleic acid molecules, and the Nanodisc™ for each hydrophobic protein. This apparatus can further comprise means for separating the Nanodisc™ containing the hydrophobic protein from the cell-free expression system.
[0204] In a related aspect, the invention provides a method of synthesizing a plurality of Nanodiscs™ simultaneously and for synthesizing a series of a plurality of simultaneously-synthesized Nanodiscs™ sequentially utilizing a dynamic system by providing the apparatus described above, operating the apparatus so as to produce a plurality of hydrophobic polypeptides in a cell-free expression system on a Nanodisc™, operating the apparatus so as to separate the Nanodisc™ containing the hydrophobic protein from the cell-free expression system, and operating the apparatus so as to reposition the apparatus such that the means for providing a cell-free expression system, the means for introducing one or more nucleic acid molecules, the means for introducing a Nanodisc™ into each cell-free expression system, and the means for incubating the cell-free expression system are in a position with respect to one another so that at least a second plurality of hydrophobic proteins can be produced in a cell-free system on a Nanodisc™.
[0205] In another aspect, the invention provides a hydrophobic protein made by any of the above methods. The hydrophobic protein can be a membrane protein, e.g., a transmembrane protein with one or more hydrophobic transmembrane domains.
[0206] The invention also provides a composition comprising a plurality of crystallized hydrophobic proteins. Protein crystallization requires large amounts of purified protein, and previous efforts to produce crystals of membrane proteins have been hampered by the difficulty of expressing hydrophobic proteins in their native state. The crystallized hydrophobic protein composition of the invention can comprise hydrophobic proteins produced by a cell-free expression system in a Nanodisc™ and crystallized by any of the methods that are known to those skilled in the art (McRee, 1999). The composition of crystallized proteins can comprise a binding partner bound to the hydrophobic protein, e.g., a heavy metal or other binding partner used by those skilled in the art. Crystallized proteins can provide information about the three-dimensional structure of the proteins. The invention also provides a method of preparing a hydrophobic protein for determination of crystal structure by providing a composition of hydrophobic proteins made by any of the methods described above and allowing the composition to crystallize. The invention further provides a method for using a hydrophobic protein of the invention to determine its crystal structure.
[0207] In another aspect, the invention provides a method of immunizing a non-human animal by injecting it with a hydrophobic protein made by any of the methods described above.
[0208] In a further aspect, the invention provides a method of screening for modulators of hydrophobic protein activity by providing a hydrophobic protein made by any of the methods described above, contacting the hydrophobic protein with a candidate modulator, and determining the ability of the candidate modulator to affect hydrophobic protein activity or to bind to the hydrophobic protein. This method provides a screen for modulators of membrane proteins. The modulators can be agonists, antagonists, antibodies, small molecule drugs, soluble receptors, peptide aptamers, and/or natural ligands.
[0209] When any of the above-referenced host cells, or other appropriate host cells or organisms, are used to replicate and/or express the polynucleotides of the invention, the resulting replicated nucleic acid, RNA, expressed protein or polypeptide, is within the scope of the invention as a product of the host cell or organism.
[0210] Once the gene corresponding to a selected polynucleotide is identified, its expression can be regulated in the gene's native cell types. For example, an endogenous gene of a cell can be regulated by an exogenous regulatory sequence inserted into the genome of the cell at a location that will enhance or reduce expression of the gene corresponding to the subject polypeptide. The regulatory sequence can be designed to integrate into the genome via homologous recombination, as disclosed in U.S. Patent Nos. 5,641,670 and 5,124,761, the disclosures of which are herein incorporated by reference. Alternatively, it can be designed to integrate into the genome via non-homologous recombination, as described in WO 99/15650, the disclosure of which is also herein incorporated by reference. Also encompassed in the subject invention is the production of proteins without manipulating the encoding nucleic acid itself, but rather by integrating a regulatory sequence into the genome of a cell that already includes a gene that encodes the protein of interest; this production method is described in the above-incorporated patent documents.
Isolated Primer Pairs
[0211] In some embodiments, the invention provides isolated nucleic acids that, when used as primers in a polymerase chain reaction, amplify a subject polynucleotide, or a polynucleotide containing a subject polynucleotide. The amplified polynucleotide is from about 20 to about 50, from about 50 to about 75, from about 75 to about 100, from about 100 to about 125, from about 125 to about 150, from about 150 to about 175, from about 175 to about 200, from about 200 to about 250, from about 250 to about 300, from about 300 to about 350, from about 350 to about 400, from about 400 to about 500, from about 500 to about 600, from about 600 to about 700, from about 700 to about 800, from about 800 to about 900, from about 900 to about 1000, from about 1000 to about 2000, from about 2000 to about 3000, from about 3000 to about 4000, from about 4000 to about 5000, or from about 5000 to about 6000 nucleotides or more in length.
[0212] The isolated nucleic acids themselves are from about 10 to about 20, from about 20 to about 30, from about 30 to about 40, from about 40 to about 50, from about 50 to about 100, or from about 100 to about 200 nucleotides in length. Generally, the nucleic acids are used in pairs in a polymerase chain reaction, where they are referred to as "forward" and "reverse" primers.
[0213] Thus, in some embodiments, the invention provides a pair of isolated nucleic acid molecules, each from about 10 to about 200 nucleotides in length, the first nucleic acid molecule of the pair comprising a sequence of at least 10 contiguous nucleotides having 100% sequence identity to a nucleic acid sequence as shown in SEQ ID NO: 1 - 123 and the second nucleic acid molecule of the pair comprising a sequence of at least 10 contiguous nucleotides having 100% sequence identity to the reverse complement of the nucleic acid sequence shown in SEQ ID NO: 1 - 123, wherein the sequence of the second nucleic acid molecule is located 3' of the nucleic acid sequence of the first nucleic acid molecule shown in SEQ ID NO: 1 - 123. The primer nucleic acids are prepared using any known method, e.g., automated synthesis, and can be chosen to specifically amplify a cDNA copy of an mRNA encoding a subject polypeptide.
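The relationship between the two members of a primer pair can be sketched in a few lines of code. This is a simplified illustration under stated assumptions: each primer is treated as matching the template exactly over its whole length, the function names are hypothetical, and the sequences in the usage line are arbitrary placeholders, not sequences from the Sequence Listing.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def amplicon_length(template: str, forward: str, reverse: str,
                    min_len: int = 10, max_len: int = 200) -> Optional[int]:
    """Length of the product a primer pair would amplify, or None.

    The forward primer must match the template strand; the reverse primer
    must match the reverse complement of a site located 3' of the forward
    primer. Both primers must be min_len to max_len nucleotides long.
    """
    template, forward, reverse = template.upper(), forward.upper(), reverse.upper()
    for primer in (forward, reverse):
        if not min_len <= len(primer) <= max_len:
            return None
    fwd_start = template.find(forward)
    if fwd_start < 0:
        return None
    # Site on the template strand to which the reverse primer is complementary.
    rev_site = reverse_complement(reverse)
    rev_start = template.find(rev_site, fwd_start + len(forward))
    if rev_start < 0:
        return None
    return rev_start + len(rev_site) - fwd_start


# Placeholder template: 6 nt flank, 12 nt forward site, 21 nt, 12 nt reverse site, 3 nt flank.
demo = "GGATCC" + "ATGGCTAGCAAG" + "GAGATATACATATGCGGCCGC" + "ACTCGAGCACCA" + "TTT"
print(amplicon_length(demo, "ATGGCTAGCAAG", reverse_complement("ACTCGAGCACCA")))
```

A practical primer design would also weigh melting temperature, secondary structure, and specificity against other sequences, none of which this sketch attempts.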
[0214] In some embodiments, the first and/or the second nucleic acid molecules comprise a detectable label. The label can be a radioactive molecule, fluorescent molecule or another molecule, e.g., hapten, as described in detail above. Further, the label can be a two stage system, where the amplified DNA is conjugated to another molecule, e.g., biotin, digoxin, or a hapten, that has a high affinity binding partner, e.g., avidin, antidigoxin, or a specific antibody, respectively, and the binding partner is conjugated to a detectable label. The label can be conjugated to one or both of the primers. Alternatively, the pool of nucleotides used in the amplification is labeled, so as to incorporate the label into the amplification product.
[0215] Conditions that increase stringency of both DNA/DNA and DNA/RNA hybridization reactions are widely known and published in the art. See, for example, Sambrook, 1989, and examples provided above. Examples of relevant conditions include (in order of increasing stringency): incubation temperatures of 25°C, 37°C, 50°C, and 68°C; buffer concentrations of 10 x SSC, 6 x SSC, 1 x SSC, 0.1 x SSC (where 1 x SSC is 0.15 M NaCl and 15 mM citrate buffer), and their equivalents using other buffer systems; formamide concentrations of 0%, 25%, 50%, and 75%; incubation times from 5 minutes to 24 hours; 1, 2, or more washing steps; wash incubation times of 1, 2, or 15 minutes; and wash solutions of 6 x SSC, 1 x SSC, 0.1 x SSC, or deionized water.
[0216] For example, high stringency conditions include hybridization in 50% formamide, 5X SSC, 0.2 μg/μl poly(dA), 0.2 μg/μl human Cot-1 DNA, and 0.5% SDS, in a humid oven at 42°C overnight, followed by successive washes in 1X SSC, 0.2% SDS at 55°C for 5 minutes, followed by washing at 0.1X SSC, 0.2% SDS at 55°C for 20 minutes. Further examples of high stringency conditions include hybridization at 50°C and 0.1 x SSC (15 mM sodium chloride/1.5 mM sodium citrate); overnight incubation at 42°C in a solution containing 50% formamide, 1 x SSC (150 mM NaCl, 15 mM sodium citrate), 50 mM sodium phosphate (pH 7.6), 5 x Denhardt's solution, 10% dextran sulfate, and 20 μg/ml denatured, sheared salmon sperm DNA, followed by washing the filters in 0.1 x SSC at about 65°C. High stringency conditions also include aqueous hybridization (e.g., free of formamide) in 6X SSC (where 20X SSC contains 3.0 M NaCl and 0.3 M sodium citrate), 1% sodium dodecyl sulfate (SDS) at 65°C for about 8 hours (or more), followed by one or more washes in 0.2 X SSC, 0.1% SDS at 65°C. Highly stringent hybridization conditions are hybridization conditions that are at least as stringent as any one of the above representative conditions. Other stringent hybridization conditions are known in the art and can also be employed to identify nucleic acids of this particular embodiment of the invention.
[0217] Conditions of reduced stringency, suitable for hybridization to molecules encoding structurally and functionally related proteins, or otherwise serving related or associated functions, are the same as those for high stringency conditions but with a reduction in temperature for hybridization and washing to lower temperatures (e.g., room temperature or about 22°C to 25°C). For example, moderate stringency conditions include aqueous hybridization (e.g., free of formamide) in 6X SSC, 1% SDS at 65°C for about 8 hours (or more), followed by one or more washes in 2X SSC, 0.1% SDS at room temperature. Low stringency conditions include, for example, aqueous hybridization at 50°C and 6 x SSC (0.9 M sodium chloride/0.09 M sodium citrate) and washing at 25°C in 1 x SSC (0.15 M sodium chloride/0.015 M sodium citrate).
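The SSC concentrations quoted in the preceding paragraphs all follow from the stated definition of 1 x SSC (0.15 M NaCl, 15 mM sodium citrate). The following is a minimal illustrative sketch in Python of that scaling; the function name `ssc_molarity` is introduced here for illustration only and is not part of the disclosure.

```python
# Concentrations in 1 x SSC, as defined in the text:
# 0.15 M NaCl and 15 mM (0.015 M) sodium citrate.
NACL_M_AT_1X = 0.15
CITRATE_M_AT_1X = 0.015


def ssc_molarity(fold):
    """Return (NaCl molarity, citrate molarity) for a buffer at `fold` x SSC.

    `ssc_molarity` is a hypothetical helper name used for illustration.
    Values are rounded to 6 decimal places to hide floating-point noise.
    """
    return (round(NACL_M_AT_1X * fold, 6), round(CITRATE_M_AT_1X * fold, 6))


if __name__ == "__main__":
    # Buffer strengths mentioned in the text, from low to high stringency.
    for fold in (6, 1, 0.1):
        nacl, citrate = ssc_molarity(fold)
        print(f"{fold} x SSC: {nacl} M NaCl, {citrate} M sodium citrate")
```

Under this definition, 6 x SSC corresponds to 0.9 M NaCl and 0.09 M sodium citrate, and 0.1 x SSC to 15 mM NaCl and 1.5 mM sodium citrate, in agreement with the values given above.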
[0218] The specificity of a hybridization reaction allows any single-stranded sequence of nucleotides to be labeled with a radioisotope or chemical and used as a probe to find a complementary strand, even in a cell or cell extract that contains millions of different DNA and RNA sequences. Probes of this type are widely used to detect the nucleic acids corresponding to specific genes, both to facilitate the purification and characterization of the genes after cell lysis and to localize them in cells, tissues, and organisms.
[0219] Moreover, by carrying out hybridization reactions under conditions of reduced stringency, a probe prepared from one gene can be used to find homologous evolutionary relatives - both in the same organism, where the relatives form part of a gene family, and in other organisms, where the evolutionary history of the nucleotide sequence can be traced. A person skilled in the art would recognize how to modify the conditions to achieve the requisite degree of stringency for a particular hybridization.
Libraries
[0220] The polynucleotide libraries of the invention generally comprise a collection of sequence information of a plurality of polynucleotide sequences, where at least one of the polynucleotides has a sequence shown in SEQ ID NO: 1 - 123. By plurality is meant at least 2, at least 3, or at least all of the sequences in the Sequence Listing. The information may be provided in either biochemical form (e.g., as a collection of polynucleotide molecules), or in electronic form (e.g., as a collection of polynucleotide sequences stored in a computer-readable form, as in a computer-based system, a computer data file, and/or as a part of a computer program). The length and number of polynucleotides in the library will vary with the nature of the library, e.g., if the library is an oligonucleotide array, a cDNA array, or a computer database of the sequence information.
[0221] The sequence information contained in either a biochemical or an electronic library of polynucleotides can be used in a variety of ways, e.g., as a resource for gene discovery, as a representation of sequences expressed in a selected cell type (e.g., cell type markers), or as markers of a given disorder or disease state. In general, a disease marker is a representation of a gene product that is present in all cells affected by disease either at an increased or decreased level relative to a normal cell (e.g., a cell of the same or similar type that is not substantially affected by disease). For example, a polynucleotide sequence in a library can be a polynucleotide that represents an mRNA, polypeptide, or other gene product encoded by the polynucleotide, that is either over-expressed or under-expressed in one cell compared to another (e.g., a first cell type compared to a second cell type; a normal cell compared to a diseased cell; a cell not exposed to a signal or stimulus compared to a cell exposed to that signal or stimulus; and the like).
[0222] The nucleotide sequence information of the library can be embodied in any suitable form, e.g., electronic or biochemical forms. For example, a library of sequence information embodied in electronic form comprises an accessible computer data file that may contain the representative nucleotide sequences of genes that are differentially expressed (e.g., over-expressed or under-expressed) as between, e.g., a first cell type compared to a second cell type (e.g., expression in a brain cell compared to expression in a kidney cell); a normal cell compared to a diseased cell (e.g., a non-cancerous cell compared to a cancerous cell); a cell not exposed to an internal or external signal or stimulus compared to a cell exposed to that signal or stimulus (e.g., a cell contacted with a ligand compared to a control cell not contacted with the ligand); and the like. Other combinations and comparisons of cells will be readily apparent to the ordinarily skilled artisan. Biochemical embodiments of the library include a collection of nucleic acid molecules that have the sequences of the genes in the library, where the nucleic acids can correspond to the entire gene in the library or to a fragment thereof, as described in greater detail below.
[0223] Where the library is an electronic library, the nucleic acid sequence information can be present in a variety of media. For example, the nucleic acid sequences of any of the polynucleotides shown in SEQ ID NO: 1 - 123 can be recorded on computer readable media of a computer-based system, e.g., any medium that can be read and accessed directly by a computer. One of skill in the art can readily appreciate how any of the presently known computer readable media can be used to create a manufacture comprising a recording of the present sequence information. Any convenient data storage structure can be chosen, based on the means used to access the stored information. A variety of data processor programs and formats can be used for storage, e.g., word processing text file, database format, etc. In addition to the sequence information, electronic versions of the libraries of the invention can be provided in conjunction or connection with other computer-readable information and/or other types of computer-based files (e.g., searchable files, executable files, etc., including, but not limited to, for example, search program software, etc.).
[0224] By providing the nucleotide sequence in computer readable form in a computer-based system, the information can be accessed for a variety of purposes. Computer software to access sequence information is publicly available. Conventional bioinformatics tools can be utilized to analyze sequences to determine sequence identity, sequence similarity, and gap information. For example, the gapped BLAST (Altschul et al., 1990, Altschul et al., 1997), and BLAZE (Brutlag et al., 1993) search algorithms on a Sybase system, or the TeraBLAST (TimeLogic, Crystal Bay, Nevada) program optionally running on a specialized computer platform available from TimeLogic, can be used to identify open reading frames (ORFs) within the genome that contain homology to ORFs from other organisms. Homology between sequences of interest can be determined using the local homology algorithm of Smith and Waterman, 1981, as well as the BestFit program (Rechid et al., 1989), and the FastDB algorithm (FastDB, 1988; described in Current Methods in Sequence Comparison and Analysis, Macromolecule Sequencing and Synthesis, Selected Methods and Applications, pp. 127-149, 1988, Alan R. Liss, Inc).
[0225] Alignment programs that permit gaps in the sequence include ClustalW (Thompson et al., 1994), FASTA3 (Pearson, 2000), Align0 (Myers and Miller, 1988), and TCoffee (Notredame et al., 2000). Other methods for comparing and aligning nucleotide and protein sequences include, for example, BLASTX (NCBI), the Wise package (Birney and Durbin, 2000), and FASTX (Pearson, 2000). These algorithms determine sequence homology between nucleotide and protein sequences without translating the nucleotide sequences into protein sequences. Other techniques for alignment are also known in the art (Doolittle, et al., 1996; BLAST, available from the National Center for Biotechnology Information; FASTA, available in the Genetics Computing Group (GCG) package, from Madison, Wisconsin, USA, a wholly owned subsidiary of Oxford Molecular Group, Inc.; Schlessinger, 1988a; Schlessinger, 1988b; and Needleman and Wunsch, 1970).
[0226] Sequence similarity is calculated based on a reference sequence, which may be a subset of a larger sequence, such as a conserved motif, coding region, flanking region, etc. The reference sequence is usually at least about 18 nt long, at least about 30 nt long, or may extend to the complete sequence that is being compared.
[0227] One parameter for determining percent sequence identity is the percentage of the alignment in the region of strongest alignment between a target and a query sequence. Methods for determining this percentage involve, for example, counting the number of aligned bases of a query sequence in the region of strongest alignment and dividing this number by the total number of bases in the region. For example, 10 matches divided by 11 total residues gives a percent sequence identity of approximately 90.9%. The length of the aligned region is typically at least about 55%, at least about 58%, or at least about 60% of the total sequence length, and can be as great as about 62%, as great as about 64%, and even as great as about 66% of the total sequence length.
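By way of illustration only, the percent identity calculation described above (matches divided by total positions in the region of strongest alignment) can be sketched in Python as follows. The function name `percent_identity`, the use of '-' as a gap symbol, and the example sequences are assumptions introduced here and are not taken from the Sequence Listing.

```python
def percent_identity(aligned_query, aligned_target):
    """Percent of aligned positions that are identical.

    Both strings are assumed to be the same length and to cover only the
    region of strongest alignment. A '-' marks a gap (an assumed
    convention) and never counts as a match.
    """
    if len(aligned_query) != len(aligned_target):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_query:
        raise ValueError("aligned region is empty")
    matches = sum(
        1
        for q, t in zip(aligned_query, aligned_target)
        if q == t and q != "-"
    )
    return 100.0 * matches / len(aligned_query)


if __name__ == "__main__":
    # Hypothetical 11-base aligned region with 10 matching positions.
    query = "ACGTACGTACG"
    target = "ACGTACGTACT"
    print(f"{percent_identity(query, target):.1f}%")
```

For the hypothetical 11-base region shown, 10 of 11 positions match, which reproduces the approximately 90.9% figure given in the text.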
[0228] The present invention includes human and mouse polynucleotide and polypeptide sequences that are at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% homologous to the sequences in the Sequence Listing, based on using the method of determining sequence identity with the insertion of gaps to detect the maximum degree of sequence identity. In other embodiments of interest, homology will be at least about 80%, at least about 85%, or as high as about 90%.
[0229] A variety of structural formats for the input and output means can be used to input and output the information in the computer-based systems of the present invention. One format for an output means ranks the relative expression levels of different polynucleotides. Such presentation provides a skilled artisan with a ranking of relative expression levels to determine a gene expression profile.
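One possible form of such a ranked output is sketched below in Python, purely as an illustration. The function name `rank_by_expression`, the dictionary layout, and the numeric expression levels are hypothetical and do not represent measured data.

```python
def rank_by_expression(levels):
    """Return (identifier, level) pairs sorted from highest to lowest level.

    `levels` maps a polynucleotide identifier to a relative expression
    level. The mapping layout is an assumption made for illustration.
    """
    return sorted(levels.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical relative expression levels, for illustration only.
    example_levels = {
        "SEQ ID NO: 1": 2.5,
        "SEQ ID NO: 2": 0.4,
        "SEQ ID NO: 3": 7.1,
    }
    for rank, (seq_id, level) in enumerate(rank_by_expression(example_levels), 1):
        print(rank, seq_id, level)
```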
[0230] As discussed above, the library of the invention also encompasses biochemical libraries of the polynucleotides shown in SEQ ID NO: 1 - 123, e.g., collections of nucleic acids representing the provided polynucleotides. The biochemical libraries can take a variety of forms, e.g., a solution of cDNAs, a pattern of probe nucleic acids stably associated with a surface of a solid support (i.e., an array) and the like. Of particular interest are nucleic acid arrays in which one or more of the polynucleotide sequences shown in SEQ ID NO: 1 - 123 is represented on the array. A variety of different array formats have been developed and are known to those of skill in the art. The arrays of the subject invention find use in a variety of applications, including gene expression analysis, drug screening, mutation analysis, and the like, as disclosed in the herein-listed exemplary patent documents.
[0231] In addition to the above nucleic acid libraries, analogous libraries of polypeptides are also provided, where the polypeptides of the library will represent at least a portion of the polypeptides encoded by a gene corresponding to one or more of the sequences shown in SEQ ID NO: 1 - 123.
[0232] Further, analogous libraries of antibodies are also provided, where the libraries comprise antibodies or fragments thereof that specifically bind to at least a portion of at least one of the subject polypeptides. Further, antibody libraries may comprise antibodies or fragments thereof that specifically inhibit binding of a subject polypeptide to its ligand or substrate, or that specifically inhibit binding of a subject polypeptide as a substrate to another molecule. Moreover, corresponding nucleic acid libraries are also provided, comprising polynucleotide sequences that encode the antibodies or antibody fragments described above.
Polypeptides
[0233] This invention provides novel polypeptides, and related polypeptide compositions. The novel polypeptides of the invention encompass proteins with amino acid sequences as shown in SEQ ID NO: 124 - 246, or encoded by the nucleic acids having nucleotide sequences shown in SEQ ID NO: 1 - 123. The subject polypeptides are human polypeptides, fragments thereof, variants (such as splice variants), homologs from other species, and derivatives thereof. In particular embodiments, a polypeptide of the invention has an amino acid sequence substantially identical to the sequence of any polypeptide encoded by a polynucleotide sequence shown in SEQ ID NO: 1 - 123.
[0234] These polypeptides may reside within the cell, or extracellularly. They may be secreted from the cell, reside in the cytoplasm, in the membranes, or in any of the intracellular organelles, including the nucleus, mitochondria, ribosomes, or storage granules. They may function as secreted proteins, single-transmembrane proteins, multiple-transmembrane proteins, cytoplasmic proteins, and/or extracellular proteins.
[0235] In some embodiments, the present novel polypeptide modulates the cells or tissues of animals, particularly humans, such as, for example, by stimulating, enhancing or inhibiting T or B cell function or the function of other hematopoietic cells or bone marrow cells; modulates adult or embryonic stem cell or precursor cell growth or differentiation; modulates cell function or activity of neuronal cells or other cells of the CNS, heart cells, liver cells, kidney cells, lung cells, pancreatic cells, gastrointestinal cells, spleen cells, breast cells, prostate cells, ovarian cells, and the like.
[0236] In some embodiments, a subject polypeptide is present as a multimer. Multimers include homodimers, homotrimers, homotetramers, and multimers that include more than four monomeric units. Multimers also include heteromultimers, e.g., heterodimers, heterotrimers, heterotetramers, etc., where the subject polypeptide is present in a complex with proteins other than the subject polypeptide. Where the multimer is a heteromultimer, the subject polypeptide can be present in a 1:1 ratio, a 1:2 ratio, a 2:1 ratio, or other ratio, with the other protein(s).
[0237] In addition to the above specifically listed proteins, polypeptides from other species are also provided, including mammals, such as: primates, rodents, e.g., mice, rats, hamsters, guinea pigs; domestic animals, e.g., sheep, pig, horse, cow, goat, rabbit, dog, cat; and humans, as well as non-mammalian species, e.g., avian, reptile and amphibian, insect, crustacean, fish, plant, fungus, and protozoa.
[0238] By "homolog" is meant a protein having at least about 35%, at least about 40%, at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, or at least about 95%, or higher, amino acid sequence identity to the reference polypeptide, as measured with the "GAP" program (part of the Wisconsin Sequence Analysis Package available through the Genetics Computer Group, Inc. (Madison WI)), where the parameters are: Gap weight: 12; length weight: 4. In many embodiments of interest, homology will be at least about 75%, at least about 80%, or at least 85%, where in certain embodiments of interest, homology will be as high as about 90%.
[0239] Also provided are polypeptides that are substantially identical to at least one amino acid sequence shown in the Sequence Listing, or a fragment thereof, where by substantially identical is meant that the protein has an amino acid sequence identity to the reference sequence of at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%.
[0240] The proteins of the subject invention (e.g., polypeptides encoded by the nucleotide sequences shown in SEQ ID NO: 1 - 123, and polypeptide sequences shown in SEQ ID NO: 124 - 246) have been separated from their naturally occurring environment and are present in a non-naturally occurring environment. In certain embodiments, the proteins are present in a composition where they are more concentrated than in their naturally occurring environment. For example, purified polypeptides are provided.
[0241] In addition to naturally occurring proteins, polypeptides that vary from naturally occurring forms are also provided. Fusion proteins can comprise a subject polypeptide, or fragment thereof, and a polypeptide other than a subject polypeptide ("the fusion partner") fused in-frame at the N-terminus and/or C-terminus of the subject polypeptide, or internally to the subject polypeptide.
[0242] Suitable fusion partners include, but are not limited to, immunologically detectable proteins (e.g., epitope tags, such as hemagglutinin, FLAG, and c-myc); polypeptides that provide a detectable signal or that serve as detectable markers (e.g., a fluorescent protein, e.g., a green fluorescent protein, a fluorescent protein from an Anthozoan species; β-galactosidase; luciferase; Cre recombinase; and the like); polypeptides that provide a catalytic function or induce a cellular response; polypeptides that provide for secretion of the fusion protein from a eukaryotic cell; polypeptides that provide for secretion of the fusion protein from a prokaryotic cell; polypeptides that provide for binding to metal ions (e.g., Hisn, where n = 3-10, e.g., 6His) and structural proteins. Fusion partners can also be those that are able to stabilize the present polypeptide, such as polyethylene glycol ("PEG") and a fragment of an immunoglobulin, such as the Fc fragment of IgG, IgE, IgA, IgM, and/or IgD.
[0243] Detection methods are chosen based on the detectable fusion partner. For example, where the fusion partner provides an immunologically recognizable epitope, an epitope-specific antibody can be used to quantitatively detect the level of polypeptide. In some embodiments, the fusion partner provides a detectable signal, and in these embodiments, the detection method is chosen based on the type of signal generated by the fusion partner. For example, where the fusion partner is a fluorescent protein, fluorescence is measured.
[0244] Where the fusion partner is an enzyme that yields a detectable product, the product can be detected using an appropriate means. For example, β-galactosidase can, depending on the substrate, yield a colored product that can be detected with a spectrophotometer, and the enzyme luciferase can yield a luminescent product detectable with a luminometer.
[0245] In some embodiments, a polypeptide of the invention comprises at least about 5, at least about 8, at least about 10, at least about 15, at least about 18, at least about 20, at least about 25, at least about 30, at least about 50, at least about 75, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, at least about 500, at least about 550, at least about 600, at least about 650, at least about 700, at least about 750, at least about 800, at least about 850, at least about 900, at least about 950, or at least about 1000 contiguous amino acid residues of at least one of the sequences according to SEQ ID NO: 124 - 246, up to and including the entire amino acid sequence.
[0246] Fragments of the subject polypeptides, as well as polypeptides comprising such fragments, are also provided. Fragments of polypeptides of interest will typically be at least about 5, at least about 8, at least about 10, at least about 15, at least about 18, at least about 20, at least about 25, at least about 30, at least about 50, at least about 75, at least about 100, at least about 150, at least about 200, at least about 250, or at least 300 aa in length or longer, where the fragment will have a stretch of amino acids that is identical to the subject protein of at least about 5, at least about 8, at least about 10, at least about 15, at least about 18, at least about 20, at least about 25, at least about 30, or at least about 50 aa in length.
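As a non-limiting illustration, whether a candidate fragment contains a stretch of contiguous residues identical to a reference polypeptide of a given minimum length can be checked as sketched below in Python. The function names, the default threshold of 5 residues (the smallest value recited above), and the example sequences are assumptions introduced here; the sequences are hypothetical and are not taken from the Sequence Listing.

```python
def longest_shared_stretch(fragment, reference):
    """Length of the longest contiguous run of residues common to both."""
    best = 0
    # prev[j] is the length of the shared run ending at the previous
    # fragment residue and reference[j - 1].
    prev = [0] * (len(reference) + 1)
    for f_res in fragment:
        curr = [0] * (len(reference) + 1)
        for j, r_res in enumerate(reference, start=1):
            if f_res == r_res:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def has_identical_stretch(fragment, reference, min_length=5):
    """True if the fragment shares at least `min_length` contiguous residues."""
    return longest_shared_stretch(fragment, reference) >= min_length


if __name__ == "__main__":
    # Hypothetical sequences, for illustration only.
    reference = "MKTAYIAKQRQISFVK"
    fragment = "GGAYIAKQGG"
    print(longest_shared_stretch(fragment, reference))
    print(has_identical_stretch(fragment, reference, min_length=5))
```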
[0247] In some embodiments, fragments exhibit one or more activities associated with a corresponding naturally occurring polypeptide. Fragments find utility in generating antibodies to the full-length polypeptide; and in methods of screening for candidate agents that bind to and/or modulate polypeptide activity. Specific fragments of interest include those with enzymatic activity, those with biological activity including the ability to serve as an epitope or immunogen, and fragments that bind to other proteins or to nucleic acids.
[0248] The invention provides polypeptides comprising such fragments, including, e.g., fusion polypeptides comprising a subject polypeptide fragment fused in frame (directly or indirectly) to another protein (the "fusion partner"), such as the signal peptide of one protein being fused to the mature polypeptide of another protein. Such fusion proteins are typically made by linking the encoding polynucleotides together in a vector or cassette. Suitable fusion partners include, but are not limited to, immunologically detectable proteins (e.g., epitope tags, such as hemagglutinin, FLAG, and c-myc); polypeptides that provide a detectable signal or that serve as detectable markers (e.g., a fluorescent protein, e.g., a green fluorescent protein, a fluorescent protein from an Anthozoan species; β-galactosidase; luciferase; Cre recombinase); polypeptides that provide a catalytic function or induce a cellular response; polypeptides that provide for secretion of the fusion protein from a eukaryotic cell; polypeptides that provide for secretion of the fusion protein from a prokaryotic cell; polypeptides that provide for binding to metal ions (e.g., Hisn, where n = 3-10, e.g., 6His) and structural proteins. Fusion partners can also be those that are able to stabilize the present polypeptide, such as polyethylene glycol ("PEG") and a fragment of an immunoglobulin, such as the Fc fragment of IgG, IgE, IgA, IgM, and/or IgD.
Polypeptide Preparation
[0249] Polypeptides of the invention can be obtained from naturally occurring sources or produced synthetically. The sources of naturally occurring polypeptides will generally depend on the species from which the protein is to be derived, i.e., the proteins will be derived from biological sources that express the proteins. The subject proteins can also be derived from synthetic means, e.g., by expressing a recombinant gene encoding a protein of interest in a suitable system or host or enhancing endogenous expression, as described in more detail above. Further, small peptides can be synthesized in the laboratory by techniques well known in the art.
[0250] In all cases, the product can be recovered by any appropriate means known in the art. For example, convenient protein purification procedures can be employed (e.g., see Guide to Protein Purification, Deutscher et al., 1990). That is, a lysate can be prepared from the original source (e.g., a cell expressing endogenous polypeptide, or a cell comprising the expression vector expressing the polypeptide(s)), and purified using HPLC, exclusion chromatography, gel electrophoresis, or affinity chromatography, and the like.
[0251] The invention thus also provides methods of producing polypeptides. Briefly, the methods generally involve introducing a nucleic acid construct into a host cell in vitro and culturing the host cell under conditions suitable for expression, then harvesting the polypeptide, either from the culture medium or from the host cell (e.g., by disrupting the host cell), or both, as described in detail above. The invention also provides methods of producing a polypeptide using cell-free in vitro transcription/translation methods, which are well known in the art, also as provided above.
[0252] Moreover, the invention provides polypeptides, including polypeptide fragments, as targets for therapeutic intervention, including use in screening assays, for identifying agents that modulate polypeptide level and/or activity, and as targets for antibody and small molecule therapeutics, for example, in the treatment of disorders.
Tables
Table 1. Sequence Listing
FP ID SEQ.ID.NO. (N1) SEQ.ID.NO. (P1)
HG1009529 SEQ.ID.NO.1 SEQ.ID.NO.124
HG1009530 SEQ.ID.NO.2 SEQ.ID.NO.125
HG1009531 SEQ.ID.NO.3 SEQ.ID.NO.126
HG1009552 SEQ.ID.NO.4 SEQ.ID.NO.127
HG1009553 SEQ.ID.NO.5 SEQ.ID.NO.128
HG1009554 SEQ.ID.NO.6 SEQ.ID.NO.129
HG1009558 SEQ.ID.NO.7 SEQ.ID.NO.130
HG1009559 SEQ.ID.NO.8 SEQ.ID.NO.131
HG1009563 SEQ.ID.NO.9 SEQ.ID.NO.132
HG1009564 SEQ.ID.NO.10 SEQ.ID.NO.133
HG1009565 SEQ.ID.NO.11 SEQ.ID.NO.134
HG1009566 SEQ.ID.NO.12 SEQ.ID.NO.135
HG1009568 SEQ.ID.NO.13 SEQ.ID.NO.136
HG1009571 SEQ.ID.NO.14 SEQ.ID.NO.137
HG1009574 SEQ.ID.NO.15 SEQ.ID.NO.138
HG1009577 SEQ.ID.NO.16 SEQ.ID.NO.139
HG1009578 SEQ.ID.NO.17 SEQ.ID.NO.140
HG1009583 SEQ.ID.NO.18 SEQ.ID.NO.141
HG1009586 SEQ.ID.NO.19 SEQ.ID.NO.142
HG1009589 SEQ.ID.NO.20 SEQ.ID.NO.143
HG1009594 SEQ.ID.NO.21 SEQ.ID.NO.144
HG1009595 SEQ.ID.NO.22 SEQ.ID.NO.145
HG1009596 SEQ.ID.NO.23 SEQ.ID.NO.146
HG1009602 SEQ.ID.NO.24 SEQ.ID.NO.147
HG1009603 SEQ.ID.NO.25 SEQ.ID.NO.148
HG1009606 SEQ.ID.NO.26 SEQ.ID.NO.149
HG1009607 SEQ.ID.NO.27 SEQ.ID.NO.150
HG1009611 SEQ.ID.NO.28 SEQ.ID.NO.151
HG1009618 SEQ.ID.NO.29 SEQ.ID.NO.152
HG1009621 SEQ.ID.NO.30 SEQ.ID.NO.153
HG1009622 SEQ.ID.NO.31 SEQ.ID.NO.154
HG1009627 SEQ.ID.NO.32 SEQ.ID.NO.155
HG1009632 SEQ.ID.NO.33 SEQ.ID.NO.156
HG1009641 SEQ.ID.NO.34 SEQ.ID.NO.157
HG1009650 SEQ.ID.NO.35 SEQ.ID.NO.158
HG1009657 SEQ.ID.NO.36 SEQ.ID.NO.159
HG1009659 SEQ.ID.NO.37 SEQ.ID.NO.160
HG1009662 SEQ.ID.NO.38 SEQ.ID.NO.161
HG1009663 SEQ.ID.NO.39 SEQ.ID.NO.162
HG1009667 SEQ.ID.NO.40 SEQ.ID.NO.163
HG1009673 SEQ.ID.NO.41 SEQ.ID.NO.164
HG1009676 SEQ.ID.NO.42 SEQ.ID.NO.165
HG1009677 SEQ.ID.NO.43 SEQ.ID.NO.166
HG1009679 SEQ.ID.NO.44 SEQ.ID.NO.167
HG1009694 SEQ.ID.NO.45 SEQ.ID.NO.168
HG1009696 SEQ.ID.NO.46 SEQ.ID.NO.169
HG1009699 SEQ.ID.NO.47 SEQ.ID.NO.170
HG1009709 SEQ.ID.NO.48 SEQ.ID.NO.171
HG1009713 SEQ.ID.NO.49 SEQ.ID.NO.172
HG1009729 SEQ.ID.NO.50 SEQ.ID.NO.173
HG1009731 SEQ.ID.NO.51 SEQ.ID.NO.174
HG1009737 SEQ.ID.NO.52 SEQ.ID.NO.175
HG1009740 SEQ.ID.NO.53 SEQ.ID.NO.176
HG1009742 SEQ.ID.NO.54 SEQ.ID.NO.177
HG1009744 SEQ.ID.NO.55 SEQ.ID.NO.178
HG1009746 SEQ.ID.NO.56 SEQ.ID.NO.179
HG1009758 SEQ.ID.NO.57 SEQ.ID.NO.180
HG1009761 SEQ.ID.NO.58 SEQ.ID.NO.181
HG1009762 SEQ.ID.NO.59 SEQ.ID.NO.182
HG1009765 SEQ.ID.NO.60 SEQ.ID.NO.183
HG1009776 SEQ.ID.NO.61 SEQ.ID.NO.184
HG1009784 SEQ.ID.NO.62 SEQ.ID.NO.185
HG1009790 SEQ.ID.NO.63 SEQ.ID.NO.186
HG1009793 SEQ.ID.NO.64 SEQ.ID.NO.187
HG1009794 SEQ.ID.NO.65 SEQ.ID.NO.188
HG1009795 SEQ.ID.NO.66 SEQ.ID.NO.189
HG1009798 SEQ.ID.NO.67 SEQ.ID.NO.190
HG1009800 SEQ.ID.NO.68 SEQ.ID.NO.191
HG1009805 SEQ.ID.NO.69 SEQ.ID.NO.192
HG1009821 SEQ.ID.NO.70 SEQ.ID.NO.193
HG1009831 SEQ.ID.NO.71 SEQ.ID.NO.194
HG1009834 SEQ.ID.NO.72 SEQ.ID.NO.195
HG1009841 SEQ.ID.NO.73 SEQ.ID.NO.196
HG1009843 SEQ.ID.NO.74 SEQ.ID.NO.197
HG1009846 SEQ.ID.NO.75 SEQ.ID.NO.198
HG1009847 SEQ.ID.NO.76 SEQ.ID.NO.199
HG1009864 SEQ.ID.NO.77 SEQ.ID.NO.200
HG1009865 SEQ.ID.NO.78 SEQ.ID.NO.201
HG1009869 SEQ.ID.NO.79 SEQ.ID.NO.202
HG1009872 SEQ.ID.NO.80 SEQ.ID.NO.203
HG1009879 SEQ.ID.NO.81 SEQ.ID.NO.204
HG1009881 SEQ.ID.NO.82 SEQ.ID.NO.205
HG1009884 SEQ.ID.NO.83 SEQ.ID.NO.206
HG1009885 SEQ.ID.NO.84 SEQ.ID.NO.207
HG1009888 SEQ.ID.NO.85 SEQ.ID.NO.208
HG1009896 SEQ.ID.NO.86 SEQ.ID.NO.209
HG1011790 SEQ.ID.NO.87 SEQ.ID.NO.210
HG1011827 SEQ.ID.NO.88 SEQ.ID.NO.211
HG1011828 SEQ.ID.NO.89 SEQ.ID.NO.212
HG1011830 SEQ.ID.NO.90 SEQ.ID.NO.213
HG1011833 SEQ.ID.NO.91 SEQ.ID.NO.214
HG1011834 SEQ.ID.NO.92 SEQ.ID.NO.215
HG1011836 SEQ.ID.NO.93 SEQ.ID.NO.216
HG1011837 SEQ.ID.NO.94 SEQ.ID.NO.217
HG1011839 SEQ.ID.NO.95 SEQ.ID.NO.218
HG1011840 SEQ.ID.NO.96 SEQ.ID.NO.219
HG1011841 SEQ.ID.NO.97 SEQ.ID.NO.220
HG1011842 SEQ.ID.NO.98 SEQ.ID.NO.221
HG1011843 SEQ.ID.NO.99 SEQ.ID.NO.222
HG1011844 SEQ.ID.NO.100 SEQ.ID.NO.223
HG1011845 SEQ.ID.NO.101 SEQ.ID.NO.224
HG1011846 SEQ.ID.NO.102 SEQ.ID.NO.225
HG1011848 SEQ.ID.NO.103 SEQ.ID.NO.226
HG1011849 SEQ.ID.NO.104 SEQ.ID.NO.227
HG1011850 SEQ.ID.NO.105 SEQ.ID.NO.228
HG1011853 SEQ.ID.NO.106 SEQ.ID.NO.229
HG1011854 SEQ.ID.NO.107 SEQ.ID.NO.230
HG1011857 SEQ.ID.NO.108 SEQ.ID.NO.231
HG1011858 SEQ.ID.NO.109 SEQ.ID.NO.232
HG1011859 SEQ.ID.NO.110 SEQ.ID.NO.233
HG1011860 SEQ.ID.NO.111 SEQ.ID.NO.234
HG1011861 SEQ.ID.NO.112 SEQ.ID.NO.235
HG1011862 SEQ.ID.NO.113 SEQ.ID.NO.236
HG1011865 SEQ.ID.NO.114 SEQ.ID.NO.237
HG1011867 SEQ.ID.NO.115 SEQ.ID.NO.238
HG1011868 SEQ.ID.NO.116 SEQ.ID.NO.239
HG1011869 SEQ.ID.NO.117 SEQ.ID.NO.240
HG1011870 SEQ.ID.NO.118 SEQ.ID.NO.241
HG1011871 SEQ.ID.NO.119 SEQ.ID.NO.242
HG1011872 SEQ.ID.NO.120 SEQ.ID.NO.243
HG1011873 SEQ.ID.NO.121 SEQ.ID.NO.244
HG1011874 SEQ.ID.NO.122 SEQ.ID.NO.245
HG1011875 SEQ.ID.NO.123 SEQ.ID.NO.246
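In the entries of Table 1, each polypeptide SEQ ID NO appears to equal the corresponding nucleotide SEQ ID NO plus 123. A minimal Python sketch of that lookup follows, offered for convenience only; the function name `polypeptide_seq_id` is hypothetical, and the offset is an observation drawn from the table rather than a rule stated in the specification.

```python
# Table 1 pairs nucleotide SEQ ID NO: 1-123 with polypeptide SEQ ID NO: 124-246.
NUM_PAIRS = 123


def polypeptide_seq_id(nucleotide_seq_id):
    """Return the polypeptide SEQ ID NO paired with a nucleotide SEQ ID NO.

    Assumes the constant offset observed across the rows of Table 1.
    """
    if not 1 <= nucleotide_seq_id <= NUM_PAIRS:
        raise ValueError("nucleotide SEQ ID NO must be between 1 and 123")
    return nucleotide_seq_id + NUM_PAIRS


if __name__ == "__main__":
    # SEQ ID NO: 45 (HG1009694) is listed with polypeptide SEQ ID NO: 168.
    print(polypeptide_seq_id(45))
```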
Table 2. Tree Vote, Transmembrane Sequences, and Sequence Coordinates
Columns: FP ID; Tree Vote; Mature Protein Coords.; Alternate Mature Protein Coords.; Signal Peptide Coords.; TM; TM Coords.; non-TM Coords.
HG1009529P1 0 (1-671) 1 (397-419) (1-396)(420-671)
HG1009530P1 0.01 (1-258) 0 (1-258)
HG1009531P1 0 (1-432) 0 (1-432)
HG1009552P1 0.02 (1-677) (26-677) (1-25) 2 (50-72)(131-150) (1-49)(73-130)(151-677)
HG1009554P1 0.51 31-214) (7-20) 0 (1-214)
HG1009559P1 0.34 (1-399) (18-399) (1-17) 0 (1-399)
HG1009565P1 0 (1-294) 1 (257-279) (1-256)(280-294)
HG1009568P1 0 (35-561) (1-561) 1 (465-487) (1-464)(488-561)
HG1009571P1 0 (1-580) (19-580) (1-18) 1 (341-363) (1-340)(364-580)
HG1009574P1 0.04 (1-142) 1 (106-128) (1-105)(129-142)
HG1009577P1 0 (35-234) (1-234) 2 (159-181)(190-212) (1-158)(182-189)(213-234)
HG1009578P1 0.06 (1-124) 1 (61-83) (1-60)(84-124)
HG1009583P1 0.02 (1-359) 0 (1-359)
HG1009586P1 0.02 (1-400) 0 (1-400)
HG1009594P1 0.01 (1-597) 0 (1-597)
HG1009595P1 0.97 (21-408) (20-408) (1-19) 0 (1-408)
HG1009596P1 0.01 (1-594) 0 (1-594)
HG1009602P1 0 (1-414) 0 (1-414)
HG1009603P1 0 (1-441) 0 (1-441)
HG1009606P1 0.87 (27-93) (24-93) (1-23) 0 (1-93)
HG1009607P1 0 (1-438) 1 (346-368) (1-345)(369-438)
HG1009611P1 0 (23-647) (5-22) 1 (575-594) (1-574)(595-647)
HG1009618P1 0 (1-297) 0 (1-297)
HG1009621P1 0 (1-409) 0 (1-409)
HG1009622P1 0 (38-504) (1-504) 0 (1-504)
HG1009627P1 0.44 (1-224) (18-224) (3-17) 0 (1-224)
HG1009632P1 0.04 (1-349) 0 (1-349)
HG1009641P1 0.82 (28-134) (25-134) (1-24) 0 (1-134)
HG1009657P1 0.96 (30-551) (16-551) (1-15) 1 (7-29) (1-6)(30-551)
HG1009659P1 0.04 (1-2576) 11 (227-249)(466-488)(509-531)(551-573)(580-597)(1258-1280)(1431-1453)(1473-1495)(1515-1537)(1547-1569)(1635-1657) (1-226)(250-465)(489-508)(532-550)(574-579)(598-1257)(1281-1430)(1454-1472)(1496-1514)(1538-1546)(1570-1634)(1658-2576)
HG1009662P1 0.91 (19-206) (25-206) (1-24) 1 (4-23) (1-3)(24-206)
HG1009663P1 0.53 (1-73) (28-73) (14-27) 0 (1-73)
HG1009667P1 0.66 (25-259) (23-259) (1-22) 0 (1-259)
HG1009673P1 0.02 (1-689) 0 (1-689)
HG1009676P1 0 (1-461) 0 (1-461)
HG1009677P1 0 (1-294) 1 (257-279) (1-256)(280-294)
HG1009679P1 0.79 (26-185) (7-25) 0 (1-185)
HG1009694P1 0.01 (1-317) 1 (282-304) (1-281)(305-317)
HG1009696P1 0.01 (1-301) 0 (1-301)
HG1009699P1 0.02 (18-455) (4-17) 1 (45-67) (1-44)(68-455)
HG1009737P1 0.86 (24-349) (1-23) 0 (1-349)
HG1009740P1 0.01 (1-585) 0 (1-585)
HG1009746P1 0.96 (19-206) (22-206) (1-21) 0 (1-206)
HG1009758P1 0 (1-184) 0 (1-184)
HG1009761P1 0.97 (22-725) (7-21) 0 (1-725)
HG1009762P1 0.01 (1-245) 0 (1-245)
HG1009765P1 0.99 (17-1126) (20-1126) (1-19) 0 (1-1126)
HG1009776P1 1 (22-833) (26-833) (1-25) 0 (1-833)
HG1009784P1 0 (1-465) 0 (1-465)
HG1009794P1 0.94 (16-205) (17-205) (1-16) 0 (1-205)
HG1009795P1 0 (1-150) 0 (1-150)
HG1009798P1 0.01 (1-300) (47-300) (14-46) 1 (45-67) (1-44)(68-300)
HG1009800P1 0 (1-424) 1 (33-55) (1-32)(56-424)
HG1009805P1 0.79 (18-345) (24-345) (1-23) 0 (1-345)
HG1009831P1 0 (1-524) 0 (1-524)
HG1009834P1 0 (1-691) 1 (42-61) (1-41)(62-691)
HG1009841P1 0.03 (34-123) (1-123) 1 (21-43) (1-20)(44-123)
HG1009843P1 0.7 (26-175) (7-25) 0 (1-175)
HG1009846P1 0.01 (1-560) 0 (1-560)
HG1009847P1 0.02 (1-1134) 3 (129-151)(255-277)(954-976) (1-128)(152-254)(278-953)(977-1134)
HG1009864P1 0.02 (33-501) (1-32) 1 (21-43) (1-20)(44-501)
HG1009869P1 0.81 (19-144) (20-144) (1-19) 0 (1-144)
HG1009872P1 0.02 (1-434) (28-434) (9-27) 13 (7-29)(33-55)(68-90)(94-113)(120-142)(146-168)(175-197)(201-223)(228-250)(260-282)(287-309)(314-336)(357-379) (1-6)(30-32)(56-67)(91-93)(114-119)(143-145)(169-174)(198-200)(224-227)(251-259)(283-286)(310-313)(337-356)(380-434)
HG1009879P1 0.03 (1-334) 4 (213-235)(237-259)(263-285)(290-308) (1-212)(236-236)(260-262)(286-289)(309-334)
HG1009881P1 0.03 (1-383) 0 (1-383)
HG1009884P1 0.54 (1-189) (19-189) (1-18) 0 (1-189)
HG1009885P1 1 (17-85) (20-85) (1-19) 0 (1-85)
HG1009888P1 0 (1-171) 0 (1-171)
HG1009896P1 0.78 (18-286) (24-286) (1-23) 0 (1-286)
HG1011828P1 0.98 (20-72) (1-19) 0 (1-72)
HG1011830P1 0.92 (6-105) (19-105) (4-18) 0 (1-105)
HG1011836P1 0 (1-135) 1 (115-134) (1-114)(135-135)
HG1011868P1 0 (1-101) (18-101) (1-17) 0 (1-101)
HG1011869P1 0 (1-109) (25-109) (1-24) 0 (1-109)
HG1011872P1 0 (1-463) 0 (1-463)
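By way of illustration only, the TM and non-TM coordinate columns above appear to be complementary, so that together they cover the full-length polypeptide from residue 1 to its last residue. The following Python sketch checks that property for one row; the function names `parse_coords` and `tiles_protein` are hypothetical and are not part of the original disclosure.

```python
import re

def parse_coords(text):
    """Parse a string such as '(1-345)(369-438)' into (start, end) tuples."""
    return [(int(a), int(b)) for a, b in re.findall(r"\((\d+)-(\d+)\)", text)]

def tiles_protein(tm, non_tm):
    """Return True if TM and non-TM segments cover residue 1 onward with no gap or overlap."""
    segments = sorted(parse_coords(tm) + parse_coords(non_tm))
    if not segments or segments[0][0] != 1:
        return False
    # Each segment must begin immediately after the previous segment ends.
    return all(nxt[0] == prev[1] + 1 for prev, nxt in zip(segments, segments[1:]))

# Row HG1009607P1: one TM segment at residues 346-368 of a 438-residue protein.
row_ok = tiles_protein("(346-368)", "(1-345)(369-438)")
print(row_ok)
```

A check of this kind can help flag rows in which a coordinate was corrupted during transcription of the table.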
Table 3. Predicted Protein Length, and Characteristics of Homologous Known Proteins
FP ID; Predicted Protein Length; Top Hit Accession ID; Top Hit Annotation; Top Hit % ID Relative to Predicted Protein; Top Hit Length; Top Human Hit Accession ID; Top Human Hit Annot; Top Human Hit % ID Relative to Predicted Protein; Top Human Hit Length
HG1009530P1 258 gi|225047|prf||1207289A reverse transcriptase related protein 51 1259 gi|106322|pir||B34087 hypothetical protein (L1H 3' region) - human 51 1280
HG1009552P1 677 gi|1170636|sp|P23743|KDGA_HUMAN Diacylglycerol kinase, alpha (Diglyceride kinase) (DGK-alpha) (DAG kinase alpha) (80 kDa diacylglycerol kinase) 64 735 gi|1170636|sp|P23743|KDGA_HUMAN Diacylglycerol kinase, alpha (Diglyceride kinase) (DGK-alpha) (DAG kinase alpha) (80 kDa diacylglycerol kinase) 64 735
HG1009618P1 297 gi|106322|pir||B34087 hypothetical protein (L1H 3' region) - human 51 1280 gi|106322|pir||B34087 hypothetical protein (L1H 3' region) - human 51 1280
HG1009632P1 349 gi|23484307|gb|EAA19685.1| RNase H, putative [Plasmodium yoelii yoelii] 56 962 no human hit
HG1009657P1 551 gi|34419649|ref|NP_899230.1| cytochrome P450, family 26, subfamily C, polypeptide 1; cytochrome p450 CYP26C1 [Homo sapiens] 94 522 gi|34419649|ref|NP_899230.1| cytochrome P450, family 26, subfamily C, polypeptide 1; cytochrome p450 CYP26C1 [Homo sapiens] 94 522
HG1009662P1 206 gi|37543329|ref|XP_113912.2| similar to DKFZP566O084 protein [Homo sapiens] 78 602 gi|37543329|ref|XP_113912.2| similar to DKFZP566O084 protein [Homo sapiens] 78 602
HG1009679P1 185 gi|37541905|ref|XP_353035.1| hypothetical protein XP_353034 [Homo sapiens] 57 183 gi|37541905|ref|XP_353035.1| hypothetical protein XP_353034 [Homo sapiens] 57 183
HG1009699P1 455 gi|37546077|ref|XP_208099.2| similar to HGRG8 protein [Homo sapiens] 92 481 gi|37546077|ref|XP_208099.2| similar to HGRG8 protein [Homo sapiens] 92 481
HG1009765P1 1126 gi|32698868|ref|NP_872325.1| antimicrobial peptide RY2G5; likely ortholog of rat probable ligand-binding protein RY2G5; long palate, lung and nasal epithelium carcinoma associated 4 [Homo sapiens] 50 575 gi|32698868|ref|NP_872325.1| antimicrobial peptide RY2G5; likely ortholog of rat probable ligand-binding protein RY2G5; long palate, lung and nasal epithelium carcinoma associated 4 [Homo sapiens] 50 575
HG1009776P1 833 gi|37183044|gb|AAQ89322.1| prohormone convertase [Homo sapiens] 90 755 gi|37183044|gb|AAQ89322.1| prohormone convertase [Homo sapiens] 90 755
HG1009847P1 1134 gi|37543876|ref|XP_044062.4| similar to hypothetical protein [Homo sapiens] 55 888 gi|37543876|ref|XP_044062.4| similar to hypothetical protein [Homo sapiens] 55 888
HG1009888P1 171 gi|106322|pir||B34087 hypothetical protein (L1H 3' region) - human 70 1280 gi|106322|pir||B34087 hypothetical protein (L1H 3' region) - human 70 1280
HG1009896P1 286 gi|29744308|ref|XP_084672.2| similar to cDNA sequence BC021608 [Homo sapiens] 66 189 gi|29744308|ref|XP_084672.2| similar to cDNA sequence BC021608 [Homo sapiens] 66 189
HG1011827P1 105 gi|32264621|gb|AAP78757.1| Acl233 [Rattus norvegicus] 51 473 no human hit
HG1011828P1 72 gi|37183232|gb|AAQ89416.1| GSGL541 [Homo sapiens] 69 78 gi|37183232|gb|AAQ89416.1| GSGL541 [Homo sapiens] 69 78
HG1011830P1 105 gi|29744340|ref|XP_294677.1| hypothetical protein XP_294677 [Homo sapiens] gi|37181316|gb|AAQ88472.1| LVLF3112 [Homo sapiens] 100 113 gi|29744340|ref|XP_294677.1| hypothetical protein XP_294677 [Homo sapiens] gi|37181316|gb|AAQ88472.1| LVLF3112 [Homo sapiens] 100 113
HG1011859P1 79 gi|23484307|gb|EAA19685.1| RNase H, putative [Plasmodium yoelii yoelii] 59 962 no human hit
HG1011868P1 101 gi|16117791|ref|NP_000987.2| ribosomal protein L35a; 60S ribosomal protein L35a [Homo sapiens] 86 110 gi|16117791|ref|NP_000987.2| ribosomal protein L35a; 60S ribosomal protein L35a [Homo sapiens] 86 110
HG1011870P1 110 gi|31127136|gb|AAH52883.1| Unknown (protein for MGC:60753) [Mus musculus] 59 555 no human hit
HG1011872P1 463 gi|4456990|gb|AAD21097.1| polymerase [Homo sapiens] 72 956 gi|4456990|gb|AAD21097.1| polymerase [Homo sapiens] 72 956
Table 4. Pfam Coordinates
FP Patent ID Pfam Coords.
HG1009529P1 sushi (323-382)
HG1009529P1 sushi (57-112)
HG1009530P1 rvt (52-157)
HG1009531P1 Ribosomal_L10e (242-293)
HG1009531P1 Ribosomal_L10e (294-330)
HG1009552P1 efhand (376-404)
HG1009552P1 DAG_PE-bind (423-470)
HG1009552P1 DAGKc (578-677)
HG1009559P1 rnaseH (312-377)
HG1009583P1 rvt (15-109)
HG1009583P1 rvt (279-359)
HG1009586P1 rve (100-204)
HG1009594P1 rve (314-469)
HG1009594P1 dUTPase (492-568)
HG1009594P1 rvt (90-230)
HG1009596P1 rve (314-469)
HG1009596P1 dUTPase (492-565)
HG1009596P1 rvt (90-230)
HG1009602P1 Transposase_22 (178-231)
HG1009602P1 Transposase_22 (233-413)
HG1009603P1 Transposase_22 (1-49)
HG1009603P1 rvt (338-377)
HG1009603P1 rvt (379-436)
HG1009618P1 rvt (49-149)
HG1009621P1 rvt (1-70)
HG1009622P1 rvt (378-490)
HG1009627P1 dUTPase (108-165)
HG1009632P1 rvt (130-182)
HG1009657P1 p450 (426-537)
HG1009657P1 p450 (50-392)
HG1009659P1 Ribosomal_S9 (12-118)
HG1009659P1 ABC_tran (1812-1993)
HG1009659P1 rrm (2401-2463)
HG1009659P1 ABC_tran (829-1068)
HG1009662P1 adh_short (36-162)
HG1009673P1 rvt (1-115)
HG1009673P1 rve (247-402)
HG1009673P1 dUTPase (423-525)
HG1009673P1 dUTPase (614-671)
HG1009676P1 rvt (92-149)
HG1009696P1 rvt (22-59)
HG1009699P1 YTH (333-423)
HG1009740P1 rvt (37-116)
HG1009758P1 rvt (134-181)
HG1009761P1 Transposase_22 (626-723)
HG1009762P1 rvt (51-131)
HG1009765P1 LBP_BPI_CETP_C (281-409)
HG1009765P1 LBP_BPI_CETP (34-205)
HG1009765P1 LBP_BPI_CETP (681-828)
HG1009765P1 LBP_BPI_CETP_C (921-1047)
HG1009776P1 Peptidase_S8 (176-510)
HG1009776P1 P_proprotein (527-658)
HG1009784P1 Transposase_22 (407-462)
HG1009795P1 rvt (74-150)
HG1009831P1 rvt (108-188)
HG1009831P1 rvt (193-226)
HG1009831P1 Gag_MA (2-80)
HG1009831P1 rnaseH (330-395)
HG1009834P1 Ribosomal_L7Ae (547-641)
HG1009846P1 rvt (323-554)
HG1009864P1 ank (472-493)
HG1009881P1 rvt (126-276)
HG1009888P1 rvt (1-165)
HG1011868P1 Ribosomal_L35Ae (5-101)
HG1011869P1 Transposase_1 (41-100)
HG1011872P1 Integrase_Zn (214-253)
HG1011872P1 rve (264-395)
HG1011872P1 integrase (385-405)
HG1011872P1 rnaseH (83-212)
Examples
[0253] The examples, which are intended to be purely exemplary of the invention and should therefore not be considered to limit the invention in any way, also describe and detail aspects and embodiments of the invention discussed above. The examples are not intended to represent that the experiments below are all or the only experiments performed. Efforts have been made to ensure accuracy with respect to numbers used (e.g., amounts, temperature, etc.) but some experimental errors and deviations should be accounted for. Unless indicated otherwise, parts are parts by weight, molecular weight is weight average molecular weight, temperature is in degrees Centigrade, and pressure is at or near atmospheric.
[0254] While the present invention has been described with reference to the specific embodiments thereof, it should be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the true spirit and scope of the invention. In addition, many modifications can be made to adapt a particular situation, material, composition of matter, process, process step or steps, to the objective, spirit and scope of the present invention. All such modifications are intended to be within the scope of the claims appended hereto.
[0255] Additional objects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objects and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the appended claims. Moreover, advantages described in the body of the specification, if not included in the claims, are not per se limitations to the claimed invention.
[0256] It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed. Moreover, it must be understood that the invention is not limited to the particular embodiments described, as such may, of course, vary. Further, the terminology used to describe particular embodiments is not intended to be limiting, since the scope of the present invention will be limited only by its claims.
[0257] With respect to ranges of values, the invention encompasses each intervening value between the upper and lower limits of the range to at least a tenth of the lower limit's unit, unless the context clearly indicates otherwise. Further, the invention encompasses any other stated intervening values. Moreover, the invention also encompasses ranges excluding either or both of the upper and lower limits of the range, unless specifically excluded from the stated range.
[0258] Unless defined otherwise, the meanings of all technical and scientific terms used herein are those commonly understood by one of ordinary skill in the art to which this invention belongs. One of ordinary skill in the art will also appreciate that any methods and materials similar or equivalent to those described herein can also be used to practice or test the invention. Further, all publications mentioned herein are incorporated by reference.
[0259] It must be noted that, as used herein and in the appended claims, the singular forms "a," "or," and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "a subject polypeptide" includes a plurality of such polypeptides and reference to "the agent" includes reference to one or more agents and equivalents thereof known to those skilled in the art, and so forth.
[0260] Further, all numbers expressing quantities of ingredients, reaction conditions, % purity, polypeptide and polynucleotide lengths, and so forth, used in the specification and claims, are modified by the term "about," unless otherwise indicated. Accordingly, the numerical parameters set forth in the specification and claims are approximations that may vary depending upon the desired properties of the present invention. At the very least, and not as an attempt to limit the application of the doctrine of equivalents to the scope of the claims, each numerical parameter should at least be construed in light of the number of reported significant digits, applying ordinary rounding techniques. Nonetheless, the numerical values set forth in the specific examples are reported as precisely as possible. Any numerical value, however, inherently contains certain errors from the standard deviation of its experimental measurement.
[0261] The publications discussed herein are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that the present invention is not entitled to antedate such publication by virtue of prior invention. Further, the dates of publication provided may be different from the actual publication dates which may need to be independently confirmed.
Example 1. Cell-Free Expression in a Wheat Germ Protein Synthesis System: The PROTEIOS™ Kit
[0262] Protein synthesis is performed using the PROTEIOS™ wheat germ cell-free protein synthesis system described by Madin et al., 2000. This system layers a "buffer mix" layer containing lower molecular weight substances such as amino acids and energy sources (e.g., inter alia, ATP and GTP) underneath a "reaction mix" layer containing wheat germ extract, mRNA, creatine kinase, and creatine phosphate. One or more of the amino acids can be labeled for detection, e.g., with a radioactive label. The buffer mix layer and the reaction mix layer mix together gradually, providing a continuously replenishing source of energy and amino acids over approximately 1-24 hours, during which protein synthesis proceeds.
[0263] In this system, plasmid DNA is prepared from a target gene open reading frame operably linked to a promoter. The plasmid vector is one suitable for expression in plants, e.g., the pEU vector (PROTEIOS™ Plasmid Set). DNA is transcribed into mRNA, which is in turn translated to polypeptides in a wheat germ extract with an energy replenishing system.
[0264] A reaction mix is prepared by mixing 33.5 μl mRNA at a concentration of about 0.3 to about 0.4 μg/μl, and 1.8 μl distilled water with 10.0 μl wheat germ extract, 1.0 μl RNase inhibitor (40 U/μl), 1.7 μl creatine kinase (10 mg/ml), and 2.0 μl Buffer #2 as provided by the Proteios™ wheat germ cell-free protein synthesis core kit, Invitrotech Co., Ltd. (Kyoto, Japan). The reaction mix is placed under 250 μl of "buffer mix" as provided by the Proteios™ kit in a flat-bottom reaction well. The wells are sealed to prevent vaporization, and the reaction proceeds at 23-26°C for 16 hours.
Example 2. Cell-Free Expression in a Wheat Germ Protein Synthesis System: The Dialysis Method
[0265] Cell-free extract is prepared from wheat embryos by the method of Madin et al., 2000. This extract (300 μl) is added to the following components to a final volume of 500 μl reaction mixture: Hepes/KOH, pH 7.8 (24 mM), ATP (1.2 mM), GTP (0.25 mM), creatine phosphate (16 mM), creatine kinase (0.45 mg/ml), DTT (2 mM), spermidine (0.4 mM), each of the 20 amino acids including 2 μCi/ml [14C] leucine (0.3 mM), magnesium acetate (2.5 mM), potassium acetate (100 mM), 50 μg/ml deacylated tRNA prepared from wheat embryos, Nonidet P-40 (0.05%), E-64 proteinase inhibitor (1 μM), NaN3 (0.005%), and mRNA corresponding to the polypeptide of interest (0.02 nmol).
[0266] A dialysis bag with this reaction mixture is added to 5 ml of dialysate solution containing all of the ingredients of the reaction mixture with the exception of creatine kinase. Dialysis is conducted at 23°C. At 24 hour intervals the reaction mixture is supplemented with 0.05 nmol of mRNA corresponding to the polypeptide of interest and 50 μg creatine kinase. The dialysate solution is replaced every 24 hours. This method has been previously demonstrated to produce protein at a linear rate for a minimum of 72 hours and to be suitable for large-scale preparation of proteins (Madin et al., 2000).
Example 3. Expression of Biologically Active Proteins Using a Cell-Free System
[0267] The following protocol has been used to produce the secreted proteins IL-6, mature IL-1B, GM-CSF, IL-4, IL-3, and Blys in a cell-free system. This protocol has also been used to make transmembrane proteins. In a preferred embodiment of the invention, the transmembrane polypeptides are expressed on Nanodiscs™ directly upon their production by the cell-free translation system described below in the absence of detergent. In another embodiment, the transmembrane polypeptides are expressed on Nanodiscs™ directly upon their production by the cell-free translation system described below in the presence of detergent.
[0268] The types of detergent and their concentrations beneficial for promoting transmembrane protein synthesis may need to be determined for each individual protein. In general, the detergents NP-40 (0.05%), Tween 20 (0.1%), Tween 80 (0.1%), Octylglucoside (0.2%), and Chaps (0.1%) were compatible with the production of proteins in the cell-free expression system described below. Octylglucoside at a concentration of 0.5% inhibited protein synthesis, and 0.05% NP-40 was inhibitory in some experiments.
[0269] A 38 nucleotide primer is designed and synthesized which contains the following nineteen nucleotides "5'CCACCCACCACCACCAATG 3'" followed by nucleotides predicted to encode the amino terminus of the transmembrane polypeptide of interest. A second reverse primer is designed to a region of the plasmid (containing the cDNA encoding the protein to be expressed) approximately 400 nucleotides downstream from the coding sequence of the gene to be expressed. The second primer is designed as the reverse complement of the vector sequence in this region such that this primer will be useful for doing PCR amplification of the coding sequence of the open reading frame to be expressed. The second primer is typically 17-23 nucleotides in length with a Tm of approximately 55-65°C.
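By way of illustration only, the primer construction described in paragraph [0269] can be sketched in Python as follows. The sketch assumes that the ATG at the 3' end of the nineteen-nucleotide leader serves as the initiation codon, so that the gene-specific portion begins at the second codon of the open reading frame; that reading, the function names, and the example sequences are assumptions and are not stated in the specification.

```python
LEADER = "CCACCCACCACCACCAATG"  # the nineteen nucleotides quoted in [0269]
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def forward_primer(orf, total_len=38, skip_start_codon=True):
    """Leader followed by ORF bases encoding the amino terminus (38 nt in total)."""
    orf = orf.upper()
    # Assumption: the leader already supplies the ATG, so the ORF's own ATG is skipped.
    start = 3 if (skip_start_codon and orf.startswith("ATG")) else 0
    n_gene = total_len - len(LEADER)
    return LEADER + orf[start:start + n_gene]

def reverse_complement(seq):
    """Reverse complement of a DNA sequence, as used for the reverse primer."""
    return seq.upper().translate(COMPLEMENT)[::-1]

# Hypothetical open reading frame and vector region, for illustration only.
example_orf = "ATGGCTAGCAAGGGTGAAGAACTGTTCACC"
example_vector_region = "GATTACAGGCATGCAAGCTT"

fwd = forward_primer(example_orf)
rev = reverse_complement(example_vector_region)
print(fwd, len(fwd))
print(rev, len(rev))
```

In practice the reverse primer would be chosen from the vector sequence approximately 400 nucleotides downstream of the coding sequence and adjusted to 17-23 nucleotides with the stated Tm.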
[0270] A purified plasmid containing the cDNA to be expressed or E. coli cells containing the plasmid that contains the cDNA to be expressed is then added as template to a standard PCR reaction that includes the two primers described above, standard PCR reagents, and a DNA polymerase that has proof-reading activity, and subjected to 15-30 cycles of PCR amplification. The product of this PCR reaction is called "PCR1 coding template."
[0271] A separate PCR reaction is used to prepare a "GST-Mega primer" for a GST-fusion expression template. Using a plasmid template that contains the coding sequence for GST downstream of the Non-Omega translation initiation sequence, a PCR reaction is prepared using the primer 5'GGTGACACTATAGAACTCACCTATCTCCCCAACA 3' and the primer 5'GGGCCCCTGGAACAGAACTTC 3' and amplified in a standard PCR reaction that includes the two primers described above, standard PCR reagents, and a DNA polymerase that has proof-reading activity, then subjected to 15-30 cycles of PCR amplification. After the PCR reaction is complete, the PCR product is subjected to exonuclease I treatment for 30 min at 37°C, then heat-inactivated at 80°C for 30 min, and the PCR product purified by agarose gel electrophoresis and extracted using a gel purification kit (Amersham) to produce the "GST-Mega primer."
[0272] The "GST-Mega primer" is then used to create a GST-fusion expression template by combining it with the product of the first PCR reaction (PCR1 coding template) containing the coding region of the cDNA to be expressed. An aliquot of the PCR1 coding template (0.5 μl) is mixed with an aliquot of the GST-Mega primer (1 μl) and a primer 5'GCGTAGCATTTAGGTGACACT 3' that encodes part of the SP6 promoter sequence and anneals to the 5' end of the GST-Mega primer, and a second primer that is designed to a region of the plasmid approximately 300-350 nucleotides downstream from the coding sequence of the gene to be expressed. This second primer is designed as the reverse complement of the vector sequence in this region such that this primer will be useful for doing PCR amplification from the PCR1 coding template. This second primer is typically 17-23 nucleotides in length with a Tm of approximately 55-65°C. The "GST-fusion expression template" is then generated by doing a standard PCR reaction using standard PCR reagents, and a DNA polymerase that has proof-reading activity and subjected to 15-30 cycles of PCR amplification.
[0273] An in vitro transcription reaction (50 μl) is then prepared using 5 μl of the GST-fusion expression template in the following buffer, 80 mM Hepes KOH pH 7.8, 16 mM Mg(OAc)2, 2 mM spermidine, 10 mM DTT containing 1 unit of SP6 (Promega), and 1 unit of RNasin (Promega) and incubated for 3 hours at 37°C. The mRNA is then subjected to ethanol precipitation by the addition of 200 μl of RNase-free water, 37.5 μl of 5 M ammonium acetate, and 862 μl of 99% ethanol, mixed by vortexing and then pelleted by centrifugation at 15,000 x g for 10 min at 4°C. The mRNA pellet is then washed in 70% ethanol and again pelleted by centrifugation at 15,000 x g for 5 min at 4°C.
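By way of illustration only, the final composition of the ethanol precipitation mixture described in paragraph [0273] can be estimated from the stated volumes. The sketch below assumes that the volumes are simply additive, which is only approximate for ethanol-water mixtures; the function name `precipitation_mix` is hypothetical.

```python
def precipitation_mix(reaction_ul=50.0, water_ul=200.0, nh4oac_ul=37.5,
                      nh4oac_stock_m=5.0, ethanol_ul=862.0, ethanol_pct=99.0):
    """Estimate total volume and final concentrations, assuming additive volumes."""
    total_ul = reaction_ul + water_ul + nh4oac_ul + ethanol_ul
    return {
        "total_ul": total_ul,
        # Final ammonium acetate molarity after dilution into the whole mixture.
        "nh4oac_m": nh4oac_ul * nh4oac_stock_m / total_ul,
        # Final ethanol percentage (v/v) contributed by the 99% ethanol stock.
        "ethanol_pct": ethanol_ul * ethanol_pct / total_ul,
    }

mix = precipitation_mix()
for name, value in mix.items():
    print(f"{name}: {value:.3f}")
```

Under this assumption the mixture has a total volume of about 1.15 ml, roughly 0.16 M ammonium acetate, and roughly 74% ethanol.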
[0274] The in vitro translation reaction is performed with a stock of 2x Dialysis Buffer of 20 mM Hepes buffer pH 7.8 (KOH), 200 mM KOAc, 5.4 mM Mg(OAc)2, 0.8 mM Spermidine, 100 micromolar DTT, 2.4 mM ATP, 0.5 mM GTP, 32 mM creatine phosphate, 0.02% NaN3, and 0.6 mM Amino Acid Mix minus ASP, TRP, GLU, ILE, LEU, PHE, and TYR. The amino acids ASP, TRP, GLU, ILE, LEU, PHE, and TYR are prepared separately as an 80 mM stock in 1 N HCl and after complete dissolution are added to a final concentration of 0.6 mM. After addition of all ingredients, the 2x Dialysis Buffer stock is adjusted to pH 7.6 using 5 N KOH, filter sterilized, and stored frozen in aliquots at -80°C.
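By way of illustration only, the dilution arithmetic implied by paragraph [0274] can be sketched in Python. The 10 ml batch size is a hypothetical value chosen for the example, and the sketch assumes that each of the seven separately prepared amino acids is an individual 80 mM stock; neither is stated in the specification.

```python
# Millimolar concentrations of selected components of the 2x Dialysis Buffer ([0274]).
BUFFER_2X_MM = {
    "Hepes-KOH pH 7.8": 20.0,
    "KOAc": 200.0,
    "Mg(OAc)2": 5.4,
    "spermidine": 0.8,
    "DTT": 0.1,
    "ATP": 2.4,
    "GTP": 0.5,
    "creatine phosphate": 32.0,
}

def dilute(concs_mm, factor=2.0):
    """Concentrations after diluting a stock by the given factor (2x to 1x by default)."""
    return {name: conc / factor for name, conc in concs_mm.items()}

def stock_volume_ul(final_mm, stock_mm, batch_ml):
    """Volume of stock (in microliters) needed so that C_stock * V_stock = C_final * V_final."""
    return final_mm / stock_mm * batch_ml * 1000.0

buffer_1x = dilute(BUFFER_2X_MM)
aa_volume = stock_volume_ul(final_mm=0.6, stock_mm=80.0, batch_ml=10.0)
print(buffer_1x)
print(f"80 mM amino acid stock per 10 ml batch: {aa_volume:.1f} ul")
```

For the hypothetical 10 ml batch, about 75 μl of each 80 mM amino acid stock gives the stated final concentration of 0.6 mM.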
[0275] To resuspend the in vitro transcribed mRNA that has been ethanol precipitated and washed in 70% ethanol, a 50 μl translation mixture is prepared that includes Wheat Germ Reagent at a final OD 260nm of 60 plus the volume of 1x Dialysis Buffer (to which 2 mM DTT has been added) that brings the final volume to 50 μl (Wheat Germ Reagent comprises a concentration of 1x Dialysis Buffer). After removing the ethanol from the precipitated mRNA, the 50 μl translation mixture is added, and the mRNA resuspended after 5-10 min. has elapsed. The complete translation mixture containing the resuspended mRNA is then layered under 250 μl of 1x Dialysis Buffer present in one well of a 96 well round-bottom microtiter plate to set up the Bilayer Reaction. The plate is then sealed manually with a plate seal and incubated for 20 hours at 26°C.
[0276] To recover the recombinant protein expressed as a GST fusion, the translation mixture is transferred to a tube and 10 μl of glutathione-Sepharose is added and incubated, with mixing, for 1 hour at 4°C. The Sepharose beads containing the bound GST fusion protein are then washed three times in phosphate buffered-saline containing 0.25 M sucrose and 2 mM DTT. A fourth wash is then performed with protease cleavage buffer containing 50 mM Tris pH 7.4, 150 mM NaCl, 1 mM EDTA, 2 mM DTT, and 0.25 M sucrose. After careful removal of the wash buffer, 10 μl of final wash buffer is added with 0.4 μl of PreScission Protease (Amersham), the beads gently suspended with a pipette, and then allowed to incubate overnight at 4°C. To recover the cleaved secreted protein product, 20 μl of final wash buffer is added and the entire liquid fraction recovered by pipette or by filtering through a sintered frit. To stabilize the recovered protein, purified bovine serum albumin prepared as a 10 mg/ml stock in PBS is added to a final concentration of 1 mg/ml and the protein sample dialyzed in PBS and filter sterilized for storage prior to testing for biological activity. To produce additional protein, the single Bilayer Reaction can be reproduced many times and the purification and formulation scaled accordingly. Typically, sixteen Bilayer Reactions will produce sufficient biologically active protein for testing in many biological assays.
Example 4: Primer Design
[0277] To design the forward primer for PCR amplification, the melting point of the first 20 to 24 bases of the primer can be calculated by counting total A and T residues, then multiplying by 2. To design the reverse primer for PCR amplification, the melting point of the first 20 to 24 bases of the reverse complement, with the sequences written from 5-prime to 3-prime, can be calculated by counting the total G and C residues, then multiplying by 4. Both start and stop codons can be present in the final amplified clone. The length of the primers is such to obtain melting temperatures within 63 degrees C to 68 degrees C. Adding the bases "CACC" to the forward primer renders it compatible for cloning the PCR product with the TOPO pENTR/D vector (Invitrogen, CA).
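By way of illustration only, the melting temperature estimate of paragraph [0277] can be sketched in Python. The sketch assumes that the two counts described above are the two terms of the commonly used approximation Tm = 2(A+T) + 4(G+C) and that both terms are summed for each primer; that reading is an assumption, and the function names and the example sequence are hypothetical.

```python
def estimated_tm(primer):
    """Approximate Tm in degrees C as 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

def pick_primer(template, min_len=20, max_len=24, low=63, high=68):
    """Shortest 5' prefix of 20-24 bases whose estimated Tm is within range, else None."""
    for n in range(min_len, max_len + 1):
        candidate = template[:n]
        if low <= estimated_tm(candidate) <= high:
            return candidate
    return None

# Hypothetical 5' end of an open reading frame, for illustration only.
example_template = "ATGGCTAGCAAGGGTGAAGAACTGTTC"
primer = pick_primer(example_template)
# "CACC" is prepended for compatibility with directional TOPO cloning ([0277]).
topo_forward = "CACC" + primer
print(primer, estimated_tm(primer))
print(topo_forward)
```

For the reverse primer, the same functions would be applied to the reverse complement of the 3' end of the sequence, written 5-prime to 3-prime.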
Example 5: Reverse Transcriptase Reaction
[0278] cDNA can be prepared by the following method. Between 200 ng and 1.0 μg mRNA is added to 2 μl DMSO and the volume adjusted to 11 μl with DEPC-treated water. One μl Oligo dT is added to the tube, and the mixture is heated at 70° C for 5 min., quickly chilled on ice for 2 min., and the mixture is collected at the bottom of the tube by brief centrifugation. The following 1st strand components are then added to the mRNA mixture: 2 μl 10X StrataScript (Stratagene, CA) 1st strand buffer, 1 μl 0.1 M DTT, 1 μl 10 mM dNTP mix (10 mM each of dG, dA, dT and dCTP), 1 μl RNAse inhibitor, 3 μl StrataScript RT (50 U/μl). The contents are gently mixed and the mixture collected by brief centrifugation. The mixture is incubated in a 42° C water bath for 1 hour, placed in a 70° C water bath for 15 min. to stop the reaction, transferred to ice for 2 min., and centrifuged briefly in a microfuge to collect the reaction product at the bottom of the reaction vessel. Two μl RNAse H is then added to the tube, the contents are mixed well, incubated at 37° C in a water bath for 20 min., and centrifuged briefly in a microfuge to collect the reaction product at the bottom of the reaction vessel. The reaction mixture can proceed directly to PCR or be stored at -20° C.
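By way of illustration only, the component volumes of the first strand reaction in paragraph [0278] can be tallied to confirm the final reaction volume and the resulting working concentrations. The variable names are hypothetical; all volumes and stock concentrations are those stated above.

```python
# Volumes in microliters, as listed in [0278].
first_strand_ul = {
    "mRNA + DMSO + DEPC-treated water": 11.0,
    "oligo dT": 1.0,
    "10X 1st strand buffer": 2.0,
    "0.1 M DTT": 1.0,
    "10 mM dNTP mix": 1.0,
    "RNase inhibitor": 1.0,
    "reverse transcriptase (50 U/ul)": 3.0,
}

total_ul = sum(first_strand_ul.values())
buffer_x = 10.0 * first_strand_ul["10X 1st strand buffer"] / total_ul   # fold concentration
dtt_mm = 100.0 * first_strand_ul["0.1 M DTT"] / total_ul                # 0.1 M = 100 mM
dntp_mm = 10.0 * first_strand_ul["10 mM dNTP mix"] / total_ul           # each dNTP
rt_units = 50.0 * first_strand_ul["reverse transcriptase (50 U/ul)"]

print(f"total volume: {total_ul} ul")
print(f"buffer: {buffer_x}X, DTT: {dtt_mm} mM, each dNTP: {dntp_mm} mM, RT: {rt_units} U")
```

The tally gives a 20 μl reaction in which the 10X buffer is at 1X, before the subsequent addition of 2 μl RNAse H.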
Example 6: Full Length PCR
[0279] Full length PCR can be achieved by placing the products of the reaction described in Example 7, with primers diluted to 5 μM in water, into a reaction vessel and adding a reaction mixture composed of 1x Taq buffer, 25 mM dNTP, 10 ng cDNA pool, TaqPlus (Stratagene, CA) (5 U/μl), PfuTurbo (Stratagene, CA) (2.5 U/μl), water. The contents of the reaction vessel are then mixed gently by inversion 5-6 times, placed into a reservoir where 2 μl Fi/Ri primers are added, the plate sealed and placed in the thermocycler. The PCR reaction is comprised of the following eight steps. Step 1: 95° C for 3 min. Step 2: 94° C for 45 sec. Step 3: 0.5° C/sec to 56-60° C. Step 4: 56-60° C for 50 sec. Step 5: 72° C for 5 min. Step 6: Go to step 2, perform 35-40 cycles. Step 7: 72° C for 20 min. Step 8: 4° C.
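By way of illustration only, the eight-step program of paragraph [0279] can be expressed as a small Python function that estimates the programmed run time. The sketch counts only the hold times and the controlled 0.5° C/sec ramp of Step 3; all other temperature transitions are assumed to be instantaneous, which underestimates the true run time. The function name `program_seconds` is hypothetical.

```python
def program_seconds(cycles=35, anneal_c=58.0, ramp_c_per_s=0.5):
    """Estimated duration of the program in [0279], ignoring uncontrolled ramps."""
    initial_denature = 3 * 60                  # Step 1: 95 C for 3 min
    denature = 45                              # Step 2: 94 C for 45 sec
    ramp = (94.0 - anneal_c) / ramp_c_per_s    # Step 3: 0.5 C/sec down to annealing
    anneal = 50                                # Step 4: 56-60 C for 50 sec
    extend = 5 * 60                            # Step 5: 72 C for 5 min
    final_extend = 20 * 60                     # Step 7: 72 C for 20 min
    per_cycle = denature + ramp + anneal + extend
    return initial_denature + cycles * per_cycle + final_extend

shortest = program_seconds(cycles=35, anneal_c=60.0)
longest = program_seconds(cycles=40, anneal_c=56.0)
print(f"{shortest / 3600:.1f} to {longest / 3600:.1f} hours")
```

Because the 5 min extension dominates each cycle, the estimate is several hours across the stated ranges of 35-40 cycles and 56-60° C annealing.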
[0280] The products can then be separated on a standard 0.8 to 1.0% agarose gel at 40 to 80 V, the bands of interest excised by cutting from the gel, and stored at -20° C until extraction. The material in the bands of interest can be purified with QIAquick 96 PCR Purification Kit (Qiagen, CA) according to the manufacturer's instructions. Cloning can be performed with the Topo Vector pENTR/D-TOPO vector (Invitrogen, CA) according to the manufacturer's instructions.
References
[0281] The specification is most thoroughly understood in light of the following references, all of which are hereby incorporated by reference in their entireties. The disclosures of the patents and other references cited above are also hereby incorporated by reference.
1. Agou, F., et al. (1996) Biochemistry 35:15322-15331.
2. Agrawal, S., Crooke, S.T. eds. (1998) Antisense Research and Application (Handbook of Experimental Pharmacology, Vol. 131). Springer-Verlag New York, Inc.
3. Alberts, B., Bray, D., Lewis, J., Raff, M., Roberts, K., Watson, J.D. (1994) Molecular Biology of the Cell. 3rd ed. Garland Publishing, Inc.
4. Alexander, D.R. (2000) Semin. Immunol. 12:349-359.
5. Allison, A.C. (2000) Immunopharmacology 47:63-83.
6. Altschul, et al. (1997) Nucleic Acids Res. 25:3389-3402.
7. Altschul, S.F., et al. (1990) J. Mol. Biol. 215:403-410.
8. Amor, J.C., et al. (1994) Nature 372:704-708.
9. Andreeff, M., Pinkel, D. eds. (1999) Introduction to Fluorescence In Situ Hybridization: Principles and Clinical Applications. John Wiley & Sons.
10. Andres, D.A., et al. (1997) Arch. Biochem. Biophys. 346:113-124.
11. Ansel, H.C., Allen, L., Popovich, N.G. eds. (1999) Pharmaceutical Dosage Forms and Drug Delivery Systems. 7th ed. Lippincott Williams and Wilkins Publishers.
12. Attardi, L.D., Jacks, T. (1999) Cell Mol. Life Sci. 55:48-63.
13. Aubry, M., et al. (1992) Genomics 13:641-648.
14. Ausubel, F., et al. (1999) Short Protocols in Molecular Biology. 4th ed. Wiley & Sons.
15. Baksh, S., Burakoff, S.J. (2000) Semin. Immunol. 12:405-415.
16. Ballance, D.J., et al. (1983) Biochem. Biophys. Res. Commun. 112:284-289.
17. Barany, F. (1985) Gene 37:111-123.
18. Barnes, D., Sato, G. (1980) Anal. Biochem. 102:255-270.
19. Barton, M.C., et al. (1990) Nucleic Acids Res. 18:7349-7355.
20. Bashkin, J.K., et al. (1995) Appl Biochem. Biotechnol 54:43-56.
21. Bassett, D.E., et al. (1999) Nature Genetics 21:51-55.
22. Bast, R.C., et al. (2000) Cancer Medicine. 5th ed. B.C. Decker, Inc.
23. Bateman, A., et al. (2000) Nucleic Acids Research 30:276-280.
24. Battini, R., et al. (1987) J. Biol. Chem. 262:4355-4359.
25. Bausserman L.L. et al. (1983) J. Biol. Chem. 258:10681-10688.
26. Bayburt, T.H., et al. (2002) Nano. Lett. 2:853-856.
27. Bayburt, T.H., Sligar, S.G. (2003) Protein Sci. 12:2476-2481.
28. Beach, D., et al. (1982) Nature 300:706-709.
29. Beigelman, L., et al. (1995) Nucleic Acids Res. 23:4434-4442.
30. Bennett, J. (2000) Curr. Opin. Mol. Ther. 2:420-425.
31. Berinstein, N. (2002) J. Clin. Oncol. 20:2197-2207.
32. Bibikova, et al. (2003) Science 300:764.
33. Birney, E., Durbin, R. (2000) Genome Res. 10:547-548.
34. Blackwell, J.M., et al. (1995) Mol. Med. 1:194-205.
35. Bodzioch, M., et al. (1999) Nat. Genet. 22:347-351.
36. Bonifaci, N., et al. (1997) Proc. Natl. Acad. Sci. 94:5055-5060.
37. Bono, H., et al. (2002) Nucleic Acids Res. 30:116-118.
38. Boshart, M., et al. (1985) Cell 41:521-530.
39. Bowtell, D.D.L. (1999) Nature Genetics 21:25-32.
40. Braunwald et al. (2001) Harrison's Principles of Internal Medicine. 15th ed. McGraw Hill Medical.
41. Brenner, S., et al. (2000) Proc. Natl. Acad. Sci. USA 97:1665-1670.
42. Brinster, R.L., et al. (1981) Cell 27:223-231.
43. Brock, G. (2000) Drugs Today 36:125-134.
44. Brown, J.R., et al. (1985) Mol. Cell Biol. 5:1694-1706.
45. Brown, P.O, Botstein, D. (1999) Nature Genetics 21:33-37.
46. Brunelleschi, S., et al. (2002) Curr. Pharm. Des. 8:1959-1972.
47. Brutlag, D.L., et al. (1993) Computers and Chemistry 17:203-207.
48. Capecchi, M.R. (1989) Genet. 5:70-76. Sasaki, T., et al. (2003) Endocr. Pathol. 14:141-144.
49. Carbonell, L.F., et al. (1988) Gene 73:409-418.
50. Chakravarty, A. (1999) Population genetics - making sense out of sequence. Nature Genetics 21:56-60.
51. Chalifour, L.E., et al. (1994) Anal. Biochem. 216: 299-304.
52. Chalut, C., et al. (1995) Gene 161:277-282.
53. Chang, A.C., et al. (1978) Nature 275:617-624.
54. Chang, M.S., et al. (2000) p29, Biochem. Biophys. Res. Commun. 279:123-737.
55. Chen, F.W., Ioannou, Y.A. (1998) Int. Rev. Immunol. 18:429-448.
56. Chen, S.Y., Bagley, J., Marasco, W.A. (1994) Hum. Gene Ther. 5:595-601.
57. Cheng, W.F., et al. (2001) J. Clin. Invest. 108:669-678.
58. Cheung, V.G., et al. (1999) Nature Genetics 21:15-19.
59. Chien, C., Bartel, P.L., et al. (1991) Proc. Natl. Acad. Sci. 88:9578-9581.
60. Civjan, N.R., et al. (2003) Biotechniques 35:556-560, 562-563.
61. Christa, L., et al. (1994) Gastroenterology 106:1312-1320.
62. Clark, C.M., Karlawish, J.H. (2003) Ann. Intern. Med. 138:400-410.
63. Clark, H.F., et al. (2003) Genome Res. 13:2265-2270; Epublished Sep 15, 2003.
64. Coffin, J.M., et al. (1997) Retroviruses. Cold Spring Harbor Laboratory Press.
65. Cole, K.A., et al. (1999) Nature Genetics 21:38-41.
66. Colicelli, J., et al., Goff, S.P. (1985) Mol. Gen. Genet. 199:537-539.
67. Collins, F.S. (1999) Nature Genetics 21:2.
68. Comuzzie, A.G., Allison, D.B. (1998) Science 280:1374-1377.
69. Cormand, B., et al. (1997) Hum. Genet. 100:75-79.
70. Cregg, J.M., et al. (1985) Mol Cell. Biol. 5:3376-3385.
71. Crooke, S.T. (1996) Med. Res. Rev. 16:319-344.
72. Crouch, R.J. (1990) New Biol. 2:771-777.
73. Curcio, L.D., et al. (1997) Pharmacol. Ther. 74:317-332.
74. Das, S., et al. (1984) J. Bacteriol. 158:1165-1167.
75. Davidow, et al. (1987) Curr. Genet. 11:377-383.
76. de Boer, H.A., et al. (1993) Proc. Natl. Acad. Sci. 80:21-25.
77. De Louvencourt, L., et al. (1983) J. Bacteriol. 154:737-742.
78. Deasy, B.M., Huard, J. (2002) Curr. Opin. Mol. Ther. 4:382-389.
79. Degtyarenko, K.N., Archakov, A.I. (1993) FEBS Lett. 332:1-8.
80. Delahunty, C., Ankener, W., et al. (1996) Am. J. Human Genetics 58:1239-1246.
81. Deutscher, M.P., et al. (1990) Guide to Protein Purification: Methods in Enzymology. (Methods in Enzymology Series. Vol 182). Academic Press.
82. Dieffenbach, C.W., Dveksler, G.S., eds. (1995) PCR Primer: A Laboratory Manual. Cold Spring Harbor Laboratory Press.
83. Dijkema, R., et al. (1985) EMBO J. 4:761-767.
84. Doerfler, W., Bohm, P., eds. (1987) The Molecular Biology Of Baculoviruses. Springer-Verlag, Inc.
85. Doll, A., Grzeschik, K.H. (2001) Cytogenet. Cell Genet. 95:20-27.
86. Doolittle, R.F., et al. (1996) Computer Methods for Macromolecular Sequence Analysis, 1st ed. Academic Press.
87. Ducrest, A.L., et al. (2002) Oncogene 21:541-52.
88. Dutoit, V., Taub, et al. (2002) J. Clin. Invest. 110:1813-1822.
89. Egilsson, V., et al. (1986) J. Gen. Microbiol 132:3309-3313.
90. Ehrhardt, G.R., et al. (2001) Oncogene 20:188-197.
91. Espejo, A., et al. (2002) Biochem. J. 367:697-702.
92. Everett, R.D., et al. (1997) EMBO J. 16:1519-1530.
93. Fanning, A.S., Anderson, J.M. (1999) Curr. Opin. Cell Biol. 11:432-439.
94. Fields, S., Song, O. (1989) Nature 340:245-246.
95. Fiers, W. (1991) FEBS Lett. 285:199-212.
96. Fisch, P., Forster, et al. (1993) Oncogene 8:3271-3276.
97. Fishman, P.S., Oyler, G.A. (2002) Curr. Neurol Neurosci. Rep. 2:296-302.
98. Forgac, M. (1999) J. Biol. Chem. 274:12,951-12,954.
99. Frank, I. (2002) Clin. Lab. Med. 22:741-757.
100. Frithz, G., Ericsson, P., Ronquist, G. (1976) Ups. J. Med. Sci. 81:155-158.
101. Funakoshi, I., et al. (1992) Arch. Biochem. Biophys. 295:180-187.
102. Furth, P.A., et al. (1992) Anal. Biochem. 205:365-368.
103. Gaillardin, C., Ribet, A.M. (1987) Curr. Genet. 11:369-375.
104. Gao, X., Nawaz, Z. (2002) Breast Cancer Res. 4:182-186.
105. Gao, Y., Melki, (1994) J. Cell Biol. 125:989-996.
106. Gaudilliere, B., et al. (2002) J. Biol. Chem. 277:46,442-46,446.
107. Gavrieli, Y., et al. (1992) J. Cell Biol. 119:493-501.
108. Geffen D.B., Man S. (2002) Isr. Med. Assoc. J. 4:1124-31.
109. Gennaro, A., ed. (2000) Remington: The Science and Practice of Pharmacy. 20th ed. Lippincott, Williams, & Wilkins.
110. Ghofrani, H.A., et al. (2003) J. Am. Coll. Cardiol. 42:158-164.
111. Gillingham, A.K., et al. (2002) Mol. Biol. Cell 13:3761-3774.
112. Gingras, M.C., et al. (2002) Mol. Immunol. 38:817-824.
113. Girschick, H.J., et al. (2002) Arthritis. Rheum. 46:1255-1263.
114. Gmeiner, W.H., Horita, D.A. (2001) Cell Biochem. Biophys. 35:127-140.
115. Goeddel, D.V., et al. (1979) Nature 281:544-548.
116. Goldstein, L.S.B., Yang, Z. (2000) Annu. Rev. Neurosci. 23:39-71.
117. Golovkina, T.V., et al. (1992) Cell 69:637-645.
118. Gonnet, G.H., et al. (1992) Science 256:1443-1445.
119. Gordan, J.D., Vonderheide, R.H. (2002) Cytotherapy 4:317-327.
120. Gorman, C.M., et al. (1982) Proc. Natl. Acad. Sci. 79:6777-6781.
121. Gray, T.A., et al. (2002) Genomics. 66:76-86.
122. Griffiths, A.J.F., Miller, J.H., Suzuki, D.T., Lewontin, R.C., Gelbart, W.M. (1999) Introduction to Genetic Analysis. 7th ed. W.H. Freeman.
123. Griffiths, M., et al. (1997) Nat. Med. 3:89-93.
124. Grosschedl, R., Baltimore, D. (1985) Cell 41:885-897.
125. Grosveld, F., Kollias, G., eds. (1992) Transgenic Animals. 1st ed. Academic Press.
126. Gustin, K., Burk, R.D. (1993) Biotechniques 14:22-24.
127. Hacia, J.G. (1999) Nature Genetics 21:42-4 .
128. Hadano, S., et al. (2001) Genomics 71:200-213.
129. Hall, M., Mickey, D., et al. (1985) Clin. Chem. 31:1689-1691.
130. Ham, R.G., McKeehan, W.L. (1979) Methods Enzymol 58:44-93.
131. Hanada, T., Lin, L., et al. (2000) J. Biol. Chem. 275:28,774-28,784.
132. Harlow, E., Lane, D., eds. (1988) Antibodies: A Laboratory Manual. Cold Spring Harbor Laboratory.
133. Harlow, E., Lane, D., Harlow, E., eds. (1998) Using Antibodies: A Laboratory Manual: Portable Protocol NO. I. Cold Spring Harbor Laboratory.
134. Hartmann, A.M., et al. (1999) Mol. Biol. Cell 10:3909-3926.
135. Hartmann, G., Endres, S., eds. (1999) Manual of Antisense Methodology (Perspectives in Antisense Science). 1st ed. Kluwer Law International.
136. Hassanzadeh, G.H.G., et al. (1998) FEBS Lett. 437:75-80.
137. Hawes, J.W., et al. (1996) J. Biol. Chem. 271:26,430-26,434.
138. Heath, J.K., et al. (1997) Proc. Natl. Acad. Sci. 94:469-474.
139. Heiser, A., Coleman, et al. (2002) J. Clin. Invest. 109:409-417.
140. Henningson, C.T. Jr., et al. (2003) J. Allergy Clin. Immunol 111:S745-S753.
141. Hinnen, A., et al. (1978) Proc. Natl. Acad. Sci. 75:1929-1933.
142. Hirsch, D.S., et al. (2001) J. Biol. Chem. 276:875-883.
143. Ho, L.W., et al. (2001) Psychol. Med. 31:3-14.
144. Hollis, G.F., et al. (1989) Proc. Natl. Acad. Sci. 86:5552-5556.
145. Homo-Delarche, F. (2001) Braz. J. Med. Biol. Res. 34:437-447.
146. Hong, G.F. (1982) Biosci. Rep. 2:907-912.
147. Hoogenboom, H.R., et al. (1998) Immunotechnology 4:1-20.
148. Hooper, M.L. (1993) Embryonal Stem Cells: Introducing Planned Changes into the Animal Germline. Gordon & Breach Science Pub.
149. Hoozemans, J.J., et al. (2002) Drugs Today (Barc) 38:429-443.
150. Houseman, B.T., et al. (2002) Nature Biotechnol. 20:270-274.
151. Howard, G.C., Bethell, D.R. (2000) Basic Methods in Antibody Production and Characterization. CRC Press.
152. Huynh, D.P., et al. (2003) Hum. Mol. Genet. 12:1485-1496.
153. Ikeda, A., et al. (2002) J. Cell Sci. 115(Pt 1):9-14.
154. Ito, H., et al. (1978) J. Bacteriol. 153:163-168.
155. Jameson, D.M., Sawyer, W.H. (1995) Methods Enzymol. 246:283-300.
156. Janeway, C.A., Travers, P., Walport, M., Shlomchik, M. (2001) Immunobiology. 5th ed. Garland Publishing.
157. Jeffery, P., Zhu, J. (2002) Novartis Found. Symp. 248:51-75, 277-82.
158. Jimbo, T., et al. (2002) Nat. Cell Biol. 4:323-327.
159. Joberty, G., Perlungher, R.R., Macara, I.G. (1999) Mol Cell Biol 19:6585-6597.
160. Johns, T.G., Bernard, C.C. (1997) Mol. Immunol. 34:33-38.
161. Jolliffe, C.N., et al. (2000) Biochem. J. 351:557-565.
162. Jones, D.H., Winistorfer, S.C. (1992) Biotechniques 12:528-530.
163. Jones, P., ed. (1998a) Vectors: Cloning Applications: Essential Techniques. John Wiley & Son, Ltd.
164. Jones, P., ed. (1998b) Vectors: Expression Systems: Essential Techniques, John Wiley & Son, Ltd.
165. Jorgensen, C., et al. (2003) Gene Ther. 10:928-931.
166. Jornvall, H., et al. (1995) Biochemistry 34:6003-6013.
167. Jost, C.R., et al. (1994) J. Biol. Chem. 269:26,267-26,273.
168. Joulin, V., Richard-Foy, H. (1995) Eur. J. Biochem. 232:620-626.
169. Jurcic, J.G., et al. (2000) Curr. Opin. Hematol 7:247-254.
170. Jury, J. A., et al. (1999) Mol. Hum. Reprod. 5:1127-1134.
171. Kabat, E.A., Wu T.T. (1991) J. Immunol. 147:1709-1719.
172. Kamitani, T., et al. (1997) J. Biol. Chem. 272:14,001-14,004.
173. Kantoff, P.W., et al. (2001) J. Clin. Oncol. 9:3025-3028.
174. Kao, P.N., et al. (1994) J. Biol. Chem. 269:20,691-20,699.
175. Karazanashvili, G., Abrahamsson, P. (2003) J. Urol. 169:445-457.
176. Kari, C., Chan, et al. (2003) Cancer Res. 63:1-5.
177. Kelly, J.M., Hynes, M. J. (1985) EMBO J. 4:475-479.
178. Kenmochi, N., et al. (1998) Genome Res. 8:509-523.
179. Keown, W.A., et al. (1990) Methods Enzymol 185:527-537.
180. Khatib A.M. et al. (2002) Am. J. Pathol. 160:1921-1935.
181. Kibbe, A.H., ed. (2000) Handbook of Pharmaceutical Excipients. 3rd ed. Pharmaceutical Press.
182. Kirkpatrick, K.L., Mokbel, K. (2001) Eur. J. Surg. Oncol. 27:754-760.
183. Kirsch, K.H., et al. (1999) Proc. Natl. Acad. Sci. 96:6211-6216.
184. Kiryu-Seo, S., et al. (2000) Proc. Natl. Acad. Sci. 97:4345-4350.
185. Klaiman, G.J., et al. (2002) AIDS Rev. 4:183-194.
186. Knutson, K.L., et al. (2001) J. Clin. Invest. 107:477-484.
187. Kobayashi, M., et al. (1999) Proc. Natl. Acad. Sci. 96:4814-4819.
188. Kolonin, M.G., Finley, R.L. Jr. (1998) Proc. Natl. Acad. Sci. 95:14,266-14,271.
189. Korner, C., Knauer, et al. (1999) EMBO J. 18:6816-6822.
190. Kothapalli, R., et al. (1997) J. Clin. Invest. 99:2342-2350.
191. Kovalenko, O.V., et al. (1997) Nucleic Acids Res. 25:4946-4953.
192. Kratzschmar, J., et al. (1996) J. Biol. Chem. 271:4593-4596.
193. Ku, D.H., et al. (1990) J. Biol. Chem. 265:16,060-16,063.
194. Kuisle, O., et al. (1999) Tetrahedron Lett. 40:1203-1206.
195. Kunze, G., et al. (1985) J. Basic Microbiol. 25:141-144.
196. Kurtz, M.B., et al. (1986) Mol. Cell. Biol. 6:142-149.
197. Kyo, S., et al. (2000) Histol. Histopathol. 15:813-824.
198. Lander, E.S. (1999) Nature Genetics 21:3-4.
199. Lander, E.S., et al. (2001) Nature 409:860-921.
200. Lasham, A., et al. (2003) J. Biol. Chem. Epub June 30, 2003.
201. Lashkari, A., et al. (1999) Clin. Pediatr. 38:189-208.
202. Lavedan, C. (1998) Genome Res. 8:871-880.
203. Lebacq-Verheyden, A.M., et al. (1988) Mol. Cell. Biol. 8:3129-3135.
204. Lees-Miller, S.P., Anderson, C.W. (1989) J. Biol. Chem. 264:2431-2437.
205. Lerch, M.M., Gorelick, F.S. (2000) Med. Clin. North Amer. 84:549-563.
206. Lerner, R.A. (1982) Nature 299:592-596.
207. Li, E., et al. (1996) Eur. J. Biochem. 238:631-638.
208. Lim, D., et al. (Aug. 2002) J. Virol. 76:8360-8373.
209. Lin, B., et al. (1993) Hum. Mol. Genet. 2:1541-1545.
210. Lin, W.J., et al. (1996) J. Biol. Chem. 271:15,034-15,044.
211. Lin, X., et al. (1999) J Biol Chem. 274:36,125-36,131.
212. Linnenbach, A.J., et al. (1993) Mol. Cell Biol. 13:1507-1515.
213. Linstedt, A.D., Hauri, H.P. (1993) Mol. Biol. Cell 4:679-693.
214. Lipshutz, R.J., et al. (1999) Nature Genetics 21:20-24.
215. Liu, A.Y., et al. (1987a) Proc. Natl. Acad. Sci. 84:3439-3443.
216. Liu, A.Y., et al. (1987b) J. Immunol. 139:3521-3526.
217. Lodish, H., et al. (1999) Molecular Cell Biology. 4th ed. W H Freeman & Co.
218. Loeffen, J.L., et al. (1998) Biochem. Biophys. Res. Commun. 253:415-422.
219. Los, M., et al. (2003) Drug Discov. Today 15:67-77.
220. Lovering, R., Trowsdale, J. (1991) Nucleic Acids Res. 19:2921-2928.
221. Luckow, V., Summers, M. (1988) Bio/Technology 6:47-55.
222. MacBeath, G., Schreiber, S.L. (2000) Science 289:1760-1763.
223. Machesky, L.M., et al. (1999) Biochem. J. 328:105-112.
224. Machiels, J.P., et al. (2002) Semin. Oncol. 29:494-502.
225. Mackay, A., et al. (2003) Oncogene 22:2680-2688.
226. Maeda, S., et al. (1985) Nature 315:592-594.
227. Mahajan, M.A., et al. (2002) Mol. Cell Biol. 22:6883-6894.
228. Mahimkar, R.M., et al. (2000) J. Am. Soc. Nephrol. 11:595-603.
229. Mahnensmith, R.L., Aronson, P.S. (1985) J. Biol. Chem. 260:12,586-12,592.
230. Mangi, A.A., et al. (2003) Nat. Med. 9:1195-1201. Epublished August 10, 2003.
231. Manning, G., et al. (2002) Science 298:1912-1934.
232. Marotti, K.R., Tomich, C.S. (1989) Gene Anal. Tech. 6:67-70.
233. Martel-Pelletier, et al. (2001) Best Pract. Res. Clin. Rheumatol. 15:805-829.
234. Martin, B.M., et al. (1988) DNA 7:99-106.
235. Massari, M.E., et al. (1998) Mol. Cell Biol. 18:3130-3139.
236. Matz, M.V., et al. (1999) Nat. Biotechnol. 17:969-973.
237. Mayer, B.J. (2001) J. Cell Sci. 114:1253-1263.
238. Mayer, T.U., et al. (1999) Science 286:971-974.
239. McGraw, R.A. III (1984) Anal. Biochem. 143:298-303.
240. McKusick, V.A. (2003) OMIM: Online Mendelian Inheritance in Man http://www.ncbi.nlm.nih.gov, #104300.
241. McPherson, M.J., et al. (2000) PCR Basics: From Background to Bench. Springer Verlag.
242. McRee, D. (1999) Academic Press, 2nd Ed.
243. Merla, G., et al. (2002) Hum. Genet. 110:429-438.
244. Miki, H., Setou, et al. (2001) Proc. Natl. Acad. Sci. 98:7004-7011.
245. Milam, A.H., et al. (2002) Proc. Natl. Acad. Sci. 99:473-478.
246. Milligan, J.F., et al. (1993) Current concepts in antisense drug design. J. Med. Chem. 36:1923-1937.
247. Mitch, W.E., Goldberg, A.L. (1996) N. Engl. J. Med. 335:1897-1905.
248. Mitchell, D.A., Nair, S.K. (2000) J. Clin. Invest. 106:1065-1069.
249. Miyajima, A. (2002) Kokuritsu Iyakuhin Shokuhin Eisei Kenkyusho Hokoku 120:53-74.
250. Miyajima, A., et al. (1987) Gene 58:273-281.
251. Monfardini, C., et al. (1995) Bioconjugate Chem. 6:62-69.
252. Mori, N. (1997) Nihon Shinkei Seishin Yakurigaku Zasshi 17:159-167.
253. Mortlock, D.P., et al. (1996) Genome Res. 6:327-335.
254. Murphy, D., Carter, D.A., eds. (1993) Transgenesis Techniques: Principles and Protocols. Humana Press.
255. Myers, E.W., Miller, W. (1988) Comput. Appl. Biosci. 4:11-7.
256. Nagata, K., et al. (1995) Proc. Natl. Acad. Sci. 92:4279-4283.
257. Naora, H. (1999) Immunol. Cell Biol. 77:197-205.
258. Needleman, S.B., Wunsch, C.D. (1970) J. Mol. Biol. 48:443-453.
259. Nelson, N., Harvey, W.R. (1999) Physiol. Rev. 79:361-385.
260. Nishiyama, H., et al. (1997) Gene 204:115-120.
261. Noma, T., et al. (2001) Biochem. J. 358:225-232.
262. Notredame, C., et al. (2000) J. Molec. Biol. 302:205-217.
263. Okayama, H., Berg, P. (1983) Mol Cell. Biol. 3:280-289.
264. Okazaki, Y., et al. (2002) Nature 420:563-573.
265. Oksenberg, J.R., et al. (1999) Semin. Neurol. 19:281-288.
266. Oliver, C.J., Shenolikar, S. (1998) Frontiers in Bioscience 3:961-972.
267. O'Neil, N.J., et al. (2001) Am. J. Pharmacogenomics 1:45-53.
268. O'Neill, L.A. (2002) Curr. Top. Microbiol. Immunol. 270:47-61.
269. Page, D.C., et al. (1999) Hum. Reprod. 14:1722-1726.
270. Pan, C.X., Koeneman, K.S. (1999) Med. Hypothesis 53:130-135.
271. Pang, T., et al. (2001) J. Biol. Chem. 276:17,367-17,372.
272. Pang, T., et al. (2002) J. Biol. Chem. 277:43,771-43,777.
273. Papagerakis, S., et al. (2003) Hum. Pathol. 34:565-572.
274. Pearson, W.R. (2000) Methods Mol. Biol. 132:185-219.
275. Peattie, D.A., et al. (1992) Proc. Natl. Acad. Sci. 89:10,974-10,978.
276. Peelle, B., et al. (2001) J. Protein Chem. 20:507-519.
277. Peng, H., Huard, J. (2003) Curr. Opin. Pharmacol. 3:329-333.
278. Pepin, K., et al. (2001) J. Vet. Med. Sci. 63:115-124.
279. Perez Calvo, et al. (2000) Med. Clin. (Barc) 115:601-604.
280. Perron, H., et al. (1997) Proc. Natl. Acad. Sci. 94:7583-7588.
281. Perry, A.C., et al. (1995) Biochem. J. 312(Pt 1):239-244.
282. Peril, U., et al. (2003) Blood 101:649-654.
283. Pfutzer, R.H., Whitcomb, D.C. (2001) Pancreatology 1:457-460.
284. Phillips, M.I., ed. (1999a) Antisense Technology, Part A. Methods in Enzymology Vol. 313. Academic Press, Inc.
285. Phillips, M.I., ed. (1999b) Antisense Technology, Part B. Methods in Enzymology Vol. 314. Academic Press, Inc.
286. Pietu, G., et al. (1996) Genome Res. 6:492-503.
287. Pinkert, C.A., ed. (1994) Transgenic Animal Technology: A Laboratory Handbook. Academic Press.
288. Pisegna, J.R., Wank, S.A. (1996) J. Biol. Chem. 271:17,267-17,274.
289. Prentki, P., Krisch, H.M. (1984) Gene 29:303-313.
290. Price, N.T., et al. (1993) Biochim. Biophys. Acta 1216:170-172.
291. Qin, J., Li, L. (2003) Radiat. Res. 159:139-148.
292. Racevskis, J., et al. (1996) Cell. Growth Differ. 7:271-280.
293. Ramalho-Santos, M. (2002) Science 298:597-600.
294. Raval, P. (1994) J. Pharmacol. Toxicol Methods 32:125-127.
295. Rebbe, N.F., et al. (1987) Gene 53:235-245.
296. Rechid, R., et al. (1989) Comput. Appl. Biosci. 5:107-113.
297. Rehli, M., et al. (1995) J. Biol. Chem. 270:15644-15649.
298. Remington, J.P. (1985) Remington's Pharmaceutical Sciences. 17th ed. Mack Publishing Co.
299. Ribardo, D.A., et al. (2002) Indian J. Exp. Biol. 40:129-138.
300. Riley, J., et al. (1990) Nuc. Acids Res. 18:2887-2890.
301. Ritter, R.C., et al. (1994) Ann. N.Y. Acad. Sci. 713:255-267.
302. Robertson, H.M. (1996) Mol. Gen. Genet. 252:761-766.
303. Robertson, H.M., Zumpano, K.L. (1997) Gene 205:203-217.
304. Roepman, R., et al. (2000) Hum. Mol Genet. 9:2095-2105.
305. Roessler, B.J., et al. (1993) J Biol. Chem. 268:26476-26481.
306. Roggenkamp, R., et al. (1984) Mol. Gen. Genet. 194:489-493.
307. Rosato, R.R., Grant, S. (2003) Cancer Biol. Ther. 2:30-37.
308. Rosen, R.C., McKenna, K.E. (2002) Ann. Rev. Sex Res. 13:36-88.
309. Rosenblum, M.G., Donato, N.J. (1989) Crit. Rev. Immunol. 9:21-44.
310. Rowland, J.M. (2002) Pediatr. Clin. North Am. 49:1415-1435.
311. Saha, S., Bardelli, et al. (2001) Science 294:1343-1346.
312. Saiki, R.K., et al. (1988) Science 239:487-491.
313. Sambrook, J., et al. (1989) Molecular Cloning, A Laboratory Manual. 2nd ed. Cold Spring Harbor Laboratory Press.
314. Sanchez, E.R., et al. (1990) Biochemistry 29:5145-5152.
315. Sayers, J.R., et al. (1992) Biotechniques 13:592-596.
316. Schaeferling, M., et al. (2002) Electrophoresis 23:3097-3105.
317. Schaffer, J.E., Lodish, H.F. (1994) Cell 79:393-395.
318. Schena, M., ed. (1999) DNA Microarrays: A Practical Approach. Oxford Univ. Press.
319. Schena, M., ed. (2000) Microarray Biochip Technology, 1st ed. Eaton Publishing Co.
320. Schlesinger, D.H. (1988a) Macromolecular Sequencing and Synthesis: Selected Methods and Applications. Wiley-Liss.
321. Schlesinger, D.H., ed. (1988b) Current Methods in Sequence Comparison and Analysis, Macromolecule Sequencing and Synthesis, Selected Methods and Applications, pp. 127-149, Alan R. Liss, Inc.
322. Schonthal, A.H. (2001) Cancer Lett. 170:1-13.
323. Seelig, H.P., et al. (1994) J. Autoimmun. 7:67-91.
324. Selkoe, D.J. (2001) Proc. Natl. Acad. Sci. 98:11,039-11,041.
325. Setlow, J., Hollaender, A., eds. (1986) Genetic Engineering: Principles and Methods. Plenum Pub. Corp.
326. Shamay, M., et al. (2002) J. Biol. Chem. 277:9982-9988.
327. Shao, H., Andres, D.A. (2000) J. Biol. Chem. 275:26,914-26,924.
328. Sheppard, P., et al. (2003) Nat. Immunol 4:63-68.
329. Shinnick, T.M., et al. Ann. Rev. Microbiol 37:425-446.
330. Shorter, J., et al. (2002) J. Cell Biol. 157:45-62.
331. Siebenlist, U., et al. (1980) Cell 20:269-281.
332. Siegel, G.J., et al. (1999) Basic Neurochemistry, Molecular, Cellular, and Medical Aspects. 6th ed. Lippincott, Williams & Wilkins.
333. Siezen, R.J., Leunissen, J.A. (1997) Protein Sci. 6:501-523.
334. Sladek, R., et al. (1997) Mol. Cell Biol. 17:5400-5409.
335. Slavin, S., et al. (2001) Cancer Chemother. Pharmacol. 48:S79-S84.
336. Smit, A.F., Riggs, A.D. (1996) Proc. Natl. Acad. Sci. 93:1443-1448.
337. Smith, G.E., et al. (1985) Proc. Natl. Acad. Sci. 82:8404-8408.
338. Smith, T.F., Waterman, M.S. (1981) Adv. Appl Math. 2:482-489.
339. Soares, M.B. (1997) Curr. Opin. Biotechnol 8:542-546.
340. Soejima, H., et al. (2001) Genomics 74:115-120.
341. Soulier, S., et al. (1996) Gene 172:285-289.
342. Southern, E., et al. (1999) Nature Genetics 21:5-9.
343. Stein, C.A., Krieg, A.M., eds. (1998) Applied Antisense Oligonucleotide Technology. Wiley-Liss.
344. Steinhaur, C., et al. (2002) Biotechniques, Supp.:38-45.
345. Stetler-Stevenson, et al. (1993) FASEB J. 7:1434-1441.
346. Stewart, Z.A., et al. (2003) Trends Pharmacol Sci. 24:139-145.
347. Stolz, L.E., Tuan, R.S. (1996) Mol. Biotechnol. 6:225-230.
348. Sturm, A., Dignass, A.U. (2002) Biochim. Biophys. Acta 1582:282-288.
349. Stutz, F., Bachi, A., et al. (2000) RNA 6:638-650.
350. Suh, Y.H., Checler, F. (2002) Pharmacol. Rev. 54:469-525.
351. Sutcliffe, J.G., et al. (1983) Science 219:660-666.
352. Tan, J., Town, et al. (1999) Science 286:2352-2355.
353. Tang, D.C., et al. (1992) Nature 356:152-154.
354. Tekur, S., et al. (1999) J. Androl 20:135-144.
355. Terada, R., et al. (2003) Lab. Invest. 83:665-672.
356. Thompson, J.D., et al. (1994) Nucleic Acids Res. 22:4673-80.
357. Tilburn, J., et al. (1983) Gene 26:205-221.
358. Trounson, A. (2002) Reprod. Biomed. Online 4 Suppl. 1:58-63.
359. Tsuda, T., et al. (1993) Biochem. Biophys. Res. Commun. 195:363-373.
360. Tukey, R.H., et al. (1993) J. Biol. Chem. 268:15,260-15,266.
361. Turgeman, G., et al. (2002) Curr. Opin. Mol Ther. 4:390-394.
362. Vainberg, I.E., et al. (1998) Cell 93:863-873.
363. Vale, R.D. (2003) Cell 112:467-480.
364. Vallejo, M., et al. (1993) Proc. Natl. Acad. Sci. 90:4679-4683.
365. Van Damme, et al. (2002) Curr. Gene Ther. 2:195-209.
366. van den Berg, J.A., et al. (1990) Bio/Technology 8:135-139.
367. Van den Berghe, L., et al. (2000) Mol. Endocrinol 14:1709-1724.
368. Van Den Blink, B., et al. (2002) Ann. N.Y. Acad. Sci. 973:349-58.
369. van der Spoel, A.C., et al. (2002) Proc. Natl. Acad. Sci. 99:17173-17178.
370. Van Eerdewegh, P., et al. (2002) Nature 418:426-430.
371. Van Laar, J.M., Tyndall, A. (2003) Cancer Control 10:57-65.
372. Verhey, K.J., et al. (2001) J. Cell Biol. 152:959-970.
373. Vlak, J.M., et al. (1988) J. Gen. Virol. 69:765-776.
374. Voisset, C., et al. (2000) AIDS Res. Hum. Retroviruses 16:731-740.
375. Wagner, R.W., et al. (1993) Science 260:1510-1513.
376. Wagner, R.W., et al. (1996) Nat. Biotechnol 14:840-844.
377. Wakefield, L.M., et al. (2000) Breast Cancer Res. 2:100-106.
378. Walker, J.E., et al. (1992) J. Mol. Biol. 226:1051-1072.
379. Walsh, A.C., Feulner, J.A., Reilly, A. (2001) Toxicol. Sci. 61:218-223.
380. Wang, J., Kirby, C.E., Herbst, R. (2002) J. Biol. Chem. 277:46659-46668.
381. Wang, M.S., et al. (1999) Am. J. Med. Genet. 86:34-43.
382. Wax, S.D., et al. (1994) J. Biol. Chem. 269:13,041-13,047.
383. Wei, S., Charmley, P., Concannon, P. (1997) Immunogenetics 45:405-412.
384. Weinberg, J.M., Saini, R. (2003) Cutis. 71:25-29.
385. Weiner, H.L., Selkoe, D.J. (2002) Nature 420:879-884.
386. Weiner, M.P., et al. (1993) Gene 126:35-41.
387. Weinstein, M.E., et al. (1988) Cancer Genet Cytogenet. 35:223-229.
388. Weishaar, R.E., et al. (1985) J. Med. Chem. 28:537-545.
389. Weissman, I.L. (2000) Science 287:1442-1446.
390. Weng, S., et al. (2002) Proteomics 2:48-57.
391. Wenger, R.H., et al. (1993) J. Biol Chem. 268:23,345-23,352.
392. Werner, T., et al. (1990) Virology 174:225-238.
393. Wick, G., et al. (1987) Immunol. Lett. 16:249-257.
394. Wieczorek, et al. (1999) Bioessays 21:637-648.
395. Wieser, R. (2002) Leuk. Lymphoma 43:59-65.
396. Winssinger, N., et al. (2002) Proc. Natl. Acad. Sci. 99:11,139-11,144.
397. Wojtowicz-Praga, S. (1999) Drugs R. D. 1:117-129.
398. Wu, A.M., Gallo, R.C. (1975) CRC Crit. Rev. Biochem. 3:289-347.
399. Xu, C.W., et al. (1997) Proc. Natl. Acad. Sci. 94:12,473-12,478.
400. Xu, Y., et al. (1999) Proc. Natl. Acad. Sci. 96:151-156.
401. Yamashita, S. (2000) Biochim. Biophys. Acta 1529:257-275.
402. Yang, N., et al. (1996) J. Biol. Chem. 271:5795-5804.
403. Yelton, M.M., et al. (1984) Proc. Natl. Acad. Sci. 81:1470-1474.
404. Yoshihama, M., et al. (2002) Genome Res. 12:379-390.
405. Yu, L., et al. (1995) J. Virol. 69:3007-3016.
406. Yu, Z., Restifo, N.P. (2002) J. Clin. Invest. 110:289-294.
407. Zalipsky, S. (1995) Bioconjugate Chem. 6:150-165.
408. Zhang, Q., et al. (2002) Hum. Mol. Genet. 11:993-1003.
409. Zhang, W.M., et al. (2002) Matrix Biol. 21:513-523.
410. Zhao, H., Grabowski, G.A. (2002) Cell Mol. Life Sci. 59:694-70 '.
411. Zhao, N., et al. (1995) Gene 156:207-215.
412. Zhao, Y., et al. (2003) Proc. Natl. Acad. Sci. 100:3965-3970.
413. Zhu, D.L. (1989) Anal. Biochem. 177:120-124.
414. Zhu, H., et al. (2000) Nat. Genetics 26:283-289.
415. Zhu, H., et al. (2001) Science 293:2101-2105.
416. Zhu, H., Snyder, M. (2003) Curr. Opin. Chem. Biol. 7:55-63.
417. Zhu, J., Kahn, C.R. (1997) Proc. Natl. Acad. Sci. 94:13,063-13,068.
SEQUENCE LISTING
[0282] The instant application includes a Statement Accompanying Sequence Listing, and a Sequence Listing in both paper and computer-readable formats.

Claims

1. A first nucleic acid molecule comprising a polynucleotide sequence chosen from at least one polynucleotide sequence according to SEQ ID NO: 1-123; or a complement thereof, or from at least one polynucleotide sequence that encodes SEQ ID NO: 124-246.
2. The nucleic acid molecule of claim 1, wherein the nucleic acid molecule is a DNA or an RNA molecule.
3. An animal injected with the nucleic acid molecule of claim 1.
4. A double-stranded isolated nucleic acid molecule comprising the first nucleic acid molecule of claim 1 and its complement.
5. The nucleic acid molecule of claim 4, wherein the first polynucleotide sequence encodes a polypeptide chosen from a polypeptide comprising a signal peptide, a mature polypeptide that lacks a signal peptide, a signal peptide, a biologically active fragment of a polypeptide, a polypeptide lacking a signal peptide cleavage site, a polypeptide consisting essentially of an N-terminal fragment that contains a Pfam domain, and a polypeptide consisting essentially of a C-terminal fragment that contains a Pfam domain.
6. A second nucleic acid molecule comprising a second polynucleotide sequence that is at least about 70%, or about 80%, or about 90%, or about 95% homologous to the first nucleic acid molecule of claim 1.
7. A second isolated nucleic acid molecule comprising a second polynucleotide sequence that hybridizes to the first polynucleotide sequence of claim 1 under high stringency conditions.
8. The second isolated nucleic acid molecule of claim 6, wherein the second polynucleotide sequence is complementary to the first polynucleotide sequence.
9. A vector comprising the nucleic acid molecule of claim 1 and a promoter that drives the expression of the nucleic acid molecule.
10. The vector of claim 9, wherein the promoter is chosen from one or more of a promoter that is naturally contiguous to the nucleic acid molecule, a promoter that is not naturally contiguous to the nucleic acid molecule, an inducible promoter, a conditionally active promoter, a constitutive promoter, and a tissue specific promoter.
11. A host cell transformed, transfected, transduced, or infected with the nucleic acid molecule of claim 1.
12. The host cell of claim 11, wherein the cell is chosen from one or more of a prokaryotic cell, a eucaryotic cell, a human cell, a mammalian cell, an insect cell, a fish cell, a plant cell, and a fungal cell.
13. A nucleic acid composition comprising a pharmaceutically acceptable carrier or a buffer and one or more compositions chosen from the nucleic acid molecule of claim 1, the nucleic acid molecule of claim 4, the vector of claim 9, and the host cell of claim 11.
14. A substantially purified polypeptide comprising a polypeptide sequence chosen from at least one amino acid sequence according to SEQ ID NO: 124-246.
15. An animal injected with the polypeptide of claim 14.
16. The polypeptide of claim 14, wherein the polypeptide has a function chosen from an agonist, an antagonist, a ligand, and a receptor.
17. The polypeptide of claim 14, wherein the polypeptide is chosen from a polypeptide comprising a signal peptide, a mature polypeptide that lacks a signal peptide, a signal peptide, a biologically active fragment of a polypeptide, a polypeptide lacking a signal peptide cleavage site, a biologically active fragment consisting essentially of an N-terminal fragment containing a Pfam domain, a biologically active fragment consisting essentially of a C-terminal fragment containing a Pfam domain, an extracellular fragment, a ligand binding fragment, and a receptor binding fragment.
18. A polypeptide composition comprising the polypeptide molecule of claim 14 and a pharmaceutically acceptable carrier or a buffer.
19. A cell culture medium comprising the polypeptide of claim 14.
20. The cell culture medium of claim 19, further comprising responder cells chosen from one or more T cells, B cells, NK cells, dendritic cells, macrophages, muscle cells, stem cells, epithelial skin cells, fat cells, blood cells, brain cells, bone marrow cells, endothelial cells, retinal cells, bone cells, kidney cells, pancreatic cells, liver cells, spleen cells, prostate cells, cervical cells, ovarian cells, breast cells, lung cells, liver cells, soft tissue cells, colorectal cells, cells of the gastrointestinal tract, and cancer cells.
21. The cell culture medium of claim 20, wherein the responder cells proliferate in the medium.
22. The cell culture medium of claim 20, wherein the responder cells are inhibited in the medium.
23. A cell culture comprising transfected cells, wherein the transfected cells are transfected with the polynucleotide of claim 1.
24. The cell culture of claim 23, further comprising responder cells chosen from one or more T cells, B cells, NK cells, dendritic cells, macrophages, muscle cells, stem cells, epithelial skin cells, fat cells, blood cells, brain cells, bone marrow cells, endothelial cells, retinal cells, bone cells, kidney cells, pancreatic cells, liver cells, spleen cells, prostate cells, cervical cells, ovarian cells, breast cells, lung cells, liver cells, soft tissue cells, colorectal cells, cells of the gastrointestinal tract, and cancer cells.
25. The cell culture of claim 23, wherein the responder cells proliferate in the cell culture.
26. The cell culture of claim 23, wherein the responder cells are inhibited in the cell culture.
27. A method of making a transformed, transfected, transduced, or infected host cell comprising:
(a) providing a composition comprising the vector of claim 9, and
(b) allowing a host cell to come into contact with the vector to form a transformed, transfected, transduced, or infected host cell.
28. A method of making a polypeptide comprising:
(a) providing a nucleic acid molecule that comprises a polynucleotide sequence encoding the polypeptide of claim 14;
(b) introducing the nucleic acid molecule into an expression system; and
(c) allowing the polypeptide to be produced.
29. A method of making a polypeptide comprising:
(a) providing a composition comprising the host cell of claim 11;
(b) culturing the host cell to produce the polypeptide; and
(c) allowing the polypeptide to be produced.
30. A method of making a polypeptide comprising:
(a) providing a cell-free expression system;
(b) providing a first nucleic acid molecule encoding a first hydrophobic polypeptide;
(c) providing reagents for producing a Nanodisc™;
(d) combining the first nucleic acid molecule, the cell-free expression system, and the reagents for producing a Nanodisc™; and
(e) allowing the first hydrophobic polypeptide to be produced in or introduced into a Nanodisc™.
31. The method of claim 30, wherein the cell-free expression system is selected from a bacterial lysate cell-free expression system, a plant lysate cell-free expression system, and a eukaryotic cell lysate cell-free expression system.
32. The method of claim 31, wherein the bacterial lysate is an E. coli lysate.
33. The method of claim 31, wherein the plant lysate is a wheat germ lysate.
34. The method of claim 31, wherein the eukaryotic cell lysate is a rabbit reticulocyte lysate.
35. The method of claim 30, wherein the cell-free expression system allows for replication of the nucleic acid molecule.
36. The method of claim 30, wherein the hydrophobic polypeptide is a membrane protein.
37. The method of claim 30, further comprising:
(a) providing a second nucleic acid molecule encoding a second hydrophobic polypeptide;
(b) combining the second nucleic acid molecule with the first nucleic acid molecule, the cell-free expression system, and the reagents for producing a Nanodisc™; and
(c) allowing the first and second hydrophobic polypeptides to be produced in or introduced into the Nanodisc M.
38. The method of claim 37, wherein the first and the second hydrophobic polypeptide are present in the same Nanodisc™.
39. The method of claim 37, wherein the first and at least the second hydrophobic polypeptides constitute components of a multi-protein complex.
40. The method of claim 37, wherein the first and the second hydrophobic polypeptides are other than components of a multi-protein complex.
41. The method of claim 37, wherein the first nucleic acid molecule and the second nucleic acid molecule are present in an equal molar ratio.
42. The method of claim 37, wherein the first nucleic acid molecule and the second nucleic acid molecule are present in different molar ratios.
43. The method of claim 37, further comprising:
(a) providing a third nucleic acid molecule encoding a third hydrophobic polypeptide;
(b) combining the third nucleic acid molecule with the second nucleic acid molecule, the first nucleic acid molecule, the cell-free expression system, and the reagents for producing a Nanodisc™; and
(c) allowing the first, second, and third hydrophobic polypeptides to be produced in or introduced into the Nanodisc™.
44. The method of claim 43, wherein the first, second, and third hydrophobic polypeptides are present in the same Nanodisc™.
45. The method of claim 43, wherein the first, second, and third hydrophobic polypeptides are components of a multi-protein complex.
46. The method of claim 43, wherein the first, second, and third hydrophobic polypeptides are other than components of a multi-protein complex.
47. The method of claim 43, wherein the first nucleic acid molecule, the second nucleic acid molecule, and the third nucleic acid molecule are present in an equal molar ratio.
48. The method of claim 43, wherein the first nucleic acid molecule, the second nucleic acid molecule, and the third nucleic acid molecule are present in different molar ratios.
49. The method of claim 43, further comprising:
(a) providing a fourth nucleic acid molecule encoding a fourth hydrophobic polypeptide;
(b) combining the fourth nucleic acid molecule with the third nucleic acid molecule, the second nucleic acid molecule, the first nucleic acid molecule, the cell-free expression system, and the reagents for producing a Nanodisc™; and
(c) allowing the first, second, third, and fourth hydrophobic polypeptides to be produced in or introduced into the Nanodisc™.
50. The method of claim 49, wherein the first, second, third, and fourth hydrophobic polypeptides are present in the same Nanodisc™.
51. The method of claim 49, wherein the first, second, third, and fourth hydrophobic polypeptides are components of a multi-protein complex.
52. The method of claim 49, wherein the first, second, third, and fourth hydrophobic polypeptides are other than components of a multi-protein complex.
53. The method of claim 49, wherein the first nucleic acid molecule, the second nucleic acid molecule, the third nucleic acid molecule, and the fourth nucleic acid molecule are present in an equal molar ratio.
54. The method of claim 49, wherein the first nucleic acid molecule, the second nucleic acid molecule, the third nucleic acid molecule, and the fourth nucleic acid molecule are present in different molar ratios.
55. An apparatus for producing a plurality of hydrophobic polypeptides in a high throughput manner comprising:
(a) means for providing a cell-free expression system for one or more components of a hydrophobic protein;
(b) means for introducing one or more nucleic acid molecules that encode one or more components of a hydrophobic protein into each cell-free expression system;
(c) means for introducing a Nanodisc™ into each cell-free expression system; and
(d) means for incubating the cell-free expression system, the one or more nucleic acid molecules and the Nanodisc™ for each hydrophobic protein.
56. The apparatus of claim 55, further comprising means for separating the Nanodisc™ containing the hydrophobic protein from the cell-free expression system.
57. A method for synthesizing a plurality of Nanodiscs™ simultaneously and for synthesizing a series of a plurality of simultaneously-synthesized Nanodiscs™ sequentially utilizing a dynamic system, which comprises:
(a) providing the apparatus of claim 56;
(b) operating said apparatus so as to produce a plurality of hydrophobic polypeptides in a cell-free expression system on a Nanodisc™;
(c) operating said apparatus so as to separate the Nanodisc™ containing the hydrophobic protein from the cell-free expression system; and
(d) operating said apparatus so as to relocate the apparatus so that the means for providing a cell-free expression system, the means for introducing one or more nucleic acid molecules, the means for introducing a Nanodisc™ into each cell-free expression system, and the means for incubating the cell-free expression system are positioned sufficiently with respect to one another so that at least a second plurality of hydrophobic proteins can be produced in a cell-free system on a Nanodisc™.
58. A hydrophobic protein made by the method of claim 57.
59. The hydrophobic protein of claim 30, wherein the protein is a membrane protein.
60. A composition comprising a plurality of hydrophobic proteins made by the method of claim 30.
61. The composition of claim 60, further comprising a binding partner bound to the hydrophobic protein.
62. A method of preparing a hydrophobic protein for determination of crystal structure comprising the steps of:
(a) providing a composition of hydrophobic proteins made by the method of claim 30; and
(b) allowing the composition to crystallize.
63. A method for determining the crystal structure of the hydrophobic protein of claim 58, comprising providing the hydrophobic protein of 58 under conditions that form a solid of regular shape with the individual molecules occupying regular positions with respect to one another.
64. A method of immunizing a non-human animal comprising the step of introducing into such animal the hydrophobic protein of claim 58.
65. The method of 64, wherein the hydrophobic protein of claim 58 is introduced through a route selected from intravenous, intradermal, and mucosal.
66. The method of 64, wherein the hydrophobic protein of claim 58 is administered to an animal suffering from a disorder selected from proliferative, bone, CNS, infectious, and metabolic disorders.
67. A method of screening for modulators of hydrophobic protein activity comprising the steps of:
(a) providing the hydrophobic protein of claim 58;
(b) contacting the hydrophobic protein with a candidate modulator;
(c) determining the ability of the candidate modulator to affect hydrophobic protein activity or to bind to the hydrophobic protein.
68. The method of claim 67, wherein the hydrophobic protein is a membrane protein.
69. The method of claim 67, wherein the modulators are selected from agonists, antagonists, antibodies, small molecule drugs, soluble receptors, natural ligands, and aptamers.
70. A diagnostic kit comprising a polynucleotide molecule, wherein the polynucleotide molecule comprises a sequence chosen from (a) at least 6, (b) at least 7, (c) at least 8, and (d) at least 9 contiguous nucleotides chosen from the nucleic acid molecule of claim 1.
71. A diagnostic kit comprising a polypeptide molecule, wherein the polypeptide molecule comprises an amino acid sequence or a biologically active fragment thereof, derived from the nucleic acid molecule of claim 1.
72. A genetically modified mouse comprising a deletion, substitution, or modification of a sequence chosen from SEQ ID NO: 1-123, wherein the deletion, substitution, or modification prevents or reduces expression of said sequence and results in a mouse deficient in or completely lacking one or more gene products of a sequence chosen from SEQ ID NO: 1-123.
PCT/US2004/012049 2003-04-18 2004-04-19 Novel human polypeptides encoded by polynucleotides WO2004094651A2 (en)

Applications Claiming Priority (20)

Application Number Priority Date Filing Date Title
US46370003P 2003-04-18 2003-04-18
US60/463,700 2003-04-18
US46720303P 2003-05-02 2003-05-02
US60/467,203 2003-05-02
US47660903P 2003-06-09 2003-06-09
US47663203P 2003-06-09 2003-06-09
US60/476,632 2003-06-09
US60/476,609 2003-06-09
US48532503P 2003-07-08 2003-07-08
US48535903P 2003-07-08 2003-07-08
US60/485,325 2003-07-08
US60/485,359 2003-07-08
US48696003P 2003-07-15 2003-07-15
US60/486,960 2003-07-15
US49333203P 2003-08-08 2003-08-08
US49337003P 2003-08-08 2003-08-08
US60/493,332 2003-08-08
US60/493,370 2003-08-08
US50505903P 2003-09-08 2003-09-08
US60/505,059 2003-09-24

Publications (2)

Publication Number Publication Date
WO2004094651A2 true WO2004094651A2 (en) 2004-11-04
WO2004094651A3 WO2004094651A3 (en) 2009-03-26

Family

ID=33314651

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2004/012049 WO2004094651A2 (en) 2003-04-18 2004-04-19 Novel human polypeptides encoded by polynucleotides

Country Status (1)

Country Link
WO (1) WO2004094651A2 (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007119759A1 (en) * 2006-04-11 2007-10-25 Eisai R & D Management Co., Ltd. Dopamine-producing neuron progenitor cell marker 187a5
WO2008099405A2 (en) * 2007-02-15 2008-08-21 Geneswitch Innovations Llc Secreted pate-like proteins
WO2008141230A1 (en) * 2007-05-09 2008-11-20 Lawrence Livermore National Security, Llc Methods and systems for monitoring production of a target protein in a nanolipoprotein particle
US8883729B2 (en) 2008-05-22 2014-11-11 Lawrence Livermore National Security, Llc Nanolipoprotein particles and related compositions, methods and systems
US8907061B2 (en) 2008-01-11 2014-12-09 Lawrence Livermore National Security, Llc. Nanolipoprotein particles and related methods and systems for protein capture, solubilization, and/or purification
US9303273B2 (en) 2008-05-09 2016-04-05 Lawrence Livermore National Security, Llc Nanolipoprotein particles comprising a natural rubber biosynthetic enzyme complex and related products, methods and systems
US9453840B2 (en) 2011-07-27 2016-09-27 Kyoto University Markers for dopaminergic neuron progenitor cells
US10151037B2 (en) 2009-01-12 2018-12-11 Lawrence Livermore National Security, Llc Electrochemical flow-cell for hydrogen production and nicotinamide dependent target reduction, and related methods and systems
US11053322B2 (en) 2011-12-21 2021-07-06 Lawrence Livermore National Security, Llc Apolipoprotein nanodiscs with telodendrimer
US11207422B2 (en) 2017-05-02 2021-12-28 Lawrence Livermore National Security, Llc MOMP telonanoparticles, and related compositions, methods and systems
US11279749B2 (en) 2015-09-11 2022-03-22 Lawrence Livermore National Security, Llc Synthetic apolipoproteins, and related compositions methods and systems for nanolipoprotein particles formation

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001075067A2 (en) * 2000-03-31 2001-10-11 Hyseq, Inc. Novel nucleic acids and polypeptides

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001075067A2 (en) * 2000-03-31 2001-10-11 Hyseq, Inc. Novel nucleic acids and polypeptides
US20050196754A1 (en) * 2000-03-31 2005-09-08 Drmanac Radoje T. Novel nucleic acids and polypeptides

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007119759A1 (en) * 2006-04-11 2007-10-25 Eisai R & D Management Co., Ltd. Dopamine-producing neuron progenitor cell marker 187a5
US8198081B2 (en) 2006-04-11 2012-06-12 Eisai R&D Management Co., Ltd. Dopaminergic neuron progenitor cell marker 187A5
JP5033121B2 (en) * 2006-04-11 2012-09-26 エーザイ・アール・アンド・ディー・マネジメント株式会社 Dopaminergic neuron progenitor cell marker 187A5
US8604173B2 (en) 2006-04-11 2013-12-10 Eisai R&D Management Co., Ltd. Dopaminergic neuron progenitor cell marker 187A5
WO2008099405A2 (en) * 2007-02-15 2008-08-21 Geneswitch Innovations Llc Secreted pate-like proteins
WO2008099405A3 (en) * 2007-02-15 2008-10-02 Geneswitch Innovations Llc Secreted pate-like proteins
WO2008141230A1 (en) * 2007-05-09 2008-11-20 Lawrence Livermore National Security, Llc Methods and systems for monitoring production of a target protein in a nanolipoprotein particle
US9458191B2 (en) 2008-01-11 2016-10-04 Lawrence Livermore National Security, Llc Nanolipoprotein particles and related methods and systems for protein capture, solubilization, and/or purification
US8907061B2 (en) 2008-01-11 2014-12-09 Lawrence Livermore National Security, Llc. Nanolipoprotein particles and related methods and systems for protein capture, solubilization, and/or purification
US9688718B2 (en) 2008-01-11 2017-06-27 Lawrence Livermore National Security, Llc Nanolipoprotein particles comprising hydrogenases and related products, methods and systems
US9303273B2 (en) 2008-05-09 2016-04-05 Lawrence Livermore National Security, Llc Nanolipoprotein particles comprising a natural rubber biosynthetic enzyme complex and related products, methods and systems
US8889623B2 (en) 2008-05-22 2014-11-18 Lawrence Livermore National Security, Llc Immunostimulatory nanoparticles and related compositions, methods and systems
US8883729B2 (en) 2008-05-22 2014-11-11 Lawrence Livermore National Security, Llc Nanolipoprotein particles and related compositions, methods and systems
US10151037B2 (en) 2009-01-12 2018-12-11 Lawrence Livermore National Security, Llc Electrochemical flow-cell for hydrogen production and nicotinamide dependent target reduction, and related methods and systems
US9453840B2 (en) 2011-07-27 2016-09-27 Kyoto University Markers for dopaminergic neuron progenitor cells
US11053322B2 (en) 2011-12-21 2021-07-06 Lawrence Livermore National Security, Llc Apolipoprotein nanodiscs with telodendrimer
US11279749B2 (en) 2015-09-11 2022-03-22 Lawrence Livermore National Security, Llc Synthetic apolipoproteins, and related compositions methods and systems for nanolipoprotein particles formation
US11207422B2 (en) 2017-05-02 2021-12-28 Lawrence Livermore National Security, Llc MOMP telonanoparticles, and related compositions, methods and systems

Also Published As

Publication number Publication date
WO2004094651A3 (en) 2009-03-26

Similar Documents

Publication Publication Date Title
US20020102569A1 (en) Diagnostic marker for cancers
WO2004020595A2 (en) Novel human polypeptides encoded by polynucleotides
WO2004093804A2 (en) Human polypeptides encoded by polynucleotides and methods of their use
WO2004035732A2 (en) Human polypeptides encoded by polynucleotides and methods of their use
WO2004094651A2 (en) Novel human polypeptides encoded by polynucleotides
JP2005503112A (en) Lipid binding molecule
US6168933B1 (en) Phospholipid transfer protein
US20080199443A1 (en) Bone Morphogenetic Variants, Compositions and Methods of Treatment
US20090286954A1 (en) Human cDNA Clones Comprising Polynucleotides Encoding Polypeptides and Methods of Their Use
US6245526B1 (en) Lipid metabolism transcription factor
US20070258949A1 (en) Human Cdna Clones Comprising Polynucleotides Encoding Polypeptides and Methods of Their Use
WO2004009836A2 (en) Pr/set-domain containing nucleic acids, polypeptides, antibodies and methods of use
WO2004020591A2 (en) Methods of use for novel human polypeptides encoded by polynucleotides
WO2004094598A2 (en) Methods of use for novel human polypeptides encoded by polynucleotides
WO2004038003A2 (en) Human polypeptides encoded by polynucleotides and methods of their use
US6500642B1 (en) Molecule associated with apoptosis
US20030166300A1 (en) Growth-related inflammatory and immune response protein
US20040253598A1 (en) Vesicle-associated proteins
CA2401660A1 (en) Lipid metabolism enzymes
WO2004039952A2 (en) Methods of use for novel human polypeptides encoded by polynucleotides
US20030049623A1 (en) PR/SET-domain containing nucleic acids, polypeptides, antibodies and methods of use
JP2004513611A (en) Lipid metabolizing enzymes
US20030175787A1 (en) Vesicle membrane proteins
JPH10146188A (en) Mouse gene corresponding to causative gene of human werner's syndrome and protein for which the gene codes
WO2004046310A2 (en) Novel mouse polypeptides encoded by polynucleotides and methods of their use

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase