WO1993000353A1 - Sequences characteristic of human gene transcription product - Google Patents

Sequences characteristic of human gene transcription product

Info

Publication number
WO1993000353A1
Authority
WO
WIPO (PCT)
Prior art keywords
seq
sequence
sequences
length
strandedness
Application number
PCT/US1992/005222
Other languages
English (en)
Inventor
J. Craig Venter
Mark D. Adams
Original Assignee
The United States Of America, As Represented By The Secretary, Department Of Health And Human Services
Application filed by The United States Of America, As Represented By The Secretary, Department Of Health And Human Services
Priority to EP92914421A (patent EP0593580A4)
Publication of WO1993000353A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1096 Processes for the isolation, preparation or purification of DNA or RNA cDNA Synthesis; Subtracted cDNA library construction, e.g. RT, RT-PCR
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/158 Expression markers

Definitions

  • the present invention relates to newly identified polynucleotide sequences corresponding to transcription products of human genes, and to complete gene sequences associated therewith.
  • This invention relates to human genes. Identification and sequencing of human genes is a major goal of modern scientific research. The sequence of human genes is more than just a scientific curiosity. For example, by identifying genes and determining their sequences, scientists have been able to make large quantities of valuable human "gene products." These include human insulin, interferon, Factor VIII, tumor necrosis factor, human growth hormone, tissue plasminogen activator, and numerous other compounds. Additionally, knowledge of gene sequences can provide the key to treatment or cure of genetic diseases (such as muscular dystrophy and cystic fibrosis). The present invention represents a quantum leap forward in civilization's knowledge of human gene sequences. There are several basic concepts of molecular biology which figure prominently in the invention. A brief explanation of those concepts follows.
  • the present invention is based on identification and characterization of gene segments.
  • Genes are the basic units of inheritance. Each gene is a string of connected bases called nucleotides. Most genes are formed of deoxyribonucleic acid, DNA. (Some viruses contain genes of ribonucleic acid, RNA.) The genetic information resides in the particular sequence in which the bases are arranged. A short sequence of nucleotides is often called a polynucleotide or an oligonucleotide.
  • polypeptides are built from long strings of individual units. These units are amino acids.
  • the nucleotide sequence of a gene tells the cell the sequence in which to arrange the amino acids to make the polypeptide encoded by that gene.
  • chains of up to about 200 amino acids are called polypeptides, while proteins are larger molecules made up of polypeptide subunits; both types of molecules are referred to generally herein as polypeptides.
  • a triplet of nucleotides (codon) in DNA codes for each amino acid or signals the beginning or end of the message (anticodon).
  • the term codon is also used for the corresponding (and complementary) sequences of three nucleotides in the mRNA into which the original DNA sequence is transcribed.
  • mRNA: messenger RNA
  • the mRNA in turn, can be translated into a polypeptide by the cell. This entire process is called gene expression, and the polypeptide is the gene product encoded by the gene.
  • cDNA: complementary DNA
  • the various types of genes include those which code for polypeptides, those which are transcribed into RNA but are not translated into polypeptides, and those whose functional significance does not demand that they be transcribed at all. Most genes are found on large molecules of DNA located in chromosomes.
  • Double stranded cDNA carries all the information of a gene. Each base of the first strand is joined to a complementary base (hybridized) in the second strand.
  • the linear DNA molecules in chromosomes have thousands of genes distributed along their length. Chromosomes include both coding regions (coding for polypeptides) and noncoding regions; the coding regions represent only about three percent of the total chromosome sequence.
  • An individual gene has regulatory regions that include a promoter which directs expression of the gene, a coding region which can code for a polypeptide, and a termination signal.
  • the regulatory DNA sequence is usually a noncoding region that determines if, where, when, and at what level a particular gene is expressed.
  • the coding regions of many genes are discontinuous, with coding sequences (exons) alternating with noncoding regions (introns).
  • the final mRNA copy of the gene does not include these introns (which can be much longer than the coding region itself), although it does contain certain untranslated regions that usually do not code for the polypeptide gene product.
  • Untranslated sequences at the beginning and end of the mRNA are known as 5'- and 3'-untranslated regions, respectively. This nomenclature reflects the orientation of the nucleotide constituents of the mRNA.
  • a cDNA is a DNA copy of a messenger RNA, which contains all of the exons of a gene.
  • the cDNA can be thought of as having three parts: an untranslated 5' leader, an uninterrupted polypeptide-coding sequence, and a 3' untranslated region.
  • the untranslated leader and trailing sequences are important for initiation of translation, mRNA stability, and other functions.
  • the untranslated leader and trailing sequences are called 5'- and 3'-untranslated sequences, respectively.
  • the 3' untranslated sequence is usually longer than the 5' untranslated leader, and can be longer than the polypeptide-coding sequence.
  • the untranslated regions typically have many, randomly-distributed stop codons, and do not display the nonrandom base arrangements found in coding sequences.
  • the 5'-untranslated sequence is relatively short, generally between 20 and 200 bases.
  • the 3'-untranslated sequence is often many times longer, up to several thousand bases.
  • the translated or coding sequence begins with a translational start codon (AUG or GUG) and ends with a translational stop codon (UAA, UGA, or UAG).
  • translation begins at the first "start" codon on the mRNA and proceeds to the first "stop" codon.
  • Coding sequences can be distinguished by their nonrandom distribution of bases; numerous computer algorithms have been developed to distinguish coding from noncoding regions in this way.
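The start-to-stop rule described above can be sketched in a few lines of Python. This is a minimal illustration only, not the method used in the application: the function name `find_coding_region` is hypothetical, the codon sets are the ones named in the text, and real coding-region prediction also relies on the nonrandom base statistics mentioned above.

```python
START_CODONS = {"AUG", "GUG"}        # start codons named in the text
STOP_CODONS = {"UAA", "UGA", "UAG"}  # stop codons named in the text


def find_coding_region(mrna):
    """Return (start, end) of the first open reading frame, or None.

    start is the index of the first start codon in the mRNA; end is the
    index just past the first stop codon in the same reading frame.
    (Hypothetical helper, for illustration only.)
    """
    mrna = mrna.upper()
    for start in range(len(mrna) - 2):
        if mrna[start:start + 3] in START_CODONS:
            # Walk codon by codon in the frame set by the start codon.
            for pos in range(start + 3, len(mrna) - 2, 3):
                if mrna[pos:pos + 3] in STOP_CODONS:
                    return start, pos + 3
            return None  # start codon found, but no in-frame stop codon
    return None


region = find_coding_region("CCAUGGCUUAAGG")
```

Everything before `start` would then be treated as 5'-untranslated sequence and everything from `end` onward as 3'-untranslated sequence.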
  • Human DNA differs from person to person. No two persons (except perhaps identical twins) have identical DNA. While the differences, called allelic variations or polymorphisms, are slight on a molecular level, they account for most of the physical and other observable differences between individuals. It has been estimated that approximately 14 million sequence polymorphism differences exist between individuals.
  • PCR: polymerase chain reaction
  • Primer extension proceeds inward across the region between the two primers, and the product of DNA synthesis of one primer serves as a template for the other primer. Repeated cycles of DNA denaturation, annealing of primers, and extension result in an exponential increase in the number of copies of the region bounded by the primers.
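The exponential increase can be made concrete with a short calculation. The sketch below assumes an idealized reaction; the `efficiency` parameter (fraction of templates copied per cycle) is an assumption added for illustration and does not appear in the text.

```python
def pcr_copies(initial_templates, cycles, efficiency=1.0):
    """Idealized copy number of the primer-bounded region after PCR.

    efficiency is a hypothetical per-cycle yield between 0 and 1;
    1.0 means every template is duplicated in every cycle.
    """
    return initial_templates * (1.0 + efficiency) ** cycles


# With perfect doubling, 30 cycles turn one template into 2**30 copies.
copies_after_30 = pcr_copies(1, 30)
```

In practice yields plateau in late cycles, so this is an upper bound rather than a prediction.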
  • a labeled segment of single-stranded DNA can be hybridized to a longer DNA sequence, such as a chromosome, to mark a specific location on the longer sequence. Segments of DNA 50 bases long or longer that hybridize to a unique DNA location in the human genome are extremely unlikely to hybridize elsewhere in the human genome.
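A rough back-of-the-envelope calculation shows why a 50-base segment is so unlikely to match elsewhere by chance. The sketch below rests on strong simplifying assumptions that are not in the text: bases are treated as random and equally likely, only exact matches count, and the genome size of about 3 billion base pairs is an assumed round figure. Real genomes contain repeats and hybridization tolerates some mismatches, so this is an order-of-magnitude illustration only.

```python
def expected_random_matches(probe_length, genome_size_bp):
    """Expected number of chance exact matches of a probe in a random genome.

    Assumes independent, equiprobable bases (an idealization) and counts
    both strands, hence the factor of 2.
    """
    return 2 * genome_size_bp / 4 ** probe_length


ASSUMED_GENOME_SIZE = 3_000_000_000  # assumed value, not taken from the text

chance_50mer = expected_random_matches(50, ASSUMED_GENOME_SIZE)
chance_15mer = expected_random_matches(15, ASSUMED_GENOME_SIZE)
```

Under these assumptions a 15-base sequence is expected to occur by chance several times, while a 50-base sequence essentially never does.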
  • the Human Genome Project is an effort to sequence all human DNA (the human genome).
  • the human genome is estimated to comprise 50,000 - 100,000 genes, up to 30,000 of which might be expressed in the brain (Sutcliffe, Ann. Rev. Neurosci. 11:157 (1988)).
  • Once dedicated human chromosome sequencing begins in three to five years, it is expected that 12-15 years will be required to complete the sequence of the genome (Report of the Ad Hoc Program Advisory Committee on Complex Genomes, Reston, Va., Feb. 1988, D. Baltimore, Ed. (NIH, Bethesda, Md., 1988)).
  • the majority of human genes would remain unknown for at least the next decade.
  • the present invention can greatly accelerate the pace at which human genes can be identified and mapped.
  • GenBank listed the sequences of only a few thousand human genes and less than two hundred human brain mRNAs (GenBank Release 66.0, December, 1990).
  • cDNA sequencing: sequencing of complementary DNA
  • Genomic sequencing proponents have argued the difficulty of finding every mRNA expressed in all tissues, cell types, and developmental states, and that much valuable information from intronic and intergenic regions, including control and regulatory sequences, will be missed by cDNA sequencing. (Report of the Committee on Mapping and Sequencing the Human Genome, National Research Council (National Academy Press, Washington, D.C. 1988)). Further, sequencing of transcribed regions of the genome using cDNA libraries has heretofore been considered impractical or unsatisfactory. Libraries of cDNA were believed to be dominated by repetitive elements, mitochondrial genes, ribosomal RNA genes, and other nuclear genes comprising common or housekeeping sequences.
  • cDNA libraries would provide few sequences corresponding to structural and regulatory polypeptides or peptides. See, for example, Putney et al., Nature 302:718-721 (1983). Putney et al. sequenced over 150 clones from a rabbit muscle cDNA library and identified clones for 13 of the 19 known muscle polypeptides, including one new isotype but no unknown coding sequences.
  • Another perceived drawback of cDNA sequencing was that some mRNAs are abundant, and some are rare. The cellular quantities of mRNA from various genes can vary by several orders of magnitude. This led critics to believe that most information obtained from cDNA sequencing would be repetitious and useless.
  • cDNA sequencing now provides a rapid method for obtaining enormous amounts of valuable genetic information and DNA products of great utility for the biotechnology and pharmaceutical industries. Not only can many distinct cDNAs be isolated and sequenced, even partial cDNAs can be used, with conventional, well-understood methods, to isolate entire genes, and to determine the chromosomal locations and biological functions of these genes. As is demonstrated here, fragments of only a few hundred bases are sufficient, in many cases, to identify the probable function of a new human gene if it is similar in structure to a gene from another animal, or from plants or bacteria.
  • fragments of untranslated regions of a cDNA can be used to: i) isolate the coding sequence of the cDNA; ii) isolate the complete gene; iii) determine the position of the gene on a human chromosome, and hence the potential of the gene to cause a human genetic disease; and iv) determine the function of the gene by means of experiments in which the function of the native gene is disrupted by the addition of a short DNA fragment to the cell, e.g., using triple helix or antisense probes. Because coding regions comprise such a small portion of the human genome, identification and mapping of transcribed regions and coding regions of chromosomes is of significant interest.
  • ESTs: Expressed Sequence Tags
  • STSs: random genomic DNA sequence tagged sites
  • aspects of the present invention thus include the individual ESTs, corresponding partial and complete cDNA, genomic DNA, mRNA, antisense strands, triple helix probes, PCR primers, coding regions, and constructs. Also, where one skilled in the art is enabled by this specification to prepare expression vectors and polypeptide expression products, they are also within the scope of the present invention, along with antibodies, especially monoclonal antibodies, to such expression products.
  • ESTs from cDNA Libraries: The sequences of the present invention were isolated from commercially available and custom made cDNA libraries using a rapid screening and sequencing technique.
  • the method comprises applying conventional automated DNA sequencing technology to screening clones, advantageously randomly selected clones, from a cDNA library.
  • the library is initially "enriched" through removal of ribosomal sequences and other common sequences prior to clone selection.
  • ESTs are generated from partial DNA sequencing of the selected clones.
  • the ESTs of the present invention were generated using low redundancy of sequencing, typically a single sequencing reaction. While single sequencing reactions may have an accuracy as low as 97%, this nevertheless provides sufficient fidelity for identification of the sequence and design of PCR primers.
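To see what 97% single-pass accuracy means in practice, the following sketch computes the expected number of miscalled bases in a read and the chance that a primer-sized window contains no error. It assumes errors are independent and uniformly distributed along the read, which is a simplification; the 400-base read and 20-base primer lengths are illustrative values, and the function names are hypothetical.

```python
def expected_errors(read_length, accuracy):
    """Expected number of miscalled bases in a read (independent errors assumed)."""
    return read_length * (1.0 - accuracy)


def prob_error_free(window_length, accuracy):
    """Probability that a window of bases contains no miscalled base."""
    return accuracy ** window_length


errors_in_read = expected_errors(400, 0.97)   # illustrative 400-base read
clean_primer = prob_error_free(20, 0.97)      # illustrative 20-base primer
```

Under these assumptions roughly a dozen bases in a 400-base read would be wrong, yet a given 20-base window is still error-free more often than not, and a primer with a single mismatch will often still anneal.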
  • Exon amplification works by artificially expressing part or all of a gene that is contained in a cloned fragment of genomic DNA such as a cosmid or yeast artificial chromosome (YAC).
  • MIT that uses control elements from virus genes to express the protein-coding exons of the human gene of interest.
  • Exon trapping shows considerable promise as a general technique for identifying those genes in the human genome that cannot be found by cDNA cloning and EST sequencing.
  • Exon amplification will also be useful for identifying the genes in regions of genomic DNA to which disease genes have been mapped.
  • the exon amplification method can be used directly with the cosmid and
  • ESTs comprise DNA sequences corresponding to a portion of nuclear encoded messenger RNA.
  • An EST is of sufficient length to permit: (1) amplification of the specific sequence from a cDNA library, e.g., by polymerase chain reaction (PCR); (2) use of a synthetic polynucleotide corresponding to a partial or complete sequence of the EST as a hybridization probe of a cDNA library, generally having 30 - 50 base pairs; or (3) unique designation of the pure cDNA clone from which the EST was derived (the EST clone) for use as a hybridization probe of a cDNA library.
  • EST-derived primer pairs and sequences amplify or detectably hybridize to a sequence from a genomic library.
  • the ESTs disclosed herein are generally at least 150 base pairs in length.
  • the length of an EST is determined by the quality of sequencing data and the length of the cloned cDNA.
  • Raw data from the automated sequencers is edited to remove low quality sequence at the end of the sequencing run.
  • High quality sequences (usually a result of sequencing templates without excessive salt contamination) generally give about 400 bp of reliable sequence data; other sequences give fewer bases of reliable data.
  • a 150 bp EST is long enough to be translated into a 50 amino acid peptide sequence. This length is sufficient to observe similarities when they exist in a database search.
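The arithmetic behind this statement (three bases per codon, so 150 bases give 50 amino acids) can be checked with a small translation routine. This is a sketch using the standard genetic code; the names `CODON_TABLE` and `translate` are illustrative, and unrecognized codons (for example those containing an ambiguous base) are reported as "X" by assumption.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate a DNA sequence in the given reading frame (0, 1, or 2).

    Stop codons are written as '*'; unrecognized codons as 'X'.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


peptide = translate("ATG" * 50)  # a 150-base sequence yields 50 residues
```

Because the reading frame of an EST is not known in advance, a database search would normally translate all three frames on both strands and compare each result.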
  • 150 bp is long enough to design PCR primers from each end of the sequence to amplify the complete EST. Sequences shorter than 150 bp are difficult to purify and use following PCR amplification. Furthermore, a 150 bp polynucleotide is likely to give a very strong signal with low background in a screen of a genomic library. Finally, it is highly unlikely that a sequence of the same 150 bp exists in any genes in the genome besides the one tagged by the EST. Some closely related gene family members have very similar nucleotide sequences, but no examples of pairs of human genes with long segments of identical sequence have been reported to date. For instance, there are three known ⁇ -tubulin genes in humans.
  • ESTs were found that matched one or another of these tubulin genes, but several new members of this gene family were also found and could be clearly distinguished from the three known members. ESTs that match perfectly to several different genes can be detected by hybridizing to chromosomes: if many chromosomal loci are observed, the sequence (or a close variant) is present in more than one gene. This problem can be circumvented by using the 3'-untranslated part of the cDNA alone as a probe for the chromosomal location or for the full-length cDNA or gene. The 3'-untranslated region is more likely to be unique within gene families, since there is no evolutionary pressure to conserve a coding function of this region of the mRNA.
  • ESTs can be used to map the expressed sequence to a particular chromosome.
  • ESTs can be expanded to provide the full coding regions, as detailed below. In this manner, previously unknown genes can be identified.
  • cDNA libraries can be used to obtain ESTs
  • human brain cDNA libraries are exemplified and represent a preferred embodiment.
  • Suitable cDNA libraries can be freshly prepared or obtained commercially, e.g., as shown in Examples 1 and 9.
  • the cDNA libraries from the desired tissue are preferably preprocessed by conventional techniques to reduce repeated sequencing of high and intermediate abundance clones and to maximize the chances of finding rare messages from specific cell populations.
  • preprocessing includes the use of defined composition prescreening probes, e.g., cDNA corresponding to mitochondria, abundant sequences, ribosomes, actins, myelin basic polypeptides, or any other known high abundance peptide; these prescreening probes used for preprocessing are generally derived from known ESTs.
  • Other useful preprocessing techniques include subtraction, which preferentially reduces the population of certain sequences in the library (e.g., see A. Swaroop et al., Nucl. Acids Res. 19:1954 (1991)), and normalization, which results in all sequences being represented in approximately equal proportions in the library (Patanjali et al, Proc. Natl. Acad. Sci. USA 88:1943 (1991)).
  • the cDNA libraries used in the present method will ideally use directional cloning methods so that either the 5' end of the cDNA (likely to contain coding sequence) or the 3' end (likely to be a non-coding sequence) can be selectively obtained.
  • Libraries of cDNA can also be generated from recombinant expression of genomic DNA. After they are amplified, ESTs can be obtained and sequenced, e.g., as illustrated in Example 9.
  • sequences of the present invention include the specific sequences set forth in the Sequence Listing and designated SEQ ID NO: 1 - SEQ ID NO: 315. In one aspect of this embodiment, the invention relates to those sequences of SEQ ID NOS: 1 - 315 that comprise the cDNA coding sequences for polypeptides having less than 95% identity with known amino acid sequences (see Table 2) and more preferably less than 90% or 85% identity.
  • the invention relates to those sequences of SEQ ID NOS: 1 - 315 that encode polypeptides having no similarity to known amino acid sequences (see Examples that follow). Precisely because they do not contain coding regions and are therefore more unique in their sequence structures, those sequences which meet neither of the preceding criteria can be most useful and are generally preferred for mapping.
  • the ESTs of the present invention generally represent relatively small coding regions or untranslated regions of human genes. Although most of these sequences do not code for a complete gene product, the ESTs of the present invention are highly specific markers for the corresponding complete coding regions.
  • the ESTs are of sufficient length that they will hybridize, under stringent conditions, only with DNA for that gene to which they correspond.
  • Suitably stringent conditions comprise conditions, for example, where at least 95%, preferably at least 97% or 98% identity (base pairing), is required for hybridization. This property permits use of the EST to isolate the entire coding region and even the entire sequence. Therefore, only routine laboratory work is necessary to parlay the unique EST sequence into the corresponding unique complete gene sequence.
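The identity thresholds above can be expressed as a simple computation over two pre-aligned sequences of equal length. The sketch below is illustrative only: it compares a probe with the matching strand of a candidate target (the physically hybridizing strand would be the complement), it does not perform alignment, and the function names and the default 95% threshold parameter are hypothetical conveniences.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two equal-length sequences agree."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def passes_stringency(seq_a, seq_b, threshold=95.0):
    """True if identity meets the threshold (95% by default, per the text)."""
    return percent_identity(seq_a, seq_b) >= threshold


probe = "ACGT" * 5
target = "ACGT" * 4 + "ACGA"   # one mismatch in 20 bases
identity = percent_identity(probe, target)
```

Actual hybridization stringency is set experimentally through temperature and salt concentration; the percentage here only models the outcome those conditions are meant to enforce.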
  • each of the ESTs of the present invention "corresponds" to a particular unique human gene.
  • Knowledge of the EST sequence permits routine isolation and sequencing of the complete coding sequence of the corresponding gene.
  • the complete coding sequence is present in a full-length cDNA clone as well as in the gene carried on genomic clones. Therefore, each EST "corresponds" to a cDNA (from which the EST was derived), a complete genomic gene sequence, a polypeptide coding region (which can be obtained either from the cDNA or genomic DNA), and a polypeptide or amino acid sequence encoded by that region.
  • the first step in determining where an EST is located in the cDNA is to analyze the EST for the presence of coding sequence, e.g., as described in Example 12.
  • the CRM program predicts the extent and orientation of the coding region of a sequence. Based on this information, one can infer the presence of start or stop codons within a sequence and whether the sequence is completely coding or completely non-coding. If start or stop codons are present, then the EST can cover both part of the 5'-untranslated or 3'-untranslated part of the mRNA (respectively) as well as part of the coding sequence. If no coding sequence is present, it is likely that the EST is derived from the 3'-untranslated sequence due to its longer length and the fact that most cDNA library construction methods are biased toward the 3' end of the mRNA.
  • Radiolabel the isolated insert DNA, e.g., with 32P labels, preferably by nick translation or random primer labeling.
  • EST is a specific tag for a messenger RNA molecule.
  • the complete sequence of that messenger RNA, in the form of cDNA can be determined using the EST as a probe to identify a cDNA clone corresponding to a full-length transcript, followed by sequencing of that clone.
  • the EST or the full-length cDNA clone can also be used as a probe to identify a genomic clone or clones that contain the complete gene including regulatory and promoter regions, exons, and introns.
  • ESTs are used as probes to identify the cDNA clones from which an EST was derived.
  • ESTs, or portions thereof, can be nick-translated or end-labelled with 32P using polynucleotide kinase and labelling methods known to those with skill in the art (Basic Methods in Molecular Biology, L.G. Davis, M.D. Dibner, and J.F. Battey, eds., Elsevier Press, NY, 1986).
  • the lambda library can be directly screened with the labelled ESTs of interest or the library can be converted en masse to pBluescript (Stratagene, La Jolla, California) to facilitate bacterial colony screening. Both methods are well known in the art.
  • filters with bacterial colonies containing the library in pBluescript or bacterial lawns containing lambda plaques are denatured and the DNA is fixed to the filters.
  • the filters are hybridized with the labelled probe using hybridization conditions described by Davis et al.
  • the ESTs, cloned into lambda or pBluescript, can be used as positive controls to assess background binding and to adjust the hybridization and washing stringencies necessary for accurate clone identification.
  • the resulting autoradiograms are compared to duplicate plates of colonies or plaques; each exposed spot corresponds to a positive colony or plaque.
  • the colonies or plaques are selected, expanded and the DNA is isolated from the colonies for further analysis and sequencing.
  • the ESTs can additionally be used to screen Northern blots of mRNA obtained from various tissues or cell cultures, including the tissue of origin of the EST clone. Northern analysis will most often produce one to several positive bands. The bands can be selected for further study based on the predicted size of the mRNA.
  • Positive cDNA clones in phage lambda are analyzed to determine the amount of additional sequence they contain using PCR with one primer from the EST and the other primer from the vector.
  • Clones with a larger vector-insert PCR product than the original EST clone are analyzed by restriction digestion and DNA sequencing to determine whether they contain an insert of the same or similar size as the mRNA on a Northern blot. Once one or more overlapping cDNA clones are identified, the complete sequence of the clones can be determined.
  • the preferred method is to use exonuclease III digestion (McCombie, W.R., Kirkness, E., Fleming, J.T., Kerlavage, A.R., Iovannisci, D.M., and Martin-Gallardo, R., Methods 3:33-40, 1991).
  • A series of deletion clones is generated, each of which is sequenced.
  • the resulting overlapping sequences are assembled into a single contiguous sequence of high redundancy (usually three to five overlapping sequences at each nucleotide position), resulting in a highly accurate final sequence.
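The gain in accuracy from redundancy can be illustrated with a column-by-column majority vote over reads that have already been aligned. This is a simplified sketch, not the assembly software used by the applicants: the reads are assumed to be padded with "-" where they do not cover a position, ties are broken by first occurrence, and the function name `consensus` is hypothetical.

```python
from collections import Counter


def consensus(aligned_reads):
    """Majority base per column of pre-aligned reads.

    '-' marks positions a read does not cover; columns with no coverage
    are reported as 'N'.
    """
    length = max(len(read) for read in aligned_reads)
    result = []
    for i in range(length):
        column = [r[i] for r in aligned_reads if i < len(r) and r[i] != "-"]
        if column:
            result.append(Counter(column).most_common(1)[0][0])
        else:
            result.append("N")
    return "".join(result)


# Three-fold coverage: the single miscalled base in the second read is outvoted.
final_sequence = consensus(["ACGTAC", "ACGAAC", "ACGTAC"])
```

With three to five reads per position, an isolated single-pass error is corrected whenever the other reads agree.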
  • a similar screening and clone selection approach can be applied to obtaining cosmid or lambda clones from a genomic DNA library that contains the complete gene from which the EST was derived (Kirkness, E.F., Kusiak, J.W., Menninger, J., Gocayne, J.D., Ward, D.C., and Venter, J.C., Genomics 10:985-995 (1991)).
  • these genomic clones can also be sequenced in their entirety.
  • a shotgun approach is preferred to sequencing clones with inserts longer than 10 kb (genomic cosmid and lambda clones).
  • the clone is randomly broken into many small pieces, each of which is partially sequenced. The sequence fragments are then aligned to produce the final contiguous sequence with high redundancy.
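The alignment of shotgun fragments into a contiguous sequence can be sketched as a greedy merge on the longest suffix-prefix overlap. This toy version is an assumption-laden illustration, not the method referenced in the text: it requires exact overlaps, ignores sequencing errors, reverse-strand reads, and fragments contained inside others, and the names `overlap` and `greedy_assemble` are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest exact overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # no overlaps left; remaining fragments stay as separate contigs
        merged = frags[best_i] + frags[best_j][best_size:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags


contigs = greedy_assemble(["ATGGCGT", "GCGTACG", "TACGTTA"])
```

Real assemblers tolerate mismatches and use coverage to resolve repeats, which is where the high redundancy mentioned above becomes important.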
  • An intermediate approach is to sequence just the promoter region and the intron-exon boundaries and to estimate the size of the introns by restriction endonuclease digestion (ibid.).
  • the polynucleotides of the present invention can be derived from natural sources or synthesized using known methods. The sequences falling within the scope of the present invention are not limited to the specific sequences described, but include human allelic and species variations thereof and portions thereof of at least 15-18 bases.
  • sequences of at least 15-18 bases can be used, for example, as PCR primers or as DNA probes.
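As an illustration of how short portions of an EST serve as PCR primers, the sketch below takes one primer from each end of a sequence: the forward primer is the first bases of the EST as written, and the reverse primer is the reverse complement of the last bases. The 18-base default follows the 15-18 base figure in the text; the function names are hypothetical, and practical primer design would also weigh melting temperature, GC content, and self-complementarity, none of which is modeled here.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def end_primers(est, primer_length=18):
    """Return (forward, reverse) primers taken from the two ends of an EST."""
    if len(est) < 2 * primer_length:
        raise ValueError("EST too short for two non-overlapping primers")
    forward = est[:primer_length]
    reverse = reverse_complement(est[-primer_length:])
    return forward, reverse


forward, reverse = end_primers("AAAACCCCGGGG", primer_length=4)
```

Both primers are written 5' to 3', so that extension from each proceeds inward across the EST.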
  • the invention includes the entire coding sequence associated with the specific polynucleotide sequence of bases described in the Sequence Listing, as well as portions of the entire coding sequence of at least 15-18 bases and allelic and species variations thereof.
  • the invention includes sequences coding for the same amino acid sequences as do the specific sequences disclosed herein.
  • sequences, constructs, vectors, clones, and other materials comprising the present invention can advantageously be in enriched or isolated form.
  • enriched means that the concentration of the material is at least about 2, 5, 10, 100, or 1000 times its natural concentration (for example), advantageously 0.01%, by weight, preferably at least about 0.1% by weight. Enriched preparations of about 0.5%, 1%, 5%, 10%, and 20% by weight are also contemplated. Further, removal of clones corresponding to ribosomal RNA and "housekeeping" genes and clones without human cDNA inserts results in a library that is "enriched" in the desired clones.
  • isolated requires that the material be removed from its original environment (e.g., the natural environment if it is naturally occurring).
  • a naturally-occurring polynucleotide present in a living animal is not isolated, but the same polynucleotide, separated from some or all of the coexisting materials in the natural system, is isolated. It is also advantageous that the sequences be in purified form.
  • purified does not require absolute purity; rather, it is intended as a relative definition. Individual EST clones isolated from a cDNA library have been conventionally purified to electrophoretic homogeneity. The sequences obtained from these clones could not be obtained directly either from the library or from total human DNA.
  • the cDNA clones are not naturally occurring as such, but rather are obtained via manipulation of a partially purified naturally occurring substance (messenger RNA).
  • the conversion of mRNA into a cDNA library involves the creation of a synthetic substance (cDNA) and pure individual cDNA clones can be isolated from the synthetic library by clonal selection.
  • purification of starting material or natural material to at least one order of magnitude, preferably two or three orders, and more preferably four or five orders of magnitude is expressly contemplated.
  • In a cDNA library there are many species of mRNA represented.
  • Each cDNA clone can be interesting in its own right, but must be isolated from the library before further experimentation can be completed. In order to sequence any specific cDNA, it must be removed and separated (i.e. isolated and purified) from all the other sequences. This can be accomplished by many techniques known to those of skill in the art. These procedures normally involve identification of a bacterial colony containing the cDNA of interest and further amplification of that bacteria. Once a cDNA is separated from the mixed clone library, it can be used as a template for further procedures such as nucleotide sequencing. Although claims to large numbers of ESTs and corresponding sequences are presented herein, the invention is not limited to these particular groupings of sequences.
  • the present invention also includes recombinant constructs comprising one or more of the sequences as broadly described above.
  • the constructs comprise a vector, such as a plasmid or viral vector, into which a sequence of the invention has been inserted, in a sense or antisense orientation.
  • the construct further comprises regulatory sequences, including for example, a promoter, operably linked to the sequence.
  • Eukaryotic: pWLneo, pSV2cat, pOG44, pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia).
  • Promoter regions can be selected from any desired gene using CAT (chloramphenicol acetyltransferase) vectors or other vectors with selectable markers.
  • Two appropriate vectors are pKK232-8 and pCM7.
  • Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt, lambda PR, and trc.
  • Eukaryotic promoters include CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the appropriate vector and promoter is well within the level of ordinary skill in the art.
  • the present invention relates to host cells containing the above-described construct.
  • the host cell can be a higher eukaryotic cell, such as a mammalian cell, or a lower eukaryotic cell, such as a yeast cell, or the host cell can be a procaryotic cell, such as a bacterial cell.
  • Introduction of the construct into the host cell can be effected by calcium phosphate transfection, DEAE dextran mediated transfection, or electroporation (Davis, L., Dibner, M., Battey, I., Basic Methods in Molecular Biology (1986)).
  • the constructs in host cells can be used in a conventional manner to produce the gene product coded by the recombinant sequence.
  • the encoded polypeptide can be synthetically produced by conventional peptide synthesizers.
  • Certain ESTs have already been preliminarily categorized by analogy to related sequences in other organisms (see Table 2) .
  • Table 10 of Example 8 categorizes particular ESTs broadly as metabolic, regulatory, and structural sequences where known. Constructs comprising genes or coding sequences corresponding to each of these categories are, therefore, specifically and individually contemplated.
  • Table 11 more particularly separates 27 new ESTs into 11 categories using different criteria.
  • Table 11 further identifies the EST by the particular gene product for which it apparently codes. Each of these categories individually comprises a preferred category of EST, and preferred constructs and resulting polypeptide can be prepared from those ESTs or the corresponding complete gene sequence.
  • sequences identified herein can be used in numerous ways as polynucleotide reagents.
  • the sequences can be used as diagnostic probes for the presence of a specific mRNA in a particular cell type.
  • these sequences can be used as diagnostic probes suitable for use in genetic linkage analysis (polymorphisms) .
  • the sequences can be used as probes for locating gene regions associated with genetic disease, as explained in more detail below.
  • the EST and complete gene sequences of the present invention are also valuable for chromosome identification. Each sequence is specifically targeted to and can hybridize with a particular location on an individual human chromosome. Moreover, there is a current need for identifying particular sites on the chromosome. Few chromosome marking reagents based on actual sequence data (repeat polymorphisms) are presently available for marking chromosomal location. The present invention constitutes a major expansion of available chromosome markers.
  • ESTs and their corresponding complete sequences can be mapped to chromosomes.
  • the mapping of ESTs and cDNAs to chromosomes according to the present invention is an important first step in correlating those sequences with genes associated with disease.
  • sequences can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp) from the ESTs. Computer analysis of the ESTs is used to rapidly select primers that do not span more than one exon in the genomic DNA, because primers spanning an exon boundary would complicate the amplification process. These primers are then used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the EST will yield an amplified fragment.
  • PCR mapping of somatic cell hybrids is a rapid procedure for assigning a particular EST to a particular chromosome. Three or more clones can be assigned per day using a single thermal cycler. Using the present invention with the same oligonucleotide primers, sublocalization can be achieved with panels of fragments from specific chromosomes or pools of large genomic clones in an analogous manner.
  • Other mapping strategies that can similarly be used to map an EST to its chromosome include in situ hybridization, prescreening with labeled flow-sorted chromosomes, and preselection by hybridization to construct chromosome specific cDNA libraries. Results of mapping ESTs to chromosomal segments are listed in Tables 3 and 4.
  • Fluorescence in situ hybridization (FISH) of a cDNA clone to a metaphase chromosomal spread can be used to provide a precise chromosomal location in one step.
  • This technique can be used with cDNA as short as 500 or 600 bases; however, clones larger than 2,000 bp have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection.
  • FISH requires use of the clone from which the EST was derived, and the longer the better. 2,000 bp is good, 4,000 is better, and more than 4,000 is probably not necessary to get good results a reasonable percentage of the time.
  • Verma et al., Human Chromosomes: a Manual of Basic Techniques, Pergamon Press, New York (1988).
  • Reagents for chromosome mapping can be used individually (to mark a single chromosome or a single site on that chromosome) or as panels of reagents (for marking multiple sites and/or multiple chromosomes) . Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping (see Tables 8 and 9) .
  • a cDNA precisely localized to a chromosomal region associated with the disease could be one of between 50 and 500 potential causative genes. (This assumes 1 megabase mapping resolution and one gene per 20 kb.)
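  • The arithmetic behind this estimate can be sketched as follows. The function name is hypothetical. Under the stated assumptions (1 megabase resolution, one gene per 20 kb) the count is 50; the upper figure of 500 would correspond to a region of roughly 10 megabases, which is an inference here and is not stated in the text.

```python
def candidate_gene_count(region_bp: int, bp_per_gene: int = 20_000) -> int:
    """Rough number of genes expected in a mapped chromosomal region.

    bp_per_gene defaults to the one-gene-per-20-kb density quoted in the text.
    """
    return region_bp // bp_per_gene


# 1 megabase mapping resolution, as assumed in the text
print(candidate_gene_count(1_000_000))

# Hypothetical coarser resolution of 10 megabases (an assumption of this sketch)
print(candidate_gene_count(10_000_000))
```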
  • Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that cDNA sequence. Ultimately, complete sequencing of genes from several individuals is required to confirm the presence of a mutation and to distinguish mutations from polymorphisms.
  • sequences of the invention can be used to control gene expression through triple helix formation or antisense DNA or RNA, both of which methods are based on binding of a polynucleotide sequence to DNA or RNA.
  • Polynucleotides suitable for use in these methods are usually 20 to 40 bases in length and are designed to be complementary to a region of the gene involved in transcription (triple helix - see Lee et al, Nucl. Acids Res. 6: 3073 (1979) ; Cooney et al, Science 241: 456 (1988); and Dervan et al. Science 251: 1360 (1991)) or to the mRNA itself (antisense - Okano, J.
  • the present invention is also a useful tool in gene therapy, which requires isolation of the disease-associated gene in question as a prerequisite to the insertion of a normal gene into an organism to correct a genetic defect.
  • the high specificity of the cDNA probes according to this invention holds promise of targeting such gene locations in a highly accurate manner.
  • sequences of the present invention are also useful for identification of individuals from minute biological samples.
  • the United States military, for example, is considering the use of restriction fragment length polymorphism (RFLP) for identification of its personnel.
  • an individual's genomic DNA is digested with one or more restriction enzymes, and probed on a Southern blot to yield unique bands for identifying personnel. This method does not suffer from the current limitations of "Dog Tags" which can be lost, switched, or stolen, making positive identification difficult.
  • the sequences of the present invention are useful as additional DNA markers for RFLP.
  • RFLP is a pattern based technique, which does not directly focus on the actual DNA sequence of the individual.
  • sequences of the present invention can be used to provide an alternative technique that determines the actual base-by-base DNA sequence of selected portions of an individual's genome. These sequences can be used to prepare PCR primers for amplifying and isolating such selected DNA.
  • Panels of corresponding DNA sequences from individuals can provide unique individual identifications, as each individual will have a unique set of such DNA sequences, due to allelic differences.
  • the sequences of the present invention can be used to particular advantage to obtain such identification sequences from individuals and from tissue, as explained in Examples 10 - 12.
  • the EST sequences from Example 1 and the complete sequences from Example 11 uniquely represent portions of the human genome. Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. It is estimated that allelic variation between individual humans occurs with a frequency of about once per each 500 bases.
  • Each of the ESTs or complete coding sequences comprising a part of the present invention can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals.
  • the noncoding sequences of Table 9 for example, could comfortably provide positive individual identification with a panel of perhaps 100 to 1,000 primers which each yield a noncoding amplified sequence of 100 bp. If predicted coding sequences, such as those from Table 6, are used, a more appropriate number of primers for positive individual identification would be 500-2,000.
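  • A rough sense of why panel size matters can be had from the figures quoted above. The sketch below multiplies panel size by amplicon length and divides by the quoted average of one allelic variation per 500 bases. The function name is hypothetical, and applying a single average rate to every amplicon is a simplifying assumption: the text itself notes that noncoding regions vary more than coding regions.

```python
def expected_variant_sites(n_amplicons: int,
                           amplicon_bp: int = 100,
                           bases_per_variant: int = 500) -> float:
    """Expected number of variable positions covered by a primer panel.

    Uses the average rate quoted in the text (about one allelic
    variation per 500 bases) for every amplicon.
    """
    return n_amplicons * amplicon_bp / bases_per_variant


for panel_size in (100, 1000):
    print(panel_size, expected_variant_sites(panel_size))
```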
  • Once a panel of reagents from ESTs or complete sequences of this invention is used to generate a unique ID database for an individual, those same reagents can later be used to identify tissue from that individual. Positive identification of that individual, living or dead, can be made from extremely small tissue samples.
  • DNA-based identification techniques are in forensic biology.
  • PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, semen, etc.
  • gene sequences are amplified at specific loci known to contain a large number of allelic variations, for example the DQα class II HLA gene (Erlich, H., PCR Technology, Freeman and Co. (1992)).
  • Once this specific area of the genome is amplified, it is digested with one or more restriction enzymes to yield an identifying set of bands on a Southern blot probed with DNA corresponding to the DQα class II HLA gene.
  • sequences of the present invention can be used to provide polynucleotide reagents specifically targeted to additional loci in the human genome, and can enhance the reliability of DNA-based forensic identifications. Those sequences targeted to noncoding regions (see, e.g.. Tables 8 and 9) are particularly appropriate. As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Reagents for obtaining such sequence information are within the scope of the present invention. Such reagents can comprise complete
  • reagents capable of identifying the source of a particular tissue. Such need arises, for example, in forensics when presented with tissue of unknown origin.
  • Appropriate reagents can comprise, for example, DNA probes or primers specific to particular tissue prepared from the ESTs or complete sequences of the present invention. Panels of such reagents can identify tissue by species and/or by organ type. In a similar fashion, these reagents can be used to screen tissue culture for contamination.
  • each EST corresponds not only to a coding region, but also to a polypeptide.
  • Once the coding sequence is known, or the gene which encodes the polypeptide is cloned, conventional techniques in molecular biology can be used to obtain the polypeptide.
  • the amino acid sequence encoded by the polynucleotide sequence can be synthesized using commercially available peptide synthesizers. This is particularly useful in producing small peptides and fragments of larger polypeptides. (Fragments are useful, for example, in generating antibodies against the native polypeptide.)
  • the DNA encoding the desired polypeptide can be inserted into a host organism and expressed.
  • the organism can be a bacterium, yeast, cell line, or multicellular plant or animal.
  • the literature is replete with examples of suitable host organisms and expression techniques.
  • naked polynucleotide DNA or mRNA
  • This methodology can be used to deliver the polypeptide to the animal, or to generate an immune response against a foreign polypeptide (Wolff, et al., Science 247:1465
  • the coding sequence, together with appropriate regulatory regions can be inserted into a vector, which is then used to transfect a cell.
  • the cell which may or may not be part of a larger organism
  • Antibodies generated against the polypeptide corresponding to a sequence of the present invention can be obtained by direct injection of the naked polypeptide into an animal (as above) or by administering the polypeptide to an animal, preferably a nonhuman. The antibody so obtained will then bind the polypeptide itself. In this manner, even a sequence encoding only a fragment of the polypeptide can be used to generate antibodies binding the whole native polypeptide. Such antibodies can then be used to isolate the polypeptide from tissue expressing that polypeptide.
  • a panel of such antibodies specific to a large number of polypeptides, can be used to identify and differentiate such tissue.
  • lambda ZAP libraries were converted en masse to pBluescript plasmids, transfected into E. coli XL1-Blue cells, and plated on X-gal/IPTG/ampicillin plates.
  • a total of 1058 clones were picked at random from three human brain cDNA libraries: fetal brain, two-year-old hippocampus, and two-year-old temporal cortex (Stratagene catalog #936206, 936205, 935, respectively; Stratagene, 11099 N. Torrey Pines Rd., La Jolla, CA 92037).
  • sequencing reactions were run on an Applied Biosystems, Inc. (Foster City, CA) 373A automated DNA sequencer.
  • Cycle sequencing was performed in a Perkin Elmer Thermal Cycler for 15 cycles of 95°C, 30 sec; 60°C, 1 sec; 70°C, 60 sec and 15 cycles of 95°C, 30 sec; 70°C, 60 sec with the Applied Biosystems, Inc. Taq Dye Primer Cycle Sequencing Core Kit protocol.
  • Some sequencing reactions were performed on an ABI robotic workstation (Cathcart, Nature 347: 310 (1990), hereby incorporated by reference).
  • the EST sequences from this Example 1 are identified as SEQ ID NOs 1-315.
  • ESTs including SEQ ID NOs 1-315 were analyzed as follows. Initially, the EST sequences were examined for similarities in the GenBank nucleic acid database (GenBank Release 65.0), Protein Information Resource Release 26.0 (PIR), and ProSite (MacPattern from the EMBL Data Library, Fuchs R., Comput. Appl. Biosci. 7: 105 (1990); Release 5.0 was used). BLAST was used to search GenBank and the PIR (both maintained by the National Center for Biotechnology Information). ESTs without exact GenBank matches were translated in all six reading frames and each translation was compared with the protein sequence database PIR and the ProSite protein motif database. Comparisons with the ProSite motif database were done by means of the program MacPattern from the EMBL Data Library.
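  • Translation in all six reading frames, as described above, can be sketched as follows. This is a minimal illustration using the standard genetic code, not the software used in the Example; the function names and the frame labels (+1 to +3, -1 to -3) are assumptions of this sketch.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translations(seq: str) -> dict:
    """Translate a DNA sequence in three forward and three reverse frames."""
    seq = seq.upper()
    frames = {}
    for label, strand in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{label}{offset + 1}"] = translate(strand[offset:])
    return frames


for frame, protein in six_frame_translations("ATGGCCTAA").items():
    print(frame, protein)
```

Each of the six translations would then be compared against a protein database; stop codons appear as `*` in the output.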
  • GenBank and PIR searches were conducted with the "basic local alignment search tool" programs for nucleotide (BLASTN) and peptide (BLASTX) comparisons (Altschul et al, J. Mol. Biol. 215: 403 (1990)). PIR searches were run on the National Center for Biotechnology Information BLAST network service.
  • the BLAST programs contain a very rapid database-searching algorithm that searches for local areas of similarity between two sequences and then extends the alignments on the basis of defined match and mismatch criteria. The algorithm does not consider potential gaps to improve the alignment, thus sacrificing some sensitivity for a 6-80 fold increase in speed over other database-searching programs such as FASTA (Pearson and Lipman, Proc. Natl. Acad. Sci. USA, 85: 2444 (1988)).
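  • The idea of finding a short exact match and extending it without gaps can be illustrated with the toy sketch below. It is not the BLAST implementation: the word size, the match and mismatch scores, and the drop-off cutoff that stops an extension are all illustrative assumptions, and the function names are hypothetical.

```python
def extend(query, subject, qi, si, step, match, mismatch, x_drop):
    """Extend an alignment without gaps from (qi, si) in direction `step`.

    Returns (best score gained, number of positions kept).
    """
    score = best = kept = n = 0
    while 0 <= qi < len(query) and 0 <= si < len(subject):
        score += match if query[qi] == subject[si] else mismatch
        n += 1
        if score > best:
            best, kept = score, n
        elif best - score > x_drop:
            break  # score has fallen too far below the best seen so far
        qi += step
        si += step
    return best, kept


def ungapped_hits(query, subject, word=4, match=1, mismatch=-2, x_drop=5):
    """Seed on exact words, extend both ways, return hits sorted by score.

    Each hit is (score, query_start, subject_start, length).
    """
    index = {}
    for i in range(len(subject) - word + 1):
        index.setdefault(subject[i:i + word], []).append(i)
    hits = set()
    for qi in range(len(query) - word + 1):
        for si in index.get(query[qi:qi + word], ()):
            right, rlen = extend(query, subject, qi + word, si + word, 1,
                                 match, mismatch, x_drop)
            left, llen = extend(query, subject, qi - 1, si - 1, -1,
                                match, mismatch, x_drop)
            score = word * match + left + right
            hits.add((score, qi - llen, si - llen, word + llen + rlen))
    return sorted(hits, reverse=True)


print(ungapped_hits("ACGTACGT", "TTACGTACGTTT")[0])
```

Because no gaps are ever introduced, a single insertion or deletion splits a similarity into separate hits, which is the loss of sensitivity the text refers to.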
  • ESTs matched previously sequenced human nuclear genes with more than 97% identity.
  • Four of these ESTs are from genes encoding enzymes involved in maintaining metabolic energy, including ADP/ATP translocase, aldolase C, hexokinase, and phosphoglycerate kinase.
  • Human homologs of genes for the bovine mitochondrial ATP synthase F0 B-subunit and porcine aconitase were also found (Table 2).
  • Brain-specific cDNAs included synaptophysin, glial fibrillary acidic protein (GFAP), and neurofilament light chain.
  • ESTs are from genes encoding proteins involved in signal transduction: 2',3'-cyclic nucleotide 3'-phosphodiesterase (2 ESTs), calmodulin, c-erbA-α-2, Gsα, and Na+/K+ ATPase α-subunit.
  • Other ESTs were matches to genes for ubiquitous structural proteins — actins, tubulins, and fodrin (non-erythroid spectrin) .
  • ESTs also document the presence in the hippocampus cDNA library of the ret proto-oncogene, the ras-related gene rhoB, and one of the chromosome 22 breakpoint cluster region transcripts.
  • ESTs are from genes known to be associated with genetic disorders (Online Mendelian Inheritance in Man). More than half of the human-matched ESTs from Example 1 have been mapped to chromosomes, indicating the bias of GenBank entries toward well-studied genes and proteins.
  • ESTs without significant GenBank matches were also compared to the ProSite database of recognized protein motifs. Not counting post-translational-modification signatures, fifty-four sequences contained motifs from the database. Some patterns, particularly the "leucine zipper", are found in scores or hundreds of proteins that do not share the functional property implied by the presence of the motif.
  • EST00299 SEQ ID NO:180
  • EST00283 SEQ ID NO:271
  • EST00248 SEQ ID NO:102
  • Similarities with an S. cerevisiae RNA polymerase subunit and Torpedo electromotor neuron-associated protein were also observed.
  • Two ESTs may represent new members of known human gene families: EST00270 matched the three β-tubulin genes with 88-91% identity and EST00271 (SEQ ID NO:248) matched α-actinin with 85% identity at the nucleotide level.
  • Enhancer of split protein interacts with a membrane protein that is the product of the Notch gene to convert a developmental signal into an altered pattern of gene expression (id. J. Mol. Biol. 215: 403 (1990)).
  • EST00256 (SEQ ID NO:188) matches near the 5' end of the Enhancer of split coding sequence, away from the mammalian G protein β subunit- and yeast cdc4-like elements (Hartley et al., Cell 55: 785 (1988); Klambt et al., EMBO J. 8: 203 (1989)).
  • Part of the EST00259 (SEQ ID NO:227) match to Notch is in the cdc10/SWI6 region, which is similar to three cell-cycle control genes in yeast and is tightly conserved in the Xenopus Notch homolog, Xotch. In Drosophila, Enhancer of split is absolutely required for formation of epidermal tissue. Notch contains several epidermal growth factor-like repeats and appears to play a general role in cell-cell communication during development (Banerjee and Zipursky, Neuron 4:177 (1990)). Seven genes were represented by more than one EST.
  • the program evaluates the likelihood that a given GG or CC dinucleotide represents a former exon-intron boundary. Specifically, every input strand is processed by the INTRON program twice, first evaluating the sense mRNA strand, and then processing the complementary or anti-sense strand. The program evaluates each sequence by finding all GG or CC pairs (possible former splice sites) , searching for STOP codons in all three reading frames, and analyzing the GG or CC pairs surrounded by stop codons. All regions of the EST that are unlikely to contain splice junctions based on CC content, GG content, and stop codon frequency are then marked by the program in uppercase.
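  • The first two steps described above (locating GG or CC dinucleotides and locating stop codons in the three reading frames of a strand) can be sketched as follows. This is not the INTRON program itself, whose scoring is not given here; the function names are hypothetical, and the sketch handles a single strand, so the complementary strand would be processed by a second call on its reverse complement.

```python
STOPS = {"TAA", "TAG", "TGA"}


def dinucleotide_sites(seq: str, pairs=("GG", "CC")) -> list:
    """0-based positions of GG or CC dinucleotides (possible former splice sites)."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] in pairs]


def stop_codons_by_frame(seq: str) -> dict:
    """0-based start positions of stop codons in each of the three frames."""
    seq = seq.upper()
    return {
        frame: [i for i in range(frame, len(seq) - 2, 3) if seq[i:i + 3] in STOPS]
        for frame in range(3)
    }


est = "ATGGCCTAAGG"  # toy sequence, not one of the ESTs
print(dinucleotide_sites(est))
print(stop_codons_by_frame(est))
```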
  • PCR primers from known sequences is well known to those with skill in the art. For a review of PCR technology see Erlich, H.A. , PCR Technology; Principles and Applications for DNA Amplification. 1992; W.H. Freeman and Co., New York. ESTs were examined for the presence of stop codons in each reading frame and for consensus splice junctions. The presence of stop codons and absence of splice junction sequences are more characteristic of 3' untranslated sequences than of introns. The untranslated sequences are unique to a given gene; thus, primers from these regions are less likely to prime other members of a gene family or pseudogenes.
  • the primers were used in polymerase chain reactions (PCR) to amplify templates from total human genomic DNA.
  • PCR conditions were as follows: 60 ng of genomic DNA was used as a template for PCR with 80 ng of each oligonucleotide primer, 0.6 unit of Taq polymerase, and 1 µCi of a 32P-labeled deoxycytidine triphosphate.
  • the PCR was performed in a microplate thermocycler (Techne) under the following conditions: 30 cycles of 94°C, 1.4 min; 55°C, 2 min; and 72°C, 2 min; with a final extension at 72°C for 10 min.
  • the amplified products were analyzed on a 6% polyacrylamide sequencing gel and visualized by autoradiography.
  • Somatic Cell Hybrid Mapping Panel Number 1 (NIGMS, Camden, NJ) .
  • PCR was used to screen a series of somatic cell hybrid cell lines containing defined sets of human chromosomes for the presence of a given EST.
  • DNA was isolated from the somatic hybrids and used as starting templates for PCR reactions using the primer pairs from EST sequences selected above. Only those somatic cell hybrids with chromosomes containing the human gene corresponding to the EST will yield an amplified fragment.
  • ESTs were assigned to a chromosome by analysis of the segregation pattern of PCR products from hybrid DNA templates. For a review of techniques and analysis of results from somatic cell gene mapping experiments, see Ledbetter et al., Genomics 6:475-481 (1990). The single human chromosome present in all cell hybrids that give rise to an amplified fragment represents the chromosome containing that EST.
  • The procedure of Example 3 is repeated for all of the ESTs from Example 1 not previously mapped to human chromosomes. Data are generated corresponding to the data in Table 3 for all of the unmapped ESTs. As previously mentioned, virtually all of the ESTs will map to a unique chromosomal location. The inability of any ESTs to localize to a unique location will be readily ascertainable during the mapping process.
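  • The segregation analysis can be sketched as a set operation: keep the chromosomes present in every hybrid that yields a PCR product, then discard any chromosome that is also present in a hybrid that yields none. The panel contents and hybrid names below are invented for illustration, and the function name is hypothetical; the second filtering step is an assumption of this sketch that goes slightly beyond the rule stated in the text.

```python
def assign_chromosome(panel: dict, pcr_positive: dict) -> set:
    """Return chromosomes concordant with the PCR results.

    panel        maps hybrid name -> set of human chromosomes it retains
    pcr_positive maps hybrid name -> True if an amplified fragment was seen
    """
    positives = [set(panel[h]) for h, hit in pcr_positive.items() if hit]
    negatives = [set(panel[h]) for h, hit in pcr_positive.items() if not hit]
    if not positives:
        return set()
    candidates = set.intersection(*positives)
    for chromosomes in negatives:
        candidates -= chromosomes
    return candidates


# Hypothetical four-hybrid panel (real panels are larger)
panel = {
    "H1": {"1", "7", "19"},
    "H2": {"7", "X"},
    "H3": {"1", "19"},
    "H4": {"7", "12"},
}
results = {"H1": True, "H2": True, "H3": False, "H4": True}
print(assign_chromosome(panel, results))
```

A result containing more than one chromosome, or none, would indicate that the panel cannot resolve the assignment or that the data are discordant.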
  • This technique is used to map an EST to a particular location on a given chromosome.
  • Cell cultures, tissue, or whole blood can be used to obtain chromosomes.
  • 0.5 ml of whole blood is added to RPMI 1640 and incubated 96 hours in a 5% CO2/37°C incubator.
  • 0.05 µg/ml colcemid is added to the culture one hour before harvest.
  • Cells are collected and washed in PBS.
  • the suspension is incubated with a hypotonic solution of KCl added dropwise to reach a final volume of 5 ml.
  • the cells are spun down and fixed by resuspending the cells in methanol and glacial acetic acid (3:1).
  • the cell suspension is dropped onto glass slides and dried. The slides are then treated with RNase A and washed then dehydrated in a series of increasing concentrations of ethanol.
  • the EST to be localized is nick-translated using fluorescently labeled nucleotide (Korenberg, Jr., et al., Cell 53(3) :391-400 (1988)). Following nick translation, unincorporated label is removed by spin dialysis through Sepharose. The probe is further extracted with phenol-chloroform to remove additional protein. The chromosomes are denatured in formamide using techniques known in the art and the denatured probe added to the slides. Following hybridization, the cells are washed. The slides are studied under a fluorescent microscope. In addition, the chromosomes can be stained for G-banding or Q-banding using techniques known in the art.
  • the resulting metaphase chromosomes have fluorescent tags localized to those regions of the chromosome that are homologous to the EST. Thus, a particular EST is localized to a particular region on a given chromosome.
  • Table 4 Precise Chromosomal Localization of ESTs
  • ESTs that match human sequences in GenBank are excellent tools for the analysis of the accuracy of double-strand automated DNA sequencing.
  • EST/GenBank matches from a number of clones were examined for the number of nucleotide mismatches and gaps required to achieve optimal alignment by the Genetics Computer Group (GCG) program BESTFIT (Devereux et al, Nucleic Acids Research 12: 387 (1984)).
  • the number of mismatches, insertions and deletions was counted for each hundred bases of the sequence (Table 5) .
  • the sequence quality was best closest to the primer and decreased rapidly after about 400 bases.
  • the number of deletions and insertions relative to the GenBank reference sequence increased five- to ten-fold beyond 400 bases, while the number of mismatches doubled.
  • the average accuracy rate for individual double-stranded sequencing runs was 97.7% to 400 bases. TABLE 5.
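  • The per-hundred-base tally described above can be sketched as follows, given an EST and its GenBank reference already aligned with "-" as the gap character. The function names are hypothetical, and counting windows by position along the EST (rather than along the reference) is an assumption of this sketch. The toy example uses a window of 4 so that the counts are easy to follow.

```python
def errors_per_window(est_aln: str, ref_aln: str, window: int = 100) -> list:
    """Count mismatches, insertions, and deletions per window of EST bases."""
    if len(est_aln) != len(ref_aln):
        raise ValueError("aligned sequences must have equal length")
    rows = []
    est_pos = 0  # number of EST bases consumed so far
    for e, r in zip(est_aln.upper(), ref_aln.upper()):
        idx = est_pos // window
        while len(rows) <= idx:
            rows.append({"mismatch": 0, "insertion": 0, "deletion": 0})
        if e == "-":
            rows[idx]["deletion"] += 1       # base missing from the EST
        else:
            if r == "-":
                rows[idx]["insertion"] += 1  # extra base in the EST
            elif e != r:
                rows[idx]["mismatch"] += 1
            est_pos += 1
    return rows


def accuracy(row: dict, window: int = 100) -> float:
    """Fraction of positions in a window without a counted error."""
    return 1.0 - sum(row.values()) / window


rows = errors_per_window("ACGTA-GT", "ACCT-AGT", window=4)
for i, row in enumerate(rows):
    print(i, row, accuracy(row, window=4))
```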
  • the ESTs of the present invention were statistically evaluated using the coding-region prediction program CRM via the GRAIL server (Uberbacher, E. & Mural, R. Proc. Natl. Acad. Sci. USA, 88: 11261-5 (1991)).
  • the CRM program uses a neural network to combine results from several different coding regions by looking at different 6 bp sequences found in coding exons and in introns.
  • the program additionally conducts reading frame searches and assesses randomness at the third position of codons. This protocol categorizes sequences as having an excellent, good, marginal, or poor probability of containing coding regions. The results are reported in Tables 6-9.
  • By matching new human ESTs to known sequences from other species, the apparent function of the gene corresponding to the EST can be ascertained.
  • the data generated in Example 2 have been used to categorize 28 of the ESTs of the present invention, and their corresponding genes, into predicted functional groups. (These 28 are ESTs with database matches to sequences from other species for which a function was known.) Two different grouping schemes have been used.
  • the first scheme separates the sequences into three broad categories: metabolic; regulatory; and structural. These groupings are set out in Table 10.
  • the second grouping scheme separates the sequences into 13 specific categories: cell surface proteins; developmental control; energy metabolism; kinases and phosphatases; oncogenes; other metabolism-related polypeptides; peptidases and peptidase inhibitors; receptors; structural and cytoskeletal; signal transduction; transporters; transcription, translation, and subcellular localization; and transcription factors.
  • These groupings are set out in Table 11.
  • Lysosomal membrane glycoprotein 1 (LAMP-1)
  • MARCKS myristoylated alanine-rich protein kinase smg p25A GDP dissociation inhibitor
  • RNA polymerase II 6th subunit RPO26
  • CS Cell Surface
  • DC Developmental Control
  • EM Energy Metabolism
  • KP Kinases and Phosphatases
  • OG Oncogenes
  • PI Peptidases and Peptidase Inhibitors
  • RT Receptors
  • SC Structural and Cytoskeletal
  • ST Signal Transduction
  • TT Transcription, Translation
  • TX Transcription Factors.
  • EXAMPLE 9 cDNA Libraries Generated From Specific Genomic DNA by Exon Expression & Amplification
  • Exon amplification is used to express potential exons from genomic DNA in a recombinant vector that contains some of the signals necessary for splicing. If an exon is present in the proper orientation in the vector, that exon will be spliced in a mammalian cell and will become part of the mRNA of that cell.
  • the exon splice-product can be purified from other mRNA in the cell by conversion of the mRNA to cDNA and selective amplification of the recombinant splice-product cDNAs.
  • Cosmid DNA from human chromosome 19q13.3 is digested with BamHI or BamHI/BglII restriction enzymes.
  • RNA transcripts are generated using the SV40 early promoter and a polyadenylation signal derived from SV40 both present in the expression vector.
  • If a fragment of genomic DNA contains an entire exon with flanking intron sequence in the sense orientation, the exon should be retained in the mature poly(A)+ cytoplasmic RNA. Therefore, the mRNA is used as template for cDNA synthesis using reverse transcriptase and vector-priming.
  • the cDNAs are amplified by vector-priming using PCR.
  • a fraction of this first PCR product is reamplified using internal vector-primers containing terminal cloning sites. These products are end-repaired with T4 DNA polymerase, digested with the appropriate restriction enzymes, gel purified and cloned into pBluescript vectors.
  • the constructs are transfected into XL1-Blue competent cells and plated on LB/X-gal/IPTG/ampicillin plates. When multiple cosmids or YAC clones are used as the source DNA, a pool of specific expressed exons is obtained as a cDNA library.
  • Computational analyses can be applied to genomic DNA sequences to predict protein coding regions.
  • the coding region prediction program CRM (E. Uberbacher and R. Mural, Proc. Natl. Acad. Sci. USA 88:11261-5 (1991)) finds open reading frames and classifies them according to their probability of being coding regions. These regions are subsequently examined using the GM program (C. Fields and C. Soderlund, Comp. Applic. Biosci. 6: 263, 1990), which predicts intron-exon structure.
  • PCR primers are then designed to amplify the predicted exons and used to test human cDNA libraries (for example, fetal brain or placental libraries) for the presence of these putative exons using a PCR assay.
  • EST clones were digested with the restriction enzymes SalI and KpnI or PstI and BamHI (for deletions from the Forward primer and Reverse primer ends of the insert, respectively).
  • the KpnI and PstI enzymes leave 3' sticky ends following digestion, which Exonuclease III is unable to bind. This results in unidirectional deletions into the cDNA insert leaving the vector sequence undisturbed.
  • aliquots of the reaction were removed at defined time intervals and the reaction was stopped to prevent further deletion. S1 nuclease and Klenow DNA polymerase were added to create blunt ended fragments suitable for ligation.
  • the reading frame, orientation, and coding regions are determined by computer techniques.
  • the complete coding region is considered to be the largest open reading frame from a methionine to a stop codon.
  • the CRM program on the GRAIL server is used as explained in Example 7 to determine probable coding regions. This information is supplemented by location of start and stop codons.
  • the results of the CRM analysis are validated by comparison of the cDNA sequence to known sequences using database matching, in accordance with Example 2. If a match of 50% (or even less) is found in any particular reading frame and orientation, this serves to verify corresponding CRM results. Alternatively, database matches can be used to determine reading frame and orientation without use of the CRM program.
  • the probable orientation is already known.
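The rule stated above (the complete coding region is taken to be the largest open reading frame running from a methionine codon to a stop codon) can be sketched in a few lines. This is only an illustration of that rule, not the CRM program or any method from the patent: the function names are hypothetical, the input is assumed to be a DNA string over A/C/G/T, and all six reading frames are scanned because the orientation may not be known in advance.

```python
# Illustrative sketch; names are hypothetical and not taken from the patent.
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def longest_orf(seq):
    """Find the longest ATG-to-stop open reading frame in all six frames.

    Returns (strand, frame, start, end, orf) or None if no complete ORF
    exists. start/end are 0-based offsets on the scanned strand; end is
    exclusive and the returned ORF includes its stop codon.
    """
    best = None
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first methionine after the previous stop
                elif start is not None and codon in STOP_CODONS:
                    orf = s[start:i + 3]
                    if best is None or len(orf) > len(best[4]):
                        best = (strand, frame, start, i + 3, orf)
                    start = None  # look for the next ORF in this frame
    return best


if __name__ == "__main__":
    print(longest_orf("CCATGAAATTTGGGTAACC"))
```

A partial cDNA such as an EST may lack the true start or stop codon, so a result of `None`, or a short ORF, does not by itself mean the clone is non-coding.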
  • the EST sequences and the corresponding cDNA sequences and genomic sequences may be used, in accordance with the present invention, to prepare PCR primers for a variety of applications.
  • the PCR primers are preferably at least 15 bases, and more preferably at least 18 bases in length.
  • the procedure of Example 3 is repeated using the desired EST, or using the corresponding cDNA or genomic DNA sequence from Example 11. It is preferred that the primer pairs have approximately the same G/C ratio, so that melting temperatures are approximately the same.
  • introns are of no concern; however, when screening genomic DNA, primers should be selected to avoid reading across introns, which usually are too large to amplify.
  • the PCR primers and amplified DNA of this Example find use in the Examples that follow.
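The primer guidance above (at least 15 and preferably at least 18 bases, with the two primers of a pair having a similar G/C ratio so that their melting temperatures are close) can be checked mechanically. The sketch below is an assumption-laden illustration: the Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees Celsius, is a common rough estimate for short oligonucleotides and is not taken from the patent, and the 4 degree tolerance and all function names are hypothetical.

```python
# Illustrative sketch; thresholds and names are assumptions, not patent text.

def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)


def wallace_tm(primer):
    """Rough melting temperature (deg C) by the Wallace rule."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def check_primer_pair(forward, reverse, min_len=18, max_tm_gap=4):
    """Return a list of human-readable problems; empty means the pair passes."""
    problems = []
    for name, primer in (("forward", forward), ("reverse", reverse)):
        if len(primer) < min_len:
            problems.append(f"{name} primer is shorter than {min_len} bases")
    gap = abs(wallace_tm(forward) - wallace_tm(reverse))
    if gap > max_tm_gap:
        problems.append(f"estimated Tm values differ by {gap} deg C")
    return problems


if __name__ == "__main__":
    print(check_primer_pair("ATGCATGCATGCATGCAT", "GGGCCCGGGCCCGGGCCC"))
```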
  • DNA samples are isolated from forensic specimens of, for example, hair, semen, blood or skin cells by conventional methods.
  • a panel of PCR primers derived from a number of the sequences of Example 1, 9, 10 and/or 11 is then utilized in accordance with Example 10 to obtain DNA of approximately 100-200 bases in length from the forensic specimen.
  • Corresponding sequences are obtained from a suspect.
  • Each of these identification DNAs is then sequenced, and a simple database comparison determines the differences, if any, between the sequences from the suspect and those from the sample.
  • Statistically significant differences between the suspect's DNA sequences and those from the sample conclusively prove a lack of identity. This lack of identity can be proven, for example, with only one sequence. Identity, on the other hand, should be demonstrated with a large number of sequences, all matching.
  • a minimum of 50 statistically identical sequences of 100 bases in length are used to prove identity between the suspect and the sample.
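The decision logic described above is asymmetric: a difference at a single sequence is enough to exclude a suspect, whereas identity is only asserted when many sequences all match. A minimal sketch of that logic follows. It simplifies "statistically significant difference" to an exact string mismatch and so ignores sequencing error; the function name, the dictionary layout, and the returned labels are hypothetical, and the defaults of 50 sequences of 100 bases are taken from the text above.

```python
# Illustrative sketch; exact-match comparison is a simplifying assumption.

def compare_panels(sample, suspect, min_loci=50, min_len=100):
    """Compare two {locus_name: sequence} panels.

    Returns ("excluded", mismatching_loci) if any shared locus differs,
    ("match", loci) if enough sufficiently long loci all agree, and
    ("inconclusive", loci) otherwise.
    """
    shared = sorted(set(sample) & set(suspect))
    mismatched = [k for k in shared
                  if sample[k].upper() != suspect[k].upper()]
    if mismatched:
        return "excluded", mismatched
    usable = [k for k in shared if len(sample[k]) >= min_len]
    if len(usable) >= min_loci:
        return "match", usable
    return "inconclusive", usable


if __name__ == "__main__":
    panel = {f"EST{i}": "ACGT" * 25 for i in range(50)}
    print(compare_panels(panel, dict(panel))[0])
```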
  • primers are prepared from a large number of sequences from Examples 1, 9, 10 and/or 11. Preferably, 20 to 50 different primers are used. These primers are used to obtain a corresponding number of PCR-generated DNA segments from the individual in question in accordance with Example 13. Each of these DNA segments is sequenced, using the methods set forth in Example 1. The database of sequences generated through this procedure uniquely identifies the individual from whom the sequences were obtained. The same panel of primers may then be used at any later time to absolutely correlate tissue or other biological specimen with that individual.
  • Example 15 The procedure of Example 15 is repeated to obtain a panel of from 10 to 2000 amplified sequences from an individual and a specimen.
  • This PCR-generated DNA is then digested with one or a combination of, preferably, four base specific restriction enzymes.
  • Such enzymes are commercially available and known to those of skill in the art.
  • the resultant gene fragments are size separated in multiple duplicate wells on an agarose gel and transferred to nitrocellulose using Southern blotting techniques well known to those with skill in the art.
  • Southern blotting see Davis et al. (Basic Methods in Molecular Biology. 1986, Elsevier Press, pp 62-65).
  • 1, and/or 11, or fragments thereof of at least 15 bases are radioactively or colorimetrically labeled using end-labeled oligonucleotides derived from the ESTs, nick translated sequences or the like using methods known in the art and hybridized to the Southern blot using techniques known in the art (Davis et al., supra) .
  • at least 5 to 10 of these labeled probes are used, and more preferably at least about 20 or 30 are used to provide a unique pattern.
  • the resultant bands appearing from the hybridization of a large sample of ESTs will be a unique identifier. Since the restriction enzyme cleavage will be different for every individual, the band pattern on the Southern blot will also be unique. Increasing the number of EST probes will provide a statistically higher level of confidence in the identification since there will be an increased number of sets of bands used for identification.
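The band-pattern argument above rests on the fact that a sequence difference which creates or destroys a restriction site changes the fragment lengths seen on the gel. The sketch below simulates this for a linear fragment. The enzyme site and cut offset are passed in as parameters; the example uses the four-base recognition sequence GGCC cut after the second base (the HaeIII specificity), and the two toy "alleles" are invented for illustration only.

```python
# Illustrative sketch; the toy sequences are invented, not patent data.

def digest(seq, site, cut_offset):
    """Cut a linear DNA string at every occurrence of a recognition site.

    cut_offset is the number of bases into the site at which the top
    strand is cleaved. Returns the fragments in 5'-to-3' order.
    """
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def band_pattern(seq, site, cut_offset):
    """Fragment lengths, largest first, as they would order on a gel."""
    return sorted((len(f) for f in digest(seq, site, cut_offset)),
                  reverse=True)


if __name__ == "__main__":
    allele_a = "AAAAGGCCTTTTTTGGCCAA"
    allele_b = "AAAAGGCCTTTTTTGGTCAA"  # one base change removes a site
    print(band_pattern(allele_a, "GGCC", 2))
    print(band_pattern(allele_b, "GGCC", 2))
```

On a Southern blot only fragments that hybridize to the labeled probe are visible, so the observed pattern would be a subset of the lengths computed here.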
  • Another technique for identifying individuals using the sequences disclosed herein utilizes a dot blot hybridization technique.
  • Genomic DNA is isolated from nuclei of the subject to be identified. Oligonucleotide probes of approximately 30 bp in length are synthesized that correspond to sequences from the ESTs. The probes are used to hybridize to the genomic DNA under conditions known to those in the art. The oligonucleotides are end labelled with 32P using polynucleotide kinase (Pharmacia). Dot blots are created by spotting about 50 ng cDNA of preferably at least 10 sequences corresponding to a variety of the Sequence ID NOs provided in Table 7 onto nitrocellulose or the like using a vacuum dot blot manifold
  • EST sequences and the corresponding complete cDNA sequences can be used to create a unique fingerprint for an individual.
  • pools of EST sequences can be used in forensics, paternity suits or the like to differentiate one individual from another.
  • Entire EST sequences can be used; similarly, oligonucleotides can be prepared from EST sequences.
  • 20-mer oligonucleotides are prepared from 200 EST sequences using commercially available oligonucleotide services such as Oligos Etc., Wilsonville, OR.
  • Patient cell samples are processed for DNA using techniques well known to those with skill in the art.
  • the nucleic acid is digested with restriction enzymes EcoRI and XbaI. Following digestion, samples are applied to wells for electrophoresis.
  • the procedure may be modified to accommodate polyacrylamide electrophoresis; however, in this example, samples containing 5 µg of DNA are loaded into wells and separated on 0.8% agarose gels. The gels are transferred using Southern blotting techniques onto nitrocellulose. 10 ng of each of the oligos are pooled and end-labeled with 32P. The nitrocellulose is prehybridized with blocking solution and hybridized with the labeled probes. Following hybridization and washing, the nitrocellulose filter is exposed to X-Omat AR X-ray film. The resulting hybridization pattern will be unique for each individual.
  • This example illustrates an approach useful for the association of EST sequences with particular phenotypic characteristics.
  • a particular EST is used as a test probe to associate that EST with a particular phenotypic characteristic.
  • ESTs from patients with these diseases are isolated and expanded in culture. PCR primers from the EST sequences are used to screen genomic DNA and RNA or cDNA from the patients. ESTs that are not amplified in the patients can be positively associated with a particular disease by further analysis.
  • Angelman's disease is characterized by deletions on the long arm of chromosome 15 (15q11q13) (Williams et al. Am. J. Med. Genet. 32:339-345 (1989) hereby incorporated by reference).
  • the symptoms of the disease include developmental delay, seizures, inappropriate laughter and ataxic movements. These symptoms suggest that the disorder is a neurologic deficiency.
  • This prophetic example illustrates how ESTs, preferably obtained from a cDNA library from human brain, may be used in identifying the defective gene or genes associated with Angelman's Disease.
  • EST sequences may generally be used for identifying gene sequences associated with an inherited disease that is mapped to a chromosome location.
  • ESTs are screened using techniques described in Example 3 and Example 5 to identify those ESTs that localize to the long arm of chromosome 15 and preferably localize to chromosome 15 bands 15q11q13 from normal patients.
  • ESTs that bind to the long arm of chromosome 15 are hybridized to chromosome 15 from AD patients. These studies are preferably performed using either fluorescence in situ hybridization or using somatic cell hybrids that contain fragments from the long arm of chromosome 15 from AD patients.
  • Those chromosome 15-specific ESTs that do not map to chromosome 15 from AD patients are useful as markers for Angelman's Disease and can be incorporated into diagnostics for genetic screening.
  • These ESTs are associated with chromosome deletions present in Angelman's disease. Identification of the gene associated with these AD negative ESTs and an analysis of the polypeptides encoded by the genes from normal patients is essential for providing gene or other therapies for AD patients.
  • RFLP Restriction fragment length polymorphism
  • cDNA libraries are prepared from the somatic cell hybrids from AD patients. Libraries are prepared using Lambda Zap II Library Kits (Stratagene, La Jolla, California) or other commercially available library kits. The ESTs of interest are used as probes to identify those bacterial colonies carrying genes corresponding to the EST probes.
  • Positive clones are sequenced and the sequences are compared to homologous gene sequences derived from normal patients. Alterations, including deletions and substitutions, within gene sequences associated with bands 15q11q13 are thus positively identified and associated with AD. Wagstaff et al. were able to identify deletions and substitutions in sequences encoding the GABA A receptor protein subunit from patients with Angelman's disease (Am. J. Hum. Genet. 49:330-337, (1991)). It is likely that other genes will additionally be associated with the disease.
  • Antisense RNA molecules are known to be useful for regulating translation within the cell. Antisense RNA molecules can be produced from EST sequences or from the corresponding gene sequences. These antisense molecules can be used as diagnostic probes to determine whether or not a particular gene is expressed in a cell. Similarly, the antisense molecules can be used as a therapeutic to regulate gene expression once the EST is associated with a particular disease (see Example 20) .
  • the antisense molecules are obtained from a nucleotide sequence by reversing the orientation of the coding region with regard to the promoter.
  • the antisense RNA is complementary to the corresponding mRNA.
  • the antisense sequences can contain modified sugar phosphate backbones to increase stability and make them less sensitive to RNase activity. Examples of the modifications are described by Rossi et al., Pharmacol. Ther. 50(2) :245-254, (1991) .
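The relationship stated above (the antisense RNA is complementary to the corresponding mRNA, and is obtained by reversing the orientation of the coding region) amounts to taking the reverse complement of the coding strand and writing it as RNA. The sketch below shows that relationship for an unmodified sequence over A/C/G/T; the function names are hypothetical, and backbone modifications are outside its scope.

```python
# Illustrative sketch; names are hypothetical and not taken from the patent.
_DNA_TO_COMPLEMENTARY_RNA = str.maketrans("ACGT", "UGCA")
_RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def mrna(coding_dna):
    """mRNA has the same sequence as the coding strand, with U for T."""
    return coding_dna.upper().replace("T", "U")


def antisense_rna(coding_dna):
    """Antisense RNA, written 5' to 3': reverse complement of the coding strand."""
    return coding_dna.upper().translate(_DNA_TO_COMPLEMENTARY_RNA)[::-1]


def is_complementary(rna_a, rna_b):
    """True if the two RNAs pair base-for-base in antiparallel orientation."""
    return len(rna_a) == len(rna_b) and all(
        _RNA_PAIRS.get(x) == y for x, y in zip(rna_a, reversed(rna_b)))


if __name__ == "__main__":
    coding = "ATGGCC"
    print(mrna(coding), antisense_rna(coding))
```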
  • Antisense molecules are introduced into cells that express the gene corresponding to the EST of interest in culture.
  • the polypeptide encoded by the gene is first identified, so that the effectiveness of antisense inhibition on translation can be monitored using techniques that include but are not limited to antibody-mediated tests such as RIAs and ELISA, functional assays, or radiolabelling.
  • the antisense molecule is introduced into the cells by diffusion or by transfection procedures known in the art.
  • the molecules are introduced onto cell samples at a number of different concentrations, preferably between 1×10⁻¹⁰ M and 1×10⁻⁴ M. Once the minimum concentration that can adequately control translation is identified, the optimized dose is translated into a dosage suitable for use in vivo.
  • an inhibiting concentration in culture of 1×10⁻⁷ translates into a dose of approximately 0.6 mg/kg bodyweight.
  • levels of oligonucleotide approaching 100 mg/kg bodyweight or higher may be possible after testing the toxicity of the oligonucleotide in laboratory animals.
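The text does not say how a culture concentration of 1×10⁻⁷ (taken here to be molar) becomes roughly 0.6 mg/kg. One set of assumptions that reproduces the figure is an oligonucleotide of about 6000 g/mol (roughly an 18-mer at about 330 g/mol per nucleotide) distributed in about 1 litre of fluid per kilogram of bodyweight. Both numbers are assumptions chosen for this sketch, not values from the patent, and the function name is hypothetical.

```python
# Illustrative arithmetic only; molecular weight and distribution volume
# are assumed values, not figures stated in the patent.

def culture_conc_to_dose_mg_per_kg(molar_conc,
                                   mol_weight_g_per_mol=6000.0,
                                   dist_volume_l_per_kg=1.0):
    """Convert a molar concentration into an approximate mg/kg dose."""
    mg_per_litre = molar_conc * mol_weight_g_per_mol * 1000.0  # g -> mg
    return mg_per_litre * dist_volume_l_per_kg


if __name__ == "__main__":
    print(round(culture_conc_to_dose_mg_per_kg(1e-7), 3))
```

Changing either assumed parameter scales the result linearly, which is why an actual dose would have to be established experimentally, as the toxicity testing mentioned above implies.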
  • the antisense can be introduced into the body as a bare or naked oligonucleotide, oligonucleotide encapsulated in lipid, oligonucleotide sequence encapsidated by viral protein, or as oligonucleotide contained in an expression vector such as those described in Example 23.
  • the antisense oligonucleotide is preferably introduced into the vertebrate by injection. It is additionally contemplated that cells from the vertebrate are removed, treated with the antisense oligonucleotide, and reintroduced into the vertebrate.
  • the antisense oligonucleotide sequence is incorporated into a ribozyme sequence to enable the antisense to bind and cleave its target.
  • ribozyme and antisense oligonucleotides see Rossi et al.
  • Triple helix oligonucleotides are used to inhibit transcription from a genome. They are particularly useful for studying alterations in cell activity as it is associated with a particular gene.
  • the EST sequences or complete sequences of the present invention or, more preferably, a portion of those sequences, can be used to inhibit gene expression in individuals having diseases associated with a particular gene.
  • a portion of the EST or corresponding gene sequence can be used to study the effect of inhibiting transcription of a particular gene within a cell.
  • homopurine sequences were considered the most useful.
  • homopyrimidine sequences can also inhibit gene expression. Thus, both types of sequences from either the EST or from the gene corresponding to the EST are contemplated within the scope of this invention.
  • Homopyrimidine oligonucleotides bind to the major groove at homopurine:homopyrimidine sequences.
  • 10-mer to 20-mer homopyrimidine sequences from the ESTs can be used to inhibit expression from homopurine sequences.
  • SEQ ID NOs such as 282 and 240 contain homopyrimidine 15-mers.
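Candidate triple-helix oligonucleotides of the kind described above can be located by scanning a sequence for uninterrupted runs of pyrimidines (C and T). The sketch below uses a regular expression for that scan. The function name and the test sequence are invented for illustration; the sequence is not one of the SEQ ID NOs, and the default minimum of 10 bases follows the 10-mer lower bound mentioned above.

```python
# Illustrative sketch; the example sequence is invented, not patent data.
import re


def homopyrimidine_runs(seq, min_len=10):
    """Return (start, run) for each maximal C/T-only run of at least min_len bases.

    start is a 0-based offset into seq.
    """
    pattern = re.compile(r"[CT]{%d,}" % min_len)
    return [(m.start(), m.group()) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    print(homopyrimidine_runs("GACTCTTCCTCTTCCTCAG"))
```

The same function finds homopurine runs if the character class is changed to `[AG]`, or equivalently if it is applied to the reverse complement of the sequence.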
  • the natural (beta) anomers of the oligonucleotide units can be replaced with alpha anomers to render the oligonucleotide more resistant to nucleases.
  • an intercalating agent such as ethidium bromide, or the like, can be attached to the 3' end of the alpha oligonucleotide to stabilize the triple helix.
  • the oligonucleotides may be prepared on an oligonucleotide synthesizer or they may be purchased commercially from a company specializing in custom oligonucleotide synthesis.
  • the sequences are introduced into cells in culture using techniques known in the art that include but are not limited to calcium phosphate precipitation, DEAE-Dextran, electroporation, liposome-mediated transfection or native uptake. Treated cells are monitored for altered cell function.
  • These cell functions are predicted based upon the homologies of the gene, corresponding to the EST from which the oligonucleotide was derived, with known gene sequences that have been associated with a particular function.
  • the cell functions can also be predicted based on the presence of abnormal physiologies within cells derived from individuals with a particular inherited disease, particularly when the EST is associated with the disease using techniques described in Example 20.
  • a gene sequence of the present invention coding for all or part of a human gene product is introduced into an expression vector using conventional technology.
  • Techniques to transfer cloned sequences into expression vectors that direct protein translation in mammalian, yeast, insect or bacterial expression systems are well known in the art.
  • Commercially available vectors and expression systems are available from a variety of suppliers including Stratagene (La Jolla, California) , Promega (Madison, Wisconsin) , and Invitrogen (San Diego, California) .
  • the codon context and codon pairing of the sequence may be optimized for the particular expression organism, as explained by Hatfield, et al., U.S. Patent No. 5,082,767, incorporated herein by this reference.
  • the following is provided as one exemplary method to generate polypeptide from cloned cDNA sequences.
  • the cDNA from the EST of interest is sequenced to identify the methionine initiation codon for the gene and the poly A sequence. If the cDNA lacks a poly A sequence, this sequence can be added to the construct by, for example, splicing out the poly A sequence from pSG5 (Stratagene) using BglI and SalI restriction endonuclease enzymes and incorporating it into the mammalian expression vector pXT1 (Stratagene).
  • pXT1 contains the LTRs and a portion of the gag gene from Moloney Murine Leukemia Virus.
  • the position of the LTRs in the construct allows efficient stable transfection.
  • the vector includes the Herpes Simplex Thymidine Kinase promoter and the selectable neomycin gene.
  • the cDNA is obtained by PCR from the bacterial vector using oligonucleotide primers complementary to the cDNA and containing restriction endonuclease sequences for PstI incorporated into the 5' primer and BglII at the 5' end of the corresponding cDNA 3' primer, taking care to ensure that the cDNA is positioned in frame with the poly A sequence.
  • the purified fragment obtained from the resulting PCR reaction is digested with PstI, blunt ended with an exonuclease, digested with BglII, purified and ligated to pXT1, now containing a poly A sequence and digested with BglII.
  • the ligated product is transfected into mouse NIH 3T3 cells using Lipofectin (Life Technologies, Inc., Grand Island, New York) under conditions outlined in the product specification. Positive transfectants are selected after growing the transfected cells in 600 µg/ml G418 (Sigma, St. Louis, Missouri).
  • the protein is preferably released into the supernatant. However, if the protein has membrane binding domains, the protein may additionally be retained within the cell or expression may be restricted to the cell surface.
  • the cDNA sequence is additionally incorporated into eukaryotic expression vectors and expressed as a chimeric with, for example, β-globin.
  • Antibody to β-globin is used to purify the chimeric.
  • Corresponding protease cleavage sites engineered between the β-globin gene and the cDNA are then used to separate the two polypeptide fragments from one another after translation.
  • One useful expression vector for generating β-globin chimerics is pSG5 (Stratagene). This vector encodes rabbit β-globin. Intron II of the rabbit β-globin gene facilitates splicing of the expressed transcript, and the polyadenylation signal incorporated into the construct increases the level of expression.
  • Polypeptide may additionally be produced from either construct using in vitro translation systems such as In vitro Express™ Translation Kit (Stratagene).
  • Substantially pure protein or polypeptide is isolated from the transfected or transformed cells as described in Example 23. Concentration of protein in the final preparation is adjusted, for example, by concentration on an Amicon filter device, to the level of a few micrograms/ml. Monoclonal or polyclonal antibody to the protein can then be prepared as follows: A. Monoclonal Antibody Production by Hybridoma Fusion
  • Monoclonal antibody to epitopes of any of the peptides identified and isolated as described can be prepared from murine hybridomas according to the classical method of Kohler, G. and Milstein, C. , Nature 256:495 (1975) or derivative methods thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected protein over a period of a few weeks. The mouse is then sacrificed, and the antibody producing cells of the spleen isolated. The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells destroyed by growth of the system on selective media comprising aminopterin (HAT media) .
  • the successfully fused cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued.
  • Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay procedures, such as ELISA, as originally described by Engvall, E., Meth. Enzymol. 70:419 (1980), and derivative methods thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for use. Detailed procedures for monoclonal antibody production are described in Davis, L. et al. Basic Methods in Molecular Biology Elsevier, New York. Section 21-2. B. Polyclonal Antibody Production by Immunization
  • Polyclonal antiserum containing antibodies to heterogenous epitopes of a single protein can be prepared by immunizing suitable animals with the expressed protein described above, which can be unmodified or modified to enhance immunogenicity.
  • Effective polyclonal antibody production is affected by many factors related both to the antigen and the host species. For example, small molecules tend to be less immunogenic than others and may require the use of carriers and adjuvant.
  • host animals vary in response to site of inoculations and dose, with both inadequate and excessive doses of antigen resulting in low titer antisera. Small doses (ng level) of antigen administered at multiple intradermal sites appear to be most reliable.
  • An effective immunization protocol for rabbits can be found in Vaitukaitis,
  • Booster injections can be given at regular intervals, and antiserum harvested when antibody titer thereof, as determined semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the antigen, begins to fall. See, for example, Ouchterlony, O. et al., Chap. 19 in: Handbook of Experimental Immunology, D. Weir (ed.), Blackwell (1973). Plateau concentration of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum (about 12 µM). Affinity of the antisera for the antigen is determined by preparing competitive binding curves, as described, for example, by Fisher, D., Chap. 42 in: Manual of Clinical Immunology, 2d Ed.
  • Antibody preparations prepared according to either protocol are useful in quantitative immunoassays which determine concentrations of antigen-bearing substances in biological samples; they are also used semi-quantitatively or qualitatively to identify the presence of antigen in a biological sample.
  • tissue specific antigens by means of antibody preparations according to Example 24 which are conjugated, directly or indirectly to a detectable marker.
  • Selected labeled antibody species bind to their specific antigen binding partner in tissue sections, cell suspensions, or in extracts of soluble proteins from a tissue sample to provide a pattern for qualitative or semi-qualitative interpretation.
  • Antisera for these procedures must have a potency exceeding that of the native preparation, and for that reason, antibodies are concentrated to a mg/ml level by isolation of the gamma globulin fraction, for example, by ion-exchange chromatography or by ammonium sulfate fractionation.
  • unwanted antibodies for example to common proteins, must be removed from the gamma globulin fraction, for example by means of insoluble immunoabsorbents, before the antibodies are labeled with the marker.
  • Either monoclonal or heterologous antisera is suitable for either procedure.
  • a fluorescent marker either fluorescein or rhodamine
  • antibodies can also be labeled with an enzyme that supports a color producing reaction with a substrate, such as horseradish peroxidase. Markers can be added to tissue-bound antibody in a second step, as described below.
  • the specific antitissue antibodies can be labeled with ferritin or other electron dense particles, and localization of the ferritin coupled antigen-antibody complexes achieved by means of an electron microscope.
  • the antibodies are radiolabeled with, for example, 125I, and detected by overlaying the antibody treated preparation with photographic emulsion.
  • Preparations to carry out the procedures can comprise monoclonal or polyclonal antibodies to a single gene copy or protein, identified as specific to a tissue type, for example, brain tissue, or antibody preparations to several antigenically distinct tissue specific antigens can be used in panels, independently or in mixtures, as required.
  • Tissue sections and cell suspensions are prepared for immunohistochemical examination according to common histological techniques. Multiple cryostat sections (about 4 µm, unfixed) of the unknown tissue and known control are mounted and each slide covered with different dilutions of the antibody preparation. Sections of known and unknown tissues should also be treated with preparations to provide a positive control, a negative control, for example, pre-immune sera, and a control for non-specific staining, for example, buffer.
  • Treated sections are incubated in a humid chamber for 30 min at room temperature, rinsed, then washed in buffer for 30-45 min. Excess fluid is blotted away, and the marker developed.
  • tissue specific antibody was not labeled in the first incubation, it can be labeled at this time in a second antibody-antibody reaction, for example, by adding fluorescein- or enzyme-conjugated antibody against the immunoglobulin class of the antiserum-producing species, for example, fluorescein labeled antibody to mouse IgG.
  • the antigen found in the tissues by the above procedure can be quantified by measuring the intensity of color or fluorescence on the tissue section, and calibrating that signal using appropriate standards.
  • tissue specific proteins and identification of unknown tissues from that procedure is carried out using the labeled antibody reagents and detection strategy as described for immunohistochemistry; however the sample is prepared according to an electrophoretic technique to distribute the proteins extracted from the tissue in an orderly array on the basis of molecular weight for detection.
  • a tissue sample is homogenized using a Virtis apparatus; cell suspensions are disrupted by Dounce homogenization or osmotic lysis, using detergents in either case as required to disrupt cell membranes, as is the practice in the art.
  • Insoluble cell components such as nuclei, microsomes, and membrane fragments are removed by ultracentrifugation, and the soluble protein-containing fraction concentrated if necessary and reserved for analysis.
  • a sample of the soluble protein solution is resolved into individual protein species by conventional SDS polyacrylamide electrophoresis as described, for example, by Davis, L. et al., Section 19-2 in: Basic Methods in Molecular Biology (P. Leder, ed) , Elsevier, New York (1986) , using a range of amounts of polyacrylamide in a set of gels to resolve the entire molecular weight range of proteins to be detected in the sample.
  • a size marker is run in parallel for purposes of estimating molecular weights of the constituent proteins.
  • Sample size for analysis is a convenient volume of from 5-50 µl, and containing from about 1 to 100 µg protein.
  • a detectable label can be attached to the primary tissue antigen-primary antibody complex according to various strategies and permutations thereof.
  • the primary specific antibody can be labeled; alternatively, the unlabeled complex can be bound by a labeled secondary anti-IgG antibody.
  • either the primary or secondary antibody is conjugated to a biotin molecule, which can, in a subsequent step, bind an avidin conjugated marker.
  • enzyme labeled or radioactive protein A which has the property of binding to any IgG, is bound in a final step to either the primary or secondary antibody.
  • tissue specific antigen binding at levels above those seen in control tissues to one or more tissue specific antibodies, prepared from the gene sequences identified from EST sequences, can identify tissues of unknown origin, for example, forensic samples, or differentiated tumor tissue that has metastasized to foreign bodily sites.
  • the EST sequences of the present invention are identified herein by SEQ ID NO, in the GenBank database by a different number, and in the inventors' lab (and upcoming publications) by EST number; clones have been submitted to the American Type Culture Collection (Rockville, Maryland USA) under clone names. Table 12 cross references those different numbers for the ESTs from cDNA, SEQ ID NOS 1-315.
  • CTGCAGCCAC CATATGGGGC ACTCCTGGCT GGTGTACAGG GTGGGCATTG CCCAGGTCTT 360
  • ATCCTCACAC CAGCATTTTG TGTGTAAGGA AACTGGCCGA GAGTGGTTAA GAAATATATC 240
  • CTAGGCACCC GTTCAGTGTG AGGAGGGGGA AGTGGCCTTG CCAAGGGGCC AGTGAGCTCA 420
  • AAATCATTGC TCAAAAGAAR AACCTGGCAA TGCATGATTA CGAAATGCAA AAGAMGATAC 120
  • AGTGTCCCAT CAGAGGTTTA TACAAAGAGA GAATGACTGA ACTATATGAT TATCCCANGT 120
  • CTCCCTTCGC CACCTGCTGG ACGCGAGGGG CTACTACGAT GCCATGGGTG TCCTGRTTTT 60
  • TTATTTCTCA GACAGGACTG CTCTGTATNT GTCTTTGGAT TCTACGTAGA TTTATATTTG 120
  • CCCCCTCCTC TTCCGTCCTG ATTAAGCCCA AGGGTTGGTG GACTTAACTT TCAGCCCATC 120
  • CTTCTAATGA GGTCACTACT GAACATAATT GTTCCCTCTT CTGTTAAATA GAATAGGTTT 300
  • GTCCTTACAT GRCAAAGAGA TGGAAGGGCC AAAAAGATGG TGACCTATTG TGAGGCCTTT 360
  • GAGATTGTKC AGCAGCCACT GCCTCCTTGT CACCTTCGCC TGTGGTCATT CTCCCCACAT 180
  • CAMGAAAACC CAGGACACCA GGGCAGGGGG GCTGCACAAG GTCGGGTAGG TCACAGTGGG 180
  • ACTCAMCTTC TCATTCAATC TGGGGCAGTG GATAACCTTT CTGAATAGAC CCACTTGTTC 120

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Physics & Mathematics (AREA)
  • Analytical Chemistry (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Plant Pathology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Medicinal Chemistry (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Toxicology (AREA)
  • Immunology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Partial and complete human cDNA and genomic sequences corresponding to particular expressed sequence tags (ESTs). These ESTs are cDNA sequences generally 150 to 500 base pairs in length and are derived from human brain cDNA libraries. They correspond to genes transcribed in the human brain and have the base sequences identified as SEQ ID NOS: 1-315.
PCT/US1992/005222 1991-06-20 1992-06-19 Sequences characteristic of human gene transcription product WO1993000353A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP92914421A EP0593580A4 (en) 1991-06-20 1992-06-19 Sequences characteristic of human gene transcription product

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US716,831 1991-06-20
US71683191A 1991-06-20 1991-06-20
US83719592A 1992-02-12 1992-02-12
US837,195 1992-02-12

Publications (1)

Publication Number Publication Date
WO1993000353A1 (fr) 1993-01-07

Family

ID=27109605

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1992/005222 WO1993000353A1 (fr) 1991-06-20 1992-06-19 Sequences characteristic of human gene transcription product

Country Status (3)

Country Link
EP (1) EP0593580A4 (fr)
AU (1) AU2240492A (fr)
WO (1) WO1993000353A1 (fr)

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0585801A2 (fr) * 1992-08-28 1994-03-09 Hoechst Japan Limited Protéine du type cadhérinique apparentée aux os et procédé de production
EP0743989A1 (fr) * 1994-02-14 1996-11-27 Smithkline Beecham Corporation Genes exprimes differemment chez des sujets sains et chez des sujets malades
GB2305241A (en) * 1995-09-12 1997-04-02 Univ Johns Hopkins Med Oligonucleotide, tagged with nucleotide sequences, for the serial analysis of gene expression (SAGE)
WO1997021807A1 (fr) * 1995-12-12 1997-06-19 Kyowa Hakko Kogyo Co., Ltd. Nouveaux adn, nouveaux polypeptides et nouveaux anticorps
US5695937A (en) * 1995-09-12 1997-12-09 The Johns Hopkins University School Of Medicine Method for serial analysis of gene expression
US5858378A (en) * 1996-05-02 1999-01-12 Galagen, Inc. Pharmaceutical composition comprising cryptosporidium parvum oocysts antigen and whole cell candida species antigen
WO1999032515A2 (fr) * 1997-12-19 1999-07-01 Zymogenetics, Inc. Homologue de l'angiopoietine, adn le codant et procede de production dudit homologue
US5936078A (en) * 1995-12-12 1999-08-10 Kyowa Hakko Kogyo Co., Ltd. DNA and protein for the diagnosis and treatment of Alzheimer's disease
WO1999067382A2 (fr) * 1998-06-24 1999-12-29 Compugen Ltd. Sequences d'un facteur de croissance apparente a l'angiopoietine
US6010878A (en) * 1996-05-20 2000-01-04 Smithkline Beecham Corporation Interleukin-1 β converting enzyme like apoptotic protease-6
WO2000012525A1 (fr) 1998-08-27 2000-03-09 Quark Biotech, Inc. Sequences caracteristiques de la transcription genique regulee par l'hypoxemie
WO2000050062A2 (fr) * 1999-02-24 2000-08-31 North Carolina State University Methodes et compositions alterant la secretion de mucus
US6225084B1 (en) * 1995-02-10 2001-05-01 Millennium Pharmaceuticals, Inc. Compositions and methods for the treatment and diagnosis of cardiovascular disease using rchd534 as a target
WO2002063009A2 (fr) * 2001-02-02 2002-08-15 Eli Lilly And Company Proteines de mammiferes lp et reactifs associes
US6482922B2 (en) 1995-11-02 2002-11-19 Human Genome Sciences, Inc. Mammary transforming protein
US6759210B1 (en) * 1996-02-16 2004-07-06 Millennium Pharmaceuticals, Inc. Compositions and methods for the treatment and diagnosis of cardiovascular disease using fehd545 as a target
US6890721B1 (en) 1996-05-20 2005-05-10 Human Genome Sciences, Inc. Interleukin-1β converting enzyme like apoptotic protease-6
US7265088B1 (en) 2000-02-24 2007-09-04 North Carolina State University Method and compositions for altering mucus secretion
US7524926B2 (en) 2005-01-20 2009-04-28 Biomarck Pharmaceuticals, Ltd. Mucin hypersecretion inhibitors and methods of use
US7544772B2 (en) 2001-06-26 2009-06-09 Biomarck Pharmaceuticals, Ltd. Methods for regulating inflammatory mediators and peptides useful therein
US7919469B2 (en) 2000-02-24 2011-04-05 North Carolina State University Methods and compositions for altering mucus secretion
US8501911B2 (en) 1999-02-24 2013-08-06 Biomarck Pharmaceuticals, Ltd Methods of reducing inflammation and mucus hypersecretion
US8999915B2 (en) 2006-07-26 2015-04-07 Biomarck Pharmaceuticals, Ltd. Methods for attenuating release of inflammatory mediators and peptides useful therein

Non-Patent Citations (18)

* Cited by examiner, † Cited by third party
Title
Analytical Biochemistry, Volume 172, issued 1988, C.J. MARCUS-SEKURA, "Techniques for using antisense oligodeoxyribonucleotides to study gene expression", pages 289-295, see entire document. *
Cell, Volume 3, issued December 1974, P.C. WENSINK et al., "A system for mapping DNA sequences in the chromosomes of Drosophila melanogaster", pages 315-325, see entire document. *
Gene, Volume 88, issued 08 June 1990, P. SZAFRANSKI et al., "Hypersensitive mung bean nuclease cleavage sites in Plasmodium knowlesi DNA", pages 141-147, see especially figure 6 on page 145. *
Methods in Enzymology, Volume 101, issued 1983, M. ROSENBERG et al., "The use of pKC30 and its derivatives for controlled expression of genes", pages 123-138, see entire document. *
Methods in Enzymology, Volume 152, issued 1987, A.R. KIMMEL, "Selection of clones from libraries: Overview", pages 393-399, see entire document. *
Nature, Volume 338, No. 17, issued 02 March 1989, G.H. TRAVIS et al., "Identification of a photoreceptor-specific mRNA encoded by the gene responsible for retinal degeneration slow (rds)", pages 70-73, see especially Figure 3 on page 73. *
Nucleic Acids Research, Volume 11, No. 12, issued 1983, ROSENZWEIG et al., "Sequence of the C. elegans transposable element Tc1", pages 4201-4209, see especially Figure 2 on page 4205. *
Nucleic Acids Research, Volume 12, No. 18, issued 1984, D.A. MELTON et al., "Efficient in vitro synthesis of biologically active RNA and RNA hybridization probes from plasmids containing a bacteriophage SP6 promoter", pages 7035-7056, see entire document. *
Pharmacia P-L Biochemicals 1984 Product Reference Guide, published 1984 by Pharmacia P-L Biochemicals, Inc., Piscataway, NJ, USA, pages 36-37, see especially "Oligo(dA)" and "Oligo(dT)". *
Plant Molecular Biology, Volume 11, issued 1988, T.J. HIGGINS et al., "The sequence of a pea vicilin gene and its expression in transgenic tobacco plants", pages 683-695, see especially Figure 1 on page 686. *
Proceedings of the National Academy of Sciences USA, Volume 78, No. 11, issued November 1981, S.V. SUGGS et al., "Use of synthetic oligonucleotides as hybridization probes: Isolation of cloned cDNA sequences for human beta2-microglobulin", pages 6613-6617, see entire document. *
Proceedings of the National Academy of Sciences USA, Volume 80, issued January 1983, B.J. CONNER et al., "Detection of sickle cell betaS-globin allele by hybridization with synthetic oligonucleotides", pages 278-282, see entire document. *
Proceedings of the National Academy of Sciences USA, Volume 83, issued 1986, A. HIRASHIMA et al., "Engineering of the mRNA-interfering complementary RNA immune system against viral infection", pages 7726-7730, see entire document. *
Promega Biological Research Products 1988/89 Catalog, published 1988 by Promega Corporation, Madison, WI, USA, see entire document. *
Science, Volume 205, issued 1979, J.A. MARTIAL et al., "Human growth hormone: Complementary DNA cloning and expression in bacteria", pages 602-606, see entire document. *
See also references of EP0593580A4 *
The Journal of Biological Chemistry, Volume 263, No. 6, issued 25 February 1988, S. MEMET et al., "RPA190, the gene coding for the largest subunit of yeast RNA polymerase A", pages 2830-2839, see especially Figure 4 on page 2833. *
The Journal of Biological Chemistry, Volume 264, No. 17, issued 15 June 1989, S. MATSUURA et al., "Human adenylate kinase deficiency associated with hemolytic anemia", pages 10148-10155, see especially Figure 2 on page 10154. *

Cited By (46)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5869638A (en) * 1992-08-28 1999-02-09 Hoechst Japan Limited Bone-related cadherin-like protein and process for its production
EP0585801A3 (en) * 1992-08-28 1994-06-29 Hoechst Japan Bone-related cadherin-like protein and process for its production
EP0585801A2 (fr) * 1992-08-28 1994-03-09 Hoechst Japan Limited Protéine du type cadhérinique apparentée aux os et procédé de production
EP0743989A1 (fr) * 1994-02-14 1996-11-27 Smithkline Beecham Corporation Genes exprimes differemment chez des sujets sains et chez des sujets malades
EP1813684A2 (fr) * 1994-02-14 2007-08-01 Smithkline Beecham Corporation Gènes s'y exprimant différemment chez des sujets sains et des sujets malades
EP1813684A3 (fr) * 1994-02-14 2009-11-18 Smithkline Beecham Corporation Gènes s'y exprimant différemment chez des sujets sains et des sujets malades
EP0743989A4 (fr) * 1994-02-14 2000-11-08 Smithkline Beecham Corp Genes exprimes differemment chez des sujets sains et chez des sujets malades
US6225084B1 (en) * 1995-02-10 2001-05-01 Millennium Pharmaceuticals, Inc. Compositions and methods for the treatment and diagnosis of cardiovascular disease using rchd534 as a target
US5866330A (en) * 1995-09-12 1999-02-02 The Johns Hopkins University School Of Medicine Method for serial analysis of gene expression
US5695937A (en) * 1995-09-12 1997-12-09 The Johns Hopkins University School Of Medicine Method for serial analysis of gene expression
US6383743B1 (en) 1995-09-12 2002-05-07 The John Hopkins University School Of Medicine Method for serial analysis of gene expression
GB2305241B (en) * 1995-09-12 1999-11-10 Univ Johns Hopkins Med Method for serial analysis of gene expression
GB2305241A (en) * 1995-09-12 1997-04-02 Univ Johns Hopkins Med Oligonucleotide, tagged with nucleotide sequences, for the serial analysis of gene expression (SAGE)
US6746845B2 (en) 1995-09-12 2004-06-08 The Johns Hopkins University Method for serial analysis of gene expression
US6482922B2 (en) 1995-11-02 2002-11-19 Human Genome Sciences, Inc. Mammary transforming protein
US5936078A (en) * 1995-12-12 1999-08-10 Kyowa Hakko Kogyo Co., Ltd. DNA and protein for the diagnosis and treatment of Alzheimer's disease
WO1997021807A1 (fr) * 1995-12-12 1997-06-19 Kyowa Hakko Kogyo Co., Ltd. Nouveaux adn, nouveaux polypeptides et nouveaux anticorps
US6759210B1 (en) * 1996-02-16 2004-07-06 Millennium Pharmaceuticals, Inc. Compositions and methods for the treatment and diagnosis of cardiovascular disease using fehd545 as a target
US5858378A (en) * 1996-05-02 1999-01-12 Galagen, Inc. Pharmaceutical composition comprising cryptosporidium parvum oocysts antigen and whole cell candida species antigen
US6890721B1 (en) 1996-05-20 2005-05-10 Human Genome Sciences, Inc. Interleukin-1β converting enzyme like apoptotic protease-6
US7115260B2 (en) 1996-05-20 2006-10-03 Human Genome Sciences, Inc. Interleukin-1β converting enzyme like apoptotic protease-6
US6010878A (en) * 1996-05-20 2000-01-04 Smithkline Beecham Corporation Interleukin-1 β converting enzyme like apoptotic protease-6
US6294169B1 (en) 1996-05-20 2001-09-25 Human Genome Sciences, Inc. Interleukin-1 beta converting enzyme like apoptotic protease-6
WO1999032515A3 (fr) * 1997-12-19 1999-09-10 Zymogenetics Inc Homologue de l'angiopoietine, adn le codant et procede de production dudit homologue
WO1999032515A2 (fr) * 1997-12-19 1999-07-01 Zymogenetics, Inc. Homologue de l'angiopoietine, adn le codant et procede de production dudit homologue
WO1999067382A3 (fr) * 1998-06-24 2000-04-27 Compugen Ltd Sequences d'un facteur de croissance apparente a l'angiopoietine
WO1999067382A2 (fr) * 1998-06-24 1999-12-29 Compugen Ltd. Sequences d'un facteur de croissance apparente a l'angiopoietine
WO2000012525A1 (fr) 1998-08-27 2000-03-09 Quark Biotech, Inc. Sequences caracteristiques de la transcription genique regulee par l'hypoxemie
US8501911B2 (en) 1999-02-24 2013-08-06 Biomarck Pharmaceuticals, Ltd Methods of reducing inflammation and mucus hypersecretion
JP2002538783A (ja) * 1999-02-24 2002-11-19 ノース・キャロライナ・ステイト・ユニヴァーシティ 粘液分泌を変化させる方法および組成物
AU766800B2 (en) * 1999-02-24 2003-10-23 North Carolina State University Methods and compositions for altering mucus secretion
WO2000050062A3 (fr) * 1999-02-24 2000-12-21 Univ North Carolina State Methodes et compositions alterant la secretion de mucus
WO2000050062A2 (fr) * 1999-02-24 2000-08-31 North Carolina State University Methodes et compositions alterant la secretion de mucus
US7265088B1 (en) 2000-02-24 2007-09-04 North Carolina State University Method and compositions for altering mucus secretion
US7919469B2 (en) 2000-02-24 2011-04-05 North Carolina State University Methods and compositions for altering mucus secretion
WO2002063009A2 (fr) * 2001-02-02 2002-08-15 Eli Lilly And Company Proteines de mammiferes lp et reactifs associes
WO2002063009A3 (fr) * 2001-02-02 2004-03-18 Lilly Co Eli Proteines de mammiferes lp et reactifs associes
US7544772B2 (en) 2001-06-26 2009-06-09 Biomarck Pharmaceuticals, Ltd. Methods for regulating inflammatory mediators and peptides useful therein
US8563689B1 (en) 2001-06-26 2013-10-22 North Carolina State University Methods for regulating inflammatory mediators and peptides for useful therein
US7524926B2 (en) 2005-01-20 2009-04-28 Biomarck Pharmaceuticals, Ltd. Mucin hypersecretion inhibitors and methods of use
US8492518B2 (en) 2005-01-20 2013-07-23 Biomarck Pharmaceuticals Ltd. Mucin hypersecretion inhibitors and methods of use
US8293870B2 (en) 2005-01-20 2012-10-23 Biomarck Pharmaceuticals Ltd Mucin hypersecretion inhibitors and methods of use
US8907056B2 (en) 2005-01-20 2014-12-09 Biomarck Pharmaceuticals, Ltd. Mucin hypersecretion inhibitors and methods of use
US9598463B2 (en) 2005-01-20 2017-03-21 Biomarck Pharmaceuticals, Ltd. Mucin hypersecretion inhibitors and methods of use
US8999915B2 (en) 2006-07-26 2015-04-07 Biomarck Pharmaceuticals, Ltd. Methods for attenuating release of inflammatory mediators and peptides useful therein
US9827287B2 (en) 2006-07-26 2017-11-28 Biomarck Pharmaceuticals, Ltd. Methods for attenuating release of inflammatory mediators and peptides useful therein

Also Published As

Publication number Publication date
AU2240492A (en) 1993-01-25
EP0593580A1 (fr) 1994-04-27
EP0593580A4 (en) 1995-12-06

Similar Documents

Publication Publication Date Title
AU2018203835B2 (en) Recombinant dna constructs and methods for modulating expression of a target gene
AU2019253901B2 (en) Isolated polynucleotides and polypeptides, and methods of using same for increasing nitrogen use efficiency of plants
AU2020267286B2 (en) Isolated polynucleotides and polypeptides, and methods of using same for increasing plant yield and/or agricultural characteristics
WO1993000353A1 (fr) Sequences caracteristiques de produit de transcription de genes humains
AU2020223685B2 (en) Plant regulatory elements and uses thereof
AU2023204276A1 (en) Novel CRISPR-associated transposases and uses thereof
RU2714251C2 (ru) Оптимальные локусы кукурузы
AU2021266196A9 (en) Isolated polynucleotides and polypeptides, construct and plants comprising same and methods of using same for increasing nitrogen use efficiency of plants
CN104024438B (zh) Snp位点集合及其使用方法与应用
AU2021232838A1 (en) Isolated polynucleotides and polypeptides, and methods of using same for increasing nitrogen use efficiency, yield, growth rate, vigor, biomass, oil content, and/or abiotic stress tolerance
KR20230057487A (ko) 게놈 조정을 위한 방법 및 조성물
KR101999410B1 (ko) 염색체 랜딩 패드 및 관련된 용도
KR20230053735A (ko) 게놈의 조정을 위한 개선된 방법 및 조성물
EA030697B1 (ru) Событие 5307 кукурузы
CN108882689A (zh) 烟草植物体及其制备方法
CN109788738A (zh) 小麦
CN111542610A (zh) 精确基因组编辑的新策略
AU2022202318A1 (en) Methods of increasing specific plants traits by over-expressing polypeptides in a plant
WO2001098454A2 (fr) Sequences d'adn humain
CN114466928A (zh) 淀粉核样结构
EP1533375B1 (fr) Procede de transfert de mutation dans un acide nucleique cible
AU2020210193B2 (en) Isolated polynucleotides and polypeptides, and methods of using same for increasing plant yield and/or agricultural characteristics
AU2017204404B2 (en) Isolated Polynucleotides and Polypeptides, and Methods of Using Same for Increasing Plant Yield and/or Agricultural Characteristics
CN117425402A (zh) 通过基因组编辑加快转基因作物的育种
CN116648514A (zh) 玉米调节元件及其用途

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AU CA JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IT LU MC NL SE

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWE Wipo information: entry into national phase

Ref document number: 1992914421

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1992914421

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 1992914421

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: CA