US20230023867A1 - Identification of tumors and tissues - Google Patents

Info

Publication number
US20230023867A1
Authority
US
United States
Prior art keywords
genes
tumor
sequences
types
cell
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
US17/854,909
Inventor
Mark G. Erlander
Xiao-Jun Ma
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Biotheranostics Inc
Original Assignee
Biotheranostics Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Biotheranostics Inc
Priority to US17/854,909
Assigned to BANK OF AMERICA, N.A., AS COLLATERAL AGENT (security interest; see document for details). Assignors: BIOTHERANOSTICS, INC., FAXITRON BIOPTICS, LLC, GEN-PROBE INCORPORATED, HOLOGIC, INC.
Publication of US20230023867A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20 Supervised data analysis
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/10 Gene or protein expression profiling; Expression-ratio estimation or normalisation
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/112 Disease subtyping, staging or classification
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/10 Signal processing, e.g. from mass spectrometry [MS] or from PCR
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/30 Unsupervised data analysis

Definitions

  • This invention relates to the use of gene expression to classify human tumors.
  • the classification is performed by use of gene expression profiles, or patterns, of about 5 to 49 expressed sequences that are correlated with tumors arising from certain tissues as well as being correlated with certain tumor types.
  • the invention also provides for the use of about 5 to 49 specific gene sequences, the expression of which is correlated with tissue source and tumor type in various cancers.
  • the gene expression profiles may be used to determine a cell containing sample as containing tumor cells of a tissue type or from a tissue origin to permit a more accurate identification of the cancer and thus treatment thereof as well as the prognosis of the subject from whom the sample was obtained.
  • This invention relates to the use of gene expression measurements to classify or identify tumors in cell containing samples obtained from a subject in a clinical setting, such as in cases of formalin fixed, paraffin embedded (FFPE) samples as well as fresh samples, that have undergone none to little or minimal treatment (such as simply storage at a reduced, non-freezing, temperature), and frozen samples.
  • FFPE: formalin fixed, paraffin embedded
  • the invention thus provides the ability to classify tumors in the real-world conditions faced by hospital and other laboratories which conduct testing on clinical FFPE samples.
  • the samples may be of a primary tumor sample or of a tumor that has resulted from a metastasis of another tumor.
  • the sample may be a cytological sample, such as, but not limited to, cells in a blood sample.
  • the tumors may not have undergone classification by traditional pathology techniques, may have been initially classified but confirmation is desired, or have been classified as a “carcinoma of unknown primary” (CUP) or “tumor of unknown origin” (TUO) or “unknown primary tumor”.
  • CUP: cancer of unknown primary
  • TUO: tumor of unknown origin
  • the need for confirmation is particularly relevant in light of the estimates of 5 to 10% misclassification using standard techniques.
  • the invention may be viewed as providing means for cancer identification, or CID.
  • the classification is performed by use of gene expression profiles, or patterns, of about 5 to 49 expressed sequences.
  • the gene expression profiles may be used to determine a cell containing sample as containing tumor cells of a tissue type or from a tissue origin to permit a more accurate identification of the cancer and thus treatment thereof as well as the prognosis of the subject from whom the sample was obtained.
  • the invention is used to classify among at least 34 or at least 39 tumor types with significant accuracy in a clinical setting.
  • the invention is based in part on the surprising and unexpected discovery that about 5 to 49 expressed sequences in the human genome are capable of classifying among at least 34, or at least 39, tumor types, as well as subsets of those tumor types, in a meaningful manner. Stated differently, the invention is based in part on the discovery that it is not necessary to use supervised learning to identify gene sequences which are expressed in correlation with different tumor types.
  • the invention is based in part on the recognition that any set of about 5 to 49 expressed sequences, even a random collection of expressed sequences, has the capability to classify, and so may be used to classify, a cell as being a tumor cell of a tissue or tissue origin. Moreover, relatively few expressed sequences are needed to classify among different tumor types.
  • the ratio of expressed sequences to the number of tumor types that can be classified, based on the expression levels of the sequences, ranges from about 1:2 to about 5:2 or higher as demonstrated herein.
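As a small worked example of the ratio just described, the sketch below computes how many expressed sequences a given sequences-to-tumor-types ratio implies for a chosen number of tumor types. The function name `sequences_for_ratio` and the choice to round up are illustrative assumptions, not part of the source.

```python
import math
from fractions import Fraction

def sequences_for_ratio(n_tumor_types, ratio):
    """Sequences implied by a (sequences, tumor types) ratio, rounded up.

    `ratio` is a pair such as (1, 2) for 1:2. Rounding up is an assumption
    made here so that the result is a whole number of sequences.
    """
    sequences, types = ratio
    return math.ceil(Fraction(sequences, types) * n_tumor_types)

# For 10 tumor types, the two ratio endpoints named in the text:
low = sequences_for_ratio(10, (1, 2))   # ratio 1:2
high = sequences_for_ratio(10, (5, 2))  # ratio 5:2
print(low, high)
```

Under these assumptions, 10 tumor types would correspond to between 5 and 25 expressed sequences.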
  • the invention provides for the classifying of a cell containing sample as containing a tumor cell of a tissue type or origin by determining the expression levels of about 5 to 49 transcribed sequences and then classifying the cell containing sample as containing a tumor cell of a plurality (two or more) of tumor types.
  • the expressed sequences need not be those the expression levels of which are evidently or highly correlated (directly, or indirectly through correlation with another expressed sequence) with any of the tumor types.
  • the invention provides, in yet another embodiment, for the use of the expression levels of genes, the expression levels of which are not strongly correlated with the actual classification of the particular tumor sample, as one of the about 5 to 49 transcribed sequences. All of the genes selected may be such non-correlates, or only a portion of the genes may be non-correlates, typically at least 90%, 85%, 75%, 50% or 25%, as well as portions falling within the ranges created by using any two of the foregoing point examples as endpoints of a range.
  • the invention is practiced by determining the expression levels of gene sequences where the sequences need not have been selected based on a correlation of their expression levels with the tumor types to be classified.
  • the gene sequences need not be selected based on their correlation values with tumor types or a ranking based on the correlation values.
  • the invention may be practiced with use of gene expression levels which are not necessarily correlated to one or more other gene expression level(s) used for classification.
  • the ability for the expression level of one expressed sequence to function in classification is not redundant with (is independent of) the ability of at least one other gene expression level used for classification.
  • the invention may be applied to identify the origin of a cancer in a patient in a wide variety of cases including, but not limited to, identification of the origin of a cancer in a clinical setting.
  • the identification is made by classification of a cell containing sample known to contain cancer cells, but the origin of those cells is unknown.
  • the identification is made by classification of a cell containing sample as containing one or more cancer cells followed by identification of the origin(s) of those cancer cell(s).
  • the invention is practiced with a sample from a subject with a previous history of cancer, and identification is made by classification of a cell as either being cancer from a previous origin of cancer or a new origin. Additional embodiments include those where multiple cancers are found in the same organ or tissue and the invention is used to determine the origin of each cancer, as well as whether the cancers are of the same origin.
  • the invention is also based in part on the discovery that the expression levels of particular gene sequences can be used to classify among tumor types with greater accuracy than the expression levels of a random group of gene sequences.
  • the invention provides for the use of expression levels of about 5 to 49 expressed sequences from a first set of 74 expressed sequences in the human genome to classify among at least 39 tumor types with significant accuracy.
  • the invention thus provides for the identification and use of gene expression patterns (or profiles or “signatures”) based on the about 5 to 49 expressed sequences as correlated with at least the 39 tumor types.
  • the invention also provides for the use of about 5 to 49 of the 74 of these expressed sequences to classify among subsets of the 39 tumor types.
  • the ratio of expressed sequences to the number of tumor types, from 2 to 39, that can be classified based on the expression levels of the sequences ranges from about 1:2 to about 5:2 with greater accuracy than the use of a random group of expressed sequences. Depending on the number of tumor types, accuracies ranging from over 75% to 95% may be achieved readily.
  • the invention provides for the use of expression levels of about 5 to 49 expressed sequences of a second set of 90 expressed sequences in the human genome to classify among at least 39 tumor types, or subsets thereof, with significant accuracy. 38 of the sequences in this second set are present in the first set of 74 sequences.
  • the expression levels of the about 5 to 49 sequences in the second set may be used in the same manner as described for the first set of 74 sequences. Depending on the number of tumor types, accuracies ranging from about 75% to about 95% may be achieved.
  • the invention is also based in part upon the discovery that use of about 5 to 49 expressed sequences to classify among 53 tumor types, which include (but are not limited to) the 34 and 39 types described herein, was limited by the number of available samples of some tumor types. As noted hereinbelow, accuracy is linked to the number of available samples of each tumor type such that the ability to classify additional tumor types is readily achieved by the application of increased numbers of each tumor type. Thus while the invention is exemplified by use in classifying among 34 or 39 tumor types as well as subsets of the 34 or 39, about 5 to 49 expressed sequences can also be used to classify among all tumor types with the inclusion of samples of the additional tumor types. Thus the invention also provides for the classification of a tumor as being a type beyond the 34 or 39 types described herein.
  • the invention is based upon the expression levels of the gene sequences in a set of known tumor cells from different tissues and of different tumor types. These gene expression profiles (of gene sequences in the different known tumor cells/types), whether embodied in nucleic acid expression, protein expression, or other expression formats, may be compared to the expression levels of the same sequences in an unknown tumor sample to identify the sample as containing a tumor of a particular type and/or a particular origin or cell type.
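The comparison of an unknown sample against profiles from known tumor types can be illustrated with a minimal sketch. The nearest-centroid rule, the names `centroid` and `classify`, the type labels, and all expression values below are assumptions chosen for brevity; the passage above does not name a specific classifier.

```python
import math

def centroid(profiles):
    """Mean expression level per gene across a list of profiles."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]

def classify(sample, reference):
    """Return the tumor type whose mean profile is closest to `sample`.

    `reference` maps a tumor-type label to a list of expression profiles
    measured on known tumors; every profile lists the same genes in the
    same order as `sample`.
    """
    best_type, best_distance = None, math.inf
    for tumor_type, profiles in reference.items():
        distance = math.dist(sample, centroid(profiles))
        if distance < best_distance:
            best_type, best_distance = tumor_type, distance
    return best_type

# Hypothetical reference data: two tumor types, three genes each.
reference = {
    "type_A": [[1.0, 5.0, 2.0], [1.2, 4.8, 2.1]],
    "type_B": [[6.0, 1.0, 3.0], [5.8, 1.2, 2.9]],
}
unknown = [1.1, 5.1, 1.9]
print(classify(unknown, reference))
```

Any distance-based or probabilistic classifier could take the place of the centroid rule; the point is only that the same sequences are measured in the known and unknown samples.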
  • the invention provides, such as in a clinical setting, the advantages of a more accurate identification of a cancer and thus the treatment thereof as well as the prognosis, including survival and/or likelihood of cancer recurrence following treatment, of the subject from whom the sample was obtained.
  • the invention is further based in part on the discovery that use of about 5 to 49 expressed sequences as described herein as capable of classifying among two or more tumor types necessarily and effectively eliminates one or more tumor types from consideration during classification. This reflects the lack of a need to select genes with expression levels that are highly correlated with all tumor types within the range of the classification system. Stated differently, the invention may be practiced with a plurality of genes the expression levels of which are not highly correlated with any of the individual tumor types or multiple types in the group of tumor types being classified. This is in contrast to other approaches based upon the selection and use of highly correlated genes, which likely do not “rule out” other tumor types as opposed to “rule in” a tumor type based on the positive correlation.
  • the classification of a tumor sample as being one of the possible tumor types described herein to the exclusion of other tumor types is of course made based upon a level of confidence as described below. Where the level of confidence is low, or an increase in the level of confidence is preferred, the classification can simply be made at the level of a particular tissue origin or cell type for the tumor in the sample. Alternatively, and where a tumor sample is not readily classified as a single tumor type, the invention permits the classification of the sample as one of a few possible tumor types described herein. This advantageously provides for the ability to reduce the number of possible tissue types, cell types, and tumor types from which to consider for selection and administration of therapy to the patient from whom the sample was obtained.
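A confidence-dependent report of this kind can be sketched as follows. The function `report`, the 0.8 threshold, the shortlist size, and the scores are all hypothetical; the source does not specify how confidence is computed at this point.

```python
def report(scores, threshold=0.8, shortlist_size=3):
    """Return a single tumor type when confidence is high, else a shortlist.

    `scores` maps each candidate tumor type to a confidence value in [0, 1].
    The threshold and shortlist size are illustrative assumptions.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    top_type, top_score = ranked[0]
    if top_score >= threshold:
        return [top_type]
    # Low confidence: narrow the possibilities instead of forcing one call.
    return [tumor_type for tumor_type, _ in ranked[:shortlist_size]]

print(report({"a": 0.90, "b": 0.05, "c": 0.05}))
print(report({"a": 0.40, "b": 0.35, "c": 0.20, "d": 0.05}))
```

A fallback to the coarser level of tissue origin or cell type, which the text also mentions, could be added by summing scores over the tumor types that share a tissue.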
  • the invention thus provides a non-subjective means for the identification of the tissue source and/or tumor type of one or more cancers of an afflicted subject.
  • subjective interpretation may have been previously used to determine the tissue source and/or tumor type, as well as the prognosis and/or treatment of the cancer based on that determination.
  • the present invention provides objective gene expression patterns, which may be used alone or in combination with subjective criteria to provide a more accurate identification of cancer classification.
  • the invention is particularly advantageously applied to samples of secondary or metastasized tumors, but any cell containing sample (including a primary tumor sample) for which the tissue source and/or tumor type is preferably determined by objective criteria may also be used with the invention.
  • the ultimate determination of class may be made based upon a combination of objective and non-objective (or subjective/partially subjective) criteria.
  • the invention includes its use as part of the clinical or medical care of a patient.
  • the profile may also be used as part of a method to determine the prognosis of the cancer in the subject.
  • the classification of the tumor/cancer and/or the prognosis may be used to select or determine or alter the therapeutic treatment for said subject.
  • the classification methods of the invention may be directed toward the treatment of disease, which is diagnosed in whole or in part based upon the classification. Given the diagnosis, administration of an appropriate anti-tumor agent or therapy, or the withholding or alteration of an anti-tumor agent or therapy, may be used to treat the cancer.
  • kits for providing diagnostic services relate to providing diagnostic services based on expression levels of gene sequences, with or without inclusion of an interpretation of levels for classifying cells of a sample.
  • the method of providing a diagnostic service of the invention is preceded by a determination of a need for the service.
  • the method includes acts in the monitoring of the performance of the service as well as acts in the request or receipt of reimbursement for the performance of the service.
  • a “gene” is a polynucleotide that encodes a discrete product, whether RNA or proteinaceous in nature. It is appreciated that more than one polynucleotide may be capable of encoding a discrete product.
  • the term includes alleles and polymorphisms of a gene that encodes the same product, or a functionally associated (including gain, loss, or modulation of function) analog thereof, based upon chromosomal location and ability to recombine during normal mitosis.
  • a “sequence” or “gene sequence” as used herein is a nucleic acid molecule or polynucleotide composed of a discrete order of nucleotide bases.
  • the term includes the ordering of bases that encodes a discrete product (i.e. “coding region”), whether RNA or proteinaceous in nature. It is appreciated that more than one polynucleotide may be capable of encoding a discrete product.
  • alleles and polymorphisms of the human gene sequences may exist and may be used in the practice of the invention to identify the expression level(s) of the gene sequences or an allele or polymorphism thereof. Identification of an allele or polymorphism depends in part upon chromosomal location and ability to recombine during mitosis.
  • “correlate” or “correlation” or equivalents thereof refer to an association between expression of one or more genes and another event, such as, but not limited to, physiological phenotype or characteristic, such as tumor type.
  • a “polynucleotide” is a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides. This term refers only to the primary structure of the molecule. Thus, this term includes double- and single-stranded DNA and RNA. It also includes known types of modifications including labels known in the art, methylation, “caps”, substitution of one or more of the naturally occurring nucleotides with an analog, and internucleotide modifications such as uncharged linkages (e.g., phosphorothioates, phosphorodithioates, etc.), as well as unmodified forms of the polynucleotide.
  • “amplify” is used in the broad sense to mean creating an amplification product, which can be made enzymatically with DNA or RNA polymerases.
  • Amplification generally refers to the process of producing multiple copies of a desired sequence, particularly those of a sample. “Multiple copies” mean at least 2 copies. A “copy” does not necessarily mean perfect sequence complementarity or identity to the template sequence.
  • Methods for amplifying mRNA are generally known in the art, and include reverse transcription PCR (RT-PCR) and quantitative PCR (or Q-PCR) or real time PCR. Alternatively, RNA may be directly labeled as the corresponding cDNA by methods known in the art.
  • nucleic acid molecule shares a substantial amount of sequence identity with another nucleic acid molecule.
  • a “microarray” is a linear or two-dimensional or three dimensional (and solid phase) array of discrete regions, each having a defined area, formed on the surface of a solid support such as, but not limited to, glass, plastic, or synthetic membrane.
  • the density of the discrete regions on a microarray is determined by the total numbers of immobilized polynucleotides to be detected on the surface of a single solid phase support, such as of at least about 50/cm², at least about 100/cm², or at least about 500/cm², up to about 1,000/cm² or higher.
  • the arrays may contain less than about 500, about 1000, about 1500, about 2000, about 2500, or about 3000 immobilized polynucleotides in total.
  • a DNA microarray is an array of oligonucleotide or polynucleotide probes placed on a chip or other surfaces used to hybridize to amplified or cloned polynucleotides from a sample. Since the position of each particular group of probes in the array is known, the identities of the sample polynucleotides can be determined based on their binding to a particular position in the microarray.
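Because each probe position is known in advance, reading an array amounts to a lookup from position to gene. The sketch below illustrates this with a hypothetical layout, made-up signal intensities, and an arbitrary signal cutoff; none of these values come from the source.

```python
def detected(layout, signals, cutoff):
    """Names of genes whose probe position shows a signal at or above `cutoff`.

    `layout` maps an array position (row, column) to the gene probed there;
    `signals` maps the same positions to a measured intensity.
    """
    return sorted(
        layout[position]
        for position, signal in signals.items()
        if position in layout and signal >= cutoff
    )

# Hypothetical 2 x 2 array with three occupied positions.
layout = {(0, 0): "gene_A", (0, 1): "gene_B", (1, 0): "gene_C"}
signals = {(0, 0): 850.0, (0, 1): 12.0, (1, 0): 430.0}
print(detected(layout, signals, cutoff=100.0))
```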
  • an array of any size may be used in the practice of the invention, including an arrangement of one or more positions of a two-dimensional or three dimensional arrangement in a solid phase to detect expression of a single gene sequence.
  • a microarray for use with the present invention may be prepared by photolithographic techniques (such as synthesis of nucleic acid probes on the surface from the 3′ end) or by nucleic acid synthesis followed by deposition on a solid surface.
  • some embodiments of the invention determine expression by hybridization of mRNA, or an amplified or cloned version thereof, of a sample cell to a polynucleotide that is unique to a particular gene sequence.
  • Polynucleotides of this type contain at least about 16, at least about 18, at least about 20, at least about 22, at least about 24, at least about 26, at least about 28, at least about 30, or at least about 32 consecutive basepairs of a gene sequence that is not found in other gene sequences.
  • the term “about” as used in the previous sentence refers to an increase or decrease of 1 from the stated numerical value.
  • the term “about” as used in the preceding sentence refers to an increase or decrease of 10% from the stated numerical value. Longer polynucleotides may of course contain minor mismatches (e.g. via the presence of mutations) which do not affect hybridization to the nucleic acids of a sample.
  • polynucleotides may also be referred to as polynucleotide probes that are capable of hybridizing to sequences of the genes, or unique portions thereof, described herein. Such polynucleotides may be labeled to assist in their detection.
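The requirement that a probe contain a run of consecutive bases not found in other gene sequences can be checked with a simple window scan. This is a minimal sketch: the function names and toy sequences are hypothetical, a window of 4 is used only to keep the example readable (the text speaks of at least about 16 to 32), and reverse complements and mismatches are ignored.

```python
def kmers(sequence, k):
    """All length-k windows of `sequence` as a set."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}

def unique_windows(target, others, k):
    """Length-k windows of `target` that occur in none of the `others`."""
    seen_elsewhere = set()
    for sequence in others:
        seen_elsewhere |= kmers(sequence, k)
    return [
        target[i:i + k]
        for i in range(len(target) - k + 1)
        if target[i:i + k] not in seen_elsewhere
    ]

# Toy sequences; real probes would use a much larger k.
print(unique_windows("ACGTACGGA", ["TTACGTAA"], k=4))
```

Any window returned is a candidate probe region for the target gene under these simplifying assumptions.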
  • the sequences may be those of mRNA encoded by the genes, the corresponding cDNA to such mRNAs, and/or amplified versions of such sequences.
  • the polynucleotide probes are immobilized on an array, other solid support devices, or in individual spots that localize the probes.
  • all or part of a gene sequence may be amplified and detected by methods such as the polymerase chain reaction (PCR) and variations thereof, such as, but not limited to, quantitative PCR (Q-PCR), reverse transcription PCR (RT-PCR), and real-time PCR (including as a means of measuring the initial amounts of mRNA copies for each sequence in a sample), optionally real-time RT-PCR or real-time Q-PCR.
  • PCR: polymerase chain reaction
  • Q-PCR: quantitative PCR
  • RT-PCR: reverse transcription PCR
  • Such methods would utilize one or two primers that are complementary to portions of a gene sequence, where the primers are used to prime nucleic acid synthesis.
  • the newly synthesized nucleic acids are optionally labeled and may be detected directly or by hybridization to a polynucleotide of the invention.
  • the newly synthesized nucleic acids may be contacted with polynucleotides (containing sequences) of the invention under conditions which allow for their hybridization. Additional methods to detect the expression of expressed nucleic acids include RNAse protection assays, including liquid phase hybridizations, and in situ hybridization of cells.
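Where real-time PCR is used to estimate initial mRNA amounts, one widely used convention (an assumption here, not stated in the source) is to compare threshold cycle (Ct) values: with ideal doubling per cycle, a target that crosses the threshold one cycle later than a reference started at about half the amount. The function name, the reference-gene normalisation, and the Ct values below are illustrative.

```python
def relative_expression(ct_target, ct_reference, efficiency=2.0):
    """Target amount relative to a reference sequence, from Ct values.

    Assumes both reactions amplify by the same factor (`efficiency`) per
    cycle; 2.0 corresponds to perfect doubling. This is the common
    delta-Ct convention, used here only as an illustration.
    """
    return efficiency ** (ct_reference - ct_target)

# Hypothetical Ct values: the target crosses threshold 2 cycles later.
print(relative_expression(ct_target=24.0, ct_reference=22.0))
```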
  • gene expression may be determined by analysis of expressed protein in a cell sample of interest by use of one or more antibodies specific for one or more epitopes of individual gene products (proteins), or proteolytic fragments thereof, in said cell sample or in a bodily fluid of a subject.
  • the cell sample may be one of breast cancer epithelial cells enriched from the blood of a subject, such as by use of labeled antibodies against cell surface markers followed by fluorescence activated cell sorting (FACS). Such antibodies may be labeled to permit their detection after binding to the gene product.
  • Detection methodologies suitable for use in the practice of the invention include, but are not limited to, immunohistochemistry of cell containing samples or tissue, enzyme linked immunosorbent assays (ELISAs) including antibody sandwich assays of cell containing tissues or blood samples, mass spectroscopy, and immuno-PCR.
  • ELISAs: enzyme linked immunosorbent assays
  • “label” refers to a composition capable of producing a detectable signal indicative of the presence of the labeled molecule.
  • Suitable labels include radioisotopes, nucleotide chromophores, enzymes, substrates, fluorescent molecules, chemiluminescent moieties, magnetic particles, bioluminescent moieties, and the like.
  • a label is any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means.
  • “support” refers to conventional supports such as beads, particles, dipsticks, fibers, filters, membranes and silane or silicate supports such as glass slides.
  • “Expression” and “gene expression” include transcription and/or translation of nucleic acid material.
  • Conditions that “allow” an event to occur or conditions that are “suitable” for an event to occur are conditions that do not prevent such events from occurring. Thus, these conditions permit, enhance, facilitate, and/or are conducive to the event.
  • Such conditions, known in the art and described herein, depend upon, for example, the nature of the nucleotide sequence, temperature, and buffer conditions. These conditions also depend on what event is desired, such as hybridization, cleavage, strand extension or transcription.
  • Sequence “mutation,” as used herein, refers to any sequence alteration in the sequence of a gene of interest disclosed herein in comparison to a reference sequence.
  • a sequence mutation includes single nucleotide changes, or alterations of more than one nucleotide in a sequence, due to mechanisms such as substitution, deletion or insertion.
  • Single nucleotide polymorphism (SNP) is also a sequence mutation as used herein. Because the present invention is based on the relative level of gene expression, mutations in non-coding regions of genes as disclosed herein may also be assayed in the practice of the invention.
  • “Detection” or “detecting” includes any means of detecting, including direct and indirect determination of the level of gene expression and changes therein.
  • FIG. 1 shows a capacity plot for the ability to use the expression levels of subsets of a set of 100 expressed gene sequences to classify among 39 tumor types and subsets thereof.
  • Expression levels of random combinations of 5, 10, 15, 20, 25, 30, 35, 40, 45, and 49 (each sampled 10 times) of the 100 sequences were used with data from tumor types and then used to predict test random sets of tumor samples (each sampled 10 times) ranging from 2 to 39 types.
  • a plot of numbers of tumor types (x-axis) versus prediction accuracies (y-axis) for results using from 5 to 49 genes is shown as a non-limiting example.
  • the data from using 5 genes results in a curve closest to the x-axis while the data from using 49 genes results in a curve farthest from the x-axis.
  • accuracy improves with higher numbers of gene sequences, where from 30 to 49 gene sequences (the three curves farthest from the x-axis) provide about the same level of accuracy.
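The resampling procedure behind such a capacity plot can be sketched on synthetic data. Everything below is an assumption for illustration: the data are randomly generated rather than taken from the patent, a nearest-centroid classifier stands in for whatever method was actually used, gene and tumor-type subsets are drawn together in each repeat, and the sizes are far smaller than the 100 sequences and 39 types described.

```python
import math
import random

def make_synthetic(n_types, n_genes, n_per_type, rng):
    """Hypothetical data: each tumor type has a random mean profile plus noise."""
    data = {}
    for tumor_type in range(n_types):
        mean = [rng.uniform(0.0, 10.0) for _ in range(n_genes)]
        data[tumor_type] = [
            [m + rng.gauss(0.0, 1.0) for m in mean] for _ in range(n_per_type)
        ]
    return data

def accuracy(data, types, genes):
    """Train centroids on the first half of each type, test on the second half."""
    centroids, tests = {}, []
    for t in types:
        rows = [[row[g] for g in genes] for row in data[t]]
        half = len(rows) // 2
        centroids[t] = [sum(col) / half for col in zip(*rows[:half])]
        tests += [(t, row) for row in rows[half:]]
    hits = sum(
        1 for t, row in tests
        if min(centroids, key=lambda c: math.dist(row, centroids[c])) == t
    )
    return hits / len(tests)

def capacity(data, n_genes_total, gene_counts, type_counts, repeats, rng):
    """Mean accuracy for each (gene count, tumor-type count) over random draws."""
    table = {}
    all_types = list(data)
    for k in gene_counts:
        for m in type_counts:
            runs = []
            for _ in range(repeats):
                genes = rng.sample(range(n_genes_total), k)
                types = rng.sample(all_types, m)
                runs.append(accuracy(data, types, genes))
            table[(k, m)] = sum(runs) / repeats
    return table

rng = random.Random(0)
data = make_synthetic(n_types=8, n_genes=30, n_per_type=6, rng=rng)
table = capacity(data, 30, gene_counts=[5, 15], type_counts=[2, 8],
                 repeats=10, rng=rng)
for key in sorted(table):
    print(key, round(table[key], 2))
```

Plotting each gene count as one curve, with the number of tumor types on the x-axis and mean accuracy on the y-axis, gives a plot of the kind described for FIG. 1.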
  • FIG. 2 shows an alternative presentation of the data used with respect to FIG. 1 .
  • A plot of the number of gene sequences used, ranging from 5-49 (x-axis), versus prediction accuracy (y-axis) for various representative numbers of tumor types is shown.
  • The plotted lines, from top to bottom, are of the results from 2, 10, 20, 30, and 39 tumor types, respectively.
  • FIG. 3 provides a further analysis of the ability to use the expression levels of subsets of a set of 100 randomly selected expressed gene sequences to classify among 39 tumor types.
  • The data used with FIGS. 1 and 2 is presented as a plot of the number of tumor types versus the number of gene sequences used, with prediction accuracies from 55-70% shown as non-limiting examples. Generally, accuracy improves with higher numbers of gene sequences.
  • FIG. 4 shows a capacity plot for the ability to use the expression levels of portions of a first set of 74 expressed gene sequences to classify among 39 tumor types and subsets thereof.
  • Expression levels of random combinations of 5, 10, 15, 20, 25, 30, 35, 40, 45, and 49 (each sampled 10 times) of the 74 sequences were used with data from tumor types and then used to predict random test sets of tumor samples (each sampled 10 times) ranging from 2 to 39 types.
  • A plot of the number of tumor types versus prediction accuracy for results using from 5 to 49 genes is shown as a non-limiting example.
  • The plotted lines, from top to bottom, are of the results from 49, 40, 30, 20, 10, and 5 gene sequences, respectively.
  • FIG. 5 shows an alternative presentation of the data used with respect to FIG. 4 .
  • A plot of the number of gene sequences used, ranging from 5-49, versus prediction accuracy for various representative numbers of tumor types is shown.
  • The plotted lines, from top to bottom, are of the results from 2, 10, 20, 30, and 39 tumor types, respectively.
  • FIG. 6 is analogous to FIG. 3 except with presentation of the data used with FIGS. 4 and 5 .
  • FIG. 7 shows a capacity plot for the ability to use the expression levels of subsets of a set of 90 expressed gene sequences to classify among 39 tumor types and subsets thereof.
  • Expression levels of random combinations of 5, 10, 15, 20, 25, 30, 35, 40, 45, and 49 (each sampled 10 times) of the 90 sequences were used with data from tumor types and then used to predict random test sets of tumor samples (each sampled 10 times) ranging from 2 to 39 types.
  • A plot of the number of tumor types versus prediction accuracy for results using from 5 to 49 genes is shown as a non-limiting example.
  • The plotted lines, from top to bottom, are of the results from 49, 40, 30, 20, 10, and 5 gene sequences, respectively.
  • FIG. 8 shows an alternative presentation of the data used with respect to FIG. 7 .
  • A plot of the number of gene sequences used, ranging from 5-49, versus prediction accuracy for various representative numbers of tumor types is shown.
  • The plotted lines, from top to bottom, are of the results from 2, 10, 20, 30, and 39 tumor types, respectively.
  • FIG. 9 is analogous to FIGS. 3 and 6 except with presentation of the data used with FIGS. 7 and 8 .
  • FIGS. 10A-10D show a “tree” that classifies tumor types covered herein as well as additional known tumor types. It was constructed mainly according to “Cancer, Principles and Practice of Oncology” (DeVita, Hellman and Rosenberg), 6th edition. Thus, beginning with a “tumor of unknown origin” (or “tuo”), the first possibilities are that it is of either germ cell or non-germ cell origin. If it is the former, then it may be of ovary or testes origin. Within those of testes origin, the tumor may be of seminoma origin or an “other” origin.
  • If the tumor is of non-germ cell origin, then it is of either epithelial or non-epithelial origin. If it is the former, then it is of either squamous or non-squamous origin.
  • Squamous origin tumors are of cervix, esophagus, larynx, lung, or skin origin.
  • Non-squamous origin tumors are of urinary bladder, breast, carcinoid-intestine, cholangiocarcinoma, digestive, kidney, liver, lung, prostate, reproductive system, skin-basal cell, or thyroid-follicular-papillary origin.
  • Digestive origin tumors are of small and large bowel, stomach-adenocarcinoma, bile duct, esophagus, gall bladder, or pancreas origin.
  • The esophagus origin tumors may be of either Barrett's esophagus or adenocarcinoma types.
  • The reproductive system origin tumors may be of cervix adenocarcinoma, endometrial tumor, or ovarian origin.
  • Ovarian origin tumors are of the clear, serous, mucinous, and endometrioid types.
  • If the tumor is of non-epithelial origin, then it is of adrenal gland, brain, GIST (gastrointestinal stromal tumor), lymphoma, meningioma, mesothelioma, sarcoma, skin melanoma, or thyroid-medullary origin.
  • Lymphomas are of B cell, Hodgkin's, or T cell type.
  • Sarcomas are of leiomyosarcoma, osteosarcoma, soft-tissue sarcoma, soft tissue MFH (malignant fibrous histiocytoma), soft tissue sarcoma synovial, soft tissue Ewing's sarcoma, soft tissue fibrosarcoma, and soft tissue rhabdomyosarcoma types.
  • the invention provides methods for the use of gene expression information to classify tumors in a more objective manner than possible with conventional pathology techniques.
  • the invention provides a method of classifying a cell containing sample as including a tumor cell of (or from) a type of tissue or a tissue origin.
  • the method comprises determining or measuring the expression levels of about five to 49 transcribed sequences from cells in a cell containing sample obtained from a subject, and classifying the sample as containing tumor cells of a type of tissue from a plurality of tumor types based on the expression levels of said sequences.
  • a plurality refers to the state of two or more.
  • the classifying is based upon a comparison of the expression levels of the about 5 to 49 transcribed sequences in the cells of the sample to their expression levels in known tumor samples and/or known non-tumor samples.
  • the classifying is based upon a comparison of the expression levels of the about 5 to 49 transcribed sequences to the expression of reference sequences in the same samples, relative to, or based on, the same comparison in known tumor samples and/or known non-tumor samples.
  • The expression levels of the gene sequences may be determined in a set of known tumor samples to provide a database against which the expression levels detected or determined in a cell containing sample from a subject are compared.
  • the expression level(s) of gene sequence(s) in a sample also may be compared to the expression level(s) of said sequence(s) in normal or non-cancerous cells, preferably from the same sample or subject. As described below and in embodiments of the invention utilizing Q-PCR or real time Q-PCR, the expression levels may be compared to expression levels of reference genes in the same sample or a ratio of expression levels may be used.
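  • As a non-limiting illustration, expression relative to reference genes in the same sample may be sketched as follows. The sketch assumes that the Q-PCR readout is a cycle threshold (Ct) value and that amplification efficiency is close to 100%, so that one cycle corresponds to a two-fold difference; neither assumption is stated above, and the function name `relative_expression` is hypothetical.

```python
def relative_expression(ct_target, ct_references):
    """Expression of a target sequence relative to reference genes
    measured in the same sample.

    Assumes Ct (cycle threshold) inputs and roughly 100% PCR efficiency,
    so a difference of one cycle is treated as a two-fold difference.
    """
    ct_ref = sum(ct_references) / len(ct_references)  # mean reference Ct
    delta_ct = ct_target - ct_ref
    return 2.0 ** (-delta_ct)
```

A target that crosses threshold two cycles later than the reference mean would thus be reported at one quarter of the reference level.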
  • The method utilizes a ratio of transcribed sequences to the number of tumor types classified, ranging from about 1:2 to about 5:2 or higher. Stated differently, the ratio of the number of expression levels needed to the number of tumor types that may be classified based upon those levels ranges from about 1:2, to about 1:1, to about 3:2, to about 2:1, to about 5:2 or higher. This is reflected by the ability to use as few as about 20 expression levels to classify among 39 tumor types (see FIG. 6). Thus, and based on data as shown in FIGS. 1-9, the invention may be practiced with about 5 to 49 gene sequences within the ratio of genes assessed to tumors classified.
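  • As a non-limiting illustration, the arithmetic behind these ratios may be checked as follows. The function name `genes_per_tumor_type` is hypothetical; the pairs of gene counts and tumor-type counts are those mentioned in this description.

```python
from fractions import Fraction

def genes_per_tumor_type(n_genes, n_types):
    """Exact ratio of gene sequences assessed to tumor types classified."""
    return Fraction(n_genes, n_types)

# Pairs mentioned in the description: about 20 genes for 39 types,
# 5 genes for 2 types, and 49 genes for 39 types.
for n_genes, n_types in [(20, 39), (5, 2), (49, 39)]:
    ratio = genes_per_tumor_type(n_genes, n_types)
    print(f"{n_genes} genes / {n_types} types = {float(ratio):.2f}")
```

Twenty genes for 39 tumor types is slightly above 1:2, and five genes for two tumor types is exactly 5:2, which are the two ends of the stated range.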
  • the selection of about 5 to 49 gene sequences to use may be random, or by selection based on various criteria.
  • the gene sequences may be selected based upon unsupervised learning, including clustering techniques.
  • selection may be to reduce or remove redundancy with respect to their ability to classify tumor type.
  • gene sequences are selected based upon the lack of correlation between their expression and the expression of one or more other gene sequences used for classifying. This is accomplished by assessing the expression level of each gene sequence in the expression data set for correlation, across the plurality of samples, with the expression level of each other gene in the data set to produce a correlation matrix of correlation coefficients. These correlation determinations may be performed directly, between expression of each pair of gene sequences, or indirectly, without direct comparison between the expression values of each pair of gene sequences.
  • a variety of correlation methodologies may be used in the correlation of expression data of individual gene sequences within the data set.
  • Non-limiting examples include parametric and non-parametric methods as well as methodologies based on mutual information and non-linear approaches.
  • Non-limiting examples of parametric approaches include Pearson correlation (or Pearson r, also referred to as linear or product-moment correlation) and cosine correlation.
  • Non-limiting examples of non-parametric methods include Spearman's R (or rank-order) correlation, Kendall's Tau correlation, and the Gamma statistic.
  • Each correlation methodology can be used to determine the level of correlation between the expressions of individual gene sequences in the data set. The correlation of all sequences with all other sequences is most readily considered as a matrix.
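  • As a non-limiting illustration, a correlation matrix using one parametric method (Pearson) and one non-parametric method (Spearman rank-order) may be sketched as follows. The sketch assumes that `expression[g]` holds the levels of gene sequence g across all samples; the function names are hypothetical, and a gene with constant expression across samples is not handled (its standard deviation is zero).

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def correlation_matrix(expression, method=pearson):
    """Matrix of pairwise coefficients; expression[g] is gene g across samples."""
    n = len(expression)
    return [[method(expression[i], expression[j]) for j in range(n)]
            for i in range(n)]
```

Passing `method=spearman` yields the rank-based matrix from the same data.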
  • the correlation coefficient r in the method is used as the indicator of the level of correlation.
  • A correlation coefficient analogous to r may be used, along with the recognition of equivalent levels of correlation corresponding to r being at or about 0.25 to being at or about 0.5.
  • the correlation coefficient may be selected as desired to reduce the number of correlated gene sequences to various numbers.
  • the selected coefficient value may be of about 0.25 or higher, about 0.3 or higher, about 0.35 or higher, about 0.4 or higher, about 0.45 or higher, or about 0.5 or higher.
  • the selection of a coefficient value means that where expression between gene sequences in the data set is correlated at that value or higher, they are possibly not included in a subset of the invention.
  • the method comprises excluding or removing (not using for classification) one or more gene sequences that are expressed in correlation, above a desired correlation coefficient, with another gene sequence in the tumor type data set. It is pointed out, however, that there can be situations of gene sequences that are not correlated with any other gene sequences, in which case they are not necessarily removed from use in classification.
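  • As a non-limiting illustration, exclusion of gene sequences correlated above a selected coefficient may be sketched as follows. The greedy, in-order selection and the use of the absolute value of the coefficient are assumptions made for this sketch and are not specified above; the function name `select_uncorrelated` is hypothetical.

```python
def select_uncorrelated(corr, threshold=0.5):
    """Return indices of gene sequences kept for classification.

    corr: square matrix of pairwise correlation coefficients.
    A gene is kept only if its correlation with every gene already kept
    is below `threshold` (absolute value, an assumption of this sketch).
    Genes not correlated with any other gene are therefore always kept.
    """
    kept = []
    for g in range(len(corr)):
        if all(abs(corr[g][h]) < threshold for h in kept):
            kept.append(g)
    return kept
```

Lowering the threshold from about 0.5 toward about 0.25 removes more gene sequences and so yields a smaller subset.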
  • the expression levels of gene sequences where more than about 10%, more than about 20%, more than about 30%, more than about 40%, more than about 50%, more than about 60%, more than about 70%, more than about 80%, or more than about 90% of the levels are not correlated with that of another one of the gene sequences used, may be used in the practice of the invention. Correlation between expression levels may be based upon a value below about 0.9, about 0.8, about 0.7, about 0.6, about 0.5, about 0.4, about 0.3, or about 0.2. The ability to classify among classes with exclusion of the expression levels of some gene sequences is present because expression of the gene sequences in the subset is correlated with expression of the gene sequences excluded from the subset.
  • expression of the gene sequences of the subset has information content relevant to properties and/or characteristics (or phenotype) of a cell.
  • This has application and relevance to the classification of additional tumor type classes not included as part of the original gene expression data set, which can be classified by use of a subset of the invention because of the redundancy of information between expression of sequences in the subset and sequences expressed in those additional classes.
  • the invention may be used to classify cells as being a tumor type beyond the plurality of known classes used to generate the original gene expression data set.
  • Selection of gene sequences based upon reducing correlation of expression to a particular tumor type may also be used. This also reflects a discovery of the present invention, based upon the observation that expression levels that were most highly correlated with one or more tumor types were not necessarily of greatest value in classification among different tumor types. This is reflected both by the ability to use randomly selected gene sequences for classification as well as the use of particular sequences, as described herein, which are not expressed with the most significant correlation with one or more tumor types. Thus the invention may be practiced without selection of gene sequences based upon the most significant P values or a ranking based upon correlation of gene expression and one or more tumor types. Likewise, the invention may be practiced without the use of ranking-based methodologies, such as the Kruskal-Wallis H-test.
  • the gene sequences used in the practice of the invention may include those which have been observed to be expressed in correlation with particular tumor types, such as expression of the estrogen receptor, which has been observed to be expressed in correlation with some breast and ovarian cancers.
  • the invention is practiced with use of the expression level of at least one gene sequence that has not been previously identified as being associated with any of the tumor types being classified.
  • the invention may be practiced without all of the gene sequences having previously been associated or correlated with expression in the 2 or more (up to 39 or more) tumor types to which a cell containing sample may be classified.
  • While the invention is described mainly with respect to human subjects, samples from other subjects may also be used. All that is necessary is the ability to assess the expression levels of gene sequences in a plurality of known tumor samples such that the expression levels in an unknown or test sample may be compared.
  • the invention may be applied to samples from any organism for which a plurality of expressed sequences, and a plurality of known tumor samples, are available.
  • One non-limiting example is application of the invention to mouse samples, based upon the availability of the mouse genome to permit detection of expressed murine sequences and the availability of known mouse tumor samples or the ability to obtain known samples.
  • the invention is contemplated for use with other samples, including those of mammals, primates, and animals used in clinical testing (such as rats, mice, rabbits, dogs, cats, and chimpanzees) as non-limiting examples.
  • a sample of the invention may be one that is suspected or known to contain tumor cells.
  • a sample of the invention may be a “tumor sample” or “tumor containing sample” or “tumor cell containing sample” of tissue or fluid isolated from an individual suspected of being afflicted with, or at risk of developing, cancer.
  • samples for use with the invention include a clinical sample, such as, but not limited to, a fixed sample, a fresh sample, or a frozen sample.
  • the sample may be an aspirate, a cytological sample (including blood or other bodily fluid), or a tissue specimen, which includes at least some information regarding the in situ context of cells in the specimen, so long as appropriate cells or nucleic acids are available for determination of gene expression levels.
  • the invention is based in part on the discovery that results obtained with frozen tissue sections can be validly applied to the situation with fixed tissue or cell samples and extended to fresh samples.
  • Non-limiting examples of fixed samples include those that are fixed with formalin or formaldehyde (including FFPE samples), with Bouin's, glutaraldehyde, acetone, alcohols, or any other fixative, such as those used to fix cell or tissue samples for immunohistochemistry (IHC).
  • fixatives include fixatives that precipitate cell associated nucleic acids and proteins.
  • non-frozen samples such as fixed samples, fresh samples, including cells from blood or other bodily fluid or tissue, and minimally treated samples.
  • the sample has not been classified using standard pathology techniques, such as, but not limited to, immunohistochemistry based assays.
  • the sample is classified as containing a tumor cell of a type selected from the following 53, and subsets thereof: Adenocarcinoma of Breast, Adenocarcinoma of Cervix, Adenocarcinoma of Esophagus, Adenocarcinoma of Gall Bladder, Adenocarcinoma of Lung, Adenocarcinoma of Pancreas, Adenocarcinoma of Small-Large Bowel, Adenocarcinoma of Stomach, Astrocytoma, Basal Cell Carcinoma of Skin, Cholangiocarcinoma of Liver, Clear Cell Adenocarcinoma of Ovary, Diffuse Large B-Cell Lymphoma, Embryonal Carcinoma of Testes, Endometrioid Carcinoma of Uterus, Ewings Sarcoma, Follicular Carcinoma of Thyroid, Gastrointestinal Stromal Tumor, Germ Cell Tumor of Ovary, Germ Cell
  • the sample is classified as containing a tumor cell of a type selected from the following 34, and subsets thereof: adrenal, brain, breast, carcinoid-intestine, cervix (squamous cell), cholangiocarcinoma, endometrium, germ-cell, GIST (gastrointestinal stromal tumor), kidney, leiomyosarcoma, liver, lung (adenocarcinoma, large cell), lung (small cell), lung (squamous), lymphoma (B cell), Lymphoma (Hodgkins), meningioma, mesothelioma, osteosarcoma, ovary (clear cell), ovary (serous cell), pancreas, prostate, skin (basal cell), skin (melanoma), small and large bowel; soft tissue (liposarcoma); soft tissue (MFH or Malignant Fibrous Histiocytoma), soft tissue (Sarcoma-synovial), testis (s
  • the sample is classified as containing a tumor cell of a type selected from the following 39, and subsets thereof: adrenal gland, brain, breast, carcinoid-intestine, cervix-adenocarcinoma, cervix-squamous, endometrium, gall bladder, germ cell-ovary, GIST, kidney, leiomyosarcoma, liver, lung-adenocarcinoma-large cell, lung-small cell, lung-squamous, lymphoma-B cell, lymphoma-Hodgkin's, lymphoma-T cell, meningioma, mesothelioma, osteosarcoma, ovary-clear cell, ovary-serous, pancreas, prostate, skin-basal cell, skin-melanoma, skin-squamous, small and large bowel, soft tissue-liposarcoma, soft tissue-MFH, soft tissue-sarcoma-synovi
  • the methods of the invention may also be applied to classify a cell containing sample as containing a tumor cell of a tumor of a subset of any of the above sets.
  • the size of the subset will usually be small, composed of two, three, four, five, six, seven, eight, nine, or ten of the tumor types described above.
  • the size of the subset may be any integral number up to the full size of the set.
  • embodiments of the invention include classification among 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, or 52 of the above types.
  • the subset will be composed of tumor types that are of the same tissue or organ type. Alternatively, the subset will be composed of tumor types of different tissues or organs. In some embodiments, the subset will include one or more types selected from adrenal gland, brain, carcinoid-intestine, cervix-adenocarcinoma, cervix-squamous, gall bladder, germ cell-ovary, GIST, leiomyosarcoma, liver, meningioma, osteosarcoma, skin-basal cell, skin-squamous, soft tissue-liposarcoma, soft tissue-MFH, soft tissue-sarcoma-synovial, testis-other (or non-seminoma), testis-seminoma, thyroid-follicular-papillary, and thyroid-medullary.
  • Classification among subsets of the above tumor types is demonstrated by the results shown in FIGS. 1-9, where the expression levels of as few as about 5 or more gene sequences can be used to classify among random samples of 2 tumor types among those in the set of 39 listed above. Expression levels of as few as about 20 to 49 gene sequences can be used to classify among all 39 tumor types with varying degrees of accuracy.
  • the invention may be practiced with the expression levels of about 10 or more, about 15 or more, about 20 or more, about 25 or more, about 30 or more, about 35 or more, about 40 or more, or about 45 or more to 49 transcribed sequences as found in the human “transcriptome” (transcribed portion of the genome).
  • the invention may also be practiced with expression levels of about 10-20 or more, about 20-30 or more, about 30-40 or more, about 40-50 or more, or 49 transcribed sequences.
  • The transcribed genes may be randomly picked or include all or some of the specific gene sequences disclosed herein.
  • classification with accuracies of about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, or about 95% or higher can be performed by use of the instant invention.
  • the gene expression levels of other gene sequences may be determined along with the above described determinations of expression levels for use in classification.
  • One non-limiting example of this is seen in the case of a microarray based platform to determine gene expression, where the expression of other gene sequences is also measured.
  • Where those other expression levels are not used in classification, they may be considered the results of “excess” transcribed sequences and not critical to the practice of the invention.
  • Alternatively, where those other expression levels are used in classification, they are within the scope of the invention, where the description of using particular numbers of sequences does not necessarily exclude the use of expression levels of additional sequences.
  • the invention includes the use of expression level(s) from one or more “excess” gene sequences, such as those which may provide information redundant to one or more other gene sequences used in a method of the invention.
  • the methods of the invention may be applied to classification of a tumor sample as being of a particular tissue or organ site of the patient. This application of the invention is particularly useful in cases where the sample is of a tumor that is the result of metastasis by another tumor.
  • the tumor sample is classified as being one of the following 24: Adrenal, Bladder, Bone, Brain, Breast, Cervix, Endometrium, Esophagus, Gall Bladder, Kidney, Larynx, Liver, Lung, Lymph Node, Ovary, Pancreas, Prostate, Skin, Soft Tissue, Small/Large Bowel, Stomach, Testes, Thyroid, and Uterus.
  • While the invention also provides for classification as one of the above tumor types based upon comparisons to the expression levels of sequences in the 39 tumor types, it is possible that a higher level of confidence in the classification is desired. If an increase in the confidence of the classification is preferred, the classification can be adjusted to identify the tumor sample as being of a particular origin or cell type as shown in FIG. 10. Thus an increase in confidence can be made in exchange for a decrease in specificity as to tumor type by identification of origin or cell type.
  • the classification of a cell containing sample as having a tumor cell of one of the 39 tumor types above inherently also classifies the tissue or organ site origin of the sample.
  • the identification of a sample as being cervix-squamous necessarily classifies the tumor as being of cervical origin, squamous cell type (and thus epithelial rather than non-epithelial in origin) as shown in FIG. 10 . It also means that the tumor was necessarily not germ cell in origin.
  • the methods of the invention may be applied to classification of a tumor sample as being of a particular tissue or organ site of a subject or patient. This application of the invention is particularly useful in cases where the sample is of a tumor that is the result of metastasis by another tumor.
  • the practice of the invention to classify a cell containing sample as having a tumor cell of one of the above types is by use of an appropriate classification algorithm that utilizes supervised learning to accept 1) the levels of expression of the gene sequences in a plurality of known tumor types as a training set and 2) the levels of expression of the same genes in one or more cells of a sample to classify the sample as having cells of one of the tumor types. Further discussion of this is provided in the Example section herein.
  • the levels of expression may be provided based upon the signals in any format, including nucleic acid expression or protein expression as described herein.
  • the range of classification is affected by the number of tumor types as well as the number of samples for each tumor type. But given adequate samples of the full range of human tumors as provided herein, the invention is readily applied to the classification of those tumor types as well as additional types.
  • Non-limiting examples of classification algorithms that may be used in the practice of the invention include supervised learning algorithms, machine learning algorithms, linear discriminant analysis, attribute selection algorithms, and artificial neural networks (ANN).
  • A distance-based classification algorithm, such as the k-nearest neighbor (KNN) algorithm, or a support vector machine (SVM) is used.
  • KNN can be used to analyze the expression data of the genes in a “training set” of known tumor samples including all 39 of the tumor types described herein.
  • the training data set can then be compared to the expression data for the same genes in a cell containing sample.
  • the expression levels of the genes in the sample are then compared to the training data set via KNN to identify those tumor samples with the most similar expression patterns.
  • the five “nearest neighbors” may be identified and the tumor types thereof used to classify the unknown tumor sample. Of course other numbers of “nearest neighbors” may be used. Non-limiting examples include less than 5, about 7, about 9, or about 11 or more “nearest neighbors”.
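  • As a non-limiting illustration, classification by the five “nearest neighbors” may be sketched as follows. The sketch assumes a training set of (tumor type, expression vector) pairs and Euclidean distance as the similarity measure; the distance measure and the function names are assumptions of the sketch, not requirements of the invention.

```python
from collections import Counter
from math import dist

def nearest_neighbors(training_set, sample, k=5):
    """Tumor-type labels of the k training samples closest to `sample`
    (Euclidean distance over the same genes in the same order)."""
    ranked = sorted(training_set, key=lambda item: dist(item[1], sample))
    return [label for label, _ in ranked[:k]]

def classify(training_set, sample, k=5):
    """Majority tumor type among the k nearest neighbors, together with
    the fraction of those neighbors that voted for it."""
    votes = Counter(nearest_neighbors(training_set, sample, k))
    label, count = votes.most_common(1)[0]
    return label, count / sum(votes.values())
```

Other values of `k`, such as 7, 9, or 11, may be passed in the same way; the returned vote fraction gives a simple indication of how unanimous the neighbors are.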
  • the classification of the sample as being of a B cell lymphoma can be made with great accuracy. This has been used with 84% or greater accuracy, such as 90%, as described in the Examples.
  • The classification ability may be combined with the inherent nature of the classification scheme to provide a means to increase the confidence of tumor classification in certain situations. For example, if the five “nearest neighbors” of a sample are three ovary clear cell and two ovary serous tumors, confidence can be improved by simply treating the tumors as being of ovarian origin and treating the subject or patient (from whom the sample was obtained) accordingly. See FIG. 10. This is an example of trading off specificity in favor of increased confidence. This provides the added benefit of addressing the possibility that the unknown sample was a mucinous or endometrioid tumor. Of course the skilled practitioner is free to treat the tumor as one or both of these two most likely possibilities and to proceed in accordance with that determination.
  • FIG. 10 may appear to be oversimplified. However, it serves as a good basis to relate known histopathology and to serve as a “guide tree” for analyzing and relating tumor-associated gene expression signatures.
  • the inherent nature of the classification scheme also provides a means to increase the confidence of tumor classification in cases wherein the “nearest neighbors” are ambiguous. For example, if the five “nearest neighbors” were one urinary bladder, one breast, one kidney, one liver, and one prostate, the classification can simply be that of a non-squamous cell tumor. Such a determination can be made with significant confidence and the subject or patient from whom the sample was obtained can be treated accordingly. Without being bound by theory, and offered solely to improve the understanding of the invention, the last two examples reflect the similarities in gene expression of cells of a similar cell type and/or tissue origin.
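  • As a non-limiting illustration, the use of the “guide tree” to trade specificity for confidence may be sketched as follows. The `PARENT` map encodes only part of the tree described for FIG. 10, with node names chosen for this sketch; the function names are hypothetical. When the “nearest neighbors” disagree, the sample is assigned to the most specific node shared by all of them.

```python
# Partial, illustrative encoding of the guide tree: child -> parent.
PARENT = {
    "germ cell": "tuo",
    "non-germ cell": "tuo",
    "epithelial": "non-germ cell",
    "non-epithelial": "non-germ cell",
    "squamous": "epithelial",
    "non-squamous": "epithelial",
    "cervix-squamous": "squamous",
    "urinary bladder": "non-squamous",
    "breast": "non-squamous",
    "kidney": "non-squamous",
    "liver": "non-squamous",
    "prostate": "non-squamous",
    "reproductive system": "non-squamous",
    "ovary": "reproductive system",
    "ovary-clear cell": "ovary",
    "ovary-serous": "ovary",
}

def path_to_root(node):
    """Nodes from `node` up to the root ("tuo"), most specific first."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path

def common_ancestor(labels):
    """Most specific node shared by every label's path to the root."""
    paths = [path_to_root(label) for label in labels]
    shared = set(paths[0]).intersection(*map(set, paths[1:]))
    return next(node for node in paths[0] if node in shared)
```

With three ovary clear cell and two ovary serous neighbors, `common_ancestor` returns the ovary node; with one neighbor each of urinary bladder, breast, kidney, liver, and prostate, it returns the non-squamous node, matching the two examples above.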
  • Embodiments of the invention include use of the methods and materials described herein to identify the origin of a cancer from a patient.
  • the tissue origin of the tumor cells is identified by use of the present invention.
  • One non-limiting example is in the case of a subject with an inflamed lymph node containing cancer cells.
  • the cells may be from a tissue or organ that drains into the lymph node or it may be from another tissue source.
  • the present invention may be used to classify the cells as being of a particular tumor or tissue type (or origin) which allows the identification of the source of the cancer cells.
  • the sample (such as that from a lymph node) contains cells, which are first assayed by use of the invention to classify at least one cell as being a tumor cell of a tissue type or origin. This is then used to identify the source of the cancer cells in the sample.
  • the invention is practiced with a sample from a subject with a previous history of cancer.
  • a cell containing sample (from the lymph node or elsewhere) of the subject may be found to contain cancer cells such that the present invention may be used to determine whether the cells are from the same or a different tissue from that of the previous cancer.
  • This application of the invention may also be used to identify a new primary tumor, such as the case where new cancer cells are found in the liver of a subject who previously had breast cancer.
  • the invention may be used to identify the new cancer cells as being the result of metastasis from the previous breast cancer (or from another tumor type, whether previously identified or not) or as a new primary occurrence of liver cancer.
  • the invention may also be applied to samples of a tissue or organ where multiple cancers are found to determine the origin of each cancer, as well as whether the cancers are of the same origin.
  • the invention includes a first group of 74 gene sequences from which about 5 to 49 may be used in the practice of the invention.
  • the 5 to 49 gene sequences may be used along with the determination of expression levels of additional sequences so long as the expression levels of gene sequences from the set of 74 are used in classifying.
  • a non-limiting example of such embodiments of the invention is where the expression of from about 5 to 49 of the 74 gene sequences is measured along with the expression levels of a plurality of other sequences, such as by use of a microarray based platform used to perform the invention.
  • mRNA sequences corresponding to a set of 74 gene sequences for use in the practice of the invention are provided in Example 6 (Sequence Listing) below along with additional identifying information.
  • the listing of the identifying information, including accession numbers and other information, is provided by the following.
  • Detection of expression of any of the above identified sequences, or of the sequences provided in Example 6 (Sequence Listing) below, may be performed by the detection of expression of any appropriate portion or fragment of these sequences.
  • the portions are sufficiently large to contain unique sequences relative to other sequences expressed in a cell containing sample.
  • the skilled person would recognize that the disclosed sequences represent one strand of a double stranded molecule and that either strand may be detected as an indicator of expression of the disclosed sequences.
  • the disclosed sequences are expressed as RNA molecules in cells which are preferably converted to cDNA molecules for ease of manipulation and detection.
  • the resultant cDNA molecules may have the sequences of the expressed RNA as well as those of the complementary strand thereto. Thus either the RNA sequence strand or the complementary strand may be detected.
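  • By way of non-limiting illustration only, the relationship between a disclosed strand and its complementary strand may be sketched as a reverse complement. The function name and the example sequence below are hypothetical and are not taken from the Sequence Listing.

```python
# Hypothetical helper: derive the complementary strand of a DNA/cDNA sequence.
# Only the four standard bases are handled; other symbols pass through unchanged.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    example = "ATGCCGTT"  # made-up sequence for illustration only
    print(example, "->", reverse_complement(example))
```

Either the input strand or the returned strand could serve as the detection target, consistent with the statement that either strand may be detected.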
  • the expression levels of gene sequences are measured by detection of expressed sequences in a cell containing sample as hybridizing to the following oligonucleotides, which correspond to the above sequences as indicated by the accession numbers provided.
  • the invention also provides a second group of 90 gene sequences from which about 5 to 49 may be used in the practice of the invention.
  • the about 5 to 49 gene sequences may be used along with the determination of expression levels of additional sequences so long as the expression levels of gene sequences from the set of 90 are used in classifying.
  • a non-limiting example of such embodiments of the invention is where the expression of about 5 to 49 of the 90 gene sequences is measured along with the expression levels of a plurality of other sequences, such as by use of a microarray based platform used to perform the invention. Where those other expression levels are not used in classification, they may be considered the results of “excess” transcribed sequences and not critical to the practice of the invention. Alternatively, and where those other expression levels are used in classification, they are within the scope of the invention, where the use of the above described sequences does not necessarily exclude the use of expression levels of additional sequences.
  • accession numbers of these members in common between the two sets are AA456140, AA846824, AA946776, AF332224, AI620495, AI632869, AI802118, AI804745, AJ000388, AK025181, AK027147, AL157475, AW194680, AW291189, AW298545, AW473119, BC000045, BC001293, BC001504, BC004453, BC006537, BC008765, BC009084, BC011949, BC012926, BC013117, BC015754, BE962007, BF224381, BF437393, BI493248, M60502, NM_000065, NM_003914, NM_004063, NM_004496, NM_006115, and R61469.
  • mRNA sequences corresponding to members of the set of 90 that are not present in the set of 74 gene sequences are also provided in Example 6 (Sequence Listing) along with additional identifying information.
  • the listing of the identifying information for these 52 unique members by accession numbers, as well as corresponding oligonucleotide sequences which may be used in the practice of the invention, is provided by the following.
  • the expression levels of gene sequences are measured by detection of expressed sequences in a cell containing sample as hybridizing to the above oligonucleotides, which correspond to sequences in Example 6 (Sequence Listing) as indicated by the accession numbers provided.
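  • By way of non-limiting illustration only, the bookkeeping between the set of 74 and the set of 90 may be checked as follows. The accession numbers are the members in common listed above; the function name and the dictionary keys are hypothetical.

```python
# Accession numbers of the members common to the set of 74 and the set of 90,
# as listed in the text above.
COMMON_ACCESSIONS = """
AA456140 AA846824 AA946776 AF332224 AI620495 AI632869 AI802118 AI804745
AJ000388 AK025181 AK027147 AL157475 AW194680 AW291189 AW298545 AW473119
BC000045 BC001293 BC001504 BC004453 BC006537 BC008765 BC009084 BC011949
BC012926 BC013117 BC015754 BE962007 BF224381 BF437393 BI493248 M60502
NM_000065 NM_003914 NM_004063 NM_004496 NM_006115 R61469
""".split()


def overlap_summary(size_first: int, size_second: int, n_common: int) -> dict:
    """Hypothetical helper: sizes of the unique and combined portions of two sets."""
    return {
        "unique_to_first": size_first - n_common,
        "unique_to_second": size_second - n_common,
        "combined": size_first + size_second - n_common,
    }


summary = overlap_summary(74, 90, len(set(COMMON_ACCESSIONS)))

if __name__ == "__main__":
    print(len(COMMON_ACCESSIONS), summary)
```

The count of members unique to the set of 90 obtained this way agrees with the 52 unique members referred to above.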
  • the invention provides for use of any number of the gene sequences of the set of 74 or the set of 90 in the methods of the invention.
  • anywhere from 1 to all of the 49 gene sequences used in the invention may be from either or both of the above sets. Thus one, two, three, four, five, or more of the about 5 to 49 sequences may be from the set of 74 or the set of 90.
  • six, seven, eight, nine, or ten of the sequences may be from one of these sets.
  • a “tumor sample” or “tumor containing sample” or “tumor cell containing sample” or variations thereof refer to cell containing samples of tissue or fluid isolated from an individual suspected of being afflicted with, or at risk of developing, cancer.
  • the samples may contain tumor cells which may be isolated by known methods or other appropriate methods as deemed desirable by the skilled practitioner. These include, but are not limited to, microdissection, laser capture microdissection (LCM), or laser microdissection (LMD) before use in the instant invention. Alternatively, undissected cells within a “section” of tissue may be used.
  • Non-limiting examples of such samples include primary isolates (in contrast to cultured cells) and may be collected by any non-invasive or minimally invasive means, including, but not limited to, ductal lavage, fine needle aspiration, needle biopsy, the devices and methods described in U.S. Pat. No. 6,328,709, or any other suitable means recognized in the art.
  • the sample may be collected by an invasive method, including, but not limited to, surgical biopsy.
  • The detection and measurement of transcribed sequences may be accomplished by a variety of means known in the art or as deemed appropriate by the skilled practitioner. Essentially, any assay method may be used as long as the assay reflects, quantitatively or qualitatively, expression of the transcribed sequence being detected.
  • the ability to classify tumor samples is provided by the recognition of the relevance of the level of expression of the gene sequences (whether randomly selected or specific) and not by the form of the assay used to determine the actual level of expression.
  • An assay of the invention may utilize any identifying feature of an individual gene sequence as disclosed herein as long as the assay reflects, quantitatively or qualitatively, expression of the gene in the “transcriptome” (the transcribed fraction of genes in a genome) or the “proteome” (the translated fraction of expressed genes in a genome).
  • Additional assays include those based on the detection of polypeptide fragments of the relevant member or members of the proteome. Non-limiting examples of the latter include detection of proteolytic fragments found in a biological fluid, such as blood or serum. Identifying features include, but are not limited to, unique nucleic acid sequences used to encode (DNA), or express (RNA), said gene or epitopes specific to, or activities of, a protein encoded by a gene sequence.
  • Additional means include detection of nucleic acid amplification as indicative of increased expression levels and nucleic acid inactivation, deletion, or methylation, as indicative of decreased expression levels.
  • the invention may be practiced by assaying one or more aspects of the DNA template(s) underlying the expression of each gene sequence, of the RNA used as an intermediate to express the sequence, or of the proteinaceous product expressed by the sequence, as well as proteolytic fragments of such products.
  • the detection of the presence of, amount of, stability of, or degradation (including rate) of, such DNA, RNA and proteinaceous molecules may be used in the practice of the invention.
  • all or part of a gene sequence may be amplified and detected by methods such as the polymerase chain reaction (PCR) and variations thereof, such as, but not limited to, quantitative PCR (Q-PCR), reverse transcription PCR (RT-PCR), and real-time PCR (including as a means of measuring the initial amounts of mRNA copies for each sequence in a sample), optionally real-time RT-PCR or real-time Q-PCR.
  • Such methods would utilize one or two primers that are complementary to portions of a gene sequence, where the primers are used to prime nucleic acid synthesis.
  • the newly synthesized nucleic acids are optionally labeled and may be detected directly or by hybridization to a polynucleotide of the invention.
  • the newly synthesized nucleic acids may be contacted with polynucleotides (containing gene sequences) of the invention under conditions which allow for their hybridization. Additional methods to detect the expression of expressed nucleic acids include RNAse protection assays, including liquid phase hybridizations, and in situ hybridization of cells.
  • the expression of gene sequences in FFPE samples may be detected as disclosed in U.S. applications 60/504,087, filed Sep. 19, 2003, Ser. No. 10/727,100, filed Dec. 2, 2003, and Ser. No. 10/773,761, filed Feb. 6, 2004 (all three of which are hereby incorporated by reference as if fully set forth).
  • the expression of all or part of an expressed gene sequence or transcript may be detected by use of hybridization mediated detection (such as, but not limited to, microarray, bead, or particle based technology) or quantitative PCR mediated detection (such as, but not limited to, real time PCR and reverse transcriptase PCR) as non-limiting examples.
  • the expression of all or part of an expressed polypeptide may be detected by use of immunohistochemistry techniques or other antibody mediated detection (such as, but not limited to, use of labeled antibodies that bind specifically to at least part of the polypeptide relative to other polypeptides) as non-limiting examples. Additional means for analysis of gene expression are available, including detection of expression within an assay for global, or near global, gene expression in a sample (e.g. as part of a gene expression profiling analysis such as on a microarray).
  • a nucleic acid based assay to determine expression includes immobilization of one or more gene sequences on a solid support, including, but not limited to, a solid substrate as an array or to beads or bead based technology as known in the art.
  • solution based expression assays known in the art may also be used.
  • the immobilized gene sequence(s) may be in the form of polynucleotides that are unique or otherwise specific to the gene(s) such that the polynucleotides would be capable of hybridizing to the DNA or RNA of said gene(s).
  • polynucleotides may be the full length of the gene(s) or be short sequences of the genes (up to one nucleotide shorter than the full length sequence known in the art by deletion from the 5′ or 3′ end of the sequence) that are optionally minimally interrupted (such as by mismatches or inserted non-complementary basepairs) such that hybridization with a DNA or RNA corresponding to the genes is not affected.
  • the polynucleotides used are from the 3′ end of the gene, such as within about 350, about 300, about 250, about 200, about 150, about 100, or about 50 nucleotides from the polyadenylation signal or polyadenylation site of a gene or expressed sequence.
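  • By way of non-limiting illustration only, selecting the portion of a transcript lying within a stated distance of the polyadenylation site may be sketched as follows. The function name, the 0-based coordinate convention, and the default distance are assumptions made for the example.

```python
def three_prime_window(transcript: str, polya_site: int, max_distance: int = 300) -> str:
    """Hypothetical helper: return the part of `transcript` within `max_distance`
    nucleotides upstream of `polya_site`.

    `polya_site` is assumed to be the 0-based index at which the poly A
    stretch begins, so the returned window excludes the poly A stretch itself.
    """
    if not 0 <= polya_site <= len(transcript):
        raise ValueError("polya_site must lie within the transcript")
    if max_distance < 0:
        raise ValueError("max_distance must be non-negative")
    start = max(0, polya_site - max_distance)
    return transcript[start:polya_site]


if __name__ == "__main__":
    # Made-up transcript: 500 nt of body, 100 nt of 3' region, 20 nt poly A.
    toy = "C" * 500 + "G" * 100 + "A" * 20
    print(len(three_prime_window(toy, polya_site=600, max_distance=100)))
```

A probe or amplicon could then be chosen from within the returned window; transcripts shorter than the stated distance are returned in full up to the site.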
  • Polynucleotides containing mutations relative to the sequences of the disclosed genes may also be used so long as the presence of the mutations still allows hybridization to produce a detectable signal.
  • the practice of the present invention is unaffected by the presence of minor mismatches between the disclosed sequences and those expressed by cells of a subject's sample.
  • a non-limiting example of the existence of such mismatches is seen in cases of sequence polymorphisms between individuals of a species, such as individual human patients within Homo sapiens.
  • some gene sequences include 3′ poly A (or poly T on the complementary strand) stretches that do not contribute to the uniqueness of the disclosed sequences.
  • the invention may thus be practiced with gene sequences lacking the 3′ poly A (or poly T) stretches.
  • the uniqueness of the disclosed sequences refers to the portions or entireties of the sequences which are found only in nucleic acids, including unique sequences found at the 3′ untranslated portion thereof.
  • Some unique sequences for the practice of the invention are those which contribute to the consensus sequences for the genes such that the unique sequences will be useful in detecting expression in a variety of individuals rather than being specific for a polymorphism present in some individuals.
  • sequences unique to an individual or a subpopulation may be used.
  • the unique sequences may be the lengths of polynucleotides of the invention as described herein.
  • polynucleotides having sequences present in the 3′ untranslated and/or non-coding regions of gene sequences are used to detect expression levels in cell containing samples of the invention.
  • Such polynucleotides may optionally contain sequences found in the 3′ portions of the coding regions of gene sequences.
  • Polynucleotides containing a combination of sequences from the coding and 3′ non-coding regions preferably have the sequences arranged contiguously, with no intervening heterologous sequence(s).
  • the invention may be practiced with polynucleotides having sequences present in the 5′ untranslated and/or non-coding regions of gene sequences to detect the level of expression in cells and samples of the invention.
  • polynucleotides may optionally contain sequences found in the 5′ portions of the coding regions.
  • Polynucleotides containing a combination of sequences from the coding and 5′ non-coding regions may have the sequences arranged contiguously, with no intervening heterologous sequence(s).
  • the invention may also be practiced with sequences present in the coding regions of gene sequences.
  • the polynucleotides of some embodiments contain sequences from 3′ or 5′ untranslated and/or non-coding regions of at least about 16, at least about 18, at least about 20, at least about 22, at least about 24, at least about 26, at least about 28, at least about 30, at least about 32, at least about 34, at least about 36, at least about 38, at least about 40, at least about 42, at least about 44, or at least about 46 consecutive nucleotides.
  • the term “about” as used in the previous sentence refers to an increase or decrease of 1 from the stated numerical value.
  • polynucleotides containing sequences of at least or about 50, at least or about 100, at least or about 150, at least or about 200, at least or about 250, at least or about 300, at least or about 350, or at least or about 400 consecutive nucleotides.
  • the term “about” as used in the preceding sentence refers to an increase or decrease of 10% from the stated numerical value.
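  • By way of non-limiting illustration only, the two meanings of “about” given above (an increase or decrease of 1 for the shorter lengths, and of 10% for the longer lengths) may be expressed numerically as follows. The function names are hypothetical.

```python
def about_plus_minus_one(value: int) -> tuple[int, int]:
    """Range implied by "about" for the shorter lengths (16 to 46 nucleotides)."""
    return (value - 1, value + 1)


def about_plus_minus_ten_percent(value: int) -> tuple[float, float]:
    """Range implied by "about" for the longer lengths (50 to 400 nucleotides)."""
    delta = value / 10  # 10% of the stated value
    return (value - delta, value + delta)


if __name__ == "__main__":
    for n in (16, 46):
        print(n, about_plus_minus_one(n))
    for n in (50, 400):
        print(n, about_plus_minus_ten_percent(n))
```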
  • Sequences from the 3′ or 5′ end of gene coding regions as found in polynucleotides of the invention are of the same lengths as those described above, except that they would naturally be limited by the length of the coding region.
  • the 3′ end of a coding region may include sequences up to the 3′ half of the coding region.
  • the 5′ end of a coding region may include sequences up to the 5′ half of the coding region.
  • sequences, or the coding regions and polynucleotides containing portions thereof may be used in their entireties.
  • polynucleotides containing deletions of nucleotides from the 5′ and/or 3′ end of gene sequences may be used.
  • the deletions are preferably of 1-5, 5-10, 10-15, 15-20, 20-25, 25-30, 30-35, 35-40, 40-45, 45-50, 50-60, 60-70, 70-80, 80-90, 90-100, 100-125, 125-150, 150-175, or 175-200 nucleotides from the 5′ and/or 3′ end, although the extent of the deletions would naturally be limited by the length of the sequences and the need to be able to use the polynucleotides for the detection of expression levels.
  • primers and optional probes for quantitative PCR are those which amplify a region less than about 750, less than about 700, less than about 650, less than about 600, less than about 550, less than about 500, less than about 450, less than about 400, less than about 350, less than about 300, less than about 250, less than about 200, less than about 150, less than about 100, or less than about 50 nucleotides from the polyadenylation signal or polyadenylation site of a gene or expressed sequence.
  • the size of a PCR amplicon of the invention may be of any size, including at least or about 50, at least or about 100, at least or about 150, at least or about 200, at least or about 250, at least or about 300, at least or about 350, or at least or about 400 consecutive nucleotides, all with inclusion of the portion complementary to the PCR primers used.
  • polynucleotides for use in the practice of the invention include those that have sufficient homology to gene sequences to detect their expression by use of hybridization techniques. Such polynucleotides preferably have about or 95%, about or 96%, about or 97%, about or 98%, or about or 99% identity with the gene sequences to be used. Identity is determined using the BLAST algorithm, as described above.
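  • By way of non-limiting illustration only, a percent identity comparison against the stated thresholds may be sketched as follows. This sketch is not the BLAST algorithm: it assumes, for simplicity, two sequences that have already been aligned to equal length without gaps, and the function names are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Position-by-position identity of two pre-aligned, equal-length sequences.

    Simplifying assumption: no gaps and no alignment step, unlike BLAST.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def meets_identity_threshold(a: str, b: str, threshold: float = 95.0) -> bool:
    """True when the identity is at least the chosen threshold (95% to 99% above)."""
    return percent_identity(a, b) >= threshold


if __name__ == "__main__":
    reference = "A" * 100            # made-up sequences for illustration
    candidate = "A" * 95 + "C" * 5
    print(percent_identity(reference, candidate))
```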
  • polynucleotides for use in the practice of the invention may also be described on the basis of the ability to hybridize to polynucleotides of the invention under stringent conditions of about 30% v/v to about 50% formamide and from about 0.01M to about 0.15M salt for hybridization and from about 0.01M to about 0.15M salt for wash conditions at about 55 to about 65° C. or higher, or conditions equivalent thereto.
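  • By way of non-limiting illustration only, the stated stringent conditions may be recorded and checked as follows. The class, field, and function names are hypothetical; the bounds are treated as exact for simplicity even though the text qualifies each with “about”, and “conditions equivalent thereto” are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridizationConditions:
    formamide_pct_vv: float          # formamide, % v/v
    hybridization_salt_molar: float  # salt during hybridization, M
    wash_salt_molar: float           # salt during wash, M
    temp_celsius: float              # temperature, degrees C


def within_stated_stringency(c: HybridizationConditions) -> bool:
    """Check a set of conditions against the ranges stated in the text."""
    return (
        30.0 <= c.formamide_pct_vv <= 50.0
        and 0.01 <= c.hybridization_salt_molar <= 0.15
        and 0.01 <= c.wash_salt_molar <= 0.15
        and c.temp_celsius >= 55.0  # "about 55 to about 65 C or higher"
    )


if __name__ == "__main__":
    print(within_stated_stringency(HybridizationConditions(50.0, 0.015, 0.015, 60.0)))
```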
  • a population of single stranded nucleic acid molecules comprising one or both strands of a human gene sequence is provided as a probe such that at least a portion of said population may be hybridized to one or both strands of a nucleic acid molecule quantitatively amplified from RNA of a cell or sample of the invention.
  • the population may be only the antisense strand of a human gene sequence such that a sense strand of a molecule from, or amplified from, a cell may be hybridized to a portion of said population.
  • the population preferably comprises a sufficiently excess amount of said one or both strands of a human gene sequence in comparison to the amount of expressed (or amplified) nucleic acid molecules containing a complementary gene sequence.
  • the invention further provides a method of classifying a human tumor sample by detecting the expression levels of about 5 to 49 transcribed sequences in a nucleic acid or cell containing sample obtained from a human subject, and classifying the sample as containing a tumor cell of a tumor type found in humans to the exclusion of one or more other human tumor types.
  • the method may be used to classify a sample as being, or having cells of, one of the 53 tumor types listed above to the exclusion of one or more of the other 52.
  • the method is used to classify a sample as being, or having cells of, one of the 34 tumor types listed above to the exclusion of one or more of the other 33 tumor types.
  • the method is used to classify a sample as being, or having cells of, one of the 39 tumor types listed above to the exclusion of one or more of the other 38 tumor types.
  • the invention also provides a method for classifying tumor samples as being one of a subset of the possible tumor types described herein by detecting the expression levels of 50 or more transcribed sequences in a nucleic acid containing tumor sample obtained from a human subject, and classifying the sample as being one of a number of tumor types found in humans to the exclusion of one or more other human tumor types.
  • the number of other tumor types is from 1 to about 3, more preferably from 1 to about 5, from 1 to about 7, or from 1 to about 9 or about 10.
  • the number of tumor types are all of the same tissue or organ origin such as those listed above. This aspect of the invention is related to the above discussion of FIG.
  • the invention may be practiced by analyzing gene expression from single cells or homogenous cell populations which have been dissected away from, or otherwise isolated or purified from, contaminating cells of a sample as present in a simple biopsy.
  • contaminating, non-tumor cells such as infiltrating lymphocytes or other immune system cells
  • Such contamination is present where a biopsy is used to generate gene expression profiles.
  • the expression levels of gene sequences of the invention may be compared to expression levels of reference genes in the same sample or a ratio of expression levels may be used. This provides a means to “normalize” the expression data for comparison of data on a plurality of known tumor types and a cell containing sample to be assayed. While a variety of reference genes may be used, the invention may also be practiced with the use of 8 particular reference gene sequences that were identified for use with the set of 39 tumor types. Moreover, the Q-PCR may be performed in whole or in part with use of a multiplex format.
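  • By way of non-limiting illustration only, normalization against reference genes by use of a ratio may be sketched as follows. The function name, the choice of the arithmetic mean of the reference levels as the denominator, and the gene identifiers are assumptions made for the example.

```python
from statistics import mean


def normalize_to_references(levels: dict[str, float], reference_ids: list[str]) -> dict[str, float]:
    """Divide each non-reference expression level by the mean reference level.

    `levels` maps a gene identifier to a measured expression level from one sample.
    """
    ref_mean = mean(levels[r] for r in reference_ids)
    if ref_mean <= 0:
        raise ValueError("mean reference expression must be positive")
    return {g: v / ref_mean for g, v in levels.items() if g not in reference_ids}


if __name__ == "__main__":
    # Made-up values for a single sample.
    sample = {"geneA": 20.0, "geneB": 5.0, "ref1": 8.0, "ref2": 12.0}
    print(normalize_to_references(sample, ["ref1", "ref2"]))
```

Applying the same normalization to the known tumor type data and to the sample to be assayed places both on a comparable scale.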
  • mRNA sequences corresponding to the 8 reference sequences are provided in Example 6 (Sequence Listing) along with additional identifying information.
  • Detection of expression of any of the above reference sequences may be by the same or different methodology as for the other gene sequences described above.
  • the expression levels of gene sequences are measured by detection of expressed sequences in a cell containing sample as hybridizing to the following oligonucleotides, which correspond to the above sequences as indicated by the accession numbers provided.
  • the methods provided by the present invention may also be automated in whole or in part.
  • Non-limiting examples include processor executable instructions on one or more computer readable storage devices wherein said instructions direct the classification of tumor samples based upon gene expression levels as described herein.
  • Additional processor executable instructions on one or more computer readable storage devices are contemplated wherein said instructions cause representation and/or manipulation, via a computer output device, of the process or results of a classification method.
  • the invention includes software and hardware embodiments wherein the gene expression data of a set of gene sequences in a plurality of known tumor types is embodied as a data set.
  • the gene expression data set is used for the practice of a method of the invention.
  • the invention also provides computer related means and systems for performing the methods disclosed herein.
  • an apparatus for classifying a cell containing sample is provided.
  • Such an apparatus may comprise a query input configured to receive a query; a storage configured to store a gene expression data set, as described herein, received from a query input; and a module for accessing and using data from the storage in a classification algorithm as described herein.
  • the apparatus may further comprise a string storage for the results of the classification algorithm, optionally with a module for accessing and using data from the string storage in an output algorithm as described herein.
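  • By way of non-limiting illustration only, the arrangement of a storage, a classification module, and a storage for results may be sketched in software as follows. The class and method names are hypothetical, and the nearest-profile rule by Euclidean distance is merely a stand-in chosen for the example, not the classification algorithm of the invention.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ClassifierApparatus:
    # Storage: tumor type -> expression profile over the same ordered gene set.
    reference_profiles: dict[str, list[float]]
    # Storage for the results of the classification algorithm.
    results: list[str] = field(default_factory=list)

    def classify(self, query_profile: list[float]) -> str:
        """Classification module: return the stored tumor type whose profile
        is closest to the query (stand-in rule, Euclidean distance)."""
        best = min(
            self.reference_profiles,
            key=lambda t: math.dist(self.reference_profiles[t], query_profile),
        )
        self.results.append(best)
        return best


if __name__ == "__main__":
    # Made-up profiles over three genes.
    apparatus = ClassifierApparatus({"typeA": [1.0, 5.0, 2.0], "typeB": [4.0, 1.0, 6.0]})
    print(apparatus.classify([1.2, 4.5, 2.1]))
```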
  • steps of a method, process, or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two.
  • the various steps or acts in a method or process may be performed in the order shown, or may be performed in another order. Additionally, one or more process or method steps may be omitted or one or more process or method steps may be added to the methods and processes. An additional step, block, or action may be added in the beginning, end, or intervening existing elements of the methods and processes.
  • a further aspect of the invention provides for the use of the present invention in relation to clinical activities.
  • the determination or measurement of gene expression as described herein is performed as part of providing medical care to a patient, including the providing of diagnostic services in support of providing medical care.
  • the invention includes a method in the medical care of a patient, the method comprising determining or measuring expression levels of gene sequences in a cell containing sample obtained from a patient as described herein.
  • the method may further comprise the classifying of the sample, based on the determination/measurement, as including a tumor cell of a tumor type or tissue origin in a manner as described herein.
  • the determination and/or classification may be for use in relation to any aspect or embodiment of the invention as described herein.
  • the determination or measurement of expression levels may be preceded by a variety of related actions.
  • the measurement is preceded by a determination or diagnosis of a human subject as in need of said measurement.
  • the measurement may be preceded by a determination of a need for the measurement, such as that by a medical doctor, nurse or other health care provider or professional, or those working under their instruction, or personnel of a health insurance or maintenance organization in approving the performance of the measurement as a basis to request reimbursement or payment for the performance.
  • the measurement may also be preceded by preparatory acts necessary to the actual measuring.
  • Non-limiting examples include the actual obtaining of a cell containing sample from a human subject; or receipt of a cell containing sample; or sectioning a cell containing sample; or isolating cells from a cell containing sample; or obtaining RNA from cells of a cell containing sample; or reverse transcribing RNA from cells of a cell containing sample.
  • the sample may be any as described herein for the practice of the invention.
  • the invention provides for a method of ordering, or receiving an order for, the performance of a method in the medical care of a patient or other method of the invention.
  • the ordering may be made by a medical doctor, a nurse, or other health care provider, or those working under their instruction, while the receiving, directly or indirectly, may be made by any person who performs the method(s).
  • the ordering may be by any means of communication, including communication that is written, oral, electronic, digital, analog, telephonic, in person, by facsimile, by mail, or otherwise passes through a jurisdiction within the United States.
  • the invention further provides methods in the processing of reimbursement or payment for a test, such as the above method in the medical care of a patient or other method of the invention.
  • a method in the processing of reimbursement or payment may comprise indicating that 1) payment has been received, or 2) payment will be made by another payer, or 3) payment remains unpaid on paper or in a database after performance of an expression level detection, determination or measurement method of the invention.
  • the database may be in any form, with electronic forms such as a computer implemented database included within the scope of the invention.
  • the indicating may be in the form of a code (such as a CPT code) on paper or in the database.
  • the “another payer” may be any person or entity beyond that to whom a previous request for reimbursement or payment was made.
  • the method may comprise receiving reimbursement or payment for the technical or actual performance of the above method in the medical care of a patient; for the interpretation of the results from said method; or for any other method of the invention.
  • the invention also includes embodiments comprising instructing another person or party to receive the reimbursement or payment.
  • the ordering may be by any communication means, including those described above.
  • the receipt may be from any entity, including an insurance company, health maintenance organization, governmental health agency, or a patient as non-limiting examples.
  • the payment may be in whole or in part. In the case of a patient, the payment may be in the form of a partial payment known as a co-pay.
  • the method may comprise forwarding or having forwarded a reimbursement or payment request to an insurance company, health maintenance organization, governmental health agency, or to a patient for the performance of the above method in the medical care of a patient or other method of the invention.
  • the request may be by any communication means, including those described above.
  • the method may comprise receiving indication of approval for payment, or denial of payment, for performance of the above method in the medical care of a patient or other method of the invention.
  • Such an indication may come from any person or party to whom a request for reimbursement or payment was made.
  • Non-limiting examples include an insurance company, health maintenance organization, or a governmental health agency such as Medicare or Medicaid.
  • the indication may be by any communication means, including those described above.
  • An additional embodiment is where the method comprises sending a request for reimbursement for performance of the above method in the medical care of a patient or other method of the invention.
  • a request may be made by any communication means, including those described above.
  • the request may have been made to an insurance company, health maintenance organization, federal health agency, or the patient for whom the method was performed.
  • a further method comprises indicating the need for reimbursement or payment on a form or into a database for performance of the above method in the medical care of a patient or other method of the invention.
  • the method may simply indicate the performance of the method.
  • the database may be in any form, with electronic forms such as a computer implemented database included within the scope of the invention.
  • the indicating may be in the form of a code (such as a CPT code) on paper or in the database.
  • the method may comprise reporting the results of the method, optionally to a health care facility, a health care provider or professional, a doctor, a nurse, or personnel working therefor.
  • the reporting may also be directly or indirectly to the patient.
  • the reporting may be by any means of communication, including those described above.
  • kits for the determination or measurement of gene expression levels in a cell containing sample as described herein.
  • a kit will typically comprise one or more reagents to detect gene expression as described herein for the practice of the present invention.
  • Non-limiting examples include polynucleotide probes or primers for the detection of expression levels, one or more enzymes used in the methods of the invention, and one or more tubes for use in the practice of the invention.
  • the kit will include an array, or solid media capable of being assembled into an array, for the detection of gene expression as described herein.
  • the kit may comprise one or more antibodies that are immunoreactive with epitopes present on a polypeptide which indicates expression of a gene sequence.
  • the antibody will be an antibody fragment.
  • kits of the invention may also include instructional materials disclosing or describing the use of the kit or a primer or probe of the present invention in a method of the invention as provided herein.
  • a kit may also include additional components to facilitate the particular application for which the kit is designed.
  • a kit may additionally contain means of detecting the label (e.g. enzyme substrates for enzymatic labels, filter sets to detect fluorescent labels, appropriate secondary labels such as a sheep anti-mouse-HRP, or the like).
  • a kit may additionally include buffers and other reagents recognized for use in a method of the invention.
  • Subsets of 100 randomly selected expressed gene sequences used to classify among 39 tumor types were tested for their ability to classify among subsets of the 39 tumor types.
  • the expression levels of random combinations of 5, 10, 15, 20, 25, 30, 35, 40, 45, and 49 (each combination sampled 10 times) of the 100 expressed sequences were used with data from tumor types and then used to predict random test sets of tumor samples (each sampled 10 times) ranging from 2 to all 39 types.
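  • By way of non-limiting illustration only, the drawing of random gene combinations described above may be sketched as follows. The numbering of the 100 sequences from 1 to 100, the fixed seed, and the function name are assumptions made for the example; the classification step itself is not shown.

```python
import random

GENE_INDICES = list(range(1, 101))            # the 100 expressed sequences, numbered 1..100
SUBSET_SIZES = list(range(5, 50, 5)) + [49]   # 5, 10, ..., 45, 49
DRAWS_PER_SIZE = 10                           # each combination size sampled 10 times


def draw_gene_subsets(seed: int = 0) -> dict[int, list[list[int]]]:
    """Draw, for each subset size, 10 random combinations of distinct genes."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    return {
        size: [sorted(rng.sample(GENE_INDICES, size)) for _ in range(DRAWS_PER_SIZE)]
        for size in SUBSET_SIZES
    }


if __name__ == "__main__":
    subsets = draw_gene_subsets()
    for size, draws in subsets.items():
        print(size, draws[0])
```

Random subsets of the 39 tumor types, ranging from 2 to all 39 types, could be drawn in the same manner.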
  • FIG. 1 shows the classification capability of various gene sets relative to the number of tumor types classified. As expected, a higher number of gene sequences is needed to classify tumor types with higher accuracy.
  • FIG. 2 shows the classification performance for various numbers of tumor types relative to the number of gene sequences used.
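The resampling design described above (random gene subsets of each size, each drawn 10 times, scored by classification accuracy) can be sketched as follows. This is a minimal illustration only: the nearest-centroid classifier is a stand-in chosen for brevity and is not stated in the text, the expression data are synthetic placeholders, and all function names (`sample_gene_subsets`, `nearest_centroid_accuracy`, `make_synthetic_samples`) are hypothetical.

```python
import random
from statistics import mean

GENE_COUNTS = [5, 10, 15, 20, 25, 30, 35, 40, 45, 49]  # subset sizes named in the text
N_GENES = 100    # size of the full gene set
N_REPEATS = 10   # each combination is sampled 10 times


def sample_gene_subsets(rng, n_genes=N_GENES, sizes=GENE_COUNTS, repeats=N_REPEATS):
    """Return {size: [sorted 1-based gene index lists]}, each drawn without replacement."""
    return {
        size: [sorted(rng.sample(range(1, n_genes + 1), size)) for _ in range(repeats)]
        for size in sizes
    }


def nearest_centroid_accuracy(train, test, genes):
    """Stand-in classifier (an assumption): nearest centroid on the chosen genes.

    train and test are lists of (expression_vector, tumor_type) pairs. Gene
    indices are 1-based, so gene g maps to vector position g - 1.
    """
    cols = [g - 1 for g in genes]
    by_type = {}
    for vec, label in train:
        by_type.setdefault(label, []).append([vec[c] for c in cols])
    centroids = {
        label: [mean(col) for col in zip(*rows)] for label, rows in by_type.items()
    }

    def predict(vec):
        point = [vec[c] for c in cols]
        return min(
            centroids,
            key=lambda label: sum((p - q) ** 2 for p, q in zip(point, centroids[label])),
        )

    hits = sum(predict(vec) == label for vec, label in test)
    return hits / len(test)


def make_synthetic_samples(rng, n_types=4, per_type=6, n_genes=N_GENES):
    """Synthetic expression data (placeholder only, not data from the specification)."""
    samples = []
    for tumor_type in range(1, n_types + 1):
        profile = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for _ in range(per_type):
            samples.append(([x + rng.gauss(0.0, 0.3) for x in profile], tumor_type))
    return samples


rng = random.Random(0)
subsets = sample_gene_subsets(rng)
data = make_synthetic_samples(rng)
train, test = data[::2], data[1::2]  # alternate samples so every type is in both halves

# Mean accuracy over the 10 random gene subsets of each size.
mean_accuracy = {
    size: mean(nearest_centroid_accuracy(train, test, genes) for genes in gene_sets)
    for size, gene_sets in subsets.items()
}
```

Plotting `mean_accuracy` against subset size would give a curve of the general kind the figures describe, although the values here reflect only the synthetic data.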
  • GenBank accession numbers of the 100 gene sequences are AF269223, BC006286, AK025501, AJ002367, AI469140, AW013883, NM_001238, AI476350, BC006546, AI041212, BF724944, AI376951, R56211, BC006393, X13274, BC001133, N62397, BC000885, AK001588, AK057901, AF146760, AI951287, AK025604, BC007581, BC015025, R43102, AW449550, AI922539, AI684144, AI277662, BC015999, AW444656, BC011612, BC015401, BF447279, BC009956, AL050163, BC001248, BE672684, AL137353, BC001340, U45975, BE856598, BC009060, AL137728, AA713797, AL583913, AK0546
  • genes 33 and 63, genes 17 and 72, genes 64 and 21, genes 48 and 25, genes 88 and 54, genes 80 and 32, genes 24 and 99, genes 14 and 31, genes 80 and 23, and genes 18 and 34 were used as the 10 random sets.
  • genes 27, 97, 56, 88, and 50 were used.
  • genes 24, 26, 35, 48, and 83 were used.
  • genes 46, 62, 75, 91, and 2 were used.
  • genes 19, 61, 34, 87, and 13 were used.
  • genes 56, 32, 66, 20, and 55 were used.
  • genes 90, 21, 6, 78, and 66 were used.
  • genes 73, 47, 3, 82, and 86 were used.
  • genes 74, 39, 13, 7, and 67 were used.
  • genes 34, 1, 24, 85, and 62 were used.
  • genes 23, 89, 15, 54, and 98 were used.
  • genes 11, 58, 90, 40, 20, 44, 10, 78, 72, and 74 were used.
  • genes 79, 71, 42, 48, 93, 56, 55, 14, 92, and 52 were used.
  • genes 62, 53, 52, 19, 98, 26, 76, 65, 33, and 40 were used.
  • genes 94, 8, 16, 99, 58, 19, 97, 92, 76, and 86 were used.
  • genes 18, 97, 16, 94, 84, 52, 11, 24, 89, and 92 were used.
  • genes 12, 42, 45, 51, 2, 75, 63, 28, 13, and 58 were used.
  • genes 67, 98, 55, 32, 82, 42, 2, 45, 37, and 23 were used.
  • genes 40, 43, 69, 68, 13, 97, 35, 3, 44, and 42 were used.
  • genes 69, 47, 96, 80, 100, 50, 42, 26, 65, and 17 were used.
  • genes 83, 84, 69, 67, 19, 85, 35, 11, 70, and 64 were used.
  • genes 98, 81, 43, 63, 18, 56, 19, 97, 47, 13, 48, 99, 75, 45, and 83 were used.
  • genes 5, 72, 31, 59, 81, 40, 92, 3, 23, 50, 57, 74, 62, 21, and 93 were used.
  • genes 11, 69, 91, 100, 38, 1, 73, 64, 90, 26, 62, 2, 37, 23, and 18 were used.
  • genes 76, 9, 53, 4, 11, 41, 77, 44, 87, 51, 54, 49, 43, 56, and 67 were used.
  • genes 55, 34, 13, 89, 52, 74, 96, 80, 48, 22, 31, 39, 43, 91, and 54 were used.
  • genes 59, 88, 15, 90, 4, 73, 93, 7, 10, 18, 98, 83, 43, 3, and 5 were used.
  • genes 68, 91, 77, 33, 88, 94, 95, 41, 46, 27, 36, 51, 97, 7, and 2 were used.
  • genes 7, 10, 78, 40, 70, 84, 55, 1, 98, 22, 99, 91, 8, 17, and 89 were used.
  • genes 65, 10, 38, 8, 77, 98, 37, 43, 93, 99, 86, 16, 82, 27, and 9 were used.
  • genes 97, 27, 78, 38, 24, 19, 55, 47, 77, 13, 45, 25, 43, 70, and 68 were used.
  • genes 41, 94, 38, 76, 35, 65, 92, 26, 49, 7, 85, 54, 77, 66, 98, 15, 86, 69, 70, and 67 were used.
  • genes 43, 87, 1, 81, 7, 14, 94, 28, 25, 55, 100, 41, 18, 47, 96, 89, 26, 53, 29, and 32 were used.
  • genes 48, 80, 90, 99, 50, 98, 36, 91, 6, 41, 61, 96, 74, 66, 9, 5, 16, 18, 20, and 1 were used.
  • genes 49, 58, 73, 24, 94, 22, 41, 52, 18, 19, 63, 91, 74, 37, 59, 95, 53, 87, 72, and 13 were used.
  • genes 67, 74, 2, 98, 46, 69, 5, 42, 22, 66, 60, 20, 100, 80, 24, 76, 63, 9, 39, and 15 were used.
  • genes 10, 74, 50, 92, 69, 68, 52, 56, 63, 71, 11, 17, 29, 64, 88, 59, 25, 94, 35, and 57 were used.
  • genes 97, 72, 16, 19, 14, 42, 70, 31, 29, 13, 22, 37, 95, 69, 87, 39, 18, 81, 58, and 100 were used.
  • genes 5, 3, 18, 91, 77, 19, 82, 31, 92, 22, 93, 45, 76, 84, 46, 100, 53, 99, 89, and 42 were used.
  • genes 62, 3, 85, 37, 34, 93, 52, 40, 74, 25, 86, 57, 33, 60, 20, 77, 78, 17, 28, and 13 were used.
  • genes 22, 26, 23, 39, 35, 10, 43, 32, 65, 38, 54, 45, 8, 17, 90, 20, 83, 60, 6, and 58 were used.
  • genes 21, 28, 50, 27, 8, 48, 74, 80, 38, 96, 71, 15, 89, 84, 32, 26, 55, 36, 29, 68, 13, 7, 18, 63, and 72 were used.
  • genes 61, 38, 59, 92, 3, 80, 33, 68, 79, 70, 44, 26, 95, 63, 85, 27, 60, 43, 75, 96, 42, 99, 58, 48, and 91 were used.
  • genes 75, 83, 78, 5, 99, 56, 26, 36, 57, 23, 37, 28, 88, 16, 63, 2, 72, 59, 9, 80, 52, 91, 62, 3, and 27 were used.
  • genes 48, 75, 84, 83, 88, 29, 13, 9, 98, 6, 31, 63, 45, 5, 51, 52, 39, 22, 100, 91, 74, 12, 94, 21, and 8 were used.
  • genes 79, 84, 47, 43, 26, 37, 46, 19, 85, 91, 2, 10, 81, 89, 38, 71, 17, 57, 7, 93, 31, 87, 29, 78, and 73 were used.
  • genes 62, 93, 83, 42, 97, 96, 78, 98, 47, 22, 67, 48, 89, 95, 24, 81, 16, 45, 8, 90, 66, 64, 2, 3, and 58 were used.
  • genes 100, 34, 58, 28, 104, 35, 88, 76, 6, 30, 83, 81, 67, 36, 39, 87, 66, 45, 20, 15, 86, 56, 55, and 95 were used.
  • genes 17, 43, 50, 63, 47, 58, 95, 32, 79, 60, 16, 91, 86, 22, 97, 21, 9, 55, 72, 78, 77, 45, 100, 14, and 30 were used.
  • genes 24, 67, 60, 94, 59, 14, 70, 84, 8, 89, 63, 23, 39, 11, 81, 42, 33, 3, 12, 93, 54, 35, 78, 73, and 90 were used.
  • genes 11, 2, 19, 62, 13, 51, 30, 80, 81, 82, 52, 34, 67, 57, 25, 95, 93, 39, 26, 48, 44, 89, 61, 17, and 18 were used.
  • genes 30, 97, 54, 21, 34, 9, 56, 71, 62, 14, 24, 23, 89, 61, 76, 41, 29, 67, 94, 22, 88, 4, 40, 33, 38, 78, 82, 66, 84, and 100 were used.
  • genes 89, 41, 56, 43, 98, 44, 35, 26, 19, 86, 15, 67, 8, 69, 3, 76, 48, 17, 55, 31, 25, 91, 72, 36, 18, 82, 37, 50, 9, and 75 were used.
  • genes 28, 39, 78, 15, 65, 93, 66, 29, 88, 35, 49, 69, 50, 9, 53, 80, 81, 95, 76, 44, 48, 64, 83, 11, 70, 33, 73, 96, 56, and 92 were used.
  • genes 4, 2, 19, 6, 11, 84, 94, 44, 60, 37, 29, 97, 53, 83, 98, 45, 65, 9, 85, 35, 20, 89, 10, 17, 23, 74, 70, 41, 18, and 76 were used.
  • genes 27, 4, 43, 1, 10, 95, 88, 74, 77, 47, 63, 81, 31, 9, 41, 100, 87, 57, 8, 79, 24, 6, 26, 20, 55, 61, 34, 42, 25, and 39 were used.
  • genes 47, 67, 98, 56, 37, 44, 5, 70, 48, 12, 20, 86, 83, 89, 27, 59, 19, 54, 69, 97, 43, 71, 58, 82, 8, 50, 51, 10, 25, and 72 were used.
  • genes 100, 99, 37, 58, 44, 60, 39, 3, 59, 96, 50, 68, 94, 69, 83, 90, 17, 4, 5, 67, 88, 56, 29, 79, 23, 1, 38, 25, 49, and 74 were used.
  • genes 26, 23, 58, 47, 6, 68, 41, 31, 16, 64, 19, 75, 36, 32, 87, 2, 12, 97, 73, 21, 53, 78, 15, 94, 1, 20, 79, 81, 70, and 7 were used.
  • genes 61, 48, 78, 75, 12, 36, 37, 66, 91, 2, 92, 32, 8, 26, 6, 82, 14, 68, 4, 88, 39, 89, 43, 41, 40, 87, 69, 74, 42, and 9 were used.
  • genes 58, 99, 60, 39, 50, 25, 22, 57, 48, 85, 24, 10, 97, 68, 36, 38, 93, 62, 52, 56, 34, 18, 32, 64, 95, 81, 74, 88, 61, and 96 were used.
  • genes 52, 68, 22, 92, 43, 75, 20, 62, 15, 76, 99, 61, 64, 36, 12, 66, 24, 21, 31, 88, 25, 6, 93, 91, 55, 74, 69, 90, 23, 4, 80, 72, 97, 58, and 1 were used.
  • genes 48, 21, 68, 16, 96, 10, 1, 69, 36, 20, 3, 14, 59, 53, 12, 84, 90, 17, 9, 65, 4, 32, 75, 81, 88, 37, 38, 5, 94, 60, 64, 45, 7, 43, and 55 were used.
  • genes 33, 95, 59, 86, 83, 76, 36, 55, 90, 22, 62, 98, 34, 46, 4, 87, 5, 66, 38, 78, 97, 100, 71, 25, 30, 2, 21, 99, 12, 54, 9, 14, 81, 32, and 52 were used.
  • genes 27, 64, 40, 59, 63, 100, 50, 19, 1, 10, 96, 2, 34, 28, 67, 26, 87, 41, 15, 57, 33, 11, 94, 66, 82, 6, 52, 55, 84, 47, 97, 83, 80, 62, and 5 were used.
  • genes 99, 86, 92, 72, 83, 48, 79, 46, 91, 2, 90, 9, 23, 44, 85, 31, 38, 81, 76, 54, 71, 14, 3, 13, 62, 11, 39, 4, 95, 36, 20, 30, 75, 63, and 51 were used.
  • genes 41, 89, 81, 29, 86, 95, 34, 42, 50, 9, 45, 21, 64, 84, 74, 91, 69, 98, 57, 79, 39, 87, 93, 63, 26, 82, 2, 59, 30, 71, 83, 38, 77, 24, and 73 were used.
  • genes 87, 60, 59, 98, 43, 38, 28, 64, 29, 92, 22, 27, 40, 33, 69, 71, 73, 79, 15, 70, 32, 90, 76, 93, 6, 50, 55, 9, 49, 54, 36, 5, 48, 19, and 10 were used.
  • genes 100, 70, 98, 79, 91, 23, 37, 29, 73, 65, 78, 31, 3, 11, 30, 51, 16, 40, 95, 94, 62, 38, 67, 39, 82, 72, 22, 5, 87, 57, 6, 75, 35, 99, and 46 were used.
  • genes 46, 61, 59, 86, 29, 74, 56, 89, 52, 26, 54, 20, 84, 97, 33, 71, 14, 36, 38, 49, 28, 60, 19, 90, 11, 42, 87, 92, 82, 21, 94, 3, 22, 2, and 39 were used.
  • genes 31, 76, 77, 27, 72, 38, 42, 36, 53, 82, 61, 39, 98, 81, 34, 80, 22, 100, 8, 32, 17, 21, 28, 56, 59, 29, 55, 5, 62, 40, 90, 87, 24, 68, and 37 were used.
  • genes 64, 50, 46, 22, 51, 6, 47, 12, 2, 30, 45, 7, 63, 55, 91, 90, 80, 49, 71, 8, 79, 82, 77, 76, 97, 5, 95, 11, 32, 70, 20, 62, 38, 26, 41, 58, 44, 87, 35, and 23 were used.
  • genes 44, 26, 16, 12, 30, 45, 71, 90, 37, 68, 32, 70, 58, 43, 51, 6, 62, 92, 87, 20, 56, 5, 47, 48, 86, 29, 98, 22, 59, 76, 8, 79, 64, 14, 50, 3, 54, 83, 96, and 80 were used.
  • genes 20, 34, 57, 70, 39, 15, 25, 33, 78, 51, 87, 46, 67, 80, 28, 52, 66, 72, 22, 88, 97, 3, 90, 6, 82, 42, 41, 94, 85, 61, 54, 84, 14, 9, 81, 19, 7, 91, 23, and 40 were used.
  • genes 61, 46, 64, 71, 35, 58, 100, 23, 95, 17, 87, 68, 54, 8, 50, 4, 27, 49, 47, 52, 53, 28, 24, 34, 45, 2, 89, 48, 3, 65, 42, 9, 92, 36, 6, 84, 51, 60, 77, and 94 were used.
  • genes 28, 97, 21, 43, 22, 89, 94, 87, 99, 5, 4, 20, 13, 61, 37, 42, 72, 62, 7, 12, 31, 23, 60, 98, 48, 38, 53, 56, 29, 69, 26, 82, 24, 74, 86, 10, 67, 2, 47, and 46 were used.
  • genes 12, 74, 96, 77, 78, 72, 53, 87, 47, 29, 40, 98, 52, 22, 69, 3, 58, 97, 60, 48, 55, 80, 57, 39, 50, 89, 71, 9, 63, 51, 21, 23, 73, 32, 20, 19, 25, 5, 38, and 46 were used.
  • genes 88, 79, 54, 44, 37, 36, 32, 91, 47, 50, 60, 92, 82, 80, 46, 19, 98, 20, 76, 29, 9, 95, 2, 77, 97, 74, 90, 73, 100, 1, 34, 85, 24, 71, 57, 99, 68, 13, 43, and 53 were used.
  • genes 23, 39, 7, 64, 20, 27, 69, 43, 38, 89, 50, 3, 16, 79, 83, 72, 65, 66, 32, 30, 100, 82, 28, 22, 54, 84, 53, 75, 59, 37, 34, 49, 12, 86, 71, 97, 26, 88, 70, and 57 were used.
  • genes 74, 96, 80, 39, 40, 82, 38, 56, 35, 93, 55, 73, 44, 17, 81, 27, 2, 83, 65, 89, 76, 8, 18, 45, 58, 77, 14, 49, 21, 6, 4, 92, 33, 13, 12, 88, 98, 24, 84, and 36 were used.
  • genes 35, 77, 48, 62, 26, 12, 41, 68, 81, 5, 37, 70, 28, 72, 50, 83, 64, 99, 74, 57, 84, 76, 52, 14, 87, 97, 3, 31, 73, 58, 44, 24, 15, 66, 45, 91, 4, 32, 46, and 49 were used.
  • genes 52, 97, 84, 72, 96, 34, 18, 38, 88, 80, 91, 49, 71, 64, 93, 26, 62, 40, 68, 29, 67, 39, 60, 9, 13, 74, 95, 99, 27, 47, 25, 45, 31, 8, 69, 17, 75, 53, 51, 12, 23, 1, 6, 30, and 50 were used.
  • genes 97, 80, 55, 32, 94, 84, 28, 3, 6, 48, 17, 41, 65, 37, 79, 34, 61, 83, 35, 49, 27, 38, 43, 2, 24, 77, 25, 71, 58, 14, 8, 30, 46, 98, 82, 75, 22, 72, 26, 74, 93, 66, 73, 1, and 53 were used.
  • genes 64, 45, 38, 92, 23, 74, 66, 60, 100, 3, 82, 20, 54, 11, 19, 16, 80, 86, 14, 75, 62, 10, 52, 47, 13, 31, 35, 53, 41, 9, 79, 39, 17, 22, 99, 58, 46, 83, 43, 40, 44, 90, 95, 12, and 81 were used.
  • genes 20, 66, 9, 24, 16, 76, 99, 42, 86, 58, 15, 93, 48, 28, 26, 50, 68, 12, 2, 37, 82, 36, 27, 57, 45, 41, 32, 1, 52, 54, 30, 39, 7, 100, 59, 23, 94, 75, 8, 60, 55, 34, 38, 29, and 87 were used.
  • genes 66, 88, 73, 53, 51, 69, 36, 87, 78, 40, 58, 76, 31, 65, 56, 42, 100, 68, 5, 18, 17, 91, 45, 22, 74, 82, 1, 44, 67, 43, 10, 63, 79, 92, 6, 72, 80, 75, 9, 30, 19, 61, 99, 3, and 38 were used.
  • genes 75, 66, 84, 59, 9, 70, 100, 27, 79, 41, 73, 67, 23, 39, 28, 68, 21, 69, 38, 72, 86, 82, 36, 46, 77, 34, 47, 54, 13, 16, 7, 88, 22, 26, 4, 89, 55, 24, 61, 12, 35, 50, 95, 92, and 80 were used.
  • genes 59, 86, 10, 29, 53, 88, 43, 64, 11, 13, 19, 17, 36, 65, 73, 94, 20, 51, 80, 24, 66, 83, 44, 47, 21, 6, 52, 82, 69, 54, 100, 28, 18, 34, 35, 30, 74, 91, 49, 46, 60, 5, 38, 71, and 2 were used.
  • genes 77, 32, 55, 44, 6, 98, 94, 19, 10, 71, 72, 85, 67, 75, 78, 88, 90, 58, 89, 27, 69, 42, 31, 47, 1, 37, 52, 7, 57, 45, 11, 83, 49, 46, 34, 64, 14, 24, 87, 9, 56, 8, 20, 36, and 15 were used.
  • genes 4, 27, 83, 61, 46, 15, 35, 26, 51, 54, 23, 38, 100, 7, 42, 58, 44, 8, 22, 37, 20, 89, 56, 91, 70, 29, 11, 19, 87, 99, 21, 65, 72, 75, 49, 40, 45, 30, 43, 48, 63, 3, 18, 74, and 1 were used.
  • genes 68, 19, 90, 52, 55, 23, 17, 53, 3, 2, 74, 82, 26, 88, 48, 6, 8, 43, 15, 73, 57, 67, 85, 91, 13, 44, 81, 1, 75, 33, 51, 21, 4, 41, 77, 86, 40, 18, 31, 78, 92, 10, 64, 99, and 69 were used.
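A listed gene set can be sanity-checked for size, index range, and repeated entries. The sketch below assumes that the gene numbers are 1-based positions in the list of 100 gene sequences, which the text implies but does not state outright; `check_gene_set` is a hypothetical helper name.

```python
def check_gene_set(genes, expected_size, n_genes=100):
    """Return a list of problems found; an empty list means the set is consistent."""
    problems = []
    if len(genes) != expected_size:
        problems.append(f"expected {expected_size} genes, found {len(genes)}")
    out_of_range = [g for g in genes if not 1 <= g <= n_genes]
    if out_of_range:
        problems.append(f"indices outside 1..{n_genes}: {out_of_range}")
    duplicates = sorted({g for g in genes if genes.count(g) > 1})
    if duplicates:
        problems.append(f"repeated indices: {duplicates}")
    return problems


five_gene_set = [27, 97, 56, 88, 50]  # one of the 5-gene sets listed above
report = check_gene_set(five_gene_set, expected_size=5)
```

An empty `report` means the set has the expected size, stays within the assumed index range, and contains no repeats.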
  • Classification of subsets of the 39 tumor types was performed with use of random selections of tumor types from the group of 39.
  • The expression levels of gene sequence sets as described herein were used to classify random combinations of tumor types. Different random sets of tumor types were used with each of the sets of 100, 74, and 90 gene sequences as described in these examples. Representative, non-limiting examples of random sets of from 2 to 20 tumor types are as follows, where the 39 tumor types were indexed from 1 to 39.
  • Set 2 used types 36, 1, 28 and 19. Set 3 used types 13, 4, 12 and 21.
  • Set 4 used types 12, 33, 14 and 28. Set 5 used types 6, 28, 5 and 37.
  • Set 9 used types 18, 10, 8 and 9. Set 10 used types 28, 20, 2 and 22.
  • For 6 tumor types, set 1 used types 27, 3, 10, 39, 11 and 20.
  • For 12 tumor types, set 1 used types 26, 20, 4, 12, 2, 31, 38, 18, 16, 39, 3 and 33.
  • Set 1 used types 27, 15, 8, 12, 6, 20, 26, 19, 25, 2, 37, 38, 7, 39, 4 and 33.
  • Set 2 used types 17, 18, 28, 5, 6, 31, 25, 13, 8, 20, 37, 36, 35, 9, 23 and 27.
  • Set 3 used types 23, 37, 34, 14, 16, 27, 32, 33, 21, 38, 4, 30, 24, 22, 17 and 25.
  • Set 4 used types 7, 37, 38, 21, 34, 31, 32, 25, 10, 36, 19, 11, 6, 26, 18 and 35.
  • Set 6 used types 14, 21, 5, 17, 6, 20, 18, 35, 22, 10, 3, 23, 13, 2, 34 and 26.
  • Set 7 used types 1, 8, 19, 6, 9, 39, 28, 18, 13, 31, 14, 16, 37, 12, 3 and 25.
  • set 1 used types 15, 24, 39, 35, 7, 30, 16, 13, 20, 3, 26, 4, 12, 10, 34, 25, 21 and 28.
  • Set 2 used types 21, 23, 29, 11, 10, 19, 13, 28, 4, 20, 17, 24, 30, 12, 39, 34, 31 and 9.
  • Set 3 used types 7, 17, 27, 6, 30, 8, 22, 2, 32, 26, 21, 14, 4, 38, 1, 35, 16 and 28.
  • Set 4 used types 17, 13, 20, 33, 10, 3, 16, 22, 1, 38, 2, 9, 28, 5, 6, 19, 12 and 11.
  • Set 5 used types 35, 21, 25, 18, 17, 8, 14, 31, 30, 9, 1, 2, 23, 36, 29, 32 and 37.
  • Set 6 used types 17, 34, 2, 18, 19, 15, 16, 13, 4, 24, 5, 35, 6, 22, 28, 37, 38 and 1.
  • set 1 used types 25, 13, 21, 15, 37, 20, 12, 28, 9, 10, 26, 22, 14, 24, 16, 7, 39, 34, 33 and 4.
  • Set 2 used types 20, 17, 10, 27, 19, 28, 5, 1, 23, 21, 38, 7, 13, 22, 32, 31, 9, 4, 3 and 24.
  • Set 3 used types 17, 13, 7, 20, 11, 38, 34, 3, 15, 12, 5, 39, 9, 10, 4, 35, 27, 6, 21 and 33.
  • Set 4 used types 6, 13, 17, 26, 1, 7, 33, 5, 10, 32, 3, 23, 35, 4, 14, 28, 12, 38, 8 and 27.
  • Set 6 used types 30, 24, 21, 11, 23, 25, 8, 9, 7, 31, 27, 5, 14, 29, 1, 19, 16, 12, 22 and 17.
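Random tumor-type sets like those listed above can be drawn as in the following sketch, which assumes uniform sampling without replacement from the 39 indexed types (the text says only that the sets were random). The helper names `sample_tumor_type_sets` and `restrict_to_types` are hypothetical, and the toy samples are placeholders.

```python
import random

N_TUMOR_TYPES = 39  # tumor types are indexed from 1 to 39


def sample_tumor_type_sets(rng, n_selected, n_sets=10, n_types=N_TUMOR_TYPES):
    """Draw n_sets random sets, each holding n_selected distinct tumor-type indices."""
    return [rng.sample(range(1, n_types + 1), n_selected) for _ in range(n_sets)]


def restrict_to_types(samples, tumor_types):
    """Keep only the (expression_vector, tumor_type) pairs whose type was selected."""
    keep = set(tumor_types)
    return [(vec, label) for vec, label in samples if label in keep]


rng = random.Random(1)
# Ten random sets for every set size from 2 up to all 39 types.
sets_by_size = {k: sample_tumor_type_sets(rng, k) for k in range(2, N_TUMOR_TYPES + 1)}

# Toy samples (placeholders): one single-value "expression vector" per tumor type.
toy_samples = [([float(t)], t) for t in range(1, N_TUMOR_TYPES + 1)]
subset = restrict_to_types(toy_samples, sets_by_size[6][0])
```

Each restricted sample list would then be passed to whichever classifier is in use, so that accuracy can be reported per number of tumor types.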
  • Random subsets of about 5 to 49 members of the set of 74 expressed gene sequences were evaluated in a manner analogous to that described in Example 3. Again, the expression levels of random combinations of 5, 10, 15, 20, 25, 30, 35, 40, 45, and 49 (each combination sampled 10 times) of the 74 expressed sequences were used with data from tumor types and then used to predict test random sets of tumor samples (each sampled 10 times) ranging from 2 to all 39 types. The resulting data are shown in FIGS. 4-6.
  • genes 64 and 6 were used for 2 genes.
  • for set 2, genes 64 and 13 were used.
  • genes 67 and 51 were used; for set 4, genes 51 and 29 were used.
  • genes 46 and 12 were used; for set 6, genes 68 and 65 were used.
  • genes 6 and 28 were used; for set 8, genes 9 and 55 were used.
  • genes 55 and 71 were used; for set 10, genes 63 and 39 were used.
  • genes 8, 64, 50, 54, and 4 were used for 5 genes.
  • genes 39, 17, 45, 34, and 15 were used.
  • genes 10, 4, 61, 21, and 55 were used.
  • genes 59, 37, 21, 23, and 64 were used.
  • genes 69, 8, 25, 59, and 63 were used.
  • genes 45, 71, 19, 59, and 38 were used.
  • genes 21, 43, 14, 48, and 30 were used.
  • genes 73, 35, 36, 10, and 9 were used.
  • genes 62, 28, 11, 70, and 64 were used.
  • genes 8, 16, 70, 18, and 59 were used.
  • genes 49, 72, 38, 68, 52, 21, 1, 10, 2, and 40 were used.
  • genes 54, 70, 28, 64, 68, 41, 44, 20, 7, and 2 were used.
  • genes 71, 49, 51, 11, 18, 53, 8, 42, 36, and 58 were used.
  • genes 72, 15, 35, 3, 23, 8, 2, 48, 22, and 65 were used.
  • genes 44, 19, 6, 22, 38, 5, 37, 9, 30, and 14 were used.
  • genes 15, 27, 3, 10, 31, 19, 44, 39, 48, and 46 were used.
  • genes 70, 30, 9, 33, 63, 71, 32, 34, 20, and 7 were used.
  • genes 45, 29, 54, 58, 15, 21, 68, 5, 42, and 62 were used.
  • genes 74, 17, 66, 46, 10, 8, 63, 5, 24, and 2 were used.
  • genes 33, 2, 34, 19, 60, 71, 42, 51, 70, and 66 were used.
  • genes 13, 22, 26, 67, 64, 40, 68, 71, 4, 28, 24, 33, 46, 69, and 41 were used.
  • genes 10, 1, 14, 70, 71, 64, 46, 67, 45, 48, 65, 74, 34, 49, and 37 were used.
  • genes 58, 30, 44, 40, 51, 36, 33, 60, 39, 21, 54, 64, 25, 13, and 35 were used.
  • genes 63, 70, 60, 32, 31, 16, 49, 65, 38, 5, 72, 47, 40, 2, and 46 were used.
  • genes 43, 6, 40, 13, 39, 72, 68, 41, 27, 73, 36, 25, 33, 34, and 1 were used.
  • genes 68, 67, 71, 59, 73, 62, 31, 43, 7, 44, 21, 48, 54, 58, and 6 were used.
  • genes 16, 50, 61, 62, 27, 2, 21, 1, 41, 28, 68, 35, 17, 47, and 46 were used.
  • genes 27, 18, 44, 66, 2, 20, 53, 64, 46, 70, 57, 7, 51, 10, and 45 were used.
  • genes 65, 8, 43, 23, 50, 46, 21, 41, 44, 3, 31, 17, 7, 66, and 70 were used.
  • genes 16, 14, 61, 51, 39, 33, 43, 31, 53, 65, 74, 42, 29, 9, and 11 were used.
  • genes 14, 60, 6, 71, 74, 16, 62, 39, 56, 44, 32, 72, 18, 42, 66, 49, 1, 9, 69, and 21 were used.
  • genes 23, 1, 7, 27, 26, 71, 12, 4, 22, 69, 62, 44, 6, 25, 57, 28, 33, 9, 21, and 51 were used.
  • genes 46, 48, 29, 54, 55, 69, 73, 47, 6, 27, 24, 21, 15, 43, 45, 7, 62, 25, 22, and 74 were used.
  • genes 12, 65, 24, 73, 45, 57, 49, 63, 61, 1, 58, 10, 2, 18, 8, 51, 67, 69, 59, and 13 were used.
  • genes 33, 43, 9, 52, 54, 38, 8, 16, 48, 1, 39, 60, 17, 6, 15, 66, 68, 63, 37, and 42 were used.
  • genes 43, 19, 44, 28, 56, 34, 66, 42, 73, 40, 65, 38, 54, 20, 51, 37, 30, 35, 53, and 61 were used.
  • genes 61, 6, 20, 4, 34, 53, 70, 38, 35, 46, 36, 16, 1, 23, 68, 12, 59, 71, 65, and 14 were used.
  • genes 25, 68, 69, 3, 33, 49, 19, 56, 54, 4, 32, 6, 45, 16, 67, 52, 65, 14, 12, and 40 were used.
  • genes 47, 7, 36, 32, 61, 74, 14, 45, 26, 51, 69, 12, 41, 42, 64, 25, 27, 57, 23, and 58 were used.
  • genes 27, 13, 3, 17, 51, 7, 37, 43, 20, 12, 52, 21, 25, 2, 5, 32, 62, 47, 4, and 26 were used.
  • genes 57, 61, 31, 38, 3, 7, 72, 43, 32, 23, 28, 71, 48, 17, 2, 49, 10, 30, 66, 12, 69, 41, 20, 63, and 68 were used.
  • genes 18, 54, 47, 57, 24, 42, 66, 46, 16, 58, 37, 60, 62, 9, 2, 27, 36, 52, 13, 32, 45, 6, 43, 21, and 56 were used.
  • genes 47, 48, 52, 16, 56, 54, 42, 37, 17, 41, 35, 21, 6, 9, 63, 10, 49, 68, 23, 25, 70, 3, 58, 2, and 31 were used.
  • genes 50, 10, 25, 16, 68, 15, 29, 73, 27, 63, 3, 17, 28, 66, 19, 13, 4, 9, 36, 48, 23, 57, 59, 26, and 14 were used.
  • genes 40, 39, 43, 49, 66, 15, 14, 29, 36, 21, 19, 44, 72, 58, 69, 12, 11, 9, 37, 46, 32, 51, 3, 24, and 6 were used.
  • genes 42, 49, 44, 32, 46, 35, 70, 40, 3, 21, 11, 67, 25, 56, 37, 43, 60, 55, 16, 27, 30, 53, 63, 23, and 33 were used.
  • genes 70, 27, 68, 17, 64, 65, 18, 69, 10, 67, 42, 23, 48, 14, 31, 11, 55, 25, 52, 34, 13, 45, 12, 29, and 47 were used.
  • genes 48, 10, 17, 27, 25, 55, 12, 62, 30, 65, 15, 49, 70, 14, 54, 24, 33, 26, 50, 60, 6, 40, 67, 11, and 2 were used.
  • genes 41, 47, 24, 59, 7, 44, 2, 67, 12, 19, 13, 17, 35, 56, 28, 14, 61, 15, 60, 58, 1, 64, 31, 45, and 23 were used.
  • genes 42, 72, 41, 38, 57, 27, 4, 13, 9, 43, 34, 28, 8, 62, 64, 46, 12, 70, 21, 66, 16, 7, 48, 3, and 54 were used.
  • genes 16, 47, 67, 9, 22, 10, 64, 72, 46, 6, 60, 74, 3, 68, 57, 63, 14, 54, 58, 30, 28, 18, 70, 73, 52, 39, 34, 61, 12, and 21 were used.
  • genes 18, 1, 44, 24, 68, 26, 62, 10, 47, 67, 37, 55, 32, 35, 34, 14, 49, 30, 17, 16, 51, 45, 74, 31, 9, 57, 66, 39, 53, and 8 were used.
  • genes 58, 45, 55, 39, 22, 32, 9, 49, 31, 13, 51, 56, 28, 12, 3, 59, 74, 35, 42, 67, 69, 47, 66, 18, 52, 57, 43, 5, 26, and 4 were used.
  • genes 45, 1, 74, 12, 18, 23, 59, 27, 38, 40, 72, 56, 50, 20, 52, 32, 5, 16, 9, 21, 60, 64, 49, 70, 30, 61, 6, 10, 31, and 24 were used.
  • genes 60, 53, 7, 32, 73, 25, 69, 48, 17, 45, 16, 3, 14, 9, 37, 41, 72, 43, 68, 39, 20, 51, 59, 23, 6, 15, 74, 19, 31, and 66 were used.
  • genes 47, 54, 9, 38, 60, 33, 40, 12, 57, 45, 26, 56, 11, 27, 67, 25, 69, 59, 68, 7, 61, 72, 23, 21, 28, 48, 29, 65, 37, and 15 were used.
  • genes 21, 42, 30, 57, 65, 59, 53, 74, 45, 66, 68, 41, 19, 24, 8, 10, 61, 43, 38, 67, 37, 47, 40, 22, 63, 35, 70, 72, 5, and 6 were used.
  • genes 58, 11, 28, 36, 24, 34, 53, 9, 44, 23, 51, 70, 22, 17, 15, 59, 5, 60, 1, 64, 21, 50, 35, 52, 31, 43, 38, 39, 32, and 62 were used.
  • genes 43, 30, 63, 7, 60, 40, 39, 1, 48, 17, 69, 57, 6, 62, 19, 38, 36, 13, 66, 64, 25, 31, 65, 47, 27, 16, 53, 68, 37, and 41 were used.
  • genes 22, 17, 4, 2, 37, 16, 49, 7, 63, 64, 14, 15, 74, 43, 25, 54, 46, 50, 53, 67, 39, 62, 59, 10, 55, 72, 65, 52, 58, and 19 were used.
  • genes 4, 43, 55, 49, 13, 26, 32, 21, 18, 50, 14, 20, 65, 7, 24, 52, 58, 8, 30, 37, 54, 71, 2, 31, 44, 61, 66, 67, 28, 39, 10, 70, 17, 19, and 45 were used.
  • genes 14, 13, 67, 21, 48, 28, 69, 47, 50, 3, 68, 63, 22, 41, 60, 61, 5, 44, 56, 65, 7, 66, 15, 6, 45, 2, 36, 5, 30, 72, 34, 46, 24, 29, and 12 were used.
  • genes 67, 25, 58, 11, 17, 16, 3, 69, 21, 1, 59, 26, 72, 41, 47, 2, 34, 24, 10, 19, 33, 5, 50, 9, 71, 20, 62, 8, 68, 61, 23, 37, 35, 60, and 32 were used.
  • genes 5, 30, 14, 1, 59, 27, 28, 51, 55, 61, 18, 37, 17, 73, 6, 44, 67, 12, 35, 11, 53, 72, 70, 25, 21, 7, 34, 13, 74, 43, 52, 39, 54, 2, and 19 were used.
  • genes 56, 64, 58, 35, 1, 23, 43, 4, 73, 28, 54, 6, 51, 68, 49, 37, 16, 71, 3, 21, 48, 69, 70, 10, 26, 22, 50, 44, 2, 60, 38, 40, 66, 63, and 65 were used.
  • genes 72, 49, 51, 44, 19, 28, 1, 11, 3, 40, 33, 41, 70, 29, 48, 62, 50, 4, 47, 60, 68, 10, 61, 32, 20, 13, 22, 59, 65, 64, 67, 21, 35, 39, and 24 were used.
  • genes 14, 35, 31, 20, 8, 59, 50, 15, 52, 62, 19, 30, 71, 68, 72, 47, 38, 74, 36, 49, 73, 22, 41, 25, 69, 16, 32, 24, 51, 43, 65, 3, 6, 53, and 29 were used.
  • genes 22, 44, 23, 9, 26, 56, 72, 59, 35, 61, 51, 69, 64, 30, 53, 27, 11, 55, 39, 67, 48, 28, 14, 10, 8, 12, 40, 24, 57, 34, 50, 32, 42, 41, and 38 were used.
  • genes 15, 7, 27, 6, 67, 9, 26, 57, 30, 37, 58, 23, 42, 11, 36, 52, 32, 29, 62, 21, 41, 61, 64, 18, 40, 35, 66, 1, 2, 56, 16, 3, 55, 10, and 51 were used.
  • genes 9, 14, 71, 25, 44, 37, 49, 46, 66, 53, 7, 33, 22, 12, 73, 50, 27, 24, 13, 5, 41, 51, 61, 16, 28, 56, 23, 20, 10, 8, 70, 48, 42, 52, and 34 were used.
  • genes 26, 36, 43, 30, 62, 19, 20, 51, 41, 71, 1, 63, 10, 56, 65, 17, 15, 50, 5, 35, 4, 54, 12, 70, 48, 31, 47, 37, 34, 8, 3, 69, 40, 44, 46, 59, 61, 74, 23, and 27 were used.
  • genes 1, 4, 38, 24, 37, 69, 21, 52, 13, 2, 63, 51, 30, 16, 27, 58, 74, 20, 32, 53, 59, 31, 50, 10, 42, 8, 54, 36, 5, 47, 70, 41, 12, 46, 28, 19, 35, 9, 61, and 48 were used.
  • genes 35, 48, 40, 47, 20, 67, 57, 72, 15, 17, 46, 37, 9, 2, 60, 30, 65, 49, 29, 64, 16, 21, 7, 74, 61, 11, 58, 71, 62, 23, 24, 55, 3, 53, 52, 27, 18, 50, 25, and 66 were used.
  • genes 35, 10, 59, 19, 27, 40, 30, 4, 9, 52, 2, 29, 26, 41, 55, 17, 13, 53, 71, 63, 58, 44, 45, 62, 70, 16, 64, 48, 43, 8, 38, 72, 49, 37, 18, 36, 74, 42, 46, and 54 were used.
  • genes 16, 61, 1, 10, 20, 51, 22, 6, 43, 65, 66, 24, 30, 9, 14, 40, 32, 74, 18, 71, 15, 28, 52, 31, 56, 55, 23, 4, 58, 36, 60, 54, 25, 63, 27, 64, 50, 29, 44, and 45 were used.
  • genes 15, 30, 3, 50, 61, 47, 13, 48, 45, 17, 46, 10, 28, 37, 8, 54, 9, 5, 63, 18, 39, 49, 34, 68, 14, 23, 43, 11, 1, 51, 56, 67, 20, 57, 6, 19, 25, 31, 21, and 12 were used.
  • genes 45, 73, 53, 29, 35, 56, 70, 51, 30, 59, 49, 22, 6, 43, 28, 31, 40, 4, 66, 25, 37, 19, 12, 65, 26, 74, 46, 50, 23, 62, 17, 69, 36, 41, 34, 27, 67, 7, 24, and 13 were used.
  • genes 62, 30, 38, 41, 18, 13, 49, 71, 68, 47, 50, 70, 66, 5, 23, 33, 27, 56, 6, 7, 34, 28, 26, 58, 53, 46, 16, 52, 72, 42, 10, 54, 67, 64, 12, 8, 19, 57, 73, and 17 were used.
  • genes 11, 32, 48, 54, 42, 67, 13, 53, 21, 44, 57, 22, 40, 12, 5, 29, 69, 37, 17, 39, 45, 73, 60, 26, 14, 72, 4, 59, 24, 46, 18, 51, 36, 61, 35, 9, 19, 16, 38, and 28 were used.
  • genes 58, 1, 55, 59, 11, 63, 3, 26, 49, 69, 34, 47, 65, 46, 14, 39, 5, 67, 16, 66, 64, 38, 44, 32, 15, 22, 19, 71, 23, 52, 45, 53, 48, 8, 60, 73, 9, 30, 25, and 37 were used.
  • genes 26, 21, 17, 34, 19, 27, 6, 61, 24, 42, 3, 60, 70, 43, 54, 13, 9, 20, 28, 58, 12, 23, 33, 4, 63, 56, 67, 1, 11, 68, 41, 59, 45, 5, 48, 32, 10, 44, 16, 65, 51, 62, 22, 38, and 74 were used.
  • genes 21, 41, 67, 5, 51, 53, 28, 25, 31, 60, 52, 17, 50, 11, 29, 45, 2, 32, 71, 13, 68, 22, 74, 33, 48, 56, 62, 42, 26, 14, 61, 23, 9, 46, 66, 10, 64, 59, 54, 69, 27, 47, 44, 34, and 40 were used.
  • genes 68, 48, 43, 74, 17, 4, 49, 34, 38, 60, 12, 42, 18, 5, 51, 32, 1, 57, 9, 11, 30, 13, 37, 15, 29, 33, 44, 20, 55, 70, 45, 41, 24, 56, 35, 52, 59, 7, 25, 2, 31, 64, 71, 22, and 39 were used.
  • genes 44, 61, 51, 69, 65, 72, 29, 57, 40, 62, 66, 63, 67, 55, 74, 14, 56, 11, 16, 58, 1, 15, 3, 48, 42, 7, 8, 30, 18, 19, 23, 60, 4, 10, 21, 43, 12, 37, 32, 25, 22, 50, 34, 59, and 2 were used.
  • genes 67, 54, 33, 41, 5, 61, 3, 10, 2, 71, 73, 53, 25, 42, 44, 23, 9, 38, 45, 62, 32, 46, 40, 8, 66, 49, 16, 24, 68, 69, 21, 52, 20, 6, 48, 11, 57, 39, 22, 31, 63, 36, 34, 35, and 17 were used.
  • genes 43, 45, 19, 17, 4, 58, 37, 7, 42, 52, 2, 62, 25, 66, 24, 15, 22, 74, 68, 67, 8, 1, 33, 70, 31, 50, 64, 14, 61, 51, 6, 38, 35, 39, 72, 5, 27, 36, 11, 18, 12, 48, 46, 54, and 71 were used.
  • genes 41, 45, 58, 11, 66, 26, 53, 13, 60, 4, 65, 18, 67, 73, 28, 55, 56, 57, 29, 68, 23, 19, 42, 17, 22, 62, 61, 10, 43, 64, 38, 71, 7, 40, 16, 34, 74, 12, 37, 8, 63, 44, 49, 47, and 3 were used.
  • genes 47, 40, 59, 14, 50, 71, 1, 57, 19, 28, 6, 34, 68, 4, 30, 20, 31, 33, 38, 39, 17, 41, 24, 65, 70, 61, 3, 35, 45, 11, 9, 8, 73, 42, 26, 23, 46, 72, 25, 64, 16, 53, 62, 18, and 7 were used.
  • genes 61, 5, 69, 22, 7, 17, 26, 13, 2, 30, 55, 33, 47, 14, 59, 32, 9, 44, 23, 45, 42, 25, 15, 57, 48, 50, 1, 68, 18, 72, 46, 73, 67, 36, 63, 60, 28, 21, 20, 8, 29, 35, 37, 38, and 71 were used.
  • genes 22, 31, 58, 50, 64, 11, 17, 67, 41, 2, 21, 4, 61, 70, 54, 3, 71, 25, 40, 43, 69, 38, 9, 73, 45, 16, 34, 10, 7, 52, 35, 19, 66, 24, 5, 60, 18, 14, 59, 32, 68, 15, 56, 63, and 65 were used.
  • genes 30 and 72 were used for 2 genes.
  • for set 2, genes 65 and 88 were used.
  • genes 76 and 88 were used; for set 4, genes 5 and 86 were used.
  • genes 30 and 32 were used; for set 6, genes 6 and 59 were used.
  • genes 57 and 2 were used.
  • genes 49 and 28 were used; for set 9, genes 37 and 35 were used.
  • genes 34 and 18 were used for 2 genes.
  • genes 1, 83, 59, 36, 66, and 88 were used.
  • genes 58, 13, 59, 22, and 64 were used.
  • genes 46, 72, 51, 88, and 14 were used.
  • genes 23, 74, 22, 27, and 20 were used.
  • genes 58, 54, 78, 87, and 50 were used.
  • genes 59, 6, 56, 78, and 9 were used.
  • genes 30, 78, 69, 83, and 21 were used.
  • genes 5, 39, 54, 56, and 55 were used.
  • genes 9, 70, 54, 67, and 43 were used.
  • genes 80, 81, 63, 90, and 53 were used.
  • genes 70, 17, 45, 5, 2, 37, 6, 76, 39, and 14 were used.
  • genes 54, 16, 80, 26, 15, 45, 50, 8, 73, and 48 were used.
  • genes 66, 87, 31, 74, 37, 45, 19, 1, 70, and 7 were used.
  • genes 85, 17, 78, 61, 23, 59, 27, 18, 58, and 24 were used.
  • genes 44, 89, 36, 76, 49, 3, 21, 24, 38, and 69 were used.
  • genes 32, 72, 55, 2, 86, 81, 53, 45, 17, and 74 were used.
  • genes 27, 55, 62, 33, 32, 84, 21, 45, 23, and 7 were used.
  • genes 62, 45, 68, 31, 69, 39, 33, 63, 19, and 22 were used.
  • genes 71, 39, 11, 56, 88, 80, 37, 77, 62, and 35 were used.
  • genes 38, 83, 41, 47, 66, 87, 10, 4, 88, and 22 were used.
  • genes 61, 17, 64, 14, 1, 41, 72, 47, 69, 48, 49, 70, 12, 20, and 35 were used.
  • genes 26, 49, 69, 31, 84, 42, 24, 56, 82, 12, 29, 2, 21, 15, and 71 were used.
  • genes 54, 62, 8, 32, 58, 65, 39, 44, 35, 22, 34, 77, 43, 83, and 75 were used.
  • genes 62, 50, 57, 80, 28, 83, 32, 56, 14, 2, 3, 48, 67, 79, and 72 were used.
  • genes 55, 58, 77, 68, 90, 76, 17, 72, 85, 34, 43, 33, 62, 6, and 64 were used.
  • genes 41, 63, 90, 9, 25, 35, 2, 14, 65, 87, 11, 36, 10, 79, and 17 were used.
  • genes 69, 89, 77, 33, 71, 4, 6, 46, 72, 13, 68, 81, 31, 50, and 32 were used.
  • genes 29, 69, 34, 47, 32, 52, 63, 73, 23, 25, 33, 10, 37, 17, and 55 were used.
  • genes 24, 13, 45, 17, 51, 48, 20, 30, 29, 40, 53, 19, 88, 76, and 28 were used.
  • genes 86, 33, 19, 4, 84, 25, 78, 29, 88, 10, 7, 67, 85, 45, and 8 were used.
  • genes 57, 78, 43, 50, 14, 71, 56, 25, 80, 31, 88, 4, 49, 13, 3, 38, 32, 8, 52, and 75 were used.
  • genes 84, 46, 23, 85, 55, 82, 56, 83, 48, 89, 8, 60, 21, 40, 20, 17, 87, 24, 34, and 39 were used.
  • genes 72, 88, 53, 46, 82, 9, 34, 21, 76, 24, 14, 35, 90, 31, 58, 30, 15, 41, 7, and 28 were used.
  • genes 22, 62, 21, 3, 45, 50, 58, 72, 69, 82, 49, 42, 47, 9, 15, 59, 17, 24, 40, and 52 were used.
  • genes 71, 18, 74, 53, 43, 75, 76, 54, 63, 64, 10, 5, 90, 51, 31, 58, 28, 35, 70, and 23 were used.
  • genes 7, 30, 77, 25, 17, 16, 35, 68, 56, 37, 78, 87, 45, 8, 42, 82, 72, 23, 58, and 54 were used.
  • genes 3, 58, 67, 5, 87, 62, 56, 88, 73, 50, 22, 52, 10, 60, 57, 42, 46, 26, 7, and 82 were used.
  • genes 63, 19, 22, 13, 82, 12, 44, 52, 8, 90, 35, 81, 79, 15, 83, 76, 51, 27, 45, and 56 were used.
  • genes 65, 34, 76, 81, 58, 86, 83, 46, 40, 55, 48, 42, 57, 70, 21, 72, 71, 17, 22, and 24 were used.
  • genes 34, 74, 2, 53, 76, 73, 19, 72, 88, 87, 44, 70, 40, 39, 22, 45, 83, 77, 30, and 46 were used.
  • genes 13, 77, 22, 85, 58, 8, 23, 2, 40, 81, 50, 31, 14, 41, 21, 52, 6, 74, 11, 17, 83, 7, 9, 19, and 18 were used.
  • genes 3, 12, 8, 87, 34, 75, 31, 88, 77, 39, 40, 60, 54, 9, 37, 5, 51, 53, 32, 35, 66, 4, 26, 59, and 29 were used.
  • genes 29, 41, 44, 56, 88, 72, 90, 6, 19, 63, 42, 24, 49, 70, 39, 17, 82, 13, 9, 4, 51, 40, 22, 71, and 25 were used.
  • genes 70, 82, 55, 43, 40, 32, 16, 13, 22, 41, 7, 85, 46, 42, 73, 76, 14, 60, 50, 72, 5, 81, 67, 57, and 83 were used.
  • genes 88, 83, 53, 26, 29, 4, 38, 71, 11, 66, 14, 89, 39, 34, 84, 41, 7, 64, 87, 3, 67, 43, 50, 79, and 6 were used.
  • genes 88, 16, 83, 4, 7, 39, 56, 82, 10, 20, 87, 79, 3, 35, 76, 49, 43, 11, 74, 13, 48, 22, 64, 34, and 89 were used.
  • genes 6, 64, 39, 50, 44, 46, 61, 28, 79, 43, 35, 85, 48, 9, 59, 47, 57, 5, 24, 33, 80, 11, 42, 20, and 26 were used.
  • genes 59, 24, 46, 33, 50, 71, 53, 21, 86, 10, 75, 23, 74, 60, 43, 22, 16, 62, 85, 79, 81, 34, 73, 2, and 1 were used.
  • genes 68, 11, 64, 54, 37, 28, 44, 73, 83, 89, 2, 41, 59, 75, 21, 23, 88, 71, 34, 29, 1, 47, 84, 60, and 72 were used.
  • genes 5, 12, 60, 84, 32, 58, 70, 2, 38, 42, 24, 13, 85, 10, 49, 90, 55, 81, 39, 27, 65, 56, 31, 34, and 57 were used.
  • genes 24, 88, 10, 69, 64, 8, 19, 54, 80, 70, 11, 9, 29, 56, 36, 79, 30, 65, 2, 58, 23, 74, 41, 16, 77, 4, 78, 14, 85, and 32 were used.
  • genes 73, 27, 19, 52, 87, 51, 63, 4, 76, 64, 90, 81, 42, 47, 9, 62, 40, 65, 83, 30, 39, 59, 10, 11, 54, 44, 43, 6, 86, and 41 were used.
  • genes 28, 47, 41, 8, 24, 54, 26, 49, 61, 17, 46, 64, 20, 16, 1, 33, 82, 79, 85, 5, 86, 69, 31, 65, 83, 7, 67, 35, 48, and 57 were used.
  • genes 13, 21, 83, 35, 47, 57, 8, 66, 75, 17, 38, 70, 39, 23, 9, 1, 2, 28, 68, 81, 36, 80, 52, 22, 44, 37, 85, 15, 72, and 86 were used.
  • genes 81, 20, 36, 89, 13, 14, 46, 58, 59, 62, 28, 7, 1, 25, 35, 83, 26, 50, 51, 15, 16, 56, 71, 5, 47, 6, 78, 80, 85, and 84 were used.
  • genes 68, 74, 73, 89, 38, 72, 33, 35, 15, 79, 3, 37, 23, 67, 10, 62, 64, 77, 44, 60, 75, 7, 51, 12, 46, 76, 81, 26, 42, and 6 were used.
  • genes 34, 55, 62, 40, 78, 35, 76, 30, 21, 77, 46, 71, 66, 69, 63, 81, 51, 38, 84, 53, 82, 89, 29, 14, 36, 45, 60, 7, 52, and 27 were used.
  • genes 56, 12, 35, 79, 57, 4, 16, 9, 24, 58, 40, 72, 80, 67, 23, 76, 88, 69, 52, 78, 32, 47, 14, 46, 64, 83, 17, 59, 81, and 20 were used.
  • genes 73, 27, 12, 58, 54, 62, 48, 43, 16, 41, 49, 84, 9, 75, 13, 50, 19, 3, 76, 78, 56, 68, 71, 25, 24, 60, 18, 35, 45, and 51 were used.
  • genes 82, 21, 24, 85, 51, 18, 72, 28, 89, 22, 34, 4, 53, 75, 83, 23, 50, 5, 42, 13, 88, 63, 40, 64, 38, 35, 39, 44, 59, and 70 were used.
  • genes 2, 69, 70, 89, 9, 11, 5, 17, 63, 18, 12, 59, 58, 85, 26, 71, 61, 10, 3, 1, 22, 79, 84, 30, 48, 82, 38, 44, 56, 42, 88, 6, 60, 14, and 28 were used.
  • genes 84, 81, 88, 46, 12, 50, 38, 78, 62, 48, 19, 43, 26, 66, 4, 20, 40, 58, 9, 52, 87, 47, 6, 55, 21, 75, 31, 77, 57, 53, 45, 34, 30, 32, and 39 were used.
  • genes 6, 3, 22, 89, 8, 78, 87, 71, 42, 63, 18, 40, 68, 77, 64, 88, 5, 58, 43, 72, 80, 10, 21, 56, 11, 59, 61, 2, 19, 76, 30, 20, 14, 69, and 35 were used.
  • genes 55, 42, 89, 41, 56, 33, 24, 28, 15, 61, 63, 18, 90, 60, 35, 76, 70, 52, 8, 1, 64, 23, 13, 39, 71, 31, 3, 81, 10, 34, 66, 44, 16, 7, and 78 were used.
  • genes 59, 58, 12, 50, 47, 42, 28, 22, 76, 54, 1, 18, 7, 53, 68, 73, 20, 67, 14, 72, 23, 13, 39, 10, 70, 55, 45, 17, 31, 51, 80, 3, 24, 30, and 46 were used.
  • genes 53, 66, 26, 3, 73, 47, 61, 63, 51, 41, 29, 5, 19, 10, 57, 22, 64, 11, 34, 89, 43, 24, 31, 60, 27, 76, 17, 86, 70, 81, 50, 46, 36, 14, and 45 were used.
  • genes 18, 88, 90, 13, 73, 81, 64, 56, 84, 2, 4, 22, 3, 25, 35, 54, 89, 86, 27, 41, 6, 34, 38, 14, 74, 36, 59, 8, 40, 55, 42, 83, 39, 44, and 60 were used.
  • genes 46, 32, 22, 15, 67, 89, 14, 5, 70, 39, 49, 9, 84, 71, 12, 78, 27, 86, 26, 57, 20, 43, 58, 87, 42, 8, 31, 1, 54, 62, 69, 40, 29, 52, and 64 were used.
  • genes 3, 39, 55, 25, 90, 10, 9, 77, 62, 78, 18, 12, 58, 51, 22, 67, 7, 61, 59, 35, 52, 4, 65, 38, 32, 71, 87, 88, 63, 50, 73, 70, 44, 45, and 84 were used.
  • genes 65, 54, 51, 38, 40, 5, 43, 71, 34, 30, 22, 6, 36, 64, 63, 13, 70, 85, 21, 88, 77, 86, 79, 66, 25, 18, 26, 19, 76, 56, 23, 60, 75, 2, and 49 were used.
  • genes 81, 80, 68, 77, 17, 71, 34, 33, 48, 88, 90, 32, 23, 2, 38, 59, 75, 82, 50, 56, 12, 36, 6, 87, 72, 37, 26, 15, 35, 66, 13, 76, 55, 3, 78, 18, 52, 47, 73, and 20 were used.
  • genes 11, 65, 27, 44, 88, 49, 55, 57, 1, 72, 9, 28, 56, 67, 13, 58, 42, 36, 8, 31, 40, 14, 26, 35, 62, 22, 19, 84, 78, 21, 2, 41, 74, 71, 52, 30, 25, 76, 85, and 63 were used.
  • genes 50, 22, 10, 54, 9, 51, 15, 34, 29, 35, 76, 89, 33, 6, 88, 56, 36, 70, 87, 40, 83, 62, 1, 42, 25, 78, 30, 26, 44, 60, 69, 47, 49, 31, 18, 59, 37, 52, 61, and 17 were used.
  • genes 27, 33, 7, 89, 36, 59, 48, 42, 66, 39, 90, 52, 2, 14, 30, 80, 9, 56, 21, 87, 65, 67, 41, 73, 82, 20, 4, 46, 5, 84, 88, 15, 44, 58, 78, 85, 3, 64, 6, and 8 were used.
  • genes 43, 24, 86, 29, 46, 90, 40, 1, 71, 57, 12, 84, 69, 19, 42, 62, 28, 35, 5, 63, 52, 17, 39, 4, 67, 81, 50, 47, 61, 54, 87, 70, 77, 6, 10, 38, 37, 79, 31, and 36 were used.
  • genes 28, 5, 78, 85, 16, 20, 36, 52, 43, 29, 67, 83, 12, 79, 84, 8, 81, 46, 11, 3, 54, 86, 10, 60, 71, 51, 39, 53, 59, 69, 44, 61, 7, 56, 27, 50, 66, 70, 1, and 25 were used.
  • genes 39, 47, 48, 24, 25, 3, 41, 16, 65, 73, 63, 14, 70, 57, 12, 64, 90, 23, 27, 38, 66, 71, 54, 21, 83, 28, 72, 53, 11, 30, 80, 15, 6, 88, 89, 85, 81, 61, 78, and 34 were used.
  • genes 61, 8, 57, 16, 24, 64, 48, 36, 58, 28, 27, 40, 70, 77, 25, 76, 52, 35, 62, 4, 60, 7, 54, 37, 11, 20, 72, 34, 56, 78, 10, 86, 51, 29, 84, 47, 30, 21, 59, and 67 were used.
  • genes 67, 3, 83, 33, 35, 26, 25, 79, 68, 19, 18, 84, 14, 58, 66, 57, 1, 2, 27, 64, 23, 24, 76, 81, 17, 37, 38, 30, 45, 75, 49, 39, 5, 53, 43, 15, 51, 40, 69, and 12 were used.
  • genes 39, 77, 29, 70, 85, 45, 54, 79, 31, 43, 15, 11, 47, 83, 76, 21, 67, 14, 4, 19, 49, 42, 18, 13, 12, 7, 88, 8, 3, 35, 81, 55, 71, 60, 72, 57, 46, 40, 56, and 32 were used.
  • genes 7, 63, 45, 87, 19, 55, 36, 42, 9, 4, 79, 68, 46, 35, 40, 80, 59, 58, 38, 17, 50, 30, 13, 39, 33, 84, 34, 64, 2, 57, 24, 88, 65, 16, 53, 18, 28, 8, 60, 15, 43, 73, 77, 20, and 78 were used.
  • genes 70, 19, 81, 68, 38, 35, 48, 9, 53, 11, 73, 42, 54, 28, 32, 40, 60, 88, 25, 7, 67, 17, 36, 51, 44, 46, 10, 89, 14, 80, 39, 41, 27, 8, 75, 47, 61, 57, 59, 76, 86, 65, 63, 74, and 77 were used.
  • genes 55, 24, 63, 17, 32, 81, 2, 67, 51, 85, 27, 46, 60, 90, 25, 35, 58, 11, 47, 33, 73, 3, 74, 52, 15, 86, 6, 78, 36, 66, 57, 13, 49, 28, 75, 70, 4, 77, 43, 26, 61, 64, 20, 1, and 23 were used.
  • genes 49, 72, 13, 51, 55, 11, 29, 5, 43, 44, 40, 6, 38, 67, 47, 35, 36, 28, 81, 24, 80, 32, 16, 88, 63, 87, 86, 79, 21, 1, 30, 10, 62, 58, 23, 12, 78, 26, 69, 56, 85, 42, 17, 84, and 39 were used.
  • genes 53, 33, 18, 65, 22, 83, 50, 88, 76, 40, 82, 68, 85, 5, 63, 45, 78, 16, 42, 54, 27, 66, 70, 74, 7, 51, 89, 64, 49, 37, 84, 86, 34, 39, 80, 31, 61, 87, 69, 4, 81, 30, 14, 41, and 29 were used.
  • genes 7, 60, 38, 14, 73, 9, 79, 81, 22, 10, 85, 51, 40, 87, 3, 26, 57, 56, 12, 72, 39, 59, 63, 28, 64, 71, 69, 21, 67, 48, 50, 66, 46, 88, 11, 13, 24, 8, 58, 75, 2, 41, 5, 44, and 55 were used.
  • genes 15, 65, 31, 19, 11, 38, 2, 9, 64, 66, 22, 35, 49, 3, 77, 43, 32, 56, 39, 54, 80, 21, 6, 40, 27, 86, 10, 16, 70, 30, 85, 23, 26, 4, 55, 73, 42, 13, 41, 68, 29, 57, 28, 72, and 58 were used.
  • genes 83, 27, 9, 62, 84, 78, 13, 5, 74, 55, 12, 34, 58, 3, 67, 57, 24, 45, 42, 47, 75, 25, 29, 44, 46, 61, 56, 70, 86, 37, 14, 49, 60, 89, 28, 72, 59, 38, 2, 81, 50, 7, 6, 21, and 82 were used.
  • genes 7, 10, 35, 14, 79, 66, 33, 52, 16, 55, 68, 59, 57, 19, 11, 47, 22, 38, 61, 30, 71, 50, 63, 88, 53, 80, 6, 54, 77, 21, 37, 84, 9, 65, 12, 49, 40, 73, 76, 2, 28, 29, 3, 72, and 18 were used.
  • genes 12, 19, 9, 80, 84, 15, 7, 2, 39, 21, 48, 40, 51, 69, 74, 83, 5, 66, 27, 26, 89, 60, 4, 86, 41, 44, 35, 10, 76, 53, 63, 16, 37, 79, 11, 42, 68, 3, 59, 82, 77, 73, 85, 67, and 14 were used.
  • genes 84, 47, 56, 1, 18, 21, 57, 54, 27, 89, 44, 85, 64, 10, 77, 34, 65, 66, 80, 70, 46, 23, 53, 61, 24, 81, 43, 35, 30, 74, 83, 51, 20, 17, 72, 4, 49, 68, 60, 28, 67, 19, 42, 55, 73, 36, 7, 39, and 33 were used.
  • genes 47, 29, 58, 36, 21, 53, 40, 7, 83, 77, 24, 89, 71, 64, 60, 4, 37, 86, 27, 57, 62, 63, 72, 1, 88, 78, 68, 17, 51, 16, 82, 42, 81, 18, 32, 49, 55, 10, 11, 66, 35, 23, 70, 20, 61, 25, 48, 43, and 54 were used.
  • genes 35, 65, 48, 43, 69, 62, 64, 74, 82, 39, 37, 1, 88, 45, 66, 12, 79, 55, 38, 84, 17, 30, 25, 26, 89, 56, 28, 57, 59, 34, 85, 14, 47, 44, 41, 19, 60, 20, 73, 2, 63, 75, 49, 80, 58, 77, 27, 54, and 29 were used.
  • genes 64, 51, 36, 12, 84, 24, 65, 47, 88, 26, 10, 19, 73, 90, 35, 53, 18, 55, 80, 70, 79, 82, 87, 77, 15, 85, 83, 7, 72, 1, 6, 57, 38, 45, 74, 33, 62, 86, 31, 69, 27, 14, 4, 29, 54, 44, 63, 78, and 42 were used.
  • genes 24, 39, 85, 42, 88, 32, 65, 23, 6, 75, 53, 77, 64, 90, 13, 82, 47, 31, 48, 8, 78, 67, 63, 44, 26, 40, 14, 34, 18, 59, 2, 17, 20, 56, 83, 68, 86, 9, 38, 73, 89, 55, 29, 69, 72, 16, 28, 51, and 81 were used.
  • genes 32, 70, 57, 67, 1, 73, 52, 38, 65, 83, 5, 40, 49, 31, 66, 85, 6, 82, 12, 48, 89, 3, 19, 41, 62, 16, 46, 61, 24, 18, 55, 30, 33, 56, 68, 20, 81, 10, 86, 9, 15, 63, 78, 22, 75, 14, 13, 43, and 77 were used.
  • genes 17, 30, 47, 85, 7, 3, 6, 35, 76, 77, 25, 86, 36, 75, 44, 29, 69, 60, 63, 64, 82, 51, 19, 68, 41, 28, 73, 18, 10, 26, 42, 78, 67, 12, 80, 33, 13, 57, 38, 87, 49, 59, 74, 50, 90, 46, 8, 81, and 4 were used.
  • genes 84, 87, 3, 41, 36, 71, 33, 57, 85, 26, 53, 22, 82, 31, 2, 45, 24, 18, 37, 35, 77, 20, 63, 25, 6, 17, 58, 7, 9, 49, 28, 76, 79, 67, 13, 80, 66, 5, 43, 4, 74, 75, 21, 86, 23, 39, 42, 27, and 54 were used.
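  • The subsets listed above have the form of distinct indices drawn from a numbered gene list. The following is a minimal sketch of how subsets of that form could be generated and checked, assuming the indices refer to positions 1 to 90 in a numbered set of gene sequences and that subsets are drawn uniformly without replacement. The function names and the seed are hypothetical and are not taken from the disclosure.

```python
import random


def draw_gene_subset(k, n_genes=90, seed=None):
    """Draw k distinct gene indices from 1..n_genes (assumed numbering)."""
    if not 1 <= k <= n_genes:
        raise ValueError("k must be between 1 and n_genes")
    rng = random.Random(seed)  # local generator so a given seed is repeatable
    return rng.sample(range(1, n_genes + 1), k)


def is_valid_subset(indices, n_genes=90):
    """Check that indices are distinct integers within 1..n_genes."""
    return (
        len(set(indices)) == len(indices)
        and all(isinstance(i, int) and 1 <= i <= n_genes for i in indices)
    )


subset = draw_gene_subset(25, seed=0)  # seed value is an arbitrary illustration
print(len(subset), is_valid_subset(subset))
```

The validity check is also a quick way to catch transcription slips in a listed subset, such as a repeated index or an index outside the numbered range.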
  • the determination or measurement of gene expression may be performed by PCR, such as the use of quantitative PCR. Detecting expression of about 5 to 49 expressed sequences in the human genome may be used in such embodiments of the invention. Additionally, expression levels of about 5 to 49 gene sequences in the set of 74, the set of 90, or a combination set of the two (with a total of 126 gene sequences given the presence of 38 gene sequences in common between the two sets) may also be used. The invention contemplates the use of quantitative PCR to measure expression levels, as described above, of about 5 to 49 of 87 gene sequences, all of which are present in either the set of 74 or the set of 90.
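  • The total of 126 gene sequences in the combination set follows from counting the shared sequences only once. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def combined_set_size(size_a, size_b, shared):
    """Size of the union of two sets that have `shared` members in common."""
    if shared < 0 or shared > min(size_a, size_b):
        raise ValueError("shared count cannot exceed the smaller set")
    return size_a + size_b - shared


# Set of 74 and set of 90 with 38 sequences in common, as stated in the text.
print(combined_set_size(74, 90, 38))
```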
  • the identifiers/accession numbers of the 87 gene sequences are AA456140, AA745593, AA765597, AA782845, AA865917, AA946776, AA993639, AB038160, AF104032, AF133587, AF301598, AF332224, AI041545, AI147926, AI309080, AI341378, AI457360, AI620495, AI632869, AI683181, AI685931, AI802118, AI804745, AI952953, AI985118, AJ000388, AK025181, AK027147, AK054605, AL023657, AL039118, AL110274, AL157475, AW118445, AW194680, AW291189, AW298545, AW445220, AW47
  • the use of from about 5 to 49 of these sequences in the practice of the invention may include the use of expression levels measured for reference gene sequences as described herein.
  • the reference gene sequences are one or more of the 8 disclosed herein.
  • the invention contemplates the use of one or more of the reference sequences identified by AF308803, AL137727, BC003043, BC006091, and BC016680 in PCR or QPCR based embodiments of the invention. Of course all 5 of these reference sequences may also be used in combination.
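  • One common way to use reference sequence measurements in quantitative PCR is to subtract the mean reference cycle threshold (Ct) from the Ct of each target sequence. The sketch below assumes this delta-Ct style of normalization, which is not specified in this section; the function name and the Ct values are hypothetical.

```python
from statistics import mean


def delta_ct(target_ct, reference_cts):
    """Normalize a target Ct value against the mean Ct of reference sequences."""
    if not reference_cts:
        raise ValueError("at least one reference Ct value is required")
    return target_ct - mean(reference_cts)


# Hypothetical Ct values for one target and two reference sequences.
print(delta_ct(25.0, [20.0, 22.0]))
```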

Abstract

The invention provides methods for the use of gene expression measurements to classify or identify tumors in samples obtained from a subject in a clinical setting, such as in cases of formalin fixed, paraffin embedded (FFPE) samples.

Description

    RELATED APPLICATIONS
  • This application claims benefit of priority to U.S. Provisional Patent Application 60/687,174, filed Jun. 3, 2005, which is hereby incorporated by reference as if fully set forth.
  • FIELD OF THE INVENTION
  • This invention relates to the use of gene expression to classify human tumors. The classification is performed by use of gene expression profiles, or patterns, of about 5 to 49 expressed sequences that are correlated with tumors arising from certain tissues as well as being correlated with certain tumor types. The invention also provides for the use of about 5 to 49 specific gene sequences, the expression of which is correlated with tissue source and tumor type in various cancers. The gene expression profiles, whether embodied in nucleic acid expression, protein expression, or other expression formats, may be used to determine a cell containing sample as containing tumor cells of a tissue type or from a tissue origin to permit a more accurate identification of the cancer and thus treatment thereof as well as the prognosis of the subject from whom the sample was obtained.
  • SUMMARY OF THE INVENTION
  • This invention relates to the use of gene expression measurements to classify or identify tumors in cell containing samples obtained from a subject in a clinical setting, such as formalin fixed, paraffin embedded (FFPE) samples, fresh samples that have undergone little or no treatment (such as simple storage at a reduced, non-freezing temperature), and frozen samples. The invention thus provides the ability to classify tumors in the real-world conditions faced by hospital and other laboratories which conduct testing on clinical FFPE samples. The samples may be of a primary tumor sample or of a tumor that has resulted from a metastasis of another tumor. Alternatively, the sample may be a cytological sample, such as, but not limited to, cells in a blood sample. In some cases of a tumor sample, the tumors may not have undergone classification by traditional pathology techniques, may have been initially classified but confirmation is desired, or may have been classified as a “carcinoma of unknown primary” (CUP) or “tumor of unknown origin” (TUO) or “unknown primary tumor”. The need for confirmation is particularly relevant in light of the estimates of 5 to 10% misclassification using standard techniques. Thus the invention may be viewed as providing means for cancer identification, or CID.
  • In a first aspect of the invention, the classification is performed by use of gene expression profiles, or patterns, of about 5 to 49 expressed sequences. The gene expression profiles, whether embodied in nucleic acid expression, protein expression, or other markers of gene expression, may be used to determine a cell containing sample as containing tumor cells of a tissue type or from a tissue origin to permit a more accurate identification of the cancer and thus treatment thereof as well as the prognosis of the subject from whom the sample was obtained.
  • In some embodiments, the invention is used to classify among at least 34 or at least 39 tumor types with significant accuracy in a clinical setting. The invention is based in part on the surprising and unexpected discovery that about 5 to 49 expressed sequences in the human genome are capable of classifying among at least 34, or at least 39, tumor types, as well as subsets of those tumor types, in a meaningful manner. Stated differently, the invention is based in part on the discovery that it is not necessary to use supervised learning to identify gene sequences which are expressed in correlation with different tumor types. Thus the invention is based in part on the recognition that any set of about 5 to 49 expressed sequences, even a random collection of expressed sequences, has the capability to classify, and so may be used to classify, a cell as being a tumor cell of a tissue or tissue origin. Moreover, relatively few expressed sequences are needed to classify among different tumor types. The ratio of expressed sequences to the number of tumor types that can be classified, based on the expression levels of the sequences, ranges from about 1:2 to about 5:2 or higher as demonstrated herein.
  • In another aspect, the invention provides for the classifying of a cell containing sample as containing a tumor cell of a tissue type or origin by determining the expression levels of about 5 to 49 transcribed sequences and then classifying the cell containing sample as containing a tumor cell of a plurality (two or more) of tumor types. To classify among 34 to 39 tumor types, and subsets thereof, as few as about any 5 expressed sequences may be used to provide classification in a meaningful manner. It was discovered that the expressed sequences need not be those the expression levels of which are evidently or highly correlated (directly, or indirectly through correlation with another expressed sequence) with any of the tumor types. Thus the invention provides, in yet another embodiment, for the use of the expression levels of genes, the expression levels of which are not strongly correlated with the actual classification of the particular tumor sample, as one of the about 5 to 49 transcribed sequences. All of the genes selected may be such non-correlates, or only a portion of the genes may be non-correlates, typically at least 90%, 85%, 75%, 50% or 25%, as well as portions falling within the ranges created by using any two of the foregoing point examples as endpoints of a range.
  • The invention is practiced by determining the expression levels of gene sequences where the sequences need not have been selected based on a correlation of their expression levels with the tumor types to be classified. Thus as a non-limiting example, the gene sequences need not be selected based on their correlation values with tumor types or a ranking based on the correlation values. Additionally, the invention may be practiced with use of gene expression levels which are not necessarily correlated to one or more other gene expression level(s) used for classification. Thus in some embodiments, the ability for the expression level of one expressed sequence to function in classification is not redundant with (is independent of) the ability of at least one other gene expression level used for classification.
  • The invention may be applied to identify the origin of a cancer in a patient in a wide variety of cases including, but not limited to, identification of the origin of a cancer in a clinical setting. In some embodiments, the identification is made by classification of a cell containing sample known to contain cancer cells, but the origin of those cells is unknown. In other embodiments, the identification is made by classification of a cell containing sample as containing one or more cancer cells followed by identification of the origin(s) of those cancer cell(s). In further embodiments, the invention is practiced with a sample from a subject with a previous history of cancer, and identification is made by classification of a cell as either being cancer from a previous origin of cancer or a new origin. Additional embodiments include those where multiple cancers are found in the same organ or tissue and the invention is used to determine the origin of each cancer, as well as whether the cancers are of the same origin.
  • The invention is also based in part on the discovery that the expression levels of particular gene sequences can be used to classify among tumor types with greater accuracy than the expression levels of a random group of gene sequences. In one embodiment, the invention provides for the use of expression levels of about 5 to 49 expressed sequences from a first set of 74 expressed sequences in the human genome to classify among at least 39 tumor types with significant accuracy. The invention thus provides for the identification and use of gene expression patterns (or profiles or “signatures”) based on the about 5 to 49 expressed sequences as correlated with at least the 39 tumor types. The invention also provides for the use of about 5 to 49 of the 74 of these expressed sequences to classify among subsets of the 39 tumor types. The ratio of expressed sequences to the number of tumor types, from 2 to 39, that can be classified based on the expression levels of the sequences ranges from about 1:2 to about 5:2 with greater accuracy than the use of a random group of expressed sequences. Depending on the number of tumor types, accuracies ranging from over 75% to 95% may be achieved readily.
  • In another embodiment, the invention provides for the use of expression levels of about 5 to 49 expressed sequences of a second set of 90 expressed sequences in the human genome to classify among at least 39 tumor types, or subsets thereof, with significant accuracy. 38 of the sequences in this second set are present in the first set of 74 sequences. The expression levels of the about 5 to 49 sequences in the second set may be used in the same manner as described for the first set of 74 sequences. Depending on the number of tumor types, accuracies ranging from about 75% to about 95% may be achieved.
  • The invention is also based in part upon the discovery that use of about 5 to 49 expressed sequences to classify among 53 tumor types, which include (but are not limited to) the 34 and 39 types described herein, was limited by the number of available samples of some tumor types. As noted hereinbelow, accuracy is linked to the number of available samples of each tumor type such that the ability to classify additional tumor types is readily achieved by the application of increased numbers of each tumor type. Thus while the invention is exemplified by use in classifying among 34 or 39 tumor types as well as subsets of the 34 or 39, about 5 to 49 expressed sequences can also be used to classify among all tumor types with the inclusion of samples of the additional tumor types. Thus the invention also provides for the classification of a tumor as being a type beyond the 34 or 39 types described herein.
  • The invention is based upon the expression levels of the gene sequences in a set of known tumor cells from different tissues and of different tumor types. These gene expression profiles (of gene sequences in the different known tumor cells/types), whether embodied in nucleic acid expression, protein expression, or other expression formats, may be compared to the expression levels of the same sequences in an unknown tumor sample to identify the sample as containing a tumor of a particular type and/or a particular origin or cell type. The invention provides, such as in a clinical setting, the advantages of a more accurate identification of a cancer and thus the treatment thereof as well as the prognosis, including survival and/or likelihood of cancer recurrence following treatment, of the subject from whom the sample was obtained.
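  • The comparison of an unknown sample against expression profiles of known tumor types can be illustrated with a simple classifier. The sketch below assumes a nearest-centroid rule with Euclidean distance, chosen only for illustration because this section does not specify a classification algorithm; the tumor type labels and expression values are hypothetical.

```python
import math


def centroid(profiles):
    """Mean expression level per gene across a list of profiles."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def classify(sample, reference):
    """Return the tumor type whose centroid is closest to the sample.

    reference maps a tumor type label to a list of expression profiles,
    each profile listing levels for the same genes in the same order.
    """
    best_type, best_distance = None, math.inf
    for tumor_type, profiles in reference.items():
        distance = math.dist(sample, centroid(profiles))
        if distance < best_distance:
            best_type, best_distance = tumor_type, distance
    return best_type


# Hypothetical reference profiles for two tumor types over three genes.
reference = {
    "type_A": [[1.0, 5.0, 2.0], [1.2, 4.8, 2.1]],
    "type_B": [[4.0, 1.0, 3.0], [4.2, 0.9, 3.3]],
}
print(classify([1.1, 5.1, 1.9], reference))
```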
  • The invention is further based in part on the discovery that use of about 5 to 49 expressed sequences as described herein as capable of classifying among two or more tumor types necessarily and effectively eliminates one or more tumor types from consideration during classification. This reflects the lack of a need to select genes with expression levels that are highly correlated with all tumor types within the range of the classification system. Stated differently, the invention may be practiced with a plurality of genes the expression levels of which are not highly correlated with any of the individual tumor types or multiple types in the group of tumor types being classified. This is in contrast to other approaches based upon the selection and use of highly correlated genes, which likely do not “rule out” other tumor types as opposed to “rule in” a tumor type based on the positive correlation.
  • The classification of a tumor sample as being one of the possible tumor types described herein to the exclusion of other tumor types is of course made based upon a level of confidence as described below. Where the level of confidence is low, or an increase in the level of confidence is preferred, the classification can simply be made at the level of a particular tissue origin or cell type for the tumor in the sample. Alternatively, and where a tumor sample is not readily classified as a single tumor type, the invention permits the classification of the sample as one of a few possible tumor types described herein. This advantageously provides for the ability to reduce the number of possible tissue types, cell types, and tumor types from which to consider for selection and administration of therapy to the patient from whom the sample was obtained.
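  • The fallback described above, from a single tumor type to a tissue origin or to a short list of candidates, can be sketched as a reporting rule. The sketch assumes that a classifier supplies one score per tumor type and that scores for tumor types sharing a tissue origin can be summed; the threshold, the labels, and the function name are hypothetical and are not taken from the disclosure.

```python
def report(scores, tissue_of, threshold=0.8, top_k=3):
    """Report a tumor type, a tissue origin, or a few candidate types.

    scores maps tumor type to a confidence score; tissue_of maps each
    tumor type to its tissue origin.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    best_type, best_score = ranked[0]
    if best_score >= threshold:
        return ("tumor_type", best_type)

    # Confidence too low for a single type: pool scores by tissue origin.
    tissue_scores = {}
    for tumor_type, score in scores.items():
        tissue = tissue_of[tumor_type]
        tissue_scores[tissue] = tissue_scores.get(tissue, 0.0) + score
    best_tissue = max(tissue_scores, key=tissue_scores.get)
    if tissue_scores[best_tissue] >= threshold:
        return ("tissue_origin", best_tissue)

    # Still ambiguous: return the few most likely tumor types.
    return ("candidates", [tumor_type for tumor_type, _ in ranked[:top_k]])


tissue_of = {"lung_type_1": "lung", "lung_type_2": "lung", "colon_type_1": "colon"}
print(report({"lung_type_1": 0.5, "lung_type_2": 0.4, "colon_type_1": 0.1}, tissue_of))
```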
  • The invention thus provides a non-subjective means for the identification of the tissue source and/or tumor type of one or more cancers of an afflicted subject. Where subjective interpretation may have been previously used to determine the tissue source and/or tumor type, as well as the prognosis and/or treatment of the cancer based on that determination, the present invention provides objective gene expression patterns, which may be used alone or in combination with subjective criteria to provide a more accurate identification of cancer classification. The invention is particularly advantageously applied to samples of secondary or metastasized tumors, but any cell containing sample (including a primary tumor sample) for which the tissue source and/or tumor type is preferably determined by objective criteria may also be used with the invention. Of course the ultimate determination of class may be made based upon a combination of objective and non-objective (or subjective/partially subjective) criteria.
  • The invention includes its use as part of the clinical or medical care of a patient. Thus in addition to using an expression profile of genes as described herein to assay a cell containing sample from a subject afflicted with cancer to determine the tissue source and/or tumor type of the cancer, the profile may also be used as part of a method to determine the prognosis of the cancer in the subject. The classification of the tumor/cancer and/or the prognosis may be used to select or determine or alter the therapeutic treatment for said subject. Thus the classification methods of the invention may be directed toward the treatment of disease, which is diagnosed in whole or in part based upon the classification. Given the diagnosis, administration of an appropriate anti-tumor agent or therapy, or the withholding or alteration of an anti-tumor agent or therapy may be used to treat the cancer.
  • Other clinical methods include those involved in the providing of medical care to a patient based on a classification as described herein. In some embodiments, the methods relate to providing diagnostic services based on expression levels of gene sequences, with or without inclusion of an interpretation of levels for classifying cells of a sample. In some embodiments, the method of providing a diagnostic service of the invention is preceded by a determination of a need for the service. In other embodiments, the method includes acts in the monitoring of the performance of the service as well as acts in the request or receipt of reimbursement for the performance of the service.
  • The details of one or more embodiments of the invention are set forth in the accompanying drawing and the description below. Other features, objects, and advantages of the invention will be apparent from the drawing and detailed description, and from the claims.
  • Definitions
  • As used herein, a “gene” is a polynucleotide that encodes a discrete product, whether RNA or proteinaceous in nature. It is appreciated that more than one polynucleotide may be capable of encoding a discrete product. The term includes alleles and polymorphisms of a gene that encodes the same product, or a functionally associated (including gain, loss, or modulation of function) analog thereof, based upon chromosomal location and ability to recombine during normal mitosis.
  • A “sequence” or “gene sequence” as used herein is a nucleic acid molecule or polynucleotide composed of a discrete order of nucleotide bases. The term includes the ordering of bases that encodes a discrete product (i.e. “coding region”), whether RNA or proteinaceous in nature. It is appreciated that more than one polynucleotide may be capable of encoding a discrete product. It is also appreciated that alleles and polymorphisms of the human gene sequences may exist and may be used in the practice of the invention to identify the expression level(s) of the gene sequences or an allele or polymorphism thereof. Identification of an allele or polymorphism depends in part upon chromosomal location and ability to recombine during mitosis.
  • The terms “correlate” or “correlation” or equivalents thereof refer to an association between expression of one or more genes and another event, such as, but not limited to, physiological phenotype or characteristic, such as tumor type.
  • A “polynucleotide” is a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides. This term refers only to the primary structure of the molecule. Thus, this term includes double- and single-stranded DNA and RNA. It also includes known types of modifications including labels known in the art, methylation, “caps”, substitution of one or more of the naturally occurring nucleotides with an analog, and internucleotide modifications such as uncharged linkages (e.g., phosphorothioates, phosphorodithioates, etc.), as well as unmodified forms of the polynucleotide.
  • The term “amplify” is used in the broad sense to mean creating an amplification product, which can be made enzymatically with DNA or RNA polymerases. “Amplification,” as used herein, generally refers to the process of producing multiple copies of a desired sequence, particularly those of a sample. “Multiple copies” mean at least 2 copies. A “copy” does not necessarily mean perfect sequence complementarity or identity to the template sequence. Methods for amplifying mRNA are generally known in the art, and include reverse transcription PCR (RT-PCR) and quantitative PCR (or Q-PCR) or real time PCR. Alternatively, RNA may be directly labeled as the corresponding cDNA by methods known in the art.
  • By “corresponding”, it is meant that a nucleic acid molecule shares a substantial amount of sequence identity with another nucleic acid molecule. Substantial amount means at least 95%, usually at least 98% and more usually at least 99%, and sequence identity is determined using the BLAST algorithm, as described in Altschul et al. (1990), J. Mol. Biol. 215:403-410 (using the published default setting, i.e. parameters w=4, t=17).
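  • The identity thresholds above can be illustrated with a simple percent identity calculation. The sketch below is not BLAST: it assumes two sequences that are already aligned and of equal length, and it simply counts matching positions. The function names and example sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of matching positions between two pre-aligned sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100 * matches / len(seq_a)


def corresponds(seq_a, seq_b, threshold=95.0):
    """Apply an identity threshold such as 95, 98, or 99 percent."""
    return percent_identity(seq_a, seq_b) >= threshold


# Hypothetical 10-base sequences differing at one position.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```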
  • A “microarray” is a linear or two-dimensional or three dimensional (and solid phase) array of discrete regions, each having a defined area, formed on the surface of a solid support such as, but not limited to, glass, plastic, or synthetic membrane. The density of the discrete regions on a microarray is determined by the total number of immobilized polynucleotides to be detected on the surface of a single solid phase support, such as at least about 50/cm², at least about 100/cm², or at least about 500/cm², up to about 1,000/cm² or higher. The arrays may contain less than about 500, about 1000, about 1500, about 2000, about 2500, or about 3000 immobilized polynucleotides in total. As used herein, a DNA microarray is an array of oligonucleotide or polynucleotide probes placed on a chip or other surfaces used to hybridize to amplified or cloned polynucleotides from a sample. Since the position of each particular group of probes in the array is known, the identities of a sample's polynucleotides can be determined based on their binding to a particular position in the microarray. As an alternative to the use of a microarray, an array of any size may be used in the practice of the invention, including an arrangement of one or more positions of a two-dimensional or three dimensional arrangement in a solid phase to detect expression of a single gene sequence. In some embodiments, a microarray for use with the present invention may be prepared by photolithographic techniques (such as synthesis of nucleic acid probes on the surface from the 3′ end) or by nucleic acid synthesis followed by deposition on a solid surface.
  • Because the invention relies upon the identification of gene expression, some embodiments of the invention determine expression by hybridization of mRNA, or an amplified or cloned version thereof, of a sample cell to a polynucleotide that is unique to a particular gene sequence. Polynucleotides of this type contain at least about 16, at least about 18, at least about 20, at least about 22, at least about 24, at least about 26, at least about 28, at least about 30, or at least about 32 consecutive basepairs of a gene sequence that is not found in other gene sequences. The term “about” as used in the previous sentence refers to an increase or decrease of 1 from the stated numerical value. Other embodiments are polynucleotides of at least or about 50, at least or about 100, at least or about 150, at least or about 200, at least or about 250, at least or about 300, at least or about 350, at least or about 400, at least or about 450, or at least or about 500 consecutive bases of a sequence that is not found in other gene sequences. The term “about” as used in the preceding sentence refers to an increase or decrease of 10% from the stated numerical value. Longer polynucleotides may of course contain minor mismatches (e.g. via the presence of mutations) which do not affect hybridization to the nucleic acids of a sample. Such polynucleotides may also be referred to as polynucleotide probes that are capable of hybridizing to sequences of the genes, or unique portions thereof, described herein. Such polynucleotides may be labeled to assist in their detection. The sequences may be those of mRNA encoded by the genes, the corresponding cDNA to such mRNAs, and/or amplified versions of such sequences. In some embodiments of the invention, the polynucleotide probes are immobilized on an array, other solid support devices, or in individual spots that localize the probes.
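  • The two meanings of “about” given above for probe lengths can be made concrete as ranges. The sketch below encodes the stated values as two lists and returns the range implied for each, using an increase or decrease of 1 for the shorter lengths and of 10% for the longer ones. The function and constant names are hypothetical.

```python
# Lengths stated in the text, grouped by the tolerance that applies to them.
SHORT_LENGTHS = (16, 18, 20, 22, 24, 26, 28, 30, 32)
LONG_LENGTHS = (50, 100, 150, 200, 250, 300, 350, 400, 450, 500)


def about_range(stated):
    """Return (low, high) implied by "about" for a stated probe length."""
    if stated in SHORT_LENGTHS:
        return (stated - 1, stated + 1)
    if stated in LONG_LENGTHS:
        delta = stated // 10  # 10% of the stated value; all are multiples of 10
        return (stated - delta, stated + delta)
    raise ValueError("length is not one of the values stated in the text")


print(about_range(20), about_range(200))
```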
  • In other embodiments of the invention, all or part of a gene sequence may be amplified and detected by methods such as the polymerase chain reaction (PCR) and variations thereof, such as, but not limited to, quantitative PCR (Q-PCR), reverse transcription PCR (RT-PCR), and real-time PCR (including as a means of measuring the initial amounts of mRNA copies for each sequence in a sample), optionally real-time RT-PCR or real-time Q-PCR. Such methods would utilize one or two primers that are complementary to portions of a gene sequence, where the primers are used to prime nucleic acid synthesis. The newly synthesized nucleic acids are optionally labeled and may be detected directly or by hybridization to a polynucleotide of the invention. The newly synthesized nucleic acids may be contacted with polynucleotides (containing sequences) of the invention under conditions which allow for their hybridization. Additional methods to detect the expression of expressed nucleic acids include RNAse protection assays, including liquid phase hybridizations, and in situ hybridization of cells.
  • Alternatively, and in further embodiments of the invention, gene expression may be determined by analysis of expressed protein in a cell sample of interest by use of one or more antibodies specific for one or more epitopes of individual gene products (proteins), or proteolytic fragments thereof, in said cell sample or in a bodily fluid of a subject. The cell sample may be one of breast cancer epithelial cells enriched from the blood of a subject, such as by use of labeled antibodies against cell surface markers followed by fluorescence activated cell sorting (FACS). Such antibodies may be labeled to permit their detection after binding to the gene product. Detection methodologies suitable for use in the practice of the invention include, but are not limited to, immunohistochemistry of cell containing samples or tissue, enzyme linked immunosorbent assays (ELISAs) including antibody sandwich assays of cell containing tissues or blood samples, mass spectroscopy, and immuno-PCR.
  • The terms “label” or “labeled” refer to a composition capable of producing a detectable signal indicative of the presence of the labeled molecule. Suitable labels include radioisotopes, nucleotide chromophores, enzymes, substrates, fluorescent molecules, chemiluminescent moieties, magnetic particles, bioluminescent moieties, and the like. As such, a label is any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means.
  • The term “support” refers to conventional supports such as beads, particles, dipsticks, fibers, filters, membranes and silane or silicate supports such as glass slides.
  • “Expression” and “gene expression” include transcription and/or translation of nucleic acid material.
  • As used herein, the term “comprising” and its cognates are used in their inclusive sense; that is, equivalent to the term “including” and its corresponding cognates.
  • Conditions that “allow” an event to occur or conditions that are “suitable” for an event to occur, such as hybridization, strand extension, and the like, or “suitable” conditions are conditions that do not prevent such events from occurring. Thus, these conditions permit, enhance, facilitate, and/or are conducive to the event. Such conditions, known in the art and described herein, depend upon, for example, the nature of the nucleotide sequence, temperature, and buffer conditions. These conditions also depend on what event is desired, such as hybridization, cleavage, strand extension or transcription.
  • Sequence “mutation,” as used herein, refers to any alteration in the sequence of a gene disclosed herein in comparison to a reference sequence. A sequence mutation includes single nucleotide changes, or alterations of more than one nucleotide in a sequence, due to mechanisms such as substitution, deletion or insertion. Single nucleotide polymorphism (SNP) is also a sequence mutation as used herein. Because the present invention is based on the relative level of gene expression, mutations in non-coding regions of genes as disclosed herein may also be assayed in the practice of the invention.
  • “Detection” or “detecting” includes any means of detecting, including direct and indirect determination of the level of gene expression and changes therein.
  • Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 shows a capacity plot for the ability to use the expression levels of subsets of a set of 100 expressed gene sequences to classify among 39 tumor types and subsets thereof. Expression levels of random combinations of 5, 10, 15, 20, 25, 30, 35, 40, 45, and 49 (each sampled 10 times) of the 100 sequences were used with data from tumor types and then used to predict test random sets of tumor samples (each sampled 10 times) ranging from 2 to 39 types. A plot of numbers of tumor types (x-axis) versus prediction accuracies (y-axis) for results using from 5 to 49 genes is shown as a non-limiting example. The data from using 5 genes results in a curve closest to the x-axis while the data from using 49 genes results in a curve farthest from the x-axis. Generally, accuracy improves with higher numbers of gene sequences, where from 30 to 49 gene sequences (the three curves farthest from the x-axis) provide about the same level of accuracy.
  • FIG. 2 shows an alternative presentation of the data used with respect to FIG. 1 . A plot of numbers of gene sequences used, ranging from 5-49 (and in the x-axis), versus prediction accuracies (y-axis) for various representative numbers of tumor types is shown. The plotted lines, from top to bottom, are of the results from 2, 10, 20, 30, and 39 tumor types, respectively.
  • FIG. 3 provides a further analysis of the ability to use the expression levels of subsets of a set of 100 randomly selected expressed gene sequences to classify among 39 tumor types. The data used with FIGS. 1 and 2 is presented as a plot of the number of tumor types versus the number of gene sequences used; results at prediction accuracies from 55-70% are shown as non-limiting examples. Generally, accuracy improves with higher numbers of gene sequences.
  • FIG. 4 shows a capacity plot for the ability to use the expression levels of portions of a first set of 74 expressed gene sequences to classify among 39 tumor types and subsets thereof. Expression levels of random combinations of 5, 10, 15, 20, 25, 30, 35, 40, 45, and 49 (each sampled 10 times) of the 74 sequences were used with data from tumor types and then used to predict test random sets of tumor samples (each sampled 10 times) ranging from 2 to 39 types. A plot of numbers of tumor types versus prediction accuracies for results using from 5 to 49 genes is shown as a non-limiting example. The plotted lines, from top to bottom, are of the results from 49, 40, 30, 20, 10, and 5 gene sequences, respectively.
  • FIG. 5 shows an alternative presentation of the data used with respect to FIG. 4 . A plot of numbers of gene sequences used, ranging from 5-49, versus prediction accuracies for various representative numbers of tumor types is shown. The plotted lines, from top to bottom, are of the results from 2, 10, 20, 30, and 39 tumor types, respectively.
  • FIG. 6 is analogous to FIG. 3 except with presentation of the data used with FIGS. 4 and 5 .
  • FIG. 7 shows a capacity plot for the ability to use the expression levels of subsets of a set of 90 expressed gene sequences to classify among 39 tumor types and subsets thereof. Expression levels of random combinations of 5, 10, 15, 20, 25, 30, 35, 40, 45, and 49 (each sampled 10 times) of the 90 sequences were used with data from tumor types and then used to predict test random sets of tumor samples (each sampled 10 times) ranging from 2 to 39 types. A plot of numbers of tumor types versus prediction accuracies for results using from 5 to 49 genes is shown as a non-limiting example. The plotted lines, from top to bottom, are of the results from 49, 40, 30, 20, 10, and 5 gene sequences, respectively.
  • FIG. 8 shows an alternative presentation of the data used with respect to FIG. 7 . A plot of numbers of gene sequences used, ranging from 5-49, versus prediction accuracies for various representative numbers of tumor types is shown. The plotted lines, from top to bottom, are of the results from 2, 10, 20, 30, and 39 tumor types, respectively.
  • FIG. 9 is analogous to FIGS. 3 and 6 except with presentation of the data used with FIGS. 7 and 8 .
  • FIGS. 10A-10D show a “tree” that classifies tumor types covered herein as well as additional known tumor types. It was constructed mainly according to “Cancer, Principles and Practice of Oncology, (DeVita, Hellman and Rosenberg), 6th edition”. Thus beginning with a “tumor of unknown origin” (or “tuo”), the first possibilities are that it is either of a germ cell or non-germ cell origin. If it is the former, then it may be of ovary or testes origin. Within those of testes origin, the tumor may be of seminoma origin or an “other” origin.
  • If the tumor is of a non-germ cell origin, then it is either of an epithelial or non-epithelial origin. If it is the former, then it is of either squamous or non-squamous origin. Squamous origin tumors are of cervix, esophagus, larynx, lung, or skin in origin. Non-squamous origin tumors are of urinary bladder, breast, carcinoid-intestine, cholangiocarcinoma, digestive, kidney, liver, lung, prostate, reproductive system, skin-basal cell, or thyroid-follicular-papillary origin. Among those of digestive origin, the tumors are of small and large bowel, stomach-adenocarcinoma, bile duct, esophagus, gall bladder, and pancreas in origin. The esophagus origin tumors may be of either Barrett's esophagus or adenocarcinoma types. Of the reproductive system origin tumors, they may be of cervix adenocarcinoma type, endometrial tumor, or ovarian origin. Ovarian origin tumors are of the clear, serous, mucinous, and endometrioid types.
  • If the tumor is of non-epithelial origin, then it is of adrenal gland, brain, GIST (gastrointestinal stromal tumor), lymphoma, meningioma, mesothelioma, sarcoma, skin melanoma, or thyroid-medullary origin. Of the lymphomas, they are B cell, Hodgkin's, or T cell type. Of the sarcomas, they are leiomyosarcoma, osteosarcoma, soft-tissue sarcoma, soft tissue MFH (malignant fibrous histiocytoma), soft tissue sarcoma synovial, soft tissue Ewing's sarcoma, soft tissue fibrosarcoma, and soft tissue rhabdomyosarcoma types.
  • DETAILED DESCRIPTION OF MODES OF PRACTICING THE INVENTION
  • This invention provides methods for the use of gene expression information to classify tumors in a more objective manner than possible with conventional pathology techniques. Thus in a first aspect, the invention provides a method of classifying a cell containing sample as including a tumor cell of (or from) a type of tissue or a tissue origin. The method comprises determining or measuring the expression levels of about five to 49 transcribed sequences from cells in a cell containing sample obtained from a subject, and classifying the sample as containing tumor cells of a type of tissue from a plurality of tumor types based on the expression levels of said sequences. As used herein, “a plurality” refers to the state of two or more.
  • The classifying is based upon a comparison of the expression levels of the about 5 to 49 transcribed sequences in the cells of the sample to their expression levels in known tumor samples and/or known non-tumor samples. Alternatively, the classifying is based upon a comparison of the expression levels of the about 5 to 49 transcribed sequences to the expression of reference sequences in the same samples, relative to, or based on, the same comparison in known tumor samples and/or known non-tumor samples. Thus as a non-limiting example, the expression levels of the gene sequences may be determined in a set of known tumor samples to provide a database against which the expression levels detected or determined in a cell containing sample from a subject is compared. The expression level(s) of gene sequence(s) in a sample also may be compared to the expression level(s) of said sequence(s) in normal or non-cancerous cells, preferably from the same sample or subject. As described below and in embodiments of the invention utilizing Q-PCR or real time Q-PCR, the expression levels may be compared to expression levels of reference genes in the same sample or a ratio of expression levels may be used.
  • In practice, the method utilizes a ratio, of transcribed sequences to the number of tumor types classified, ranging from about 1:2 to about 5:2 or higher. Stated differently, the ratio of the number of expression levels needed to the number of tumor types that may be classified based upon those levels, ranges from about 1:2 to about 1:1 to about 3:2 to about 2:1 to about 5:2 or higher. This is reflected by the ability to use as few as about 20 expression levels to classify among 39 tumor types (see FIG. 6 ). Thus, and based on data as shown in FIGS. 1-9 , the invention may be practiced with about 5 to 49 gene sequences within the ratio of genes assessed to tumors classified.
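  • The ratio described above is simple arithmetic and can be checked directly. The following is a minimal sketch only; the function name `gene_to_type_ratio` is hypothetical and not part of the disclosure. It computes the ratio of gene sequences assessed to tumor types classified for a few of the combinations mentioned in the text.

```python
from fractions import Fraction

def gene_to_type_ratio(n_genes: int, n_types: int) -> Fraction:
    """Ratio of gene sequences assessed to tumor types classified.

    Hypothetical helper, used here only to illustrate the arithmetic.
    """
    return Fraction(n_genes, n_types)

# About 20 expression levels used to classify among 39 tumor types:
ratio_20_39 = gene_to_type_ratio(20, 39)   # 20/39, roughly 1:2

# 5 expression levels used to classify among 2 tumor types:
ratio_5_2 = gene_to_type_ratio(5, 2)       # exactly 5:2

# 49 expression levels used to classify among 39 tumor types:
ratio_49_39 = gene_to_type_ratio(49, 39)   # between 1:1 and 3:2

for label, r in [("20 genes / 39 types", ratio_20_39),
                 ("5 genes / 2 types", ratio_5_2),
                 ("49 genes / 39 types", ratio_49_39)]:
    print(f"{label}: {float(r):.2f}")
```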
  • The selection of about 5 to 49 gene sequences to use may be random, or by selection based on various criteria. As one non-limiting example, the gene sequences may be selected based upon unsupervised learning, including clustering techniques. As another non-limiting example, selection may be to reduce or remove redundancy with respect to their ability to classify tumor type. For example, gene sequences are selected based upon the lack of correlation between their expression and the expression of one or more other gene sequences used for classifying. This is accomplished by assessing the expression level of each gene sequence in the expression data set for correlation, across the plurality of samples, with the expression level of each other gene in the data set to produce a correlation matrix of correlation coefficients. These correlation determinations may be performed directly, between expression of each pair of gene sequences, or indirectly, without direct comparison between the expression values of each pair of gene sequences.
  • A variety of correlation methodologies may be used in the correlation of expression data of individual gene sequences within the data set. Non-limiting examples include parametric and non-parametric methods as well as methodologies based on mutual information and non-linear approaches. Non-limiting examples of parametric approaches include Pearson correlation (or Pearson r, also referred to as linear or product-moment correlation) and cosine correlation. Non-limiting examples of non-parametric methods include Spearman's R (or rank-order) correlation, Kendall's Tau correlation, and the Gamma statistic. Each correlation methodology can be used to determine the level of correlation between the expressions of individual gene sequences in the data set. The correlation of all sequences with all other sequences is most readily considered as a matrix. Using Pearson's correlation as a non-limiting example, the correlation coefficient r in the method is used as the indicator of the level of correlation. When other correlation methods are used, the correlation coefficient analogous to r may be used, along with the recognition of equivalent levels of correlation corresponding to r being at or about 0.25 to being at or about 0.5.
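  • The correlation matrix described above can be sketched as follows, using Pearson's r as the non-limiting example. This is an illustration under stated assumptions: the gene names and expression values are made up, and the helper names `pearson_r` and `correlation_matrix` are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant expression: treat as uncorrelated (assumption)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(expression):
    """expression maps gene name -> expression levels across the same samples.

    Returns a nested dict so that matrix[g][h] is r between genes g and h.
    """
    genes = list(expression)
    return {g: {h: pearson_r(expression[g], expression[h]) for h in genes}
            for g in genes}

# Made-up expression levels for three genes across four samples.
expression = {
    "gene_A": [1.0, 2.0, 3.0, 4.0],
    "gene_B": [2.0, 4.0, 6.0, 8.0],  # a scaled copy of gene_A
    "gene_C": [4.0, 1.0, 3.0, 2.0],
}
matrix = correlation_matrix(expression)
```

  • A rank-based method such as Spearman's correlation could be substituted by ranking each gene's values before calling the same function.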
  • The correlation coefficient may be selected as desired to reduce the number of correlated gene sequences to various numbers. In some embodiments of the invention using r, the selected coefficient value may be of about 0.25 or higher, about 0.3 or higher, about 0.35 or higher, about 0.4 or higher, about 0.45 or higher, or about 0.5 or higher. The selection of a coefficient value means that where expression between gene sequences in the data set is correlated at that value or higher, they are possibly not included in a subset of the invention. Thus in some embodiments, the method comprises excluding or removing (not using for classification) one or more gene sequences that are expressed in correlation, above a desired correlation coefficient, with another gene sequence in the tumor type data set. It is pointed out, however, that there can be situations of gene sequences that are not correlated with any other gene sequences, in which case they are not necessarily removed from use in classification.
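  • One way to apply a selected coefficient value is a greedy filter that keeps a gene sequence only if its correlation with every gene already kept falls below the cutoff. The sketch below is an assumption-laden illustration, not the disclosed procedure: the function name `select_uncorrelated`, the greedy ordering, the use of the absolute value of r, and the example coefficients are all hypothetical.

```python
def select_uncorrelated(genes, corr, threshold=0.5):
    """Greedily keep genes whose |r| with every kept gene is below threshold.

    genes: gene names in the order they should be considered.
    corr:  maps frozenset({gene_1, gene_2}) -> correlation coefficient.
    """
    kept = []
    for gene in genes:
        if all(abs(corr[frozenset((gene, other))]) < threshold for other in kept):
            kept.append(gene)
    return kept

# Made-up pairwise correlation coefficients.
corr = {
    frozenset(("gene_A", "gene_B")): 0.92,
    frozenset(("gene_A", "gene_C")): -0.10,
    frozenset(("gene_B", "gene_C")): 0.05,
}

selected = select_uncorrelated(["gene_A", "gene_B", "gene_C"], corr, threshold=0.5)
print(selected)
```

  • Here gene_B is excluded because it is correlated with gene_A above the cutoff, while gene_C, which is not correlated with any retained gene, remains available for classification.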
  • Thus the expression levels of gene sequences, where more than about 10%, more than about 20%, more than about 30%, more than about 40%, more than about 50%, more than about 60%, more than about 70%, more than about 80%, or more than about 90% of the levels are not correlated with that of another one of the gene sequences used, may be used in the practice of the invention. Correlation between expression levels may be based upon a value below about 0.9, about 0.8, about 0.7, about 0.6, about 0.5, about 0.4, about 0.3, or about 0.2. The ability to classify among classes with exclusion of the expression levels of some gene sequences is present because expression of the gene sequences in the subset is correlated with expression of the gene sequences excluded from the subset. So no information was lost because information based on the expression of the excluded gene sequences is still represented by sequences retained in the subset. Therefore, expression of the gene sequences of the subset has information content relevant to properties and/or characteristics (or phenotype) of a cell. This has application and relevance to the classification of additional tumor type classes not included as part of the original gene expression data set, which can be classified by use of a subset of the invention because of the redundancy of information between expression of sequences in the subset and sequences expressed in those additional classes. Thus the invention may be used to classify cells as being a tumor type beyond the plurality of known classes used to generate the original gene expression data set.
  • Selection of gene sequences based upon reducing correlation of expression to a particular tumor type may also be used. This also reflects a discovery of the present invention, based upon the observation that expression levels that were most highly correlated with one or more tumor types were not necessarily of greatest value in classification among different tumor types. This is reflected both by the ability to use randomly selected gene sequences for classification as well as the use of particular sequences, as described herein, which are not expressed with the most significant correlation with one or more tumor types. Thus the invention may be practiced without selection of gene sequences based upon the most significant P values or a ranking based upon correlation of gene expression and one or more tumor types. Thus the invention may be practiced without the use of ranking based methodologies, such as the Kruskal-Wallis H-test.
  • The gene sequences used in the practice of the invention may include those which have been observed to be expressed in correlation with particular tumor types, such as expression of the estrogen receptor, which has been observed to be expressed in correlation with some breast and ovarian cancers. In some embodiments of the invention, however, the invention is practiced with use of the expression level of at least one gene sequence that has not been previously identified as being associated with any of the tumor types being classified. Thus the invention may be practiced without all of the gene sequences having previously been associated or correlated with expression in the 2 or more (up to 39 or more) tumor types to which a cell containing sample may be classified.
  • While the invention is described mainly with respect to human subjects, samples from other subjects may also be used. All that is necessary is the ability to assess the expression levels of gene sequences in a plurality of known tumor samples such that the expression levels in an unknown or test sample may be compared. Thus the invention may be applied to samples from any organism for which a plurality of expressed sequences, and a plurality of known tumor samples, are available. One non-limiting example is application of the invention to mouse samples, based upon the availability of the mouse genome to permit detection of expressed murine sequences and the availability of known mouse tumor samples or the ability to obtain known samples. Thus, the invention is contemplated for use with other samples, including those of mammals, primates, and animals used in clinical testing (such as rats, mice, rabbits, dogs, cats, and chimpanzees) as non-limiting examples.
  • While the invention is readily practiced with the use of cell containing samples, any nucleic acid containing sample which may be assayed for gene expression levels may be used in the practice of the invention. Without limiting the invention, a sample of the invention may be one that is suspected or known to contain tumor cells. Alternatively, a sample of the invention may be a “tumor sample” or “tumor containing sample” or “tumor cell containing sample” of tissue or fluid isolated from an individual suspected of being afflicted with, or at risk of developing, cancer. Non-limiting examples of samples for use with the invention include a clinical sample, such as, but not limited to, a fixed sample, a fresh sample, or a frozen sample. The sample may be an aspirate, a cytological sample (including blood or other bodily fluid), or a tissue specimen, which includes at least some information regarding the in situ context of cells in the specimen, so long as appropriate cells or nucleic acids are available for determination of gene expression levels. The invention is based in part on the discovery that results obtained with frozen tissue sections can be validly applied to the situation with fixed tissue or cell samples and extended to fresh samples.
  • Non-limiting examples of fixed samples include those that are fixed with formalin or formaldehyde (including FFPE samples), with Bouin's, glutaraldehyde, acetone, alcohols, or any other fixative, such as those used to fix cell or tissue samples for immunohistochemistry (IHC). Other examples include fixatives that precipitate cell associated nucleic acids and proteins. Given possible complications in handling frozen tissue specimens, such as the need to maintain their frozen state, the invention may be practiced with non-frozen samples, such as fixed samples, fresh samples, including cells from blood or other bodily fluid or tissue, and minimally treated samples. In some applications of the invention, the sample has not been classified using standard pathology techniques, such as, but not limited to, immunohistochemistry based assays.
  • In some embodiments of the invention, the sample is classified as containing a tumor cell of a type selected from the following 53, and subsets thereof: Adenocarcinoma of Breast, Adenocarcinoma of Cervix, Adenocarcinoma of Esophagus, Adenocarcinoma of Gall Bladder, Adenocarcinoma of Lung, Adenocarcinoma of Pancreas, Adenocarcinoma of Small-Large Bowel, Adenocarcinoma of Stomach, Astrocytoma, Basal Cell Carcinoma of Skin, Cholangiocarcinoma of Liver, Clear Cell Adenocarcinoma of Ovary, Diffuse Large B-Cell Lymphoma, Embryonal Carcinoma of Testes, Endometrioid Carcinoma of Uterus, Ewing's Sarcoma, Follicular Carcinoma of Thyroid, Gastrointestinal Stromal Tumor, Germ Cell Tumor of Ovary, Germ Cell Tumor of Testes, Glioblastoma Multiforme, Hepatocellular Carcinoma of Liver, Hodgkin's Lymphoma, Large Cell Carcinoma of Lung, Leiomyosarcoma, Liposarcoma, Lobular Carcinoma of Breast, Malignant Fibrous Histiocytoma, Medullary Carcinoma of Thyroid, Melanoma, Meningioma, Mesothelioma of Lung, Mucinous Adenocarcinoma of Ovary, Myofibrosarcoma, Neuroendocrine Tumor of Bowel, Oligodendroglioma, Osteosarcoma, Papillary Carcinoma of Thyroid, Pheochromocytoma, Renal Cell Carcinoma of Kidney, Rhabdomyosarcoma, Seminoma of Testes, Serous Adenocarcinoma of Ovary, Small Cell Carcinoma of Lung, Squamous Cell Carcinoma of Cervix, Squamous Cell Carcinoma of Esophagus, Squamous Cell Carcinoma of Larynx, Squamous Cell Carcinoma of Lung, Squamous Cell Carcinoma of Skin, Synovial Sarcoma, T-Cell Lymphoma, and Transitional Cell Carcinoma of Bladder.
  • In other embodiments of the invention, the sample is classified as containing a tumor cell of a type selected from the following 34, and subsets thereof: adrenal, brain, breast, carcinoid-intestine, cervix (squamous cell), cholangiocarcinoma, endometrium, germ-cell, GIST (gastrointestinal stromal tumor), kidney, leiomyosarcoma, liver, lung (adenocarcinoma, large cell), lung (small cell), lung (squamous), lymphoma (B cell), lymphoma (Hodgkin's), meningioma, mesothelioma, osteosarcoma, ovary (clear cell), ovary (serous cell), pancreas, prostate, skin (basal cell), skin (melanoma), small and large bowel, soft tissue (liposarcoma), soft tissue (MFH or Malignant Fibrous Histiocytoma), soft tissue (Sarcoma-synovial), testis (seminoma), thyroid (follicular-papillary), thyroid (medullary carcinoma), and urinary bladder.
  • In further embodiments of the invention, the sample is classified as containing a tumor cell of a type selected from the following 39, and subsets thereof: adrenal gland, brain, breast, carcinoid-intestine, cervix-adenocarcinoma, cervix-squamous, endometrium, gall bladder, germ cell-ovary, GIST, kidney, leiomyosarcoma, liver, lung-adenocarcinoma-large cell, lung-small cell, lung-squamous, lymphoma-B cell, lymphoma-Hodgkin's, lymphoma-T cell, meningioma, mesothelioma, osteosarcoma, ovary-clear cell, ovary-serous, pancreas, prostate, skin-basal cell, skin-melanoma, skin-squamous, small and large bowel, soft tissue-liposarcoma, soft tissue-MFH, soft tissue-sarcoma-synovial, stomach-adenocarcinoma, testis-other (or non-seminoma), testis-seminoma, thyroid-follicular-papillary, thyroid-medullary, and urinary bladder.
  • The methods of the invention may also be applied to classify a cell containing sample as containing a tumor cell of a tumor of a subset of any of the above sets. The size of the subset will usually be small, composed of two, three, four, five, six, seven, eight, nine, or ten of the tumor types described above. Alternatively, the size of the subset may be any integral number up to the full size of the set. Thus embodiments of the invention include classification among 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, or 52 of the above types. In some embodiments, the subset will be composed of tumor types that are of the same tissue or organ type. Alternatively, the subset will be composed of tumor types of different tissues or organs. In some embodiments, the subset will include one or more types selected from adrenal gland, brain, carcinoid-intestine, cervix-adenocarcinoma, cervix-squamous, gall bladder, germ cell-ovary, GIST, leiomyosarcoma, liver, meningioma, osteosarcoma, skin-basal cell, skin-squamous, soft tissue-liposarcoma, soft tissue-MFH, soft tissue-sarcoma-synovial, testis-other (or non-seminoma), testis-seminoma, thyroid-follicular-papillary, and thyroid-medullary.
  • Classification among subsets of the above tumor types is demonstrated by the results shown in FIGS. 1-9 , where the expression levels of as few as about 5 or more gene sequences can be used to classify among random samples of 2 tumor types among those in the set of 39 listed above. Expression levels of as few as about 20 to 49 can be used to classify among all 39 tumor types with varying degrees of accuracy. The invention may be practiced with the expression levels of about 10 or more, about 15 or more, about 20 or more, about 25 or more, about 30 or more, about 35 or more, about 40 or more, or about 45 or more to 49 transcribed sequences as found in the human “transcriptome” (transcribed portion of the genome). The invention may also be practiced with expression levels of about 10-20 or more, about 20-30 or more, about 30-40 or more, about 40-50 or more, or 49 transcribed sequences. In some embodiments of the invention, the transcribed genes may be randomly picked or include all or some of the specific gene sequences disclosed herein. As demonstrated herein, classification with accuracies of about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, or about 95% or higher can be performed by use of the instant invention.
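  • The random sampling of gene subsets underlying FIGS. 1-9 (combinations of 5 to 49 sequences, each sampled 10 times) can be sketched as below. This is an illustration only: the function name `sample_gene_subsets`, the fixed seed, and the use of integer indices to stand for gene sequences are assumptions, and the sketch covers the sampling step alone, not the accuracy estimation.

```python
import random

# Subset sizes recited for FIGS. 1, 4 and 7.
SUBSET_SIZES = [5, 10, 15, 20, 25, 30, 35, 40, 45, 49]

def sample_gene_subsets(n_sequences=100, repeats=10, seed=0):
    """Draw `repeats` random subsets of each size from `n_sequences` genes.

    Gene sequences are represented by integer indices 0..n_sequences-1.
    Returns a dict mapping subset size -> list of sorted index lists.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return {
        size: [sorted(rng.sample(range(n_sequences), size)) for _ in range(repeats)]
        for size in SUBSET_SIZES
    }

subsets = sample_gene_subsets(n_sequences=100, repeats=10)
print({size: len(draws) for size, draws in subsets.items()})
```

  • Each drawn subset would then be used to train and test a classifier on randomly chosen sets of 2 to 39 tumor types, giving one accuracy value per combination of subset size and number of tumor types.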
  • In other embodiments, the gene expression levels of other gene sequences may be determined along with the above described determinations of expression levels for use in classification. One non-limiting example of this is seen in the case of a microarray based platform to determine gene expression, where the expression of other gene sequences is also measured. Where those other expression levels are not used in classification, they may be considered the results of “excess” transcribed sequences and not critical to the practice of the invention. Alternatively, and where those other expression levels are used in classification, they are within the scope of the invention, where the description of using particular numbers of sequences does not necessarily exclude the use of expression levels of additional sequences. In some embodiments, the invention includes the use of expression level(s) from one or more “excess” gene sequences, such as those which may provide information redundant to one or more other gene sequences used in a method of the invention.
  • Because classification of a sample as containing cells of one of the above tumor types inherently also classifies the tissue or organ site origin of the sample, the methods of the invention may be applied to classification of a tumor sample as being of a particular tissue or organ site of the patient. This application of the invention is particularly useful in cases where the sample is of a tumor that is the result of metastasis by another tumor. In some embodiments of the invention, the tumor sample is classified as being one of the following 24: Adrenal, Bladder, Bone, Brain, Breast, Cervix, Endometrium, Esophagus, Gall Bladder, Kidney, Larynx, Liver, Lung, Lymph Node, Ovary, Pancreas, Prostate, Skin, Soft Tissue, Small/Large Bowel, Stomach, Testes, Thyroid, and Uterus.
  • While the invention also provides for classification as one of the above tumor types based upon comparisons to the expression levels of sequences in the 39 tumor types, it is possible that a higher level of confidence in the classification is desired. If an increase in the confidence of the classification is preferred, the classification can be adjusted to identify the tumor sample as being of a particular origin or cell type as shown in FIG. 10 . Thus an increase in confidence can be made in exchange for a decrease in specificity as to tumor type by identification of origin or cell type.
  • The classification of a cell containing sample as having a tumor cell of one of the 39 tumor types above inherently also classifies the tissue or organ site origin of the sample. For example, the identification of a sample as being cervix-squamous necessarily classifies the tumor as being of cervical origin, squamous cell type (and thus epithelial rather than non-epithelial in origin) as shown in FIG. 10 . It also means that the tumor was necessarily not germ cell in origin. Thus, the methods of the invention may be applied to classification of a tumor sample as being of a particular tissue or organ site of a subject or patient. This application of the invention is particularly useful in cases where the sample is of a tumor that is the result of metastasis by another tumor.
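  • The way a tumor type implies its broader origin can be sketched as a walk up a parent map. The fragment below encodes only a few branches of the tree as described in the text for FIGS. 10A-10D; the dictionary `PARENT` and the function `lineage` are hypothetical names, and the node labels are simplified for illustration.

```python
# Partial child -> parent map for a few branches of the classification tree.
PARENT = {
    "germ cell": "tuo",
    "non-germ cell": "tuo",
    "epithelial": "non-germ cell",
    "non-epithelial": "non-germ cell",
    "squamous": "epithelial",
    "non-squamous": "epithelial",
    "cervix-squamous": "squamous",
    "lung-squamous": "squamous",
    "lymphoma": "non-epithelial",
    "lymphoma-B cell": "lymphoma",
}

def lineage(tumor_type):
    """Return the path from a tumor type up to the root ("tuo")."""
    path = [tumor_type]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

path = lineage("cervix-squamous")
print(" -> ".join(path))
```

  • For a sample classified as cervix-squamous, the path passes through the squamous, epithelial, and non-germ cell nodes, so a germ cell origin is excluded.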
  • The practice of the invention to classify a cell containing sample as having a tumor cell of one of the above types is by use of an appropriate classification algorithm that utilizes supervised learning to accept 1) the levels of expression of the gene sequences in a plurality of known tumor types as a training set and 2) the levels of expression of the same genes in one or more cells of a sample to classify the sample as having cells of one of the tumor types. Further discussion of this is provided in the Example section herein. The levels of expression may be provided based upon the signals in any format, including nucleic acid expression or protein expression as described herein.
  • As would be evident to the skilled practitioner, the range of classification is affected by the number of tumor types as well as the number of samples for each tumor type. But given adequate samples of the full range of human tumors as provided herein, the invention is readily applied to the classification of those tumor types as well as additional types.
  • Non-limiting examples of classification algorithms that may be used in the practice of the invention include supervised learning algorithms, machine learning algorithms, linear discriminant analysis, attribute selection algorithms, and artificial neural networks (ANN). In preferred embodiments of the invention, a distance-based classification algorithm, such as the k-nearest neighbor (KNN) algorithm, or a support vector machine (SVM) is used.
  • KNN is used in some embodiments of the invention and is discussed further as a non-limiting representative example. KNN can be used to analyze the expression data of the genes in a “training set” of known tumor samples including all 39 of the tumor types described herein. The training data set can then be compared to the expression data for the same genes in a cell containing sample. The expression levels of the genes in the sample are then compared to the training data set via KNN to identify those tumor samples with the most similar expression patterns. As a non-limiting example, the five “nearest neighbors” may be identified and the tumor types thereof used to classify the unknown tumor sample. Of course other numbers of “nearest neighbors” may be used. Non-limiting examples include fewer than 5, about 7, about 9, or about 11 or more “nearest neighbors”.
  • As a hypothetical example, if the five “nearest neighbors” of an unknown sample are four B cell lymphomas and one T cell lymphoma, then the classification of the sample as being of a B cell lymphoma can be made with great accuracy. This has been used with 84% or greater accuracy, such as 90%, as described in the Examples.
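  • As a non-limiting illustration, the five “nearest neighbors” vote described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: Euclidean distance is assumed (the text says only that the algorithm is distance-based), the function name `knn_classify` is hypothetical, and the two-gene expression values are invented toy data chosen to reproduce the four-to-one hypothetical example.

```python
import math
from collections import Counter

def knn_classify(training, sample, k=5):
    """Return (predicted_type, votes) from the k nearest training samples.

    training: list of (expression_vector, tumor_type) pairs
    sample:   expression vector measured over the same genes
    """
    # Euclidean distance is an assumption, not specified by the source text.
    ranked = sorted(training, key=lambda item: math.dist(item[0], sample))
    votes = Counter(tumor_type for _, tumor_type in ranked[:k])
    return votes.most_common(1)[0][0], votes

# Toy training set (hypothetical values): two genes, two tumor types.
training = [
    ([1.0, 0.1], "B cell lymphoma"),
    ([0.9, 0.2], "B cell lymphoma"),
    ([1.1, 0.0], "B cell lymphoma"),
    ([0.8, 0.1], "B cell lymphoma"),
    ([0.7, 0.6], "T cell lymphoma"),
    ([0.1, 1.0], "T cell lymphoma"),
    ([0.0, 0.9], "T cell lymphoma"),
]

# The five nearest neighbors are four B cell and one T cell lymphoma.
label, votes = knn_classify(training, [0.9, 0.2], k=5)
```

  • The returned vote counts are kept alongside the label so that a split vote can be inspected rather than silently resolved.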
  • The classification ability may be combined with the inherent nature of the classification scheme to provide a means to increase the confidence of tumor classification in certain situations. For example, if the five “nearest neighbors” of a sample are three ovary clear cell and two ovary serous tumors, confidence can be improved by simply treating the tumors as being of ovarian origin and treating the subject or patient (from whom the sample was obtained) accordingly. See FIG. 10 . This is an example of trading off specificity in favor of increased confidence. This provides the added benefit of addressing the possibility that the unknown sample was a mucinous or endometrioid tumor. Of course the skilled practitioner is free to treat the tumor as one or both of these two most likely possibilities and proceed in accordance with that determination.
  • Because the developmental lineage of tumor cells in certain tumor types (e.g., germ cells) can be complex and involve multiple cell types, FIG. 10 may appear to be oversimplified. However, it serves as a good basis to relate known histopathology and to serve as a “guide tree” for analyzing and relating tumor-associated gene expression signatures.
  • The inherent nature of the classification scheme also provides a means to increase the confidence of tumor classification in cases wherein the “nearest neighbors” are ambiguous. For example, if the five “nearest neighbors” were one urinary bladder, one breast, one kidney, one liver, and one prostate, the classification can simply be that of a non-squamous cell tumor. Such a determination can be made with significant confidence and the subject or patient from whom the sample was obtained can be treated accordingly. Without being bound by theory, and offered solely to improve the understanding of the invention, the last two examples reflect the similarities in gene expression of cells of a similar cell type and/or tissue origin.
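  • The trade of specificity for confidence in the last two examples can be sketched as a walk up a “guide tree” to the most specific node shared by all of the nearest neighbors. The sketch below is illustrative only: FIG. 10 is not reproduced here, so the `PARENT` map is a hypothetical fragment whose node names and groupings are assumptions, and the function names are likewise hypothetical.

```python
# Hypothetical fragment of a guide tree, stored as child -> parent.
# The groupings are assumptions for illustration, not a copy of FIG. 10.
PARENT = {
    "ovary-clear cell": "ovary",
    "ovary-serous": "ovary",
    "ovary": "non-squamous",
    "breast": "non-squamous",
    "kidney": "non-squamous",
    "cervix-squamous": "squamous",
    "non-squamous": "epithelial",
    "squamous": "epithelial",
    "epithelial": "tumor",
}

def path_to_root(node):
    """List the node followed by each of its ancestors up to the root."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path

def common_ancestor(labels):
    """Most specific node that equals or is an ancestor of every label."""
    paths = [path_to_root(label) for label in labels]
    shared = set(paths[0]).intersection(*paths[1:])
    # Walking one path from leaf to root, the first shared node is the deepest.
    for node in paths[0]:
        if node in shared:
            return node
    return None

# Three ovary clear cell and two ovary serous neighbors resolve to "ovary".
ovarian = common_ancestor(["ovary-clear cell"] * 3 + ["ovary-serous"] * 2)
# Neighbors from unrelated organs resolve to a broader cell-type node.
mixed = common_ancestor(["ovary-serous", "breast", "kidney"])
```

  • When all neighbors agree, the function returns the tumor type itself, so the same call covers both the unambiguous and the ambiguous cases.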
  • Embodiments of the invention include use of the methods and materials described herein to identify the origin of a cancer from a patient. Thus given a sample containing tumor cells, the tissue origin of the tumor cells is identified by use of the present invention. One non-limiting example is in the case of a subject with an inflamed lymph node containing cancer cells. The cells may be from a tissue or organ that drains into the lymph node or they may be from another tissue source. The present invention may be used to classify the cells as being of a particular tumor or tissue type (or origin) which allows the identification of the source of the cancer cells. In an alternative non-limiting example, the sample (such as that from a lymph node) contains cells, which are first assayed by use of the invention to classify at least one cell as being a tumor cell of a tissue type or origin. This is then used to identify the source of the cancer cells in the sample. Both of these are examples of the advantageous use of the invention to save time, effort, and cost in the use of other cancer diagnostic tests.
  • In further embodiments, the invention is practiced with a sample from a subject with a previous history of cancer. As a non-limiting example, a cell containing sample (from the lymph node or elsewhere) of the subject may be found to contain cancer cells such that the present invention may be used to determine whether the cells are from the same or a different tissue from that of the previous cancer. This application of the invention may also be used to identify a new primary tumor, such as the case where new cancer cells are found in the liver of a subject who previously had breast cancer. The invention may be used to identify the new cancer cells as being the result of metastasis from the previous breast cancer (or from another tumor type, whether previously identified or not) or as a new primary occurrence of liver cancer. The invention may also be applied to samples of a tissue or organ where multiple cancers are found to determine the origin of each cancer, as well as whether the cancers are of the same origin.
  • While the invention may be practiced with the use of expression levels of a random group of expressed gene sequences, the invention also provides exemplary gene sequences for use in the practice of the invention. The invention includes a first group of 74 gene sequences from which about 5 to 49 may be used in the practice of the invention. The 5 to 49 gene sequences may be used along with the determination of expression levels of additional sequences so long as the expression levels of gene sequences from the set of 74 are used in classifying. A non-limiting example of such embodiments of the invention is where the expression of from about 5 to 49 of the 74 gene sequences is measured along with the expression levels of a plurality of other sequences, such as by use of a microarray based platform used to perform the invention. Where those other expression levels are not used in classification, they may be considered the results of “excess” transcribed sequences and not critical to the practice of the invention. Alternatively, and where those other expression levels are used in classification, they are within the scope of the invention, where the use of the above described sequences does not necessarily exclude the use of expression levels of additional sequences.
  • mRNA sequences corresponding to a set of 74 gene sequences for use in the practice of the invention are provided in Example 6 (Sequence Listing) below along with additional identifying information. The listing of the identifying information, including accession numbers and other information, is provided by the following.
  • >Hs.73995_mRNA_1 gi|190403|gb|M60502.1|HUMPROFILE Human profilaggrin mRNA,
    3′ end polyA=1
    >Hs.75236_mRNA_4 gi|14280328|gb|AY033998.1| Homo sapiens polyA=3
    >Hs.299867_mRNA_1 gi|4758533|ref|NM_004496.1| Homo sapiens hepatocyte
    nuclear factor 3, alpha (HNF3A), mRNA polyA=3
    >Hs.285401_contig1
    AI147926|AI880620|AA768316|AA761543|AA279147|AI216016|AI738663|N79248|
    AI684489|AA960845|AI718599|AI379138|N29366|BF002507|AW044269|R34339|R66326|
    H04648|R67467|AI523112|BF941500 polyA=2 polyA=3
    >Hs.182507_mRNA_1 gi|15431324|ref|NM_002283.2| Homo sapiens keratin, hair,
    basic, 5 (KRTHB5), mRNA polyA=3
    >Hs.292653_contig1
    AI200660|AW014007|AI341199|AI692279|AI393765|AI378686|AI695373|AW292108|
    T10352|R44346|AW470408|AI380925|BF938983|AW003704|H08077|F03856|H08075|F08895|
    AW468398|AI865976|H22568|AI858374|AI216499 polyA=2 polyA=3
    >Hs.97616_mRNA_3 gi|12654852|gb|BC001270.1|BC001270 Homo sapiens clone
    MGC:5069 IMAGE:3458016 polyA=3
    >Hs.123078_mRNA_3 gi|14328043|gb|BC009237.1|BC009237 Homo sapiens clone
    MGC:2216 IMAGE:2989823 polyA=3
    >Hs.285508_contig1 AW194680|BF939744|BF516467 polyA=1 polyA=1
    >Hs.183274_contig1
    BF437393|BF064008|BF509951|AW134603|AI277015|AI803254|AA887915|BF054958|
    AI004413|AI393911|AI278517|AW612644|AI492162|AI309226|AI863671|AA448864|AI640165|
    AA479926|AA461188|AA780161|BF591180|AI918020|AI758226|AI291375|BF001845|
    BF003064|AI337393|AI522206|BE856784|BF001760|AI280300 FLAG=1 polyA=2 WARN
    polyA=3
    >Hs.334841_mRNA_3 gi|14290606|gb|BC009084.1|BC009084 Homo sapiens clone
    MGC:9270 IMAGE:3853674 polyA=3
    >Hs.3321_contig1
    AI804745|AI492375|AA594799|BE672611|AA814147|AA722404|AW170088|D11718|
    BG153444|AI680648|AA063561|BE219054|AI590287|R55185|AI479167|AI796872|AI018324|
    AI701122|BE218203|AA905336|AI681917|BI084742|AI480008|AI217994|AI401468
    polyA=2 polyA=3
    >Hs.306216_singlet1 AW083022 polyA=1 polyA=2
    >Hs.99235_contig1 AA456140|AI167259|AA450056 polyA=2 polyA=3
    >Hs.169172_mRNA_2 gi|2274961|emb|AJ000388.1|HSCANPX Homo sapiens mRNA for
    calpain-like protease CANPX polyA=3
    >Hs.351486_mRNA_1 gi|16549178|dbj|AK054605.1|AK054605 Homo sapiens cDNA
    FLJ30043 fis, clone 3NB692001548 polyA=0
    >Hs.153504_contig2
    BE962007|AW016349|AW016358|AW139144|AA932969|AI025620|AI688744|AI865632|
    AA854291|AA932970|AU156702|AI634439|AA152496|AI539557|AI123490|AI613215|AI318363|
    AW105672|AA843483|AI366889|AW181938|AI813801|AI433695|AA934772|N72230|
    AI760632|BE858965|AW058302|AI760087|AI682077|AA886672|AI350384|AW243848|
    AW300574|BE466359|AI859529|AI921588|BF062899|BE855597|BE617708 polyA=2 polyA=3
    >Hs.199354_singlet1 AI669760 polyA=1 polyA=2
    >Hs.162020_contig1 AW291189|AA505872 polyA=2 polyA=3
    >Hs.30743_mRNA_3 gi|18201906|ref|NM_006115.2| Homo sapiens preferentially
    expressed antigen in melanoma (PRAME), mRNA polyA=3
    >Hs.271580_contig1
    AI632869|AW338882|AW338875|AW613773|AI982899|AW193151|BE206353|BE208200|
    AI811548|AW264021 polyA=2 polyA=3
    >Hs.69360_mRNA_2 gi|14250609|gb|BC008764.1|BC008764 Homo sapiens clone
    MGC:1266 IMAGE:3347571 polyA=3
    >Hs.30827_contig1 H07885|N39347|W85913|AA583408|W86449 polyA=2 polyA=3
    >Hs.211593_contig2
    BF592799|AI570478|AA234440|R40214|BE501078|AW593784|AI184050|AI284161|W72149|
    AW780437|AI247981|AW241273|H60824 polyA=2 polyA=3
    >Hs.155097_mRNA_1 gi|15080385|gb|BC011949.1|BC011949 Homo sapiens clone
    MGC:9006 IMAGE:3863603 polyA=3
    >Hs.5163_mRNA_1 gi|15990433|gb|BC015582.1|BC015582 Homo sapiens clone
    MGC:23280 IMAGE:4637504 polyA=3
    >Hs.55150_mRNA_1 gi|17068414|gb|BC017586.1|BC017586 Homo sapiens clone
    MGC:26610 IMAGE:4837506 polyA=3
    >Hs.170177_contig3
    AI620495|AW291989|AA780896|AA976262|AI298326|BF111862|AW591523|AI922518|
    AI480280|BF589437|AA600354|AI886238|AA035599|H90049|BF112011|N52601|AI570965|
    AI565367|AW768847|H90073|BE504361|N45292|AI632075|AA679729|AW168052|AI978827|
    AI968410|AI669255|N45300|AI651256|AI698970|AI521256|AW078614|AI802070|AI885947|
    AI342534|AI653624|AW243936|T16586|R15989|AI289789|AI871636|AI718785|
    AW148847 polyA=2 polyA=3
    >Hs.184601_mRNA_5 gi|4426639|gb|AF104032.1|AF104032 Homo sapiens polyA=2
    >Hs.351972_singlet1 AA865917 polyA=2 polyA=3
    >Hs.5366_mRNA_2 gi|15277845|gb|BC012926.1|BC012926 Homo sapiens clone
    MGC:16817 IMAGE:3853503 polyA=3
    >Hs.18140_contig1
    AI685931|AA410954|T97707|AA706873|AI911572|AW614616|AA548520|AW027764|
    BF511251|AI914294|AW151688 polyA=1 polyA=1
    >Hs.133196_contig2
    BF224381|BE467992|AW137689|AI695045|AW207361|BF445141|AA405473 polyA=2 WARN
    polyA=3
    >Hs.63325_mRNA_5 gi|15451939|ref|NM_019894.1| Homo sapiens transmembrane
    protease, serine 4 (TMPRSS4), mRNA polyA=3
    >Hs.250692_mRNA_2 gi|184223|gb|M95585.1|HUMHLF Human hepatic leukemia
    factor (HLF) mRNA, complete cds polyA=3
    >Hs.250726_singlet4 AW298545 polyA=2 polyA=3
    >Hs.79217_mRNA_2 gi|16306657|gb|BC001504.1|BC001504 Homo sapiens clone
    MGC:2273 IMAGE:3505512 polyA=3
    >Hs.47986_mRNA_1 gi|13279253|gb|BC004331.1|BC004331 Homo sapiens clone
    MGC:10940 IMAGE:3630835 polyA=3
    >Hs.94367_mRNA_1 gi|10440200|dbj|AK027147.1|AK027147 Homo sapiens cDNA:
    FLJ23494 fis, clone LNG01885 polyA=3
    >Hs.49215_contig1
    BI493248|N66529|AA452255|BI492877|AW196683|AI963900|BF478125|AI421654|BE466675
    polyA=1 polyA=1
    >Hs.281587_contig2
    R61469|R15891|AA007214|R61471|AI014624|N69765|AW592075|H09780|AA709038|
    AI335898|AI559229|F09750|R49594|H11055|T72573|AA935558|AA988654|AA826438|
    AI002431|AI299721 polyA=1 polyA=2
    >Hs.79378_mRNA_1 gi|16306528|ref|NM_003914.2| Homo sapiens cyclin A1
    (CCNA1), mRNA polyA=3
    >Hs.156469_contig2
    AI341378|AI670817|AI701687|AI335022|AW235883|AI948598|AA446356 polyA=2
    polyA=3
    >Hs.6631_mRNA_1 gi|7020430|dbj|AK000380.1|AK000380 Homo sapiens cDNA
    FLJ20373 fis, clone HEP19740 polyA=3
    >Hs.155977_contig1 AI309080|AI313045 polyA=1 WARN polyA=1
    >Hs.95197_mRNA_4 gi|5817138|emb|AL110274.1|HSM800829 Homo sapiens mRNA;
    cDNA DKFZp564I0272 (from clone DKFZp564I0272) polyA=3
    >Hs.48956_contig1 N64339|AI569513|AI694073 polyA=1 polyA=1
    >Hs.118825_mRNA_10 gi|1495484|emb|X96757.1|HSSAPKK3 H.sapiens mRNA for MAP
    kinase kinase polyA=3
    >Hs.135118_contig3
    AI683181|AI082848|AW770198|AI333188|AI873435|AW169942|AI806302|AW340718|
    BF196955|AA909720 polyA=1 polyA=2
    >Hs.171857_mRNA_1 gi|13161080|gb|AF332224.1|AF332224 Homo sapiens testis
    protein mRNA, partial cds polyA=3
    >Hs.18910_mRNA_3 gi|12804464|gb|BC001639.1|BC001639 Homo sapiens clone
    MGC:1944 IMAGE:2959372 polyA=3
    >Hs.194774_mRNA_1 gi|16306633|gb|BC001492.1|BC001492 Homo sapiens clone
    MGC:1774 IMAGE:3510004 polyA=3
    >Hs.127428_mRNA_2 gi|16306818|gb|BC006537.1|BC006537 Homo sapiens clone
    MGC:1934 IMAGE:2987903 polyA=3
    >Hs.126852_contig1
    AI802118|BF197404|BF224434|AA931964|AW236083|AI253119|AW614335|AI671372|
    AI793240|AW006851|AI953604|AI640505|AI633982|AW195809|AI493069|AW058576|
    AW293622 polyA=2 polyA=3
    >Hs.28149_mRNA_1 gi|14714936|gb|BC010626.1|BC010626 Homo sapiens clone
    MGC:17687 IMAGE:3865868 polyA=3
    >Hs.35453_mRNA_3 gi|7018494|emb|AL157475.1|HSM802461 Homo sapiens mRNA;
    cDNA DKFZp761G151 (from clone DKFZp761G151); partial cds polyA=3
    >Hs.180570_contig1 R08175|AA707224|AA699986|R11209|W89099|T98002|AA494546
    polyA=2 polyA=3
    >Hs.196270_mRNA_1 gi|11545416|gb|AF283645.1|AF283645 Homo sapiens
    chromosome 8 map 8q21 polyA=3
    >Hs.9030_mRNA_3 gi|12652600|gb|BC000045.1|BC000045 Homo sapiens clone
    MGC:2032 IMAGE:3504527 polyA=3
    >Hs.1282_mRNA_3 gi|4559405|ref|NM_000065.1| Homo sapiens complement
    component 6 (C6), mRNA polyA=1
    >Hs.268562_mRNA_2 gi|15341874|gb|BC013117.1|BC013117 Homo sapiens clone
    MGC:8711 IMAGE:3882749 polyA=3
    >Hs.151301_mRNA_3 gi|16041747|gb|BC015754.1|BC015754 Homo sapiens clone
    MGC:23085 IMAGE:4862492 polyA=3
    >Hs.111_contig1 AA946776|AW242338|H24274|AI078616 polyA=1 polyA=2
    >Hs.150753_contig1 AI123582|AI288234 polyA=0 polyA=0
    >Hs.82109_mRNA_1 gi|14250611|gb|BC008765.1|BC008765 Homo sapiens clone
    MGC:1622 IMAGE:3347793 polyA=3
    >Hs.44276_mRNA_2 gi|12654896|gb|BC001293.1|BC001293 Homo sapiens clone
    MGC:5259 IMAGE:3458115 polyA=3
    >Hs.2142_mRNA_4 gi|13325274|gb|BC004453.1|BC004453 Homo sapiens clone
    MGC:4303 IMAGE:2819400 polyA=3
    >Hs.180908_contig1 AA846824|AW611680|AA846182|AA846342|AA846360 polyA=2
    polyA=3
    >Hs.89436_mRNA_1 gi|16507959|ref|NM_004063.2| Homo sapiens cadherin 17, LI
    cadherin (liver-intestine) (CDH17), mRNA polyA=1
    >Hs.151544_mRNA_8 gi|3153107|emb|AL023657.1|HSDSHP Homo sapiens SH2D1A
    cDNA, formerly known as DSHP polyA=3
    >Hs.1657_contig4
    AW473119|AA164586|AI540656|AI758480|AI810941|AI978964|AI675862|AI784397|
    AW591562|AW514102|AI888116|AI983175|AI634735|AI669577|AI202659|AI910598|AI961352|
    AI565481|AI886254|AI538838|AA291749|AW571455|AI370308|AI274727|AW473925|
    AW514787|AI273871|AW470552|AI524356|AI888281|AW089672|AI952766|AW440601|
    AI654044|AW438839|AI972926 polyA=2 polyA=3
    >Hs.35984_mRNA_1 gi|6049161|gb|AF133587.1|AF133587 Homo sapiens chromosome
    22 map 22q11.2 polyA=3
    >Hs.334534_mRNA_2 gi|17389403|gb|BC017742.1|BC017742 Homo sapiens, clone
    IMAGE:4391536, mRNA polyA=3
    >Hs.60162_mRNA_1 gi|10437644|dbj|AK025181.1|AK025181 Homo sapiens cDNA:
    FLJ21528 fis, clone COL05977 polyA=3
  • As would be understood by the skilled person, detection of expression of any of the above identified sequences, or the sequences provided in Example 6 (Sequence Listing) below may be performed by the detection of expression of any appropriate portion or fragment of these sequences. Preferably, the portions are sufficiently large to contain unique sequences relative to other sequences expressed in a cell containing sample. Moreover, the skilled person would recognize that the disclosed sequences represent one strand of a double stranded molecule and that either strand may be detected as an indicator of expression of the disclosed sequences. This follows because the disclosed sequences are expressed as RNA molecules in cells which are preferably converted to cDNA molecules for ease of manipulation and detection. The resultant cDNA molecules may have the sequences of the expressed RNA as well as those of the complementary strand thereto. Thus either the RNA sequence strand or the complementary strand may be detected. Of course, it is also possible to detect the expressed RNA without conversion to cDNA.
  • In some embodiments of the invention, the expression levels of gene sequences are measured by detection of expressed sequences in a cell containing sample as hybridizing to the following oligonucleotides, which correspond to the above sequences as indicated by the accession numbers provided.
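  • Because either strand may be detected, a sequence written for one strand determines the sequence of the other. As a minimal sketch (the function name `reverse_complement` is hypothetical, and only unambiguous DNA bases in upper case are handled), the complementary strand can be derived in Python as follows:

```python
# Translation table for the four unambiguous DNA bases (upper case only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]

# A short illustrative input; applying the function twice restores it.
forward = "ATGC"
reverse = reverse_complement(forward)
restored = reverse_complement(reverse)
```

  • Applying the function twice returns the original sequence, which is a convenient sanity check when preparing probes against either strand.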
  • >AF133587
    CCCGGATCGCCATCAGTGTCATCGAGTTCA
    AACCCTGAGCCCTTCATTCACCTCTGTGAG
    >BC017742
    TGCCCTTGCTCTGTGTCATCTCAGTCATTT
    GACTTAGAAAGTGCCCTTCAAAAGGACCCT
    >BF437393
    GGAGGGAGGGCTAATTATATATTTTGTTGT
    TCCTCTATACTTTGTTCTGTTGTCTGCGCC
    >AI620495
    CAGTTTGGATTGTATAATAACGCCAAGCCC
    AGTTGTAGTCGTTTGAGTGCAGTAATGAAA
    >AK000380
    AAATCAGAGTAACCCTTTCTGTATTGAGTG
    CAGTGTTTTTTACTCTTTTCTCATGCACAT
    >BC009237
    TGCCTGGCACAAAGAAGGAAGAATATAAAT
    GATAGTTCGACTCGTCTGTGGAAGAACTTA
    >BC008765
    AGTCTTTTGCTTTTGGCAAAACTCTACTTA
    ATCCAATGGGTTTTTCCCTGTACAGTAGAT
    >BC001504
    GGTTACTGTGGGTGGAATAGTGGAGGCCTT
    CAACTGATTAGACAAGGCCCGCCCACATCT
    >NM_019894
    TAAAATGCACTGCCCTACTGTTGGTATGAC
    TACCGTTACCTACTGTTGTCATTGTTATTA
    >BF224381
    TTCTCTTTTGGGGGCAAACACTATGTCCTT
    TTCTTTTTCTAGATACAGTTAATTCCTGGA
    >AL157475
    AAGACCCACACCCTGTAGCAATACCAAGTG
    CTATTACATAATCAATGGACGATTTATACT
    >AY033998
    AGTGTTGCAAGTTTCCTTTAAAACCAACAA
    AGCCCACAAGTCCTGAATTTCCCATTCTTA
    >H07885
    GTCACTGTCATAGCAGCTGTGATTTCACAA
    GGAAGGGTGCTGCAGGGGGACCTGGTTGAT
    >NM_004496
    TTTCATCCAGTGTTATGCACTTTCCACAGT
    TGGTGTTAGTATAGCCAGAGGGTTTCATTA
    >AA846824
    GGGAAGTAGGGATTATTCGTTTAAATTCAA
    TCGCGAGCACCAAGTCGGACTGGCCGGGGA
    >BC017586
    GGGACCAGGCCCTGGGACAGCCATGTGGCT
    CCAAATGACTAAATGTCAGCTCAAAAACCA
    >AA456140
    TCCGTTTATGGAGGCAATTCCATATCCTTT
    CTTGAACGCACATTCAGCTTACCCCAGAGA
    >NM_002283
    AGAGTTAAGCCACTTCCTGGGTCTCCTTCT
    TATGACTGTCTATGGGTGCATTGCCTTCTG
    >AL023657
    GTGGCCTGAGTAATGCATTATGGGTGGTTT
    ACCATTTCTTGAGGTAAAAGCATCACATGA
    >BC001639
    ACACATGCATGTGTCTGTGTATGTGTGAAT
    GTGAGAGAGACACAGCCCTCCTTTCAGAAG
    >BC015754
    TCTGTAACTGCACAACCCTGGGGTTTGCTG
    CAGAGCTATTTCTTTCCATGTAAAGTAGTG
    >AF332224
    AAACACTCTTTCCGACTCCAGAGGAGAAGC
    TGGCAGCTCTCTGTAAGAAATATGCTGATC
    >BC001270
    GCTTCCTCTATCGCCCAATGCAAAATCGAT
    GAAATGGGGAGTTCTCTGGGCCAGGCCACA
    >AI147926
    GTAGAATCCTCTGTTCATAATGAACAAGAT
    GAACCAATGTGGATTAGAAAGAAGTCCGAG
    >AW298545
    CTGTTTTAAAACTGAATGGCACGAAATTGT
    TTTCCTCAACTCGGAGATTCCTGTATGGAG
    >AI802118
    AATAAATAGTAGCTCTGCTGATGATGACGT
    TGATAACCAAACTGTTCTGTGGTCTTAAGT
    >AI683181
    CAAACAGCCCGGTCTTGATGCAGGAGAGTC
    TGGAAAAGGAAGAAAATGGTTTCAGTTTCA
    >M95585
    AACATGGACCATCCAAATTTATGGCCGTAT
    CAAATGGTAGCTGAAAAAACTATATTTGAG
    >AK027147
    TTGTAATCATGCCAATTCCAGATCAATAAC
    TGCATGTCTGTTCTTTGGTAGAAATAGCTT
    >AW291189
    AAAGATTATTAACCCAAATCACCTTTCTTG
    CTTACTCCAGATGCCTCAGCCTCTGATATA
    >AI632869
    GACTTCCTTTAGGATCTCAGGCTTCTGCAG
    TTCTCATGACTCCTACTTTTCATCCTAGTC
    >BC006537
    CTGTATATTTTGCAATAGTTACCTCAAGGC
    CTACTGACCAAATTGTTGTGTTGAGATGAT
    >R61469
    TGTTCAAACAGACTTTAACCTCTGCATCAT
    ACTTAACCCTGCGACATGCGTACAGTATGC
    >BC009084
    TGAGTCATATACATTTACTGACCACTGTTG
    CTTGTTGCTCACTGTGCTGCTTTTCCATGA
    >N64339
    CTGAAATGTGGATGTGATTGCCTCAATAAA
    GCTCGTCCCCATTGCTTAAGCCTTCAAAAA
    >AI200660
    ATCAAGAAAACCTAATCTTCTGACTCCCAG
    GCCAGGATGTTTTATTTCTCACATCATGTC
    >AK054605
    TTCATTTCCAAACATCATCTTTAAGACTCC
    AAGGATTTTTCCAGGCACAGTGGCTCATAC
    >NM_006115
    AGTTAGAAATAGAATCTGAATTTCTAAAGG
    GAGATTCTGGCTTGGGAAGTACATGTAGGA
    >X96757
    CAATTTTCTTTTTACTCCCCCTCTTAAGGG
    GGCCTTGGAATCTATAGTATAGAATGAACT
    >AI804745
    GGGTGGAGTTTCAGTGAGAATAAACGTGTC
    TGCCTTTGTGTGTGTGTATATATACAGAGA
    >AJ000388
    CTCGCTCATTTTTTACCATGTTTTCCAGTC
    TGTTTAACTTCTGCAGTGCCTTCACTACAC
    >BC008764
    CTTTGGGCCGAGCACTGAATGTCTTGTACT
    TTAAAAAAATGTTTCTGAGACCTCTTTCTA
    >AI309080
    CTGGACCCTTGGAGCAGTGTTGTGTGAACT
    TGCCTAGAACTCTGCCTTCTCCGTTGTCAA
    >AA865917
    CCACCTCCTTCGACCTCCACTGCGCCCCAC
    CTCCCTGCCTGTGTGTGTTATTTCAAAGGA
    >AA946776
    TCTGGCTGGTGGCCTGCGCGAGGGTGCAGT
    CTTACTTAAAAGACTTTCAGTTAATTCTCA
    >AF104032
    AGATGCTGTCGGCACCATGTTTATTTATTT
    CCAGTGGTCATGCTCAGCCTTGCTGCTCTG
    >AW194680
    TCCTTCCTCTTCGGTGAATGCAGGTTATTT
    AAACTTTGGGAAATGTACTTTTAGTCTGTC
    >BC001293
    GTCCTGTCCCTGTCTGGGAGTTGTGTTATT
    TAAAGATATTCTGTATGTTGTATCTTTTGC
    >BE962007
    ATTATATTTCAGGTGTCCTGAACAGGTCAC
    TAGACTCTACATTGGGCAGCCTTTAAATAT
    >BI493248
    AGGAATGGTACTACCGTTCCAGATTTTCTG
    TAATTGCTTCTGCAAAGTAATAGGCTTCTT
    >AF283645
    CTGTACCCAAAGGATGCCAGAATACTAGTA
    TTTTTATTTATCGTAAACATCCACGAGTGC
    >AI669760
    ATTGCCCCCCTAACCAATCATGCAAACTTT
    TCCCCCCCTGGGGTAATTCACCAGTTAAAA
    >BC001492
    CCCACAGTATTTAATGCCCTGTCAGTCCCT
    TCTAGTCTGACTCAATGGTAACTTGCTGTA
    >BC004453
    AAAACCAACTCTCTACTACACAGGCCTGAT
    AACTCTGTACGAGGCTTCTCTAACCCCTAG
    >BC010626
    CTCAGACTGGGCTCCACACTCTTGGGCTTC
    AGTCTGCCCATCTGCTGAATGGAGACAGCA
    >BC013117
    CCTAATGGGGATTCCTCTGGTTGTTCACTG
    CCAAAACTGTGGCATTTTCATTACAGGAGA
    >BC011949
    CACTCACAATTGTTGACTAAAATGCTGCCT
    TTAAAACATAGGAAAGTAGAATGGTTGAGT
    >AW083022
    CTTTGAAGGGCTGCTGCACATTGTTGAATC
    CATCGACCTTTAGCTGCAATGGGATCTCTA
    >R08175
    TGCCTCATCGATATTATAGGGGTCCATCAC
    AACCCAACTGTGTGGCCGGATCCTGAGTCT
    >NM_000065
    AAAACAGACAAAAGCCTTTGCCTTCATGAA
    GCATACATTCATTCAGGGGTAGACACACAA
    >AK025181
    TAACAAACAAAGGCAGTAGCTCATCACTTG
    GGTAGCAGGTACCCATTTTAGGACCCTACA
    >NM_003914
    ATATCAGAAGTGCCAATAATCGTCATAGGC
    TTCTGCACGTTGGATCAACTAATGTTGTTT
    >AI123582
    ATCATAGCCCAACCATGTGAGAAGAAGGAG
    AAGGCCCCCCTTTCTTCATTAATCTGAAAA
    >BC004331
    GCAGACCATTCTATCATACCTGGCAGGGCT
    TCTGTTTTATTTTGTAGGCTGGATGCTACC
    >AI341378
    ACTACAAGCCTCTTGTTTTTCACCAAAACC
    CTACATCTCAGGCTTACTAATTTTTGTGAT
    >NM_004063
    GCCATGCATACATGCTGCGCATGTTTTCTT
    CATTCGTATGTTAGTAAAGTTTTGGTTATT
    >BC012926
    CACCTATTTATTTTACCTCTTTCCCAAACC
    TGGAGCATTTATGCCTAGGCTTGTCAAGAA
    >AL110274
    GTGGACATAGCCACTAACCAACTAGTTACC
    TTTGGACTGCAACAAAAAATGTGAAAATGA
    >AW473119
    ACTTGTAAACCTCTTTTGCACTTTGAAAAA
    GAATCCAGCGGGATGCTCGAGCACCTGTAA
    >AI685931
    AATTCTCTATAAACGGTTCACCAGCAAACC
    ACCAATACATTCCATTGTTTGCCTAGAGAG
    >BF592799
    AATGGCCCATGCATGCTGTTTGCAGCAGTC
    AATTGAGTTGAATTAGAATTCCAACCATAC
    >BC000045
    GAGCTCAGTACTTGCCCTGTGAAAATCCCA
    GAAGCCCCCGCTGTCAATGTTCCCCATCCA
    >BC015582
    ATGAAGCGGAATTAGGCTCCCGAGCTAAGG
    GACTCGCCTAGGGTCTCACAGTGAGTAGGA
    >M60502
    AGTGGCTATATCAACATCAGGGCTAGCACA
    TCTTTCTCTATTATCCTTCTATTGGAATTC
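  • A listing in this form (a line beginning with “>” carrying the accession number, followed by sequence lines) can be read mechanically. The following Python is a sketch only: the function name `parse_oligo_listing` is hypothetical, and it assumes that the sequence lines under one accession number are to be joined into a single sequence, which the listing itself does not state.

```python
def parse_oligo_listing(text):
    """Map each accession number to the sequence lines listed under it.

    Assumption: consecutive sequence lines under one '>' line are joined.
    """
    oligos = {}
    current = None
    for raw in text.splitlines():
        # Drop indentation and any leading bullet character.
        line = raw.strip().lstrip("•").strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].strip()
            oligos[current] = ""
        elif current is not None:
            oligos[current] += line
    return oligos

# The first two entries of the listing above, used as sample input.
listing = """
>AF133587
CCCGGATCGCCATCAGTGTCATCGAGTTCA
AACCCTGAGCCCTTCATTCACCTCTGTGAG
>BC017742
TGCCCTTGCTCTGTGTCATCTCAGTCATTT
GACTTAGAAAGTGCCCTTCAAAAGGACCCT
"""
oligos = parse_oligo_listing(listing)
lengths = {accession: len(seq) for accession, seq in oligos.items()}
```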
  • The invention also provides a second group of 90 gene sequences from which about 5 to 49 may be used in the practice of the invention. The about 5 to 49 gene sequences may be used along with the determination of expression levels of additional sequences so long as the expression levels of gene sequences from the set of 90 are used in classifying. A non-limiting example of such embodiments of the invention is where the expression of about 5 to 49 of the 90 gene sequences is measured along with the expression levels of a plurality of other sequences, such as by use of a microarray based platform used to perform the invention. Where those other expression levels are not used in classification, they may be considered the results of “excess” transcribed sequences and not critical to the practice of the invention. Alternatively, and where those other expression levels are used in classification, they are within the scope of the invention, where the use of the above described sequences does not necessarily exclude the use of expression levels of additional sequences.
  • 38 members of the set of 90 are included in the first set of 74 described above. The accession numbers of these members in common between the two sets are AA456140, AA846824, AA946776, AF332224, AI620495, AI632869, AI802118, AI804745, AJ000388, AK025181, AK027147, AL157475, AW194680, AW291189, AW298545, AW473119, BC000045, BC001293, BC001504, BC004453, BC006537, BC008765, BC009084, BC011949, BC012926, BC013117, BC015754, BE962007, BF224381, BF437393, BI493248, M60502, NM_000065, NM_003914, NM_004063, NM_004496, NM_006115, and R61469. mRNA sequences corresponding to members of the set of 90 that are not present in the set of 74 gene sequences are also provided in Example 6 (Sequence Listing) along with additional identifying information. The listing of the identifying information for these 52 unique members by accession numbers, as well as corresponding oligonucleotide sequences which may be used in the practice of the invention, is provided by the following.
  • >R15881
    ACTTCTGGTGATGATAAAAATGGTTTTATC
    ACCCAGATGTGAAAGAAGCTGCCTGTTTAC
    >AI041545
    GTGGTTCTGTAAAAACGCAGAGGAAAAGAG
    CCAGAAGGTTTCTGTTTAATGCATCTTGCC
    >NM_024423
    TTTATAAGGAAGCAGCTGTCTAAAATGCAG
    TGGGGTTTGTTTTGCAATGTTTTAAACAGA
    >AB038160
    CTTATGAAGCTGGCCGGGCCACTCACGTTC
    AATGGTACATCTGGGTCTCTATGTGGTTCT
    >AK026790
    GTGAGCCAGCATTTCCCATAGCTAACCCTA
    TTCTCTTAGTCTTTCAAAATGTAGAATGGG
    >BC012727
    CTTTACACCTGATAAAATATTTTGCGAAGA
    GAGGTGTTCTTTTTCCTTACTGGTGCTGAA
    >BC016451
    GCATACATCTCATCCACAGGGGAAGATAAA
    GATGGTCACACAAACAGTTTCCATAAAGAT
    >H09748
    TGAGTTCAGCATGTGTCTGTCCATTTCATT
    TGTACGCTTGTTCAAAACCAAGTTTGTTCT
    >NM_006142
    AAGACCGAGACTGAGGGAAAGCATGTCTGC
    TGGGTGTGACCATGTTTCCTCTCAATAAAG
    >AF191770
    GGCATCTGGCCCCTGGTAGCCAGCTCTCCA
    GAATTACTTGTAGGTAATTCCTCTCTTCAT
    >NM_006378
    TGGATGTTTGTGCGCGTGTGTGGACAGTCT
    TATCTTCCAGCATGATAGGATTTGACCATT
    >BC006819
    TCCTGGCAGAGCCATGGTCCCAGGCTTCCC
    AAAAGTGTTTGTGGCAATTATTCCCCTAGG
    >X79676
    TTTGATGATAGCAGACATTGTTACAAGGAC
    ATGGTGAGTCTATTTTTAATGCACCAATCT
    >BC006811
    TTCTTCCAGTTGCACTATTCTGAGGGAAAA
    TCTGACACCTAAGAAATTTACTGTGAAAAA
    >NM_000198
    GAACAATTGTGGTCTCTCTTAACTTGAGGT
    TCTCTTTTGACTAATAGAGCTCCATTTCCC
    >AF301598
    GTTAAGTGTGGCCAAGCGCACGGCGGCAAG
    TTTTCAAGCACTGAGTTTCTATTCCAAGAT
    >NM_002847
    CGGCCTACTGAGCGGACAGAATGATGCCAA
    AATATTGCTTATGTCTCTACATGGTATTGT
    >NM_004062
    CAGGGTGTTTGCCCAATAATAAAGCCCCAG
    AGAACTGGGCTGGGCCCTATGGGATTGGTA
    >AW118445
    TGTACAGTTTGGTTGTTGCTGTAAATATGG
    TAGCGTTTTGTTGTTGTTGTTTTTTCATGC
    >BC002551
    TACCAAACTGGGACTCACAGCTTTATTGGG
    CTTTCTTTGTGTCTTGTGTGTTTCTTTTAT
    >AA765597
    CATTGAGGTTTGGATGGTGGCAGGTAAAAC
    AGAAAGGCAAGATGTCATCTGACATTAGGC
    >AL137761
    AGTTCAGCACTGTGGTTATCATTGGTGATG
    CCAGAAAACATTAGTAGACTTAGACAATTG
    >X78202
    TAAAATTTCTTGATTGTGACTATGTGGTCA
    TATGCCCGTGTTTGTCACTTACAAAAATGT
    >AK025615
    AGCCATCTGGTGTGAAGAACTCTATATTTG
    TATGTTGAGAGGGCATGGAATAATTGTATT
    >BC001665
    CTTATTGTCACTGGTTAAGAACTTGGCGAG
    ATTGAAGGGCTTTTGTTATTGTTGTTGGAT
    >AI985118
    CTTTCTAGTGAGCTAACCGTAACAGAGAGC
    CTACAGGATACACGTGAGATAATGTCACGT
    >AL039118
    TTGTCTTAAAATTTCTTGATTGTGATACTG
    TGGTCATATGCCCGTGTTTGTCACTTACAA
    >AA782845
    CCTGGGGGAAAGGGGCATTCATGACCTGAA
    CTTTTTAGCAAATTATTATTCTCAGTTTCC
    >BC016340
    TTCATTAACAGTACTAAGTGGAAGGGATCT
    GCAGATTCCAAATTGGAATAAGCTCTATCA
    >AA745593
    CCAATGCAGAAGAGTATTAAGAAAGATGCT
    CAAGTCCCATGGCACAGAGCAAGGCGGGCA
    >NM_004967
    CAAGGCTACGATGGCTATGATGGTCAGAAT
    TACTACCACCACCAGTGAAGCTCCAGCCTG
    >BF510316
    AGCTCACAGCTGGACAGGTGTTGTATATAG
    AGTGGAATCTCTTGGATGCAGCTTCAAGAA
    >AA993639
    TCCAAAGTAGAAAGGGTTCTTTTAGAAAAC
    TTGAAGAATGTGCCTCCTCTTAGCATCTGT
    >AV656862
    GATGCATTTTTCAGTCCCTTTTCAGAGCAA
    ATGCTTTTGCAATGGTAGTAATGTTTAGTT
    >X69699
    CCTGTGGGGCTTCTCTCCTTGATGCTTCTT
    TCTTTTTTTAAAGACAACCTGCCATTACCA
    >BC013282
    TTGCACTAAGTCATGCTGTTTCCTCAAAGA
    AGCTTTGTTTTTTGTTAACGTATTACTCAG
    >AI457360
    CTGGATCCCAGGCCCTGGCACCCCTCAGGA
    AATACAAGAAAAAGAATATTCACATCTGTT
    >AW445220
    TTAGAGGGGCCACCTATCAACTCATCAGTG
    TTCAAAGAATATGCTGGGAGCATGGGTGAG
    >AF038191
    GGCCCATTTATGTCCCTCATGTCTCTAGAT
    TTTCTCGTCACCCAGCCTCAAAAATATATG
    >X05615
    TCCCCAAAAACCTCACCCGAGGCTGCCCAC
    TATGGTCATCTTTTTCTCTAAAATAGTTAC
    >BC005364
    GAAATTCCTCACACCTTGCACCTTCCCTAC
    TTTTCTGAATTGCTATGACTACTCCTTGTT
    >AK025701
    TGTCTGTCCACCACGAGATGGGAGGAGGAG
    AAAAAGCGGTACGATGCCTTCCTGACCTCA
    >BF446419
    GTCTTATCTCTCAGGGGGGGTTTAAGTGCC
    GTTTGCAATAATGTCGTCTTATTTATTTAG
    >AK025470
    CCGAGTAGTATGGGTCTCTGTGTGAGAAAC
    CAGGAGATATTTTCATCTTGTTCGGAAATA
    >BE552004
    TTGTGCAAAAGTCCCACAACCTTTCTGGAT
    TGATAGTTTGTGGTGAAATAAACAATTTTA
    >H05388
    TCCAGTATTCTGCAGGGCCAGTCAGTTGTA
    CAGAAGTTGGAATATTCTGTTCCAGAATTA
    >NM_033229
    GTCTCGAACAGCGGTTGTTTTTACTTTATT
    TATCTTAGGCCCTCAGCTCCCTGACGTCCT
    >BC010437
    AGTGAATCTTTTCCTCTTGGTAGCATCAAC
    ACTGGGGATAAATCAGAACCATTCTGTGGA
    >AI952953
    TGAGAGCCCAGAACAAGAAGGAGCAGAAGG
    GCACTTTGACCTTCATTATTATGAAAATCA
    >R45389
    GGAAGAACTGATGCTTGCTGCTAACTAAAG
    TTTTGGATGTATCGATTTAGAGAACCAATT
    >NM_001337
    GAATGAGAGAATAAGTCATGTTCCTTCAAG
    ATCATGTACCCCAATTTACTTGCCATTACT
    >AI499593
    TACGGAAAGGAAACAGGTTATACTCTTAGA
    TTTAAAAAGTGAAAGAAACTGCAGGCGCCT
  • In some embodiments of the invention, the expression levels of gene sequences are measured by detection of expressed sequences in a cell containing sample as hybridizing to the above oligonucleotides, which correspond to sequences in Example 6 (Sequence Listing) as indicated by the accession numbers provided.
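  • The relationship between the set of 74 and the set of 90 can be checked with simple set arithmetic. The sketch below uses only the set sizes and the accession numbers stated above as common to both sets; the variable names are hypothetical, and the combined count is a derived figure rather than one stated in the text.

```python
# Accession numbers stated in the text as common to both sets.
COMMON = """
AA456140 AA846824 AA946776 AF332224 AI620495 AI632869 AI802118 AI804745
AJ000388 AK025181 AK027147 AL157475 AW194680 AW291189 AW298545 AW473119
BC000045 BC001293 BC001504 BC004453 BC006537 BC008765 BC009084 BC011949
BC012926 BC013117 BC015754 BE962007 BF224381 BF437393 BI493248 M60502
NM_000065 NM_003914 NM_004063 NM_004496 NM_006115 R61469
""".split()

SET_74_SIZE = 74
SET_90_SIZE = 90

n_common = len(set(COMMON))                          # members shared by both sets
unique_to_90 = SET_90_SIZE - n_common                # members only in the set of 90
unique_to_74 = SET_74_SIZE - n_common                # members only in the set of 74
combined = SET_74_SIZE + SET_90_SIZE - n_common      # distinct members overall
```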
  • In additional embodiments, the invention provides for use of any number of the gene sequences of the set of 74 or the set of 90 in the methods of the invention. Thus anywhere from 1 to all of the 49 gene sequences used in the invention may be from either or both of the above sets. So from one, two, three, four, or five, or more of the about 5 to 49 sequences may be from the set of 74 or the set of 90. Similarly, and where from 10 to 49 sequences are used, six, seven, eight, nine, or ten of the sequences may be from one of these sets.
  • As used herein, a “tumor sample” or “tumor containing sample” or “tumor cell containing sample” or variations thereof, refer to cell containing samples of tissue or fluid isolated from an individual suspected of being afflicted with, or at risk of developing, cancer. The samples may contain tumor cells which may be isolated by known methods or other appropriate methods as deemed desirable by the skilled practitioner. These include, but are not limited to, microdissection, laser capture microdissection (LCM), or laser microdissection (LMD) before use in the instant invention. Alternatively, undissected cells within a “section” of tissue may be used. Non-limiting examples of such samples include primary isolates (in contrast to cultured cells) and may be collected by any non-invasive or minimally invasive means, including, but not limited to, ductal lavage, fine needle aspiration, needle biopsy, the devices and methods described in U.S. Pat. No. 6,328,709, or any other suitable means recognized in the art. Alternatively, the sample may be collected by an invasive method, including, but not limited to, surgical biopsy.
  • The detection and measurement of transcribed sequences may be accomplished by a variety of means known in the art or as deemed appropriate by the skilled practitioner. Essentially, any assay method may be used as long as the assay reflects, quantitatively or qualitatively, expression of the transcribed sequence being detected.
  • The ability to classify tumor samples is provided by the recognition of the relevance of the level of expression of the gene sequences (whether randomly selected or specific) and not by the form of the assay used to determine the actual level of expression. An assay of the invention may utilize any identifying feature of an individual gene sequence as disclosed herein as long as the assay reflects, quantitatively or qualitatively, expression of the gene in the “transcriptome” (the transcribed fraction of genes in a genome) or the “proteome” (the translated fraction of expressed genes in a genome). Additional assays include those based on the detection of polypeptide fragments of the relevant member or members of the proteome. Non-limiting examples of the latter include detection of proteolytic fragments found in a biological fluid, such as blood or serum. Identifying features include, but are not limited to, unique nucleic acid sequences used to encode (DNA), or express (RNA), said gene or epitopes specific to, or activities of, a protein encoded by a gene sequence.
  • Additional means include detection of nucleic acid amplification as indicative of increased expression levels and nucleic acid inactivation, deletion, or methylation, as indicative of decreased expression levels. Stated differently, the invention may be practiced by assaying one or more aspects of the DNA template(s) underlying the expression of each gene sequence, of the RNA used as an intermediate to express the sequence, or of the proteinaceous product expressed by the sequence, as well as proteolytic fragments of such products. As such, the detection of the presence of, amount of, stability of, or degradation (including rate) of, such DNA, RNA and proteinaceous molecules may be used in the practice of the invention.
  • In some embodiments, all or part of a gene sequence may be amplified and detected by methods such as the polymerase chain reaction (PCR) and variations thereof, such as, but not limited to, quantitative PCR (Q-PCR), reverse transcription PCR (RT-PCR), and real-time PCR (including as a means of measuring the initial amounts of mRNA copies for each sequence in a sample), optionally real-time RT-PCR or real-time Q-PCR. Such methods would utilize one or two primers that are complementary to portions of a gene sequence, where the primers are used to prime nucleic acid synthesis. The newly synthesized nucleic acids are optionally labeled and may be detected directly or by hybridization to a polynucleotide of the invention. The newly synthesized nucleic acids may be contacted with polynucleotides (containing gene sequences) of the invention under conditions which allow for their hybridization. Additional methods to detect the expression of expressed nucleic acids include RNAse protection assays, including liquid phase hybridizations, and in situ hybridization of cells.
  • Alternatively, the expression of gene sequences in FFPE samples may be detected as disclosed in U.S. applications 60/504,087, filed Sep. 19, 2003, Ser. No. 10/727,100, filed Dec. 2, 2003, and Ser. No. 10/773,761, filed Feb. 6, 2004 (all three of which are hereby incorporated by reference as if fully set forth). Briefly, the expression of all or part of an expressed gene sequence or transcript may be detected by use of hybridization mediated detection (such as, but not limited to, microarray, bead, or particle based technology) or quantitative PCR mediated detection (such as, but not limited to, real time PCR and reverse transcriptase PCR) as non-limiting examples. The expression of all or part of an expressed polypeptide may be detected by use of immunohistochemistry techniques or other antibody mediated detection (such as, but not limited to, use of labeled antibodies that bind specifically to at least part of the polypeptide relative to other polypeptides) as non-limiting examples. Additional means for analysis of gene expression are available, including detection of expression within an assay for global, or near global, gene expression in a sample (e.g. as part of a gene expression profiling analysis such as on a microarray). Non-limiting examples include linear RNA amplification and those described in U.S. patent application Ser. No. 10/062,857 (filed on Oct. 25, 2001), as well as U.S. Provisional Patent Applications 60/298,847 (filed Jun. 15, 2001) and 60/257,801 (filed Dec. 22, 2000), all of which are hereby incorporated by reference in their entireties as if fully set forth.
  • Embodiments using a nucleic acid based assay to determine expression include immobilization of one or more gene sequences on a solid support, including, but not limited to, a solid substrate as an array or to beads or bead based technology as known in the art. Alternatively, solution based expression assays known in the art may also be used. The immobilized gene sequence(s) may be in the form of polynucleotides that are unique or otherwise specific to the gene(s) such that the polynucleotides would be capable of hybridizing to the DNA or RNA of said gene(s). These polynucleotides may be the full length of the gene(s) or be short sequences of the genes (up to one nucleotide shorter than the full length sequence known in the art by deletion from the 5′ or 3′ end of the sequence) that are optionally minimally interrupted (such as by mismatches or inserted non-complementary basepairs) such that hybridization with a DNA or RNA corresponding to the genes is not affected. In some embodiments, the polynucleotides used are from the 3′ end of the gene, such as within about 350, about 300, about 250, about 200, about 150, about 100, or about 50 nucleotides from the polyadenylation signal or polyadenylation site of a gene or expressed sequence. Polynucleotides containing mutations relative to the sequences of the disclosed genes may also be used so long as the presence of the mutations still allows hybridization to produce a detectable signal. Thus the practice of the present invention is unaffected by the presence of minor mismatches between the disclosed sequences and those expressed by cells of a subject's sample. A non-limiting example of the existence of such mismatches is seen in cases of sequence polymorphisms between individuals of a species, such as individual human patients within Homo sapiens.
  • As will be appreciated by those skilled in the art, some gene sequences include 3′ poly A (or poly T on the complementary strand) stretches that do not contribute to the uniqueness of the disclosed sequences. The invention may thus be practiced with gene sequences lacking the 3′ poly A (or poly T) stretches. The uniqueness of the disclosed sequences refers to the portions or entireties of the sequences which are found only in nucleic acids, including unique sequences found at the 3′ untranslated portion thereof. Some unique sequences for the practice of the invention are those which contribute to the consensus sequences for the genes such that the unique sequences will be useful in detecting expression in a variety of individuals rather than being specific for a polymorphism present in some individuals. Alternatively, sequences unique to an individual or a subpopulation may be used. The unique sequences may be the lengths of polynucleotides of the invention as described herein.
  • In additional embodiments of the invention, polynucleotides having sequences present in the 3′ untranslated and/or non-coding regions of gene sequences are used to detect expression levels in cell containing samples of the invention. Such polynucleotides may optionally contain sequences found in the 3′ portions of the coding regions of gene sequences. Polynucleotides containing a combination of sequences from the coding and 3′ non-coding regions preferably have the sequences arranged contiguously, with no intervening heterologous sequence(s).
  • Alternatively, the invention may be practiced with polynucleotides having sequences present in the 5′ untranslated and/or non-coding regions of gene sequences to detect the level of expression in cells and samples of the invention. Such polynucleotides may optionally contain sequences found in the 5′ portions of the coding regions. Polynucleotides containing a combination of sequences from the coding and 5′ non-coding regions may have the sequences arranged contiguously, with no intervening heterologous sequence(s). The invention may also be practiced with sequences present in the coding regions of gene sequences.
  • The polynucleotides of some embodiments contain sequences from 3′ or 5′ untranslated and/or non-coding regions of at least about 16, at least about 18, at least about 20, at least about 22, at least about 24, at least about 26, at least about 28, at least about 30, at least about 32, at least about 34, at least about 36, at least about 38, at least about 40, at least about 42, at least about 44, or at least about 46 consecutive nucleotides. The term “about” as used in the previous sentence refers to an increase or decrease of 1 from the stated numerical value. Other embodiments use polynucleotides containing sequences of at least or about 50, at least or about 100, at least or about 150, at least or about 200, at least or about 250, at least or about 300, at least or about 350, or at least or about 400 consecutive nucleotides. The term “about” as used in the preceding sentence refers to an increase or decrease of 10% from the stated numerical value.
  • Sequences from the 3′ or 5′ end of gene coding regions as found in polynucleotides of the invention are of the same lengths as those described above, except that they would naturally be limited by the length of the coding region. The 3′ end of a coding region may include sequences up to the 3′ half of the coding region. Conversely, the 5′ end of a coding region may include sequences up to the 5′ half of the coding region. Of course the above described sequences, or the coding regions and polynucleotides containing portions thereof, may be used in their entireties.
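  • The two meanings of “about” given above can be worked through numerically. The following Python is a minimal sketch only; the function name `about_range` and the two constants are hypothetical, and the sketch assumes that “about” applies to exactly the lengths listed in each sentence.

```python
# Lengths listed in the first sentence (16, 18, ..., 46): "about" means +/- 1.
SHORT_LENGTHS = range(16, 47, 2)
# Lengths listed in the second sentence (50, 100, ..., 400): "about" means +/- 10%.
LONG_LENGTHS = range(50, 401, 50)

def about_range(value):
    """Return the (low, high) bounds implied by "about" for a listed length."""
    if value in SHORT_LENGTHS:
        return (value - 1, value + 1)
    if value in LONG_LENGTHS:
        # Multiply before dividing so the bounds stay exact for these values.
        return (value * 9 / 10, value * 11 / 10)
    raise ValueError(f"{value} is not one of the listed lengths")

short_bounds = about_range(20)   # nucleotides, +/- 1 rule
long_bounds = about_range(100)   # nucleotides, +/- 10% rule
```

  • Under these assumptions, “about 20” spans 19 to 21 consecutive nucleotides, while “about 100” spans 90 to 110.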
  • In another embodiment of the invention, polynucleotides containing deletions of nucleotides from the 5′ and/or 3′ end of gene sequences may be used. The deletions are preferably of 1-5, 5-10, 10-15, 15-20, 20-25, 25-30, 30-35, 35-40, 40-45, 45-50, 50-60, 60-70, 70-80, 80-90, 90-100, 100-125, 125-150, 150-175, or 175-200 nucleotides from the 5′ and/or 3′ end, although the extent of the deletions would naturally be limited by the length of the sequences and the need to be able to use the polynucleotides for the detection of expression levels.
  • Other polynucleotides of the invention from the 3′ end of gene sequences include those of primers and optional probes for quantitative PCR. Preferably, the primers and probes are those which amplify a region less than about 750, less than about 700, less than about 650, less than about 600, less than about 550, less than about 500, less than about 450, less than about 400, less than about 350, less than about 300, less than about 250, less than about 200, less than about 150, less than about 100, or less than about 50 nucleotides from the polyadenylation signal or polyadenylation site of a gene or expressed sequence. A PCR amplicon of the invention may be of any size, including at least or about 50, at least or about 100, at least or about 150, at least or about 200, at least or about 250, at least or about 300, at least or about 350, or at least or about 400 consecutive nucleotides, all with inclusion of the portion complementary to the PCR primers used.
  • Other polynucleotides for use in the practice of the invention include those that have sufficient homology to gene sequences to detect their expression by use of hybridization techniques. Such polynucleotides preferably have about or 95%, about or 96%, about or 97%, about or 98%, or about or 99% identity with the gene sequences to be used. Identity is determined using the BLAST algorithm, as described above. The other polynucleotides for use in the practice of the invention may also be described on the basis of the ability to hybridize to polynucleotides of the invention under stringent conditions of about 30% v/v to about 50% formamide and from about 0.01M to about 0.15M salt for hybridization and from about 0.01M to about 0.15M salt for wash conditions at about 55 to about 65° C. or higher, or conditions equivalent thereto.
  • In a further embodiment of the invention, a population of single stranded nucleic acid molecules comprising one or both strands of a human gene sequence is provided as a probe such that at least a portion of said population may be hybridized to one or both strands of a nucleic acid molecule quantitatively amplified from RNA of a cell or sample of the invention. The population may be only the antisense strand of a human gene sequence such that a sense strand of a molecule from, or amplified from, a cell may be hybridized to a portion of said population. The population preferably comprises a sufficiently excess amount of said one or both strands of a human gene sequence in comparison to the amount of expressed (or amplified) nucleic acid molecules containing a complementary gene sequence.
  • The invention further provides a method of classifying a human tumor sample by detecting the expression levels of about 5 to 49 transcribed sequences in a nucleic acid or cell containing sample obtained from a human subject, and classifying the sample as containing a tumor cell of a tumor type found in humans to the exclusion of one or more other human tumor types. In some embodiments, the method may be used to classify a sample as being, or having cells of, one of the 53 tumor types listed above to the exclusion of one or more of the other 52. In other embodiments, the method is used to classify a sample as being, or having cells of, one of the 34 tumor types listed above to the exclusion of one or more of the other 33 tumor types. In further embodiments, the method is used to classify a sample as being, or having cells of, one of the 39 tumor types listed above to the exclusion of one or more of the other 38 tumor types.
  • The invention also provides a method for classifying tumor samples as being one of a subset of the possible tumor types described herein by detecting the expression levels of 50 or more transcribed sequences in a nucleic acid containing tumor sample obtained from a human subject, and classifying the sample as being one of a number of tumor types found in humans to the exclusion of one or more other human tumor types. In some embodiments of the invention, the number of other tumor types is from 1 to about 3, more preferably from 1 to about 5, from 1 to about 7, or from 1 to about 9 or about 10. In other embodiments, the tumor types are all of the same tissue or organ origin such as those listed above. This aspect of the invention is related to the above discussion of FIG. 10 and of trading off specificity in favor of increased confidence, and may be advantageously applied to situations where the classification of a sample as a single tumor type is at a level of accuracy or performance that can be improved by classifying the sample as one of a subset of possible tumor types.
  • In additional embodiments, the invention may be practiced by analyzing gene expression from single cells or homogenous cell populations which have been dissected away from, or otherwise isolated or purified from, contaminating cells of a sample as present in a simple biopsy. One advantage provided by these embodiments is that contaminating, non-tumor cells (such as infiltrating lymphocytes or other immune system cells) may be removed and so be absent from affecting the genes identified or the subsequent analysis of gene expression levels as provided herein. Such contamination is present where a biopsy is used to generate gene expression profiles.
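  • The trade-off between specificity and confidence described above can be illustrated with a short Python sketch. It assumes, hypothetically, that a classifier has already produced a probability for each tumor type; the source does not state how such values are obtained or combined. The function name `candidate_tumor_types`, the thresholds, and the tumor type labels are all illustrative assumptions.

```python
def candidate_tumor_types(probabilities, min_confidence=0.9, max_types=10):
    """Return the top-ranked tumor types whose combined probability
    reaches min_confidence, together with that combined probability.

    probabilities: dict mapping tumor type label -> probability (assumed input).
    max_types: upper bound on the size of the returned subset.
    """
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    chosen, total = [], 0.0
    for tumor_type, probability in ranked:
        if len(chosen) >= max_types:
            break
        chosen.append(tumor_type)
        total += probability
        if total >= min_confidence:
            break
    return chosen, total

# Hypothetical classifier output for one sample (not data from the source).
example = {"type_A": 0.5, "type_B": 0.3, "type_C": 0.15, "type_D": 0.05}
subset, confidence = candidate_tumor_types(example, min_confidence=0.9)
```

  • In this sketch a single call for `type_A` would carry a probability of only 0.5, whereas reporting the subset of three types raises the combined confidence above the chosen threshold at the cost of a less specific answer.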
  • In further embodiments of the invention utilizing Q-PCR or reverse transcriptase Q-PCR as the assay platform, the expression levels of gene sequences of the invention may be compared to expression levels of reference genes in the same sample or a ratio of expression levels may be used. This provides a means to “normalize” the expression data for comparison of data on a plurality of known tumor types and a cell containing sample to be assayed. While a variety of reference genes may be used, the invention may also be practiced with the use of 8 particular reference gene sequences that were identified for use with the set of 39 tumor types. Moreover, the Q-PCR may be performed in whole or in part with use of a multiplex format.
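  • As a minimal sketch of the normalization described above, the following Python assumes that Q-PCR results are reported as cycle threshold (Ct) values and that each gene is normalized by subtracting the mean Ct of the reference genes measured in the same sample (the commonly used delta-Ct approach). The source does not specify this formula; the function name `normalize_ct`, the gene labels `geneA` and `geneB`, and all Ct values are hypothetical.

```python
from statistics import mean

def normalize_ct(target_ct, reference_ct):
    """Return delta-Ct values: each target Ct minus the mean reference Ct.

    target_ct: dict mapping gene identifier -> Ct value in one sample.
    reference_ct: dict mapping reference gene identifier -> Ct in the same sample.
    """
    if not reference_ct:
        raise ValueError("at least one reference gene Ct is required")
    reference_mean = mean(reference_ct.values())
    return {gene: ct - reference_mean for gene, ct in target_ct.items()}

# Hypothetical Ct values for illustration only (not data from the source).
sample_targets = {"geneA": 27.5, "geneB": 31.0}
sample_references = {"BC016680": 22.0, "BC003043": 24.0}
delta_ct = normalize_ct(sample_targets, sample_references)
```

  • Because every sample is expressed relative to its own reference genes, delta-Ct values from known tumor types and from a sample to be assayed can be compared on a common scale.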
  • mRNA sequences corresponding to the 8 reference sequences are provided in Example 6 (Sequence Listing) along with additional identifying information. The listing of the identifying information, including accession numbers and other information, is provided by the following.
  • >Hs.77031_mRNA_1 gi|16741772|gb|BC016680.1|BC016680 Homo sapiens clone
    MGC:21349 IMAGE:4338754 polyA=3
    >Hs.77541_mRNA_1 gi|12804364|gb|BC003043.1|BC003043 Homo sapiens clone
    MGC:4370 IMAGE:2822973 polyA=3
    >Hs.7001_mRNA_1 gi|6808256|emb|AL137727.1|HSM802274 Homo sapiens mRNA; cDNA
    DKFZp434M0519 (from clone DKFZp434M0519); partial cds polyA=3
    >Hs.302144_mRNA_1 gi|11493400|gb|AF130047.1|AF130047 Homo sapiens clone
    FLB3020 polyA=0
    >Hs.26510_mRNA_2 gi|11345385|gb|AF308803.1|AF308803 Homo sapiens chromosome
    15 map 15q26 polyA=3
    >Hs.324709_mRNA_2 gi|12655026|gb|BC001361.1|BC001361 Homo sapiens clone
    MGC:2474 IMAGE:3050694 polyA=2
    >Hs.65756_mRNA_3 gi|3641494|gb|AF035154.1|AF035154 Homo sapiens chromosome
    16 map 16p13.3 polyA=3
    >Hs.165743_mRNA_2 gi|13543889|gb|BC006091.1|BC006091 Homo sapiens clone
    MGC:12673 IMAGE:3677524 polyA=3
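  • The identifying information above follows a regular pattern (cluster identifier, GI number, database, accession with version, description, and a polyA value). The following Python is a sketch of one way such header records might be parsed; the regular expression is inferred only from the eight records shown and is an assumption, as are the names `HEADER_RE` and `parse_headers`.

```python
import re

# Pattern inferred from the records listed above (an assumption, not a
# published format specification).
HEADER_RE = re.compile(
    r">(?P<cluster>Hs\.\d+)_mRNA_\d+\s+"
    r"gi\|(?P<gi>\d+)\|(?P<db>\w+)\|(?P<version>[\w.]+)\|(?P<locus>\w+)\s+"
    r"(?P<description>.*?)\s*polyA=(?P<polya>\d+)$"
)

def parse_headers(text):
    """Parse header records that may wrap over several lines."""
    records, current = [], None
    for raw in text.splitlines():
        # Drop surrounding whitespace and any leading bullet character.
        line = raw.strip().lstrip("•").strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                records.append(current)
            current = line
        elif current is not None:
            current += " " + line  # continuation of a wrapped header
    if current is not None:
        records.append(current)

    parsed = []
    for record in records:
        match = HEADER_RE.match(record)
        if match:
            fields = match.groupdict()
            # Accession without its version suffix, e.g. BC016680.1 -> BC016680.
            fields["accession"] = fields["version"].split(".")[0]
            parsed.append(fields)
    return parsed

listing = """>Hs.77031_mRNA_1 gi|16741772|gb|BC016680.1|BC016680 Homo sapiens clone
MGC:21349 IMAGE:4338754 polyA=3
>Hs.7001_mRNA_1 gi|6808256|emb|AL137727.1|HSM802274 Homo sapiens mRNA; cDNA
DKFZp434M0519 (from clone DKFZp434M0519); partial cds polyA=3"""
headers = parse_headers(listing)
```

  • The accession is taken from the versioned field rather than the final field because, for the AL137727 record, the final field (HSM802274) differs from the accession number.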
  • Detection of expression of any of the above reference sequences may be by the same or different methodology as for the other gene sequences described above. In some embodiments of the invention, the expression levels of gene sequences are measured by detection of expressed sequences in a cell containing sample as hybridizing to the following oligonucleotides, which correspond to the above sequences as indicated by the accession numbers provided.
  • >BC006091
    TCATCTTCACCAAACCAGTCCGAGGGGTCG
    AAGCCAGACACGAGAGGAAGAGGGTCCTGG
    >BC003043
    CTCTGCTCCTGCTCCTGCCTGCATGTTCTC
    TCTGTTGTTGGAGCCTGGAGCCTTGCTCTC
    >AF130047
    TGCTCCCGGCTGTCCTCCTCTCCTCTTCCC
    TAGTGAGTGGTTAATGAGTGTTAATGCCTA
    >AF035154
    CCCCATCTCTAAAACCAGTAAATCAGCCAG
    CGAATACCCGGAAGCAAGATGCACAGGCGG
    >BC001361
    CCAGAAACAAGGAAGAGGAAAGACAAAGGG
    AAGGGACGGGAGCCCTGGAGAAGCCCGACC
    >AF308803
    AAGTACAACCCATGCTGCTAAGATGCGAGC
    AGGAAGAGGCATCCTTTGCTAAATCCTGTT
    >BC016680
    ACCTCACCCCTGCCCGGCCCAAGCTCTACT
    TGTGTACAGTGTATATTGTATAATAGACAA
    >AL137727
    TTCCCTTAATTCCTCCTCCCGACCTTTTTT
    ACCCCCCCAGTTGCAGTATTTAACTGGGCT
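  • The oligonucleotide listing above can be read into a lookup table keyed by accession number. The following Python is a sketch under stated assumptions: the source does not say whether the two lines under each accession are two separate oligonucleotides or one wrapped sequence, so the sketch keeps each line as its own entry. The function name `parse_oligos` is hypothetical.

```python
def parse_oligos(text):
    """Map each accession number to the list of sequence lines beneath it."""
    oligos = {}
    accession = None
    for raw in text.splitlines():
        # Drop surrounding whitespace and any leading bullet character.
        line = raw.strip().lstrip("•").strip()
        if not line:
            continue
        if line.startswith(">"):
            accession = line[1:].strip()
            oligos[accession] = []
        elif accession is not None:
            sequence = line.upper()
            unexpected = set(sequence) - set("ACGT")
            if unexpected:
                raise ValueError(f"unexpected characters in {accession}: {unexpected}")
            oligos[accession].append(sequence)
    return oligos

# Two records copied from the listing above.
listing = """>BC006091
TCATCTTCACCAAACCAGTCCGAGGGGTCG
AAGCCAGACACGAGAGGAAGAGGGTCCTGG
>BC003043
CTCTGCTCCTGCTCCTGCCTGCATGTTCTC
TCTGTTGTTGGAGCCTGGAGCCTTGCTCTC"""
reference_oligos = parse_oligos(listing)
oligo_lengths = {acc: [len(s) for s in seqs] for acc, seqs in reference_oligos.items()}
```

  • Keeping the accession as the key lets each oligonucleotide be matched back to the corresponding reference sequence record.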
  • In an additional aspect, the methods provided by the present invention may also be automated in whole or in part. This includes the embodiment of the invention in software. Non-limiting examples include processor executable instructions on one or more computer readable storage devices wherein said instructions direct the classification of tumor samples based upon gene expression levels as described herein. Additional processor executable instructions on one or more computer readable storage devices are contemplated wherein said instructions cause representation and/or manipulation, via a computer output device, of the process or results of a classification method.
  • The invention includes software and hardware embodiments wherein the gene expression data of a set of gene sequences in a plurality of known tumor types is embodied as a data set. In some embodiments, the gene expression data set is used for the practice of a method of the invention. The invention also provides computer related means and systems for performing the methods disclosed herein. In some embodiments, an apparatus for classifying a cell containing sample is provided. Such an apparatus may comprise a query input configured to receive a query; a storage configured to store a gene expression data set, as described herein, received from a query input; and a module for accessing and using data from the storage in a classification algorithm as described herein. The apparatus may further comprise a string storage for the results of the classification algorithm, optionally with a module for accessing and using data from the string storage in an output algorithm as described herein.
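  • The apparatus described above (a stored data set, a classification module, and a storage for results) can be sketched in Python as follows. This is an illustration only: the class name `ClassifierApparatus`, the tumor type labels, and the expression values are hypothetical, and the nearest-profile rule using Euclidean distance is a simple stand-in for the classification algorithm described elsewhere herein, not that algorithm itself.

```python
import math
from dataclasses import dataclass, field

@dataclass
class ClassifierApparatus:
    # Storage: tumor type label -> expression profile (one value per gene sequence).
    reference_profiles: dict
    # Storage for the results of the classification step.
    results: list = field(default_factory=list)

    def classify(self, query_profile):
        """Return the tumor type whose stored profile is closest to the query.

        Stand-in rule (an assumption): smallest Euclidean distance.
        """
        best_type = min(
            self.reference_profiles,
            key=lambda t: math.dist(self.reference_profiles[t], query_profile),
        )
        self.results.append(best_type)
        return best_type

# Hypothetical two-gene profiles for illustration only.
apparatus = ClassifierApparatus({"type_A": [1.0, 0.0], "type_B": [0.0, 1.0]})
call = apparatus.classify([0.9, 0.1])
```

  • Separating the stored data set, the classification step, and the result storage mirrors the components listed above and allows any one of them to be replaced independently.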
  • The steps of a method, process, or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The various steps or acts in a method or process may be performed in the order shown, or may be performed in another order. Additionally, one or more process or method steps may be omitted or one or more process or method steps may be added to the methods and processes. An additional step, block, or action may be added in the beginning, end, or intervening existing elements of the methods and processes.
  • A further aspect of the invention provides for the use of the present invention in relation to clinical activities. In some embodiments, the determination or measurement of gene expression as described herein is performed as part of providing medical care to a patient, including the providing of diagnostic services in support of providing medical care. Thus the invention includes a method in the medical care of a patient, the method comprising determining or measuring expression levels of gene sequences in a cell containing sample obtained from a patient as described herein. The method may further comprise the classifying of the sample, based on the determination/measurement, as including a tumor cell of a tumor type or tissue origin in a manner as described herein. The determination and/or classification may be for use in relation to any aspect or embodiment of the invention as described herein.
  • The determination or measurement of expression levels may be preceded by a variety of related actions. In some embodiments, the measurement is preceded by a determination or diagnosis of a human subject as in need of said measurement. The measurement may be preceded by a determination of a need for the measurement, such as that by a medical doctor, nurse or other health care provider or professional, or those working under their instruction, or personnel of a health insurance or maintenance organization in approving the performance of the measurement as a basis to request reimbursement or payment for the performance.
  • The measurement may also be preceded by preparatory acts necessary to the actual measuring. Non-limiting examples include the actual obtaining of a cell containing sample from a human subject; or receipt of a cell containing sample; or sectioning a cell containing sample; or isolating cells from a cell containing sample; or obtaining RNA from cells of a cell containing sample; or reverse transcribing RNA from cells of a cell containing sample. The sample may be any as described herein for the practice of the invention.
  • In additional embodiments, the invention provides for a method of ordering, or receiving an order for, the performance of a method in the medical care of a patient or other method of the invention. The ordering may be made by a medical doctor, a nurse, or other health care provider, or those working under their instruction, while the receiving, directly or indirectly, may be made by any person who performs the method(s). The ordering may be by any means of communication, including communication that is written, oral, electronic, digital, analog, telephonic, in person, by facsimile, by mail, or otherwise passes through a jurisdiction within the United States.
  • The invention further provides methods in the processing of reimbursement or payment for a test, such as the above method in the medical care of a patient or other method of the invention. A method in the processing of reimbursement or payment may comprise indicating that 1) payment has been received, or 2) payment will be made by another payer, or 3) payment remains unpaid on paper or in a database after performance of an expression level detection, determination or measurement method of the invention. The database may be in any form, with electronic forms such as a computer implemented database included within the scope of the invention. The indicating may be in the form of a code (such as a CPT code) on paper or in the database. The “another payer” may be any person or entity beyond that to whom a previous request for reimbursement or payment was made.
  • Alternatively, the method may comprise receiving reimbursement or payment for the technical or actual performance of the above method in the medical care of a patient; for the interpretation of the results from said method; or for any other method of the invention. Of course the invention also includes embodiments comprising instructing another person or party to receive the reimbursement or payment. The ordering may be by any communication means, including those described above. The receipt may be from any entity, including an insurance company, health maintenance organization, governmental health agency, or a patient as non-limiting examples. The payment may be in whole or in part. In the case of a patient, the payment may be in the form of a partial payment known as a co-pay.
  • In yet another embodiment, the method may comprise forwarding or having forwarded a reimbursement or payment request to an insurance company, health maintenance organization, governmental health agency, or to a patient for the performance of the above method in the medical care of a patient or other method of the invention. The request may be by any communication means, including those described above.
  • In a further embodiment, the method may comprise receiving indication of approval for payment, or denial of payment, for performance of the above method in the medical care of a patient or other method of the invention. Such an indication may come from any person or party to whom a request for reimbursement or payment was made. Non-limiting examples include an insurance company, health maintenance organization, or a governmental health agency such as Medicare or Medicaid. The indication may be by any communication means, including those described above.
  • An additional embodiment is where the method comprises sending a request for reimbursement for performance of the above method in the medical care of a patient or other method of the invention. Such a request may be made by any communication means, including those described above. The request may have been made to an insurance company, health maintenance organization, federal health agency, or the patient for whom the method was performed.
  • A further method comprises indicating the need for reimbursement or payment on a form or into a database for performance of the above method in the medical care of a patient or other method of the invention. Alternatively, the method may simply indicate the performance of the method. The database may be in any form, with electronic forms such as a computer implemented database included within the scope of the invention. The indicating may be in the form of a code (such as a CPT code) on paper or in the database.
  • In the above methods in the medical care of a patient or other method of the invention, the method may comprise reporting the results of the method, optionally to a health care facility, a health care provider or professional, a doctor, a nurse, or personnel working therefor. The reporting may also be directly or indirectly to the patient. The reporting may be by any means of communication, including those described above.
  • The invention further provides kits for the determination or measurement of gene expression levels in a cell containing sample as described herein. A kit will typically comprise one or more reagents to detect gene expression as described herein for the practice of the present invention. Non-limiting examples include polynucleotide probes or primers for the detection of expression levels, one or more enzymes used in the methods of the invention, and one or more tubes for use in the practice of the invention. In some embodiments, the kit will include an array, or solid media capable of being assembled into an array, for the detection of gene expression as described herein. In other embodiments, the kit may comprise one or more antibodies that are immunoreactive with epitopes present on a polypeptide which indicates expression of a gene sequence. In some embodiments, the antibody will be an antibody fragment.
  • A kit of the invention may also include instructional materials disclosing or describing the use of the kit or a primer or probe of the present invention in a method of the invention as provided herein. A kit may also include additional components to facilitate the particular application for which the kit is designed. Thus, for example, a kit may additionally contain means of detecting the label (e.g. enzyme substrates for enzymatic labels, filter sets to detect fluorescent labels, appropriate secondary labels such as a sheep anti-mouse-HRP, or the like). A kit may additionally include buffers and other reagents recognized for use in a method of the invention.
  • Having now generally described the invention, the same will be more readily understood through reference to the following examples which are provided by way of illustration, and are not intended to be limiting of the present invention, unless specified.
  • EXAMPLES
  • Example 1: Information Capacity of Random Gene Sets
  • Subsets of 100 randomly selected expressed gene sequences used to classify among 39 tumor types were tested for their ability to classify among subsets of the 39 tumor types. The expression levels of random combinations of 5, 10, 15, 20, 25, 30, 35, 40, 45, and 49 (each combination sampled 10 times) of the 100 expressed sequences were used with data from tumor types and then used to predict test random sets of tumor samples (each sampled 10 times) ranging from 2 to all 39 types. FIG. 1 shows the classification capability of various gene sets relative to the number of tumor types classified. As expected, a higher number of gene sequences is needed to classify tumor types with higher accuracies. FIG. 2 shows the classification performance for various numbers of tumor types relative to the number of gene sequences used.
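  • The sampling scheme described above (random combinations of several sizes drawn from 100 indexed gene sequences, each size sampled 10 times) can be sketched in Python as follows. The source does not describe the random number generator or seed that was used, so the seed and the names `SUBSET_SIZES` and `draw_random_gene_sets` are assumptions, and the sets produced here will not match those actually used.

```python
import random

SUBSET_SIZES = [5, 10, 15, 20, 25, 30, 35, 40, 45, 49]  # sizes named in the text
N_GENES = 100    # gene sequences indexed from 1 to 100
N_REPEATS = 10   # each subset size is sampled 10 times

def draw_random_gene_sets(seed=0):
    """Return {subset size: list of N_REPEATS random gene index sets}.

    Each set is drawn without replacement from the indices 1..N_GENES.
    The seed is an illustrative assumption for reproducibility.
    """
    rng = random.Random(seed)
    gene_indices = range(1, N_GENES + 1)
    return {
        size: [rng.sample(gene_indices, size) for _ in range(N_REPEATS)]
        for size in SUBSET_SIZES
    }

gene_sets = draw_random_gene_sets()
```

  • Each drawn set would then be used to build and test a classifier on a chosen number of tumor types, so that accuracy can be compared across both the number of gene sequences and the number of tumor types.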
  • The GenBank accession numbers of the 100 gene sequences are AF269223, BC006286, AK025501, AJ002367, AI469140, AW013883, NM_001238, AI476350, BC006546, AI041212, BF724944, AI376951, R56211, BC006393, X13274, BC001133, N62397, BC000885, AK001588, AK057901, AF146760, AI951287, AK025604, BC007581, BC015025, R43102, AW449550, AI922539, AI684144, AI277662, BC015999, AW444656, BC011612, BC015401, BF447279, BC009956, AL050163, BC001248, BE672684, AL137353, BC001340, U45975, BE856598, BC009060, AL137728, AA713797, AL583913, AK054617, AI028262, AI753041, BG939593, AL080179, AA814915, AF131798, AI961568, BC009849, AK021603, BC012561, AI570494, BC006973, AW294857, BC004952, AK026535, AI923614, AW082090, AI005513, AF339768, AK023167, AF169693, AF076249, BC007662, BC015520, AI814187, AI565381, AW271626, AK024120, AF139065, BC014075, AI887245, AF257081, AI767898, AF070634, AF155132, X69804, U65579, NM_004933, AI655104, AW131780, AI650407, AF131774, AA814057, AJ311123, BC009702, AF264036, AL161961, AJ010857, AF106912, AK023542, AF073518, and D83032. They were indexed from 1 to 100, and representative random sets used in the invention are as follows:
  • For 2 genes, genes 33 and 63, genes 17 and 72, genes 64 and 21, genes 48 and 25, genes 88 and 54, genes 80 and 32, genes 24 and 99, genes 14 and 31, genes 80 and 23, and genes 18 and 34 were used as the 10 random sets.
  • For 5 genes, set 1, genes 27, 97, 56, 88, and 50 were used. In set 2, genes 24, 26, 35, 48, and 83 were used. In set 3, genes 46, 62, 75, 91, and 2 were used. In set 4, genes 19, 61, 34, 87, and 13 were used. In set 5, genes 56, 32, 66, 20, and 55 were used. In set 6, genes 90, 21, 6, 78, and 66 were used. In set 7, genes 73, 47, 3, 82, and 86 were used. In set 8, genes 74, 39, 13, 7, and 67 were used. In set 9, genes 34, 1, 24, 85, and 62 were used. In set 10, genes 23, 89, 15, 54, and 98 were used.
  • For 10 genes, set 1, genes 11, 58, 90, 40, 20, 44, 10, 78, 72, and 74 were used. In set 2, genes 79, 71, 42, 48, 93, 56, 55, 14, 92, and 52 were used. In set 3, genes 62, 53, 52, 19, 98, 26, 76, 65, 33, and 40 were used. In set 4, genes 94, 8, 16, 99, 58, 19, 97, 92, 76, and 86 were used. In set 5, genes 18, 97, 16, 94, 84, 52, 11, 24, 89, and 92 were used. In set 6, genes 12, 42, 45, 51, 2, 75, 63, 28, 13, and 58 were used. In set 7, genes 67, 98, 55, 32, 82, 42, 2, 45, 37, and 23 were used. In set 8, genes 40, 43, 69, 68, 13, 97, 35, 3, 44, and 42 were used. In set 9, genes 69, 47, 96, 80, 100, 50, 42, 26, 65, and 17 were used. In set 10, genes 83, 84, 69, 67, 19, 85, 35, 11, 70, and 64 were used.
  • For 15 genes, set 1, genes 98, 81, 43, 63, 18, 56, 19, 97, 47, 13, 48, 99, 75, 45, and 83 were used. In set 2, genes 5, 72, 31, 59, 81, 40, 92, 3, 23, 50, 57, 74, 62, 21, and 93 were used. In set 3, genes 11, 69, 91, 100, 38, 1, 73, 64, 90, 26, 62, 2, 37, 23, and 18 were used. In set 4, genes 76, 9, 53, 4, 11, 41, 77, 44, 87, 51, 54, 49, 43, 56, and 67 were used. In set 5, genes 55, 34, 13, 89, 52, 74, 96, 80, 48, 22, 31, 39, 43, 91, and 54 were used. In set 6, genes 59, 88, 15, 90, 4, 73, 93, 7, 10, 18, 98, 83, 43, 3, and 5 were used. In set 7, genes 68, 91, 77, 33, 88, 94, 95, 41, 46, 27, 36, 51, 97, 7, and 2 were used. In set 8, genes 7, 10, 78, 40, 70, 84, 55, 1, 98, 22, 99, 91, 8, 17, and 89 were used. In set 9, genes 65, 10, 38, 8, 77, 98, 37, 43, 93, 99, 86, 16, 82, 27, and 9 were used. In set 10, genes 97, 27, 78, 38, 24, 19, 55, 47, 77, 13, 45, 25, 43, 70, and 68 were used.
  • For 20 genes, set 1, genes 41, 94, 38, 76, 35, 65, 92, 26, 49, 7, 85, 54, 77, 66, 98, 15, 86, 69, 70, and 67 were used. In set 2, genes 43, 87, 1, 81, 7, 14, 94, 28, 25, 55, 100, 41, 18, 47, 96, 89, 26, 53, 29, and 32 were used. In set 3, genes 48, 80, 90, 99, 50, 98, 36, 91, 6, 41, 61, 96, 74, 66, 9, 5, 16, 18, 20, and 1 were used. In set 4, genes 49, 58, 73, 24, 94, 22, 41, 52, 18, 19, 63, 91, 74, 37, 59, 95, 53, 87, 72, and 13 were used. In set 5, genes 67, 74, 2, 98, 46, 69, 5, 42, 22, 66, 60, 20, 100, 80, 24, 76, 63, 9, 39, and 15 were used. In set 6, genes 10, 74, 50, 92, 69, 68, 52, 56, 63, 71, 11, 17, 29, 64, 88, 59, 25, 94, 35, and 57 were used. In set 7, genes 97, 72, 16, 19, 14, 42, 70, 31, 29, 13, 22, 37, 95, 69, 87, 39, 18, 81, 58, and 100 were used. In set 8, genes 5, 3, 18, 91, 77, 19, 82, 31, 92, 22, 93, 45, 76, 84, 46, 100, 53, 99, 89, and 42 were used. In set 9, genes 62, 3, 85, 37, 34, 93, 52, 40, 74, 25, 86, 57, 33, 60, 20, 77, 78, 17, 28, and 13 were used. In set 10, genes 22, 26, 23, 39, 35, 10, 43, 32, 65, 38, 54, 45, 8, 17, 90, 20, 83, 60, 6, and 58 were used.
  • For 25 genes, set 1, genes 21, 28, 50, 27, 8, 48, 74, 80, 38, 96, 71, 15, 89, 84, 32, 26, 55, 36, 29, 68, 13, 7, 18, 63, and 72 were used. In set 2, genes 61, 38, 59, 92, 3, 80, 33, 68, 79, 70, 44, 26, 95, 63, 85, 27, 60, 43, 75, 96, 42, 99, 58, 48, and 91 were used. In set 3, genes 75, 83, 78, 5, 99, 56, 26, 36, 57, 23, 37, 28, 88, 16, 63, 2, 72, 59, 9, 80, 52, 91, 62, 3, and 27 were used. In set 4, genes 48, 75, 84, 83, 88, 29, 13, 9, 98, 6, 31, 63, 45, 5, 51, 52, 39, 22, 100, 91, 74, 12, 94, 21, and 8 were used. In set 5, genes 79, 84, 47, 43, 26, 37, 46, 19, 85, 91, 2, 10, 81, 89, 38, 71, 17, 57, 7, 93, 31, 87, 29, 78, and 73 were used. In set 6, genes 62, 93, 83, 42, 97, 96, 78, 98, 47, 22, 67, 48, 89, 95, 24, 81, 16, 45, 8, 90, 66, 64, 2, 3, and 58 were used. In set 7, genes 100, 34, 58, 28, 104, 35, 88, 76, 6, 30, 83, 81, 67, 36, 39, 87, 66, 45, 20, 15, 86, 56, 55, and 95 were used. In set 8, genes 17, 43, 50, 63, 47, 58, 95, 32, 79, 60, 16, 91, 86, 22, 97, 21, 9, 55, 72, 78, 77, 45, 100, 14, and 30 were used. In set 9, genes 24, 67, 60, 94, 59, 14, 70, 84, 8, 89, 63, 23, 39, 11, 81, 42, 33, 3, 12, 93, 54, 35, 78, 73, and 90 were used. In set 10, genes 11, 2, 19, 62, 13, 51, 30, 80, 81, 82, 52, 34, 67, 57, 25, 95, 93, 39, 26, 48, 44, 89, 61, 17, and 18 were used.
  • For 30 genes, set 1, genes 30, 97, 54, 21, 34, 9, 56, 71, 62, 14, 24, 23, 89, 61, 76, 41, 29, 67, 94, 22, 88, 4, 40, 33, 38, 78, 82, 66, 84, and 100 were used. In set 2, genes 89, 41, 56, 43, 98, 44, 35, 26, 19, 86, 15, 67, 8, 69, 3, 76, 48, 17, 55, 31, 25, 91, 72, 36, 18, 82, 37, 50, 9, and 75 were used. In set 3, genes 28, 39, 78, 15, 65, 93, 66, 29, 88, 35, 49, 69, 50, 9, 53, 80, 81, 95, 76, 44, 48, 64, 83, 11, 70, 33, 73, 96, 56, and 92 were used. In set 4, genes 4, 2, 19, 6, 11, 84, 94, 44, 60, 37, 29, 97, 53, 83, 98, 45, 65, 9, 85, 35, 20, 89, 10, 17, 23, 74, 70, 41, 18, and 76 were used. In set 5, genes 27, 4, 43, 1, 10, 95, 88, 74, 77, 47, 63, 81, 31, 9, 41, 100, 87, 57, 8, 79, 24, 6, 26, 20, 55, 61, 34, 42, 25, and 39 were used. In set 6, genes 47, 67, 98, 56, 37, 44, 5, 70, 48, 12, 20, 86, 83, 89, 27, 59, 19, 54, 69, 97, 43, 71, 58, 82, 8, 50, 51, 10, 25, and 72 were used. In set 7, genes 100, 99, 37, 58, 44, 60, 39, 3, 59, 96, 50, 68, 94, 69, 83, 90, 17, 4, 5, 67, 88, 56, 29, 79, 23, 1, 38, 25, 49, and 74 were used. In set 8, genes 26, 23, 58, 47, 6, 68, 41, 31, 16, 64, 19, 75, 36, 32, 87, 2, 12, 97, 73, 21, 53, 78, 15, 94, 1, 20, 79, 81, 70, and 7 were used. In set 9, genes 61, 48, 78, 75, 12, 36, 37, 66, 91, 2, 92, 32, 8, 26, 6, 82, 14, 68, 4, 88, 39, 89, 43, 41, 40, 87, 69, 74, 42, and 9 were used. In set 10, genes 58, 99, 60, 39, 50, 25, 22, 57, 48, 85, 24, 10, 97, 68, 36, 38, 93, 62, 52, 56, 34, 18, 32, 64, 95, 81, 74, 88, 61, and 96 were used.
  • For 35 genes, set 1, genes 52, 68, 22, 92, 43, 75, 20, 62, 15, 76, 99, 61, 64, 36, 12, 66, 24, 21, 31, 88, 25, 6, 93, 91, 55, 74, 69, 90, 23, 4, 80, 72, 97, 58, and 1 were used. In set 2, genes 48, 21, 68, 16, 96, 10, 1, 69, 36, 20, 3, 14, 59, 53, 12, 84, 90, 17, 9, 65, 4, 32, 75, 81, 88, 37, 38, 5, 94, 60, 64, 45, 7, 43, and 55 were used. In set 3, genes 33, 95, 59, 86, 83, 76, 36, 55, 90, 22, 62, 98, 34, 46, 4, 87, 5, 66, 38, 78, 97, 100, 71, 25, 30, 2, 21, 99, 12, 54, 9, 14, 81, 32, and 52 were used. In set 4, genes 27, 64, 40, 59, 63, 100, 50, 19, 1, 10, 96, 2, 34, 28, 67, 26, 87, 41, 15, 57, 33, 11, 94, 66, 82, 6, 52, 55, 84, 47, 97, 83, 80, 62, and 5 were used. In set 5, genes 99, 86, 92, 72, 83, 48, 79, 46, 91, 2, 90, 9, 23, 44, 85, 31, 38, 81, 76, 54, 71, 14, 3, 13, 62, 11, 39, 4, 95, 36, 20, 30, 75, 63, and 51 were used. In set 6, genes 41, 89, 81, 29, 86, 95, 34, 42, 50, 9, 45, 21, 64, 84, 74, 91, 69, 98, 57, 79, 39, 87, 93, 63, 26, 82, 2, 59, 30, 71, 83, 38, 77, 24, and 73 were used. In set 7, genes 87, 60, 59, 98, 43, 38, 28, 64, 29, 92, 22, 27, 40, 33, 69, 71, 73, 79, 15, 70, 32, 90, 76, 93, 6, 50, 55, 9, 49, 54, 36, 5, 48, 19, and 10 were used. In set 8, genes 100, 70, 98, 79, 91, 23, 37, 29, 73, 65, 78, 31, 3, 11, 30, 51, 16, 40, 95, 94, 62, 38, 67, 39, 82, 72, 22, 5, 87, 57, 6, 75, 35, 99, and 46 were used. In set 9, genes 46, 61, 59, 86, 29, 74, 56, 89, 52, 26, 54, 20, 84, 97, 33, 71, 14, 36, 38, 49, 28, 60, 19, 90, 11, 42, 87, 92, 82, 21, 94, 3, 22, 2, and 39 were used. In set 10, genes 31, 76, 77, 27, 72, 38, 42, 36, 53, 82, 61, 39, 98, 81, 34, 80, 22, 100, 8, 32, 17, 21, 28, 56, 59, 29, 55, 5, 62, 40, 90, 87, 24, 68, and 37 were used.
  • For 40 genes, set 1, genes 64, 50, 46, 22, 51, 6, 47, 12, 2, 30, 45, 7, 63, 55, 91, 90, 80, 49, 71, 8, 79, 82, 77, 76, 97, 5, 95, 11, 32, 70, 20, 62, 38, 26, 41, 58, 44, 87, 35, and 23 were used. In set 2, genes 44, 26, 16, 12, 30, 45, 71, 90, 37, 68, 32, 70, 58, 43, 51, 6, 62, 92, 87, 20, 56, 5, 47, 48, 86, 29, 98, 22, 59, 76, 8, 79, 64, 14, 50, 3, 54, 83, 96, and 80 were used. In set 3, genes 20, 34, 57, 70, 39, 15, 25, 33, 78, 51, 87, 46, 67, 80, 28, 52, 66, 72, 22, 88, 97, 3, 90, 6, 82, 42, 41, 94, 85, 61, 54, 84, 14, 9, 81, 19, 7, 91, 23, and 40 were used. In set 4, genes 61, 46, 64, 71, 35, 58, 100, 23, 95, 17, 87, 68, 54, 8, 50, 4, 27, 49, 47, 52, 53, 28, 24, 34, 45, 2, 89, 48, 3, 65, 42, 9, 92, 36, 6, 84, 51, 60, 77, and 94 were used. In set 5, genes 28, 97, 21, 43, 22, 89, 94, 87, 99, 5, 4, 20, 13, 61, 37, 42, 72, 62, 7, 12, 31, 23, 60, 98, 48, 38, 53, 56, 29, 69, 26, 82, 24, 74, 86, 10, 67, 2, 47, and 46 were used. In set 6, genes 12, 74, 96, 77, 78, 72, 53, 87, 47, 29, 40, 98, 52, 22, 69, 3, 58, 97, 60, 48, 55, 80, 57, 39, 50, 89, 71, 9, 63, 51, 21, 23, 73, 32, 20, 19, 25, 5, 38, and 46 were used. In set 7, genes 88, 79, 54, 44, 37, 36, 32, 91, 47, 50, 60, 92, 82, 80, 46, 19, 98, 20, 76, 29, 9, 95, 2, 77, 97, 74, 90, 73, 100, 1, 34, 85, 24, 71, 57, 99, 68, 13, 43, and 53 were used. In set 8, genes 23, 39, 7, 64, 20, 27, 69, 43, 38, 89, 50, 3, 16, 79, 83, 72, 65, 66, 32, 30, 100, 82, 28, 22, 54, 84, 53, 75, 59, 37, 34, 49, 12, 86, 71, 97, 26, 88, 70, and 57 were used. In set 9, genes 74, 96, 80, 39, 40, 82, 38, 56, 35, 93, 55, 73, 44, 17, 81, 27, 2, 83, 65, 89, 76, 8, 18, 45, 58, 77, 14, 49, 21, 6, 4, 92, 33, 13, 12, 88, 98, 24, 84, and 36 were used. In set 10, genes 35, 77, 48, 62, 26, 12, 41, 68, 81, 5, 37, 70, 28, 72, 50, 83, 64, 99, 74, 57, 84, 76, 52, 14, 87, 97, 3, 31, 73, 58, 44, 24, 15, 66, 45, 91, 4, 32, 46, and 49 were used.
  • For 45 genes, set 1, genes 52, 97, 84, 72, 96, 34, 18, 38, 88, 80, 91, 49, 71, 64, 93, 26, 62, 40, 68, 29, 67, 39, 60, 9, 13, 74, 95, 99, 27, 47, 25, 45, 31, 8, 69, 17, 75, 53, 51, 12, 23, 1, 6, 30, and 50 were used. In set 2, genes 97, 80, 55, 32, 94, 84, 28, 3, 6, 48, 17, 41, 65, 37, 79, 34, 61, 83, 35, 49, 27, 38, 43, 2, 24, 77, 25, 71, 58, 14, 8, 30, 46, 98, 82, 75, 22, 72, 26, 74, 93, 66, 73, 1, and 53 were used. In set 3, genes 64, 45, 38, 92, 23, 74, 66, 60, 100, 3, 82, 20, 54, 11, 19, 16, 80, 86, 14, 75, 62, 10, 52, 47, 13, 31, 35, 53, 41, 9, 79, 39, 17, 22, 99, 58, 46, 83, 43, 40, 44, 90, 95, 12, and 81 were used. In set 4, genes 20, 66, 9, 24, 16, 76, 99, 42, 86, 58, 15, 93, 48, 28, 26, 50, 68, 12, 2, 37, 82, 36, 27, 57, 45, 41, 32, 1, 52, 54, 30, 39, 7, 100, 59, 23, 94, 75, 8, 60, 55, 34, 38, 29, and 87 were used. In set 5, genes 66, 88, 73, 53, 51, 69, 36, 87, 78, 40, 58, 76, 31, 65, 56, 42, 100, 68, 5, 18, 17, 91, 45, 22, 74, 82, 1, 44, 67, 43, 10, 63, 79, 92, 6, 72, 80, 75, 9, 30, 19, 61, 99, 3, and 38 were used. In set 6, genes 75, 66, 84, 59, 9, 70, 100, 27, 79, 41, 73, 67, 23, 39, 28, 68, 21, 69, 38, 72, 86, 82, 36, 46, 77, 34, 47, 54, 13, 16, 7, 88, 22, 26, 4, 89, 55, 24, 61, 12, 35, 50, 95, 92, and 80 were used. In set 7, genes 59, 86, 10, 29, 53, 88, 43, 64, 11, 13, 19, 17, 36, 65, 73, 94, 20, 51, 80, 24, 66, 83, 44, 47, 21, 6, 52, 82, 69, 54, 100, 28, 18, 34, 35, 30, 74, 91, 49, 46, 60, 5, 38, 71, and 2 were used. In set 8, genes 77, 32, 55, 44, 6, 98, 94, 19, 10, 71, 72, 85, 67, 75, 78, 88, 90, 58, 89, 27, 69, 42, 31, 47, 1, 37, 52, 7, 57, 45, 11, 83, 49, 46, 34, 64, 14, 24, 87, 9, 56, 8, 20, 36, and 15 were used. In set 9, genes 4, 27, 83, 61, 46, 15, 35, 26, 51, 54, 23, 38, 100, 7, 42, 58, 44, 8, 22, 37, 20, 89, 56, 91, 70, 29, 11, 19, 87, 99, 21, 65, 72, 75, 49, 40, 45, 30, 43, 48, 63, 3, 18, 74, and 1 were used. In set 10, genes 68, 19, 90, 52, 55, 23, 17, 53, 3, 2, 74, 82, 26, 88, 48, 6, 8, 43, 15, 73, 57, 67, 85, 91, 13, 44, 81, 1, 75, 33, 51, 21, 4, 41, 77, 86, 40, 18, 31, 78, 92, 10, 64, 99, and 69 were used.
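  • Because the index lists above are long, a simple consistency check can help when transcribing them. The sketch below, built around the hypothetical helper `check_gene_set`, tests that a listed set has the expected number of entries, that every index lies between 1 and 100, and that no index repeats. It is applied to sets 1 and 7 of the 25-gene sets exactly as printed above.

```python
def check_gene_set(indices, expected_size, n_genes=100):
    """Return a list of human-readable problems found in one listed gene set."""
    problems = []
    if len(indices) != expected_size:
        problems.append(f"expected {expected_size} entries, found {len(indices)}")
    out_of_range = [i for i in indices if not 1 <= i <= n_genes]
    if out_of_range:
        problems.append(f"out of range: {out_of_range}")
    duplicates = sorted({i for i in indices if indices.count(i) > 1})
    if duplicates:
        problems.append(f"repeated: {duplicates}")
    return problems


# 25-gene set 1, copied from the list above.
set_1 = [21, 28, 50, 27, 8, 48, 74, 80, 38, 96, 71, 15, 89,
         84, 32, 26, 55, 36, 29, 68, 13, 7, 18, 63, 72]

# 25-gene set 7, copied from the list above as printed.
set_7 = [100, 34, 58, 28, 104, 35, 88, 76, 6, 30, 83, 81,
         67, 36, 39, 87, 66, 45, 20, 15, 86, 56, 55, 95]

report_1 = check_gene_set(set_1, expected_size=25)
report_7 = check_gene_set(set_7, expected_size=25)
```

  • Set 1 passes the check. As printed, set 7 lists 24 entries and includes the index 104, so the check reports both; the text does not indicate how that entry should be read, and it is left unchanged here.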
  • Classification of subsets of the 39 tumor types was performed with use of random selections of tumor types from the group of 39. The expression levels of gene sequence sets as described herein were used to classify random combinations of tumor types. Different random sets of tumor types were used with each of the sets of 100, 74, and 90 gene sequences as described in these examples. Representative, and non-limiting, examples of random sets of from 2 to 20 tumor types used are as follows, where the set of 39 tumor types was indexed from 1 to 39.
  • For 2 tumor types, set 1 used types 26 and 16. Set 2 used types 8 and 5. Set 3 used types 39 and 8. Set 4 used types 27 and 23. Set 5 used types 8 and 19. Set 6 used types 12 and 21. Set 7 used types 30 and 15. Set 8 used types 30 and 5. Set 9 used types 18 and 22. Set 10 used types 27 and 26.
  • For 4 tumor types, set 1 used types 20, 35, 15 and 7. Set 2 used types 36, 1, 28 and 19. Set 3 used types 13, 4, 12 and 21. Set 4 used types 12, 33, 14 and 28. Set 5 used types 6, 28, 5 and 37. Set 6 used types 5, 25, 36 and 15. Set 7 used types 12, 26, 21 and 19. Set 8 used types 19, 3, 20 and 17. Set 9 used types 18, 10, 8 and 9. Set 10 used types 28, 20, 2 and 22.
  • For 6 tumor types, set 1 used types 27, 3, 10, 39, 11 and 20. Set 2 used types 33, 10, 20, 32, 13 and 19. Set 3 used types 31, 27, 18, 39, 8 and 16. Set 4 used types 25, 28, 10, 12, 7 and 39. Set 5 used types 14, 13, 28, 24, 30 and 36. Set 6 used types 9, 24, 8, 17, 36 and 26. Set 7 used types 20, 1, 34, 26, 6 and 19. Set 8 used types 12, 13, 3, 17, 34 and 22. Set 9 used types 7, 1, 17, 13, 20 and 34. Set 10 used types 5, 11, 25, 29, 28 and 35.
  • For 8 tumor types, set 1 used types 34, 33, 28, 3, 23, 25, 9 and 29. Set 2 used types 27, 8, 38, 28, 20, 14, 12 and 9. Set 3 used types 29, 21, 19, 1, 13, 26, 11 and 31. Set 4 used types 25, 17, 7, 20, 34, 8, 28 and 10. Set 5 used types 36, 28, 35, 26, 2, 8, 29 and 7. Set 6 used types 10, 23, 2, 27, 33, 21, 25 and 35. Set 7 used types 10, 18, 38, 2, 6, 7, 19 and 32. Set 8 used types 11, 37, 6, 28, 3, 9, 2 and 16. Set 9 used types 22, 2, 10, 8, 17, 19 and 33. Set 10 used types 35, 39, 8, 10, 37, 4, 36 and 6.
  • For 10 tumor types, set 1 used types 25, 10, 26, 2, 32, 31, 39, 23, 22 and 18. Set 2 used types 12, 35, 6, 16, 20, 3, 39, 36, 11 and 2. Set 3 used types 34, 1, 15, 29, 5, 39, 2, 12, 25 and 18. Set 4 used types 10, 8, 14, 18, 31, 19, 23, 20, 32 and 33. Set 5 used types 10, 18, 37, 15, 4, 35, 33, 24, 39 and 20. Set 6 used types 22, 16, 4, 3, 18, 21, 1, 25, 37 and 13. Set 7 used types 14, 6, 28, 18, 11, 13, 2, 32, 33 and 19. Set 8 used types 39, 2, 38, 4, 34, 8, 25, 6, 32 and 35. Set 9 used types 3, 10, 11, 16, 6, 15, 18, 14, 12 and 26. Set 10 used types 24, 25, 21, 9, 36, 29, 20, 39, 10 and 37.
  • For 12 tumor types, set 1 used types 26, 20, 4, 12, 2, 31, 38, 18, 16, 39, 3 and 33. Set 2 used types 25, 16, 4, 9, 29, 27, 14, 24, 21, 7, 23 and 2. Set 3 used types 31, 18, 23, 13, 25, 1, 29, 21, 35, 10, 32 and 39. Set 4 used types 8, 34, 23, 9, 35, 14, 25, 21, 2, 33, 18 and 28. Set 5 used types 6, 11, 21, 8, 5, 7, 19, 32, 3, 13, 36 and 9. Set 6 used types 12, 33, 14, 26, 27, 15, 2, 21, 36, 35, 9 and 39. Set 7 used types 26, 29, 32, 17, 31, 19, 6, 5, 20, 34, 2 and 24. Set 8 used types 17, 12, 8, 22, 28, 9, 27, 29, 14, 35, 4 and 32. Set 9 used types 29, 9, 36, 23, 33, 18, 21, 35, 3, 6, 2 and 1. Set 10 used types 1, 3, 35, 29, 22, 27, 8, 23, 2, 36, 14 and 19.
  • For 14 tumor types, set 1 used types 9, 26, 38, 25, 31, 3, 15, 14, 17, 33, 12, 35, 39 and 16. Set 2 used types 1, 26, 16, 25, 20, 12, 14, 37, 38, 24, 23, 33, 27 and 35. Set 3 used types 11, 21, 35, 38, 32, 34, 27, 39, 16, 15, 4, 5, 13 and 18. Set 4 used types 27, 5, 13, 28, 18, 17, 15, 20, 29, 37, 21, 36, 25 and 14. Set 5 used types 5, 12, 17, 9, 25, 21, 33, 37, 8, 15, 24, 3, 34 and 28. Set 6 used types 11, 19, 34, 26, 9, 6, 32, 14, 27, 29, 30, 16, 24 and 17. Set 7 used types 31, 26, 11, 18, 19, 20, 9, 8, 5, 36, 12, 6, 27 and 38. Set 8 used types 20, 17, 11, 5, 15, 9, 2, 39, 34, 24, 27, 26, 35 and 10. Set 9 used types 1, 14, 39, 30, 17, 6, 10, 35, 31, 33, 15, 29, 32 and 7. Set 10 used types 1, 19, 24, 28, 34, 12, 13, 18, 32, 11, 14, 21, 22 and 25.
  • For 16 tumor types, set 1 used types 27, 15, 8, 12, 6, 20, 26, 19, 25, 2, 37, 38, 7, 39, 4 and 33. Set 2 used types 17, 18, 28, 5, 6, 31, 25, 13, 8, 20, 37, 36, 35, 9, 23 and 27. Set 3 used types 23, 37, 34, 14, 16, 27, 32, 33, 21, 38, 4, 30, 24, 22, 17 and 25. Set 4 used types 7, 37, 38, 21, 34, 31, 32, 25, 10, 36, 19, 11, 6, 26, 18 and 35. Set 5 used types 9, 32, 12, 24, 20, 13, 38, 21, 39, 23, 36, 18, 37, 22, 5 and 3. Set 6 used types 14, 21, 5, 17, 6, 20, 18, 35, 22, 10, 3, 23, 13, 2, 34 and 26. Set 7 used types 1, 8, 19, 6, 9, 39, 28, 18, 13, 31, 14, 16, 37, 12, 3 and 25. Set 8 used types 32, 36, 28, 38, 9, 33, 2, 5, 4, 11, 19, 18, 13, 8, 12 and 3. Set 9 used types 9, 14, 10, 5, 28, 32, 23, 6, 39, 3, 17, 8, 19, 1, 31 and 12. Set 10 used types 4, 34, 11, 6, 38, 19, 7, 20, 23, 3, 25, 37, 26, 1, 15 and 12.
  • For 18 tumor types, set 1 used types 15, 24, 39, 35, 7, 30, 16, 13, 20, 3, 26, 4, 12, 10, 34, 25, 21 and 28. Set 2 used types 21, 23, 29, 11, 10, 19, 13, 28, 4, 20, 17, 24, 30, 12, 39, 34, 31 and 9. Set 3 used types 7, 17, 27, 6, 30, 8, 22, 2, 32, 26, 21, 14, 4, 38, 1, 35, 16 and 28. Set 4 used types 17, 13, 20, 33, 10, 3, 16, 22, 1, 38, 2, 9, 28, 5, 6, 19, 12 and 11. Set 5 used types 4, 35, 21, 25, 18, 17, 8, 14, 31, 30, 9, 1, 2, 23, 36, 29, 32 and 37. Set 6 used types 17, 34, 2, 18, 19, 15, 16, 13, 4, 24, 5, 35, 6, 22, 28, 37, 38 and 1. Set 7 used types 34, 26, 12, 25, 27, 3, 17, 7, 2, 32, 9, 36, 21, 19, 22, 8, 20 and 29. Set 8 used types 12, 34, 38, 25, 17, 22, 14, 39, 10, 7, 31, 2, 3, 11, 29, 30, 16 and 24. Set 9 used types 13, 26, 27, 14, 5, 10, 8, 7, 16, 30, 37, 4, 6, 35, 28, 1, 36 and 20. Set 10 used types 15, 2, 17, 23, 26, 28, 36, 38, 12, 6, 19, 37, 20, 14, 9, 39, 11 and 21.
  • For 20 tumor types, set 1 used types 25, 13, 21, 15, 37, 20, 12, 28, 9, 10, 26, 22, 14, 24, 16, 7, 39, 34, 33 and 4. Set 2 used types 20, 17, 10, 27, 19, 28, 5, 1, 23, 21, 38, 7, 13, 22, 32, 31, 9, 4, 3 and 24. Set 3 used types 17, 13, 7, 20, 11, 38, 34, 3, 15, 12, 5, 39, 9, 10, 4, 35, 27, 6, 21 and 33. Set 4 used types 6, 13, 17, 26, 1, 7, 33, 5, 10, 32, 3, 23, 35, 4, 14, 28, 12, 38, 8 and 27. Set 5 used types 10, 23, 9, 38, 5, 29, 12, 27, 25, 6, 7, 26, 37, 31, 24, 36, 19, 15, 16 and 11. Set 6 used types 30, 24, 21, 11, 23, 25, 8, 9, 7, 31, 27, 5, 14, 29, 1, 19, 16, 12, 22 and 17. Set 7 used types 26, 13, 23, 19, 22, 11, 25, 21, 33, 20, 6, 17, 2, 10, 31, 34, 27, 37, 7 and 9. Set 8 used types 30, 1, 38, 7, 31, 37, 11, 25, 6, 19, 28, 33, 17, 29, 10, 27, 16, 3, 14 and 15. Set 9 used types 15, 19, 26, 24, 5, 33, 11, 2, 13, 18, 31, 22, 32, 20, 23, 6, 10, 25, 36 and 3. Set 10 used types 24, 25, 21, 29, 14, 18, 31, 2, 20, 39, 23, 9, 38, 12, 6, 32, 22, 26, 33 and 7.
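  • Restricting a classification task to a random subset of tumor types, as in the sets listed above, amounts to drawing type indices and keeping only the samples that carry those labels. The sketch below illustrates this under stated assumptions: `sample_labels` is a made-up toy label list, and the helper names and the seed are hypothetical.

```python
import random


def draw_tumor_types(k, n_types=39, rng=None):
    """Draw k distinct 1-based tumor type indices out of n_types."""
    rng = rng or random.Random(0)  # hypothetical seed, for reproducibility only
    return rng.sample(range(1, n_types + 1), k)


def select_samples(sample_labels, chosen_types):
    """Return positions of samples whose tumor type is among chosen_types."""
    chosen = set(chosen_types)
    return [pos for pos, label in enumerate(sample_labels) if label in chosen]


# Toy labels (hypothetical): one tumor type index per sample.
sample_labels = [26, 16, 8, 5, 26, 39, 16, 27]

# Set 1 of the 2-type sets listed above uses types 26 and 16.
kept = select_samples(sample_labels, [26, 16])

# A fresh random draw of 4 tumor types out of 39.
random_types = draw_tumor_types(4)
```

  • With the toy labels, `kept` holds the positions of the four samples labeled 26 or 16; only those samples would be passed on to training and prediction for that two-type task.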
  • Example 4: Specified Gene Sets
  • A first set of 74 genes and a second set of 90 genes, where the two sets have 38 members in common, were used in the practice of the invention.
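  • The stated overlap fixes a few related counts by inclusion-exclusion. The short calculation below is illustrative only; the variable names are hypothetical and no gene identities are assumed.

```python
size_first = 74    # genes in the first set
size_second = 90   # genes in the second set
shared = 38        # members common to both sets

only_first = size_first - shared                     # genes unique to the first set
only_second = size_second - shared                   # genes unique to the second set
distinct_total = size_first + size_second - shared   # distinct genes across both sets
```

  • Under these counts, 36 genes appear only in the first set, 52 only in the second, and the two sets together cover 126 distinct genes.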
  • Random subsets of about 5 to 49 members of the set of 74 expressed gene sequences were evaluated in a manner analogous to that described in Example 3. Again, the expression levels of random combinations of 5, 10, 15, 20, 25, 30, 35, 40, 45, and 49 (each combination sampled 10 times) of the 74 expressed sequences were used with data from tumor types and then used to predict random test sets of tumor samples (each sampled 10 times) ranging from 2 to all 39 types. The resulting data are shown in FIGS. 4-6.
  • The members of the 74 gene sequences were indexed from 1 to 74, and representative random sets used in the invention are as follows:
  • For 2 genes, set 1, genes 64 and 6 were used. For set 2, genes 64 and 13 were used. For set 3, genes 67 and 51 were used. For set 4, genes 51 and 29 were used. For set 5, genes 46 and 12 were used. For set 6, genes 68 and 65 were used. For set 7, genes 6 and 28 were used. For set 8, genes 9 and 55 were used. For set 9, genes 55 and 71 were used. For set 10, genes 63 and 39 were used.
  • For 5 genes, set 1, genes 8, 64, 50, 54, and 4 were used. In set 2, genes 39, 17, 45, 34, and 15 were used. In set 3, genes 10, 4, 61, 21, and 55 were used. In set 4, genes 59, 37, 21, 23, and 64 were used. In set 5, genes 69, 8, 25, 59, and 63 were used. In set 6, genes 45, 71, 19, 59, and 38 were used. In set 7, genes 21, 43, 14, 48, and 30 were used. In set 8, genes 73, 35, 36, 10, and 9 were used. In set 9, genes 62, 28, 11, 70, and 64 were used. In set 10, genes 8, 16, 70, 18, and 59 were used.
  • For 10 genes, set 1, genes 49, 72, 38, 68, 52, 21, 1, 10, 2, and 40 were used. In set 2, genes 54, 70, 28, 64, 68, 41, 44, 20, 7, and 2 were used. In set 3, genes 71, 49, 51, 11, 18, 53, 8, 42, 36, and 58 were used. In set 4, genes 72, 15, 35, 3, 23, 8, 2, 48, 22, and 65 were used. In set 5, genes 44, 19, 6, 22, 38, 5, 37, 9, 30, and 14 were used. In set 6, genes 15, 27, 3, 10, 31, 19, 44, 39, 48, and 46 were used. In set 7, genes 70, 30, 9, 33, 63, 71, 32, 34, 20, and 7 were used. In set 8, genes 45, 29, 54, 58, 15, 21, 68, 5, 42, and 62 were used. In set 9, genes 74, 17, 66, 46, 10, 8, 63, 5, 24, and 2 were used. In set 10, genes 33, 2, 34, 19, 60, 71, 42, 51, 70, and 66 were used.
  • For 15 genes, set 1, genes 13, 22, 26, 67, 64, 40, 68, 71, 4, 28, 24, 33, 46, 69, and 41 were used. In set 2, genes 10, 1, 14, 70, 71, 64, 46, 67, 45, 48, 65, 74, 34, 49, and 37 were used. In set 3, genes 58, 30, 44, 40, 51, 36, 33, 60, 39, 21, 54, 64, 25, 13, and 35 were used. In set 4, genes 63, 70, 60, 32, 31, 16, 49, 65, 38, 5, 72, 47, 40, 2, and 46 were used. In set 5, genes 43, 6, 40, 13, 39, 72, 68, 41, 27, 73, 36, 25, 33, 34, and 1 were used. In set 6, genes 68, 67, 71, 59, 73, 62, 31, 43, 7, 44, 21, 48, 54, 58, and 6 were used. In set 7, genes 16, 50, 61, 62, 27, 2, 21, 1, 41, 28, 68, 35, 17, 47, and 46 were used. In set 8, genes 27, 18, 44, 66, 2, 20, 53, 64, 46, 70, 57, 7, 51, 10, and 45 were used. In set 9, genes 65, 8, 43, 23, 50, 46, 21, 41, 44, 3, 31, 17, 7, 66, and 70 were used. In set 10, genes 16, 14, 61, 51, 39, 33, 43, 31, 53, 65, 74, 42, 29, 9, and 11 were used.
  • For 20 genes, set 1, genes 14, 60, 6, 71, 74, 16, 62, 39, 56, 44, 32, 72, 18, 42, 66, 49, 1, 9, 69, and 21 were used. In set 2, genes 23, 1, 7, 27, 26, 71, 12, 4, 22, 69, 62, 44, 6, 25, 57, 28, 33, 9, 21, and 51 were used. In set 3, genes 46, 48, 29, 54, 55, 69, 73, 47, 6, 27, 24, 21, 15, 43, 45, 7, 62, 25, 22, and 74 were used. In set 4, genes 12, 65, 24, 73, 45, 57, 49, 63, 61, 1, 58, 10, 2, 18, 8, 51, 67, 69, 59, and 13 were used. In set 5, genes 33, 43, 9, 52, 54, 38, 8, 16, 48, 1, 39, 60, 17, 6, 15, 66, 68, 63, 37, and 42 were used. In set 6, genes 43, 19, 44, 28, 56, 34, 66, 42, 73, 40, 65, 38, 54, 20, 51, 37, 30, 35, 53, and 61 were used. In set 7, genes 61, 6, 20, 4, 34, 53, 70, 38, 35, 46, 36, 16, 1, 23, 68, 12, 59, 71, 65, and 14 were used. In set 8, genes 25, 68, 69, 3, 33, 49, 19, 56, 54, 4, 32, 6, 45, 16, 67, 52, 65, 14, 12, and 40 were used. In set 9, genes 47, 7, 36, 32, 61, 74, 14, 45, 26, 51, 69, 12, 41, 42, 64, 25, 27, 57, 23, and 58 were used. In set 10, genes 27, 13, 3, 17, 51, 7, 37, 43, 20, 12, 52, 21, 25, 2, 5, 32, 62, 47, 4, and 26 were used.
  • For 25 genes, set 1, genes 57, 61, 31, 38, 3, 7, 72, 43, 32, 23, 28, 71, 48, 17, 2, 49, 10, 30, 66, 12, 69, 41, 20, 63, and 68 were used. In set 2, genes 18, 54, 47, 57, 24, 42, 66, 46, 16, 58, 37, 60, 62, 9, 2, 27, 36, 52, 13, 32, 45, 6, 43, 21, and 56 were used. In set 3, genes 47, 48, 52, 16, 56, 54, 42, 37, 17, 41, 35, 21, 6, 9, 63, 10, 49, 68, 23, 25, 70, 3, 58, 2, and 31 were used. In set 4, genes 50, 10, 25, 16, 68, 15, 29, 73, 27, 63, 3, 17, 28, 66, 19, 13, 4, 9, 36, 48, 23, 57, 59, 26, and 14 were used. In set 5, genes 40, 39, 43, 49, 66, 15, 14, 29, 36, 21, 19, 44, 72, 58, 69, 12, 11, 9, 37, 46, 32, 51, 3, 24, and 6 were used. In set 6, genes 42, 49, 44, 32, 46, 35, 70, 40, 3, 21, 11, 67, 25, 56, 37, 43, 60, 55, 16, 27, 30, 53, 63, 23, and 33 were used. In set 7, genes 70, 27, 68, 17, 64, 65, 18, 69, 10, 67, 42, 23, 48, 14, 31, 11, 55, 25, 52, 34, 13, 45, 12, 29, and 47 were used. In set 8, genes 48, 10, 17, 27, 25, 55, 12, 62, 30, 65, 15, 49, 70, 14, 54, 24, 33, 26, 50, 60, 6, 40, 67, 11, and 2 were used. In set 9, genes 41, 47, 24, 59, 7, 44, 2, 67, 12, 19, 13, 17, 35, 56, 28, 14, 61, 15, 60, 58, 1, 64, 31, 45, and 23 were used. In set 10, genes 42, 72, 41, 38, 57, 27, 4, 13, 9, 43, 34, 28, 8, 62, 64, 46, 12, 70, 21, 66, 16, 7, 48, 3, and 54 were used.
  • For 30 genes, set 1, genes 16, 47, 67, 9, 22, 10, 64, 72, 46, 6, 60, 74, 3, 68, 57, 63, 14, 54, 58, 30, 28, 18, 70, 73, 52, 39, 34, 61, 12, and 21 were used. In set 2, genes 18, 1, 44, 24, 68, 26, 62, 10, 47, 67, 37, 55, 32, 35, 34, 14, 49, 30, 17, 16, 51, 45, 74, 31, 9, 57, 66, 39, 53, and 8 were used. In set 3, genes 58, 45, 55, 39, 22, 32, 9, 49, 31, 13, 51, 56, 28, 12, 3, 59, 74, 35, 42, 67, 69, 47, 66, 18, 52, 57, 43, 5, 26, and 4 were used. In set 4, genes 45, 1, 74, 12, 18, 23, 59, 27, 38, 40, 72, 56, 50, 20, 52, 32, 5, 16, 9, 21, 60, 64, 49, 70, 30, 61, 6, 10, 31, and 24 were used. In set 5, genes 60, 53, 7, 32, 73, 25, 69, 48, 17, 45, 16, 3, 14, 9, 37, 41, 72, 43, 68, 39, 20, 51, 59, 23, 6, 15, 74, 19, 31, and 66 were used. In set 6, genes 47, 54, 9, 38, 60, 33, 40, 12, 57, 45, 26, 56, 11, 27, 67, 25, 69, 59, 68, 7, 61, 72, 23, 21, 28, 48, 29, 65, 37, and 15 were used. In set 7, genes 21, 42, 30, 57, 65, 59, 53, 74, 45, 66, 68, 41, 19, 24, 8, 10, 61, 43, 38, 67, 37, 47, 40, 22, 63, 35, 70, 72, 5, and 6 were used. In set 8, genes 58, 11, 28, 36, 24, 34, 53, 9, 44, 23, 51, 70, 22, 17, 15, 59, 5, 60, 1, 64, 21, 50, 35, 52, 31, 43, 38, 39, 32, and 62 were used. In set 9, genes 43, 30, 63, 7, 60, 40, 39, 1, 48, 17, 69, 57, 6, 62, 19, 38, 36, 13, 66, 64, 25, 31, 65, 47, 27, 16, 53, 68, 37, and 41 were used. In set 10, genes 22, 17, 4, 2, 37, 16, 49, 7, 63, 64, 14, 15, 74, 43, 25, 54, 46, 50, 53, 67, 39, 62, 59, 10, 55, 72, 65, 52, 58, and 19 were used.
  • For 35 genes, set 1, genes 4, 43, 55, 49, 13, 26, 32, 21, 18, 50, 14, 20, 65, 7, 24, 52, 58, 8, 30, 37, 54, 71, 2, 31, 44, 61, 66, 67, 28, 39, 10, 70, 17, 19, and 45 were used. In set 2, genes 14, 13, 67, 21, 48, 28, 69, 47, 50, 3, 68, 63, 22, 41, 60, 61, 5, 44, 56, 65, 7, 66, 15, 6, 45, 2, 36, 5, 30, 72, 34, 46, 24, 29, and 12 were used. In set 3, genes 67, 25, 58, 11, 17, 16, 3, 69, 21, 1, 59, 26, 72, 41, 47, 2, 34, 24, 10, 19, 33, 5, 50, 9, 71, 20, 62, 8, 68, 61, 23, 37, 35, 60, and 32 were used. In set 4, genes 5, 30, 14, 1, 59, 27, 28, 51, 55, 61, 18, 37, 17, 73, 6, 44, 67, 12, 35, 11, 53, 72, 70, 25, 21, 7, 34, 13, 74, 43, 52, 39, 54, 2, and 19 were used. In set 5, genes 56, 64, 58, 35, 1, 23, 43, 4, 73, 28, 54, 6, 51, 68, 49, 37, 16, 71, 3, 21, 48, 69, 70, 10, 26, 22, 50, 44, 2, 60, 38, 40, 66, 63, and 65 were used. In set 6, genes 72, 49, 51, 44, 19, 28, 1, 11, 3, 40, 33, 41, 70, 29, 48, 62, 50, 4, 47, 60, 68, 10, 61, 32, 20, 13, 22, 59, 65, 64, 67, 21, 35, 39, and 24 were used. In set 7, genes 14, 35, 31, 20, 8, 59, 50, 15, 52, 62, 19, 30, 71, 68, 72, 47, 38, 74, 36, 49, 73, 22, 41, 25, 69, 16, 32, 24, 51, 43, 65, 3, 6, 53, and 29 were used. In set 8, genes 22, 44, 23, 9, 26, 56, 72, 59, 35, 61, 51, 69, 64, 30, 53, 27, 11, 55, 39, 67, 48, 28, 14, 10, 8, 12, 40, 24, 57, 34, 50, 32, 42, 41, and 38 were used. In set 9, genes 15, 7, 27, 6, 67, 9, 26, 57, 30, 37, 58, 23, 42, 11, 36, 52, 32, 29, 62, 21, 41, 61, 64, 18, 40, 35, 66, 1, 2, 56, 16, 3, 55, 10, and 51 were used. In set 10, genes 9, 14, 71, 25, 44, 37, 49, 46, 66, 53, 7, 33, 22, 12, 73, 50, 27, 24, 13, 5, 41, 51, 61, 16, 28, 56, 23, 20, 10, 8, 70, 48, 42, 52, and 34 were used.
  • For 40 genes, set 1, genes 26, 36, 43, 30, 62, 19, 20, 51, 41, 71, 1, 63, 10, 56, 65, 17, 15, 50, 5, 35, 4, 54, 12, 70, 48, 31, 47, 37, 34, 8, 3, 69, 40, 44, 46, 59, 61, 74, 23, and 27 were used. In set 2, genes 1, 4, 38, 24, 37, 69, 21, 52, 13, 2, 63, 51, 30, 16, 27, 58, 74, 20, 32, 53, 59, 31, 50, 10, 42, 8, 54, 36, 5, 47, 70, 41, 12, 46, 28, 19, 35, 9, 61, and 48 were used. In set 3, genes 35, 48, 40, 47, 20, 67, 57, 72, 15, 17, 46, 37, 9, 2, 60, 30, 65, 49, 29, 64, 16, 21, 7, 74, 61, 11, 58, 71, 62, 23, 24, 55, 3, 53, 52, 27, 18, 50, 25, and 66 were used. In set 4, genes 35, 10, 59, 19, 27, 40, 30, 4, 9, 52, 2, 29, 26, 41, 55, 17, 13, 53, 71, 63, 58, 44, 45, 62, 70, 16, 64, 48, 43, 8, 38, 72, 49, 37, 18, 36, 74, 42, 46, and 54 were used. In set 5, genes 16, 61, 1, 10, 20, 51, 22, 6, 43, 65, 66, 24, 30, 9, 14, 40, 32, 74, 18, 71, 15, 28, 52, 31, 56, 55, 23, 4, 58, 36, 60, 54, 25, 63, 27, 64, 50, 29, 44, and 45 were used. In set 6, genes 15, 30, 3, 50, 61, 47, 13, 48, 45, 17, 46, 10, 28, 37, 8, 54, 9, 5, 63, 18, 39, 49, 34, 68, 14, 23, 43, 11, 1, 51, 56, 67, 20, 57, 6, 19, 25, 31, 21, and 12 were used. In set 7, genes 45, 73, 53, 29, 35, 56, 70, 51, 30, 59, 49, 22, 6, 43, 28, 31, 40, 4, 66, 25, 37, 19, 12, 65, 26, 74, 46, 50, 23, 62, 17, 69, 36, 41, 34, 27, 67, 7, 24, and 13 were used. In set 8, genes 62, 30, 38, 41, 18, 13, 49, 71, 68, 47, 50, 70, 66, 5, 23, 33, 27, 56, 6, 7, 34, 28, 26, 58, 53, 46, 16, 52, 72, 42, 10, 54, 67, 64, 12, 8, 19, 57, 73, and 17 were used. In set 9, genes 11, 32, 48, 54, 42, 67, 13, 53, 21, 44, 57, 22, 40, 12, 5, 29, 69, 37, 17, 39, 45, 73, 60, 26, 14, 72, 4, 59, 24, 46, 18, 51, 36, 61, 35, 9, 19, 16, 38, and 28 were used. In set 10, genes 58, 1, 55, 59, 11, 63, 3, 26, 49, 69, 34, 47, 65, 46, 14, 39, 5, 67, 16, 66, 64, 38, 44, 32, 15, 22, 19, 71, 23, 52, 45, 53, 48, 8, 60, 73, 9, 30, 25, and 37 were used.
  • For 45 genes, set 1, genes 26, 21, 17, 34, 19, 27, 6, 61, 24, 42, 3, 60, 70, 43, 54, 13, 9, 20, 28, 58, 12, 23, 33, 4, 63, 56, 67, 1, 11, 68, 41, 59, 45, 5, 48, 32, 10, 44, 16, 65, 51, 62, 22, 38, and 74 were used. In set 2, genes 21, 41, 67, 5, 51, 53, 28, 25, 31, 60, 52, 17, 50, 11, 29, 45, 2, 32, 71, 13, 68, 22, 74, 33, 48, 56, 62, 42, 26, 14, 61, 23, 9, 46, 66, 10, 64, 59, 54, 69, 27, 47, 44, 34, and 40 were used. In set 3, genes 68, 48, 43, 74, 17, 4, 49, 34, 38, 60, 12, 42, 18, 5, 51, 32, 1, 57, 9, 11, 30, 13, 37, 15, 29, 33, 44, 20, 55, 70, 45, 41, 24, 56, 35, 52, 59, 7, 25, 2, 31, 64, 71, 22, and 39 were used. In set 4, genes 44, 61, 51, 69, 65, 72, 29, 57, 40, 62, 66, 63, 67, 55, 74, 14, 56, 11, 16, 58, 1, 15, 3, 48, 42, 7, 8, 30, 18, 19, 23, 60, 4, 10, 21, 43, 12, 37, 32, 25, 22, 50, 34, 59, and 2 were used. In set 5, genes 67, 54, 33, 41, 5, 61, 3, 10, 2, 71, 73, 53, 25, 42, 44, 23, 9, 38, 45, 62, 32, 46, 40, 8, 66, 49, 16, 24, 68, 69, 21, 52, 20, 6, 48, 11, 57, 39, 22, 31, 63, 36, 34, 35, and 17 were used. In set 6, genes 43, 45, 19, 17, 4, 58, 37, 7, 42, 52, 2, 62, 25, 66, 24, 15, 22, 74, 68, 67, 8, 1, 33, 70, 31, 50, 64, 14, 61, 51, 6, 38, 35, 39, 72, 5, 27, 36, 11, 18, 12, 48, 46, 54, and 71 were used. In set 7, genes 41, 45, 58, 11, 66, 26, 53, 13, 60, 4, 65, 18, 67, 73, 28, 55, 56, 57, 29, 68, 23, 19, 42, 17, 22, 62, 61, 10, 43, 64, 38, 71, 7, 40, 16, 34, 74, 12, 37, 8, 63, 44, 49, 47, and 3 were used. In set 8, genes 47, 40, 59, 14, 50, 71, 1, 57, 19, 28, 6, 34, 68, 4, 30, 20, 31, 33, 38, 39, 17, 41, 24, 65, 70, 61, 3, 35, 45, 11, 9, 8, 73, 42, 26, 23, 46, 72, 25, 64, 16, 53, 62, 18, and 7 were used. In set 9, genes 61, 5, 69, 22, 7, 17, 26, 13, 2, 30, 55, 33, 47, 14, 59, 32, 9, 44, 23, 45, 42, 25, 15, 57, 48, 50, 1, 68, 18, 72, 46, 73, 67, 36, 63, 60, 28, 21, 20, 8, 29, 35, 37, 38, and 71 were used. In set 10, genes 22, 31, 58, 50, 64, 11, 17, 67, 41, 2, 21, 4, 61, 70, 54, 3, 71, 25, 40, 43, 69, 38, 9, 73, 45, 16, 34, 10, 7, 52, 35, 19, 66, 24, 5, 60, 18, 14, 59, 32, 68, 15, 56, 63, and 65 were used.
  • A similar experiment was performed with random subsets of about 5 to 49 members of the set of 90 expressed gene sequences. Again, the expression levels of random combinations of 5, 10, 15, 20, 25, 30, 35, 40, 45, and 49 (each combination sampled 10 times) of the 90 expressed sequences were used with data from tumor types and then used to predict random test sets of tumor samples (each sampled 10 times) ranging from 2 to all 39 types. The resulting data are shown in FIGS. 7-9.
  • The members of the 90 gene sequences were indexed from 1 to 90, and representative random sets used in the invention are as follows:
  • For 2 genes, set 1, genes 30 and 72 were used. For set 2, genes 65 and 88 were used. For set 3, genes 76 and 88 were used. For set 4, genes 5 and 86 were used. For set 5, genes 30 and 32 were used. For set 6, genes 6 and 59 were used. For set 7, genes 57 and 2 were used. For set 8, genes 49 and 28 were used. For set 9, genes 37 and 35 were used. For set 10, genes 34 and 18 were used.
  • For 5 genes, set 1, genes 1, 83, 59, 36, 66, and 88 were used. In set 2, genes 58, 13, 59, 22, and 64 were used. In set 3, genes 46, 72, 51, 88, and 14 were used. In set 4, genes 23, 74, 22, 27, and 20 were used. In set 5, genes 58, 54, 78, 87, and 50 were used. In set 6, genes 59, 6, 56, 78, and 9 were used. In set 7, genes 30, 78, 69, 83, and 21 were used. In set 8, genes 5, 39, 54, 56, and 55 were used. In set 9, genes 9, 70, 54, 67, and 43 were used. In set 10, genes 80, 81, 63, 90, and 53 were used.
  • For 10 genes, set 1, genes 70, 17, 45, 5, 2, 37, 6, 76, 39, and 14 were used. In set 2, genes 54, 16, 80, 26, 15, 45, 50, 8, 73, and 48 were used. In set 3, genes 66, 87, 31, 74, 37, 45, 19, 1, 70, and 7 were used. In set 4, genes 85, 17, 78, 61, 23, 59, 27, 18, 58, and 24 were used. In set 5, genes 44, 89, 36, 76, 49, 3, 21, 24, 38, and 69 were used. In set 6, genes 32, 72, 55, 2, 86, 81, 53, 45, 17, and 74 were used. In set 7, genes 27, 55, 62, 33, 32, 84, 21, 45, 23, and 7 were used. In set 8, genes 62, 45, 68, 31, 69, 39, 33, 63, 19, and 22 were used. In set 9, genes 71, 39, 11, 56, 88, 80, 37, 77, 62, and 35 were used. In set 10, genes 38, 83, 41, 47, 66, 87, 10, 4, 88, and 22 were used.
  • For 15 genes, set 1, genes 61, 17, 64, 14, 1, 41, 72, 47, 69, 48, 49, 70, 12, 20, and 35 were used. In set 2, genes 26, 49, 69, 31, 84, 42, 24, 56, 82, 12, 29, 2, 21, 15, and 71 were used. In set 3, genes 54, 62, 8, 32, 58, 65, 39, 44, 35, 22, 34, 77, 43, 83, and 75 were used. In set 4, genes 62, 50, 57, 80, 28, 83, 32, 56, 14, 2, 3, 48, 67, 79, and 72 were used. In set 5, genes 55, 58, 77, 68, 90, 76, 17, 72, 85, 34, 43, 33, 62, 6, and 64 were used. In set 6, genes 41, 63, 90, 9, 25, 35, 2, 14, 65, 87, 11, 36, 10, 79, and 17 were used. In set 7, genes 69, 89, 77, 33, 71, 4, 6, 46, 72, 13, 68, 81, 31, 50, and 32 were used. In set 8, genes 29, 69, 34, 47, 32, 52, 63, 73, 23, 25, 33, 10, 37, 17, and 55 were used. In set 9, genes 24, 13, 45, 17, 51, 48, 20, 30, 29, 40, 53, 19, 88, 76, and 28 were used. In set 10, genes 86, 33, 19, 4, 84, 25, 78, 29, 88, 10, 7, 67, 85, 45, and 8 were used.
  • For 20 genes, set 1, genes 57, 78, 43, 50, 14, 71, 56, 25, 80, 31, 88, 4, 49, 13, 3, 38, 32, 8, 52, and 75 were used. In set 2, genes 84, 46, 23, 85, 55, 82, 56, 83, 48, 89, 8, 60, 21, 40, 20, 17, 87, 24, 34, and 39 were used. In set 3, genes 72, 88, 53, 46, 82, 9, 34, 21, 76, 24, 14, 35, 90, 31, 58, 30, 15, 41, 7, and 28 were used. In set 4, genes 22, 62, 21, 3, 45, 50, 58, 72, 69, 82, 49, 42, 47, 9, 15, 59, 17, 24, 40, and 52 were used. In set 5, genes 71, 18, 74, 53, 43, 75, 76, 54, 63, 64, 10, 5, 90, 51, 31, 58, 28, 35, 70, and 23 were used. In set 6, genes 7, 30, 77, 25, 17, 16, 35, 68, 56, 37, 78, 87, 45, 8, 42, 82, 72, 23, 58, and 54 were used. In set 7, genes 3, 58, 67, 5, 87, 62, 56, 88, 73, 50, 22, 52, 10, 60, 57, 42, 46, 26, 7, and 82 were used. In set 8, genes 63, 19, 22, 13, 82, 12, 44, 52, 8, 90, 35, 81, 79, 15, 83, 76, 51, 27, 45, and 56 were used. In set 9, genes 65, 34, 76, 81, 58, 86, 83, 46, 40, 55, 48, 42, 57, 70, 21, 72, 71, 17, 22, and 24 were used. In set 10, genes 34, 74, 2, 53, 76, 73, 19, 72, 88, 87, 44, 70, 40, 39, 22, 45, 83, 77, 30, and 46 were used.
  • For 25 genes, set 1, genes 13, 77, 22, 85, 58, 8, 23, 2, 40, 81, 50, 31, 14, 41, 21, 52, 6, 74, 11, 17, 83, 7, 9, 19, and 18 were used. In set 2, genes 3, 12, 8, 87, 34, 75, 31, 88, 77, 39, 40, 60, 54, 9, 37, 5, 51, 53, 32, 35, 66, 4, 26, 59, and 29 were used. In set 3, genes 29, 41, 44, 56, 88, 72, 90, 6, 19, 63, 42, 24, 49, 70, 39, 17, 82, 13, 9, 4, 51, 40, 22, 71, and 25 were used. In set 4, genes 70, 82, 55, 43, 40, 32, 16, 13, 22, 41, 7, 85, 46, 42, 73, 76, 14, 60, 50, 72, 5, 81, 67, 57, and 83 were used. In set 5, genes 88, 83, 53, 26, 29, 4, 38, 71, 11, 66, 14, 89, 39, 34, 84, 41, 7, 64, 87, 3, 67, 43, 50, 79, and 6 were used. In set 6, genes 88, 16, 83, 4, 7, 39, 56, 82, 10, 20, 87, 79, 3, 35, 76, 49, 43, 11, 74, 13, 48, 22, 64, 34, and 89 were used. In set 7, genes 6, 64, 39, 50, 44, 46, 61, 28, 79, 43, 35, 85, 48, 9, 59, 47, 57, 5, 24, 33, 80, 11, 42, 20, and 26 were used. In set 8, genes 59, 24, 46, 33, 50, 71, 53, 21, 86, 10, 75, 23, 74, 60, 43, 22, 16, 62, 85, 79, 81, 34, 73, 2, and 1 were used. In set 9, genes 68, 11, 64, 54, 37, 28, 44, 73, 83, 89, 2, 41, 59, 75, 21, 23, 88, 71, 34, 29, 1, 47, 84, 60, and 72 were used. In set 10, genes 5, 12, 60, 84, 32, 58, 70, 2, 38, 42, 24, 13, 85, 10, 49, 90, 55, 81, 39, 27, 65, 56, 31, 34, and 57 were used.
  • For 30 genes, set 1, genes 24, 88, 10, 69, 64, 8, 19, 54, 80, 70, 11, 9, 29, 56, 36, 79, 30, 65, 2, 58, 23, 74, 41, 16, 77, 4, 78, 14, 85, and 32 were used. In set 2, genes 73, 27, 19, 52, 87, 51, 63, 4, 76, 64, 90, 81, 42, 47, 9, 62, 40, 65, 83, 30, 39, 59, 10, 11, 54, 44, 43, 6, 86, and 41 were used. In set 3, genes 28, 47, 41, 8, 24, 54, 26, 49, 61, 17, 46, 64, 20, 16, 1, 33, 82, 79, 85, 5, 86, 69, 31, 65, 83, 7, 67, 35, 48, and 57 were used. In set 4, genes 13, 21, 83, 35, 47, 57, 8, 66, 75, 17, 38, 70, 39, 23, 9, 1, 2, 28, 68, 81, 36, 80, 52, 22, 44, 37, 85, 15, 72, and 86 were used. In set 5, genes 81, 20, 36, 89, 13, 14, 46, 58, 59, 62, 28, 7, 1, 25, 35, 83, 26, 50, 51, 15, 16, 56, 71, 5, 47, 6, 78, 80, 85, and 84 were used. In set 6, genes 68, 74, 73, 89, 38, 72, 33, 35, 15, 79, 3, 37, 23, 67, 10, 62, 64, 77, 44, 60, 75, 7, 51, 12, 46, 76, 81, 26, 42, and 6 were used. In set 7, genes 34, 55, 62, 40, 78, 35, 76, 30, 21, 77, 46, 71, 66, 69, 63, 81, 51, 38, 84, 53, 82, 89, 29, 14, 36, 45, 60, 7, 52, and 27 were used. In set 8, genes 56, 12, 35, 79, 57, 4, 16, 9, 24, 58, 40, 72, 80, 67, 23, 76, 88, 69, 52, 78, 32, 47, 14, 46, 64, 83, 17, 59, 81, and 20 were used. In set 9, genes 73, 27, 12, 58, 54, 62, 48, 43, 16, 41, 49, 84, 9, 75, 13, 50, 19, 3, 76, 78, 56, 68, 71, 25, 24, 60, 18, 35, 45, and 51 were used. In set 10, genes 82, 21, 24, 85, 51, 18, 72, 28, 89, 22, 34, 4, 53, 75, 83, 23, 50, 5, 42, 13, 88, 63, 40, 64, 38, 35, 39, 44, 59, and 70 were used.
  • For 35 genes, set 1, genes 2, 69, 70, 89, 9, 11, 5, 17, 63, 18, 12, 59, 58, 85, 26, 71, 61, 10, 3, 1, 22, 79, 84, 30, 48, 82, 38, 44, 56, 42, 88, 6, 60, 14, and 28 were used. In set 2, genes 84, 81, 88, 46, 12, 50, 38, 78, 62, 48, 19, 43, 26, 66, 4, 20, 40, 58, 9, 52, 87, 47, 6, 55, 21, 75, 31, 77, 57, 53, 45, 34, 30, 32, and 39 were used. In set 3, genes 6, 3, 22, 89, 8, 78, 87, 71, 42, 63, 18, 40, 68, 77, 64, 88, 5, 58, 43, 72, 80, 10, 21, 56, 11, 59, 61, 2, 19, 76, 30, 20, 14, 69, and 35 were used. In set 4, genes 55, 42, 89, 41, 56, 33, 24, 28, 15, 61, 63, 18, 90, 60, 35, 76, 70, 52, 8, 1, 64, 23, 13, 39, 71, 31, 3, 81, 10, 34, 66, 44, 16, 7, and 78 were used. In set 5, genes 59, 58, 12, 50, 47, 42, 28, 22, 76, 54, 1, 18, 7, 53, 68, 73, 20, 67, 14, 72, 23, 13, 39, 10, 70, 55, 45, 17, 31, 51, 80, 3, 24, 30, and 46 were used. In set 6, genes 53, 66, 26, 3, 73, 47, 61, 63, 51, 41, 29, 5, 19, 10, 57, 22, 64, 11, 34, 89, 43, 24, 31, 60, 27, 76, 17, 86, 70, 81, 50, 46, 36, 14, and 45 were used. In set 7, genes 18, 88, 90, 13, 73, 81, 64, 56, 84, 2, 4, 22, 3, 25, 35, 54, 89, 86, 27, 41, 6, 34, 38, 14, 74, 36, 59, 8, 40, 55, 42, 83, 39, 44, and 60 were used. In set 8, genes 46, 32, 22, 15, 67, 89, 14, 5, 70, 39, 49, 9, 84, 71, 12, 78, 27, 86, 26, 57, 20, 43, 58, 87, 42, 8, 31, 1, 54, 62, 69, 40, 29, 52, and 64 were used. In set 9, genes 3, 39, 55, 25, 90, 10, 9, 77, 62, 78, 18, 12, 58, 51, 22, 67, 7, 61, 59, 35, 52, 4, 65, 38, 32, 71, 87, 88, 63, 50, 73, 70, 44, 45, and 84 were used. In set 10, genes 65, 54, 51, 38, 40, 5, 43, 71, 34, 30, 22, 6, 36, 64, 63, 13, 70, 85, 21, 88, 77, 86, 79, 66, 25, 18, 26, 19, 76, 56, 23, 60, 75, 2, and 49 were used.
  • For 40 genes, set 1, genes 81, 80, 68, 77, 17, 71, 34, 33, 48, 88, 90, 32, 23, 2, 38, 59, 75, 82, 50, 56, 12, 36, 6, 87, 72, 37, 26, 15, 35, 66, 13, 76, 55, 3, 78, 18, 52, 47, 73, and 20 were used. In set 2, genes 11, 65, 27, 44, 88, 49, 55, 57, 1, 72, 9, 28, 56, 67, 13, 58, 42, 36, 8, 31, 40, 14, 26, 35, 62, 22, 19, 84, 78, 21, 2, 41, 74, 71, 52, 30, 25, 76, 85, and 63 were used. In set 3, genes 50, 22, 10, 54, 9, 51, 15, 34, 29, 35, 76, 89, 33, 6, 88, 56, 36, 70, 87, 40, 83, 62, 1, 42, 25, 78, 30, 26, 44, 60, 69, 47, 49, 31, 18, 59, 37, 52, 61, and 17 were used. In set 4, genes 27, 33, 7, 89, 36, 59, 48, 42, 66, 39, 90, 52, 2, 14, 30, 80, 9, 56, 21, 87, 65, 67, 41, 73, 82, 20, 4, 46, 5, 84, 88, 15, 44, 58, 78, 85, 3, 64, 6, and 8 were used. In set 5, genes 43, 24, 86, 29, 46, 90, 40, 1, 71, 57, 12, 84, 69, 19, 42, 62, 28, 35, 5, 63, 52, 17, 39, 4, 67, 81, 50, 47, 61, 54, 87, 70, 77, 6, 10, 38, 37, 79, 31, and 36 were used. In set 6, genes 28, 5, 78, 85, 16, 20, 36, 52, 43, 29, 67, 83, 12, 79, 84, 8, 81, 46, 11, 3, 54, 86, 10, 60, 71, 51, 39, 53, 59, 69, 44, 61, 7, 56, 27, 50, 66, 70, 1, and 25 were used. In set 7, genes 39, 47, 48, 24, 25, 3, 41, 16, 65, 73, 63, 14, 70, 57, 12, 64, 90, 23, 27, 38, 66, 71, 54, 21, 83, 28, 72, 53, 11, 30, 80, 15, 6, 88, 89, 85, 81, 61, 78, and 34 were used. In set 8, genes 61, 8, 57, 16, 24, 64, 48, 36, 58, 28, 27, 40, 70, 77, 25, 76, 52, 35, 62, 4, 60, 7, 54, 37, 11, 20, 72, 34, 56, 78, 10, 86, 51, 29, 84, 47, 30, 21, 59, and 67 were used. In set 9, genes 67, 3, 83, 33, 35, 26, 25, 79, 68, 19, 18, 84, 14, 58, 66, 57, 1, 2, 27, 64, 23, 24, 76, 81, 17, 37, 38, 30, 45, 75, 49, 39, 5, 53, 43, 15, 51, 40, 69, and 12 were used. In set 10, genes 39, 77, 29, 70, 85, 45, 54, 79, 31, 43, 15, 11, 47, 83, 76, 21, 67, 14, 4, 19, 49, 42, 18, 13, 12, 7, 88, 8, 3, 35, 81, 55, 71, 60, 72, 57, 46, 40, 56, and 32 were used.
  • For 45 genes, set 1, genes 7, 63, 45, 87, 19, 55, 36, 42, 9, 4, 79, 68, 46, 35, 40, 80, 59, 58, 38, 17, 50, 30, 13, 39, 33, 84, 34, 64, 2, 57, 24, 88, 65, 16, 53, 18, 28, 8, 60, 15, 43, 73, 77, 20, and 78 were used. In set 2, genes 70, 19, 81, 68, 38, 35, 48, 9, 53, 11, 73, 42, 54, 28, 32, 40, 60, 88, 25, 7, 67, 17, 36, 51, 44, 46, 10, 89, 14, 80, 39, 41, 27, 8, 75, 47, 61, 57, 59, 76, 86, 65, 63, 74, and 77 were used. In set 3, genes 55, 24, 63, 17, 32, 81, 2, 67, 51, 85, 27, 46, 60, 90, 25, 35, 58, 11, 47, 33, 73, 3, 74, 52, 15, 86, 6, 78, 36, 66, 57, 13, 49, 28, 75, 70, 4, 77, 43, 26, 61, 64, 20, 1, and 23 were used. In set 4, genes 49, 72, 13, 51, 55, 11, 29, 5, 43, 44, 40, 6, 38, 67, 47, 35, 36, 28, 81, 24, 80, 32, 16, 88, 63, 87, 86, 79, 21, 1, 30, 10, 62, 58, 23, 12, 78, 26, 69, 56, 85, 42, 17, 84, and 39 were used. In set 5, genes 53, 33, 18, 65, 22, 83, 50, 88, 76, 40, 82, 68, 85, 5, 63, 45, 78, 16, 42, 54, 27, 66, 70, 74, 7, 51, 89, 64, 49, 37, 84, 86, 34, 39, 80, 31, 61, 87, 69, 4, 81, 30, 14, 41, and 29 were used. In set 6, genes 7, 60, 38, 14, 73, 9, 79, 81, 22, 10, 85, 51, 40, 87, 3, 26, 57, 56, 12, 72, 39, 59, 63, 28, 64, 71, 69, 21, 67, 48, 50, 66, 46, 88, 11, 13, 24, 8, 58, 75, 2, 41, 5, 44, and 55 were used. In set 7, genes 15, 65, 31, 19, 11, 38, 2, 9, 64, 66, 22, 35, 49, 3, 77, 43, 32, 56, 39, 54, 80, 21, 6, 40, 27, 86, 10, 16, 70, 30, 85, 23, 26, 4, 55, 73, 42, 13, 41, 68, 29, 57, 28, 72, and 58 were used. In set 8, genes 83, 27, 9, 62, 84, 78, 13, 5, 74, 55, 12, 34, 58, 3, 67, 57, 24, 45, 42, 47, 75, 25, 29, 44, 46, 61, 56, 70, 86, 37, 14, 49, 60, 89, 28, 72, 59, 38, 2, 81, 50, 7, 6, 21, and 82 were used. In set 9, genes 7, 10, 35, 14, 79, 66, 33, 52, 16, 55, 68, 59, 57, 19, 11, 47, 22, 38, 61, 30, 71, 50, 63, 88, 53, 80, 6, 54, 77, 21, 37, 84, 9, 65, 12, 49, 40, 73, 76, 2, 28, 29, 3, 72, and 18 were used. 
In set 10, genes 12, 19, 9, 80, 84, 15, 7, 2, 39, 21, 48, 40, 51, 69, 74, 83, 5, 66, 27, 26, 89, 60, 4, 86, 41, 44, 35, 10, 76, 53, 63, 16, 37, 79, 11, 42, 68, 3, 59, 82, 77, 73, 85, 67, and 14 were used.
  • For 49 genes, set 1, genes 84, 47, 56, 1, 18, 21, 57, 54, 27, 89, 44, 85, 64, 10, 77, 34, 65, 66, 80, 70, 46, 23, 53, 61, 24, 81, 43, 35, 30, 74, 83, 51, 20, 17, 72, 4, 49, 68, 60, 28, 67, 19, 42, 55, 73, 36, 7, 39, and 33 were used. In set 2, genes 47, 29, 58, 36, 21, 53, 40, 7, 83, 77, 24, 89, 71, 64, 60, 4, 37, 86, 27, 57, 62, 63, 72, 1, 88, 78, 68, 17, 51, 16, 82, 42, 81, 18, 32, 49, 55, 10, 11, 66, 35, 23, 70, 20, 61, 25, 48, 43, and 54 were used. In set 3, genes 54, 2, 62, 67, 44, 25, 8, 53, 86, 33, 75, 32, 45, 76, 43, 65, 59, 58, 42, 64, 47, 78, 3, 57, 71, 88, 14, 23, 51, 83, 1, 41, 7, 56, 40, 20, 39, 72, 70, 19, 5, 35, 50, 82, 37, 48, 15, 31, and 16 were used. In set 4, genes 35, 65, 48, 43, 69, 62, 64, 74, 82, 39, 37, 1, 88, 45, 66, 12, 79, 55, 38, 84, 17, 30, 25, 26, 89, 56, 28, 57, 59, 34, 85, 14, 47, 44, 41, 19, 60, 20, 73, 2, 63, 75, 49, 80, 58, 77, 27, 54, and 29 were used. In set 5, genes 64, 51, 36, 12, 84, 24, 65, 47, 88, 26, 10, 19, 73, 90, 35, 53, 18, 55, 80, 70, 79, 82, 87, 77, 15, 85, 83, 7, 72, 1, 6, 57, 38, 45, 74, 33, 62, 86, 31, 69, 27, 14, 4, 29, 54, 44, 63, 78, and 42 were used. In set 6, genes 24, 39, 85, 42, 88, 32, 65, 23, 6, 75, 53, 77, 64, 90, 13, 82, 47, 31, 48, 8, 78, 67, 63, 44, 26, 40, 14, 34, 18, 59, 2, 17, 20, 56, 83, 68, 86, 9, 38, 73, 89, 55, 29, 69, 72, 16, 28, 51, and 81 were used. In set 7, genes 32, 70, 57, 67, 1, 73, 52, 38, 65, 83, 5, 40, 49, 31, 66, 85, 6, 82, 12, 48, 89, 3, 19, 41, 62, 16, 46, 61, 24, 18, 55, 30, 33, 56, 68, 20, 81, 10, 86, 9, 15, 63, 78, 22, 75, 14, 13, 43, and 77 were used. In set 8, genes 17, 30, 47, 85, 7, 3, 6, 35, 76, 77, 25, 86, 36, 75, 44, 29, 69, 60, 63, 64, 82, 51, 19, 68, 41, 28, 73, 18, 10, 26, 42, 78, 67, 12, 80, 33, 13, 57, 38, 87, 49, 59, 74, 50, 90, 46, 8, 81, and 4 were used. 
In set 9, genes 20, 76, 42, 36, 66, 21, 8, 28, 22, 15, 56, 5, 2, 86, 17, 62, 23, 1, 80, 73, 52, 83, 32, 65, 44, 82, 35, 60, 47, 90, 74, 9, 84, 50, 4, 77, 55, 57, 19, 71, 25, 48, 81, 53, 34, 38, 3, 37, and 16 were used. In set 10, genes 84, 87, 3, 41, 36, 71, 33, 57, 85, 26, 53, 22, 82, 31, 2, 45, 24, 18, 37, 35, 77, 20, 63, 25, 6, 17, 58, 7, 9, 49, 28, 76, 79, 67, 13, 80, 66, 5, 43, 4, 74, 75, 21, 86, 23, 39, 42, 27, and 54 were used.
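  • Random gene sets of the kind listed above could be drawn along the following lines. This is a minimal sketch, assuming uniform sampling without replacement from the indexed sequences; the function name `sample_gene_sets`, the fixed seed, and the use of Python's `random` module are illustrative assumptions and not the procedure actually used to produce the listed sets.

```python
import random


def sample_gene_sets(n_genes=90,
                     sizes=(5, 10, 15, 20, 25, 30, 35, 40, 45, 49),
                     repeats=10,
                     seed=0):
    """Draw `repeats` random gene-index sets for each subset size.

    Gene sequences are indexed from 1 to n_genes, as in the text.
    Returns a dict mapping subset size -> list of index lists.
    """
    rng = random.Random(seed)          # hypothetical seed, for repeatability only
    indices = range(1, n_genes + 1)    # indices 1..n_genes inclusive
    return {
        size: [rng.sample(indices, size) for _ in range(repeats)]
        for size in sizes
    }


if __name__ == "__main__":
    gene_sets = sample_gene_sets()
    for size, sets in gene_sets.items():
        print(f"For {size} genes, set 1: {sets[0]}")
```

    Passing `n_genes=74` would give the corresponding sketch for the set of 74 sequences, and other subset sizes can be supplied through `sizes`.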
  • Example 5: PCR Based Detection
  • As noted above, the determination or measurement of gene expression may be performed by PCR, such as the use of quantitative PCR. Detecting expression of about 5 to 49 expressed sequences in the human genome may be used in such embodiments of the invention. Additionally, expression levels of about 5 to 49 gene sequences in the set of 74, the set of 90, or a combination set of the two (with a total of 126 gene sequences given the presence of 38 gene sequences in common between the two sets) may also be used. The invention contemplates the use of quantitative PCR to measure expression levels, as described above, of about 5 to 49 of 87 gene sequences, all of which are present in either the set of 74 or the set of 90. Of the 87 gene sequences, 60 are present in the set of 74, and 63 are present in the set of 90. The identifiers/accession numbers of the 87 gene sequences are AA456140, AA745593, AA765597, AA782845, AA865917, AA946776, AA993639, AB038160, AF104032, AF133587, AF301598, AF332224, AI041545, AI147926, AI309080, AI341378, AI457360, AI620495, AI632869, AI683181, AI685931, AI802118, AI804745, AI952953, AI985118, AJ000388, AK025181, AK027147, AK054605, AL023657, AL039118, AL110274, AL157475, AW118445, AW194680, AW291189, AW298545, AW445220, AW473119, AY033998, BC000045, BC001293, BC001504, BC001639, BC002551, BC004331, BC004453, BC005364, BC006537, BC006811, BC006819, BC008764, BC008765, BC009084, BC009237, BC010626, BC011949, BC012926, BC013117, BC015754, BC017586, BE552004, BE962007, BF224381, BF437393, BF446419, BF592799, BI493248, H05388, H07885, H09748, M95585, N64339, NM_000065, NM_001337, NM_003914, NM_004062, NM_004063, NM_004496, NM_006115, NM_019894, NM_033229, R15881, R45389, R61469, X69699, and X96757.
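  • The set sizes quoted above follow from ordinary inclusion-exclusion counting. The short sketch below only checks that arithmetic; the function names are illustrative, and the figure of 36 shared sequences among the 87 is derived here from the stated counts rather than quoted from the text.

```python
def union_size(size_a, size_b, shared):
    """Number of distinct members when two sets overlap in `shared` members."""
    return size_a + size_b - shared


def shared_size(size_a, size_b, union):
    """Number of members common to two sets whose union has `union` members."""
    return size_a + size_b - union


if __name__ == "__main__":
    # Set of 74 and set of 90 with 38 sequences in common.
    print(union_size(74, 90, 38))    # 126
    # Of the 87 sequences, 60 are in the set of 74 and 63 in the set of 90.
    print(shared_size(60, 63, 87))   # 36
```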
  • The use of about 5 to 49 of these sequences in the practice of the invention may include the use of expression levels measured for reference gene sequences as described herein. In some embodiments, the reference gene sequences are one or more of the 8 disclosed herein. The invention contemplates the use of one or more of the reference sequences identified by AF308803, AL137727, BC003043, BC006091, and BC016680 in PCR or QPCR based embodiments of the invention. Of course, all 5 of these reference sequences may also be used in combination.
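  • One common way to use reference gene measurements in QPCR is to normalize each target against the mean of the reference cycle threshold (Ct) values. The sketch below assumes that delta-Ct approach with roughly 100% amplification efficiency; this passage does not state which normalization is used, and the function name and all Ct values shown are hypothetical.

```python
from statistics import mean


def relative_expression(target_ct, reference_cts):
    """Delta-Ct normalization: 2 ** -(Ct_target - mean(Ct_references)).

    Assumes the amount of product doubles every cycle.
    """
    delta_ct = target_ct - mean(reference_cts)
    return 2.0 ** (-delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values for the five reference sequences named above.
    reference_ct = {
        "AF308803": 22.1,
        "AL137727": 21.8,
        "BC003043": 22.4,
        "BC006091": 21.9,
        "BC016680": 22.3,
    }
    # Hypothetical Ct values for two of the 87 target sequences.
    target_ct = {"AY033998": 25.0, "NM_004496": 27.5}

    refs = list(reference_ct.values())
    for accession, ct in target_ct.items():
        print(accession, relative_expression(ct, refs))
```

    Averaging several reference sequences, rather than relying on one, makes the normalization less sensitive to variation in any single reference measurement.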
  • Example 6: mRNA Sequences (Sequence Listing)
  • >Hs.73995_mRNA_1 gi|190403|gb|M60502.1 HUMPROFILE Human profilaggrin mRNA
    3′ end polyA = 1
    GGCCACTCTGCAGACAGCTCCAGACAATCAGGCACTCGTCACACAGAGTCTTCCTCTCGT
    GGACAGGCTGCGTCATCCCATGAACAGGCAAGATCAAGTGCAGGAGAAAGACATGGATCC
    CACCACCAGCAGTCAGCAGACAGCTCCAGACACGCAGGCATTGGGCACGGACAAGCTTCA
    TCTGCAGTCAGAGACAGTGGACACCGAGGGTACAGAGGTAGTCAGGCCACTGACAGTGAG
    GGACATTCAGAAGACTCAGACACACAGTCAGTGTCAGCACAGGGACAAGCTGGGCCCCAT
    CAGCAGAGCCACCAAGAGTCCGCACGTGGCCAGTCAGGGGAAAGCTCTGGACGTTCAGGG
    TCTTTCCTCTACCAGGTGAGCACTCATGAACAGTCTGAGTCCACCCATGGACAGTCTGTG
    CCCAGCACTGGAGGAAGACAAGGATCCCACCATGATCAGGCACAAGACAGCTCCAGGCAC
    TCAGCATCCCAAGAGGGTCAGGACACCATTCGTGGACACCCGGGGCCAAGCAGAGGAGGA
    AGACAGGGGTCCCACCACGAGCAATCGGTAGATAGGTCTGGACACTCAGGGTCCCATCAC
    AGCCACACCACATCCCAGGGAAGGTCTGATGCCTCCCGTGGGCAGTCAGGATCCAGAAGT
    GCAAGCAGACAAACACATGACCAGGAACAATCAGGAGACGGCTCTAGGCACTCAGGGTCG
    CGTCATCAGGAAGCTTCCTCTTGGGCCGACAGCTCTAGACACTCACAGGCAGTCCAGGGA
    CAATCAGAGGGGTCCAGGACAAGCAGGCGCCAGGGATCCAGTGTTAGCCAGGACAGTGAC
    AGTCAGGGACACTCAGAAGACTCTGAGAGGCGGTCTGGGTCTGCTTCCAGAAACCATCGT
    GGATCTGCTCAGGAGCAGTCAAGAGATGGCTCCAGACACCCCAGGTCCCATCACGAAGAC
    AGAGCCGGTCACGGGGACTCTGCAGAGAGCTCCAGACAATCAGGCACTCATCATGCAGAG
    AATTCCTCTGGTGGACAGGCTGCATCATCCCATGAACAGGCAAGATCAAGTGCAGGAGAG
    AGACATGGATCCCACTACCAGCAGTCAGCAGACAGCTCCAGACACTCAGGCATTGGGCAC
    GGACAAGCTTCATCTGCAGTCAGAGACAGTGGACACCGAGGGTCCAGTGGTAGTCAGGCC
    AGTGACAATGAGGGACATTCAGAAGACTCAGACACACAGTCAGTGTCAGCCCACCGACAG
    GCTGGGCGCCATCACGAGAGCCACCAAGAGTCCACACGTGGCCGGTCACGAGGAAGGTCT
    GGACGTTCAGGGTCTTTCCTCTACCAGGTGAGCACTCATGAACAGTCTGAGTCTGCCCAT
    GGACGGGCTGGGCCCAGTACTGGAGGAAGACAAGGATCCCGCCACGAGCAGGCACGAGAC
    AGCTCCAGGCACTCAGCGTCCCAAGAGGGTCAGGACACCATTCGTGGACACCCGGGGTCA
    AGGAGAGGAGGAAGACAGGGATCCTACCACGAGCAATCGGTAGATAGGTCTGGACACTCA
    GGGTCCCATCACAGCCACACCACATCCCAGGGAAGGTCTGATGCCTCCCATGGGCAGTCA
    GGATCCAGAAGTGCAAGCAGAGAAACACGTAATGAGGAACAGTCAGGAGACGGCTCCAGG
    CACTCAGGGTCGCGTCACCATGAAGCTTCCACTCAGGCTGACAGCTCTAGACACTCACAG
    TCCGGCCAGGGTGAATCAGCGGGGTCCAGGAGAAGCAGGCGCCAGGGATCCAGTGTTAGC
    CAGGACAGTGACAGTGAGGCATACCCAGAGGACTCTGAGAGGCGATCTGAGTCTGCTTCC
    AGAAACCATCATGGATCTTCTCGGGAGCAGTCAAGAGATGGCTCCAGACACCCCGGATCC
    TCTCACCGCGATACAGCCAGTCATGTACAGTCTTCACCTGTACAGTCAGACTCTAGTACC
    GCTAAGGAACATGGTCACTTTAGTAGTCTTTCACAAGATTCTGCGTATCACTCAGGAATA
    CAGTCACGTGGCAGTCCTCACAGTTCTAGTTCTTATCATTATCAATCTGAGGGCACTGAA
    AGGCAAAAAGGTCAATCAGGTTTAGTTTGGAGACATGGCAGCTATGGTAGTGCAGATTAT
    GATTATGGTGAATCCGGGTTTAGACACTCTCAGCACGGAAGTGTTAGTTACAATTCCAAT
    CCTGTTCTTTTCAAGGAAAGATCTGATATCTGTAAAGCAAGTGCGTTTGGTAAAGATCAT
    CCAAGGTATTATGCAACGTATATTAATAAGGACCCAGGTTTATGTGGCCATTCTAGTGAT
    ATATCGAAACAACTGGGATTTAGTCAGTCACAGAGATACTATTACTATGAGTAAGAAATT
    AATGGCAAAGGAATTAATCCAAGAATAGAAGAATGAAGCAAGTTCACTTTCAATCAAGAA
    ACTTCATAATACTTTCAGGGAAGTTATCTTTTCCTGTCAATCTGTTTAAAATATGCTATA
    GTATTTCATTAGTTTGGTGGTAACTTATTTTTATTGTGTAATGATCTTTAAACGCTATAT
    TTCAGAAATATTAAATGGAAGAAATCAATATCATGGAGAGCTAACTTTAGAAAACTAGCT
    GGAGTATTTTAGGAGATTCTGGGTCAAGTAATGTTTTATGTTTTTGAAAGTTTAAGTTTT
    AGACACTCCCCAAATTTCTAAATTAATCTTTTTCAGAAATATCGAAGGAGCCAAAAATAT
    AAAACAGTTCTGATATCCAAAGTGGCTATATCAACATCAGGGCTAGCACATCTTTCTCTA
    TTATCCTTCTATTGGAATTCTAGTATTCTGTATTCAAAAAATCATCTTGGACATAATTAA
    TATTTTAGTAAGCTGCATCTAAATTAAAAATAAACTATTCATCATATAAT
    >Hs.75236_mRNA_4 gi|14280328|gb|AY033998.1|Homo sapiens polyA = 3
    TAGAATCGGGGGTTTCAGCTCACTGCTCCTTTTCTTTTTTTTCTTTCTCTCCCCCGCCCA
    CCCCCCCAAAAATAATTGATTTGCTTTACAATCATCCACACTGTGTTTTGTGGATCTTTA
    ATTATATATAACAATAGTAGTCATTTTAAATATATATTCTGAAATCTTTGCAAATTTTAA
    CAGAAGAGTCGAAGCTCTGCGAGACCCAATATTTGCCAATAAGAATGGTTATGATAATTA
    GCACCATGGAGCCTCAGGTGTCAAATGGTCCGACATCCAATACAAGCAATGGACCCTCCA
    GCAACAACAGAAACTGTCCTTCTCCCATGCAAACAGGGGCAACCACAGATGACAGCAAAA
    CCAACCTCATCGTCAACTATTTACCCCAGAATATGACCCAAGAAGAATTCAGGAGTCTCT
    TCGGGAGCATTGGTGAAATAGAATCCTGCAAACTTGTGAGAGACAAAATTACAGGACAGA
    GTTTAGGGTATGGATTTGTTAACTATATTGATCCAAAGGATGCAGAGAAAGCCATCAACA
    CTTTAAATGGACTCAGACTCCAGACCAAAACCATAAAGGTCTCATATGCCCGTCCGAGCT
    CTGCCTCAATCAGGGATGCTAACCTCTATGTTAGCGGCCTTCCCAAAACCATGACCCAGA
    AGGAACTGGAGCAACTTTTCTCGCAATACGGCCGTATCATCACCTCACGAATCCTGGTTG
    ATCAAGTCACAGGAGTGTCCAGAGGGGTGGGATTCATCCGCTTTGATAAGAGGATTGAGG
    CAGAAGAAGCCATCAAAGGGCTGAATGGCCAGAAGCCCAGCGGTGCTACGGAACCGATTA
    CTGTGAAGTTTGCCAACAACCCCAGCCAGAAGTCCAGCCAGGCCCTGCTCTCCCAGCTCT
    ACCAGTCCCCTAACCGGCGCTACCCAGGTCCACTTCACCACCAGGCTCAGAGGTTCAGGC
    TGGACAATTTGCTTAATATGGCCTATGGCGTAAAGAGACTGATGTCTGGACCAGTCCCCC
    CTTCTGCTTGTTCCCCCAGGTTCTCCCCAATTACCATTGATGGAATGACAAGCCTTGTGG
    GAATGAACATCCCTGGTCACACAGGAACTGGGTGGTGCATCTTTGTCTACAACCTGTCCC
    CCGATTCCGATGAGAGTGTCCTCTGGCAGCTCTTTGGCCCCTTTGGAGCAGTGAACAACG
    TAAAGGTGATTCGTGACTTCAACACCAACAAGTGCAAGGGATTCGGCTTTGTCACCATGA
    CCAACTATGATGAGGCGGCCATGGCCATCGCCAGCCTCAACGGGTACCGCCTGGGAGACA
    GAGTGTTGCAAGTTTCCTTTAAAACCAACAAAGCCCACAAGTCCTGAATTTCCCATTCTT
    ACTTACTAAAATATATATAGAAATATATACGAACAAAACACACGCGCGCACACACACACA
    TACACGAAAGAGAGAGAAACAAACTTTTCAAGGCTTATATTCAACCATGGACTTTATAAG
    CCAGTGTTGCCTAAGTATTAAAACATTGGATTATCCTGAGGTGTACCAGGAAAGGATTTT
    ATAATGCTTAGAAAAAAAAAAAAAAAAAAAA
    >Hs.299867_mRNA_1 gi|4758533|ref|NM_004496.1 Homo sapiens hepatocyte
    nuclear factor 3, alpha (HNF3A), mRNA polyA = 3
    TCCAGGAATCGATAGTGCATTCGTGCGCGCGGCCGCCCGTCGCTTCGCACAGGGCTGGAT
    GGTTGTATTGGGCAGGGTGGCTCCAGGATGTTAGGAACTGTGAAGATGGAAGGGCATGAA
    ACCAGCGACTGGAACAGCTACTACGCAGACACGCAGGAGGCCTACTCCTCGGTCCCGGTC
    AGCAACATGAACTCAGGCCTGGGCTCCATGAACTCCATGAACACCTACATGACCATGAAC
    ACCATGACTACGAGCGGCAACATGACCCCGGCGTCCTTCAACATGTCCTATGCCAACCCG
    GCCTTAGGGGCCGGCCTGAGTCCCGGCGCAGTAGCCGGCATGCCGGGGGGCTCGGCGGGC
    GCCATGAACAGCATGACTGCGGCCGGCGTGACGGCCATGGGTACGGCGCTGAGCCCGAGC
    GGCATGGGCGCCATGGGTGCGCAGCAGGCGGCCTCCATGATGAATGGCCTGGGCCCCTAC
    GCGGCCGCCATGAACCCGTGCATGAGCCCCATGGCGTACGCGCCGTCCAACCTGGGCCGC
    AGCCGCGCGGGCGGCGGCGGCGACGCCAAGACGTTCAAGCGCAGTTACCCGCACGCCAAG
    CCGCCCTACTCGTACATCTCGCTCATCACCATGGCCATCCAGCGGGCGCCCAGCAAGATG
    CTCACGCTGAGCGAGATCTACCAGTGGATCATGGACCTCTTCCCCTATTACCGGCAGAAC
    CAGCAGCGCTGGCAGAACTCCATCCGCCACTCGCTGTCCTTCAATGACTGCTTCGTCAAG
    GTGGCACGCTCCCCGGACAAGCCGGGCAAGGGCTCCTACTGGACGCTGCACCCGGACTCC
    GGCAACATGTTCGAGAACGGCTGCTACTTGCGCCGCCAGAAGCGCTTCAAGTGCGAGAAG
    CAGCCGGGGGCCGGCGGCGGGGGCGGGAGCGGAAGCGGGGGCAGCGGCGCCAAGGGCGGC
    CCTGAGAGCCGCAAGGACCCCTCTGGCGCCTCTAACCCCAGCGCCGACTCGCCCCTCCAT
    CGGGGTGTGCACGGGAAGACCGGCCAGCTAGAGGGCGCGCCGGCCCCGGGCCCGGCCGCC
    AGCCCCCAGACTCTGGACCACAGTGGGGCGACGGCGACAGGGGGCGCCTCGGAGTTGAAG
    ACTCCAGCCTCCTCAACTGCGCCCCCCATAAGCTCCGGGCCCGGGGCGCTGGCCTCTGTG
    CCCGCCTCTCACCCGGCACACGGCTTGGCACCCCACGAGTCCCAGCTGCACCTGAAAGGG
    GACCCCCACTACTCCTTCAACCACCCGTTCTCCATCAACAACCTCATGTCCTCCTCGGAG
    CAGCAGCATAAGCTGGACTTCAAGGCATACGAACAGGCACTGCAATACTCGCCTTACGGC
    TCTACGTTGCCCGCCAGCCTGCCTCTAGGCAGCGCCTCGGTGACCACCAGGAGCCCCATC
    GAGCCCTCAGCCCTGGAGCCGGCGTACTACCAAGGTGTGTATTCCAGACCCGTCCTAAAC
    ACTTCCTAGCTCCCGGGACTGGGGGGTTTGTCTGGCATAGCCATGCTGGTAGCAAGAGAG
    AAAAAATCAACAGCAAACAAAACCACACAAACCAAACCGTCAACAGCATAATAAAATCCA
    ACAACTATTTTTATTTCATTTTTCATGCACAACCTTGCCCCCAGTGCAAAAGACTGTTAC
    TTTATTATTGTATTCAAAATTCATTGTGTATATTACTACAAAGACGGCCCCAAACCAATT
    TTTTTCCTGCGAAGTTTAATGATCCACAAGTGTATATATGAAATTCTCCTCCTTCCTTGC
    CCCCCTCTCTTTCTTCCCTCTTGGCCCTCCAGACATTCTAGTTTGTGGAGGGTTATTTAA
    AAAACAAAAAGGAAGATGGTCAAGTTTGTAAAATATTTGTTTGTGCTTTTCCCCCCTCCT
    TACCTGACCCCCTACGAGTTTACAGGCTTGTGGCAATACTCTTAACCATAAGAATTGAAA
    TGGTGAAGAAACAAGTATACACTAGAGGCTCTTAAAAGTATTGAAAAGACAATACTGCTG
    TTATATAGCAAGACATAAACAGATTATAAACATCAGAGCCATTTGCTTCTCAGTTTACAT
    TTCTGATACATGCAGATAGCAGATGTCTTTAAATGAAATACATGTATATTGTGTATGGAC
    TTAATTATGCACATGCTCAGATGTGTAGACATCCTCCGTATATTTACATAACATATAGAG
    GTAATAGATAGGTGATATACGTGATACGTTCTCAAGAGTTGCTTGACCGAAAGTTACAAG
    GACCCCAACCCCTTTGCTCTCTACCCACAGATGGCCCTGGGAACAATCCTCAGGAATTGC
    CCTCAAGAACTCGCTTCTTTGCTTTGAGAGTGCCATGGTCATGTCATTCTGAGGTACATA
    ACACATAAATTAGTTTCTATGAGTGTATACCATTTAAAGATTTTTTCAGTAAAGGGAATA
    TTACATGTTGGGAGGAGGAGATAAGTTATAGGGAGCTGGATTTCAAACGGTGGTCCAAGA
    TTCAAAAATCCTATTGATAGTGGCCATTTTAATCATTGCCATCGTGTGCTTGTTTCATCC
    AGTGTTATGCACTTTCCACAGTTGGTGTTAGTATAGCCAGAGGGTTTCATTATTATTTCT
    CTTTGCTTTCTCAATGTTAATTTATTGCATGGTTTATTCTTTTTCTTTACAGCTGAAATT
    GCTTTAAATGATGGTTAAAATTACAAATTAAATTGGGAATTTTTATCAATGTGATTGTAA
    TTAAAAATATTTTGATTTAAATAACAAAAATAATACCAGATTTTAAGCCGCGGAAAATGT
    TCTTGATCATTTGCAGTTAAGGACTTTAAATAAATCAAATGTTAACAAAAAA
    >Hs.285401_contig1
    AI147926|AI880620|AA768316|AA761543|AA279147|AI216016|AI738663|N79248|
    AI684489|AA960845|AI718599|AI379138|N29366|BF002507|AW044269|R34339|R66326|
    H04648|R67467|AI523112|BF941500 polyA = 2 polyA = 3
    TGTTTTTCTAGTTCATTTTGTGTTTCCAACTTTTCATGTAAAATTTTAATTATTTTTGAA
    TGTGTGGATGTGAGACTGAGGTGCCTTTTGGTACTGAAATTCTTTTTCCATGTACCTGAA
    GTGTTACTTTTGTGATATAGGAAATCCTTGTATATATACTTTATTGGTCCCTAGGCTTCC
    TATTTTGTTACCTTGCTTTCTCTATGGCATCCACCATTTTGATTGTTCTACTTTTATGAT
    ATGTTTTCATAAGTGGTTAAGCAAGTATTCTCGTTACTTTTGCTCTTAAATCCCTATTCA
    TTACAGCAATGTTGGTGGTCAAAGAAAATGATAAACAACTTGAATGTTCAATGGTCCTGA
    AATACATAACAACATTTTAGTACATTGTAAAGTAGAATCCTCTGTTCATAATGAACAAGA
    TGAACCAATGTGGATTAGAAAGAAGTCCGAGATATTAATTCCAAAATATCCAGACATTGT
    TAAAGGGAAAAAATTGCAATAAAATATTTGTAACATAAAAAAAAAAAAAAAAAAAAAAAA
    >Hs.182507_mRNA_1 gi|15431324|ref|NM_002283.2|Homo sapiens keratin, hair,
    basic, 5 (KRTHB5), mRNA polyA = 3
    AGCTCTCCCCACCAATAAAAGGACCAGGGAGGATCAGAGAGAGCAGAAGGATCCTGAGCC
    TCGCACTCTGCCGCCCGCACCACCTTCCGCTGCCTCTCAGACTCTGCTCAGCCTCACACG
    ATGTCGTGCCGCTCCTACAGGATCAGCTCAGGATGCGGGGTCACCAGGAACTTCAGCTCC
    TGCTCAGCTGTGGCCCCCAAAACTGGCAACCGCTGCTGCATCAGCGCCGCCCCCTACCGA
    GGGGTGTCCTGCTACCGAGGGCTGACGGGCTTCGGCAGCCGCAGCCTCTGCAACCTGGGC
    TCCTGCGGGCCCCGGATAGCTGTAGGTGGCTTCCGAGCCGGCTCCTGCGGACGCAGCTTC
    GGCTACCGCTCCGGGGGCGTGTGCGGACCCAGCCCCCCATGCATCACTACCGTGTCGGTC
    AACGAGAGCCTCCTCACGCCCCTCAACCTGGAGATCGACCCCAACGCACAGTGCGTGAAG
    CAGGAGGAGAAGGAGCAGATCAAGTCCCTCAACAGCAGGTTCGCGGCCTTCATCGACAAG
    GTGCGCTTCCTGGAGCAGCAGAACAAGCTGCTGGAGACCAAGTGGCAGTTCTACCAGAAC
    CAGCGCTGCTGCGAGAGCAACCTGGAGCCACTGTTCAGTGGCTACATCGAGACTCTGCGG
    CGGGAGGCCGAGTGCGTGGAGGCCGACAGCGGGAGGCTGGCCTCAGAGCTCAACCATGTG
    CAGGAGGTGCTGGAGGGCTACAAGAAGAAGTATGAAGAGGAGGTGGCCCTGAGAGCCACA
    GCAGAGAATGAGTTTGTCGTTCTAAAGAAGGACGTGGACTGTGCCTACCTGCGGAAATCA
    GACCTGGAGGCCAATGTGGAGGCCCTGGTGGAGGAGTCTAGCTTCCTGAGGCGCCTCTAT
    GAAGAGGAGATCCGCGTTCTCCAAGCCCACATCTCAGACACCTCGGTCATAGTCAAGATG
    GACAACAGCCGAGACCTGAACATGGACTGCATCATCGCTGAGATCAAGGCTCAGTATGAC
    GATGTTGCCAGCCGCAGCCGGGCCGAGGCTGAGTCCTGGTACCGTAGCAAGTGTGAGGAG
    ATGAAGGCCACGGTGATCAGGCATGGGGAGACCCTGCGCCGCACCAAGGAGGAGATCAAC
    GAGCTGAACCGCATGATCCAGAGGCTGACGGCCGAGATTGAGAATGCCAAGTGCCAGCGT
    GCCAAGCTGGAGGCTGCTGTGGCTGAGGCAGAGCAGCAGGGTGAGGCGGCCCTCAGCGAT
    GCCCGCTGCAAGCTGGCTGAGCTGGAGGGCGCCCTGCAGAAGGCCAAGCAGGACATGGCC
    TGCCTGCTCAAGGAGTACCAGGAGGTGATGAACTCCAAGCTGGGCCTGGACATCGAGATC
    GCCACCTACAGGCGCCTGCTGGAGGGCGAGGAACACAGGCTGTGTGAAGGTGTGGGCTCT
    GTGAATGTCTGTGTCAGCAGCTCCCGTGGTGGAGTCTCCTGTGGGGGCCTCTCCTACAGC
    ACCACCCCAGGGCGCCAGATCACTTCTGGCCCCTCAGCCATAGGCGGCAGCATCACGGTG
    GTGGCCCCTGACTCCTGTGCCCCCTGCCAGCCTCGTTCCTCCAGCTTCAGCTGCGGGAGT
    AGCCGGTCGGTCCGCTTTGCCTAGTAGAGTCATGGAGCCAGGGCTTCCTGCCAAGCACCT
    GCCTGCCTGCATCACTGCACTGAATGGCATGTGAATGGAAAATGTGTGCTTGCTTCCAGA
    ATCTTCTGGATGTTCCTACAGAGGGAAAGACCTACAGAGGGAAAGACCCTCGGGCCGCTC
    CCCTGCGCCTTTTCATGCTAGGGAGATGCATCCTAGTTGTCCTCCTGGCAGCTGTTTTCA
    GAGGCATTCCCAGCCCTTCACTTAACTCCTACTTAGCTCCAAAATACCTGTATCCAATTT
    GTATTATTCCCCCAGCTCTCAGGGACAAGACCAGTCCCCCAGCGTGGTGGTCAGCACGGA
    AGCTCCACCTTCTGGGTGGAGGCGCCATCCTAACCATCCAGCCAGGCCACCCACAACCCG
    AGAATCAGGGAGAAAGTCCCTCCCCAGCAGCCCCCTCCTCCTGGCTGGGAAGAATGGTCC
    CCCAGCAAGCACTTGCCTGTTCATTCCCGTTCATGTTTTGCTTCTCTCTCAGACTGCCTT
    CCTGCTTCTGGGCTAACCTGTTCCAGCCAGGCTCCTCATGTGACCTCGCAGTTGAGAAGC
    CCATTATCGTGGGGCATCCTTTTGCCTACAGCCCCTGGTTAGGGCACTTTGGACAGGTCT
    TGCTATTCAGTGAACCTTTGTACATTTCAAAGAAGACTCCATGGCTGCTCCAGATGCCCC
    CTTGCTGGGTGCAGGTGGGGACTGTCCAATGCAGAGCTGGCGGGACAGAGAGTTAAGCCA
    CTTCCTGGGTCTCCTTCTTATGACTGTCTATGGGTGCATTGCCTTCTGGGTTGTCTCGAT
    CTGTGTTTCAATAAATGCCGCTGCAATGCAAAAAAAAAAAAAAAAAAA
    >Hs.292653_contig1
    AI200660|AW014007|AI341199|AI692279|AI393765|AI378686|AI695373|AW292108|
    T10352|R44346|AW470408|AI380925|BF938983|AW003704|H08077|F03856|H08075|F08895|
    AW468398|AI865976|H22568|AI858374|AI216499 polyA = 2 polyA = 3
    CAATCAGTGAAAATTCTATATTCCTTTGGCATTTTTGTGACATATTCAATTCAGTTNTAT
    GTTCCAGCAGAGATCATTATCCCTGGGATCACATCCAAATTTCATACTAAATGGAAGCAA
    ATCTGTGAATTTGGGATAAGATCCTTCTTGGTTAGTATTACTTGCGCCGGAGCAATGTCT
    TATTCCTCGTTTAGACATTGTGATTTCCTTCGTTGGAGCTGTGAGCAGCAGCACATTGGC
    CCTAATCCTGCCACCTTTGGTTGAAATTCTTACATTTTCGAAGGAACATTATAATATATG
    GATGGTCCTGAAAAATATTTCTATAGCATTCACTGGAGTTGTTGGCTTCTTATTAGGTAC
    ATATATAACTGTTGAAGAAATTATTTATCCTACTCCCAAAGTTGTAGCTGGCACTCCACA
    GAGTCCTTTTCTAAATTTGAATTCAACATGCTTAACATCTGGTTTGAAATAGTAAAAGCA
    GAATCATGAGTCTTCTATTTTTGTCCCATTTCTGAAAATTATCAAGATAACTAGTAAAAT
    ACATTGCTATATACATAAAAATGGTAACAAACTCTGTTTTCTTTGGCACGATATTAATAT
    TTTGGAAGTAATCATAACTCTTTACCAGTAGTGGTAAACCTATGAAAAATCCTTGCTTTT
    AAGTGTTAGCAATAGTTCAAAAAATTAAGTTCTGAAAATTGAAAAAATTAAAATGTAAAA
    AAATTAAAGAATAAAAATACTTCTATTATTCTTTTATCTCAGTAAGAAATACCTTAACCA
    AGATATCTCTCTTTTATGCTACTCTTTTGCCACTCACTTGAGAACAGAATAGGATTTCAA
    CAATAAGAGAATAAAATAAGAACATGTATAACAAAAAGCTCTCTCCAGATCATCCCTGTG
    AATGCCAAAGTAAACTTTATGTACAGTGTAAAAAAAAAAAAATCTCAGTTATGTTTTTAT
    TAGCCAAATTCTAATGATTGGCTCCTGGAAGTATAGAAAACTCCCATTAACATAATATAA
    GCATCAGAAAATTGCAAACACTAGAATTAATTTTACACTCTAATGGTAGTTGATCTTCAT
    AGTCAAGAGGCACTGTTCAAGATCATGACTTAGTGTTTCAATGAAATTTGAAAAGGGACT
    TTAAAACTTATCCAGTGCAACTCCCTTGTTTTTCGTCAGAGGAAAAGGAGGCCTAGAAAG
    GTTAAGTAACTTGGTCGAGACCACTCAGCCTTGAGATCAAGAAAACCTAATCTTCTGACT
    CCCAGGCCAGGATGTTTTATTTCTCACATCATGTCCAAGAAAAAGAATAAATTATGTTCA
    GCTTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    >Hs.97616_mRNA_3 gi|12654852|gb|BC001270.1|BC001270 Homo sapiens clone
    MGC:5069 IMAGE:3458016 polyA = 3
    CGGAGGCGGCGCCGACGGGGACTGCTGAGGCGCGCAGAGGGTCGGCGGCGCCCGGGAGCC
    TGTCGCTGGCGCGGTCCGGGCGGGAGGCTCGGCGGCGGGCGGCAGCATGTCGGTGGCGGG
    GCTGAAGAAGCAGTTCTACAAGGCGAGCCAGCTGGTCAGTGAGAAGGTCGGAGGGGCCGA
    GGGGACCAAGCTGGATGATGACTTCAAAGAGATGGAGAAGAAGGTGGATGTCACCAGCAA
    GGCGGTGACAGAAGTGCTGGCCAGGACCATCGAGTACCTGCAGCCCAACCCAGCCTCGCG
    GGCTAAGCTGACCATGCTCAACACGGTGTCCAAGATCCGGGGCCAGGTGAAGAACCCCGG
    CTACCCGCAGTCGGAGGGGCTTCTGGGCGAGTGCATGATCCGCCACGGGAAGGAGCTGGG
    CGGCGAGTCCAACTTTGGTGACGCATTGCTGGATGCCGGCGAGTCCATGAAGCGCCTGGC
    AGAGGTGAAGGACTCCCTGGACATCGAGGTCAAGCAGAACTTCATTGACCCCCTCCAGAA
    CCTGTGCGAGAAAGACCTGAAGGAGATCCAGCACCACCTGAAGAAACTGGAGGGCCGCCG
    CCTGGACTTTGACTACAAGAAGAAGCGGCAGGGCAAGATCCCCGATGAGGAGCTACGCCA
    GGCGCTGGAGAAGTTCGAGGAGTCCAAGGAGGTGGCAGAAACCAGCATGCACAACCTCCT
    GGAGACTGACATCGAGCAGGTGAGTCAGCTCTCGGCCCTGGTGGATGCACAGCTGGACTA
    CCACCGGCAGGCCGTGCAGATCCTGGACGAGCTGGCGGAGAAGCTCAAGCGCAGGATGCG
    GGAAGCTTCCTCACGCCCTAAGCGGGAGTATAAGCCGAAGCCCCGGGAGCCCTTTGACCT
    TGGAGAGCCTGAGCAGTCCAACGGGGGCTTCCCCTGCACCACAGCCCCCAAGATCGCAGC
    TTCATCGTCTTTCCGATCTTCCGACAAGCCCATCCGGACCCCTAGCCGGAGCATGCCGCC
    CCTGGACCAGCCGAGCTGCAAGGCGCTGTACGACTTCGAGCCCGAGAACGACGGGGAGCT
    GGGCTTCCATGAGGGCGACGTCATCACGCTGACCAACCAGATCGATGAGAACTGGTACGA
    GGGCATGCTGGACGGCCAGTCGGGCTTCTTCCCGCTCAGCTACGTGGAGGTGCTTGTGCC
    CCTGCCGCAGTGACTCACCCGTGTCCCCGCCCCGCCCCTCCGTCCACACTGGCCGGCACC
    CCCTGCTGGGTCTCCTGCATTCCACGGAGCCCCTGCTGCCAGGGCGGTGTCTGAGCCTGC
    CGGCGCCACCTGGGCCCCGGCCCTTGAGGTACTCCCTGAGCAGGACCCCACACTTGGGTG
    GGGGGGCTTATCTGGGTGGGTGGGGATGCCTGTTTACACTAGCGCTGACTCCCAACGGTG
    ACGGCTCCCTTCCCCACTCCATGGCGCCAGCCTCCTCCCCCGCTCCCCAACTTCTCGCCC
    AGCTGGCCGAGGCGGGGCAACACTAAGGTGCTCTTAGAAACACTAATGTTCCTCTGGGGC
    AGCCCCCACCTCCGTCCTGACCCGACGGGGGCCCGGCCCACTGCCTACCCTCGAGTCCCG
    CAGCCTTAACAGGATGGGATCGAGGGTCCCCATGGGGTGGCTCAGAGATAGGACCCTGGT
    TTTAAATCCCTCCCAGCCTGGTGCTGGTGATGGGCCCTGGCCCTACTCCAGGGCCAATGC
    ACCCCCGCCTCACACACGCACTCCTTCTCCTCAAGGCCAGGGCAGAGGGCCTCACCGCCT
    CCCGGGCCTGCTGTCAGCTTGCAGCCCGGGGACAGAGGCCAGCTGGGATCTGCCTGAGGA
    CAGAGAACATGGTCTCCTGCAGGGCCCTGCCTCCCAAGCCCCGCCCTCAGAAAGCCAAGT
    ACCTTTTCAGCTTTTTAACTGCCCCCATCCCAACCCAGGGAGGCCTGTGTCACTCTGGCA
    CAAGCTGCCACCACCAGCCACCCACACCCACCCCAGCACACCTCACACGGGACCACAGCC
    GCGCTGCCGAGGGCCAAGCACAAAGGTTCCAGTGAGCGCATGTCCCAGCCCCTGGTGGCC
    AGGCTCCCCTTGCTGAGCCGCTGCCACTTCACCCTGTGGGAAGTGGCCCCAGCCATCTCC
    TCTAGACCAAGGCAGGCAGCCCCGACATCTGCTTCCTCTATCGCCCAATGCAAAATCGAT
    GAAATGGGGAGTTCTCTGGGCCAGGCCACATTCACATTCCCCTCCCCCTGTGGTCCAGTG
    AAGCCTCCGGACCCCAGGCTCTGCTCTGCCCTGCCCTGCACCCCCCTCGTCAGAAGTACA
    TGAGGGGCGCAGAGATGAGCACACAGCTTTGGGCACGGTCCAGGGCAAACTGAAATGTAC
    GCCTGAATTTTGTAAACAGAAGTATTAAATGTCTCTTTCTACAAAAAAAAAAAAAAAAAA
    >Hs.123078_mRNA_3 gi|14328043|gb|BC009237.1|BC009237 Homo sapiens clone
    MGC:2216 IMAGE:2989823 polyA = 3
    GGCACGAGGGAGGTGCAGAGCTGAGAATGAGGCGATTTCGGAGGATGGAGAAATAGCCCC
    GAGTCCCGTGGAAAATGAGGCCGGCGGACTTGCTGCAGCTGGTGCTGCTGCTCGACCTGC
    CCAGGGACCTGGGCGGAATGGGGTGTTCGTCTCCACCCTGCGAGTGCCATCAGGAGGAGG
    ACTTCAGAGTCACCTGCAAGGATATTCAACGCATCCCCAGCTTACCGCCCAGTACGCAGA
    CTCTGAAGCTTATTGAGACTCACCTGAGAACTATTCCAAGTCATGCATTTTCTAATCTGC
    CCAATATTTCCAGAATCTACGTATCTATAGATGTGACTCTGCAGCAGCTGGAATCACACT
    CCTTCTACAATTTGAGTAAAGTGACTCACATAGAAATTCGGAATACCAGGAACTTAACTT
    ACATAGACCCTGATGCCCTCAAAGAGCTCCCCCTCCTAAAGTTCCTTGGCATTTTCAACA
    CTGGACTTAAAATGTTCCCTGACCTGACCAAAGTTTATTCCACTGATATATTCTTTATAC
    TTGAAATTACAGACAACCCTTACATGACGTCAATCCCTGTGAATGCTTTTCAGGGACTAT
    GCAATGAAACCTTGACACTGAAGCTGTACAACAATGGCTTTACTTCAGTCCAAGGATATG
    CTTTCAATGGGACAAAGCTGGATGCTGTTTACCTAAACAAGAATAAATACCTGACAGTTA
    TTGACAAAGATGCATTTGGAGGAGTATACAGTGGACCAAGCTTGCTGCTGCCTCTTGGAA
    GAAAGTCCTTGTCCTTTGAGACTCAGAAGGCCCCAAGCTCCAGTATGCCATCATGATGCC
    TGCTAAGGCAGCCACCTTGGTGTACATGCTCACAGAGGCTCTGTTCATGGAGCAGCTGCT
    GTTTGAAAAATTTTGAAATGCAAGATCCACAACTAGATGGAAGGCACTCTAGTCTTTGCA
    GAAAAAAATGTACCTGAATGTACATTGCACAATGCCTGGCACAAAGAAGGAAGAATATAA
    ATGATAGTTCGACTCGTCTGTGGAAGAACTTACAATCATGGGGAAAGATGGAATAAAAAC
    ATTTTTTAAACAGCAAAAAAAAAAAAAAAAAA
    >Hs.285508_contig1 AW194680|BF939744|BF516467 polyA = 1 polyA = 1
    CCCCAGCCCCACTCACCCACCCTCCTTCCCACCAGCCTGCTCTCCGCAGGCCCACTGTCT
    TTGGGTTTAATGACGTCTCTTCTCTGTGGAACTTCACGATTCCTTCCCACGGTCAACTCG
    GGACCTCCCAGCGACCACTGCAGCCTGCGGACGAGGCCGGGACTTGGCCGAGCGGATCCT
    AATAAGGGGAAAATGGTAAATGCAAACGTCCCGTTACAATTTTACCGCCAGTGTGCTGTC
    GTTCCCCCTCCCCCTCTCCGAGTCCTCGTGGGGACACGGCGGGGTCTGTAGGAAGTTGGG
    CCGGGTTGGGGGTTGCTAGAAGGCGCTGGTGTTTTGCTCTGAGTTTTAAGAGATCCCTTC
    CTTCCTCTTCGGTGAATGCAGGTTATTTAAACTTTGGGAAATGTACTTTTAGTCTGTCAT
    ATCAAGGCATGAGTCACTGTCTTTTTTTGTGTGAATAAATGGTTTCTAGTACAATGGA
    >Hs.183274_contig1
    BF437393|BF064008|BF509951|AW134603|AI277015|AI803254|AA887915|BF054958|AI0
    04413|AI393911|AI278517|AW612644|AI492162|AI309226|AI863671|AA448864|AI6401
    65|AA479926|AA461188|AA780161|BF591180|AI918020|AI758226|AI291375|BF001845|
    BF003064|AI337393|AI522206|BE856784|BF001760|AI280300 FLAG = 1 polyA = 2 WARN
    polyA = 3
    GCGGCCGCCCGCACGTCCGCGGGTCCCGGCCGCGCCGCCGCCGCGCGCCCCTGCCCGAGA
    GAGCTCTGGCCCCGCTAGCGGGGCCAGGAGCCGGGCCTCCCACCGCAGCGTCCCCCGCCG
    CGCCAGTCCCCGCTAGTGGTAGTATCTCGTAATAGCTTCTGTGTGTGAGCTACCGTGGAT
    CTCCTTCCCTTCTCTTGGGGGCCGGGGGGAAAGAAAAGGATTTAAGCAAAGGCTCCCTCG
    CCCTGTGAGGGCGAGCGGCAAAGGCCCGGCTGAGCCCCCCATGCCCCTCCCCTCCCCGTG
    TAAAAAGCCTCCTTGTGCAATTGTCTTTTTTTTCCTTTGAACGTGCTTCTTTGTAATGAC
    CAAGGTACCGATTTCTGCTAAGTTCTCCCAACAACATGAAACTGCCTATTCACGCCGTAA
    TTCTTTCTGTCTCCCTTCTCTCTCTCTCTCTCGCTCGCTCGCTCTCGCTCTCGCTCTCTC
    TCGCTGCGTCCTCATTTCCCCTCCCAATCCTCTCTCCCCTCTGCAACCCCCCAGCTCGCT
    GGCTTTCTCTCTGGCTTCTCTCTTTTCCTCCTCCACCCACCCCCTTTGGTTTGACAATTT
    TGTCTTAAGTGTTTCTCAAAAGAGGTTACTTTAGTTAGCATGCGCGCTGTGGGCAATTGT
    TACAAGTGTTCTTAGGTTTACTGTGAAGAGAATGTATTCTGTATCCGTGAATTGCTTTAT
    GGGGGGGAGGGAGGGCTAATTATATATTTTGTTGTTCCTCTATACTTTGTTCTGTTGTCT
    GCGCCTGAAAAGGGCGGAAGAGTTACAATAAAGTTTACAAGCGAGAACCCGAAAAAAAAA
    AAAAA
    >Hs.334841_mRNA_3 gi|14290606|gb|BC009084.1|BC009084 Homo sapiens clone
    MGC:9270 IMAGE:3853674 polyA = 3
    CACCAGCACAGCAAACCCGCCGGGATCAAAGTGTACCAGTCGGCAGCATGGCTACGAAAT
    GTGGGAATTGTGGACCCGGCTACTCCACCCCTCTGGAGGCCATGAAAGGACCCAGGGAAG
    AGATCGTCTACCTGCCCTGCATTTACCGAAACACAGGCACTGAGGCCCCAGATTATCTGG
    CCACTGTGGATGTTGACCCCAAGTCTCCCCAGTATTGCCAGGTCATCCACCGGCTGCCCA
    TGCCCAACCTGAAGGACGAGCTGCATCACTCAGGATGGAACACCTGCAGCAGCTGCTTCG
    GTGATAGCACCAAGTCGCGCACCAAGCTGGTGCTGCCCAGTCTCATCTCCTCTCGCATCT
    ATGTGGTGGACGTGGGCTCTGAGCCCCGGGCCCCAAAGCTGCACAAGGTCATTGAGCCCA
    AGGACATCCATGCCAAGTGCGAACTGGCCTTTCTCCACACCAGCCACTGCCTGGCCAGCG
    GGGAAGTGATGATCAGCTCCCTGGGAGACGTCAAGGGCAATGGCAAAGGGGGTTTTGTGC
    TGCTGGATGGGGAGACGTTCGAGGTGAAGGGGACATGGGAGAGACCTGGGGGTGCTGCAC
    CGTTGGGCTATGACTTCTGGTACCAGCCTCGACACAATGTCATGATCAGCACTGAGTGGG
    CAGCTCCCAATGTCTTACGAGATGGCTTCAACCCCGCTGATGTGGAGGCTGGACTGTACG
    GGAGCCACTTATATGTATGGGACTGGCAGCGCCATGAGATTGTGCAGACCCTGTCTCTAA
    AAGATGGGCTTATTCCCTTGGAGATCCGCTTCCTGCACAACCCAGACGCTGCCCAAGGCT
    TTGTGGGCTGCGCACTCAGCTCCACCATCCAGCGCTTCTACAAGAACGAGGGAGGTACAT
    GGTCAGTGGAGAAGGTGATCCAGGTGCCCCCCAAGAAAGTGAAGGGCTGGCTGCTGCCCG
    AAATGCCAGGCCTGATCACCGACATCCTGCTCTCCCTGGACGACCGCTTCCTCTACTTCA
    GCAACTGGCTGCATGGGGACCTGAGGCAGTATGACATCTCTGACCCACAGAGACCCCGCC
    TCACAGGACAGCTCTTCCTCGGAGGCAGCATTGTTAAGGGAGGCCCTGTGCAAGTGCTGG
    AGGACGAGGAACTAAAGTCCCAGCCAGAGCCCCTAGTGGTCAAGGGAAAACGGGTGGCTG
    GAGGCCCTCAGATGATCCAGCTCAGCCTGGATGGGAAGCGCCTCTACATCACCACGTCGC
    TGTACAGTGCCTGGGACAAGCAGTTTTACCCTGATCTCATCAGGGAAGGCTCTGTGATGC
    TGCAGGTTGATGTAGACACAGTAAAAGGAGGGCTGAAGTTGAACCCCAACTTCCTGGTGG
    ACTTCGGGAAGGAGCCCCTTGGCCCAGCCCTTGCCCATGAGCTCCGCTACCCTGGGGGCG
    ATTGTAGCTCTGACATCTGGATTTGAACTCCACCCTCATCACCCACACTCCCTATTTTGG
    GCCCTCACTTCCTTGGGGACCTGGCTTCATTCTGCTCTCTCTTGGCACCCGACCCTTGGC
    AGCATGTACCACACAGCCAAGCTGAGACTGTGGCAATGTGTTGAGTCATATACATTTACT
    GACCACTGTTGCTTGTTGCTCACTGTGCTGCTTTTCCATGAGCTCTTGGAGGCACCAAGA
    AATAAACTCGTAACCCTGTCCTTCAAAAAAAAAAAAAAAAA
    >Hs.3321_contig1
    AI804745|AI492375|AA594799|BE672611|AA814147|AA722404|AW170088|D11718|BG153
    444|AI680648|AA063561|BE219054|AI590287|R55185|AI479167|AI796872|AI018324|A
    I701122|BE218203|AA905336|AI681917|BI084742|AI480008|AI217994|AI401468
    polyA = 2 polyA = 3
    CCGGAGATAACTTGAGGGCTATAGAGGACCGGCTAATACTGGTCCTGAATTTGGCTTCAG
    GCCTCACCAACCAAGTGGCCGTGGCCTTGCCGTCTTGCCCGTCGGCCCCCGGTGAGGCCT
    GGACCCCTGGGGTCCCGGCACCAGGCCCCGGCTTCCGACCCTGGCAGAAGCCCAAGATCT
    GGTCCCTCGCGGAGACTGCCACAAGCCCCGGACACCCGCGCCGGCTCGCCTCCCGGCGCG
    GGGGGGTCTCCACCGGGGGGCAACGGTCGCGCCTTTCCGCCCTGCAGCTCTCTCCGGGCC
    GCCGCCGCCGCCGCCGCTCACAGACTGGTCTCAGCGCCGCTGGGCAAGTTCCCGGCTTGG
    ACCAACCGGCCGTTTCCAGGCCCACCGCCCGGCCCCCGCCCGCACCCGCTCTCCCTGCTG
    GGCTCTGCCCCTCCGCACCTGCTGGGACTTCCCGGAGCCGCGGGCCACCCGGCTGCCGCC
    GCCGCCTTCGCTCGGCCAGCGGAGCCCGAAGGCGGAACAGATCGCTGTAGTGCCTTGGAA
    GTGGAGAAAAAGTTACTCAAGACAGCTTTCCATCCCGTGCCCAGGCGGCCCCAGAACCAT
    CTGGACGCCGCCCTGGTCTTATCGGCTCTCTCCTCATCCTAGTTCTTTAAAAAAAAACAA
    AAAAACAAAAAAAACTTTTTTTAATCGTTGTAATAATTGTATAAAAAAAATCGCTCTGTA
    TAGTTACAACTTGTAAGCATGTCCGTGTATAAATACCTAAAAGCAAAACTAAACAAAGAA
    AGTAAGAAAAAGAAATAAAACCAGTCCTCCTCAGCCCTCCCCAAGTCGCTTCTGTGGCAC
    CCCGCATTCGCTGTGAGGTTTGTTTGTCCGGTTGATTTTGGGGGGTGGAGTTTCAGTGAG
    AATAAACGTGTCTGCCTTTGTGTGTGTGTATATATACAGAGAAATGTACATATGTGTGAA
    CCAAATTGTACGAGAAAGTATCTATTTTTGGCTAAATAAATGAGCTGCTGCCACTTTGAC
    TATAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    >Hs.306216_singlet1 AW083022 polyA = 1 polyA = 2
    TATGAGCACCTTCACATGGATCCACTTGAGGAAAGAAGGTGGACCGAATTTGTAAACGGT
    GTGCAGCAATATATATCAATTCGTTCTGAGATAATCGCCACTTACGCTCTCTGTGGTTTT
    GCCAATATCGGGTCCCTAGGAATCGTGATCGGCGGACTCACATCCATGGCTCCTTCCAGA
    AAGCGTGATATCGCCTCGGGGGCAGTGAGAGCTCTGATTGCGGGGACCGTGGCCTGCTTC
    ATGACAGCCTGCATCGCAGGCATACTCTCCAGCACTCCTGTGGACATCAACTGCCATCAC
    GTTTTAGAGAATGCCTTCAACTCCACTTTCCCTGGAACCCCAACCAAGGGTGATAGCTTG
    TTGCCAAAGTCTGTTGAGCAGCCCTGTTGCCCAGGGTCCTGGTGAAGTCATCCCAGGAGG
    AAACCCCAGTCTGTATTCTTTGAAGGGCTGCTGCACATTGTTGAATCCATCGACCTTTAG
    CTGCAATGGGATCTCTAATACATTTTGAGGTCAGCCACTTCTCCAGTGGAACTCTGAAGT
    ACAGATGCTGAATTTTCTGCTTTGGAAAGAAAAAAAA
    >Hs.99235_contig1 AA456140|AI167259|AA450056 polyA = 2 polyA = 3
    ACTCGGCATGTGATGAACACCCATAGTTAAGAAACCATGGAGCAAGAAAGCTTGTGGAAA
    GTCTCTCTCCTTCCTCATAAGACATGCACACTAATACACATACACACCAAAAAATTACAC
    ATTTTAAAACTGCTAAGCTTGGATTTAACTGAATCATATATCTTTTATCATGTTATCCTA
    AAAGTGAGAAGACATAACCAAGACATGGAAATAAATGTGAAAGCTGGAGCCGAAGAGTCA
    ZVAGAGCTAAAAAATTAAGTCTAGAACATTCTATGAGGATAGTATAAATAAAAAGAAATAC
    AGTCTAGACATGCTGCAAGGAAAGAAGATTCTAAAGTCCGTTTATGGAGGCAATTCCATA
    TCCTTTCTTGAACGCACATTCAGCTTACCCCAGAGAGCAAGTGAGGCAATCTGGCAAAAG
    ATTAATAAAGATGTAAACCCCTGGAAAAAAAAAAAA
    >Hs.169172_mRNA_2 gi|2274961|emb|AJ000388.1|HSCANPX Homo sapiens mRNA for
    calpain-like protease CANPX polyA = 3
    GAATTCGGCACGAGATAGTTTTCAGGTTAAGAAAGCCAGAATCTTTGTTCAGCCACACTG
    ACTGAACAGACTTTTAGTGGGGTTACCTGGCTAACAGCAGCAGCGGCAACGGCAGCAGCA
    GCAGCAGCAGCAGCAGCAGCAGCAGCAGGGCTCCTGGGATAACTCAGGCATAGTTCAACA
    CTATGGGTCCTCCTCTGAAGCTCTTCAAAAACCAGAAATACCAGGAACTGAAGCAGGAAT
    GCATCAAAGACAGCAGACTTTTCTGTGATCCAACATTTCTGCCTGAGAATGATTCTCTTT
    TCTACTTCCGACTGCTTCCTGGAAAGGTGGTGTGGAAACGTCCCCAGGACATCTGTGATG
    ACCCCCATCTGATTGTGGGCAACATTAGCAACCACCAGCTGACCCAAGGGAGACTGGGGC
    ACAAGCCAATGGTTTCTGCATTTTCCTGTTTGGCTGTTCAGGAGTCTCATTGGACAAAGA
    CAATTCCCAACCATAAGGAACAGGAATGGGACCCTCAAAAAACAGAAAAATACGCTGGGA
    TATTTCACTTTCGTTTCTGGCATTTTGGAGAATGGACTGAAGTGGTGATTGATGACTTGT
    TGCCCACCATTAACGGAGATCTGGTCTTCTCTTTCTCCACTTCCATGAATGAGTTTTGGA
    ATGCTCTGCTGGAAAAAGCTTATGCAAAGCTGCTAGGCTGTTATGAGGCCCTGGATGGTT
    TGACCATCACTGATATTATTGTGGACTTCACGGGCACATTGGCTGAAACTGTTGACATGC
    AGAAAGGAAGATACACTGAGCTTGTTGAGGAGAAGTACAAGCTATTCGGAGAACTGTACA
    AAACATTTACCAAAGGTGGTCTGATCTGCTGTTCCATTGAGTCTCCCAATCAGGAGGAGC
    AAGAAGTTGAAACTGATTGGGGTCTGCTGAAGGGCCATACCTATACCATGACTGATATTC
    GCAAAATTCGTCTTGGAGAGAGACTTGTGGAAGTCTTCAGTGCTGAGAAGGTGTATATGG
    TTCGCCTGAGAAACCCCTTGGGAAGACAGGAATGGAGTGGCCCCTGGAGTGAAATTTCTG
    AAGAGTGGCAGCAACTGACTGCATCAGATCGCAAGAACCTGGGGCTTGTTATGTCTGATG
    ATGGAGAGTTTTGGATGAGCTTGGAGGACTTTTGCCGCAACTTTCACAAACTGAATGTCT
    GCCGCAATGTGAACAACCCTATTTTTGGCCGAAAGGAGCTGGAATCGGTGTTGGGATGCT
    GGACTGTGGATGATGATCCCCTGATGAACCGCTCAGGAGGCTGCTATAACAACCGTGATA
    CCTTCCTGCAGAATCCCCAGTACATCTTCACTGTGCCTGAGGATGGGCACAAGGTCATTA
    TGTCACTGCAGCAGAAGGACCTGCGCACTTACCGCCGAATGGGAAGACCTGACAATTACA
    TCATTGGCTTTGAGCTCTTCAAGGTGGAGATGAACCGCAAATTCCGCCTCCACCACCTCT
    ACATCCAGGAGCGTGCTGGGACTTCCACCTATATTGACACCCGCACAGTGTTTCTGAGCA
    AGTACCTGAAGAAGGGCAACTATGTGCTTGTCCCAACCATGTTCCAGCATGGTCGCACCA
    GCGAGTTTCTCCTGAGAATCTTCTCTGAAGTGCCTGTCCAGCTCAGGGAACTGACTCTGG
    ACATGCCCAAAATGTCCTGCTGGAACCTGGCTCGTGGCTACCCGAAAGTAGTTACTCAGA
    TCACTGTTCACAGTGCTGAGGACCTGGAGAAGAAGTATGCCAATGAAACTGTAAACCCAT
    ATTTGGTCATCAAATGTGGAAAGGAGGAAGTCCGTTCTCCTGTCCAGAAGAATACAGTTC
    ATGCCATTTTTGACACCCAGGCCATTTTCTACAGAAGGACCACTGACATTCCTATTATAG
    TACAGGTCTGGAACAGCCGAAAATTCTGTGATCAGTTCTTGGGGCAGGTTACTCTGGATG
    CTGACCCCAGCGACTGCCGTGATCTGAAGTCTCTGTACCTGCGTAAGAAGGGTGGTCCAA
    CTGCCAAAGTCAAGCAAGGCCACATCAGCTTCAAGGTTATTTCCAGCGATGATCTCACTG
    AGCTCTAAATCTGCAATCCCAGAGAATCCTGACAAAGCGTGCCACCCTTTTATTTTCCGT
    CAGGTGCCAGGTCTTAGTTAAGATTCACAATCTTTAGAAAGAATGAGATTCACAATAATT
    AACTCTTCCTCTCTTCTGATAAATTCCCCATACCTCCCAATCCAAGTAGCATCTGTAGCT
    ACATAACCTATATACCTCCAGCAGCTGGACATGGGGAGCGACAGTCCTATCTAGACATCA
    TACACATTTGCCAAGAAAGGATCTCTGGGGCTTCCGGGGGTGAGATTCAAGCAGGACAAT
    AACAAGAGGCTGGACACCCTACAGATGTCTTTGATGTTTTCAGTTGTTTGATATATCTCC
    CCTGTAGGGCATGTTGAGGAAGGAGGAGGGCTGATCAAGGCCAAGCTGGTCTAGCCTGAC
    ATCCTAGCTCCTGACTGAACACTATAGACTTCCCAGCAGCATTTTCACCCAGCAGCCAGA
    GCCGGCTTTAAGTCCCCAACCCTTACAGACACCACTGCCACCACCACCAACCACGACCAC
    CACCACCACCACCACTCACCACCATCATCACCTCCGGAAAGTGTAGTCCTGCCCTAACCC
    TAACCCCAAGTCACCCCCCACAGTAAATTTTACCTTCATGTTGAGAAAGCTTCCTGGTGC
    TTAATCAAGAGCTGGAGTTCAATGAGTCCTAGACAGTGAGAGGGGCCTGAGCTTCAGCTC
    AATGGAAGCCTGCTGTGTGCTCACAAGACGGAAAAGTGGAAGAAGCTGCAGTGGGAGACA
    AAGCCTCGGTCCCCCACCCATCCACACACACCTACACTCACACACGCGCACATGGGCGCG
    CAACGGAACTACCATTTCAGGCAGTCAGTGGGCAAGAGGAAAGATAAGTAAGTACCATAC
    ACACCTTAAAAGATGAGGAGAATTCATCCAGACATATTACAGCCAGTTTGGGGCCCCTGA
    CTTGCAATGTGAAACCTCTTCGCTTGCTGCTAGGTTTACAAACAAGCCCATTGTTCCTGT
    GCCTCCTAATATTCATTTGTTACTGAAGGACCCCATCTGGGGACTTGAGACTTTGGTCCC
    AGCCCAGACGCCTCAGACTGGTCTCAAAGTCAAGCAAGGCTTCACATCAGCTGCAAGTGT
    TAGTTTGCCAGCGCATGATCTCACTGAGCTTCTACAGAATCTGCAATCCCAGAGTCAATC
    ATGACGAAATGTACGTCCCACCATCTTAACCTATCAACTTTCTGCCCCTCCTTCAAGGCC
    CAGTATAAATGCCACCTCCTCCATGAAGCCTTCCCTAATTCCACCCCAAACCCCCACCTT
    CAACAATATTTCAACGCTTCTGCAATGATGAAAAAGAAACATAGTTGTAGTACTTAGCCT
    ACCTAGACCAGCAAGCATTCATTTTTAGCTCGCTCATTTTTTACCATGTTTTCCAGTCTG
    TTTAACTTCTGCAGTGCCTTCACTACACTGCCTTACATAAACCAAATCACAATAAAGTTC
    ATATTCAGTACAATTAAAAAAAAAAAAA
    >Hs.351486_mRNA_1 gi|16549178|dbj|AK054605.1|AK054605 Homo sapiens cDNA
    FLJ30043 fis, clone 3NB692001548 polyA = 0
    TATGCAAGTGTTTAACAGATGCTTCACTATTAAAATATTTTCCCCCCAAGTCTCAAATAT
    TGAAGAATCTCTAACCAGGGACACCAGTCCCTACGAAGACCTTGGGCGATTTTGAAGTGC
    GGGCACCTCGATTCCCCGAATCTGTAGTGTGGCTGGTATCGGTGTTCCCCTGGTTTAACT
    AGCCTGTTTGAAGGCACAGATCATTCATGGGGAAGTATAACCGAATCCAGTCCTCTCCAC
    CGCCTGGGGATCTTCACTTTCGCAGTCTACGACTGCCTGTGACTCCAGAAAGACAAACTG
    CAGATTGGCCAAGATGGGGAAATTGAGGCAGAGAAGCCAAGACATGTGCTAAAGGTCATG
    CAGGCTATGAATGGAGCTGGAATGTGAACGCAGGCCATATGACCCCAGAGCCCATGTTCT
    TGAACCCTTAGAAAGACAGCAGCAACACACCTGGTGCAGCAGCTGCTTAGTTGGAGTGGC
    TGACAAGGAGAGAATGATTTCCAGGAAGAGCGGAACACATATGGAAGGCCTTAGCTTATC
    TTTAGCGCCTCATACACCCGTTCTGGACTTCAGAAAGGCCAGTGAGTGGGATTAGGCCTC
    AGAGATAGGATGTCAGTCCCAGTGAGGGATGGCCTAGAGCATTCTTTAATTCTTTCCTTT
    GGGTCACACATAAGAAACAATTTTCCAGCACTGATGAGTGTTATTAACAATGAGATGGGA
    TAGAATTTAGTTTTCCCTATGGCTGTGCTTCAAAAATAGAAAAGCTGTCTTTTCTCTGGA
    ATGATTGAATGAAGCTCTGGGGAGGAAAAGGTGGATTGGCAGATCTCTTAAAGGAAGCTT
    CTCCTTCTAGGCACTATTCTAAGGCTTAATATTTTAACTCCCTATATTAACCTAGTTCAA
    CTAAACAGTGATCTGAGTAATTTTATTTTTATTAAAGCTCAGATCAAAATGCCATTAACA
    TTGATTGAGAAAATCAAAGGAATCTTTGATGTGAGTGGTTAAATTGCTGAATTATTTCAG
    TCCCATACCCTCACAGCATGAGTACCTGATCTGATAGACTTCTTTGGAATTCCTTTTTTG
    TTTGAGACAGAGTCTTGCTCTGTCGCCCAGGCTGGAGTGCAGCGGTGTGATCTCAACCAT
    TGCAACCTCCACCTCCCAGGTTCAGGTGATTCTCATGCCTCAGCCTCCTGAGTAGCTGGG
    ATTACAGATGTGCACCACCATGCCCGGCTAATTATTTTGTATCTTTAGTAGAGATGAAGT
    TTTGCCATGTGGGCCAGGCTGTTCTCAAACTACTGGCCTCAAGTGATCTGCCCGCCTCGG
    CCTCCCAGACTGCTGGGATTACAGGCGTGAGGCACCGTGCCTGGCTGGGATTCCATAATA
    AATCCCTCTGTGTCTATTTCTTTTTTCAAATATAATTTTCTTCATTTCCAAACATCATCT
    TTAAGACTCCAAGGATTTTTCCAGGCACAGTGGCTCATACCTGTAATCCCATTGCTTGGA
    GAGGCCAAGGTGGAAGTTCATTTGAGGCCAGGAGTTCGAGACCAGGTGGGCAACATAGTG
    AAACCTTGTCTCTACAACAT
    >Hs.153504_contig2
    BE962007|AW016349|AW016358|AW139144|AA932969|AI025620|AI688744|AI865632|AA8
    54291|AA932970|AU156702|AI634439|AA152496|AI539557|AI123490|AI613215|AI3183
    63|AW105672|AA843483|AI366889|AW181938|AI813801|AI433695|AA934772|N72230|AT
    760632|BE858965|AW058302|AI760087|AI682077|AA886672|AI350384|AW243848|AW300
    574|BE466359|AI859529|AI921588|BF062899|BE855597|BE617708 polyA = 2 polyA = 3
    TGTTTATATAACTGTGTTCGTTTTTGTTGTTCCGTCCCGTCGTCCTTGTAGACTCTCATC
    CTCGTGTGTTTTGGACCCTCCAGGGGTGACATCGGGTCTTGTGTTCAGCTCTCCTGGACT
    GTTATTCCTTGTCCGCGTGTTCGTGTTAGACATTGTCCACGATCTGTATCATGCCTATGT
    CTCACTTTGGTCTCTTATTTCAGCGTGAACACTATAGTTCCAAGTTTGTTCGGATAATTC
    TGATTCTTGTCACCAGCGTGAGATTTCAACAGAACTTGTTTGGAACAAATACTCACTTAA
    AACTTCAGCAGAAGAAAAATTACTTAGTCCTTAGGCCAACCAATTTAACTGCAGTGTCAT
    GTTTCACAGGCCTTCCTACATTTAGAAATCGTCACACAGCTGTGATAAGAGTAGATTATT
    TTACTATGAAATAATTCTGAATAGATGAAAGCATAAAATGTGAGAAACTGAATGTATTAT
    TCAGGAAGAATACTGAGTGCCTTCATTTAACTAAAGTTGAATGTAAAAGTCAATTTGCAC
    TTCTTTATAATCCTCTGGTTTAGAATTATAAATTGTTAAAACCTTGATAATTGTCATTTA
    ATTATATTTCAGGTGTCCTGAACAGGTCACTAGACTCTACATTGGGCAGCCTTTAAATAT
    GATTCTTTGTAATGCTAAATAGCCTTTTTTTCTCTTTTTACTGCAACTTAATATTTCTAT
    TTAGAACACAGAAAATGAAAATATTTAGAATAAGTTGTACATTTGATGACAAATAAATCA
    CTATTAAAATAAAAAAAAAAAAAAAAAAA
    >Hs.199354_singlet1 AI669760 polyA = 1 polyA = 2
    AGGAACCCCTGTGGGAAAGGTTTAAACCTATkAACAGTGCCCCCTTTGGCTCCTCCTCCCT
    TGGCGGAATGGGTTCCTGGACCATGTGCATTTCANTGGGCCATGGGATTTACATTTCCTT
    GCATCCCCAGGTGGTTTGATCCCTGCCAGGGCCCCTTCCTTCCTGCTCATGGTTTTCAGG
    GGGCCTGATCATGGAAAGTAAGGGGGTTGGGCCTTCCCTTTTGGGGGTGAACCCTGACTC
    CATCCCCCTATTGCCCCCCTAACCAATCATGCAAACTTTTCCCCCCCTGGGGTAATTCAC
    CAGTTAAAAAAAGCTTTTTTTAAATGTTTTGTTTTGGGGGGGGGGCAGGGCCCCCTTTTT
    GTTTTTTTAAGGAGTTGGTTTTGGTTTTTGGCTGATGTTTTGTTTTTTAACATGCCCCCA
    GTTTGTAAGGCCAAAGGTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    AAAAAAAA
    >Hs.162020_contig1 AW291189|AA505872 polyA = 2 polyA = 3
    TAAGCTTTAAAGGCTCTGTGTTAGGGCATAGTCTAGAAACATGGGGCCCAAGGGCACCGG
    GAAAACTTACAAAGGGAAGAGATGGAACTGGGAGGGTTCAAGCTACCAGTTCCATCTCTC
    CATGTTTTAGAGAATTGGGGCACTAAGTCAGCCAGGTAAGGTCAGGTCAGAGGAGGGCCC
    GGATGAAGCATGAGATGCAGAGGGACAGTGCGTGAATGGAGACCTTGGGTAGCACCAACG
    TGTAGCGGCAGAGGTGGGGTGGATGTGGCTGATGTCAGGGAGAGAATGGGGAGCATGCAC
    AGGGCTCAGTCTTATACATACATTGAAAATCCTTTAGCCTTTCAAAGATTATTAACCCAA
    ATCACCTTTCTTGCTTACTCCAGATGCCTCAGCCTCTGATATAATTGCTAAGTATCTGCC
    GTGTTAAAAATAAACATTTGAGAATCAAAAAAAAAAAAAAAAA
    >Hs.30743_mRNA_3 gi|18201906|ref|NM_006115.2| Homo sapiens preferentially
    expressed antigen in melanoma (PRAME), mRNA polyA = 3
    GCTTCAGGGTACAGCTCCCCCGCAGCCAGAAGCCGGGCCTGCAGCGCCTCAGCACCGCTC
    CGGGACACCCCACCCGCTTCCCAGGCGTGACCTGTCAACAGCAACTTCGCGGTGTGGTGA
    ACTCTCTGAGGAAAAACCATTTTGATTATTACTCTCAGACGTGCGTGGCAACAAGTGACT
    GAGACCTAGAAATCCAAGCGTTGGAGGTCCTGAGGCCAGCCTAAGTCGCTTCAAAATGGA
    ACGAAGGCGTTTGTGGGGTTCCATTCAGAGCCGATACATCAGCATGAGTGTGTGGACAAG
    CCCACGGAGACTTGTGGAGCTGGCAGGGCAGAGCCTGCTGAAGGATGAGGCCCTGGCCAT
    TGCCGCCCTGGAGTTGCTGCCCAGGGAGCTCTTCCCGCCACTCTTCATGGCAGCCTTTGA
    CGGGAGACACAGCCAGACCCTGAAGGCAATGGTGCAGGCCTGGCCCTTCACCTGCCTCCC
    TCTGGGAGTGCTGATGAAGGGACAACATCTTCACCTGGAGACCTTCAAAGCTGTGCTTGA
    TGGACTTGATGTGCTCCTTGCCCAGGAGGTTCGCCCCAGGAGGTGGAAACTTCAAGTGCT
    GGATTTACGGAAGAACTCTCATCAGGACTTCTGGACTGTATGGTCTGGAAACAGGGCCAG
    TCTGTACTCATTTCCAGAGCCAGAAGCAGCTCAGCCCATGACAAAGAAGCGAAAAGTAGA
    TGGTTTGAGCACAGAGGCAGAGCAGCCCTTCATTCCAGTAGAGGTGCTCGTAGACCTGTT
    CCTCAAGGAAGGTGCCTGTGATGAATTGTTCTCCTACCTCATTGAGAAAGTGAAGCGAAA
    GAAAAATGTACTACGCCTGTGCTGTAAGAAGCTGAAGATTTTTGCAATGCCCATGCAGGA
    TATCAAGATGATCCTGAAAATGGTGCAGCTGGACTCTATTGAAGATTTGGAAGTGACTTG
    TACCTGGAAGCTACCCACCTTGGCGAAATTTTCTCCTTACCTGGGCCAGATGATTAATCT
    GCGTAGACTCCTCCTCTCCCACATCCATGCATCTTCCTACATTTCCCCGGAGAAGGAAGA
    GCAGTATATCGCCCAGTTCACCTCTCAGTTCCTCAGTCTGCAGTGCCTGCAGGCTCTCTA
    TGTGGACTCTTTATTTTTCCTTAGAGGCCGCCTGGATCAGTTGCTCAGGCACGTGATGAA
    CCCCTTGGAAACCCTCTCAATAACTAACTGCCGGCTTTCGGAAGGGGATGTGATGCATCT
    GTCCCAGAGTCCCAGCGTCAGTCAGCTAAGTGTCCTGAGTCTAAGTGGGGTCATGCTGAC
    CGATGTAAGTCCCGAGCCCCTCCAAGCTCTGCTGGAGAGAGCCTCTGCCACCCTCCAGGA
    CCTGGTCTTTGATGAGTGTGGGATCACGGATGATCAGCTCCTTGCCCTCCTGCCTTCCCT
    GAGCCACTGCTCCCAGCTTACAACCTTAAGCTTCTACGGGAATTCCATCTCCATATCTGC
    CTTGCAGAGTCTCCTGCAGCACCTCATCGGGCTGAGCAATCTGACCCACGTGCTGTATCC
    TGTCCCCCTGGAGAGTTATGAGGACATCCATGGTACCCTCCACCTGGAGAGGCTTGCCTA
    TCTGCATGCCAGGCTCAGGGAGTTGCTGTGTGAGTTGGGGCGGCCCAGCATGGTCTGGCT
    TAGTGCCAACCCCTGTCCTCACTGTGGGGACAGAACCTTCTATGACCCGGAGCCCATCCT
    GTGCCCCTGTTTCATGCCTAACTAGCTGGGTGCACATATCAAATGCTTCATTCTGCATAC
    TTGGACACTAAAGCCAGGATGTGCATGCATCTTGAAGCAACAAAGCAGCCACAGTTTCAG
    ACAAATGTTCAGTGTGAGTGAGGAAAACATGTTCAGTGAGGAAAAAACATTCAGACAAAT
    GTTCAGTGAGGAAAAAAAGGGGAAGTTGGGGATAGGCAGATGTTGACTTGAGGAGTTAAT
    GTGATCTTTGGGGAGATACATCTTATAGAGTTAGAAATAGAATCTGAATTTCTAAAGGGA
    GATTCTGGCTTGGGAAGTACATGTAGGAGTTAATCCCTGTGTAGACTGTTGTAAAGAAAC
    TGTTGAAAATAAAGAGAAGCAATGTGAAGCAAAAAAAAAAAAAAAAAA
    >Hs.271580_contig1
    AI632869|AW338882|AW338875|AW613773|AI982899|AW193151|BE206353|BE208200|AI8
    11548|AW264021 polyA = 2 polyA = 3
    AACACAGCCCTACCAANCAATGATGACCAGTGGAAAACAATGAAGTCACCAAACCCTGGA
    CAGGGCTCATGCTCCAGGACAANTTGCTGTGGCGTAAATGGTCCATCAGACTGGCAAAAA
    TACACATCTGCCTTCCGGACTGAGAATAATGATGCTGACTATCCCTGGCCTCGTCAATGC
    TGTGTTATGAACAATCTTAAAGAACCTCTCAACCTGGAGGCTTGTAAACTAGGCGTGCCT
    GGTTTTTATCACAATCAGGGCTGCTATGAACTGATCTCTGGTCCAATGAACCGACACGCC
    TGGGGGGTTGCCTGGTTTGGATTTGCCATTCTCTGCTGGACTTTTTGGGTTCTCCTGGGT
    ACCATGTTCTACTGGAGCAGAATTGAATATTAAGCATAAAGTGTTGCCACCATACCTCCT
    TCCCCGAGTGACTCTGGATTTGGTGCTGGAACCAGCTCTCTCCTAATATTCCACGTTTGT
    GCCCCACACTAACGTGTGTGTCTTACATTGCCAAGTCAGATGGTACGGACTTCCTTTAGG
    ATCTCAGGCTTCTGCAGTTCTCATGACTCCTACTTTTCATCCTAGTCTAGCATTCTGCAA
    CATTTATATAGACTGTTGAAAGGAGAATTTGAAAAATGCATAATAACTACTTCCATCCCT
    GCTTATTTTTAATTTGGGAAAATAAATACATTCGAAGGAAAAAAAAA
    >Hs.69360_mRNA_2 gi|14250609|gb|BC008764.1|BC008764 Homo sapiens clone
    MGC:1266 IMAGE:3347571 polyA = 3
    GGCACGAGGGCGAAATTGAGGTTTCTTGGTATTGCGCGTTTCTCTTCCTTGCTGACTCTC
    CGAATGGCCATGGACTCGTCGCTTCAGGCCCGCCTGTTTCCCGGTCTCGCTATCAAGATC
    CAACGCAGTAATGGTTTAATTCACAGTGCCAATGTAAGGACTGTGAACTTGGAGAAATCC
    TGTGTTTCAGTGGAATGGGCAGAAGGAGGTGCCACAAAGGGCAAAGAGATTGATTTTGAT
    GATGTGGCTGCAATAAACCCAGAACTCTTACAGCTTCTTCCCTTACATCCGAAGGACAAT
    CTGCCCTTGCAGGAAAATGTAACAATCCAGAAACAAAAACGGAGATCCGTCAACTCCAAA
    ATTCCTGCTCCAAAAGAAAGTCTTCGAAGCCGCTCCACTCGCATGTCCACTGTCTCAGAG
    CTTCGCATCACGGCTCAGGAGAATGACATGGAGGTGGAGCTGCCTGCAGCTGCAAACTCC
    CGCAAGCAGTTTTCAGTTCCTCCTGCCCCCACTAGGCCTTCCTGCCCTGCAGTGGCTGAA
    ATACCATTGAGGATGGTCAGCGAGGAGATGGAAGAGCAAGTCCATTCCATCCGAGGCAGC
    TCTTCTGCAAACCCTGTGAACTCAGTTCGGAGGAAATCATGTCTTGTGAAGGAAGTGGAA
    AAAATGAAGAACAAGCGAGAAGAGAAGAAGGCCCAGAACTCTGAAATGAGAATGAAGAGA
    GCTCAGGAGTATGACAGTAGTTTTCCAAACTGGGAATTTGCCCGAATGATTAAAGAATTT
    CGGGCTACTTTGGAATGTCATCCACTTACTATGACTGATCCTATCGAAGAGCACAGAATA
    TGTGTCTGTGTTAGGAAACGCCCACTGAATAAGCAAGAATTGGCCAAGAAAGAAATTGAT
    GTGATTTCCATTCCTAGCAAGTGTCTCCTCTTGGTACATGAACCCAAGTTGAAAGTGGAC
    TTAACAAAGTATCTGGAGAACCAAGCATTCTGCTTTGACTTTGCATTTGATGAAACAGCT
    TCGAATGAAGTTGTCTACAGGTTCACAGCAAGGCCACTGGTACAGACAATCTTTGAAGGT
    GGAAAAGCAACTTGTTTTGCATATGGCCAGACAGGAAGTGGCAAGACACATACTATGGGC
    GGAGACCTCTCTGGGAAAGCCCAGAATGCATCCAAAGGGATCTATGCCATGGCCTCCCGG
    GACGTCTTCCTCCTGAAGAATCAACCCTGCTACCGGAAGTTGGGCCTGGAAGTCTATGTG
    ACATTCTTCGAGATCTACAATGGGAAGCTGTTTGACCTGCTCAACAAGAAGGCCAAGCTG
    CGCGTGCTGGAGGACGGCAAGCAACAGGTGCAAGTGGTGGGGCTGCAGGAGCATCTGGTT
    AACTCTGCTGATGATGTCATCAAGATGATCGACATGGGCAGCGCCTGCAGAACCTCTGGG
    CAGACATTTGCCAACTCCAATTCCTCCCGCTCCCACGCGTGCTTCCAAATTATTCTTCGA
    GCTAAAGGGAGAATGCATGGCAAGTTCTCTTTGGTAGATCTGGCAGGGAATGAGCGAGGC
    GCGGACACTTCCAGTGCTGACCGGCAGACCCGCATGGAGGGCGCAGAAATCAACAAGAGT
    CTCTTAGCCCTGAAGGAGTGCATCAGGGCCCTGGGACAGAACAAGGCTCACACCCCGTTC
    CGTGAGAGCAAGCTGACACAGGTGCTGAGGGACTCCTTCATTGGGGAGAACTCTAGGACT
    TGCATGATTGCCACGATCTCACCAGGCATAAGCTCCTGTGAATATACTTTAAACACCCTG
    AGATATGCAGACAGGGTCAAGGAGCTGAGCCCCCACAGTGGGCCCAGTGGAGAGCAGTTG
    ATTCAAATGGAAACAGAAGAGATGGAAGCCTGCTCTAACGGGGCGCTGATTCCAGGCAAT
    TTATCCAAGGAAGAGGAGGAACTGTCTTCCCAGATGTCCAGCTTTAACGAAGCCATGACT
    CAGATCAGGGAGCTGGAGGAGAAGGCTATGGAAGAGCTCAAGGAGATCATACAGCAAGGA
    CCAGACTGGCTTGAGCTCTCTGAGATGACCGAGCAGCCAGACTATGACCTGGAGACCTTT
    GTGAACAAAGCGGAATCTGCTCTGGCCCAGCAAGCCAAGCATTTCTCAGCCCTGCCAGAT
    GTCATCAAGGCCTTGCGCCTGGCCATGCAGCTGGAAGAGCAGGCTAGCAGACAAATAAGC
    AGCAAGAAACGGCCCCAGTGACGACTGCAAATAAAAATCTGTTTGGTTTGACACCCAGCC
    TCTTCCCTGGCCCTCCCCAGAGAACTTTGGGTACCTGGTGGGTCTAGGCAGGGTCTGAGC
    TGGGACAGGTTCTGGTAAATGCCAAGTATGGGGGCATCTGGGCCCAGGGCAGCTGGGGAG
    GGGGTCAGAGTGACATGGGACACTCCTTTTCTGTTCCTCAGTTGTCGCCCTCACGAGAGG
    AAGGAGCTCTTAGTTACCCTTTTGTGTTGCCCTTCTTTCCATCAAGGGGAATGTTCTCAG
    CATAGAGCTTTCTCCGCAGCATCCTGCCTGCGTGGACTGGCTGCTAATGGAGAGCTCCCT
    GGGGTTGTCCTGGCTCTGGGGAGAGAGACGGAGCCTTTAGTACAGCTATCTGCTGGCTCT
    AAACCTTCTACGCCTTTGGGCCGAGCACTGAATGTCTTGTACTTTAAAAAAATGTTTCTG
    AGACCTCTTTCTACTTTACTGTCTCCCTAGAGATCCTAGAGGATCCCTACTGTTTTCTGT
    TTTATGTGTTTATACATTGTATGTAACAATAAAGAGAAAAAATAAAAAAAAAAAAAAAAA
    AAAAAAAAAAAA
    >Hs.30827_contig1 H07885|N39347|W85913|AA583408|W86449 polyA = 2 polyA = 3
    ATCGGACTTCGGTNAACTNTGGCAAGGATTGGACAGNCTAGGTAGGCTAAATGTGTGCTC
    TGTCCCTGTTTGCTTCAACAGAGGAGCAAGCCTCAGCTGAGAAGGAGGGCACNTGGAACA
    CCTAGCTCCTCCCGTGATTCCCCAAACCCATAACATTCTTCCATAGGGCTGGAACCAGTG
    CCCCGTCCTGACAGGGATGAAAAGTGAACCCCTCAGGTCAGGAGAGGCCAGAGTTGAGGT
    TCTGCCACTTCCTGTCCCTGGGGAGCCACTCAAGTTACCAGGGCTACCGGCTGAAATAAA
    TCTTTTCCGGGTAGGGTCAAGGGCAGTGTGTTCCAAGGCAACTGATGTAGGCCAGTTGCG
    TGACTCCAGGTTTGTCCTGGTACTCAGTGGGTCCAATCACCTGGCATTGATCACCTGGCA
    TTGATCAGCACCCACCCCACCCCTGAGGCTTGCCCAGCCCCCAGGCCCTCAGATCCCTGC
    TCTTCCTGCCTTTCCTGCCCATGTGTCACCCAGCACCCAAGGTTCAGTGACACAGGGTGG
    TTTGGAGCTGGTCACTGTCATAGCAGCTGTGATTTCACAAGGAAGGGTGCTGCAGGGGGA
    CCTGGTTGATGGGGAGTGGGAAGGGGAAGGAATAAAGAGATCTTCCTCAGGTAAAAAAAA
    AAAAAAAAAA
    >Hs.211593_contig2
    BF592799|AI570478|AA234440|R40214|BE501078|AW593784|AI184050|
    AI284161|W72149|AW780437|AI247981|AW241273|H60824 polyA = 2 polyA = 3
    ACCTCGTTTGCTCCCAGTTACTTCTTATCTGGAGCAGTAATGTAGTCCACTTCACTCATG
    CCTACCCCGCGTGTCTCGTCTCCTGACATGTCTCACAGACGCTCCTGAAGTTAGGTCATT
    ACCTAACCCATAGTTATTTACCTTGAAAGATGGGTCTCCGCACTTGGAAAGGTTTCAAGA
    CTTGATACTGCAATAAATTATGGCTCTTCACCTGGGCGCCAACTGCTGATCAACGAAATG
    CTTGTTGAATCAGGGGCAAACGGAGTACAGACGTCTCAAGACTGAAACGGCCCCATTGCC
    TGGTCTAGTAGCGGATCTCACTCAGCCGCAGACAAGTAATCACTAACCCGTTTTATTCTA
    TTCCTATCTGTGGATGTGTAAATGGCTGGGGGGCCAGCCCTGGATAGGTTTTTATGGGAA
    TTCTTTACAATAAACATAGCTTGTAACTTGAGATCTACAAATCCATTCATCCTGATTGGG
    CATGAAATCCATGGTCAAGAGGACAAGTGGAAAGTGAGAGGGAAGGTTTGCTAGACACCT
    TCGCTTGTTATCTTGTCAAGATAGAAAAGATAGTATCATTTCACCCTTGCCAGTAAAAAC
    CTTTCCATCCACCCATTCTCAGCAGACTCCAGTATTGGCACAGTCACTCACTGCCATTCT
    CACACTATAACAAGAAAAGAAATGAAGTGCATAAGTCTCCTGGGAAAAGAACCTTAACCC
    CTTCTCGTGCCATGACTGGTGATTTCATGACTCATAAGCCCCTCCGTAGGCATCATTCAA
    GATCAATGGCCCATGCATGCTGTTTGCAGCAGTCAATTGAGTTGAATTAGAATTCCAACC
    ATACATTTTAAAGGTATTTGTGCTGTGTGTATATTTTGATAAAATGTTGTGACTTCATGG
    CAAACAGGTGGATGTGTAAAAATGGAATAAAAAAAAAAAAAGAGTCAAAAAAAAAAAAAA
    AATT
    >Hs.155097_mRNA_1 gi|15080385|gb|BC011949.1|BC011949 Homo sapiens clone
    MGC:9006 IMAGE:3863603 polyA = 3
    GGCGCCCAAGCCGCCGCCGCCAGATCGGTGCCGATTCCTGCCCTGCCCCGACCGCCAGCG
    CGACCATGTCCCATCACTGGGGGTACGGCAAACACAACGGACCTGAGCACTGGCATAAGG
    ACTTCCCCATTGCCAAGGGAGAGCGCCAGTCCCCTGTTGACATCGACACTCATACAGCCA
    AGTATGACCCTTCCCTGAAGCCCCTGTCTGTTTCCTATGATCAAGCAACTTCCCTGAGGA
    TCCTCAACAATGGTCATGCTTTCAACGTGGAGTTTGATGACTCTCAGGACAAAGCAGTGC
    TCAAGGGAGGACCCCTGGATGGCACTTACAGATTGATTCAGTTTCACTTTCACTGGGGTT
    CACTTGATGGACAAGGTTCAGAGCATACTGTGGATAAAAAGAAATATGCTGCAGAACTTC
    ACTTGGTTCACTGGAACACCAAATATGGGGATTTTGGGAAAGCTGTGCAGCAACCTGATG
    GACTGGCCGTTCTAGGTATTTTTTTGAAGGTTGGCAGCGCTAAACCGGGCCTTCAGAAAG
    TTGTTGATGTGCTGGATTCCATTAAAACAAAGGGCAAGAGTGCTGACTTCACAAACTTTG
    CAGCTCGTGGCCTCCTTCCTGAATCCCTGGATTACTGGACCTACCCAGGCTCACTGACCA
    CCCCTCCTCTTCTGGAATGTGTGACCTGGATTGTGCTCAAGGAACCCATCAGCGTCAGCA
    GCGAGCAGGTGTTGAAATTCCGTAAACTTAACTTCAATGGGGAGGGTGAACCCGAAGAAC
    TGATGGTGGACAACTGGCGCCCAGCTCAGCCACTGAAGAACAGGCAAATCAAAGCTTCCT
    TCAAATAAGATGGTCCCATAGTCTGTATCCAAATAATGAATCTTCGGGTGTTTCCCTTTA
    GCTAAGCACAGATCTACCTTGGTGATTTGGACCCTGGTTGCTTTGTGTCTAGTTTTCTAG
    ACCCTTCATCTCTTACTTGATAGACTTACTAATAAAATGTGAAGACTAGACCAATTGTCA
    TGCTTGACACAACTGCTGTGGCTGGTTGGTGCTTTGTTTATGGTAGTAGTTTTTCTGTAA
    CACAGAATATAGGATAAGAAATAAGAATAAAGTACCTTGACTTTGTTCACAGCATGTAGG
    GTGATGAGCACTCACAATTGTTGACTAAAATGCTGCCTTTAAAACATAGGAAAGTAGAAT
    GGTTGAGTGCAAATCCATAGCACAAGATAAATTGAGCTAGTTAAGGCAAATCAGGTAAAA
    TAGTCATGATTCTATGTAATGTAAACCAGAAAAAATAAATGTTCATGATTTCAAGATGTT
    ATATTAAAGAAAAACTTTAAAAATTATTATATATTTATAGCAAAGTTATCTTAAATATGA
    ATTCTGTTGTAATTTAATGACTTTTGAATTACAGAGATATAAATGAAGTATTATCTGTAA
    AAATTGTTATAATTAGAGTTGTGATACAGAGTATATTTCCATTCAGACAATATATCATAA
    CTTAATAAATATTGTATTTTAGATATATTCTCTAATAAAATTCAGAATTCTAAAAAAAAA
    AAAAAAAA
    >Hs.5163_mRNA_1 gi|15990433|gb|BC015582.1|BC015582 Homo sapiens clone
    MGC:23280 IMAGE:4637504 polyA = 3
    GGCACGAGGCATGGAGGCGCTGCTGCTGGGCGCGGGGTTGCTGCTGGGCGCTTACGTGCT
    TGTCTACTACAACCTGGTGAAGGCCCCGCCGTGCGGCGGCATGGGCAACCTGCGGGGCCG
    CACGGCCGTGGTCACGGGTGAGTGCGGAGGCGGGTGAGTGCGAGCTGGCGGGGCGCGCGG
    AGAGGAGGCCGGGCCGGCGGTAGCAGCGGCCCGCCGGGCTCAGCTCAGCTCGGCTCCCGC
    CCGCGGTCCGCAGGCGCCAACAGCGGCATCGGAAAGATGACGGCGCTGGAGCTGGCGCGC
    CGGGGAGCGCGCGTGGTGCTGGCCTGCCGCAGCCAGGAGCGCGGGGAGGCGGCTGCCTTC
    GACCTCCGCCAGGAGAGTGGGAACAATGAGGTCATCTTCATGGCCTTGGACTTGGCCAGT
    CTGGCCTCGGTGCGGGCCTTTGCCACTGCCTTTCTGAGCTCTGAGCCACGGTTGGACATC
    CTCATCCACAATGCCGGTATCAGTTCCTGTGGCCGGACCCGTGAGGCGTTTAACCTGCTG
    CTTCGGGTGAACCATATCGGTCCCTTTCTGCTGACACATCTGCTGCTGCCTTGCCTGAAG
    GCATGTGCCCCTAGCCGCGTGGTGGTGGTAGCCTCAGCTGCCCACTGTCGGGGACGTCTT
    GACTTCAAACGCCTGGACCGCCCAGTGGTGGGCTGGCGGCAGGAGCTGCGGGCATATGCT
    GACACTAAGCTGGCTAATGTACTGTTTGCCCGGGAGCTCGCCAACCAGCTTGAGGCCACT
    GGCGTCACCTGCTATGCAGCCCACCCAGGGCCTGTGAACTCGGAGCTGTTCCTGCGCCAT
    GTTCCTGGATGGCTGCGCCCACTTTTGCGCCCATTGGCTTGGCTGGTGCTCCGGGCACCA
    AGAGGGGGTGCCCAGACACCCCTGTATTGTGCTCTACAAGAGGGCATCGAGCCCCTCAGT
    GGGAGATATTTTGCCAACTGCCATGTGGAAGAGGTGCCTCCAGCTGCCCGAGACGACCGG
    GCAGCCCATCGGCTATGGGAGGCCAGCAAGAGGCTGGCAGGGCTTGGGCCTGGGGAGGAT
    GCTGAACCCGATGAAGACCCCCAGTCTGAGGACTCAGAGGCCCCATCTTCTCTAAGCACC
    CCCCACCCTGAGGAGCCCACAGTTTCTCAACCTTACCCCAGCCCTCAGAGCTCACCAGAT
    TTGTCTAAGATGACGCACCGAATTCAGGCTAAAGTTGAGCCTGAGATCCAGCTCTCCTAA
    CCCTCAGGCCAGGATGCTTGCCATGGCACTTCATGGTCCTTGAAAACCTCGGATGTGTGC
    GAGGCCATGCCCTGGACACTGACGGGTTTGTGATCTTGACCTCCGTGGTTACTTTCTGGG
    GCCCCAAGCTGTGCCCTGGACATCTCTTTTCCTGGTTGAAGGAATAATGGGTGATTATTT
    CTTCCTGAGAGTGACAGTAACCCCAGATGGAGAGATAGGGGTATGCTAGACACTGTGCTT
    CTCGGAAATTTGGATGTAGTATTTTCAGGCCCCACCCTTATTGATTCTGATCAGCTCTGG
    AGCAGAGGCAGGGAGTTTGCAATGTGATGCACTGCCAACATTGAGAATTAGTGAACTGAT
    CCCTTTGCAACCGTCTAGCTAGGTAGTTAAATTACCCCCATGTTAATGAAGCGGAATTAG
    GCTCCCGAGCTAAGGGACTCGCCTAGGGTCTCACAGTGAGTAGGAGGAGGGCCTGGGATC
    TGAACCCAAGGGTCTGAGGCCAGGGCCGACTGCCGTAAGATGGGTGCTGAGAAGTGAGTC
    AGGGCAGGGCAGCTGGTATCGAGGTGCCCCATGGGAGTAAGGGGACGCCTTCCGGGCGGA
    TGCAGGGCTGGGGTCATCTGTATCTGAAGCCCCTCGGAATAAAGCGCGTTGACCGCCAAA
    AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    >Hs.55150_mRNA_1 gi|17068414|gb|BC017586.1|BC017586 Homo sapiens clone
    MGC:26610 IMAGE:4837506 polyA = 3
    AGCGGTGGAGAAAAGGCAGAACCAGAGTAGAGATTGACAGTGAGCTGAGCCAATCAGGCT
    GTGAATCTGCAGCAGTGATCCCAGGTCCTCCAATTAATACTAAGAGAGTGGACCAGGGCC
    CCTGAGGAAGACAGATGGCAGGGACAGCGCGCCATGACCGAGAGATGGCGATCCAGGCCA
    AGAAAAAGCTCACCACGGCCACCAACCCCATTGAAAGACTCCGACTGCAGTGCCTGGCCA
    GGGGCTCTGCTGGGATCAAAGGACTTGGCAGAGTGTTTAGAATTATGGATGACGATAATA
    ATCGAACCCTTGATTTTAAAGAATTTATGAAAGGGTTAAATGATTATGCTGTGGTCATGG
    AAAAAGAAGAGGTGGAAGAACTTTTCCGGAGGTTTGATAAAGATGGAAATGGAACAATAG
    ACTTCAATGAATTTCTTCTCACATTAAGACCTCCAATGTCCAGAGCCAGAAAAGAGGTAA
    TCATGCAAGCTTTTAGAAAGTTAGACAAGACTGGAGATGGTGTTATAACAATCGAAGACC
    TTCGTGAAGTATATAATGCAAAACACCACCCAAAGTACCAGAATGGGGAATGGAGTGAGG
    AACAAGTATTTAGGAAATTTCTGGATAACTTTGATTCACCCTATGACAAAGATGGATTGG
    TGACCCCTGAGGAGTTCATGAACTACTATGCAGGTGTGAGCGCATCCATTGACACTGATG
    TGTACTTCATCATCATGATGAGAACCGCCTGGAAGCTTTAAGCACATGACCTGGGGACCA
    GGCCCTGGGACAGCCATGTGGCTCCAAATGACTAAATGTCAGCTCAAAAACCAGAATCGT
    ATTTGATTTCACACTCATCCTAATGTTTTTTTCTGTGTCAAAATATTGCATTTTCTGGGG
    CCAAAAAACAGGCAGAAATAAAAGACATTGAGTAGTCAAAAAAAAAAAAAAAA
    >Hs.170177_contig3
    AI620495|AW291989|AA780896|AA976262|AI298326|BF111862|AW591523|AI922518|AI4
    80280|BF589437|AA600354|AI886238|AA035599|H90049|BF112011|N52601|AI570965|A
    I565367|AW768847|H90073|BE504361|N45292|AI632075|AA679729|AW168052|AI978827
    |AI968410|AI669255|N45300|AI651256|AI698970|AI521256|AW078614|AI802070|AI88
    5947|AI342534|AI653624|AW243936|T16586|R15989|AI289789|AI871636|AI718785|AW
    148847 polyA = 2 polyA = 3
    TAGAGCATTAAAATAACTATCAGGCAGAAGAATCTTTCTTCTCGCCTAGGATTTCAGCCA
    TGCGCGCGCTCTCTCTCTTTCTCTCTCTTTTCCTCTCTCTCCCTCTTTCTAGCCTGGGGC
    TTGAATTTGCATGTCTAATTCATTTACTCACCATATTTGAATTGGCCTGAACAGATGTAA
    ATCGGGAAGGATGGGAAAAACTGCAGTCATCAACAATGATTAATCAGCTGTTGCAGGCAG
    TGTCTTAAGGAGACTGGTAGGAGGAGGCATGGAAACCAAAAGGCCGTGTGTTTAGAAGCC
    TAATTGTCACATCAAGCATCATTGTCCCCATGCAACAACCACCACCTTATACATCACTTC
    CTGTTTTAAGCAGCTCTAAAACATAGACTGAAGATTTATTTTTAATATGTTGACTTTATT
    TCTGAGCAAAGCATCGGTCATGTGTGTATTTTTTCATAGTCCCACCTTGGAGCATTTATG
    TAGACATTGTAAATAAATTTTGTGCAAAAAGGACTGGAAAAATGAACTGTATTATTGCAA
    TTTTTTTTTGTAAAAGTAGCAGTTTGGTATGAGTTGGCATGCATACAAGATTTACTAAGT
    GGGATAAGCTAATTATACTTTTTGTTGTGGATAAACAAATGCTTGTTGATAGCCTTTTTC
    TATCAAGAAACCAAGGAGCTAATTATTAATAACAATCATTGCACACTGAGTCTTAGCGTT
    TCTGATGGAAACAGTTTGGATTGTATAATAACGCCAAGCCCAGTTGTAGTCGTTTGAGTG
    CAGTAATGAAATCTGAATCTAAAATAAAAACAAGATTATTTTTGTCAAAAAAAAAAAAAA
    AAAAAAAAAA
    >Hs.184601_mRNA_5 gi|4426639|gb|AF104032.1|AF104032 Homo sapiens polyA = 2
    GCGGCGCGCACACTGCTCGCTGGGCCGCGGCTCCCGGGTGTCCCAGGCCCGGCCGGTGCG
    CAGAGCATGGCGGGTGCGGGCCCGAAGCGGCGCGCGCTAGCGGCGCCGGCGGCCGAGGAG
    AAGGAAGAGGCGCGGGAGAAGATGCTGGCCGCCAAGAGCGCGGACGGCTCGGCGCCGGCA
    GGCGAGGGCGAGGGCGTGACCCTGCAGCGGAACATCACGCTGCTCAACGGCGTGGCCATC
    ATCGTGGGGACCATTATCGGCTCGGGCATCTTCGTGACGCCCACGGGCGTGCTCAAGGAG
    GCAGGCTCGCCGGGGCTGGCGCTGGTGGTGTGGGCCGCGTGCGGCGTCTTCTCCATCGTG
    GGCGCGCTCTGCTACGCGGAGCTCGGCACCACCATCTCCAAATCGGGCGGCGACTACGCC
    TACATGCTGGAGGTCTACGGCTCGCTGCCCGCCTTCCTCAAGCTCTGGATCGAGCTGCTC
    ATCATCCGGCCTTCATCGCAGTACATCGTGGCCCTGGTCTTCGCCACCTACCTGCTCAAG
    CCGCTCTTCCCCACCTGCCCGGTGCCCGAGGAGGCAGCCAAGCTCGTGGCCTGCCTCTGC
    GTGCTGCTGCTCACGGCCGTGAACTGCTACAGCGTGAAGGCCGCCACCCGGGTCCAGGAT
    GCCTTTGCCGCCGCCAAGCTCCTGGCCCTGGCCCTGATCATCCTGCTGGGCTTCGTCCAG
    ATCGGGAAGGGTGATGTGTCCAATCTAGATCCCAACTTCTCATTTGAAGGCACCAAACTG
    GATGTGGGGAACATTGTGCTGGCATTATACAGCGGCCTCTTTGCCTATGGAGGATGGAAT
    TACTTGAATTTCGTCACAGAGGAAATGATCAACCCCTACAGAAACCTGCCCCTGGCCATC
    ATCATCTCCCTGCCCATCGTGACGCTGGTGTACGTGCTGACCAACCTGGCCTACTTCACC
    ACCCTGTCCACCGAGCAGATGCTGTCGTCCGAGGCCGTGGCCGTGGACTTCGGGAACTAT
    CACCTGGGCGTCATGTCCTGGATCATCCCCGTCTTCGTGGGCCTGTCCTGCTTCGGCTCC
    GTCAATGGGTCCCTGTTCACATCCTCCAGGCTCTTCTTCGTGGGGTCCCGGGAAGGCCAC
    CTGCCCTCCATCCTCTCCATGATCCACCCACAGCTCCTCACCCCCGTGCCGTCCCTCGTG
    TTCACGTGTGTGATGACGCTGCTCTACGCCTTCTCCAAGGACATCTTCTCCGTCATCAAC
    TTCTTCAGCTTCTTCAACTGGCTCTGCGTGGCCCTGGCCATCATCGGCATGATCTGGCTG
    CGCCACAGAAAGCCTGAGCTTGAGCGGCCCATCAAGGTGAACCTGGCCCTGCCTGTGTTC
    TTCATCCTGGCCTGCCTCTTCCTGATCGCCGTCTCCTTCTGGAAGACACCCGTGGAGTGT
    GGCATCGGCTTCACCATCATCCTCAGCGGGCTGCCCGTCTACTTCTTCGGGGTCTGGTGG
    AAAAACAAGCCCAAGTGGCTCCTCCAGGGCATCTTCTCCACGACCGTCCTGTGTCAGAAG
    CTCATGCAGGTGGTCCCCCAGGAGACATAGCCAGGAGGCCGAGTGGCTGCCGGAGGAGCA
    TGCGCAGAGGCCAGTTAAAGTAGATCACCTCCTCGAACCCACTCCGGTTCCCCGCAACCC
    ACAGCTCAGCTGCCCATCCCAGTCCCTCGCCGTCCCTCCCAGGTCGGGCAGTGGAGGCTG
    CTGTGAAAACTCTGGTACGAATCTCATCCCTCAACTGAGGGCCAGGGACCCAGGTGTGCC
    TGTGCTCCTGCCCAGGAGCAGCTTTTGGTCTCCTTGGGCCCTTTTTCCCTTCCCTCCTTT
    GTTTACTTATATATATATTTTTTTTAAACTTAAATTTTGGGTCAACTTGACACCACTAAG
    ATGATTTTTTAAGGAGCTGGGGGAAGGCAGGAGCCTTCCTTTCTCCTGCCCCAAGGGCCC
    AGACCCTGGGCAAACAGAGCTACTGAGACTTGGAACCTCATTGCTACGACAGACTTGCAC
    TGAAGCCGGACAGCTGCCCAGACACATGGGCTTGTGACATTCGTGAAAACCAACCCTGTG
    GGCTTATGTCTCTGCCTTAGGGTTTGCAGAGTGGAAACTCAGCCGTAGGGTGGCACTGGG
    AGGGGGTGGGGGATCTGGGCAAGGTGGGTGATTCCTCTCAGGAGGTGCTTGAGGCCCCGA
    TGGACTCCTGACCATAATCCTAGCCCTGAGACACCATCCTGAGCCAGGGAACAGCCCCAG
    GGTTGGGGGGTGCCGGCATCTCCCCTAGCTCACCAGGCCTGGCCTCTGGGCAGTGTGGCC
    TCTTGGCTATTTCTGTGTCCAGTTTTGGAGGCTGAGTTCTGGTTCATGCAGACAAAGCCC
    TGTCCTTCAGTCTTCTAGAAACAGAGACAAGAAAGGCAGACACACCGCGGCCAGGCACCC
    ATGTGGGCGCCCACCCTGGGCTCCACACAGCAGTGTCCCCTGCCCCAGAGGTCGCAGCTA
    CCCTCAGCCTCCAATGCATTGGCCTCTGTACCGCCCGGCAGCCCCTTCTGGCCGGTGCTG
    GGTTCCCACTCCCGGCCTAGGCACCTCCCCGCTCTCCCTGTCACGCTCATGTCCTGTCCT
    GGTCCTGATGCCCGTTGTCTAGGAGACAGAGCCAAGCACTGCTCACGTCTCTGCCGCCTG
    CGTTTGGAGGCCCCTGGGCTCTCACCCAGTCCCCACCCGCCTGCAGAGAGGGAACTAGGG
    CACCCCTTGTTTCTGTTGTTCCCGTGAATTTTTTTCGCTATGGGAGGCAGCCGAGGCCTG
    GCCAATGCGGCCCACTTTCCTGAGCTGTCGCTGCCTCCATGGCAGCAGCCAAGGACCCCC
    AGAACAAGAAGACCCCCCCGCAGGATCCCTCCTGAGCTCGGGGGGCTCTGCCTTCTCAGG
    CCCCGGGCTTCCCTTCTCCCCAGCCAGAGGTGGAGCCAAGTGGTCCAGCGTCACTCCAGT
    GCTCAGCTGTGGCTGGAGGAGCTGGCCTGTGGCACAGCCCTGAGTGTCCCAAGCCGGGAG
    CCAACGAAGCCGGACACGGCTTCACTGACCAGCGGCTGCTCAAGCCGCAAGCTCTCAGCA
    AGTGCCCAGCGGAGCCTGCCGCCCCCACCTGGGCACCGGGACCCCCTCACCATCCAGTGG
    GCCCGGAGAAACCTGATGAACAGTTTGGGGACTCAGGACCAGATGTCCGTCTCTCTTGCT
    TGAGGAATGAAGACCTTTATTCACCCCTGCCCCGTTGCTTCCCGCTGCACATGGACAGAC
    TTCACAGCGTCTGCTCATAGGACCTGCATCCTTCCTGGGGACGAATTCCACTCGTCCAAG
    GGACAGCCCACGGTCTGGAGGCCGAGGACCACCAGCAGGCAGGTGGACTGACTGTGTTGG
    GCAAGACCTCTTCCCTCTGGGCCTGTTCTCTTGGCTGCAAATAAGGACAGCAGCTGGTGC
    CCCACCTGCCTGGTGCATTGCTGTGTGAATCCAGGAGGCAGTGGACATCGTAGGCAGCCA
    CGGCCCCGGGTCCAGGAGAAGTGCTCCCTGGAGGCACGCACCACTGCTTCCCACTGGGGC
    CGGCGGGGCCCACGCACGACGTCAGCCTCTTACCTTCCCGCCTCGGCTAGGGGTCCTCGG
    GATGCCGTTCTGTTCCAACCTCCTGCTCTGGGACGTGGACATGCCTCAAGGATACAGGGA
    GCCGGCGGCCTCTCGACGGCACGCACTTGCCTGTTGGCTGCTGCGGCTGTGGGCGAGCAT
    GGGGGCTGCCAGCGTCTGTTGTGGAAAGTAGCTGCTAGTGAAATGGCTGGGGCCGCTGGG
    GTCCGTCTTCACACTGCGCAGGTCTCTTCTGGGCGTCTGAGCTGGGGTGGGAGCTCCTCC
    GCAGAAGGTTGGTGGGGGGTCCAGTCTGTGATCCTTGGTGCTGTGTGCCCCACTCCAGCC
    TGGGGACCCCACTTCAGAAGGTAGGGGCCGTGTCCCGCGGTGCTGACTGAGGCCTGCTTC
    CCCCTCCCCCTCCTGCTGTGCTGGAATTCCACAGGGACCAGGGCCACCGCAGGGGACTGT
    CTCAGAAGACTTGATTTTTCCGTCCCTTTTTCTCCACACTCCACTGACAAACGTCCCCAG
    CGGTTTCCACTTGTGGGCTTCAGGTGTTTTCAAGCACAACCCACCACAACAAGCAAGTGC
    ATTTTCAGTCGTTGTGCTTTTTTGTTTTGTGCTAACGTCTTACTAATTTAAAGATGCTGT
    CGGCACCATGTTTATTTATTTCCAGTGGTCATGCTCAGCCTTGCTGCTCTGCGTGGCGCA
    GGTGCCATGCCTGCTCCCTGTCTGTGTCCCAGCCACGCAGGGCCATCCACTGTGACGTCG
    GCCGACCAGGCTGGACACCCTCTGCCGAGTAATGACGTGTGTGGCTGGGACCTTCTTTAT
    TCTGTGTTAATGGCTAACCTGTTACACTGGGCTGGGTTGGGTAGGGTGTTCTGGCTTTTT
    TGTGGGGTTTTTATTTTTAAAGAAACACTCAATCATCCTAAAAAAAAAAAAAAAAAAAAA
    AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    >Hs.351972_singlet1 AA865917 polyA = 2 polyA = 3
    GGGACTTGGAAAGGGGAACTGGGATTTGGGGAGGGGCTGGAGGACTTCCGCACGCTTCCA
    CCTCCTTCGACCTCCACTGCGCCCCACCTCCCTGCCTGTGTGTGTTATTTCAAAGGAAAA
    GAACAAAAGGAATAAATTTTCTAAGCTCTTTAAAAAAAAAAAAAAAAAAAAAAAAA
    >Hs.5366_mRNA_2 gi|15277845|gb|BC012926.1|BC012926 Homo sapiens clone
    MGC:16817 IMAGE:3853503 polyA = 3
    GCAGGCTCTGCCTGTGGCCACTAGCAGAGAAGCTGCTGTCCTTCCACCACCAGCACCGGA
    CCACCTGCTCCAAGACCAGCCTCCTGGGGGGACCAGGCACCCGGCCTTCACTGGCACCCA
    GGGAGCCGTCCTCAGCAGCGTCAACATGTCAAGGCCCAGCAGCAGAGCCATTTACTTGCA
    CCGGAAGGAGTACTCCCAGAACCTCACCTCAGAGCCCACCCTCCTGCAGCACAGGGTGGA
    GCACTTGATGACATGCAAGCAGGGGAGTCAGAGAGTCCAGGGGCCCGAGGATGCCTTGCA
    GAAGCTGTTCGAGATGGATGCACAGGGCCGGGTGTGGAGCCAAGACTTGATCCTGCAGGT
    CAGGGACGGCTGGCTGCAGCTGCTGGACATTGAGACCAAGGAGGAGCTGGACTCTTACCG
    CCTAGACAGCATCCAGGCCATGAATGTGGCGCTCAACACATGTTCCTACAACTCCATCCT
    GTCCATCACCGTGCAGGAGCCGGGCCTGCCAGGCACTAGCACTCTGCTCTTCCAGTGCCA
    GGAAGTGGGGGCAGAGCGACTGAAGACCAGCCTGCAGAAGGCTCTGGAGGAAGAGCTGGA
    GCAAAGCAGACCTCGACTTGGAGGCCTTCAGCCAGGCCAGGACAGATGGAGGGGGCCTGC
    TATGGAAAGGCCGCTCCCTATGGAGCAGGCACGCTATCTGGAGCCGGGGATCCCTCCAGA
    ACAGCCCCACCAGAGGACCCTAGAGCACAGCCTCCCACCATCCCCAAGGCCCCTGCCACG
    CCACACCAGTGCCCGAGAACCAAGTGCCTTTACTCTGCCTCCTCCAAGGCGGTCCTCTTC
    CCCCGAGGACCCAGAGAGGGACGAGGAAGTGCTGAACCATGTCCTAAGGGACATTGAGCT
    GTTCATGGGAAAGCTGGAGAAGGCCCAGGCAAAGACCAGCAGGAAGAAGAAATTTGGGAA
    AAAAAACAAGGACCAGGGAGGTCTCACCCAGGCACAGTACATTGACTGCTTCCAGAAGAT
    CAAGTACAGCTTCAACCTCCTGGGAAGGCTGGCCACCTGGCTGAAGGAGACAAGTGCCCC
    TGAGCTCGTACACATCCTCTTCAAGTCCCTGAACTTCATCCTGGCCAGGTGCCCTGAGGC
    TGGCCTAGCAGCCCAAGTGATCTCACCCCTCCTCACCCCTAAAGCTATCAACCTGCTACA
    GTCCTGTCTAAGCCCACCTGAGAGTAACCTTTGGATGGGGTTGGGCCCAGCCTGGACCAC
    TAGCCGGGCCGACTGGACAGGCGATGAGCCCCTGCCCTACCAACCCACATTCTCGGATGA
    CTGGCAACTTCCAGAGCCCTCCAGCCAAGCACCCTTAGGATACCAGGACCCTGTTTCCCT
    TCGGCGGGGAAGTCATAGGTTAGGGAGCACCTCACACTTTCCTCAGGAGAAGACACACAA
    CCATGACCCTCAGCCTGGGGACCCCAACTCCAGGCCCTCCAGCCCCAAACCTGCCCAGCC
    AGCCCTGAAAATGCAAGTCTTGTACGAGTTTGAAGCTAGGAACCCACGGGAACTGACTGT
    GGTCCAGGGAGAGAAGCTGGAGGTTCTGGACCACAGCAAGCGGTGGTGGCTGGTGAAGAA
    TGAGGCGGGACGGAGCGGCTACATTCCAAGCAACATCCTGGAGCCCCTACAGCCGGGGAC
    CCCTGGGACCCAGGGCCAGTCACCCTCTCGGGTTCCAATGCTTCGACTTAGCTCGAGGCC
    TGAAGAGGTCACAGACTGGCTGCAGGCAGAGAACTTCTCCACTGCCACGGTGAGGACACT
    TGGGTCCCTGACGGGGAGCCAGCTACTTCGCATAAGACCTGGGGAGCTACAGATGCTATG
    TCCACAGGAGGCCCCACGAATCCTGTCCCGGCTGGAGGCTGTCAGAAGGATGCTGGGGAT
    AAGCCCTTAGGCACCAGCTTAGACACCTCCAAGAACCAGGCCCCGCTGATGCAAGATGGC
    AGATCTGATACCCATTAGAGCCCCGAGAATTCCTCTTCTGGATCCCAGTTTGCAGCAAAC
    CCCACACCCCAGCTCACACAGCAAAAACAATGGACAGGCCCAGAGGGTGAAGCAAACAGT
    GTCCCTTCTGGCTGTGTTGGAGCCTCCCCAGTAACCACCTATTTATTTTACCTCTTTCCC
    AAACCTGGAGCATTTATGCCTAGGCTTGTCAAGAATCTGTTCAGTCCCTCTCCTTCTCAA
    TAAAAGCATCTTCAAGCTTGAAAAAAAAAAAAAAA
    >Hs.18140_contig1
    AI685931|AA410954|T97707|AA706873|AI911572|AW614616|AA548520|AW027764|BF511
    251|AI914294|AW151688 polyA = 1 polyA = 1
    CCTTCCATTGAATTCCACCAGACACATTCAGGTTANCTTCGTAATGTCTTCATATGAGTA
    TCAATCAACACCTTCCCCAACTCAATTGTACTAGGTTGTAGAGCACAAGGATGGTCTCGT
    GCTGCTCTGTGGCACCTGTGCCTACACTGCTCTGAGCTTTGAGGAGGCTGCTCTCTTTGC
    TGACCCCATGATCTTTTCTGCCCTTCTGTTAAGGGCATTGGCCACAGCAACGGGGCAAAT
    GCCCCAAGCTGGCTGTAAGTGACCCATCCCTTTGGCTCCCATGATTAGACCAAGGAGAGG
    CATGGGGTCCAGCTGAGCCATTCAGAACCATTCCTTAGCATTTTCCACTCAAAGGTTAGA
    GATGAGATTTTCTCTTCCCAAGGCTACCTCTGGCCATGGTTCCAGCTTCATGGGGGCAAT
    GGGATTAGGAAAATGAGGTCAACCTGCAAAGGAAAGCAGATGCAAGAGATGGAGACAGAA
    TGGGGGTGTCCTGGGGATCTTGGAGCCTGAATTCATTGGCACAAAAGGCAGCAGCATCCT
    CACTGTATCTGCAGTCCATTTGGACTCAATAAAAACTTTGAAAGTCACATGTGTTATGGA
    ATTCCTTCTCAGTGACACATTCATCTGTGCTCAGTTGTCCCAGCAAGGGTCAGCCCCTCA
    TACCCCTGCAGCATCCGCTGCTATGAAGCAGAGCTGTAAACGCCCTCCCTGTGTATAGGA
    AAAGCTACATGGAGCAAATCCTCCTGCCTGAAGAAGTGCATCTCAGCATCACTTCAGCTG
    TCGGGGCATTTGTGGGGAGAACCAGACCACCTCTGCGGAAGGCAGCAGACCCTCTTCCAG
    CCATGGATGGAGTTGAATTCTCTATAAACGGTTCACCAGCAAACCACCAATACATTCCAT
    TGTTTGCCTAGAGAGAAATTTAAAAATATVATAAATGTTCACTTAT
    >Hs.133196_contig2
    BF224381|BE467992|AW137689|AI695045|AW207361|BF445141|AA405473 polyA = 2 WARN
    polyA = 3
    TGCGGCCGCGGCATGAAAGGCGGCGAGGAGAGGCAGCACTGCTGCTCTTGACTTCTGAGC
    AGGGCTTAGAGAGCCTGCCCCGGCTTAAGCCGAGCTGCTGGTGCTGACCCTGAGCGCCGA
    GTCCGCGAGCTCTGAGTCCGGAGCCTCCCAGCCGTGGAGCCGTGGGATGAGGGGGGCGTT
    GGGGGACAGGGCAAAGTCGATCTTGGTTGTACAGCCGCCCGATCCTAGCGCGGAGCTGCG
    AGCCTGACCGGCCGCGTCTGGCATGGTCAGAGAAAGAATTTTCTTTTCCCAACTCCGGCT
    TTTGGTTTTGTGTGTCCACCTTGCGCAACTCCGGAGCCAGCCGACCCCACATGGATTCTC
    AACAGGTGGCCGGCACATCTTCTGAGCCTCGCTCTCTCATCTGAAAGTGGAGTGTAAGTC
    CAAGAAGATTCATTTAGACAAAGAAGGTGGAAAAAAAGGACTTTCTGGGCCAGCAAGTCG
    GATGACCACCCTCCAAGGGGCAGAGGAGGGCCCATTTTGTGAAGAAGAAATCAACTACCC
    GGAAAACGCCACAGGAGGACATGTTTCTGCAGATGTAGTTGCCCTAGAAACAGAAGAGTA
    TGGGGGTGTGAATGTCTTCTCTTTTGGGGGCAAACACTATGTCCTTTTCTTTTTCTAGAT
    ACAGTTAATTCCTGGAAATTTTAGCGAGTTTGTTCTTGTGGATATTTTGAACAATAAAGA
    GTGAAAATCAAAAAAA
    >Hs.63325_mRNA_5 gi|15451939|ref|NM_019894.1|Homo sapiens transmembrane
    protease, serine 4 (TMPRSS4), mRNA polyA = 3
    CCCAATCACTCCTGGAATACACAGAGAGAGGCAGCAGCTTGCTCAGCGGACAAGGATGCT
    GGGCGTGAGGGACCAAGGCCTGCCCTGCACTCGGGCCTCCTCCAGCCAGTGCTGACCAGG
    GACTTCTGACCTGCTGGCCAGCCAGGACCTGTGTGGGGAGGCCCTCCTGCTGCCTTGGGG
    TGACAATCTCAGCTCCAGGCTACAGGGAGACCGGGAGGATCACAGAGCCAGCATGTTACA
    GGATCCTGACAGTGATCAACCTCTGAACAGCCTCGATGTCAAACCCCTGCGCAAACCCCG
    TATCCCCATGGAGACCTTCAGAAAGGTGGGGATCCCCATCATCATAGCACTACTGAGCCT
    GGCGAGTATCATCATTGTGGTTGTCCTCATCAAGGTGATTCTGGATAAATACTACTTCCT
    CTGCGGGCAGCCTCTCCACTTCATCCCGAGGAAGCAGCTGTGTGACGGAGAGCTGGACTG
    TCCCTTGGGGGAGGACGAGGAGCACTGTGTCAAGAGCTTCCCCGAAGGGCCTGCAGTGGC
    AGTCCGCCTCTCCAAGGACCGATCCACACTGCAGGTGCTGGACTCGGCCACAGGGAACTG
    GTTCTCTGCCTGTTTCGACAACTTCACAGAAGCTCTCGCTGAGACAGCCTGTAGGCAGAT
    GGGCTACAGCAGCAAACCCACTTTCAGAGCTGTGGAGATTGGCCCAGACCAGGATCTGGA
    TGTTGTTGAAATCACAGAAAACAGCCAGGAGCTTCGCATGCGGAACTCAAGTGGGCCCTG
    TCTCTCAGGCTCCCTGGTCTCCCTGCACTGTCTTGCCTGTGGGAAGAGCCTGAAGACCCC
    CCGTGTGGTGGGTGGGGAGGAGGCCTCTGTGGATTCTTGGCCTTGGCAGGTCAGCATCCA
    GTACGACAAACAGCACGTCTGTGGAGGGAGCATCCTGGACCCCCACTGGGTCCTCACGGC
    AGCCCACTGCTTCAGGAAACATACCGATGTGTTCAACTGGAAGGTGCGGGCAGGCTCAGA
    CAAACTGGGCAGCTTCCCATCCCTGGCTGTGGCCAAGATCATCATCATTGAATTCAACCC
    CATGTACCCCAAAGACAATGACATCGCCCTCATGAAGCTGCAGTTCCCACTCACTTTCTC
    AGGCACAGTCAGGCCCATCTGTCTGCCCTTCTTTGATGAGGAGCTCACTCCAGCCACCCC
    ACTCTGGATCATTGGATGGGGCTTTACGAAGCAGAATGGAGGGAAGATGTCTGACATACT
    GCTGCAGGCGTCAGTCCAGGTCATTGACAGCACACGGTGCAATGCAGACGATGCGTACCA
    GGGGGAAGTCACCGAGAAGATGATGTGTGCAGGCATCCCGGAAGGGGGTGTGGACACCTG
    CCAGGGTGACAGTGGTGGGCCCCTGATGTACCAATCTGACCAGTGGCATGTGGTGGGCAT
    CGTTAGCTGGGGCTATGGCTGCGGGGGCCCGAGCACCCCAGGAGTATACACCAAGGTCTC
    AGCCTATCTCAACTGGATCTACAATGTCTGGAAGGCTGAGCTGTAATGCTGCTGCCCCTT
    TGCAGTGCTGGGAGCCGCTTCCTTCCTGCCCTGCCCACCTGGGGATCCCCCAAAGTCAGA
    CACAGAGCAAGAGTCCCCTTGGGTACACCCCTCTGCCCACAGCCTCAGCATTTCTTGGAG
    CAGCAAAGGGCCTCAATTCCTGTAAGAGACCCTCGCAGCCCAGAGGCGCCCAGAGGAAGT
    CAGCAGCCCTAGCTCGGCCACACTTGGTGCTCCCAGCATCCCAGGGAGAGACACAGCCCA
    CTGAACAAGGTCTCAGGGGTATTGCTAAGCCAAGAAGGAACTTTCCCACACTACTGAATG
    GAAGCAGGCTGTCTTGTAAAAGCCCAGATCACTGTGGGCTGGAGAGGAGAAGGAAAGGGT
    CTGCGCCAGCCCTGTCCGTCTTCACCCATCCCCAAGCCTACTAGAGCAAGAAACCAGTTG
    TAATATAAAATGCACTGCCCTACTGTTGGTATGACTACCGTTACCTACTGTTGTCATTGT
    TATTACAGCTATGGCCACTATTATTAAAGAGCTGTGTAACATCAAAAAAAAAAAAAAAAA
    AAAA
    >Hs.250692_mRNA_2 gi|184223|gb|M95585.1|HUMHLF Human hepatic leukemia
    factor (HLF) mRNA, complete cds polyA = 3
    TTTTTCAATTTTGAACATTTTGCAAAACGAGGGGTTCGAGGCAGGTGAGAGCATCCTGCA
    CGTCGCCGGGGAGCCCGCGGGCACTTGGCGCGCTCTCCTGGGACCGTCTGCACTGGAAAC
    CCGAAAGTTTTTTTTTAATATATATTTTTATGCAGATGTATTTATAAAGATATAAGTAAT
    TTTTTTCTTCCCTTTTCTCCACCGCCTTGAGAGCGAGTACTTTTGGCAAAGGACGGAGGA
    AAAGCTCAGCAACATTTTAGGGGGCGGTTGTTTCTTTCTTTCTTATTTCTTTTTTAAGGG
    GAAAAAATTTGAGTGCATCGCGATGGAGAAAATGTCCCGACCGCTCCCCCTGAATCCCAC
    CTTTATCCCGCCTCCCTACGGCGTGCTCAGGTCCCTGCTGGAGAACCCGCTGAAGCTCCC
    CCTTCACCACGAAGACGCATTTAGTAAAGATAAAGACAAAGAAAAGAAGCTGGATGATGA
    GAGTAACAGCCCGACGGTCCCCCAGTCGGCATTCCTGGGGCCTACCTTATGGGACAAAAC
    CCTTCCCTATGACGGAGATACTTTCCAGTTGGAATACATGGACCTGGAGGAGTTTTTGTC
    AGAAAATGGCATTCCCCCCAGCCCATCTCAGCATGACCACAGCCCTCACCCTCCTGGGCT
    GCAGCCAGCTTCCTCGGCTGCCCCCTCGGTCATGGACCTCAGCAGCCGGGCCTCTGCACC
    CCTTCACCCTGGCATCCCATCTCCGAACTGTATGCAGAGCCCCATCAGACCAGGTCAGCT
    GTTGCCAGCAAACCGCAATACACCAAGTCCCATTGATCCTGACACCATCCAGGTCCCAGT
    GGGTTATGAGCCAGACCCAGCAGATCTTGCCCTTTCCAGCATCCCTGGCCAGGAAATGTT
    TGACCCTCGCAAACGCAAGTTCTCTGAGGAAGAACTGAAGCCACAGCCCATGATCAAGAA
    AGCTCGCAAAGTCTTCATCCCTGATGACCTGAAGGATGACAAGTACTGGGCAAGGCGCAG
    AAAGAACAACATGGCAGCCAAGCGCTCCCGCGACGCCCGGAGGCTGAAAGAGAACCAGAT
    CGCCATCCGGGCCTCGTTCCTGGAGAAGGAGAACTCGGCCCTCCGCCAGGAGGTGGCTGA
    CTTGAGGAAGGAGCTGGGCAAATGCAAGAACATACTTGCCAAGTATGAGGCCAGGCACGG
    GCCCCTGTAGGATGGCATTTTTGCAGGCTGGCTTTGGAATAGATGGACAGTTTGTTTCCT
    GTCTGATAGCACCACACGCAAACCAACCTTTCTGACATCAGCACTTTACCAGAGGCATAA
    ACACAACTGACTCCCATTTTGGTGTGCATCTGTGTGTGTGTGCGTGTATATGTGCTTGTG
    CTCATGTGTGTGGTCAGCGGTATGTGCGTGTGCGTGTTCCTTTGCTCTTGCCATTTTAAG
    GTAGCCCTCTCATCGTCTTTTAGTTCCAACAAAGAAAGGTGCCATGTCTTTACTAGACTG
    AGGAGCCCTCTCGCGGGTCTCCCATCCCCTCCCTCCTTCACTCCTGCCTCCTCAGCTTTG
    CTTCATGTTCGAGCTTACCTACTCTTCCAGGACTCTCTGCTTGGATTCACTAAAAAGGGC
    CCTGGTAAAATAGTGGATCTCAGTTTTTAAGAGTACAAGCTCTTGTTTCTGTTTAGTCCG
    TAAGTTACCATGCTAATGAGGTGCACACAATAACTTAGCACTACTCCGCAGCTCTAGTCC
    TTTATAAGTTGCTTTCCTCTTACTTTCAGTTTTGGTGATAATCGTCTTCAAATTAAAGTG
    CTGTTTAGATTTATTAGATCCCATATTTACTTACTGCTATCTACTAAGTTTCCTTTTAAT
    TCTACCAACCCCAGATAAGTAAGAGTACTATTAATAGAACACAGAGTGTGTTTTTGCACT
    GTCTGTACCTAAAGCAATAATCCTATTGTACGCTAGAGCATGCTGCCTGAGTATTACTAG
    TGGACGTAGGATATTTTCCCTACCTAAGAATTTCACTGTCTTTTAAAAAACAAAAAGTAA
    AGTAATGCATTTGAGCATGGCCAGACTATTCCCTAGGACAAGGAAGCAGAGGGAAATGGG
    AGGTCTAAGGATGAGGGGTTAATTTATCAGTACATGAGCCAAAAACTGCGTCTTGGATTA
    GCCTTTGACATTGATGTGTTCGGTTTTGTTGTTCCCCTTCCCTCACACCCTGCCTCGCCC
    CCACTTTTCTAGTTAACTTTTTCCATATCCCTCTTGACATTCAAAACAGTTACTTAAGAT
    TCAGTTTTCCCACTTTTTGGTAATATATATATTTTTGTGAATTATACTTTGTTGTTTTTA
    AAAAGAAAATCAGTTGATTAAGTTAATAAGTTGATGTTTTCTAAGGCCCTTTTTCCTAGT
    GGTGTCATTTTTGAATGCCTCATAAATTAATGATTCTGAAGCTTATGTTTCTTATTCTCT
    GTTTGCTTTTGAACGTATGTGCTCTTATAAAGTGGACTTCTGAAAAATGAATGTAAAAGA
    CACTGGTGTATCTCAGAAGGGGATGGTGTTGTCACAAACTGTGGTTAATCCAATCAATTT
    AAATGTTTACTATAGACCAAAAGGAGAGATTATTAAATCGTTTAATGTTTATACAGAGTA
    ATTATAGGAAGTTCTTTTTTGTACAGTATTTTTCAGATATAAATACTGACAATGTATTTT
    GGAAGACATATATTATATATAGAAAAGAGGAGAGGAAAACTATTCCATGTTTTAAAATTA
    TATAGCAAAGATATATATTCACCAATGTTGTACAGAGAAGAAGTGCTTGGGGGTTTTTGA
    AGTCTTTAATATTTTAAGCCCTATCACTGACACATCAGCATGTTTTCTGCTTTAAATTAA
    AATTTTATGACAGTATCGAGGCTTGTGATGACGAATCCTGCTCTAAAATACACAAGGAGC
    TTTCTTGTTTCTTATTAGGCCTCAGAAAGAAGTCAGTTAACGTCACCCAAAAGCACAAAA
    TGGATTTTAGTCAAATATTTATTGGATGATACAGTGTTTTTTAGGAAAAGCATCTGCCAC
    AAAAATGTTCACTTCGAAATTCTGAGTTCCTGGAATGGCACGTTGCTGCCAGTGCCCCAG
    ACAGTTCTTTTCTACCCTGCGGGCCCGCACGTTTTATGAGGTTGATATCGGTGCTATGTG
    TTTGGTTTATAATTTGATAGATGTTTGACTTTAAAGATGATTGTTCTTTTGTTTCATTAA
    GTTGTAAAATGTCAAGAAATTCTGCTGTTACGACAAAGAAACATTTTACGCTAGATTAAA
    ATATCCTTTCATCAATGGGATTTTCTAGTTTCCTGCCTTCAGAGTATCTAATCCTTTAAT
    GATCTGGTGGTCTCCTCGTCAATCCATCAGCAATGCTTCTCTCATAGTGTCATAGACTTG
    GGAAACCCAACCAGTAGGATATTTCTACAAGGTGTTCATTTTGTCACAAGCTGTAGATAA
    CAGCAAGAGATGGGGGTGTATTGGAATTGCAATACATTGTTCAGGTGAATAATAAAATCA
    AAAACTTTTGCAATCTTAAGCAGAGATAAATAAAAGATAGCAATATGAGACACAGGTGGA
    CGTAGAGTTGGCCTTTTTACAGGCAAAGAGGCGAATTGTAGAATTGTTAGATGGCAATAG
    TCATTAAAAACATAGAAAAATGATGTCTTTAAGTGGAGAATTGTGGAAGGATTGTAACAT
    GGACCATCCAAATTTATGGCCGTATCAAATGGTAGCTGAAAAAACTATATTTGAGCACTG
    GTCTCTCTTGGAATTAGATGTTTATATCAAATGAGCATCTCAAATGTTTTCTGCAGAAAA
    AAATAAAAAGATTCTAATAAAAAAA
    >Hs.250726_singlet4 AW298545 polyA = 2 polyA = 3
    TTCCTTCCCTCCCTCCNTTCCTCAGGAGCCGCCAGTCCCCAAGTTGGCTGTGGTTGGGCA
    CCTGGTTTGGGTCCTGCAGAGCTGGGCTCAGGCCCTGGGCTCTGAACCTGTGAACCCTTG
    CTGTGTTACGAAACTTTCCTTCCTCTGAGGGCCTTGAACCCTCTCCTTTTCTTCTTTTGG
    GGGTGGGGGTTAACTTTATTTTCTCTTCCCTGTATCTGCCTCTCCCTTCCCTCAATTTCC
    TGTTTTAAAACTGAATGGCACGAAATTGTTTTCCTCAACTCGGAGATTCCTGTATGGAGA
    GAATCAATTTCTATATTTGCAATAAATTTCTTATTTAAAGCTAAAAAAAAAAAAAAAAA
    >Hs.79217_mRNA_2 gi|16306657|gb|BC001504.1|BC001504 Homo sapiens clone
    MGC:2273 IMAGE:3505512 polyA = 3
    GGCACGAGGGCCATCTGTGGGGGCTTTGGGCCAGGGGTCTCCGGACAGCATGAGCGTGGG
    CTTCATCGGCGCTGGCCAGCTGGCTTTTGCCCTGGCCAAGGGCTTCACAGCAGCAGGCGT
    CTTGGCTGCCCACAAGATAATGGCTAGCTCCCCAGACATGGACCTGGCCACAGTTTCTGC
    TCTCAGGAAGATGGGGGTGAAGTTGACACCCCACAACAAGGAGACGGTGCAGCACAGTGA
    TGTGCTCTTCCTGGCTGTGAAGCCACACATCATCCCCTTCATCCTGGATGAAATAGGCGC
    CGACATTGAGGACAGACACATTGTGGTGTCCTGCGCGGCCGGCGTCACCATCAGCTCCAT
    TGAGAAGAAGCTGTCAGCGTTTCGGCCAGCCCCCAGGGTCATCCGCTGCATGACCAACAC
    TCCAGTCGTGGTGCGGGAGGGGGCCACCGTGTATGCCACAGGCACGCACGCCCAGGTGGA
    GGACGGGAGGCTCATGGAGCAGCTGCTGAGCAGCGTGGGCTTCTGCACGGAGGTGGAAGA
    GGACCTGATTGATGCCGTCACGGGGCTCAGTGGCAGCGGCCCCGCCTACGCATTCACAGC
    CCTGGATGCCCTGGCTGATGGGGGCGTGAAGATGGGACTTCCAAGGCGCCTGGCAGTCCG
    CCTCGGGGCCCAGGCCCTCCTGGGGGCTGCCAAGATGCTGCTGCACTCAGAACAGCACCC
    AGGCCAGCTCAAGGACAACGTCAGCTCTCCTGGTGGGGCCACCATCCATGCCTTGCATGT
    GCTGGAGAGTGGGGGCTTCCGCTCCCTGCTCATCAACGCTGTGGAGGCCTCCTGCATCCG
    CACACGGGAGCTGCAGTCCATGGCTGACCAGGAGCAGGTGTCACCAGCCGCCATCAAGAA
    GACCATCCTGGACAAGGTGAAGCTGGACTCCCCTGCAGGGACCGCTCTGTCGCCTTCTGG
    CCACACCAAGCTGCTCCCCCGCAGCCTGGCCCCAGCGGGCAAGGATTGACACGTCCTGCC
    TGACCACCATCCTGCCACCACCTTCTCTTCTCTTGTCACTAGGGGGACTAGGGGGTCCCC
    AAAGTGGCCCACTTTCTGTGGCTCTGATCAGCGCAGGGGCCAGCCAGGGACATAGCCAGG
    GAGGGGCCACATCACTTCCCACTGGAAATCTCTGTGGTCTGCAAGTGCTTCCCAGCCCAG
    AACAGGGGTGGATTCCCCAACCTCAACCTCCTTTCTTCTCTGCTCCCAAACCATGTCAGG
    ACCACCTTCCTCTAGAGCTCGGGAGCCCGGAGGGTCTTCACCCACTCCTACTCCAGTATC
    AGCTGGCACGGGCTCCTTCCTGAGAGCAAAGGTCAAGGACCCCCTCTGTGAAGGCTCAGC
    AGAGGTGGGATCCCACGCCCCCTCCCGGCCCCTCCCTGCCCTCCATTCAGGGAGAAACCT
    CTCCTTCCCGTGTGAGAAGGGCCAGAGGGTCCAGGCATCCCAAGTCCAGCGTGAAGGGCC
    ACAGCCCCTCTTGGCTGCCAAGCACGCAGATCCCATGGACATTTGGGGAAAGGGCTCCTT
    GGGCTGCTGGTGAACTTCTGTGGCCACCACCTCCTGCTCCTGACCTCCCTGGGAGGGTGC
    TATCAGTTCTGTCCTGGCCCTTTCAGTTTTATAAGTTGGTTTCCAGCCCCCAGTGTCCTG
    ACTTCTGTCTGCCACATGAGGAGGGAGGCCCTGCCTGTGTGGGAGGGTGGTTACTGTGGG
    TGGAATAGTGGAGGCCTTCAACTGATTAGACAAGGCCCGCCCACATCTTGGAGGGCATCT
    GCCTTACTGATTAAAATGTCAATGTAATCTAAAAAAAAAAAAAAAAAA
    >Hs.47986_mRNA_1 gi|13279253|gb|BC004331.1|BC004331 Homo sapiens
    MGC:10940 IMAGE:3630835 polyA = 3
    GATAAATGCGGAGGGACGGTCCAGCTTTAGCTCTCTGCTCGCCGCCGCCGCTGTCGCCGC
    CACCTCCTCTGATCTACGAAAGTCATGTTACCCAACACCGGGAGGCTGGCAGGATGTACA
    GTTTTTATCACAGGTGCAAGCCGTGGCATTGGCAAAGCTATTGCATTGAAAGCAGCAAAG
    GATGGAGCAAATATTGTTATTGCTGCAAAGACCGCCCAGCCACATCCAAAACTTCTAGGC
    ACAATCTATACTGCTGCTGAAGAAATTGAAGCAGTTGGAGGAAAGGCCTTGCCATGTATT
    GTTGATGTGAGAGATGAACAGCAGATCAGTGCTGCAGTGGAGAAAGCCATCAAGAAATTT
    GGAGCTTATACCATTGCTAAGTATGGTATGTCTATGTATGTGCTTGGAATGGCAGAAGAA
    TTTAAAGGTGAAATTGCAGTCAATGCATTATGGCCTAAAACAGCCATACACACTGCTGCT
    ATGGATATGCTGGGAGGACCTGGTATCGAAAGCCAGTGTAGAAAAGTTGATATCATTGCA
    GATGCAGCATATTCCATTTTCCAAAAGCCAAAAAGTTTTACTGGCAACTTTGTCATTGAT
    GAAAATATCTTAAAAGAAGAAGGAATAGAAAATTTTGACGTTTATGCAATTAAACCAGGT
    CATCCTTTGCAACCAGATTTCTTCTTAGATGAATACCCAGAAGCAGTTAGCAAGAAAGTG
    GAATCAACTGGTGCTGTTCCAGAATTCAAAGAAGAGAAACTGCAGCTGCAACCAAAACCA
    CGTTCTGGAGCTGTGGAAGAAACATTTAGAATTGTTAAGGACTCTCTCAGTGATGATGTT
    GTTAAAGCCACTCAAGCAATCTATCTGTTTGAACTCTCCGGTGAAGATGGTGGCACGTGG
    TTTCTTGATCTGAAAAGCAAGGGTGGGAATGTCGGATATGGAGAGCCTTCTGATCAGGCA
    GATGTGGTGATGAGTATGACTACTGATGACTTTGTAAAAATGTTTTCAGGGAAACTAAAA
    CCAACAATGGCATTCATGTCAGGGAAATTGAAGATTAAAGGTAACATGGCCCTAGCAATC
    AAATTGGAGAAGCTAATGAATCAGATGAATGCCAGACTGTGAAGGAAAATATAAAAAAAA
    AGTCGACTGCTATGCTCAAAAAGTAAAAAAAGCTCAACAGTTAAAATCTAATGTTTGTTT
    TCTTTCCTGTTATATTATAAGGATATGCACGTTTGTTCTGGAAAAGATAGAATTTGTCTC
    TAAAAGACTTGAAATTGTAATTAAAATGGCAAGCTAATCAAACATAAGCTTCATTAAGTG
    GGATTCTAAGACAGTCTGTGTTTTTATATTTCAAGGGTTTAACCCTTTGAGCCTTACATC
    TCATTCACTGTCTTTCTCCAAGAAAAGTATTTTGGGCGGACAGTCAGATCAAGCAGTAAA
    ATTAGCTCTTTCAAATCTTCTTGTCATGTAAAATGAAGCTAGTCTGTTTTAAAATTTTTA
    GTTTTGGATTGTATACTAATGAAAATCTTAATGATGTTTTTGATTTTTATATACTTATTT
    TAAAGAAAATCTTATATAGTACATTTTACAAAAATTATAAAAAATGAATTAGTACTGGCG
    AGGACTAAATGAAACAATAATTTTTCATTTTGATAACTAGCTTTCCAGGTGGACTTAGCC
    ATAGGAAAATATTACTAATGTAATTTAACAAATTGCTGCATGTATTCCATTTAAAAATAT
    GTTTAAATTGTCCTAAAACAAAATAATTTTCTCCCTAGGAGTATGCATTTGGCTACAGTG
    TTTTGAAACAGAAACCTTAGAATAGGTCATTGGTATGGGCTGAACTGTGTATCCCCCAAT
    TCATTTGTTGAGGTCCTAACTCCCATTTCTTTTGAATGTGACTGTTCGGAGATGAGGCCT
    TTAAAGAGGTGACTTAAGTTCAAAGGAGGCTGTTAGTCTAATCCAACATGGTGTCCTTTG
    GACATAAGAGATACCAGCAATGTGTGCACAGAACAAAGACCAGGAGAGGACACAGTGAGA
    AGGCAGTTATCTGCAAGCAAAGAGAGAGGCTTCAGAAGAAACAAAATCACCAGCACCTTG
    ATCTTTGACTTCTAATCTCCAGAATAGTGAGAAATAAATTTCTGTTGTTAAGCCGTCCAC
    TGTGGGAGGCCGACGCAGGAGGATTGCTTGAGGCCAGGAGTTCAAGGCCAGCCTGGACAA
    CATAGTAAGACCCTATCTCTACCCCCCTAATAAATTAATTTAAAAAGCCCCCCAATCTGT
    GGTATTTTATTATGGCAGCCCTAGCAAGCTAATACAGTGGTTTGAGAGGCTGGGAGGGTT
    GAGGGGAAGATAAACTTTTAAAAAGCTCTTATCTTTCATTTCAATCAGTTAAAAATACTT
    GCTCAGTGTAACAATTTTGCTTCTCAGCTTCCACTCTAATATTGTTGTGCCATTAAGCAA
    TTTAGCTAATCCTGACATTTCTTAGATTCATAATGTTAGGAGCATTTAATCTGTATTTTA
    CAAGTTAGGAAGCAGAGGATCAGAGATGGGAAAGGACTAGCCCAAGGCCAACATTAACAA
    GCCCTCTAACAAAAACTTTACAATACATTTATGTTGAATGGAACTCCAAGATCTCACCTC
    TCCATCCAGGAATGGAGTCCATGTAATCAAAGTGAACTTAAAAATAGGACAGTTTCAACA
    AGTCAGGAGATTCACAGCAACTGATCAAAGGGAGTCCAGTCAACGTGAGCAAGCGTGATT
    ATGATGAGGAAGCCCCCTCTGCTTTAATCCACACAAGGAACGTAACCTGAAGTAACCTGA
    TGTTAACCAATCTGCTGTGTCTACTATGCTGTTTCCTTGTTCCTGCTAGTGCTGCTTTAC
    AAATGCAGACCATTCTATCATACCTGGCAGGGCTTCTGTTTTATTTTGTAGGCTGGATGC
    TACCCAGTTCATGAATCGCTAATAAAAGCCAATTAGATCTTTAAAAAAAAAAAAAAAAAA
    AAA
    >Hs.94367_mRNA_1 gi|10440200|dbj|AK027147.1|AK027147 Homo sapiens cDNA
    FLJ23494 fis, clone LNG01885 polyA = 3
    TATTAAAAGTACCCCATGGATGGACCTCCAAATGAGTTTAGGGTAATTGCGCTTAAAATA
    TTAGGACCAAAGTACATTTATTTTATAGATGGAGGAGGGGAGGAGACGAGTGGGGACCAG
    CTTGACATCCAGTCTTCACCTGGACATATGGAAAGAACAAATGTGCGATCTGCTCGTTCC
    CTCTGAAGGTCTCTGTTACGTATTTCCTCCTCTCCTCCAGAGCATAATAACCAATGACTG
    CTCTCAGAAAGGTACTGTGACCACCACTTGCTTGGCTCTCCAACTTCCTCCCCCATTTCC
    CTCTTGACTCCTGTTTGCCATAACACCTTCTGTCCCCTAGCCTTGCCTCAGGTCCCCGAC
    GAATCCTGCCCTTAATCTGTGGGGGTGGTAGGTGGCACTGGTTTGAAGAGCTTACTGGAT
    CTCCCTCAGTGAGTCAGCCTGGAGTTGTGTTTGAAAACCACAGGCCCTGACTGTGGCTGT
    AAGACCTCCCAGACACCACCTGCTGCTGCCTATCATCATCTTCAGGTGCTGGGCTCCCCT
    GTGGGCCTCGTCTGCCCGCCCTCTGCTGCAGCTGTCCCATGGGCGCCCGCCCTCTCTGAC
    ACCACAAGAGAGCCCATCTAGATTCCAGGAAAAAACTCATCTTTATTTGCCTTCTTCCCA
    CTGAAGGTAAAAGCAACATTAATAACCACAACAAATACTTAGTGAGTGCTTACTATTATT
    CATTTAATTGTAGGCCCTTCCATCCCTGGCCATGATGAGAGACATGCCATAGCTTACTCC
    TAAAGAGACCTGAGGACACACGTGCACAAACATATTGGGCATATCATCAATGGCATCAAA
    ACTGATTTTCCCTGTCTACCCAGAACAGGCCTGAGGGAGAGGGAAAAGCGGATACCCACC
    TGTGTCGCTGTTTGCGTGCCAAGTCCAGGAACAGTCCATACAGCCCTGCTGCATCCCACG
    ACGCTGTCACAAAGCAGGAGTTCATCCGAGGCCAAGGTATGGAGAAACTGAGGCCCAGAA
    ATTGATGTCCAGAATGCTTTGCTCTTAGCCACTGTACTATTATGGCATATTTTATCTTTA
    TGTATTGCATCATTTCATGGATTCAAGTTTATCAATGTCCTTTGACAAGTTTAAAAATCT
    GTCTGCTAAAATCTATCAAATACATTAAGGAAAAGTCCCACTTGGCACATCTCCCACACC
    AGATGTTAATTATTCATACTGCATGACTGAGGATTTTGGAGGCAGAGAGAGATTCATCTG
    CAATATTTGGAACACCAATGGAGGTCTATGTCAACACAGAATTTATACAGCAGCTGGTGC
    TAGTCAGAGCTAATGACAGAATTTCAGTTTAATAAAAAGACCCCCAACTGAGCACACCAT
    CTTGAAAAAAGTATACTTATCAAACAGCTTTCAATCAGTTCAAGAGAGACACCTTAATTG
    GGGAGAGGAAGAATTGCAGAGTAGTTTGTAATCATGCCAATTCCAGATCAATAACTGCAT
    GTCTGTTCTTTGGTAGAAATAGCTTTTGCTTTATATTAAGTAATCACATATATATTCTCT
    CTATTTGGATAAGGAAACCTTCGCTTTATTTGACAATGTATAATGATATACTCTTCTAAT
    TCACCTCTGTGTCTTCACAATAAACATGAGTAAAATTTAGACAAGTGATGGTAAAGGTCA
    ATATAATTATTTATTTTTAAAATAAATTTTGTATCTAACAGGAAAGCAGTTCTTATGAAA
    TTTTTATATTTTCAAAAATTGTTTTGTTCAAATAAAATTTTATGAGTAAAGTTAAAAAAA
    AAAAAAAAAAAAAAAAAAAAAAAA
    >Hs.49215_contig1
    BI493248|N66529|AA452255|BI492877|AW196683|AI963900|BF478125|AI421654|BE466
    675 polyA = 1 polyA = 1
    GGGTACCTGGTGGGGCCAATCACCGAGCCATGAACATCAGTAACGTACTCTAAAGACCAA
    GGCTACGATGGCTATGATGGTCAGAATTACTACCACCACCAGTGAAGCTCCAGCCTGGGA
    TGAATTCATCCATTCTGGCTTTGCATCCGGCTACCATTTTCGAAGTTCAACTCAGGAAGG
    TGCAATATAACAAATGTGCATATTATAATGAGGAATGGTACTACCGTTCCAGATTTTCTG
    TAATTGCTTCTGCAAAGTAATAGGCTTCTTGTCCCTTTTTTTTCTGGCATGTTATGGAAT
    GATCATTGTAAATCAGGACCATTTATCAAGCAGTACACCAACTCATAAGATCAAATTTCA
    TTGAATGGTTTGAGGTTGTAGCTCTATAAATAGTAGTTTTTAACATGCCTGTAGTATTGC
    TAACTGCAAAAACATACTCTTTGTACAAGAAGTGCTTCTAAGAATTTCATTGACATTAAT
    GACACTGTATACAATAAATGTGTAGTTTCTTAATCGCACTACCTATGCAACACTGTGTAT
    TAGGTTTATCATCCTCATGTATTTTTATGTGACCTGTATGTATATTCTAATCTACGAGTT
    TTATCACAAATAAAAATGCAATCCTTCAAA
    >Hs.281587_contig2
    R61469|R15891|AA007214|R61471|AI014624|N69765|AW592075|H09780|AA709038|AI33
    5898|AI559229|F09750|R49594|H11055|T72573|AA935558|AA988654|AA826438|AI0024
    31|AI299721 polyA = 1 polyA = 2
    AAGGTGGGCTTTCATTGTGATTTTTGTTCTGTTGCAGTAATATAGGAGCACATTTTGGCC
    ATTGTAATTACAGGGAACAAAGGGATTGCGGACACATATCTGGACTTCTTTTCCTCCCTT
    ATTGTTGTGGAAGAGACACTAGAAATGCTCAAACACCTGCAATATACAGAATATACACAA
    TTTTATTCCAGTATTTCCCTAACATATGGTTTAAAATTATTCCAGGTATACAGTGTATGC
    AATTCTGCATTATCACAGAGGAACAACTTCTTTTTTAAAAAATAAATAGGTCAGCCATTT
    TTATTAACGTGCAAAAACTTTATCACTCTAACATGCTCTAGGTAGTTGAGGAAAAGAGGT
    CTGATCACTGTTTGTATTTTATTTTCTTTGTGGGAACATTTCACCTGCTGAGTGTACATG
    AATTTGCTTTCTATAAAAGGCTTTTATGAGTTTACAGTAGAATCAGTGGAAGGAAGAGTT
    AATAAGGGCTGTTTTTAAAAAAACAAACAAACAAACAAAACAAATAATTAAAAAAAAATT
    TTACATTCCTTCCTATTCTCTAACTACACTTGGGAAGTGCACTTCAGATAAGTTTGCAGT
    GTGACTGAGAGATGAAGGAAATCCATAGAAAAGGTCCTCTTAGTGAACAAAATTTAGTTA
    TTAACTTTATAGCTATGAAATTTCCCCGGGCATTTGTTTTTGTTCAAACAGACTTTAACC
    TCTGCATCATACTTAACCCTGCGACATGCGTACAGTATGCATATTTTGTTTTGAAAAAAA
    ATGTTTCGTTCCAGTCTGTTAAGAATATTCAAAAATAATAAAGGTATTGCTTAATAAAAT
    TGCTAGAATTGTTTAGCAGTACATGCACAATATTTTACTAGATTCTTTGTTTTAATAGTG
    TTTTGTTGAGACTGAAAATCTTAAAATGGTCTGCGCAAATACAAAAAAAAAGAAAACACC
    AAAAAAAAAA
    >Hs.79378_mRNA_1 gi|16306528|ref|NM_003914.2|Homo sapiens cyclin A1
    (CCNA1), mRNA polyA = 3
    GGTGTTGTTCCGGACACATAGAAAGATAACGACGGGAAGAGCGGGGCCCGCTTTGGGGTC
    CAGGCAGGTTTTGGGGCCTCCTGTCTGGTGGGAGGAGGCCGCAGCGCAGCACCCTGCTCG
    TCACTTGGGATGGAGACCGGCTTTCCCGCAATCATGTACCCTGGATCTTTTATTGGGGGC
    TGGGGAGAAGAGTATCTCAGCTGGGAAGGACCGGGGCTCCCAGATTTCGTCTTCCAGCAG
    CAGCCCGTGGAGTCTGAAGCAATGCACTGCAGCAACCCCAAGAGTGGAGTTGTGCTGGCT
    ACAGTGGCCCGAGGTCCCGATGCTTGTCAGATACTCACCAGAGCCCCGCTGGGCCAGGAT
    CCCCCGCAGAGGACAGTGCTAGGGCTGCTAACTGCAAATGGGCAGTACAGGAGGACCTGT
    GGCCAGGGGATCACAAGAATCAGGTGTTATTCTGGATCAGAAAATGCCTTCCCTCCAGCT
    GGAAAGAAAGCACTCCCTGACTGTGGGGTCCAAGAGCCCCCCAAGCAAGGGTTTGACATC
    TACATGGATGAACTAGAGCAGGGGGACAGAGACAGCTGCTCGGTCAGAGAGGGGATGGCA
    TTTGAGGATGTGTATGAAGTAGACACCGGCACACTCAAGTCAGACCTGCACTTCCTGCTG
    GATTTCAACACAGTTTCCCCTATGCTGGTAGATTCATCTCTCCTCTCCCAGTCTGAAGAT
    ATATCCAGTCTTGGCACAGATGTGATAAATGTGACTGAATATGCTGAAGAAATTTATCAG
    TACCTTAGGGAAGCTGAAATAAGGCACAGACCCAAAGCACACTACATGAAGAAGCAGCCA
    GACATCACGGAAGGCATGCGCACGATTCTGGTGGACTGGCTGGTGGAGGTTGGGGAAGAA
    TATAAACTTCGAGCAGAGACCCTGTATCTGGCTGTCAACTTCCTGGACAGGTTCCTTTCA
    TGTATGTCTGTTCTGAGAGGGAAACTGCAGCTCGTAGGAACAGCAGCTATGCTTTTGGCT
    TCGAAATATGAAGAGATATATCCTCCTGAAGTAGACGAGTTTGTCTATATCACCGATGAT
    ACATACACAAAACGACAACTGTTAAAAATGGAACACTTGCTTCTGAAAGTTCTAGCTTTT
    GATCTGACAGTACCAACCACCAACCAGTTTCTCCTTCAGTACTTGAGGCGACAAGGAGTG
    TGCGTCAGGACTGAGAACCTGGCTAAGTACGTAGCAGAGCTGAGTCTACTTGAAGCAGAT
    CCATTCTTGAAATATCTTCCTTCACTGATAGCTGCAGCAGCTTTTTGCCTGGCAAACTAT
    ACTGTGAACAAGCACTTTTGGCCAGAAACCCTTGCTGCATTTACAGGGTATTCATTAAGT
    GAAATTGTGCCTTGCCTGAGTGAGCTTCATAAAGCGTACCTTGATATACCCCATCGACCT
    CAGCAAGCAATTAGGGAGAAGTACAAGGCTTCAAAGTACCTGTGTGTGTCCCTCATGGAG
    CCACCTGCAGTTCTTCTTCTACAATAAGTTTCTGAATGGAAGCACTTCCAGAACTTCACC
    TCCATATCAGAAGTGCCAATAATCGTCATAGGCTTCTGCACGTTGGATCAACTAATGTTG
    TTTACAATATAGATGACATTTTAAAAATGTAAATGAATTTAGTTTCCCTTAGACTTTAGT
    AGTTTGTAATATAGTCCAACATTTTTTAAACAATAAACTGCTTGTCTTATGACAAAAAAA
    AAA
    >Hs.156469_contig2
    AI341378|AI670817|AI701687|AI335022|AW235883|AI948598|AA446356 polyA = 2
    polyA = 3
    TCCAAGCCATTAAGGACTGTGGAACTTGCTATGATCATGGACGTGCTGTATGGTGGCGTT
    TGTTATGCAGGAATTGATACAGATCCTGAGCTAAAATACCCAAAAGGTGCTGGGCGAGTT
    GCTTTCTCCAATCAGCAGAGCTATATTGCTGCCATTAGTGCTCGGTTTGTTCAGCTTCAG
    CATGGTGATATTGATAAACGTGTGGAGGTAAAGCCATATGTGCTAGATGACCAGATGTGT
    GATGAATGCCAGGGCGCACGCTGTGGTGGAAAATTTGCTCCCTTTTTTTGTGCCAATGTC
    ACTTGCCTGCAGTATTACTGTGAGTTTTGTTGGGCAAATATCCACTCTCGTGCTGGACGT
    GAGTTCCATAAGCCATTGGTAAAGGAAGGTGCTGATCGCCCACGTCAGATCCACTTCCGC
    TGGAACTAAGAATAGCAAACTGGCCTCTGTTTAACAAGGAAAGAAAGGGTGCATGTGGCT
    TACTGTGTCTGAAGATACTGACATGCAGAAGAAATAAGTGCATTCTTCTGCTTTTCACCC
    CAGCTATCAATACATGCATCTTTATCAGCAGCCAAAACACTACAAGCCTCTTGTTTTTCA
    CCAAAACCCTACATCTCAGGCTTACTAATTTTTGTGATATTTTCATGTTCAAATAAAATG
    TTTTTTTGTATTTTCAAAAAAAAAAAAAAAAAAAAAA
    >Hs.6631_mRNA_1 gi|7020430|dbj|AK000380.1|AK000380 Homo sapiens cDNA
    FLJ20373 fis, clone HEP19740 polyA = 3
    CTCGATGTAGAGGGGTTGGTAGCAGACAGGTGGTTACATTAGAATAGTCACACAAACTGT
    TCAGTGTTGCAGGAACCTTTTCTTGGGGGTGGGGGAGTTTCCCTTTTCTAAAAATGCAAT
    GCACTAAAACTATTTTAAGAATGTAGTTAATTCTGCTTATTCATAAAGTGGGCATCTTCT
    GTGTTTTAGGTGTAATATCGAAGTCCTGGCTTTTCTCGTTTTCTCACTTGCTCTCTTGTT
    CTCTGTTTTTTTAAACCAATTTTACTTTATGAATATATTCATGACATTTGTAATAAATGT
    CTTGAGAAAGAATTTGTTTCATGGCTTCATGGTCATCACTCAAGCTCCCGTAAGGATATT
    ACCGTCTCAGGAAAGGATCAGGACTCCATGTCACAGTCCTGCCATCTTACTTTCCTCTTG
    TCGAGTTCTGAGTGGAAATAACTGCATTATGGCTGCTTTAACCTCAGTCATCAAAAGAAA
    CTTGCTGTTTTTTAGGCTTGATCTTTTTCCTTTGTGGTTAATTTTCCTGTATATTGTGAA
    AATGGGGGATTTTCCCTCTGCTCCCACCCACCTAAACACAGCAGCCATTTGTACCTGTTT
    GCTTCCCATCCCACTTGGCACCCACTCTGACCTCTTGTCAGTTTCCTGTTCCTGGTTCCA
    TCTTTTTGAAAAAGGCCCTCCTTTGAGCTACAAACATCTGGTAAGACAAGTACATCCACT
    CATGAATGCAGACACAGCAGCTGGTGGTTTTGTGTATACCTGTAAAGACAAGCTGAGAGG
    CTTACTTTTTGGGGAAGTAAAAGAAGATGGAAATGGATGTTTCATTTGTATGAGTTTGGA
    GCAGTGCTGAAGGCCAAAGCCGCCTACTGGTTTGTAGTTAACCTAGAGAAGGTTGAAAAA
    TTAATCCTACCTTTAAAGGGATTTGAGGTAGGCTGGATTCCATCGCCACAGGACTTTAGT
    TAGAATTAAATTCCTGCTTGTAATTTATATCCATGTTTAGGCTTTTCATAAGATGAAACA
    TGCCACAGTGAACACACTCGTGTACATATCAAGAGAAGAAGGAAAGGCACAGGTGGAGAA
    CAGTAAAAGGTGGGCAGATGTCTTTGAAGAAATGCTCAATGTCTGATGCTAAGTGGGAGA
    AGGCAGAGAACAAAGGATGTGGCATAATGGTCTTAACATTATCCAAAGACTTGAAGCTCC
    ATGTCTGTAAGTCAAATGTTACACAAAAAAAAATGCAAATGGTGTTTCATTGGAATTACC
    AAGTGCTTAGAACTTGCTGGCTTTCCCATAGGTGGTAAAGGGGTCTGAGCTCACACCGAG
    TTGTGCTTGGCTTGCTTGTGCAGCTCCAGGCACCCGGTGGGCACTCTGGTGGTGTTTGTG
    GTGAACTGAATTGAATCCATTGTTGGGCTTAAGTTACTGAAATTGGAACACCCTTTGTCC
    TTCTCGGCGGGGGCTTCCTGGTCTGTGCTTTACTTGGCTTTTTTCCTTCCCGTCTTAGCC
    TCACCCCCTTGTCAACCAGATTGAGTTGCTATAGCTTGATGCAGGGACCCAGTGAAGTTT
    CTCCGTTAAAGATTGGGAGTCGTCGAAATGTTTAGATTCTTTTAGGAAAGGAATTATTTT
    CCCCCCTTTTACAGGGTAGTAACTTCTCCACAGAAGTGCCAATATGGCAAAATTACACAA
    GAAAACAGTATTGCAATGACACCATTACATAAGGAACATTGAACTGTTAGAGGAGTGCTC
    TTCCAAACAAAACAAAAATGTCTCTAGGTTTAGTCAGAGCTTTCACAAGTAATAACCTTT
    CTGTATTAAAATCAGAGTAACCCTTTCTGTATTGAGTGCAGTGTTTTTTACTCTTTTCTC
    ATGCACATGTTACGTTGGAGAAAATGTTTACAAAAATGGTTTTGTTACACTAATGCGCAC
    CACATATTTATGGTATATTTTAAGTGACTTTTTATGGGTTATTTAGGTTTTCGTCTTAGT
    TGTAGCACACTTACCCTAATTTTGCCAATTATTAATTTGCTAAATAGTAATACAAATGAC
    AACTGCATTAAATTTACTAATTATAAAAGCTGCAAGCAGACTGGTGGCAAGTACACAGCC
    CTTTTTTTTGCAGTGCTAACTTGTCTACTGTGTATTATGAAAATTACTGTTGTCCCCCCA
    CCCTTTTTTCCTTAAATAAAGTAAAAATGACACCCTAAAAAAAAAAAAAAAAAAAAAAAA
    AAAAAAA
    >Hs.155977_contig1|AI309080|AI313045 polyA = 1 WARN polyA = 1
    TATACGGCTGCTAGAAGACGACAGAAGGTGGCTTGGGGGTGGATATCTTTGGGTTGCTGG
    AAAAGGTGTGGGAAGGTTCAGGATGGTGGGAGGGACTGAGGTCCCTGAGGTGAAGAGGCC
    CTTGGTCCTGACGGGTTTGACCCGTGCCTGGACCCTTGGAGCAGTGTTGTGTGAACTTGC
    CTAGAACTCTGCCTTCTCCGTTGTCAATAAAGCCTCCCCCTCATGACCTAAAAAAAAAAA
    AAAAAAAAAAAAAAAAAGTCGTATCGA
    >Hs.95197_mRNA_4 gi|5817138|emb|AL110274.1|HSM800829 Homo sapiens mRNA;
    cDNA DKFZp564I0272 (from clone DKFZp564I0272) polyA = 3
    GAGCAGGAAAATATATACCCTAAACAGAAACTCTTACTTGTTTTATGAGCAAGTCTGAGT
    GAGTCCTAAAATGGCTGGCGAAGAGCTACCAATACTGACTGACAGGTCACCTTAAAGCCT
    CTAGGTGTGCCAAGTTTGATTTATCTTAGGGACTAGAACCTAGTCTTCTAAATGTGATTT
    TGCCTTGCTGTTTCGTCCTGATGTGAAGGTAACCACACAGAGAGATTGGGCTGCATCAGT
    AATGATATGCATACCTTTCGTGCATCAGTGAGCTTCTTCCCTGTTAACTGTATGACCACA
    AAATTTAGCTGGAGTAAATAAATATGCGACAGAAATCCTGGAACAAGATGGTGAAATTGC
    TTAAGAATCGAGACTTCAGGGCTCAATGACCTCTGAGCATGTTTCCCAAAGTGTGACCCA
    CATGACCATCTGTCTCTCAGTCTCCTGGTCCCTCCGTAGAGCTTCTGAAACTGAATCTTT
    GTGGGGTGGGGGTAGCGTTCAAGAATCAAAAGTTGAACCAAGCTCTTTGGGTGATACTTA
    TGTATACTGAGGTTCAGGAACTGCTGGAGAGATGACTGGGCACCAAGAGGATGACAGTGA
    CTCAGCTGGCATCCCTTAGCTGGTTCATGGCAGAGCTGAGTGGGCACTCCTGTCTCTGAC
    CCCAGCTTCAGTGCTCTTTATCTCCTCCATGCCTCCTCAGTCGTGCTGCTCTAAGACTGC
    TTACTGGCTTTCCTTCATGTCCTGGGCACAGAGCAGTTCTTTTGGTAGCAGATTTGAGTC
    CACTTCCCCCGTGCACAGATCACTGCTCAGGACCCAGAGAGGAGCAGCTCTGCTCCAGCA
    GGGTTTTCCATTGCATCACACACCCAAACGGTAGGATCCAACAGTCACACTTGAAAGCAA
    CCATAATTGTGAGGTTTCTGATGCTGTAGACTTCCTTACATTTCTCACAACCTAGTTAGA
    GAGTCACATGGGGGTGAAGTGTGGCTCGCGACCTGCCCCAACAAGTGCGTGCAGAAGCCA
    GGAAACAAAGGAGTAAATTCACTTCAAATGGGATGCACATGGTGTCCGTGATGAAGAGAC
    ACATTCAGAATTGCCCAAGGACAGGAAAATGACCAGAGAGAGCCAGAGCTGAGCTGGTAA
    TAAAGAGACTCCGAGACTGAGTGGAGTTAATGAGGGAAGCATGCAACGAGTGGGGCAATT
    TCAGTTGGTTTCTCTCATTGCTTTAAGCGAAATGAACTATACGGACAGGAGAACAGCCTG
    CTTGCCCCAGTCTCTCCTTGGCCGCCCTCTGTTGTCCCTGTCAACTCAGGTGCCCACGGT
    GCTCAGAGGAGGTGCTGGCAAAGCCCCTGGAGCCTTATGTAGGCCATGGGGGCTCCTAAA
    AGGAACCTGAATGAATCATTTACAGCAGGTCTCTCTTGTAAAGCCCAGCCACAGTAACTC
    GTACACTGACTGTTTCAAAAGACAGCCTTTCTTAATCATTTAATTGTTTCATATTCAAAT
    ATATCTCCTAATTGTTTTTATTTTTTCCTGATCTAGAAGATATGACAACAGGGTAGAACT
    TGGGAAGAGGGAATAGGAAGCTCGCCCTTCCTCCTTCCCTCCTCCCCTCTCTACTTTCCT
    TCCTTCCTTGGTCATCAGGTACCTTCTTTGTGCCTGCTGTTGTAGGCTACACCCTATGTT
    TGGTGGAAGGCAAAAAGAAAAATCAGTAGGATACAACTCAGTAGGGAAGACAGAGATATT
    CAAGCCCCTTGTCCTCCCAGTGTGATAAGTGTGGTGGTTGAGGTGTGAACAAGGGGCTCT
    GTGAACAGAGAGGACGAAAGAGGAGCTCCTCCTGAGGCTGTTGGGAAAAGCATCACTGAA
    GAGTGACTTTCAGAAGAAGAGAAGAAAAAGAGGAGAACATGCGTGATTTTATAATGAAAT
    AGATTAGATAAGGGGAAAAAAGGCATTTAAACAAGGCAAAAAGAACAGGAGAATAGAGAA
    GAGATGTGGAGGAGAAGGAGCACTGTAGTAAACACGCAGAAGGACAGGAACACTTAGACA
    TGCAACCCACTCCCACCCTCCGTCTTGGGGGAGGAAAGCACACTACTGTCCCAAAGAACT
    AATACTGAACCAGTGCTGCCTTGTGGAGAGAGGCATGGCCAAGGCGTTCAGAGACCTGGG
    CCTGGTCCCACCGCTGCCCACAGCACTCAGCCTCTGAGCACAGCCTGGGGTCATCTGTGT
    GCCCTCTGGCCAAGGCTGATGGTAGTTCTCTGAGTAATTGAGAGTCATTGCCTGTCTGTG
    CAGTATTGTGAAAACAAGTCACCTTTTAACTTTAAAACTACTTTAAAAAACTTTAAAGTT
    TTAAAAAAACTTCTTTAAAAACTACTCATGAGATGACAGTTTCTCTGACCCTCAGAGGAA
    GGCTGGGCTGCGCATACGTGAGGAATTTTTACATGAACATCCCAGGACTTGCTGTTCGCA
    GGTGATAAACTGCACCTCCCCAGGACTCCCGCTGCACTCACATGCAGCTCCCTGGACTTC
    TGGTATCTGACCCGGCCCATTTCTGTGTTTCAGGGGAGAATTTGGCTTGCGGGAGTACTC
    AGAAGTTAAGACGGTGACAGTAAAGATCCCCCAGAAGAACTCCTAAGAAGGCCAAGAAGG
    AGGATGAAGCCCAGCCTGCACGTCTGTCCCTCTCTGCTTTCTCTGTAGGGCCCAGCTCTC
    AGGAATACAAAGTTGAGCCACGGTCCTTACTTAAAGATTGAAAAGATAACATGTAGGCCA
    GGCAGGTCACTGCACAACTAAAGCAAACCAGCTGGGTACAGTTTCTTGGCACTCTGTAAG
    GGGCCACCTTAATCATACCAAATATTGGGGAAAGTGGGATAAAGGGAGGAGGAGGAGCTA
    GCAGACACATCCAGTATCTCCTTCTGGAGCACAGGATGAAATAAGGGAGCTGTATTATTT
    CATGTCTTTGTCACAAAGAACTTTCCTCTCAAGGAAAGGTGACCTTTCTCCTGTCTTCAT
    TTTCCTCCTTCCAGGCCCTCCTCGCTCACCCACCCCTCCCTCTCTTCCAAGGAGATGTCA
    GCTGAGCTCATTCTGGGGCAGATGTTTGGGCCGGGAACAATTTTTCAAGGTTGTAAAGCC
    AAATTATCATTTCATGTTATCCATTTCTTCAAAGCAAAACATGAAATGGTTTTAGCTAGA
    GTCAGACCAGAATGAAAATGCCAGGAGCTGGTACACTACAGATGTAGTAAGAACCTGGGA
    TATTCCTGACCCAATCTGGTTTTCTTTTACCCATAAATAACATGAATGAAAAAAGATTGG
    GACAATAGAGACTGGAAGTCATCATGTGCAGTTCACCGCTTCTGAGCTTGCTGCAGTTTT
    GGGGTGTGTGTGTATTAGATTCCTTCTCAGTTATTCTGGAATAAGGCAAGGAGTGGGTTG
    TTTTTCATAGCTAGATAAGATCTTTTCCAAAGTTTTTCTTAGAACCAACCAAAAAACAAT
    CCGAGTAGGCCCGAGAATTTGATAATGCTGGATGCCTTGCAGACATCATTCAGTTTCTAA
    TATTGGGCAACAATTATTATTAAATGAATTATTTCTGTAGTTGGAATCTGTACCTTCTGA
    ACCTCTACACCAATAACTGCTGCAGGTGTGATTTTGGTCTGTCACACTGTACATCTATCA
    TAATGTGCCCTGTATCTATTGGCAGTGACCTTGGAAAATCTGGCCAAGCCTAGGGGTTTC
    CTTTTCCATTTGCCAAGTTCCATTGTGCCAGGACTGCCGTGCTCCACTGAGCTCCTCTGT
    CACACCCCATTCTTGCCCCTCACTGGGCAGGCCATGGCCTACAGCTTGCAGGGAGTAAAG
    CAGGCCCGCCTCCCTTTCTTCCCATCCACATACTCCTCTTCTGCTTTCCAGTGACTCCAC
    CAGTTTGATGTGGGAAGTGTTAGCTTCCTTTCCTTCTTCCATCCCTTCTTCCATCTTTCC
    AGCTGTCAAATCCAATCCAGTCTCTAACCTAAATGCAGATCATTTATTTAAAAGTACCAA
    ACATAACCCAGAGTATGTGGAATATGGGCAACATATATATAGCCTTCTGTATTTAACGAT
    CTTCTGCTTCTTAACCGTACCAGTTTTCTATTTATAACTCTTATCTATCCATGATGTTTT
    AAAGTCTCCACTTGCTGTTATTTACAAACGACAGTGCATTCAGCAGCCCAGTGCCGTGAG
    CCCTGACAGATGCCGTATTTCTGAGTGCTTCCATGTGAATGCTGCCCTCCTGTAGCATGT
    GTCCAAGTGGACATAGCCACTAACCAACTAGTTACCTTTGGACTGCAACAAAAAATGTGA
    AAATGAAGATTTATTTCTTTTAATTTACTTAAAAAGAAACCTCTGTGCTAGCAATAAAGC
    ATTTATATTGTGCAAAAAAAAAAAAAAAAAAAAAC
    >Hs.48956_contig1 N64339|AI569513|AI694073 polyA = 1 polyA = 1
    TGAAAATTTATATAACTGTTGTTGATAAGGAACATTATCCAGGAATTGATACGTTTATTA
    GGAAAAGATATTTTTATAGGCTTGGATGTTTTTAGTTCTGACTTTGAATTTATATAAAGT
    ATTTTTATAATGACTGGTCTTCCTTACCTGGAAAAACATGCGATGTTAGTTTTAGAATTA
    CACCACAAGTATCTAAATTTGGAACTTACAAAGGGTCTATCTTGTAAATATTGTTTTGCA
    TTGTCTGTTGGCAAATTTGTGAACTGTCATGATACGCTTAAGGTGGAAAGTGTTCATTGC
    ACAATATATTTTTACTGCTTTCTGAATGTAGACGGAACAGTGTGGAAGCAGAAGGCTTTT
    TTAACTCATCCGTTTGCCAATCATTGCAAACAACTGAAATGTGGATGTGATTGCCTCAAT
    AAAGCTCGTCCCCATTGCTTAAGCCTTCAAAAA
    >Hs.118825_mRNA_10 gi|1495484|emb|X96757.1|HSSAPKK3 H. sapiens mRNA for MAP
    kinase kinase polyA = 3
    CTTTTAGCTGCCAGCCCTGGCCCATCATGTAGCTGCAGCACAGCCTTCCCTAACGTTGCA
    ACTGGGGGAAAAATCACTTTCCAGTCTGTTTTGCAAGGTGTGCATTTCCATCTTGATTCC
    CTGAAAGTCCATCTGCTGCATCGGTCAAGAGAAACTCCACTTGCATGAAGATTGCACGCC
    TGCAGCTTGCATCTTTGTTGCAAAACTAGCTACAGAAGAGAAGCAAGGCAAAGTCTTTTG
    TGCTCCCCTCCCCCATCAAAGGAAAGGGGAAAATGTCTCAGTCGAAAGGCAAGAAGCGAA
    ACCCTGGCCTTAAAATTCCAAAAGAAGCATTTGAACAACCTCAGACCAGTTCCACACCAC
    CTAGAGATTTAGACTCCAAGGCTTGCATTTCTATTGGAAATCAGAACTTTGAGGTGAAGG
    CAGATGACCTGGAGCCTATAATGGAACTGGGACGAGGTGCGTACGGGGTGGTGGAGAAGA
    TGCGGCACGTGCCCAGCGGGCAGATCATGGCAGTGAAGCGGATCCGAGCCACAGTAAATA
    GCCAGGAACAGAAACGGCTACTGATGGATTTGGATATTTCCATGAGGACGGTGGACTGTC
    CATTCACTGTCACCTTTTATGGCGCACTGTTTCGGGAGGGTGATGTGTGGATCTGCATGG
    AGCTCATGGATACATCACTAGATAAATTCTACAAACAAGTTATTGATAAAGGCCAGACAA
    TTCCAGAGGACATCTTAGGGAAAATAGCAGTTTCTATTGTAAAAGCATTAGAACATTTAC
    ATAGTAAGCTGTCTGTCATTCACAGAGACGTCAAGCCTTCTAATGTACTCATCAATGCTC
    TCGGTCAAGTGAAGATGTGCGATTTTGGAATCAGTGGCTACTTGGTGGACTCTGTTGCTA
    AAACAATTGATGCAGGTTGCAAACCATACATGGCCCCTGAAAGAATAAACCCAGAGCTCA
    ACCAGAAGGGATACAGTGTGAAGTCTGACATTTGGAGTCTGGGCATCACGATGATTGAGT
    TGGCCATCCTTCGATTTCCCTATGATTCATGGGGAACTCCATTTCAGCAGCTCAAACAGG
    TGGTAGAGGAGCCATCGCCACAACTCCCAGCAGACAAGTTCTCTGCAGAGTTTGTTGACT
    TTACCTCACAGTGCTTAAAGAAGAATTCCAAAGAACGGCCTACATACCCAGAGCTAATGC
    AACATCCATTTTTCACCCTACATGAATCCAAAGGAACAGATGTGGCATCTTTTGTAAAAC
    TGATTCTTGGAGACTAAAAAGCAGTGGACTTAATCGGTTGACCCTACTGTGGATTGGTGG
    GTTTCGGGGTGAAGCAAGTTCACTACAGCATCAATAGAAAGTCATCTTTGAGATAATTTA
    ACCCTGCCTCTCAGAGGGTTTTCTCTCCCAATTTTCTTTTTACTCCCCCTCTTAAGGGGG
    CCTTGGAATCTATAGTATAGAATGAACTGTCTAGATGGATGAATTATGATAAAGGCTTAG
    GACTTCAAAAGGTGATTAAATATTTAATGATGTGTCATATGAAAAAAAAAAAAAAAAAAA
    AAAAAAAAAAAAAAAAAAAAAAAAAAA
    >Hs.135118_contig3
    AI683181|AI082848|AW770198|AI333188|AI873435|AW169942|AI806302|AW340718|
    BF196955|AA909720 polyA = 1 polyA = 2 testis
    CAGTCCCACCATGTATTTTGCTTTGTTTCTAAAAAGCTTTTTAAAAACTGTTATTTAATA
    CCAAAGGGAGGAATCGTATGGGTTCTTCTGCCCACCGTTGTGACTAAGAATGCACAGGGA
    CTTGGTTCTCGTTGCACCTTTTTTTAGTAACATGTTTCATGGGGACCCACTGTACAGCCC
    TTCATTCTGCTGTGTCAGTTTGGCCTGGCCTGACACTGGCTGCCCCAGCGGGGACCACGG
    AAGCAGAGTGAGAGCCTTCGCTGAGTCAATGCTACCTTCAGCCCCAGACGCATCCCATTT
    CCATGTCTTCCATGCTCACTGCTCATGCACTTTTTACACGGTTTCTTCCAAACAGCCCGG
    TCTTGATGCAGGAGAGTCTGGAAAAGGAAGAAAATGGTTTCAGTTTCAAAATTCAAAGGA
    AAAAGTTGAGGACTTATTTTGTCCTGTCAAGATTGCAAGAACATGTAAAATGTACGGAGC
    TTCATAATACGTTATATTGTTCCGAAGCAGCTCGTTGAGAAACATTTGTTTTCAATAACA
    TTTTAGCTTAAAAAAAAA
    >Hs.171857_mRNA_1 gi|13161080|gb|AF332224.1|AF332224 Homo sapiens
    protein mRNA, partial cds polyA = 3 clone
    TCACCTCGTGGCGTAGGGGAGAGGTAACACCGAGAAGAGGCAGCGGCGGTGGCNCAGAGA
    CGATTGGTGCCAAACAGGGCAGAACGCAACTCAGCTCTGGGTTTGTGAATAGCACAATGG
    AAGAAGCTGGACTTTGTGGGTTAAGAGAGAAAGCAGATATGTTGTGTAACTCTGAATCAC
    ATGATATTCTTCAACATCAAGACTCAAATTGCAGTGCCACAAGTAATAAACATTTATTGG
    AAGATGAAGAAGGCCGTGACTTTATAACAAAGAACAGGAGTTGGGTGAGCCCAGTGCACT
    GCACACAAGAGTCAAGAAGGGAGCTTCCTGAGCAAGAAGTAGCCCCTCCGTCTGGTCAGC
    AAGCTTTACAATTGCAACAGGAACAAAGAAAAAGTCTTAGGAAAAGAAGTTTTATTATTG
    ATGCAAGCCCTAAACACTCTTTCCGACTCCAGAGGAGAAGCTGGCAGCTCTCTGTAAGAA
    ATATGCTGATCTTGGAAATTCACCTCTTCTATAGAAGAGTTTGTTTTGAACTATACGATT
    TGAAACAAAATTCTTTTTTTGGAGACTATGGAAACATTCTCAACAGGGAAACCCTACTAG
    ACTTTGTAAAGCAAATAATGGAAAAGATACAGAACTTTTTGAAGAATCATGGGAAATTTT
    TATAATTAAATAAATGCTAAAATTCTGTTTTGTGAAACATTTATGGGAATTATCACTGAC
    AGTTTTTGTACACTTTCAAATAGTGTTAAAGCAGCAACTCCATGTTGTAAATGCACAAAA
    CAAATATTTAGTTAATAATCAACTCCAAGAATAAAGCTGTAACAATAATAGTTAAAAAAA
    A
    >Hs.18910_mRNA_3 gi|12804464|gb|BC001639.1|BC001639 Homo sapiens
    MGC:1944 IMAGE:2959372 polyA = 3
    GGCACGAGGGTCAGCAGCCGCCAGACTTCCTGCCGAAGTCCGAGCCCCCTCCCGGGGCTG
    GAGGGGGGCAAGCGGGTTCCGAGGTGCAAAGCCTGGTGCCCCGAGCCCTGCGGAGCTCGG
    GGCCAGCATGGCCCCCACGCTGCAACAGGCGTACCGGAGGCGCTGGTGGATGGCCTGCAC
    GGCTGTGCTGGAGAACCTCTTCTTCTCTGCTGTACTCCTGGGCTGGGGCTCCCTGTTGAT
    CATTCTGAAGAACGAGGGCTTCTATTCCAGCACGTGCCCAGCTGAGAGCAGCACCAACAC
    CACCCAGGATGAGCAGCGCAGGTGGCCAGGCTGTGACCAGCAGGACGAGATGCTCAACCT
    GGGCTTCACCATTGGTTCCTTCGTGCTCAGCGCCACCACCCTGCCACTGGGGATCCTCAT
    GGACCGCTTTGGCCCCCGACCCGTGCGGCTGGTTGGCAGTGCCTGCTTCACTGCGTCCTG
    CACCCTCATGGCCCTGGCCTCCCGGGACGTGGAAGCTCTGTCTCCGTTGATATTCCTGGC
    GCTGTCCCTGAATGGCTTTGGTGGCATCTGCCTAACGTTCACTTCACTCACGCTGCCCAA
    CATGTTTGGGAACCTGCGCTCCACGTTAATGGCCCTCATGATTGGCTCTTACGCCTCTTC
    TGCCATTACGTTCCCAGGAATCAAGCTGATCTACGATGCCGGTGTGGCCTTCGTGGTCAT
    CATGTTCACCTGGTCTGGCCTGGCCTGCCTTATCTTTCTGAACTGCACCCTCAACTGGCC
    CATCGAAGCCTTTCCTGCCCCTGAGGAAGTCAATTACACGAAGAAGATCAAGCTGAGTGG
    GCTGGCCCTGGACCACAAGGTGACAGGTGACCTCTTCTACACCCATGTGACCACCATGGG
    CCAGAGGCTCAGCCAGAAGGCCCCCAGCCTGGAGGACGGTTCGGATGCCTTCATGTCACC
    CCAGGATGTTCGGGGCACCTCAGAAAACCTTCCTGAGAGGTCTGTCCCCTTACGCAAGAG
    CCTCTGCTCCCCCACTTTCCTGTGGAGCCTCCTCACCATGGGCATGACCCAGCTGCGGAT
    CATCTTCTACATGGCTGCTGTGAACAAGATGCTGGAGTACCTTGTGACTGGTGGCCAGGA
    GCATGAGACAAATGAACAGCAACAAAAGGTGGCAGAGACAGTTGGGTTCTACTCCTCCGT
    CTTCGGGGCCATGCAGCTGTTGTGCCTTCTCACCTGCCCCCTCATTGGCTACATCATGGA
    CTGGCGGATCAAGGACTGCGTGGACGCCCCAACTCAGGGCACTGTCCTCGGAGATGCCAG
    GGACGGGGTTGCTACCAAATCCATCAGACCACGCTACTGCAAGATCCAAAAGCTCACCAA
    TGCCATCAGTGCCTTCACCCTGACCAACCTGCTGCTTGTGGGTTTTGGCATCACCTGTCT
    CATCAACAACTTACACCTCCAGTTTGTGACCTTTGTCCTGCACACCATTGTTCGAGGTTT
    CTTCCACTCAGCCTGTGGGAGTCTCTATGCTGCAGTGTTCCCATCCAACCACTTTGGGAC
    GCTGACAGGCCTGCAGTCCCTCATCAGTGCTGTGTTCGCCTTGCTTCAGCAGCCACTTTT
    CATGGCGATGGTGGGACCCCTGAAAGGAGAGCCCTTCTGGGTGAATCTGGGCCTCCTGCT
    ATTCTCACTCCTGGGATTCCTGTTGCCTTCCTACCTCTTCTATTACCGTGCCCGGCTCCA
    GCAGGAGTACGCCGCCAATGGGATGGGCCCACTGAAGGTGCTTAGCGGCTCTGAGGTGAC
    CGCATAGACTTCTCAGACCAAGGGACCTGGATGACAGGCAATCAAGGCCTGAGCAACCAA
    AAGGAGTGCCCCATATGGCTTTTCTACCTGTAACATGCACATAGAGCCATGGCCGTAGAT
    TTATAAATACCAAGAGAAGTTCTATTTTTGTAAAGACTGCAAAAAGGAGGAAAAAAAACC
    TTCAAAAACGCCCCCTAAGTCAACGCTCCATTGACTGAAGACAGTCCCTATCCTAGAGGG
    GTTGAGCTTTCTTCCTCCTTGGGTTGGAGGAGACCAGGGTGCCTCTTATCTCCTTCTAGC
    GGTCTGCCTCCTGGTACCTCTTGGGGGGATCGGCAAACAGGCTACCCCTGAGGTCCCATG
    TGCCATGAGTGTGCACACATGCATGTGTCTGTGTATGTGTGAATGTGAGAGAGACACAGC
    CCTCCTTTCAGAAGGAAAGGGGCCTGAGGTGCCAGCTGTGTCCTGGGTTAGGGGTTGGGG
    GTCGGCCCCTTCCAGGGCCAGGAGGGCAGGTTCCCTCTCTGGTGCTGCTGCTTGCAAGTC
    TTAGAGGAAATAAAAAGGGAAGTGAGAAAAAAAAAAAAAAAAAA
    >Hs.194774_mRNA_1 gi|16306633|gb|BC001492.1|BC001492 Homo sapiens
    MGC:1774 IMAGE:3510004 polyA = 3 clone
    GGCACGAGGGAGGCGGCGGCTCCAGCCGGCGCGGCGCGAGGCTCGGCGGTGGGATCCGGC
    GGGCGGTGCTAGCTCCGCGCTCCCTGCCTCGCTCGCTGCCGGGGGCGGTCGGAAGGCGCG
    GCGCGAAGCCCGGGTGGCCCGAGGGCGCGATGGCTGCTCCTGTCCCGTGGGCCTGCTGTG
    CTGTGCTTGCCGCCGCCGCCGCAGTTGTCTACGCCCAGAGACACAGTCCACAGGAGGCAC
    CCCATGTGCAGTACGAGCGCCTGGGCTCTGACGTGACACTGCCATGTGGGACAGCAAACT
    GGGATGCTGCGGTGACGTGGCGGGTAAATGGGACAGACCTGGCCCCTGACCTGCTCAACG
    GCTCTCAGCTGGTGCTCCATGGCCTGGAACTGGGCCACAGTGGCCTCTACGCCTGCTTCC
    ACCGTGACTCCTGGCACCTGCGCCACCAAGTCCTGCTGCATGTGGGCTTGCCGCCGCGGG
    AGCCTGTGCTCAGCTGCCGCTCCAACACTTACCCCAAGGGCTTCTACTGCAGCTGGCATC
    TGCCCACCCCCACCTACATTCCCAACACCTTCAATGTGACTGTGCTGCATGGCTCCAAAA
    TTATGGTCTGTGAGAAGGACCCAGCCCTCAAGAACCGCTGCCACATTCGCTACATGCACC
    TGTTCTCCACCATCAAGTACAAGGTCTCCATAAGTGTCAGCAATGCCCTGGGCCACAATG
    CCACAGCTATCACCTTTGACGAGTTCACCATTGTGAAGCCTGATCCTCCAGAAAATGTGG
    TAGCCCGGCCAGTGCCCAGCAACCCTCGCCGGCTGGAGGTGACGTGGCAGACCCCCTCGA
    CCTGGCCTGACCCTGAGTCTTTTCCTCTCAAGTTCTTTCTGCGCTACCGACCCCTCATCC
    TGGACCAGTGGCAGCATGTGGAGCTGTCCGACGGCACAGCACACACCATCACAGATGCCT
    ACGCCGGGAAGGAGTACATTATCCAGGTGGCAGCCAAGGACAATGAGATTGGGACATGGA
    GTGACTGGAGCGTAGCCGCCCACGCTACGCCCTGGACTGAGGAACCGCGACACCTCACCA
    CGGAGGCCCAGGCTGCGGAGACCACGACCAGCACCACCAGCTCCCTGGCACCCCCACCTA
    CCACGAAGATCTGTGACCCTGGGGAGCTGGGCAGCGGCGGGGGACCCTCGGCACCCTTCT
    TGGTCAGCGTCCCCATCACTCTGGCCCTGGCTGCCGCTGCCGCCACTGCCAGCAGTCTCT
    TGATCTGAGCCCGGCACCCCATGAGGACATGCAGAGCACCTGCAGAGGAGCAGGAGGCCG
    GAGCTGAGCCTGCAGACCCCGGTTTCTATTTTGCACACGGGCAGGAGGACCTTTTGCATT
    CTCTTCAGACACAATTTGTGGAGACCCCGGCGGGCCCGGGCCTGCCGCCCCCCAGCCCTG
    CCGCACCAAGCTGGCCCTCCTTCCTCCCTCAGGGGAGGTGGGCCATGCAGCTAACCCACC
    CACCAAAGACCCCCTCACCCTGGCCCCTTGGGCTGGACCCTCCAATGCCAGCGACTCCCA
    GGAGCCCTTGGGGGACGTGAGGGGAGCCTCTCACATCCGATTTCTCCTCCTGCCCCAGCC
    TCCTGTCTATCCCAGGGTCTCTGTTGCCACCATCAGATTATAAGCTCCTGATGCTGGGGG
    GGCCCAGCCATCCCCCTCCCCCCAGCACCCACAATTTTCAGTCCCCTCCCCTCTGCCCTG
    TTTTGTATACCCCTCCCCTGACCCTGCTCCTATCCCACAGTATTTAATGCCCTGTCAGTC
    CCTTCTAGTCTGACTCAATGGTAACTTGCTGTATTTGAATTTTTTATAGATGTATATACA
    GGGTGGGGGGAGTGGGCGGTTCTCATTAAACGTCACCATTTCATGAAAAAAAAAAAAAAA
    AAA
    >Hs.127428_mRNA_2 gi|16306818|gb|BC006537.1|BC006537 Homo sapiens
    MGC:1934 IMAGE:2987903 polyA = 3
    GGCACGAGGAGTTTCATAATTTCCGTGGGTCGGGCCGGGCGGGCCAGGCGCTGGGCACGG
    TGATGGCCACCACTGGGGCCCTGGGCAACTACTACGTGGACTCGTTCCTGCTGGGCGCCG
    ACGCCGCGGATGAGCTGAGCGTTGGCCGCTATGCGCCGGGGACCCTGGGCCAGCCTCCCC
    GGCAGGCGGCGACGCTGGCCGAGCACCCCGACTTCAGCCCGTGCAGCTTCCAGTCCAAGG
    CGACGGTGTTTGGCGCCTCGTGGAACCCAGTGCACGCGGCGGGCGCCAACGCTGTACCCG
    CTGCGGTGTACCACCACCATCACCACCACCCCTACGTGCACCCCCAGGCGCCCGTGGCGG
    CGGCGGCGCCGGACGGCAGGTACATGCGCTCCTGGCTGGAGCCCACGCCCGGTGCGCTCT
    CCTTCGCGGGCTTGCCCTCCAGCCGGCCTTATGGCATTAAACCTGAACCGCTGTCGGCCA
    GAAGGGGTGACTGTCCCACGCTTGACACTCACACTTTGTCCCTGACTGACTATGCTTGTG
    GTTCTCCTCCAGTTGATAGAGAAAAACAACCCAGCGAAGGCGCCTTCTCTGAAAACAATG
    CTGAGAATGAGAGCGGCGGAGACAAGCCCCCCATCGATCCCAATAACCCAGCAGCCAACT
    GGCTTCATGCGCGCTCCACTCGGAAAAAGCGGTGCCCCTATACAAAACACCAGACCCTGG
    AACTGGAGAAAGAGTTTCTGTTCAACATGTACCTCACCAGGGACCGCAGGTACGAGGTGG
    CTCGACTGCTCAACCTCACCGAGAGGCAGGTCAAGATCTGGTTCCAGAACCGCAGGATGA
    AAATGAAGAAAATCAACAAAGACCGAGCAAAAGACGAGTGATGCCATTTGGGCTTATTTA
    GAAAAAAGGGTAAGCTAGAGAGAAAAAGAAAGAACTGTCCGTCCCCCTTCCGCCTTCTCC
    CTTTTCTCACCCCCACCCTAGCCTCCACCATCCCCGCACAAAGCGGCTCTAAACCTCAGG
    CCACATCTTTTCCAAGGCAAACCCTGTTCAGGCTGGCTCGTAGGCCTGCCGCTTTGATGG
    AGGAGGTATTGTAAGCTTTCCATTTTCTATAAGAAAAAGGAAAAGTTGAGGGGGGGGCAT
    TAGTGCTGATAGCTGTGTGTGTTAGCTTGTATATATATTTTTAAAAATCTACCTGTTCCT
    GACTTAAAACAAAAGGAAAGAAACTACCTTTTTATAATGCACAACTGTTGATGGTAGGCT
    GTATAGTTTTTAGTCTGTGTAGTTAATTTAATTTGCAGTTTGTGCGGCAGATTGCTCTGC
    CAAGATACTTGAACACTGTGTTTTATTGTGGTAATTATGTTTTGTGATTCAAACTTCTGT
    GTACTGGGTGATGCACCCATTGTGATTGTGGAAGATAGAATTCAATTTGAACTCAGGTTG
    TTTATGAGGGGAAAAAAACAGTTGCATAGAGTATAGCTCTGTAGTGGAATATGTCTTCTG
    TATAACTAGGCTGTTAACCTATGATTGTAAAGTAGCTGTAAGAATTTCCCAGTGAAATAA
    AAAAAAATTTTAAGTGTTCTCGGGGATGCATAGATTCATCATTTTCTCCACCTTAAAAAT
    GCGGGCATTTAAGTCTGTCCATTATCTATATAGTCCTGTCTTGTCTATTGTATATATAAT
    CTATATGATTAAAGAAAATATGCATAATCAGACAAGCTTGAATATTGTTTTTGCACCAGA
    CGAACAGTGAGGAAATTCGGAGCTATACATATGTGCAGAAGGTTACTACCTAGGGTTTAT
    GCTTAATTTTAATCGGAGGAAATGAATGCTGATTGTAACGGAGTTAATTTTATTGATAAT
    AAATTATACACTATGAAACCGCCATTGGGCTACTGTAGATTTGTATCCTTGATGAATCTG
    GGGTTTCCATCAGACTGAACTTACACTGTATATTTTGCAATAGTTACCTCAAGGCCTACT
    GACCAAATTGTTGTGTTGAGATGATATTTAACTTTTTGCCAAATAAAATATATTGATTCT
    TTTCTAAAAAAAAAAAAAAAAAAAA
    >Hs.126852_contig1
    AI802118|BF197404|BF224434|AA931964|AW236083|AI253119|AW614335|AI671372|
    AI793240|AW006851|AI953604|AI640505|AI633982|AW195809|AI493069|AW058576|
    AW293622 polyA = 2 polyA = 3
    AAACCAGTGTATCCAGTCATGGAAAAGAAGGAGGAAGATGGCACCCTGGAGCGGGGGCAC
    TGGAACAACAAGATGGAGTTTGTGCTGTCAGTGGCTGGGGAGATCATTGGCTTAGGCAAC
    GTCTGGAGGTTTCCCTATCTCTGCTACAAAAATGGGGGAGGTGAGATGAGAGCCCTTGTG
    CCACCCCACCCACTCCTGGAAGGAGGATACTTCCATCTCCTGCACTTACGGCCCCTCTGG
    GGAGTCCCATAGATGTATAGAATTCTGGAGGTAGGAGGACGCTTGGAGGTCATTAAGGAC
    ACTCTGTAAGAGACTAAGACCTAGAAAGGTTACGTGACTATCCCAGGGCTCTTTCTATTA
    TAACGTGGCATCGTAGAAATATGAGCACAAGCTGGAACCAGGTGGATGAGAGTTTGGATT
    CTGGCTCTGCTACTTAACACTCTGTGTGATCTTGGACAAGTTACTTAAGCTCTCAGAGCA
    TCAATTGCCGCTCCTGCAAATTGAGATAATAATGCCTGCCTTTCAAGGTCATTGTAAGGA
    TTAGAGACAATGTGTGTAAAGCACTTAATAAATAGTAGCTCTGCTGATGATGACGTTGAT
    AACCAAACTGTTCTGTGGTCTTAAGTAATAAATAGTAGCTCTGCTGATGATGACGTTGAT
    AACCAAACTGTTCTGTGGTCTTAAGTAATAAGTAGTAGCTCTGTTGATGATGACGTTGAT
    AACCAAACTGTTCTGTGGTCTTAAGTAATAAGTAGTAGCTCTGCTGATGATGACGTTGAT
    AACCAAACTGTTCTGTGGTCTTAAGTAATAAATAGTAGCTCTGCTGATGATGATGTTGAT
    AACCAAACTGTTCTGTGGTCTTAAGTAATAAATAGTAGCTCTGCTGATGATGACGTTGAT
    AACCAAACTGTTCTGTGGTCTTAAGTAATAAATAGTAGCTCTGCTGATGATGACGTTGAT
    AACCAAACTGTTCTGTGGTCTTAAGTAATAAATAGTAGCTCTGCTGATGATGACGTTGAT
    AAAAAAAAAAAAAAAAAAAAAAAAA
    >Hs.28149_mRNA_1 gi|14714936|gb|BC010626.1|BC010626 Homo sapiens clone
    MGC:17687 IMAGE:3865868 polyA = 3
    GGAAGACATCAGGATGTACCATCTGCCCTTCTGTCGGACCCCAGGGTACGTCCCATGAGC
    GCGGCCGAGCTGCGTCGAGGGCAGCAGAGCGTGCTGCAGTGCTCAGGGACCCGGACTCTG
    CAGTTTCTCCTGCACTGTTTTCACCTTTGGCCAGACGGGCTCTGGGAAGACCTACACCCT
    GACTGGACCCCCTCCCCAGGGGGAGGGGGTGCCTGTACCCCCCAGCCTGGCTGGCATCAT
    GCAGAGGACCTTCGCCTGGCTGTTGGACCGCGTGCAGCACCTGGGTGCCCCTGTCACCCT
    TCGCGCCTCTTATCTGGAGATCTACAATGAGCAGGTTCGGGACTTGCTGAGCCTGGGGTC
    TCCCCGGCCCCTCCCTGTTCGCTGGAACAAGACTCGGGGCTTCTATGTGGAGCAGCTGCG
    GGTGGTGGAATTTGGGAGTCTGGAGGCCCTGATGGAACTTTTGCAAACGGGTCTCAGCCG
    TCGAAGGAACTCAGCCCACACCCTGAACCAGGCCTCCAGCCGAAGCCATGCCCTGCTCAC
    CCTTTACATCAGCCGTCAAACTGCCCAGCAGATGCCTTCTGTGGACCCTGGGGAGCCCCC
    TGTTGGTGGGAAGCTGTGCTTTGTGGACCTGGCAGGCAGTGAGAAGGTAGCAGCCACGGG
    ATCCCGTGGGGAGCTGATGCTTGAGGCTAACAGCATCAACCGAAGCCTGCTGGCCCTGGG
    TCACTGCATCTCCCTGCTGCTGGACCCACAGCGGAAGCAGAGCCACATCCCTTTCCGGGA
    CAGCAAGCTCACCAAGTTGCTGGCAGACTCACTGGGAGGGCGCGGGGTCACCCTCATGGT
    GGCCTGCGTGTCCCCCTCAGCCCAGTGCCTTCCTGAGACTCTCAGCACCCTGCGATATGC
    AAGCCGAGCTCAGCGGGTCACCACCCGACCACAGGCCCCCAAGTCTCCTGTGGCAAAGCA
    GCCCCAGCGTTTGGAGACAGAGATGCTGCAGCTCCAGGAGGAGAACCGTCGCCTGCAGTT
    CCAGCTGGACCAAATGGACTGCAAGGCCTCAGGGCTCAGTGGAGCCCGGGTGGCCTGGGC
    CCAGCGGAACCTGTACGGGATGCTACAGGAGTTCATGCTAGAGAATGAGAGGCTCAGGAA
    AGAAAAGAGCCAGCTGCAGAATAGCCGAGACCTGGCCCAGAATGAGCAGCGCATCCTGGC
    CCAGCAGGTCCATGCACTAGAGAGGCGTCTCCTCTCTGCCTGCTACCATCACCAGCAGGG
    TCCTGGCCTGACCCCACCGTGTCCCTGCTTGATGGCCCCAGCTCCCCCTTGCCATGCACT
    GCCACCCCTCTACTCCTGCCCCTGCTGCCACATCTGCCCACTGTGTCGAGTGCCCCTGGC
    CCACTGGGCCTGCCTGCCAGGGGAGCACCACCTGCCCCAGGTGTTGGACCCTGAGGCCTC
    AGGTGGCAGGCCCCCATCTGCCCGGCCCCCACCCTGGGCACCCCCATGCAGCCCTGGCTC
    TGCCAAGTGCCCAAGAGAGAGGAGTCACAGTGACTGGACTCAGACCCGAGTCCTGGCAGA
    GATGTTGACGGAGGAGGAGGTGGTACCTTCTGCACCTCCCCTGCCTGTGAGGCCCCCGAA
    GACATCACCAGGGCTCAGAGGTGGGGCCGGGGTTCCAAACCTGGCCCAGAGACTGGAGGC
    CCTCAGAGACCAGATTGGCAGCTCCCTGCGACGTGGCCGCAGCCAGCCACCCTGCAGTGA
    GGGCGCACGGAGCCCAGGCCAAGTCCTCCCTCCCCATTGAAGGCCAAGTGGGAACCCAGG
    AGACTGCTGTGTGACCTCAGACTGGGCTCCACACTCTTGGGCTTCAGTCTGCCCATCTGC
    TGAATGGAGACAGCAGCTGCTACTCCACCTGCAGCTGGGCTAGGGGCGGGGACTGGGGGT
    GCTATTTAGGGGAACAAGGGGATTCAGGAGAAACCAGGCAGCAGGGGATGAAATACATGA
    ATAAAGAGAGGCATCAGCTCCAAAAAAAAAAAAAAAAAAAAAAA
    >Hs.35453_mRNA_3 gi|7018494|emb|AL157475.1|HSM802461 Homo sapiens mRNA;
    cDNA DKFZp761G151 (from clone DKFZp761G151); partial cds polyA = 3
    CTCCCCCTGAGAGAGGCTGGGCAGCACCCCCCTTCTGCCAGGAGTGCCAGCCAAGGTGCC
    AGACCCCTGTCCAGTGGCAAGCTGGAAGGCTTTCAGAGCATCGATGAAGCTATAGCCTGG
    CTCAGGAAGGAACTGACGGAGATGCGGCTGCAGGACCAGCAACTGGCCAGACAGCTCATG
    CGCCTGCGTGGCGACATCAACAAGCTGAAAATCGAACACACCTGCCGCCTCCACAGGAGG
    ATGCTCAACGATGCCACCTACGAGCTGGAGGAGCGGGATGAGCTGGCCGACCTCTTCTGT
    GACTCCCCTCTTGCCTCCTCCTTCAGCCTCTCCACACCACTCAAGCTTATTGGCGTGACC
    AAGATGAACATCAACTCTCGGAGGTTCTCTCTCTGCTGAGGAGCCCTCAGACTGGGCGGA
    GGGGCTGGAGCGGAGGGCTTGGGCTGGAGGGGTGTCAGAGGAAGCTGAGGCCAAGTTACT
    CCAGTGGGTCTCCCGGAGGCAGGGGTCCCTGGGACTGGCGACTCAAGGGCCCCAGGACCT
    ATTCAGTGGTGCTCTCCCACCCAGGGGCCCTGGGTGTGGATGCCAGTGTCTCTGTGACTG
    GCTCTTGCTTACTACCCAAAGAGCTCTGCAGAAGGGCCGCTCCAACCAAGATGTTAAAGG
    AGACCTGGGTTCCCACCATAATCCATCCCTCCACGGTCACGTTCCTGTTTCCTGGAATCA
    CTGGTGCTATGAACTGGGATTCCCAAAGGGAGGCCCCCCAACAAAGCTGTCATTTTTGCA
    GAAGGCTGTCCCGCAAGGGCCTTGGGGGAAATTAGGCATGTCAGATGTGCCTGTCTCACG
    TGCTGTTGCTGTCCTCTAAGTATTGTCTCAAATTCACCCTAAGTACATGACTCAGCAACA
    TTGACAGGGAGCTACTAGGAAGGGAAAATCGAAAGGCATGACAAATGGGCACTTGGGGAC
    GCAGCCCCAGTGGCTGGCAGCCAGTGTCTCTGGTGAGCCTGACACTACAAGGCTGTGTAA
    ATTGTAAATTCTGGCGTGTGCTGGGACATGTGATGGGGGCACTAGCGTAGCTTGGGTGCA
    ACAAGCACAGATGTCCCCATTGTCTCCCCTGGCCACATGCATCTCCAAAGAGCCTCTTCA
    CTGCCACCCACACCCCAGGGTGACAGCCTGGGAGACCACTGGTGACTGAACCAGGCAGGT
    CCTGAAAGCATTTTCCATAACTGAATTCTCCTGCAGGGGCGTGACCGGGGCCTCCTGGTG
    GATTCTGGTGGTGTCACCTTACTGCCCTCTCTGGAAAGACAATCTAGGGAGCCCAGAGGC
    CCATCCTGAGCCTCCTCTGAGATTTTGTGCCTGACCTAAACAACTAGTTTTAATAAGACT
    GTTACTGATGTGTTGTTCACTTGTTAGTAACTGATTTTTGTCCAAATGCGGAAGCCACTT
    GTGTAGGTCAACTACAGTGCGTAGGATTTGATTTTAAGAGTTTCTCCCTCCCAACAGGCT
    TGAGGATCAGCAAGTTAAGACCCCAGCAGGTTAGGGAGGTCAGTCTGGGGTCATACGGCA
    TGGCAGGGGTCCCTCGGCCAGACCCGTAGAATCCTGAGATAAGGAGTGTTTCTGACCTTT
    GGTGTCATCTAGTCGAGTCCTCTCATTAGTAAAGGAGCAAAGTGAAACCTGGGGGAGGAG
    AAGGACTTCCCTCAGGTTGCACAGCTGTTTAGGCTATAGAATATTGATGTGTGAAACCAT
    TATTGATAATGCCTAGTAGATCACATGTCAATGAACTTGAACCCCAAAGATGGTCGTGAT
    GCTTTGCCAAACCCGCACACTGCCAACCCCTCTACTCTCCACCTCAGCCCCCACCCACAT
    CTCCCAGAGTATTGCAATTCAGAACATTTGGGTCAAGGTGGAGCAAGGCACTGACAGTGG
    CCCCACAGGGCATGTGTCACTAATCACTGTCCCATGGTCTACGCACGGCATCTGGCTGCT
    CTGTCTACTGTGACTTCTTCCTGTGTAATCTCAGTGGGGCCCGTGTCCACCCACACATCG
    TGACCCACATAGGGGAGAGGTTGCTTTTCTTTTGTGGGCTGAGAGTAGGACAATGCAAAT
    GAATGATCTCTAGTAGACAGAAAAGAACTTGGTCTCTTTTTTAAAATTTCAAAGAGCCAG
    AAGTTCTATGCCTCCTTCAAAGTAGGCAGAACAACGCAGCCAAGATCTACTGTCTGCCAT
    GCTCTGTGCAATGAAGTCTGCAGGCCTGAGGACCATGTACTGCTGTCCTTCCTCAGAGCT
    CTGCACAAACACTGCCAAGTCCTGAAGACGCATTCCTTTCCTGCCAACCTCTTTCCAGAT
    AAGCCCTTGAGGTCTCGGGCTGACCTACACACACACACACACACACACACACACACACAC
    ACACCCCCACACACACACACACACGACAGAGAACATGCCATAAACATCCTTGAACCCATG
    CAGGAAAGCCCATCCCATATTCTGAAAAAATGCCAAATTAGGTTTTTCTTTCTTTTTGGA
    AATCAGTCATTACAGTAACCGAAACCATTGGGTTCAGCGAAAATGGAAAGATTTAGCTGA
    ATGTAGTCAGTCCAATTAAGTTGGATGCAACTGAGTGATTTAGTTGCTTGGGTAACCCAG
    TGCTTGCTTGCTTTCTTCATTCTCTGGGTGGAAACTAAGATCAAGACACATGTTTGGGGA
    TAAGTTAAATGTCTGAGCTATTTTGCTCGGTTTATCCTAAGAGAACTTTATTATGGGATG
    AGGAGGTGACCCAAGATGAGAAGTGGAGGGGGACAGCGATGTTTTCTAAACATCGTCCAG
    TGTTGACTGGCTTCCTTACTTTGCACAGTGAACACAACTAACCACATTAATTCAGCTTTG
    TGAAGTCCCTGCTCTCTGTGGGTTCTATGAGTCAGCAGCAACATTGGCCTAACCTCCGTC
    CCAGCCTCCTGGCTCACCACATGTGTACAGTGCTGTTTGCAGTTGTACTCATTATCCATC
    CATCTCTCTGCCATCCCCAAGCATCGCTGGGTGTAAAACGCAAACTCTCCACCGACACTG
    CCATGCGTGGTCATGTCTTGATGCCTTCAGGGGCTCAGTAGCTATCAAAGAGGCCTGGAG
    GGCCTGGGCAGGCTTGACGATGCCTGACCGAGTTCAAGACCCACACCCTGTAGCAATACC
    AAGTGCTATTACATAATCAATGGACGATTTATACTTTTATTTTTTATGATTATTTGTTTC
    TATATTGCTGTTAGAAAAAGTGAAATAAAAATACTTCAAAAGAAAAAAAAAAAAAAAAAA
    AAAAAAAAAAAAAAGAAAAAAAAAAAAAAAAAAA
    >Hs.180570_contig1 R08175|AA707224|AA699986|R11209|W89099|T98002|AA494546
    polyA = 2 polyA = 3
    TGAAGGACCGCGATCCTAAAGAGATTGAATGGGACGACCTGGCCCAGCTGCCCTTCCTGA
    CCATGTGCGTGAAGGAGAGCCTGAGGTTACATCCCCCAGCTCCCTTCATCTCCCGATGCT
    GCACCCAGGACATTGTTCTCCCAGATGGCCGAGTCATCCCCAAGGGCATTACCTGCCTCA
    TCGATATTATAGGGGTCCATCACAACCCAACTGTGTGGCCGGATCCTGAGTCTACGACCC
    CTTCCGCTTTGACCCAGAGAACAGCAAGGGGAGGTCACCTCTGGCTTTTAATTCCCTTCT
    CCGCAGGGCCCAGGAACTGCATCGGGCCAGCGTTTCCCATGGCGGAGATGAAAGTGGTTC
    CTGGCGTTGATGCTGCTGCACTTCCGGTTCCTGCCAGACCACACTGAGCCCCGCAGGAAG
    CTGGAACTGATCATTGCGGCCGAGGGCGGGCTTTGGCTGCGGGTGGAGCCCCTGAATGTA
    GGCTTGCAGTGACTTTCTGACCCATCCACCTGTTTTTTTGCAGATTGTCATGAATAAAAC
    GGTGCTGTCACCTCAAAAAAAAAAAANNNAAAA
    >Hs.196270_mRNA_1 gi|11545416|gb|AF283645.1|AF283645 Homo sapiens
    chromosome 8 map 8q21 polyA = 3
    GAGTCCTCTCGTTGGTCCCGGAGGTGGGGTTGCGCTCACAAGGGGCGACCGTCGCCACGG
    TGGCGGCCACTGCATCGCGTCCCACCTCCGCGGCCCTGGGCGCCGTGGTGTCGACGGGCC
    CCGAGCCTATGACGGGCCAGGGCCAGTCGGCGTCCGGGTCGTCGGCGTGGAGCACGGTAT
    TCCGCCACGTCCGGTATGAGAACCTGATAGCGGGCGTGAGCGGCGGCGTCTTATCCAACC
    TTGCGCTGCATCCGCTCGACCTCGTGAAGATCCGCTTCGCCGTGAGTGATGGATTGGAAC
    TGAGACCGAAATATAATGGAATTTTACATTGCTTGACTACCATTTGGAAACTTGATGGAC
    TACGGGGACTTTATCAAGGAGTAACCCCAAATATATGGGGTGCAGGTTTATCCTGGGGAC
    TCTACTTTTTCTTTTACAATGCCATCAAGTCATATAAAACAGAAGGAAGAGCTGAACATT
    TAGAGGCAACAGAATACCTTGTCTCAGCTGCTGAAGCTGGAGCCATGACCCTCTGCATTA
    CAAACCCATTATGGGTAACAAAAACTCGCCTTATGTTACAGTATGATGCTGTTGTTAACT
    CCCCACACCGACAATATAAAGGAATGTTTGATACACTTGTGAAAATATATAAGTATGAAG
    GTGTGCGTGGATTATATAAGGGATTTGTTCCTGGGCTGTTTGGAACATCGCATGGTGCCC
    TTCAGTTTATGGCATATGAATTGCTGAAGTTGAAGTACAACCAGCATATCAATAGATTAC
    CAGAAGCCCAGTTGAGCACAGTAGAATATATATCTGTTGCAGCACTATCCAAAATATTTG
    CTGTCGCAGCAACATACCCATATCAAGTCGTAAGAGCTCGTCTTCAGGATCAACACATGT
    TTTACAGTGGTGTAATAGATGTAATCACAAAGACATGGAGGAAAGAAGGCGTCGGTGGAT
    TTTACAAGGGAATTGCTCCTAATTTGATTAGAGTGACTCCAGCCTGCTGTATTACCTTTG
    TGGTATATGAAAACGTCTCACATTTTTTACTTGACCTTAGAGAAAAGAGAAAGTAAGCTC
    AAAGAGGACAATTCCAGTATATCTGCCCAAGGCAGCAACAAGCTCTTTTGTGTTTAAGGC
    ATAAAAGAAGAATTCTGCATAGAAACATGGCTCATATTCGAAATTGCTCTATAGTCATTA
    GAAGCCAGAGAACTGCTAAGTCTCCTGCAATGTTTTTCTTGCTTTTTGCCTTCCCCATAT
    ATATGGAACTTGGCTACCTCTGCCTGAAATGGCTGCCATCAACACAATGTTAAAACTGAC
    ACGAAGGATAGAGTTTCACAGATTTCTACGTTTTATTGGTGGAAGCTGATTTGCAACATT
    TGCTAAATGGATTAGATGAATGTACTTCTTTTTGTGAGCTTACTTGCCTGGATTGCTTTA
    AAATTAACCTTTGTGCAATACCAAGAAAATAGCTCTTTAAAAGAATGTCTTTGTATGTCT
    CAAGGTAAATTAAGGATTTACTGAATAAGGTGTTGACCAAATCCAGACCATTTTATTTTA
    TTTTTTTATTTATTTATTTTTTGAGATGGAGTCTTGCTTTGTCGCCCAGGCTGGAGTGCA
    GTGGCGTGATCTCAGCTCACTGCAACCTCCACCTCCCGGGTTCACGCCATTCTCCTGCCT
    CAGCCTCCTGAGTAGCTGGGACTACAGGCACCTGCCACCACGCCTGGCTAACTTTTTTTT
    ATATTTTGAGTAGAAATGGGGTTTCACCATGTTAGCCAGGATGGTCTCAATCTCCTGACC
    TTGTGATCCGCCTGCCTTGGCCTCCCAAAGTGCTGGGATTACAGGCGTGAGCCACTGCGC
    CTGGCCAGACCATTTTAGAATTGGGAAATTTTAGTGAGAAAAAATGCACTGTAAATATGC
    TTTAGTTTTAATTCAGTTGGGATGCACTACCTAGCGAAAATTGAGAAACTATATACTTCT
    CAGAGAAATATCTGACATCTATTGTCATTCCATTGCTATTTTTTTTCCCCAGAGACTTCC
    ATAATTTAAAATAAAATCCTAGATCCAGTTCTTGTTTTTTGGCATAAATACTTAATCTAT
    TTTAAATTTATAAAATCTGAGCTTCTAGGATCCAGCTGTGTCAACCTTTATTTAGCATAT
    ATAACTATAAATCACTTATTACAGATGCTAAATAGATCACCTTTTACAGATGCTGAAATG
    TTTGGGATATGTTTGTTGACAAGGTAAATGGAAATGAGAAACTTTATACTTCAGTTTTCA
    GATATATGGATCTAGATCCCAAATAAATGATTAATCTTCATTGGTTTCTCAAATTCAGGT
    TGAAATACAAATTAATAGCCTTTATTGATTTTACTTTTATGAGTCATTGTAGACATCTAT
    AAATATAAAAGGGCCTGTACCCAAAGGATGCCAGAATACTAGTATTTTTATTTATCGTAA
    ACATCCACGAGTGCTGTTGCACTACCATCTATTTGTTGTAAATAAAAGTGTTGTTTTCAA
    AAAAAAAAAAAAAA
    >Hs.9030_mRNA_3 gi|12652600|gb|BC000045.1|BC000045 Homo sapiens clone
    MGC:2032 IMAGE:3504527 polyA = 3
    CTAGAGGGGCGGAAAGTAACAAGGAGGTGGGGGTACAAATCCTCAGCTCCTGCTTCCGCA
    AGCACTAACCTGCTCTGAAGTGAGCCAGGCAGCTCTGGCCATCTTTTCCCAGCCACAGAA
    TCAGGTGATGGTCCAGAATTAAGAGCTGTCACCTGTGTCATTCACTCACAATGGAAGAAA
    TGAAGAAGACTGCCATCCGGCTGCCCAAAGGCAAACAGAAGCCTATAAAGACGGAATGGA
    ATTCCCGGTGTGTCCTTTTCACCTACTTCCAAGGGGACATCAGCAGCGTAGTGGATGAAC
    ACTTCTCCAGAGCTCTGAGCAATATCAAGAGCCCCCAGGAATTGACCCCCTCGAGTCAGA
    GTGAAGGTGTGATGCTGAAAAACGATGATAGCATGTCTCCAAATCAGTGGCGTTACTCGT
    CTCCATGGACAAAGCCACAACCAGAAGTACCTGTCACAAACCGTGCCGCCAACTGCAACT
    TGCATGTGCCTGGTCCCATGGCTGTGAATCAGTTCTCACCGTCCCTGGCTAGGAGGGCCT
    CTGTTCGGCCTGGGGAGCTGTGGCATTTCTCCTCCCTGGCGGGCACCAGCTCCTTAGAGC
    CTGGCTACTCTCATCCCTTCCCCGCTCGGCACCTGGTTCCAGAGCCCCAGCCTGATGGGA
    AACGTGAGCCTCTCCTAAGTCTCCTCCAGCAAGACAGATGCCTAGCCCGTCCTCAGGAAT
    CTGCCGCCAGGGAGAATGGCAACCCTGGCCAGATAGCTGGAAGCACAGGGTTGCTCTTCA
    ACCTGCCTCCCGGCTCAGTTCACTATAAGAAACTATATGTATCTCGTGGATCTGCCAGTA
    CCAGCCTTCCAAATGAAACTCTTTCAGAGTTAGAGACACCTGGGAAATACTCACTTACAC
    CACCAAACCACTGGGGCCACCCACATCGATACCTGCAGCATCTTTAGTCAAGTTGGAGGA
    GAAAGACAACACTTGGTCTAAGACACGGCAGCAAGACATCCCTGCATATTGTTCCAGATA
    AAAATGAAAGCTGCTCACACCCACTTGCCTCCCCAATCTGTTAAACAGCTTCGTGTCTAG
    TATGAGCTCAGTACTTGCCCTGTGAAAATCCCAGAAGCCCCCGCTGTCAATGTTCCCCAT
    CCACACCCTGCTTGCTCCTGTGTAACAGCTCAGATGATGAATAATAATAAAACTGTACTT
    TTTTGGATGGTGAAAAAAAAAAAAAAAAAAAA
    >Hs.1282_mRNA_3 gi|4559405|ref|NM_000065.1|Homo sapiens complement
    component 6 (C6), mRNA polyA = 1
    TTGCCTTGTGTTAGCTAGCAATAAGAAAAGAAGCTTTGTTTGGATTAACATATATACCCT
    CTTCATTCTGCATACCTATTTTTTCCCCAATAATTTGCAGCTTAGGTCCGAGGACACCAC
    AAACTCTGCTTAAAGGGCCTGGAGGCTCTCAAGGCATGGCCAGACGCTCTGTCTTGTACT
    TCATCCTGCTGAATGCTCTGATCAACAAGGGCCAAGCCTGCTTCTGTGATCACTATGCAT
    GGACTCAGTGGACCAGCTGCTCAAAAACTTGCAATTCTGGAACCCAGAGCAGACACAGAC
    AAATAGTAGTAGATAAGTACTACCAGGAAAACTTTTGTGAACAGATTTGCAGCAAGCAGG
    AGACTAGAGAATGTAACTGGCAAAGATGCCCCATCAACTGCCTCCTGGGAGATTTTGGAC
    CATGGTCAGACTGTGACCCTTGTATTGAAAAACAGTCTAAAGTTAGATCTGTCTTGCGTC
    CCAGTCAGTTTGGGGGACAGCCATGCACTGAGCCTCTGGTAGCCTTTCAACCATGCATTC
    CATCTAAGCTCTGCAAAATTGAAGAGGCTGACTGCAAGAATAAATTTCGCTGTGACAGTG
    GCCGCTGCATTGCCAGAAAGTTAGAATGCAATGGAGAAAATGACTGTGGAGACAATTCAG
    ATGAAAGGGACTGTGGGAGGACAAAGGCAGTATGCACACGGAAGTATAATCCCATCCCTA
    GTGTACAGTTGATGGGCAATGGGTTTCATTTTCTGGCAGGAGAGCCCAGAGGAGAAGTCC
    TTGATAACTCTTTCACTGGAGGAATATGTAAAACTGTCAAAAGCAGTAGGACAAGTAATC
    CATACCGTGTTCCGGCCAATCTGGAAAATGTCGGCTTTGAGGTACAAACTGCAGAAGATG
    ACTTGAAAACAGATTTCTACAAGGATTTAACTTCTCTTGGACACAATGAAAATCAACAAG
    GCTCATTCTCAAGTCAGGGGGGGAGCTCTTTCAGTGTACCAATTTTTTATTCCTCAAAGA
    GAAGTGAAAATATCAACCATAATTCTGCCTTCAAACAAGCCATTCAAGCCTCTCACAAAA
    AGGATTCTAGTTTTATTAGGATCCATAAAGTGATGAAAGTCTTAAACTTCACAACGAAAG
    CTAAAGATCTGCACCTTTCTGATGTCTTTTTGAAAGCACTTAACCATCTGCCTCTAGAAT
    ACAACTCTGCTTTGTACAGCCGAATATTCGATGACTTTGGGACTCATTACTTCACCTCTG
    GCTCCCTGGGAGGCGTGTATGACCTTCTCTATCAGTTTAGCAGTGAGGAACTAAAGAACT
    CAGGTTTAACCGAGGAAGAAGCCAAACACTGTGTCAGGATTGAAACAAAGAAACGCGTTT
    TATTTGCTAAGAAAACAAAAGTGGAACATAGGTGCACCACCAACAAGCTGTCAGAGAAAC
    ATGAAGGTTCATTTATACAGGGAGCAGAGAAATCCATATCCCTGATTCGAGGTGGAAGGA
    GTGAATATGGAGCAGCTTTGGCATGGGAGAAAGGGAGCTCTGGTCTGGAGGAGAAGACAT
    TTTCTGAGTGGTTAGAATCAGTGAAGGAAAATCCTGCTGTGATTGACTTTGAGCTTGCCC
    CCATCGTGGACTTGGTAAGAAACATCCCCTGTGCAGTGACAAAACGGAACAACCTCAGGA
    AAGCTTTGCAAGAGTATGCAGCCAAGTTCGATCCTTGCCAGTGTGCTCCATGCCCTAATA
    ATGGCCGACCCACCCTCTCAGGGACTGAATGTCTGTGTGTGTGTCAGAGTGGCACCTATG
    GTGAGAACTGTGAGAAACAGTCTCCAGATTATAAATCCAATGCAGTAGACGGACAGTGGG
    GTTGTTGGTCTTCCTGGAGTACCTGTGATGCTACTTATAAGAGATCGAGAACCCGAGAAT
    GCAATAATCCTGCCCCCCAACGAGGAGGGAAACGCTGTGAGGGGGAGAAGCGACAAGAGG
    AAGACTGCACATTTTCAATCATGGAAAACAATGGACAACCATGTATCAATGATGATGAAG
    AAATGAAAGAGGTCGATCTTCCTGAGATAGAAGCAGATTCCGGGTGTCCTCAGCCAGTTC
    CTCCAGAAAATGGATTTATCCGGAATGAAAAGCAACTATACTTGGTTGGAGAAGATGTTG
    AAATTTCATGCCTTACTGGCTTTGAAACTGTTGGATACCAGTACTTCAGATGCTTACCAG
    ACGGGACCTGGAGACAAGGGGATGTGGAATGCCAACGGACGGAGTGCATCAAGCCAGTTG
    TGCAGGAAGTCCTGACAATTACACCATTTCAGAGATTGTATAGAATTGGTGAATCCATTG
    AGCTAACTTGCCCCAAAGGCTTTGTTGTTGCTGGGCCATCAAGGTACACATGCCAGGGGA
    ATTCCTGGACACCACCCATTTCAAACTCTCTCACCTGTGAAAAAGATACTCTAACAAAAT
    TAAAAGGCCATTGTCAGCTGGGACAGAAACAATCAGGATCTGAATGCATTTGTATGTCTC
    CAGAAGAAGACTGTAGCCATCATTCAGAAGATCTCTGTGTGTTTGACACAGACTCCAACG
    ATTACTTTACTTCACCCGCTTGTAAGTTTTTGGCTGAGAAATGTTTAAATAATCAGCAAC
    TCCATTTTCTACATATTGGTTCCTGCCAAGACGGCCGCCAGTTAGAATGGGGTCTTGAAA
    GGACAAGACTTTCATCCAACAGCACAAAGAAAGAATCCTGTGGCTATGACACCTGCTATG
    ACTGGGAAAAATGTTCAGCCTCCACTTCCAAATGTGTCTGCCTATTGCCCCCACAGTGCT
    TCAAGGGTGGAAACCAACTCTACTGTGTCAAAATGGGATCATCAACAAGTGAGAAAACAT
    TGAACATCTGTGAAGTGGGAACTATAAGATGTGCAAACAGGAAGATGGAAATACTGCATC
    CTGGAAAGTGTTTGGCCTAGCACAATTACTGCTAGGCCCAGCACAATGAACAGATTTACC
    ATCCCGAAGAACCAACTCCTACAAATGAGAATTCTTGCACAAACAGCAGACTGGCATGCT
    CAAAGTTACTGACAAAAATTATTTTCTGTTAGTTTGAGATCATTATTCTCCCCTGACTCT
    CCTGTTTGGGCATGTCTTATTCAGTTCCAGCTCATGACGCCCTGTAGCATACCCCTAGGT
    ACCAACTTCCACAGCAGTCTCGTAAATTCTCCTGTTCACATTGTACAAAAATAATGTGAC
    TTCTGAGGCCCTTATGTAGCCTGTGACATTAAGCATTCTCACAATTAGAAATAAGAATAA
    AACCCATAATTTTCTTCAATGAGTTAATAAACAGAAATCTCCAGAACCTCTGAAACACAT
    TCTTGAAGCCCAGCTTTCATATCTTCATTCAACAAATAATTTCTGAGTGTGTATACAGGA
    TGTCAAGTACTGACCAAAGTCCTGAGAACTCGGCAGATAATAAAACAGACAAAAGCCTTT
    GCCTTCATGAAGCATACATTCATTCAGGGGTAGACACACAAAAAATGAAATAAACAGGTA
    AAATATGTAGC
    >Hs.268562_mRNA_2 gi|15341874|gb|BC013117.1|BC013117 Homo sapiens clone
    MGC:8711 IMAGE:3882749 polyA = 3
    CTCTCCTCGCCCGCTGGGTGCTGAAGTTGGGCGGATGGCAGCAAACCGGCTCCGCTAGAG
    GACCGAGCCGCCCAGCCCCGCTCCCCCGGACCCATCGGCGCGCTGCCCACACCTCCAGGC
    GACCGGCCAACTGGGTCCTGAAGTAGCTGAAATGCGAAAAAGGCAGCAGTCCCAAAATGA
    AGGAACACCTGCCGTGTCTCAAGCTCCTGGAAACCAGAGGCCCAACAACACCTGTTGCTT
    TTGTTGGTGCTGTTGTTGCAGCTGCTCCTGCCTCACTGTGAGGAATGAAGAAAGAGGGGA
    AAATGCGGGAAGACCCACACACACTACAAAAATGGAGAGTATCCAGGTCCTAGAGGAATG
    CCAAAACCCCACTGCAGAGGAAGTCTTGTCCTGGTCTCAAAATTTTGACAAGATGATGAA
    GGCCCCAGCAGGAAGAAACCTTTTCAGAGAGTTCCTCCGAACAGAATACAGTGAAGAGAA
    CCTACTTTTCTGGCTTGCTTGTGAAGACTTAAAGAAGGAGCAGAACAAAAAAGTAATTGA
    AGAAAAGGCTAGGATGATATATGAAGATTACATTTCTATACTATCACCAAAAGAGGTCAG
    TCTTGATTCTCGAGTTAGAGAGGTGATCAATAGAAATCTGTTGGATCCCAATCCTCACAT
    GTATGAAGATGCCCAACTTCAGATATATACTTTAATGCACAGAGATTCTTTTCCAAGGTT
    TTTGAACTCTCAAATTTATAAGTCATTTGTTGAAAGTACTGCTGGCTCTTCTTCTGAATC
    TTAATGTTCATTTAAAAACAATCATTTTGGAGGGCTGAGATGGGAAATAAAAGTAGTTAA
    ATAACATCAGAAACTGAGTTCCTGGAGAACTACAGTTTAGCATTCCTCAGGCTACTGTGA
    AAACACAACCGTTATGGTCTTTGTCTCCATTTTTATCAAGGTTTTCCATGGTTAAGTTTG
    GAGAAAATACCACACAAAACAATGAATTGCCAAATTGTTTGTTTTATTCAAGACTCATTC
    TACTTGCAAGCAAAGTGTATTTGTAGTCCTATGAACAGTCTCCTCGTGTATCTCCAGAGA
    CTGCATGTGCAAAGTAAAATGCTTCATTTGCCACATAGTTGTTGTAATATTTAATCCAGT
    AGCATAACTTATATCTGTATTTAAGGACTTTTGTGCAATATGGTCTTAAGAAATAATTGC
    CAAAAAAATCGGCCATGGTTCTGCATTTTTAACATAATCTAAGACAGAAAAAAAGCAATT
    TTTACTATGTAACAATGGTATTCAACATTCTATATACTGTGTTTAGTACACTAATTTTGA
    AGCCAATATTTCTGTACATGAAAAAGAGCTATTTATCTCTGTTTGTTGGAAAATCCTAAT
    GGGGATTCCTCTGGTTGTTCACTGCCAAAACTGTGGCATTTTCATTACAGGAGAGTTTAC
    TATGCTAAAAGCAAAAAACAAAAAAAAAAAAAAAGGGAAGAAGGAAAAAAGCAAAAAACA
    ATTTGAAGATATCCTATCTCAATGACAAATCAAAAGAGTGATATTGCTTTTAACTGTAAT
    AGAAGAAAATGAATTTATGTATATATCAGATGTCCAATACTGTAATTAATTTATTAAAGA
    CTGGCTCTCCAGTTTTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    >Hs.151301_mRNA_3 gi|16041747|gb|BC015754.1|BC015754 Homo sapiens clone
    MGC:23085 IMAGE:4862492 polyA = 3
    AAAAGAACCAGGATTGCATTTGAAGTTAAGCTGCAAAAAACCAGTCGATCAACAGATTTT
    CGAGTCCCACAGTCAATATGCACCATGTTTAATGTTATGGTTGATGCCAAAGCTCAATCA
    ACAAAACTTTGCAGCATGGAAATGGGCCAAGAGTTTGCTAAAATGTGGCATCAATACCAT
    TCAAAAATAGACGAACTAATTGAAGAAACTGTTAAAGAAATGATAACACTCTTGGTTGCA
    AAGTTCGTTACTATCTTGGAAGGAGTGCTGGCAAAATTATCCAGATATGACGAAGGGACT
    TTGTTTTCTTCTTTTCTGTCATTTACCGTGAAGGCAGCTTCCAAATATGTGGATGTACCT
    AAACCCGGGATGGACGTGGCCGACGCCTACGTGACTTTCGTCCGCCATTCTCAGGATGTC
    CTGCGTGATAAGGTCAATGAGGAGATGTACATAGAAAGGTTATTTGATCAATGGTACAAC
    AGCTCCATGAACGTGATCTGCACCTGGTTGACGGACCGGATGGACTTACAGCTTCATATT
    TATCAGTTGAAAACACTAATTAGGATGGTAAAGAAAACCTACAGAGATTTCCGATTGCAA
    GGGGTCCTGGACTCCACCTTAAACAGCAAGACCTATGAAACGATCCGGAACCGTCTCACT
    GTGGAGGAAGCCACAGCATCAGTGAGTGAAGGTGGGGGACTGCAGGGCATCAGCATGAAG
    GACAGCGATGAGGAAGACGAAGAAGACGATTAGACCATTTGGTCCTAGAGTCTGCTGGGA
    CAGAGTCCTGTAATCAGTGCATGTCCTTAGTCTGTTAGTTAAACCCATTAGGAATTTTCT
    GTCAACTACCATGCCCATGAGATGTTTATCAATACAACTGCCATTTTAGCTATGTGGTAC
    CAAGATTAGCAAATGACCTTCATATCCACTGATTTCCTGATGTCCATGTCTATATGTTTA
    CAAGCAATATGGAGCACCATTCTTTAAATACTGTTCATGGAGAATACATAGTCTAACCAC
    TAGGCGTGTCCCTGTTATCAGCAAAGATCAATGATGCTTCATTCATGTACTATGTATGCA
    TTGGTGGTAAATGGATGTGAGGGCAAGTACATCAAGTACATTCACTCTGTTTCACGTATG
    TGGATGCCAGTTAATTAAATGAGTACGTAAATAAATTAATTAAAACACATAGATCTGCTT
    TGTGTTTTTATTTTTATTTTTTGAAAAACAAAAGGCAAGTCTCCAACAATTAACTTTTGA
    TGCTTTCTGTTCCCCTAAAACCAAAAAATGAACCCCTTGTGTCGTTGTTAACCCATCCTT
    TCATTTACTCATATAATTAGCCAAAAAAAAAAGGATGGCTACATACCAATGGATTGATTC
    TCTTAATTGCCACGGCAAGGGGGCGATCCTATCATGACTTAACATCAAGCGCGCAGTTCA
    AAACTACTGTCTTCTGTCAAAGTTTTCTCCTCTTAAATGTTATTTTGCTTTTACGTCTCA
    ACTGTGTATGTAAAAAAAACGAATATTTAAATTACAACCCTAGACTAAAAATGTGTTTAT
    AATAAGATGTGGATATTTCCTTCAGTAGATTGTAACCATAATTTAAATTATTTTGTTCCA
    CACTGTTTTTTATATCTGTCATGTACATTGCATTTTGATCTGTAACTGCACAACCCTGGG
    GTTTGCTGCAGAGCTATTTCTTTCCATGTAAAGTAGTGGATCCATCTTGCTTTTGCCTTA
    TATAAAGCCTACAGTTATGGAAGTGTGGAAAACTGTGGCTTCTCAATAAATATTCAGATG
    TCCTAAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    AAAAAAA
    >Hs.111_contig1 AA946776|AW242338|H24274|AI078616 polyA = 1 polyA = 2
    ACCTGAACTGTCTAAGATATTCTAAGCAAAGTTGACAAAGACAATTCTCCACTTGAGCCC
    TTAAAAATGTAACCACTATAAAGGTTTCACGCGGTGGTTCTTATTGATTCGCTGTGTCAT
    CACATCAGCTCCACTGTTGCCAAACTTTGTCGCATGCATAATGTATGATGGAGGCTTGGA
    TGGGAATATGCTGATTTTGTTCTGCACTTAAAGGCTTCTCCTCCTGGAGGGCTGCCTAGG
    GCCACTTGCTTGATTTATCATGAGAGAAGAGGAGAGAGAGAGAGACTGAGCGCTAGGAGT
    GTGTGTATGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTATGTGTGTAGCGGGAGATGTGG
    GCGGAGCGAGAGCAAAAGGACTGCGGCCTGATGCATGCTGGAAAAAGACACGCTTTTCAT
    TTCTGATCAGTTGTACTTCATCCTATATCAGCACAGCTGCCATACTTCGACTTATCAGGA
    TTCTGGCTGGTGGCCTGCGCGAGGGTGCAGTCTTACTTAAAAGACTTTCAGTTAATTCTC
    ACTGGTATCATCGCAGTGAACTTAAAGCAAAGACCTCTTAGTAAAAAATAAAAAAAATAA
    A
    >Hs.150753_contig1 AI123582|AI288234 polyA = 0 polyA = 0
    GCTTCTCTTTAAAATTGACCCAAGGCATGAGCCACTGCGCCTGGCCAGCAAATGCTTTTT
    GTGCAGAATACACTTCTTTCAGGCATTGTCAGGTGCTGTTTTGTTTAAGCTCTAACTCAC
    CCCTGGAATACAGGGGAATGATGACAACCAGCCCAGCCAGGCCTGACTCATCATGGTCAC
    ATCCAGCCCCCACCCCCGGCCAACTAACCACTGCAGGCTCCTCTTCCAGACTCACCAGGG
    GGCCTCGAGGCCCCGGCATCTCCCTTGGCCCTGGGTGTGGGTTTTACAAGACTGTGTCTT
    TCATGACATCATAGCCCAACCATGTGAGAAGAAGGAGAAGGCCCCCCTTTCTTCATTAAT
    CTGAAAA
    >Hs.82109_mRNA_1 gi|14250611|gb|BC008765.1|BC008765 Homo sapiens clone
    MGC:1622 IMAGE:3347793 polyA = 3
    GGCACGAGGAAGGGCCTGTGGGTTTATTATAAGGCGGAGCTCGGCGGGAGAGGTGCGGGC
    CGAATCCGAGCCGAGCGGAGAGGAATCCGGCAGTAGAGAGCGGACTCCAGCCGGCGGACC
    CTGCAGCCCTCGCCTGGGACAGCGGCGCGCTGGGCAGGCGCCCAAGAGAGCATCGAGCAG
    CGGAACCCGCGAAGCCGGCCCGCAGCCGCGACCCGCGCAGCCTGCCGCTCTCCCGCCGCC
    GGTCCGGGCAGCATGAGGCGCGCGGCGCTCTGGCTCTGGCTGTGCGCGCTGGCGCTGAGC
    CTGCAGCCGGCCCTGCCGCAAATTGTGGCTACTAATTTGCCCCCTGAAGATCAAGATGGC
    TCTGGGGATGACTCTGACAACTTCTCCGGCTCAGGTGCAGGTGCTTTGCAAGATATCACC
    TTGTCACAGCAGACCCCCTCCACTTGGAAGGACACGCAGCTCCTGACGGCTATTCCCACG
    TCTCCAGAACCCACCGGCCTGGAGGCTACAGCTGCCTCCACCTCCACCCTGCCGGCTGGA
    GAGGGGCCCAAGGAGGGAGAGGCTGTAGTCCTGCCAGAAGTGGAGCCTGGCCTCACCGCC
    CGGGAGCAGGAGGCCACCCCCCGACCCAGGGAGACCACACAGCTCCCGACCACTCATCAG
    GCCTCAACGACCACAGCCACCACGGCCCAGGAGCCCGCCACCTCCCACCCCCACAGGGAC
    ATGCAGCCTGGCCACCATGAGACCTCAACCCCTGCAGGACCCAGCCAAGCTGACCTTCAC
    ACTCCCCACACAGAGGATGGAGGTCCTTCTGCCACCGAGAGGGCTGCTGAGGATGGAGCC
    TCCAGTCAGCTCCCAGCAGCAGAGGGCTCTGGGGAGCAGGACTTCACCTTTGAAACCTCG
    GGGGAGAATACGGCTGTAGTGGCCGTGGAGCCTGACCGCCGGAACCAGTCCCCAGTGGAT
    CAGGGGGCCACGGGGGCCTCACAGGGCCTCCTGGACAGGAAAGAGGTGCTGGGAGGGGTC
    ATTGCCGTAGGCCTCGTGGGGCTCATCTTTGCTGTGTGCCTGGTGGGTTTCATGCTGTAC
    CGCATGAAGAAGAAGGACGAAGGCAGCTACTCCTTGGAGGAGCCGAAACAAGCCAACGGC
    GGGGCCTACCAGAAGCCCACCAAACAGGAGGAATTCTATGCCTGACGCGGGAGCCATGCG
    CCCCCTCCGCCCTGCCACTCACTAGGCCCCCACTTGCCTCTTCCTTGAAGAACTGCAGGC
    CCTGGCCTCCCCTGCCACCAGGCCACCTCCCCAGCATTCCAGCCCCTCTGGTCGCTCCTG
    CCCACGGAGTCGTGGGGTGTGCTGGGAGCTCCACTCTGCTTCTCTGACTTCTGCCTGGAG
    ACTTAGGGCACCAGGGGTTTCTCGCATAGGACCTTTCCACCACAGCCAGCACCTGGCATC
    GCACCATTCTGACTCGGTTTCTCCAAACTGAAGCAGCCTCTCCCCAGGTCCAGCTCTGGA
    GGGGAGGGGGATCCGACTGCTTTGGACCTAAATGGCCTCATGTGGCTGGAAGATCCTGCG
    GGTGGGGCTTGGGGCTCACACACCTGTAGCACTTACTGGTAGGACCAAGCATCTTGGGGG
    GGTGGCCGCTGAGTGGCAGGGGACAGGAGTCCACTTTGTTTCGTGGGGAGGTCTAATCTA
    GATATCGACTTGTTTTTGCACATGTTTCCTCTAGTTCTTTGTTCATAGCCCAGTAGACCT
    TGTTACTTCTGAGGTAAGTTAAGTAAGTTGATTCGGTATCCCCCCATCTTGCTTCCCTAA
    TCTATGGTCGGGAGACAGCATCAGGGTTAAGAAGACTTTTTTTTTTTTTTTTTTTAAACT
    AGGAGAACCAAATCTGGAAGCCAAAATGTAGGCTTAGTTTGTGTGTTGTCTCTTGAGTTT
    GTCGCTCATGTGTGCAACAGGGTATGGACTATCTGTCTGGTGGCCCCGTTTCTGGTGGTC
    TGTTGGCAGGCTGGCCAGTCCAGGCTGCCGTGGGGCCGCCGCCTCTTTCAAGCAGTCGTG
    CCTGTGTCCATGCGCTCAGGGCCATGCTGAGGCCTGGGCCGCTGCCACGTTGGAGAAGCC
    CGTGTGAGAAGTGAATGCTGGGACTCAGCCTTCAGACAGAGAGGACTGTAGGGAGGGCGG
    CAGGGGCCTGGAGATCCTCCTGCAGACCACGCCCGTCCTGCCTGTGGCGCCGTCTCCAGG
    GGCTGCTTCCTCCTGGAAATTGACGAGGGGTGTCTTGGGCAGAGCTGGCTCTGAGCGCCT
    CCATCCAAGGCCAGGTTCTCCGTTAGCTCCTGTGGCCCCACCCTGGGCCCTGGGCTGGAA
    TCAGGAATATTTTCCAAAGAGTGATAGTCTTTTGCTTTTGGCAAAACTCTACTTAATCCA
    ATGGGTTTTTCCCTGTACAGTAGATTTTCCAAATGTAATAAACTTTAATATAAAGTAAAA
    AAAAAAAAAAAAAAAAAAAAAAAA
    >Hs.44276_mRNA_2 gi|12654896|gb|BC001293.1|BC001293 Homo sapiens clone
    MGC:5259 IMAGE:3458115 polyA = 3
    CGGATGGGGAAAAAAAAAGATGTCAGCTCCTCCGCTGTAGTATTGCTCCTTAAAAACCCC
    TCTCTCTGAAAATGACATGCCCTCGCAATGTAACTCCGAACTCGTACGCGGAGCCCTTGG
    CTGCGCCCGGCGGAGGAGAGCGCTATAGCCGGAGCGCAGGCATGTATATGCAGTCTGGGA
    GTGACTTCAATTGCGGGGTGATGAGGGGCTGCGGGCTCGCGCCCTCGCTCTCCAAGAGGG
    ACGAGGGCAGCAGCCCCAGCCTCGCCCTCAACACCTATCCGTCCTACCTCTCGCAGCTGG
    ACTCCTGGGGCGACCCCAAAGCCGCCTATCGCCTGGAACAACCTGTTGGCAGGCCGCTGT
    CCTCCTGCTCCTACCCACCTAGTGTCAAGGAGGAGAATGTCTGCTGCATGTACAGCGCAG
    AGAAGCGGGCGAAAAGTGGCCCCGAGGCAGCTCTCTACTCCCACCCCTTGCCGGAGTCCT
    GCCTTGGGGAGCACGAGGTACCCGTGCCCAGCTACTACCGCGCCAGCCCGAGCTACTCCG
    CGCTGGACAAGACGCCCCACTGTTCTGGGGCCAACGACTTCGAAGCCCCTTTCGAGCAGC
    GGGCCAGTCTCAACCCGCGCGCCGAACATCTGGAATCGCCTCAGCTGGGGGGCAAAGTGA
    GTTTCCCTGAGACCCCCAAGTCCGACAGCCAGACCCCCAGCCCCAATGAAATCAAGACGG
    AGCAGAGCCTGGCGGGCCCTAAAGGGAGCCCCTCGGAGAGCGAAAAGGAGAGGGCCAAAG
    CTGCCGACTCCAGCCCAGACACCTCGGATAACGAAGCGAAAGAGGAGATAAAGGCAGAAA
    ACACCACAGGAAATTGGCTGACAGCAAAGAGCGGAAGGAAGAAGAGGTGCCCCTATACTA
    AACACCAGACGCTGGAATTGGAGAAAGAATTTCTGTTCAATATGTATTTGACGCGAGAGC
    GCCGCCTGGAGATTAGCAAGACCATTAACCTTACAGACAGACAAGTCAAAATCTGGTTTC
    AAAATCGCAGAATGAAACTCAAGAAAATGAACCGAGAGAATCGGATCCGGGAACTGACCT
    CCAATTTTAATTTCACCTGAGAGCGCGGCCTCTCCTCCTCCCTTCCCGCTCCTTCCTCTC
    CCCGCCCCTCCTCCCTTTGTGCCTGGTGATATATTTTTTTTTCCTCCCTGAGTATAAATG
    CAATGCGACTGCAAAAAAGGCAAAGACCTCAGACTCTCCTTCCAAGGGACCTGTGGTTCG
    TGCTGCGAAGATGCTTCCACTTAAAGCATGAGAAATGGGGTGCCGGGATGTGGGGTGTGG
    TGTGTGCCCTCATAGATGGGGGTGGGAGTGTGGCTGGTGTGTGTGTCAAACCCTCACTCA
    CCCACGCACTCACACACAGCATTCTGTTCTCCATGCAAAGTTAAGATCGAATCCATCCGC
    TTGTAGGGGAAAAAAAGGAAAAAAATTAACCAGAGAGGGTCTGTAATCTCGCAGAGCACA
    GGCAGAATCGTTCCTTCCTTGCTGCATTTCCTCCTTAGACTAATAGACGTTTTGGAAAGT
    TCGGCTAGTGTTCGTGTGTTTGTCGTAGCACCCAGAGCCTCCACCAAACCCTCTCCATGT
    CTTTACCTCCCAGTCGCTCTAAGAATCTGCTTGAAGTCTCGTATTTGTACTGCTTTCTGC
    TTTTCTCCCACCCCTCCTAGCACCCCCACATCCCCCATCTAGTAACATCTCAGAAATTTC
    ATCCAGAGGAACAAAAAAATTAAAAATAGAACATAGCAAAGCAAAGACAGAATGCCCCCC
    CCCAAATATTGTCCTGTCCCTGTCTGGGAGTTGTGTTATTTAAAGATATTCTGTATGTTG
    TATCTTTTGCATGTAGCTTCCTTAATGGAGAAAAAAAAATCCTAATAAATTTCCAGAATC
    ATAATCCTCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    AAAAAAAAA
    >Hs.2142_mRNA_4 gi|13325274|gb|BC004453.1|BC004453 Homo sapiens clone
    MGC:4303 IMAGE:2819400 polyA = 3
    GCAGTGGCCACGAGAGGCAGGCTGGCTGGGACATGAGGTTGGCAGAGGGCAGGCAAGCTG
    GCCCTTGGTGGGCCTCGTCCTGAGCACTCGGAGGCACTCCTATGCTTGGAAAGCTCGCTA
    TGCTGCTGTGGGTCCAGCAGGCGCTGCTCGCCTtGCTCCTCCCCACACTCCTGGCACAGG
    GAGAAGCCAGGAGGAGCCGAAACACCACCAGGCCCGCTCTGCTGAGGCTGTCGGATTACC
    TTTTGACCAACTACAGGAAGGGTGTGCGCCCCGTGAGGGACTGGAGGAAGCCAACCACCG
    TATCCATTGACGTCATTGTCTATGCCATCCTCAACGTGGATGAGAAGAATCAGGTGCTGA
    CCACCTACATCTGGTACCGGCAGTACTGGACTGATGAGTTTCTCCAGTGGAACCCTGAGG
    ACTTTGACAACATCACCAAGTTGTCCATCCCCACGGACAGCATCTGGGTCCCGGACATTC
    TCATCAATGAGTTCGTGGATGTGGGGAAGTCTCCAAATATCCCGTACGTGTATATTCGGC
    ATCAAGGCGAAGTTCAGAACTACAAGCCCCTTCAGGTGGTGACTGCCTGTAGCCTCGACA
    TCTACAACTTCCCCTTCGATGTCCAGAACTGCTCGCTGACCTTCACCAGTTGGCTGCACA
    CCATCCAGGACATCAACATCTCTTTGTGGCGCTTGCCAGAAAAGGTGAAATCCGACAGGA
    GTGTCTTCATGAACCAGGGAGAGTGGGAGTTGCTGGGGGTGCTGCCCTACTTTCGGGAGT
    TCAGCATGGAAAGCAGTAACTACTATGCAGAAATGAAGTTCTATGTGGTCATCCGCCGGC
    GGCCCCTCTTCTATGTGGTCAGCCTGCTACTGCCCAGCATCTTCCTCATGGTCATGGACA
    TCGTGGGCTTCTACCTGCCCCCCAACAGTGGCGAGAGGGTCTCTTTCAAGATTACACTCC
    TCCTGGGCTACTCGGTCTTCCTGATCATCGTTTCTGACACGCTGCCGGCCACTGCCATCG
    GCACTCCTCTCATTGGTGTCTACTTTGTGGTGTGCATGGCTCTGCTGGTGATAAGTTTGG
    CCGAGACCATCTTCATTGTGCGGCTGGTGCACAAGCAAGACCTGCAGCAGCCCGTGCCTG
    CTTGGCTGCGTCACCTGGTTCTGGAGAGAATCGCCTGGCTACTTTGCCTGAGGGAGCAGT
    CAACTTCCCAGAGGCCCCCAGCCACCTCCCAAGCCACCAAGACTGATGACTGCTCAGCCA
    TGGGAAACCACTGCAGCCACATGGGAGGACCCCAGGACTTCGAGAAGAGCCCGAGGGACA
    GATGTAGCCCTCCCCCACCACCTCGGGAGGCCTCGCTGGCGGTGTGTGGGCTGCTGCAGG
    AGCTGTCCTCCATCCGGCAATTCCTGGAAAAGCGGGATGAGATCCGAGAGGTGGCCCGAG
    ACTGGCTGCGCGTGGGCTCCGTGCTGGACAAGCTGCTATTCCACATTTACCTGCTGGCGG
    TGCTGGCCTACAGCATCACCCTGGTTATGCTCTGGTCCATCTGGCAGTACGCTTGAGTGG
    GTACAGCCCAGTGGAGGAGGGGGTACAGTCCTGGTTAGGTGGGGACAGAGGATTTCTGCT
    TAGGCCCCTCAGGACCCAGGGAATGCCAGGGACATTTTCAAGACACAGACAAAGTCCCGT
    GCCCTGTTTCCAATGCCAATTCATCTCAGCAATCACAAGCCAAGGTCTGAACCCTTCCAC
    CAAAAACTGGGTGTTCAAGGCCCTTACACCCTTGTCCCACCCCCAGCAGCTCACCATGGC
    TTTAAAACATGCTCTCTTAGATCAGGAGAAACTCGGGCACTCCCTAAGTCCACTCTAGTT
    GTGGACTTTTCCCCATTGACCCTCACCTGAATAAGGGACTTTGGAATTCTGCTTCTCTTT
    CACAACTTTGCTTTTAGGTTGAAGGCAAAACCAACTCTCTACTACACAGGCCTGATAACT
    CTGTACGAGGCTTCTCTAACCCCTAGTGTCTTTTTTTTCTTCACCTCACTTGTGGCAGCT
    TCCCTGAACACTCATCCCCCATCAGATGATGGGAGTGGGAAGAATAAAATGCAGTGAAAC
    CCTAAAAAAAAAAAAAAAAAAAAAA
    >Hs.180908_contig1 AA846824|AW611680|AA846182|AA846342|AA846360 polyA = 2
    polyA = 3
    TCTTCGCTCCTCTACCCCATAAAATTCCCTACAAATGCAAAAATTCGAGATAGAAGAAGC
    CGTCCCTGAAATTGCTGTCTAACATTCACCGGAAACCTCTCCATAAACAAGGAGAAACGA
    ATGCACACGCATTTTTGCTAAGAAGCCCGGGATTAAGATTTAAGGATACAAGCTGAAAGA
    AAAAATGAAAAATGCTTCTCCGCGCGTCAATCGAGGGGTGGATGCGCCACGCAGCTGAGC
    CCAGCTCACAGCCACGCGTAAGACCAAAAGCTGCCATGGGTTCTGCGCGCGGAGACCTCA
    GAGCCGAAGAGAGAAGTCCCCGCGTCAGAAACGCTGCGGATGCCAGGTCTTGAAAATGCT
    GACTTCTGAGGCTAAGAATTATTTCAAAGACAAAAAGAAAAGACTGGTGAGGAGGCCTTC
    CGGTGCAAGGGCGCCTATCCGCTAATTTTGGATGGGGAAGTAGGGATTATTCGTTTAAAT
    TCAATCGCGAGCACCAAGTCGGACTGGCCGGGGATGGAGAAGGGCAACCCCCACCTTTAG
    AAAAATAAAAGATCTCGAAGGCCAAAAAAAAAAA
    >Hs.89436_mRNA_1 gi|16507959|ref|NM_004063.2|Homo sapiens cadherin 17,
    cadherin (liver-intestine) (CDH17), mRNA polyA = 1
    AGGGAGTGTTCCCGGGGGAGATACTCCAGTCGTAGCAAGAGTCTCGACCACTGAATGGAA
    GAAAAGGACTTTTAACCACCATTTTGTGACTTACAGAAAGGAATTTGAATAAAGAAAACT
    ATGATACTTCAGGCCCATCTTCACTCCCTGTGTCTTCTTATGCTTTATTTGGCAACTGGA
    TATGGCCAAGAGGGGAAGTTTAGTGGACCCCTGAAACCCATGACATTTTCTATTTATGAA
    GGCCAAGAACCGAGTCAAATTATATTCCAGTTTAAGGCCAATCCTCCTGCTGTGACTTTT
    GAACTAACTGGGGAGACAGACAACATATTTGTGATAGAACGGGAGGGACTTCTGTATTAC
    AACAGAGCCTTGGACAGGGAAACAAGATCTACTCACAATCTCCAGGTTGCAGCCCTGGAC
    GCTAATGGAATTATAGTGGAGGGTCCAGTCCCTATCACCATAGAAGTGAAGGACATCAAC
    GACAATCGACCCACGTTTCTCCAGTCAAAGTACGAAGGCTCAGTAAGGCAGAACTCTCGC
    CCAGGAAAGCCCTTCTTGTATGTCAATGCCACAGACCTGGATGATCCGGCCACTCCCAAT
    GGCCAGCTTTATTACCAGATTGTCATCCAGCTTCCCATGATCAACAATGTCATGTACTTT
    CAGATCAACAACAAAACGGGAGCCATCTCTCTTACCCGAGAGGGATCTCAGGAATTGAAT
    CCTGCTAAGAATCCTTCCTATAATCTGGTGATCTCAGTGAAGGACATGGGAGGCCAGAGT
    GAGAATTCCTTCAGTGATACCACATCTGTGGATATCATAGTGACAGAGAATATTTGGAAA
    GCACCAAAACCTGTGGAGATGGTGGAAAACTCAACTGATCCTCACCCCATCAAAATCACT
    CAGGTGCGGTGGAATGATCCCGGTGCACAATATTCCTTAGTTGACAAAGAGAAGCTGCCA
    AGATTCCCATTTTCAATTGACCAGGAAGGAGATATTTACGTGACTCAGCCCTTGGACCGA
    GAAGAAAAGGATGCATATGTTTTTTATGCAGTTGCAAAGGATGAGTACGGAAAACCACTT
    TCATATCCGCTGGAAATTCATGTAAAAGTTAAAGATATTAATGATAATCCACCTACATGT
    CCGTCACCAGTAACCGTATTTGAGGTCCAGGAGAATGAACGACTGGGTAACAGTATCGGG
    ACCCTTACTGCACATGACAGGGATGAAGAAAATACTGCCAACAGTTTTCTAAACTACAGG
    ATTGTGGAGCAAACTCCCAAACTTCCCATGGATGGACTCTTCCTAATCCAAACCTATGCT
    GGAATGTTACAGTTAGCTAAACAGTCCTTGAAGAAGCAAGATACTCCTCAGTACAACTTA
    ACGATAGAGGTGTCTGACAAAGATTTCAAGACCCTTTGTTTTGTGCAAATCAACGTTATT
    GATATCAATGATCAGATCCCCATCTTTGAAAAATCAGATTATGGAAACCTGACTCTTGCT
    GAAGACACAAACATTGGGTCCACCATCTTAACCATCCAGGCCACTGATGCTGATGAGCCA
    TTTACTGGGAGTTCTAAAATTCTGTATCATATCATAAAGGGAGACAGTGAGGGACGCCTG
    GGGGTTGACACAGATCCCCATACCAACACCGGATATGTCATAATTAAAAAGCCTCTTGAT
    TTTGAAACAGCAGCTGTTTCCAACATTGTGTTCAAAGCAGAAAATCCTGAGCCTCTAGTG
    TTTGGTGTGAAGTACAATGCAAGTTCTTTTGCCAAGTTCACGCTTATTGTGACAGATGTG
    AATGAAGCACCTCAATTTTCCCAACACGTATTCCAAGCGAAAGTCAGTGAGGATGTAGCT
    ATAGGCACTAAAGTGGGCAATGTGACTGCCAAGGATCCAGAAGGTCTGGACATAAGCTAT
    TCACTGAGGGGAGACACAAGAGGTTGGCTTAAAATTGACCACGTGACTGGTGAGATCTTT
    AGTGTGGCTCCATTGGACAGAGAAGCCGGAAGTCCATATCGGGTACAAGTGGTGGCCACA
    GAAGTAGGGGGGTCTTCCTTGAGCTCTGTGTCAGAGTTCCACCTGATCCTTATGGATGTG
    AATGACAACCCTCCCAGGCTAGCCAAGGACTACACGGGCTTGTTCTTCTGCCATCCCCTC
    AGTGCACCTGGAAGTCTCATTTTCGAGGCTACTGATGATGATCAGCACTTATTTCGGGGT
    CCCCATTTTACATTTTCCCTCGGCAGTGGAAGCTTACAAAACGACTGGGAAGTTTCCAAA
    ATCAATGGTACTCATGCCCGACTGTCTACCAGGCACACAGAGTTTGAGGAGAGGGAGTAT
    GTCGTCTTGATCCGCATCAATGATGGGGGTCGGCCACCCTTGGAAGGCATTGTTTCTTTA
    CCAGTTACATTCTGCAGTTGTGTGGAAGGAAGTTGTTTCCGGCCAGCAGGTCACCAGACT
    GGGATACCCACTGTGGGCATGGCAGTTGGTATACTGCTGACCACCCTTCTGGTGATTGGT
    ATAATTTTAGCAGTTGTGTTTATCCGCATAAAGAAGGATAAAGGCAAAGATAATGTTGAA
    AGTGCTCAAGCATCTGAAGTCAAACCTCTGAGAAGCTGAATTTGAAAAGGAATGTTTGAA
    TTTATATAGCAAGTGCTATTTCAGCAACAACCATCTCATCCTATTACTTTTCATCTAACG
    TGCATTATAATTTTTTAAACAGATATTCCCTCTTGTCCTTTAATATTTGCTAAATATTTC
    TTTTTTGAGGTGGAGTCTTGCTCTGTCGCCCAGGCTGGAGTACAGTGGTGTGATCCCAGC
    TCACTGCAACCTCCGCCTCCTGGGTTCACATGATTCTCCTGCCTCAGCTTCCTAAGTAGC
    TGGGTTTACAGGCACCCACCACCATGCCCAGCTAATTTTTGTATTTTTAATAGAGACGGG
    GTTTCGCCATTTGGCCAGGCTGGTCTTGAACTCCTGACGTCAAGTGATCTGCCTGCCTTG
    GTCTCCCAATACAGGCATGAACCACTGCACCCACCTACTTAGATATTTCATGTGCTATAG
    ACATTAGAGAGATTTTTCATTTTTCCATGACATTTTTCCTCTCTGCAAATGGCTTAGCTA
    CTTGTGTTTTTCCCTTTTGGGGCAAGACAGACTCATTAAATATTCTGTACATTTTTTCTT
    TATCAAGGAGATATATCAGTGTTGTCTCATAGAACTGCCTGGATTCCATTTATGTTTTTT
    CTGATTCCATCCTGTGTCCCCTTCATCCTTGACTCCTTTGGTATTTCACTGAATTTCAAA
    CATTTGTCAGAGAAGAAAAACGTGAGGACTCAGGAAAAATAAATAAATAAAAGAACAGCC
    TTTTCCCTTAGTATTAACAGAAATGTTTCTGTGTCATTAACCATCTTTAATCAATGTGAC
    ATGTTGCTCTTTGGCTGAAATTCTTCAACTTGGAAATGACACAGACCCACAGAAGGTGTT
    CAAACACAACCTACTCTGCAAACCTTGGTAAAGGAACCAGTCAGCTGGCCAGATTTCCTC
    ACTACCTGCCATGCATACATGCTGCGCATGTTTTCTTCATTCGTATGTTAGTAAAGTTTT
    GGTTATTATATATTTAACATGTGGAAGAAAACAAGACATGAAAAGAGTGGTGACAAATCA
    AGAATAAACACTGGTTGTAGTCAGTTTTGTTTGTTAA
    >Hs.151544_mRNA_8 gi|3153107|emb|AL023657.1|HSDSHP Homo sapiens SH2D1A
    cDNA, formerly known as DSHP polyA = 3
    AAATCCTTCTTCCAATGTTCCTCCCCTCTCTGTATGAACCCTGTGTTGGGGGGCAGAAGA
    TGGAAGCCCTTGGCAAGCTCGATCGAACCAAGCTACTAAATTGCTGAGCTCGTTTTAACT
    GAAGTGTGAGAAGGAGGTTTAAGGCAAGTAGACAACATCCTGTTGTTGGGGTGCTTCTCT
    CTTTTTTGCACATCTGGCTGAACTGGGAGTCAGGTGGTTGACTTGTGCCTGGCTGCAGTA
    GCAGCGGCATCTCCCTTGCACAGTTCTCCTCCTCGGCCTGCCCAAGAGTCCACCAGGCCA
    TGGACGCAGTGGCTGTGTATCATGGCAAAATCAGCAGGGAAACCGGCGAGAAGCTCCTGC
    TTGCCACTGGGCTGGATGGCAGCTATTTGCTGAGGGACAGCGAGAGCGTGCCAGGCGTGT
    ACTGCCTATGTGTGCTGTATCACGGTTACATTTATACATACCGAGTGTCCCAGACAGAAA
    CAGGTTCTTGGAGTGCTGAGACAGCACCTGGGGTACATAAAAGATATTTCCGGAAAATAA
    AAAATCTCATTTCAGCATTTCAGAAGCCAGATCAAGGCATTGTAATACCTCTGCAGTATC
    CAGTTGAGAAGAAGTCCTCAGCTAGAAGTACACAAGGTACTACAGGGATAAGAGAAGATC
    CTGATGTCTGCCTGAAAGCCCCATGAAGAAAAATAAAACACCTTGTACTTTATTTTCTAT
    AATTTAAATATATGCTAAGTCTTATATATTGTAGATAATACAGTTCGGTGAGCTACAAAT
    GCATTTCTAAAGCCATTGTAGTCCTGTAATGGAAGCATCTAGCATGTCGTCAAAGCTGAA
    ATGGACTTTTGTACATAGTGAGGAGCTTTGAAACGAGGATTGGGAAAAAGTAATTCCGTA
    GGTTATTTTCAGTTATTATATTTACAAATGGGAAACAAAAGGATAATGAATACTTTATAA
    AGGATTAATGTCAATTCTTGCCAAATATAAATAAAAATAATCCTCAGTTTTTGTGAAAAG
    CTCCATTTTTAGTGAAATATTATTTTATAGCTACTAATTTTAAAATGTCTTGCTTGATTG
    TATGGTGGGAAGTTGGCTGGTGTCCCTTGTCTTTGCCAAGTTCTCCACTAGCTATGGTGT
    CATAGGCTCTTTTGGGATTTTTGAAGCTGTATACTGTGTGCTAAAACAAGCACTAAACAA
    AGAGTGAAGGATTTATGTTTAATTCTGAAAGCAACCTTCTTGCCTAGTGTTCTGATATTG
    GACAGTAAAATCCACAGACCAACCTGGAGTTGAAAATCTTATAATTTAAAATATGCTCTA
    AACATGTTTATCGTATTTGATGCTACAGGATTTGAAATTGTATTACAAATCCAATGAAAT
    GAGTTTTTCTTTTCATTTACCTCTGCCCCAGTTGTTTCTACTACATGGAAGACCTCATTT
    TGAAGGGAAATTTCAGCAGCTGCAGCTCATGAGTAACTGATTTGTAACAAGCCTCCTTTT
    AAAGTAACCCTACAAAACCACTGGAAAGTTTATGGTTGTATTATTTTTTAAAAAAATTCC
    AAGTGATTGAAACCTACACGAGATACAGAATTTTATGCGGCATTTTCTTCTCACATTTAT
    ATTTTTGTGATTTTGTGATTGATTATATGTCACTTTGCTACAGGGCTCACAGAATTCATT
    CACTCAACAAACATAATAGGGCGCTGAGGGCATAGAAGTAAAAACACCTGGTCCCTGCTC
    TCAGTTCACTGTCTTGTTGGACGAGAAAAGAAACAATAACGATAAAAGACAGTGAAAGAA
    AATAACGATAAAAGACAGTGAAAGAAAATAACAATAAAAGACAAGGAAAAAATAACAATG
    AAAGTTGATAAGTACATGATAAGCGAGGTTCCCCGTGTGTAGGTAGATCTGGTCTTTAGA
    GGCAGATAGATAGGTCAGTGCAAATACTCTGGTCCATGGGCCATATGAAAAGGCTAAGCT
    TCACTGTAAAATAATAACTGGGAATTCTGGATTGTGTATGGGTGTTGGTGAACTTGGTTT
    TAATTAGTGAACTGCTGAGAGACAGAGCTATTCTCCATGTACTGGCAAGACCTGATTTCT
    GAGCATTTAATATGGATGCCGTGGGAGTACAAAAGTGGAGTGTGGCCTGAGTAATGCATT
    ATGGGTGGTTTACCATTTCTTGAGGTAAAAGCATCACATGAACTTGTAAAGGAATTTAAA
    AATCCTACTTTCATAATAAGTTGCATAGGTTTAATAATTTTTAATTATATGGCTTGAGTT
    TAAATTGTAATAGGCGTAACTAATTTTAACTCTATAATGTGTTCATTCTGGAATAATCCT
    AAACATATGAATTATGTTTGCATGTTCACTTCCAAGAGCCTTTTTTTGAAAAAAAGCTTT
    TTTTGAATCATCAAGTCTTTCACATTTAAATAAAGTGTTTGAAAGCTTTATTTAAAAAAA
    AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    AAGAAAAAAA
    >Hs.1657_contig4
    AW473119|AA164586|AI540656|AI758480|AI810941|AI978964|AI675862|AI784397|
    AW591562|AW514102|AI888116|AI983175|AI634735|AI669577|AI202659|AI910598|
    AI961352|AI565481|AI886254|AI538838|AA291749|AW571455|AI370308|AI274727|
    AW473925|AW514787|AI273871|AW470552|AI524356|AI888281|AW089672|AI952766|
    AW440601|AT654044|AW438839|AI972926 polyA = 2 polyA = 3
    AATTGTTTTCTAAGTAATTGCTGCCTCTATTATGGCACTTCATTTTTGCACTGTCTTTTG
    AGATTCAAGAAAAATTTCTATTCTTTTTTTTGCATCCAATTGTGCCTGAACTTTTAAAAT
    ATGTAAATGCTGCCATGTTCCAAACCCATCGTCAGTGTGTGTGTTTAGAGCTGTGCACCC
    TAGAAACAACATATTGTCCCATGAGCAGGTGCCTGAGACACAGACCCCTTTGCATTCACA
    GAGAGGTCATTGGTTATAGAGACTTGAATTAATAAGTGACATTATGCCAGTTTCTGTTCT
    CTCACAGGTGATAAACAATGCTTTTTGTGCACTACATACTCTTCAGTGTAGAGCTCTTGT
    TTTATGGGAAAAGGCTCAAATGCCAAATTGTGTTTGATGGATTAATATGCCCTTTTGCCG
    ATGCATACTATTACTGATGTGACTCGGTTTTGTCGCAGCTTTGCTTTGTTTAATGAAACA
    CACTTGTAAACCTCTTTTGCACTTTGAAAAAGAATCCAGCGGGATGCTCGAGCACCTGTA
    AACAATTTTCTCAACCTATTTGATGTTCAAATAAAGAATTAAACTAAAAAAAAAAAAAAA
    A
    >Hs.35984_mRNA_1 gi|6049161|gb|AF133587.1|AF133587 Homo sapiens chromosome
    22 map 22q11.2 polyA = 3
    GGCGCCGCGGACGCTGCTGGAGTCGCCTGGCAACGATGTCGCCTGGCAACTGAATAGGTT
    GGCCAGTGGCGCGGGCTACTGGAAGCAGAAAGGGCTGCGGAGGCAGTGAGTGGTTTCTGC
    AGAGCTTCATTTGGAAAGGCCTCTGTAGTTGGGGAAAGATGGCCCATTCCCAGAACTCCT
    TGGAGCTTCCCATTAACATCAATGCCACCCAGATTACCACTGCCTATGGCCATCGGGCCC
    TGCCCAAGCTGAAGGAGGAGCTGCAGTCAGAGGACCTCCAGACGAGGCAGAAAGCCCTCA
    TGGCCCTGTGTGACCTCATGCATGACCCCGAGTGTATCTACAAGGCCATGAACATAGGCT
    GTATGGAGAACCTGAAAGCTTTGCTGAAGGATAGCAACAGTATGGTGCGCATAAAGACCA
    CCGAGGTGCTCCACATCACGGCAAGCCATAGCGTGGGCAGATACGCCTTTCTAGAGCACG
    ACATCGTCCTTGCCCTGTCCTTCCTGCTGAATGACCCCAGCCCAGTCTGCCGGGGGAACC
    TGTACAAGGCATACATGCAGCTGGTCCAGGTGCCTAGAGGGGCCCAAGAGATCATCAGCA
    AAGGTCTGATTTCCTCACTGGTATGGAAGCTGCAGGTGGAGGTGGAGGAGGAGGAGTTCC
    AGGAGTTCATCCTGGACACACTGGTCCTCTGCCTGCAGGAGGATGCCACCGAGGCCCTGG
    GCAGCAATGTGGTGCTTGTCCTGAAGCAGAAGCTCCTCAGCGCCAACCAGAACATCCGCA
    GCAAGGCCGCCCGTGCGCTCCTTAATGTCAGCATATCTCGAGAGGGCAAGAAACAGGTGT
    GTCATTTTGACGTCATCCCCATCCTGGTCCATCTGCTGAAAGACCCAGTGGAGCATGTGA
    AGTCTAACGCTGCCGGTGCCCTGATGTTCGCCACAGTGATCACTGAAGGGAAGTATGCGG
    CCCTGGAGGCACAAGCCATCGGCCTGCTCCTGGAGCTGCTGCACTCCCCCATGACCATAG
    CGCGCCTGAATGCCACCAAGGCCCTTACCATGCTGGCAGAGGCCCCCGAGGGCCGCAAGG
    CCCTGCAGACGCACGTGCCCACTTTCCGTGCCATGGAGGTGGAGACTTACGAAAAGCCTC
    AAGTGGCCGAAGCCTTACAGCGGGCAGCCCGGATCGCCATCAGTGTCATCGAGTTCAAAC
    CCTGAGCCCTTCATTCACCTCTGTGAGTGAATAAATGTGCTAAGTCTCTTTAAAAAAAAA
    AAAAAAAAAAAAAAAAAAAAAAAAAA
    >Hs.334534_mRNA_2 gi|17389403|gb|BC017742.1|BC017742 Homo sapiens, clone
    IMAGE:4391536, mRNA polyA = 3
    AGAGCAGTAAGCTTGTGATAAAGGCCAATTCCAGGTAGCTCTTGAAGGTGATAGCCATCT
    ACTTTCCAGTGGCTGCCAACCACAGGGAGTGCCAGTTAACACTGGAAGGATTAAGGCAAG
    GTCCCTTCTCTTGAGACTCCCCTCTGAGATCTGAAAAATGAAGTGGCTTAGGAACATCAG
    CAGTGAAGAACTGCCAAGAGTTGGTGAAGGTTGTCTCTTCCGAGGGCCTTCTGAAGACAG
    GGCTCTTGAACAGACAAGTGGAAGGGCTGTACCAGGGATAAAGGAAAGAAGTGCCTGTCC
    AGCAGGGAGCTTGAATTTAAGTTCCATGTATGAAGTCATTGGCTCTATCTGCATTTTTCT
    GTCATTCTCTTCATTTGTTTTAAGGTGGAAAATTTTCTTACAGTTGATGCAAAGTATCAA
    CTACTTTACCCTACCTTCTCCCCTTTTAGATGGGTTCTTCCTGAGTTTTGGAGTCTTGTA
    TGATTATCAGTATTCCCCTGTCAAAATCAAATCTATTCAGGTTTCTTCACTGTTGAGAAC
    ACCTAAATGTTTTTATTTTTGAGAAGTGGGGACAGAGTCTCACTATGTCACCCAGGCTGG
    AGTGCAATGGCATGATCTCAGCTCACTGCAACCTTCGCCTCCTGGGTTCAAGCGATTCTC
    CTGCCTCCGCCTCCTGAGTAGCTGGGATTATAGGCACGCACCACCACGCCCAGCTAATTT
    TTTGTATTTTTAGTAGAGACAGAGTTTCACCATGTTGGCCAGGCTGGTCTTGAACTCCTG
    ACCTTGTGATCCACCCACCTCGGCCTCCCAGAGTGCTGGGATTACAGGCATGAGCCACCA
    CGCTTGGCTAAGAACACCTAAATTTTTATGTTTCTTGGCTCAAAAACCAGTTCCATTTCT
    AATGTTGTCCTCACAAGAAGGCTAATTGGTGGTGAGACAGCAGGGGAGGAGGAAGAGCTG
    TGGTTTGTAACTTGTTCAACTCAGGCAATAAGCGATTTTAGCTTTATTTAAAGTCTTCTG
    TCCAGCTTTAAGCACTTTGTAAGACATGGCTGAAAGTAGCTTTTCTATCAGAATTGCAGA
    TAGTCATGTTGGGCTAACAGTCAATTGGATATATTCCTTTACCTCACATGACCCCAGCAA
    CTGTGGTGGTATCTAGAGGTGAAACAGGCAAGTGAAATGGACACCTCTGCTGTGAATGTT
    TTAGAGAAGGAAATTCAAAAAATGTTGTAACTGAAAGCACTGTTGAATATGGGTATCGGC
    TTTCTTTTTCACTTTGACTCTTAACATTATCAGTCAACTTCCACATTAATGAAAGTTGAC
    CATAGTTATTTCCAAATAAAAAGAAACCAACTCTTACCAGGTCTTGGACTGTGATGTCAT
    ATTATTCAGTTTTATGCTTGTTCCTGAGCAGAACTCATAAGAGTGACATAGTCAGCTGCT
    GACGGCACCTCAGCCACGCCACTCTTACTCAGTTCAGTGGGTGTGCTTGCGTGGTAGGAT
    GTGGTGCAGCCCTCTCTACGCTCTTCTATTTTTGGTATATTTCCTATCTAACCTTCAAAT
    AGCTTCCAATTCTTTTTTTCTTGGACTGGCTTCATTCTGAATTTGTGCTAAAATAATCTT
    TCATAAAGAGACCTCAGTTTATAGCGTAACAGACTACACAATGCACTGATGTTTTCATAA
    TGTTTAAGGGACCCACTGCAAGAAGCTTGCTGCCTCCTTTTAATTGTATTCATTTAGATT
    TTGATTTTCCATGTTAAGAAGGTGAGGTCCATGTTGGTGCCCTTCAGAGTAGAGAACCAT
    GTAAACATTAGGAATGAACAGAGGCCTTAGGAATGAATAGAGAGTTTGCCTTATACAATT
    TCCTGTTACAAAGCTCTCCCTCTCATGCAAAGTAGGGAACACCTTTTGAGCATCTTTGAA
    TTTGACAAATGGTGCTGTTGCAAACACTTTTTTTTTGAGATGAAGTCTCGCGGTTGTCAC
    CCGGGCTGGAGTGCAGTGGCGTGATCTCGGCTCACTGCAACTTCCACCTCCTGGGTTCCA
    GCAGTTCTCCTGCCTCAGCCTCCCAAGTAGCTGAGATTACAGGCGCCTGCCACCCCACCT
    GGCTGATTTTTGTAATTTTAGTAGAGACGGGGTTTCACCATGTTGGCCAGGCTGATTAAC
    TCCTGACCTCAGGTGATCCACCTTTCTCGGCCTCCCAAAGTGCTGGGATTACGGGTGTGA
    GCCACCGTGCCCGGCCTGCAAACACATTTTAATTGACAACACTAGGGCTGTTGTACAAAA
    TAGTAATGATAGCCATGGAAGTTTTACCTTATTCTGTGAGAAGTGTTCTTAAACTTATTA
    AGTGTCTAAACTAAGGTTTAGTGCTTTTTTAAAGGAAAGTTGTCCCAGGATTCATCCTAA
    AGAAAGCAAAAGTTAATTCAACTGATCCACCAATGGAATTAGATGGGTAGAGTTGGGTTC
    TTGAGTTTTACCACCACTTAGTTCCCACTGAATTTTGTAACTTCCTGTGTTTGCATCCTC
    TGTTCCTATTCTGCCCTTGCTCTGTGTCATCTCAGTCATTTGACTTAGAAAGTGCCCTTC
    AAAAGGACCCTGTTCACTGCTGCACTTTTCAATGAATTAAAATTTATTTCTGTTCTAAAA
    AAAAAAAAAAA
    >Hs.60162_mRNA_1 gi|10437644|dbj|AK025181.1|AK025181 Homo sapiens cDNA
    FLJ21528 fis, clone COL05977 polyA = 3
    TGATCAACAACTGTCAGCTCCCAGTCAGAGAGAAAGGGCCTCTTCAGTCTGTCTCAGGAG
    ACTGGGAGAAACAGCATAAAGGACCCCACAAGGAAGGGAGAGGTACCCTGGGTCAGGCGC
    TTGTGGAGAGAGGGCTTCGCATGTAAAGTGACGTCAGGGAAAATAGAACAGAAAAAAAGC
    CAGGGCCAGCCCAGAGGCACCTGAGAAGAATCAGACCCACAGCTCAGCCCAGCCCTGGCA
    CAGAGAAGAGACAGGCCTGGCAGCACCCAGGGACCCCCTTTCCTCAGCCTCCACCTGCAG
    GACAGCAGGAGCACTGATGCGCTGAAGGTACGTTCTGGAGTCTGGAAGCAGCAGAACTGA
    AGGAAGTAAACACGGGTGTCTGGGAAGACCCCTCAAGCTGCAGTAAAGCCCAGGACTGAA
    TTGGCCACCTGAGGCCAAGGGTGGCACTCCAACCTCCTCCTAAAGGCTGGCTAGAGCCAC
    AGGAAAGGGCCAGAAGCCAGAGAAAGGGCAAAGGTGGACCCCTGCCTCCAAACCTCCTCT
    GGAGACTGACCTCCTCTTTCCTGTGCCTTATTGTTTCTCCCTCTTCTCTTTGTTCGCCAC
    TGGGCGGTGACCTCAGGGATCCTGGCCTAACCTGGTGATTGTGCAGGCAACTGTGTCCGA
    GAAGACCCTTCTCTGGAAGATTGAACCCCAATTCAGCCATGGTGACTCCTTTGATGTCAA
    ACTGGTAAGGGCTGAGCCGTGGGCACAGGATACCACTCCTTCCAGCTCTTCTGCTGTGAC
    CTGCCCATGGAAGTCCCTGTGGACACGAAATCCTGTTTGGATCATCTAACTGGAGGCTCT
    CTGTTCTTCACCTCCACGCGCCCTCTTGACCCCAGGAGGTTCAGGGGAGGAAGTACGCCA
    CTCTCCACTGGCACCCTCCTTGGCCTACACAGAGTCACCCCTGAGCCCCTCAATGTGTGC
    TGAGGTGGGCCCTGCTCTCTGCAGGGGTATGGAGAGAAATAGCTTGGGGTGCTGTGAGGC
    CCCGAAGAAGCTGGGCCTGTCCTTCTCCATCGAGGCGATCCTAAAGAGGCCTGCCAGGAG
    GAGTGATATGGACAGACCAGAAGGGCCAGGTGAAGAGGGCCCCGGAGAAGCTGCGGCCTC
    AGGCTCTGGGCTAGAAAAGCCTCCAAAGGACCAGCCCCAGGAAGGAAGGAAGAGCAAGCG
    GAGGGTTCGTACCACCTTCACCACTGAGCAGCTGCATGAGCTGGAGAAGATCTTCCACTT
    TACCCACTACCCAGACGTTCACATCCGCAGCCAGCTGGCAGCCAGGATCAACCTCCCAGA
    AGCTCGGGTGCAGATCTGGTTCCAGAATCAGCGAGCCAAGTGGCGGAAGCAGGAGAAGAT
    TGGCAACCTGGGGGCTCCACAGCAGCTGAGTGAAGCCAGTGTGGTCCTGCCCACAAATCT
    GGATGTGGCTGGGCCCACGTGGACATCCACTGCTCTGCGCAGGCTGGCTCCTCCCACGAG
    CTGTTGTCCATCGGCTCAAGATCAGCTGGCCTCTGCCTGGTTCCCTGCCTGGATCACCCT
    CCTCCCAGCGCACCCATGGGAAACACAGCCTGTCCCAGGTCTTCCCATCCATCAAACTTG
    CATCCCTGTGCTATGCATCCTTCCACCTCCACACCCCAAATGGGGCAGCATCTGTGCTAC
    TTCAACATAGAGATTGGACATGCTCTCCCCAAATGAGCCACTTTCCTCTCCAGGTGAAGG
    CAGGTAGCAGATGTGCCCTGGGCCTCTGGGGAAATCGATCTCACAATCCAAAAATGGCCC
    ACAGCCCAGGAAGCTACCCTGAACATGCCAGTTGGAAGGCTGCACCAGACTCAAAAGCAA
    ACTAAACAATAAAGGACAGCTCTCTTCTCTCCTGGCTAAAGCTGCTCTCCTGGTTCAGAA
    GACAGGCTGGATGAGATCTCAGGCCGAGCTCTGAAATAGGGAGGTAATCCTCCAGCACCT
    GTGTTTCCTCTAACTTGCTGTGTGACCTCCAGCCGGTCACTCACCCTCTCTGGACCTCAT
    CTGTAAGAGGAGCCAGCTGGATAAGATGATTTCTGAAGACGCTTCCATGGTGGGCACTGA
    GGCACAGAGGAGGCCAAGGAGAGGTTGTTTGTTCATGCATGCATTCATCCGTGACACATG
    AGTACCTACTGAGGACTCCATAAACAGAACGGGATACAGAGATAAACAATTTGGGTTCTG
    TCCACGTTTGTCAAAAGGTGGTGCTGGCCCACCTCTGAAAGCAGAACACTTGCTCAACAA
    CCTTGCTGTTGGCCCAAGTCTAACACATTCTTTATGACTGTGAGCATCTCAGAGTGAGAG
    AAAAATGTAGAAAGTTTTTTAAATTCTAAACAGGATTTAGTGTCTTTAGTTATCTTGCTG
    GATGGGAAAGGGATGTTGTCATTTCTGGCACAAATGAAAAGTAGGACGGAAAGCTCCTTT
    CATTCAGTTTATCTTTCCAGGATATATGAAAAGGGACCAGCTGGAAGACTAGCCTCACTC
    TGTCCTCGAAAGCCTGAGCTTTCATTCAACTCCCTATTTCCATGCAAAGACGCTGGGCAA
    ACCACATGTTCTGTCTGAGCCTCAGTTTTCCTATCCATAAAATGAAGGTAGCCAGGCCTG
    CCTCAAAGAGCATTCAGGAGGCTCTGAGAGGACATGAGAGTATTTTGCAAAGTGAGGGCA
    AGGCCCAGTGTGGAGTGATATTGTTATTCCAAGATTCCACTGCAAAAGTGGCTGCTTTGG
    ATGCCAGCCCAGGATGAGTAGTTCCTGTTCTCAGGGAGGTCATCCGCTGAGCATCCCTTC
    TGCACAGATGTCTCTGATTCTTGTCCTTGCAGGTGGAGGACAGGGCCTGCTCCCCTAAGC
    TGGGAAGCCTGGAATGACCTCTTGCACAAGCCTAAATTCCAGGAATCTTCCCCAAATCCC
    AGATCCTCTGCAATCTACCTGCACCCCTGACCCACCCAGGAGTTGGACCGGGAGTTGGGA
    AGCCTAGGTCTTAGTCCTACACTCCTTCTAATTTGCTGTGTAACCTTACCATTAATCTCT
    CTGGGTCTCAGTTTTCTCATCTGTATTGGAGGTAGCAGTGCTAGCTCTGCCTTCAGGCAT
    GCAATATGCCAGAACTACAGACAACAGCCCACAGGATGCAAAAGTGCTTTGCCATCTTAA
    AAATGCCAGATCACTCAGAGCCTATGAATGTGGATATCAACACCAGGTCTCTAGCACCGC
    TGGATGAAAGGAGAAGGCTAGAGGCTGAGGGAGGAAAGAGCAGTTAACAAACAAAGGCAG
    TAGCTCATCACTTGGGTAGCAGGTACCCATTTTAGGACCCTACACTCAAATGTGCAAAAT
    AAAATTTCTATCATTTTGCTATAAAAAAAAAAAAAAAAAAAAA
    >NM_004967
    GAGTGAGTGAGAGGGCAGAGGAAATACTCAATCTGTGCCACTCACTGCCTTGAGCCTGCT
    TCCTCACTCCAGGACTGCCAGAGGCTCACTCCCTTGAGCCTGCTTCCTCACTCCAGGACT
    GCCAGAGGAAGCAATCACCAAAATGAAGACTGCTTTAATTTTGCTCAGCATTTTGGGAAT
    GGCCTGTGCTTTCTCAATGAAAAATTTGCATCGAAGAGTCAAAATAGAGGATTCTGAAGA
    AAATGGGGTCTTTAAGTACAGGCCACGATATTATCTTTACAAGCATGCCTACTTTTATCC
    TCATTTAAAACGATTTCCAGTTCAGGGCAGTAGTGACTCATCCGAAGAAAATGGAGATGA
    CAGTTCAGAAGAGGAGGAGGAAGAAGAGGAGACTTCAAATGAAGGAGAAAACAATGAAGA
    ATCGAATGAAGATGAAGACTCTGAGGCTGAGAATACCACACTTTCTGCTACAACACTGGG
    CTATGGAGAGGACGCCACGCCTGGCACAGGGTATACAGGGTTAGCTGCAATCCAGCTTCC
    CAAGAAGGCTGGGGATATAACAAACAAAGCTACAAAAGAGAAGGAAAGTGATGAAGAAGA
    AGAGGAGGAAGAGGAAGGAAATGAAAACGAAGAAAGCGAAGCAGAAGTGGATGAAAACGA
    ACAAGGCATAAACGGCACCAGTACCAACAGCACAGAGGCAGAAAACGGCAACGGCAGCAG
    CGGAGGAGACAATGGAGAAGAAGGGGAAGAAGAAAGTGTCACTGGAGCCAATGCAGAAGG
    CACCACAGAGACCGGAGGGCAGGGCAAGGGCACCTCGAAGACAACAACCTCTCCAAATGG
    TGGGTTTGAACCTACAACCCCACCACAAGTCTATAGAACCACTTCCCCACCTTTTGGGAA
    AACCACCACCGTTGAATACGAGGGGGAGTACGAATACACGGGCGTCAATGAATACGACAA
    TGGATATGAAATCTATGAAAGTGAGAACGGGGAACCTCGTGGGGACAATTACCGAGCCTA
    TGAAGATGAGTACAGCTACTTTAAAGGACAAGGCTACGATGGCTATGATGGTCAGAATTA
    CTACCACCACCAGTGAAGCTCCAGCCTG
    >NM_002847
    GCCTCCCGCCGCCTCCCGCGCGGCCATGGACTGAGCGCCGCCGGCCAGGCCGCGGGGATG
    GGGCCGCCGCTCCCGCTGCTGCTGCTGCTACTGCTGCTGCTGCCGCCACGCGTCCTGCCT
    GCCGCCCCTTCGTCCGTCCCCCGCGGCCGGCAGCTCCCGGGGCGTCTGGGCTGCCTGCTC
    GAGGAGGGCCTCTGCGGAGCGTCCGAGGCCTGTGTGAACGATGGAGTGTTTGGAAGGTGC
    CAGAAGGTTCCGGCAATGGACTTTTACCGCTACGAGGTGTCGCCCGTGGCCCTGCAGCGC
    CTGCGCGTGGCGTTGCAGAAGCTTTCCGGCACAGGTTTCACGTGGCAGGATGACTATACT
    CAGTATGTGATGGACCAGGAACTTGCAGACCTCCCGAAAACCTACCTGAGGCGTCCTGAA
    GCATCCAGCCCAGCCAGGCCCTCAAAACACAGCGTTGGCAGCGAGAGGAGGTACAGTCGG
    GAGGGCGGTGCTGCCCTGGCCAACGCCCTCCGACGCCACCTGCCCTTCCTGGAGGCCCTG
    TCCCAGGCCCCAGCCTCAGACGTGCTCGCCAGGACCCATACGGCGCAGGACAGACCCCCC
    GCTGAGGGTGATGACCGCTTCTCCGAGAGCATCCTGACCTATGTGGCCCACACGTCTGCG
    CTGACCTACCCTCCCGGGCCCCGGACCCAGCTCCGCGAGGACCTCCTGCCGCGGACCCTC
    GGCCAGCTCCAGCCAGATGAGCTCAGCCCTAAGGTGGACAGTGGTGTGGACAGACACCAT
    CTGATGGCGGCCCTCAGTGCCTATGCTGCCCAGAGGCCCCCAGCTCCCCCCGGGGAGGGC
    AGCCTGGAGCCACAGTACCTTCTGCGTGCACCCTCAAGAATGCCCAGGCCTTTGCTGGCA
    CCAGCCGCCCCCCAGAAGTGGCCTTCACCTCTGGGAGATTCCGAAGACCCCTCCAGCACA
    GGCGATGGAGCACGGATTCATACCCTCCTGAAGGACCTGCAGAGGCAGCCGGCTGAGGTG
    AGGGGCCTGAGTGGCCTGGAGCTGGACGGCATGGCTGAGCTGATGGCTGGCCTGATGCAA
    GGCGTGGACCATGGAGTAGCTCGAGGCAGCCCTGGGAGAGCGGCCCTGGGAGAGTCTGGA
    GAACAGGCGGATGGCCCCAAGGCCACCCTCCGTGGAGACAGCTTTCCAGATGACGGAGTG
    CAGGACGACGATGATAGACTTTACCAAGAGGTCCATCGTCTGAGTGCCACACTCGGGGGC
    CTCCTGCAGGACCACGGGTCTCGACTCTTACCTGGAGCCCTCCCCTTTGCAAGGCCCCTC
    GACATGGAGAGGAAGAAGTCCGAGCACCCTGAGTCTTCCCTGTCTTCAGAAGAGGAGACT
    GCCGGAGTGGAGAACGTCAAGAGCCAGACGTATTCCAAAGATCTGCTGGGGCAGCAGCCG
    CATTCGGAGCCCGGGGCCGCTGCGTTTGGGGAGCTCCAAAACCAGATGCCTGGGCCCTCG
    AAGGAGGAGCAGAGCCTTCCAGCGGGTGCTCAGGAGGCCCTCAGCGACGGCCTGCAATTG
    GAGGTCCAGCCTTCCGAGGAAGAGGCGCGGGGCTACATCGTGACAGACAGAGACCCCCTG
    CGCCCCGAGGAAGGAAGGCGGCTGGTGGAGGACGTCGCCCGCCTCCTGCAGGTGCCCAGC
    AGTGCGTTCGCTGACGTGGAGGTTCTCGGACCAGCAGTGACCTTCAAAGTGAGCGCCAAT
    GTCCAAAACGTGACCACTGAGGATGTGGAGAAGGCCACAGTTGACAACAAAGACAAACTG
    GAGGAAACCTCTGGACTGAAAATTCTTCAAACCGGAGTCGGGTCGAAAAGCAAACTCAAG
    TTCCTGCCTCCTCAGGCGGAGCAAGAAGACTCCACCAAGTTCATCGCGCTCACCCTGGTC
    TCCCTCGCCTGCATCCTGGGCGTCCTCCTGGCCTCTGGCCTCATCTACTGCCTCCGCCAT
    AGCTCTCAGCACAGGCTGAAGGAGAAGCTCTCGGGACTAGGGGGCGACCCAGGTGCAGAT
    GCCACTGCCGCCTACCAGGAGCTGTGCCGCCAGCGTATGGCCACGCGGCCACCAGACCGA
    CCTGAGGGCCCGCACACGTCACGCATCAGCAGCGTCTCATCCCAGTTCAGCGACGGGCCG
    ATCCCCAGCCCCTCCGCACGCAGCAGCGCCTCATCCTGGTCCGAGGAGCCTGTGCAGTCC
    AACATGGACATCTCCACCGGCCACATGATCCTGTCCTACATGGAGGACCACCTGAAGAAC
    AAGAACCGGCTGGAGAAGGAGTGGGAAGCGCTGTGCGCCTACCAGGCGGAGCCCAACAGC
    TCGTTCGTGGCCCAGAGGGAGGAGAACGTGCCCAAGAACCGCTCCCTGGCTGTGCTGACC
    TATGACCACTCCCGGGTCCTGCTGAAGGCGGAGAACAGCCACAGCCACTCAGACTACATC
    AACGCTAGCCCCATCATGGATCACGACCCGAGGAACCCCGCGTACATCGCCACCCAGGGA
    CCGCTGCCCGCCACCGTGGCTGACTTTTGGCAGATGGTGTGGGAGAGCGGCTGCGTGGTG
    ATCGTCATGCTGACACCCCTCGCGGAGAACGGCGTCCGGCAGTGCTACCACTACTGGCCG
    GATGAAGGCTCCAATCTCTACCACATCTATGAGGTGAACCTGGTCTCCGAGCACATCTGG
    TGTGAGGACTTCCTGGTGAGGAGCTTCTATCTGAAGAACCTGCAGACCAACGAGACGCGC
    ACCGTGACGCAGTTCCACTTCCTGAGTTGGTATGACCGAGGAGTCCCTTCCTCCTCAAGG
    TCCCTCCTGGACTTCCGCAGAAAAGTAAACAAGTGCTACAGGGGCCGTTCTTGTCCAATA
    ATTGTTCATTGCAGTGACGGTGCAGGCCGGAGCGGCACCTACGTCCTGATCGACATGGTT
    CTCAACAAGATGGCCAAAGGTGCTAAAGAGATTGATATCGCAGCGACCCTGGAGCACTTG
    AGGGACCAGAGACCCGGCATGGTCCAGACGAAGGAGCAGTTTGAGTTCGCGCTGACAGCC
    GTGGCTGAGGAGGTGAACGCCATCCTCAAGGCCCTTCCCCAGTGAGCGGCAGCCTCAGGG
    GCCTCAGGGGAGCCCCCACCCCACGGATGTTGTCAGGAATCATGATCTGACTTTAATTGT
    GTGTCTTCTATTATAACTGCATAGTAATAGGGCCCTTAGCTCTCCCGTAGTCAGCGCAGT
    TTAGCAGTTAAAAGTGTATTTTTGTTTAATCAAACAATNATAAAGAGAGATTTGTGGAAA
    AATCCAGTTACGGGTGGAGGGGAATCGGTTCATCAATTTTCACTTGCTTAAAAAAAATAC
    TTTTTCTTAAAGCACCCGTTCACCTTCTTGGTTGAAGTTGTGTTAACAATGCAGTAGCCA
    GCACGTTCGAGGCGGTTTCCAGGAAGAGTGTGCTTGTCATCTGCCACTTTCGGGAGGGTG
    GATCCACTGTGCAGGAGTGGCCGGGGAAGCTGGCAGCACTCAGTGAGGCCGCCCGGCACA
    CAAGGCACGTTTGGCATTTCTCTTTGAGAGAGTTTATCATTGGGAGAAGCCGCGGGGACA
    GAACTGAACGTCCTGCAGCTTCGGGGCAAGTGAGACAATCACAGCTCCTCGCTGCGTCTC
    CATCAACACTGCGCCGGGTACCATGGACGGCCCCGTCAGCCACACCTGTCAGCCCAAGCA
    GAGTGATTCAGGGGCTCCCCGGGGGCAGACACCTGTGCACCCCATGAGTAGTGCCCACTT
    GAGGCTGGCACTCCCCTGACCTCACCTTTGCAAAGTTACAGATGCACCCCAACATTGAGA
    TGTGTTTTTAATGTTAAAATATTGATTTCTACGTTATGAAAACAGATGCCCCCGTGAATG
    CTTACCTGTGAGATAACCACAACCAGGAAGAACAAATCTGGGCATTGAGCAAGCTATGAG
    GGTCCCCGGGAGCACACGAACCCTGCCAGGCCCCCGCTGGCTCCTCCAGGCACGTCCCGG
    ACCTGTGGGGCCCCAGAGAGGGGACATTTCCCTCCTGGGAGAGAAGGAGATCAGGGCAAC
    TCGGAGAGGGCTGCGAGCATTTCCCTCCCGGGAGAGGAGATCAGGGCGACCTGCACGCAC
    TGCGTAGAGCCTGGAAGGGAAGTGAGAAACCAGCCGACCGGCCCTGCCCCTCTTCCCGGG
    ATCACTTAATGAACCACGTGTTTTGACATCATGTAAACCTAAGCACGTAGAGATGATTCG
    GATTTGACAAAATAACATTTGAGTATCCGATTCGCCATCACCCCCTACCCCAGAAATAGG
    ACAATTCACTTCATTGACCAGGATGATCACATGGAAGGCGGCGCAGAGGCAGCTGTGTGG
    GCTGCAGATTTCCTGTGTGGGGTTCAGCGTAGAAAACGCACCTCCATCCCGCCCTTCCCA
    CAGCATTCCTCCATCTTAGATAGATGGTACTCTCCAAAGGCCCTACCAGAGGGAACACGG
    CCTACTGAGCGGACAGAATGATGCCAAAATATTGCTTATGTCTCTACATGGTATTGTAAT
    GAATATCTGCTTTAATATAGCTATCATTTCTTTTCCAAAATTACTTCTCTCTATCTGGAA
    TTTAATTAATCGAAATGAATTTATCTGAATATAGGAAGCATATGCCTACTTGTAATTTCT
    AACTCCTTATGTTTGAAGAGAAACCTCCGGTGTGAGATATACAAATATATTTAATTGTGT
    CATATTAAACTTCTGATTCAAAAAAAA
    >BC002551
    GGCACGAGGCCACGAGCTGTTGTGCATCCAGAGGTGGAATTGGGGCCCGGCATTCCCTCC
    TCGTCCCGGGCTGGCCCTTGCCCCCACCCTGCAACTCCTGGTTGAGATGGGCTCAGCCAA
    GAGCGTCCCAGTCACACCAGCGCGGCCTCCGCCGCACAACAAGCATCTGGCTCGAGTGGC
    GGACCCCCGTTCACCTAGTGCTGGCATCCTGCGCACTCCCATCCAGGTGGAGAGCTCTCC
    ACAGCCAGGCCTACCAGCAGGGGAGCAACTGGAGGGTCTTAAACATGCCCAGGACTCAGA
    TCCCCGCTCTCCTACTCTTGGTATTGCACGGACACCTATGAAGACCAGCAGTGGAGACCC
    CCCAAGCCCACTGGTGAAACAGCTGAGTGAAGTATTTGAAACTGAAGACTCTAAATCAAA
    TCTTCCCCCAGAGCCTGTTCTGCCCCCAGAGGCACCTTTATCTTCTGAATTGGACTTGCC
    TCTGGGTACCCAGTTATCTGTTGAGGAACAGATGCCACCTTGGAACCAGACTGAGTTCCC
    CTCCAAACAGGTGTTTTCCAAGGAGGAAGCAAGACAGCCCACAGAAACCCCTGTGGCCAG
    CCAGAGCTCCGACAAGCCCTCAAGGGACCCTGAGACTCCCAGATCTTCAGGTTCTATGCG
    CAATAGATGGAAACCAAACAGCAGCAAGGTACTAGGGAGATCCCCCCTCACCATCCTGCA
    GGATGACAACTCCCCTGGCACCCTGACACTACGACAGGGTAAGCGGCCTTCACCCCTAAG
    TGAAAATGTTAGTGAACTAAAGGAAGGAGCCATTCTTGGAACTGGACGACTTCTGAAAAC
    TGGAGGACGAGCATGGGAGCAAGGCCAGGACCATGACAAGGAAAATCAGCACTTTCCCTT
    GGTGGAGAGCTAGGCCCTGCATGGCCCCAGCAATGCAGTCACCCAGGGCCTGGTGATATC
    TGTGTCCTCTCACCCCTTCTTTCCCAGGGATACTGAGGAATGGCTTGTTTTCTTAGACTC
    CTCCTCAGCTACCAAACTGGGACTCACAGCTTTATTGGGCTTTCTTTGTGTCTTGTGTGT
    TTCTTTTATATTAAAGGAAGTAATTTTAAATGTTACTTTAAAAAGGTAAAAAAAAAAAAA
    AAAAAAAA
    >AL039118
    GCATTCGTAGTAAAGGTGCCCAAGAAATTATTTTGGCCATTTATTGTTTTGTCCTTTTCT
    TTAAAGAACTGTTTTTTTTTCTTTTGTTTACTTTTAGACCAAAGATTGGGTTCTAGAAAA
    TGCACTTGGTATACTAAGTATTAAAACAAACAAAAAGGAAAGTTGTTTCAGTTGGCAACA
    CTGCCCATTCAATTGAATCAGAAGGGGACAAAATTAACGATTGCCTTCAGTTTGTGTTGT
    GTATATTTTGATGTATGTGGTCACTAACAGGTCACTTTTATTTTTTCTAAATGTAGTGAA
    ATGTTAATACCTATTGTACTTATAGGTAAACCTTGCAAATATGTAACCTGTGTTGCGCAA
    ATGCCGCATAAATTTGAGTGATTGTTAATGTTGTCTTAAAATTTCTTGATTGTGATACTG
    TGGTCATATGCCCGTGTTTGTCACTTACAAAAATGTTTACTATGAACACACAGAAATAAA
    AAATAGGCTAAATTCATATAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    >NM_000198
    GAGGCAGTAAGGACTTGGACTCCTCTGTCCAGCTTTTAACAATCTAAGTTACGGTTACCC
    TCTTCTGGGTCACGCTAGAATCAGATCTGCTCTCCAGCATCTTCTGTTTCCTGGCAAGTG
    TTTCCTGCTACTTTGGATTGGCCACGATGGGCTGGAGCTGCCTTGTGACAGGAGCAGGAG
    GGCTTCTGGGTCAGAGGATCGTCCGCCTGTTGGTGGAAGAGAAGGAACTGAAGGAGATCA
    GGGCCTTGGACAAGGCCTTCAGACCAGAATTGAGAGAGGAATTTTCTAAGCTCCAGAACA
    GGACCAAGCTGACTGTACTTGAAGGAGACATTCTGGATGAGCCATTCCTGAAAAGAGCCT
    GCCAGGACGTCTCGGTCGTCATCCACACCGCCTGTATCATTGATGTCTTTGGTGTCACTC
    ACAGAGAGTCCATCATGAATGTCAATGTGAAAGGTACCCAGCTACTGTTGGAGGCCTGTG
    TCCAAGCCAGTGTGCCAGTCTTCATCTACACCAGTAGCATAGAGGTAGCCGGGCCCAACT
    CCTACAAGGAAATCATCCAGAACGGCCACGAAGAAGAGCCTCTGGAAAACACATGGCCCA
    CTCCATACCCGTACAGCAAAAAGCTTGCTGAGAAGGCTGTGCTGGCGGCTAATGGGTGGA
    ATCTAAAAAATGGTGATACCTTGTACACTTGTGCGTTAAGACCCACATATATCTATGGGG
    AAGGAGGCCCATTCCTTTCTGCCAGTATAAATGAGGCCCTGAACAACAATGGGATCCTGT
    CAAGTGTTGGAAAGTTCTCTACAGTCAACCCAGTCTATGTTGGCAACGTGGCCTGGGCCC
    ACATTCTGGCCTTGAGGGCTCTGCGGGACCCCAAGAAGGCCCCAAGTGTCCGAGGTCAAT
    TCTATTACATCTCAGATGACACGCCTCACCAAAGCTATGATAACCTTAATTACATCCTGA
    GCAAAGAGTTTGGCCTCCGCCTTGATTCCAGATGGAGCCTTCCTTTAACCCTGATGTACT
    GGATTGGCTTCCTGCTGGAAGTAGTGAGCTTCCTACTCAGCCCAATTTACTCCTATCAAC
    CCCCCTTCAACCGCCACACAGTCACATTATCAAATAGTGTGTTCACCTTCTCTTACAAGA
    AGGCTCAGCGAGATCTGGCGTATAAGCCACTCTACAGCTGGGAGGAAGCCAAGCAGAAAA
    CCGTGGAGTGGGTTGGTTCCCTTGTGGACCGGCACAAGGAGACCCTGAAGTCCAAGACTC
    AGTGATTTAAGGATGACAGAGATGTGCATGTGGGTATTGTTAGGAAATGTCATCAAACTC
    CACCCACCTGGCTTCATACAGAAGGCAACAGGGGCACAAGCCCAGGTCCTGCTGCCTCTC
    TTTCACACAATGCCCAACTTACTGTCTTCTTCATGTCATCAAAATCTGCACAGTCACTGG
    CCCAACCAGAACTTTCTGTCCTAATCATACACCAGAAGACAAACAATATGATTTGCTGTT
    ACCAAATCTCAGTGGCTGATTCTGAACAATTGTGGTCTCTCTTAACTTGAGGTTCTCTTT
    TGACTAATAGAGCTCCATTTCCCCTCTTAAATGAGAAAGCATTTCTTTTCTCTTTAATCT
    CCTATTCCTTCACACAGTTCAACATAAAGAGCAATAAATGTTTTAATGCTTAA
    >H05388
    AAATTTTGACCCCATATAAAGAAATGTGTTATGTATGTTGTGCCTCCTTAGAGACATAAA
    TTTAGTGTCAAAACATGGGAGATGGCTTACTCAGAAGCATACTCCACTTAACATACCATG
    GCCTGAGCTAAGTACCATGTCCTGTTTGTGTCTTATTTTTAAATATTTTCTTTGTCCACA
    TGGGCCGTTGACCTTAGAGTTAAGGCGGTTGCTTTTTTGAAGAAATCACCAAAGTTTCTG
    GGAAACTATGTTCAAGGTTGAAATGGAGAGTAGATTTAATTTTATTTGTCTTGTAGGGAA
    GAAATCTTCCTTTGAACCGCTTTTCTTGCTTTTTCCCTTTTTCCCAAACTAGGTTACAGG
    TTCTTATCTGCAAGGTTCAAGTTGCTTAGACATTGTTTTCCAGTATTCTGCAGGGCCAGT
    CAGTTGTACAGAAGTTGGAATATTCTGTTCCAGAATTAAAGAAGTTTTTAGATTATGAAA
    TATTATGATAATAAAGCTATATTTCTGAAAAAAAAAAA
    >NM_004062
    GAAGGAGCTCTCTTCTTGCTTGGCAGCTGGACCAAGGGAGCCAGTCTTGGGCGCTGGAGG
    GCCTGTCCTGACCATGGTCCCTGCCTGGCTGTGGCTGCTTTGTGTCTCCGTCCCCCAGGC
    TCTCCCCAAGGCCCAGCCTGCAGAGCTGTCTGTGGAAGTTCCAGAAAACTATGGTGGAAA
    TTTCCCTTTATACCTGACCAAGTTGCCGCTGCCCCGTGAGGGGGCTGAAGGCCAGATCGT
    GCTGTCAGGGGACTCAGGCAAGGCAACTGAGGGCCCATTTGCTATGGATCCAGATTCTGG
    CTTCCTGCTGGTGACCAGGGCCCTGGACCGAGAGGAGCAGGCAGAGTACCAGCTACAGGT
    CACCCTGGAGATGCAGGATGGACATGTCTTGTGGGGTCCACAGCCTGTGCTTGTGCACGT
    GAAGGATGAGAATGACCAGGTGCCCCATTTCTCTCAAGCCATCTACAGAGCTCGGCTGAG
    CCGGGGTACCAGGCCTGGCATCCCCTTCCTCTTCCTTGAGGCTTCAGACCGGGATGAGCC
    AGGCACAGCCAACTCGGATCTTCGATTCCACATCCTGAGCCAGGCTCCAGCCCAGCCTTC
    CCCAGACATGTTCCAGCTGGAGCCTCGGCTGGGGGCTCTGGCCCTCAGCCCCAAGGGGAG
    CACCAGCCTTGACCACGCCCTGGAGAGGACCTACCAGCTGTTGGTACAGGTCAAGGACAT
    GGGTGACCAGGCCTCAGGCCACCAGGCCACTGCCACCGTGGAAGTCTCCATCATAGAGAG
    CACCTGGGTGTCCCTAGAGCCTATCCACCTGGCAGAGAATCTCAAAGTCCTATACCCGCA
    CCACATGGCCCAGGTACACTGGAGTGGGGGTGATGTGCACTATCACCTGGAGAGCCATCC
    CCCGGGACCCTTTGAAGTGAATGCAGAGGGAAACCTCTACGTGACCAGAGAGCTGGACAG
    AGAAGCCCAGGCTGAGTACCTGCTCCAGGTGCGGGCTCAGAATTCCCATGGCGAGGACTA
    TGCGGCCCCTCTGGAGCTGCACGTGCTGGTGATGGATGAGAATGACAACGTGCCTATCTG
    CCCTCCCCGTGACCCCACAGTCAGCATCCCTGAGCTCAGTCCACCAGGTACTGAAGTGAC
    TAGACTGTCAGCAGAGGATGCAGATGCCCCCGGCTCCCCCAATTCCCACGTTGTGTATCA
    GCTCCTGAGCCCTGAGCCTGAGGATGGGGTAGAGGGGAGAGCCTTCCAGGTGGACCCCAC
    TTCAGGCAGTGTGACGCTGGGGGTGCTCCCACTCCGAGCAGGCCAGAACATCCTGCTTCT
    GGTGCTGGCCATGGACCTGGCAGGCGCAGAGGGTGGCTTCAGCAGCACGTGTGAAGTCGA
    AGTCGCAGTCACAGATATCAATGATCACGCCCCTGAGTTCATCACTTCCCAGATTGGGCC
    TATAAGCCTCCCTGAGGATGTGGAGCCCGGGACTCTGGTGGCCATGCTAACAGCCATTGA
    TGCTGACCTCGAGCCCGCCTTCCGCCTCATGGATTTTGCCATTGAGAGGGGAGACACAGA
    AGGGACTTTTGGCCTGGATTGGGAGCCAGACTCTGGGCATGTTAGACTCAGACTCTGCAA
    GAACCTCAGTTATGAGGCAGCTCCAAGTCATGAGGTGGTGGTGGTGGTGCAGAGTGTGGC
    GAAGCTGGTGGGGCCAGGCCCAGGCCCTGGAGCCACCGCCACGGTGACTGTGCTAGTGGA
    GAGAGTGATGCCACCCCCCAAGTTGGACCAGGAGAGCTACGAGGCCAGTGTCCCCATCAG
    TGCCCCAGCCGGCTCTTTCCTGCTGACCATCCAGCCCTCCGACCCCATCAGCCGAACCCT
    CAGGTTCTCCCTAGTCAATGACTCAGAGGGCTGGCTCTGCATTGAGAAATTCTCCGGGGA
    GGTGCACACCGCCCAGTCCCTGCAGGGCGCCCAGCCTGGGGACACCTACACGGTGCTTGT
    GGAGGCCCAGGATACAGATGAGCCGAGACTGAGCGCTTCTGCACCCCTGGTGATCCACTT
    CCTAAAGGCCCCTCCTGCCCCAGCCCTGACTCTTGCCCCTGTGCCCTCCCAATACCTCTG
    CACACCCCGCCAAGACCATGGCTTGATCGTGAGTGGACCCAGCAAGGACCCCGATCTGGC
    CAGTGGGCACGGTCCCTACAGCTTCACCCTTGGTCCCAACCCCACGGTGCAACGGGATTG
    GCGCCTCCAGACTCTCAATGGTTCCCATGCCTACCTCACCTTGGCCCTGCATTGGGTGGA
    GCCACGTGAACACATAATCCCCGTGGTGGTCAGCCACAATGCCCAGATGTGGCAGCTCCT
    GGTTCGAGTGATCGTGTGTCGCTGCAACGTGGAGGGGCAGTGCATGCGCAAGGTGGGCCG
    CATGAAGGGCATGCCCACGAAGCTGTCGGCAGTGGGCATCCTTGTAGGCACCCTGGTAGC
    AATAGGAATCTTCCTCATCCTCATTTTCACCCACTGGACCATGTCAAGGAAGAAGGACCC
    GGATCAACCAGCAGACAGCGTGCCCCTGAAGGCGACTGTCTGAATGGCCCAGGCAGCTCT
    AGCTGGGAGCTTGGCCTCTGGCTCCATCTGAGTCCCCTGGGAGAGAGCCCAGCACCCAAG
    ATCCAGCAGGGGACAGGACAGAGTAGAAGCCCCTCCATCTGCCCTGGGGTGGAGGCACCA
    TCACCATCACCAGGCATGTCTGCAGAGCCTGGACACCAACTTTATGGACTGCCCATGGGA
    GTGCTCCAAATGTCAGGGTGTTTGCCCAATAATAAAGCCCCAGAGAACTGGGCTGGGCCC
    TATGGGATTGGTA
    >AA782845
    TCTTTACCTATGTGAAGCGAGGTGACGTGATACGTCACTGGCGCCGTCTTATAATTTAGA
    TGTAAAAATCTTTAGAAACAAATAAAACTCTCTATATATGTGTATGTCTGTGTACAAAAA
    AATGACAGAGCTGATGGCCAGTGTATACAGAGCGTGGCCCGCGGTGTACAATACCCATAT
    AAGGTACATTGTGCAGGAGGGGAATTGCTGGCTGCTTTTACTTCCTGACCAAGACTGAAA
    AATTATTTACTGAAATCTGTAAACCTTTTTATGAAACTTTTAAGCACCAGGCTGTTTACT
    TACACAATTTAGGTCTGCCAGAAAATTCTATCTGTGATAGATCTGTAAAGAGGGTCAGGG
    GTTAGAGTTTACTATTTTTGAAGTTTACATTGTTACATATGAAATGGAAACATTATTTTG
    AAACGTTGTCATAACCCAATGGTGCATTCTGTAACCATGGAGTCTTCTGTTTCCTGGGGG
    AAAGGGGCATTCATGACCTGAACTTTTTAGCAAATTATTATTCTCAGTTTCCATTACCTG
    TTTGGCCAAACAGATTAATAAAATATTTGAAAAAGAAGCAATAAAAAAAAAA
    >AI457360
    CTGAGAAAGTCCGGTCCCTATAAGGGGACATCAGTGCGAGACCTGCTCCGTGCTGTGAGN
    ACAAGAGGCACCATACAAGNAAGCTCCCAGTTGAGGTGCGACAGGCACTCGCCNAAGTCC
    NTGATGGCTTCGTCCAGTACTCACAAAACGGCTCCCCCCGGCTGGTCCTTCACACGCACC
    GAGCCATGAGGAGCTGGCGCCTCTGAGAGCCTCTTCCTGCCCTACTACCCGCCAGACTCA
    GAGGCCAGGAGGCCATGCCCTGGGGCCACAGGGAGGTGAGGTGGGCTGGATGCCACACAG
    ATGGTCTCCGTGCTGGCTCACTGAAGAGCTGAGCCTGTGGCTGGCCTCAGAATCAGGCTG
    GGTGCAGTGGCTCACACCTGTAATCCCAGCATTTTGGGAGGCTGAGTGAGAGGATCACTT
    GAGCTCAGGAGTTCGAGACCAGCCTGGCCAACATGGCAACACCCCATTTCTACAAAAAAT
    TTGTAAAATTAGCCAGGCATGGTGGCGCACGCCTGTAGTCCCAGCTGCTTGGGAGGCTGA
    GGTGGGAGAATCACTTGAGCCCAGGAGTTCGAGGCTGCAGTGAGCCAGGATCATGCCACT
    GCACTCCAGCCTGGTCCACAGAGAGACACTGTCACCCCCTTTCCCCCACAAGACTGGCAG
    AGGCTGGGCAGCCTGGGGCTGATGAAGCAGAGATGTTCGCTGGATCCCAGGCCCTGGCAC
    CCCTCAGGAAATACAAGAAAAAGAATATTCACATCTGTTTAATGTGCATAAAGCCAAGGA
    AAGGACAGTTCCGAATTCAAAAAAAAAAAAAAAAAAAA
    >BF446419
    TTTTTTTTTTTTTTTTTAAATATTTAACTTATTTATTTAACAAAGTAGAAGGGAATCCAT
    TGCTAGCTTTTCTGTGTTGGTGTCTAATATTTGGGTAGGGTGGGGGATCCCCAACAATCA
    GGTCCCCTGAGATAGCTGGTCATTGGGCTGATCATTGCCAGAATCTTCTTCTCCTGGGGT
    CTGGCCCCCCAAAATGCCTAACCCAGGACCTTGGGAATTCTACTCATCCCAAATGATAAT
    TCCAAATGCTGTTACCCAAGGTTAGGGTGTTGAAGGAAGGTAGAGGGTGGGGCTTCAGGT
    CTCAACGGCTTCCCTAACCACCCCTCTTCTCTTGGCCCAGCCTGGTTCCCCCCACTTCCA
    CTCCCCTCTACTCTCTCTAGGACTGGGCTGATGAAGGCACTGCCCAAAATTTCCCCTACC
    CCCAACTTTCCCCTACCCCCAACTTTCCCCACCAGCTCCACAACCCTGTTTGGAGCTACT
    GCAGGACCAGAAGCACAAAGTGCGGTTTCCCAAGCCTTTGTCCATCTCAGCCCCCAGAGT
    ATATCTGTGCTTGGGGAATCTCACACAGAAACTCAGGAGCACCCCCTGCCTGAGCTAAGG
    GAGGTCTTATCTCTCAGGGGGGGTTTAAGTGCCGTTTGCAATAATGTCGTCTTATTTATT
    TAGCGGGGTGAATATTTTATACTGTAAGTGAGCAATCAGAGTATAATGTTTATGGTGACA
    AAATTAAAGGCTTTCTTATATGTTTAAAAAAAA
    >BC006819
    GCCTTATAAAGCACCAAGAGGCTGCCAGTGGGACATTTTCTCGGCCCTGCCAGCCCCCAG
    GAGGAAGGTGGGTCTGAATCTAGCACCATGACGGAACTAGAGACAGCCATGGGCATGATC
    ATAGACGTCTTTTCCCGATATTCGGGCAGCGAGGGCAGCACGCAGACCCTGACCAAGGGG
    GAGCTCAAGGTGCTGATGGAGAAGGAGCTACCAGGCTTCCTGCAGAGTGGAAAAGACAAG
    GATGCCGTGGATAAATTGCTCAAGGACCTGGACGCCAATGGAGATGCCCAGGTGGACTTC
    AGTGAGTTCATCGTGTTCGTGGCTGCAATCACGTCTGCCTGTCACAAGTACTTTGAGAAG
    GCAGGACTCAAATGATGCCCTGGAGATGTCACAGATTCCTGGCAGAGCCATGGTCCCAGG
    CTTCCCAAAAGTGTTTGTGGCAATTATTCCCCTAGGCTGAGCCTGCTCATGTACCTCTGA
    TTAATAAATGCTTATGAAATGAAAAAAAAAAAAAAA
    >AA765597
    CCAGCAAAGTCTCTTTTGACCACACGCTTTATCCGAGATGCTTAGAAGTATATTTGGCTG
    TTTTATTTGCATCTTTGATTAAGATGTCTATCATTGTAAAAAGGTATTCAAAACAAAAGT
    GTACTCTTTTATTATTATGAATCACATTGTACTGAGCTGTGAAGTCAGTGTTTTAAAAAT
    GTAGAGTTTATTCATGGAGCATGCCATTGAGGTTTGGATGGTGGCAGGTAAAACAGAAAG
    GCAAGATGTCATCTGACATTAGGCTACTTATAAATAAATGTTTATCTAGCTTTTATTTCA
    TGCCCTAATGAATAAAACATGCTTCGAAAAAGAAAGTAAAAAAAAAAAACAAAA
    >X78202
    GGCGAGAGAGACGCTCCCGCTCGCCGCCAGCTCTGATTGGCCCAGCGGTAGGAAAGGTTA
    AACCAAAAATTTTTTTACAGCCCTAGTGTGCGCCTGTAGCTCGGAAAATTAATTGTGGCT
    ATAGCCGCCTCGATCGCTGTCTCCCCAGCCTCGCCGCGGACGCTCCGGGACGCGCCCGCC
    CGCCGCCCGGTTCTCCCCCCCTTTGGGCTGGTGCTGCTGCTGCTGTGACTGCTGCTGCGA
    AAGGAGGAGGAGGAGGAGGAAGCAGCGGGGGGGGGAGCGGTGGGTGTGGGGGAAACCAAG
    AGTACAGTGGACGAGGACTCACCCCGGCGTGGTGTTCTTTTTTCTTCTTCTTTTTCTTTC
    CTTTTTTTTTTTTTTTTCTAATTCCTGAGGGGTGGTTGCTGCTTTTGCTACATGACTTGC
    CAGCGCCCGAGCCTGCGGTCCAACTGCGCTGCTGCCGGAGCGCTCAGTGCCGCCGCTGCC
    GCCCGTGCCCCCCGCGCCCCGTTCGGCACCCACCGGTCGCCGCCCCGCCCGCGCGCCGCT
    GTCCCGCTCCCGCGCCGCCGCCGCCGTTTCCCCCCGACGACTGGGTGATGCTGGACATGG
    GAGATAGGAAAGAGGTGAAAATGATCCCCAAGTCCTCGTTCAGCATCAACAGCCTGGTGC
    CCGAGGGCCTCCAGAACGACAACCACCACGCGAGCCACGGCCACCACAACAGCCACCACC
    CCCAGCACCACCACCACCACCACCACCATCACCACCACCCGCCGCCGCCCGCCCCGCAAC
    CGCCGCCGCCGCCGCAGCAGCAGCAGCCGCCGCCGCCGCCGAGACGCGGGGCCCGGCGCC
    GACGACGACGAGGCCCCAGCAGTTGTTGTTCCGCCGCGCACGCACACGGCGCGCCTGAGG
    GCCAACGGCAGCTGGCGCAAGGCGACCGGCGCGGCCGGGGGATCTGCCCCGTCGGGCCGG
    ACGAGAAGGAGAAGGCCCGCGCCGGGGGGGAGGAGAAGAAGGGGGCGGGCGAGGGCGGCA
    AGGACGGGGAGGGGGGCAAGGAGGGCGAGAAGAAGAACGGCAAGTACGAGAAGCCGCCGT
    TCAGCTACAACGCGCTCATCATGATGGCCATGCGGCAGAGCCCCGAGAAGCGGCTCACGC
    TCAACGGCATCTACGAGTTCATCATGAAGAACTTCCCTTACTACCGCGAGAACAAGCAGG
    GCTGGCAGAACTCCATCCGCCACAATCTGTCCCTCAACAAGTGCTTCGTGAAGGTGCCGC
    GCCACTACGACGACCCGGGCAAGGGCAACTACTGGATGCTGGACCCGTCGAGCGACGACG
    TGTTCATCGGCGGCACCACGGGCAAGCTGCGGCGCTCCACCACCTCGCCGGCCAAGCCGG
    CCTTCAAGCGCGGTGCCGCGCTCACCTCCACCGGCCTCACCTTCATGGACGCGCCGGCTC
    CCTCTACTGGCCCATGTCGCCCTTCCTGTCCCTGCACCACCCCCGCCAGCAGCACTTTGA
    GTTACAACGGGACCACGTCGGCCTACCCCAGCCACCCCATGCCCTACAGCTCCGTGTTGA
    CTCAAAACTCGCTGGGCAACAACCACTCCTCCTCCACCGCCAACGGGCTGAGCGTGGACC
    GGCTGGTCAACGGGGGAATCCCGTACGCCACGCACCACCTCACGGCCGCCGCGCTAACCG
    CCTCGGTGCCCTGCGGCCTGCTGGTGCCCTGCTCTGGGACCTACTCCCTCAACCCCTGCT
    CCGTCAACCTGCTCGCGGGCCAGACCAGTTACTTTTTCCCCCACGTCCCGCACCCGTCAA
    TGACTTCGCAGAGCAGCACGTCCATGAGCGCCAGGGCCGCGTCCTCCTCCACGTCGCCGG
    CAGGCCCCCCTCGACCCCTGCCCTGTGAGTCTTTAAGACCCTCTTTGCCAAGTTTTACGA
    CGGGACTGTCTGGGGGACTGTCTGATTATTTCACACATCAAAATCAGGGGTCTTCTTCCA
    ACCCTTTAATACATTAACATCCCTGGGACCAGACTGTAAGTGAACGTTTTACACACATTT
    GCATTGTAAATGATAATTAAAAAAATAAGTCCAGGTATTTTTTATTAAGCCCCCCCCTCC
    CATTTCTGTACGTTTGTTCAGTCTCTAGGGTTGTTTATTATTCTAACAAGGTGTGGAGTG
    TCAGCGAGGTGCAATGTGGGGAGAATACATTGTAGAATATAAGGTTTGGAAGTCAAATTA
    TAGTAGAATGTGTATCTAAATAGTGACTGCTTTGCCATTTCATTCAAACCTGACAAGTCT
    ATCTCTAAGAGCCGCCAGATTTCCATGTGTGCAGTATTATAAGTTATCATGGAACTATAT
    GGTGGACGCAGACCTTGAGAACAACCTAAATTATGGGGAGAATTTTAAAATGTTAAACTG
    TAATTTGTATTTAAAAAGCATTCGTAGTAAAGGTGCCCAAGAAATTATTTTGGCCATTTA
    TTGTTTTCTCCTTTTCTTTAAAGAACTGTTTTTTTTTCTTTTGTTTACTTTTAGACCAAA
    GATTGGGCGGTTCTAGAAAATGCGCCTTGGTATACTAAGTATTAAAACAAACAAAAAGGA
    AAGTTGTTTCAGTTAACGCTGCCCATTCAATTGAATCAGAAGGGGACAAAATTAACGATT
    GCCTTCAGTTTGTGTTGTGTATATTTTGATGTATGTGGTCACTAACAGGTCACTTTTATT
    TTTTCTAAATGTAGTGAAATGTTAATACCTATTGTACTTATAGGTAAACCTTGCAAATAT
    GTAACCTGTGTTGCGCAAATGCCGCATAAATTTGAGTGATTGTTAATGTTGTCTTAAAAT
    TTCTTGATTGTGACTATGTGGTCATATGCCCGTGTTTGTCACTTACAAAAATGTTTACTA
    TGAACACACATAAATAAAAAATAG
    >AK026790
    AAAATGCTTACTCTTGTGGGCTACTTGTTGTGTGGAAAAAGGAAAACGGATTCATTTTCC
    CATCGGCGACTTTATGACGACAGAAATGAACCAGTTCTGCGATTAGACAATGCACCGGAA
    CCTTATGATGTGAGTTTTGGGAATTCTAGCTACTACAATCCAACTTTGAATGATTCAGCC
    ATGCCAGAAAGTGAAGAAAATGCACGTGATGGCATTCCTATGGATGACATACCTCCACTT
    CGTACTTCTGTATAGAACTAACAGCAAAAAGGCGTTAAACAGCAAGTGTCATCTACATCC
    TAGCCTTTTGACAAATTCATCTTTCAAAAGGTTACACAAAATTACTGTCACGTTGGATTT
    TGTCAAGGAGAATCATAAAAGCAGGAGACCAGTAGCAGAAATGTAGACAGGATGTATCAT
    CCAAAGGTTTTCTTTCTTACAATTTTTGGCCATCCTGAGGCATTTACTAAGTAGCCTTAA
    TTTGTATTTTAGTAGTATTTTCTTAGTAGAAAATATTTGTGGAATCAGATAAAACTAAAA
    GATTTCACCATTACAGCCCTGCCTCATAACTAAATAATAAAAATTATTCCACCAAAAAAT
    TCTAAAACAATGAAGATGACTCTTTACTGCTCTGCCTGAAGCCCTAGTACCATAATTCAA
    GATTGCATTTTCTTAAATGAAAATTGAAAGGGTGCTTTTTAAAGAAAATTTGACTTAAAG
    CTAAAAAGAGGACATAGCCCAGAGTTTCTGTTATTGGGAAATTGAGGCAATAGAAATGAC
    AGACCTGTATTCTAGTACGTTATAATTTTCTAGATCAGCACACACATGATCAGCCCACTG
    AGTTATGAAGCTGACAATGACTGCATTCAACGGGGCCATGGCAGGAAAGCTGACCCTACC
    CAGGAAAGTAATAGCTTCTTTAAAAGTCTTCAAAGGTTTTGGGAATTTTAACTTGTCTTA
    ATATATCTTAGGCTTCAATTATTTGGGTGCCTTAAAAACTCAATGAGAATCATGGTAAAA
    AAAAAAAGTTAACCAAAGAATATACCTGTACATAATTTGTACAGTTTTAAGTTGTTAGAT
    AGGAACTGGATTTCTTATGTATTAGACATTATTGCTCAATCATAATGGAATAGATTCTGC
    ATCCCTAAATGTATGAACCATAAGGTTAAAAAAGATGAATGGAAATATCAAACAACTTTT
    CACTGAGCATCAGTTTCATAATCAATAATATAAGAAGATTAATTTGGATTCTAGTATGTT
    TCAGTTTGTTTTTAATTACCACCTTCCTTTGGTAGAAAAAATATGTTCCTTGATGTAGGA
    AAGTCTAGGTTTTAGAGATTAGAGGATGAGATCAAGAGTTAAATTCCTAAAGAAGCACTG
    AATATATGAAGAGAGCAAACAAATCAAGTACCAACCTAGAGGCTTTATTTTTGAATTGAT
    TCATGGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTAACACAGAAACAGCT
    TTCAGAAAATAAGGGATAGAAAGTAATGAAGAAAGTACTTACCCCATATTGCCATAAAAA
    TAGCAAAGAAGACTGTCCCTCCATTATCGAACAAATATGTCACCTGAGTAGAAAACAAAC
    AGAAATATTAGTCATGCAAATTGATTATAATAAGCCAGTGAATACTGTTTGCACTCAGGT
    ACTATGATTTTTTCTCAAATAGAATCATATTATTTTATAGTACAGAAATATTATATATGA
    ATTCCTTTCATGGGTCTTGCAACAATTTCACATGATTTTTCTCATGGGGAGAGGTGAAGA
    AACAACATTAGCCCTCTTCTCTCCTCTCTTGATTCCCTTTATACCCCACCATCATTTCTG
    ATTATAAATAATTCTACCATTCTATGGAAGTATTTGTGGGTCACAGATTGTCAAACTACT
    TAATGAAAGTTGTATGAAATTAGTTTTTCAGGTGAGGCATTCCTAGTTGCAATTCCTGTT
    AGCAAAACTTCTAGGAGTGGGGAAGTTGGAAAATGCAGGATTCTTCCAGTGAGCCAGCAT
    TTCCCATAGCTAACCCTATTCTCTTAGTCTTTCAAAATGTAGAATGGGTCCAATAATGGC
    TATAAGATGTAATAAATCCCATCTTAATTTGTTTTAAAAGTTTCATAAATCACTGAACAC
    TTATGAAACAAAGTGTTTTTTAATCAGATATCAACTGAAACTTCATAAAGGATGCATAGT
    TTTATAATGTTATTGAATCAAATTTTAAGGCTTGTATTGTTTGATTTTAATAAAGTATAA
    TCTCCTTTTTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    >BC012727
    GGCACGAGGCTGCCTGCCCCCCGGGTGGGGCTGCGGCTCTGGCCTCCCAGGCCCATCCTC
    AACAGCTACCCCAGCCAACACCAAGGCCACAAGGGGACCCCGGCCTAGGAGGCAGGAAGC
    CAAGGTACAGAGAGCAGCCTGGCCCTCACCAGTGCGCAAGCTGGGGCAGCAAGGCTGACA
    GTTGCTGCATGCCCAGGGCAGGGTGTGGTACTGGCACCCAAGTTCAGCATGGCAGAGCTG
    GCCAACAGCTTGTCCCCGATCTGCCTCCAGCCCCAAGATGCCTACAGCCCCCAGGCCCCT
    TCGGCAGCACTGCCTCTGCCCACCTGCCTTTAAGAGACTCCAGGGCTGCTCCTGTCATGC
    AGCGAAGGTTTTGTCTGTTTCAAAGTTCGAGACTCAACTTGAGGGACTGTTTTTGACAAT
    CCCCGCTGACCTCCGCTCCTCGTGGCGCCCTGGCCCTACACCCAGCCTGGCCCAGGGCCG
    GCTTTGCCTGGTGAGGCTGGAGGGAGCACCAGGACCTGCTGTCTGCTGTCAGCCCCTCCT
    GGTGCTGGTGCCCTGATGCTGTGCCTTGTCACCCATTGAGCTGCAAGAGGGACCAAGAGG
    GGGCCACGCAGCCAGCCAGATGCCTGGCCCTGTGCTGGGGCAGACAACGCTGCAGAGCCC
    AGGGAGCCTGGCGCTAGGACGTGCGTCCTTGTGACACTGGCCTGTCTGAACTCACCTGGC
    CTGGGAAGCACCGTCTGCCCGGGCCCAAGCCCTGCCCCTCCAGAGTCCAGAGCCAGGAAG
    GGGCTGCTGAGGGCGAGCATCCTGCTGGGCTCTCTGCCCGGCCCACCCCTCCAAGGGGCT
    GGCCTGTGAGCCTTGACTGGGATTCATGATGTGGAGGCCCCCAACTTCCAGAAGCAGCTG
    GTACTCTGCTCACACAAGCGACTGGGCCGGCCGGCCCTGGACCCCTAGACCCCGAGCCGC
    CTGCCGACTGCCTGCACAGGGAGAGCAGTTGAGGCCCGGGCAGGGCCCCCACACCAGACC
    CCAACATAGCTTCCCCACCCAGGCACCCCCTCCCGGGGCAGCAGGCGTGGGAGTCAGGGC
    TGCATGCTCCTCCCCTCCCACCTCACAGGCGGCCTTAGGCAAGTCATTTTCTGTCATCAC
    AAGGTCGCCTCTGCCTAGTCAGGTCCTGGCGTCCAGAGTAAGGATGTGCGGCCCCCAGGC
    CCCCGCACACCTCCCTCAGCACCAAGACCGGGACCCCCCCACCCACGTGTCTCATTGTGG
    CTGCCTATGGACTCCCGGGCCTTGTGTGCAGGCCAGGCCCTTCCACTGATTTTTTAAAGT
    GAACCATTGCTGGATCTCAGATTCTGTGGCATCTAAGGCCTAGCAGGGGTGGGCACACGG
    GTCACCCGAGGCCCATACCAAGACTCTGTTCCTGCCCTAGGCCCAGTCTCAAAGGAAGCC
    ACAAGGCGCGGGGGCCACTGAGGAAGGAAATGTTCATTTTCATTTGTCCAAAACCACCTT
    AAGTTTTAAGTATATTAATCTTGATGCTTTTTAACTATTGCTTTTTAACTTGCTGAGATT
    TAGAAATACTGTTATAAAAACTTTTTTAATTTCTGTATTTTTTTTCTGTATTGTATCTTC
    ATGGGACATTAGGGGTTTTCTATGGTAAGCACACCTATGGTTTTGGTAAAAACATTATCA
    AATATATATCCAGACGGTTCTTCCCTAGAAGAAAAACAAGTCTTTACACCTGATAAAATA
    TTTTGCGAAGAGAGGTGTTCTTTTTCCTTACTGGTGCTGAAAGGAAGGATGGATAACGAG
    GAGAAAATAAAACTGTGAGGCTCAAAAAAAAAAAAAAAAAA
    >R45389
    CCTGCCCTTCTCTATATGTACCATCTCCAAAAACCATGTACATCTCCAAAAACTGGAGTA
    GAAAGTTAGATTGCTCAACTACAACTCCTCTAGAACTCTATAGCTCTGACATACAGATTC
    ACACTCTCCTCTATTTGCTAAGTATGTAAAGAATGTTTTCTTTTAAAATGTTCTCTTTTG
    AGAACAACTGCTTATTTGTTATAAAAGCATTTGGTTAAAATGATGTCATCATAAAGAACA
    GTGGCTTTGTTTCAATACATATTTTTGAGATGATTATCTAGAAGCCAGATTAATAAAATC
    AGCTTGTGACCTTGCTAAGCATATAAACTGGAAATTCAGATACATTCAAAATTATGGGTT
    CATTTAAAAGTGTTCTACCTTTTGGGTATGAGACTAATATCACTAATTCCTCAATAGTTA
    TCATGGCTCTATCTTAATTAATTAGAAAATATGTGTGTTTAATTCTTTGAGAATTAAAAT
    AGAGAATATTAACAGAGGGTTAAAAACTGCTTCAACTCCAATAAGATAAAGGAAGCTCAA
    AATCTATGAGCTGAGTGTTCAATTAGCTTTGCCTACTGAGTTCAATTTTATGTCAATACA
    ACAGTGGATCAGACAGTACGACTTTGAACTGGTGAATGTAAACAATTGTTTTTCACCTAA
    GCTGCTTTGGAAGAACTGATGCTTGCTGCTAACTAAAGTTTTGGATGTATCGATTTAGAG
    AACCAATTAATACCTGCAAAATAAAGCATACTGTGGTACTTCTGTTTGATCTAGTATGTG
    TGATTTTAGATTGATGGATTAAAAATTAATAAAGATCATACATTCCATACCAAAAAAAAA
    AAAAAAAA
    >BC006811
    CCAGAAGCCTGCATTTCTGCATTCTGCTTAATTCCCTTTCCTTAGATTTGAAAGAAGCCA
    ACACTAAACCACAAATATACAACAAGGCCATTTTCTCAAACGAGAGTCAGCCTTTAACGA
    AATGACCATGGTTGACACAGAGATGCCATTCTGGCCCACCAACTTTGGGATCAGCTCCGT
    GGATCTCTCCGTAATGGAAGACCACTCCCACTCCTTTGATATCAAGCCCTTCACTACTGT
    TGACTTCTCCAGCATTTCTACTCCACATTACGAAGACATTCCATTCACAAGAACAGATCC
    AGTGGTTGCAGATTACAAGTATGACCTGAAACTTCAAGAGTACCAAAGTGCAATCAAAGT
    GGAGCCTGCATCTCCACCTTATTATTCTGAGAAGACTCAGCTCTACAATAAGCCTCATGA
    AGAGCCTTCCAACTCCCTCATGGCAATTGAATGTCGTGTCTGTGGAGATAAAGCTTCTGG
    ATTTCACTATGGAGTTCATGCTTGTGAAGGATGCAAGGGTTTCTTCCGGAGAACAATCAG
    ATTGAAGCTTATCTATGACAGATGTGATCTTAACTGTCGGATCCACAAAAAAAGTAGAAA
    TAAATGTCAGTACTGTCGGTTTCAGAAATGCCTTGCAGTGGGGATGTCTCATAATGCCAT
    CAGGTTTGGGCGGATGCCACAGGCCGAGAAGGAGAAGCTGTTGGCGGAGATCTCCAGTGA
    TATCGACCAGCTGAATCCAGAGTCCGCTGACCTCCGGGCCCTGGCAAAACATTTGTATGA
    CTCATACATAAAGTCCTTCCCGCTGACCAAAGCAAAGGCGAGGGCGATCTTGACAGGAAA
    GACAACAGACAAATCACCATTCGTTATCTATGACATGAATTCCTTAATGATGGGAGAAGA
    TAAAATCAAGTTCAAACACATCACCCCCCTGCAGGAGCAGAGCAAAGAGGTGGCCATCCG
    CATCTTTCAGGGCTGCCAGTTTCGCTCCGTGGAGGCTGTGCAGGAGATCACAGAGTATGC
    CAAAAGCATTCCTGGTTTTGTAAATCTTGACTTGAACGACCAAGTAACTCTCCTCAAATA
    TGGAGTCCACGAGATCATTTACACAATGCTGGCCTCCTTGATGAATAAAGATGGGGTTCT
    CATATCCGAGGGCCAAGGCTTCATGACAAGGGAGTTTCTAAAGAGCCTGCGAAAGCCTTT
    TGGTGACTTTATGGAGCCCAAGTTTGAGTTTGCTGTGAAGTTCAATGCACTGGAATTAGA
    TGACAGCGACTTGGCAATATTTATTGCTGTCATTATTCTCAGTGGAGACCGCCCAGGTTT
    GCTGAATGTGAAGCCCATTGAAGACATTCAAGACAACCTGCTACAAGCCCTGGAGCTCCA
    GCTGAAGCTGAACCACCCTGAGTCCTCACAGCTGTTTGCCAAGCTGCTCCAGAAAATGAC
    AGACCTCAGACAGATTGTCACGGAACACGTGCAGCTACTGCAGGTGATCAAGAAGACGGA
    GACAGACATGAGTCTTCACCCGCTCCTGCAGGAGATCTACAAGGACTTGTACTAGCAGAG
    AGTCCTGAGCCACTGCCAACATTTCCCTTCTTCCAGTTGCACTATTCTGAGGGAAAATCT
    GACACCTAAGAAATTTACTGTGAAAAAGCATTTTAAAAAGAAAAGGTTTTAGAATATGAT
    CTATTTTATGCATATTGTTTATAAAGACACATTTACAATTTACTTTTAATATTAAAAATT
    ACCATATTATGAAAAAAAAAAAAAAAA
    >X05615
    GCAGTGGTTTCTCCTCCTTCCTCCCAGGAAGGGCCAGGAAAATGGCCCTGGTCCTGGAGA
    TCTTCACCCTGCTGGCCTCCATCTGCTGGGTGTCGGCCAATATCTTCGAGTACCAGGTTG
    ATGCCCAGCCCCTTCGTCCCTGTGAGCTGCAGAGGGAAACGGCCTTTCTGAAGCAAGCAG
    ACTACGTGCCCCAGTGTGCAGAGGATGGCAGCTTCCAGACTGTCCAGTGCCAGAACGACG
    GCCGCTCCTGCTGGTGTGTGGGTGCCAACGGCAGTGAAGTGCTGGGCAGCAGGCAGCCAG
    GACGGCCTGTGGCTTGTCTGTCATTTTGTCAGCTACAGAAACAGCAGATCTTACTGAGTG
    GCTACATTAACAGCACAGACACCTCCTACCTCCCTCAGTGTCAGGATTCAGGGGACTACG
    CGCCTGTTCAGTGTGATGTGCAGCATGTCCAGTGCTGGTGTGTGGACGCAGAGGGGATGG
    AGGTGTATGGGACCCGCCAGCTGGGGAGGCCAAAGCGATGTCCAAGGAGCTGTGAAATAA
    GAAATCGTCGTCTTCTCCACGGGGTGGGAGATAAGTCACCACCCCAGTGTTCTGCGGAGG
    GAGAGTTTATGCCTGTCCAGTGCAAATTTGTCAACACCACAGACATGATGATTTTTGATC
    TGGTCCACAGCTACAACAGGTTTCCAGATGCATTTGTGACCTTCAGTTCCTTCCAGAGGA
    GGTTCCCTGAGGTATCTGGGTATTGCCACTGTGCTGACAGCCAAGGGCGGGAACTGGCTG
    AGACAGGTTTGGAGTTGTTACTGGATGAAATTTATGACACCATTTTTGCTGGCCTGGACC
    TTCCTTCCACCTTCACTGAAACCACCCTGTACCGGATACTGCAGAGACGGTTCCTCGCAG
    TTCAATCAGTCATCTCTGGCAGATTCCGATGCCCCACAAAATGTGAAGTGGAGCGGTTTA
    CAGCAACCAGCTTTGGTCACCCCTATGTTCCAAGCTGCCGCCGAAATGGCGACTATCAGG
    CGGTGCAGTGCCAGACGGAAGGGCCCTGCTGGTGTGTGGACGCCCAGGGGAAGGAAATGC
    ATGGAACCCGGCAGCAAGGGGAGCCGCCATCTTGTGCTGAAGGCCAATCTTGTGCCTCCG
    AAAGGCAGCAGGCCTTGTCCAGACTCTACTTTGGGACCTCAGGCTACTTCAGCCAGCACG
    ACCTGTTCTCTTCCCCAGAGAAAAGATGGGCCTCTCCAAGAGTAGCCAGATTTGCCACAT
    CCTGCCCACCCACGATCAAGGAGCTCTTTGTGGACTCTGGGCTTCTCCGCCCAATGGTGG
    AGGGACAGAGCCAACAGTTTTCTGTCTCAGAAAATCTTCTCAAAGAAGCCATCCGAGCAA
    TTTTTCCCTCCCGAGGGCTGGCTCGTCTTGCCCTTCAGTTTACCACCAACCCAAAGAGAC
    TCCAGCAAAACCTTTTTGGAGGGAAATTTTTGGTGAATGTTGGCCAGTTTAACTTGTCTG
    GAGCCCTTGGCACAAGAGGCACATTTAACTTCAGTCAATTTTTCCAGCAACTTGGTCTTG
    CAAGCTTCTTGAATGGAGGGAGACAAGAAGATTTGGCCAAGCCACTCTCTGTGGGATTAG
    ATTCAAATTCTTCCACAGGAACCCCTGAAGCTGCTAAGAAGGATGGTACTATGAATAAGC
    CAACTGTGGGCAGCTTTGGCTTTGAAATTAACCTACAAGAGAACCAAAATGCCCTCAAAT
    TCCTTGCTTCTCTCCTGGAGCTTCCAGAATTCCTTCTCTTCTTGCAACATGCTATCTCTG
    TGCCAGAAGATGTGGCAAGAGATTTAGGTGATGTGATGGAAACGGTACTCGACTCCCAGA
    CCTGTGAGCAGACACCTGAAAGGCTATTTGTCCCATCATGCACGACAGAAGGAAGCTATG
    AGGATGTCCAATGCTTTTCCGGAGAGTGCTGGTGTGTGAATTCCTGGGGCAAAGAGCTTC
    CAGGCTCAAGAGTCAGAGATGGACAGCCAAGGTGCCCCACAGACTGTGAAAAGCAAAGGG
    CTCGCATGCAAAGCCTCATGGGCAGCCAGCCTGCTGGCTCCACCTTGTTTGTCCCTGCTT
    GTACTAGTGAGGGACATTTCCTGCCTGTCCAGTGCTTCAACTCAGAGTGCTACTGTGTTG
    ATGCTGAGGGTCAGGCCATTCCTGGAACTCGAAGTGCAATAGGGAAGCCCAAGAAATGCC
    CCACGCCCTGTCAATTACAGTCTGAGCAAGCTTTCCTCAGGACGGTGCAGGCCCTGCTCT
    CTAACTCCAGCATGCTACCCACCCTTTCCGACACCTACATCCCACAGTGCAGCACCGATG
    GGCAGTGGAGACAAGTGCAATGCAATGGGCCTCCTGAGCAGGTCTTCGAGTTGTACCAAC
    GATGGGAGGCTCAGAACAAGGGCCAGGATCTGACGCCTGCCAAGCTGCTAGTGAAGATCA
    TGAGCTACAGAGAAGCAGCTTCCGGAAACTTCAGTCTCTTTATTCAAAGTCTGTATGAGG
    CTGGCCAGCAAGATGTCTTCCCGGTGCTGTCACAATACCCTTCTCTGCAAGATGTCCCAC
    TAGCAGCACTGGAAGGGAAACGGCCCCAGCCCAGGGAGAATATCCTCCTGGAGCCCTACC
    TCTTCTGGCAGATCTTAAATGGCCAACTCAGCCAATACCCGGGGTCCTACTCAGACTTCA
    GCACTCCTTTGGCACATTTTGATCTTCGGAACTGCTGGTGTGTGGATGAGGCTGGCCAAG
    AACTGGAAGGAATGCGGTCTGAGCCAAGCAAGCTCCCAACGTGTCCTGGCTCCTGTGAGG
    AAGCAAAGCTCCGTGTACTGCAGTTCATTAGGGAAACGGAAGAGATTGTTTCAGCTTCCA
    ACAGTTCTCGGTTCCCTCTGGGGGAGAGTTTCCTGGTGGCCAAGGGAATCCGGCTGAGGA
    ATGAGGACCTCGGCCTTCCTCCGCTCTTCCCGCCCCGGGAGGCTTTCGCGGAGTTTCTGC
    GTGGGAGTGATTACGCCATTCGCCTGGCGGCTCAGTCTACCTTAAGCTTCTATCAGAGAC
    GCCGCTTTTCCCCGGACGACTCGGCTGGAGCATCCGCCCTTCTGCGGTCGGGCCCCTACA
    TGCCACAGTGTGATGCGTTTGGAAGTTGGGAGCCTGTGCAGTGCCACGCTGGGACTGGGC
    ACTGCTGGTGTGTAGATGAGAAAGGAGGGTTCATCCCTGGCTCACTGACTGCCCGCTCTC
    TGCAGATTCCACAGTGCCCGACAACCTGCGAGAAATCTCGAACCAGTGGGCTGCTTTCCA
    GTTGGAAACAGGCTAGATCCCAAGAAAACCCATCTCCAAAAGACCTGTTCGTCCCAGCCT
    GCCTAGAAACAGGAGAATATGCCAGGCTGCAGGCATCGGGGGCTGGCACCTGGTGTGTGG
    ACCCTGCATCAGGAGAAGAGTTGCGGCCTGGCTCGAGCAGCAGTGCCCAGTGCCCAAGCC
    TCTGCAATGTGCTCAAGAGTGGAGTCCTCTCTAGGAGAGTCAGCCCAGGCTATGTCCCAG
    CCTGCAGGGCAGAGGATGGGGGCTTTTCCCCAGTGCAATGTGACCAGGCCCAGGGCAGCT
    GCTGGTGTGTCATGGACAGCGGAGAAGAGGTGCCTGGGACGCGCGTGACCGGGGGCCAGC
    CCGCCTGTGAGAGCCCGCGGTGTCCGCTGCCATTCAACGCGTCGGAGGTGGTTGGTGGAA
    CAATCCTGTGTGAGACAATCTCGGGCCCCACAGGCTCTGCCATGCAGCAGTGCCAATTGC
    TGTGCCGCCAAGGCTCCTGGAGCGTGTTTCCACCAGGGCCATTGATATGTAGCCTGGAGA
    GCGGACGCTGGGAGTCACAGCTGCCTCAGCCCCGGGCCTGCCAACGGCCCCAGCTGTGGC
    AGACCATCCAGACCCAAGGGCACTTTCAGCTCCAGCTCCCGCCGGGCAAGATGTGCAGTG
    CTGACTACGCGGGTTTGCTGCAGACTTTCCAGGTTTTCATATTGGATGAGCTGACAGCCC
    GCGGCTTCTGCCAGATCCAGGTGAAGACTTTTGGCACCCTGGTTTCCATTCCTGTCTGCA
    ACAACTCCTCTGTGCAGGTGGGTTGTCTGACCAGGGAGCGTTTAGGAGTGAATGTTACAT
    GGAAATCACGGCTTGAGGACATCCCAGTGGCTTCTCTTCCTGACTTACATGACATTGAGA
    GAGCCTTGGTGGGCAAGGATCTCCTTGGGCGCTTCACAGATCTGATCCAGAGTGGCTCAT
    TCCAGCTTCATCTGGACTCCAAGACGTTCCCAGCGGAAACCATCCGCTTCCTCCAAGGGG
    ACCACTTTGGCACCTCTCCTAGGACACGGTTTGGGTGCTCGGAAGGATTCTACCAAGTCT
    TGACAAGTGAGGCCAGTCAGGACGGACTGGGATGCGTTAAGTGCCATGAAGGAAGCTATT
    CCCAAGATGAGGAATGCATTCCTTGTCCTGTTGGATTCTACCAAGAACAGGCAGGGAGCT
    TGGCCTGTGTCCCATGTCCTGTGGGCAGAACGACCATTTCTGCCGGAGCTTTCAGCCAGA
    CTCACTGTGTCACTGACTGTCAGAGGAACGAAGCAGGCCTGCAATGTGACCAGAATGGCC
    AGTATCGAGCCAGCCAGAAGGACAGGGGCAGTGGGAAGGCCTTCTGTGTGGACGGCGAGG
    GGCGGAGGCTGCCATGGTGGGAAACAGAGGCCCCTCTTGAGGACTCACAGTGTTTGATGA
    TGCAGAAGTTTGAGAAGGTTCCAGAATCAAAGGTGATCTTCGACGCCAATGCTCCTGTGG
    CTGTCAGATCCAAAGTTCCTGATTCTGAGTTCCCCGTGATGCAGTGCTTGACAGATTGCA
    CAGAGGACGAGGCCTGCAGCTTCTTCACCGTGTCCACGACGGAGCCAGAGATTTCCTGTG
    ATTTCTATGCTTGGACAAGTGACAATGTTGCCTGCATGACTTCTGACCAGAAACGAGATG
    CACTGGGGAACTCAAAGGCCACCAGCTTTGGAAGTCTTCGCTGCCAGGTGAAAGTGAGGA
    GCCATGGTCAAGATTCTCCAGCTGTGTATTTGAAAAAGGGCCAAGGATCCACCACAACAC
    TTCAGAAACGCTTTGAACCCACTGGTTTCCAAAACATGCTTTCTGGATTGTACAACCCCA
    TTGTGTTCTCAGCCTCAGGAGCCAATCTAACCGATGCTCACCTCTTCTGTCTTCTTGCAT
    GCGACCGTGATCTGTGTTGCGATGGCTTCGTCCTCACACAGGTTCAAGGAGGTGCCATCA
    TCTGTGGGTTGCTGAGCTCACCCAGTGTCCTGCTTTGTAATGTCAAAGACTGGATGGATC
    CCTCTGAAGCCTGGGCTAATGCTACATGTCCTGGTGTGACATATGACCAGGAGAGCCACC
    AGGTGATATTGCGTCTTGGAGACCAGGAGTTCATCAAGAGTCTGACACCCTTAGAAGGAA
    CTCAAGACACCTTTACCAATTTTCAGCAGGTTTATCTCTGGAAAGATTCTGACATGGGGT
    CTCGGCCTGAGTCTATGGGATGTAGAAAAAACACAGTGCCAAGGCCAGCATCTCCAACAG
    AAGCAGGTTTGACAACAGAACTTTTCTCCCCTGTGGACCTCAACCAGGTCATTGTCAATG
    GAAATCAATCACTATCCAGCCAGAAGCACTGGCTTTTCAAGCACCTGTTTTCAGCCCAGC
    AGGCAAACCTATGGTGCCTTTCTCGTTGTGTGCAGGAGCACTCTTTCTGTCAGCTCGCAG
    AGATAACAGAGAGTGCATCCTTGTACTTCACCTGCACCCTCTACCCAGAGGCACAGGTGT
    GTGATGACATCATGGAGTCCAATACCCAGGGCTGCAGACTGATCCTGCCTCAGATGCCAA
    AGGCCCTGTTCCGGAAGAAAGTTATACTGGAAGATAAAGTGAAGAACTTTTACACTCGCC
    TGCCGTTCCAAAAACTGATGGGGATATCCATTAGAAATAAAGTGCCCATGTCTGAAAAAT
    CTATTTCTAATGGGTTCTTTGAATGTGAACGACGGTGCGATGCGGACCCATGCTGCACTG
    GCTTTGGATTTCTAAATGTTTCCCAGTTAAAAGGAGGAGAGGTGACATGTCTCACTCTGA
    ACAGCTTGGGAATTCAGATGTGCAGTGAGGAGAATGGAGGAGCCTGGCGCATTTTGGACT
    GTGGCTCTCCTGACATTGAAGTCCACACCTATCCCTTCGGATGGTACCAGAAGCCCATTG
    CTCAAAATAATGCTCCCAGTTTTTGCCCTTTGGTTGTTCTGCCTTCCCTCACAGAGAAAG
    TGTCTCTGGAATCGTGGCAGTCCCTGGCCCTCTCTTCAGTGGTTGTTGATCCATCCATTA
    GGCACTTTGATGTTGCCCATGTCAGCACTGCTGCCACCAGCAATTTCTCTGCTGTCCGAG
    ACCTCTGTTTGTCGGAATGTTCCCAACATGAGGCCTGTCTCATCACCACTCTGCAAACCC
    AACTCGGGGCTGTGAGATGTATGTTCTATGCTGATACTCAAAGCTGCACACATAGTCTGC
    AGGGTCGGAACTGCCGACTTCTGCTTCGTGAAGAGGCCACCCACATCTACCGGAAGCCAG
    GAATCTCTCTGCTCAGCTATGAGGCATCTGTACCTTCTGTGCCCATTTCCACCCATGGCC
    GGCTGCTGGGCAGGTCCCAGGCCATCCAGGTGGGTACCTCATGGAAGCAAGTGGACCAGT
    TCCTTGGAGTTCCATATGCTGCCCCGCCCCTGGCAGAGAGGCACTTCCAGGCACCAGAGC
    CCTTGAACTGGACAGGCTCCTGGGATGCCAGCAAGCCAAGGGCCAGCTGCTGGCAGCCAG
    GCACCAGAACATCCACGTCTCCTGGAGTCAGTGAAGATTGTTTGTATCTCAATGTGTTCA
    TCCCTCAGAATGTGGCCCCTAACGCGTCTGTGCTGGTGTTCTTCCACAACACCATGGACA
    GGGAGGAGAGTGAAGGATGGCCGGCTATCGACGGCTCCTTCTTGGCTGCTGTTGGCAACC
    TCATCGTGGTCACTGCCAGCTACCGAGTGGGTGTCTTCGGCTTCCTGAGTTCTGGATCCG
    GAGAGGTGAGTGGCAACTGGGGGCTGCTGGACCAGGTGGCGGCTCTGACCTGGGTGCAGA
    CCCACATCCGAGGATTTGGCGGGGACCCTCGGCGCGTGTCCCTGGCAGCAGACCGTGGCG
    GGGCTGATGTGGCCAGCATCCACCTTCTCACGGCCAGGGCCACCAACTCCCAACTTTTCC
    GGAGAGCTGTGCTGATGGGAGGCTCCGCACTCTCCCCGGCCGCCGTCATCAGCCATGAGA
    GGGCTCAGCAGCAGGCAATTGCTTTGGCAAAGGAGGTCAGTTGCCCCATGTCATCCAGCC
    AAGAAGTGGTGTCCTGCCTCCGCCAGAAGCCTGCCAATGTCCTCAATGATGCCCAGACCA
    AGCTCCTGGCCGTGAGTGGCCCTTTCCACTACTGGGGTCCTGTGATCGATGGCCACTTCC
    TCCGTGAGCCTCCAGCCAGAGCACTGAAGAGGTCTTTATGGGTAGAGGTCGATCTGCTCA
    TTGGGAGTTCTCAGGACGACGGGCTCATCAACAGAGCAAAGGCTGTGAAGCAATTTGAGG
    AAAGTCGAGGCCGGACCAGTAGCAAAACAGCCTTTTACCAGGCACTGCAGAATTCTCTGG
    GTGGCGAGGACTCAGATGCCCGCGTCGAGGCTGCTGCTACATGGTATTACTCTCTGGAGC
    ACTCCACGGATGACTATGCCTCCTTCTCCCGGGCTCTGGAGAATGCCACCCGGGACTACT
    TTATCATCTGCCCTATAATCGACATGGCCAGTGCCTGGGCAAAGAGGGCCCGAGGAAACG
    TCTTCATGTACCATGCTCCTGAAAACTACGGCCATGGCAGCCTGGAGCTGCTGGCGGATG
    TTCAGTTTGCCTTGGGGCTTCCCTTCTACCCAGCCTACGAGGGGCAGTTTTCTCTGGAGG
    AGAAGAGCCTGTCGCTGAAAATCATGCAGTACTTTTCCCACTTCATCAGATCAGGAAATC
    CCAACTACCCTTATGAGTTCTCACGGAAAGTACCCACATTTGCAACCCCCTGGCCTGACT
    TTGTACCCCGTGCTGGTGGAGAGAACTACAAGGAGTTCAGTGAGCTGCTCCCCAATCGAC
    AGGGCCTGAAGAAAGCCGACTGCTCCTTCTGGTCCAAGTACATCTCGTCTCTGAAGACAT
    CTGCAGATGGAGCCAAGGGCGGGCAGTCAGCAGAGAGTGAAGAGGAGGAGTTGACGGCTG
    GATCTGGGCTAAGAGAAGATCTCCTAAGCCTCCAGGAACCAGGCTCTAAGACCTACAGCA
    AGTGACCAGCCCTTGAGCTCCCCAAAAACCTCACCCGAGGCTGCCCACTATGGTCATCTT
    TTTCTCTAAAATAGTTACTTACCTTCAATAAAGTATCTACATGCGGTG
    >X79676
    AGATCTCTCCAGATCACACTGTCACGTGTACCTAGCACATCTCGAGAACTCCTTTGGGCC
    GTCTGGGGCCCGGGAAGGAAGCCTGAGTTCTCAAGATTCCAGGACTGAGAGTGCCAGCTT
    GTCTCAAAGCCAGGTCAATGGTTTCTTTGCCAGCCATTTAGGTGACCAAACCTGGCAGGA
    ATCACAGCATGGCAGCCCTTCCCCATCTGTAATATCCAAAGCCACCGAGAAAGAGACTTT
    CACTGATAGTAACCAAAGCAAAACTAAAAAGCCAGGCATTTCTGATGTAACTGATTACTC
    AGACCGTGGAGATTCAGACATGGATGAAGCCACTTACTCCAGCAGTCAGGATCATCAAAC
    ACCAAAACAGGAATCTTCCTCTTCAGTGAATACATCCAACAAGATGAATTTTAAAACTTT
    TCCTTCATCACCTCCTAGGTCTGGAGATATCTTTGAGGTTGAACTGGCTAAAAATGATAA
    CAGCTTGGGGATAAGTGTCACGGGAGGTGTGAATACGAGTGTCAGACATGGTGGCATTTA
    TGTGAAAGCTGTTATTCCCCAGGGAGCAGCAGAGTCTGATGGTAGAATTCACAAAGGTGA
    TCGCGTCCTAGCTGTCAATGGAGTTAGTCTAGAAGGAGCCACCCATAAGCAAGCTGTGGA
    AACACTGAGAAATACAGGACAGGTGGTTCATCTGTTATTAGAAAAGGGACAATCTCCAAC
    ATCTAAAGAACATGTCCCGGTAACCCCACAGTGTACCCTTTCAGATCAGAATGCCCAAGG
    TCAAGGCCCAGAAAAAGTGAAGAAAACAACTCAGGTCAAAGACTACAGCTTTGTCACTGA
    AGAAAATACATTTGAGGTAAAATTATTTAAAAATAGCTCAGGTCTAGGATTCAGTTTTTC
    TCGAGAAGATAATCTTATACCGGAGCAAATTAATGCCAGCATAGTAAGGGTTAAAAAGCT
    CTTTCCTGGACAGCCAGCAGCAGAAAGTGGAAAAATTGATGTAGGAGATGTTATCTTGAA
    AGTGAATGGAGCCTCTTTGAAAGGACTATCTCAGCAGGAAGTCATATCTGCTCTCAGGGG
    AACTGCTCCAGAAGTATTCTTGCTTCTCTGCAGACCTCCACCTGGTGTGCTACCGGAAAT
    TGATACTGCGCTTTTGACCCCACTTCAGTCTCCAGCACAAGTACTTCCAAACAGCAGTAA
    AGACTCTTCTCAGCCATCATGTGTGGAGCAAAGCACCAGCTCAGATGAAAATGAAATGTC
    AGACAAAAGCAAAAAACAGTGCAAGTCCCCATCCAGAAAAGACAGTTACAGTGACAGCAG
    TGGGAGTGGAGAAGATGACTTAGTGACAGCTCCAGCAAACATATCAAATTCGACCTGGAG
    TTCAGCTTTGCATCAGACTCTAAGCAACATGGTATCACAGGCACAGAGTCATCATGAAGC
    ACCAAGAGTCAAGAAGATACCATTTGTACCATGTTTTACTATCCTCAGGAAAAGGCCCAA
    TAAACCAGAGTTTGAGGACAGTAATCCTTCCCCTCTACCACCGGATATGGCTCCTGGGCA
    GAGTTATCAACCCCAATCAGAATCTGCTTCCTCTAGTTCGATGGATAAGTATCATATACA
    TCACATTTCTGAACCAACTAGACAAGAAAACTGGACACCTTTGAAAAATGACTTGGAAAA
    TCACCTTGAAGACTTTGAACTGGAAGTAGAACTCCTCATTACCCTAATTAAATCAGAAAA
    AGGAAGCCTGGGTTTTACAGTAACCAAAGGCAATCAGAGAATTGGTTGTTATGTTCATGA
    TGTCATACAGGATCCAGCCAAAAGTGATGGAAGGCTAAAACCTGGGGACCGGCTCATAAA
    GGTTAATGATACAGATGTTACTAATATGACTCATACAGATGCAGTTAATCTGCTCCGGGG
    ATCCAAAACAGTCAGATTAGTTATTGGACGAGTTCTAGAATTACCCAGAATACCAATGTT
    GCCTCATTTGCTACCGGACATAACACTAACGTGCAACAAAGAGGAGTTGGGTTTTTCCTT
    ATGTGGAGGTCATGACAGCCTTTATCAAGTGGTATATATTAGTGATATTAATCCAAGGTC
    CGTCGCAGCCATTGAGGGTAATCTCCAGCTATTAGATGTCATCCATTATGTGAACGGAGT
    CAGCACACAAGGAATGACCTTGGAGGAAGTTAACAGAGCATTAGACATGTCACTTCCTTC
    ATTGGTATTGAAAGCAACAAGAAATGATCTTCCAGTGGTCCCCAGCTCAAAGAGGTCTGC
    TGTTTCAGCTCCAAAGTCAACCAAAGGCAATGGTTCCTACAGTGTGGGGTCTTGCAGCCA
    GCCTGCCCTCACTCCTAATGATTCATTCTCCACGGTTGCTGGGGAAGAAATAAATGAAAT
    ATCGTACCCCAAAGGAAAATGTTCTACTTATCAGATAAAGGGATCACCAAACTTGACTCT
    GCCCAAAGAATCTTATATACAAGAAGATGACATTTATGATGATTCCCAAGAAGCTGAAGT
    TATCCAGTCTCTGCTGGATGTTGTGGATGAGGAGTCCCAGAATCTTTTAAACGAAAATAA
    TGCAGCAGGATACTCCTGTGGTCCAGGTACATTAAAGATGAATGGGAAGTTATCAGAAGA
    GAGAACAGAAGATACAGACTGCGATGGTTCACCTTTACCTGAGTATTTTACTGAGGCCAC
    CAAAATGAATGGCTGTGAAGAATATTGTGAAGAAAAAGTAAAAAGTGAAAGCTTAATTCA
    GAAGCCACAAGAAAAGAAGACTGATGATGATGAAATAACATGGGGAAATGATGAGTTGCC
    AATAGAGAGAACAAACCATGAAGATTCTGATAAAGATCATTCCTTTCTGACAAACGATGA
    GCTCGCTGTACTCCCTGTCGTCAAAGTGCTTCCCTCTGGTAAATACACGGGCGCCAACTT
    AAAATCAGTCATTCGAGTCCTGCGGGTTGCTAGATCAGGAATTCCTTCTAAGGAGCTGGA
    GAATCTTCAAGAATTAAAACCTTTGGATCAGTGTCTAATTGGGCAAACTAAGGAAAACAG
    AAGGAAGAACAGATATAAAAATATACTTCCCTATGATGCTACAAGAGTGCCTCTTGGAGA
    TGAAGGTGGCTATATCAATGCCAGCTTCATTAAGATACCAGTTGGGAAAGAAGAGTTCGT
    TTACATTGCCTGCCAAGGACCACTGCCTACAACTGTTGGAGACTTCTGGCAGATGATTTG
    GGAGCAAAAATCCACAGTGATAGCCATGATGACTCAAGAAGTAGAAGGAGAAAAAATCAA
    ATGCCAGCGCTATTGGCCCAACATCCTAGGCAAAACAACAATGGTCAGCAACAGACTTCG
    ACTGGCTCTTGTGAGAATGCAGCAGCTGAAGGGCTTTGTGGTGAGGGCAATGACCCTTGA
    AGATATTCAGACCAGAGAGGTGCGCCATATTTCTCATCTGAATTTCACTGCCTGGCCAGA
    CCATGATACACCTTCTCAACCAGATGATCTGCTTACTTTTATCTCCTACATGAGACACAT
    CCACAGATCAGGCCCAATCATTACGCACTGCAGTGCTGGCATTGGACGTTCAGGGACCCT
    GATTTGCATAGATGTGGTTCTGGGATTAATCAGTCAGGATCTTGATTTTGACATCTCTGA
    TTTGGTGCGCTGCATGAGACTACAAAGACACGGAATGGTTCAGACAGAGGATCAATATAT
    TTTCTGCTATCAAGTCATCCTTTATGTCCTGACACGTCTTCAAGCAGAAGAAGAGCAAAA
    ACAGCAGCCTCAGCTTCTGAAGTGACATGAAAAGAGCCTCTGGATGCATTTCCATTTCTC
    TCCTTAACCTCCAGCAGACTCCTGCTCTCTATCCAAAATAAAGATCACAGAGCAGCAAGT
    TCATACAACATGCATGTTCTCCTCTATCTTAGAGGGGTATTCTTCTTGAAAATAAAAAAT
    ATTGAAATGCTGTATTTTTACAGCTACTTTAACCTATGATAATTATTTACAAAATTTTAA
    CACTAACCAAACAATGCAGATCTTAGGGATGATTAAAGGCAGCATTTGATGATAGCAGAC
    ATTGTTACAAGGACATGGTGAGTCTATTTTTAATGCACCAATCTTGTTTATAGCAAAAAT
    GTTTTCCAATATTTTAATAAAGTAGTTATTTATAGGCATACTTGAAACCAGTATTTAAGC
    TTTAAATGACAGTAATATTGGCATAGAAAAAAGTAGCAAATGTTTACTGTATCAATTTCT
    AATGTTTACTATATAGAATTTCCTGTAATATATTTATATACTTTTTCATGAAAATGGAGT
    TATCAGTTATCTGTTTGTTACTGCATCATCTGTTTGTAATCATTATCTCACTTTGTAAAT
    AAAAACACACCTTAAAACATGAACAAGCCAAAAAAAAAAAAAAA
    >NM_006142
    CCAGGCAGCAGTTAGCCCGCCGCCCGCCTGTGTGTCCCCAGAGCCATGGAGAGAGCCAGT
    CTGATCCAGAAGGCCAAGCTGGCAGAGCAGGCCGAACGCTATGAGGACATGGCAGCCTTC
    ATGAAAGGCGCCGTGGAGAAGGGCGAGGAGCTCTCCTGCGAAGAGCGAAACCTGCTCTCA
    GTAGCCTATAAGAACGTGGTGGGCGGCCAGAGGGCTGCCTGGAGGGTGCTGTCCAGTATT
    GAGCAGAAAAGCAACGAGGAGGGCTCGGAGGAGAAGGGGCCCGAGGTGCGTGAGTACCGG
    GAGAAGGTGGAGACTGAGCTCCAGGGCGTGTGCGACACCGTGCTGGGCCTGCTGGACAGC
    CACCTCATCAAGGAGGCCGGGGACGCCGAGAGCCGGGTCTTCTACCTGAAGATGAAGGGT
    GACTACTACCGCTACCTGGCCGAGGTGGCCACCGGTGACGACAAGAAGCGCATCATTGAC
    TCAGCCCGGTCAGCCTACCAGGAGGCCATGGACATCAGCAAGAAGGAGATGCCGCCCACC
    AACCCCATCCGCCTGGGCCTGGCCCTGAACTTTTCCGTCTTCCACTACGAGATCGCCAAC
    AGCCCCGAGGAGGCCATCTCTCTGGCCAAGACCACTTTCGACGAGGCCATGGCTGATCTG
    CACACCCTCAGCGAGGACTCCTACAAAGACAGCACCCTCATCATGCAGCTGCTGCGAGAC
    AACCTGACACTGTGGACGGCCGACAACGCCGGGGAAGAGGGGGGCGAGGCTCCCCAGGAG
    CCCCAGAGCTGAGTGTTGCCCGCCACCGCCCCGCCCTGCCCCCTCCAGTCCCCGCCCTGC
    CGAGAGGACTAGTATGGGGTGGGAGGCCCCACCCTTCTCCCCTAGGCGCTGTTCTTGCTC
    CAAAGGGCTCCGTGGAGAGGGACTGGCAGAGCTGAGGCCACCTGGGGCTGGGGATCCCAC
    TCTTCTTGCAGCTGTTGAGCGCACCTAACCACTGGTCATGCCCCCACCCCTGCTCTCCGC
    ACCCGCTTCCTCCCGACCCCAGGACCAGGCTACTTCTCCCCTCCTCTTGCCTCCCTCCTG
    CCCCTGCTGCCTCTTGATTCGTAGGAATTGAGGAGTGTCTCCGCCTTGTGGCTGAGAACT
    GGACAGTGGCAGGGGCTGGAGATGGGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGCGCG
    CGCGCCAGTGCAAGACCGAGACTGAGGGAAAGCATGTCTGCTGGGTGTGACCATGTTTCC
    TCTCAATAAAGTTCCCCTGTGACACTCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    AAAAAAAAAA
    >AW445220
    CGGCCGCGAGGCCCTGAGATGAGGCTCCAAAGACCCCGACAGGCCCCGGCGGGTGGGAGG
    CGCGCGCCCCGGGGCGGGCGGGGCTCCCCCTACCGGCCAGACCCGGGGAGAGGCGCGCGG
    AGGCTGCGAAGGTTCCAGAAGGGCGGGGAGGGGGCGCCGCGCGCTGACCCTCCCTGGGCA
    CCGCTGGGGACGATGGCGCTGCTCGCCTTGCTGCTGGTCGTGGCCCTACCGCGGGTGTGG
    ACAGACGCCAACCTGACTGCGAGACAACGAGATCCAGAGGACTCCCAGCGAACGGACGAG
    GGTGACAATAGAGTGTGGTGTCATGTTTGTGAGAGAGAAAACACTTTCGAGTGCCAGAAC
    CCAAGGAGGTGCAAATGGACAGAGCCATACTGCGTTATAGCGGCCGTGAAAATATTTCCA
    CGTTTTTTCATGGTTGCGAAGCAGTGCTCCGCTGGTTGTGCAGCGATGGAGAGACCCAAG
    CCAGAGGAGAAGCGGTTTCTCCTGGAAGAGCCCATGCCCTTCTTTTACCTCAAGTGTTGT
    AAAATTCGCTACTGCAATTTAGAGGGGCCACCTATCAACTCATCAGTGTTCAAAGAATAT
    GCTGGGAGCATGGGTGAGAGCTGTGGTGGGCTGTGGCTGGCCATCCTCCTGCTGCTGGCC
    TCCATTGCAGCCGGCCTCAGCCTGTCTTGAGCCACGGGACTGCCACAGACTGAGCCTTCC
    GGAGCATGGACTCGCTCCAGACCGTTGTCACCTGTTGCATTAAACTTGTTTTCTGTTGAT
    TAAAAAAAAAAAAAAAAA
    >AK025701
    TTCAGCCGGAACGTTACTCCGTGTCCACCCGGATCGTGTGTGTGATCGAGGCTGCGGAGA
    CGCCTTTCACGGGGGGTGTCGAGGTGGACGTCTTCGGGAAACTGGGCCGTTCGCCTCCCA
    ATGTCCAGTTCACCTTCCAACAGCCCAAGCCTCTCAGTGTGGAGCCGCAGCAGGGACCGC
    AGGCGGGCGGCACCACACTGACCATCCACGGCACCCACCTGGACACGGGCTCCCAGGAGG
    ACGTGCGGGTGACCCTCAACGGCGTCCCGTGTAAAGTGACGAAGTTTGGGGCGCAGCTCC
    AGTGTGTCACTGGCCCCCAGGCGACACGGGGCCAGATGCTTCTGGAGGTCTCCTACGGGG
    GGTCCCCCGTGCCCAACCCCGGCATCTTCTTCACCTACCGCGAAAACCCCGTACTGCGAG
    CCTTCGAGCCGCTACGAAGCTTTGCCAGTGGTGGCCGCAGCATCAACGTCACGGGTCAGG
    GCTTCAGCCTGATCCAGAGGTTTGCCATGGTGGTCATCGCGGAGCCCCTGCAGTCCTGGC
    AGCCGCCGCGGGAGGCTGAATCCCTGCAGCCCATGACGGTGGTGGGTACAGACTACGTGT
    TCCACAATGACACCAAGGTCGTCTTCCTGTCCCCGGCTGTGCCTGAGGAGCCAGAGGTCT
    ACAACCTCACGGTGCTGATCGAGATGGACGGGCACCGTGCCCTGCTCAGAACAGAGGCCG
    GGGCCTTCGAGTACGTGCCTGACCCCACCCTTGAGAACTTCACAGGTGGCGTCAAGAAGC
    AGGTCAACAAGCTCATCCACGCCCGGGGCACCAATCTGAACAAGGCGATGACGCTGCAGG
    AGGCCGAGGCCTTCGTGGGTGCCGAGCGCTGCACCATGAAGACGCTGACGGAGACCGACC
    TGTACTGTGAGCCCCCGGAGGTGCAGCCCCCGCCCAAGCGGCGGCAGAAACGAGACACCA
    CACACAACCTGCCCGAGTTCATTGTGAAGTTCGGCTCTCGCGAGTGGGTGCTGGGCCGCG
    TGGAGTACGACACACGGGTGAGCGACGTGCCGCTCAGCCTCATCTTGCCGCTGGTCATCG
    TGCCCATGGTGGTCGTCATCGCGGTGTCTGTCTACTGCTACTGGAGGAAGAGCCAGCAGG
    CCGAACGAGAGTATGAGAAGATCAAGTCCCAGCTGGAGGGCCTGGAGGAGAGCGTGCGGG
    ACCGCTGCAAGAAGGAATTCACAGACCTGATGATCGAGATGGAGGACCAGACCAACGACG
    TGCACGAGGCCGGCATCCCCGTGCTGGACTACAAGACCTACACCGACCGCGTCTTCTTCC
    TGCCCTCCAAGGACGGCGACAAGGACGTGATGATCACCGGCAAGCTGGACATCCCCGAGC
    CGCGGCGGCCGGTGGTGGAGCAGGCCCTCTACCAGTTCTCCAACCTGCTGAACAGCAAGT
    CTTTCCTCATCAATTTCATCCACACCCTGGAGAACCAGCGGGAGTTCTCGGCCCGCGCCA
    AGGTCTACTTCGCGTCCCTGCTGACGGTGGCGCTGCACGGGAAACTGGAGTACTACACGG
    ACATCATGCACACGCTCTTCCTGGAGCTCCTGGAGCAGTACGTGGTGGCCAAGAACCCCA
    AGCTGATGCTGCGCAGGTCTGAGACTGTGGTGGAGAGGATGCTGTCCAACTGGATGTCCA
    TCTGCCTGTACCAGTACCTCAAGGACAGTGCCGGGGAGCCCCTGTACAAGCTCTTCAAGG
    CCATCAAACATCAGGTGGAAAAGGGCCCGGTGGATGCGGTACAGAAGAAGGCCAAGTACA
    CTCTCAACGACACGGGGCTGCTGGGGGATGATGTGGAGTACGCACCCCTGACGGTGAGCG
    TGATCGTGCAGGACGAGGGAGTGGACGCCATCCCGGTGAAGGTCCTCAACTGTGACACCA
    TCTCCCAGGTCAAGGAGAAGATCATTGACCAGGTGTACCGTGGGCAGCCCTGCTCCTGCT
    GGCCCAGGCCAGACAGCGTGGTCCTGGAGTGGCGTCCGGGCTCCACAGCGCAGATCCTGT
    CGGACCTGGACCTGACGTCACAGCGGGAGGGCCGGTGGAAGCGCGTCAACACCCTTATGC
    ACTACAATGTCCGGGATGGAGCCACCCTCATCCTGTCCAAGGTGGGGGTCTCCCAGCAGC
    CGGAGGACAGCCAGCAGGACCTGCCTGGGGAGCGCCATGCCCTCCTGGAGGAGGAGAACC
    GGGTGTGGCACCTGGTGCGGCCGACCGACGAGGTGGACGAGGGCAAGTCCAAGAGAGGCA
    GCGTGAAAGAGAAGGAGCGGACGAAGGCCATCACCGAGATCTACCTGACGCGGCTGCTCT
    CAGTCAAGGGCACACTGCAGCAGTTTGTGGACAACTTCTTCCAGAGCGTGCTGGCGCCTG
    GGCACGCGGTGCCACCTGCAGTCAAGTACTTCTTCGACTTCCTGGACGAGCAGGCAGAGA
    AGCACAACATCCAGGATGAAGACACCATCCACATCTGGAAGACGAACAGTTTACCGCTCC
    GGTTCTGGGTGAACATCCTCAAGAACCCCCACTTCATCTTTGACGTGCATGTCCACGAGG
    TGGTGGACGCCTCGCTGTCAGTCATCGCGCAGACCTTCATGGATGCCTGCACGCGCACGG
    AGCATAAGCTGAGCCGCGATTCTCCCAGCAACAAGCTGCTGTACGCCAAGGAGATCTCCA
    CCTACAAGAAGATGGTGGAGGATTACTACAAGGGGATCCGGCAGATGGTGCAGGTCAGCG
    ACCAGGACATGAACACACACCTGGCAGAGATTTCCCGGGCGCACACGGACTCCTTGAACA
    CCCTCGTGGCACTCCACCAGCTCTACCAATACACGCAGAAGTACTATGACGAGATCATCA
    ATGCCTTGGAGGAGGATCCTGCCGCCCAGAAGACGCAGCTGGCCTTCCGCCTGCAGCAGA
    TTGCCGCTGCACTGGAGAACAAGGTCACTGACCTCTGACCTACAATCTCCAGTGCTGCCT
    TGGGACATAGGTACCTGAGGTACCTGAGAGCCCCTCAGGGGAGGAGGCCGAGTGGCTGTG
    GCTGAGGCCCCCACCCTCCCCTGGAACGCGCCCCAAGCCGGAGTGGGTGCAGCCGGAACC
    CGCCCAGCGTCTAGACTGTAGCATCTTCCTCTGAGCAATACCGCCGGGCACCGCACCAGC
    ACCAGCCCCAGCCCCAGCTCCCTCCGGCCGCAGAACCAGCATCGGGTGTTCACTGTCGAG
    TCTCGAGTGATTTGAAAATGTGCCTTACGCTGCCACGCTGGGGGCAGCTGGCCTCCGCCT
    CCGCCCACGCACCAGCAGCCGCCTCCATGCCCTAGGTTGGGCCCCTGGGGGATCTGAGGG
    CCTGTGGCCCCCAGGGCAAGTTCCCAGATCCTATGTCTGTCTGTCCACCACGAGATGGGA
    GGAGGAGAAAAAGCGGTACGATGCCTTCCTGACCTCACCGGCCTCCCCAAGGGTGCCGGC
    ACTCTGGGTGGACTCACGGCTGCTGGGCCCCACGTCAAAGGTCAAGTGAGACGTAGGTCA
    AGTCCTACGTCGGGGCCCAGACATCCTGGGGTCCTGGTCTGTCAGACAGGCTGCCCTAGA
    GCCCCACCCAGTCCGGGGGGACTGGGAGCAGTTCCAAGACCACCCCACCCCTTTTTGTAA
    ATCTTGTTCATTGTAAATCAAATACAGCGTCTTTTTCACTCCGAAAAAAAAAAAAAAAAA
    AAAAAA
    >NM_033229
    GATGTGGGCACGCCTCAGAGCCAGAAGTTTATGGCTCCCACCTGCTCAATCTGACAGGAA
    GCTTCTGCTCCCCAGTTCTCCCCAGCCACTGTGGTCTACAGATTCCAGGAAACCCATCCC
    CCTGTGACCTCAGGGTGTGCTCTGTTCTCCACCCTAGGGACCAGAAGGAGCCAGGAGTAA
    AGAACTGGCTTACTTGGCCGCCACTGGGAAATTCTGGGTAATTCGAGACGCCCTGGAATT
    TGGACCCACTCCGCTGATAGGTGGTGGGCAGGGTTCTAGGGAACACAAGAGGCGGAGCCA
    GGTGGCTTCCCTGTGCTGGCATTCTTGGCTCTCTCTCTCTCTCTTTCTCTCTCTCTGTCT
    CTCTCTCTCTCTCTGTCTCTCAGCCTTGAAGCCGTTTCCCTCTGCGATTCATGTAAGTGT
    GACTCGATTTCAGGGAAAGGGAACTCGCGTGGGCTGAGGAGACCGGAGTGGACGGGCTGG
    GGAAGGCACCGTGATGCCCGCAACCCCGTCCCTGAAGGTGGTCCATGAGCTGCCTGCCTG
    TACCCTCTGTGCGGGGCCGCTGGAGGATGCGGTGACCGTTCCCTGTGGACACACCTTCTG
    CCGGCTCTGCCTCCCCGCGCTCTCCCAGATGGGGGCCCAATCCTCGGGCAAGATCCTGCT
    CTGCCCGCTCTGCCAAGAGGAGGAGCAGGCAGAGACTCCCATGGCCCCTGTGCCCCTGGG
    CCCGCTGGGAGAAACTTACTGCGAGGAGCACGGCGAGAAGATCTACTTCTTCTGCGAGAA
    CGATGCCGAGTTCCTCTGTGTGTTCTGCAGGGAGGGTCCCACGCACCAGGCGCACACCGT
    GGGGTTCCTGGACGAGGCCATTCAGCCCTACCGGGATCGTCTCAGGAGTCGACTGGAAGC
    TCTGAGCACGGAGAGAGATGAGATTGAGGATGTAAAGTGTCAAGAAGACCAGAAGCTTCA
    AGTGCTGCTGACTCAGATCGAAAGCAAGAAGCATCAGGTGGAAACAGCTTTTGAGAGGCT
    GCAGCAGGAGCTGGAGCAGCAGCGATGTCTCCTGCTGGCCAGGCTGAGGGAGCTGGAGCA
    GCAGATTTGGAAGGAGAGGGATGAATATATCACAAAGGTCTCTGAGGAAGTCACCCGGCT
    TGGAGCCCAGGTCAAGGAGCTGGAGGAGAAGTGTCAGCAGCCAGCAAGTGAGCTTCTACA
    AGATGTCAGAGTCAACCAGAGCAGGTGTGAGATGAAGACTTTTGTGAGTCCTGAGGCCAT
    TTCTCCTGACCTTGTCAAGAAGATCCGTGATTTCCACAGGAAAATACTCACCCTCCCAGA
    GATGATGAGGATGTTCTCAGAAAACTTGGCGCATCATCTGGAAATAGATTCAGGGGTCAT
    CACTCTGGACCCTCAGACCGCCAGCCGGAGCCTGGTTCTCTCGGAAGACAGGAAGTCAGT
    GAGGTACACCCGGCAGAAGAAGAGCCTGCCAGACAGCCCCCTGCGCTTCGACGGCCTCCC
    GGCGGTTCTGGGCTTCCCGGGCTTCTCCTCCGGGCGCCACCGCTGGCAGGTTGACCTGCA
    GCTGGGCGACGGCGGCGGCTGCACGGTGGGGGTGGCCGGGGAGGGGGTGAGGAGGAAGGG
    AGAGATGGGACTCAGCGCCGAGGACGGCGTCTGGGCCGTGATCATCTCGCACCAGCAGTG
    CTGGGCCAGCACCTCCCCGGGCACCGACCTGCCGCTGAGCGAGATCCCGCGCGGCGTGAG
    AGTCGCCCTGGACTACGAGGCGGGGCAGGTGACCCTCCACAACGCCCAGACCCAGGAGCC
    CATCTTCACCTTCACTGCCTCTTTCTCCGGCAAAGTCTTCCCTTTCTTTGCCGTCTGGAA
    AAAAGGTTCCTGCCTTACGCTGAAAGGCTGAAGTGGGGCGCGCGAAGGGCGGCGAAGCGG
    AGACGGCGGCTCTCCGGGATCCAGCTCCGCCCCTGGCCAGTGTGCGGCCCGGGGGCTCCC
    TGTGCCCGCGTGAGGCGAGAGAACAGGGGACTTGAGTCTCGAACAGCGGTTGTTTTTACT
    TTATTTATCTTAGGCCCTCAGCTCCCTGACGTCCTGAGCCTCCCTGTGACGCTCTGGCCT
    TCTCTGCACCTCAGAGTGCAGAACCACAGACGGCTTCGGCTGTGCCTAGGGCAACAGCCA
    ACCTAGGAGCCAGCGGGCTTTCGGGGAAAAAAAAGAAAAAGACATCTAAAATAAAATGTT
    TAAACTGTTTCAAAATAAAAAAAAAAAAAAAAAAA
    >AV656862
    TTTATACATTCTAAATCTCCCCAGTTTCTTTGGGGCTGGAAGATGCAACTTCCATTTAAT
    AGAAACTTTGAAATCTTGGGGTAAGGGAGCAGTGGGGGGACTAGGGAGAAGGATAAGAAA
    TAGAATTATTGAAAAGCCCCCACCAGGGACCTTCCTGGCCAGAATATGCAGAGTAATTCC
    TGCTGGCTTCACCTTTGAAAGTCCCTCGAAACTATGCAGATGAAACTGAGTCTGTTTTTG
    ATATTGTCAGATGTATTCTACCTTGGAAGTCCCAACACCTAAACTGGAATTCTTGTATTT
    ACATCTCCTCCACTGTCCCCCACACCACCCCTCAATTCCTGCTGCCCCTGCTAATGTTAA
    GCATTTTTCTCTTGTTATCATCAGGTTCACATTAAAAACAGATACTTACAAACTGACTTG
    AAGCACAGATACTTTTACGAATGTGATAAAATATTTTCTTAAGAAAAGGAAAGAGGATGT
    GGGTCAAATAAAACACCGCATGGATGTTGATTGGTGAATACTGGTGTAAGAAAAGGGAGC
    TCAGGAATTTTTATTACTGTATTTGTAAATGAGTTTGAAGGAATTTGTAAATGCCACTGG
    TACATTTTTAAGGTGACACATTTGCTCCTTATAAAGTTATTAAAAATTACAGGGTAAGCT
    TAAATGACGTTTGCCAGTAGTTTTACTTTATATAATCAATATTGATATTGTTGCTGAACT
    ATGTAACTTTATGATGCATTTTTCAGTCCCTTTTCAGAGCAAATGCTTTTGCAATGGTAG
    TAATGTTTAGTTTAAATTGACTTAATAAATTATTACCTGAGCAAAAAAAAAAAAAAAAAA
    AAAAAAAAAATAAAAAAAAAAAAAAAAAAAAAAAAAAAAATAATAAAAAAAAAAAAAACA
    AACAAATCAATAAAACTTAAACAAAAAAAAAATAAAAAAAAA
    >AI499593
    GCAGAGATCGCCACATCGTCGGACAAGGTCAAGGACGGGGGCGGCGGGAACGAGGGCTCT
    CCATGCCCACCGTGTCCCGGGCCCATAGCCGGGCAAGCCCTAGGAGGCAGCCGGGCGTCG
    CCGGCCCCGGCGCCGTCACGCTCGCCCTCGGCGCAGTGTCCTTTTCCAGGCGGGACGGTG
    CTGTCCCGGCCTCTCTACTACACCGCGCCCTTCTATCCCGGCTACACGAACTATGGCTCC
    TTCGGACACCTTCATGGCCACCCGGGGCCGGGGCCGGGCCCCACACCCGGTCCGGGGTCT
    CATTTCAATGGATTAAACCAGACCGTGTTGAACCGAGCGGACGCTTTGGCTAAAGACCCG
    AAAATGTTGCGGAGCCAGTCTCAGCTAGACCTGTGCAAAGACTCTCCCTATGAATTGAAG
    AAAGGTATGTCCGACATTTAACGCGGGCTGCGTCGGTCCCGGACTTTTCTAATTTATTAA
    AAACATGGCCTTGGCAGTTATTTTTCCATCACCGAGAGAGAGAGACAGAGAGAGAAAATA
    AACTACCCCTCCTATTCAGAAGTTTATAGTTTATGGAGATGGATGACATAAAAATGTAAA
    CATCTCCACACACACAAAAAAATGTCTTAACCAACCGAAAAGAAAAATTAAAAAAGGATT
    TGTATTAAATCTTATTCTGTATATTTAATGTAGCATTTTTGTATTTAAATTGATAATTCA
    ATATCTTTGAAGTAAATTATGAAATCAAGACACCTGTACAGGCATTTAATGTTTTTTTGT
    AATATAAATATATACATTTGTGTTTCCCCCAAAACTGTTTCATAGTTAAAAAATACAAGT
    TTAATTTAATTTTTTACACCTATTGATTCTGCTGGGTATGAGCTAAAGTATTACGGAAAG
    GAAACAGGTTATACTCTTAGATTTAAAAAGTGAAAGAAACTGCAGGCGCCTTTGTAAAAT
    GCAAAATATTTAATTAAAAGAGATTTTAACATAATGAGAGCCACTCATTACTTTTTAGAA
    GCCTCAATAAACTGTCCATTGCCTTGGTCAAAAAAAAAAAAAAAAAAAAA
    >AI952953
    ATATCCAAGAAATTTGGACACCTATACCTACAGAATAATGAAATAGAAAAGATGAATCTN
    ACAGTGATGTGTCCTTCTATTGACCCACTACATTACCACCATTTAACATACATTCGTGTG
    GACCAAAATAAACTAAAAGAACCAATAAGCTCATACATCTTCTTCTGCTTCCCTCATATA
    CACACTATTTATTATGGTGAACAACGAAGCACTAATGGTCAAACAATACAACTAAAGACC
    CAAGTTTTCAGGAGATTTCCAGATGATGATGATGAAAGTGAAGATCACGATGATCCTGAC
    AATGCTCATGAGAGCCCAGAACAAGAAGGAGCAGAAGGGCACTTTGACCTTCATTATTAT
    GAAAATCAAGAATAGCAAGAAACTATATAGGTATACACTTACGACTTCACAAAACCTATA
    CTTAATATAGTAAATCTAAGTAAACATGTATTACTCAAAGTAATATATTTAGAATTATGT
    ATTAGTATAAGATCAGAATTGAATTTAAGTTGTTGGTGACATCTGCATCATTTCATAGGA
    TTAGAACTTACTCAAAATAATGTAAATCTTTAAAAATATAAATTAGAATGACAAGTGGGA
    ATCATAAATTAAACGTTAATGGTTTCTTATGCTCTTTTTAAATATAGAAATATCATGTTA
    AAAAAAAA
    >AK025470
    ATGATTGCAACAGTGGATTTAAAAGTCAATGAATATGAGAAAAACCAAAAATGGCTTGAG
    ATCCTAAATAAGATTGAAAACAAAACATACACGAAGCTCAAAAATGGACATGTGTTTAGG
    AAGCAGGCACTGATGAGTGAAGAAAGGACTCTGTTATATGATGGCCTTGTTTACTGGAAA
    ACTGCTACAGGTCGTTTCAAAGATATCCTAGCTCTACTTCTAACTGATGTGCTGCTCTTT
    TTACAAGAAAAAGACCAGAAATACATCTTTGCAGCCGTTGATCAGAAGCCATCAGTTATT
    TCCCTTCAAAAGCTTATTGCTAGAGAAGTTGCTAATGAGGAGAGAGGAATGTTTCTGATC
    AGTGCTTCATCTGCTGGTCCTGAGATGTATGAAATTCACACCAATTCCAAGGAGGAACGC
    AATAACTGGATGAGACGGATCCAGCAGGCTGTAGAAAGTTGTCCTGAAGAAAAAGGGGGA
    AGGACAAGTGAATCTGATGAAGACAAGAGGAAAGCTGAAGCCAGAGTGGCCAAAATTCAG
    CAATGTCAAGAAATACTCACTAACCAAGACCAACAAATTTGTGCGTATTTGGAGGAGAAG
    CTGCATATCTATGCTGAACTTGGAGAACTGAGCGGATTTGAGGACGTCCATCTAGAGCCC
    CACCTCCTTATTAAACCTGACCCAGGCGAGCCTCCCCAGGCAGCCTCATTACTGGCAGCA
    GCACTGAAAGAAGCATTAGTCACAGGAGGGAGAGAAGGAAGAGGCTGTTCGGATGTGGAT
    CCCGGGATCCAGGGTGTGGTAACCGACTTGGCCGTCTCTGATGCAGGGGAGAAGGTGGAA
    TGTAGAAATTTTCCAGGTTCTTCACAATCAGAGATTATACAAGCCATACAGAATTTAACC
    CGTCTCTTATACAGCCTTCAGGCCGCCTTGACCATTCAGGACAGCCACATTGAGATCCAC
    AGGCTGGTTCTCCAGCAGCAGGAGGGCCTGTCTCTCGGCCACTCTATCCTCCGAGGCGGC
    CCCTTGCAGGACCAGAAGTCTCGCGACGCGGACAGGCAGCATGAGGAGCTGGCCAATGTG
    CACCAGCTTCAGCACCAGCTCCAGCAGGGGCAGCGGCGCTGGCTGCGCAGGTGTGAGCAG
    CAGCAGCGGGCGCAGGCGACCAGGGAGAGCTGGCTGCAGGAGCGGGAGCGGGAGTGCCAG
    TCGCAGGAGGAGCTGCTGCTGCGGAGCCGGGGCGAGCTGGACCTCCAGCTCCAGGAGTAC
    CAGCACAGCCTGGAGCGGCTGAGGGAGGGCCAGCGCCTGGTGGAGAGGGAGCAGGCGAGG
    ATGCGGGCCCAGCAGAGCCTGCTGGGCCACTGGAAGCACGGCCGGCAGAGGAGCCTGTCC
    GCGGTGCTCCTTCCGGGTGGCCCCGAGGTAATGGAACTTAATCGATCTGAGAGTTTATGT
    CATGAAAACTCATTCTTCATCAATGAAGCTTTAGTACAAATGTCATTTAACACTTTCAAC
    AAACTGAATCCATCAGTTATCCATCAGGATGCCACTTACCCTACAACTCAATCTCATTCT
    GACTTGGTGAGGACTAGTGAACATCAAGTAGACCTCAAGGTGGACCCTTCTCAGCCTTCG
    AATGTCAGTCACAAACTGTGGACAGCCGCTGGTTCCGGCCATCAGATACTTCCTTTCCAT
    GAAAGCAGCAAGGATTCTTGTAAAAATGGCTCCAGTATGACAAAGTGCAGTTGTACGTTG
    ACATCTCCCCCGGGACTGTGGACTGGAACCACATCTACTTTGAAGGATTTGGACACCTCC
    CACACTGAGTCCCCAACCCCCCATGACTCAAATTCACACCGCCCTCAACTGCAGGCGTTT
    ATAACAGAAGCAAAGCTAAATCTACCGACAAGGACAATGACCAGACAAGATGGGGAAACT
    GGAGATGGAGCCAAAGAAAATATTGTTTACCTCTAATTGTGTTGTCATTTTTCCAAACAA
    AACAAAACACTGGCACTTTTGGGAGAAACTTTTTGTCTCCATTCCTTATGTATGTGTGAT
    TGTCTGTGTCCAAATTGCTTTAAGAATAATATTTAATATTTCCTGGAAGCTCATTTTTTT
    GGCATGAGTCTAATTAAATTATTGAAAGCCACCCTGTTTGTATAATCTTTAACTTATCAA
    ATCTAATTTCAGATTTCTGGAGGAGAAACTAACTTGAATAAGCAGGACTATTTTAAAAGT
    TGTTTTGACGCTAGAGTAAAATTCCATGTCACATTTTCTACCCAATCATCTGGATTTCAA
    GATTCCTTTTAAGATCTCAATGAAGCAATTTGGATTTAAAGAGTGGTATTCACAAGGGGT
    GAACTTTCACAGTCAGGGCAGTTGCCTCAGTGCCCACATAGGCAGAGGAGGATGTGGGAA
    AGGGCTTTTCTCAGCTAGTTTTTGTGTGCTCATTTCTTCTGGGAGCATTAAAAGTGGTGA
    TCTGTTACAGTCACTATTCAACTGGGCACGTGTTGTGATTGGTCAGTCACTGAGCCAGGG
    ATACAGTCCGGACTTGCTTAGTACCTAAGCCTAATGCTGGTGGGGTTTCAAGACATGGTT
    CAGCATCATCTTTTAACAAGGCCCAGAGGCCCAGAGCCCGCATCAAGTCATTTTGATGTA
    AATAGTGAACTTTGTTAGAGCCCTCACTTCTATCAATCAGCTGTCCTGTCCCTGCCAGCA
    CCTGGAGCACCAACTACCACTCCCTGGAAAGAACCCTTCCCTGCAGTTTTTTAAGGACAA
    AACTGCCCACTCCTCATTAAGTTTGCTGCCTGGATACACTTTTCCACAAAGGAAAACTGG
    CATATCCTGCCTTCCGAGTAGTATGGGTCTCTGTGTGAGAAACCAGGAGATATTTTCATC
    TTGTTCGGAAATACTTGTATGTATTTTGGTGTCAATAAATATCTTGTACCTCATTAAAAA
    AAAAAAAAAAAAA
    >NM_006378
    CTGAGCCGCATCTGCAATAGCACACTTGCCCGGCCACCTGCTGCCGTGAGCCTTTGCTGC
    TGAAGCCCCTGGGGTCGCCTCTACCTGATGAGGATGTGCACCCCCATTAGGGGGCTGCTC
    ATGGCCCTTGCAGTGATGTTTGGGACAGCGATGGCATTTGCACCCATACCCCGGATCACC
    TGGGAGCACAGAGAGGTGCACCTGGTGCAGTTTCATGAGCCAGACATCTACAACTACTCA
    GCCTTGCTGCTGAGCGAGGACAAGGACACCTTGTACATAGGTGCCCGGGAGGCGGTCTTC
    GCTGTGAACGCACTCAACATCTCCGAGAAGCAGCATGAGGTGTATTGGAAGGTCTCAGAA
    GACAAAAAAGCAAAATGTGCAGAAAAGGGGAAATCAAAACAGACAGAGTGCCTCAACTAC
    ATCCGGGTGCTGCAGCCACTCAGCGCCACTTCCCTTTACGTGTGTGGGACCAACGCATTC
    CAGCCGGCCTGTGACCACCTGAACTTAACATCCTTTAAGTTTCTGGGGAAAAATGAAGAT
    GGCAAAGGAAGATGTCCCTTTGACCCAGCACACAGCTACACATCCGTCATGGTTGATGGA
    GAACTTTATTCGGGGACGTCGTATAATTTTTTGGGAAGTGAACCCATCATCTCCCGAAAT
    TCTTCCCACAGTCCTCTGAGGACAGAATATGCAATCCCTTGGCTGAACGAGCCTAGTTTC
    GTGTTTGCTGACGTGATCCGAAAAAGCCCAGACAGCCCCGACGGCGAGGATGACAGGGTC
    TACTTCTTCTTCACGGAGGTGTCTGTGGAGTATGAGTTTGTGTTCAGGGTGCTGATCCCA
    CGGATAGCAAGAGTGTGCAAGGGGGACCAGGGCGGCCTGAGGACCTTGCAGAAGAAATGG
    ACCTCCTTCCTGAAAGCCCGACTCATCTGCTCCCGGCCAGACAGCGGCTTGGTCTTCAAT
    GTGCTGCGGGATGTCTTCGTGCTCAGGTCCCCGGGCCTGAAGGTGCCTGTGTTCTATGCA
    CTCTTCACCCCACAGCTGAACAACGTGGGGCTGTCGGCAGTGTGCGCCTACAACCTGTCC
    ACAGCCGAGGAGGTCTTCTCCCACGGGAAGTACATGCAGAGCACCACAGTGGAGCAGTCC
    CACACCAAGTGGGTGCGCTATAATGGCCCGGTACCCAAGCCGCGGCCTGGAGCGTGCATC
    GACAGCGAGGCACGGGCCGCCAACTACACCAGCTCCTTGAATTTGCCAGACAAGACGCTG
    CAGTTCGTTAAAGACCACCCTTTGATGGATGACTCGGTAACCCCAATAGACAACAGGCCC
    AGGTTAATCAAGAAAGATGTGAACTACACCCAGATCGTGGTGGACCGGACCCAGGCCCTG
    GATGGGACTGTCTATGATGTCATGTTTGTCAGCACAGACCGGGGAGCTCTGCACAAAGCC
    ATCAGCCTCGAGCACGCTGTTCACATCATCGAGGAGACCCAGCTCTTCCAGGACTTTGAG
    CCAGTCCAGACCCTGCTGCTGTCTTCAAAGAAGGGCAACAGGTTTGTCTATGCTGGCTCT
    AACTCGGGCGTGGTCCAGGCCCCGCTGGCCTTCTGTGGGAAGCACGGCACCTGCGAGGAC
    TGTGTGCTGGCGCGGGACCCCTACTGCGCCTGGAGCCCGCCCACAGCGACCTGCGTGGCT
    CTGCACCAGACCGAGAGCCCCAGCAGGGGTTTGATTCAGGAGATGAGCGGCGATGCTTCT
    GTGTGCCCGGATAAAAGTAAAGGAAGTTACCGGCAGCATTTTTTCAAGCACGGTGGCACA
    GCGGAACTGAAATGCTCCCAAAAATCCAACCTGGCCCGGGTCTTTTGGAAGTTCCAGAAT
    GGCGTGTTGAAGGCCGAGAGCCCCAAGTACGGTCTTATGGGCAGAAAAAACTTGCTCATC
    TTCAACTTGTCAGAAGGAGACAGTGGGGTGTACCAGTGCCTGTCAGAGGAGAGGGTTAAG
    AACAAAACGGTCTTCCAAGTGGTCGCCAAGCACGTCCTGGAAGTGAAGGTGGTTCCAAAG
    CCCGTAGTGGCCCCCACCTTGTCAGTTGTTCAGACAGAAGGTAGTAGGATTGCCACCAAA
    GTGTTGGTGGCATCCACCCAAGGGTCTTCTCCCCCAACCCCAGCCGTGCAGGCCACCTCC
    TCCGGGGCCATCACCCTTCCTCCCAAGCCTGCGCCCACCGGCACATCCTGCGAACCAAAG
    ATCGTCATCAACACGGTCCCCCAGCTCCACTCGGAGAAAACCATGTATCTTAAGTCCAGC
    GACAACCGCCTCCTCATGTCCCTCTTCCTCTTCTTCTTTGTTCTCTTCCTCTGCCTCTTT
    TTCTACAACTGCTATAAGGGATACCTGCCCAGACAGTGCTTGAAATTCCGCTCGGCCCTA
    CTAATTGGGAAGAAGAAGCCCAAGTCAGATTTCTGTGACCGTGAGCAGAGCCTGAAGGAG
    ACGTTAGTAGAGCCAGGGAGCTTCTCCCAGCAGAATGGGGAGCACCCCAAGCCAGCCCTG
    GACACCGGCTATGAGACCGAGCAAGACACCATCACCAGCAAAGTCCCCACGGATAGGGAG
    GACTCACAGAGGATCGACGACCTTTCTGCCAGGGACAAGCCCTTTGACGTCAAGTGTGAG
    CTGAAGTTCGCTGACTCAGACGCAGATGGAGACTGAGGCCGGCTGTGCATCCCCGCTGGT
    GCCTCGGCTGCGACGTGTCCAGGCGTGGAGAGTTTTGTGTTTCTCCTGTTCAGTATCCGA
    GTCTCGTGCAGTGCTGCGTAGGTTAGCCCGCATCGTGCAGACAACCTCAGTCCTCTTGTC
    TATTTTCTCTTGGGTTGAGCCTGTGACTTGGTTTCTCTTTGTCCTTTTGGAAAAATGACA
    AGCATTGCATCCCAGTCTTGTGTTCCGAAGTCAGTCGGAGTACTTGAAGAAGGCCCACGG
    GCGGCACGGAGTTCCTGAGCCCTTTCTGTAGTGGGGGAAAGGTGGCTGGACCTCTGTTGG
    CTGAGAAGAGCATCCCTTCAGCTTCCCCTCCCCGTAGCAGCCACTAAAAGATTATTTAAT
    TCCAGATTGGAAATGACATTTTAGTTTATCAGATTGGTAACTTATCGCCTGTTGTCCAGA
    TTGGCACGAACCTTTTCTTCCACTTAATTATTTTTTTAGGATTTTGCTTTGATTGTGTTT
    ATGTCATGGGTCATTTTTTTTTAGTTACAGAAGCAGTTGTGTTAATATTTAGAAGAAGAT
    GTATATCTTCCAGATTTTGTTATATATTTGGCATAAAATACGGCTTACGTTGCTTAAGAT
    TCTCAGGGATAAACTTCCTTTTGCTAAATGCATTCTTTCTGCTTTTAGAAATGTAGACAT
    AAACACTCCCCGGAGCCCACTCACCTTTTTTCTTTTTCTTTTTTTTTTTTTAACTTTATT
    CCTTGAGGGAAGCATTGTTTTTGGAGAGATTTTCTTTCTGTACTTCGTTTTACTTTTCTT
    TTTTTTTAACTTTTACTCTCTCGAAGAAGAGGACCTTCCCACATCCACGAGGTGGGTTTT
    GAGCAAGGGAAGGTAGCCTGGATGAGCTGAGTGGAGCCAGGCTGGCCCAGAGCTGAGATG
    GGAGTGCGGTACAATCTGGAGCCCACAGCTGTCGGTCAGAACCTCCTGTGAGACAGATGG
    AACCTTCACAAGGGCGCCTTTGGTTCTCTGAACATCTCCTTTCTCTTCTTGCTTCAATTG
    CTTACCCACTGCCTGCCCAGACTTTCTATCCAGCCTCACTGAGCTGCCCACTACTGGAAG
    GGAACTGGGCCTCGGTGGCCGGGGCCGCGAGCTGTGACCACAGCACCCTCAAGCATACGG
    CGCTGTTCCTGCCACTGTCCTGAAGATGTGAATGGGTGGTACGATTTCAACACTGGTTAA
    TTTCACACTCCATCTCCCCGCTTTGTAAATACCCATCGGGAAGAGACTTTTTTTCCATGG
    TGAAGAGCAATAAACTCTGGATGTTTGTGCGCGTGTGTGGACAGTCTTATCTTCCAGCAT
    GATAGGATTTGACCATTTTGGTGTAAACATTTGTGTTTTATAAGATTTACCTTGTTTTTA
    TTTTTCTACTTTGAATTGTATACATTTGGAAAGTACCCAAATAAATGAGAAGCTTCTATC
    CTTAAAAAAAAAAAAAA
    >AA993639
    CCCNTCCCCAGAGGCAGGAAAANCAGTNTGCCGAAAGGATAGACTGNGGTGCNGTCTTTC
    CCCAAGTTNTGAACTAGTTTTAAGGTAGCTTAGGATGAAAAATGGAGAATGATTGGGGGT
    TCCAAACCACTTTCTTCTCCCTTGGCTTATATCTCTTCACCATTTGGTGGTCAACTGTGG
    GCCTACCCTGGACCTCATCTACTCAGCGAGAATTGGACATGAAGCTAGAGGCAGCTGCCT
    TGGAAGGGAAGTCAGGCTCACTTGGACAGCCCAGGCCATGGCAGGAAGAATCCCTTCCTC
    TTGGGGTCCTTGATGGGCATGTGTGATGGGGAAGGAGCAGTCTCCCAGCCCTGGGTCTGC
    TCCCCACATCTCTCCTAATTCCACTTCACCTTTTGCCACCCCCTCCCCACCAGAGGCCTA
    GCCCTTTTGTCACCGAAGGCCCCCAGAGTGTTTCTGTGTGAAACCCTCTCATTTACACTG
    TGGCATCAAAATCCACAAAAGATGGATTAATTGCACTCTGGTTAATAGCAGCAGCACAAT
    GATTAAAATCTATATTCCTATCTTCTCTAGCACCCTGGTGTGGGGATGGGGCGGAAGGGT
    GTCTTGAGGGGCAGGGAGGACCCCATAAAACAATCCCTCCTGCATTCTCAGGCTAAATAG
    GGCCCCCAGTGACTACCTGTTCTTGGCTGTCCCCTCTGAAGAGCTCTGCCTTCTCACAGC
    CACCACCAGTTGCCCCACTCCCAGGAAAACAGCACATGTTCTTCTTCTCCTGCCTTGAGA
    CTGCGTGTTAGTCTTCCATTCATAACTCATCAGCAGCTCAGTCCTTCTTATGTCTAGTCT
    CAGTTCATTCAGCCAAAGCTCATTTTTGTCCTATCCAAAGTAGAAAGGGTTCTTTTAGAA
    AACTTGAAGAATGTGCCTCCTCTTAGCATCTGTTTCTGACTCCCAGTTATTTTTAAAATA
    AATGATGAATAAAATGCCTGCCCTGAAGGGTTCTGGAGGAGTCAGGTATCAAAAAAAAAA
    AAA
    >BE552004
    TTTTTAAGATGATCTTGCTCCGTCACCCAGGCTGGAGTGCAGTGGCGTAATCATGGCTTC
    CTGCAGCCTCAAACTCCTGGGCTCAATGAGTTCCTTGAGATCTTCCATCCTCAGCTTCCC
    AAGTAGCTAGTAGTAGTAGTGGCTTGCACCAACGCTCCTGCCCTAATTTTCAATATTTTT
    TTTGTAGAGATAGGATCTCACTGTGTTACCCAAGCTAGACTTGAACTCCTGGCCTCAAGC
    GATCCTTCCGCCTTGGCCTCCCAAAGTGTTGGGATTACAGGCATTAGCTACCACACCTGG
    CCAAGGCCCAGGTTTCGACAGAAAGGGAGAGAAAACCTGCCAGAGATGCCATTTCGGAGC
    CACTCTGCTTGGCAGGGACCTGTGTTCCCCTCATGCAGGTTCATCCTTAGAGGGCTGCGG
    TCTTATCTGGTTGTGCAAAAGTCCCACAACCTTTCTGGATTGATAGTTTGTGGTGAAATA
    AACAATTTTAGTTTGTTTGGAGAATCTTTTGTATACAAAATACAAATAAAACCTAAATCA
    AAGAAACAGA
    >BC010437
    GAGGGGCCGGAGGCGTCCCCGCTCCCGCTCGCTACTAGCCCGCGGGCCAGCGCCGCGTCC
    CGAGCCCCGGCGGGAGCCATGGCTCTAAAAGGACAAGAAGATTATATTTATCTTTTCAAG
    GATTCAACACATCCAGTGGATTTTCTGGATGCATTCAGAACATTTTACTTGGATGGATTA
    TTTACTGATATTACTCTTCAGTGTCCTTCAGGCATAATTTTCCATTGTCACCGAGCCGTT
    TTAGCTGCTTGCAGCAATTATTTTAAGGCAATGTTCACAGCTGACATGAAAGAAAAATTT
    AAAAATAAAATAAAACTCTCTGGCATCCACCATGATATTCTGGAAGGCCTTGTAAATTAT
    GCATACACTTCCCAAATTGAAATAACTAAAAGAAATGTTCAAAGCCTGCTTGAGGCAGCG
    GATCTGCTACAGTTCCTTTCAGTAAAGAAGGCTTGTGAGCGGTTTTTGGTAAGGCACTTG
    GATATTGATAATTGTATTGGAATGCACTCCTTTGCAGAATTTCATGTGTGTCCAGAACTA
    GAGAAGGAATCTCGAAGAATTCTATGTTCAAAGTTTAAGGAAGTGTGGCAACAAGAAGAA
    TTTCTGGAAATCAGCCTTGAAAAGTTTCTCTTTATCTTGTCCAGAAAGAATCTCAGTGTT
    TGGAAAGAAGAAGCTATCATAGAGCCAGTTATTAAGTGGACTGCTCATGATGTAGAAAAT
    CGAATTGAATGCCTCTATAATCTACTGAGCTATATCAACATTGATATAGATCCAGTGTAC
    TTAAAAACAGCCTTAGGCCTTCAAAGAAGCTGCCTGCTCACCGAAAATAAGATCCGCTCC
    CTAATATACAATGCCTTGAATCCCATGCATAAAGAGATTTCCCAGAGGTCCACAGCCACA
    ATGTATATAATTGGAGGCTATTACTGGCATCCTTTATCAGAGGTTCACATATGGGATCCT
    TTGACAAATGTTTGGATTCAGGGAGCAGAAATACCAGATTATACCAGGGAGAGCTATGGT
    GTTACATGTTTAGGACCCAACATTTATGTAACTGGGGGCTACAGGACGGATAACATAGAA
    GCTCTTGACACAGTGTGGATCTATAACAGTGAAAGTGATGAATGGACAGAAGGTTTGCCA
    ATGCTCAATGCCAGGTATTACCACTGTGCAGTCACCTTGGGTGGCTGTGTCTATGCTTTA
    GGTGGTTACAGAAAAGGGGCTCCAGCAGAAGAGGCTGAGTTCTATGATCCTTTAAAAGAG
    AAATGGATTCCTATTGCAAACATGATTAAAGGTGTGGGAAATGCTACTGCCTGTGTCTTA
    CATGATGTTATCTACGTCATTGGTGGCCACTGTGGCTACAGAGGAAGCTGCACCTATGAC
    AAAGTTCAGAGCTACAATTCCGATATCAACGAATGGAGCCTCATCACCTCCAGTCCACAT
    CCAGAATATGGATTGTGCTCAGTTCCGTTTGAAAATAAGCTCTATCTAGTCGGTGGACAA
    ACTACAATCACAGAATGCTATGACCCTGAACAAAATGAATGGAGAGAGATAGCTCCCATG
    ATGGAAAGGAGGATGGAGTGCGGTGCCGTGATCATGAATGGATGTATTTATGTCACTGGA
    GGATACTCCTACTCAAAGGGAACGTATCTTCAGAGCATTGAGAAATATGATCCAGATCTT
    AATAAGTGGGAAATAGTGGGTAATCTTCCCAGTGCCATGCGGTCTCATGGGTGTGTTTGT
    GTGTATAATGTCTAATTGAATCTGCAGAAATGACCAAGCAATCACTTTTTTGGAGTATAG
    TTTTATAAAAAAAGAATGCAGGGTTTGAAGTTCCTTACCTGATAATTGTGTCTGGCACAT
    GATAGGGGATCAGTAAATTGTAATTCCTAACCCTACTGTACTCCCAAACATGGTGATTCA
    TGGTCAAGAAAAATCTTATATATATATATACACACACATATATATGTGTTCATATATATG
    TATACATATATGTGTATATATACGCATGTATGTATACATATATGTGTATATATACGCATG
    TATGTATGCATATGTGTGTATATATACGTATGTATGTATACATATGTGTATATATACGTA
    TGTATGTATACATATATGTGTATATATGCGTATGTATGTATACATATATGTGTATATATA
    CGTATGTATGTATACATATATGTGTATATATACGTATGTATGTATACATATATGTGTATA
    TATACGTATGTATGTATACATATATGTGTGTATATACGTGTGTATGTATACATATATGTG
    TATATATACGTGTGTATGTATACATATATGTGTATATATGCGTGTGTATGTATACATATA
    TGTGTATATATACGTGTGTATGTATACATATATGTGTATATATACGTGTGTATGTATACA
    TATATGTGTATATATGCGTGTGTATATATATACACATATATACGTATATATGTATATATA
    TATACACAGTTGAATCAGTGGGATTAATACCTATAATCTCTGGTTTTCAAAGGTAATATG
    GAATATTTGACACTTGGTAAAAGGTGAACTACCTTTGTAGTGAATCTTTTCCTCTTGGTA
    GCATCAACACTGGGGATAAATCAGAACCATTCTGTGGAATGAAATGTTTCTCAAGAGCCT
    ATAATATAGTAGATAGTGCATATTAAGATGTCTGGCTGGGCATGGTGGCTCATGCCTGTA
    ATCCCAGCACTTTGGGAGGCTGAGGCGGGAGGATCACTTGAGCCTAGAAGTTGGAGACTA
    ACCTGGCGAGACCCTGTCTCAAAAAAAAAAAAAAAAAA
    >R15881
    ACCCTTTTGTGACCAGCTGCATACCCCAAAACCTTTTGGAATCTGGGCTAACTGGCTGTG
    CCTACATCAACAGCACCCGTGAACCCCCGTGTGCTATGCTCTGTGCAACAAAACATTCAG
    AACCCACTTTCAAGATGCTGCTGCTGTGCCAGTGTGACAAAAAAAAGAGGCGCAAGCAGC
    AGTACCAGCAGAGACAGTCGGTCATTTTTCACAAGCGCGCACCCGAGCAGGCCTTGTAGA
    ATGAGGTTGTATCAATAGCAGTGACAAAACGCACACATCAACCCACAGACCTTAGGAGGA
    GGAAGGCGAGGGCGGGGTGACTTCTGGTGATGATAAAAATGGTTTTATCACCCAGATGTG
    AAAGAAGCTGCCTGTTTACTGATCCATTGAATAAACCCATTTTAATAGAAAAAGTCAATA
    CCAATTCAGCAAAAAAAAA
    >AF191770
    TATCTATGTAACAAATCGCAGCACAGGAGTCCCCTGGGCTCCCTCAGGCTCTGGTATGAC
    ATATTTGAGCCATATAAATTCAGCTTCTCCTCTGGCATCTGTTAGCCGACTCACTTGCAA
    CTCCACCTCAGCAGTGGTCTCTCAGTCCTCTCAAAGCAAGGAAAGAGTACTGTGTGCTGA
    GAGACCATGGCAAAGAATCCTCCAGAGAATTGTGAAGACTGTCACATTCTAAATGCAGAA
    GCTTTTAAATCCAAGAAAATATGTAAATCACTTAAGATTTGTGGACTGGTGTTTGGTATC
    CTGACCCTAACTCTAATTGTCCTGTTTTGGGGGAGCAAGCACTTCTGGCCGGAGGTACCC
    AAAAAAGCCTATGACATGGAGCACACTTTCTACAGCAGTGGAGAGAAGAAGAAGATTTAC
    ATGGAAATTGATCCTGTGACCAGAACTGAAATATTCAGAAGCGGAAATGGCACTGATGAA
    ACATTGGAAGTACACGACTTTAAAAACGGATACACTGGCATCTACTTCGTGGGTCTTCAA
    AAATGTTTTATCAAAACTCAGATTAAAGTGATTCCTGAATTTTCTGAACCAGAAGAGGAA
    ATAGATGAGAATGAAGAAATTACCACAACTTTCTTTGAACAGTCAGTGATTTGGGTCCCA
    GCAGAAAAGCCTATTGAAAACCGAGATTTTCTTAAAAATTCCAAAATTCTGGAGATTTGT
    GATAACGTGACCATGTATTGGATCAATCCCACTCTAATATCAGTTTCTGAGTTACAAGAC
    TTTGAGGAGGAGGGAGAAGATCTTCACTTTCCTGCCAACGAAAAAAAAGGGATTGAACAA
    AATGAACAGTGGGTGGTCCCTCAAGTGAAAGTAGAGAAGACCCGTCACGCCAGACAAGCA
    AGTGAGGAAGAACTTCCAATAAATGACTATACTGAAAATGGAATAGAATTTGATCCCATG
    CTGGATGAGAGAGGTTATTGTTGTATTTACTGCCGTCGAGGCAACCGCTATTGCCGCCGC
    GTCTGTGAACCTTTACTAGGCTACTACCCATATCCATACTGCTACCAAGGAGGACGAGTC
    ATCTGTCGTGTCATCATGCCTTGTAACTGGTGGGTGGCCCGCATGCTGGGGAGGGTCTAA
    TAGGAGGTTTGAGCTCAAATGCTTAAACTGCTGGCAACATATAATAAATGCATGCTATTC
    AATGAATTTCTGCCTATGAGGCATCTGGCCCCTGGTAGCCAGCTCTCCAGAATTACTTGT
    AGGTAATTCCTCTCTTCATGTTCTAATAAACTTCTACATTATCAAAAAA
    >BC005364
    GCGGATCGCTGCTCCCTCTCGCCATGGCGCAGGTGCTGATCGTGGGCGCCGGGATGACAG
    GAAGCTTGTGCGCTGCGCTGCTGAGGAGGCAGACGTCCGGTCCCTTGTACCTTGCTGTGT
    GGGACAAGGCTGACGACTCAGGGGGAAGAATGACTACAGCCTGCAGTCCTCATAATCCTC
    AGTGCACAGCTGACTTGGGTGCTCAGTACATCACCTGCACTCCTCATTATGCCAAAAAAC
    ACCAACGTTTTTATGATGAACTGTTAGCCTATGGCGTTTTGAGGCCTCTAAGCTCGCCTA
    TTGAAGGAATGGTGATGAAAGAAGGAGACTGTAACTTTGTGGCACCTCAAGGAATTTCTT
    CAATTATTAAGCATTACTTGAAAGAATCAGGTGCAGAAGTCTACTTCAGACATCGTGTGA
    CACAGATCAACCTAAGAGATGACAAATGGGAAGTATCCAAACAAACAGGCTCCCCTGAGC
    AGTTTGATCTTATTGTTCTCACAATGCCAGTTCCTGAGATTCTGCAGCTTCAAGGTGACA
    TCACCACCTTAATTAGTGAATGCCAAAGGCAGCAACTGGAGGCTGTGAGCTACTCCTCTC
    GATATGCTCTGGGCCTCTTTTATGAAGCTGGTACGAAGATTGATGTCCCTTGGGCTGGGC
    AGTACATCACCAGTAATCCCTGCATACGCTTCGTCTCCATTGATAATAAGAAGCGCAATA
    TAGAGTCATCAGAAATTGGGCCTTCCCTCGTGATTCACACCACTGTCCCATTTGGAGTTA
    CATACTTGGAACACAGCATTGAGGATGTGCAAGAGTTAGTCTTCCAGCAGCTGGAAAACA
    TTTTGCCGGGTTTGCCTCAGCCAATTGCTACCAAATGCCAAAAATGGAGACATTCACAGG
    TTACAAATGCTGCTGCCAACTGTCCTGGCCAAATGACTCTGCATCACAAACCTTTCCTTG
    CATGTGGAGGGGATGGATTTACTCAGTCCAACTTTGATGGCTGCATCACTTCTGCCCTAT
    GTGTTCTGGAAGCTTTAAAGAATTATATTTAGTGCCTATATCCTTATTCTCTATATGTGT
    ATTGGGTTTTTATTTTCACAATTTTCTGTTATTGATTATTTTGTTTTCTATTTTGCTAAG
    AAAAATTACTGGAAAATTGTTCTTCACTTATTATCATTTTTCATGTGGAGTATAAAATCA
    ATTTTGTAATTTTGATAGTTACAACCCATGCTAGAATGGAAATTCCTCACACCTTGCACC
    TTCCCTACTTTTCTGAATTGCTATGACTACTCCTTGTTGGAGGAAAAGTGGTACTTAAAA
    AATAACAAACGACTCTCTCAAAAAAATTACATTAAATCACAATAACAGTTTGTATGCCAA
    AAACTTGATTATCCTTATGAAAATTTCAATTCTGAATAAAGAATAATCACATTATCAAAG
    CCCCATCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    >NM_001337
    ACTCGTCTCTGGTAAAGTCTGAGCAGGACAGGGTGGCTGACTGGCAGATCCAGAGGTTCC
    CTTGGCAGTCCACGCCAGGCCTTCACCATGGATCAGTTCCCTGAATCAGTGACAGAAAAC
    TTTGAGTACGATGATTTGGCTGAGGCCTGTTATATTGGGGACATCGTGGTCTTTGGGACT
    GTGTTCCTGTCCATATTCTACTCCGTCATCTTTGCCATTGGCCTGGTGGGAAATTTGTTG
    GTAGTGTTTGCCCTCACCAACAGCAAGAAGCCCAAGAGTGTCACCGACATTTACCTCCTG
    AACCTGGCCTTGTCTGATCTGCTGTTTGTAGCCACTTTGCCCTTCTGGACTCACTATTTG
    ATAAATGAAAAGGGCCTCCACAATGCCATGTGCAAATTCACTACCGCCTTCTTCTTCATC
    GGCTTTTTTGGAAGCATATTCTTCATCACCGTCATCAGCATTGATAGGTACCTGGCCATC
    GTCCTGGCCGCCAACTCCATGAACAACCGGACCGTGCAGCATGGCGTCACCATCAGCCTA
    GGCGTCTGGGCAGCAGCCATTTTGGTGGCAGCACCCCAGTTCATGTTCACAAAGCAGAAA
    GAAAATGAATGCCTTGGTGACTACCCCGAGGTCCTCCAGGAAATCTGGCCCGTGCTCCGC
    AATGTGGAAACAAATTTTCTTGGCTTCCTACTCCCCCTGCTCATTATGAGTTATTGCTAC
    TTCAGAATCATCCAGACGCTGTTTTCCTGCAAGAACCACAAGAAAGCCAAAGCCATTAAA
    CTGATCCTTCTGGTGGTCATCGTGTTTTTCCTCTTCTGGACACCCTACAACGTTATGATT
    TTCCTGGAGACGCTTAAGCTCTATGACTTCTTTCCCAGTTGTGACATGAGGAAGGATCTG
    AGGCTGGCCCTCAGTGTGACTGAGACGGTTGCATTTAGCCATTGTTGCCTGAATCCTCTC
    ATCTATGCATTTGCTGGGGAGAAGTTCAGAAGATACCTTTACCACCTGTATGGGAAATGC
    CTGGCTGTCCTGTGTGGGCGCTCAGTCCACGTTGATTTCTCCTCATCTGAATCACAAAGG
    AGCAGGCATGGAAGTGTTCTGAGCAGCAATTTTACTTACCACACGAGTGATGGAGATGCA
    TTGCTCCTTCTCTGAAGGGAATCCCAAAGCCTTGTGTCTACAGAGAACCTGGAGTTCCTG
    AACCTGATGCTGACTAGTGAGGAAAGATTTTTGTTGTTATTTCTTACAGGCACAAAATGA
    TGGACCCAATGCACACAAAACAACCCTAGAGTGTTGTTGAGAATTGTGCTCAAAATTTGA
    AGAATGAACAAATTGAACTCTTTGAATGACAAAGAGTAGACATTTCTCTTACTGCAAATG
    TCATCAGAACTTTTTGGTTTGCAGATGACAAAAATTCAACTCAGACTAGTTTAGTTAAAT
    GAGGGTGGTGAATATTGTTCATATTGTGGCACAAGCAAAAGGGTGTCTGAGCCCTCAAAG
    TGAGGGGAAACCAGGGCCTGAGCCAAGCTAGAATTCCCTCTCTCTGACTCTCAAATCTTT
    TAGTCATTATAGATCCCCCAGACTTTACATGACACAGCTTTATCACCAGAGAGGGACTGA
    CACCCATGTTTCTCTGGCCCCAAGGGAAAATTCCCAGGGAAGTGCTCTGATAGGCCAAGT
    TTGTATCAGGTGCCCATCCCTGGAAGGTGCTGTTATCCATGGGGAAGGGATATATAAGAT
    GGAAGCTTCCAGTCCAATCTCATGGAGAAGCAGAAATACATATTTCCAAGAAGTTGGATG
    GGTGGGTACTATTCTGATTACACAAAACAAATGCCACACATCACCCTTACCATGTGCCTG
    ATCCAGCCTCTCCCCTGATTACACCAGCCTCGTCTTCATTAAGCCCTCTTCCATCATGTC
    CCCAAACCTGCAAGGGCTCCCCACTGCCTACTGCATCGAGTCAAAACTCAAATGCTTGGC
    TTCTCATACGTCCACCATGGGGTCCTACCAATAGATTCCCCATTGCCTCCTCCTTCCCAA
    AGGACTCCACCCATCCTATCAGCCTGTCTCTTCCATATGACCTCATGCATCTCCACCTGC
    TCCCAGGCCAGTAAGGGAAATAGAAAAACCCTGCCCCCAAATAAGAAGGGATGGATTCCA
    ACCCCAACTCCAGTAGCTTGGGACAAATCAAGCTTCAGTTTCCTGGTCTGTAGAAGAGGG
    ATAAGGTACCTTTCACATAGAGATCATCCTTTCCAGCATGAGGAACTAGCCACCAACTCT
    TGCAGGTCTCAACCCTTTTGTCTGCCTCTTAGACTTCTGCTTTCCACACCTGCACTGCTG
    TGCTGTGCCCAAGTTGTGGTGCTGACAAAGCTTGGAAGAGCCTGCAGGTGCCTTGGCCGC
    GTGCATAGCCCAGACACAGAAGAGGCTGGTTCTTACGATGGCACCCAGTGAGCACTCCCA
    AGTCTACAGAGTGATAGCCTTCCGTAACCCAACTCTCCTGGACTGCCTTGAATATCCCCT
    CCCAGTCACCTTGTGCAAGCCCCTGCCCATCTGGGAAAATACCCCATCATTCATGCTACT
    GCCAACCTGGGGAGCCAGGGCTATGGGAGCAGCTTTTTTTTCCCCCCTAGAAACGTTTGG
    AACAATGTAAAACTTTAAAGCTCGAAAACAATTGTAATAATGCTAAAGAAAAAGTCATCC
    AATCTAACCACATCAATATTGTCATTCCTGTATTCACCCGTCCAGACCTTGTTCACACTC
    TCACATGTTTAGAGTTGCAATCGTAATGTACAGATGGTTTTATAATCTGATTTGTTTTCC
    TCTTAACGTTAGACCACAAATAGTGCTCGCTTTCTATGTAGTTTGGTAATTATCATTTTA
    GAAGACTCTACCAGACTGTGTATTCATTGAAGTCAGATGTGGTAACTGTTAAATTGCTGT
    GTATCTGATAGCTCTTTGGCAGTCTATATGTTTGTATAATGAATGAGAGAATAAGTCATG
    TTCCTTCAAGATCATGTACCCCAATTTACTTGCCATTACTCAATTGATAAACATTTAACT
    TGTTTCCAATGTTTAGCAAATACATATTTTATAGAACTTC
    >AI041545
    TGAACATATTCAGGCTGATTGGGGACGTGTCCCACCTGGCGGCCATCGTCATCTTGATGG
    TAGAGATCTGGAAGACGCGCTCCTGCGCCGGTATTTCTGGGAAAAGCCAGCTTCTGTCTG
    CACTGGTCTTCACAACTCGTGACCTGGATCTTTTCACTTCATTTATTTCAGTGTATCACA
    CATCTATCAAGGTTATCTACGTTGCCTGCTCGTATGCCACAGTGTACCTGATCTACCTTA
    AATTTAAGGCAACATCGGATGGAAATCATGATACCTTCCGAGTGGAGTTTCTGGTGGTCC
    CTGTGGGAGGCCTCCTCATTTTTAGTTAATCACGATTTCTCTCCTCTTGAGTACTCAAGG
    GAAAGAAGCTCAGTTTGCCAGCATAAGTGCCAAAGACCATCGCCAGCATCTGTCCTTCAG
    GGTGTTCGGACAGAATTCTTACCACAGCAAAGGCATAAGATGCTTGATACGGAAAATCAA
    GAACTTAACTTTTTTGTTGCAGATAGTCATCAGTGGTTCTGTAAAAACGCAGAGGAAAAG
    AGCCAGAAGGTTTCTGTTTAATGCATCTTGCCTTATCTTTTTTTATTACTGTGCACAAAG
    ATTTTTTTACACAAACATCCTTAATGCTGTTTTAATAAATTCAGTGTGTAGCTTCAAAAA
    AA
    >NM_024423
    GGCAGGTCTCGCTCTCGGCACCCTCCCGGCGCCCGCGTTCTCCTGGCCCTGCCCGGCATC
    CCGATGGCCGCCGCTGGGCCCCGGCGCTCCGTGCGCGGAGCCGTCTGCCTGCATCTGCTG
    CTGACCCTCGTGATCTTCAGTCGTGATGGTGAAGCCTGCAAAAAGGTGATACTTAATGTA
    CCTTCTAAACTAGAGGCAGACAAAATAATTGGCAGAGTTAATTTGGAAGAGTGCTTCAGG
    TCTGCAGACCTCATCCGGTCAAGTGATCCTGATTTCAGAGTTCTAAATGATGGGTCAGTG
    TACACAGCCAGGGCTGTTGCGCTGTCTGATAAGAAAAGATCATTTACCATATGGCTTTCT
    GACAAAAGGAAACAGACACAGAAAGAGGTTACTGTGCTGCTAGAACATCAGAAGAAGGTA
    TCGAAGACAAGACACACTAGAGAAACTGTTCTCAGGCGTGCCAAGAGGAGATGGGCACCT
    ATTCCTTGCTCTATGCAAGAGAATTCCTTGGGCCCTTTCCCATTGTTTCTTCAACAAGTT
    GAATCTGATGCAGCACAGAACTATACTGTCTTCTACTCAATAAGTGGACGTGGAGTTGAT
    AAAGAACCTTTAAATTTGTTTTATATAGAAAGAGACACTGGAAATCTATTTTGCACTCGG
    CCTGTGGATCGTGAAGAATATGATGTTTTTGATTTGATTGCTTATGCGTCAACTGCAGAT
    GGATATTCAGCAGATCTGCCCCTCCCACTACCCATCAGGGTAGAGGATGAAAATGACAAC
    CACCCTGTTTTCACAGAAGCAATTTATAATTTTGAAGTTTTGGAAAGTAGTAGACCTGGT
    ACTACAGTGGGGGTGGTTTGTGCCACAGACAGAGATGAACCGGACACAATGCATACGCGC
    CTGAAATACAGCATTTTGCAGCAGACACCAAGGTCACCTGGGCTCTTTTCTGTGCATCCC
    AGCACAGGCGTAATCACCACAGTCTCTCATTATTTGGACAGAGAGGTTGTAGACAAGTAC
    TCATTGATAATGAAAGTACAAGACATGGATGGCCAGTTTTTTGGATTGATAGGCACATCA
    ACTTGTATCATAACAGTAACAGATTCAAATGATAATGCACCCACTTTCAGACAAAATGCT
    TATGAAGCATTTGTAGAGGAAAATGCATTCAATGTGGAAATCTTACGAATACCTATAGAA
    GATAAGGATTTAATTAACACTGCCAATTGGAGAGTCAATTTTACCATTTTAAAGGGAAAT
    GAAAATGGACATTTCAAAATCAGCACAGACAAAGAAACTAATGAAGGTGTTCTTTCTGTT
    GTAAAGCCACTGAATTATGAAGAAAACCGTCAAGTGAACCTGGAAATTGGAGTAAACAAT
    GAAGCGCCATTTGCTAGAGATATTCCCAGAGTGACAGCCTTGAACAGAGCCTTGGTTACA
    GTTCATGTGAGGGATCTGGATGAGGGGCCTGAATGCACTCCTGCAGCCCAATATGTGCGG
    ATTAAAGAAAACTTAGCAGTGGGGTCAAAGATCAACGGCTATAAGGCATATGACCCCGAA
    AATAGAAATGGCAATGGTTTAAGGTACAAAAAATTGCATGATCCTAAAGGTTGGATCACC
    ATTGATGAAATTTCAGGGTCAATCATAACTTCCAAAATCCTGGATAGGGAGGTTGAAACT
    CCCAAAAATGAGTTGTATAATATTACAGTCCTGGCAATAGACAAAGATGATAGATCATGT
    ACTGGAACACTTGCTGTGAACATTGAAGATGTAAATGATAATCCACCAGAAATACTTCAA
    GAATATGTAGTCATTTGCAAACCAAAAATGGGGTATACCGACATTTTAGCTGTTGATCCT
    GATGAACCTGTCCATGGAGCTCCATTTTATTTCAGTTTGCCCAATACTTCTCCAGAAATC
    AGTAGACTGTGGAGCCTCACCAAAGTTAATGATACAGCTGCCCGTCTTTCATATCAGAAA
    AATGCTGGATTTCAAGAATATACCATTCCTATTACTGTAAAAGACAGGGCCGGCCAAGCT
    GCAACAAAATTATTGAGAGTTAATCTGTGTGAATGTACTCATCCAACTCAGTGTCGTGCG
    ACTTCAAGGAGTACAGGAGTAATACTTGGAAAATGGGCAATCCTTGCAATATTACTGGGT
    ATAGCACTGCTCTTTTCTGTATTGCTAACTTTAGTATGTGGAGTTTTTGGTGCAACTAAA
    GGGAAACGTTTTCCTGAAGATTTAGCACAGCAAAACTTAATTATATCAAACACAGAAGCA
    CCTGGAGACGATAGAGTGTGCTCTGCCAATGGATTTATGACCCAAACTACCAACAACTCT
    AGCCAAGGTTTTTGTGGTACTATGGGATCAGGAATGAAAAATGGAGGGCAGGAAACCATT
    GAAATGATGAAAGGAGGAAACCAGACCTTGGAATCCTGCCGGGGGGCTGGGCATCATCAT
    ACCCTGGACTCCTGCAGGGGAGGACACACGGAGGTGGACAACTGCAGATACACTTACTCG
    GAGTGGCACAGTTTTACTCAACCCCGTCTCGGTGAAGAATCCATTAGAGGACACACTGGT
    TAAAAATTAAACATAAAAGAAATTGCATCGATGTAATCAGAATGAAGACCGCATGCCATC
    CCAAGATTATGTCCTCACTTATAACTATGAGGGAAGAGGATCTCCAGCTGGTTCTGTGGG
    CTGCTGCAGTGAAAAGCAGGAAGAAGATGGCCTTGACTTTTTAAATAATTTGGAACCCAA
    ATTTATTACATTAGCAGAAGCATGCACAAAGAGATAATGTCACAGTGCTACAATTAGGTC
    TTTGTCAGACATTCTGGAGGTTTCCAAAAATAATATTGTAAAGTTCAATTTCAACATGTA
    TGTATATGATGATTTTTTTCTCAATTTTGAATTATGCTACTCACCAATTTATATTTTTAA
    AGCCAGTTGTTGCTTATCTTTTCCAAAAAGTGAAAAATGTTAAAACAGACAACTGGTAAA
    TCTCAAACTCCAGCACTGGAATTAAGGTCTCTAAAGCATCTGCTCTTTTTTTTTTTTACG
    GATATTTTAGTAATAAATATGCTGGATAAATATTAGTCCAACAATAGCTAAGTTATGCTA
    ATATCACATTATTATGTATTCACTTTAAGTGATAGTTTAAAAAATAAACAAGAAATATTG
    AGTATCACTATGTGAAGAAAGTTTTGGAAAAGAAACAATGAAGACTGAATTAAATTAAAA
    ATGTTGCAGCTCATAAAGAATTGGGACTCACCCCTACTGCACTACCAAATTCATTTGACT
    TTGGAGGCATAATGTGTTGAAGTGCCCTATGAAGTAGCAATTTTCTATAGGAATATAGTT
    GGAAATAAATGTGTGTGTGTATATTATTATTAATCAATGCAATATTTAAAATGAAATGAG
    AACAAAGAGGAAAATGGTAAAAACTTGAAATGAGGCTGGGGTATAGTTTGTCCTACAATA
    GAAAAAAGAGAGAGCTTCCTAGGCCTGGGCTCTTAAATGCTGCATTATAACTGAGTCTAT
    GAGGAAATAGTTCCTGTCCAATTTGTGTAATTTGTTTAAAATTGTAAATAAATTAAACTT
    TTCTGGTTTCTGTGGGAAGGAAATAGGGAATCCAATGGAACAGTAGCTTTGCTTTGCAGT
    CTGTTTCAAGATTTCTGCATCCACAAGTTAGTAGCAAACTGGGGAATACTCGCTGCAGCT
    GGGGTTCCCTGCTTTTTGGTAGCAAGGGTCCAGAGATGAGGTGTTTTTTTCGGGGAGCTA
    ATAACAAAAACATTTTAAAACTTACCTTTACTGAAGTTAAATCCTCTATTGCTGTTTCTA
    TTCTCTCTTATAGTGACCAACATCTTTTTAATTTAGATCCAAATAACCATGTCCTCCTAG
    AGTTTAGAGGCTAGAGGGAGCTGAGGGGAGGATCTTACTGAAAGCACCCTGGGGAGATTG
    ATTGTCCTTAAACCTAAGCCCCACAAACTTGACACCTGATCAGGTCTGGGAGCTACAAAA
    TTTCATTTTTCTCCTCACTGCCCTTCTTCTGAGTGGCATTGGCCTGAATCAAGGAAAGCC
    AGGCCTTGTGGGCCCCCTTCTTTCGGCTTTCTGCTAAAGCAACACCTCCAGCAGAGATTC
    CCTTAAGTGACTCCAGGTTTTCCACCATCCTTCAGCGTGAATTAATTTTTAATCAGTTTG
    CTTTCTCCAGAGAAATTTTAAAATAATAGAAGAAATAGAAATTTTGAATGTATAAAAGAA
    AAAGATCAAGTTGTCATTTTAGAACAGAGGGAACTTTGGGAGAAAGCAGCCCAAGTAGGT
    TATTTGTACAGTCAGAGGGCAACAGGAAGATGCAGGCCTTCAAGGGCAAGGAGAGGCCAC
    AAGGAATATGGGTGGGAGTAAAAGCAACATCGTCTGCTTCATACTTTTTCCTAGGCTTGG
    CACTGCCTTTTCCTTTCTCAGGCCAATGGCAACTGCCATTTGAGTCCGGTGAGGGATCAG
    CCAACCTCTTCTCTATGGCTCACCTTATTTGGAGTGAGAAATCAAGGAGACAGAGCTGAC
    TGCATGATGAGTCTGAAGGCATTTGCAGGATGAGCCTGAACTGGTTGTGCAGAACAAACA
    AGGCATTCATGGGAATTGTTGTATTCCTTCTGCAGCCCTCCTTCTGGGCACTAAGAAGGT
    CTATGAATTAAATGCCTATCTAAAATTCTGATTTATTCCTACATTTTCTGTTTTCTAATT
    TGACCCTAAAATCTATGTGTTTTAGACTTAGACTTTTTATTGCCCCCCCCCCCTTTTTTT
    TTGAGACGGAGTCTCGCTCTGACGCACAGGCTGGAGTGCAGTGGCTCCGATCTCTGCTCA
    CTGAAAGCTCCGCCTCCCGGGTTCATGCCATTCTCCTGCCTCAGCCTCCTGAGTAGCTGG
    GACTACAGGCGCCCACCACCACGCCCGGCTAATTTTTTGTATTTTTAATAGAGACGGGGT
    TTCACTGTGTTAGCCAGGATGGTCTCGATCTCCTGACCTCGTGATCCGCCTGCCTCGGCC
    TCCCAAAGTGCTGGGATTACAGGCATGACCCACCGCTCCCGGCCTTGTTTTCCGTTTAAA
    GTCGTCTTCTTTTAATGTAATCATTTTGAACATGTGTGAAAGTTGATCATACGAATTGGA
    TCAATCTTGAAATACTCAACCAAAAGACAGTCGAGAAGCCAGGGGGAGAAAGAACTCAGG
    GCACAAAATATTGGTCTGAGAATGGAATTCTCTGTAAGCCTAGTTGCTGAAATTTCCTGC
    TGTAACCAGAAGCCAGTTTTATCTAACGGCTACTGAAACACCCACTGTGTTTTGCTCACT
    CCCACTCACCGATCAAAACCTGCTACCTCCCCAAGACTTTACTAGTGCCGATAAACTTTC
    TCAAAGAGCAACCAGTATCACTTCCCTGTTTATAAAACCTCTAACCATCTCTTTGTTCTT
    TGAACATGCTGAAAACCACCTGGTCTGCATGTATGCCCGAATTTGTAATTCTTTTCTCTC
    AAATGAAAATTTAATTTTAGGGATTCATTTCTATATTTTCACATATGTAGTATTATTATT
    TCCTTATATGTGTAAGGTGAAATTTATGGTATTTGAGTGTGCAAGAAAATATATTTTTAA
    AGCTTTCATTTTTCCCCCAGTGAATGATTTAGAATTTTTTATGTAAATATACAGAATGTT
    TTTTCTTACTTTTATAAGGAAGCAGCTGTCTAAAATGCAGTGGGGTTTGTTTTGCAATGT
    TTTAAACAGAGTTTTAGTATTGCTATTAAAAGAAGTTACTTTGCTTTTAAAGAAACTTGG
    CTGCTTAAAATAAGCAAAAATTGGATGCATAAAGTAATATTTACAGATGTGGGGAGATGT
    AATAAAACAATATTAACTTGGAAAAAAAAAAAAAAAAAAA
    >AA745593
    GACTCAGNCTTCAGCCGCTCTCCTCCCCCTGGGCAAACAGGACTCATCTGATGATGTGAG
    AAGAGTTCAGAGGAGGGAGAAAAATCGTATTGCCGCCCAGAAGAGCCGACAGAGGCAGAC
    ACAGAAGGCCGACACCCTGCACCTGGAGAGCGAAGACCTGGAGAAACAGAACGCGGCTCT
    ACGCAAGGAGATCAAGCAGCTCACAGAGGAACTGAAGTACTTCACGTCGGTGCTGAACAG
    CCACGAGCCCCTGTGCTCGGTGCTGGCCGCCAGCACGCCCTCGCCCCCCGAGGTGGTGTA
    CAGCGCCCACGCATTCCACCAACCTCATGTCAGCTCCCCGCGCTTCCAGCCCTGAGCTTC
    CGATGCGGGGAGAGCAGAGCCTCGGGAGGGGCACACAGACTGTGGCAGAGCTGCGCCCAT
    CCCGCAGAGGCCCCTGTCCACCTGGAGACCCGGAGACAGAGGCCTGGACAAGGAGTGAAC
    ACGGGAACTGTCACGACTGGAAGGGCGTGAGGCCTCCCAGCAGTGCCGCAGCGTTTCGAG
    GGGCGTGTGCTGGACCCCACCACTGTGGGTTGCAGGCCCAATGCAGAAGAGTATTAAGAA
    AGATGCTCAAGTCCCATGGCACAGAGCAAGGCGGGCAGGGAACGGTTATTTTTCTAAATA
    AATGCTTTAAAAGAAAAAAAAAAAAAAAAAAAAAAA
    >AI985118
    ATGCAAGGNNTAGGCAAAGATTGTTGACCCNGGAGATAGAGGTNNCAATGAGCCAGATCA
    TTCCATTGCATTCCAGCTTGGGCGACAGAATGAGACTCTGTCTCAAAATTAAAAANCAAA
    AAACCAAAANCAAATAGATGAAAAAGTAGACTGGAGACAAATAAAAGTGAGTTTCTAAAG
    GAAATTCACAGTAATGCTGCATTAAACACTAAGCTCACTTAGGTCACTTTCTAGTGAGCT
    AACCGTAACAGAGAGCCTACAGGATACACGTGAGATAATGTCACGTGTAGAAGATCGTTG
    TGAATTAAAGTTCAAAATTAAGACTTCTTAGATTATGATGTAGATTTTAGAGCTCCTTAA
    AACATAAAGCGAATCTTATAAATGTTCAATTCTAAAGTTATTCCACTTGGAAAAATTAGC
    TTTTGGGACAATTTTTAAGAACTTTTGTGTAAAATGCAGCTCCATGTTTAGCATAATCTA
    AAAATAATTTCAAGCAATCCAGAATCTTCCAAGAATGTTATTAAAGCTTTAAAACAAAGC
    AAAACAAAAAGACCCTTTTGTGCCTTATATGGGAAGACTAAAAAAA
    >AB038160
    ACCGGGCACCGGACGGCTCGGGTACTTTCGTTCTTAATTAGGTCATGCCCGTGTGAGCCA
    GGAAAGGGCTGTGTTTATGGGAAGCCAGTAACACTGTGGCCTACTATCTCTTCCGTGGTG
    CCATCTACATTTTTGGGACTCGGGAATTATGAGGTAGAGGTGGAGGCGGAGCCGGATGTC
    AGAGGTCCTGAAATAGTCACCATGGGGGAAAATGATCCGCCTGCTGTTGAAGCCCCCTTC
    TCATTCCGATCGCTTTTTGGCCTTGATGATTTGAAAATAAGTCCTGTTGCACCAGATGCA
    GATGCTGTTGCTGCACAGATCCTGTCACTGCTGCCATTGAAGTTTTTTCCAATCATCGTC
    ATTGGGATCATTGCATTGATATTAGCACTGGCCATTGGTCTGGGCATCCACTTCGACTGC
    TCAGGGAAGTACAGATGTCGCTCATCCTTTAAGTGTATCGAGCTGATAGCTCGATGTGAC
    GGAGTCTCGGATTGCAAAGACGGGGAGGACGAGTACCGCTGTGTCCGGGTGGGTGGTCAG
    AATGCCGTGCTCCAGGTGTTCACAGCTGCTTCGTGGAAGACCATGTGCTCCGATGACTGG
    AAGGGTCACTACGCAAATGTTGCCTGTGCCCAACTGGGTTTCCCAAGCTATGTGAGTTCA
    GATAACCTCAGAGTGAGCTCGCTGGAGGGGCAGTTCCGGGAGGAGTTTGTGTCCATCGAT
    CACCTCTTGCCAGATGACAAGGTGACTGCATTACACCACTCAGTATATGTGAGGGAGGGA
    TGTGCCTCTGGCCACGTGGTTACCTTGCAGTGCACAGCCTGTGGTCATAGAAGGGGCTAC
    AGCTCACGCATCGTGGGTGGAAACATGTCCTTGCTCTCGCAGTGGCCCTGGCAGGCCAGC
    CTTCAGTTCCAGGGCTACCACCTGTGCGGGGGCTCTGTCATCACGCCCCTGTGGATCATC
    ACTGCTGCACACTGTGTTTATGACTTGTACCTCCCCAAGTCATGGACCATCCAGGTGGGT
    CTAGTTTCCCTGTTGGACAATCCAGCCCCATCCCACTTGGTGGAGAAGATTGTCTACCAC
    AGCAAGTACAAGCCAAAGAGGCTGGGCAATGACATCGCCCTTATGAAGCTGGCCGGGCCA
    CTCACGTTCAATGGTACATCTGGGTCTCTATGTGGTTCTGCAGCTCTTCCTTTGTTTCAA
    GAGGATTTGCAATTGCTCATTGAAGCATTCTTATGATGGCTGCTTTATAATCCTTGTCAG
    ATATTAATAATTCCAACTCCTGATTCATGTTGGTGTTGGCATCAGTTGATTATCTTTTCT
    CATTAAAATTGTGATGCTCCTAAAAAAAAAAAAAAAAAA
    >X69699
    TTCAGAAGGAGGAGAGACACCGGGCCCAGGGCACCCTCGCGGGCGGGCGGACCCAAGCAG
    TGAGGGCCTGCAGCCGGCCGGCCAGGGCAGCGGCAGGCGCGGCCCGGACCTACGGGAGGA
    AGCCCCGAGCCCTCGGCGGGCTGCGAGCGACTCCCCGGCGATGCCTCACAACTCCATCAG
    ATCTGGCCATGGAGGGCTGAACCAGCTGGGAGGGGCCTTTGTGAATGGCAGACCTCTGCC
    GGAAGTGGTCCGCCAGCGCATCGTAGACCTGGCCCACCAGGGTGTAAGGCCCTGCGACAT
    CTCTCGCCAGCTCCGCGTCAGCCATGGCTGCGTCAGCAAGATCCTTGGCAGGTACTACGA
    GACTGGCAGCATCCGGCCTGGAGTGATAGGGGGCTCCAAGCCCAAGGTGGCCACCCCCAA
    GGTGGTGGAGAAGATTGGGGACTACAAACGCCAGAACCCTACCATGTTTGCCTGGGAGAT
    CCGAGACCGGCTCCTGGCTGAGGGCGTCTGTGACAATGACACTGTGCCCAGTGTCAGCTC
    CATTAATAGAATCATCCGGACCAAAGTGCAGCAACCATTCAACCTCCCTATGGACAGCTG
    CGTGGCCACCAAGTCCCTGAGTCCCGGACACACGCTGATCCCCAGCTCAGCTGTAACTCC
    CCCGGAGTCACCCCAGTCGGATTCCCTGGGCTCCACCTACTCCATCAATGGGCTCCTGGG
    CATCGCTCAGCCTGGCAGCGACAAGAGGAAAATGGATGACAGTGATCAGGATAGCTGCCG
    ACTAAGCATTGACTCACAGAGCAGCAGCAGCGGACCCCGAAAGCACCTTCGCACGGATGC
    CTTCAGCCAGCACCACCTCGAGCCGCTCGAGTGCCCATTTGAGCGGCAGCACTACCCAGA
    GGCCTATGCCTCCCCCAGCCACACCAAAGGCGAGCAGGGCCTCTACCCGCTGCCCTTGCT
    CAACAGCACCCTGGACGACGGGAAGGCCACCCTGACCCCTTCCAACACGCCACTGGGGCG
    CAACCTCTCGACTCACCAGACCTACCCCGTGGTGGCAGATCCTCACTCACCCTTGGCCAT
    AAAGCAGGAAACCCCCGAGGTGTCCAGTTCTAGCTCCACCCCTTGCTCTTTATCTAGCTC
    CGCCCTTTTGGATCTGCAGCAAGTCGGCTCCGGGGTCCCGCCCTTCAATGCCTTTCCCCA
    TGCTGCCTCCGTGTACGGGCAGTTCACGGGCCAGGCCCTCCTCTCAGGGCGAGAGATGGT
    GGGGCCCACGCTGCCCGGATACCCACCCCACATCCCCACCAGCGGACAGGGCAGCTATGC
    CTCCTCTGCCATCGCAGGCATGGTGGCAGGAAGTGAATACTCTGGCAATGCCTATGGCCA
    CACCCCCTACTCCTCCTACAGCGAGGCCTGGGGCTTCCCCAACTCCAGCTTGCTGAGTTC
    CCCATATTATTACAGTTCCACATCAAGGCCGAGTGCACCGCCCACCACTGCCACGGCCTT
    TGACCATCTGTAGTTGCCATGGGGACAGTGGGAGCGACTGAGCAACAGGAGGACTCAGCC
    TGGGACAGGCCCCAGAGAGTCACACAAAGGAATCTTTATTATTACATGAAAAATAACCAC
    AAGTCCAGCATTGCGGCACACTCCCTGTGTGGTTAATTTAATGAACCATGAAAGACAGGA
    TGACCTTGGACAAGGCCAAACTGTCCTCCAAGACTCCTTAATGAGGGGCAGGAGTCCCAG
    GGAAAGAGAACCATGCCATGCTGAAAAAGACAAAATTGAAGAAGAAATGTAGCCCCAGCC
    GGTACCCTCCAAAGGAGAGAAGAAGCAATAGCCGAGGAACTTGGGGGGATGGCGAATGGT
    TCCTGCCCGGGCCCAAGGGTGCACAGGGCACCTCCATGGCTCCATTATTAACACAACTCT
    AGCAATTATGGACCATAAGCACTTCCCTCCAGCCCACAAGTCACAGCCTGGTGCCGAGGC
    TCTGCTCACCAGCCACCCAGGGAGTCACCTCCCTCAGCCTCCCGCCTGCCCCACACGGAG
    GCTCTGGCTGTCCTCTTTCCTCCACTCCATTTGCTTGGCTCTTTCTACACCTCCCTCTTG
    GATGGGCTGAGGGCTGGAGCGAGTCCCTCAGAAATTCCACCAGGCTGTCAGCTGACCTCT
    TTTTCCTGCTGCTGTGAAGGTATAGCACCACCCAGGTCCTCCTGCAGTGCGGCATCCCCT
    TGGCAGCTGCCGTCAGCCAGGCCAGCCCCAGGGAGCTTAAAACAGACATTCCACAGGGCC
    TGGGCCCCTGGGAGGTGAGGTGTGGTGTGCGGCTTCACCCAGGGCAGAACAAGGCAGAAT
    CGCAGGAAACCCGCTTCCCCTTCCTGACAGCTCCTGCCAAGCCAAATGTGCTTCCTGCAG
    CTCACGCCCACCAGCTACTGAAGGGACCCAAGGCACCCCCTGAAGCCAGCGATAGAGGGT
    CCCTCTCTGCTCCCCAGCAGCTCCTGCCCCCAAGGCCTGACTGTATATACTGTAAATGAA
    ACTTTGTTTGGGTCAAGCTTCCTTCTTTCTAACCCCCAGACTTTGGCCTCTGAGTGAAAT
    GTCTCTCTTTGCCCTGTGGGGCTTCTCTCCTTGATGCTTCTTTCTTTTTTTAAAGACAAC
    CTGCCATTACCACATGACTCAATAAACCATTGCTCTTCAAAAAAAAAAAAAAAAAAAAAA
    AAAAAAAAAAA
    >AK025615
    TGCTTCATAAAATTTACCTAAGCAAGTGGTCTTGCTTGCCTCAAATCCAAGCAGTCTTGA
    ACACTTGGAGGCAATTAATGAGTATATCTTAGTCAAAAGAATTGTTGGAGCTTTTTATTA
    AAGCTGCAGTTTCAGTTCTGCTTTTGGGGAATTGTGCTATGAAAGCAGCTGCCAAAATAA
    GCTCATTTATTTTCTTCAATCCCACTCAGTGCTCAGTCACTATATTCTGTTTCCTTTTTT
    TTTTTCAAGTTGCATATTTGGTTTCCCCTTATGATTGGGAAAGATGAATTTTCAGCAGAA
    AACAGTGTTTGTTCACTTTCAAAGAGTGATAGTTTCTAAAACATTTAGAGCAATAAATAT
    TCATCAGAGGTACCAAGTAAGCCAGCAGAAGAGTTAAGGGTTAGAGAAATCCCTTATTTC
    ATGTCTTGACTCTAAAATGATCAAAGTACTTTTCCTTGTAATGTGGATTTCTTCTTATGC
    GGATATGCAAAAACTTCAGTTATACGTAGTAATGCTAGCAGGTAATTTTAGTGGACATTT
    TATAACAACTGTCACTTTGTTTTGCCACATGTAGAGTTTGTTCAGCTATTTTCCAGATAT
    CTCCCCACAAAAGGAGGCAAAGGGTACCAGCTTTTCAATGAGCATTACCTATTACTTGGC
    AAAGATGATGAAGACTCTATTAATAGTTCATTTGATAAATGTTGACATAACCAACAATAG
    AGATTAGGAAGTTAGTTTTAAGAAATCAATAGCATATAGACATTACCCTCATGGAGTTTG
    TATTCTACTACTTGAACTGATTGTAGCTATAAAAGCATAGTTAGATAGCTGAATAGTTAG
    ATCATAAGCAAAGAAGGCCAGAACACATCTCTTATCAAGAAATCAATGAATAGTTTATCT
    CATTTTTAAAGCAACTTTATCCTTCTTTAATTCCTTCCTTTCTTCTAGTGCAAAACTACT
    TAATAAGGTTGGTGTTTAGGTTAGTGTTCACACCATTCCTCATCTGGTGTGAATTACCTT
    CTCTTTCTTTACTATTTACTACCAACCTAGTACATGTGTTGACTGAATTCTTTTCAAACA
    ATGTTGAGTTATCATGGTGCACCTAATAAATTAACACCACAGATTACAGCATCCTTGCTG
    ATTTTCTCAGCAAAGCCAGATTAGATGGAAATAAACAAAGAAAATGATCCTAGAGTGAAT
    TTTTCTAGAAAATATCTATTATGAACCATGCTGTTTAAAGTATTAGCTTGAAGGTGATGG
    ATCCAGCTATTCAGAAAATAACTTTCATATAACCATGATTTTGCACAGTATGAGGTCTTA
    AATGTGTGGAAAGAGATAAATTTTTTATCATTACCACAAACCCCTTTTAAAGATTCAAAG
    GTGGAAGAAAGTGATTTATTTTTTCTCTTCAGCATACATATATAAAAGACTTGTCAGATG
    TTTAATTTGGGGAGGTTGATAATGAAACATATCAACAGAGTATAGTAGTTATAGTAGTGT
    TTGTGGGTAAATAATTTCCTGGGGTCAGACATATATAAACATATTTGCTTCAAAATGATA
    AAGGCATGAAATCAGTCTTAAAAATTGAAATGGGGGTGATGGGGGAGAAAAAGAAGAACA
    AATTTGAAGTGCCCTTTCAAATCTGCTGGATACAAGTATTGAAGTTTTAAGTCATCTTAT
    TCTGTCTGAAAGTGTATTTTTCATTCTACAATAGACCCAATCAACAAGACGTATAACTTG
    AGTTGCATGATGTTCAGTTTATGTAATCTACTGTTGGGATGGTAAGAATTGATGTAGGCT
    GTGGTGTAAGAATGAATTAAAATATAGTTTCACTGGCTTTTCTCTACATATCCACTATCA
    CAATGGCTAGGTTTCCTGTTGCTCACTGTTGGATTCTGGAGAAAAATTTAATGAAAGATG
    ATATCAGAGGAAGAATAAGTGGAGGTAGAGAAGAAAGGAGTGATAGAGGAGGGGAAAAAA
    ACAAAACATATTTTTGTGTTATCCAAAGGAGCTTTTTCCTTATTCTGTCAAGCATTGAGA
    TCTTCTTCAGCTTTCAATGTAGTTGCTAAATACAAATAATGCTACTAGGTAGTGACTAAA
    TATAGCAAACACTTCATCAGATATTAGAATTAGGTCACACTATTGAGGTTATAATCTGAA
    GGTTGTGTTACATAGAAACCACTTTAGATTATTATCAACTTGGGCTAGGCTTTATTTTAT
    AATAGCATAGTAAGTAATATCTATTGTGTCATTTCTTCAACCATTTTATTCTAAGATCCA
    TGAAGCTTCTTGAGGCCAAATAAAATAATAAGTTTAGACAAGAAGTAGATTGTGACTTTT
    TTTCCCTTAGAGATACTATTTACTATCTCCTATCCTGATAGGTGGAAGGTTTACTGAATT
    GGAAATTGGTTGACTATTAGTTTTTAACTAAAATGTGCAATAACACATTGCAGTTTCCTC
    AAACTAGTTTCCTATGATCATTAAACTCATTCTCAGGGTTAAGAAAGGAATGTAAATTTC
    TGCCTCAATTTGTACTTCATCAATAAGTTTTTGAAGAGTGCAGATTTTTAGTCAGGTCTT
    AAAAATAAACTCACAAATCTGGATGCATTTCTAAATTCTGCAAATGTTTCCTGGGGTGAC
    TTAACAAGGAATAATCCCACAATATACCTAGCTACCTAATACATGGAGCTGGGGCTCAAC
    CCACTGTTTTTAAGGATTTGCGCTTACTTGTGGCTGAGGAAAAATAAGTAGTTCGAGGAA
    GTAGTTTTTAAATGTGAGCTTATAGATAGAAACAGAATATCAACTTAATTATGAAATTGT
    TAGAACCTGTTCTCTTGTATCTGAATCTGATTGCAATTACTATTGTACTGATAGACTCCA
    GCCATTGCAAGTCTCAGATATCTTAGCTGTGTAGTGATTCTTGAAATTCTTTTTAAGAAA
    AATTGAGTAGAAAGAAATAAACCCTTTGTAAATGAGGCTTGGCTTTTGTGAAAGATCATC
    CGCAGGCTATGTTAAAAGGATTTTAGCTCACTAAAAGTGTAATAATGGAAATGTGGAAAA
    TATCGTAGGTAAAGGAAACTACCTCATGCTCTGAAGGTTTTGTAGAAGCACAATTAAACA
    TCTAAAATGGCTTTGTTACACCAGAGCCATCTGGTGTGAAGAACTCTATATTTGTATGTT
    GAGAGGGCATGGAATAATTGTATTTTGCTGGCAATAGACACATTCTTTATTATTTGCAGA
    TTCCTCATCAAATCTGTAATTATGCACAGTTTCTGTTATCAATAAAACAAAAGAATCCTG
    TTAAAAAAAAAAAAAAAAAAAAA
    >AW118445
    TGGCTCTCTCCTTCAAAAGGNCCAGGCCCTGTCCCCCTTTCTCCCCGANTCCAACCCCAG
    CTCCCCTGTGAAGAAAAAAGTTAAAAAATTTGTTATTTATTTGCTTTTTGCGTTGGGATG
    GGTTCGTGTCCAGTCCCGGGGGTCTGATATGGCCATCACAGGCTGGGTGTTCCCAGCAGC
    CCTGGCTTGGGGGCTTGACGCCCTTCCCCTTGCCCCAGGCCATCATCTCCCCACCTCTCC
    TCCCCTCTCCTCAGTTTTGCCGACTGCTTTTCATCTGAGTCACCATTTACTCCAAGCATG
    TATTCCAGACTTGTCACTGACTTTCCTTCTGGAGCAGGTGGCTAGAAAAAGAGGCTGTGG
    GCAGGAAAGAAAGGCTCCTGTTTCTCATTTGTGAGGCCAGCCTCTGGCTTTTCTGCCGTG
    GATTCTCCCCCTGTCTTCTCCCCTCAGCAATTCCTGCAAAGGGTTAAAAATTTAACTGGT
    TTTTACTACTGATGACTTGATTTAAAAAAAATACAAAGATGCTGGATGCTAACTTGATAC
    TAACCATCAGATTGTACAGTTTGGTTGTTGCTGTAAATATGGTAGCGTTTTGTTGTTGTT
    GTTTTTTCATGCCCCATACTACTGAATAAACTAGTTCTGTGCGGGTAAAAAAAAAAAAAA
    AAAAAAAAAAA
    >AL137761
    CACAAAGAAAAAAGAAATACCTGTAGAAGCGCATCGAAAGCTCCTGGAACAGAGTTGTGT
    CTCATATTTGCAAAGATGCAGAAAAAATAAACCCGGGACATCCAGCTTTCTTTTCCTTTC
    TTCTTTGACTATTCTGAGAAGCTATGCGACTAGGAGCACATTTTAGGTAAACACGTGGCT
    TGAGTAGCCATAAGGCCACTCTTCCCTGTCGTGTGACCCGCGCCTGGGCCTTTAAGAGAT
    ATTGGTGTTTGAAAAGGGAGGAATCTGTTTGCCCTCAGATATTTAGTTCAACTGCCTGCA
    TTGCTTCCTATTTTGTTGTCCAACTCTGTAGTAGTTAGCACTGGCCTTACCAACATGTAA
    AGAAATTTTCTTTACTGCCCCATGAGTAGTTGGAGGCAAAGAGAAATTTTTAAAGCGCAG
    AAAAAGGCCTGCAGGGAGATGGAATTTGTTCTGCCAGAGAAACGAGATGATAGCTGTATT
    TAATAAAGTTACTGACCTCTTGTCAAAATTTAAAACGCAAAAGAAGATGTTTCAAAATGC
    AGAGAATGTCAGAAAACAAAAACTACAGGGACCAGACCAGTATAATGTTTAGTTTTCATT
    ATACTAACTTTTGTCTAGACTGGAGTTGATTCACTATTTTTTCTTTAACTCCTCAGGAAG
    CAAACCTTCCCGATGATGAAGACTTCTTGAAGGATTTCATGGGTGATTTGGGATCCCAGG
    ACCATTTGGCTAGTGTGCCTAGGTGACCACATGATTGCTGTTTTACCAGGAATGCAGCAT
    CCCATTGACAAAACAAGTGCTCTGAGAAGGTTTAAAATACTACAGAGAATATGGGAACAC
    AGACCTTGAAATTTAGCTGAGTTGTAACAGCTGAAACTCCAAGAGGTGTCTTCCTTGTTT
    GAGGTGAAACTAGTGTTGCTTCCAGAGGGCAGCTGGAAACCGTAAAGCTGTTTGGAAATC
    TTTTTGACTGACTTGCTGACAAAGAGGTACTGTGATGCATTTTAACAATATCTAAGTTGA
    TTTTTTTTTAAATCAAGGAAAATAAAAACCAAGCATGAATGCTATGGTATGTGCCCCTTT
    TGACCATCCTGGGCTGATTAACATCATTTAAATCAAAGTAATCATAAAAAGGCATATTCT
    ACTTCAATTATGTGGTCAAATAAGAGTAAACACACACACTCACACATGCTGACCCCAATT
    GCCAGAGCATTACTGCACTATAAATTACGGTTAATTCCCAAATTATACTACTGTTTATCT
    TATTTAACAAGTCAGAAAGCACTTTTAAAATAACTTGAGGGCTACAAGGTCATTCTATTA
    ATGTCATTCTCCATTCGGGTTGTAGGCATGTGGAAGTACCCATTAAAAGATAAGTTAGAG
    TTTAAATACTGATAAACAAAACCTTTTATTGCAACTGGACAGTTTCTGGAGAGTTAGCGG
    AAGAATCTTGGAGTTTCCTTTGGTCAGATGAATACAACATTTCACTTTTGCAGCACTATT
    TAGAATGTACTCCATGGTTCTCTTGTTCCCAACTTCCAAAAAGAACAGAAAACTTTGGTT
    TACACAGAACACGGGCATCTGAGGCAGGACCTCTTCCCTGCCCTTTGATCTGACTCACAC
    CTCCACATATGACGTAATCAACCCAAATTTGACACCAATTCACTCTTTTCTGCAAAGGGC
    ATATTTTGAAACAAGGGACAGCCTGAGGGCGGCTATAATGAGAATGTTCATGGGGGTTAC
    TGGGTCCCTAATTCTGAACTTGCTTATGACACCCAGAGTGAATAGATTCAGATTCAGAAC
    CTTCTGAGAAATAACCCAAAGAAAATTTGTTACCCAGCCAATTCTTCGAAAGCTTAATAT
    CAAAATATATCTTTTCAAGAAGAAAATCGTTAGAGAGAAGAATGTGGAGGGGAGAGAAAT
    GGGTTTCTCATTGATATGATATTTTGTTAACCATTTCATTTTGAATTATTCAAGTTTTGG
    TTAATATTGTATTCTTTTTTCGTAACTATTTTACCGTGAGAGTAGGTCATTGGGTTACTT
    AGATATTTATTTTTACACAGTTATTAGTCTTCAGATAGTTTTATTTTACTTCATATGATT
    TTAGTTTTTGTCAGTATAATTTTAAATCATGTTTTTCTTGGTCATCTCTTTGTGTATATT
    GTGTAATTGGATTTTCATTGACTGCAAGTGGAGTGTTTGCCACTCAATTCAGTACTCAGT
    ACTATGGTGACTTGTTTTCAAATAAGTCTCAGATACACATTTAGGGAGCCTTTGCTGGCC
    GAATATAGACTCTGTCAGGACAGCAGGTCCCCTGATCTAAGAATTTTCCCCAATGGTTGC
    TCTAAAAATGCTGCTATTTTGCTGTTCACTGTATTGCACTTAGTTAAAAAGAAGATAATG
    TGAAAGATGAGAGCAGTTTTTTAAAGGATCTTTTCATATACCCAATTCCCTTATTTTCAG
    ATGTCCCATCAATTTTAGATATGAAAGCTTTAAGTAAAAGTGTGTATGCCTTTCTACTGT
    CAGAACAGGATGGATGCAGCCTGGGTCAGATTTATTTAAGATAAAAATCATGCAGACTCA
    TCATTCATATCATAGGTGAAAAATGTAAAAACCAAATGGTTTCCACTAAAGCCACCAAGA
    TCTTTTAGAAATGTTTGCACCTTTGGTGGTGGCACAGGAAAAGAGAAGAATTCAGCTGGA
    GTGAATTCTAGAAGTAGATATCAGAAACGGGGCATGAAGAACAGGGGAACTGGGTGGCAT
    CAGACTCCTAAAGAAGTGAGTTAATTTTCCTTCCCTTCCATTCAGATTCATGCCACAGCT
    CCATATCTTGAGTATGTGTAAGAGGTGAGTTCCTTCTTCAGCCAGGGGCGGTGGCTCATG
    CCTTTAATCCCAATGCTTTGGGAGGCCAAGGTGGGAGGATCACTTGTGCCTTGGGGTTCA
    AGGTTGCAGTGAACCATGATTGCACCACTGCACTCCAGCCTGAGTGACAGAGCAAGACCC
    TGTCTCTAAAAATATATATAAAAAGTAAAACTAAAGAACTTCTTGCCTAAACCTGAATTA
    CCGCAATTTGCTGAGTGACTTTGAGAAAAATCAGACTGTTTAGTTCAGTCGGGATGAAAA
    GCTTGCGATTGCTTCCCACAAGAATGGGCAATAGTGACGGCTGCAAGGTACTTTTATTTG
    TTCATGAAAGAACGACAATTTTTCAAAATGTAATTAAACATAATAGAATGTTTTAAACTA
    CTGGGCACTGAAACTGGAAGAAAAAGGAGGCTTTATTGAACATTCCCCTTTTTCAGTTGG
    TTCAAAGTTCAGCACTGTGGTTATCATTGGTGATGCCAGAAAACATTAGTAGACTTAGAC
    AATTGCTATGGCAGTTTCTAAACAGAGCTTTTTCTATACACTATTTGCAACTGGAGTGCA
    ATATTGTATATTCTGTGTTAAAGAAATAAAGTATTTTTATCATTTATTAAAAAAAAAAAA
    AAAAA
    >AF038191
    CCATCCAGAACGATGAGGCCGTGGCCCCGCTCATGAAGTACCTGGATGAGAAGCTGGCCC
    TGCTGAACGCCTCGCTGGTGAAGGGGAACCTGAGCAGGGTGCTGGAGGCCCTGTGGGAGC
    TACTCCTCCAGGCCATTCTGCAGGCGCTGGGTGCAAACCGTGACGTCTCTGCTGATTTCT
    ACAGCCGCTTCCATTTCACGCTGGAGGCCCTGGTCAGTTTTTTCCACGCAGAGGGTCAGG
    GTTTGCCCCTGGAGAGCCTGAGGGATGGAAGCTACAAGAGGCTGAAGGAGGAGCTGCGGC
    TGCACAAATGTTCCACCCGCGAGTGCATCGAGCAGTTCTACCTGGACAAGCTCAAACAGA
    GGACCCTGGAGCAGAACCGGTTTGGACGCCTGAGCGTCCGTTGCCATTACGAGGCGGCTG
    AGCAGCGGCTGGCCGTGGAGGTGCTGCACGCCGCGGACCTGCTCCCCCTGGATGCCAACG
    GCTTAAGTGACCCCTTTGTGATTGTGGAGCTGGGCCCACCGCATCTCTTTCCACTGGTCC
    GCAGCCAGAGGACCCAGGTGAAGACCCGGACGCTGCACCCTGTATACGACGAACTCTTCT
    ACTTTTCCGTGCCTGCCGAGGCGTGCCGCCGCCGCGCGGCCTGTGTGTTGTTCACCGTCA
    TGGACCACGACTGGCTGTCCACCAACGACTTCGCTGGGGAGGCGGCCCTCGGCCTAGGTG
    GCGTCACTGGTGTCGCCCGGCCCCAGGTGGGCGGGGGTGCAAGGGCTGGGCAGCCTGTCA
    CCCTGCACCTGTGCCGGCCCAGAGCCCAGGTGAGATCTGCGCTGAGGAGGCTGGAAGGCC
    GCACCAGCAAGGAGGCGCAGGAGTTCGTGAAGAAACTCAAGGAGCTGGAGAAGTGCATGG
    AGGCGGACCCCTGAGTCCATCAGCTGCCAGCCCCGGCCCTGGCCCCCACCCCAAGTTCCC
    TGAAGCATCCTCCAGCTCACTGTGGCCAGCTTTGTGCAACCAGGGCCCACGGCGCCCCTC
    CTGTGCTGTGACGTGTGTGTCGTGGCTGGCCCCGCGGCGCCTACCGCCCTGGCCGTGTCT
    GTCTGGTGTGTGCTGTGAACCCCTGCACCCAACCCCACATCTGGGTGGCCAACTTGGCAG
    GACTTGGCCAGCAGCTGCCCAGGACACAGTGCAGGCCAGAGCGGGCTTGACCACCTGGTG
    GGCCTCCCTGCCCGCTTCCTTGGGCTCCCCGGCCCTGGGTGGGCGGTGCGCAGCTGGTCT
    CCAGGGACTCAGTGAGTGGCTGTGCTCTCTGCACAACGGGCAATGTGCAGACGCATTTTT
    GGTAATCACAGCTGGGGAGTGAAAAGGGTGCCACTGGCACCACTGGGTGGATGGTCCAGA
    GCCTCCACCCACAGAGGGGATGCAAAGGGCAGGTGAGTCAAGAACCGCATAGGTCTCCAG
    TCCCCACGGGGCTCCCAGGCCGGGGAAAGGTTCCCCTGAGGTCACTCTGAGGCCAGGGAC
    GTCACCCAAGGCTGGTGGTCAGTGTGAAGGGCTCCGTGCCAACTGGTCAGCTGTCCTTCA
    CGCACATATCCGTGGCCACCTGAGACCTGCTCCACGACCCTTCCAGGCAGAGCCGAGAGT
    TCGCCCCAACCCTTCCCCAGGCCCAGTGTGAAAAACAGACTCACAAGGGGCTTCTTGGCC
    TGCAGCTTCATTTGCGAGAGCGCCGAGGCAGGACACAGAGCACAGCTGTGCTGGAAGTGT
    GGGGAGAACCCGGACAGCTCAGTCCTGCCAGCAGCCGCAAAGAGCCGAGGCTGCCAGGCC
    CATTTATGTCCCTCATGTCTCTAGATTTTCTCGTCACCCAGCCTCAAAAATATATGTGTC
    TGCAACCCTCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    >BC016340
    GGGGGGGCTCCGTGACAGCCAACGCAGTGACCCTCGCCCCTTCCTTGGCAGCACATCATG
    CTTGTGCAGCGGCAGATGTCTGTGATGGAAGAGGACCTGGAAGAATTCCAGCTCGCTCTG
    AAACACTACGTGGAGAGTGCTTCCTCCCAAAGTGGATGCTTGCGTATTTCTATACAGAAG
    CTTTCAAATGAATCTCGCTACATGATCTATGAGTTCTGGGAGAATAGTAGTGTATGGAAT
    AGCCACCTTCAGACAAATTATAGCAAGACATTCCAAAGAAGTAATGTGGATTTCTTGGAA
    ACTCCAGAACTCACATCTACAATGCTAGTTCCTGCTTCGTGGTGGATCCTGAACAACTAG
    ATGTTCCTAGACATTTTCTTTATGGTTCCAAGTGCAAAACAGGTGTTCTTATCTAAAACG
    TCAATTAGAAAATTATCTGCGGTTGTTAATCTACTGTATATTTTTGTTTGGTATATTTAC
    TAAGTGCACTCTTTCAAAACTTATTCTATAACTTTATCAATTCATGTGAATTTTAGCTCA
    ATTTTCAAAGTTCACTAATATTCTCAATATTTAATGCTAAATGCTTTGCTACATTGTAAC
    TCACCTAAAACCTTTTAGTGACAAAATCCTAATATGTGGAAAAAAGCATATGCATAAAGG
    AATAATATTGTGAAAATGAATCTGTTATGATAAAGAAAAAATAAAGTGGAAACTTTTAGA
    GTATTACTTCATAGGGCAGATTTTGTAAACTGTCGTATACTGTAAAGGGTTAAATCAGCG
    TTTTGTGATTTTTAAGTAACTGTGAGTGAAGTTTATTCTTCAACAATGTCTACTCCATCC
    CCAACCCAACTCACAGCCCTATGACTACTATCTTTGCATTAGTTAAAAAGTTAGTATATA
    GGCATCAAACAACCTTGGCTGTAACCTATAGAATCTCTATCCATGTATCAGGTTATAGAC
    TGGTTTTTCAAAAGTGAACAATCCTGTGATAAGTTGGAGTACCATTTAGTAATACAGCAA
    CATTGTGTCATTTATTAGCATCATAATTCTTTGTTATGTAAGTTAAATATATCAAGAAAG
    AAGAGACTGTTTGGAAAAATGTGGTTCAAGTTTTATGCTATATAGTTTTGGTATGCGATA
    CAGACAGCTAACTTTTCTTATGAAAAATACATATTTGCATGTAAACAATGATTTCAAAAT
    ACTTGAAAAATAAAATTTTAACCCAAATGAATAACTAAGAAATATAAAACAAGCACAAAA
    TCTTAGGGAAGTCATAAAATAGTAGTGAAAGTATTAGACAGAAGACATCTGTTTTCGAAT
    TTCAACACTAGAATGACTAAAACTATCTACCTATAGAACTATCTGTAGATAGTATACTAT
    CTACACTCTGCTCAACAAGCTCAGAAATTAAATATTTTTAGTAATAAAAATCTGTTCTGG
    TTATAAACCTTGCTAATGAAAATACAATACATATAAAAATGTATAGCCATGTTATTTTCT
    AGTATAAATTCCTTTGAAACTATAAGTCTTTGAGGAAAATTATAAGGTAAAATTTTCCTG
    TTTTTCCCCCTTTGAAAAACTCAGGAAAAAAGGAAGATTGAACTAATAAAATTTTATTTC
    TTAAATATAAATTTGACCTAAAATATTTTCTCAAACTAATTCATGAAACAGCAACTTTTA
    CCAATACCTTTGTATACTCTCAGTTCTCATTCAGTATAAATAAAATTTTAAAATCCTTTC
    ATAGTTCTATTAGAAATAAGTAGTAAATTTTGATATATTGTACATACACACGTGTGTGTG
    TGTGTGTGTGTGTGTGTGTATTTGTGTGCCTCTGGTCAACTCTAAGGATGACAGACACTG
    TGTAACAACACCTGGGTCAACTCTTTTAATTTATATACAAAGCAAAGAACAACATTAATG
    GAGATGCACAATGATTATTCAAACAAGCTATATATATGTACAAAGGCAAACAGACACATA
    ACAGTCTCTGCAGACTGATTGTATATAGTAAGAAAAGATCAAAAGACTTTAAAACCTAAA
    TGACTTTTGACATACAAACTCTTCTTGAGAATGTTTGTTGTAAATGGTTTCAAAAATACA
    AATTATAGCCAATCAAAACATTGCTTTGGTTGGTGCATTTAAGTATCCAACTCAAAAAGC
    ATATCAAATATTTTGGGTACTAGGCAGTTTCCAAAGTAGCATGGTAGTATTACTTGTTAA
    AAGGGTTCTGTTTTCATTAACAGTACTAAGTGGAAGGGATCTGCAGATTCCAAATTGGAA
    TAAGCTCTATCATATTCTGAAACAAGAATTAGAATGACTTGAGAACGGGCAAATAACAAA
    GCAAACCAATATAATTATATGGTCATTCTGACCCCAGCTCTTATACAAATTATACATGTA
    TTTTTGTGTATGTTTGTGAGAGTTGTATGTATGTGAATGTGTGTGAGTGTGTATTCACAT
    ACACATATATACTGGAACCTATAGTAGAAAAGGAAACTAGTAGGGCCAAAAAAAAAAAGA
    AAAAGAAAAAGAAAAAAGAAAAAAAAAGAAAAAACTGGGACCTAAGTATAAATATCTCAT
    CCTAAAGTAAACAATAAGTTTATAGTTAACGAAGATTTTTTTCTATTTAAAACCCCATTT
    TCCTAAAGAACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    >BC013282
    GGCACGAGGGCAGGGGGAAGGGAAGTGCGGCTCGGTCGGCGCGGGTGGAGGGGGCGTGAG
    GCCGCCCTACGGTGGCCGTCGAGGGACGGCGCTACGGCTCCCACGCTAGGCCAAACGCCT
    CCGGCGGCCGCGCCCGAGAGCCCCTTCACCTGCAGGGCGACCCCAGCCGGCGACGCGTGA
    ACCACGCCCTCAGCCGCCTTGCCAGCGCCCCCAGCCGCGCGCCCCAGCACCATGCGGCCG
    CCCTGCGCACGGAGCCCCGAGGGACAGGGGCACCCGCAGGCCCGGCCCCTAGCACCGCCG
    GCCGGCCCCGAGGTCCGGGACGCCGGCGCCGCCGCGGAGAGGGCACCGGGCCGACGCCTC
    CCCCCAGGGTCAGCTGCGGGCTCCCAGGCCTAGGCGCCCATGACCCCTACGCCAACCGCC
    GCCTGGACACCGCCGCCGCCACTGCGACCTAGCGCCGCCGCCGCCGGGGCCCAATGCCGG
    TCATGCCCATTCCGCGGCGGGTGCGCTCCTTCCACGGCCCGCACACCACCTGCCTGCATG
    CGGCCTGCGGGCCCGTGCGCGCCTCCCACCTGGCCCGCACCAAGTACAACAACTTCGACG
    TGTACATCAAGACGCGCTGGCTGTACGGCTTCATCCGCTTCCTACTCTACTTTAGCTGCA
    GCCTGTTCACTGCGGCGCTCTGGGGTGCGCTGGCCGCCCTCTTCTGCCTACAGTACCTGG
    GCGTTCGCGTCCTGCTGCGCTTCCAGCGCAAGCTGTCGGTGCTGCTGCTGCTGCTGGGCC
    GCCGGCGCGTGGACTTCCGCCTGGTGAACGAGCTGCTCGTCTATGGCATCCACGTCACCA
    TGCTGCTGGTCGGGGGCCTGGGCTGGTGCTTCATGGTCTTCGTGGACATGTGAGGGCCGT
    GGGTGCGAGCTTGATGTATCGTCCCGGCCTGTGGCTGTGTTCTCTCCATGGGTGGGGTCG
    GCCAGCGCCTTCCCTTCGCCCATCCCCCAGGCAGTCGCTGCTGCCCGGCGCCCACGGAGA
    GAAAAGAAAGGGCTGAGACTTCTGTGATGGGGGCGCGGACACCACCCCTAGGCTGGCTTC
    CTGGACCCACCCTCCCCGTATGCACTCTCAGGGGCAGCGCCCACCTGCCGGTGGCTCCTG
    CTCACATGTCTTCGGGTCGTACTGCGGGGTGGGCCCTCCGTTCCGCCTCTCTGTGGGCCT
    CTCTCCAGGACCACAGCTGCCAGGGACTTTAGACATCACCCTGGGAGGCCCCTGGACACA
    GAGGGCTGTGTGCCCAGGAGCAATTCCGGAGGGGGGCCCTCCTGGCTGCACAGCCCCTTC
    TGCGTGCCCTGGCCCCAGCCCCAGCCAACGGGACACGGAAGGCTCCCCTCGCTGACACAC
    CACACTGCCACAAAGCTGCTTACTCTGCCCTGGGCCGCCTGAGGCCTGGCACTGCCCGCG
    GACCACCCTGTGTGTGTCATCCTGAGGGGCTGTGTGGGTCCTGAGTCCCCAGCCAGCCTT
    CAGGGTCCCCTTGGATTGTGTAGATGCAGTCTAGCGGGGGGCCGGAGAAGGGCTCAGGTG
    GGAGGGGCCTCAGCAGGCTCCCAGCTCAGGGGCTGGCCTGGGGGGAACCCTGGGAGCCAG
    GGGCTGACTCCAGCAACACTGGCCTGTCTGCCTGTTCTGGGAGGGCTGTGAGGATGTCTT
    GCAGATGCTCTGGATTTCTGCGGAGGCACCTCCATTCCTTTCTGGCTTTTTTTGCGGGGG
    AGGGCTTTGGGCCTCTTTCTTTGAGGGAACACCGTCAAAGAAAGCCTGGGAGATCGAGGC
    TTCAGTGAGCCAGGATGGAAACGCGTGTCCCAAGTGTCCGGAGCAGGCGGCAGAGGCCTC
    AGTGCGGCAAACACAGCCCCAGAGCCTGTGTGGCACCAGCAGCATCTTAGAGCCCCAGGT
    ATATGCTGAGATCTTATCTCACGCTGTCCTCCAGTGTCTGGGGGGCCCAAATGATGGCAC
    AGGGTCAGGTGGGCTGGAGGGGCGCAGATGCCTGTGTTCAGGGAGGGTGGCCACCATGGG
    CCGAGGTCTCACCCAGGACCCCTTGCTCTGCTCCTCAGCCTTGCAGTCACGGCAGCACTA
    TGGTGGACTGCCCATGGCCGTGTGACTTTGGGGGCAAGTGGGAGGGCGCCCTGAATAATG
    ATTGCAAGGACAACAGGCAGAGGCTACCCTAGAGCAGGACACAGGGTGTGGTACTGACAA
    CCCTAGTGTCACCTCAAATCCATGTCCCCACACTCTGGGCATGGGTGGGACTTGTGACCC
    TACCCTGTCAGGCGGACCAGTGGCCCAGGAGCCATGAGGACAGTTGTGTGCCACTGGAAG
    AGAAACTTTTTGAAAAACCCTAAATCAGGTAGAGAAAGCAAAAAATCTCTGGCCGTAAAC
    CGTGCTCTCTAATTTATCGGCAGCTTCTGTGGATGACCTCTGATGAGCCCGGGCTGCGTC
    CACGCCCTGGGCAGGTAGGCGGGAGCTTCCCTGCGTGGGCCTCATTTCTTGCTGCAGAGA
    ATCTTTTGCACTAAGTCATGCTGTTTCCTCAAAGAAGCTTTGTTTTTTGTTAACGTATTA
    CTCAGAGTCACCCAAGCCTCTTGGCTGAGGGTGAAGGTGGGACGGGAGGCGGGAGGGGGC
    TGGTGGTGCCGCTCGTGCGGTGTCAACGCTGCAGGGAGTTGTGGCACCTTGGTGCCCTCT
    GAGCACCTGGCCGCCTGCTGTCCCCGGTGCCTGTGAAATTCGTCATGCCATGACCCACCT
    GCATTAAACCTATTTTTTTAATGTGTTAAAAAAAAAAAAAAAAAA
    >H09748
    GNGGAAACACGGGCCAAACCCGTGANTTTGGTGCCCCTTGTAAACTCANCCCCTGCAAAN
    CCAAAGACCCCAATGGATTTAAAGTTGNTTGGCATTTGTACTGGCAAGGCAAAANATTTT
    TAANTACCTTTTCCTAATACTTATTGTATGAGCTTTTGNTGTTTACTTGGAGGTTTTGTC
    TTTTACTACAAGTTTGGAACTATTTANTATTGCCTTGGTATTTGTGCTCTGTTTAAGAAA
    CAGGCACTTTTTTTTATTATGGATAAAATGTTGAGATGACAGGAGGTCATTTCAATATGG
    CTTAGTAAAATATTTATTGTTCCTTTATTCTCTGTACAAGATTTTGGGCCTCTTTTTTTC
    CTTAATGTCACAATGTTGAGTTCAGCATGTGTCTGTCCATTTCATTTGTACGCTTGTTCA
    AAACCAAGTTTGTTCTGGTTTCAAGTTATAAAAATAAATTGGACATTTAACTTGATCTCC
    AAAAAAAAAAAAAAAA
    >BC001665
    GGCACGAGGCAATCTGAGGAGCAGGAGGACCGGGGCGCCGGTGTCCTGCCGCCTCCTTCT
    CCTTGCTCTCACCTGCGCCTATTAGTCCACGCGCCTTCAAGGCCAGGGGCTACAGCCCAG
    ACAGAGAGGGGACAGCAGAGGGAGAGAGAGCACCTGAGGATACAGAGCTGGCACTGGACT
    GCCTTTTCACCCCCCAGGTGATGAGTGAGGTTCGAAGAACGGAAGATTTAAAAAGCAGCC
    GGGGCCTCCGTATTGAATGAAAGACCCAGTGCAAAGACATCACCATGAACACTAGCATTC
    CTTATCAGCAGAATCCTTACAATCCACGGGGCAGCTCCAATGTCATCCAGTGCTACCGCT
    GTGGAGACACCTGCAAAGGGGAAGTGGTCCGCGTGCACAACAACCACTTCCACATCAGAT
    GCTTCACCTGTCAAGTATGTGGCTGTGGCCTGGCCCAGTCAGGCTTCTTCTTCAAGAACC
    AGGAGTACATCTGCACCCAGGACTACCAGCAACTCTATGGCACCCGCTGTGACAGCTGCC
    GGGACTTCATCACAGGCGAAGTCATCTCGGCCCTGGGCCGCACTTACCACCCCAAGTGCT
    TCGTGTGCAGCTTGTGCAGGAAGCCTTTCCCCATTGGAGACAAGGTGACCTTCAGCGGTA
    AAGAATGTGTGTGCCAAACGTGCTCCCAGTCCATGGCCAGCAGTAAGCCCATCAAGATTC
    GTGGACCAAGCCACTGTGCCGGGTGCAAGGAGGAGATCAAGCACGGCCAGTCACTCCTGG
    CTCTGGACAAGCAGTGGCACGTCAGCTGCTTCAAGTGCCAGACCTGCAGCGTCATCCTCA
    CCGGGGAGTATATCAGCAAGGATGGTGTTCCATACTGTGAGTCCGACTACCATGCCCAGT
    TTGGCATTAAATGTGAGACTTGTGACCGATACATCAGTGGCAGAGTCTTGGAGGCAGGAG
    GGAAGCACTACCACCCAACCTGTGCCAGGTGTGTACGCTGCCACCAGATGTTCACCGAAG
    GAGAGGAAATGTACCTCACAGGTTCCGAGGTTTGGCACCCCATCTGCAAACAGGCAGCCC
    GGGCAGAGAAGAAGTTAAAGCATAGACGGACATCTGAAACCTCCATCTCACCCCCTGGAT
    CCAGCATTGGGTCACCCAACCGAGTCATCTGCGACATCTACGAGAACCTGGACCTCCGGC
    AGAGACGGGCCTCCAGCCCGGGGTACATAGACTCCCCCACCTACAGCCGGCAGGGCATGT
    CCCCCACCTTCTCCCGCTCACCTCACCACTACTACCGCTCTGGTGATTTGTCTACAGCAA
    CCAAGAGCAAAACAAGTGAAGACATCAGCCAGACCTCCAAGTACAGTCCCATCTACTCGC
    CAGACCCCTACTATGCTTCGGAGTCTGAGTACTGGACCTACCATGGGTCCCCCAAAGTGC
    CCCGAGCCAGAAGGTTCTCGTCTGGAGGAGAGGAGGATGATTTTGACCGCAGCATGCACA
    AGCTCCAAAGTGGAATTGGCCGGCTGATTCTGAAGGAAGAAATGAAGGCCCGGTCGAGCT
    CCTATGCAGATCCCTGGACCCCTCCCCGGAGCTCCACCAGCAGCCGGGAAGCCCTGCACA
    CAGCTGGCTATGAGATGTCCCTCAATGGCTCCCCTCGGTCGCACTACCTGGCTGACAGTG
    ATCCTCTCATCTCCAAATCTGCCTCCCTGCCTGCCTACCGAAGAAATGGGCTGCACAGGA
    CACCCAGCGCAGACCTCTTCCACTACGACAGCATGAACGCAGTCAACTGGGGCATGCGAG
    AGTACAAGATCTACCCTTATGAACTGCTGCTGGTGACTACAAGAGGAAGAAACCGACTGC
    CCAAGGATGTAGACAGGACCCGTTTAGAGGGAAACTTTTGGAAGAGTGGCTGCTTATGAG
    ATTCCAAAATGAAGTGTTGGCCAACACCGCTCATGGCCATCCTGGATTTTCCCAGTGGCT
    TCCCTTCCTGCTCGCCTCCCTGAACAGGGGAGAAAGCTTAACCTCTCTTCTCCTCTCCAA
    ACCTTTCACCTTGAATGGGTAATGTTTGGTGGGGGCTGTTCCTTCTTGGAGAAGCCTTGA
    GTCGGACCATTTTGAGATCATGGAGGAAGGATGAAGAAGTGAAAATGACAATAATGACTC
    TCAAGAGGCTGGCGATGTGACATGGCAAATGTAGAACTGACTTAAATTGAACAAACCCTC
    ACTGAGCACCTCTGATGTTGAGCACCTGCTGAATACTGAGCACTGAATGGGGGAGGGGGA
    GGGGAGCACGGGGTGAGTCAACCTGGGACTCGGTCTCAGGGATATGCCTACCAATAGCGG
    GTATCGTAAGGCATGTACCCAAACATAACGGATGTAAGGCAGAAAGTGATCGGAGAAGGA
    ATGAGAAAGTGTGCGTGATGTTAATGAAAAGTCATATGCAGCTAGAGCAGACCCAGGAAA
    GCTTTCTGGAAGAGATTGCATCTGAGGAAATTCAGGAAGGATCTTTGTAGATTGGGGGGA
    GATTCTAAATTGAAGGGGTGATGGGGTGAGGGGCCAGAGGGAAGTCTGCTGTGTTCTCAT
    GTAGGATGTCAGCCCTCCCTGCAACTTCTCTTTTTGGCCAATGTCTTTTCACTTTCCTGA
    CCCTTTAGAATCATCCCCAGCCAGACGCAATCATGGAAGTTGCCTTATTGTCACTGGTTA
    AGAACTTGGCGAGATTGAAGGGCTTTTGTTATTGTTGTTGGATATTTTTGTTTCCCATAA
    AAGCACATCATTTCAACCCTAAAAAAAAAAAAAAAAAAAAAA
    >BC016451
    GAAGAATTAGATACTTTTGAGTGGGCTTTGAAGAGCTGGTCTCAGTGTTCCAAACCCTGT
    GGTGGAGGTTTCCAGTACACTAAATATGGATGCCGTAGGAAAAGTGATAATAAAATGGTC
    CATCGCAGCTTCTGTGAGGCCAACAAAAAGCCGAAACCTATTAGACGAATGTGCAATATT
    CAAGAGTGTACACATCCACTCTGGGTAGCAGAAGAATGGGAACACTGCACCAAAACCTGT
    GGAAGTTCTGGCTATCAGCTTCGCACTGTACGCTGCCTTCAGCCACTCCTTGATGGCACC
    AACCGCTCTGTGCACAGCAAATACTGCATGGGTGACCGTCCCGAGAGCCGCCGGCCCTGT
    AACAGAGTGCCCTGCCCTGCACAGTGGAAAACAGGACCCTGGAGTGAGTGTTCAGTGACC
    TGCGGTGAAGGAACGGAGGTGAGGCAGGTCCTCTGCAGGGCTGGGGACCACTGTGATGGT
    GAAAAGCCTGAGTCGGTCAGAGCCTGTCAACTGCCTCCTTGTAATGATGAACCATGTTTG
    GGAGACAAGTCCATATTCTGTCAAATGGAAGTGTTGGCACGATACTGCTCCATACCAGGT
    TATAACAAGTTATGTTGTGAGTCCTGCAGCAAGCGCAGTAGCACCCTGCCACCACCATAC
    CTTCTAGAAGCTGCTGAAACTCATGATGATGTCATCTCTAACCCTAGTGACCTCCCTAGA
    TCTCTAGTGATGCCTACATCTTTGGTTCCTTATCATTCAGAGACCCCTGCAAAGAAGATG
    TCTTTGAGTAGCATCTCTTCAGTGGGAGGTCCAAATGCATATGCTGCTTTCAGGCCAAAC
    AGTAAACCTGATGGTGCTAATTTACGCCAGAGGAGTGCTCAGCAAGCAGGAAGTAAGACT
    GTGAGACTGGTCACCGTACCATCCTCCCCACCCACCAAGAGGGTCCACCTCAGTTCAGCT
    TCACAAATGGCTGCTGCTTCCTTCTTTGCAGCCAGTGATTCAATAGGTGCTTCTTCTCAG
    GCAAGAACCTCAAAGAAAGATGGAAAGATCATTGACAACAGACGTCCGACAAGATCATCC
    ACCTTAGAAAGATGAGAAAGTGAACCAAAAAGGCTAGAAACCAGAGGAAAACCTGGACAA
    CCTCTCTCTTCCCATGGTGCATATGCTTGTTTAAAGTGGAAATCTCTATAGATCGTCAGC
    TCATTTTATCTGTAATTGGAAGAACAGAAAGTGCTGGCTCACTTTCTAGTTGCTTTCATC
    CTCCTTTTGTTCTGCATTGACTCATTTACCAGAATTCATTGGAAGAAATCACCAAAGATT
    ATTACAAAAGAAAAATATGTTGCTAAGATTGTGTTGGTCGCTCTCTGAAGCAGAAAAGGG
    ACTGGAACCAATTGTGCATATCAGCTGACTTTTTGTTTGTTTTAGAAAAGTTACAGTAAA
    AATTAAAAAGAGATACCAATGGTTTACACTTTAACAAGAAATTTTGGATATGGAACAAAG
    AATTCTTAGACTTGTATTCCTATTTATCTATATTAGAAATATTGTATGAGCAAATTTGCA
    GCTGTTGTGTAAATACTGTATATTGCAAAAATCAGTATTATTTTAAGAGATGTGTTCTCA
    AATGATTGTTTACTATATTACATTTCTGGATGTTCTAGGTGCCTGTCGTTGAGTATTGCC
    TTGTTTGACATTCTATAGGTTAATTTTCAAAGCAGAGTATTACAAAAGAGAAGTTAGAAT
    TACAGCTACTGACAATATAAAGGGTTTTGTTGAATCAACAATGTGATACGTAAATTATAG
    AAAAAGAAAAGAAACACAAAAGCTATAGATATACAGATATCAGCTTACCTATTGCCTTCT
    ATACTTATAATTTAAAGGATTGGTGTCTTAGTACACTTGTGGTCACAGGGATCAACGAAT
    AGTAAATAATGAACTCGTGCAAGACAAAACTGAAACCCTCTTTCCAGGACCTCAGTAGGC
    ACCGTTGAGGTGTCCTTTGTTTTTGTGTGTGTGTGTTCTTTTTTAATTTTCGCATTGTTG
    ACAGATACAAACAGTTATACTCAATGTACTGTAATAATCGCAAAGGAAAAAGTTTTGGGA
    TAACTTATTTGTATGTTGGTAGCTGAGAAAAATATCATCAGTCTAGAATTGATATTTGAG
    TATAGTAGAGCTTTGGGGCTTTGAAGGCAGGTTCAAGAAAGCATATGTCGATGGTTGAGA
    TATTTATTTTCCATATGGTTCATGTTCAAATGTTCACAACCACAATGCATCTGACTGCAA
    TAATGTGCTAATAATTTATGTCAGTAGTCACCTTGCTCACAGCAAAGCCAGAAATGCTCT
    CTCCAGGGAGTAGATGTAAAGTACTTGTACATAGAATTCAGAACTGAAGATATTTATTAA
    AAGTTGATTTTTTTTTCTTGATAGTATTTTTATGTACTAAATATTTACACTAATATCAAT
    TACATATTTTGGTAAACTAGAGAGACATAATTAGAGATGCATGCTTTGTTCTGTGCATAG
    AGACCTTTAAGCAAACTACTACAGCCAACTCAAAAGCTAAAACTGAACAAATTTGATGTT
    ATGCAAACATCTTGCATTTTTAGTAGTTGATATTAAGTTGATGACTTGTTTCCCTTCAAG
    GAAACATTAAATTGTATGGACTCAGCTAGCTGTTCAATGAAATTGTGAATTAGAAACATT
    TTTAAAAGTTTTTGAAAGAGATAAGTGCATCATGAATTACATGTACATGAGAGGAGATAG
    TGATATCAGCATAATGATTTTGAGGTCAGTACCTGAGCTGTCTAAAAATATATTATACAA
    ACTAAAATGTAGATGAATTAACCTCTCAAAGCACAGAATGTGCAAGAACTTTTGCATTTT
    AATCGTTGTAAACTAACAGCTTAAACTATTGACTCTATACCTCTAAAGAATTGCTGCTAC
    TTTGTGCAAGAACTTTGAAGGTCAAATTAGGCAAATTCCAGATAGTAAAACAATCCCTAA
    GCCTTAAGTCTTTTTTTTTTTCCTAAAAATTCCCATAGAATAAAATTCTCTCTAGTTTAC
    TTGTGTGTGCATACATCTCATCCACAGGGGAAGATAAAGATGGTCACACAAACAGTTTCC
    ATAAAGATGTACATATTCATTATACTTCTGACCTTTGGGCTTTCTTTTCTACTAAGCTAA
    AAATTCCTTTTTATCAAAGTGTACACTACTGATGCTGTTTGTTGTACTGAGAGCACGTAC
    CAATAAAAATGTTAACAAAATATAAAAAAAAAAAAAAA
    >BF510316
    TCCTGTGTTCTAGACCTCTGGAGGCTGCTGTGGGGACCACACTGATCCTGGAGAAAAGGG
    ATGGAGCTGAAAAAGATGGAATGCTTGCAGAGCATGACCTGAGGAGGGAGGAACGTGGTC
    AACTCACACCTGCCTCTTCCTGCAGCCTCACCTCTACCTGCCCCCATCATAAGGGCACTG
    AGCCCTTCCCAGGCTGGATACTAAGCACAAAGCCCATAGCACTGGGCTCTGATGGCTGCT
    CCACTGGGTTACAGAATCACAGCCCTCATGATCATTCTCAGTGAGGGCTCTGGATTGAGA
    GGGAGGCCCTGGGAGGAGAGAAGGGGGCAGAGTCTTCCCTACCAGGTTTCTACACCCCCG
    CCAGGCTGCCCATCAGGGCCCAGGGAGCCCCCAGAGGACTTTATTCGGACCAAGCAGAGC
    TCACAGCTGGACAGGTGTTGTATATAGAGTGGAATCTCTTGGATGCAGCTTCAAGAATAA
    ATTTTTCTTCTCTTTTCAAAAATGTATAAAAATCATTATACATAGCATTAAAGAAACATT
    TTTGAGAAGTACAAAACAAAAAAAAAA
    >AF301598
    CGGGCGCCGCAGGAGCGAGTGAGCTGGGAGCGAGGGGCGAAGGCGCGGAGAAGCCCGGCC
    GCCCGGTGGGCGGCAGAAGGCTCAGCCGAGGCGGCGGCGCCGACTCCGTTCCACTCTCGG
    CCCGGATCCAGGCCTCCGGGTTCCCAGGCGCTCACCTCCCTCTGACGCACTTTAAAGAGT
    CTCCCCCCTTCCACCTCAGGGCGAGTAATAGCGACCAATCATCAAGCCATTTACCAGGCT
    TCGGAGGAAGCTGTTTATGTGATCCCCGCACTAATTAGGCTCATGAACTAACAAATCGTT
    TGCACAACTTGTGAAGAAGCGAACACTTCCATGGATTGTCCTTGGACTTAGGGCGCCCTG
    CCCGCCTTTTGCAGAGGAGAAAAAACTTTTTTTTTTTTTTGCCTCCCCCGAGAACTTTCC
    CCCCTTCTCCTCCCTGCCTCTAACTCCGATCCCCCCACGCCATCTCGCCAAAAAAAAAAA
    AAAAAAAAAAAAAGAAAAAAAAAGAAAAAAAAAGAAAAAAAATTACCCCAATCCACGCCT
    GCAAATTCTTCTGGAAGGATTTTCCCCCCTCTCTTCAGGTTGGGCGCGTTTGGTGCAAGA
    TTCTCGGGATCCTCGGCTTTGCCTCTCCCTCTCCCTCCCCCCTCCTTTCCTTTTTCCTTT
    CCTTTCCTTTCTTTCTTCCTTTCCTTCCCCCCACCCCCACCCCCACCCCAAACAAACGAG
    TCCCCAATTCTCGTCCGTCCTCGCCGCGGGCAGCGGGCGGCGGAGGCAGCGTGCGGCGGT
    CGCCAGGAGCTGGGAGCCCAGGGCGCCCGCTCCTCGGCGCAGCATGTTCCAGCCGGCGCC
    CAAGCGCTGCTTCACCATCGAGTCGCTGGTGGCCAAGGACAGTCCCCTGCCCGCCTCGCG
    CTCCGAGGACCCCATCCGTCCCGCGGCACTCAGCTACGCTAACTCCAGCCCCATAAATCC
    GTTCCTCAACGGCTTCCACTCGGCCGCCGCCGCCGCCGCCGGTAGGGGCGTCTACTCCAA
    CCCGGACTTGGTGTTCGCCGAGGCGGTCTCGCACCCGCCCAACCCCGCCGTGCCAGTGCA
    CCCGGTGCCGCCGCCGCACGCCCTGGCCGCCCACCCCCTACCCTCCTCGCACTCGCCACA
    CCCCCTATTCGCCTCGCAGCAGCGGGATCCGTCCACCTTCTACCCCTGGCTCATCCACCG
    CTACCGATATCTGGGTCATCGCTTCCAAGGGAACGACACTAGCCCCGAGAGTTTCCTTTT
    GCACAACGCGCTGGCCCGAAAGCCCAAGCGGATCCGAACCGCCTTCTCCCCGTCCCAGCT
    TCTAAGGCTGGAACACGCCTTTGAGAAGAATCACTACGTGGTGGGCGCCGAAAGGAAGCA
    GCTGGCACACAGCCTCAGCCTCACGGAAACTCAGGTAAAAGTATGGTTTCAGAACCGAAG
    AACAAAGTTCAAAAGGCAGAAGCTGGAGGAAGAAGGCTCAGATTCGCAACAAAAGAAAAA
    AGGGACGCACCATATTAACCGGTGGAGAATCGCCACCAAGCAGGCGAGTCCGGAGGAAAT
    AGACGTGACCTCAGATGATTAAAAACATAAACCTAACCCCACAGAAACGGACAACATGGA
    GCAAAAGAGACAGGGAGAGGTGGAGAAGGAAAAAACCCTACAAAACAAAAACAAACCGCA
    TACACGTTCACCGAGAAAGGGAGAGGGAATCGGAGGGAGCAGCGGAATGCGGCGAAGACT
    CTGGACAGCGAGGGCACAGGGTCCCAAACCGAGGCCGCGCCAAGATGGCAGAGGATGGAG
    GCTCCTTCATCAACAAGCGACCCTCGTCTAAAGAGGCAGCTGAGTGAGAGACACAGAGAG
    AAGGAGAAAGAGGGAGGGAGAGAGAGAAAGAGAGAGAAAGAGAGAGAGAGAGAGAGAGAG
    AGAAAGCTGAACGTGCACTCTGACAAGGGGAGCTGTCAATCAAACACCAAACCGGGGAGA
    CAAGATGATTGGCAGGTATTCCGTTTATCACAGTCCACTTAAAAAATGATGATGATGATA
    AAAACCACGACCCAACCAGGCACAGGACTTTTTTGTTTTTTGCACTTCGCTGTGTTTCCC
    CCCCATCTTTAAAAATAATTAGTAATAAAAAACAAAAATTCCATATCTAGCCCCATCCCA
    CACCTGTTTCAAATCCTTGAAATGCATGTAGCAGTTGTTGGGCGAATGGTGTTTAAAGAC
    CGAAAATGAATTGTAATTTTCTTTTCCTTTTAAAGACAGGTTCTGTGTGCTTTTTATTTT
    GATTTTTTTTCCCAAGAAATGTGCAGTCTGTAAACACTTTTTGATACCTTCTGATGTCAA
    AGTGATTGTGCAAGCTAAATGAAGTAGGCTCAGCGATAGTGGTCCTCTTACAGAGAAACG
    GGGAGCAGGACGACGGGGGGGCTGGGGGTGGCGGGGGAGGGTGCCCACAAAAAGAATCAG
    GACTTGTACTGGGAAAAAAACCCCTAAATTAATTATATTTCTTGGACATTCCCTTTCCTA
    ACATCCTGAGGCTTAAAACCCTGATGCAAACTTCTCCTTTCAGTGGTTGGAGAAATTGGC
    CGAGTTCAACCATTCACTGCAATGCCTATTCCAAACTTTAAATCTATCTATTGCAAAACC
    TGAAGGACTGTAGTTAGCGGGGATGATGTTAAGTGTGGCCAAGCGCACGGCGGCAAGTTT
    TCAAGCACTGAGTTTCTATTCCAAGATCATAGACTTACTAAAGAGAGTGACAAATGCTTC
    CTTAATGTCTTCTATACCAGAATGTAAATATTTTTGTGTTTTGTGTTAATTTGTTAGAAT
    TCTAACACACTATATACTTCCAAGAAGTATGTCAATGTCAATATTTTGTCAATAAAGATT
    TATCAATATGCCAAAAAAAAAAAAAAA
    >Hs.77031_mRNA_1 gi|16741772|gb|BC016680.1|BC016680 Homo sapiens clone
    MGC:21349 IMAGE:4338754 polyA = 3
    GTGGCGGCGGAGGCGGCGGAGGCCAGGGAGGAAGATGTCGTAATGAGCGATCCACAGACC
    AGCATGGCTGCCACTGCTGCTGTGAGTCCCAGTGACTACCTGCAGCCTGCCGCCTCCACC
    ACCCAGGACTCCCAGCCATCTCCCTTAGCCCTGCTTGCTGCAACATGTAGCAAAATTGGC
    CCTCCAGCAGTTGAAGCTGCTGTGACACCTCCTGCTCCCCCACAGCCCACACCGCGGAAA
    CTTGTCCCTATCAAACCTGCCCCTCTCCCTCTCAGCCCCGGCAAGAATAGCTTTGGAATC
    TTGTCCTCCAAAGGAAATATACTTCAGATTCAGGGGTCACAACTGAGCGCCTCCTATCCT
    GGAGGGCAGCTGGTGTTCGCTATCCAGAATCCCACCATGATCAACAAAGGGACCCGATCA
    AATGCCAATATCCAGTACCAGGCGGTCCCTCAGATTCAGGCAAGCAATTCCCAAACCATC
    CAAGTACAGCCCAATCTCACCAACCAGATCCAGATCATCCCTGGCACCAACCAAGCCATC
    ATCACCCCCTCACCGTCCAGTCACAAGCCTGTCCCCATCAAGCCAGCCCCCATCCAGAAG
    TCGAGTACGACCACCACCCCCGTGCAGAGCGGGGCCAATGTGGTGAAGTTGACAGGTGGG
    GGCGGCAATGTGACGCTCACTCTGCCCGTCAACAACCTCGTGAACGCCAGTGACACCGGG
    GCCCCTACTCAGCTCCTCACTGAAAGCCCCCCAACCCCGCTGTCTAAGACTAACAAGAAA
    GCAAGGAAGAAGAGCCTTCCTGCCTCCCAGCCCCCTGTGGCTGTGGCTGAGCAGGTGGAG
    ACGGTGCTGATCGAGACCACCGCGGACAACATCATCCAGGCAGGAAATAACCTGCTCATT
    GTTCAGAGCCCTGGTGGGGGCCAGCCAGCTGTGGTCCAGCAGGTCCAGGTGGTGCCCCCC
    AAGGCCGAGCAGCAGCAGGTGGTACAGATCCCCCAGCAGGCTCTGCGGGTGGTGCAGGCG
    GCATCTGCCACCCTCCCCACTGTACCCCAGAAGCCCTCCCAGAACTTTCAGATCCAGGCA
    GCTGAGCCGACACCTACTCAGGTCTACATCCGCACGCCTTCCGGTGAGGTGCAGACAGTC
    CTTGTCCAGGACAGCCCCCCAGCAACAGCTGCAGCCACCTCTAACACCACCTGTAGCAGC
    CCTGCATCCCGTGCTCCCCATCTGAGTGGGACCAGCAAAAAGCACTCAGCTGCAATTCTC
    CGAAAAGAGCGTCCCCTGCCAAAGATTGCCCCAGCCGGGAGCATCATCAGCCTGAATGCA
    GCCCAGTTGGCGGCAGCTGCCCAGGCAATGCAGACCATCAACATCAATGGTGTCCAGGTC
    CAGGGCGTGCCTGTCACCATCACCAACACAGGCGGGCAGCAGCAGCTGACAGTGCAGAAT
    GTTTCTGGGAACAACCTGACCATCAGTGGGCTGAGCCCCACCCAGATCCAGCTGCAAATG
    GAACAAGCCCTGGCCGGAGAGACCCAGCCCGGGGAGAAGCGGCGCCGCATGGCCTGCACG
    TGTCCCAACTGCAAGGATGGGGAGAAGAGGTCTGGAGAGCAGGGCAAGAAGAAGCACGTG
    TGCCACATCCCCGACTGTGGCAAGACGTTCCGTAAGACGTCCTTGCTGCGTGCCCATGTG
    CGCCTGCACACTGGCGAGCGGCCCTTTGTCTGCAACTGGTTCTTCTGTGGGAAGAGGTTC
    ACACGGAGTGACGAGCTCCAACGGCATGCTCGCACCCACACAGGGGACAAACGCTTCGAG
    TGCGCCCAGTGTCAGAAGCGCTTCATGAGGAGTGACCACCTCACCAAGCATTACAAGACC
    CACCTGGTCACGAAGAACTTGTAAGGCCAACTGCGGCGGGAGGCCCTGAAGATGCAGTCC
    CCCACCTGTGTCCTCCCTGGGCCCCTGGTGGAAAGGAGCCCTGTGGCTGCCTTGGGCCTG
    CCCTCAGCCCCACTCCTGTTCTGCAACTGTCCCCACAGGAAGGGGCTCTGTTCCCTGTAT
    TGTCCTCCTTCTGAAGCCCCTTGGCTCTGCCTTGGCCCTTCCCCTCACCACGAGCTCCCG
    GCCTGCCCAGACTGTGGACACTGGCCGTGCCCAATGAGACGTTCTAAACCAGGACGCGTG
    GGAACCCTTATTTCCAAAGGAAAAACATGCATTTCACTCCGTCGAGGAGCAAAGTGAGCC
    CCTACCCCCCACCCCGATCCCCGCTCCCAACACTGCCGGAGTCGCGTCATGCCATGCCCC
    CTCTCCTGCACCTCCCTGGCCCTGCCGGCCACTGTGGACGCCCTGGGGCTTGGCACCCAC
    CTCTGGAGAAACTCGGGGCCACCTCCACTCCATGTGCCCAGCCCCGCCACAACCTCTCCT
    CCAGCACATTCCAGCTCTATTTAAAAAGTAAAGACACCCACCGACTCCTGATCCCCCTCT
    TTTTCTATGGAGAACGTTGCCTTATACTCTCTACTTCAGATGATGAACACTGTGTACTGT
    GTGTGCTTTAAAGAAGTTTTATTTAATTGCTCCCTTCTTCCTTTCCTTGTTATTCACCTC
    CCTGATGCCTGCTTTCAGTTGAGGGTTGGGGGCAATGATGAGCATATGAATTTTTTCTCA
    CTCTAGCAATTCCCTTTTCTAAATGACACAGCATTTAAACTCAAATCTGGATTCAGATAA
    CAGCACCTGCACATCCTGCACCTCCTCCCTCTCCCTTCACCTCACCCCTGCCCGGCCCAA
    GCTCTACTTGTGTACAGTGTATATTGTATAATAGACAATTGTGTCTACTACATGTTTAAA
    AACACATTGCTTGTTATTTTTGAGGCTTTTAAATTAAACAAAAATCCAACTTTAAAAAAA
    AAAAAAAA
    >Hs.77541_mRNA_1 gi|12804364|gb|BC003043.1|BC003043 Homo sapiens clone
    MGC:4370 IMAGE:2822973 polyA = 3
    CCCGCGTCGGTGCCCGCGCCCCTCCCCGGGCCCCGCCATGGGCCTCACCGTGTCCGCGCT
    CTTTTCGCGGATCTTCGGGAAGAAGCAGATGCGGATTCTCATGGTTGGCTTGGATGCGGC
    TGGCAAGACCACAATCCTGTACAAACTGAAGTTGGGGGAGATTGTCACCACCATCCCAAC
    CATAGGCTTCAATGTAGAAACAGTGGAATATAAGAACATCTGTTTCACAGTCTGGGACGT
    GGGAGGCCAGGACAAGATTCGGCCTCTGTGGCGGCACTACTTCCAGAACACTCAGGGCCT
    CATCTTTGTGGTGGACAGTAATGACCGGGAGCGGGTCCAAGAATCTGCTGATGAACTCCA
    GAAGATGCTGCAGGAGGACGAGCTGCGGGATGCAGTGCTGCTGGTATTTGCCAACAAGCA
    GGACATGCCCAACGCCATGCCCGTGAGCGAGCTGACTGACAAGCTGGGGCTACAGCACTT
    ACGCAGCCGCACGTGGTATGTCCAGGCCACCTGTGCCACCCAAGGCACAGGTCTGTACGA
    TGGTCTGGACTGGCTGTCCCACGAGCTGTCAAAGCGCTAACCAGCCAGGGGCAGGCCCCT
    GATGCCCGGAAGCTCCTGCGTGCATCCCCGGATGACCATACTCCCGGACTCCTCAGGCAG
    TGCCCTTTCCTCCCACTTTTCCTCCCCCATAGCCACAGGCCTCTGCTCCTGCTCCTGCCT
    GCATGTTCTCTCTGTTGTTGGAGCCTGGAGCCTTGCTCTCTGGGCACAGAGGGGTCCACT
    CTCCTGCCTGCTGGGACCTATGGAAGGGGCTTCCTGGCCAAGGCCCCCTCTTCCAGAGGA
    GGAGCAGGGATCTGGGTTTCCTTTTTTTTTTCTGTTTTGGGTGTACTCTAGGGGCCAGGT
    TGGGAGGGGGAAGGTGAGGGCTTCGGGTGGTGCTATAATGTGGCACTGGATCTTGAGTAA
    TAAATTTGCTGTGGTTTGAAAAAAAAAAAAAAAAAAAAA
    >Hs.7001_mRNA_1 gi|6808256 emb|AL137727.1|HSM802274 Homo sapiens mRNA; cDNA
    DKFZp434M0519 (from clone DKFZp434M0519); partial cds polyA = 3
    GTGGCGGTGGCTGCGGCGACGGCAGAGGCGAAGGGAGCCGGATCGCCGACCTGAGCGGGA
    GGCGGCGGTGGCGGCCATGGCGGCAGATGGAGAGCGTTCCCCGCTGCTGTCTGAGCCCAT
    CGACGGTGGCGCGGGCGGCAACGGTTTAGTGGGGCCCGGCGGGAGTGGGGCTGGGCCCGG
    GGGAGGCCTGACCCCCTCCGCACCACCGTACGGAGCCGGTAAACATGCCCCGCCCCAGGG
    TAAGCCGGGGCGGGTCCGAGGTGCTCCCCGGGGTACTCTGAAAGCCGGGGAGGGGGCGGG
    ACCGAGGGCGGAGGCGGGTCCCAGTCGCCAGGTGCGGGACTGCTGCACCTGTGACTGGGC
    GAGGCTTCCTTCCCTCCGTAATCGCGACCACAGCCTAGGGACGGAAGGGGGTTCTGAGCA
    ACCTGATAGAAGTGCCAATTATGAGAAGCCCTCCGAGCTTGGTCAGAGGGTTGAAGATCA
    GAAGGACTTCCCTACCACCGTGGAGCATCAGTGGGGGTGTAAGTGATCCCAGCCCTTCTA
    TTTGCTTCCTCTCCAGCATTTCCCCCGTTTCCCGAGGGGCATCCAGCCGTGTTGCCTGGG
    GAGGACCCACCCCCCTATTCACCCTTAACTAGCCCGGACAGTGGGAGTGCCCCTATGATC
    ACCTGCCGAGTCTGCCAATCTCTCATCAACGTGGAAGGCAAGATGCATCAGCATGTAGTC
    AAATGTGGTGTCTGCAATGAAGCCACCGTGAGTTACACATATCTATGAAATGGGCCCTGT
    TTCCTGGATCCTCTTTCTGATGTCTTGGTTCTAGACCCTGACCTTCCGGCTATTAGCCAA
    GTGCTTTTGATGATACCCAGGTTTCAGTTCCAGGTGTCTCACACAGCCATTTCCCCAGAA
    GCCACTCACCAAAGCTAATGTTCACTTTCTCTCACTTTTACACCTAGCCTAGTTCCTATT
    TGCAAATCTCATGATATAGTCTTTCTTTTATTTCTCCTTCCTGGTTAGCACCTTATTTTT
    CTGATCTCATAAAGTGTTTTTGGAGGGAAGTGGAGGGGATTGGGATTAGAGGTTTGCTTG
    CTGATGACCCTATTATTCTCTAGCCAATCAAGAATGCACCCCCAGGGAAAAAATATGTTC
    GATGCCCCTGTAACTGTCTCCTTATCTGCAAAGTGACATCCCAACGGATTGCATGCCCTC
    GTCCCTACTGGTAAGAGGCATAAGGTGGGGAAGGGCCTAAGTGGGGAACTGGAAAGTCAA
    AAAAGGATGAGCGTATACAGAGAATGTAAAGGTGAGAGAGCCTAGTGTTTATTTAGGAGA
    AAAGGCTTTGAAGCATGTGCCTCAGGAATGTTATAGCTGTCTTTCTCGTTTCTCAATAAA
    AATATTGAGATGAAATGATGTCGTTTCGGAGAATAGAGAGCCTTGGGGACTGGGTGTGTT
    ATCCTGAGGTCGGAGGGGAATTGGGGACCTGAAGTTTAAACAGTGCTCTTTCTTTCTCAA
    GGATTCTTGAGGGTATACAGTTGGGGGACAGAGTATCTTAAGTACAGAGAAGTCGAGTGA
    CTTAATAGACAGGGAGTGGGGGATGTGGAACAGGGACTGTGAAGATTTTTAGGATTAAAA
    ATTTTTCAAACACAAGTTTGAAAATACAAGTCTTTTTCTTTTGTATAGCAAAAGAATCAT
    CAACCTGGGGCCTGTGCATCCCGGACCTCTGAGTCCAGAACCCCAACCCATGGGTGTCAG
    GGTTATCTGTGGACATTGCAAGAATACTTTTCTGGTGAGGAAGGGGTATTGGGAAGGGGA
    GGGGAAAGGAGACTAAGAGTCATTTCGAGTATATTTCTTAGAGTAATGGTAATGACCCCT
    GAAAGGTCTGTCCTATGGGAACATGTTCTGCATCCCCACCCCAAGGTTCTCATTGAGGGA
    GACCCTGCTTGTGCTATTATTTTTGTTTTCTTTCTCCATAGTGGACAGAGTTCACAGACC
    GCACTTTGGCACGTTGTCCTCACTGCAGGAAAGTGTCATCTATTGGGCGCAGATACCCAC
    GTAAGAGATGTATCTGCTGCTTCTTGCTTGGCTTGCTTTTGGCAGTCACTGCCACTGGCC
    TTGCCGTGAGTACCCTTGCCCCAACCTCTTTCATTCTGCAGCCTCATCTCCATAGGCTAA
    GATTTGGGAAACTGCTACCCTAAAAAAAAGTGGAAGAAACTTAGGGGACTAGTTTGTTTT
    GTTTTAAGATATGGATGAGCTAAAGTGCAAAGTGGCTGATCAAACAGACTTTATTACTAC
    TACAAGAGTGAAAAACAGCCTTCCTTTCTCTGTAGGATGAGGATAGGACAGTGAAATTCT
    TAATTTAAGAGTTGCTATTTTTCAAACCTGGCTCAGTTGTCAGATATTAAGAAAAACTGA
    GATACAGTGTGGGATGGGATGAGTATGTTACGCCTAAGGGAAGGAAGCTGATCAGCTCTG
    CCTTTAAGAAGGTCCCTGAGGGTGGCTACATGTGGATAAGGAACAAGGACTGAAGCGTGA
    GTTATTACTGTTCTTAGAACTAATAGGAGGTAGTGGAGACCAACATTAACCCCATCTTTC
    TTTTCTTCTCCCTCCTTATCTTCATCAGTTTGGCACATGGAAGCATGCACGGCGATATGG
    AGGCATCTATGCAGCCTGGGCATTTGTCATCCTGTTGGCTGTGCTGTGTTTGGGCCGGGC
    TCTTTATTGGGCCTGTATGAAGGTCAGCCACCCTGTCCAGAACTTCTCCTGAGCCTGATG
    ACCCACAGACTGTGCCTGGCCCCTCCCTGGTGGGGACAGTGACACTACGAAGGGAGCTGG
    GGTAGTTAAAGGCTCCCGGGGCTTCTAGAAGGAAGCCAAGCAGCTGCCTTCCTTTTCCCT
    GGGGAGAGGTAGGAAGGAACCAGGCCCTCACTTAGGTTTGGAGGGGCAGATAAGAGCACT
    GCTGACCATCTGCTTTCCTCCAAGGGTTGCTGTGTCTAGGGTGAAGTAGGCAAAACGTTG
    CCCTTAAAACTGGGCCCTGAAGACGGTTCCAGCCTTGTCCTTCCTGTGTGCTCCCTGAGA
    GCCATTCCTGTCCCTTACACATTCCAGGGCAGGGTGGGGGTGGGTAGCCCTGGGGGTTCC
    CCTCCCTCTTGTGCACCATTAGGACTTTGCTGCTGCTATTGCACTTCACCAGAGGTTGGC
    TCTGGCCTCAGTACCCTCAGTCTCCTCTCCCCACATTGTGTCCTGTGGGGGTGGGGTCAG
    CCGCTGCTCTGTACAGAACCACAGGAACTGATGTGTATATAACTATTTAATGTGGGATAT
    GTTCCCCTATTCCTGTATTTCCCTTAATTCCTCCTCCCGACCTTTTTTACCCCCCCAGTT
    GCAGTATTTAACTGGGCTGGGTAGGGTTGCTCAGTCTTTGGGGGAGGTTAGGGACTTATC
    CTGTGCTTGTAAATAAATAAGGTCATGACTCTAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    AAAAAAA
    >Hs.302144_mRNA_1 gi|11493400|gb|AF130047.1|AF130047 Homo sapiens clone
    FLB3020 polyA = 0
    CTGTCAGCACGGGGCCTGGCATGTAATTGGTCTGCACCCACTGGTGCACTGAACTGCCAT
    AACCTCAGGTTTTCTTTCTTGCTGATACCCCTGGGTCATGTTCTTTGGCAAATAACATGA
    TTCATTATGAAGTAGAGTTCAGCAAAGGACAAGGATGAAAGTTGTCATTTAGAGAACTGC
    CATTCAGACTTTCTTGTCTAGGTAAAGAGCAAGGTCTTCTCTCTTTTCAACTCATTTTCT
    AAATTTAAACTGACGATGAGAATATGGATGATGTGTAGCTTCCTTCTCCCCCACTGATTT
    TTGGTTCAGGCTCTGGGTTTTTGGCAAGAACTTACAGATCTCACTTATTATTGGCCACCC
    TTCTGCTTTAAGACCTGTCAGGGCTTGTCTGAAATAAAACTGGAAGCACTTCTGATTCCA
    TCCTCACTGCTTTCCTCCTTCACCGTCAGACAGCATTACTGTATAGCACTGAGTGAGGGG
    CCCTGACACTGGAAGGTGGCAGGTGGGGCCTGGCCGCCAGTGAGGTATCATCATTTGTGT
    GTGCTCATGTGTGCGTTGGGCTTGTTGTATCTGAGGCATGAACATTCCATATACACGGCT
    TAAAGAGTTTTCTTCCCATACCGAAAGCATATATTCGGAGAGGACCCAACTTATTCAGCA
    TAGCCTTGTTCCCATAGTAGCCATCCTATTCCCCCACAGCCTCTACTTTAGGAAAGCTCC
    CCGTCCCCATATGAAATCCAAACCAAAAAAGATATATCACTTTCAGCTCAATTATTCCAT
    AATTACAAGATATTAGGCTAGTGGGCTCTTTATTGGTTGGGTCTTATATTAATGTTATAT
    GCTAGCCTTGTAATTTTGAGCTCCTCTATGGATGTTAATTTTAGTGAAACTCTATATTGA
    AGAAAAGATGGGACTAAGGGGGAGACAGGAGGAGGAAAGAAAGCAGAGACAGGCAAAGAA
    TCATAGCCTGAAATTCAACAGCAAGCATGGCTTATGAAGATCAAGTTATATTTTTGCTTC
    ATGAATCATTGTCAGACAAATTAAGAACATATTGTTTCTTATTTATCTATTGTCAAGGAT
    TCACTATCAGACACTAAGAATGAATCTTGATTTTCATAAGCTCTGTTGACACCATGGAGC
    CACAGAGCATAAAACTTGCATCTAATAAAGAAAGTGCAACATGGAACAGCAGGGAGTGGA
    ATACCAGCACAACTCACAGCTGCTTCCTGTTCCTCGTCCCTGTTTTCAGGAATGTTTCTT
    AGCAGGAAGTTTTTTAATAGACCGAGAATTTGTTATATGTATTCTAAGAAAAGTTGTAGT
    TGTAGATGCATTACTCTCCCAAATCTTAGAGATCAGGGATGATTATGTTCCATTTTTGTT
    TGGTGAGTTCCCATCTTTGTATGTACCTCCTTGCTCCCGGCTGTCCTCCTCTCCTCTTCC
    CTAGTGAGTGGTTAATGAGTGTTAATGCCTAAACCATACTTGTTTTATGGACACTTCTAT
    AATGGATTCGTTGCATAATTTTCATGCAGTGTATAGTGTTACTAGTTGGAAATTCTTGGA
    GGACTCTTAGCTGTCTGATGAAATTCCTAGTAGAAATTTTTGTTTTGAATTCCTAAAGTT
    GAAATATGAAAATTATATTTTAATTTGATTC
    >Hs.26510_mRNA_2 gi|11345385|gb|AF308803.1|AF308803 Homo sapiens chromosome
    15 map 15q26 polyA = 3
    AGTTTTTCTGGTAGAAGGCGGGGTTCTCCTCGTACGCTGCGGAGTCTCTGCGGGGTGTAG
    ACCGGAATCCTGCTGACGGGCAGAGTGGATCAGGGAGGGAGGGTCGAGACACGGTGGCTG
    CAGGTCTGAGACAAGGCTGCTCCGAGGTAGTAGCTCTCTTGCCTGGAGGTGGCCATTCAT
    TCCTGGAGTGCTGCTGAGGAGCGAGGGCCCATCTGGGGTCTCTGGAAGTCGGTGCCCAGG
    CCTGAAGGATAGCCCCCCTTGCGCTTCCCTGGGCTGCGGCCGGCCTTCTCAGAACGAAGG
    GCGTCCTTCCACCCCGCGGCGCAGGTGACCGCTGCCATGGCTTTTCCCCATCGGCCGGAC
    GCCCCTGAGCTGCCTGACTTCTCCATGCTGAAGAGGCTGGCTCGAGACCAGCTCATCTAT
    CTGCTGGAGCAGCTTCCTGGAAAAAAGGATTTATTCATTGAGGCAGATCTCATGAGCCCT
    TTGGATCGAATTGCCAATGTCTCCATCCTGAAGCAACACGAAGTAGACAAGCTATACAAG
    GTGGAGAACAAGCCAGCCCTCAGCTCCAATGAACAATTGTGCTTCTTGGTCAGACCCCGC
    ATCAAGAATATGCGATACATTGCCAGTCTTGTCAATGCTGACAAATTGGCTGGCCGAACT
    CGCAAATACAAAGTGATCTTCAGCCCTCAAAAGTTCTATGCGTGTGAGATGGTGCTTGAG
    GAAGAGGGAATCTATGGAGATGTGAGCTGTGATGAATGGGCCTTCTCTTTGCTGCCTCTT
    GATGTGGATCTGCTGAGCATGGAACTACCAGAATTTTTCAGGGATTACTTTCTGGAAGGA
    GATCAGCGTTGGATCAACACTGTAGCTCAGGCCTTACACCTTCTCAGCACTCTCTATGGA
    CCCTTTCCAAACTGCTATGGAATTGGCAGGTGCGCCAAGATGGCATATGAATTGTGGAGG
    AACCTGGAGGAGGAGGAGGATGGCGAAACCAAGGGCCGAAGGCCAGAGATTGGACATATC
    TTTCTCTTGGACAGAGATGTGGACTTTGTGACAGCACTTTGCTCCCAAGTGGTTTATGAG
    GGCCTAGTAGATGACACCTTCCGCATCAAGTGTGGGAGTGTCGACTTTGGCCCAGAAGTC
    ACATCCTCTGACAAGAGCCTGAAGGTGCTACTCAATGCCGAGGACAAGGTGTTTAATGAG
    ATTCGGAACGAGCACTTCTCCAATGTCTTTGGCTTCTTGAGCCAGAAGGCCCGGAACTTG
    CAGGCCCAGTATGATCGCCGGAGAGGCATGGACATTAAGCAGATGAAGAATTTCGTGTCC
    CAGGAGCTCAAGGGCCTGAAACAGGAGCACCGCCTGCTGAGTCTCCATATTGGGGCCTGT
    GAATCCATCATGAAGAAGAAAACCAAGCAGGATTTCCAGGAGCTAATCAAGACTGAGCAT
    GCACTGCTAGAGGGGTTCAACATCCGGGAGAGCACCAGCTACATTGAGGAACACATAGAC
    CGGCAGGTGTCGCCTATAGAAAGCCTGCGCCTCATGTGCCTTTTGTCCATCACTGAGAAT
    GGTTTGATCCCCAAGGATTACCGATCTCTGAAAACACAGTATCTGCAGAGCTATGGCCCT
    GAGCACCTGCTAACCTTCTCCAATCTGCGAAGAGCTGGGCTCCTAACGGAGCAGGCCCCC
    GGGGACACCCTCACAGCCGTGGAGAGTAAAGTGAGCAAGCTGGTGACCGACAAGGCTGCA
    GGAAAGATTACTGATGCCTTCAGTTCTCTGGCCAAGAGGAGCAATTTTCGTGCCATCAGC
    AAAAAGCTGAATTTGATCCCACGTGTGGACGGCGAGTATGATCTGAAAGTGCCCCGAGAC
    ATGGCTTACGTCTTCAGTGGTGCTTATGTGCCCCTGAGCTGCCGAATCATTGAGCAGGTG
    CTAGAGCGGCGAAGCTGGCAGGGCCTTGATGAGGTGGTACGGCTGCTCAACTGCAGTGAC
    TTTGCATTCACAGATATGACTAAGGAAGACAAGGCTTCCAGTGAGTCCCTGCGCCTCATC
    TTGGTGGTGTTCTTGGGTGGTTGTACATTCTCTGAGATCTCAGCCCTCCGGTTCCTGGGC
    AGAGAGAAAGGCTACAGGTTCATTTTCCTGACGACAGCAGTCACAAACAGCGCTCGCCTT
    ATGGAGGCCATGAGTGAGGTGAAAGCCTGATGTTTTTCCCGGCCAGTGTTGACATCTTCC
    CTGAACACATTCCTCAGTGAGATGCAGGCATCTGGCACCCAGCTGCTATAACCAAGTGTC
    CACCAACTACCTGCTAAGAGCCGGGAGCATGGAACGTGTTGGGATTTAGAGAACATTATC
    TGAGAAAAGAGTTCACTTCCTGCTCCCAGGATATTTCTCTTTTCTGTTTATGAAGTACAA
    CCCATGCTGCTAAGATGCGAGCAGGAAGAGGCATCCTTTGCTAAATCCTGTTTGAATGTC
    ATTGTAAATAAAGCCTCTGCTCTCAGATGTAAAAAAAAAAAAAAAAAAAAA
    >Hs.324709_mRNA_2 gi|12655026|gb|BC001361.1|BC001361 Homo sapiens clone
    MGC:2474 IMAGE:3050694 polyA = 2
    GGCACGAGGGGTCGCGCTGCCGCCGTTTTATTTGAAGACATCGTCCAGTTCTGACCATGG
    ACTCGCAGCCATCGGCCCTTAGTTTCCATCCCCTCTAGTGGGCCTTCGGGGGCTCTACTG
    ACGTCCCTCCTTCCCTTGGTACCGGGCCGGGGAAGTGTTCTCGGGCGCGGGAGGTTCCGC
    ATGCCCAGGCCTGGCCAGGGGAGATGACCGATCCGTCGCTGGGGCTGACAGTCCCCATGG
    CGCCGCCTCTGGCCCCGCTCCCTCCCCGGGACCCAAACGGGGCGGGATCCGAGTGGAGAA
    AGCCCGGGGCCGTGAGCTTCGCCGACGTGGCCGTGTACTTCTCCCGGGAGGAGTGGGGCT
    GCCTGCGGCCCGCGCAGAGGGCCCTGTACCGGGACGTGATGCGGGAGACCTACGGCCACC
    TGGGCGCGCTCGGTGAGAGCCCCACCTGCTTGCCTGGGCCCTGCGCCTCCACAGGCCCTG
    CCGCGCCTCTGGGAGCTGCGTGTGGAGTTGGGGGCCCCGGGGCCGGGCAGGCGGCCTCCT
    CGCAGCGTGGGGTTTGCGTTCTTCTCCCCCAGGAGTCGGAGGCAGCAAGCCGGCGCTCAT
    CTCCTGGGTGGAGGAGAAGGCCGAACTGTGGGATCCGGCTGCCCAGGATCCGGAGGTGGC
    GAAGTGTCCGACAGAAGCGGACCCAGCAGATTCCAGAAACAAGGAAGAGGAAAGACAAAG
    GGAAGGGACGGGAGCCCTGGAGAAGCCCGACCCTGTGGCCGCCGGGTCTCCTGGGCTGAA
    GGCTCCCCAAGCCCCCTTTGCCGGGTTGGAGCAGCTGTCCAAGGCCCGGCGCCGGAGTCG
    CCCCCGCTTTTTTGCCCACCCCCCTGTCCCCCGAGCTGACCAGCGTCACGGCTGCTACGT
    GTGCGGGAAGAGCTTCGCCTGGCGCTCCACACTGGTGGAGCACATTTACAGCCACAGGGG
    CGAGAAGCCCTTCCACTGCGCAGACTGCGGCAAGGGCTTCGGCCACGCTTCCTCCCTGAG
    CAAACACCGGGCCATCCATCGTGGGGAGCGGCCCCACCGCTGTCCCGAGTGTGGTCGGGC
    CTTCATGCGCCGCACGGCGCTGACTTCTCACCTGCGCGTTCACACTGGCGAGAAGCCCTA
    CCGCTGCCCGCAGTGTGGCCGCTGCTTCGGCCTGAAGACCGGCATGGCCAAGCACCAATG
    GGTCCATCGGCCCGGGGGCGAGGGGCGTAGGGGCCGGCGCCCTGGGGGGCTGTCTGTGAC
    CCTGACTCCTGTCCGCGGGGACCTGGACCCGCCTGTGGGCTTCCAGCTGTATCCAGAGAT
    ATTCCAGGAATGTGGGTGACGGCCTAAAAAGTGACCATCTAGACATTGTGGGCGGCCCGA
    GATGGGCTCAGGGGCCCGAACCTCTGCAGCGGCCTGCAGGGAGGTCCCAGAATCCACCGC
    AAGAGCTGGCCTGGGGTGCGGACAGTCTGATCTTGGGCTCTCAGCAGCCTCTTCTGCCAG
    CACCTTGCTCCCCGCTGCCCTGGGCTCTCCAAGGCCCCCTTTGCTGAGGCAGGGCTGAGG
    TGAGAACCCCCCAGACCTCCATACAGGGAAGCAAAAGCTGTTTCTCCTCCCAGAGATGCT
    AAGAGGATTGAGGTAGAGAAGAACCTTGTTTTCTCTGTTGTCTTTTTCTTTTTACTTTTT
    TAATTTTTTGAGACGGAGTTTTGCTCTTGTTGCCCAGGCTGGAGTGCAATGGTGCGATCT
    CGACTCACTGCAACTTCCACCTCCTGGAGTCAAGCGATTCTCCTGCCTCAGCCACCCAAG
    TAGCTGGAATTACAGGCACCTGCCACTATGCCCGGCTAACTTTTTGTATTTTTAGTAGAG
    ATGGGGTTTCACCATGTTGGCTAGGCTGGTCTCGAACTCCTGCCCTCAGGTGATCCACCC
    ACCTCTGCCTCCCAAAGTGCTGGGATTACAGGCGTGAGCCACCTCACCTGGCCTTTTCTT
    TTTTATTCTTTGACCTTCCCACAAGACAATACCCATTGTCTGTTTTTTTTGTTTATTTAT
    TTACTTATTAAGACAGCATCTTGCTCCTCACCCAGGCTGGAATGCAGTGGTGTGAACTGG
    GCTCACTGCAGCCTAGACCTGCTGGGCTCAAGGAATCCTCCTGCCCCAGCCTCTCAGATG
    GCTGTGACTACAGGTGGGCAACACTATGCCTGGTTAATTTTTAAATTTTTTTGCAGAGAT
    GGGGTTCCCACTATGTTGATCAGGCTGGTCTCAAACTCCTCGGTTCAAGCAATTCGCCCA
    CCTTGGCCTCCCAAAGTGCTGGGATTACAGGGGAGCCACTGCACTGGCCTTCATTGTCTT
    TTTGCTGCACAACCTAAAAAACCAGTGACCCTGTATTGGAAAAAAAAAAAAAAAAAAAAA
    A
    >Hs.65756_mRNA_3 gi|3641494|gb|AF035154.1|AF035154 Homo sapiens chromosome
    16 map 16p13.3 polyA = 3
    GCCATGGCCGCCGGCCCCGCGCCGCCCCCCGGCCGCCCCCGGGCGCAGATGCCGCATCTG
    AGGAAGGTGCGAGGCGGATGGAGCGGGTGGTCGTGAGCATGCAGGACCCCGACCAGGGCG
    TGAAGATGCGGAGCCAGCGCCTGCTGGTCACCGTCATTCCCCACGCGGTGACAGGCAGCG
    ACGTCGTGCAGTGGTTGGCCCAGAAGTTCTGCGTCTCGGAGGAGGAGGCCCTGCACCTGG
    GCGCCGTCCTGGTGCAGCATGGCTACATCTACCCGCTGCGCGACCCCCGTAGCCTCATGC
    TCCGGCCAGACGAGACGCCCTACAGGTTCCAGACCCCGTACTTCTGGACAAGTACCCTGA
    GGCCGGCTGCAGAGCTGGACTATGCCATCTACCTGGCCAAGAAGAACATCCGAAAACGGG
    GGACCCTGGTGGATTATGAGAAGGACTGCTATGACCGGCTACACAAGAAGATCAACCACG
    CATGGGACCTGGTGCTGATGCAGGCGAGGGAGCAGCTGAGGGCAGCCAAGCAGCGCAGCA
    AGGGGGACAGGCTGGTCATTGCGTGCCAGGAGCAGACCTACTGGCTGGTGAACAGGCCCC
    CGCCCGGGGCCCCCGATGTGCTGGAGCAGGGTCCAGGGCGGGGATCCTGCGCTGCCAGCC
    GTGTGCTCATGACCAAGAGTGCAGATTTCCATAAGCGGGAGATCGAGTACTTCAGGAAAG
    CGCTGGGCAGGACCCGAGTGAAGTCCTCCGTCTGCCTTGAGGCGTACCTGAGTTTCTGCG
    GCCAGCGTGGACCCCACGATCCCCTCGTGTCGGGGTGCCTGCCCAGCAATCCCTGGATCT
    CAGACAATGACGCCTACTGGGTCATGAATGCCCCCACGGTGGCTGCCCCCACGAAGCTCC
    GTGTGGAGAGATGGGGCTTCAGCTTCCGGGAGCTCCTGGAGGACCCCGTGGGGCGGGCCC
    ACTTCATGGACTTTCTGGGAAAGGAGTTCAGTGGAGAAAACCTCAGCTTCTGGGAGGCAT
    GTGAGGAGCTTCGATATGGAGCGCAGGCCCAGGTCCCCACCCTGGTGGATGCCGTGTACG
    AGCAGTTCCTGGCCCCCGGAGCTGCCCACTGGGTCAACATCGACAGCCGGACCATGGAGC
    AGACCCTGGAGGGGCTGCGCCAGCCCCACCGCTATGTCCTGGATGACGCCCAGCTGCACA
    TATACATGCTCATGAAGAAGGACTCCTACCCAAGGTTCCTGAAGTCTGACATGTACAAGG
    CCCTCCTGGCAGAGGCTGGGATCCCGCTGGAGATGAAGAGACGCGTGTTCCCGTTTACGT
    GGAGGCCACGGCACTCGAGCCCCAGCCCTGCACTCCTTCCCACCCCTGTGGAGCCCACAG
    CGGCTTGTGGCCCTGGGGGTGGAGATGGGGTGGCCTAGTGGACCTGGCCCATCTGCCACT
    CTAGTCCCTGCAGCTCAACGTCCTGCGTGAATGCAGCAGCCACCCCCGTCTTGGCCCAGG
    TCCTGGGGGCTGCTGAACCCAGCACCAGTGTCCCCTTGTGCCCAGGGGGCCCAGTCTTCT
    GTGGGGTGCACAGCCTCCCTCCCTCCAGCAAGCCCTCCCTGCCCAGAAGGAATGGGTCCA
    GGTGTGGATTCCCAGGGAGGGGGTTCATTGGCTCAGCTTGGGTCAGGGCAGAGCCTGTTA
    CCTGAAGAGAGGTGAGACCAAGGCCACAGGGAGCTCCACCTTCTCTGGTCTTCAGTCCAG
    CACTGGGTGCCCATCCCCATCTCTAAAACCAGTAAATCAGCCAGCGAATACCCGGAAGCA
    AGATGCACAGGCGGGCGGCTTCCCACACACCCGTCACAAGACGCGGACATGCAGGTCTCG
    GCGCGAGCTCTGCCCCGTCCAAGAGCCTCTCCGCTGTCGCCCAGTGTGAGCCTGGAAGAG
    GACCCAAGAGAGTGCCGTGCTGAGGCTGCCTCGAGGTCACTGCCTTCCGGAGCTGCGCCT
    ATTCCTCCCTCGCCAAACGCGTTCCAGAATTTGTCCACAGGTGCGCCGGCACCTGCTTTC
    CCACCTCGAGGCCGCGGCCTCCCCCCCGATTTATAGACAACTCTGACATTGTCACCCCAC
    TGACGAGGCCCGATTCCATAGGGTGGATCCTTGCCAGGCGTCCCTGATCCTCCCTGCCCA
    AGTCTTCCTTCGTGAGCTGGCCTTGCTCCCCATCCCCCAAGTGCCTCACCAGTCCCCCAG
    ACTGGGTGAAGGTACAGCTGGCTCCTTTCGGGGGTGCAGCTTCAACTCTCTCGGCGGTAG
    GGCGGTGCCATCCCCACCCATAGGGCTGGCTCACATCCAGTCACTCCCAACAGCGTCCAG
    CACACAAATAAAAGACCCTTGGGCCCTGGCTCTGAGAAAAAAAA
    >Hs.165743_mRNA_2 gi|13543889|gb|BC006091.1|BC006091 Homo sapiens clone
    MGC:12673 IMAGE:3677524 polyA = 3
    AGACTGCCGAGCAGCCTTGAGCCGTTGAGCAGCTGAACAGAGGCCATGCCGGGGCACTCC
    GAGGCCTGAGACGACCACGCCTGTGCCGCTGAGGACCTTCATCAGGGCTCCGTCCACTTG
    GCCCGCTTGGCTGTCCAATCACACTCCAGTGTCAACCACTGGCACCCAGCAGCCAAGAGA
    GGTGTGGCGTGGCCCTGGGGACGCATGGCTGAGGCAGGAACAGGTGAGCCGTCCCCCAGC
    GTGGAGGGCGAACACGGGACGGAGTATGACACGCTGCCTTCCGACACAGTCTCCCTCAGT
    GACTCGGACTCTGACCTCAGCTTGCCCGGTGGTGCTGAAGTGGAAGCACTGTCCCCGATG
    GGGCTGCCTGGGGAGGAGGATTCAGGTCCTGATGAGCCGCCCTCACCCCCGTCAGGCCTC
    CTCCCAGCCACGGTGCAGCCATTCCATCTGAGAGGCATGAGCTCCACCTTCTCCCAGCGC
    AGCCGTGACATCTTTGACTGCCTGGAGGGGGCGGCCAGACGGGCTCCATCCTCTGTGGCC
    CACACCAGCATGAGTGACAACGGAGGCTTCAAGCGGCCCCTAGCGCCCTCAGGCCGGTCT
    CCAGTGGAAGGCCTGGGCAGGGCCCATCGGAGCCCTGCCTCACCAAGGGTGCCTCCGGTC
    CCCGACTACGTGGCACACCCCGAGCGCTGGACCAAGTACAGCCTGGAAGATGTGACCGAG
    GTCAGCGAGCAGAGCAATCAGGCCACCGCCCTGGCCTTCCTGGGCTCCCAGAGCCTGGCT
    GCCCCCACTGACTGCGTGTCCTCCTTCAACCAGGATCCCTCCAGCTGTGGGGAGGGGAGG
    GTCATCTTCACCAAACCAGTCCGAGGGGTCGAAGCCAGACACGAGAGGAAGAGGGTCCTG
    GGGAAGGTGGGAGAGCCAGGCAGGGGCGGCCTTGGGAATCCTGCCACAGACAGGGGCGAG
    GGCCCTGTGGAGCTGGCCCATCTGGCCGGGCCCGGGAGCCCAGAGGCTGAGGAGTGGGGC
    AGCCCCCATGGAGGCCTGCAGGAGGTGGAGGCACTGTCAGGGTCTGTCCACAGTGGGTCT
    GTGCCAGGTCTCCCGCCGGTGGAAACTGTTGGCTTCCATGGCAGCAGGAAGCGGAGTCGA
    GACCACTTCCGGAACAAGAGCAGCAGCCCCGAGGACCCAGGTGCTGAGGTCTGAGAGGGA
    GATGGCCCAGCCTGACCCCACTGGCCACTGCCATCCTGCTGCCTTCCCAGTGGGGCTGGT
    CAGGGGGCAGCCTGGCCACTGCCTAGCTGGAATGGGAGGAAGCCTGCAGGTGGCACCGGT
    GGCCCTGGCTGCAGTTCTGGGCAGCATCCTCCCAAGCAGAGACCTTGCTGAAGCTCCTGG
    GGTGTGGGGTGTGGGCTGGAAGCACTGGCTCCCTGGTAGGGACAATAAAGGTTTTGGGTC
    TTTCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAC
  • All references cited herein, including patents, patent applications, and publications, are hereby incorporated by reference in their entireties, whether previously specifically incorporated or not.
  • Having now fully described this invention, it will be appreciated by those skilled in the art that the same can be performed within a wide range of equivalent parameters, concentrations, and conditions without departing from the spirit and scope of the invention and without undue experimentation.
  • While this invention has been described in connection with specific embodiments thereof, it will be understood that it is capable of further modifications. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains and as may be applied to the essential features hereinbefore set forth.

Claims (8)

1-20. (canceled)
21. A method of classifying a tumor cell-containing sample obtained from a human subject based on a tumor type or origin, wherein the tumor type or origin is selected from a plurality of known tumor types or origins, the method comprising:
amplifying five to 49 transcribed sequences,
wherein each transcribed sequence is a unique portion of one of SEQ ID NOS: 1-74 or a complement thereof,
wherein the unique portion is unique relative to other sequences expressed in the tumor cell-containing sample, and
wherein the amplification is of at least 50 nucleotides of the transcribed sequences;
determining the expression levels of the transcribed sequences and normalizing the expression levels to one or more reference genes;
comparing the normalized expression levels of the transcribed sequences from the tumor cell-containing sample to normalized expression levels of the same transcribed sequences from at least ten known tumor types or origins of a plurality of known tumor types or origins,
wherein the plurality of known tumor types or origins comprises adrenal gland, brain, breast, carcinoid-intestine, cervix-adenocarcinoma, cervix-squamous, endometrium, gall bladder, germ cell-ovary, GIST, kidney, leiomyosarcoma, liver, lung-adenocarcinoma-large cell, lung-small cell, lung-squamous, lymphoma-B cell, lymphoma-Hodgkin's, lymphoma-T cell, meningioma, mesothelioma, osteosarcoma, ovary-clear cell, ovary-serous, pancreas, prostate, skin-basal cell, skin-melanoma, skin-squamous, small and large bowel, soft tissue-liposarcoma, soft tissue-MFH, soft tissue-sarcoma-synovial, stomach-adenocarcinoma, testis-non-seminoma, testis-seminoma, thyroid-follicular-papillary, thyroid-medullary, and urinary bladder,
determining five nearest neighbors by determining five of the at least ten known tumor types or origins that have the most similar expression levels compared to the expression levels of the tumor cell-containing sample; and
a) if at least four of the five nearest neighbors share a tumor type or origin, classifying the tumor cell-containing sample as containing tumor cells of the tumor type or origin shared by the at least four of the five nearest neighbors; and
b) if fewer than four of the five nearest neighbors share a tumor type or origin, classifying the tumor cell-containing sample as containing a non-squamous cell tumor.
22. The method of claim 21, wherein the expression levels are determined by use of a microarray and the method further comprises hybridizing the amplified transcribed sequences to the microarray.
23. The method of claim 21, wherein the amplification comprises reverse transcription PCR, quantitative PCR, or real time PCR.
24. The method of claim 21, wherein the amplification comprises linear RNA amplification or quantitative PCR.
25. The method of claim 23, wherein the amplification is of sequences present within 750 nucleotides of the polyadenylation sites of the transcribed sequences.
26. The method of claim 21, wherein the tumor cell-containing sample is a formalin fixed, paraffin embedded sample.
27. The method of claim 21, further comprising, before the determining of the expression levels of the transcribed sequences,
diagnosing the human subject as in need of the determining; or
obtaining the tumor cell-containing sample from the human subject; or
receiving the tumor cell-containing sample; or
sectioning the tumor cell-containing sample; or
isolating cells from the tumor cell-containing sample; or
obtaining RNA from cells of the tumor cell-containing sample.
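The normalization and five-nearest-neighbor steps recited in claim 21 can be illustrated with a minimal Python sketch. It is not the applicants' implementation. The claims only require "the most similar expression levels", so the Euclidean distance used here is an assumption, as is normalizing by subtracting the mean reference-gene level (which presumes log-scale or Ct-like values). The function names `normalize`, `distance`, and `classify_sample`, and all gene labels, are hypothetical.

```python
import math
from collections import Counter


def normalize(expression, reference_genes):
    """Normalize target genes against the mean of the reference genes.

    Assumes log-scale (or Ct-like) values, so normalization is a subtraction.
    Reference genes are dropped from the returned profile.
    """
    ref_mean = sum(expression[g] for g in reference_genes) / len(reference_genes)
    return {
        gene: level - ref_mean
        for gene, level in expression.items()
        if gene not in reference_genes
    }


def distance(profile_a, profile_b):
    """Euclidean distance over the genes of profile_a (assumed similarity measure)."""
    return math.sqrt(
        sum((profile_a[g] - profile_b[g]) ** 2 for g in profile_a)
    )


def classify_sample(sample, references, fallback, k=5, min_agree=4):
    """Apply the 'at least four of five nearest neighbors' rule.

    sample:     normalized profile of the unknown sample (dict gene -> level)
    references: list of (tumor_type, normalized_profile) pairs
    fallback:   label returned when fewer than min_agree neighbors agree
    Returns (label, neighbor_labels).
    """
    ranked = sorted(references, key=lambda ref: distance(sample, ref[1]))
    neighbor_labels = [tumor_type for tumor_type, _ in ranked[:k]]
    top_label, votes = Counter(neighbor_labels).most_common(1)[0]
    label = top_label if votes >= min_agree else fallback
    return label, neighbor_labels
```

In use, each reference profile and the unknown sample would first be passed through `normalize` with the same reference genes, and `classify_sample` would then be called with the fallback label recited in step b) of claim 21. The values of `k` and `min_agree` default to the five and four recited in the claim.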
US17/854,909 2005-06-03 2022-06-30 Identification of tumors and tissues Pending US20230023867A1 (en)

Priority Applications (1)

Application Number Publication Number Priority Date Filing Date Title
US17/854,909 US20230023867A1 (en) 2005-06-03 2022-06-30 Identification of tumors and tissues

Applications Claiming Priority (4)

Application Number Publication Number Priority Date Filing Date Title
US68717405P 2005-06-03 2005-06-03
US11/422,056 US20070020655A1 (en) 2005-06-03 2006-06-02 Identification of Tumors and Tissues
US15/368,446 US11430544B2 (en) 2005-06-03 2016-12-02 Identification of tumors and tissues
US17/854,909 US20230023867A1 (en) 2005-06-03 2022-06-30 Identification of tumors and tissues

Related Parent Applications (1)

Application Number Relation Publication Number Priority Date Filing Date Title
US15/368,446 Continuation US11430544B2 (en) 2005-06-03 2016-12-02 Identification of tumors and tissues

Publications (1)

Publication Number Publication Date
US20230023867A1 (en) 2023-01-26

Family

ID=37309082

Family Applications (3)

Application Number Status Publication Number Priority Date Filing Date Title
US11/422,056 Abandoned US20070020655A1 (en) 2004-06-04 2006-06-02 Identification of Tumors and Tissues
US15/368,446 Active 2029-08-09 US11430544B2 (en) 2005-06-03 2016-12-02 Identification of tumors and tissues
US17/854,909 Pending US20230023867A1 (en) 2005-06-03 2022-06-30 Identification of tumors and tissues

Family Applications Before (2)

Application Number Status Publication Number Priority Date Filing Date Title
US11/422,056 Abandoned US20070020655A1 (en) 2004-06-04 2006-06-02 Identification of Tumors and Tissues
US15/368,446 Active 2029-08-09 US11430544B2 (en) 2005-06-03 2016-12-02 Identification of tumors and tissues

Country Status (10)

Country Link
US (3) US20070020655A1 (en)
EP (2) EP1899484B1 (en)
JP (1) JP2008545431A (en)
KR (1) KR20080104113A (en)
CN (1) CN101297045A (en)
AU (1) AU2006255282A1 (en)
CA (1) CA2610752A1 (en)
DK (1) DK1899484T3 (en)
ES (1) ES2550652T3 (en)
WO (1) WO2006132971A2 (en)

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030198972A1 (en) 2001-12-21 2003-10-23 Erlander Mark G. Grading of breast cancer
US9856533B2 (en) 2003-09-19 2018-01-02 Biotheranostics, Inc. Predicting breast cancer treatment outcome
US20120258442A1 (en) 2011-04-09 2012-10-11 Biotheranostics, Inc. Determining tumor origin
US20060094035A1 (en) * 2004-06-04 2006-05-04 Arcturus Bioscience, Inc. Identification of tumors
DK1899484T3 (en) 2005-06-03 2015-11-23 Biotheranostics Inc Identification of tumors and tissues
US8535677B2 (en) 2006-06-06 2013-09-17 Oxford Biotherapeutics, Ltd. Antibody drug conjugate treatment of colorectal cancer
CN102395682B (en) * 2007-09-06 2015-09-30 生物治疗诊断股份有限公司 Tumor grade and cancer prognosis
WO2010051314A2 (en) * 2008-10-31 2010-05-06 Abbott Laboratories Genomic classification of non-small cell lung carcinoma based on patterns of gene copy number alterations
EP2359284A2 (en) * 2008-10-31 2011-08-24 Abbott Laboratories Method for genomic classification of malignant melanoma based on patterns of gene copy number alterations
US10236078B2 (en) 2008-11-17 2019-03-19 Veracyte, Inc. Methods for processing or analyzing a sample of thyroid tissue
US9495515B1 (en) 2009-12-09 2016-11-15 Veracyte, Inc. Algorithms for disease diagnostics
US9074258B2 (en) 2009-03-04 2015-07-07 Genomedx Biosciences Inc. Compositions and methods for classifying thyroid nodule disease
SG10201401604VA (en) 2009-04-20 2014-08-28 Oxford Biotherapeutics Ltd Antibodies Specific To Cadherin-17
US8669057B2 (en) * 2009-05-07 2014-03-11 Veracyte, Inc. Methods and compositions for diagnosis of thyroid conditions
US10446272B2 (en) 2009-12-09 2019-10-15 Veracyte, Inc. Methods and compositions for classification of samples
EP2648762B1 (en) 2010-12-09 2018-02-21 Biotheranostics, Inc. Post-treatment breast cancer prognosis
WO2013002750A2 (en) * 2011-06-29 2013-01-03 Biotheranostics, Inc. Determining tumor origin
JP2015514710A (en) * 2012-03-27 2015-05-21 ジェネンテック, インコーポレイテッド Diagnosis and treatment of HER3 inhibitors
US9158965B2 (en) * 2012-06-14 2015-10-13 The Board Of Trustees Of The Leland Stanford Junior University Method and system for optimizing accuracy-specificity trade-offs in large scale visual recognition
CA2905620A1 (en) 2013-03-15 2014-10-02 Biotheranostics, Inc. Neuroendocrine tumors
WO2015072200A1 (en) * 2013-11-15 2015-05-21 ソニー株式会社 Information processing device, information processing method, and program
US20170335396A1 (en) 2014-11-05 2017-11-23 Veracyte, Inc. Systems and methods of diagnosing idiopathic pulmonary fibrosis on transbronchial biopsies using machine learning and high dimensional transcriptional data
EP3374526A4 (en) 2015-11-13 2019-09-11 Biotheranostics, Inc. Integration of tumor characteristics with breast cancer index
CN106897581B * 2017-01-25 2018-01-26 人和未来生物科技(长沙)有限公司 A reconfigurable heterogeneous platform for gene data interpretation
US11217329B1 (en) 2017-06-23 2022-01-04 Veracyte, Inc. Methods and systems for determining biological sample integrity

Family Cites Families (45)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5871697A (en) 1995-10-24 1999-02-16 Curagen Corporation Method and apparatus for identifying, classifying, or quantifying DNA sequences in a sample without sequencing
US20020172965A1 (en) * 1996-12-13 2002-11-21 Arcaris, Inc. Methods for measuring relative amounts of nucleic acids in a complex mixture and retrieval of specific sequences therefrom
US6285701B1 (en) 1998-08-06 2001-09-04 Lambda Physik Ag Laser resonator for improving narrow band emission of an excimer laser
CA2350502C (en) 1998-11-13 2009-01-27 Pro Duct Health, Inc. Devices and methods to identify ductal orifices during nipple aspiration
US20040241728A1 (en) * 1999-01-06 2004-12-02 Chondrogene Limited Method for the detection of lung disease related gene transcripts in blood
US6647341B1 (en) 1999-04-09 2003-11-11 Whitehead Institute For Biomedical Research Methods for classifying samples and ascertaining previously unknown classes
EP1358349A2 (en) 2000-06-05 2003-11-05 Avalon Pharmaceuticals Cancer gene determination and therapeutic screening using signature gene sets
AU2001292802A1 (en) 2000-09-19 2002-04-02 Dana-Farber Cancer Institute Inc. Genetic markers for tumors
AU2002228000A1 (en) 2000-12-07 2002-06-18 Europroteome Ag Expert system for classification and prediction of genetic diseases
US6794141B2 (en) 2000-12-22 2004-09-21 Arcturus Bioscience, Inc. Nucleic acid amplification
JP4222835B2 (en) 2001-03-14 2009-02-12 株式会社Dnaチップ研究所 Cancer prediction methods
WO2002074979A2 (en) 2001-03-20 2002-09-26 Ortho-Clinical Diagnostics, Inc. Expression profiles and methods of use
WO2002101357A2 (en) 2001-06-10 2002-12-19 Irm Llc Molecular signatures of commonly fatal carcinomas
US7514209B2 (en) 2001-06-18 2009-04-07 Rosetta Inpharmatics Llc Diagnosis and prognosis of breast cancer patients
US7504222B2 (en) 2001-10-31 2009-03-17 Millennium Pharmaceuticals, Inc. Compositions, kits, and methods for identification, assessment, prevention, and therapy of breast cancer
WO2003041562A2 (en) * 2001-11-14 2003-05-22 Whitehead Institute For Biomedical Research Molecular cancer diagnosis using tumor gene expression signature
US20030198972A1 (en) * 2001-12-21 2003-10-23 Erlander Mark G. Grading of breast cancer
US20040002067A1 (en) 2001-12-21 2004-01-01 Erlander Mark G. Breast cancer progression signatures
EP2258872B1 (en) 2002-03-13 2013-08-14 Genomic Health, Inc. Gene expression profiling in biopsied tumor tissues
US20040063120A1 (en) * 2002-07-10 2004-04-01 The Regents Of The University Of Michigan Expression profile of lung cancer
US20040098367A1 (en) * 2002-08-06 2004-05-20 Whitehead Institute For Biomedical Research Across platform and multiple dataset molecular classification
US7364846B2 (en) 2002-10-11 2008-04-29 Molecular Devices Corporation Gene expression profiling from FFPE samples
AU2003298786A1 (en) 2002-11-26 2004-06-18 Protein Design Labs, Inc. Methods of detecting soft tissue sarcoma, compositions and methods of screening for soft tissue sarcoma modulators
US20050143334A1 (en) * 2002-12-20 2005-06-30 David Tarin Genetic markers and methods for the diagnosis, treatment and prevention of tumor metastasis
JP2006519620A (en) 2003-03-04 2006-08-31 アークチュラス バイオサイエンス,インコーポレイティド ER status discrimination characteristics in breast cancer
EP1608964A4 (en) 2003-03-14 2009-07-15 Peter Maccallum Cancer Inst Expression profiling of tumours
US20050003341A1 (en) 2003-07-01 2005-01-06 Hanan Polansky Drug discovery assays based on the biology of atherosclerosis, cancer, and alopecia
US9856533B2 (en) 2003-09-19 2018-01-02 Biotheranostics, Inc. Predicting breast cancer treatment outcome
US7504214B2 (en) 2003-09-19 2009-03-17 Biotheranostics, Inc. Predicting outcome with tamoxifen in breast cancer
WO2005059109A2 (en) * 2003-12-15 2005-06-30 The Regents Of The University Of California Molecular signature of the pten tumor suppressor
WO2005068664A2 (en) * 2004-01-09 2005-07-28 The Regents Of The University Of California Cell-type-specific patterns of gene expression
US20050272061A1 (en) 2004-02-19 2005-12-08 Seattle Genetics, Inc. Expression profiling in non-small cell lung cancer
US20120258442A1 (en) * 2011-04-09 2012-10-11 Biotheranostics, Inc. Determining tumor origin
US20060094035A1 (en) 2004-06-04 2006-05-04 Arcturus Bioscience, Inc. Identification of tumors
JP2008528024A (en) 2005-01-31 2008-07-31 ディジタルジェノミクスインコーポレーション Marker gene for lung cancer diagnosis
DK1899484T3 (en) * 2005-06-03 2015-11-23 Biotheranostics Inc Identification of tumors and tissues
WO2007137366A1 (en) 2006-05-31 2007-12-06 Telethon Institute For Child Health Research Diagnostic and prognostic indicators of cancer
WO2008116592A1 (en) 2007-03-23 2008-10-02 F. Hoffmann-La Roche Ag Apex as a marker for lung cancer
US20100273172A1 (en) 2007-03-27 2010-10-28 Rosetta Genomics Ltd. Micrornas expression signature for determination of tumors origin
EP2132327A2 (en) 2007-03-27 2009-12-16 Rosetta Genomics Ltd Gene expression signature for classification of cancers
CN102395682B (en) * 2007-09-06 2015-09-30 生物治疗诊断股份有限公司 Tumor grade and cancer prognosis
EP2203569A2 (en) 2007-10-31 2010-07-07 Rosetta Genomics Ltd Diagnosis and prognosis of specific cancers by means of differential detection of micro-rnas / mirnas
WO2009153775A2 (en) 2008-06-17 2009-12-23 Rosetta Genomics Ltd. Methods for distinguishing between specific types of lung cancers
GB0904957D0 (en) 2009-03-23 2009-05-06 Univ Erasmus Medical Ct Tumour gene profile
WO2013002750A2 (en) 2011-06-29 2013-01-03 Biotheranostics, Inc. Determining tumor origin

Also Published As

Publication number Publication date
EP2365092A1 (en) 2011-09-14
WO2006132971A3 (en) 2007-03-29
US20070020655A1 (en) 2007-01-25
ES2550652T3 (en) 2015-11-11
KR20080104113A (en) 2008-12-01
AU2006255282A1 (en) 2006-12-14
EP1899484B1 (en) 2015-08-12
JP2008545431A (en) 2008-12-18
CN101297045A (en) 2008-10-29
US11430544B2 (en) 2022-08-30
WO2006132971A9 (en) 2007-05-18
WO2006132971A2 (en) 2006-12-14
EP1899484A2 (en) 2008-03-19
US20170286596A1 (en) 2017-10-05
DK1899484T3 (en) 2015-11-23
CA2610752A1 (en) 2006-12-14

Legal Events

Date Code Title Description
STPP Information on status: patent application and granting procedure in general
    Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION
AS Assignment
    Owner name: BANK OF AMERICA, N.A., AS COLLATERAL AGENT, NORTH CAROLINA
    Free format text: SECURITY INTEREST;ASSIGNORS:HOLOGIC, INC.;FAXITRON BIOPTICS, LLC;BIOTHERANOSTICS, INC.;AND OTHERS;REEL/FRAME:061639/0513
    Effective date: 20221007