WO2010018882A1 - Apparatus for visualizing and analyzing gene expression patterns using gene ontology tree and method thereof - Google Patents

Publication number
WO2010018882A1
Authority
WO
WIPO (PCT)
Prior art keywords
gene
gene ontology
coordinate information
protein
tree
Application number
PCT/KR2008/004735
Other languages
French (fr)
Inventor
Kyung-Hoon Kwon
Gun Wook Park
Jeong Hwa Lee
Jong Shin Yoo
Original Assignee
Korea Basic Science Institute
Application filed by Korea Basic Science Institute
Priority to PCT/KR2008/004735
Publication of WO2010018882A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/10 Ontologies; Annotations
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/10 Gene or protein expression profiling; Expression-ratio estimation or normalisation
    • G16B45/00 ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks

Definitions

  • The present invention relates to a method for analyzing gene expression patterns of biological samples, and more precisely to an apparatus for visualizing and analyzing gene expression patterns of biological samples using a gene ontology tree, and a method thereof.
  • Proteome is a combination of the words protein and -ome; it is the collective term for the whole set of proteins.
  • A cell of a unicellular organism has one type of proteome, while the cells of a multicellular organism share the same genome but have different types of proteomes. That is, in a multicellular organism the genome from which the proteome originates is the same in every cell, but the proteome observed in a specific cell or under a specific condition differs.
  • Proteomics is the study of the identification of proteins, their expression levels, their transformation and intracellular locations, and the interactions between proteins. Through proteomics, the proteins expressed in cells can be identified and the networks among these proteins can be revealed, which suggests that proteomics provides explanations of biological phenomena from genome to protein.
  • Analysis of expression profiles investigates overall protein expression patterns under different experimental conditions. It originates from the gene microarray technique, in which genes carrying the required information are integrated on a chip and analyzed. This analysis enables large-scale investigation of gene expression by various statistical methods, overcoming the limit of conventional methods, which could analyze the expression of only one or two genes at a time.
  • Protein profile analysis has been attempted in many different ways in order to analyze protein expression accurately and to overcome the inconsistency between actual mRNA expression and protein expression.
  • Protein profile analysis has been attempted using protein spot intensity information obtained from 2D-PAGE.
  • Protein expression analysis by conventional proteomics, or RNA analysis by microarray, could only provide massive information on gene expression as a whole or on the expression of a specific target gene. It was therefore not possible to compare overall gene expression quantitatively, to compare the expression of different proteomes according to differences in their functions, or to compare expression across cellular components.
  • Gene ontology was introduced by the Gene Ontology Consortium. Ontology herein indicates a system for classifying biological terms or vocabularies. The Gene Ontology Consortium was established to standardize biological terms. To describe the functions of genes in all species, commonly applied controlled vocabularies were introduced. In other words, gene ontology is a classification system for investigating relationships among genes, or among the keywords of each gene, and it can be applied in bioinformatics approaches.
  • According to gene ontology, genes form a tree structure in which they are related hierarchically. All terms are classified into three categories, and approximately 10,000 terms form a tree structure with hierarchical relationships. Gene functions are classified into three categories: molecular function, biological process, and cellular component. A controlled vocabulary is established hierarchically within each category. These categories are not mutually exclusive; they are divided only by the characteristics used to describe a gene.
  • The present inventors tried to develop an analysis method that facilitates not only simple gene expression analysis in a biological sample but also biologically important functional analysis of the sample. As a result, the present inventors completed this invention by developing a method for analyzing the expression distribution of whole genes and the distribution of functions and cellular components of biological samples by introducing the gene ontology concept.
  • The present invention provides a novel method for identifying the evolutionary relationships or cell developmental stages of biological species.
  • Gene Ontology indicates a system for classifying biological terms or vocabularies provided by the Gene Ontology Consortium.
  • Ontology terms used in this invention include not only the ontology terms themselves that describe a specific gene but also the gene ontology codes corresponding to those terms.
  • Gene ontology code herein indicates a code preset in the gene ontology database that corresponds to a specific gene ontology term.
  • Gene ontology tree indicates a classification system that divides gene ontology terms hierarchically; more precisely, a tree structure composed of branches connecting the nodes of gene ontology terms.
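  • The definitions of term, code and tree can be pictured with a minimal Python sketch. The class name `TermNode`, the helper `count_nodes` and the codes `T:0` to `T:4` are assumptions made for illustration; they are not real gene ontology identifiers and are not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class TermNode:
    """A node of the tree: one gene ontology term and its code."""
    code: str   # toy code, not a real gene ontology identifier
    name: str
    children: list = field(default_factory=list)

    def add(self, child):
        """Attach a child node; the parent-child link is a branch."""
        self.children.append(child)
        return child


def count_nodes(node):
    """Count the node itself plus all of its descendants."""
    return 1 + sum(count_nodes(c) for c in node.children)


# Topmost node, the three categories, and one lower-level term.
root = TermNode("T:0", "gene ontology")
mf = root.add(TermNode("T:1", "molecular function"))
bp = root.add(TermNode("T:2", "biological process"))
cc = root.add(TermNode("T:3", "cellular component"))
binding = mf.add(TermNode("T:4", "binding"))

print(count_nodes(root))
```

  • A strict tree is used here because the text describes one; a term with several parents would need a graph structure instead.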
  • the present invention provides an apparatus for visualizing gene expression patterns of a biological sample using gene ontology tree comprising the following devices: gene ontology term allocating device which receives the results of the analysis of protein or RNA expression of a biological sample and then applies the results to biological DB or gene ontology database to allocate gene ontology terms corresponding to the protein or RNA; coordinate creating device which obtains coordinate information of the allocated gene ontology terms corresponding to gene ontology tree classified in gene ontology database; visualizing device which generates visualizing data on the coordinate information obtained by the above coordinate creating device; and outputting device which outputs the visualizing data.
  • the present invention also provides an apparatus for analyzing gene expression patterns of a biological sample using gene ontology tree comprising the following devices: gene ontology term allocating device which receives the results of the analysis of protein or RNA expression of a biological sample and then applies the results to biological DB or gene ontology database to allocate gene ontology terms corresponding to the protein or RNA; coordinate creating device which obtains coordinate information of the allocated gene ontology terms corresponding to gene ontology tree classified in gene ontology database; complexity calculating device which calculates complexity data on the coordinate information obtained by the above coordinate creating device; and outputting device which outputs the complexity data.
  • the present invention also provides an apparatus for comparing gene expression patterns of a biological sample using gene ontology tree comprising the following devices: gene ontology term allocating device which receives the results of the analysis of protein or RNA expression of a biological sample and then applies the results to biological DB or gene ontology database to allocate gene ontology terms corresponding to the protein or RNA; coordinate creating device which obtains coordinate information of the allocated gene ontology terms corresponding to gene ontology tree classified in gene ontology database; comparing device which obtains different coordinate information by comparing the coordinate information obtained by the above coordinate creating device and the coordinate information obtained from another sample; visualizing device which generates visualizing data on the coordinate information obtained by the comparing device; and outputting device which outputs the visualizing data.
  • The present invention also provides a method for visualizing gene expression patterns of a biological sample using gene ontology tree, comprising the following steps:
  • step 1: inputting the results of analysis of protein or RNA expression obtained from a sample through a computer input system;
  • step 2: allocating ontology terms corresponding to the input protein or RNA by the gene ontology term allocating device;
  • step 3: obtaining coordinate information of the gene ontology terms allocated in step 2 corresponding to the gene ontology tree classified in the gene ontology database by the coordinate creating device;
  • step 4: generating visualizing data of the coordinate information obtained in step 3 by the visualizing device; and
  • step 5: outputting the visualizing data generated in step 4 by a computer output system.
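  • The five steps above can be sketched end to end in Python. This is only an outline under stated assumptions: the lookup tables `PROTEIN_TO_TERMS` and `TERM_TO_COORD`, the protein names and the function names are hypothetical stand-ins for the databases and devices the text describes.

```python
# Toy tables standing in for the biological DB and the gene ontology database.
PROTEIN_TO_TERMS = {"protA": ["binding"], "protB": ["binding", "catalytic activity"]}
TERM_TO_COORD = {"binding": (2, 0), "catalytic activity": (2, 1)}  # (level, branch index)


def allocate_terms(proteins):
    """Step 2: allocate ontology terms to the input proteins."""
    return sorted({t for p in proteins for t in PROTEIN_TO_TERMS.get(p, [])})


def obtain_coordinates(terms):
    """Step 3: look up the tree coordinates of the allocated terms."""
    return {t: TERM_TO_COORD[t] for t in terms if t in TERM_TO_COORD}


def make_visual_data(coords):
    """Step 4: turn coordinates into indented text lines, one per term."""
    ordered = sorted(coords.items(), key=lambda kv: kv[1])
    return [f"{'  ' * level}[{branch}] {term}" for term, (level, branch) in ordered]


def run_pipeline(proteins):
    terms = allocate_terms(proteins)
    coords = obtain_coordinates(terms)
    return make_visual_data(coords)


lines = run_pipeline(["protA", "protB"])  # step 1: input
print("\n".join(lines))                   # step 5: output
```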
  • The present invention also provides a method for analyzing gene expression patterns of a biological sample using gene ontology tree, comprising the following steps:
  • step 1: inputting the results of analysis of protein or RNA expression obtained from a sample through a computer input system;
  • step 2: allocating ontology terms corresponding to the input protein or RNA by the gene ontology term allocating device;
  • step 3: obtaining coordinate information of the gene ontology terms allocated in step 2 corresponding to the gene ontology tree classified in the gene ontology database by the coordinate creating device;
  • step 4: producing complexity data represented by the following mathematical formula 1 and/or formula 2 on the coordinate information obtained in step 3 by the complexity calculating device; and
  • step 5: outputting the complexity data produced in step 4 by a computer output system.
  • The present invention provides an apparatus for visualizing gene expression patterns of a biological sample using gene ontology tree, comprising the following devices: a) a gene ontology term allocating device which receives the results of the analysis of protein or RNA expression of a biological sample and then applies the results to a biological DB or gene ontology database to allocate gene ontology terms corresponding to the protein or RNA; b) a coordinate creating device which obtains coordinate information of the allocated gene ontology terms corresponding to the gene ontology tree classified in the gene ontology database; c) a visualizing device which generates visualizing data on the coordinate information obtained by the above coordinate creating device; and d) an outputting device which outputs the visualizing data.
  • The protein analysis result of step a) is preferably obtained by proteome analysis, but not always limited thereto.
  • The proteome analysis herein is preferably performed by 2-dimensional electrophoresis or mass spectrometry, but not always limited thereto.
  • The result of RNA expression analysis in step a) is preferably obtained by microarray analysis, but not always limited thereto.
  • The gene ontology term allocating device determines which of the gene ontology terms defining a gene function is to be allocated to the expressed protein or RNA, and executes the allocation. If a gene is multifunctional, multiple gene ontology terms can be allocated.
  • The gene ontology term allocating device identifies a specific gene from a biological DB through the network and finds the term corresponding to the gene.
  • The biological DB accessible through the network is exemplified by UniGene, LocusLink, Swiss-Prot, MGI, UniProt, EMBL and IPI, but not always limited thereto.
  • The gene ontology database accessible through the network is exemplified by GO (Gene Ontology), ChEBI (Chemical Entities of Biological Interest), GOA and NEW, but not always limited thereto.
  • The apparatus for visualizing gene expression patterns facilitates gene ontology term allocation by means of a gene identification algorithm, which identifies an expressed gene by screening the biological DB based on the results of protein or RNA expression analysis, and an allocation algorithm, which allocates the gene ontology term corresponding to the identified gene by screening the gene ontology database.
  • The gene identification algorithm contains a protein or gene screening tool operable over the network, such as BLAST and FASTA provided by EBI (European Bioinformatics Institute), PIR (Georgetown University) or ExPASy (Swiss Institute of Bioinformatics), but not always limited thereto.
  • The gene ontology term allocation algorithm contains a gene ontology term screening tool operable over the network, such as AmiGO, MGI GO Browser or Ontology Lookup Service, but not always limited thereto.
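  • The two-stage allocation described above (gene identification, then term allocation) might look like the following Python sketch. The dictionaries `BIOLOGICAL_DB` and `GO_DB`, the accessions `acc1` to `acc3` and the gene names are hypothetical; a real system would query the networked databases and tools named in the text rather than in-memory tables.

```python
# Hypothetical in-memory tables; not the interface of any real database.
BIOLOGICAL_DB = {"acc1": "geneA", "acc2": "geneB"}   # accession -> gene
GO_DB = {
    "geneA": ["binding", "catalytic activity"],      # multifunctional gene
    "geneB": ["transport"],
}


def identify_gene(accession):
    """Gene identification: look the expressed protein or RNA up in the biological DB."""
    return BIOLOGICAL_DB.get(accession)


def allocate_terms(accessions):
    """Term allocation: map each identified gene to all of its ontology terms."""
    allocation = {}
    unidentified = []
    for acc in accessions:
        gene = identify_gene(acc)
        if gene is None:
            unidentified.append(acc)  # keep track of entries that could not be identified
            continue
        allocation[gene] = list(GO_DB.get(gene, []))
    return allocation, unidentified


allocation, unidentified = allocate_terms(["acc1", "acc2", "acc3"])
print(allocation)
print(unidentified)
```

  • Returning the unidentified entries separately is a design choice of this sketch, so that missing database hits are visible instead of silently dropped.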
  • The coordinate creating device is composed of a transform algorithm, which converts the gene ontology tree classified in the gene ontology database into coordinates, and a coordinate information collection algorithm, which collects the coordinates corresponding to the allocated ontology terms from among the converted gene ontology tree coordinates.
  • The coordinate creating device matches the ontology term corresponding to the expressed gene to the gene ontology tree classified in the gene ontology database. If the ontology term allocating device has allocated multiple terms to a gene, multiple coordinates can be obtained.
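  • A minimal Python sketch of the two algorithms follows, assuming 2-dimensional coordinates of the form (level, position within the level) assigned by a breadth-first walk. The coordinate scheme, the toy tree `TREE` and the function names are assumptions for illustration; the text does not specify how coordinates are assigned.

```python
from collections import deque

# Toy tree, parent -> children.
TREE = {
    "gene ontology": ["molecular function", "biological process", "cellular component"],
    "molecular function": ["binding", "catalytic activity"],
    "cellular component": ["extracellular matrix"],
}


def tree_to_coordinates(tree, root):
    """Transform step: give every term a (level, index) pair, root at level 1."""
    coords = {}
    next_index = {}                 # next free position per level
    queue = deque([(root, 1)])
    while queue:
        term, level = queue.popleft()
        if term in coords:
            continue
        index = next_index.get(level, 0)
        next_index[level] = index + 1
        coords[term] = (level, index)
        for child in tree.get(term, []):
            queue.append((child, level + 1))
    return coords


def collect(coords, allocated_terms):
    """Collection step: keep only the coordinates of the allocated terms."""
    return {t: coords[t] for t in allocated_terms if t in coords}


coords = tree_to_coordinates(TREE, "gene ontology")
picked = collect(coords, ["binding", "extracellular matrix", "unknown term"])
print(picked)
```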
  • The gene ontology tree has a tree structure composed of the coordinates (nodes) corresponding to gene ontology terms and the branches that connect those nodes to one another.
  • Figure 1 illustrates an example of a gene ontology tree.
  • The topmost level is the whole gene ontology; the second level consists of molecular function, biological process and cellular component; and levels 3, 4 and 5 are lower levels, each forming a tree. The lower the level, the more detailed the functions described by its ontology terms.
  • The ontology tree takes one of molecular function, biological process and cellular component as its topmost level, or takes one of their sub-concepts as its topmost level.
  • The gene ontology tree can be provided by a gene ontology database or by an ontology analysis tool.
  • The gene ontology database and analysis tool are available over the network at the following internet address, but not always limited thereto: http://www.geneontology.org.
  • A gene ontology tree consisting of gene ontology terms can be obtained by using the database established by the present inventors or a screening tool.
  • The coordinate information is preferably 2-dimensional or 3-dimensional coordinate information, but not always limited thereto.
  • The coordinate information herein indicates the coordinates, obtained from the gene ontology tree, of the gene ontology term corresponding to a target gene.
  • The coordinate information is composed of information on the shortest path connecting the topmost node of the gene ontology tree and the target node, the level of the node, the location of the branch on that pathway among the branches extending from each level, the number of branches extending from each level and the number of genes corresponding thereto, but not always limited thereto.
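  • The shortest-path part of the coordinate information can be illustrated with a breadth-first search in Python. The structure `CHILDREN`, the node names and the returned field names are hypothetical. The example lets the target be reachable through two routes so that "shortest" is meaningful; that is an assumption of this sketch rather than something stated in the text.

```python
from collections import deque

# Toy structure, parent -> children; "target" is reachable by two routes.
CHILDREN = {
    "root": ["A", "B"],
    "A": ["C"],
    "C": ["target"],
    "B": ["target"],
}


def coordinate_info(children, root, target):
    """Return the shortest root-to-target path and per-level branch details."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for child in children.get(node, []):
            if child not in parent:        # first visit is along a shortest route
                parent[child] = node
                queue.append(child)
    if target not in parent:
        return None
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parent[node]
    path.reverse()
    steps = list(zip(path, path[1:]))
    return {
        "path": path,
        "level": len(path),                # topmost node counts as level 1
        "branch_positions": [children[p].index(c) for p, c in steps],
        "branch_counts": [len(children[p]) for p, _ in steps],
    }


info = coordinate_info(CHILDREN, "root", "target")
print(info)
```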
  • The visualizing device is composed of an algorithm that locates the coordinates of the nodes (gene ontology terms) obtained by the coordinate creating device and a visualization algorithm that displays the vertical (hierarchical) relationships among those coordinates as a tree structure.
  • The visualizing device facilitates visualization of the coordinate data of gene ontology terms as a tree structure.
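  • As a rough text-only illustration of displaying hierarchical relationships as a tree, the Python sketch below prints a toy tree with indentation and marks the terms found in a sample with `*`. The tree contents, the marker and the function name `render` are assumptions; the figures of the patent are graphical and are not reproduced by this sketch.

```python
# Toy tree, parent -> children.
TREE = {
    "cellular component": ["extracellular region", "membrane"],
    "extracellular region": ["extracellular matrix"],
}


def render(tree, root, expressed, depth=0):
    """Return indented lines for root and its descendants; '*' marks expressed terms."""
    mark = "*" if root in expressed else " "
    lines = [f"{'  ' * depth}{mark} {root}"]
    for child in tree.get(root, []):
        lines.extend(render(tree, child, expressed, depth + 1))
    return lines


lines = render(TREE, "cellular component", {"extracellular matrix"})
print("\n".join(lines))
```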
  • The outputting device is preferably a monitor, printer or plotter, but not always limited thereto.
  • The present invention also provides an apparatus for analyzing gene expression patterns of a biological sample using gene ontology tree, comprising the following devices: a) a gene ontology term allocating device which receives the results of the analysis of protein or RNA expression of a biological sample and then applies the results to a biological DB or gene ontology database to allocate gene ontology terms corresponding to the protein or RNA; b) a coordinate creating device which obtains coordinate information of the allocated gene ontology terms corresponding to the gene ontology tree classified in the gene ontology database; c) a complexity calculating device which calculates complexity data on the coordinate information obtained by the above coordinate creating device; and d) an outputting device which outputs the complexity data.
  • Most of the biological DBs above provide gene ontology terms related to gene functions; where they do not, a gene ontology term corresponding to a specific gene can still be allocated using the gene ontology database, based on the genetic information that the biological databases provide.
  • The apparatus for analyzing gene expression patterns facilitates gene ontology term allocation by means of a gene identification algorithm, which identifies an expressed gene by screening the biological DB based on the results of protein or RNA expression analysis, and an allocation algorithm, which allocates the gene ontology term corresponding to the identified gene by screening the gene ontology database.
  • The complexity calculating device is composed of a computer arithmetic algorithm that calculates the complexity defined by the following mathematical formula 1, but not always limited thereto.
  • N: the number of coordinates of genes corresponding to the gene ontology tree
  • Figure 4 is a diagram illustrating an example of calculating complexity from the coordinate information distributed on gene ontology tree.
  • the coordinate information distributed on the gene ontology tree is represented by identification marks, for example, 401, 402, 403 and 404.
  • Quantitative analysis of gene expression patterns of a biological sample can be facilitated by calculating complexity.
  • The complexity of a specific biological sample varies with the kinds and numbers of its proteins and with their differentiation stages. Complexity increases when the expressed proteins are diverse and differentiated proteins are dominant in the sample.
  • The apparatus for analyzing gene expression patterns of the present invention uses complexity to express the gene expression of a biological sample as numbers, and more specifically to express gene expression numerically according to molecular functions, biological processes and cellular components. The apparatus therefore enables the examination of general expression patterns and cell functions as well as the identification of developmental stages.
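  • Mathematical formula 1 does not appear in this text, so the Python sketch below does not implement it. As a clearly labeled stand-in, it computes the Shannon entropy of how the N coordinates are distributed over tree nodes, which at least shares the stated property that the value grows as the expressed proteins become more diverse. The choice of entropy and the function name `entropy_complexity` are assumptions of this sketch, not the patent's definition of complexity.

```python
import math
from collections import Counter


def entropy_complexity(coordinates):
    """Stand-in measure (NOT formula 1): Shannon entropy, in bits, of the coordinates."""
    n = len(coordinates)            # N: number of coordinates of genes on the tree
    if n == 0:
        return 0.0
    counts = Counter(coordinates)   # how many genes fall on each coordinate
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Four genes on the same node versus four genes on four different nodes.
concentrated = [(3, 0)] * 4
diverse = [(3, 0), (3, 1), (3, 2), (3, 3)]
print(entropy_complexity(concentrated), entropy_complexity(diverse))
```

  • Any measure substituted for formula 1 should be checked against the original patent document before results are compared with those reported there.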
  • the present invention also provides an apparatus for comparing gene expression patterns of a biological sample using gene ontology tree comprising the following devices: gene ontology term allocating device which receives the results of the analysis of protein or RNA expression of a biological sample and then applies the results to biological DB or gene ontology database to allocate gene ontology terms corresponding to the protein or RNA; coordinate creating device which obtains coordinate information of the allocated gene ontology terms corresponding to gene ontology tree classified in gene ontology database; comparing device which obtains different coordinate information by comparing the coordinate information obtained by the above coordinate creating device and the coordinate information obtained from another sample; visualizing device which generates visualizing data on the coordinate information obtained by the comparing device; and outputting device which outputs the visualizing data.
  • The apparatus for comparing gene expression patterns facilitates visualization of information on gene expression in different biological samples. That is, with this apparatus, differences among samples in major cell functions, particularly molecular functions, biological processes and cellular components, can be explained. Differences in functions and developmental stages among different samples can likewise be screened by this apparatus.
  • The comparing device of the apparatus for comparing gene expression patterns consists of a comparison algorithm, which compares the coordinate information obtained by the coordinate creating device with the coordinate information obtained from other samples to identify their locations, and a coordinate producing algorithm, which collects only the coordinates whose locations differ by eliminating the coordinate information whose locations are the same, but not always limited thereto.
  • The comparing device can additionally include a selection algorithm, operated before the comparison algorithm is applied, which produces the coordinate information corresponding to the gene ontology terms included in a specific hierarchical classification.
  • The apparatus for comparing gene expression patterns whose comparing device includes the additional selection algorithm facilitates comparison among different samples not only of overall expression but also of gene expression related to a specific function.
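  • The comparison and selection algorithms described above can be sketched in Python as simple dictionary filters. The sample data, the `ANCESTORS` table and the function names `select` and `compare` are hypothetical and serve only to show the order of operations: optional selection first, then elimination of coordinates shared by both samples.

```python
# Toy data: term -> categories it belongs to, and term -> coordinate per sample.
ANCESTORS = {
    "protein binding": {"molecular function", "binding"},
    "transport": {"biological process"},
    "extracellular matrix": {"cellular component"},
}
sample_a = {"protein binding": (4, 0), "extracellular matrix": (3, 2)}
sample_b = {"protein binding": (4, 0), "transport": (3, 1)}


def select(coords, ancestors, category):
    """Selection: keep only terms that fall under the given hierarchical classification."""
    return {t: xy for t, xy in coords.items() if category in ancestors.get(t, ())}


def compare(coords_a, coords_b):
    """Comparison: drop coordinates found in both samples, keep the differing ones."""
    only_a = {t: xy for t, xy in coords_a.items() if coords_b.get(t) != xy}
    only_b = {t: xy for t, xy in coords_b.items() if coords_a.get(t) != xy}
    return only_a, only_b


# Whole-sample comparison.
only_a, only_b = compare(sample_a, sample_b)
print(only_a, only_b)

# Comparison restricted to molecular function terms.
mf_a = select(sample_a, ANCESTORS, "molecular function")
mf_b = select(sample_b, ANCESTORS, "molecular function")
print(compare(mf_a, mf_b))
```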
  • The present invention also provides a method for visualizing gene expression patterns of a biological sample using gene ontology tree, comprising the following steps using the above visualizing device:
  • step 1: inputting the results of analysis of protein or RNA expression obtained from a sample through a computer input system;
  • step 2: allocating ontology terms corresponding to the input protein or RNA by the gene ontology term allocating device;
  • step 3: obtaining coordinate information of the gene ontology terms allocated in step 2 corresponding to the gene ontology tree classified in the gene ontology database by the coordinate creating device;
  • step 4: producing visualizing data of the coordinate information obtained in step 3 by the visualizing device; and
  • step 5: outputting the visualizing data produced in step 4 by a computer output system.
  • The input system is preferably a keyboard, scanner, barcode reader, mouse, tablet, trackball, electronic pen or digital camera, but not always limited thereto.
  • The output system is preferably a monitor, printer or plotter, but not always limited thereto.
  • The present invention provides a method for analyzing gene expression patterns of a biological sample using gene ontology tree, comprising the following steps:
  • step 1: inputting the results of analysis of protein or RNA expression obtained from a sample through a computer input system;
  • step 2: allocating ontology terms corresponding to the input protein or RNA by the gene ontology term allocating device;
  • step 3: obtaining coordinate information of the gene ontology terms allocated in step 2 corresponding to the gene ontology tree classified in the gene ontology database by the coordinate creating device;
  • step 4: producing complexity data represented by the following mathematical formula 1 on the coordinate information obtained in step 3 by the complexity calculating device; and
  • step 5: outputting the complexity data produced in step 4 by a computer output system.
  • The apparatus for visualizing or analyzing gene expression patterns and the method using it are significant improvements over the conventional method and apparatus: they provide a way to understand molecular functions, biological processes or cellular components in a biological sample based on the results of protein or RNA expression analysis. The apparatus and the method of the present invention thus facilitate analysis of biological aspects such as functional changes, evolutionary stages or developmental stages of each biological sample.
  • Figure 1 is a diagram illustrating an example of gene ontology tree structure.
  • Figure 2 is a flow chart illustrating the overall operation of the apparatus for visualizing or analyzing gene expression patterns using gene ontology according to a preferred embodiment of the present invention.
  • Figure 3 is a diagram illustrating the visualization of gene expression patterns using gene ontology according to a preferred embodiment of the present invention.
  • Figure 4 is a diagram illustrating the calculation of complexity from coordinate information distributed on gene ontology tree.
  • N: number of coordinates of genes corresponding to the gene ontology tree
  • Figures 5-8 are diagrams illustrating the visualized data of gene expression patterns according to a preferred embodiment of the present invention.
  • gene expression patterns are visualized as the tree structure by taxonomy according to cellular component, molecular function and biological process.
  • Figure 5 is a set of diagrams illustrating the visualized data of gene expression patterns, taking the group related to extracellular matrix, among cellular components, as the first level.
  • A: visualized data of protein expression pattern of brain tissues
  • B: visualized data of protein expression pattern of neural stem cells
  • C: visualized data of RNA expression pattern of neural stem cells
  • D: visualized data of RNA expression pattern of oligodendrocytes.
  • Figure 6 is a set of diagrams illustrating the visualized data of gene expression patterns, taking the group related to auxiliary transport, among molecular functions, as the first level.
  • A: visualized data of protein expression pattern of brain tissues
  • B: visualized data of protein expression pattern of neural stem cells
  • C: visualized data of RNA expression pattern of oligodendrocytes.
  • Figure 7 is a set of diagrams illustrating the visualized data of gene expression patterns, taking the group related to binding, among molecular functions, as the first level.
  • A: visualized data of protein expression pattern of brain tissues
  • B: visualized data of protein expression pattern of neural stem cells
  • C: visualized data of RNA expression pattern of oligodendrocytes.
  • Figure 8 is a set of diagrams illustrating the visualized data of gene expression patterns, taking the group related to biological regulation, among biological processes, as the first level.
  • A: visualized data of protein expression pattern of brain tissues
  • B: visualized data of protein expression pattern of neural stem cells
  • C: visualized data of RNA expression pattern of oligodendrocytes.
  • Figure 9 is a diagram illustrating the visualized data of molecular functions which are observed in neural stem cells but not in oligodendrocytes in a preferred embodiment of the present invention.
  • Figure 10 is a diagram illustrating the visualized data of molecular functions which are observed in oligodendrocytes but not in neural stem cells in a preferred embodiment of the present invention.
  • Example 1: Obtainment of expression analysis results from biological samples and gene ontology terms. <1-1> Obtainment of expression information of biological sample
  • the present inventors prepared test samples from brain tissues, neural stem cells and oligodendrocytes by the method described below, and expression information was then obtained by separation of peptides on a 1-dimensional gel followed by tandem mass spectrometry.
  • To prepare neural stem cells, cells separated from the brain of a fetus (12-18 weeks old) were cultured. The cells were differentiated into oligodendrocytes by using the Olig2 gene (Kim, S. U., Neuropathology, 2004, 24(3), 159-171). Neural stem cells and oligodendrocytes were labeled with lysine containing C12 and C13 by the SILAC method, followed by 1-dimensional electrophoresis, hydrolysis with trypsin and liquid chromatography to identify and quantify the proteins (Kwon, K.-H., et al., Proteomics, 2008, 8(6), 1149-61).
  • The information on expressed proteins obtained in Example <1-1> was applied to the IPI (International Protein Index) database to screen the information on expressed proteins and to identify the genes corresponding thereto. Each gene corresponding to each protein was marked with its Gene Symbol. Table 1 shows some examples of the information on the expressed proteins and the corresponding genes obtained in Example 1.
  • IPI: International Protein Index
  • the gene ontology code is the pre-selected code corresponding to a specific gene ontology term in the gene ontology database, as defined by the Gene Ontology Consortium. A gene ontology code may therefore be added when a gene ontology term is added, but the code itself is not changed and is defined in the gene ontology database along with the gene ontology term. For example, it can be confirmed in a gene ontology database such as the GO (Gene Ontology) database through www.geneontology.org.
  • Example 2 Obtainment of coordinate information corresponding to gene expression information
  • Gene ontology tree classified in the gene ontology database GO was constructed using gene ontology information corresponding to the expression information obtained in Example 1.
  • AD3 gene corresponded to GO:0016337, which passed through the "first" branch of the level 1 branches on the tree, the "first" branch of 22 level 2 branches, the "second" branch of 5 level 3 branches and the "third" branch of 5 level 4 branches.
  • it could be represented as 1:1:2:3:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:.
  • The first level 1 branch has 22 level 2 branches, the first of which has three level 3 branches, the second of which has 5 level 4 branches, resulting in the presentation as follows: 22:3:5:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:.
  • the coordinate of gene information related to AD3, which is GO: 0016337, on the gene ontology tree can be presented as 1, 1:1:2:3:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0: 1:1:2:3:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:.
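The coordinate notation described above can be sketched in Python. This is only an illustrative sketch, not the inventors' implementation: the `Node` class, the `coordinate` function, the padding depth of 6 and the toy tree (whose term names other than GO:0016337 are placeholders) are all assumptions introduced here. It returns the branch positions along the path from the root and the number of branches stretched at each level below level 1, padded with zeros.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical tree node: one gene ontology term and its ordered child branches."""
    term: str
    children: List["Node"] = field(default_factory=list)


def coordinate(root: Node, term: str, depth: int = 6) -> Optional[Tuple[str, str]]:
    """Breadth-first search for `term`; returns (branch positions, branch counts)."""
    def pad(values: List[int]) -> str:
        # Pad with zeros up to the assumed fixed number of levels.
        return ":".join(str(v) for v in values + [0] * (depth - len(values)))

    queue = deque([(root, [], [])])
    while queue:
        node, path, counts = queue.popleft()
        if node.term == term:
            # counts[0] is the number of level 1 branches, which the notation omits.
            return pad(path), pad(counts[1:])
        for position, child in enumerate(node.children, start=1):
            queue.append((child, path + [position], counts + [len(node.children)]))
    return None


def branches(prefix: str, n: int) -> List[Node]:
    return [Node(f"{prefix}{i}") for i in range(1, n + 1)]


# Toy tree shaped like the AD3 example: 22 level 2, 3 level 3 and 5 level 4 branches.
level4 = branches("L4-", 5)
level4[2].term = "GO:0016337"
level3 = branches("L3-", 3)
level3[1].children = level4
level2 = branches("L2-", 22)
level2[0].children = level3
level1 = branches("L1-", 3)
level1[0].children = level2
root = Node("root", level1)

path, counts = coordinate(root, "GO:0016337")
print(path)    # 1:1:2:3:0:0
print(counts)  # 22:3:5:0:0:0
```

A breadth-first search is used so that, if a term were reachable by more than one route, the shortest path from the root would be reported first.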
  • In the upper part of the Figure, the starting point of the tree was located. Then, the coordinate information obtained in Example 2 was marked as nodes on the tree. The marked coordinates were connected from the root of the tree by branches to visualize gene expression patterns.
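The marking and connecting of coordinates can be illustrated with a short text-only Python sketch. It is an assumption-laden illustration, not the patented visualizing device: the function names `build_tree` and `render`, the nested-dictionary representation and the three sample coordinates are hypothetical. Each coordinate is read as a list of branch positions, zeros are treated as padding, and the resulting nodes are printed as an indented tree starting from the root.

```python
from typing import Iterable, List


def build_tree(paths: Iterable[str]) -> dict:
    """Merge colon-separated branch positions into a nested dict (one key per branch)."""
    tree: dict = {}
    for path in paths:
        node = tree
        for index in path.split(":"):
            if index in ("", "0"):
                break  # zeros only pad the coordinate; the path ends here
            node = node.setdefault(int(index), {})
    return tree


def render(tree: dict, indent: int = 0) -> List[str]:
    """Return one text line per node, indented by its level in the tree."""
    lines: List[str] = []
    for index in sorted(tree):
        lines.append("  " * indent + f"+- branch {index}")
        lines.extend(render(tree[index], indent + 1))
    return lines


# Hypothetical coordinates of expressed genes (branch positions only).
paths = ["1:1:2:3:0:0", "1:1:1:0:0:0", "2:4:0:0:0:0"]
lines = ["root"] + render(build_tree(paths), indent=1)
print("\n".join(lines))
```

Coordinates sharing a prefix share the same upper branches, so the common part of the tree is drawn only once.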
  • the group involved in extracellular matrix was set as the first level branch, and protein and RNA expression patterns of brain tissues, neural stem cells and oligodendrocytes were illustrated in Figure 5.
  • protein was not detected, so that visualized data on protein expression patterns thereof was not illustrated in the Figure.
  • protein was expressed in relation to extracellular matrix in brain tissues and neural stem cells, but protein was not expressed in oligodendrocytes.
  • RNA expression in relation to extracellular matrix was detected in oligodendrocytes on microarray, which was different from the protein expression, though.
  • Visualized data of gene expression patterns of brain tissues, neural stem cells and oligodendrocytes, taking the group related to auxiliary transport protein activity, among molecular functions, as the first level are illustrated in Figure 6.
  • Visualized data of gene expression patterns, taking the group related to binding, among molecular functions, as the first level are illustrated in Figure 7.
  • proteins related to auxiliary transport protein activity were expressed in brain tissues and oligodendrocytes at a low level, but not in neural stem cells.
  • binding related proteins were expressed in all the three test samples (brain tissues, neural stem cells and oligodendrocytes) at a high level.
  • Analysis of gene expression pattern using complexity: Coordinate information obtained in Example 2 was applied to the following mathematical formula to calculate the complexity of the expression patterns of brain tissues, neural stem cells and oligodendrocytes, and the results are shown in Table 4.
  • the complexity was highest in brain tissues, followed by neural stem cells and then oligodendrocytes.
  • Neural stem cells contain neuron, astrocyte and oligodendrocyte related proteins, but these proteins are not differentiated yet, suggesting that protein expression is limited and complexity is low, compared with that in the brain tissues. Oligodendrocytes comprise only one kind of cells, suggesting that complexity is the lowest.
  • neural stem cells show more complicated distribution than oligodendrocytes. But, there were genes only expressed in oligodendrocytes. As shown in Figure 10, genes involved in nucleotide binding corresponding to the 32nd branch of level 2 passing over the third branch of level 1, protein binding corresponding to the 43rd branch of level 2, and hydrolase activity corresponding to the 8th branch of level 2 passing over the 4th branch of level 1 are not expressed in neural stem cells but expressed in oligodendrocytes. In proteomics, protein which is not detected indicates no expression at all or expression at a very low level.

Abstract

The present invention relates to an apparatus for visualizing or analyzing gene expression patterns of a biological sample using a gene ontology tree and a method using the same; more precisely, to an apparatus for visualizing or analyzing gene expression patterns which are regarded as biologically important using a gene ontology tree by calculating complexity, and a method using the same. The apparatus of the present invention is useful for comparing gene expression patterns in different biological samples using data produced by the said apparatus and the method of the present invention. According to the present invention, gene expression patterns in relation to molecular functions, biological processes or cellular components can be analyzed by investigating protein or RNA expression in biological samples. The apparatus and the method of the present invention can therefore be effectively used for analysis of biologically important aspects such as functional changes, evolutionary stages or developmental stages of each biological sample.

Description

[DESCRIPTION]
[Invention Title]
APPARATUS FOR VISUALIZING AND ANALYZING GENE EXPRESSION PATTERNS USING GENE ONTOLOGY TREE AND METHOD THEREOF
[Technical Field]
The present invention relates to a method for analyzing gene expression patterns of biological samples, more precisely an apparatus for visualizing and analyzing gene expression patterns of biological samples using gene ontology tree and a method thereof.
[Background Art] Proteome is a combination of the words protein and -ome, and is the collective term for the whole set of proteins. In general, a cell of a unicellular organism has one type of proteome, while each cell of a multicellular organism has the same genome but a different type of proteome. That is, in a multicellular organism, the genome from which the proteome originates is the same in every cell, but the proteome shown in a specific cell or under a specific condition differs. Proteomics is the study that identifies proteins, their expression levels, their transformation and intracellular locations, and the interactions between proteins. By proteomics, the proteins expressed in cells can be identified and the network among these proteins can be disclosed, suggesting that proteomics provides explanations of biological phenomena from genome to protein.
There are two main techniques for the study of proteomics. The first is the technique of separating the total protein group from a cell and then dividing it into multiple constituent proteins. The second is the technique of analyzing those separated proteins. The 2-dimensional electrophoresis most widely used recently is 2-dimensional polyacrylamide gel electrophoresis (2D-PAGE), which separates proteins by isoelectric point (pI) and molecular weight (MW). To identify proteins separated by 2D-PAGE, each protein is additionally subjected to mass analysis using ultra-high-speed mass spectrometry (MS). The 2D-PAGE proteome database established by Julio Celis et al. provides details of each protein, with explanations based on amino acid sequence, peptide mass fingerprinting (PMF), pI and molecular weight (MW). Based on studies screening the general information about proteins and investigating the expression, transformation and intracellular location of proteins as well as the interactions among proteins, the total proteins expressed in cells and the network connecting them are disclosed, which paves the way to explaining biological phenomena from genome to protein.
In particular, expression profile analysis investigates general protein expression patterns according to experimental conditions. It originated from the gene microarray technique, which analyzes genes harboring the necessary information by integrating them on a chip. This analysis facilitates large-scale gene expression investigation by various statistical analysis methods, overcoming the limit of the conventional method, which enables expression analysis of only one or two genes at a time.
Profile analysis has been tried in many different ways to analyze protein expression accurately and to overcome the problem of inconsistency between actual mRNA expression and protein expression. For example, protein profile analysis was tried using protein spot intensity information obtained from 2D-PAGE.
However, protein expression analysis by the conventional proteomics or RNA analysis by microarray could only provide massive information on gene expression as a whole or a specific target gene expression. So, it was not possible to compare general gene expression quantitatively or to compare expressions of different proteomes by the difference of their functions or to compare expressions over cellular components.
According to conventional studies on evolution, genes have generally been classified by phenotype. With the recent advancement of molecular biology and the discovery of gene maps, however, the evolution of species is determined based on the similarity of genotypes, that is, the similarity of gene sequences. Gene expression analysis was not available for evolution studies in the early days. Where gene sequences differ but the expressed gene sequences or functions are similar, suggesting that evolution or development proceeded similarly, the genotypes and phenotypes of those genes have to be analyzed together, which was not facilitated by the conventional method. Therefore, the conventional analysis techniques could not provide information on functional changes according to gene expression in evolutionary relationships or developmental stages.
The term gene ontology was introduced by Gene Ontology Consortium. Ontology herein indicates a system classifying biological terms or vocabularies. Gene Ontology Classification System Consortium was established for standardization of biological terms. To explain functions of genes in all the species, controlled vocabularies applied in common were introduced. Again, gene ontology is a classification system to investigate relationship among genes or among key words of each gene, which can be applied to bioinformatics approach.
In gene ontology, terms form a tree structure in which they are related hierarchically. Approximately 10,000 terms are classified into three categories and form a tree structure with hierarchical relationships. According to gene ontology, genetic functions are classified into three categories: molecular function, biological process, and cellular component. A controlled vocabulary is established hierarchically in each category. These categories are not exclusive and are only divided by the characteristics used to describe a gene.
The present inventors tried to develop an analysis method facilitating not only simple gene expression analysis in a biological sample but also biologically important functional analysis of the biological sample. As a result, the present inventors completed this invention by developing a method for analyzing the expression distributions of whole genes and the distributions of functions and cellular components of biological samples by introducing the gene ontology concept. The present invention provides a novel method for identifying the evolutionary relationships or cell developmental stages of biological species.
[Disclosure] [Technical Problem]
It is an object of the present invention to provide an apparatus for visualizing or analyzing gene expression patterns by taking advantage of gene ontology with the results of proteome analysis or RNA expression analysis, in order to analyze functional changes, evolutional relationships or developmental stages of biological samples and a method thereof.
[Technical Solution] Terms used in this invention are described as follows.
Gene Ontology indicates a system classifying biological terms or vocabularies provided by the Gene Ontology Consortium. Terms classified according to the system are called 'ontology terms'. These ontology terms build a tree structure together, in which they are related vertically. These terms are divided into three categories, which are molecular function, biological process and cellular component. Ontology terms used in this invention include not only ontology terms themselves to describe a specific gene but also gene ontology codes corresponding to the ontology terms. Gene ontology code herein indicates a code pre-set up on gene ontology database corresponding to a specific gene ontology term.
Gene ontology tree indicates a classification system dividing gene ontology terms hierarchically, precisely a tree structure composed of branches connecting nodes of gene ontology terms.
Hereinafter, the present invention is described in detail.
To achieve the above object, the present invention provides an apparatus for visualizing gene expression patterns of a biological sample using gene ontology tree comprising the following devices: gene ontology term allocating device which receives the results of the analysis of protein or RNA expression of a biological sample and then applies the results to biological DB or gene ontology database to allocate gene ontology terms corresponding to the protein or RNA; coordinate creating device which obtains coordinate information of the allocated gene ontology terms corresponding to gene ontology tree classified in gene ontology database; visualizing device which generates visualizing data on the coordinate information obtained by the above coordinate creating device; and outputting device which outputs the visualizing data.
The present invention also provides an apparatus for analyzing gene expression patterns of a biological sample using gene ontology tree comprising the following devices: gene ontology term allocating device which receives the results of the analysis of protein or RNA expression of a biological sample and then applies the results to biological DB or gene ontology database to allocate gene ontology terms corresponding to the protein or RNA; coordinate creating device which obtains coordinate information of the allocated gene ontology terms corresponding to gene ontology tree classified in gene ontology database; complexity calculating device which calculates complexity data on the coordinate information obtained by the above coordinate creating device; and outputting device which outputs the complexity data.
The present invention also provides an apparatus for comparing gene expression patterns of a biological sample using gene ontology tree comprising the following devices: gene ontology term allocating device which receives the results of the analysis of protein or RNA expression of a biological sample and then applies the results to biological DB or gene ontology database to allocate gene ontology terms corresponding to the protein or RNA; coordinate creating device which obtains coordinate information of the allocated gene ontology terms corresponding to gene ontology tree classified in gene ontology database; comparing device which obtains different coordinate information by comparing the coordinate information obtained by the above coordinate creating device and the coordinate information obtained from another sample; visualizing device which generates visualizing data on the coordinate information obtained by the comparing device; and outputting device which outputs the visualizing data.
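The comparing device described above obtains the coordinate information that differs between two samples. One way to picture this is as a set difference over coordinates, as in the following Python sketch. The function name `compare_coordinates`, the dictionary keys and the sample coordinates are hypothetical and are not taken from the specification.

```python
from typing import Dict, Set


def compare_coordinates(sample_a: Set[str], sample_b: Set[str]) -> Dict[str, Set[str]]:
    """Split gene ontology tree coordinates into those unique to each sample and those shared."""
    return {
        "only_a": sample_a - sample_b,   # observed in sample A but not in sample B
        "only_b": sample_b - sample_a,   # observed in sample B but not in sample A
        "shared": sample_a & sample_b,   # observed in both samples
    }


# Hypothetical coordinates (branch positions) for two samples.
neural_stem_cells = {"1:1:2:3", "3:32", "4:8:1"}
oligodendrocytes = {"1:1:2:3", "3:43"}

result = compare_coordinates(neural_stem_cells, oligodendrocytes)
print(sorted(result["only_a"]))  # ['3:32', '4:8:1']
print(sorted(result["only_b"]))  # ['3:43']
```

The coordinates in `only_a` or `only_b` could then be passed to the visualizing device so that only the differing branches are drawn.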
The present invention also provides a method for visualizing gene expression patterns of a biological sample using gene ontology tree comprising the following steps:
(a) inputting the results of analysis of protein or RNA expression obtained from a sample through computer input system (step 1);
(b) allocating ontology terms corresponding to the input protein or RNA by gene ontology term allocating device (step 2);
(c) obtaining coordinate information of the gene ontology terms allocated in step 2 corresponding to gene ontology tree classified in gene ontology database by coordinate creating device (step 3);
(d) generating visualizing data of the coordinate information obtained in step 3 by visualizing device (step 4); and
(e) outputting the visualizing data generated in step 4 by computer output system (step 5).
The present invention also provides a method for analyzing gene expression patterns of a biological sample using gene ontology tree comprising the following steps:
(a) inputting the results of analysis of protein or RNA expression obtained from a sample through computer input system (step 1);
(b) allocating ontology terms corresponding to the input protein or RNA by gene ontology term allocating device (step 2);
(c) obtaining coordinate information of the gene ontology terms allocated in step 2 corresponding to gene ontology tree classified in gene ontology database by coordinate creating device (step 3);
(d) producing complexity data represented by the following mathematical formula 1 and/or formula 2 on the coordinate information obtained in step 3 by complexity calculating device (step 4); and
(e) outputting the complexity data produced in step 4 by computer output system (step 5).
Hereinafter, the present invention is described in more detail.
The present invention provides an apparatus for visualizing gene expression patterns of a biological sample using gene ontology tree comprising the following devices:
a) gene ontology term allocating device which receives the results of the analysis of protein or RNA expression of a biological sample and then applies the results to biological DB or gene ontology database to allocate gene ontology terms corresponding to the protein or RNA;
b) coordinate creating device which obtains coordinate information of the allocated gene ontology terms corresponding to gene ontology tree classified in gene ontology database;
c) visualizing device which generates visualizing data on the coordinate information obtained by the above coordinate creating device; and
d) outputting device which outputs the visualizing data.
According to the apparatus for visualizing gene expression patterns of the present invention, differences of genes in molecular function, biological process, and cellular component are visualized, leading to general understanding on the gene expression pattern.
In the apparatus for visualizing gene expression patterns, the protein analysis result of step a) is preferably obtained by proteome analysis, but not always limited thereto. The proteome analysis herein is performed preferably by 2-dimensional electrophoresis or mass spectrometry, but not always limited thereto.
In the apparatus for visualizing gene expression patterns, the result of RNA expression analysis in step a) is preferably obtained by microarray analysis, but not always limited thereto.
In the apparatus for visualizing gene expression patterns, the gene ontology term allocating device determines which term among the gene ontology terms defining a gene function is to be allocated to the expressed protein or RNA and executes the allocation. If a gene is multifunctional, multiple gene ontology terms can be allocated. The gene ontology term allocating device identifies a specific gene from a biological DB through the network and finds the term corresponding to the gene. In a preferred embodiment of the present invention, the biological DB accessible through the network is exemplified by Unigene, LocusLink, Swiss-Prot, MGI, UniProt, EMBL and IPI, but not always limited thereto. Most of the DBs above provide gene ontology terms related to gene functions; where they do not, a gene ontology term corresponding to a specific gene can still be allocated using the gene ontology database, based on the genetic information provided by the above biological databases. In a preferred embodiment of the present invention, the gene ontology database accessible through the network is exemplified by GO (Gene Ontology), ChEBI (Chemical Entities of Biological Interest), GOA and NEW, but not always limited thereto. In a preferred embodiment of the present invention, the apparatus for visualizing gene expression patterns facilitates gene ontology term allocation by a gene identification algorithm, which identifies an expressed gene by screening it through the biological DB based on the results of protein or RNA expression analysis, and by an allocation algorithm, which allocates the gene ontology term corresponding to the identified gene by screening it through the gene ontology database.
The gene identification algorithm contains a protein or gene screening tool operable in the network such as BLAST and FASTA provided by EBI (European Bioinformatics Institute), PIR (Georgetown University) or ExPASy (Swiss Institute of Bioinformatics), but not always limited thereto. The gene ontology term allocation algorithm contains a gene ontology term screening tool operable in the network such as AmiGO, MGI GO Browser or Ontology Lookup Service, but not always limited thereto.
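The two-stage allocation (gene identification, then term allocation) can be sketched as two lookups. The Python sketch below is illustrative only and does not query any real database or tool: the in-memory tables, the protein identifiers, the gene `GENE_X` and its placeholder term codes are hypothetical, and only the pairing of AD3 with GO:0016337 comes from the example in this document.

```python
from typing import Dict, Iterable, List

# Hypothetical stand-in for a biological DB: protein identifier -> gene symbol.
PROTEIN_TO_GENE: Dict[str, str] = {
    "PROTEIN_ID_1": "AD3",
    "PROTEIN_ID_2": "GENE_X",
}

# Hypothetical stand-in for a gene ontology database: gene symbol -> gene ontology codes.
GENE_TO_GO: Dict[str, List[str]] = {
    "AD3": ["GO:0016337"],
    "GENE_X": ["GO:PLACEHOLDER_1", "GO:PLACEHOLDER_2"],  # multifunctional gene
}


def allocate_terms(protein_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Identify the gene for each expressed protein, then allocate its ontology terms."""
    allocated: Dict[str, List[str]] = {}
    for protein_id in protein_ids:
        gene = PROTEIN_TO_GENE.get(protein_id)  # gene identification step
        if gene is None:
            continue  # protein not found in the biological DB
        allocated[gene] = GENE_TO_GO.get(gene, [])  # term allocation step
    return allocated


terms = allocate_terms(["PROTEIN_ID_1", "PROTEIN_ID_2", "PROTEIN_ID_UNKNOWN"])
print(terms)
```

A multifunctional gene simply maps to a list with more than one code, which later yields more than one coordinate on the tree.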
In a preferred embodiment of the present invention, the coordinate creating device is composed of a transform algorithm converting the gene ontology tree classified in the gene ontology database into coordinates and a coordinate information collection algorithm collecting the coordinates corresponding to the allocated ontology terms among the converted gene ontology tree coordinates. The coordinate creating device has the function of matching the ontology term corresponding to the expressed gene to the gene ontology tree classified in the gene ontology database. If a gene is allocated multiple terms by the ontology term allocating device, multiple coordinates can be obtained.
The gene ontology tree has a tree structure in which the coordinates (nodes) corresponding to gene ontology terms are connected with one another by branches. Figure 1 illustrates an example of a gene ontology tree. In the structure of the gene ontology tree shown in Figure 1, the topmost level is the whole gene ontology, the second highest level consists of molecular function, biological process and cellular component, and levels 3, 4 and 5 are lower levels each forming a tree. The lower the level, the more detailed the gene functions described by the ontology terms. In another preferred embodiment of the present invention, the ontology tree contains one of molecular function, biological process and cellular component as the topmost level or takes one of their lower concepts as the topmost level. The gene ontology tree can be provided by the gene ontology database or by an ontology analysis tool. In a preferred embodiment of the present invention, the gene ontology database and analysis tool are provided through the network at the internet address below, but not always limited thereto: http://www.geneontology.org.
In another preferred embodiment of the present invention, in addition to the database accessible through the above network, gene ontology tree consisting of gene ontology terms can be obtained by using the database established by the present inventors or screening tool.
In a preferred embodiment of the present invention, the coordinate information is preferably 2-dimensional or 3-dimensional coordinate information, but not always limited thereto.
The coordinate information herein indicates the coordinate information for the gene ontology term corresponding to a target gene obtained from the gene ontology tree. In a preferred embodiment of the present invention, the coordinate information is composed of the shortest path connecting the topmost node of the gene ontology tree and the target node, the level of the node, the location of the branch on the path among the branches stretched from each level, the number of branches stretched from each level and the number of genes corresponding thereto, but not always limited thereto.
In this invention, the visualizing device is composed of an algorithm locating the coordinates of the nodes (gene ontology terms) obtained by the coordinate creating device and a visualization algorithm visualizing the vertical relationships of those coordinates as a tree structure. The visualizing device facilitates visualization of the coordinate data of gene ontology terms as a tree structure.
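As a rough illustration of how these components might be held together, the following Python sketch groups them into one record per node. The `CoordinateInfo` class, its field names, the `collect` function and the sample genes other than AD3 are assumptions made for this sketch and do not appear in the specification.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Tuple

Path = Tuple[int, ...]


@dataclass(frozen=True)
class CoordinateInfo:
    """Hypothetical record of the coordinate information for one node."""
    path: Path            # branch position taken at each level, from the topmost node
    branch_counts: Path   # number of branches stretched at each level along the path
    gene_count: int       # number of expressed genes allocated to this node

    @property
    def level(self) -> int:
        # The level of the node equals the number of branches followed to reach it.
        return len(self.path)


def collect(assignments: List[Tuple[str, Path]],
            branch_counts: Dict[Path, Path]) -> List[CoordinateInfo]:
    """Count genes per node and attach the branch counts known for that node."""
    genes_per_node = Counter(path for _gene, path in assignments)
    return [CoordinateInfo(path, branch_counts.get(path, ()), count)
            for path, count in sorted(genes_per_node.items())]


# Hypothetical gene-to-coordinate assignments; only the AD3 path follows the document.
assignments = [("AD3", (1, 1, 2, 3)), ("GENE_X", (1, 1, 2, 3)), ("GENE_Y", (2, 4))]
infos = collect(assignments, {(1, 1, 2, 3): (22, 3, 5)})
for info in infos:
    print(info.level, info.path, info.branch_counts, info.gene_count)
```

Keeping the gene count on the node record means the same structure can feed both the tree drawing and any later per-node statistics.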
In a preferred embodiment of the present invention, the outputting device is preferably monitor, printer or plotter, but not always limited thereto.
The present invention also provides an apparatus for analyzing gene expression patterns of a biological sample using gene ontology tree comprising the following devices:
a) gene ontology term allocating device which receives the results of the analysis of protein or RNA expression of a biological sample and then applies the results to biological DB or gene ontology database to allocate gene ontology terms corresponding to the protein or RNA;
b) coordinate creating device which obtains coordinate information of the allocated gene ontology terms corresponding to gene ontology tree classified in gene ontology database;
c) complexity calculating device which calculates complexity data on the coordinate information obtained by the above coordinate creating device; and
d) outputting device which outputs the complexity data.
In the apparatus for analyzing gene expression patterns, the protein analysis result of step a) is preferably obtained by proteome analysis, but not always limited thereto. The proteome analysis herein is performed preferably by 2-dimensional electrophoresis or mass spectrometry, but not always limited thereto.
In the apparatus for analyzing gene expression patterns, the result of RNA expression analysis in step a) is preferably obtained by microarray analysis, but not always limited thereto. In the apparatus for analyzing gene expression patterns, the gene ontology term allocating device determines which term among the gene ontology terms defining a gene function is to be allocated to the expressed protein or RNA and executes the allocation. If a gene is multifunctional, multiple gene ontology terms can be allocated. The gene ontology term allocating device identifies a specific gene from a biological DB through the network and finds the term corresponding to the gene. In a preferred embodiment of the present invention, the biological DB accessible through the network is exemplified by Unigene, LocusLink, Swiss-Prot, MGI, UniProt, EMBL and IPI, but not always limited thereto. Most of the DBs above provide gene ontology terms related to gene functions; where they do not, a gene ontology term corresponding to a specific gene can still be allocated using the gene ontology database, based on the genetic information provided by the above biological databases. In a preferred embodiment of the present invention, the gene ontology database accessible through the network is exemplified by GO (Gene Ontology), ChEBI (Chemical Entities of Biological Interest), GOA and NEW, but not always limited thereto.
In a preferred embodiment of the present invention, the apparatus for analyzing gene expression patterns facilitates gene ontology term allocation by a gene identification algorithm, which identifies an expressed gene by screening it through the biological DB based on the results of protein or RNA expression analysis, and by an allocation algorithm, which allocates the gene ontology term corresponding to the identified gene by screening it through the gene ontology database. The gene identification algorithm contains a protein or gene screening tool operable in the network such as BLAST and FASTA provided by EBI (European Bioinformatics Institute), PIR (Georgetown University) or ExPASy (Swiss Institute of Bioinformatics), but not always limited thereto.
The gene ontology term allocation algorithm contains a gene ontology term screening tool operable in the network such as AmiGO, MGI GO Browser or Ontology Lookup Service, but not always limited thereto.
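The two-stage allocation described above (gene identification, then term allocation) can be sketched as two lookups. This is only an illustration: the in-memory dictionaries stand in for the networked biological DB and gene ontology database, the protein accession is made up, and the function names are hypothetical. The ACTN1 terms are the ones listed for that gene in Table 2.

```python
from typing import Dict, List, Optional

# Hypothetical stand-ins for the biological DB and the gene ontology database.
# A real system would query these over the network.
PROTEIN_TO_GENE: Dict[str, str] = {
    "IPI_EXAMPLE_0001": "ACTN1",  # made-up accession, for illustration only
}
GENE_TO_GO_TERMS: Dict[str, List[str]] = {
    "ACTN1": ["GO:0048041", "GO:0051271", "GO:0042981", "GO:0051017"],
}


def identify_gene(protein_id: str) -> Optional[str]:
    """Gene identification step: map an expressed protein to a gene symbol."""
    return PROTEIN_TO_GENE.get(protein_id)


def allocate_terms(protein_ids: List[str]) -> Dict[str, List[str]]:
    """Allocation step: attach every known ontology term to each identified gene."""
    allocation: Dict[str, List[str]] = {}
    for protein_id in protein_ids:
        gene = identify_gene(protein_id)
        if gene is None:
            continue  # unidentified proteins receive no terms
        # A multifunctional gene keeps all of its terms.
        allocation[gene] = list(GENE_TO_GO_TERMS.get(gene, []))
    return allocation


result = allocate_terms(["IPI_EXAMPLE_0001", "IPI_UNKNOWN"])
```

Keeping identification and allocation as separate functions mirrors the split between the two algorithms, so either lookup could be swapped for a different database without touching the other.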
In a preferred embodiment of the present invention, the coordinate creating device is composed of a transform algorithm converting the gene ontology tree classified in the gene ontology database into coordinates and a coordinate information collection algorithm collecting the coordinates corresponding to the allocated ontology terms among the converted gene ontology tree coordinates. The coordinate creating device matches the ontology term corresponding to the expressed gene to the gene ontology tree classified in the gene ontology database. If the ontology term allocating device has allocated multiple terms to a gene, multiple coordinates can be obtained. The gene ontology tree has a tree structure in which the coordinates (nodes) corresponding to gene ontology terms are connected with one another by branches. Figure 1 illustrates an example of a gene ontology tree. In the structure of the gene ontology tree shown in Figure 1, the topmost level is the whole gene ontology, the second highest level consists of molecular functions, biological processes and cellular components, and levels 3, 4 and 5 are lower levels each forming a tree. The lower the level, the more detailed the gene functions described by the ontology terms.
In another preferred embodiment of the present invention, the ontology tree contains one of molecular functions, biological processes and cellular components as the topmost level or takes its lower concept as the topmost level.
The gene ontology tree can be provided by a gene ontology database or by an ontology analysis tool. In a preferred embodiment of the present invention, the gene ontology database and analysis tool are provided through the network at the following internet address, but not always limited thereto: http://www.geneontology.org.
In another preferred embodiment of the present invention, in addition to the database accessible through the above network, gene ontology tree consisting of gene ontology terms can be obtained by using the database established by the present inventors or screening tool.
In a preferred embodiment of the present invention, the coordinate information is preferably 2-dimensional or 3-dimensional coordinate information, but not always limited thereto.
The coordinate information herein indicates the coordinate information for the gene ontology term corresponding to a target gene obtained from the gene ontology tree. In a preferred embodiment of the present invention, the coordinate information is composed of the information on the shortest path connecting the topmost node of the gene ontology tree and the target node, the level of the node, the location of the branch on the path among the branches stretched from each level, the number of branches stretched from each level and the number of genes corresponding thereto, but not always limited thereto.
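As a rough sketch of how such coordinate information might be collected, the code below runs a breadth-first search from the topmost node and records, for the shortest path to a target, the level, the branch taken at each step and the number of branches available at each step. The toy ontology, its node names and the function name are assumptions for illustration only. The count of corresponding genes is left out.

```python
from collections import deque

# Toy ontology: each node maps to an ordered list of child nodes (branches).
# Node names are hypothetical; a real tree would come from the ontology database.
CHILDREN = {
    "root": ["A", "B", "C"],
    "A": ["A1", "A2"],
    "B": ["B1", "A2", "B3"],  # A2 is reachable through two parents
    "C": [],
    "A1": [], "A2": [], "B1": [], "B3": [],
}


def coordinate_info(target, root="root"):
    """Return level, 1-based branch positions and branch counts on the
    shortest path from the root to the target, or None if unreachable."""
    queue = deque([(root, [], [])])
    seen = {root}
    while queue:
        node, positions, counts = queue.popleft()
        if node == target:
            return {"level": len(positions), "path": positions, "branches": counts}
        kids = CHILDREN.get(node, [])
        for index, child in enumerate(kids, start=1):
            if child not in seen:  # first visit is along a shortest path
                seen.add(child)
                queue.append((child, positions + [index], counts + [len(kids)]))
    return None


info = coordinate_info("A2")
```

Breadth-first search is used here because it reaches every node along a path with the fewest branches first, which matches the "shortest path" wording. When two shortest paths exist, this sketch simply keeps the one found first.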
In the apparatus for analyzing gene expression patterns, the complexity calculating device is composed of a computer arithmetic algorithm calculating the complexity defined by the following mathematical formula 1, but not always limited thereto.
[Mathematical formula 1]
$$S = -\frac{1}{N}\sum_{\alpha=1}^{N}\log P_{\alpha}$$
N: number of coordinates of genes corresponding to the gene ontology tree
$P_{\alpha}$: probability of selection of the specific gene coordinate α in the gene ontology tree; $P_{\alpha}$ is defined by the following mathematical formula 2.
[Mathematical formula 2]
$$P_{\alpha} = \prod_{i}\frac{1}{n_{\alpha,i}}$$
$n_{\alpha,i}$: number of branches stretched from the i-th coordinate on the shortest path from the starting point of the gene ontology tree to coordinate α
Figure 4 is a diagram illustrating an example of calculating complexity from the coordinate information distributed on gene ontology tree.
The coordinate information distributed on the gene ontology tree is represented by identification marks, for example, 401, 402, 403 and 404. The number of coordinates (N) is 4 and the probability of selection of each coordinate (P) is calculated as follows: for example, in the case of identification mark 401, the first coordinate among those on the shortest path passing over 401 stretches three branches and the second coordinate also stretches three branches, so P401 = 1/3 * 1/3, which is 1/9. The same calculation applies to the other coordinates and, as a result, the probabilities of 402, 403 and 404 are 1/9, 1/3 and 1/3 respectively. So, complexity S is -0.25 * (log(1/9) + log(1/9) + log(1/3) + log(1/3)) = 1.5 * log 3, which is about 0.71 with base-10 logarithms.
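The Figure 4 arithmetic can be checked with a few lines of code. The sketch assumes that complexity is the negative mean of log P over the N coordinates, which is how the worked numbers read, and that the logarithm is base 10, which reproduces the stated value of roughly 0.71. The function names and the dictionary layout are hypothetical.

```python
import math


def selection_probability(branch_counts):
    """Probability of reaching a coordinate: product of 1/n over the
    branch counts met on the shortest path from the starting point."""
    probability = 1.0
    for n in branch_counts:
        probability /= n
    return probability


def complexity(branch_counts_per_coordinate):
    """Negative mean of log10(P) over all coordinates (assumed form)."""
    probabilities = [selection_probability(c) for c in branch_counts_per_coordinate]
    return -sum(math.log10(p) for p in probabilities) / len(probabilities)


# Branch counts on the path to each identification mark in Figure 4.
figure4 = {401: [3, 3], 402: [3, 3], 403: [3], 404: [3]}
s = complexity(list(figure4.values()))
```

Here `s` equals 1.5 * log10(3), the quantity computed in the text.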
Complexity S increases as the probability $P_{\alpha}$ decreases. A small $P_{\alpha}$ indicates that the coordinate is located far from the starting point of the ontology tree. So, the farther the coordinates of genes are located from the starting point of the gene ontology tree, the higher the complexity.
Quantitative analysis of gene expression patterns of a biological sample can be facilitated by calculating complexity. The complexity of a specific biological sample differs according to the kinds and numbers of proteins and their differentiation stages. Complexity increases when the expressed proteins are diverse and differentiated proteins are dominant in a sample.
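The dependence on depth can be made explicit in a simplified special case. Assume, purely for illustration, that every coordinate on the path stretches the same number of branches b and that the target coordinate lies at level d. Neither assumption comes from the text.

```latex
P_{\alpha} = \prod_{i=1}^{d} \frac{1}{b} = b^{-d}
\qquad\Longrightarrow\qquad
-\log P_{\alpha} = d \,\log b
```

Under this assumption each coordinate contributes to S in proportion to its level d. For b = 3 and d = 2 this gives 2 log 3, the contribution of identification mark 401 in the Figure 4 example.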
The apparatus for analyzing gene expression patterns of the present invention facilitates digitization of gene expression of a biological sample by using complexity and more specifically digitization of gene expression according to molecular functions, biological processes and cellular components. Therefore, the apparatus enables the examination of general expression patterns and cell functions as well as identification of developmental stages.
In a preferred embodiment of the present invention, the outputting device is preferably monitor, printer or plotter, but not always limited thereto.
The present invention also provides an apparatus for comparing gene expression patterns of a biological sample using gene ontology tree comprising the following devices: gene ontology term allocating device which receives the results of the analysis of protein or RNA expression of a biological sample and then applies the results to biological DB or gene ontology database to allocate gene ontology terms corresponding to the protein or RNA; coordinate creating device which obtains coordinate information of the allocated gene ontology terms corresponding to gene ontology tree classified in gene ontology database; comparing device which obtains different coordinate information by comparing the coordinate information obtained by the above coordinate creating device and the coordinate information obtained from another sample; visualizing device which generates visualizing data on the coordinate information obtained by the comparing device; and outputting device which outputs the visualizing data.
The apparatus for comparing gene expression patterns facilitates visualization of information on gene expression in different biological samples. That is, owing to this apparatus, differences among samples in major cell functions, particularly molecular functions, biological processes and cellular components, can be explained. Differences in functions and developmental stages among different samples can also be screened by this apparatus.
In the study of evolution, without the apparatus for comparing gene expression patterns of the present invention, the progress of evolution can only be explained by comparing gene sequences. Therefore, it is very difficult to compare evolutionary stages between individuals having the same or similar gene sequences. According to the present invention, however, gene expression patterns can be analyzed and compared, so that evolutionary stages can be compared more precisely when individuals have different gene expression patterns even though they have the same or similar gene sequences.
In a preferred embodiment of the present invention, the comparing device of the apparatus for comparing gene expression patterns consists of a comparison algorithm, which compares the coordinate information obtained by the coordinate creating device with the coordinate information obtained from other samples to identify their locations, and a coordinate producing algorithm, which collects only those coordinates having different locations by eliminating coordinate information having the same locations, but not always limited thereto. In another preferred embodiment of the present invention, the comparing device can additionally include a selection algorithm, operated before the comparison algorithm is applied, which produces coordinate information corresponding to gene ontology terms included in a specific hierarchical classification. The apparatus for comparing gene expression patterns whose comparing device includes the additional selection algorithm facilitates comparison among different samples of not only the whole expression but also the expression related to a specific function.
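The comparing device can be pictured as a set difference, optionally preceded by a selection step. In the sketch below each coordinate is a tuple of branch positions from the starting point of the tree, the selection step keeps only coordinates under a given prefix, and the sample data are invented for illustration. The function names are hypothetical.

```python
def select_subtree(coordinates, prefix=()):
    """Selection step: keep coordinates that lie under the given branch prefix."""
    prefix = tuple(prefix)
    return {c for c in coordinates if c[:len(prefix)] == prefix}


def compare(sample_a, sample_b, prefix=()):
    """Comparison step: drop coordinates shared by both samples and return
    (only in sample_a, only in sample_b)."""
    a = select_subtree(sample_a, prefix)
    b = select_subtree(sample_b, prefix)
    return a - b, b - a


# Invented coordinates, each a path of branch positions.
stem_cells = {(1, 3, 29), (1, 3, 32), (1, 4, 8)}
oligodendrocytes = {(1, 3, 29), (1, 3, 43)}

only_stem, only_oligo = compare(stem_cells, oligodendrocytes)
only_stem_branch3, only_oligo_branch3 = compare(stem_cells, oligodendrocytes, prefix=(1, 3))
```

Returning both differences at once corresponds to producing a pair of views such as "observed in one sample but not in the other" for each direction.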
The present invention also provides a method for visualizing gene expression patterns of a biological sample using gene ontology tree comprising the following steps using the above visualizing device:
(a) inputting the results of analysis of protein or RNA expression obtained from a sample through computer input system (step 1);
(b) allocating ontology terms corresponding to the input protein or RNA by gene ontology term allocating device (step 2);
(c) obtaining coordinate information of the gene ontology terms allocated in step 2 corresponding to gene ontology tree classified in gene ontology database by coordinate creating device (step 3);
(d) producing visualizing data of coordinate information obtained in step 3 by visualizing device (step 4); and
(e) outputting the visualizing data produced in step 4 by computer output system (step 5).
In this method, the input system is preferably keyboard, scanner, barcode reader, mouse, tablet, track ball, electronic pen or digital camera, but not always limited thereto.
In the above method, the output system is preferably monitor, printer or plotter, but not always limited thereto.
In addition, the present invention provides a method for analyzing gene expression patterns of a biological sample using gene ontology tree comprising the following steps :
(a) inputting the results of analysis of protein or RNA expression obtained from a sample through computer input system (step 1);
(b) allocating ontology terms corresponding to the input protein or RNA by gene ontology term allocating device (step 2);
(c) obtaining coordinate information of the gene ontology terms allocated in step 2 corresponding to gene ontology tree classified in gene ontology database by coordinate creating device (step 3);
(d) producing complexity data represented by the following mathematical formula 1 on the coordinate information obtained in step 3 by complexity calculating device (step 4); and
(e) outputting the complexity data produced in step 4 by computer output system (step 5).
In this method, the input system is preferably keyboard, scanner, barcode reader, mouse, tablet, track ball, electronic pen or digital camera, but not always limited thereto.
In the above method, the output system is preferably monitor, printer or plotter, but not always limited thereto.
[Advantageous Effect]
The apparatus for visualizing or analyzing gene expression patterns and the method using the same are significant improvements over conventional methods and apparatuses: they provide a way to understand molecular functions, biological processes or cellular components in a biological sample based on the results of protein or RNA expression analysis. So, the apparatus and the method of the present invention facilitate analysis of biological aspects such as functional changes, evolutionary stages or developmental stages of each biological sample.
[Description of Drawings]
The application of the preferred embodiments of the present invention is best understood with reference to the accompanying drawings, wherein:
Figure 1 is a diagram illustrating an example of gene ontology tree structure.
Figure 2 is a flow chart illustrating the general workflow of the apparatus for visualizing or analyzing gene expression patterns using gene ontology according to a preferred embodiment of the present invention.
Figure 3 is a diagram illustrating the visualization of gene expression patterns using gene ontology according to a preferred embodiment of the present invention.
Figure 4 is a diagram illustrating the calculation of complexity from coordinate information distributed on gene ontology tree.
N: number of coordinates of genes corresponding to gene ontology tree,
$P_{\alpha}$: probability of selection of the specific gene coordinate α in the gene ontology tree
Figures 5-8 are diagrams illustrating the visualized data of gene expression patterns according to a preferred embodiment of the present invention. In each diagram, gene expression patterns are visualized as the tree structure by taxonomy according to cellular component, molecular function and biological process.
Figure 5 is a set of diagrams illustrating the visualized data of gene expression patterns, taking the group related to extracellular matrix, among cell components, as the first level. (A): visualized data of protein expression pattern of brain tissues, (B): visualized data of protein expression pattern of neural stem cells, (C): visualized data of RNA expression pattern of neural stem cells, (D): visualized data of RNA expression pattern of oligodendrocytes.
Figure 6 is a set of diagrams illustrating the visualized data of gene expression patterns, taking the group related to auxiliary transport, among molecular functions, as the first level. (A): visualized data of protein expression pattern of brain tissues, (B): visualized data of protein expression pattern of neural stem cells, (C): visualized data of RNA expression pattern of oligodendrocytes.
Figure 7 is a set of diagrams illustrating the visualized data of gene expression patterns, taking the group related to binding, among molecular functions, as the first level. (A): visualized data of protein expression pattern of brain tissues, (B): visualized data of protein expression pattern of neural stem cells, (C): visualized data of RNA expression pattern of oligodendrocytes.
Figure 8 is a set of diagrams illustrating the visualized data of gene expression patterns, taking the group related to biological regulation, among biological processes, as the first level. (A): visualized data of protein expression pattern of brain tissues, (B): visualized data of protein expression pattern of neural stem cells, (C): visualized data of RNA expression pattern of oligodendrocytes.
Figure 9 is a diagram illustrating the visualized data of molecular functions which are observed in neural stem cells but not in oligodendrocytes in a preferred embodiment of the present invention.
Figure 10 is a diagram illustrating the visualized data of molecular functions which are observed in oligodendrocytes but not in neural stem cells in a preferred embodiment of the present invention.
[Mode for Invention]
Practical and presently preferred embodiments of the present invention are illustrated in the following Examples.
However, it will be appreciated that those skilled in the art, on consideration of this disclosure, may make modifications and improvements within the spirit and scope of the present invention.
Example 1: Obtainment of expression analysis results from biological samples and gene ontology terms
<1-1> Obtainment of expression information of biological sample
The present inventors prepared test samples from brain tissues, neural stem cells and oligodendrocytes by the method described below and then expression information was obtained by separation of peptides on 1-dimensional gel and tandem mass spectrum.
Brain tissues were divided into membrane fraction, soluble fraction and DNA binding protein fraction by the method of Klose. Then, proteins were separated by 1-dimensional electrophoresis. The separated proteins were hydrolyzed by trypsin. Peptides were separated by using SCX, RP column of liquid chromatography, followed by identification of the sequences by tandem mass spectrometer. Quantification was performed by protein abundance index method.
To prepare neural stem cells, cells separated from the brain of a fetus (12-18 weeks old) were cultured. The cells were differentiated into oligodendrocytes by using the Olig2 gene (Kim, S. U., Neuropathology, 2004, 24(3), 159-171). Neural stem cells and oligodendrocytes were labeled with lysine using 12C and 13C by the SILAC method, and then subjected to 1-dimensional electrophoresis, hydrolysis with trypsin and liquid chromatography to identify and quantify proteins (Kwon, K.-H., et al., Proteomics, 2008, 8(6), 1149-61).
<1-2> Obtainment of information on gene expression and gene ontology
The information on expressed proteins obtained in Example <1-1> was applied to IPI (International Protein Index) database to screen the information on expressed proteins and to identify genes corresponding thereto. Each gene corresponding to each protein was marked with Gene Symbol. In Table 1, some examples of information on the expressed proteins and genes corresponding thereto obtained in Example 1 are shown.
[Table 1]
Feature information of expressed proteins and genes corresponding thereto
Figure imgf000036_0001
The identified gene information was applied to the gene ontology database GO (Gene Ontology) to allocate gene ontology term information. Some of the gene ontology terms allocated to the genes are shown in Table 2.
[Table 2]
Gene ontology terms allocated to expressed genes
Figure imgf000037_0001
ACTN1 GO:0048041, GO:0051271, GO:0042981, GO:0051017
The gene ontology code is the predefined code corresponding to a specific gene ontology term in the gene ontology database, which has been defined by the Gene Ontology Consortium. A new gene ontology code may be added when a gene ontology term is added, but an existing code itself does not change and is defined in the gene ontology database along with its gene ontology term. For example, it can be confirmed in a gene ontology database such as the GO (Gene Ontology) database through www.geneontology.org.
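Because the codes are fixed identifiers of the form "GO:" followed by seven digits, a term list taken from scanned or copied text can be normalized before lookup. The sketch below is a hypothetical helper, not part of the described apparatus. It assumes that stray spaces and a period in place of the colon are the only defects to tolerate.

```python
import re

# "GO", optional spaces, a colon or a stray period, optional spaces, seven digits.
GO_CODE = re.compile(r"GO\s*[:.]\s*(\d{7})")


def normalize_go_codes(text):
    """Return every gene ontology code found in the text in canonical form."""
    return [f"GO:{digits}" for digits in GO_CODE.findall(text)]


codes = normalize_go_codes("GO .0048041, GO :0051271, GO:0042981, GO : 0051017")
```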
Example 2: Obtainment of coordinate information corresponding to gene expression information
Gene ontology tree classified in the gene ontology database GO (Gene ontology) was constructed using gene ontology information corresponding to the expression information obtained in Example 1.
Coordinate information of the corresponding gene ontology term was obtained by using the constructed tree. Particularly, the AD3 gene corresponded to GO:0016337, which passed over the "first" branch of the level 1 branches on the tree, the "first" branch of 22 level 2 branches, the "second" branch of 3 level 3 branches and the "third" branch of 5 level 4 branches. Thus, the path could be represented as 1:1:2:3:0:0:0:0:0:0:0:0:0:0:0:0:0:0:. The level 2 branches stretched from the first level 1 branch numbered 22, among which the first branch stretched 3 level 3 branches, among which the second branch stretched 5 level 4 branches, resulting in the following representation: 22:3:5:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:. So, the coordinate of gene information related to AD3, which is GO:0016337, on the gene ontology tree can be presented as 1, 1:1:2:3:0:0:0:0:0:0:0:0:0:0:0:0:0:0: 1:1:2:3:0:0:0:0:0:0:0:0:0:0:0:0:0:0:.
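The colon-separated representation used above can be produced and read back with two small helpers. This is a sketch only. The fixed width of 18 entries is an assumption read off the example strings, zero is treated as padding, and the function names are hypothetical.

```python
def encode(values, width=18):
    """Pad the values with zeros up to the fixed width and join them with
    a trailing colon after every entry, e.g. 1:1:2:3:0:...:0:"""
    if len(values) > width:
        raise ValueError("path is deeper than the fixed width")
    padded = list(values) + [0] * (width - len(values))
    return "".join(f"{value}:" for value in padded)


def decode(text):
    """Parse the colon-separated string and drop the trailing zero padding."""
    values = [int(token) for token in text.split(":") if token]
    while values and values[-1] == 0:
        values.pop()
    return values


path_string = encode([1, 1, 2, 3])    # branch taken at each level
count_string = encode([22, 3, 5])     # number of branches at each level
```

A fixed width keeps every coordinate the same length, which makes rows easy to align in a table at the cost of capping the depth that can be represented.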
Table 3 shows coordinate information corresponding to gene ontology information about each gene. Coordinate information contains information on the shortest path from the topmost node of gene ontology tree to a target node, precisely 1) level where a target node is located, 2) branch of each level on which a target path passes, 3) number of branches stretched from each level and 4) number of corresponding genes (total number of genes passing through the corresponding nodes which are expressed in relation to cellular component, molecular function, and biological process). For example, if the path passes over the first branch of level=0, the third branch of level=1 and the 29th branch of level=2, it is represented as 1:3:29:0:0:0:...., shown in the second line of the below Table.
[Table 3]
Figure imgf000040_0001
Figure imgf000041_0001
Example 3: Analysis of expression pattern
<3-l> Visualization of gene expression pattern of a biological sample using gene ontology tree
The starting point of the tree was located in the upper part of the Figure. Then, the coordinate information obtained in Example 2 was marked as nodes on the tree. The marked coordinates were connected from the root of the tree by branches to visualize gene expression patterns.
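One simple way to turn marked coordinates into 2-dimensional drawing positions is sketched below. It is not the layout used for the Figures, only an illustration under stated assumptions: each coordinate is a tuple of branch positions, the root sits at y = 0 at the top, each level moves one unit down, end nodes are spread evenly along x, and every inner node is centred over the end nodes beneath it. The function name is hypothetical.

```python
def layout(paths):
    """Map every node on the given paths to an (x, y) drawing position."""
    # Collect each marked node together with all of its ancestors.
    nodes = set()
    for path in paths:
        for depth in range(len(path) + 1):
            nodes.add(tuple(path[:depth]))
    ordered = sorted(nodes)

    def is_under(node, ancestor):
        return node[:len(ancestor)] == ancestor

    # End nodes are nodes with no marked descendants.
    leaves = [n for n in ordered
              if not any(m != n and is_under(m, n) for m in nodes)]
    leaf_x = {leaf: float(i) for i, leaf in enumerate(leaves)}

    positions = {}
    for node in ordered:
        below = [leaf_x[leaf] for leaf in leaves if is_under(leaf, node)]
        positions[node] = (sum(below) / len(below), -len(node))
    return positions


positions = layout([(1, 1), (1, 2), (3,)])
```

Branches can then be drawn by joining each node to its parent, that is, the same tuple with the last entry removed.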
Some visualized data on the gene expression patterns of biological samples are shown in Figure 5 - Figure 8.
Among ontology terms related to cellular components, the group involved in extracellular matrix was set as the first level branch, and protein and RNA expression patterns of brain tissues, neural stem cells and oligodendrocytes were illustrated in Figure 5. In oligodendrocytes, protein was not detected, so visualized data on their protein expression patterns was not illustrated in the Figure. As shown in Figure 5, protein was expressed in relation to extracellular matrix in brain tissues and neural stem cells, but not in oligodendrocytes. RNA expression in relation to extracellular matrix, however, was detected in oligodendrocytes on microarray, which differed from the protein expression.
Visualized data of gene expression patterns of brain tissues, neural stem cells and oligodendrocytes, taking the group related to auxiliary transport protein activity, among molecular functions, as the first level was illustrated in Figure 6. Visualized data of gene expression patterns, taking the group related to binding, among molecular functions, as the first level was illustrated in Figure 7. According to the data, proteins related to auxiliary transport protein activity were expressed in brain tissues and oligodendrocytes at a low level, but not in neural stem cells. In the meantime, binding related proteins were expressed in all three test samples (brain tissues, neural stem cells and oligodendrocytes) at a high level.
Visualized data of gene expression patterns of brain tissues (A), neural stem cells (B) and oligodendrocytes (C), taking the group related to biological regulation, among biological processes, as the first level was illustrated in Figure 8. According to Figure 8, proteins related to biological regulation were expressed particularly in brain tissues at a high level.
<3-2> Analysis of gene expression pattern using complexity
Coordinate information obtained in Example 2 was applied to the following mathematical formula to calculate complexity of expression patterns of brain tissues, neural stem cells and oligodendrocytes and the results are shown in Table 4.
Complexity
Figure imgf000044_0001
As a result, the complexity was highest in brain tissues, followed by neural stem cells and then oligodendrocytes.
[Table 4]
Figure imgf000044_0002
In the brain tissues, diverse brain cells co-exist and each cell has its own protein expressed therein and the protein has been mainly differentiated, suggesting high complexity. Neural stem cells contain neuron, astrocyte and oligodendrocyte related proteins, but these proteins are not differentiated yet, suggesting that protein expression is limited and complexity is low, compared with that in the brain tissues. Oligodendrocytes comprise only one kind of cells, suggesting that complexity is the lowest.
<3-3> Comparison of gene expression patterns
The visualized data of molecular functions which are observed in neural stem cells but not in oligodendrocytes obtained in Example 2 was illustrated in Figure 9. And the visualized data of molecular functions which are observed in oligodendrocytes but not in neural stem cells obtained in Example 2 was illustrated in Figure 10.
According to Figure 9 and Figure 10, neural stem cells show a more complicated distribution than oligodendrocytes. However, there were genes expressed only in oligodendrocytes. As shown in Figure 10, genes involved in nucleotide binding corresponding to the 32nd branch of level 2 passing over the third branch of level 1, protein binding corresponding to the 43rd branch of level 2, and hydrolase activity corresponding to the 8th branch of level 2 passing over the 4th branch of level 1 are not expressed in neural stem cells but are expressed in oligodendrocytes. In proteomics, a protein which is not detected indicates either no expression at all or expression at a very low level.
Those skilled in the art will appreciate that the conceptions and specific embodiments disclosed in the foregoing description may be readily utilized as a basis for modifying or designing other embodiments for carrying out the same purposes of the present invention. Those skilled in the art will also appreciate that such equivalent embodiments do not depart from the spirit and scope of the invention as set forth in the appended claims.

Claims

[CLAIMS]
[Claim 1]
An apparatus for visualizing gene expression patterns of a biological sample using gene ontology tree comprising the following devices: gene ontology term allocating device which receives the results of the analysis of protein or RNA expression of a biological sample and then applies the results to biological DB or gene ontology database to allocate gene ontology terms corresponding to the protein or RNA; coordinate creating device which obtains coordinate information of the allocated gene ontology terms corresponding to gene ontology tree classified in gene ontology database; visualizing device which generates visualizing data on the coordinate information obtained by the above coordinate creating device; and outputting device which outputs the visualizing data.
[Claim 2]
The apparatus according to claim 1, wherein the protein analysis result is obtained by proteome analysis.
[Claim 3]
The apparatus according to claim 2, wherein the proteome analysis is performed by 2-dimensional electrophoresis or mass spectrometry.
[Claim 4]
The apparatus according to claim 1, wherein the RNA expression analysis result is obtained by microarray analysis.
[Claim 5]
The apparatus according to claim 1, wherein the biological DB is Unigene, LocusLink, Swiss-Prot, IPI, MGI, UniProt or EMBL.
[Claim 6]
The apparatus according to claim 1, wherein the gene ontology database is GO (Gene Ontology), ChEBI, GOA or NEW.
[Claim 7]
The apparatus according to claim 1, wherein the gene ontology term allocating device is composed of gene identification algorithm identifying an expressed gene by screening thereof through biological DB based on the results of protein or RNA expression analysis and allocation algorithm allocating gene ontology term corresponding to the identified gene by screening thereof through gene ontology database.
[Claim 8]
The apparatus according to claim 1, wherein the coordinate information is 2-dimensional or 3-dimensional coordinate information.
[Claim 9]
The apparatus according to claim 1, wherein the gene ontology tree is constructed according to molecular functions, biological processes or cell components.
[Claim 10]
The apparatus according to claim 1, wherein the outputting device is monitor, printer or plotter.
[Claim 11]
An apparatus for analyzing gene expression patterns of a biological sample using gene ontology tree comprising the following devices: gene ontology term allocating device which receives the results of the analysis of protein or RNA expression of a biological sample and then applies the results to biological DB or gene ontology database to allocate gene ontology terms corresponding to the protein or RNA; coordinate creating device which obtains coordinate information of the allocated gene ontology terms corresponding to gene ontology tree classified in gene ontology database; complexity calculating device which calculates complexity data defined as the following formula on the coordinate information obtained by the above coordinate creating device; and outputting device which outputs the complexity data.
Figure imgf000050_0001
N: number of coordinates of genes corresponding to gene ontology tree,
$P_{\alpha}$: probability of selection of the specific gene coordinate α in the gene ontology tree
[Claim 12]
An apparatus for comparing gene expression patterns of a biological sample using gene ontology tree comprising the following devices: gene ontology term allocating device which receives the results of the analysis of protein or RNA expression of a biological sample and then applies the results to biological DB or gene ontology database to allocate gene ontology terms corresponding to the protein or RNA; coordinate creating device which obtains coordinate information of the allocated gene ontology terms corresponding to gene ontology tree classified in gene ontology database; comparing device which obtains different coordinate information by comparing the coordinate information obtained by the above coordinate creating device and the coordinate information obtained from another sample; visualizing device which generates visualizing data on the coordinate information obtained by the comparing device; and outputting device which outputs the visualizing data.
[Claim 13]
The apparatus for comparing gene expression patterns according to claim 12, wherein the comparing device is composed of comparison algorithm comparing the coordinate information obtained by the coordinate creating device and the coordinate information obtained from other samples, and coordinate producing algorithm producing only those coordinates having different locations by eliminating coordinate information having the same locations.
[Claim 14]
The apparatus for comparing gene expression patterns according to claim 13, wherein the comparing device additionally includes selection algorithm producing coordinate information corresponding to gene ontology term included in specific hierarchical classification which is operated before applying the comparison algorithm.
[Claim 15]
A method for visualizing gene expression patterns of a biological sample using gene ontology tree comprising the following steps:
(a) inputting the results of analysis of protein or RNA expression obtained from a sample through computer input system (step 1);
(b) allocating ontology terms corresponding to the input protein or RNA by gene ontology term allocating device (step 2);
(c) obtaining coordinate information of the gene ontology terms allocated in step 2 corresponding to gene ontology tree classified in gene ontology database by coordinate creating device (step 3);
(d) producing visualizing data of coordinate information obtained in step 3 by visualizing device (step 4); and
(e) outputting the visualizing data produced in step 4 by computer output system (step 5).
[Claim 16]
The method for visualizing gene expression patterns of a biological sample using gene ontology tree according to claim 15, wherein the input system is keyboard, scanner, barcode reader, mouse, tablet, track ball, electronic pen or digital camera.
[Claim 17]
The method for visualizing gene expression patterns of a biological sample using gene ontology tree according to claim 15, wherein the output system is monitor, printer or plotter.
[Claim 18]
A method for analyzing gene expression patterns of a biological sample using a gene ontology tree, comprising the following steps:
(a) inputting the results of analysis of protein or RNA expression obtained from a sample through a computer input system (step 1);
(b) allocating ontology terms corresponding to the input protein or RNA by a gene ontology term allocating device (step 2);
(c) obtaining coordinate information of the gene ontology terms allocated in step 2 corresponding to the gene ontology tree classified in a gene ontology database by a coordinate creating device (step 3);
(d) producing complexity data, defined by the formula of claim 11, on the coordinate information obtained in step 3 by a complexity calculating device (step 4); and
(e) outputting the complexity data produced in step 4 by a computer output system (step 5).
[Claim 19]
The method for analyzing gene expression patterns of a biological sample using gene ontology tree according to claim 18, wherein the input system is a keyboard, scanner, barcode reader, mouse, tablet, track ball, electronic pen or digital camera.
[Claim 20]
The method for analyzing gene expression patterns of a biological sample using gene ontology tree according to claim 18, wherein the output system is a monitor, printer or plotter.

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/KR2008/004735 WO2010018882A1 (en) 2008-08-14 2008-08-14 Apparatus for visualizing and analyzing gene expression patterns using gene ontology tree and method thereof

Publications (1)

Publication Number Publication Date
WO2010018882A1 (en) 2010-02-18

Family

ID=41669017

Country Status (1)

Country Link
WO (1) WO2010018882A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030009294A1 (en) * 2001-06-07 2003-01-09 Jill Cheng Integrated system for gene expression analysis
WO2005022412A1 (en) * 2003-08-30 2005-03-10 Istech Co., Ltd. A system for analyzing bio chips using gene ontology and a method thereof
US20050137808A1 (en) * 2003-12-18 2005-06-23 Choi Jae H. Method for conceptualizing protein interaction networks using gene ontology

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111161804A (en) * 2019-12-27 2020-05-15 北京百迈客生物科技有限公司 Query method and system for species genomics database
CN111161804B (en) * 2019-12-27 2024-03-08 北京百迈客生物科技有限公司 Query method and system for species genomics database

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08793249

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 08793249

Country of ref document: EP

Kind code of ref document: A1